Following up on my post on where Obama is winning and losing, here are some updated graphs breaking down his support at the state level.
First, here's a plot of a flexible polynomial fitted to state two-candidate vote totals by date -- you can see the upward trajectory in Obama's support levels, though the line weights each point equally:
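For anyone curious how much the equal weighting matters, here is a rough sketch of the comparison. It makes two simplifications: it fits a straight line rather than the flexible polynomial in the plot, and it runs on made-up numbers, not the actual state results. The function name `weighted_line` and all of the data values are hypothetical.

```python
def weighted_line(x, y, w):
    """Weighted least-squares line y = a + b*x; returns (a, b)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up illustration data, NOT the real primary and caucus results.
days = [0, 5, 16, 33, 37, 40]                  # days since the first contest
share = [0.50, 0.45, 0.55, 0.49, 0.60, 0.62]   # Obama two-candidate share
votes = [240, 280, 530, 7000, 400, 1500]       # thousands of votes cast

# Every contest counts the same, as in the plotted line.
equal_fit = weighted_line(days, share, [1.0] * len(days))
# Contests count in proportion to turnout instead.
vote_fit = weighted_line(days, share, votes)

print("equal weights: intercept %.3f, slope %.4f" % equal_fit)
print("vote weights:  intercept %.3f, slope %.4f" % vote_fit)
```

With equal weights a small caucus moves the line as much as a large primary. Weighting by votes cast is one alternative, though which weighting is appropriate depends on the question being asked.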
In a linear regression, the strongest predictors of Obama's overall support are whether it's a caucus (+), Democratic presidential vote (-), and black population (+). Here's the plot of presidential vote, which indicates that Obama does worse in heavily Democratic states (the fitted line excludes the outlier of Washington, DC):
By contrast, Obama does better in states with larger numbers of African Americans, though the trend is concentrated among primary states (the linear fit is only for those states):
State education levels (specifically, the proportion of the population with a college degree) are also somewhat positively correlated with Obama support:
Finally, the strongest predictor of Obama's white support is the number of Southern Baptists in the state (-), which has been suggested as a measurable proxy for "Southernness":
Out-of-sample predictions from a model like this are likely to be highly inaccurate, but for what little it's worth, a regression of Obama support on black population, Hispanic population, state population (logged), 2004 Democratic presidential vote, a dummy variable for whether the state has a caucus, proportion of college graduates, and Southern Baptist population predicts narrow losses for Obama in PA (48%) and OH (49%) but a big win in TX (66%).
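To make the mechanics concrete, here is a minimal sketch of this kind of regression and an out-of-sample prediction. It is not the model or the data behind the numbers above: it uses only three of the seven predictors (black population share, logged state population, and the caucus dummy), every data value is made up, and the helper names `solve`, `ols`, and `predict` are hypothetical.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Fitted value for one state: intercept plus coefficient times predictor."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Made-up states: (black population share, log of population, caucus dummy).
states = [
    (0.03, math.log(3.0e6), 1),
    (0.01, math.log(1.3e6), 0),
    (0.29, math.log(4.4e6), 0),
    (0.07, math.log(3.6e7), 0),
    (0.04, math.log(5.0e6), 1),
    (0.26, math.log(9.5e6), 0),
    (0.12, math.log(1.9e7), 0),
    (0.01, math.log(1.5e6), 1),
]
# Made-up Obama shares of the two-candidate vote for those states.
support = [0.68, 0.48, 0.67, 0.45, 0.67, 0.66, 0.52, 0.75]

coefs = ols(states, support)
# A hypothetical state that has not voted yet: 11% black, 12.4M people, primary.
new_state = (0.11, math.log(1.24e7), 0)
print("coefficients:", [round(c, 3) for c in coefs])
print("predicted support: %.3f" % predict(coefs, new_state))
```

With so few contests and this many predictors, estimates like these come with wide uncertainty, which is one reason the out-of-sample predictions should be read loosely.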
I'm just eyeballing your plots here, but it seems like "Obama support by Democratic presidential vote" is pretty much flat for primary states (even excluding DC).
Similarly, you see a trend in "Obama support by state education" - but again it looks pretty flat for primary states (excluding DC), strongly positive for caucuses.
I think you're contaminating your sample here. After all, you've already demonstrated that there's a big separation between caucus and primary election results.
Posted by: Jinchi | February 20, 2008 at 12:28 PM
Brendan,
Given that Obama seems to improve in a state the more he campaigns, does/should that diminish the weight of the Super Tuesday data, or at least impact our interpretation of them? For example, if they re-held Massachusetts today would the numbers be the same? And what would that tell your analysis?
(forgive weak logic: I am not a wonk)
Keep up the good work, Brendan!
Posted by: T_Porter | February 20, 2008 at 12:57 PM
Thanks for these analyses, Brendan. I don't deal with data like these much, but I had two questions.
1) You exclude IL, AR, and NY. Perhaps HI and KS should be excluded for similar reasons?
2) Picking up on T_Porter's point, I wonder if there are publicly-available data you might add to your model to factor in the effect of time a candidate spends in a state in the month prior to the vote?
A model with a factor for money spent on advertising would be interesting too, but I wonder where you could possibly get such data.
Posted by: Ben | February 20, 2008 at 01:54 PM