April 23, 2008


Thanks once again, Brendan, for putting together these analyses. As a guaranteed Democratic voter in November who leans towards Obama, I find it disappointing to see this trend continue.

Quick stats question: Were the covariates in the multivariable model chosen through some sort of elimination process (e.g. forward or backward stepwise regression)? If so, which potential covariates dropped out once the significant ones were in the model?

Also, was the decision to use log(total state population) just an issue of scaling?

Nothing stepwise -- I included the demographic variables that have been discussed as predictors of Obama support in the press. The regression from which the significant coefficients above are reported also includes Hispanic and Southern Baptist population. At one point I also included median income but it was highly correlated with education. (See the previous posts linked above for more details.)
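For readers wondering why a predictor that is highly correlated with another one might be left out: a rough sketch of the check, using a hand-written Pearson correlation. The function name `pearson_r` and all the numbers are made up for illustration; they are not the state-level data behind the regression.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values for six hypothetical states (NOT real data).
pct_bachelors = [18.0, 22.0, 25.0, 28.0, 33.0, 37.0]   # % with a bachelor's or higher
median_income = [36.0, 41.0, 44.0, 50.0, 55.0, 62.0]   # thousands of dollars

r = pearson_r(pct_bachelors, median_income)
print(f"correlation between education and income: {r:.3f}")
```

When two predictors track each other this closely, a regression has trouble separating their effects, so keeping only one of them is a common choice.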

Thanks! So then, is it safe to say that even when adjusted for each other, all the following remained significant (let's say, p<0.10)?
% black population
the log of population
Democratic presidential vote (% statewide in 2004, I assume?)
whether the state has a caucus

* I read through some of the older posts and couldn't find how education was parameterized. Does it represent the proportion of people in the state with Bachelor's degrees (or higher)?

(All of this is my curiosity, not intended as criticism, BTW. Nice work on an interesting topic.)

Yes, education is % bachelor's or higher. DPV is the '04 number. Log of state population is a way to handle the massive population differences between states. Statistically significant above is defined as p<.10 in multiple regression. Thanks for the kind words...
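To illustrate the point about the log of population, here is a minimal sketch of how the transformation compresses large differences. The populations are hypothetical round numbers chosen for illustration, not census figures, and the natural log is an assumption (the thread does not say which base was used).

```python
import math

# Hypothetical round-number populations, for illustration only (not census data).
populations = {
    "small state": 500_000,
    "mid-size state": 5_000_000,
    "large state": 36_000_000,
}

for name, pop in populations.items():
    print(f"{name:>15}: population = {pop:>10,}  log(population) = {math.log(pop):.2f}")

# On the raw scale the largest state is many times the smallest;
# on the log scale the same gap is a difference of only a few units.
raw_ratio = populations["large state"] / populations["small state"]
log_gap = math.log(populations["large state"]) - math.log(populations["small state"])

print(f"raw ratio, largest to smallest: {raw_ratio:.0f}x")
print(f"gap on the log scale: {log_gap:.2f}")
```

On the raw scale, the few largest states would dominate the fit; on the log scale, equal steps correspond to equal proportional changes in population.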

