Can early primary polls tell us who is likely to win their party's nomination for president? In a post a few months ago, I argued they provide little information:
If we exclude sitting vice presidents (George H.W. Bush, Al Gore) and the vice president from the previous administration (Walter Mondale), the horse race polls [early in the year before the presidential election] only correctly predict the nominee two times out of eight (Bob Dole and George W. Bush).
New York Times blogger Nate Silver caricatured this position in a post yesterday. After stating that "[t]he correlation" between early polling and share of the primary popular vote "is far from perfect" but "also far from zero (in fact, it's a moderately strong 0.72)," he wrote the following (I'm linked in the parenthetical about "smart people"):
There is a fairly strong relationship between the candidates’ polling and the number of states and votes they won during the primary process — as well as their chances of winning the nomination...
One could take a variety of more sophisticated approaches with this data — for instance, by accounting in some way for the relative standing of the candidates in addition to their raw numbers. Nevertheless, this underscores that it’s simply quite wrong to suggest (as some smart people have) that early primary polls are meaningless. Instead, they have a reasonable amount of predictive power.
A more defensible hypothesis might be that one should account for any number of objective and subjective factors in addition to the national polls. It also might be the case that an expert could reliably identify candidates who were considerably stronger or weaker than suggested by their polling alone.
While I appreciate the compliment, I don't think it's a fair representation of my position. The fact that polling is correlated with primary performance shouldn't be surprising. Silver's data includes incumbent presidents who faced primary challenges and sitting vice presidents (whom we would expect to poll well and win their party's nomination) as well as fringe candidates with no chance of winning the nomination (who will poll and perform poorly). In both cases, candidates' poll performance is largely a reflection of their institutional advantages or disadvantages rather than an independent factor predicting success or failure. (Jonathan Bernstein made a similar point this morning.)
To understand how much polls can tell us about who wins their party nomination, it's necessary to more carefully account for the "objective and subjective factors" to which Silver alludes, which is precisely what I did (crudely) in my previous post when I excluded cycles in which the incumbent vice president or the vice president from the previous administration was running.
Since Silver generously provided his data online (see links in his post), we can do this a bit more systematically. First, in the contemporary era, no candidate has won their party's nomination without having served as president, vice president, senator, or governor. I therefore exclude candidates who lacked those qualifications (polls aren't necessary to tell us that these candidates will most likely lose) as well as incumbent presidents and vice presidents (polls aren't necessary to tell us that they will most likely win). Similarly, early polls often include candidates who are unlikely to run (past presidential nominees, celebrities, etc.) or longshot candidates who are unlikely to win, so I limit my focus to the top-tier candidates who are polled most frequently (a proxy for perceived viability). Specifically, I restrict the sample to the six most frequently polled candidates in each party and election cycle, including more than six when there is a multi-way tie (no candidate outside the top six has won the nomination in this period).
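For readers who want to see the restriction spelled out, here is a minimal sketch of those rules in Python. It is not the code used for this post: the field names (`highest_office`, `incumbent`, `n_polls`, and so on) are hypothetical and do not come from Silver's files, and it assumes the top-six cut is applied after the office and incumbency exclusions, which the description above does not pin down.

```python
from collections import defaultdict

# Offices treated as qualifying, per the rule described above.
QUALIFYING_OFFICES = {"president", "vice president", "senator", "governor"}


def restrict_sample(candidates, top_n=6):
    """Apply the sample restrictions to a list of candidate records.

    Each record is a dict with hypothetical keys:
      name, party, cycle, highest_office, incumbent, n_polls
    where `incumbent` is True for incumbent presidents and sitting
    vice presidents, and `n_polls` counts the early polls that
    included the candidate.
    """
    # Step 1: drop candidates without a qualifying office and drop incumbents.
    eligible = [
        c for c in candidates
        if c["highest_office"] in QUALIFYING_OFFICES and not c["incumbent"]
    ]

    # Step 2: group the remaining candidates by party and election cycle.
    groups = defaultdict(list)
    for c in eligible:
        groups[(c["party"], c["cycle"])].append(c)

    # Step 3: keep the top_n most frequently polled in each group,
    # keeping everyone tied with the last-ranked candidate.
    kept = []
    for group in groups.values():
        ranked = sorted(group, key=lambda c: c["n_polls"], reverse=True)
        if len(ranked) > top_n:
            cutoff = ranked[top_n - 1]["n_polls"]
            ranked = [c for c in ranked if c["n_polls"] >= cutoff]
        kept.extend(ranked)
    return kept
```

Keeping everyone at or above the sixth-ranked poll count is one way to handle the multi-way ties mentioned above; other tie-breaking rules would give slightly different samples.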
Among this elite subset of non-incumbent candidates, the relationships are much weaker. When we disaggregate the results by party, we see they are largely driven by Republicans and a handful of celebrity Democrats (results shown are for states won but are similar for popular vote):
When we try to predict who wins the nomination using polling averages among these groups, the overall relationship is significant but appears to be driven by Republicans.
Even these graphs overstate the relationship between polling and primary outcomes, however, because they fail to account for other structural advantages held by candidates that were known at the time. When we exclude those candidates who previously came in second in their party's last contested primary (a group that includes Ronald Reagan in 1980 and John McCain in 2008) and those who are related to a previous president (Ted Kennedy, Hillary Clinton, and George W. Bush), the relationship is weaker still:
In this case, we can't even estimate the relationship between the probability of winning and the candidate's polling average among Republicans because all of the contested nominations in the data were won by incumbents (Ford in 1976), sitting vice presidents (George H.W. Bush in 1988), previous runners-up (Reagan in 1980, Dole in 1996, McCain in 2008), or people related to a previous president (George W. Bush in 2000).
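The underlying problem is that once every Republican winner has been excluded, the outcome no longer varies: everyone left in the sample lost, so no model of the probability of winning has anything to distinguish. The sketch below illustrates that check. The flag names and the record layout are hypothetical, chosen only for illustration, and are not taken from the data used in this post.

```python
# Hypothetical boolean flags marking the structural advantages discussed above.
EXCLUDED_FLAGS = (
    "incumbent",
    "sitting_vp",
    "previous_runner_up",
    "related_to_president",
)


def apply_exclusions(rows):
    """Drop any candidate record that carries one of the excluded flags."""
    return [r for r in rows if not any(r[flag] for flag in EXCLUDED_FLAGS)]


def outcome_varies(rows):
    """Return True only if the sample contains both winners and losers.

    Without variation in `won_nomination`, a relationship between
    polling and the probability of winning cannot be estimated.
    """
    return len({r["won_nomination"] for r in rows}) > 1
```

If `outcome_varies(apply_exclusions(rows))` is false for a party, the polling average cannot separate winners from losers in that subsample because there are no winners left to separate.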
In short, the evidence again suggests that early polls don't tell us much about who will win party nominations -- they're largely the result of name recognition and the structural (dis)advantages held by candidates before they enter the race. My headline "Early primary/straw polls don't matter" may have been too strong, but I stand by my conclusion that "At this point in the election cycle, the preferences that matter are those of the activists, elected officials, donors, and party elites who take part in the so-called 'invisible primary.'" Among the subgroup of viable GOP candidates, that's where the most important action is taking place right now -- and it's why I'd bet on Tim Pawlenty despite his low poll numbers.