
December 03, 2007


And Edwards is how far back?

Read the Elder article, which is quite good, but I still think your analysis of the poll isn't quite right either. Obama and Clinton aren't so much "statistically tied" as it is that there just isn't 95% confidence that Obama is definitely ahead.

I'm not a statistician so I don't know the exact formula, but I think a more correct headline would be "Obama not definitely leading in Iowa, but poll indicates X% confidence that he is."

Put another way, a true "statistical tie" would be a result of 28% to 28%, because at that point the statistics would be equally confident that either Obama or Clinton is ahead. These results don't show that, however; the statistics are more confident that Obama is leading, though not 95% confident.

No, no it doesn't. It means there's still a reasonable chance that Clinton is ahead, but it is in fact more likely that Obama leads, given the results of the poll.

Also, I love that the margin of error is quoted to one decimal place while the poll results are rounded to integer percentages. With rounding, Obama might lead by very nearly the margin of error here, which would give Clinton only about a 5% chance of leading. That would still be a "statistical tie" by the conventions you advocate here, yet Hillary would very likely be behind.

You need to better understand the meaning of "margin of error." Here's a good primer:


In the DMR poll, there is an 82% chance that Obama leads.
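A figure like that "82% chance" can be reproduced with a normal approximation to the difference between two shares of one multinomial sample. Here's a minimal sketch; the inputs (Obama 28%, Clinton 25%, roughly 500 respondents) are my assumptions about the DMR poll, not numbers stated in this comment:

```python
from math import erf, sqrt

def prob_leader_ahead(p1, p2, n):
    """Approximate probability that the poll leader is actually ahead,
    using the normal approximation to the sampling distribution of the
    difference p1 - p2 within a single multinomial sample."""
    diff = p1 - p2
    # standard error of the difference between two shares of one sample
    se = sqrt(p1 + p2 - diff**2) / sqrt(n)
    z = diff / se
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# Assumed DMR figures: Obama 0.28, Clinton 0.25, n ~ 500
print(prob_leader_ahead(0.28, 0.25, 500))  # roughly 0.82
print(prob_leader_ahead(0.28, 0.28, 500))  # exactly 0.5: a true tie
```

Note the second call: only at 28%-28% does the poll assign each candidate an equal chance of leading, which is the "true statistical tie" point made above.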

The phrase "statistical tie" is *wildly* misleading, far worse than suggesting that a 3% difference is significant here. To blithely say the leader "isn't winning" is even more misleading; the leader certainly isn't "losing".

It is still *much* more likely that the leader would win a poll of the entire population, just not quite to the standard confidence level. Had it been a 1% difference, I might have cut you some more slack. But a 3% difference here is not entirely trivial. Your attempt to trivialize it suggests that you don't really understand what "margin of error" means.

So, second guess:

I'm guessing you're just ignoring Edwards?

The "new" poll was conducted over Nov 7-25, which is a very long time, and also ends as the DMR poll began. So, they're not measuring the same thing at all.

The statement:

But when looking at horse race numbers in a political poll, particularly in Iowa, with its quirky caucus system, historically low turnout (5 percent of Iowans participated in the Democratic caucus in 2004) and rules that change from one year to the next — this year Iowans can register to vote at the door on caucus night — the margin of sampling error is probably best applied in its strictest sense.

is nonsense. That's saying that, because we can't estimate all kinds of systematic errors, we should use the value of the sampling error as a flat confidence interval. I'm sure you've done posts before on the fallacy of using made-up numbers as a starting point to revise into something useful, and that logic certainly applies to the statement above.

One should use the statistical confidence interval for its defined purpose or not at all.

It should also be noted that the sampling error does have its meaning altered when there are more than two choices and the measured probabilities are far from 50%. I'll defer to the indispensable pollster.com for the details, though for the poll under discussion the change is minimal.
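To make that point concrete: the headline margin of error is computed at the worst case of 50%, but the margin of error for the *gap* between two candidates depends on their actual shares. A sketch, again assuming shares of 28% and 25% and roughly 500 respondents (my assumptions, not figures from the comment):

```python
from math import sqrt

def moe_share(n, z=1.96):
    # headline "margin of error": worst case for a single share at p = 0.5
    return z * 0.5 / sqrt(n)

def moe_difference(p1, p2, n, z=1.96):
    # 95% margin of error for the gap p1 - p2, with both shares drawn
    # from the same multinomial sample (so they are negatively correlated)
    return z * sqrt(p1 + p2 - (p1 - p2)**2) / sqrt(n)

print(moe_share(500))                    # about 0.044, i.e. +/- 4.4 points
print(moe_difference(0.28, 0.25, 500))   # about 0.064, i.e. +/- 6.4 points
```

So the proper 95% interval on the 3-point gap here is about ±6.4 points, narrower than the ±8.8 you'd get by naively doubling the headline margin, though the conclusion for this poll barely changes: either way, the gap falls well inside it.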

Thanks for the tip, Larry - I've added details on the poll's timing above. I want to be clear that I'm taking Elder's statement in a heuristic sense. Minor differences in numbers in a single poll are not particularly meaningful. The CI is not especially helpful, but requiring the press to use a .95 standard would at least constrain them from reporting tiny, fluctuating differences as "news" and telling stories about what caused those differences. The kind of big leads we could be confident are real are likely to exceed the .95 threshold.

PS This whole comment thread can be interpreted as a meta-debate about frequentist versus Bayesian statistics. Everyone implicitly wants to be a Bayesian and interpret p-values directly, but the frequentist paradigm is ill-suited to making those sorts of inferences.

I think I actually managed to get the dates wrong above, not noticing there were actually two old, lengthy polls putting Clinton ahead. That's all cleared up in your latest update.

True enough about the meta-debate, and the press generally can't be trusted to understand this in any detail beyond the "statistical tie" nomenclature (assuming most of them even understand that). But shouldn't the goal be a better-informed press corps?

It is also true that the press over-reacts, looking at the polls in the horse-race sense and then writing stories "mind-reading" what the electorate must be thinking. But, again, I don't think the way to combat this is by forcing a simpler view of reading polls onto the press.

> In the end, the best approach is to consider all of the polls.

Agreed. If your point was "other polls suggest a closer race", then it would have been better to say so without suggesting that the (factually correct) reporting on this single poll was flawed.
