 Monday, November 17, 2008
Forecasting

For all the lavish praise that political pundits and bloggers heaped on Nate Silver for his accurate projection of the presidential election at fivethirtyeight.com, it's surprising how many of them still don't really grasp what he did.

Here's what a pollster does: He asks some people, "If the election were held today, who would you vote for?" (or a similar question). Various demographic questions are also asked to help the pollster compare the randomly selected sample against the (presumed) general population. Adjustments are made accordingly, and the pollster reports the result: If the election were held today, the vote would be XXX,000 for candidate A and YYY,000 for candidate B. The pollster's employer (i.e., some news media company) announces that, as of today, candidate A is leading by Z points.
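
To make the "adjustments" step concrete, here is a minimal sketch of one common approach, reweighting the sample by demographic group. Everything in it is an assumption for illustration: the function name `poststratify`, the two age groups, the sample counts and the population shares are all made up, and real pollsters use more elaborate schemes.

```python
def poststratify(sample, population_share):
    """Reweight a poll sample so each group counts in proportion to its
    assumed share of the population; return candidate A's adjusted share."""
    # Count respondents and A-supporters within each group.
    totals = {}
    for group, choice in sample:
        n, a = totals.get(group, (0, 0))
        totals[group] = (n + 1, a + (choice == "A"))
    # Weight each group's support for A by its assumed population share.
    return sum(population_share[g] * a / n for g, (n, a) in totals.items())


# Hypothetical sample in which under-30 respondents are over-represented.
sample = (
    [("under30", "A")] * 30 + [("under30", "B")] * 10
    + [("over30", "A")] * 24 + [("over30", "B")] * 36
)
population_share = {"under30": 0.2, "over30": 0.8}  # assumed, not real data

raw = sum(choice == "A" for _, choice in sample) / len(sample)
adjusted = poststratify(sample, population_share)
print(f"raw share for A: {raw:.2f}, adjusted share for A: {adjusted:.2f}")
```

In this invented sample the raw count has A ahead, while the reweighted figure has A behind, which is why the adjustment step matters before anyone announces who is "leading."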

Note that the pollster does not tell you who will win the election if it is held on election day. That's by design. The news media doesn't want that, because it's much more newsy to have a score that bounces up and down (now B is gaining, now B is ahead, now A retakes the lead) than a daily report of the same probable outcome slowly becoming more probable. The pollster likes it that way, too, because it's less work and less room to be wrong. If the pollster predicts who will win and he's too far off, people will rightly question his ability to predict. If the pollster states who is ahead right now, and then the actual election results come out different, the answer is simply that the electorate "moved" in those last few days.

What Nate Silver did (along with a couple of others like him) is take all that polling data, combine it with what is known about past polling data and elections, and run it through a model that attempts to predict the outcome of the election. This is possible because a plain poll result, even an accurate and well-designed one, is not as good an indicator as the same poll result altered by certain educated assumptions. The average person sampled by a poll pays less attention to politics than the average person reading the poll. It's entirely possible that the polled person doesn't know who he will vote for. Or possibly he thinks he knows but in fact he'll vote the other way.

The eventual vote is not entirely predictable, obviously, but it is somewhat predictable, and if it's early enough in the campaign season there are factors that predict the end result better than sampling the voters. Better than either is some sort of model that intelligently combines polls and external factors. That's essentially what Nate Silver did, and his methodology is spelled out right there on the site, complete with discussions of what it does and doesn't attempt to correct for.
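
As a rough illustration of what "intelligently combines polls and external factors" could look like, here is a toy sketch. It is not the fivethirtyeight.com methodology; the function `blended_forecast`, the `half_life` parameter and the numbers are all hypothetical. The only idea it encodes is the one above: far from election day the external factors carry more weight, and close to election day the polls do.

```python
def blended_forecast(poll_margin, fundamentals_margin, days_out, half_life=60.0):
    """Blend a poll-based margin with a margin implied by external factors.

    The weight on the polls is 1.0 on election day and halves for every
    `half_life` days of distance from it (an assumed, made-up schedule).
    """
    poll_weight = 0.5 ** (days_out / half_life)
    return poll_weight * poll_margin + (1.0 - poll_weight) * fundamentals_margin


# Hypothetical inputs: polls show A up 4 points, external factors suggest A down 2.
for days_out in (120, 60, 0):
    margin = blended_forecast(4.0, -2.0, days_out)
    print(f"{days_out:3d} days out: projected margin for A = {margin:+.1f}")
```

The halving schedule is just one simple way to shift weight over time; a real model would fit that weighting to past elections rather than pick it by hand.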

Fivethirtyeight.com is able to exist partly because better information technology makes such modeling more feasible, but even more because the Internet has created access to the narrow market for such information. There are enough wonks out there who are interested in such projections to support the efforts of two or three websites. But that hasn't changed the fact that the majority of journalists and their readers are more interested in the imaginary horse race.

The same blogger who whooped that "Nate Silver owned this election" will the very next day jump into an argument with other pundits about "when" McCain lost the election. Was it the Palin nomination? Was it the drop in the stock market? To make their case, they drag out a chart of poll numbers bouncing up and down and they point to the day where McCain's numbers seem to turn downward as if that proves that the events of that day must be the ones that determined the outcome of the race. It doesn't work that way.

Who's Up, Who's Down

The notion of a score that moves around from day to day is deeply ingrained in our election-watching consciousness, even to the point of absurdity. I have in my inbox an email about the Alaska Senate election. It is dated Nov 13, and the subject line is "Begich now ahead." The text of the email is a copy of a story in the Anchorage Daily News. The first two sentences of the story begin "Mark Begich made a dramatic comeback Wednesday..." and "Begich, who was losing after election night, now leads...."

Not only does the score bounce up and down before the election; apparently it even bounces up and down after the election. This is absurd. The votes were all cast Nov 4. There is no way that any candidate can take or lose the lead after they have been cast. The result is already set; all that changes is what we know about it.

A better analogy than the horse race would be predicting the weather. If you've scheduled your wedding for March 31, 2009, at 2:00 pm, and you want to know what kind of weather you'll have for your outdoor ceremony, there is a progression in your ability to predict it. If you tried to forecast the weather right now, you'd have very little idea, no more than a general sense of what the weather has been at that time of year in previous years. If you look at the sky at 1:45 pm on March 31, 2009, you'll be able to predict the next hour with near certainty. Between now and then there is a gradual increase in the value of your projection. Some time around March 20 you can start getting weather forecasts of some value, and they'll gradually get better until, some time around March 29, they're fairly reliable though still not a sure thing. With polling data and elections the time scale is different, but the general trend is the same.

If your wedding planner were to look outside every day between now and March and ask, "If the wedding were held today, would there be sunshine or rain?", that would be stupid. But that's pretty much our attitude toward election poll results.

9:58:46 PM