By David K. Randall
NEW YORK, Nov 9 (Reuters) - Nate Silver is an oxymoron come to life: the famous statistician.
After successfully calling the Electoral College results in all 50 states ahead of the U.S. presidential election, Silver, the man behind the popular FiveThirtyEight blog, has quickly become a symbol of the new power of data in politics.
Television pundits across the political spectrum praised his accuracy on the night of the election. His book, "The Signal and the Noise," about the science of prediction, shot up to number two on Amazon.com.
Silver spoke with Reuters Thursday evening. Here is an edited and condensed version of the conversation.
Q: Some of the more established polls this year had some of the worst results. Why do you think that was?
A: I think pollsters have to get back to the basics here. Do you have a poll that is actually calling everyone? Some of the polls that didn't include cellphones had bad years and that's what you would expect. If you aren't taking a representative sample, you won't get a representative snapshot. Internet polls, like Ipsos and those like it, did pretty well. We are living our lives more online and you need to have different ways to capture that. (Ipsos is the polling partner of Reuters. A report by Fordham University ranked Ipsos/Reuters first of 28 polling organizations in accuracy of final, national pre-election estimates.)
Q: Before the election, you were criticized by some politicians and pundits who said the race was much closer than what your model suggested. Where was the discrepancy?
A: It helps to have a set of rules that you set up. You have to look at the data in a consistent way and an unbiased way and not be fooled by the noise associated with polling. So many people were distracted by the fact that you had polling firms that had outliers, whether from error or poor methodology. The outlier polls got the headlines, whereas the consensus was clear that Obama had a lead in the swing states.
Q: How did you get interested in forecasting?
A: A lot of it was baseball. It was hoping to win my fantasy league that drew me to sabermetrics (which applies statistical analysis to predict the performance of athletes). I was drafting Bobby Bonilla and Robin Ventura, these players who had big brand names that weren't that good anymore. I then started to apply this stuff to analyze real-world problems.
Q: What are the limitations of predictive modeling? Where doesn't it work?
A: A lot of things can't be modeled very well. In the book, I look at how difficult it is for economists to forecast jobs and growth. The techniques used are either algorithms or you spitball it (guess at the answer). Neither works all that well. Some of that is acknowledging that the economy is an incredibly complex organization. It's hard to predict 3 million people interacting with each other. Politics are an empirically answerable question. The polls in general elections are pretty reliable, and you can trust them 90 percent of the time. That's not as true in primary elections because it's a different turnout.
Q: Politicians take their own polls, of course, and have their own ways of analyzing data. Did you talk with either the Obama or Romney campaigns about how they were doing it?
A: Only a little bit. I think there are a lot of great reporters who have good conversations with the campaigns, and I had conversations with the campaigns, too. I trust the public information more.
Q: How will polling change before the 2016 election?
A: Polls did pretty well on the whole, but in four years you will see more Internet-based polling. That's really a success story this year, and those firms did quite well for themselves. The Google poll was almost perfect, much better than the Gallup poll. We're living in a world where Google beats Gallup.