Forecasting the 2016 Election Event Recap
October 25, 2016
David Rothschild, an economist at Microsoft Research and the creator of the event forecasting tool PredictWise, focuses primarily on forecasting and understanding public interest and sentiment. He has written extensively, in both the academic and popular press, on polling, prediction markets, social media and online data, and predictions of upcoming events, especially elections. Since joining Microsoft in 2012, he has been building prediction and sentiment models and organizing novel, experimental polling and prediction games; this work has been used by Bing, Cortana, and Xbox. He holds a Ph.D. in applied economics from the Wharton School of the University of Pennsylvania.
David began by explaining how he thinks about using empirical research to inform policy, and how the election cycle gives us a unique opportunity to look back on how data has informed past policies, too. First, he emphasized the importance of understanding where measurements of public policy and opinion come from. Traditional polling is built on the concept of a target population, such as the likely voters within a country or state. Every member of that population must have a known, non-zero probability of being contacted; in theory, a small random sample is then enough to pin down the quantity you want to estimate to within a few percentage points.
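To make "a few percentage points" concrete, here is a minimal sketch (not from the talk) of the textbook margin-of-error calculation for a proportion estimated from a simple random sample:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A sample of 1,000 respondents pins the result down to
# roughly +/- 3 percentage points.
print(f"{margin_of_error(1000):.3f}")  # ~0.031
```

This is why a national poll of roughly 1,000 people can, in principle, estimate the preferences of over 100 million voters to within about three points.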
Unfortunately, this polling method runs into complications in reality. In the case of election polling, many likely voters are unlikely to be reached at all: they have no landlines, no cell phones, and can't be identified via an online poll. In addition to this coverage error, non-response error can also skew results: if randomly selected individuals choose not to answer, the sample no longer reflects the population it was drawn from.
David illustrated his point through the specific example of a hypothetical Sanders vs. Trump vs. Bloomberg race. When Mayor Bloomberg was deciding whether he should run for President, he hired pollster Douglas Schoen to find out his probability of winning. Schoen produced results saying Bloomberg would win, but his conclusions were fundamentally flawed. As David reminded us, we can never know the likelihood of an event that has never happened, so there is no way to judge such a prediction's accuracy.
Another example of bias in polling can be found in Obama's approval rating. Although his ratings had been extremely stable, they shot up in the months before the election. Why? Because he compares extremely well when respondents have just been asked about his potential replacements. So, is this an artifact of polling? And do we care? Schoen made the same error in his polling for Bloomberg: before people answered his questions, he reminded them of all of the Mayor's accomplishments to make sure they knew who he was.
The entire industry of polling is built on this idea of probability sampling, but because we don't have comparable outcomes for most of the things we measure (like Obama's true approval rating, or whether Mayor Bloomberg would have won the election), we need a sound procedure for collecting responses so that we can trust that the result is meaningful.
When you average multiple polls, the impact of all of these errors is mitigated: in recent elections, the aggregated prediction has rarely been more than two percentage points away from the actual result. For David, this raised the question of whether you can take simple, cheap, and fast samples and still make the data useful.
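As a rough illustration of why aggregation helps (the poll numbers below are invented), independent errors partially cancel when averaged:

```python
# Hypothetical results (a candidate's share, in %) from several
# pollsters, each with its own coverage and non-response error.
polls = [47.0, 49.5, 45.8, 48.2, 46.9]

# Errors that are roughly independent across pollsters partially
# cancel, so the aggregate is typically closer to the truth than
# any single poll.
average = sum(polls) / len(polls)
print(f"aggregate estimate: {average:.1f}%")  # 47.5%
```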
Using an online poll from the MSN homepage, he and his team managed to take a data set heavily skewed toward older white men and make it representative of the entire population. In traditional polls, each respondent carries demographic attributes: gender, age, race, geography, and more. Respondents are assigned a weight, typically a value between 0 and 10: if there are too few women in the sample, every woman is given a weight greater than one, and if there are too many men, every man is given a weight of less than one. This method is flawed, however, since it can place too much weight on any one person who happens to belong to an underrepresented group.
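Here is a minimal sketch of that classic weighting scheme, with made-up population and sample shares; each group's weight is simply its population share divided by its sample share:

```python
# Hypothetical shares: target population vs. a raw online sample
# that skews heavily toward men.
population_share = {"women": 0.52, "men": 0.48}
sample_share     = {"women": 0.30, "men": 0.70}

# Classic post-stratification weight: population share / sample share.
# Underrepresented groups get weights > 1, overrepresented groups < 1.
weights = {g: population_share[g] / sample_share[g]
           for g in population_share}
print(weights)  # women ~1.73, men ~0.69
```

The flaw David described shows up when a group is badly underrepresented: if a group makes up 1% of the sample but 10% of the population, each of its few respondents gets a weight of 10, and a handful of people can swing the entire result.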
What David did instead was look at a much larger set of demographics: state, age, race, sex, latent party identification, education level, and essentially any other variable that distinguishes one voter from another. Crossing these variables yields hundreds of thousands of unique combinations, partitioning the population into very small, cleanly defined cells. He then fits a model to the survey data that predicts how each of these cells would have responded to the poll. Finally, the cell-level predictions are aggregated, with each cell weighted by its share of the target population. An older, white male voter is thus informed by the trends of the entire male population, the entire older population, the entire white population, and so on. This allows for a more accurate polling result that factors in every piece of historical data available to the pollsters.
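This approach resembles multilevel regression and post-stratification (MRP). Below is a heavily simplified sketch of the idea, with made-up respondents, cells, and population shares; a production system would fit a proper multilevel regression so that sparse cells borrow strength from similar cells, rather than the naive shrinkage used here:

```python
from collections import defaultdict

# Hypothetical respondents: (state, age group, race, sex) -> support (0 or 1).
responses = [
    (("OH", "65+", "white", "M"), 1),
    (("OH", "65+", "white", "M"), 1),
    (("OH", "18-29", "black", "F"), 0),
    (("CA", "18-29", "hisp", "F"), 0),
    (("CA", "30-44", "white", "F"), 1),
]

# Hypothetical share of the electorate in each demographic cell; in a
# real application these shares come from census or voter-file data.
cell_share = {
    ("OH", "65+", "white", "M"): 0.10,
    ("OH", "18-29", "black", "F"): 0.05,
    ("CA", "18-29", "hisp", "F"): 0.20,
    ("CA", "30-44", "white", "F"): 0.15,
    # ...hundreds of thousands of cells in practice
}

# Step 1: model each cell's response. Here we just shrink each cell's
# mean toward the overall mean so that tiny cells aren't taken at
# face value.
overall = sum(y for _, y in responses) / len(responses)
by_cell = defaultdict(list)
for cell, y in responses:
    by_cell[cell].append(y)

def predict(cell, shrinkage=2.0):
    ys = by_cell.get(cell, [])
    return (sum(ys) + shrinkage * overall) / (len(ys) + shrinkage)

# Step 2: post-stratify. Average the cell predictions weighted by each
# cell's share of the *population*, not its share of the sample.
total = sum(cell_share.values())
estimate = sum(share * predict(cell)
               for cell, share in cell_share.items()) / total
print(f"post-stratified support: {estimate:.2f}")
```

The key design choice is that the skewed sample is used only to fit the model; the final estimate is weighted entirely by the population's demographic composition, which is what makes a non-representative online poll usable.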
David continues to use this methodology to predict this year's election, to collect data on questions like the greatest NBA player of all time, and more. You can see the results on MSN.com or on his site, PredictWise.com.