At Accel.ai's Demystifying AI conference, I gave (what was supposed to be) a lightning talk on Big Data and the Presidential Election. The awesomely engaged audience kept it going well over time, and I came out of it with great insights and thoughts to put down in a blog post.
What appealed to the audience was my 'post-mortem' approach, a name that I find deliciously macabre. The goal of the approach is to look at a project after the fact and analyze its successes and failures at every step of the way. In this case, I looked at the presidential election and called the 'project' of predicting the outcome a 'failure', given our astounding inability to predict that Donald Trump would win more Electoral College votes than Hillary Clinton. Let's unpack the discussion, slide by slide.
SLIDE 1: Big Data Failed Us This Election
The premise of the talk. What we know is that not a single major poll was able to accurately predict the outcome of this election. In fact, we failed SPECTACULARLY.
It is worth understanding that we have standalone polls, like Gallup, run by groups who conduct their own polling, and we have metapolls, such as RealClearPolitics, which are amalgamations of other polls. This is an important distinction, because the former are responsible for executing their own surveys and locating their own samples, while the latter are aggregates that rely on an 'ensemble' of polls.
Also worth noting is that this election was supposed to be the launch of Votecastr, which was supposed to provide accurate, real-time insights into the election outcome. It also failed to accurately predict the outcome.
As a result, there has been a lot of post-election big data backlash, some of the more dramatic headlines being the inspiration for this talk's title. Let's unpack why.
SLIDE 2: Our Understanding of Big Data Failed Us This Election
Even within the polling community, the predictions were inconsistent. The New York Times Upshot model, for example, gave Clinton an ~85% chance of winning, while FiveThirtyEight gave her a ~72% chance. That's a pretty big difference.
Let's look deeper. While the polls all agreed that she would win, they disagreed on the methodology for predicting how. Most publicly, Nate Silver got into a heated battle with the Huffington Post over his use of trend adjustment, which HuffPo called "changing the results of polls to fit what he thinks the polls are, rather than simply entering the poll numbers into his model and crunching them."
Rather than taking a simple average -- like RealClearPolitics does -- Silver’s model weights polls by his team’s assessment of their quality, and also performs several “adjustments” to account for things like the partisanship of a pollster or the trend lines across different polls. Still other models take historical trends and demographic shifts into account. There is no clear consensus on a 'best' model.
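To make the difference between the two approaches concrete, here is a minimal sketch. The pollsters, numbers, quality weights and "house effect" adjustments are all made up for illustration; this is not how RealClearPolitics or FiveThirtyEight actually compute anything, just the general idea of a plain average versus a weighted, adjusted one.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    pollster: str        # made-up pollster name
    clinton: float       # reported Clinton share, in percent
    trump: float         # reported Trump share, in percent
    quality: float       # hypothetical quality weight (higher = more trusted)
    house_effect: float  # hypothetical partisan lean, in points of Clinton margin


def simple_average(polls):
    """Unweighted mean of the Clinton-minus-Trump margin."""
    return sum(p.clinton - p.trump for p in polls) / len(polls)


def weighted_adjusted_average(polls):
    """Quality-weighted mean of margins after subtracting each house effect."""
    total_weight = sum(p.quality for p in polls)
    adjusted = sum(
        p.quality * (p.clinton - p.trump - p.house_effect) for p in polls
    )
    return adjusted / total_weight


# Illustrative data only; none of these are real polls.
polls = [
    Poll("Pollster A", 46.0, 42.0, quality=1.0, house_effect=0.0),
    Poll("Pollster B", 48.0, 41.0, quality=0.5, house_effect=2.0),
    Poll("Pollster C", 44.0, 44.0, quality=0.5, house_effect=-1.0),
]

print(f"simple average margin:    {simple_average(polls):+.2f}")
print(f"weighted/adjusted margin: {weighted_adjusted_average(polls):+.2f}")
```

Same three polls, two different answers. Every choice of weight or adjustment is a judgment call, which is exactly where the modelers disagreed.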
SLIDE 3: Our Explanation of Big Data Failed Us This Election
Talking data to media outlets is a dangerous game of telephone. In my opinion, it is the data scientist's responsibility to be as clear and unambiguous as possible about the true meaning of their model. What is the error margin? What is the degree of confidence? What does a "75% chance" mean? (Hint: it doesn't mean that a win is guaranteed.)
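Both questions can be made tangible with a few lines of code. The sketch below assumes a simple random sample (real polls are messier) and uses the textbook margin-of-error formula for a proportion; the function names and the example numbers are my own, chosen for illustration.

```python
import math
import random


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(share * (1 - share) / sample_size)


def upset_frequency(win_probability, trials=10_000, seed=0):
    """Fraction of simulated elections in which the favorite loses."""
    rng = random.Random(seed)
    losses = sum(1 for _ in range(trials) if rng.random() >= win_probability)
    return losses / trials


moe = margin_of_error(0.48, 1000)
print(f"48% support, n=1000: +/- {moe * 100:.1f} points")
print(f"favorite at 75% loses in {upset_frequency(0.75):.0%} of simulated runs")
```

A candidate with a "75% chance" still loses about one simulated election in four. That is not a model failure; it is what the number means.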
Of course, this is sometimes at odds with current trends of clickbait journalism. "Clinton win likely with a 62 to 89 percent probability" is not as eye-catching or click-inducing as "Clinton 90% likely to win." What to some may be semantics is to us the meat of the discussion.
As scientists, we got caught up with selling precision. Polls are notoriously flawed, and predictive models built on polls have a wide margin of error. At best, we over-reported how good our models were; at worst, people used that margin of error to their advantage.
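Here is a rough sketch of how much a headline probability depends on the error you assume. It treats the true margin as normally distributed around the polling lead, which is a simplifying assumption of mine and not the method used by any of the forecasters mentioned above; the lead and error values are illustrative.

```python
from statistics import NormalDist


def win_probability(lead_points, error_sd_points):
    """P(true margin > 0), assuming true margin ~ Normal(lead, error_sd)."""
    return 1.0 - NormalDist(mu=lead_points, sigma=error_sd_points).cdf(0.0)


lead = 3.0  # hypothetical polling lead, in points
for error_sd in (2.0, 3.0, 5.0, 8.0):  # hypothetical total polling error
    prob = win_probability(lead, error_sd)
    print(f"lead {lead:.0f} pts, error sd {error_sd:.0f} pts -> {prob:.0%} to win")
```

With the same 3-point lead, assuming 3 points of error gives roughly 84%, while assuming 5 points gives roughly 73%. The "precision" in the headline is mostly a statement about how much error the modeler was willing to admit.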
SLIDE 4: Our Understanding Of How We Collect Big Data Failed Us This Election
Polling, as I mentioned above, is notoriously flawed. As a master's student (about a decade ago!) I sat in on many discussions and panels about declining response rates, sample biases, the rise of do-not-call lists and how to get people to tell the truth in polls. The share of households that agreed to participate in a telephone survey by the Pew Research Center dropped from 43 percent in 1997 to 14 percent by 2012. This was before the contention and mistrust sown by the current social and political climate.
Long story short, these problems have (some) methodological workarounds, but are far from solved. In fact, some of them have gotten worse. Depending on where you live, there may be a strong incentive to lie about your vote to align with your region's preference.
In other words - GIGO, or garbage in, garbage out. If the data going into our models is flawed, the analyses coming out are not trustworthy.
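A small simulation shows how this plays out. Assume, purely for illustration, an electorate split exactly 50/50 in which one candidate's supporters answer the phone half as often as the other's. The response rates and function names below are hypothetical, not measured values.

```python
import random


def expected_observed_share(true_share, rate_a, rate_b):
    """Share of respondents backing A when A and B voters respond at different rates."""
    a = true_share * rate_a
    b = (1 - true_share) * rate_b
    return a / (a + b)


def simulate_poll(true_share, rate_a, rate_b, n_contacted, seed=0):
    """Contact n people at random; return A's share among those who respond."""
    rng = random.Random(seed)
    a_responses = b_responses = 0
    for _ in range(n_contacted):
        supports_a = rng.random() < true_share
        responds = rng.random() < (rate_a if supports_a else rate_b)
        if responds:
            if supports_a:
                a_responses += 1
            else:
                b_responses += 1
    return a_responses / (a_responses + b_responses)


# Hypothetical: a 50/50 race, A voters respond 10% of the time, B voters 20%.
print(f"expected: {expected_observed_share(0.5, 0.10, 0.20):.1%}")
print(f"simulated: {simulate_poll(0.5, 0.10, 0.20, n_contacted=200_000):.1%}")
```

A dead-even race shows up as roughly a 33/67 blowout, and contacting more people does not help: a bigger sample only gives you a more precise estimate of the wrong number.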
SLIDE 5: We Failed Big Data This Election
There was a great post-election quote by Erik Brynjolfsson along the lines of "if you understand how models work, you weren't surprised by this election" - apologies that I can't find the source. He was referring to understanding that a probability isn't a certainty, but the point applies more broadly to this election as a prediction project.
Ultimately, if we understand this as a data science project, we failed on all counts:
- we failed to bring in good data that we had faith in
- we failed to build a model that was accurate and delivered good results
- we failed to validate our model
- we failed to communicate our results properly to our audience
What is our takeaway? Humility and introspection. We are only as good as the models we build and the quality of work we produce.