A look at how pollsters have adjusted their methods for the 2020 election cycle.
The election result in 2016 seemed to show a catastrophic error in the US pre-election polls, triggering a great deal of forensic work on what went wrong. Investigations revealed that the national polls were actually quite accurate and correctly saw that Hillary Clinton would win the popular vote, with only a small deviation from the poll estimate. Specific state-level polls for Pennsylvania, Michigan and Wisconsin, however, were another matter. The polling averages for these three states showed Clinton with a solid lead. But they ended up going to Donald Trump by a razor-thin margin, making the difference in the election outcome.
In nearly all US presidential elections, the winner of the national popular vote also wins the Electoral College and thus the presidency. This also appeared to be the likely outcome in the days leading up to Election Day 2016, when forecasts gave Clinton around a 90% probability of winning the presidency, with individual forecasts ranging from 71% to 99%.
Below, we review some of the findings from analysts such as the American Association for Public Opinion Research (AAPOR) and Marist College’s Institute for Public Opinion regarding why key state polls underestimated the support for Trump.
1. Undecided voters played a significant role
In key states, more than half of undecided voters went for Trump. In Pennsylvania, Michigan, Florida, and Wisconsin, 11-15% of voters made their decision in the final week. On a national basis, 20% of voters in 2016 had not decided three months prior to the election. This time around, things look a bit different. Three months before the 2020 election, only 10% of those polled said they were undecided (or did not care), but analysts still see this as sizeable enough to affect the election.
Undecided voters tend to be heavily affected by events, and studies have shown that negative campaigns and campaign ads may have a bigger effect on undecided voters than positive campaigning. Will Biden’s heavy negative ad spending and Trump’s recent cash woes tilt the balance for Biden, or can Trump sprint to the finish on the back of a bad debate performance from Biden? Past election cycles show that significant changes can occur in the final weeks of the campaign, though the polls have been far more stable in this election cycle.
2. Adjustment for education level
This was implemented in many national polls, but fewer state ones. Voters with higher education levels are more likely to complete surveys compared to less-educated peers. In a survey from 2017 looking at typical national polls, 45% of the respondents had a bachelor’s degree or higher, although this number was only around 30% in the general population.
During the two Obama elections, whites with lower educational attainment began tilting more Republican. Furthermore, less-educated voters tend to follow news on a less consistent basis and may thus be more open to persuasion – especially via targeted social media, a possibly decisive factor in battleground states in 2016. Key-state polls in the 2016 election likely overrepresented voters with higher education levels, which was associated with the overestimated support for Clinton.
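The education adjustment described above can be illustrated with a minimal sketch of post-stratification weighting. The sample and population shares come from the 2017 figures quoted above (45% versus roughly 30% with a bachelor’s degree); the support levels per group are purely hypothetical numbers chosen for illustration, and the function name `poststratify` is our own, not taken from any polling organisation’s method.

```python
def poststratify(sample_share, population_share, support_by_group):
    """Reweight groups so the sample matches the population mix.

    Returns the per-group weights, the unweighted estimate and the
    weighted estimate of a candidate's support.
    """
    # Each respondent in group g counts population_share / sample_share times.
    weights = {g: population_share[g] / sample_share[g] for g in sample_share}
    raw = sum(sample_share[g] * support_by_group[g] for g in sample_share)
    weighted = sum(
        sample_share[g] * weights[g] * support_by_group[g] for g in sample_share
    )
    return weights, raw, weighted


# Shares quoted in the text: 45% of respondents vs about 30% of the population.
sample_share = {"degree": 0.45, "no_degree": 0.55}
population_share = {"degree": 0.30, "no_degree": 0.70}

# HYPOTHETICAL support for one candidate in each group (illustration only).
support = {"degree": 0.55, "no_degree": 0.45}

weights, raw, weighted = poststratify(sample_share, population_share, support)
print(f"unweighted estimate: {raw:.3f}")
print(f"weighted estimate:   {weighted:.3f}")
```

With these made-up support levels, the unweighted sample overstates the candidate who does better among degree holders by about 1.5 percentage points, which is the direction of the error attributed to the 2016 state polls.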
Geography plays a role when drawing representative samples for polls, as certain voter classifications may vote very differently depending on whether they live in urban, suburban or rural areas. Following the previous point, a less-educated white man living in the countryside may have a very different political opinion from a less-educated white man living in a large city or suburb in the same state.
4. A change in the voter turnout for key demographics
This was another key factor in the 2016 election relative to the patterns in 2012. Participation increased among Republicans and rural voters in some key states, while turnout decreased for some core Democratic voter groups – especially African Americans. The fact that Clinton had a significant lead in the polls may have kept some Democratic voters on their couches, feeling that their vote would not matter anyway.
5. Shy Trump voters
Trump voters who did not want to reveal themselves in the pre-election polls may have outnumbered the late-revealing Clinton voters in 2016, although no clear effect has been proven. A recent study by CloudResearch suggests that for the 2020 election, Trump voters are half as likely as Biden supporters to reveal their true opinion about their preferred presidential candidate.
Polling organisations conduct surveys in different ways and through different media, and may thus be biased toward certain voter segments. As an example, 10% of American adults do not use the internet – an internet-based survey will underrepresent this group, which stereotypes might describe as a 65+ person with no higher education and low income, living in rural areas. The perfectly unbiased survey will forever remain unachievable, but being aware of these biases can help pollsters adjust for overrepresentations.
Over time, especially due to the internet, the barriers for conducting a poll have been drastically lowered, and the polling landscape is easily polluted by low-quality polls. In many polls, the errors tend to repeat in similar states, introducing a systematic miss, and the correlation between the poll results could easily be underestimated.
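To see why correlated errors matter, consider a small simulation sketch. It assumes each state’s polling error is the sum of a shared component and a state-specific component, both normally distributed; the standard deviations (2 points each) and the function name are hypothetical choices for illustration, not estimates from real polls.

```python
import random


def share_all_miss_same_way(shared_sd, state_sd, n_states=3, trials=20000, seed=0):
    """Fraction of simulated elections in which every state poll
    errs in the same (negative) direction."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Error component common to all states (e.g. a shared sampling bias).
        shared = rng.gauss(0.0, shared_sd) if shared_sd > 0 else 0.0
        errors = [shared + rng.gauss(0.0, state_sd) for _ in range(n_states)]
        if all(e < 0 for e in errors):
            hits += 1
    return hits / trials


# HYPOTHETICAL error sizes, in percentage points.
independent = share_all_miss_same_way(shared_sd=0.0, state_sd=2.0)
correlated = share_all_miss_same_way(shared_sd=2.0, state_sd=2.0)

print(f"independent errors: {independent:.3f}")
print(f"correlated errors:  {correlated:.3f}")
```

Under these assumptions, three states all missing in the same direction is roughly twice as likely when the errors share a common component as when they are independent, so a model that ignores the correlation would understate the risk of a joint miss.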
The typical margin of error is ±3 percentage points in state polls, which can only ask a small subset of the whole population. Recent studies have shown that, when accounting for other possible errors such as the correlation between the state poll errors, the real-world margin of error should be twice as big. In practice, this means that some of the 2016 state polls would not have been able to call a winner within the uncertainty limits of the poll.
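As a rough illustration of where a figure of about ±3 points comes from, the sketch below uses the standard textbook formula for the 95% sampling margin of error of a single proportion. The sample size of 1,000 is an assumed, typical value rather than one taken from the article, and the doubling simply applies the "twice as big" rule of thumb mentioned above.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """95% sampling margin of error for a proportion p estimated
    from a simple random sample of size n."""
    return z * math.sqrt(p * (1.0 - p) / n)


n = 1000  # ASSUMED sample size for a state poll
sampling_moe = margin_of_error(n)
real_world_moe = 2 * sampling_moe  # rule of thumb from the text, not a formula

print(f"sampling margin:   +/- {100 * sampling_moe:.1f} points")
print(f"real-world margin: +/- {100 * real_world_moe:.1f} points")
```

Note that this formula covers random sampling error only; it says nothing about the non-sampling errors (weighting, turnout, late deciders) discussed in the sections above, which is why the effective uncertainty is larger.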
The larger polling organisations seem to be better prepared for the 2020 election, and are trying to learn from the pitfalls of their 2016 misses. Many of the errors above may be addressed by conducting thorough polling. One downside risk to this, however, is that it could result in polls which try to ‘overfit’ the 2016 scenario and may miss new developments specific to 2020. One new challenge is the Covid-19 pandemic, which may affect particular categories of voters more than others and may even lower overall turnout.
So, when analyzing 2020 election polls, one should be aware of i) how the survey group was selected, ii) whether the survey asks about other parameters such as education and geography, and iii) whether the poll also reports the uncertainty in its predictions. These seem to be minimum requirements for a reliable 2020 election poll, and if they are not specified, one should be extra careful about drawing important conclusions.
The average error in national polls (first figure) has been in a downward trend and was relatively low for the 2016 election. The average error in state-level polls (second figure) was, however, higher in 2016 than in the past four presidential elections. Figures recreated from AAPOR.