A final snapshot and (House) prediction, 2024
Tea leaves, but not betting markets or early voting. Also, what a full dose of hopium looks like.
It is inevitable that polls will have some error. Averaging does not solve the problem since it only reduces error arising from sampling a population. Averaging does not eliminate systematic error.
Recently FiveThirtyEight wrote about total overall error in polls. Unfortunately, they didn’t distinguish random from systematic error. Systematic error arises from differences between who pollsters think will vote, compared with who actually votes. Every pollster has their own view of this subject. In the aggregate their judgment is better than they are individually. But that average can differ from what actually happens. That difference is systematic error.
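To make the distinction concrete, here is a minimal simulation sketch in Python. Every number in it (the true margin, the shared bias, the per-poll noise) is invented for illustration and does not come from real polls. It shows that averaging shrinks the random part of the error but leaves a bias shared by all pollsters untouched.

```python
import random
import statistics

def simulate_poll_average(true_margin, shared_bias, noise_sd, n_polls, seed=0):
    """Average n_polls simulated polls that all share one systematic bias.

    Each poll = true margin + shared bias + its own random sampling noise.
    All inputs are hypothetical illustration values, in percentage points.
    """
    rng = random.Random(seed)
    polls = [true_margin + shared_bias + rng.gauss(0.0, noise_sd)
             for _ in range(n_polls)]
    return statistics.mean(polls)

# Hypothetical scenario: the true margin is 0, every pollster is off by
# 1.5 points in the same direction, and each poll has 3 points of noise.
for n in (5, 50, 5000):
    avg = simulate_poll_average(0.0, 1.5, 3.0, n)
    print(f"{n:5d} polls: average = {avg:+.2f}")
```

As the number of polls grows, the average settles near the true margin plus the bias, not near the true margin itself.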
In 2016 for the Presidential race, systematic error was about 1.5 points, leading to a surprise Trump win. We can assume that an error that large could happen this year. However, we don’t know in which direction. For all we know it could favor Trump again. Or it could favor Harris.
Let me show you which way I think it will go. (P.S. Here is the biennial Princeton Election Consortium Geek’s Guide, 2024 edition).
The Presidential race: how close is it really?
The Princeton Election Consortium defines a virtual margin in terms of the Electoral College. The question asked there is: how far is the race from a perfect electoral tie?
First, here is a sharp statistical snapshot of all state polls, using median-based statistics.
The median of the distribution, tracked in black, very slightly favors Trump, but really by nothing at all. How “nothing”? We can translate it into how much the margins would all have to shift to create a perfect electoral tie. That looks like this:
The Presidential race is highly likely to be less close than polls indicate
Today, this “meta-margin” is Trump +0.3%. That’s much smaller than a hypothetical systematic error of 1.5 points. This has two implications:
We don’t know who’s going to win based on polls alone (duh).
The final outcome is likely to be less close…in one direction or the other.
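The idea of a shift-to-tie margin can be sketched in a few lines of Python. This is a simplified, deterministic stand-in and not the Princeton Election Consortium's actual calculation. The function name and the three toy states are hypothetical, chosen only to show the mechanics: find the tipping-point state, whose margin is the uniform shift that would put the electoral count on a knife edge.

```python
def meta_margin(states):
    """Return the margin of the tipping-point state.

    states: list of (margin, electoral_votes), where margin is the
    Democratic-minus-Republican poll margin in points (hypothetical data).
    Positive means the Democrat leads by that much; negative means the
    Republican does.
    """
    total_ev = sum(ev for _, ev in states)
    running = 0
    # Walk from the most Democratic state to the most Republican one.
    for margin, ev in sorted(states, reverse=True):
        running += ev
        if running >= total_ev / 2:
            # Shifting every state by -margin would make this state a
            # toss-up, putting the electoral count on the knife edge.
            return margin
    raise ValueError("states must not be empty")

# Toy map with 10 electoral votes in total (invented numbers).
toy_states = [(5.0, 4), (-0.3, 3), (-4.0, 3)]
print(meta_margin(toy_states))
```

In the toy map the middle state decides the outcome, so its margin is the meta-margin. A systematic error larger than that margin, in either direction, would make the real result less close than the polls.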
So it might be helpful to have another stream of information.
First, here are two streams of information that won’t work…
Betting markets and early voting: even worse than polls
There are several streams of information one could think about using. However, they have problems.
Betting markets are no good at all. They basically reflect polling data. They also reflect the biases of individual bettors. In theory these biases cancel out because of the magic of markets. However, we know for a fact that bettors can distort such electronic markets. In 2012, a “Romney whale” disrupted the Iowa electronic market. And this year, some Frenchman has put nearly his whole net worth into Polymarket. These markets do not tell us much.
Early voting is, sadly, also hard to interpret. In some cases, heroic efforts by people like Jon Ralston in Nevada can shed some light. But early voting can lead one astray. For example, in 2020, Democrats voted by mail more than Republicans because of the pandemic. So if Republicans early-vote more now, that might only mean they are adapting to the new method.
In other words, early voting measures a combination of (a) enthusiasm for vote-by-mail or early voting as a method, and (b) enthusiasm for voting immediately. And you can vote early, but you can’t vote harder. One vote is one vote. So early voting is not that informative.
Where can we turn for more insight? Let’s start with a simpler problem, control of the House of Representatives.
The generic Congressional ballot
Polling people on their generic partisan preference does well in predicting voter preference for the House of Representatives. The House of Representatives has little net overall bias, thanks to reductions in gerrymandering in the last decade (three cheers for anti-gerrymandering reformers!). Therefore we can take the national vote as a measure of who will control the House in 2025.
Here is the generic Congressional ballot. It shows a tie (link to PEC Congressional data page).
In the last six elections, the difference between the generic Congressional measure and the actual national vote has had a standard deviation of 2.4 points. That’s a pretty wide range.
But now…here is one other independent stream of evidence we can use: special elections.
House seat prediction: 224 D (range: 216-232), 211 R (range: 203-219)
Special elections, which occur when an official must be replaced in a one-off election, provide real voting data. Daniel Donner at Daily Kos Elections/the Downballot has found that these elections are predictive of the next Congressional election.
Since the Dobbs decision overturning Roe v. Wade, special elections have pointed to Democrats winning the national vote in November 2024 by 4.5 points. That’s the orange line above.
I used Bayesian inference to combine these two measures, polls and special elections, to get the red zone above. It shows a range of R+0.8% to D+5.8%, with a midpoint of D+2.5%. That corresponds to about an 8 in 9 probability that Democrats will win the popular vote - and take control of the House. This is consistent with Nancy Pelosi’s public statement that she expects Democrats to regain the majority.
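For readers who want to see how such a combination can work, here is a minimal Python sketch. It treats each measure as a Gaussian estimate and combines them by precision weighting. The post does not spell out the exact method or the uncertainty of the special-election signal, so `SPECIAL_SD` below is purely an assumption, and the output is not expected to reproduce the range quoted above.

```python
import math

def combine_gaussians(mean_a, sd_a, mean_b, sd_b):
    """Precision-weighted combination of two independent Gaussian estimates."""
    precision_a = 1.0 / sd_a ** 2
    precision_b = 1.0 / sd_b ** 2
    total = precision_a + precision_b
    mean = (mean_a * precision_a + mean_b * precision_b) / total
    return mean, math.sqrt(1.0 / total)

def prob_above_zero(mean, sd):
    """Probability that a Gaussian(mean, sd) quantity is greater than zero."""
    return 0.5 * (1.0 + math.erf(mean / (sd * math.sqrt(2.0))))

POLL_MEAN, POLL_SD = 0.0, 2.4   # generic ballot tie and its SD, from the text
SPECIAL_MEAN = 4.5              # special-election signal (D+4.5), from the text
SPECIAL_SD = 2.4                # ASSUMPTION: not stated in the post

mean, sd = combine_gaussians(POLL_MEAN, POLL_SD, SPECIAL_MEAN, SPECIAL_SD)
print(f"combined margin: D{mean:+.2f} (sd {sd:.2f})")
print(f"P(Democrats win the national vote) = {prob_above_zero(mean, sd):.2f}")
```

The less certain a measure is, the less it pulls the combined estimate, so changing `SPECIAL_SD` moves the midpoint between the two inputs.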
From Electoral Innovation Lab / Vote Maximizer estimates, 1 point of vote margin translates to about 5 seats. That allows conversion of the above vote margins to an approximate seat margin. The midpoint of that range is 224 Democratic seats, 211 Republican seats. Even if this is wrong, whatever happens, the chamber will be very closely divided.
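The conversion is simple arithmetic, sketched here in Python. The function name is hypothetical, and the sketch rests on two assumptions: that "5 seats per point" refers to the Democratic-minus-Republican seat margin, and that a tied national vote splits the 435 seats evenly (consistent with the claim of little net bias).

```python
HOUSE_SEATS = 435
SEATS_PER_POINT = 5  # from the text; read here as D-minus-R seat margin per point

def seats_from_margin(dem_margin_points):
    """Convert a national vote margin (D minus R, in points) to seat counts."""
    seat_margin = SEATS_PER_POINT * dem_margin_points
    dem = round((HOUSE_SEATS + seat_margin) / 2)
    return dem, HOUSE_SEATS - dem

print(seats_from_margin(2.5))  # midpoint D+2.5 -> (224, 211)
```

Under these assumptions the D+2.5 midpoint gives the 224 to 211 split quoted above.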
Presidency: probably, maybe, potentially Harris (or not)
Generic Congressional and state Presidential surveys are done by the same professional community of pollsters. For this reason, I suspect that the two data streams will show similar systematic errors. As I wrote, in 2016, Presidential state polls overestimated Hillary Clinton’s margins by 1.5 points. And generic Congressional polls that year overestimated the Democratic national margin by 1.7 points.
Here is what final, unadjusted polls look like now, with margins less than 1 point shaded beige.
If state Presidential margins are shifted by 1.5 points toward Harris, they look like this:
This latter condition is associated with a modal outcome of Harris 292 EV, Trump 246 EV (NC and GA split between the candidates, Arizona to Trump).
And yes, a 1.5-point error favoring Trump would move things in the other direction.
We’ll probably have to wait
A definitive answer on today’s federal election outcomes (President, Senate, House) will require…counting the votes. Some key states (Georgia, North Carolina, Michigan, Virginia, Florida, Ohio, and Colorado) will be fast. But others (Wisconsin, Pennsylvania, Nevada) will take days, at least.
Finally, what happens with a full dose of hopium?
What would happen if we made a full adjustment using the House analysis? That analysis suggested a systematic error of 2.5 points. That corresponds to 308 to 319 EV for Harris. In this circumstance, we would get an answer on the Presidential election tonight, with Kamala Harris winning both Georgia and North Carolina.
Barring such a large polling miss, learning the outcome will take a while. I’m not even getting into the lawsuits and disruptions by angry partisans. It could be a long night, and a long week. Hopefully not a long month.
Some thoughts...
The Dobbs decision, consistent Democratic overperformance in special elections, Trump’s underperformance in Republican primaries, the Harris Campaign’s incredible ground operation with millions of volunteers, Trump’s shocking behavior and ugliness especially these last few weeks, Independents and newly-registered voters (unpolled!) overwhelmingly preferring Harris – IMHO, pollsters have failed to sufficiently take these factors into account.
Moreover, bad-faith pollsters have been releasing an unprecedented number of questionable polls, with the clear aim of influencing polling averages such as 538’s. To an astonishing degree, they have succeeded in manipulating the media election narrative. (Likewise, Polymarket has been a case study in narrative manipulation.)
During this 2024 election especially, basing predictions on the 538 average means trying to predict on the basis of skewed, highly manipulated data. PEC should, in my opinion, have restricted itself to an average of just the high-quality independent polls.
Interestingly, the average of the highest-quality independent polls HAS NOT CHANGED.
One notable exception worth paying attention to is Ann Selzer, a pollster who doesn’t do manipulative weighting or make a whole bunch of hidden assumptions. There are good reasons why she is perhaps the most respected and revered pollster in America.
I was curious about the switch of Wisconsin from dark blue in the unadjusted poll map to light blue in the map tilted 1.5 points toward Harris. If the tilt toward Harris should generate better results for her, why did the map color seem to indicate a worse outcome for her in Wisconsin?