The Signal and the Noise: Lessons for the Simulation Community from Failed US Presidential Election Prediction

Don’t worry, this blog has nothing to do with politics. I certainly won’t be endorsing either US Presidential candidate whilst wearing a Siemens hat. However, like many people, I am preoccupied with the Presidential Election and, more importantly, with trying to predict its outcome.

This is, I guess, part of the human condition. In the animal kingdom, we are (almost) unique in our preoccupation with the future, and it seems that we are hard-wired by evolution to use prediction and forecasting to resolve uncertainty. However, for a species that spends so much time worrying about the future, we are spectacularly bad at predicting it.

In this blog, I want to spend some time understanding the lessons that we as engineers can learn from the failure of the prediction community to forecast the outcome of the US Presidential Election. For me, an important backstory to the previous Presidential Election campaign is that the pre-election poll predictions were wrong by a large enough margin that they completely failed to forecast the outcome of that election.

This includes the prediction endorsed by celebrated American statistician Nate Silver, who famously forecast the correct outcome in 49 out of 50 states in Obama’s 2008 election victory (missing the 50th by a single percentage point). Although Silver’s prediction was much closer than most others, many of which gave Hillary Clinton a 95 per cent or greater chance of winning the election, his carefully calculated model put Donald J. Trump’s chance of victory at no more than 30 per cent (although we should acknowledge that 30 per cent is a relatively big chance of any outcome; he wasn’t saying it could never happen). He, like everyone else (including possibly President-Elect Trump), was surprised by the outcome of the election.

As things stand at the time of writing, Silver’s 538 model of the election (based on 40,000 simulations) gives the incumbent only a 13% chance of being re-elected. However, the site does explicitly warn: “don’t count the underdog out! Upset wins are surprising but not impossible.”
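Running 40,000 simulated elections is, at heart, Monte Carlo simulation, a technique most engineers will recognize. The sketch below is a toy illustration of the idea, not 538’s actual model: the state names, margins, and error magnitudes are all invented for the example. The key ingredient is the shared national error term — polling errors are correlated across states, which is exactly what keeps “unlikely” sweeps from being impossible.

```python
import random

def simulate_election(states, n_sims=40_000, seed=1):
    """Toy Monte Carlo electoral-college forecast (NOT 538's model).

    `states` maps state name -> (electoral_votes, underdog_margin),
    where margin is the underdog's expected vote margin in points.
    Each simulated election draws one national polling error shared by
    every state, plus independent state-level noise, then counts how
    often the underdog wins a majority of electoral votes.
    """
    rng = random.Random(seed)
    total_votes = sum(ev for ev, _ in states.values())
    wins = 0
    for _ in range(n_sims):
        national_error = rng.gauss(0, 3)   # correlated error, in points
        ev = 0
        for votes, margin in states.values():
            state_error = rng.gauss(0, 4)  # independent state-level noise
            if margin + national_error + state_error > 0:
                ev += votes
        if ev > total_votes / 2:
            wins += 1
    return wins / n_sims

# Three hypothetical states, underdog trailing in all of them:
states = {"A": (10, -2.0), "B": (20, -3.0), "C": (15, -1.0)}
print(simulate_election(states))  # a minority chance, but far from zero
```

Drop the shared `national_error` term and the underdog’s chances collapse, because winning would then require several independent long shots to land at once — which is one reason models that assumed independent state polls were so overconfident.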

“What does this have to do with engineering simulation?” I hear you ask. The answer, I think, is “a great deal.” CFD engineers, like psephologists* (people who occupy themselves with election prediction), are also interested in making predictions about the future. Whereas Silver and his peers use statistical inference (from opinion polls and other data sources) to try to predict how people will vote in an election, we use numerical models of physics to predict the future performance of a proposed product or design.

As accurate as engineering simulation is, we have to acknowledge that, just like all predictions, it rarely, if ever, gives us an “exact answer” that we can rely upon without any interpretation or scrutiny. All simulations rely on modeling assumptions (for example, to do with discretization, limitations of the physics models, boundary conditions, etc.) and are subject to errors (including operator error and uncertainties in boundary conditions). It’s typically true that if you gave a sufficiently complicated simulation scenario to five different CFD engineers, their differing modeling choices would produce five different simulation results. Hopefully, if the engineers were sufficiently competent, the results would be within a few percentage points of each other.

This is, of course, a problem of all prediction, not just simulation (including predicting the outcome of an election). Exactly the same thing would happen if you gave the same scenario to five different experimentalists (or looked at the results of thousands of different opinion polls). All experiments involve a degree of simplification (scaling assumptions, boundary conditions, etc.) and sources of error (operator error, measurement error, sampling error, etc.).

In his excellent book The Signal and the Noise: The Art and Science of Prediction (Why So Many Predictions Fail – But Some Don’t), Silver takes a rational look at prediction in all its forms, from earthquake prediction to climate change, from sports betting to the catastrophic failure of prediction that resulted in the subprime collapse that plunged the whole world into recession. Although he doesn’t examine CFD or engineering simulation directly, he does explore the world of weather forecasting (which, let’s face it, is glorified CFD on a massive scale) and concludes that it is one of the most accurate predictive methods available, mainly because weather forecasters have the opportunity to compare the actual weather against the previous day’s forecast on a daily basis.
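That daily feedback loop can be made quantitative. One standard tool for scoring probabilistic forecasts against observed outcomes — my choice of illustration here, not something prescribed in the book — is the Brier score:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities (0..1) and
    binary outcomes (1 if the event happened, 0 if it didn't).
    Lower is better; always forecasting 50% scores exactly 0.25."""
    assert len(forecasts) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A week of hypothetical "chance of rain" forecasts vs. what happened:
forecasts = [0.9, 0.8, 0.1, 0.3, 0.7, 0.2, 0.6]
outcomes  = [1,   1,   0,   0,   1,   0,   1]
print(brier_score(forecasts, outcomes))  # ~0.063, well under the 0.25 of a coin flip
```

A forecaster who computes this score every day gets exactly the discipline Silver praises: systematic, quantitative evidence of whether their stated probabilities match reality.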

Although it could be regarded as post hoc rationalization, I think it is incredibly healthy for everyone involved in the prediction game to critically examine the results of their simulations against actual outcomes, including acknowledging those occasions on which our simulations do a poor job of predicting the real-world performance of a product.

In the engineering simulation game, we used to call this “validation”: measuring our simulation results against experimental data. With the arrival of the Digital Twin, we will be able to continuously validate our simulations, in real time, against real-world usage data. This is a unique opportunity to make our predictions better.
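In code, that continuous-validation loop might look something like the following sketch. The function name, tolerance, and data are all illustrative assumptions on my part, not any particular Digital Twin API:

```python
def flag_drift(predicted, observed, rel_tol=0.05):
    """Compare simulation predictions against field telemetry and return
    the indices where relative error exceeds rel_tol. Flagged samples are
    candidates for revisiting the underlying modeling choices."""
    flagged = []
    for i, (p, o) in enumerate(zip(predicted, observed)):
        if o != 0 and abs(p - o) / abs(o) > rel_tol:
            flagged.append(i)
    return flagged

# Predicted vs. measured values from the field -- invented numbers:
predicted = [120.0, 118.0, 121.0, 119.0]
observed  = [119.5, 117.0, 110.0, 118.8]
print(flag_drift(predicted, observed))  # -> [2]
```

The point is not the few lines of arithmetic but the loop they sit inside: every flagged sample is a free, real-world validation case that a simulation team would once have had to pay for with a physical test.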

My favorite example of this is the Tesla Model S, which on release had an active suspension system that changes the ride height of the vehicle depending on how it is driven: “As Model S accelerates, it lowers the vehicle for optimized aerodynamics and increased range.” By examining usage data (transmitted by the vehicles back to Tesla), they were able to work out that the system wasn’t working as expected (and that some of the vehicles were hitting bumps in the road), and so they were able to issue a patch to the operating system that automatically adjusted the ride height across the entire Tesla fleet. The Digital Twin will ultimately allow us to validate every simulation-led decision and, where those decisions are incorrect (or perhaps inaccurate), will force us to examine and improve our modeling choices.

More scrutiny and better validation will always lead to better simulations, which will ultimately lead to even better products. And maybe even more accurate election predictions…


*why can’t we as “CFD engineers” have a great collective name like “psephologists”?
