Tuesday, November 22, 2016

A complete guide to the election day data prediction mishap - oops, we forgot about Johnson and errors

  • The New York Times, home to a top data visualization team, completely failed to predict a Trump win. It predicted an 85% chance of a Clinton victory.
  • Nate Silver, founder of FiveThirtyEight and the guy who called 49 of 50 states correctly in Obama's 2008 election, gave Clinton roughly a 70% chance of winning. That left Trump about a 30% chance, significantly higher than most people expected. People with an extreme distaste for Trump probably expected something close to 0.
    • Nate Silver later posted on Twitter: "A: There's a 30% chance of an earthquake. B: LOL ur crazy no way it's that high. [earthquake] B: Idiot! You said a 70% chance of no earthquake."
  • We all forgot about Johnson, the third-party candidate whose presence might have "stolen" the 1-5% margin Clinton so badly needed, leaving her to lose narrowly to Trump in key Democratic-leaning states. Many forgot that these marginal alternative votes for Johnson cost her key states like Pennsylvania, where her vote share was only about a percentage point behind Trump's. Personally, I believe that the existence of a third candidate caused Clinton to lose swing states. By exercising their right to vote for an alternative, many voters accidentally handed the presidency to Trump, whom they would not have voted for at all. It is an unintended mathematical result of voting for an alternative.
  • Practitioners of traditional sampling and polling methods claimed, postmortem, to have been "blindsided". They claim the underlying data was wrong. Specifically, many blamed the early exit polls.
    • Postmortem examination of this method shows clear selection bias: the people who are more vocal are more likely to reveal their votes, as are those whose votes match popular expectations.
    • Margin of error. Columbia University researcher Andrew Gelman found that the margin of error of such polling can be as high as +/- 7 points, a 14-point range. An estimated 50% vote share could really be anywhere from 43% to 57%. Crazy, that's the difference between a majority win and a loss.
  • The Huffington Post, a Silicon Valley and women-friendly media outlet, claimed that Clinton had a more than 90% chance of winning.
  • On election night, people watched in utter shock as Trump racked up electoral college votes in what looked like a landslide.
  • The updated NYTimes visualization below shows the stunning "upset" where Trump overtook Clinton in an Election Night victory
  • Just like we cannot predict stock market win/loss, we cannot predict election win/loss.  
  • The data analysis results were wrong, but there were some new data visualization charts. Nate Silver came up with a snake chart, something like a board game or an intestine, for the electoral college votes and the states. Some criticism of the snake chart: it carries no useful information, it is just a fancy map of electoral votes, and it is a design spin on a vintage game.
  • Subjective perspective and inherent bias. Many media and data outlets were criticized after the election. While no statistician would ever unwisely predict a 100% Clinton win, nearly all liberal outlets were prematurely hailing a Clinton win. Even Nate Silver, who gave Trump a 30% chance of winning, did not step out to help the public understand this statistic until after the election.
  • Simulation and randomness. Nearly all respectable data outlets ran simulations with random factors. They built in scenarios such as swing states flipping and margins of error. Yet the models still fell short.
    • Imagine simulating percentage votes versus electoral college votes. A percentage is continuous and can change a fraction of a point at a time. Electoral votes are much more discrete: they are allocated in lots of 3, ..., 55 (Alaska ... California), so a small polling error can translate into a huge swing. This "landslide effect" in Clinton's upset loss was very apparent on election night, when chunks of electoral votes went to Trump, carrying him quickly to the threshold and swiftly making him the apparent winner.
  • Personally, watching the live visualization on the Wall Street Journal website on election night, I saw that large cities were voting as expected, leaning either Democratic or Republican, yet many lesser-known counties from California to New York were overwhelmingly voting for Trump. Few studies and data analyses were granular down to the county level. We were so focused on state-level results.
  • This was an election in which swing states mattered a great deal more than usual. Nate Silver used two numbers, tipping-point chance and voter power index, to highlight the important states that played a crucial role on election night.
  • Who's winning the popular vote? Nate Silver estimated an average of 48.5% for Clinton and 44.9% for Trump, really not bad at all. And we forgot Johnson's 5%! That is enough margin to make Trump the winner! If Clinton fails to capture her full 48.5% and Johnson fails to capture his 5% throughout, that's enough margin going to Trump. Plus the margin of error of the estimates... Wow, a Trump or Hillary win was more like a coin flip, 50/50. (Visit Nate Silver's blog to see this useful visualization.) Again, my personal opinion is that we forgot about Johnson.
  • In my personal opinion, Trump's win was not a landslide; it only appeared to be one because of our electoral college system. The actual (popular) vote was a much more even split. I personally think we really forgot about errors and Johnson. A landslide victory was unlikely (Obama had a true landslide), so the margin of error and Johnson's presence were extremely important. Yet we forgot about them. We still don't think about them when we simply claim there was a landslide victory, and now we are studying what Trump did right and justifying it. Really, he did a lot of things right, and Clinton came close to doing a lot of other things right. One of them won by chance. No one predicted that.
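
To make the point about discrete electoral votes and margin of error concrete, here is a rough sketch of the kind of simulation described in the list above. Everything in it is an assumption for illustration: the toy map `TOY_STATES` is invented (it is not real 2016 polling), the polling error is modeled as one shared normal shift applied to every state, and the margin of error is treated as a 95% interval. It is not the model that FiveThirtyEight or the New York Times actually ran.

```python
import random

# Hypothetical toy map: (electoral votes, polled lead for candidate A in points).
# These numbers are invented for illustration, not real 2016 polling.
TOY_STATES = [(55, 20.0), (29, 1.0), (20, 2.0), (18, -1.0),
              (16, 1.5), (10, 3.0), (38, -9.0), (3, -15.0)]


def electoral_votes(states, shift):
    """Electoral votes won by A if every polled lead moves by `shift` points."""
    return sum(ev for ev, lead in states if lead + shift > 0)


def win_probability(states, moe, trials=20_000, seed=0):
    """Share of simulated elections in which A wins a majority of electoral votes.

    moe is the assumed margin of error on the lead, in points, treated as a
    95% normal interval (so sigma = moe / 1.96). That is a simplification.
    """
    rng = random.Random(seed)
    sigma = moe / 1.96
    total = sum(ev for ev, _ in states)
    wins = 0
    for _ in range(trials):
        shift = rng.gauss(0.0, sigma)  # one shared polling error for all states
        if electoral_votes(states, shift) > total / 2:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    for shift in (0.0, -1.0, -2.0):
        print(f"shift {shift:+.1f} points -> {electoral_votes(TOY_STATES, shift)} electoral votes for A")
    for moe in (3.0, 7.0):
        print(f"margin of error +/-{moe}: A wins {win_probability(TOY_STATES, moe):.0%} of simulations")
```

On this toy map, candidate A leads in the polls and takes 130 of 189 electoral votes if the polls are exactly right, but a uniform two-point polling miss drops A to 65. The popular vote moved by two points; the electoral count was cut in half. Widening the assumed margin of error from 3 to 7 points also pulls A's simulated win chance noticeably closer to a coin flip.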

Sources and Further Reading
  • Fast Company
  • Nate Silver 

Rules of Sudoku for Algorithm Exercises

Need to code a Sudoku solver? Here are three rules of Sudoku: A 9x9 grids, Each row ... Each column ... Each of the 9 3x3 grids (examp...