Predicting 2016 US Presidential Election Polls with Online and Media Variables

  • Veikko Isotalo (email author)
  • Petteri Saari
  • Maria Paasivaara
  • Anton Steineker
  • Peter A. Gloor
Conference paper
Part of the Springer Proceedings in Complexity book series (SPCOM)


Traditional media have long played a large role in elections by informing voters and shaping opinions, and more recently social media and other Internet information sources have also become considerable influences on voters. Data on how these information sources and media channels are used is publicly available and could be analyzed for its effects on the election process. This chapter examines whether social media, Internet traffic, and traditional media data can be used to predict elections by searching for patterns between these data and poll numbers for the 2016 US Republican and Democratic primaries. The results suggest that machine learning models based on linear regression can produce quite accurate predictions. Statistically significant correlations were also found between polls and betting odds, and between polls and Facebook page likes. More sophisticated methods could allow for better forecasting using this publicly available data.
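As a rough illustration of the kind of analysis the abstract describes, the following minimal sketch computes a Pearson correlation between one online variable and poll numbers, then fits a one-variable least-squares line. It is not the chapter's actual model or data: the variable names, the helper functions `pearson` and `fit_line`, and all numbers are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, made-up observations (NOT data from the chapter):
# Facebook page likes in thousands and poll share in percent.
page_likes_thousands = [100.0, 150.0, 200.0, 250.0, 300.0]
poll_share_percent = [12.5, 16.0, 23.0, 26.5, 32.0]

r = pearson(page_likes_thousands, poll_share_percent)
slope, intercept = fit_line(page_likes_thousands, poll_share_percent)

# Predict the poll share for a hypothetical candidate with 350k page likes.
predicted_share = slope * 350.0 + intercept

print(f"correlation r = {r:.3f}")
print(f"poll_share ~ {slope:.3f} * likes + {intercept:.2f}")
print(f"predicted share at 350k likes: {predicted_share:.2f}%")
```

A real analysis along these lines would need multiple predictors, held-out evaluation, and a significance test for the correlations; the sketch only shows the basic mechanics.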


Presidential Election Polls; Page Likes; Internet Information Sources; Tweet Number; Wikipedia Page Views



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Veikko Isotalo (1), email author
  • Petteri Saari (1)
  • Maria Paasivaara (1)
  • Anton Steineker (2)
  • Peter A. Gloor (3)

  1. Aalto University, Helsinki, Finland
  2. University of Cologne, Cologne, Germany
  3. MIT Center for Collective Intelligence, Cambridge, USA