Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis

Data Mining

Part of the book series: Annals of Information Systems (AOIS, volume 8)

Abstract

This chapter describes an algorithm that predicts events by mining Internet data. A number of specialized Internet search engine queries were designed to summarize results from relevant web pages. At the core of these queries was a set of algorithms that embody the wisdom of crowds hypothesis, which states that under the proper conditions the aggregated opinion of a number of nonexperts is more accurate than the opinion of a set of experts. Natural language processing techniques were used to summarize the opinions expressed across all relevant web pages. The specialized queries predicted event results at a statistically significant level. It was hypothesized that predictions drawn from the entire Internet would outperform predictions drawn from a smaller number of highly ranked web pages; this hypothesis was not confirmed. These data replicated results from an earlier study and indicated that the Internet can make accurate predictions of future events. Evidence that the Internet functions as a wise crowd, as the wisdom of crowds hypothesis predicts, was mixed.
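
The comparison described in the abstract (aggregating opinions from all relevant pages versus only a few highly ranked pages) can be illustrated with a minimal sketch. This is not the chapter's algorithm: the function name `aggregate_prediction`, the `top_k` parameter, and the toy data are hypothetical, and it assumes each page's opinion has already been reduced to the single candidate it favors, a step the chapter performs with natural language processing techniques that are not reproduced here.

```python
from collections import Counter


def aggregate_prediction(page_votes, top_k=None):
    """Return (winner, counts) for the candidate favored by the most pages.

    page_votes: list of candidate names, one per web page, ordered by
        search rank (hypothetical input format, assumed for illustration).
    top_k: if given, only the top_k highest-ranked pages are counted;
        if None, every page is counted (the "entire crowd").
    """
    votes = page_votes if top_k is None else page_votes[:top_k]
    if not votes:
        raise ValueError("no page opinions to aggregate")
    counts = Counter(votes)
    winner, _ = counts.most_common(1)[0]
    return winner, counts


# Toy data (invented): candidate favored by each page, in rank order.
pages = ["A", "B", "A", "A", "B", "B", "B"]

crowd_winner, crowd_counts = aggregate_prediction(pages)          # all pages
top_winner, top_counts = aggregate_prediction(pages, top_k=3)     # top 3 only

print("all pages:", crowd_winner, dict(crowd_counts))
print("top 3 pages:", top_winner, dict(top_counts))
```

In this toy data the full set of pages and the top-ranked subset favor different candidates, which is the kind of disagreement the chapter's second hypothesis is about. Which of the two is more accurate is an empirical question; the abstract reports that the full-Internet prediction did not outperform the highly ranked pages.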

Author information

Correspondence to Scott Ryan or Lutz Hamel.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Ryan, S., Hamel, L. (2010). Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis. In: Stahlbock, R., Crone, S., Lessmann, S. (eds) Data Mining. Annals of Information Systems, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1280-0_15

  • DOI: https://doi.org/10.1007/978-1-4419-1280-0_15

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-1279-4

  • Online ISBN: 978-1-4419-1280-0

  • eBook Packages: Computer Science, Computer Science (R0)
