Data Mining pp 335-350

Part of the Annals of Information Systems book series (AOIS, volume 8)

Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis

Chapter

Abstract

This chapter describes an algorithm that predicts events by mining Internet data. A number of specialized Internet search engine queries were designed to summarize results from relevant web pages. At the core of these queries was a set of algorithms that embody the wisdom of crowds hypothesis, which states that under the proper conditions the aggregated opinion of a number of nonexperts is more accurate than the opinion of a set of experts. Natural language processing techniques were used to summarize the opinions expressed on all relevant web pages. The specialized queries predicted event results at a statistically significant level. It was hypothesized that predictions drawn from the entire Internet would outperform predictions drawn from a smaller number of highly ranked web pages; this hypothesis was not confirmed. The data replicated results from an earlier study and indicated that the Internet can make accurate predictions of future events. Evidence that the Internet can function as a wise crowd, as predicted by the wisdom of crowds hypothesis, was mixed.
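The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the chapter's algorithm: the cue phrases, the function names `count_votes` and `predict`, and the one-vote-per-page rule are all assumptions made for illustration, and the sample pages are invented. Each page that pairs a candidate's name with a future-tense cue counts as one nonexpert "vote", and candidates are ranked by vote total.

```python
import re
from collections import Counter

# Hypothetical future-tense cue phrases (an assumption; the chapter's
# actual query patterns are not given in the abstract).
CUES = ("will win", "is going to win", "will be the winner")


def count_votes(pages, candidates, cues=CUES):
    """Count, per candidate, the pages containing '<candidate> <cue>'."""
    votes = Counter({name: 0 for name in candidates})
    for text in pages:
        # Lowercase and collapse whitespace so matching is less brittle.
        normalized = re.sub(r"\s+", " ", text.lower())
        for name in candidates:
            if any(f"{name.lower()} {cue}" in normalized for cue in cues):
                votes[name] += 1  # one vote per page, however often it repeats
    return votes


def predict(pages, candidates):
    """Return candidates ordered from most to fewest supporting pages."""
    votes = count_votes(pages, candidates)
    return sorted(candidates, key=lambda name: -votes[name])


# Invented sample pages standing in for search engine results.
pages = [
    "I think Alice will win the finale.",
    "Alice will win. Alice will win!",
    "Bob is going to win this season.",
    "No idea who takes it.",
]
ranking = predict(pages, ["Alice", "Bob", "Carol"])
```

Counting each page once, rather than each mention, keeps a single repetitive page from dominating the tally. Under the same assumptions, the abstract's comparison of the entire Internet against a smaller set of highly ranked pages would correspond to calling `predict` on the full page list versus only its top-ranked slice.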


Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. University of Rhode Island, Kingston, USA
