Abstract
This chapter describes an algorithm that predicts events by mining Internet data. A number of specialized Internet search engine queries were designed to summarize results from relevant web pages. At the core of these queries was a set of algorithms that embody the wisdom of crowds hypothesis, which states that, under the proper conditions, the aggregated opinion of a number of nonexperts is more accurate than the opinion of a set of experts. Natural language processing techniques were used to summarize the opinions expressed on all relevant web pages. The specialized queries predicted event results at a statistically significant level. It was hypothesized that predictions from the entire Internet would outperform the predictions of a smaller number of highly ranked web pages; this hypothesis was not confirmed. These data replicated results from an earlier study and indicated that the Internet can make accurate predictions of future events. Evidence that the Internet can function as a wise crowd, as predicted by the wisdom of crowds hypothesis, was mixed.
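The comparison between the entire Internet and a smaller set of highly ranked pages can be illustrated with a minimal sketch. It is not the chapter's algorithm: the abstract does not specify how opinions were extracted or combined, so the sketch assumes each page has already been reduced to a count of mentions per candidate outcome, and it simply sums those counts. The function name `predict_winner`, the `top_k` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def predict_winner(page_counts, top_k=None):
    """Return the candidate with the most aggregated mentions.

    page_counts: list of dicts (candidate -> mention count), assumed to be
                 ordered by search-engine rank, best first.
    top_k:       if given, aggregate only the first top_k pages;
                 otherwise aggregate every page (the "whole crowd").
    """
    pages = page_counts if top_k is None else page_counts[:top_k]
    totals = Counter()
    for counts in pages:
        totals.update(counts)  # adds this page's counts to the running totals
    if not totals:
        return None
    # Iterate names in sorted order so ties resolve alphabetically.
    return max(sorted(totals), key=lambda name: totals[name])


# Hypothetical per-page mention counts for two candidate outcomes.
pages = [
    {"A": 5, "B": 1},
    {"A": 1, "B": 2},
    {"B": 3},
    {"A": 0, "B": 2},
]

crowd_prediction = predict_winner(pages)               # all pages
top_ranked_prediction = predict_winner(pages, top_k=1)  # highest-ranked page only
```

With this made-up data the two predictions differ, which is the kind of disagreement the chapter's hypothesis test is concerned with. Which of the two is more accurate is an empirical question that the sketch does not address.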
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Ryan, S., Hamel, L. (2010). Using Web Text Mining to Predict Future Events: A Test of the Wisdom of Crowds Hypothesis. In: Stahlbock, R., Crone, S., Lessmann, S. (eds) Data Mining. Annals of Information Systems, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-1280-0_15
DOI: https://doi.org/10.1007/978-1-4419-1280-0_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-1279-4
Online ISBN: 978-1-4419-1280-0
eBook Packages: Computer Science (R0)