MOA-TweetReader: Real-Time Analysis in Twitter Streaming Data

  • Albert Bifet
  • Geoffrey Holmes
  • Bernhard Pfahringer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6926)

Abstract

Twitter is a micro-blogging service built to discover what is happening at any moment in time, anywhere in the world. Twitter messages are short, generated constantly, and well suited for knowledge discovery using data stream mining. We introduce MOA-TweetReader, a system for processing tweets in real time. We show two main applications of the new system for studying Twitter data: detecting changes in term frequencies and performing real-time sentiment analysis.
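
As a rough illustration of the first application, detecting changes in term frequencies, the following minimal sketch compares the relative frequency of each term in a window of recent tweets against a window of older tweets. It is not the method of MOA-TweetReader: the class `TermFrequencyMonitor`, its fixed two-window scheme, the whitespace tokenizer, and the `threshold` parameter are all assumptions made for this example.

```python
from collections import Counter, deque


class TermFrequencyMonitor:
    """Hypothetical helper: compares term frequencies in a recent window
    of tweets against a reference window of older tweets."""

    def __init__(self, window_size=3):
        self.window_size = window_size
        self.reference = deque(maxlen=window_size)  # older tweets
        self.recent = deque(maxlen=window_size)     # newest tweets

    def add(self, tweet):
        # Naive whitespace tokenization (an assumption, not the paper's preprocessing).
        tokens = tweet.lower().split()
        if len(self.recent) == self.window_size:
            # The oldest "recent" tweet is about to be evicted; keep it as reference.
            self.reference.append(self.recent[0])
        self.recent.append(tokens)

    @staticmethod
    def _relative_freq(window):
        counts = Counter(tok for tokens in window for tok in tokens)
        total = sum(counts.values())
        return {t: c / total for t, c in counts.items()} if total else {}

    def changed_terms(self, threshold=0.1):
        """Return terms whose relative frequency moved by at least `threshold`,
        mapped to the signed difference (recent minus reference)."""
        old = self._relative_freq(self.reference)
        new = self._relative_freq(self.recent)
        changes = {}
        for term in set(old) | set(new):
            diff = new.get(term, 0.0) - old.get(term, 0.0)
            if abs(diff) >= threshold:
                changes[term] = diff
        return changes


monitor = TermFrequencyMonitor(window_size=3)
for text in ["good car"] * 3 + ["car recall"] * 3:
    monitor.add(text)
changes = monitor.changed_terms(threshold=0.1)
```

Here `changes` flags "good" as falling and "recall" as rising, while "car" is left out because its share is the same in both windows. A real stream would call for bounded memory and an adaptive window size, which this sketch does not attempt.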

Keywords

Data Stream, Application Program Interface, Opinion Mining, Sentiment Analysis, Lexical Resource

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Albert Bifet (1)
  • Geoffrey Holmes (1)
  • Bernhard Pfahringer (1)

  1. University of Waikato, Hamilton, New Zealand