World Wide Web, Volume 20, Issue 1, pp 61–87

Using time-sensitive interactions to improve topic derivation in twitter

  • Robertus Nugroho
  • Weiliang Zhao
  • Jian Yang
  • Cecile Paris
  • Surya Nepal

Abstract

Twitter has become one of the most popular social media platforms, widely used for discussion and information dissemination on all kinds of topics. As a result, both businesses and academics have researched methods to identify the topics being discussed on Twitter. These methods can be employed in a number of applications, including emergency management, advertising, and corporate/government communication. However, deriving topics from this short-text-based and highly dynamic environment remains a huge challenge. Most current methods use the content of tweets as the only source for topic derivation. Recently, tweet interactions have been considered as a way to improve the quality of topic derivation. In this paper, we propose a method that considers both content and interactions, together with a temporal aspect, to further improve the quality of topic derivation. The impact of the temporal aspect in user/tweet interactions is analyzed on several Twitter datasets. The proposed method incorporates time when it clusters tweets and when it identifies representative terms for each topic. Experimental results show that including the temporal aspect in the interactions yields a significant improvement in the quality of topic derivation compared to existing baseline methods.

Keywords

Topic derivation, Temporal aspect in twitter, Joint matrix factorization

Notes

Acknowledgments

This work is partially supported by the Indonesian Directorate General of Higher Education (DGHE), Macquarie University, CSIRO Data61, Australian Research Council LP120200231, and Australian Research Council DP140101369.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Computing, Macquarie University and CSIRO Data61, Sydney, Australia
  2. Department of Computing, Macquarie University, Sydney, Australia
  3. CSIRO Data61 Australia, Sydney, Australia
