Automatic Clustering and Summarisation of Microblogs: A Multi-subtopic Phrase Reinforcement Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10142)

Abstract

Microblogging-based social communication services and their subscriptions have grown phenomenally in recent years. Through these services, users publish a large number of posts within a short period of time, making it extremely hard for readers to keep track of a trending topic. One solution is text summarisation, which generates a short summary of a trending topic from multiple posts. Most existing summarisation algorithms were proposed for long documents and do not work well on short microblogging posts. The PR (Phrase Reinforcement) algorithm was designed specifically to summarise microblogs; however, it can only generate a single-post summary that conveys a single topic, potentially overlooking other important information in the posts. In this paper, we contribute the PRICE (Phrase Reinforcement: Iteration, Clustering and Extraction) algorithm, which extends the original PR algorithm with the ability to generate both multi-post and single-post summaries that span multiple subtopics. Experimental evaluation shows that the PRICE algorithm outperforms the original PR algorithm in terms of both the ROUGE-1 and Content metrics.
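For readers unfamiliar with ROUGE-1, the metric scores a generated summary by its unigram overlap with a reference summary. The following is a minimal sketch only, not the evaluation setup used in the paper: the function name `rouge_1`, the lowercasing, and the whitespace tokenisation are assumptions made for illustration, and the abstract does not state which ROUGE configuration the authors used.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1 scores (hypothetical helper, not the paper's code).

    Assumes lowercased, whitespace-separated tokens; real evaluations
    typically use a dedicated tokenizer and may apply stemming.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped unigram overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

In this toy example the candidate covers three of the six reference unigrams, so recall is 0.5 while precision is 1.0. A multi-subtopic summary would be expected to raise recall against a reference that itself covers several subtopics.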

Notes

  1. http://zourbuth.com/tools/woeid/

Author information

Corresponding author

Correspondence to Haifeng Shen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Alghamdi, M., Shen, H. (2017). Automatic Clustering and Summarisation of Microblogs: A Multi-subtopic Phrase Reinforcement Algorithm. In: Wagner, M., Li, X., Hendtlass, T. (eds) Artificial Life and Computational Intelligence. ACALCI 2017. Lecture Notes in Computer Science, vol 10142. Springer, Cham. https://doi.org/10.1007/978-3-319-51691-2_8

  • DOI: https://doi.org/10.1007/978-3-319-51691-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51690-5

  • Online ISBN: 978-3-319-51691-2

  • eBook Packages: Computer Science (R0)
