Automatic generation of entity-oriented summaries for reputation management

  • Javier Rodríguez-Vidal
  • Jorge Carrillo-de-Albornoz
  • Enrique Amigó
  • Laura Plaza
  • Julio Gonzalo
  • Felisa Verdejo
Original Research


Producing online reputation summaries for an entity (company, brand, etc.) is a focused summarization task with a distinctive feature: issues that may affect the reputation of the entity take priority in the summary. In this paper we (i) present a new test collection of manually created (abstractive and extractive) reputation reports which summarize tweet streams for 31 companies in the banking and automobile domains; (ii) propose a novel methodology to evaluate summaries in the context of online reputation monitoring, which profits from an analogy between reputation reports and the problem of diversity in search; and (iii) provide empirical evidence that producing reputation reports is different from a standard summarization problem, and incorporating priority signals is essential to address the task effectively.


Summarization, Search with diversity, Twitter, Microblogs, Online reputation management



This research was partially supported by the Spanish Ministry of Science and Innovation (Vemodalen Project, TIN2015-71785-R) and UNED (project 2014V/PUNED/001).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. UNED IR & NLP Group, Madrid, Spain
