A Survey of Text Summarization Techniques

Abstract

Numerous approaches for identifying important content for automatic text summarization have been developed to date. Topic representation approaches first derive an intermediate representation of the text that captures the topics discussed in the input. Based on these representations of topics, sentences in the input document are scored for importance. In contrast, in indicator representation approaches, the text is represented by a diverse set of possible indicators of importance which do not aim at discovering topicality. These indicators are combined, very often using machine learning techniques, to score the importance of each sentence. Finally, a summary is produced either by selecting sentences greedily, choosing the sentences that will go in the summary one by one, or by globally optimizing the selection, choosing the best set of sentences to form a summary. In this chapter we give a broad overview of existing approaches based on these distinctions, with particular attention to how representation, sentence scoring, or summary selection strategies alter the overall performance of the summarizer. We also point out some of the peculiarities of the task of summarization which have posed challenges to machine learning approaches for the problem, and some of the suggested solutions.
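The distinction between greedy and global selection can be illustrated with a minimal sketch. It assumes the simplest possible topic representation (raw word frequencies over the input) and a word-coverage objective; neither is claimed to be the chapter's own method, and all names here (`word_counts`, `coverage`, `greedy_select`, `global_select`, the tiny stopword list) are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
import re

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "on", "for"}

def tokenize(sentence):
    """Return the lowercased content words of a sentence."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]

def word_counts(sentences):
    """Topic representation: frequency of each content word in the input."""
    return Counter(w for s in sentences for w in tokenize(s))

def coverage(selected, counts):
    """Score a set of sentences by the total frequency of the distinct words it covers."""
    covered = {w for s in selected for w in tokenize(s)}
    return sum(counts[w] for w in covered)

def n_words(sentences):
    """Length of a candidate summary in whitespace-separated words."""
    return sum(len(s.split()) for s in sentences)

def greedy_select(sentences, counts, budget):
    """Add sentences one by one, each time taking the largest coverage gain that fits."""
    summary, remaining = [], list(sentences)
    while True:
        best, best_gain = None, 0
        for s in remaining:
            if n_words(summary + [s]) > budget:
                continue
            gain = coverage(summary + [s], counts) - coverage(summary, counts)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:  # nothing fits, or nothing adds new words
            return summary
        summary.append(best)
        remaining.remove(best)

def global_select(sentences, counts, budget):
    """Exhaustively search all subsets within the budget (feasible only for tiny inputs)."""
    best, best_score = [], 0
    for k in range(1, len(sentences) + 1):
        for subset in combinations(sentences, k):
            if n_words(subset) <= budget:
                score = coverage(subset, counts)
                if score > best_score:
                    best, best_score = list(subset), score
    return best
```

Under this objective the exhaustive search can never score lower than the greedy one, but its cost grows exponentially with the number of sentences; practical global methods rely on more efficient optimization formulations than the brute-force enumeration used here.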

Keywords

  • Extractive text summarization
  • Topic representation
  • Machine learning for summarization

Chapter length: 34 pages

Author information

Corresponding author

Correspondence to Ani Nenkova.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Nenkova, A., McKeown, K. (2012). A Survey of Text Summarization Techniques. In: Aggarwal, C., Zhai, C. (eds) Mining Text Data. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-3223-4_3

  • DOI: https://doi.org/10.1007/978-1-4614-3223-4_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-3222-7

  • Online ISBN: 978-1-4614-3223-4

  • eBook Packages: Computer Science, Computer Science (R0)