Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7182)

Abstract

The goal of automated summarization is to tackle the “information overload” problem by extracting and perhaps compressing the most important content of a document. Because single-document summarization has difficulty beating a standard baseline, especially for news articles, most efforts are currently focused on multi-document summarization. The goal of this study is to reconsider the importance of single-document summarization by introducing a new approach and its implementation. This approach combines syntactic, semantic, and statistical methodologies, and reflects psychological findings that pinpoint specific selection patterns humans follow as they construct summaries. Our system achieves successful summary evaluation results and outperforms the baseline on two separate data sets: the Document Understanding Conference (DUC) 2002 data set and a scientific magazine article set. These results not only have implications for extractive and abstractive single-document summarization but could also be leveraged in multi-document summarization.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barrera, A., Verma, R. (2012). Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_31

  • DOI: https://doi.org/10.1007/978-3-642-28601-8_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28600-1

  • Online ISBN: 978-3-642-28601-8

  • eBook Packages: Computer Science, Computer Science (R0)
