Improving Multi-document Text Summarization Performance using Local and Global Trimming

  • Conference paper
Proceedings of the First International Conference on Intelligent Human Computer Interaction

Abstract

Multi-document summarization produces a condensed representation of the contents of multiple related text documents. With such a facility, web users can rapidly judge the relevance of a group of documents returned by a search engine and decide whether to discard them, which reduces their total search cost. This paper presents a multi-document summarization system with two components: (1) a sentence extraction component that produces draft summaries by sentence extraction, and (2) a sentence-trimming component that eliminates low-content and redundant elements from the sentences in the draft summaries to improve summarization performance. We also introduce several new local and global sentence-trimming rules. Our experiment on the DUC 2004 data set shows that local and global trimming can improve extractive multi-document summarization performance in many cases.
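
The two-component design described in the abstract (extract a draft, then trim it) can be illustrated with a small Python sketch. This is not the paper's system: the abstract does not specify the extraction method or the trimming rules, so everything below is a hypothetical stand-in. The frequency-based scoring, the stopword list, and the functions `extract_draft`, `trim_local`, and `trim_global` are assumptions made for illustration. The sketch also assumes that "local" means a rule that looks only at the sentence being trimmed, and "global" means a rule that looks at the rest of the draft summary.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
             "is", "was", "that", "it", "by", "for"}


def content_words(sentence):
    """Lowercased word tokens with stopwords removed."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [w for w in tokens if w not in STOPWORDS]


def extract_draft(documents, max_sentences=2):
    """Stage 1 (assumed method): rank sentences by the average
    frequency of their content words across all documents."""
    sentences = [s.strip() for doc in documents
                 for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return sorted(sentences, key=score, reverse=True)[:max_sentences]


def trim_local(sentence):
    """Local trimming (illustrative rules): uses only the sentence itself."""
    # Drop parenthetical asides.
    sentence = re.sub(r"\s*\([^)]*\)", "", sentence)
    # Drop a leading discourse connective.
    sentence = re.sub(r"^(However|Moreover|Meanwhile|In addition),\s*", "", sentence)
    return sentence[:1].upper() + sentence[1:]


def trim_global(sentences):
    """Global trimming (illustrative rule): drop a sentence whose content
    words are all covered by sentences already kept in the draft."""
    kept, seen = [], set()
    for sentence in sentences:
        words = set(content_words(sentence))
        if words and words <= seen:
            continue
        kept.append(sentence)
        seen |= words
    return kept


def summarize(documents, max_sentences=2):
    """Run extraction, then local trimming, then global trimming."""
    draft = extract_draft(documents, max_sentences)
    return trim_global([trim_local(s) for s in draft])


if __name__ == "__main__":
    docs = [
        "The storm hit the coast. Thousands lost power.",
        "However, the storm (a category 3 hurricane) hit the coast.",
    ]
    print(summarize(docs))
```

Trimming after extraction, as in this sketch, means the trimming rules only need to run on the few sentences that reach the draft, and the space they free could in principle be used for additional content within a fixed summary length.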

Copyright information

© 2009 Indian Institute of Information Technology, India

About this paper

Cite this paper

Sarkar, K. (2009). Improving Multi-document Text Summarization Performance using Local and Global Trimming. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_27

  • DOI: https://doi.org/10.1007/978-81-8489-203-1_27

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-8489-404-2

  • Online ISBN: 978-81-8489-203-1

  • eBook Packages: Computer Science (R0)