Evaluating Syntactic Sentence Compression for Text Summarisation

Perera, Prasad; Kosseim, Leila

doi:10.1007/978-3-642-38824-8_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7934)

Included in the following conference series:

International Conference on Application of Natural Language to Information Systems

Abstract

This paper presents our work on the evaluation of syntax-based sentence compression for automatic text summarization. Sentence compression techniques can contribute to text summarization by removing redundant and irrelevant information, leaving more space for relevant content. However, very little work has focused on evaluating how much this idea actually contributes to summarization. In this paper, we focus on pruning individual sentences in extractive summaries using phrase structure grammar representations. We implemented several syntax-based pruning techniques and evaluated them in the context of automatic summarization using standard evaluation metrics. We performed our evaluation on the TAC and DUC corpora using the BlogSum and MEAD summarizers. The results show that sentence pruning can achieve compression rates as low as 60%; however, when the freed space is used to fit in more sentences, ROUGE scores do not improve significantly.
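
To make the idea of pruning a phrase structure parse concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify their pruning rules, so the label set `PRUNABLE`, the helper names, the example sentence, and the definition of compression rate as compressed length divided by original length (counted in tokens) are all assumptions made for illustration.

```python
import re

# Hypothetical set of constituent labels to prune; the paper's actual
# pruning rules are not given in the abstract.
PRUNABLE = frozenset({"PP", "ADVP", "SBAR"})


def parse_tree(text):
    """Parse a Penn-style bracketed string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read()


def leaves(node, prune=frozenset()):
    """Return the words under node, skipping subtrees whose label is in prune."""
    if isinstance(node, str):
        return [node]
    label, children = node
    if label in prune:
        return []
    words = []
    for child in children:
        words.extend(leaves(child, prune))
    return words


def compress(tree_text, prune=PRUNABLE):
    """Return (compressed sentence, compression rate) for one parsed sentence.

    Compression rate is assumed here to be compressed tokens / original tokens.
    """
    tree = parse_tree(tree_text)
    original = leaves(tree)
    compressed = leaves(tree, prune)
    return " ".join(compressed), len(compressed) / len(original)


# Hand-written example parse (an assumption, not taken from the TAC or DUC data).
example = (
    "(S (NP (DT The) (NN committee)) "
    "(VP (VBD approved) (NP (DT the) (NN budget)) "
    "(PP (IN after) (NP (DT a) (JJ long) (NN debate)))) (. .))"
)
sentence, rate = compress(example)
print(sentence)
print(f"{rate:.2f}")
```

On this example the prepositional phrase is dropped, leaving 6 of the 10 original tokens. In a summarizer, the space freed this way could be used to include additional sentences, which is the setting the abstract evaluates with ROUGE.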

Author information

Authors and Affiliations

Dept. of Computer Science & Software Engineering, Concordia University, Montreal, Canada
Prasad Perera & Leila Kosseim

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, 2 rue Conté, 75003, Paris, France
Elisabeth Métais
School of Computing, Science and Engineering, University of Salford, The Crescent, M5 4WT, Salford, Lancashire, UK
Farid Meziane, Sunil Vadera & Mohamad Saraee
Department of Decision and Information Sciences, School of Business Administration, Oakland University, 306 Elliott Hall, 48309, Rochester, MI, USA
Vijayan Sugumaran

Cite this paper

Perera, P., Kosseim, L. (2013). Evaluating Syntactic Sentence Compression for Text Summarisation. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2013. Lecture Notes in Computer Science, vol 7934. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38824-8_11

DOI: https://doi.org/10.1007/978-3-642-38824-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38823-1
Online ISBN: 978-3-642-38824-8
eBook Packages: Computer Science, Computer Science (R0)