Comparing Topiary-Style Approaches to Headline Generation

  • Conference paper
Advances in Information Retrieval (ECIR 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3408)

Abstract

In this paper we compare a number of Topiary-style headline generation systems. The Topiary system, developed at the University of Maryland with BBN, was the top-performing headline generation system at DUC 2004. Topiary-style headlines consist of a number of general topic labels followed by a compressed version of the lead sentence of a news story. The Topiary system uses a statistical learning approach to find topic labels for headlines, while our approach, the LexTrim system, identifies key summary words by analysing the lexical cohesive structure of a text. The performance of these systems is evaluated using the ROUGE evaluation suite on the DUC 2004 news stories collection. The results of these experiments show that a baseline system that identifies topic descriptors for headlines using term frequency counts outperforms both the LexTrim and Topiary systems. A manual evaluation of the headlines confirms this result.
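As a rough illustration of the headline format described in the abstract, the sketch below builds a Topiary-style headline from term-frequency topic descriptors plus a shortened lead sentence. It is not the authors' baseline, LexTrim, or Topiary: the stopword list, the function names, the `k` and `max_words` parameters, and the `LABELS: sentence` layout are all assumptions made for this example, and the parse-based compression of the lead sentence is replaced here by plain word truncation.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "and", "or", "for",
    "with", "by", "is", "are", "was", "were", "that", "this", "it", "as",
    "from", "has", "have", "had", "be", "been", "its", "said",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def topic_descriptors(text, k=3):
    """Return the k most frequent non-stopword terms, upper-cased.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word.upper() for word, _ in ranked[:k]]


def lead_sentence(text):
    """Return the first sentence, split naively on ., ! or ? plus whitespace."""
    return re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]


def topiary_style_headline(text, k=3, max_words=8):
    """Join topic labels with a truncated lead sentence.

    Truncation to max_words stands in for real sentence compression.
    """
    labels = topic_descriptors(text, k)
    lead_words = lead_sentence(text).rstrip(".!?").split()
    compressed = " ".join(lead_words[:max_words])
    return " ".join(labels) + ": " + compressed


if __name__ == "__main__":
    story = (
        "Floods hit the city on Monday. The floods damaged homes. "
        "City officials said floods may return."
    )
    print(topiary_style_headline(story, k=2, max_words=4))
```

Ranking by raw counts keeps the sketch close to the term-frequency idea named in the abstract; a real system would need a fuller stopword list and a proper sentence compressor.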

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, R., Stokes, N., Doran, W.P., Newman, E., Carthy, J., Dunnion, J. (2005). Comparing Topiary-Style Approaches to Headline Generation. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_12

  • DOI: https://doi.org/10.1007/978-3-540-31865-1_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25295-5

  • Online ISBN: 978-3-540-31865-1

  • eBook Packages: Computer Science, Computer Science (R0)
