
Experiments on Statistical and Pattern-Based Biographical Summarization

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 3808)

Abstract

We describe experiments on content selection for producing biographical summaries from multiple documents. The method relies on a set of patterns to identify descriptive phrases, an available co-reference resolution algorithm, and a greedy, corpus-based sentence deletion procedure for document compression. We show that in an automatic evaluation of content using ROUGE, the proposed method obtains very good performance.
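The abstract does not specify how the greedy deletion step scores candidate sentences, so the following is only a minimal sketch of one plausible reading: sentences are removed one at a time, each time dropping the sentence whose removal leaves the remaining text most similar (by cosine similarity over term counts) to a corpus-derived term profile, until a word budget is met. The function names, the `profile` vector, the tokenizer, and the word budget are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with surrounding punctuation stripped."""
    tokens = (tok.strip(".,;:!?\"'()") for tok in text.lower().split())
    return Counter(tok for tok in tokens if tok)


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_delete(sentences, profile, max_words):
    """Greedily drop sentences until the text fits within max_words.

    Assumption: the 'corpus-based' criterion is modelled here as cosine
    similarity to `profile`, a hypothetical term-count vector built from
    a corpus of biographical text.
    """
    kept = list(sentences)
    while len(kept) > 1 and sum(len(s.split()) for s in kept) > max_words:
        best_index, best_score = 0, -1.0
        for i in range(len(kept)):
            # Score the text that would remain if sentence i were deleted.
            remainder = " ".join(kept[:i] + kept[i + 1:])
            score = cosine(bag_of_words(remainder), profile)
            if score > best_score:
                best_index, best_score = i, score
        del kept[best_index]
    return kept


# Toy input; the profile terms are made up for illustration.
sentences = [
    "Marie Curie was a physicist and chemist.",
    "The weather in Paris was mild that year.",
    "She won the Nobel Prize in physics and in chemistry.",
]
profile = bag_of_words(
    "physicist chemist nobel prize physics chemistry born died scientist"
)
summary = greedy_delete(sentences, profile, max_words=17)
print(summary)
```

On this toy input the off-topic weather sentence is the one deleted, since removing it leaves the text closest to the profile. Re-scoring the whole remainder after every deletion keeps the procedure greedy but quadratic in the number of sentences, which is acceptable for short multi-document inputs.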

Keywords

  • Noun Phrase
  • Vector Space Model
  • Longest Common Subsequence
  • Coreference Resolution
  • Target Entity

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saggion, H., Gaizauskas, R. (2005). Experiments on Statistical and Pattern-Based Biographical Summarization. In: Bento, C., Cardoso, A., Dias, G. (eds) Progress in Artificial Intelligence. EPIA 2005. Lecture Notes in Computer Science, vol 3808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11595014_60


  • DOI: https://doi.org/10.1007/11595014_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30737-2

  • Online ISBN: 978-3-540-31646-6

  • eBook Packages: Computer Science, Computer Science (R0)
