Abstract
We describe experiments on content selection for producing biographical summaries from multiple documents. The method relies on a set of patterns to identify descriptive phrases, an available co-reference resolution algorithm, and a greedy, corpus-based sentence deletion procedure for document compression. We show that in an automatic evaluation of content using ROUGE, the proposed method obtains very good performance.
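The abstract names a greedy, corpus-based sentence deletion procedure but does not spell it out here. As an illustration only, the following minimal sketch shows one way such a greedy deletion loop could look. It assumes a bag-of-words vector space model and a cosine score against a profile text. The function names (`bag_of_words`, `cosine`, `greedy_compress`), the scoring criterion, and the word-count limit are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_compress(sentences, profile_text, max_words):
    """Delete one sentence at a time until the text fits max_words.

    Assumption (not from the paper): at each step, drop the sentence
    whose removal leaves the remaining text most similar to the profile.
    """
    profile = bag_of_words(profile_text)
    kept = list(sentences)
    while len(kept) > 1 and sum(len(s.split()) for s in kept) > max_words:
        best_index, best_score = 0, -1.0
        for i in range(len(kept)):
            # Score the candidate summary that results from deleting sentence i.
            remaining = " ".join(kept[:i] + kept[i + 1:])
            score = cosine(bag_of_words(remaining), profile)
            if score > best_score:
                best_index, best_score = i, score
        del kept[best_index]
    return kept
```

Called with a list of candidate sentences, a profile text built from a reference corpus, and a word budget, `greedy_compress` returns the surviving sentences in their original order. Each step is locally optimal only, which is what makes the procedure greedy rather than an exhaustive search over subsets.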
Keywords
- Noun Phrase
- Vector Space Model
- Longest Common Subsequence
- Coreference Resolution
- Target Entity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saggion, H., Gaizauskas, R. (2005). Experiments on Statistical and Pattern-Based Biographical Summarization. In: Bento, C., Cardoso, A., Dias, G. (eds) Progress in Artificial Intelligence. EPIA 2005. Lecture Notes in Computer Science, vol 3808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11595014_60
DOI: https://doi.org/10.1007/11595014_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30737-2
Online ISBN: 978-3-540-31646-6
eBook Packages: Computer Science, Computer Science (R0)
