Using Noun Phrase Heads to Extract Document Keyphrases

Barker, Ken; Cornacchia, Nadia

doi:10.1007/3-540-45486-1_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1822)

Included in the following conference series:

Conference of the Canadian Society for Computational Studies of Intelligence

Abstract

Automatically extracting keyphrases from documents is a task with many applications in information retrieval and natural language processing. Document retrieval can be biased towards documents containing relevant keyphrases; documents can be classified or categorized based on their keyphrases; automatic text summarization may extract sentences with high keyphrase scores.

This paper describes a simple system for choosing noun phrases from a document as keyphrases. A noun phrase is chosen based on its length, its frequency and the frequency of its head noun. Noun phrases are extracted from a text using a base noun phrase skimmer and an off-the-shelf online dictionary.
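The selection criteria named above (phrase length, phrase frequency, and head-noun frequency) can be illustrated with a minimal sketch. The abstract does not give the actual scoring formula, so everything specific here is an assumption: the function name `rank_keyphrases`, the treatment of the last word of a base noun phrase as its head, and the product used to combine the three signals are all hypothetical. The sketch also assumes the noun phrases have already been extracted, and does not reproduce the skimmer or dictionary lookup.

```python
from collections import Counter


def rank_keyphrases(noun_phrases, top_n=3):
    """Rank candidate noun phrases extracted from one document.

    noun_phrases: list of strings, one entry per occurrence in the text.
    Assumption (not from the paper): the head noun of a base noun phrase
    is its last word.
    """
    # Normalize each occurrence to a tuple of lowercase words; skip empties.
    phrases = [tuple(p.lower().split()) for p in noun_phrases if p.strip()]

    phrase_freq = Counter(phrases)                # frequency of the whole phrase
    head_freq = Counter(p[-1] for p in phrases)   # frequency of each head noun

    def score(phrase):
        # Hypothetical combination of the three signals named in the
        # abstract: head-noun frequency, phrase frequency, phrase length.
        return head_freq[phrase[-1]] * phrase_freq[phrase] * len(phrase)

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(phrase_freq, key=lambda p: (-score(p), p))
    return [" ".join(p) for p in ranked[:top_n]]


candidates = [
    "keyphrase extraction", "keyphrase extraction", "extraction",
    "noun phrase", "head noun", "document",
]
print(rank_keyphrases(candidates))
```

In this toy input, "extraction" heads three of the six candidates, so phrases ending in it rise to the top. Any monotone combination of the three signals would fit the description equally well; the product is only one choice.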

Experiments involving human judges reveal several interesting results: the simple noun phrase-based system performs roughly as well as a state-of-the-art, corpus-trained keyphrase extractor; ratings for individual keyphrases do not necessarily correlate with ratings for sets of keyphrases for a document; agreement among unbiased judges on the keyphrase rating task is poor.
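The abstract does not say how agreement among judges was measured. One common statistic for this kind of rating task is Cohen's kappa, which corrects the raw agreement between two judges for the agreement expected by chance. The sketch below is an illustration under that assumption, not the paper's procedure; the function name `cohen_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two judges rating the same items.

    ratings_a, ratings_b: equal-length sequences of category labels.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both judges.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: probability both judges pick the same label if each
    # labels at random according to their own label distribution.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both judges used a single identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of four keyphrases by two judges.
judge_1 = ["good", "good", "bad", "bad"]
judge_2 = ["good", "bad", "good", "bad"]
print(cohen_kappa(judge_1, judge_2))
```

A value near 1 indicates strong agreement, and a value near 0 indicates agreement no better than chance.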

Author information

Authors and Affiliations

School of Information Technology and Engineering, University of Ottawa, Ottawa, Canada, K1N 6N5
Ken Barker & Nadia Cornacchia

Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, Regina, SK, S4S 0A2, Canada
Howard J. Hamilton

About this paper

Cite this paper

Barker, K., Cornacchia, N. (2000). Using Noun Phrase Heads to Extract Document Keyphrases. In: Hamilton, H.J. (eds) Advances in Artificial Intelligence. Canadian AI 2000. Lecture Notes in Computer Science, vol 1822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45486-1_4

DOI: https://doi.org/10.1007/3-540-45486-1_4
Published: 19 May 2000
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67557-0
Online ISBN: 978-3-540-45486-1
eBook Packages: Springer Book Archive
