Using Noun Phrase Heads to Extract Document Keyphrases

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1822)

Abstract

Automatically extracting keyphrases from documents is a task with many applications in information retrieval and natural language processing. Document retrieval can be biased towards documents containing relevant keyphrases; documents can be classified or categorized based on their keyphrases; automatic text summarization may extract sentences with high keyphrase scores.

This paper describes a simple system for choosing noun phrases from a document as keyphrases. A noun phrase is chosen based on its length, its frequency and the frequency of its head noun. Noun phrases are extracted from a text using a base noun phrase skimmer and an off-the-shelf online dictionary.
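The abstract names three signals (phrase length, phrase frequency, head noun frequency) but does not say how they are combined. The sketch below is one possible reading, not the paper's scoring function: it assumes candidate noun phrases have already been extracted, treats the last word of each phrase as its head (a common heuristic for English base noun phrases), and multiplies the three signals. The function name rank_noun_phrases, the top_k parameter, and the product score are all illustrative assumptions.

```python
from collections import Counter


def rank_noun_phrases(noun_phrases, top_k=3):
    """Rank candidate noun phrases with a toy score.

    ASSUMPTIONS (not taken from the paper):
      - the head noun is the last word of the phrase;
      - score = head frequency * phrase frequency * phrase length.
    """
    # Normalize each candidate to a tuple of lowercase words.
    phrases = [tuple(text.lower().split()) for text in noun_phrases if text.strip()]

    # How often each full phrase occurs among the candidates.
    phrase_freq = Counter(phrases)
    # How often each head (last word) occurs across all candidates.
    head_freq = Counter(p[-1] for p in phrases)

    def score(p):
        return head_freq[p[-1]] * phrase_freq[p] * len(p)

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(phrase_freq, key=lambda p: (-score(p), p))
    return [(" ".join(p), score(p)) for p in ranked[:top_k]]


candidates = [
    "noun phrase", "head noun", "noun phrase",
    "keyphrase", "base noun phrase", "phrase",
]
print(rank_noun_phrases(candidates))
```

With these candidates the head "phrase" occurs four times, so phrases built on it outrank "head noun" and "keyphrase" even though each of those is as long or as frequent as some of its competitors. A real system would need a tagger or skimmer to produce the candidates, and could weight the three signals differently.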

Experiments involving human judges reveal several interesting results: the simple noun phrase-based system performs roughly as well as a state-of-the-art, corpus-trained keyphrase extractor; ratings for individual keyphrases do not necessarily correlate with ratings for sets of keyphrases for a document; agreement among unbiased judges on the keyphrase rating task is poor.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Barker, K., Cornacchia, N. (2000). Using Noun Phrase Heads to Extract Document Keyphrases. In: Hamilton, H.J. (eds) Advances in Artificial Intelligence. Canadian AI 2000. Lecture Notes in Computer Science, vol 1822. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45486-1_4

  • DOI: https://doi.org/10.1007/3-540-45486-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67557-0

  • Online ISBN: 978-3-540-45486-1

  • eBook Packages: Springer Book Archive
