Reviewing and Evaluating Automatic Term Recognition Techniques

  • Conference paper
Advances in Natural Language Processing (GoTAL 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5221)

Abstract

Automatic Term Recognition (ATR) is the task of identifying domain-specific terms in technical corpora. Termhood-based approaches measure the degree to which a candidate term refers to a domain-specific concept. Unithood-based approaches measure the attachment strength of a candidate term's constituents. These methods have been evaluated using different, often incompatible, evaluation schemes and datasets. This paper provides an overview and a thorough evaluation of state-of-the-art ATR methods under a common evaluation framework, i.e. shared corpora and a single evaluation method. Our contributions are twofold: (1) we compare a number of different ATR methods, showing that termhood-based methods generally achieve superior performance; (2) we show that the number of independent occurrences of a candidate term is the most effective source for estimating term nestedness, improving ATR performance.
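
To make two of the notions in the abstract concrete, the following is a minimal Python sketch, not the method evaluated in the paper. It assumes pointwise mutual information as one possible unithood score for a two-word candidate, and it assumes that an "independent occurrence" is an occurrence of a candidate that is not contained in an occurrence of a longer candidate. The function names (`pmi`, `find_spans`, `independent_occurrences`) and the toy corpus are hypothetical and chosen only for illustration.

```python
import math


def pmi(pair_count, w1_count, w2_count, total):
    """Pointwise mutual information of a word pair, in bits.

    Illustrative unithood score: log2(P(w1, w2) / (P(w1) * P(w2))),
    with all probabilities estimated as count / total.
    """
    p_pair = pair_count / total
    p_w1 = w1_count / total
    p_w2 = w2_count / total
    return math.log2(p_pair / (p_w1 * p_w2))


def find_spans(tokens, term):
    """Return (start, end) token offsets of every occurrence of term."""
    n = len(term)
    return [
        (i, i + n)
        for i in range(len(tokens) - n + 1)
        if tuple(tokens[i:i + n]) == term
    ]


def independent_occurrences(sentences, candidates):
    """Count occurrences of each candidate that are not nested
    inside an occurrence of a strictly longer candidate.

    Assumption: this definition of "independent" is one plausible
    reading of the abstract, not necessarily the paper's.
    """
    counts = {c: 0 for c in candidates}
    for tokens in sentences:
        spans = [
            (start, end, c)
            for c in candidates
            for (start, end) in find_spans(tokens, c)
        ]
        for start, end, c in spans:
            nested = any(
                s2 <= start and end <= e2 and (e2 - s2) > (end - start)
                for s2, e2, _ in spans
            )
            if not nested:
                counts[c] += 1
    return counts


# Toy corpus: "contact lens" appears three times, once nested
# inside the longer candidate "soft contact lens".
sentences = [
    ["soft", "contact", "lens"],
    ["contact", "lens", "care"],
    ["contact", "lens"],
]
candidates = {("contact", "lens"), ("soft", "contact", "lens")}
counts = independent_occurrences(sentences, candidates)
```

On this toy corpus, "contact lens" has three raw occurrences but only two independent ones, because its occurrence in the first sentence lies inside "soft contact lens". A termhood score could then use the independent count rather than the raw frequency to account for nestedness.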

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Korkontzelos, I., Klapaftis, I.P., Manandhar, S. (2008). Reviewing and Evaluating Automatic Term Recognition Techniques. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_24

  • DOI: https://doi.org/10.1007/978-3-540-85287-2_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85286-5

  • Online ISBN: 978-3-540-85287-2

  • eBook Packages: Computer Science (R0)
