Reviewing and Evaluating Automatic Term Recognition Techniques

Korkontzelos, Ioannis; Klapaftis, Ioannis P.; Manandhar, Suresh

doi:10.1007/978-3-540-85287-2_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5221)

Included in the following conference series:

International Conference on Natural Language Processing

Abstract

Automatic Term Recognition (ATR) is the task of identifying domain-specific terms in technical corpora. Termhood-based approaches measure the degree to which a candidate term refers to a domain-specific concept. Unithood-based approaches measure the attachment strength of a candidate term's constituents. These methods have been evaluated using different, often incompatible, evaluation schemes and datasets. This paper provides an overview and a thorough evaluation of state-of-the-art ATR methods under a common evaluation framework, i.e. the same corpora and evaluation method. Our contributions are twofold: (1) we compare a number of ATR methods, showing that termhood-based methods generally achieve superior performance; (2) we show that the number of independent occurrences of a candidate term is the most effective source for estimating term nestedness, which improves ATR performance.
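
To make the distinction between unithood, termhood and nestedness concrete, the following is a minimal sketch of two widely used measures from the ATR literature: pointwise mutual information as a unithood score and C-value as a nestedness-aware termhood score. The measures themselves are standard, but whether and how the paper uses them is not visible in this preview, so this should not be read as the authors' implementation. The function names, the tuple-of-tokens representation and the toy counts are assumptions made for illustration.

```python
import math


def pmi(pair_count, x_count, y_count, total):
    """Unithood: pointwise mutual information of a two-word candidate.

    All counts are assumed to come from the same corpus of `total` bigrams.
    """
    p_xy = pair_count / total
    p_x = x_count / total
    p_y = y_count / total
    return math.log2(p_xy / (p_x * p_y))


def contains(longer, shorter):
    """True if token tuple `shorter` occurs contiguously inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Termhood: C-value for each multi-word candidate.

    `freq` maps a candidate (tuple of tokens) to its corpus frequency.
    For a nested candidate, the mean frequency of the longer candidates
    that contain it is subtracted, which approximates counting only its
    independent occurrences.
    """
    scores = {}
    for a, f_a in freq.items():
        nesting = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nesting:
            f_a = f_a - sum(nesting) / len(nesting)
        scores[a] = math.log2(len(a)) * f_a
    return scores


# Toy, made-up counts: "contact lens" appears 5 times, 3 of them
# inside the longer candidate "soft contact lens".
toy_freq = {
    ("soft", "contact", "lens"): 3,
    ("contact", "lens"): 5,
}
toy_scores = c_value(toy_freq)
```

In this toy example the nested candidate "contact lens" is credited with only 5 - 3 = 2 occurrences, so its score drops relative to its raw frequency, while the longer candidate keeps its full count. Single-word candidates receive a score of zero under this formulation because of the log2 length factor.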

Author information

Authors and Affiliations

Department of Computer Science, The University of York, Heslington, York, YO10 5NG, UK
Ioannis Korkontzelos, Ioannis P. Klapaftis & Suresh Manandhar

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Chalmers University of Technology, 41296, Göteborg, Sweden
Bengt Nordström & Aarne Ranta

About this paper

Cite this paper

Korkontzelos, I., Klapaftis, I.P., Manandhar, S. (2008). Reviewing and Evaluating Automatic Term Recognition Techniques. In: Nordström, B., Ranta, A. (eds) Advances in Natural Language Processing. GoTAL 2008. Lecture Notes in Computer Science, vol 5221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85287-2_24

DOI: https://doi.org/10.1007/978-3-540-85287-2_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85286-5
Online ISBN: 978-3-540-85287-2
eBook Packages: Computer Science, Computer Science (R0)
