A New Operationalization of Contrastive Term Extraction Approach Based on Recognition of Both Representative and Specific Terms

Nugumanova, Aliya; Bessmertny, Igor; Baiburin, Yerzhan; Mansurova, Madina

doi:10.1007/978-3-319-45880-9_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 649)

Included in the following conference series:

International Conference on Knowledge Engineering and the Semantic Web

Abstract

The contrastive approach to term extraction is a broad class of methods based on the assumption that words occurring frequently within a domain and rarely outside it are most likely terms. The disadvantage of this approach is a large number of type II errors (false negatives). These errors stem from the idea of contrastive selection itself: the most representative, high-frequency terms are extracted from the texts, while rare terms are discarded. In this work, we propose a new operationalization of the contrastive approach that captures both high-frequency and low-frequency domain terms. The proposed operationalization reduces the number of false negatives. Experiments on texts from the subject domain “Geology” show that the proposed approach is promising.
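
To make the contrastive assumption concrete, here is a minimal sketch of a generic frequency-ratio baseline, not the operationalization proposed in the paper. It scores each word by its relative frequency in a domain corpus divided by its smoothed relative frequency in a general corpus. The function names, the add-one smoothing, the thresholds, and the toy corpora are all assumptions made for illustration.

```python
from collections import Counter


def contrastive_scores(domain_tokens, general_tokens, smoothing=1.0):
    """Ratio of domain relative frequency to smoothed general relative frequency.

    Illustrative baseline only; the smoothing scheme is an assumption.
    """
    domain_counts = Counter(domain_tokens)
    general_counts = Counter(general_tokens)
    n_domain = sum(domain_counts.values())
    n_general = sum(general_counts.values())
    vocab_size = len(set(domain_counts) | set(general_counts))

    scores = {}
    for word, count in domain_counts.items():
        p_domain = count / n_domain
        # Additive smoothing keeps the ratio finite for words unseen
        # in the general corpus.
        p_general = (general_counts[word] + smoothing) / (
            n_general + smoothing * vocab_size
        )
        scores[word] = p_domain / p_general
    return scores


def select_terms(scores, domain_counts, min_score=1.5, min_freq=2):
    """Keep words that pass both a score threshold and a frequency cutoff."""
    return sorted(
        w for w, s in scores.items()
        if s >= min_score and domain_counts[w] >= min_freq
    )


# Toy corpora (hypothetical data, for illustration only).
domain = "the basalt intrudes the granite and the basalt cools".split()
general = "the cat and the dog and the bird".split()

scores = contrastive_scores(domain, general)
domain_counts = Counter(domain)

with_cutoff = select_terms(scores, domain_counts, min_freq=2)
without_cutoff = select_terms(scores, domain_counts, min_freq=1)
print(with_cutoff)
print(without_cutoff)
```

In this toy setting, a frequency cutoff of 2 keeps "basalt" but drops "granite", which occurs only once in the domain corpus. That is the kind of false negative the abstract attributes to contrastive selection; lowering the cutoff recovers such rare words, at the usual cost of admitting more noise in real corpora.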

Author information

Authors and Affiliations

D. Serikbayev East Kazakhstan State Technical University, Ust-Kamenogorsk, Kazakhstan
Aliya Nugumanova
Saint Petersburg National Research University of ITMO, Saint Petersburg, Russia
Igor Bessmertny
Al-Farabi Kazakh National University, Almaty, Kazakhstan
Madina Mansurova
East Kazakhstan State University, Ust-Kamenogorsk, Kazakhstan
Yerzhan Baiburin

Corresponding author

Correspondence to Aliya Nugumanova.

Editor information

Editors and Affiliations

Leipzig University, Leipzig, Germany
Axel-Cyrille Ngonga Ngomo
Czech Technical University in Prague, Praha, Czech Republic
Petr Křemen

Cite this paper

Nugumanova, A., Bessmertny, I., Baiburin, Y., Mansurova, M. (2016). A New Operationalization of Contrastive Term Extraction Approach Based on Recognition of Both Representative and Specific Terms. In: Ngonga Ngomo, AC., Křemen, P. (eds) Knowledge Engineering and Semantic Web. KESW 2016. Communications in Computer and Information Science, vol 649. Springer, Cham. https://doi.org/10.1007/978-3-319-45880-9_9

DOI: https://doi.org/10.1007/978-3-319-45880-9_9
Published: 08 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45879-3
Online ISBN: 978-3-319-45880-9
eBook Packages: Computer Science, Computer Science (R0)
