Automatic Validation of Terminology by Means of Formal Concept Analysis

  • Luis Felipe Melo Mora
  • Yannick Toussaint
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9113)


Term extraction tools extract candidate terms and annotate their occurrences in texts. However, not all of these occurrences are terminological, and determining whether a candidate term is actually used with a terminological meaning remains a challenging problem. The validation of term annotations is framed as a binary classification task in which each term occurrence is classified as terminological or non-terminological. A context-based hypothesis approach is applied to a training corpus: we assume that the words of the sentence containing the studied occurrence can be used to build positive and negative hypotheses, which are then used to classify undetermined examples. The method is applied and evaluated on a French corpus in the linguistics domain, and we also mention some improvements suggested by a quantitative and qualitative evaluation.
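The hypothesis-based classification summarized above can be illustrated with a minimal sketch. It assumes that each occurrence is described by the set of words in its sentence, and that a hypothesis is understood in the usual sense of learning with Formal Concept Analysis: an intersection of descriptions of examples from one class that is not contained in the description of any example from the other class. The function names and the toy English sentences are hypothetical and chosen for illustration only; they are not taken from the paper or its corpus.

```python
def closed_intersections(examples):
    """Return all non-empty intersections of one or more example descriptions."""
    closed = set()
    for desc in examples:
        desc = frozenset(desc)
        # Intersect the new description with every intersection found so far.
        new = {desc} | {desc & c for c in closed}
        closed |= {s for s in new if s}
    return closed


def hypotheses(own, other):
    """Intersections of `own` descriptions not included in any `other` description."""
    other = [frozenset(d) for d in other]
    return {h for h in closed_intersections(own)
            if not any(h <= d for d in other)}


def classify(description, pos_hyp, neg_hyp):
    """Classify an undetermined occurrence from the hypotheses it contains."""
    description = frozenset(description)
    pos = any(h <= description for h in pos_hyp)
    neg = any(h <= description for h in neg_hyp)
    if pos and not neg:
        return "terminological"
    if neg and not pos:
        return "non-terminological"
    return "undetermined"  # no hypothesis applies, or both kinds apply


# Toy training data (invented): sentence words around occurrences of "term".
positive = [
    {"the", "term", "denotes", "concept"},
    {"this", "term", "denotes", "notion"},
]
negative = [
    {"in", "the", "long", "term"},
    {"the", "concept", "of", "time"},
]

pos_hyp = hypotheses(positive, negative)
neg_hyp = hypotheses(negative, positive)

print(classify({"a", "term", "denotes", "an", "object"}, pos_hyp, neg_hyp))
print(classify({"in", "the", "long", "term", "view"}, pos_hyp, neg_hyp))
print(classify({"something", "else"}, pos_hyp, neg_hyp))
```

In this sketch an occurrence is left undetermined when it contains hypotheses of both classes or of neither. Enumerating all intersections can grow exponentially with the number of examples, so a realistic implementation would need some form of selection or pruning of hypotheses.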


Target Attribute, Formal Context, Formal Concept Analysis, Candidate Term, Term Occurrence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Inria Nancy-Grand Est, Villers-lès-Nancy, France
