A Rule-Based Morphosemantic Analyzer for French for a Fine-Grained Semantic Annotation of Texts

  • Fiammetta Namer
Part of the Communications in Computer and Information Science book series (CCIS, volume 380)

Abstract

We describe DériF, a rule-based morphosemantic analyzer developed for French. Unlike existing word segmentation tools, DériF provides derived and compound words with various sorts of semantic information: (1) a definition, computed from both the base meaning and the specificities of the morphological rule; (2) lexical-semantic features, inferred from general linguistic properties of derivation rules; (3) lexical relations (synonymy, (co-)hyponymy) with other, morphologically unrelated, words belonging to the same analyzed corpus.
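As a rough illustration of items (1) and (2) above, the following sketch strips a suffix, checks the restored base against a lexicon, and builds a definition and feature set from the rule. It is a toy written under stated assumptions, not DériF's actual rules, lexicon, or output format: the `Rule` and `Analysis` classes, the two suffix rules, the glosses, the feature labels, and the three-verb lexicon are all hypothetical. Item (3), lexical relations between morphologically unrelated words, is not covered.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One hypothetical derivation rule (not taken from DériF)."""
    suffix: str                # ending removed from the derived word
    base_ending: str           # ending restored to rebuild the base lemma
    base_pos: str              # part of speech required of the base
    derived_pos: str           # part of speech of the derived word
    gloss: str                 # definition template, filled with the base
    features: Tuple[str, ...]  # features inferred from the rule itself


@dataclass(frozen=True)
class Analysis:
    word: str
    base: str
    pos: str
    definition: str
    features: Tuple[str, ...]


# Illustrative rules only: deverbal -eur nouns and -able adjectives.
RULES = (
    Rule("eur", "er", "V", "N",
         "agent or instrument that performs the action of '{base}'",
         ("agentive",)),
    Rule("able", "er", "V", "A",
         "that can undergo the action of '{base}'",
         ("potential", "passive")),
)

# Tiny stand-in lexicon mapping lemmas to parts of speech.
LEXICON = {"danser": "V", "laver": "V", "chanter": "V"}


def analyze(word: str) -> Optional[Analysis]:
    """Return the first rule-based analysis whose base is attested, else None."""
    for rule in RULES:
        if not word.endswith(rule.suffix):
            continue
        base = word[: -len(rule.suffix)] + rule.base_ending
        # Accept the analysis only if the base exists with the expected category.
        if LEXICON.get(base) == rule.base_pos:
            return Analysis(
                word=word,
                base=base,
                pos=rule.derived_pos,
                definition=rule.gloss.format(base=base),
                features=rule.features,
            )
    return None
```

With this toy lexicon, `analyze("danseur")` relates the noun to the verb `danser`, whereas `analyze("fleur")` returns `None` because the candidate base `fler` is not attested. The lexicon check is what keeps a purely formal suffix match from being accepted as a derivation.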

Keywords

NLP; morphosemantic approach; rule-based; French; derivation; neoclassical compounding; lexical-semantic feature; neologism; automatic definition; synonymy; hyponymy; co-hyponymy

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Fiammetta Namer (1)
  1. UMR 7118 ATILF - CNRS & Université de Lorraine, Nancy, France
