Prerequisites for a Comprehensive Dictionary of Serbian Compounds

  • Cvetana Krstev
  • Duško Vitas
  • Agata Savary
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4139)


The paper describes the steps taken to begin producing a comprehensive morphological dictionary of Serbian compounds. First, the classes of multi-word expressions to be covered by the dictionary were determined. Next, useful sources of compounds were identified. The retrieved compounds were then classified according to their inflectional properties. For each of these classes, a finite-state transducer of a recently developed special type was constructed; it produces all the variants and morphological forms of the compounds in the class. Finally, a software module was developed that facilitates the production of the dictionary of compound lemmas, with all the necessary information in the required format.
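The idea of deriving every form of a compound from an inflectional class can be illustrated with a minimal sketch. It is not the transducer formalism used in the paper: the toy dictionary `SIMPLE_FORMS`, the function `inflect_compound`, and the representation of a class as a list of (component, inflects?) pairs are all hypothetical names introduced here, and only three case forms per word are listed. The sketch assumes that inflecting components must agree in case and number, and that invariant components are copied unchanged.

```python
from typing import Dict, List, Tuple

Tag = Tuple[str, str]  # (case, number), e.g. ("gen", "sg")

# Hypothetical toy dictionary of simple-word forms (illustrative subset only).
SIMPLE_FORMS: Dict[str, Dict[Tag, str]] = {
    "crna": {("nom", "sg"): "crna", ("gen", "sg"): "crne", ("acc", "sg"): "crnu"},
    "rupa": {("nom", "sg"): "rupa", ("gen", "sg"): "rupe", ("acc", "sg"): "rupu"},
    "suknja": {("nom", "sg"): "suknja", ("gen", "sg"): "suknje", ("acc", "sg"): "suknju"},
}


def inflect_compound(components: List[Tuple[str, bool]]) -> Dict[Tag, str]:
    """Generate compound forms from (component, inflects?) pairs.

    Inflecting components are looked up in SIMPLE_FORMS and must share
    the same tag; invariant components are copied as they are.
    """
    inflecting = [word for word, varies in components if varies]
    if not inflecting:
        return {}  # assumption: a fully invariant compound has no paradigm here

    # Keep only the tags available for every inflecting component.
    shared = set(SIMPLE_FORMS[inflecting[0]])
    for word in inflecting[1:]:
        shared &= set(SIMPLE_FORMS[word])

    forms: Dict[Tag, str] = {}
    for tag in sorted(shared):
        forms[tag] = " ".join(
            SIMPLE_FORMS[word][tag] if varies else word
            for word, varies in components
        )
    return forms


if __name__ == "__main__":
    # Adjective + noun, both inflecting and agreeing ("black hole").
    print(inflect_compound([("crna", True), ("rupa", True)]))
    # Invariant first component + inflecting noun ("miniskirt").
    print(inflect_compound([("mini", False), ("suknja", True)]))
```

Describing a class once and reusing it for every compound with the same behaviour is what makes the classification step worthwhile: each new compound only needs its components and a class label, not a hand-written paradigm.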


Keywords: Noun Phrase; Word Form; Grammatical Category; Simple Word; Slavic Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cvetana Krstev (1)
  • Duško Vitas (2)
  • Agata Savary (3)
  1. Faculty of Philology, University of Belgrade, Belgrade
  2. Faculty of Mathematics, University of Belgrade, Belgrade
  3. Computer Science Laboratory, François-Rabelais University of Tours, Blois Campus
