A Very Large Database of Collocations and Semantic Links

  • Igor Bolshakov
  • Alexander Gelbukh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1959)


A computational system manages a very large database of collocations (word combinations) and semantic links. The collocations are word pairs related in the sense of a dependency grammar, joined directly or through prepositions. Semantic relations such as synonyms, antonyms, subclasses, and superclasses form a thesaurus. The structure of the system is universal, so its language-dependent parts are easily adapted to a specific language (English, Spanish, Russian, etc.). Inference rules that predict highly probable new collocations automatically enrich the database at runtime. The inference is assisted by the available thesaurus links. The system is intended for word processing, foreign language learning, parse filtering, and lexical disambiguation.
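As a rough illustration of the data model described above, the following minimal sketch stores collocations as word pairs with an optional preposition and uses superclass links to predict new pairs. The names `Collocation`, `CollocationDB`, and `predict`, and the specific inheritance rule (a word is assumed to inherit the collocations of its superclasses), are assumptions made for illustration, not the system's actual design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Collocation:
    head: str
    dependent: str
    preposition: Optional[str] = None  # None: the two words are joined directly


class CollocationDB:
    """Toy store of collocations plus superclass links (hypothetical design)."""

    def __init__(self) -> None:
        self.collocations = set()
        self.superclass = {}  # word -> its superclass in the thesaurus

    def add(self, head: str, dependent: str, preposition: Optional[str] = None) -> None:
        self.collocations.add(Collocation(head, dependent, preposition))

    def add_subclass(self, word: str, parent: str) -> None:
        self.superclass[word] = parent

    def predict(self, head: str, dependent: str, preposition: Optional[str] = None) -> bool:
        """Return True if the pair is stored or can be inherited from a superclass.

        Assumed rule: a dependent inherits the collocations of its superclasses.
        A successfully predicted pair is added to the store.
        """
        word = dependent
        seen = set()
        while word is not None and word not in seen:
            if Collocation(head, word, preposition) in self.collocations:
                self.collocations.add(Collocation(head, dependent, preposition))
                return True
            seen.add(word)  # guard against cycles in the thesaurus links
            word = self.superclass.get(word)
        return False


db = CollocationDB()
db.add("drink", "beverage")          # joined directly
db.add("look", "picture", "at")      # joined through a preposition
db.add_subclass("tea", "beverage")   # thesaurus link: tea is a kind of beverage

print(db.predict("drink", "tea"))    # inherited through the superclass link
print(db.predict("look", "picture")) # no direct-join entry exists for this pair
```

In this sketch a predicted pair is written back to the store, which mirrors the idea of enriching the database at runtime; a real system would presumably attach a probability or a provenance flag to inferred entries.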


dictionary, collocations, thesaurus, syntactic relations, semantic relations, lexical disambiguation





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Igor Bolshakov (1)
  • Alexander Gelbukh (1)
  1. Center for Computing Research, National Polytechnic Institute, México DF, Mexico
