Developing the Croatian National Corpus and Beyond

  • Marko Tadić
Part of the Text, Speech and Language Technology book series (TLTB, volume 31)

The Croatian National Corpus (HNK) has been collected since 1998 under grant # 130718 from the Ministry of Science and Technology of the Republic of Croatia. The theoretical foundations for such a corpus were laid down in Tadić (1996, 1998), where the need for a Croatian reference corpus (both synchronic and diachronic) was expressed. A tentative solution for its structure was suggested there, and its time-span, size, and availability over the WWW were further elaborated. The overall structure of the HNK was divided into two constituents:




Copyright information

© Springer 2007

Authors and Affiliations

  • Marko Tadić (1)
  1. Odsjek za lingvistiku, Filozofski fakultet Sveučilišta u Zagrebu, Zagreb, Croatia
