Using Statistical Search to Discover Semantic Relations of Political Lexica – Evidences from Bulgarian-Slovak EUROPARL 7 Corpus

  • Velislava Stoykova
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9582)


The paper presents a statistical approach to discovering semantic relations in political lexica using the parallel Bulgarian-Slovak EUROPARL 7 Corpus. It employs the statistical functions built into the Sketch Engine software to generate concordances, co-occurrences and collocations. It offers a comparative analysis of the semantic structure of political lexica, investigating synonymic, attributive and reciprocal semantic relations of the most frequent key words in the two parallel corpora, for both Bulgarian and Slovak. The paper addresses some issues related to correct term discovery, term translation and the use of terms in political speech. Finally, more general conclusions about the semantic properties of political lexica are presented.
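As a rough illustration of what generating co-occurrences and collocations statistically can involve, the sketch below counts co-occurrences inside a small token window and ranks the collocates of a key word by the logDice association score. It rests on assumptions: the abstract does not say which score or window size was used (logDice is chosen here only because Sketch Engine documents it for word sketches), the function names are hypothetical, and the toy English sentences are placeholders rather than EUROPARL data.

```python
import math
from collections import Counter


def cooccurrence_counts(sentences, window=1):
    """Count word frequencies and unordered co-occurrences within `window` tokens."""
    word_freq = Counter()
    pair_freq = Counter()
    for tokens in sentences:
        word_freq.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    pair_freq[frozenset((left, right))] += 1
    return word_freq, pair_freq


def log_dice(f_xy, f_x, f_y):
    """logDice = 14 + log2(2 * f_xy / (f_x + f_y)); 14 is the theoretical maximum."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def collocates(keyword, word_freq, pair_freq):
    """Return (collocate, score) pairs for `keyword`, strongest association first."""
    scored = []
    for pair, f_xy in pair_freq.items():
        if keyword in pair:
            (other,) = pair - {keyword}
            score = log_dice(f_xy, word_freq[keyword], word_freq[other])
            scored.append((other, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder sentences (assumption): already tokenized and lowercased.
sentences = [
    ["european", "parliament", "votes"],
    ["european", "commission", "proposes"],
    ["european", "parliament", "debates"],
]
word_freq, pair_freq = cooccurrence_counts(sentences, window=1)
ranked = collocates("european", word_freq, pair_freq)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

Running the same ranking separately on the Bulgarian and the Slovak side of a parallel corpus would give two collocate lists for translation-equivalent key words, which is the kind of comparison the abstract describes.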


Keywords: Data mining, Combinatorics on words, Machine translation



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute for Bulgarian Language - BAS, Sofia, Bulgaria
