Language Resources and Evaluation, Volume 53, Issue 1, pp 173–190

Simplicity matters: user evaluation of the Slovene reference corpus

  • Špela Arhar Holdt
  • Kaja Dobrovoljc
  • Nataša Logar
Project Notes


The latest reference corpus of written Slovene, the Gigafida corpus, was created as part of the ‘Communication in Slovene’ project. Within the same project, a web concordancer was designed for the broadest possible use and tailored to the needs and abilities of user groups such as translators, writers, proofreaders and teachers. Two years after the corpus was published in the new tool, its features were assessed by users. With an average rating of 4.36 on a scale from 1 to 5 (1 = I strongly disagree, 5 = I strongly agree), the results indicate that most survey participants agreed or strongly agreed with positive statements about the new implementations (e.g. “The corpus results are displayed in a clear manner”). This is a considerable improvement in user experience over the previous reference corpus of Slovene, i.e. the FidaPLUS corpus within the ASP32 concordancer (rated 3.67). In the user feedback, the simplicity of the search options and the clarity of the interface are highlighted as the main advantages, while advanced visualizations of corpus data and improved search for word-phrases are suggested for future development. The evaluation also revealed some relevant user habits, such as not taking the time to learn about the tool systematically before starting to use it. The findings will be implemented in future editions of the Gigafida corpus, but they are relevant to any project that aims to facilitate wider use of reference corpora and corpus-based resources.


Reference corpus, Corpus concordancer, Gigafida, Usability assessment, User evaluation, User satisfaction



The resources described in this paper were funded within the national project ‘Communication in Slovene’ (2008–2013), financed by the European Social Fund and the Slovene Ministry of Education, Science and Sports (Grant No. 3311-08-986003). The evaluation was supported by the infrastructure programme (ARRS-I0-0051) at the Centre for Applied Linguistics (Trojina), and the reference corpus upgrade was funded by the Slovene Ministry of Culture (2015–2018) (Grant No. 33400-15-141007). The authors are also grateful to all reviewers for their very constructive input and comments.



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Faculty of Arts, University of Ljubljana, Ljubljana, Slovenia
  2. Institute “Jožef Stefan”, Ljubljana, Slovenia
  3. Faculty of Social Sciences, University of Ljubljana, Ljubljana, Slovenia
