Manócska: A Unified Verb Frame Database for Hungarian

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11107)

Abstract

This paper presents Manócska, a verb frame database for Hungarian. We call it unified because it was built by merging all available verb frame resources. To merge these, we had to reconcile their structural and conceptual differences. We then converted the merged data into two easy-to-use formats: a TSV file and an XML file. Manócska is open access: the whole resource and the scripts used to create it are available in a GitHub repository. This makes Manócska reproducible and easy to access, version, fix, and develop in the future. Several errors came to light during the merging process, and we corrected them as systematically as possible. Thus, by integrating and harmonizing the resources, we produced a Hungarian verb frame database of higher quality.

Notes

  1. The resource and a detailed description of its structure can be found at https://github.com/ppke-nlpg/manocska.

  2. http://corpus.nytud.hu/isz/.

  3. https://hlt.bme.hu/hu/resources/tade.

  4. Mazsola and Tádé are two puppets from a Hungarian animated puppet film that was popular in the early 1970s. The eponym of our database, Manócska, is also a puppet from this film.

  5. https://github.com/kagnes/infinitival_constructions.

  6. The rank value is computed by dividing, for each resource, the actual frame frequency of the given record by the summed frame frequency of that resource, and then summing the results of these divisions.

  7. For licensing reasons, the original resources could not be included, but they can be requested from the original copyright holders at the given addresses.
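
The rank computation described in note 6 can be illustrated with a minimal sketch. It assumes one possible reading of the note: for each resource, a frame's frequency is divided by the sum of all frame frequencies in that resource, and the resulting ratios are added up across resources. The function name `rank_values`, the nested-dictionary layout, and the toy numbers are hypothetical and are not taken from Manócska or its scripts.

```python
from collections import defaultdict


def rank_values(frequencies):
    """Compute a rank per frame under the assumed reading of note 6.

    frequencies: {resource_name: {frame: frequency}} (hypothetical layout).
    Returns {frame: sum over resources of frequency / resource total}.
    """
    ranks = defaultdict(float)
    for frame_freqs in frequencies.values():
        total = sum(frame_freqs.values())
        if total == 0:
            continue  # an empty resource contributes nothing
        for frame, freq in frame_freqs.items():
            ranks[frame] += freq / total
    return dict(ranks)


# Toy data, invented for illustration only.
toy = {
    "resource_a": {"frame_1": 30, "frame_2": 70},
    "resource_b": {"frame_1": 5, "frame_3": 15},
}
ranks = rank_values(toy)
```

Under this reading, a frame attested in several resources accumulates one relative-frequency term per resource, so the rank does not depend on the absolute size of any single resource.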

References

  1. Baker, C.F., Fillmore, C.J., Lowe, J.B.: The Berkeley FrameNet Project. In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, ACL 1998, vol. 1, pp. 86–90. Association for Computational Linguistics, Stroudsburg (1998). https://doi.org/10.3115/980845.980860

  2. Brew, C., Schulte im Walde, S.: Spectral clustering for German verbs. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, EMNLP 2002, vol. 10, pp. 117–124. Association for Computational Linguistics, Stroudsburg (2002). https://doi.org/10.3115/1118693.1118709

  3. Halácsy, P., Kornai, A., Németh, L., Rung, A., Szakadát, I., Trón, V.: Creating open language resources for Hungarian. In: Calzolari, N. (ed.) Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), pp. 203–210 (2004)

  4. Indig, B., Vadász, N.: Windows in Human Parsing – How Far can a Preverb Go? In: Tadić, M., Bekavac, B. (eds.) Proceedings of the Tenth International Conference on Natural Language Processing (HrTAL2016) 2016, Dubrovnik, Croatia, 29–30 September 2016. Springer, Cham (2016). (accepted, in press)

  5. Kalivoda, Á.: A magyar igei komplexumok vizsgálata [The Hungarian Verbal Complexes]. Master’s thesis, PPKE-BTK (2016). https://github.com/kagnes/hungarian_verbal_complex

  6. Kornai, A., Nemeskey, D.M., Recski, G.: Detecting Optional Arguments of Verbs. In: Calzolari, N., et al. (eds.) Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016). European Language Resources Association (ELRA) (2016)

  7. Oravecz, C., Váradi, T., Sass, B.: The Hungarian Gigaword Corpus. In: Calzolari, N., et al. (eds.) Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014). European Language Resources Association (ELRA) (2014)

  8. Sass, B.: Igei szerkezetek gyakorisági szótára - Egy automatikus lexikai kinyerő eljárás és alkalmazása [A Frequency Dictionary of Verbal Structures - An Automatic Lexical Extraction Procedure and its Application]. Ph.D. thesis, Pázmány Péter Katolikus Egyetem ITK (2011)

  9. Sass, B.: 28 millió szintaktikailag elemzett mondat és 500 000 igei szerkezet [28 Million Syntactically Parsed Sentences and 500 000 Verbal Structures]. In: Tanács, A., Varga, V., Vincze, V. (eds.) XI. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2015) [XI. Hungarian Conference on Computational Linguistics], pp. 399–403. SZTE TTIK Informatikai Tanszékcsoport, Szeged (2015)

  10. Sass, B., Váradi, T., Pajzs, J., Kiss, M.: Magyar igei szerkezetek - A leggyakoribb vonzatok és szókapcsolatok szótára [Hungarian Verbal Structures - The Dictionary of the Most Frequent Arguments and Phrases]. Tinta Könyvkiadó, Budapest (2010)

  11. Schuler, K.K.: VerbNet: A broad-coverage, comprehensive verb lexicon. Ph.D. thesis, University of Pennsylvania (2006). http://verbs.colorado.edu/~kipper/Papers/dissertation.pdf

  12. Váradi, T.: The Hungarian National Corpus. In: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC-2002), pp. 385–389. European Language Resources Association, Paris (2002)

Author information

Corresponding author

Correspondence to Balázs Indig.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Kalivoda, Á., Vadász, N., Indig, B. (2018). Manócska: A Unified Verb Frame Database for Hungarian. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science, vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_14

  • DOI: https://doi.org/10.1007/978-3-030-00794-2_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00793-5

  • Online ISBN: 978-3-030-00794-2

  • eBook Packages: Computer Science, Computer Science (R0)
