Developing Morpho-SLaWS: An API for the Morphosyntactic Annotation of the Serbian Language

  • Toma Tasovac
  • Saša Rudan
  • Siniša Rudan
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 537)

Abstract

Serbian Lexical Web Service (SLaWS) is a resource-oriented web service designed to offer multiple functionalities, including morphosyntactic, lexicographic, and canonical text services, and thereby to form the backbone of a digital humanities infrastructure for the Serbian language. In this paper, we describe Morpho-SLaWS, the atomic morphosyntactic component of this service infrastructure. The goal of Morpho-SLaWS is to offer a reliable, programmatic way of extracting morphosyntactic information about word forms using a revised version of the MULTEXT-East specification. As a service-oriented lexical tool, Morpho-SLaWS can be deployed in a variety of contexts and combined with other linguistic and digital humanities (DH) tools.
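To make the idea of programmatic access to morphosyntactic information more concrete, the following minimal sketch shows how a client might expand a positional MULTEXT-East-style tag into named features. It is an illustration only: the function name `decode_noun_msd` and the small attribute table are assumptions made here, the table covers just a few noun attributes, and the revised Serbian tagset that Morpho-SLaWS actually uses may differ from it.

```python
# Illustrative subset of MULTEXT-East-style noun attributes (an assumption,
# not the revised Serbian tagset described in the paper).
NOUN_POSITIONS = [
    ("type", {"c": "common", "p": "proper"}),
    ("gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("number", {"s": "singular", "p": "plural"}),
    ("case", {
        "n": "nominative", "g": "genitive", "d": "dative",
        "a": "accusative", "v": "vocative", "l": "locative",
        "i": "instrumental",
    }),
]


def decode_noun_msd(msd: str) -> dict:
    """Expand a positional noun tag such as 'Ncmsn' into named features.

    Positions beyond those listed in NOUN_POSITIONS are ignored.
    """
    if not msd or msd[0] != "N":
        raise ValueError(f"not a noun tag: {msd!r}")
    features = {"category": "noun"}
    for (name, values), code in zip(NOUN_POSITIONS, msd[1:]):
        if code == "-":  # '-' marks an attribute that does not apply
            continue
        if code not in values:
            raise ValueError(f"unknown {name} code {code!r} in {msd!r}")
        features[name] = values[code]
    return features


if __name__ == "__main__":
    print(decode_noun_msd("Ncmsn"))
```

A real client would presumably obtain such tags from the service rather than hard-code them; the sketch only shows why a compact positional tag is convenient to transmit and straightforward to decode on the receiving side.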

Keywords

API design; Service architecture; Morphological lexicon; Serbian language; Digital humanities

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Belgrade Center for Digital Humanities, Belgrade, Serbia
