A NooJ Dictionary for the Rromani Language: Toward a NooJ-Relevant Sorting of Morphosyntactic Tags

  • Conference paper
Formalizing Natural Languages with NooJ and Its Natural Language Processing Applications (NooJ 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 811)

Abstract

This paper presents how to elaborate a relevant sorting of morphosyntactic tags for the NooJ dictionary of the Rromani language, through three topics: dialectal issues, the treatment of postpositions and the countableness of substantives. The module encompasses all four dialects of Rromani, whose isoglosses are basically no longer geographical. We have therefore defined each of the four dialects through a combination of two tags corresponding to specific isoglosses. For instance, the so-called O-bi dialect (i.e. the O-superdialect with no mutation of alveolar affricates) is labelled “rro + rrbi” in NooJ. Then, on typological grounds, it was decided to treat the Rromani postpositions as agglutinative, non-inflectional morphemes. Rromani postpositions are appended to substantives in the oblique case and are in some cases cumulative (as in Modern Indic). In addition, the postposition of possession may be inflected for gender, number and case like an adjective (-qo, -qi, -qe ‘of’ as basic forms, with variants). Accordingly, some 250 potential forms may be encountered for postpositions, covering all basic dialectal variants. However, they can all be rendered by a much more economical system, appropriate to both Rromani grammar and computational analysis. Finally, we investigated the system of countableness in Rromani nouns where relevant.
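
The two-tag scheme described in the abstract can be sketched in Python. This is only an illustration, not NooJ syntax or the module's actual data: the tags “rro” and “rrbi” and the four dialect names come from the paper, whereas the tags “rre” and “rrmu” and all function names are assumptions introduced here by analogy.

```python
from itertools import product

# Isogloss tags. "rro" and "rrbi" appear in the paper; "rre" and "rrmu"
# are ASSUMED here by analogy and may not match the actual module.
SUPERDIALECT_TAGS = {"O": "rro", "E": "rre"}
AFFRICATE_TAGS = {"bi": "rrbi", "mu": "rrmu"}


def dialect_tags():
    """Map each dialect name (e.g. "O-bi") to its pair of isogloss tags."""
    return {
        f"{sup}-{aff}": (SUPERDIALECT_TAGS[sup], AFFRICATE_TAGS[aff])
        for sup, aff in product(SUPERDIALECT_TAGS, AFFRICATE_TAGS)
    }


def nooj_label(dialect):
    """Render the tag pair in the "tag + tag" style used in the abstract."""
    return " + ".join(dialect_tags()[dialect])


print(nooj_label("O-bi"))  # rro + rrbi
```

Two binary isoglosses give four combinations, which is why two tags per entry are enough to identify any of the four dialects.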

Notes

  1. Proto-Rromani people were deported by Mahmood Sultan in 1018 from Kanauji in the middle valley of the Ganges.

  2. The four dialects of Rromani are called O-bi, O-mu, E-bi and E-mu (see chap. 2).

  3. In Rromani grammar, two levels of case should be sharply distinguished: the two morphological cases (direct, oblique), expressed by an inflectional ending, and several functional cases (e.g. ablative), expressed either by a postposition appended to a noun in the oblique case or by a preposition preceding a noun in the direct case. Prepositional and postpositional phrases are often equivalent (e.g. e raklesθar ‘from the boy’, with the postposition -θar ‘from’, vs. katar o raklo ‘from the boy’, with the preposition katar ‘from’).

  4. For example, the noun raklo ‘boy’ generates 257 “forms” in total: 7 forms without a postposition, 10 forms with invariable postpositions and 240 forms with a variable postposition.

  5. For example, the long forms of the postposition -qo ‘of’ are used only in the O-bi dialect.

  6. In the Rromani module, the capital “D” precedes the inflectional or semantic information on the determinee (e.g. the possessed substantive). For example, “Dsg” means that the possessed substantive is in the singular.

  7. In NooJ, words, lexemes and morphemes can all be considered ALUs (Atomic Linguistic Units) [3].

  8. A colon means inclusiveness. For example, “:N” includes any noun in any inflected form.

  9. Inanimate nouns in the oblique case do not occur without a postposition in Rromani.

  10. In general, a noun inflects in four forms; the masculine noun raklo ‘boy’ inflects as follows: raklo ‘boy’ (sg + dr), rakles ‘boy’ (sg + ob), rakle ‘boys’ (pl + dr), raklen ‘boys’ (pl + ob).

  11. However, a NooJ inflectional dictionary would recognize it.

  12. Each constraint (and its variable) is numbered from left to right ($1 being the first constraint), and the various fields of the lexicon are named “L” (corresponding Lemma), “C” (morphosyntactic Category), “S” (Syntactic or semantic features) and “F” (inFlectional information). For instance, “$1L” means the lemma corresponding to the first constraint [4].

  13. The paradigm buxlo ‘large’ covers all adjectives that are vocalic (i.e. ending in “-o” in the basic form) and oxytonic (e.g. buxlo ‘large’, kalo ‘black’).

  14. Recall that the combination of the two tags “rro + rrbi” represents the O-bi dialect.

  15. The tag “rrs” (for “south”) represents the vernacular used in the Balkans.

  16. Recall that the capital “D” precedes the information on the determinee.

  17. These inflected forms of the possessive postposition are used in either the Balkan vernacular or the Carpathian one, both belonging to the O-bi dialect.

  18. Recall that “Dabl” means that the possessed noun, not the possessor, is in the ablative case.

  19. The tag “rrn” (for “north”) represents the vernacular used in Russia and the north of Poland.

  20. Rromani has no dedicated indefinite article; instead, the cardinal number jekh ‘one’ is used as the singular indefinite article.

  21. On morphological grounds, the singular form of love ‘money’ would be *lovo, yet its diminutive lovorro is used as an equivalent.
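
The paradigm in note 10 and the form count in note 4 can be checked with a short Python sketch. The forms, tags and counts are taken from those notes; the data layout and the function names are hypothetical, and the plain string concatenation used to attach a postposition is a simplifying assumption, not the behaviour of the NooJ module.

```python
# Paradigm of the masculine noun "raklo" (boy), as listed in note 10.
# Keys pair number (sg/pl) with morphological case (dr = direct, ob = oblique).
RAKLO = {
    ("sg", "dr"): "raklo",
    ("sg", "ob"): "rakles",
    ("pl", "dr"): "rakle",
    ("pl", "ob"): "raklen",
}

# Form counts for "raklo", as reported in note 4.
FORM_COUNTS = {
    "without postposition": 7,
    "invariable postposition": 10,
    "variable postposition": 240,
}


def tagged(form_key):
    """Return the form with its tags in the "sg + dr" style of note 10."""
    number, case = form_key
    return f"{RAKLO[form_key]} ({number} + {case})"


def attach(oblique_form, postposition):
    """Append a postposition to an oblique form (ASSUMED: plain concatenation)."""
    return oblique_form + postposition.lstrip("-")


total = sum(FORM_COUNTS.values())
print(tagged(("sg", "ob")))                 # rakles (sg + ob)
print(attach(RAKLO[("sg", "ob")], "-θar"))  # raklesθar, cf. note 3
print(total)                                # 257
```

Storing only the four inflected forms and attaching postpositions as separate morphemes avoids listing the 250 postpositional forms one by one, which is the economy the abstract argues for.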

References

  1. Courthiade, M.: The nominal flexion in Rromani. In: Courthiade, M., Grigore, D. (eds.) Professor Gheorghe Sarău: a Life Devoted to the Rromani Language. Editura Universității din București, Bucharest (2016)

  2. Courthiade, M., et al.: Morri angluni rromane ćhibǎqi evroputni lavustik. Cigány Ház, Budapest (2009)

  3. Silberztein, M.: La formalisation des langues: l’approche de NooJ. ISTE Eds., London (2015)

  4. Silberztein, M.: NooJ Manual (2003). www.nooj4nlp.net

Author information

Corresponding author

Correspondence to Masako Watabe.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Watabe, M. (2018). A NooJ Dictionary for the Rromani Language: Toward a NooJ-Relevant Sorting of Morphosyntactic Tags. In: Mbarki, S., Mourchid, M., Silberztein, M. (eds) Formalizing Natural Languages with NooJ and Its Natural Language Processing Applications. NooJ 2017. Communications in Computer and Information Science, vol 811. Springer, Cham. https://doi.org/10.1007/978-3-319-73420-0_1

  • DOI: https://doi.org/10.1007/978-3-319-73420-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73419-4

  • Online ISBN: 978-3-319-73420-0

  • eBook Packages: Computer Science, Computer Science (R0)
