Abstract

This paper describes a number of criteria for archivable documentation of grammars of natural languages, extending the work of Bird and Simons’ “Seven dimensions of portability for language documentation and description.” We then describe a system for writing and testing morphological and phonological grammars of languages, a system which satisfies most of these criteria (where it does not, we discuss plans to extend the system).

The core of this system is based on an XML schema which allows grammars to be written in a stable and linguistically-based formalism, a formalism which is independent of any particular parsing engine. This core system also includes a converter program, analogous to a programming language compiler, which translates grammars written in this format, plus a dictionary, into the programming language of a suitable parsing engine (currently the Stuttgart Finite State Tools). The paper describes some of the decisions which went into the design of the formalism; for example, the decision to aim for observational adequacy, rather than descriptive adequacy. We draw out the implications of this decision in several areas, particularly in the treatment of morphological reduplication.
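The compiler analogy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the XML element and attribute names (`grammar`, `affix`, `type`, `form`) are hypothetical and are not the schema described in the paper, and the emitted text is only loosely modeled on SFST-style variable definitions and has not been checked against the SFST compiler.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar fragment. The element and attribute names are
# invented for this sketch; they are NOT the paper's actual schema.
GRAMMAR_XML = """
<grammar lang="demo">
  <affix id="PL" type="suffix" form="s"/>
  <affix id="PAST" type="suffix" form="ed"/>
</grammar>
"""

# Toy stem list standing in for the dictionary the converter also reads.
DICTIONARY = ["cat", "dog"]


def convert(grammar_xml, stems):
    """Translate an engine-neutral description into engine-specific text.

    The output format is an assumption: it imitates SFST-style variable
    definitions but is not guaranteed to be valid SFST input.
    """
    root = ET.fromstring(grammar_xml)
    suffixes = [
        affix.get("form")
        for affix in root.iter("affix")
        if affix.get("type") == "suffix"
    ]
    lines = [
        "$stems$ = " + " | ".join(stems),
        "$suffixes$ = " + " | ".join(suffixes),
        "$stems$ $suffixes$?",  # a stem followed by an optional suffix
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(convert(GRAMMAR_XML, DICTIONARY))
```

Because the grammar itself never mentions the target engine, retargeting to a different parsing engine would, under these assumptions, mean writing a second `convert` function rather than rewriting the grammar.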

We have used this system to produce formal grammars of Bangla, Urdu, Pashto, and Persian (Farsi), and we have derived parsers from those formal grammars. In the future we expect to implement similar grammars of other languages, including Dhivehi, Swahili, and Somali. In further work (briefly described in this paper), we have embedded formal grammars produced in this core system into traditional descriptive grammars of several of these languages. These descriptive grammars serve to document the formal grammars, and also provide automatically extractable test cases for the parser.
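A minimal sketch of how test cases might be pulled from a descriptive grammar and checked against a parser follows. The markup (`section`, `para`, `example` with `form` and `analysis` attributes) and the `toy_parse` function are hypothetical stand-ins; the paper's own document format and parser are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptive-grammar fragment: prose with embedded examples.
# The tag and attribute names are assumptions made for this sketch.
DOC = """
<section>
  <para>Plural nouns take the suffix -s.</para>
  <example form="cats" analysis="cat+PL"/>
  <example form="dogs" analysis="dog+PL"/>
  <example form="dog" analysis="dog"/>
</section>
"""


def extract_cases(doc_xml):
    """Return (surface form, expected analysis) pairs found in the document."""
    root = ET.fromstring(doc_xml)
    return [(e.get("form"), e.get("analysis")) for e in root.iter("example")]


def toy_parse(word, stems=("cat", "dog")):
    """Stand-in parser: recognizes a bare stem or a stem plus plural -s."""
    if word in stems:
        return word
    if word.endswith("s") and word[:-1] in stems:
        return word[:-1] + "+PL"
    return None


def failing_cases(doc_xml):
    """List (form, expected, actual) for every example the parser gets wrong."""
    failures = []
    for form, expected in extract_cases(doc_xml):
        actual = toy_parse(form)
        if actual != expected:
            failures.append((form, expected, actual))
    return failures


if __name__ == "__main__":
    print(failing_cases(DOC))
```

Keeping the examples inside the prose means that, in a setup like this one, a change to the formal grammar that breaks a documented example shows up as a failing case rather than as silently stale documentation.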

References

  1. Beesley, K.R., Karttunen, L.: Finite State Morphology. University of Chicago Press, Chicago (2003)
  2. Bird, S., Simons, G.: Seven dimensions of portability for language documentation and description. Language 79(3), 557–582 (2003)
  3. Blevins, J.: A reconsideration of Yokuts vowels. International Journal of American Linguistics 70(1), 33–51 (2004)
  4. Burnard, L., Bauman, S.: TEI P5: Guidelines for electronic text encoding and interchange (2013)
  5. Chomsky, N.: Aspects of the Theory of Syntax. MIT Press, Cambridge (1965)
  6. David, A., Maxwell, M.: Joint grammar development by linguists and computer scientists. In: IJCNLP, pp. 27–34. The Association for Computer Linguistics (2008)
  7. Dieterman, J.I.: Secondary palatalization in Isthmus Mixe: a phonetic and phonological account. SIL International, Dallas (2008), http://www.sil.org/silepubs/Pubs/50951/50951_DietermanJ_Mixe_Palatalization.pdf
  8. Halle, M.: Prolegomena to a theory of word formation. Linguistic Inquiry 4, 3–16 (1973)
  9. Halle, M., Mohanan, K.P.: Segmental phonology of Modern English. Linguistic Inquiry 16(1), 57–116 (1985)
  10. Hankamer, J.: Finite state morphology and left to right phonology. In: Proceedings of the Fifth West Coast Conference on Formal Linguistics, pp. 29–34 (1986)
  11. Harris, J.W.: Two theories of non-automatic morphophonological alternations. Language: Journal of the Linguistic Society of America 54, 41–60 (1978)
  12. Harris, Z.: Yokuts structure and Newman’s grammar. International Journal of American Linguistics 10, 196–211 (1944)
  13. ISO TC37: Language resource management — Feature structures — Part 1: Feature structure representation (2006)
  14. ISO TC37: Language resource management — Lexical markup framework, LMF (2008)
  15. ISO TC37: Language resource management — Feature structures — Part 2: Feature system declaration (2011)
  16. Karttunen, L.: The insufficiency of paper-and-pencil linguistics: the case of Finnish prosody. In: Kaplan, R.M., Butt, M., Dalrymple, M., King, T.H. (eds.) Intelligent Linguistic Architectures: Variations on Themes, pp. 287–300. CSLI Publications, Stanford (2006)
  17. Knuth, D.E.: Literate Programming. Center for the Study of Language and Information, Stanford (1992)
  18. Marantz, A.: Re reduplication. Linguistic Inquiry 13, 435–482 (1982)
  19. Maxwell, M.: Electronic grammars and reproducible research. In: Nordoff, S., Poggeman, K.-L.G. (eds.) Electronic Grammaticography, pp. 207–235. University of Hawaii Press (2012)
  20. Maxwell, M.: A Grammar Formalism for Computational Morphology (forthcoming)
  21. Maxwell, M., David, A.: Interoperable grammars. In: Webster, J., Ide, N., Fang, A.C. (eds.) First International Conference on Global Interoperability for Language Resources (ICGL 2008), Hong Kong, pp. 155–162 (2008), http://hdl.handle.net/1903/11611
  22. Newman, S.: The Yokuts Language of California. Viking Fund, New York (1944)
  23. Rice, C., Blaho, S. (eds.): Modeling ungrammaticality in Optimality Theory. Advances in Optimality Theory. Equinox Press, London (2009)
  24. Schmid, H.: A programming language for finite state transducers. In: Yli-Jyrä, A., Karttunen, L., Karhumäki, J. (eds.) FSMNLP 2005. LNCS (LNAI), vol. 4002, pp. 308–309. Springer, Heidelberg (2006)
  25. Walsh, N.: DocBook 5: The Definitive Guide. O’Reilly, Sebastopol, California (2011), http://www.docbook.org/
  26. Weber, D.J., Black, H.A., McConnel, S.R.: AMPLE: A Tool for Exploring Morphology. Summer Institute of Linguistics, Dallas (1988)
  27. Weigel, W.F.: The interaction of theory and description: The Yokuts canon. Talk presented at the Annual Meeting of the Society for the Study of the Indigenous Languages of the Americas (2002)
  28. Weigel, W.F.: Yowlumne in the Twentieth Century. Ph.D. thesis, University of California, Berkeley (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Michael Maxwell
    • 1
  1. University of Maryland, College Park, USA
