An ontology for accessing transcription systems

  • Steven Moran
Original Paper


This paper presents the design and implementation of the Ontology for Accessing Transcription Systems (OATS), a knowledge base that supports interoperation over disparate transcription systems and practical orthographies. OATS uses RDF, SPARQL and Unicode to facilitate resource discovery and intelligent search over linguistic data. The knowledge base includes an ontological description of writing systems and relations for mapping transcription system segments to an interlingua pivot, the IPA. It includes orthographic and phonemic inventories from 203 African languages, which were mined from the Web. OATS is motivated by four use cases: querying data in the knowledge base via IPA, querying it in native orthography, error checking of digitized data, and conversion between transcription systems. The model in this paper implements each of these use cases.


Keywords: Linguistics, Interoperability, Knowledge base, Ontology, Transcription system, Orthography



This work was supported in part by the Max-Planck-Institut für evolutionäre Anthropologie. Thanks go to Bernard Comrie, Jeff Good, and Michael Cysouw. For helpful comments and reviews, I thank Emily Bender, Scott Farrar, Sharon Hargus, Will Lewis, Dan McCloy, Richard Wright, and three anonymous reviewers.



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Department of Linguistics, University of Washington, Seattle, USA
