Using Linked Data to Create a Typological Knowledge Base

Abstract

In this paper, I describe the challenges in creating a Resource Description Framework (RDF) knowledge base for phonological typology. RDF is a model for data interchange that encodes knowledge as a graph, using sets of statements that link resource nodes via predicates that can be logically marked up (Lassila and Swick, 1999). The model I describe uses Linked Data to combine data from disparate segment inventory databases. Once the data in these legacy databases have been made interoperable at the linguistic and computational levels, I show how additional knowledge about distinctive features is linked to the knowledge base. I call this resource the Phonetics Information Base and Lexicon (PHOIBLE, http://phoible.org). It allows users to query segment inventories from a large number of languages at both the segment and distinctive feature levels (Moran, 2012). I then show how the knowledge base is useful for investigating questions of descriptive phonological universals, e.g. “do all languages have coronals?” and “does every phonological system have at least one front vowel or the palatal glide /j/?” (Hyman, 2008).
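
To make the statement-based graph model concrete, the following is a minimal sketch in plain Python rather than the RDF tooling the paper describes. Everything in it is an assumption made for illustration: the identifiers (lang:A, seg:t, feat:coronal), the predicate names hasSegment and hasFeature, and the two toy inventories are invented and are not taken from PHOIBLE. It shows how a question such as "do all languages have coronals?" could reduce to following links from languages to segments to features.

```python
# Toy triple store: each statement is (subject, predicate, object).
# All identifiers and inventories below are hypothetical, not PHOIBLE data.
TRIPLES = {
    ("lang:A", "hasSegment", "seg:p"),
    ("lang:A", "hasSegment", "seg:t"),
    ("lang:A", "hasSegment", "seg:i"),
    ("lang:B", "hasSegment", "seg:p"),
    ("lang:B", "hasSegment", "seg:k"),
    ("lang:B", "hasSegment", "seg:a"),
    ("seg:p", "hasFeature", "feat:labial"),
    ("seg:t", "hasFeature", "feat:coronal"),
    ("seg:k", "hasFeature", "feat:dorsal"),
    ("seg:i", "hasFeature", "feat:front"),
    ("seg:a", "hasFeature", "feat:low"),
}


def objects(subject, predicate):
    """Return every object linked to `subject` via `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def languages():
    """Return every resource that is the subject of a hasSegment statement."""
    return {s for s, p, _ in TRIPLES if p == "hasSegment"}


def has_feature(language, feature):
    """True if any segment in the language's inventory carries the feature."""
    return any(
        feature in objects(segment, "hasFeature")
        for segment in objects(language, "hasSegment")
    )


def languages_lacking(feature):
    """List the languages whose inventories contain no segment with the feature."""
    return sorted(lang for lang in languages() if not has_feature(lang, feature))


if __name__ == "__main__":
    # A candidate universal holds in the toy data only if this list is empty.
    print(languages_lacking("feat:coronal"))
```

In this toy data the query returns the one language whose inventory has no coronal segment, which is the kind of counterexample a universal claim would need to rule out. A real knowledge base would presumably express the same traversal as a query over RDF data, for example in SPARQL (Prud’Hommeaux and Seaborne, 2006).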

References

  1. Beckett D (2004) RDF/XML Syntax Specification (Revised). Tech. rep., W3C, URL http://www.w3.org/TR/rdf-syntax-grammar/
  2. Bender EM, Langendoen DT (2010) Computational Linguistics in Support of Linguistic Theory. Linguistic Issues in Language Technology (LiLT) 3(2):1–31
  3. Berners-Lee T, Hendler J, Lassila O (2001) The Semantic Web. Scientific American URL http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
  4. Bizer C, Cyganiak R, Heath T (2007) How to publish linked data on the web. http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
  5. Blevins J (2009) Another Universal Bites the Dust: Northwest Mekeo Lacks Coronal Phonemes. Oceanic Linguistics 48(1):264–273
  6. Cardoso J, Sheth AP (eds) (2006) Semantic Web Services, Processes and Applications. Springer
  7. Chanard C (2006) Systèmes alphabétiques des langues africaines. http://sumale.vjf.cnrs.fr/phono/
  8. Crothers JH, Lorentz JP, Sherman DA, Vihman MM (1979) Handbook of phonological data from a sample of the world’s languages: A report of the Stanford Phonology Archive
  9. Hartell RL (ed) (1993) Alphabets des langues africaines. UNESCO and Société Internationale de Linguistique
  10. Hayes B (2009) Introductory Phonology. Blackwell
  11. Hyman LM (2008) Universals in phonology. The Linguistic Review 25:83–137
  12. International Phonetic Association (2005) International Phonetic Alphabet. Tech. rep., International Phonetic Association, URL http://www.arts.gla.ac.uk/IPA/
  13. Jones AA (1995) Mekeo. In: Tryon DT (ed) Comparative Austronesian Dictionary: An Introduction to Austronesian Studies, Part 1: Fascicle 2, Mouton de Gruyter
  14. Jones AA (1998) Towards a Lexicogrammar of Mekeo (An Austronesian Language of Western Central Papua). Pacific Linguistics, Canberra
  15. Lassila O, Swick RR (1999) Resource Description Framework (RDF): Model and syntax specification (recommendation). http://www.w3.org/TR/REC-rdf-syntax
  16. Maddieson I (1984) Patterns of Sounds. Cambridge University Press, Cambridge, UK
  17. Maddieson I, Precoda K (1990) Updating UPSID. In: UCLA Working Papers in Phonetics, vol 74, pp 104–111
  18. McGuinness DL, van Harmelen F (2004) OWL Web Ontology Language Overview. URL http://www.w3.org/TR/owl-features/
  19. Moran S (2012) Phonetics information base. PhD thesis, University of Washington
  20. Pericliev V (2010) Machine-Aided Linguistic Discovery: An Introduction and Some Examples. London: Equinox
  21. Prud’Hommeaux E, Seaborne A (2006) SPARQL query language for RDF. W3C working draft 4
  22. The Unicode Consortium (2007) The Unicode Standard, Version 5.0.0, defined by: The Unicode Standard, Version 5.0. URL http://www.unicode.org/versions/Unicode5.0.0/

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Ludwig-Maximilians-Universität, München, Germany