Behavior Research Methods, Volume 49, Issue 5, pp 1939–1950

K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics

  • Jeffrey J. Holliday
  • Rory Turnbull
  • Julien Eychenne


This article presents K-SPAN (Korean Surface Phonetics and Neighborhoods), a database of surface phonetic forms and several measures of phonological neighborhood density for 63,836 Korean words. Publicly available Korean corpora currently provide only orthographic representations in Hangeul, which is problematic because phonetic forms in Korean cannot be reliably predicted from orthographic forms. We describe the method used to derive the surface phonetic forms from a publicly available orthographic corpus of Korean and report several statistics calculated from the database: segment unigram frequencies, which are compared with previously reported results, and segment-based and syllable-based neighborhood density statistics for three types of representation. These are an “orthographic” form, which is a quasi-phonological representation; a “conservative” form, which maintains all known contrasts; and a “modern” form, which represents the pronunciation of contemporary Seoul Korean. The representations are rendered in an ASCII-encoded scheme, which allows users to query the corpus without having to read Korean orthography and permits the calculation of a wide range of phonological measures.
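As a rough illustration of what a neighborhood density statistic measures, the following Python sketch counts neighbors in a tiny invented lexicon. It assumes the common one-edit definition of a neighbor (one unit substituted, deleted, or added); the abstract does not state which definition K-SPAN uses. The function name `neighborhood_density` and the toy forms are hypothetical and are not taken from the database or its ASCII scheme.

```python
from collections import defaultdict


def neighborhood_density(lexicon):
    """Count one-edit neighbors for every form in the lexicon.

    Each form is a sequence of units (segments or syllables). Two forms are
    neighbors if one becomes the other by substituting, deleting, or adding
    exactly one unit. This definition is an assumption of the sketch.
    """
    forms = {tuple(form) for form in lexicon}
    neighbors = {form: set() for form in forms}

    # Substitution: forms that become identical once one position is masked.
    masked = defaultdict(set)
    for form in forms:
        for i in range(len(form)):
            masked[form[:i] + (None,) + form[i + 1:]].add(form)
    for group in masked.values():
        for form in group:
            neighbors[form] |= group - {form}

    # Deletion and addition: removing one unit yields another listed form.
    for form in forms:
        for i in range(len(form)):
            shorter = form[:i] + form[i + 1:]
            if shorter in forms:
                neighbors[form].add(shorter)
                neighbors[shorter].add(form)

    return {form: len(found) for form, found in neighbors.items()}


# Invented toy forms, one tuple element per segment.
segment_lexicon = [
    ("k", "a", "m"),
    ("k", "a", "n"),
    ("k", "a"),
    ("p", "a", "m"),
    ("s", "a", "n", "a"),
]
segment_density = neighborhood_density(segment_lexicon)
print(segment_density[("k", "a", "m")])  # 3

# The same function gives a syllable-based count when units are syllables.
syllable_lexicon = [("ka", "mi"), ("ka", "pi"), ("ka",), ("so", "pi")]
syllable_density = neighborhood_density(syllable_lexicon)
print(syllable_density[("ka", "pi")])  # 3
```

Treating each form as a tuple of units lets the same function serve both the segment-based and the syllable-based count; only the tokenization of the input changes.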


Keywords: Korean; Phonological neighborhood density; Lexicon; Lexical database

Supplementary material

13428_2016_836_MOESM1_ESM.pdf (721 kb)



Copyright information

© Psychonomic Society, Inc. 2017

Authors and Affiliations

  • Jeffrey J. Holliday (1)
  • Rory Turnbull (2)
  • Julien Eychenne (3)

  1. Department of Korean Language and Literature, Korea University, Seoul, South Korea
  2. Laboratoire de Sciences Cognitives et Psycholinguistique (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, Paris, France
  3. Department of Linguistics and Cognitive Science, Hankuk University of Foreign Studies, Gyeonggi, South Korea
