Behavior Research Methods, Volume 50, Issue 1, pp 313–322

A Web-based interface to calculate phonotactic probability for words and nonwords in Modern Standard Arabic



A number of databases (Storkel, Behavior Research Methods, 45, 1159–1167, 2013) and online calculators (Vitevitch & Luce, Behavior Research Methods, Instruments, and Computers, 36, 481–487, 2004) have been developed to provide statistical information about various aspects of language, and these have proven invaluable to researchers, clinicians, and instructors in the language sciences. Many such resources exist for English, and their number continues to grow, whereas far fewer exist for other languages. This article describes the development of a Web-based interface to calculate phonotactic probability in Modern Standard Arabic (MSA). A full description of how the calculator can be used is provided. It can be freely accessed at


Keywords: Phonotactic probability, Modern Standard Arabic, Online calculator



We thank the Council for International Exchange of Scholars for funding F.A. through the Fulbright Scholars Program while he was at the University of Kansas, and the University of Kansas Information Technology department (especially Erica Boos, Chris Escalante, and Bob Lim) for their work on the interface.


  1. Adriaans, F. (2006). PhonotacTools (Test version) [Computer program]. The Netherlands: Utrecht Institute of Linguistics OTS, Utrecht University.
  2. Amayreh, M. M. (2003). Completion of the consonant inventory of Arabic. Journal of Speech, Language, and Hearing Research, 46, 517–529.
  3. Amayreh, M. M., & Dyson, A. T. (1998). The acquisition of Arabic consonants. Journal of Speech, Language, and Hearing Research, 41, 642–653.
  4. Anderson, J. D., & Byrd, C. T. (2008). Phonotactic probability effects in children who stutter. Journal of Speech, Language, and Hearing Research, 51, 851–866. doi: 10.1044/1092-4388(2008/062)
  5. Arts, T., Belinkov, Y., Habash, N., Kilgarriff, A., & Suchomel, V. (2014). arTenTen: Arabic corpus and word sketches. Journal of King Saud University - Computer and Information Sciences, 26, 357–371.
  6. Auer, E. T., & Luce, P. A. (2005). Probabilistic phonotactics in spoken word recognition. In D. B. Pisoni & R. E. Remez (Eds.), The handbook of speech perception (pp. 610–630). Oxford, UK: Blackwell. doi: 10.1002/9780470757024.ch25
  7. Badawi, E. (2006). Arabic for nonnative speakers in the 21st century: A shopping list. In K. M. Wahba, Z. A. Taha, & L. England (Eds.), Handbook for Arabic language teaching professionals (pp. ix–xiv). Mahwah, NJ: Erlbaum.
  8. Bailey, T. M., & Hahn, U. (2001). Determinants of wordlikeness: Phonotactics or lexical neighborhoods? Journal of Memory and Language, 44, 568–591.
  9. Balota, D. A., Pilotti, M., & Cortese, M. J. (2001). Subjective frequency estimates for 2,938 monosyllabic words. Memory & Cognition, 29, 639–647. doi: 10.3758/BF03200465
  10. Boudelaa, S., & Marslen-Wilson, W. D. (2013). Morphological structure in the Arabic mental lexicon: Parallels between standard and dialectal Arabic. Language and Cognitive Processes, 28, 1453–1473.
  11. Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41, 977–990. doi: 10.3758/BRM.41.4.977
  12. Buckwalter, T. (2002). Arabic transliteration. URL
  13. Cunningham, K. T., Haley, K. L., & Jacks, A. (2016). Speech sound distortions in aphasia and apraxia of speech: Reliability and diagnostic significance. Aphasiology, 30, 396–413.
  14. Faizal, S. S. B., Khattab, G., & McKean, C. (2015). The Qur’an Lexicon Project: A database of lexical statistics and phonotactic probabilities for 19,286 contextually and phonetically transcribed types in Qur’anic Arabic. In Proceedings of the 18th International Congress of Phonetic Sciences. Retrieved from on April 26, 2016.
  15. Ferguson, C. A. (1959). Diglossia. Word, 15, 325–340.
  16. Furman, N., Goldberg, D., & Lusin, N. (2007). Enrollments in languages other than English in United States institutions of higher education, Fall 2006. Modern Language Association of America. Retrieved from on November 29, 2007.
  17. Goldrick, M., & Larson, M. (2008). Phonotactic probability influences speech production. Cognition, 107, 1155–1164. doi: 10.1016/j.cognition.2007.11.009
  18. Gomez, R. L., & Gerken, L. (2000). Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4, 178–186.
  19. Goodrich, J., & Lonigan, C. J. (2015). Lexical characteristics of words and phonological awareness skills of preschool children. Applied Psycholinguistics, 36, 1509–1531.
  20. Habash, N., & Rambow, O. (2005). Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. In K. Knight (Ed.), Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (pp. 573–580). Stroudsburg, PA: Association for Computational Linguistics.
  21. Habash, N., Rambow, O., & Roth, R. (2009). MADA + TOKAN: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, POS tagging, stemming and lemmatization. In Proceedings of the 2nd International Conference on Arabic Language Resources and Tools (MEDAR) (pp. 102–109). Cairo, Egypt: MEDAR.
  22. Holes, C. (1995). Modern Arabic: Structure, functions and varieties. London, UK: Longman.
  23. Hunter, C. R. (2016). Is the time course of lexical activation and competition in spoken word recognition affected by adult aging? An event-related potential (ERP) study. Neuropsychologia, 91, 451–464. doi: 10.1016/j.neuropsychologia.2016.09.007
  24. Ibrahim, R., & Aharon-Peretz, J. (2005). Is literary Arabic a second language for native Arab speakers? Journal of Psycholinguistic Research, 34, 51–70.
  25. Ingham, B. (1994). Najdi Arabic: Central Arabian. Amsterdam, The Netherlands: Benjamins.
  26. Jaradat, A. A., & Al-Khawaldeh, N. N. A. (2015). Teaching Modern Standard Arabic for non-native speakers as a lingua franca. Mediterranean Journal of Social Sciences, 6, 490–499.
  27. Jusczyk, P. W., & Luce, P. A. (1994). Infants’ sensitivity to phonotactic patterns in the native language. Journal of Memory and Language, 33, 630–645.
  28. Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2004). The sketch engine. In Proceedings of EURALEX, Lorient, France (pp. 105–116). Available from
  29. Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
  30. Leonard, L. B., Davis, J., & Deevy, P. (2007). Phonotactic probability and past tense use by children with specific language impairment and their typically developing peers. Clinical Linguistics and Phonetics, 21, 747–758.
  31. Lewis, M. P., Simons, G. F., & Fennig, C. D. (Eds.). (2016). Ethnologue: Languages of the world (19th ed.). Dallas, Texas: SIL International. Online version at
  32. Marian, V., Bartolotti, J., Chabal, S., & Shook, A. (2012). CLEARPOND: Cross-linguistic easy-access resource for phonological and orthographic neighborhood densities. PLoS ONE, 7, e43230. doi: 10.1371/journal.pone.0043230
  33. Mattys, S. L., & Jusczyk, P. W. (2001). Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78, 91–121.
  34. Messer, M. H., Leseman, P. P., Boom, J., & Mayo, A. Y. (2010). Phonotactic probability effect in nonword recall and its relationship with vocabulary in monolingual and bilingual preschoolers. Journal of Experimental Child Psychology, 105, 306–323.
  35. Palmer, J. (2008). Arabic diglossia: Student perception of spoken Arabic after living in the Arabic-speaking world. Arizona Working Papers in Second Language Acquisition and Teaching, 15, 81–95.
  36. Parkinson, D. B. (2003). Verbal features in oral fusha in Cairo. International Journal of the Sociology of Language, 163, 27–41.
  37. Pitt, M. A., & McQueen, J. M. (1998). Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39, 347–370.
  38. Richtsmeier, P. T., & Goffman, L. (2015). Learning trajectories for speech motor performance in children with specific language impairment. Journal of Communication Disorders, 55, 31–43.
  39. Rispens, J., Baker, A., & Duinmeijer, I. (2015). Word recognition and nonword repetition in children with language disorders: The effects of neighborhood density, lexical frequency, and phonotactic probability. Journal of Speech, Language, and Hearing Research, 58, 78–92. doi: 10.1044/2014_JSLHR-L-12-0393
  40. Ryding, K. C. (2005). A reference grammar of modern standard Arabic. Cambridge, UK: Cambridge University Press.
  41. Ryding, K. C. (2006). Teaching Arabic in the United States. In K. M. Wahba, Z. A. Taha, & L. England (Eds.), Handbook for Arabic language teaching professionals in the 21st century (pp. 13–20). Mahwah, NJ: Lawrence Erlbaum Associates.
  42. Storkel, H. L. (2001). Learning new words: Phonotactic probability in language development. Journal of Speech, Language, and Hearing Research, 44, 1321–1337.
  43. Storkel, H. L. (2004). The emerging lexicon of children with phonological delays: Phonotactic constraints and probability in acquisition. Journal of Speech, Language, and Hearing Research, 47, 1194–1212.
  44. Storkel, H. L. (2013). A corpus of consonant–vowel–consonant real words and nonwords: Comparison of phonotactic probability, neighborhood density, and consonant age of acquisition. Behavior Research Methods, 45, 1159–1167. doi: 10.3758/s13428-012-0309-7
  45. Storkel, H. L., Armbruster, J., & Hogan, T. P. (2006). Differentiating phonotactic probability and neighborhood density in adult word learning. Journal of Speech, Language, and Hearing Research, 49, 1175–1192.
  46. van der Kleij, S. W., Rispens, J. E., & Scheper, A. R. (2016). The effect of phonotactic probability and neighbourhood density on pseudoword learning in 6- and 7-year-old children. First Language, 36, 93–108.
  47. Vitevitch, M. S., Chan, K. Y., & Goldstein, R. (2014). Using English as a “model language” to understand language processing. In A. Lowit & N. Miller (Eds.), Motor speech disorders: A cross-language perspective (pp. 58–73). Buffalo, NY: Multilingual Matters.
  48. Vitevitch, M. S., & Donoso, A. J. (2012). Phonotactic probability of brand names: I’d buy that! Psychological Research, 76, 693–698.
  49. Vitevitch, M. S., & Luce, P. A. (1998). When words compete: Levels of processing in perception of spoken words. Psychological Science, 9, 325–329. doi: 10.1111/1467-9280.00064
  50. Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40, 374–408.
  51. Vitevitch, M. S., & Luce, P. A. (2004). A Web-based interface to calculate phonotactic probability for words and nonwords in English. Behavior Research Methods, Instruments, and Computers, 36, 481–487. doi: 10.3758/BF03195594
  52. Vitevitch, M. S., & Luce, P. A. (2005). Increases in phonotactic probability facilitate spoken nonword repetition. Journal of Memory and Language, 52, 193–204. doi: 10.1016/j.jml.2004.10.003
  53. Vitevitch, M. S., Luce, P. A., Charles-Luce, J., & Kemmerer, D. (1997). Phonotactics and syllable stress: Implications for the processing of spoken nonsense words. Language and Speech, 40, 47–62.
  54. Vitevitch, M. S., Luce, P. A., Pisoni, D. B., & Auer, E. T. (1999). Phonotactics, neighborhood activation and lexical access for spoken words. Brain and Language, 68, 306–311.
  55. Vitevitch, M. S., Pisoni, D. B., Kirk, K. I., Hay-McCutcheon, M., & Yount, S. L. (2002). Effects of phonotactic probabilities on the processing of spoken words and nonwords by postlingually deafened adults with cochlear implants. Volta Review, 102, 283–302.
  56. Vitevitch, M. S., & Stamer, M. K. (2006). The curious case of competition in Spanish speech production. Language and Cognitive Processes, 21, 760–770. doi: 10.1080/01690960500287196
  57. Weber, A., & Cutler, A. (2006). First-language phonotactics in second-language listening. Journal of the Acoustical Society of America, 119, 597–607.
  58. Wright, W. (1995). A grammar of the Arabic language. Cambridge, UK: Cambridge University Press.
  59. Zipf, G. K. (1935). The psycho-biology of language: An introduction to dynamic philology. Cambridge, MA: Houghton Mifflin.

Copyright information

© Psychonomic Society, Inc. 2017

Authors and Affiliations

  1. Department of English Language and Translation, Qassim University, Buraydah, Saudi Arabia
  2. Spoken Language Laboratory, Department of Psychology, University of Kansas, Lawrence, USA
