Inducing nonlocal constraints from baseline phonotactics


Nonlocal phonological patterns such as vowel harmony and long-distance consonant assimilation and dissimilation motivate representations that include only the interacting segments—projections. We present an implemented computational learner that induces projections based on phonotactic properties of a language that are observable without nonlocal representations. The learner builds on the base grammar induced by the MaxEnt Phonotactic Learner (Hayes and Wilson 2008). Our model searches this baseline grammar for constraints that suggest nonlocal interactions, capitalizing on the observations that (a) nonlocal interactions can be seen in trigrams if the language has simple syllable structure, and (b) nonlocally interacting segments define a natural class. We show that this model finds nonlocal restrictions on laryngeal consonants in corpora of Quechua and Aymara, and vowel co-occurrence restrictions in Shona.
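
The search over the baseline grammar can be illustrated with a minimal sketch. It assumes a toy representation in which each constraint is a sequence of natural classes (sets of segments), with an "any segment" placeholder modeled as the full inventory. The names `Constraint` and `find_projection_candidates`, the toy inventory, and the choice of the smallest natural class containing both outer classes are assumptions of this illustration, not the implemented learner.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

NaturalClass = FrozenSet[str]


@dataclass(frozen=True)
class Constraint:
    """Hypothetical constraint: a sequence of natural classes plus a weight."""
    classes: Tuple[NaturalClass, ...]
    weight: float


def find_projection_candidates(
    grammar: List[Constraint],
    inventory: NaturalClass,
    natural_classes: List[NaturalClass],
) -> List[NaturalClass]:
    """Collect natural classes hinted at by trigrams of the form *X-[any]-Y."""
    candidates: List[NaturalClass] = []
    for con in grammar:
        if len(con.classes) != 3:
            continue  # only trigrams can show an interaction across one segment
        left, middle, right = con.classes
        if middle != inventory:
            continue  # the middle position must be unrestricted
        outer = left | right
        covering = [nc for nc in natural_classes if outer <= nc]
        if not covering:
            continue  # X and Y do not fall inside any single natural class
        smallest = min(covering, key=len)  # assumption: prefer the tightest class
        if smallest not in candidates:
            candidates.append(smallest)
    return candidates


# Toy inventory (hypothetical): plain stops, ejectives, vowels.
PLAIN = frozenset({"p", "t", "k"})
EJECTIVE = frozenset({"p'", "t'", "k'"})
VOWEL = frozenset({"a", "i", "u"})
STOP = PLAIN | EJECTIVE
INVENTORY = STOP | VOWEL
CLASSES = [PLAIN, EJECTIVE, STOP, VOWEL, INVENTORY]

BASELINE = [
    Constraint((EJECTIVE, EJECTIVE), 4.0),         # local bigram: skipped
    Constraint((STOP, INVENTORY, EJECTIVE), 3.5),  # *stop-[any]-ejective
    Constraint((VOWEL, PLAIN, VOWEL), 0.5),        # restricted middle: skipped
]

projections = find_projection_candidates(BASELINE, INVENTORY, CLASSES)
print([sorted(p) for p in projections])
```

In this toy grammar only the stop-[any]-ejective trigram qualifies, so the sketch proposes the class of stops as a candidate projection.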


  1.

    An anonymous reviewer asks how crucial it is to assume that the segmental inventory is given in advance. This is an interesting question, since traditional phonological reasoning about analyzing segmental inventories does usually depend on phonotactics: for example, the analysis of English [ʧ] as an affricate and [ts] as a cluster relies on distributional information. We do not attempt to solve this complex problem here, though see Sect. 5 for some related discussion.

  2.

    Della Pietra et al. (1997:4) characterize gain as “the improvement [a constraint] brings to the model when it has weight [w]”: \(\mathit{Gain}_{Con}(w,C)=D(\tilde{p}||\mathit{Con})-D(\tilde{p}||\mathit{Con}_{wC})\), where C is the constraint with the weight w, D is the Kullback-Leibler divergence, \(\tilde{p}\) is the probability distribution of the data, and Con is the current constraint grammar.

    Della Pietra et al. explain the reason for this method of calculating gain intuitively as follows: “We approximate the improvement due to adding a single candidate [constraint], measured by the reduction in Kullback-Leibler divergence, by adjusting only the weight of the [constraint] and keeping all of the other parameters of the [grammar] fixed. In general this is only an estimate, since it may well be that adding a [constraint] will require significant adjustments to all of the parameters in the new model. From a computational perspective, approximating the improvement in this way can enable the simultaneous evaluation of thousands of candidate [constraints], and makes the algorithm practical.” (We modified the language slightly to translate it into constraint/grammar terms.) We might add that defined in this way, gain can be calculated for each constraint even when the grammar contains no constraints yet, whereas for O/E, there needs to be an arbitrarily set threshold.

  3.

    We did the counts for a transcribed Russian dictionary of 103,000 words. Looking at consonants in trigram and tetragram configurations, CVC accounted for 337,415 or 63% of all the combinations; CCC: 18,516 (3.4%), CCVC: 76,574 (14%), CVCC: 93,637 (18%), CVVC: 7,946 (1%). For vowel-to-vowel n-grams, the counts are VCV: 117,214 (64%), VCCV: 61,344 (33%), VCVV: 2,074 (1%), VVCV: 2,512 (1%). We give comparable numbers for other languages, where relevant, in their respective sections.

  4.

    Padgett (1991) does report a gradient co-occurrence restriction in 500 Russian roots; see also Kochetov and Radisic (2009).

  5.

    Aspirates may also appear in vowel-initial words, though ejectives are absent from such forms. See Gallagher (2015) for discussion.

  6.

    Our corpus is available on GitHub. While the newspaper is primarily a Quechua-language periodical, it includes numerous articles in Spanish, as well as Spanish phrases and Spanish roots embedded in Quechua text. The majority of Spanish forms were removed from the word corpus, including Spanish words that were inflected with Quechua morphology. The only exceptions are those words, mostly place names, that are consistent with the native phonotactics of Quechua.

  7.

    Another parameter is whether the model is asked to look for violable or inviolable constraints. In either condition, whether a constraint is included in the grammar depends on its gain, but an inviolable constraint simulation only considers constraints whose observed violations are zero. To keep the amount of information digestible, we only consider inviolable constraint models of Quechua and Aymara, since the laryngeal phonotactics are categorical. The results reported here are replicable with similar settings for violable constraint models as well. For all models reported throughout this paper, we ran the model with a large enough constraint set that the model returned fewer constraints than it was asked for. This means that constraint set size was not an analyst-manipulated parameter that affected the fit of the model.

  8.

    A baseline grammar run on a modified Quechua training set where codas were added to all syllables confirmed that this is true; the grammar includes a highly weighted constraint against stop-consonant bigrams, but no trigram constraints on stop-[ ]-ejective or stop-[ ]-aspirate sequences.

  9.

    Indeed, a model where binary [sg] and [cg] are used does not include any constraints on plain stops. This could be interpreted as a failing of the heuristic in the Hayes and Wilson model, or it could be taken as evidence that privative features are a better hypothesis in this particular case.

  10.

    This means that the phonotactic learning here happens over a sublexicon of roots; see Sect. 6.5 for more discussion.

  11.

    The use of a placeholder segment ‘X’ is of course not the ideal solution to this problem, as it obscures any other phonological generalizations that may hold of segments that are in an identity relation, like local restrictions on clusters of consonant-vowel interactions. A superior model would expand the search space of constraints to include algebraic notation. While Berent et al. (2012) present one potential method for constructing constraints of this type, no implementation of the model in that paper is available, nor has it been shown to be a general solution to phonological distinctions between identical and non-identical segments.

  12.

    Morphologically, most of these stems appear to be imperatives, which are roots followed by some verbal projection suffixes (causatives, applicatives, etc.) and the [-a] suffix. Since all the citation forms of verbs end in [-a], this throws off the calculations for sequences that end in [a], so we removed that suffix for the purposes of O/E calculations. The suffix is present in the learning data for the simulations we report, however, since it is a categorical fact about Shona phonotactics that all words end in vowels.

  13.

    We opted to use a different corpus from Hayes and Wilson (2008), who used an incomplete scanned version (Hannan 1974) that goes up to “m”. Our corpus is slightly smaller but contains the full range of initial consonants, which matters for phonotactic learning. We verified that the distribution of vowel-vowel pairs is comparable in the two corpora.

  14.

    Suffixes harmonize with verbal roots, but Fortune mentions a minor pattern whereby root vowels alternate to match the final -a or -e: [ndi-ger-e] ‘I am seated’ vs. [ku-gar-a], [ndi-ɲerer-e] ‘I am silent’ vs. [ku-ɲarar-a]. He lists five roots that follow this pattern; all alternate between [a] and [e] (Fortune 1980:20). We leave the phonological analysis of this for future work; for our present purposes, the important observation is that even the minor alternations are consistent with the phonotactic characterization of vowel harmony that affixes display.

  15.

    The list of clusters we included: [gw, mw, bw, hw, kw, sw, nd, ŋg, mb, nz, nʤ].

  16.

    The exhaustive list of clusters that occur in the Shona corpus: [ʦw, kw, âw, rw, mv, ʃw, zw, nw, tw, ŋw, jw, mw, w, sw, hw, pw, Ʒw, , gw, ɬw, nâw, mb, nâ, ŋg, nz, nɮ, ɲŋ, nj, âr, mbw, nhw, jŋ, ŋgw, nzw, nzv]. Many of these could be analyzed as labialized singletons or prenasalized stops or fricatives. The attractiveness of this move is somewhat tempered by the computational cost of increasing the number of natural classes. We do not know of a phonological analysis that would allow treating sequences such as [nj] or [âr] as singletons.

  17.

    Technically, [-low] includes [e i o u j w], since we specified the glides in the feature set as [-syll] segments with vocalic features. When the feature set was rigged to exclude glides from vowel natural classes, the results did not change.

  18.

    The details of this statistical analysis are provided along with the code for the learner on GitHub. In both the baseline and the final grammar, VCCV forms receive slightly higher harmony scores than VCV forms. Since the constraints on CC sequences are poorly understood, we severely limited the range of clusters in our nonce words. This means that VCV forms, with their wider range of consonants in medial position, are more likely to violate bigram constraints on CV and VC sequences. We do not know what the status of these constraints is in Shona speakers’ grammars, so it is an open question whether the computational learner is overfitting.

  19.

    An anonymous reviewer suggests evaluating the fit of the [+syllabic] phonotactic grammar with that of our mosaic grammar in a linear model, as we did for the baseline vs. mosaic grammars earlier. Unsurprisingly, given the visual impression in the plot, there is a significant effect of vowel harmony status on harmony scores in a linear model for the [+syllabic] grammar. The question, then, is whether it is possible to decide which model is better on the basis of such statistical comparisons. The usual methods of model comparison such as Akaike Information Criterion do distinguish these models, favoring [+syllabic] over the mosaic model (52,773 vs. 53,140—lower is better)—but this comparison also favors the baseline model (AIC = 49,724) over both of the models that capture the vowel harmony generalizations that we are after. The statistical method of evaluating models therefore points away from linguistic intuitions, which could be a potential problem for us. The only way to find out which model captures the right generalizations is to test them experimentally on human speakers of Shona.

  20.

    A similar criticism can be applied to the model of Goldsmith and Riggle (2012). They argue that their model discovers the projection relevant to Finnish vowel harmony, but it does so over segmental rather than featural representations—thus, the comparison is between V-to-V vs. V-to-C nonlocal relations. This assumes that the learner is considering only V and C natural classes, thereby giving the learner a vocalic projection for free. It also allows the learner to notice accidental nonlocal co-occurrence restrictions that do not involve segments from the same natural class, which our learner cannot detect.

  21.

    Russian is one of the languages that causes the Java implementation of the learner to run out of memory at the constraint enumeration stage, due to the large number of natural classes. We got around this for Russian by redefining the feature set to use several privative oppositions and not transcribing certain important phonotactic patterns (such as vowel reduction). This reduces the number of natural classes for the learner to deal with, and with it the ability to make certain phonological generalizations. Even this move did not help with Hungarian.

  22.

    An anonymous reviewer asks why nonlocal interactions aren’t more frequent in Polynesian languages, which have very simple syllable structure. First, several languages of the region have been noted for their nonlocal consonant interactions (see Blust 2012 for a review of OCP effects in these languages, as well as Coetzee and Pater 2008; Zuraw and Lu 2009). While we do predict that nonlocal interactions should be learnable via our method in Polynesian languages, there may be other reasons, including chance, why a language does or does not exhibit a particular type of pattern. For example, in a language with a small segmental inventory and simple syllable structure, nonlocal phonological dependencies introduce additional limitations on possible words, resulting in a relatively small set of unique words, unless words are extremely long. Morphological reduplication may make phonotactic nonlocal dependencies difficult to detect, since patterns may be ambiguous between a phonotactic and a morphological analysis.
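
The gain measure described in footnote 2 can be made concrete with a small sketch over a finite set of forms. It assumes that the grammar assigns probabilities proportional to the exponentiated negative sum of weighted violations (weights act as penalties), and that adding a candidate constraint leaves all other weights fixed. The function names, the toy forms, and the observed distribution are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

ConstraintFn = Callable[[str], int]  # maps a form to its violation count


def maxent_distribution(
    forms: Sequence[str],
    constraints: List[ConstraintFn],
    weights: List[float],
) -> List[float]:
    """P(x) proportional to exp(-sum_i w_i * C_i(x)) over a finite set of forms."""
    scores = [
        math.exp(-sum(w * c(x) for c, w in zip(constraints, weights)))
        for x in forms
    ]
    z = sum(scores)
    return [s / z for s in scores]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D(p || q), skipping outcomes with zero probability under p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def gain(
    forms: Sequence[str],
    observed: Sequence[float],
    constraints: List[ConstraintFn],
    weights: List[float],
    candidate: ConstraintFn,
    w: float,
) -> float:
    """Reduction in KL divergence from adding `candidate` with weight `w`,
    keeping every other weight in the grammar fixed."""
    current = maxent_distribution(forms, constraints, weights)
    extended = maxent_distribution(forms, constraints + [candidate], weights + [w])
    return kl_divergence(observed, current) - kl_divergence(observed, extended)


# Toy data (hypothetical): forms with zero, one, or two ejectives.
FORMS = ["tata", "t'ata", "tat'a", "t'at'a"]
OBSERVED = [0.4, 0.3, 0.3, 0.0]  # the form with two ejectives is unattested


def two_ejectives(form: str) -> int:
    """Violated once by any form containing two or more ejectives."""
    return 1 if form.count("'") >= 2 else 0


# Gain is defined even when the current grammar has no constraints at all.
g_empty = gain(FORMS, OBSERVED, [], [], two_ejectives, 2.0)
print(g_empty)
```

Because the unattested form is the only violator, penalizing it moves probability toward the attested forms, and the gain is positive; with a weight of zero the candidate changes nothing and the gain is zero.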


  1. Adriaans, Frans, and René Kager. 2010. Adding generalization to statistical learning: The induction of phonotactics from continuous speech. Journal of Memory and Language 62: 311–331.

  2. Albright, Adam. 2002. The identification of bases in morphological paradigms. PhD diss., University of California, Los Angeles.

  3. Albright, Adam, and Bruce Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition 90 (2): 119–161.

  4. Albright, Adam, and Bruce Hayes. 2006. Modeling productivity with the Gradual Learning Algorithm: The problem of accidentally exceptionless generalizations. In Gradience in grammar: Generative perspectives, eds. Gisbert Fanselow, Caroline Fery, Matthias Schlesewsky, and Ralf Vogel, 185–204. Oxford: Oxford University Press.

  5. Allen, Blake, and Michael Becker. 2015. Learning alternations from surface forms with sublexical phonology. Ms., UBC and Stony Brook.

  6. Becker, Michael, and Maria Gouskova. 2016. Source-oriented generalizations as grammar inference in Russian vowel deletion. Linguistic Inquiry 47 (3): 391–425.

  7. Becker, Michael, Nihan Ketrez, and Andrew Nevins. 2011. The surfeit of the stimulus: Analytic biases filter lexical statistics in Turkish devoicing neutralization. Language 87 (1): 84–125.

  8. Beckman, Jill. 1997. Positional faithfulness, positional neutralization, and Shona vowel harmony. Phonology 14 (1): 1–46.

  9. Beckman, Jill. 1998. Positional faithfulness. New York: Routledge.

  10. Bennett, William G. 2015. Assimilation, dissimilation, and surface correspondence in Sundanese. Natural Language & Linguistic Theory 33 (2): 371–415.

  11. Berent, Iris, Colin Wilson, Gary Marcus, and Doug Bemis. 2012. On the role of variables in phonology: Remarks on Hayes and Wilson (2008). Linguistic Inquiry 43 (1): 97–119.

  12. Berkley, Deborah. 2000. Gradient OCP effects. PhD diss., Northwestern University.

  13. Berkson, Kelly Harper. 2013. Optionality and locality: Evidence from Navajo sibilant harmony. Laboratory Phonology 4 (2): 287–337.

  14. Blust, Robert. 2012. One mark per word? Some patterns of dissimilation in Austronesian and Australian languages. Phonology 29 (3): 355–381.

  15. Chimhundu, Herbert. 1996. Duramazwi reChiShona. Harare: College Press Publishing Ltd.

  16. Chimhundu, Herbert, Oddrun Grønvik, Christian Emil Smith Ore, and Daniel Ridings. 1996. The African Languages Lexicon project (ALLEX). Accessed 12 February 2019.

  17. Coetzee, Andries W., and Joe Pater. 2008. Weighted constraints and gradient restrictions on place co-occurrence in Muna and Arabic. Natural Language and Linguistic Theory 26 (2): 289–337.

  18. Cohn, Abigail. 1992. The consequences of dissimilation in Sundanese. Phonology 9: 199–220.

  19. Colavin, Rebecca S, Roger Levy, and Sharon Rose. 2010. Modeling OCP-Place in Amharic with the Maximum Entropy phonotactic learner. In Chicago Linguistics Society (CLS) 46, Vol. 2, 27–41. Chicago: Chicago Linguistic Society.

  20. Cox, Betty Ellen, Myra Adamson, and Muriel Teusink. 1998. Kinyarwanda-English dictionary. Falls Church: The APICS Educational and Research Foundation.

  21. De Lucca, Manuel. 1987. Diccionario Práctico Aymara-Español, Español-Aymara. La Paz: Editorial Los Amigos del Libro.

  22. Della Pietra, Stephen, Vincent Della Pietra, and John Lafferty. 1997. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (4): 380–393.

  23. Downing, Laura J, and Maxwell Kadenge. 2015. Prosodic stems in Zezuru Shona. Southern African Linguistics and Applied Language Studies 33 (3): 291–305.

  24. Fortune, George. 1980. Shona grammatical constructions, 2nd edn., Vol. 1. Harare: Mercury Press.

  25. Frisch, Stefan A., Janet B. Pierrehumbert, and Michael B. Broe. 2004. Similarity avoidance and the OCP. Natural Language and Linguistic Theory 22 (1): 179–228.

  26. Futrell, Richard, Adam Albright, Peter Graff, Edward Gibson, and Timothy J O’Donnell. 2015. A probabilistic autosegmental model of phonotactics. Ms., MIT.

  27. Gafos, Adamantios. 1999. The articulatory basis of locality in phonology. New York: Garland.

  28. Gallagher, Gillian. 2014. An identity bias in phonotactics: Evidence from Cochabamba Quechua. Laboratory Phonology 5 (3): 337–378.

  29. Gallagher, Gillian. 2015. Natural classes in cooccurrence constraints. Lingua 166: 80–98.

  30. Gallagher, Gillian. 2016. Asymmetries in the representation of categorical phonotactics. Language 92 (3): 557–590.

  31. Gallagher, Gillian, and Jessica Coon. 2008. Distinguishing total and partial identity: Evidence from Chol. Natural Language and Linguistic Theory 27: 545–582.

  32. Goldsmith, John, and Jason Riggle. 2012. Information theoretic approaches to phonology: The case of Finnish vowel harmony. Natural Language and Linguistic Theory 30 (3): 859–896.

  33. Goldwater, Sharon, and Mark Johnson. 2003. Learning OT constraint rankings using a Maximum Entropy Model. In Stockholm Workshop on Variation within Optimality Theory, eds. Jennifer Spenader, Anders Eriksson, and Östen Dahl, 111–120. Stockholm: Stockholm University.

  34. Gouskova, Maria, and Michael Becker. 2013. Nonce words show that Russian yer alternations are governed by the grammar. Natural Language and Linguistic Theory 31 (3): 735–765.

  35. Gouskova, Maria, Sofya Kasyanenko, and Luiza Newlin-Łukowicz. 2015. Selectional restrictions as phonotactics over sublexicons. Lingua 167: 41–81.

  36. Hannan, Michael. 1974. Standard Shona dictionary. Harare: College Press in conjunction with the Literature Bureau.

  37. Hansson, Gunnar Olafur. 2001. Theoretical and typological issues in consonant harmony. PhD diss., University of California, Berkeley.

  38. Hardman, Martha James. 2001. Aymara. München: Lincom Europa.

  39. Hayes, Bruce, and James White. 2013. Phonological naturalness and phonotactic learning. Linguistic Inquiry 44 (1): 45–75.

  40. Hayes, Bruce, and Colin Wilson. 2008. A Maximum Entropy Model of Phonotactics and Phonotactic Learning. Linguistic Inquiry 39 (3): 379–440.

  41. Hayes, Bruce, Kie Zuraw, Péter Siptár, and Zsuzsa Cziráky Londe. 2009. Natural and unnatural constraints in Hungarian vowel harmony. Language 85 (4): 822–863.

  42. Heinz, Jeffrey. 2010. Learning long-distance phonotactics. Linguistic Inquiry 41 (4): 623–661.

  43. Jardine, Adam. 2015. Learning tiers for long-distance phonotactics. In Generative Approaches to Language Acquisition North America (GALANA) 6.

  44. Jardine, Adam, and Jeffrey Heinz. 2016. Learning tier-based strictly 2-local languages. Transactions of the Association for Computational Linguistics 4: 87–98.

  45. Kastner, Itamar, and Frans Adriaans. 2017. Linguistic constraints on statistical word segmentation: The role of consonants in Arabic and English. Cognitive Science 2 (S2): 494–518.

  46. Kimper, Wendell. 2011. Competing triggers: Transparency and opacity in vowel harmony. PhD diss., University of Massachusetts, Amherst.

  47. Kochetov, Alexei, and Milica Radisic. 2009. Latent consonant harmony in Russian: Experimental evidence for agreement by correspondence. In Formal Approaches to Slavic Linguistics (FASL) 17, eds. Maria Babyonyshev, Darya Kavitskaya, and Jodi Reich, 111–130. Ann Arbor: Michigan Slavic Publications.

  48. MacEachern, Margaret. 1997. Laryngeal cooccurrence restrictions. PhD diss., UCLA, Los Angeles.

  49. Maddieson, Ian. 1990. Shona velarization: Complex consonants or complex onsets? UCLA Working Papers in Linguistics 74: 16–34.

  50. McCarthy, John J. 1986. OCP Effects: Gemination and antigemination. Linguistic Inquiry 17 (2): 207–263.

  51. McCarthy, John J. 1988. Feature geometry and dependency: A review. Phonetica 43: 84–108.

  52. McCarthy, John J. 1989. Linear order in phonological representation. Linguistic Inquiry 20: 71–99.

  53. McCarthy, John J. 1994. The phonetics and phonology of Semitic pharyngeals. In Phonological structure and phonetic form: Papers in laboratory phonology 3, ed. Patricia Keating, 191–233. Cambridge: Cambridge University Press.

  54. Mester, Armin. 1986. Studies in tier structure. PhD diss., University of Massachusetts, Amherst. Published 1988 in Outstanding Dissertations in Linguistics series. New York: Garland.

  55. Mudzingwa, Calisto. 2010. Shona morphophonemics: Repair strategies in Karanga and Zezuru. PhD diss., University of British Columbia.

  56. Myers, Scott. 1987. Tone and the structure of words in Shona. PhD diss., University of Massachusetts, Amherst.

  57. Padgett, Jaye. 1991. Stricture in feature geometry. PhD diss., University of Massachusetts, Amherst.

  58. Rose, Sharon, and Rachel Walker. 2004. A typology of consonant agreement as correspondence. Language 80 (3): 475–532.

  59. Siptár, Péter, and Miklós Törkenczy. 2000. The phonology of Hungarian. Oxford: Oxford University Press.

  60. Stanton, Juliet. 2016. Learnability shapes typology: the case of the midpoint pathology. Language 92 (4): 753–791.

  61. Stanton, Juliet. 2017a. Constraints on the distribution of nasal-stop sequences: An argument for contrast. PhD diss., Massachusetts Institute of Technology.

  62. Stanton, Juliet. 2017b. Latin –alis/–aris and segmental blocking in dissimilation. In 2016 annual meeting on phonology, eds. Karen Jesney, Charlie O’Hara, Caitlin Smith, and Rachel Walker.

  63. Suzuki, Keiichiro. 1998. A typological investigation of dissimilation. PhD diss., University of Arizona.

  64. Svantesson, Jan-Olof, Anna Tsendina, Anastasia Karlsson, and Vivan Franzén. 2005. The phonology of Mongolian. Oxford: Oxford University Press.

  65. Trubetzkoy, N. S. 1939. Grundzüge der Phonologie. Prague: Travaux du cercle linguistique de Prague 7.

  66. Van Kampen, Anja, Güliz Parmaksiz, Ruben van de Vijver, and Barbara Höhle. 2008. Metrical and statistical cues for word segmentation: The use of vowel harmony and word stress as cues to word boundaries by 6- and 9-month-old Turkish learners. In Language acquisition and development: Proceedings of GALA 2007, Vol. 2007, 313–324.

  67. Walker, Rachel. 2001. Round licensing, harmony, and bisyllabic triggers in Altaic. Natural Language & Linguistic Theory 19 (4): 827–878.

  68. Walker, Rachel, Dani Byrd, and Fidèle Mpiranya. 2008. An articulatory view of Kinyarwanda coronal harmony. Phonology 25 (03): 499–535.

  69. Wilson, Colin, and Gillian Gallagher. 2018. Constraint complexity in surface-based phonotactics: A case study of South Bolivian Quechua. Linguistic Inquiry 49 (3): 610–623.

  70. Wilson, Colin, and Marieke Obdeyn. 2009. Simplifying subsidiary theory: Statistical evidence from Arabic, Muna, Shona, and Wargamay. Ms., Johns Hopkins.

  71. Zuraw, Kie. 2002. Aggressive reduplication. Phonology 19 (03): 395–439.

  72. Zuraw, Kie, and Bruce Hayes. 2017. Intersecting constraint families: an argument for Harmonic Grammar. Language 93 (3): 497–548.

  73. Zuraw, Kie, and Yu-An Lu. 2009. Diverse repairs for multiple labial consonants. Natural Language & Linguistic Theory 27 (1): 197–224.


For helpful feedback, we would like to thank Arto Anttila and the anonymous reviewers of NLLT, as well as Maddie Gilbert, Juliet Stanton, Ildi Emese Szabó, Sora Heng Yin, Jon Rawski, and audiences at OCP 2018 in London, UMass Amherst, Stony Brook, and the Phonology Winter School in Israel. Finally, we would like to thank Daniel Ridings for making the ALLEX corpus wordlist available to us, and Colin Wilson for sharing the code for the gain-based MaxEnt Phonotactic Learner, as well as detailed feedback on related work. This research was supported in part by NSF BCS-1724753 to the authors.

Author information



Corresponding author

Correspondence to Maria Gouskova.

Cite this article

Gouskova, M., Gallagher, G. Inducing nonlocal constraints from baseline phonotactics. Nat Lang Linguist Theory 38, 77–116 (2020).


  • Phonology
  • Phonotactics
  • Computational modeling
  • Inductive learning
  • Learnability
  • Consonant harmony
  • Consonant dissimilation
  • Vowel harmony
  • Nonlocal phonology
  • Corpus phonology
  • Quechua
  • Aymara
  • Shona