Melody learning and long-distance phonotactics in tone


This paper presents evidence that tone well-formedness patterns share a property of melody-locality, and shows how patterns with this property can be learned. Essentially, a melody-local pattern is one in which constraints over an autosegmental melody operate independently of constraints over the string of tone-bearing units. This includes a range of local tone patterns, long-distance tone patterns, and their interactions. These results are obtained from the perspective of formal language theory and grammatical inference, which focus on the structural properties of patterns, but the implications extend to other learning frameworks. In particular, a melody-local learner can induce attested tone patterns that cannot be learned by the tier projection learners that have formed the basis of work on learning long-distance phonology. Thus, melody-local learning is a necessary property for learning tone. It is also shown that melody-local learners are more restrictive than learners that operate directly over autosegmental representations.


  1.

    Pierrehumbert and Beckman (1988) propose an underspecification-based model for (Tokyo) Japanese. A partial string-based representation of this model would instead use ∅ for the L-toned TBUs, indicating that they are not linked to any tone, and H for the TBU linked to the H tone. How we might fully capture the spirit of underspecification models via string representations is briefly discussed in Sect. 6.3.

  2.

    Hyman and Katamba’s (2010) account of Luganda is derivational, explicitly distinguishing between lexical and post-lexical levels of the phonology. This paper adopts the assumption that the phonology can be organized into distinct sub-phonologies, although it abstracts away from the details of how these are organized. In a constraint-based framework, this would require a stratally organized grammar (Kiparsky 2000; Bermúdez-Otero 2017).

  3.

    Recall our assumption that generalizations hold for strings of arbitrary length. Thus, the property of locality is about the phonological pattern itself, independent of generalizations about word length or of constraints on performance or processing.

  4.

    This function is defined formally in the Appendix. Interestingly, this function is input strictly local, a restrictive type of function that has been linked to phonological processes (Chandlee and Heinz 2018; Chandlee et al. 2018). Thanks to Jane Chandlee and Jeff Heinz for pointing this out.

  5.

    This function is based on Jardine and Heinz’s 2015 concatenation operation for generating OCP-obeying ARs from strings. For now, we gloss over the treatment of contours, which can be straightforwardly dealt with but are not necessary to capture the tone patterns from Sect. 3. It will be shown in Sect. 6.2 how to adapt our melody function to incorporate contours.

  6.

    This is not the only way these patterns can be described with melody-local grammars; in fact, the learning algorithm proposed in Sect. 4 will learn slightly different, though extensionally equivalent, grammars. For discussion see Sects. 4.2 and 4.3.

  7.

    In word-final position, this can technically be realized as a rising or falling tone; contours are abstracted away from here to focus on the long-distance nature of the pattern. For more on contours, see Sect. 6.2.

  8.

    The second tones in (32a) and (32b) are ‘melody high’ tones assigned to a particular mora by tense, aspect, and mood morphology. As Bickmore and Kula (2013) explain, these tones behave identically to underlying tones with respect to the main spreading generalizations.

  9.

    The boundaries here are not strictly word boundaries; to make this explicit, one could replace these with appropriate boundaries that demarcate the stem.

  10.

    There are cases, e.g. in Cilungu (Bickmore 2007:16), in which a morpheme introduces two H tones, which associate to the beginning and end of the word. However, as both tones are introduced by a morphological process, this is best characterized as a kind of circumfixation and not phonological agreement. The applicability of melody-local learning to morphological processes is an interesting question for future work.

  11.

    For brevity, this grammar abstracts away from the complete set of constraints that obtain bounded spreading.

  12.

    A full analysis would also require AR versions of the local spreading constraints in (42), to eliminate forms of the shape *HLLn. However, this is still describable with AR subgraph grammars; see Jardine (2017).

  13.

    Hyman (2011) lists Dioula Odienne (Braconnier 1982) as a possible example of tautomorphemic OCP violation not marked by downstep, but he also gives an alternate, OCP-obeying analysis based on underspecification. See also Shih (2016).

  14.

    Thanks to an anonymous reviewer for pointing this out.


  1. Angluin, Dana. 1980. Inductive inference of formal languages from positive data. Information and Control 45 (2): 117–135.

  2. Beckman, Jill. 1998. Positional faithfulness. PhD diss., University of Massachusetts, Amherst.

  3. Bermúdez-Otero, Ricardo. 2017. Stratal phonology. In The Routledge handbook of phonological theory. London: Routledge.

  4. Bickmore, Lee S. 2007. Stem tone melodies in Cilungu. SOAS Working Papers in Linguistics 15: 7–18.

  5. Bickmore, Lee S., and Nancy C. Kula. 2013. Ternary spreading and the OCP in Copperbelt Bemba. Studies in African Linguistics 42 (2): 101–132.

  6. Bickmore, Lee S., and Nancy C. Kula. 2015. Phrasal phonology in Copperbelt Bemba. Phonology 32: 147–176.

  7. Bird, Steven, and T. Mark Ellison. 1994. One-level phonology: Autosegmental representations and rules as finite automata. Computational Linguistics 20 (1): 55–90.

  8. Braconnier, Cassian. 1982. Le système du dioula d’Odienné, tome 1. Institut de linguistique appliquée, publication 86. University of Abidjan: Abidjan, Ivory Coast.

  9. Buckley, Eugene. 2009. Locality in metrical typology. Phonology 26: 389–435.

  10. Cassimjee, Farida, and Charles Kisseberth. 1998. Optimal domains theory and Bantu tonology. In Theoretical aspects of Bantu tone, eds. Charles Kisseberth and Larry Hyman, 265–314. Stanford: CSLI.

  11. Chandlee, Jane. 2014. Strictly local phonological processes. PhD diss., University of Delaware.

  12. Chandlee, Jane, and Jeffrey Heinz. 2018. Strict locality and phonological maps. Linguistic Inquiry 49: 23–60.

  13. Chandlee, Jane, Rémi Eyraud, and Jeffrey Heinz. 2014. Learning Strictly Local subsequential functions. Transactions of the Association for Computational Linguistics 2: 491–503.

  14. Chandlee, Jane, Jeffrey Heinz, and Adam Jardine. 2018. Input Strictly Local opaque maps. Phonology 35: 1–35.

  15. Chiosáin, Máire Ní, and Jaye Padgett. 2001. Markedness, segment realization, and locality in spreading. In Segmental phonology in optimality theory, ed. Linda Lombardi, 118–156. Cambridge: Cambridge University Press.

  16. Coleman, John, and John Local. 1991. The “No Crossing Constraint” in autosegmental phonology. Linguistics and Philosophy 14: 295–338.

  17. de Lacy, Paul. 2002. The interaction of tone and stress in Optimality Theory. Phonology 19: 1–32.

  18. de la Higuera, Colin. 2010. Grammatical inference: Learning automata and grammars. Cambridge: Cambridge University Press.

  19. Ding, Picus Sizhi. 2006. A typological study of tonal systems of Japanese and Prinmi: Towards a definition of pitch-accent languages. Journal of Universal Language 7: 1–35.

  20. Donohue, Mark. 1997. Tone systems in New Guinea. Linguistic Typology 1: 347–386.

  21. Eisner, Jason. 1997. What constraints should OT allow? Talk handout, Linguistic Society of America (LSA), Chicago. ROA#204-0797. Available at

  22. Eyraud, Rémi, Jean-Christophe Janodet, and Tim Oates. 2012. Learning substitutable binary plane graph grammars. In 11th International Conference on Grammatical Inference (ICGI 2012), eds. Jeffrey Heinz, Colin de la Higuera, and Tim Oates. JMLR Workshop and Conference Proceedings, 114–128.

  23. Gafos, Adamantios. 1996. The articulatory basis of locality in phonology. PhD diss., Johns Hopkins University.

  24. Gallagher, Gillian, and Colin Wilson. 2018. Accidental gaps and surface-based phonotactic learning: A case study of South Bolivian Quechua. Linguistic Inquiry 49 (3): 610–623.

  25. García, Pedro, Enrique Vidal, and José Oncina. 1990. Learning locally testable languages in the strict sense. In Workshop on Algorithmic Learning Theory, 325–338.

  26. Gold, Mark E. 1967. Language identification in the limit. Information and Control 10: 447–474.

  27. Goldsmith, John. 1976. Autosegmental phonology. PhD diss., Massachusetts Institute of Technology.

  28. Goldsmith, John A. 1990. Autosegmental & metrical theory. Oxford: Basil Blackwell, Inc.

  29. Goldsmith, John, and Jason Riggle. 2012. Information theoretic approaches to phonological structure: The case of Finnish vowel harmony. Natural Language & Linguistic Theory 30 (3): 859–896.

  30. Good, Jeff. 2004. Tone and accent in Saramaccan: Charting a deep split in the phonology of a language. Lingua 114: 575–619.

  31. Gouskova, Maria, and Gillian Gallagher. 2020. Inducing nonlocal constraints from baseline phonotactics. Natural Language and Linguistic Theory 38 (1): 77–116.

  32. Graf, Thomas. 2017. The power of locality domains in phonology. Phonology 34: 385–405.

  33. Haraguchi, Shōsuke. 1977. The tone pattern of Japanese: An autosegmental theory of tonology. Tokyo: Kaitakusha.

  34. Hayes, Bruce, and Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39: 379–440.

  35. Heinz, Jeffrey. 2010a. Learning long-distance phonotactics. Linguistic Inquiry 41: 623–661.

  36. Heinz, Jeffrey. 2010b. String extension learning. In 48th annual meeting of the Association for Computational Linguistics (ACL), 897–906. Stroudsburg: Association for Computational Linguistics.

  37. Heinz, Jeffrey, and James Rogers. 2010. Estimating strictly piecewise distributions. In 48th annual meeting of the Association for Computational Linguistics (ACL). Stroudsburg: Association for Computational Linguistics.

  38. Heinz, Jeffrey, and James Rogers. 2013. Learning subregular classes of languages with factored deterministic automata. In 13th meeting on the Mathematics of Language (MoL 13), eds. Andras Kornai and Marco Kuhlmann, 64–71. Stroudsburg: Association for Computational Linguistics.

  39. Heinz, Jeffrey, Colin de la Higuera, and Menno van Zaanen. 2016. Grammatical inference for computational linguistics. Synthesis lectures on human language technologies vol. 28. Williston: Morgan & Claypool Publishers.

  40. Heinz, Jeffrey, Chetan Rawal, and Herbert G. Tanner. 2011. Tier-based strictly local constraints for phonology. In 49th annual meeting of the Association for Computational Linguistics (ACL), 58–64. Stroudsburg: Association for Computational Linguistics.

  41. Hewitt, Mark, and Alan Prince. 1989. OCP, locality, and linking: the N. Karanga verb. In West Coast Conference on Formal Linguistics (WCCFL) 8, eds. E. Jane Fee and Katherine Hunt, 176–191. Stanford: CSLI.

  42. Hirayama, T. 1951. Kyuusyuu hoogen onchoo no kenkyuu (Studies on the tone of the Kyushu dialects). Tokyo: Gakkai no shinshin-sha.

  43. Hopcroft, John, Rajeev Motwani, and Jeffrey Ullman. 2006. Introduction to automata theory, languages, and computation, 3rd edn. Boston: Addison-Wesley.

  44. Hyde, Brett. 2012. Alignment constraints. Natural Language and Linguistic Theory 30: 789–836.

  45. Hyman, Larry. 2001. Tone systems. In Language typology and language universals: An international handbook, eds. Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible, Vol. 2, 1367–1380. Berlin: Walter de Gruyter.

  46. Hyman, Larry. 2007. Universals of tone rules: 30 years later. In Tones and tunes volume 1: Typological studies in word and sentence prosody, 1–34. New York: Mouton de Gruyter.

  47. Hyman, Larry. 2009. How (not) to do typology: The case of pitch-accent. Language Sciences 31 (2-3): 213–238.

  48. Hyman, Larry. 2011. Tone: Is it different? In The Blackwell handbook of phonological theory, eds. John A. Goldsmith, Jason Riggle, and Alan C. L. Yu, 197–238. Hoboken: Wiley-Blackwell.

  49. Hyman, Larry, and Francis X. Katamba. 1993. A new approach to tone in Luganda. Language 69: 34–67.

  50. Hyman, Larry, and Francis X. Katamba. 2010. Tone, syntax and prosodic domains in Luganda. In Papers from the workshop on Bantu relative clauses, eds. Laura Downing, Annie Rialland, Jean-Marc Beltzung, Sophie Manus, Cédric Patin, and Kristina Riedel. Vol. 53 of ZAS papers in linguistics, 69–98. Berlin: ZAS.

  51. Jardine, Adam. 2016. Computationally, tone is different. Phonology 33: 247–283.

  52. Jardine, Adam. 2017. The local nature of tone-association patterns. Phonology 34: 363–384.

  53. Jardine, Adam. 2019. The expressivity of autosegmental grammars. Journal of Logic, Language, and Information 28: 9–54.

  54. Jardine, Adam, and Jeffrey Heinz. 2015. A concatenation operation to derive autosegmental graphs. In 14th meeting on the Mathematics of Language (MoL 2015), 139–151. Stroudsburg: Association for Computational Linguistics.

  55. Jardine, Adam, and Jeffrey Heinz. 2016. Learning tier-based strictly 2-local languages. Transactions of the Association for Computational Linguistics 4: 87–98.

  56. Jardine, Adam, and Kevin McMullin. 2017. Efficient learning of tier-based strictly k-local languages. In Language and automata theory and applications, 11th international conference, eds. Frank Drewes, Carlos Martín-Vide, and Bianca Truthe. Lecture notes in computer science, 64–76. Dordrecht: Springer.

  57. Jardine, Adam, Jane Chandlee, Rémi Eyraud, and Jeffrey Heinz. 2014. Very efficient learning of structured classes of subsequential functions from positive data. In 12th International Conference on Grammatical Inference (ICGI 2014). JMLR workshop proceedings, 94–108.

  58. Jurafsky, Daniel, and James H. Martin. 2009. Speech and language processing, 2nd edn. Upper Saddle River: Prentice Hall.

  59. Kiparsky, Paul. 2000. Opacity and cyclicity. Linguistic Review 17: 351–366.

  60. Kisseberth, Charles, and David Odden. 2003. Tone. In The Bantu languages, eds. Derek Nurse and Gérard Philippson. New York: Routledge.

  61. Kornai, András. 1995. Formal phonology. New York: Garland.

  62. Kubozono, Haruo. 2012. Varieties of pitch accent systems in Japanese. Lingua 122: 1395–1414.

  63. Lai, Regine. 2015. Learnable versus unlearnable harmony patterns. Linguistic Inquiry 46: 425–451.

  64. Leben, W. R. 1973. Suprasegmental phonology. PhD diss., Massachusetts Institute of Technology.

  65. McCarthy, John J. 1985. Formal problems in Semitic phonology and morphology. New York: Garland.

  66. McMullin, Kevin, and Gunnar Ólafur Hansson. 2019. Inductive learning of locality relations in segmental phonology. Laboratory Phonology 10 (1): 14.

  67. McNaughton, Robert, and Seymour Papert. 1971. Counter-free automata. Cambridge: MIT Press.

  68. Miller, George Armitage. 1956. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review 63 (2): 81–97.

  69. Myers, Scott. 1987. Tone and the structure of words in Shona. PhD diss., University of Massachusetts, Amherst.

  70. Myers, Scott. 1997. OCP effects in Optimality Theory. Natural Language & Linguistic Theory 15 (4): 847–892.

  71. Odden, David. 1982. Tonal phenomena in Kishambaa. Studies in African Linguistics 13 (2): 177–208.

  72. Odden, David. 1984. Stem tone assignment in Shona. In Autosegmental studies in Bantu tone, eds. G. N. Clements and John A. Goldsmith, 255–280. Dordrecht: Foris Publications.

  73. Odden, David. 1986. On the role of the Obligatory Contour Principle in phonological theory. Language 62 (2): 353–383.

  74. Odden, David. 1994. Adjacency parameters in phonology. Language 70 (2): 289–330.

  75. Pater, Joe, and Anne-Michelle Tessier. 2003. Phonotactic knowledge and the acquisition of alternations. In 15th International congress of phonetic sciences, eds. Maria-Josep Solé, Daniel Recasens, and Joaquín Romero, 1177–1180. Barcelona: Universitat Autónoma de Barcelona.

  76. Pierrehumbert, Janet B., and Mary E. Beckman. 1988. Japanese tone structure. Cambridge: MIT Press.

  77. Pulleyblank, Douglas. 1986. Tone in lexical phonology. Dordrecht: D. Reidel.

  78. Rogers, James, and Geoffrey Pullum. 2011. Aural pattern recognition experiments and the subregular hierarchy. Journal of Logic, Language and Information 20: 329–342.

  79. Rogers, James, Jeffrey Heinz, Margaret Fero, Jeremy Hurst, Dakotah Lambert, and Sean Wibel. 2013. Cognitive and sub-regular complexity. In Formal grammar, eds. Glyn Morrill and Mark-Jan Nederhof. Lecture notes in computer science, 90–108. Dordrecht: Springer.

  80. Roundtree, S. Catherine. 1972. Saramaccan tone in relation to intonation and grammar. Lingua 29: 308–325.

  81. Shih, Stephanie S. 2016. Super additive similarity in Dioula tone harmony. In West Coast Conference on Formal Linguistics (WCCFL) 33, eds. Kyeong-min Kim, Pocholo Umbal, Trevor Block, Queenie Chan, Tanie Cheng, Kelli Finney, Mara Katz, Sophie Nickel-Thompson, and Lisa Shorten, 361–370.

  82. Shih, Stephanie, and Sharon Inkelas. 2019. Autosegmental aims in surface-optimizing phonology. Linguistic Inquiry 50 (1): 137–196.

  83. Tadadjeu, Maurice. 1974. Floating tones, shifting rules, and downstep in Dschang-Bamileke. In Papers from the 5th Annual Conference on African Linguistics (ACAL), ed. William R. Leben. Studies in African linguistics, supplement 5.

  84. Williams, Edwin S. 1976. Underlying tone in Margi and Igbo. Linguistic Inquiry 7 (3): 463–484.

  85. Yip, Moira. 1988. Template morphology and the direction of association. Natural Language and Linguistic Theory 6: 551–577.

  86. Yip, Moira. 1995. Tone in East Asian languages. In The handbook of phonological theory, 476–494. Oxford: Blackwell.

  87. Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press.

  88. Zoll, Cheryl. 2003. Optimal tone mapping. Linguistic Inquiry 34 (2): 225–268.


The author would like to thank Jane Chandlee, Jeffrey Heinz, Arto Anttila, and three anonymous reviewers for their thoughtful comments.

Author information



Corresponding author

Correspondence to Adam Jardine.


This appendix collects the formal details of the paper. Standard notation for set theory is used. Let Σ be a fixed, finite alphabet of symbols and \(\Sigma^{*}\) be the set of all strings over Σ, including λ, the empty string. For a symbol σ∈Σ, \(\sigma^{n}\) denotes the string resulting from n repetitions of σ. Let |w| indicate the length of a string w. For two strings \(w,v\in\Sigma^{*}\), let wv denote their concatenation (likewise, for \(w\in\Sigma^{*}\) and σ∈Σ, wσ denotes their concatenation). A stringset (or formal language) is a subset of \(\Sigma^{*}\); this corresponds to the notion of pattern discussed in Sect. 2.1. Let ⋊ and ⋉ be special boundary symbols not in Σ that mark the beginning and end of words, respectively; thus, ⋊w⋉ is the string w delimited by the boundary symbols. (These correspond to the # boundary used in phonology.)

A.1 Strictly local grammars and k-factors

A string u is a k-factor of w if |u| = k and \(w=v_{1}uv_{2}\) for some \(v_{1},v_{2}\in\Sigma^{*}\); that is, u is a substring of w of length k. The k-factors of w are given by the following function \(\texttt{fac}_{k}\):

$$\texttt{fac}_{k} (w)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}} \begin{cases} \{u~|~u\text{ is a }k\text{-factor of } \rtimes w \ltimes \} & \text{if }| \rtimes w \ltimes |>k \\ \{ \rtimes w \ltimes \} & \text{otherwise} \end{cases} $$

For instance, \(\texttt{fac}_{3}\)(LHLL) = {⋊LH, LHL, HLL, LL⋉}. We extend \(\texttt{fac}_{k} \) to stringsets in the natural way; i.e. for \(L\subseteq\Sigma^{*}\), \(\texttt{fac}_{k} (L)=\bigcup_{w\in L} \texttt{fac}_{k} (w)\).
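The definition of the k-factor function can be sketched in a few lines of Python. This is only an illustrative sketch: the names `fac` and `fac_set` and the use of the Unicode characters ⋊ and ⋉ as boundary markers are assumptions of the example, not part of the formal definition.

```python
LEFT, RIGHT = "⋊", "⋉"  # boundary symbols, assumed not to occur in the alphabet


def fac(w: str, k: int) -> set:
    """Return the k-factors of the boundary-marked string ⋊w⋉."""
    marked = LEFT + w + RIGHT
    if len(marked) <= k:
        # Too short to contain more than one k-factor: return the whole marked string.
        return {marked}
    return {marked[i:i + k] for i in range(len(marked) - k + 1)}


def fac_set(strings, k: int) -> set:
    """Extend fac to a finite stringset by taking the union over its members."""
    result = set()
    for w in strings:
        result |= fac(w, k)
    return result


print(sorted(fac("LHLL", 3)))
```

When the marked string has exactly length k, both branches of the definition return the same single factor, so the test `len(marked) <= k` agrees with the case distinction above.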

A strictly k-local (SL\(_{k}\)) grammar is a set \(S\subseteq \texttt{fac}_{k} (\Sigma^{*})\); that is, a subset of all of the possible k-factors that can appear in strings in \(\Sigma^{*}\). For example, for Σ = {L,H},

$$\texttt {fac}_{2} (\Sigma^{*})= \{ \rtimes \text{H} , \rtimes \text{L} , \text{H} \text{H} , \text{H} \text{L} , \text{L} \text{H} , \text{L} \text{L} , \text{H} \ltimes , \text{L} \ltimes \}. $$

Then, for example, \(S_{\mathrm {alt}} =\{ \rtimes \text{H} , \text{H} \text{H} , \text{L} \text{L} , \text{L} \ltimes \}\) is an SL\(_{2}\) grammar because ⋊H, HH, LL, and L⋉ are all 2-factors of strings in \(\Sigma^{*}\) (for example, they are all in \(\texttt{fac}_{2}\)(HHLL)).

Then, for an SL\(_{k}\) grammar S, the stringset described by S, written L(S), is the set of strings that contain no k-factors in S; that is,

$$L(S)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}} \{w\in\Sigma^{*}~|~ \texttt{fac}_{k} (w)\cap S=\emptyset\} $$

For example,

$$L( S_{\mathrm {alt}} )= \{ \text{L} \text{H} , \text{L} \text{H} \text{L} \text{H} , \text{L} \text{H} \text{L} \text{H} \text{L} \text{H} ,...\}, $$

that is, the set of strings of alternating Hs and Ls, as this is exactly the set of strings that contain none of the 2-factors in \(S_{\mathrm {alt}}\).

A stringset L is thus strictly k-local iff L = L(S) for some SL\(_{k}\) grammar S. We say a stringset is strictly local if it is strictly k-local for some k.
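Membership in L(S) can be checked directly from the definition: a string is accepted when none of its k-factors is in the grammar. The following is a minimal sketch under the same assumptions as before (the function names and the Unicode boundary markers are illustrative choices).

```python
LEFT, RIGHT = "⋊", "⋉"  # boundary symbols, assumed not to occur in the alphabet


def fac(w: str, k: int) -> set:
    """Return the k-factors of the boundary-marked string ⋊w⋉."""
    marked = LEFT + w + RIGHT
    if len(marked) <= k:
        return {marked}
    return {marked[i:i + k] for i in range(len(marked) - k + 1)}


def in_sl_language(w: str, grammar: set, k: int) -> bool:
    """True iff w contains none of the forbidden k-factors in the grammar."""
    return fac(w, k).isdisjoint(grammar)


# The grammar S_alt from the text: forbidden 2-factors.
S_ALT = {"⋊H", "HH", "LL", "L⋉"}

for word in ["LH", "LHLH", "HL", "LLH", "LHL"]:
    print(word, in_sl_language(word, S_ALT, 2))
```

Here LH and LHLH are accepted, while HL (initial H), LLH (an LL sequence), and LHL (final L) are each rejected by one forbidden 2-factor.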

The learning procedure for the class of strictly k-local stringsets amounts to the function \(\mathtt {SLlearn}_{k}\) defined as follows. For a finite set \(D\subset\Sigma^{*}\),

$$\mathtt {SLlearn}_{k} (D)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}} \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (D) $$

That is, \(\mathtt {SLlearn}_{k}\)(D) returns the set of possible k-factors minus the set of k-factors observed in D. This means that \(\mathtt {SLlearn}_{k}\) returns a strictly k-local grammar consisting of all of the k-factors not observed in D. It should be noted that this is a ‘batch’ conception of the learner, as opposed to the sequential learner presented in the main text. They are equivalent, however. The sequential version of the learner takes some finite sequence of data points \(d_{1}\), \(d_{2}\), \(d_{3}\), ..., \(d_{n}\) and returns, at each data point \(d_{i}\), \(\mathtt {SLlearn}_{k} (\{d_{1},d_{2},d_{3},...,d_{i}\})\).
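The batch and sequential learners described above can be sketched as follows. This is an illustrative sketch, not the author's implementation: the helper `all_factors` (which enumerates \(\texttt{fac}_{k}(\Sigma^{*})\) from all strings of length at most k) and all function names are assumptions of the example. One consequence of following the definition of \(\texttt{fac}_{k}\) literally is that the factor ⋊⋉ of the empty string is also counted among the possible factors.

```python
from itertools import product

LEFT, RIGHT = "⋊", "⋉"  # boundary symbols, assumed not to occur in the alphabet


def fac(w, k):
    marked = LEFT + w + RIGHT
    if len(marked) <= k:
        return {marked}
    return {marked[i:i + k] for i in range(len(marked) - k + 1)}


def fac_set(strings, k):
    return set().union(*(fac(w, k) for w in strings))


def all_factors(alphabet, k):
    """All possible k-factors; strings of length <= k suffice to produce them."""
    words = ("".join(p) for n in range(k + 1) for p in product(alphabet, repeat=n))
    return fac_set(words, k)


def sl_learn(data, alphabet, k):
    """Batch learner: every possible k-factor not observed in the data."""
    return all_factors(alphabet, k) - fac_set(data, k)


def sl_learn_sequential(data, alphabet, k):
    """Sequential learner: yield the batch result after each new data point."""
    seen = []
    for d in data:
        seen.append(d)
        yield sl_learn(seen, alphabet, k)


learned = sl_learn(["LH", "LHLH"], "LH", 2)
print(sorted(learned))
```

On the sample {LH, LHLH} the learner returns the four factors of \(S_{\mathrm {alt}}\) together with ⋊⋉, since the empty string was not observed either.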

The following theorem asserts the correctness of \(\mathtt {SLlearn}_{k}\).

Theorem 1

For a target strictly k-local stringset L and a sample D of L such that \(\texttt{fac}_{k} (D)= \texttt{fac}_{k} (L)\), \(\mathtt {SLlearn}_{k} (D)\) returns a strictly k-local grammar S such that L(S) = L.


Proof. We show first that w∈L implies w∈L(S) and then that w∈L(S) implies w∈L. Since \(\texttt{fac}_{k} (D)= \texttt{fac}_{k} (L)\), then \(\texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (D)= \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (L)\). Since \(S= \mathtt {SLlearn}_{k} (D){=} \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (D)\), then \(S= \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (L)\). Thus for every w∈L, \(\texttt{fac}_{k} (w)\cap S=\emptyset\), so w∈L(S).

Because L is a strictly k-local set, there is some strictly k-local grammar \(S^{\prime}\) such that \(L(S^{\prime})=L\). Note that for any string w, if \(\texttt{fac}_{k} (w)\subseteq \texttt{fac}_{k} (L)\), then \(\texttt{fac}_{k} (w)\cap S^{\prime}=\emptyset\), and so w∈L. Because \(S= \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (L)\), for w∈L(S) we have \(\texttt{fac}_{k} (w)\cap( \texttt{fac}_{k} (\Sigma^{*})- \texttt{fac}_{k} (L))=\emptyset\) and so \(\texttt{fac}_{k} (w)\subseteq \texttt{fac}_{k} (L)\). Thus w∈L(S) implies that w∈L. □

A.2 Melody-local grammars and their learning

Having defined strictly local stringsets and their learning, we can now define melody-local stringsets.

First, we define the mldy function recursively as follows. For \(w\in\Sigma^{*}\),

$$\mathtt {mldy} (w) \mathrel{\stackrel{\mbox{\footnotesize def}}{=}} \begin{cases} \lambda & \text{if }w=\lambda, \\ \mathtt {mldy} (v)\sigma & \text{if }w=v\sigma^{n}\text{ for some }n\geq 1\text{ and }v\neq u\sigma \text{ for any } u\in\Sigma^{*} \end{cases} $$

That is, mldy(w) returns λ if w = λ; otherwise it returns mldy(v)σ, where σ is the final symbol of w and v is what remains of w after its maximal final run of σ is removed (so v does not end in σ). For example,

$$\mathtt {mldy} (\text{HHLLLH}) \textstyle\begin{array}[t]{ll} = & \mathtt {mldy} (\text{HHLLL})\text{H} \\ = & \mathtt {mldy} (\text{HH})\text{LH} \\ = & \mathtt {mldy} (\lambda)\text{HLH} \\ = & \lambda\text{HLH}=\text{HLH} \end{array} $$

For a stringset \(L\subseteq\Sigma^{*}\) let \(\mathtt {mldy} (L)=\{ \mathtt {mldy} (w)~|~w\in L\}\).
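The recursive definition of mldy translates almost directly into code. The sketch below is illustrative only; the function names are assumptions of the example, and strings are represented as ordinary Python strings over the symbols H and L.

```python
def mldy(w: str) -> str:
    """Collapse each maximal run of identical symbols into one symbol."""
    if w == "":
        return ""
    sigma = w[-1]
    # v is w with its maximal final run of sigma removed, so v does not end in sigma.
    v = w.rstrip(sigma)
    return mldy(v) + sigma


def mldy_set(strings) -> set:
    """Extend mldy to a finite stringset."""
    return {mldy(w) for w in strings}


print(mldy("HHLLLH"))
print(sorted(mldy_set(["HHLLLH", "HLH", "HLLLL"])))
```

The recursion depth equals the number of tone spans in the string, so the function performs one step per symbol of the resulting melody.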

A melody strictly k-local grammar M is thus, like a strictly k-local grammar, a subset of the possible k-factors over Σ. That is, \(M\subseteq \texttt{fac}_{k} (\Sigma^{*})\). The difference is that we interpret a melody strictly k-local grammar using the mldy function. The stringset described by M is as follows:

$$L(M)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}} \{w\in\Sigma^{*}~|~ \texttt{fac}_{k} ( \mathtt {mldy} (w))\cap M=\emptyset\} $$

Thus, for example, if k = 3 and M = {HLH}, then HHLLLH∉L(M), because mldy(HHLLLH)=HLH and \(\texttt{fac}_{k} (\text{HLH})\cap M=\{\text{HLH}\}\). However, HLLLL∈L(M), because mldy(HLLLL)=HL and \(\texttt{fac}_{k} (\text{HL})\cap M=\emptyset\).

We can then define a k,j-melody-local grammar G as a tuple G = (S,M) where S is a strictly k-local grammar and M is a melody strictly j-local grammar. The stringset described by G is thus

$$L(G)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}}L(S)\cap L(M), $$

that is, the set of strings that satisfy both S and M. We say a stringset is melody-local if it is k,j-melody-local for some k and j.
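Membership in L(M) and L(G) can be sketched by combining the factor and melody functions. The example grammar below pairs \(S_{\mathrm {alt}}\) (with k = 2) with M = {HLH} (with j = 3); this pairing is a hypothetical illustration chosen for the sketch, not a grammar from the main text, and the function names are likewise assumptions.

```python
LEFT, RIGHT = "⋊", "⋉"  # boundary symbols, assumed not to occur in the alphabet


def fac(w, k):
    marked = LEFT + w + RIGHT
    if len(marked) <= k:
        return {marked}
    return {marked[i:i + k] for i in range(len(marked) - k + 1)}


def mldy(w):
    out = []
    for sym in w:
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)


def in_melody_sl_language(w, M, j):
    """w is in L(M) iff the melody of w has no forbidden j-factor."""
    return fac(mldy(w), j).isdisjoint(M)


def in_ml_language(w, S, k, M, j):
    """w is in L(G) = L(S) ∩ L(M) for G = (S, M)."""
    return fac(w, k).isdisjoint(S) and in_melody_sl_language(w, M, j)


M = {"HLH"}
print(in_melody_sl_language("HHLLLH", M, 3))  # melody HLH is forbidden
print(in_melody_sl_language("HLLLL", M, 3))   # melody HL is fine

S_ALT = {"⋊H", "HH", "LL", "L⋉"}
print(in_ml_language("LH", S_ALT, 2, M, 3))
print(in_ml_language("LHLH", S_ALT, 2, M, 3))
```

In the combined grammar, LHLH satisfies the local constraints but is excluded by the melody constraint, which illustrates that the two components are evaluated independently and then intersected.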

Learning melody-local stringsets is a straightforward extension of learning strictly local stringsets. If we fix k and j, we can define a learning function that takes an input D and outputs the following result:

$$\mathtt {MLlearn}_{k,j} (D)\mathrel{\stackrel{\mbox{\footnotesize def}}{=}}\big( \mathtt {SLlearn}_{k} (D), \mathtt {SLlearn}_{j} ( \mathtt {mldy} (D))\big) $$

That is, \(\mathtt {MLlearn}_{k,j} (D)\) returns a pair whose first member is a strictly k-local grammar obtained by running strictly k-local learning on D, and whose second member is a melody strictly j-local grammar obtained by running strictly j-local learning on mldy(D). The following theorem asserts the correctness of \(\mathtt {MLlearn}_{k,j}\).

Theorem 2

For a target k,j-melody-local stringset L and a sample D of L such that \(\texttt{fac}_{k} (D)= \texttt{fac}_{k} (L)\) and \(\texttt {fac}_{j} ( \mathtt {mldy} (D))= \texttt {fac}_{j} ( \mathtt {mldy} (L))\), \(\mathtt {MLlearn}_{k,j} (D)\) returns a k,j-melody-local grammar G such that L(G) = L.


Proof. Almost immediate from Theorem 1. If L is k,j-melody-local, then there is some k,j-melody-local grammar \(G^{\prime}=(S^{\prime},M^{\prime})\) such that \(L(G^{\prime})=L\). Let G = (S,M) be the grammar returned by \(\mathtt {MLlearn}_{k,j} (D)\). Because \(\texttt{fac}_{k} (D)= \texttt{fac}_{k} (L)\) and \(\texttt {fac}_{j} ( \mathtt {mldy} (D))= \texttt {fac}_{j} ( \mathtt {mldy} (L))\), from Theorem 1 we know that \(L(S)=L(S^{\prime})\) and \(L(M)=L(M^{\prime})\). Thus \(L(G)=L(G^{\prime})=L\). □
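The learner \(\mathtt {MLlearn}_{k,j}\) can be sketched by running the strictly local learner twice, once on the data and once on the melodies of the data. The sample below is a hypothetical set of strings that each contain at least one H; it is an assumption of this example and not data from the main text, as are all function names.

```python
from itertools import product

LEFT, RIGHT = "⋊", "⋉"  # boundary symbols, assumed not to occur in the alphabet


def fac(w, k):
    marked = LEFT + w + RIGHT
    if len(marked) <= k:
        return {marked}
    return {marked[i:i + k] for i in range(len(marked) - k + 1)}


def fac_set(strings, k):
    return set().union(*(fac(w, k) for w in strings))


def sl_learn(data, alphabet, k):
    """All possible k-factors (from strings of length <= k) minus those observed."""
    words = ("".join(p) for n in range(k + 1) for p in product(alphabet, repeat=n))
    return fac_set(words, k) - fac_set(data, k)


def mldy(w):
    out = []
    for sym in w:
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)


def ml_learn(data, alphabet, k, j):
    """Return (S, M): S learned from the strings, M from their melodies."""
    S = sl_learn(data, alphabet, k)
    M = sl_learn([mldy(w) for w in data], alphabet, j)
    return S, M


def accepts(w, S, M, k, j):
    return fac(w, k).isdisjoint(S) and fac(mldy(w), j).isdisjoint(M)


# Hypothetical sample: every string contains at least one H.
SAMPLE = ["H", "HL", "LH", "LHL", "HLH", "HH", "LLHLL"]
S, M = ml_learn(SAMPLE, "LH", 2, 3)
print(accepts("LLLHLLL", S, M, 2, 3))  # True: unseen string with an H
print(accepts("LLLL", S, M, 2, 3))     # False: the melody L was never observed
```

From this small sample the melody grammar M contains ⋊L⋉, so any all-L string is rejected regardless of its length, while the local grammar S places no substantive restriction on the string of TBUs.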

A.3 Abstract characterization

We can posit an abstract characterization for melody-local patterns independent of a particular grammar formalism to describe them. This allows us to prove whether or not a pattern is melody-local. We base this on the abstract characterization of strictly local stringsets. Strictly local stringsets can be characterized by the property of suffix substitution closure (Rogers and Pullum 2011; Rogers et al. 2013), which can be used to prove that a pattern is not strictly local.

Theorem 3

(Suffix substitution closure, Rogers and Pullum 2011)

A stringset L is SL\(_{k}\) iff for any string x of length k − 1 and any strings \(u_{1}\), \(u_{2}\), \(w_{1}\), and \(w_{2}\),

$$\text{if }u_{1}xu_{2}\in L \textit{ and }w_{1}xw_{2}\in L\textit{, then }u_{1}xw_{2} \in L $$

This means that, for any \(u_{1}xu_{2}\in L\) and any \(w_{1}xw_{2}\in L\), as long as x is of length k − 1, we can freely replace \(u_{2}\) with \(w_{2}\) and be guaranteed to produce another string in L. For example, for the stringset \(L_{\mathrm {KJ}}\) (penultimate or final H tone) from the main text, we can set x to be LH (because k = 3, x must be of length 2), and \(u_{1}\), \(u_{2}\), \(w_{1}\), and \(w_{2}\) as in (71).

Thus, \(u_{1}xu_{2}\) is LLLLLH (that is, \(u_{1}\) = LLLL and \(u_{2}\) = λ), which is a member of \(L_{\mathrm {KJ}}\), and \(w_{1}xw_{2}\) is LLHL (that is, \(w_{1}\) = L and \(w_{2}\) = L), which is also a member of \(L_{\mathrm {KJ}}\). If we substitute \(w_{2}\) for \(u_{2}\) in the former, then we obtain a new string \(u_{1}xw_{2}=\text{LLLLLHL}\), which is also in \(L_{\mathrm {KJ}}\). We can do this for any x of length 2. Another example is given below in (72) for x = LL.


To show that a stringset is not strictly local, we show that suffix substitution closure fails for some x no matter the size of k. Recall the stringset \(L_{\mathrm {Ch}}\) (at least one H) from the main text.


If, as in \(L_{\mathrm {KJ}}\), we set k = 3 and choose the string LL, then \(L_{\mathrm {Ch}}\) fails suffix substitution closure for x = LL and \(u_{1}\), \(u_{2}\), \(w_{1}\), \(w_{2}\) chosen as shown in (74).


Because \(u_{1}xw_{2}=\text{LLLL}\) is not a member of \(L_{\mathrm {Ch}}\), \(L_{\mathrm {Ch}}\) is not strictly 3-local. Furthermore, there is no k for which \(L_{\mathrm {Ch}}\) is strictly k-local, because we can simply replace x with \(\text{L} ^{k-1}\) (k − 1 repetitions of L).


This shows that, no matter the value of k, suffix substitution in this case will produce a string LL\(^{k-1}\)L, which is not a member of \(L_{\mathrm {Ch}}\). Thus, \(L_{\mathrm {Ch}}\) fails suffix substitution closure for any k. This is a formal version of the intuitive ‘scanning’ proof given in Sect. 2.3, (13).
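Suffix substitution closure can be probed by brute force over all strings up to a bounded length. The sketch below is an illustration only: a bounded search can find a counterexample but cannot prove closure, and the membership tests for \(L_{\mathrm {KJ}}\) (read here as strings of Ls with a single H in final or penultimate position) and \(L_{\mathrm {Ch}}\) (at least one H) are assumed readings of the informal glosses in the text.

```python
import re
from itertools import product


def ssc_counterexample(member, alphabet, k, max_len):
    """Search for (u1, x, w2) with u1·x·u2 and w1·x·w2 in L but u1·x·w2 not in L."""
    prefixes, suffixes = {}, {}
    for n in range(max_len + 1):
        for p in product(alphabet, repeat=n):
            w = "".join(p)
            if not member(w):
                continue
            # Record every way of splitting w around a substring x of length k - 1.
            for i in range(len(w) - (k - 1) + 1):
                x = w[i:i + k - 1]
                prefixes.setdefault(x, set()).add(w[:i])
                suffixes.setdefault(x, set()).add(w[i + k - 1:])
    for x, firsts in prefixes.items():
        for u1 in firsts:
            for w2 in suffixes[x]:
                if not member(u1 + x + w2):
                    return (u1, x, w2)
    return None


def in_l_kj(w):
    # Assumed reading of L_KJ: one H, in final or penultimate position.
    return re.fullmatch(r"L*HL?", w) is not None


def in_l_ch(w):
    # Assumed reading of L_Ch: at least one H.
    return "H" in w


print(ssc_counterexample(in_l_kj, "LH", 3, 6))
print(ssc_counterexample(in_l_ch, "LH", 3, 4))
```

For \(L_{\mathrm {KJ}}\) the search finds no violation at k = 3, while for \(L_{\mathrm {Ch}}\) it returns a triple whose concatenation consists only of Ls, matching the argument above.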

From the suffix substitution closure characterization of strictly local stringsets, we can posit melody-dependent suffix substitution closure as the abstract characterization of melody-local stringsets.

Theorem 4

(Melody-dependent suffix substitution closure (MSSC))

A stringset L is melody-local iff, for some k and some j,

  a. mldy(L) is strictly j-local and

  b. for any strings \(w_{1},w_{2},u_{1},u_{2}\) and for any string x, |x| = k − 1,

    $$w_{1}xw_{2}\in L\text{ and }u_{1}xu_{2}\in L\textit{ and } \mathtt {mldy} (w_{1}xu_{2}) \in \mathtt {mldy} (L)\textit{ implies }w_{1}xu_{2}\in L $$


Proof. Recall that a stringset is melody-local iff it is describable by some melody-local grammar G = (S,M). Thm. 4a follows directly from the definition of L(M). Thm. 4b follows from suffix substitution closure for L(S) plus the additional requirement that L(G)=L(S)∩L(M). □

Melody-dependent suffix substitution closure adds two conditions to suffix substitution closure. First, Thm. 4a states that mldy(L) (the stringset consisting of the melodies of all strings in L) must be strictly j-local. Second, Thm. 4b adds to the antecedent of the suffix substitution closure implication that \(\mathtt {mldy} (w_{1}xu_{2})\) must be in mldy(L). As an example, take \(L_{\mathrm {Ch}}\). First, note that \(\mathtt {mldy} ( L_{\mathrm {Ch}} )\) (given below in (76)) is strictly 3-local, as witnessed by the melody strictly 3-local grammar \(M_{\mathrm {Ch}} =\{ \rtimes \text{L} \ltimes \}\) (i.e., the melody set does not contain the string L).


It is also then true that \(L_{\mathrm {Ch}}\) satisfies Thm. 4 for k = j = 3. While \(L_{\mathrm {Ch}}\) fails the implication in (74) for suffix substitution closure, this implication holds for melody-dependent suffix substitution closure, because mldy(LLLL)=L is not a member of \(\mathtt {mldy} ( L_{\mathrm {Ch}} )\), and so it does not matter that \(\text{LLLL}\not\in L_{\mathrm {Ch}} \).


It is thus the case that \(L_{\mathrm {Ch}}\) satisfies melody-dependent suffix substitution closure.
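The second condition of melody-dependent suffix substitution closure can be probed with the same kind of bounded search, now ignoring substitutions whose melody falls outside mldy(L). This is a sketch under stated assumptions: the search covers only strings up to a fixed length, the first condition (that mldy(L) is strictly j-local) is not checked here, and membership in \(L_{\mathrm {Ch}}\) and in its melody set is taken to mean "contains at least one H".

```python
from itertools import product


def mldy(w):
    out = []
    for sym in w:
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)


def mssc_counterexample(member, melody_member, alphabet, k, max_len):
    """Search for (w1, x, u2) violating the implication in the second condition."""
    prefixes, suffixes = {}, {}
    for n in range(max_len + 1):
        for p in product(alphabet, repeat=n):
            w = "".join(p)
            if not member(w):
                continue
            for i in range(len(w) - (k - 1) + 1):
                x = w[i:i + k - 1]
                prefixes.setdefault(x, set()).add(w[:i])
                suffixes.setdefault(x, set()).add(w[i + k - 1:])
    for x, firsts in prefixes.items():
        for w1 in firsts:
            for u2 in suffixes[x]:
                candidate = w1 + x + u2
                # Only substitutions with a licit melody count as violations.
                if melody_member(mldy(candidate)) and not member(candidate):
                    return (w1, x, u2)
    return None


def has_h(s):
    # Assumed membership test for both L_Ch and mldy(L_Ch).
    return "H" in s


print(mssc_counterexample(has_h, has_h, "LH", 3, 6))
print(mssc_counterexample(has_h, lambda m: True, "LH", 3, 6))
```

With the melody condition in place the search finds no violation; dropping it (the second call, which accepts every melody) reduces the test to plain suffix substitution closure and a violation reappears.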

To give an example that does not, recall the ‘no consecutive spreading Hs’ pattern discussed in Sect. 5. More explicitly, this is the set \(L_{\mathrm {No2H}}\) as follows.


That is, \(L_{\mathrm {No2H}}\) is exactly the set not containing any strings like *HHLLHH, or *HHHLHH, or *HHLLLHH, where H spans separated by exactly one L-span are both of more than one TBU.

There are no constraints on the melody in this pattern; thus \(\mathtt {mldy} ( L_{\mathrm {No2H}} )\) is the full set of alternating strings of Hs and Ls.


We can show that this fails melody-dependent suffix substitution closure using example strings based on the ARs in (63) from the main text.


In this case, \(u_{1}xw_{2}=\text{HHHHL}^{k-1}\text{HHHH}\), in which two consecutive H spans have spread more than two TBUs (as in (63) in the main text). This satisfies the melody constraint (because, e.g., HHHHLH\(\in L_{\mathrm {No2H}} \) and so HLH\(\in \mathtt {mldy} ( L_{\mathrm {No2H}} )\)), but it is not in \(L_{\mathrm {No2H}}\), so it fails the implication, for any k. Thus, the ‘no consecutive spreading Hs’ pattern \(L_{\mathrm {No2H}}\) is not melody-local.
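The failure for \(L_{\mathrm {No2H}}\) can be checked by constructing witness strings for a range of values of k. The sketch rests on several assumptions: membership in \(L_{\mathrm {No2H}}\) is approximated by the absence of the substring pattern HH, one or more Ls, HH (matching the starred forms listed above), and the choices \(u_{2}\) = H and \(w_{1}\) = H are hypothetical completions picked so that both premise strings are in the set.

```python
import re


def mldy(w):
    out = []
    for sym in w:
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)


def in_no2h(w):
    # Assumed reading: no two multi-TBU H spans separated by a single L span.
    return re.search(r"HHL+HH", w) is None


def witness(k):
    """Return (u1·x·u2, w1·x·w2, u1·x·w2) for x = L^(k-1)."""
    x = "L" * (k - 1)
    premise_a = "HHHH" + x + "H"     # u1 = HHHH, u2 = H (assumed)
    premise_b = "H" + x + "HHHH"     # w1 = H (assumed), w2 = HHHH
    substituted = "HHHH" + x + "HHHH"
    return premise_a, premise_b, substituted


for k in range(2, 8):
    a, b, sub = witness(k)
    premises_hold = in_no2h(a) and in_no2h(b)
    melody_licit = mldy(sub) == mldy("HHHHLH")  # HLH, the melody of a member
    print(k, premises_hold, melody_licit, in_no2h(sub))
```

For every k tried, both premises hold and the melody HLH is licit, yet the substituted string is excluded, which is the pattern of failure described above.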

Jardine, A. Melody learning and long-distance phonotactics in tone. Nat Lang Linguist Theory 38, 1145–1195 (2020).


  • Tone
  • Learnability
  • Computational phonology
  • Representation