Biological Cybernetics, Volume 109, Issue 6, pp 611–626

Optimal speech motor control and token-to-token variability: a Bayesian modeling approach

  • Jean-François Patri
  • Julien Diard
  • Pascal Perrier
Original Article


The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which makes it possible to produce similar acoustic properties with different control strategies. To explain how the central nervous system selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of this approach, one of its drawbacks is the intrinsic contradiction between the concept of optimality and the intra-speaker token-to-token variability observed experimentally. The present paper proposes an alternative approach that formulates feedforward optimal control in a probabilistic Bayesian modeling framework. The approach is illustrated by controlling a biomechanical model of the vocal tract for speech production and by comparing it with an existing optimal control model (GEPPETO). The essential elements of this optimal control model are presented first, and the Bayesian model is then constructed from them step by step. The performance of the Bayesian model is evaluated with computer simulations and compared with that of the optimal control model. The Bayesian approach is shown to be appropriate for solving the speech planning problem while accounting for variability in a principled way.
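The contrast drawn above between a single optimal solution and token-to-token variability can be illustrated with a minimal sketch. It is not GEPPETO and not the biomechanical vocal tract model: the two-command system, the additive `forward` map, the target value, and the widths `sigma_s` and `sigma_m` are all hypothetical choices made for illustration. The sketch assumes that a quadratic cost can be read as a negative log-probability, so that minimizing the cost gives one command pair, while sampling from the corresponding distribution gives different command pairs that all land near the same acoustic target.

```python
import numpy as np


def forward(m1, m2):
    """Hypothetical many-to-one map from two motor commands to one acoustic value."""
    return m1 + m2


target = 1.0   # desired acoustic value (arbitrary units, assumption)
sigma_s = 0.1  # assumed tolerance around the acoustic target
sigma_m = 1.0  # assumed width of the prior over motor commands

# Discretize the two-dimensional command space.
grid = np.linspace(-1.0, 2.0, 121)
M1, M2 = np.meshgrid(grid, grid, indexing="ij")

# Cost-function view: acoustic error plus an effort-like penalty on the commands.
cost = ((forward(M1, M2) - target) ** 2 / (2 * sigma_s**2)
        + (M1**2 + M2**2) / (2 * sigma_m**2))
i_opt = np.unravel_index(np.argmin(cost), cost.shape)
m_opt = np.array([M1[i_opt], M2[i_opt]])  # one deterministic solution

# Bayesian view: P(m | target) is proportional to P(m) * P(target | m),
# which here equals exp(-cost) up to normalization.
posterior = np.exp(-cost)
posterior /= posterior.sum()

# Drawing commands from the posterior yields a different command pair per token.
rng = np.random.default_rng(0)
flat_idx = rng.choice(posterior.size, size=200, p=posterior.ravel())
samples = np.column_stack([M1.ravel()[flat_idx], M2.ravel()[flat_idx]])
produced = forward(samples[:, 0], samples[:, 1])

print("cost-minimizing commands:", m_opt)
print("std of sampled m1:", samples[:, 0].std())
print("mean acoustic error of samples:", np.abs(produced - target).mean())
```

In this toy setting the most probable command pair coincides with the cost minimum, while the sampled commands spread widely along the direction that leaves the acoustic output unchanged.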


Speech motor control, Speech sequence motor planning, Bayesian modeling, Optimal motor control



The authors wish to thank Pierre Bessière and Jean-Luc Schwartz for guidance and inspiring conversations.

Supplementary material

Supplementary material 1: 422_2015_664_MOESM1_ESM.pdf (PDF, 56 KB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Jean-François Patri (1, 2), corresponding author
  • Julien Diard (3, 4)
  • Pascal Perrier (1, 2)

  1. GIPSA-Lab, Université Grenoble Alpes, Grenoble, France
  2. GIPSA-Lab, CNRS, Grenoble, France
  3. LPNC, Université Grenoble Alpes, Grenoble, France
  4. LPNC, CNRS, Grenoble, France
