Psychological Research, Volume 81, Issue 5, pp 990–1003

Familiar units prevail over statistical cues in word segmentation

  • Bénédicte Poulin-Charronnat
  • Pierre Perruchet
  • Barbara Tillmann
  • Ronald Peereman
Original Article


In language acquisition research, the prevailing position is that listeners exploit statistical cues, in particular transitional probabilities between syllables, to discover words of a language. However, other cues are also involved in word discovery. Assessing the weight learners give to these different cues leads to a better understanding of the processes underlying speech segmentation. The present study evaluated whether adult learners preferentially used known units or statistical cues for segmenting continuous speech. Before the exposure phase, participants were familiarized with part-words of a three-word artificial language. This design allowed the dissociation of the influence of statistical cues and familiar units, with statistical cues favoring word segmentation and familiar units favoring (nonoptimal) part-word segmentation. In Experiment 1, performance in a two-alternative forced choice (2AFC) task between words and part-words revealed part-word segmentation (even though part-words were less cohesive in terms of transitional probabilities and less frequent than words). By contrast, an unfamiliarized group exhibited word segmentation, as usually observed in standard conditions. Experiment 2 used a syllable-detection task to remove the likely contamination of performance by memory and strategy effects in the 2AFC task. Overall, the results suggest that familiar units overrode statistical cues, ultimately questioning the need for computation mechanisms of transitional probabilities (TPs) in natural language speech segmentation.
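To make the notion of transitional probability concrete, the following is a minimal sketch of how forward TPs between syllables can be computed from a continuous stream. The three trisyllabic words, the stream length, and the no-immediate-repetition rule are hypothetical illustration choices, not the materials or procedure of the study.

```python
import random
from collections import Counter


def transitional_probabilities(syllables):
    """Forward TP(a -> b) = count(a immediately followed by b) / count(a).

    Only syllables that have a successor are counted in the denominator.
    """
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


# Hypothetical three-word artificial language (NOT the study's stimuli).
words = [("pa", "bi", "ku"), ("ti", "bu", "do"), ("go", "la", "tu")]

# Assumption: words are concatenated at random with no immediate repetition.
rng = random.Random(0)
stream = []
previous = None
for _ in range(300):
    word = rng.choice([w for w in words if w != previous])
    stream.extend(word)
    previous = word

tps = transitional_probabilities(stream)
within = tps[("pa", "bi")]  # transition inside a word
across = tps[("ku", "ti")]  # transition spanning a word boundary

print(f"within-word TP: {within:.2f}, boundary TP: {across:.2f}")
```

In this sketch the within-word transition has a TP of 1.0 by construction, whereas the transition across a word boundary is lower because each word can be followed by either of the other two. A part-word straddles such a boundary, which is why it is less cohesive than a word in terms of TPs.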


Keywords: Artificial language; Word segmentation; Exposure phase; Speech stream; 2AFC task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



The authors are grateful to Pascal Morgan and Cédric Foucault for help with collecting the data.



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Université Bourgogne Franche-Comté, LEAD-CNRS UMR5022, Dijon, France
  2. CNRS UMR5292, INSERM U1028, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics Team, University of Lyon I, Lyon, France
  3. Univ. Grenoble Alpes, CNRS UMR5105, LPNC, Grenoble, France
