Abstract
One of the first tasks in language acquisition is word segmentation, the process of extracting word forms from continuous speech streams. Statistical approaches to word segmentation, in which word boundaries are inferred from sequence statistics, have been shown to be powerful. This approach requires the learner to represent the frequency of units in syllable sequences, though accounts differ on how much statistical exposure is required. In this study, we examined the computational limit at which words can be extracted from continuous sequences. First, we discussed why two occurrences of a word in a continuous sequence constitute the computational lower limit for that word to be statistically defined. Next, we created short syllable sequences that contained certain words either two or four times. Learners were presented with these syllable sequences one at a time, each immediately followed by a test of the novel words from that sequence. We found that, with the computationally minimal amount of two exposures, words were successfully segmented from continuous sequences. Moreover, longer syllable sequences providing four exposures to words generated more robust learning. We discuss the implications of these results for how learners segment and store word candidates from continuous sequences.
Notes
The computation of TPs makes use of the absolute frequencies of the different units: p(b|a) = count(ab) / count(a).
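As an illustrative sketch (not the authors' code), the forward TP defined in this note can be computed from bigram and unigram counts; the toy stream below uses hypothetical tone-marked syllables in the style of the paper's materials:

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward transitional probability: p(b|a) = count(ab) / count(a)."""
    unigrams = Counter(syllables)
    bigrams = Counter(zip(syllables, syllables[1:]))
    return {(a, b): n / unigrams[a] for (a, b), n in bigrams.items()}

# A toy continuous stream in which the word "xian2 mo2" occurs twice.
stream = "xian2 mo2 fu1 gao1 xian2 mo2 ti2 lu4".split()
tps = transitional_probabilities(stream)
print(tps[("xian2", "mo2")])  # 1.0: mo2 always follows xian2
```

Within-word TPs approach 1.0 in such streams, whereas TPs spanning a word boundary are lower, which is the statistic a segmenting learner can exploit.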
Incidentally, both of these studies report null results: learning did not change when the exposure period was lengthened.
Many statistical word-segmentation studies use fade-in or fade-out to avoid providing word boundary information to learners, as sequence boundaries can potentially be regarded as cues for word boundaries. We examined this issue empirically in our analysis; see the Results section.
This is another way in which our design differed from the Batterink (2017) design. In Batterink (2017), all the words in syllable streams have equal frequency (a characteristic from Saffran et al., 1996), which makes it difficult to disentangle transitional probability and frequency as the source of the learning effect: these factors can only be disentangled if the words in the sequence have a lower frequency than part-words, as Aslin et al. (1998) explained. We designed all of our sequences with an unbalanced frequency design from Aslin et al. (1998).
In the example sequence in Fig. 1, a counterbalancing sequence is constructed by using [8,1] and [2,7] as low-frequency words, and [3,4] and [5,6] as part-words, which means that the four words making up this counterbalancing sequence are [4,5], [6,3], [8,1], and [2,7], with frequencies of 4, 4, 2, and 2, respectively.
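As a hypothetical sketch (the actual sequences are part of the paper's materials), the unbalanced-frequency property of such a design can be checked by counting word tokens; the token list below is invented for illustration, using the note's syllable-index notation:

```python
from collections import Counter

# Hypothetical ordering of word tokens in a counterbalancing sequence.
# High-frequency words ([4,5], [6,3]) should occur four times;
# low-frequency words ([8,1], [2,7]) should occur twice.
tokens = [(4, 5), (6, 3), (8, 1), (4, 5), (2, 7), (6, 3),
          (4, 5), (8, 1), (6, 3), (2, 7), (4, 5), (6, 3)]
freq = Counter(tokens)
print(freq[(4, 5)], freq[(6, 3)], freq[(8, 1)], freq[(2, 7)])  # 4 4 2 2
```

The point of the unbalanced design is that low-frequency words match part-words in frequency, so above-chance word preference cannot be attributed to frequency alone.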
For all the syllables used in the study, we use a pinyin representation in which each syllable is followed by a number from 1 through 4 indicating its tone. In this example, both syllables, xian and mo, are in the second tone.
In Stata syntax, the equation here is: mixed key i.rc || langrecode: || subject:, where rc codes for word/part-word given the current sequence and counterbalancing condition.
In Stata syntax, the equation here is: mixed key i.rc##i.times || langrecode: || subject:, where rc codes for word/part-word given the current sequence and counterbalancing condition.
In Stata syntax, the equation here is: mixed key i.rc##i.langt || langrecode: || subject:
This was a linear regression for each condition, which in Stata syntax is: reg effect_size time_points.
In Stata syntax, the equations here are:
mixed key i.firstword || langrecode: || subject: if rc == 1
mixed key i.lastword || langrecode: || subject: if rc == 1
As we discussed in the Introduction, however, more exposure does not generate more learning in many cases. Under the current experimental conditions, the increase in learning with more exposures may stem from the fact that the minimal exposure amount (two exposures) generated only a very modest amount of learning to begin with.
References
Aslin, R. N., Saffran, J. R., & Newport, E. L. (1998). Computation of conditional probability statistics by 8-month-old infants. Psychological Science, 9(4), 321–324.
Batterink, L. J. (2017). Rapid statistical learning supporting word extraction from continuous speech. Psychological Science, 28(7), 921–928.
Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1(1), 1–20. https://doi.org/10.5334/joc.10
Brysbaert, M., Mandera, P., & Keuleers, E. (2018). The word frequency effect in word processing: An updated review. Current Directions in Psychological Science, 27(1), 45–50.
Bulgarelli, F., & Weiss, D. J. (2016). Anchors aweigh: The impact of overlearning on entrenchment effects in statistical learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(10), 1621.
Cattell, J. M. (1890). Mental tests and measurements. Mind, 15, 373–381.
Chen, J., & Ten Cate, C. (2017). Bridging the gap: Learning of acoustic nonadjacent dependencies by a songbird. Journal of Experimental Psychology: Animal Learning and Cognition, 43(3), 295.
Erickson, L. C., & Thiessen, E. D. (2015). Statistical learning of language: Theory, validity, and predictions of a statistical learning account of language acquisition. Developmental Review, 37, 66–108.
Finn, A. S., & Kam, C. L. H. (2008). The curse of knowledge: First language knowledge impairs adult learners’ use of novel statistics for word segmentation. Cognition, 108(2), 477–499.
Frank, M. C., Goldwater, S., Griffiths, T. L., & Tenenbaum, J. B. (2010). Modeling human performance in statistical word segmentation. Cognition, 117(2), 107–125.
Gebhart, A. L., Newport, E. L., & Aslin, R. N. (2009). Statistical learning of adjacent and nonadjacent dependencies among nonlinguistic sounds. Psychonomic Bulletin & Review, 16(3), 486–490.
Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4(1), 11–26.
Hyman, R. (1953). Stimulus information as a determinant of reaction time. Journal of Experimental Psychology, 45(3), 188.
Lazartigues, L., Mathy, F., & Lavigne, F. (2021). Statistical learning of unbalanced exclusive-or temporal sequences in humans. PLoS ONE, 16(2), e0246826.
Lazartigues, L., Mathy, F., & Lavigne, F. (2023). Probability, dependency, and frequency are not all equally involved in statistical learning. Experimental Psychology, 69(5), 241–252.
Lew-Williams, C., & Saffran, J. R. (2012). All words are not created equal: Expectations about word length guide infant statistical learning. Cognition, 122(2), 241–246. https://doi.org/10.1016/j.cognition.2011.10.007
Mirman, D., Graf Estes, K., & Magnuson, J. S. (2010). Computational modeling of statistical learning: Effects of transitional probability versus frequency and links to word learning. Infancy, 15(5), 471–486.
Misyak, J. B., & Christiansen, M. H. (2012). Statistical learning and language: An individual differences study. Language Learning, 62(1), 302–331.
Misyak, J. B., Christiansen, M. H., & Tomblin, J. B. (2010). On-line individual differences in statistical learning predict language processing. Frontiers in Psychology, 1, 31.
Newport, E. L., & Aslin, R. N. (2004). Learning at a distance: I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48(2), 127–162. https://doi.org/10.1016/S0010-0285(03)00128-2
Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19(1), 1–32.
Pelucchi, B., Hay, J. F., & Saffran, J. R. (2009). Learning in reverse: Eight-month-old infants track backward transitional probabilities. Cognition, 113(2), 244–247.
Perruchet, P., & Pacton, S. (2006). Implicit learning and statistical learning: One phenomenon, two approaches. Trends in Cognitive Sciences, 10(5), 233–238.
Perruchet, P., & Vinter, A. (1998). PARSER: A model for word segmentation. Journal of Memory and Language, 39(2), 246–263.
Popov, V., & Reder, L. M. (2020). Frequency effects on memory: A resource-limited theory. Psychological Review, 127(1), 1.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274(5294), 1926–1928. https://doi.org/10.1126/science.274.5294.1926
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70(1), 27–52.
Santolin, C., & Saffran, J. R. (2018). Constraints on statistical learning across species. Trends in Cognitive Sciences, 22(1), 52–63.
Swingley, D. (2005). Statistical clustering and the contents of the infant vocabulary. Cognitive Psychology, 50(1), 86–132.
Toro, J. M., & Trobalón, J. B. (2005). Statistical computations over a speech stream in a rodent. Perception & Psychophysics, 67(5), 867–875.
Wang, F. H., Hutton, E. A., & Zevin, J. D. (2019). Statistical learning of unfamiliar sounds as trajectories through a perceptual similarity space. Cognitive Science, 43(8), e12740.
Wang, F. H., Zevin, J. D., Trueswell, J. C., & Mintz, T. H. (2020). Top-down grouping affects adjacent dependency learning. Psychonomic Bulletin & Review, 27, 1052–1058.
Wang, F. H., Zevin, J., & Mintz, T. H. (2019). Successfully learning non-adjacent dependencies in a continuous artificial language stream. Cognitive Psychology, 113, 101223.
Weiss, D. J., Gerfen, C., & Mitchel, A. D. (2009). Speech segmentation in a simulated bilingual environment: A challenge for statistical learning? Language Learning and Development, 5(1), 30–49. https://doi.org/10.1080/15475440802340101
Yu, W., Wang, L., Qu, X., Wang, T., Zhang, J., & Liang, D. (2021). Transitional probabilities and expectation for word length impact verbal statistical learning. Acta Psychologica Sinica, 53(6), 565–574.
Zhou, X., & Marslen-Wilson, W. (1997). The abstractness of phonological representation in the Chinese mental lexicon. Cognitive Processing of Chinese and Other Asian Languages, 3–26.
Author Note
The reported experiments were not preregistered. The data reported in this paper are available at https://osf.io/che4q/?view_only=cca901e1ecb748178ecae2b6a5f1c31c. This work was supported by the Natural Science Foundation of China (No. 3217051) to Suiping Wang.
Author information
Contributions
Felix Hao Wang: Conceptualization, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. Meili Luo: Project administration, Writing – review & editing. Suiping Wang: Resources, Writing – review & editing.
Ethics declarations
Conflict of interest
There is no conflict of interest to report.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hao Wang, F., Luo, M. & Wang, S. Statistical word segmentation succeeds given the minimal amount of exposure. Psychon Bull Rev (2023). https://doi.org/10.3758/s13423-023-02386-z