Exploiting Predictable Response Training to Improve Automatic Recognition of Children’s Spoken Responses

Chen, Wei; Mostow, Jack; Aist, Gregory

doi:10.1007/978-3-642-13388-6_11

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6094)

Abstract

The unpredictability of spoken responses by young children (6–7 years old) makes them problematic for automatic speech recognizers. Aist and Mostow proposed predictable response training to improve automatic recognition of children’s free-form spoken responses. We apply this approach in the context of Project LISTEN’s Reading Tutor to the task of teaching children an important reading comprehension strategy, namely to make up their own questions about text while reading it. We show how to use knowledge about strategy instruction and the story text to generate a language model that predicts questions spoken by children during comprehension instruction. We evaluated this model on a previously unseen test set of 18 utterances totaling 137 words spoken by 11 second-grade children in response to prompts the Reading Tutor inserted as they read. Compared to using a baseline trigram language model that does not incorporate this knowledge, speech recognition using the generated language model achieved concept recall 5 times higher, a difference large enough to be statistically significant despite the small sample size.

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305B070458. The opinions expressed are those of the authors and do not necessarily represent the views of the Institute and the U.S. Department of Education. We also thank the educators, students, and LISTENers who helped generate and analyze our data.

References

Hagen, A., Pellom, B., Vuuren, S.v., Cole, R.: Advances in Children’s Speech Recognition within an Interactive Literacy Tutor. In: HLT-NAACL, Association for Computational Linguistics, Boston, pp. 25–28 (2004)
Litman, D.J., Silliman, S.: ITSPOKE: an intelligent tutoring spoken dialogue system. In: Demonstration Papers at HLT-NAACL, Association for Computational Linguistics, Boston, pp. 5–8 (2004)
Meron, J., Valente, A., Johnson, W.L.: Improving the Authoring of Foreign Language Interactive Lessons in the Tactical Language Training System. In: SLaTE, Farmington, PA (2007)
Wijekumar, K., Meyer, B.J.F.: Design and pilot of a web-based intelligent tutoring system to improve reading comprehension in middle school students. International Journal of Technology in Teaching and Learning 2(1), 36–49 (2006)
Russell, M., D’Arcy, S.: Challenges for computer recognition of children’s speech. In: SLaTE, Pittsburgh, PA, pp. 108–111 (2007)
Potamianos, A., Narayanan, S.: A Review of the Acoustic and Linguistic Properties of Children’s Speech. In: Proceedings of IEEE Multimedia Signal Processing Workshop, Chania, Crete, Greece, pp. 22–25. IEEE, Los Alamitos (2007)
Eguchi, S., Hirsh, I.J.: Development of speech sounds in children. Acta Oto-Laryngologica Supplementum 257, 1–51 (1969)
Gerosa, M., Giuliani, D., Narayanan, S.: Acoustic analysis and automatic recognition of spontaneous children’s speech. In: Interspeech, Pittsburgh, PA, pp. 1886–1889 (2006)
Aist, G., Mostow, J.: Designing Spoken Tutorial Dialogue with Children to Elicit Predictable but Educationally Valuable Responses. In: Interspeech, Brighton, UK (2009)
Rosenshine, B., Meister, C., Chapman, S.: Teaching students to generate questions: A review of the intervention studies. Review of Educational Research 66(2), 181–221 (1996)
NRP: Report of the National Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction, Washington, DC (2000), http://www.nichd.nih.gov/publications/nrppubskey.cfm
Mostow, J., Chen, W.: Generating Instruction Automatically for the Reading Strategy of Self-Questioning. In: 14th International Conference on Artificial Intelligence in Education, pp. 465–472. IOS Press, Brighton (2009)
Mostow, J., Beck, J.: When the Rubber Meets the Road: Lessons from the In-School Adventures of an Automated Reading Tutor that Listens. In: Schneider, B., McDonald, S.-K. (eds.) Scale-Up in Education, vol. 2, pp. 183–200. Rowman & Littlefield Publishers, Lanham (2007)
Zhang, X., Mostow, J., Duke, N.K., Trotochaud, C., Valeri, J., Corbett, A.: Mining Free-form Spoken Responses to Tutor Prompts. In: Proceedings of the First International Conference on Educational Data Mining, Montreal, pp. 234–241 (2008)
Duke, N.K., Pearson, P.D.: Effective Practices for Developing Reading Comprehension. In: Farstrup, A.E., Samuels, S.J. (eds.) What Research Has To Say about Reading Instruction, International Reading Association, Newark, DE, pp. 205–242 (2002)
Dolch, E.W.: A Basic Sight Vocabulary. The Elementary School Journal (1936)
Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. Proc. Intl. Conf. on Spoken Language Processing 2, 901–904 (2002)
Brants, T., Franz, A.: Web 1T 5-gram Version 1. Linguistic Data Consortium (2006)
Berthold, A., Jameson, A.: Interpreting Symptoms of Cognitive Load in Speech Input. In: Proceedings of the Seventh International Conference on User Modeling, Banff, Canada, pp. 235–244 (1999)
Jang, P.J., Hauptmann, A.G.: Improving Acoustic Models with Captioned Multimedia Speech. ICMCS 2, 767–771 (1999)

Author information

Authors and Affiliations

Project LISTEN, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Wei Chen, Jack Mostow & Gregory Aist
Applied Linguistics and Communication Studies, Iowa State University, Ames, IA, 50011, USA
Gregory Aist

Editor information

Editors and Affiliations

Human-Computer Interaction Institute, Carnegie Mellon University, 5000 Forbes Avenue, 15213, Pittsburgh, PA, USA
Vincent Aleven & Jack Mostow
School of Information Technologies, University of Sydney, 1 Cleveland Street, 2006, Sydney, Australia
Judy Kay

About this paper

Cite this paper

Chen, W., Mostow, J., Aist, G. (2010). Exploiting Predictable Response Training to Improve Automatic Recognition of Children’s Spoken Responses. In: Aleven, V., Kay, J., Mostow, J. (eds) Intelligent Tutoring Systems. ITS 2010. Lecture Notes in Computer Science, vol 6094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13388-6_11

DOI: https://doi.org/10.1007/978-3-642-13388-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13387-9
Online ISBN: 978-3-642-13388-6
eBook Packages: Computer Science, Computer Science (R0)
