Exploiting Predictable Response Training to Improve Automatic Recognition of Children’s Spoken Responses

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6094)

Abstract

The unpredictability of spoken responses by young children (6-7 years old) makes them problematic for automatic speech recognizers. Aist and Mostow proposed predictable response training to improve automatic recognition of children’s free-form spoken responses. We apply this approach in the context of Project LISTEN’s Reading Tutor to the task of teaching children an important reading comprehension strategy, namely to make up their own questions about text while reading it. We show how to use knowledge about strategy instruction and the story text to generate a language model that predicts questions spoken by children during comprehension instruction. We evaluated this model on a previously unseen test set of 18 utterances totaling 137 words, spoken by 11 second-grade children in response to prompts the Reading Tutor inserted as they read. Compared to a baseline trigram language model that does not incorporate this knowledge, speech recognition using the generated language model achieved concept recall 5 times higher, a difference large enough to be statistically significant despite the small sample size.

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305B070458. The opinions expressed are those of the authors and do not necessarily represent the views of the Institute and the U.S. Department of Education. We also thank the educators, students, and LISTENers who helped generate and analyze our data.

References

  1. Hagen, A., Pellom, B., Vuuren, S.v., Cole, R.: Advances in Children’s Speech Recognition within an Interactive Literacy Tutor. In: HLT-NAACL, Association for Computational Linguistics, Boston, pp. 25–28 (2004)

  2. Litman, D.J., Silliman, S.: ITSPOKE: an intelligent tutoring spoken dialogue system. In: Demonstration Papers at HLT-NAACL, Association for Computational Linguistics, Boston, pp. 5–8 (2004)

  3. Meron, J., Valente, A., Johnson, W.L.: Improving the Authoring of Foreign Language Interactive Lessons in the Tactical Language Training System. In: SLaTE, Farmington, PA (2007)

  4. Wijekumar, K., Meyer, B.J.F.: Design and pilot of a web-based intelligent tutoring system to improve reading comprehension in middle school students. International Journal of Technology in Teaching and Learning 2(1), 36–49 (2006)

  5. Russell, M., D’Arcy, S.: Challenges for computer recognition of children’s speech. In: SLaTE, Pittsburgh, PA, pp. 108–111 (2007)

  6. Potamianos, A., Narayanan, S.: A Review of the Acoustic and Linguistic Properties of Children’s Speech. In: Proceedings of IEEE Multimedia Signal Processing Workshop, Chania, Crete, Greece, pp. 22–25. IEEE, Los Alamitos (2007)

  7. Eguchi, S., Hirsh, I.J.: Development of speech sounds in children. Acta Oto-Laryngologica Supplementum 257, 1–51 (1969)

  8. Gerosa, M., Giuliani, D., Narayanan, S.: Acoustic analysis and automatic recognition of spontaneous children’s speech. In: Interspeech, Pittsburgh, PA, pp. 1886–1889 (2006)

  9. Aist, G., Mostow, J.: Designing Spoken Tutorial Dialogue with Children to Elicit Predictable but Educationally Valuable Responses. In: Interspeech, Brighton, UK (2009)

  10. Rosenshine, B., Meister, C., Chapman, S.: Teaching students to generate questions: A review of the intervention studies. Review of Educational Research 66(2), 181–221 (1996)

  11. NRP: Report of the National Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction, Washington, DC (2000), http://www.nichd.nih.gov/publications/nrppubskey.cfm

  12. Mostow, J., Chen, W.: Generating Instruction Automatically for the Reading Strategy of Self-Questioning. In: 14th International Conference on Artificial Intelligence in Education, pp. 465–472. IOS Press, Brighton (2009)

  13. Mostow, J., Beck, J.: When the Rubber Meets the Road: Lessons from the In-School Adventures of an Automated Reading Tutor that Listens. In: Schneider, B., McDonald, S.-K. (eds.) Scale-Up in Education, vol. 2, pp. 183–200. Rowman & Littlefield Publishers, Lanham (2007)

  14. Zhang, X., Mostow, J., Duke, N.K., Trotochaud, C., Valeri, J., Corbett, A.: Mining Free-form Spoken Responses to Tutor Prompts. In: Proceedings of the First International Conference on Educational Data Mining, Montreal, pp. 234–241 (2008)

  15. Duke, N.K., Pearson, P.D.: Effective Practices for Developing Reading Comprehension. In: Farstrup, A.E., Samuels, S.J. (eds.) What Research Has To Say about Reading Instruction, International Reading Association, Newark, DE, pp. 205–242 (2002)

  16. Dolch, E.W.: A Basic Sight Vocabulary. The Elementary School Journal (1936)

  17. Stolcke, A.: SRILM – An Extensible Language Modeling Toolkit. Proc. Intl. Conf. on Spoken Language Processing 2, 901–904 (2002)

  18. Brants, T., Franz, A.: Web 1T 5-gram Version 1. Linguistic Data Consortium (2006)

  19. Berthold, A., Jameson, A.: Interpreting Symptoms of Cognitive Load in Speech Input. In: Proceedings of the Seventh International Conference on User Modeling, Banff, Canada, pp. 235–244 (1999)

  20. Jang, P.J., Hauptmann, A.G.: Improving Acoustic Models with Captioned Multimedia Speech. ICMCS 2, 767–771 (1999)

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, W., Mostow, J., Aist, G. (2010). Exploiting Predictable Response Training to Improve Automatic Recognition of Children’s Spoken Responses. In: Aleven, V., Kay, J., Mostow, J. (eds) Intelligent Tutoring Systems. ITS 2010. Lecture Notes in Computer Science, vol 6094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13388-6_11

  • DOI: https://doi.org/10.1007/978-3-642-13388-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13387-9

  • Online ISBN: 978-3-642-13388-6

  • eBook Packages: Computer Science, Computer Science (R0)
