Automatic Annotation of Disfluent Speech in Children’s Reading Tasks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10077)

Abstract

The automatic evaluation of children's reading performance is an important alternative to manual, one-on-one evaluation by teachers or tutors. Such evaluation requires detecting several types of reading miscues. This work presents an approach to annotating reading speech while detecting false-starts, repetitions and mispronunciations, three of the most common disfluencies. Using speech data from children aged 6–10 reading sentences and pseudowords, we apply a two-step process: first, an automatic alignment is performed to obtain the best possible word-level segmentation and to detect syllable-based false-starts and word repetitions using a strict FST (Finite State Transducer); then, words are classified as mispronounced or not through a likelihood measure of pronunciation based on phone posterior probabilities estimated by a neural network. This work is a step towards quantifying the amount and severity of disfluencies in order to provide a reading ability score computed from several sentence reading tasks.
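
To make the second step more concrete, the sketch below shows one common way a word-level pronunciation score can be derived from frame-level phone posteriors: the mean log posterior of the expected phone over the frames aligned to the word, compared against a threshold. This is an illustration only. The abstract does not specify the exact likelihood measure, so the function names, the data layout (one dictionary of phone probabilities per frame), the probability floor and the threshold value are all assumptions, not the authors' method.

```python
import math


def word_pronunciation_score(posteriors, aligned_phones, floor=1e-10):
    """Mean log posterior of the expected phone over a word's frames.

    posteriors     -- list of dicts, one per frame: phone label -> probability
                      (hypothetical layout for a neural network's output)
    aligned_phones -- expected phone label for each frame, e.g. taken from
                      a forced alignment of the prompted word
    floor          -- assumed lower bound that avoids log(0)
    """
    if not posteriors:
        raise ValueError("word has no frames")
    if len(posteriors) != len(aligned_phones):
        raise ValueError("need exactly one expected phone per frame")
    total = 0.0
    for frame, phone in zip(posteriors, aligned_phones):
        # A phone missing from the frame is treated as probability 0.
        total += math.log(max(frame.get(phone, 0.0), floor))
    return total / len(posteriors)


def is_mispronounced(score, threshold=math.log(0.5)):
    """Flag a word whose score falls below an assumed, tunable threshold."""
    return score < threshold


# Toy example: three frames aligned to the phones k, a, a.
expected = ["k", "a", "a"]
clear_reading = [{"k": 0.9, "a": 0.1}, {"k": 0.2, "a": 0.8}, {"k": 0.1, "a": 0.9}]
poor_reading = [{"k": 0.1, "a": 0.9}, {"k": 0.8, "a": 0.2}, {"k": 0.9, "a": 0.1}]

clear_score = word_pronunciation_score(clear_reading, expected)
poor_score = word_pronunciation_score(poor_reading, expected)
```

In practice a threshold like this would be tuned on annotated data, and scores are often normalised per phone rather than per frame so that long phones do not dominate the average.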

Acknowledgements

This work was supported in part by Fundação para a Ciência e Tecnologia under the project UID/EEA/50008/2013 (pluriannual funding in the scope of the LETSREAD project). Jorge Proença is supported by the SFRH/BD/97204/2013 FCT Grant. We would like to thank João de Deus, Bissaya Barreto and EBI de Pereira school associations and CASPAE parents' association for collaborating in the database collection.

Author information

Corresponding author

Correspondence to Jorge Proença.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Proença, J., Celorico, D., Lopes, C., Candeias, S., Perdigão, F. (2016). Automatic Annotation of Disfluent Speech in Children’s Reading Tasks. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_17

  • DOI: https://doi.org/10.1007/978-3-319-49169-1_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49168-4

  • Online ISBN: 978-3-319-49169-1

  • eBook Packages: Computer Science, Computer Science (R0)
