Automatic Annotation of Disfluent Speech in Children’s Reading Tasks

Proença, Jorge; Celorico, Dirce; Lopes, Carla; Candeias, Sara; Perdigão, Fernando

doi:10.1007/978-3-319-49169-1_17

Conference paper
First Online: 04 November 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10077)

Abstract

Automatic evaluation of children’s reading performance is an important alternative to manual, one-on-one evaluation by teachers or tutors. This requires detecting several types of reading miscues. This work presents an approach to annotating reading speech while detecting false starts, repetitions and mispronunciations, three of the most common disfluencies. Using speech data from 6–10-year-old children reading sentences and pseudowords, we apply a two-step process: first, an automatic alignment is performed to obtain the best possible word-level segmentation and to detect syllable-based false starts and word repetitions using a strict FST (Finite State Transducer); then, each word is classified as mispronounced or not through a likelihood measure of pronunciation based on phone posterior probabilities estimated by a neural network. This work is a step towards quantifying the amount and severity of disfluencies in order to provide a reading ability score computed from several sentence reading tasks.
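
The second step of the process described above can be illustrated with a minimal sketch. It assumes that a forced alignment has already assigned an expected phone to each frame of a word and that a neural network supplies per-frame phone posteriors; the score is the mean log posterior of the expected phone, and a word scoring below a threshold is flagged. The function names, the flooring constant and the threshold value are hypothetical choices for illustration, not the measure or settings used in the paper.

```python
import math


def word_pronunciation_score(posteriors, expected_phones, floor=1e-10):
    """Mean log posterior of the expected phone over the frames of one word.

    posteriors: list of dicts (one per frame) mapping phone label -> probability.
    expected_phones: list of phone labels (one per frame), assumed to come
        from a forced alignment of the word's canonical pronunciation.
    floor: hypothetical lower bound that avoids log(0) for unseen phones.
    """
    if not posteriors:
        raise ValueError("a word needs at least one frame")
    if len(posteriors) != len(expected_phones):
        raise ValueError("one expected phone is required per frame")
    total = 0.0
    for frame, phone in zip(posteriors, expected_phones):
        # Probability the network gives to the phone the reader should be saying.
        total += math.log(max(frame.get(phone, 0.0), floor))
    return total / len(posteriors)


def is_mispronounced(score, threshold=math.log(0.5)):
    """Flag a word whose score falls below a threshold (illustrative value)."""
    return score < threshold
```

In practice the threshold would be tuned on annotated data, since it trades missed mispronunciations against false alarms.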

Acknowledgements

This work was supported in part by Fundação para a Ciência e Tecnologia under the project UID/EEA/50008/2013 (pluriannual funding in the scope of the LETSREAD project). Jorge Proença is supported by the SFRH/BD/97204/2013 FCT Grant. We would like to thank the João de Deus, Bissaya Barreto and EBI de Pereira school associations and the CASPAE parents’ association for collaborating in the database collection.

Author information

Authors and Affiliations

Instituto de Telecomunicações, Coimbra, Portugal
Jorge Proença, Dirce Celorico, Carla Lopes & Fernando Perdigão
Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal
Jorge Proença & Fernando Perdigão
Polytechnic Institute of Leiria, Leiria, Portugal
Carla Lopes
Microsoft Language Development Centre, Lisbon, Portugal
Sara Candeias

Corresponding author

Correspondence to Jorge Proença.

Editor information

Editors and Affiliations

INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Alberto Abad
I3A/University of Zaragoza, Zaragoza, Spain
Alfonso Ortega
DETI/IEETA, University of Aveiro, Aveiro, Portugal
António Teixeira
AtlantTIC Research Center, Universidad de Vigo, Vigo, Spain
Carmen García Mateo
Universitat Politècnica de València, Valencia, Spain
Carlos D. Martínez Hinarejos
University of Coimbra, Coimbra, Portugal
Fernando Perdigão
INESC-ID/ISCTE-IUL, Lisbon, Portugal
Fernando Batista
INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Nuno Mamede

About this paper

Cite this paper

Proença, J., Celorico, D., Lopes, C., Candeias, S., Perdigão, F. (2016). Automatic Annotation of Disfluent Speech in Children’s Reading Tasks. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_17

DOI: https://doi.org/10.1007/978-3-319-49169-1_17
Published: 04 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49168-4
Online ISBN: 978-3-319-49169-1
eBook Packages: Computer Science, Computer Science (R0)