Unsupervised and User Feedback Based Lexicon Adaptation for Foreign Names and Acronyms

Mansikkaniemi, André; Kurimo, Mikko

doi:10.1007/978-3-319-25789-1_19


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9449)

Included in the following conference series:

International Conference on Statistical Language and Speech Processing


Abstract

In this work we evaluate a set of lexicon adaptation methods for improving the recognition of foreign names and acronyms in automatic speech recognition (ASR). The most likely foreign names and acronyms are selected from the language model (LM) training corpus based on typographic information and letter n-gram perplexity. Adapted pronunciation rules are generated for the selected foreign name candidates using a statistical grapheme-to-phoneme (G2P) model, while a rule-based method is used for pronunciation adaptation of acronym candidates. In addition to unsupervised lexicon adaptation, we also evaluate an adaptation method based on speech data and user-corrected ASR transcripts. Pronunciation variants for foreign name candidates are retrieved using forced alignment and second-pass decoding over partial audio segments. The optimal pronunciation variants are collected and used for future pronunciation adaptation of foreign names.
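The candidate selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram order, the smoothing, the threshold, or the exact typographic cue, so the bigram order, add-one smoothing, the capitalization test, and all function names below are assumptions made for illustration only.

```python
import math
from collections import Counter

def train_letter_bigrams(words):
    """Count letter bigrams, with word-boundary markers, over native-language words."""
    bigrams, contexts, alphabet = Counter(), Counter(), set()
    for word in words:
        padded = "^" + word.lower() + "$"
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    # One extra slot stands in for any letter never seen in training (assumption).
    return bigrams, contexts, len(alphabet) + 1

def perplexity(word, model):
    """Per-letter perplexity of a word under the add-one smoothed bigram model."""
    bigrams, contexts, vocab_size = model
    padded = "^" + word.lower() + "$"
    log_prob = 0.0
    steps = 0
    for a, b in zip(padded, padded[1:]):
        prob = (bigrams[(a, b)] + 1) / (contexts[a] + vocab_size)
        log_prob += math.log(prob)
        steps += 1
    return math.exp(-log_prob / steps)

def select_candidates(words, model, threshold):
    """Keep capitalized words whose letter perplexity exceeds the threshold.

    Capitalization is a stand-in for the typographic information mentioned
    in the abstract; the threshold value is hypothetical.
    """
    return [w for w in words if w[:1].isupper() and perplexity(w, model) > threshold]
```

A word whose letter sequences are rare in the native-language training list receives a high perplexity and is therefore flagged as a likely foreign name. In practice the model would be trained on a large native vocabulary and the threshold tuned on held-out data.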


Notes

1. Snowball - http://snowball.tartarus.org/


Acknowledgements

This work has been funded by the Academy of Finland under the Finnish Centre of Excellence in Computational Inference programme. The experiments were performed using computational resources provided by the Aalto Science-IT project.

Author information

Authors and Affiliations

Department of Signal Processing and Acoustics, School of Electrical Engineering, Aalto University, Aalto, Finland
André Mansikkaniemi & Mikko Kurimo


Corresponding author

Correspondence to André Mansikkaniemi.

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistic, Rovira i Virgili University, Tarragona, Spain
Adrian-Horia Dediu
Research Group on Mathematical Linguistic, Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary
Klára Vicsi


Cite this paper

Mansikkaniemi, A., Kurimo, M. (2015). Unsupervised and User Feedback Based Lexicon Adaptation for Foreign Names and Acronyms. In: Dediu, AH., Martín-Vide, C., Vicsi, K. (eds) Statistical Language and Speech Processing. SLSP 2015. Lecture Notes in Computer Science, vol 9449. Springer, Cham. https://doi.org/10.1007/978-3-319-25789-1_19


DOI: https://doi.org/10.1007/978-3-319-25789-1_19
Published: 17 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25788-4
Online ISBN: 978-3-319-25789-1
eBook Packages: Computer Science, Computer Science (R0)
