Unsupervised and User Feedback Based Lexicon Adaptation for Foreign Names and Acronyms
In this work, we evaluate a set of lexicon adaptation methods for improving the recognition of foreign names and acronyms in automatic speech recognition (ASR). The most likely foreign names and acronyms are selected from the language model (LM) training corpus based on typographic information and letter n-gram perplexity. Adapted pronunciation rules are generated for the selected foreign name candidates using a statistical grapheme-to-phoneme (G2P) model. A rule-based method is used for pronunciation adaptation of acronym candidates. In addition to unsupervised lexicon adaptation, we also evaluate an adaptation method based on speech data and user-corrected ASR transcripts. Pronunciation variants for foreign name candidates are retrieved using forced alignment and second-pass decoding over partial audio segments. The optimal pronunciation variants are collected and used for future pronunciation adaptation of foreign names.
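The candidate selection step can be illustrated with a minimal sketch. The abstract does not specify the n-gram order, the smoothing, or how the typographic cue is defined, so everything below is an assumption made for illustration: letter trigrams, add-one smoothing, `^` and `$` as padding symbols, and capitalization as the typographic cue. The function names (`train_letter_ngrams`, `letter_perplexity`, `rank_foreign_candidates`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def train_letter_ngrams(words, n=3):
    """Count letter n-grams and their (n-1)-letter contexts over native words."""
    ngrams, contexts, alphabet = Counter(), Counter(), set()
    for word in words:
        # Assumed padding: '^' marks the word start, '$' marks the word end.
        padded = "^" * (n - 1) + word.lower() + "$"
        alphabet.update(padded)
        for i in range(len(padded) - n + 1):
            gram = padded[i:i + n]
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    # Reserve one extra symbol for letters never seen in training.
    return ngrams, contexts, len(alphabet) + 1


def letter_perplexity(word, model, n=3):
    """Per-letter perplexity of a word under an add-one smoothed n-gram model."""
    ngrams, contexts, vocab_size = model
    padded = "^" * (n - 1) + word.lower() + "$"
    log_prob, count = 0.0, 0
    for i in range(len(padded) - n + 1):
        gram = padded[i:i + n]
        # Add-one smoothing (an assumption, not the paper's stated choice).
        prob = (ngrams[gram] + 1) / (contexts[gram[:-1]] + vocab_size)
        log_prob += math.log(prob)
        count += 1
    return math.exp(-log_prob / count)


def rank_foreign_candidates(tokens, model, top_k=10, n=3):
    """Keep capitalized tokens and rank them by descending letter perplexity."""
    capitalized = {t for t in tokens if t[:1].isupper()}
    ranked = sorted(capitalized,
                    key=lambda t: letter_perplexity(t, model, n),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy native-language word list; a real system would use a large lexicon.
    native = ["talo", "katu", "koulu", "kala", "sauna", "kukka", "tulo", "lato"]
    model = train_letter_ngrams(native)
    for token in rank_foreign_candidates(["Talo", "Wright", "Kukka"], model):
        print(token, round(letter_perplexity(token, model), 2))
```

Words whose letter sequences are unusual for the native language receive a higher perplexity and therefore rank first. In this sketch the top-ranked candidates would then be passed on to pronunciation adaptation.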
Keywords: Speech recognition, Lexicon adaptation, Unsupervised pronunciation adaptation, Forced alignment, User feedback
This work has been funded by the Academy of Finland under the Finnish Centre of Excellence in Computational Inference programme. The experiments were performed using computational resources provided by the Aalto Science-IT project.