Phonetic Sequence to Graphemes Conversion Based on DTW and One-Stage Algorithms
This work proposes an algorithm for converting phonetic sequences into graphemes, using Dynamic Time Warping (DTW) for the recognition of isolated words or closed sentences and the One-Stage algorithm for continuous speech recognition. Most speech recognition systems perform recognition in a single stage, without producing an intermediate phonetic sequence. The proposed solution is hybrid in the sense that it uses HMMs and Viterbi decoding to recognize a phonetic sequence (in practice, a sequence of triphones) and then DTW or One-Stage to generate the corresponding graphemes. Experimental results showed an average accuracy of 100% on the recognition of closed sentences and an average word recognition rate of 84% on the continuous speech recognition task.
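The DTW step for isolated words can be illustrated with a minimal sketch. It assumes the first stage has already produced a sequence of phone labels and that each vocabulary word has one reference pronunciation. The lexicon, the phone symbols, the 0/1 local cost, and the length normalization are all hypothetical choices made for illustration (plain phone labels stand in for triphones); they are not taken from the paper.

```python
def dtw_distance(observed, reference, mismatch_cost=1.0):
    """Classic DTW between two symbol sequences.

    Local cost is 0 for identical symbols and `mismatch_cost` otherwise
    (an assumed cost, chosen only to keep the example small).
    """
    n, m = len(observed), len(reference)
    inf = float("inf")
    # d[i][j] = cost of the best warping path aligning the first i
    # observed symbols with the first j reference symbols.
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0.0 if observed[i - 1] == reference[j - 1] else mismatch_cost
            d[i][j] = local + min(d[i - 1][j],      # observed symbol repeated
                                  d[i][j - 1],      # reference symbol repeated
                                  d[i - 1][j - 1])  # both sequences advance
    return d[n][m]


def phones_to_grapheme(observed, lexicon):
    """Return the lexicon word whose pronunciation is closest under DTW."""
    def normalized(word):
        ref = lexicon[word]
        return dtw_distance(observed, ref) / (len(observed) + len(ref))
    return min(lexicon, key=normalized)


# Hypothetical three-word lexicon with made-up phone symbols.
lexicon = {
    "casa": ["k", "a", "z", "a"],
    "caso": ["k", "a", "z", "u"],
    "mesa": ["m", "e", "z", "a"],
}

# A recognized sequence in which one phone was emitted twice.
recognized = ["k", "a", "a", "z", "a"]
best_word = phones_to_grapheme(recognized, lexicon)
```

Because DTW allows a symbol in either sequence to be repeated along the warping path, a duplicated phone in the recognized sequence adds no cost against the correct reference. The One-Stage algorithm used for continuous speech additionally allows transitions between word templates inside the same dynamic programming pass, which this sketch does not attempt.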