Abstract
A phonetic approach to the automatic recognition of isolated words is investigated. A phonetic encoding method is proposed in which each word of a vocabulary is associated with a code sequence of stable phonemes. An information-theoretic estimate of vocabulary confusability, computed from a speaker's phonetic database and the signal-to-noise ratio (SNR) of the communications channel, is derived using the properties of the Kullback-Leibler divergence. In an experimental study on the recognition of Russian words, the relationship between recognition quality and the proposed confusability estimate is demonstrated. It is established that requiring isolated syllable pronunciation makes it possible to attain a recognition accuracy of 90–95% for vocabularies containing 2000 words.
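The idea of assigning a phoneme code to each speech segment by minimum Kullback-Leibler divergence can be illustrated with a minimal sketch. The abstract does not specify the feature representation or the decision rule, so everything below is an assumption made for illustration only: each segment is represented as a normalized power spectrum treated as a discrete distribution, the reference values in `REFERENCES` are made up, and the helper names `kl_divergence`, `nearest_phoneme`, and `encode_word` are hypothetical rather than taken from the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            # eps guards against division by zero when q has an empty bin
            total += pi * math.log(pi / max(qi, eps))
    return total

def nearest_phoneme(observed, references):
    """Return the label whose reference distribution is closest to `observed`."""
    return min(references, key=lambda label: kl_divergence(observed, references[label]))

def encode_word(segments, references):
    """Map a sequence of segment distributions to a sequence of phoneme labels."""
    return [nearest_phoneme(segment, references) for segment in segments]

# Hypothetical toy "phonetic database": made-up normalized spectra over 4 bands.
REFERENCES = {
    "a": [0.60, 0.25, 0.10, 0.05],
    "i": [0.10, 0.20, 0.30, 0.40],
    "u": [0.70, 0.20, 0.05, 0.05],
}

if __name__ == "__main__":
    # Two made-up segments standing in for the stable phonemes of one word.
    word = [[0.58, 0.27, 0.10, 0.05], [0.12, 0.20, 0.28, 0.40]]
    print(encode_word(word, REFERENCES))
```

In this toy setting, reference phonemes separated by a small divergence (here "a" and "u") are the ones most easily confused, which is the intuition behind a divergence-based confusability estimate; how the paper combines these quantities with the channel SNR is not stated in the abstract.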
Original Russian Text © A.V. Savchenko, 2014, published in Radiotekhnika i Elektronika, 2014, Vol. 59, No. 4, pp. 339–345.
Savchenko, A.V. Phonetic encoding method in the isolated words recognition problem. J. Commun. Technol. Electron. 59, 310–315 (2014). https://doi.org/10.1134/S1064226914040093
DOI: https://doi.org/10.1134/S1064226914040093