Automated definition of phonetically homogeneous sections of words in a natural language based on multiparameter optimization

Korsun, O. N.; Poliev, A. V.

doi:10.1134/S1064230716040080

Pattern Recognition and Image Processing
Published: 11 August 2016

Volume 55, pages 609–618, (2016)

Journal of Computer and Systems Sciences International

Abstract

An approach to automatically splitting words into phonetically homogeneous parts is proposed, in which the boundaries of the parts are found by solving a multiparameter optimization problem. The approach is assumed to ensure the maximum difference in phonetic material between adjacent parts and the maximum similarity within each part. The measure of similarity and difference is based on the correlation between the columns of the word's parametric portrait matrix, which is generated by a time-spectral conversion of an audio recording of the word. To solve the problem numerically, an algorithm is proposed that modifies a dynamic programming technique. Experimental results for several Russian words are presented to confirm the validity of the assumptions made and the viability of the proposed algorithms.
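
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the objective function or the specific modification of dynamic programming, so everything below is an assumption. The sketch treats the parametric portrait as a list of spectral columns, takes 1 minus the Pearson correlation as the dissimilarity between two columns, assumes the number of parts `k` is known in advance, and uses textbook dynamic programming to choose contiguous parts with the smallest total within-part dissimilarity. The names `pearson` and `segment` are hypothetical, and the between-part difference is only handled implicitly.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def segment(columns, k):
    """Split columns into k contiguous parts; return the k-1 interior boundaries.

    Assumed objective (not taken from the paper): minimize the sum over parts
    of (pairwise dissimilarity inside the part) / (part length).
    """
    n = len(columns)
    d = [[1.0 - pearson(columns[i], columns[j]) for j in range(n)]
         for i in range(n)]

    # cost[a][b] is the normalized dissimilarity of the part columns[a:b].
    cost = [[0.0] * (n + 1) for _ in range(n + 1)]
    for a in range(n):
        total = 0.0
        for b in range(a + 1, n + 1):
            # Extend the part by column b-1 and add its pairs with earlier columns.
            total += sum(d[i][b - 1] for i in range(a, b - 1))
            cost[a][b] = total / (b - a)

    # best[p][b]: lowest cost of covering columns[0:b] with exactly p parts.
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for p in range(1, k + 1):
        for b in range(p, n + 1):
            for a in range(p - 1, b):
                cand = best[p - 1][a] + cost[a][b]
                if cand < best[p][b]:
                    best[p][b] = cand
                    back[p][b] = a

    # Walk the back-pointers from the end to recover the start of each part.
    starts = []
    b = n
    for p in range(k, 0, -1):
        b = back[p][b]
        starts.append(b)
    return sorted(starts)[1:]  # drop the leading 0, keep interior boundaries


# Toy "portrait": three spectral shapes, repeated with different loudness.
columns = [
    [1, 2, 3, 4], [2, 4, 6, 8], [2, 3, 4, 5],   # shape A
    [4, 3, 2, 1], [8, 6, 4, 2],                 # shape B
    [1, 3, 1, 3], [2, 6, 2, 6], [3, 5, 3, 5],   # shape C
]
print(segment(columns, 3))  # [3, 5]
```

Because correlation ignores positive scaling and offsets, columns with the same spectral shape but different loudness land in the same part in this toy example. Real recordings would need a stated rule for choosing `k` and for handling silent (constant) columns, neither of which this sketch covers.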

Author information

Authors and Affiliations

State Research Institute of Aviation Systems (FGUP GosNIIAS), Moscow Institute of Physics and Technology, Moscow, Russia
O. N. Korsun & A. V. Poliev

Corresponding author

Correspondence to O. N. Korsun.

Additional information

Original Russian Text © O.N. Korsun, A.V. Poliev, 2016, published in Izvestiya Akademii Nauk, Teoriya i Sistemy Upravleniya, 2016, No. 4, pp. 115–124.

About this article

Cite this article

Korsun, O.N., Poliev, A.V. Automated definition of phonetically homogeneous sections of words in a natural language based on multiparameter optimization. J. Comput. Syst. Sci. Int. 55, 609–618 (2016). https://doi.org/10.1134/S1064230716040080

Received: 24 February 2016
Accepted: 10 March 2016
Published: 11 August 2016
Issue Date: July 2016
DOI: https://doi.org/10.1134/S1064230716040080
