Automated definition of phonetically homogeneous sections of words in a natural language based on multiparameter optimization

  • Pattern Recognition and Image Processing
Journal of Computer and Systems Sciences International

Abstract

An approach to automatically splitting words into phonetically homogeneous segments is proposed, in which the segment boundaries are found by solving a multiparameter optimization problem. The optimal boundaries are assumed to give the maximum difference in phonetic material between adjacent segments and the maximum similarity within each segment. The measure of similarity and difference is based on the correlation between the columns of the word's parametric portrait matrix, which is produced by a time-spectral transform of an audio recording of the word. The problem is solved numerically by an algorithm that is a modification of dynamic programming. Experimental results for several Russian words are presented to support the assumptions made and the viability of the proposed algorithms.
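The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' modified algorithm, whose details are not given here. It is a plain textbook dynamic program under several assumptions of ours: the parametric portrait is a 2-D array with one column per time frame, the number of segments is fixed in advance, and only within-segment similarity is scored explicitly (the sum of pairwise column correlations in a segment divided by its length). The name `segment_columns` and the synthetic input are hypothetical.

```python
import numpy as np


def segment_columns(portrait, n_segments):
    """Split the columns of `portrait` into contiguous segments.

    Returns the interior boundaries, i.e. the column indices at which
    a new segment starts. Assumes no column is constant, since the
    Pearson correlation is undefined for a zero-variance column.
    """
    n_frames = portrait.shape[1]
    if not 1 <= n_segments <= n_frames:
        raise ValueError("n_segments must be between 1 and the column count")

    # corr[a, b] is the Pearson correlation between columns a and b.
    corr = np.corrcoef(portrait, rowvar=False)

    # 2-D prefix sums, so the sum over any square block of corr is O(1).
    prefix = np.zeros((n_frames + 1, n_frames + 1))
    prefix[1:, 1:] = corr.cumsum(axis=0).cumsum(axis=1)

    def score(i, j):
        # Sum of pairwise correlations within columns [i, j), over length.
        block = prefix[j, j] - prefix[i, j] - prefix[j, i] + prefix[i, i]
        return block / (j - i)

    # best[k, j]: best total score for splitting columns [0, j) into k parts.
    best = np.full((n_segments + 1, n_frames + 1), -np.inf)
    back = np.zeros((n_segments + 1, n_frames + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n_frames + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1, i] + score(i, j)
                if candidate > best[k, j]:
                    best[k, j] = candidate
                    back[k, j] = i

    # Walk the back-pointers from the last column to recover segment starts.
    starts = []
    j = n_frames
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return sorted(starts)[1:]  # drop the leading 0, keep interior boundaries


# Synthetic "portrait": three runs of identical columns (4, 3 and 5 frames).
a = np.array([5.0, 1.0, 0.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 5.0, 1.0, 0.0, 0.0])
c = np.array([0.0, 0.0, 0.0, 0.0, 5.0, 1.0])
demo = np.column_stack([a] * 4 + [b] * 3 + [c] * 5)
boundaries = segment_columns(demo, 3)
print(boundaries)
```

Dividing each segment's correlation sum by its length keeps long segments from dominating the total. With a fixed number of segments, maximizing within-segment similarity also pushes dissimilar frames into different segments, which is why the between-segment term is left implicit in this sketch.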

Author information

Corresponding author

Correspondence to O. N. Korsun.

Additional information

Original Russian Text © O.N. Korsun, A.V. Poliev, 2016, published in Izvestiya Akademii Nauk, Teoriya i Sistemy Upravleniya, 2016, No. 4, pp. 115–124.

About this article

Cite this article

Korsun, O.N., Poliev, A.V. Automated definition of phonetically homogeneous sections of words in a natural language based on multiparameter optimization. J. Comput. Syst. Sci. Int. 55, 609–618 (2016). https://doi.org/10.1134/S1064230716040080

  • DOI: https://doi.org/10.1134/S1064230716040080
