This paper considers the problem of determining the accuracy of an autoregressive model of a speech signal and proposes a method for measuring the accuracy indicator in sliding observation window mode. The accuracy indicator is a modified value of the COSH distance (hyperbolic cosine distance) between the autoregressive model and the Schuster periodogram of the same phoneme, which serves as the reference spectral sample. To study the capabilities of the proposed method, a full-scale experiment was set up and carried out on a set of autoregressive models of different orders. These models were obtained by Burg's method for the vowel sounds of a test speaker. From the measurements, the optimal autoregressive order and the corresponding optimal autoregressive model were found for each vowel. It is shown that this optimization increased the accuracy of the autoregressive model of the speech signal by more than 60%, depending on the speech sound and the characteristics of the test speaker's vocal tract. The results are intended for use in systems for automatic speech processing and digital speech transmission with radical data compression based on linear prediction coefficients.
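The measurement chain described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: it uses the standard symmetric COSH distance (mean of cosh(ln P − ln Q) − 1 over frequency bins) rather than the author's modified value, which the abstract does not define; the function names (burg_ar, periodogram, ar_spectrum, cosh_distance, best_order, synth_ar) are hypothetical; and a synthetic second-order autoregressive signal stands in for a vowel recording. Burg's method is implemented in its textbook form.

```python
import numpy as np

def burg_ar(x, order):
    """Textbook Burg recursion: returns (a, e) with a[0] = 1 and e the error power."""
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors
    b = x[:-1].copy()   # backward prediction errors, delayed by one sample
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)
    for _ in range(order):
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))  # reflection coeff.
        a = np.concatenate((a, [0.0]))
        a = a + k * a[::-1]          # Levinson-type coefficient update
        e *= 1.0 - k * k
        f, b = f[1:] + k * b[1:], b[:-1] + k * f[:-1]
    return a, e

def periodogram(x, nfft):
    """Schuster periodogram |X(f)|^2 / N on nfft // 2 + 1 bins (nfft >= len(x))."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x, nfft)) ** 2 / len(x)

def ar_spectrum(a, e, nfft):
    """Power spectrum of the AR model, e / |A(f)|^2, on the same frequency grid."""
    return e / np.abs(np.fft.rfft(a, nfft)) ** 2

def cosh_distance(p, q, floor=1e-12):
    """Standard symmetric COSH distance (assumption: not the paper's modified value)."""
    d = np.log(np.maximum(p, floor)) - np.log(np.maximum(q, floor))
    return float(np.mean(np.cosh(d) - 1.0))

def best_order(frame, reference, orders, nfft):
    """Pick the AR order whose Burg model is closest to the reference spectrum."""
    scores = {p: cosh_distance(ar_spectrum(*burg_ar(frame, p), nfft), reference)
              for p in orders}
    return min(scores, key=scores.get), scores

def synth_ar(a, n, seed=0):
    """Hypothetical test signal: white noise filtered by 1 / A(z)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    x = np.zeros(n)
    for i in range(n):
        past = sum(a[k] * x[i - k] for k in range(1, len(a)) if i - k >= 0)
        x[i] = w[i] - past
    return x

# Usage: the reference periodogram comes from one stretch of the signal,
# the AR models from a shorter frame (one position of the sliding window).
x = synth_ar([1.0, -1.5, 0.8], 4096)
reference = periodogram(x[:2048], 2048)
frame = x[2048:2560]
order, scores = best_order(frame, reference, range(1, 11), 2048)
print("selected order:", order, "distance:", scores[order])
```

In a sliding-window setting, best_order would be re-evaluated at each window position. A raw periodogram is a noisy reference, so the selected order in this sketch should be read as illustrative only.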
Change history
04 August 2023
A Correction to this paper has been published: https://doi.org/10.1007/s11018-023-02225-6
Notes
Phoneme Training. Information system for phonetic analysis and speech training [website]. URL: https://sites.google.com/site/frompldcreators/produkty-1/phonemetraining (accessed 20.09.2022).
Additional information
Translated from Izmeritel'naya Tekhnika, No. 10, pp. 58–63, October, 2022.
About this article
Cite this article
Savchenko, V.V. Improving the Method for Measuring the Accuracy Indicator of a Speech Signal Autoregression Model. Meas Tech 65, 769–775 (2023). https://doi.org/10.1007/s11018-023-02150-8
DOI: https://doi.org/10.1007/s11018-023-02150-8