This paper examines the problem of measuring the fundamental tone (pitch) frequency of a speech signal in the presence of white Gaussian noise. A measurement method is proposed that exploits the periodic structure of the power spectrum of voiced speech frames and is based on the principle of accumulating harmonic energy in the frequency domain. To this end, the speech-processing algorithm includes a procedure that equalizes the envelope of the power spectrum using a two-level autoregressive model of the observations: one level within a single period of the fundamental tone and the other over an interval of several such periods. The order of the lower-level autoregression is adapted to the observed frame. An example of a practical implementation of the adaptive method based on the Burg method is considered. The main advantages of the adaptive method over known analogs are high speed and improved noise immunity, both of which are confirmed in a full-scale experiment. The adaptive method yields a gain of 5–10 dB in threshold signal level.
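The two ideas named in the abstract, envelope equalization with an autoregressive (AR) model and harmonic energy accumulation in the frequency domain, can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses a single, fixed-order AR model fitted with Burg's estimator instead of the two-level adaptive scheme, and the scoring rule (a comb with negative teeth halfway between harmonics, normalized by the square root of the number of harmonics) is an assumption chosen to discourage octave errors. All function names, parameter values, and the synthetic test signal are hypothetical.

```python
import numpy as np

def burg_ar(x, order):
    """Burg estimate of AR coefficients a[0..order] (a[0] = 1) and residual power."""
    x = np.asarray(x, dtype=float)
    f, b = x.copy(), x.copy()            # forward / backward prediction errors
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)
    for _ in range(order):
        ff, bb = f[1:], b[:-1]
        den = np.dot(ff, ff) + np.dot(bb, bb) + 1e-12   # guard against all-zero frames
        k = -2.0 * np.dot(ff, bb) / den                 # reflection coefficient
        f, b = ff + k * bb, bb + k * ff
        a = np.concatenate([a, [0.0]]) + k * np.concatenate([[0.0], a[::-1]])
        e *= 1.0 - k * k
    return a, e

def estimate_f0(frame, fs, order=10, f_lo=60.0, f_hi=400.0, f_max=3000.0, nfft=8192):
    """Pitch estimate: equalize the spectral envelope, then accumulate harmonic energy."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    a, _ = burg_ar(frame, order)                         # low-order AR ~ spectral envelope
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), nfft)) ** 2
    flat = spec * np.abs(np.fft.rfft(a, nfft)) ** 2      # multiply by |A(f)|^2 to flatten
    df = fs / nfft                                       # frequency step of the FFT grid
    f_max = min(f_max, fs / 2.0)
    cands = np.arange(f_lo, f_hi, 0.5)                   # candidate pitch values, Hz
    scores = np.empty(len(cands))
    for i, f0 in enumerate(cands):
        k = np.arange(1, int(f_max // f0) + 1)           # harmonic numbers below f_max
        peaks = flat[np.rint(k * f0 / df).astype(int)]
        valleys = flat[np.rint((k - 0.5) * f0 / df).astype(int)]
        scores[i] = np.sum(peaks - valleys) / np.sqrt(len(k))
    return float(cands[np.argmax(scores)])

def synth_voiced(fs=8000, period=64, n=512, snr_db=20.0, seed=0):
    """Hypothetical test frame: pulse train through a two-pole resonator plus white noise."""
    total = n + 4 * period                               # extra samples let the filter settle
    exc = np.zeros(total)
    exc[::period] = 1.0                                  # one pulse per pitch period
    r, theta = 0.95, 2.0 * np.pi * 700.0 / fs            # single resonance near 700 Hz
    c1, c2 = 2.0 * r * np.cos(theta), r * r
    y = np.zeros(total)
    y1 = y2 = 0.0
    for i in range(total):
        y0 = exc[i] + c1 * y1 - c2 * y2
        y[i] = y0
        y2, y1 = y1, y0
    clean = y[-n:]
    noise_std = np.sqrt(np.mean(clean ** 2) / 10.0 ** (snr_db / 10.0))
    rng = np.random.default_rng(seed)
    return clean + noise_std * rng.standard_normal(n)
```

Calling `estimate_f0(synth_voiced(), 8000)` runs the sketch on a frame whose true pitch is 8000/64 = 125 Hz. The AR order is kept well below the pitch period in samples so that the model follows the envelope rather than the individual harmonics, which loosely mirrors the "within a single period" level described in the abstract.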
Notes
Phoneme Training. An Information System for Phonetic Analysis and Learning of Speech, https://sites.google.com/site/frompldcreators/produkty-1/phonemetraining, accessed on May 25, 2022.
Additional information
Translated from Izmeritel’naya Tekhnika, No. 6, pp. 60–66, June, 2022.
About this article
Cite this article
Savchenko, A.V., Savchenko, V.V. Adaptive Method for Measuring a Fundamental Tone Frequency Using a Two-Level Autoregressive Model of Speech Signals. Meas Tech 65, 453–460 (2022). https://doi.org/10.1007/s11018-022-02104-6
DOI: https://doi.org/10.1007/s11018-022-02104-6