Method for Measuring the Intensity of the Flow of Vowel Sounds in Speech for Audiovisual Dialogue Information Systems

Published in Measurement Techniques

The interaction of the two modalities of an audiovisual information processing system is studied for the problem of evaluating the emotional state of users of dialogue information systems. To improve the accuracy of this evaluation in real time, it is proposed to use the audio modality to detect speech segments of heightened emotionality. The intensity of the flow of vowel sounds in the user's speech signal at the input of the information system serves as the indicator of the degree of speech emotionality. A method is developed for measuring this indicator from the empirical probability of the occurrence of vowel sounds in the user's speech signal. An example of a practical implementation of the method in soft real time is presented. A full-scale experiment using the authors' software was designed and carried out. The advantages of the proposed method are demonstrated: high speed of operation and high sensitivity to changes in the level of users' speech emotionality. The results are intended for developers of advanced information systems with an audiovisual user interface.
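
The full measurement procedure is given in the body of the article, which this preview does not show. As a rough illustration of the idea stated in the abstract (label speech frames as vowel-like, take the empirical probability of such frames over a sliding window, and scale it to a flow intensity), the following minimal Python sketch may help. The frame classifier based on energy and autocorrelation periodicity, the function names, and all thresholds are assumptions of this sketch, not the authors' published method.

```python
import numpy as np

def is_vowel_frame(frame, sr, energy_thresh=1e-4, periodicity_thresh=0.4):
    """Heuristic vowel detector (an assumption of this sketch, not the
    paper's method): vowels are voiced, so a frame is taken as a vowel
    candidate when it is both energetic and strongly periodic, i.e. it
    has a clear autocorrelation peak in the typical pitch range 60-400 Hz."""
    energy = np.mean(frame ** 2)
    if energy < energy_thresh:
        return False                      # too quiet: silence or faint noise
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / (ac[0] + 1e-12)             # normalize so ac[0] == 1
    lo, hi = int(sr / 400), min(int(sr / 60), len(ac))
    return ac[lo:hi].max() > periodicity_thresh

def vowel_flow_intensity(signal, sr, frame_ms=20, window_s=1.0):
    """Empirical probability of vowel frames over a sliding window,
    scaled by the frame rate to a flow intensity in vowel frames/second."""
    flen = int(sr * frame_ms / 1000)
    frames = [signal[i:i + flen] for i in range(0, len(signal) - flen, flen)]
    labels = np.array([is_vowel_frame(f, sr) for f in frames], dtype=float)
    win = max(1, int(window_s * 1000 / frame_ms))
    p_hat = np.convolve(labels, np.ones(win) / win, mode="same")
    return p_hat * (1000 / frame_ms)      # frame rate (1/s) * probability

# Toy check: 1 s of a periodic, vowel-like tone, then 1 s of unvoiced noise.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 120 * t) * (1 + 0.5 * np.sin(2 * np.pi * 500 * t))
noise = 0.05 * np.random.randn(sr)
rate = vowel_flow_intensity(np.concatenate([tone, noise]), sr)
print(rate[25], rate[-10])   # high flow inside the tone, near zero in the noise
```

A downstream system could threshold this per-frame rate to flag segments of heightened emotionality, which is the role the abstract assigns to the audio modality.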




Author information

Correspondence to A. V. Savchenko.

Additional information

Translated from Izmeritel’naya Tekhnika, No. 3, pp. 65–72, March, 2022.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Savchenko, A.V., Savchenko, V.V. Method for Measurement the Intensity of Speech Vowel Sounds Flow for Audiovisual Dialogue Information Systems. Meas Tech 65, 219–226 (2022). https://doi.org/10.1007/s11018-022-02072-x
