Abstract
Estimating the emotional content of speech remains a challenge for building robust human–machine interaction systems. The accuracy of emotion estimation depends on the corpus used for training and on the acoustic features employed to model the speech signal. Because emotion estimation is generally computationally expensive, there is a need to develop alternative techniques. In this paper, a low-complexity fractal-based technique is explored. Our hypothesis is that fractal analysis should provide better emotional content estimation because of the nonlinear nature of speech signals. The analysis involves two parameters: fractal dimension and loop area. The fractal dimension is computed using the Katz algorithm. Investigations with a GMM-based model show that the proposed technique identifies the emotional content of a given speech signal reliably and accurately. Further, the technique is robust in that it tolerates noise in the signal down to a signal-to-noise ratio of 10 dB, and the analysis shows that it is gender insensitive. The scope of the investigations presented here is limited to phoneme-level analysis, although the technique works efficiently with speech phrases as well.
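The abstract states that the fractal dimension is computed with the Katz algorithm. As a minimal sketch, the widely used discrete form of Katz's estimator for a uniformly sampled waveform can be written as below; the function name `katz_fd` and the uniform-sampling simplification (amplitude differences in place of full Euclidean arc length) are our assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def katz_fd(x):
    """Katz fractal dimension of a 1-D sampled waveform.

    FD = log10(n) / (log10(n) + log10(d / L)), where L is the total
    curve length, d is the maximum deviation from the first sample,
    and n is the number of steps along the curve.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - 1                        # number of steps in the curve
    # Total curve length: sum of absolute successive differences
    L = np.sum(np.abs(np.diff(x)))
    # Planar extent: farthest distance from the first sample
    d = np.max(np.abs(x - x[0]))
    if L == 0 or d == 0:                  # constant signal: treat as a line
        return 1.0
    return np.log10(n) / (np.log10(n) + np.log10(d / L))
```

A straight ramp yields a dimension of 1, while an irregular (e.g., noisy) waveform yields a value above 1, which is the property exploited when fractal features are used to separate emotional speech classes.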
Data Availability
The data generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Cite this article
Abrol, A., Kapoor, N. & Lehana, P.K. Fractal-Based Speech Analysis for Emotional Content Estimation. Circuits Syst Signal Process 40, 5632–5653 (2021). https://doi.org/10.1007/s00034-021-01737-2