Skip to main content
Log in

A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech

  • Published:
International Journal of Speech Technology Aims and scope Submit manuscript

Abstract

Early intervention for depression could provide a means to reducing the disease burden, but there is a lack of objective diagnostic methods. This study investigated automatic depression classification on a speech dataset of 85 healthy controls (51 females and 34 males) and 85 depressed patients (53 females and 32 males). Considering that there are obvious differences in the performance of different types of speech features, we propose a radius-incorporated localized multiple kernel learning (trLMKL) algorithm for detecting depression in speech to make the best use of speech features. To improve the classification accuracy, we combine the information of both the margin and the radius of the MEB to learn the gating model parameters in our algorithm. Furthermore, we do not directly incorporate the radius of the MEB, but incorporate the trace of the total scattering matrix of training data. This method can avoid the time cost of calculating the radius at each iteration and decrease the computational complexity. Comprehensive experiments were carried out on our depressed speech dataset and 10 UCI datasets. Our algorithm achieved better classification performance overall than SimpleMKL and LMKL, and it was efficient at detecting depression, indicating its potential for use as a diagnostic method for depression.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1

Similar content being viewed by others

Data availability

The data used to support the findings of this study are available from the corresponding author upon request.

References

  • Airas, M. (2008). TKK Aparat: An environment for voice inverse filtering and parameterization. Logopedics Phoniatrics Vocology, 33, 49–64.

    Article  Google Scholar 

  • Alghowinem, S., Goecke, R., Wagner, M., Epps, J., Gedeon, T., Breakspear, M., & Parker, G. (2013). A comparative study of different classifiers for detecting depression from spontaneous speech. In Proceedings of ICASSP 2013, (pp. 8022–8026). IEEE

  • Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning, 46, 31–159.

    Article  MATH  Google Scholar 

  • Chen, J., & Liu, Y. (2011). Locally linear embedding: A survey. Artificial Intelligence Review, 36, 29–48.

    Article  Google Scholar 

  • Chung, K. M., Kao, W. C., Sun, C. L., Wang, L. L., & Lin, C. J. (2003). Radius margin bounds for support vector machines with the RBF kernel. Neural Computation, 15, 2643–2681.

    Article  MATH  Google Scholar 

  • Cummins, N., Scherer, S., Krajewski, J., Schnieder, S., Epps, J., & Quatieri, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication, 71, 10–49.

    Article  Google Scholar 

  • Cummins, N., Epps, J., Sethu, V., & Krajewski, J. (2014). Variability compensation in small data: Oversampled extraction of I-vectors for the classification of depressed speech. In Proceedings of ICASSP 2014, (pp. 970–974). IEEE

  • Dua, D., & Karra Taniskidou, E. UCI machine learning repository. University of California, School of Information and Computer Science. Retrieved 2021, from http://archive.ics.uci.edu/ml.

  • Eyben, F., Wöllmer, M., & Schuller, B. (2010). Opensmile-The Munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM international conference on multimedia, (pp. 1459–1462). Association for Computing Machinery

  • Gönen, M., & Alpaydin, E. (2008). Localized multiple kernel learning. In Proceedings of the 5th international conference on machine learning, (pp. 352–359). Springer-Verlag

  • Gönen, M., & Alpaydın, E. (2013). Localized algorithms for multiple kernel learning. Pattern Recognition, 46, 795–807.

    Article  MATH  Google Scholar 

  • Hawton, K., Comabella, C. C. I., Haw, C., & Saunders, K. (2013). Risk factors for suicide in individuals with depression: A systematic review. Journal of Affective Disorders, 147, 17–28.

    Article  Google Scholar 

  • He, L., & Cao, C. (2018). Automated depression analysis using convolutional neural networks from speech. Journal of Biomedical Informatics, 83, 103–111.

    Article  Google Scholar 

  • Hu, M., Chen, Y., & Kwok, J. T. Y. (2009). Building sparse multiple kernel SVM classifiers. IEEE Transactions on Neural Networks, 20, 827–839.

    Article  Google Scholar 

  • Huang, K. Y., Wu, C. H., Su, M. H., & Kuo, Y. T. (2020). Detecting unipolar and bipolar depressive disorders from elicited speech responses using latent affective structure model. IEEE Transcactions on Affective Computing, 11, 393–404.

    Article  Google Scholar 

  • Jiang, H. H., Hu, B., Liu, Z. Y., Wang, G., Zhang, L., Li, X. Y., & Kang, H. Y. (2018). Detecting depression using an ensemble logistic regression model based on multiple speech features. Computational and Mathematical Method, 9, 1–9.

    MATH  Google Scholar 

  • Jiang, H. H., Hu, B., Liu, Z. Y., Yan, L. H., Wang, T. Y., Liu, F., Kang, H. Y., & Li, X. Y. (2017). Investigation of different speech types and emotions for detecting depression using different classifiers. Speech Communication, 90, 39–46.

    Article  Google Scholar 

  • Liu, X. W., Wang, L., Yin, J. P., Zhu, E., & Zhang, J. (2013). An efficient approach to integrating radius information into multiple kernel learning. IEEE Transactions on Cybernetics., 43, 557–569.

    Article  Google Scholar 

  • Low, L. A., Maddage, N. C., Lech, M., Sheeber, L. B., & Allen, N. B. (2011). Detection of clinical depression in adolescents’ speech during family interactions. IEEE Transactions on Bio-Medical Engineering, 58, 574–586.

    Article  Google Scholar 

  • Moore, E., Clements, M., Peifer, J. W., & Weisser, L. (2008). Critical analysis of the impact of glottal features in the classification of clinical depression in speech. IEEE Transactions on Bio-Medical Engineering, 55, 96–107.

    Article  Google Scholar 

  • Nolenhoeksema, S., & Girgus, J. S. (1994). The emergence of gender differences in depression during adolescence. Psychological Bulletin, 115, 424–443.

    Article  Google Scholar 

  • Ooi, K. E. B., Lech, M., & Allen, N. B. (2014). Prediction of major depression in adolescents using an optimized multi-channel weighted speech classification system. Biomedical Signal Processing, 14, 228–239.

    Article  Google Scholar 

  • Rakotomamonjy, A., Bach, F., Grandvalet, Y., & Canu, S. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.

    MathSciNet  MATH  Google Scholar 

  • Scherer, S., Stratou, G., Gratch, J., & Morency, L. P. (2013). Investigating voice quality as a speaker-independent indicator of depression and PTSD. In Proceedings of Interspeech, 2013, (pp. 847–851). ISCA

    Google Scholar 

  • Sobin, C., & Sackeim, H. A. (1997). Psychomotor symptoms of depression. American Journal of Psychiatry., 154, 4–17.

    Article  Google Scholar 

  • Wang, L. (2008). Feature selection with kernel class separability. IEEE Transactions on Pattern Analysis, 30, 1534–1546.

    Article  Google Scholar 

  • World Health Organization. (2021, September 13). Depression fact sheet. WHO, Geneva, Switzerland. Retrieved January 27, 2022, from http://www.who.int/en/news-room/fact-sheets/detail/depression.

  • Xu, X., Tsang, I. W., & Xu, D. (2013). Soft margin multiple kernel learning. IEEE Transactions on Neural Networks, 24, 749–761.

    Article  Google Scholar 

  • Xu, Z., Jin, R., Yang, H., King, I., & Lyu, M. R. (2010). Simple and efficient multiple kernel learning by group Lasso. In Proceedings of the 27th international conference on machine learning, (pp. 1175–1182). Omnipress

  • Zhao, Z., Bao, Z., Zhang, Z., Cummins, N., & Schuller, B. (2020). Hierarchical attention transfer networks for depression assessment from speech. In Proceedings of ICASSP 2020, (pp. 7159–7163). IEEE

Download references

Acknowledgements

This work was supported by the National Basic Research Program of China (973 Program) (No.2014CB744600).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Haihua Jiang or Bin Hu.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Jiang, H., Hu, B., Liu, Z. et al. A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech. Int J Speech Technol 26, 371–378 (2023). https://doi.org/10.1007/s10772-023-10017-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10772-023-10017-0

Keywords

Navigation