Abstract
Early intervention for depression could help reduce the disease burden, but objective diagnostic methods are lacking. This study investigated automatic depression classification on a speech dataset of 85 healthy controls (51 females and 34 males) and 85 depressed patients (53 females and 32 males). Because different types of speech features differ markedly in classification performance, we propose a radius-incorporated localized multiple kernel learning (trLMKL) algorithm that makes the best use of these features for detecting depression in speech. To improve classification accuracy, the algorithm combines the margin with the radius of the minimum enclosing ball (MEB) when learning the gating model parameters. Rather than incorporating the radius of the MEB directly, we incorporate the trace of the total scattering matrix of the training data. This avoids recalculating the radius at each iteration and so reduces the computational complexity. Comprehensive experiments were carried out on our depressed speech dataset and on 10 UCI datasets. Overall, our algorithm achieved better classification performance than SimpleMKL and LMKL, and it was efficient at detecting depression, indicating its potential for use as a diagnostic method for depression.
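The substitution described in the abstract can be illustrated with a standard identity: for a kernel matrix K over n training samples, the trace of the total scattering matrix in the induced feature space equals tr(K) − (1/n) Σ_ij K_ij, so it can be read off the kernel matrix without solving for the MEB radius. The following is a minimal sketch of that identity only, not the trLMKL algorithm itself; the function names `total_scatter_trace` and `rbf_kernel`, the synthetic data, and the `gamma` value are assumptions made for illustration.

```python
import numpy as np


def total_scatter_trace(K):
    """Trace of the total scatter matrix in the feature space induced by K.

    Uses tr(S_T) = sum_i ||phi(x_i) - mean||^2 = tr(K) - (1/n) * sum_ij K_ij.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    return np.trace(K) - K.sum() / n


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix; gamma is an illustrative choice."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


# Synthetic data standing in for speech feature vectors (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# For a linear kernel the feature map is the identity, so the trace can be
# checked against the explicit sum of squared distances to the mean.
K_lin = X @ X.T
explicit = np.sum((X - X.mean(axis=0)) ** 2)
trace_lin = total_scatter_trace(K_lin)

# For an RBF kernel the feature map is implicit; only K is needed.
trace_rbf = total_scatter_trace(rbf_kernel(X, gamma=0.5))
```

Because the quantity depends only on sums over the kernel matrix, it costs O(n²) per kernel and involves no inner optimization, which is consistent with the efficiency argument made in the abstract.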
Data availability
The data used to support the findings of this study are available from the corresponding author upon request.
References
Airas, M. (2008). TKK Aparat: An environment for voice inverse filtering and parameterization. Logopedics Phoniatrics Vocology, 33, 49–64.
Alghowinem, S., Goecke, R., Wagner, M., Epps, J., Gedeon, T., Breakspear, M., & Parker, G. (2013). A comparative study of different classifiers for detecting depression from spontaneous speech. In Proceedings of ICASSP 2013 (pp. 8022–8026). IEEE.
Chapelle, O., Vapnik, V., Bousquet, O., & Mukherjee, S. (2002). Choosing multiple parameters for support vector machines. Machine Learning, 46, 131–159.
Chen, J., & Liu, Y. (2011). Locally linear embedding: A survey. Artificial Intelligence Review, 36, 29–48.
Chung, K. M., Kao, W. C., Sun, C. L., Wang, L. L., & Lin, C. J. (2003). Radius margin bounds for support vector machines with the RBF kernel. Neural Computation, 15, 2643–2681.
Cummins, N., Epps, J., Sethu, V., & Krajewski, J. (2014). Variability compensation in small data: Oversampled extraction of I-vectors for the classification of depressed speech. In Proceedings of ICASSP 2014 (pp. 970–974). IEEE.
Cummins, N., Scherer, S., Krajewski, J., Schnieder, S., Epps, J., & Quatieri, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Communication, 71, 10–49.
Dua, D., & Karra Taniskidou, E. UCI machine learning repository. University of California, School of Information and Computer Science. Retrieved 2021, from http://archive.ics.uci.edu/ml.
Eyben, F., Wöllmer, M., & Schuller, B. (2010). openSMILE: The Munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM international conference on multimedia (pp. 1459–1462). Association for Computing Machinery.
Gönen, M., & Alpaydın, E. (2008). Localized multiple kernel learning. In Proceedings of the 25th international conference on machine learning (pp. 352–359). Springer-Verlag.
Gönen, M., & Alpaydın, E. (2013). Localized algorithms for multiple kernel learning. Pattern Recognition, 46, 795–807.
Hawton, K., Comabella, C. C. I., Haw, C., & Saunders, K. (2013). Risk factors for suicide in individuals with depression: A systematic review. Journal of Affective Disorders, 147, 17–28.
He, L., & Cao, C. (2018). Automated depression analysis using convolutional neural networks from speech. Journal of Biomedical Informatics, 83, 103–111.
Hu, M., Chen, Y., & Kwok, J. T. Y. (2009). Building sparse multiple kernel SVM classifiers. IEEE Transactions on Neural Networks, 20, 827–839.
Huang, K. Y., Wu, C. H., Su, M. H., & Kuo, Y. T. (2020). Detecting unipolar and bipolar depressive disorders from elicited speech responses using latent affective structure model. IEEE Transactions on Affective Computing, 11, 393–404.
Jiang, H. H., Hu, B., Liu, Z. Y., Wang, G., Zhang, L., Li, X. Y., & Kang, H. Y. (2018). Detecting depression using an ensemble logistic regression model based on multiple speech features. Computational and Mathematical Methods in Medicine, 9, 1–9.
Jiang, H. H., Hu, B., Liu, Z. Y., Yan, L. H., Wang, T. Y., Liu, F., Kang, H. Y., & Li, X. Y. (2017). Investigation of different speech types and emotions for detecting depression using different classifiers. Speech Communication, 90, 39–46.
Liu, X. W., Wang, L., Yin, J. P., Zhu, E., & Zhang, J. (2013). An efficient approach to integrating radius information into multiple kernel learning. IEEE Transactions on Cybernetics, 43, 557–569.
Low, L. A., Maddage, N. C., Lech, M., Sheeber, L. B., & Allen, N. B. (2011). Detection of clinical depression in adolescents’ speech during family interactions. IEEE Transactions on Biomedical Engineering, 58, 574–586.
Moore, E., Clements, M., Peifer, J. W., & Weisser, L. (2008). Critical analysis of the impact of glottal features in the classification of clinical depression in speech. IEEE Transactions on Biomedical Engineering, 55, 96–107.
Nolen-Hoeksema, S., & Girgus, J. S. (1994). The emergence of gender differences in depression during adolescence. Psychological Bulletin, 115, 424–443.
Ooi, K. E. B., Lech, M., & Allen, N. B. (2014). Prediction of major depression in adolescents using an optimized multi-channel weighted speech classification system. Biomedical Signal Processing and Control, 14, 228–239.
Rakotomamonjy, A., Bach, F., Grandvalet, Y., & Canu, S. (2008). SimpleMKL. Journal of Machine Learning Research, 9, 2491–2521.
Scherer, S., Stratou, G., Gratch, J., & Morency, L. P. (2013). Investigating voice quality as a speaker-independent indicator of depression and PTSD. In Proceedings of Interspeech 2013 (pp. 847–851). ISCA.
Sobin, C., & Sackeim, H. A. (1997). Psychomotor symptoms of depression. American Journal of Psychiatry, 154, 4–17.
Wang, L. (2008). Feature selection with kernel class separability. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 1534–1546.
World Health Organization. (2021, September 13). Depression fact sheet. WHO, Geneva, Switzerland. Retrieved January 27, 2022, from http://www.who.int/en/news-room/fact-sheets/detail/depression.
Xu, X., Tsang, I. W., & Xu, D. (2013). Soft margin multiple kernel learning. IEEE Transactions on Neural Networks and Learning Systems, 24, 749–761.
Xu, Z., Jin, R., Yang, H., King, I., & Lyu, M. R. (2010). Simple and efficient multiple kernel learning by group Lasso. In Proceedings of the 27th international conference on machine learning (pp. 1175–1182). Omnipress.
Zhao, Z., Bao, Z., Zhang, Z., Cummins, N., & Schuller, B. (2020). Hierarchical attention transfer networks for depression assessment from speech. In Proceedings of ICASSP 2020 (pp. 7159–7163). IEEE.
Acknowledgements
This work was supported by the National Basic Research Program of China (973 Program) (No. 2014CB744600).
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
About this article
Cite this article
Jiang, H., Hu, B., Liu, Z. et al. A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech. Int J Speech Technol 26, 371–378 (2023). https://doi.org/10.1007/s10772-023-10017-0
DOI: https://doi.org/10.1007/s10772-023-10017-0