Fisher Discriminative Embedding Low-Rank Sparse Representation for Music Genre Classification

Cai, Xin; Zhang, Hongjuan

doi:10.1007/s00034-024-02696-0

Published: 14 May 2024

Circuits, Systems, and Signal Processing (2024)

Abstract

This work focuses on a music genre classification method based on sparse low-rank representation. Sparse low-rank representation is an effective approach to learning classifiers: it learns a row-sparse, low-rank representation matrix that suppresses noise and identifies subspace structures in data contaminated by outliers. However, existing methods of this kind fail to exploit the rich discriminative supervision information available in the training samples. To address this issue, a novel Fisher Discriminative Embedding Low-Rank Sparse Representation (FDLRSR) classification algorithm is proposed based on the Fisher criterion, which yields representation coefficients with stronger intra-class similarity and inter-class separability. Two special cases, the Fisher Discriminative Embedding Low-Rank Representation (FDLR) and the Fisher Discriminative Embedding Sparse Representation (FDSR), are also presented in this work. Specifically, the proposed classification method couples the FDLRSR algorithm with feature combinations consisting of acoustic features and spectral features, and classifies music genres by minimizing the residuals. Compared with several state-of-the-art music genre classification methods, the proposed methods substantially improve the classification results on three widely used datasets, GTZAN, ISMIR2004 and Homburg, with the highest classification accuracies of 97.9% and 99.43%, which verifies their effectiveness and availability.
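
The abstract states that genres are assigned by minimizing residuals but does not spell out the procedure. The following is a minimal sketch of that general idea, not the FDLRSR algorithm itself: a test vector is coded over all training samples, and the predicted class is the one whose training samples alone reconstruct it with the smallest residual. The ridge penalty, the function name `classify_by_residual`, and the parameter `lam` are assumptions made for illustration; the paper's low-rank, sparse, and Fisher terms are not reproduced here.

```python
import numpy as np

def classify_by_residual(D, labels, y, lam=0.1):
    """Assign y to the class whose training columns reconstruct it best.

    D      : (d, n) matrix whose columns are training feature vectors
    labels : (n,) integer class label of each column of D
    y      : (d,) test feature vector
    lam    : ridge weight (assumption; stands in for the paper's regularizers)
    """
    n = D.shape[1]
    # Ridge-regularized coefficients: solve (D^T D + lam I) z = D^T y
    z = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    classes = np.unique(labels)
    # Residual when only the coefficients belonging to class c are kept
    residuals = np.array([
        np.linalg.norm(y - D[:, labels == c] @ z[labels == c])
        for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

# Toy usage: two classes spanning orthogonal 2-D subspaces of R^4
D = np.eye(4)
labels = np.array([0, 0, 1, 1])
pred, res = classify_by_residual(D, labels, np.array([0.9, 0.3, 0.0, 0.0]))
print(pred, res)
```

In this sketch the coding step is the only part that would change between methods: swapping the ridge solve for a sparse, low-rank, or discriminatively regularized solver leaves the residual-based decision rule untouched.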

Data Availability

The data that support the findings of this study are openly available: the GTZAN dataset (http://marsyas.info/downloads/datasets.html), reference number [51]; the ISMIR2004 dataset (https://ismir2004.ismir.net/genre_contest/index.html), reference number [6]; and the Homburg dataset (https://www-ai.cs.tu-dortmund.de/audio.html), reference number [21].

References

S. Allamy, A.L. Koerich, 1d CNN architectures for music genre classification. CoRR, arXiv:2105.07302, (2021)
B.E. Boser, I.M. Guyon, V.N. Vapnik, A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152, (1992)
S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein et al., Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends® Mach. Learn. 3(1), 1–122 (2011)
J.F. Cai, E.J. Candès, Z. Shen, A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
E.J. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis. J. ACM (JACM) 58(3), 1–37 (2011)
P. Cano, E. Gómez, F. Gouyon, P. Herrera, M. Koppenberger, B. Ong, X. Serra, S. Streich, N. Wack, ISMIR 2004 audio description contest. Tech. Report. Music Technol. Group, Barcelona, Spain 01, 2006 (2004)
J. Chaki, Pattern analysis based acoustic signal processing: a survey of the state-of-art. Int. J. Speech Technol. (2020). https://doi.org/10.1007/s10772-020-09681-3
S.S. Chen, D.L. Donoho, M.A. Saunders, Atomic decomposition by basis pursuit. SIAM Rev. 43(1), 129–159 (2001)
Z. Chen, X.-J. Wu, J. Kittler, Low-rank discriminative least squares regression for image classification. Signal Process. 173, 107485 (2020)
D.C. Corrèa, F.A. Rodrigues, A survey on symbolic data-based music genre classification. Expert Syst. Appl. 60, 190–210 (2016)
Y.M.G. Costa, L.S. Oliveira, A.L. Koerich, F. Gouyon, Music genre recognition using spectrograms. In 2011 18th International Conference on Systems, Signals and Image Processing, pages 1–4, (07 2011)
Y.M.G. Costa, L.S. Oliveira, A.L. Koerich, F. Gouyon, Music genre recognition using gabor filters and lpq texture descriptors. Prog. Pattern Recognit. Image Anal. Comput. Vis. and Appl. 8259, 67–74 (2013)
Y.M.G. Costa, L.S. Oliveira, A.L. Koerich, F. Gouyon, J.G. Martins, Music genre classification using lbp textural features. Signal Process. 92(11), 2723–2737 (2012)
T. Cover, P. Hart, Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13(1), 21–27 (1967)
H. Du, Y. Wang, F. Zhang, Y. Zhou, Low-rank discriminative adaptive graph preserving subspace learning. Neural Process. Lett. 52(3), 2127–2149 (2020)
A. Elbir, N. Aydin, Music genre classification and music recommendation by using deep learning. Electron. Lett. 56(12), 627–629 (2020)
Z. Fu, G. Lu, K.M. Ting, D. Zhang, A survey of audio-based music classification and annotation. IEEE Trans. Multimedia 13(2), 303–319 (2011)
Z. Fu, G. Lu, K.M. Ting, D. Zhang, On feature combination for music classification. Struct. Synt. Stat. Pattern Recognit. (2010). https://doi.org/10.1007/978-3-642-14980-1_44
Y.F. Guo, S.J. Li, J.Y. Yang, T.T. Shu, L.D. Wu, A generalized Foley-Sammon transform based on generalized Fisher discriminant criterion and its application to face recognition. Pattern Recogn. Lett. 24(1–3), 147–158 (2003)
N. Han, J. Wu, Y. Liang, X. Fang, W.K. Wong, S. Teng, Low-rank and sparse embedding for dimensionality reduction. Neural Netw. 108, 202–216 (2018)
H. Homburg, I. Mierswa, B. Möller, K. Morik, M. Wurst, A benchmark dataset for audio classification and clustering. In ISMIR 2005, 528–531 (2005)
C.-H. Lee, J.-L. Shih, K.-M. Yu, H.-S. Lin, Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features. IEEE Trans. Multimedia 11, 670–682 (2009)
A. Li, D. Chen, Z. Wu, G. Sun, K. Lin, Self-supervised sparse coding scheme for image classification based on low rank representation. PLoS ONE 13(6), e0199141 (2018)
H. Li, T. Jiang, K. Zhang, Efficient and robust feature extraction by maximum margin criterion. IEEE Trans. Neural Netw. 17(1), 157–165 (2006)
T. Li, M. Ogihara, Toward intelligent music information retrieval. IEEE Trans. Multimedia 8(3), 564–574 (2006)
T.L. Li, A.B. Chan, Genre classification and the invariance of MFCC features to key and tempo. In International Conference on MultiMedia Modeling, pages 317–327. Springer (2011)
T. Lidy, A. Rauber, Evaluation of feature extractors and psycho-acoustic transformations for music genre classification. In Proceedings of the Sixth International Conference on Music Information Retrieval (ISMIR 2005), pages 34–41, September 11-15 (2005)
S. Lim, J. Lee, S. Jang, S. Lee, M.Y. Kim, Music-genre classification system based on spectro-temporal features and feature selection. IEEE Trans. Consum. Electron. 58(4), 1262–1268 (2012)
Z. Lin, M. Chen, Y. Ma, The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055, (2010)
Z. Lin, R. Liu, Z. Su, Linearized alternating direction method with adaptive penalty for low-rank representation. arXiv preprint arXiv:1109.0367, (2011)
C. Liu, L. Feng, G. Liu, H. Wang, S. Liu, Bottom-up broadcast neural network for music genre classification. Multimed. Tools Appl. 80(5), 7313–7331 (2021)
G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, Y. Ma, Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171–184 (2012)
G. Liu, Z. Lin, Y. Yu, Robust subspace segmentation by low-rank representation. In Proceedings of the 27th International Conference on Machine Learning, number 8 in ICML’10, pages 663–670, Madison, WI, USA, (2010). Omnipress
C. Lu, A Library of ADMM for Sparse and Low-rank Optimization. National University of Singapore, (June 2016). https://github.com/canyilu/LibADMM
C. Lu, J. Feng, S. Yan, Z. Lin, A unified alternating direction method of multipliers by majorization minimization. IEEE Trans. Pattern Anal. Mach. Intell. 40(3), 527–541 (2017)
T. Luo, Y. Yang, D. Yi, J. Ye, Robust discriminative feature learning with calibrated data reconstruction and sparse low-rank model. Appl. Intell. (2017). https://doi.org/10.1007/s10489-017-1060-7
L. Ma, C. Wang, B. Xiao, W. Zhou, Sparse representation for face recognition based on discriminative low-rank dictionary learning. In 2012 IEEE conference on computer vision and pattern recognition, pages 2586–2593, (2012)
D. Mitrović, M. Zeppelzauer, C. Breiteneder, Features for content-based audio retrieval. In Adv. Comput. Improv. Web 78, 71–150 (2010)
L. Nanni, Y.M.G. Costa, D.R. Lucio, C.N. Silla, S. Brahnam, Combining visual and acoustic features for audio classification tasks. Pattern Recogn. Lett. 88, 49–56 (2017)
L. Nanni, Y.M.G. Costa, A. Lumini, M.Y. Kim, S.R. Baek, Combining visual and acoustic features for music genre classification. Expert Syst. Appl. 45, 108–117 (2016)
R. Nosaka, C.H. Suryanto, K. Fukui, Rotation invariant co-occurrence among adjacent lbps. In Jong-Il Park and Junmo Kim, editors, Computer Vision - ACCV 2012 Workshops, pages 15–25, (2013)
T. Ojala, M. Pietikainen, T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
V. Ojansivu, J. Heikkilä, Blur insensitive texture classification using local phase quantization. In Abderrahim Elmoataz, Olivier Lezoray, Fathallah Nouboud, and Driss Mammass, editors, Image and Signal Processing, pages 236–243, (2008)
Y. Panagakis, C. Kotropoulos, Music genre classification via topology preserving non-negative tensor factorization and sparse representations. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 249–252, (2010)
Y. Panagakis, C.L. Kotropoulos, G.R. Arce, Music genre classification via joint sparse low-rank representation of audio features. IEEE/ACM Trans. Audio, Speech, Lang. Process. 22(12), 1905–1917 (2014)
L. Qiu, S. Li, Y. Sung, 3D-DCDAE: Unsupervised music latent representations learning method based on a deep 3d convolutional denoising autoencoder for music genre classification. Mathematics 9(18), 2274 (2021)
L. Qiu, S. Li, Y. Sung, DBTMPE: Deep bidirectional transformers-based masked predictive encoder approach for music genre classification. Mathematics 9(5), 530 (2021)
A. Schindler, A. Rauber, An audio-visual approach to music genre classification through affective color features. In Allan Hanbury, Gabriella Kazai, Andreas Rauber, and Norbert Fuhr, editors, Advances in Information Retrieval, pages 61–67, (04 2015)
F. Song, D. Zhang, D. Mei, Z. Guo, A multiple maximum scatter difference discriminant criterion for facial feature extraction. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 37(6), 1599–1606 (2007)
D.G. Stork, R.O. Duda, P.E. Hart, Pattern classification (A Wiley-Interscience Publication, Hoboken, 2001)
G. Tzanetakis, P. Cook, Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing 10(5), 293–302 (2002)
E. Van Den Berg, M.P. Friedlander, Probing the pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31(2), 890–912 (2009)
T.H. Vu, V. Monga, Fast low-rank shared dictionary learning for image classification. IEEE Trans. Image Process. 26(11), 5160–5175 (2017)
H. Wang, S. Yan, D. Xu, X. Tang, T. Huang, Trace ratio vs. ratio trace for dimensionality reduction. In 2007 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, (2007)
Z. Wen, B. Hou, L. Jiao, Discriminative dictionary learning with two-level low rank and group sparse decomposition for image classification. IEEE trans. cybern. 47(11), 3758–3771 (2017)
J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, Y. Ma, Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)
M. Wu, Z. Chen, J.R. Jang, J. Ren, Y. Li, C. Lu, Combining visual and acoustic features for music genre classification. In 2011 10th International Conference on Machine Learning and Applications and Workshops, volume 2, pages 124–129, (2011)
H. Xu, C. Caramanis, S. Sanghavi, Robust PCA via outlier pursuit. IEEE Trans. Inf. Theory 58(5), 3047–3064 (2012)
Y. Xu, W. Zhou, A deep music genres classification model based on cnn with squeeze & excitation block. In 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pages 332–338, (2020)
B.Q. Yang, X.P. Guan, J.W. Zhu, C.C. Gu, K.J. Wu, J.J. Xu, SVMs multi-class loss feedback based discriminative dictionary learning for image classification. Pattern Recogn. 112, 107690 (2021)
H. Yang, W.Q. Zhang, Music genre classification using duplicated convolutional layers in neural networks. In Interspeech, pages 3382–3386, (2019)
J. Yang, X. Yuan, Linearized augmented lagrangian and alternating direction methods for nuclear norm minimization. Math. Comput. 82(281), 301–329 (2013)
M. Yang, L. Zhang, X. Feng, D. Zhang, Sparse representation based fisher discrimination dictionary learning for image classification. Int. J. Comput. Vision 109(3), 209–232 (2014)
J. Ylioinas, A. Hadid, Y. Guo, M. Pietikäinen, Efficient image appearance description using dense sampling based local binary patterns. In Kyoung Mu Lee, Yasuyuki Matsushita, James M. Rehg, and Zhanyi Hu, editors, Computer Vision – ACCV 2012, pages 375–388, (2013)
Y. Yu, S. Luo, S. Liu, H. Qiao, Y. Liu, L. Feng, Deep attention based music genre classification. Neurocomputing 372, 84–91 (2020)
Y. Zhang, Z. Jiang, L.S. Davis, Learning structured low-rank representations for image classification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 676–683, (2013)
G. Zhao, T. Ahonen, J. Matas, M. Pietikainen, Rotation-invariant image and video description with local binary pattern features. IEEE Trans. Image Process. 21(4), 1465–1477 (2012)
L. Zhuang, H. Gao, Z. Lin, Y. Ma, X. Zhang, N. Yu, Non-negative low rank and sparse graph for semi-supervised learning. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 2328–2335, (2012)

Acknowledgements

We thank all the referees and the editorial board members for their insightful comments and suggestions, which improved our paper significantly. This study was funded by the National Natural Science Foundation of China under Grant No. 11501351.

Author information

Authors and Affiliations

Department of Mathematics, Shanghai University, Shanghai, 200444, People’s Republic of China
Xin Cai & Hongjuan Zhang
Newtouch Center for Mathematics, Shanghai University, Shanghai, 200444, People’s Republic of China
Hongjuan Zhang

Corresponding author

Correspondence to Hongjuan Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cai, X., Zhang, H. Fisher Discriminative Embedding Low-Rank Sparse Representation for Music Genre Classification. Circuits Syst Signal Process (2024). https://doi.org/10.1007/s00034-024-02696-0

Received: 07 April 2023
Revised: 09 April 2024
Accepted: 10 April 2024
Published: 14 May 2024
DOI: https://doi.org/10.1007/s00034-024-02696-0
