Music similarity model based on CRP fusion and Multi-Kernel Integration

  • Yanlan Fan
  • Ning Chen


A music similarity model that combines Cross Recurrence Plot (CRP) fusion and Multi-Kernel Integration (MKI) is proposed for the Cover Song Identification (CSI) task. First, two complementary descriptors are extracted from each track: the Harmonic Pitch Class Profile (HPCP), which represents the harmonic progression, and MeLoDy (MLD), which describes the melody evolution. Then, for each pair of tracks, one CRP is constructed from the HPCP descriptors and another from the MLD descriptors. To exploit the complementarity between the recurrence properties captured by the HPCP-based and MLD-based CRPs, a nonlinear graph fusion technique is used to combine them into a single fused CRP. Next, recurrence quantification analysis is applied to the fused CRP to obtain a similarity score. Finally, MKI is introduced to refine this similarity score so that it adapts to the diverse statistical characteristics of the tracks. Experimental results on three cover song datasets show that the proposed model outperforms state-of-the-art CSI schemes based on a single similarity function or on similarity fusion, in terms of both identification accuracy and classification accuracy. The proposed model can also be adapted to other tasks, such as image classification, visual object tracking, and drug taxonomy.
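As a rough illustration of the CRP construction and recurrence quantification steps described above, the following Python sketch builds a binary CRP between two feature sequences and scores it by its longest diagonal run. It is not the authors' implementation: the function names, the mutual nearest-neighbour thresholding with the fraction `kappa`, and the longest-diagonal score are simplifying assumptions made here for illustration, and the nonlinear graph fusion and MKI steps are not shown.

```python
import numpy as np


def cross_recurrence_plot(x, y, kappa=0.1):
    """Binary CRP between feature sequences x (n, d) and y (m, d).

    Assumption: R[i, j] = 1 when x[i] and y[j] are mutual neighbours,
    i.e. their distance is within the kappa-quantile of both row i
    and column j of the pairwise distance matrix.
    """
    # Pairwise Euclidean distances, shape (n, m).
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    row_thr = np.quantile(dist, kappa, axis=1, keepdims=True)  # (n, 1)
    col_thr = np.quantile(dist, kappa, axis=0, keepdims=True)  # (1, m)
    return ((dist <= row_thr) & (dist <= col_thr)).astype(np.uint8)


def longest_diagonal(crp):
    """Length of the longest unbroken diagonal run of ones in the CRP.

    A simplified stand-in for a recurrence quantification measure:
    long diagonals indicate stretches where both tracks evolve alike.
    """
    n, m = crp.shape
    run = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(n):
        for j in range(m):
            if crp[i, j]:
                run[i + 1, j + 1] = run[i, j] + 1
    return int(run.max())


# Toy usage: y is an exact excerpt of x, so a 10-frame diagonal appears.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 12))  # e.g. 20 frames of a 12-bin descriptor
y = x[5:15]
score = longest_diagonal(cross_recurrence_plot(x, y))
```

In a pipeline like the one outlined in the abstract, one such plot would be computed per descriptor (HPCP and MLD) and the two plots fused before quantification; the sketch stops short of that step.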


Keywords: Similarity fusion; Cross Recurrence Plot (CRP); Nonlinear graph fusion; Multi-Kernel Integration (MKI)




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
