Multimedia Tools and Applications, Volume 75, Issue 24, pp 16905–16922

Landmark-based music recognition system optimisation using genetic algorithms

  • Salvador Gutiérrez
  • Salvador García


Audio fingerprinting makes it possible to label an unidentified music fragment by matching it against a previously generated database. Spectral landmarks are used to make the fingerprint robust, so that a query can still be identified when a certain level of noise is present in the audio. This family of audio identification algorithms has several configuration parameters whose values are usually chosen from the researcher's knowledge, previously published experiments, or plain trial and error. In this paper we describe the complete optimisation process of a Landmark-based Music Recognition System using genetic algorithms. We encode the structure of the algorithm as a chromosome by mapping its highly relevant parameters to genes, and we build an appropriate fitness evaluation method. The optimised parameters are then used to set up a complete system, which is compared with a non-optimised one under an evaluation model designed to be unbiased.
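The chromosome encoding described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the gene names and ranges (`fft_window`, `peaks_per_frame`, `target_zone_width`), the toy fitness function, and the plain elitist genetic algorithm, which may differ from the variant and operators the authors actually use. In a real setting the fitness would be the recognition accuracy of the fingerprinting system configured with the candidate parameters.

```python
import random

# Hypothetical genes as (name, lower bound, upper bound). These names and
# ranges are illustrative assumptions, not the parameters tuned in the paper.
GENES = [
    ("fft_window", 256, 4096),
    ("peaks_per_frame", 1, 10),
    ("target_zone_width", 8, 128),
]


def random_chromosome(rng):
    """Draw one integer value per gene, uniformly within its bounds."""
    return [rng.randint(lo, hi) for _, lo, hi in GENES]


def toy_fitness(chromosome):
    """Stand-in for recognition accuracy (assumption): the negative squared
    distance, normalised per gene, to an arbitrary 'best' configuration."""
    optimum = [1024, 5, 64]
    return -sum(
        ((g - o) / (hi - lo)) ** 2
        for g, o, (_, lo, hi) in zip(chromosome, optimum, GENES)
    )


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return [rng.choice(pair) for pair in zip(a, b)]


def mutate(chromosome, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    return [
        rng.randint(lo, hi) if rng.random() < rate else g
        for g, (_, lo, hi) in zip(chromosome, GENES)
    ]


def optimise(fitness, generations=50, pop_size=20, seed=0):
    """Elitist GA: keep the best half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = optimise(toy_fitness)
    print(dict(zip((name for name, _, _ in GENES), best)))
```

Because the best half of each generation survives unchanged, the best fitness never decreases from one generation to the next. The cost of a real run is dominated by the fitness evaluation, since every candidate requires building and querying a fingerprint database.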


Music recognition · Genetic algorithms · Audio fingerprinting · Parameter optimisation



This work is supported by the research project TIN2014-57251-P. The authors are very grateful to the anonymous reviewers for their valuable suggestions and comments to improve the quality of this paper.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Instituto de Ciencias de la Vid y del Vino, University of La Rioja, CSIC, Gobierno de La Rioja, Logroño, Spain
  2. Department of Computer Science and Artificial Intelligence, CITIC-UGR Research Center on Information and Communications Technology, University of Granada, ETSII, Granada, Spain
  3. Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
