Multimedia Systems, Volume 17, Issue 4, pp 313–326

M-MUSICS: an intelligent mobile music retrieval system

Regular Paper


Accurate voice humming transcription and efficient indexing and retrieval schemes are essential to a large-scale humming-based audio retrieval system. Although much research has been done to develop such schemes, their performance in terms of precision, recall, and F-measure across similarity metrics is still unsatisfactory. In this paper, we propose a new voice query transcription scheme that combines note onset detection using dynamic threshold methods, fundamental frequency (F0) acquisition for each frame, and frequency realignment using K-means clustering. For indexing, we use a popularity-adaptive structure called the frequently accessed index (FAI), built from frequently queried tunes. In addition, we propose a semi-supervised relevance feedback and query reformulation scheme based on a genetic algorithm to improve retrieval efficiency. We extend these techniques to mobile multimedia environments and develop a mobile audio retrieval system. Experiments show that our system performs satisfactorily in wireless mobile multimedia environments.
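To illustrate the frequency-realignment step mentioned above, the following is a minimal sketch of snapping noisy per-frame F0 estimates to note frequencies with 1-D K-means. It is not the paper's implementation: the function names, the deterministic initialization, and the example frequencies are all illustrative assumptions.

```python
# Hypothetical sketch of K-means frequency realignment: frame-level F0
# estimates (Hz) are clustered, and each frame's pitch is replaced by
# its cluster centroid, smoothing jitter into stable note frequencies.
# Names, initialization, and parameters are illustrative, not from the paper.

def kmeans_1d(values, k, iters=50):
    """Simple 1-D K-means; returns (centroids, per-value cluster labels)."""
    # Deterministic init: spread initial centroids over the sorted values.
    srt = sorted(values)
    centroids = [srt[i * len(srt) // k] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each F0 frame to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Recompute each centroid as the mean of its assigned frames.
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels

def realign(f0_frames, k):
    """Replace each frame's F0 with the centroid of its cluster."""
    centroids, labels = kmeans_1d(f0_frames, k)
    return [centroids[l] for l in labels]

# Example: three hummed notes with jitter around roughly 220, 262, 330 Hz.
frames = [219.0, 221.5, 220.3, 261.0, 263.2, 262.1, 329.0, 331.4, 330.2]
print([round(v, 1) for v in realign(frames, k=3)])
```

In a full transcription pipeline, the number of clusters k would itself have to be chosen (e.g., from the detected note onsets), and the resulting centroids mapped to the nearest semitone before matching against the index.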


Content-based audio retrieval · Mobile platform · Relevance feedback · Signal processing



This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-C1090-1031-0004). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2010-0025395). We would especially like to thank Byeong-jun Han, who was as dedicated to this work as we were.



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. School of Electrical Engineering, Korea University, Seoul, Korea
  2. Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea
