The Method of Main Vocal Melody Extraction Based on Harmonic Structure Analysis from Popular Song

  • Chai-Jong Song
  • Seok-Pil Lee
  • Kyung-Hack Seo
  • Hochong Park
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 107)


In this paper, we propose a method for extracting the main vocal melody from polyphonic music signals based on harmonic structure analysis. Melody extraction is the most important part of a content-based music retrieval system, which consists of three main parts: pitch estimation from the humming signal, melody extraction from the polyphonic music signal, and a matching engine that measures the distance between the two resulting vectors. The accuracy of melody extraction affects overall system performance more than any other part. The human vocal tract produces harmonics, as most musical instruments do, and this is one of the key properties we exploit: it may allow the main vocal melody to be extracted from a complex signal mixed with musical instruments. We use harmonic structure analysis and track the pitch sequence over three frames, including the current frame. The proposed method contains three major blocks: preprocessing; multi-pitch extraction with peak picking and fundamental frequency detection; and pitch tracking with predominant melody detection. We started this project with the aim of supporting commercial services for music portal providers, karaoke systems, and mobile devices.
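The abstract does not give the details of the harmonic structure analysis or the three-frame tracking, so the following is only a minimal sketch of the general idea under stated assumptions. It scores each fundamental-frequency candidate by summing spectral magnitudes at its harmonics (the 1/h weighting, the Hann window, the number of harmonics, and the candidate list are all assumptions), and it stands in for three-frame tracking with a simple three-point median. All function names are hypothetical and are not taken from the paper.

```python
import math


def magnitude_at(frame, freq_hz, sample_rate):
    """Magnitude of a Hann-windowed frame at one frequency (single-bin DFT probe)."""
    n = len(frame)
    re = im = 0.0
    for i, x in enumerate(frame):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1))
        phase = 2.0 * math.pi * freq_hz * i / sample_rate
        re += window * x * math.cos(phase)
        im -= window * x * math.sin(phase)
    return math.hypot(re, im) / n


def harmonic_salience(frame, f0, sample_rate, n_harmonics=5):
    """Sum magnitudes at f0, 2*f0, ... with an assumed 1/h weighting."""
    total = 0.0
    for h in range(1, n_harmonics + 1):
        freq = h * f0
        if freq >= sample_rate / 2.0:  # stop at the Nyquist frequency
            break
        total += magnitude_at(frame, freq, sample_rate) / h
    return total


def estimate_f0(frame, sample_rate, candidates):
    """Pick the candidate F0 whose harmonic series carries the most energy."""
    return max(candidates, key=lambda f0: harmonic_salience(frame, f0, sample_rate))


def track_three_frames(pitches):
    """Assumed tracker: median of previous, current, and next frame.

    The first and last frames lack a full three-frame window and are kept as-is.
    """
    smoothed = []
    for i, pitch in enumerate(pitches):
        window = pitches[max(0, i - 1):i + 2]
        smoothed.append(sorted(window)[1] if len(window) == 3 else pitch)
    return smoothed


def synth_frame(f0, sample_rate=8000, length=512):
    """Toy 'vocal' frame: a fundamental plus two weaker harmonics."""
    amps = [1.0, 0.5, 0.33]
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / sample_rate)
            for k, a in enumerate(amps))
        for i in range(length)
    ]
```

For example, `estimate_f0(synth_frame(220.0), 8000, [110.0, 165.0, 220.0, 330.0, 440.0])` selects the 220 Hz candidate, and `track_three_frames` removes an isolated octave jump in a sequence of per-frame estimates. The 1/h weighting is one common way to keep sub-harmonic candidates such as 110 Hz from outscoring the true fundamental; a real system would operate on an FFT spectrum with peak picking rather than probing a fixed candidate list.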


Keywords: QbSH, Multi-F0, Melody extraction, Pitch contour



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Chai-Jong Song (1)
  • Seok-Pil Lee (1)
  • Kyung-Hack Seo (1)
  • Hochong Park (2)
  1. Digital Media Research Center, KETI, Seoul, South Korea
  2. Department of Electronics Engineering, Kwangwoon University, Seoul, Republic of Korea
