Synthesizing the note-specific atoms based on their fundamental frequency, used for single-channel musical source separation

  • Mohammadali Azamian
  • Ehsanollah Kabir


Musical source separation deals with extracting individual musical signals from a mixture. One efficient approach is to decompose the mixture over a dictionary of basis functions that inherently describe the instruments. Usually, a unique function, called a note-specific atom, is synthesized for each note of each instrument. In this paper, a sine-harmonic model is used to synthesize note-specific atoms, and the note's fundamental frequency serves as prior information for determining the model parameters. To calculate these parameters, the training signal spectrum is processed only around the main harmonics of the note. Experimental results demonstrate that the proposed method synthesizes note-specific atoms much faster, without degrading separation quality, and can also eliminate single-frequency noise from the training signals.
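The abstract's core idea, estimating a sine-harmonic atom's parameters only from narrow spectral bands around each harmonic of the known fundamental frequency, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `synthesize_atom`, the band half-width, and the peak-picking rule for harmonic amplitudes are all assumptions.

```python
import numpy as np

def synthesize_atom(f0, training_signal, sr=44100, n_harmonics=10,
                    atom_len=4096, search_hz=20.0):
    """Sketch of a note-specific atom under a sine-harmonic model:
    estimate each harmonic's amplitude from the training spectrum in a
    narrow band around k*f0 (hypothetical rule: take the peak magnitude
    in that band), then sum the corresponding sinusoids."""
    spectrum = np.abs(np.fft.rfft(training_signal))
    freqs = np.fft.rfftfreq(len(training_signal), d=1.0 / sr)
    t = np.arange(atom_len) / sr
    atom = np.zeros(atom_len)
    for k in range(1, n_harmonics + 1):
        # Look only around the k-th harmonic; spectrum elsewhere
        # (including any single-frequency noise) is ignored.
        band = (freqs > k * f0 - search_hz) & (freqs < k * f0 + search_hz)
        if not band.any():
            break  # harmonic above Nyquist
        amp = spectrum[band].max()
        atom += amp * np.sin(2 * np.pi * k * f0 * t)
    norm = np.linalg.norm(atom)
    return atom / norm if norm > 0 else atom
```

Because amplitudes are read only from bands centered on the harmonics, any noise component lying between harmonics never enters the atom, which is consistent with the abstract's claim about removing single-frequency noise from training signals.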


Keywords: Musical source separation · Audio signal processing · Note-specific atom · Sine-harmonic model · Fundamental frequency



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Islamic Azad University, Esfahan, Iran
  2. Department of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
