Real-time Soundprism

Abstract

This paper presents a parallel real-time sound source separation system that decomposes an audio signal captured with a single microphone into as many audio signals as there are instruments actually playing. This approach is usually known as Soundprism. The application scenario is a concert hall in which users, instead of listening to the mixed audio, want to receive the audio of a single instrument in order to focus on a particular performance. The challenge is greater still because we target real-time operation on handheld devices, i.e., devices characterized by both low power consumption and mobility. The results show that real-time operation is achievable in the tested scenarios using an ARM processor, aided by a GPU when one is available.
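
As an illustration of what "real-time" implies for a frame-based system, the following minimal sketch checks whether the time spent processing one frame stays below the audio time that frame advances. It is not the system's implementation; the function names, the 512-sample hop, the 44100 Hz sample rate, and the 8 ms processing time are all hypothetical values chosen for the example.

```python
def real_time_factor(processing_seconds, hop_samples, sample_rate):
    """Ratio of compute time per frame to the audio time one frame advances."""
    hop_seconds = hop_samples / sample_rate
    return processing_seconds / hop_seconds


def is_real_time(processing_seconds, hop_samples, sample_rate):
    """A frame-based system keeps up with the input when the ratio is <= 1."""
    return real_time_factor(processing_seconds, hop_samples, sample_rate) <= 1.0


# Hypothetical numbers (assumptions, not taken from the paper):
# 512-sample hop at 44100 Hz, 8 ms of processing per frame.
rtf = real_time_factor(0.008, 512, 44100)
print(f"real-time factor: {rtf:.3f}, real-time: {is_real_time(0.008, 512, 44100)}")
```

A ratio below 1 leaves headroom for other work on the device; a ratio above 1 means the output falls progressively behind the live input.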

Acknowledgements

This work has been supported by the “Ministerio de Economía y Competitividad” of Spain and FEDER under projects TEC2015-67387-C4-{1,2,3}-R.

Author information

Corresponding author

Correspondence to A. J. Muñoz-Montoro.

About this article

Cite this article

Muñoz-Montoro, A.J., Ranilla, J., Vera-Candeas, P. et al. Real-time Soundprism. J Supercomput 75, 1594–1609 (2019). https://doi.org/10.1007/s11227-018-2703-0

Keywords

  • Sound source separation
  • Real-time
  • Score alignment
  • Audio processing
  • Parallel computing
  • GPGPU