Abstract
Source separation evaluation is typically a top-down process, starting with perceptual measures that capture fitness-for-purpose and followed by attempts to find physical (objective) measures that are predictive of the perceptual measures. In this paper, we take a contrasting bottom-up approach. We begin with the physical measures provided by the Blind Source Separation Evaluation Toolkit (BSS Eval) and then look for corresponding perceptual correlates. This approach is known as psychophysics and has the distinct advantage of leading to interpretable psychophysical models. We obtained perceptual similarity judgments from listeners in two experiments featuring vocal sources within musical mixtures. In the first experiment, listeners compared the overall quality of vocal signals estimated from musical mixtures using a range of competing source separation methods. In the second, a loudness experiment, listeners compared the loudness balance of the competing musical accompaniment and vocal. Our preliminary results provide provisional validation of the psychophysical approach.
Acknowledgment
This work was supported by grants EP/L027119/1 and EP/L027119/2 from the UK Engineering and Physical Sciences Research Council (EPSRC). The authors also wish to thank the reviewers for helpful comments on an earlier version of the paper.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Simpson, A.J.R., Roma, G., Grais, E.M., Mason, R.D., Hummersone, C., Plumbley, M.D. (2017). Psychophysical Evaluation of Audio Source Separation Methods. In: Tichavský, P., Babaie-Zadeh, M., Michel, O., Thirion-Moreau, N. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2017. Lecture Notes in Computer Science, vol 10169. Springer, Cham. https://doi.org/10.1007/978-3-319-53547-0_21
DOI: https://doi.org/10.1007/978-3-319-53547-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53546-3
Online ISBN: 978-3-319-53547-0
eBook Packages: Computer Science, Computer Science (R0)