Abstract
Blind single-channel source separation is a long-standing problem in machine learning and signal processing. Traditional blind source separation (BSS) algorithms solve this task by imposing multiple statistical constraints on the signals. Generative adversarial networks (GANs) are free from such constraints, but the role of adversarial training in BSS has not been fully demonstrated. This paper therefore presents a new separation network that learns the distribution of the known separated signals through stepwise refinement of the estimated distribution of the unknown mixture, and introduces a self-attention mechanism to address the blurred details in the generator's output, so that image detail is preserved during separation. Compared with neural egg separation (NES), an existing GAN-based single-channel BSS algorithm, the proposed algorithm recovers detail more prominently and separates the source signals in the mixed image more effectively; it also achieves better separation performance than classic BSS algorithms.
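The self-attention mechanism the abstract refers to can be illustrated with a minimal NumPy sketch of a SAGAN-style attention layer over a flattened feature map. This is not the authors' implementation; the projection matrices `w_f`, `w_g`, `w_h` and the residual scalar `gamma` are standard assumptions from self-attention GAN designs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_map(features, w_f, w_g, w_h, gamma=0.0):
    """SAGAN-style self-attention over a feature map.

    features: (C, N) array, where N = H*W spatial positions.
    w_f, w_g: (C', C) query/key projections; w_h: (C, C) value projection.
    gamma: learned residual scalar; gamma = 0 makes the layer an identity,
    so attention is blended in gradually during training.
    """
    f = w_f @ features                 # queries, shape (C', N)
    g = w_g @ features                 # keys,    shape (C', N)
    h = w_h @ features                 # values,  shape (C, N)
    attn = softmax(f.T @ g, axis=1)    # (N, N): each position attends to all others
    out = h @ attn.T                   # mix values by attention weights, (C, N)
    return features + gamma * out      # residual connection
```

Because every output position is a weighted mixture over all spatial positions, the generator can relate distant regions of the image, which is what helps it keep fine details sharp during separation.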
Acknowledgements
This research is funded by the Natural Science Foundation (62072391, 62066013), the Natural Science Foundation of Shandong (ZR2019MF060, ZR2017MF008), a Project of the Shandong Province Higher Educational Science and Technology Key Program (J18KZ016), and the Yantai Science and Technology Plan (2018YT06000271).
Cite this article
Sun, X., Xu, J., Ma, Y. et al. Single-channel blind source separation based on attentional generative adversarial network. J Ambient Intell Human Comput 13, 1443–1450 (2022). https://doi.org/10.1007/s12652-020-02679-4