Single-channel blind source separation based on attentional generative adversarial network

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

Blind single-channel source separation is a long-standing problem in machine learning and signal processing. Traditional blind source separation (BSS) algorithms solve this task by imposing multiple statistical constraints on the signals. Generative adversarial networks (GANs) are free of such statistical constraints and sample requirements, but the role of adversarial training in BSS has not been fully demonstrated. This paper therefore presents a new separation network that learns the distribution of the known, separated signals through stepwise refinement of its estimate of the unknown mixture distribution, and introduces a self-attention mechanism to counter the blurred details produced by the generator, thereby preserving image detail during separation. Compared with neural egg separation (NES), an existing GAN-based single-channel blind source separation algorithm, the proposed method recovers finer detail and separates the source signals in the mixed image more effectively, and it also outperforms classic blind source separation algorithms in separation performance.
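The self-attention mechanism referred to in the abstract is not detailed on this page. As an illustration only, the sketch below shows a SAGAN-style self-attention block (Zhang et al. 2018) of the kind a separation generator might insert between convolutional layers, re-weighting every spatial position against every other so that fine image detail survives separation. The PyTorch module name SelfAttention2d, the channel-reduction factor of 8, and the zero-initialised gate gamma are assumptions made for this sketch, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    # SAGAN-style self-attention over 2-D feature maps (illustrative sketch,
    # not the paper's released code).
    def __init__(self, channels, reduction=8):
        super().__init__()
        # 1x1 convolutions project features into query/key/value spaces;
        # the C/8 reduction is a common memory-saving choice, assumed here.
        self.query = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        # Learnable gate, started at 0 so the block is initially an identity map.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        n = h * w
        q = self.query(x).view(b, -1, n).permute(0, 2, 1)  # (B, N, C')
        k = self.key(x).view(b, -1, n)                      # (B, C', N)
        attn = F.softmax(torch.bmm(q, k), dim=-1)           # (B, N, N) weights over positions
        v = self.value(x).view(b, c, n)                      # (B, C, N)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x                          # residual: keeps original detail

# Example: applying the block to a hypothetical 64-channel generator feature map.
if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    print(SelfAttention2d(64)(feats).shape)  # torch.Size([2, 64, 32, 32])

Starting gamma at zero lets the network begin as a purely convolutional generator and learn how much non-local attention to mix in, which is the usual rationale for this design.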




Acknowledgements

This research was funded by the Natural Science Foundation (62072391, 62066013), the Natural Science Foundation of Shandong (ZR2019MF060, ZR2017MF008), a project of the Shandong Province Higher Educational Science and Technology Key Program (J18KZ016), and the Yantai Science and Technology Plan (2018YT06000271).

Author information


Corresponding author

Correspondence to Jindong Xu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Sun, X., Xu, J., Ma, Y. et al. Single-channel blind source separation based on attentional generative adversarial network. J Ambient Intell Human Comput 13, 1443–1450 (2022). https://doi.org/10.1007/s12652-020-02679-4


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s12652-020-02679-4

Keywords
