Abstract
Single-channel speech separation is a branch of speech separation that has been an active research topic for the past 40 years. Even so, no method yet separates the required signal from a mixture of signals with 100% accuracy, and none is ready for use by the general public. Much research has approached the problem through parameters of the speech signal such as pitch, phase, magnitude, amplitude, frequency, energy, and the spectrogram. This paper surveys various issues in the single-channel speech separation process and, in conclusion, points out the major challenges the speech research community faces in realizing such a system.
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Logeshwari, G., Anandha Mala, G.S. (2012). A Survey on Single Channel Speech Separation. In: Das, V.V., Stephen, J. (eds) Advances in Communication, Network, and Computing. CNC 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 108. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35615-5_61
DOI: https://doi.org/10.1007/978-3-642-35615-5_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35614-8
Online ISBN: 978-3-642-35615-5
eBook Packages: Computer Science (R0)