Speech Separation Based on Time Frequency Ratio of Mixtures and Track Identification

Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 221)

Abstract

Analysis of non-stationary signals such as audio, speech and biomedical signals requires good resolution in both time and frequency, because their spectral components are not fixed. Time-frequency analysis of non-stationary signals has many applications, including source separation, signal denoising, automatic gain control and speaker recognition. This paper presents an application of time-frequency analysis, based on the Short Time Fourier Transform (STFT), to speech and audio separation. The method is a form of Blind Source Separation; it is blind because no information about the sources or the type of mixing is available. The method uses relative amplitude information and time-frequency ratios of the audio and speech mixtures in the time-frequency domain, together with the ideal binary mask of the source signals. A mixture of male speech, female speech and tones of musical instruments is considered for separation, first with a strong mixing matrix and then with a weak mixing matrix.
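
The idea of separating sources from the time-frequency ratio of two mixtures can be illustrated with a minimal sketch. It is not the procedure of the paper, whose details the abstract does not give. It assumes an instantaneous two-by-two mixture, two pure tones standing in for the sources, an illustrative mixing matrix `A`, and column ratios of `A` that are known when the mask is built; in a blind setting those ratios would have to be estimated. The helper `stft` and all parameter values are hypothetical choices made for the example.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

# Two synthetic sources (assumption: pure tones, disjoint in frequency).
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 500 * t)   # falls on STFT bin 16
s2 = np.sin(2 * np.pi * 1500 * t)  # falls on STFT bin 48

# Illustrative instantaneous mixing matrix (assumption, not from the paper).
A = np.array([[1.0, 1.0],
              [0.5, 2.0]])
x1, x2 = A @ np.vstack([s1, s2])

# Time-frequency ratio of the two mixtures.
X1, X2 = stft(x1), stft(x2)
eps = 1e-12
ratio = np.abs(X2) / (np.abs(X1) + eps)

# Where only source j is active, the ratio approaches A[1, j] / A[0, j].
col_ratios = A[1] / A[0]

# Binary masks: assign each bin to the nearest column ratio (log scale).
dist = np.abs(np.log(ratio[..., None] + eps) - np.log(col_ratios))
labels = np.argmin(dist, axis=-1)
mask1, mask2 = labels == 0, labels == 1

# Masked spectra; an inverse STFT would give time-domain estimates.
Y1, Y2 = X1 * mask1, X1 * mask2
```

In this toy case the two tones never overlap in the time-frequency plane, so each bin carries a clean ratio. Real speech mixtures overlap in some bins, which is where the quality of the mask matters.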

Keywords

Short time Fourier transform, Binary masking, Automatic speech recognition, Time–frequency domain, Ideal mask, Ratio of mixtures

Copyright information

© Springer India 2013

Authors and Affiliations

  1. Department of Electronics and Communication Engineering, Vivekananda College of Engineering and Technology, Puttur, India
