Parameter discrimination analysis in speaker identification using self organizing map
This paper compares how well Mel Frequency Cepstrum Coefficients (MFCC) and Line Spectrum Pair (LSP) frequencies discriminate between the individual features of speakers. We use Kohonen's Self Organizing Map (SOM) to explore the effectiveness of these two parameters. Because the SOM preserves the topological properties of the feature space, it lets us inspect the difference visually. In the experiment, MFCCs are derived from the FFT and LSPs are derived from LPC analysis. To reduce computational complexity and improve robustness, the LSP parameters are vector quantized with a codebook, as in speech coding, and a distance weighting is incorporated. The SOM is trained with 33 speakers and a codebook of 400 codes. For each speaker, the training utterance is 60 s long. The final result shows that the two speech parameters produce very similar feature maps for the same speaker in the general feature space. A correlation criterion provides further verification. Thus, LSP and MFCC coefficients may be considered equivalent in the sense of Euclidean distance. At the end of the paper, a neural network VQ model is used to compare the experimental validity of the two parameters in text-independent speaker identification, and both achieve satisfactory results.
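The comparison described in the abstract can be illustrated with a minimal NumPy sketch: train a small SOM on feature vectors, build a per-speaker "hit map" (the fraction of frames whose best-matching unit is each node), and correlate two hit maps. This is only an illustration under stated assumptions, not the authors' implementation. The function names (`train_som`, `hit_map`, `map_correlation`), the grid size, the linear decay schedules, and the use of the Pearson correlation are all hypothetical choices; the abstract does not specify them.

```python
import numpy as np

def train_som(data, rows=8, cols=8, epochs=5, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM with Euclidean best-matching-unit search.

    Grid size, learning rate, and decay schedule are illustrative assumptions.
    Returns an array of shape (rows * cols, feature_dim).
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n_nodes = rows * cols
    # Initialise node weights from randomly chosen training vectors.
    weights = data[rng.integers(0, len(data), size=n_nodes)].copy()
    # (row, col) coordinate of every node on the 2-D grid.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    ).reshape(-1, 2).astype(float)
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighbourhood, floor 0.5
            # Best-matching unit: node closest to x in Euclidean distance.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood measured on the grid, not in feature space.
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def hit_map(weights, data):
    """Fraction of frames in `data` whose best-matching unit is each node."""
    data = np.asarray(data, dtype=float)
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmus = d2.argmin(axis=1)
    counts = np.bincount(bmus, minlength=len(weights)).astype(float)
    return counts / counts.sum()

def map_correlation(map_a, map_b):
    """Pearson correlation between two flattened hit maps (an assumed criterion)."""
    return float(np.corrcoef(map_a, map_b)[0, 1])
```

In such a setup one would train one map per parameter type, project each speaker's frames onto it with `hit_map`, and then compare maps between speakers or between parameter types with `map_correlation`. Because MFCC and LSP vectors live in different feature spaces, each would need its own trained map.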