Combining Evidence from Temporal and Spectral Features for Person Recognition Using Humming
In this paper, the hum of a person is used to identify the speaker by machine. In addition, novel temporal features (such as zero-crossing rate and short-time energy) and spectral features (such as spectral centroid and spectral flux) are proposed for the person recognition task. Feature-level fusion of each of these features with the state-of-the-art spectral feature set, viz., Mel Frequency Cepstral Coefficients (MFCC), is found to give better recognition performance than MFCC alone. It is also shown that the person identification rate is competitive with that of the baseline MFCC. Furthermore, a reduction in equal error rate (EER) of 1.46% is obtained when a feature-level fusion system combines evidence from MFCC, temporal, and the proposed spectral features.
Keywords: Humming, Mel cepstrum, zero-crossing rate, short-time energy, spectral centroid, spectral flux, polynomial classifier
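The feature-level fusion described in the abstract can be illustrated with a minimal sketch. It assumes textbook definitions of zero-crossing rate, short-time energy, spectral centroid, and spectral flux, and it assumes that fusion means appending these values to each frame's MFCC vector. The abstract does not give the exact definitions, frame sizes, or fusion details, so every function name, the frame parameters (`size`, `hop`), and the `mfcc_per_frame` input (MFCC vectors assumed to come from an existing front end) are hypothetical.

```python
import cmath
import math


def frames(signal, size, hop):
    """Split a signal into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, using a naive O(N^2) DFT for clarity."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mag, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz (0.0 for an all-zero spectrum)."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mag)) / total


def spectral_flux(mag, prev_mag):
    """Euclidean distance between consecutive magnitude spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mag, prev_mag)))


def fused_features(signal, sample_rate, mfcc_per_frame, size=64, hop=32):
    """Assumed fusion scheme: append the four extra features to each MFCC vector."""
    fused, prev_mag = [], None
    for frame, mfcc in zip(frames(signal, size, hop), mfcc_per_frame):
        mag = magnitude_spectrum(frame)
        # The first frame has no predecessor, so its flux is set to zero.
        flux = 0.0 if prev_mag is None else spectral_flux(mag, prev_mag)
        extra = [
            zero_crossing_rate(frame),
            short_time_energy(frame),
            spectral_centroid(mag, sample_rate, size),
            flux,
        ]
        fused.append(list(mfcc) + extra)
        prev_mag = mag
    return fused
```

With 13 MFCCs per frame, each fused vector in this sketch has 17 dimensions. The abstract reports fusing each feature with MFCC individually as well as in combination; the same concatenation applies to any subset of the four extra values.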