Abstract
The phase of a signal conveys critical information for feature extraction. This work shows that, for certain speech and audio classes whose magnitude content yields a poor recognition rate, combining magnitude features with phase-related features increases the classification rate compared with using the magnitude content alone. However, extracting the signal phase is not straightforward, mainly because of the discontinuities that appear in the phase spectrum. Hence, in the proposed method, the phase content of the signal is extracted via the Hartley Phase Spectrum, where the sources of phase discontinuities are detected and overcome, resulting in a phase spectrum with significantly fewer discontinuities.
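As a rough illustration of the idea, the sketch below computes a Hartley-type phase spectrum for a single frame. It rests on an assumption that the abstract does not state: that the Hartley Phase Spectrum can be taken as the discrete Hartley transform divided by the Fourier magnitude spectrum. The function name `hartley_phase_spectrum` and the `eps` guard are hypothetical, and the sketch does not reproduce the authors' detection and compensation of discontinuity sources.

```python
import numpy as np


def hartley_phase_spectrum(x, eps=1e-12):
    """Hartley-type phase spectrum of one frame (illustrative sketch only).

    Assumed definition: discrete Hartley transform divided by the
    Fourier magnitude spectrum.
    """
    X = np.fft.fft(x)
    # With NumPy's DFT sign convention, the discrete Hartley transform
    # sum_n x[n] * (cos(2*pi*k*n/N) + sin(2*pi*k*n/N)) equals Re(X) - Im(X).
    hartley = X.real - X.imag
    magnitude = np.abs(X)
    # Guard against division by zero at spectral nulls (eps is an assumption).
    return hartley / np.maximum(magnitude, eps)


# Usage on a synthetic frame: the values stay within [-sqrt(2), sqrt(2)],
# so no 2*pi wrapping step is involved, unlike np.angle.
frame = np.sin(2 * np.pi * 5 * np.arange(256) / 256)
hps = hartley_phase_spectrum(frame)
```

Because the result is a bounded combination of the cosine and sine of the Fourier phase, it avoids the wrapping jumps of an arctangent-based phase estimate; jumps caused by other sources would still need the kind of handling the abstract describes.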
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Paraskevas, I., Rangoussi, M. (2010). The Hartley Phase Spectrum as an Assistive Feature for Classification. In: Solé-Casals, J., Zaiats, V. (eds) Advances in Nonlinear Speech Processing. NOLISP 2009. Lecture Notes in Computer Science, vol 5933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11509-7_7
DOI: https://doi.org/10.1007/978-3-642-11509-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11508-0
Online ISBN: 978-3-642-11509-7
eBook Packages: Computer Science (R0)