Nguyen H., Weruaga L. (2008) Time–Frequency Analysis of Vietnamese Speech Inspired on Chirp Auditory Selectivity. In: Ho TB., Zhou ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg
In speech analysis, the pitch (fundamental frequency) is usually treated as a parameter that characterizes the vocal cord excitation, yet it plays almost no role in the time–spectral analysis of the speech signal itself. In this paper, we present a novel speech analysis approach in which the pitch and its variation over time play a leading role. The pitch and the pitch rate are computed within each segment by minimizing Huber’s loss over the short-time correlation under a second-order polynomial fitting law. The proposed method is integrated with the Fan-Chirp transform and the Spectral All-Pole Estimation method, both previously proposed by the authors. Results on Vietnamese speech show the advantages of the proposed analysis methodology over the popular linear prediction estimation. Finally, the paper discusses the possible impact of the proposed method on speech coding, which is the subject of upcoming research.
Pitch-driven time–frequency analysis; frequency-selective AR estimation; speech coding
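
As a rough illustration of the robust fitting idea mentioned in the abstract, the sketch below fits a second-order polynomial by minimizing Huber's loss. It is not the authors' procedure: the abstract applies the loss over the short-time correlation, whereas this simplified example fits a hypothetical track of pitch values directly, and it uses iteratively reweighted least squares (IRLS) as one common way to minimize Huber's loss. The function names, the threshold `delta`, and the synthetic data are all assumptions made for illustration.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def huber_quadratic_fit(t, y, delta=1.0, iterations=100):
    """Fit y ~ c0 + c1*t + c2*t^2 under Huber's loss using IRLS.

    delta is the (assumed) residual size at which the loss switches
    from quadratic to linear growth.
    """
    weights = [1.0] * len(t)  # first pass is ordinary least squares
    coeffs = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        # Build the weighted normal equations for the basis (1, t, t^2).
        a = [[0.0] * 3 for _ in range(3)]
        b = [0.0] * 3
        for ti, yi, wi in zip(t, y, weights):
            basis = (1.0, ti, ti * ti)
            for i in range(3):
                b[i] += wi * basis[i] * yi
                for j in range(3):
                    a[i][j] += wi * basis[i] * basis[j]
        coeffs = solve3(a, b)
        # Huber weights: 1 for small residuals, delta/|r| for large ones.
        weights = []
        for ti, yi in zip(t, y):
            r = yi - (coeffs[0] + coeffs[1] * ti + coeffs[2] * ti * ti)
            weights.append(1.0 if abs(r) <= delta else delta / abs(r))
    return coeffs


# Hypothetical pitch track (Hz) on a normalized time axis, with one
# octave-type outlier of the kind a pitch detector can produce.
t = [i / 10 for i in range(-10, 11)]
y = [120 + 30 * ti - 20 * ti * ti for ti in t]
y[5] = 200.0  # true value at t = -0.5 is 100 Hz
pitch, pitch_rate, curvature = huber_quadratic_fit(t, y)
print(pitch, pitch_rate, curvature)
```

In this toy setting the constant and linear coefficients play the roles of the pitch and the pitch rate at the segment centre. Because large residuals are down-weighted, the single outlier pulls the fit far less than it would under ordinary least squares.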