Speech Reconstruction by Sparse Linear Prediction

  • Ján Koloda
  • Antonio M. Peinado
  • Victoria Sánchez
Part of the Communications in Computer and Information Science book series (CCIS, volume 328)

Abstract

This paper proposes a new variant of the least squares autoregressive (LSAR) method for speech reconstruction, which estimates a segment of missing samples via least squares by applying the linear prediction (LP) model of speech. First, we show that a single high-order linear predictor can provide better results than the classic LSAR techniques based on short- and long-term predictors, without the need for a pitch detector. However, this high-order predictor may reduce the reconstruction performance due to estimation errors, especially in the case of short pitch periods, and due to non-stationarity. To overcome these problems, we propose a sparse linear predictor in which many LP coefficients are zero and which resembles the classical speech model based on short- and long-term correlations. The experimental results show that the proposed approach is superior in both signal-to-noise ratio and perceptual performance.
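The least-squares step described in the abstract can be illustrated with a minimal sketch. It is a generic textbook-style LSAR interpolation, not the authors' implementation or their sparse variant: the function names `estimate_lp` and `lsar_fill` are hypothetical, the predictor is estimated by ordinary least squares from known samples, and the missing segment is chosen to minimise the energy of the LP residual over the whole frame.

```python
import numpy as np

def estimate_lp(x, order):
    """Least-squares estimate of LP coefficients a_1..a_P from known samples.

    Assumed model (illustrative): x[n] ~ sum_k a_k * x[n-k], k = 1..P.
    """
    x = np.asarray(x, dtype=float)
    # Row for sample n holds its predecessors x[n-1], ..., x[n-order].
    X = np.column_stack([x[order - k:len(x) - k] for k in range(1, order + 1)])
    y = x[order:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def lsar_fill(x, missing, a):
    """Fill the samples flagged in the boolean mask `missing`.

    The unknown samples are chosen to minimise the squared LP residual;
    values stored in x at the missing positions are ignored.
    """
    x = np.asarray(x, dtype=float).copy()
    a = np.asarray(a, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    n, p = len(x), len(a)
    # Prediction-error operator: e[i] = x[i] - sum_k a_k * x[i-k], i = p..n-1.
    A = np.zeros((n - p, n))
    rows = np.arange(n - p)
    A[rows, rows + p] = 1.0
    for k in range(1, p + 1):
        A[rows, rows + p - k] = -a[k - 1]
    # Split the operator into columns acting on unknown and known samples.
    A_u, A_k = A[:, missing], A[:, ~missing]
    # Minimise ||A_u x_u + A_k x_k||^2 over the unknown samples x_u.
    x_u, *_ = np.linalg.lstsq(A_u, -A_k @ x[~missing], rcond=None)
    x[missing] = x_u
    return x
```

In this sketch a high-order predictor simply corresponds to a long coefficient vector `a`, and a sparse predictor to one in which many entries of `a` are zero; how those sparse coefficients are obtained is not covered here.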

Keywords

Speech reconstruction, error concealment, sparse linear prediction, least squares, autoregressive model

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ján Koloda (1)
  • Antonio M. Peinado (1)
  • Victoria Sánchez (1)
  1. Dpt. Teoría de la Señal, Telemática y Comunicaciones, Centro de Investigación en Tecnologías de la Información y de las Comunicaciones, Granada, Spain
