Audio Imputation Using the Non-negative Hidden Markov Model

  • Jinyu Han
  • Gautham J. Mysore
  • Bryan Pardo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7191)

Abstract

Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation for the missing data by taking into account both the spectral and temporal information of the audio signal. This approach is able to reconstruct highly corrupted audio signals with large parts of the spectrogram missing. We demonstrate this approach on real-world polyphonic music signals. The initial experimental results show that our approach has advantages over a previous missing data imputation method.
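To make the missing-data setting concrete, the sketch below fills in masked time-frequency bins of a magnitude spectrogram. It is not the Non-negative Hidden Markov Model described in the abstract: it uses a plain masked non-negative matrix factorization with KL-divergence multiplicative updates, which models spectral structure only and ignores the temporal dynamics the paper's approach adds. The function name `impute_masked_nmf` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def impute_masked_nmf(V, mask, rank=2, n_iter=500, seed=0, eps=1e-12):
    """Fill unobserved entries of a non-negative matrix with masked KL-NMF.

    Illustrative stand-in only; NOT the N-HMM from the paper.
    V    : (n_freq, n_frames) magnitude spectrogram.
    mask : boolean array of the same shape, True where V is observed.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + 0.1    # spectral basis vectors
    H = rng.random((rank, n_frames)) + 0.1  # per-frame activations
    M = mask.astype(float)
    Vm = np.where(mask, V, 0.0)             # zero out unobserved bins

    for _ in range(n_iter):
        # Multiplicative updates restricted to the observed entries.
        WH = W @ H + eps
        W *= ((Vm / WH) @ H.T) / (M @ H.T + eps)
        WH = W @ H + eps
        H *= (W.T @ (Vm / WH)) / (W.T @ M + eps)

    estimate = W @ H
    # Keep observed values; use the model only where data is missing.
    return np.where(mask, V, estimate)

# Usage on synthetic data: a rank-2 "spectrogram" with about 20% of bins missing.
demo_rng = np.random.default_rng(1)
V_true = demo_rng.random((20, 2)) @ demo_rng.random((2, 30))
observed = demo_rng.random(V_true.shape) > 0.2
V_filled = impute_masked_nmf(V_true, observed)
```

Because each frame is reconstructed independently of its neighbours here, a baseline of this kind has no mechanism for temporal coherence, which is the property the abstract says the proposed approach improves.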

Keywords

Audio Signal, Audio Clip, Original Audio, Spectral Vector, Singing Voice
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jinyu Han (1)
  • Gautham J. Mysore (2)
  • Bryan Pardo (1)
  1. EECS Department, Northwestern University, USA
  2. Advanced Technology Labs, Adobe Systems Inc., USA
