Scalable perceptual audio representation with an adaptive three time-scale sinusoidal signal model

Raed, Al-Moussawy; Yin, Junxun; Song, Shaopeng

doi:10.1007/BF02687874

Published: May 2004

Volume 21, pages 213–221, (2004)

Journal of Electronics (China)

Al-Moussawy Raed¹,
Yin Junxun¹ &
Song Shaopeng¹


Abstract

This work develops and optimizes a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The main contribution is a fundamental enhancement to the sinusoidal modeling component: overlap-add sinusoidal modeling is carried out at three successive time scales (large, medium, and small). The modeling is done in an analysis-by-synthesis overlap-add manner across the three scales using psychoacoustically weighted matching pursuits. The sinusoidal modeling residual at the first scale is passed to the smaller scales so that various signal features are modeled at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model and improves perceptual audio quality over our previous sinusoidal modeling work while using the same number of sinusoids. The most obvious applications of the SN model are scalable, high-fidelity audio coding and signal modification.
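
The residual-passing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no frame lengths, sinusoid counts, or weighting function, so `frame_lens`, `sines_per_frame`, and the placeholder `flat_weight` are assumptions chosen only for illustration, as are all function names. The sketch runs a greedy matching pursuit with Hann-windowed sinusoidal atoms on a 50%-overlapping frame grid at a large scale, then hands the residual to a medium and a small scale.

```python
import numpy as np

def flat_weight(n_bins):
    # Stand-in for a psychoacoustic weighting (assumption): the abstract does
    # not specify one, so every frequency bin is weighted equally here.
    return np.ones(n_bins)

def model_one_scale(residual, frame_len, sines_per_frame, weight_fn=flat_weight):
    """Greedy sinusoidal matching pursuit at a single time scale.

    Atoms are Hann-windowed sinusoids at FFT-bin frequencies placed on a
    50%-overlapping frame grid, so the model is an overlap-add sum of atoms.
    Returns (model, new_residual).
    """
    residual = np.array(residual, dtype=float)   # work on a copy
    model = np.zeros_like(residual)
    n = np.arange(frame_len)
    window = np.hanning(frame_len + 1)[:-1]      # periodic Hann window
    weight = weight_fn(frame_len // 2 + 1)
    hop = frame_len // 2
    for start in range(0, len(residual) - frame_len + 1, hop):
        for _ in range(sines_per_frame):
            seg = residual[start:start + frame_len]
            # Correlate the current residual with windowed complex exponentials.
            # Picking the weighted FFT peak approximates the best-atom search.
            spectrum = np.fft.rfft(seg * window)
            k = int(np.argmax(weight * np.abs(spectrum)))
            omega = 2.0 * np.pi * k / frame_len
            # Least-squares amplitude and phase for the chosen frequency:
            # project the segment onto the windowed cosine/sine pair.
            atoms = np.column_stack((window * np.cos(omega * n),
                                     window * np.sin(omega * n)))
            coeffs, *_ = np.linalg.lstsq(atoms, seg, rcond=None)
            component = atoms @ coeffs
            # Analysis-by-synthesis: subtract the synthesized component before
            # searching for the next one.
            model[start:start + frame_len] += component
            residual[start:start + frame_len] -= component
    return model, residual

def three_scale_model(x, frame_lens=(2048, 512, 128), sines_per_frame=(8, 4, 2)):
    """Model x at large, medium, and small scales, passing the residual down."""
    residual = np.asarray(x, dtype=float)
    layers = []
    for frame_len, n_sines in zip(frame_lens, sines_per_frame):
        layer, residual = model_one_scale(residual, frame_len, n_sines)
        layers.append(layer)
    return layers, residual

if __name__ == "__main__":
    sr = 16000                                   # hypothetical sample rate
    t = np.arange(8192) / sr
    x = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)    # steady tone
    x[4000:4016] += 0.8                          # short transient
    layers, residual = three_scale_model(x)
    print("input energy:   ", float(np.sum(x ** 2)))
    print("residual energy:", float(np.sum(residual ** 2)))
```

Because each step subtracts a least-squares projection of the current residual, the residual energy cannot grow from one step to the next. In a sines-plus-noise setting, the final `residual` is the part that would be handed to the noise model.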



Author information

Authors and Affiliations

College of Electronic and Information Eng., South China Univ. of Tech., 510640, Guangzhou
Al-Moussawy Raed, Yin Junxun & Song Shaopeng


Additional information

Supported by the National Natural Science Foundation of China (No.69802007), Motorola China Research Center (No.B38300), and Natural Science Foundation of Guangdong (No.011611)

About this article

Cite this article

Raed, AM., Yin, J. & Song, S. Scalable perceptual audio representation with an adaptive three time-scale sinusoidal signal model. J. of Electron.(China) 21, 213–221 (2004). https://doi.org/10.1007/BF02687874


Received: 11 October 2002
Revised: 30 May 2003
Issue Date: May 2004
DOI: https://doi.org/10.1007/BF02687874
