Harmonic Source Separation Using Prestored Spectra

Bay, Mert; Beauchamp, James W.

doi:10.1007/11679363_70

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3889)

Included in the following conference series:

International Conference on Independent Component Analysis and Signal Separation

Abstract

Detecting multiple pitches (F0s) and segregating musical instrument lines from monaural recordings of contrapuntal polyphonic music into separate tracks is a difficult problem in music signal processing. Applications include audio-to-MIDI conversion, automatic music transcription, and audio enhancement and transformation. Past attempts at separation have been limited to separating two harmonic signals in a contrapuntal duet (Maher, 1990) or several harmonic signals in a single chord (Virtanen and Klapuri, 2001, 2002). Several researchers have attempted polyphonic pitch detection (Klapuri, 2001; Eggink and Brown, 2004a), predominant melody extraction (Goto, 2001; Marolt, 2004; Eggink and Brown, 2004b), and instrument recognition (Eggink and Brown, 2003). Our solution assumes that each instrument is represented as a time-varying harmonic series and that errors can be corrected using prior knowledge of instrument spectra. Fundamental frequencies (F0s) for each time frame are estimated from input spectral data using an Expectation-Maximization (EM) based algorithm with Gaussian distributions used to represent the harmonic series. Collisions (i.e., overlaps) between instrument harmonics, which frequently occur, are predicted from the estimated F0s. The uncollided harmonics are matched to ones contained in a pre-stored spectrum library in order that each F0's harmonic series is assigned to the appropriate instrument. Corrupted harmonics are restored using data taken from the library. Finally, each voice is additively resynthesized to a separate track. This algorithm is demonstrated for a monaural signal containing three contrapuntal musical instrument voices with distinct timbres.
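
The collision-prediction step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the criterion the authors use, so everything specific here is an assumption made for illustration: the function names `predict_collisions` and `uncollided_harmonics`, the fixed number of harmonics per voice, and the rule that two harmonics collide when their frequencies lie closer than a fixed threshold in Hz.

```python
from itertools import combinations


def predict_collisions(f0s, num_harmonics=10, min_separation_hz=20.0):
    """List harmonic pairs from different voices that fall too close together.

    f0s: estimated fundamental frequencies (Hz), one per voice, for one frame.
    num_harmonics and min_separation_hz are illustrative assumptions,
    not values taken from the paper.
    Returns tuples (voice_a, harmonic_a, voice_b, harmonic_b).
    """
    collisions = []
    for (a, f0_a), (b, f0_b) in combinations(enumerate(f0s), 2):
        for ha in range(1, num_harmonics + 1):
            for hb in range(1, num_harmonics + 1):
                # Harmonic k of a voice is assumed to sit at k * F0.
                if abs(ha * f0_a - hb * f0_b) < min_separation_hz:
                    collisions.append((a, ha, b, hb))
    return collisions


def uncollided_harmonics(f0s, num_harmonics=10, min_separation_hz=20.0):
    """Map each voice index to the harmonic numbers not involved in a collision."""
    collided = set()
    for a, ha, b, hb in predict_collisions(f0s, num_harmonics, min_separation_hz):
        collided.add((a, ha))
        collided.add((b, hb))
    return {
        voice: [h for h in range(1, num_harmonics + 1) if (voice, h) not in collided]
        for voice in range(len(f0s))
    }


if __name__ == "__main__":
    # Two voices a perfect fifth apart (200 Hz and 300 Hz) share 600 Hz and 1200 Hz.
    print(predict_collisions([200.0, 300.0], num_harmonics=6, min_separation_hz=10.0))
    print(uncollided_harmonics([200.0, 300.0], num_harmonics=6, min_separation_hz=10.0))
```

In this sketch only the harmonics returned by `uncollided_harmonics` would be candidates for matching against a spectrum library; the collided ones correspond to the "corrupted" harmonics that the abstract says are restored from library data.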

References

Beauchamp, J.: Unix Workstation Software for Analysis, Graphics, Modification, and Synthesis of Musical Sounds. Audio Eng. Soc., 1–17 (1993) Preprint No. 3479
Beauchamp, J.W., Horner, A.: Wavetable Interpolation Synthesis Based on Time-Variant Spectral Analysis of Musical Sounds. Audio Eng. Soc., 1–17 (1995) Preprint No. 3960
Eggink, J., Brown, G.J.: A missing feature approach to instrument identification in polyphonic music. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2003), pp. 553–556 (2003)
Eggink, J., Brown, G.J.: Instrument recognition in accompanied sonatas and concertos. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2004), pp. IV217–IV220 (2004a)
Eggink, J., Brown, G.J.: Extracting melody lines from complex audio. In: Proc. 5th Int. Conf. on Music Information Retrieval (ISMIR 2004), pp. 84–91 (2004b)
Fritts, L.: University of Iowa Musical Instrument Samples (1997), online at http://theremin.music.uiowa.edu/MIS.html
Goto, M.: A predominant-F0 estimation method for CD recordings: MAP estimation using EM algorithm for adaptive tone models. In: Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2001), pp. 3365–3368 (2001)
Klapuri, A.: Multipitch estimation and sound separation by the spectral smoothness principle. In: Proc. ICASSP 2001, pp. 3381–3384 (2001)
Maher, R.: Evaluation of a method for separating digitized duet signals. J. Audio Eng. Soc. 38(12), 957–979 (1990)
Marolt, M.: Gaussian mixture models for extraction of melodic lines from audio recordings. In: Proc. 5th Int. Conf. on Music Information Retrieval (ISMIR 2004), pp. 80–83 (2004)
McAulay, R.J., Quatieri, T.F.: Speech analysis/synthesis based on a sinusoidal representation. IEEE Trans. Acoust. Speech, Signal Processing ASSP-34, 744–754 (1986)
Pepper, A.: The Intimate Art Pepper (music CD), tracks 5 & 7 (1996)
Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition, pp. 125–128. Prentice-Hall, Englewood Cliffs (1993)
Smith, J.O., Serra, X.: PARSHL: An analysis/synthesis program for nonharmonic sounds based on a sinusoidal representation. In: Proc. 1987 Int. Computer Music Conf., pp. 290–297 (1987)
Thiede, T., Treurniet, W.C., Bitto, R., Schmidmer, C., Sporer, T., Beerends, J.G., Colomes, C., Keyhl, M., Stoll, G., Brandenburg, K., Feiten, B.: PEAQ: The ITU Standard for Objective Measurement of Perceived Audio Quality. J. Audio Eng. Soc. 48(1/2), 3–29 (2000)
Virtanen, T., Klapuri, A.: Separation of harmonic sounds using multipitch analysis and iterative parameter estimation. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2001), pp. 83–86 (2001)
Virtanen, T., Klapuri, A.: Separation of Harmonic Sounds Using Linear Models for the Overtone Series. In: IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2002) (2002)

Author information

Authors and Affiliations

School of Music and Dept. of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801
Mert Bay & James W. Beauchamp

Editor information

Editors and Affiliations

Siemens Corporate Research, 755 College Road East, 08540, Princeton, NJ, USA
Justinian Rosca
Department of CSEE, Oregon Health and Science University, Portland, Oregon, USA
Deniz Erdogmus
Dep. of Electrical and Computer Engineering, University of Florida, Gainesville, Florida, USA
José C. Príncipe
McMaster University, 1280 Main Street West, L8S 4K1, Hamilton, Ontario, Canada
Simon Haykin

Cite this paper

Bay, M., Beauchamp, J.W. (2006). Harmonic Source Separation Using Prestored Spectra. In: Rosca, J., Erdogmus, D., Príncipe, J.C., Haykin, S. (eds) Independent Component Analysis and Blind Signal Separation. ICA 2006. Lecture Notes in Computer Science, vol 3889. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11679363_70

DOI: https://doi.org/10.1007/11679363_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32630-4
Online ISBN: 978-3-540-32631-1
eBook Packages: Computer Science, Computer Science (R0)
