Separating Voices in Polyphonic Music: A Contig Mapping Approach

Chew, Elaine; Wu, Xiaodan

doi:10.1007/978-3-540-31807-1_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3310)

Included in the following conference series:

International Symposium on Computer Music Modeling and Retrieval

Abstract

Voice separation is a critical component of music information retrieval, music analysis, and automated transcription systems. We present a contig mapping approach to voice separation based on perceptual principles. The algorithm runs in O(n²) time, uses only pitch height and event boundaries, and requires no user-defined parameters. It segments a piece into contigs according to voice count, then reconnects fragments in adjacent contigs using a shortest-distance strategy. Connections are made in order of distance from the maximal voice contigs, where the voice ordering is known. The algorithm has been implemented in VoSA, a Java-based voice separation analyzer. It performed well when applied to J. S. Bach’s Two- and Three-Part Inventions and the forty-eight Fugues of the Well-Tempered Clavier: we report an overall average fragment consistency of 99.75%, a correct fragment connection rate of 94.50%, and an average voice consistency of 88.98%, three metrics that we propose for measuring voice separation performance.
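
The two steps named in the abstract, segmenting by voice count and reconnecting fragments by shortest distance, can be illustrated with a minimal Python sketch. This is not the VoSA implementation, which is written in Java and not reproduced here. The note encoding (onset, offset, MIDI pitch), the function names `segment_into_contigs` and `connect_fragments`, and the greedy pairing rule are all assumptions made for illustration, and the sketch omits the ordering of connections outward from maximal voice contigs.

```python
# Hypothetical sketch, not the authors' code.
# A note is assumed to be a tuple (onset, offset, pitch), pitch as a MIDI number.

def segment_into_contigs(notes):
    """Cut the timeline at every note boundary, then merge neighbouring
    slices that share the same number of sounding notes (voice count)."""
    boundaries = sorted({t for on, off, _ in notes for t in (on, off)})
    contigs = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Pitches sounding anywhere inside [start, end), lowest first.
        sounding = sorted(p for on, off, p in notes if on < end and off > start)
        if not sounding:
            continue  # silence: no contig here
        last = contigs[-1] if contigs else None
        if last and last["end"] == start and last["voices"] == len(sounding):
            last["end"] = end  # same voice count: extend the current contig
        else:
            contigs.append({"start": start, "end": end, "voices": len(sounding)})
    return contigs


def connect_fragments(left_pitches, right_pitches):
    """Pair fragment ends on the left of a boundary with fragment starts on
    the right, taking the smallest pitch distances first (a greedy stand-in
    for a shortest-distance strategy). Returns (left_index, right_index) pairs."""
    candidates = sorted(
        (abs(lp - rp), i, j)
        for i, lp in enumerate(left_pitches)
        for j, rp in enumerate(right_pitches)
    )
    used_left, used_right, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_left and j not in used_right:
            used_left.add(i)
            used_right.add(j)
            pairs.append((i, j))
    return sorted(pairs)


# Toy example: two voices for four beats, then the lower voice alone.
notes = [(0, 4, 48), (0, 2, 72), (2, 4, 74), (4, 6, 50)]
contigs = segment_into_contigs(notes)
print([(c["start"], c["end"], c["voices"]) for c in contigs])
# Fragments ending at pitches 48 and 74 meet a single fragment starting at 50.
print(connect_fragments([48, 74], [50]))
```

In the toy example the single fragment in the one-voice contig is joined to the lower of the two preceding fragments, because its starting pitch is closer to 48 than to 74.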

Author information

Authors and Affiliations

Epstein Department of Industrial and Systems Engineering, University of Southern California, Viterbi School of Engineering, Integrated Media Systems Center, 3715, McClintock Avenue GER240 MC:0193, Los Angeles, California, USA
Elaine Chew & Xiaodan Wu

Editor information

Editors and Affiliations

Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campus 55, 5230, Odense M, Denmark
Uffe Kock Wiil

About this paper

Cite this paper

Chew, E., Wu, X. (2005). Separating Voices in Polyphonic Music: A Contig Mapping Approach. In: Wiil, U.K. (eds) Computer Music Modeling and Retrieval. CMMR 2004. Lecture Notes in Computer Science, vol 3310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31807-1_1

DOI: https://doi.org/10.1007/978-3-540-31807-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24458-5
Online ISBN: 978-3-540-31807-1
eBook Packages: Computer Science (R0)
