Bootstrap learning for accurate onset detection

Hu, Ning; Dannenberg, Roger B.

doi:10.1007/s10994-006-8458-5

Published: 08 May 2006

Volume 65, pages 457–471, (2006)

Journal: Machine Learning

Ning Hu¹ & Roger B. Dannenberg²

Abstract

Supervised learning models have been applied to create good onset detection systems for musical audio signals. However, this always requires a large set of labeled training examples, and hand-labeling is quite tedious and time consuming. In this paper, we present a bootstrap learning approach to train an accurate note onset detection model. Audio alignment techniques are first used to find the correspondence between a symbolic music representation (such as MIDI data) and an acoustic recording. This alignment provides an initial estimate of note boundaries which can be used to train an onset detector. Once trained, the detector can be used to refine the initial set of note boundaries and training can be repeated. This iterative training process eliminates the need for hand-labeled audio. Tests show that this training method can improve an onset detector initially trained on synthetic data.
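
The iterative loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two toy per-frame features, the linear detector whose weights are the difference of class means, the hypothetical function names, and the `initial_weights` that stand in for a detector first trained on synthetic data. The alignment step is not implemented; its output is represented by a list of onset estimates that are each off by a few frames.

```python
import random


def make_frames(true_onsets, n_frames, seed=0):
    """Toy features per frame: (energy_rise, distractor). Purely illustrative."""
    rng = random.Random(seed)
    frames = []
    for t in range(n_frames):
        # The first feature is large only at a true onset; the second is noise.
        energy_rise = 1.0 if t in true_onsets else rng.uniform(0.0, 0.2)
        distractor = rng.uniform(0.0, 0.5)
        frames.append((energy_rise, distractor))
    return frames


def score(weights, frame):
    """Linear detector output for one frame."""
    return sum(w * x for w, x in zip(weights, frame))


def refine(estimates, frames, weights, radius):
    """Move each onset estimate to the best-scoring frame within +/- radius."""
    refined = []
    for est in estimates:
        lo = max(0, est - radius)
        hi = min(len(frames) - 1, est + radius)
        best = max(range(lo, hi + 1), key=lambda t: score(weights, frames[t]))
        refined.append(best)
    return refined


def train(frames, onsets):
    """Fit weights as mean(onset frames) - mean(other frames), per feature."""
    onset_set = set(onsets)
    pos = [f for t, f in enumerate(frames) if t in onset_set]
    neg = [f for t, f in enumerate(frames) if t not in onset_set]
    dims = range(len(frames[0]))
    return tuple(
        sum(f[d] for f in pos) / len(pos) - sum(f[d] for f in neg) / len(neg)
        for d in dims
    )


def bootstrap(frames, aligned_onsets, initial_weights, iterations=3, radius=3):
    """Alternate between refining the onset labels and retraining the detector."""
    weights, onsets = initial_weights, list(aligned_onsets)
    for _ in range(iterations):
        onsets = refine(onsets, frames, weights, radius)
        weights = train(frames, onsets)
    return weights, onsets


true_onsets = [10, 25, 40, 55]
frames = make_frames(true_onsets, n_frames=70)
aligned = [12, 23, 41, 53]  # stand-in for alignment output, off by 1-2 frames
weights, onsets = bootstrap(frames, aligned, initial_weights=(1.0, 1.0))
print(onsets)  # prints [10, 25, 40, 55]
```

In this toy setting the rough initial detector weights both features equally, and retraining on the refined labels shifts most of the weight onto the informative feature. No hand-labeled onsets are used at any point, which is the property the abstract emphasizes.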

Author information

Authors and Affiliations

Google Inc. New York Office, 1440 Broadway, 21st Floor, New York, NY, 10018, United States
Ning Hu
Computer Science Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, United States
Roger B. Dannenberg

Corresponding author

Correspondence to Ning Hu.

Additional information

A major part of this work was done while the first author was at Carnegie Mellon University.

Editor: Gerhard Widmer

About this article

Cite this article

Hu, N., Dannenberg, R.B. Bootstrap learning for accurate onset detection. Mach Learn 65, 457–471 (2006). https://doi.org/10.1007/s10994-006-8458-5

Received: 27 September 2005
Revised: 17 February 2006
Accepted: 20 March 2006
Published: 08 May 2006
Issue Date: December 2006
DOI: https://doi.org/10.1007/s10994-006-8458-5
