Abstract
Pitch detection methods are widely used to extract musical data from digital signals, and this paper reviews them. Because musical signals may contain noise and distortion, detection results can be erroneous. The paper introduces a new method that uses music prediction to support pitch determination, developed to overcome the disadvantages of standard pitch detection algorithms. The approach combines signal segmentation with pitch prediction based on musical knowledge extracted by artificial neural networks. Signal segmentation allows the pitch to be estimated for a single note as a whole, which suppresses errors in the transient and decay phases. Pitch prediction helps correct pitch estimation errors by tracking the musical context of the analyzed signal. The experimental results show that pitch estimation errors may be reduced by using both signal segmentation and music prediction techniques.
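The idea of estimating pitch for a note as a whole can be illustrated with a minimal sketch. It assumes that a frame-based detector has already produced per-frame pitch values and that a separate segmentation step has produced note boundaries. The function names, the toy data, and the use of the median as the per-note estimator are assumptions made for illustration, not the estimator described in the paper.

```python
import math
from statistics import median


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def note_level_pitch(frame_pitches_hz, segments):
    """Return one pitch estimate in Hz per note segment.

    frame_pitches_hz: per-frame estimates from any frame-based detector.
    segments: (start, end) frame indices, end exclusive; a hypothetical
              output of a separate segmentation step.
    The median is an assumption of this sketch: it ignores a few outlying
    frames, such as octave errors in the attack or drift in the decay.
    """
    notes = []
    for start, end in segments:
        frames = frame_pitches_hz[start:end]
        if not frames:
            raise ValueError(f"empty segment ({start}, {end})")
        notes.append(median(frames))
    return notes


# Toy data (invented): two notes, each with an erroneous attack frame
# and a detuned decay frame.
frames = [
    880.0, 441.0, 440.0, 439.5, 440.2, 440.0, 431.0,  # note near A4
    247.0, 494.0, 493.5, 494.2, 480.0,                # note near B4
]
segments = [(0, 7), (7, 12)]

pitches = note_level_pitch(frames, segments)
midi = [round(hz_to_midi(p)) for p in pitches]
print(pitches, midi)
```

A frame-by-frame reading of the same data would report wrong notes for the first and last frames of each segment, whereas the per-note estimate is unaffected by them in this toy case.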
Cite this article
Szczerba, M., Czyzewski, A. Pitch Detection Enhancement Employing Music Prediction. J Intell Inf Syst 24, 223–251 (2005). https://doi.org/10.1007/s10844-005-0324-6
DOI: https://doi.org/10.1007/s10844-005-0324-6