Pitch Detection Enhancement Employing Music Prediction

Journal of Intelligent Information Systems

Abstract

Pitch detection methods are widely used for extracting musical data from digital signals. This paper reviews those methods. Since musical signals may contain noise and distortion, detection results can be erroneous. The paper introduces a new method that employs music prediction to support pitch determination. The method was developed to overcome the disadvantages of standard pitch detection algorithms. The new approach utilizes signal segmentation and pitch prediction based on musical knowledge extraction employing artificial neural networks. Signal segmentation allows the pitch of a single note to be estimated as a whole, thereby suppressing errors in the transient and decay phases. Pitch prediction helps correct pitch estimation errors by tracking the musical context of the analyzed signal. The experimental results show that pitch estimation errors may be reduced by using both signal segmentation and music prediction techniques.
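The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the segmentation algorithm or the neural network predictor, so the sketch assumes that note boundaries and a predicted note are supplied by the caller. The function names (`hz_to_midi`, `note_pitches`, `correct_octave`), the use of a per-segment median, and the octave-only correction rule are all illustrative assumptions.

```python
import math
from statistics import median


def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)


def note_pitches(frame_f0_hz, boundaries):
    """Estimate one pitch per note from frame-level estimates.

    frame_f0_hz: per-frame fundamental frequency estimates in Hz
                 (from any pitch detector; values <= 0 mean "no pitch found").
    boundaries:  frame indices where notes start, followed by the final end index
                 (assumed to come from a separate segmentation step).
    Taking the median over the whole segment is an assumption of this sketch;
    it discounts outliers that tend to occur in attack and decay frames.
    """
    notes = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        midi_frames = [hz_to_midi(f) for f in frame_f0_hz[start:end] if f > 0]
        notes.append(round(median(midi_frames)) if midi_frames else None)
    return notes


def correct_octave(estimated, predicted):
    """Replace the estimate with the prediction when they differ by whole octaves.

    `predicted` stands in for the output of a music prediction model; how that
    prediction is produced is outside the scope of this sketch.
    """
    if estimated is None or predicted is None:
        return estimated
    diff = estimated - predicted
    if diff != 0 and diff % 12 == 0:
        return predicted
    return estimated


# Two notes (A4, then C5) with octave errors in the first and last frames.
frames = [880, 440, 441, 439, 220, 523.25, 523.0, 1046.5]
print(note_pitches(frames, [0, 5, 8]))
print(correct_octave(81, 69))
```

Octave errors are used here only because they are a common failure mode of pitch detectors; a real system would weigh the prediction against the confidence of the estimate rather than apply a fixed rule.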

Author information

Corresponding author

Correspondence to Andrzej Czyzewski.

About this article

Cite this article

Szczerba, M., Czyzewski, A. Pitch Detection Enhancement Employing Music Prediction. J Intell Inf Syst 24, 223–251 (2005). https://doi.org/10.1007/s10844-005-0324-6

  • DOI: https://doi.org/10.1007/s10844-005-0324-6
