Abstract

An audio fingerprint is a compact content-based signature that summarizes an audio recording. Audio fingerprinting technologies have attracted attention because they allow audio to be identified independently of its format and without the need for metadata or watermark embedding. Other uses of fingerprinting include integrity verification, watermark support, and content-based audio retrieval. The different approaches to fingerprinting have been described with different rationales and terminology: pattern matching, multimedia (music) information retrieval, or cryptography (robust hashing). In this paper, we review these techniques and describe their functional blocks as parts of a common, unified framework.
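To make the idea of a compact content-based signature concrete, here is a minimal toy sketch in Python. It is an illustration only and should not be read as the method of any particular system covered by the review. It assumes one simple family of designs in which each bit is the sign of an energy difference between neighbouring frequency bands and consecutive frames. The function names (`band_energies`, `fingerprint`, `hamming_distance`) and the parameters `frame_len` and `n_bands` are hypothetical choices made for this example.

```python
import cmath
import math


def band_energies(frame, n_bands):
    """Power of the first n_bands non-DC DFT bins of one frame (naive DFT)."""
    n = len(frame)
    energies = []
    for k in range(1, n_bands + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        energies.append(coeff.real * coeff.real + coeff.imag * coeff.imag)
    return energies


def fingerprint(samples, frame_len=64, n_bands=9):
    """Toy fingerprint: one tuple of (n_bands - 1) bits per pair of consecutive frames."""
    # Non-overlapping frames; a real system would typically overlap them.
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energies = [band_energies(f, n_bands) for f in frames]
    bits = []
    for prev, cur in zip(energies, energies[1:]):
        # Each bit is the sign of a band-to-band difference, differenced over time.
        bits.append(tuple(
            1 if (cur[b] - cur[b + 1]) - (prev[b] - prev[b + 1]) > 0 else 0
            for b in range(n_bands - 1)))
    return bits


def hamming_distance(fp_a, fp_b):
    """Number of differing bits between two equally long fingerprints."""
    return sum(a != b for fa, fb in zip(fp_a, fp_b) for a, b in zip(fa, fb))


# Hypothetical test signal: a slow chirp, 5 frames of 64 samples.
samples = [math.sin(2 * math.pi * (2 + t / 100.0) * t / 64.0) for t in range(320)]
quiet = [0.5 * x for x in samples]  # same content at half the amplitude

fp = fingerprint(samples)
fp_quiet = fingerprint(quiet)
print(len(fp), "sub-fingerprints of", len(fp[0]), "bits each")
print("bit errors after halving the amplitude:", hamming_distance(fp, fp_quiet))
```

Because only the signs of energy differences are kept, a change in playback level should leave the bits unchanged. That is a small instance of the property the abstract describes: the signature depends on the content rather than on how the recording happens to be stored or scaled. Matching against a database then reduces to finding stored fingerprints at a small Hamming distance from the query.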

Author information

Pedro Cano received B.Sc. and M.Sc. degrees in Electrical Engineering from the Universitat Politècnica de Catalunya in 1999. In 1997, he joined the Music Technology Group of the Universitat Pompeu Fabra, where he is currently pursuing his Ph.D. on content-based audio identification. He has been an assistant professor in the Department of Technologies of the Universitat Pompeu Fabra since 1999. His research interests and recent work include signal processing for music applications, notably within a real-time voice morphing system for karaoke; pattern matching; and information retrieval, specifically content-based audio identification.

Eloi Batlle received his M.S. degree in electronic engineering in 1995 from the Polytechnic University of Catalunya in Barcelona, Spain. He then joined the Signal Processing Group at the same university, where he worked on robust speech recognition. He received a Ph.D. on this subject in 1999. While he was a Ph.D. student, he also worked as a researcher at the Telecom Italia Lab during 1997. In 2000 he joined the Audiovisual Institute (a part of the Pompeu Fabra University). Currently he is a member of the Music Technology Group of the same Institute, where he leads several research projects on music identification and similarity. In 2000 he also joined the Department of Technologies of the Pompeu Fabra University, where he teaches several subjects to undergraduate and graduate students. Since 2001 he has been the Deputy Director of this Department. His research interests include information theory, music similarity, statistical signal processing and pattern recognition.

Ton Kalker was born in The Netherlands in 1956. He received his M.S. degree in mathematics in 1979 from the University of Leiden, The Netherlands. From 1979 until 1983, while he was a Ph.D. candidate, he worked as a Research Assistant at the University of Leiden. From 1983 until December 1985 he worked as a lecturer at the Computer Science Department of the Technical University of Delft. In January 1986 he received his Ph.D. degree in Mathematics. In December 1985 he joined the Philips Research Laboratories Eindhoven. Until January 1990 he worked in the field of Computer Aided Design, where he specialized in (semi-)automatic tools for system verification. Currently he is a member of the Processing and Architectures for Content MANagement group (PACMAN) of Philips Research, where he is working on security of multimedia content, with an emphasis on watermarking and fingerprinting for video and audio. In November 1999 he became a part-time professor in the Signal Processing Systems group of Jan Bergmans in the area of ‘signal processing methods for data protection’. He is a Fellow of the IEEE for his contributions to practical applications of watermarking, in particular watermarking for DVD-Video copy protection. His other research interests include wavelets, multirate signal processing, motion estimation, psychophysics, digital video compression and medical image processing.

Jaap Haitsma was born in 1974 in Easterein, the Netherlands. He received his B.Sc. in Electronic Engineering from the Noordelijke Hogeschool Leeuwarden in 1997. He did his thesis in 1997 at the Philips Research Laboratories in Redhill, England, on the topic “Colour Management for Liquid Crystal Displays”. Currently he is with the Philips Research Laboratories, Eindhoven, the Netherlands, where he has been doing research into digital watermarking and fingerprinting of audio and video since late 1997. From 1999 to 2002 he was also a part-time student at the Technical University of Eindhoven, where he obtained his M.Sc. in Electronic Engineering. His areas of interest include digital signal processing, database search algorithms and software engineering.

About this article

Cite this article

Cano, P., Batlle, E., Kalker, T. et al. A Review of Audio Fingerprinting. J VLSI Sign Process Syst Sign Image Video Technol 41, 271–284 (2005). https://doi.org/10.1007/s11265-005-4151-3

  • DOI: https://doi.org/10.1007/s11265-005-4151-3
