Data Mining and Knowledge Discovery

, Volume 17, Issue 3, pp 402–430 | Cite as

Learning to hash: forgiving hash functions and applications



The problem of efficiently finding similar items in a large corpus of high-dimensional data points arises in many real-world tasks, such as music, image, and video retrieval. Beyond the scaling difficulties that arise with lookups in large data sets, the complexity in these domains is exacerbated by an imprecise definition of similarity. In this paper, we describe a method to learn a similarity function from only weakly labeled positive examples. Once learned, this similarity function is used as the basis of a hash function to severely constrain the number of points considered for each lookup. Tested on a large real-world audio dataset, only a tiny fraction of the points (~0.27%) are ever considered for each lookup. To increase efficiency, no comparisons in the original high-dimensional space of points are required. The performance far surpasses, in terms of both efficiency and accuracy, a state-of-the-art Locality-Sensitive-Hashing-based (LSH) technique for the same problem and data set.


Hashing Audio matching Machine learning Locality sensitive hashing 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. Aucouturier J, Pachet F (2002) Music similarity measures: what’s the use? In: Proceedings of the 3rd international conference on music information retrievalGoogle Scholar
  2. Baluja S (2007) Automated image orientation detection: a scalable boosting approach. Pattern Anal Appl 10(3): 247–263CrossRefMathSciNetGoogle Scholar
  3. Baluja S, Covell M (2006) Content fingerprinting with wavelets. In: Third European conference on visual media production (CVMP), pp 198–207Google Scholar
  4. Baluja S, Covell M (2007) Learning forgiving hash functions: algorithms and large-scale tests. In: International joint conference on artificial intelligenceGoogle Scholar
  5. Bar-Hillel A, Hertz T, Shental N, Weinshall D (2003) Learning distance functions using equivalence relations. In: Proceedings of the twentieth international conference on machine learningGoogle Scholar
  6. Brieman L (1996) Bagging predictors. Mach Learn 24(2): 123–140Google Scholar
  7. Burges JC, Platt JC, Jana S (2003) Distortion discriminant analysis for audio fingerprinting. IEEE Trans Speech Audi Processing 11: 165–174CrossRefGoogle Scholar
  8. Bylander T, Tate L (2006) Using validation sets to avoid overfitting in AdaBoost. In: Proceedings of the 19th international Florida artificial intelligence research society conference, pp 544–549Google Scholar
  9. Caruana R, Baluja S, Mitchell T (1996) Using the future to “sort out” the present: rankprop and multitask learning. Neural Inf Process Syst 8: 959–965Google Scholar
  10. Chaudhuri S, Ganjam K, Ganti V, Motwani R (2003) Robust and efficient fuzzy match for online data cleaning. In: Proceedings of the 2003 ACM SIGMOD international conference on management of data, pp 313–324Google Scholar
  11. Cohen E, Datar M, Fujiwara S, Gionis A, Indyk P, Motwani R, Ullman JD, Yang C (2001) Finding interesting associations without support pruning. Knowl Data Eng 13(1): 64–78CrossRefGoogle Scholar
  12. Covell M, Baluja S (2007) Known-audio detection using waveprint: spectrogram fingerprinting by wavelet hashing. In: Proceedings of the international conference on acoustics, speech, and signal processingGoogle Scholar
  13. Freund Y, Schapire R (1996) Experiments with a new boosting algorithm. In: Proceedings of the thirteenth international conference on machine learning, pp 148–156Google Scholar
  14. Gionis A, Indyk P, Motwani R (1999) Similarity search in high dimensions via hashing. In: Proceedings of the international conference on very large databases. Edinburgh, Scotland, UK, pp 518–529Google Scholar
  15. Haitsma J, Kalker T (2002), A highly robust audio fingerprinting system. In: Proceedings of International conference on music information retrievalGoogle Scholar
  16. Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor. IEEE PAMI, 18Google Scholar
  17. Jacobs C, Finkelstein A, Salesin D (1995) Fast multiresolution image querying. In: Proceedings SIGGRAPHGoogle Scholar
  18. Ke Y, Hoiem D, Sukthankar R (2005) Computer vision for music identification. In: Proceedings of computer vision and pattern recognition, pp 597–604Google Scholar
  19. Pampalk E (2006) Computational models of music similarity and their application to music information retrieval. Doctoral Thesis, Vienna University of Technology, Austria, March 2006Google Scholar
  20. Shakhnarovich G, Viola P, Darrell T (2003) Fast pose estimation with parameter sensitive hashing. In: Proceedings of the international conference on computer visionGoogle Scholar
  21. Shazam Entertainment (2005).
  22. Tieu K, Viola P (2000) Boosting image retrieval. In: Proceedings of computer vision and pattern recognitionGoogle Scholar
  23. Tsang IW, Cheung P-M, Kwok JT (2005) Kernel relevant component analysis for distance metric learning. In: Proceedings of the 2005 IEEE international joint conference on neural networks, vol 2, pp 954–959Google Scholar
  24. Viola P, Jones MJ (2001) Robust real-time object detection. In: Proceedings of the IEEE workshop on statistical and computational theories of visionGoogle Scholar
  25. Wu J, Rehg J, Mullin M (2003) Learning a rare event detection cascade by direct feature selection. Adv Neural Inf Process Syst 16Google Scholar
  26. Zhang L, Li M, Zhang H (2002) Boosting image orientation detection with indoor vs. outdoor classification. In: IEEE workshop on applications of computer visionGoogle Scholar

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. 1.Google, Inc.Mountain ViewUSA

Personalised recommendations