• Deepak P
  • Prasad M. Deshpande
Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)


In this introductory chapter, we consider the operation of common similarity search systems primarily from a semantics point of view, as opposed to the efficiency-oriented view typical of the database literature. We illustrate that the full specification of a similarity search system involves the schema definition as well as details pertaining to the phases of pairwise similarity estimation and result set identification. We will see how variations in the specification of these two phases give rise to various similarity operators. In addition to reviewing the most common similarity operator, the top-k operator, we look at the landscape of similarity operators that have been proposed in the last two decades. We then consider the notion of similarity from a cognitive/psychological perspective and outline some assumptions about similarity measures that form conventional wisdom in that literature. In particular, we focus on those aspects that have implications for building computer-based similarity search systems, and outline some disconnects between the psychology literature and the computing literature in the assumptions they make about similarity measures.
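To make the two phases concrete, the following is a minimal sketch of a top-k operator, not the chapter's own formulation. The per-attribute similarity function, the mean aggregation, and all names (`pairwise_similarity`, `aggregate`, `top_k`) are assumptions chosen for illustration; objects are assumed to be numeric tuples sharing one schema.

```python
import heapq

def pairwise_similarity(query, obj):
    """Phase 1: estimate similarity attribute by attribute.

    Assumed measure: 1 / (1 + |difference|), which is 1.0 for equal values
    and decays towards 0 as the values move apart.
    """
    return [1.0 / (1.0 + abs(q - o)) for q, o in zip(query, obj)]

def aggregate(sims):
    """Collapse the per-attribute similarities into one score (assumed: plain mean)."""
    return sum(sims) / len(sims)

def top_k(query, dataset, k):
    """Phase 2: identify the result set as the k highest-scoring objects."""
    scored = [
        (aggregate(pairwise_similarity(query, obj)), name)
        for name, obj in dataset.items()
    ]
    # nlargest returns the k best (score, name) pairs in descending order.
    return [name for _, name in heapq.nlargest(k, scored)]

dataset = {"a": (1.0, 1.0), "b": (5.0, 5.0), "c": (1.0, 2.0)}
result = top_k((1.0, 1.0), dataset, 2)
```

Swapping out `aggregate` or the selection rule in `top_k` while keeping phase 1 fixed is one way to picture how different similarity operators could arise from the same pairwise estimates.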


Triangle Inequality; Similarity Search; Dynamic Time Warping; Skyline Query; Query Object
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. IBM Research, Bangalore, India
