Common Similarity Search Operators

  • Deepak P
  • Prasad M. Deshpande
Chapter
Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)

Abstract

We present a simple framework for similarity search systems in which different similarity operators are expressed as combinations of aggregation and filter functions. We describe common aggregation functions, such as weighted sum and N-Match, and then give an overview of common filter functions, including the threshold, top-k, and skyline filters. Finally, we illustrate how combinations of aggregation and filter functions form some of the commonly used similarity search operators.
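
The aggregation-plus-filter idea can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the sample data, and the exact definitions are assumptions chosen for illustration. Each object is assumed to be represented by a vector of per-attribute similarities to the query (higher is better), N-Match is assumed to return the n-th largest per-attribute similarity, and the skyline filter is assumed to work on the unaggregated similarity vectors.

```python
from typing import Callable, Dict, List, Sequence

# Assumption: each object is described by its per-attribute similarities
# to the query, where a higher value means "more similar".
SimVector = Sequence[float]


def weighted_sum(weights: Sequence[float]) -> Callable[[SimVector], float]:
    """Aggregation: weighted sum of the per-attribute similarities."""
    def aggregate(sim: SimVector) -> float:
        return sum(w * s for w, s in zip(weights, sim))
    return aggregate


def n_match(n: int) -> Callable[[SimVector], float]:
    """Aggregation (assumed definition): the n-th largest similarity."""
    def aggregate(sim: SimVector) -> float:
        return sorted(sim, reverse=True)[n - 1]
    return aggregate


def threshold_filter(scores: Dict[str, float], tau: float) -> List[str]:
    """Filter: keep objects whose aggregated score is at least tau."""
    return sorted(o for o, s in scores.items() if s >= tau)


def top_k_filter(scores: Dict[str, float], k: int) -> List[str]:
    """Filter: keep the k best-scoring objects (ties broken by name)."""
    return sorted(scores, key=lambda o: (-scores[o], o))[:k]


def dominates(a: SimVector, b: SimVector) -> bool:
    """a dominates b: at least as similar everywhere, strictly better once."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def skyline_filter(sims: Dict[str, SimVector]) -> List[str]:
    """Filter: keep objects whose vector is not dominated by any other."""
    return sorted(
        o for o, v in sims.items()
        if not any(dominates(u, v) for p, u in sims.items() if p != o)
    )


def aggregate_all(sims: Dict[str, SimVector],
                  aggregate: Callable[[SimVector], float]) -> Dict[str, float]:
    """Apply one aggregation function to every object."""
    return {o: aggregate(v) for o, v in sims.items()}


# Hypothetical similarity vectors for four objects over three attributes.
sims = {
    "a": (0.9, 0.2, 0.4),
    "b": (0.5, 0.6, 0.7),
    "c": (0.3, 0.3, 0.3),
    "d": (0.8, 0.1, 0.8),
}

# Operators as aggregation + filter combinations.
ws_scores = aggregate_all(sims, weighted_sum((1.0, 1.0, 1.0)))
range_result = threshold_filter(ws_scores, tau=1.0)            # weighted sum + threshold
top2_result = top_k_filter(ws_scores, k=2)                     # weighted sum + top-k
nmatch_result = top_k_filter(aggregate_all(sims, n_match(2)), k=1)  # N-Match + top-k
skyline_result = skyline_filter(sims)                          # skyline on raw vectors

print(range_result, top2_result, nmatch_result, skyline_result)
```

In this sketch a range-style query corresponds to weighted sum followed by the threshold filter, and a top-k query to the same aggregation followed by the top-k filter. Swapping in `n_match` changes only the aggregation step, which is the separation of concerns the framework is meant to provide.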

Keywords

Range Query, Query Point, Skyline Query, Query Object, Similarity Vector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. IBM Research, Bangalore, India
