Expressing and Optimizing Similarity-Based Queries in SQL

(Extended Abstract)

  • Conference paper
Conceptual Modeling – ER 2004 (ER 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3288)

Abstract

Searching a large set for objects similar to a given query object (that is, its near and nearest neighbors) is an essential task in many applications. Recent years have seen great progress towards efficient algorithms for this task. This paper takes a query language perspective, equipping SQL with near and nearest neighbor search capability by adding a user-defined predicate called NN-UDP. The predicate indicates whether an object is, among a set of objects, a near or nearest neighbor of a given query object. NN-UDP makes queries involving similarity searches intuitive to express. Unfortunately, traditional cost-based optimization methods designed for ordinary UDPs do not work well for such SQL queries. Better execution plans become possible with a new operator, called NN-OP, which finds the near or nearest neighbors of a given query object from a set of objects. The optimization algorithm proposed in this paper produces plans that take advantage of the efficient search algorithms developed in recent years. To assess the proposed algorithm, the paper focuses on applications that deal with streaming time series. Experimental results show that the optimization strategy is effective.

This is an abbreviated version of the technical report [6].
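
The contrast the abstract draws between a per-object predicate (NN-UDP) and a set-level operator (NN-OP) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `nn_udp` and `nn_op`, the use of Euclidean distance, and the brute-force evaluation are hypothetical and are not taken from the paper, whose actual SQL syntax and search algorithms are not shown in this abstract.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_udp(candidate: Point, query: Point, objects: List[Point], k: int = 1) -> bool:
    """Hypothetical predicate form: is `candidate` among the k nearest
    neighbors of `query` within `objects`?

    Evaluated naively, each call scans the whole set, so filtering n
    objects with this predicate costs O(n^2) distance computations.
    """
    d = distance(candidate, query)
    closer = sum(1 for o in objects if distance(o, query) < d)
    return closer < k


def nn_op(query: Point, objects: List[Point], k: int = 1) -> List[Point]:
    """Hypothetical operator form: return the k nearest neighbors of
    `query` from `objects` in one pass over the set.

    A real system could plug an index-based search in here; this sketch
    uses a simple heap-based selection instead.
    """
    return heapq.nsmallest(k, objects, key=lambda o: distance(o, query))


objects = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
query = (0.9, 0.9)

# Predicate applied tuple by tuple, as a traditional UDP would be.
via_predicate = [o for o in objects if nn_udp(o, query, objects, k=2)]

# Operator applied once to the whole set.
via_operator = nn_op(query, objects, k=2)
```

When no two objects are equidistant from the query, both forms select the same objects; the difference lies in how much work is done. This is one way to see why a plan built around a set-level operator can be cheaper than one that treats the predicate as an opaque per-tuple filter.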


References

  1. Agrawal, R., Faloutsos, C., Swami, A.N.: Efficient similarity search in sequence databases. In: Lomet, D.B. (ed.) FODO 1993. LNCS, vol. 730, pp. 69–84. Springer, Heidelberg (1993)

  2. Chaudhuri, S., Gravano, L.: Optimizing queries over multimedia repositories. In: SIGMOD Conference, pp. 91–102 (1996)

  3. Chaudhuri, S., Shim, K.: Optimization of queries with user-defined predicates. ACM Transactions on Database Systems 24(2), 177–228 (1999)

  4. Chimenti, D., Gamboa, R., Krishnamurthy, R.: Towards an open architecture for LDL. In: VLDB Conference (1989)

  5. Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast subsequence matching in time-series databases. In: SIGMOD Conference, pp. 419–429 (1994)

  6. Gao, L., Wang, M., Wang, X.S., Padmanabhan, S.: Expressing and optimizing similarity-based queries in SQL. Technical Report CS-04-06, University of Vermont (March 2004), http://www.cs.uvm.edu/csdb/techreport.shtml

  7. Gao, L., Wang, X.S., Wang, M., Padmanabhan, S.: A learning-based approach to estimate statistics of operators in continuous queries: a case study. In: Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD (2003)

  8. Hellerstein, J.M.: Practical predicate placement. In: SIGMOD Conference, pp. 325–335 (1994)

  9. Hellerstein, J.M., Stonebraker, M.: Predicate migration: optimizing queries with expensive predicates. In: SIGMOD Conference, pp. 267–276 (1993)

  10. Keogh, E.J., Chakrabarti, K., Mehrotra, S., Pazzani, M.J.: Locally adaptive dimensionality reduction for indexing large time series databases. In: SIGMOD Conference (2001)

  11. Rafiei, D., Mendelzon, A.: Similarity-based queries for time series data. In: SIGMOD Conference, pp. 13–25 (1997)

  12. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest neighbor queries. In: SIGMOD Conference, pp. 71–79 (1995)

  13. Seidl, T., Kriegel, H.-P.: Optimal multi-step k-nearest neighbor search. In: SIGMOD Conference, pp. 154–165 (1998)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gao, L., Wang, M., Wang, X.S., Padmanabhan, S. (2004). Expressing and Optimizing Similarity-Based Queries in SQL. In: Atzeni, P., Chu, W., Lu, H., Zhou, S., Ling, TW. (eds) Conceptual Modeling – ER 2004. ER 2004. Lecture Notes in Computer Science, vol 3288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30464-7_36

  • DOI: https://doi.org/10.1007/978-3-540-30464-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23723-5

  • Online ISBN: 978-3-540-30464-7

  • eBook Packages: Springer Book Archive
