Handling Query Skew in Large Indexes: A View Based Approach

  • Conference paper
Databases Theory and Applications (ADC 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9093)

Abstract

Indexing is one of the most important techniques for query processing over multi-dimensional datasets. A common strategy is to keep the tree-structured index balanced. This strategy implicitly assumes that queries are issued uniformly, partly because in practice the query distribution is not known in advance and changes over time. The key question we study in this work is whether it is best to rely entirely on a balanced tree-structured index, particularly as datasets grow larger and larger. When a dataset becomes very large, it is unreasonable to assume that the data in every subspace are equally important and are accessed uniformly by all queries at the index level. Given the existence of query skew, we study how to handle it at the index level without sacrificing the ability of a well-balanced tree index to support any possible query, and without high overhead. To this end, we propose index-views at the index level, where an index-view is a shortcut in a balanced tree-structured index to the objects in a more frequently accessed subspace, and we propose a new index-view-centric framework that processes queries using index-views in a bottom-up manner. We also study the index-view selection problem, and we confirm the effectiveness of our approach using large real and synthetic datasets.
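
As a rough illustration of the shortcut idea only, and not the algorithm from the paper, the following Python sketch builds a small balanced k-d tree and treats an "index-view" as a direct pointer to an internal node covering a frequently queried region. All names (`Node`, `build`, `range_search`, `search_with_view`) are hypothetical, and the sketch assumes that no two points share a coordinate on the same axis. A range query that falls entirely inside the view's bounding box starts at the view; any other query falls back to the root, so every query is still answered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Node:
    box: Box                       # tight bounding box of the points below
    points: List[Point]            # non-empty only at leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], depth: int = 0, leaf_size: int = 2) -> Node:
    """Balanced k-d tree via median split, alternating x and y."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    box = (min(xs), min(ys), max(xs), max(ys))
    if len(points) <= leaf_size:
        return Node(box, list(points))
    ordered = sorted(points, key=lambda p: p[depth % 2])
    mid = len(ordered) // 2
    return Node(box, [],
                build(ordered[:mid], depth + 1, leaf_size),
                build(ordered[mid:], depth + 1, leaf_size))


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_search(node: Node, query: Box, counter: List[int]) -> List[Point]:
    """Standard top-down range search; counter[0] counts visited nodes."""
    counter[0] += 1
    if not overlaps(node.box, query):
        return []
    if node.left is None:  # leaf
        return [p for p in node.points
                if query[0] <= p[0] <= query[2] and query[1] <= p[1] <= query[3]]
    return (range_search(node.left, query, counter)
            + range_search(node.right, query, counter))


def search_with_view(root: Node, view: Node, query: Box):
    """Start at the view when it covers the query, otherwise at the root.

    Assumption: coordinates are distinct per axis, so no point outside the
    view's subtree can lie inside the view's bounding box.
    """
    counter = [0]
    start = view if contains(view.box, query) else root
    return range_search(start, query, counter), counter[0]


if __name__ == "__main__":
    pts = [(i, (i * 7) % 16) for i in range(16)]
    root = build(pts)
    hot = root.right.right          # pretend this region is queried often
    print(search_with_view(root, hot, (10, 10, 14, 14)))
    print(search_with_view(root, root, (10, 10, 14, 14)))
```

Both calls return the same points; the one that enters through the view visits fewer nodes because it skips the upper levels of the tree. How views are chosen and how a bottom-up search proceeds from a view are the subject of the paper and are not modeled here.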

Author information

Corresponding author

Correspondence to Weihuang Huang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Huang, W., Yu, J.X., Shang, Z. (2015). Handling Query Skew in Large Indexes: A View Based Approach. In: Sharaf, M., Cheema, M., Qi, J. (eds) Databases Theory and Applications. ADC 2015. Lecture Notes in Computer Science, vol 9093. Springer, Cham. https://doi.org/10.1007/978-3-319-19548-3_15

  • DOI: https://doi.org/10.1007/978-3-319-19548-3_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19547-6

  • Online ISBN: 978-3-319-19548-3

  • eBook Packages: Computer Science, Computer Science (R0)
