The Method of Query Selectivity Estimation for Selection Conditions Based on Sum of Sub-Independent Attributes

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 242)

Abstract

Selectivity estimation is performed during query optimization. The selectivity parameter makes it possible to estimate the size of a query result before the query is actually executed, which allows the optimizer to choose the best query execution plan. For complex queries (where the selection condition is based on many attributes), accurate selectivity estimation requires a multidimensional distribution of attribute values. Often, however, the attribute value independence assumption and the use of only 1-dimensional distributions give sufficiently accurate selectivity approximations. The paper describes a method of selectivity estimation for queries with a complex selection condition based on a sum of independent or sub-independent attributes. The proposed method operates on 1-dimensional Fourier transforms of the marginal distributions of the attributes involved in the selection condition.
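The idea in the last two sentences can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that each attribute takes integer values 0, 1, 2, ... and that its marginal distribution is available as a normalized histogram, and the function names `sum_pmf_via_fft` and `selectivity` are hypothetical. For independent attributes, and likewise for sub-independent ones, the characteristic function of the sum is the product of the marginal characteristic functions, so the distribution of the sum can be obtained by multiplying 1-dimensional FFTs and inverting the product.

```python
import numpy as np


def sum_pmf_via_fft(pmfs):
    """Distribution of X_1 + ... + X_m from the marginal PMFs.

    Assumption: pmfs[i][k] = P(X_i = k) for k = 0..len(pmfs[i]) - 1,
    and the attributes are independent or sub-independent.
    """
    # The sum takes values 0..n-1; padding to n avoids circular wrap-around.
    n = sum(len(p) for p in pmfs) - len(pmfs) + 1
    product = np.ones(n // 2 + 1, dtype=complex)
    for p in pmfs:
        # 1-dimensional transform of one marginal distribution.
        product *= np.fft.rfft(np.asarray(p, dtype=float), n)
    pmf = np.fft.irfft(product, n)
    # Remove tiny negative values caused by floating-point round-off.
    return np.clip(pmf, 0.0, None)


def selectivity(pmfs, low, high):
    """Estimated selectivity of the condition low <= X_1 + ... + X_m <= high."""
    pmf = sum_pmf_via_fft(pmfs)
    low = max(low, 0)
    high = min(high, len(pmf) - 1)
    if low > high:
        return 0.0
    return float(pmf[low:high + 1].sum())


# Hypothetical example: two attributes, each uniform on 0..5.
uniform = np.full(6, 1.0 / 6.0)
estimate = selectivity([uniform, uniform], 0, 5)
```

For the two uniform attributes above, 21 of the 36 equally likely value pairs have a sum of at most 5, so `estimate` should be close to 21/36 (about 0.583). Only 1-dimensional transforms are needed, which is the reason no multidimensional distribution has to be stored in this sketch.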

Keywords

query selectivity estimation, query optimization, sub-independence, characteristic function, FFT

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Institute of Informatics, Silesian University of Technology, Gliwice, Poland
