Evolutionary techniques for updating query cost models in a dynamic multidatabase environment

The VLDB Journal

Abstract.

Deriving local cost models for query optimization in a dynamic multidatabase system (MDBS) is a challenging issue. In this paper, we study how to evolve a query cost model to capture a slowly changing dynamic MDBS environment so that the cost model is kept up to date at all times. Two novel evolutionary techniques, namely the shifting method and the block-moving method, are proposed. The former updates a cost model by taking up-to-date information from one new sample query into consideration at each step, while the latter considers a block (batch) of new sample queries at each step. The relevant issues, including derivation of recurrence updating formulas, development of efficient algorithms, analysis and comparison of complexities, and design of an integrated scheme to apply the two methods adaptively, are studied. Our theoretical and experimental results demonstrate that the proposed techniques are quite promising in maintaining accurate cost models efficiently for a slowly changing dynamic MDBS environment. Besides the application to MDBSs, the proposed techniques can also be applied to the automatic maintenance of cost models in self-managing database systems.
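The abstract does not reproduce the paper's recurrence updating formulas, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a single-variable linear cost model (cost = b0 + b1 * x), a fixed-size window of sample queries, and running sums that are adjusted as samples enter and leave the window. The class name `SlidingCostModel` and its methods `shift` and `move_block` are hypothetical names chosen for this example.

```python
from collections import deque


class SlidingCostModel:
    """Hypothetical sketch: linear model cost = b0 + b1 * x over a sliding window.

    Assumption: one explanatory variable and a fixed window size. The paper's
    actual models and recurrence formulas are not shown in the abstract.
    """

    def __init__(self, window):
        self.window = window      # maximum number of sample queries kept
        self.samples = deque()    # (x, cost) pairs, oldest first
        self.sx = self.sy = self.sxx = self.sxy = 0.0  # running sums

    def _add(self, x, y):
        self.samples.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def _drop_oldest(self):
        x, y = self.samples.popleft()
        self.sx -= x
        self.sy -= y
        self.sxx -= x * x
        self.sxy -= x * y

    def shift(self, x, y):
        """One-sample step: absorb a new sample, evict the oldest if over capacity."""
        self._add(x, y)
        if len(self.samples) > self.window:
            self._drop_oldest()

    def move_block(self, block):
        """Batch step: absorb a block of samples, then evict the overflow."""
        for x, y in block:
            self._add(x, y)
        while len(self.samples) > self.window:
            self._drop_oldest()

    def coefficients(self):
        """Least-squares (b0, b1) computed from the running sums."""
        n = len(self.samples)
        denom = n * self.sxx - self.sx * self.sx
        if n < 2 or denom == 0:
            raise ValueError("need at least two samples with distinct x values")
        b1 = (n * self.sxy - self.sx * self.sy) / denom
        b0 = (self.sy - b1 * self.sx) / n
        return b0, b1
```

In this sketch, `shift` does constant work per new sample, while `move_block` amortizes the bookkeeping over a batch. Which of the two suits a given rate of environmental change is the kind of trade-off the paper's integrated scheme addresses. Repeated subtraction from running sums can accumulate floating-point error over long runs, so a practical version might periodically recompute the sums from the window.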

Author information

Corresponding author

Correspondence to Amira Rahal.

Additional information

Received: 25 November 2002, Accepted: 20 May 2003, Published online: 30 September 2003

Edited by: L. Liu

Research supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan under OVPR and UMD grants.

About this article

Cite this article

Rahal, A., Zhu, Q. & Larson, PÅ. Evolutionary techniques for updating query cost models in a dynamic multidatabase environment. VLDB 13, 162–176 (2004). https://doi.org/10.1007/s00778-003-0110-4

  • DOI: https://doi.org/10.1007/s00778-003-0110-4
