Abstract.
Deriving local cost models for query optimization in a dynamic multidatabase system (MDBS) is a challenging issue. In this paper, we study how to evolve a query cost model to capture a slowly changing dynamic MDBS environment so that the cost model stays up to date. Two novel evolutionary techniques, the shifting method and the block-moving method, are proposed. The former updates a cost model by taking up-to-date information from one new sample query into consideration at each step, while the latter considers a block (batch) of new sample queries at each step. We study the relevant issues, including the derivation of recurrence updating formulas, the development of efficient algorithms, the analysis and comparison of complexities, and the design of an integrated scheme that applies the two methods adaptively. Our theoretical and experimental results demonstrate that the proposed techniques are promising for efficiently maintaining accurate cost models in a slowly changing dynamic MDBS environment. Besides MDBSs, the proposed techniques can also be applied to the automatic maintenance of cost models in self-managing database systems.
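The difference between the two update styles can be illustrated with a small sketch. It is not the paper's algorithm, and its recurrence formulas are not reproduced here. The sketch assumes a one-variable linear cost model (cost = b0 + b1 * x) fitted by least squares over a fixed-size window of sample queries. The class name `WindowedCostModel` and its methods are hypothetical names chosen for illustration.

```python
from collections import deque


class WindowedCostModel:
    """Least-squares fit of cost = b0 + b1 * x over a fixed-size sample window.

    Illustrative assumption: the model is maintained through running sums,
    so each sample added or dropped costs O(1) work.
    """

    def __init__(self, samples):
        self.window = deque()
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0
        for x, y in samples:
            self._add(x, y)

    def _add(self, x, y):
        # Fold one (x, observed cost) sample into the running sums.
        self.window.append((x, y))
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def _drop_oldest(self):
        # Remove the oldest sample's contribution from the running sums.
        x, y = self.window.popleft()
        self.n -= 1
        self.sx -= x
        self.sy -= y
        self.sxx -= x * x
        self.sxy -= x * y

    def shift(self, x, y):
        """Shifting-style update: one new sample in, the oldest sample out."""
        self._add(x, y)
        self._drop_oldest()

    def move_block(self, block):
        """Block-moving-style update: a batch in, as many oldest samples out."""
        block = list(block)
        for x, y in block:
            self._add(x, y)
        for _ in range(len(block)):
            self._drop_oldest()

    def coefficients(self):
        # Closed-form simple linear regression from the running sums.
        denom = self.n * self.sxx - self.sx ** 2
        b1 = (self.n * self.sxy - self.sx * self.sy) / denom
        b0 = (self.sy - b1 * self.sx) / self.n
        return b0, b1


# Old environment: cost = 1 + 2x. New environment: cost = 2 + 3x.
model = WindowedCostModel([(1, 3), (2, 5), (3, 7), (4, 9)])
model.shift(5, 11)                                      # one sample per step
model.move_block([(6, 20), (7, 23), (8, 26), (9, 29)])  # one batch per step
b0, b1 = model.coefficients()
```

In this sketch a shift touches the model after every sample query, whereas a block move defers the update until a batch is available. Running sums can accumulate rounding error over many updates, so a practical version would likely recompute them from the window periodically.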
Additional information
Received: 25 November 2002, Accepted: 20 May 2003, Published online: 30 September 2003
Edited by: L. Liu
Research supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan under OVPR and UMD grants.
About this article
Cite this article
Rahal, A., Zhu, Q. & Larson, PÅ. Evolutionary techniques for updating query cost models in a dynamic multidatabase environment. The VLDB Journal 13, 162–176 (2004). https://doi.org/10.1007/s00778-003-0110-4
DOI: https://doi.org/10.1007/s00778-003-0110-4