
On Transformation of Query Scheduling Strategies in Distributed and Heterogeneous Database Systems

  • Janusz R. Getta
  • Handoko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9011)

Abstract

This work considers the problem of optimal query processing in heterogeneous and distributed database systems. A global query submitted at a local site is decomposed into a number of queries processed at the remote sites. The partial results returned by these queries are integrated at the local site. The paper addresses the problem of scheduling the queries optimally, so as to minimize the time spent on integrating the partial results into the final answer. A global data model defined in this work provides a unified view of the heterogeneous data structures located at the remote sites, and a system of operations is defined to express complex data integration procedures. This work shows that transforming an entirely simultaneous query processing strategy into a hybrid (simultaneous/sequential) strategy may in some cases lead to significantly faster data integration. We show how to detect such cases, what conditions must be satisfied to transform the schedules, and how to transform the schedules into more efficient ones.
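
The trade-off between an entirely simultaneous schedule and a hybrid one can be illustrated with a toy cost model. The sketch below is not the paper's data model or its transformation rules. Everything in it is an assumption made for illustration: the names `RemoteQuery`, `simultaneous_time` and `hybrid_time`, the linear per-row integration cost, and the idea that a query issued later can be restricted by an earlier result (`reduction_factor`). Times are integers in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class RemoteQuery:
    """Hypothetical description of one query sent to a remote site."""
    name: str
    latency_ms: int  # time until the remote site returns its partial result
    rows: int        # number of rows in that partial result


def simultaneous_time(queries, cost_per_row_ms):
    """All queries start at t=0; integration starts once every result has arrived."""
    arrival = max(q.latency_ms for q in queries)
    integration = cost_per_row_ms * sum(q.rows for q in queries)
    return arrival + integration


def hybrid_time(first, second, reduction_factor, cost_per_row_ms):
    """Run `first`, then issue `second` restricted by the first result.

    Assumption: the restriction shrinks the second result by `reduction_factor`,
    at the price of paying both latencies one after the other.
    """
    reduced_rows = second.rows // reduction_factor
    arrival = first.latency_ms + second.latency_ms
    integration = cost_per_row_ms * (first.rows + reduced_rows)
    return arrival + integration


small = RemoteQuery("small", latency_ms=2000, rows=1000)
large = RemoteQuery("large", latency_ms=3000, rows=100000)

all_at_once = simultaneous_time([small, large], cost_per_row_ms=1)
hybrid_strong = hybrid_time(small, large, reduction_factor=100, cost_per_row_ms=1)
hybrid_none = hybrid_time(small, large, reduction_factor=1, cost_per_row_ms=1)
```

Under these assumed numbers the hybrid schedule wins only when the earlier result reduces the later one substantially. With no reduction, the added sequential latency makes it slower than the simultaneous schedule, which is why the transformation pays off only in some cases.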

Keywords

Distributed heterogeneous database systems; Data integration; Optimization of query processing



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. School of Computer Science and Software Engineering, University of Wollongong, Wollongong, Australia
