Query Execution Optimization Based on Incremental Update in Database Distributed Middleware

  • Wei Ye
  • Mei Wang
  • Jiajin Le
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9528)


Big data is often generated incrementally in the real world. Existing incremental query optimization is mainly used in streaming data environments. Due to the constraints of real-time streaming data applications, existing incremental execution mechanisms are difficult to apply directly to large business-oriented data in a distributed environment. This paper proposes a query execution optimization method based on incremental update in database distributed middleware. First, the method defines a Reference-Graph over the tables and their foreign key relationships, and uses it to derive a data partition strategy that reduces the amount of data transmitted during query execution. In addition, the method provides an incremental-update query execution strategy and a mechanism for preserving incremental intermediate results in a distributed environment, for non-aggregate and aggregate queries respectively. The combination of data partitioning and incremental updating reduces query execution cost and significantly improves the performance of complex queries. Finally, experiments on a benchmark dataset verify the effectiveness of the proposed method.
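The idea of preserving an intermediate result for an aggregate query, so that a newly arrived batch of rows can be merged without rescanning old data, can be illustrated with a minimal sketch. This is not the authors' mechanism, which the abstract does not detail; the class name `IncrementalAggregate`, the method names, and the (key, value) row layout are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class IncrementalAggregate:
    """Hypothetical sketch: per-group SUM/COUNT/AVG maintained incrementally.

    The preserved intermediate result is a (running_sum, row_count) pair
    per group, which is enough to answer SUM, COUNT and AVG after each
    delta without touching previously processed rows.
    """

    def __init__(self):
        # group key -> [running_sum, row_count]
        self.state = defaultdict(lambda: [0.0, 0])

    def apply_delta(self, rows):
        """Merge a batch of newly inserted (group_key, value) rows."""
        for key, value in rows:
            entry = self.state[key]
            entry[0] += value
            entry[1] += 1

    def result(self):
        """Return the current aggregate values for every group seen so far."""
        return {
            key: {"sum": total, "count": count, "avg": total / count}
            for key, (total, count) in self.state.items()
        }


def full_recompute(rows):
    """Baseline for comparison: aggregate all rows from scratch."""
    agg = IncrementalAggregate()
    agg.apply_delta(rows)
    return agg.result()


# Usage: two batches arrive over time; only the new batch is scanned each time.
batch_1 = [("a", 10.0), ("b", 5.0)]
batch_2 = [("a", 20.0)]

agg = IncrementalAggregate()
agg.apply_delta(batch_1)
agg.apply_delta(batch_2)

# The incrementally maintained result matches a full rescan of both batches.
assert agg.result() == full_recompute(batch_1 + batch_2)
```

The sketch covers only insert-only deltas and decomposable aggregates; deletions, updates, and aggregates such as MIN or MAX would need additional preserved state.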


Database middleware, Distributed database, Data partition, Incremental update, Result set reuse



This work was supported by the Fundamental Research Funds for the Central Universities and the DHU Distinguished Young Professor Program, No. B201312.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. School of Computer Science and Technology, DongHua University, Shanghai, China
