Journal of Computer Science and Technology, Volume 22, Issue 2, pp 273–279

Efficient Execution of Multiple Queries on Deep Memory Hierarchy

  • Yan Zhang
  • Zhi-Feng Chen
  • Yuan-Yuan Zhou
Short Paper


This paper proposes a complementary novel idea, called MiniTasking, to further reduce the number of cache misses by improving the temporal locality of data for multiple concurrent queries. The idea is based on the observation that, in many workloads such as decision support systems (DSS), there is usually a significant amount of data sharing among concurrent queries. MiniTasking exploits this sharing to improve temporal locality by scheduling query execution at three levels: query-level batching, operator-level grouping, and mini-task-level scheduling. Experimental results with various types of concurrent TPC-H query workloads show that, with the traditional N-ary Storage Model (NSM) layout, MiniTasking significantly reduces L2 cache misses by up to 83% and thereby achieves a 24% reduction in execution time. With the Partition Attributes Across (PAX) layout, MiniTasking further reduces cache misses by 65% and execution time by 9%. For the TPC-H throughput test workload, MiniTasking improves end performance by up to 20%.
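The locality argument above can be illustrated with a minimal sketch. It is not the paper's implementation: the one-chunk cache model, the function names (`query_at_a_time`, `chunk_at_a_time`), and the toy two-column table are all assumptions made for illustration. The sketch compares running each query over the whole table in turn against running every query over one cache-sized chunk before moving to the next chunk, which is the general idea behind scheduling shared data in small tasks.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Row = Tuple[int, float]                   # hypothetical (quantity, price) row
Query = Callable[[Sequence[Row]], float]  # partial aggregate over one chunk


class OneChunkCache:
    """Toy cache model (an assumption): it holds exactly one chunk at a time."""

    def __init__(self) -> None:
        self.resident = None
        self.misses = 0

    def touch(self, chunk_id: int) -> None:
        # A miss is counted whenever the requested chunk is not resident.
        if chunk_id != self.resident:
            self.misses += 1
            self.resident = chunk_id


def chunks(table: Sequence[Row], chunk_rows: int) -> Iterator[Tuple[int, Sequence[Row]]]:
    """Yield (chunk_id, rows) pairs of at most chunk_rows rows each."""
    for start in range(0, len(table), chunk_rows):
        yield start // chunk_rows, table[start:start + chunk_rows]


def query_at_a_time(table: Sequence[Row], queries: List[Query],
                    chunk_rows: int) -> Tuple[List[float], int]:
    """Baseline: each query scans the whole table before the next one starts."""
    cache = OneChunkCache()
    totals = [0.0] * len(queries)
    for qi, query in enumerate(queries):
        for chunk_id, rows in chunks(table, chunk_rows):
            cache.touch(chunk_id)
            totals[qi] += query(rows)
    return totals, cache.misses


def chunk_at_a_time(table: Sequence[Row], queries: List[Query],
                    chunk_rows: int) -> Tuple[List[float], int]:
    """Shared schedule: all queries process a chunk while it is still resident."""
    cache = OneChunkCache()
    totals = [0.0] * len(queries)
    for chunk_id, rows in chunks(table, chunk_rows):
        for qi, query in enumerate(queries):
            cache.touch(chunk_id)
            totals[qi] += query(rows)
    return totals, cache.misses
```

Under this toy model, both schedules return identical aggregates, but the baseline reloads every chunk once per query while the shared schedule loads each chunk once. Real caches, operator grouping, and query batching are considerably more involved than this sketch suggests.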


cache performance, temporal locality, mini-task scheduling, concurrent queries






Copyright information

© Science Press, Beijing, China and Springer Science + Business Media, LLC, USA 2007

Authors and Affiliations

  1. National Laboratory on Machine Perception, Peking University, Beijing, China
  2. Google Inc., Mountain View, U.S.A.
  3. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, U.S.A.
