The VLDB Journal, Volume 25, Issue 5, pp 625–650

Characterization of the Impact of Hardware Islands on OLTP

  • Danica Porobic
  • Ippokratis Pandis
  • Miguel Branco
  • Pınar Tözün
  • Anastasia Ailamaki
Special Issue Paper


Modern hardware is abundantly parallel and increasingly heterogeneous. The numerous processing cores have non-uniform access latencies to the main memory and processor caches, which causes variability in the communication costs. Unfortunately, database systems mostly assume that all processing cores are the same and that microarchitecture differences are not significant enough to appear in critical database execution paths. As we demonstrate in this paper, however, non-uniform core topology does appear in the critical path, and conventional database architectures achieve suboptimal and, even worse, unpredictable performance. We perform a detailed performance analysis of OLTP deployments in servers with multiple cores per CPU (multicore) and multiple CPUs per server (multisocket). We compare different database deployment strategies where we vary the number and size of independent database instances running on a single server, from a single shared-everything instance to fine-grained shared-nothing configurations. We quantify the impact of non-uniform hardware on various deployments by (a) examining how efficiently each deployment uses the available hardware resources and (b) measuring the impact of distributed transactions and skewed requests on different workloads. We show that no strategy is optimal for all cases and that the best choice depends on the combination of hardware topology and workload characteristics. Finally, we argue that transaction processing systems must be aware of the hardware topology in order to achieve predictably high performance.


Islands · Shared-everything · Shared-nothing · OLTP · Multisocket multicores · Non-uniform hardware topology



We would like to thank Eric Sedlar and Brian Gold for many insightful discussions and the members of the DIAS laboratory for their support throughout this work. This work is partially funded by Oracle Labs and by the Swiss National Science Foundation (Grant No. 200021-146407/1).



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Danica Porobic (1, corresponding author)
  • Ippokratis Pandis (2)
  • Miguel Branco (3)
  • Pınar Tözün (4)
  • Anastasia Ailamaki (1, 3)

  1. School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
  2. Amazon Web Services, Palo Alto, USA
  3. RAW Labs, Lausanne, Switzerland
  4. IBM Almaden Research Center, San Jose, USA
