Abstract
An array database is software that uses non-linear data structures to store and process multidimensional data, such as images and time series. Because multidimensional data applications are generally data-intensive, array databases can benefit from multiprocessing systems to improve performance. On Non-Uniform Memory Access (NUMA) machines, however, moving massive amounts of data across NUMA nodes may degrade performance significantly. This paper presents a mechanism for scheduling array database threads based on data movement patterns and performance monitoring information. Our scheduling mechanism uses non-cooperative game theory to determine the optimal thread placement: threads act as decision-makers, each selecting the best NUMA node based on each node's remote memory access cost. We implemented and tested our mechanism on two array databases (Savime and SciDB), demonstrating improved NUMA-affinity. With Savime, we observed a maximum speedup of \(1.64\times\) and a reduction of up to \(2.46\times\) in remote data access during subarray operations. With SciDB, we observed a speedup of up to \(1.38\times\) and a reduction of \(1.71\times\) in remote data access.
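The idea of threads acting as players that each pick a NUMA node can be illustrated with a small best-response loop. The sketch below is not the mechanism evaluated in the paper: the cost function (remote traffic weighted by node distance, plus a per-thread crowding penalty), the names `thread_cost` and `best_response_placement`, and all numeric inputs are assumptions chosen only to make the game-theoretic framing concrete.

```python
from typing import List


def thread_cost(thread: int, node: int, placement: List[int],
                traffic: List[List[float]], distance: List[List[float]],
                crowding: float) -> float:
    """Cost one thread pays for running on `node`, given the other threads' choices."""
    # Remote-access term: traffic to each memory node, weighted by its distance.
    remote = sum(b * d for b, d in zip(traffic[thread], distance[node]))
    # Contention term (assumed): other threads currently placed on the same node.
    others = sum(1 for t, n in enumerate(placement) if n == node and t != thread)
    return remote + crowding * others


def best_response_placement(traffic: List[List[float]],
                            distance: List[List[float]],
                            crowding: float = 1.0,
                            max_rounds: int = 100) -> List[int]:
    """Let threads take turns moving to their cheapest node until nobody moves."""
    n_nodes = len(distance)
    placement = [0] * len(traffic)  # hypothetical start: every thread on node 0
    for _ in range(max_rounds):  # safety bound on the number of rounds
        moved = False
        for thread in range(len(traffic)):
            current = thread_cost(thread, placement[thread], placement,
                                  traffic, distance, crowding)
            best = min(range(n_nodes),
                       key=lambda n: thread_cost(thread, n, placement,
                                                 traffic, distance, crowding))
            best_cost = thread_cost(thread, best, placement,
                                    traffic, distance, crowding)
            if best_cost < current:  # move only on a strict improvement
                placement[thread] = best
                moved = True
        if not moved:
            break  # no thread can lower its cost alone: a pure Nash equilibrium
    return placement


# Hypothetical inputs: two NUMA nodes, local access costs 0, remote access costs 1.
distance = [[0.0, 1.0], [1.0, 0.0]]
# traffic[t][m]: amount of data thread t reads from memory attached to node m.
traffic = [[100.0, 10.0], [5.0, 80.0], [40.0, 70.0]]
placement = best_response_placement(traffic, distance)
print(placement)
```

When the loop exits without a move, no thread can reduce its own cost by changing node unilaterally, which is the equilibrium condition of this toy game. On a real system the traffic figures would come from performance counters, and the resulting placement would be applied through an OS affinity interface.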
This work was supported by Serrapilheira Institute (grant number Serra-1709-16621).
Notes
1. National Aeronautics and Space Administration.
2. European Center for Medium-Range Weather Forecasts.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dominico, S., Alves, M.A.Z., de Almeida, E.C. (2023). NoGar: A Non-cooperative Game for Thread Pinning in Array Databases. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14146. Springer, Cham. https://doi.org/10.1007/978-3-031-39847-6_15
DOI: https://doi.org/10.1007/978-3-031-39847-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39846-9
Online ISBN: 978-3-031-39847-6
eBook Packages: Computer Science, Computer Science (R0)