Abstract
Driven by the emerging requirements of High Performance Computing (HPC) architectures, this work focuses on the interplay between the computational and energy aspects of a Four Dimensional Variational (4DVAR) Data Assimilation algorithm based on Domain Decomposition (named DD-4DVAR). We report first results on the energy consumption of the DD-4DVAR algorithm on embedded processors, together with a mathematical analysis of the algorithm's energy behavior in which the characteristics of the architecture are treated as variables of the model. The main objective is to capture the essential operations of the algorithm that exhibit a direct relationship with the measured energy. The experimental evaluation is carried out on a set of mini-clusters made available by the Barcelona Supercomputing Center.
Notes
- 1.
Due to the time complexity of the computation, for each Megabyte the value of \(n_C\), which is independent of the computing architecture, is such that: \(n_{C,1}= \left\lceil \left( \frac{1048576}{8 \cdot 3}\right) ^{\frac{1}{6}} \right\rceil = 6\), where \(\lceil \cdot \rceil \) denotes rounding up to the nearest integer.
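The arithmetic in this note can be checked with a minimal sketch. The function name `n_c_for_bytes` and its parameter names are hypothetical and used only for illustration; the constants 8, 3 and the sixth root are taken directly from the note, and reading 8 as the number of bytes per double-precision value is an assumption. To avoid floating-point rounding near exact sixth powers, the sketch searches for the smallest integer \(n\) with \(n^6 \cdot 8 \cdot 3 \ge\) the number of bytes, which is equivalent to rounding the sixth root up.

```python
def n_c_for_bytes(num_bytes, bytes_per_value=8, factor=3, exponent=6):
    """Smallest integer n such that n**exponent * bytes_per_value * factor >= num_bytes.

    Equivalent to ceil((num_bytes / (bytes_per_value * factor)) ** (1 / exponent)),
    but computed with integers only. Parameter names are illustrative assumptions.
    """
    n = 1
    while (n ** exponent) * bytes_per_value * factor < num_bytes:
        n += 1
    return n


ONE_MEGABYTE = 1048576  # 2**20 bytes, as in the note

n_c1 = n_c_for_bytes(ONE_MEGABYTE)
print(n_c1)
```

For one Megabyte the sixth root of \(1048576 / 24\) lies between 5 and 6 (since \(5^6 = 15625\) and \(6^6 = 46656\)), so rounding up gives the value 6 stated in the note.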
Acknowledgment
The research has received funding from the European Commission under the H2020-MSCA-RISE NASDAC project (grant agreement no. 691184), FP7 Mont-Blanc and Mont-Blanc 2 (grant agreements no. 288777 and 610402), and H2020-FET Mont-Blanc 3 (grant agreement no. 671697).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Arcucci, R., Basciano, D., Cilardo, A., D’Amore, L., Mantovani, F. (2018). Energy Analysis of a 4D Variational Data Assimilation Algorithm and Evaluation on ARM-Based HPC Systems. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10778. Springer, Cham. https://doi.org/10.1007/978-3-319-78054-2_4
DOI: https://doi.org/10.1007/978-3-319-78054-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78053-5
Online ISBN: 978-3-319-78054-2
eBook Packages: Computer Science (R0)