Energy Analysis of a 4D Variational Data Assimilation Algorithm and Evaluation on ARM-Based HPC Systems

  • Rossella Arcucci
  • Davide Basciano
  • Alessandro Cilardo
  • Luisa D’Amore
  • Filippo Mantovani
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10778)


Driven by the emerging requirements of High Performance Computing (HPC) architectures, this work focuses on the interplay of computational and energetic aspects of a Four-Dimensional Variational (4DVAR) Data Assimilation algorithm based on Domain Decomposition (named DD-4DVAR). We report first results on the energy consumption of the DD-4DVAR algorithm on embedded processors, together with a mathematical analysis of the algorithm's energy behavior that treats the architecture characteristics as variables of the model. The main objective is to capture the essential operations of the algorithm that exhibit a direct relationship with the measured energy. The experimental evaluation is carried out on a set of mini-clusters made available by the Barcelona Supercomputing Center.


Data assimilation, 4DVar, Domain Decomposition, Embedded processor architectures, Energy consumption



The research has received funding from the European Commission under the H2020-MSCA-RISE NASDAC project (grant agreement no. 691184), FP7 Mont-Blanc and Mont-Blanc 2 (grant agreements no. 288777 and 610402), and H2020-FET Mont-Blanc 3 (grant agreement no. 671697).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Rossella Arcucci (1), corresponding author
  • Davide Basciano (1)
  • Alessandro Cilardo (1)
  • Luisa D’Amore (1)
  • Filippo Mantovani (2)

  1. University of Naples Federico II, Naples, Italy
  2. Barcelona Supercomputing Center (BSC), Barcelona, Spain
