Characterizing the Performance and Energy Attributes of Scientific Simulations

  • Sayaka Akioka
  • Konrad Malkowski
  • Padma Raghavan
  • Mary Jane Irwin
  • Lois Curfman McInnes
  • Boyana Norris
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3991)


We characterize the performance and energy attributes of scientific applications based on nonlinear partial differential equations (PDEs), where the dominant cost is that of sparse linear system solution. We obtain performance and energy metrics using cycle-accurate emulations on a processor and memory system derived from the PowerPC RISC architecture with extensions to resemble the processor in the BlueGene/L. These results indicate that low-power modes of CPUs such as Dynamic Voltage Scaling (DVS) can indeed result in energy savings at the expense of performance degradation. We then consider the impact of certain memory subsystem optimizations to demonstrate that these optimizations in conjunction with DVS can provide faster execution time and lower energy consumption. For example, on the optimized architecture, if DVS is used to scale down the processor to 600 MHz, execution times are faster by 45% with energy reductions of 75% compared to the original architecture at 1 GHz. The insights gained from this study can help scientific applications better utilize the low-power modes of processors as well as guide the selection of hardware optimizations in future power-aware, high-performance computers.
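As a rough illustration of what the quoted figures imply, the sketch below converts the two percentages into ratios relative to the 1 GHz baseline. It assumes that "faster by 45%" means a 45% reduction in execution time, which is one reading of the abstract rather than something it states. The function name `relative_metrics` and the derived quantities (average power ratio, energy-delay product ratio) are illustrative and are not taken from the paper.

```python
def relative_metrics(time_reduction, energy_reduction):
    """Convert fractional reductions into ratios versus a baseline run.

    time_reduction and energy_reduction are fractions in [0, 1), e.g. 0.45
    for "45% less execution time" (assumed interpretation).
    """
    time_ratio = 1.0 - time_reduction        # T_opt / T_base
    energy_ratio = 1.0 - energy_reduction    # E_opt / E_base
    # Average power is energy divided by time, so the ratios divide too.
    power_ratio = energy_ratio / time_ratio
    # Energy-delay product: a common combined metric (lower is better).
    edp_ratio = energy_ratio * time_ratio
    return time_ratio, energy_ratio, power_ratio, edp_ratio


# Figures quoted in the abstract: optimized architecture at 600 MHz
# compared to the original architecture at 1 GHz.
t, e, p, edp = relative_metrics(0.45, 0.75)
print(f"time ratio:         {t:.3f}")
print(f"energy ratio:       {e:.3f}")
print(f"avg. power ratio:   {p:.3f}")
print(f"energy-delay ratio: {edp:.3f}")
```

Under that reading, the optimized configuration would draw a little under half the average power of the baseline, since energy falls faster than execution time.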


Keywords: Execution Time; Energy Attribute; Dynamic Voltage Scaling; Krylov Method; Drive Cavity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Sayaka Akioka (1)
  • Konrad Malkowski (1)
  • Padma Raghavan (1)
  • Mary Jane Irwin (1)
  • Lois Curfman McInnes (2)
  • Boyana Norris (2)

  1. The Pennsylvania State University
  2. Argonne National Laboratory
