Behavioral Emulation for Scalable Design-Space Exploration of Algorithms and Architectures

  • Nalini Kumar
  • Carlo Pascoe
  • Christopher Hajas
  • Herman Lam
  • Greg Stitt
  • Alan George
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9945)


This paper presents a simulation methodology called Behavioral Emulation (BE) for scalable design-space exploration of algorithms and architectures. By design, BE is independent of the simulation vehicle (e.g., simulation in software or emulation in hardware) and addresses system-simulation complexity with a coarse-grained, multi-scale approach. We describe the BE methodology, component models, and simulation workflow, from calibration to validation of applications simulated on existing architectures, and present a device-level case study with roughly 10% relative error. Finally, we discuss the extension of validated models to predict application performance on notional architectures.


Behavioral Emulation, Performance modeling, Coarse-grained simulation, Design space exploration



This work is supported by the U.S. Department of Energy, National Nuclear Security Administration, Advanced Simulation and Computing Program, as a Cooperative Agreement under the Predictive Science Academic Alliance Program, under Contract No. DE-NA0002378. This work was supported in part by the I/UCRC Program of the National Science Foundation under Grant Nos. EEC-0642422 and IIP-1161022.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Nalini Kumar (1)
  • Carlo Pascoe (1)
  • Christopher Hajas (1)
  • Herman Lam (1)
  • Greg Stitt (1)
  • Alan George (1)
  1. Department of ECE, PSAAP II Center for Compressible Multiphase Turbulence, NSF Center for High-Performance Reconfigurable Computing, University of Florida, Gainesville, USA
