Analyzing Advanced PDE Solvers Through Simulation

  • Henrik Johansson
  • Dan Wallin
  • Sverker Holmgren
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3732)

Abstract

By simulating a real computer, it is possible to gain detailed knowledge of the cache memory utilization of an application, e.g., a partial differential equation (PDE) solver. Using this knowledge, we can discover regions with intricate cache memory behavior and identify performance bottlenecks.

In this paper, we employ full system simulation of a shared memory computer to perform a case study of three different PDE solver kernels with respect to cache memory performance. The kernels implement state-of-the-art solution algorithms for complex application problems and the simulations are performed for data sets of realistic size. We discovered interesting properties in the solvers, which can help us to improve their performance in the future.
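As a rough illustration of the kind of information a cache simulation exposes, the sketch below feeds two memory access patterns through a toy direct-mapped cache model and counts hits and misses. It is not the full system simulator used in the paper; the function names, the cache geometry (1 KiB, 64-byte lines), and the array sweep are all assumptions chosen only for illustration.

```python
def simulate_direct_mapped(addresses, cache_bytes=1024, line_bytes=64):
    """Return (hits, misses) for a toy direct-mapped cache fed a byte-address trace."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # block number currently held by each cache line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_bytes     # which memory block the address belongs to
        index = block % num_lines      # the single cache line that block may occupy
        if tags[index] == block:
            hits += 1
        else:
            misses += 1
            tags[index] = block        # evict whatever was there and load the block
    return hits, misses


def sweep_trace(n, row_major=True, elem_bytes=8):
    """Yield byte addresses touched when sweeping an n-by-n array stored row by row."""
    for a in range(n):
        for b in range(n):
            i, j = (a, b) if row_major else (b, a)
            yield (i * n + j) * elem_bytes


if __name__ == "__main__":
    for label, row_major in (("row-wise sweep", True), ("column-wise sweep", False)):
        hits, misses = simulate_direct_mapped(sweep_trace(64, row_major))
        print(f"{label}: {hits} hits, {misses} misses")
```

In this toy setting the row-wise sweep misses once per cache line, while the column-wise sweep misses on every access, which is the sort of access-pattern effect a detailed simulation can make visible.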

Keywords

Cache Size; Cache Line; Cache Memory; Memory Access Pattern; Level Cache
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Henrik Johansson (1)
  • Dan Wallin (1)
  • Sverker Holmgren (1)
  1. Department of Information Technology, Uppsala University, Uppsala, Sweden
