
A General Performance Model of Structured and Unstructured Mesh Particle Transport Computations


The performance of unstructured mesh applications presents a number of complexities and subtleties that do not arise for dense structured meshes. From a programming point of view, handling unstructured meshes is more complex because of the data structures and mesh-cell interactions that must be managed. From a performance point of view, it is harder to understand both the processing time on a single processor and the scaling characteristics on large-scale parallel systems. In this work we present a general performance model for deterministic SN transport calculations on unstructured meshes that is also applicable to structured meshes. The model captures the key processing characteristics of the calculation and is parameterized by both system performance data (latency, bandwidth, processing rate, etc.) and application data (mesh size, etc.). A single formulation of the model is used to predict the performance of two quite different implementations of the same calculation. It is validated on two clusters (an HP AlphaServer and an Itanium-2 system), showing high prediction accuracy.
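To make the idea of a parametric model concrete, the following is a minimal sketch of how system inputs (latency, bandwidth, processing rate) and application inputs (mesh size, message counts) might combine into a predicted time per step. It is not the model presented in the article: the class names, field names, and the simple "compute plus non-overlapped communication" form are all assumptions made for illustration, and the sketch ignores the pipelining and load-balance effects that a real sweep model would need to capture.

```python
from dataclasses import dataclass


@dataclass
class SystemParams:
    """Hypothetical system inputs (names are illustrative assumptions)."""
    latency_s: float               # per-message latency, in seconds
    bandwidth_bytes_per_s: float   # sustained point-to-point bandwidth
    cells_per_s: float             # processing rate, in cell updates per second


@dataclass
class AppParams:
    """Hypothetical application inputs (names are illustrative assumptions)."""
    cells_per_proc: int      # mesh cells assigned to one processor
    messages_per_step: int   # messages sent by one processor per step
    bytes_per_message: int   # payload size of each message


def message_time(system: SystemParams, nbytes: int) -> float:
    """Simple latency + size/bandwidth cost of one message."""
    return system.latency_s + nbytes / system.bandwidth_bytes_per_s


def predicted_step_time(system: SystemParams, app: AppParams) -> float:
    """Assumed form: compute time plus non-overlapped communication time."""
    compute = app.cells_per_proc / system.cells_per_s
    comm = app.messages_per_step * message_time(system, app.bytes_per_message)
    return compute + comm


if __name__ == "__main__":
    system = SystemParams(latency_s=5e-6,
                          bandwidth_bytes_per_s=1e9,
                          cells_per_s=1e6)
    app = AppParams(cells_per_proc=1000,
                    messages_per_step=4,
                    bytes_per_message=8000)
    print(f"predicted step time: {predicted_step_time(system, app):.6e} s")
```

Keeping the system and application parameters in separate structures mirrors the split described in the abstract: the same application description can be evaluated against measurements from different machines by swapping only the system inputs.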




Author information

Corresponding author

Correspondence to Mark M. Mathis.


About this article

Cite this article

Mathis, M.M., Kerbyson, D.J. A General Performance Model of Structured and Unstructured Mesh Particle Transport Computations. J Supercomput 34, 181–199 (2005).


Keywords

  • performance modeling
  • performance analysis
  • high performance computing
  • SN transport
  • unstructured meshes
  • parallel processing
  • large-scale systems