K. Davis, A. Hoisie, G. Johnson, D. J. Kerbyson, M. Lang, S. Pakin, and F. Petrini. A performance and scalability analysis of the BlueGene/L architecture. In Proc. IEEE/ACM Supercomputing, Pittsburgh, PA, 2004.
A. Hoisie, O. Lubeck, and H. Wasserman. Performance and scalability analysis of Teraflop-scale parallel architectures using multidimensional wavefront applications. Int. J. of High Performance Computing Applications, 14(4):330–346, 2000.
A. Hoisie, O. Lubeck, H. Wasserman, F. Petrini, and H. Alme. A general predictive performance model for wavefront algorithms on clusters of SMPs. In Proc. of ICPP 2000, pp. 20–25, Toronto, Canada, 2000.
G. Karypis and V. Kumar. METIS 4.0: Unstructured Graph Partitioning and Sparse Matrix Ordering System. Technical report, Department of Computer Science, University of Minnesota, 1998.
D. J. Kerbyson, H. Alme, A. Hoisie, F. Petrini, H. Wasserman, and M. Gittings. Predictive performance and scalability modeling of a large-scale application. In Proc. Supercomputing, Denver, CO, 2001.
D. J. Kerbyson, A. Hoisie, and H. J. Wasserman. Modeling the performance of large-scale systems. IEE Proceedings (Software), 150(4):214–221, 2003.
D. J. Kerbyson, A. Hoisie, and H. J. Wasserman. A performance comparison between the Earth Simulator and other terascale systems on a characteristic ASCI workload. Concurrency and Computation: Practice and Experience, 17(10):1219–1238, 2004.
D. J. Kerbyson, A. Hoisie, and H. J. Wasserman. Use of predictive performance modeling during large-scale system installation. To appear in Parallel Processing Letters, 2005.
K. R. Koch, R. S. Baker, and R. E. Alcouffe. Solution of the first-order form of the 3D discrete ordinates equation on a massively parallel processor. Transactions of the American Nuclear Society, 65:198–199, 1992.
M. M. Mathis, N. M. Amato, and M. L. Adams. A general performance model for parallel sweeps on orthogonal grids for particle transport calculations. In Proc. ACM Int. Conf. Supercomputing (ICS), pp. 255–263, Santa Fe, NM, 2000.
M. M. Mathis and D. J. Kerbyson. Performance modeling of unstructured mesh particle transport computations. In Proc. ACM/IEEE Int. Parallel and Distributed Processing Symposium (IPDPS), Santa Fe, NM, 2004.
M. M. Mathis, D. J. Kerbyson, and A. Hoisie. A performance model of non-deterministic particle transport on large-scale systems. In Proc. Int. Conf. on Computational Science (ICCS), LNCS, vol. 2659, pp. 936–945, Melbourne, Australia, 2003.
S. D. Pautz. An algorithm for parallel Sn sweeps on unstructured meshes. J. Nuclear Science and Engineering, 140:111–136, 2002.
F. Petrini, W. C. Feng, A. Hoisie, S. Coll, and E. Frachtenberg. The Quadrics Network: High-Performance Clustering Technology. IEEE Micro, 22(1):46–57, 2002.
F. Petrini, D. J. Kerbyson, and S. Pakin. The case of the missing supercomputer performance: Achieving optimal performance on the 8,192 processors of ASCI Q. In Proc. IEEE/ACM Supercomputing, Phoenix, AZ, 2003.
S. Plimpton, B. Hendrickson, S. Burns, and W. McLendon. Parallel algorithms for radiation transport on unstructured grids. In Proc. IEEE/ACM Supercomputing, Dallas, TX, 2000.
The ASCI SWEEP3D README File. Available from: www.llnl.gov/asci_benchmarks/asci/limited/sweep3d/sweep3d_readme.html
The UMT2K (UMT 1.2) README File. Available from: www.llnl.gov/asci/purple/benchmarks/limited/umt/umt1.2.readme.html
J. S. Vetter and A. Yoo. An empirical performance evaluation of scalable scientific applications. In Proc. IEEE/ACM Supercomputing, Baltimore, MD, 2002.