ISHPC 1999: High Performance Computing, pp. 29-40
Instruction-level microprocessor modeling of scientific applications
Abstract
Superscalar microprocessor efficiency is generally not as high as anticipated. In fact, sustained utilization below thirty percent of peak is not uncommon, even for fully optimized, cache-friendly codes. Where the cycles are lost is the topic of much research. In this paper we attempt to model architectural effects on processor utilization with and without memory influence. By presenting analytical formulas that use measurements from “on-chip” performance counters, we provide a novel technique for modeling state-of-the-art microprocessors on ASCI-representative scientific applications. ASCI is the Accelerated Strategic Computing Initiative sponsored by the US Department of Energy. We derive formulas for calculating a lower bound for CPI0 (CPI without memory effects), and we quantify the utilization of architectural parameters. These equations are architecturally diagnostic and qualitatively predictive in nature. The results show promise for code characterization and for empirical/analytical modeling.
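The quantities named in the abstract can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not state its formulas, so the code uses the textbook decomposition CPI = CPI0 + CPI_mem and assumes that memory stall cycles are available directly as a counter value. The function names, parameter names, and sample numbers are hypothetical.

```python
# Illustrative sketch only. Assumes the textbook split CPI = CPI0 + CPI_mem
# and a directly measured memory-stall cycle count; the paper instead derives
# a lower bound for CPI0 from on-chip counters, which is not reproduced here.

def utilization(instructions: int, cycles: int, peak_ipc: float) -> float:
    """Sustained fraction of the peak issue rate (peak_ipc instructions/cycle)."""
    ipc = instructions / cycles
    return ipc / peak_ipc


def split_cpi(instructions: int, cycles: int, memory_stall_cycles: int):
    """Return (CPI, CPI0, CPI_mem), where CPI0 excludes memory stall cycles."""
    cpi = cycles / instructions
    cpi_mem = memory_stall_cycles / instructions
    cpi0 = cpi - cpi_mem
    return cpi, cpi0, cpi_mem


if __name__ == "__main__":
    # Hypothetical counter readings for a 4-issue processor.
    instr, cyc, mem_stalls = 1_000_000, 1_250_000, 500_000
    print("utilization:", utilization(instr, cyc, peak_ipc=4.0))
    print("CPI, CPI0, CPI_mem:", split_cpi(instr, cyc, mem_stalls))
```

With these hypothetical readings the processor sustains 0.8 instructions per cycle on a machine that could issue 4, which is the kind of sub-thirty-percent utilization the abstract describes.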
Keywords
Queue Length, Memory Instruction, Instruction Cache, Performance Counter, Positive Growth Rate