Abstract
Software systems typically exploit only a small fraction of the performance realizable from the underlying microprocessors. Although there has been much work on hardware-aware optimizations, two factors limit their benefit. First, microprocessors are so complex that even an aggressively optimizing compiler is unlikely to satisfy all the constraints necessary to obtain the best performance. Most optimizations therefore use a simplified model of the hardware (e.g., they may be cache-aware but ignore other hardware structures, such as TLBs). Second, hardware manufacturers do not reveal all the details of their microprocessors, so even authors who wanted to optimize simultaneously for every hardware component may be unable to do so because they are working with limited knowledge. This paper presents and evaluates our blind optimization approach, which provides a way around these issues.
Blind optimization rests on the insight that we can generate many variants of an application by altering its semantics-preserving parameters; for example, our variants can cover the space of code and data layouts by shifting the positions of code and data in memory. Our optimization strategy attempts to find a variant that performs well with respect to an optimization objective. We show that even our first implementation of blind optimization speeds up a number of programs from the SPECint 2006 benchmark suite.
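The idea of searching over semantics-preserving variants can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the search strategy, so plain random sampling is swapped in here, and every name (`make_variants`, `blind_search`, `toy_measure`, the `code_shift` and `data_shift` parameters) is hypothetical. The cost function is a deterministic stand-in for building a variant and timing it on real hardware.

```python
import random
from typing import Callable, Dict, List, Tuple

# A variant is described only by layout parameters that do not change
# program semantics (hypothetical names, for illustration).
Variant = Dict[str, int]


def make_variants(n: int, max_shift: int, align: int, seed: int = 0) -> List[Variant]:
    """Return the unshifted baseline plus n randomly shifted layouts."""
    rng = random.Random(seed)
    variants: List[Variant] = [{"code_shift": 0, "data_shift": 0}]  # baseline
    for _ in range(n):
        variants.append({
            # Shifts are multiples of `align` bytes in [0, max_shift).
            "code_shift": rng.randrange(0, max_shift, align),
            "data_shift": rng.randrange(0, max_shift, align),
        })
    return variants


def blind_search(variants: List[Variant],
                 measure: Callable[[Variant], float]) -> Tuple[Variant, float]:
    """Measure every variant once and return the cheapest one with its cost.

    The search is 'blind': it treats the hardware as a black box and relies
    only on the measured objective, not on a model of caches, TLBs, etc.
    """
    scored = [(measure(v), i) for i, v in enumerate(variants)]
    cost, idx = min(scored)
    return variants[idx], cost


def toy_measure(v: Variant) -> float:
    """Assumed stand-in for 'build the variant and time it'.

    A real objective would run the program (e.g., and read a cycle counter);
    this arbitrary formula just makes cost depend on layout in an opaque way.
    """
    return 1000.0 + (v["code_shift"] * 7 + v["data_shift"] * 13 + 100) % 257


if __name__ == "__main__":
    candidates = make_variants(n=64, max_shift=4096, align=16, seed=1)
    best, cost = blind_search(candidates, toy_measure)
    print("baseline cost:", toy_measure(candidates[0]))
    print("best variant:", best, "cost:", cost)
```

Because the baseline layout is always among the candidates, the selected variant can never score worse than the baseline under the measured objective. In practice measurements are noisy, so a real harness would presumably repeat each measurement before comparing variants.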
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Knights, D., Mytkowicz, T., Sweeney, P.F., Mozer, M.C., Diwan, A. (2009). Blind Optimization for Exploiting Hardware Features. In: de Moor, O., Schwartzbach, M.I. (eds) Compiler Construction. CC 2009. Lecture Notes in Computer Science, vol 5501. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00722-4_18
DOI: https://doi.org/10.1007/978-3-642-00722-4_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00721-7
Online ISBN: 978-3-642-00722-4
eBook Packages: Computer Science, Computer Science (R0)