Abstract
Complex software systems use many shared libraries, which are frequently composed of large off-the-shelf components, yet only a limited number of functions from these libraries are actually used. Historically, demand paging prevented this from wasting large amounts of memory. Many high-end systems, however, lack virtual memory and thus must load the entire shared library into each node’s memory. In this paper, we propose a system that decreases the memory footprint of applications by selectively loading only the used portions of shared libraries. After profiling executables and shared libraries, our system rewrites all target shared libraries with a new function ordering and updated ELF program headers, so that the loader loads only those functions that are likely to be used by a given application. The system also includes a fallback user-level paging mechanism to recover in case of failures in our analysis. We present a case study showing that our system achieves a reduction of more than 80% in the number of pages loaded for several HPC applications, while causing no performance overhead for reasonably long-running programs.
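The effect of reordering functions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, sizes, and the 4096-byte page size are hypothetical, and the sketch models only the page arithmetic, not the ELF rewriting or the fallback paging. It assumes that without virtual memory every page of the library is loaded, and that after reordering only the contiguous prefix holding the profiled-used functions is loaded.

```python
from typing import Dict, List, Set

PAGE_SIZE = 4096  # assumed page size in bytes; not stated in the abstract


def reorder_functions(sizes: Dict[str, int], used: Set[str]) -> List[str]:
    """Return a layout with profiled-used functions first.

    The original relative order is kept inside each group.
    """
    hot = [name for name in sizes if name in used]
    cold = [name for name in sizes if name not in used]
    return hot + cold


def pages_for(num_bytes: int) -> int:
    """Number of whole pages needed to hold num_bytes (ceiling division)."""
    return -(-num_bytes // PAGE_SIZE)


def pages_loaded(sizes: Dict[str, int], used: Set[str], selective: bool) -> int:
    """Pages loaded for a library, with or without selective loading."""
    if selective:
        # Only the prefix containing the used functions is loaded.
        hot_bytes = sum(size for name, size in sizes.items() if name in used)
        return pages_for(hot_bytes)
    # Without virtual memory, the whole library is loaded.
    return pages_for(sum(sizes.values()))


if __name__ == "__main__":
    # Hypothetical library: function name -> code size in bytes.
    sizes = {"a": 3000, "b": 5000, "c": 3000, "d": 5000, "e": 3000}
    used = {"a", "c", "e"}  # hypothetical profile result

    print(reorder_functions(sizes, used))          # ['a', 'c', 'e', 'b', 'd']
    print(pages_loaded(sizes, used, False))        # 5
    print(pages_loaded(sizes, used, True))         # 3
```

In this toy layout the used functions are scattered across all five pages of the original library, so grouping them at the front is what allows the loaded region to shrink to three pages. A function missing from the profile would fall outside the loaded prefix, which is the case the fallback paging described above is meant to handle.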
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ince, T., Hollingsworth, J.K. (2010). Profile-Driven Selective Program Loading. In: D’Ambra, P., Guarracino, M., Talia, D. (eds) Euro-Par 2010 - Parallel Processing. Euro-Par 2010. Lecture Notes in Computer Science, vol 6271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15277-1_7
DOI: https://doi.org/10.1007/978-3-642-15277-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15276-4
Online ISBN: 978-3-642-15277-1
eBook Packages: Computer Science, Computer Science (R0)