Assessing and optimizing the performance impact of the just-in-time configuration parameters - a case study on PyPy

Empirical Software Engineering

Abstract

Many modern programming languages (e.g., Python, Java, and JavaScript) support just-in-time (JIT) compilation to speed up the execution of a software system. At runtime, the JIT compiler translates the frequently executed parts of the system into efficient machine code, which runs much faster than the default interpreted mode. There are many JIT configuration parameters, which vary with the programming language and the JIT strategy (method-based vs. trace-based). Although many existing works aim to improve various aspects of the JIT compilation process, very few study the performance impact of the JIT configuration settings. In this paper, we performed an empirical study on the performance impact of the JIT configuration settings of PyPy, a popular implementation of the Python programming language. Thanks to PyPy's efficient JIT compiler, running Python programs under PyPy is usually much faster than under alternative implementations of Python (e.g., CPython, Jython, and IronPython). To motivate the need for tuning PyPy's JIT configuration settings, we first performed an exploratory study on two microbenchmark suites. Our findings show that systems executed under PyPy's default JIT configuration setting may not yield the best performance: optimal JIT configuration settings vary from system to system, and having a larger portion of the code JIT-compiled does not necessarily lead to better performance. Motivated by these findings, we developed ESM-MOGA, an automated approach to tuning the JIT configuration settings. ESM-MOGA, which stands for effect-size-measure-based multi-objective genetic algorithm, automatically explores PyPy's JIT configuration settings for optimal solutions. Case studies on three open source systems show that systems running under the resulting configuration settings significantly outperform the default configuration settings (5% - 60% improvement in average peak performance).
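
To make the tuning knobs concrete, the sketch below (ours, not the paper's) shows how PyPy exposes its JIT configuration parameters through the --jit command-line option and how a workload could be timed under a few candidate settings. The parameter names (threshold, function_threshold, trace_limit) are taken from `pypy --jit help`; the specific values, the benchmark.py workload, and the brute-force comparison loop are illustrative assumptions and are not the ESM-MOGA procedure evaluated in the paper.

    #!/usr/bin/env python
    # Illustrative sketch (not the paper's ESM-MOGA): time a workload under a few
    # candidate PyPy JIT configuration settings passed via the --jit option.
    import statistics
    import subprocess
    import time

    # A few of PyPy's tunable JIT parameters (run `pypy --jit help` for the full
    # list).  The concrete values below are arbitrary examples; the middle entry
    # mirrors the defaults reported by `pypy --jit help`, which may differ
    # across PyPy versions.
    CANDIDATE_SETTINGS = [
        "off",                                     # disable the JIT entirely
        "threshold=1039,function_threshold=1619",  # default-like setting
        "threshold=200,trace_limit=10000",         # jit code earlier, allow longer traces
    ]

    WORKLOAD = ["benchmark.py"]   # hypothetical benchmark script (an assumption)
    REPETITIONS = 5               # repeat runs to smooth out measurement noise


    def run_once(jit_setting):
        """Run the workload once under the given --jit setting; return seconds."""
        start = time.time()
        subprocess.check_call(["pypy", "--jit", jit_setting] + WORKLOAD)
        return time.time() - start


    def main():
        # Note: end-to-end process time includes interpreter start-up and JIT
        # warm-up, so it is only a rough proxy for the peak performance
        # measured in the paper.
        for setting in CANDIDATE_SETTINGS:
            timings = [run_once(setting) for _ in range(REPETITIONS)]
            print("--jit %-42s mean=%.2fs stdev=%.2fs"
                  % (setting, statistics.mean(timings), statistics.stdev(timings)))


    if __name__ == "__main__":
        main()

ESM-MOGA replaces the exhaustive comparison above with a multi-objective genetic algorithm that searches the much larger configuration space, guided by effect-size measures.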

Notes

  1. To ease explanation, we will call this the TechEmpower benchmark in the rest of this paper.

Author information

Corresponding author

Correspondence to Yangguang Li.

Additional information

Communicated by: Sven Apel

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Y., Jiang, Z.M. Assessing and optimizing the performance impact of the just-in-time configuration parameters - a case study on PyPy. Empir Software Eng 24, 2323–2363 (2019). https://doi.org/10.1007/s10664-019-09691-z
