A comparative study of application-level caching recommendations at the method level

Empirical Software Engineering

Abstract

Performance and scalability requirements have a fundamental role in most large-scale software applications. To satisfy such requirements, caching is often used at various levels and infrastructure layers. Application-level caching (or memoization) is an increasingly used form of caching within the application boundaries, which consists of storing the results of computations in memory to avoid re-computing them. This is typically done manually by developers, who identify caching opportunities in the code and write additional code to manage the cache content. Identifying caching opportunities is challenging because it requires analysing workloads and code locations where it is feasible and beneficial to cache objects. To aid developers in this task, there are approaches that automatically identify cacheable methods. Although such approaches have been individually evaluated, their effectiveness has not been compared. In this paper, we therefore present an empirical evaluation that compares the method recommendations made by the two existing application-level caching approaches at the method level, namely APLCache and MemoizeIt, using seven open-source web applications. We analyse the recommendations made by each approach as well as the hits, misses and throughput achieved with their valid caching recommendations. Our results show that the effectiveness of both approaches largely depends on the specific application, the presence of invalid recommendations and additional configurations, such as the time-to-live. By inspecting the obtained results, we observed in which cases the recommendations of each approach fail and succeed, which allowed us to derive a set of seven lessons learned that give directions for future approaches to support developers in the adoption of this type of caching.
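To make the notions of method-level caching, hits, misses and time-to-live concrete, the following is a minimal Java sketch of a memoizing wrapper around a single method. It is an illustration only and not the cache implementation used in the study: the class name TtlMemoizer, its constructor parameters and the injected clock are all assumptions introduced here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical illustration of method-level caching with a time-to-live. */
final class TtlMemoizer<K, V> {
    /** A cached result together with the time at which it was stored. */
    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) {
            this.value = value;
            this.storedAt = storedAt;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Function<K, V> method;   // the (expensive) method being cached
    private final long ttl;                // time-to-live, in the clock's unit
    private final LongSupplier clock;      // e.g. System::nanoTime; injectable for tests
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    TtlMemoizer(Function<K, V> method, long ttl, LongSupplier clock) {
        this.method = method;
        this.ttl = ttl;
        this.clock = clock;
    }

    /** Returns the cached result if it is still fresh; otherwise recomputes and stores it. */
    V get(K input) {
        long now = clock.getAsLong();
        Entry<V> entry = store.get(input);
        if (entry != null && now - entry.storedAt < ttl) {
            hits.incrementAndGet();
            return entry.value;
        }
        misses.incrementAndGet();
        V value = method.apply(input);
        store.put(input, new Entry<>(value, now));
        return value;
    }

    long hits() { return hits.get(); }
    long misses() { return misses.get(); }
}
```

In this sketch an expired entry counts as a miss and is recomputed, which is one reason why the chosen time-to-live can influence the observed hits and misses.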

Notes

  1. Note that if new application-level caching approaches are proposed to recommend methods to cache, the same procedure can be used. The only requirement is to use the recommendations made by the new approach. We provide details on how to reproduce the study in Appendix 1 for those who would like to use our infrastructure to conduct similar studies.

  2. The source code of our study is available for reproducibility at http://inf.ufrgs.br/prosoft/resources/2021/emse-apl-caching-comparison.

  3. http://www.tpc.org/tpcw

  4. https://www.eclipse.org/aspectj

  5. https://github.com/stleary/JSON-java

  6. https://github.com/gousiosg/java-callgraph

  7. https://github.com/google/gson

  8. https://github.com/ben-manes/caffeine

  9. https://spring.io

  10. https://redis.io

  11. https://www.ehcache.org

  12. https://memcached.org

  13. https://github.com/ben-manes/caffeine

  14. http://inf.ufrgs.br/prosoft/resources/2021/emse-apl-caching-comparison

  15. Note that deciding the validity of the recommendation is not possible in these cases without discussing the requirements of each application with involved stakeholders. We considered these valid recommendations.

  16. We remind the reader that additional charts to further inspect the results are available at http://inf.ufrgs.br/prosoft/resources/2021/emse-apl-caching-comparison.

Acknowledgements

The authors would like to thank CNPq for grants ref. 131271/2018-0, ref. 313357/2018-8, and ref. 428157/2018-1. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Author information

Correspondence to Rômulo Meloca.

Additional information

Communicated by: Tse-Hsun (Peter) Chen, Cor-Paul Bezemer, André van Hoorn, Catia Trubiani, and Weiyi Shang

Appendix: Reproducibility

To reproduce the experiment or change its parameters, two computers with sufficient resources are required, both running a Linux-based system with Java 11+, Maven, Docker CE and docker-compose installed. Our repository provides a configuration script that helps install these dependencies. The script also automates cloning the repositories of each application version into the proper folder, together with the tools that we implemented and rely on. The experiment can also be run in standalone mode (with a single computer). However, this may not generate reliable results, because the requester application would compete for resources with the running application. Finally, our tools must be compiled and installed through Maven, which is done automatically by a script named compile.sh.

In the first phase of the experiment, the script trace.sh automatically detects all the applications (including new ones) within the folder that holds the NOCACHE version of the applications. The hostname of the server machine must be provided. We then check whether the traces of the application were already collected in the outputs folder. If they were not, we send a signal to the server machine (which may be the same machine) to bring the application up via a Docker container. The server machine is expected to be listening for requests with our Java tool called RemoteExecutor, and to have all the aforementioned configuration in place. New applications are expected to have a Dockerfile (as we provided for our target applications) that includes the steps required to compile and run them, as well as a docker-compose file describing their dependencies, such as a database and its initialisation SQL. We also look for the files whitelist, blacklist and ignored in order to configure which Java packages should be serialised by our JSONSerialiser tool for the traces. In addition, the dependency on our ApplicationTracer must be included in the pom or gradle file, so that we can automatically inject code into every method of new applications. We then wait for the application to start and, once it has started, begin firing requests. For new applications, this step requires a JSON file describing the workload graph, as we provided for our applications in the workloads folder. When tracing is finished, we automatically tear the application down and clean up the files and data it created.
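The role of the whitelist and blacklist files can be pictured with the following Java sketch of a package filter. It is not the JSONSerialiser implementation: the class PackageFilter, the prefix-matching rule, the precedence of the blacklist and the treatment of an empty whitelist are all assumptions made for illustration, and the ignored file is not modelled because its semantics are not described here.

```java
import java.util.List;

/** Hypothetical filter; the real whitelist/blacklist semantics may differ. */
final class PackageFilter {
    private final List<String> whitelist;
    private final List<String> blacklist;

    PackageFilter(List<String> whitelist, List<String> blacklist) {
        this.whitelist = List.copyOf(whitelist);
        this.blacklist = List.copyOf(blacklist);
    }

    /** True when objects of the given class should be serialised into the trace. */
    boolean shouldSerialise(String className) {
        // Assumption: a blacklisted package is excluded even if it is also whitelisted.
        if (matchesAny(blacklist, className)) {
            return false;
        }
        // Assumption: an empty whitelist places no restriction.
        return whitelist.isEmpty() || matchesAny(whitelist, className);
    }

    /** Matches whole package segments, so "com.shop" does not match "com.shopping". */
    private static boolean matchesAny(List<String> packages, String className) {
        for (String pkg : packages) {
            if (className.equals(pkg) || className.startsWith(pkg + ".")) {
                return true;
            }
        }
        return false;
    }
}
```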

To obtain method recommendations based on the collected traces, it is only necessary to execute the approaches using the trace file generated in the first phase as input. Manual analysis of the recommendations, as well as manual implementation of the caching with our Cache component, is required. For APL in particular, the generated files containing the recommended inputs (which are already included in our repository) must be in the outputs folder. It is also important, for APL, that the serialisation parameters adopted in the first phase are also used in the second phase, so that the inputs can match. For new applications, the Maven/Gradle dependency on our Cache component must be added. New approaches must be able to read and process the trace file generated in the first phase in order to give their recommendations.
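The need for identical serialisation parameters in both phases can be illustrated with a small Java sketch in which a method result is cached only when the serialised input appears among the recommended inputs. This is an assumed mechanism for illustration, not the actual Cache component: the class RecommendedInputCache and its method names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache that stores results only for recommended (serialised) inputs. */
final class RecommendedInputCache<A, R> {
    private final Set<String> recommendedInputs;   // serialised inputs taken from the recommendation files
    private final Function<A, String> serialiser;  // must match the serialisation used while tracing
    private final Map<String, R> cache = new ConcurrentHashMap<>();

    RecommendedInputCache(Set<String> recommendedInputs, Function<A, String> serialiser) {
        this.recommendedInputs = Set.copyOf(recommendedInputs);
        this.serialiser = serialiser;
    }

    /** Invokes the method, reusing a stored result when the input was recommended. */
    R call(A argument, Function<A, R> method) {
        String key = serialiser.apply(argument);
        if (!recommendedInputs.contains(key)) {
            return method.apply(argument);  // not recommended: always recompute
        }
        return cache.computeIfAbsent(key, k -> method.apply(argument));
    }
}
```

If the serialiser used at run time produced a different string for the same object (for example, a different field order or depth), the lookup in the recommended set would fail and the input would never be cached.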

In the second phase of the experiment, our script named run.sh automatically detects all the applications and starts firing the requests. The hostname of the server machine must be provided. If the workloads have not yet been generated, we generate them automatically. We then sample all ten executions for every group of simulated users, collecting our metrics on disk. Executions that already succeeded are skipped. In this phase, each application version is expected to be in the proper folder and to include the files, configurations and caching mentioned above.
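The resumable behaviour of this phase (repeated executions per group of simulated users, skipping those that already succeeded) could be sketched in Java as follows. This is not the logic of run.sh: the class RunPlanner, the folder layout and the SUCCESS marker file are hypothetical names chosen for illustration, and the user-group sizes are left as a parameter.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical planner; the folder layout and the SUCCESS marker are assumptions. */
final class RunPlanner {
    /** Lists the execution folders that still have to be run. */
    static List<Path> pending(Path outputs, List<String> applications,
                              List<Integer> userGroups, int executions) {
        List<Path> todo = new ArrayList<>();
        for (String application : applications) {
            for (int users : userGroups) {
                for (int run = 1; run <= executions; run++) {
                    Path dir = outputs.resolve(application)
                                      .resolve("users-" + users)
                                      .resolve("run-" + run);
                    // An execution is skipped only if it left a success marker behind.
                    if (!Files.exists(dir.resolve("SUCCESS"))) {
                        todo.add(dir);
                    }
                }
            }
        }
        return todo;
    }
}
```

Calling pending with executions set to 10 corresponds to the ten executions per group described above; rerunning after a failure would then only repeat the executions without a marker.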

To aggregate the results produced in the outputs folder, the script reduce.sh initialises the CSV files and executes our tools that calculate our metrics. The hostname of the server machine must be provided. To generate visualisations for the aggregated CSV files, the script plot.sh under the analysis folder may be invoked.
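As an indication of the kind of aggregation involved, the Java sketch below computes a hit ratio and a mean throughput over several executions. It is not the tooling invoked by reduce.sh: the class MetricsReducer, the two-column row format ("requests,seconds") and the definition of throughput as requests per second are assumptions made for this example.

```java
import java.util.List;

/** Hypothetical aggregation helpers; the CSV layout is an assumption. */
final class MetricsReducer {
    /** Hit ratio = hits / (hits + misses); 0 when the cache was never consulted. */
    static double hitRatio(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    /** Throughput of one execution, in requests per second. */
    static double throughput(long requests, double seconds) {
        return requests / seconds;
    }

    /** Mean throughput over rows of the assumed form "requests,seconds". */
    static double meanThroughput(List<String> csvRows) {
        if (csvRows.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (String row : csvRows) {
            String[] columns = row.split(",");
            long requests = Long.parseLong(columns[0].trim());
            double seconds = Double.parseDouble(columns[1].trim());
            sum += throughput(requests, seconds);
        }
        return sum / csvRows.size();
    }
}
```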

Cite this article

Meloca, R., Nunes, I. A comparative study of application-level caching recommendations at the method level. Empir Software Eng 27, 88 (2022). https://doi.org/10.1007/s10664-021-10089-z

  • DOI: https://doi.org/10.1007/s10664-021-10089-z
