How the Apache community upgrades dependencies: an evolutionary study

Abstract

Software ecosystems consist of multiple software projects, often interrelated by means of dependency relations. When one project undergoes changes, other projects may decide to upgrade their dependency. For example, a project could use a new version of a component from another project because the latter has been enhanced or subject to some bug-fixing activities. In this paper we study the evolution of dependencies between projects in the Java subset of the Apache ecosystem, consisting of 147 projects, for a period of 14 years, resulting in 1,964 releases. Specifically, we investigate (i) how dependencies between projects evolve over time when the ecosystem grows, (ii) which product and process factors are likely to trigger dependency upgrades, (iii) how developers discuss the needs and risks of such upgrades, and (iv) what the likely impact of upgrades on client projects is. The study results, qualitatively confirmed by observations made by analyzing the developers’ discussions, indicate that a new release of a project triggers an upgrade when it includes major changes (e.g., new features/services) as well as a large number of bug fixes. Conversely, developers are reluctant to upgrade when some APIs are removed. The impact of upgrades is generally low, unless it is related to frameworks/libraries used in crosscutting concerns. The results of this study can support the understanding of the library/component upgrade phenomenon and provide the basis for a new family of recommenders aimed at supporting developers in the complex (and risky) activity of managing library/component upgrades within their software projects.

Notes

  1. http://www.markosproject.eu

  2. http://projects.apache.org/doap.html

  3. An example of an archive for the Ant project can be found at http://archive.apache.org/dist/ant/source

  4. http://ninka.turingmachine.org

  5. Note that we limit our analysis to developers we can detect through their activities in the versioning system, as also pointed out in Section 4.

  6. http://mail-archives.apache.org/mod_mbox

  7. https://issues.apache.org/jira/secure/Dashboard.jspa

  8. http://www.bugzilla.org

  9. http://distat.unimol.it/reports/emse-apache

  10. Remember that we consider a developer “active” if she performed at least one commit in the previous six months.

  11. http://projects.apache.org/projects/ecs.html

  12. http://uima.apache.org

  13. http://db.apache.org/derby

  14. http://logging.apache.org/chainsaw

  15. http://commons.apache.org/proper/commons-betwixt

  16. http://cocoon.apache.org

  17. http://commons.apache.org

  18. http://commons.apache.org/proper/commons-compress

  19. In a small-world network most of the nodes are not neighbors of one another, but most nodes can be reached from every other node by traversing a small number of edges (Watts and Strogatz 1998).

  20. http://tapestry.apache.org

  21. http://stanbol.apache.org

  22. http://tika.apache.org

  23. http://tinyurl.com/p3nxkyc

  24. http://mina.apache.org

  25. http://mina.apache.org/sshd-project

  26. http://tinyurl.com/nfrqfrf

  27. http://db.apache.org/torque/torque-4.0/index.html

  28. http://tinyurl.com/qjyw6u2

  29. http://roller.apache.org/

  30. http://tinyurl.com/qz254l8

  31. http://tinyurl.com/opmlu8z

  32. http://cxf.apache.org/

  33. http://accumulo.apache.org/

  34. http://tomcat.apache.org/

  35. http://sourceforge.net/

  36. The onion model is a socialization process in which newcomers join a project by first contributing through mailing list discussions and bug trackers, and then advance to more important roles in which they can improve the code and make design decisions.

  37. API breaking changes would cause an application built with an older version of the component to fail under a newer version.

References

  1. Annosi M, Di Penta M, Tortora G (2012) Managing and assessing the risk of component upgrades. In: 2012 3rd international workshop on product line approaches in software engineering (PLEASE), pp 9–12

  2. Antoniol G, Ayari K, Di Penta M, Khomh F, Guéhéneuc YG (2008) Is it a bug or an enhancement? A text-based approach to classify change requests. In: Proceedings of the 2008 conference of the centre for advanced studies on collaborative research. IBM, Richmond Hill, p 23

  3. Aranda J, Venolia G (2009) The secret life of bugs: going past the errors and omissions in software repositories. In: Proceedings of the 31st international conference on software engineering, ICSE 2009. Vancouver, pp 298–308

  4. Bavota G, Canfora G, Di Penta M, Oliveto R, Panichella S (2013) The evolution of project inter-dependencies in a software ecosystem: the case of Apache. In: 29th IEEE international conference on software maintenance (ICSM 2013). IEEE, Eindhoven, The Netherlands

  5. Bavota G, Ciemniewska A, Chulani I, De Nigro A, Di Penta M, Galletti D, Galoppini R, Gordon TF, Kedziora P, Lener I, Torelli F, Pratola R, Pukacki J, Rebahi Y, Villalonga SG (2014) The market for open source: an intelligent virtual open source marketplace. In: Proceedings of joint 18th European conference on software maintenance and reengineering / 21st working conference on reverse engineering, CSMR18/WCRE21. Antwerp, pp 399–402

  6. Bosch J (2009) From software product lines to software ecosystems. In: Proceedings of the 13th international conference on software product lines (SPLC), pp 111–119

  7. Businge J, Serebrenik A, van den Brand M (2012) Survival of eclipse third-party plug-ins. In: 28th IEEE international conference on software maintenance (ICSM 2012). IEEE Computer Society, Trento, pp 368–377

  8. Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum Associates

  9. Collard ML, Kagdi HH, Maletic JI (2003) An XML-based lightweight C++ fact extractor. In: 11th International Workshop on Program Comprehension (IWPC 2003), May 10-11, 2003, Portland. IEEE Computer Society, pp 134–143

  10. Conover WJ (1998) Practical nonparametric statistics, 3rd edn. Wiley

  11. Dagenais B, Robillard MP (2008) Recommending adaptive changes for framework evolution. In: 30th international conference on software engineering (ICSE 2008). ACM, Leipzig, pp 481–490

  12. Di Penta M, Germán DM, Guéhéneuc YG, Antoniol G (2010) An exploratory study of the evolution of software licensing. In: Proceedings of the 32nd ACM/IEEE international conference on software engineering, ICSE 2010, vol 1. ACM, Cape Town, pp 145–154

  13. Dig D, Johnson R (2006) How do APIs evolve? A story of refactoring. J Softw Maint Evol Res Pract 18:83–107

  14. Gala-Perez S, Robles G, Gonzalez-Barahona JM, Herraiz I (2013) Intensive metrics for the study of the evolution of open source projects. In: 10th IEEE working conference on mining software repositories. San Francisco

  15. German D, Adams B, Hassan AE (2013) Programming language ecosystems: the evolution of R. In: Proceedings of the 17th European conference on software maintenance and reengineering (CSMR). Genova, pp 243–252

  16. Germán DM (2003) The GNOME project: A case study of open source, global software development. Softw Process Improv Prac 8 (4):201–215

  17. German DM, Gonzalez-Barahona JM, Robles G (2007) A model to understand the building and running inter-dependencies of software. In: Proceedings of the 14th working conference on reverse engineering, WCRE ’07. IEEE Computer Society, Washington, DC, pp 140–149

  18. German DM, Manabe Y, Inoue K (2010) A sentence-matching method for automatic license identification of source code files. In: Proceedings of the IEEE/ACM international conference on automated software engineering, ASE ’10. ACM, New York

  19. Godfrey MW, Tu Q (2000) Evolution in open source software: a case study. In: Proceedings of the international conference on software maintenance (ICSM’00). IEEE Computer Society, Washington, DC, pp 131–140

  20. Goeminne M, Claes M, Mens T (2013) A historical dataset for the GNOME ecosystem. In: Proceedings of the 10th working conference on mining software repositories, MSR’13. IEEE Press, Piscataway, pp 225–228

  21. Goeminne M, Mens T (2013) Analyzing ecosystems for open source software developer communities. In: Jansen S, Brinkkemper S, Cusumano MA (eds) Software ecosystems: analyzing and managing business networks in the software industry. Edward Elgar Publishing, Incorporated, pp 301–329

  22. Gonzalez-Barahona JM, Robles G, Michlmayr M, Amor JJ, German DM (2009) Macro-level software evolution: a case study of a large software compilation. Empir Softw Eng 14 (3):262–285

  23. Grissom RJ, Kim JJ (2005) Effect sizes for research: a broad practical approach, 2nd edn. Lawrence Erlbaum Associates

  24. Hazewinkel M (2001) Kolmogorov-Smirnov test. Springer

  25. Hou D, Yao X (2011) Exploring the intent behind API evolution: a case study. In: 18th working conference on reverse engineering (WCRE’11). Limerick, pp 131–140

  26. Jansen S, Finkelstein A, Brinkkemper S (2005) A sense of community: a research agenda for software ecosystems. In: 31st international conference on software ecosystems, new and emerging research track, pp 187–190

  27. Jergensen C, Sarma A, Wagstrom P (2011) The onion patch: migration in open source ecosystems. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. ACM, New York, pp 70–80

  28. Kidane YH, Gloor PA (2007) Correlating temporal communication patterns of the eclipse open source community with performance and creativity. Comput Math Organ Theory 13 (1):17–27

  29. Kilamo T, Hammouda I, Mikkonen T, Aaltonen T (2012) From proprietary to open source: growing an open source ecosystem. J Syst Softw 85 (7):1467–1478

  30. Koch S, Schneider G (2002) Effort, cooperation and coordination in an open source software project: GNOME. Inf Syst J 12 (1):27–42

  31. Krinke J, Gold N, Jia Y, Binkley D (2010) Cloning and copying between GNOME projects. In: Whitehead J, Zimmermann T (eds) 2010 7th IEEE working conference on mining software repositories, MSR 2010. IEEE, pp 98–101

  32. Levenshtein V (1966) Binary codes capable of correcting deletions, insertions, and reversals. Sov Phys Dokl 10:707–716

  33. Lungu M, Robbes R, Lanza M (2010) Recovering inter-project dependencies in software ecosystems. In: Proceedings of ASE 2010. ACM Society Press, pp 309–312

  34. Mens T, Fernández-Ramil J, Degrandsart S (2008) The evolution of eclipse. In: 24th IEEE international conference on software maintenance (ICSM 2008), September 28 - October 4, 2008, Beijing, China. IEEE, pp 386–395

  35. Ossher J, Bajracharya SK, Lopes CV (2010) Automated dependency resolution for open source software. In: Proceedings of the 7th international working conference on mining software repositories, MSR 2010 (Co-located with ICSE). IEEE, Cape Town, pp 130–140

  36. Raemaekers S, van Deursen A, Visser J (2012) Measuring software library stability through historical version analysis. In: 28th IEEE international conference on software maintenance (ICSM’12). Trento, pp 378–387

  37. Robbes R, Lungu M, Röthlisberger D (2012) How do developers react to API deprecation? The case of a Smalltalk ecosystem. In: Proceedings of the ACM SIGSOFT 20th international symposium on the foundations of software engineering. ACM, New York, pp 56:1–56:11

  38. Scacchi W, Alspaugh TA (2012) Understanding the role of licenses and evolution in open architecture software ecosystems. J Syst Softw 85 (7):1479–1494

  39. Singh PV (2010) The small-world effect: the influence of macro-level properties of developer collaboration networks on open-source project success. ACM Trans Softw Eng Methodol 20 (2):6:1–6:27

  40. Vasilescu B, Serebrenik A, Goeminne M, Mens T (2013) On the variation and specialisation of workload: a case study of the GNOME ecosystem community. Empir Softw Eng, pp 1–54

  41. Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393 (6684):409–10

  42. Wermelinger M, Yu Y (2008) Analyzing the evolution of eclipse plugins. In: Proceedings of the 2008 international working conference on mining software repositories. ACM, New York, pp 133–136

  43. Wermelinger M, Yu Y, Lozano A, Capiluppi A (2011) Assessing architectural evolution: a case study. Empir Softw Eng 16 (5):623–666

  44. Yu L, Ramaswamy S, Bush J (2007) Software evolvability: an ecosystem point of view. IEEE Int Work Softw Evolvability 0:75–80

  45. Zar JH (1972) Significance testing of the spearman rank correlation coefficient. J Am Stat Assoc 67 (339):578–580

Acknowledgments

Gabriele Bavota, Gerardo Canfora, Massimiliano Di Penta, and Sebastiano Panichella are partially funded by the EU FP7-ICT-2011-8 project Markos, contract no. 317743. Any opinions, findings, and conclusions expressed herein are the authors’ and do not necessarily reflect those of the sponsors.

Author information

Corresponding author

Correspondence to Gabriele Bavota.

Additional information

This paper is an extended version of the paper: “Gabriele Bavota, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella: The Evolution of Project Inter-dependencies in a Software Ecosystem: The Case of Apache. 2013 IEEE International Conference on Software Maintenance, ICSM 2013, Eindhoven, The Netherlands, September 22-28, 2013: 280-289, IEEE”

Communicated by: Yann Gaël Guéhéneuc and Tom Mens

About this article

Cite this article

Bavota, G., Canfora, G., Di Penta, M. et al. How the Apache community upgrades dependencies: an evolutionary study. Empir Software Eng 20, 1275–1317 (2015). https://doi.org/10.1007/s10664-014-9325-9

Keywords

  • Software ecosystems
  • Project dependency upgrades
  • Mining software repositories