Abstract
Nowadays, with the rapid growth of open source software (OSS), library reuse becomes more and more popular since a large amount of third- party libraries are available to download and reuse. A deeper understanding on why developers reuse a library (i.e., replacing self-implemented code with an external library) or re-implement a library (i.e., replacing an imported external library with self-implemented code) could help researchers better understand the factors that developers are concerned with when reusing code. This understanding can then be used to improve existing libraries and API recommendation tools for researchers and practitioners by using the developers concerns identified in this study as design criteria. In this work, we investigated the reasons behind library reuse and re-implementation. To achieve this goal, we first crawled data from two popular sources, F-Droid and GitHub. Then, potential instances of library reuse and re-implementation were found automatically based on certain heuristics. Next, for each instance, we further manually identified whether it is valid or not. For library re-implementation, we obtained 82 instances which are distributed in 75 repositories. We then conducted two types of surveys (i.e., individual survey to corresponding developers of the validated instances and another open survey) for library reuse and re-implementation. For library reuse individual survey, we received 36 responses out of 139 contacted developers. For re-implementation individual survey, we received 13 responses out of 71 contacted developers. In addition, we received 56 responses from the open survey. Finally, we perform qualitative and quantitative analysis on the survey responses and commit logs of the validated instances. The results suggest that library reuse occurs mainly because developers were initially unaware of the library or the library had not been introduced. Re-implementation occurs mainly because the used library method is only a small part of the library, the library dependencies are too complicated, or the library method is deprecated. Finally, based on all findings obtained from analyzing the surveys and commit messages, we provided a few suggestions to improve the current library recommendation systems: tailored recommendation according to users’ preferences, detection of external code that is similar to a part of the users’ code (to avoid duplication or re-implementation), grouping similar recommendations for developers to compare and select the one they prefer, and disrecommendation of poor-quality libraries.
Similar content being viewed by others
Notes
Nodejs, https://www.npmjs.com
Maven, https://maven.apache.org
RubyGems, https://rubygems.org
Packagist, https://packagist.org
Statistics for the Maven Repository, https://search.maven.org/stats
F-Droid, https://f-droid.org
Github, https://github.com
F-Droid, https://f-droid.org/
Github API, https://developer.github.com/v3
Replication package, https://github.com/XBWer/Why-Reinventing-the-Wheel.
Stack Overflow Survey, https://insights.stackoverflow.com/survey/2019#technology.
References
Abdalkareem R, Nourry O, Wehaibi S, Mujahid S, Shihab E (2017) Why do developers use trivial packages? an empirical case study on npm. In: 11th joint meeting on foundations of software engineering. ACM, pp 385–395
Basili VR, Briand LC, Melo WL (1996) How reuse influences productivity in object-oriented systems. Commun ACM 39(10):104–116
Blog of Jos de Jong (2017) The art of creating simple but flexible APIs. http://josdejong.com/blog/2014/10/18/the-art-of-creating-simple-but-flexible-apis/, online. Accessed 14 Nov 2017
Gao W, Chen L, Wu J, Gao H (2015) Manifold-learning based api recommendation for mashup creation. In: 22nd IEEE international conference on web services. IEEE, pp 432–439
GNU (2017) Unified diff format. http://www.gnu.org/software/diffutils/manual/html_node/Unified-Format.html, online. Accessed 14 Sept 2017
Griss ML (1993) Software reuse: From library to factory. IBM Syst J 32 (4):548–566
Gu X, Zhang H, Zhang D, Kim S (2016) Deep api learning. In: 24th international symposium on foundations of software engineering. ACM, pp 631–642
Heinemann L, Deissenboeck F, Gleirscher M, Hummel B, Irlbeck M (2011) On the extent and nature of software reuse in open source java projects. In: 13th international conference on software reuse. Springer, pp 207–222
Iivari J (1996) Why are case tools not used? Commun ACM 39(10):94–103
Kawrykow D, Robillard MP (2009) Improving api usage through automatic detection of redundant code. In: 24th international conference on automated software engineering. IEEE, pp 111–122
Kim Y, Stohr EA (1998) Software reuse: survey and research directions. J Manag Inf Syst 14(4):113– 147
Krueger CW (1992) Software reuse. ACM Comput Surv 24(2):131–183
Krutz DE, Mirakhorli M, Malachowsky SA, Ruiz A, Peterson J, Filipski A, Smith J (2015) A dataset of open-source android applications. In: 12th working conference on mining software repositories. IEEE, pp 522–525
Lethbridge TC (2000) Priorities for the education and training of software engineers. J Syst Softw 53(1): 53–71
Lv F, Zhang H, Lou JQ, Wang S, Zhang D, Zhao J (2015) Codehow: Effective code search based on api understanding and extended boolean model (e). In: 30th international conference on automated software engineering. IEEE, pp 260–270
McMillan C, Grechanik M, Poshyvanyk D, Xie Q, Fu C (2011) Portfolio: finding relevant functions and their usage. In: 33rd international conference on software engineering. ACM, pp 111–120
Mohagheghi P, Conradi R, Killi OM, Schwarz H (2004) An empirical study of software reuse vs. defect-density and stability. In: 26th international conference on software engineering. IEEE Computer Society, pp 282–292
Mojica IJ, Adams B, Nagappan M, Dienst S, Berger T, Hassan AE (2014) A large-scale empirical study on software reuse in mobile apps. IEEE Softw 31(2):78–86
Nguyen AT, Hilton M, Codoban M, Nguyen HA, Mast L, Rademacher E, Nguyen TN, Dig D (2016) Api code recommendation using statistical learning from fine-grained changes. In: 24th international symposium on foundations of software engineering. ACM, pp 511–522
Ouni A, Kula RG, Kessentini M, Ishio T, German DM, Inoue K (2017) Search-based software library recommendation using multi-objective optimization. Inf Softw Technol 83:55–75
Piccioni M, Furia CA, Meyer B (2013) An empirical study of api usability. In: 7th international symposium on empirical software engineering and measurement. IEEE, pp 5–14
PythonModule (2018) Python official documentation on modules. https://docs.python.org/2/tutorial/modules.html, online. Accessed 29 March 2018
Rahman MM, Roy CK, Lo D (2016) Rack: Automatic api recommendation using crowdsourced knowledge. In: 23rd international conference on software analysis, evolution, and reengineering, vol 1. IEEE, pp 349–359
Ruiz IJM, Nagappan M, Adams B, Hassan AE (2012) Understanding reuse in the android market. In: 20th international conference on program comprehension. IEEE, pp 113–122
Singer J, Sim SE, Lethbridge TC (2008) Software engineering data collection for field studies. In: Guide to advanced empirical software engineering. Springer, pp 9–34
Sun C, Khoo SC, Zhang SJ (2011) Graph-based detection of library api imitations. In: 27th IEEE international conference on software maintenance. IEEE, pp 183–192
Thung F (2016) Api recommendation system for software development. In: 31st international conference on automated software engineering, pp 896–899
Thung F, Lo D, Lawall J (2013a) Automated library recommendation. In: 20th working conference on reverse engineering. IEEE, pp 182–191
Thung F, Wang S, Lo D, Lawall J (2013b) Automatic recommendation of api methods from feature requests. In: 28th international conference on automated software engineering. IEEE Press, pp 290–300
Uddin G, Khomh F (2017) Automatic summarization of api reviews. In: 2017 32nd IEEE/ACM international conference on automated software engineering (ASE). IEEE, pp 159–170
Wei H, Li M (2017) Supervised deep features for software functional clone detection by exploiting lexical and syntactical information in source code. In: 26th international joint conference on artificial intelligence, pp 3034–3040
Yin RK (2002) Case study research: design and methods - Third Edition, 3rd edn. SAGE Publications
YouTube video (2004) Designing and evaluating reusable components. https://www.youtube.com/watch?v=ZQ5_u8Lgvyk, online. Accessed 29 March 2018
Zaimi A, Ampatzoglou A, Triantafyllidou N, Chatzigeorgiou A, Mavridis A, Chaikalis T, Deligiannis I, Sfetsos P, Stamelos I (2015) An empirical study on the reuse of third-party libraries in open-source software development. In: 7th Balkan conference on informatics conference. ACM, p 4
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by: Maurizio Morisio
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Bowen Xu and Le An are contributed equally.
Rights and permissions
About this article
Cite this article
Xu, B., An, L., Thung, F. et al. Why reinventing the wheels? An empirical study on library reuse and re-implementation. Empir Software Eng 25, 755–789 (2020). https://doi.org/10.1007/s10664-019-09771-0
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10664-019-09771-0