Why reinventing the wheels? An empirical study on library reuse and re-implementation

Xu, Bowen; An, Le; Thung, Ferdian; Khomh, Foutse; Lo, David

doi:10.1007/s10664-019-09771-0

Published: 05 September 2019

Volume 25, pages 755–789, (2020)

Empirical Software Engineering

Bowen Xu¹,
Le An ORCID: orcid.org/0000-0003-1246-864X²,
Ferdian Thung¹,
Foutse Khomh² &
…
David Lo¹

Abstract

Nowadays, with the rapid growth of open source software (OSS), library reuse has become increasingly popular, since a large number of third-party libraries are available to download and reuse. A deeper understanding of why developers reuse a library (i.e., replace self-implemented code with an external library) or re-implement a library (i.e., replace an imported external library with self-implemented code) could help researchers better understand the factors that concern developers when they reuse code. This understanding can then be used to improve existing libraries and API recommendation tools for researchers and practitioners, with the developers’ concerns identified in this study serving as design criteria. In this work, we investigated the reasons behind library reuse and re-implementation. To achieve this goal, we first crawled data from two popular sources, F-Droid and GitHub. Then, potential instances of library reuse and re-implementation were found automatically based on certain heuristics. Next, we manually checked whether each instance was valid. For library re-implementation, we obtained 82 instances distributed across 75 repositories. We then conducted two types of surveys on library reuse and re-implementation: individual surveys sent to the developers of the validated instances, and an open survey. For the library reuse individual survey, we received 36 responses from 139 contacted developers. For the re-implementation individual survey, we received 13 responses from 71 contacted developers. In addition, we received 56 responses to the open survey. Finally, we performed qualitative and quantitative analyses of the survey responses and the commit logs of the validated instances. The results suggest that library reuse occurs mainly because developers were initially unaware of the library or the library had not yet been introduced. Re-implementation occurs mainly because the library method used is only a small part of the library, the library dependencies are too complicated, or the library method is deprecated. Finally, based on the findings from the surveys and commit messages, we provide several suggestions for improving current library recommendation systems: tailoring recommendations to users’ preferences, detecting external code that is similar to part of the users’ code (to avoid duplication or re-implementation), grouping similar recommendations so that developers can compare them and select the one they prefer, and disrecommending poor-quality libraries.

Notes

Nodejs, https://www.npmjs.com
Maven, https://maven.apache.org
RubyGems, https://rubygems.org
Packagist, https://packagist.org
PyPI, https://pypi.python.org/pypi
Statistics for the Maven Repository, https://search.maven.org/stats
F-Droid, https://f-droid.org
GitHub, https://github.com
F-Droid, https://f-droid.org/
GitHub API, https://developer.github.com/v3
https://github.com/geometer/FBReaderJ/commit/9d0ca05
https://github.com/geometer/FBReaderJ/commit/9d0ca05#diff-111c3f193c58d04aed7c19db835db11b
https://github.com/grzegorznittner/chanu/commit/5159070#diff-015d116ababf2863b74874b6ba078cfeR365
https://developer.android.com/reference/android/text/Html.html
https://pypi.python.org/pypi/isoparser
https://github.com/erincandescent/Impeller/commit/5d1e7e8
https://github.com/koush/UrlImageViewHelper
https://www.reddit.com/r/Python
https://www.reddit.com/r/Android
https://www.reddit.com/r/developer
https://news.ycombinator.com
https://android-arsenal.com
Replication package, https://github.com/XBWer/Why-Reinventing-the-Wheel
Stack Overflow Survey, https://insights.stackoverflow.com/survey/2019#technology

Author information

Authors and Affiliations

Singapore Management University, Singapore, Singapore
Bowen Xu, Ferdian Thung & David Lo
Polytechnique Montreal, Montreal, Canada
Le An & Foutse Khomh

Corresponding author

Correspondence to Bowen Xu.

Additional information

Communicated by: Maurizio Morisio

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Bowen Xu and Le An contributed equally.

About this article

Cite this article

Xu, B., An, L., Thung, F. et al. Why reinventing the wheels? An empirical study on library reuse and re-implementation. Empir Software Eng 25, 755–789 (2020). https://doi.org/10.1007/s10664-019-09771-0

Issue Date: January 2020
DOI: https://doi.org/10.1007/s10664-019-09771-0
