Abstract
To support quality management in open source software (OSS) projects, this paper proposes measuring the balance of developers’ contributions to a source file as an entropy. An analysis of data collected from 10 popular OSS projects shows two trends: a source file tends to be more fault-prone when the developers’ contributions to it are more imbalanced (lower entropy), and the proposed metric can be useful for predicting fault-prone programs.
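As a rough illustration of the idea, the balance of contributions to one file can be sketched as the Shannon entropy of each developer's share of the work. The abstract does not give the paper's exact definition, so this sketch makes assumptions: contributions are counted as commits per developer, the logarithm is base 2, and the function name `contribution_entropy` and the sample author names are hypothetical.

```python
import math
from collections import Counter


def contribution_entropy(commit_authors):
    """Shannon entropy (in bits) of developers' commit shares for one file.

    Assumption: a "contribution" is one commit; the paper may measure
    contributions differently (for example, by changed lines).
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no commits, so there is nothing to balance
    entropy = 0.0
    for n in counts.values():
        p = n / total  # this developer's share of the commits
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical commit histories for two files.
balanced = ["alice", "bob", "carol", "dave"]   # four equal contributors
imbalanced = ["alice"] * 9 + ["bob"]           # one dominant contributor

print(contribution_entropy(balanced))    # higher entropy: balanced
print(contribution_entropy(imbalanced))  # lower entropy: imbalanced
```

Under these assumptions, a file touched evenly by several developers scores high, while a file dominated by one developer scores near zero, which is the "lower entropy" case the abstract associates with fault-proneness.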
Additional information
An earlier version of this paper was presented at the 3rd IEEE/ACIS International Conference on Big Data, Cloud Computing, and Data Science Engineering (BCD2018).
Rights and permissions
This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
About this article
Cite this article
Yamauchi, K., Aman, H., Amasaki, S. et al. An Entropy-Based Metric of Developer Contribution in Open Source Development and Its Application to Fault-Prone Program Analysis. Int J Netw Distrib Comput 6, 118–132 (2018). https://doi.org/10.2991/ijndc.2018.6.3.1
DOI: https://doi.org/10.2991/ijndc.2018.6.3.1