Abstract
Offshore and outsourced software development is a rapidly growing trend in the global software business environment. Predicting fault-prone modules in outsourced software products may allow both parties to establish mutually satisfactory, cost-effective testing strategies and product acceptance criteria, especially in iterative transitions. In this paper, using data from industrial software releases, we conduct an empirical study that compares ten classifiers over eight sets of code attributes, and we provide recommendations to help both the client and the vendor assess product quality through defect prediction. Overall, we observe generally high accuracy, which confirms the usefulness of metric-based classification. Furthermore, two classification techniques, Random Forest and Bayesian Belief Network, outperform the others in predictive accuracy: the former is the most cost-effective, and the latter has the lowest fault-prone module escaping rate. Our study also concludes that code metrics including size, traditional complexity, and object-oriented complexity perform fairly well.
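As a minimal illustration of the two evaluation measures named in the abstract, the sketch below computes predictive accuracy and a fault-prone module escaping rate from binary predictions. The abstract does not define the escaping rate, so the code assumes it is the share of truly fault-prone modules that a classifier labels as not fault-prone (the false negative rate). The function names and the sample labels are hypothetical and are not taken from the study.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fn, tn, fp) for binary labels where True means fault-prone."""
    tp = fn = tn = fp = 0
    for is_faulty, flagged in zip(actual, predicted):
        if is_faulty and flagged:
            tp += 1          # fault-prone module correctly flagged
        elif is_faulty and not flagged:
            fn += 1          # fault-prone module that escapes detection
        elif not is_faulty and flagged:
            fp += 1          # clean module flagged unnecessarily
        else:
            tn += 1          # clean module correctly passed
    return tp, fn, tn, fp


def accuracy(actual, predicted):
    """Fraction of modules classified correctly."""
    tp, fn, tn, fp = confusion_counts(actual, predicted)
    total = tp + fn + tn + fp
    return (tp + tn) / total if total else 0.0


def escaping_rate(actual, predicted):
    """Assumed definition: fault-prone modules missed / all fault-prone modules."""
    tp, fn, _, _ = confusion_counts(actual, predicted)
    faulty = tp + fn
    return fn / faulty if faulty else 0.0


# Made-up labels for eight modules, for illustration only.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, False, False, False, True, False]

print(accuracy(actual, predicted))
print(escaping_rate(actual, predicted))
```

Under this assumed definition, a classifier can score well on accuracy while still letting many fault-prone modules through, which is why the two measures are reported separately.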
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jia, H., Shu, F., Yang, Y., Wang, Q. (2009). Predicting Fault-Prone Modules: A Comparative Study. In: Gotel, O., Joseph, M., Meyer, B. (eds) Software Engineering Approaches for Offshore and Outsourced Development. SEAFOOD 2009. Lecture Notes in Business Information Processing, vol 35. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02987-5_7
DOI: https://doi.org/10.1007/978-3-642-02987-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02986-8
Online ISBN: 978-3-642-02987-5
eBook Packages: Computer Science (R0)