An Approach to Predict Software Project Success Based on Random Forest Classifier
The success or failure of a software project depends on the quality and reliability of its product. Defect prediction is important because it helps direct testing effort, reduce costs, and improve software quality; software defects are expensive in terms of both quality and cost. Data mining techniques and machine learning algorithms can be applied to software repositories to extract useful information. This paper presents a software defect prediction model based on the Random Forest (RF) ensemble classifier, which is more robust and beneficial for large-scale software systems. The difference in performance between the proposed methodology and other methods is statistically significant. The results yield two findings. First, RF is efficient irrespective of the application domain, that is, regardless of the project, its complexity, or its domain. Second, this inference makes it possible to predict the success level of projects. RF thus gives project managers a lightweight means of predicting project success, based on mining carried out with RF in empirical investigations.
Keywords: Data Mining, Clustering, Software Engineering, Random Forest, Metrics, Software Quality, Project Management