Abstract
Rapid advances in software technology have made software projects increasingly complex. Effort estimation is a fundamental first step in any software project, and inaccurate estimates can cause complications and setbacks for both present and future projects. Many estimation techniques have been in use for decades, but as software applications grow in size and complexity, these traditional methods no longer meet requirements. To estimate software effort accurately, this paper proposes a gradient boosting regressor model as a robust approach. Its performance is compared with that of other regression models, namely stochastic gradient descent, K-nearest neighbor, decision tree, bagging regressor, random forest regressor, and Ada-Boost regressor, on the COCOMO’81 dataset of 63 projects and the CHINA dataset of 499 projects. The models are evaluated using MAE, MSE, RMSE, and R². The results show that the gradient boosting regressor performs well, achieving an accuracy of 98% on COCOMO’81 and 93% on CHINA, and that it significantly outperforms all the other regression models compared on both datasets.
Abbreviations
- SGD: Stochastic gradient descent
- KNN: K-nearest neighbor
- DT: Decision trees
- BR: Bagging regressor
- RFR: Random forest regressor
- ABR: Ada-Boost regressor
- GBR: Gradient boosting regressor
- MRE: Magnitude relative error
- MMRE: Mean magnitude of relative error
- RMSE: Root mean square error
- MAE: Mean absolute error
- MdMRE: Median magnitude of relative error
- MSE: Mean square error
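The error measures abbreviated in this list, together with the R² score named in the abstract, can be illustrated with a minimal Python sketch. It assumes the usual textbook definitions (for example, MRE as |actual − predicted| / actual), which may differ in detail from the authors' implementation. The function names and the effort values are hypothetical and serve only as an illustration.

```python
import math
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for one project (assumes actual > 0)."""
    return abs(actual - predicted) / actual


def evaluate(actual, predicted):
    """Compute common effort-estimation error metrics for two equal-length lists."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mres = [mre(a, p) for a, p in zip(actual, predicted)]
    mse = mean([e * e for e in errors])
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    actual_mean = mean(actual)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - actual_mean) ** 2 for a in actual)
    return {
        "MAE": mean([abs(e) for e in errors]),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MMRE": mean(mres),
        "MdMRE": median(mres),
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical effort values in person-months, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
metrics = evaluate(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower values are better for MAE, MSE, RMSE, MMRE, and MdMRE, while an R² closer to 1 indicates a better fit. MRE-based measures are undefined when the actual effort is zero, so the sketch assumes strictly positive actual values.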
Ethics declarations
Conflict of interest
The authors declare that this manuscript has no conflict of interest with any other published source and has not been published previously (partly or in full). No data have been fabricated or manipulated to support our conclusions.
Cite this article
Suresh Kumar, P., Behera, H.S., Nayak, J. et al. A pragmatic ensemble learning approach for effective software effort estimation. Innovations Syst Softw Eng 18, 283–299 (2022). https://doi.org/10.1007/s11334-020-00379-y
DOI: https://doi.org/10.1007/s11334-020-00379-y