Appropriate number of analogues in analogy based software effort estimation using quality datasets

Cluster Computing

Abstract

Analogy-based software effort estimation (ASEE) plays an important role in software development. It attracts researchers because its reasoning method is simple and resembles human reasoning: the effort of a new project is estimated by reusing the effort values of similar past projects. How many similar past projects (analogues) should be reused remains a topic of debate in ASEE research. The reliability and accuracy of ASEE methods are also considerably affected by the quality of software repositories (datasets); a dataset that does not follow the ASEE principle is not considered useful for the ASEE method. This article presents a novel ASEE approach for finding the appropriate number of analogues from quality datasets. Its data pre-processing stage is based on Spearman’s rank-order correlation and the Kruskal–Wallis test, and it handles categorical attributes (both nominal and ordinal) individually. Spearman’s rank-order correlation is used to find reliable numerical and ordinal attributes, and the Kruskal–Wallis test identifies reliable nominal attributes. Reliable attributes are those that significantly influence the effort. The experimental results show that the proposed approach improves dataset quality and attribute selection from the metadata, and reduces abnormal observations and overall project development cost.
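
The two screening statistics named in the abstract, and the basic idea of estimating from k analogues, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the toy project data, the Euclidean distance, and the use of the mean of the k nearest efforts are all assumptions made here for illustration, and the Kruskal–Wallis statistic is computed without a tie correction.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for groups of efforts."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def estimate_by_analogy(history, target, k):
    """Mean effort of the k past projects closest to target (Euclidean)."""
    def distance(features):
        return sum((a - b) ** 2 for a, b in zip(features, target)) ** 0.5
    nearest = sorted(history, key=lambda project: distance(project[0]))[:k]
    return mean(effort for _, effort in nearest)


# Hypothetical toy data: project size (numerical) and effort.
size = [10, 20, 30, 40, 50, 60]
effort = [100, 180, 260, 410, 500, 650]
rho = spearman_rho(size, effort)

# Efforts grouped by a hypothetical nominal attribute with two levels.
h_stat = kruskal_wallis_h([[100, 180, 260], [410, 500, 650]])

# Estimate a new project of size 35 from its k = 2 nearest analogues.
history = [((s,), e) for s, e in zip(size, effort)]
estimate = estimate_by_analogy(history, (35,), k=2)
```

In a screening step of this kind, an attribute would be kept only if its statistic is significant at a chosen level; the H statistic is usually compared against a chi-square distribution with (number of groups − 1) degrees of freedom, which is only a rough approximation for samples as small as the toy data here. The significance level and the way k is chosen are left open in this sketch.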

Data availability

https://github.com/Nisha-pal27/ASEE.

Funding

This research received no specific grant from any funding agency.

Author information

Contributions

All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nisha Pal.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This manuscript is the authors’ own original work, which has not been previously published elsewhere.

Informed consent

Research does not involve humans.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pal, N., Yadav, M.P. & Yadav, D.K. Appropriate number of analogues in analogy based software effort estimation using quality datasets. Cluster Comput 27, 531–546 (2024). https://doi.org/10.1007/s10586-023-03967-2

  • DOI: https://doi.org/10.1007/s10586-023-03967-2
