
Stable rankings for different effort models

Automated Software Engineering

Abstract

There is a large and growing number of proposed effort estimation methods, but little conclusive evidence ranking one method over another. Prior effort estimation studies suffered from “conclusion instability”, where the rankings assigned to different methods were not stable across (a) different evaluation criteria; (b) different data sources; or (c) different random selections of that data. This paper reports a study of 158 effort estimation methods on data sets based on COCOMO features. Four “best” methods were detected that were consistently better than the other 154 (“rest”) methods. These rankings of “best” and “rest” methods were stable across (a) three different evaluation criteria applied to (b) multiple data sets from two different sources that were (c) divided into hundreds of randomly selected subsets using four different random seeds. Hence, while there is no single universal “best” effort estimation method, there does appear to be a small number (four) of most useful methods. This result both complicates and simplifies effort estimation research. The complication is that any future effort estimation analysis should be preceded by a “selection study” that finds the best local estimator. The simplification is that such a study need not be labor intensive, at least for COCOMO-style data sets.
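To make the “selection study” idea concrete, below is a minimal sketch of this style of experiment: score several estimation methods over many random train/test subsets, under several random seeds and more than one evaluation criterion, then count how often each method wins. Everything in it is an illustrative assumption, not the paper’s protocol: the three toy methods (a median-effort baseline, nearest-neighbor analogy, and log-linear regression), the two criteria (MMRE and PRED(30)), and the synthetic COCOMO-like data stand in for the paper’s 158 methods, three criteria, and real data sets.

```python
# Sketch of a ranking-stability "selection study" (illustrative assumptions only).
import math
import random
import statistics

def make_projects(rng, n=40):
    """Synthetic COCOMO-like records: (size in KLOC, effort in person-months)."""
    projects = []
    for _ in range(n):
        kloc = rng.uniform(5, 200)
        effort = 2.94 * kloc ** 1.1 * rng.uniform(0.7, 1.3)  # noisy power law
        projects.append((kloc, effort))
    return projects

def fit_median(train):
    """Baseline: always predict the median training effort."""
    med = statistics.median(e for _, e in train)
    return lambda kloc: med

def fit_nearest(train):
    """Analogy: reuse the effort of the training project closest in size."""
    return lambda kloc: min(train, key=lambda p: abs(p[0] - kloc))[1]

def fit_loglinear(train):
    """Regression: least-squares fit of log(effort) on log(KLOC)."""
    xs = [math.log(k) for k, _ in train]
    ys = [math.log(e) for _, e in train]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda kloc: math.exp(a + b * math.log(kloc))

METHODS = {"median": fit_median, "nearest": fit_nearest,
           "loglinear": fit_loglinear}

def mmre(pairs):    # mean magnitude of relative error; lower is better
    return statistics.mean(abs(act - pred) / act for act, pred in pairs)

def pred30(pairs):  # fraction of estimates within 30% of actual; higher is better
    return statistics.mean(abs(act - pred) / act <= 0.30 for act, pred in pairs)

CRITERIA = {"MMRE": (mmre, min), "PRED(30)": (pred30, max)}

def selection_study(seeds=(1, 2, 3, 4), subsets=100):
    """Count, per method, how often it wins a criterion on a random subset."""
    wins = {name: 0 for name in METHODS}
    for seed in seeds:
        rng = random.Random(seed)
        projects = make_projects(rng)
        for _ in range(subsets):
            rng.shuffle(projects)
            test, train = projects[:10], projects[10:]
            results = {name: [(act, fit(train)(kloc)) for kloc, act in test]
                       for name, fit in METHODS.items()}
            for criterion, pick_best in CRITERIA.values():
                scores = {name: criterion(p) for name, p in results.items()}
                wins[pick_best(scores, key=scores.get)] += 1
    return sorted(wins.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    for name, count in selection_study():
        print(f"{name:10s} wins={count}")
```

A method whose rankings are stable in the abstract’s sense would accumulate wins under every seed and every criterion here; a method that wins only under one criterion or one seed is exactly the kind of unstable conclusion the paper warns against.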



Author information

Correspondence to Tim Menzies.

Additional information

The research described in this paper was carried out at West Virginia University and the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the US National Aeronautics and Space Administration. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement by the US Government.

Available from http://menzies.us/pdf/07stability.


About this article

Cite this article

Menzies, T., Jalali, O., Hihn, J. et al. Stable rankings for different effort models. Autom Softw Eng 17, 409–437 (2010). https://doi.org/10.1007/s10515-010-0070-z

