
Stable rankings for different effort models

Automated Software Engineering

Abstract

There is a large and growing number of proposed effort estimation methods, but little conclusive evidence ranking one method over another. Prior effort estimation studies suffered from “conclusion instability”, where the rankings assigned to different methods were not stable across (a) different evaluation criteria; (b) different data sources; or (c) different random selections of that data. This paper reports a study of 158 effort estimation methods on data sets based on COCOMO features. Four “best” methods were detected that were consistently better than the other 154 (“rest”) methods. These rankings of “best” and “rest” methods were stable across (a) three different evaluation criteria applied to (b) multiple data sets from two different sources that were (c) divided into hundreds of randomly selected subsets using four different random seeds. Hence, while there is no single universal “best” effort estimation method, there does appear to be a small number (four) of most useful methods. This result both complicates and simplifies effort estimation research. The complication is that any future effort estimation analysis should be preceded by a “selection study” that finds the best local estimator. The simplification is that such a study need not be labor intensive, at least for COCOMO-style data sets.
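To make the “selection study” idea concrete, below is a minimal sketch of this style of experiment: score several estimation methods over many random train/test subsets, under several random seeds and more than one evaluation criterion, then count how often each method wins. Everything in it is an illustrative assumption, not the paper’s protocol: the three toy methods (a median-effort baseline, nearest-neighbor analogy, and log-linear regression), the two criteria (MMRE and PRED(30)), and the synthetic COCOMO-like data stand in for the paper’s 158 methods, three criteria, and real data sets.

```python
# Sketch of a ranking-stability "selection study" (illustrative assumptions only).
import math
import random
import statistics

def make_projects(rng, n=40):
    """Synthetic COCOMO-like records: (size in KLOC, effort in person-months)."""
    projects = []
    for _ in range(n):
        kloc = rng.uniform(5, 200)
        effort = 2.94 * kloc ** 1.1 * rng.uniform(0.7, 1.3)  # noisy power law
        projects.append((kloc, effort))
    return projects

def fit_median(train):
    """Baseline: always predict the median training effort."""
    med = statistics.median(e for _, e in train)
    return lambda kloc: med

def fit_nearest(train):
    """Analogy: reuse the effort of the training project closest in size."""
    return lambda kloc: min(train, key=lambda p: abs(p[0] - kloc))[1]

def fit_loglinear(train):
    """Regression: least-squares fit of log(effort) on log(KLOC)."""
    xs = [math.log(k) for k, _ in train]
    ys = [math.log(e) for _, e in train]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda kloc: math.exp(a + b * math.log(kloc))

METHODS = {"median": fit_median, "nearest": fit_nearest,
           "loglinear": fit_loglinear}

def mmre(pairs):    # mean magnitude of relative error; lower is better
    return statistics.mean(abs(act - pred) / act for act, pred in pairs)

def pred30(pairs):  # fraction of estimates within 30% of actual; higher is better
    return statistics.mean(abs(act - pred) / act <= 0.30 for act, pred in pairs)

CRITERIA = {"MMRE": (mmre, min), "PRED(30)": (pred30, max)}

def selection_study(seeds=(1, 2, 3, 4), subsets=100):
    """Count, per method, how often it wins a criterion on a random subset."""
    wins = {name: 0 for name in METHODS}
    for seed in seeds:
        rng = random.Random(seed)
        projects = make_projects(rng)
        for _ in range(subsets):
            rng.shuffle(projects)
            test, train = projects[:10], projects[10:]
            results = {name: [(act, fit(train)(kloc)) for kloc, act in test]
                       for name, fit in METHODS.items()}
            for criterion, pick_best in CRITERIA.values():
                scores = {name: criterion(p) for name, p in results.items()}
                wins[pick_best(scores, key=scores.get)] += 1
    return sorted(wins.items(), key=lambda kv: -kv[1])

if __name__ == "__main__":
    for name, count in selection_study():
        print(f"{name:10s} wins={count}")
```

A method whose rankings are stable in the abstract’s sense would accumulate wins under every seed and every criterion here; a method that wins only under one criterion or one seed is exactly the kind of unstable conclusion the paper warns against.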



Author information

Correspondence to Tim Menzies.

Additional information

The research described in this paper was carried out at West Virginia University and the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the US National Aeronautics and Space Administration. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement by the US Government.

Available from http://menzies.us/pdf/07stability.


About this article

Cite this article

Menzies, T., Jalali, O., Hihn, J. et al. Stable rankings for different effort models. Autom Softw Eng 17, 409–437 (2010). https://doi.org/10.1007/s10515-010-0070-z

