Negative results for software effort estimation

Abstract

More than half the literature on software effort estimation (SEE) focuses on comparisons of new estimation methods. Surprisingly, there are no studies comparing the latest, state-of-the-art methods against decades-old approaches. Accordingly, this paper takes five steps to check whether new SEE methods generate better estimates than older ones. First, we collected effort estimation methods ranging from “classical” COCOMO (parametric estimation over a pre-determined set of attributes) to “modern” ones (reasoning via analogy using spectral-based clustering plus instance and feature selection, and a recent “baseline method” proposed in ACM Transactions on Software Engineering and Methodology). Second, we cataloged the objections that led to the development of post-COCOMO estimation methods. Third, we characterized each of those objections as a comparison between newer and older estimation methods. Fourth, using four COCOMO-style data sets (from 1991, 2000, 2005, and 2010), we ran those comparison experiments. Fifth, we compared the performance of the different estimators using a Scott-Knott procedure with (i) the A12 effect size to rule out “small” differences and (ii) a 99% confidence bootstrap procedure to check for statistically distinct groupings of treatments. The major negative result of this paper is that, for the COCOMO data sets, nothing we studied did any better than Boehm’s original procedure. Hence, when COCOMO-style attributes are available, we strongly recommend (i) using that data and (ii) using COCOMO to generate predictions. We say this because the experiments of this paper show that, at least for effort estimation, how data is collected is more important than which learner is applied to that data.
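
To make the “classical” end of that spectrum concrete, the sketch below shows the general parametric form that COCOMO-style estimators share: effort grows exponentially with size and is scaled by a product of effort multipliers. This is a minimal illustration rather than the paper’s calibrated model; the constants a and b are Boehm’s (1981) basic-COCOMO organic-mode values, and the multipliers in the example are invented.

```python
from functools import reduce
from operator import mul

def cocomo_effort(kloc, effort_multipliers=(), a=2.4, b=1.05):
    """COCOMO-style parametric estimate, in person-months:
    effort = a * (KLOC ** b) * product(effort multipliers).
    a=2.4, b=1.05 are the basic-COCOMO organic-mode constants from
    Boehm (1981); real use requires local calibration of a and b."""
    em = reduce(mul, effort_multipliers, 1.0)  # product of cost drivers
    return a * (kloc ** b) * em

# Hypothetical 50 KLOC project with two cost drivers slightly above nominal:
print(round(cocomo_effort(50, (1.10, 1.05)), 1))  # roughly 168 person-months
```

The A12 statistic used in the fifth step also has a compact definition: the probability that a measurement drawn from one treatment exceeds one drawn from another, with ties counted as half (the Vargha-Delaney statistic recommended by Arcuri and Briand 2011). A minimal sketch, not the paper’s exact ranking code:

```python
def a12(xs, ys):
    """Vargha-Delaney A12 effect size: P(x > y), with ties counted as 0.5.
    A12 = 0.5 means no difference; values near 0.5 are the "small"
    effects that the Scott-Knott step rules out."""
    gt = sum(1.0 for x in xs for y in ys if x > y)
    ties = sum(1.0 for x in xs for y in ys if x == y)
    return (gt + 0.5 * ties) / (len(xs) * len(ys))

print(a12([10, 12, 14], [9, 11, 13]))  # 0.67: two-thirds of pairs favor xs
```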

Notes

  1. For full details on these attributes, see Section 4 of this paper.

References

  • Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: ICSE’11, pp 1–10

  • Auer M, Trendowicz A, Graser B, Haunschmid E, Biffl S (2006) Optimal project feature weights in analogy-based cost estimation: improvement and limitations. IEEE Trans Softw Eng 32:83–92

  • Baker D (2007) A hybrid approach to expert and model-based effort estimation. Master’s thesis, Lane Department of Computer Science and Electrical Engineering, West Virginia University, Available from https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5443

  • Black R, Curnow R, Katz R, Bray M (1977) BCS software production data, final technical report, RADC-TR-77-116. Technical report, Boeing Computer Services, Inc

  • Boehm B (1981) Software engineering economics. Prentice Hall, Englewood Cliffs

  • Boehm B (2000) Safe, simple software cost analysis. IEEE Softw:14–17

  • Boehm B, Horowitz E, Madachy R, Reifer D, Clark BK, Steece B, Brown AW, Chulani S, Abts C (2000) Software cost estimation with COCOMO II. Prentice Hall, Englewood Cliffs

  • Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, Belmont

  • Burgess CJ, Lefley M (2001) Can genetic programming improve software effort estimation? A comparative evaluation. Inf Softw Technol 43(14):863–873

  • Chen Z, Boehm B, Menzies T, Port D (2005) Finding the right data for software cost modeling. IEEE Softw 22:38–46

  • Chen Z, Menzies T, Port D (2005) Feature subset selection can improve software cost estimation. In: PROMISE’05. Available from http://menzies.us/pdf/05/fsscocomo.pdf

  • Chulani S, Boehm B, Steece B (1999) Bayesian analysis of empirical software engineering cost models. IEEE Trans Softw Eng 25(4)

  • Cohen PR (1995) Empirical methods for artificial intelligence. MIT Press, Cambridge

  • Corazza A, Di Martino S, Ferrucci F, Gravino C, Sarro F, Mendes E (2010) How effective is tabu search to configure support vector regression for effort estimation? In: Proceedings of the 6th international conference on predictive models in software engineering, PROMISE ’10, pp 4:1–4:10

  • Cordero R, Costamagna M, Paschetta E (1997) A genetic algorithm approach for the calibration of COCOMO-like models. In: 12th COCOMO Forum

  • Dabney JB (2002) Return on investment for IV&V. NASA funded study. Results Available from http://sarpresults.ivv.nasa.gov/ViewResearch/24.jsp

  • Dejaeger K, Verbeke W, Martens D, Baesens B (2012) Data mining techniques for software effort estimation: a comparative study. IEEE Trans Softw Eng 38:375–397

  • Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Mono. Stat. Appl. Probab. Chapman and Hall, London

  • Freiman F, Park R (1979) PRICE software model - version 3: an overview. In: Proceedings IEEE-PINY workshop on quantitative software models, IEEE catalog number TH 0067-9, pp 32–41

  • Herd J, Postak J, Russell W, Stewart J (1977) Software cost estimation study: study results, final technical report, RADC-TR-77-220. Technical report, Doty Associates

  • Ingold D, Boehm B, Koolmanojwong S (2013) A model for estimating agile project process and schedule acceleration. In: ICSSP 2013, pp 29–35

  • Jensen R (1983) An improved macrolevel software development resource estimation model, pp 88–92

  • Jorgensen M (2015) The world is skewed: ignorance, use, misuse, misunderstandings, and how to improve uncertainty analyses in software development projects, 2015 CREST workshop. http://goo.gl/0wFHLZ

  • Jørgensen M, Gruschke TM (2009) The impact of lessons-learned sessions on effort estimation and uncertainty assessments. IEEE Trans Softw Eng 35(3):368–383

  • Jørgensen M, Shepperd M (2007) A systematic review of software development cost estimation studies. IEEE Trans Softw Eng 33(1):33–53. Available from http://www.simula.no/departments/engineering/publications/Jorgensen.2005.12

  • Jorgensen M (2004) A review of studies on expert estimation of software development effort. J Syst Softw 70(1-2):37–60

  • Kadoda G, Cartwright M, Chen L, Shepperd M (2000) Experiences using case-based reasoning to predict software project effort

  • Kampenes VB, Dybå T, Hannay JE, Sjøberg DIK (2007) A systematic review of effect size in software engineering experiments. Inf Softw Technol 49(11–12):1073–1086

  • Keung JW (2008) Empirical evaluation of Analogy-X for software cost estimation. In: ESEM ’08: international symposium on empirical software engineering and measurement. ACM, New York, NY, USA, pp 294–296

  • Keung JW, Kitchenham B (2008) Experiments with Analogy-X for software cost estimation. In: ASWEC ’08: proceedings of the 19th Australian conference on software engineering. IEEE Computer Society, Washington, DC, USA, pp 229–238

  • Keung JW, Kitchenham BA, Jeffery DR (2008) Analogy-X: providing statistical inference to analogy-based software cost estimation. IEEE Trans Softw Eng 34(4):471–484

  • Kirsopp C, Shepperd M (2002) Making inferences with small numbers of training sets. IEE Proc Softw 149

  • Kocaguneli E, Menzies T, Bener A, Keung J (2012) Exploiting the essential assumptions of analogy-based effort estimation. IEEE Trans Softw Eng 38:425–438. Available from http://menzies.us/pdf/11teak.pdf

  • Kocaguneli E, Menzies T, Keung JW (2012) On the value of ensemble effort estimation. IEEE Trans Softw Eng 38(6):1403–1416

  • Kocaguneli E, Menzies T, Keung J, Cok D, Madachy R (2013) Active learning and effort estimation: finding the essential content of software effort estimation data. IEEE Trans Softw Eng 39(8):1040–1053

  • Kocaguneli E, Menzies T, Mendes E (2014) Transfer learning in effort estimation. Empir Softw Eng:1–31

  • Kocaguneli E, Zimmermann T, Bird C, Nagappan N, Menzies T (2013) Distributed development considered harmful? In: ICSE, pp 882–890

  • Li J, Ruhe G (2006) A comparative study of attribute weighting heuristics for effort estimation by analogy. In: International symposium on empirical software engineering, p 74

  • Li J, Ruhe G (2007) Decision support analysis for software effort estimation by analogy. In: PROMISE ’07: proceedings of the third international workshop on predictor models in software engineering, p 6

  • Li J, Ruhe G (2008) Analysis of attribute weighting heuristics for analogy-based software effort estimation method AQUA+. Empir Softw Eng 13:63–96

  • Li M, Mao K, Yang Y, Harman M (2013) Pricing crowdsourcing-based software development tasks. In: ICSE, new ideas and emerging results, San Francisco, CA, USA, pp 1205–1208

  • Li Y, Xie M, Goh T (2009) A study of the non-linear adjustment for analogy based software cost estimation. Empir Softw Eng:603–643

  • Lokan C, Mendes E (2006) Cross-company and single-company effort models using the ISBSG database: a further replicated study. In: The ACM-IEEE international symposium on empirical software engineering, November 21–22, Rio de Janeiro

  • Lokan C, Mendes E (2009) Applying moving windows to software effort estimation. In: 3rd international symposium on empirical software engineering and measurement, 2009. ESEM 2009, pp 111–122

  • Menzies T, Butcher A, Cok DR, Marcus A, Layman L, Shull F, Turhan B, Zimmermann T (2013) Local versus global lessons for defect prediction and effort estimation. IEEE Trans Softw Eng 39(6):822–834. Available from http://menzies.us/pdf/12localb.pdf

  • Menzies T, Chen Z, Hihn J, Lum K (2006) Selecting best practices for effort estimation. IEEE Trans Softw Eng. Available from http://menzies.us/pdf/06coseekmo.pdf

  • Menzies T, Dekhtyar A, Distefano J, Greenwald J (2007) Problems with precision. IEEE Trans Softw Eng. http://menzies.us/pdf/07precision.pdf

  • Menzies T, Kocagüneli E, Minku L, Peters F, Turhan B (2015) Chapter 20 - ensembles of learning machines. In: Sharing data and models in software engineering, pp 239–265

  • Menzies T, Peters F, Marcus A (2013) Ooops... (errata report for “Better Cross-Company Learning”). In: MSR’13. http://www.slideshare.net/timmenzies/msr13-mistake

  • Menzies T, Port D, Chen Z, Hihn J, Stukes S (2005) Validation methods for calibrating software effort models. In: Proceedings, ICSE. Available from http://menzies.us/pdf/04coconut.pdf

  • Menzies T, Shepperd M (2012) Special issue on repeatable results in software engineering prediction. Empir Softw Eng 17(1–2):1–17

  • Miller A (2002) Subset selection in regression, 2nd edn. Chapman & Hall, London

  • Minku LL, Yao X (2011) A principled evaluation of ensembles of learning machines for software effort estimation. In: PROMISE ’11

  • Minku LL, Yao X (2013) Ensembles and locality: insight on improving software effort estimation. Inf Softw Technol 55:1512–1528

  • Minku LL, Yao X (2014) How to make best use of cross-company data in software effort estimation? In: ICSE’14, pp 446–456

  • Mittas N, Angelis L (2013) Ranking and clustering software cost estimation models through a multiple comparisons algorithm. IEEE Trans Softw Eng 39(4):537–551

  • Moløkken-Østvold K, Haugen NC, Benestad HC (2008) Using planning poker for combining expert estimates in software projects. J Syst Softw 81:2106–2117

  • Murphy-Hill E, Parnin C, Black AP (2012) How we refactor, and how we know it. IEEE Trans Softw Eng 38(1):5–18

  • Myrtveit I, Stensrud E, Shepperd M (2005) Reliability and validity in comparative studies of software prediction models. IEEE Trans Softw Eng 31(5):380–391

  • Papakroni V (2013) Data carving: identifying and removing irrelevancies in the data. Master’s thesis, Lane Department of Computer Science and Electrical Engineering, West Virginia University

  • Park R (1988) The central equations of the PRICE software cost model. In: 4th COCOMO users group meeting

  • Passos C, Braun AP, Cruzes DS, Mendonca M (2011) Analyzing the impact of beliefs in software project practices. In: ESEM’11

  • Popper KR (1963) Conjectures and refutations. Routledge and Kegan Paul, London

  • Posnett D, Filkov V, Devanbu P (2011) Ecological inference in empirical software engineering. In: Proceedings of ASE’11

  • Putnam L (1976) A macro-estimating methodology for software development, pp 38–43

  • Scanniello G, Gravino C, Marcus A, Menzies T (2013) Class level fault prediction using software clustering. In: IEEE/ACM 28th international conference on automated software engineering (ASE) 2013. IEEE, pp 640–645

  • Shaw M (2001) The coming-of-age of software architecture research. In: Proceedings of the 23rd international conference on software engineering, ICSE ’01. IEEE Computer Society, Washington, DC, USA, p 656

  • Shepperd M, Schofield C (1997) Estimating software project effort using analogies. IEEE Trans Softw Eng 23(12). Available from http://www.utdallas.edu/~rbanker/SE_XII.pdf

  • Shepperd MJ, Macdonell SG (2012) Evaluating prediction systems in software project estimation. Inf Softw Technol 54(8):820–827

  • Spaceref.com (2002) NASA to shut down checkout & launch control system. http://www.spaceref.com/news/viewnews.html?id=475

  • Stanley C, Byrne MD (2013) Predicting tags for stackoverflow posts. In: Proceedings of ICCM 2013

  • Valerdi R (2011) Convergence of expert opinion via the wideband Delphi method: an application in cost estimation models. In: INCOSE International Symposium, Denver, USA. Available from http://goo.gl/Zo9HT

  • Walkerden F, Jeffery R (1999) An empirical study of analogy-based software effort estimation. Empir Softw Eng 4(2):135–158

  • Walston C, Felix C (1977) A method of programming measurement and estimation. IBM Syst J 16(1):54–77

  • Whigham PA, Owen CA, Macdonell SG (2015) A baseline model for software effort estimation. ACM Trans Softw Eng Methodol 24(3):20:1–20:11

  • Wolverton R (1974) The cost of developing large-scale software. IEEE Trans Comput C-23(6):615–636

Acknowledgments

The research described in this paper was carried out, in part, at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the US National Aeronautics and Space Administration. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement by the US Government.

Author information

Corresponding author

Correspondence to Tim Menzies.

Additional information

Communicated by: Richard Paige, Jordi Cabot and Neil Ernst

Cite this article

Menzies, T., Yang, Y., Mathew, G. et al. Negative results for software effort estimation. Empir Software Eng 22, 2658–2683 (2017). https://doi.org/10.1007/s10664-016-9472-2
