
Cost Prediction and Software Project Management

Chapter in: Software Project Management in a Changing World

Abstract

This chapter reviews the background and extent of the software project cost prediction problem. Given the importance of the topic, there has been a great deal of research activity over the past 40 years, most of which has focused on developing formal cost prediction systems. The problem is that there is presently little evidence that formal methods outperform experts; therefore, detailed consideration is given to the available empirical evidence concerning expert performance. This evidence shows that software professionals tend to be biased (towards optimism) and over-confident, and that a number of deep cognitive biases help us understand why this is so. Finally, the chapter describes how these problems might best be tackled through a range of simple, practical and evidence-based methods.


Notes

  1. There is something of a proliferation of terminology. Whilst the majority of writers refer to cost modelling or prediction, strictly speaking the usual focus is upon labour or effort, which forms the dominant part of costs and is usually the hardest to predict. Such costs may or may not be reflected in the price charged to the client or user. This chapter will use the term in this particular sense. Likewise, estimation and prediction are used interchangeably, since we are only concerned with future events.

  2. The regression model is constructed one independent variable at a time, iteratively, until no new variable significantly contributes to the model fit.

  3. A lazy learner only makes an inductive generalisation when actually presented with the new problem to solve. This can be advantageous when trying to learn in the face of noisy training cases and much uncertainty.

  4. Essentially, the point is that when conducting a significance test for a hypothesis, there are two dangers: one can wrongly reject the null hypothesis or wrongly fail to reject it. It is customary to set the chance of wrongly rejecting the null hypothesis (denoted by α) at 0.05. However, if many tests are performed, the probability of committing at least one such error grows with the number of tests. For this reason, the α threshold needs to be reduced to take this danger into account (see the short sketch after these notes).

  5. The bibliographic database can be found at www.simula.no/BESTweb
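
As a concrete illustration of Note 4 (a minimal sketch, not part of the chapter; the choice of k values is arbitrary), the following Python fragment shows how the family-wise error rate \( 1-(1-\alpha)^k \) grows with the number of independent tests k, and the Bonferroni-corrected per-test threshold \( \alpha/k \) that compensates:

    # Family-wise error rate (FWER): the probability of at least one
    # false rejection across k independent tests, each at level alpha.
    alpha = 0.05
    for k in (1, 5, 10, 20):
        fwer = 1 - (1 - alpha) ** k   # grows quickly with k
        per_test = alpha / k          # Bonferroni-corrected threshold
        print(f"k={k:2d}  FWER={fwer:.3f}  corrected alpha={per_test:.4f}")
    # At k=10 the chance of at least one spurious 'significant' result
    # is already about 0.40 unless the per-test threshold is reduced.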

References

  • Abran A, Bourque P (2004) SWEBOK: guide to the software engineering body of knowledge. IEEE Computer Society

  • Albrecht AJ, Gaffney JR (1983) Software function, source lines of code, and development effort prediction: a software science validation. IEEE Trans Softw Eng 9:639–648

  • Argyris C, Schön D (1996) Organizational learning II: theory, method and practice. Addison-Wesley, Reading, MA

  • Armstrong S (2007) Significance tests harm progress in forecasting. Int J Forecast 23:321–327

  • Benington HD (1956) Production of large computer programs. In: Symposium on advanced computer programs for digital computers, ACR-15

  • Boehm BW (1981) Software engineering economics. Prentice-Hall, Englewood Cliffs, NJ

  • Boehm BW (1984) Software engineering economics. IEEE Trans Softw Eng 10:4–21

  • Boehm B, Abts C, Brown W, Chulani S, Clark BK, Horowitz E, Madachy R, Reifer D, Steece B (2000) Software cost estimation with COCOMO II. Pearson/Prentice Hall, Englewood Cliffs, NJ

  • Borkowski JG, Carr M, Pressley M (1987) Spontaneous strategy use: perspectives from metacognitive theory. Intelligence 11:61–75

  • Buehler R, Griffin D (2003) Planning, personality, and prediction: the role of future focus in optimistic time predictions. Organ Behav Hum Decis Process 92:80–90

  • Buehler R, Griffin D, Ross M (1994) Exploring the “Planning Fallacy”: why people underestimate their task completion times. J Pers Soc Psychol 67:366–381

  • Clark J, Dolado JJ, Harman M, Hierons RM, Jones B, Lumkin M, Mitchell B, Mancoridis S, Rees K, Roper M, Shepperd M (2003) Reformulating software engineering as a search problem. IEE Proc Softw 150:161–175

  • Coutinho SA (2007) The relationship between goals, metacognition, and academic success. Educate 7:39–47

  • Cuelenaere A, van Genuchten M, Heemstra F (1987) Calibrating a software cost estimation model: why and how. Inf Softw Technol 29:558–567

  • Dawson TL (2008) Metacognition and learning in adulthood. Developmental Testing Service LLC, Northampton, MA

  • DeMarco T (1982) Controlling software projects: management, measurement and estimation. Yourdon Press, New York

  • El Emam K, Koru G (2008) A replicated survey of IT software project failures. IEEE Softw 25:84–90

  • Ellis P (2010) The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results. Cambridge University Press, Cambridge

  • Fishman G (1996) Monte Carlo: concepts, algorithms, and applications. Springer, New York

  • Flavell JH (1979) Metacognition and cognitive monitoring: a new area of cognitive-developmental inquiry. Am Psychol 34:906–911

  • Flyvbjerg B (2008) Curbing optimism bias and strategic misrepresentation in planning: reference class forecasting in practice. Eur Plan Stud 16:3–32

  • Flyvbjerg B, Bruzelius N, Rothengatter W (2003) Megaprojects and risk: an anatomy of ambition. Cambridge University Press, Cambridge

  • Foss T, Stensrud E, Kitchenham B, Myrtveit I (2003) A simulation study of the model evaluation criterion MMRE. IEEE Trans Softw Eng 29:985–995

  • Griffin D, Buehler R (1999) Frequency, probability, and prediction: easy solutions to cognitive illusions? Cogn Psychol 38:48–78

  • Grimstad S, Jørgensen M, Moløkken-Østvold K (2006) Software effort estimation terminology: the tower of Babel. Inf Softw Technol 48:302–310

  • Gulezian R (1991) Reformulating and calibrating COCOMO. J Syst Softw 16:235–242

  • Heemstra FJ (1992) Software cost estimation. Inf Softw Technol 34:627–639

  • Hughes RT (1996) Expert judgement as an estimating method. Inf Softw Technol 38:67–75

  • Hughes RT, Cotterell M (2009) Software project management. McGraw-Hill, London

  • Humphrey W (2000) Introducing the personal software process. Ann Softw Eng 1:311–325

  • Jeffery DR, Low GC (1990) Calibrating estimation tools for software development. Softw Eng J 5:215–221

  • Jørgensen M (2004) A review of studies on expert estimation of software development effort. J Syst Softw 70:37–60

  • Jørgensen M (2010) Identification of more risks can lead to increased over-optimism of and over-confidence in software development effort estimates. Inf Softw Technol 52:506–516

  • Jørgensen M, Grimstad S (2012) Software development estimation biases: the role of interdependence. IEEE Trans Softw Eng 38:677–693

  • Jørgensen M, Gruschke T (2005) Industrial use of formal software cost estimation models: expert estimation in disguise? In: Proceedings of EASE, Keele, UK

  • Jørgensen M, Gruschke T (2009) The impact of lessons-learned sessions on effort estimation and uncertainty assessments. IEEE Trans Softw Eng 35:368–383

  • Jørgensen M, Moløkken-Østvold K (2006) How large are software cost overruns? A review of the 1994 CHAOS report. Inf Softw Technol 48:297–301

  • Jørgensen M, Shepperd M (2007) A systematic review of software development cost estimation studies. IEEE Trans Softw Eng 33:33–53

  • Jørgensen M, Sjøberg DIK (2003) An effort prediction interval approach based on the empirical distribution of previous estimation accuracy. Inf Softw Technol 45:123–136

  • Jørgensen M, Teigen KH, Moløkken K (2004) Better sure than safe? Overconfidence in judgment based software development effort prediction intervals. J Syst Softw 70:79–93

  • Kahneman D, Tversky A (1979) Intuitive prediction: biases and corrective procedures. TIMS Stud Manag Sci 12:313–327

  • Kahneman D, Fredrickson B, Schreiber C, Redelmeier D (1993) When more pain is preferred to less: adding a better end. Psychol Sci 4:401–405

  • Kemerer CF (1987) An empirical validation of software cost estimation models. Commun ACM 30:416–429

  • Keung J, Kitchenham B, Jeffery R (2008) Analogy-X: providing statistical inference to analogy-based software cost estimation. IEEE Trans Softw Eng 34:471–484

  • Kitchenham BA (2002) The question of scale economies in software: why cannot researchers agree? Inf Softw Technol 44:13–24

  • Kitchenham BA, Kansala K (1993) Inter-item correlations among function points. In: 1st international symposium on software metrics. IEEE Computer Society Press, Baltimore, MD

  • Kitchenham BA, Linkman SG (1997) Estimates, uncertainty and risk. IEEE Softw 14:69–74

  • Kitchenham BA, MacDonell SG, Pickard L, Shepperd MJ (2001) What accuracy statistics really measure. IEE Proc Softw 148:81–85

  • Kitchenham BA, Pfleeger SL, McColl B, Eagan S (2002) An empirical study of maintenance and development estimation accuracy. J Syst Softw 64:57–77

  • Kitchenham B, Mendes E, Travassos G (2007) Cross versus within-company cost estimation studies: a systematic review. IEEE Trans Softw Eng 33:316–329

  • Kocaguneli E, Menzies T, Hihn J, Kang H (2012a) Size doesn’t matter? On the value of software size features for effort estimation. In: Proceedings of the 8th international conference on predictive models in software engineering, New York

  • Kocaguneli E, Menzies T, Keung J (2012b) On the value of ensemble effort estimation. IEEE Trans Softw Eng 38:1403–1416

  • Kolodner JL (1993) Case-based reasoning. Morgan-Kaufmann, San Mateo, CA

  • Lederer A, Mendelow A (1999) The impact of the environment on the management of information systems. Inf Syst Res 1:205–222

  • Liu Q, Mintram R (2005) Preliminary data analysis methods in software estimation. Softw Qual J 13:91–115

  • MacDonell S, Shepperd M (2003a) Using prior-phase effort records for re-estimation during software projects. In: 9th IEEE international metrics symposium

  • MacDonell S, Shepperd M (2003b) Combining techniques to optimize effort predictions in software project management. J Syst Softw 66:91–98

  • MacDonell S, Shepperd MJ (2007) Comparing local and global software effort estimation models: reflections on a systematic review. In: 1st international symposium on empirical software engineering and measurement, Madrid

  • Mair C, Shepperd M (2005) The consistency of empirical comparisons of regression and analogy-based software project cost prediction. In: 4th international symposium on empirical software engineering (ISESE), Noosa Heads, Australia

  • Mair C, Kadoda G, Lefley M, Keith P, Schofield C, Shepperd M, Webster S (2000) An investigation of machine learning based prediction systems. J Syst Softw 53:23–29

  • Mair C, Martincova M, Shepperd M (2009) A literature review of expert problem solving using analogy. In: 13th international conference on evaluation and assessment in software engineering (EASE), British Computer Society, Swinton, UK

  • Menzies T, Jalili M, Hihn J, Baker D, Lum K (2010) Stable rankings for different effort models. Autom Softw Eng 17:409–437

  • Menzies T, Butcher A, Cok D, Marcus A, Layman L, Shull F, Turhan B, Zimmermann T (2013) Local versus global lessons for defect prediction and effort estimation. IEEE Trans Softw Eng 39:822–834

  • Minku L, Yao X (2013) Ensembles and locality: insight on improving software effort estimation. Inf Softw Technol 55:1512–1528

  • Mittas N, Angelis L (2013) Ranking and clustering software cost estimation models through a multiple comparisons algorithm. IEEE Trans Softw Eng 39:537–551

  • Moløkken K, Jørgensen M (2004) Group processes in software effort estimation. Empir Softw Eng 9:315–334

  • Moon J (1999) Reflection in learning and professional development: theory and practice. Kogan Page, London

  • Myrtveit I, Stensrud E (1999) A controlled experiment to assess the benefits of estimating with analogy and regression models. IEEE Trans Softw Eng 25:510–525

  • Passing U, Shepperd M (2003) An experiment on software project size and effort estimation. In: ACM-IEEE international symposium on empirical software engineering (ISESE 2003)

  • Riaz M, Mendes E, Tempero E (2009) A systematic review of software maintainability prediction and metrics. In: 3rd international symposium on empirical software engineering and measurement. ACM Computer Press, pp 367–377

  • Ridley D, Schutz P, Glanz R, Weinstein C (1992) Self-regulated learning: the interactive influence of metacognitive awareness and goal-setting. J Exp Educ 60:293–306

  • Saltelli A, Tarantola S, Campolongo F (2000) Sensitivity analysis as an ingredient of modeling. Stat Sci 15:377–395

  • Schön DA (1983) The reflective practitioner. Basic Books, New York

  • Shepperd M (2003) Case-based reasoning and software engineering. In: Aurum A, Jeffery R, Wohlin C, Handzic M (eds) Managing software engineering knowledge. Springer, Berlin

  • Shepperd MJ, Kadoda G (2001) Comparing software prediction techniques using simulation. IEEE Trans Softw Eng 27:987–998

  • Shepperd M, MacDonell S (2012) Evaluating prediction systems in software project estimation. Inf Softw Technol 54:820–827

  • Shepperd MJ, Schofield C (1997) Estimating software project effort using analogies. IEEE Trans Softw Eng 23:736–743

  • Sommerville I (2010) Software engineering. Pearson, Hemel Hempstead, UK

  • Song Q, Shepperd M (2007) Missing data imputation techniques. Int J Bus Intell Data Mining 2:261–291

  • Song Q, Shepperd M (2011) Predicting software project effort: a grey relational analysis based method. Expert Syst Appl 38:7302–7316

  • Strike K, El Emam K, Madhavji N (2001) Software cost estimation with incomplete data. IEEE Trans Softw Eng 27:890–908

  • Symons CR (1988) Function point analysis: difficulties and improvements. IEEE Trans Softw Eng 14:2–11

  • Taff LM, Borchering JW, Hudgins WR (1991) Estimeetings: development estimates and a front-end process for a large project. IEEE Trans Softw Eng 17:839–849

  • Tversky A, Kahneman D (1974) Judgment under uncertainty: heuristics and biases. Science 185:1124–1131

  • Wagner S (2007) An approach to global sensitivity analysis: FAST on COCOMO. In: 1st international symposium on empirical software engineering and measurement (ESEM 2007). IEEE Computer Society, pp 440–442

  • Whitfield D (2007) Cost overruns, delays and terminations: 105 outsourced public sector ICT contracts. The European Services Strategy Unit

  • Willis R (1985) Invited review: critical path analysis and resource constrained project scheduling: theory and practice. Eur J Oper Res 21:149–155

  • Witten I, Frank E, Hall M (2011) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, Burlington, MA

  • Yang Y, He Z, Mao K, Li Q, Nguyen V, Boehm B, Valerdi R (2013) Analyzing and handling local bias for calibrating parametric cost estimation models. Inf Softw Technol 55:1496–1511

Author information

Correspondence to Martin Shepperd.

Glossary

Absolute residuals

a simple and robust means of assessing the predictive accuracy of a prediction system. It is defined simply as: \( \left|{y}_i-{\widehat{y}}_i\right| \) where \( y_i \) is the true value for the ith project and \( {\widehat{y}}_i \) the estimated value. This gives the error, irrespective of direction, i.e., an under- or over-estimate. The mean residual (keeping the direction of error) gives a measure of the degree of bias (see the sketch after the MMRE entry below).

Cognitive bias

these are patterns of thinking about problem solving or decision-making that distort judgement and lead people to ‘sub-optimal’ choices. Because of the ubiquity of many such biases, they are classified and named, e.g., the anchoring bias. See the pioneering work of Tversky and Kahneman (1974).

Double loop learning

this differs from ordinary or single-loop learning in that one not only observes the effects of the process, but also understands the external factors that influence the effects. This was initially promoted by Argyris and Schön as a way of promoting effective organisational behaviour (Argyris and Schön 1996).

Estimation by Analogy (EBA)

uses some form of case-based reasoning where a new or target case which is to be solved is plotted in feature space (one dimension per feature) and some distance metric is used to determine past proximal cases from which a solution can be derived. For a general account of CBR see the pioneering work by Kolodner (1993) and for its application to software engineering see Shepperd (2003).
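
A minimal sketch of the idea in Python (illustrative only; the feature values, the normalisation scheme and the choice of k = 2 are assumptions, not the chapter's method). The target project's effort is estimated as the mean effort of its k nearest past projects in normalised feature space:

    import math

    # Historical projects: (features, actual effort in person-hours).
    # Feature values here are purely illustrative.
    history = [
        ((120.0, 4.0), 2300.0),   # (size in function points, team size)
        ((300.0, 9.0), 7100.0),
        ((150.0, 5.0), 2900.0),
        ((80.0,  3.0), 1400.0),
    ]

    def normalise(value, values):
        lo, hi = min(values), max(values)
        return (value - lo) / (hi - lo) if hi > lo else 0.0

    def estimate_by_analogy(target, history, k=2):
        # Normalise each feature dimension so no single feature dominates
        # the Euclidean distance calculation.
        dims = list(zip(*[f for f, _ in history]))
        def distance(case):
            return math.sqrt(sum(
                (normalise(t, dims[i]) - normalise(c, dims[i])) ** 2
                for i, (t, c) in enumerate(zip(target, case))))
        nearest = sorted(history, key=lambda h: distance(h[0]))[:k]
        # The solution is derived from the proximal cases: here, mean effort.
        return sum(e for _, e in nearest) / k

    print(estimate_by_analogy((140.0, 5.0), history))  # ~2600 person-hours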

Expert Judgement

this is something of a catch-all description for a range of informal approaches to estimation. Jørgensen describes it as ranging from ‘unaided intuition (“gut feeling”) to expert judgment supported by historical data, process guidelines and checklists (“structured estimation”)’ (Jørgensen 2004). Despite being a widespread estimation approach, it can be criticised because its reasoning is not open to scrutiny: the reasoning process is ‘non-recoverable’ (Jørgensen 2004), not repeatable and not easily transferable from existing experts to others.

Formal prediction system

or formal model for cost prediction, is characterised by repeatability, so that different individuals applying the same inputs should generate the same outputs. (The exception is prediction systems based on stochastic search [also see Chap. 15 on search-based project management], where this will tend to be true over time (Clark et al. 2003), but not necessarily for a single utilisation.) Examples of formal systems range from simple algorithmic models, such as COCOMO, to complex ensembles of learners.
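
To make repeatability concrete, here is a minimal sketch of the basic COCOMO 81 effort equation in its organic mode (coefficients 2.4 and 1.05 as published by Boehm 1981); the calculation is deterministic, so any user supplying the same input obtains the same output:

    # Basic COCOMO 81, organic mode (Boehm 1981):
    # effort (person-months) = 2.4 * KLOC^1.05
    def cocomo_organic_effort(kloc: float) -> float:
        return 2.4 * kloc ** 1.05

    print(round(cocomo_organic_effort(32), 1))  # 32 KLOC -> ~91.3 person-months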

Machine Learning

this is a branch of applied artificial intelligence based on inducing prediction systems from historical data, i.e., reasoning from the particular to the general. There are a wide range of approaches including neural networks, case-based reasoning, rule induction, Bayesian methods, support vector machines and population search methods such as genetic programming. Standard textbooks that provide overviews of these techniques include Witten et al. (2011).
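
As a minimal sketch of induction from historical data (the data values are invented, and scikit-learn is merely one convenient library, not one prescribed by the chapter):

    from sklearn.tree import DecisionTreeRegressor

    # Historical projects: features (size in function points, team size)
    # and the actual effort (person-hours). Values are illustrative only.
    X = [[120, 4], [300, 9], [150, 5], [80, 3]]
    y = [2300, 7100, 2900, 1400]

    model = DecisionTreeRegressor(max_depth=2).fit(X, y)  # induce from history
    print(model.predict([[140, 5]]))  # generalise to a new project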

Mean magnitude of relative error (MMRE)

this is a widely used, although now heavily criticized (Kitchenham et al. 2001; Foss et al. 2003; Shepperd and MacDonell 2012), measure of predictive accuracy defined as: \( \mathrm{MMRE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{x_i-{\widehat{x}}_i}{x_i}\right| \) where \( x_i \) is the true cost for the ith project, \( {\widehat{x}}_i \) is the estimated cost and n the total number of projects.
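
A minimal Python sketch (with invented actual and estimated values) computing the absolute residuals, the mean residual as a measure of bias, and MMRE:

    actual    = [2300.0, 7100.0, 2900.0, 1400.0]   # true effort, person-hours
    estimated = [2000.0, 5500.0, 2600.0, 1500.0]   # predicted effort

    # Absolute residuals: |y_i - y_hat_i|, error irrespective of direction.
    abs_residuals = [abs(y - yh) for y, yh in zip(actual, estimated)]

    # Mean residual (direction kept) indicates bias; positive values here
    # mean under-estimation, i.e., over-optimism.
    bias = sum(y - yh for y, yh in zip(actual, estimated)) / len(actual)

    # MMRE: mean of |x_i - x_hat_i| / x_i over the n projects.
    mmre = sum(abs(y - yh) / y for y, yh in zip(actual, estimated)) / len(actual)

    print(abs_residuals, bias, round(mmre, 3))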

Metacognition

this refers to ‘thinking about thinking’ (Flavell 1979) and is an awareness and monitoring of one’s thoughts and performance. It encompasses the ability to consciously control the cognitive processes involved in learning such as planning, strategy selection, monitoring and evaluating progress towards a particular goal and adapting strategies as, and when, necessary to reach that goal (Ridley et al. 1992).

Over-confidence

refers to the tendency of an estimator to value precision over accuracy. Typically, one might express confidence in an estimate as the likelihood that the true value falls within a specified interval. For example, stating that one is 80 % confident that the actual effort will fall within the range 1,000–1,200 person-hours implies that this will occur 8 out of 10 times. If the true value falls into the range less frequently this implies over-confidence. Jørgensen et al. (2004) reported that over-confidence was a widespread phenomenon and that at least one contributor was the fact that managers often interpret wide intervals as conveying a lack of knowledge and prefer narrow but less accurate estimates.
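
A minimal sketch (with invented interval data) of how calibration can be checked by comparing the stated confidence level with the observed hit rate:

    # Each tuple: (stated 80% interval lo, hi, actual effort in person-hours).
    # Illustrative data only.
    intervals = [
        (1000, 1200, 1350),
        (400,  500,  480),
        (2000, 2600, 3100),
        (700,  900,  1050),
        (1500, 1800, 1700),
    ]

    hits = sum(lo <= actual <= hi for lo, hi, actual in intervals)
    hit_rate = hits / len(intervals)

    # For well-calibrated 80% intervals the hit rate should approach 0.8;
    # here it is 0.4, i.e., over-confidence (intervals too narrow).
    print(hit_rate)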

Over-optimism

refers to the situation where the estimation error is biased towards an under-estimate. Many studies indicate that this is the norm in the software industry with a figure of 30 % being seen as typical (Jørgensen 2004).

Prediction

whilst ‘prediction’ and ‘estimation’ are often used interchangeably, we use ‘prediction’ to mean a forecast or projection, and ‘estimate’ to connote a guess or rough and ready calculation.

Single-loop learning

Argyris and Schön (1996) characterise this as focusing on restrictive feedback, so that the individual or organisation only endeavours to improve a single metric without external reflection upon the process (which would constitute double-loop learning).


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Shepperd, M. (2014). Cost Prediction and Software Project Management. In: Ruhe, G., Wohlin, C. (eds) Software Project Management in a Changing World. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55035-5_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-55035-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55034-8

  • Online ISBN: 978-3-642-55035-5

  • eBook Packages: Computer Science, Computer Science (R0)
