Software Project Effort Estimation Based on Multiple Parametric Models Generated Through Data Clustering
Parametric software effort estimation models usually consist of only a single mathematical relationship. With the advent of software repositories containing data from heterogeneous projects, these types of models suffer from poor adjustment and predictive accuracy. One possible way to alleviate this problem is to use a set of mathematical equations obtained by dividing the historical project dataset, according to different parameters, into subdatasets called partitions. In turn, partitions are divided into clusters that serve as the basis for more accurate models. In this paper, we describe the process, tool, and results of such an approach through a case study using a publicly available repository, ISBSG. Results suggest the adequacy of the technique as an extension of existing single-expression models, without making the estimation process much more complex than one that uses a single estimation model. A tool to support the process is also presented.
Keywords: software engineering, software measurement, effort estimation, clustering
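The idea summarized in the abstract, one parametric equation per cluster instead of a single equation for the whole repository, can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the paper: the clustering step is a plain one-dimensional k-means on log size, each cluster model is a power law effort = a * size^b fitted by least squares in log-log space, the function names (`fit_power_law`, `kmeans_1d`, `build_models`, `estimate`) are hypothetical, and the project data are synthetic, not ISBSG records.

```python
import math

def fit_power_law(projects):
    """Least-squares fit of effort = a * size**b in log-log space."""
    xs = [math.log(size) for size, _ in projects]
    ys = [math.log(effort) for _, effort in projects]
    n = len(projects)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = math.exp(mean_y - b * mean_x)
    return a, b

def kmeans_1d(values, k, iterations=50):
    """Plain k-means on scalars (assumed stand-in for the clustering step)."""
    ordered = sorted(values)
    # Deterministic start: one centroid near the middle of each of k equal slices.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a group ends up empty.
        updated = [sum(g) / len(g) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def nearest_cluster(size, centroids):
    """Index of the centroid closest to log(size)."""
    return min(range(len(centroids)),
               key=lambda i: abs(math.log(size) - centroids[i]))

def build_models(projects, k):
    """Cluster projects by log size, then fit one power law per cluster."""
    centroids = kmeans_1d([math.log(size) for size, _ in projects], k)
    clusters = [[] for _ in range(k)]
    for project in projects:
        clusters[nearest_cluster(project[0], centroids)].append(project)
    # A cluster with fewer than two projects cannot support a fit.
    models = [fit_power_law(c) if len(c) >= 2 else None for c in clusters]
    return centroids, models

def estimate(size, centroids, models):
    """Estimate effort with the model of the cluster the new project falls in."""
    model = models[nearest_cluster(size, centroids)]
    if model is None:
        raise ValueError("no model available for this cluster")
    a, b = model
    return a * size ** b

# Synthetic (size, effort) pairs: two groups that follow different power laws.
small = [(s, 10 * s ** 0.9) for s in (50, 80, 120, 200)]
large = [(s, 2 * s ** 1.2) for s in (1000, 1500, 2500, 4000)]
projects = small + large

centroids, models = build_models(projects, k=2)
print(models)
print(estimate(100, centroids, models), estimate(2000, centroids, models))
```

Estimating a new project then costs one extra step compared with a single-equation model: find the nearest cluster, then evaluate that cluster's equation. A single power law fitted to all eight synthetic projects would have to compromise between the two groups, which is the adjustment problem the abstract attributes to heterogeneous repositories.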