Empirical Software Engineering, Volume 8, Issue 3, pp 255–283

Fault Prediction Modeling for Software Quality Estimation: Comparing Commonly Used Techniques

  • Taghi M. Khoshgoftaar
  • Naeem Seliya

Abstract

High-assurance and complex mission-critical software systems depend heavily on the reliability of their underlying software applications. Early software fault prediction is a proven technique for achieving high software reliability. Prediction models based on software metrics can predict the number of faults in software modules. Timely predictions from such models can be used to direct cost-effective quality enhancement efforts to modules that are likely to have a high number of faults. We evaluate the predictive performance of six commonly used fault prediction techniques: CART-LS (least squares), CART-LAD (least absolute deviation), S-PLUS, multiple linear regression, artificial neural networks, and case-based reasoning. The case study consists of software metrics collected over four releases of a very large telecommunications system. Two performance metrics, average absolute error and average relative error, are used to gauge the accuracy of the different prediction models. Models were built using both the original software metrics (RAW) and their principal components (PCA). Two-way ANOVA randomized complete block design models with two blocking variables are designed, with average absolute and average relative errors as response variables. System release and model type (RAW or PCA) form the blocking variables, and the prediction technique is treated as a factor. Using multiple pairwise comparisons, the performance order of the prediction models is determined. We observe that for both average absolute and average relative errors, the CART-LAD model performs the best while the S-PLUS model is ranked sixth.
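The two performance metrics named in the abstract can be illustrated with a short sketch. The abstract does not give the formulas, so the definitions below are assumptions: the average absolute error is taken as the mean of |actual − predicted|, and the average relative error divides each absolute error by (actual + 1), a common way to keep the ratio defined for modules with zero faults. The function names and the sample data are hypothetical.

```python
from typing import Sequence


def average_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |actual - predicted| over all modules (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def average_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |actual - predicted| / (actual + 1) over all modules.

    The +1 in the denominator is an assumption, not taken from the abstract;
    it avoids division by zero for fault-free modules.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) / (a + 1) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical fault counts for three modules, not data from the study.
    actual_faults = [0, 2, 4]
    predicted_faults = [1, 2, 2]
    print(average_absolute_error(actual_faults, predicted_faults))
    print(average_relative_error(actual_faults, predicted_faults))
```

Under these assumed definitions, lower values of either metric indicate a more accurate model, which is the sense in which the abstract ranks the six techniques.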

Software quality prediction, software metrics, fault prediction, CART, S-PLUS, multiple linear regression, neural networks, case-based reasoning

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Taghi M. Khoshgoftaar (1)
  • Naeem Seliya (2)
  1. Florida Atlantic University, Boca Raton
  2. Florida Atlantic University, Boca Raton