Annals of Software Engineering, Volume 1, Issue 1, pp 141–154

Application of neural networks for predicting program faults

  • Taghi M. Khoshgoftaar
  • Abhijit S. Pandya
  • David L. Lanning


Accurately predicting the number of faults in program modules is a major problem in the quality control of large software development efforts. Some software complexity metrics are closely related to the distribution of faults across program modules. Using these relationships, software engineers develop models that provide early estimates of quality metrics that do not become available until late in the development cycle. By considering these early estimates, software engineers can take actions to avoid or prepare for emerging quality problems. Most often, the predictive models are based upon multiple regression analysis. However, measures of software quality and complexity exhibit systematic departures from the assumptions of these analyses. With extreme violations of these assumptions, multiple regression models become unstable and lose most of their predictive quality. Since neural network models carry no data assumptions, these models could be more appropriate than regression models for modeling software faults. In this paper, we explore a neural network methodology for developing models that predict the number of faults in program modules. We apply this methodology to develop neural network models based upon data collected during the development of two commercial software systems. After developing neural network models, we apply multiple linear regression methods to develop regression models on the same data. For the data sets considered, the neural network methodology produced better predictive models in terms of both quality of fit and predictive quality.
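As an illustration of how two fault-prediction models might be compared on predictive quality, the following is a minimal sketch of an average relative error computation. The abstract does not give the formula, so the form used here (absolute prediction error divided by the actual fault count plus an offset of 1, which avoids division by zero for fault-free modules) is an assumption, and the fault counts and model outputs are made-up illustrative values, not data from the paper.

```python
def average_relative_error(actual, predicted, offset=1.0):
    """Mean of |predicted - actual| / (actual + offset) over all modules.

    The `offset` term is an assumption made for this sketch: it keeps the
    ratio defined when a module has zero observed faults.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one module is required")
    total = sum(abs(p - a) / (a + offset) for a, p in zip(actual, predicted))
    return total / len(actual)


# Hypothetical fault counts for four program modules (illustrative only).
actual_faults = [0, 3, 7, 1]

# Hypothetical predictions from two competing models (illustrative only).
model_a_predictions = [1.0, 2.0, 7.0, 1.0]
model_b_predictions = [0.0, 6.0, 3.0, 2.0]

are_a = average_relative_error(actual_faults, model_a_predictions)
are_b = average_relative_error(actual_faults, model_b_predictions)

# The model with the lower average relative error predicts better
# on this held-out sample.
better = "model A" if are_a < are_b else "model B"
print(f"ARE(model A) = {are_a:.4f}, ARE(model B) = {are_b:.4f}, better: {better}")
```

Under this definition a lower value indicates better predictions; computing it on modules held out from model fitting would measure predictive quality, while computing it on the fitting data would measure quality of fit.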


Software complexity metrics; software quality; regression analysis; neural networks; program faults; model quality of fit; model predictive quality; average relative error





Copyright information

© J.C. Baltzer AG, Science Publishers 1995

Authors and Affiliations

  • Taghi M. Khoshgoftaar (1)
  • Abhijit S. Pandya (2)
  • David L. Lanning (3)

  1. Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, USA
  2. Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, USA
  3. IBM Corporation, Boca Raton, USA
