An empirical study of predicting software faults with case-based reasoning
The resources allocated for software quality assurance and improvement have not kept pace with the ever-increasing need for better software quality. A targeted software quality inspection can detect faulty modules and reduce the number of faults occurring during operations. We present a software fault prediction modeling approach with case-based reasoning (CBR), a part of the computational intelligence field focusing on automated reasoning processes. A CBR system functions as a software fault prediction model by quantifying, for a module under development, the expected number of faults based on similar modules that were previously developed. Such a system is composed of a similarity function, the number of nearest neighbor cases used for fault prediction, and a solution algorithm. The selection of a particular similarity function and solution algorithm may affect the prediction accuracy of a CBR-based software fault prediction system. This paper presents an empirical study investigating the effects of using three different similarity functions and two different solution algorithms on the prediction accuracy of our CBR system. The influence of varying the number of nearest neighbor cases on the prediction accuracy is also explored. Moreover, the benefits of using metric-selection procedures for our CBR system are also evaluated. Case studies of a large legacy telecommunications system are used for our analysis. It is observed that the CBR system using the Mahalanobis distance similarity function and the inverse distance weighted solution algorithm yielded the most accurate fault predictions. In addition, the CBR models performed better than models based on multiple linear regression.
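To make the three components named in the abstract concrete, the following is a minimal sketch of a CBR-style fault predictor that combines a Mahalanobis distance similarity function, a choice of k nearest neighbor cases, and an inverse distance weighted solution algorithm. It is an illustration under stated assumptions, not the authors' implementation: the function names (`covariance`, `solve`, `mahalanobis`, `predict_faults`), the two-metric case library, the sample-covariance estimate, and the handling of exact matches are all hypothetical choices. Some formulations use the squared Mahalanobis distance; the square root is taken here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def covariance(rows: Sequence[Vector]) -> List[List[float]]:
    """Sample covariance matrix of the metric vectors in the case library."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def solve(a: Sequence[Vector], b: Vector) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis(x: Vector, y: Vector, cov: Sequence[Vector]) -> float:
    """sqrt((x - y)^T S^-1 (x - y)), computed without forming S^-1 explicitly."""
    diff = [a - b for a, b in zip(x, y)]
    z = solve(cov, diff)
    return max(0.0, sum(d * zi for d, zi in zip(diff, z))) ** 0.5


def predict_faults(cases: Sequence[Vector], faults: Sequence[float],
                   target: Vector, k: int) -> float:
    """Inverse distance weighted average of the faults of the k nearest cases."""
    cov = covariance(cases)
    nearest = sorted((mahalanobis(target, c, cov), f)
                     for c, f in zip(cases, faults))[:k]
    # Assumption: an exact match (distance 0) determines the prediction directly.
    exact = [f for d, f in nearest if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    total = sum(1.0 / d for d, _ in nearest)
    return sum((1.0 / d) / total * f for d, f in nearest)


# Hypothetical case library: two software metrics per module, with fault counts.
cases = [[10, 2], [20, 3], [30, 9], [40, 4], [50, 12]]
faults = [0, 1, 4, 2, 6]
estimate = predict_faults(cases, faults, target=[25, 5], k=3)
```

Closer past modules receive larger weights, so the estimate always lies between the smallest and largest fault counts among the selected neighbors. Swapping `mahalanobis` for another distance, or the weighted sum for a plain average, would give the kind of alternative configurations the study compares.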
Software Quality Journal
Volume 14, Issue 2 , pp 85-111
- Kluwer Academic Publishers
- Software quality
- Case-based reasoning
- Software fault prediction
- Similarity functions
- Solution algorithm
- Software metrics