AutoODC: Automated generation of orthogonal defect classifications


Orthogonal defect classification (ODC), the most influential framework for software defect classification and analysis, provides valuable in-process feedback to system development and maintenance. Applying ODC to an organization’s existing defect reports is labor-intensive and requires expert knowledge of both ODC and the system domain. This paper presents AutoODC, an approach that automates ODC classification by casting it as a supervised text classification problem. Rather than merely applying the standard machine learning framework to this task, we seek a better ODC classification system by integrating experts’ ODC experience and domain knowledge into the learning process through a novel relevance annotation framework. We trained AutoODC using two state-of-the-art machine learning algorithms for text classification, Naive Bayes (NB) and support vector machine (SVM), and evaluated it on both an industrial defect report from the social network domain and a larger defect list extracted from the publicly accessible defect tracker of the open source system FileZilla. AutoODC is a promising approach: it requires only minimal human effort beyond the annotations that standard machine learning approaches typically need, and it achieves overall accuracies of 82.9% (NB) and 80.7% (SVM) on the industrial defect report, and 77.5% (NB) and 75.2% (SVM) on the larger, more diversified open source defect list.
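
To make the idea of casting defect classification as supervised text classification concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over defect report text. It is not the AutoODC implementation: the abstract does not give implementation details, and the sketch omits the relevance annotation framework entirely. The class name `NaiveBayesTextClassifier`, the helper functions, the toy reports, and the category labels are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a defect report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reports, labels):
        self.class_counts = Counter(labels)       # reports per category
        self.word_counts = defaultdict(Counter)   # token counts per category
        self.vocabulary = set()
        for text, label in zip(reports, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_reports = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus summed log likelihoods of the tokens.
            score = math.log(count / total_reports)
            class_total = sum(self.word_counts[label].values())
            for token in tokens:
                smoothed = self.word_counts[label][token] + 1
                score += math.log(smoothed / (class_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(classifier, reports, labels):
    """Fraction of reports whose predicted category matches the given label."""
    hits = sum(classifier.predict(r) == l for r, l in zip(reports, labels))
    return hits / len(labels)


# Hypothetical training data: invented reports with illustrative labels.
TRAIN = [
    ("Server crashes with a null pointer exception when upload starts", "Reliability"),
    ("Application crashes and loses data after connection timeout", "Reliability"),
    ("Login button label is confusing and hard to find", "Usability"),
    ("Menu layout is confusing for new users", "Usability"),
    ("Page load is very slow with large friend lists", "Performance"),
    ("Search is slow and uses too much memory", "Performance"),
]

classifier = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(classifier.predict("The client crashes after a timeout"))
```

A real evaluation would measure `accuracy` on reports held out from training, for example through cross-validation, rather than on the handful of sentences used here.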


  1. Due to proprietary rules, we anonymize the industrial company by referring to it as “Company P” throughout this paper.

  2. The definitions and taxonomy of ODC v5.2 attributes are accessible at

  3. Elgg is an open source social networking engine. The defect (issue) tracker of Elgg can be accessed at

  4. Other stemmers, such as the Porter stemmer (Porter 1980), can be used, but we found that the WordNet stemmer yields slightly better accuracy.

  5. FileZilla is a free FTP solution composed of three subsystems: FileZilla Client, FileZilla Server, and Other. The defect tracker for the three subsystems of FileZilla is accessible at

  6. To train a multi-class SVM classifier, we use \(SVM^{multiclass}\) (Tsochantaridis et al. 2004). To train a multi-class NB classifier, we use the implementation in Weka.


  1. Ahsan, S.N., Ferzund, J., Wotawa, F.: Automatic classification of software change request using multi-label machine learning methods. In: Proceedings of the 33rd IEEE Software Engineering Workshop, pp. 79–86 (2009)

  2. Aizawa, A.: Linguistic techniques to improve the performance of automatic text categorization. In: Proceedings of NLPRS-01, 6th Natural Language Processing Pacific Rim Symposium, pp. 307–314 (2001)

  3. Asuncion, H.U., Asuncion, A.U., Taylor, R.N.: Software traceability with topic modeling. In: Proceedings of the 32nd International Conference on Software Engineering, pp. 95–104 (2010)

  4. Bellucci, S., Portaluri, B.: Automatic calculation of orthogonal defect classification (odc) fields (2012). US Patent 8,214,798

  5. Bridge, N., Miller, C.: Orthogonal defect classification: using defect data to improve software development. Softw. Qual. 3(1), 1–8 (1998)

  6. Caropreso, M., Matwin, S., Sebastiani, F.: A learner independent evaluation of the usefulness of statistical phrases for automated text categorization. In: Chin, A.G. (ed.) Text Databases and Document Management, Theory and Practice, pp. 78–102. Idea Group Publishing, Hershey (2001)

  7. Chawla, N.V., Japkowicz, N., Kotcz, A.: Editorial: special issue on learning from imbalanced data sets. In: SIGKDD Exploration Newsletter, pp. 1–6 (2004)

  8. Chillarege, R.: Orthogonal defect classification. In: Lyu, M. (ed.) Handbook of Software Reliability Engineering, pp. 359–400. McGraw-Hill, New York (1995)

  9. Chillarege, R., Bhandari, I.S., Chaar, J.K., Halliday, M.J., Moebus, D.S., Ray, B.K., Wong, M.Y.: Orthogonal defect classification-a concept for in-process measurements. IEEE Trans. Softw. Eng. 18(11), 943–956 (1992)

  10. Chillarege, R., Biyani, S.: Identifying risk using odc based growth models. In: Proceedings of the 5th International Symposium on Software Reliability Engineering, pp. 282–288 (1994)

  11. Cubranic, D., Murphy, G.C.: Automatic bug triage using text categorization. In: Proceedings of the 6th International Conference on Software Engineering and Knowledge Engineering, pp. 92–97 (2004)

  12. Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)

  13. Gegick, M., Rotella, P., Xie, T.: Identifying security bug reports via text mining: an industrial case study. In: Proceedings of the 7th IEEE Working Conference on Mining Software Repositories, pp. 11–20 (2010)

  14. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newslett. 11(1), 10–18 (2009)

  15. Huang, J., Czauderna, A., Gibiec, M., Emenecker, J.: A machine learning approach for tracing regulatory codes to product specific requirements. In: Proceedings of the 32nd International Conference on Software Engineering, pp. 155–164 (2010)

  16. Hussain, I., Ormandjieva, O., Kosseim, L.: Automatic quality assessment of srs text by means of a decision-tree-based text classifier. In: Proceedings of the 7th International Conference on Quality Software, pp. 209–218 (2007)

  17. Joachims, T.: Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the 10th European Conference on Machine Learning, pp. 137–142. Springer, Berlin (1998)

  18. Kiekel, P., Cooke, N., Foltz, P., Gorman, J., Martin, M.: Some promising results of communication-based automatic measures of team cognition. In: Proceedings of Human Factors and Ergonomics Society: 46th Annual Meeting, pp. 298–302 (2002)

  19. Ko, A., Myers, B.: A linguistic analysis of how people describe software problems. In: IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 127–134 (2006)

  20. Lamkanfi, A., Demeyer, S., Giger, E., Goethals, B.: Predicting the severity of a reported bug. In: Proceedings of the 7th IEEE Working Conference on Mining Software Repositories, pp. 1–10 (2010)

  21. Lin, Z., Ng, H.T., Kan, M.Y.: A pdtb-styled end-to-end discourse parser. Nat. Lang. Eng. 20, 151–184 (2014)

  22. Lutz, R., Mikulski, C.: Empirical analysis of safety-critical anomalies during operations. IEEE Trans. Softw. Eng. 30(3), 172–180 (2004)

  23. Lutz, R., Mikulski, C.: Ongoing requirements discovery in high integrity systems. IEEE Softw. 21(2), 19–25 (2004)

  24. Ma, L., Tian, J.: Analyzing errors and referral pairs to characterize common problems and improve web reliability. In: Proceedings of the 3rd International Conference on Web Engineering, pp. 314–323 (2003)

  25. Ma, L., Tian, J.: Web error classification and analysis for reliability improvement. J. Syst. Softw. 80(6), 795–804 (2007)

  26. Mays, R., Jones, C., Holloway, G., Stundisky, D.: Experiences with defects prevention process. IBM Syst. J. 29(1), 4–32 (1990)

  27. Menzies, T., Lutz, R., Mikulski, C.: Better analysis of defect data at NASA. In: Proceedings of the 5th International Conference on Software Engineering and Knowledge Engineering, pp. 607–611 (2003)

  28. Menzies, T., Marcus, A.: Automated severity assessment of software defect reports. In: Proceedings of the International Conference on Software Maintenance, pp. 346–355 (2008)

  29. Ormandjieva, O., Kosseim, L., Hussain, I.: Toward a text classification system for the quality assessment of software requirements written in natural language. In: Proceedings of the 4th International Workshop on Software Quality Assurance, pp. 39–45 (2007)

  30. Pandita, R., Xiao, X., Yang, W., Enck, W., Xie, T.: Whyper: towards automating risk assessment of mobile application. In: Proceedings of 22nd USENIX Security Symposium, pp. 527–542 (2013)

  31. Polpinij, J., Ghose, A.: An automatic elaborate requirement specification by using hierarchical text classification. In: Proceedings of the 2008 International Conference on Computer Science and Software Engineering, pp. 706–709 (2008)

  32. Porter, M.F.: An algorithm for suffix stripping. Program 14(3), 130–137 (1980)

  33. Rennie, J.D., Shih, L., Teevan, J., Karger, D.R.: Tackling the poor assumption of naive bayes text classifiers. In: Proceedings of International Conference on Machine Learning, pp. 616–623 (2003)

  34. Romano, D., Pinzger, M.: A comparison of event models for naive bayes text classification. In: Proceedings of AAAI Workshop on Learning for Text Categorization, pp. 41–48 (1998)

  35. Sebastiani, F.: Text categorization. In: Zanasi, A. (ed.) Text Mining and Its Applications, pp. 109–129. MIT Press, Cambridge (2005)

  36. Swigger, K., Brazile, R., Dafoulas, G., Serce, F.C., Alpaslan, F.N., Lopez, V.: Using content and text classification methods to characterize team performance. In: Proceedings of the 5th International Conference on Global Software Engineering, pp. 192–200 (2010)

  37. Tamrawi, A., Nguyen, T.T., Al-Kofahi, J., Nguyen, T.N.: Fuzzy set-based automatic bug triaging. In: Proceedings of the 33rd International Conference on Software Engineering, pp. 884–887 (2011)

  38. Thung, F., Lo, D., Jiang, L.: Automatic defect categorization. In: Proceedings of 19th Working Conference on Reverse Engineering, pp. 205–214 (2012)

  39. Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2, 45–66 (2001)

  40. Tsochantaridis, I., Hofmann, T., Joachims, T., Altun, Y.: Support vector machine learning for interdependent and structured output spaces. In: Proceedings of the 21st International Conference on Machine Learning, pp. 104–112 (2004)

  41. Vapnik, V.: The Nature of Statistical Learning. Springer, Berlin (1995)

  42. Yang, C., Hou, C., Kao, W., Chen, I.: An empirical study on improving severity prediction of defect reports using feature selection. In: Proceedings of the 19th Asia-Pacific Software Engineering Conference, pp. 240–249 (2012)

  43. Zheng, J., Williams, L., Nagappan, N., Hudpohl, J.: On the value of static analysis tools for fault detection. IEEE Trans. Softw. Eng. 32(4), 240–253 (2006)


Author information



Corresponding author

Correspondence to Zeheng Li.

About this article

Cite this article

Huang, L., Ng, V., Persing, I. et al. AutoODC: Automated generation of orthogonal defect classifications. Autom Softw Eng 22, 3–46 (2015).


  • Orthogonal defect classification (ODC)
  • Machine learning
  • Natural language processing
  • Text classification