Label prediction on issue tracking systems using text mining


Issue tracking systems are general-purpose change-management tools in software development. The issue-resolution life cycle is a complex socio-technical activity that requires team discussion and knowledge sharing among members. In that process, classifying issues makes them easier to understand and analyze. Issue tracking systems allow issues to be tagged with default labels (e.g., bug, enhancement) or with custom team labels (e.g., test failures, performance). However, many issues in open-source projects remain unlabeled. This paper aims to improve maintenance tasks in development teams by evaluating models that can suggest a label for an issue from its text comments. We analyze issue data from several GitHub trending projects, first extracting issue information and then applying text mining classifiers (i.e., support vector machine and naive Bayes multinomial). The results suggest that classifiers can be obtained that are well suited to labeling issues or, at least, to suggesting the most suitable candidate labels.
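As a rough illustration of the idea of suggesting candidate labels from issue text, the following minimal sketch implements a multinomial naive Bayes classifier with add-one smoothing using only the Python standard library. It is not the experimental setup of the paper: the tokenizer, the class and method names (`MultinomialNaiveBayes`, `rank_labels`), and the four training issues are hypothetical and chosen only for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # issues seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def rank_labels(self, text):
        """Return all known labels, ordered from most to least likely."""
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy issues; real training data would come from a tracker.
training_issues = [
    ("App crashes with null pointer exception on startup", "bug"),
    ("Error thrown when saving file, stack trace attached", "bug"),
    ("Please add dark mode support to the settings page", "enhancement"),
    ("Feature request: add export to CSV option", "enhancement"),
]
texts, labels = zip(*training_issues)
model = MultinomialNaiveBayes().fit(texts, labels)

# The first element is the suggested label; the rest are fallback candidates.
print(model.rank_labels("Crash with exception when opening file"))
```

Returning a ranked list rather than a single label mirrors the weaker goal stated in the abstract: when the top prediction is unreliable, the tool can still propose the most suitable candidates for a person to choose from.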






We would like to thank the Ministerio de Economía y Competitividad of the Spanish Government for financing Project TIN2015-67534-P (MINECO/FEDER, UE) and the Junta de Castilla y León for financing Project BU085P17 (JCyL/FEDER, UE), both co-financed by the European Union's European Regional Development Fund (ERDF/FEDER). We gratefully acknowledge the support of NVIDIA Corporation for the donation of the TITAN Xp GPUs used for this research.

Author information



Corresponding author

Correspondence to Carlos López-Nozal.




See Tables 4 and 5.

Table 4 Methods and configurations that predict the best AUROC for each label
Table 5 Methods and configurations that predict the best optimized \(F_1\) score for each label


About this article


Cite this article

Alonso-Abad, J.M., López-Nozal, C., Maudes-Raedo, J.M. et al. Label prediction on issue tracking systems using text mining. Prog Artif Intell 8, 325–342 (2019).



  • Text classifier
  • Experimentation in software engineering
  • Issue tracker system
  • Text mining
  • Label prediction