Unsupervised software defect prediction using signed Laplacian-based spectral classifier
The lack of available training data is the most common issue in software defect prediction, especially for newly developed projects. Adopting a training dataset from other software projects is unlikely to be the best solution, because software metrics are heterogeneous across projects. Unsupervised approaches, which build the prediction model without a training dataset, have been proposed to address this issue. The spectral classifier is one such unsupervised approach and has been applied successfully when training data are lacking. However, the method runs into a problem when the dataset does not satisfy the nonnegative Laplacian assumption, which happens when the adjacency matrix contains negative values. The spectral classifier works with the Laplacian matrix, and the Laplacian matrix is constructed from the adjacency matrix. In this paper, a signed Laplacian-based spectral classifier is proposed to solve the problem of negative values in the adjacency matrix by converting the negative values into absolute values. The experimental results show that the proposed method improves the performance of unsupervised classification compared with the unsigned Laplacian-based spectral classifier. Hence, the proposed method is recommended for unsupervised software defect prediction in software projects that have no historical dataset.
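To make the idea concrete, the following is a minimal sketch of a signed Laplacian and a two-way spectral split. It is not the authors' implementation: the abstract does not give the exact steps, so the sketch assumes the commonly used definition L = D - W, where the degree matrix D is built from the absolute values of the adjacency matrix W, and it assumes the split is taken from the signs of the eigenvector with the smallest eigenvalue. The function names `signed_laplacian` and `spectral_bipartition` are hypothetical, and deciding which of the two groups is the defect-prone one is outside the scope of the sketch.

```python
import numpy as np


def signed_laplacian(W):
    """Signed Laplacian L = D - W with D_ii = sum_j |W_ij|.

    Assumption: this is the commonly used signed Laplacian definition;
    the abstract only says negative entries are converted to absolute values.
    """
    W = np.asarray(W, dtype=float)
    # Degrees come from absolute weights, so negative edges cannot
    # cancel positive ones and L stays positive semidefinite.
    degrees = np.abs(W).sum(axis=1)
    return np.diag(degrees) - W


def spectral_bipartition(W):
    """Split nodes into two groups by the sign of the eigenvector that
    belongs to the smallest eigenvalue of the signed Laplacian.

    Assumption: using the smallest eigenvector is one common convention
    for signed graphs, not necessarily the paper's choice.
    """
    L = signed_laplacian(W)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(L)
    return eigenvectors[:, 0] >= 0


# Illustrative adjacency matrix (made-up values): modules 0 and 1 are
# positively related to each other and negatively related to module 2.
W = np.array([
    [0.0, 1.0, -1.0],
    [1.0, 0.0, -1.0],
    [-1.0, -1.0, 0.0],
])
groups = spectral_bipartition(W)
print(groups)
```

In this toy example modules 0 and 1 fall into one group and module 2 into the other. Which group is printed as `True` depends on the arbitrary sign of the eigenvector returned by the solver.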
Keywords: Unsupervised software defect prediction; Spectral clustering; Absolute adjacency matrix; Signed Laplacian; Unsigned Laplacian
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Human and animal participants
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent was obtained from all individual participants included in the study.