Abstract
Imbalanced data typically refers to a skewed class distribution and underrepresented data, both of which degrade the performance of learning algorithms. Such data are common in real-life applications such as behavior analysis, cancer malignancy grading, industrial systems monitoring, and software defect prediction. In this paper, we present the W-PSO method, which combines the weighting of instances in a dataset with the Particle Swarm Optimization algorithm. The method was combined with the C4.5 and Naive Bayes classification methods, respectively, and tested experimentally on ten freely accessible software defect prediction datasets. Based on the results, the W-PSO method creates better classification models than C4.5 and Naive Bayes alone in the majority of cases.
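To make the idea of pairing instance weighting with Particle Swarm Optimization more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each particle encodes one weight per training instance, that a weighted Gaussian Naive Bayes stands in for the classifier, and that fitness is balanced accuracy measured on the training data. The abstract specifies none of these details. The helper names (`make_toy_data`, `fit_weighted_nb`, `w_pso`), the PSO coefficients, and the weight bounds are all hypothetical.

```python
import math
import random

def make_toy_data(seed=1):
    """Hypothetical imbalanced 2-D data: 40 majority (0) vs. 6 minority (1) instances."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
    X += [[rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)] for _ in range(6)]
    return X, [0] * 40 + [1] * 6

def fit_weighted_nb(X, y, w):
    """Gaussian Naive Bayes whose priors, means and variances use instance weights."""
    model, total = {}, sum(w)
    for c in sorted(set(y)):
        idx = [i for i in range(len(y)) if y[i] == c]
        wc = sum(w[i] for i in idx)
        stats = []
        for d in range(len(X[0])):
            mean = sum(w[i] * X[i][d] for i in idx) / wc
            var = sum(w[i] * (X[i][d] - mean) ** 2 for i in idx) / wc + 1e-6
            stats.append((mean, var))
        model[c] = (math.log(wc / total), stats)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_post(c):
        prior, stats = model[c]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                           for xi, (m, v) in zip(x, stats))
    return max(model, key=log_post)

def balanced_accuracy(model, X, y):
    """Mean per-class recall, so the minority class counts as much as the majority."""
    recalls = []
    for c in sorted(set(y)):
        idx = [i for i in range(len(y)) if y[i] == c]
        recalls.append(sum(predict(model, X[i]) == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def w_pso(X, y, n_particles=10, n_iter=30, lo=0.05, hi=1.0, seed=0):
    """Global-best PSO over instance-weight vectors (assumed encoding and settings)."""
    rng = random.Random(seed)
    n = len(y)

    def fitness(w):
        return balanced_accuracy(fit_weighted_nb(X, y, w), X, y)

    # The first particle is the uniform weighting, so the search starts from the baseline.
    pos = [[hi] * n] + [[rng.uniform(lo, hi) for _ in range(n)]
                        for _ in range(n_particles - 1)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for k in range(n_particles):
            for j in range(n):
                # Inertia, cognitive and social terms of the standard velocity update.
                vel[k][j] = (0.7 * vel[k][j]
                             + 1.5 * rng.random() * (pbest[k][j] - pos[k][j])
                             + 1.5 * rng.random() * (gbest[j] - pos[k][j]))
                pos[k][j] = min(hi, max(lo, pos[k][j] + vel[k][j]))
            f = fitness(pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit
```

Calling `w_pso(*make_toy_data())` returns the best weight vector found and its fitness. Because the uniform weighting is one of the initial particles, the returned fitness cannot fall below that of the unweighted classifier on the same data. Scoring on the training set is a simplification for brevity; a held-out validation split would be the more careful choice.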
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Brezočnik, L., Podgorelec, V. (2019). Applying Weighted Particle Swarm Optimization to Imbalanced Data in Software Defect Prediction. In: Karabegović, I. (eds) New Technologies, Development and Application. NT 2018. Lecture Notes in Networks and Systems, vol 42. Springer, Cham. https://doi.org/10.1007/978-3-319-90893-9_35
DOI: https://doi.org/10.1007/978-3-319-90893-9_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90892-2
Online ISBN: 978-3-319-90893-9
eBook Packages: Intelligent Technologies and Robotics (R0)