Abstract
This study presents a classification model, BGA, that combines the Bayesian method with a genetic algorithm and introduces three improvements. First, the prior probabilities in each iteration are set to the ratio of the number of elements in each group, obtained by clustering, to the total number of elements in the training set. Second, an automatic selection process optimizes the training set to minimize the classification error. Third, the traditional genetic algorithm operators are improved, with the Bayes error serving as the objective function. Together, these improvements yield an effective classification model. The BGA is implemented as a MATLAB procedure and applied to real data. A numerical example illustrates its advantage over existing methods. The study also applies the BGA to image classification, using the Gabor filter to extract essential image features. The proposed model outperforms popular methods in classifying a range of numerical and image datasets, which highlights its potential in real-world applications.
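The first improvement, priors taken as cluster-size ratios, can be illustrated with a minimal Python sketch. This is not the authors' MATLAB procedure: the function names, the one-dimensional Gaussian densities, and the toy cluster labels are all assumptions made for illustration, and the clustering step and the genetic search over training sets are not shown.

```python
from collections import Counter
import math


def cluster_ratio_priors(cluster_labels):
    """Prior of each group = (elements in the group) / (total elements)."""
    counts = Counter(cluster_labels)
    total = len(cluster_labels)
    return {group: n / total for group, n in counts.items()}


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution (an assumed density model)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_classify(x, priors, params):
    """Assign x to the group that maximizes prior * density."""
    return max(priors, key=lambda g: priors[g] * gaussian_pdf(x, *params[g]))


# Hypothetical cluster assignment of a 10-element training set.
labels = ["A"] * 6 + ["B"] * 4
priors = cluster_ratio_priors(labels)

# Assumed (mean, std) of each group's density; a real model would estimate these.
params = {"A": (0.0, 1.0), "B": (3.0, 1.0)}
predicted = bayes_classify(0.5, priors, params)
```

In this sketch the priors are recomputed from whatever cluster labels are supplied, so a procedure that re-clusters in every iteration would simply call `cluster_ratio_priors` again with the new labels.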
Acknowledgements
This research is funded by Van Lang University, Viet Nam under grant number 13/2022/HD-NCKH.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
About this article
Cite this article
Pham-Toan, D., Vo-Van, T. Building the classification model based on the genetic algorithm and the improved Bayesian method. Int J Data Sci Anal (2023). https://doi.org/10.1007/s41060-023-00436-2
DOI: https://doi.org/10.1007/s41060-023-00436-2