Building the classification model based on the genetic algorithm and the improved Bayesian method

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

This study presents a classification model, BGA, that combines the Bayesian method with a genetic algorithm and introduces three enhancements. First, the prior probabilities in each iteration are set to the ratio of the number of elements in each group, as obtained by clustering, to the total number of elements in the training set. Second, an automatic selection process optimizes the training set to minimize classification error. Third, the traditional genetic algorithm operators are improved by using the Bayes error as the objective function. Together, these improvements yield an effective classification model. The BGA is implemented as a MATLAB procedure and performs well on real data. A numerical example illustrates the advantage of the proposed algorithm over existing methods. The study also applies the BGA to image classification, using the Gabor filter to extract essential image features. The proposed model outperforms popular methods in classifying various numerical and image datasets, which highlights its potential in real-world scenarios.
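The first and third ideas in the abstract (priors taken from group sizes, and a classification error used as the objective) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB procedure, which is not shown in this preview. The one-dimensional Gaussian class model, the toy data, and every function name below are assumptions made for illustration, and the empirical misclassification rate only stands in for the Bayes error.

```python
from collections import Counter
from math import exp, pi, sqrt
from statistics import mean, pstdev


def cluster_priors(group_labels):
    """Prior of each group = (elements in the group) / (total elements)."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return {group: n / total for group, n in counts.items()}


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D normal distribution (assumed class-conditional model)."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def fit_groups(values, group_labels):
    """Estimate (mean, standard deviation) per group from 1-D training values."""
    params = {}
    for group in set(group_labels):
        members = [v for v, g in zip(values, group_labels) if g == group]
        # Floor the spread so a single-valued group does not divide by zero.
        params[group] = (mean(members), max(pstdev(members), 1e-6))
    return params


def bayes_classify(x, priors, params):
    """Assign x to the group that maximizes prior * density."""
    return max(priors, key=lambda g: priors[g] * gaussian_pdf(x, *params[g]))


def error_rate(values, true_labels, priors, params):
    """Empirical misclassification rate, a stand-in for the Bayes error."""
    wrong = sum(
        bayes_classify(v, priors, params) != t
        for v, t in zip(values, true_labels)
    )
    return wrong / len(values)


# Hypothetical toy data: the group labels play the role of a clustering result.
train_values = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3]
train_groups = ["a", "a", "a", "a", "b", "b"]

priors = cluster_priors(train_groups)
params = fit_groups(train_values, train_groups)
print(priors)
print(bayes_classify(4.8, priors, params))
print(error_rate(train_values, train_groups, priors, params))
```

In a genetic-algorithm setting, a quantity such as `error_rate` could serve as the fitness of a candidate training subset, with lower values preferred. How the paper encodes candidates and modifies the genetic operators is not recoverable from the abstract, so that part is left out.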

Acknowledgements

This research is funded by Van Lang University, Viet Nam under grant number 13/2022/HD-NCKH.

Author information

Corresponding author

Correspondence to Tai Vo-Van.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pham-Toan, D., Vo-Van, T. Building the classification model based on the genetic algorithm and the improved Bayesian method. Int J Data Sci Anal (2023). https://doi.org/10.1007/s41060-023-00436-2

  • DOI: https://doi.org/10.1007/s41060-023-00436-2
