Abstract
Logistic regression is an efficient machine learning procedure used to build a mathematical model that assigns an input to one of a number of preset classes. Standard classification approaches have two main limitations. The first is their sensitivity to model structure. The second is their sensitivity to the chosen value of the regularization parameter \(\lambda\), which affects the estimated generalization error of a candidate model: a poorly chosen value may cause underfitting or overfitting. This research proposes a new approach for building a classifier model based on logistic regression. The new algorithm uses a genetic algorithm to choose an effective model structure and a proposed procedure to tune the regularization parameter, with the aim of finding better model parameters. A case study on predicting diabetes in patients using the Pima Indian data set is included. The results showed that the proposed approach is highly effective, as it reached a simpler model with higher accuracy.
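The general idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm from the paper: the abstract does not specify the chromosome encoding, the fitness function, the genetic operators, or the tuning procedure for \(\lambda\), so the binary feature mask, validation-accuracy fitness, tournament selection, one-point crossover, and fixed \(\lambda\) grid below are all placeholders. The function names (`fit_logreg`, `ga_select`) are hypothetical, and the data are synthetic rather than the Pima Indian data set.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def fit_logreg(X, y, lam, steps=300, lr=0.5):
    """Gradient descent on the L2-regularized logistic loss (bias not penalized)."""
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])
    theta = np.zeros(d + 1)
    for _ in range(steps):
        grad = Xb.T @ (sigmoid(Xb @ theta) - y) / n
        grad[1:] += lam * theta[1:] / n
        theta -= lr * grad
    return theta

def accuracy(theta, X, y):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return float(np.mean((sigmoid(Xb @ theta) >= 0.5) == y))

def fitness(mask, lambdas, Xtr, ytr, Xva, yva):
    """Best validation accuracy over the lambda grid for the masked features."""
    if not mask.any():
        return 0.0, None
    cols = np.flatnonzero(mask)
    scores = [accuracy(fit_logreg(Xtr[:, cols], ytr, lam), Xva[:, cols], yva)
              for lam in lambdas]
    best = int(np.argmax(scores))
    return scores[best], lambdas[best]

def ga_select(Xtr, ytr, Xva, yva, lambdas=(0.0, 0.1, 1.0, 10.0),
              pop_size=12, generations=8, p_mut=0.1, seed=0):
    """Assumed GA: each individual is a boolean mask over the input features."""
    rng = np.random.default_rng(seed)
    d = Xtr.shape[1]
    pop = rng.random((pop_size, d)) < 0.5
    best = (-1.0, None, None)  # (accuracy, mask, lambda)
    for _ in range(generations):
        scored = [fitness(ind, lambdas, Xtr, ytr, Xva, yva) for ind in pop]
        fits = np.array([s[0] for s in scored])
        i = int(np.argmax(fits))
        if fits[i] > best[0]:
            best = (float(fits[i]), pop[i].copy(), scored[i][1])

        def tournament():
            a, b = rng.integers(pop_size, size=2)
            return pop[a] if fits[a] >= fits[b] else pop[b]

        children = []
        for _ in range(pop_size):
            p1, p2 = tournament(), tournament()
            cut = rng.integers(1, d)                 # one-point crossover
            child = np.concatenate([p1[:cut], p2[cut:]])
            child ^= rng.random(d) < p_mut           # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    return best

# Synthetic demo: only the first two of six features carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
y = (X[:, 0] - 2 * X[:, 1] + 0.3 * rng.normal(size=400) > 0).astype(float)
acc, mask, lam = ga_select(X[:300], y[:300], X[300:], y[300:])
print("validation accuracy:", acc)
print("selected features:", mask.astype(int), "lambda:", lam)
```

Searching the feature mask and \(\lambda\) together means each candidate structure is scored at its own best regularization strength, which is one plausible reading of why the two sensitivities named above are addressed jointly. A real study would score candidates with cross-validation rather than the single hold-out split used here.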
Cite this article
Aly, W.M. A New Approach for Classifier Model Selection and Tuning Using Logistic Regression and Genetic Algorithms. Arab J Sci Eng 41, 5195–5204 (2016). https://doi.org/10.1007/s13369-016-2223-2
DOI: https://doi.org/10.1007/s13369-016-2223-2