Abstract
If left untreated, the high blood sugar levels of diabetes mellitus (DM) can cause cardiac arrest, nervous system damage, vision loss, foot problems, liver or kidney damage, and death. Age, gender, family history, BMI, and glucose levels are all associated with diabetes. Machine learning techniques are used for prediction in order to improve diabetes detection and prevent these complications. Identifying the type of diabetes and considering the risk of accompanying diseases can improve prediction accuracy. This study combines three feature selection methods, one-way analysis of variance (ANOVA), mutual information, and F-regressor, with four classifiers: random forest, Gaussian Naive Bayes, support vector machine, and decision tree. Results with and without feature selection are compared, and the models are evaluated using accuracy, precision, recall, and F1-score. Random forest (RF) with F-regressor (FR) or ANOVA feature selection, run over several settings with N (75) and K (3–5), outperforms the other models with an accuracy of 0.9. These results suggest that the proposed technique for classifying diabetes-related DNA sequences is effective.
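The abstract does not describe an implementation, so the following is only a minimal pure-Python sketch of two ideas it names: ranking features by a one-way ANOVA F score, and computing accuracy, precision, recall, and F1-score. The function names, the toy data, and the reading of K as "number of features kept" are all assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of one feature across the class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    group_means = {label: mean(g) for label, g in groups.items()}
    grand_mean = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(
        len(g) * (group_means[label] - grand_mean) ** 2
        for label, g in groups.items()
    )
    # Variation of the samples around their own class mean.
    ss_within = sum(
        (v - group_means[label]) ** 2 for label, g in groups.items() for v in g
    )
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def select_k_best(rows, labels, k):
    """Indices of the k features with the highest ANOVA F scores."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: features 0 and 2 differ between classes, 1 and 3 do not.
toy_rows = [
    [1.0, 5.0, 10.0, 3.0],
    [1.2, 3.0, 11.0, 3.1],
    [0.9, 4.0, 10.5, 2.9],
    [3.0, 4.5, 20.0, 3.0],
    [3.1, 3.5, 21.0, 3.2],
    [2.9, 4.0, 19.5, 2.8],
]
toy_labels = [0, 0, 0, 1, 1, 1]
print(select_k_best(toy_rows, toy_labels, k=2))
print(binary_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))
```

In practice a library routine would normally replace the hand-written scoring, and the selected columns would then be passed to a classifier such as a random forest; the sketch only makes the scoring and evaluation steps concrete.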
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
AL Raheim Hamza, L.A., Lafta, H.A., Al Rashid, S.Z. (2024). Classification of DNA Sequence for Diabetes Mellitus Type Using Machine Learning Methods. In: Sharma, D.K., Peng, SL., Sharma, R., Jeon, G. (eds) Micro-Electronics and Telecommunication Engineering. ICMETE 2023. Lecture Notes in Networks and Systems, vol 894. Springer, Singapore. https://doi.org/10.1007/978-981-99-9562-2_8
DOI: https://doi.org/10.1007/978-981-99-9562-2_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9561-5
Online ISBN: 978-981-99-9562-2
eBook Packages: Engineering, Engineering (R0)