Abstract
Rheumatoid arthritis (RA) is an inflammatory disease characterized by persistent synovitis and autoantibodies that eventually lead to joint damage and reduced quality of life. This paper applies two data mining methodologies to identify the attributes most strongly correlated with RA disease activity: (1) feature selection algorithms whose output is used by association rule algorithms (Apriori and Predictive Apriori) or classification algorithms (J48 and J48 Consolidated), and (2) predictive rules (Rule Induction), feature weighting (Information Gain), and tree algorithms (CHAID). The study uses a pre-collected dataset of 260 patient records with a confirmed diagnosis of RA. The algorithms are evaluated in terms of F-measure, accuracy, and the output tree. The J48 classification algorithm achieved an accuracy of 79.18%. Among the association rule algorithms, the Predictive Apriori technique uncovered many new rules. The Information Gain algorithm identified the attributes most highly correlated with the disease. The study yields a model that validates previous RA studies and includes new parameters covering both non-pharmacologic measures (no smoking, physical exercise, and patient compliance) and pharmacologic therapies (MTX dose above 20 mg/week, prednisone dose >5 mg/day as add-on therapy, and biologic DMARDs, with adalimumab preferred in our study), together with Hb > 10.8 g/dl. The model could help RA patients achieve well-controlled, low disease activity.
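To make the feature-weighting step concrete, the following is a minimal sketch of how Information Gain can rank attributes against a disease-activity label. It is not the study's implementation: the attribute names (`smoker`, `exercise`), the label `activity`, and the eight toy records are hypothetical and invented purely for illustration, and the standard entropy-based definition of information gain is assumed.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical toy records, NOT taken from the study's 260-patient dataset.
records = [
    {"smoker": "yes", "exercise": "no", "activity": "high"},
    {"smoker": "yes", "exercise": "no", "activity": "high"},
    {"smoker": "yes", "exercise": "yes", "activity": "high"},
    {"smoker": "no", "exercise": "yes", "activity": "low"},
    {"smoker": "no", "exercise": "yes", "activity": "low"},
    {"smoker": "no", "exercise": "no", "activity": "low"},
    {"smoker": "yes", "exercise": "no", "activity": "low"},
    {"smoker": "no", "exercise": "no", "activity": "high"},
]

# Rank the candidate attributes by their information gain, highest first.
ranking = sorted(
    ((attr, information_gain(records, attr, "activity")) for attr in ("smoker", "exercise")),
    key=lambda pair: pair[1],
    reverse=True,
)

for attr, gain in ranking:
    print(f"{attr}: {gain:.4f}")
```

Attributes with a higher gain split the records into groups that are more homogeneous in disease activity, which is the sense in which a feature-weighting method of this kind flags an attribute as important.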
Acknowledgements
We are grateful to Jordan University of Science and Technology for support in providing patient data.
Funding
None.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
AlQudah, M.M., Otair, M.A., Alqudah, M.A.Y. et al. Prediction of hidden patterns in rheumatoid arthritis patients records using data mining. Multimed Tools Appl 82, 369–388 (2023). https://doi.org/10.1007/s11042-022-13331-y
DOI: https://doi.org/10.1007/s11042-022-13331-y