Abstract
Thyroid disease is spreading rapidly among women over the age of 30. It is therefore necessary to examine thyroid datasets so that the disease can be predicted at an early stage and precautions can be taken before it progresses to the dangerous condition of thyroid cancer. Decision trees are used to extract hidden patterns from stored datasets. The objective of this paper is to examine a thyroid disease dataset using decision tree, random forest, and classification and regression tree (CART) classifiers and then to improve on their results using the bagging ensemble technique. The experiment was carried out on 3710 instances of thyroid patients with 29 features. The prediction depends on the target variable, which is divided into the sick and negative classes. Prediction accuracy was calculated for different num-fold and seed values. The individual classification algorithms, namely decision tree, random forest tree, and extra tree, give accuracies of 98%, 99%, and 93%, respectively. We then developed a bagging ensemble method that combines the three basic tree classifiers and applied it to the same dataset, where it gives a better accuracy of 100% for a seed value of 35 and a num-fold value of 10. The proposed ensemble method can be used for better prediction of thyroid disease.
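The evaluation described in the abstract (bagging over tree classifiers, scored by cross-validation for a given num-fold and seed) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tooling, so this version uses only the Python standard library, substitutes depth-1 decision trees (stumps) for the decision tree, random forest, and extra tree learners, and runs on synthetic two-class data rather than the thyroid dataset. All function names (`fit_stump`, `fit_bagging`, `cross_val_accuracy`, `make_toy_data`) are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Fit a depth-1 tree: choose the (feature, threshold) split with the fewest errors."""
    fallback = majority(labels)
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its own majority class (overall majority if empty).
            l_pred = majority(left) if left else fallback
            r_pred = majority(right) if right else fallback
            errors = sum(y != l_pred for y in left) + sum(y != r_pred for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_pred, r_pred)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_pred, r_pred = stump
    return l_pred if row[f] <= t else r_pred


def fit_bagging(rows, labels, n_estimators, rng):
    """Bagging: train each base learner on a bootstrap sample drawn with replacement."""
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the base learners by majority vote."""
    return majority([predict_stump(m, row) for m in models])


def cross_val_accuracy(rows, labels, num_folds, seed, n_estimators=7):
    """Shuffle with the given seed, split into num_folds folds, return overall accuracy."""
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)
    folds = [order[i::num_folds] for i in range(num_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        models = fit_bagging([rows[i] for i in train],
                             [labels[i] for i in train], n_estimators, rng)
        correct += sum(predict_bagging(models, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)


def make_toy_data(n, seed):
    """Synthetic stand-in for the patient records: two features, two classes."""
    rng = random.Random(seed)
    rows = [[rng.random(), rng.random()] for _ in range(n)]
    labels = ["sick" if r[0] > 0.6 else "negative" for r in rows]
    return rows, labels


rows, labels = make_toy_data(120, seed=0)
accuracy = cross_val_accuracy(rows, labels, num_folds=10, seed=35)
print(f"10-fold accuracy with seed 35: {accuracy:.3f}")
```

Because both the shuffle and the bootstrap sampling draw from a generator seeded with `seed`, repeating a run with the same num-fold and seed values reproduces the same accuracy, which is why results are reported per (num-fold, seed) pair.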
Acknowledgments
The author is grateful to Veer Bahadur Singh Purvanchal University, Jaunpur, Uttar Pradesh, for providing financial support in the form of a Postdoctoral Research Fellowship.
About this article
Cite this article
Yadav, D.C., Pal, S. Prediction of thyroid disease using decision tree ensemble method. Hum.-Intell. Syst. Integr. 2, 89–95 (2020). https://doi.org/10.1007/s42454-020-00006-y
DOI: https://doi.org/10.1007/s42454-020-00006-y