Abstract
Previous data show that diabetes has led to increasing mortality and considerable financial expenditure in the US. Correct diagnosis and prescription play an important role in helping patients with diabetes. We therefore choose a dataset of diabetic inpatients diagnosed at hospitals in the US and predict how different treatments and medications influence patient outcomes. We use the readmission number as the class attribute.
Because the dataset is large and biased, we first remove attributes with a high rate of missing values and reduce the class imbalance by over-sampling and under-sampling. We then select attributes using several methods, including correlation-based feature selection, the Chi-Squared Attribute Evaluator, and the Information Gain Attribute Evaluator.
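The rebalancing and attribute-ranking steps can be illustrated with a minimal sketch. It is not the Weka workflow used in the paper: the attribute names (`med_changed`, `gender`), the `YES`/`NO` labels, the toy rows, and the per-class target size are all hypothetical, and only random resampling and information gain are shown.

```python
import math
import random
from collections import Counter

def rebalance(rows, label_key, target_size, seed=0):
    """Resample every class to target_size rows (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    balanced = []
    for members in by_class.values():
        if len(members) >= target_size:
            # Under-sample: draw without replacement from a large class.
            balanced.extend(rng.sample(members, target_size))
        else:
            # Over-sample: keep all rows and add random duplicates.
            extra = [rng.choice(members) for _ in range(target_size - len(members))]
            balanced.extend(members + extra)
    return balanced

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, label_key):
    """Entropy of the class minus its expected entropy after splitting on attr."""
    base = entropy([r[label_key] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[label_key])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Toy, made-up data: 8 rows of the majority class and 2 of the minority class.
toy = (
    [{"med_changed": "no", "gender": "F", "readmitted": "NO"}] * 4
    + [{"med_changed": "no", "gender": "M", "readmitted": "NO"}] * 4
    + [{"med_changed": "yes", "gender": "F", "readmitted": "YES"},
       {"med_changed": "yes", "gender": "M", "readmitted": "YES"}]
)

balanced = rebalance(toy, "readmitted", target_size=4)
counts = Counter(r["readmitted"] for r in balanced)
gains = {a: information_gain(balanced, a, "readmitted") for a in ("med_changed", "gender")}
ranked = sorted(gains, key=gains.get, reverse=True)
```

In this toy setting `med_changed` separates the two classes perfectly, so it ranks at least as high as `gender`. On real data the ranking would depend on the resampled instances.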
Three classification methods, C4.5, RIPPER, and Random Forests, are used for prediction in Weka. We also use the ensemble learning methods bagging and boosting to improve stability and accuracy. The results show that C4.5 and RIPPER perform better under ensemble learning: both bagging and boosting increase their accuracy to differing degrees, because both algorithms are somewhat unstable. Random Forests is the best performer among all the classification methods we use, and after applying boosting we see large increases in the values of our evaluation metrics. The final outcome is much better than random guessing.
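To make the bagging idea concrete, here is a minimal sketch that trains a simple base learner on bootstrap samples and combines the models by majority vote. It is an illustration only, not Weka's implementation and not the paper's experiment: the one-threshold "stump" stands in for an unstable learner such as C4.5, and the numeric feature and toy data are hypothetical. Boosting is not shown.

```python
import random
from collections import Counter

def train_stump(data):
    """Fit a one-threshold classifier on (x, label) pairs with numeric x."""
    best = None
    for t in sorted({x for x, _ in data}):
        for left, right in (("NO", "YES"), ("YES", "NO")):
            # Count training errors for the rule: x <= t -> left, else right.
            errors = sum((left if x <= t else right) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging(data, n_models=25, seed=0):
    """Train n_models stumps on bootstrap samples; predict by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Toy, made-up data: x is a hypothetical numeric attribute.
data = [(1, "NO"), (2, "NO"), (3, "NO"), (4, "NO"),
        (6, "YES"), (7, "YES"), (8, "YES"), (9, "YES")]

predict = bagging(data)
low = predict(1)
high = predict(9)
```

Each stump sees a slightly different sample, so individual thresholds vary. The vote averages out that variation, which is why bagging tends to help unstable learners most.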
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, Y. (2020). Research and Application on Ensemble Learning Methods. In: Deng, Z. (eds) Proceedings of 2019 Chinese Intelligent Automation Conference. CIAC 2019. Lecture Notes in Electrical Engineering, vol 586. Springer, Singapore. https://doi.org/10.1007/978-981-32-9050-1_17
DOI: https://doi.org/10.1007/978-981-32-9050-1_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9049-5
Online ISBN: 978-981-32-9050-1
eBook Packages: Intelligent Technologies and Robotics (R0)