Physical Examination Data Based Cataract Risk Analysis

Journal of Systems Science and Systems Engineering

Abstract

Cataract is a very common eye disease and the leading cause of blindness. Given its burden on society, this study tests the risk factors of cataract and builds robust machine learning models that use these factors to predict cataract risk. The data were collected by a physical examination center in Shanghai, China, and cover more than 120,000 examinees and about 500 physical examination metrics. First, association rules were used to select 39 abnormalities that are more likely to be associated with cataract, and the significance of these abnormalities was then tested with univariate and multivariate analysis. The results indicate that age, diabetes, refractive error, retinal arteriosclerosis, thyroid nodules, and incomplete mammary gland degeneration significantly increase the likelihood of cataract. Various machine learning models were compared on their performance in predicting cataract risk from these six factors; logistic regression and decision-tree-based ensemble methods outperformed the others, reaching a test-set AUC of up to 0.84.
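
To make the association-rule screening step more concrete, the following is a minimal sketch of how support, confidence, and lift could be computed for a rule of the form "abnormality → cataract". It is an illustration only, not the authors' implementation: the function name `rule_metrics`, the record layout, and the toy records are all assumptions invented for this example, and the abstract does not state which thresholds or which rule-mining algorithm the study used.

```python
from typing import Dict, List, Set


def rule_metrics(records: List[Set[str]], antecedent: str, consequent: str) -> Dict[str, float]:
    """Return support, confidence and lift of the rule antecedent -> consequent.

    Each record is the set of findings flagged for one examinee.
    (Hypothetical helper; not taken from the paper.)
    """
    n = len(records)
    n_a = sum(1 for r in records if antecedent in r)
    n_c = sum(1 for r in records if consequent in r)
    n_ac = sum(1 for r in records if antecedent in r and consequent in r)

    support = n_ac / n                          # P(antecedent and consequent)
    confidence = n_ac / n_a if n_a else 0.0     # P(consequent | antecedent)
    base_rate = n_c / n                         # P(consequent)
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy records, invented purely for illustration (not study data).
records = [
    {"diabetes", "cataract"},
    {"diabetes", "cataract"},
    {"diabetes"},
    {"refractive_error", "cataract"},
    {"refractive_error"},
    {"thyroid_nodules"},
    set(),
    set(),
]

metrics = rule_metrics(records, "diabetes", "cataract")
print(metrics)
```

A lift above 1 means the finding co-occurs with cataract more often than the overall cataract rate would suggest, which is the sense in which a rule can flag an abnormality as a candidate risk factor before formal significance testing.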

Acknowledgments

This work has been supported by the National Key R&D Program of China under Grant No. 2020AAA0103800.

Author information

Corresponding author

Correspondence to Yongbo Xiao.

Additional information

Jianqiao Hao is a master's student at the School of Economics and Management, Tsinghua University, China.

Yongbo Xiao is a tenured professor at the School of Economics and Management, Tsinghua University, China. He received his Ph.D. and M.A. in Management Science and Engineering in 2006, and his B.E. in Management Information Systems in 2000, all from Tsinghua University. His research interests include revenue and pricing management, service management, supply chain management, and healthcare management. His research papers have been published in international journals such as Operations Research, Production and Operations Management, Decision Sciences, Naval Research Logistics, and IIE Transactions.

Shudi Du is a Ph.D. student at the School of Economics and Management, Tsinghua University, China. She received her master's degree in Management and Systems from New York University in 2018. Her research interests include public healthcare and smart cities.

About this article

Cite this article

Hao, J., Xiao, Y. & Du, S. Physical Examination Data Based Cataract Risk Analysis. J. Syst. Sci. Syst. Eng. 30, 198–214 (2021). https://doi.org/10.1007/s11518-021-5477-5

  • DOI: https://doi.org/10.1007/s11518-021-5477-5
