Abstract
Background and Aims
Nonalcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease worldwide. Risk factors for NAFLD progression and liver-related outcomes remain incompletely understood because computational methods for identifying patients with NAFLD are lacking. The present study sought to design a classification algorithm for NAFLD within the electronic medical record (EMR) to support the development of large-scale longitudinal cohorts.
Methods
We performed feature selection using logistic regression with adaptive LASSO. A training set of 620 patients was randomly selected from the Research Patient Data Registry at Partners Healthcare. To establish the true NAFLD diagnosis, we performed chart reviews and accepted either documentation of a biopsy or a clinical diagnosis of NAFLD. Candidate model variables included laboratory measurements, diagnosis codes, and concepts extracted from medical notes. Variables with P < 0.05 were included in the multivariable analysis.
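As an illustration of the feature-selection step, the following is a minimal sketch of a two-step adaptive LASSO for logistic regression. It is not the study's implementation: the use of scikit-learn, the ridge-based initial estimates, the function name `adaptive_lasso_logistic`, the tuning values, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def adaptive_lasso_logistic(X, y, C=0.05, gamma=1.0, eps=1e-6):
    """Illustrative two-step adaptive LASSO (hypothetical helper, not the authors' code)."""
    # Step 1: initial coefficient estimates from a ridge-penalized logistic fit
    # (scikit-learn's default penalty is L2).
    init = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)
    # Adaptive weights: features with small initial coefficients are penalized more.
    weights = 1.0 / (np.abs(init.coef_.ravel()) ** gamma + eps)
    # Step 2: dividing each column by its weight turns a plain L1 penalty
    # into a weighted L1 penalty on the original coefficients.
    X_scaled = X / weights
    l1 = LogisticRegression(penalty="l1", solver="liblinear", C=C, max_iter=1000)
    l1.fit(X_scaled, y)
    # Map the coefficients back to the original feature scale.
    coef = l1.coef_.ravel() / weights
    return coef, l1.intercept_[0]


# Synthetic data (assumption): 8 candidate variables, only the first two informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
p = 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1])))
y = (rng.random(400) < p).astype(int)

coef, intercept = adaptive_lasso_logistic(X, y)
selected = np.flatnonzero(coef)  # indices of variables retained by the penalty
```

In practice the penalty strength (`C` here) would be chosen by cross-validation or an information criterion rather than fixed in advance.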
Results
The NAFLD classification algorithm included the number of natural language mentions of NAFLD in the EMR, the lifetime number of ICD-9 codes for NAFLD, and triglyceride level. This classification algorithm was superior to an algorithm using ICD-9 data alone, with an AUC of 0.85 versus 0.75 (P < 0.0001), and led to the creation of a new independent cohort of 8458 individuals with a high probability of NAFLD.
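A comparison of two classifiers by AUC, like the one reported above, can be sketched as follows. The abstract does not state which statistical test produced the P value, so the paired bootstrap used here is an assumption, as are the function name `bootstrap_auc_difference` and the synthetic scores.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auc_difference(y, score_a, score_b, n_boot=1000, seed=0):
    """Paired bootstrap of AUC(a) - AUC(b) on the same patients (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    score_a = np.asarray(score_a)
    score_b = np.asarray(score_b)
    observed = roc_auc_score(y, score_a) - roc_auc_score(y, score_b)
    n = len(y)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # resample patients with replacement
        if y[idx].min() == y[idx].max():
            continue  # AUC is undefined when a resample holds only one class
        diffs.append(
            roc_auc_score(y[idx], score_a[idx]) - roc_auc_score(y[idx], score_b[idx])
        )
    low, high = np.percentile(diffs, [2.5, 97.5])
    return observed, low, high


# Synthetic scores (assumption): model A separates the classes better than model B.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 300)
score_a = y + rng.normal(scale=0.8, size=300)
score_b = y + rng.normal(scale=2.0, size=300)

observed, low, high = bootstrap_auc_difference(y, score_a, score_b)
```

A 95% interval for the difference that excludes zero would indicate that the two AUCs differ on this sample.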
Conclusions
The NAFLD classification algorithm is superior to ICD-9 billing data alone. This approach is simple to develop and deploy and can be applied across different institutions to create EMR-based cohorts of individuals with NAFLD.
Acknowledgments
The authors would like to acknowledge Dr. Ashwin N. Ananthakrishnan, MBBS, MPH who provided feedback and critical comments.
Financial Support
This study was funded in part by grants from the NIH K23 DK099422 (KEC) and NIH U54 LM008748 (SYS).
Ethics declarations
Conflict of interest
None.
Cite this article
Corey, K.E., Kartoun, U., Zheng, H. et al. Development and Validation of an Algorithm to Identify Nonalcoholic Fatty Liver Disease in the Electronic Medical Record. Dig Dis Sci 61, 913–919 (2016). https://doi.org/10.1007/s10620-015-3952-x
DOI: https://doi.org/10.1007/s10620-015-3952-x