A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation

  • Original Article
Indian Journal of Gastroenterology

Abstract

Background

Nonalcoholic steatohepatitis (NASH), a severe form of nonalcoholic fatty liver disease, can lead to advanced liver damage and has become an increasingly prominent health problem worldwide. Predictive models that identify high-risk individuals early could help guide preventive and interventional measures. Traditional epidemiological models, which are based on statistical analysis, have limited predictive power. In the current study, a novel machine-learning approach was developed to predict individual NASH susceptibility from candidate single nucleotide polymorphisms (SNPs).

Methods

A total of 245 NASH patients and 120 healthy individuals were included in the study. Genotypes of candidate SNPs, namely two SNPs in the cytochrome P450 family 2 subfamily E member 1 (CYP2E1) gene (rs6413432, rs3813867), two SNPs in the glucokinase regulator (GCKR) gene (rs780094, rs1260326), and the rs738409 SNP in the patatin-like phospholipase domain-containing 3 (PNPLA3) gene, together with gender, were used to develop models for identifying at-risk individuals. To predict an individual’s susceptibility to NASH, nine machine-learning models were constructed by combining three feature inputs (features selected by the Chi-square test, features selected by support vector machine recursive feature elimination (SVM-RFE), and all features without selection) with three classification algorithms: k-nearest neighbor (KNN), multi-layer perceptron (MLP), and random forest (RF). All nine models were trained on 80% of the data from both the NASH patients and the healthy controls and then tested on the remaining 20% of both groups. Model performance was compared in terms of accuracy, precision, sensitivity, and F-measure.
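The experimental design described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `stratified_split`, the random seed, and the rounding of the 80% cut are assumptions, and the feature-selection methods and classifiers appear only as labels to show how three feature inputs and three algorithms yield nine models.

```python
import itertools
import random

# Group sizes reported in the Methods section.
N_NASH, N_CONTROL = 245, 120


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices into train/test, sampling each class separately.

    Hypothetical helper: the seed and rounding rule are assumptions.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))  # 80% of this class
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# 1 = NASH patient, 0 = healthy control.
labels = [1] * N_NASH + [0] * N_CONTROL
train_idx, test_idx = stratified_split(labels)

# Three feature inputs crossed with three classifiers give nine models.
feature_inputs = ["Chi-square", "SVM-RFE", "all features"]
classifiers = ["KNN", "MLP", "RF"]
model_grid = list(itertools.product(feature_inputs, classifiers))

print(len(train_idx), len(test_idx), len(model_grid))
```

Under these assumptions the training set holds 196 patients and 96 controls, the test set holds 49 patients and 24 controls, and `model_grid` lists the nine feature-input and classifier pairs.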

Results

Among the nine machine-learning models, the KNN classifier with all features as input performed best, with an F-measure of 86% and an accuracy of 79%.
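For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision, sensitivity, and F-measure are commonly derived from a confusion matrix. It assumes the F-measure is the balanced F1 score (the harmonic mean of precision and sensitivity), which the abstract does not state explicitly. The function name `classification_metrics` and the counts are hypothetical and are not the study's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. F-measure is assumed to be F1 (beta = 1).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the F-measure ignores true negatives, it can differ noticeably from accuracy when the two classes are of unequal size, as they are in this cohort.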

Conclusions

Machine learning based on genetic variation may be applicable for estimating an individual’s susceptibility to NASH among high-risk groups with a high degree of accuracy, precision, and sensitivity.

Data availability

The datasets analyzed during the current study are available in the ZENODO repository and can be accessed from https://doi.org/10.5281/zenodo.4686908.

Author information

Contributions

Concept: FG, AAH; design: FG; supervision: AAH, OÖ; materials: AAH; data collection and/or analysis: AAH, FG; literature search: FG; writing: FG, AAH; critical reviews: AAH

Corresponding author

Correspondence to Fatemeh Ghadiri.

Ethics declarations

Competing interests

FG, AAH and OÖ declare no competing interests.

Ethics statement

The study was performed in accordance with the Declaration of Helsinki of 1975, as revised in 2000 and 2008, concerning human and animal rights, and the authors followed the policy on informed consent as shown on Springer.com.

Ethics approval

The ethics committee of Istanbul Gelişim University approved this study (ethical code: 77366270-302.08.01-E.12978, date: 16.11.2020).

Consent to participate

Consent forms were signed by all the participants before being included in the study.

Consent for publication

Not applicable.

Disclaimer

The authors are solely responsible for the data and the contents of the paper. In no way are the honorary editor-in-chief, editorial board members, the Indian Society of Gastroenterology, or the printer/publishers responsible for the results/findings and content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ghadiri, F., Husseini, A.A. & Öztaş, O. A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation. Indian J Gastroenterol 41, 475–482 (2022). https://doi.org/10.1007/s12664-022-01263-2

  • DOI: https://doi.org/10.1007/s12664-022-01263-2