Introduction

Metabolic-associated fatty liver disease (MAFLD), a multisystem metabolic disease involving the liver, is an updated definition of nonalcoholic fatty liver disease (NAFLD). It is notable for redefining the diagnostic conditions and for emphasizing metabolic factors while still considering nonalcoholic factors [1]. MAFLD can not only progress to steatohepatitis, liver cirrhosis, and hepatocellular carcinoma [2] but also promote the occurrence and development of extrahepatic diseases, such as cardiovascular and chronic kidney diseases [3]. At present, MAFLD affects more than one-third of the global population, and its prevalence is rising annually and shifting toward younger age groups [4, 5]. In China, between 29% and 46% of the population has MAFLD, which has become the most prevalent chronic liver disease and seriously aggravates the medical and economic burden on affected individuals and society [6, 7].

MAFLD has an insidious onset and no specific clinical symptoms in the early stages and is therefore easily overlooked. Early detection and management are crucial to prevent its progression. Among existing diagnostic tools, liver biopsy is the gold standard for the diagnosis of MAFLD but is unsuitable for routine screening because it is invasive and expensive [8]. Although ultrasonography is noninvasive, it may not be routinely performed in primary or secondary medical centers [9]. Other imaging tests are too expensive for effective mass screening. Moreover, some factors related to metabolic dysfunction in the new diagnostic criteria, such as the homeostasis model assessment of insulin resistance index (HOMA-IR), plasma high-sensitivity C-reactive protein (Hs-CRP) level, glycated hemoglobin (HbA1c), and 2-h postload glucose, may not be routinely measured owing to the complexity and technical demands of biomarker measurement and diagnostic equipment. This situation limits the application of MAFLD diagnostic criteria by healthcare providers. Therefore, a simple, noninvasive, and practical prediction model for the rapid screening of MAFLD is needed. Such a screening tool should be widely applicable for the early detection of MAFLD in primary, secondary, and tertiary medical centers.

Some simple screening tools for NAFLD based on demographics, laboratory factors, and anthropometrics have emerged (e.g., the fatty liver index, the NAFLD liver fat score, and the hepatic steatosis index) [10,11,12,13]. However, these screening tools are not applicable to the newly defined condition of MAFLD. A nomogram to predict the risk of MAFLD in overweight and obese people has recently been developed but is suitable only for those with body mass index (BMI) ≥ 24 and male waist circumference (WC) ≥ 90 cm or female WC ≥ 80 cm [14]. Another nonimaging-assisted nomogram established in a large United States (US) population could screen for MAFLD well but has unclear applicability to the Chinese population [15]. Therefore, MAFLD screening tools that can be easily used in the Chinese general population have not yet been developed.

Nomograms have been utilized widely to predict the risks of various diseases [16]. They are graphical prediction tools that visually and intuitively quantify the risk of events on the basis of various predictors [17]. Therefore, we aimed to develop and validate a nomogram for MAFLD screening and MAFLD risk classification in the general population based on routine indicators associated with MAFLD during physical examination.

Materials and methods

Participants

The study was based on a physical examination survey of individuals who underwent annual physical examinations at the Health Management Center of the Third Xiangya Hospital of Central South University in Hunan Province, China, between 2016 and 2020. Although all participants were examined at one medical facility, they came from different provinces of China. This institution is a tertiary medical center with a high ultrasound completion rate. A total of 207,663 individuals aged 18–79 years who underwent physical examinations between 2016 and 2020 were initially eligible for this study (Supplementary Fig. 1). Enrollment was limited to participants with complete records of demographic, anthropometric, blood biochemical, and lifestyle information, as well as the results of hepatic ultrasonography.

Data collection

Predictor variables were chosen on the basis of their clinical importance and evidence related to MAFLD. The collected data included demographic information (sex, age, education, marital status, and family history of hypertension and/or diabetes), anthropometric parameters (BMI, WC, waist-to-hip ratio [WHR], systolic blood pressure [SBP], and diastolic blood pressure [DBP]), blood biochemical indicators (alanine aminotransferase [ALT], uric acid [UA], fasting plasma glucose [FPG], total cholesterol [TC], triglyceride [TG], high-density lipoprotein cholesterol [HDL-C], and low-density lipoprotein cholesterol [LDL-C]), and self-reported lifestyle factors (dietary preference, smoking status, drinking status, physical activity, and sleep duration). In total, 22 variables were collected.

The quality of data collection was controlled by the following procedures. Blood pressure, including SBP and DBP, was measured on the right arm with the participant seated after 5 min of rest. Blood biochemical measurements were performed in the morning after fasting in accordance with standard procedures. Lifestyle-related information was collected by trained clinicians. Outliers were corrected and missing values were filled in by rechecking the original data in the data management system.

This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement [18].

Definition and assessment

The height and weight of each subject were measured on digital scales to the nearest 0.1 cm and 0.1 kg, respectively. BMI was calculated as weight (kg) divided by the square of height (m²). WC was measured as the horizontal girth through the center of the navel. Hip circumference (HC) was defined as the perimeter around the widest part of the buttocks in the axial plane. WHR was calculated as the ratio of WC (cm) to HC (cm).

This study categorized BMI into four groups (underweight: < 18.50 kg/m², normal: 18.50–22.99 kg/m², overweight: 23.00–24.99 kg/m², obese: ≥ 25.00 kg/m²) on the basis of the BMI criteria for Asians formulated by the WHO [19]. Diabetes was defined as FPG ≥ 7.0 mmol/L, being under antidiabetes treatment, or self-reported diabetes, and prediabetes was defined as FPG between 5.6 and 6.9 mmol/L (impaired fasting glucose) [20]. Hypertension was defined as SBP ≥ 130 mmHg, DBP ≥ 85 mmHg, being on antihypertensive therapy, or self-reported hypertension. Abnormal WC was defined as WC ≥ 90 cm for men and WC ≥ 80 cm for women [2]. Abnormal WHR was defined as WHR ≥ 0.90 for men and WHR ≥ 0.85 for women. Hyperuricemia was defined as UA > 420 μmol/L [21]. Elevated liver enzymes were defined as ALT > 40 IU/L [22]. Dyslipidemia was defined as follows: TC ≥ 5.2 mmol/L; LDL-C ≥ 3.4 mmol/L; HDL-C < 1.0 mmol/L in men and < 1.3 mmol/L in women; and TG ≥ 1.7 mmol/L [23].
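The anthropometric and lipid cutoffs above can be sketched as small helper functions. This is a minimal illustration only; the function names are hypothetical, and the lipid helper returns one flag per component because the text does not state how the four lipid criteria are combined.

```python
def bmi_value(weight_kg, height_cm):
    """BMI = weight (kg) / height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(bmi):
    """Asian BMI categories as listed in the text."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal"
    if bmi < 25.0:
        return "overweight"
    return "obese"


def abnormal_wc(wc_cm, male):
    """WC >= 90 cm (men) or >= 80 cm (women)."""
    return wc_cm >= (90 if male else 80)


def abnormal_whr(wc_cm, hc_cm, male):
    """WHR >= 0.90 (men) or >= 0.85 (women)."""
    return wc_cm / hc_cm >= (0.90 if male else 0.85)


def lipid_flags(tc, ldl_c, hdl_c, tg, male):
    """One flag per lipid component (all values in mmol/L)."""
    return {
        "high_tc": tc >= 5.2,
        "high_ldl_c": ldl_c >= 3.4,
        "low_hdl_c": hdl_c < (1.0 if male else 1.3),
        "high_tg": tg >= 1.7,
    }
```

For example, `bmi_category(bmi_value(70, 175))` classifies a 70 kg, 175 cm participant as normal weight.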

Hepatic ultrasound examination was conducted by trained ultrasonographers. Ultrasound diagnosis of hepatic steatosis was based on the presence of hepatorenal echogenic contrast, liver parenchymal brightness, deep attenuation, and vascular blurring [24]. In reference to the latest MAFLD criteria described by Eslam et al. [1], the diagnosis of MAFLD in this study was based on ultrasonographically confirmed hepatic steatosis and at least one of the following three criteria: overweight or obesity (defined as BMI ≥ 23 kg/m² in Asians), presence of type 2 diabetes mellitus, or evidence of metabolic dysregulation. Metabolic dysregulation was defined as the presence of ≥ 2 of the following [1]: (i) WC ≥ 90/80 cm (Asian cutoff) in men and women, respectively; (ii) blood pressure ≥ 130/85 mmHg or specific drug treatment; (iii) TG ≥ 1.7 mmol/L or specific drug treatment; (iv) HDL-C < 1.0 mmol/L in men and < 1.3 mmol/L in women; (v) prediabetes (i.e., FPG of 5.6 to 6.9 mmol/L, HbA1c of 5.7% to 6.4%, or 2-h postload glucose level of 7.8 to 11.0 mmol/L); (vi) HOMA-IR score ≥ 2.5; and (vii) Hs-CRP level > 2 mg/L.
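The diagnostic rule described above can be written as a short decision function. The sketch below is illustrative rather than the authors' implementation: the function and argument names are hypothetical, the overweight cutoff is taken as BMI ≥ 23 kg/m² (the Asian threshold quoted in the Discussion), and HbA1c, 2-h glucose, HOMA-IR, and Hs-CRP are optional because they may be unavailable in routine examinations.

```python
def metabolic_risk_count(*, male, wc_cm, sbp, dbp, tg, hdl_c, fpg,
                         on_bp_drugs=False, on_lipid_drugs=False,
                         hba1c=None, glucose_2h=None,
                         homa_ir=None, hs_crp=None):
    """Count how many of the seven metabolic risk criteria (i)-(vii) are met."""
    prediabetes = (
        5.6 <= fpg <= 6.9
        or (hba1c is not None and 5.7 <= hba1c <= 6.4)
        or (glucose_2h is not None and 7.8 <= glucose_2h <= 11.0)
    )
    flags = [
        wc_cm >= (90 if male else 80),            # (i) waist circumference
        sbp >= 130 or dbp >= 85 or on_bp_drugs,   # (ii) blood pressure
        tg >= 1.7 or on_lipid_drugs,              # (iii) triglycerides
        hdl_c < (1.0 if male else 1.3),           # (iv) HDL-C
        prediabetes,                              # (v) prediabetes
        homa_ir is not None and homa_ir >= 2.5,   # (vi) HOMA-IR
        hs_crp is not None and hs_crp > 2.0,      # (vii) Hs-CRP
    ]
    return sum(flags)


def has_mafld(*, steatosis, bmi, type2_diabetes, **risk_inputs):
    """Steatosis plus overweight/obesity, type 2 diabetes, or >= 2 risk criteria."""
    if not steatosis:
        return False
    if bmi >= 23.0 or type2_diabetes:
        return True
    return metabolic_risk_count(**risk_inputs) >= 2
```

A lean man with steatosis, WC of 92 cm, and TG of 2.0 mmol/L meets two risk criteria and is therefore classified as having MAFLD under this rule.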

Statistical analyses

Subjects were randomly divided into development and validation datasets at a 7:3 ratio; the development dataset was used to construct the nomogram, and the validation dataset was used to validate it. The comparability of the two datasets was then evaluated. Categorical variables were presented as numbers (percentages) and compared using the χ² test. To identify potential predictors of MAFLD, two statistical methods were used: univariate regression analysis and the random forest algorithm [25]. Random forest analysis was used to calculate the mean decrease in Gini (MDG) of each independent variable, which measures that variable's contribution to the risk of MAFLD and indicates how the independent and dependent variables are related. For subsequent analysis, we selected the variables that reached statistical significance in univariate regression analysis (P < 0.05) and ranked in the top 50% by random forest MDG. A multivariable logistic regression model was then built on the variables identified by these procedures. To guard against overfitting of the multivariable logistic regression model, least absolute shrinkage and selection operator (LASSO) regression was performed to eliminate highly correlated factors. Ultimately, a nomogram based on the multivariable model composed of the optimal features was developed to predict the risk of MAFLD. The receiver operating characteristic (ROC) curve was applied to evaluate discrimination, with an area under the ROC curve (AUC) greater than 0.70 taken to reflect good performance of the nomogram [26]. Agreement between observed outcomes and predicted probabilities was assessed with calibration curves. The clinical practicability of the nomogram was evaluated by decision curve analysis (DCA), which evaluates and compares predictive models by calculating net benefit across threshold probabilities [27].
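Two of the steps described here, the random 7:3 split and the AUC, can be illustrated in a few lines. The study itself used R and SPSS; the Python sketch below is only an illustration, with a hypothetical seed and function names, and it computes the AUC through its pairwise (Mann-Whitney) definition, which is simple but quadratic in sample size.

```python
import random


def split_7_3(ids, seed=2020):
    """Randomly split identifiers into development (70%) and validation (30%)."""
    rng = random.Random(seed)      # seed is an arbitrary, hypothetical choice
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a case scores higher than a non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.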

Statistical analysis was performed with R software version 4.2.2 and SPSS version 24.0. A two-sided P value < 0.05 was considered statistically significant.

Results

Characteristics of subjects

After rigorous screening, 138,664 participants, including 77,951 men and 60,713 women, were finally enrolled. Supplementary Fig. 1 shows the process of selecting subjects. By using the novel MAFLD diagnostic criteria, the MAFLD prevalence was found to be 39.55% (men: 53.39%, women: 21.77%, P < 0.001). Participants were randomly assigned at a 7:3 ratio to the development dataset (n = 97,066) and the validation dataset (n = 41,598). The prevalence of MAFLD did not differ significantly between the two datasets (development dataset: 39.55%, validation dataset: 39.55%, P = 0.998). The characteristics of the two datasets are shown in Table 1.

Table 1 Characteristics of participants in the development and validation datasets

Identifying predictors and constructing a nomogram for MAFLD

Table 2 shows the results of univariate logistic regression analysis and random forest analysis for MAFLD. With the default ntree of 500 and mtry = 3, the out-of-bag (OOB) samples had the lowest estimated error rate (16.37%). All variables were statistically significant in univariate logistic regression analysis. However, 11 variables did not rank in the top 50% by MDG in random forest analysis. The other 11 variables (BMI, WC, WHR, TG, sex, ALT, FPG, age, UA, SBP, and smoking status) ranked in the top 50% by MDG and were also significant in univariate analysis (P < 0.05). Therefore, further multivariate modeling was conducted using these 11 variables.
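The two-part selection rule (significant in univariate analysis and in the top 50% by MDG) can be expressed compactly. The sketch below is a hypothetical illustration; the function name and the numbers in the usage note are placeholders, not values from Table 2.

```python
def select_predictors(p_values, mdg, alpha=0.05, top_fraction=0.5):
    """Keep variables with P < alpha that rank in the top fraction by MDG.

    p_values and mdg are dicts keyed by variable name.
    Returned names are ordered by decreasing MDG.
    """
    ranked = sorted(mdg, key=mdg.get, reverse=True)
    n_keep = int(len(ranked) * top_fraction)
    top = set(ranked[:n_keep])
    return [v for v in ranked if v in top and p_values[v] < alpha]
```

With placeholder inputs such as `p_values = {"a": 0.001, "b": 0.2, "c": 0.01, "d": 0.03}` and `mdg = {"a": 50, "b": 40, "c": 10, "d": 5}`, only `"a"` is retained: `"b"` ranks high by MDG but is not significant, and `"c"` and `"d"` fall outside the top half.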

Table 2 Univariate regression and random forest results for the development dataset

The modeling process of LASSO regression is shown in Fig. 1a, b. Among the 11 variables (BMI, WC, WHR, TG, sex, ALT, FPG, age, UA, SBP, and smoking status), 10 independent predictors (all except smoking status) were identified in the development dataset by their nonzero coefficients in LASSO regression, with the optimal parameter (lambda) selected by tenfold cross-validation using the minimum criterion. Multivariate logistic regression modeling was then conducted using the 10 potential risk factors (Table 3). The results showed that BMI ≥ 23.00 kg/m², abnormal WC and WHR, TG ≥ 1.7 mmol/L, male sex, ALT > 40 IU/L, FPG ≥ 5.6 mmol/L, middle and older age, UA > 420 μmol/L, and SBP ≥ 130 mmHg were independent risk factors for MAFLD.

Fig. 1 Variable filtering of LASSO regression. Note: (a) LASSO coefficient profile for 11 variables. (b) Selection of the optimal lambda parameter in the LASSO model. To avoid overfitting, LASSO regression suggested including 10 variables (λ = 0.007, log[λ] = −5.00)

Table 3 Multivariate logistic regression model for MAFLD

In accordance with the results of the multivariable logistic regression model, the nomogram for MAFLD was developed on the basis of the 10 risk factors (Fig. 2). To improve the clinical utility of the nomogram, we converted the calculation of risk levels into a prediction table (Supplementary Table 1).

Fig. 2 Nomogram for predicting the risk of MAFLD in the physical examination population. Note: To use the nomogram, the points corresponding to each variable are added to obtain the total points, and a vertical line is drawn from the total points axis to the risk of MAFLD axis to obtain the predicted risk value
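The point-adding procedure of a nomogram mirrors the underlying logistic model: each predictor's points are proportional to its regression coefficient (the log odds ratio), and the total maps to a probability through the logistic function. The sketch below uses only the three odds ratios quoted in the Discussion; the other seven predictors are omitted because their values are not given in the text, so the resulting points are not those of Fig. 2. The function names and the intercept argument are hypothetical.

```python
import math

# Odds ratios quoted in the Discussion for three of the ten predictors.
ODDS_RATIOS = {"abnormal_WHR": 2.01, "ALT_gt_40": 2.39, "UA_gt_420": 1.65}


def nomogram_points(odds_ratios):
    """Scale each coefficient so the largest one is worth 100 points."""
    betas = {name: math.log(or_) for name, or_ in odds_ratios.items()}
    max_beta = max(betas.values())
    return {name: 100.0 * beta / max_beta for name, beta in betas.items()}


def predicted_risk(intercept, odds_ratios, present):
    """Logistic risk for a subject with the listed risk factors present.

    The intercept must come from the fitted model; it is not reported
    in the text, so any value passed here is an assumption.
    """
    linear_predictor = intercept + sum(math.log(odds_ratios[k]) for k in present)
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

In this reduced three-predictor example, elevated ALT receives 100 points and abnormal WHR about 80 points; in the full model the scaling would be set by whichever of the ten predictors has the largest coefficient.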

Discrimination and calibration

After the model was constructed using the development dataset (n = 97,066), the validation dataset (n = 41,598) was used to verify its predictive ability. The AUC of the model was 0.915 (95% CI 0.913–0.916) in the development dataset and 0.914 (95% CI 0.911–0.917) in the validation dataset, indicating good discrimination (Supplementary Fig. 2). For the development dataset, the sensitivity was 0.804 and the specificity was 0.863 at a cutoff of −0.463. For the validation dataset, the sensitivity was 0.787 and the specificity was 0.878 at a cutoff of −0.677. The calibration plots for the development and validation datasets were similar, with slight overestimation of the MAFLD probability between 0.08 and 0.66 and underestimation above 0.72. Nevertheless, the overall calibration was good (Supplementary Fig. 3).
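Sensitivity and specificity at a given cutoff follow directly from the confusion matrix. The sketch below also shows one common way of choosing a cutoff, maximizing the Youden index (sensitivity + specificity − 1); the text does not state how the reported cutoffs were chosen, so that part is an assumption, and the function names are hypothetical.

```python
def sens_spec(labels, scores, cutoff):
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(labels, scores):
    """Observed score that maximizes sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(labels, scores, c)) - 1.0)
```

For instance, with labels `[0, 0, 0, 1, 1, 1, 1]` and scores `[-2.0, -1.0, 0.2, -0.5, 0.5, 1.5, 0.3]`, the Youden-optimal cutoff is 0.3, giving a sensitivity of 0.75 and a specificity of 1.0.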

Clinical practicality

The clinical practicality of the developed nomogram was evaluated with decision curves (Supplementary Fig. 4). On the basis of decision curve analysis, the nomogram provided net benefit at threshold probabilities ≤ 95% in the development dataset and ≤ 90% in the validation dataset. In other words, when the predicted risk is ≤ 90%, further diagnostic work-up is beneficial, whereas when the predicted risk is greater than 90%, further MAFLD diagnosis provides no benefit. Within this range, the MAFLD prediction nomogram yielded more net benefit than the strategies of assuming that all individuals have MAFLD or that no individuals have MAFLD. As a result, the risk of MAFLD could be classified as low (≤ 90%) or high (> 90%) in accordance with the developed nomogram.
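Decision curve analysis compares strategies by their net benefit at each threshold probability p_t, using the standard definition: net benefit = TP/n − (FP/n) × p_t / (1 − p_t). The sketch below is a minimal illustration with hypothetical function names; the "assume none have MAFLD" strategy always has a net benefit of zero.

```python
def net_benefit(labels, risks, threshold):
    """Net benefit of calling predicted risk >= threshold positive."""
    n = len(labels)
    tp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of assuming every individual has the disease."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A model is considered useful at a given threshold when its net benefit exceeds both the "all" curve and zero. Note that the "all" strategy reaches zero net benefit when the threshold equals the prevalence (39.55% in this cohort) and is negative beyond it.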

Discussion

With the increase in public health awareness, physical examination has become the main way through which people engage in health self-management. Given that most patients with MAFLD are diagnosed incidentally during physical examinations and that predictive tools for large-scale screening of MAFLD risk in the Chinese general population are lacking [28], we established and validated a nomogram for predicting MAFLD risk based on real-world, large-scale physical examination data, combining classical regression analysis and a random forest algorithm to identify the most significant predictors of MAFLD. The nomogram aims to enable mass screening for MAFLD in primary, secondary, and tertiary care centers by using easily available indicators for the early detection, diagnosis, and intervention of people at risk of MAFLD. Our results showed that the prediction model performs well in terms of discrimination, calibration, and clinical practicality.

In our study, candidate variables were limited to easily available indicators, which enhances the clinical utility of our nomogram. Furthermore, combining classical regression analysis with the random forest algorithm helped us identify a robust combination of predictors. LASSO regression was used to guard against overfitting of the multivariate logistic regression and to keep variable selection objective. Using the random forest algorithm to filter variables avoids an increase in the number of estimated parameters when dealing with multilevel categorical variables, is insensitive to outliers, and is highly resistant to interference [29]. In addition, the nomogram requires no complicated formula and instead predicts an individual’s risk of developing MAFLD through its scoring system; it is therefore highly acceptable and can be effectively applied to the general population [30]. More importantly, our nomogram not only succinctly demonstrates the relationship between MAFLD and its risk factors but also shows how the predicted risk changes with the values of specific risk factors. A growing body of evidence shows that nomograms can predict disease risk in a visual and understandable way [17].

Notably, 10 variables were included in our nomogram. Five of these 10 factors (BMI, WC, SBP, FPG, and TG) are components of the MAFLD diagnostic criteria, whereas sex, age, WHR, ALT, and UA are not. Our study additionally included sex and WHR in the variable screening. Males are at a higher risk of developing MAFLD than females, as confirmed in many studies [31,32,33]. Although sex cannot be modified, it can be used as a categorical indicator to advise men, who are highly susceptible, to be screened for MAFLD. Although research on the use of WHR as an anthropometric indicator to predict MAFLD is limited, Zheng et al. [34] and Cai et al. [35] demonstrated that WHR has a high diagnostic value for NAFLD. In the present study, the multivariate logistic regression results indicated that abnormal WHR was strongly and positively associated with the risk of MAFLD (OR = 2.01, 95% CI 1.93–2.10). This finding suggests that WHR should be used as one of the routine indicators for predicting and screening MAFLD. Moreover, our study found that high ALT values were closely associated with a high risk of MAFLD (OR = 2.39, 95% CI 2.26–2.52), suggesting that ALT is an important reference for screening MAFLD, although whether ALT values can serve as a diagnostic standard for NAFLD remains controversial [36,37,38]. Consistent with previous studies [39, 40], the present work highlighted the importance of UA in predicting MAFLD (OR = 1.65, 95% CI 1.57–1.73). Thus, the parameters identified in our MAFLD prediction model are not only easily available but also reliable.

To our knowledge, ours is the first nomogram for predicting MAFLD risk that is applicable to the Chinese general population, and it may compensate for some of the shortcomings of previous screening tools. For example, the nomogram for predicting the risk of MAFLD in overweight and obese populations reported by Song et al. cannot be applied to the general population [14]. The clinical and laboratory nomogram (CLN) model for predicting NAFLD has sensitivity and specificity that need improvement and is inapplicable to the newly defined condition of MAFLD [41]. The MAFLD prediction nomogram based on demography, laboratory factors, anthropometry, and comorbidities predicts MAFLD well but may be inappropriate for the Chinese population because the BMI and WC thresholds for the Asian population differ from those used for the US population [15]. Specifically, Asians are defined as overweight/obese at lower cutoff values for BMI (BMI ≥ 23 kg/m²) and as having abnormal WC at lower cutoff values (WC ≥ 90/80 cm) than Caucasians [1]. Our nomogram could address these problems because it was constructed from physical examination data related to MAFLD in the general Chinese population and has high sensitivity and specificity. It could not only support clinicians in screening for MAFLD and in determining whether participants need further abdominal ultrasound to confirm the diagnosis but also help individuals at risk of MAFLD manage their own health and seek timely medical assistance.

Study strengths and limitations

The strengths of this study include the large sample of participants (138,664), which increases the reliability and statistical power of the nomogram. The combination of classical regression methods and the random forest algorithm ensured the best combination of factors included in the prediction model. More importantly, our prediction model can be widely applied to health management (physical examination) centers for rapid screening of MAFLD, and the presentation of the nomogram also makes it easy to assess the risk of MAFLD, which contributes to realizing graded management and timely referral of MAFLD, thereby improving the overall level of MAFLD prevention and treatment.

However, our study also has several potential limitations. First, in this study, the diagnosis of MAFLD was based on steatosis detected by liver ultrasonography rather than biopsy because performing liver biopsy in a large-scale survey was impractical. Future studies will add liver biopsy where possible to ensure the accuracy of MAFLD diagnosis. Second, the exclusion of some patients who underwent physical examination due to missing data may have led to some bias. Third, patients with MAFLD diagnosed by ultrasonography lacked data on 2-h postload glucose, HbA1c, HOMA-IR, and Hs-CRP.

Conclusion

Our study used routine indicators to establish a risk-stratified nomogram that screens for the risk of MAFLD in the physical examination population. Clinicians can provide individualized plans to subjects in accordance with the risk assessment. High-risk individuals, for whom early lifestyle interventions may help prevent disease progression and reduce the risk of adverse outcomes, should be referred for additional diagnostic testing to confirm MAFLD.