Causal Relationship Between Complement C3, C4, and Nonalcoholic Fatty Liver Disease: Bidirectional Mendelian Randomization Analysis

Abstract

The complement system is activated during the development of nonalcoholic fatty liver disease (NAFLD). We aimed to evaluate the causal relationship between serum C3 and C4 levels and NAFLD. After applying the exclusion criteria, a total of 1600 Chinese Han men from the Fangchenggang Area Male Health and Examination Survey cohort were enrolled in the cross-sectional analysis, and 572 participants were included in the longitudinal analysis (average follow-up of 4 years). We performed a bidirectional Mendelian randomization (MR) analysis using two C3-related, eight C4-related and three NAFLD-related gene loci as instrumental variables to evaluate the causal associations between C3, C4, and NAFLD risk in the cross-sectional analysis. Each SD increase in C3 level was significantly associated with a higher risk of NAFLD (OR = 1.65, 95% CI 1.40, 1.94) in the cross-sectional analysis, whereas C4 was not (OR = 1.04, 95% CI 0.89, 1.21). Longitudinal analysis produced similar results (HR_C3 = 1.20, 95% CI 1.02, 1.42; HR_C4 = 1.10, 95% CI 0.94, 1.28). In the MR analysis, genetically determined C3 levels showed no causal relationship with NAFLD risk using either the unweighted or weighted genetic risk score for C3 (GRS_C3) (βE_unweighted = −0.019, 95% CI −0.019, −0.019, p = 0.202; βE_weighted = −0.019, 95% CI −0.019, −0.019, p = 0.322). Conversely, serum C3 levels were significantly affected by genetically determined NAFLD (βE_unweighted = 0.020, 95% CI 0.020, 0.020, p = 0.004; βE_weighted = 0.021, 95% CI 0.020, 0.021, p = 0.004). Neither the direction from C4 to NAFLD nor that from NAFLD to C4 showed a significant association. Our results suggest that the change in serum C3 levels, but not C4 levels, might be caused by NAFLD in Chinese Han men.

Introduction

Nonalcoholic fatty liver disease (NAFLD) is one of the most common chronic liver diseases. More than a quarter of the global population was affected by NAFLD in 2018 (Younossi et al. 2019). NAFLD is defined as the presence of fat in the liver (hepatic steatosis) after the exclusion of secondary causes of fat accumulation (e.g., significant alcohol consumption, viral hepatitis, autoimmune hepatitis, certain medications, and other medical conditions) (Chalasani et al. 2018). It can progress to inflammatory nonalcoholic steatohepatitis (NASH), fibrosis, cirrhosis, and even hepatocellular carcinoma (Friedman et al. 2018). A wide range of diseases and conditions, including metabolic syndrome (MetS), obesity, hyperlipidemia, hypertension, hyperglycemia, insulin resistance (IR), and type 2 diabetes, are associated with NASH risk, and NAFLD is even considered as one of the risk factors for cardiovascular diseases (Friedman et al. 2018). Recent studies have also reported that oxidative stress and inflammasome activation are related to the development of NAFLD (Younossi et al. 2019).

The complement system consists of a group of proteins that mediate immune and inflammatory responses. This system was activated in liver biopsies from NAFLD patients compared with healthy controls (Rensen et al. 2009; Segers et al. 2014). The complement proteins C3 and C4 are major components of the complement system. A mouse study indicated that C3 primarily contributes to the accumulation of triglycerides in the liver (Pritchard et al. 2007). Recently, some epidemiological studies have reported a significant role of C3 as a potential predictor of NAFLD in the general population and in patients with rheumatoid arthritis (Jia et al. 2015; Ursini et al. 2017; Xu et al. 2016). In addition, a cross-sectional study suggested that plasma C3 levels are associated with liver fat and parallel the degree of liver injury (Wlazlo et al. 2013). Serum C3 and C4 levels were significantly lower in patients with high transaminase levels than in those with normal transaminase levels (Bugdaci et al. 2012). Moreover, we previously found that C3 and C4 were associated with the risk of MetS (Liu et al. 2016), which is one of the risk factors for NAFLD. These results prompted us to speculate that C3 and C4 may be indirectly associated with NAFLD risk by affecting the development of MetS. However, current evidence on the causal relationship between C3, C4, and NAFLD is still inadequate.

Mendelian randomization (MR) uses genetic variants as instrumental variables (IV) to estimate the effect of an exposure on an outcome and consequently make causal inferences (Haycock et al. 2016). MR analysis could make up for the defects of traditional epidemiological study design (cohort study, case–control study, etc.); the bidirectional MR method could validate the correct direction of a causal network (Chaibub Neto et al. 2010; Phillips and Smith 1991; Smith and Ebrahim 2002). In this study, we performed a bidirectional MR analysis together with cross-sectional and longitudinal analyses to explore the causal relationship between C3, C4, and NAFLD.

Materials and Methods


This study was based on data from a hospital-based cohort of Chinese men in Guangxi, China: the Fangchenggang Area Male Health and Examination Survey (FAMHES) cohort. The rationale, design, and methods of the cohort have been detailed elsewhere (He et al. 2014; Jiang et al. 2015; Liu et al. 2016; Yang et al. 2012). Briefly, all participants were men attending routine physical examinations and were recruited at the Medical Centre of Fangchenggang First People’s Hospital. A total of 4,303 men aged 17 to 88 years completed the face-to-face interviews and physical examinations at baseline from September 2009 to December 2009 (Fig. 1).

Fig. 1

The flow diagram of study design based on the Fangchenggang Area Male Health and Examination Survey (FAMHES) cohort

After excluding the participants whose immediate family members were not Chinese Han, and those who were other Chinese ethnicities, a total of 2020 eligible subjects were involved in cross-sectional and MR analyses. These subjects had self-reported as being the southern Chinese Han ethnicity, and it was confirmed by the genome-wide assay. We excluded the participants who had missing information on at least one of the following aspects, including single nucleotide polymorphism (SNP) genotypes (n = 36), serum C3 and C4 levels (n = 2), liver ultrasound (n = 3), and general demographic data (n = 7). We also excluded those who had a history of excessive consumption (> 294 g/week) of pure alcohol (n = 89) (Chalasani et al. 2018) and hepatic diseases (including hepatitis B, hepatic cirrhosis, liver cancer, etc.) (n = 283). In the end, a total of 1600 participants (328 with NAFLD, 1272 without NAFLD) were enrolled in the final cross-sectional and MR analyses.

At the follow-up stage, a total of 1272 men without NAFLD at baseline were followed up in 2013, with an average follow-up of 4 years. The participants completed the same face-to-face interview and physical examination in 2013 as they had at baseline. We excluded members who had one or more of the following criteria: loss to follow-up (n = 89), unwillingness to participate (n = 54), subsequent diseases that were unsuitable for participation (n = 213, including cardiovascular diseases, lung cancer, goiter/thyroid cystic nodules/thyroid cancer, nasopharyngeal carcinoma, usage of medications associated with secondary NAFLD, other cancers, etc.), excessive alcohol consumption (n = 67), hepatic diseases (n = 235, including 30 with hepatitis B and hepatitis C virus infection, 13 with liver fibrosis, 7 with cirrhosis, 18 with hepatic distomiasis, 16 with intrahepatic hemangioma, 111 with cholecystitis/gallstone/cholangiolithiasis, 6 with liver cancer, 19 with hepatic cyst, and 15 with hepatapostema), and missing data on anthropometric measurements or clinical biochemistry assays in 2013 (n = 42). Finally, a total of 572 men (146 with incident NAFLD, 426 without NAFLD) participated in the longitudinal analysis.
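The participant flow described above can be checked with simple arithmetic. The sketch below copies the counts from the text and assumes the exclusion categories are mutually exclusive (each excluded man is counted once); the dictionary labels are illustrative, not the authors' variable names.

```python
# Counts are copied from the text; labels are illustrative only.
baseline_eligible = 2020
baseline_exclusions = {
    "missing SNP genotypes": 36,
    "missing serum C3/C4 levels": 2,
    "missing liver ultrasound": 3,
    "missing demographic data": 7,
    "excessive alcohol consumption": 89,
    "hepatic diseases": 283,
}
# Assumes no participant falls into more than one exclusion category.
cross_sectional_n = baseline_eligible - sum(baseline_exclusions.values())

followup_eligible = 1272  # men without NAFLD at baseline
followup_exclusions = {
    "lost to follow-up": 89,
    "unwilling to participate": 54,
    "subsequent unsuitable diseases": 213,
    "excessive alcohol consumption": 67,
    "hepatic diseases": 235,
    "missing 2013 measurements": 42,
}
longitudinal_n = followup_eligible - sum(followup_exclusions.values())

# Itemized hepatic diseases at follow-up, in the order listed in the text.
hepatic_breakdown = [30, 13, 7, 18, 16, 111, 6, 19, 15]

print(cross_sectional_n, longitudinal_n, sum(hepatic_breakdown))
```

Under that assumption the totals agree with the reported sample sizes (1600 and 572), and the itemized hepatic diseases sum to the reported 235.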

Data collection

Several types of data were collected in this study, as previously reported, including questionnaire data obtained from in-person interviews, clinical data from physical examinations and laboratory tests in the Fangchenggang First People’s Hospital (Tan et al. 2011; Tian et al. 2012; Yang et al. 2012). The Illumina Omni one platform was used for the genome-wide assay (Yang et al. 2012).


Smoking and drinking were defined as daily smoking and as drinking at least once a month (for > 6 months), respectively. The consumption of pure alcohol (g/week) was calculated as the volume consumed per week (mL) × alcohol by volume (%) × 0.8 (alcohol density, g/mL). Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²).
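The two formulas above can be sketched in a few lines. The function names and the example values (beverage volume, strength, weight, height) are hypothetical; only the 0.8 g/mL density and the 294 g/week threshold come from the text.

```python
ETHANOL_DENSITY_G_PER_ML = 0.8       # alcohol density given in the text
EXCESSIVE_ALCOHOL_G_PER_WEEK = 294   # exclusion threshold given in the text


def weekly_pure_alcohol_g(volume_ml_per_week, abv_percent):
    """Grams of pure alcohol per week: volume x ABV fraction x density."""
    return volume_ml_per_week * (abv_percent / 100.0) * ETHANOL_DENSITY_G_PER_ML


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2: weight divided by height squared."""
    return weight_kg / height_m ** 2


# Hypothetical participant: 3500 mL of 5% beer per week, 70 kg, 1.75 m.
grams = weekly_pure_alcohol_g(3500, 5)
excessive = grams > EXCESSIVE_ALCOHOL_G_PER_WEEK
bmi = body_mass_index(70, 1.75)
print(round(grams, 1), excessive, round(bmi, 1))
```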

The diagnostic criteria for fatty liver included: (1) increased liver echogenicity (bright); (2) stronger echoes in the hepatic parenchyma compared with the renal parenchyma; (3) vessel blurring; and (4) narrowing of the lumen of the hepatic veins (Chalasani et al. 2018). The definition of NAFLD was based on abdominal ultrasonographic examination without: (1) a history of excessive consumption (> 294 g/week) of pure alcohol; and (2) a history of hepatic diseases, self-reported viral hepatitis (including hepatitis B and hepatitis C virus), usage of medications associated with secondary NAFLD (corticosteroids, estrogens, amiodarone, methotrexate) or carcinoma.

The criteria for MetS were based upon the National Cholesterol Education Program Adult Treatment Panel III for Asian Americans (Liu et al. 2016) as having three or more of the following components: (1) central obesity (waist circumference ≥ 90 cm and/or BMI > 25 kg/m²); (2) hyperlipidemia (triglyceride ≥ 1.7 mmol/L); (3) low high-density lipoprotein cholesterol (HDL-C) (< 1.03 mmol/L); (4) hypertension (systolic blood pressure ≥ 130 mmHg and/or diastolic blood pressure ≥ 85 mmHg); and (5) hyperglycemia (fasting blood glucose ≥ 5.6 mmol/L).
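As a minimal sketch, the five-component rule above could be encoded as follows. The thresholds are those listed in the text; the function and parameter names, and the example participant, are hypothetical.

```python
def mets_component_count(waist_cm, bmi, triglyceride, hdl_c, sbp, dbp, glucose):
    """Count how many of the five MetS components listed in the text are met.

    Units: cm, kg/m^2, mmol/L (triglyceride, HDL-C, glucose), mmHg (sbp, dbp).
    """
    components = [
        waist_cm >= 90 or bmi > 25,   # central obesity
        triglyceride >= 1.7,          # hyperlipidemia
        hdl_c < 1.03,                 # low HDL-C
        sbp >= 130 or dbp >= 85,      # hypertension
        glucose >= 5.6,               # hyperglycemia
    ]
    return sum(components)


def has_mets(**measurements):
    """MetS is defined as three or more components."""
    return mets_component_count(**measurements) >= 3


# Hypothetical participant meeting central obesity, hyperlipidemia, hypertension.
example = dict(waist_cm=92, bmi=24.0, triglyceride=1.9, hdl_c=1.2,
               sbp=128, dbp=86, glucose=5.2)
print(mets_component_count(**example), has_mets(**example))
```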

IR was estimated by the homeostasis model assessment-insulin resistance index. The formula was: [fasting blood-glucose (mmol/L) × fasting insulin (mIU/L)]/22.5. A value of 2.5 or higher was considered IR (Lee et al. 2016).
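The index and its cutoff translate directly into code. This is a sketch of the formula as stated above; the function names and the example glucose and insulin values are hypothetical.

```python
IR_CUTOFF = 2.5  # a value of 2.5 or higher was considered IR in the text


def homa_ir(fasting_glucose_mmol_l, fasting_insulin_miu_l):
    """HOMA-IR = glucose (mmol/L) x insulin (mIU/L) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_miu_l / 22.5


def is_insulin_resistant(fasting_glucose_mmol_l, fasting_insulin_miu_l):
    return homa_ir(fasting_glucose_mmol_l, fasting_insulin_miu_l) >= IR_CUTOFF


# Hypothetical values: glucose 5.0 mmol/L with insulin 9 or 12 mIU/L.
print(homa_ir(5.0, 9.0), is_insulin_resistant(5.0, 9.0))
print(homa_ir(5.0, 12.0), is_insulin_resistant(5.0, 12.0))
```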

Selection of Genetic Loci

In our previous two-stage genome-wide association study (GWAS) conducted in the FAMHES cohort (stage 1) and an additional 1496 subjects recruited from three collaborating hospitals in Guangxi, China (stage 2, did not overlap with stage 1), we reported that two SNPs (rs3753394 and rs3745567) were significantly associated with serum C3 levels and eight SNPs (rs1052693, rs11575839, rs2075799, rs2857009, rs2071278, rs3763317, rs9276606, and rs241428) for serum C4 levels (Yang et al. 2012). They are in or near the complement factor H (CFH) gene, the C3 gene, the human leukocyte antigen (HLA) gene, and the C4 gene, respectively (Table 1). In the current study, we used the above-reported two C3-related SNPs and eight C4-related SNPs as IV.

Table 1 Information on SNPs with estimated genetic risk scores in participants

Based on previously published GWAS or candidate-gene studies on NAFLD-related traits (Kitamoto et al. 2013; Lin et al. 2014; Macaluso et al. 2015; Wang et al. 2016, 2018b), we identified eight SNPs involved in susceptibility to and/or progression of NAFLD: rs12137855, rs780094, rs4240624, rs1227756, rs58542926, rs738409, rs738491, and rs5764455, which are in or near the lysophospholipase-like 1 (LYPLAL1) gene, the glucokinase regulatory protein (GCKR) gene, the protein phosphatase 1 regulatory subunit 3b (PPP1R3B) gene, the collagen type XIII alpha 1 chain (COL13A1) gene, the transmembrane 6 superfamily member 2 (TM6SF2) gene, the patatin-like phospholipase domain containing 3 (PNPLA3) gene, the sorting and assembly machinery component (SAMM50) gene, and the parvin beta (PARVB) gene, respectively. However, rs12137855, rs780094, rs4240624, and rs1227756 were not significantly associated with NAFLD in our study population, and rs58542926 was not genotyped in our genome-wide assay. Therefore, we selected rs738409, rs738491, and rs5764455 as IV for NAFLD, after considering the effect sizes of the SNPs on NAFLD in this study.

The two C3-related, eight C4-related and three NAFLD-related SNPs we selected were not in linkage disequilibrium with one another (r² = 0.00), except among the three NAFLD-related SNPs (r² = 0.60 between rs738409 and rs738491, 0.63 between rs738409 and rs5764455, and 0.59 between rs738491 and rs5764455).

Statistical Analysis

The data are summarized as median (interquartile range, [IQR]) for continuous variables or as numbers (percentage) for categorical variables. To compare characteristics by disease status, the Mann–Whitney U test was used for continuous covariates and the Chi-square test for categorical covariates. To estimate the associations between C3/C4 and NAFLD, binary logistic regression was used to calculate the odds ratio (OR) and 95% confidence interval (CI) in the cross-sectional analysis, and the Cox proportional hazards regression model was used to estimate the hazard ratio (HR) and 95% CI in the longitudinal analysis. C3 and C4 levels were analyzed both as quartiles and per standard deviation (SD) increase in regression analyses using three models. Model 1 was adjusted for age, marital status (yes or no), and education level (middle school or below, high school, university or beyond). Model 2 was adjusted for the terms in model 1, as well as smoking (yes or no) and drinking (yes or no). Model 3 was adjusted for the terms in model 2, as well as MetS (yes or no) and IR (yes or no).
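For readers less familiar with how an OR and its 95% CI relate to a regression coefficient, the sketch below shows the standard Wald conversion (OR = exp(β), CI = exp(β ± 1.96 × SE)) and its inverse. It is an illustration, not the authors' code; the function names are hypothetical, and the only study values used are the model 3 OR for C3 (1.65, 95% CI 1.40, 1.94) reported in this article.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its SE to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or(odds_ratio, lower, upper, z=Z_95):
    """Recover the log-odds coefficient and an approximate SE from a Wald CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Back-calculate from the reported OR per SD increase in C3 (model 3).
beta, se = beta_se_from_or(1.65, 1.40, 1.94)
point, low, high = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(point, 2), round(low, 2), round(high, 2))
```

The round trip does not reproduce the published limits exactly, because the published values are rounded to two decimals.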

To construct genetic risk scores (GRS), the additive genetic model for each SNP was coded as 0, 1, or 2. For the GRS_C3 and GRS_C4, we created a weighted score using each SNP based on its effect sizes on C3 and C4 from our previous GWAS (Yang et al. 2012). For the GRS_NAFLD, the weights were also estimated from our cohort by linear regression analyses adjusted for age and logBMI. Utilizing two C3-related, eight C4-related and three NAFLD-related SNPs, we created unweighted GRS (GRS_C3: 0–4; GRS_C4: 1–14; GRS_NAFLD: 0–6) and weighted GRS (GRS_C3: 0–0.28; GRS_C4: 0.09–2.14; GRS_NAFLD: 0–2.09). Regarding MR analysis, we used a triangulation approach to estimate the possible causal effect of C3 and C4 on NAFLD (and vice versa), using unweighted and weighted GRS of C3, C4 and NAFLD as IV estimators, respectively (Wang et al. 2018a). The formula was βE = βGE × βEO, where βE is the expected effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the outcome, βGE is the effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the exposure risk, and βEO is the effect size of exposure (C3, C4, or NAFLD) on the outcome risk (Wang et al. 2018a). Student’s t-test was used to compare the difference between βE and βO (Luk et al. 2008), where βO is the observed effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the outcome estimated by a generalized linear model for continuous variables and binary logistic regression for categorical variables. We considered there to be a causal relationship between exposure and outcome when βE was significantly different from βO. For binary traits, we used ln-transformed OR to acquire the value of β. To limit weak SNP bias, an F-statistic above 10 was expected to be sufficiently strong for the study (Pierce et al. 2011). 
To estimate the pleiotropic effects of each C3-, C4-, and NAFLD-related SNP and GRS, we used linear regression or binary logistic regression models to analyze the characteristics of the participants according to each SNP. According to tertiles of the unweighted and weighted GRS, generalized linear models were used for continuous variables, and binary logistic regression or the Chi-square test was used for categorical variables. In sensitivity analyses, to identify overly influential SNPs, a leave-1-SNP-out analysis was conducted in which each SNP was excluded in turn to recreate the unweighted and weighted GRS for C3, C4, and NAFLD as IV.
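The GRS construction and the leave-1-SNP-out step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R/PLINK pipeline: the function names are hypothetical, and the per-allele weights are invented placeholders rather than the effect sizes from the GWAS.

```python
# The two C3-related SNPs named in the text; the weights are placeholders.
HYPOTHETICAL_WEIGHTS = {"rs3753394": 0.05, "rs3745567": 0.10}


def _check_dosages(dosages):
    for snp, dosage in dosages.items():
        if dosage not in (0, 1, 2):
            raise ValueError(f"{snp}: additive coding must be 0, 1, or 2")


def unweighted_grs(dosages):
    """Sum of allele counts under the additive model."""
    _check_dosages(dosages)
    return sum(dosages.values())


def weighted_grs(dosages, weights):
    """Sum of allele counts, each multiplied by its per-allele weight."""
    _check_dosages(dosages)
    return sum(weights[snp] * dosage for snp, dosage in dosages.items())


def leave_one_snp_out(dosages, weights):
    """Recompute the weighted GRS with each SNP dropped in turn."""
    scores = {}
    for dropped in dosages:
        kept = {snp: d for snp, d in dosages.items() if snp != dropped}
        scores[dropped] = weighted_grs(kept, weights)
    return scores


# Hypothetical participant: one allele at rs3753394, two at rs3745567.
participant = {"rs3753394": 1, "rs3745567": 2}
print(unweighted_grs(participant))
print(weighted_grs(participant, HYPOTHETICAL_WEIGHTS))
print(leave_one_snp_out(participant, HYPOTHETICAL_WEIGHTS))
```

With two SNPs coded 0 to 2, the unweighted score ranges from 0 to 4, which matches the reported range for GRS_C3.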

We used R software (version 3.4.4, R Core Team) and PLINK software (version 2.0 alpha) for analyses. Statistical tests were two-sided, and p value < 0.05 was considered statistically significant.

Results

Characteristics of the Participants

Of the 1600 eligible participants at the baseline stage, 328 NAFLD cases (20.50%) and 1272 healthy individuals (79.50%) were identified (Table 2). Compared with the group without NAFLD, participants with NAFLD had a higher prevalence of MetS, central obesity, hyperlipidemia, low high-density lipoprotein cholesterol (HDL-C), hypertension, hyperglycemia and IR (all p < 0.001). The C3 and C4 levels of participants with NAFLD were 19.44% and 9.38% higher, respectively, than those of participants without NAFLD (both p < 0.001).

Table 2 The demographic characteristics of the participants at baseline and follow-up stages

After a 4-year follow-up, 146 subjects (25.52%) were newly diagnosed with NAFLD. The characteristics of education level, MetS, central obesity, hyperlipidemia, low HDL-C, IR, C3, and C4 were different between the participants with and without NAFLD, while age, marital status, smoking, drinking, hypertension, and hyperglycemia were not. In the longitudinal analysis, except for age, marital status, drinking, hypertension, and hyperglycemia, the characteristics of incident NAFLD were similar to the baseline stage (all p < 0.001). Moreover, the median C3 and C4 levels and the prevalence rates of MetS, central obesity, hyperlipidemia, low HDL-C, hypertension, hyperglycemia, and IR were all lower at the follow-up than at baseline.

Associations of Serum C3 and C4 Levels with NAFLD Risk

In the cross-sectional analysis, C3 and C4 levels were significantly associated with a higher risk of NAFLD in model 1: the adjusted ORs for NAFLD per SD increase were 2.36 (95% CI 2.05, 2.72) for C3 and 1.24 (95% CI 1.10, 1.40) for C4 (both p < 0.001; Table 3). In model 2, the corresponding ORs were 2.35 (95% CI 2.04, 2.71) for C3 and 1.24 (95% CI 1.10, 1.40) for C4 (both p < 0.001). With the additional covariates in model 3, they decreased to 1.65 (95% CI 1.40, 1.94) for C3 (p < 0.001) and 1.04 (95% CI 0.89, 1.21) for C4 (p = 0.311). The quartiles of C3 and C4 showed similar results.

Table 3 The associations of serum C3 and C4 levels with NAFLD risk

In the longitudinal analysis, the adjusted HRs of NAFLD per SD increase for C3 and C4 were 1.48 (95% CI 1.28, 1.71, p < 0.001) and 1.22 (95% CI 1.06, 1.42, p = 0.039) in model 1, respectively. As in the cross-sectional analysis, the adjusted HRs per SD increase decreased gradually as covariates were added (models 2 and 3), ending borderline significant for C3 and not statistically significant for C4. Furthermore, the multivariable results for the quartiles of C3 and C4 were the same as those at the baseline survey.

The Bidirectional MR Analysis

In the MR triangulation framework, each SD increase in the unweighted and weighted GRS_C3 was associated with C3 (βGE_unweighted = −0.0383, 95% CI −0.0486, −0.0280, p < 0.001; βGE_weighted = −0.0387, 95% CI −0.0488, −0.0287, p < 0.001) but not with NAFLD (βE_unweighted = −0.019169, 95% CI −0.019203, −0.019136, p = 0.202; βE_weighted = −0.019391, 95% CI −0.019424, −0.019357, p = 0.322) after adjusting for age, marital status, education level, smoking, drinking, MetS and IR (Fig. 2A). There were no obvious differences between the observed effects of GRS_C3 on NAFLD risk (βO) and the expected effects (βE), whether using the unweighted or weighted GRS_C3 (p_unweighted = 0.202, p_weighted = 0.322). Conversely, the expected adjusted regression coefficients of genetically determined NAFLD for the C3 level were 0.020346 (95% CI 0.020203, 0.020489) for the unweighted GRS_NAFLD and 0.020538 (95% CI 0.020395, 0.020682) for the weighted GRS_NAFLD (Fig. 2B), and the observed effects of the unweighted and weighted GRS_NAFLD on C3 levels differed from the expected effects (both p = 0.004). The bidirectional IV estimates for causal relationships from C4 to NAFLD and from NAFLD to C4 were not significant (Fig. 2C, D).
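The expected effects reported above follow from the triangulation formula βE = βGE × βEO. The sketch below reproduces the C3-to-NAFLD values approximately, assuming that βEO is the ln-transformed model 3 OR per SD increase in C3 (1.65); that reading is an inference from the reported numbers, not something the text states explicitly, and the function name is hypothetical.

```python
import math


def expected_effect(beta_ge, beta_eo):
    """Triangulation: expected GRS-outcome effect = beta_GE x beta_EO."""
    return beta_ge * beta_eo


# beta_GE values as reported for GRS_C3 -> C3.
beta_ge_unweighted = -0.0383
beta_ge_weighted = -0.0387

# Assumption: beta_EO (C3 -> NAFLD) is ln(OR) from model 3, OR = 1.65.
beta_eo = math.log(1.65)

beta_e_unweighted = expected_effect(beta_ge_unweighted, beta_eo)
beta_e_weighted = expected_effect(beta_ge_weighted, beta_eo)
print(round(beta_e_unweighted, 4), round(beta_e_weighted, 4))
```

Under this assumption the products land close to the reported −0.019169 and −0.019391; the small gaps are consistent with rounding of the published βGE and OR.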

Fig. 2

The estimated associations between C3 (A, B), C4 (C, D) and NAFLD by GRS. βO is the observed effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the outcome. βE is the expected effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the outcome. The formula is βE = βGE × βEO, where βGE is the effect size of exposure GRS (GRS_C3, GRS_C4, or GRS_NAFLD) on the exposure risk, and βEO is the effect size of exposure (C3, C4, or NAFLD) on the outcome risk. For binary traits, we used ln-transformed OR to acquire the value of β. All analyses were adjusted for age, marital status (yes or no), education level (middle school or below, high school, university or beyond), smoking (yes or no), drinking (yes or no), MetS (yes or no), and IR (yes or no). p*: p values were derived from Student’s t-test to compare the difference between expected and observed associations with the outcome risk

Pleiotropic Effects of SNPs and GRS

None of the C3-related, C4-related, and NAFLD-related SNPs reached a genome-wide association level, and none of them had pleiotropic effects with age, smoking, drinking, MetS, central obesity, hyperlipidemia, low HDL-C, hypertension, hyperglycemia, and IR (all p > 0.001) (Supplementary Tables 1–3). With the increase of GRS_C3 and GRS_C4, C3 and C4 levels were significantly decreased, respectively, while the prevalence of NAFLD significantly increased with higher GRS_NAFLD (Supplementary Tables 4–6). Moreover, GRS_C3, GRS_C4, and GRS_NAFLD were consistently not associated with potential confounders (all p for trend > 0.05).

Sensitivity Analysis

The F-statistics were examined to evaluate the weak effects of SNPs on NAFLD; they ranged from 1.530 to 105.999 (Table 1). Because the F-statistics of four C4-related SNPs (rs1052693, rs11575839, rs2857009, rs9276606) were less than 10, we recreated GRS_4SNP excluding these four SNPs of weak effects to estimate the effect size of C4 genes on NAFLD risk in sensitivity analysis. However, except for the significant causal effects from NAFLD to C3 (all p < 0.05), the bidirectional MR analysis did not reveal significant results, whether in unweighted or weighted GRS of C3, C4, and NAFLD in leave-1-SNP-out analysis (all p > 0.05) (Supplementary Table 7). These results were consistent with the prior results considering all the SNPs.
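The text does not say how the F-statistics were computed. A common choice for instrument strength is the first-stage F-statistic derived from the variance in the exposure explained by the instruments, F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below assumes that formula; the function name and the R² values are hypothetical, and only the sample size (1600) and the threshold of 10 come from the article.

```python
WEAK_INSTRUMENT_THRESHOLD = 10  # F above 10 was taken as sufficiently strong


def first_stage_f(r_squared, n, k=1):
    """F-statistic for k instruments explaining r_squared of the exposure variance."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for k instruments")
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


# Hypothetical single-SNP instruments in a sample of 1600 participants.
for r2 in (0.005, 0.02):
    f_stat = first_stage_f(r2, n=1600)
    label = "weak" if f_stat < WEAK_INSTRUMENT_THRESHOLD else "adequate"
    print(r2, round(f_stat, 2), label)
```

For a single instrument this is equivalent to the squared t-statistic of the SNP in the first-stage regression, which is why small per-SNP effects in a sample of this size can easily fall below 10.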

Discussion

In this study, an elevated serum level of C3 but not C4 was associated with NAFLD in both the cross-sectional and longitudinal analyses. In the MR analysis, except for the significant causal effects from NAFLD to C3, there were no associations between C3, C4, and NAFLD from either direction. To the best of our knowledge, this study is the first to confirm the causal relationships between C3, C4, and NAFLD using the MR method.

Previous conventional epidemiological studies have provided sufficient evidence to support the associated links between C3 and NAFLD in both Chinese and Italian populations (Jia et al. 2015; Ursini et al. 2017; Xu et al. 2016). Our observational results in the cross-sectional and longitudinal analyses are consistent with them. The interesting discrepancy was found in the same data using MR method to estimate causal effects from C3 to NAFLD, and it may be due to the limitations of conventional means to make causal inferences. Especially when adjusted confounders increased, the risk of NAFLD gradually decreased, and it has even returned negative results between C4 and NAFLD for either cross-sectional or longitudinal analysis. Because of the existence of unknown or unmeasured confounders or the imprecision of measured confounders, the significant results would be a biased estimate in a traditional epidemiological survey. These discrepancies were not seen in previous MR studies of C3 with the risk of other diseases, but they have been found in causal inferences between other risk factors and NAFLD risk using MR analysis (Wang et al. 2018b). Few studies have focused on the biological effects of C4 on NAFLD in both the conventional and MR designs.

In the MR analysis of this study, the GRS composed of two C3-related SNPs and eight C4-related SNPs were used to estimate the causal role from C3 and C4 to NAFLD. Conversely, three NAFLD-related SNPs were used to construct GRS_NAFLD for detecting the relationship from NAFLD to C3 and C4. Using a composed GRS as a single IV is helpful to create stronger instruments than each SNP alone, and thus it provides a better evaluation on the genetic effect. Formally, the composed GRS, as an IV, should satisfy the following three assumptions (Lawlor et al. 2008): first, the composed GRS should be robustly associated with the modifiable (non-genetic) exposure of interest. Our previous two-stage GWAS has confirmed the significant associations between the SNPs selected in this study and serum C3 and C4 levels. Meanwhile, we selected three NAFLD-related SNPs from other studies of large-sample GWAS and candidate genes in Asian populations, which were also reported with positive results in this study. We found the associations of the three GRS with corresponding exposures were also significant. Second, the composed GRS should not be associated with potential confounding factors that would bias observational epidemiological associations between modifiable exposures and outcomes. In this study, none of the SNPs had pleiotropic effects with possible confounders, including age, smoking, drinking, MetS, central obesity, hyperlipidemia, low HDL-C, hypertension, hyperglycemia, and IR. The associations between three GRS and the above-mentioned confounding variables were also not detected, whether considering the unweighted or weighted GRS. Third, the composed GRS should be related to the outcome only via its associations with the modifiable exposures. This means that the only causal route from GRS_C3 and GRS_C4 to NAFLD is through serum C3 and C4 levels (and vice versa). 
Hence, we analyzed the associations of each exposure-related SNP and GRS with the corresponding outcome (C3, C4 or NAFLD), and none of them showed significant results.

The complement system is activated during the inflammatory process of NAFLD, then it produces an immune response. Nevertheless, the causal relationship between C3, C4, and NAFLD is still unknown. Our MR results showed that there are no causal effects from C3/C4 to NAFLD, while conversely they might exist from NAFLD to C3 but not to C4. These results provide evidence to support the observation from previous studies: the change in serum C3 levels might be caused by NAFLD. Liver injury and dysfunction, which are caused by fatty liver diseases and concomitantly lead to excessive oxidative stress and inflammatory response, would reduce the normal synthesis and increase the consumption of C3 and C4, and it further leads to the decline in these proteins in patients with NAFLD (Gao et al. 2008; Hu 2002; van Greevenbroek et al. 2011). This vicious cycle would ultimately weaken the anti-inflammatory effects of C3 and C4 on NAFLD, and the disease would develop. Therefore, the changes in C3 and C4 levels in patients with NAFLD can be considered as the auxiliary observational indicators to judge the severity of disease progression in clinical therapy.

The strengths of this study mostly lie in the prospective design and the combination of observational and genetic epidemiology methods to draw the causal inferences in a step-by-step manner. The published epidemiological studies were only based on a cross-sectional design to explore the associations between serum C3 and C4 levels and NAFLD risk. Compared with the conventional means, the bidirectional MR approach would not be affected by the external environment, social behavior, and other factors, and thus it could be applied to control the potential or unmeasured confounder bias, reverse causation bias, measurement error and so on, enhancing the ability to infer causality.

Our study also has some limitations. First, the sample size is relatively small for MR analysis, which would result in lower power to estimate the effect size. Second, the selection of C3-related and C4-related SNPs may introduce weak IV bias: the F-statistics of two C3-related and four C4-related SNPs were less than 10. Of note, no studies other than our previous GWAS in the Chinese population have found loci associated with C3 and C4. Besides, after excluding the four C4-related SNPs with F-statistics less than 10, the null result for C4 and NAFLD persisted in the sensitivity analysis. Therefore, our findings need to be confirmed in a larger cohort once new C3-related or C4-related SNPs are found. Third, we built our GRS using common variants, which were considered to represent only a fraction of the genetic contribution to C3, C4, and NAFLD. Therefore, the potential contribution of rare variants was not assessed in this study. Fourth, abdominal ultrasound was used to diagnose liver steatosis. Though liver biopsy is the gold standard, it is a high-risk, traumatic operation performed by a specialist physician, and is not easily accepted by participants. Thus, ultrasonography is a relatively feasible method, and it has been used to determine fatty liver in many large epidemiological studies (Sinn et al. 2017). Fifth, we could not predict the effects of undetected confounders on the exposure and outcome. Limited by biological science and technology, our understanding of many biological effects is still poor. More investigations are required to illustrate the precise mechanisms of unknown confounding factors. Finally, this study was conducted only among adult males in southern China, in a population whose 4-year incidence of other chronic liver diseases exceeded even that of NAFLD. In this case, our results may not provide sufficient evidence to indicate the relationships in the general population, including children and women.
Moreover, our findings should be generalized cautiously to other ethnicities or ethnic groups.

In conclusion, we found that there were significant causal relationships from NAFLD to C3 but not C4 levels, and no significant causal relationships were revealed from serum C3 and C4 levels to NAFLD by the bidirectional MR approach, although conventional means indicated positive results. Our findings support that the change in the serum C3 level is caused by NAFLD in Chinese Han men, and provide novel evidence for the biologically plausible causal relationship between genetically determined C3, C4, and NAFLD.

References

  1. Bugdaci MS, Karaca C, Alkim C, Kesici B, Bayraktar B, Sokmen M (2012) Serum complement C4 in chronic hepatitis C: correlation with histopathologic findings and disease activity. Turk J Gastroenterol 23(1):33–37.

  2. Chaibub Neto E, Keller MP, Attie AD, Yandell BS (2010) Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Annals Appl Stat 4(1):320–339.

  3. Chalasani N, Younossi Z, Lavine JE, Charlton M, Cusi K, Rinella M et al (2018) The diagnosis and management of nonalcoholic fatty liver disease: practice guidance from the American Association for the Study of Liver Diseases. Hepatology 67(1):328–357.

  4. Friedman SL, Neuschwander-Tetri BA, Rinella M, Sanyal AJ (2018) Mechanisms of NAFLD development and therapeutic strategies. Nat Med 24(7):908–922.

  5. Gao B, Jeong WI, Tian Z (2008) Liver: an organ with predominant innate immunity. Hepatology 47(2):729–736.

  6. Haycock PC, Burgess S, Wade KH, Bowden J, Relton C, Davey Smith G (2016) Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies. Am J Clin Nutr 103(4):965–978.

  7. He M, Wu C, Xu J, Guo H, Yang H, Zhang X et al (2014) A genome wide association study of genetic loci that influence tumour biomarkers cancer antigen 19–9, carcinoembryonic antigen and alpha fetoprotein and their associations with cancer risk. Gut 63(1):143–151.

  8. Hu K-Q (2002) Occult hepatitis B virus infection and its clinical implications. J Viral Hepatitis 9(4):243–257.

  9. Jia Q, Li C, Xia Y, Zhang Q, Wu H, Du H et al (2015) Association between complement C3 and prevalence of fatty liver disease in an adult population: a cross-sectional study from the Tianjin Chronic Low-Grade Systemic Inflammation and Health (TCLSIHealth) cohort study. PLoS ONE 10(4):e0122026.

  10. Jiang DK, Ma XP, Yu H, Cao G, Ding DL, Chen H et al (2015) Genetic variants in five novel loci including CFB and CD40 predispose to chronic hepatitis B. Hepatology 62(1):118–128.

  11. Kitamoto T, Kitamoto A, Yoneda M, Hyogo H, Ochi H, Nakamura T et al (2013) Genome-wide scan revealed that polymorphisms in the PNPLA3, SAMM50, and PARVB genes are associated with development and progression of nonalcoholic fatty liver disease in Japan. Hum Genet 132(7):783–792.

  12. Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G (2008) Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med 27(8):1133–1163.

  13. Lee YH, Kim SU, Song K, Park JY, Kim DY, Ahn SH et al (2016) Sarcopenia is associated with significant liver fibrosis independently of obesity and insulin resistance in nonalcoholic fatty liver disease: Nationwide surveys (KNHANES 2008–2011). Hepatology 63(3):776–786.

  14. Lin YC, Chang PF, Chang MH, Ni YH (2014) Genetic variants in GCKR and PNPLA3 confer susceptibility to nonalcoholic fatty liver disease in obese individuals. Am J Clin Nutr 99(4):869–874.

  15. Liu Z, Tang Q, Wen J, Tang Y, Huang D, Huang Y et al (2016) Elevated serum complement factors 3 and 4 are strong inflammatory markers of the metabolic syndrome development: a longitudinal cohort study. Sci Rep 6:18713.

  16. Luk AO, So WY, Ma RC, Kong AP, Ozaki R, Ng VS et al (2008) Metabolic syndrome predicts new onset of chronic kidney disease in 5,829 patients with type 2 diabetes: a 5-year prospective analysis of the Hong Kong Diabetes Registry. Diabetes Care 31(12):2357–2361.

  17. Macaluso FS, Maida M, Petta S (2015) Genetic background in nonalcoholic fatty liver disease: a comprehensive review. World J Gastroenterol 21(39):11088–11111.

  18. Phillips AN, Smith GD (1991) How independent are “independent” effects? relative risk estimation when correlated exposures are measured imprecisely. J Clin Epidemiol 44(11):1223–1231.

  19. Pierce BL, Ahsan H, Vanderweele TJ (2011) Power and instrument strength requirements for Mendelian randomization studies using multiple genetic variants. Int J Epidemiol 40(3):740–752.

  20. Pritchard MT, McMullen MR, Stavitsky AB, Cohen JI, Lin F, Edward Medof M et al (2007) Differential contributions of C3, C5, and decay-accelerating factor to ethanol-induced fatty liver in mice. Gastroenterology 132(3):1117–1126.

  21. Rensen SS, Slaats Y, Driessen A, Peutz-Kootstra CJ, Nijhuis J, Steffensen R et al (2009) Activation of the complement system in human nonalcoholic fatty liver disease. Hepatology 50(6):1809–1817.

  22. Segers FM, Verdam FJ, de Jonge C, Boonen B, Driessen A, Shiri-Sverdlov R et al (2014) Complement alternative pathway activation in human nonalcoholic steatohepatitis. PLoS ONE 9(10):e110053.

  23. Sinn DH, Kang D, Jang HR, Gu S, Cho SJ, Paik SW et al (2017) Development of chronic kidney disease in patients with non-alcoholic fatty liver disease: a cohort study. J Hepatol 67(6):1274–1280.

  24. Smith GD, Ebrahim S (2002) Data dredging, bias, or confounding. BMJ 325(7378):1437–1438.

  25. Tan A, Gao Y, Yang X, Zhang H, Qin X, Mo L et al (2011) Low serum osteocalcin level is a potential marker for metabolic syndrome: results from a Chinese male population survey. Metabolism 60(8):1186–1192.

  26. Tian GX, Sun Y, Pang CJ, Tan AH, Gao Y, Zhang HY et al (2012) Oestradiol is a protective factor for non-alcoholic fatty liver disease in healthy men. Obes Rev 13(4):381–387.

  27. Ursini F, Russo E, Mauro D, Abenavoli L, Ammerata G, Serrao A et al (2017) Complement C3 and fatty liver disease in Rheumatoid arthritis patients: a cross-sectional study. Eur J Clin Invest 47(10):728–735.

  28. van Greevenbroek MM, Jacobs M, van der Kallen CJ, Vermeulen VM, Jansen EH, Schalkwijk CG et al (2011) The cross-sectional association between insulin resistance and circulating complement C3 is partly explained by plasma alanine aminotransferase, independent of central obesity and general inflammation (the CODAM study). Eur J Clin Invest 41(4):372–379.

  29. Wang X, Liu Z, Wang K, Wang Z, Sun X, Zhong L et al (2016) Additive effects of the risk alleles of PNPLA3 and TM6SF2 on non-alcoholic fatty liver disease (NAFLD) in a Chinese Population. Front Genet 7:140.

  30. Wang F, Wang J, Li Y, Yuan J, Yao P, Wei S et al (2018a) Gallstone disease and type 2 diabetes risk: a Mendelian randomization study. Hepatology.

  31. Wang N, Chen C, Zhao L, Chen Y, Han B, Xia F et al (2018b) Vitamin D and nonalcoholic fatty liver disease: bi-directional mendelian randomization analysis. EBioMedicine 28:187–193.

  32. Wlazlo N, van Greevenbroek MM, Ferreira I, Jansen EH, Feskens EJ, van der Kallen CJ et al (2013) Activated complement factor 3 is associated with liver fat and liver enzymes: the CODAM study. Eur J Clin Invest 43(7):679–688.

  33. Xu C, Chen Y, Xu L, Miao M, Li Y, Yu C (2016) Serum complement C3 levels are associated with nonalcoholic fatty liver disease independently of metabolic features in Chinese population. Sci Rep 6:23279.

  34. Yang X, Sun J, Gao Y, Tan A, Zhang H, Hu Y et al (2012) Genome-wide association study for serum complement C3 and C4 levels in healthy Chinese subjects. PLoS Genet 8(9):e1002916.

  35. Younossi ZM, Koenig AB, Abdelatif D, Fazel Y, Henry L, Wymer M (2016) Global epidemiology of nonalcoholic fatty liver disease—meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology 64(1):73–84.

  36. Younossi Z, Tacke F, Arrese M, Chander Sharma B, Mostafa I, Bugianesi E et al (2019) Global perspectives on nonalcoholic fatty liver disease and nonalcoholic steatohepatitis. Hepatology 69(6):2672–2682.


We thank the local research team from Fangchenggang First People’s Hospital for their contribution to the recruitment of study subjects. We thank XZ, HZ, and OL at Genergy Biotechnology (Shanghai) Co., Ltd. for their assistance with genotyping. Finally, we thank all study subjects for participating in this study.

Author information



Corresponding author

Correspondence to Xiaobo Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Availability of data and material

The datasets analyzed during the current study are available from the corresponding author upon reasonable request.

Code availability

Not applicable.

Authors' contributions

LL and YX designed the study. MZ, ZH, and YX coordinated the study and oversaw participant recruitment, collection, and analysis of biological samples. LL and HL conducted the statistical analyses. LL drafted the paper, which was reviewed by all authors. All authors approved the final version of the article, including the authorship list.


This work was supported by the Guangxi Natural Science Fund for Innovation Research Team [2017GXNSFGA198003], the Key Projects of Strategic International Scientific and Technological Innovation Cooperation of the Chinese Ministry of Science and Technology [2020YFE0201600], and the Guangxi Key Laboratory for Genomic and Personalized Medicine [19-185-33, 20-065-33].

Ethics approval

The study was approved by the Ethics and Human Subject Committee of Guangxi Medical University.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent to publication

Not applicable.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 52 KB)

Cite this article

Li, L., Huang, L., Yang, A. et al. Causal Relationship Between Complement C3, C4, and Nonalcoholic Fatty Liver Disease: Bidirectional Mendelian Randomization Analysis. Phenomics 1, 211–221 (2021).


  • Nonalcoholic fatty liver disease
  • C3
  • C4
  • Mendelian randomization