Background

Systemic lupus erythematosus (SLE) is a multisystem autoimmune disease, and lupus nephritis (LN) is a frequently occurring and serious complication of SLE [1, 2]. Studies indicate that the prevalence of hypothyroidism is much higher in SLE, and especially among LN patients, than in the general population [3,4,5,6]; additionally, the risk of subsequent cardiovascular events and renal impairment is higher among LN patients with thyroid dysfunction. Accordingly, analysis of the associations between LN and hypothyroidism and a determination of relevant risk factors would greatly aid in diagnosis and disease management.

However, the pathophysiological mechanisms underlying SLE with hypothyroidism are complex. Furthermore, the large number of clinical indicators and the size of the relevant datasets make it difficult to analyse clinical data directly; therefore, the precise nature of these mechanisms remains unknown [6,7,8].

Logistic regression is widely used to analyse the relationship between individual risk/protective factors and outcomes [9]. However, if the explanatory variables are collinear, the regression equation becomes unstable and its coefficient estimates unreliable. Principal component analysis (PCA) is a powerful method for exploring intricate datasets that feature multiple variables. PCA uses a mathematical algorithm to derive a smaller number of new variables called principal components (PCs), which are linear functions of the variables in the original dataset and are constructed to be uncorrelated with one another. Hence, PCA reduces the dimensionality of a large dataset while preserving as much statistical information as possible [10, 11]. As such, the current study's use of PCA helps ensure the stability of the regression equation. PCA has previously been used to analyse complex serological and immunological datasets with multiple variables in cross-sectional studies. Raymond et al. [12] used PCA to describe the dynamic interplay and influence of complex cytokines measured in serum and to detect the cytokine groups that differed across levels of disease activity in SLE patients. Adel Helmy et al. [13] used PCA to identify cytokine groups that accounted for the majority of the variation within serological laboratory test data in traumatic brain injury patients.
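To illustrate why PCs are attractive regressors when the original variables are collinear, the following minimal sketch (not the SPSS workflow used in this study) takes two strongly correlated variables, standardizes them, and forms the two PCs of their correlation matrix. For two standardized variables the PCs have a closed form, (z1 + z2)/√2 and (z1 − z2)/√2, with eigenvalues 1 + r and 1 − r. The variable names and values are invented purely for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def covariance(us, vs):
    """Population covariance of two equally long sequences."""
    mu, mv = mean(us), mean(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / len(us)


def two_variable_pca(xs, ys):
    """PCA of the 2x2 correlation matrix [[1, r], [r, 1]].

    Its eigenvectors are (1, 1)/sqrt(2) and (1, -1)/sqrt(2), with
    eigenvalues 1 + r and 1 - r, so the PC scores are a scaled sum
    and a scaled difference of the standardized variables.
    """
    z1, z2 = standardize(xs), standardize(ys)
    r = sum(a * b for a, b in zip(z1, z2)) / len(z1)
    pc_sum = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
    pc_diff = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]
    return r, (1 + r, 1 - r), pc_sum, pc_diff


# Invented values for two collinear laboratory variables (assumption).
scr = [60, 75, 90, 120, 150, 200]
bun = [4.0, 5.1, 6.3, 8.0, 10.5, 13.8]

r, eigenvalues, pc_sum, pc_diff = two_variable_pca(scr, bun)
print(f"r = {r:.3f}")
print(f"eigenvalues = {eigenvalues[0]:.3f}, {eigenvalues[1]:.3f}")
print(f"covariance of the PC scores = {covariance(pc_sum, pc_diff):.2e}")
```

Although the two inputs are almost perfectly correlated, the covariance of the two PC scores is zero up to rounding error, which is the property that removes collinearity from a subsequent regression on unrotated PCs.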

The current study examines the laboratory test results of selected patient populations, and leverages PCA–logistic regression analysis to pinpoint key PCs. Such information may greatly assist in the prevention or management of this disease.

Methods

Patients

In our cross-sectional study, we investigated 143 LN patients diagnosed through renal biopsy who had been admitted to Xiangya Hospital of Central South University in Changsha, China between June 2012 and December 2016. The exclusion criteria were the coexistence of another autoimmune disease and a diagnosis of thyroid disease prior to LN. All patients were informed of the objectives of this study, and each provided signed written consent prior to enrolment. Because this research did not affect patient treatment, ethics board approval was not required under Central South University policies.

Collection of clinical data

Data on patient characteristics, clinical symptoms, and laboratory results were retrospectively collected from each patient's medical records. These included: (1) general information, including age and sex; (2) clinical symptoms, including course of disease, hypertension, fever, cutaneous manifestations, alopecia, oral ulcer, malar rash, renal dysfunction (proteinuria), and haematological disease; and (3) laboratory results, including white blood cell count, haemoglobin (Hb) concentration, concentration of total protein (TP), serum lipids, erythrocyte sedimentation rate, C-reactive protein, C3, C4, and antibodies to dsDNA, Smith (Sm), SSA, SSB, U1 ribonucleoprotein, and ribosomal P protein. Patients' SLE disease activity index (SLEDAI) scores were calculated by an experienced clinician from the medical records.

Statistical analysis

Values herein are expressed as mean (standard deviation), median (interquartile range), or number (percentage). We compared categorical variables using the χ2 test and continuous variables between two independent groups using the t-test. Where a variable could not be shown to follow a normal distribution, we used the Mann–Whitney U-test instead.
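As an illustration of the categorical comparison, the sketch below computes a Pearson χ2 test for a 2 × 2 table using only the Python standard library. It is a sketch under stated assumptions rather than the SPSS procedure used here: no continuity correction is applied, and the counts are hypothetical. With one degree of freedom, the p-value can be obtained exactly from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction. Assumes every expected count is positive.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: symptom present/absent in two patient groups.
stat, p = chi_square_2x2(10, 20, 20, 10)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

Statistical packages may report a continuity-corrected statistic or Fisher's exact test for small counts, so values from this sketch need not match their output exactly.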

We performed PCA using the factor analysis procedure in SPSS to determine the interplay of clinical variables among LN patients with and without hypothyroidism. Convergence was achieved under an Oblimin rotation with Kaiser normalization. In the final PCA iteration, we covered nine clinical variables in the patient group analysed. To be retained as a PC, a component's eigenvalue had to exceed 1; PC1 represents the group of variables that accounted for the greatest amount of variation in the data. We used logistic regression to further screen clinically significant PCs and to identify critical factors that affect outcomes among LN patients.

We performed the analysis in three stages. First, we performed a univariate analysis to examine differences between LN patients with and without hypothyroidism. Second, we performed PCA on all the serology, immunology, and biochemistry variables of LN patients. PCA transforms the data by rotating the coordinate axes so as to maximize variance along each new axis (i.e., PC) while preserving the relationship and order among the data points; the PCs can then be used in further classification, as they retain information from the original data. Third, PCs were extracted until their cumulative contribution exceeded two-thirds of the total variance (> 2/3); these PCs were used as independent variables, and the clinical outcome was used as the dependent variable, for logistic regression modelling. In this way, we were able to obtain the PCs that significantly correlated with certain clinical outcomes. We generated a receiver operating characteristic (ROC) curve to assess the performance of the PCA–logistic regression model. Statistical analysis was performed using SPSS (version 19), and all p-values less than 0.05 were considered statistically significant.
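The extraction rule described in this section (eigenvalue above 1, cumulative contribution above 2/3) can be sketched as follows. This is a minimal illustration, not the SPSS output: the function name `select_components` and the eigenvalues are hypothetical, and the total variance is taken as the sum of all eigenvalues, which for a correlation-matrix PCA equals the number of input variables.

```python
def select_components(eigenvalues, min_cumulative=2 / 3):
    """Apply the eigenvalue > 1 rule and check cumulative contribution.

    Returns (number of retained PCs, their cumulative share of the
    total variance, whether that share exceeds min_cumulative).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    kept = [ev for ev in ordered if ev > 1.0]
    cumulative = sum(kept) / total
    return len(kept), cumulative, cumulative > min_cumulative


# Hypothetical eigenvalues for six standardized variables (sum = 6).
example_eigenvalues = [2.4, 1.5, 1.1, 0.5, 0.3, 0.2]
n_kept, cumulative, enough = select_components(example_eigenvalues)
print(f"retained PCs: {n_kept}")
print(f"cumulative contribution: {cumulative:.1%}")
print(f"exceeds 2/3: {enough}")
```

In this invented example three components pass the eigenvalue rule and together account for five-sixths of the variance, so they would be carried forward as regressors.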

Results

Patient characteristics

We compared the clinical characteristics of 48 LN patients with hypothyroidism and 95 LN patients with euthyroidism (Table 1). The two groups did not differ significantly in terms of age (35.6 vs. 33.1 years; p > 0.05), sex (87.5% vs. 83.2% female; p > 0.05), or disease duration (36 vs. 15 months; p > 0.05). LN patients with hypothyroidism had a significantly higher frequency of rash, and higher serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), triglyceride (TG), and low-density lipoprotein (LDL) concentrations. Additionally, Table 1 shows that the LN patients with hypothyroidism had lower Hb, C3, and C4 levels. Nevertheless, an analysis based on single variables is less accurate than one that considers multiple variables jointly when evaluating risk factors for LN with hypothyroidism, and selecting the informative variables remains difficult. The PCA–logistic regression model used in the current study is a reasonable solution to this problem.

Table 1 Main demographic, clinical and biochemical data of LN patients with hypothyroidism and euthyroidism

Principal component analysis

To cover as many indices affecting the outcomes of LN with hypothyroidism as possible, factors with p < 0.05 were included as input variables for PCA. The Kaiser–Meyer–Olkin value was 0.7 when all the clinical variables were included, and the p-value of the Bartlett test on the primary data was < 0.001, indicating that the data were suitable for PCA. We removed symptomatic variables and those whose extraction values in the communalities table were too small. The model generated nine PCs that explained 74% of the variation within the dataset; the first two of these, taken together, explained 30% of the variation. In terms of variance contribution rate, PC1 (eigenvalue λ1 = 3.515) had the highest contribution rate, 15.3%, and therefore contained the most information; PC2 (eigenvalue λ2 = 3.397) had a contribution rate of 14.7%. For the nine main PCs (Table 2), the loadings represent the degree of importance of the corresponding variable. For example, the three variables with the highest loadings on PC1 were, in order, albumin (ALB) > TP > C3; likewise, those on PC2 were SCr > BUN > UA. Focusing on the indices whose loadings were clearly higher than those of the others, PC1 was mainly about renal function (including SCr, BUN, and UA); PC2 was a serum protein factor (including TP and ALB); PC3 was a leukocyte factor; and PC4 was a globulin factor. PC5–PC8 could not be clearly assigned to any factor with a specific meaning, and PC9 was an autoantibody factor.
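The link between an eigenvalue and its contribution rate can be checked with simple arithmetic. In a PCA of the correlation matrix, each standardized variable contributes a variance of 1, so the contribution rate of a PC is its eigenvalue divided by the number of input variables. The sketch below assumes 23 input variables; this number is not stated in the text and is only inferred from the reported eigenvalues and percentages, so it should be read as an assumption.

```python
def contribution_rate(eigenvalue, n_variables):
    """Share of total variance explained by one PC in a
    correlation-matrix PCA (total variance = number of variables)."""
    return eigenvalue / n_variables


N_VARIABLES = 23  # assumption: inferred, not reported in the text

for label, eigenvalue in [("PC1", 3.515), ("PC2", 3.397)]:
    rate = contribution_rate(eigenvalue, N_VARIABLES)
    print(f"{label}: eigenvalue {eigenvalue} -> {rate:.1%}")
```

Under this assumption the two eigenvalues give roughly 15.3% and 14.8%, close to the reported 15.3% and 14.7% and summing to about 30%.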

Table 2 Component loadings

PCA–logistic regression analysis

We used the nine PCs as input variables and the clinical outcome (LN with or without hypothyroidism) as the dependent variable in logistic regression modelling. The results showed that PC1, PC2, and PC9 had a significant influence on whether LN was accompanied by hypothyroidism (Table 3); that is, SCr, BUN, UA, TP, ALB, and anti-ribonucleoprotein (RNP) antibody might be key factors to consider in treating LN patients with hypothyroidism. Notably, the Exp(B) values of PC2 and PC9 were 2.361 and 4.724, respectively, indicating that the association of each of these two PCs with hypothyroidism in LN patients was much stronger than that of the other PCs. We also generated an ROC curve (Fig. 1) that lay close to the top-left corner of the coordinate system. The area under the ROC curve (AUC) was 0.885 (p < 0.001).
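Two quantities in this paragraph can be made concrete with a short sketch. Exp(B) is the odds ratio associated with a one-unit increase in a PC score, so the underlying logistic coefficient is B = ln(Exp(B)). The AUC can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted probability than a randomly chosen patient without it. The predicted probabilities below are hypothetical and are not data from this study; only the two Exp(B) values are taken from the text.

```python
import math


def coefficient_from_exp_b(exp_b):
    """Recover the logistic regression coefficient B from Exp(B)."""
    return math.log(exp_b)


def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs in which the
    positive case has the higher score; ties count as one half."""
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Exp(B) values reported in the text for PC2 and PC9.
for label, exp_b in [("PC2", 2.361), ("PC9", 4.724)]:
    print(f"{label}: Exp(B) = {exp_b}, B = {coefficient_from_exp_b(exp_b):.3f}")

# Hypothetical predicted probabilities (assumption, not study data).
with_outcome = [0.9, 0.8, 0.6, 0.4]
without_outcome = [0.7, 0.5, 0.3, 0.2, 0.1]
print(f"AUC = {auc(with_outcome, without_outcome):.2f}")
```

The pairwise definition used here is equivalent to the area under the empirical ROC curve; it is quadratic in the number of patients, which is acceptable for a cohort of this size.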

Table 3 The result of logistic regression analysis
Fig. 1 The ROC curve of logistic regression (unadjusted model)

Discussion

We applied PCA–logistic regression analysis and found that three PCs (PC1, PC2, and PC9), which included SCr, BUN, UA, TP, ALB, and anti-RNP antibody, were important clinical variables with respect to LN patients with hypothyroidism. The Exp(B) values of PC2 and PC9 were 2.361 and 4.724, respectively, indicating that the association between these two PCs and the outcome was much stronger than that of the other PCs.

Previous studies conclude that the most common kidney derangements associated with hypothyroidism are elevated SCr levels, reduced estimated glomerular filtration rate, and water–electrolyte imbalance [14, 15]. Moreover, SCr levels in SLE patients with hypothyroidism were found to be elevated [3]. The current study also showed that renal function indices such as SCr, BUN, and UA are important factors in whether LN is accompanied by hypothyroidism. Possible mechanisms include reduced renal perfusion [16], adaptive preglomerular vasoconstriction caused by filtrate overload [17], and decreased endothelial nitric oxide synthase activity/capacity of the renal vasculature caused by reduced secretion of insulin-like growth factor 1 and vascular endothelial growth factor [18].

Severe hypoalbuminemia has been observed in SLE patients with subclinical hypothyroidism [3]; correspondingly, we found that lower TP and ALB were influential among LN patients with hypothyroidism. Most thyroid hormones are bound to plasma proteins, including thyroxine-binding globulin (TBG), thyroxine-binding pre-albumin (TBPA), and ALB. When kidney function is impaired in LN patients, TBG, TBPA, and ALB are significantly reduced because of severe and persistent proteinuria, which also affects thyroid hormone synthesis [19, 20]. Furthermore, the serum hormone concentration may be altered by changes in the binding capacity of serum proteins, so patients with hypoproteinemia may exhibit clinical features and laboratory findings suggestive of hypothyroidism [21, 22].

Additionally, in this study, a higher anti-RNP antibody level had a strong effect among LN patients with hypothyroidism, which has not been reported before. Anti-RNP antibody reacts with proteins that associate with U1 RNA to form U1 snRNP. Autoimmunity to RNP autoantigens is frequently seen in systemic autoimmune diseases, including lupus, and may induce renal disease [23,24,25]; as mentioned earlier, impaired kidney function may in turn affect thyroid hormone synthesis. Moreover, the induction of anti-RNP autoantibodies is associated with the initial clinical manifestations of autoimmune disease; in this case, autoantibodies may lead to thyroid hormone synthesis disorders by damaging the thyroid follicular epithelium [26,27,28,29], suggesting that RNP-related immune responses may have pathogenic roles in hypothyroidism. These hypotheses remain to be verified through further mechanistic research.

Conclusions

The principal component analysis (PCA)–logistic regression approach used herein is a useful statistical method for analysing the effects of multiple interacting clinical indices in lupus nephritis (LN) patients who also have hypothyroidism. Using this model, we found serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), total protein (TP), albumin (ALB), and anti-ribonucleoprotein (RNP) antibody to be particularly important factors with respect to these patients. Moreover, the impact of PC9, which mainly involved the anti-RNP antibody, was the strongest among these patients: its Exp(B) was 4.724, the highest among the nine principal components. SCr, BUN, UA, TP, ALB, and autoantibody levels are modifiable factors that can be improved through early treatment aimed at improving renal function and strengthening nutritional support, in order to reduce risk among LN patients with hypothyroidism. Ultimately, PCA offers valuable insight when exploring the influence of clinical variables or identifying the important factors that affect patient outcomes.