Introduction

According to World Health Organization (WHO) data, Mycobacterium tuberculosis infects about one quarter of the global population, of whom approximately 10 million develop active tuberculosis (TB) and 1.6 million die from it [1,2,3]. TB is one of the leading causes of death worldwide and poses a serious threat to global public health security [4]. A lung infection caused by M. tuberculosis leads to pulmonary TB. Extrapulmonary TB (EPTB) occurs when M. tuberculosis infects the spine, lymph nodes, kidney, liver, intestine, joints, brain, or other organs outside the lung [5]. The most common extrapulmonary form of TB is spinal TB, which accounts for half of all bone TB cases [6,7,8]. Spinal TB can cause severe bone destruction and scoliosis and can impair neurological function. It is often refractory and has high disability and recurrence rates, which seriously affect the patient’s quality of life [9, 10]. Studies have revealed that patients infected with M. tuberculosis develop active TB when their immune system is imbalanced [11], and that the incidence of EPTB is then higher [12]. Granulomas containing large numbers of immune cells, including macrophages, monocytes, T cells, and B cells, form at sites of M. tuberculosis infection [13], suggesting that immune cell dysregulation might play a crucial role in TB pathogenesis.

Current research indicates that hypoxia plays a key role in both physiological and pathological immune responses, and that its effects on inflammation and immunity differ across immune processes and microenvironments. In pathological conditions such as chronic inflammation, infection, and tissue ischemia, hypoxia induces dysregulation of immune cells, leading to disease progression [14]. Allison N. Bucşan et al. found that Erdman, a strain of M. tuberculosis, exhibited greater virulence under hypoxic conditions. Hypoxia may substantially impact bacterial persistence, reactivation, and treatment efficiency [15]. Hypoxia-inducible factor (HIF) is a regulatory factor that plays an essential role in transcriptional regulation in immune effector cells, and the HIF pathway is activated in response to tissue hypoxia [16, 17]. During bacterial infection, bacterial oxygen consumption, the formation of oxygen-impermeable biofilms, and inflammation-related hypoxia activate HIF and affect the function of immune cells [18,19,20]. In addition, one study showed that hypoxia can increase the drug resistance of Pseudomonas aeruginosa [21].

In this study, we used a label-free protein profiling method to analyze the diseased intervertebral disks of patients with spinal TB. We then applied WGCNA and machine learning methods to find key hypoxia-related genes, and constructed several diagnostic and predictive models to evaluate the diagnostic and predictive value of these genes in TB. We also used ssGSEA to identify immune cells associated with spinal TB and validated the results with data from routine blood tests. Finally, a pharmaco-transcriptomic analysis was performed.

Materials and methods

Tissue samples collection

We collected intervertebral disks from ten patients who underwent spinal surgery at the First Affiliated Hospital of Guangxi Medical University from 2018 to 2020. Five patients with spinal TB were included in the experimental group, and five patients with thoracolumbar disk herniation were included in the control group. None of the patients had evidence of autoimmune diseases, spinal tumors, or other infectious diseases. This study was conducted in accordance with the Declaration of Helsinki and passed ethical review, and informed consent was obtained from all patients.

Label-free quantitative proteomic analysis

The label-free quantitative proteomic analysis followed the steps described in our previous research [22], which are summarized below:

Sample lysis

The RIPA solution must be prepared right before use and stored in an ice bath to keep it cool. The mixture consists of RIPA lysis buffer, Protease inhibitor cocktail, and 1 mM PMSF (Phenylmethylsulfonyl fluoride). For each 100 mg sample tissue, 1,000 µl of RIPA solution should be thoroughly mixed and homogenized, with sonication at 4 °C for 5 min. Afterwards, centrifugation should be done at 14,000 g for 15 min at 4 °C. The supernatant should then be transferred to a new EP tube and stored in an ice bath.

BCA assay

Following the BCA (bicinchoninic acid) Protein Assay Kit instructions, reagent A and reagent B should be mixed at a ratio of 50:1 and added at 160 µl/well to a 96-well plate (with five wells for a calibration curve and one well for a blank). Then 10 µl of each sample (diluted 5–10 times) or calibration standard protein (at five different concentrations) should be added to the respective wells. The plates should be shaken and incubated at 37 °C for 30 min, after which they should be read at a wavelength of 562 nm. The protein concentration of each sample can then be determined from the calibration curve.
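As an illustration of the last step, the calibration curve can be sketched as a straight-line fit of absorbance against standard concentration, followed by back-calculation of the sample concentration. This is a minimal sketch, not the procedure of the kit or of this study: the linear model, the function names, and all numeric values are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_concentration(a562, blank, slope, intercept, dilution):
    """Back-calculate the concentration of the undiluted sample,
    in the same unit as the calibration standards."""
    return (a562 - blank - intercept) / slope * dilution


# Hypothetical standards (mg/ml) and blank-corrected A562 readings.
standards = [0.125, 0.25, 0.5, 1.0, 2.0]
absorbances = [0.1, 0.2, 0.4, 0.8, 1.6]
slope, intercept = fit_line(standards, absorbances)

# Hypothetical sample: raw reading 0.55, blank 0.05, diluted 10 times.
conc = protein_concentration(0.55, 0.05, slope, intercept, dilution=10)
```

In practice the BCA response is not perfectly linear over wide concentration ranges, so the fit should be restricted to the range covered by the standards.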

Acetone precipitation

For every sample, 100 µg of protein was taken and diluted to 1 mg/ml in RIPA buffer. Then, 4–6 times the volume of pre-chilled acetone was mixed into the EP tube and shaken in an ice bath for 30 min or left to incubate at -20 °C for the entire night. Following centrifugation at a speed of 10,000 g and 4 °C, the supernatant was carefully discarded, taking care not to disturb the pellet. The sample was then washed twice using 200 µl of cold 80% acetone.

Resuspend protein for tryptic digest

Two hundred µl of 1% SDC and 100 mM ABC (ammonium bicarbonate) were added to the EP tube, mixed with a vortex, and spun down. The EP tube was then sonicated for 5–30 min in a water bath to dissolve the proteins. Five mmol of TCEP (tris 2-carboxyethyl phosphine) was then added to the EP tube and mixed at 55 °C for 10 min. After the sample had cooled to room temperature (RT), ten mmol of IAA (iodoacetamide) was added, and the EP tube was incubated in the dark for 15 min. Trypsin (sequence grade) was resuspended in a resuspension buffer to 0.5 µg/µl and incubated at RT for 5 min. The trypsin solution was then added to the EP tube at a protein:trypsin ratio of 50:1. The mixture was blended well and spun down, then incubated at 37 °C in a thermomixer for approximately 8 h or overnight.
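The trypsin volume follows directly from the stated ratio and stock concentration. The short sketch below works through that arithmetic for the 100 µg of protein taken per sample in the acetone precipitation step; the function name is hypothetical.

```python
def trypsin_volume_ul(protein_ug, ratio=50.0, stock_ug_per_ul=0.5):
    """Volume of trypsin stock needed for a given protein:trypsin mass ratio."""
    trypsin_ug = protein_ug / ratio        # mass of trypsin required
    return trypsin_ug / stock_ug_per_ul    # convert mass to stock volume


# 100 ug protein at 50:1 needs 2 ug trypsin, i.e. 4 ul of 0.5 ug/ul stock.
volume = trypsin_volume_ul(100.0)
```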

Cleaning up of SDC

After 2% TFA (Trifluoroacetic Acid, HPLC) was added to the EP tube, SDC was precipitated. After being centrifuged at the highest speed, the supernatant was transferred to a new EP tube. N * 100 µl of 2% TFA was added to the pellet to extract the co-precipitated peptides. This step was repeated twice. The three supernatants were then combined. After being centrifuged at the highest speed for 10–20 min, the supernatant was carefully transferred to a new EP tube, leaving the peptide samples.

Peptide desalting for Base-RP fractionation

Buffer A (0.1% FA, H2O, 2% ACN) and Buffer B (0.1% FA, 70% ACN) were prepared. The C18 (3M) column was equilibrated with 500 µl of ACN and then washed twice with 500 µl of 0.1% FA. The peptide solution was then added to the column. After low-speed centrifugation, liquid (A) was collected. This process was repeated once more, with the peptides eluted using 400 µl of 70% ACN and liquid (A) collected. Desalting was performed once again with liquid (A). The two liquids were then combined and vacuum-dried at either 4 °C or room temperature. Buffer A was then added to re-dissolve the peptides to 1 g/L for LC-MS/MS detection or storage at −80 °C.

Separation via Nano-UPLC and LC-MS/MS

From each sample, 2 µg of peptides was separated and detected using nano-UPLC coupled with a Q-Exactive mass spectrometer. The analysis used a reverse-phase column and a mobile phase composed of solvent A (0.1% FA, 2% ACN) and solvent B (80% ACN, 0.1% FA). Samples were loaded directly onto the chromatographic column by an autosampler and then separated on the column. Peptides were analyzed by LC-MS/MS for 240 min per sample in positive ion detection mode, with a scanning range of 350–1600 m/z and the data-dependent acquisition (DDA) method. Standard parameters were used for resolution, AGC, maximum IT, NCE, isolation window, and dynamic exclusion time.

MaxQuant analysis and LFQ

Raw MS data were processed with MaxQuant (1.6.1.0) against the UniProt database. LFQ was performed with trypsin as the digestion enzyme and with oxidation [M] and acetyl [protein N-term] as variable modifications (maximum of three variable modifications); carbamidomethyl [C] was set as the fixed modification. Peptides without variable modifications were used for quantification, with an FDR of 0.01. The ten samples were standardized, and missing values were imputed using Perseus software. Protein groups with fewer non-missing values than biological replicates were removed. LFQ quantification results were log-transformed.
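The filtering and transformation steps can be sketched as follows. This is an illustrative simplification, not the Perseus workflow used here: it assumes that missing intensities are encoded as None or 0, that the logarithm is base 2, and that imputation is handled separately. The function name and the toy data are hypothetical.

```python
import math


def filter_and_log2(lfq, min_valid):
    """Keep protein groups with at least `min_valid` non-missing intensities
    and log2-transform the remaining values.

    lfq: dict mapping protein group -> list of LFQ intensities,
         where None or 0 marks a missing value (assumed encoding).
    """
    kept = {}
    for protein, values in lfq.items():
        n_valid = sum(1 for v in values if v is not None and v > 0)
        if n_valid < min_valid:
            continue  # too many missing values, drop this protein group
        kept[protein] = [
            math.log2(v) if (v is not None and v > 0) else None
            for v in values
        ]
    return kept


# Toy example: five biological replicates per group, so min_valid = 5.
toy = {
    "P1": [1024, 2048, 0, 4096, 512, 1024],
    "P2": [0, 0, 0, 100, 0, 0],
}
result = filter_and_log2(toy, min_valid=5)
```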

Identification of differentially expressed proteins

To identify differentially expressed proteins between spinal TB and controls, we performed differential analysis of the normalized quantitative results using the “limma” package. Proteins with |logFC| > 1 and p-value < 0.05 were considered differentially expressed [23, 24]. To illustrate these differential proteins more clearly, we created a volcano plot and a cluster heat map using the “impulse” and “pheatmap” packages. All analyses were carried out in R (version 4.1.1).
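The screening rule itself is a simple threshold filter. The sketch below applies it to log fold changes and p-values that are assumed to have been computed already (in this study, by limma); it does not reproduce limma's moderated statistics, and the function name and toy values are hypothetical.

```python
def screen_proteins(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split proteins into up- and down-regulated lists.

    stats: dict mapping protein -> (log fold change, p-value).
    A protein passes when |logFC| > lfc_cutoff and p-value < p_cutoff.
    """
    up, down = [], []
    for protein, (logfc, p_value) in stats.items():
        if p_value < p_cutoff and abs(logfc) > lfc_cutoff:
            (up if logfc > 0 else down).append(protein)
    return sorted(up), sorted(down)


# Toy statistics with placeholder protein names.
toy_stats = {
    "P1": (2.3, 0.001),   # up-regulated
    "P2": (-1.8, 0.020),  # down-regulated
    "P3": (0.4, 0.001),   # significant but small change
    "P4": (3.0, 0.200),   # large change but not significant
}
up, down = screen_proteins(toy_stats)
```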

GO/KEGG and DO enrichment analyses

To further explore the biological functions of these differential proteins, we used the “clusterProfiler” package for gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses [25,26,27]. We also performed a disease ontology (DO) analysis on these differential proteins to reveal the relationship between spinal TB and other diseases [28]. To improve the accuracy of the results, we set the screening conditions as p-value < 0.05 and q-value < 0.05. Finally, the top 10 most significantly enriched GO terms, KEGG pathways, and DO terms were visualized.
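Over-representation analyses of this kind are commonly based on a hypergeometric test. The sketch below computes the one-sided p-value for a single term; it is an illustration of that test only, without multiple-testing correction or q-values, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_background: number of annotated genes in the background
    n_term:       background genes annotated to the term
    n_selected:   genes in the list being tested
    n_overlap:    tested genes annotated to the term
    """
    upper = min(n_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy counts: 20 background genes, 5 in the term, 4 selected, 2 overlapping.
p_value = enrichment_p_value(20, 5, 4, 2)
```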

Weighted gene co-expression network analysis

Weighted gene co-expression network analysis (WGCNA) is a systems biology method for describing patterns of gene association across samples. It identifies sets of genes whose expression changes in a highly coordinated way and, from the interconnection within each gene set and the association between gene sets and phenotype, finds the gene sets most strongly correlated with the disease. It is widely used in studies of diseases and other traits and in gene association studies [29]. In this study, we used the “WGCNA” package to cluster all proteins, automatically select the best soft threshold, and obtain the protein modules related to the disease.
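The role of the soft threshold can be illustrated with the unsigned adjacency used in weighted co-expression networks, where the absolute Pearson correlation between two profiles is raised to a power β. The sketch below shows only this step, not module detection or automatic threshold selection; the function names, the value β = 6, and the toy profiles are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta):
    """Unsigned weighted adjacency a_ij = |cor(i, j)| ** beta for all pairs."""
    names = list(profiles)
    return {
        (a, b): abs(pearson(profiles[a], profiles[b])) ** beta
        for a in names
        for b in names
        if a != b
    }


# Toy expression profiles across four samples (placeholder names).
toy_profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with P1
    "P3": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with P1
}
adjacency = soft_threshold_adjacency(toy_profiles, beta=6)
```

Raising the correlation to a power keeps strong connections close to 1 while pushing moderate ones toward 0, which is why a correlation of 0.8 contributes far less than a perfect one.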

Construction of a PPI network of hypoxia-related proteins

In this study, we investigated the role of hypoxia-related proteins in spinal TB by intersecting the two most disease-related modules in WGCNA with a set of all hypoxia-related genes in humans downloaded from the Molecular Signatures Database (version 7.5.1) and differential proteins [29]. Later, the results were used to construct a protein-protein interaction network through the STRING database (version 11.5) and visualized through Cytoscape (version 3.9.0). Finally, a key module in the network was retrieved through the MCODE plugin in Cytoscape software [30].
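The intersection step amounts to taking the proteins shared by three sets. A minimal sketch with placeholder identifiers (not the actual gene lists) is:

```python
def candidate_proteins(module_proteins, hypoxia_genes, differential_proteins):
    """Proteins present in the disease-related modules, the hypoxia gene set,
    and the list of differentially expressed proteins."""
    shared = (
        set(module_proteins) & set(hypoxia_genes) & set(differential_proteins)
    )
    return sorted(shared)


# Placeholder identifiers for illustration only.
modules = ["G1", "G2", "G3", "G4"]
hypoxia = ["G2", "G3", "G5"]
differential = ["G3", "G2", "G6"]
candidates = candidate_proteins(modules, hypoxia, differential)
```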

Identification of key hypoxia-related proteins and prediction model construction

To investigate the transcriptome-level expression in TB of the hypoxia-related proteins closely associated with spinal TB, the TB-related GSE144127, GSE83456, and GSE147690 datasets were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The mRNA expression levels of these hypoxia-related proteins in spinal TB and other extrapulmonary TB were extracted from the GSE144127 dataset for differential analysis. This yielded 11 hypoxia-related genes with consistent changes at the transcriptional and protein levels. We then used two machine learning methods, LASSO and SVM-RFE, to screen these 11 genes further. LASSO is a regression analysis method that performs variable selection and regularization while fitting a generalized linear model and selects the best variables by the smallest λ value [31]; it was implemented with the “glmnet” package. SVM-RFE (support vector machine recursive feature elimination) is a powerful feature selection algorithm that finds the optimal feature subset by repeatedly building the model and eliminating redundant features [32]; it was implemented with the “e1071”, “kernlab”, and “caret” packages. Subsequently, we intersected the genes from LASSO, SVM-RFE, and the MCODE module to obtain three important genes. Finally, diagnostic models were developed using five machine learning techniques, namely logistic regression [33], Bayesian logistic regression [34], decision tree [35], random forest [36], and extreme gradient boosting [37], to evaluate the diagnostic value of these three genes in TB.
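To show how LASSO shrinks some coefficients exactly to zero, the sketch below implements coordinate descent with soft thresholding for a linear response. It is a simplified illustration, not the glmnet procedure used in this study: it assumes centered data with no intercept, a fixed penalty λ rather than cross-validated selection, and a continuous rather than binary outcome. All names and values are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic
    coordinate descent. X is a list of rows; data are assumed centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero feature carries no information
            # Covariance of feature j with the residual excluding feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy design with two orthogonal features; y = 2 * x1 + 0.1 * x2.
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [2.1, 1.9, -1.9, -2.1]
beta_penalized = lasso_coordinate_descent(X_toy, y_toy, lam=0.5)
beta_unpenalized = lasso_coordinate_descent(X_toy, y_toy, lam=0.0)
```

With the penalty applied, the weak second feature is removed entirely while the strong first feature is retained with a shrunken coefficient, which is the behavior that makes LASSO useful for variable selection.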

Immune infiltration analysis

We obtained 28 immune cells and their marker genes from a prior study, used ssGSEA to assess the protein expression matrix through the “GSVA” package, and scored each sample according to the expression of the marker genes to determine the immune cell infiltration level [31]. Finally, using the “limma” and “corrplot” packages, the difference and correlation analyses were performed.
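The idea behind a single-sample enrichment score can be sketched as a rank-weighted running sum over one sample's ordered expression values. This is a simplified illustration in the spirit of ssGSEA and is not the “GSVA” implementation: it omits the normalization across samples, and the weighting exponent of 0.25, the function name, and the toy data are assumptions.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one marker gene set.

    expression: dict mapping gene -> expression value in one sample.
    gene_set:   set of marker genes for one immune cell type.
    """
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    # The highest expressed gene gets rank n, the lowest gets rank 1.
    weights = {gene: (n - idx) ** alpha for idx, gene in enumerate(ordered)}
    hits = [gene in gene_set for gene in ordered]
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == n:
        raise ValueError("gene set must cover some but not all genes")
    total_hit_weight = sum(weights[g] for g, hit in zip(ordered, hits) if hit)
    miss_step = 1.0 / (n - n_hits)
    cum_hit = cum_miss = score = 0.0
    for gene, hit in zip(ordered, hits):
        if hit:
            cum_hit += weights[gene] / total_hit_weight
        else:
            cum_miss += miss_step
        # Accumulate the gap between the two empirical distributions.
        score += cum_hit - cum_miss
    return score


# Toy sample with six genes; placeholder names and values.
sample = {"g1": 6.0, "g2": 5.0, "g3": 4.0, "g4": 3.0, "g5": 2.0, "g6": 1.0}
score_top = ssgsea_score(sample, {"g1", "g2"})     # markers highly expressed
score_bottom = ssgsea_score(sample, {"g5", "g6"})  # markers lowly expressed
```

A positive score indicates that the marker genes sit near the top of the sample's ranking, which is read as higher infiltration of that cell type.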

Blood routine data validation

To further validate the differential immune cell infiltration findings, we collected lymphocyte, monocyte, and platelet counts from routine blood examinations of 162 normal controls and 237 patients with spinal TB for statistical analysis. This study adhered to the Declaration of Helsinki guidelines and received approval from the hospital ethics committee.

Pharmaco-transcriptomic analysis

To provide new solutions for treating multidrug-resistant TB, we conducted a pharmaco-transcriptomic analysis utilizing the DrugBank database (version 5.1.9). DrugBank database integrates the chemical structure and pharmacological action of drugs, as well as the sequence, structure, and physiological pathway of drug action targets [38]. It is an extensive, public web database. Finally, Cytoscape was used to obtain and visualize the effect of drug molecule metabolism on the up- or down-regulation of genes.

Immunohistochemistry

In this study, intervertebral disc tissue resected during surgery from 5 patients diagnosed with spinal tuberculosis at the First Affiliated Hospital of Guangxi Medical University formed the experimental group, and intervertebral disc tissue resected during surgery from 5 patients with lumbar intervertebral disc protrusion formed the control group. The expression of PSMB9, STAT1, and TAP1 in the experimental and control groups was compared by immunohistochemistry. After separating the disc tissue, we immersed it in formalin solution for fixation within 10 min. We then prepared immunohistochemical sections and performed staining following laboratory procedures including paraffin embedding, sectioning, antigen retrieval, antibody incubation, color development, and mounting. The specimens were observed under an inverted microscope, and images were collected for the experimental and control groups. We used ImageJ software to evaluate the positive rate in all immunohistochemical images and used an independent samples t-test in IBM SPSS Statistics 26.0 to compare the positive rates of PSMB9, STAT1, and TAP1 between the experimental group and the control group.
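The independent samples t-test compares two group means using a pooled variance estimate. The sketch below computes the t statistic and degrees of freedom under the equal-variance assumption; it does not compute the p-value, which requires the t distribution, and it is not the SPSS procedure itself. The function name and toy values are hypothetical.

```python
import math


def student_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent
    samples, assuming equal variances in the two groups."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Toy values standing in for positive rates in two groups.
t_stat, dof = student_t([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
```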

Results

Differentially expressed proteins

Following label-free quantitative proteomic analysis, we obtained 1965 quantifiable proteins. The quantitative repeatability analysis between samples revealed that the quantitative experiment had good sensitivity and reliability (Fig. 1A). According to the screening conditions, we obtained 350 differentially expressed proteins, which could be clearly distinguished by volcano plot (Fig. 1B) and cluster heat map (Fig. 1C). Furthermore, the cluster heat map also indicated that these differential proteins could distinguish well between the spinal TB and control groups.

Fig. 1

Differentially expressed proteins. (A) The quantitative repeatability analysis between samples. (B) The cluster heat map of differentially expressed proteins. (C) The volcano plot of differentially expressed proteins

GO/KEGG and DO enrichment analyses

GO enrichment analysis showed that these differentially expressed proteins are primarily involved in cytoplasmic translation, generation of precursor metabolites and energy, electron transport chain, cellular respiration, oxidation of organic compounds to produce energy, aerobic respiration, collagen fibril organization, and other processes (Fig. 2A). KEGG pathway analysis showed that they were primarily related to ribosome, coronavirus disease (COVID-19), chemical carcinogenesis-reactive oxygen species, phagosome, oxidative phosphorylation, neutrophil extracellular trap formation, citrate cycle (TCA cycle), and other pathways (Fig. 2B). DO analysis found that these differentially expressed proteins were linked not only to pulmonary disease but also to osteoarthritis, bacterial infectious disease, atherosclerosis, arteriosclerotic cardiovascular disease, phagocyte bactericidal dysfunction, and other diseases (Fig. 2C). This provides novel insights into the etiology and comorbidities of spinal TB.

Fig. 2

GO/KEGG and DO enrichment analyses. (A) The top 10 entries of GO enrichment analysis for differentially expressed proteins. (B) The top 10 entries of the KEGG pathway enriched by the differentially expressed proteins. (C) The top 30 entries of DO analysis enriched by the differentially expressed proteins

WGCNA and identification of key modules

WGCNA can cluster genes with similar expression patterns, analyze the correlation between modules and specific traits or phenotypes, and identify molecular markers that are strongly correlated with disease. It is a method frequently employed in bioinformatics analysis. We found that two modules, “salmon” and “green,” were highly correlated with spinal TB (Fig. 3A-I), and that gene expression in most modules also showed a significant correlation (Fig. 3J-P).

Fig. 3

Results of weighted gene co-expression network analysis. (A-P) The entire WGCNA process, from sample clustering to correlation analysis, looking for the genes in the modules most associated with the disease

PPI network of hypoxia-related proteins

We intersected the proteins in the “salmon” and “green” WGCNA modules with 3147 hypoxia-related genes and the 350 differential proteins screened in our study, which yielded 36 hypoxia-related proteins in total (Fig. 4A). Using the STRING database, we constructed a protein-protein interaction network with 22 nodes and 27 edges (Fig. 4B). With the MCODE plugin, we found only one key module in the network (Fig. 4C).

Fig. 4

The PPI network of hypoxia-related proteins. (A) The results of taking the intersection of the differentially expressed proteins, the hypoxia-associated genes, and the genes from the two modules most associated with the disease. (B) A protein-protein interaction network of hypoxia-related proteins. (C) The key module in the network

The key hypoxia-related proteins and prediction models

To further explore the role of hypoxia-related genes in TB, we analyzed the GSE144127 dataset. We found that the transcriptional levels of 11 of these 36 hypoxia-related genes were consistent with their protein expression levels (10 up-regulated and 1 down-regulated), and that their transcriptional levels differed significantly between the extrapulmonary TB and control groups (Fig. 5A). These 11 genes were further screened in the extrapulmonary TB and control groups using the LASSO and SVM-RFE machine learning methods (Fig. 5B-D) and then intersected with the key module extracted by the MCODE plugin. This yielded three genes, PSMB9, STAT1, and TAP1 (Fig. 5E). The GSE83456 dataset revealed significant differences in these three genes between the TB and control groups (Fig. 5F-H). In addition, in the GSE144127 dataset, the AUCs of PSMB9, STAT1, and TAP1 for distinguishing extrapulmonary TB from controls were 0.781, 0.804, and 0.788, respectively (Fig. 5I). In the GSE83456 dataset, the AUCs of PSMB9, STAT1, and TAP1 for distinguishing TB from controls were as high as 0.934, 0.961, and 0.966, respectively (Fig. 5J). All three genes therefore have high diagnostic value for TB and may play a crucial role in its pathogenesis.
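For a single gene, the AUC can be read as the probability that a randomly chosen case has a higher expression value than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and toy values are hypothetical and do not reproduce the reported results.

```python
def roc_auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison of cases and
    controls (equivalent to the normalized Mann-Whitney U statistic)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half
    return wins / (len(case_values) * len(control_values))


# Toy expression values for one gene in cases and controls.
auc = roc_auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.5])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.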

Finally, the five machine learning methods of logistic regression, Bayesian logistic regression, decision tree, random forest, and extreme gradient boosting were used to build a prediction model based on these three genes. In the GSE144127 dataset, the accuracies in extrapulmonary TB and the control group were 0.764, 0.764, 0.758, 0.701 and 0.783, respectively (Fig. 5K). In the GSE83456 dataset, the accuracies of pulmonary TB and the control group were 0.822, 0.844, 0.822, 0.8, and 0.8 (Fig. 5L). Comparatively, we can observe that the machine learning method extreme gradient boosting has the highest prediction accuracy for extrapulmonary TB, which is 0.783, and Bayesian logistic regression has the highest prediction accuracy for pulmonary TB, which is 0.844.

Fig. 5

The key hypoxia-related proteins and prediction models. (A) Differential expression of 11 genes in extrapulmonary TB and control group in the GSE144127 dataset. (B) SVM-RFE algorithm for screening key genes. (C) LASSO coefficient profiles of the 11 differentially expressed genes. (D) Selection of the best parameter. (E) PSMB9, STAT1, and TAP1 were screened by two algorithms and MCODE. (F-H) Differential expression of PSMB9, STAT1, and TAP1 between TB group and control group in the GSE83456 dataset. (I, J) Diagnostic ROC curves of PSMB9, STAT1, and TAP1 in extrapulmonary TB and TB. (K, L) Accuracy of PSMB9, STAT1, and TAP1 prediction models based on 5 machine learning algorithms for extrapulmonary TB and TB.

Immune infiltration analysis

Using ssGSEA, we obtained 25 types of infiltrating immune cells across all protein samples (Fig. 6A). The correlation heat map showed a strong positive correlation between activated dendritic cells and gamma delta T cells (r = 0.73) and between gamma delta T cells and immature B cells (r = 0.77). Monocytes were also significantly correlated with most lymphocytes (Fig. 6B). Differential analysis showed that most immune cells were highly infiltrated in the disease group, and that activated dendritic cells, gamma delta T cells, and immature B cells differed significantly between the spinal TB group and the control group (p-value < 0.05) (Fig. 6C).

Fig. 6

Immune infiltration analysis. (A) Heat map of the landscape of 25 immune cell subpopulations infiltration. (B) Heat map of correlation between immune cells. (C) Violin plot of immune cell differences between disease group and control group

Correlation of hypoxia-related genes PSMB9, STAT1, and TAP1 with immune cells

Following correlation analysis (Fig. 7), we found that PSMB9, STAT1, and TAP1 significantly correlated with activated dendritic cells, gamma delta T cells, immature B cells, and neutrophils. In addition, STAT1 and TAP1 were also significantly positively correlated with central memory CD4 T cells and macrophages while negatively correlated with Type 1 T helper cells. PSMB9 and STAT1 had the strongest and most significant correlation with gamma delta T cells, while TAP1 had the strongest and most significant correlation with immature B cells. This suggests that these key genes and immune cells might play an important role in the pathogenesis of TB, including spinal TB (Fig. 7A-U).

Fig. 7

Correlation of PSMB9, STAT1, and TAP1 with immune cells. (A-C) Lollipop plot of correlation of PSMB9, STAT1, and TAP1 with immune cells. (D-U) Scatter plot of significant correlation between PSMB9, STAT1, and TAP1 with immune cells

Blood routine data validation

Statistical analysis of routine blood examinations from 162 normal controls and 237 patients with spinal TB showed that monocyte and platelet counts were higher in the spinal TB group than in the normal control group, whereas lymphocyte counts were higher in the normal control group than in the spinal TB group; these differences were statistically significant (p-value < 0.05) (Fig. 8A-C). In our ssGSEA-based immune cell infiltration results, monocytes and macrophages had higher infiltration levels in the disease group, which is consistent with the routine blood data.

Fig. 8

Routine blood tests. (A-C) The results of routine blood tests for monocytes, platelets, and lymphocytes in 162 normal patients and 237 patients with spinal TB.

Pharmaco-transcriptomic analysis

In the GSE147690 dataset, we found that PSMB9, STAT1, and TAP1 were also highly expressed in the multidrug-resistant TB group, and the difference was very statistically significant (p-value < 0.01) (Fig. 9A-C). PSMB9, STAT1, and TAP1 may be potential therapeutic targets for multidrug-resistant TB. Therefore, we performed pharmaco-transcriptomic analysis and found that 11 drug compounds, such as estradiol, cyclosporine, and cisplatin, can upregulate the expression of PSMB9. At the same time, acetaminophen and calcitriol can down-regulate the expression of PSMB9. Cyclosporine, dactinomycin, diethylstilbestrol, and other 11 drug compounds can upregulate the expression of STAT1. In contrast, afimoxifene, azathioprine, diclofenac, and other 14 kinds of drug compounds can down-regulate the expression of STAT1, and acetaminophen, estradiol, and methotrexate have effects on the up- and downregulation of STAT1. Dactinomycin, daunorubicin, camptothecin, and other 22 drug compounds can upregulate the expression of TAP1, while arsenic trioxide can downregulate the expression of TAP1 (Fig. 9D-F). This will help us in providing new insights into the treatment of multidrug-resistant TB.

Fig. 9

Pharmaco-transcriptomic analysis. (A-C) Differential expression of PSMB9, STAT1, and TAP1 between the multidrug-resistant TB group and the control group. (D-F) The pharmaco-transcriptomic analysis of PSMB9, STAT1, and TAP1.

Immunohistochemical analysis results

Immunohistochemical staining of PSMB9, STAT1, and TAP1 was performed in 5 patients with spinal tuberculosis and 5 patients with lumbar disc herniation. The staining showed that the expression of PSMB9, STAT1, and TAP1 was higher in the experimental group than in the control group (Fig. 10A-F). We used ImageJ software to measure the positive rate in the immunohistochemical images. The positive rate data for PSMB9, STAT1, and TAP1 were imported into SPSS 26.0, and the difference between the two groups was analyzed by independent samples t-test. The positive rates of PSMB9, STAT1, and TAP1 in the experimental group were significantly higher than those in the control group (p-value < 0.001) (Fig. 10G-I). PSMB9, STAT1, and TAP1 were thus differentially expressed between the experimental and control groups, which supports the results of our analysis.

Fig. 10

Immunohistochemical staining analysis. (A-F) Shows the specific expression of PSMB9, STAT1, and TAP1 in spinal TB group and the control group. (G-I) Shows the statistical analysis results of the positivity rate between spinal TB group and the control group.

Discussion

Granuloma is an important feature of TB, and it is also a place where M. tuberculosis obtains nutrients and evades immunity, and plays a key role in the spread of TB infection [39, 40]. Studies suggest that M. tuberculosis granulomas may be in a hypoxic environment in which M. tuberculosis enters a non-replicating “quiescent” state, thereby enhancing bacterial resistance to antibiotics [41]. Hua Yang et al. found that M. tuberculosis can secrete fatty acid-degrading protein A under hypoxic conditions, regulate fatty acid metabolism, and inhibit the secretion of pro-inflammatory cytokines, thereby inhibiting host immunity so that M. tuberculosis could survive in the granuloma and persist in the host infection [42]. Therefore, the molecular mechanism of hypoxia-related genes in tuberculosis infection deserves further exploration.

By analyzing the differentially expressed proteins between the spinal TB group and the control group, we found that in the GO enrichment analysis these proteins were mainly concentrated in the generation of precursor metabolites and energy, cellular respiration, oxidation of organic compounds to produce energy, aerobic respiration, respiratory electron transport chain, and reactive oxygen species metabolic process. KEGG pathway analysis also showed that these differential proteins were mainly concentrated in ribosome, chemical carcinogenesis-reactive oxygen species, oxidative phosphorylation, and citrate cycle (TCA cycle). Ribosomal stability is very important for the persistence and latent infection of mycobacteria. Under hypoxic conditions, ribosome-associated factor during hypoxia (RafH) is the primary factor in the hypoxic survival of mycobacteria mediated by the response regulator DosR [43]. All these results indicate that hypoxia is closely related to the pathogenesis of TB.

In this study, we screened out three key hypoxia-related genes, PSMB9, STAT1, and TAP1, which were highly expressed at the protein and transcriptional levels in spinal TB. Notably, previous studies have shown that PSMB9, STAT1, and TAP1 are all associated with TB. A meta-analysis integrating the transcriptional expression dataset of whole blood of multiple hosts and integrating and comparing different data through the network method found that there is a highly active core gene group in TB, which is composed of 380 genes, of which STAT1 and PSMB9 are the key hubs of the gene group [44]. PSMB9 is an immunoproteasome subunit involved in MHC class I antigen presentation, and the expression of this gene is induced by inflammatory factors, such as interferon-gamma [45, 46]. Tetsuaki Shoji et al. found that in cisplatin-resistant lung cancer cell line models, the transcription levels of PSMB8 and PSMB9 were highly expressed, and the protein expression levels were also significantly increased. After treatment with immunoproteasome inhibitors, it was found that immunoproteasomes may be an effective therapeutic target for some cisplatin-resistant lung cancers [47]. STAT1 is a signal transducer and activator of transcription 1, a member of the STAT protein family [48]. This protein can be activated by ligands, such as interferon-alpha and interferon-gamma, and plays an important role in the immune response to viral, fungal, and mycobacterial pathogens [49, 50]. STAT1 transcriptional up-regulation in severe COVID-19 patients is a potential predictive biomarker and target for certain interferon pathway-targeted therapies [51]. Tuo Liang et al. also found that STAT1 is related to the pathogenesis of spinal TB and other extrapulmonary TB, which may be involved in M1-macrophage polarization and then cause bone destruction. It is an important biomarker of tuberculosis and a potential therapeutic target [52]. 
The full name of TAP1 is transporter 1, ATP binding cassette subfamily B member. During antigen processing and presentation, the heterodimeric transporter associated with antigen processing (TAP) transports peptides produced by the immunoproteasome into the endoplasmic reticulum, where they take part in the immune response. TAP1 and PSMB9 are involved in the formation of the heterodimeric transporter and the immunoproteasome, respectively. When TAP is dysfunctional, pathogenic microorganisms can escape immune surveillance [53, 54]. Several studies have shown that abnormalities in the TAP1 gene are closely associated with pulmonary TB [55, 56]. In this study, PSMB9, STAT1, and TAP1 had high diagnostic and predictive values for both extrapulmonary TB and TB. These results indicate that PSMB9, STAT1, and TAP1 may play a role in the pathogenesis of TB, including spinal TB.

In addition, PSMB9, STAT1, and TAP1 were significantly upregulated in the multidrug-resistant TB group. Pharmaco-transcriptomic analysis showed that estradiol, cyclosporine, cisplatin, and other compounds can upregulate PSMB9 expression, whereas acetaminophen and calcitriol can down-regulate it. Cyclosporine, dactinomycin, diethylstilbestrol, and other compounds can upregulate STAT1 expression, whereas 14 compounds, including afimoxifene, azathioprine, and diclofenac, can down-regulate it; acetaminophen, estradiol, and methotrexate have been associated with both up- and down-regulation of STAT1. Dactinomycin, daunorubicin, camptothecin, and other compounds can upregulate TAP1 expression, whereas arsenic trioxide can down-regulate it. Cyclosporine is an important immunosuppressant whose main mechanism is to suppress the immune system by inhibiting the activity and growth of T cells [57]. Delayed activation of T lymphocytes and insufficient secretion of related cytokines can lead to pathogenic inflammation, increased bacterial load, spread of infection, and severe disease progression [58, 59]. T lymphocytes therefore play an important role in immune protection against M. tuberculosis infection. Many studies have also shown that cyclosporine is associated with an increased risk of activation of TB and latent tuberculosis infection [60]. In this study, we found that cyclosporine can upregulate the expression of PSMB9 and STAT1, which may be one of the mechanisms underlying the cyclosporine-associated increase in the risk of TB activation. Calcitriol is the active metabolite of vitamin D3. An in vitro study showed that it has antibacterial properties and inhibits the production of pro-inflammatory cytokines [61].
Calcitriol also contributes to host defense against Mycobacterium tuberculosis infection by inducing antimicrobial peptides (AMPs) and/or autophagy in infected macrophages [62]. Klauer et al. were the first to show that calcitriol can inhibit the proliferation of pathogenic M. tuberculosis in human macrophages [63]. These findings provide a new reference for the treatment of multidrug-resistant TB.

TB is closely related to the host immune response, but the immune mechanism of anti-M. tuberculosis antibodies is not completely clear [64]. Using the ssGSEA data, we characterized immune cell infiltration in spinal TB. We found that activated dendritic cells, gamma delta T cells, and immature B cells differed between the spinal TB group and the control group and were significantly positively correlated with PSMB9, STAT1, and TAP1. Dendritic cells are among the most important immune regulatory cells: they activate and stabilize T and B lymphocytes, can differentiate into different immune cells, participate in cellular and humoral responses, and form complexes with multifunctional APCs, which play a key role in antipathogen activity [65, 66]. Under stimulation by M. tuberculosis in vitro, dendritic cells contribute to granuloma formation by inducing the migration of natural killer (NK) cells and T cells [67]. Gamma delta T cells are unconventional T cells that play an important role in recognizing foreign pathogens and stress signals from infected cells [68,69,70]. In tuberculosis, γδ T cells can rapidly recognize M. tuberculosis antigens, respond to the BCG vaccine, and inhibit the growth of mycobacteria, and they are potential vaccine targets against TB [71]. We also found a significant positive correlation between STAT1 and macrophages, which further supports the possibility that STAT1 induces M1-macrophage polarization and thereby causes bone destruction in spinal TB. In addition, we compared monocyte/macrophage counts using routine blood test data from 162 normal controls and 237 patients with spinal TB and found that the counts were significantly higher in the disease group than in the control group. This observation is consistent with the results above.
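
As an illustration of the gene-to-immune-cell correlation described above, the following is a minimal sketch of a Spearman rank correlation between a gene's expression and an immune cell enrichment score. The correlation method is not stated in this section, so Spearman is an assumption made here; the helper names and all numeric values are hypothetical and are not data from this study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values (assumption, not study data).
stat1_expression = [5.2, 6.1, 7.4, 6.8, 8.0]
macrophage_score = [0.10, 0.22, 0.35, 0.41, 0.50]

rho = spearman(stat1_expression, macrophage_score)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based coefficient is used in this sketch because enrichment scores are not necessarily normally distributed; a positive rho indicates that samples with higher gene expression tend to have higher infiltration scores, but it does not by itself establish a causal relationship.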

Like other studies, ours has limitations. First, the sample size was inadequate. Although large samples are preferable for this type of analysis, we used only five pairs (10 samples) for the protein-level test, which is insufficient. Second, routine blood data are of limited use for analyzing differences in immune cells; tissue-based flow cytometry should be used for further verification. In addition, we did not perform further laboratory analyses to verify our results, and this study should be further validated through cell and animal experiments.

Conclusion

PSMB9, STAT1, and TAP1 might play a key role in the pathogenesis of TB, including spinal TB, and their protein products may serve as diagnostic markers and potential therapeutic targets for TB.