Background

Multiple sclerosis (MS) is a clinically heterogeneous disease with a largely unpredictable course in individual patients. Clinical management of MS is hampered by the difficulty of obtaining an accurate prognosis. To achieve an accurate prognosis during the early or mid-stages of the disease, and to monitor both disease course and the response to therapy, it is essential to identify clinical or biological markers that can serve as surrogate end-points of the phenotype [1–4]. Such biomarkers would also greatly facilitate the design and monitoring of clinical trials of new disease-modifying drugs by identifying the most appropriate patient subgroups for a specific therapy of interest.

The activity of relapsing-remitting MS (RRMS) is defined by the presence of new clinical relapses, the presence of new lesions on MRI, or an increase in disability due to such relapses. The underlying pathogenesis of RRMS activity is therefore dependent on the presence of new inflammatory plaques within the brain parenchyma, as a result of the activation of the immune system. At present, it is not well known which pathways are activated during relapses and which ones are critical for defining more active disease [5]. Previous gene expression studies have identified several genes, such as GPR3, NFKB, SOCS3, STAT3, STAT1, CX3CR1, IDO, SLC9A9 and HO-1, among others, associated with more active disease [6–16]. However, none of these genes has yet been validated as a biomarker of disease activity in MS, and the pathways driving more active disease are not well known [2, 5].

In the present study, we sought to identify gene signature patterns and pathways associated with RRMS and with clinical activity (relapses and disability worsening). Although clinical activity was defined prospectively, our study did not include MRI assessment, limiting the definition of disease activity and the opportunity to relate findings to imaging markers. We first screened gene expression from blood using DNA arrays to identify gene expression patterns that distinguish between clinically stable and active disease. Then, we validated these genes by RT-PCR in an independent prospective cohort of RRMS patients. Finally, we analyzed the regulatory network of validated genes, identifying a gene triplet composed of DOCK10, CASP2 and SYTL2 that pointed to interferon and T and B cell activation pathways as associated with disease activity in MS.

Methods

Subjects

We recruited two cohorts of patients with RRMS defined using the 2005 criteria [17]. Patients were classified as having clinically stable or active disease using the following definitions: 1) stable disease - no relapses and no changes in the EDSS score during 2 years of follow-up; 2) active disease - 2 or more relapses or a 1-point increase in the EDSS score due to relapses during 2 years of follow-up. The screening cohort contained RRMS patients in the early to mid phase of the disease: stable MS patients (n = 3; sex: 1 M/2 F; age: 38 ± 6 years; disease duration: 7 ± 0.8 years; relapse rate during the 2-year follow-up: 0; EDSS: 0 [range: 0–1]); active MS patients (n = 3; sex: 1 M/2 F; age: 33 ± 7 years; disease duration: 3 ± 0.5 years; relapse rate during the 2-year follow-up: 2.5 ± 0.5; EDSS: 3 [range: 2–4]); and healthy controls (n = 3; sex: 1 M/2 F; age: 36). None of the patients in the screening cohort were treated with disease-modifying therapies (DMTs) at the time of the study or in the previous month. The validation cohort was a longitudinal prospective cohort of RRMS patients, including stable MS patients (n = 20; sex: 5 M/15 F; age: 34.0 ± 5.3 years; disease duration: 2.6 ± 2.3 years; relapses during follow-up: 0; EDSS: 0.6 ± 0.7), active MS patients (n = 20; sex: 10 M/10 F; age: 30.5 ± 4.5 years; disease duration: 9.7 ± 6.6 years; relapse rate during the 2-year follow-up: 2.0 ± 0.5; EDSS: 3.8 ± 1.4) and sex- and age-matched healthy controls (n = 20; sex: 8 M/12 F; age: 32.5 ± 9.6 years). Use of DMTs (interferon-beta or glatiramer acetate) was allowed in the validation cohort, and 33 % of patients were receiving one of these therapies. The definition of clinical disease activity in this validation cohort was based on longitudinal prospective data, and disability worsening was confirmed 6 months apart.
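
The classification rule above can be written as a short sketch. The record type, the field names and the "unclassified" label for patients who meet neither definition are illustrative assumptions and are not part of the study protocol.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical 2-year follow-up record for one RRMS patient."""
    relapses: int        # number of relapses during follow-up
    edss_change: float   # relapse-related change in EDSS score over follow-up


def classify_activity(record: FollowUp) -> str:
    """Apply the stable/active definitions given in the text."""
    # Active: 2 or more relapses, or a 1-point EDSS increase due to relapses.
    if record.relapses >= 2 or record.edss_change >= 1.0:
        return "active"
    # Stable: no relapses and no change in EDSS.
    if record.relapses == 0 and record.edss_change == 0:
        return "stable"
    # Assumption: intermediate cases fall into neither group.
    return "unclassified"


if __name__ == "__main__":
    for rec in (FollowUp(0, 0.0), FollowUp(3, 0.5), FollowUp(1, 0.5)):
        print(rec, "->", classify_activity(rec))
```

Under this reading, a patient with a single relapse and an EDSS change below 1 point belongs to neither group, which is one way to interpret the gap between the two definitions.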

Ethics, consent and permissions

This study was approved by the Ethics Committee of the Hospital Clinic of Barcelona. Patients were invited to participate by their neurologist and were included after signing informed consent.

RNA extraction and DNA arrays

RNA was obtained from whole blood in the screening cohort using the PAXgene™ Blood RNA Kit (Qiagen). For the validation cohort, RNA was obtained from PBMCs purified using the Ficoll-Paque gradient system (Pharmacia Biotech). PBMCs were stored in RNAlater stabilization solution (Applied Biosystems) at −80 °C until RNA extraction was performed. RNA was purified using the RNeasy Mini Kit and digested with the RNase Free DNase Set (both from Qiagen). The quality and quantity of the RNA were determined using a NanoDrop 2000 spectrophotometer and its integrity was assessed using a 2100 bioanalyzer (Agilent Technologies). For DNA array analysis, RNA (6 μg) was transcribed to cDNA using the SuperScript Choice System (Life Technologies) according to the Affymetrix Expression Analysis Technical Manual. Subsequently, cRNA was synthesized using the BioArray HighYield RNA Transcript Labeling kit (Enzo), purified with the Kit Clean-up module (Affymetrix) and finally hybridized to the HG-U133 Plus 2.0 DNA array (Affymetrix). For the validation and treatment cohorts, cDNA synthesis was performed from total RNA using the High-Capacity cDNA Archive Kit (Applied Biosystems). Raw DNA array data were uploaded to the ArrayExpress database (accession code E-GEOD-2012082010000219) (Additional file 1).

Real time PCR

RT-PCR was performed using Low Density Arrays (LDA; Applied Biosystems) that were designed by selecting TaqMan assays provided by Applied Biosystems for 45 genes plus five housekeeping genes using the following criteria: 1) minimal distance between the Affymetrix probe set and the Applied Biosystems probe set; 2) no genomic DNA detection (Additional file 2). All samples were analyzed in duplicate (384 wells) and the arrays were analyzed in triplicate. RT-PCR was carried out in the 7900 Fast Real-Time PCR system using the TaqMan Gene Expression Master Mix kit (both from Applied Biosystems) as follows: 2 min at 50 °C; 10 min at 94.5 °C; 40 cycles of 30 s at 97 °C and 1 min at 59.7 °C; and a final hold at 4 °C [18]. SDS 2.2.1 software was used to analyze migration on the microfluidic plates.

Western blot analysis

Protein levels of DOCK10, CASP2 and SYTL2 were measured by Western blots from brain tissue as previously described [6]. Antibodies used for Western blot were as follows: mouse anti-DOCK10 (Novus Biologicals), mouse anti-CASP2 and mouse anti-SYTL2 (both from Santa Cruz).

Bioinformatics analysis

In order to estimate the power of our analysis, we assumed that 95 % of the evaluated probes were not differentially expressed. We aimed to detect an isolated mean difference of 1 in log-expression between groups, and set to control the rate of false positives at 10 %. Assuming a standard deviation of 0.5 in the difference in log-expression between groups, the power of our analysis was 0.65. Therefore, it could be expected that 65 % of genes that showed a two-fold differential expression between any of the groups would be identified.

DNA array results were normalized using Microarray Suite 5.0 (MAS 5.0; Affymetrix®) and analyzed using the Biometric Research Branch (BRB) Array Tools 3.2.3 (Dr Richard Simon and Amy Peng Lam). To filter the genes with the BRB software we used the following criteria: 1) genes with an intensity >10 were assigned a value of 10; 2) genes were excluded if, in < 20 % of cases, the change in gene expression was < 1.5 with respect to the median, if the percentage of missing values was > 50 %, or if the percentage of absent calls was > 70 %. For group comparison we used the F-test and for pairwise group comparison we used the t-test, with a significance threshold set to p < 0.001. Multiple comparisons were adjusted using the false discovery rate (FDR) method, with the significance threshold set to p < 0.05. For LDA analysis, we first discarded samples with an SD > 0.38. The normalization factor was calculated using geNorm software (https://genorm.cmgg.be) and normalized values were transformed to a logarithmic scale. For group comparison we used the Kruskal-Wallis test and for pairwise group comparison we used the non-parametric Mann–Whitney U test, with a significance threshold set to p < 0.01. Multiple comparisons were adjusted using the Bonferroni method, with the significance threshold set to p < 0.05. Statistical analyses were performed using SPSS 11.0 (SPSS Inc., Chicago, USA) and R software (R Core Development Team, 2011). The correlation between the differentially expressed genes (DEGs) in the array and RT-PCR studies was 0.379. Gene Ontology was analyzed using Genecodis software (http://www.cnb.csic.es). To identify transcription factors (TFs) that act as common regulators of the genes of interest, we searched for common TF binding sites based on frequency matrices obtained from the Jaspar database [19], using the position-specific weight matrix (PSWM) method with both PSCAN [20] and R software.
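
The PSWM scoring step can be illustrated with a minimal sketch of how a frequency matrix is converted to log-odds weights and slid along a sequence. This is not the PSCAN implementation: the count matrix and the sequence are hypothetical toy data (not a Jaspar profile or a real promoter), the pseudocount and uniform background are assumptions, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into a log2-odds weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_site(sequence, pwm):
    """Score every window of the sequence; return (best_score, offset)."""
    width = len(pwm)
    best_score, best_offset = float("-inf"), -1
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pwm[pos][base] for pos, base in enumerate(window))
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset


# Hypothetical count matrix for a 4-bp motif that strongly prefers "TGAC".
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]

if __name__ == "__main__":
    toy_pwm = pwm_from_counts(toy_counts)
    score, offset = best_site("AATGACGG", toy_pwm)
    print(f"best site at offset {offset}, score {score:.2f}")
```

A tool such as PSCAN additionally tests whether high-scoring sites are over-represented across a set of promoters relative to a background set; that statistical step is not shown here.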

Results

Patients with either clinically active or stable RRMS were screened, along with healthy sex- and age-matched controls. RNA extracted from whole blood was analyzed using the HG-U133 Plus 2.0 DNA array (Affymetrix); 14,705 of the 54,675 available probes fulfilled the criteria for analysis. We performed group analysis and pairwise comparisons between each of the three groups. The group analysis identified 45 differentially expressed genes (DEGs) grouped in four clusters between the three groups (F-test, p < 0.001, FDR correction) (Fig. 1; Table 1; Additional file 3). The genes GABPA, FREB, ZNF146, GATA3, KMO, MSH2, GP1BA, DSP, GIPC2, HAK, SSBP4, SP192, MGC35130, KIAA0826 and C6orf115 significantly discriminated between patients with clinically stable and active disease (t-test, p < 0.001, FDR correction). To validate the DEGs associated with clinical activity in RRMS, we analyzed the expression of the DEGs by RT-PCR in a second independent cohort of MS patients. We determined which genes differed significantly between the three conditions, validating 14 of the 45 previously identified DEGs (Kruskal-Wallis test, p < 0.01 after Bonferroni correction for multiple testing): ARHGEF7, BTBD7, CASP2, CCT8, DOCK10, DSP, ITPR1, KLDHC5, MALAT-1 (PRO1073), RBBP4, SYTL2, IFT88 (TTC10), WDR20 and ZNF75 (Fig. 2). Eight of these 14 genes (ARHGEF7, CASP2, DOCK10, DSP, ITPR1, KLDHC5, RBBP4 and SYTL2) were differentially expressed between MS patients with clinically stable and active disease (Mann–Whitney U test, p < 0.01 after Bonferroni correction for multiple testing).
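
The two multiple-testing corrections mentioned above can be illustrated with a minimal sketch. It assumes that the false discovery rate adjustment is the Benjamini-Hochberg step-up procedure, which the text does not state; the p-values are hypothetical and the function names are illustrative rather than those of BRB Array Tools, SPSS or R.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    toy_p = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    print("Bonferroni:", bonferroni(toy_p))
    print("BH (FDR):  ", benjamini_hochberg(toy_p))
```

Bonferroni controls the chance of any false positive and is therefore the stricter choice, which suits a validation step with few genes; an FDR-type adjustment tolerates a controlled fraction of false positives and is more common for genome-wide screens.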

Fig. 1

Screening for differentially expressed genes in controls and in patients with clinically stable or active RRMS. A total of 45 genes were differentially expressed between the 3 conditions: patients with stable MS (MS-good), patients with active MS (MS-bad) and controls (HC). a Cluster analysis identified 4 clusters of genes using a correlation cut-off of 0.65; b heat map of the 45 DEGs grouped in the 4 clusters identified (F-test after FDR correction)

Table 1 Differentially expressed genes between clinically stable MS, active MS and controls by DNA array analysis (Additional file 1)
Fig. 2

Validation of DEGs in a prospective cohort of patients with stable or active disease. Heat map showing the 14 DEGs validated by RT-PCR in the validation cohort: ARHGEF7, BTBD7, CASP2, CCT8, DOCK10, DSP, ITPR1, KLDHC5, PRO1073, RBBP4, SYTL2, TTC10, WDR20, ZNF75. The normalization factor was calculated using geNorm software and normalized values were transformed to a logarithmic scale. Group comparisons were performed using the Kruskal-Wallis test with a significance threshold set to p < 0.01 after correction for multiple testing (Bonferroni)

To gain insight into the biological role of the validated DEGs, we performed Gene Ontology (GO) analysis and regulatory network analysis. The GO analysis of the DEGs revealed an overrepresentation of several pathways associated with lymphocyte activation, as follows: a) GO biological process: B cell receptor signaling pathway, and positive regulation of apoptosis; b) GO molecular functions: Rho GTPase binding, Rab GTPase binding and GTPase binding; c) enrichment analysis of KEGG pathways: cell adhesion molecules and T cell receptor signaling. To analyze the regulatory networks of the validated DEGs associated with disease activity, we searched for high correlation in gene expression as an indication of co-regulated expression.

Among the 14 DEGs, we found only one triplet with a high degree of correlation between expression levels (by pairwise correlation). The expression of the triplet DOCK10, CASP2 and SYTL2 was highly correlated (DOCK10/CASP2: r = 0.799; p = 1.637e-11; DOCK10/SYTL2: r = 0.591; p = 1.218e-05; and CASP2/SYTL2: r = 0.613; p = 4.506e-06; Fig. 3), which would suggest the existence of master regulators for this gene triplet. The expression levels of DOCK10, CASP2 and SYTL2 were increased in patients with MS (both clinically stable and active disease) compared to controls in both the DNA array and the RT-PCR assays. We also analyzed protein levels of DOCK10, CASP2 and SYTL2 in PBMCs from MS patients and healthy controls. We observed higher levels of Dock10 and Sytl2 protein in MS patients compared with controls, and a decrease in Caspase-2 protein in patients (Fig. 4). To search for common regulators of the expression of the gene triplet, we performed a TF binding site prediction for the DOCK10, CASP2 and SYTL2 genes. We identified eight TFs as potential master regulators: MAFB, STAT1, MYF5, FOXQ1, TCF3, SOX9, SRY and INSM1 (Table 2). A subsequent search of the Reactome and KEGG databases revealed that the top two TFs, MAFB and STAT1, are downstream TFs of the type I IFN and MAP kinase pathways.
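
The pairwise correlation step can be sketched as follows, taking the Spearman coefficient named in the legend of Fig. 3 as the statistic. The expression vectors are hypothetical and the helper names are illustrative; the study itself used R and SPSS, not this code.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical log-expression values for two genes in five samples.
    gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    gene_b = [2.0, 4.0, 5.0, 4.0, 5.0]
    print(f"Spearman r = {spearman(gene_a, gene_b):.3f}")
```

Running the function over every pair among the 14 validated genes and keeping the pairs with high coefficients is one way to screen for groups of co-expressed genes such as the triplet reported here.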

Fig. 3

Correlation of DOCK10, SYTL2 and CASP2 gene expression. Correlations between the gene expression levels of DOCK10, SYTL2 and CASP2 obtained by RT-PCR were computed using the Spearman correlation (R² = 0.65)

Fig. 4

Dock10, SYTL2 and CASP2 protein levels in PBMCs from MS patients. Representative Western blots probed to assess the Dock10, SYTL2 and CASP2 protein levels in PBMCs from patients and controls

Table 2 Transcription factors associated with the MS gene expression pattern

Discussion

We found a gene expression signature associated with higher clinical disease activity in patients with RRMS. To identify pathways associated with this increased clinical disease activity, we performed a bioinformatics search, which revealed the gene triplet DOCK10, CASP2 and SYTL2, predicted to be regulated in concert by the MAFB, STAT1 and MYF5 transcription factors. This pattern suggests the activation of the type I IFN and MAP kinase pathways in response to immune cell receptor activation, which is in agreement with previous studies showing dysregulation of the type I interferon pathway in MS [21, 22]. In addition, GO analysis revealed the involvement of several pathways related to lymphocyte activation (Fig. 5).

Fig. 5

Biological pathways involved in DOCK10 and CASP2 function. Pathways related to both genes were obtained from the KEGG and Reactome databases

Several gene expression patterns associated with disease activity in MS have been identified previously, implicating T-cell activation and expansion, inflammation and apoptosis/cell cycle regulation in disease activity [7–9, 13, 23–26]. Biological processes that involve DEGs include the immune response, cell adhesion, cell differentiation, cellular component movement, signal transduction, blood coagulation, axon guidance, DNA and RNA transcription regulation, and the regulation of cell proliferation. Several of the TFs identified in the present study have previously been implicated in MS disease activity by the ANZ consortium [27], including TCF, MYB and the SOX family. Our findings support the involvement of T and B cell activation, immune cell signaling, such as type I interferon signaling, and the regulation of cell proliferation in the pathogenesis and severity of MS. Further validation of these biomarkers in multicenter clinical studies will be required to assess their robustness across different platforms and centers.

The DOCK10/CASP2/SYTL2 gene triplet has been implicated in lymphocyte activation and function (Fig. 5). DOCK10 is a member of the dedicator of cytokinesis family, whose members act as activators of the Rho family of small GTPases and mediate signaling by G-protein receptors, cytokine receptors (protein kinases), integrins and cadherins [28]. DOCK10 is mainly expressed in lymphocytes, and it is activated by the IL-4 and Rho pathways [28], exhibiting differential splicing between T and B cells [29]. Statins modulate the activity of Rho GTPases, which contributes to the induction of the Th2 phenotype and ameliorates disease severity in an animal model of MS [30]. Finally, DOCK10 is also involved in dendritic spine morphogenesis [31]. Taken together with the present results, these findings implicate DOCK10 in increased MS disease activity through its effects on lymphocyte migration, lymphocyte differentiation to the Th2 phenotype, and B cell activation.

Caspase 2 (CASP2) is a protease that is activated in response to stress (DNA damage) and it appears to participate more in the regulation of the cell cycle than in apoptosis [32]. Moreover, CASP2 acts as a tumor suppressor by inducing cell cycle arrest. CASP2 is highly expressed in B cells during the plasmablast stage of differentiation [33], and is overexpressed in MS patients during relapses and in response to interferon-beta or intravenous immunoglobulin therapy [34–36]. In addition, CASP2 is involved specifically in apoptosis in neuronal cells [36, 37]. Together with its modulation of the immune response, this observation suggests that CASP2 contributes to the dampening of lymphocyte proliferation. We believe that the differences in CASP2 RNA and protein levels in our study are related to functional regulation of this protein. One hypothesis is that the increase in protein levels may be due to higher stability or lower degradation of this protein in immune cells from MS patients. Such higher levels may suppress RNA expression through negative feedback. Alternatively, the regulation of CASP2 gene expression and the efficiency of its translation into protein may differ in MS patients compared to controls, probably due to a pro-inflammatory state, leading to higher protein levels even with lower RNA synthesis.

Synaptotagmin-like 2 (SYTL2 or SLP2) is a member of a C2 domain-containing protein family and it is involved in RAB27A-dependent vesicular trafficking. Among other functions, SYTL2 participates in the maturation of immune synapses in cytotoxic T cells in order to promote the exocytosis of cytotoxic granules containing perforin [38]. Interestingly, SYTL2 is required for mitochondrial fusion in response to cellular stress [39], which is critical for the restoration of mitochondrial function after stress damage [40]. Its mode of action suggests that SYTL2 participates in the stress response during inflammation.

There are several limitations of the present study that should be noted. The definitions of “active” and “stable” disease are based on the clinical expression of the disease and thus do not account for sub-clinical activity revealed by MRI. For this reason, our findings only apply to clinically active disease and we were not able to correlate our findings with imaging markers of disease activity. Nevertheless, misclassification of patients due to the lack of MRI would increase the number of false negatives more than the number of false positives, suggesting that validated biomarkers of the proposed phenotype will persist even in the presence of the noise generated by inaccurate clinical descriptions [3]. Although our datasets were small, we followed a sequential analysis in independent cohorts to reduce the number of genes and tests performed. Overall, this strategy decreased the risk of false positives, which is the main limitation in gene expression studies.

Conclusion

In summary, we describe a gene expression pattern composed of DOCK10, CASP2 and SYTL2, which is associated with a more active course of disease in patients with RRMS. The biological functions of the genes and pathways identified suggest that clinical disease activity in the early to medium phase of MS is associated with increased activation of the adaptive immune system, involving both T and B cells. Additional studies will be required to further validate this molecular biomarker and to validate these genes as therapeutic targets for MS and other autoimmune diseases.