Introduction

Systemic sclerosis (SSc), also known as scleroderma, is a systemic rheumatic disease that is generally classified as an autoimmune disease (AID) [1, 2]. It is characterised by three main pathological hallmarks: vasculopathy, immune system abnormalities and excessive deposition of collagen (fibrosis) in many tissues throughout the human body, causing hardening and thickening [1, 3]. The disease is heterogeneous and multisystemic, as symptoms vary among patients and several organs might be affected [3].

Based on its clinical features, SSc is subdivided into limited cutaneous SSc (lcSSc) and diffuse cutaneous SSc (dcSSc) [4]. LcSSc is less severe than dcSSc, as skin thickening is usually limited to the distal extremities, fingers, upper neck and face. However, in some cases, lcSSc patients present mild organ complications. In contrast to lcSSc, dcSSc progresses faster and usually affects internal organs (e.g. lung, kidney and heart), causing complications such as congestive heart failure, renal crisis and interstitial lung disease (ILD). In addition, serological data showed that different autoantibodies are produced in each subtype [3, 4]. Anti-centromere autoantibodies (ACA) are mainly present in lcSSc patients, whereas anti-topoisomerase (ATA) and anti-RNA-polymerase (ARNAP) autoantibodies are mainly present in dcSSc patients [5].

To date, SSc aetiopathogenesis is not well understood; thus, its prognosis and diagnosis are challenging and there is no cure for the disease [6]. Therefore, there is an urgent need for SSc biomarkers that could be used not only for prognosis and diagnosis of the disease, but also for SSc staging, activity assessment and classification, and as potential therapeutic targets. During the last two decades, proteomic biomarker discovery has advanced owing to developments in mass spectrometry (MS) approaches. MS, a high-throughput technique, enables the identification and quantification of proteins in a variety of biological samples such as saliva, plasma and serum [7].

The aim of this proteomic study was to analyse affected and unaffected skin biopsy samples from patients with SSc in order to identify biomarkers and pathways which are implicated in SSc pathogenesis. We hereby report some associated pathways and the validation of possible SSc proteomic biomarkers.

Materials and methods

Clinical samples

Fourteen paired cutaneous biopsies were obtained from clinically affected (forearm) and clinically unaffected (proximal arm) skin from seven patients with SSc, voluntarily participating in the PRECISESADs project (ref: 115565) following the appropriate written informed consent procedures. The study was approved by the Ethical Committee of Université catholique de Louvain (2014/17DEC/603) and the Cyprus National Bioethics committee (ΕΕΒΚ/ΕΠ/2015/31). All patients fulfilled the LeRoy and Medsger criteria [8] and were in the active phase of the disease.

Histological classification of skin biopsy samples

Histological features of SSc skin biopsies (collagen bundles and inflammatory cells levels) were assessed.

Evaluation of optical parameters (density of collagen bundles and inflammatory cell infiltrates) was performed using a semi-quantitative score on a 0–3 scale, where 0 indicates absence, 1 weak and very focal staining, 2 moderate and focal staining, and 3 moderate in several foci. Perivascular fibroblastic densification was assessed as present or not [9].

Pre-analytical sample processing

Frozen skin biopsies were cryo-pulverised to a fine powder using a cell crusher device (Cellcrusher, Cork). Powdered skin samples were suspended in a lysis buffer (10 mM Tris-HCl pH 7.4; 150 mM NaCl; 1 mM EDTA; PBS; 0.2% SDS; Proteinase Inhibitor (Roche, USA)) and homogenised by sonication. Then, protein precipitation was performed overnight using 1 ml of frozen acetone at −20°C. The protein pellet was re-suspended in urea buffer, and total protein content was determined by the BCA assay. A total of 100 μg of protein was transferred to centrifugal filter units (30 kDa MWCO, Pall NanoSep Omega, OD030C34, Sigma Aldrich), and a modified filter-aided sample preparation (FASP) protocol [10] was followed. Briefly, proteins were reduced with 8 mM dithiothreitol at 56°C for 15 min, alkylated with 50 mM iodoacetamide at room temperature for 20 min in the dark and then digested with trypsin (Proteomics Grade, Roche Life Science, USA) at a 1:50 (trypsin:protein) ratio for 18 h at 37°C. Tryptic peptides were collected by centrifugation, and trifluoroacetic acid (TFA) was added to a final concentration of 0.5% to stop the enzymatic reaction. The TFA concentration was then reduced to 0.1%, and peptides were purified and desalted by solid phase extraction using Sep-Pak® Vac 1cc (100 mg) tC18 cartridges (Waters, Ireland). The eluted peptides were dried using a vacuum centrifuge at 45°C and stored at −80°C until further analysis.

LC/MS methods

Dried peptide pellets were re-suspended in buffer (99% H2O, 1% acetonitrile, 0.1% formic acid). The total peptide concentration was determined by absorption at 280 nm (A280) and was adjusted accordingly to a final concentration of 0.2 μg/μl prior to liquid chromatography-mass spectrometry (LC-MS) analysis. For the discovery phase, 2 μl was loaded onto an analytical column (nanoAcquity CSH C18 75 μm ID X 250 mm length, 1.7 μm particle size, 130 Å pore size, Waters, UK) and separated on a nano-liquid chromatography (LC) system (nanoAcquity UPLC, Waters, UK) at a flow rate of 0.300 μl/min using a 220-min gradient elution. The column oven temperature was set to 40°C. Eluted peptides were ionised in positive mode using nanoelectrospray ionisation (nanoESI) and analysed on a Waters Synapt G2Si HDMS instrument operated in ion mobility mode, using an ultra-definition (UD) MSe approach [11].
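The concentration adjustment and the discovery-phase load described above amount to simple arithmetic. The following minimal Python sketch illustrates it, assuming the adjustment is a plain dilution with buffer (the text does not state how the adjustment was made); the function names and the example stock concentration are hypothetical.

```python
def diluent_volume_ul(measured_conc_ug_per_ul, sample_volume_ul, target_conc_ug_per_ul=0.2):
    """Volume of buffer (in ul) to add so the sample reaches the target concentration.

    Assumes a simple dilution (C1 * V1 = C2 * V2); a sample that is already
    below the target cannot be adjusted this way.
    """
    if measured_conc_ug_per_ul < target_conc_ug_per_ul:
        raise ValueError("sample is already more dilute than the target concentration")
    final_volume = measured_conc_ug_per_ul * sample_volume_ul / target_conc_ug_per_ul
    return final_volume - sample_volume_ul


def on_column_mass_ug(injection_volume_ul, conc_ug_per_ul):
    """Peptide mass (in ug) loaded onto the column for one injection."""
    return injection_volume_ul * conc_ug_per_ul


# Hypothetical sample measured at 1.0 ug/ul in 10 ul of buffer
extra_buffer = diluent_volume_ul(1.0, 10.0)

# Discovery-phase injection stated in the text: 2 ul at 0.2 ug/ul
discovery_load = on_column_mass_ug(2.0, 0.2)
```

With the values given in the text, a 2 μl injection at 0.2 μg/μl corresponds to roughly 0.4 μg of peptides on column.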

For the validation phase, multiple reaction monitoring (MRM) analyses were performed using specific selected analytes. In the first step, a pool of selected synthetic peptides (2.5 μg) was loaded onto the analytical column and separated as described for the discovery phase, except that a 125-min gradient elution was used and the column oven temperature was set to 45°C. For the skin biopsy samples, 10 μl of sample at a total peptide concentration of 0.8 μg/μl was loaded. Peptides were analysed on a Waters Xevo TQD instrument operated in positive MRM mode.

Protein identification and quantification and statistical analyses (discovery phase)

Raw MS data were processed using the Progenesis QI for proteomics (QIp) analysis software. Peptide identifications were performed using the MSe [12] search identification, with a 1% peptide false discovery rate (FDR). The resulting identifications were further refined using the following parameters: confidence score ≥ 5, sequence length ≥ 6 and hits ≥ 2. Protein-level relative quantitation was performed using the Hi-N approach (N = 3) as implemented in Progenesis QIp. Further statistical analysis was carried out on the affected/unaffected paired samples (all affected versus all unaffected skin biopsy samples), and this is referred to as the affected/unaffected comparison in this manuscript. A variation of one-way ANOVA, as implemented in the Progenesis QIp software, was used to calculate p values. Data normalisation was performed by the Progenesis QIp software, using the ‘normalise to all compounds’ option. Briefly, the normalisation approach is based on the calculation of a global scaling factor, which is then applied to normalise samples to an automatically selected reference sample.
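The idea behind Hi-N quantitation with N = 3 can be illustrated with a short Python sketch: the abundance of each protein is taken as the mean of its N most abundant peptides. This is only an illustration of the general principle, not the Progenesis QIp implementation; the function name, the input layout and the handling of proteins with fewer than N peptides (all available peptides are averaged) are assumptions.

```python
from collections import defaultdict


def hi_n_abundance(peptide_rows, n=3):
    """Return {protein: mean abundance of its n most abundant peptides}.

    peptide_rows is an iterable of (protein_id, peptide_abundance) pairs.
    Assumption: proteins with fewer than n peptides use all of them.
    """
    by_protein = defaultdict(list)
    for protein, abundance in peptide_rows:
        by_protein[protein].append(abundance)

    result = {}
    for protein, values in by_protein.items():
        top = sorted(values, reverse=True)[:n]  # the n largest peptide abundances
        result[protein] = sum(top) / len(top)
    return result


# Hypothetical peptide abundances for two placeholder proteins
rows = [
    ("PROT_1", 10.0), ("PROT_1", 40.0), ("PROT_1", 30.0), ("PROT_1", 20.0),
    ("PROT_2", 5.0), ("PROT_2", 7.0),
]
abundances = hi_n_abundance(rows, n=3)
```

Restricting the estimate to the most abundant peptides makes it less sensitive to weakly ionising peptides, which is why the top-N summary is commonly preferred over a mean of all peptides.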

Selection of proteins for the validation phase

Selection of proteins for validation was performed based on p value and FC through the affected/unaffected comparison. Significantly differentially expressed proteins (p < 0.05) with FC > 2.5 or < 0.4 in affected/unaffected comparison were selected.
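The selection rule above can be written as a simple filter. The sketch below is a minimal Python illustration under stated assumptions: the function name, the tuple layout and the example entries are hypothetical placeholders and do not correspond to proteins or values from this study.

```python
def select_for_validation(proteins, p_cutoff=0.05, fc_up=2.5, fc_down=0.4):
    """Return names of proteins with p < p_cutoff and FC > fc_up or FC < fc_down.

    proteins is an iterable of (name, p_value, fold_change) tuples, where
    fold_change is the affected/unaffected ratio.
    """
    selected = []
    for name, p_value, fold_change in proteins:
        significant = p_value < p_cutoff
        large_change = fold_change > fc_up or fold_change < fc_down
        if significant and large_change:
            selected.append(name)
    return selected


# Placeholder entries, not study data
candidates = [
    ("GENE_A", 0.01, 3.0),   # significant, strongly over-expressed: kept
    ("GENE_B", 0.01, 1.8),   # significant, but FC within (0.4, 2.5): dropped
    ("GENE_C", 0.20, 5.0),   # large FC, but not significant: dropped
    ("GENE_D", 0.03, 0.3),   # significant, strongly under-expressed: kept
]
shortlist = select_for_validation(candidates)
```

Note that the two FC thresholds are reciprocal (1/2.5 = 0.4), so over- and under-expression are treated symmetrically on a ratio scale.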

Synthesised peptides for validation

For the validation phase, selection of target peptides for protein quantification was performed using the Skyline 4.1 software (MacCoss Lab, University of Washington, WA, USA) [13]. At least two unique peptides were selected for each protein of interest. Synthesised label-free peptides (~70% purity) were purchased from GenScript Biotech, Netherlands. The targeted MS method parameters used for each peptide are shown in Additional file 1.

MRM-MS data and statistical analyses

The transition list of targeted peptides was created using the Skyline 4.1 software. The peak area of each peptide was calculated by summing the peak areas of its transition ions.
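The peptide-level summation described above can be sketched in a few lines of Python. This is an illustration only: the function name, the row layout and the example peptide sequences, transition labels and areas are hypothetical and are not taken from the study's transition list.

```python
def peptide_peak_areas(transition_rows):
    """Sum transition peak areas per peptide.

    transition_rows is an iterable of (peptide, transition_label, peak_area).
    Returns {peptide: total peak area over all of its transitions}.
    """
    totals = {}
    for peptide, _transition, area in transition_rows:
        totals[peptide] = totals.get(peptide, 0.0) + area
    return totals


# Placeholder transitions, not study data
rows = [
    ("PEPTIDEK", "y4", 100.0),
    ("PEPTIDEK", "y5", 250.0),
    ("SAMPLER", "y3", 40.0),
]
areas = peptide_peak_areas(rows)
```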

Statistical analysis of validation data was carried out using a two-tailed paired t test for paired sample analysis (all affected versus all unaffected skin biopsy samples). A p value of < 0.05 was considered significant. For each protein, the peptide with the lowest p value was designated as the quantitative peptide, while the rest were used as qualitative references. The FC was calculated by comparing the mean peak area of the peptide between the grouped samples.
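For illustration, the paired test and the FC calculation described above can be sketched in Python using only the standard library. This is a sketch under stated assumptions rather than the software used in the study: the p value is obtained by numerically integrating the Student t density, the FC is taken as the ratio of mean affected to mean unaffected peak area, and all function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def paired_t_test(affected, unaffected):
    """Return (t statistic, two-tailed p value) for paired measurements.

    Note: fails with a division by zero if all paired differences are equal.
    """
    diffs = [a - u for a, u in zip(affected, unaffected)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, two_tailed_p(t, n - 1)


def fold_change(affected, unaffected):
    """Assumed FC definition: mean affected area divided by mean unaffected area."""
    return mean(affected) / mean(unaffected)


# Placeholder peak areas for seven paired biopsies, not study data
affected_areas = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
unaffected_areas = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
t_stat, p_value = paired_t_test(affected_areas, unaffected_areas)
fc = fold_change(affected_areas, unaffected_areas)
```

With seven paired samples the test has six degrees of freedom, so a peptide reaches p < 0.05 only when the absolute t statistic exceeds roughly 2.45.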

Bioinformatics tools

The UniProt Retrieve/ID mapping tool (http://www.uniprot.org/uploadlists/) was used to convert the UniProtKB AC/ID to Gene Name.

Pathway analysis was performed for significantly over-expressed (p < 0.05, FC > 1.5) or under-expressed (p < 0.05, FC < 0.67) proteins from affected/unaffected comparison in the discovery phase. Enrichr (http://amp.pharm.mssm.edu/Enrichr/) was utilised for inferring pathway knowledge about the gene set corresponding to the differentially expressed proteins, based on KEGG 2019.

Volcano plots were generated using RStudio (version 1.1.442, 2018-03-11). Each volcano plot shows the -log10 (p value) of the proteins plotted against their respective log2 (fold change (FC)) on the y and x axes, respectively.
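The volcano plot coordinates follow directly from each protein's p value and FC. The minimal Python sketch below shows the transformation only; it is not the R code used for the figures, and the function name and example values are hypothetical.

```python
import math


def volcano_point(p_value, fold_change):
    """Return (x, y) = (log2(FC), -log10(p value)) for one protein."""
    return math.log2(fold_change), -math.log10(p_value)


# Placeholder values, not study data
x_up, y_up = volcano_point(0.01, 4.0)      # over-expressed protein: positive x
x_down, y_down = volcano_point(0.01, 0.5)  # under-expressed protein: negative x
```

On the log2 scale, over- and under-expression of the same magnitude lie at equal distances either side of zero, and more significant proteins sit higher on the y axis.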

Results

Skin biopsy samples

Five female and two male unrelated SSc patients were analysed with a mean ± SD age of 56 ± 15.80 years and 38.5 ± 9.20 years, respectively. Skin biopsies were histologically assessed, and their features are shown in Additional file 2.

Discovery proteomic analysis

Discovery phase proteomic analysis of the analysed SSc samples led to the identification and quantification of 2149 proteins. Samples were grouped and compared based on affected/unaffected skin biopsy areas. This comparison showed that 169 (87 under-expressed and 82 over-expressed) out of 2149 identified proteins were significantly differentially expressed (p < 0.05) (Additional file 3) (Table 1).

Table 1 Discovery and validation results from affected/unaffected comparison selected proteins

Pathway analysis

Pathway analysis of significantly over-expressed (p < 0.05, FC > 1.5) and under-expressed (p < 0.05, FC < 0.67) proteins from affected/unaffected comparison revealed a large number of involved pathways. Significant extracted pathways (p < 0.05) are shown in Table 2. Several pathways that are associated with SSc pathogenesis were also identified in this study. Platelet activation, ECM-receptor interaction, complement and coagulation cascades, antigen processing and presentation and leukocyte transendothelial migration are among these significant extracted pathways (Table 2).

Table 2 Significant extracted KEGG 2019 pathways that are associated with p < 0.05, FC > 1.5 or < 0.67 proteins in affected versus unaffected comparison

Selected proteins and validation

Although many of the dysregulated proteins could be promising biomarkers for SSc, a limited number of proteins were selected for validation due to the high cost of MRM-MS approach. A total number of 15 proteins were selected for validation through the affected/unaffected comparison (Fig. 1).

Fig. 1

Significantly dysregulated proteins (p < 0.05) with FC > 2.5 or < 0.4 in the affected/unaffected paired sample comparison. a, b Bar graphs showing the significantly dysregulated proteins (p < 0.05) with (a) FC > 2.5 and (b) FC < 0.4 in affected compared to unaffected paired samples. c Volcano plot showing -log10 (p value) (y-axis) plotted against log2 (fold change) (x-axis) for the proteins in the affected/unaffected comparison

Five (UCHL1, PPID, DDX55, COX6B1 and APCS) out of these proteins were confirmed in the validation phase to be significantly dysregulated in affected/unaffected comparison and all details and results are shown in Table 1. UCHL1 belongs to the ubiquitin C-terminal hydrolases family and enables the hydrolysis of small ubiquitin adducts [14]. PPID (CyP40) is implicated in protein folding, nuclear localisation of progesterone, oestrogen and glucocorticoid receptors, ligand binding, pro-tumorigenic effects and congenital heart defects [15, 16]. DDX55 belongs to the DEAD-box proteins which are implicated in several RNA metabolism processes such as RNA transcription, degradation as well as gene expression in organelles and pre-mRNA splicing [17]. COX6B1 is a subunit of Complex IV and Massa et al. reported that a recessive mutation on COX6B1 causes a mitochondrial encephalomyopathy [18]. The last confirmed protein is the plasma glycoprotein APCS which functions as a calcium-dependent lectin [19] and is involved in immunological responses [20].

Discussion

SSc is a multisystemic AID with unclear aetiology and pathogenesis. The main pathological features of the disease, vasculopathy, inflammation and fibrosis, highlight its complexity and indicate that various molecules and pathways are implicated in the different stages of disease pathogenesis. However, these molecules and pathways/mechanisms have not been fully elucidated [21]. Therefore, the aim of this study was to analyse skin biopsies from affected and unaffected areas of the same SSc patients in order to study different stages of the disease and identify sensitive proteomic biomarkers gaining insights into the pathways/mechanisms that are implicated in SSc pathogenesis.

Using discovery MS proteomic analysis of SSc samples, 2149 proteins were identified and 169 proteins were shown to be significantly dysregulated in the affected/unaffected comparison. Pathway analysis of these proteins confirmed the heterogeneity of the disease, as the proteins are involved in approximately 190 different pathways. As SSc is an extremely complex disease and several molecules, cell types and biological processes are implicated in its pathogenesis [6], several of the bioinformatics-highlighted pathways might be associated with the disease in different ways. Furthermore, some of these pathways are related to other diseases, including AIDs. These findings indicate that several diseases, especially AIDs, might share common pathogenetic mechanisms. According to the Enrichr analyses, platelet activation, ECM-receptor interaction, complement and coagulation cascades, antigen processing and presentation and leukocyte transendothelial migration are among the common significant extracted pathways.

It is known that overexpression of the ECM proteins plays a key role in the development of fibrosis in SSc [22]. Our data confirm this knowledge as three proteins (ITGB1, VTN and ITGA5) that are implicated in ECM-receptor interaction pathway were over-expressed in the affected/unaffected comparison. Proteins that are implicated in complement and coagulation cascades (C4B, VTN, SERPINC1 and CLU), and antigen processing and presentation pathways (HSPA4 and HLA-G) were also significantly over-expressed in this comparison. Although this observation is consistent with the results obtained in a previous study, these pathways are not only activated in SSc but also in other AIDs [23, 24]. Another important pathway that was extracted from ITGB1, AKT3 and MYLK over-expressed proteins in the affected/unaffected comparison is the platelet activation pathway which contributes to all three stages of SSc pathogenesis: vascular injury, inflammation and fibrosis [25]. In a previous study, Agache et al. reported that platelet activation markers are associated with the severity and activity of the disease and increased levels of C-reactive protein (CRP) [26]. Leukocyte transendothelial migration is also an essential pathway as CD4+ T lymphocytes transendothelial migration is enhanced in this disease and the migrating cells display an activated phenotype [27]. The ITGB1, ACTN1 and ICAM1 proteins that were identified to be significantly over-expressed in affected/unaffected comparison in our study are implicated in this pathway.

As already stated, a large number of proteins were identified; however, only 15 proteins were selected for validation due to the high cost of MS. In the validation phase, 5 (UCHL1, PPID, DDX55, COX6B1 and APCS) out of these 15 selected proteins were further confirmed to be significantly dysregulated. Giusti et al. showed in a transcriptomic study that the ubiquitin C-terminal hydrolase UCHL1 and other molecules that are implicated in ubiquitination/stress are highly expressed in skin endothelial cells of dcSSc patients compared to controls [14, 28]. In the validation phase of our study, UCHL1 was confirmed to be significantly over-expressed in the affected/unaffected comparison. It is remarkable that overexpression of UCHL1 in SSc affected skin areas is confirmed both at the transcriptome and proteome levels by two independent studies. CyP40 is another protein that was found to be significantly over-expressed in the affected/unaffected comparison in our study. Interestingly, Balanescu et al. showed that cyclophilin-A, a member of the same family as CyP40, is abnormally expressed in biological fluids and cutaneous biopsies of SSc patients [7]. This evidence suggests that UCHL1 and CyP40 could be promising SSc biomarkers and that further studies on these two molecules should be performed. DDX55 and COX6B1 were also significantly over-expressed in the affected/unaffected comparison. These two proteins have been associated with SSc for the first time in our study; thus, additional investigation should be performed in order to further confirm their association with the development of SSc and their potential use as biomarkers of the disease. APCS is a plasma glycoprotein which is implicated in immunological responses [20]. Tennent et al. used serum from lcSSc and dcSSc patients and from patients followed longitudinally for 12 months and showed that APCS levels were in the normal range, except for a limited number of elevated values that were associated with acute inter-current complications [29]. Aden et al. showed that the APCS precursor level was significantly overexpressed in skin biopsies from SSc patients compared to healthy controls [30]. In agreement with Aden et al., our study showed that APCS is significantly over-expressed in the affected/unaffected comparison in both the discovery and validation phases. However, this protein may generally be associated with autoimmunity instead of specifically with SSc, as it was found to be implicated in other AIDs as well [31].

This study shows different proteomic backgrounds and histological features between affected and unaffected skin biopsies of the same SSc patients. This is the first study that compares SSc skin areas macroscopically classified as affected and unaffected. Despite the importance of the results, the small number of samples used is a limitation of this study. However, previously reported studies also had this limitation [32, 33], because SSc is a rare disease, and thus, large numbers of patients are not expected to be recruited. Although skin biopsies were obtained only from SSc patients and not from healthy controls, this could also be considered an advantage, as the comparison is performed on samples from the same individual and the bias of heterogeneity is reduced.

Conclusions

In conclusion, the results of this study are important as they display the promising diagnostic power of a multi-biomarker approach. Pathway analysis showed that several pathways are implicated/activated in SSc pathogenesis. Available literature on UCHL1 and PPID proteins supports that they could be promising biomarkers for SSc. Interestingly, 3 (DDX55, PPID and COX6B1) out of 5 confirmed proteins have been associated with SSc for the first time in our study; therefore, these proteins could be novel biomarkers of the disease. In addition, APCS is associated with several AIDs and inflammatory pathways; thus, it might not be specific for SSc. However, it could be used as an inflammation-related biomarker. Further studies could be performed using samples from patients with different AIDs in order to assess whether these molecules are mainly associated with any other specific AID or with general inflammatory conditions. Further evaluation/validation studies could be performed in samples from new patients with SSc in order to confirm that these 5 molecules could also be useful biomarkers for specific stages (e.g. early phase) of the disease. Moreover, confirmation of these biomarkers in an easily accessible tissue would be useful for clinicians and would spare SSc patients painful procedures.