Unravelling morphoea aetiopathogenesis by next-generation sequencing of paired skin biopsies

Background Morphoea can have a significant disease burden. Aetiopathogenesis remains poorly understood, with very limited existing genetic studies. Linear morphoea (LM) may follow Blascho’s lines of epidermal development, providing potential pathogenic clues. Objective The first objective of this study was to identify the presence of primary somatic epidermal mosaicism in LM. The second objective was tTo explore differential gene expression in morphoea epidermis and dermis to identify potential pathogenic molecular pathways and tissue layer cross-talk. Methodology Skin biopsies from paired affected and contralateral unaffected skin were taken from 16 patients with LM. Epidermis and dermis were isolated using a 2-step chemical-physical separation protocol. Whole Genome Sequencing (WGS; n = 4 epidermal) and RNA-seq (n = 5-epidermal, n = 5-dermal) with gene expression analysis via GSEA-MSigDBv6.3 and PANTHER-v14.1 pathway analyses, were performed. RTqPCR and immunohistochemistry were used to replicate key results. Results Sixteen participants (93.8% female, mean age 27.7 yrs disease-onset) were included. Epidermal WGS identified no single affected gene or SNV. However, many potential disease-relevant pathogenic variants were present, including ADAMTSL1 and ADAMTS16. A highly proliferative, inflammatory and profibrotic epidermis was seen, with significantly-overexpressed TNFα-via-NFkB, TGFβ, IL6/JAKSTAT and IFN-signaling, apoptosis, p53 and KRAS-responses. Upregulated IFI27 and downregulated LAMA4 potentially represent initiating epidermal ‘damage’ signals and enhanced epidermal-dermal communication. Morphoea dermis exhibited significant profibrotic, B-cell and IFN-signatures, and upregulated morphogenic patterning pathways such as Wnt. Conclusion This study supports the absence of somatic epidermal mosaicism in LM, and identifies potential disease-driving epidermal mechanisms, epidermal-dermal interactions and disease-specific dermal differential-gene-expression in morphoea. We propose a potential molecular narrative for morphoea aetiopathogenesis which could help guide future targeted studies and therapies. Supplementary Information The online version contains supplementary material available at 10.1007/s00403-023-02541-5.


Introduction
Morphoea is characterised by fibrosis of the skin and/or underlying connective tissues, with the potential for significant functional and psychological impact. It is suggested that environmental triggers [1][2][3], occurring in a genetically susceptible individual, underpin the inflammation and deregulated tissue injury response in morphoea [4]. However, precise genetic susceptibility factors, inciting and propagating molecular mechanisms, remain unclear.
Linear morphoea (LM) may follow Blaschko's lines of epidermal development, and hence may represent epidermal somatic mosaicism for a mutation conferring increased risk of disease at specific sites [5][6][7][8][9]. Accordingly, keratinocytederived signals and epidermal-dermal communication pathways vital to normal skin development and wound repair, are also key to pathological skin fibrosis and highly active, proliferative keratinocytes are seen in systemic sclerosis (SSc) [4,10,11].
Morphoea's morphological heterogeneity, clinical symmetry, patterning and possibly Blaschkoid distribution, may therefore provide clinical clues to potential underlying epidermal and dermal genetic aetiopathogenic and diseasedriving mechanisms [4].
The goals of this study were to identify the presence or absence of primary somatic epidermal genomic variation (as a common single nucleotide variant (SNV), or differing SNVs in a commonly affected gene, across all study samples) in LM, and to explore differential gene expression (DGE) in isolated epidermal and dermal site-matched tissue pairs, to identify potential inciting and pathogenic pathways in the epidermis and dermis. We aimed to correlate our data with the very limited current genetic data in morphoea, to propose a possible genetic and molecular narrative underlying morphoea aetiopathogenesis and hence identify potential future study and therapeutic targets.

Methodology
This study was approved by the National Research Ethics Service (London-Hampstead, MREC Reference 6398). Tissue specimens were obtained with written informed consent as part of an ongoing programme of research into the pathogenesis of scleroderma.

Specimen source
Patients with LM involving the limb(s) and/or trunk identified from our previously characterised morphoea cohort were eligible for specimen collection [24]. A total of 16 patients were enrolled (Table 1). Details regarding sample selection for each molecular (DNA/RNA) and tissue layer (epidermal/dermal) dataset are described in the Supplemental Methods section. Paired 4 mm whole skin punch-biopsies were taken from each participant; one or two from morphoea affected (lesional) skin, and one or two from site-matched contralateral unaffected skin. For tissues samples utilised for DNA/ RNA isolation, epidermis was immediately chemically separated from the dermis utilising 3.8% ammonium thiocyanate (Sigma-Alrich USA) in Dulbecco's phosphate-buffered saline pH 7.4 at room temperature for 25 min. Residual epidermal tissue was gently curetted off the superficial dermal surface using a scalpel blade (no. 15) [25].

DNA isolation, whole genome sequencing and analysis; epidermis
DNA was isolated from paired epidermal tissue and four selected paired samples underwent WGS. All identified genes with SNVs underwent network analysis utilising STRING online database (v11). Identified SNVs were then classified; graded according to disease relevance and subclassified according to MAF (using ExAC) and pathogenicity (according to PolyPhen-2, PROVEAN, SIFT and CADD scores) (Supplemental Methods and Fig. 1).

RNA isolation, sequencing and analysis; epidermis and dermis
Total RNA was isolated from paired epidermal and dermal tissue, and selected samples underwent RNA-seq. Epidermal and dermal differentially expressed genes (DEG) were further analysed via Gene Set Enrichment Analysis (GSEA), using MSigDB Hallmark gene sets [26,27]. Enrichment was reported as significant if the false discovery rate (FDR) was less than 0.25 [28] and each GSEA set was ranked according to log2 fold change (log2FC).
For dermal RNA-seq data, further complimentary analysis via PANTHER (PANTHER Gene Ontology (GO)-Slim Biological Process) [29] was completed. An adjusted P-value was calculated using Bonferroni correction, with a statistical significance cut-off of < 0.05. STRING database was utilised to review protein-protein interactions between products of particular DEGs of interest. (Supplemental Methods).

RT-qPCR and IHC of selected epidermal and dermal gene candidates derived from epidermal RNA-seq
Details can be found in the relevant Supplemental Methods sections.

Results
Epidermal protein coding single nucleotide variants 861 SNVs were identified in morphoea-affected epidermis, but absent in paired unaffected epidermis. Of these, 119 were protein-coding exonic and 72 nonsynonymous. No single common SNV or commonly affected gene was identified across all four sequenced epidermal tissue pairs. A number of nonsynonymous protein-coding SNVs had high CADD scores (> 20) and pathogenicity rated as damaging or possibly damaging by at least two of PolyPhen-2, PROVEAN and SIFT algorithms, including; ADAMTS16, ADAMTSL1 and CBX2 (Table 2). STRING network analyses of these variants yielded no noteworthy gene clusters.

Disease relevance of epidermal genomic variants
No protein coding nonsynonymous SNVs were graded as very high for disease relevance. Variants in the genes ADAMTS16 and ADAMTSL1 were graded as high for disease relevance and Level 1 for potential pathogenicity and rarity. All other protein-coding nonsynonymous variants were graded as medium disease relevance (Table 3).
Thirty-two members of the ADAM, ADAMTS and ADAMTSL super-family were nonsignificantly DE in the dermis (13 downregulated and 19 upregulated) and 12 in the epidermis (6 upregulated and 6 downregulated). Overall, 50 ADAM/ADAMTS-family genes were affected across all three datasets, including the potentially highly pathogenic (according to criteria described in Fig. 1) nonsynonymous SNVs in ADAMTS16 and ADAMSTL1 (Fig. 6).

Candidate genes and pathways based on epidermal genomic and epidermal and dermal transcriptomic profiles
Based on the WGS and RNA-seq results, a number of gene candidates were selected; some for further study. Selected epidermal candidate genes included ADAMTS16, ADAMTSL1 and the inflammatory and profibrotic TGF-β1 and JUNB. Selected dermal candidates included members of some developmental and morphogenic signaling pathways; SFRP4, SIX1, WNT2 and NOTCH4. Key characteristics of these genes and justification for their selection as candidates are detailed in Table 7.

RT-qPCR and immunohistochemistry validation of selected epidermal and dermal gene candidates
Two key candidate genes were validated by RT-pPCR in this study; TGF-β1 and JUNB. These were from the strongly over-expressed and highly disease-relevant TGF-β signaling gene set. TGF-β1 is the recognised orchestrator of fibrosis and the role of its epidermal production and expression have not been specifically investigated in morphoea. JUNB is also a key player in TGF-β signaling and hence with its relatively high log2CPM, JUNB was selected as the second validation candidate, keeping both genes for qPCR from the TGF-β signaling gene set (NES = 2.006, FDR = 0.001).
Expression of TGF-β1 and JUNB was higher in morphoea affected epidermis compared to the contralateral sitematched unaffected epidermis in all samples, but this trend was not significant (TGF-β1; P = 0.476, JUNB; P = 0.105, Fig. 7).
WNT2 was selected for validation via IHC on formalinfixed, wax-embedded paraffin whole skin sections. WNT2 was highlighted by dermal transcriptomic profiling, subsequent pathway analysis and is a member of the developmental morphogenic pathways which are of particular relevance to the anatomical patterning in morphoea and its pathogenesis. Of note, WNT2 was also highlighted by epidermal RNA-seq. In the dermis, WNT2 was the only Wnt signaling gene with log2FC > 1.5 (log2FC = 1.79), its FDR approached significance (FDR = 0.061), it was a leading edge gene (highest ranked) within the positively enriched Notch signaling Hallmark gene set within dermal GSEA data and was also present within the significantly enriched Multicellular organism development gene set (PANTHER GO-Slim Biological Process; P = 0.007).

Discussion
In this study, WGS did not identify a single common somatic mutation occurring in all four epidermal samples taken from LM-affected skin, or a commonly affected gene across all study samples. To our knowledge, this is the first study to investigate the presence of primary genomic variation in morphoea skin. This critical finding provides robust evidence against primary genomic epidermal segmental mosaicism-related aetiology in adult-onset LM. There are several clinical complexities of LM supporting more multifaceted aetiopathogenesis. LM may not be truly Blaschkoid [8], morphoea is a dermal pathology, has vast clinical heterogeneity with complex patterning and morphology [4,30] and is not congenital.
Accordingly, we identified 861 epidermal SNVs, including 119 protein-coding variants, many with medium to high disease relevance and potential pathogenicity, providing possible support for complex polygenic epidermal mosaicism in LM [31,32].
The ADAM/ADAMTS-family genes were widely affected across all three datasets, including potentially highly pathogenic nonsynonymous SNVs in ADAMTS16 and ADAMSTL1, possibly pointing to their pathogenic role in morphoea. These proteins/proteases are ECM-regulators implicated in embryological morphogenesis, skin development, wound healing, fibrosis [33][34][35][36], rare primary fibrotic  [41]. Whilst links between the ADAMTS/ ADAMTSL's and their precise functions in morphoea are unclear, their possible role in LM is further supported by our findings. Corroborating the potential key role of the epidermis in morphoea pathogenesis, we demonstrated a structurally active, proliferative and differentiating epidermis, with significant overexpression of SPPRs, PALLD, WNT2, other cell cycle/cell division (such as p53 and KRAS signalling) and apoptosis-related gene pathways, along with significant down-regulation of checkpoint and DNA repair-related genes (such as G2M DNA checkpoint and E2F targets) (Fig. 9).
We also demonstrated an inflammatory and profibrotic epidermal gene signature, which corresponds to the early inflammatory and profibrotic disease phases previously mapped by blood cytokine profiles [42][43][44][45][46]. A Th1 response (IL-2, TNF-α and IL-6) seen in the first year, is followed by a Th17 response (IL-1, IL-17, IL-22 and TGF-β) and Th2 cytokines (IL-4 and IL-13) [47]. Accordingly, the three Hallmark gene sets with the strongest significant positive enrichment in this study were TNF-α signalling via NFkB, TGF-β signalling and IL-6/JAKSTAT3 signalling; all suggesting early active inflammatory and fibrotic phase disease (Fig. 9). This was despite study samples being from LM of at least  3-years duration and not all demonstrating an inflammatory clinical phenotype; supporting an ongoing disease-driving role of the epidermis. Importantly, in recently published work evaluating transcriptomic whole-skin profiles of pediatric-onset morphoea, healthy controls, active and inactive disease were compared, and JAK/STATs were highlighted as the most prevalent DE pathway [48]. By separating the epidermis and dermis, we have highlighted that this signature may originate from the epidermis, promoting ongoing dermal disease activity. These findings provide further support for future studies to better elucidate precise pathogenic JAK/STAT-related mechanisms  in morphoea and the use of therapeutic JAK-inhibitors in sclerotic skin disease [49]. Finally, the epidermal molecular picture was also that of a 'wounded epidermis', similar to the epidermal phenotype demonstrated in SSc [10,50,51]. TGF-β is a key orchestrator of wound healing responses, also propagating pathological fibrosis [52]. Isolating a strongly enriched TGF-β signature in morphoea epidermis is unique, significant, and could provide impetus for further study of local TGF-β inhibition in appropriate clinical scenarios of superficial disease (e.g. with pirfenidone) [53]. However, precisely whether these signals are originating in the epidermis, or due to secondary unchecked positive feedback from the dermis, remains unclear.
Relevantly, epidermal IFI27 was upregulated (nonsignificant, but with the dataset's highest log2FC). It is known to induce IFNγ-related epidermal apoptosis. We saw significant upregulation of the epidermal Apoptosis gene set, and epidermal and dermal IFNα and IFNγ responses. IFN-signalling has been widely implicated in SSc and morphoea [11,48,54]. IFNγ-related chemokines and their receptors may stimulate fibroblasts, including in morphoea [46,48,55]. CXCL9 was significantly upregulated in morphoea dermis in our study, and it has previously been suggested as a disease biomarker [46,55].
Importantly, IFI27 negatively regulates NR4A1 [54], which was significantly downregulated in the dermal dataset. In turn, NR4A1 is an endogenous TGF-β inhibitor [56]. Fibrotic diseases appear to utilise this NR4A1-dependent mechanism to enable persistent TGF-β signaling and deregulated fibrosis and NR4A1 agonists inhibit laboratory-induced fibrosis of the skin, lung, liver, and kidney in mice [56,57].
Clues to another potential inciting epidermal 'damage' signal in morphoea lie in the significant downregulation of Fig. 4 Interactions between leading edge genes within inflammatory gene sets IFN-signaling (α and γ), and developmental related gene sets of epithelial to mesenchymal transition, Angiogenesis and Hedgehog signaling, demonstrating clustering and inter-pathway interactions. Default STRING criteria used: nodes linked by evidence, with medium confidence level of 0.4 LAMA4. Laminins are extracellular matrix (ECM) glycoproteins involved in differentiation, cell adhesion, signaling, migration, and form a key non-collagen component of the dermo-epidermal junction (DEJ) [54]. Related DEJ disruption could plausibly enhance epidermal-dermal communication and/or act as an initiating 'damage' signal, inciting proinflammatory and profibrotic dermal responses. Correspondingly, LAMA4-deficiency has been linked to cardiac [58][59][60] and renal fibrosis [61].
Individual dermal-genes demonstrated far greater DGE compared to the epidermis, suggesting dermal factors are more disease-specific in morphoea; in keeping with its predominantly dermal pathology. Two distinct DGE clusters were identified; inflammatory and profibrotic. The inflammatory signature, with significant upregulation of Humoral immunity, Lymphocyte activation and IFN-response-related genes, validates and adds to the limited morphoea gene expression data currently available [11,48,62]. This corroborated over-expression of IFN-signalling has an immediate foreseeable opportunity for potential therapeutic exploitation via anifrolimab, FDA-approved for systemic lupus erythematosus. Interestingly, KRAS-signalling has been identified as a potential biomarker for disease activity [48]. We demonstrated significant downregulation of inhibitory KRASsignalling in the dermis and upregulated KRAS-signalling in the epidermis also. All our cases had disease activity as demonstrated by LoScAT-activity scores of greater than zero (progressive or stable disease activity) (Tables 1, 4 and 5).
In the profibrotic DGE cluster, upregulated genes involved in embryogenesis and oncogenesis was seen such as Wnt, Hedgehog, dishevelled, frizzled family, HOX and PAX. PAX and HOX genes were specifically highlighted by PANTHER pathway analysis of dermal RNA-seq data. These families of biologically and functionally related developmental genes were collectively impacted in all three data sets (epidermal WGS, epidermal RNA-seq and dermal RNA-seq). HOX genes are the key orchestrating genes involved in fibroblast PI [12,13,[63][64][65]. Related location-specific gene signatures confer developmental patterning, position and help determine downstream differentiation of site-specific mesenchymal cells [13,66]. The genetic origin of fibroblasts can also alter their crosstalk with overlying keratinocytes [67]. Several HOX genes have shown significant DE in affected SSc-skin compared to unaffected skin [68] and related SOX genes have also been implicated in fibrosis and SSc [23,69]. Accordingly, one can deduce the feasible role HOX and related developmental and patterning genes could play in morphoea aetiopathogenesis and observed clinical patterning of non-linear subtypes. Indeed, their involvement in 'dermal mosaicism' has been suggested.
It is also suggested that via its regulation of dermal development, epidermal Wnt-signalling could account for the Blaschkoid distribution of dermal dermatoses, including Focal Dermal Hypoplasia [70]. Twelve Wnt-signalling genes contributed to the upregulation of the GO-Slim Biological Process of Multicellular organism development; WNT2B, WNT10B and WNT16 with significant DE. WNT2 was significantly upregulated in the epidermis, approached significance in the dermis (FDR = 0.061) and both these RNA-seq results were validated with IHC whole skin staining. Correspondingly, WNT2, WNT3A and β-catenin have previously demonstrated increased activity via IHC staining in both SSc and morphoea [71] and the role of Wnt-signalling in morphoea is established [20,55,[71][72][73][74][75]. Dermal SFRP4 was also significantly upregulated and recent data demonstrated the upregulation of SFRP2 in morphoea dermal fibroblasts [55]. SFRPs are homologous to the Wnt-binding site on  frizzled proteins and, therefore, modulate Wnt-signalling via direct interactions [54]. Interestingly, SFRP4 expression in the myocardium is associated with an apoptotic-related gene expression profile [54], feasibly associating its overexpression in morphoea to a disease-related damage signal. Limitations of this study include its cross-sectional nature, small datasets and limited validation of transcriptomic data. It is also impossible to differentiate primary from secondary gene expression changes or to adjust for treatment effect.
In summary, despite the often assumed Blaschkoid distribution of LM, data from this study indicate the absence of a single epidermal developmental somatic mutation responsible for disease causation. Instead, this study's molecular (genomic and transcriptomic) and tissue (epidermis and dermis) layered approach highlights possible polygenic epidermal mosaicism in initiating a complex multicomponent disease aetiopathogenesis. A wounded epidermal phenotype could, perhaps via Wnt-signalling, depletion of NR4A1 and other complex tissue layer crosstalk, contribute to the consequent inflammatory dermal fibrosis of morphoea, with its variable patterning possibly explained, at least in part, by the involvement of HOX, SOX, PAX and WNT developmental patterning genes (Fig. 9).

Acknowledgements
We would like to sincerely thank Dr Chiara Bacchelli for her expert advice and collaboration with whole genome and RNA sequencing work and bioinformatics support. Dr Ioannis Papaioannou and Dr Markella Ponticos, for their tireless practical laboratory support. Korsa Khan and Francesca Launchbury for their assistance with histology slide preparation and immunohistochemical staining, as well as Drs Florence Deroide and Victoria Swale for their expert slide interpretation. Bahja Ahmed Abdi for logistical laboratory assistance. Dr Catherine Orteu for assistance in the early phases of this project. Data availability The data that support the findings of this study are available upon reasonable request to the corresponding author, after validation by co-authors.

Declarations
Conflict of interest AMS: has received honoraria from UCB outside the submitted work. DK: nil. GWO: nil. AG: nil. DJA: nil. CPD: reports personal fees or research grants to his institution from GlaxoS-mithKline, Galapagos, Boehringer Ingelheim, Roche, CSL Behring, Corbus, Horizon, and Arxx Therapeutics outside the submitted work.
Ethics approval This study was approved by the National Research Ethics Service (London-Hampstead, MREC Reference 6398). Tissue specimens were obtained with written informed consent as part of an ongoing programme of research into the pathogenesis of scleroderma.
Informed consent Written informed consent to participate in the study and publication was obtained from all participants.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, Fig. 7 RT-qPCR validation for key epidermal upregulated TGF-β signaling genes, mean expression levels as normalised copy number; A TGF-β1, B JUNB as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.

Fig. 8
High power images of immunohistochemical staining with WNT2 antibody; unaffected control skin (above) and morphoea affected contralateral site-matched skin (below); study participant 15 Fig. 9 Multicomponent morphea etiopathogenesis; summary of key epidermal and dermal genes involved in morphea, as highlighted by NGS of paired epidermal and dermal tissue samples in this study