Acta Neuropathologica, Volume 123, Issue 4, pp 485–499

Subgroup-specific alternative splicing in medulloblastoma

Authors

  • Adrian M. Dubuc
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
  • A. Sorana Morrissy
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
  • Nanne K. Kloosterhof
    • Department of Neurology, Erasmus MC
    • Department of Paediatric Oncology and Hematology, Erasmus MC, Sophia Children’s Hospital
  • Paul A. Northcott
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
  • Emily P. Y. Yu
    • Program in Biology and Pharmacology, University of Western Ontario
  • David Shih
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
  • John Peacock
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
  • Wieslawa Grajkowska
    • Department of Pathology, Children’s Memorial Health Institute
  • Timothy van Meter
    • Department of Neurosurgery, Medical College of Virginia
  • Charles G. Eberhart
    • Department of Pathology, Johns Hopkins University
  • Stefan Pfister
    • German Cancer Research Centre, University of Heidelberg
  • Marco A. Marra
    • British Columbia Cancer Agency, Genome Science Centre
  • William A. Weiss
    • Helen Diller Family Comprehensive Cancer Centre, University of California
  • Stephen W. Scherer
    • The Centre for Applied Genomics, The Hospital for Sick Children
    • Department of Molecular Genetics, University of Toronto
  • James T. Rutka
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
  • Pim J. French
    • Department of Neurology, Erasmus MC
    • Division of Neurosurgery, Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children
    • Program in Developmental and Stem Cell Biology, The Hospital for Sick Children
    • Department of Laboratory Medicine and Pathobiology, University of Toronto
Original Paper

DOI: 10.1007/s00401-012-0959-7

Cite this article as:
Dubuc, A.M., Morrissy, A.S., Kloosterhof, N.K. et al. Acta Neuropathol (2012) 123: 485. doi:10.1007/s00401-012-0959-7

Abstract

Medulloblastoma comprises four distinct molecular variants: WNT, SHH, Group 3, and Group 4. We analyzed alternative splicing usage in 14 normal cerebellar samples and 103 medulloblastomas of known subgroup. Medulloblastoma samples have a statistically significant increase in alternative splicing as compared to normal fetal cerebella (2.3-times; P < 6.47E−8). Splicing patterns are distinct and specific between molecular subgroups. Unsupervised hierarchical clustering of alternative splicing events accurately assigns medulloblastomas to their correct subgroup. Subgroup-specific splicing and alternative promoter usage was most prevalent in Group 3 (19.4%) and SHH (16.2%) medulloblastomas, while observed less frequently in WNT (3.2%), and Group 4 (9.3%) tumors. Functional annotation of alternatively spliced genes reveals overrepresentation of genes important for neuronal development. Alternative splicing events in medulloblastoma may be regulated in part by the correlative expression of antisense transcripts, suggesting a possible mechanism affecting subgroup-specific alternative splicing. Our results identify additional candidate markers for medulloblastoma subgroup affiliation, further support the existence of distinct subgroups of the disease, and demonstrate an additional level of transcriptional heterogeneity between medulloblastoma subgroups.

Keywords

Medulloblastoma, Alternative splicing, Neuronal development, Molecular subgroup, Pediatric cancer

Introduction

Medulloblastoma (MB) is the most common malignant brain tumor in children [12, 41, 45] and has recently been demonstrated to exhibit considerable inter-tumoral heterogeneity [7, 14]. Recent publications have dissected medulloblastoma at a molecular level into four distinct variants—namely WNT, SHH, Group 3 and Group 4 [9, 27, 41, 43, 53]. These subgroups differ in their epidemiology, copy number profiles, transcriptional networks, mutational spectra, and clinical characteristics [42, 45, 48]. The study of subgroup-specific gene expression has assisted in the identification of cells of origin for WNT [20] and SHH medulloblastomas [63]. Subgroup-specific targeted therapy is imminent, with promising preliminary responses to SHH-pathway inhibitors in humans and mice [8, 49]. However, nearly half of all MBs are represented by Group 3 or 4 tumors [13, 43] with dismal overall survival and in which the molecular mechanisms driving tumorigenesis remain largely unknown [21]. The optimal mechanism for ‘real time’ assignment of subgroup affiliation in the setting of a clinical trial is not currently settled. To further understand the transcriptional dissimilarity between subgroups, we undertook an analysis of alternative splicing and promoter use in medulloblastoma.

Alternative splicing of pre-mRNA is a dynamic mechanism that adds complexity to the human transcriptome, thereby significantly increasing the diversity of expressed proteins [28]. Transcriptional selection of splice site usage occurs through the processes of exon skipping, alternative transcriptional start site usage, intron retention, and alternative polyadenylation site usage, collectively referred to in the current manuscript as alternative splicing [28, 35, 40, 62]. One or more of these alternative splicing mechanisms are estimated to affect 75% [19] to 92% [62] of all genes in the human genome. Tight regulation of normal, tissue-specific, and developmental splicing is mediated by a complex network of RNA-binding proteins that recognize exonic or intronic cis-regulatory elements, enhancing or repressing inclusion of an exon in a transcript [19]. Recent evidence has also shown that transcription of an overlapping gene encoded on the opposite DNA strand (antisense transcription) can affect splicing outcomes [57, 58].

Alternative splicing has been reported to be “cancer-specific”, producing protein isoforms that favor cellular growth or metastasis [4, 11, 36, 59, 60]. Research efforts to target cancer-specific isoforms are ongoing [2, 23], and in notable cases have led to clinical trials addressing the efficacy of isoform-specific monoclonal antibodies [54]. A limited number of studies detailing medulloblastoma-restricted isoform expression have been conducted. Notable findings include alternative splicing of ERBB4 [17], PTC [56], and GLI1 [65], which impact critical signaling and developmental pathways relevant to the pathogenesis of medulloblastoma. We undertook a comprehensive investigation of alternative splicing across medulloblastoma subgroups in a large cohort of primary tumors (n = 103). Using data from the Affymetrix exon array platform, we identified multiple, recurrent, subgroup-specific alternative start site and exon dropping events. Furthermore, we identified sense–antisense (S-AS) transcription, with subgroup-specific expression of antisense transcripts correlating with alternative splicing in medulloblastoma, which may represent a putative mechanism contributing to isoform variability. Our data further highlight the transcriptional dissimilarity between subgroups, suggest additional markers for assignment of subgroup affiliation, provide additional tools for cell of origin studies, and provide a hypothesis based on S-AS transcription that may explain patterns of subgroup-specific alternative splicing.

Materials and methods

Tissue samples and RNA preparation

Primary medulloblastoma (n = 103) and normal cerebella (fetal, n = 9; adult, n = 5) samples were profiled on Affymetrix Genechip Human Exon 1.0ST Arrays. Samples, obtained in accordance with the Hospital for Sick Children (Toronto, Canada) Research Ethics Board, were snap frozen in liquid nitrogen at local host institutions and stored at −80°C. RNA was extracted using the standard TRIzol (Invitrogen) protocol and quantified using a Nanodrop ND-1000 Spectrophotometer. The quality of RNA was assessed on an Agilent 2100 Bioanalyzer by The Toronto Centre for Applied Genomics (TCAG, Toronto, Canada).

Expression profiling and molecular subgrouping

As previously described in Northcott et al. [41], Affymetrix Genechip Human Exon 1.0ST Arrays were processed at the TCAG (Toronto, Canada). Molecular subgroups were established as previously described [43].

Subgroup-specific splice variant detection

Pattern-based correlation

Pattern-based correlation (PAC) splice variant values, which represent the theoretical expression of each probe in relation to gene expression levels, were generated for each probe set and used to identify putative alternative splicing. PAC values were calculated for each probe set in each sample, except for samples in which its meta-probe set level was <8.5. In this manner, PAC values are calculated only in samples where the meta-probe set (a measure of gene-level expression) is expressed. We further focused only on probe sets whose expression is correlated with that of their meta-probe set (R² > 0.64) as a measure to filter out poorly performing and cross-hybridizing probe sets. ANOVA was performed to identify differentially expressed splice variants between molecular subgroups of MBs. Differentially expressed splice variants were then further selected based on the degree of alternative splicing (PAC value >2, corresponding to a fourfold difference in relative expression; see also French et al. [18]). Our analysis detected distinct changes in intra-transcript levels, revealing 1,986 probe sets mapping to 1,286 genes demonstrating ≥1 probe with a statistically significant PAC-score, suggestive of alternative splicing.
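
The two filters described above (gene-level expression of at least 8.5 and R² > 0.64 against the meta-probe set) can be sketched as follows. This is a minimal illustration only: the deviation computed here is the residual from a least-squares fit of the probe set on its meta-probe set, which is an assumption standing in for the PAC formula of French et al. [18] and may differ from it. The function names `pac_like_values` and `candidate_samples` are hypothetical.

```python
META_MIN = 8.5   # meta-probe set (gene-level) expression threshold from the text
R2_MIN = 0.64    # minimum squared correlation with the meta-probe set
PAC_MIN = 2.0    # log2 deviation of 2, i.e. a fourfold difference


def pac_like_values(probe, meta):
    """Per-sample deviation of one probe set from its gene-level fit.

    Returns None if the probe set fails the R^2 filter; otherwise a list
    holding None for samples in which the gene is not expressed.
    ASSUMPTION: the deviation is a least-squares residual, not the
    published PAC formula.
    """
    kept = [(p, m) for p, m in zip(probe, meta) if m >= META_MIN]
    if len(kept) < 3:
        return None
    n = len(kept)
    mean_p = sum(p for p, _ in kept) / n
    mean_m = sum(m for _, m in kept) / n
    s_pm = sum((p - mean_p) * (m - mean_m) for p, m in kept)
    s_pp = sum((p - mean_p) ** 2 for p, _ in kept)
    s_mm = sum((m - mean_m) ** 2 for _, m in kept)
    if s_pp == 0 or s_mm == 0:
        return None
    r_squared = s_pm ** 2 / (s_pp * s_mm)
    if r_squared <= R2_MIN:
        return None
    slope = s_pm / s_mm
    intercept = mean_p - slope * mean_m
    return [None if m < META_MIN else p - (intercept + slope * m)
            for p, m in zip(probe, meta)]


def candidate_samples(values):
    """Indices of samples whose absolute deviation exceeds the fourfold cutoff."""
    return [i for i, v in enumerate(values)
            if v is not None and abs(v) > PAC_MIN]
```

Masking unexpressed samples before fitting keeps background-level signal from distorting both the correlation filter and the fitted line.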

Splice index (SI)

As an alternative bioinformatic approach, we re-processed the data to generate SI values for each probe set. First, to filter out probe sets with poor performance or low signal, we calculated the median intensity across samples for each of the 287,189 core probe sets on the array. 161,720 probe sets with an average intensity above the median of these values (6.58) were retained for further analysis. Next, probe sets were mapped to Ensembl genes (hg18). A total of 12,209 genes that (1) were represented by a minimum of 6 core probe sets on the array; and (2) had a minimum of 20% of probe sets above the filtering threshold, were retained for further analysis. An SI value was calculated for each probe set in these 12,209 genes as previously described [37]. Briefly, in each sample, probe set intensity values were normalized by the corresponding gene expression value. The resulting SI value indicates whether the exon is included in the transcript (higher SI) or excluded (lower SI). SI values were first filtered to retain probe sets with a high dynamic range across samples, as described next. For each probe set, we calculated the difference between the 95th percentile and 5th percentile of the SI values across the 117 samples. The top 5% of probe sets (7,464) with the largest 95th–5th percentile differences were selected as probable targets of alternative splicing. To determine the number of alternative splicing events in each sample, a z-score was calculated for each probe set. Probe sets whose z-score in a given sample fell at least two standard deviations away from the mean (z-score ≤ −2 or z-score ≥ 2) were identified as alternative splicing events in that sample.
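
The SI, dynamic range and z-score steps above can be sketched as follows. The sketch assumes log2-scale intensities, so that normalizing by gene expression becomes a subtraction, and it uses the population standard deviation and a linear-interpolation percentile; none of these choices is stated in the text, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def splice_index(probe_log2, gene_log2):
    """SI per sample: probe set signal normalized by gene-level signal.

    ASSUMPTION: inputs are log2 intensities, so the ratio is a difference.
    """
    return [p - g for p, g in zip(probe_log2, gene_log2)]


def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def dynamic_range(si):
    """Difference between the 95th and 5th percentile of SI across samples."""
    return percentile(si, 95) - percentile(si, 5)


def splicing_events(si, cutoff=2.0):
    """Indices of samples whose SI z-score is <= -cutoff or >= cutoff."""
    mu, sd = mean(si), pstdev(si)
    if sd == 0:
        return []
    return [i for i, v in enumerate(si) if abs((v - mu) / sd) >= cutoff]
```

Counting the indices returned by `splicing_events` over all retained probe sets gives a per-sample tally of events of the kind summarized in the Results.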

Probe sets identified as alternatively spliced by either the PAC or the SI predictions were pooled to generate a collective splice series, whereas probe sets/genes identified by both algorithms were defined as the consensus splice series. The collective splice series was used to identify subgroup-specific alternative splicing events and hyperspliced medulloblastomas, whereas the consensus splice series permitted the identification of hallmark alternative splicing events prevalent in each molecular subgroup.
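
In set terms, the collective series is the union of the two call sets and the consensus series is their intersection. A minimal sketch, in which the function name and the probe set identifiers are hypothetical:

```python
def splice_series(pac_hits, si_hits):
    """Build the collective (union) and consensus (intersection) series."""
    pac_hits, si_hits = set(pac_hits), set(si_hits)
    return {
        "collective": pac_hits | si_hits,  # called by PAC or SI
        "consensus": pac_hits & si_hits,   # called by both PAC and SI
    }


# Hypothetical probe set identifiers, for illustration only.
series = splice_series(["ps1", "ps2", "ps3"], ["ps2", "ps3", "ps4"])
```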

Unsupervised clustering of SI values

Splice index (SI) values for medulloblastoma (n = 103) and normal cerebellar (n = 14) samples were used for clustering analysis. We performed unsupervised hierarchical clustering (HCL) using Pearson’s correlation as a distance metric with bootstrapping analysis (100 iterations) with all 7,464 probe sets using the TM4 Microarray Software Suite (MeV v4.6, Dana-Farber Cancer Institute, Boston). We repeated this analysis with the top 50% of probe sets (3,732) with the highest standard deviation, as well as the top 25% (1,866 probe sets), 13.4% (1,000 probe sets) and 6.25% (467 probe sets). We identified the strongest support for clustering using 1,000 probe sets, which identified 6 core clusters, composed of 2 normal subgroups (fetal and adult cerebella) and 4 medulloblastoma subgroups. Non-negative matrix factorization (NMF) (Dana-Farber Cancer Institute, Boston) was used as a second, unsupervised clustering algorithm. Using both the top 1,000 probe sets with the highest standard deviation and all 7,464 probe sets, we determined the cophenetic correlation for k = 2 to k = 8 subgroups. We identified the strongest support for k = 7 for both the filtered (1,000 probe sets; 0.9629) and unfiltered (7,464 probe sets; 0.8801) data, producing two normal clusters and five medulloblastoma subgroups.
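
For readers who want to see what clustering on a Pearson correlation distance involves, the following is a minimal pure-Python sketch. It is not the MeV implementation: average linkage is an assumption (the linkage rule is not stated above), bootstrapping is omitted, and the function names are hypothetical.

```python
from math import sqrt


def pearson_distance(x, y):
    """1 - Pearson r, so perfectly correlated profiles have distance 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / sqrt(sxx * syy)


def average_linkage(profiles, k):
    """Agglomerative clustering of samples until k clusters remain.

    profiles maps a sample name to its list of SI values (one per probe set).
    ASSUMPTION: average linkage; the original analysis may have used another rule.
    """
    names = list(profiles)
    dist = {(a, b): pearson_distance(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Because the distance depends only on the shape of each SI profile, samples with the same pattern of exon inclusion cluster together even when their overall signal differs.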

qRT-PCR validation of alternative start sites and exon dropping events

Validation of splice isoforms was performed using qRT-PCR. In brief, cDNA was synthesized from RNA using Superscript III First-Strand Synthesis supermix (Invitrogen). Five hundred nanograms (500 ng) of RNA was incubated with 2× First-Strand Reaction mix (Invitrogen) and random hexamers (50 ng/μL) for 10 min at 25°C and then 1 h at 50°C prior to heat-inactivation of the enzyme mixture at 85°C for 5 min. Primers targeting regions at the 5′ and 3′ ends of the transcript were designed using Primer Express software. Primer sequences can be found in supplemental data (Table S5). Fifty nanograms (50 ng) of cDNA was profiled on ABI Step One qRT-PCR instrumentation using SYBR green. A transcript ratio was calculated as the fold-change difference between the 5′ and 3′ ends. All transcript ratios were normalized to pooled fetal cerebellar cDNA.
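
The transcript ratio can be illustrated with a short calculation from threshold cycle (Ct) values. The sketch assumes 100% amplification efficiency for both primer pairs, so that one cycle corresponds to a twofold difference, and expresses the ratio as 3′:5′; the Ct values and function names are hypothetical.

```python
def transcript_ratio(ct_5p, ct_3p):
    """3':5' expression ratio from Ct values.

    ASSUMPTION: both amplicons amplify with 100% efficiency, so expression
    is proportional to 2 ** (-Ct) and the ratio is 2 ** (Ct_5' - Ct_3').
    """
    return 2.0 ** (ct_5p - ct_3p)


def normalized_ratio(sample_cts, reference_cts):
    """Transcript ratio of a sample relative to the reference cDNA pool."""
    return transcript_ratio(*sample_cts) / transcript_ratio(*reference_cts)


# Hypothetical Ct values: (5' amplicon, 3' amplicon).
tumor = (28.0, 25.0)            # 3' end detected three cycles earlier
fetal_cerebellum = (26.0, 26.0)
fold = normalized_ratio(tumor, fetal_cerebellum)
```

A normalized ratio well above 1 indicates relative excess of the 3′ end, as expected when a shorter isoform arises from a 3′ alternative promoter.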

Bioinformatic analysis of overrepresented genes and pathways

Ingenuity pathway analysis (IPA) (Ingenuity Systems) was used to annotate predominant themes and pathways. Specifically, the top statistically significant canonical pathways and molecular functions were used to classify genes. Overrepresentation of gene ontology (GO) groups targeted by alternative splicing was assessed using BINGO v2.3 (A Biological Network Gene Ontology Tool) [31], a Cytoscape plug-in [10]. In brief, a hypergeometric test was used to assess overrepresented GO Biological Processes. Benjamini and Hochberg False Discovery Rate (FDR) correction was applied and only themes with a statistical significance of P < 0.05 were included in the analysis.
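
The hypergeometric test and the Benjamini and Hochberg correction can be sketched with the standard library alone. This illustrates the statistics named above and is not BINGO's own code; the function names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes among n selected
    genes, when K of the N background genes carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A GO term would then be retained when its adjusted P value falls below 0.05.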

Sense–antisense transcription

Filtered Affymetrix exon array probe sets were mapped to ~1,765 S-AS gene pairs (defined as overlapping by a minimum of 1 bp, and encoded on opposing strands, as in Morrissy et al. [37]). A total of 376 genes with at least 20% of probe sets expressed above the filtering threshold were further considered. These genes had a total of 4,344 filtered probe sets. SI values for these probe sets were calculated as described above. Spearman’s rank correlation coefficients were calculated between the SI values of each probe set in a sense gene and the expression values of the antisense gene (across all samples). P values for correlations were calculated using the cor.test function in R (R Development Core Team 2008), and were multiple-test corrected using the stringent Bonferroni method. For each S-AS gene pair, each gene partner was, in turn, analyzed as the sense gene and as the antisense gene (in order to identify cases where both genes had antisense-correlated splicing events).
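
The correlation and correction steps can be sketched as follows. The original analysis used `cor.test` in R; this Python sketch reproduces only the rank correlation coefficient and the Bonferroni adjustment, takes the P values as given, and uses hypothetical function names.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_rho(si_values, antisense_expression):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(si_values), ranks(antisense_expression)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

Running `spearman_rho` once per probe set of the sense gene, and then swapping the roles of the two genes in the pair, mirrors the reciprocal analysis described above.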

Results

Alternative splicing in medulloblastoma is subgroup-specific

To further highlight the transcriptional differences between medulloblastoma subgroups, we analyzed alternative splicing consisting of the differential use of exons, promoters and polyadenylation sites in a large cohort of medulloblastomas (n = 103) and normal cerebella (n = 14). Using two independent bioinformatics algorithms—SI [37] and PAC [18]—we created a ‘collective splicing series’ of 9,096 putatively spliced probe sets that map to well-annotated exons in 4,622 genes (Figure S1a; Table S3). The majority of these alternatively spliced probe sets (64%) mapped to non-terminal exons, whereas 15 and 21% of our collective splice series affected probe sets which could be mapped to the first or last exon, respectively (Figure S2). Most of the identified alternative splicing occurred in medulloblastoma samples (79%), while only a minority (15%) was specifically enriched in the normal cerebella (Fig. 1a). Subgroup-specific splicing events were most prevalent in Group 3 and SHH tumors (19.4 and 16.2%, respectively) and less abundant in Group 4 (9.3%) and WNT (3.2%) medulloblastomas. Half (51.9%) of all medulloblastoma-enriched splicing events occurred across subgroups in a mixed population of medulloblastomas (Fig. 1b). We identified genes with known roles in medulloblastoma and cerebellar development including: AXIN2 (WNT), GLI1 [47], TSC1 [5] and PTCH1 (SHH) [24] (Table S5, Table S6). We also observed the previously reported medulloblastoma-specific splicing affecting ERBB4 [17].
Fig. 1

Subgroup-specific alternative splicing in medulloblastoma. a Distribution of 9,096 alternatively spliced probe sets, identified by splice index (SI) and pattern-based correlation (PAC) algorithms, across 103 primary medulloblastoma and 14 normal cerebella samples demonstrates strong enrichment patterns in medulloblastoma (79%) with a minority of alternative splicing events restricted to the normal cerebella (15%). b Subgroup association of 7,509 probe sets with medulloblastoma-enriched splicing patterns identifies elevated levels of Group 3 (19.4%, 1,454 probe sets) and SHH (16.2%, 1,216 probe sets) enriched alternative splicing with lower levels present in WNT (3.2%, 241 probe sets) and Group 4 (9.3%, 697 probe sets) tumors. Half of all medulloblastoma-enriched splicing events (51.9%, 3,901 probe sets) were identified in medulloblastomas from multiple subgroups. c Using splice index (SI) and pattern-based correlation (PAC) algorithms, the number of alternative splicing events per sample was identified, producing similar trends for both algorithms. Subgroup-specific splicing patterns revealed a significant developmental increase in alternative splicing from the fetal to adult normal cerebella with a further increase in splicing observed in medulloblastomas. d Distribution of the molecular subgroups of medulloblastoma in hyperspliced (n = 26) and non-hyperspliced tumors (n = 77) reveals an increased frequency of WNT (+6%) and Group 3 tumors (+16%) in hyperspliced medulloblastomas, and a decreased frequency of Group 4 tumors (−20%). e Subgroup-specific distribution of alternative splicing per sample for hyperspliced versus non-hyperspliced tumors. Hyperspliced tumors demonstrate a significantly increased number of alternatively spliced exons across each subgroup, ranging from 2.18 (WNT) to 4.97 (Group 3) times higher levels of splicing. f Hyperspliced medulloblastomas display a significantly decreased overall survival (P < 3.08E−2) relative to non-hyperspliced medulloblastomas. g Each molecular subgroup of medulloblastomas demonstrates a trend towards increased mortality for hyperspliced tumors, with an 80% or greater increase in mortality for WNT, SHH and Group 3 medulloblastomas

During cerebellar development, a significant increase in alternative splicing is observed as the normal cerebellum develops from the fetus to adulthood (P < 4.34E−8) (Fig. 1c). The adult cerebella demonstrate 3.93-times higher median levels of alternative splicing (Fig. 1c; Figure S3a) relative to the fetal cerebella; however, within fetal or adult samples there exists no direct correlation between age and the observed frequency of alternative splicing (Figure S3b, Figure S3c). Medulloblastomas display on average 2.3-times the median levels present in the developing fetal cerebella (P < 6.47E−8), but these levels nonetheless remain at only 0.59 times those observed within the developed, adult cerebella (P < 1.89E−2). A subgroup-specific analysis of medulloblastoma alternative splicing reveals no statistically significant differences in the observed number of spliced probe sets across WNT, SHH and Group 3 tumors, whereas Group 4 medulloblastomas possess a reduced frequency of alternative splicing (P < 2.31E−2). Although medulloblastoma is largely a pediatric disease, adult tumors (age > 16) represent 13.6% (14/102) of our tumor cohort. Pediatric versus adult medulloblastomas do not display any statistically significant differences (P < 4.92E−1) in the observed frequency of alternative splicing events (Figure S4a). Furthermore, there is no correlation between the age of the patient and the frequency of alternative splicing when medulloblastoma is analyzed as a single disease (Figure S4b); however, a weak, positive trend towards increasing alternative splicing with age was observed in non-Group 4 tumors (Figure S4c).

The extensive intra-subgroup variance in abundance of alternatively spliced probe sets permits stratification of medulloblastomas into two broader groups, distinguished by the frequency of alternative splicing. The first group, referred to as “hyperspliced”, is composed of 26 samples with splicing frequencies above the 75th percentile across all medulloblastomas. The second group, with splicing frequencies comparable to those present in normal cerebella, is referred to as “non-hyperspliced”. Notably, we observed relative differences in the distributions of Group 3 and 4 subgroups across both hyperspliced and non-hyperspliced medulloblastomas. An increase (+16%) in the distribution of Group 3 tumors was observed in the hyperspliced group with the inverse relationship (−20%) for Group 4 medulloblastomas (Fig. 1d). Hyperspliced medulloblastomas demonstrate 2.18 (WNT) to 4.97 (Group 3) times greater frequency of median splicing events relative to non-hyperspliced tumors in the same subgroup (Fig. 1e). Strikingly, there is a significant decrease in the overall survival of patients with hyperspliced tumors (P < 3.08E−2) (Fig. 1f) with a trend towards increased mortality across all molecular subgroups of the disease (Fig. 1g). There exists no significant change in the frequency of alternative splicing that occurs in the presence of metastasis (Figure S5a), nor is there any change in the incidence of metastasis which correlates with the presence of the hyperspliced phenotype (Figure S5b). Whether this hyperspliced phenomenon is a true biological event with clinical significance, or an artifact associated with the current sample cohort, the algorithms used for analysis, or the platform used, remains to be proven through identification and validation of the hyperspliced phenotype in a separate cohort of medulloblastomas with exon-level expression data derived from another hybridization or sequencing-based platform.
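
The hyperspliced label amounts to a threshold at the 75th percentile of per-sample event counts. A minimal sketch, assuming a linear-interpolation percentile (the exact percentile method is not stated in the text) and using hypothetical sample names and counts:

```python
def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def label_hyperspliced(event_counts):
    """Map each tumor to True if its event count exceeds the 75th percentile."""
    cutoff = percentile(list(event_counts.values()), 75)
    return {sample: count > cutoff for sample, count in event_counts.items()}


# Hypothetical per-tumor counts of alternative splicing events.
labels = label_hyperspliced({"t1": 10, "t2": 20, "t3": 30, "t4": 40, "t5": 200})
```

With 103 tumors, a strict cut above the 75th percentile yields roughly a quarter of the cohort, consistent with the 26 hyperspliced samples reported.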

Unsupervised clustering of SIs identifies four medulloblastoma subgroups

Through unsupervised HCL of SI values, we were able to recapitulate the clustering pattern produced by gene-level transcriptional data [41, 43], generating six major clusters with four clear medulloblastoma subgroups, in addition to normal fetal and adult cerebella clusters (Fig. 2a). Ninety-two percent (92%, 95/103) of samples clustered according to their predicted molecular subgroup, while eight samples (8%, 8/103) were misclassified. Clustering discrepancies occurred largely (87.5%, 7/8) between Group 3 and 4 medulloblastomas, two molecular subgroups previously shown to display a higher concordance in copy number and transcriptional profiles. The clustering pattern observed was highly robust (Figure S6a) with >98% confidence associated with the clustering patterns of WNT, SHH and normal cerebellar samples (Figure S6b). Fetal and adult normal cerebella clustered together with confidence scores >81% irrespective of the number of probe sets used to generate the clusters, suggesting they display a distinct alternative splicing pattern from the medulloblastoma samples profiled (Figure S6a). There is clear sub-structure identified within Group 3 medulloblastomas, with half of all Group 3 hyperspliced tumors clustering with a high confidence (77%) (Figure S6b), further supporting the necessity of characterizing intra-subgroup heterogeneity. Using an independent and unsupervised learning algorithm, NMF, we were able to reproduce our HCL clustering patterns. NMF provided the highest support (cophenetic correlation 0.9629) for seven molecular subgroups consisting of the six major groups identified by HCL (fetal cerebella, adult cerebella, WNT, SHH, Group 3, Group 4) and one additional subgroup (Fig. 2b). The additional subgroup consisted of a minority (n = 3) of SHH cases clustering separately from other SHH tumors (Fig. 2c). NMF produced an accuracy similar to that of HCL, with 93% (96/103) of medulloblastomas clustering as expected. Importantly, the clustering pattern produced by alternative splicing is not driven by gene expression, as there is only 47.2% (244/516) overlap in the gene sets used to generate stable alternative splicing clustering and gene-level transcriptional clustering (Table S10). Using information generated from both HCL and NMF clustering, we identified highly recurrent hallmark alternative splicing events enriched in each of the molecular subgroups (Fig. 2d).
Fig. 2

Unsupervised clustering of splice indices identifies four subgroups of medulloblastomas. a Unsupervised hierarchical clustering (HCL) of the top 1,000 probe sets with the highest standard deviation across splice index (SI) values generates robust clustering of four core medulloblastoma subgroups with two normal cerebella clusters. b Non-negative matrix factorization using the same 1,000 probe sets used for HCL clustering produces the highest support (cophenetic correlation) for 7 subgroups. c Non-negative matrix factorization demonstrates 7 core subgroups composed of the adult and fetal normal samples, the corresponding four subgroups identified by HCL, and one additional subgroup composed of a small set of SHH medulloblastomas (n = 3). d Medulloblastoma subgroup-enriched alternative splicing events identified by SI and PAC analysis revealing events present in >50–70% of each subgroup

Extensive alternative splicing of cerebellar development genes in non-WNT medulloblastoma

To identify genes and pathways disproportionately affected by alternative splicing we performed IPA in a subgroup-specific manner (Fig. 3a). We identified pathways with known roles in the pathogenesis of medulloblastoma, including p53 signaling (WNT tumors, P < 1.09E−2) [44] and CREB signaling (SHH tumors; P < 1.70E−4) [46]. Among medulloblastomas, TP53 mutations are most common in the WNT subgroup [44]. In non-WNT medulloblastomas, we identified a high incidence of neuronal development pathways affected by alternative splicing. Of the top ten statistically significant pathways, 60% (6/10) in both SHH and Group 3 medulloblastomas, and 40% (4/10) of Group 4 tumors, affected neuronal functions (Figure S7). Normal cerebella exhibited some overlap with these findings; however, neuronal functions are less frequently targeted (30%, 3/10). Instead, cell cycle pathways (30%, 3/10) are enriched in the normal cerebella (Table S11).
Fig. 3

Pathway and gene ontology analysis of subgroup-specific splicing events identifies recurrent targeting of cerebellar development pathways in non-WNT medulloblastomas. a Ingenuity Pathway Analysis (IPA) of the top ten pathways affected by alternative splicing across each molecular subgroup of medulloblastoma. Known signaling pathways, such as tight junction signaling (WNT, P < 1.49E−2) and CREB signaling (SHH, P < 1.70E−4), were identified in our analysis, as well as an abundance of neuronal pathways in non-WNT medulloblastomas. b Cytoscape BINGO analysis of the significant gene ontologies (GO) targeted by alternative splicing in Group 3 tumors, after subtracting events present in the normal cerebella, identifies neuronal pathways targeting axonogenesis and glutamatergic synaptic transmission

Using Cytoscape BINGO [10, 31], an independent algorithm for the visualization of Gene Ontology (GO) functions, we performed a subtractive analysis, removing gene ontologies present in the normal cerebella and identifying biological processes enriched exclusively in medulloblastoma. The results complemented our pathway analysis demonstrating a strong enrichment of neuronal networks, including nervous system development (P < 1.30E−2), axonal guidance (P < 3.36E−3) and glutamatergic synaptic transmission (P < 2.19E−2) in Group 3 medulloblastomas (Fig. 3b). Additionally, this analysis identified signaling pathways previously implicated in medulloblastoma pathogenesis including the roundabout (ROBO-SLIT, Group 3, P < 1.13E−2) [61] and PDGF pathways (Group 3, P < 2.53E−2) [1, 30]. Similarly, alternative splicing events in SHH and Group 4 tumors comprised a high percentage of neuronal pathways, and networks such as the regulation of cell migration (P < 8.06E−3, SHH) and extracellular structure organization (P < 1.65E−2, Group 4) (Figure S7; Tables S12–S15).

Given the high incidence of mortality associated with hyperspliced medulloblastomas, we analyzed differential use of exons, alternative polyadenylation sites and alternative start sites between hyperspliced and non-hyperspliced tumors in an effort to identify possible molecular changes contributing to this phenotype. Using supervised clustering, we first identified the top 5% of probe sets with the greatest differential splice indices between the two groups (Figure S9a, Figure S9b). We then examined the molecular function of the most differential probe sets using IPA. The majority of the top canonical pathways differentiating hyperspliced from non-hyperspliced tumors affected known cancer-signaling pathways (60%, 6/10) (Figure S9c).

Validation of subgroup-specific alternative splicing events

We selected high-confidence subgroup-specific splicing candidates from our consensus splicing series for validation. These events were predominantly found in a single molecular subgroup (>75%), and displayed gross changes in the transcript structure of isoforms. To assess subgroup-specific isoform expression, a transcript ratio was calculated as the change in 3′ versus 5′ expression levels, and normalized to fetal cerebellar levels. Exon-specific primers were designed to distinguish between 3′ and 5′ exon cassettes. We validated alternative splicing events for INADL (WNT), CHN2 (Group 3), NBEA (Group 4) and SNAP25 (Mixed MB). INADL is a cell polarity and tight junction protein [52] with a 3′ alternative promoter isoform that is present in >80% of WNT tumors and a minority of Group 4 tumors (8%) (Fig. 4a). Upon validation, WNT medulloblastomas displayed a 2–10 times greater transcript ratio (i.e. higher levels of the shorter isoform) relative to non-WNT tumors and normal cerebella (Fig. 4b). CHN2, a Rho-GTPase Activating Protein [6, 25, 29], and NBEA, a neuronal differentiation protein [33], also demonstrated 3′ alternative promoters enriched in Group 3 and Group 4 subgroups, respectively (Fig. 4a). The NBEA truncated isoform was predicted to occur in 91% of Group 4 tumors and a minority of Group 3 medulloblastomas (37%), whereas the 3′ CHN2 isoform predominated in Group 3 (74%) and WNT (37%) medulloblastomas, as was evident in our validation (Fig. 4b). Finally, we validated alternative splicing targeting SNAP25, a synaptosomal protein necessary for neurotransmitter exocytosis [3], with a known exon-5 cassette (SNAP25a vs. SNAP25b). We observed a greater 5a to 5b ratio in normal fetal cerebella and the majority of non-WNT medulloblastomas. In contrast, adult cerebella and WNT tumors demonstrate higher 5b levels. Each of these splicing events causes the disruption of one or more protein domains, likely resulting in a significant change in gene function (Figure S10).
https://static-content.springer.com/image/art%3A10.1007%2Fs00401-012-0959-7/MediaObjects/401_2012_959_Fig4_HTML.gif
Fig. 4

Validation of subgroup-specific hallmark alternative splicing events in medulloblastoma. a Exon array RMA signal intensity plots of highly frequent and subgroup-specific alternative transcripts. Alternative promoter usage generates a WNT-specific 3′ isoform of INADL, while alternative promoter usage in Group 3 and Group 4 tumors produces known isoforms of CHN2 and NBEA, respectively. SNAP25 demonstrates subgroup-specific expression of a known exon cassette, with fetal cerebella and non-WNT medulloblastomas expressing elevated levels of SNAP25a and adult cerebella and WNT tumors expressing higher levels of SNAP25b. b Primers generated against the 5′ and 3′ gene regions of the full transcript were used to assess the intra-transcript variability. A transcript ratio, based on the 3′:5′ expression, was calculated and normalized to normal fetal cerebella, permitting the identification of subgroup-specific isoform expression. For each of the reported genes, we observe the expected isoform expression restricted to the predicted subgroups
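
The transcript ratio described above can be illustrated with a minimal sketch. It assumes that each sample provides one relative expression value for the 3′ amplicon and one for the 5′ amplicon, and that normalization means dividing by the mean ratio of the fetal cerebellum controls; the function names and all numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def transcript_ratio(expr_3p, expr_5p):
    """Return the 3':5' expression ratio for one sample."""
    if expr_5p <= 0:
        raise ValueError("5' expression must be positive")
    return expr_3p / expr_5p


def normalized_ratios(samples, fetal_controls):
    """Divide each sample's 3':5' ratio by the mean fetal cerebellum ratio.

    samples: {sample_name: (expr_3p, expr_5p)}
    fetal_controls: list of (expr_3p, expr_5p) tuples
    """
    baseline = mean(transcript_ratio(p3, p5) for p3, p5 in fetal_controls)
    return {
        name: transcript_ratio(p3, p5) / baseline
        for name, (p3, p5) in samples.items()
    }


# Hypothetical relative expression values: (3' amplicon, 5' amplicon).
fetal = [(1.0, 1.0), (1.2, 1.0)]
tumors = {"WNT_1": (6.6, 1.0), "SHH_1": (1.1, 1.0)}

ratios = normalized_ratios(tumors, fetal)
for name, value in ratios.items():
    print(f"{name}: normalized 3':5' ratio = {value:.2f}")
```

A normalized ratio well above 1 indicates relatively more of the 3′ (shorter) isoform than in fetal cerebellum, which is the pattern reported for INADL in WNT tumors.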

Sense–antisense (S-AS) transcription correlates with alternative splicing

Recent reports have demonstrated changes in alternative splicing patterns that correlate with the expression of an antisense gene [37]. S-AS transcription occurs when overlapping genes on opposing DNA strands are co-expressed in the same cell [26, 51]. Antisense transcription can regulate splicing decisions and alter the balance of isoforms expressed from the sense strand through a variety of mechanisms, such as direct transcriptional interference based on the physical consequences of convergent polymerase complexes (PolII) transcribing both strands of the S-AS gene locus [50]. In normal human cells, antisense transcription is significantly correlated to splicing at hundreds of loci, showing that this is likely a common mechanism of transcriptional regulation (Fig. 5a), occurring in >75% of all genes and often altered in a cancer-specific context [32].
https://static-content.springer.com/image/art%3A10.1007%2Fs00401-012-0959-7/MediaObjects/401_2012_959_Fig5_HTML.gif
Fig. 5

Sense–antisense transcription correlates with alternative splicing in medulloblastoma. a Schematic of sense–antisense (S-AS) transcription depicting overlapping genes on opposing strands with concomitant expression. A switch in the predominant sense-strand isoform occurs in the context of antisense transcription. b Analysis of MAB21L1 transcription (antisense gene) and correlated NBEA (sense gene) alternative splicing events. The splice index values of 38 NBEA 5′ exons are inversely correlated with expression of MAB21L1 (i.e. these exons are excluded from the NBEA isoform when MAB21L1 is expressed). Conversely, the 22 3′ exons have splice index values that are positively correlated to MAB21L1 expression (i.e. these exons are included in the NBEA isoform when MAB21L1 is expressed). c Examples of S-AS gene pairs demonstrating inverse relationships between sense strand exon inclusion and antisense strand transcription. These correlations demonstrate mixed medulloblastomas (C13ORF3/MRP63) or subgroup-enriched (DDX31/GTF3C4, Group 3 and Group 4) patterns suggesting S-AS events may play a critical role in normal development (SLC26A10/B4GALNT, Normal CB) and the pathogenesis of medulloblastoma

To determine whether antisense transcription contributes to alternative splicing in medulloblastoma, we analyzed 188 overlapping gene pairs (i.e. 376 genes) that are encoded in opposing orientations and simultaneously expressed in a given tumor. Alternative splicing was predicted to occur in either the sense or antisense gene in 88 (46.8%) S-AS partners. We measured the correlation between exon inclusion in the sense genes [splice index (SI) values; see “Materials and methods”], and the expression of the antisense gene partner, across all 117 samples, identifying significant correlations between splicing and antisense gene expression in all (100%, 88/88) gene pairs (P < 0.05, after Bonferroni correction) (Figure S11a). Our results suggest that S-AS transcription may play a role in the regulation of alternative splicing of these genes in medulloblastoma.
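
The correlation test described above can be sketched as follows. This is an illustration only: it assumes a Pearson correlation between per-sample SI values and antisense expression, and it approximates the two-sided P value with the Fisher z-transform so that only the standard library is needed. The source does not state which correlation statistic was used, and the pair names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length series with at least 4 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P value via the Fisher z-transform (normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def significant_pairs(pairs, alpha=0.05):
    """Test each S-AS pair and apply a Bonferroni correction over all pairs.

    pairs: {pair_name: (si_values, antisense_expression)}
    Returns {pair_name: (r, p, passes_bonferroni)}.
    """
    threshold = alpha / len(pairs)
    results = {}
    for name, (si, expr) in pairs.items():
        r = pearson_r(si, expr)
        p = approx_p_value(r, len(si))
        results[name] = (r, p, p < threshold)
    return results


# Hypothetical data: antisense expression across eight samples.
antisense = [1, 2, 3, 4, 5, 6, 7, 8]
pairs = {
    "pair_A": ([-0.9, -2.1, -2.9, -4.2, -5.0, -6.1, -6.8, -8.1], antisense),
    "pair_B": ([1, -1, 1, -1, -1, 1, -1, 1], antisense),
}

results = significant_pairs(pairs)
for name, (r, p, ok) in results.items():
    print(f"{name}: r = {r:.2f}, significant after Bonferroni: {ok}")
```

With two pairs the Bonferroni threshold is 0.05/2; for the 88 pairs analyzed here it would be 0.05/88, so a pair must show a much smaller P value to be called significant.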

Notable examples of such events include the well-annotated S-AS pairs NBEA/MAB21L1, NNAT/BLCAP and BCL2L12/IRF3. All three examples have previously been identified and validated [15, 39, 55] using independent molecular techniques, demonstrating the validity and strength of our approach. Of specific interest to us was NBEA, previously identified as alternatively spliced in Group 4 medulloblastomas, where it is enriched in isoforms distinguished by an alternative transcriptional start site located mid-gene (Fig. 4). Analysis of the antisense-correlated splicing events identified the presence of two predominant NBEA isoforms (Fig. 5b). Expression of the longer NBEA isoform was negatively correlated (r = −0.61) to the expression of the antisense gene (MAB21L1). In contrast, expression of the shorter NBEA isoform, previously identified as up-regulated in Group 3 and Group 4 medulloblastomas (Fig. 4b), was positively correlated (r = 0.74) to MAB21L1 expression [also up-regulated in Group 4 tumors (Figure S11b)]. These results suggest that subgroup-specific expression of MAB21L1 contributes to the regulation of subgroup-specific alternative promoter usage of NBEA.

In the set of 88 S-AS gene pairs with significant correlations between alternative splicing and concomitant antisense transcription, there exist subgroup-restricted (Fig. 5c, middle and bottom panels) and subgroup-independent events (Fig. 5c, top panel). To understand the putative biological relevance of S-AS genes with antisense-correlated splicing events, we performed pathway analysis, and identified an enrichment of critical cellular functions such as cell death, cell cycle regulation and cellular development (Figure S11c). Further validation of the relationship between antisense transcription and the regulation of alternative splicing in larger datasets with more comprehensive AS transcriptional data will allow further testing of our model. Antisense genes that are expressed in a highly subgroup-specific manner should be considered as candidate marker genes for subgroup assignment.

Discussion

We present the first subgroup-specific analysis of alternative splicing in medulloblastoma using two independent bioinformatic approaches. We identified differential splicing across each medulloblastoma subgroup as well as normal fetal and adult cerebella. While age-matched normal tissue was unavailable, normal fetal (20–40 weeks of age) and adult (22–82 years of age) cerebella were used as normal controls, representing the developing and developed cerebella. Based on the distribution of alternative splicing events across all tumors, we were able to identify a ‘hyperspliced’ phenotype evident in one-quarter of all medulloblastomas. Although hyperspliced medulloblastomas are evident across all molecular subgroups, they are significantly under-represented in Group 4 tumors. Notably, they display decreased overall survival and a strong trend towards increased mortality in non-Group 4 medulloblastomas, indicating that survival trends are not influenced by the overrepresentation of a single aggressive molecular variant. Hyperspliced tumors do not display any change in the observed frequency of metastatic disease (M+), nor do we observe higher levels of alternative splicing in M+ tumors. Bioinformatic analysis of probe sets differentiating hyperspliced from non-hyperspliced medulloblastomas identified numerous cancer-signaling pathways whose instability may, in part, explain the aggressive nature of hyperspliced tumors. Clinical and biological relevance of the observed hyperspliced phenotype awaits its validation on a separate set of tumors studied using an independent technology.

Exon arrays were designed to allow complementary analyses to be performed, both at the level of gene expression as well as alternative splicing [16]. For the latter purpose, the relative difference in exon-level probe set expression can be effectively interpreted as alternative splicing, alternative promoter usage, and alternative polyadenylation events [11, 19, 34, 38]. Although a comprehensive survey of transcript structures cannot be conducted using this platform, particularly in genes with low expression values or in genes with numerous co-expressed isoforms, obvious and consistent events corresponding to the above categories can be measured reliably and reproducibly [16, 64]. Although sequencing technologies can explicitly profile exon–exon junctions, their comparatively prohibitive cost ensures that array-based approaches are an efficient and informative method for rapidly assaying large numbers of tumors, in an effort to address the known heterogeneity of this disease. However, a complete catalog of alternative splicing events remains to be identified through the use of alternative technologies.

Our results are supported by previous research demonstrating a role for alternative splicing in medulloblastoma. Most recently, Menghi et al. [34] examined alternative splicing in a modest cohort of 14 medulloblastomas using the SI algorithm, which differentiated SHH from non-SHH medulloblastomas. The authors identified 174 high-confidence splicing events, with an additional 285 probable events. Of these, they validated 11/14 alternatively spliced exons present in an SHH-specific (3/11) or in a medulloblastoma-enriched (8/11) manner. Our analysis identified 100% of the validated events presented by Menghi et al. [34], largely following the subgroup associations reported in their investigation. There are some discrepancies, such as the SHH-restricted isoform of TRRAP reported by the Menghi study, which was observed in all molecular subgroups of medulloblastoma in our dataset.

We examined whether the variation in alternative splicing patterns reflected the transcriptional heterogeneity that defines the four molecular subgroups of medulloblastoma using two independent methods of unsupervised clustering: HCL and NMF. HCL produced the most robust clustering with four molecular subgroups of medulloblastoma, findings largely recapitulated by NMF, which identified the four core subgroups and one additional minor medulloblastoma cluster. The extra NMF cluster was composed of three SHH medulloblastomas, two of which are large cell anaplastic and hyperspliced, while the remaining SHH tumor is classic and non-hyperspliced. These tumors cluster apart from other SHH tumors in our HCL analysis, suggesting they may represent outliers. As the clustering patterns produced were largely driven by exons from genes independent of those used to produce transcriptional clustering, our results suggest that subgroup-specific alternative splicing events are an independent and equally informative measure of the heterogeneity that exists within the medulloblastoma transcriptome.

By examining alternative splicing events that predominate within each medulloblastoma subgroup, we identified hallmark events, many of which have neuronal functions. These prevalent splicing events, found in >50–70% of tumors in each molecular variant, may identify genes important in tumorigenesis, and may aid in the identification of the cell type of origin. WNT and non-WNT tumors display significantly different pathways affected by alternative splicing. Pathway analysis of WNT medulloblastomas revealed genes important to cerebellar development and medulloblastoma pathogenesis targeting tight junction signaling (P < 1.49E−2) and p53 signaling (P < 1.08E−2) [44]. In non-WNT medulloblastomas there was a high incidence of pathways affecting nervous system development and differentiation. Important neuronal signaling pathways include CREB signaling in neurons (SHH tumors, P < 1.70E−4) and RAR activation (Group 4 tumors, P < 2.77E−3), both of which are core or peripheral elements of retinoic acid (RA) signaling. Retinoid treatment has been previously used to induce differentiation in medulloblastoma cells [22]. Furthermore, our analysis has identified subgroup-specific pathways targeting Roundabout (ROBO/SLIT) in Group 3 tumors. Deregulation of ROBO/SLIT genes may contribute to the high incidence of metastasis and brain tumor invasion observed in aggressive Group 3 medulloblastomas [61].
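
As a rough illustration of how prevalent events might be tallied per subgroup, the sketch below computes the fraction of tumors in each subgroup that carry a given splicing event and flags events that exceed a chosen cutoff. The data layout, event names and the 0.5 cutoff are assumptions made for this example and do not reproduce the calling procedure used in the study.

```python
from collections import Counter, defaultdict


def subgroup_frequencies(calls, subgroups):
    """Fraction of tumors in each subgroup that carry each event.

    calls: {tumor_id: set of event names called in that tumor}
    subgroups: {tumor_id: subgroup label}
    Returns {event: {subgroup: fraction}}.
    """
    sizes = Counter(subgroups.values())
    counts = defaultdict(Counter)
    for tumor, events in calls.items():
        for event in events:
            counts[event][subgroups[tumor]] += 1
    return {
        event: {group: counts[event][group] / sizes[group] for group in sizes}
        for event in counts
    }


def hallmark_events(freqs, min_fraction=0.5):
    """Events present in more than min_fraction of at least one subgroup."""
    return {
        event: sorted(g for g, f in by_group.items() if f > min_fraction)
        for event, by_group in freqs.items()
        if any(f > min_fraction for f in by_group.values())
    }


# Hypothetical cohort: four WNT and four Group 4 tumors.
subgroups = {f"T{i}": "WNT" for i in range(1, 5)}
subgroups.update({f"T{i}": "Group 4" for i in range(5, 9)})
calls = {
    "T1": {"INADL_3p", "rare_event"},
    "T2": {"INADL_3p"},
    "T3": {"INADL_3p"},
    "T4": set(),
    "T5": {"INADL_3p", "NBEA_short"},
    "T6": {"NBEA_short"},
    "T7": {"NBEA_short"},
    "T8": {"NBEA_short"},
}

freqs = subgroup_frequencies(calls, subgroups)
hallmarks = hallmark_events(freqs, min_fraction=0.5)
print(hallmarks)
```

Raising `min_fraction` toward 0.7 corresponds to the stricter end of the prevalence range quoted above.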

We suggest a model in which antisense transcription may represent one mechanism able to mediate alternative splicing outcomes. We found that approximately 47% of S-AS gene pairs expressed in medulloblastoma had a significant relationship between sense gene splicing outcomes, and expression of the antisense gene partner. An example of this relationship is the alternative splicing of NBEA, a neuronal development protein. Our data show that a majority of Group 4 (91%) tumors express a truncated NBEA isoform containing only 27% of the full-length coding sequence, lacking important functional domains. Expression of this short NBEA isoform is significantly correlated to expression of MAB21L1, a gene encoded on the opposite strand of NBEA, and believed to function in embryonal development [55]. Many of the identified S-AS events occurred in a subgroup-specific manner, indicating that antisense transcription is likely an important component in the regulation of subgroup-specific alternative splicing and medulloblastoma tumorigenesis.

Our data reveal important clinical and biological trends associated with alternative splicing in the medulloblastoma transcriptome, suggest a putative mechanism for subgroup-specific alternative splicing, and further highlight the transcriptional heterogeneity present across, and within, subgroups of medulloblastoma.

Supplementary material

401_2012_959_MOESM1_ESM.eps (1007 kb)
Supplementary material 1 (EPS 1007 kb)
401_2012_959_MOESM2_ESM.eps (1.8 mb)
Supplementary material 2 (EPS 1807 kb)
401_2012_959_MOESM3_ESM.eps (1.9 mb)
Supplementary material 3 (EPS 1994 kb)
401_2012_959_MOESM4_ESM.eps (4.9 mb)
Supplementary material 4 (EPS 5060 kb)
401_2012_959_MOESM5_ESM.eps (1.7 mb)
Supplementary material 5 (EPS 1785 kb)
401_2012_959_MOESM6_ESM.eps (1.6 mb)
Supplementary material 6 (EPS 1649 kb)
401_2012_959_MOESM7_ESM.eps (2.5 mb)
Supplementary material 7 (EPS 2514 kb)
401_2012_959_MOESM8_ESM.eps (13.7 mb)
Supplementary material 8 (EPS 14064 kb)
401_2012_959_MOESM9_ESM.eps (3.5 mb)
Supplementary material 9 (EPS 3534 kb)
401_2012_959_MOESM10_ESM.eps (9.9 mb)
Supplementary material 10 (EPS 10174 kb)
401_2012_959_MOESM11_ESM.eps (2.1 mb)
Supplementary material 11 (EPS 2122 kb)
401_2012_959_MOESM12_ESM.xlsx (673 kb)
Supplementary material 12 (XLSX 673 kb)

Copyright information

© Springer-Verlag 2012