Rapid, reliable, and reproducible molecular sub-grouping of clinical medulloblastoma samples
- Cite this article as:
- Northcott, P.A., Shih, D.J.H., Remke, M. et al. Acta Neuropathol (2012) 123: 615. doi:10.1007/s00401-011-0899-7
The diagnosis of medulloblastoma likely encompasses several distinct entities, with recent evidence for the existence of at least four unique molecular subgroups that exhibit distinct genetic, transcriptional, demographic, and clinical features. Assignment of molecular subgroup through routine profiling of high-quality RNA on expression microarrays is likely impractical in the clinical setting. The planning and execution of medulloblastoma clinical trials that stratify by subgroup, or that are targeted to a specific subgroup, require technologies that can be economically, rapidly, reliably, and reproducibly applied to formalin-fixed paraffin embedded (FFPE) specimens. In the current study, we have developed an assay that accurately measures the expression level of 22 medulloblastoma subgroup-specific signature genes (CodeSet) using nanoString nCounter Technology. Comparison of the nanoString assay with Affymetrix expression array data on a training series of 101 medulloblastomas of known subgroup demonstrated a high concordance (Pearson correlation r = 0.86). The assay was validated on a second set of 130 non-overlapping medulloblastomas of known subgroup, correctly assigning 98% (127/130) of tumors to the appropriate subgroup. Reproducibility was demonstrated by repeating the assay in three independent laboratories in Canada, the United States, and Switzerland. Finally, the nanoString assay could confidently predict subgroup in 88% of recent FFPE cases, of which 100% had accurate subgroup assignment. We present an assay based on nanoString technology that is capable of rapidly, reliably, and reproducibly assigning clinical FFPE medulloblastoma samples to their molecular subgroup, and which is highly suited for future medulloblastoma clinical trials.
Keywords: Medulloblastoma; Molecular classification; Clinical trials; NanoString
Currently, patients enrolled in clinical trials for medulloblastoma are stratified on the basis of clinical variables such as age, extent of resection, presence of metastases, and histology. Recently, several publications have reported that the histological entity known as medulloblastoma comprises several distinct molecular variants or subgroups [1, 6, 12, 17]. Despite variation in the number and nomenclature of the subgroups identified, the current consensus is that medulloblastoma comprises four core subgroups (i.e., WNT, SHH, Group C, and Group D), with mounting evidence for clinically relevant substructure (subtypes within the subgroups) [1, 11]. Each subgroup exhibits distinct demographics, transcriptomics, genomics, and clinical outcomes [1, 12]. While some subgroups are well treated, or debatably even over-treated, using current protocols (e.g., WNT medulloblastomas), others have a very poor outcome (e.g., Group C medulloblastomas). Additionally, as the subgroups have very different molecular genetic profiles, any successful strategies for targeted therapy will likely be subgroup specific (e.g., SMO inhibitors for SHH subgroup tumors). Although the retrospective classification of various medulloblastoma cohorts into molecular subgroups has been scientifically insightful, medulloblastoma sub-grouping has not yet been applied in the setting of a prospective clinical trial for either patient stratification or patient selection for targeted therapy.
There is currently no well-accepted gold standard test for medulloblastoma subgroup assignment. The methodology used in most of the published literature on medulloblastoma subgroups has been the analysis of high-quality RNA from flash-frozen samples that were analyzed using genome-wide transcriptional microarrays. Although an excellent tool for retrospective research studies, gene expression microarray profiling is likely inappropriate and inadequate for routine clinical use or for clinical trials due to the need for large amounts of high-quality RNA (from frozen tumor tissue), lot-to-lot variability of microarrays, bioinformatic complexity, and relatively high cost. Specifically, RNA isolated from formalin-fixed paraffin embedded (FFPE) medulloblastoma samples is fragmented, and not suitable for hybridization to expression microarrays. In both routine clinical settings and clinical trials, a rapid test completion time is critical, making microarray platforms an inefficient diagnostic tool.
In contrast, medulloblastoma subgroup assignment using immunohistochemistry (IHC) performed on FFPE cases has shown recent promise. We recently reported a four-antibody protocol for classification of medulloblastomas, and applied this method to a large series (n = 294) of FFPE medulloblastomas on tissue microarrays (TMAs), effectively classifying ~98% of samples. Ellison et al. recently reported an IHC-based assay for classifying medulloblastomas into WNT, SHH, and non-WNT/SHH subgroups using a distinct set of antibodies. Challenges in bringing IHC to the clinic for subgroup assignment remain due to lot-to-lot variability of antibodies, inter-institutional differences in tissue fixation and embedding, technical variations of IHC, and inter- and intra-observer variability in image interpretation. The inclusion of IHC markers for subgroup ascertainment in future clinical trials would likely be complemented by another orthogonal technology, to confirm subgroup affiliation as identified by IHC, and provide treating clinicians with confidence that the correct subgroup has been assigned. By its very nature, IHC is likely limited to one or two markers per subgroup, and these markers must be proteins, excluding the use of non-coding RNAs as markers. While some of the described antibodies will likely become widely used as clinical tests, complementary and confirmatory technologies may be required in the setting of a clinical trial.
To develop and optimize a more rapid, reliable, reproducible, and economical method for medulloblastoma classification, we have taken advantage of the recently described nanoString nCounter System, a non-enzymatic multiplexed assay that uses sequence-specific probes to digitally measure target abundance (e.g., mRNA) within a given sample [5, 7, 9]. Based on nanoString technology, and using information from existing gene expression array data, we designed a custom CodeSet (i.e., probe library) consisting of probes against 22 medulloblastoma subgroup-specific signature genes. We tested our nanoString assay on our own medulloblastoma series of known subgroup affiliation prior to validation of the assay on three non-overlapping medulloblastoma cohorts with known subgroup affiliation. Finally, the assay was applied to a large series of FFPE medulloblastomas to establish its applicability in the classification of routine clinical samples as would be encountered in the setting of a prospective clinical trial.
Patients and methods
All samples were obtained in accordance with the Research Ethics Board at the Hospital for Sick Children (Toronto, Canada). Primary medulloblastomas comprising the training series for nanoString (n = 101) have been previously described [10, 12, 13]. Samples contributing to the validation series (n = 130) have been previously described and were obtained as total RNA extracted from fresh-frozen tissue from the DKFZ (Heidelberg, Germany; Remke series, n = 55), the Dana-Farber Cancer Institute (Boston, USA; Cho series, n = 39), and the Academic Medical Center (Amsterdam, the Netherlands; Kool series, n = 36). Formalin-fixed paraffin embedded (FFPE) cases (n = 84) were obtained as paraffin sections from the Hospital for Sick Children (Toronto, Canada; n = 34), Johns Hopkins University (Baltimore, USA; n = 25), and the DKFZ (Heidelberg, Germany; n = 25).
NanoString CodeSet design and expression quantification
Signature genes for each medulloblastoma subgroup were included in the CodeSet on the basis of their observed subgroup-specific expression, as previously determined by Affymetrix exon array analysis [10, 12]. Specifically, conventional t tests with control of the false discovery rate (FDR) were used to compare each subgroup with the remaining three subgroups in order to identify the most significantly differentially expressed genes. The CodeSet was designed to comprise 25 genes in total. Of these, 22 are signature genes, with 5–6 included for each subgroup: WNT (WIF1, TNC, GAD1, DKK2, EMX2), SHH (PDLIM3, EYA1, HHIP, ATOH1, SFRP1), Group C (IMPG2, GABRA5, EGFL11, NRL, MAB21L2, NPR3), and Group D (KCNA1, EOMES, KHDRBS2, RBM24, UNC5D, OAS1). The remaining three are housekeeping genes (ACTB, GAPDH, and LDHA), included for biological normalization purposes. Probe sets for each gene in the CodeSet were designed and synthesized at nanoString Technologies.
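The FDR restriction mentioned above can be illustrated with a short sketch. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the source only says that t tests were restricted by FDR;
    Benjamini-Hochberg is used here as one common way to do that.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes in one subgroup-vs-rest comparison.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
```

Genes whose adjusted value falls below a chosen FDR level would then be ranked as candidate signature genes for that subgroup.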
Total RNA (100 ng) from fresh-frozen tissue and FFPE material was analyzed using the nanoString nCounter Analysis System at the University Health Network Microarray Centre (Toronto, Canada), the Oncogenomics Core Facility at the University of Miami (Miami, USA), and the Frontiers in Genetics Facility at the University of Geneva (Geneva, Switzerland). All procedures related to mRNA quantification including sample preparation, hybridization, detection, and scanning were carried out as recommended by nanoString Technologies.
Total RNA was extracted from fresh-frozen tissue using the Trizol method (Invitrogen) according to the manufacturer’s instructions. For FFPE samples, ~3–5 paraffin sections per sample were first deparaffinized with xylene prior to RNA extraction using the RNeasy FFPE kit (Qiagen) as directed by the manufacturer. RNA concentration was measured using a Nanodrop 1000 instrument (Nanodrop) and RNA integrity was assessed using an Agilent 2100 bioanalyzer at The Centre for Applied Genomics at the Hospital for Sick Children (Toronto, Canada).
NanoString data processing and class prediction analysis
Raw nanoString counts for each gene within each experiment were subjected to a technical normalization using the counts obtained for positive control probe sets prior to a biological normalization using the three housekeeping genes included in the CodeSet. Normalized data were log2-transformed and then used as input for class prediction analysis.
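A minimal sketch of this two-step normalization follows. The text does not give the scaling formula, so several choices are assumptions made for illustration: the geometric mean as the summary of control and housekeeping counts, the reference levels `pos_ref` and `hk_ref`, the pseudocount of 1 before the log2 transform, and the probe names `POS_A` and `POS_B`.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_sample(counts, positive_controls, housekeeping, pos_ref, hk_ref):
    """Normalize one sample's raw counts in two steps, then log2-transform.

    counts: dict mapping probe name -> raw count for one sample
    positive_controls, housekeeping: lists of probe names
    pos_ref, hk_ref: reference levels to scale to (assumed, e.g. cohort means)
    """
    # Technical normalization: scale so positive controls match the reference.
    tech = pos_ref / geometric_mean([counts[p] for p in positive_controls])
    scaled = {g: c * tech for g, c in counts.items()}
    # Biological normalization: scale so housekeeping genes match the reference.
    bio = hk_ref / geometric_mean([scaled[h] for h in housekeeping])
    # log2 transform; the pseudocount of 1 (an assumption) guards against zeros.
    return {g: math.log2(c * bio + 1.0)
            for g, c in scaled.items() if g not in positive_controls}


# Hypothetical raw counts for one sample.
raw = {"POS_A": 100, "POS_B": 400, "ACTB": 50, "GAPDH": 200, "LDHA": 100, "WIF1": 30}
normalized = normalize_sample(raw, ["POS_A", "POS_B"],
                              ["ACTB", "GAPDH", "LDHA"], pos_ref=400.0, hk_ref=200.0)
```

Because both steps are multiplicative, a sample loaded with twice as much RNA yields the same normalized values, which is the purpose of the normalization.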
A series of medulloblastomas with known subgroup affiliation (n = 101) were used to establish a training dataset for subsequent class prediction analysis of independent cohorts utilized in the study. Various class prediction algorithms were assessed by a tenfold cross-validation scheme, using a set of scoring indices to establish a pipeline for prediction of medulloblastoma subgroups using nanoString data derived from the training series. Based on superior performance in cross-validation analysis, the PAM method was selected for all downstream class prediction analyses.
All class prediction analyses were performed in the R statistical programming environment (v2.13). Implementations of the class prediction algorithms were imported from the following R packages: MASS v7.3 (linear discriminant analysis; LDA), class v7.3 (k-nearest neighbor; KNN), e1071 v1.5 (support vector machine; SVM), nnet v7.3 (multinomial log-linear model; MULT), and pamr v1.51 (prediction analysis for microarrays; PAM). During cross-validation, the training set of 101 samples was randomly split into 10 partitions. Each class predictor was trained on nine of the partitions, and the performance of the predictor was subsequently tested on the one remaining partition. Each of the 10 partitions was used as the testing set in turn for a round of cross-validation, for a total of 10,000 rounds of cross-validation, which was repeated three times with reproducible results.
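The partitioning scheme can be sketched as follows. This is an illustration in Python rather than the R pipeline used in the study, and the plain nearest-centroid classifier is a simplified stand-in for the predictors listed above (it omits, for example, the centroid shrinkage that PAM applies). All function names and the toy data are hypothetical.

```python
import random


def kfold_partitions(n_samples, k, seed):
    """Randomly split indices 0..n_samples-1 into k near-equal partitions."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_centroids(X, y):
    """Mean expression vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(centroids, row):
    """Label of the closest centroid by squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))


def cross_validate(X, y, k=10, seed=0):
    """Train on k-1 partitions, test on the held-out one, for each partition."""
    correct = 0
    for test_idx in kfold_partitions(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        centroids = train_centroids(train_X, train_y)
        correct += sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Toy data: two well-separated classes with two features each.
toy_X = [[0.01 * i, 0.0] for i in range(20)] + [[10 + 0.01 * i, 10.0] for i in range(20)]
toy_y = ["A"] * 20 + ["B"] * 20
toy_accuracy = cross_validate(toy_X, toy_y, k=10, seed=0)
```

Repeating `cross_validate` with different seeds gives different random partitions, which is how repeated rounds of cross-validation would be obtained in this sketch.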
The scoring indices used during testing were accuracy, Jaccard similarity index, Rand index, adjusted Rand index, and Fowlkes–Mallows index. The latter four indices are different indices for determining the similarity between two groupings, which are the known and predicted classifications of samples in the current analysis. These indices serve as more stringent measures of accuracy in multi-class prediction. Aside from the accuracy measures (validity), the reliabilities of the predictors were also determined using Shannon entropy as a measure of uncertainty. Predictors with varying predicted classes for the same sample across the cross-validation rounds have higher entropy values, and are hence less reliable.
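For concreteness, the indices can be computed from pair counts, as in the sketch below. It uses the standard pair-counting definitions of the Jaccard, Rand, adjusted Rand, and Fowlkes–Mallows indices and the base-2 Shannon entropy; the text does not state which implementations or logarithm base were used, so those details are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pair_counts(truth, pred):
    """Count sample pairs by whether they share a class in truth and in pred."""
    a = b = c = d = 0
    for i, j in combinations(range(len(truth)), 2):
        same_t = truth[i] == truth[j]
        same_p = pred[i] == pred[j]
        if same_t and same_p:
            a += 1      # together in both groupings
        elif same_t:
            b += 1      # together in truth only
        elif same_p:
            c += 1      # together in prediction only
        else:
            d += 1      # apart in both groupings
    return a, b, c, d


def scores(truth, pred):
    """Accuracy plus four pair-counting similarity indices."""
    a, b, c, d = pair_counts(truth, pred)
    ari_den = (a + b) * (b + d) + (a + c) * (c + d)
    fm_den = math.sqrt((a + b) * (a + c))
    return {
        "accuracy": sum(t == p for t, p in zip(truth, pred)) / len(truth),
        "jaccard": a / (a + b + c) if (a + b + c) else 1.0,
        "rand": (a + d) / (a + b + c + d),
        "adjusted_rand": 2.0 * (a * d - b * c) / ari_den if ari_den else 1.0,
        "fowlkes_mallows": a / fm_den if fm_den else 0.0,
    }


def shannon_entropy(labels):
    """Entropy (bits) of the labels predicted for one sample across rounds."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


known = ["WNT", "WNT", "SHH", "SHH"]
predicted = ["WNT", "SHH", "SHH", "SHH"]
example_scores = scores(known, predicted)
```

In this small example the accuracy is 0.75 while the adjusted Rand index is 0, which shows why the pair-based indices are the more stringent measures. A sample predicted identically in every round has entropy 0, and one split evenly between two subgroups has entropy 1 bit.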
Since the model parameters for SVM can affect the prediction performance, these parameters were optimized by a grid search in a separate round of cross-validation. The ranges of searched parameter values were: [2^−5, 2^15] for C; [2^−15, 2^3] for gamma; [2, 8] for degree; [−1, 1] for coef0. Further, SVMs using different kernels (linear, radial basis, polynomial, and sigmoid) were assessed, and the kernel with the best performance was selected. Similarly for KNN, the best model was selected from models with different k.
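A grid search over these ranges might look like the sketch below. Only the endpoints (2^−5 to 2^15 for C, 2^−15 to 2^3 for gamma) come from the text; the step of 2 in the exponent, the helper names, and the `evaluate` callback (which would return a cross-validated score for one parameter set) are assumptions for illustration.

```python
from itertools import product


def power_of_two_grid(lo_exp, hi_exp, step=2):
    """Values 2**lo_exp .. 2**hi_exp; the exponent step is an assumption."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1, step)]


def grid_search(evaluate, grid):
    """Try every parameter combination and keep the best-scoring one.

    evaluate: callable taking a dict of parameters, returning a score
    grid: dict mapping parameter name -> list of candidate values
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


svm_grid = {
    "C": power_of_two_grid(-5, 15),
    "gamma": power_of_two_grid(-15, 3),
}
```

In practice `evaluate` would train an SVM with the given parameters and return its cross-validated accuracy; the same loop could be repeated for each kernel.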
Regression analysis of prediction accuracy
Cumulative prediction accuracy was modeled as a function of FFPE sample age. The prediction accuracies were first calculated for each sample age year-group. The cumulative accuracies were then determined by calculating the cumulative sum of the accuracies, weighted by the size of each year-group. The data were fitted using a 5-parameter logistic regression model, as implemented in the drc v2.1 R package. The maximum asymptote parameter (D) was constrained at 1 in order to reflect the high accuracy the predictor achieved with recent FFPE samples.
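One reading of the cumulative accuracy calculation is a running, size-weighted mean over year-groups ordered from the most recent to the oldest samples. The sketch below follows that reading, which is an interpretation of the description rather than the code used in the study. The year-group sizes and accuracies are made-up illustrative numbers, and the logistic fit itself is not shown.

```python
def cumulative_accuracy(year_groups):
    """Running weighted accuracy over year-groups sorted by sample age.

    year_groups: list of (age_in_years, n_samples, accuracy) tuples
    Returns a list of (age_in_years, cumulative_accuracy) pairs.
    """
    out = []
    total_n = 0
    total_correct = 0.0
    for age, n, acc in sorted(year_groups):
        total_n += n
        total_correct += n * acc   # accuracy weighted by the group size
        out.append((age, total_correct / total_n))
    return out


# Hypothetical year-groups: (sample age in years, group size, accuracy).
example_groups = [(1, 10, 1.0), (2, 10, 0.8), (3, 20, 0.5)]
example_curve = cumulative_accuracy(example_groups)
```

The resulting (age, cumulative accuracy) pairs are the points to which a dose-response style curve, such as the logistic model described above, would be fitted.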
RNA integrity assessment
RNA derived from FFPE material was subjected to Agilent Bioanalyzer analysis to determine RNA integrity. Smear analysis was performed using the Agilent 2100 expert software to determine the proportion of RNA ≥300 nucleotides (nt) within a given sample.
Establishment of a nanoString assay for medulloblastoma subgroup identification
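Subgroup | Gene symbol | Description
WNT | WIF1 | WNT inhibitory factor 1
WNT | TNC | tenascin C
WNT | GAD1 | glutamate decarboxylase 1 (brain, 67 kDa)
WNT | DKK2 | dickkopf homolog 2 (Xenopus laevis)
WNT | EMX2 | empty spiracles homeobox 2
SHH | PDLIM3 | PDZ and LIM domain 3
SHH | EYA1 | eyes absent homolog 1 (Drosophila)
SHH | HHIP | hedgehog interacting protein
SHH | ATOH1 | atonal homolog 1 (Drosophila)
SHH | SFRP1 | secreted frizzled-related protein 1
Group C | IMPG2 | interphotoreceptor matrix proteoglycan 2
Group C | GABRA5 | gamma-aminobutyric acid (GABA) A receptor, alpha 5
Group C | EGFL11 | eyes shut homolog (Drosophila)
Group C | NRL | neural retina leucine zipper
Group C | MAB21L2 | mab-21-like 2 (C. elegans)
Group C | NPR3 | natriuretic peptide receptor C/guanylate cyclase C (atrionatriuretic peptide receptor C)
Group D | KCNA1 | potassium voltage-gated channel, shaker-related subfamily, member 1 (episodic ataxia with myokymia)
Group D | EOMES | eomesodermin
Group D | KHDRBS2 | KH domain containing, RNA binding, signal transduction associated 2
Group D | RBM24 | RNA binding motif protein 24
Group D | UNC5D | unc-5 homolog D (C. elegans)
Group D | OAS1 | 2′,5′-oligoadenylate synthetase 1, 40/46 kDa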
Validation of the nanoString classifier on multiple published medulloblastoma cohorts
Reproducibility and cross-site validation of the nanoString CodeSet
Accurate classification of archival formalin-fixed paraffin embedded (FFPE) medulloblastomas
Most conventional technologies employed for quantification of mRNA abundance (e.g., gene expression arrays, q-RT-PCR, RNA-Seq) require high-quality RNA that exhibits little to no degradation. Nucleic acid (including RNA) extracted from tissue stored as FFPE material is typically highly degraded and fragmented, and therefore not suitable for most molecular profiling platforms. As nanoString relies on relatively short pairs of 50mer probes, it exhibits robust performance on RNA extracted from FFPE material with results comparable to those obtained with RNA from fresh-frozen tissue.
As the threshold for accurate subgroup assignment varied by subgroup, probability thresholds were re-established in a subgroup-specific manner using cases from the last 8 years (Supplementary Figure 5b). The new probability thresholds were chosen to maintain a near 100% class prediction accuracy in high-quality samples (WNT = 0.7, SHH = 0.5, Group C = 0.5, Group D = 0.5). In recent FFPE samples (≤8 years, n = 32), PAM confidently predicted subgroups for 28/32 cases (87.5%) (Fig. 4c, d). In 4/32 (12.5%) cases PAM was unable to provide a high confidence subgroup assignment, suggesting that our current nanoString assay is incapable of sub-grouping them, and that alternative methods would be necessary (Fig. 4d). Notably, 2/4 cases that failed to meet the PAM threshold were in fact accurately classified by our nanoString assay (Supplementary Table 3). For those FFPE cases in which the PAM threshold was exceeded, 28/28 (100%) were assigned to the correct subgroup (Fig. 4c). Multiple logistic regression analysis established that sample age was a more reliable predictor of class prediction accuracy than measures of RNA integrity (i.e., RIN and RNA size) (Supplementary Figure 6). These results confirm the compatibility of our custom nanoString CodeSet with recent FFPE-derived material, and strongly suggest that our nanoString assay for medulloblastoma classification is well suited to the clinical trial setting in which recent FFPE samples are readily available.
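The thresholding rule can be sketched as below. The threshold values are those quoted above; treating a probability equal to the threshold as confident (`>=`), the function name, and the example probabilities are assumptions made for illustration.

```python
# Subgroup-specific probability thresholds quoted in the text.
THRESHOLDS = {"WNT": 0.7, "SHH": 0.5, "Group C": 0.5, "Group D": 0.5}


def assign_subgroup(probabilities, thresholds=THRESHOLDS):
    """Return the most probable subgroup, or None if it misses its threshold.

    probabilities: dict mapping subgroup -> predicted class probability
    Assumption: a probability equal to the threshold counts as confident.
    """
    best = max(probabilities, key=probabilities.get)
    if probabilities[best] >= thresholds[best]:
        return best
    return None


# Hypothetical predictions for two FFPE samples.
confident = assign_subgroup({"WNT": 0.10, "SHH": 0.60, "Group C": 0.20, "Group D": 0.10})
uncertain = assign_subgroup({"WNT": 0.65, "SHH": 0.20, "Group C": 0.10, "Group D": 0.05})

# Fraction of recent FFPE cases with a confident call, from the counts above.
confident_fraction = 28 / 32
```

The second sample is left unassigned even though WNT is its most probable subgroup, because 0.65 falls short of the stricter WNT threshold of 0.7. This is the behavior that trades sensitivity for specificity.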
Current criteria for risk stratification of medulloblastoma patients include patient age, metastatic status, and extent of surgical resection. Patients over the age of three with non-metastatic disease that is gross totally resected are considered average-risk, and all others are deemed high-risk. This current stratification scheme fails to account for the extensive prognostic variability that exists between molecular subgroups. Therefore, the next generation of prospective clinical trials for medulloblastoma will almost certainly include molecular subgroup assignment for both patient stratification and patient selection for targeted therapies. In particular, modulation of the intensity of therapy in a subgroup-specific manner is a very attractive approach in order to improve outcomes for patients. For example, WNT subgroup medulloblastomas are rarely metastatic and have progression-free and overall survival rates of >90% [1, 2, 4, 12, 14]; in contrast, patients with Group C medulloblastoma have a dismal prognosis [1, 12]. Molecular subgroup-based risk stratification will permit a more rational and personalized approach to patient treatment. Furthermore, targeted therapies against activated signaling pathways, such as those that attenuate SHH pathway activation and are currently being evaluated in clinical trials, will benefit from subgroup-based stratification as they will likely only be effective in one of the four subgroups.
We describe a novel molecular classification method for medulloblastoma that relies on the nanoString nCounter System. This technology requires minimal RNA input (~100 ng), does not involve any enzymatic amplification, and produces expression data that are highly correlated with data generated by expression arrays. The nanoString-mediated subgroup assay described in this report was substantially more cost-effective than performing the equivalent classification using an array-based approach, averaging ~$60 USD per sample for our nanoString assay compared to ~$425 USD per sample for a modern Affymetrix expression array. Using the expression pattern of only 22 medulloblastoma subgroup-specific signature genes, we have established an assay that effectively assigns fresh-frozen medulloblastomas to the correct subgroup with ~98% accuracy as confirmed using three independent validation cohorts. Schwalbe et al. recently described a 13-gene multiplex qPCR-based expression assay to classify medulloblastomas into WNT, SHH, or non-WNT/SHH subgroups. Unsupervised analyses were used in this study to establish the ability of the 13-gene signature to recapitulate subgroup data previously determined in multiple published gene expression cohorts. Although this method proved capable of placing samples into WNT, SHH, and non-WNT/SHH categories, the technique was not directly evaluated on samples belonging to the published cohorts, nor did the assay attempt to make the important distinction between Group C and Group D medulloblastomas, confirmed in multiple recent studies to be both genetically and clinically distinct [1, 6, 12]. In the current study, we have obtained a subset of the same fresh-frozen RNA samples that were used in three independent microarray-based medulloblastoma sub-grouping studies and validated our nanoString assay directly on these templates (n = 130).
Class prediction analysis confirmed the accuracy of our assay in ~98% of cases, establishing the validity of our protocol. For samples that were misclassified, it is difficult to verify the source of the discrepancy in subgroup assignment, although possible explanations include erroneous results of our nanoString assay, potential sample mix-ups, or erroneous classification in the original gene expression array profiling.
We previously introduced an IHC-based classification scheme for sub-grouping medulloblastoma using only four commercially available antibodies. This IHC-based method is very robust in our laboratory, although challenges remain in making the technique generalizable, including variability in antibody batches, sample preparation methods, staining procedures, and inter-observer reliability. We would suggest that in the future, IHC-based methods could be used in concert with a nanoString-based assay to provide clinicians with a high confidence assignment of subgroup for clinical medulloblastoma samples. The two methods are orthogonal, and highly complementary.
To test the reproducibility of our nanoString-based classification assay across different centers, we analyzed a series of 48 cases at nanoString facilities in Toronto, Miami, and Geneva. The expression data generated at the three international sites were virtually indistinguishable, and produced correlation coefficients of ≥0.97. This impressive level of reproducibility achieved using the nanoString technology suggests that our assay could produce identical results at any institute equipped with the nanoString nCounter System, or that RNA samples from centers around the world could be studied at a central location.
Pathologists have long stored tumor biopsies as FFPE material in order to preserve as much cellular and structural integrity of the original tumor specimen as possible, making samples amenable to study for decades. A significant drawback associated with this preservation technique is that DNA and RNA extracted from FFPE material are typically highly degraded, and therefore of limited use in molecular studies. The nanoString technology has known compatibility with degraded RNA isolated from FFPE cases, largely due to the usage of relatively short 50mer probes. In a large series of 84 FFPE medulloblastomas from three independent pathology labs, our nanoString assay could assign subgroup with high confidence in 87.5% of cases from the last 8 years. Of those FFPE cases with a high confidence subgroup assignment, 100% were accurately classified as compared to the gold standard of expression profiling. Although 2/4 FFPE cases that failed to meet the PAM threshold were assigned to the correct subgroup, we suggest that higher specificity at the expense of sensitivity is necessary for a biomarker in the setting of a clinical trial.
In conclusion, we have developed, optimized, and validated a novel assay for medulloblastoma sub-grouping that is compatible with conditions common to current clinical trial settings. Future incorporation of this or similar molecular classification pipelines into prospective clinical trials will enhance our current understanding of the biological and prognostic significance of medulloblastoma subgroups, and we anticipate that this information will lead to improved care and outcomes for our patients.
M.D.T. is supported by a clinician-scientist award from the Canadian Institutes of Health Research. P.A.N. is supported by a Restracomp fellowship at the Hospital for Sick Children. Grant support is acknowledged from The Pediatric Brain Tumor Foundation, Genome Canada, Genome BC, Terry Fox Research Institute, Ontario Institute for Cancer Research, Pediatric Oncology Group Ontario, Funds from ‘The Family of Kathleen Lorette’ and the Clark H. Smith Brain Tumor Centre, Montreal Children’s Hospital Foundation, Hospital for Sick Children: Sonia and Arthur Labatt Brain Tumor Research Centre, Chief of Research Fund, Cancer Genetics Program, Garron Family Cancer Centre, B.R.A.I.N. Child. C.G.E. is supported by an NIH R01 operating grant (NS055089). We thank Susan Archer for assistance with technical writing.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.