DeepSurvNet: deep survival convolutional network for brain cancer survival rate classification based on histopathological images

Zadeh Shirazi, Amin; Fornaciari, Eric; Bagherian, Narjes Sadat; Ebert, Lisa M.; Koszyca, Barbara; Gomez, Guillermo A.

doi:10.1007/s11517-020-02147-3

Original Article
Open access
Published: 02 March 2020

Volume 58, pages 1031–1045, (2020)

Medical & Biological Engineering & Computing

Abstract

Histopathological whole slide images of haematoxylin and eosin (H&E)-stained biopsies contain valuable information related to cancer and its clinical outcomes. Still, there are no highly accurate automated methods to correlate histopathological images with brain cancer patients’ survival, which could help in scheduling patients’ therapeutic treatment and allocating time for preclinical studies to guide personalized treatments. We propose a new classifier, DeepSurvNet, powered by deep convolutional neural networks, to accurately classify brain cancer patients’ survival rate into 4 classes based on histopathological images (class I, 0–6 months; class II, 6–12 months; class III, 12–24 months; and class IV, >24 months survival after diagnosis). After training and testing the DeepSurvNet model on a public brain cancer dataset, The Cancer Genome Atlas, we assessed its generalization by independent testing on unseen samples. Using DeepSurvNet, we obtained precisions of 0.99 and 0.8 in the testing phases on these two datasets, respectively, which shows that DeepSurvNet is a reliable classifier for brain cancer patients’ survival rate classification based on histopathological images. Finally, analysis of the frequency of mutations revealed differences in the frequency and type of genes associated with each class, supporting the idea of a different genetic fingerprint associated with patient survival. We conclude that DeepSurvNet constitutes a new artificial intelligence tool to assess the survival rate in brain cancer.

1 Introduction

Brain cancer patient classification is mainly based on histopathological images, which can accurately identify the type of cancer, as well as on genetic tests [1, 2]. However, recent single-cell RNA-seq experiments performed on GBM biopsies [3,4,5,6,7,8] have challenged these models, pointing out that the reliability of these methods and their use in personalized medicine strongly depend on how much we know about the different types of cancer (i.e. cancer cell subtypes within a tumour), how many therapies are available for their individual treatment, and whether these target all or none of such cancer cell populations [9]. Thus, as we are still making progress on the molecular determinants that contribute to the aggressiveness of glioblastoma, the current brain cancer classification methods (based on histological and/or genetic approaches) have so far not been sufficient to provide a complete picture of how they can be used to predict (i) survival and (ii) response to treatment and to (iii) support the development of more personalized treatments [10]. This is evident from the following: (i) the lack of development of new treatments for brain cancer patients, in particular, those patients affected by grade IV glioma [10]; (ii) the lack of improvement in brain cancer treatments and patient outcomes (i.e. survival) in the last 30 years [11]; and (iii) the lack of personalized treatments in the clinic, where most oncologists subject patients to the Stupp protocol and rely on knowledge of IDH gene mutations and MGMT methylation [12].

Thus, we feel that in addition to the classification of brain tumours that has been done so far, it is equally important to stratify brain cancer patients based on their survival characteristics, which will permit clinicians to tailor both the timing and the type of treatments to patients [12, 13]. This will, for example, help to avoid overtreating those patients with more stable disease. Moreover, classification of brain tumours as a function of patient survival will help to reveal the key characteristics that make these tumours very aggressive and, for those patients who present long survival, the molecular signatures that contribute to it [13].

Thus, survival rate analysis has become essential for clinicians to select the best treatment methods based on the patient’s clinical data [14, 15], and survival predictor models have been developed in oncology to investigate the relationship between information obtained at the time of diagnosis and the patient’s overall survival [16]. This has been further facilitated by recent access to large datasets of digital images acquired at the moment of diagnosis, e.g. The Cancer Genome Atlas (TCGA), including those from computed tomography (CT), magnetic resonance imaging (MRI) and whole slide pathological imaging (WSI), which have allowed researchers to investigate patients’ survival based on the information contained in these images [17,18,19,20]. For example, Tomczak et al. [21] collected > 2000 lung cancer WSIs, and others established a relationship between the information stored in the pathological images and survival rates [22, 23].

Thus, different groups of models for predicting the patient’s survival from the histopathological information collected at the moment of diagnosis have emerged. One group corresponds to models that estimate the patient’s survival directly; these are related to the traditional hazard models based on the Cox model [24] and its derivations [25, 26]. These consider a linear combination of covariates to predict the risk of the patient’s death, with nonlinear functions related to the risk [27]. Another group is based on artificial intelligence and deep learning, in which deep convolutional neural networks (DCNN) are used for the analysis of biomedical images and applied to recognition, classification and prediction tasks [28,29,30,31]. Numerous deep learning examples for predicting the survival rate have been reported recently. Katzman et al. [32] were the first to put forward a deep fully connected network, namely, DeepSurv, to predict survival rate based on structured clinical data (non-image data), and Zhu et al. [27] used a modified DCNN, namely, DeepConvSurv, on unstructured data (867 lung cancer pathological WSIs) to predict the survival rate. In particular, they changed the DCNN loss function in their model to the negative partial log likelihood, and as a result, the output of their network measured the risk value for each patient. They reported a concordance index (c-index) of 0.63 as their model evaluator. Zhu et al. [33] applied a WSI-based model (viz. WSISA) to predict survival in lung cancer as well as in glioblastoma (c-index of 0.7 and 0.64 for lung cancer and glioblastoma, respectively), although in a limited manner, as (i) WSIs from TCGA with 0.5-μm/pixel resolution were downloaded, and patches of 512 × 512 pixels were extracted haphazardly, implying that 54% of the publicly available data were outliers in their analysis, and (ii) high-level semantic information could not be detected in their model. Tang et al. [34] also used a DCNN-based model (viz. CapSurv) to predict survival rate in lung cancer and a specific type of brain cancer (glioblastoma), considering patches of 256 × 256 pixels extracted from WSIs from TCGA, and applied a new loss function, namely, survival loss, to improve the accuracy (c-index 0.67) of the predictive model.

In addition to direct prediction of the patient’s survival, supervised machine learning–based algorithms are also used for classification [35, 36], where input values (e.g. an image associated with a clinical record) are assigned to an output class (e.g. survival within a given time period after diagnosis). Classifiers offer the possibility of predicting with high accuracy the class to which a group of patients belongs (e.g. time period after diagnosis), whereas methods that predict the patient’s survival directly are less precise and less efficient. As a novel example, Kolachalama et al. [37] utilized a DCNN to classify the survival rate of three types of kidney cancer based on WSIs. In their model, the inputs were WSIs without any extracted patches, a computationally very demanding task, and the outputs were three classes of survival rates (1 year, 3 years and 5 years), for which the area under the curve, used as the classifier evaluation metric, reached 0.878, 0.875 and 0.904, respectively.

In this work, we use DCNN for classification of brain cancer survival using whole slide histopathological images obtained from haematoxylin and eosin (H&E)-stained biopsy tissue sections, since no models have previously been reported for classification of survival rates of brain cancer patients (see [38] for a comprehensive review on brain cancer classification using deep learning methods and MRI imaging). Moreover, although research is progressing on the molecular determinants that contribute to the development and growth of brain tumours, including glioblastoma, the most aggressive form, current classification approaches (based on histological and/or genetic tests) do not directly focus on the survival of patients [1, 2, 10] and have not yet provided a complete picture of how “brain cancer type classification” can be used to (i) predict survival, (ii) predict response to treatment and (iii) help the development of more personalized treatments. In order to address this problem of brain cancer classification based on survival, we put forward the deep survival convolutional network (DeepSurvNet) as a novel classifier approach based on DCNN. Like the other models, we used patches derived from WSIs as inputs, and we trained and tested our model on WSIs available from TCGA. In addition, we were able to test the generalization of our model by further applying it to a completely independent dataset of H&E images derived from tumour biopsies collected locally by SA Pathology, the South Australian state pathology service. Thus, DeepSurvNet allowed us for the first time to (1) accurately (> 99%) classify brain cancer survival rate directly from the WSIs and (2) validate our TCGA-trained model in an independent and local cohort of patients. The experimental results illustrate that the DeepSurvNet model is a high-performing classifier and opens a new horizon in the field of survival analysis.

2 Methods

2.1 Construction, training and testing of DeepSurvNet

Figure 1 presents the steps (a to h) involved in the construction, training and testing of DeepSurvNet, which are described below.

2.1.1 Datasets used for training, testing and validation of deep learning classifiers (Fig. 1a)

We considered two different datasets for the classification of survival rates in patients who suffered from different types of brain cancer, including glioblastoma multiforme, mixed glioma, oligodendroglioma and astrocytoma. The first dataset is derived from 490 brain cancer patients, is publicly available from TCGA [39] and was used to train and test all the survival rate classifier models. It is important to mention that within this dataset, the slides (and therefore the WSIs) for each patient contain several tissue sections of the same biopsy, and all of these were used to train and test the classifiers. The second dataset was derived from 9 glioblastoma patients who underwent surgical tumour resection within the South Australian public hospital system. Tumour biopsy specimens were accessed from archival material stored at SA Pathology (the state pathology service), and survival time was calculated based on electronic medical records. Formalin-fixed paraffin-embedded biopsy tissues were sectioned and stained with H&E according to standard protocols at SA Pathology and imaged at 0.5-μm/pixel resolution using a Zeiss AxioImager.M2 microscope equipped with an EC Plan-Neofluar 40x/0.75 M27 Objective and an AxioCam Mrc camera. We used this dataset as an independent test to monitor the efficiency of our model (i.e. these data were not used for training the model, for which only TCGA datasets were used).

2.1.2 Patients’ database creation: removing outliers and extraction of tumour regions of interest (ROIs) from WSIs (Fig. 1b)

937 WSIs from 490 brain cancer patients were downloaded from TCGA. These were visually explored, and those WSIs that were unusable for further analysis because they were corrupted, presented marker annotations that could not be removed, were of low resolution or lacked clinical information (time of death after diagnosis) were removed, which left 654 WSIs from 445 cases available for further analysis. Guided by the pathologist, we further inspected the data for optimum extraction of several tumour ROIs from each WSI. The total number of extracted ROIs was 849 from the 445 cases. We used this result to create a curated database containing all the patients’ clinical output information, including the patient ID, mutated genes, and time between brain cancer diagnosis and death. This database is directly related to all the extracted ROIs used in our work and is available from the authors upon request.

2.1.3 Definition of different classes for survival (Fig. 1c)

For classification, we have considered 4 classes. These classes are defined by the time between brain cancer diagnosis and death, which was extracted from the patients’ clinical history available from TCGA. Classes I, II, III and IV contain, respectively, 217 ROIs (patients with survival time after diagnosis between 0 and 6 months), 210 ROIs (between 6 and 12 months), 277 ROIs (between 12 and 24 months) and 145 ROIs (greater than 24 months). Thus, the number of ROIs in each class is sufficiently large for training the DCNN classifiers, which are known to be extremely data hungry throughout the training phase [40].
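The class definition above can be sketched as a small lookup function. This is only an illustration: the text does not say to which class a survival time of exactly 6, 12 or 24 months belongs, so the inclusive upper bounds below are an assumption, and the function name `survival_class` is hypothetical.

```python
def survival_class(months: float) -> str:
    """Map survival time after diagnosis (in months) to class I-IV."""
    if months < 0:
        raise ValueError("survival time must be non-negative")
    # Assumption: upper bounds are inclusive (6 -> I, 12 -> II, 24 -> III).
    if months <= 6:
        return "I"
    if months <= 12:
        return "II"
    if months <= 24:
        return "III"
    return "IV"


# ROI counts per class as reported in the text.
roi_counts = {"I": 217, "II": 210, "III": 277, "IV": 145}
total_rois = sum(roi_counts.values())  # matches the 849 ROIs extracted from 445 cases
```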

2.1.4 Patch extraction from ROIs and patch standardization (Fig. 1d)

ROIs allocated to each class are large, and processing them directly is computationally demanding. Thus, for training and testing purposes, we have extracted ROI subregions or “patches” of three sizes, 256 × 256 (218,760 patches), 512 × 512 (38,963 patches) and 1024 × 1024 (8657 patches), and compared them to determine which size allows more features to be detected from the ROIs. For supervised machine learning tasks (e.g. classification), each patch is allocated to a class with a specific label, which results in 4 labels as outputs, one per class. Table 1 shows a summary of the number of extracted patches of each size for each class.
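As a rough illustration of patch extraction, the sketch below enumerates the top-left corners of square tiles inside an ROI. The text does not state whether patches overlap or how ROI borders are handled, so non-overlapping tiles with partial border tiles discarded are assumptions, and the ROI dimensions are hypothetical.

```python
def patch_origins(height, width, size):
    """Return (row, col) top-left corners of non-overlapping size x size tiles.

    Assumption: tiles do not overlap and incomplete tiles at the border are dropped.
    """
    return [
        (r, c)
        for r in range(0, height - size + 1, size)
        for c in range(0, width - size + 1, size)
    ]


# Hypothetical 2048 x 3072 pixel ROI tiled at the three patch sizes used in the text.
counts = {s: len(patch_origins(2048, 3072, s)) for s in (256, 512, 1024)}
```

Halving the tile edge roughly quadruples the number of patches, which is consistent in trend with the patch counts reported above for the three sizes.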

Table 1 Patch extraction from ROIs for each class

Finally, as TCGA derived images present variable levels of colour intensities, we standardize their intensities by applying the following formula to each pixel:

$$ P^{\prime}=\frac{P-\mu }{\sigma } $$

(1)

where P′ and P are the standardized and original patches, respectively, and μ and σ are the mean and standard deviation of all pixel values in the original image patch.
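A minimal sketch of Eq. (1), treating a patch as a flat list of pixel values. Whether the statistics were computed per channel or over all channels is not stated in the text; the version below pools all values, uses the population standard deviation, and maps a constant patch to zeros, all of which are assumptions.

```python
from statistics import fmean, pstdev


def standardize(patch):
    """Apply P' = (P - mu) / sigma to a patch given as a flat list of pixel values."""
    mu = fmean(patch)
    sigma = pstdev(patch)  # population standard deviation
    if sigma == 0:
        # Assumption: a constant patch is mapped to zeros to avoid division by zero.
        return [0.0 for _ in patch]
    return [(p - mu) / sigma for p in patch]


standardized = standardize([10, 20, 30, 40])
```

After standardization, the patch has zero mean and unit standard deviation, which removes differences in overall staining intensity between slides.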

2.1.5 Training, validating and testing datasets and DCNN-based classifiers (Fig. 1e, f)

For each patch size extracted from the TCGA dataset, we divided all the patches into three cohorts: training (80%), validation (18%) and testing (2%). An example of an early CNN structure can be seen in Fig. 2. The early basic architectures popularized by AlexNet [41] loosely follow a pattern of alternating between convolutional layers (Conv Layer) and pooling layers (Pool Layer). The intention is to “learn” features from the input via convolutional layers and reduce the spatial complexity via pooling layers. Successive iterations of these operations distil a set of features that are fed into fully connected (FC) layers, which compute the output classes.

In more modern architectures such as MobileNetV2 [42], FC layers are largely replaced by 1 × 1 convolutions. More performant patterns have also been developed, such as residual layers, which utilize the skip connections introduced with ResNet [43].
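The 80%/18%/2% division described at the start of this section could be sketched as follows. The text describes a split over patches; whether patches from the same patient were kept in the same cohort is not stated, so this sketch simply shuffles patch identifiers. The function name and the fixed seed are assumptions.

```python
import random


def split_patches(patch_ids, seed=0):
    """Shuffle patch identifiers and split them 80/18/2 into train/validation/test."""
    ids = list(patch_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 80 // 100
    n_val = len(ids) * 18 // 100
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # the remainder, about 2%
    return train, val, test


train, val, test = split_patches(range(1000))
```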

2.1.6 Five DCNN-based classifiers for brain cancer survival rate classification (Fig. 1g)

In order to classify the different classes of survival rates based on different patch sizes, we have considered the most popular DCNN classifiers for image recognition tasks, including VGG19 [44], GoogleNet [45], ResNet50 [43], InceptionV3 [46] and MobileNetV2 [42]. We compared the results derived from each of these models, and the best-performing model was then used as the engine for DeepSurvNet.

VGG19

In 2014, the Visual Geometry Group (VGG) at the University of Oxford presented a DCNN classifier model named VGG [44] in the ILSVRC [47] challenge, where it placed first in the localization task and second in the classification task. There are several VGG architectures with different numbers of layers, two of which are very popular: a 16-layer network (VGG16) and a 19-layer network (VGG19). We use VGG19 as a classifier for the survival rate classification task in this study.

GoogleNet

In 2014, Szegedy et al. [45] from Google introduced a new concept, namely, Inception, and called their model GoogleNet. In this 22-layer deep network, they applied filters of different sizes (1 × 1, 3 × 3 and 5 × 5) in the Inception modules. The aim of using multiple convolutions in the Inception modules is to extract features at different levels. The outputs of these filters are stacked along the channel dimension and passed on to the following layers.

ResNet50

In 2015, He et al. from Microsoft introduced the ResNet architecture and demonstrated that, using residual modules, very deep convolutional networks can be trained with the standard stochastic gradient descent (SGD) method [43]. Among the different ResNet models, ResNet50 is very popular since it has a simpler structure than the other forms, which is why we use it in this study.

InceptionV3

As mentioned earlier, GoogleNet introduced the Inception architecture, or Inception V1. Afterwards, the Inception module was refined in various ways, and Google introduced further architectures named Inception vN, where N is the Inception version. The Inception V3 [46] architecture adds new features to the Inception module to increase accuracy on the ILSVRC classification task.

MobileNetV2

Another successful DCNN-based classifier is MobileNetV2 [42], introduced by Sandler et al. from Google in 2018. Although MobileNetV2 builds on the idea behind MobileNetV1 [48], i.e. using efficient building blocks through depthwise separable convolution, the V2 architecture has two new characteristics. The first is linear bottlenecks between the layers, and the second is shortcut connections between the bottlenecks. Since this classifier performs well on benchmarks like ILSVRC, we have included it as a survival rate classifier in this study.

DeepSurvNet classifier model (Fig. 1h)

After applying the five classifiers introduced in the previous part to the different patch sizes, the best survival rate classifier model is selected. Since we have five classifiers and three patch sizes, 15 models were applied in total. The classifier with the highest accuracy and the lowest loss among all 15 models is called DeepSurvNet.
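The selection step can be sketched as a search over the 5 × 3 grid of classifier and patch size combinations. The text does not say how accuracy and loss were combined, so ranking by accuracy first and breaking ties by lower loss is an assumption. The scores below are placeholders that only exercise the function; they are not results from this study.

```python
from itertools import product

classifiers = ["VGG19", "GoogleNet", "ResNet50", "InceptionV3", "MobileNetV2"]
patch_sizes = [256, 512, 1024]
candidates = list(product(classifiers, patch_sizes))  # 5 x 3 = 15 models


def select_best(scores):
    """Return the candidate with the highest accuracy; ties go to the lowest loss.

    `scores` maps (classifier, patch_size) -> (accuracy, loss).
    """
    return max(scores, key=lambda c: (scores[c][0], -scores[c][1]))


# Placeholder scores, NOT results from the paper.
placeholder_scores = {c: (0.5, 1.0) for c in candidates}
placeholder_scores[("ResNet50", 512)] = (0.9, 0.4)
placeholder_scores[("VGG19", 256)] = (0.9, 0.3)
best = select_best(placeholder_scores)
```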

2.2 Evaluation criteria

Several metrics like confusion matrix [49]; the combination of precision, recall and F-score [50]; and the area under the ROC curve (AUC) [51] were used for performance evaluation of our classifiers.

Confusion matrix

The confusion matrix summarizes a classifier’s success in predicting examples belonging to different classes based on true positive (TP), true negative (TN), false negative (FN) and false positive (FP) values. This table is used to calculate the other performance metrics, i.e. precision, recall and Matthews correlation coefficient (MCC).

Precision, recall and F-score

Precision and recall are defined as follows:

$$ \mathrm{Precision}=\frac{TP}{TP+FP} $$

(2)

$$ \mathrm{Recall}=\frac{TP}{TP+FN} $$

(3)

The F-score is the harmonic mean of precision and recall:

$$ F\text{-}\mathrm{score}=\frac{2\left(\mathrm{Precision}\times \mathrm{Recall}\right)}{\mathrm{Precision}+\mathrm{Recall}} $$

(4)

The MCC value is a correlation coefficient between the targets and predicted classifications:

$$ MCC=\frac{\left( TP\times TN\right)-\left( FP\times FN\right)}{\sqrt{\left( TP+ FP\right)\left( TP+ FN\right)\left( TN+ FP\right)\left( TN+ FN\right)}} $$

(5)

Precision, recall and F-score reach their best values at 1 and worst at 0. An MCC of +1 indicates a perfect prediction, and −1 represents complete disagreement between target and prediction.
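Equations (2) to (5) can be evaluated per class from a multi-class confusion matrix by treating one class as positive and the rest as negative. The sketch below assumes rows are true classes and columns are predicted classes, does not guard against zero denominators, and uses a toy matrix with illustrative numbers only.

```python
import math


def one_vs_rest_counts(cm, k):
    """Return TP, FP, FN, TN for class k (rows = true class, columns = predicted)."""
    n = len(cm)
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(n)) - tp
    fn = sum(cm[k]) - tp
    tn = sum(map(sum, cm)) - tp - fp - fn
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute precision, recall, F-score and MCC as in Eqs. (2)-(5)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return precision, recall, f_score, mcc


# Toy 2-class confusion matrix (illustrative numbers only).
cm = [[8, 2],
      [1, 9]]
precision, recall, f_score, mcc = metrics(*one_vs_rest_counts(cm, 0))
```

Averaging the per-class values over the 4 survival classes would give class-averaged figures of the kind reported in the results tables, although the exact averaging scheme used there is not specified.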

Area under the curve (AUC) and receiver operating characteristics (ROC)

ROC curves combine the true positive rate (TPR or sensitivity) and false positive rate (FPR or 1-specificity) to illustrate the classification performance. These two metrics are defined as follows:

$$ TPR=\frac{TP}{TP+ FN} $$

(6)

$$ FPR=\frac{FP}{FP+ TN} $$

(7)

A better classifier achieves a higher AUC, and an AUC of 1 indicates perfect classification.
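As an illustration of how TPR and FPR trace the ROC curve, the sketch below sweeps a decision threshold over classifier scores for one class against the rest and integrates the curve with the trapezoidal rule. It assumes distinct scores (ties are not treated specially) and at least one positive and one negative example; the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """Return the ROC points (FPR, TPR) and the trapezoidal AUC for labels in {0, 1}."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Lower the threshold step by step, from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    auc = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return points, auc


_, auc_perfect = roc_auc([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
_, auc_mixed = roc_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

For a 4-class problem such as this one, a per-class AUC can be obtained by treating each class in turn as positive.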

3 Implementation details

In the preprocessing stage of this study, we used Aperio ImageScope software for WSI visualization and outlier removal. We initialized the input shape to 224 × 224 × 3 (224 × 224 pixels with 3 channels) for all of the classifiers. After several experiments, we found that the best settings for the parameters and hyperparameters in the training stage were 30 epochs with the stochastic gradient descent (SGD) optimizer, an initial learning rate of 0.01, a momentum of 0.9, a learning rate decay of 0.001 and categorical cross-entropy as the loss function. In order to tackle the overfitting problem, we applied the dropout regularization technique. All the networks were implemented in Python with Keras [52], a high-level neural networks API running on the TensorFlow framework [53], and trained using four NVIDIA 1080Ti GPUs.
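The text does not define how the learning rate decay of 0.001 was applied. One plausible reading, given that the networks were implemented in Keras, is the time-based decay used by older Keras SGD optimizers, in which the rate shrinks with the number of parameter updates. The sketch below illustrates that schedule under this assumption; it is not a statement of the authors' exact configuration.

```python
def decayed_learning_rate(step, initial_lr=0.01, decay=0.001):
    """Time-based decay: lr_t = lr_0 / (1 + decay * step).

    Assumption: `step` counts parameter updates (batches), not epochs.
    """
    return initial_lr / (1.0 + decay * step)


# Learning rate at the start, after 1000 updates and after 9000 updates.
schedule = [decayed_learning_rate(s) for s in (0, 1000, 9000)]
```

Under this reading, the learning rate halves after 1000 updates and falls to one tenth of its initial value after 9000 updates.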

4 Results and discussion

4.1 Survival rate classifiers comparison

Figure 3 shows the accuracy and loss curves in the training phase for the different patch sizes (256 × 256, 512 × 512 and 1024 × 1024) for all survival classifiers (note that all classifiers were applied to the same TCGA training patches). For all classifiers, the 256 × 256 patch size yielded the highest training accuracy (nearly 1) and the lowest training loss (nearly 0) when compared to the other patch sizes.

Then, in the testing phase, we evaluated the trained classifiers on the corresponding test set (i.e. a set of patch images of the corresponding size). During this phase, we calculated the confusion matrix, the AUC, and the values of the evaluation metrics, including recall, precision and F-score, for the different classifiers (Table 2). We found that GoogleNet achieved the highest pairs of (i) average precision and (ii) average AUC: 0.65 and 0.86, 0.93 and 0.99, and 0.99 and 1 for the 1024 × 1024, 512 × 512 and 256 × 256 patch sizes, respectively. Therefore, the DeepSurvNet classifier is powered by GoogleNet trained on 256 × 256 histopathological patches, given the highest average precision obtained under these conditions.

Table 2 Comparison between different kinds of DCNN classifiers in the testing phase

Figure 4 shows the application of the 5 classifiers to the 256 × 256 patch size. The confusion matrices and AUC values depicted in this figure confirm that GoogleNet has the highest number of true positives and the highest average AUC for the four classes in comparison with the other classifiers. The classification results of the 5 classifiers trained on 256 × 256 patches for each of the 3 cross-validation testing folds are shown in Table 3. The results show that the highest average indexes (among all 4 classes), including precision, recall, F1-score and MCC, for all 3 folds are again obtained with GoogleNet.

Table 3 Classification results of 5 DCNN classifiers trained on 256 × 256 patch size for each cross validation in 3 different testing folds

4.2 DeepSurvNet generalization in unseen (locally derived) dataset

Having established a pipeline that accurately predicts the survival class of a patient from pathological images using DeepSurvNet, we then tested the accuracy of the model on completely unseen data, which is relevant for those who might want to apply this pipeline to already available brain cancer histopathological slides. For this, we analysed images of H&E-stained glioblastoma tissue sections collected by SA Pathology from 9 patients undergoing tumour resection in local hospitals. Figure 5 shows a summary of the results. First, H&E histopathological images from each patient (Fig. 5a, b) were analysed in consultation with the clinical pathologist to identify the regions that correspond to the tumour. These ROIs were used to extract 20 patches per patient for “patch classification” using the TCGA-trained DeepSurvNet classifier (Fig. 5b). We observed that the frequency of class prediction per patient was highly biased towards a single class, as would be expected since the patches were derived from the same pathological sample (Fig. 5c, d). Remarkably, this single class perfectly matched the real class to which each patient belongs (9 of 9 patients, Fig. 5d).
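The per-patient decision described above, taking the class predicted most often among a patient's 20 patches, can be sketched as a majority vote. The predictions below are hypothetical and only illustrate the aggregation; the function name is an assumption.

```python
from collections import Counter


def patient_class(patch_predictions):
    """Return the most frequent predicted class and its share of the patches."""
    counts = Counter(patch_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(patch_predictions)


# Hypothetical predictions for one patient's 20 patches.
preds = ["I"] * 16 + ["II"] * 3 + ["III"]
label, share = patient_class(preds)
```

With this aggregation, a patient is classified correctly whenever the true class receives the largest share of patch votes, even if some individual patches are misclassified.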

We then performed precision analysis on the 20 × 9 = 180 patches derived from these samples (i.e. without making a distinction as to which patient they belong to). Confusion matrix results (Fig. 6) show that the application of DeepSurvNet to this unseen dataset led to an average global precision of 80%. This precision was higher for patches belonging to class I and class II (80% and 86%, respectively) and lower for those belonging to class III and class IV (77% and 74%, respectively), for which morphological and genetic features are much more heterogeneous (see below).

4.3 Gene mutation frequency within survival classes

We then sought a better understanding of the underlying genetic differences associated with each class. For this, we analysed the frequency distribution of mutated genes in the different survival classes using data derived from the TCGA database (Fig. 7). First, we found that by pooling all brain cancer data, the most highly mutated genes were PTEN, TTN, TP53, EGFR, PLG and MUC16 (Fig. 7a). We then analysed the frequency of mutations within each class and compared it to the distribution for all patients (Fig. 7b). We found that the distribution of gene mutations in class I most closely mimics that of the whole cohort, this being less obvious for the rest of the classes. This potentially highlights the underlying genetic differences between the classes and their impact on patient survival. To gain further insight into this, we performed a Z-score analysis to test whether there are highly mutated genes associated with each class by identifying those genes whose frequency of mutations is more than 2 standard deviations above the frequency values for the entire set of genes (Fig. 7c). Interestingly, we found specific genes associated with each class (class I, PTEN; class II, SPTA1; class III, TTN; and class IV, TTN and FLG). Of these, the clinical significance of TTN mutations is limited, since high rates of TTN mutations (passenger mutations) are mostly due to the large size of this protein and the variation of mutation rates across the genome [54]. We were also interested in those mutations that differ between classes, in particular, between patients with short and long survival. For this, we calculated the differences in the frequency of mutations of each class with respect to the frequency of mutations in class IV, to discover which genes are more often aberrant in short survival cancers (compared to those with long survival) (Fig. 7d). In particular, a lack of FLG mutations is associated with class I and class II; this adds to the presence of PTEN and SPTA1 mutations within these classes to define their signatures. Also, we found no clear differences between the long survival classes (III and IV), which highlights that short survival cancers, like glioblastoma (Supplementary Table 1), are intrinsically different from long survival cancers and correlates with our precision analysis of the SA Pathology samples, in which accuracy is reduced for these classes.
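The Z-score criterion described above (genes whose mutation frequency lies more than 2 standard deviations above the mean over all genes) can be sketched as follows. The gene names and frequencies are hypothetical placeholders, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev


def highly_mutated(freq_by_gene, threshold=2.0):
    """Return genes whose mutation frequency has a Z-score above `threshold`,
    computed against the frequencies of the entire set of genes."""
    values = list(freq_by_gene.values())
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0:
        return []  # all genes mutated at the same frequency
    return sorted(g for g, f in freq_by_gene.items() if (f - mu) / sigma > threshold)


# Hypothetical frequencies: one frequently mutated gene among 20 background genes.
freqs = {f"GENE{i}": 0.02 for i in range(20)}
freqs["GENE_X"] = 0.40
flagged = highly_mutated(freqs)
```

Applying such a function separately to the mutation frequencies of each survival class would yield class-specific lists of highly mutated genes of the kind shown in Fig. 7c.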

From the above analysis of the frequency of mutated genes in brain cancer, it is worth highlighting the identification of FLG mutations in class III and IV patients. The National Cancer Institute (NCI) is currently developing a new genomics database, the Exceptional Responders Initiative (ERI), to identify molecular features of patients who have a unique response to treatments and therefore exhibit long survival rates (i.e. “exceptional responders”). FLG is a high-affinity receptor of basic fibroblast growth factor (bFGF), and a recent report by Wipfler et al. has shown that FLG has a significantly different distribution of patients affected by somatic nonsynonymous mutations. Of these, 25% of exceptional responders had one mutation each in FLG [13]. In contrast, overexpression of FLG is associated with low immune cell infiltration and short survival rates in melanoma and ovarian cancer [55], while loss-of-function mutations in FLG are associated with lower cancer risk in several cancers [56]. This suggests that FLG mutations in patients with long survival rates confer a prognostic benefit possibly related to immune cell infiltration within the glioma tumour cellular microenvironment, a feature that can be detected in H&E-stained tissue sections by our image-based classifier. Similarly, SPTA1 (Spectrin, alpha, erythrocytic 1) mutations can lead to alterations in H&E-stained tissue features due to the involvement of SPTA1 in the regulation of cortical actin organization and cell shape, as has been shown in other cancers [57], although its role in GBM has not yet been investigated. Likewise, the tumour microenvironment and the differential expression of extracellular matrix (ECM) proteins (and therefore outside-in cell-ECM signalling) have been found to be highly and inversely correlated with patients’ survival rates [58]. Thus, these observations suggest that differences in the cellular and noncellular microenvironment [10] and the way that cancer cells sense it through adhesion receptors and modulation of the actin cytoskeleton (i.e. EMT [59] and invasion [60]) are key biological features that could be captured by our image-based survival rate classifier.

5 Conclusion

We tested the possibility of using H&E-stained brain cancer histopathological images as input data for patients’ survival classification using DCNNs. In doing so, we compared the performance of DCNN algorithms using two independent datasets: the first publicly available in TCGA and the other generated by ourselves from samples collected in Adelaide. DeepSurvNet is a GoogleNet classifier trained on 200,000 training samples from the TCGA brain cancer dataset. Patch classification accuracy using DeepSurvNet was 99% in the testing phase. Moreover, we found that DeepSurvNet classified > 50% of patients’ patches with > 90% accuracy and > 75% of patients’ patches with 75% accuracy, and reached 100% accuracy when single-patient classification was based on the total patches per patient. Indeed, since for each patient the model classified > 50% of patches into the correct class, the classifier accuracy for 9 patients is 100%.
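
The patient-level figure described above rests on aggregating patch predictions per patient: a patient counts as correctly classified when more than 50% of that patient's patches fall in the true survival class. The following is a minimal sketch of such a majority vote, assuming each patch already carries a predicted class label (I to IV). The function names and the toy data are hypothetical illustrations, not the authors' implementation or results.

```python
from collections import Counter


def patient_class(patch_predictions):
    """Return the class predicted for more than 50% of a patient's patches.

    Returns None when no single class holds a strict majority.
    """
    if not patch_predictions:
        raise ValueError("patient has no patches")
    label, count = Counter(patch_predictions).most_common(1)[0]
    return label if count > len(patch_predictions) / 2 else None


def patient_level_accuracy(patches_by_patient, true_class_by_patient):
    """Fraction of patients whose majority-vote class equals the true class."""
    correct = sum(
        patient_class(patches) == true_class_by_patient[pid]
        for pid, patches in patches_by_patient.items()
    )
    return correct / len(patches_by_patient)


# Hypothetical toy data (not from the study): predicted class per patch.
patches = {
    "patient_A": ["IV", "IV", "III", "IV"],
    "patient_B": ["I", "II", "I", "I", "I"],
}
truth = {"patient_A": "IV", "patient_B": "I"}
print(patient_level_accuracy(patches, truth))
```

With this rule, individual patches can be misclassified while the patient-level label is still correct, which is why patient-level accuracy can exceed patch-level accuracy.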

The analysis of the frequency of mutations within these survival classes shows differences in the frequency and type of genes associated with patients with different survival rates, supporting the idea of a different genetic fingerprint associated with patient survival. This highlights that differences between short and long survival tumours and the underlying genetic characteristics could be useful not only in scheduling treatments but also for the identification of new targets for glioblastoma. Thus, we conclude that DeepSurvNet constitutes a new AI tool to assess the malignancy of brain cancer, which could help in the evaluation of patient treatment.

References

Louis DN, Ohgaki H, Wiestler OD, Cavenee WK, Burger PC, Jouvet A, Scheithauer BW, Kleihues P (2007) The 2007 WHO classification of tumours of the central nervous system. Acta Neuropathol 114(2):97–109
Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P, Ellison DW (2016) The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol 131(6):803–820
Darmanis S, Sloan SA, Croote D, Mignardi M, Chernikova S, Samghababi P, Zhang Y, Neff N, Kowarsky M, Caneda C, Li G, Chang SD, Connolly ID, Li Y, Barres BA, Gephart MH, Quake SR (2017) Single-cell RNA-Seq analysis of infiltrating neoplastic cells at the migrating front of human glioblastoma. Cell Rep 21(5):1399–1410
Muller S et al (2017) Single-cell profiling of human gliomas reveals macrophage ontogeny as a basis for regional differences in macrophage activation in the tumor microenvironment. Genome Biol 18(1):234
Muller S et al (2016) Single-cell sequencing maps gene expression to mutational phylogenies in PDGF- and EGF-driven gliomas. Mol Syst Biol 12(11):889
Neftel C et al (2019) An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178(4):835–849 e21
Patel AP, Tirosh I, Trombetta JJ, Shalek AK, Gillespie SM, Wakimoto H, Cahill DP, Nahed BV, Curry WT, Martuza RL, Louis DN, Rozenblatt-Rosen O, Suvà ML, Regev A, Bernstein BE (2014) Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344(6190):1396–1401
Yuan J, Levitin HM, Frattini V, Bush EC, Boyett DM, Samanamud J, Ceccarelli M, Dovas A, Zanazzi G, Canoll P, Bruce JN, Lasorella A, Iavarone A, Sims PA (2018) Single-cell transcriptome analysis of lineage diversity in high-grade glioma. Genome Med 10(1):57
Dirkse A, Golebiewska A, Buder T, Nazarov PV, Muller A, Poovathingal S, Brons NHC, Leite S, Sauvageot N, Sarkisjan D, Seyfrid M, Fritah S, Stieber D, Michelucci A, Hertel F, Herold-Mende C, Azuaje F, Skupin A, Bjerkvig R, Deutsch A, Voss-Böhme A, Niclou SP (2019) Stem cell-associated heterogeneity in glioblastoma results from intrinsic tumor plasticity shaped by the microenvironment. Nat Commun 10(1):1787
Perrin SL, Samuel MS, Koszyca B, Brown MP, Ebert LM, Oksdath M, Gomez GA (2019) Glioblastoma heterogeneity and the tumour microenvironment: implications for preclinical research and development of new treatments. Biochem Soc Trans 47(2):625–638
Australian Institute of Health and Welfare (2017) Brain and other central nervous system cancers. Cat. no. CAN 106
Gomez GA, et al (2019) New approaches to model glioblastoma in vitro using brain organoids: implications for precision oncology. Translational Cancer Research
Wipfler K, Cornish AS, Guda C (2018) Comparative molecular characterization of typical and exceptional responders in glioblastoma. Oncotarget 9(47):28421–28433
Sun D, et al (2017) Prognosis prediction of human breast cancer by integrating deep neural network and support vector machine: supervised feature extraction and classification for breast cancer prognosis prediction. In 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). IEEE
Sun D, Wang M, Li A (2018) A multimodal deep neural network for human breast cancer prognosis prediction by integrating multi-dimensional data. IEEE/ACM transactions on computational biology and bioinformatics
Ohno-Machado L (2001) Modeling medical prognosis: survival analysis techniques. J Biomed Inform 34(6):428–439
Zhu X, et al (2016) Lung cancer survival prediction from pathological images and genetic data—an integration study. In 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI). IEEE
Hawkins SH et al (2014) Predicting outcomes of nonsmall cell lung cancer using CT image features. IEEE Access 2:1418–1426
Liao X, et al (2019) Machine-learning based radiogenomics analysis of MRI features and metagenes in glioblastoma multiforme patients with different survival time. J Cell Mol Med
Lao J, Chen Y, Li ZC, Li Q, Zhang J, Liu J, Zhai G (2017) A deep learning-based radiomics model for prediction of survival in glioblastoma multiforme. Sci Rep 7(1):10353
Tomczak K, Czerwińska P, Wiznerowicz M (2015) The cancer genome atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol 19(1A):A68–A77
Sun D, Li A, Tang B, Wang M (2018) Integrating genomic data and pathological images to effectively predict breast cancer clinical outcome. Comput Methods Prog Biomed 161:45–53
Yu K-H et al (2016) Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun 7:12474
Cox DR (1972) Regression models and life-tables. J R Stat Soc Ser B Methodol 34(2):187–202
Park MY, Hastie T (2007) L1-regularization path algorithm for generalized linear models. J R Stat Soc: Ser B (Stat Methodol) 69(4):659–677
Bøvelstad HM, Nygård S, Størvold HL, Aldrin M, Borgan Ø, Frigessi A, Lingjaerde OC (2007) Predicting survival from microarray data—a comparative study. Bioinformatics 23(16):2080–2087
Zhu X, Yao J, Huang J (2016) Deep convolutional neural network for survival analysis with pathological images. In 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) IEEE
Wei JW, et al (2019) Automated detection of celiac disease on duodenal biopsy slides: A deep learning approach. arXiv preprint arXiv:1901.11447
Li Y, Wu J, Wu Q (2019) Classification of breast cancer histology images using multi-size and discriminative patches based on deep learning. IEEE Access 7:21400–21408
Khan S, et al (2019) A novel deep learning based framework for the detection and classification of breast Cancer using transfer learning. Pattern Recogn Lett
Wei JW, Tafe LJ, Linnik YA, Vaickus LJ, Tomita N, Hassanpour S (2019) Pathologist-level classification of histologic patterns on resected lung adenocarcinoma slides with deep neural networks. Sci Rep 9(1):3358
Katzman JL et al (2016) Deep survival: A deep cox proportional hazards network. Stat 1050:2
Zhu X, et al (2017) WSISA: making survival prediction from whole slide histopathological images. In Proc. CVPR
Tang B, et al (2019) CapSurv: Capsule Network for Survival Analysis with Whole Slide Pathological Images. IEEE Access
Han J, Kamber M. Data mining: concepts and techniques (The Morgan Kaufmann series in data management systems), vol. 2. Morgan Kaufmann
Shirazi AZ, Chabok SJSM, Mohammadi Z (2018) A novel and reliable computational intelligence system for breast cancer detection. Med Biol Eng Comput 56(5):721–732
Kolachalama VB, Singh P, Lin CQ, Mun D, Belghasem ME, Henderson JM, Francis JM, Salant DJ, Chitalia VC (2018) Association of pathological fibrosis with renal survival using deep neural networks. Kidney Int Rep 3(2):464–475
Tandel GS, Biswas M, Kakde G et al (2019) A review on a deep learning perspective in brain cancer classification. Cancers 11(1):111
Weinstein JN et al (2013) The cancer genome atlas pan-cancer analysis project. Nat Genet 45(10):1113–1120
Yu F, et al (2015) Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems
Sandler M, et al (2018) Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
He K, et al (2016) Deep residual learning for image recognition. in Proceedings of the IEEE conference on computer vision and pattern recognition
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Szegedy C, et al (2015) Going deeper with convolutions. in Proceedings of the IEEE conference on computer vision and pattern recognition
Szegedy C, et al (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition
Deng J, et al (2012) Imagenet large scale visual recognition competition. ilsvrc2012
Howard AG, et al (2017) Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861
Düntsch I, Gediga G (2019) Confusion matrices and rough set data analysis. arXiv preprint arXiv:1902.01487
Juba B, Le HS (2019) Precision-recall versus accuracy and the role of large data sets. Proc. 33rd AAAI
Heagerty PJ, Zheng Y (2005) Survival model predictive accuracy and ROC curves. Biometrics 61(1):92–105
Chollet F (2015) Keras
Girija SS (2016) Tensorflow: large-scale machine learning on heterogeneous distributed systems. Software available from tensorflow.org
Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, Carter SL, Stewart C, Mermel CH, Roberts SA, Kiezun A, Hammerman PS, McKenna A, Drier Y, Zou L, Ramos AH, Pugh TJ, Stransky N, Helman E, Kim J, Sougnez C, Ambrogio L, Nickerson E, Shefler E, Cortés ML, Auclair D, Saksena G, Voet D, Noble M, DiCara D, Lin P, Lichtenstein L, Heiman DI, Fennell T, Imielinski M, Hernandez B, Hodis E, Baca S, Dulak AM, Lohr J, Landau DA, Wu CJ, Melendez-Zajgla J, Hidalgo-Miranda A, Koren A, McCarroll S, Mora J, Crompton B, Onofrio R, Parkin M, Winckler W, Ardlie K, Gabriel SB, Roberts CWM, Biegel JA, Stegmaier K, Bass AJ, Garraway LA, Meyerson M, Golub TR, Gordenin DA, Sunyaev S, Lander ES, Getz G (2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 499(7457):214–218
Salerno EP, Bedognetti D, Mauldin IS, Deacon DH, Shea SM, Pinczewski J, Obeid JM, Coukos G, Wang E, Gajewski TF, Marincola FM, Slingluff CL Jr (2016) Human melanomas and ovarian cancers overexpressing mechanical barrier molecule genes lack immune signatures and have increased patient mortality risk. Oncoimmunology 5(12):e1240857
Skaaby T, Husemoen LL, Thyssen JP, Meldgaard M, Thuesen BH, Pisinger C, Jørgensen T, Carlsen K, Johansen JD, Menné T, Szecsi PB, Stender S, Linneberg A (2014) Filaggrin loss-of-function mutations and incident cancer: a population-based study. Br J Dermatol 171(6):1407–1414
Palaniappan A, Ramar K, Ramalingam S (2016) Computational identification of novel stage-specific biomarkers in colorectal cancer progression. PLoS One 11(5):e0156665
Xu P et al (2018) Identification of glioblastoma gene prognosis modules based on weighted gene co-expression network analysis. BMC Med Genet 11(1):96
Kim Y-W, Koul D, Kim SH, Lucio-Eterovic AK, Freire PR, Yao J, Wang J, Almeida JS, Aldape K, Yung WK (2013) Identification of prognostic gene signatures of glioblastoma: a study based on TCGA data analysis. Neuro-oncology 15(7):829–839
Park J, et al (2019) Transcriptome profiling-based identification of prognostic subtypes in glioblastoma: novel therapeutic strategy targeting invasiveness. AACR

Funding

This work was supported by grants from the National Health and Medical Research Council of Australia (1067405 and 1123816 to G.A.G.), the Cure Brain Cancer Foundation (to G.A.G.), the University of South Australia (to G.A.G. and A.Z.S.), the Neurosurgical Research Foundation (to G.A.G.) and the Cancer Council SA Beat Cancer Project Infrastructure (to G.A.G.). G.A.G. is also supported by an Australian Research Council Future Fellowship (FT160100366). A.Z.S. is supported by an Australian Government Research Training Program (RTP) Scholarship. Imaging was performed at the Australian Cancer Research Foundation (ACRF) Cancer Discovery Accelerator facility, established with the generous support of the Australian Cancer Research Foundation.

Author information

Amin Zadeh Shirazi, Eric Fornaciari and Guillermo A. Gomez contributed equally to this work.

Authors and Affiliations

Centre for Cancer Biology, SA Pathology and University of South Australia, UniSA CRI Building, North Terrace, Adelaide, SA, 5001, Australia
Amin Zadeh Shirazi, Lisa M. Ebert & Guillermo A. Gomez
Department of Mathematics of Computation, University of California, Los Angeles (UCLA), Los Angeles, CA, USA
Eric Fornaciari
Department of Ophthalmology, Mashhad University of Medical Sciences, Mashhad, Iran
Narjes Sadat Bagherian
SA Pathology, Royal Adelaide Hospital, Adelaide, Australia
Barbara Koszyca

Corresponding authors

Correspondence to Amin Zadeh Shirazi or Guillermo A. Gomez.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Research involving human participants and/or animals

The use of human tissues collected by SA Pathology, and associated clinical information, for this research was approved by the Central Adelaide Local Health Network Human Research Ethics Committee (approval number R20160727).

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Table 1

TCGA patient ID, brain tumour type and survival. (XLSX 25 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zadeh Shirazi, A., Fornaciari, E., Bagherian, N.S. et al. DeepSurvNet: deep survival convolutional network for brain cancer survival rate classification based on histopathological images. Med Biol Eng Comput 58, 1031–1045 (2020). https://doi.org/10.1007/s11517-020-02147-3

Received: 27 July 2019
Accepted: 14 February 2020
Published: 02 March 2020
Issue Date: May 2020
DOI: https://doi.org/10.1007/s11517-020-02147-3
