The use of generative adversarial networks for multi-site one-class follicular lymphoma classification

Somaratne, Upeka Vianthi; Wong, Kok Wai; Parry, Jeremy; Laga, Hamid

doi:10.1007/s00521-023-08810-8

Original Article
Open access
Published: 22 July 2023

Volume 35, pages 20569–20579, (2023)

Neural Computing and Applications

Upeka Vianthi Somaratne ORCID: orcid.org/0000-0002-8916-2828¹,
Kok Wai Wong¹,
Jeremy Parry¹,² &
…
Hamid Laga¹


Abstract

Recent advances in digital technologies have lowered the costs and improved the quality of digital pathology Whole Slide Images (WSI), opening the door to applying Machine Learning (ML) techniques to assist in cancer diagnosis. ML, including Deep Learning (DL), has produced impressive results in diverse image classification tasks in pathology, such as predicting clinical outcomes in lung cancer and inferring regional gene expression signatures. Despite these promising results, the uptake of ML as a common diagnostic tool in pathology remains limited. A major obstacle is the lack of sufficient labelled data for training neural networks and other classifiers, especially for new sites where models have not yet been established. Recently, image synthesis from small, labelled datasets using Generative Adversarial Networks (GAN) has been used successfully to create high-performing classification models. Considering the domain shift and the complexity of annotating data, we investigated a GAN-based approach that minimizes the differences in WSIs between large public data archive sites and the much smaller data archives at new sites. The proposed approach allows a deep learning classification model for the class of interest to be tuned using the small training set available at the new site. This paper uses GAN with the one-class classification concept to model the class of interest data. This approach minimizes the need for large amounts of labelled data from the new site to train the network. The GAN generates synthesized one-class WSI images that are used, together with the WSIs available from the new site, to jointly train the classifier. We tested the proposed approach on follicular lymphoma data from a new site by utilizing the data archives from different sites. The synthetic one-class images, generated from data obtained from different sites together with a minimal amount of data from the new site, resulted in a significant improvement of 15% in the Area Under the Curve (AUC) for the new site, for which we want to establish a new follicular lymphoma classifier. The test results show that the classifier can perform well without the need to obtain more training data from the test site, by using GAN to generate synthetic data from all existing data in the archives from all the sites.

1 Introduction

Cancer is the second leading cause of death worldwide. The World Health Organization (WHO) reported 18 million new cancer cases worldwide in 2018 [1]. Reports for Commonwealth nations stated a 35% increase in new cases between 2008 and 2018, with nearly 1.7 million deaths in 2018 [2]. Follicular lymphoma (FL) is the most common subtype, accounting for 20–25% of non-Hodgkin lymphomas. It is crucial to diagnose FL early, but because it grows slowly, symptoms appear only in later stages [3]. Medical imaging is an essential tool for cancer diagnosis and cancer research [4]. Currently, most cancers are diagnosed by pathologists who examine thin (e.g. 4 micron) tissue sections stained with Hematoxylin and Eosin (H&E), with other ancillary stains used as required. Historically this process has used light microscopy. Increasingly it is based on the examination of digital Whole Slide Images (WSIs). The diagnosis is based on features of tissue morphology, such as the size and density of nuclei, which can only be seen at the microscopic level [5, 6].

Aided by improvements in computational resources, image analysis in radiology and pathology increasingly uses Machine Learning (ML) techniques to improve the accuracy and speed of diagnosis [7]. Deep Learning (DL) has been shown to outperform most other ML models when given large amounts of training data, due largely to its capability to learn unseen patterns and perform automated feature extraction [8]. In medical imaging, DL has shown promising results on radiology images from a wide variety of imaging modalities, including MRI, CT and X-ray, across numerous tasks, including augmenting the diagnostic workflow [9,10,11,12,13].

DL has shown similar promise in digital pathology for classification, segmentation, object detection and registration tasks on WSIs, and offers opportunities to improve the efficiency and accuracy of pathology diagnosis [5, 14,15,16].

However, there are key differences between radiology and pathology images which challenge the translation of DL techniques from one domain to the other [17, 18]. These include but are not restricted to the following:

1. The large dimensions of WSIs (often around 100,000 × 100,000 pixels) require them to be partitioned into a large number of patches for classification, with partitioning required at both low and high magnifications because different features are presented at different levels; this presents significant computational challenges.
2. The requirement for image-level as well as pixel-level labelling for grading and localisation in WSI data, and for cell-level labelling for cell-wise classification, to train DL models, which requires time-consuming and expensive involvement of domain experts [19]. This reflects the pyramidal nature of WSIs, which comprise multiple levels with different dimensions containing different types of features that are important for diagnosis.
3. The significant variation in WSIs between sites, due to differences in tissue preparation steps such as sectioning and staining, in scanners and in scanning procedures [5, 18, 20, 21]. These variations cause many trained models to generalize poorly to data from new sites, while establishing new models for new sites requires a large amount of ground truth data from those sites [5, 6, 22].
4. The time and specialized domain knowledge required to label data in pathology are greater than for other medical image types [23]. Images often portray different types of diseases of non-standardized appearance, representing a large number of pathological abnormalities that require highly specialized domain knowledge, a challenge that pathologists can only deal with after years of specialized training [6, 24].
5. The need for methods that handle the paucity of large labelled pathology data sets, given that small data sets from new sites do not capture the wide variance in clinical samples [11, 23, 25].

To minimize the adverse effects of small data sets on network performance in pathology, transfer learning (TL), weakly supervised classification and synthetic data generated by Generative Adversarial Networks (GAN) have been used [11, 26]. These techniques address the numerically small size of data sets. However, they must still deal with the variation in pathology data from different sites or hospitals caused by differences in imaging technologies and staining processes. This problem can be addressed by normalizing data to minimize the difference in data distribution and/or by increasing the variation in local datasets [27]. The combination of these approaches has been shown to improve classification further [28,29,30].

In the normalization approach, all the data are normalized into one style for training. Normalization techniques differ between datasets, and a suitable technique must be identified for each dataset depending on the desired application [7]. Three main types of methods have been used to handle stain differences: colour matching techniques, which match colour to a reference template image; stain-separation methods, which normalize each channel independently; and pure learning-based techniques, including GAN, which treat the problem as stain transfer. Learning-based methods reduce the drawbacks of colour matching, which can lead to improper colour mapping because the same transformation is used across the datasets, and are superior to stain-separation techniques, which do not consider spatial features of tissue [31]. However, learning-based style transfer is computationally expensive, and this barrier has prompted a search for computationally simpler solutions to handle the stain differences in WSIs, especially across different sites.
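As a rough illustration of the colour matching family only, the sketch below shifts and scales one colour channel so that its mean and standard deviation match those of a reference template. This is a simplified stand-in written for this explanation, not a method used in the paper; the function name `match_channel` and the pixel values are hypothetical, and practical colour matching usually operates on full images in a perceptual colour space.

```python
from statistics import mean, pstdev


def match_channel(source, reference):
    """Shift and scale one colour channel so that its mean and standard
    deviation match those of the reference (template) channel."""
    src_mean, src_std = mean(source), pstdev(source)
    ref_mean, ref_std = mean(reference), pstdev(reference)
    # Guard against a flat source channel (zero spread).
    scale = ref_std / src_std if src_std > 0 else 1.0
    matched = [(v - src_mean) * scale + ref_mean for v in source]
    # Clip to the valid 8-bit intensity range.
    return [min(255.0, max(0.0, v)) for v in matched]


# Hypothetical red-channel values from a source patch and a template patch.
source_red = [100, 120, 140, 160]
reference_red = [150, 160, 170, 180]
matched_red = match_channel(source_red, reference_red)
```

Because the same linear map is applied to every pixel, this kind of matching cannot adapt to local tissue structure, which is the drawback the paragraph above attributes to colour matching.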

The variance approach to small data sets attempts to increase the variance captured by the data set and modify the data distribution [26]. This can be achieved by adding more labelled data if sufficient domain experts are available to assign labels. If not, techniques such as Transfer Learning (TL) and GAN that use minimal labelled data can be explored.

TL has shown promising results in pathology [32, 33]. In TL, models trained on a source dataset are adapted to a target dataset either by using the pre-trained model as a feature extractor or by fine-tuning the pre-trained model on the target dataset [11]. TL-based methods for WSIs have been shown to improve performance with a smaller dataset [25]. However, the impact of TL on new sites with limited labelled data, and the impact of using data from multiple sites, are not well reported in this area. Prior research presents an approach in which a pre-trained model, trained with a dataset from another site, is fine-tuned using data from the class of interest of a new site. This reduced the need for labelled data from the new site [32]. However, the approach reported in the paper [31] will lead to overfitting if insufficient data are available for training. If limited labelled data are available and the models being trained have millions of parameters, steps should be taken to perform thorough evaluations with test data that captures the data distribution. Therefore, this paper investigates the possible use of GAN to generate more data for the new site.

Recent research on using GAN for WSIs has shown the value of generating realistic synthetic data to increase the labelled data available for classification [23, 26]. A one-class classification approach can be used where a class imbalance exists [34]. This approach focuses on developing non-target data in order to perform binary classification. One-class classification has been discussed in medical image classification in general and has shown promising results in multiple domain areas, although relatively little research has focussed on histology image processing [34, 35].

This paper focuses on improving the generalization of models to new sites using one-class data from those sites. To the best of our knowledge, GAN for one-class image classification of WSIs has not been investigated. Therefore, the paper investigates the influence of GAN on one-class classification tasks by creating synthetic data for the labelled one-class data of a new site. To classify the one-class new site data, the multi-site dataset's negative class is used as the non-target data for classification. Using limited one-class data from new sites with GAN contributes significantly to reducing the differences between the data distributions of different sites and the resulting generalization problem.

The main contributions of the paper are:

1. Introduce the use of GAN for one-class classification tasks for WSI.
2. Use of the one-class classification data to label and test on a new site dataset.
3. The use of GAN to improve one-class classification data distribution among a new class without retraining the model.

The experiments show an improvement in the results and in the generalization of the model to the one-class data, without retraining the model for the new site.

2 GAN and related work

GANs have attracted much attention recently and have been used in the medical imaging domain [23]. One study created synthetic images for 3D live images of the Caenorhabditis elegans embryo with fluorescently labelled cell membranes. The satisfactory classification performance indicated that the approach is applicable to other unique structures in microscopic data [36].

GANs have also been applied to WSIs in the areas of augmentation, segmentation, virtual staining, stain normalization and stain style transfer.

The most common application in pathology is to eliminate the stain differences in WSI data, and there is growing interest in using GAN to produce synthetic images that increase the amount of data available to train DL models [23, 26, 37]. This application addresses a specific issue in pathology: images with small and large amounts of positive features will both be classified as positive by a pathologist, in contrast to general domains, which have distinct classes. A method based on CycleGAN has been used to augment positive samples by translating easy-to-classify samples into hard-to-classify samples [27]. GAN has also been proposed as an image translation method, using a Conditional Generative Adversarial Network (CGAN) for histopathological to immunofluorescent image translation [38]. Preliminary investigations show that GANs can be used to handle inter-site differences in WSIs. An unsupervised domain adaptation technique using adversarial training was developed in which discriminative knowledge from a source domain was effectively transferred using a Siamese network [39]. The study investigated colour normalization and adversarial training to adapt knowledge from the source domain to the target domain, with significant improvement. However, the authors also mention drawbacks: the two-step training, the effects of using a higher number of samples and the different complexities of the models. The method also depends on the reference images used and the normalization techniques applied to the images. Improving generalization through staining-invariant features is another approach to improving classification using CNNs [24]. Colour normalization and colour augmentation have been investigated to address the inter-site differences in WSIs. Furthermore, classification becomes challenging when the only WSIs available are one-class WSIs from the class of interest [40].

An additional problem experienced across imaging domains is the class imbalance that can arise from the numerical imbalance between the positive class (e.g. cancer) and the negative class (not cancer). Cancer is often the minority class in medical data due to various factors, including the relative paucity of cancer tissue compared to normal background tissue, the complexity of labelling small regions of cancer (often single cells) and the lack of openness of medical domains. Auto-labelling techniques are preferred as a way to handle the cost of labelling. However, in unsupervised techniques there is no constraint on the boundaries of the clusters, so they may fail to provide the accurate pixel-level segmentation of regions of interest required to develop models [41]. Consequently, there is interest in methods applicable to one-class classification.

One-class classification has been applied as a learning-based technique using positive and unlabelled data, as a novelty and outlier detection technique and as a one-class support vector machine (SVM) based technique [34, 40]. The limitations of a one-class approach have been studied [42]. These include the tendency of pathologists to label images at the whole-image level regardless of how much cancer is present in the image, whereas in natural image domains the images usually have a distinct label [27]. Therefore, annotations in the images of the class of interest are important for training models to assist in diagnosis. Much research focuses on deep learning-based techniques, while many shallow learning techniques explore novelty and outlier detection. The latter method relies on artificially created outliers, used along with the labelled positive data, for binary classification. The target dataset given to the models should capture the high variability in the distribution to support classification with the artificially created class.

Inspired by prior research on the one-class data problem in other domains, this paper uses one-class data as a solution to minimize the required amount of labelled data. Models trained with only a small one-class dataset from a new site reduce the need for a large amount of labelled data from that site [32]. In WSIs, the class of interest has minimal labelled data. Due to a lack of generalization, it is not feasible to directly transfer a model trained with one site's data to another site's data. Therefore, to improve the performance of models, we suggest that the one-class dataset can consist of a small amount of data from the class of interest. Limited research focuses on using one-class data to handle the lack of labelled data while also handling WSIs from different sites. The paper introduces one solution to address the limitations in labels and the differences in WSI data from different sites.

3 Methodology

3.1 Overview of the proposed structure

Figure 1 presents the overview of the proposed architecture to learn from the new site’s WSIs. Figure 2 shows the combination of one-class data from a new site and multi-site data. The proposed architecture supports:

1. The use of one-class data and GAN to effectively minimize the need for labelled data from new sites, and
2. Minimizing the distribution difference between the WSIs from multiple sites and the new sites.

The proposed approach consists of two main steps. First, a GAN is used to create new data points in the form of synthetic one-class WSI patches. These synthetic patches are generated for the new site (S2), which has a minimal number of labelled WSIs. The second step is a binary classifier that combines WSI patches from the n sites (MS1) and the new site S2, which reduces the differences in the distribution of WSIs. Because the GAN generates synthetic data for the new site, the classifier generalizes better when classifying WSIs from different sites.

3.2 Using GAN to increase the number of patches for new site

GANs are based on two CNNs trained as a generator and a discriminator. The generator network learns the distribution of the real images in order to generate new data points, so that the synthetic images belong to the real data distribution.

The generator's aim is to maximize the capability of creating realistic images to trick the discriminator while the discriminator aims to maximize the capability of differentiating the synthetic data from the real data.

In the proposed approach, a GAN is used to create synthetic WSI patches for the new site S2, which has a limited number of labelled WSIs for the class of interest. The synthetic WSI patches provide additional training data for the new site S2. This minimizes the differences between sites and improves the classification model. The paper examines the effects of combining WSIs from a new site with synthetic data generated by a Deep Convolutional Generative Adversarial Network (DCGAN) from the small one-class dataset of the new site S2. The DCGAN consists of a Convolutional Neural Network (CNN) combined with the traditional GAN to achieve a deep feature-based representation of data. The DCGAN architecture is capable of generating better quality images with more stable training than the traditional GAN [43].

The DCGAN is also a model that does not require high computational power [41]. It consists of a generator G and a discriminator D. The input to G is a vector z of 100 elements drawn from a random normal distribution. The generator learns to create data for the target dataset, which is the one-class dataset. The discriminator's aim is to differentiate the generated fake image (x_f) from the real target image (x_t), both of which are inputs to the discriminator. The generator and the discriminator are trained adversarially in a min–max game, in which the discriminator's objective is to maximize its ability to differentiate between fake images (G(z) = x_f) and real target images (x_t), while the generator's objective is to create target-like synthetic images (x_s). Equation (1) is the objective used by the generator and the discriminator. The constructed DCGAN is used to create S2 patches, which are passed on to the classifier. The model based on the DCGAN architecture is illustrated in Fig. 1, which is based on [44]. The generator takes the 100-element vector as input and, after propagating it through the model, outputs a synthetic image of 64 × 64 × 3. The generator network consists of a fully connected layer and four deep convolutional layers. Batch normalization layers and ReLU layers are applied to the convolutional layers of the network.

The discriminator network of the DCGAN has a CNN architecture with an input of 64 × 64 × 3. It is trained on the synthetic images from the generator and the real images of the target data in order to differentiate real from fake images. The discriminator network consists of four deep convolutional layers and a fully connected layer.
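The layer counts and image sizes described above can be sanity-checked with simple shape arithmetic. The sketch below assumes the common DCGAN settings of a 4 × 4 kernel, stride 2 and padding 1, and a 4 × 4 feature map after the generator's fully connected layer; none of these values are stated in the paper, so they are illustrative assumptions only.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a transposed convolution
    (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a strided convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def generator_sizes(start=4, layers=4):
    """Spatial sizes after the fully connected projection (assumed 4 x 4)
    and after each of the four up-sampling convolutional layers."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes


def discriminator_sizes(start=64, layers=4):
    """Spatial sizes of the 64 x 64 input and of the feature map after
    each of the four down-sampling convolutional layers."""
    sizes = [start]
    for _ in range(layers):
        sizes.append(conv_out(sizes[-1]))
    return sizes


print(generator_sizes())      # [4, 8, 16, 32, 64]
print(discriminator_sizes())  # [64, 32, 16, 8, 4]
```

Under these assumed settings, four stride-2 layers are exactly what is needed to move between a 4 × 4 feature map and the 64 × 64 patch size, which is consistent with the four convolutional layers reported for both networks.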

$$\min_{G} \max_{D} V\left( {D,G} \right) = E_{x\sim p_{\text{data}}(x)} \left[ {\log D(x)} \right] + E_{z\sim p_{z}(z)} \left[ {\log \left( {1 - D(G(z))} \right)} \right]$$

(1)
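To make the two terms of Eq. (1) concrete, the following minimal sketch estimates V(D, G) from a handful of discriminator outputs. The function name `gan_value` and the probability values are hypothetical; in practice the expectations are estimated over mini-batches during training.

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of V(D, G): the mean of log D(x) over real samples
    plus the mean of log(1 - D(G(z))) over generated samples."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Discriminator that separates real from fake well (hypothetical outputs).
confident = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])

# Discriminator that cannot tell them apart and outputs 0.5 everywhere.
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The discriminator tries to push this value up, as in the first case, while the generator tries to push it down. When the discriminator outputs 0.5 for every input, the value equals −log 4, which is the known value of the objective when the generated distribution matches the real one.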

The network consists of batch normalization layers and ReLU layers in the convolutional layers. The GAN was trained for 100 epochs.

3.3 One-class classification

As shown in Fig. 2, the new site S2, with its limited WSI patches, contains only one class; therefore, a one-class classifier is applied for classification. The one-class classifier receives WSI patches from MS1 (which contains patches from both the target class and the non-target class from multiple sites), from S2 (which contains only target class patches) and from the S2 synthetic data generated by the DCGAN. To handle the one-class problem, the non-target class is taken from the MS1 non-target class. The final classifier's target class therefore comprises WSI patches from MS1, S2 and the S2 synthetic data, and the non-target class comprises only MS1's non-target class. The target and non-target WSI patches are classified using a CNN. The CNN comprises three layers with a sigmoid activation function and a final layer with a softmax activation. The CNN takes an input of 64 × 64 × 3. A dropout of 0.5 is applied to all the layers. The CNN optimizer is RMSprop, and a binary classification loss is used.
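The way the target and non-target classes are assembled can be sketched as follows. The function name `build_training_set` and the patch identifiers are hypothetical placeholders for 64 × 64 × 3 image arrays; the sketch only illustrates which source feeds which class.

```python
def build_training_set(ms1_target, ms1_non_target, s2_target, s2_synthetic):
    """Return (patch, label) pairs, with label 1 for the target class (FL)
    and label 0 for the non-target class."""
    # Target class: multi-site FL, new-site FL and GAN-generated FL patches.
    target = list(ms1_target) + list(s2_target) + list(s2_synthetic)
    # Non-target class: taken only from the multi-site dataset.
    non_target = list(ms1_non_target)
    return [(p, 1) for p in target] + [(p, 0) for p in non_target]


# Hypothetical patch identifiers standing in for image arrays.
dataset = build_training_set(
    ms1_target=["ms1_fl_0", "ms1_fl_1"],
    ms1_non_target=["ms1_cll_0", "ms1_mcl_0"],
    s2_target=["s2_fl_0"],
    s2_synthetic=["s2_gan_0", "s2_gan_1"],
)
```

Note that the new site contributes no non-target examples at all, so every label-0 patch the classifier sees comes from MS1.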

4 Experiments

4.1 Datasets

The experiments were conducted on a publicly available multi-site dataset for lymphoma subtype classification and a new site one-class dataset that contains data for the class of interest. The publicly available dataset was created by the National Institute on Aging (NIA). It includes three subtypes of lymphoma: Follicular Lymphoma (FL), Chronic Lymphocytic Leukemia (CLL) and Mantle Cell Lymphoma (MCL). The H&E stained images were gathered from multiple sites to add high staining variation. The dataset consists of 374 images of 1388 × 1040 pixels: 113 for CLL, 139 for FL and 122 for MCL. To conduct the experiments, the images were split into non-overlapping patches of 64 × 64. The paper focuses on binary classification; therefore, the CLL and MCL classes were treated as the non-target (Non-FL) class, while the FL class is the class of interest.

The second dataset is provided by PathWest Laboratory Medicine WA. This private dataset is treated as the dataset from the second site. It includes three H&E stained WSIs, scanned using the Aperio WSI scanner. Experienced pathologists labelled the images for the FL class, which is the class of interest. The Regions of Interest (ROI) were extracted based on the coordinates of the annotations, and non-overlapping patches of 64 × 64 were created. Patches that contained mostly background were eliminated from the dataset. Patch extraction was performed using distinct block processing with the blockproc function of MATLAB's Image Processing Toolbox. A test set was created from this dataset for the experiments. The models trained with the public dataset (multi-site, MS1) were tested with the private dataset (new site, S2).
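The patch extraction described for both datasets can be approximated in a few lines. This is a plain Python sketch rather than the MATLAB blockproc call used in the paper; dropping partial blocks at the image edges, the near-white threshold of 220 and the 50% background cut-off are all assumptions made for illustration.

```python
def patch_grid(width, height, patch=64):
    """Top-left (x, y) corners of non-overlapping patches.
    Partial blocks at the right and bottom edges are dropped (assumption)."""
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]


def is_mostly_background(gray_patch, white_level=220, max_fraction=0.5):
    """True if more than max_fraction of the pixels are near-white.
    gray_patch is a list of rows of 8-bit grey values."""
    pixels = [v for row in gray_patch for v in row]
    bright = sum(1 for v in pixels if v >= white_level)
    return bright / len(pixels) > max_fraction


# Patch grid for one 1388 x 1040 image from the public dataset.
corners = patch_grid(1388, 1040)
print(len(corners))  # 336
```

Under the drop-partial-blocks assumption, each 1388 × 1040 image yields 21 × 16 = 336 patches of 64 × 64; the actual count in the paper may differ depending on how edge blocks and background patches were handled.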

4.2 Experiments

The differences between the data from the public multi-site dataset and the private new site dataset were explored. The differences can be seen visually by comparing Fig. 3a and b. Additionally, we generated t-Distributed Stochastic Neighbour Embedding (t-SNE) plots for the two datasets in order to understand their data distributions and the differences between them. Identifying the distributions and the differences is important for addressing possible generalization problems with the proposed approach. t-SNE plots are suited to the visualization of high-dimensional datasets. The t-SNE plot in Fig. 5 shows a significant difference between the data distributions of the class of interest from the different sites. Based on these findings, we conducted detailed experiments to validate the proposed method. To assess its effectiveness, the proposed method was evaluated both with and without synthetic data from the GAN.

For the experiment without synthetic data, the classifier is trained and validated with the public dataset. For the experiment with synthetic data, the classifier is trained and validated with the public data combined with the synthetic data of S2. The unseen test set created from the private new site dataset (S2) is used to blind test both scenarios. The experiment without synthetic data is conducted by building the classifier with data from the first sites only. The classifier is trained and its classification performance is obtained. The experiment with synthetic data has two phases. The first phase is to create synthetic data for the one-class data. Figure 4 provides examples of synthetic data generated by the DCGAN for the new site. In the second phase, the synthetic data are combined with the dataset of the first site as the input to the classifier. The proposed approach uses GAN as a method to handle differences in the data from different sites. To build the classifier, the negative class of the multi-site data is used as the negative class for the one-class dataset. The classifier is trained with the joint dataset, and its performance is evaluated and compared with the experiment without any synthetic data.

The experiments are conducted with small-scale images, even though prior research has been successful with images of higher dimensions (256 × 256, 128 × 128). We test our models at relatively small scales for computational ease; prior research suggests that results for low-resolution images containing a large number of cells may need improvement compared with training on cell-wise images [44, 45]. After generating synthetic data for the private dataset, we generated a t-SNE plot in order to identify the changes in the data distributions. Figure 5 demonstrates that the synthetic data generated by the GAN helped to close the gap between the distributions of the multi-site dataset and the new site dataset.

5 Results and discussion

5.1 Results

The experiments with synthetic data to classify one-class data from the new site were conducted, and the performance was calculated for evaluation. The multi-site dataset and the synthetic data for the one-class private dataset are used as the input to the classifier. The classifier is trained with the synthetic data to align the differences between the datasets.

The results show that using synthetic data minimizes the differences and improves performance on the data from the new site. The classifier handles the one-class data problem by using the multi-site dataset's negative class. The results show an improvement in performance on both the new site dataset and the multi-site dataset. Table 1 shows the performance without and with synthetic data for the validation data of the multi-site dataset. It indicates an 8% increase in Accuracy and AUC values. Table 1 also presents the improvements in F1-score, Precision and Recall to allow a fair comparison of the performance for the one-class dataset.

Table 1 Validation set—the performance for one-class classification

5.2 Comparison of classification

Differences in the new site data cause poor generalization in trained models, as seen in Table 2 (1st column). When no synthetic data is generated, the AUC for the same-site validation set is 82, but the AUC for the new site test set is 60.66. To provide further comparisons, the F1-score, Precision and Recall were also derived, and similar patterns can be observed in the results.
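For readers less familiar with these metrics, the sketch below shows one way AUC, Precision, Recall and F1-score can be computed from labels and classifier scores. The labels, scores and the 0.5 decision threshold are made-up illustrative values, not results from the paper, and the function names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive example scores
    higher than a random negative example (ties count as half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(labels, predictions):
    """Precision, recall and F1-score for the positive class (label 1)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for l, p in pairs if l == 1 and p == 1)
    fp = sum(1 for l, p in pairs if l == 0 and p == 1)
    fn = sum(1 for l, p in pairs if l == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels (1 = FL, 0 = Non-FL) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]

area = auc(labels, scores)
precision, recall, f1 = precision_recall_f1(labels, predictions)
```

AUC is threshold-free, whereas Precision, Recall and F1-score depend on the chosen decision threshold, which is one reason to report them side by side.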

Table 2 Test set—the performance for one-class classification

The proposed approach uses a small amount of one-class data from a new site to improve the learnt features. Table 1 shows that performance on the validation set from the same site also improves. Table 2 shows the improvement on the test set from the unseen new site. Table 3 and Table 4 show the model evaluations for classification using 600, 400, 100 and 50 synthetic images. Increasing the amount of synthetic data clearly improves performance on the validation data, as the training set captures more variation. The unseen test set shows an F1-score range of 72–75 for FL and an AUC range of 75–77. The largest performance improvement, of around 15%, is already seen when 50 synthetic images are added. Varying the amount of synthetic data changes the performance far less than the step from no synthetic data to some synthetic data.

Table 3 The classification outcome for validation set data explored with different amounts of synthetic data

Table 4 The classification outcome for blind set data explored with different amounts of synthetic data

5.3 Using GAN to handle differences

To handle the differences in the data from new sites, the proposed method takes a different approach using GAN. The synthetic data generated by the GAN is used to increase the amount of one-class target data while increasing the variation in the data distribution with a limited amount of labelled data. The enriched data distribution aligns the features in the multiple datasets, which minimizes the gap between the datasets. This learning-based approach is more beneficial than data normalization or colour matching [31]. Other learning-based approaches are limited by their model complexity, and using synthetic data to align the datasets is much more efficient.

5.4 Discussion

Prior methods [46, 47] discuss solutions for exploding gradients and fast convergence in order to generate improved synthetic data for general objects. While GANs have progressed significantly for general objects and some areas of medical imaging (MRI, CT and fluorescent microscopy), research on WSIs is limited.

Prior research has shown the challenges of applying DL techniques to WSIs. Despite the promising performance of DL on WSIs, the domain shift introduced by WSIs from completely new sites remains a challenge, as described. The differences can be addressed by obtaining large amounts of labelled data from new sites and retraining the models. However, this is infeasible given the complexity of labelling WSIs and of developing site-specific DL models. There is therefore a need to handle the domain shift in WSIs using limited labelled data. Inspired by one-class classification techniques, we propose one-class data classification using a GAN to handle inter-site differences in WSI data. The method can be applied to a new site using only a small one-class dataset of the class of interest.

In our experiments, we combined a larger multi-site dataset from a publicly available source with a small amount of one-class data from a new site, a local hospital. We compared performance on the new site's data with and without the use of synthetic data. We demonstrated that, by using labelled data from a multi-site dataset and a small number of synthetic images, the accuracy of the CNN in classifying FL and non-FL data in the test set can increase by 15%. The method minimizes the need for a large amount of labelled data from the new site and handles the inter-site differences. This is achieved without compensating for the image variations and non-morphological differences in data from different sites. Figure 5 illustrates the differences: the data from the public (multi-site) dataset and the private (new site) dataset separate into two clusters. After the synthetic data for the new site are generated, Fig. 5 shows that the differences are minimized. Furthermore, the performance of the model can be enhanced by incorporating the synthetic data generated by the GAN.

Furthermore, the proposed approach is computationally less demanding than commonly used GAN architectures. It was tested on a single NVIDIA 1080i GPU. The most common GAN types, StarGAN, StyleGAN and CycleGAN, require more GPU power. StarGAN generally requires an NVIDIA Titan Xp GPU with 4000 training images [48]. StyleGAN recommends an NVIDIA DGX-1 with 8 Tesla V100 GPUs and one week of training [49]. Although CycleGAN can run on the NVIDIA 1080i GPU, it takes approximately 72 h to run 100 epochs, whereas the proposed approach runs 100 epochs and generates images in less than one hour. Figure 6 shows the images generated by CycleGAN after 10 epochs, which took 3 h. The translation of multi-site images to new-site images does not capture the heterogeneous tissue structures, and the translation of private images to public images, which have heterogeneous features in the tissue structures, fails. Prior research on CycleGAN also shows that it does not capture complex texture and shape structures [50]. Adapting a model to capture the real-world data distribution is essential. The proposed DCGAN-based approach is therefore efficient compared with the other methods: it addresses the unique features of the new site with limited computational power, runs faster, and generates synthetic data from a limited amount of labelled data from the new site.

6 Conclusion

This paper presents an alternative approach to handling inter-site differences while minimizing the need for labelled data. A GAN-based technique is presented that treats the task as a one-class data problem, minimizing the number of required labels while improving performance on data from new sites. Our empirical study shows that the proposed GAN-based technique significantly improves performance on the new site's data. The technique can be applied to any new site with a minimal amount of labelled data for the class of interest. An efficient and effective DCGAN is introduced to handle the differences in data from new sites. The technique can be implemented with reduced computational resources, which is beneficial for translating the method into the digital pathology workflow.

References

1. Ferlay J et al (2019) Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int J Cancer 144(8):1941–1953
2. Lodge M (2020) The role of the Commonwealth in the wider cancer control agenda. Lancet Oncol 21(7):879–881
3. Carbone A et al (2019) Follicular lymphoma. Nat Rev Dis Primers 5(1):1–20
4. Kurc T et al (2020) Segmentation and classification in digital pathology for glioma research: challenges and deep learning approaches. Front Neurosci 14
5. Campanella G et al (2019) Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 25(8):1301–1309
6. Ianni JD et al (2020) Tailored for real-world: a whole slide image classification system validated on uncurated multi-site data emulating the prospective pathology workload. Sci Rep 10(1):1–12
7. Komura D, Ishikawa S (2018) Machine learning methods for histopathological image analysis. Comput Struct Biotechnol J 16:34–42
8. Khan S, Islam N, Jan Z, Din IU, Rodrigues JJC (2019) A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recogn Lett 125:1–6
9. Dou Q et al (2019) PnP-AdaNet: plug-and-play adversarial domain adaptation network at unpaired cross-modality cardiac segmentation. IEEE Access 7:99065–99076
10. Altaf F, Islam SM, Akhtar N, Janjua NK (2019) Going deep in medical image analysis: concepts, methods, challenges, and future directions. IEEE Access 7:99540–99572
11. Cheplygina V, de Bruijne M, Pluim JP (2019) Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med Image Anal 54:280–296
12. Armanious K et al (2020) MedGAN: medical image translation using GANs. Comput Med Imaging Graph 79:101684
13. Tan W, Tiwari P, Pandey HM, Moreira C, Jaiswal AK (2020) Multimodal medical image fusion algorithm in the era of big data. Neural Comput Appl, pp 1–21
14. Litjens G et al (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88
15. Yu K-H et al (2016) Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun 7(1):1–10
16. Levy-Jurgenson A, Tekpli X, Kristensen VN, Yakhini Z (2020) Spatial transcriptomics inferred from pathology whole-slide images links tumor heterogeneity to survival in breast and lung cancer. Sci Rep 10(1):1–11
17. Srinidhi CL, Ciga O, Martel AL (2020) Deep neural network models for computational histopathology: a survey. Med Image Anal, 101813
18. Dimitriou N, Arandjelović O, Caie PD (2019) Deep learning for whole slide image analysis: an overview. Front Med 6:264
19. Rony J, Belharbi S, Dolz J, Ayed IB, McCaffrey L, Granger E (2019) Deep weakly-supervised learning methods for classification and localization in histology images: a survey. arXiv preprint arXiv:1909.03354
20. Wang X et al (2019) Weakly supervised deep learning for whole slide lung cancer image analysis. IEEE Transact Cybernet 50(9):3950–3962
21. Tschuchnig ME, Oostingh GJ, Gadermayr M (2020) Generative adversarial networks in digital pathology: a survey on trends and future potential. Patterns 1(6):100089
22. Hägele M et al (2020) Resolving challenges in deep learning-based analyses of histopathological images using explanation methods. Sci Rep 10(1):1–12
23. Yi X, Walia E, Babyn P (2019) Generative adversarial network in medical imaging: a review. Med Image Anal 58:101552
24. Otálora S, Atzori M, Andrearczyk V, Khan A, Müller H (2019) Staining invariant features for improving generalization of deep convolutional neural networks in computational pathology. Front Bioeng Biotechnol 7:198
25. Celik Y, Talo M, Yildirim O, Karabatak M, Acharya UR (2020) Automated invasive ductal carcinoma detection based using deep transfer learning with whole-slide images. Pattern Recogn Lett 133:232–239
26. Hou L, Agarwal A, Samaras D, Kurc TM, Gupta RR, Saltz JH. Robust histopathology image analysis: to label or to synthesize?
27. Gibson E et al (2018) Inter-site variability in prostate segmentation accuracy using deep learning. International conference on medical image computing and computer-assisted intervention. Springer, pp 506–514
28. Tellez D et al (2019) Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med Image Anal 58:101544
29. Gour M, Jain S, Sunil Kumar T (2020) Residual learning based CNN for breast cancer histopathological image classification. Int J Imaging Syst Technol 30(3):621–635
30. Kassani SH, Kassani PH, Wesolowski MJ, Schneider KA, Deters R (2019) Classification of histopathological biopsy images using ensemble of deep learning networks. arXiv preprint arXiv:1909.11870
31. Shaban MT, Baur C, Navab N, Albarqouni S (2019) StainGAN: stain style transfer for digital histological images. 2019 IEEE 16th international symposium on biomedical imaging (ISBI 2019). IEEE, pp 953–956
32. Somaratne UV, Wong KW, Parry J, Sohel F, Wang X, Laga H (2019) Improving follicular lymphoma identification using the class of interest for transfer learning. 2019 Digital image computing: techniques and applications (DICTA). IEEE, pp 1–7
33. Kong B, Sun S, Wang X, Song Q, Zhang S (2018) Invasive cancer detection utilizing compressed convolutional neural network and transfer learning. International conference on medical image computing and computer-assisted intervention. Springer, pp 156–164
34. Perera P, Patel VM (2019) Learning deep features for one-class classification. IEEE Trans Image Process 28(11):5450–5463
35. Sabokrou M, Khalooei M, Fathy M, Adeli E (2018) Adversarially learned one-class classifier for novelty detection. Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3379–3388
36. Wang D, Lu Z, Xu Y, Wang Z, Santella A, Bao Z (2019) Cellular structure image classification with small targeted training samples. IEEE Access 7:148967–148974
37. BenTaieb A, Hamarneh G (2017) Adversarial stain transfer for histopathology image analysis. IEEE Trans Med Imaging 37(3):792–802
38. Tavolara TE, Niazi MKK, Arole V, Chen W, Frankel W, Gurcan MN (2019) A modular cGAN classification framework: application to colorectal tumor detection. Sci Rep 9(1):1–8
39. Ren J, Hacihaliloglu I, Singer EA, Foran DJ, Qi X (2018) Adversarial domain adaptation for classification of prostate histopathology whole-slide images. International conference on medical image computing and computer-assisted intervention. Springer, pp 201–209
40. Yang Y, Hou C, Lang Y, Yue G, He Y (2019) One-class classification using generative adversarial networks. IEEE Access 7:37970–37979
41. Van Engelen JE, Hoos HH (2020) A survey on semi-supervised learning. Mach Learn 109(2):373–440
42. Gamper J, Chan B, Tsang YW, Snead D, Rajpoot N (2020) Meta-SVDD: probabilistic meta-learning for one-class classification in cancer histology images. arXiv preprint arXiv:2003.03109
43. Diaz-Pinto A, Colomer A, Naranjo V, Morales S, Xu Y, Frangi AF (2019) Retinal image synthesis and semi-supervised learning for glaucoma assessment. IEEE Trans Med Imaging 38(9):2211–2218
44. Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434
45. Kainz P, Burgsteiner H, Asslaber M, Ahammer H (2017) Training echo state networks for rotation-invariant bone marrow cell classification. Neural Comput Appl 28(6):1277–1292
46. Kang M, Shim W, Cho M, Park J (2021) Rebooting ACGAN: auxiliary classifier GANs with stable training. Adv Neural Inf Process Syst 34:23505–23518
47. Wang (2021) Learning fast converging, effective conditional generative adversarial networks with a mirrored auxiliary classifier. Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 2566–2575
48. Chung D, Delp EJ (2019) Camera-aware image-to-image translation using similarity preserving StarGAN for person re-identification. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 0–0
49. Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4401–4410
50. Zhu J-Y, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE international conference on computer vision, pp 2223–2232

Acknowledgements

This project is funded by a Research Translation Grant 2018 by the Department of Health, Western Australia.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions.

Author information

Authors and Affiliations

Murdoch University, Perth, Australia
Upeka Vianthi Somaratne, Kok Wai Wong, Jeremy Parry & Hamid Laga
Western Diagnostic Pathology, Myaree, Australia
Jeremy Parry

Corresponding author

Correspondence to Upeka Vianthi Somaratne.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Somaratne, U.V., Wong, K.W., Parry, J. et al. The use of generative adversarial networks for multi-site one-class follicular lymphoma classification. Neural Comput & Applic 35, 20569–20579 (2023). https://doi.org/10.1007/s00521-023-08810-8

Received: 29 December 2021
Accepted: 28 June 2023
Published: 22 July 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s00521-023-08810-8
