Abstract
Pretraining a deep learning model on a large image dataset is a standard step before fine-tuning it on a small, targeted dataset. The large dataset usually consists of general images (e.g. imagenet2012), whereas the small dataset can be specialized and follow a different distribution. However, this “large-to-small” strategy is not well validated when the large dataset is itself specialized and has a distribution similar to that of the small datasets. We newly compiled three hematoxylin and eosin-stained image datasets: one large (PTCGA200) and two magnification-adjusted small datasets (PCam200 and segPANDA200). Major deep learning models were trained with supervised and self-supervised learning methods and fine-tuned on the small datasets for tumor classification and tissue segmentation benchmarks. ResNet50 pretrained with MoCov2, SimCLR, and BYOL on PTCGA200 outperformed imagenet2012 pretraining when fine-tuned on PTCGA200 (accuracy of 83.94%, 86.41%, 84.91%, and 82.72%, respectively). ResNet50 pretrained on PTCGA200 with MoCov2 exceeded the COCOtrain2017-pretrained baseline and was the best ResNet50 variant on the tissue segmentation benchmark (mIoU of 63.53% and 63.22%, respectively). We found that supervised re-training of imagenet-pretrained models (ResNet50, BiT-M-R50x1, and ViT-S/16) on PTCGA200 often improved downstream benchmark performance.
Code: https://github.com/enigmanx20/PatchTCGA
Datasets: http://bit.ly/3KCzkCA
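The two metrics quoted in the abstract, classification accuracy and mean intersection-over-union (mIoU), can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the PatchTCGA repository. The abstract does not state how the authors aggregate IoU (per image or over the whole dataset) or how absent classes are handled, so this sketch assumes a dataset-level average over classes that appear in either the labels or the predictions.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    if not y_true:
        return 0.0
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes; assumes integer class ids in [0, num_classes).

    Assumption: classes absent from both labels and predictions are skipped
    rather than counted as IoU 0 or 1.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1
    ious: List[float] = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixel labels, three tissue classes (hypothetical).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
print(f"accuracy = {accuracy(labels, preds):.4f}")
print(f"mIoU     = {mean_iou(labels, preds, num_classes=3):.4f}")
```

For segmentation, each image's label map would be flattened to a sequence of per-pixel class ids before being passed to `mean_iou`; for patch-level tumor classification, `accuracy` is applied to one label per patch.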
Acknowledgment
The results shown here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. We follow the original licenses in sharing the compiled datasets. We share PTCGA200 with acknowledgment of the NIH Genomic Data Sharing (GDS) Policy. We share the PCam200 dataset under the CC0 license and the segPANDA200 dataset under the CC BY-SA-NC 4.0 license. This paper is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kawai, M., Ota, N., Yamaoka, S. (2023). Large-Scale Pretraining on Pathological Images for Fine-Tuning of Small Pathological Benchmarks. In: Xue, Z., et al. Medical Image Learning with Limited and Noisy Data. MILLanD 2023. Lecture Notes in Computer Science, vol 14307. Springer, Cham. https://doi.org/10.1007/978-3-031-44917-8_25
DOI: https://doi.org/10.1007/978-3-031-44917-8_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47196-4
Online ISBN: 978-3-031-44917-8
eBook Packages: Computer Science, Computer Science (R0)