Large-Scale Pretraining on Pathological Images for Fine-Tuning of Small Pathological Benchmarks

Kawai, Masakata; Ota, Noriaki; Yamaoka, Shinsuke

doi:10.1007/978-3-031-44917-8_25

Masakata Kawai (ORCID: orcid.org/0000-0003-1106-239X), Noriaki Ota & Shinsuke Yamaoka

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14307)

Included in the following conference series:

Workshop on Medical Image Learning with Limited and Noisy Data

Abstract

Pretraining a deep learning model on a large image dataset is a standard step before fine-tuning it on a small, targeted dataset. The large dataset usually consists of general images (e.g., imagenet2012), while the small dataset can be specialized and have a different distribution from the large one. However, this “large-to-small” strategy is not well validated when the large dataset is itself specialized and has a distribution similar to that of the small datasets. We newly compiled three hematoxylin and eosin-stained image datasets: one large (PTCGA200) and two magnification-adjusted small datasets (PCam200 and segPANDA200). Major deep learning models were trained with supervised and self-supervised learning methods and fine-tuned on the small datasets for tumor classification and tissue segmentation benchmarks. ResNet50 pretrained with MoCov2, SimCLR, and BYOL on PTCGA200 was better than imagenet2012 pretraining when fine-tuned on PTCGA200 (accuracy of 83.94%, 86.41%, 84.91%, and 82.72%, respectively). ResNet50 pretrained on PTCGA200 with MoCov2 exceeded the COCOtrain2017-pretrained baseline and was the best among ResNet50 models on the tissue segmentation benchmark (mIoU of 63.53% and 63.22%, respectively). We found that supervised re-training of imagenet-pretrained models (ResNet50, BiT-M-R50x1, and ViT-S/16) on PTCGA200 often improved downstream benchmarks.

Code: https://github.com/enigmanx20/PatchTCGA

Datasets: http://bit.ly/3KCzkCA
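
The abstract reports two metrics: accuracy for tumor classification and mean intersection over union (mIoU) for tissue segmentation. As a minimal sketch of how these are commonly computed from a confusion matrix, the Python below uses the textbook definitions. It is not taken from the authors' repository; the function names and the toy labels are hypothetical, and the paper's exact evaluation protocol (for example, how classes or images are averaged) is not stated in the abstract and may differ.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count (true class, predicted class) pairs; rows are true labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither labels nor predictions are skipped
    (an assumption of this sketch, not a statement about the paper).
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels (hypothetical, not from the paper): 3 classes, 6 pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"accuracy = {accuracy(cm):.4f}")
print(f"mIoU     = {mean_iou(cm):.4f}")
```

On the toy labels, four of six predictions are correct, and the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5. For segmentation, the same functions would be applied to flattened per-pixel label maps.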

Acknowledgment

The results shown here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. We follow the original licenses when sharing the compiled datasets. We share PTCGA200 by acknowledging the NIH Genomic Data Sharing (GDS) Policy. We share the PCam200 dataset under the CC0 license and the segPANDA200 dataset under the CC BY-SA-NC 4.0 license. This paper is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).

Author information

Authors and Affiliations

Department of Pathology, University of Yamanashi, Yamanashi, Japan
Masakata Kawai
Systems Research and Development Center, Technology Bureau, NS Solutions Corp, Tokyo, Japan
Noriaki Ota & Shinsuke Yamaoka

Corresponding author

Correspondence to Masakata Kawai.

Editor information

Editors and Affiliations

National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Zhiyun Xue
National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Sameer Antani
National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Ghada Zamzmi
National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Feng Yang
National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Sivaramakrishnan Rajaraman
College of Information Sciences and Technology, Penn State University, University Park, PA, USA
Sharon Xiaolei Huang
Sheikh Zayed Institute for Pediatric Surgical Innovation, Children’s National Hospital, Washington, DC, USA
Marius George Linguraru
National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Zhaohui Liang

Cite this paper

Kawai, M., Ota, N., Yamaoka, S. (2023). Large-Scale Pretraining on Pathological Images for Fine-Tuning of Small Pathological Benchmarks. In: Xue, Z., et al. Medical Image Learning with Limited and Noisy Data. MILLanD 2023. Lecture Notes in Computer Science, vol 14307. Springer, Cham. https://doi.org/10.1007/978-3-031-44917-8_25

DOI: https://doi.org/10.1007/978-3-031-44917-8_25
Published: 08 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47196-4
Online ISBN: 978-3-031-44917-8
eBook Packages: Computer Science, Computer Science (R0)