Evaluation of Effectiveness of Self-Supervised Learning in Chest X-Ray Imaging to Reduce Annotated Images

Imagawa, Kuniki; Shiomoto, Kohei

doi:10.1007/s10278-024-00975-5

Open access
Published: 08 March 2024

Volume 37, pages 1618–1624, (2024)

Journal of Imaging Informatics in Medicine

Abstract

A significant challenge in machine learning-based medical image analysis is the scarcity of medical images. Obtaining a large number of labeled medical images is difficult because annotating medical images is a time-consuming process that requires specialized knowledge. In addition, inappropriate annotation processes can increase model bias. Self-supervised learning (SSL) is a type of unsupervised learning method that extracts image representations from unlabeled images. Thus, SSL can be an effective method to reduce the number of labeled images. In this study, we investigated the feasibility of reducing the number of labeled images when only a limited set of unlabeled medical images is available. A model was pretrained on unlabeled chest X-ray (CXR) images using the SimCLR framework, and the learned representations were then fine-tuned with supervised learning for the target task. A total of 2000 task-specific CXR images were used to perform binary classification of coronavirus disease 2019 (COVID-19) and normal cases. The results demonstrate that, with pretraining on task-specific unlabeled CXR images, performance can be maintained when the number of labeled CXR images is reduced by approximately 40%. In addition, the performance was significantly better than that obtained without pretraining. In contrast, when only a small number of labeled CXR images is available, a large number of unlabeled pretraining images is required to maintain performance, regardless of task specificity. In summary, to reduce the number of labeled images using SimCLR, we must consider both the number of images and the task-specific characteristics of the target images.

Introduction

Machine learning-based medical image analysis has been researched actively and introduced into the medical environment. The use of such technologies is expected to increase in the future due to the success of deep learning, which is a subset of machine learning [1, 2]. However, a common challenge is the scarcity of medical images due to patient privacy concerns. In addition, the scarcity of labeled medical images is a serious problem because annotating medical images requires specialized knowledge, and an inappropriate annotation process can introduce annotation bias [3, 4]. To overcome the lack of medical images, the most common approach for supervised learning is to pretrain a convolutional neural network (CNN) on a large number of natural images, e.g., ImageNet [5], and then fine-tune the network using labeled medical images for the target medical task. This approach has proven effective in terms of improving performance in some categories [6]. For example, since the outbreak of coronavirus disease 2019 (COVID-19), several X-ray and CT image datasets have been made available to the public, and many COVID-19 classification studies have demonstrated the effectiveness of this approach, especially for limited images. However, there are significant differences between the pretraining data and the fine-tuning data because ImageNet consists of natural color images (approximately 1.4 million images with 1000 labels in the commonly used subset, drawn from a hierarchy of roughly 22,000 categories). The benefit of CNN models pretrained on ImageNet for medical applications therefore remains ambiguous, and such models can suffer from overfitting. In addition, it has been reported that CNN models pretrained on ImageNet can perform worse than models without pretraining, depending on the characteristics of the data [7,8,9].

Self-supervised learning (SSL) has been proposed recently as an effective approach to the labeled image scarcity problem. SSL is a type of unsupervised learning that produces representations from unlabeled images. Generally, the pretrained representation is fine-tuned for a downstream task on a few labeled images. This semi-supervised strategy achieves good performance compared to supervised learning. SimCLR [10] is a simple framework for contrastive learning of visual representations, and its success has led to extensive research on contrastive methods. However, SimCLR requires a large batch size to achieve sufficient performance; thus, many revised frameworks, e.g., Momentum Contrast (MoCo) [11] and Bootstrap Your Own Latent (BYOL) [12], have emerged to reduce the batch size. In the medical field, there has been an increase in the number of studies demonstrating the effectiveness of SSL methods [13]. For example, Huang et al. [14] reviewed literature published after 2012 on SSL for medical image classification. They demonstrated the potential of SSL to reduce the amount of labeled data and improve performance and transferability. However, these previous studies demonstrated the effectiveness of SSL by pretraining on a large number of unlabeled images.

In a related study, Azizi et al. [15] demonstrated that pretraining on unlabeled ImageNet and chest X-ray (CXR) images with SimCLR outperformed a supervised method pretrained on ImageNet and improved transferability to dermatology images. In addition, Sowrirajan et al. [16] demonstrated that pretraining on CXR images with MoCo can reduce the number of labeled images and outperform models trained on CXR images without a pretraining process. Li et al. [17] performed pretraining on labeled ImageNet and then performed pretraining on unlabeled CXR images using six SSL methods (Cross, BYOL, SimSiam, SimCLR, PIRL-jigsaw, and PIRL-rotation), which was followed by transfer learning to the target task. They demonstrated an improvement in COVID-19 detection compared to supervised learning and pretraining with SSL methods. These studies also performed pretraining on a large number (in excess of several hundred thousand) of unlabeled CXR images or natural images using SSL.

In real-world applications, it would be difficult to prepare such a large number of medical images, even unlabeled images, because utilization of medical data must comply with the laws and regulations of each country. Thus, this study was conducted to confirm the effectiveness of SSL for pretraining on a limited number of unlabeled medical images to reduce the amount of labeled data and improve performance. The methodology employed in this study is similar to that used in various previous studies, where the pretrained representation is fine-tuned for a binary classification. Specifically, we evaluate performance as a function of the number of labeled CXR images used for fine-tuning (i.e., the label fraction) for task-specific and non-task-specific images using supervised learning without pretraining, unsupervised pretraining, and supervised pretraining. Here, a total of 2000 task-specific images and 90,000 non-task-specific images were used for classification of COVID-19. In this paper, a “task-specific” pretraining dataset means that the pretraining dataset matches the test dataset primarily in terms of image type (e.g., medical images versus natural images), the medical devices used to acquire the images, and the target diseases. The results demonstrate that the performance obtained by pretraining on task-specific unlabeled CXR images can be maintained when the number of labeled CXR images is reduced by approximately 40%. In addition, the performance was significantly better than that obtained without pretraining. In contrast, when only a small number of labeled CXR images is available, a large number of unlabeled pretraining images is required to maintain performance, regardless of task specificity. In summary, to reduce the number of labeled images and improve performance with SSL, we must consider both the number of images and the task-specific characteristics of the images.

Material and Methods

Datasets

In this study, two publicly available CXR datasets were used, i.e., the National Institutes of Health (NIH) dataset [18] and the BrixIA dataset [19]. The NIH dataset contains 112,120 CXR images from 30,805 unique patients, labeled with 14 disease classes or as normal, and the BrixIA dataset contains 4703 CXR images from COVID-19 patients. Here, a total of 2000 training and validation images for COVID-19 and normal cases were selected randomly at the same ratio because our previous study [20] demonstrated that CNN models, e.g., AlexNet and ResNet, achieve consistent performance when supervised learning is performed on 2000 training images.

The training and validation datasets were divided into labeled images (N=100, 250, 500, 1000, 1500, and 2000) and unlabeled images (N=1900, 1750, 1500, 1000, 500, and 0) from the NIH and BrixIA datasets. Note that there was no duplication between labeled and unlabeled images. In addition, 90,000 unlabeled CXR images from the NIH dataset and 90,000 unlabeled natural images from the ImageNet dataset were selected randomly for training and validation. A test dataset of 1000 CXR images, independent of the training datasets, was used for binary classification of COVID-19. Here, all CXR images were resized to 256 × 256 pixels and cropped around the center to 224 × 224 pixels. The grayscale CXR images were converted from 16-bit to 8-bit and replicated into three color channels (red, green, and blue). The pixel values of the input images were normalized to the range 0 to 1.
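The preprocessing steps described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `preprocess_cxr`, the bit-shift mapping from 16-bit to 8-bit, and the nearest-neighbour resizing are all assumptions, since the text does not state which intensity mapping or interpolation was used.

```python
import numpy as np

def preprocess_cxr(image16: np.ndarray, resize: int = 256, crop: int = 224) -> np.ndarray:
    """Resize a 16-bit grayscale image, center-crop it, and return float32 RGB in [0, 1]."""
    # 16-bit -> 8-bit by keeping the high byte (assumed mapping, not stated in the paper).
    image8 = (image16.astype(np.uint16) >> 8).astype(np.uint8)

    # Nearest-neighbour resize to (resize x resize); the interpolation method is an assumption.
    rows = np.arange(resize) * image8.shape[0] // resize
    cols = np.arange(resize) * image8.shape[1] // resize
    resized = image8[rows][:, cols]

    # Center crop to (crop x crop).
    offset = (resize - crop) // 2
    cropped = resized[offset:offset + crop, offset:offset + crop]

    # Replicate the single channel into three channels and scale to [0, 1].
    rgb = np.stack([cropped] * 3, axis=-1)
    return rgb.astype(np.float32) / 255.0


# Usage with a synthetic 16-bit image standing in for a CXR.
example = np.random.default_rng(0).integers(0, 65536, size=(1024, 1024)).astype(np.uint16)
processed = preprocess_cxr(example)
print(processed.shape, processed.dtype)
```

In practice an image library with proper interpolation would normally replace the index-based resize; it is used here only to keep the sketch dependency-free.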

Methodology

In our experiments, we employed SimCLR [10] as the SSL method to learn the visual representations as a pretraining process. Figure 1 shows the method used in our experiments. SimCLR is a simple framework that does not require a special architecture or memory banks. Here, an image $\varvec{x}$ is augmented in two different ways to obtain $\tilde{\varvec{x}}_i$ and $\tilde{\varvec{x}}_j$, which form a positive pair. In addition, $f(\cdot )$ is an encoder network that generates representations, where $\varvec{h}_i = f(\tilde{\varvec{x}}_i) = \textrm{ResNet}(\tilde{\varvec{x}}_i)$, and $g(\cdot )$ is a neural network projection head with one hidden layer used for the contrastive loss, where $\varvec{z}_i = g(\varvec{h}_i) = W^{(2)}\sigma (W^{(1)}\varvec{h}_i)$ and $\sigma$ is a nonlinear activation (ReLU in the original SimCLR [10]). For a minibatch of N images, augmentation yields 2N views; for each positive pair, the remaining $2(N-1)$ views in the batch serve as negative examples. SimCLR maximizes the agreement of the positive pairs and minimizes the agreement of the negative pairs using the contrastive loss as follows:

$$\begin{aligned} \ell _{i,j} = -\log \frac{\exp (\textrm{sim}(\varvec{z}_i, \varvec{z}_j)/\tau )}{\sum _{k=1}^{2N} \mathbb {1}_{[k \ne i]}\exp (\textrm{sim}(\varvec{z}_i, \varvec{z}_k)/\tau )} \end{aligned}$$

(1)

where $\tau$ is a temperature parameter, $\mathbb {1}_{[k \ne i]}$ is an indicator function that equals 1 if $k \ne i$ and 0 if $k = i$, $\textrm{sim}(\cdot , \cdot )$ denotes the cosine similarity, and N is the batch size.
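To make Eq. (1) concrete, the following NumPy sketch evaluates the loss for a batch of projected views and averages it over all 2N views. It is an illustration under stated assumptions, not the authors' implementation: the function name `nt_xent_loss`, the row layout (rows 2k and 2k+1 hold the two views of image k), and the default temperature of 0.5 are all hypothetical choices.

```python
import numpy as np

def nt_xent_loss(z: np.ndarray, tau: float = 0.5) -> float:
    """Mean of the contrastive loss in Eq. (1) over all 2N views.

    z has shape (2N, d); rows 2k and 2k+1 are assumed to be the two views of image k.
    """
    # Normalize rows so that dot products equal cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau

    # Indicator 1[k != i]: exclude each view's similarity with itself.
    np.fill_diagonal(sim, -np.inf)

    # Numerically stable log of the denominator (log-sum-exp over k != i).
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    # Index of each view's positive partner: 0<->1, 2<->3, ...
    idx = np.arange(z.shape[0])
    pos = idx ^ 1

    # -log(exp(s_pos) / denominator) = log(denominator) - s_pos
    return float((log_denom - sim[idx, pos]).mean())


# Usage: aligned positive pairs give a lower loss than mismatched pairs.
aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mismatched = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(nt_xent_loss(aligned, tau=1.0) < nt_xent_loss(mismatched, tau=1.0))
```

The sketch also shows why batch size matters for this loss: each view is contrasted against the other 2(N-1) views in the batch, so the batch size directly sets the number of negatives.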

We performed preliminary experiments to select the type of data augmentation, the backbone network, and hyperparameters such as the batch size and the number of training epochs, because the original paper [10] reported that these have a significant impact on performance. For data augmentation, two types of transformations were used in this evaluation, i.e., spatial transformations, e.g., cropping, rotation, vertical flipping, and horizontal flipping, and appearance transformations, e.g., Gaussian blur and grayscale. In addition, different numbers of layers (N = 18, 34, 50, and 101) in the ResNet backbone were evaluated, and we investigated different batch sizes (N = 32, 64, 128, and 256) and numbers of training epochs (N = 100, 300, and 500). Note that other SimCLR hyperparameters were unchanged, and the hyperparameters for supervised learning were based on our previous results [20]. The area under the receiver operating characteristic curve (AUC) was used for evaluation because sensitivity and specificity depend on the classification threshold, and accuracy depends on the class proportions of the test dataset. The 95% confidence interval was constructed using the method proposed by Hanley and McNeil [21]. To evaluate the effectiveness of SSL, the AUC on the common test dataset was determined as a function of the images used for unsupervised pretraining and the number of labeled CXR images used for the subsequent supervised learning process. The types and numbers of images used for training and testing are shown in Table 1.
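As an illustration of the evaluation metric, the sketch below computes an AUC from classifier scores and an approximate 95% confidence interval. It assumes the widely used standard-error approximation associated with Hanley and McNeil; the text does not spell out the exact formula the authors applied, so this form, the function names, and the 500/500 class split in the usage example are assumptions.

```python
import math
import numpy as np

def auc_mann_whitney(scores_pos, scores_neg) -> float:
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())

def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int, z: float = 1.96):
    """Approximate confidence interval for an AUC from the Hanley-McNeil standard error."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))
    # Clip to [0, 1] because an AUC cannot leave that range.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Usage with hypothetical scores; the class split of the 1000 test images is assumed here.
rng = np.random.default_rng(0)
pos_scores = rng.normal(1.0, 1.0, size=500)
neg_scores = rng.normal(0.0, 1.0, size=500)
auc = auc_mann_whitney(pos_scores, neg_scores)
low, high = hanley_mcneil_ci(auc, n_pos=500, n_neg=500)
print(f"AUC = {auc:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Because the interval width depends only on the AUC and the class counts, two models evaluated on the same test set can be compared at a glance, although a formal comparison of correlated AUCs needs a paired test.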

Table 1 Types and number of images used for training and test

Results

Figure 2 shows the performance obtained using the data augmentations when applied in various combinations. We found that the combination of rotation and vertical flipping obtained the highest performance. Figures 3 and 4 show the effect of batch size and the number of training epochs on performance. As can be seen, the best performance was obtained with a batch size of 128. In addition, higher numbers of training epochs tended to improve performance; however, 300 epochs were used due to limited computational resources. We also evaluated the effect of different numbers of ResNet layers. As shown in Fig. 5, ResNet34 was the most effective among the depths evaluated. Note that the fundamental model and hyperparameter configurations were the same for all experiments (see “Methodology”), and the detailed experimental conditions are shown in Figs. 2, 3, 4 and 5.
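The best-performing augmentation combination reported above (rotation and vertical flipping) can be sketched as a generator of positive pairs. This is only an illustration: the rotation range is not stated in the text, so rotations are restricted here to multiples of 90 degrees as a simplifying assumption, and the function names `random_view` and `positive_pair` are hypothetical.

```python
import numpy as np

def random_view(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return one augmented view using a random rotation and a random vertical flip."""
    # Rotation by k * 90 degrees (assumed; arbitrary angles would need interpolation).
    k = int(rng.integers(0, 4))
    view = np.rot90(image, k=k, axes=(0, 1))
    # Vertical flip with probability 0.5.
    if rng.random() < 0.5:
        view = np.flipud(view)
    return np.ascontiguousarray(view)

def positive_pair(image: np.ndarray, rng: np.random.Generator):
    """Two independently augmented views of the same image form a positive pair."""
    return random_view(image, rng), random_view(image, rng)


# Usage on a synthetic square image with three channels.
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3)).astype(np.float32)
view_i, view_j = positive_pair(image, rng)
print(view_i.shape, view_j.shape)
```

Both views keep the original pixel values and only change their arrangement, which is why this kind of spatial augmentation preserves the intensity information of a radiograph.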

The representations pretrained on different types and numbers of images were subsequently fine-tuned on different numbers of labeled CXR images. Figure 6 shows the AUC versus the number of labeled CXR images used for fine-tuning for each unsupervised method pretrained on unlabeled images. Note that Fig. 6 includes pretraining on task-specific CXR images extracted from the NIH and BrixIA datasets (red), 90,000 CXR images extracted from only the NIH dataset (blue), and 90,000 natural images extracted from ImageNet (green). Supervised learning without pretraining was also included as a baseline method (black). We found that the AUCs for pretraining on task-specific CXR images can be maintained when the number of labeled CXR images is reduced from 2000 to 500 (i.e., a reduction of approximately 40%). In addition, performance was significantly better than the supervised baseline in this range. In contrast, the effect of SimCLR was reduced drastically when the number of labeled images was reduced to fewer than 500. When pretraining was performed on 90,000 non-task-specific CXR images and 90,000 natural images, we observed slightly better performance compared to the supervised baseline. In contrast, the improvement in terms of the effectiveness of SimCLR can be observed in the small labeled image region.

Discussion

The implementation of SimCLR on ImageNet requires large batch sizes; thus, this approach consumes significant computational resources. However, our results demonstrate that SimCLR applied to CXR images did not require a large batch size. We assume that, because CXR images share common anatomical structures, there is no need to create many negative pairs in each batch to reduce the loss. The results for data augmentation and the backbone network are also specific to CXR images. As a backbone, we found that ResNet does not require deeper layers, and this trend is similar to the supervised learning results presented by D’souza et al. [22]. They suggested that “deeper is better” is not always true, especially for small amounts of data, and the optimal CNN network depends on the nature of the training data. This suggests that feature extraction with SimCLR does not necessarily require a deep model because CXR images are simpler than ImageNet images and the number of images is very small. In addition, from a computational resource perspective, our results also demonstrate that it is important to use SimCLR with task-specific images rather than ImageNet.

In this study, SimCLR was used specifically for CXR images, and we found that the AUCs obtained with pretraining on task-specific CXR images can be maintained when the number of labeled CXR images is reduced by approximately 40%. Cho et al. [23] also investigated the effectiveness of an SSL method and demonstrated the performance on some CXR datasets, e.g., the CheXpert dataset [24], by fine-tuning for multiclass classification after pretraining on 4.8 million non-task-specific unlabeled CXR images with MoCo. Table 2 compares these results in terms of the reduction of fine-tuned labeled images. Although performing direct quantitative comparisons is difficult due to the differences in the type of SSL method, the number of classes, and the number of labeled images, the results demonstrate a similar trend. In other words, AUCs can be maintained when the number of labeled CXR images is reduced by approximately half. Considering that the number of unlabeled pretraining images used in the current study was extremely small (ranging from hundreds to thousands), the acquisition of task-specific images is an important factor in terms of data efficiency.

In contrast, when only a small number of labeled CXR images was used, the AUC obtained with pretraining on task-specific images was significantly lower than that obtained with pretraining on non-task-specific images. Figure 7 shows the AUC versus the number of labeled CXR images used for fine-tuning for both unsupervised and supervised methods pretrained on unlabeled and labeled images. This figure includes pretraining on task-specific unlabeled CXR images extracted from the NIH and BrixIA datasets (red) and on labeled natural images extracted from ImageNet (purple). Note that supervised learning without pretraining is also included as a baseline method (black). As shown, pretraining on 1.4 million labeled ImageNet images clearly outperforms the other methods in the small labeled image region (fewer than 500 images). A related study by Sellergren et al. [25] achieved an AUC comparable to state-of-the-art deep supervised learning models on tens to hundreds of labeled images by performing pretraining on 821,544 unlabeled CXR images using an SSL method. This suggests that a large number of pretraining images is required to maintain sufficient performance in the small labeled image region, regardless of the pretraining method and task specificity.

In this study, we found that pretraining on task-specific unlabeled images with SimCLR can reduce the number of labeled images and outperform training without pretraining. However, this result is limited in that it does not apply to a small number of labeled images. Recently, many studies have reported the effectiveness of SSL; however, to the best of our knowledge, few studies have investigated the use of a small number of labeled and unlabeled images that reflects real-world conditions. Thus, the results of the current study provide fundamental insight into the effectiveness of SSL in the medical field. Note that our study is limited in terms of the generalizability of our findings. First, the experiment conducted in this study only investigated SimCLR; thus, other improved and refined SSL frameworks should be considered in the future. In addition, more detailed hyperparameters should be considered. Second, a wider variety of diseases should be considered to reflect actual clinical practice. For COVID-19 detection, similar diseases, e.g., viral and bacterial pneumonia, should be investigated. In addition, other medical images should also be considered.

Table 2 Fraction of labeled images

Conclusion

To the best of our knowledge, this paper represents the first report demonstrating the effectiveness of the SSL method with pretraining on a small number of unlabeled images. We hope that the results of this study will contribute to the ongoing development of machine learning-based medical image analysis.

Data Availability

The NIH dataset and the BrixIA dataset that support the findings of this study are openly available at https://nihcc.app.box.com/v/ChestXray-NIHCC and https://brixia.github.io/.

References

1. G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sánchez. A survey on deep learning in medical image analysis. Med Image Anal, 42:60–88, Dec 2017.
2. Ryuji Hamamoto, Kruthi Suvarna, Masayoshi Yamada, Kazuma Kobayashi, Norio Shinkai, Mototaka Miyake, Masamichi Takahashi, Shunichi Jinnai, Ryo Shimoyama, Akira Sakai, Ken Takasawa, Amina Bolatkan, Kanto Shozu, Ai Dozen, Hidenori Machino, Satoshi Takahashi, Ken Asada, Masaaki Komatsu, Jun Sese, and Syuzo Kaneko. Application of artificial intelligence technology in oncology: Towards the establishment of precision medicine. Cancers, 12(12), 2020.
3. Ibrahim Tolga Öztürk, Rostislav Nedelchev, Christian Heumann, Esteban Garces Arias, Marius Roger, Bernd Bischl, and Matthias Aßenmacher. How different is stereotypical bias across languages? arXiv preprint arXiv:2307.07331, 2023.
4. Bogdan A Bercean, Andreea Birhala, Paula G Ardelean, Ioana Barbulescu, Marius M Benta, Cristina D Rasadean, Dan Costachescu, Cristian Avramescu, Andrei Tenescu, Stefan Iarca, et al. Evidence of a cognitive bias in the quantification of COVID-19 with CT: an artificial intelligence randomised clinical trial. Scientific Reports, 13(1):4887, 2023.
5. J Deng, W Dong, R Socher, LJ Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, 2009.
6. R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi. Convolutional neural networks: an overview and application in radiology. Insights Imaging, 9(4):611–629, Aug 2018.
7. Maithra Raghu, Chiyuan Zhang, Jon Kleinberg, and Samy Bengio. Transfusion: Understanding transfer learning for medical imaging. Advances in neural information processing systems, 32, 2019.
8. L. Wynants, B. Van Calster, G. S. Collins, R. D. Riley, G. Heinze, E. Schuit, M. M. J. Bonten, D. L. Dahly, J. A. A. Damen, T. P. A. Debray, V. M. T. de Jong, M. De Vos, P. Dhiman, M. C. Haller, M. O. Harhay, L. Henckaerts, P. Heus, M. Kammer, N. Kreuzberger, A. Lohmann, K. Luijken, J. Ma, G. P. Martin, D. J. McLernon, C. L. Andaur Navarro, J. B. Reitsma, J. C. Sergeant, C. Shi, N. Skoetz, L. J. M. Smits, K. I. E. Snell, M. Sperrin, R. Spijker, E. W. Steyerberg, T. Takada, I. Tzoulaki, S. M. J. van Kuijk, B. van Bussel, I. C. C. van der Horst, F. S. van Royen, J. Y. Verbakel, C. Wallisch, J. Wilkinson, R. Wolff, L. Hooft, K. G. M. Moons, and M. van Smeden. Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal. BMJ, 369:m1328, Apr 2020.
9. Michael Roberts, Derek Driggs, Matthew Thorpe, Julian Gilbey, Michael Yeung, Stephan Ursprung, Angelica I Aviles-Rivero, Christian Etmann, Cathal McCague, Lucian Beer, et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence, 3(3):199–217, 2021.
10. Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020.
11. Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020.
12. Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent: a new approach to self-supervised learning. Advances in neural information processing systems, 33:21271–21284, 2020.
13. Saeed Shurrab and Rehab Duwairi. Self-supervised learning methods and applications in medical imaging analysis: a survey. PeerJ Computer Science, 8:e1045, 2022.
14. Shih-Cheng Huang, Anuj Pareek, Malte Jensen, Matthew P Lungren, Serena Yeung, and Akshay S Chaudhari. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. NPJ Digital Medicine, 6(1):74, 2023.
15. Shekoofeh Azizi, Basil Mustafa, Fiona Ryan, Zachary Beaver, Jan Freyberg, Jonathan Deaton, Aaron Loh, Alan Karthikesalingam, Simon Kornblith, Ting Chen, et al. Big self-supervised models advance medical image classification. In Proceedings of the IEEE/CVF international conference on computer vision, pages 3478–3488, 2021.
16. Hari Sowrirajan, Jingbo Yang, Andrew Y Ng, and Pranav Rajpurkar. MoCo pretraining improves representation and transferability of chest X-ray models. In Medical Imaging with Deep Learning, pages 728–744. PMLR, 2021.
17. Guang Li, Ren Togo, Takahiro Ogawa, and Miki Haseyama. COVID-19 detection based on self-supervised transfer learning using chest X-ray images. International Journal of Computer Assisted Radiology and Surgery, 18(4):715–722, 2023.
18. Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, M Bagheri, and R Summers. Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In IEEE CVPR, volume 7, page 46, 2017.
19. A. Signoroni, M. Savardi, S. Benini, N. Adami, R. Leonardi, P. Gibellini, F. Vaccher, M. Ravanelli, A. Borghesi, R. Maroldi, and D. Farina. BS-Net: Learning COVID-19 pneumonia severity on a large chest X-ray dataset. Med Image Anal, 71:102046, Jul 2021.
20. Kuniki Imagawa and Kohei Shiomoto. Performance change with the number of training data: A case study on the binary classification of COVID-19 chest X-ray by using convolutional neural networks. Computers in Biology and Medicine, 142:105251, 2022.
21. James A Hanley and Barbara J McNeil. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology, 148(3):839–843, 1983.
22. Rhett N D’souza, Po-Yao Huang, and Fang-Cheng Yeh. Structural analysis and optimization of convolutional neural networks with a small sample size. Scientific Reports, 10(1):834, 2020.
23. Kyungjin Cho, Ki Duk Kim, Yujin Nam, Jiheon Jeong, Jeeyoung Kim, Changyong Choi, Soyoung Lee, Jun Soo Lee, Seoyeon Woo, Gil-Sun Hong, et al. CheSS: Chest X-ray pre-trained model via self-supervised contrastive learning. Journal of Digital Imaging, pages 1–9, 2023.
24. Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, et al. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 590–597, 2019.
25. Andrew B. Sellergren, Christina Chen, Zaid Nabulsi, Yuanzhen Li, Aaron Maschinot, Aaron Sarna, Jenny Huang, Charles Lau, Sreenivasa Raju Kalidindi, Mozziyar Etemadi, Florencia Garcia-Vicente, David Melnick, Yun Liu, Krish Eswaran, Daniel Tse, Neeral Beladia, Dilip Krishnan, and Shravya Shetty. Simplified transfer learning for chest radiography models using less data. Radiology, 305(2):454–465, November 2022.

Funding

Open Access funding provided by University of Yamanashi. The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and Affiliations

Faculty of Information Technology, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya-ku, Tokyo, 158-8557, Japan
Kuniki Imagawa & Kohei Shiomoto

Authors

Kuniki Imagawa
Kohei Shiomoto

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Kuniki Imagawa under the supervision of Kohei Shiomoto. The first draft of the manuscript was written by Kuniki Imagawa, and Kohei Shiomoto commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Kuniki Imagawa.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Imagawa, K., Shiomoto, K. Evaluation of Effectiveness of Self-Supervised Learning in Chest X-Ray Imaging to Reduce Annotated Images. J Digit Imaging. Inform. med. 37, 1618–1624 (2024). https://doi.org/10.1007/s10278-024-00975-5

Received: 07 September 2023
Revised: 17 November 2023
Accepted: 17 November 2023
Published: 08 March 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s10278-024-00975-5
