Supervised machine learning techniques require large amounts of annotated training data to attain good performance. Active learning aims to ease the data collection process by automatically detecting which instances an expert should annotate in order to train a model as quickly and effectively as possible. Such strategies have previously been reported for medical imaging, but not for focal pathologies, in which class imbalance is high and background appearance is heterogeneous. In this study we evaluate different data selection approaches (random, uncertainty, and representative sampling) and a semi-supervised model training procedure (pseudo-labelling) in the context of lung nodule segmentation in CT volumes from the publicly available LIDC-IDRI dataset. We find that active learning strategies allow us to train a model with equal performance using less than half of the annotation effort. Data selection by uncertainty sampling offers the largest gain, and incorporating representativeness or adding pseudo-labelling gives further small improvements. We conclude that active learning is a valuable tool and that further development of these strategies can play a key role in making diagnostic algorithms viable.
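To make the selection strategies named in the abstract concrete, the following is a minimal sketch of uncertainty sampling combined with pseudo-labelling. It is not the authors' implementation: the abstract does not say how uncertainty is measured, so mean per-pixel binary entropy is an assumption here, as are the function names, the toy probability maps, and the `max_entropy` threshold of 0.1.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli prediction with foreground probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mean_entropy(prob_map):
    """Average per-pixel entropy of one predicted probability map (flat list)."""
    return sum(binary_entropy(p) for p in prob_map) / len(prob_map)


def select_uncertain(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest mean entropy.

    Assumption: mean entropy stands in for whatever uncertainty measure
    the study actually used.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: mean_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


def pseudo_label(pool_probs, selected, max_entropy=0.1):
    """Turn confident predictions on unselected items into hard labels.

    Items sent to the expert (`selected`) are skipped; the rest are kept
    only if their mean entropy is at most `max_entropy` (hypothetical value).
    """
    labels = {}
    for i, probs in enumerate(pool_probs):
        if i not in selected and mean_entropy(probs) <= max_entropy:
            labels[i] = [1 if p >= 0.5 else 0 for p in probs]
    return labels


# Toy unlabelled pool: each entry is a flattened foreground-probability map.
pool = [
    [0.01, 0.02, 0.99, 0.98],  # confident prediction
    [0.45, 0.55, 0.50, 0.60],  # highly uncertain prediction
    [0.10, 0.90, 0.05, 0.95],  # moderately confident prediction
]

to_annotate = select_uncertain(pool, budget=1)    # sent to the expert
pseudo = pseudo_label(pool, set(to_annotate))     # trained on without an expert
```

In this toy pool the most ambiguous map is queued for expert annotation, the most confident map receives a pseudo-label, and the moderately confident map is left unlabelled for a later round. Random sampling would replace `select_uncertain` with a uniform draw from the pool indices.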


Keywords: Active learning, Lung nodule segmentation, Pseudo-labelling



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Canon Medical Research Europe, Edinburgh, UK
  2. Universitat de Girona, Girona, Spain
  3. University of Glasgow, Glasgow, UK
  4. University of Edinburgh, Edinburgh, UK