3D-CNNs for Deep Binary Descriptor Learning in Medical Volume Data

  • Max Blendowski
  • Mattias P. Heinrich
Conference paper
Part of the Informatik aktuell book series (INFORMAT)


Deep convolutional neural networks achieve impressive results in many computer vision tasks, not least because of their representation learning abilities. Translating these findings to the medical domain, with its large volumetric data (e.g. CT scans with typically ≥ 10⁶ voxels), is an important area of research. In particular for medical image registration, a standard analysis task, the supervised learning of expressive regional representations from local grey-value information is important for defining a similarity metric. By providing discriminative binary features, modern architectures can leverage specialised operations to compute Hamming-distance-based similarity metrics. In this contribution we devise a 3D Convolutional Neural Network (CNN) that efficiently extracts binary descriptors for Hamming-distance-based metrics. We adopt the recently introduced Binary Tree Architectures and train a model on paired data with known correspondences. We employ a triplet objective term and extend the hinge loss with additional penalties for non-binary entries. The learned descriptors are shown to outperform state-of-the-art hand-crafted features on challenging COPD 3D-CT datasets, and they remain robust in retrieval tasks at compression factors of ≈ 2000.
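The two core ingredients of the abstract can be illustrated with a minimal sketch: a Hamming distance between bit-packed descriptors computed via XOR and popcount, and a triplet hinge loss on real-valued network outputs extended with a penalty that pushes entries toward ±1 so they binarise cleanly. Function names, the margin, and the `binary_weight` factor below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def hamming_distance(a_bits: np.ndarray, b_bits: np.ndarray) -> int:
    """Hamming distance between two bit-packed (uint8) descriptors:
    XOR the packed bytes, then count the set bits."""
    xored = np.bitwise_xor(a_bits, b_bits)
    return int(np.unpackbits(xored).sum())

def triplet_hinge_loss(anchor, positive, negative,
                       margin=1.0, binary_weight=0.1):
    """Triplet hinge loss on real-valued descriptors, plus a penalty
    for non-binary entries (|abs(x) - 1| summed over all descriptors).
    The weighting scheme is a hypothetical choice for illustration."""
    d_pos = np.sum((anchor - positive) ** 2)   # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2)   # anchor-negative distance
    hinge = max(0.0, margin + d_pos - d_neg)   # violated margin -> loss
    penalty = sum(np.abs(np.abs(x) - 1.0).sum()
                  for x in (anchor, positive, negative))
    return hinge + binary_weight * penalty
```

At test time only the XOR/popcount path is needed, which is what makes retrieval with binary descriptors so cheap compared with floating-point similarity metrics.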





Copyright information

© Springer-Verlag GmbH Deutschland 2018

Authors and Affiliations

  1. Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
