Abstract
As an important upstream task for many medical applications, supervised landmark localization still requires non-negligible annotation costs to achieve desirable performance. Besides, due to cumbersome collection procedures, the limited size of medical landmark datasets impacts the effectiveness of large-scale self-supervised pre-training methods. To address these challenges, we propose a two-stage framework for one-shot medical landmark localization, which first infers landmarks by unsupervised registration from the labeled exemplar to unlabeled targets, and then utilizes these noisy pseudo labels to train robust detectors. To handle the significant structure variations, we learn an end-to-end cascade of global alignment and local deformations, under the guidance of novel loss functions which incorporate edge information. In stage II, we explore self-consistency for selecting reliable pseudo labels and cross-consistency for semi-supervised learning. Our method achieves state-of-the-art performances on public datasets of different body parts, which demonstrates its general applicability. Code is available at https://github.com/GoldExcalibur/EdgeTrans4Mark.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Wang, C.-W., et al.: A benchmark for comparison of dental radiography analysis algorithms. Med. Image Anal. 31, 63–76 (2016)
Chen, R., Ma, Y., Chen, N., Lee, D., Wang, W.: Cephalometric landmark detection by attentive feature pyramid fusion and regression-voting. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11766, pp. 873–881. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32248-9_97
Escobar, M., González, C., Torres, F., Daza, L., Triana, G., Arbeláez, P.: Hand pose estimation for pediatric bone age assessment. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11769, pp. 531–539. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32226-7_59
Gong, P., Yin, Z., Wang, Y., Yu, Y.: Towards robust bone age assessment: rethinking label noise and ambiguity. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12266, pp. 621–630. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59725-2_60
Payer, C., Štern, D., Bischof, H., Urschler, M.: Integrating spatial configuration into heatmap regression based CNNs for landmark localization. Med. Image Anal. 54, 207–219 (2019)
Liu, W., Wang, Yu., Jiang, T., Chi, Y., Zhang, L., Hua, X.-S.: Landmarks detection with anatomical constraints for total hip arthroplasty preoperative measurements. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12264, pp. 670–679. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59719-1_65
Wang, J., et al.: Deep high-resolution representation learning for visual recognition. IEEE trans. Pattern Anal. Mach. Intell. 43, 3349–3364 (2020)
Li, W., et al.: Structured landmark detection via topology-adapting deep graph learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12354, pp. 266–283. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58545-7_16
Liu, H., Liu, F., Fan, X., Huang, D.: Polarized self-attention: towards High-quality Pixel-wise Regression. arXiv preprint arXiv:2107.00782 (2021)
Browatzki, B., Wallraven, C.: 3FabRec: fast few-shot face alignment by reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6110–6120 (2020)
Yao, Q., Quan, Q., Xiao, L., Zhou, S.K.: One-shot medical landmark detection. arXiv preprint arXiv:2103.04527 (2021)
Zhou, X-Y., et al.: Scalable semi-supervised landmark localization for X-ray images using few-shot deep adaptive graph. arXiv preprint arXiv:2104.14629 (2021)
Balakrishnan, G., Zhao, A., Sabuncu, M.R., Guttag, J., Dalca, A.V.: An unsupervised learning model for deformable medical image registration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9252–9260 (2018)
Lee, M.C.H., Oktay, O., Schuh, A., Schaap, M., Glocker, B.: Image-and-spatial transformer networks for structure-guided image registration. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11765, pp. 337–345. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32245-8_38
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Yao, Q., He, Z., Han, H., Zhou, S.K.: Miss the point: targeted adversarial attack on multiple landmark detection. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12264, pp. 692–702. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59719-1_67
Honari, S., Molchanov, P., Tyree, S., Vincent, P., Pal, C., Kautz, J.: Improving landmark localization with semi-supervised learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1546–1555 (2018)
Moskvyak, O., Maire, F., Dayoub, F., Baktashmotlagh, M.: Semi-supervised keypoint localization. arXiv preprint arXiv:2101.07988 (2021)
Qian, S., Sun, K., Wu, W., Qian, C., Jia, J.: Aggregation via separation: boosting facial landmark detector with semi-supervised style translation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10153–10163 (2019)
Kumar, A., Chellappa, R.: S2LD: Semi-supervised landmark detection in low-resolution images and impact on face verification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 758–759 (2020)
Uzunova, H., Wilms, M., Handels, H., Ehrhardt, J.: Training CNNs for image registration from few samples with model-based data augmentation. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10433, pp. 223–231. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66182-7_26
Li, H., Fan, Y.: Non-rigid image registration using self-supervised fully convolutional networks without training data. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 1075–1078. IEEE (2018)
Zhao, A., Balakrishnan, G., Durand, F., Guttag, J.V., Dalca, A.V.: Data augmentation using learned transformations for one-shot medical image segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8543–8553 (2019)
Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019)
Parmar, N., et al.: Image transformer. In: International Conference on Machine Learning, pp. 4055–4064. PMLR (2018)
Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13
Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. Adv. Neural. Inf. Process. Syst. 28, 2017–2025 (2015)
Wang, Z., Simoncelli, E.P., Bovik, A.C.: Multiscale structural similarity for image quality assessment. In: 2003 the Thrity-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402. IEEE (2003)
Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. arXiv preprint arXiv:1804.06872 (2018)
Zhang, C., Bengio, S., Hardt, M., Recht, B., Vinyals, O.: Understanding deep learning requires rethinking generalization. arXiv:abs/1611.03530 (2017)
Arpit, D., et al.: A closer look at memorization in deep networks. In: International Conference on Machine Learning, pp. 233–242. PMLR (2017)
Yu, X., Han, B., Yao, J., Niu, G., Tsang, I., Sugiyama, M.: How does disagreement help generalization against label corruption? In: International Conference on Machine Learning pp. 7164–7173 (2019)
Sohn, K., et al.: FixMatch: simplifying semi-supervised learning with consistency and confidence. arXiv preprint arXiv:2001.07685 (2020)
Wang, C.-W., et al.: Evaluation and comparison of anatomical landmark detection methods for cephalometric X-ray images: a grand challenge. IEEE Trans. Med. Imaging 34(9), 1890–1900 (2015)
Gertych, A., Zhang, A., Sayre, J., Pospiech-Kurkowska, S., Huang, HK.: Bone age assessment of children using a digital hand atlas. Comput. Med. Imaging Graph. 31(4–5), 322–331 (2007)
Zhu, H., Yao, Q., Xiao, L., Zhou, S.K.: You only learn once: universal anatomical landmark detection. arXiv preprint arXiv:2103.04657 (2021)
Zhao, S., et al.: Recursive cascaded networks for unsupervised medical image registration. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10600–10610 (2019)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Acknowledgements
This work is supported in part by MOST-2018AAA0102004 and NSFC-62061136001.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yin, Z., Gong, P., Wang, C., Yu, Y., Wang, Y. (2022). One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13681. Springer, Cham. https://doi.org/10.1007/978-3-031-19803-8_28
Download citation
DOI: https://doi.org/10.1007/978-3-031-19803-8_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19802-1
Online ISBN: 978-3-031-19803-8
eBook Packages: Computer ScienceComputer Science (R0)