One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement

Yin, Zihao; Gong, Ping; Wang, Chunyu; Yu, Yizhou; Wang, Yizhou

doi:10.1007/978-3-031-19803-8_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13681)

Included in the following conference series:

European Conference on Computer Vision

Abstract

As an important upstream task for many medical applications, supervised landmark localization still requires non-negligible annotation costs to achieve desirable performance. Besides, due to cumbersome collection procedures, the limited size of medical landmark datasets reduces the effectiveness of large-scale self-supervised pre-training methods. To address these challenges, we propose a two-stage framework for one-shot medical landmark localization, which first infers landmarks by unsupervised registration from the labeled exemplar to unlabeled targets, and then uses these noisy pseudo labels to train robust detectors. To handle the significant structural variations, we learn an end-to-end cascade of global alignment and local deformations, guided by novel loss functions that incorporate edge information. In stage II, we explore self-consistency for selecting reliable pseudo labels and cross-consistency for semi-supervised learning. Our method achieves state-of-the-art performance on public datasets of different body parts, which demonstrates its general applicability. Code is available at https://github.com/GoldExcalibur/EdgeTrans4Mark.
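
To make the first stage concrete, the following is a minimal sketch of how exemplar landmarks could be carried to a target image once a registration has been estimated. It is not the authors' implementation: the function names, the nearest-pixel lookup, and the convention that both the affine map and the displacement field point from exemplar to target coordinates are assumptions made for illustration only.

```python
import numpy as np


def apply_affine(points, A, t):
    """Global alignment step: map (N, 2) points through x -> A @ x + t."""
    return points @ A.T + t


def apply_displacement(points, field):
    """Local deformation step: shift each point by the displacement at its nearest pixel.

    `field` has shape (H, W, 2) and stores (dx, dy) per pixel; points are (x, y).
    Nearest-pixel lookup is an illustrative simplification (no interpolation).
    """
    h, w = field.shape[:2]
    cols = np.clip(np.rint(points[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(points[:, 1]).astype(int), 0, h - 1)
    return points + field[rows, cols]


def transfer_landmarks(exemplar_pts, A, t, field):
    """Hypothetical helper: cascade global alignment, then local deformation."""
    return apply_displacement(apply_affine(exemplar_pts, A, t), field)


# Toy example: identity rotation/scale, a small shift, and a uniform 1-pixel flow in x.
exemplar_pts = np.array([[10.0, 20.0], [30.0, 40.0]])
A = np.eye(2)
t = np.array([2.0, -1.0])
field = np.zeros((64, 64, 2))
field[..., 0] = 1.0

pseudo_labels = transfer_landmarks(exemplar_pts, A, t, field)
print(pseudo_labels)  # [[13. 19.] [33. 39.]]
```

In this sketch the transferred points play the role of the noisy pseudo labels that the second stage would then filter and refine.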

Acknowledgements

This work is supported in part by MOST-2018AAA0102004 and NSFC-62061136001.

Author information

Authors and Affiliations

Center for Data Science, Peking University, Beijing, China
Zihao Yin
Deepwise AI Lab, Beijing, China
Ping Gong
Microsoft Research Asia, Beijing, China
Chunyu Wang
The University of Hong Kong, Pok Fu Lam, Hong Kong
Yizhou Yu
Center on Frontiers of Computing Studies, School of Computer Science, Peking University, Beijing, China
Yizhou Wang
Institute for Artificial Intelligence, Peking University, Beijing, China
Yizhou Wang

Corresponding author

Correspondence to Yizhou Wang.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Cite this paper

Yin, Z., Gong, P., Wang, C., Yu, Y., Wang, Y. (2022). One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13681. Springer, Cham. https://doi.org/10.1007/978-3-031-19803-8_28

DOI: https://doi.org/10.1007/978-3-031-19803-8_28
Published: 23 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19802-1
Online ISBN: 978-3-031-19803-8
eBook Packages: Computer Science, Computer Science (R0)
