AGNet: Attention-Guided Network for Surgical Tool Presence Detection

Hu, Xiaowei; Yu, Lequan; Chen, Hao; Qin, Jing; Heng, Pheng-Ann

doi:10.1007/978-3-319-67558-9_22

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10553)

Abstract

We propose a novel approach to automatically recognize the presence of surgical tools in surgical videos, a task made challenging by the large variation and partial appearance of surgical tools, the complicated surgical scenes, and the co-occurrence of some tools in the same frame. Inspired by the human visual attention mechanism, which first orients to and selects important visual cues and then carefully analyzes these foci of attention, we first leverage a global prediction network to obtain a set of visual attention maps and a global prediction for each tool, and then use a local prediction network to predict the presence of tools based on these attention maps. A gate function balances the global and local predictions to produce the final prediction results. The proposed attention-guided network (AGNet) achieves state-of-the-art performance on the m2cai16-tool dataset and surpasses the 2016 winner by a significant margin.
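
The abstract does not give the form of the gate function, so the following is only an illustrative sketch of how a per-tool global prediction and a local prediction could be balanced. It assumes a convex combination with a gate value in [0, 1] for each tool; the names `gated_fusion` and `detect_tools`, the gate values, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def gated_fusion(global_probs: Sequence[float],
                 local_probs: Sequence[float],
                 gates: Sequence[float]) -> List[float]:
    """Blend per-tool global and local presence probabilities.

    Assumed form (not from the paper): p = g * p_global + (1 - g) * p_local,
    with one gate value g in [0, 1] per tool.
    """
    if not (len(global_probs) == len(local_probs) == len(gates)):
        raise ValueError("all inputs must have one entry per tool")
    fused = []
    for p_global, p_local, g in zip(global_probs, local_probs, gates):
        if not 0.0 <= g <= 1.0:
            raise ValueError("gate values must lie in [0, 1]")
        fused.append(g * p_global + (1.0 - g) * p_local)
    return fused


def detect_tools(fused_probs: Sequence[float],
                 threshold: float = 0.5) -> List[bool]:
    """Mark a tool as present when its fused probability reaches the threshold.

    The threshold of 0.5 is an illustrative default, not a value from the paper.
    """
    return [p >= threshold for p in fused_probs]


if __name__ == "__main__":
    # Made-up probabilities for three tools in one frame.
    fused = gated_fusion([0.9, 0.2, 0.6], [0.7, 0.1, 0.2], [0.5, 0.5, 0.25])
    print(fused, detect_tools(fused))
```

In this sketch a gate of 1 trusts the global network alone and a gate of 0 trusts the local network alone; in a trained model the gate would more plausibly be learned than fixed by hand.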

Acknowledgements

The work described in this paper was supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. CUHK 14202514 and CUHK 14203115).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, The People’s Republic of China
Xiaowei Hu, Lequan Yu, Hao Chen & Pheng-Ann Heng
Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong, People’s Republic of China
Jing Qin

Corresponding author

Correspondence to Xiaowei Hu.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
M. Jorge Cardoso
McGill University, Montreal, Québec, Canada
Tal Arbel
University of Adelaide, Adelaide, South Australia, Australia
Gustavo Carneiro
IBM Research - Almaden, San Jose, California, USA
Tanveer Syeda-Mahmood
Universidade do Porto, Porto, Portugal
João Manuel R.S. Tavares
IBM Research - Almaden, San Jose, USA
Mehdi Moradi
University of Queensland, Brisbane, Queensland, Australia
Andrew Bradley
Tel Aviv University, Tel Aviv, Israel
Hayit Greenspan
Universidade Estadual Paulista, Bauru, Brazil
João Paulo Papa
Case Western Reserve University, Cleveland, Ohio, USA
Anant Madabhushi
Instituto Superior Técnico, Lisboa, Portugal
Jacinto C. Nascimento
Universidade do Porto, Porto, Portugal
Jaime S. Cardoso
University of Oxford, Oxford, United Kingdom
Vasileios Belagiannis
University of South Australia, Adelaide, South Australia, Australia
Zhi Lu

About this paper

Cite this paper

Hu, X., Yu, L., Chen, H., Qin, J., Heng, P.-A. (2017). AGNet: Attention-Guided Network for Surgical Tool Presence Detection. In: Cardoso, M., et al. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. DLMIA 2017, ML-CDS 2017. Lecture Notes in Computer Science, vol 10553. Springer, Cham. https://doi.org/10.1007/978-3-319-67558-9_22

DOI: https://doi.org/10.1007/978-3-319-67558-9_22
Published: 09 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67557-2
Online ISBN: 978-3-319-67558-9
eBook Packages: Computer Science, Computer Science (R0)
