Addressing multi-label imbalance problem of surgical tool detection using CNN
A fully automated surgical tool detection framework is proposed for endoscopic video streams. State-of-the-art surgical tool detection methods rely on supervised one-vs-all or multi-class classification techniques, completely ignoring the co-occurrence relationship of the tools and the associated class imbalance.
In this paper, we formulate tool detection as a multi-label classification task where tool co-occurrences are treated as separate classes. In addition, the imbalance in tool co-occurrences is analyzed, and stratification techniques are employed to address it during convolutional neural network (CNN) training. Moreover, temporal smoothing is introduced as an online post-processing step to enhance runtime prediction.
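The three ideas above can be illustrated with a minimal sketch. It is not the authors' implementation. The tool names, the per-class split rule, and the causal moving-average smoother are all assumptions chosen for illustration, since the abstract does not specify the exact stratification or smoothing scheme.

```python
from collections import Counter, deque

# Hypothetical tool names, for illustration only.
TOOLS = ["grasper", "hook", "scissors"]


def to_cooccurrence_class(label_vector):
    """Map a binary multi-label vector to one class: the tuple of tools present."""
    return tuple(t for t, present in zip(TOOLS, label_vector) if present)


def class_counts(label_vectors):
    """Count frames per co-occurrence class to expose the imbalance."""
    return Counter(to_cooccurrence_class(v) for v in label_vectors)


def stratified_split(label_vectors, val_fraction=0.25):
    """Split frame indices so each co-occurrence class keeps the same ratio.

    Assumption: a simple per-class split; the paper's scheme may differ.
    """
    by_class = {}
    for idx, v in enumerate(label_vectors):
        by_class.setdefault(to_cooccurrence_class(v), []).append(idx)
    train, val = [], []
    for indices in by_class.values():
        n_val = int(len(indices) * val_fraction)
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)


class OnlineSmoother:
    """Causal moving average over the last `window` per-tool scores.

    Assumption: one possible online smoothing filter, using past frames only.
    """

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, scores):
        self.history.append(list(scores))
        n = len(self.history)
        return [sum(col) / n for col in zip(*self.history)]
```

In use, `class_counts` would be run on the training annotations first to see which tool combinations are rare, `stratified_split` would build the training and validation subsets, and `OnlineSmoother.update` would be called once per incoming frame on the CNN's per-tool scores. Because the smoother looks only at past frames, it remains usable at runtime.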
Quantitative analysis is performed on the M2CAI16 tool detection dataset to highlight the importance of stratification, temporal smoothing and the overall framework for tool detection.
The analysis of tool imbalance, backed by the empirical results, indicates both the need for the proposed framework and its superiority over state-of-the-art techniques.
Keywords: Transfer learning, Surgical tool detection, CNN, Laparoscopic videos, Multi-label learning