A Cascade Attention Network for Liver Lesion Classification in Weakly-Labeled Multi-phase CT Images
Focal liver lesion classification is important for the diagnosis of liver disease. In clinical practice, the lesion type is usually determined from multi-phase contrast-enhanced CT images. Previous methods for automatic liver lesion classification operate at the lesion level and therefore rely heavily on lesion-level annotations. To reduce the annotation burden, this paper explores automatic liver lesion classification with weakly-labeled CT images (i.e., with only image-level labels). The major challenge is to localize the regions of interest (ROIs) accurately using only coarse image-level annotations and then make the correct lesion classification decision. We propose a cascade attention network that addresses this challenge in two stages. First, a dual-attention dilated residual network (DADRN) generates a class-specific lesion localization map. It incorporates spatial attention and channel attention blocks that capture long-range dependencies in the high-level feature map and help synthesize a more semantically consistent feature map, thereby boosting weakly-supervised lesion localization and classification performance. Second, a multi-channel dilated residual network (MCDRN) embedded with a convolutional long short-term memory (CLSTM) block extracts temporal enhancement information and makes the final classification decision. Experimental results show that our method achieves a mean classification accuracy of 89.68%, which significantly narrows the performance gap between weakly-supervised approaches and their fully supervised counterparts.
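To make the idea of spatial and channel attention over a high-level feature map more concrete, the following is a minimal NumPy sketch. It is not the DADRN implementation: it assumes a generic self-attention formulation with no learned parameters, and the function names (`spatial_attention`, `channel_attention`, `dual_attention`) and the simple summation used to fuse the two branches are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feat):
    """Let every spatial position aggregate features from all positions.

    feat: array of shape (C, H, W). Assumption: no learned projections,
    the raw features act as query, key, and value.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)               # (C, N) with N = H * W
    affinity = softmax(flat.T @ flat, axis=-1)  # (N, N), each row sums to 1
    out = flat @ affinity.T                     # (C, N), weighted sum over positions
    return feat + out.reshape(c, h, w)          # residual connection


def channel_attention(feat):
    """Let every channel aggregate features from all channels."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)               # (C, N)
    affinity = softmax(flat @ flat.T, axis=-1)  # (C, C), each row sums to 1
    out = affinity @ flat                       # (C, N), weighted sum over channels
    return feat + out.reshape(c, h, w)          # residual connection


def dual_attention(feat):
    """Fuse both branches by summation (an assumption made for this sketch)."""
    return spatial_attention(feat) + channel_attention(feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.normal(size=(4, 5, 6))    # hypothetical (C, H, W) feature map
    print(dual_attention(feature_map).shape)
```

Both branches preserve the shape of the input, so a block of this kind can sit between the backbone and the classification head. In a trained network the affinities would normally be computed from learned projections of the features rather than from the raw features.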
Keywords: Liver lesion classification, Channel attention, Spatial attention, Weakly-labeled, Multi-phase CT images, CLSTM
This work was supported in part by the Major Scientific Research Project of Zhejiang Lab under Grant No. 2018DG0ZX01, in part by the Science and Technology Support Program of Hangzhou under Grant No. 20172011A038, and in part by the Grant-in-Aid for Scientific Research from the Japanese Ministry for Education, Science, Culture and Sports (MEXT) under Grant No. 18H03267.