Lw-TISNet: Light-Weight Convolutional Neural Network Incorporating Attention Mechanism and Multiple Supervision Strategy for Tongue Image Segmentation

Huang, Xiaodong; Zhuo, Li; Zhang, Hui; Li, Xiaoguang; Zhang, Jing

doi:10.1007/s11220-021-00375-x

Original Paper
Published: 08 January 2022

Sensing and Imaging, Volume 23, article number 6 (2022)

Xiaodong Huang^1,3,
Li Zhuo ORCID: orcid.org/0000-0002-9937-2669^1,2,
Hui Zhang^1,2,
Xiaoguang Li^1,2 &
Jing Zhang^1,2

Abstract

Segmenting the tongue body is an essential step in automated tongue diagnosis, and it is a challenging task because of the tongue body’s specificity and heterogeneity. Current deep-learning-based tongue image segmentation networks are bloated and computationally expensive. In this study, a light-weight segmentation network for tongue images is proposed under the basic encoder-decoder framework, with MobileNet v2 adopted as the backbone network because of its small parameter count and low computational complexity. High-level semantic information and low-level positional information are combined to detect the tongue body’s boundary. Dilated convolutions are applied to the final feature maps of the network to enlarge the receptive field and capture rich global semantic information. An attention mechanism is embedded to re-calibrate the feature maps spatially and channel-wise, enhancing the features that matter for segmentation while suppressing irrelevant ones. Moreover, a supervision output is added to each level of the decoder to guide the network to capture both local and global image features, and all supervision outputs are fused to produce the final segmentation. Quantitative and qualitative results on two tongue datasets indicate that the proposed network achieves competitive performance with a smaller model size and lower computational cost. The proposed method extracts the tongue body accurately and can meet the requirements of practical applications.
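
The abstract names two mechanisms without giving their details: spatial and channel-wise re-calibration of feature maps, and fusion of the per-level supervision outputs. The NumPy sketch below illustrates one plausible reading of each and is not the authors' implementation. The function names, the weight shapes, the additive combination of the two attention branches, and fusion by averaging logits are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def scse_recalibrate(feat, w_reduce, w_expand, w_spatial):
    """Re-calibrate a (C, H, W) feature map channel-wise and spatially.

    Hypothetical weights (assumed shapes, not from the paper):
      w_reduce:  (C // r, C)  bottleneck of the channel branch
      w_expand:  (C, C // r)  expansion back to C channel gates
      w_spatial: (C,)         1x1 convolution producing one gate per pixel
    """
    # Channel branch: global average pool, small MLP, one gate per channel.
    squeezed = feat.mean(axis=(1, 2))                   # (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)       # ReLU, (C // r,)
    channel_gate = sigmoid(w_expand @ hidden)           # (C,)

    # Spatial branch: mix channels at each pixel, one gate per location.
    spatial_gate = sigmoid(
        np.tensordot(w_spatial, feat, axes=([0], [0]))  # (H, W)
    )

    channel_out = feat * channel_gate[:, None, None]
    spatial_out = feat * spatial_gate[None, :, :]
    # Assumption: the two branches are combined by addition.
    return channel_out + spatial_out


def fuse_supervision_outputs(logit_maps):
    """Fuse per-level logit maps of equal shape into one binary mask.

    Assumption: fusion is a plain average of logits followed by a 0.5
    threshold on the sigmoid; the abstract does not state the fusion rule.
    """
    fused = np.mean(np.stack(logit_maps), axis=0)
    return sigmoid(fused) > 0.5


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, H, W, r = 8, 4, 4, 2
    feat = rng.normal(size=(C, H, W))
    out = scse_recalibrate(
        feat,
        rng.normal(size=(C // r, C)),
        rng.normal(size=(C, C // r)),
        rng.normal(size=(C,)),
    )
    print(out.shape)
    mask = fuse_supervision_outputs([rng.normal(size=(H, W)) for _ in range(3)])
    print(mask.dtype, mask.shape)
```

Both gates lie in (0, 1), so each branch can only attenuate a feature; with the additive combination assumed here, a feature that both branches consider important is passed through at close to twice its input magnitude. In a trained network the weights would be learned and each supervision output would also carry its own loss term during training.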

Acknowledgements

This study was supported by the National Natural Science Foundation of China (No. 61871006). The authors thank Dr. Xiaopeng Zheng, Dr. Wenqiang Chen and Dr. Wenjing Chen from Beijing Xuanwu Hospital for their help in capturing the tongue images and for their guidance on Traditional Chinese Medicine.

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
Xiaodong Huang, Li Zhuo, Hui Zhang, Xiaoguang Li & Jing Zhang
Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing, 100124, China
Li Zhuo, Hui Zhang, Xiaoguang Li & Jing Zhang
School of Mechatronic Engineering, Henan University of Science and Technology, Luoyang, 471023, China
Xiaodong Huang

Corresponding author

Correspondence to Li Zhuo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Ethical Approval

This study does not contain any study with human or animal participants.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, X., Zhuo, L., Zhang, H. et al. Lw-TISNet: Light-Weight Convolutional Neural Network Incorporating Attention Mechanism and Multiple Supervision Strategy for Tongue Image Segmentation. Sens Imaging 23, 6 (2022). https://doi.org/10.1007/s11220-021-00375-x

Received: 09 November 2021
Revised: 08 December 2021
Accepted: 24 December 2021
Published: 08 January 2022
DOI: https://doi.org/10.1007/s11220-021-00375-x
