
Bunet: An effective and efficient segmentation method based on bilateral encoder-decoder structure for rapid detection of apple tree branches

Published in Applied Intelligence.

Abstract

Automatic apple-harvesting robots have received much research attention in recent years as a way to lower harvesting costs. A fundamental problem for harvesting robots is how to detect branches quickly and accurately, with limited hardware resources, so as to avoid collisions. In this paper, we propose a lightweight, highly accurate, real-time semantic segmentation network, the Bilateral U-shape Network (BUNet), to segment apple tree branches. BUNet consists mainly of a U-shaped detail branch and a U-shaped semantic branch: the former captures spatial details and the latter supplements semantic information. The two U-shaped branches complement each other, retaining the high accuracy of an encoder-decoder backbone while preserving the efficiency and effectiveness of a two-pathway backbone. In addition, a Simplified Attention Fusion Module (SAFM) is proposed to effectively fuse the different levels of information from the two branches for pixel-wise prediction. Experimental results on our self-constructed dataset show that BUNet achieves the highest Intersection over Union (IoU) and F1-score in branch segmentation, 75.96% and 86.34% respectively, with the fewest parameters (0.93M) and 11.94G floating-point operations (FLOPs). Meanwhile, BUNet runs at 110.32 frames per second (FPS) with an input image size of 1280×720 pixels. These results confirm that the proposed method can effectively detect branches and can therefore be used to plan obstacle-avoidance paths for harvesting robots.
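The abstract reports segmentation quality as Intersection over Union (IoU) and F1-score. As a minimal sketch of the standard definitions of these two metrics for binary masks (not the authors' evaluation code; the function name and toy masks below are illustrative):

```python
def iou_and_f1(pred, target):
    """IoU and F1 for flat binary masks (sequences of 0/1), standard definitions."""
    tp = sum(1 for p, t in zip(pred, target) if p and t)       # true positives
    fp = sum(1 for p, t in zip(pred, target) if p and not t)   # false positives
    fn = sum(1 for p, t in zip(pred, target) if t and not p)   # false negatives
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return iou, f1

# Toy example: tp=2, fp=1, fn=0, so IoU = 2/3 and F1 = 0.8
iou, f1 = iou_and_f1([1, 1, 1, 0], [1, 1, 0, 0])
```

For binary segmentation these two metrics are linked by F1 = 2·IoU/(1 + IoU), which is consistent with the reported figures: 2 × 0.7596 / 1.7596 ≈ 0.8634.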


Availability of data and materials

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Correspondence to Zeming Fan.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Shanshan Zhang and Hao Wan contributed equally as co-first authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, S., Wan, H., Fan, Z. et al. Bunet: An effective and efficient segmentation method based on bilateral encoder-decoder structure for rapid detection of apple tree branches. Appl Intell 53, 23336–23348 (2023). https://doi.org/10.1007/s10489-023-04742-x
