Abstract
Sweet melon, and spotted melon in particular, is among the most profitable fruit crops for farmers in the international market. Because the spot ratio affects a melon’s visual appeal, it strongly shapes consumers’ first impressions and their decision to purchase a spotted melon. However, accurately determining the spot area on a melon’s skin is challenging because the spots vary widely in size and color across melon types. In this study, we propose novel networks based on the UNet model to accurately determine the spot area on melon skins after harvesting. First, a Mask R-CNN model was employed to isolate the melons from unwanted objects and backgrounds. Then, novel variants of Atrous Spatial Pyramid Pooling (ASPP) and Waterfall Atrous Spatial Pooling (WASP) were developed using multi-head self-attention (MHSA) to enhance the original structures. Finally, the proposed modules were integrated into a VGG16-UNet network to segment the spots on the melon skin. The experimental results show that the proposed methods achieved a mean IoU of 89.86% and an accuracy of 99.45% across all classes, outperforming other existing models.
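The two reported metrics, and the spot ratio they ultimately serve, can be made concrete with a small sketch. This is not the authors' evaluation code. It assumes hypothetical class labels (0 = background, 1 = skin, 2 = spot) and defines the spot ratio as spot pixels divided by all melon pixels; the abstract specifies neither.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # 2D grid of per-pixel class labels

# Hypothetical label scheme (an assumption, not taken from the paper).
BACKGROUND, SKIN, SPOT = 0, 1, 2


def confusion_matrix(truth: Mask, pred: Mask, num_classes: int) -> List[List[int]]:
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            cm[t][p] += 1
    return cm


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of pixels assigned their true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total


def mean_iou(cm: List[List[int]]) -> float:
    """Average of per-class intersection over union, TP / (TP + FP + FN)."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(len(cm))) - tp
        fn = sum(cm[c]) - tp
        union = tp + fp + fn
        if union:  # skip classes absent from both masks
            ious.append(tp / union)
    return sum(ious) / len(ious)


def spot_ratio(mask: Mask) -> float:
    """Spot pixels over all melon pixels (skin + spot); background is ignored."""
    flat = [label for row in mask for label in row]
    spots = flat.count(SPOT)
    melon = spots + flat.count(SKIN)
    return spots / melon if melon else 0.0


# Toy 3x4 masks: the prediction mislabels one spot pixel as skin.
truth = [[0, 0, 1, 1],
         [0, 1, 2, 2],
         [0, 1, 2, 2]]
pred = [[0, 0, 1, 1],
        [0, 1, 1, 2],
        [0, 1, 2, 2]]

cm = confusion_matrix(truth, pred, num_classes=3)
print(f"pixel accuracy: {pixel_accuracy(cm):.4f}")
print(f"mean IoU:       {mean_iou(cm):.4f}")
print(f"spot ratio (truth / pred): {spot_ratio(truth):.3f} / {spot_ratio(pred):.3f}")
```

In this toy case a single mislabeled pixel lowers mean IoU more than pixel accuracy, which illustrates why an abstract might report both: accuracy is dominated by large classes such as background, while mean IoU weights each class equally.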
Data availability
The datasets generated during the current study are available from the corresponding author upon reasonable request.
Funding
This work was supported by the National Science and Technology Council of the Republic of China under Grant NSTC 112-2222-E-032-001 and by Academia Sinica under Grant AS-TP-110-M07.
Author information
Contributions
Conceptualization: Khoa-Dang Tran and Trang-Thi Ho; Methodology: Khoa-Dang Tran and Trang-Thi Ho; Formal analysis and investigation: Trang-Thi Ho, Khoa-Dang Tran and Yennun Huang; Writing (original draft preparation): Khoa-Dang Tran and Trang-Thi Ho; Writing (review and editing): Yennun Huang, Nguyen Quoc Khanh Le and Le Quoc Tuan; Supervision: Van Lam Ho. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Research involving human and animal rights
This research did not involve any animal or human participants, nor did it take place in any private or protected areas. No specific permissions were required for the corresponding locations.
About this article
Cite this article
Tran, KD., Ho, TT., Huang, Y. et al. MASPP and MWASP: multi-head self-attention based modules for UNet network in melon spot segmentation. Food Measure (2024). https://doi.org/10.1007/s11694-024-02466-1
DOI: https://doi.org/10.1007/s11694-024-02466-1