MASPP and MWASP: multi-head self-attention based modules for UNet network in melon spot segmentation

  • Original Paper
  • Journal of Food Measurement and Characterization

Abstract

Sweet melon, and spotted melon in particular, is one of the most profitable fruit crops for farmers in the international market. Because the spot ratio affects a melon’s visual appeal, it plays a significant role in shaping consumers’ first impressions and their decision to purchase a spotted melon. However, accurately determining the spot area on a melon’s skin is challenging because the spots vary widely in size and color among melon types. In this study, novel networks based on the UNet model are proposed to accurately determine the spot area on melon skins after harvest. First, a Mask R-CNN model is employed to isolate the melons from unwanted objects and backgrounds. Then, novel variants of Atrous Spatial Pyramid Pooling (ASPP) and Waterfall Atrous Spatial Pooling (WASP) are developed using multi-head self-attention (MHSA) to enhance the original structures. Finally, the proposed modules are integrated into a VGG16-UNet network to segment the spots on the melon skin. The experimental results show that the proposed methods achieve a mean IoU of 89.86% and an accuracy of 99.45% across all classes, outperforming other existing models.
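The two metrics reported above, mean IoU and accuracy, along with the spot ratio that motivates the study, can be illustrated on a toy example. The following is a minimal sketch using the standard textbook definitions; the class labels (0 = background, 1 = rind, 2 = spot), the function names, and the toy masks are assumptions made for illustration and are not taken from the paper, whose exact evaluation protocol may differ.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class equals the true class."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both masks
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def spot_ratio(labels, spot_class, background_class):
    """Fraction of melon pixels (everything except background) labeled as spot."""
    melon = sum(1 for v in labels if v != background_class)
    spots = sum(1 for v in labels if v == spot_class)
    return spots / melon if melon else 0.0


# Toy flattened masks; hypothetical labels: 0 = background, 1 = rind, 2 = spot.
y_true = [0, 1, 1, 2, 0, 1, 2, 2]
y_pred = [0, 1, 2, 2, 0, 1, 1, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("accuracy:", pixel_accuracy(cm))
print("mean IoU:", mean_iou(cm))
print("predicted spot ratio:", spot_ratio(y_pred, spot_class=2, background_class=0))
```

In this toy case two of the eight pixels are misclassified, yet the mean IoU is noticeably lower than the accuracy. This gap between the two metrics is typical when a large, easy class such as the background dominates the pixel count.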

Data availability

The datasets generated during the current study are available from the corresponding author upon reasonable request.

Funding

This work was supported by the National Science and Technology Council of the Republic of China under Grant NSTC 112-2222-E-032-001 and by Academia Sinica under Grant AS-TP-110-M07.

Author information

Contributions

Conceptualization: Khoa-Dang Tran and Trang-Thi Ho; Methodology: Khoa-Dang Tran and Trang-Thi Ho; Formal analysis and investigation: Trang-Thi Ho, Khoa-Dang Tran, and Yennun Huang; Writing (original draft preparation): Khoa-Dang Tran and Trang-Thi Ho; Writing (review and editing): Yennun Huang, Nguyen Quoc Khanh Le, and Le Quoc Tuan; Supervision: Van Lam Ho. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Trang-Thi Ho.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Research involving human and animal rights

This research did not involve any studies with animal or human participants, nor did it take place in any private or protected areas. No specific permissions were required for the corresponding locations.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tran, KD., Ho, TT., Huang, Y. et al. MASPP and MWASP: multi-head self-attention based modules for UNet network in melon spot segmentation. Food Measure (2024). https://doi.org/10.1007/s11694-024-02466-1


  • DOI: https://doi.org/10.1007/s11694-024-02466-1
