Abstract
Accurate identification of strawberry appearance quality is an important step for robotic picking in the orchard. Convolutional neural networks (CNNs) have greatly advanced computer vision tasks such as fruit identification. However, improving CNN performance generally requires more training time and computation. To overcome these shortcomings, a method named "Swin-MLP", based on the Swin Transformer and a multi-layer perceptron (MLP), is proposed to identify strawberry appearance quality. The proposed method uses the Swin Transformer to extract features from strawberry images and then feeds these features into an MLP for classification. In addition, the performance of the Swin Transformer combined with different classifiers is evaluated. Furthermore, the proposed Swin-MLP method is compared with the original Swin-T and traditional CNN models. The accuracy of the proposed method reaches 98.45%, which is 2.61% higher than that of the original Swin-T model. Swin-MLP requires only 16.79 s of training time, far less than the other models. The experimental results show that Swin-MLP is effective at identifying strawberry appearance quality. The success of the proposed method provides a new solution for strawberry quality identification.
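The two-stage idea described in the abstract (extract features with a backbone, then train a separate lightweight MLP on those features) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic Gaussian feature vectors stand in for Swin Transformer outputs, and every name and hyperparameter here (`make_features`, `TinyMLP`, the layer sizes, the learning rate) is hypothetical and chosen only for illustration.

```python
import math
import random


def make_features(n_per_class, dim, rng):
    """Synthetic stand-in for backbone features: one Gaussian cluster per class."""
    data = []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(centre, 0.5) for _ in range(dim)], label))
    rng.shuffle(data)
    return data


class TinyMLP:
    """One-hidden-layer perceptron with ReLU activation and softmax output."""

    def __init__(self, dim, hidden, classes, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(classes)]
        self.b2 = [0.0] * classes

    def forward(self, x):
        # Hidden layer: affine map followed by ReLU.
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        # Output layer: affine map followed by a numerically stable softmax.
        z = [sum(w * hj for w, hj in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        m = max(z)
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return h, [v / s for v in e]

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the cross-entropy loss."""
        h, p = self.forward(x)
        # Gradient of cross-entropy with respect to the logits.
        dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Backpropagate to the hidden layer before the output weights change.
        dh = [sum(dz[k] * self.w2[k][j] for k in range(len(dz))) if h[j] > 0 else 0.0
              for j in range(len(h))]
        for k, g in enumerate(dz):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * g * hj
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * g * xi
            self.b1[j] -= lr * g

    def predict(self, x):
        _, p = self.forward(x)
        return p.index(max(p))


def accuracy(model, data):
    return sum(model.predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_features(100, 8, rng)   # features the frozen backbone would produce
test = make_features(50, 8, rng)
model = TinyMLP(dim=8, hidden=16, classes=2, rng=rng)
for epoch in range(5):
    for x, y in train:
        model.train_step(x, y, lr=0.05)
test_accuracy = accuracy(model, test)
print(f"held-out accuracy: {test_accuracy:.2%}")
```

Because only the small classifier is trained in a setup like this, training touches far fewer parameters than fine-tuning an entire network, which is consistent with the short training time the abstract reports.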
Acknowledgements
This work was partly supported by the Natural Science Basic Research Program of Shaanxi (Program No. 2022JM-318) and the President's Fund of Xi'an Technological University (No. XGPY200216).
Author information
Contributions
Conceptualization: [HZ]; Data Curation: [HZ, GW]; Formal Analysis: [HZ]; Investigation: [HZ]; Methodology: [HZ]; Resources: [HZ, GW]; Software: [HZ]; Validation: [HZ, GW, XL]; Visualization: [HZ, XL]; Writing – Original Draft Preparation: [HZ]; Funding Acquisition: [GW]; Supervision: [GW]; Writing – Review & Editing: [GW, XL].
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zheng, H., Wang, G. & Li, X. Swin-MLP: a strawberry appearance quality identification method by Swin Transformer and multi-layer perceptron. Food Measure 16, 2789–2800 (2022). https://doi.org/10.1007/s11694-022-01396-0
DOI: https://doi.org/10.1007/s11694-022-01396-0