
Grapevine Varieties Classification Using Machine Learning

  • Conference paper

Progress in Artificial Intelligence (EPIA 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11804)

Abstract

Viticulture has a major impact on the European economy, and over the years intensive grapevine production has led to the proliferation of many varieties. Traditionally, these varieties are catalogued manually in the field, a costly and slow process that is, in many cases, very challenging even for an experienced ampelographer. This article presents a cost-effective, automatic method for grapevine variety classification based on the analysis of leaf images taken with an RGB sensor. The proposed method is divided into three steps: (1) color and shape feature extraction; (2) training; and (3) classification using Linear Discriminant Analysis. The approach was applied to 240 leaf images of three different grapevine varieties acquired in the Douro Valley region of Portugal, and it correctly classified 87% of the grapevine leaves. The method showed very promising classification capabilities considering the challenges posed by the leaves, which had many shape irregularities and, in many cases, high color similarity between varieties. Compared with the manual procedure, the results suggest that the method can serve as an effective alternative for grapevine classification based on leaf features. Since it requires only a simple, low-cost setup, it can easily be integrated into a portable system with real-time processing to assist technicians or other staff without special skills in the field, or be used offline for batch classification.
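The final step of the pipeline is Linear Discriminant Analysis. As an illustrative sketch only (not the authors' implementation, and using synthetic two-dimensional features rather than the paper's leaf descriptors), a two-class Fisher discriminant can be written directly in NumPy:

```python
import numpy as np

def fit_fisher_lda(X0, X1):
    """Fit a two-class Fisher linear discriminant.

    Returns the projection vector w and the decision threshold
    (midpoint of the projected class means)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)   # w ∝ Sw^{-1} (mu1 - mu0)
    threshold = 0.5 * (X0 @ w).mean() + 0.5 * (X1 @ w).mean()
    return w, threshold

def predict(X, w, threshold):
    # Class 1 if the projection falls on the mu1 side of the midpoint.
    return (X @ w > threshold).astype(int)

rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))  # class 0
X1 = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(100, 2))  # class 1
w, t = fit_fisher_lda(X0, X1)
acc = np.concatenate([predict(X0, w, t) == 0,
                      predict(X1, w, t) == 1]).mean()
```

For the paper's three-variety problem a multi-class LDA would be used instead (e.g. scikit-learn's LinearDiscriminantAnalysis), but the projection idea is the same.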


References

  1. Vivier, M.A., Pretorius, I.S.: Genetically tailored grapevines for the wine industry. Trends Biotechnol. 20, 472–478 (2002)

  2. This, P., Lacombe, T., Thomas, M.: Historical origins and genetic diversity of wine grapes. Trends Genet. 22, 511–519 (2006)

  3. Thomas, M.R., Cain, P., Scott, N.S.: DNA typing of grapevines: a universal methodology and database for describing cultivars and evaluating genetic relatedness. Plant Mol. Biol. 25, 939–949 (1994)

  4. Diago, M.P., Fernandes, A.M., Millan, B., Tardaguila, J., Melo-Pinto, P.: Identification of grapevine varieties using leaf spectroscopy and partial least squares. Comput. Electron. Agric. 99, 7–13 (2013)

  5. Fuentes, S., et al.: Automated grapevine cultivar classification based on machine learning using leaf morpho-colorimetry, fractal dimension and near-infrared spectroscopy parameters. Comput. Electron. Agric. 151, 311–318 (2018)

  6. Gutiérrez, S., Tardaguila, J., Fernández-Novales, J., Diago, M.P.: Support vector machine and artificial neural network models for the classification of grapevine varieties using a portable NIR spectrophotometer. PLoS ONE 10, e0143197 (2015)

  7. Gutiérrez, S., Fernández-Novales, J., Diago, M.P., Tardaguila, J.: On-the-go hyperspectral imaging under field conditions and machine learning for the classification of grapevine varieties. Front. Plant Sci. 9, 1102 (2018)

  8. Karasik, A., Rahimi, O., David, M., Weiss, E., Drori, E.: Development of a 3D seed morphological tool for grapevine variety identification, and its comparison with SSR analysis. Sci. Rep. 8, 6545 (2018)

  9. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979)

  10. Bendig, J., et al.: Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int. J. Appl. Earth Obs. Geoinf. 39, 79–87 (2015)

  11. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 4th edn. Pearson (2017)

  12. Rodgers, J.L., Nicewander, W.A.: Thirteen ways to look at the correlation coefficient. Am. Stat. 42, 59–66 (1988)

  13. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)

  14. Yu, H.-F., Huang, F.-L., Lin, C.-J.: Dual coordinate descent methods for logistic regression and maximum entropy models. Mach. Learn. 85, 41–75 (2011)

  15. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2nd edn. Springer, New York (2009). https://doi.org/10.1007/978-0-387-84858-7

  16. Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992)

  17. Breiman, L.: Classification and Regression Trees. Routledge, Boca Raton (2017)

  18. Zhang, H.: The optimality of naive Bayes. In: FLAIRS 2004 Conference (2004)

  19. Guyon, I., Boser, B., Vapnik, V.: Automatic capacity tuning of very large VC-dimension classifiers. In: Advances in Neural Information Processing Systems, vol. 5, pp. 147–155 (1992)

  20. Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, Montreal, vol. 14, pp. 1137–1145 (1995)

  21. Du, J.-X., Wang, X.-F., Zhang, G.-J.: Leaf shape based plant species recognition. Appl. Math. Comput. 185, 883–893 (2007)

  22. Silva, P.F.B., Marçal, A.R.S., da Silva, R.M.A.: Evaluation of features for leaf discrimination. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 197–204. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_23

  23. Pauwels, E.J., de Zeeuw, P.M., Ranguelova, E.B.: Computer-assisted tree taxonomy by automated image recognition. Eng. Appl. Artif. Intell. 22, 26–31 (2009)

  24. Yang, M., Kpalma, K., Ronsin, J.: A survey of shape feature extraction techniques. In: Yin, P.-Y. (ed.) Pattern Recognition, pp. 43–90. IN-TECH (2008)

  25. Ghozlen, N.B., Cerovic, Z.G., Germain, C., Toutain, S., Latouche, G.: Non-destructive optical monitoring of grape maturation by proximal sensing. Sensors 10, 10040–10068 (2010)

  26. Mokhtarian, F., Mackworth, A.K.: A theory of multiscale, curvature-based shape representation for planar curves. IEEE Trans. Pattern Anal. Mach. Intell. 14, 789–805 (1992)

  27. Jalba, A.C., Wilkinson, M.H.F., Roerdink, J.B.T.M.: Shape representation and recognition through morphological curvature scale spaces. IEEE Trans. Image Process. 15, 331–341 (2006)


Acknowledgements

This work is financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project «POCI-01-0145-FEDER-006961», and by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) as part of project UID/EEA/50014/2013.

Author information

Correspondence to António Sousa.

Appendices

Appendix A

Feature

Description

Equation

Eccentricity [21,22,23,24]

Defined as the ratio of the length EA of the major inertia axis of the ROI to the length EB of the minor inertia axis of the ROI

\( E = \frac{EA}{EB} \)

Aspect Ratio [21,22,23,24]

Ratio between the maximum length Dmax and the minimum length Dmin of the minimum bounding rectangle (MBR) of the leaf, holes included

\( AR = \frac{Dmax}{Dmin} \)

Aspect Ratio 2

Same as Aspect Ratio but without considering the holes of the leaf

 

Elongation [22,23,24]

For each point x inside the ROI, the minimal distance d(x, \( \partial \)ROI) to the boundary contour \( \partial \)ROI is computed; Dme denotes the maximal value of this distance over the region

\( D_{me} = \mathop {\max }\limits_{x \in ROI} d\left( {x,\partial ROI} \right) \)

Then elongation is defined as:

\( E = 1 - \frac{{2D_{me} }}{{D_{ROI} }} \)

Solidity [21,22,23,24]

Describes the extent to which the shape is convex or concave

\( S = \frac{{A_{ROI} }}{{A_{CH} }} \)

Isoperimetric Factor [21,22,23,24]

If the closed contour \( \partial \)ROI of length \( L\left( {\partial ROI} \right) \) encloses a region of area \( A\left( {ROI} \right) \), the isoperimetric factor is defined by

\( IF = \frac{{4\pi A\left( {ROI} \right)}}{{L\left( {\partial ROI} \right)^{2} }} \)
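The isoperimetric factor is easy to check on synthetic contours. In the sketch below (an illustrative helper, not from the paper), a polygonal contour's area comes from the shoelace formula: a square gives exactly π/4 ≈ 0.785, and a finely sampled circle approaches the maximum value of 1:

```python
import math

def isoperimetric_factor(pts):
    """IF = 4*pi*A / L^2 for a closed polygon given as (x, y) vertices."""
    n = len(pts)
    area = 0.0
    perim = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        area += x0 * y1 - x1 * y0              # shoelace term
        perim += math.hypot(x1 - x0, y1 - y0)  # edge length
    area = abs(area) / 2.0
    return 4.0 * math.pi * area / perim ** 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
circle = [(math.cos(2 * math.pi * k / 720), math.sin(2 * math.pi * k / 720))
          for k in range(720)]
```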

Maximal Indentation Depth [22, 23]

For each point on the contour of the ROI, the distance to the convex hull is determined, yielding a distance function along the contour. The Maximal Indentation Depth is the maximum of this function

 

Rectangularity [21, 24]

Represents the ratio of \( A_{ROI} \) to the area of the MBR (Minimum Bounding Rectangle)

\( Rect = \frac{{A_{ROI} }}{{D_{max} \times D_{min} }} \)

Convex Perimeter Ratio [25]

Represents the ratio of the ROI perimeter \( P_{ROI} \) to the convex hull perimeter \( P_{CH} \)

\( CPR = \frac{{P_{ROI} }}{{P_{CH} }} \)

Circularity [21, 24]

Defined from all of the boundary points of the ROI

\( C = \frac{\mu R}{\sigma R} \)

Where \( \mu \)R is the mean distance between the centre of the ROI and all of the boundary points, and \( \sigma \)R is the quadratic mean deviation of that distance:

\( \mu R = \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\left\| {\left( {x_{i} ,y_{i} } \right) - \left( {\bar{x},\bar{y}} \right)} \right\|} \)

\( \sigma R = \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\left( {\left\| {\left( {x_{i} ,y_{i} } \right) - \left( {\bar{x},\bar{y}} \right)} \right\| - \mu R} \right)^{2} } \)
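A minimal sketch of this circularity measure (illustrative helpers, not the paper's code): as defined, \( \sigma \)R is a mean squared deviation, so C is not dimensionless, but it still ranks shapes as intended — boundary points of a near-circular ellipse score higher than those of a flattened one:

```python
import math

def boundary_ellipse(a, b, n=360):
    """Sample n boundary points of an axis-aligned ellipse with semi-axes a, b."""
    return [(a * math.cos(2 * math.pi * k / n),
             b * math.sin(2 * math.pi * k / n)) for k in range(n)]

def circularity(pts):
    """C = muR / sigmaR over the boundary points (x_i, y_i)."""
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    d = [math.hypot(x - cx, y - cy) for x, y in pts]
    mu = sum(d) / n                                 # muR: mean radius
    sigma = sum((di - mu) ** 2 for di in d) / n     # sigmaR: mean squared deviation
    return mu / sigma

c_round = circularity(boundary_ellipse(1.0, 0.9))   # nearly circular
c_flat = circularity(boundary_ellipse(1.0, 0.5))    # strongly elongated
```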

 

Sphericity [25]

Represents the ratio between the radius \( R_{in} \) of the incircle of the ROI and the radius \( R_{ex} \) of the excircle of the ROI

\( S = \frac{{R_{in} }}{{R_{ex} }} \)

Entirety

Ratio between the difference \( A_{CH} - A_{ROI} \) and \( A_{ROI} \)

\( Ent = \frac{{A_{CH} - A_{ROI} }}{{A_{ROI} }} \)

Extent

Ratio between \( A_{ROI} \) and the product of the bounding box (BB) width and height

\( Ex = \frac{{A_{ROI} }}{{BB_{Width} \times BB_{height} }} \)

Equiv Diameter

Calculates the diameter of a circle with the same area as the ROI

\( ED = \sqrt {\frac{{4 \times A_{ROI} }}{\pi }} \)

Number Curvatures [24, 26, 27]

Number of corners detected within a 5 × 5 pixel neighbourhood. To use curvature for shape representation, the curvature function K(n) is defined as:

\( K\left( n \right) = \frac{{\dot{x}\left( n \right)\ddot{y}\left( n \right) - \dot{y}\left( n \right)\ddot{x}\left( n \right) }}{{\left( {\dot{x}\left( n \right)^{2} + \dot{y}\left( n \right)^{2} } \right)^{3/2} }} \)

The curvature of a planar curve is therefore computed from its parametric representation. If n is the normalized arc-length parameter s (so that \( \dot{x}^{2} + \dot{y}^{2} = 1 \) and the denominator vanishes), then:

\( K\left( s \right) = \dot{x}\left( s \right)\ddot{y}\left( s \right) - \dot{y}\left( s \right)\ddot{x}\left( s \right) \)

Since the curvature function is computed only from parametric derivatives, it is invariant under rotations and translations. However, the curvature measure is scale dependent, i.e., inversely proportional to the scale. A possible way to achieve scale independence is to normalize this measure by the mean absolute curvature, i.e.,

\( K^{\prime}\left( s \right) = \frac{K\left( s \right)}{{\frac{1}{N}\mathop \sum \nolimits_{s = 1}^{N} \left| {K\left( s \right)} \right|}} \)

where N is the number of points on the normalized contour
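These curvature features can be sanity-checked numerically. The sketch below (an illustrative finite-difference implementation, not the authors' code) evaluates K(n) on a sampled circle of radius R = 2, whose true curvature is 1/R = 0.5 everywhere, so the normalized curvature K′ is 1 at every point:

```python
import math

def curvature(xs, ys):
    """K(n) = (x'·y'' - y'·x'') / (x'^2 + y'^2)^(3/2), using periodic
    central finite differences on a closed sampled contour."""
    n = len(xs)
    def d1(v, i):  # first derivative (central difference, wraps around)
        return (v[(i + 1) % n] - v[(i - 1) % n]) / 2.0
    def d2(v, i):  # second derivative (wraps around)
        return v[(i + 1) % n] - 2.0 * v[i] + v[(i - 1) % n]
    ks = []
    for i in range(n):
        xd, yd = d1(xs, i), d1(ys, i)
        xdd, ydd = d2(xs, i), d2(ys, i)
        ks.append((xd * ydd - yd * xdd) / (xd * xd + yd * yd) ** 1.5)
    return ks

R, n = 2.0, 1000
xs = [R * math.cos(2 * math.pi * k / n) for k in range(n)]
ys = [R * math.sin(2 * math.pi * k / n) for k in range(n)]
ks = curvature(xs, ys)
mean_abs = sum(abs(k) for k in ks) / n
normalized = [k / mean_abs for k in ks]  # K'(s): scale-independent
```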

 

Appendix B

Feature

Description

Equation

Mean

Average of leaf pixel values

 

Deviation

Standard deviation of leaf pixel values

 

Softness

Calculates the smoothness of the image

\( Sft = 1 - \frac{1}{{1 + Deviation^{2} }} \)

Contrast

Returns the average of the measure of the intensity contrast between a pixel and its neighbour over the whole image; also known as variance

\( C = \sum\limits_{i,j} {\left| {i - j} \right|^{2} p\left( {i,j} \right)} \)

Correlation

Returns the average of the measure of how correlated a pixel is to its neighbour over the whole image, where the range is between –1 and 1. Correlation is 1 or –1 for a perfectly positively or negatively correlated image. Correlation is NaN for a constant image

\( Corr = \sum\limits_{i,j} {\frac{{\left( {i - \mu i} \right)\left( {j - \mu j} \right)p\left( {i,j} \right)}}{{\sigma_{i} \sigma_{j} }}} \)

Energy

Returns the average of the sum of squared elements in the GLCM, where the range is between 0 and 1. Energy is 1 for a constant image

The property Energy is also known as uniformity, uniformity of energy, and angular second moment, and is calculated by:

\( En = \sum\limits_{i,j} {p\left( {i,j} \right)^{2} } \)

Homogeneity

Returns the average of the value that measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal, where the range is between 0 and 1. Homogeneity is 1 for a diagonal GLCM

\( Homo = \sum\limits_{i,j} {\frac{{p\left( {i,j} \right)}}{{1 + \left| {i - j} \right|}}} \)
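The GLCM-based measures above can be illustrated with a small NumPy implementation (a sketch under the usual single-offset definition; correlation is omitted here since it is NaN for a constant image). A constant image yields energy and homogeneity of 1 and contrast 0, while a checkerboard with a horizontal offset yields contrast 1:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    P = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1
    return P / P.sum()

def glcm_features(P):
    """Contrast, energy and homogeneity of a normalized GLCM P."""
    i, j = np.indices(P.shape)
    contrast = float(np.sum(np.abs(i - j) ** 2 * P))
    energy = float(np.sum(P ** 2))
    homogeneity = float(np.sum(P / (1.0 + np.abs(i - j))))
    return contrast, energy, homogeneity

flat = np.zeros((8, 8), dtype=int)              # constant image
checker = np.indices((8, 8)).sum(axis=0) % 2    # checkerboard pattern
c0, e0, h0 = glcm_features(glcm(flat, levels=4))
c1, e1, h1 = glcm_features(glcm(checker, levels=2))
```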

Mean Hist

Calculates the average of the approximate probability density of occurrence of the intensity in the histogram

 

Variance Hist

Calculates the variance of the approximate probability density of occurrence of the intensity in the histogram

 

Skewness Hist

Calculates the skewness of the approximate probability density of occurrence of the intensity in the histogram

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right. The skewness of the normal distribution (or any perfectly symmetric distribution) is zero

The skewness of a distribution is defined as:

\( Sk = \frac{{E\left( {x - \mu } \right)^{3} }}{{\sigma^{3} }} \)

where µ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t

 

Kurtosis Hist

Calculates the kurtosis of the approximate probability density of occurrence of the intensity in the histogram

Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3

The kurtosis of a distribution is defined as

\( K = \frac{{E\left( {x - \mu } \right)^{4} }}{{\sigma^{4} }} \)

where μ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t

 

Energy Hist

Calculates the energy of the approximate probability density of occurrence of the intensity in the histogram
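All of these histogram moments can be computed directly from a normalized histogram p(i). A small sketch (illustrative, not the paper's code) on a symmetric five-bin histogram, whose skewness is exactly 0; note that, as in the definition above, kurtosis is not the "excess" form, so a normal distribution would give 3:

```python
import math

def hist_stats(p):
    """Mean, variance, skewness, kurtosis and energy of a normalized
    histogram p[i] defined over intensity levels i = 0..len(p)-1."""
    levels = range(len(p))
    mu = sum(i * p[i] for i in levels)
    var = sum((i - mu) ** 2 * p[i] for i in levels)
    sd = math.sqrt(var)
    skew = sum((i - mu) ** 3 * p[i] for i in levels) / sd ** 3
    kurt = sum((i - mu) ** 4 * p[i] for i in levels) / sd ** 4
    energy = sum(pi ** 2 for pi in p)
    return mu, var, skew, kurt, energy

sym = [0.1, 0.2, 0.4, 0.2, 0.1]        # symmetric about level 2
right = [0.5, 0.3, 0.1, 0.05, 0.05]    # long right tail
mu_s, var_s, skew_s, kurt_s, en_s = hist_stats(sym)
_, _, skew_r, _, _ = hist_stats(right)
```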

 


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Marques, P. et al. (2019). Grapevine Varieties Classification Using Machine Learning. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science(), vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_17


  • DOI: https://doi.org/10.1007/978-3-030-30241-2_17


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30240-5

  • Online ISBN: 978-3-030-30241-2

