Abstract
The use of hyperspectral imaging systems for studying plant properties, types, and conditions has increased significantly because of their numerous economic benefits. Such systems can also enable automatic identification of plant phenotypes and can underpin a new generation of precision agriculture techniques, for instance, the selective application of plant nutrients to crops, which prevents costly losses to soils and the environmental impact associated with their ingress into watercourses. This paper is concerned with the analysis of hyperspectral images and data for monitoring and classifying plant conditions. A spectral-texture approach based on feature selection and the Markov random field model is proposed to enhance classification and prediction performance, as compared to conventional approaches. Two independent hyperspectral datasets, captured by two proximal hyperspectral instruments with different acquisition dates and exposure times, were used in the evaluation. Experimental results show promising improvements in the discrimination performance of the proposed approach. The study shows that such an approach can shed light on the attributes that can better differentiate plants, their properties, and conditions.
1 Introduction
Hyperspectral imaging (HSI), a branch of multivariate imaging [1], gathers optical properties of a target with several spectral representations using a combination of spectroscopy and imaging technologies [2]. HSI has been utilised in an increasing number of applications, for instance, remote sensing [3], proximal sensing [4], industrial processes [5], medical imaging [6] and chemical processes [7]. Moreover, several configurations have been used to capture hyperspectral images: point, line, area, and single shot scanning [8].
Texture characteristics and spectral information are the fundamental properties of hyperspectral images. Textural information is described by attributes representing the spatial arrangement of grey levels [9]. This information is associated with many image properties such as coarseness, smoothness, orientation, and depth. Spectral information, in contrast, defines the measured spectrum of the corresponding texture images, where each image represents a unique spectral signature [2]. It is worth noting that the spectral information covers one or several parts of the electromagnetic spectrum.
Several spectral analysis techniques have been introduced to analyse hyperspectral images. These techniques have played an important role in several domains such as agriculture, medicine, and industry for many tasks, especially image classification [10]. Since hyperspectral imaging senses a wider range of the electromagnetic spectrum, effective and efficient approaches are needed for analysing the images [10]. These approaches include feature extraction and feature selection, which reduce the dimensionality of hyperspectral images as well as the required processing time, so that only the information relevant to the investigated problem is analysed.
Texture analysis provides insight into texture properties, which form an important basis for recognition and description. Texture analysis techniques can be broadly categorised into statistical, structural, transform-based, and model-based [9]. The first two use statistics of the grey levels and arrangement rules of the grey levels, respectively, to describe the texture. Transform-based techniques use transforms to describe texture properties, while model-based techniques use models and (estimated) model parameters to define textures. Several studies have shown that the Markov random field (MRF) is one of the most powerful models for describing various textures [11]. MRF, in which inter-pixel dependency is modelled probabilistically, has been utilised in many applications such as image de-noising, image compression, image segmentation and super-resolution.
This work focuses on classifying different plant conditions (e.g. stressed vs. normal; diseased vs. healthy) using a spectral-texture approach and compares the results with those of conventional methods, i.e. individual spectral or texture approaches. The relevant features (i.e. spectral wavelengths) along with the texture representation (i.e. estimated MRF parameters) are used in the classification stage. Furthermore, a support vector machine (SVM) is used as the classifier for two reasons [12]: (1) it is considered a state-of-the-art classification algorithm, and (2) it reduces the risk of overfitting and so deals efficiently with the dimensionality problem. The approach has been evaluated on real-world datasets, with promising improvements in discrimination.
The remainder of the paper is organised as follows: the background on feature selection and MRF is reviewed in Sect. 2. Section 3 presents HSI systems, HSI datasets, and the proposed spectral-texture approach. Results and discussions are given in Sect. 4, followed by the concluding remarks in Sect. 5.
2 Background
An overview of feature selection is presented in the first subsection, while the Markov random field (MRF) model is highlighted in the second.
2.1 Feature Selection
HSI systems gather large amounts of information; however, not all the data collected is necessarily relevant to the problem investigated. The problem of high dimensionality can be alleviated by using a feature selection process. Feature selection is the process of choosing a relevant subset of features (in this study, wavelengths) and discarding the remaining ones (e.g. irrelevant and redundant) [13, 14]. The process of feature selection can be described in four steps: search organisation, subset evaluation, stopping criteria, and result validation. The first step is responsible for generating several subsets of features and that includes determining search direction and procedure. The second step involves evaluation of the relevance of the generated subsets, based on certain criteria, in order to select the optimal one (i.e. the one that maximises the evaluation criteria). The last two steps determine when the process should be halted and the significance of the selection parameters to the investigated problem.
Feature selection models can be separated, based on their evaluation criteria, into three categories: filter, wrapper, and embedded [13, 14]. In the filter model, the discrimination capability depends solely on data characteristics, while in the wrapper and embedded models it depends on the mining algorithms used to assess the relevance of the features. The embedded model was introduced to combine the filter and wrapper models, i.e. to rank features based on their data characteristics and evaluate their goodness through classification algorithms. In addition, the filter model can produce acceptable to good performance in a short time, while the wrapper and embedded models are easy to implement.
Various feature selection algorithms have been introduced in the past. The correlation-based feature selection (CFS) [15] algorithm has been shown to be particularly powerful due to its ability to discard irrelevant and redundant features, as well as producing good discrimination compared to other selection algorithms [12]. It uses Shannon’s entropy \(H(\varvec{x})=-\sum _{i=1}^nP(x_i)\log _2P(x_i)\) and information gain \(I(\varvec{x},\varvec{y})=H(\varvec{x})-H(\varvec{x}|\varvec{y})\) to minimise feature bias and then measures the correlation between the features and the classes. The measured correlation is then used to evaluate a feature subset heuristically:
\[Merit_S=\frac{k\,\overline{r_{cf}}}{\sqrt{k+k(k-1)\,\overline{r_{ff}}}}\]
where \(\overline{r_{cf}}\) denotes the average feature-class correlation, \(\overline{r_{ff}}\) represents the average feature-feature correlation, and \(Merit_S\) is the heuristic merit of a subset S containing k features.
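As an illustration of the merit heuristic, the following minimal Python sketch scores feature subsets from discretised values. It assumes that the correlation is measured by symmetrical uncertainty (information gain normalised by the two entropies), as in the CFS formulation of [15]; the function names and the toy data are hypothetical and not taken from the experiments.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(x) in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(x, y):
    """I(x, y) = H(x) - H(x | y) for two equal-length discrete sequences."""
    n = len(x)
    h_x_given_y = 0.0
    for y_val, count in Counter(y).items():
        subset = [xi for xi, yi in zip(x, y) if yi == y_val]
        h_x_given_y += (count / n) * entropy(subset)
    return entropy(x) - h_x_given_y


def symmetrical_uncertainty(x, y):
    """Information gain normalised to [0, 1]; assumed correlation measure."""
    denom = entropy(x) + entropy(y)
    return 2.0 * information_gain(x, y) / denom if denom > 0 else 0.0


def cfs_merit(features, labels):
    """Merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(features)
    r_cf = sum(symmetrical_uncertainty(f, labels) for f in features) / k
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = 0.0
    if pairs:
        r_ff = sum(symmetrical_uncertainty(features[i], features[j])
                   for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Toy data (hypothetical): one informative and one uninformative feature.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]
merit_single = cfs_merit([informative], labels)
merit_mixed = cfs_merit([informative, uninformative], labels)
```

In this toy case, adding the uninformative feature lowers the merit, which is the behaviour the heuristic relies on when discarding irrelevant wavelengths.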
2.2 MRF Model
MRF is an extension of the Markov chain model [16]. It has a set of nodes, each of which corresponds to a variable or set of variables (an example of an MRF neighbour structure and the corresponding parameters is shown in Fig. 1). It is also termed an undirected graph model and is more natural for modelling certain problems, such as spatial statistics and image analysis [17]. Moreover, unlike the directed graph model, it does not require the orientation of the texture features. The main advantages of MRF models compared to directed graph models are: (1) they are symmetric and therefore more natural for certain domains, and (2) discriminative MRF models tend to work better than their directed counterparts because they are normalised globally rather than locally. In contrast, the major disadvantages are: (1) the parameters are less interpretable and (2) parameter estimation (e.g. the maximum likelihood estimate) can be computationally more expensive.
The MRF can be mathematically described using the equivalent Gibbs distribution with regard to the same graph [18]. Let P(x) denote a Gibbs distribution for realisation x, \(\mathcal {N}\) a neighbouring system, \(\varOmega \) a finite lattice, and \(\mathcal {C}\) the set of all possible cliques, i.e. subsets of the lattice consisting of a single pixel or a set of pixels that are neighbours of each other. The distribution can then be represented as:
\[P(x)=\frac{1}{Z}\exp \left( -\frac{U(x)}{T}\right) \]
where T is a constant and stands for temperature; U(x) represents the energy function, which depends only on the clique potentials \(V_C\) on the lattice and can be written as:
\[U(x)=\sum _{C\in \mathcal {C}}V_C(x)\]
Z denotes a normalising constant, also termed the partition function, and is defined as the sum over all possible realisations x:
\[Z=\sum _{x}\exp \left( -\frac{U(x)}{T}\right) \]
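To make the roles of the energy function, the temperature and the partition function concrete, the sketch below enumerates every realisation of a tiny binary lattice and normalises the Gibbs weights exactly. The 2 x 2 lattice, the pairwise potential (equal neighbours lower the energy) and the constants `BETA` and `T` are illustrative assumptions, not the texture model used in this paper; exhaustive enumeration is only feasible for such toy sizes.

```python
import itertools
import math

ROWS, COLS = 2, 2   # tiny lattice so that Z can be enumerated exactly
BETA = 1.0          # hypothetical clique-potential strength
T = 1.0             # temperature


def pair_cliques(rows, cols):
    """Horizontal and vertical neighbour pairs of the lattice."""
    cliques = []
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:
                cliques.append(((i, j), (i, j + 1)))
            if i + 1 < rows:
                cliques.append(((i, j), (i + 1, j)))
    return cliques


def energy(x, cliques, beta=BETA):
    """U(x): sum of clique potentials; equal neighbours lower the energy."""
    return sum(-beta if x[a] == x[b] else beta for a, b in cliques)


def gibbs_distribution(rows=ROWS, cols=COLS, temperature=T):
    """Return all realisations and their probabilities exp(-U/T) / Z."""
    sites = [(i, j) for i in range(rows) for j in range(cols)]
    cliques = pair_cliques(rows, cols)
    configs = [dict(zip(sites, values))
               for values in itertools.product((0, 1), repeat=len(sites))]
    weights = [math.exp(-energy(x, cliques) / temperature) for x in configs]
    z = sum(weights)  # partition function Z
    return configs, [w / z for w in weights]


configs, probs = gibbs_distribution()
```

The two uniform realisations (all zeros and all ones) have the lowest energy and therefore the highest probability under this potential.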
For estimating texture parameters, the least squares (LS) and maximum likelihood (ML) estimates are widely used [11]. The former is simple and has low computational requirements compared to the latter, which is why it is preferred in practice. For the LS estimate, the parameters over a finite lattice \(\varOmega \) can be estimated using the following equation:
where \(x_{i,j}\) represents the middle pixel and \(\beta _m\) denotes the neighbouring pixels that can be represented as:
where u, v represent the horizontal and vertical locations of the neighbouring pixels respectively and col stands for column. It is worth noting that LS is not consistent for non-causal neighbour sets [11, 17]. However, it is preferable to the ML estimate, which is computationally expensive. In addition, an exact ML solution is not always guaranteed, so an alternative objective such as the maximum pseudo-likelihood (MPL), which is itself iterative and computationally expensive, is required.
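A minimal sketch of an LS estimate for a second-order (eight-neighbour) system is given below. It assumes a commonly used Gaussian MRF formulation in which each symmetric pair of neighbours (u, v) and (-u, -v) shares one parameter, so that four parameters are estimated from the interior pixels of a zero-mean image; this may differ in detail from the estimator used in the experiments. The function name and `OFFSETS` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

# Symmetric offset pairs (u, v) of a second-order neighbourhood;
# each pair (u, v) / (-u, -v) shares one parameter.
OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def ls_mrf_parameters(image):
    """Least-squares estimate of the four second-order MRF parameters."""
    x = np.asarray(image, dtype=float)
    x = x - x.mean()  # work with a zero-mean field
    rows, cols = x.shape
    # Middle pixels x[i, j], restricted to the interior of the lattice.
    centre = x[1:rows - 1, 1:cols - 1].ravel()
    # Each column holds x[i+u, j+v] + x[i-u, j-v] for one offset pair.
    q = np.stack(
        [(x[1 + u:rows - 1 + u, 1 + v:cols - 1 + v]
          + x[1 - u:rows - 1 - u, 1 - v:cols - 1 - v]).ravel()
         for u, v in OFFSETS],
        axis=1,
    )
    # Solves (sum q q^T) beta = (sum q x) in the least-squares sense.
    beta, *_ = np.linalg.lstsq(q, centre, rcond=None)
    return beta


# Usage on a synthetic patch (random values stand in for one image band).
rng = np.random.default_rng(0)
patch = rng.normal(size=(32, 32))
beta_hat = ls_mrf_parameters(patch)
```

Applied band by band, such estimates could then be averaged over wavelengths to form a texture feature vector.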
3 Materials and Methods
This section describes the materials and the spectral-texture approach. It first emphasises the specifications of the HSI systems used to capture the hyperspectral datasets and then describes the datasets used in the experiment, followed by the description of the spectral-texture approach.
3.1 HSI Systems and Datasets
Two HSI systems were used to collect the hyperspectral images: The University of Manchester (UoM) HSI system and The University of Bonn (Bonn) HSI system. The key specifications of both systems are given in Table 1. Both systems operate in controlled environments (dark room vs. dark chamber) in order to minimise the effect of unwanted noise. Furthermore, the dynamic range of both systems is managed to prevent saturation. In addition, three images (scene, dark noise, and flat field) are captured by both systems and then used to spectrally normalise the scene images to enhance the quality of the image. More information about both systems can be found in [19, 20].
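The normalisation with dark-noise and flat-field images can be sketched as follows. The exact formula used by the two systems is not stated here, so the sketch assumes the common correction (scene - dark) / (flat - dark), applied per pixel and per band; the function name and the `eps` guard are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def normalise_scene(scene, dark, flat, eps=1e-12):
    """Return (scene - dark) / (flat - dark) per pixel and per band."""
    scene, dark, flat = (np.asarray(a, dtype=float)
                         for a in (scene, dark, flat))
    denom = np.clip(flat - dark, eps, None)  # guard against division by zero
    return (scene - dark) / denom


# Usage on a toy cube of shape (rows, cols, bands) with constant values.
shape = (2, 2, 3)
reflectance = normalise_scene(np.full(shape, 60.0),
                              np.full(shape, 10.0),
                              np.full(shape, 110.0))
```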
Two HSI datasets (called UoM and Bonn for simplicity), captured with different acquisition dates and exposure times, were used for analysis. The scene images of the UoM dataset consisted of six Arabidopsis leaf samples, while the Bonn dataset consisted of four sugar beet leaf samples [20]; in both cases the leaves were placed flat on the sample plate (shown in Fig. 2). The former dataset consists of two normal and four stressed (cold and heat) leaves (top and bottom left: normal; top and bottom middle: cold stress; top and bottom right: heat stress), while the latter consists of four leaves under a single condition, either healthy (control) or unhealthy (Cercospora). In total, 648 samples were extracted from the UoM dataset and divided into two groups: 216 normal samples and 432 stressed samples. The Bonn dataset yielded 196 samples: 98 for each of the control and Cercospora conditions. Only green areas of both datasets were considered for sample extraction. In addition, 50% of the samples were used for training with 10-fold cross validation, and the remaining ones were used for testing.
3.2 Spectral-Texture Approach
The proposed spectral-texture approach is illustrated in Fig. 3 and can be described in four steps: spectral signature extraction, significant wavelength selection, texture parameter estimation, and classification. The spectral signature is extracted from the pixel values of a small leaf region and then averaged over the entire wavelength spectrum. This averaged signature is used to reduce the variation in pixel intensities across the selected leaf region. CFS is used in the wavelength selection step in order to simplify the dataset and select the most significant wavelengths. The LS estimate is used to estimate the second-order parameters of the MRF model, which are then averaged either over the entire wavelength spectrum or over the list of selected wavelengths. In the final step, the selected wavelengths and the estimated texture parameters are combined and passed to a conventional SVM classifier with a radial basis function (RBF) kernel for classification. The SVM uses a quadratic programming routine to solve the following quadratic problem with regard to the training set in order to find the best hyperplane [21]:
\[\min _{\omega ,b,\xi }\ \frac{1}{2}\Vert \omega \Vert ^2+C\sum _{i=1}^{n}\xi _i\quad \text {subject to}\quad y_i(\omega ^{T}\varvec{x_i}+b)\ge 1-\xi _i,\quad \xi _i\ge 0\]
where \(\omega \), b, \(\varvec{x_i}\), \(y_i\), \(\xi _i\) represent the weight vector, bias, training samples, desired class labels, and non-negative slack variables respectively, and n is the number of training samples. Moreover, C is the regularisation parameter; it penalises misclassified samples and thus determines the flexibility of the decision boundary. In this case, the decision function y can be computed using the weight vector as well as the bias:
\[y=\mathrm {sign}(\omega ^{T}\varvec{x}+b)\]
The value of the decision function is \(y\in \{\pm 1\}\), where 1 denotes one class and \(-1\) the other class. It should be mentioned that false positive and false negative errors have to be reduced in order to obtain a good classification result. In addition, the RBF kernel was used to obtain a nonlinear decision boundary and can be defined as the following exponential function:
\[K(\varvec{x_i},\varvec{x_j})=\exp \left( -\gamma \Vert \varvec{x_i}-\varvec{x_j}\Vert ^2\right) \]
where \(\gamma >0\) controls the width of the kernel.
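The quantities involved in the SVM formulation can be illustrated with a short sketch. It evaluates the soft-margin objective and the slack variables for a given linear hyperplane, the sign-based decision function, and an RBF kernel matrix; it does not solve the quadratic programme. The kernel is written with a width parameter `gamma`, which is an assumed parameterisation, and all function names and toy values are hypothetical. NumPy is assumed to be available.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """K(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2) for all row pairs."""
    a = np.atleast_2d(a).astype(float)
    b = np.atleast_2d(b).astype(float)
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def soft_margin_objective(w, b, x, y, c=1.0):
    """0.5 * ||w||^2 + C * sum(xi), xi_i = max(0, 1 - y_i * (w . x_i + b))."""
    margins = y * (x @ w + b)
    slack = np.maximum(0.0, 1.0 - margins)
    return 0.5 * float(w @ w) + c * float(slack.sum()), slack


def decide(w, b, x):
    """Decision function: +1 or -1 depending on the side of the hyperplane."""
    return np.where(x @ w + b >= 0.0, 1, -1)


# Toy example (hypothetical values): the third sample is misclassified.
x = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1, -1, -1])
w, b = np.array([1.0, 0.0]), 0.0
objective, slack = soft_margin_objective(w, b, x, y, c=1.0)
predictions = decide(w, b, x)
kernel = rbf_kernel(x, x, gamma=0.5)
```

The linear form is used here only for readability; with the RBF kernel, the inner products between samples are replaced by kernel evaluations.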
4 Results and Discussions
The experiments assessed the usefulness of the spectral-texture approach in analysing and classifying plant hyperspectral images under different conditions. Both the UoM and Bonn datasets were used in the analysis. The final results were then compared with the classification results of the existing spectral analysis approach, the texture analysis approach, and the combination of all the wavelengths and the estimated texture parameters. 50% of the samples were used for training with 10-fold cross validation and the remaining 50% were used for testing. Table 2 displays the average classification rates of 100 runs with the standard deviations.
What stands out in this table is that the average classification rate of the spectral-texture approach is higher than that of the other approaches (i.e. all wavelengths, the selected wavelengths, the estimated texture parameters, and the combination of all the wavelengths and the estimated texture parameters). Moreover, a positive correlation is found between the selected wavelengths (e.g. 550 nm and 710 nm in the UoM dataset and 513 nm and 698 nm in the Bonn dataset) and the wavelengths used in previous empirical studies [22, 23]. These results suggest that the proposed method is a valid approach for studying and analysing different plant types and plant conditions. Further statistical tests revealed that the improvements are significant compared to the other approaches at a significance level of 1%, especially the wavelengths-texture and texture approaches (p-value\(<10^{-5}\)). On the other hand, the proposed approach tends to be slower than the spectral approach. It should be noted that using a single feature selection algorithm might affect the prediction performance as well as the robustness, since no single feature selection algorithm can deal with all situations. A more robust and effective selection framework was proposed in our previous work [10].
5 Conclusions
This paper presented a spectral-texture approach for the analysis and classification of hyperspectral images of plant types and conditions. The experimental results have shown marked and statistically significant improvements in discrimination performance compared to other approaches. The findings suggest that such an approach is valid and applicable for the study of different plant properties, types, and conditions. Future work can explore the effect of estimated parameters of different orders (e.g. first- and third-order MRF) to identify the optimal neighbouring system, as well as different classification routines, such as novelty detection, to best identify plant properties, types, and conditions. In addition, different computational platforms can be explored to improve the speed of the proposed approach.
References
[1] Geladi, P.L.M., Grahn, H.F., Burger, J.E.: Multivariate images, hyperspectral imaging: background and equipment. In: Techniques and Applications of Hyperspectral Image Analysis, pp. 1–15. John Wiley and Sons, Ltd. (2007)
[2] ElMasry, G., Sun, D.W.: Principles of Hyperspectral Imaging Technology. In: Sun, D.W. (ed.) Hyperspectral Imaging for Food Quality Analysis and Control, pp. 3–43. Academic Press, San Diego (2010)
[3] Campbell, J., Wynne, R.: Introduction to Remote Sensing, 5th edn. Guilford Publications, New York (2011)
[4] Liu, H., Lee, S.H., Chahl, J.S.: Development of a proximal machine vision system for off-season weed mapping in broadacre no-tillage fallows. J. Comput. Sci. 9(12), 1803–1821 (2013)
[5] Duchesne, C., Liu, J., MacGregor, J.: Multivariate image analysis in the process industries: a review. Chemometr. Intell. Lab. Syst. 117, 116–128 (2012)
[6] Lu, G., Fei, B.: Medical hyperspectral imaging: a review. J. Biomed. Opt. 19(1), 010901 (2014)
[7] Geladi, P., Bengtsson, E., Esbensen, K., Grahn, H.: Image analysis in chemistry I. Properties of images, greylevel operations, the multivariate image. TrAC Trends Anal. Chem. 11(1), 41–53 (1992)
[8] Qin, J.: Hyperspectral Imaging Instruments. In: Sun, D.W. (ed.) Hyperspectral Imaging for Food Quality Analysis and Control, pp. 129–172. Academic Press, San Diego (2010)
[9] Bharati, M.H., Liu, J., MacGregor, J.F.: Image texture analysis: methods and comparisons. Chemometr. Intell. Lab. Syst. 72(1), 57–71 (2004)
[10] AlSuwaidi, A., Veys, C., Hussey, M., Grieve, B., Yin, H.: Hyperspectral feature selection ensemble for plant classification. In: Hyperspectral Imaging and Applications (HSI 2016), October 2016
[11] Yin, H., Allinson, N.M.: Self-organised parameter estimation and segmentation of MRF model-based texture images. In: Proceedings of the IEEE International Conference on Image Processing, ICIP 1994, vol. 2, pp. 645–649. IEEE (1994)
[12] AlSuwaidi, A., Veys, C., Hussey, M., Grieve, B., Yin, H.: Hyperspectral selection based algorithm for plant classification. In: 2016 IEEE International Conference on Imaging Systems and Techniques (IST), pp. 395–400, October 2016
[13] Liu, H., Motoda, H.: Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, Norwell (1998)
[14] Liu, H., Yu, L.: Toward integrating feature selection algorithms for classification and clustering. IEEE Trans. Knowl. Data Eng. 17(4), 491–502 (2005)
[15] Hall, M.A., Smith, L.A.: Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper. In: Proceedings of the Twelfth International Florida Artificial Intelligence Research Society Conference, pp. 235–239 (1999)
[16] Blake, A., Kohli, P., Rother, C.: Markov Random Fields for Vision and Image Processing. The MIT Press, Cambridge (2011)
[17] Murphy, K.P.: Machine Learning: A Probabilistic Perspective. The MIT Press, Cambridge (2012)
[18] Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6(6), 721–741 (1984)
[19] Foster, D.H., Amano, K., Nascimento, S.M.C.: Color constancy in natural scenes explained by global image statistics. Vis. Neurosci. 23(3–4), 341–349 (2006)
[20] Mahlein, A.K., Hammersley, S., Oerke, E.C., Dehne, H.W., Goldbach, H., Grieve, B.: Supplemental blue LED lighting array to improve the signal quality in hyperspectral imaging of plants. Sensors 15(6), 12834–12840 (2015)
[21] Kulkarni, S., Harman, G.: An Elementary Introduction to Statistical Learning Theory, 1st edn. Wiley Publishing, New Jersey (2011)
[22] Gitelson, A., Merzlyak, M.N.: Spectral reflectance changes associated with autumn senescence of Aesculus hippocastanum L. and Acer platanoides L. leaves. Spectral features and relation to chlorophyll estimation. J. Plant Physiol. 143(3), 286–292 (1994)
[23] Mahlein, A.K., Rumpf, T., Welke, P., Dehne, H.W., Plümer, L., Steiner, U., Oerke, E.C.: Development of spectral indices for detecting and identifying plant diseases. Remote Sens. Environ. 128, 21–30 (2013)
© 2017 Springer International Publishing AG
AlSuwaidi, A., Grieve, B., Yin, H. (2017). Towards Spectral-Texture Approach to Hyperspectral Image Analysis for Plant Classification. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2017. IDEAL 2017. Lecture Notes in Computer Science(), vol 10585. Springer, Cham. https://doi.org/10.1007/978-3-319-68935-7_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68934-0
Online ISBN: 978-3-319-68935-7
eBook Packages: Computer Science (R0)