Potential Bands of Sentinel-2A Satellite for Classification Problems in Precision Agriculture

Abstract: Various indices are used for assessing vegetation and soil properties in satellite remote sensing applications. Some indices, such as the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI), can readily differentiate crop vitality and water stress. Nowadays, remote sensing capabilities with high spectral, spatial and temporal resolution are available to analyse classification problems in precision agriculture. Many challenges in precision agriculture can be addressed by supervised classification, such as crop type classification and disease and stress (e.g., grass, water and nitrogen) monitoring. Instead of performing classification based on designated indices, this paper explores direct classification using the information of different bands as features. Land cover classification using a recently launched Sentinel-2A image is adopted as a case study to validate our method. Four approaches to feature band selection are compared for classifying five classes (crop, tree, soil, water and road) with the support vector machines (SVMs) algorithm: the first approach uses traditional empirical indices as features, while the other three use, respectively, the specific bands (red, near infrared and short wave infrared) related to the indices, specific bands selected after ranking by mutual information (MI), and the full set of bands of the on-board sensors. It is shown that directly using the bands selected after MI ranking achieves better classification performance than using empirical indices or the specific bands related to the indices, while using all 13 bands only marginally improves the classification accuracy over the MI-based approach. Therefore, this approach is recommended for specific Sentinel-2A image classification problems in precision agriculture.


Introduction
Over the past few decades, satellite remote sensing has played a crucial role in forest monitoring, disaster management and agricultural applications [1][2][3]. Different satellites have different characteristics owing to their customized sensors. Remote sensing images may be produced by optical sensors with a large number of spectral bands and require tailored analysis depending on the specific application. Classification problems in agriculture mainly focus on monitoring crop status, such as crop vigour and water, grass and nitrogen stress, at various crop growth stages. Indices composed of various spectral bands are a very promising approach to extracting useful information for stress monitoring. Some typical indices, such as the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI), have been widely used in areas such as land cover classification and water stress monitoring. NDVI, proposed by Rouse et al., can be used to classify land covers in remote sensing as well as to assess vegetation vitality [4]. This index is defined by the reflectance of the Red band and the near infrared (NIR) band, since they sense very different depths through vegetation canopies. The Red channel is located in the strong chlorophyll absorption region, while the NIR channel lies in a region of high vegetation canopy reflectance [5]. Thus, this index can be applied to classify land covers. NDWI was proposed by Gao [6] to assess water status by combining the NIR and short wave infrared (SWIR) channels, since both are located in the high reflectance plateau of vegetation canopies and sense similar depths in vegetation canopies. Absorption by vegetation liquid water near NIR is negligible, whereas weak liquid absorption is present near SWIR. Therefore, canopy scattering enhances the water absorption.
In the past, much research has been conducted to link these two indices with other indices of interest (e.g., vegetation water content (VWC)) to generate a classification map of land cover or vegetation water status [7][8][9][10].
For remote sensing applications, band information is of paramount importance in satellite data analysis and interpretation. Technical advances in space science and sensor technologies have enabled a new generation of satellites with multispectral sensors, such as Sentinel-2. The launch of Sentinel-2A is a key part of the Global Monitoring for Environment and Security programme supported by the European Space Agency and the European Commission, ensuring better data continuity than other relevant satellites, such as the SPOT and Landsat series, owing to its high spectral, spatial and temporal resolutions [11]. Its multispectral instrument (MSI), shown in Fig. 1, is the key component for information retrieval. The MSI holds an anastigmatic telescope with three mirrors and a pupil diameter of about 150 mm, minimizing thermo-elastic distortions, and its optical design has been optimized to achieve state-of-the-art imaging quality across its 290 km field of view [12]. The MSI features 13 spectral bands ranging from the visible and NIR to SWIR at different resolutions. This configuration was selected as the best compromise between user requirements and mission performance. Four bands at 10 m resolution meet the basic requirements for land classification. Six bands at 20 m resolution provide additional information for vegetation detection. The remaining three bands at 60 m contribute to atmospheric and geophysical parameters [12]. Sentinel-2A has a revisit time of 10 days, and the launch of Sentinel-2B in March 2017 shortened the revisit time to 5 days, which means that the Sentinel-2 series has the shortest revisit time among mainstream freely available satellites to date.
Satellite image processing usually involves image classification (e.g., land cover classification). Ever-increasing computational power and advances in algorithm development are making machine learning algorithms a popular tool in satellite big data applications. For example, support vector machines (SVMs) have been applied to remote sensing problems such as hyperspectral image (HSI) classification from unmanned aerial vehicles and satellite image analysis. In comparison with many existing classifiers, such as neural networks, an SVM classifier can achieve competitive performance even with small training samples [13][14][15][16]. This property is extremely attractive for precision agriculture applications, since obtaining ground truth data is expensive, labour-intensive and time-consuming, involving field surveys and laboratory tests. Therefore, SVMs are selected as the supervised learning tool to analyse the Sentinel-2A image in our study.
Features are vital in image classification. In the aforementioned literature, most of the research focuses on calculating NDVI or NDWI and on their usefulness in land cover classification, water content evaluation, etc., by exploiting specific spectral bands of satellites. Although NDVI and NDWI have been widely used owing to their simplicity and clear physical meaning, several limitations remain. For example, in land cover classification, NDVI usually saturates when vegetation coverage becomes dense (i.e., when the leaf area index (LAI), the one-sided green leaf area per unit ground surface area, reaches around 3) and is then no longer sensitive to vegetation changes [5]. Although NDWI saturates at a later stage than NDVI, it also results in limited performance [6]. Besides, it is generally not easy to determine an appropriate threshold for index-based classification approaches. To avoid the problems of index-based classification approaches and to further explore the potential of the latest capabilities of new satellites, the benefit of using selected or even all spectral bands of Sentinel-2A is investigated using machine learning techniques for land cover classification.
On the other hand, little has been done in the literature to classify Sentinel-2A images using machine learning methods or to explore the benefits of the larger number of spectral bands of this satellite for classification. Consequently, this paper compares different feature selection approaches based on indices and different bands. Four approaches are studied and compared: in the first approach, NDVI and NDWI are treated as the features; in the second approach, the three related bands (Red, NIR, SWIR) are directly adopted; in the third approach, the top seven bands after mutual information (MI) band ranking are applied; finally, in the fourth approach, all 13 bands available on the Sentinel-2A satellite are employed. Confusion matrices are used to analyse the classification results of the four approaches. It is expected that better classification performance can be achieved by directly adopting the selected bands than by only using indices, and that using all 13 bands of Sentinel-2A can improve the classification performance owing to the increased number of bands and, consequently, information. More precisely, the main contributions of this work are summarized as follows.
1) The remote sensing images of the newly launched Sentinel-2A satellite are exploited for land cover classification by using different features with a supervised learning algorithm.
2) It is found that the approach based on bands selected by the MI algorithm achieves higher classification accuracy than the index-based and index-related approaches. It also achieves performance comparable to that of the approach based on all bands available on the Sentinel-2A satellite.
3) Considering the balance between computation time and classification accuracy, the full bands approach can be employed to achieve higher accuracy in a small area. For a large area, band selection using the MI approach is more applicable.
Fig. 1 Multispectral imager view on Sentinel-2A [12]
The remainder of this paper is organized as follows. The problem under consideration is formulated in Section 2, including data sources and problem statement. The methodology is described in Section 3, including overall procedure, ground truth labelling, feature selection and SVM classification algorithm. Classification results are compared in Section 4. Finally, conclusions with future work are drawn in Section 5.

Materials and problem
This part focuses on data acquisition and the statement of the classification problem for the Sentinel-2A satellite image in our case study. The data sources are introduced in Section 2.1, including satellite selection and experimental site selection, and the problem is then formulated in Section 2.2, where the basic problem is briefly stated.

Data sources
Sentinel-2A satellite. Landsat 8 and Sentinel-2A are the most advanced satellites with freely available data for long-term, high-frequency remote sensing applications. The former was launched in 2013 with the operational land imager (OLI) sensor, offering high quality multispectral images at resolutions of 15 m, 30 m and 100 m with a 16-day revisit time [17][18][19]. The latter belongs to the Sentinel-2 series, which consists of Sentinel-2A and Sentinel-2B, each equipped with an MSI capable of acquiring 13 bands of information at different spatial resolutions (10 m, 20 m and 60 m). The band wavelength information for Landsat 8 and Sentinel-2A is listed by central wavelength in Tables 1-2.
It follows from Tables 1 and 2 that, compared with Landsat 8, Sentinel-2A is more attractive owing to its favourable properties, including an increased number of bands, a shorter revisit time and a higher spatial resolution. In particular, Sentinel-2A provides more detail in the NIR and SWIR band ranges, which is helpful for land cover classification in precision agriculture and forest monitoring applications, among many others. A drawback of Sentinel-2A compared with Landsat 8 is the lack of thermal infrared bands. The spectral, spatial and temporal resolutions determine the quality of a spectral image [18]. Consequently, the Sentinel-2A satellite is selected for the remote sensing applications in our study. All Sentinel-2A satellite images can be freely downloaded from Sentinel Hub, which was developed by the European Space Agency (https://scihub.copernicus.eu/). In addition, the freely available satellite analysis software Sentinel Application Platform (SNAP) is provided, which, in comparison with Quantum GIS (QGIS) and the Environment for Visualizing Images (ENVI), is specially customized for the Sentinel series. This software can read all the information that the Sentinel series provides and export any data to other related analysis software in subsequent steps.
Site selection. In supervised learning, ground truth data is the baseline against which different approaches can be evaluated and compared. To study and compare the performance of different land cover classification algorithms, an area where we regularly perform flight tests is chosen as the example site in this paper. The Sentinel-2A remote sensing data for the site of interest can be selected on the aforementioned website and downloaded. The basic information of the chosen field (see Fig. 2), including location, spectral bands, pixel information and cloud cover percentage, is summarized in Table 3.
Following previous literature on NDVI and NDWI calculation for Sentinel-2A [5], Band 4 is chosen as the Red band, and Band 8A and Band 11 are selected as the NIR and SWIR bands, respectively, to achieve better performance.

Problem formulation
The core problem in this study can be formulated as a classification problem, where indices or band information are selected as the features for supervised classifier training and testing. The set of Sentinel-2A satellite image pixels is denoted by $X = \{x_1, x_2, \ldots, x_n\}$, where $n$ denotes the number of pixels and $x_i \in \mathbb{R}^d$ is a pixel vector, with $d$ being the number of bands or indices. Let $\Omega = \{1, 2, \ldots, c\}$ be a set of class labels and $Y = (y_1, y_2, \ldots, y_n)$, with $y_i \in \Omega$, be the classification map corresponding to the labels. Training samples can be generated from the corresponding pixel vectors with $d$ features and a set of labelled data $C$ in the form $C = \{(x_i, y_i)\}_{i=1}^{l}$, with $l$ being the total number of training samples. The training samples are used to train a classifier, and a classification map with the corresponding classification performance is generated. The aim of this study is to evaluate the performance of various classifiers under different sets of features, so that suitable features can be identified for the land cover classification problem under consideration.

Overall procedure
The whole process of land cover classification using satellite remote sensing images can be divided into two stages, pre-processing and data analysis, as shown in Fig. 3. SNAP is used to pre-process the data downloaded from Sentinel Hub and to calculate the related indices. Specific classes can be labelled on the original data, and the NDVI and NDWI data can then be generated and exported from SNAP in Excel format. The data analysis stage is performed in Matlab using the SVMs algorithm with different feature inputs.
Resampling, atmospheric correction and subset selection are necessary when pre-processing satellite images. In particular, resampling ensures that the images of all bands have the same resolution and number of pixels. Subset selection allows specific areas of interest to be re-chosen. The atmospheric correction algorithm is based on the Atmospheric/Topographic Correction for Satellite Imagery by Richter [20]. This method performs atmospheric correction using the libRadtran radiative transfer model, which is run to generate a large look-up table accounting for various atmospheric conditions, solar geometries and ground elevations.
This simplified model runs much faster than a full model when inverting the radiative transfer equation and calculating bottom-of-atmosphere reflectance. All gaseous and aerosol properties of the atmosphere are derived by the algorithm, with aerosol optical thickness and water vapour content derived from the images themselves. SNAP offers a plug-in for atmospheric correction termed Sen2Cor [21]. Atmospheric correction is an integral part of Sentinel-2A satellite image processing. Fig. 4 shows the red, green and blue (RGB) map of the Sentinel-2A data for the selected site after pre-processing.
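As a rough illustration of the resampling step described above, the sketch below brings a coarse band onto a finer grid by nearest-neighbour replication, so that, for instance, a 20 m band matches the 10 m grid (factor 2). The function name `upsample_nearest` and the toy reflectance values are hypothetical, and the resampling method actually configured in SNAP for this study is not stated in the text, so this is only one possible choice.

```python
def upsample_nearest(band, factor):
    """Replicate every pixel of a 2-D band `factor` times along both axes."""
    out = []
    for row in band:
        # Repeat each value horizontally.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically (fresh copies, not aliases).
        out.extend(list(wide) for _ in range(factor))
    return out


# Toy 2 x 2 band at 20 m (made-up reflectances), resampled to a 10 m grid.
band_20m = [[0.10, 0.20],
            [0.30, 0.40]]
band_10m = upsample_nearest(band_20m, 2)
```

After this step every band has the same number of pixels, so the values at one pixel position across all bands can be stacked into a single feature vector.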

Ground truth labelling
Ground truth data is indispensable in supervised learning tasks. In this study, specific areas are labelled using SNAP, because the procedure is more convenient than in other satellite software owing to its compatibility with Sentinel-2A.
The average reflectance over bands for the different classes is shown in Fig. 6. The reflectance differences at different bands lay the foundation for machine learning based classification. In particular, the five classes are clearly distinct in the NIR and SWIR ranges, which makes it possible to classify them with multiple classifiers. The labelled classes on these images can be exported to an Excel file, along with location details and band details.

Feature selection
In this study, four different sets of features are defined, which will lead to four corresponding classifiers. These features are detailed as below.
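NDVI and NDWI. Index-based classification directly treats NDVI and NDWI as features for classifier model construction. As mentioned in Section 2, Band 4, Band 8A and Band 11 are chosen as the Red, NIR and SWIR bands, respectively:

$$\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}} \tag{1}$$

$$\mathrm{NDWI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{SWIR}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{SWIR}}} \tag{2}$$

where $\rho$ denotes the reflectance of the corresponding band. According to (1) and (2), NDVI and NDWI can be calculated easily in SNAP or Matlab. It is noted that although three different bands are involved in NDVI and NDWI, the index-based classifier has features of dimension 2.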
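The index computation can be sketched in a few lines, assuming the standard definitions NDVI = (NIR - Red)/(NIR + Red) and NDWI = (NIR - SWIR)/(NIR + SWIR) with Band 4, Band 8A and Band 11 as Red, NIR and SWIR. The function names and the reflectance values are hypothetical, and returning 0.0 for a zero denominator is an assumption of this sketch rather than anything prescribed by the paper.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def index_features(red, nir, swir):
    """Map one pixel's Red, NIR and SWIR reflectances to (NDVI, NDWI)."""
    ndvi = normalized_difference(nir, red)
    ndwi = normalized_difference(nir, swir)
    return ndvi, ndwi


# Made-up reflectances for a single vegetated pixel.
ndvi, ndwi = index_features(red=0.05, nir=0.45, swir=0.20)
```

Three band values go in and two index values come out, which is the reduction from three dimensions to two that the index-based feature set implies.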
Index related bands. NDVI and NDWI involve the Red, NIR and SWIR bands, which are specifically related to water stress and vegetation vitality in classification problems. To avoid the problems of index-based classification (e.g., saturation under high canopy cover), these three bands are selected as training features to examine whether comparable results can be obtained relative to the index-based approach. For this reason, the features of index-related classification consist of Band 4, Band 8A and Band 11.
Mutual Information based bands. Mutual information is one of the feature scoring algorithms (for feature selection) that calculate a score for each feature to reflect its usefulness for a classification problem [13,22].
There are several scoring algorithms based on various criteria, such as Fisher score [23], minimum redundancy maximum relevance (MRMR) [24], MI and their variants. In this work, the MI approach is employed as the band selection method owing to its simplicity and computational efficiency. In this approach, the MI between each individual spectral band and the five labelled classes is computed, and the bands are ranked according to their MI values. A higher value means a higher relevance. The MI for discrete random variables Y and Z is defined as below:

$$I(Y; Z) = \sum_{y \in Y} \sum_{z \in Z} p(y, z) \log \frac{p(y, z)}{p(y)\, p(z)}$$
where Y denotes the features in supervised learning and Z denotes the class labels. $p(y, z)$ is the joint probability distribution function of Y and Z, and $p(y)$ and $p(z)$ are the marginal probability distribution functions of Y and Z, respectively.
To visually compare the differentiating capability of Band 6 and Band 10, the site maps generated from these two bands are displayed in Fig. 8. It can be seen that the site map of Band 10 is dominated by noise and provides little useful information for land cover classification, while the site map of Band 6 is much clearer and therefore has better classification capability. The reflectance values of Band 6 and Band 10 for the five labelled classes are also given in Fig. 9, which again shows that Band 10 has little discriminating ability. Indeed, the overall classification accuracy using Band 6 alone reaches nearly 0.6, and adding Band 10 as a new feature only marginally improves the performance.
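To make the MI score concrete, the following minimal sketch estimates the MI between one band and the class labels from empirical frequencies. Because reflectance is continuous, the band is first discretized; the equal-width binning, the number of bins, the natural logarithm and all function names are assumptions of this sketch, since the text does not specify how the probabilities were estimated. The toy values are made up.

```python
from collections import Counter
from math import log


def discretize(values, n_bins):
    """Assign each value to one of `n_bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant band
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(y, z):
    """Empirical MI (in nats) between two equally long discrete sequences."""
    n = len(y)
    joint = Counter(zip(y, z))
    count_y = Counter(y)
    count_z = Counter(z)
    mi = 0.0
    for (yi, zi), count in joint.items():
        p_yz = count / n
        mi += p_yz * log(p_yz / ((count_y[yi] / n) * (count_z[zi] / n)))
    return mi


labels = ["crop", "crop", "water", "water"]
informative_band = [0.40, 0.42, 0.05, 0.06]  # separates the two classes
noisy_band = [0.10, 0.30, 0.11, 0.29]        # unrelated to the classes

mi_informative = mutual_information(discretize(informative_band, 2), labels)
mi_noisy = mutual_information(discretize(noisy_band, 2), labels)
```

In this toy case the informative band attains the maximum possible score for two balanced classes (log 2), while the band that varies independently of the labels scores zero, which mirrors the kind of contrast described for Band 6 and Band 10.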
Consequently, to select an appropriate set of features using the MI values, the bands can be added to the feature vector sequentially in descending order of MI value, leading to classifiers with different numbers of features (or bands). A feature set with good accuracy can then be chosen. The classification performance with respect to the number of bands can be analysed through the overall accuracy (the percentage of correctly classified pixels) and the average accuracy (the mean of the percentages of correctly classified pixels for each class) curves (see Fig. 10). By using this simple approach, it is found that the overall classification accuracy obtained with the seven top-ranked bands (Band 6, Band 7, Band 8, Band 8A, Band 9, Band 5 and Band 3) of Sentinel-2A can reach 95%. Consequently, these bands are selected as the features for the MI approach.
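The sequential procedure above can be sketched as follows. The stopping rule (stop at the first subset whose accuracy reaches a target) is an assumption of this sketch, as the text only says that a subset with good accuracy is chosen; `evaluate` is a hypothetical caller-supplied function that would train and test a classifier on the given bands, and the scores and accuracies below are made-up toy numbers, not results from the paper.

```python
def rank_bands(mi_scores):
    """Return band names sorted from highest to lowest MI score."""
    return sorted(mi_scores, key=mi_scores.get, reverse=True)


def first_subset_reaching(mi_scores, evaluate, target):
    """Add bands in MI order until `evaluate(subset)` reaches `target`.

    Returns the chosen subset and the (number of bands, accuracy) history.
    If no subset reaches the target, the full ranked list is returned.
    """
    ranked = rank_bands(mi_scores)
    history = []
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        accuracy = evaluate(subset)
        history.append((k, accuracy))
        if accuracy >= target:
            return subset, history
    return ranked, history


# Toy inputs: hypothetical MI scores and a stand-in for classifier accuracy.
toy_scores = {"B6": 0.9, "B10": 0.1, "B7": 0.8}
toy_accuracy = {1: 0.60, 2: 0.90, 3: 0.91}
subset, history = first_subset_reaching(
    toy_scores, lambda bands: toy_accuracy[len(bands)], target=0.85)
```

The `history` list corresponds to an accuracy-versus-number-of-bands curve of the kind plotted in Fig. 10.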
Full bands. In this approach, all 13 bands available on the Sentinel-2A satellite are used as the training features. This is done to see whether using all bands can further improve the classification performance.

Classifier selection: SVMs
The land cover classification problem can be solved by using supervised learning algorithms. In this paper, supervised classification builds the implicit relationship between the feature vector (from one of the four feature selection approaches) and the target variable (the five class labels) by learning from limited labelled training data. With the trained classification model, predictions can be made on new feature data so that its class label can be determined. To avoid overfitting, the labelled data are usually divided into a training set and a testing set using either hold-out or cross-validation. Different classification algorithms have been developed in the literature, including decision trees, discriminant analysis, SVMs, nearest neighbour and neural networks, to name just a few [13,14,25]. The performance of several of these classification algorithms is compared by employing the two indices (NDVI and NDWI) as features; the comparison results are shown in Table 4.
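The hold-out idea mentioned above can be illustrated with a small sketch that splits labelled samples into training and testing halves. Splitting class by class (stratification), the fixed random seed and the function name are assumptions of this sketch; the text does not say whether the hold-out used in the study was stratified.

```python
import random


def stratified_holdout(labels, test_fraction=0.5, seed=0):
    """Split sample indices into train/test sets, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy label list: six crop samples followed by four water samples.
labels = ["crop"] * 6 + ["water"] * 4
train_idx, test_idx = stratified_holdout(labels)
```

Keeping the class proportions equal in both halves is one way to make sure that small classes are still represented in the test set.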
According to the comparison results, SVMs obtain a relatively high accuracy among all the tested classifiers. According to the literature [13][14][15], SVMs are also quite effective in coping with classification problems with small datasets. In addition, the SVM is a non-parametric statistical learning algorithm, for which no particular assumption needs to be made on the data distribution [26].
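The principle of SVMs is introduced in [13] and is briefly summarized here for the sake of completeness. In this approach, a given training set $\{(x_i, y_i)\}_{i=1}^{l}$, with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$, is projected into a Hilbert space $\mathcal{H}$ (of higher dimension than the original feature space) by adopting a mapping $\Phi$, leading to $\Phi(x_i)$. The optimal hyperplane separates the mapped data by maximizing the margin while minimizing the sum of classification errors, subject to the constraint

$$y_i \left( w \cdot \Phi(x_i) + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, \ldots, l$$

in the following formulation:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + K \sum_{i=1}^{l} \xi_i$$

where the $\xi_i$'s are the so-called slack variables and the constant $K$ is the regularization parameter, which controls the shape of the decision boundary. The optimization problem can be built up and solved by the use of Lagrange multipliers $\alpha_i$:

$$\max_{\alpha} \; \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j), \quad \text{subject to } 0 \leq \alpha_i \leq K, \; \sum_{i=1}^{l} \alpha_i y_i = 0$$

A kernel function $k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ is employed [27] to avoid the explicit computation of the inner products in the transformed space $\mathcal{H}$. The decision rule is formulated by

$$f(x) = \operatorname{sgn}\left( \sum_{i \in S} \alpha_i y_i \, k(x_i, x) + b \right)$$

where $x_i$, $i \in S$, denote the support vectors. Different kernels lead to different SVMs, where the commonly used ones are the polynomial kernel of order $p$, $k(x_i, x_j) = (x_i \cdot x_j + 1)^p$, and the Gaussian kernel $k(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2)$, with $\gamma$ being a parameter inversely proportional to the width of the Gaussian kernel [28,29]. SVMs are a promising approach to the satellite classification problem in our case, although different mechanisms are available for multi-class classification. In this paper, a quadratic SVM with 50% holdout validation is chosen for its simplicity and effectiveness, based on our previous experience; it is implemented in Matlab using the Classification Learner app with built-in functions.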
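The kernel and decision rule described above can be illustrated with a small binary sketch. It assumes the common kernel forms $(u \cdot v + 1)^p$ and $\exp(-\gamma \|u - v\|^2)$, which may differ in detail from those used in the cited references; the support vectors, multipliers and bias below are hand-picked toy values rather than the result of training, and the study itself relied on Matlab's built-in implementation, not on this code.

```python
from math import exp


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, p=2):
    """Polynomial kernel of order p: (u . v + 1) ** p (p = 2 is quadratic)."""
    return (dot(u, v) + 1.0) ** p


def gaussian_kernel(u, v, gamma=1.0):
    """Gaussian kernel: exp(-gamma * ||u - v||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decide(x, support_vectors, labels, alphas, bias, kernel):
    """Binary decision rule: sign of sum_i alpha_i * y_i * k(x_i, x) + b."""
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas)) + bias
    return 1 if score >= 0 else -1


# Hand-picked toy model with two support vectors (not a trained classifier).
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
sv_labels = [1, -1]
alphas = [1.0, 1.0]
bias = 0.0

class_a = decide([2.0, 0.0], support_vectors, sv_labels, alphas, bias,
                 polynomial_kernel)
class_b = decide([0.0, 2.0], support_vectors, sv_labels, alphas, bias,
                 polynomial_kernel)
```

A five-class problem such as the one in this study would combine several such binary decisions, for example in a one-versus-one or one-versus-all scheme.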

Classification results
At first, classification using a single index (i.e., NDVI or NDWI) is performed. NDVI performs well in land cover classification, especially for distinguishing vegetation from non-vegetation areas and vital from non-vital status. NDWI is good at classifying water status at different levels. In our case, there are five classes to be classified; hence, a single NDVI or NDWI index is not a good solution for direct classification. It is found that one index gives very poor results (NDVI classification accuracy with SVMs: 76.2%; NDWI classification accuracy with SVMs: 35.7%), so the result analysis is omitted for lack of space. This is mainly because one feature is not enough for the land cover classification problem with five different classes in this study. Consequently, only the classification methods with relatively satisfactory performance are presented in this paper. In this section, the algorithms discussed in Section 3 are implemented; in particular, the performance of the four feature selection methods is evaluated using confusion matrices (see Figs. 11-14). In the confusion matrix plots, the rows correspond to the predicted class (i.e., output class), and the columns show the true class (i.e., target class). More explanations of the confusion matrix will be given where necessary.
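For readers unfamiliar with this layout, the sketch below builds a confusion matrix with predicted classes on the rows and true classes on the columns, and derives the overall accuracy (fraction of correctly classified samples) and the average accuracy (mean of the per-class accuracies). The function names and the four toy samples are hypothetical, and the sketch assumes that every class has at least one true sample.

```python
def confusion_matrix(predicted, truth, classes):
    """Rows index the predicted class and columns the true class."""
    position = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for p, t in zip(predicted, truth):
        matrix[position[p]][position[t]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def average_accuracy(matrix):
    """Mean of per-class accuracies, each computed over one true-class column."""
    per_class = []
    for j in range(len(matrix)):
        column_total = sum(row[j] for row in matrix)
        per_class.append(matrix[j][j] / column_total)
    return sum(per_class) / len(per_class)


classes = ["crop", "soil"]
truth = ["crop", "crop", "crop", "soil"]
predicted = ["crop", "crop", "soil", "soil"]

cm = confusion_matrix(predicted, truth, classes)
oa = overall_accuracy(cm)
aa = average_accuracy(cm)
```

The two measures differ whenever the classes are unbalanced, which is why both are worth reporting.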

Index based approach
This part mainly focuses on the analysis of NDVI and NDWI based classification. The confusion matrix for this approach is given in Fig. 11, where the diagonal cells in green show the number and percentage of correct classifications. For example, 1 226 samples are correctly classified as crop, corresponding to 45.6% of all samples. The result shows that classification based on the empirical or semi-empirical approach has a relatively high accuracy. This is mainly because NDVI can effectively reflect vegetation status and NDWI is valid for water content evaluation. Both of them can partly capture the main characteristics of the land covers of interest. It should also be noted that the main misclassification is that soil is misclassified as tree and road. This is mainly because there is little chlorophyll in trees in winter, which poses challenges in distinguishing between tree, soil and road.

Index related bands approach
Instead of using empirical or semi-empirical indices in satellite remote sensing as in Section 4.1, the specific relevant bands, namely Red, NIR and SWIR, are directly adopted as features for supervised classification in this part. The classification results are shown in Fig. 12. Unlike index-based classification, where specific mathematical operations are performed on the three bands, a machine learning algorithm can automatically build the relationship between the three bands and the class label by learning from labelled training samples. Comparing the performance of these two approaches, one can see that classification directly using the Red, NIR and SWIR bands is more effective than NDVI and NDWI based classification. Therefore, classifying Sentinel-2A spectral images by using selected bands along with machine learning techniques is an effective approach.

MI selected bands approach
The results obtained using the top bands selected by the MI approach are displayed in Fig. 13. The crop classification by the MI approach is more accurate than that of the index-based approach (only 97.5%) and the index-related band based approach (98.5%). Additionally, the classification accuracies of the other classes are all over 90% (Tree: 92.9%; Soil: 96.2%; Water: 91.9%; Road: 91.4%). Moreover, the overall classification accuracy, shown at the bottom right, is 95.8%, which is higher than that of the index-based approach (87.7%) and the index-related bands based approach (93.2%). This means that by adopting more informative bands of the Sentinel-2A satellite, the land cover classification performance can be improved.

Full bands approach
It should be noted that there are 13 spectral bands on the Sentinel-2A satellite, which provide a great amount of information for remote sensing applications. It is of interest to verify whether the full band information can further improve the performance. To this end, all 13 bands are treated as features for classification in the fourth approach, and the classification results are shown in Fig. 14. It can be seen from Fig. 14 that the overall accuracy increases from 95.8% to 97.9%. Moreover, the misclassification rates between the soil and tree classes are also clearly reduced. This demonstrates that incorporating more related band information can further improve the classification performance, although the improvement is marginal.

Further discussions
Comparing these different classification algorithms with different spectral features, the following observations can be drawn:
1) Classification using index-related bands outperforms the empirical or semi-empirical index-based approach, with the overall accuracy rising from 87.7% to 93.2%. This is mainly due to the one additional dimension of information (i.e., certain information is lost in the reduced-order transformation from three dimensions to two dimensions).
2) Different bands have different differentiating abilities (reflected by the mutual information), and classification using the selected top bands (seven bands in this work) via the MI approach can further improve the classification performance (93.2% → 95.8%). This again is due to the increased information in the additional bands. MI is an effective approach to identifying the most differentiating bands among a large number of features.
3) Classification using all 13 bands available on the Sentinel-2A satellite can further improve the classification performance. However, the marginal performance improvement comes at the expense of using six additional bands in comparison with the MI-based approach. The substantially increased number of bands usually requires extra data transmission and storage, which may not be necessary or desirable for certain applications.
4) In practical applications, in addition to classification accuracy, other performance indices such as data volume and training and classification time should also be considered. In this case, an appropriate number of bands with satisfactory performance may be more desirable, and dimension reduction (e.g., MI-based feature selection) may provide a promising solution to this problem.
Overall, machine learning based classification using the spectral bands of the Sentinel-2A satellite is a promising solution for agricultural remote sensing applications (e.g., land cover classification including crop classification); in particular, the approach based on feature selection using mutual information is recommended.

Conclusions and future work
This paper develops a novel approach to analysing satellite remote sensing images, particularly Sentinel-2A satellite images, using machine learning techniques. Four feature selection methods for the classification problem are studied and compared, namely index-based classification (NDVI, NDWI), index-related band based classification (Band 4, Band 8A, Band 11), MI scored band based classification (Band 6, Band 7, Band 8, Band 8A, Band 9, Band 5 and Band 3) and classification based on all available bands. By using a case study of land cover classification with five classes, it is shown that the method employing all available bands of the Sentinel-2A satellite results in the best performance, while the use of MI scored bands with high relevance also yields quite promising results. Overall, the classification methods that directly use specific relevant bands with supervised learning outperform the classic index-based classification methods. Some limitations of index-based classification can be removed by the direct use of the spectral bands of Sentinel-2A. The proposed method can also be applied to forest vegetation monitoring, vegetation physiological status detection and irrigation decisions [30,31].
Future work in this direction is summarized in the following aspects:
1) In addition to spectral band information, other types of information may also be considered, such as texture information.
2) More advanced classification algorithms can be considered, such as random forests and their variants.