
1 Introduction

Because a hyperspectral image (HSI) has hundreds of spectral bands, it offers a higher spectral resolution than other kinds of images, which improves its ability to distinguish different materials [1, 2]. Each pixel in an HSI corresponds to a spectral curve, so HSI classification amounts to assigning each pixel a land-cover class based on its spectral information [3]. However, the high dimensionality of HSI data may cause the Hughes phenomenon [4], which poses a major challenge to HSI classification.

Sparse representation (SR), a useful tool for high-dimensional signal processing, has been applied to HSI classification in the last few years, and related studies [7,8,9,10] have demonstrated what SR-based methods can achieve. In 2011, Yi Chen first applied the sparse representation classification (SRC) approach from other fields to HSI classification [5]. In the same work, the author proposed a joint sparse representation classification (JSRC) method rooted in the assumption that HSI pixels in a small area consist of similar material (the same class). Although JSRC performs better than SRC, not all HSI pixels satisfy its assumption: when a pixel lies at the edge of a region, its neighboring pixels cannot be guaranteed to be homogeneous. To reduce the interference of heterogeneous pixels in the neighborhood, Zhang [6] proposed a non-local weighted joint sparse representation classification (NLW-JSRC) method. By assigning different weights to the pixels in the search window according to their similarity to the central pixel, NLW-JSRC alleviates this problem of JSRC. However, its weight calculation does not fully consider the spatial information of the HSI and therefore cannot assign appropriate weights to the pixels in the search window.

To effectively reduce the interference of heterogeneous pixels in the search window, we propose an adaptive and rotated non-local weighted joint sparse representation classification (ARW-JSRC) method. Compared with NLW-JSRC, the proposed method assigns a more appropriate weight to every pixel in the search window. It uses a rotation transformation strategy to measure the similarity between pixels in the search window, so as to make full use of the spatial information of the image. A new weight calculation method then gives each pixel in the search window a more appropriate weight. The adaptive threshold in the weight formula is obtained as the median (midpoint) of the maximum and minimum spectral angles between the training samples of the various classes.

The remainder of this paper is organized as follows. The non-local weighted joint sparse representation classification is described in Sect. 2. Then, the adaptive and rotated non-local weighted joint sparse representation classification is introduced in Sect. 3. The experimental results and discussion are presented in Sect. 4. The conclusion is shown in Sect. 5.

2 Related Works

2.1 JSRC

In JSRC, it is assumed that all neighboring HSI pixels in a small area can be approximately represented by linear combinations of a few common atoms with different coefficients. For any test HSI pixel \(\varvec{x}_i(i=1,2,\cdots ,N)\), let the size of its search window be \(\sqrt{S}\times \sqrt{S}\). The joint signal matrix \( \varvec{X}=[\varvec{x}_{1}, \varvec{x}_{2}, \dots , \varvec{x}_S]\) can then be represented as

$$\begin{aligned} \varvec{X}&=[\varvec{x}_{1} \ \varvec{x}_{2} \ \dots \ \varvec{x}_S]\nonumber \\ {}&=[\varvec{D}\varvec{\alpha }_{1} \ \varvec{D}\varvec{\alpha }_{2}\ \dots \ \varvec{D}\varvec{\alpha }_{S}]\nonumber \\&=\varvec{D}\underbrace{[\varvec{\alpha }_1\ \dots \ \varvec{\alpha }_i\ \dots \ \varvec{\alpha }_S]}_{\varvec{\varPhi }}\nonumber \\ {}&=\varvec{D}\varvec{\varPhi }, \end{aligned}$$
(1)

where \(\varvec{\varPhi }\) is the sparse coefficient matrix with few nonzero rows and N is the number of test pixels in the HSI. \(\varvec{D}\) is the over-complete dictionary consisting of training pixels randomly selected from all classes in the HSI in a certain proportion. Given the over-complete dictionary \(\varvec{D}\) and the joint signal matrix \(\varvec{X}\), the sparse matrix \(\varvec{\varPhi }\) can be obtained as follows:

$$\begin{aligned} \varvec{\varPhi }=\arg \min \Vert \varvec{X}-\varvec{D}\varvec{\varPhi }\Vert _{F}\quad \mathrm{s.t.}\ \Vert \varvec{\varPhi }\Vert _{row,0}\le K, \end{aligned}$$
(2)

where \(\Vert \varvec{\varPhi }\Vert _{row,0}\) denotes the number of nonzero rows of \(\varvec{\varPhi }\) and K is the sparsity level. The problem (2) can be solved by the simultaneous orthogonal matching pursuit (SOMP) algorithm [11, 12]. Once the sparse matrix \(\varvec{\varPhi }\) is obtained, the test pixel \(\varvec{x}_i\) can be labeled as follows:

$$\begin{aligned} Class(\varvec{x}_i)&=\arg \min _{m=1,2,\dots ,M}\Vert \varvec{X}-\varvec{D}^m\varvec{\varPhi }^m\Vert _F,\nonumber \\ i&=1, 2, \dots , N, \end{aligned}$$
(3)

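where M is the number of classes in the HSI, \(\varvec{D}^m\) is the sub-dictionary built from the HSI pixels randomly selected from the mth class, and \(\varvec{\varPhi }^m\) contains the rows of \(\varvec{\varPhi }\) associated with \(\varvec{D}^m\).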
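The JSRC pipeline of (2) and (3) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names `somp` and `jsrc_classify` and the `atom_labels` array (the class of each dictionary column) are hypothetical, and the atom-selection rule (largest row-wise \(\ell _2\) correlation with the joint residual) is one common SOMP variant.

```python
import numpy as np

def somp(D, X, K):
    """Greedy SOMP sketch: select K atoms shared by all columns of X.

    D: (B, n_atoms) dictionary, X: (B, S) joint signal matrix.
    Returns Phi of shape (n_atoms, S) with at most K nonzero rows.
    """
    n_atoms = D.shape[1]
    support = []
    residual = np.asarray(X, dtype=float).copy()
    coef = np.zeros((0, X.shape[1]))
    for _ in range(K):
        # Correlation of every atom with the joint residual (row-wise l2 norm).
        corr = np.linalg.norm(D.T @ residual, axis=1)
        if support:
            corr[support] = -np.inf  # never select the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit of X on the atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], X, rcond=None)
        residual = X - D[:, support] @ coef
    Phi = np.zeros((n_atoms, X.shape[1]))
    Phi[support, :] = coef
    return Phi

def jsrc_classify(D, atom_labels, X, K):
    """Label the central pixel by the smallest class-wise residual, as in (3)."""
    Phi = somp(D, X, K)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(X - D[:, atom_labels == m] @ Phi[atom_labels == m, :])
        for m in classes
    ]
    return classes[int(np.argmin(residuals))]
```

Here `np.linalg.norm` applied to a matrix returns the Frobenius norm, which matches the residual used in (3).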

2.2 NLW-JSRC

The author of [6] proposed the NLW-JSRC method, which assigns appropriate weights to all neighboring pixels in the search window based on their similarity to the central test pixel. The initial weight \(w'_{ij}\) is obtained by

$$\begin{aligned} w'_{ij}=(1-(\frac{\Vert J(\varvec{x}_i)-J(\varvec{x}_j)\Vert }{\rho })^2)^2, \end{aligned}$$
(4)

where \(\Vert J(\varvec{x}_i)-J(\varvec{x}_j)\Vert \) denotes the similarity measure (Euclidean distance) between two HSI patches of size \(so \times so\), centered at the test pixel \(\varvec{x}_i\) and the neighboring pixel \(\varvec{x}_j\), respectively. The parameter \(\rho \) is \(\max (\Vert J(\varvec{x}_i)-J(\varvec{x}_j)\Vert )\). The weight \(w'_{ij}\) is then modified as follows:

$$\begin{aligned} w_{ij}=\left\{ \begin{array}{lcl} 0 &{}0<w'_{ij}<w'_1\\ w'_{ij} &{}w'_1<w'_{ij}<w'_2\\ 1 &{}w'_2<w'_{ij}<1 \end{array} \right. , \end{aligned}$$
(5)

where \(w'_1\) and \(w'_2\) are two parameters applied to judge valid and invalid neighboring pixels.
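The two-step weighting of (4) and (5) can be illustrated with a short NumPy sketch. The function name `nlw_weights` is hypothetical, the input is assumed to be the vector of patch distances \(\Vert J(\varvec{x}_i)-J(\varvec{x}_j)\Vert \) for one search window, and the treatment of values exactly equal to a threshold (which (5) leaves open) is an assumption.

```python
import numpy as np

def nlw_weights(dists, w1, w2):
    """Weights of (4)-(5) from the patch distances to the central patch."""
    dists = np.asarray(dists, dtype=float)
    rho = dists.max()                      # rho is the largest distance in the window
    if rho == 0:
        # Assumption: all patches are identical, so every pixel gets full weight.
        return np.ones_like(dists)
    w_raw = (1.0 - (dists / rho) ** 2) ** 2   # Eq. (4)
    w = w_raw.copy()
    # Eq. (5); ties at the thresholds are assigned to the outer branches (assumption).
    w[w_raw <= w1] = 0.0   # invalid neighbors are discarded
    w[w_raw >= w2] = 1.0   # highly similar neighbors get full weight
    return w
```

With the thresholds 0.14 and 0.88 reported for NLW-JSRC in Sect. 4, a neighbor at half the maximum distance keeps its raw weight \((1-0.25)^2=0.5625\), while the farthest neighbor is always discarded.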

With the joint consideration of (2) and (5), we get:

$$\begin{aligned} \varvec{\varPhi }_{NLW}=&\arg \min \Vert \varvec{X}\varvec{W}_{NLW}-\varvec{D}\varvec{\varPhi }\Vert _F, \nonumber \\&s.t.\Vert \varvec{\varPhi }\Vert _{row,0}\le K, \end{aligned}$$
(6)

where \(\varvec{W}_{NLW}=diag(w_{i1}, w_{i2}, \dots , w_{iS})\) is the non-local weighted matrix, and each weight is given by (5). The sparse coefficient matrix \(\varvec{\varPhi }_{NLW}\) can be obtained in the same way as in JSRC. Finally, the label of the test pixel \(\varvec{x}_i\) is given by minimizing the residual:

$$\begin{aligned} Class(\varvec{x}_i)&=\arg \min _{m=1,2,\dots ,M}\Vert \varvec{X}\varvec{W}_{NLW}-\varvec{D}^m\varvec{\varPhi }_{NLW}^m\Vert _F,\nonumber \\ i&=1, 2, \dots , N. \end{aligned}$$
(7)

3 Adaptive and Rotated Weighted Joint Sparse Representation Classification

NLW-JSRC does not consider the directionality of the HSI spatial structure, and its Tukey weight function fails to give appropriate weights. To make up for these deficiencies, the adaptive and rotated weighted joint sparse representation classification (ARW-JSRC) is proposed. The proposed method consists of three parts: the spectral angle, the rotated similar window, and the weight function.

3.1 Spectral Angle

In this paper, the spectral angle is used to measure the similarity between HSI pixels. Suppose that a search window of size \(\sqrt{S}\times \sqrt{S}\) is centered at the HSI pixel \(\varvec{x}_i(i=1,2,\cdots ,N)\) and that \(\varvec{x}_j(j=1,2,\cdots ,S)\) is one of the HSI pixels in this search window. The similarity between \(\varvec{x}_i\) and \(\varvec{x}_j\) can then be written as

$$\begin{aligned} \theta _{ij}=\theta ( \bar{\varvec{x}}_i, \bar{\varvec{x}}_j)=\frac{180^{\circ }}{\pi }\arccos \frac{ \bar{\varvec{x}}_i\cdot \bar{\varvec{x}}_j}{\Vert \bar{\varvec{x}}_i\Vert _2\Vert \bar{\varvec{x}}_j\Vert _2}, \theta _{ij}\in [0^\circ ,90^\circ ], \end{aligned}$$
(8)

where \(\bar{\varvec{x}}_i\) denotes the average of the HSI pixels in the similar window of size \(\sqrt{s}\times \sqrt{s}\) centered at \(\varvec{x}_i\). It can be written as \(\bar{\varvec{x}}_i=\frac{1}{s}\sum ^{s}_{n=1}\varvec{x}_n(i=1,2,\cdots ,N)\), where \(\varvec{x}_n\) is one of the HSI pixels in the similar window. Likewise, \(\bar{\varvec{x}}_j\) is the average of the HSI pixels in the similar window of size \(\sqrt{s}\times \sqrt{s}\) centered at \(\varvec{x}_j\).
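As an illustration of (8), the following NumPy sketch computes the mean spectrum of a similar window and the spectral angle in degrees. The function names, the `(rows, cols, bands)` cube layout, and the clipping of the window at the image border are assumptions made for the example.

```python
import numpy as np

def window_mean(cube, row, col, half):
    """Mean spectrum of the (2*half+1) x (2*half+1) window centered at (row, col).

    cube has shape (rows, cols, bands). The window is clipped at the image
    border, which is an assumption of this sketch.
    """
    r0, r1 = max(row - half, 0), min(row + half + 1, cube.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, cube.shape[1])
    return cube[r0:r1, c0:c1, :].reshape(-1, cube.shape[2]).mean(axis=0)

def spectral_angle_deg(a, b):
    """Spectral angle of Eq. (8) between two spectra, in degrees."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

For non-negative spectra the cosine is non-negative, so the angle falls in \([0^\circ ,90^\circ ]\) as stated in (8).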

The deficiency of NLW-JSRC cannot be overcome if the similarity between \(\varvec{x}_i\) and \(\varvec{x}_j\) is measured directly as above, because this calculation does not consider the directionality of the HSI spatial structure. The rotated similar window strategy is therefore introduced in the next subsection.

3.2 Rotated Similar Window Strategy

NLW-JSRC does not consider the directionality of the HSI spatial structure and only calculates the Euclidean distance between two HSI patches. Considering the redundancy of image spatial information, we apply the rotated similar window technique instead of the traditional window technique to measure the similarity between a neighboring pixel and the central pixel. The rotated window method looks for the most similar structure by rotating the HSI patches, so that the similarity between the neighboring pixel and the central pixel can be estimated more accurately. Fig. 1 illustrates the process.

Fig. 1. The rotation measurement of similarity. The rotated similar window consists of the central pixel and its 8 neighboring pixels. Blocks (a)–(d) are obtained by rotating the original HSI block by \(0^{\circ }\), \(90^{\circ }\), \(180^{\circ }\), and \(270^{\circ }\), respectively, and blocks (e)–(h) by flipping it upside down, left to right, diagonally, and anti-diagonally, respectively. Although (b)–(h) have the same spatial structure as the original HSI block, they have low similarity to it under a direct comparison, because every block except (a) differs from the original in orientation. Finding the most similar structure by rotating the HSI block is therefore important for a more accurate similarity measurement.

The specific process is described mathematically as follows. Let \(\phi (\varvec{x}_i)\) and \(\phi (\varvec{x}_j)\) denote the similar windows centered at the test HSI pixel \(\varvec{x}_i\) and at \(\varvec{x}_j\), respectively, where \(\varvec{x}_j\) is one of the HSI pixels in the search window centered at \(\varvec{x}_i\). Then the structure \(\hat{\phi }(\varvec{x}_j)\) of \(\varvec{x}_j\) that is most similar to \(\phi (\varvec{x}_i)\) can be obtained by

$$\begin{aligned} \hat{\phi }(\varvec{x}_j)=R_{\hat{k}}[\phi (\varvec{x}_j)],\quad \hat{k}=\arg \min _{k=0,1,\cdots ,7}\Vert \phi (\varvec{x}_i)-R_k[\phi (\varvec{x}_j)]\Vert _F, \end{aligned}$$
(9)

where \(R_k[\bullet ]\) denotes the kth rotation or flip operation.

Given \(\hat{\phi }(\varvec{x}_j)\) and \(\phi (\varvec{x}_j)\), their residuals \(r_{min}\) and \(r_o\) with respect to \(\phi (\varvec{x}_i)\) can be expressed as

$$\begin{aligned} r_{min}=\Vert \hat{\phi }(\varvec{x}_j)-\phi (\varvec{x}_i)\Vert _F, \end{aligned}$$
(10)
$$\begin{aligned} r_o=\Vert \phi (\varvec{x}_j)-\phi (\varvec{x}_i)\Vert _F. \end{aligned}$$
(11)

Once the residuals \(r_{min}\) and \(r_o\) are obtained, a direction coefficient O can be computed. The coefficient O revises the spectral angle obtained by (8), which improves the measurement of similarity between HSI pixels. It is given by

$$\begin{aligned} O=\frac{r_{min}}{r_o}. \end{aligned}$$
(12)

Then the revised spectral angle \(\hat{\theta }_{ij}\) between the test pixel \(\varvec{x}_i\) and any HSI pixel \(\varvec{x}_j\) in search window centered at \(\varvec{x}_i\) can be written as

$$\begin{aligned} \hat{\theta }_{ij}=\hat{\theta }(\bar{\varvec{x}}_i,\bar{\varvec{x}}_j)=\theta (\bar{\varvec{x}}_i,\bar{\varvec{x}}_j)\times O. \end{aligned}$$
(13)
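The rotated similar window of (9)–(13) can be sketched as follows. The eight candidates follow the description of Fig. 1 (four rotations and four flips); the function names, the `(h, h, bands)` patch layout, and the convention \(O=1\) when \(r_o=0\) are assumptions of this sketch, since the text does not cover the case of identical patches.

```python
import numpy as np

def candidate_transforms(patch):
    """The 8 rotations/flips R_k of a square patch of shape (h, h, bands)."""
    t = np.transpose(patch, (1, 0, 2))       # flip about the main diagonal
    return [
        patch,                                # rotation by 0 degrees (identity)
        np.rot90(patch, 1, axes=(0, 1)),      # rotation by 90 degrees
        np.rot90(patch, 2, axes=(0, 1)),      # rotation by 180 degrees
        np.rot90(patch, 3, axes=(0, 1)),      # rotation by 270 degrees
        patch[::-1, :, :],                    # flip upside down
        patch[:, ::-1, :],                    # flip left to right
        t,                                    # diagonal flip
        np.rot90(t, 2, axes=(0, 1)),          # anti-diagonal flip
    ]

def direction_coefficient(patch_i, patch_j):
    """O = r_min / r_o of Eq. (12) for two patches of equal shape."""
    r_o = np.linalg.norm(patch_j - patch_i)                      # Eq. (11)
    r_min = min(np.linalg.norm(c - patch_i)
                for c in candidate_transforms(patch_j))          # Eqs. (9)-(10)
    if r_o == 0:
        return 1.0  # assumption: identical patches leave the angle unchanged
    return r_min / r_o

def revised_angle(theta_ij, patch_i, patch_j):
    """Revised spectral angle of Eq. (13)."""
    return theta_ij * direction_coefficient(patch_i, patch_j)
```

Because the identity is among the candidates, \(r_{min}\le r_o\) and hence \(0\le O\le 1\): the revised angle never exceeds the original one, and it shrinks most when a rotated or flipped copy of the neighboring patch matches the central patch well.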

3.3 The Proposed Calculation of Weights

For any HSI pixel \(\varvec{x}_j(j=1,2,\cdots ,S)\) in the search window centered at a test pixel \(\varvec{x}_i(i=1,2,\cdots ,N)\), its weight \(w_{ij}\) in search window can be obtained by

$$\begin{aligned} w_{ij}=\frac{1}{1+(\frac{\hat{\theta }_{ij}}{\theta _{th}})^G}, \end{aligned}$$
(14)

where G is the order that determines the decay rate of the weight. The larger the weight, the more similar the HSI pixels are, and vice versa. \(\theta _{th}\) is an adaptive threshold obtained as the median (midpoint) of the maximum and minimum spectral angles between training samples. The detailed process is as follows.

Consider the training samples \(\varvec{X}_{train}=[\varvec{X}_1,\cdots ,\varvec{X}_i,\cdots ,\varvec{X}_M]\), where \(\varvec{X}_i\in \mathbb R^{B\times N_i}\) contains the training samples of the ith class and \(N_i\) denotes the number of training samples in the ith class. The average \({\bar{X}}_i\) of class i can then be written as

$$\begin{aligned} {\bar{X}}_i=\frac{1}{N_i}\sum _{n=1}^{N_i}\varvec{x}_i^n,i=1,2,\cdots ,M. \end{aligned}$$
(15)

After the averages of all classes are computed according to (15), their spectral angles \(\theta _{ij}=\theta ({\bar{X}}_i,{\bar{X}}_j)\) can be obtained by (8) and sorted. The adaptive threshold \(\theta _{th}\) is the median (midpoint) of the maximum \(\theta _{max}\) and the minimum \(\theta _{min}\) selected from the sorted spectral angles:

$$\begin{aligned} \theta _{th}=\frac{\theta _{max}+\theta _{min}}{2}. \end{aligned}$$
(16)
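The adaptive threshold of (15)–(16) and the weight function of (14) can be sketched together. The function names are hypothetical, and restricting the angles to pairs of distinct classes is an assumption: the text does not say so explicitly, but including \(i=j\) would make \(\theta _{min}\) trivially zero.

```python
import numpy as np

def spectral_angle_deg(a, b):
    """Spectral angle of Eq. (8) in degrees."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def adaptive_threshold(class_means):
    """Eq. (16): midpoint of the largest and smallest angle between class means.

    class_means has shape (M, B), one mean spectrum per class as in Eq. (15).
    Only pairs of distinct classes are compared (assumption of this sketch).
    """
    M = len(class_means)
    angles = [spectral_angle_deg(class_means[i], class_means[j])
              for i in range(M) for j in range(i + 1, M)]
    return 0.5 * (max(angles) + min(angles))

def arw_weight(theta_hat, theta_th, G):
    """Eq. (14): weight of a neighbor with revised spectral angle theta_hat."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    return 1.0 / (1.0 + (theta_hat / theta_th) ** G)
```

The weight equals 1 for a revised angle of zero, drops to exactly 0.5 at \(\hat{\theta }_{ij}=\theta _{th}\), and tends to 0 for larger angles; a larger G makes this transition around \(\theta _{th}\) steeper.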

3.4 Reconstruction and Classification

Suppose that a search window of size \(\sqrt{S}\times \sqrt{S}\) is centered at the test pixel \(\varvec{x}_i\) and that all pixels in the search window form a joint signal matrix \(\varvec{X}=[\varvec{x}_i^1,\varvec{x}_i^2,\cdots ,\varvec{x}_i^S]\). A rotating weighted matrix \(\varvec{W}_{OW}=diag(w_{i1},w_{i2},\cdots ,w_{iS})\) is then constructed according to (14). Similar to (6), the sparse coefficient matrix \(\varvec{\varPhi }_{OW}\) can be obtained by

$$\begin{aligned} \varvec{\varPhi }_{OW}=&\arg \min \Vert \varvec{X}\varvec{W}_{OW}-\varvec{D}\varvec{\varPhi }\Vert _F, s.t.\Vert \varvec{\varPhi }\Vert _{row,0}\le K, \end{aligned}$$
(17)

In this paper, the SOMP algorithm is used to solve (17). Once the sparse coefficient matrix \(\varvec{\varPhi }_{OW}\) is obtained, the class of the test HSI pixel \(\varvec{x}_i\) can be determined by

$$\begin{aligned} Class(\varvec{x}_i)&=\arg \min _{n=1,2,\dots ,M}\Vert \varvec{X}\varvec{W}_{OW}-\varvec{D}^n\varvec{\varPhi }_{OW}^n\Vert _F,\nonumber \\ i&=1, 2, \dots , N. \end{aligned}$$
(18)
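The effect of the weights on the decision rule (18) can be seen in a small NumPy sketch. The function name `classify_weighted` and the `atom_labels` array are hypothetical, and the coefficient matrix is assumed to be supplied by the caller (for instance from an SOMP solver applied to the weighted matrix).

```python
import numpy as np

def classify_weighted(X, w, D, atom_labels, Phi):
    """Decision rule of Eq. (18) for a given coefficient matrix Phi.

    X: (B, S) joint signal matrix, w: (S,) weights from Eq. (14),
    D: (B, n_atoms) dictionary, atom_labels: (n_atoms,) class of each atom,
    Phi: (n_atoms, S) coefficients for the weighted matrix (caller-supplied).
    """
    XW = X * np.asarray(w, dtype=float)[None, :]   # same as X @ diag(w)
    classes = np.unique(atom_labels)
    residuals = [
        np.linalg.norm(XW - D[:, atom_labels == n] @ Phi[atom_labels == n, :])
        for n in classes
    ]
    return classes[int(np.argmin(residuals))]
```

Multiplying by the diagonal weight matrix scales each column of the joint signal matrix, so a neighbor with weight close to 0 contributes almost nothing to the Frobenius residual of any class, whereas a heterogeneous neighbor with full weight can pull the decision toward its own class.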

4 Experiment and Discussion

In this paper, two data sets, Indian Pines and Pavia University, are used to evaluate the performance of the proposed method. Several classical HSI classification algorithms are used as baselines to demonstrate the superiority of the proposed method. This section is divided into two parts: 1) experimental data; 2) experimental results and discussion.

Table 1. Sixteen classes in the AVIRIS Indian Pines data and the training and test sets for each class.
Table 2. Nine classes in the ROSIS Urban Pavia University data and the training and test set for each class.

Four evaluation indicators, namely the average accuracy (AA), the overall accuracy (OA), the kappa coefficient, and the running time, were used to judge the classification results. To demonstrate the superiority of the proposed method, several classical algorithms, namely SVM [14], SRC [5], JSRC [5], and NLW-JSRC [6], are compared with it. Among these methods, SVM and SRC are pixel-wise classification algorithms that take only spectral information into account, whereas the rest are spectral-spatial classification methods.

The parameters of SVM were obtained by 5-fold cross-validation. Following [5, 15], the sparsity level of all the sparse representation-based methods in this paper was set to 3. Raising the sparsity level not only increases the computational cost but may also cause dictionary atoms from wrong classes to be selected, which degrades the classification performance. For Indian Pines and Pavia University, the window sizes in JSRC were \(7\times 7\) and \(11\times 11\), respectively. For NLW-JSRC, the window sizes were set to \(9\times 9\) and \(13\times 13\), respectively, and the size of the nonlocal weighting patch was \(7\times 7\). The threshold parameters \(w'_1\) and \(w'_2\) of the nonlocal weights were 0.14 and 0.88, respectively. More details can be found in [6]. All the experiments were conducted using MATLAB R2014a on a 3.2 GHz computer with 64.0 GB of RAM.

Table 3. Classification results of the Indian Pines and Pavia University, including classification accuracies for every class, AA, OA, Kappa, and Time obtained by SVM, SRC, JSRC, NLW-JSRC, and ARW-JSRC

4.1 Indian Pines

The Indian Pines image has 220 spectral bands covering 0.4–2.5 \(\mu \)m, and each band consists of \(145\times 145\) pixels with a spatial resolution of 20 m. Because of serious water absorption [13], we remove 20 absorption bands (no. 104–108, 150–163, and 220) and retain the remaining 200 bands. For this data set with 16 land-cover classes, we randomly select \(10\%\) of the samples of each class for training and use the rest for testing. The details are given in Table 1, and the ground-truth label map is shown in Fig. 2(a).

Fig. 2. Classification map for the Indian Pines image. (a) Label map; (b) SVM (\(OA=77.49\%\)); (c) SRC (\(OA=69.95\%\)); (d) JSRC (\(OA=93.67\%\)); (e) NLW-JSRC (\(OA=93.26\%\)); (f) ARW-JSRC (\(OA=98.34\%\)).

The classification maps are shown in Fig. 2(b)–(f), and the classification results, including OA, AA, Kappa, and time, are given in Table 3. As shown in Table 3, all the pixel-wise methods (KNN, SVM, and SRC) perform worse than the spectral-spatial classification methods (JSRC, NLW-JSRC, SAJSRC, and ARW-JSRC). The reason is that the pixel-wise methods cannot exploit the spatial information in the HSI and cannot avoid the Hughes phenomenon. HSI pixels belonging to the same class may have different spectral characteristics, and pixels from different classes may have similar spectral characteristics, which makes classification very difficult for pixel-wise methods. Among all the approaches listed in Table 3, the proposed ARW-JSRC achieves the best classification performance in terms of OA, AA, and Kappa. In particular, the improvement of ARW-JSRC over JSRC is pronounced across the evaluation metrics; for instance, the OA increases from \(93.67\%\) to \(98.34\%\). Although NLW-JSRC and SAJSRC also aim to improve on JSRC, ARW-JSRC raises the OA of JSRC by about 4.7 percentage points, a larger gain than theirs. Moreover, ARW-JSRC reaches \(100\%\) classification accuracy in classes 1, 4, 7, 8, and 14. ARW-JSRC achieves this improvement because it considers the directionality of the spatial structure in the HSI and assigns more appropriate weights to the HSI pixels in the search window. However, the rotated similar window and the nonlocal weighting make ARW-JSRC computationally expensive.

Fig. 3. Influence of parameters on classification. (a) Search window size S; (b) Order G.

The search window size S directly determines the size of the neighborhood of the test pixel, which in turn influences the joint signal matrix and the classification. The influence of the search window on the classification is shown in Fig. 3(a), where the search window ranged from \(3\times 3\) to \(21\times 21\). As can be seen in Fig. 3(a), the OA increased rapidly when \(3<\sqrt{S}<9\), because at first the joint signal matrix contains too few HSI pixels to give a satisfactory result. The OA reaches its peak when \(\sqrt{S} =9\). However, the search window contains more and more heterogeneous pixels when \(\sqrt{S}>9\), which makes the classification much harder. Thus, once \(\sqrt{S}\) exceeds the optimal value, ARW-JSRC can no longer handle all the heterogeneous pixels in the search window effectively. To obtain good classification performance, the search window size can be set to \(9\times 9\).

The order G determines the slope of the weight function. The influence of G on the classification is shown in Fig. 3(b), from which it can be seen that ARW-JSRC achieved its best OA when \(G=12\). When \(1<G<12\), the OA rose quickly because the weight function gives more appropriate weights as G grows. However, the OA decreases slowly when \(G>12\), because pixels with small similarity may be given large weights when G is too large. Therefore, the optimal order of the weight function can be set to \(G=12\).

Fig. 4. Classification map for the Pavia University image. (a) Label map; (b) SVM (\(OA=90.26\%\)); (c) SRC (\(OA=81.79\%\)); (d) JSRC (\(OA=96.05\%\)); (e) NLW-JSRC (\(OA=95.99\%\)); (f) ARW-JSRC (\(OA=98.05\%\)).

4.2 Pavia University

The geometric resolution of the Pavia University image is 1.3 m, and it has 115 bands ranging from 0.43 to 0.86 \(\mu \)m. Some of the bands are noisy, so we keep only 103 bands for the experiment. For this data set with 9 ground-truth classes, we randomly select 5\(\%\) of the samples of each class for training and use the rest for testing. The details are given in Table 2, and the ground-truth label map is shown in Fig. 4(a).

The classification maps are shown in Fig. 4(b)–(f), and the classification results can be seen in Table 3. Because more samples are available than for Indian Pines, most classification algorithms achieve a higher OA, especially the pixel-wise methods. Although ARW-JSRC has the best classification performance in terms of OA, AA, and Kappa, it also takes the most time. In other words, ARW-JSRC obtains a better classification result (\(OA=98.05\%\)) than JSRC (\(OA=96.05\%\)) at the expense of more computation time. As for the search window size S and the order G of the weight function, their optimal values can be set to \(\sqrt{S}=7\) and \(G=3\) according to Fig. 3(a) and (b).

5 Conclusion

To address the problem that JSRC cannot deal with the interference of heterogeneous pixels in the search window, a joint sparse representation classification algorithm based on adaptive rotation weighting is proposed. The algorithm uses a rotated similar window strategy to measure the similarity between pixels and a new weight calculation method to assign a weight to each pixel in the search window. In addition, the median (midpoint) of the maximum and minimum spectral angles between the training samples of the various classes is used as the adaptive threshold of the weight formula. Experiments show that the proposed algorithm achieves a remarkable improvement in classification accuracy.

Although the proposed method achieves good classification accuracy, it is computationally expensive. In future work, we will focus on reducing the time complexity of the algorithm and improving its efficiency.