Theoretical and Applied Climatology

Volume 132, Issue 1–2, pp 301–311

The application of low-rank and sparse decomposition method in the field of climatology

  • Nitika Gupta
  • Prasad K. Bhaskaran
Original Paper


The present study reports a low-rank and sparse decomposition method that separates the mean and the variability of a climate data field. Until now, the application of this technique has been limited to areas such as image processing, web data ranking, and bioinformatics data analysis. In climate science, this method exactly separates the original data into low-rank and sparse components, wherein the low-rank component depicts the linearly correlated part of the dataset (expected or mean behavior), and the sparse component represents the variation or perturbation of the dataset from its mean behavior. The study attempts to verify the efficacy of the proposed technique in the field of climatology with two real-world examples. The first example applies the technique to the maximum wind-speed (MWS) data for the Indian Ocean (IO) region. It brings to light a decadal reversal pattern in the MWS for the North Indian Ocean (NIO) during the months of June, July, and August (JJA). The second example deals with the sea surface temperature (SST) data for the Bay of Bengal region, which exhibit a distinct pattern in the sparse component. The study highlights the importance of the proposed technique for the interpretation and visualization of climate data.


Low-rank and sparse decomposition · Climate system · Climate signal · Wind speed · Sea surface temperature · Climate indices · Indian Ocean

1 Introduction

In climate science, basic analytic methods such as the mean, standard deviation, correlation, and empirical orthogonal function (EOF) analysis are traditionally used to study and understand the pattern and impact of climate change on spatiotemporal climate variables. For instance, these methods are used to analyze the variations in climate variables (by calculating their deviations from the mean), to study their climatology (by calculating their temporal mean) or inter-annual behavior (by calculating their standard deviation), or to estimate the temporal correlation of two different climate variables. The calculation of standard deviation, anomaly, or composite maps provides only an approximate estimate of the change in a climate field variable or its long-term behavior. These methods often fail to explain the nature of the perturbation, as well as the expected behavior of a field, especially at each time scale.

One of the most widely used and accepted techniques to understand the spatiotemporal variability in climate variables is EOF analysis. This technique vectorizes the two-dimensional spatial distribution of a climate variable at each time scale. The vectorized data, projected into a new coordinate system, preserve the overall variance of the original data. Fundamentally, it is principal component analysis (PCA), where the eigenvectors are the unit vectors along the new coordinate system. The principal components (PCs) are the components of the vectorized spatial data along the new coordinate axes defined by the eigenvectors. These eigenvectors, also called EOFs because they are orthogonal, provide an understanding of the different patterns that contribute to the spatial climate data. Their contribution is defined by the magnitude of the PCs along these eigenvectors or EOFs. The limitations of this technique are the following:
  (i) It assumes that the data lie in a low-dimensional linear subspace, with little variation or perturbation in a few entries of the data. Fundamentally, this means finding a low-rank matrix A for the stacked column vectors of data points D that minimizes the discrepancy between D and A, i.e., the residual E = D − A. The minimization is classically done in the ℓ2 norm.

  (ii) It breaks down when the data entries vary strongly because the climate data are highly perturbed from their low-rank behavior, e.g., during cyclones, El Nino, or any abrupt climate change. Such perturbations can prevent EOF analysis from extracting the low-rank matrix, or the EOF patterns, exactly.

  (iii) Another limitation lies in interpreting the EOF patterns in terms of the physical units of the field considered.

  (iv) Since EOF analysis focuses on low-rank observations or EOF patterns, it fundamentally ignores the large climate perturbations, which are usually sparse in nature.

  (v) Together, points (ii) and (iv) mean that the original data cannot be recovered completely from the obtained EOFs and PCs, since the exact low-rank and sparse terms cannot be extracted precisely.
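Limitations (i) and (ii) can be illustrated with a small synthetic sketch. The matrix sizes, the random seed, and the magnitude of the injected perturbation are illustrative assumptions and are not taken from the study. The rank-r truncation of the singular value decomposition is the best ℓ2 (Frobenius) approximation of D, and a few large, sparse entries are enough to pull it away from the true low-rank part.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m grid points, n time steps, true rank r
m, n, r = 200, 60, 2
A_true = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Sparse, large-magnitude perturbation in roughly 5% of the entries
E_true = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
E_true[mask] = 10.0 * rng.standard_normal(mask.sum())


def truncated_svd(D, rank):
    """Best rank-`rank` approximation of D in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def rel_error(X, X_ref):
    """Relative Frobenius-norm error of X with respect to X_ref."""
    return np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref)


# Without perturbations the truncation reproduces the low-rank matrix
err_clean = rel_error(truncated_svd(A_true, r), A_true)
# With sparse perturbations the same truncation no longer matches A_true
err_perturbed = rel_error(truncated_svd(A_true + E_true, r), A_true)

print(f"relative error, clean data:     {err_clean:.2e}")
print(f"relative error, perturbed data: {err_perturbed:.2e}")
```

The first error is at the level of floating-point round-off, whereas the second is orders of magnitude larger, which is the behavior described in point (ii).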


The present study introduces the low-rank and sparse separation technique in the field of climatology. Applying this method in climate science yields, for each time scale, a low-rank component and a sparse component with the same units as the original field, wherein the low-rank component provides the expected value or mean behavior of the field. The sparse component provides the variations or perturbations present in the field apart from its expected or mean behavior. The strength and novelty of the proposed methodology are given below.

The proposed technique applies low-rank and sparse decomposition to climate data, exactly recovering the low-rank component and the sparse component associated with large climate perturbations. This mathematical technique is applied to investigate the variability in maximum wind speed (MWS) for the past 21 years over the Indian Ocean basin. The study reveals a decadal reversal pattern in the sparse component of the MWS within the latitude band 26°–30° N. The variability seen in the sparse component of the sea surface temperature (SST) for the past 25 years in the Bay of Bengal corresponds to the signal calculated using the sparse terms at all time scales. This signal has a good statistical correlation with various climate indices. The benefits and limitations of the proposed technique are given below.
  (i) It exactly separates the low-rank data from the climate perturbations, which are recovered as sparse data.

  (ii) The degree of perturbation provides valuable information for understanding climate change, random occurrences, and the effect of extreme weather events such as cyclones, El Nino, etc. on different climate variables.

  (iii) The most important benefit of this technique is the ease with which different patterns in the data obtained at each time scale can be visualized and interpreted. Since both the low-rank and sparse components can be represented as maps, their quantification is straightforward and exact for interpretation.

  (iv) Further mathematical and signal processing techniques can be applied for a better analysis of the obtained components. For example, applying EOF analysis to the obtained low-rank component can provide clean EOF modes with exact temporal oscillations in the form of PCs.


The duration of the datasets used for analysis is important, and short datasets limit the use of this technique. Increasing the dimension of the data matrix can provide better low-rank components because of their linearly correlated dependence.

1.1 Relevant studies using this technique

At present, the application of the low-rank and sparse decomposition technique is limited to areas such as image processing, web data ranking, and bioinformatics data analysis. Its applications in the field of computer vision include the detection of objects against a cluttered background (Wright et al. 2009; Emmanuel et al. 2011). The technique is also used effectively to recover shadows and specularities from face images (Liang et al. 2012). The repair of low-rank and distorted textures, and the completion of structured textures in a highly robust manner, are also reported in their work. Zhang et al. (2011) reconstructed a 3D shape and 2D texture for a class of surfaces using a single perspective image. The technique also finds application in camera calibration and the estimation of radial lens distortion (Zhang et al. 2011) and in the 3D reconstruction of urban scenes (Mobahi et al. 2011). Other applications of this method include object identification tasks such as rectifying the pose of objects, regularity of texts at all scales, character rectification, and street sign rectification. Web image refinement, web-document corpus analysis, protein gene correlation, and similar problems have also been addressed with the help of this method (Ma 2012). In the field of climate science, however, this method exactly separates the expected behavior of a climate variable from its variability.

2 Description and application of the technique

The following section provides the mathematical description of the low-rank and sparse decomposition technique along with one of its applications in the field of computer vision.

2.1 Mathematical formulation

Consider a large data matrix \( D \in \mathbb{R}^{m \times n} \) that can be decomposed as D = A + E, where A is a low-rank matrix and E is a sparse matrix, and where both A and E have arbitrary magnitudes. The low-dimensional column and row spaces of A are unknown, including their dimensions. Similarly, the locations of the non-zero entries in E are also unknown. The problem is to recover the low-rank and sparse components exactly and efficiently from D, where both A and E are unknown but A is known to be low rank and E to be sparse. Emmanuel et al. (2011) proved that, under some suitable assumptions, it is possible to recover exactly the low-rank matrix A as well as the sparse matrix E from D = A + E by solving the following optimization problem:
$$ { \min}_{A, E}{\left\Vert A\right\Vert}_{\ast }+\lambda {\left\Vert E\right\Vert}_1,\kern0.5em \mathrm{subject}\ \mathrm{to}\ D= A+ E $$
where \( \|\cdot\|_{\ast} \) represents the nuclear norm of a matrix (the sum of its singular values), \( \|\cdot\|_{1} \) denotes the sum of the absolute values of the matrix entries, and λ is a positive weighting parameter. As per theoretical considerations, λ should be of the form \( C/\sqrt{m} \), where C is a constant, typically set to unity. The optimization problem given by Eq. (1) is solved by convex programming, which exactly recovers low-rank matrices A whose singular vectors are not sparse or spiky. Recovery succeeds with high probability (assuming that the support of E is random) provided \( \mathrm{rank}(A) < C_1 \mu^{-1} n/\log^{2} m \) and \( \|E\|_{0} < C_2\, mn \), where \( C_1 \) and \( C_2 \) are numerical constants, and μ is an incoherence parameter that is small if the singular spaces of A are not aligned with the standard basis (Emmanuel et al. 2011). The resulting objective function is continuous and convex, but non-smooth.
Recently, with advances in the field of computational analysis, the number of algorithms for solving convex optimization problems like Eq. (1) has increased. Liu et al. (2013) reported various techniques along with their algorithms and computation speed to solve such problems. The augmented Lagrange multipliers (ALM) algorithm proposed by Liu et al. (2013) is used in the present study to solve the convex optimization problem given by Eq. (1). The choice of ALM is motivated by the success of prior studies (Wright et al. 2009; Emmanuel et al. 2011; Peng et al. 2012). These studies successfully used the ALM algorithm to solve the convex optimization problem for the detection of objects in a cluttered background modeled from video and for face recognition, viz., removing specularities and shadows from images of faces. The augmented Lagrangian function for Eq. (1) can be defined in the form:
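As a minimal sketch of how the objective in Eq. (1) is evaluated for a candidate pair (A, E), the two norms can be computed as follows. The function names and the small test matrices are hypothetical and serve only as illustration; λ is set to \( 1/\sqrt{m} \), i.e., C = 1 as stated above.

```python
import numpy as np


def nuclear_norm(A):
    """Sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()


def l1_norm(E):
    """Sum of the absolute values of the entries of E."""
    return np.abs(E).sum()


def rpca_objective(A, E, lam):
    """Value of ||A||_* + lam * ||E||_1 from Eq. (1)."""
    return nuclear_norm(A) + lam * l1_norm(E)


# Hypothetical example: a rank-1 matrix plus a single non-zero perturbation
m, n = 4, 3
A = np.outer(np.ones(m), np.arange(1.0, n + 1))  # every row is [1, 2, 3]
E = np.zeros((m, n))
E[0, 0] = 2.0
D = A + E                                        # the constraint D = A + E holds
lam = 1.0 / np.sqrt(m)                           # lambda = C / sqrt(m) with C = 1

objective = rpca_objective(A, E, lam)
print(objective)
```

Here the rank-1 matrix has the single singular value \( 2\sqrt{14} \), and the sparse term contributes \( \lambda \cdot 2 = 1 \), so the objective equals \( 2\sqrt{14} + 1 \).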
$$ {\mathrm{\mathcal{L}}}_{\mu}\left( A, E, Y\right)={\left\Vert A\right\Vert}_{\ast }+\lambda {\left\Vert E\right\Vert}_1+\left\langle Y, h\left( A, E\right)\right\rangle +\frac{\mu}{2}{\left\Vert h\left( A, E\right)\right\Vert}_F^2, $$
where \( Y \in \mathbb{R}^{m \times n} \) is a Lagrange multiplier matrix, μ is a positive scalar, \( \langle \cdot , \cdot \rangle \) denotes the matrix inner product \( \langle X, Y \rangle = \mathrm{trace}(X^{T} Y) \), \( \|\cdot\|_{F} \) denotes the Frobenius norm, and the function h(A, E) = D − A − E. For an appropriate choice of the Lagrange multiplier matrix Y, and for a sufficiently large constant μ, it can be shown that the augmented Lagrangian function has the same minimizer as the original constrained problem (Bertsekas 1999). The ALM algorithm estimates both the Lagrange multiplier and the optimal solution by iteratively minimizing the augmented Lagrangian function:
$$ \begin{array}{l} \left(A_{k+1}, E_{k+1}\right) = \arg\min_{A, E} \mathcal{L}_{\mu_k}\left(A, E, Y_k\right), \\ Y_{k+1} = Y_k + \mu_k\, h\left(A_{k+1}, E_{k+1}\right). \end{array} $$
When \( \{\mu_k\} \) is a monotonically increasing positive sequence, the iteration converges to the optimal solution of problem (1) (Bertsekas 1999). The first step in the iteration (2) is difficult to solve directly. Hence, an alternating strategy is adopted to minimize the Lagrangian function approximately, i.e., the function is minimized with respect to the two unknowns A and E one at a time:
$$ \begin{array}{l} A_{k+1} = \arg\min_{A} \mathcal{L}_{\mu_k}\left(A, E_k, Y_k\right), \\ E_{k+1} = \arg\min_{E} \mathcal{L}_{\mu_k}\left(A_{k+1}, E, Y_k\right). \end{array} $$
Although each step of iteration (3) involves solving a convex program, each has a simple closed-form solution and hence can be solved efficiently in a single step. For explicit solutions, the soft-thresholding or shrinkage operator is defined for scalars as follows:
$$ S_{\alpha}\left[x\right] = \mathrm{sign}(x) \cdot \max\left\{\left|x\right| - \alpha, 0\right\} $$
where, α ≥ 0. The shrinkage operator acts element-wise when applied to vectors and matrices. The solution to each step of (3) using the shrinkage operator is expressed in the form:
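The shrinkage operator defined above translates directly into an element-wise array operation. The following is a minimal sketch; the function name `shrink` and the sample values are illustrative choices.

```python
import numpy as np


def shrink(X, alpha):
    """Soft-thresholding S_alpha[x] = sign(x) * max(|x| - alpha, 0), element-wise."""
    return np.sign(X) * np.maximum(np.abs(X) - alpha, 0.0)


# Entries with magnitude at most alpha are set to zero;
# all other entries move towards zero by alpha.
x = np.array([-3.0, -0.5, 0.0, 0.4, 2.0])
shrunk = shrink(x, 1.0)
print(shrunk)
```

Because the operation is element-wise, the same function applies unchanged to vectors, matrices, or a vector of singular values.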
$$ \begin{array}{l} \left(U, \Sigma, V\right) = \mathrm{svd}\left(D + \frac{1}{\mu_k} Y_k - E_k\right), \quad A_{k+1} = U\, S_{\frac{1}{\mu_k}}\left[\Sigma\right] V^{T}, \\ E_{k+1} = S_{\frac{\lambda}{\mu_k}}\left[D + \frac{1}{\mu_k} Y_k - A_{k+1}\right] \end{array} $$
where svd(.) denotes the singular value decomposition operator. The complete algorithm used for the present study is “algorithm 5”, known as the inexact augmented Lagrange multipliers (IALM) method, given by Liu et al. (2013). The IALM algorithm is based on the augmented Lagrange multipliers (ALM) technique. It requires significantly fewer partial singular value decompositions (svd). The IALM algorithm also converges almost five times faster than the most widely used earlier algorithm, the accelerated proximal gradient (APG) method. Further, its precision is higher and it requires less storage or memory space. In particular, the number of non-zeros in E computed by IALM is much more accurate (often exact) than that computed by APG, which often leaves many spurious non-zero terms in E. More details on this algorithm are available in the study by Liu et al. (2013).
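The alternating updates described above can be collected into a short loop. The following is a sketch under stated assumptions and is not a reproduction of “algorithm 5” of Liu et al. (2013): the initial multiplier Y, the initial penalty μ, the growth factor `rho`, the cap on μ, and the stopping tolerance are illustrative choices, and a full SVD is used instead of a partial one. The synthetic test data are likewise hypothetical.

```python
import numpy as np


def shrink(X, alpha):
    """Element-wise soft-thresholding operator S_alpha."""
    return np.sign(X) * np.maximum(np.abs(X) - alpha, 0.0)


def ialm_rpca(D, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Sketch of an inexact ALM iteration for D = A + E (assumed parameters)."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(m)          # lambda = C / sqrt(m) with C = 1
    norm_fro = np.linalg.norm(D, "fro")
    norm_two = np.linalg.norm(D, 2)     # largest singular value of D

    # Assumed initialization of the multiplier and the penalty parameter
    Y = D / max(norm_two, np.abs(D).max() / lam)
    mu = 1.25 / norm_two
    mu_max = mu * 1e7
    A = np.zeros_like(D)
    E = np.zeros_like(D)

    for _ in range(max_iter):
        # A-update: singular value thresholding with threshold 1/mu
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * shrink(s, 1.0 / mu)) @ Vt
        # E-update: element-wise shrinkage with threshold lam/mu
        E = shrink(D - A + Y / mu, lam / mu)
        # Multiplier update with the constraint residual h(A, E) = D - A - E
        H = D - A - E
        Y = Y + mu * H
        mu = min(rho * mu, mu_max)      # monotonically increasing penalty
        if np.linalg.norm(H, "fro") / norm_fro < tol:
            break
    return A, E


# Hypothetical synthetic test: rank-2 matrix plus 5% sparse perturbations
rng = np.random.default_rng(0)
m, n, r = 100, 40, 2
A_true = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
E_true = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
E_true[mask] = rng.uniform(-5.0, 5.0, size=mask.sum())
D = A_true + E_true

A_hat, E_hat = ialm_rpca(D)
rel_err_A = np.linalg.norm(A_hat - A_true) / np.linalg.norm(A_true)
print(f"relative error of recovered low-rank part: {rel_err_A:.2e}")
```

The order of the two updates follows the equations above (A first, then E using the new A). For data matrices of the size used later in this study, a partial SVD would reduce the cost of the A-update.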

2.2 Application in the field of computer vision

RASL (Robust Alignment by Sparse and Low-rank Decomposition for Linearly Correlated Images), reported by Liang et al. (2012), is the inspiration behind the present study. Their study aimed to measure image similarity by aligning a batch of linearly correlated images containing gross errors such as significant illumination variation, partial occlusion, and poor or no alignment. For instance, consider Fig. 1 of their study, consisting of 40 face images of a person with different illuminations, occlusions, poses, and expressions. The target of their work is to find a set of transformations such that the transformed images can be decomposed as the sum of a low-rank approximation and sparse errors. To accomplish this, each image is first considered a column vector of a matrix \( D \in \mathbb{R}^{m \times n} \) (where m is the number of pixels in each image and n is the number of images), illustrated in Fig. 1a. To align these images at pixel level, a domain transformation function (τ) is used, whose results are shown in Fig. 1b. Using the low-rank and sparse decomposition technique, the transformed image data matrix \( D \circ \tau \in \mathbb{R}^{m \times n} \) is separated into a low-rank component that is linearly correlated and describes the similarity of the aligned images, shown in Fig. 1c, and a sparse component (Fig. 1d) that consists of all the shadows, hats, and glasses, which are the errors in this case. The ALM algorithm is used to solve the convex optimization problem here.
Fig. 1

Batch image alignment. a D representing the 40 face images of a person with different illuminations, occlusions, poses, and expressions. b D∘τ representing the transformed images that can be decomposed as the sum of c A, the low-rank components, and d E, the sparse components; together, (b), (c), and (d) illustrate D∘τ = A + E

3 Application of this technique in climate studies

3.1 Decadal variability in maximum wind speed for JJA months over North Indian Ocean

Similar to the above-mentioned study (Section 2.2), climate data also consist of a linearly correlated component and a sparse component representing perturbations. The present study uses this technique for one of the climate variables, MWS, expressed in meters per second. Assuming there is no change, the MWS for each month should be similar at each grid point in a spatial domain, and therefore the sparse component should be zero. Because of perturbations in the environmental forcing as evidenced in the real world, i.e., “the system evolves in time under the influence of its own internal dynamics and due to changes in external factors affecting the weather system” (Solomon et al. 2007), there are spatial and temporal variations in the MWS. This means that the data contain both low-rank and sparse components. In addition, due to seasonal dependence, the months of the same season are more linearly correlated than all months taken together across the years. To demonstrate this technique, the present study considers the June, July, August (JJA) months (with reference to austral seasons) for the Indian Ocean (IO) region encompassing the geographical coordinates 30° E–120° E; 60° S–30° N for a period of 21 years (1992–2012). The present study uses the MWS data derived from eight satellite missions, a homogeneous, well-calibrated, and quality-checked product obtained from the URL link: More details on the quality of this product are available in the study of Queffeulou and Croize-Fillon (2013). The daily maximum wind speed data extracted from satellite altimeters are prepared into a monthly dataset on a 1° × 1° grid. Sporadic gaps are one of the basic limitations of satellite data; therefore, the data are interpolated using the nearest-neighbor interpolation method.
The spatiotemporal gaps in satellite data can be addressed using different gap-filling approaches based on multi-step modeling, wherein the algorithm fills missing values using an alternating sequence of spatial and temporal steps. For example, Kang et al. (2005) filled gaps in MODIS data using simple spatial interpolation within land cover classes. Borak and Jasinski (2009) used a modified version of the Kang et al. (2005) approach. Recently, Poggio et al. (2012) developed an innovative method for gap-filling MODIS EVI data that utilizes a hybrid generalized additive model (GAM) to address the missing values at spatiotemporal scales. In addition, the present study applied a Gaussian filter of size 5° × 5° to the monthly datasets to produce a smooth dataset. Each monthly field is then represented as a column vector of the matrix \( D \in \mathbb{R}^{m \times n} \), where m represents the number of grid points (91 × 91 = 8281) and n represents the number of months (63). This matrix is then decomposed into low-rank and sparse components using the ALM algorithm described in Section 2.1. The MWS is chosen in the present study because of the need, noted in the IPCC SREX report, for a precise understanding of the variability of extreme wind, as it can influence the climate system directly or indirectly (Seneviratne et al. 2012). Precise knowledge of extreme winds is important, as they are often considered in the context of extreme weather phenomena such as tropical cyclones and extra-tropical cyclones, thunderstorm downbursts, and tornadoes. Even though wind is often not used to define an extreme event (Peterson and Manton 2008), wind-speed thresholds may be used to characterize the severity of a phenomenon, as in the Saffir-Simpson scale used for tropical cyclone classification.
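The construction of the data matrix D described above amounts to flattening each monthly map into one column. The sketch below uses random numbers as a stand-in for the gridded MWS fields; the array name `fields` and its (time, latitude, longitude) layout are assumptions, and the gap-filling and smoothing steps are omitted.

```python
import numpy as np

# Placeholder for the gridded monthly fields, shape (time, lat, lon).
# Random numbers stand in for the MWS data, which are not reproduced here.
n_months, n_lat, n_lon = 63, 91, 91
rng = np.random.default_rng(1)
fields = rng.random((n_months, n_lat, n_lon))

# Each monthly map becomes one column of D (m grid points x n months)
D = fields.reshape(n_months, n_lat * n_lon).T
print(D.shape)  # (8281, 63)

# Any column can be mapped back to a latitude-longitude field for plotting
first_map = D[:, 0].reshape(n_lat, n_lon)
```

The same reshaping applied to the columns of A and E returns the low-rank and sparse components as maps with the units of the original field.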

Figure 2 illustrates the matrix D = A + E (the exact decomposition of the MWS obtained using this method) for the first 4 months out of 63 months (21 years of data), where consecutive months are indexed row wise. Figure 2a represents the matrix D (original dataset), Fig. 2b represents its decomposed matrix A (low-rank components), and Fig. 2c represents the matrix E (sparse components). For further analysis, the study examines the latitude versus time Hovmoller diagram of the sparse components. In climatology, the Hovmoller diagram is used to examine the time evolution of a propagating signal, wherein the zonal and meridional propagation effects can be effectively visualized to infer meaningful information (Cipollini et al. 1999). Figure 3 illustrates the Hovmoller plot for the sparse components, where the x-axis represents time (3 months corresponding to JJA in a year) and the y-axis represents the latitude, where each latitude is the corresponding meridional average value. The upper latitudes represent the NIO basin (between 26° and 30° N), highlighted by the red-colored box, which clearly shows a decadal reversal pattern (Fig. 3). Further, a time series is calculated as the area average over the region of the observed decadal reversal pattern (highlighted by the red box in Fig. 3); it is hereafter referred to as the MWS sparse signal. Figure 4a represents the time series of the MWS sparse signal (thick yellow line) along with the original data (dotted blue line) and the low-rank components (thick red line). To verify the observed decadal pattern, the means of the first decade (1992–2001) and the second decade (2002–2012) are calculated from each of these signals. For the first decade, the corresponding values are 6.177 m s−1 for the original dataset, 6.4896 m s−1 for the low-rank components, and −0.03353 m s−1 for the sparse component signal.
This means that if the expected MWS for the first decade was 6.4896 m s−1, then a reduction of 0.03353 m s−1 from the expected mean resulted in an MWS of 6.177 m s−1. Similarly, for the second decade, the values are 7.9708 m s−1 for the original dataset, 7.284 m s−1 for the low-rank component, and 0.7361 m s−1 for the sparse component signal. Comparing the two decades, it is evident that the MWS sparse signal exhibits a positive perturbation in the second decade and a negative one in the first decade, which supports the decadal reversal pattern observed in the MWS over the North Indian Ocean region.
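The Hovmoller section, the area-averaged sparse signal, and the two decade means described above can be sketched as follows. A random matrix stands in for the sparse component E, so the printed values carry no physical meaning. The row ordering of E (latitude-major), the unweighted averaging, and the month ordering (June, July, and August of 1992, then of 1993, and so on) are assumptions made for this illustration.

```python
import numpy as np

n_lat, n_lon, n_months = 91, 91, 63
lats = np.linspace(-60.0, 30.0, n_lat)        # 60 S to 30 N on a 1-degree grid

# Placeholder for the sparse component E (m grid points x n months)
rng = np.random.default_rng(2)
E = rng.standard_normal((n_lat * n_lon, n_months))

# Back to maps, then average over longitude: latitude-time Hovmoller section
E_maps = E.T.reshape(n_months, n_lat, n_lon)  # (time, lat, lon)
hovmoller = E_maps.mean(axis=2).T             # (lat, time)

# Area average over the 26-30 N band gives one value per JJA month
band = (lats >= 26.0) & (lats <= 30.0)
sparse_signal = hovmoller[band].mean(axis=0)

# Decade means, assuming three consecutive columns per year from 1992 to 2012
years = np.repeat(np.arange(1992, 2013), 3)
mean_first = sparse_signal[years <= 2001].mean()
mean_second = sparse_signal[years >= 2002].mean()
print(mean_first, mean_second)
```

The same steps applied to D and A give the original and low-rank signals that are compared with the sparse signal in Fig. 4a.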
Fig. 2

Low-rank and sparse decomposition of the MWS for the JJA months. a D representing original data matrix, b A its low-rank components, and c E its sparse components. The maps are plotted for the first 4 months having indexing row wise

Fig. 3

Latitude-time Hovmoller diagram for sparse component for the MWS of JJA months

Fig. 4

a Composite time series for the MWS of the JJA months for the original, low-rank, and sparse signals, area averaged over 26°–30° N of the Hovmoller diagram; b Fourier power spectrum of the JJA months for the MWS sparse signal; c filtered time series of the JJA months for the MWS sparse signal; and d Fourier power spectrum of monthly anomaly for MSWH PC2

Further, a fast Fourier transform (FFT) was applied to this sparse signal to investigate the dominant frequencies. The analysis yields a dominant frequency of 0.04688 Hz, corresponding to a period of 21.3 years (each frequency corresponds to the number of cycles per year). It refers to an oscillatory period of almost 10 years for this sparse signal time series. To verify this technique, the present study also performed an EOF analysis of the monthly dataset for 24 years, in which the first three modes explain 63.54% of the total variance. The variance explained by the first mode is 52.38%, and that explained by the second and third modes is 7.44% and 3.72%, respectively. Figure 4d shows the Fourier power spectrum associated with PC2 of the monthly anomaly dataset, having a frequency of 0.4688 Hz (solid blue line); the lower dotted red line is the mean red-noise spectrum, and the upper dotted cyan line is the 95% confidence level. The comparison shows that the present technique, with a dominant frequency of 0.04688 Hz, matches very well with the traditional PCs obtained from EOF analysis of the monthly dataset. The inverse Fourier transform for this dominant frequency, corresponding to the band (0, 0.09), is calculated and illustrated as a time series in Fig. 4c. It is clear from this figure that a 10-year cycle exists, with negative amplitude during the first 10 years and positive amplitude during the next 10 years, which supports the observed decadal reversal pattern. The statistical significance of the reversal pattern obtained through FFT is verified by creating a background red-noise spectrum (Torrence and Compo 1998), as shown in Fig. 4b. The solid blue line in this figure represents the Fourier power spectrum of the MWS sparse signal, which has a single dominant frequency of 0.04688 Hz; the lower dotted red line represents the corresponding mean red-noise spectrum, and the dotted cyan line is the 95% confidence level.
The null hypothesis is that a peak in the Fourier power spectrum lies below the background red-noise spectrum; otherwise, the peak is deemed a true feature at the 95% confidence level (5% significance level). From Fig. 4b, it is evident that the period obtained through FFT is a true signal, as the peak lies above the 95% confidence level (dotted cyan line). The reason behind this decadal reversal pattern could be the decadal shift in the east Asian summer monsoon circulation during the mid-1990s reported by Kwon et al. (2007). The correlation coefficient of the MWS sparse signal with the Webster and Yang monsoon (WYM) index (obtained from the Asia-Pacific Data Research Center) is calculated. The WYM index is defined by the relation:
$$ \mathrm{WYM}_{\mathrm{index}} = U_{850}\left(40^{\circ}\text{–}110^{\circ}\,\mathrm{E},\ \mathrm{EQ}\text{–}20^{\circ}\,\mathrm{N}\right) - U_{200}\left(40^{\circ}\text{–}110^{\circ}\,\mathrm{E},\ \mathrm{EQ}\text{–}20^{\circ}\,\mathrm{N}\right) $$
In the above relationship, the term U represents the anomalous zonal wind speed at the 850 and 200 mb levels, and the deviations of the zonal wind fields are calculated relative to the two separate mean summer values (Yang et al. 1992). The MWS sparse signal exhibits a moderately significant lag-3 correlation of −0.538 with the WYM index. Figure 5 shows the combined plot of the MWS sparse signal and the WYM index, which clearly indicates a 3-month lag between these signals. To confirm these findings, a verification check was performed by correlating the WYM index with the original and low-rank signals, which resulted in lag-3 correlation coefficients of −0.269 and −0.055, respectively. This indicates that the decadal reversal pattern in the sparse signal is a perturbation in the MWS related to the south Asian summer monsoon. However, this is likely to be only one of the factors behind the decadal reversal, and a more elaborate study is required for further details. The present study provides a platform for a more detailed investigation of the observed decadal reversal pattern in the MWS.
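The spectral steps used above for the sparse signal (Fourier power spectrum, red-noise background with a 95% level, and band-limited inverse transform) can be sketched as follows. This is a minimal illustration that assumes an AR(1) red-noise model with a lag-1 autocorrelation estimated from the series itself and a chi-square distribution with two degrees of freedom for the normalized power, in the spirit of Torrence and Compo (1998). The function names and the synthetic test series are hypothetical, and frequencies are expressed in cycles per sample.

```python
import numpy as np


def power_spectrum_with_red_noise(x, conf=0.95):
    """Normalized Fourier power, AR(1) red-noise background, and confidence level."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    N = x.size
    freqs = np.fft.rfftfreq(N, d=1.0)                      # cycles per sample
    power = np.abs(np.fft.rfft(x)) ** 2 / (N * x.var())    # white noise averages 1
    alpha = np.corrcoef(x[:-1], x[1:])[0, 1]               # lag-1 autocorrelation
    red = (1 - alpha**2) / (1 + alpha**2 - 2 * alpha * np.cos(2 * np.pi * freqs))
    # The chi-square quantile with 2 degrees of freedom is -2 ln(1 - conf);
    # dividing by the 2 degrees of freedom gives the scaling factor below.
    level = red * (-np.log(1.0 - conf))
    return freqs, power, red, level


def lowpass(x, f_cut):
    """Inverse FFT keeping only the frequencies below f_cut."""
    x = np.asarray(x, dtype=float)
    X = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(x.size, d=1.0)
    X[freqs >= f_cut] = 0.0
    return np.fft.irfft(X, n=x.size)


# Hypothetical test series: one slow oscillation plus noise, 63 samples
N = 63
t = np.arange(N)
rng = np.random.default_rng(3)
x = np.sin(2 * np.pi * 3 * t / N) + 0.3 * rng.standard_normal(N)

freqs, power, red, level = power_spectrum_with_red_noise(x)
peak = int(np.argmax(power))
filtered = lowpass(x, 0.09)      # band (0, 0.09), as used for Fig. 4c
print(freqs[peak], power[peak] > level[peak])
```

A peak is treated as a true feature only when its power exceeds the confidence level at that frequency, which mirrors the null hypothesis stated above.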
Fig. 5

Combined plot of the MWS sparse signal and WYM Index for the JJA months
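A lagged correlation such as the lag-3 value reported for the sparse signal and the WYM index can be computed by shifting one series against the other before applying the Pearson correlation. The sketch below is illustrative only: the sign convention (a positive lag pairs x at time t with y at time t + lag) is an assumption, since the text does not state which series leads, and the two series are synthetic.

```python
import numpy as np


def lag_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] (assumed convention)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return np.corrcoef(x, y)[0, 1]


# Synthetic check: y is the negative of x delayed by three samples
rng = np.random.default_rng(4)
x = rng.standard_normal(63)
y = np.zeros(63)
y[3:] = -x[:-3]

r_lag3 = lag_correlation(x, y, 3)
r_lag0 = lag_correlation(x, y, 0)
print(r_lag3, r_lag0)
```

In this synthetic case the lag-3 correlation is −1 by construction, while the zero-lag correlation is weak.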

A number of recent studies report trends in extreme wind speeds based on wind observations and reanalyses over different parts of the world. A declining trend in extreme wind speeds has been reported over much of North America for the period 1973–2005 (Pryor et al. 2007). In another study, Wan et al. (2010) reported declining trends in extreme wind speed using 10-m hourly wind data for the period 1953–2006 over regions in western and southern Canada. Similarly, the 10-m monthly mean and 95th percentile winds over China show a declining trend (Guo et al. 2011). Another study, by Pirazzoli and Tomasin (2003), found a general declining trend in both annual mean and annual maximum winds for the period from 1951 to the mid-1970s, and an increasing trend thereafter, from observations in the central Mediterranean region. All these studies examined the trends in wind speed or extreme wind speed. The major drawback of trend analysis is that it provides only the overall estimated increase or decrease in the long-term variability of a given variable, and it fails to provide the perturbation that explains the variability from the mean behavior; the technique proposed in the present study can provide the exact nature of this variability.

3.2 Variability in the SST for the Bay of Bengal region

To understand the variability in SST over the Bay of Bengal region, the daily mean SST (°C) for the month of December at a depth of 5.022 m was used. The data was obtained from the Integrated Climate Data Centre from the URL link:, which is a product of the ECMWF Ocean Reanalysis System (ORA-S4) for the period from 1958 to present. More details on the evaluation, assimilation, and quality of the data are available in the study of Balmaseda et al. (2013). The SST data has a spatial resolution of 1° × 1° for the Bay of Bengal region (2° N–26° N; 72° E–105° E). The data structure has 24 × 33 grid points for a period of 25 years (1990–2014). The month of December is chosen because, during this month, the SST was significantly affected by the strongest El Nino event of the century (the 1997–1998 El Nino) that occurred in the Pacific (Webster et al. 1999; Yu et al. 1999). The original SST dataset was considered a data matrix D (Fig. 6a) and decomposed using the above-mentioned technique (Section 2.1). This resulted in a matrix A (low-rank component) and a matrix E (sparse component), illustrated in Fig. 6b, c. Of the total 25 years of SST data, Fig. 6 shows 6 years, namely 1997, 1998, 2003, 2004, 2009, and 2010, indexed row wise by year. An interesting observation is a strong negative pattern in the sparse component at the upper (14° N–22° N; 87° E–95° E) and lower latitudinal belts (2° N–10° N; 77° E–102° E) of the Bay of Bengal region during 1997 (highlighted in black box) that was not observed for the years prior to 1997. This feature is highlighted prominently in the Hovmoller diagram (Fig. 7) for the SST sparse components, wherein the x-axis represents the longitude (each longitude point is the mean of all latitude values at that point) and the y-axis represents the time (December of each year). In Fig. 7, the region bounded between the longitudes 90° E and 93° E exhibits an oscillating signal from December 1997 onwards. Both Figs. 6 and 7 show negative values for the 1997 sparse components, which were positive before 1997. The sparse components explain the perturbation or variability in the SST data; hence, it is very important to have a detailed understanding of the physical reason behind this observed pattern. This study attempts to explain the feature; however, a detailed separate study is required. In addition, the low-rank component for the year 1997 (highlighted in black box) also depicts higher SST values relative to other years, indicating an abnormal warming.
Fig. 6

Low-rank and sparse decomposition of SST for the month of December. a D, the original data matrix; b A, its low-rank component; and c E, its sparse component. The maps are plotted for 1997, 1998, 2003, 2004, 2009, and 2010, indexed row wise

Fig. 7

Longitude–time Hovmoller diagram of the sparse component of the December SST
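The longitude–time array behind a Hovmoller diagram like Fig. 7 can be sketched as follows: the sparse matrix is reshaped to (year, latitude, longitude) and averaged over latitude. This is a minimal sketch under assumptions that are not stated in the paper: columns of E are years, each column is flattened latitude-major, and the longitudes are cell centres at 72.5° E, 73.5° E, and so on (which would give 33 columns). The helper name `hovmoller` is hypothetical.

```python
import numpy as np

n_years, n_lat, n_lon = 25, 24, 33
years = np.arange(1990, 2015)
lons = 72.5 + np.arange(n_lon)   # assumed cell-centre longitudes (degrees E)

def hovmoller(E, n_years, n_lat, n_lon):
    """Average a (n_lat*n_lon, n_years) matrix over latitude.

    Returns an array of shape (n_years, n_lon): one row per December,
    one column per longitude.
    """
    cube = E.T.reshape(n_years, n_lat, n_lon)
    return cube.mean(axis=1)

# Deterministic stand-in cube (NOT real data) to show the bookkeeping.
cube_demo = np.arange(n_years * n_lat * n_lon, dtype=float).reshape(n_years, n_lat, n_lon)
E_demo = cube_demo.reshape(n_years, -1).T        # back to (space, years)

hov = hovmoller(E_demo, n_years, n_lat, n_lon)

# Time series at the grid column nearest 91 E (ties resolve to the lower index).
col = int(np.argmin(np.abs(lons - 91.0)))
signal_91E = hov[:, col]
```

Plotting `hov` with longitude on the x-axis and `years` on the y-axis would give the layout described for Fig. 7.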

3.2.1 Observed features in the 1997 El Nino event

Figure 6b, c shows relatively high positive values of A (low-rank component) and distinct negative values of E (sparse component) for the year 1997 compared with the other years (highlighted by the black box). The June–December period of 1997 is considered one of the strongest El Nino periods in history. During this event, the Indian Ocean experienced a reversal of the Walker circulation, with the implication that easterly anomalies started to develop along the equator during early June, about 3 months after the onset of the 1997–1998 El Nino event (Yu and Rienecker 1998). The reversal of the Walker circulation forces equatorial Kelvin/Rossby waves that affect the heat balance, thereby leading to the reversal of the zonal SST gradient in the fall of 1997. The negative values in the SST sparse component during 1997 (Fig. 6) can be an effect of the above-mentioned phenomena. The findings reported in this study are in close agreement with the results of Yu et al. (1999). Their study investigated the mechanism behind the IO warming during the El Nino event using various ocean parameters, one of which was the weekly mean SST derived from Reynolds analyses for the period January 1997–May 1998. In the present analysis, the E component (Fig. 6) has its largest negative value, with a magnitude of 2 °C, in the region 2° N–4° N and 96° E–100° E, and the Hovmoller diagram (Fig. 7) shows a similar pattern. The warming reported by Yu et al. (1999) for the IO region during the 1997–1998 El Nino is clearly seen in the low-rank component for 1997 (Fig. 6), which shows the highest SST values relative to the other years considered in this study.
The major benefit of this technique is that the SST perturbation is recovered exactly, in terms of its sparse component, at each time scale. Earlier studies represented this perturbation by anomaly values, which are approximate and do not provide the exact variability from the expected behavior (Yu et al. 1999). The novelty of the proposed technique is that it represents the exact variability pattern of each El Nino event, which was not possible with other methods. For instance, after the 1997–1998 El Nino, 2009–2010 was also a strong El Nino year; however, the spatial pattern of its variability, as seen in Fig. 6c, was different.

3.2.2 Correlation of the signal with various climate indices

To further quantify and verify the present findings, the study considers the signal in the SST sparse components that is evident in the Hovmoller diagram at 91° E (highlighted by the black line in Fig. 7). The time series of the SST sparse component at 91° E is shown in Fig. 8 by the solid blue line. This time series was correlated with various known climate indices (CIs) to understand their effect on the observed variability in the Bay of Bengal region. A CI is a calculated value used to describe the state of, and changes in, the climate system. The CIs used in this study were obtained from online data sources. The Nino3.4 index is a descriptor of El Nino and La Nina events. The DMI is a descriptor of the Indian Ocean Dipole (IOD). The IOSD is a descriptor of the Indian Ocean Subtropical Dipole, oriented in the southeast-northwest direction. The SAMI is a descriptor of the Southern Annular Mode. Each of the CIs used in this analysis is calculated on a monthly scale.
Fig. 8

Comparison of the SST sparse signal with various climatic indices
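The comparison of the December sparse signal with monthly climate indices can be sketched as a set of Pearson correlations, one per calendar month of the index. The code below is only an illustration: the function name `monthly_correlations` is hypothetical, and the index values are random stand-ins, not the Nino3.4, DMI, IOSD, or SAMI data used in the study.

```python
import numpy as np

def monthly_correlations(signal, ci_monthly):
    """Pearson r between a yearly signal and each calendar month of an index.

    signal:     shape (n_years,), e.g. the December sparse signal at 91 E
    ci_monthly: shape (n_years, 12), one row per year, one column per month
    """
    return np.array([np.corrcoef(signal, ci_monthly[:, m])[0, 1]
                     for m in range(12)])

# Synthetic stand-in index (NOT a real climate index).
rng = np.random.default_rng(1)
n_years = 25
ci = rng.standard_normal((n_years, 12))

# Build a December signal that is out of phase with the September index,
# so the strongest (negative) correlation appears at a 3-month lag.
signal = -ci[:, 8] + 0.1 * rng.standard_normal(n_years)

r = monthly_correlations(signal, ci)
best_month = int(np.argmax(np.abs(r))) + 1   # 1 = January, ..., 12 = December
print("month with strongest correlation:", best_month)
```

Scanning all twelve months, rather than December alone, is what allows a lagged relationship with an index to show up.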

Table 1 shows the correlation coefficients of the 91° E signal with the various CIs. To validate the findings, the CIs were also correlated with the low-rank and original data of this signal. These results support the claim that the sparse components carry the perturbation or variability in the climate data. Interestingly, the December SST sparse component signal correlates with the September Nino3.4 index, which indicates a 3-month lag correlation between them, in agreement with the study by Lanzante (1996). According to that study, during a typical El Nino event, the SST anomaly first appears over the eastern tropical Pacific, followed by the Central Pacific ~3 months later, and thereafter in the Indian Ocean after another 3 months. The negative correlation coefficient of the DMI with the sparse component signal indicates an out-of-phase relation between them. For instance, 1997 was a positive DMI year that resulted in a cooling effect over the Bay of Bengal region (Saji and Yamagata 2003), and this is evidenced in the present study. Hence, according to the proposed technique, when the DMI time series increases (becomes positive), the sparse component signal for the Bay of Bengal region should decrease (become negative). In addition, the obtained correlation value is moderate, indicating that the variability in the SST sparse components and their 91° E signal is also associated with other patterns of climate variability. An IOD event begins around May–June, attains its peak activity in October, and rapidly subsides thereafter (Saji et al. 1999). Hence, its presence in the sparse signal for December is weaker than that of the El Nino and IOSD events. The IOSD event begins in December (Yamagata et al. 2004) and adds a significant contribution to the sparse component and the 91° E signal, with a magnitude almost similar to that of Nino3.4, which has strong effects in the Indian Ocean during December (Yu et al. 1999). Among all the CIs considered in this study, the contribution of the SAMI to the sparse signal variability was the least, as evidenced by its correlation coefficient values. The FFT analysis of the 91° E sparse component signal resulted in a dominant frequency of 0.3438 cycles per year, corresponding to a period of about 2.9 years. The FFT of each CI time series resulted in the same dominant frequency of 0.3438 cycles per year, each above the 95% confidence level and hence representing a true signal, as mentioned in Section 3.1. Therefore, it can be concluded that the CIs not only contribute to the variability of the sparse components over the Bay of Bengal region but also lead to an oscillating pattern with a periodicity of about 2.9 years.
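The conversion from a dominant FFT frequency to a period can be checked with a short sketch. With one sample per year, a frequency of 0.3438 read as cycles per year gives a period of 1/0.3438, roughly 2.9 years. The value 0.3438 is also close to 11/32, which would be consistent with a 32-point FFT of the 25 annual values; that zero-padding length is an inference made here for illustration, not something the paper states. The function name and the test sinusoid are likewise hypothetical.

```python
import numpy as np

def dominant_frequency(x, n_fft=32, dt_years=1.0):
    """Return the frequency (cycles per year) of the largest spectral peak.

    n_fft=32 is an assumed zero-padding length; dt_years is the sampling
    interval (one December value per year).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                                  # remove the mean level
    power = np.abs(np.fft.rfft(x, n=n_fft)) ** 2      # zero-padded power spectrum
    freqs = np.fft.rfftfreq(n_fft, d=dt_years)
    k = 1 + int(np.argmax(power[1:]))                 # skip the zero-frequency bin
    return freqs[k]

# Synthetic 25-year series (NOT real data) oscillating at 11/32 cycles per year.
t = np.arange(25)
x = np.sin(2.0 * np.pi * (11.0 / 32.0) * t)

f_dom = dominant_frequency(x)
period_years = 1.0 / f_dom
print(f"dominant frequency: {f_dom:.4f} cycles/year, period: {period_years:.2f} years")
```

Judging whether such a peak is significant at the 95% level needs a separate test against a background spectrum, which this sketch does not attempt.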
Table 1

Correlation coefficient of original data, low-rank, and sparse signal with various climate indices

Climate indices (CIs) | Months of CIs | R (original data) | R (low-rank) | R (sparse)
4 Summary and conclusions

The study reports an efficient method that separates the perturbations or variability (sparse components) in climate data from the mean or expected behavior (low-rank component). The exact determination of the low-rank data is useful for extracting perfect EOF modes, and their standing oscillations as PCs, in further EOF analysis of these data. The extracted sparse components give the climate perturbations from the mean behavior, which improves the understanding of climate variability. The study uses the MWS and SST as the variables to demonstrate this technique. The first application, on the MWS, revealed a decadal reversal pattern in the sparse components over the North Indian Ocean basin. The pattern represented by the sparse signal has a dominant period of 21.3 years, which is statistically significant at the 95% confidence level. The sparse signal also shows a significant lag-3 correlation with the WYM, indicating its dependence on the south Asian summer monsoon. The authors believe that this finding has practical applications in the field of climate science. The second application analyzed the SST during 1997 and revealed a distinct pattern in the sparse component. The present study provides a possible explanation of the observed patterns, which show a reasonably good match with previous studies. The sparse signal has a significant correlation with various well-known climate indices that contribute to the variability or perturbations in the SST over the Bay of Bengal region. The study clearly shows a warmer SST pattern in the low-rank component during 1997, which matches earlier studies well. The long-term climatology of any climate variable can be examined using the low-rank component, which can help to determine the pattern classification at each time scale, something not obtained using the existing methods.
At the same time, the exact separation of perturbations from the mean behavior is a challenging task in climate data that is not addressed by any existing method, and its meaningful analysis is increasingly relevant in a fast-changing climate. It is believed that the proposed technique can be extended to other field variables and has many practical applications in the fields of coastal and ocean engineering, coastal zone management, and climate dynamics.



Acknowledgments

The authors express their sincere gratitude to Dr. Swadhin Behera, Group Leader, Climate Variability Prediction and Application Research Group, Application Laboratory, JAMSTEC, Japan, for sharing the IOSD index values used in the present study.


References

  1. Balmseda MA, Mogensen K, Weaver AT (2013) Evaluation of the ECMWF ocean reanalysis system ORAS4. Q J R Meteorol Soc 139:1132–1161
  2. Bertsekas DP (1999) Nonlinear programming: 2nd edition, Athena Scientific pp. 780.
  3. Borak JS, Jasinski MF (2009) Effective interpolation of incomplete satellite-derived leaf-area index time series for the continental United States. Agric For Meteorol 149:320–332
  4. Cipollini P, Cromwell D, Quartly GD (1999) Observations of Rossby wave propagation in the Northeast Atlantic with TOPEX/POSEIDON altimetry. Adv Space Res 22(11):1553–1556
  5. Emmanuel C, Li X, Ma Y, Wright J (2011) Robust principal component analysis. Journal of the ACM 58(3)
  6. Guo H, Xu M, Hu Q (2011) Changes in near-surface wind speed in China: 1969–2005. Int J Climatol 31:349–358. doi: 10.1002/joc.2091
  7. Kang S, Running SW, Zhao M, Kimball JS, Glassy J (2005) Improving continuity of MODIS terrestrial photosynthesis products using an interpolation scheme for cloudy pixels. Int J Remote Sens 26:1659–1676
  8. Kwon MH, Jhun J-G, Ha K-J (2007) Decadal change in east Asian summer monsoon circulation in the mid-1990s. Geophys Res Lett 34(21). doi: 10.1029/2007GL031977
  9. Lanzante JR (1996) Lag relationships involving tropical sea surface temperatures. J Clim 9:2568–2578
  10. Liang X, Ren X, Zhang Z, Ma Y (2012) Repairing sparse low-rank texture. Computer vision-ECCV 2012, vol. 7576 of the series Lecture Notes in Computer Science 482–495
  11. Liu G, Lin A, Yan S, Sun J, Yu Y, Ma Y (2013) Robust recovery of subspace structures by low-rank representation. IEEE Trans Pattern Anal Mach Intell 34(1):171–184
  12. Ma Y (2012) Pursuit of low-dimensional structures in high-dimensional data. Invited talk Session-I, TuAT1, IEEE.
  13. Mobahi H, Zhou Z, Yang AY, Ma Y (2011) Holistic 3D reconstruction of urban structures from low-rank textures. IEEE International Conference on Computer Vision Workshops. ICCV 2011 Workshops, Barcelona, Spain, November 6–13, 2011
  14. Peng Y, Ganesh A, Wright J, Xu W, Ma Y (2012) RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Trans Pattern Analysis and Machine Intelligence 34(11):2233–2246
  15. Peterson TC, Manton MJ (2008) Monitoring changes in climate extremes: a tale of international collaboration. Bulletin of American Meteorological Society 89:1266–1271
  16. Pirazzoli PA, Tomasin A (2003) Recent near-surface wind changes in the Central Mediterranean and Adriatic areas. Int J Climatol 23:963–973
  17. Poggio L, Gimona A, Brown I (2012) Spatio-temporal MODIS EVI gap filling under cloud cover: an example in Scotland. ISPRS J Photogramm Remote Sens 72:56–72
  18. Pryor SC, Barthelmie RJ, Riley ES (2007) Historical evolution of wind climates in the USA. Jour Phys Conf Ser 75:012065. doi: 10.1088/1742-6596/75/1/012065
  19. Queffeulou P, Croize-Fillon D (2013) Global altimeter SWH data set, version 10, May 2013, Laboratoire d’Oceanographie Spatiale, IFREMER, BP 70, 29280 Plouzane, France.
  20. Saji NH, Yamagata T (2003) Possible impacts of Indian Ocean dipole mode events on global climate. Clim Res 25:151–169
  21. Saji NH, Goswami BN, Vinayachandran PN, Yamagata T (1999) A dipole mode in the tropical Indian Ocean. Nature 401:360–363
  22. Seneviratne SI, et al. (2012) Changes in climate extremes and their impacts on the natural physical environment. Managing the risks of extreme events and disasters to advance climate change adaptation, C. B. Field et al. (Eds.), Cambridge University Press, 109–230
  23. Solomon S, Qin D, Manning M, Chen Z, Marquis M, Averyt K et al (2007) Climate change 2007: the physical science basis. Contributions of working group I to the fourth assessment report of the intergovernmental panel on climate change. Cambridge University Press, Cambridge
  24. Torrence C, Compo GP (1998) A practical guide to wavelet analysis. Bull Am Meteorol Soc 79:61–78
  25. Wan H, Wang LX, Swail VR (2010) Homogenization and trend analysis of Canadian near-surface wind speeds. J Clim 23:1209–1225. doi: 10.1175/2009JCLI3200.1
  26. Wright J, Ganesh A, Rao S, Ma Y (2009) Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. Proc. of the 23rd Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada. pp 2080–2088
  27. Yamagata T, Behera SK, Luo JJ, Masson S, Jury M, Rao SA (2004) Coupled ocean atmosphere variability in the tropical Indian Ocean. Earth climate: the ocean-atmosphere interaction, geophys monogr, vol 147. Amer Geophys Union 189–212
  28. Yang S, Webster PJ, Dong M (1992) Longitudinal heating gradient: another possible factor influencing the intensity of the Asian summer monsoon circulation. Adv Atmos Sci 9:397–410
  29. Yu L, Rienecker MM (1998) Evidence of an extra-tropical atmospheric influence during the onset of the 1997–98 El Nino. Geophys Res Lett 25:3537–3540
  30. Yu L, Rienecker MM (1999) Mechanisms for the Indian Ocean warming during the 1997–98 El Nino. Geophys Res Lett 26:735–738. doi: 10.1029/1999GL900072
  31. Zhang Z, Liang X, Ma Y (2011) Camera calibration with lens distortion from low-rank textures. IEEE Intl. Conf. on Computer Vision and Pattern Recognition (CVPR), 1347–1354

Copyright information

© Springer-Verlag Wien 2017

Authors and Affiliations

  1. Department of Ocean Engineering and Naval Architecture, Indian Institute of Technology Kharagpur, Kharagpur, India
