The application of low-rank and sparse decomposition method in the field of climatology
The present study reports a low-rank and sparse decomposition method that separates the mean and the variability of a climate data field. Until now, the application of this technique has been limited to areas such as image processing, web data ranking, and bioinformatics data analysis. In climate science, this method exactly separates the original data into a set of low-rank and sparse components, wherein the low-rank component depicts the linearly correlated part of the dataset (expected or mean behavior), and the sparse component represents the variation or perturbation of the dataset from its mean behavior. The study attempts to verify the efficacy of the proposed technique in the field of climatology with two real-world examples. The first example applies this technique to the maximum wind-speed (MWS) data for the Indian Ocean (IO) region. The study brings to light a decadal reversal pattern in the MWS for the North Indian Ocean (NIO) during the months of June, July, and August (JJA). The second example deals with the sea surface temperature (SST) data for the Bay of Bengal region, which exhibit a distinct pattern in the sparse component. The study highlights the importance of the proposed technique for the interpretation and visualization of climate data.
Keywords: Low-rank and sparse decomposition; Climate system; Climate signal; Wind speed; Sea surface temperature; Climate indices; Indian Ocean
In the field of climate science, basic analytic methods such as the mean, standard deviation, correlation, or empirical orthogonal function (EOF) analysis are traditionally used to study and understand the pattern and impact of climate change on spatiotemporal climate variables. For instance, these methods are used to analyze the variations in climate variables (by calculating their deviations from the mean), to study their climatology (by calculating their temporal mean) or inter-annual behavior (by calculating their standard deviation), or to estimate the temporal correlation of two different climate variables. The calculation of standard deviation, anomaly, or composite maps provides only an approximate estimate of the change in a climate field variable or of its long-term behavior. These methods often fail to explain the nature of the perturbation, as well as the expected behavior of a field, especially at each time scale.
(i) The EOF analysis assumes that the data lie in a low-dimensional linear subspace, with little variation or perturbation in a few entries of the data. Fundamentally, this means finding a low-rank matrix A, for the data matrix D of stacked column vectors, that minimizes the discrepancy between D and A. The minimization is classically done in the ℓ2 norm.
(ii) The analysis breaks down when the data entries vary strongly, i.e., when the climate data are highly perturbed away from low-rank behavior, for example during cyclones, El Nino, or any abrupt climate change. Such perturbations prevent the EOF analysis from extracting the low-rank matrix, or the EOF patterns, exactly.
(iii) Another limitation lies in interpreting the EOF patterns in terms of the physical units of the field considered.
(iv) Since the EOF analysis focuses on low-rank observations or EOF patterns, it fundamentally ignores the large climate perturbations that are usually sparse in nature.
(v) Together, the above-mentioned points (ii) and (iv) mean that one cannot completely recover the original data from the obtained EOFs and PCs, since the exact low-rank and sparse terms cannot be extracted precisely.
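The sensitivity described above can be illustrated with a small sketch. It assumes that the EOF analysis is computed as a truncated singular value decomposition (SVD) of the temporal anomalies, which is the usual least-squares (ℓ2) formulation; the function name `eof_analysis` and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np

def eof_analysis(D, k):
    """Rank-k EOF analysis of D (grid points x time steps) via the SVD."""
    D = np.asarray(D, dtype=float)
    mean = D.mean(axis=1, keepdims=True)        # temporal mean at each grid point
    U, s, Vt = np.linalg.svd(D - mean, full_matrices=False)
    eofs = U[:, :k]                             # spatial patterns (unit norm)
    pcs = s[:k, None] * Vt[:k]                  # principal component time series
    reconstruction = mean + eofs @ pcs          # best rank-k fit in the l2 sense
    return eofs, pcs, reconstruction

# Synthetic stand-in: a rank-one field plus a constant offset (assumption).
rng = np.random.default_rng(0)
clean = np.outer(rng.standard_normal(30), rng.standard_normal(20)) + 5.0
spiked = clean.copy()
spiked[3, 7] += 50.0                            # one sparse, large perturbation

_, _, fit_clean = eof_analysis(clean, k=1)
_, _, fit_spiked = eof_analysis(spiked, k=1)
err_clean = np.linalg.norm(fit_clean - clean)
err_spiked = np.linalg.norm(fit_spiked - spiked)
print(f"rank-1 residual, clean data:  {err_clean:.2e}")
print(f"rank-1 residual, spiked data: {err_spiked:.2e}")
```

With the clean rank-one field, a single EOF reproduces the data to rounding error; once a single large spike is present, the leading mode alone no longer reproduces the data.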
The present study introduces the low-rank and sparse separation technique in the field of climatology. The application of this method in climate science results in a set of low-rank and sparse components for each time scale having the same units, wherein the low-rank component provides the expected value or mean behavior of the field. The sparse component provides the variations or perturbations present in the field apart from its expected or mean behavior. The strengths and novelty of the proposed methodology are as given below.
It exactly separates the low-rank data from the climate perturbations, which are recovered as sparse data.
The degree of perturbation provides valuable information for understanding climate change, random occurrences, and the effect of extreme weather events such as cyclones, El Nino, etc. on different climate variables.
The most important benefit of this technique is the ease with which different patterns in the data obtained at each time scale can be visualized and interpreted. Since both the low-rank and sparse components can be represented as maps, their quantification is straightforward and exact for interpretation.
Further application of mathematical and signal processing techniques can provide a better analysis of the obtained components. For example, applying EOF analysis to the obtained low-rank component can provide perfect EOF modes with exact temporal oscillations in the form of PCs.
The duration of the datasets used for analysis is important, and datasets of short duration limit the use of this technique. An increase in the dimension of the data matrix can provide better low-rank components owing to their linearly correlated dependence.
1.1 Relevant studies using this technique
At present, the application of the low-rank and sparse decomposition technique is limited to areas such as image processing, web data ranking, and bioinformatics data analysis. A few of its applications in the field of computer vision include the detection of objects against a cluttered background (Wright et al. 2009; Emmanuel et al. 2011). The technique has also been used effectively to recover shadows and specularities from face images (Liang et al. 2012). The repair of low-rank and distorted textures, and the completion of structured textures in a highly robust manner, are also reported in their work. Zhang et al. (2011) reconstructed a 3D shape and 2D texture for a class of surfaces using a single perspective image. The technique also finds application in camera calibration and the estimation of radial lens distortion (Zhang et al. 2011) and in the 3D reconstruction of urban scenes (Mobahi et al. 2011). Other applications of this method include object identification tasks such as rectifying the pose of objects, regularity of texts at all scales, character rectification, and street sign rectification. Web image refinement, web-document corpus analysis, protein gene correlation, and so on also benefit from this method (Ma 2012). In the field of climate science, however, this technique exactly separates the expected behavior of a climate variable from its variability.
2 Description and application of the technique
The following section provides the mathematical description of the low-rank and sparse decomposition technique along with one of its applications in the field of computer vision.
2.1 Mathematical formulation
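For reference, the decomposition is usually posed as the convex program known as principal component pursuit (Wright et al. 2009; Emmanuel et al. 2011). The statement below is a sketch of that standard formulation, given under stated assumptions: D, A, and E follow the usage elsewhere in this paper, while the weight λ, the multiplier matrix Y, and the penalty parameter μ are assumed notation.

```latex
\[
  \min_{A,\,E}\; \|A\|_{*} + \lambda \|E\|_{1}
  \quad \text{subject to} \quad D = A + E
\]
\[
  \mathcal{L}(A, E, Y, \mu) = \|A\|_{*} + \lambda \|E\|_{1}
    + \langle Y,\, D - A - E \rangle
    + \frac{\mu}{2}\, \|D - A - E\|_{F}^{2}
\]
```

Here the nuclear norm of A is the sum of its singular values, the ℓ1 norm of E is the sum of the absolute values of its entries, and the second expression is the augmented Lagrangian that an ALM scheme minimizes alternately over A and E before updating Y.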
2.2 Application in the field of computer vision
3 Application of this technique in climate studies
3.1 Decadal variability in maximum wind speed for JJA months over North Indian Ocean
Similar to the above-mentioned study (Section 2.2), the climate data also consist of a set of linearly correlated components and a sparse component representing perturbations. The present study applies this technique to one climate variable, the MWS, expressed in meters per second. If there were no change, then for each month the MWS would be similar at each grid point in the spatial domain, and the sparse component would therefore be zero. Because of perturbations in the environmental forcing, as evidenced in the real world, i.e., “the system evolves in time under the influence of its own internal dynamics and due to changes in external factors affecting the weather system” (Solomon et al. 2007), there are spatial and temporal variations in the MWS. This means that the data contain a set of low-rank and sparse components. In addition, owing to seasonal dependence, the months of a given season are more linearly correlated with one another than all the months taken together across the years. To demonstrate this technique, the present study considers the June, July, and August (JJA) months (with reference to austral seasons) for the Indian Ocean (IO) region encompassing the geographical coordinates 30° E–120° E; 60° S–30° N for a period of 21 years (1992–2012). The present study uses MWS data derived from eight satellite missions; this is a homogeneous, well-calibrated, and quality-checked product obtained from ftp://ftp.ifremer.fr/ifremer/cersat/products/swath/altimeters/waves/data. More details on the quality of this product are available in Queffeulou and Croize-Fillon (2013). The daily maximum wind-speed data extracted from the satellite altimeters are compiled into a monthly dataset on a 1° × 1° grid. Sporadic gaps are a basic limitation of satellite data; therefore, the data are interpolated using the nearest-neighbor interpolation method.
The spatiotemporal gaps in satellite data can be addressed using gap-filling approaches based on multi-step modeling, wherein the algorithm fills missing values using an alternating sequence of spatial and temporal steps. For example, Kang et al. (2005) filled gaps in MODIS data using simple spatial interpolation within land cover classes, and Borak and Jasinski (2009) used a modified version of the Kang et al. (2005) approach. More recently, Poggio et al. (2012) developed an innovative method for gap-filling MODIS EVI data that uses a hybrid generalized additive model (GAM) to address missing values at spatiotemporal scales. In addition, the present study applied a Gaussian filter of size 5° × 5° to the monthly datasets to produce a smooth dataset. Each monthly field is then represented as a column vector of the matrix D ∈ ℝ^(m × n), where m is the number of grid points (91 × 91 = 8281) and n is the number of months (63). This matrix is then decomposed into a set of low-rank and sparse components using the ALM algorithm, as mentioned in Section 2.1. The MWS was chosen for the present study because the IPCC SREX report stresses the need for a precise understanding of the variability of extreme winds, as they can influence the climate system directly or indirectly (Seneviratne et al. 2012). Precise knowledge of extreme winds is important because they are often considered in the context of extreme weather phenomena associated with events such as tropical and extra-tropical cyclones, thunderstorm downbursts, and tornadoes. Even though wind is often not used to define an extreme event (Peterson and Manton 2008), wind-speed thresholds may be used to characterize the severity of a phenomenon, as in the Saffir-Simpson scale used for tropical cyclone classification.
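The construction of D and its decomposition into A and E, as described above, can be sketched as follows. This is a minimal sketch and not the authors' implementation: it assumes the inexact augmented Lagrange multiplier (ALM) variant, the commonly used weight λ = 1/√max(m, n), and conventional values for the penalty schedule. The names `stack_fields`, `shrink`, and `rpca_ialm` are hypothetical, and random numbers stand in for the MWS fields.

```python
import numpy as np

def stack_fields(fields):
    """Turn (n_months, n_lat, n_lon) maps into D with one column per month."""
    fields = np.asarray(fields, dtype=float)
    return fields.reshape(fields.shape[0], -1).T

def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Split D into low-rank A and sparse E with D = A + E (inexact ALM sketch)."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # assumed default weight
    norm_D = np.linalg.norm(D, "fro")
    A, E = np.zeros_like(D), np.zeros_like(D)
    if norm_D == 0.0:
        return A, E
    spectral = np.linalg.norm(D, 2)
    Y = D / max(spectral, np.abs(D).max() / lam)  # multiplier initialisation
    mu = 1.25 / spectral                          # assumed initial penalty
    for _ in range(max_iter):
        # Low-rank update: threshold the singular values by 1/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * shrink(s, 1.0 / mu)) @ Vt
        # Sparse update: threshold the entries by lam/mu.
        E = shrink(D - A + Y / mu, lam / mu)
        residual = D - A - E
        Y = Y + mu * residual                     # multiplier update
        mu *= rho                                 # penalty grows geometrically
        if np.linalg.norm(residual, "fro") <= tol * norm_D:
            break
    return A, E

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fields = rng.standard_normal((63, 91, 91))    # placeholder for 63 monthly maps
    D = stack_fields(fields)                      # shape (8281, 63)
    A, E = rpca_ialm(D)
    low_rank_maps = A.T.reshape(fields.shape)     # back to one map per month
    print(D.shape, low_rank_maps.shape)
```

Each column of A and E can be reshaped back into a 91 × 91 map, so both components remain in the units of the original field.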
A number of recent studies report on the trends in extreme wind speeds based on wind observations and reanalyses over different parts of the world. A declining trend in extreme wind speeds has been reported over much of North America for the period 1973–2005 (Pryor et al. 2007). In another study, Wan et al. (2010) reported declining trends in extreme wind speed using 10-m hourly wind data for the period 1953–2006 over regions in western and southern Canada. Similarly, the 10-m monthly mean and 95th percentile winds over China show a declining trend (Guo et al. 2011). Another study by Pirazzoli and Tomasin (2003) found a general declining trend in both annual mean and annual maximum winds for the period from 1951 to the mid-1970s, and an increasing trend thereafter, from observations in the central Mediterranean region. All these studies examined the trends in wind speed or extreme wind speed. The major drawback of trend analysis is that it provides only the overall estimated increase or decrease in the long-term variability of a given variable; it does not provide the perturbation that describes exactly how the variable departs from its mean behavior. The technique proposed in the present study can provide the exact nature of this variability.
3.2 Variability in the SST for the Bay of Bengal region
3.2.1 Observed features in the 1997 El Nino event
Figure 6b, c illustrates relatively high positive values of A (low-rank component) and distinct negative values of E (sparse component) for the year 1997 compared with the other years; these are exceptionally visible (highlighted by the black box). The June–December period of 1997 is considered one of the strongest El Nino periods in history. During this event, the Indian Ocean experienced a reversal of the Walker circulation, with the implication that easterly anomalies started to develop along the equator during early June, about 3 months after the onset of the 1997–1998 El Nino event (Yu and Rienecker 1998). The reversal of the Walker circulation forced equatorial Kelvin/Rossby waves that affected the heat balance, thereby leading to the reversal of the zonal SST gradient in the fall of 1997. The resulting pattern of negative SST values during 1997 (Fig. 6) can be an effect of the above-mentioned phenomena. The findings reported in this study are in close agreement with the results of Yu and Rienecker (1999). Their study investigated the mechanism behind the IO warming during the El Nino event using various ocean parameters, one of which was the weekly mean SST derived from Reynolds analyses for the period January 1997–May 1998. The analysis of the E component in the present study (Fig. 6) resulted in the strongest negative value, 2 °C in magnitude, for the region 2° N–4° N and 96° E–100° E, and the Hovmoller diagram (Fig. 7) shows a similar pattern. The warming reported by Yu and Rienecker (1999) for the IO region during the 1997–1998 El Nino is clearly seen in the obtained low-rank component for 1997 (Fig. 6), which shows the highest SST values relative to the other years considered in this study.
The major benefit of this technique is that the SST perturbation is recovered exactly, as the sparse component, at each time scale. Earlier, this perturbation was represented by anomaly values, which are approximate and do not provide the exact variability from the expected behavior (Yu and Rienecker 1999). The novelty of the proposed technique is the representation of exact variability patterns for each El Nino event, which was not possible with other methods. For instance, 2009–2010 was also a strong El Nino year after the 1997–1998 event; however, the spatial pattern of its variability, as seen in Fig. 6c, was different.
3.2.2 Correlation of the signal with various climate indices
Table: Correlation coefficient of original data, low-rank, and sparse signal with various climate indices
Columns: Climate indices (CIs) | Months of CIs | R (original data)
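A correlation of this kind, between a signal and a climate index at a given lag, can be sketched as below. This is a minimal sketch under assumptions: it uses the Pearson correlation coefficient with the index leading the signal by `lag` time steps, and it takes both series as already reduced to one value per month (how the fields are reduced to a time series is not specified here). The function name `lagged_correlation` is hypothetical.

```python
import numpy as np

def lagged_correlation(signal, index, lag=0):
    """Pearson correlation between signal[t] and index[t - lag], with lag >= 0."""
    signal = np.asarray(signal, dtype=float)
    index = np.asarray(index, dtype=float)
    if signal.ndim != 1 or signal.shape != index.shape:
        raise ValueError("expected two 1-D series of equal length")
    if not 0 <= lag < signal.size - 1:
        raise ValueError("lag must leave at least two overlapping samples")
    x = signal[lag:]                      # signal from time step `lag` onwards
    y = index[: index.size - lag]         # index values `lag` steps earlier
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic check: a signal that copies the index three steps later.
rng = np.random.default_rng(0)
index = rng.standard_normal(63)           # placeholder for a monthly climate index
signal = np.zeros(63)
signal[3:] = index[:-3]
r_lag3 = lagged_correlation(signal, index, lag=3)
r_lag0 = lagged_correlation(signal, index, lag=0)
print(f"lag 3: {r_lag3:.3f}, lag 0: {r_lag0:.3f}")
```

The same function can be applied in turn to the original data, the low-rank signal, and the sparse signal to fill one row of such a table.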
4 Summary and conclusions
The study reports on an efficient method that separates the perturbations or variability (sparse component) in climate data from their mean or expected behavior (low-rank component). The exact determination of the low-rank data is very useful for extracting perfect EOF modes, and their standing oscillations as PCs, in further EOF analysis of these data. The extracted sparse components provide the climate perturbations from the mean behavior, which further enhances the understanding of climate variability. The study uses the MWS and SST as the variables to demonstrate this technique. The first application of this technique was to the MWS and revealed a decadal reversal pattern in the sparse components over the North Indian Ocean basin. The pattern represented by the sparse signal has a dominant period of 21.3 years, which is statistically significant at the 95% confidence level. The sparse signal also shows a significant lag-3 correlation with the WYM, indicating its dependence on the south Asian summer monsoon. The authors believe that this finding has practical applications in the field of climate science. The second application of this technique analyzes the SST during 1997 and revealed a distinct pattern in the sparse component. The present study provides a possible explanation of the observed patterns, which show a reasonably good match with previous studies. The sparse signal has a significant correlation with various well-known climate indices that contribute to the variability or perturbations in the SST over the Bay of Bengal region. The study clearly shows a warmer SST pattern in the low-rank component during 1997, which matches earlier studies well. The long-term climatology of any climate variable can be examined using the low-rank component, which can help to determine the pattern classification at each time scale; this is not obtainable using the existing methods.
At the same time, the exact separation of perturbations from the mean behavior is a challenging task in climate data that no existing method accomplishes, and a meaningful analysis of these perturbations is increasingly relevant in a fast-changing climate. It is believed that the proposed technique can be extended to other field variables with many practical applications in the fields of coastal and ocean engineering, coastal zone management, and climate dynamics.
The authors express their sincere gratitude to Dr. Swadhin Behera, Group Leader, Climate Variability Prediction and Application Research Group, Application Laboratory, JAMSTEC, Japan for sharing the IODS index values used in the present study.
- Bertsekas DP (1999) Nonlinear programming: 2nd edition, Athena Scientific pp. 780
- Emmanuel C, Li X, Ma Y, Wright J (2011) Robust principal component analysis. Journal of the ACM 58(3)
- Guo H, Xu M, Hu Q (2011) Changes in near-surface wind speed in China: 1969-2005. Int J Climatol 31:349–358. doi: 10.1002/joc.2091
- Kwon MH, Jhun J-G, Ha K-J (2007) Decadal change in east Asian summer monsoon circulation in the mid-1990s. Geophys Res Lett 34(21). doi: 10.1029/2007GL031977
- Liang X, Ren X, Zhang Z, Ma Y (2012) Repairing sparse low-rank texture. Computer vision-ECCV 2012, vol. 7576 of the series Lecture Notes in Computer Science 482–495
- Ma Yi (2012) Pursuit of low-dimensional structures in high-dimensional data. Invited talk Session-I, TuAT1, IEEE
- Mobahi H, Zhou Z, Yang AY, Ma Y (2011) Holistic 3D reconstruction of urban structures from low-rank textures. IEEE International Conference on Computer Vision Workshops. ICCV 2011 Workshops, Barcelona, Spain, November 6–13, 2011
- Queffeulou P, Croize-Fillon D (2013) Global altimeter SWH data set, version 10, May 2013, Laboratoire d’Oceanographie Spatiale, IFREMER, BP 70, 29280 Plouzane, France. ftp://ftp.ifremer.fr/ifremer/cersat/products/swath/altimeters/waves/documentation
- Saji NH, Yamagata T (2003) Possible impacts of Indian Ocean dipole mode events on global climate. Clim Res 25:151–169
- Saji NH, Goswami BN, Vinayachandran PN, Yamagata T (1999) A dipole mode in the tropical Indian Ocean. Nature 401:360–363
- Seneviratne SI, et al. (2012) Changes in climate extremes and their impacts on the natural physical environment. Managing the risks of extreme events and disasters to advance climate change adaptation, C. B. Field et al. (Eds.), Cambridge University Press, 109–230
- Solomon S, Qin D, Manning M, Chen Z, Marquis M, Averyt K et al (2007) Climate change 2007: the physical science basis. Contributions of working group I to the fourth assessment report of the intergovernmental panel on climate change. Cambridge University Press, Cambridge
- Wan H, Wang LX, Swail VR (2010) Homogenization and trend analysis of Canadian near-surface wind speeds. J Clim 23:1209–1225. doi: 10.1175/2009JCLI3200.1
- Wright J, Ganesh A, Rao S, Ma Y (2009) Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. Proc. of the 23rd Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada. pp 2080–2088
- Yamagata T, Behera SK, Luo JJ, Masson S, Jury M, Rao SA (2004) Coupled ocean atmosphere variability in the tropical Indian Ocean. Earth climate: the ocean-atmosphere interaction, geophys monogr, vol 147. Amer Geophys Union 189–212
- Yu L, Rienecker MM (1998) Evidence of an extra-tropical atmospheric influence during the onset of the 1997–98 El Nino. Geophys Res Lett 25:3537–3540
- Yu L, Rienecker MM (1999) Mechanisms for the Indian Ocean warming during the 1997–98 El Nino. Geophys Res Lett 26:735–738. doi: 10.1029/1999GL900072
- Zhang Z, Liang X, Ma Y (2011) Camera calibration with lens distortion from low-rank textures. IEEE Intl. Conf. on Computer Vision and Pattern Recognition (CVPR), 1347–1354