1 Introduction

1.1 Background

Venezuela is located in the north of South America, in the tropics between 0.7°N–12.2°N and 59.8°W–73.4°W. Caracas, Venezuela’s capital city, has a population of about 8 million people. About 40 % of the population lives on the hillsides surrounding the city in rather poor housing conditions. These huge suburbs, called barrios, are highly vulnerable to damage from extreme rainfall. Extreme precipitation can cause landslides, which in turn can cause considerable destruction. The biggest catastrophe took place in December 1999. The effects were enormous: 10,000 deaths were reported and the estimated cost of reconstruction amounts to 1.8 billion US dollars (Lyon 2003). The whole coast was affected and important roadways were destroyed. The strong influence of precipitation, especially of extreme events, on life and infrastructure in Venezuela shows the importance of further investigation.

INAMEH, the Instituto Nacional de Meteorología e Hidrología de Venezuela, is in charge of the weather forecasts and conducts some research on parameters which influence precipitation. Furthermore, an early warning system has been proposed in the Vargas State at the north central coast of Venezuela (Bravo de Guenni et al. 2010). This is one of the states we are studying in this research.

1.2 Context

1.2.1 Characteristics of the rainfall seasons

Precipitation in Venezuela is highly influenced by the topography and climate zones. The tropical coast at the Caribbean Sea is located in the north, the Andes in the west, the plains in the center and the rainforest in the south. Venezuela exhibits two primary rainfall regimes. In the central and eastern areas, the rainfall is unimodal, with the rainy season spanning May to November and the dry season November to April (Hastenrath 1966; Mundaray 2005). This distribution results from the influence of the Intertropical Convergence Zone (ITCZ), easterly waves and the anticyclone over the North Atlantic.

Hastenrath (1966) described the beginning of the rainy season which is characterized by the ITCZ lying north of the equator and over Venezuela. This is followed by a convergence of winds and a moisture transport from the east and the west towards the Caribbean. At the same time, the anticyclone over the North Atlantic moves northward. This reduces the pressure gradient between the southern flank of the high and the trough at the equator and weakens the northeasterly trades.

As the rainy season proceeds (described by Ashby et al. 2005), the easterly trades near the surface weaken and the convergence expands to the mid-troposphere. The weaker trades decrease the heat loss of the Atlantic. The vertical wind shear, which is strong at the beginning of the rainy season, is then reduced. Knaff (1997) added that these three factors (low surface pressure, low vertical wind shear and weaker easterly surface trades) favor convection over the Tropical Atlantic. The same study reveals that the southward movement of the North Atlantic High, the strengthening of the trades and the increase in vertical wind shear define the end of the rainy season.

During the dry season (November–April), the northeasterly trade winds and an anticyclonic system in the high levels of the troposphere influence the weather in Venezuela. The ITCZ is located in the south of Venezuela and the weather is characterized by clear skies, sometimes partly cloudy and with little rain (FAV 2000).

A second rainfall regime is observed west of the Andes and in the eastern Venezuelan highland. The rainfall in these areas is bimodal, with peaks in May–June and September–October and dry spells in July–August and November–April (Pulwarty et al. 1998). This bimodal distribution is controlled by the annual cycle of convection over northern South America (Pulwarty et al. 1992). In this study a third rainfall regime is described based on larger mean annual rainfall amounts than in the rest of the stations, but it is essentially a unimodal seasonal pattern.

1.2.2 Influence of the oceans on Venezuelan precipitation

Different studies about the influence of the oceans on precipitation in the Caribbean and Venezuela have been conducted. Influencing factors seem to be the sea surface temperature (SST) of the East Equatorial Pacific and the SST of the North Tropical Atlantic, connected by the atmospheric bridge. The effect of these influencing factors differs with time and location. There is broad agreement about the sign of the temperature anomalies in the North Tropical Atlantic and East Equatorial Pacific which may favor wet and dry conditions in the Caribbean and at the northern coast of Venezuela. Positive rainfall anomalies are associated with a cold East Equatorial Pacific and a warm North Tropical Atlantic. Enfield and Alfredo (1999) added that the strongest rainfall response occurs with a meridional dipole of sea surface temperature anomalies in the Tropical Atlantic and when the East Tropical Pacific and North Tropical Atlantic (NTA) SST anomalies are of opposite sign. Chen and Taylor (2002) suggested that a wet early wet season is linked to a warm Equatorial Pacific the winter before, but to a warm NTA the spring before. The time lag until the precipitation responds to the SST varies between studies (see for example Nurmohamed and Naipal 2006). Martelo (2003a) revealed that NTA SSTs influence precipitation in Venezuela with a lead time of zero to two months, Niño3 SST with up to four months, and Niño3.4 SST with up to three months.

Cárdenas et al. (2003) found almost consistently higher correlations between SST in the Pacific regions and rainfall in Bolívar (located in the southeast of Venezuela) than with precipitation in Vargas (at the north central coast). Martelo (2003a) discovered that northern Venezuela is more strongly influenced by the Atlantic, and eastern Venezuela more by the Pacific. Kayano and Andreoli (2006) investigated the oceanic influence on northeastern Brazilian rainfall and concluded that the influence of ENSO is much weaker than the effect of the Tropical South Atlantic.

1.2.2.1 Impact of the North Tropical Atlantic on precipitation

A positive SST anomaly in the North Tropical Atlantic causes more humid air over the NTA itself (Knaff 1997), upwind of Venezuela. This leads to decreased stability and the development of convection, especially with temperatures over \(26.5^{\circ }\hbox {C}\) (Gray 1968). The warm and humid air reaches the coast of Venezuela with the northeast trade winds. Previous studies agree that a positive anomaly over the North Tropical Atlantic favors strong precipitation at the northern coast of Venezuela and in the Caribbean (Enfield 1996; Chen and Taylor 2002; Taylor et al. 2002; Martelo 2003a; Bravo de Guenni et al. 2013).

1.2.2.2 The effect of El Niño-Southern Oscillation (ENSO)

Sequera (2009) found that El Niño is one of the most important mechanisms for climate variability in northern South America. Pulwarty et al. (1992) came to a similar conclusion and further suggested that La Niña increases rainfall amounts during the rainy season and over Maracaibo during July–August. La Niña is suggested to have a stronger influence than El Niño and favors higher precipitation at the coast (Cárdenas et al. 2002; Cárdenas et al. 2003). However, some drier Mays might occur during La Niña conditions (Martelo 2003a). Taylor et al. (2002) support this statement and add that a dry early rainy season in the Caribbean is connected with La Niña conditions in the early dry season. Taylor et al. (2002) also indicate that La Niña is related to heavy late wet season (August–October) rainfall. Lyon (2002) suggests that a transition from a warm to a cold ENSO phase is linked with stronger June–August rainfall. A similar effect is observed for the Caribbean, where increased (decreased) late season (August–October) rainfall is linked with a cold (warm) ENSO event (Taylor et al. 2002). The development of strong warm ENSO events is related to heavy rainfall in northern Venezuela with simultaneous rainfall deficits in Los Llanos. A strong El Niño is linked with widespread drying (Bravo de Guenni et al. 2013), and the effect of the warm ENSO phase is stronger in the dry season than in the wet season.

During El Niño, an anomalously displaced Hadley cell places subsidence over northern South America, which blocks upward motion and convection and leads to less precipitation (Poveda and Mesa 1997). The flow from the south to the Caribbean is strengthened and the moisture transport from the Atlantic to the Pacific is weakened (Mestas-Nuñez et al. 2007). This is associated with decreased precipitation over northern South America (Mestas-Nuñez et al. 2007). During El Niño there is not only less precipitation, but tropical disturbances, hurricanes and storms also appear less frequently (Cárdenas et al. 2002).

1.2.2.3 Atmospheric bridge

The zonal SST gradient between the Pacific and the Atlantic influences the strength of the trade winds over the Atlantic and in the Caribbean Low Level Jet [a maximum of easterly winds at 950 hPa over the Caribbean (Wang 2007)]. The SST gradient over both oceans and the resulting trade winds are the bridge connecting the oceans and have a positive feedback with the Walker Circulation over the Pacific basin (Wang et al. 2009). The ENSO signal reaches the North Tropical Atlantic in such a way that 50–80 % of the anomalous SST variability of the Tropical Atlantic is associated with the Pacific ENSO (Enfield and Mayer 1997). The Pacific leads the Atlantic: Enfield and Mayer (1997) concluded that a warming in the Atlantic occurs four to five months after a warming in the Pacific. Whyte et al. (2008) found that the wind feedback depends on the driver of the gradient. When the Pacific is driving the jet through an El Niño event, jet winds are stronger than usual to the north and southwest of the axis. In contrast, winds are intensified over the whole axis when the Atlantic is driving the gradient. The same study stated that a warm Pacific and a cool Atlantic strengthen the jet. In the future, the Caribbean Low Level Jet is projected to intensify and thus the rainy season to weaken (Taylor et al. 2013).

1.2.2.4 Seasonal difference between the impact of the Pacific and the Atlantic

The seasonal difference in the influence of the Pacific and the Atlantic has been analyzed as well. Taylor et al. (2002) and Spence et al. (2004) pointed out that the influence of the Tropical Pacific on Caribbean rainfall is stronger in the late than in the early wet season, while the impact of the Tropical Atlantic is more intense in the early wet season. However, the Tropical Atlantic has an effect on the whole rainy season (Taylor et al. 2002). The same investigation finds that the early wet season is modulated directly by the SST, while the late rainy season is affected mainly by the modification of convective conditions through vertical wind shear. Enfield and Alfredo (1999) found that the wet season is more strongly influenced by the NTA than by the East Tropical Pacific.

Martelo (2003a) and Sequera (2009) additionally investigated the difference between the wet season and the dry season and concluded that the North Tropical Atlantic has a stronger influence on the wet season, while the regions Niño3 and Niño3.4 have a stronger impact on the late wet and dry seasons.

This paper presents the influence of the North Tropical Atlantic and the East Equatorial Pacific on precipitation in Venezuela. Earlier studies show that there is definitely a connection between the SST of both oceans and the precipitation in the Caribbean and northern South America. This investigation analyzes which ocean has the stronger influence temporally, during the different seasons (early/late wet and early/late dry), and spatially, at the coast and inland. As the coastal region, we selected Vargas state. Vargas is on the central coast of Venezuela, north and northeast of Caracas, the capital of Venezuela. The precipitation in this state is strongly influenced by its topography. The Caribbean Sea lies north of Vargas, while the Avila mountain range lies to the south. As the inland region, we chose Bolívar state. Bolívar is located in southeastern Venezuela. The Rio Orinoco forms the northern border of the state, Guyana the eastern, Brazil the southern, and Amazonas state and the central plains the western border. It is interesting to find out whether there is a geographical difference in oceanic influence as well. Most of the articles mentioned above study Caribbean rainfall rather than precipitation in Venezuela. This paper goes beyond existing studies by analyzing, for two geographically contrasting regions (coastal and inland), the potential impacts of both oceans on precipitation and the seasonal and intra-seasonal differences of these impacts. These associations are also analyzed for current and future simulated data from the Max Planck Institute for Meteorology Earth System Model MPI-ESM, to understand how these spatio-temporal dependence dynamics might change in the future.

The paper is structured as follows: The data and methods used for this study will be presented in Sect. 2. Results on the influence of the SSTs in the North Tropical Atlantic and the Niño regions 3 and 3.4 on the precipitation in Venezuela investigated with Canonical Correlation Analysis will be presented in Sect. 3 for observed and simulated data. The discussion and conclusion of the results will be given in Sect. 4.

2 Data and methods

2.1 Data

2.1.1 Station data

The station data includes 127 stations, 36 in the Vargas state and 91 in the Bolívar state (see Fig. 1). The measurements at the stations in Vargas are conducted by the Dirección de Hidrologia y Meteorología, Ministerio del Ambiente y Recursos Naturales (MARN) and the data in Bolívar are collected by the Electrificación del Caroní (EDELCA), with the exception of three stations (MARN). These precipitation data are used for the Canonical Correlation Analyses (CCAs). The monthly values are accumulated values of the particular month. Four seasons are considered: early dry season (November–January), late dry season (February–April), early wet season (May–July) and late wet season (August–October). Before calculating the seasonal means, missing values are replaced by long term means of the corresponding month. Most stations have measurements during different time periods. That makes it very difficult to specify the period for the investigation. To have the same period in both states in order to compare the regions with each other, 60 years between 1951 and 2010 are chosen (Table 1).
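The preprocessing described above (gap filling with long-term monthly means, then three-month seasonal means) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `(n_years, 12)` array layout and the convention of assigning a season that crosses the calendar year to the year of its first month are assumptions made here.

```python
import numpy as np

# Season definitions from the text (month numbers, 1 = January).
SEASONS = {
    "early_dry": (11, 12, 1),
    "late_dry": (2, 3, 4),
    "early_wet": (5, 6, 7),
    "late_wet": (8, 9, 10),
}

def fill_with_monthly_means(monthly):
    """Replace NaNs by the long-term mean of the same calendar month.

    monthly: array of shape (n_years, 12) for one station, NaN = missing.
    """
    filled = monthly.copy()
    clim = np.nanmean(monthly, axis=0)      # long-term mean per calendar month
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = clim[cols]
    return filled

def seasonal_means(monthly, season):
    """Seasonal mean per year for one station (hypothetical helper).

    A season crossing the calendar year (November-January) is assigned to
    the year of its first month; an incomplete last season is dropped.
    """
    filled = fill_with_monthly_means(monthly)
    flat = filled.reshape(-1)               # continuous monthly series
    first_month = SEASONS[season][0]
    out = []
    for year in range(monthly.shape[0]):
        start = year * 12 + (first_month - 1)
        window = flat[start:start + 3]
        if window.size == 3:
            out.append(window.mean())
    return np.array(out)
```

With 60 years of input, the early dry season yields 59 values under this convention, because the last November–January window is incomplete.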

Fig. 1 Map of the stations (black dots), the precipitation regions of the model data (black boxes: coast and inland) and the oceanic regions (pink box: North Tropical Atlantic; turquoise box: Pac–Atl; red: Niño3.4; blue area: Niño3)

Fig. 2 Spatial loadings for mode 1 of SST in the Niño3.4 region (a) and of precipitation in Vargas (b) from a Canonical Correlation Analysis in the early dry season (ERSST and station data)

2.1.2 Sea surface temperature data

The Canonical Correlation Analyses are performed with the following SST data together with the station data described in the previous section.

The most recent version of the Extended Reconstructed Sea Surface Temperature (ERSST), v3b, is used as SST data for the correlations. The analysis is based on the International Comprehensive Ocean–Atmosphere Data Set (ICOADS) release 2.4. At the end of every month, the ERSST analysis is updated with the available Global Telecommunication System (GTS) ship and buoy data for that month. The anomalies are computed with respect to the 1971–2000 climatology (Xue et al. 2003). The data have a \(2^{\circ } \times 2^{\circ }\) resolution.

For the Atlantic, a region called North Tropical Atlantic (NTA) is chosen. It spans the area \(6.0^{\circ }\hbox {N}{-}18.0^{\circ }\hbox {N}, 60.0^{\circ }\hbox {W}{-}10.0^{\circ }\hbox {W}\) (Fig. 1). This area is defined after Penland and Matrosova (1998) and is used by NOAA (National Oceanic and Atmospheric Administration) for calculating the NTA index. A previous study by Bravo de Guenni et al. (2013) was based also on this region. All previous studies split the area in two parts: \(6.0^{\circ }\hbox {N}{-}18.0^{\circ }\hbox {N}, 60.0^{\circ }\hbox {W}{-}20.0^{\circ }\hbox {W}\) and \(6.0^{\circ }\hbox {N}{-}10.0^{\circ }\hbox {N}, 20.0^{\circ }\hbox {W}{-}10.0^{\circ }\hbox {W}\), probably to avoid having parts of the African continent in the chosen window. Here, a rectangular region is used and values over land are masked.

In the Equatorial Pacific, the investigated areas are Niño3 \((6^{\circ }\hbox {S}{-}6^{\circ }\hbox {N}, 150^{\circ }\hbox {W}{-}90^{\circ }\hbox {W})\) and Niño3.4 \((6^{\circ }\hbox {S}{-}6^{\circ }\hbox {N}, 170^{\circ }\hbox {W}{-}120^{\circ }\hbox {W})\) (Fig. 1). The regions Niño3 and Niño3.4 are officially defined as lying between \(5^{\circ }\hbox {S}{-}5^{\circ }\hbox {N}\). In this study, the areas are chosen to be one degree larger because of the spatial \(2^{\circ }\) resolution of the data. Furthermore, a fourth area called Pac–Atl is used, which includes all three regions (NTA, Niño3 and Niño3.4) \((6^{\circ }\hbox {S}{-}18^{\circ }\hbox {N}, 150^{\circ }\hbox {W}{-}10^{\circ }\hbox {W})\) (Fig. 1).
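As an illustration of how such rectangular boxes can be cut from a \(2^{\circ }\) grid with land values masked, consider the sketch below. The grid layout (cell centers at even degrees, longitudes from −180° to 178° with west negative), the use of NaN as land marker and all function names are assumptions for this example, not properties of the ERSST files themselves.

```python
import numpy as np

# Boxes as (south, north, west, east) in degrees; west longitudes negative.
REGIONS = {
    "NTA": (6.0, 18.0, -60.0, -10.0),
    "Nino3": (-6.0, 6.0, -150.0, -90.0),
    "Nino3.4": (-6.0, 6.0, -170.0, -120.0),
}

def box_mask(lat, lon, region):
    """Boolean mask of shape (len(lat), len(lon)) for grid points in the box."""
    south, north, west, east = REGIONS[region]
    lat2d, lon2d = np.meshgrid(lat, lon, indexing="ij")
    return (lat2d >= south) & (lat2d <= north) & (lon2d >= west) & (lon2d <= east)

def region_values(sst, lat, lon, region):
    """Ocean SST values inside the box as a 1-D vector (NaN = land, dropped)."""
    vals = sst[box_mask(lat, lon, region)]
    return vals[~np.isnan(vals)]

# Assumed 2-degree grid with centers at even degrees.
lat = np.arange(-88.0, 90.0, 2.0)
lon = np.arange(-180.0, 180.0, 2.0)
sst = np.zeros((lat.size, lon.size))
sst[49, 80] = np.nan                        # pretend (10N, 20W) is a land point
nta = region_values(sst, lat, lon, "NTA")
```

On such a grid the rows at 6°S and 6°N are the outermost ones included for the Niño boxes, which is one way to read the choice of boxes one degree larger than the official 5°S–5°N definition.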

2.1.3 Model data

In addition to the station data, CCAs are applied to data from climate model simulations performed with MPI-ESM, the Earth System Model (ESM) of the Max Planck Institute for Meteorology (MPIM). Modeled data from historical and future simulation experiments are used in the analysis. The objective of this analysis is to investigate the consistency of model-based results with station observations. MPI-ESM is a coupled model consisting of four main components: the atmospheric model ECHAM6 (Roeckner et al. 2003), the land surface and vegetation model JSBACH (Raddatz et al. 2007), the ocean and sea ice model MPIOM (Marsland et al. 2003), and the ocean biogeochemistry model HAMOCC (Maier-Reimer et al. 2005). The sixth-generation atmospheric general circulation model ECHAM6 is the most recent version in a series of ECHAM models evolving originally from the spectral weather prediction model of the European Centre for Medium Range Weather Forecasts (ECMWF) (Simmons et al. 1989). In this study, data from two simulations are used that have been performed with MPI-ESM within the Coupled Model Intercomparison Project, Phase 5 (CMIP5, http://cmip-pcmdi.llnl.gov/cmip5): a historical simulation (experiment id MPI-ESM-LR_historical_r1i1p1) from 1850 to 2005 using observed forcings for atmospheric greenhouse gas concentrations (among others), and a future simulation (experiment id MPI-ESM-LR_rcp85_r1i1p1) from 2006 to 2100 using a scenario of relatively high greenhouse gas concentrations that would lead to a radiative forcing of about 8.5 W/m² by the year 2100. The sea surface temperature and precipitation from these simulations are monthly values in T63 horizontal (approx. \(1.9^{\circ } \times 1.9^{\circ }\)) and L47 vertical resolution (top level at 10 hPa). Hagemann et al. (2006) found that the validation of the modeled precipitation (here ECHAM5) gives much better results over the oceans. The same investigation mentions that a higher horizontal resolution (T42 to T159) does not have as strong an effect on precipitation as an increased vertical resolution (L19 to L31). The model is able to reproduce most of the precipitation which is connected with ENSO, even though the precipitation might be higher than observed (Hagemann et al. 2006).

The precipitation areas of the model data are chosen to represent Vargas and Bolívar states as well as possible. Vargas is represented by the coastal area of three grid boxes at \(10.3^{\circ }\hbox {N}\) and between \(67.5^{\circ }\hbox {W}{-}63.8^{\circ }\hbox {W}\) (Fig. 1). The orography of the model in the boxes has its highest elevation with 200 m in the western grid box, decreasing to the box in the center with 120 m and further down to 100 m at the eastern boundary of the chosen area. The region which represents the inland (the Bolívar state) with the station data is a larger area with 3 × 3 grid boxes, between \(4.7^{\circ }\hbox {N}{-}8.4^{\circ }\hbox {N}\) and \(63.8^{\circ }\hbox {W}{-}60.0^{\circ }\hbox {W}\). Its orography varies more. The altitude is lowest in the northeast grid box near the Orinoco delta (100 m), and highest in the southwest with 780 m which stands for the table mountains (tepuys) present in this part of the country.

The oceanic regions of the model data span the following areas: NTA: \(6.5^{\circ }\hbox {N}{-}17.7^{\circ }\hbox {N}, 60.0^{\circ }\hbox {W}{-}11.3^{\circ }\hbox {W}\), Niño3: \(4.7^{\circ }\hbox {S}{-}4.7^{\circ }\hbox {N}, 150.0^{\circ }\hbox {W}{-}90.0^{\circ }\hbox {W}\), Niño3.4: \(4.7^{\circ }\hbox {S}{-}4.7^{\circ }\hbox {N}, 168.8^{\circ }\hbox {W}{-}120.0^{\circ }\hbox {W},\) Pac–Atl: \(4.7^{\circ }\hbox {S}{-}17.7^{\circ }\hbox {N}, 168.8^{\circ }\hbox {W}{-}11.3^{\circ }\hbox {W}\).

2.2 Methods

2.2.1 Canonical Correlation Analysis

The relationship between precipitation and SST is investigated with Canonical Correlation Analysis (CCA). Canonical correlation is used to characterize the relationship between two sets of variables, each with more than one member (Everitt 1996). It was originally developed by Hotelling in 1936. The description here is based on Everitt (1996) and Zorita et al. (1992). Canonical Correlation Analysis finds linear combinations of two sets of variables that maximize the correlation between them.

There are two sets of variables:

$$\begin{aligned} \mathbf x =(x_{1},\ldots ,x_{n})\;\; \hbox {and} \;\; \mathbf y =(y_{1},\ldots ,y_{m}) \end{aligned}$$
(1)

The CCA finds two new sets of variables \(U_{i}\) and \(V_{j}\). \(U_{i}\) and \(V_{j}\), the canonical correlation time series, are linear combinations of x and y,

$$\begin{aligned} U_{i}=\alpha _{i}^{T}\mathbf x \end{aligned}$$
(2)

and

$$V_{j}=\beta _{j}^{T}\mathbf y$$
(3)

and uncorrelated.

The problem is reduced to an eigenproblem. The coupled eigenproblem is defined as:

$$\mathbf C _{xx}^{-1}\mathbf C _{xy}\mathbf C _{yy}^{-1}\mathbf C _{yx}\alpha = \lambda ^{2}\alpha$$
(4)
$$\mathbf C _{yy}^{-1}\mathbf C _{yx}\mathbf C _{xx}^{-1}\mathbf C _{xy} \beta = \lambda ^{2}\beta$$
(5)

The eigenvalues \(\lambda ^{2}\) are the same in both cases, and \(\alpha\) and \(\beta\) are the eigenvectors or weights. \(\mathbf C _{xx}\) and \(\mathbf C _{yy}\) are the autocovariance matrices and \(\mathbf C _{xy}\) and \(\mathbf C _{yx}=\mathbf C _{xy}^{T}\) the cross-covariance matrices. The eigenvectors are chosen in such a way that the correlation between

$$\begin{aligned} U_{1}= \alpha _{1}^{T}\mathbf x \end{aligned}$$
(6)

and

$$\begin{aligned} V_{1}= \beta _{1}^{T}\mathbf y \end{aligned}$$
(7)

is maximized.
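A minimal numerical sketch of the first canonical mode is given below. It solves the eigenproblem for \(\alpha\) with the last factor read as the \(m \times n\) cross-covariance \(\mathbf C _{yx}=\mathbf C _{xy}^{T}\), and then obtains \(\beta\) from \(\alpha\). The function name and the layout of the inputs (time in rows) are assumptions of this example, and it presumes well-conditioned autocovariance matrices.

```python
import numpy as np

def cca_first_mode(X, Y):
    """First canonical mode of X (T, n) and Y (T, m), time in rows.

    Returns (lam, alpha, beta): canonical correlation and weight vectors.
    Hypothetical helper; assumes Cxx and Cyy are invertible.
    """
    Xa = X - X.mean(axis=0)
    Ya = Y - Y.mean(axis=0)
    T = X.shape[0]
    Cxx = Xa.T @ Xa / (T - 1)               # autocovariance of x
    Cyy = Ya.T @ Ya / (T - 1)               # autocovariance of y
    Cxy = Xa.T @ Ya / (T - 1)               # cross-covariance, shape (n, m)
    Cyx = Cxy.T                             # cross-covariance, shape (m, n)

    # Cxx^{-1} Cxy Cyy^{-1} Cyx alpha = lambda^2 alpha
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cyx)
    evals, evecs = np.linalg.eig(M)
    k = int(np.argmax(evals.real))          # leading eigenvalue = lambda_1^2
    lam = float(np.sqrt(max(evals.real[k], 0.0)))
    alpha = evecs[:, k].real

    # beta is proportional to Cyy^{-1} Cyx alpha (scale does not affect correlation)
    beta = np.linalg.solve(Cyy, Cyx @ alpha)
    return lam, alpha, beta

# Synthetic example: two fields sharing one common signal.
rng = np.random.default_rng(0)
signal = rng.standard_normal(200)
X = np.column_stack([signal + 0.5 * rng.standard_normal(200),
                     rng.standard_normal(200),
                     rng.standard_normal(200)])
Y = np.column_stack([0.8 * signal + rng.standard_normal(200),
                     rng.standard_normal(200)])
lam, alpha, beta = cca_first_mode(X, Y)
U1, V1 = X @ alpha, Y @ beta                # canonical variates of mode 1
```

The correlation between `U1` and `V1` then equals `lam`, which is the largest correlation attainable by any pair of linear combinations of the two sets.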

The size of the covariance matrix can be reduced via an empirical orthogonal function (EOF) analysis prior to the CCA. Limiting the number of EOFs (modes) from n and m to i and j reduces the size of the covariance matrix from \((n \times m)\) to \((i \times j)\). The number of modes is chosen such that the retained modes explain most of the total variance. This filtering is done to eliminate noise, but it can also discard useful signal.

The new variables which enter the Canonical Correlation are the Principal Components (PCs), the projections of the data onto the EOFs. Therefore, working with the autocovariance matrices \(\mathbf C _{xx}\) and \(\mathbf C _{yy}\) is easier because they, and hence their inverses, are diagonal. The canonical variates \(U_{i}\) and \(V_{j}\) are calculated as linear combinations of the PCs. The patterns of the Canonical Correlation can be obtained from the original variables by averaging in time:

$$\begin{aligned} \mathbf g _{i}= & {} \mathbf C _{xx}\alpha _{i} = \langle {U}_{i}\mathbf x \rangle \end{aligned}$$
(8)
$$\begin{aligned} \mathbf h _{j}= & {} \mathbf C _{yy}\beta _{j} = \langle {V}_{j}\mathbf y \rangle \end{aligned}$$
(9)

g and h, the spatial loadings, can be seen as the local covariance between the variables x and y and the canonical variates and can be considered as the indicator of the strength of the signal.
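The EOF pre-filtering and the computation of the spatial loadings can be sketched as follows. With PCs scaled to unit variance, the autocovariance matrices become identity matrices, so the CCA reduces to a singular value decomposition of the cross-covariance matrix of the PCs. The function names and the unit-variance scaling are choices made for this illustration and are not taken from any particular software.

```python
import numpy as np

def eof_pcs(X, n_modes):
    """Leading unit-variance principal components of X (T, n), time in rows."""
    Xa = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xa, full_matrices=False)
    return U[:, :n_modes] * np.sqrt(X.shape[0] - 1)   # each column has variance 1

def cca_on_pcs(X, Y, nx, ny):
    """CCA on nx and ny retained PCs; returns correlations, variates, loadings."""
    T = X.shape[0]
    Px, Py = eof_pcs(X, nx), eof_pcs(Y, ny)
    C = Px.T @ Py / (T - 1)                 # cross-covariance of the PCs
    A, lam, Bt = np.linalg.svd(C, full_matrices=False)
    Ucan = Px @ A                           # canonical variates U_i
    Vcan = Py @ Bt.T                        # canonical variates V_j
    Xa = X - X.mean(axis=0)
    Ya = Y - Y.mean(axis=0)
    g = Xa.T @ Ucan / (T - 1)               # g_i = <U_i x>, one column per mode
    h = Ya.T @ Vcan / (T - 1)               # h_j = <V_j y>, one column per mode
    return lam, Ucan, Vcan, g, h

# Synthetic example: 60 "years", 10 SST grid points, 6 rainfall stations.
rng = np.random.default_rng(1)
common = rng.standard_normal(60)
X = np.outer(common, rng.standard_normal(10)) + rng.standard_normal((60, 10))
Y = np.outer(common, rng.standard_normal(6)) + rng.standard_normal((60, 6))
lam, Ucan, Vcan, g, h = cca_on_pcs(X, Y, nx=3, ny=2)
```

Because the canonical variates have unit variance here, the columns of `g` and `h` are covariances between each original variable and the variate; dividing by the standard deviation of each variable would give the correlation form of the loadings.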

Canonical Correlation Analysis is chosen because it is a suitable method to identify the patterns in two multivariate time series that are optimally correlated. The precipitation of the four seasons in the states Vargas (coast) and Bolívar (inland) is canonically correlated with the SST of the regions Niño3 and Niño3.4 in the Equatorial Pacific, the NTA and Pac–Atl. The dry and the wet season are each split into an early and a late part because the precipitation varies within these seasons. Each of these four seasons (early wet, late wet, early dry and late dry) contains three months. The monthly values are averaged over these three months. The CCAs are performed with zero-lagged SST and precipitation and with lags of up to six months, with SST leading precipitation. Each lag is shifted one month backwards with respect to the previous one. That means that the first two lags still contain at least one month of the corresponding season. The spatial loadings are shown as correlations of the original data with the CCA time series. The spatial loadings present the typical patterns of the anomalies of SST and precipitation which tend to develop together. For each case, the CCA coefficient and spatial loadings of the first CCA mode that provides a physically meaningful pattern are described.
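The lag convention can be made concrete with a small sketch that lists the three SST months used for a given season and lag. The helper names are hypothetical; only the season definitions and the one-month shift per lag come from the text.

```python
# First month of each three-month season (1 = January), as defined in the text.
SEASON_START = {"early_dry": 11, "late_dry": 2, "early_wet": 5, "late_wet": 8}

def sst_months(season, lag):
    """Calendar months (1-12) of the SST window leading the season by `lag` months."""
    start = SEASON_START[season] - lag
    return [((start + k - 1) % 12) + 1 for k in range(3)]

def months_shared_with_season(season, lag):
    """Number of SST months that still fall inside the rainfall season itself."""
    return len(set(sst_months(season, lag)) & set(sst_months(season, 0)))

windows = {lag: sst_months("early_wet", lag) for lag in range(7)}
```

For the early wet season (May–July), lag 1 uses April–June and lag 2 uses March–May, so both still overlap the season, while lag 3 (February–April) does not. Lag 6 gives November–January, a window that crosses the calendar year.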

2.2.2 Climate Predictability Tool

The Climate Predictability Tool (CPT) is a program for performing CCA and is used in this study. The following description is based on the CPT user guide (Ndiaye and Mason 2010). CPT was developed by the International Research Institute for Climate and Society (IRI) at Columbia University, USA. CPT is a powerful tool to forecast seasonal climate in tropical and sub-tropical areas.

The CPT pre-filters the input data with an EOF analysis. The maximum number of modes can be specified. The Principal Components are calculated using the correlation matrix (the analysis is based on the standardized anomalies). When the model is constructed, a cross-validated forecast is made. The cross-validated forecast is a prediction of the hindcasts. The best cross-validated result is defined by the highest goodness index, calculated with the Pearson correlation coefficient. The goodness index can be seen as an average correlation between the transformed cross-validated forecasts and observations for all series. The correlations are first transformed to the Fisher z-scale, on which they are more nearly normally distributed, then averaged, and finally transformed back to the correlation scale. The number of modes used is chosen according to this goodness index.
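The averaging of correlations on the Fisher z-scale can be written in a few lines. This sketch follows the verbal description above; it is not CPT's source code, and the function name is hypothetical.

```python
import math

def goodness_index(correlations):
    """Average correlation via the Fisher z-transform.

    correlations: Pearson correlations, one per series, each strictly
    between -1 and 1 (atanh is undefined at the end points).
    """
    z = [math.atanh(r) for r in correlations]      # to the Fisher z-scale
    return math.tanh(sum(z) / len(z))              # mean, back to correlation scale

example = goodness_index([0.0, 0.8])
```

For the two correlations 0.0 and 0.8 the z-scale average corresponds to a correlation of 0.5, compared with an arithmetic mean of 0.4, which shows that high correlations carry somewhat more weight on the z-scale.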

The predictand, the rainfall data, is transformed to a normal distribution before the calculations, because its original distribution is skewed. CCA generally works best with normally distributed data. The empirical distribution is the basis for the transformation. The precipitation data are transformed to a uniform distribution on the unit interval. Standard normal deviates are then calculated from these percentiles. The transformation is reversed by converting back to percentiles and linearly interpolating them onto the original data. The goodness index and the spatial loadings are computed from the transformed data. Furthermore, the precipitation data are zero-bounded so that negative values are not predicted.
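A minimal sketch of such an empirical normal-score transformation is shown below. The plotting position rank/(n + 1) and the handling of ties (broken by order of appearance) are assumptions of this example; the text does not specify which variants CPT uses.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map values to standard normal deviates via their empirical percentiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank                            # rank 1 = smallest value
    std_normal = NormalDist()
    # Assumed plotting position: percentile = rank / (n + 1), inside (0, 1).
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]

# Skewed seasonal rainfall totals (mm), purely illustrative numbers.
scores = normal_scores([10.0, 0.0, 250.0])
```

The transformation keeps the ordering of the data but removes the skewness: the very large third value is mapped to a deviate no further from zero than the smallest value is on the other side.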

The settings of CPT are chosen as follows. The length of the cross-validation window is five. It defines how many years are left out at each cross-validation step. The number of years has to be odd, because only the year in the middle of the cross-validation window is predicted at each step. The time periods are 1951–2010 (station data), 1946–2005 (model data of the past) and 2041–2100 (model data of the future). The length of the training period defines the number of years used for constructing the model and is chosen as the first 50 years of the time periods. The remaining 10 years are used as the forecast period. Starting years for both input files (predictor and predictand) are the same. When computing CCAs with a lagged SST leading the precipitation, and the lag crosses the calendar year, the time period of the predictand starts 1 year later and the training period comprises only 49 years. Three-month means of November–January and December–February cross the calendar year and are assigned to the year of November and December, respectively; thus the data contain just 59 years.
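The cross-validation scheme with a window of five years can be illustrated as follows. How the first and last years of the record are treated is not stated here; the sketch lets the omitted window wrap around the ends of the training period so that the number of training years stays constant, which is an assumption, and the function name is hypothetical.

```python
def cv_training_years(years, target_index, window=5):
    """Years used to fit the model when predicting years[target_index].

    The window is centered on the target year; all years inside it are
    left out. Wrapping at the record ends is an assumption of this sketch.
    """
    if window % 2 == 0:
        raise ValueError("the cross-validation window length must be odd")
    half = window // 2
    n = len(years)
    omitted = {(target_index + k) % n for k in range(-half, half + 1)}
    return [year for i, year in enumerate(years) if i not in omitted]

training_period = list(range(1951, 2001))          # first 50 years of 1951-2010
train_for_1961 = cv_training_years(training_period, training_period.index(1961))
```

When 1961 is predicted, the years 1959–1963 are withheld and the remaining 45 years are used to build the model.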

Missing values are replaced by the median of each station. The maximum percentage of missing values per station and time step is set to 50 %. Such a high limit is necessary because of the poor quality of the station data. With a lower value, the number of usable stations would be too small for a reasonable analysis. The replacement is done using the untransformed data. Table 1 shows the remaining stations and the corresponding percentage of missing values for both regions.

Table 1 Percentage of missing values (%) for each season and station for the 1951–2010 period

3 Results

3.1 Station data

Before computing the CCA, an empirical orthogonal function (EOF) analysis is performed. Two EOFs are relevant for the CCAs of the precipitation in Vargas. In the dry season, the EOFs explain 67 % [early dry season (November–January)] and 64 % [late dry season (February–April)] of the variance. The explained variance of the two EOFs for the wet season is 46 % for the early season (May–July) and 44 % for the late season (August–October). The CCA applied to the precipitation in Bolívar included the first two EOFs as well. Only for the early wet season do the first four EOFs seem to be needed, explaining 65 % of the variance. The EOFs for the station data in Bolívar explain 59 % of the variance for the early dry season, 66 % for the late dry season and 53 % for the late wet season. For calculating the correlations between the NTA and Vargas precipitation, three (two in the early dry season in Bolívar) SST EOFs were used, which explained between 83 and 95 % of the variance. Two EOFs of the El Niño region fields were included in the CCA. They account for between 93 and 98 % of the variance. For Pac–Atl, between two and four EOFs seem to explain the relevant variance percentage of 63–85 %. The remaining EOFs were truncated as noise. The number of EOFs used for each CCA was chosen according to the goodness index so that the best possible forecast is provided.

Table 2 shows the order of the zero-lag canonical correlation coefficients for the different rainfall seasons and SST regions.

Table 2 Ranking of the correlation coefficients from the Canonical Correlation Analysis between the sea surface temperature (ERSST) indices and the precipitation in Vargas and Bolívar (station data) for the four seasons

3.1.1 Vargas

3.1.1.1 Early dry season

The early dry season spans the months of November, December and January. The influence of the NTA on precipitation is smaller (correlation coefficient = 0.22) than the influence of the Pacific regions (correlation coefficients = 0.49 and 0.47). When correlating with the NTA, the precipitation pattern of the spatial loadings of the first CCA mode cannot be explained physically: there seems to be no plausible explanation for inhomogeneous anomalies in this small state. Thus, the second CCA mode is discussed here. The spatial loadings show a large area with positive SST anomalies west of Africa and a positive rainfall anomaly in Vargas. The correlation coefficients of Niño3 and Niño3.4 are quite similar, with values of 0.49 and 0.47, respectively, and the CCA patterns look very similar. The spatial loadings of the CCA with Niño3.4 are displayed in Fig. 2. The spatial loadings of the precipitation in (b) show the anomalies at each station. The station data contain many missing values, so the number of stations is reduced (from 36 to 14 for Vargas and from 91 to 30 for Bolívar) when the 50 % missing-value threshold set in CPT is applied. Nevertheless, a positive SST anomaly in the Niño3.4 region is associated with a negative rainfall anomaly. The region Pac–Atl also has a strong influence on the early dry season rainfall, with a correlation coefficient of 0.49. The spatial loadings show positive anomalies over the whole domain, with smaller ones for the NTA region, and negative anomalies in Vargas.

3.1.1.2 Late dry season

In the late dry season, the NTA region has the strongest influence compared to the other regions and seasons with a correlation value of 0.38. The spatial loadings of the SST show a dipole structure with negative anomalies in the east and positive ones in the west. The loadings of the precipitation have negative values. The correlation coefficient is quite small with a value of 0.20. The correlation coefficient of Niño3.4 is higher, with a value of 0.29, and the spatial loadings are of opposite sign. Niño3 has homogeneous positive SST anomalies and negative precipitation anomalies while Niño3.4 has negative SST anomalies with stronger values at the western boundary of the domain and positive anomalies in Vargas. The correlation coefficient of the CCA with Pac–Atl is 0.26 and the spatial loadings are shown in Fig. 3. Again the SST anomalies in the Pacific and Atlantic are positive. This result is opposite to the results found for the NTA region, and the magnitude of the coefficients suggests a higher influence of the NTA on late dry season rainfall.

Fig. 3 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in Vargas (b) from a Canonical Correlation Analysis in the late dry season (ERSST and station data)

3.1.1.3 Early wet season

The CCA of the early wet season rainfall with the NTA has a correlation coefficient of 0.28, again lower than the estimated coefficient for the Pacific regions. The spatial loadings of SST and precipitation have again the same sign (positive). The correlation coefficient of the CCA with Niño3 is only slightly higher compared to NTA with a value of 0.31. Niño3.4 has a higher influence with a coefficient of 0.38. The spatial loadings of these two regions have opposite signs in precipitation and SST. The CCA with Pac–Atl has a correlation of 0.35. The spatial loadings show a dipole between Pacific and Atlantic, positive anomalies in the Equatorial Pacific and negative anomalies in the NTA. This pattern is combined with negative rainfall anomalies in Vargas. This result corresponds with the patterns of all three regions separately.

3.1.1.4 Late wet season

The late wet season is more strongly influenced by the SST of the Pacific. Like in the early dry season only the precipitation pattern of the second CCA mode when correlating with the NTA region can be explained physically. The correlation coefficient is small with a value of 0.18. The spatial loadings show negative SST anomalies in most of the domain. Only at its southern boundary, the SST anomalies have the same positive sign as the precipitation anomalies. The coefficients and patterns of the spatial loadings of the CCAs with the Niño3 and Niño3.4 region are very similar. The coefficients are 0.35 and 0.36 and the spatial loadings show negative precipitation anomalies and positive SST anomalies in both cases. The correlation coefficient of the CCA with Pac–Atl is quite high, with a value of 0.44. The patterns of the spatial loadings for SST and precipitation agree perfectly with the results of every single region: negative anomalies in the Pacific and North Equatorial Atlantic, positive anomalies in the South Equatorial Atlantic domain and in Vargas stations (Fig. 4).

Fig. 4 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in Vargas (b) from a Canonical Correlation Analysis in the late wet season (ERSST and station data)

To sum up, the influence of the NTA is highest during the late dry season when the lowest precipitation is measured. The regions Niño3 and Niño3.4 show similar results with slightly higher correlation coefficients with Niño3.4 in most cases (except for the early dry season). The region Pac–Atl has correlations of the same order or higher than the Niño regions. Its pattern reflects the NTA pattern only in the wet season, and the Pacific patterns during all four seasons.

3.1.2 Bolívar

3.1.2.1 Early dry season

The canonical correlation coefficient is very small with the NTA region (0.17) in the early dry season. The correlations with Niño3 and Niño3.4 are much higher and very similar, with estimated values of 0.77 and 0.76. The spatial loadings of the CCAs with both Niño regions show oppositely signed anomalies for SST and precipitation. The coefficient of the Pac–Atl region is high as well, with a value of 0.70, and its spatial loadings underline the magnitude of the coefficient. There are positive SST anomalies over the whole domain, with stronger ones in the Pacific than in the Atlantic, and negative precipitation anomalies at the Bolívar stations.

3.1.2.2 Late dry season

In the late dry season all zero-lag correlations of the four regions have a first spatial loading of the precipitation that cannot be explained physically. Therefore, only the second one is mentioned here. The correlation coefficients do not differ much between the regions. The correlations with the NTA and Niño3.4 have a coefficient of 0.21, Pac–Atl a slightly lower one with a value of 0.20, and the Niño3 has the highest value (0.34). The spatial loadings of the CCA with the NTA region show negative anomalies in the eastern NTA region and in Bolívar stations (Fig. 5). The spatial loadings of Niño3 have negative values in the east and positive ones in the west of the oceanic region and in the precipitation. That corresponds well with the results of Niño3.4 where the anomalies of SST are negative over the whole domain and positive in the Bolívar precipitation stations. The spatial loadings of Pac–Atl show negative values in the Pacific and opposite signs in precipitation. In the Atlantic there is a north-south dipole with positive anomalies in the south.

Fig. 5 Spatial loadings for mode 2 of SST in the NTA region (a) and of precipitation in Bolívar (b) from a Canonical Correlation Analysis in the late dry season (ERSST and station data)

3.1.2.3 Early wet season

The CCAs of the early wet season have higher correlation coefficients than the previous season. Previous studies agree with that only for the NTA region. The coefficient of the CCA with the NTA region is very high compared to the other regions, with a value of 0.85. The spatial loadings of the precipitation do not have anomalies of uniform sign over the whole Bolívar state. The SST has negative anomalies, as does precipitation at most stations. Some stations in the southeast have slightly positive values. For this season the results differ between the two Niño regions. The CCA with Niño3 has a correlation coefficient of 0.53. The spatial loadings show positive values in the SST and in the main part of Bolívar. Anomalies with opposite signs occur in the south of the state. The CCA with Niño3.4 has a higher coefficient (0.75). The spatial loadings have the same pattern but with opposite signs compared to the ones of the CCA with Niño3. The coefficient of the CCA with Pac–Atl is high as well, with a value of 0.82. The spatial loadings of the CCA with Pac–Atl provide a good summary of the results of each region. They are shown in Fig. 6.

Fig. 6 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in Bolívar (b) from a Canonical Correlation Analysis in the early wet season (ERSST and station data)

3.1.2.4 Late wet season

The CCA with the NTA and the precipitation in the late wet season provides a correlation coefficient which is slightly higher than in the Niño regions, with a value of 0.67 compared to 0.64 (Niño3 and Niño3.4). The spatial loadings of the CCA with the NTA region show negative anomalies at the precipitation stations associated with a dipole in the SST anomalies, with positive values in the north of the domain. The spatial loadings of the CCA with Niño3 have positive anomalies in the SST and negative ones at the precipitation stations. The spatial loadings of the CCA with Niño3.4 show similar patterns: negative anomalies in precipitation and positive ones in the SST. The SST anomalies are stronger in the western part of the region. The CCA with Pac–Atl shows that the impact of both oceans is important during this season. The coefficient is higher than that of any single region, with a value of 0.82. The spatial loadings (Fig. 7) summarize that negative SST anomalies in the Equatorial Pacific and positive ones in the NTA region are associated with positive rainfall anomalies in the late wet season.

Fig. 7 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in Bolívar (b) from a Canonical Correlation Analysis in the late wet season (ERSST and station data)

In summary, the influence of the NTA is low during the dry season and high during the wet season. The Niño regions and Pac–Atl have similar and high coefficients for both wet seasons and the early dry season.

3.1.3 Goodness index

For Vargas the goodness indices of the CCA with the NTA are all low, between −0.063 (late dry) and 0.114 (early wet). The indices of the correlations with the Niño regions are lower than 0.1 in the late dry and early wet seasons, still low (0.150 and 0.157) in the late wet season and highest in the early dry season. Good forecasts for the precipitation in Vargas can only be made with the Niño regions and Pac–Atl in the early dry season, when the indices range between 0.309 (Niño3.4) and 0.345 (Pac–Atl).

The zero-lag goodness indices of the correlations with the NTA and precipitation in Bolívar are low during both dry seasons (0.074 and −0.039). The CCA of the early wet season has a goodness index of 0.177, and that of the late wet season has the highest index with the NTA (0.222). The goodness indices of the Niño regions are mostly similar. The best forecast can be made for the early dry season, with indices of 0.515 and 0.519, followed by the late wet season with 0.365 and 0.354 and the late dry season with 0.241 and 0.133; the lowest indices occur in the early wet season with 0.121 and 0.004. The Pac–Atl region has indices of the same size as the Niño regions. Its highest index likewise occurs in the early dry season (0.468), the second highest in the late wet season (0.460) and the third in the early wet season (0.146); the late dry season provides the worst forecast, with an index of 0.064.

In summary, the goodness index is higher with the Pacific regions than with the NTA. Furthermore, the precipitation in Vargas and Bolívar can possibly be predicted in the early dry season by the Niño regions.

3.1.4 Non-zero lag analysis

The precipitation of the early dry season in Vargas has the highest correlation during JJA (lag 5) with the SST of oceanic regions NTA (0.28), Niño3.4 (0.49) and Pac–Atl (0.52), while Niño3 has the strongest influence on the precipitation in this season with one or three lags (0.52). Also in the late dry season, Niño3 has a different result than the other three regions. In the early wet season the Niño regions and Pac–Atl have the highest coefficients (0.31–0.38) at small lags (0–2). NTA has an earlier influence (December–February). The results of the late wet season correspond well for all regions. The highest correlations are found at lag 4 (April–June) or lag 3 (May–July). The highest goodness index coincides well with the highest correlation coefficient for all regions during this season. A good agreement is defined as a difference up to a maximum of two lags. For the NTA and Niño3 this condition is not fulfilled in the late dry season. The lags of the CCA with Niño3.4 agree for both, coefficient and goodness index in both late seasons, and for Pac–Atl in all four seasons.
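The lag bookkeeping used above can be made explicit with a short sketch. It assumes that a lag of k months shifts the three-month SST window k months before the rainfall season, which is consistent with the examples in the text (lag 5 before November–January is JJA; lags 4 and 3 before August–October are April–June and May–July). The function names and the coefficient values in the example are hypothetical and are not results of this study.

```python
MONTHS = "JFMAMJJASOND"

def predictor_window(season_start_month, lag):
    """Initials of the three-month SST window leading a rainfall season by `lag` months."""
    start = (season_start_month - 1 - lag) % 12      # zero-based first month of the window
    return "".join(MONTHS[(start + k) % 12] for k in range(3))

def lags_agree(corr_by_lag, goodness_by_lag, max_diff=2):
    """True if the lags of the highest coefficient and highest index differ by <= max_diff."""
    best_corr_lag = max(corr_by_lag, key=corr_by_lag.get)
    best_good_lag = max(goodness_by_lag, key=goodness_by_lag.get)
    return abs(best_corr_lag - best_good_lag) <= max_diff

# Early dry season starts in November (month 11); lag 5 gives June-August.
window_early_dry = predictor_window(11, 5)           # "JJA"
# Late wet season starts in August (month 8); lag 4 gives April-June.
window_late_wet = predictor_window(8, 4)             # "AMJ"

# Hypothetical coefficients and goodness indices keyed by lag (illustration only).
corr = {0: 0.30, 3: 0.36, 4: 0.35}
good = {0: 0.10, 3: 0.12, 4: 0.15}
agreement = lags_agree(corr, good)                   # lags 3 and 4 differ by one month
```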

In Bolívar the NTA has its highest correlation coefficients (0.53 and 0.58) at a long lag (six months) when correlating with the precipitation of the dry seasons and at a small lag (zero or one month) with the precipitation of the wet seasons (0.85 and 0.72). The lags with the highest goodness index correspond with these results. The strongest influence of Niño3 takes place a short time in advance (between one and two months), with correlations between 0.59 and 0.77. Only for the precipitation of the late wet season is the coefficient highest at lag four (0.65). The goodness indices underline this with a good agreement of the lags in these three seasons. The lags of the highest correlation coefficient and goodness index differ by four months in the CCAs of the precipitation of the late wet season. The influence of Niño3.4 is strongest at the same time or with a small lag of two months (precipitation of the late wet season), but the rain in the late dry season is most highly correlated at a lag of six months. The goodness index is highest at the same or at similar lags as the correlation coefficient. For Pac–Atl, too, both lags correspond well. The lags of the highest correlation coefficients seem to lie temporally between those of the NTA and the Niño regions.

The spatial loadings fulfill the pattern of anomalies suggested in the literature (the SST anomalies of the NTA and the precipitation have the same sign, opposite to the SST anomalies in the Pacific). Exceptions to this pattern are the rainfall in Vargas in the early dry season correlated with Pac–Atl at the lag of the highest goodness index, in the early wet season at the lag of the highest goodness index when correlating with Niño3, and during the late wet season at both lags of the NTA and Pac–Atl. For the precipitation in Bolívar, the spatial loadings fulfill the pattern with the exception of the lags when correlating the precipitation of the early wet season with the Niño regions and Pac–Atl. That agrees with the results of previous studies.

In summary, for precipitation in Vargas the results vary strongly between lags, and only the late wet season provides homogeneous results among the SST regions. In Bolívar the results of the NTA suggest that its wet season SST influences the whole year. The Niño regions exert their influence only a short time in advance, with the exception of Niño3.4 in the late dry season. The anomalies of the spatial loadings differ in the NTA and Pac–Atl for the precipitation in Vargas in the late wet season and in the Niño regions and Pac–Atl for the early wet season rainfall in Bolívar.

3.2 Model data for the past

In this part, the results of the CCAs of the model data for the twentieth century of sea surface temperature and the precipitation are displayed and discussed.

All EOFs which explain at least 1 % of the total variance enter the CCAs. None of the EOFs of the precipitation at the coast were truncated, so that 100 % of the variance remains for the CCAs. For the data of the inland, between five and seven of the nine EOFs correspond with this criterion. They represent between 97 and 99 % of the total variance. For the NTA data six or seven EOFs remain, a 95 % variance. Between three and five EOFs of the SST of Niño3 fulfill the criterion so that 96 or 97 % variance is covered. With the Niño3.4 region there are only three or four EOFs which enter the CCAs but explain between 98 and 99 % of the variance. The remaining variance of Pac–Atl is lower with values between 91 and 93 % even if the number of EOFs is much higher with eight to twelve. The explained variance of the leading mode of the Niño regions is much higher than the explained variance of the NTA and even more compared to Pac–Atl. This induces the different number of remaining modes.
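The 1 % retention criterion described above can be sketched as follows. This is a minimal illustration rather than the procedure of the software used in the study: the function name `modes_above_threshold` is hypothetical, and the small example field is constructed by hand so that its three modes carry known variance shares.

```python
import numpy as np

def modes_above_threshold(field, min_fraction=0.01):
    """Count EOFs explaining at least min_fraction of the variance and sum their share."""
    anomalies = field - field.mean(axis=0)            # remove the time mean at each location
    s = np.linalg.svd(anomalies, compute_uv=False)    # singular values, sorted descending
    explained = s**2 / np.sum(s**2)                   # variance fraction of each EOF
    keep = explained >= min_fraction
    return int(keep.sum()), float(explained[keep].sum())

# Hand-built field: 4 time steps, 3 locations, with centered and mutually orthogonal
# columns, so the squared singular values are 200, 8 and 0.04.
field = np.column_stack([
    10.0 * np.array([1.0, -1.0, 0.0, 0.0]),
    2.0 * np.array([0.0, 0.0, 1.0, -1.0]),
    0.1 * np.array([1.0, 1.0, -1.0, -1.0]),
])
n_kept, retained = modes_above_threshold(field)       # the third mode falls below 1 %
```

A fixed variance threshold makes the number of retained modes depend on how concentrated the variance is, which is why a field dominated by one leading mode needs fewer EOFs than a field with a flatter spectrum.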

Modeled precipitation amounts are lower than the observed values in general. The measured precipitation amounts clearly show that it rains more in Bolívar than in Vargas. The model data do not show this difference. This may be due to the orography, which is less pronounced in the model than in reality. For the coast, both data sets agree with the result that the early wet season is drier than the late wet season. In the modeled dry season there is nearly no precipitation, whereas the measured amounts for the early dry season are higher than in the late dry season. In the inland, both data sets agree that the wettest season is the early wet season.

The modeled SSTs have a wider spectrum of anomalies, especially in the NTA. The spectrum of the ERSST anomalies is narrower and more skewed than the modeled one.

Table 3 shows the ranking of the canonical correlation coefficients with the model data for the past.

Table 3 Ranking of the correlation coefficients from the Canonical Correlation Analysis of the sea surface temperature and the precipitation for the four seasons (model data of the past)

3.2.1 Coast

3.2.1.1 Early dry season

In the early dry season, the highest correlation is with respect to Pac–Atl (0.81), followed by NTA (0.74), Niño3.4 (0.49) and Niño3 (0.38). The spatial loadings of the CCA with the NTA show negative SST anomalies in the center of the region and positive anomalies at its southeast and northwest corners. The precipitation anomalies are negative, with the strongest anomalies in the center. Niño3 and Niño3.4 have similar results, as expected, with anomalies of opposite sign in SST and precipitation. The SST anomalies are less strong in the Niño3.4 region, and the precipitation anomalies look exactly like those in the CCA with the NTA region. The spatial loadings of the Pac–Atl show the same patterns of positive SST anomalies in the Niño regions and negative anomalies in the NTA region and the precipitation at the coast (Fig. 8).

Fig. 8 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation at the coast (b) from a Canonical Correlation Analysis in the early dry season (model data of the past)

3.2.1.2 Late dry season

In the late dry season the influence of the Niño regions becomes stronger compared to the previous season and stronger than that of the NTA region. The correlation coefficient with Pac–Atl is still the highest with a value of 0.58, but this time it is followed by Niño3 with a value of 0.42. The third strongest influence occurs with Niño3.4 with a value of 0.39, and the coefficient of the NTA region is only 0.34. The spatial loadings of the NTA show negative rainfall anomalies, slightly stronger in the west, and positive SST anomalies with highest values in the southeast and a small area with negative anomalies in the northwest. The spatial loadings of the Niño regions show positive SST and negative rainfall anomalies. In the Niño3 region the anomalies are higher in the west. The spatial loadings of the CCA with Pac–Atl are shown in Fig. 9. The SST anomalies are positive with the exception of a small area in the NTA and north of the Caribbean coast of Venezuela. The precipitation anomalies are negative. These patterns show, in addition to the coefficients, that the Equatorial Pacific has a stronger influence than the NTA on the precipitation at the coast of Venezuela in the late dry season.

Fig. 9 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation at the coast (b) from a Canonical Correlation Analysis in the late dry season (model data of the past)

3.2.1.3 Early wet season

In the early wet season, which spans the months of May to July, the correlation coefficients are a bit higher than in the season before and have the same order as at the beginning of the dry season. The Pac–Atl has again the highest correlation with a value of 0.68. With a value of 0.45, the NTA has the second strongest influence in this season. The correlations of the Niño regions are lower and similar to each other with coefficients of 0.40 (Niño3) and 0.41 (Niño3.4). The spatial loadings of the CCA with the NTA are shown in Fig. 10. There is a warm pool between the coast of Africa and 50°W. Negative anomalies occur in the north and the west of the region. The precipitation anomalies are negative with higher values in the east. The spatial loadings of the CCAs with the Niño regions show strong positive SST anomalies and strong negative anomalies in the precipitation with the maximum in the central grid box. The anomalies of the SST of Pac–Atl summarize the results of each region separately. The SST anomalies of the Pacific and of the southern part of the Atlantic area are negative and positive only in the north of the NTA region and in the Caribbean Sea. The precipitation anomalies are positive too.

Fig. 10 Spatial loadings for mode 1 of SST in the NTA region (a) and of precipitation at the coast (b) from a Canonical Correlation Analysis in the early wet season (model data of the past)

3.2.1.4 Late wet season

The oceanic influence on the precipitation at the coast of Venezuela seems to be strongest in the late wet season. Both Niño regions have a correlation coefficient of 0.75. The NTA has an even higher coefficient of 0.85. The Pac–Atl region has again the highest correlation coefficient with 0.91. In this season the area of the NTA where the SST anomalies have the same sign as the precipitation is even smaller than in the season before. Just a small part in the northwest has negative anomalies, the rest of the spatial loadings is positive. The corresponding rainfall anomalies are homogeneous and strong. The spatial loadings of the Niño regions have the same patterns and signs. Figure 11 shows the loadings of Niño3.4. The negative SST anomalies are strongest in the west, the precipitation anomalies are very high with the same strength in all three grid boxes. The spatial loadings of the CCA with Pac–Atl show negative anomalies in the Pacific with highest values in the west and at the west coast of Central America. Positive SST anomalies occur on the north coast of South America and in the Atlantic south of the Equator. The positive rainfall anomalies are strong and homogeneous.

Fig. 11 Spatial loadings for mode 1 of SST in the Niño3.4 region (a) and of precipitation at the coast (b) from a Canonical Correlation Analysis in the late wet season (model data of the past)

To sum up, the coefficients of Pac–Atl are always the highest, followed by NTA with the exception of the late dry season, where both Niño regions have a stronger influence. The oceanic influence is highest in the late wet season. The anomalies of the Niño regions are always of opposite sign to those of the precipitation. The SST anomalies of the NTA have the same sign as the rainfall at the beginning of the dry and the wet season. The patterns of the Pac–Atl reflect the results of the correlations with the Pacific and Atlantic regions separately.

3.2.2 Inland

3.2.2.1 Early dry season

In the early dry season the order of influence of the oceanic regions on inland rainfall is similar to that for coastal precipitation; only the Niño regions change their position in the ranking. The coefficient of Pac–Atl is 0.84, slightly higher than with the precipitation at the coast. The coefficient of the CCA with the NTA is 0.74, exactly as with the coastal precipitation. Niño3 has a correlation coefficient of 0.63, much higher than with the rainfall at the coast and also 0.1 higher than with Niño3.4. The spatial loadings of the correlation with the Pac–Atl are shown in Fig. 12. The SST anomalies are positive in the East Equatorial Pacific, as with the correlation of Niño3. The Atlantic is mostly colder, with positive anomalies only in small parts in the east and at the east coast of Central America. The spatial loadings of SST of the CCAs with the three regions separately show the same results. The precipitation anomalies are strongest in the north and east, where the altitude is low, when correlated with the NTA, and strongest in the grid boxes with intermediate elevation in the spatial loadings of the CCA with Niño3. The spatial loadings of precipitation of the CCA with Pac–Atl are strongly negative at all grid boxes except at the one with the highest altitude in the southwest.

Fig. 12 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in the inland (b) from a Canonical Correlation Analysis in the early dry season (model data of the past)

3.2.2.2 Late dry season

The results of the CCAs with late dry season rainfall are different for the coast and the inland. In the inland, the highest correlation coefficient is that of the CCA with the Pac–Atl with a value of 0.73, followed by the CCA with NTA with a coefficient of 0.67. The Niño regions have lower influence with coefficients of 0.53 (Niño3.4) and 0.44 (Niño3). The spatial loadings of the correlation with the NTA show positive anomalies in the south and in the center of the NTA and precipitation anomalies of the same sign, strongest where the orography is lowest. Niño3 and Niño3.4 have again the same pattern in the spatial loadings, although the coefficients differ by 0.09. The SST anomalies are positive and the rainfall anomalies negative. The spatial loadings of the Pac–Atl correlation are shown in Fig. 13. SST anomalies are positive in the Pacific and in the north and south of the Tropical Atlantic. In its center the anomalies are negative like the ones of the precipitation in the inland, with highest values in the southwest and northeast. Again, the spatial loadings of the CCA with Pac–Atl are a good summary of the patterns of the correlation with each of the other three oceanic regions.

Fig. 13 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in the inland (b) from a Canonical Correlation Analysis in the late dry season (model data of the past)

3.2.2.3 Early wet season

In the early wet season the canonical correlation coefficients are more similar between the regions than in the previous seasons. Again, the CCA with Pac–Atl provides the highest coefficient, with a value of 0.57. The three small regions have similar coefficients. The CCA with Niño3 has a coefficient of 0.52, NTA of 0.49 and Niño3.4 of 0.48. It seems that there is no notable difference between the influence of the three regions on the inland precipitation in this season. The spatial loadings of the CCA with the NTA can be seen in Fig. 14. The SST anomalies show a dipole with positive values in the northwest and negative ones in the southeast. The precipitation anomalies are positive and strongest in the north and in the east. The spatial loadings of precipitation of the CCAs with the NTA and Pac–Atl are positive, whereas they are negative in the Niño regions. The loadings of the SST in the Niño regions are positive. The SST of the CCA with Pac–Atl has negative anomalies in the Pacific with a strong cold tongue just at the Equator. In the Atlantic the anomalies are positive in the north, in the south and especially in the Caribbean Sea, and negative in the center of the area.

Fig. 14 Spatial loadings for mode 1 of SST in the NTA region (a) and of precipitation in the inland (b) from a Canonical Correlation Analysis in the early wet season (model data of the past)

3.2.2.4 Late wet season

In the second part of the rainy season, the coefficients are the highest compared to other seasons and similar to those at the coast. The Pac–Atl again provides the strongest influence with a coefficient of 0.91, followed by the NTA with 0.85. The Niño regions have coefficients of 0.73 (Niño3) and 0.72 (Niño3.4). The SST anomalies of the NTA are negative in most parts and positive only in the southwest and in the precipitation. The Niño regions have negative SST anomalies with maxima at the north boundary of the areas and positive rainfall anomalies. The spatial loadings of the Pac–Atl (Fig. 15) show positive anomalies in the Pacific with highest values just north of the Equator and at the west coast of Central America. In the Atlantic, there are positive anomalies in the major part north of the Equator and negative ones south of the Equator and in a thin line at the coast of South America.

Fig. 15 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in the inland (b) from a Canonical Correlation Analysis in the late wet season (model data of the past)

In summary, the influence of Pac–Atl is always the strongest, followed by the NTA with the exception only of the early wet season. In this season, all regions have similar coefficients. The signs of the anomalies are always opposite between Niño regions and precipitation and the same between NTA and precipitation, with the exception of the late wet season. The patterns of Pac–Atl always give a summary of the three regions separately. The oceanic influence is highest in the late wet season.

3.2.3 Goodness index

The goodness index of the zero-lag correlations with the NTA and the coastal precipitation is highest in the late wet season with 0.780, followed by the early dry (0.579), early wet (0.304) and late dry season (−0.075). The order of the seasons is the same for the indices and for the coefficients. The Niño regions make a good forecast in the late wet season with goodness indices of 0.670 (Niño3) and 0.679 (Niño3.4). In the other seasons the goodness index is always just around 0.3 or 0.4. The goodness indices of Pac–Atl have the same order as the ones of the NTA and lie between 0.274 and 0.815. The correlations with the inland do not differ greatly in the goodness indices when compared to the coast. The only difference is that Pac–Atl has a stronger influence in the late dry (0.475) than in the early wet season (0.370).

Summing up, Pac–Atl and NTA provide a better zero-lag forecast in the second half of the year, while the Niño regions have a higher index than the CCAs with the NTA in the early wet and the late dry season. Highest of all is always the goodness index of Pac–Atl. This conclusion is valid for the precipitation in both areas in Venezuela.

3.2.4 Non-zero lag analysis

The NTA and Pac–Atl have their strongest CCA coefficients always at small lags which include at least one of the three months of the corresponding season. Furthermore, the lags of the highest goodness indices differ only up to two lags compared to the highest correlation coefficient except for the CCAs of NTA with the precipitation in both regions in the late dry seasons, and the CCA of Pac–Atl with the inland precipitation of the late wet season. The strongest influence of Niño3 is at lags between zero and four, simultaneously in the early dry (inland) and late dry season (coast). The results of the goodness indices are similar but in the early dry season at the coast they differ by four months. The results of the Niño3.4 region are in good agreement within coast and inland. The highest correlation coefficient is at the zero-lag in both dry seasons, at lag two in the late wet and at lag six (coast) and one (inland) in the early wet season. This big lag of six months seems to be an exception and the correlation coefficients between lag zero and lag six only differ by 0.03.

The spatial loadings at the lags with the highest coefficients and indices of the Niño regions always have anomalies of opposite signs in SST and precipitation. For the correlations with the NTA region, anomalies of opposite sign are provided by the correlations in the late wet seasons and the late dry season at the coast. Furthermore, the common pattern of the spatial loadings is not fulfilled at the lag of the highest correlation coefficient in the early wet season and at the lag of the highest index in the late wet season, both for precipitation in the inland. In agreement with this, the spatial loadings at the lags of the highest coefficients and indices of the CCAs of Pac–Atl vary in the late wet seasons as well as the spatial loadings at the lag of the highest goodness index in the late dry season at the coast.

In conclusion, the influence is highest and forecast is as good as possible nearly always at small lags when at least one of the months of the particular precipitation season is included in the three-month average. At these lags, the anomalies in the East Equatorial Pacific are always of opposite sign to the ones of the precipitation and the NTA, which have the same sign with the exception of both late seasons in the NTA and the Atlantic part of Pac–Atl.

3.3 Model data for the future

This part presents and discusses the results of the CCAs of sea surface temperature and precipitation from the model data for the future.

The EOFs of the precipitation at the coast all explain more than 1 % of the total variance, so that all of them enter the CCAs and 100 % of the variance is retained (three EOFs). For the inland precipitation, between five and eight EOFs enter the CCAs, which explain between 97 and 99 % of the variance. The truncated variance of the oceanic regions is small as well. The three, four or five EOFs of the NTA explain at least 96 % of the variance. For Niño3 this percentage is even higher, with 98 % during all four seasons (three or four EOFs), and between 98 and 99 % of the total variance is represented in the two or three EOFs of the Niño3.4 region. The variance which enters the CCAs of the Pac–Atl is a little less (93–95 %), with between five and seven EOFs used. Again, the leading modes of the Niño regions explain more variance than those of the NTA and Pac–Atl.
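The truncation rule described above (retain every EOF that explains more than 1 % of the total variance) can be illustrated with a minimal sketch. It assumes the EOFs are obtained from a singular value decomposition of a time-by-space anomaly matrix. The function name `truncate_eofs`, the parameter `min_fraction` and the synthetic data are illustrative assumptions, not the code used in this study.

```python
import numpy as np

def truncate_eofs(anomalies, min_fraction=0.01):
    """Return the retained EOFs, their PCs and explained-variance fractions.

    anomalies: array of shape (n_time, n_space) with the time mean removed.
    min_fraction: a mode is kept if it explains more than this fraction
    of the total variance (0.01 corresponds to the 1 % rule in the text).
    """
    # Rows of vt are the EOF patterns; u * s gives the principal components.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)      # sorted in descending order
    n_keep = int(np.sum(explained > min_fraction))
    eofs = vt[:n_keep]
    pcs = u[:, :n_keep] * s[:n_keep]
    return eofs, pcs, explained[:n_keep]

# Synthetic stand-in for a small coastal field: 40 seasons, 3 grid points.
rng = np.random.default_rng(0)
field = rng.standard_normal((40, 3))
field -= field.mean(axis=0)

eofs, pcs, explained = truncate_eofs(field)
print("retained EOFs:", len(explained))
print("retained variance fraction:", round(float(explained.sum()), 3))
```

With only three grid points there are at most three EOFs, so a field in which every mode passes the threshold retains the full variance, as reported for the coastal precipitation. Larger fields lose the small share of variance carried by the discarded modes.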

The modeled precipitation amounts for the future show less rain in the late dry season than in the past. In the other seasons the changes are small.

The modeled SSTs for the past indicate a trend toward more La Niña events in the future and suggest that both El Niño and La Niña events will become more extreme.

The canonical correlation coefficients of the seasonal analysis of the model data for the future are listed in Table 4.

Table 4 Ranking of the correlation coefficients from the Canonical Correlation Analysis of the sea surface temperature and the precipitation for the four seasons (model data for the future)
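As a hedged illustration of how canonical correlation coefficients such as those ranked in Table 4 can be obtained from two sets of retained principal components, the sketch below uses the QR-based formulation of CCA. It assumes both input matrices have full column rank. The function name `canonical_correlations` and the synthetic series are illustrative assumptions and do not reproduce the model data.

```python
import numpy as np

def canonical_correlations(x, y):
    """Canonical correlation coefficients between x (n, p) and y (n, q).

    Both inputs are assumed to have full column rank, e.g. truncated PCs.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two data sets.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    # The singular values of qx^T qy are the canonical correlations,
    # returned in descending order (mode 1 first).
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example: one shared signal enters both fields with opposite sign.
rng = np.random.default_rng(1)
n = 60
shared = rng.standard_normal(n)
sst_pcs = np.column_stack([shared + 0.5 * rng.standard_normal(n),
                           rng.standard_normal(n),
                           rng.standard_normal(n)])
rain_pcs = np.column_stack([-shared + 0.5 * rng.standard_normal(n),
                            rng.standard_normal(n)])

r = canonical_correlations(sst_pcs, rain_pcs)
print("canonical correlations by mode:", np.round(r, 2))
```

The coefficients are non-negative by construction. Whether SST and precipitation anomalies have the same or opposite sign is therefore read from the spatial loadings of each mode, not from the coefficient itself.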

3.3.1 Coast

3.3.1.1 Early dry season

The rainfall in the early dry season at the coast is most strongly influenced by the SST of the Pac–Atl with a correlation coefficient of 0.53, followed by the two Niño regions with a value of 0.45. The impact of the NTA is quite low with a value of 0.30 (second CCA mode). The spatial loadings of the CCA with Pac–Atl summarize the results of the three regions. There are positive SST anomalies, stronger ones in the Pacific than in the Atlantic and negative precipitation anomalies.

3.3.1.2 Late dry season

The late dry season rainfall is not strongly correlated with the oceanic surface temperatures. The strongest correlation is with the Pac–Atl region with only 0.26 correlation coefficient, followed by the NTA with 0.22, Niño3 with 0.19 and Niño3.4 with 0.12. Except for Niño3, the second CCA modes were used to get a physically meaningful precipitation pattern of the spatial loadings. In the spatial loadings of the NTA there is a north-south dipole with positive values in the north. The corresponding precipitation anomalies are positive. The patterns of the correlations with the Niño regions do agree with negative anomalies in both variables. The spatial loadings of Pac–Atl agree with the Niño regions.

3.3.1.3 Early wet season

The rainfall of the early wet season is highly influenced by the NTA (0.64) and Pac–Atl (0.68). The Niño regions play an important role too, with correlation coefficients of 0.46 (Niño3) and 0.42 (Niño3.4). The anomalies of SST and precipitation are of opposite sign in all four correlations. Just one small area in the northeast of the NTA has negative anomalies like the precipitation.

3.3.1.4 Late wet season

Even stronger coefficients are found for the late wet season. The SSTs of all oceanic areas have high correlations, the strongest again being with Pac–Atl with a value of 0.82; the NTA has the second highest with a value of 0.76, and the Niño regions are just slightly lower with values of 0.75 (Niño3) and 0.74 (Niño3.4). All spatial loadings show positive SST anomalies and negative precipitation anomalies. In Fig. 16 the spatial loadings of the correlation with Pac–Atl are displayed. The SST anomalies of the Pacific are stronger than the ones of the Atlantic. The Caribbean SST anomalies are negative like the precipitation anomalies.

Fig. 16 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation at the coast (b) from a Canonical Correlation Analysis in the late wet season (model data of the future)

In conclusion, the Pac–Atl always provides the highest correlation and the Niño regions the lowest, except for the early dry season. The strongest oceanic influence is in the late wet season, followed by the early wet, early dry and late dry seasons (Niño3.4 has a higher correlation in the early dry than in the early wet season). The spatial loadings show anomalies of opposite sign for correlations with the Niño regions, except in the late dry season. The NTA likewise has only one exception, with anomalies of the same sign in SST and precipitation in the late dry season. The correlations with Pac–Atl do not have opposite anomalies in SST and precipitation in the late dry season.

3.3.2 Inland

3.3.2.1 Early dry season

The influence of the oceans on the rainfall in the inland in the early dry season is higher than on the coastal precipitation. The highest correlation is with the Niño3 region (0.71), followed by the Pac–Atl region (0.68), the Niño3.4 region (0.61) and the NTA (0.41). The spatial loadings show SST and precipitation anomalies of opposite sign for all four regions.

3.3.2.2 Late dry season

In the late dry season the coefficients are again higher compared to the results of the rainfall at the coast. Pac–Atl has the highest correlation coefficient with a value of 0.68. The other three regions have similar coefficients. Niño3 has the strongest influence of all of them with a value of 0.46, NTA has a coefficient of 0.41 and Niño3.4 of 0.38. The spatial loadings of the CCA with NTA region provide anomalies of opposite sign between SST and precipitation. Both Niño regions show mainly anomalies with opposite sign but with a pool of anomalies like the precipitation between 110°W and 130°W. This pattern can also be seen in the spatial loadings of the Pac–Atl where the Atlantic has negative anomalies and the precipitation positive ones (Fig. 17).

Fig. 17 Spatial loadings for mode 1 of SST in the Pac–Atl region (a) and of precipitation in the inland (b) from a Canonical Correlation Analysis in the late dry season (model data of the future)

3.3.2.3 Early wet season

The precipitation in the early wet season is highly influenced by the SST of all regions. The canonical correlation coefficients range from 0.71 (NTA and Niño3.4) to 0.74 (Pac–Atl and Niño3). The Niño regions and the Pac–Atl region show positive SST anomalies and negative precipitation anomalies in the spatial loadings. The CCA with the NTA has anomalies of opposite signs too, except for a small area in the northeast of the NTA with negative anomalies like the precipitation.

3.3.2.4 Late wet season

The correlation coefficients of the CCAs of the late wet season are similarly high. The NTA has a coefficient of 0.83. The second strongest influence is that of Pac–Atl with a coefficient of 0.75, closely followed by the Niño regions with a value of 0.73. The spatial loadings of the correlation with the NTA show mainly positive SST anomalies; only in the very southwest are there negative loadings, like the sign of the precipitation anomalies. Both Niño regions have anomalies of opposite sign, as does the Pac–Atl region with the exception of the Caribbean Sea at the coast of Venezuela and Colombia.

In short, as at the coast, the oceanic influence is higher in the wet than in the dry seasons. The impact of the Pac–Atl region is always strong, being strongest during the late dry and early wet seasons. The NTA region has a higher correlation coefficient than both Niño regions only in the late wet season. The NTA region has the smallest influence compared to the other regions in the early dry season, and an influence similar to that of the Niño regions in the late dry and early wet seasons. The spatial loadings have anomalies of opposite sign with all regions in the early dry and both wet seasons. The NTA additionally shows this in the late dry season, but in the spatial loadings of the Niño regions and the Pac–Atl region, the SST anomalies have opposite signs to the precipitation anomalies only in a pool in the middle of this Pacific area between 110°W and 130°W.

3.3.3 Goodness index

All four regions provide the best zero-lag forecast in the late wet season (between 0.544 and 0.588). The Niño regions and Pac–Atl have similarly high indices in the early wet season, and the Niño regions even in the early dry season. The late dry season is the least predictable with all oceanic regions.

3.3.4 Non-zero lag analysis

The lags with the highest correlation coefficients are small (between zero and two) in the early dry and the wet seasons, so that at least one month of the three-month average of the corresponding season is still included in the mean. Exceptions are the CCA of Niño3 with the precipitation in the early dry season at the coast, of the NTA with the rainfall of both regions in the same season, and of Pac–Atl with the precipitation in the late wet season in the inland. The late dry season rainfall at the coast is influenced at large lags of six months (except by Niño3); for the inland precipitation, large lags occur only with the NTA region. The lags with the highest goodness indices differ from the ones with the highest correlation coefficients by at most two lags. Every oceanic region has one exception to this pattern.
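The statement that lags of zero to two still share at least one month with the rainfall season can be checked with a short sketch. It assumes that a lag of k months means the three-month SST window is shifted k months earlier than the three-month precipitation season. The season months used here and the function names are hypothetical and serve only as an illustration.

```python
def lagged_window(season_months, lag):
    """Shift a three-month window back by `lag` months (months are 1..12)."""
    return [((m - 1 - lag) % 12) + 1 for m in season_months]

def shared_months(season_months, lag):
    """Count the months the lagged SST window shares with the season."""
    lagged = lagged_window(season_months, lag)
    return len(set(season_months) & set(lagged))

# Hypothetical three-month season (December to February).
season = [12, 1, 2]
for lag in range(0, 7):
    print(lag, lagged_window(season, lag), shared_months(season, lag))
```

Under this assumption the overlap is three, two and one month for lags zero, one and two, and vanishes from lag three onward, which is why larger lags correspond to a forecast based purely on earlier SSTs.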

The spatial loadings of the Niño regions show opposite anomalies between SST and precipitation, except for the late dry season at the coast. The patterns with the NTA region are opposite too in nearly all cases, and the Pac–Atl mirrors that with homogeneous SST anomalies in the Pacific and Atlantic, oppositely signed to the rainfall anomalies.

In summary, the highest oceanic influence and the best predictability with the SSTs are found at lags between zero and two. Only the late dry season at the coast seems to be influenced already by the SST of the previous autumn. The spatial loadings show anomalies which are mainly of opposite sign between the SSTs and the precipitation.

4 Discussion and conclusions

The connection between SST and precipitation was analyzed by Canonical Correlation Analysis. The CCA results of the four seasons are as follows:

  • The precipitation of the station data in Vargas in the late dry season and in Bolívar in the wet seasons is more strongly influenced by the NTA than by the Niño regions (Table 2). The results for Bolívar agree with previous studies. The Niño regions and the Pac–Atl region have similar and high canonical correlation coefficients for both parts of the wet season and the early dry season (Table 2). The model data provide some opposite results (Tables 3, 4). In most cases, especially in the wet seasons, the NTA has a stronger influence than the Niño regions. In the observations this is only the case for Bolívar. The literature does not suggest such a dominating influence of the NTA. The Pac–Atl always provides the highest correlation, except for the CCA of inland precipitation in the early dry and the late wet season for the future. The oceanic influence is highest in the late wet season, when most precipitation occurs. Low correlations occur in the late dry season, when the rainfall is minimal, in the future with both precipitation regions and in the past in the inland. Thus, compared to the model data, the station data show fewer cases where Pac–Atl has the highest correlation coefficient, the season with the highest correlation is more variable, and the strength of the correlation is higher in Bolívar than in Vargas. When comparing the model data results between past and future, no clear changes can be observed. The NTA influence dominates that of the Niño regions in the wet seasons, and the oceanic influence on the late wet season remains the strongest compared to the other seasons.

  • With the station data, the zero-lag goodness index is higher with the Pacific than with the Atlantic. Furthermore, the precipitation in Vargas and in Bolívar can possibly be predicted in the early dry season by the Niño regions. With the model data, all four regions provide the best zero-lag forecast in the late wet season. Predictability is lowest in the late dry season (except for the coastal precipitation of the past with Niño3 and the inland precipitation of the past with Niño3.4 and Pac–Atl). In the past, the early dry and late wet season precipitation can be better predicted by the NTA than by the Niño regions. In the future, the Niño regions provide a better forecast than the NTA of the precipitation of all seasons (except for the late dry season in the inland).

  • The results of the CCAs with the station data at the lags with the highest impact suggest that the SST of the wet season in the NTA region influences the precipitation for the whole year in Bolívar. In most cases the Niño regions influence precipitation most strongly a short time in advance. With model data, the highest influence is almost always at small lags where at least one of the months of the particular season is included in the three-month average. In general, the lags with the highest goodness indices do not differ strongly from the lags of the highest coefficients. In conclusion, the best precipitation forecast with the highest correlation coefficient can be made with the SST a short time in advance.

  • Comparison of the two Niño regions shows that there is no important difference in the influence on the precipitation in Venezuela between the two areas in the Equatorial Pacific. In most cases, the station data indicate a higher influence of Niño3.4. Past modeled precipitation at the coast also indicates a stronger influence from this region. In the future, Niño3 has a stronger influence in nearly all seasons.

  • The patterns of the spatial loadings of the CCAs with the station data show, in most cases, anomalies of the same sign in the NTA and the precipitation (e.g. Fig. 7). The anomalies of the Niño regions have opposite signs to the ones of the NTA and the precipitation (e.g. Fig. 4). These results correspond with previous studies (Enfield 1996; Cárdenas et al. 2002; Chen and Taylor 2002; Taylor et al. 2002; Cárdenas et al. 2003; Martelo 2003a, b; Bravo de Guenni et al. 2013). With the model data for the past, anomalies of opposite sign occur in NTA and precipitation when correlating the NTA and Pac–Atl in the late dry (Fig. 13) and the late wet season (Fig. 15). The spatial loadings of the CCAs with the model data for the future do not show this pattern in most of the correlations with the NTA and Pac–Atl region. Thus, the model seems to be able to capture the precipitation response to ENSO in the correct way, but not the impact of the NTA.

The patterns of the spatial loadings which occur most of the time in the station data and in the Pacific regions of the model data can be explained as follows.

La Niña produces a positive rainfall anomaly in Venezuela. This can be explained by the zonal and the meridional atmospheric circulation, the Walker Circulation and the Hadley Cell. The Walker Circulation (January) during La Niña leads to the ascent of air over South America, stronger and faster than during the normal phase of ENSO (Lau and Yang 2002). The winter Hadley Cell (December–February) is weakened and has its branch of ascending air anomalously north of the equator (Quan et al. 2004). The upward motion, additionally supported by positive NTA SST anomalies, leads to positive precipitation anomalies in northern South America, at least in the early dry season. Furthermore, the Subtropical Jetstream influences the intensity of the precipitation. The Subtropical Jetstream generates subsidence, which gives rise to the subtropical desert regions (Martelo 2003a). During the La Niña phase, this high wind speed air current is located farther north than usual. This diminishes the subsidence over Venezuela and enables convection (Arévalo, personal communication). The connection between La Niña and the precipitation in Venezuela is currently being studied and is not yet fully understood.