1 Introduction

The Caribbean atmospheric circulation plays both a direct and an indirect role in Caribbean precipitation, through variations in divergence (Gamble and Curtis 2008), by influencing sea surface temperatures in the Gulf of Mexico, or through air–sea interactions (Wu and Kirtman 2011). While average annual, seasonal and monthly atmospheric circulation climatologies and their related influence on Caribbean climate are well understood, the day-to-day variability in the Caribbean atmospheric circulation has not yet been characterized. The current understanding of the Caribbean atmospheric circulation is based on the characteristics of the North Atlantic subtropical High (NAH) and its related features including the north–east trade winds and the Caribbean Low Level Jet (CLLJ). Previous investigations have related the direct influence of these features of Caribbean atmospheric circulation to rainfall characteristics on annual, seasonal, and monthly scales (Amador 2008; Hastenrath 1976; Hastenrath and Lamb 1977; Muñoz et al. 2008; Muñoz and Enfield 2011; Wang 2007; Wang and Lee 2007; Cook and Vizy 2010), and have investigated the influence of other major factors, such as Atlantic and Pacific sea surface temperatures (SSTs) and the El Niño-Southern Oscillation (ENSO), on the variability of regional and sub-regional precipitation in the Caribbean region (e.g. Enfield and Alfaro 1999; Giannini et al. 2000; Chang et al. 2000; Chen and Taylor 2002; Taylor et al. 2002; Spence et al. 2004; Wang 2001; Wang et al. 2006; Charlery et al. 2006; Jury et al. 2007), and of the North Atlantic Oscillation (NAO) on sub-regional rainfall.

Thus, previous studies on atmospheric variability were focused on mean monthly atmospheric circulation characteristics determined in relation to the monthly precipitation characteristics of the Caribbean and Central America region. Dry season (DJFMA) mean surface wind field/atmospheric circulation maps show that the line of confluence associated with the Intertropical Convergence Zone (ITCZ) is located close to the equator in the Atlantic and at about 5°N–10°N in the Pacific (Hastenrath and Lamb 2004, Fig. 1). This is accompanied by a strong meridional pressure gradient and strong trade winds in the Northern Hemisphere. In contrast, the rainy season is distinguished by a northward displacement of the ITCZ in which the east–west axis of the confluence zone is approximately 5°N–10°N in the Atlantic and 10°N–15°N in the Pacific (Hastenrath and Lamb 2004, Fig. 2). Cross-equatorial wind flow from the Southern Hemisphere’s trade winds reaches 5°N–10°N. The May/June and September/October periods are accompanied by subsidence due to the southward displacement or expansion of the NAH. During September to November, the trade winds are relatively weak and hurricane activity with associated easterly waves tends to peak (Amador 2008). Monthly Atlantic trade wind speeds are linked to the NAO (George and Saunders 2001).

The CLLJ, a sub-regional feature also known as the Intra-America Seas low-level jet (Amador 2008), is a localized feature of the atmospheric circulation in the Caribbean with an east–west axis located in the Caribbean Sea near 15°N (Amador 2008) between northern South America and the Greater Antilles islands. The CLLJ transports moisture from the tropical Atlantic into the Caribbean Sea and the Gulf of Mexico, and influences rainfall in the United States (Gamble and Curtis 2008; Wang 2007; Amador 2008; Cook and Vizy 2010; Muñoz and Enfield 2011). The CLLJ exhibits maximum wind speeds at the 925 hPa level with wind speeds up to 16 m/s (Amador 2008; Whyte et al. 2008) while wind speeds at the 10 m level are between 8 and 10 m/s (Chadee and Clarke 2014). It experiences monthly maximum wind speeds in January and July with minima in May and October (Wang and Lee 2007; Wang 2007). The secondary wind speed maximum in July is related to the westward extension of the NAH (Wang and Lee 2007). The CLLJ bifurcates into two branches, one that flows over Central America and into the Pacific while the other connects with the Great Plains low-level jet in the United States (Cook and Vizy 2010). It is modified by SSTs in the Caribbean (Wang and Lee 2007), the ENSO (Amador 2008; Whyte et al. 2008), and the NAO (Wang 2007). The semi-annual cycle of the CLLJ and its relationship to precipitation in the Caribbean have been used to evaluate general circulation and regional climate models (Martin and Schumacher 2011; Taylor et al. 2013). The CLLJ region, because of the localized spatial maximum of wind speeds, has the highest wind power potential (Chadee and Clarke 2014) and wave energy potential (Appendini et al. 2015) in the Caribbean.

The day-to-day atmospheric circulation of the Caribbean has not been characterized. The daily atmospheric circulation, although complex and chaotic in nature, may be reduced into a discrete set of frequently occurring daily atmospheric circulation types on the regional scale with associated frequencies of occurrence (Philipp 2009). The circulation types could be considered to be attractors of the atmospheric circulation system (Pastor and Casado 2012). Members of each circulation type share common characteristics but differ from members of other types. Even though a discrete number of atmospheric circulation types are unable to account for the full dynamics of the atmospheric continuum system, they have proved useful in understanding the regional and local climate variability of surface variables. Atmospheric circulation types in several regions globally have been linked to surface variables such as surface level ozone concentrations (Cooter et al. 2007; Saavedra et al. 2012), aerosol optical depths and concentrations of PM10 (Zhang et al. 2012), air temperature (Hoy et al. 2013), cold spells (Guentchev and Winkler 2010), rainfall (Hope et al. 2006; Raziei et al. 2012; Romero et al. 1999), flooding (Prudhomme and Genevier 2011), extreme surface winds (Peña et al. 2011), and biological indicators such as potato yield (Sepp and Saue 2012). Our interest in the daily atmospheric circulation types (CTs) lies in extracting recurring daily types that could be used in future statistical–dynamical downscaling studies to develop high resolution wind climate maps at the mesoscale which are required to determine those areas that are best suited for wind farm development. Mesoscale wind maps are, to a large extent, dependent on the large-scale atmospheric circulation. 
The mean wind climate at the mesoscale is determined by using large-scale wind conditions that are representative of the wind climate, dynamically downscaling each representative condition for hourly wind speeds at scales relevant to the use of wind energy technologies, and then weighting the wind speeds according to the frequency of occurrence of the representative large-scale wind conditions. Strategies for selecting representative large-scale wind patterns include (1) determining a ‘typical wind year’ by concatenating ‘typical wind months’ as characterized by cumulative wind speed and wind direction distribution functions (e.g. Kotroni et al. 2014), (2) stratified random sampling of days (AWS Truepower 2012), and (3) classifying geostrophic winds estimated from pressure gradients at selected locations into wind direction and wind speed intervals (Frank et al. 2001). Alternatively, representative large-scale wind conditions can be obtained by extracting recurring daily atmospheric wind patterns using a classification procedure. This approach has several benefits. In tropical regions such as the Caribbean, pressure gradients and the Coriolis force are weak. Thus, atmospheric circulation is preferably analyzed through streamline analysis. Classification of atmospheric circulation patterns in domains encompassing tropical regions has been performed on wind fields (e.g. Espinoza et al. 2012). A classification of daily atmospheric circulation types in the Caribbean based on wind patterns will assist in understanding the mechanisms forcing regional climate variability. As Caribbean climate is also affected by other large-scale phenomena such as the ENSO, the NAO, the Pacific North American (PNA) pattern, and the Pacific Decadal Oscillation (PDO), regional changes in wind power availability and other climatic parameters may be anticipated. 
Furthermore, daily atmospheric circulation types extracted from reanalysis data may assist in determining those general circulation models (GCMs) that realistically capture circulation types and current climate conditions (Belleflamme et al. 2012; Cassano et al. 2006; Pastor and Casado 2012). Climate change projections from those GCMs which reproduce the current climate may be more useful for regional climate downscaling studies for determining local scale impacts of climate change.

Although most climatological classification studies have focused on the relationships between large-scale circulation patterns and surface variables, there is still considerable interest in the dominant circulation types. Various circulation classifications for other regions have provided additional insight into their regional climate. Jiang et al. (2012) found that daily synoptic circulation types over east Australia featuring easterly or westerly troughs were positively correlated with the Southern Oscillation Index (SOI) and that some synoptic types associated with an east–west trough declined in annual frequency from the 1970s to 1995. Alexander et al. (2010) classified daily winter (JJA) mean sea level pressure (MSLP) over Australia for the 1907–2006 period into twenty synoptic patterns and found that patterns characterized by a marked high pressure across continental Australia decreased in frequency and that their variability was linked to the ENSO. Philipp et al. (2007) found that daily MSLP patterns over the North Atlantic-European region could be described by 6–11 daily circulation types depending on season. Casado et al. (2009) identified eight circulation types over the Euro-Atlantic region and an increase in annual frequencies of ridge circulation types during the 1962–2002 period. Solman and Menéndez (2003) found that daily winter 500 hPa geopotential heights over the southern part of the South American continent could be grouped into five frequently occurring circulation types and that two of the five circulation types had anomalously high frequencies during positive ENSO events. Galambosi et al. (1996) classified 500 hPa pressure field heights over southwestern United States for each season into eight or nine classes and Burlando (2009) found that eight wind flow patterns adequately explained the wind climate over the Mediterranean.

As there are no catalogues for daily atmospheric circulation in the Caribbean region, the purpose of this study therefore is to extract the recurring daily circulation patterns over the Caribbean. In this work, the recurring daily atmospheric circulation patterns in the Caribbean region are extracted from 850 hPa wind fields using a two-stage cluster analysis method comprising a hierarchical scheme followed by k-means. We also investigate the temporal characteristics of the atmospheric circulation types as well as the relationships between the frequency of occurrence and several large-scale teleconnections. This work constitutes the first daily atmospheric circulation catalogue for the Caribbean region, and in future, will be used for determining high resolution wind maps for Caribbean islands and assessing the potential impacts of climate change on wind power availability. This study is organized as follows: Sect. 2 describes the domain and the atmospheric data used, with details of the clustering method given in Sect. 3. A detailed description of the results and discussion is provided in Sect. 4, with particular emphasis on frequency and variability on monthly and annual scales, trends in the frequency of occurrence, persistence of each atmospheric circulation type, transitions of one atmospheric circulation type to another, and correlations between frequency of occurrence of each CT and teleconnections. Finally, some conclusions are summarized in Sect. 5.

2 Domain and data

The domain considered for determining the atmospheric circulation types over the Caribbean is bounded by latitudes 0°N and 30°N and longitudes 110°W and 40°W (Fig. 1). The chosen domain is larger than the Caribbean region in an attempt to capture climatic factors in the Atlantic and the Pacific which influence the climate in the Caribbean. Daily 850 hPa zonal and meridional wind components (u, v) for the period 1 January 1979–31 December 2010 were extracted from the NCEP/DOE reanalysis (Kanamitsu et al. 2002). As streamline analysis is the main means of analyzing tropical weather and climate, wind components are used instead of geopotential heights to represent atmospheric circulation. Classification of atmospheric circulation patterns in domains encompassing tropical regions has been performed on wind fields (e.g. Espinoza et al. 2012; Guèye et al. 2011). The wind components are provided by the reanalysis data at a resolution of 2.5° latitude × 2.5° longitude. The mean circulation and mean wind speeds within the domain at the 850 hPa level are also shown in Fig. 1. The north–east trade winds are evident at latitudes less than 17°N. The central region of the NAH is also evident in the north–east of the domain and the CLLJ is a region of wind speed maximum between 10–15°N and 70–80°W over the Caribbean Sea.

Fig. 1 Mean atmospheric circulation at the 850 hPa level for the 1979–2010 period. Shaded areas are wind intensities in m/s

3 Classification method

There are two widely used methods for classifying daily atmospheric circulation into a small number of types: empirical orthogonal function (EOF) analysis [principal components analysis (PCA)] and cluster analysis. EOF analysis or PCA is a linear technique that identifies the principal orthogonal directions (eigenvectors) along which most of the data variability is concentrated. Classification of circulation into types is performed using T-mode PCA (Casado et al. 2009; Compagnucci and Salles 1997; Huth 2000; Huth and Canziani 2003; Jiang 2011; Salles et al. 2001) and the eigenvectors are the primary flow patterns (Burlando 2009). However, the EOF decomposition technique cannot extract important circulation patterns that are not orthogonal to its first eigenvector or circulation pattern (Burlando 2009). Cluster analysis, unlike EOF’s linear decomposition, is a non-linear approach that identifies the dense aggregation of flow patterns around a few primary types, which entails the “identification of the peaks of the probability density function” of the circulation or flow fields (Burlando 2009). Cluster analysis has been used to classify atmospheric circulation flows over the Svalbard region (Käsmacher and Schneider 2011), the Mediterranean Sea (Burlando 2009), northeastern Iberian Peninsula (Jiménez et al. 2009), the Grand Canyon (Kaufmann and Whiteman 1999), Switzerland (Weber and Furger 2001), and over western Europe (Esteban et al. 2006), as well as at single-point locations (Clifton and Lundquist 2012). The popular cluster analysis technique k-means (MacQueen 1967) tends to produce circulation types for which the within-cluster sum-of-squares variance is locally minimized rather than globally minimized. As k-means uses a random sample of initial cluster centroids to group the circulation patterns, the k-means procedure may be repeatedly reinitialized with a different sample of initial cluster seeds. 
This increases the chance of finding the global minimum (Blender et al. 1997). Another option is to precede k-means with a hierarchical cluster analysis to determine a preliminary categorization and thus select the number of clusters and provide the initial cluster centers to k-means (Kaufmann and Whiteman 1999). The application of a hierarchical clustering method such as Ward’s clustering scheme prior to k-means was proposed in an attempt to avoid the locally optimal solutions that k-means produces (Steinley 2006), which become more likely as the number of items to be grouped increases. Ward’s clustering scheme extracts clusters that are compact and spherical, and of approximately equal size (Burlando et al. 2008; Lorente-Plazas et al. 2014). Other hierarchical schemes such as the average linkage and complete linkage schemes also produce clusters of similar size while the single linkage produces one large cluster and many clusters with few members (Weber and Kaufmann 1995). One problem associated with hierarchical clustering techniques is that patterns erroneously grouped early in the iterative clustering cannot be reassigned as the agglomeration proceeds towards one large cluster containing all patterns. However, coupling the solutions of the hierarchical clustering schemes to non-hierarchical methods such as k-means allows for reassignment of patterns until a metric, the within-cluster sum-of-squares in the case of k-means, is minimized (Burlando 2009; Lorente-Plazas et al. 2014). More sophisticated modified k-means algorithms include numerical techniques of simulated annealing and diversified randomization, called SANDRA (Philipp et al. 2007). Other non-linear techniques for classification include self-organizing maps (SOMs) (Cavazos 2000; Hewitson and Crane 2002). SOMs project multidimensional datasets onto a two-dimensional array, thereby producing two most dissimilar patterns with a large number of intermediate patterns. 
Although SOMs may theoretically be best suited for decomposing a continuum and have been successfully used in regionalization (Lin and Chen 2006) in addition to classifying atmospheric circulation (Chávez-Arroyo et al. 2014; Sheridan and Lee 2011; Guèye et al. 2011; Espinoza et al. 2012), they have not always been able to sufficiently represent large-scale circulation variability because of the two-dimensional constraint (Jacobeit 2010). Furthermore, the parameters of a SOM may be difficult to tune to produce a reliable classification in regions where there has not been sufficient work on the atmospheric circulation, as different parameter choices may lead to different SOM patterns (Liu and Weisberg 2011).

Interestingly, several extensive classification studies have found that no one single classification technique produces the best classification for all applications (Huth et al. 2008; Beck and Philipp 2010; Philipp et al. 2014). As such, we used a two-stage clustering technique comprising a hierarchical clustering scheme followed by k-means as this technique has been widely used for atmospheric circulation classification defined by wind fields (e.g. Kaufmann and Whiteman 1999; Weber and Furger 2001; Burlando et al. 2008; Burlando 2009; Lorente-Plazas et al. 2014).

Hybrid clustering schemes have been used for classifying flow patterns in the Grand Canyon region (Kaufmann and Whiteman 1999) and in the Mediterranean (Burlando 2009). Because previous studies (Compagnucci and Salles 1997; García-Valero et al. 2012) have found that some atmospheric circulation patterns occur throughout the year, the seasonal cycle was kept in the data and the clustering procedure is performed on all daily circulation patterns in the 1979–2010 period to identify the overlap in some patterns between the dry and wet seasons in the Caribbean. In the following sections, we first describe how the input data were scaled (Sect. 3.1) followed by the application of the two-stage clustering technique (Sect. 3.2).

3.1 Scaling wind components

We scale the wind components at each grid-point against the climatological (time-averaged) wind speed and a spatially averaged wind speed to obtain patterns describing primarily directional flow features. These scalings were applied prior to cluster analysis so that the frequently occurring atmospheric circulation types that are extracted are flow patterns that occur across all seasons. Thus, each wind component u_ij and v_ij at each time i and grid-point j was scaled by the time-averaged wind speed s_j to prevent grid-points with generally higher wind speeds from dominating the calculation of the distances between patterns (Kaufmann and Whiteman 1999):

$$u_{ij}^{{\prime }} = \frac{{u_{ij} }}{{s_{j} }}\quad {\text{and}}\quad v_{ij}^{{\prime }} = \frac{{v_{ij} }}{{s_{j} }}$$
(1)

where

$$s_{j} = \frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {u_{ij}^{2} + v_{ij}^{2} } \right)}^{1/2}$$
(2)

and n is the total number of daily patterns.

A second scaling, in which the scaled wind components u′_ij and v′_ij are divided by the spatially averaged scaled wind speed s′_i at each time i (Kaufmann and Whiteman 1999), was also carried out to ensure that similar flow patterns that differ only by a scaling factor are grouped into the same cluster:

$$\tilde{u}_{ij} = \frac{{u_{ij}^{{\prime }} }}{{s_{i}^{{\prime }} }}\quad {\text{and}}\quad \tilde{v}_{ij} = \frac{{v_{ij}^{{\prime }} }}{{s_{i}^{{\prime }} }}$$
(3)

where

$$s_{i}^{{\prime }} = \frac{1}{m}\sum\limits_{j = 1}^{m} {\left( {u_{ij}^{{{\prime }2}} + v_{ij}^{{{\prime }2}} } \right)}^{1/2}$$
(4)

and m is the total number of grid-points.
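
The two scalings in Eqs. (1)–(4) can be illustrated with a minimal sketch. The function name `scale_wind_fields` and the nested-list data layout (n daily patterns, each with m grid-point values) are assumptions made for illustration and are not taken from the study; the sketch also assumes that no grid-point or day has zero mean wind speed.

```python
import math

def scale_wind_fields(u, v):
    """Apply the scalings of Eqs. (1)-(4) (hypothetical helper).

    u, v: lists of n daily patterns, each a list of m grid-point values.
    Returns the doubly scaled components (u_tilde, v_tilde).
    """
    n, m = len(u), len(u[0])
    # Eq. (2): time-averaged wind speed s_j at each grid-point j.
    s = [sum(math.hypot(u[i][j], v[i][j]) for i in range(n)) / n
         for j in range(m)]
    # Eq. (1): divide each component by s_j.
    u1 = [[u[i][j] / s[j] for j in range(m)] for i in range(n)]
    v1 = [[v[i][j] / s[j] for j in range(m)] for i in range(n)]
    # Eq. (4): spatially averaged scaled wind speed s'_i for each day i.
    s1 = [sum(math.hypot(u1[i][j], v1[i][j]) for j in range(m)) / m
          for i in range(n)]
    # Eq. (3): divide each scaled component by s'_i.
    u2 = [[u1[i][j] / s1[i] for j in range(m)] for i in range(n)]
    v2 = [[v1[i][j] / s1[i] for j in range(m)] for i in range(n)]
    return u2, v2

# Toy input: day 2 has the same flow directions as day 1 at twice the speed.
u_demo = [[3.0, 0.0], [6.0, 0.0]]
v_demo = [[4.0, 2.0], [8.0, 4.0]]
u_tilde, v_tilde = scale_wind_fields(u_demo, v_demo)
```

In this toy case the two days collapse onto the same scaled pattern, which is the behaviour the second scaling is intended to produce.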

3.2 Clustering procedure

We allocate the spatial patterns of wind components into groups or clusters. We use a two-stage clustering technique (Burlando 2009), first applying Ward’s hierarchical clustering scheme (Ward 1963) and then the commonly used partitional k-means algorithm (MacQueen 1967). To maintain consistency in the clustering procedure, both clustering schemes use the minimum sum-of-squares between patterns as the similarity criterion. The resulting median centroids associated with the clustering solutions from Ward’s hierarchical clustering scheme were used as initial centroids for the k-means algorithm, thereby allowing for a unique solution to be found (Fielding 2007). The distance metric used to calculate the sum-of-squares in both Ward’s and the k-means clustering schemes was the Euclidean distance.

Ward’s clustering scheme is used to determine the number of clusters. Ward’s algorithm for the agglomeration of clusters starts with each normalized pattern as a singleton so that for n patterns there are n singleton clusters whose variation is zero. At each iterative step of the algorithm, the two clusters whose merger gives the smallest increase in the within-cluster sum-of-squares are joined (Mirkin 2011). The algorithm stops when all clusters have been joined to form one large cluster. The process of fusion is shown by the dendrogram in Fig. 2. The dendrogram or hierarchical tree diagram shows the “height” or the sum-of-squares (sum-of-cluster variances) at which clusters or singletons are combined to form a new larger cluster. Two clusters that are joined at a low height are more similar to each other than two clusters that are joined at a large height. The data are said to comprise k clusters when merging the k cluster solution into a k − 1 cluster solution produces a large increase in “height” (Izenman 2008). The change in the within-cluster sum-of-squares may be identified by plotting the pooled within-cluster sum-of-squares against the last agglomerative steps.
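
The agglomeration described above can be sketched on toy data as follows. This is a naive pure-Python illustration, not the implementation used in the study: `ward_merge_heights` and `suggest_k` are hypothetical names, and the rule in `suggest_k` (take the merge whose increase in height is largest relative to the preceding merge) is only a crude automated stand-in for the visual inspection of the height curve. It needs at least three patterns.

```python
def ward_merge_heights(patterns):
    """Pooled within-cluster sum-of-squares after each Ward merge."""
    clusters = [(1, list(p)) for p in patterns]  # (size, centroid)
    total, heights = 0.0, []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (na, ca), (nb, cb) = clusters[a], clusters[b]
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum-of-squares if a and b merge.
                cost = na * nb / (na + nb) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        merged = (na + nb,
                  [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
        total += cost
        heights.append(total)
    return heights

def suggest_k(heights, eps=1e-12):
    """Number of clusters present just before the first steep rise in height."""
    n = len(heights) + 1
    jumps = [heights[0]] + [b - a for a, b in zip(heights, heights[1:])]
    ratios = [jumps[t] / max(jumps[t - 1], eps) for t in range(1, len(jumps))]
    t = 1 + max(range(len(ratios)), key=ratios.__getitem__)
    return n - t  # n - t clusters exist before merge number t (0-based)

# Three tight pairs of points, so three clusters are expected.
demo = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.0, 0.1)]
demo_heights = ward_merge_heights(demo)
demo_k = suggest_k(demo_heights)
```

The exhaustive pair search is cubic in the number of patterns, so it is suitable only for small examples; for a multi-decadal daily record a library implementation of Ward’s method would be the practical choice.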

Figure 3, when read from right to left along the abscissa, is a plot of the sum-of-cluster variances corresponding to successive merging for the last 50 aggregations. It shows that for the last 11 aggregations the sum-of-cluster variance rapidly increases. This is due to the joining of dissimilar clusters, resulting in non-homogeneous clusters. Between the last 10 and 50 successive aggregations there are very small changes in the sum-of-squares, shown in Fig. 3 by the plateau, indicating that clusters that are similar are aggregated. The number of clusters is taken to be the cluster number corresponding to the base of a steep increase of sum-of-cluster variance just before a merger produces a large increase (Burlando 2009; Kaufmann and Whiteman 1999). The Ward’s clustering algorithm indicates that the number of clusters may be k = 8 or 5 (Fig. 3).

Fig. 2 Dendrogram resulting from applying Ward’s hierarchical clustering procedure to the 850 hPa daily atmospheric circulation patterns for the 1979–2010 period. The vertical coordinate ‘Height’ refers to the sum-of-cluster variances

Fig. 3 Pooled within-cluster sum-of-squares during the last 50 agglomerative steps of the Ward’s clustering procedure. The vertical coordinate ‘Height’ refers to the sum-of-cluster variances and k is the number of clusters used to describe the full data set

The cluster centroids for each of the k solutions found from Ward’s minimization of sum-of-squares are then used to provide initial centroids for the k-means algorithm, which requires the number of clusters to be specified. Since k = 8 and k = 5 are possible solutions, the k-means clustering technique was applied to cluster numbers ranging from k = 2 to k = 15. The boxplots of the correlation coefficients of each member with its final centroid for the various cluster number solutions are shown in Fig. 4. We use the range of correlation coefficients in deciding which of the two possible clustering solutions (k = 8 and k = 5) better describes the scaled atmospheric circulation patterns. We do not consider the k = 5 clustering solution to be appropriate for this study as the range of coefficients for the k = 5 clustering solution contains negative correlations, suggesting that some patterns may not have been grouped with the appropriate cluster or that they should have been in a separate cluster.
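
The second stage and the member-to-centroid correlations can be sketched as below. This is a minimal illustration under stated assumptions: `kmeans`, `pearson` and `correlation_range` are hypothetical helper names, each pattern is a flat list of scaled wind components, the seed centroids stand in for those delivered by the Ward stage, and no pattern or centroid is constant (otherwise the correlation is undefined).

```python
import math
from statistics import median

def kmeans(patterns, centroids, max_iter=100):
    """Lloyd's k-means started from supplied centroids (e.g. from Ward)."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assign each pattern to the nearest centroid (squared Euclidean).
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])))
            for p in patterns
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, l in zip(patterns, labels) if l == k]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def correlation_range(patterns, labels, centroids):
    """Minimum, median and maximum member-centroid correlation per cluster."""
    out = {}
    for k, c in enumerate(centroids):
        r = [pearson(p, c) for p, l in zip(patterns, labels) if l == k]
        if r:
            out[k] = (min(r), median(r), max(r))
    return out

demo = [[1.0, 0.0, 1.0, 0.0], [0.9, 0.1, 1.1, 0.0],
        [0.0, 1.0, 0.0, 1.0], [0.1, 0.9, 0.0, 1.1]]
demo_labels, demo_centroids = kmeans(demo, [demo[0], demo[2]])
demo_ranges = correlation_range(demo, demo_labels, demo_centroids)
```

A negative minimum in `correlation_range` for some cluster would correspond to the situation described for the k = 5 solution, where some members are anti-correlated with their own centroid.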

Fig. 4 Correlation coefficients for each k cluster solution. Thick horizontal lines indicate the median correlation values. The upper extremities of the boxes are the third quartile values and the lower extremities of the boxes are the first quartile values. The upper ends of the dashed vertical lines represent the maximum correlations and the lower ends the minimum correlations

Three other features are evident in Fig. 4. Firstly, the range of correlation coefficients seems to be optimized by k = 9. Secondly, the median correlation coefficients for each cluster solution, which are indicated by the thick horizontal lines, increase as k increases, but the rate of increase after k = 7 is small. Thirdly, the third quartile and first quartile values do not change quickly after k = 7. These observations provide greater confidence for a k = 8 solution, but they also indicate that k = 7 or k = 9 might be better solutions.

The stability of the cluster solutions for k = 7, 8, 9 was therefore tested. The degree of stability is a measure of the robustness of the clusters in that the clustering method should consistently produce the same or similar clusters with similar data sets of different time periods. By reducing the data set to a time period that overlaps with the original full data set and repeating the clustering method, the cluster stability can be checked by comparing the clusters obtained for the reduced data set with those obtained for the full data set. The adjusted Rand index (Hubert and Arabie 1985; Milligan and Cooper 1986) was used as the stability measure. It measures the similarity of groupings between two cluster solutions by comparing the number of pairs of objects that maintain their groupings with the number of pairs that change their grouping, while accounting for chance grouping of pairs of objects (Steinley 2004; Hubert and Arabie 1985). If the two cluster solutions match exactly, the adjusted Rand index is 1. Reduced subsets spanning half of the full period (16 years) were chosen from the full data set of 32 years. The first subset was from 1979 to 1994, the second from 1980 to 1995, and so on up to 1995–2010. The clustering procedure was therefore repeated on seventeen subsets. The mean adjusted Rand indices for the 7, 8, and 9 cluster solutions were 0.69, 0.64, and 0.62 respectively, indicating that k = 7 was the most stable cluster solution. We do note that the large range of correlation coefficients of daily circulation patterns to their corresponding cluster centroids for the k = 7 solution indicates that there is high intra-cluster variability. This may suggest that the k = 8 solution, with the smaller spread in correlation coefficients between each member and its corresponding cluster centroid, tends to reduce the intra-cluster variance. However, the k = 8 solution also produces one cluster with low membership, containing less than 0.2 % of all members. 
A low membership implies that the atmospheric circulation associated with that cluster is not a frequently recurring pattern and could be anomalous events or that those atmospheric circulation patterns simply could not be grouped with the other seven types. Therefore, we select a k = 7 cluster solution to represent the most frequently occurring atmospheric circulation types over the Caribbean.
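
The stability measure can be sketched with the standard pair-counting form of the adjusted Rand index (Hubert and Arabie 1985). The function name, the toy label lists and the `windows` variable are illustrative assumptions; in particular, the sketch assumes that the two label sequences refer to the same days, i.e. the days common to a reduced subset and the full data set.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same objects."""
    n = len(labels_a)
    # Pairs of objects grouped together in both partitions.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each partition separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case: both partitions are trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

# Sliding 16-year windows within 1979-2010: 1979-1994, 1980-1995, ..., 1995-2010.
windows = [(start, start + 15) for start in range(1979, 1996)]

# Identical groupings with permuted cluster labels give an index of 1.
same = adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
# Groupings that disagree on every pair give a negative index.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the index depends only on which pairs of days share a cluster, it is unaffected by the arbitrary numbering of clusters between the reduced and full classifications.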

4 Results and discussions

4.1 Atmospheric circulation types

The daily atmospheric circulation types (CTs) for the 1979–2010 period of the NCEP/DOE reanalysis 850 hPa circulation are shown in Fig. 5. These types may be differentiated by the location and extension of the well-known quasi-stationary Atlantic and Pacific anticyclonic flow centers, as well as neutral areas, and resultant flows in the study domain (box in Fig. 5) and in an extended domain (Fig. 5). The CTs were derived by applying the cluster analysis technique to the daily atmospheric circulation patterns defined over the smaller domain indicated by the black boxes in each sub-figure. We used the extended domain to identify the center of action of an anticyclonic flow over the Gulf of Mexico-Florida region in CTs 1, 3, 5, as well as the extension of the Atlantic anticyclone as this semi-permanent feature directly influences the atmospheric circulation in the Caribbean.

Fig. 5 a Atmospheric circulation types 1–4 and mean wind speeds. b Atmospheric circulation types 5–7 and mean wind speeds

CT 1 has an anticyclonic flow centered over Florida at about 25°N, 82°W with a west–east span from the Gulf of Mexico into the Atlantic to about 60°W. Associated with this anticyclone, which we refer to as a Gulf of Mexico anticyclone, are strong westerlies between 30 and 40°N (Fig. 5a). A trough over the Atlantic is located to the east of the anticyclone and its axis is oriented in the north–south direction. CT 1 is also characterized by a neutral area centered over 19°N, 62°W (Fig. 5a). The neutral area is a result of the interaction of the Gulf of Mexico anticyclone, the Atlantic trough, and the easterly trade winds. The easterly trade winds over the Caribbean emanate from the southern flank of the Atlantic anticyclone, which is centered at about 30°N, 30°W (Fig. 5a). The easterly flow within the Caribbean bifurcates over Mexico (10°N, 80°W) into a northeasterly flow and a southeasterly flow. The southeasterly arm of the bifurcation forms part of the North American anticyclone and produces a northerly flow over the Mexican coast (Fig. 5a). The northeasterly arm of the bifurcation flows over Nicaragua into the Pacific and is associated with the northwest–southeast (NW–SE) oriented trough over South America. The third anticyclone that characterizes CT 1 in the extended domain is located over the Pacific with its center at about 30°N, 140°W.

The Gulf of Mexico anticyclone is absent in CT 2. As a result, this CT is defined by only two anticyclonic flows in the extended domain (Fig. 5a), and there are no neutral areas in the Caribbean and no troughs in the Atlantic. The Atlantic anticyclone is closer to the Caribbean in CT 2 than in any other CT and has its most southward position in this CT, with its center at about 25°N, 60°W. The location of the Atlantic anticyclonic flow in CT 2 is similar to that of the mean annual resultant flow (Fig. 1). Easterly winds over the Caribbean in CT 2 bifurcate at about 17°N, 85°W. As in CT 1, the northeasterly arm flows over Nicaragua and into the Pacific. However, in this CT, the southeasterly arm is met by strong westerlies and is re-incorporated into the Atlantic anticyclone. The South American trough has a NW–SE orientation as in CT 1.

CT 3 is similar to CT 1 in that three anticyclones influence the resultant flow within the Caribbean (Fig. 5a). However, the Gulf of Mexico anticyclone is located farther west, centered at about 25°N, 97°W over the eastern coast of Mexico, and covers northern Mexico, the state of Texas in the United States, and the Gulf of Mexico, where it gives rise to northeasterly flow. The southwestern flank of the Gulf of Mexico anticyclone, coupled with the southeasterly flow over Nicaragua, leads to strong easterly winds in the Pacific (Fig. 5a). CT 3 has a neutral area like CT 1. However, this neutral area is located over Cuba and the Bahamas, around 22°N, 77°W. The Atlantic anticyclone east of the neutral area is centered at 32°N, 45°W (Fig. 5a). The Pacific anticyclone is located in a position similar to that in CT 2. Another feature of CT 3 is that the Caribbean easterly winds do not bifurcate over Central America.

Only the Pacific and Atlantic anticyclonic flows are present in CT 4 as in CT 2 (Fig. 5a). The location of the Pacific anticyclone is similar to those in CTs 1–3. The Atlantic anticyclone flow, however, has a broad extent (Fig. 5a) with mean latitudinal position of 30°N and with its western flank over the southeastern coast of the United States. The Atlantic anticyclonic flow gives rise to predominantly easterly/northeasterly flow over the Caribbean which bifurcates at around 17°N, 80°W. The bifurcation leads to a southerly flow into the Gulf of Mexico and a northeasterly flow which may be a factor for the stronger than average easterlies in the eastern Pacific. This CT is also associated with strong wind speeds in the CLLJ region.

As in CTs 1 and 3, there are three anticyclones in CT 5 (Fig. 5b). The Pacific anticyclone’s location is similar to that in CTs 1 and 3. However, the center of the Gulf of Mexico anticyclone is located over the Gulf of Mexico at about 20°N, 90°W, and the Atlantic anticyclone is centered at about 25°N, 45°W. As with CTs 1–4, the Atlantic anticyclone results in easterly winds over the Caribbean. However, they are weaker than average. CT 5 is also defined by a neutral area in the Caribbean. The neutral area, located over the Greater Antilles islands comprising Cuba, Jamaica, Hispaniola and Puerto Rico, is at its most southerly position and is widest in CT 5 compared with CTs 1 and 3. The width of the neutral area in CT 5 may be due to strong westerlies, which are at their most southerly location in CT 5, extending towards 23°N.

Southeasterly flow dominates the Caribbean and the Gulf of Mexico in CT 6 (Fig. 5b) extending to 25°N. This southeasterly flow is a consequence of the broad extent of the Atlantic anticyclone whose center is located over the Atlantic Ocean around 30°N, 40°W. The southeasterlies in the Caribbean show a weak bifurcation at 20°N, 95°W. Unlike any of the other CTs, CT 6 is also defined by westerlies in the eastern Pacific. These westerlies arise from the Pacific anticyclone. The Pacific anticyclone, centered at 35°N, 140°W, is broadest in CT 6 and its eastern flank shows a distinct bifurcation at 12°N, 120–125°W into an easterly flow and a westerly flow into Central and South America. The westerly flow may have a role to play in the orientation of the trough over South America (Fig. 5b). The South American trough in this CT is in the NE–SW orientation unlike the NW–SE orientation in CTs 1–5. We also note the southerly anomalous cross-equatorial flow over northern South America.

CT 7, like CT 6, has predominantly southeasterly flow over the Caribbean. However, the bifurcation of the Caribbean southeasterly flow is farther south, at 17°N, 95°W. As in CT 6, the Pacific anticyclone in CT 7 has its most northerly center at about 35°N, but, unlike CT 6, the bifurcation of the eastern flank of the Pacific anticyclone along 120°W does not extend beyond 20°N. This feature of the Pacific anticyclone gives rise to a wide band of easterlies over the eastern Pacific. Anomalous southerly cross-equatorial flow over northern South America is present in CT 7, and may be a result of the northward progression of the ITCZ. This cross-equatorial flow could play a role in the strengthening of the winds in the CLLJ region. As in CT 6, the trough over northwestern South America has a NE–SW orientation.

4.2 Frequency and trends

The circulation types described in Sect. 4.1 vary in frequency from month to month and consequently between the dry (December–April) and rainy (May–November) seasons. CTs 1–5 rarely occur during the June to September period in the rainy season (Fig. 6) while CTs 6 and 7 dominate during this time. Towards the end of the rainy season in November, the incidence of CTs 1–5 increases while in the dry period, CTs 6 and 7 are absent. We also observe that all CTs are present in the months of May and November. Thus, these months may be called transitional months. A more rigorous analysis to determine the relationship between precipitation and atmospheric circulation types would be the object of a further study.
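The monthly frequencies discussed above can be tabulated from a daily catalogue of CT labels. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `monthly_ct_frequency` and the one-year toy catalogue are hypothetical, and the counts are pooled over all years in the input.

```python
from collections import Counter
from datetime import date, timedelta


def monthly_ct_frequency(dates, labels):
    """Count, for each CT, the number of days it occurs in each calendar month.

    Returns {ct: [n_Jan, ..., n_Dec]}, pooled over all years in the input.
    """
    counts = Counter((ct, d.month) for d, ct in zip(dates, labels))
    return {ct: [counts[(ct, m)] for m in range(1, 13)]
            for ct in sorted(set(labels))}


# Hypothetical one-year catalogue (not the paper's data): CT 6 on every day
# from June to September, CT 1 on every other day.
start = date(2001, 1, 1)
dates = [start + timedelta(days=i) for i in range(365)]
labels = [6 if 6 <= d.month <= 9 else 1 for d in dates]

freq = monthly_ct_frequency(dates, labels)
print(freq[6])  # days of CT 6 in each month, January to December
```

Dividing each monthly count by the number of days in that month over the record would turn these counts into relative frequencies.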

Fig. 6 Monthly frequency of each atmospheric circulation type

The long-term frequency of each CT is given in Table 1. CTs 7 and 6 are the two most prevalent types while CT 5 occurred least frequently. Although long-term frequencies indicate the relative frequencies of each CT, they mask year-to-year variations. The annual variations in CT frequencies are shown in Fig. 7. The linear trend estimated using the Theil–Sen slope estimator (Yue et al. 2002) is also indicated on the individual plots for each CT in Fig. 7. The non-parametric Mann–Kendall trend test (Yue et al. 2002) on the annual frequencies of each circulation type was used to test the null hypothesis of no trend at the 95 % confidence level (5 % significance level). Only CTs 5 and 6 showed significant trends. CT 5 decreases at a rate of 0.33 days per year and CT 6 increases at a rate of 0.81 days per year. Table 2 shows the significant results of the Mann–Kendall trend test on the seasonal and monthly frequencies of each circulation type and the estimated linear trend. The annual decrease in the frequency of CT 5 may be attributed to its decrease during the wet season, and more specifically during the early rainy season (MJJ) and the month of June. CT 6, which is increasing in annual frequency, is increasing during the wet season, especially during the late rainy season (August–November) and in the month of September.
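For readers unfamiliar with the two trend statistics, the sketch below shows a textbook form of the Theil–Sen slope and the Mann–Kendall test for an equally spaced annual series. It is an illustration under stated assumptions rather than the exact procedure of this study: it applies no correction for ties or serial correlation (the latter being the subject of Yue et al. 2002), and the function names and the toy series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def theil_sen_slope(y):
    """Median of all pairwise slopes for an equally spaced series (units per step)."""
    return median((y[j] - y[i]) / (j - i)
                  for i, j in combinations(range(len(y)), 2))


def mann_kendall(y):
    """Two-sided Mann-Kendall test, no tie or autocorrelation correction.

    Returns (S, Z, p), where S is the sum of the signs of all pairwise differences.
    """
    n = len(y)
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p


# Hypothetical 32-year series of annual CT counts rising by 2 days per year.
annual = [20 + 2 * k for k in range(32)]
slope = theil_sen_slope(annual)
s, z, p = mann_kendall(annual)
print(slope, s, p < 0.05)
```

Because annual CT frequencies are integer counts, ties are likely in practice, so a tie-corrected variance would be the safer choice for real data.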

Table 1 Total number of events and frequency of occurrence (expressed in percentage) of each atmospheric circulation type over the Caribbean region for the 1979–2010 period

Fig. 7 Annual frequencies of each atmospheric circulation type. Significant trends at the 95 % confidence level are indicated in bold

Table 2 Trend analysis results using a Mann–Kendall trend test on CT occurrence in each season and month

Trend analyses on seasonal and monthly frequencies of each CT show trends not seen on the annual scale. CT 7 decreases during the dry season and in the month of April. CT 3, which occurs mostly during the dry season, decreases during the wet season, in both the early and late rainy seasons, in May, July and August. During these months, CT 3’s monthly average is low (see Fig. 6).

4.3 Persistence

The uninterrupted sequence of days for which a CT persists is a characteristic of the dynamics of the atmospheric circulation. The persistence measures the probability for one circulation type to persist from one day to another (Espinoza et al. 2012). Although the mean lifetime (Table 3) for the first five CTs is between 2 and 3 days, their histograms of duration (Fig. 8) indicate that they could persist for up to 10–15 days. Ten percent of their events last 5 days or more. These ‘long’ events account for 27 % of the total number of days on which CTs 1–5 occur. Thus, although CTs 1–5 have a low mean lifetime, they spend at least a quarter of their time in long events. CTs 6 and 7, which are the dominant types in the wet season, have high mean lifetimes that are at least 2–3 times those of CTs 1–5. CTs 6 and 7 had 36 and 32 %, respectively, of their events in long events and 78 and 85 %, respectively, of their days in these long events (Fig. 8). Some events were very long, lasting up to 56 days for CT 6 and 85 days for CT 7. Single-day events range from 26 % (CT 6) to 45 % (CT 2) of all events.
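The persistence measures quoted above follow from collapsing the daily catalogue into uninterrupted runs. The sketch below is one possible way to compute them, assuming a plain list of daily CT labels; the function names, the dictionary keys and the toy sequence are hypothetical, and a ‘long’ event is taken as 5 days or more, as in the text.

```python
from itertools import groupby


def events(labels):
    """Collapse a daily label sequence into (ct, duration_in_days) runs."""
    return [(ct, sum(1 for _ in run)) for ct, run in groupby(labels)]


def persistence_stats(labels, ct, long_days=5):
    """Lifetime statistics for one CT; 'long' means duration >= long_days."""
    durations = [d for c, d in events(labels) if c == ct]
    long_events = [d for d in durations if d >= long_days]
    return {
        "mean_lifetime": sum(durations) / len(durations),
        "max_duration": max(durations),
        "frac_long_events": len(long_events) / len(durations),
        "frac_days_in_long_events": sum(long_events) / sum(durations),
        "frac_single_day_events": durations.count(1) / len(durations),
    }


# Hypothetical 13-day sequence (not the paper's data).
daily = [1, 1, 7, 7, 7, 7, 7, 7, 1, 7, 1, 1, 1]
stats7 = persistence_stats(daily, ct=7)
print(stats7)
```

In this toy sequence CT 7 has two events (6 days and 1 day), so half of its events are long but six of its seven days fall within a long event, which mirrors the contrast between event-based and day-based percentages drawn above.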

Table 3 Persistence characteristics of each atmospheric circulation type over the Caribbean region for the 1979–2010 period

Fig. 8 Histogram of duration in days of each atmospheric circulation type. The axes vary from CT to CT to reflect the longest event duration

4.4 Transitions

The transitions from one atmospheric circulation type to a small subset of other CTs are useful for medium-range prediction (García-Valero et al. 2012). Table 4 shows the transition probabilities for the circulation types. Each value (i, j) is the ratio of the number of changes from the CT in row i to the CT in column j to the total number of changes from CT i (Casado et al. 2009).
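The definition of the transition probability can be made concrete with a short sketch. It assumes a plain list of daily CT labels and excludes days on which the type does not change, as in Table 4; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def transition_probabilities(labels):
    """P[i][j] = (# changes from i to j) / (# changes out of i), with j != i."""
    changes = Counter((a, b) for a, b in zip(labels, labels[1:]) if a != b)
    totals = Counter()
    for (a, _), n in changes.items():
        totals[a] += n
    return {a: {b: n / totals[a]
                for (src, b), n in changes.items() if src == a}
            for a in totals}


# Hypothetical daily sequence (not the paper's data).
daily = [5, 5, 1, 1, 2, 5, 1, 2, 3, 4, 3]
P = transition_probabilities(daily)
print(P[2])  # CT 2 changes to CT 5 once and to CT 3 once
```

Each row of the resulting table sums to one because persistence (a type followed by itself) is left out of both the numerator and the denominator.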

Table 4 Transition probabilities from one daily atmospheric circulation type (row) to another (column), not including itself, over the 1979–2010 period

The most likely transition of one daily circulation type to another within the Caribbean region is that from CT 5 to CT 1, which occurs with a probability of 0.602. There are therefore highly preferred routes of evolution in the Caribbean. For the transition CT 5 to CT 1, the Gulf of Mexico and the Atlantic anticyclones are most likely to move northward, and the westerly winds over the Bahamas are also likely to transition to northeasterly winds, indicating that the extra-tropical westerly winds attain a higher latitudinal location. In addition, since CT 5 has the lowest occurrence (8.3 %, Table 1), when it does occur, it is most likely to evolve to CT 1, indicating that CT 1 is a more stable atmospheric state.

The transition from CT 7 to CT 6, with probability 0.599, is the second most likely transition. In the Caribbean, this is a change in flow from predominantly easterly to southeasterly and is due to the weakening of the Atlantic anticyclone. In addition, since CT 6 is most likely to transition to CT 7, CTs 6 and 7 form a cycle through CT 7 → CT 6 → CT 7.

CT 7’s second most likely transition is to CT 3 (probability of 0.179). This relates to the breakdown of the broad Atlantic anticyclone into the Gulf of Mexico anticyclone and a residual Atlantic anticyclone. Furthermore, CT 6 has equal probabilities of transitioning to CTs 1, 2, and 3. It is possible to hypothesize that the transition from the rainy season, when CTs 6 and 7 are prevalent, into the dry season occurs through the transition of either CT 6 or CT 7 to CT 3. This hypothesis warrants further investigation, which is outside the scope of this work.

Approximately 50 % of CT 3’s and CT 1’s events are most likely to change to CT 4 and CT 2, respectively. The transition of CT 3 to CT 4 refers to the disappearance of the Gulf of Mexico anticyclone and the neutral area over Cuba is replaced by strong easterly winds. Since CT 4 has the greatest probability of being followed by CT 3, these two CTs could undergo a cycle via CT 3 → CT 4 → CT 3 which is an expression of the transient nature of the Gulf of Mexico anticyclone. Similar to the transition CT 3 to CT 4, the transition of CT 1 to CT 2 also indicates that the Gulf of Mexico anticyclone is transient. Thus, the neutral area over the Atlantic disappears during these transitions. In addition, easterly winds over Cuba become southerly and the flow along the eastern Mexican coast becomes an outflow from the coast into the Gulf of Mexico.

There are equal chances of CT 2 converting to CT 3 or CT 5. There are two potential transitions, one in which there is a cycle, CT 1 → CT 2 → CT 5 → CT 1, and another through CT 1 → CT 2 → CT 3 → CT 4 → CT 3. Low probability transitions include CT 3 → CT 2, CT 4 → CT 5, and CT 5 → CT 4, indicating that even though CTs 1–5 occur in similar frequencies during each month, they have preferred paths of evolution.

Circulation types that persist over long events transit to their most likely CTs as described above. For example, CT 7, which had 108 long events (>4 days) between 1979 and 2010, transits to CT 6 for 76 % of these events, and was preceded by CT 6 for 49 % of its long events. Long events of CT 7 are unlikely to evolve from CT 5; only one long event of CT 7 was preceded by CT 5. Of interest is the fate of CT 2’s long events. Although CT 2 has similar probabilities of transiting to CT 3 or CT 5, long events of CT 2 are twice as likely to be followed by CT 5 as by CT 3.
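The bookkeeping behind statements such as "preceded by" and "followed by" for long events can be sketched as follows. This is only an illustration, assuming a plain list of daily CT labels and the same threshold of more than 4 days; the function name and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import groupby


def long_event_neighbours(labels, ct, long_days=5):
    """Count which CTs precede and follow the long events of a given CT."""
    runs = [(c, sum(1 for _ in g)) for c, g in groupby(labels)]
    before, after = Counter(), Counter()
    for k, (c, duration) in enumerate(runs):
        if c == ct and duration >= long_days:
            if k > 0:
                before[runs[k - 1][0]] += 1
            if k + 1 < len(runs):
                after[runs[k + 1][0]] += 1
    return before, after


# Hypothetical daily sequence (not the paper's data): two long CT 7 events
# (6 and 5 days) and one short CT 7 event (2 days).
daily = [6] * 3 + [7] * 6 + [6] * 2 + [7] * 5 + [3] + [7] * 2 + [6]
before7, after7 = long_event_neighbours(daily, ct=7)
print(dict(before7), dict(after7))
```

Dividing each count by the number of long events gives percentages of the kind quoted above; the short CT 7 event in the toy sequence is ignored by construction.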

4.5 Relation between frequencies of occurrence of each CT and teleconnection indices

The inherent interannual variability in the frequency of regional atmospheric circulation patterns could be linked to large-scale modes of variability (e.g. Stefanicki et al. 1998; Coleman and Rogers 2007). Large-scale climatic modes, also known as teleconnections, that are known to affect the Caribbean’s climate (precipitation, temperature) are the El Niño-Southern Oscillation (ENSO) (e.g. Rogers 1988; Giannini et al. 2001; Taylor et al. 2002; Chen and Taylor 2002), the North Atlantic Oscillation (NAO) (e.g. Giannini et al. 2001; George and Saunders 2001) and the Pacific Decadal Oscillation (PDO) (e.g. Ryu and Hayhoe 2014). Table 5 shows the correlations of representative indices with the time series of the monthly frequency of each circulation type. We also included other climatic modes that could influence the Caribbean’s climate, including the Pacific-North American teleconnection (PNA), Western Pacific teleconnection (WP), Pacific Transition (PT), Tropical/Northern Hemisphere teleconnection (TNH), and the East Atlantic/Western Russia teleconnection (EA/WR) which were obtained from the Earth System Research Laboratory’s online climate indices website at http://www.esrl.noaa.gov/psd/data/climateindices/list/. We note that the TNH is dominant during the DJF period and as such is calculated for this period only.

Table 5 Correlations between the monthly occurrence of each CT and the time series of climatic indices

The largest significant correlations are found between the frequency of CTs 1–5 and the TNH, followed by the correlations with the NAO. Most CTs, with the exception of CT 7, are significantly correlated with the TNH. As the TNH is calculated only for the DJF period, which falls within the dry season, we expect only the CTs that are dominant at this time to be correlated with the TNH index. However, CT 6, which is prevalent during the wet season, is also associated with the TNH. As the TNH is distinguished by anomalously high 500 hPa heights over the Gulf of Alaska as well as over the Gulf of Mexico and northeastward into the Atlantic, the proximity of the Caribbean domain considered in this work may be the primary reason for significant correlations between the TNH and the frequencies of each CT.

The variation in CT frequency with the TNH is complex. Although CTs 1, 3, and 5 are distinguished by the relative position of a Gulf of Mexico anticyclone, only CT 3 significantly increases with the positive phase of the TNH while CTs 1 and 5 decrease in frequency. The decrease in both CTs 1 and 5 seems reasonable as CT 5 is most likely to transit to CT 1. The preferential evolution of CT 1 to CT 2, in which the Gulf of Mexico anticyclone is no longer present, may be related to why CT 2 decreases in frequency during the positive phase of the TNH. In contrast to CT 2, CT 4 increases with the positive phase of the TNH. As CT 3 preferentially evolves to CT 4, we expect these two CTs to have the same (positive) sign of variation with the TNH. CT 6 also shows a decrease in frequency with the TNH, indicating that southeasterly winds over the Caribbean decrease in frequency. However, CT 7, to which CT 6 preferentially evolves, does not show significant variation with the TNH. In fact, CT 7 is not significantly correlated with any of the teleconnection indices considered in this study, all of which are based on centers of action in the Pacific and in the Northern Hemisphere stretching from the United States and Canada to Russia.

Correlations between CTs 1–5 and the NAO index are similar in sign to those between the corresponding CTs and the TNH but smaller in magnitude. Given the locations of the NAO’s centers of action relative to the study region, as compared with those of the TNH, the weaker coupling with the NAO is not surprising. The NAO is the dominant mode of variability in the North Atlantic sector and its influence is more pronounced during the winter (Hurrell et al. 2003); we therefore expect the CTs that are prevalent during the December–March period to be influenced by the NAO. CTs 6 and 7, which involve cross-equatorial flow from the South Atlantic into the Caribbean, do not have significant correlations with the NAO.

The only other Atlantic based teleconnection that we found to be associated with the extracted CTs is the East Atlantic-West Russia (EA/WR) teleconnection. This teleconnection tends to increase the frequency of CT 2 and decrease the frequency of CT 3. It may be related to a shorter persistence of CT 3 and a longer persistence of CT 2, rather than encouraging more CT 2 types to transit to CT 3 types.

The ENSO, as represented by the Nino-3.4 index, is the main Pacific based teleconnection that has been widely correlated with surface climatic variables in the Caribbean. Of the seven extracted atmospheric circulation types, only CTs 2, 4, and 5 are associated with the ENSO. CTs 2 and 5 tend to increase in frequency during the positive phase. It is possible that the transition of CT 2 to CT 5 is preferred during El Niño events. In contrast, CT 4 tends to decrease in frequency during the positive phase of the ENSO. CT 4 is associated with northeasterly flow from the Caribbean into the Pacific and stronger than average easterlies in the eastern Pacific. The reduction in frequency of an atmospheric circulation type with associated higher than normal wind speeds in the eastern Pacific near the equator is consistent with weakened trade winds during El Niño events. El Niño events are also known to influence the surface climatic variables in the Caribbean for several months. We therefore computed lagged correlations between the Nino-3.4 index and CT frequencies. CT 2 continues to have a small positive association with the Nino-3.4 index at lags of up to 3 months (correlations between 0.106 and 0.150) and a slight negative association at lags of 22–26 months (−0.137 to −0.095). CTs 4 and 5, like CT 2, have small correlations at lags of up to 7 months but no significant correlations at longer lags. CT 3, which is not correlated with the ENSO at zero lag, has small negative correlations at lags of 2–3 months. Although CT 6 is not significantly correlated at zero lag, it has small positive correlations (0.109 to 0.116) at lags of 21–23 months. In contrast, CT 7 is not associated with the ENSO even at long lags.
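A lagged correlation of this kind pairs the index value in month t with the CT frequency in month t + lag. The sketch below assumes a Pearson correlation and two monthly series of equal length, neither of which is specified in the text; the function names and the synthetic series are hypothetical, and no significance test is included.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(index, freq, lag):
    """Correlate index[t] with freq[t + lag], i.e. the index leads by `lag` months."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    x = index[:len(index) - lag]
    y = freq[lag:]
    return pearson_r(x, y)


# Synthetic 10-year monthly series (not the paper's data): the 'frequency'
# series is the 'index' series delayed by 3 months.
index = [math.sin(2 * math.pi * t / 12) for t in range(120)]
freq = [0.0, 0.0, 0.0] + index[:-3]

r0 = lagged_correlation(index, freq, 0)
r3 = lagged_correlation(index, freq, 3)
print(round(r0, 3), round(r3, 3))
```

With lag set to zero the function reduces to the simultaneous correlation of the kind reported in Table 5. In the synthetic example the association is weak at zero lag and essentially perfect at a lag of 3 months, which is the signature a lag analysis is meant to detect.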

While the ENSO has been investigated as a primary forcing mechanism that modulates the Caribbean regional climate, far fewer studies have looked at the influence of other Pacific based teleconnections. In this study, the PDO has a slightly stronger relationship with CT frequencies than the ENSO. The signs of the correlations for CTs 2, 4, and 5 are the same as those with the ENSO. The PDO is also significantly correlated with CT 6 while the ENSO is not. The PNA is associated with the occurrence of CTs 4 and 5, again with the same sign as the ENSO but with a slightly stronger association. The Pacific Transition (PT) is negatively associated with CT 4, as are the other Pacific based teleconnections. The Western Pacific (WP) teleconnection is associated with CT 2 but with the opposite sign to the ENSO and PDO.

5 Conclusions

Seven daily near-surface atmospheric circulation patterns were identified over the Caribbean region from the application of a two-stage cluster analysis technique on daily 850 hPa reanalysis circulation data defined by wind components. This study also determined their characteristics in terms of trends, lifetime and persistence, transitions, and correlations with large-scale teleconnections. While all the daily atmospheric circulation types are distinguished through the extension and location of the well-known quasi-stationary Atlantic (Azores/North Atlantic High) and Pacific anticyclones, only three of the types show a third anticyclone over the Gulf of Mexico. As this Gulf of Mexico anticyclone is a transient feature with a mean lifetime of two to three days, it is not present on long-term mean monthly atmospheric circulation maps. The Gulf of Mexico and the Atlantic anticyclones influence the location of neutral areas and the bifurcation of the prevailing winds in the western Caribbean.

While most circulation types (CTs) show the dominant north–east/easterly trade winds, CT 6 showed that southeasterly winds in the Caribbean and westerly winds in the eastern Pacific could prevail. This feature of southeasterly winds throughout the Caribbean is not observed on mean seasonal or monthly maps during the late wet season, as mean monthly maps for this time combine the influence of both CT 6 and CT 7. CT 6 has a mean lifetime of 6 days and occurs more frequently during the Caribbean wet season. The monthly frequency distribution of CT 6 has a pattern similar to that of the monthly rainfall totals in the Caribbean and a significant increasing trend during the wet season. Further work is needed to determine the precipitation distribution associated with each circulation type and the efficacy of the catalogue in reproducing the precipitation characteristics, to assist in devising a forecasting system for daily rainfall and in understanding the climatic changes or spatial behavior of Caribbean rainfall. Furthermore, CT 6 and CT 7, which both show a broad extension of the Atlantic anticyclone, have very long-lasting events, and the application of the clustering technique to various reanalysis data sets would be needed to confirm the persistence of the southeasterly flow types over several weeks.

The transition probabilities for certain CT transits are high for this region, indicating a high preference for specific transitions such as the northeastward movement of the Gulf of Mexico and Atlantic anticyclones. High transition probabilities indicate greater ability to predict weather changes on a daily time scale. Other transitions show that the Gulf of Mexico anticyclone is a transient feature between types associated primarily with the dry season and with a mean lifetime of 2–3 days.

Most of the atmospheric circulation types are correlated with several teleconnections including the NAO, TNH, ENSO, PDO, and PNA. CTs 2 and 3 showed surprising, albeit small, correlations with the EA/WR, implying that the influence of this teleconnection on the surface climate of the Caribbean should be investigated further. Teleconnections such as the TNH may influence the occurrence of specific cycles, and the ENSO may signal the preference for daily atmospheric circulation transitions. The ENSO is also found to influence some daily atmospheric circulation types at lags of up to three months, whereas for others the association is significant only at lags approaching 2 years. In addition, we have found that CT 4 shows the most complex interactions as it is correlated with six teleconnection indices, two Atlantic based indices and four Pacific based ones. The atmospheric circulation types in this work comprise the first atmospheric circulation catalogue, of many to come, for the Caribbean region. Our proposed catalogue could be useful in statistical–dynamical downscaling applications to explain the variability of surface weather variables such as wind speeds. By classifying the atmospheric continuum into a small number of types, each of which has an associated frequency of occurrence, and accounting for the previously noted high intra-cluster variability, it is then possible to perform a reasonable number of short-term numerical weather prediction simulations for high-resolution wind mapping studies over the Caribbean small islands. A similar approach could be used for determining climate change impacts at the local scale.