Reanalysis data
In this study, we use 3-hourly mean sea level pressure (MSLP) over the sea from the state-of-the-art ERA5 reanalysis (Hersbach et al. 2020). As TCs can undergo both rapid intensification and rapid weakening within a few hours, the 3-hourly temporal resolution ensures a sufficiently high temporal auto-correlation. Moreover, as TCs are short-lived, with a typical lifespan of \(\sim\)3–10 days, the sub-daily resolution adds more time points to the period under consideration. MSLP exhibits stronger variability at higher frequencies than sea surface temperatures (SSTs) or surface air temperatures (SATs), which enhances its sensitivity to cyclone signals and thereby increases the likelihood of capturing TCs in MSLP networks throughout their lifetime. The dataset is available at hourly resolution from 1950 to the present. The daily climatology is computed as the mean of the daily MSLP values for each calendar day over a period of 40 years (1979–2018). We remove the seasonal cycle from all time series by calculating anomalies, i.e., by subtracting the climatology of each calendar day from all 3-hourly values of that day.
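The anomaly step described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function name `daily_anomalies`, the array layout, and the `clim_mask` argument are assumptions. It also assumes that every calendar day in the record has samples inside the base period and the same number of samples per day, so that averaging all base-period samples of a calendar day equals averaging the daily means.

```python
import numpy as np

def daily_anomalies(mslp, day_of_year, clim_mask):
    """Subtract a calendar-day climatology from sub-daily data.

    mslp        : float array (n_times, n_nodes), e.g. 3-hourly MSLP
    day_of_year : int array (n_times,), calendar-day index of each sample
    clim_mask   : bool array (n_times,), True for samples inside the
                  climatology base period (1979-2018 in the text)
    """
    anomalies = np.empty_like(mslp, dtype=float)
    for day in np.unique(day_of_year):
        on_day = day_of_year == day
        # Climatology of this calendar day: mean over all base-period samples.
        climatology = mslp[on_day & clim_mask].mean(axis=0)
        # The same daily value is removed from every sample of that day.
        anomalies[on_day] = mslp[on_day] - climatology
    return anomalies
```

Because one climatological value is removed per calendar day, the sub-daily variability within each day is left untouched.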
The spatial resolution of the MSLP dataset plays a significant role in the identification and tracking of TCs (Kouroutzoglou et al. 2011); the probability of detecting cyclones increases with the resolution of the dataset. We use a high spatial resolution of \(0.75{^\circ }\times 0.75{^\circ }\), which proves to be sufficient for our analysis. As large-scale MSLP patterns are not strongly determined by the land-sea boundary, and most TCs originate over the sea and dissipate shortly after landfall, we analyse the MSLP spatiotemporal dataset over the sea only. Furthermore, MSLP over land is estimated by extrapolating surface pressure in the models used in the ERA5 reanalysis and may therefore introduce artificial inconsistencies when compared with values over the sea. Our results in Sect. 3 show that the omission of land points does not affect the analysis of land-crossing cyclones (Figs. 2, 4).
We generate TC tracks from the Best Track data available for the north Indian Ocean basin (available for 1982–2020; India Meteorological Department) and the north Atlantic Ocean basin (available for 1851–2019; HURDAT2, NOAA; Landsea and Franklin 2013) and compare them with the results obtained from our analyses (see Data availability).
Evolving networks
Functional network construction
In accordance with the idea of evolving networks, we divide the reanalysis data into short overlapping time windows and construct a climate network for each of these windows. The window length is 10 days, which is similar to the typical lifespan of TCs, so that the effect of TCs on the dynamical and structural evolution of the network is captured better. Successive time windows overlap by 9 days, i.e., the climate network evolves in daily steps (see Fig. 1). The results in Sect. 3 do not depend strongly on the chosen parameters; networks constructed for time windows spanning up to 15 days yielded similar results.
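As an illustration of the windowing, the sketch below slices a 3-hourly anomaly array into 10-day windows that advance in daily steps. The function and constant names are hypothetical, and the array layout (time along the first axis) is an assumption.

```python
import numpy as np

SAMPLES_PER_DAY = 8   # 3-hourly data
WINDOW_DAYS = 10      # window length given in the text
STEP_DAYS = 1         # 9 days of overlap, i.e. daily steps

def sliding_windows(anomalies, window_days=WINDOW_DAYS, step_days=STEP_DAYS,
                    samples_per_day=SAMPLES_PER_DAY):
    """Yield (start_index, window) pairs from an (n_times, n_nodes) array.

    With the defaults each window holds 10 * 8 = 80 time points per node.
    """
    length = window_days * samples_per_day
    step = step_days * samples_per_day
    n_times = anomalies.shape[0]
    # Only complete windows are returned; a trailing partial window is dropped.
    for start in range(0, n_times - length + 1, step):
        yield start, anomalies[start:start + length]
```

For a 12-day record this yields three windows starting at samples 0, 8 and 16, and consecutive windows share 72 of their 80 samples.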
Following the method of reconstruction of evolving climate networks, every node, i.e., every spatial grid point of the TC basin, is associated with a 3-hourly 10-day anomaly time series, i.e., 80 time points at each grid point. We employ the functional network representation (Tsonis and Roebber 2004; Tsonis et al. 2006; Tsonis and Swanson 2008; Donges et al. 2009, 2010) of the multivariate dataset to encode the position of strong statistical linkages between every pair of time series. Under this framework, we first measure the link strength between every pair of nodes by calculating Kendall’s rank correlation coefficient \(\tau\) (Kendall 1938, 1945) between the corresponding pair of time series at zero lag. A positive time lag may be used for TCs with a slower translation speed, such as those in the Atlantic Ocean, in which case the information transfer cannot be assumed to be instantaneous. We use Kendall’s \(\tau\) because it is known to perform better than other measures, such as Pearson’s correlation coefficient, for short time series (Goswami et al. 2017). We only take into account correlation values that are statistically significant at a significance level of 0.05 and set all other values to zero. This gives a symmetric cross-correlation matrix. We then construct the climate network adjacency matrix \(A_{ij}\) from the correlation matrix by taking the strongest 5% of the significant correlations to define the links. Among thresholds ranging from the 80th to the 99.5th percentile of the correlation matrix, the 95th percentile was found to be the optimal choice for all our networks. \(A_{ij}=1\) if there is a link between nodes i and j, and \(A_{ij}=0\) otherwise. Thus, we obtain a time-evolving climate network \(A_{ij}(t)\), with one network per time window.
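The construction of one network from one time window might look like the sketch below. It is an illustration under stated assumptions, not the authors' code: the function name `build_adjacency` is hypothetical, the p-values are those returned by `scipy.stats.kendalltau`, the percentile is taken over the significant off-diagonal entries only, and signed (not absolute) values of \(\tau\) are thresholded. The text does not specify the last two points; thresholding `np.abs(corr)` would be the alternative reading.

```python
import numpy as np
from scipy.stats import kendalltau

def build_adjacency(window, alpha=0.05, percentile=95.0):
    """Binary adjacency matrix from an (n_times, n_nodes) anomaly window."""
    n_nodes = window.shape[1]
    corr = np.zeros((n_nodes, n_nodes))
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            tau, p_value = kendalltau(window[:, i], window[:, j])
            # Keep only significant correlations; all others stay zero.
            if p_value < alpha:
                corr[i, j] = corr[j, i] = tau
    upper = corr[np.triu_indices(n_nodes, k=1)]
    significant = upper[upper != 0]
    adjacency = np.zeros((n_nodes, n_nodes), dtype=int)
    if significant.size == 0:
        return adjacency          # no significant correlation, no links
    # Assumption: threshold at the given percentile of the significant values.
    threshold = np.percentile(significant, percentile)
    # Links are placed only on significant entries at or above the threshold.
    adjacency[(corr != 0) & (corr >= threshold)] = 1
    return adjacency
```

The double loop is quadratic in the number of grid points, so a real basin-sized grid would likely need a vectorised or parallel rank-correlation routine; the loop is kept here for readability.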
Network measures
We analyse the time variation of the topology of the interaction patterns in the regional climate system of the TC basin by using global and local network measures to characterize the climate networks (Donges et al. 2009, 2010; Yamasaki et al. 2008; Tsonis and Swanson 2008; Boers et al. 2013). We adopt several commonly used network measures (Newman 2010; Albert and Barabási 2002): degree, mean geographical link distance, and the local and global clustering coefficients.
The degree \(k_{i}\) of a node i in a network gives the number of connections it has to all other nodes:
$$\begin{aligned} k_{i}=\sum _{j=1}^{n}A_{ij} \end{aligned}$$
(1)
where n is the total number of nodes in a network, and \(A_{ij}\) is the adjacency matrix. Regions with higher connectivity have larger values of k, while regions of low k values are indicative of a small-scale atmospheric process and are often related to large topographic barriers (Malik et al. 2012; Boers et al. 2014; Stolbova et al. 2014).
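As a small worked example of Eq. (1), consider a hypothetical four-node undirected network with links 0-1, 0-2, 1-2 and 2-3. The network and the function name are illustrative only.

```python
import numpy as np

# Hypothetical 4-node undirected network: links 0-1, 0-2, 1-2, 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

def degree(adjacency):
    """k_i = sum_j A_ij, Eq. (1): row sums of the adjacency matrix."""
    return adjacency.sum(axis=1)

k = degree(A)   # -> array([2, 2, 3, 1])
```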
To obtain further insight into the spatial scales involved in the region during a cyclone, we calculate the mean geographical link distance \({\mathcal {L}}_{i}\) (Malik et al. 2012; Boers et al. 2013; Stolbova et al. 2014), which associates a spatial length scale with each node i. The measure calculates the mean of the spatial distances of node i to all its connected neighbours j, along the corresponding great-circles, i.e.,
$$\begin{aligned} {\mathcal {L}}_{i}=\frac{1}{k_{i}}\sum _{j=1}^{n}{\mathcal {L}}_{ij}A_{ij} \end{aligned}$$
(2)
where \({\mathcal {L}}_{ij}\) is the great-circle distance between nodes i and j, calculated using the haversine formula for a spherical Earth projected onto a plane.
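Equation (2) can be sketched with a pairwise haversine distance matrix as below. The function names, the Earth radius of 6371 km, and the convention of returning 0 for isolated nodes (for which Eq. (2) is undefined) are assumptions made for this illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0   # mean Earth radius; assumed value

def haversine_matrix(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """Pairwise great-circle distances between nodes on a spherical Earth."""
    lat = np.radians(lat_deg)[:, None]
    lon = np.radians(lon_deg)[:, None]
    dlat = lat - lat.T
    dlon = lon - lon.T
    a = np.sin(dlat / 2) ** 2 + np.cos(lat) * np.cos(lat.T) * np.sin(dlon / 2) ** 2
    # Clip guards against rounding slightly outside [0, 1].
    return 2 * radius * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def mean_link_distance(adjacency, dist):
    """L_i = (1 / k_i) * sum_j L_ij A_ij, Eq. (2)."""
    k = adjacency.sum(axis=1)
    total = (adjacency * dist).sum(axis=1)
    # Assumption: nodes without links (k_i = 0) are assigned 0.
    return np.divide(total, k, out=np.zeros(len(k)), where=k > 0)
```

For example, with nodes on the equator at longitudes 0°, 90° and 180° and links 0-1 and 0-2, node 0 has a mean link distance of three quarters of \(\pi\) times the radius.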
The clustering coefficient is the measure of the degree to which nodes in a network tend to cluster together. The local clustering coefficient (Watts and Strogatz 1998) of a node i in a network quantifies how close its neighbours are to being a clique (i.e., a complete graph), that is, the average probability that a pair of node i’s neighbours, j and h, are connected. Mathematically, we calculate the ratio of the links connecting the direct neighbours of node i to the number of all possible connections between them,
$$\begin{aligned} C_{i}=\frac{{\sum _{j,h=1}^{n}}A_{ij}A_{ih}A_{jh}}{k_{i}\left( k_{i}-1\right) } \end{aligned}$$
(3)
The local clustering coefficient \(C_i\) measures control over flows between the immediate neighbours of a node (Newman 2010) and indicates spatial continuity in the network.
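A minimal sketch of Eq. (3) for a symmetric binary adjacency matrix is given below. The function name is hypothetical, and setting \(C_i=0\) for nodes with fewer than two neighbours is a convention assumed here, since Eq. (3) is undefined in that case.

```python
import numpy as np

def local_clustering(adjacency):
    """C_i = sum_{j,h} A_ij A_ih A_jh / (k_i (k_i - 1)), Eq. (3)."""
    A = np.asarray(adjacency, dtype=float)
    k = A.sum(axis=1)
    # For symmetric A, sum_j (A @ A)_ij * A_ij equals the numerator of Eq. (3):
    # the number of ordered neighbour pairs (j, h) of node i that are linked.
    closed = ((A @ A) * A).sum(axis=1)
    pairs = k * (k - 1)            # number of ordered neighbour pairs
    # Assumption: nodes with k_i < 2 are assigned C_i = 0.
    return np.divide(closed, pairs, out=np.zeros_like(k), where=pairs > 0)
```

For a four-node network with links 0-1, 0-2, 1-2 and 2-3, nodes 0 and 1 have \(C_i=1\), node 2 has \(C_i=1/3\) (one of its three neighbour pairs is linked), and node 3 has \(C_i=0\) under the convention above.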
The global clustering coefficient C, also known as transitivity (Newman 2010), measures the average probability, over the whole network, that two neighbours of a vertex are themselves neighbours. It measures the density of triangles in the network and is defined as the fraction of paths of length two in the network (triplets of nodes) that are closed. This is equivalent to the number of closed triplets over the total number of triplets. As a triangle graph includes three closed triplets, one centred on each node, the number of closed triplets is equal to thrice the number of triangles:
$$\begin{aligned} C=\frac{3\times \text {the number of triangles}}{\text {number of all triplets}} \end{aligned}$$
(4)
For an undirected network with an adjacency matrix A, the global clustering coefficient is expressed as:
$$\begin{aligned} C=\frac{\sum _{i,j,h}A_{ij}A_{jh}A_{hi}}{\sum _{i}k_{i}\left( k_{i}-1\right) } \end{aligned}$$
(5)
and \(C=0\) when the denominator is zero. If \(C=1\), perfect transitivity occurs in the network, i.e., the components of the network are all cliques. The global clustering coefficient is of interest because a higher C than expected by chance indicates the formation of localized structures of high connectivity in a network, e.g., the presence of tightly knit groups characterized by a high density of ties in a social network.
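The two expressions for C agree because the numerator of Eq. (5), the trace of the cubed adjacency matrix, counts every triangle six times, while the denominator counts every triplet twice. A short sketch with a hypothetical function name:

```python
import numpy as np

def global_clustering(adjacency):
    """Transitivity of an undirected network, Eq. (5)."""
    A = np.asarray(adjacency, dtype=float)
    k = A.sum(axis=1)
    closed = np.trace(A @ A @ A)       # 6 x number of triangles
    triplets = (k * (k - 1)).sum()     # 2 x number of triplets
    # C = 0 when the denominator is zero, as stated in the text.
    return closed / triplets if triplets > 0 else 0.0
```

For a four-node network with links 0-1, 0-2, 1-2 and 2-3 there is one triangle and five triplets, so Eq. (4) gives \(C=3/5\), and Eq. (5) gives \(6/10\), the same value.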
Correction of boundary effects due to spatial embedding
As TCs are highly localized extreme weather events, the networks are constructed over areas of TC formation, i.e., TC basins, rather than over the full globe, to enable a detailed understanding of the regional weather system. As mentioned earlier, we only consider grid points at sea. Therefore, in addition to the boundaries of the TC basin, the coastlines also spatially confine the regional networks. However, the introduction of such spatial boundaries cuts links that would connect the considered region with outside regions. This artificially reduces the degree of the nodes and the number of long links, and also influences the spatial patterns of any other network measure. Boundary effects depend on the distribution of link lengths and on the network measures themselves. As more links are cut for nodes close to the boundaries than for nodes deep inside the region, the degree of nodes near the boundaries is reduced more strongly than that of nodes in the interior. In the case of the clustering coefficient, which depends on topological paths of length three, nodes along the boundaries tend to cluster more strongly, while for the mean geographical link distance the effects of boundaries are more complex (see Supporting Information of Boers et al. 2013).
In order to avoid any spurious conclusions arising solely from the effects of the spatial embedding (Barnett et al. 2007), we adopt a correction procedure (Rheinwalt et al. 2012) for the considered network measures as follows: We first construct 1000 spatially embedded random networks (SERN) that preserve both the node positions in space and the link probability, depending on the spatial link lengths of the original network. Thereafter, we calculate each of the considered network measures on all the SERN surrogates. We then estimate the boundary effects on the network measure by taking the average of that measure over the ensemble of surrogates. The corrected network measure is finally obtained by dividing the network measure of the original network by the corresponding average measure of the SERN surrogates. As the corrected network measure thus gives the value of the network measure relative to the value expected from the spatial embedding, it is dimensionless.
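The correction procedure can be sketched as follows. This is one plausible reading of the steps above, not the implementation of Rheinwalt et al. (2012): the binning of link lengths, the function names, and the `bin_edges` argument are assumptions. The link probability is estimated per distance bin as the fraction of node pairs in that bin that are linked, and each surrogate draws every possible link independently with that probability.

```python
import numpy as np

def link_probability(adjacency, dist, bin_edges):
    """Empirical link probability per distance bin (assumed binning)."""
    iu = np.triu_indices_from(adjacency, k=1)
    bins = np.digitize(dist[iu], bin_edges)
    n_bins = len(bin_edges) + 1
    pairs = np.bincount(bins, minlength=n_bins)
    links = np.bincount(bins, weights=adjacency[iu], minlength=n_bins)
    prob = np.divide(links, pairs, out=np.zeros(n_bins), where=pairs > 0)
    return prob, bins, iu

def sern_surrogate(prob, bins, iu, n_nodes, rng):
    """One random network with the same node positions and the same
    distance-dependent link probability as the original."""
    draw = rng.random(bins.size) < prob[bins]
    S = np.zeros((n_nodes, n_nodes), dtype=int)
    S[iu] = draw
    return S + S.T                      # symmetric, zero diagonal

def corrected_measure(adjacency, dist, measure, bin_edges,
                      n_surrogates=1000, seed=0):
    """Measure of the original network divided by its surrogate average."""
    rng = np.random.default_rng(seed)
    prob, bins, iu = link_probability(adjacency, dist, bin_edges)
    n = adjacency.shape[0]
    expected = np.mean([measure(sern_surrogate(prob, bins, iu, n, rng))
                        for _ in range(n_surrogates)], axis=0)
    observed = measure(adjacency)
    # Entries whose surrogate average is zero are left as NaN.
    return np.divide(observed, expected,
                     out=np.full(np.shape(expected), np.nan),
                     where=expected > 0)
```

A call such as `corrected_measure(A, dist, lambda S: S.sum(axis=1), bin_edges)` would return the boundary-corrected degree; values above 1 then indicate nodes with more links than the spatial embedding alone would suggest.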