Complex network approach for detecting tropical cyclones

Tropical cyclones (TCs) are one of the most destructive natural hazards that pose a serious threat to society, particularly to those in the coastal regions. In this work, we study the temporal evolution of the regional weather conditions in relation to the occurrence of TCs using climate networks. Climate networks encode the interactions among climate variables at different locations on the Earth’s surface, and in particular, time-evolving climate networks have been successfully applied to study different climate phenomena at comparably long time scales, such as the El Niño Southern Oscillation, different monsoon systems, or the climatic impacts of volcanic eruptions. Here, we develop and apply a complex network approach suitable for the investigation of the relatively short-lived TCs. We show that our proposed methodology has the potential to identify TCs and their tracks from mean sea level pressure (MSLP) data. We use the ERA5 reanalysis MSLP data to construct successive networks of overlapping, short-length time windows for the regions under consideration, where we focus on the north Indian Ocean and the tropical north Atlantic Ocean. We compare the spatial features of various topological properties of the network, and the spatial scales involved, in the absence and presence of a cyclone. We find that network measures such as degree and clustering exhibit significant signatures of TCs and have striking similarities with their tracks. The study of the network topology over time scales relevant to TCs allows us to obtain crucial insights into the effects of TCs on the spatial connectivity structure of sea-level pressure fields.


Introduction
Complex networks are powerful tools used to study the collective behaviour of complex systems composed of many interacting dynamical units. A crucial step in this approach is that networks encode the underlying interaction structure of a complex system, allowing us to understand the intricate interplay between its structural and dynamical aspects (Boccaletti et al. 2006). Complex networks have been applied to the study of complex systems in various areas of science (Albert and Barabási 2002), e.g., the internet, ecological and neural networks, power grid systems, and science collaboration networks. In particular, the emergent property of synchronization in networked systems, arising due to a transfer of dynamical information according to the network topology, has been studied intensively (Arenas et al. 2008).
Over the last two decades, network theory has been successfully applied towards the understanding of complex climate phenomena (Tsonis and Roebber 2004; Tsonis et al. 2006; Tsonis and Swanson 2008; Donges et al. 2010; Yamasaki et al. 2008). In many applications, climate networks are reconstructed from a spatio-temporal dataset, wherein a time series is associated with every spatial grid point on the Earth's surface. Each spatial grid point of the dataset acts as a node representing an individual dynamical subsystem, and a network link is placed between a pair of nodes when the interaction between them is strong. Commonly, such interactions are determined via statistically significant values of suitable similarity measures, e.g., cross correlation (Donges et al. 2009; Ludescher et al. 2013, 2014; Meng et al. 2017), mutual information, or event synchronization (Malik et al. 2012; Stolbova et al. 2014; Boers et al. 2014, 2019), computed between the corresponding pairs of anomaly time series, i.e., the variation relative to the climatological normal. Different network measures are used to characterize the structural properties of the network over many spatial scales (i.e., the network topology), ranging from local properties such as the number of first neighbours of a node v (degree centrality), to global network measures such as the clustering coefficient or the average path length (Donges et al. 2009). For instance, the local clustering coefficient, a measure of similarity based on network topology, can be associated with spatial homogeneity of a rainfall field (Cheung and Ozturk 2020; Boers et al. 2013), while the regions of high betweenness centrality reveal flow of energy and information that can be related to transport phenomena such as global surface ocean currents and winds (Boers et al. 2013). Climate networks have proved to be a very promising approach in the study of global patterns of extreme-rainfall teleconnections (Boers et al. 2019), and have aided in the improved prediction of climate phenomena such as the El Niño (Ludescher et al. 2013, 2014; Meng et al. 2017), the Indian Summer Monsoon (Malik et al. 2012; Stolbova et al. 2014) and the South American Monsoon (Ciemer et al. 2018), which occur over seasonal or annual time scales.
Extreme weather phenomena such as tropical cyclones (TCs), which are relatively short-lived (average life span of 7-10 days) and destructive, have a significant impact on life and property (Emerton et al. 2020). Spatio-temporal patterns of heavy rainfall related to tropical cyclones have been investigated using climate network approaches (Traxl et al. 2016; Ozturk et al. 2018, 2019). Ozturk et al. (2019) employed tools from network theory to compare the spatial characteristics of extreme rainfall synchronicity, in and around the Japanese archipelago, due to the Baiu and tropical storm systems. However, limited attention has been given towards understanding the temporal evolution of network topology of the regional weather system during individual tropical cyclones, which occur over very short time scales.
In this paper, we study the topological and dynamical evolution of the regional weather conditions over a particular TC season. This involves the construction of climate networks over rather small spatial regions (TC basins), which evolve in time according to the time scale of the weather extreme, i.e., the underlying interaction structure of the meteorological fields cannot be considered as static. Evolving complex networks capture the mechanisms that contribute to the system's evolution. They are of particular importance to study real-world networks (Albert and Barabási 2000; Barrat et al. 2004; Gross and Blasius 2008; Hlinka et al. 2014) which undergo structural changes (addition or deletion of nodes and connections) as the system dynamically evolves (such as growth or aging processes). They exhibit rich dynamics, such as structure formation and evolving collective behaviour between some of the elements. Time-evolving complex networks have been used to investigate failure propagation in power grids (Albert et al. 2004; Li et al. 2018), hierarchical structures in the brain (Strogatz 2001; Zhou et al. 2006; Lehnertz et al. 2014), structural differences in the interconnectivity of the climate system between El Niño and La Niña conditions, early warning of El Niño events (Ludescher et al. 2013, 2014; Meng et al. 2017), transition of regional connectivity during the South American Monsoon onset (Ciemer et al. 2018), and the multiscale nature of Australian Summer Monsoon (ASM) development (Cheung and Ozturk 2020). In our work, we use networks constructed over overlapping short-length sliding time windows to compare the spatial patterns of the various topological properties, such as degree and clustering coefficient, and the spatial scales involved, for short time frames around the occurrence of individual TCs. Our analyses show that the regional system undergoes a characteristic spatial reorganization in the connectivity structure during a TC in such a way that the network measures are in close correspondence with the TC tracks. Finally, we confirm that our inferences hold true for different TC basins irrespective of the difference in the complexity of their dynamics.
In Sect. 2, we list the employed datasets, explain our choices of the spatial and temporal resolutions, and then outline our methodology. We then discuss our results in Sect. 3.

Reanalysis data
In this study, we use the state-of-the-art ERA5 reanalysis data for 3-hourly mean sea level pressure (MSLP) (Hersbach et al. 2020) over the sea. As TCs can undergo both rapid intensification and weakening within a span of a few hours, the use of the 3-hourly temporal resolution ensures a high enough temporal auto-correlation. Moreover, as TCs are short-lived, with a typical lifespan of ∼3-10 days, the sub-daily resolution adds more time points to the period in consideration. MSLP exhibits stronger variability at higher frequencies than sea surface temperatures (SSTs) or surface air temperatures (SATs), which enhances its sensitivity towards cyclone signals and thereby increases the possibility of capturing TCs in MSLP networks for the duration of their lifetime. The dataset is available at hourly resolution from 1950 to the present. The daily climatology is computed as the mean of the daily MSLP values over a period of 40 years. We remove the seasonal cycle from all time series of the dataset by calculating the anomaly time series, i.e., by subtracting the daily climatology of each day from all the hours of that day.
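The anomaly step described above can be sketched in a few lines. This is a minimal illustration and not the processing code used for the study: the nested-list layout `mslp[year][day][step]`, the function names `daily_climatology` and `anomalies`, and the assumption that every year has the same number of calendar days (leap days ignored) are all choices made here for brevity.

```python
from statistics import fmean


def daily_climatology(mslp):
    """Mean MSLP per calendar day, averaged over all years.

    mslp[y][d][s] is the value for year y, calendar day d, sub-daily step s
    (hypothetical layout; every year is assumed to have the same days).
    """
    n_days = len(mslp[0])
    clim = []
    for d in range(n_days):
        # Daily mean of each year first, then the mean over the years.
        daily_means = [fmean(year[d]) for year in mslp]
        clim.append(fmean(daily_means))
    return clim


def anomalies(mslp, clim):
    """Subtract the climatology of each day from all sub-daily values of that day."""
    return [
        [[value - clim[d] for value in day] for d, day in enumerate(year)]
        for year in mslp
    ]
```

With 3-hourly data, each innermost list would hold eight values, and all eight are shifted by the same daily climatological mean.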
The spatial resolution of the MSLP dataset plays a significant role in the identification and tracking of TCs (Kouroutzoglou et al. 2011); the probability of detecting cyclones increases with an increase in the resolution of the dataset. We use a high spatial resolution of 0.75° × 0.75°, which proves to be sufficient for our analysis. As the MSLP large-scale patterns are not so well determined by the land-sea boundary, and most TCs originate over the sea and dissipate shortly after landfall, we analyse the MSLP spatio-temporal dataset over the sea only. Furthermore, it should be noted that the MSLP over land is estimated by extrapolation of surface pressure in the models used in the ERA5 reanalysis and therefore may introduce artificial inconsistencies if compared to sea values. Our results in Sect. 3 show that the omission of land points does not affect the analysis of land-crossing cyclones (Figs. 2, 4).
We generate TC tracks from the Best Tracks data available over the north Indian Ocean basin (entire availability period of 1982-2020, Indian Meteorological Department) and the north Atlantic Ocean basin (entire availability period of 1851-2019, HURDAT2, NOAA; Landsea and Franklin 2013) to compare them with the results obtained from our analyses (see Data availability).

Functional network construction
In accordance with the idea of evolving networks, we divide the reanalysis data into overlapping short time windows and construct a climate network for each of these windows. The length of the time window is taken to be 10 days, which is of similar time scale as that of the typical lifespan of TCs, to better capture the effect of TCs on the dynamical and structural evolution of the network. The successive time windows have 9 days of overlap, i.e., the climate network evolves in daily steps (see Fig. 1). It should be noted that the results obtained in Sect. 3 do not depend strongly on the chosen parameters: networks constructed for time windows spanning up to 15 days yielded similar results.
[Figure caption fragment: network measures for the period Nov 10-19, 2018 during the cyclone. The TC tracks are represented by solid black circles whose sizes are scaled according to the cyclone intensity.]
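As an illustration of the windowing scheme, the sketch below lists the index ranges of successive windows for a 3-hourly series. The function name `sliding_windows` and its parameters are hypothetical; the defaults (8 steps per day, 10-day windows, 1-day shift) follow the values stated in the text.

```python
def sliding_windows(n_steps, steps_per_day=8, window_days=10, shift_days=1):
    """Return (start, stop) index pairs of overlapping windows.

    Each window covers window_days * steps_per_day time points and
    successive windows are shifted by shift_days * steps_per_day points.
    """
    width = window_days * steps_per_day
    shift = shift_days * steps_per_day
    return [
        (start, start + width)
        for start in range(0, n_steps - width + 1, shift)
    ]


# Twelve days of 3-hourly data give three 10-day windows in daily steps.
windows = sliding_windows(12 * 8)
```

Each pair can be used to slice the anomaly time series of every grid point, so that one network is built per window.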
Following the method of reconstruction of evolving climate networks, every node or spatial grid point of the TC basin is associated with a 3-hourly 10-day anomaly time series, i.e., 80 time points at each grid point. We employ the functional network representation (Tsonis and Roebber 2004; Tsonis et al. 2006; Tsonis and Swanson 2008; Donges et al. 2009, 2010) of the multivariate datasets to encode the position of strong statistical linkages between every pair of involved time series. Under this framework, we first measure the link strength between every pair of nodes by calculating Kendall's rank correlation coefficient (Kendall 1938, 1945) between the corresponding pair of time series at zero lag. It should be noted that a positive time lag may be used for TCs with a slower translation speed, such as those in the Atlantic Ocean, in which case the information transfer cannot be assumed to be instantaneous. We use Kendall's coefficient as it is known to perform better than other measures such as Pearson's correlation coefficient for short time series (Goswami et al. 2017). We only take into account correlation values that are statistically significant at a significance level of 0.05 and set all other values to zero. This gives a symmetric cross-correlation matrix. We then construct the climate network adjacency matrix $A_{ij}$ from the correlation matrix by considering the strongest 5% of the significant correlations to define the links. Among the thresholds ranging from the 80th to the 99.5th percentile of the correlation matrix, the 95th percentile was found to be the optimal choice for all our networks. $A_{ij} = 1$ if there is a link between nodes i and j, and $A_{ij} = 0$ otherwise. Thus, we obtain time-evolving climate networks $A_{ij}(t)$, one for each time window.
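The construction can be sketched as follows, under several assumptions that are not taken from the text: Kendall's tau is computed in its tau-a form without tie correction, the p-value uses the large-sample normal approximation, link strength is taken as the absolute value of tau, and the top 5% is counted among the significant pairs only. The names `kendall_tau` and `build_adjacency` are hypothetical. In practice a library routine such as `scipy.stats.kendalltau` would be the more usual choice.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a and a two-sided p-value (normal approximation)."""
    n = len(x)
    s = 0
    for a, b in combinations(range(n), 2):
        prod = (x[a] - x[b]) * (y[a] - y[b])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    tau = s / (n * (n - 1) / 2)
    # Under independence, Var(tau) = 2(2n + 5) / (9n(n - 1)).
    z = 3 * tau * math.sqrt(n * (n - 1)) / math.sqrt(2 * (2 * n + 5))
    return tau, math.erfc(abs(z) / math.sqrt(2))


def build_adjacency(series, alpha=0.05, top_fraction=0.05):
    """Binary adjacency matrix from the strongest significant correlations."""
    n = len(series)
    significant = {}
    for i, j in combinations(range(n), 2):
        tau, p = kendall_tau(series[i], series[j])
        if p < alpha:  # keep only statistically significant values
            significant[(i, j)] = abs(tau)  # assumption: strength = |tau|
    adj = [[0] * n for _ in range(n)]
    if not significant:
        return adj
    strengths = sorted(significant.values(), reverse=True)
    n_links = max(1, math.ceil(top_fraction * len(strengths)))
    threshold = strengths[n_links - 1]
    for (i, j), strength in significant.items():
        if strength >= threshold:  # ties at the threshold are all kept
            adj[i][j] = adj[j][i] = 1
    return adj
```

The pairwise loop is quadratic in both the number of nodes and the series length, so this form is only suitable for small illustrative grids.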

Network measures
We analyze the time variation of the topology of the interaction patterns in the regional climate system of the TC basin by using global and local network measures to characterize the climate networks (Donges et al. 2009; Yamasaki et al. 2008; Tsonis and Swanson 2008; Boers et al. 2013). We adopt several commonly used network measures (Newman 2010; Albert and Barabási 2002): degree, mean geographical link distance, and the local and global clustering coefficients.
The degree $k_i$ of a node i in a network gives the number of connections it has to all other nodes:

$$k_i = \sum_{j=1}^{n} A_{ij},$$

where n is the total number of nodes in the network, and $A_{ij}$ is the adjacency matrix. Regions with higher connectivity have larger values of k, while regions of low k values are indicative of a small-scale atmospheric process and are often related to large topographic barriers (Malik et al. 2012; Boers et al. 2014; Stolbova et al. 2014).
To obtain further insight into the spatial scales involved in the region during a cyclone, we calculate the mean geographical link distance $L_i$ (Malik et al. 2012; Boers et al. 2013; Stolbova et al. 2014), which associates a spatial length scale with each node i. The measure calculates the mean of the spatial distances of node i to all its connected neighbours j, along the corresponding great circles, i.e.,

$$L_i = \frac{1}{k_i} \sum_{j=1}^{n} A_{ij} L_{ij},$$

where $L_{ij}$ is the great-circle distance between nodes i and j, calculated using the Haversine formula for a spherical Earth projected onto a plane.
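A minimal sketch of the two node-wise measures defined so far, the degree and the mean geographical link distance, is given below. The function names, the Earth radius of 6371 km, the `(latitude, longitude)` coordinate convention in degrees, and the choice to return 0.0 for isolated nodes are assumptions of this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def degree(adj):
    """Number of links of each node (row sums of the adjacency matrix)."""
    return [sum(row) for row in adj]


def mean_link_distance(adj, coords):
    """Mean great-circle length of the links attached to each node."""
    result = []
    for i, row in enumerate(adj):
        lengths = [
            haversine_km(*coords[i], *coords[j])
            for j, linked in enumerate(row)
            if linked
        ]
        # Isolated nodes have no links; 0.0 is returned by convention here.
        result.append(sum(lengths) / len(lengths) if lengths else 0.0)
    return result
```

For three nodes on the equator spaced one degree apart, with node 0 linked to the other two, the degrees are 2, 1 and 1, and the mean link distance of node 0 is 1.5 times the length of one degree of longitude.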
The clustering coefficient is the measure of the degree to which nodes in a network tend to cluster together. The local clustering coefficient (Watts and Strogatz 1998) of a node i in a network quantifies how close its neighbours are to being a clique (i.e., a complete graph), that is, the average probability that a pair of node i's neighbours, j and h, are connected. Mathematically, we calculate the ratio of the links connecting the direct neighbours of node i to the number of all possible connections between them,

$$C_i = \frac{\sum_{j,h} A_{ij} A_{jh} A_{hi}}{k_i (k_i - 1)}.$$

The local clustering coefficient, $C_i$, measures control over flows between immediate neighbours of a node (Newman 2010). It indicates spatial continuity in the network.
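The local clustering coefficient can be computed directly from its definition as the fraction of neighbour pairs that are themselves linked. The sketch below is illustrative only; the function name `local_clustering` and the convention of returning 0.0 for nodes with fewer than two neighbours are assumptions.

```python
def local_clustering(adj):
    """Fraction of linked pairs among the neighbours of each node."""
    n = len(adj)
    result = []
    for i in range(n):
        neighbours = [j for j in range(n) if adj[i][j]]
        k = len(neighbours)
        if k < 2:
            # No neighbour pairs exist; 0.0 is used by convention here.
            result.append(0.0)
            continue
        # Count each unordered neighbour pair (j, h) once.
        links = sum(
            adj[j][h]
            for pos, j in enumerate(neighbours)
            for h in neighbours[pos + 1:]
        )
        result.append(2 * links / (k * (k - 1)))
    return result
```

For a triangle of nodes 0, 1, 2 with an extra node 3 attached only to node 0, nodes 1 and 2 have a coefficient of 1, node 0 has 1/3 (one linked pair out of three), and node 3 has 0.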
The global clustering coefficient C, also known as transitivity (Newman 2010), measures the average probability, over the whole network, that two neighbours of a vertex are themselves neighbours. It measures the density of triangles in the network and is defined as the fraction of paths of length two in the network (triplets of nodes) that are closed. This is equivalent to the number of closed triplets over the total number of triplets. As a triangle graph includes three closed triplets, one centred on each node, the number of closed triplets is equal to thrice the number of triangles:

$$C = \frac{3 \times \text{number of triangles}}{\text{number of all triplets}}.$$

For an undirected network with an adjacency matrix A, the global clustering coefficient is expressed as

$$C = \frac{\sum_{i,j,h} A_{ij} A_{jh} A_{hi}}{\sum_{i} k_i (k_i - 1)},$$

and C = 0 when the denominator is zero. If C = 1, perfect transitivity occurs in the network, i.e., the components of the network are all cliques. The global clustering coefficient is of interest because a higher C than expected by chance indicates the formation of localized structures of high connectivity in a network, e.g., the presence of tightly knit groups characterized by a high density of ties in a social network.
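The transitivity of an undirected network can be evaluated by counting closed and connected triplets. The following sketch uses the standard adjacency-matrix form, in which the triple sum over $A_{ij} A_{jh} A_{hi}$ counts every triangle six times and $\sum_i k_i (k_i - 1)$ counts every connected triplet twice; the function name `transitivity` is hypothetical, and the cubic loop is meant for small examples only.

```python
def transitivity(adj):
    """Global clustering coefficient of an undirected, unweighted network."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    # Each triangle contributes 6 ordered closed walks (i, j, h).
    closed = sum(
        adj[i][j] * adj[j][h] * adj[h][i]
        for i in range(n)
        for j in range(n)
        for h in range(n)
    )
    # Each connected triplet centred on i is counted twice.
    triplets = sum(k * (k - 1) for k in degrees)
    return closed / triplets if triplets else 0.0
```

For a triangle with one pendant node attached, there is one triangle and five connected triplets, so C = 3/5.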

Correction of boundary effects due to spatial embedding
As TCs are highly localized extreme weather events, the networks are constructed over areas of TC formation, i.e., TC basins, instead of taking the full globe into consideration, to enable a detailed understanding of the regional weather system. As mentioned earlier, we only consider grid points at sea. Therefore, in addition to the boundaries of the TC basin, the coastlines also spatially confine the regional networks. However, the introduction of such spatial boundaries cuts links that would connect the considered region with outside regions. This artificially reduces the degree of the nodes and the number of long links, and also influences the spatial patterns of any other network measure. Boundary effects depend on the distribution of link lengths and on the network measures themselves. As more links are cut for nodes closer to the boundaries than for nodes deep inside the region, the degree of the nodes near the boundaries is reduced more strongly than that of the nodes in the interior. In the case of the clustering coefficient, which depends on topological paths of length three, it is seen that nodes along the boundaries tend to have a higher tendency to cluster, while for the mean geographical distance the effects of boundaries become more complex (see Supporting Information of Boers et al. 2013).
In order to avoid any spurious conclusions arising solely from the effects of the spatial embedding (Barnett et al. 2007), we adopt a correction procedure (Rheinwalt et al. 2012) for the considered network measures as follows: We first construct 1000 spatially embedded random networks (SERN) that preserve both the node positions in space and the link probability, depending on the spatial link lengths of the original network. Thereafter, we calculate each of the considered network measures on all the SERN surrogates. We then estimate the boundary effects on the network measure by taking the average of that measure over the ensemble of surrogates. The corrected network measure is finally obtained by dividing the network measure of the original network by the corresponding average measure of the SERN surrogates. As the corrected network measure thus gives the value of the network measure relative to the value expected from the spatial embedding, it is dimensionless.
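The correction procedure can be sketched as below. This is a simplified stand-in and not the surrogate algorithm of Rheinwalt et al. (2012): here the link probability is estimated in equal-width distance bins and every node pair is drawn independently with the probability of its bin. The names `sern_surrogate` and `corrected_measure`, the number of bins, and the fixed random seed are assumptions of this illustration.

```python
import random


def sern_surrogate(adj, dist, n_bins=10, rng=random):
    """Random network with the same nodes and a similar link-length distribution.

    dist[i][j] is the spatial distance between nodes i and j; distinct
    nodes are assumed to have positive distance.
    """
    n = len(adj)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    d_max = max(dist[i][j] for i, j in pairs)

    def bin_of(i, j):
        return min(int(n_bins * dist[i][j] / d_max), n_bins - 1)

    total = [0] * n_bins   # node pairs per distance bin
    linked = [0] * n_bins  # linked node pairs per distance bin
    for i, j in pairs:
        b = bin_of(i, j)
        total[b] += 1
        linked[b] += adj[i][j]

    surrogate = [[0] * n for _ in range(n)]
    for i, j in pairs:
        b = bin_of(i, j)
        # Draw the link with the empirical link probability of its bin.
        if rng.random() < linked[b] / total[b]:
            surrogate[i][j] = surrogate[j][i] = 1
    return surrogate


def corrected_measure(measure, adj, dist, n_surrogates=1000, seed=0):
    """Node-wise measure divided by its mean over the surrogate ensemble."""
    rng = random.Random(seed)
    original = measure(adj)
    sums = [0.0] * len(adj)
    for _ in range(n_surrogates):
        values = measure(sern_surrogate(adj, dist, rng=rng))
        for i, value in enumerate(values):
            sums[i] += value
    expected = [s / n_surrogates for s in sums]
    # Nodes whose expected value is zero are set to 0.0 by convention.
    return [o / e if e else 0.0 for o, e in zip(original, expected)]
```

Here `measure` is any function mapping an adjacency matrix to a list of node-wise values, for example the degree. A corrected value above 1 means the node scores higher than expected from the spatial embedding alone.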

Results and discussion
In this section, we show how TCs affect the topological properties of the constructed regional networks. We first use the methodology described in the previous section to study some recent TCs in the North Indian Ocean (NIO) basin, which extends from 49.5°E to 100°E and from 34.5°N to 4.5°N. The NIO basin exhibits a unique bimodal TC season (Wahiduzzaman et al. 2017; Li et al. 2013; Vissa et al. 2013) in the monsoon transition periods, namely pre-monsoon (March-April-May; MAM) and post-monsoon (September-October-November-December; SOND), which can be attributed to the annual cycle of the background vertical shear of the horizontal winds (Gray 1968; Camargo et al. 2007). We focus on TCs occurring in the SOND season over the last decade (2009-2018), which has a comparatively higher TC frequency than the pre-monsoon season, primarily due to the difference in mean relative humidity between the two periods (Li et al. 2013).
A comparison between the network measures before and during the Very Severe Cyclonic Storm (VSCS) Gaja, which formed in the Bay of Bengal and later crossed the Indian peninsula into the Arabian Sea, is shown in Fig. 2. The degree, mean geographical distance and local clustering coefficient fields (Fig. 2a-c) for the period Oct 29-Nov 7, 2018, demonstrate that there are no definite structures in their spatial distributions in the absence of a cyclone. However, in the presence of a TC, the network measures spatially organize themselves to exhibit definite patterns, as shown in Fig. 2d-f for the period Nov 10-19, 2018, when the TC occurred. The nodes along the TC track have a lower degree k than the surrounding regions. The spatial pattern for the mean geographical distance L is very similar to that of k. This is because the nodes containing the cyclone are only connected to each other and are no longer connected to non-cyclone nodes. As TCs are highly localised events, the area (number of nodes) affected by the cyclone is much smaller than that of the unaffected regions. These mesoscale convective systems reduce the values of both k and L in the affected regions compared to their surroundings. While shorter links constitute the cyclone regions, as indicated by lower L values in Fig. 2e, surrounding regions have higher L values as longer links connect the regions separated by the cyclone track. Therefore, the cyclone track separates a region of high connectivity into two. It should be noted that the mean geographical distances in Figs. 2, 3, 4, 5, 7 are dimensionless quantities because of the correction of bias due to spatial embedding, as mentioned in Sect. 2.2.
On the contrary, the local clustering coefficient field (Fig. 2f) is relatively high in the localized region containing the TC. Cyclone tracks have a striking connection with high clustering coefficient nodes. This indicates spatial continuity in the network along the track. The explanation for the increase in $C_i$ with decreasing degree for nodes containing the cyclones is that these nodes tend to form a tightly-knit small group, with connections mostly within the group. Such a group becomes mostly detached from the rest of the network and tends to function on its own as an isolated small network and hence has higher local clustering. This separates the dynamics of a TC from the rest of the region. Network measures for various other TCs of the NIO basin, like VSCS Luban and Titli in 2018 (Fig. 3), VSCS Vardah in 2016, which crossed the Indian peninsula and formed the Depression ARB 02 (Fig. 4), and the Extremely Severe Cyclonic Storm Megh and Deep Depression BOB 03 in 2015 (Fig. 5), exhibit similar characteristic behaviour (lower degree and mean geographical distance, but higher clustering along the TC track), which further supports the arguments presented in the previous paragraphs.
[Figure caption fragment: ... Nov 8-10, 2015). The TC tracks are represented by solid black circles whose sizes are scaled according to the cyclone intensity.]
Next, we construct a time series using the global clustering coefficient value, C, of each successive network for the TC season in consideration, against the date corresponding to the middle day of the network period. Figure 6 shows the variation of C for daily evolving networks constructed over the 2018 SOND season of the NIO basin. We find that one or more TC events can be associated with networks having relatively high values of C. This indicates that the networks during TCs are more transitive due to the formation of localized structures of high connectivity. It should be noted that the choice of plotting C against the middle day of the network associates with it a temporal tolerance of ±5 days, due to the time span covered by each network.
We also confirm our findings by applying the methodology to analyse TCs in the North Atlantic Ocean (NAO) TC basin during the months of August and September, when the peak of the hurricane season occurs there. We find that with the same choice of network window length and time lag as for the TCs in the NIO basin, similar observations in spatial patterns of degree, mean geographical distance, and local clustering coefficient can be made for Hurricane Irma, which occurred during Aug 30-Sep 13, 2017 (see Fig. 7). The network is constructed over a spatial region extending from 10°N to 42°N and from 42°W to 100°W for the period Sep 1-10, 2017. Therefore, despite the differences in basin properties between the NAO and the NIO basins, for example in sea surface temperature, a similar topological network evolution of the regions takes place during a TC. For a detailed analysis of Atlantic hurricanes, one can choose a longer time window (> 10 days) and a positive temporal lag for constructing the evolving networks, as Atlantic hurricanes typically have a longer lifespan and are slow-moving.
Through the topological study of the evolution of climate networks over the TC basin, we thus see that the regional network topology undergoes a specific rearrangement during a TC. Although there will be a rearrangement in the connectivity structure of the network during any low pressure system, the above observed signatures are strongest for TCs. Since the degree and mean geographical distance are proportional to the size of the low pressure system, for larger low pressure systems such as the monsoon trough (see Fig. 9 in Stolbova et al. 2014), these measures would have comparable values with the other regions. Hence, the above observations cannot be uniquely associated with any general low pressure system formation.

Conclusions
The evolving climate networks approach has been very promising to study the impacts of long-lived extreme events such as strong El Niños, extreme monsoon rainfall, or volcanic eruptions, on the topology of climate interaction networks. In this work, we used this approach to demonstrate its potential to study the evolution of weather extremes occurring over very short time scales of only a few days. Through a spatio-temporal analysis of the TC basin, we extract insightful information about the underlying dynamical organization of the regional weather system during TCs. We have investigated the temporal evolution of mean sea level pressure patterns by utilizing various topological properties of functional climate networks. We have used three network measures, viz. degree, mean geographical distance and clustering coefficient, which characterize the changes in connectivity structure from three different perspectives: centrality of nodes, associated spatial length scale and tendency to form clusters. We have employed sliding windows of 10 days, successively shifted by 1 day over the period of a TC season: the post-monsoon Sep-Oct-Nov-Dec cyclone season of the north Indian Ocean basin, and the Aug-Sep season of the north Atlantic Ocean basin. We find that cyclone-affected regions exhibit an increase in the local clustering values $C_i$ along with a decreasing degree k as compared to their surroundings, implying the formation of an almost isolated system within the network based on mean sea level pressure. The highly localized nature of these tropical storms leads to such a behaviour, as is evident from the lower values of mean geographical distance L along the nodes affected by the cyclone. We have uncovered that there is a close resemblance of cyclone tracks with high $C_i$ nodes if the TC event occurs during the span of the network time window. Such networks also tend to have relatively high global clustering coefficient values. This indicates that the regions along the TC track are localized structures in the network with high connectivity, along which there is a continuous flow.
In conclusion, evolving climate networks can be used to study weather variability that occurs over much shorter, daily time scales, provided the length of the time window and the temporal resolution of the data are chosen in accordance with the weather phenomenon in consideration. Application of this approach over time scales relevant for TCs allowed us to gain deeper insights into the individual local signatures of changes in the flow structure of the regional weather system, in contrast to generic long-term topological changes reported in earlier works (Ozturk et al. 2018, 2019). This methodology using network-based indicators has a strong potential to detect cyclones and their tracks from MSLP outputs from models as well as other inputs. Further investigations involving network tools to analyse the topological changes in the various layers of the atmosphere during TCs using geopotential height, and the use of the evolving network approach to study the variability of teleconnections over decadal scales, can be outlined as relevant topics for future research.