The role of Atlantic variability in modulating the tropical cyclone formation in the Australian region

Previously the interannual variability of tropical cyclone genesis (TCG) in the Australian region has mainly been attributed to the climate variability in the Pacific and Indian Oceans. In this study, we found that the influence from climate variability in the Atlantic is of equal importance. Application of a state-of-the-art causality analysis reveals that the Atlantic meridional mode (AMM), Atlantic multidecadal oscillation (AMO) and north tropical Atlantic (NTA) sea surface temperature (SST) anomalies are all causal to the Australian region TCG frequency. The associated physical mechanisms are investigated as well. Based on this causal analysis and inference, a statistical model is constructed to forecast TCG, using the Poisson regression and the step-by-step predictor selection method. The Atlantic causal factors, after being taken in as new predictors, help increase the forecast skill for the seasonal Australian region TCG by as much as 10% in terms of correlation increase and 40% in terms of root-mean-square error reduction.


Introduction
Tropical cyclones (TCs) are strong atmospheric vortices with a warm-cored, low-pressure structure that form over tropical or subtropical oceans. The Australian region (0-30° S, 100°-165° E) is climatologically active for TCs, with a typical seasonal average of 12.5 TCs (Liu and Chan 2012; Chand et al. 2019). TCs can bring heavy precipitation, flooding, and storm surge, which pose a recurring and growing threat to coastal communities throughout the Australian TC region (Holmes 2021). Effective forecasting of TC activity is therefore important, and it relies on a solid understanding of how atmospheric and oceanic factors influence TC activity.
The influence of various environmental and climate factors on the variability of TC activity in the Australian region has been widely investigated since the late 1970s. Several important large-scale environmental factors have been examined and verified to affect TC activity in the Australian region. These include sea surface temperature (SST) and a deep thermocline, weak deep-tropospheric vertical wind shear, conditional instability through a deep tropospheric layer, and relatively large values of low-level cyclonic vorticity (Ramsay et al. 2008). Kuleshov et al. (2009) repeated these studies and proposed high values of relative humidity in the mid-troposphere as another major modulator of the genesis and continued support of TCs. In addition to the effect of environmental variables, McBride and Keenan (2010) suggested a strong seasonal relationship between the intertropical convergence zone (ITCZ) and TC genesis (TCG) in the Australian basin from the 1950s to the 1970s. Intraseasonally, the Madden-Julian oscillation (MJO) plays a significant role in the Australian region TCG (Hall et al. 2011): more (fewer) TCs tend to form in the active (inactive) phase of the MJO. Note that nearly 50% of the TCs in their study formed within 300 km of land, quite different from those in the Atlantic and western North Pacific, more than 80% of which form far offshore (Werner and Holbrook 2011; Parker et al. 2018). On the interannual time scale, many studies have investigated the relationship between the El Niño-Southern Oscillation (ENSO) phenomenon and TC activity in the Australian region. Werner and Holbrook (2011) investigated the influence of SST on variations in Australian TC activity and found that ENSO acts as a dominant modulator. Kuleshov et al. (2008) examined the connection between ENSO and TC activity in the Southern Hemisphere and showed the differences in TCG between El Niño and La Niña years.
Other work has explored the link between ENSO extremes, the Australian monsoon trough, and TC activity. Significant differences in the structure of the monsoon trough and the associated TC activity were related to ENSO phase. ENSO is therefore considered a significant factor influencing the mean TC genesis location in the Australian region.
In addition to the effect of ENSO on TCG, Nicholls (2010) found a relation between interannual variations in TC numbers in the Australian region and Darwin sea level pressure (SLP) averaged over June to August. After the introduction of satellites, which aided the detection of TCs, the relationship between Darwin SLP and TC number was found to be even stronger. Grant and Walsh (2001) examined the relation between the interdecadal variability of TC formation in the northeastern Australian region and the Interdecadal Pacific Oscillation. Broadbridge and Hanstrum (1998) examined TC activity in the Australian region and its relationship with the Southern Oscillation index (SOI). Increases in TC frequency and in landfalling impacts were found for strongly positive SOI values. Various studies have confirmed the relationship between TCG in the Australian region and ENSO indices, SST in the Niño regions, or the SOI (Butler and Callaghan 2007; Hastings 2010; Kuleshov et al. 2014). The geographical distribution of TCG shifts westward (eastward) and southward (northward) during positive (negative) phases of the SOI.
Additionally, some previous studies have developed seasonal predictions of TC activity in the Australian region. McDonnell and Holbrook (2004) proposed a statistical model for the seasonal forecast of TC activity based on the SOI and the saturated equivalent potential temperature gradient. Flay and Nott (2010) constructed a statistical model for the prediction of TC landfalls in Queensland using the SOI. In another study, Goebbert (2009) developed a prediction scheme for the northwest Australian TC frequency based on a set of NCEP-NCAR reanalysis fields (e.g., geopotential height, air temperature, and wind components) that are highly correlated with TC frequency.
In the above studies on the interannual variability of TCG over the Australian region, two outstanding problems have received wide attention. The first is that the climate factors related to TC activity are limited to local environmental drivers in the Australian region. Few previous studies have examined factors affecting TCG across the entire Southern Hemisphere or the whole globe. The restricted selection of environmental factors (i.e., predictors) thus inevitably causes inaccuracy in the seasonal prediction of TC activity. For example, predictors such as the SOI and SST in the South Indian and South Pacific Oceans are most commonly used to build statistical or dynamical models (2003). The second scientific challenge is how to analyze the relationship between various climate factors and TCG. Traditionally, the most commonly used research method in climate science is time-delayed correlation analysis. That may not be appropriate, as there has been strong argument in philosophy against using correlation analysis for relation identification, because, for example, correlation lacks the needed asymmetry or directedness between dynamical events (Liang 2015). Therefore, how to extract the causal relations in climate-cyclone interactions is an important problem in atmospheric science.
For the first problem, recent studies have suggested that Atlantic Ocean variability, in both SST and the atmosphere, can influence ENSO variability and its predictability through the atmospheric circulation response (Jansen et al. 2010; Frauen and Dommenget 2012; Ding et al. 2012). Ham et al. (2013) investigated a subtropical teleconnection between the north tropical Atlantic (NTA) SST and Pacific atmospheric anomalies, which operates by modulating the tropical Pacific atmospheric circulation and SST. Thus, it is possible that Atlantic Ocean variability modulates the climate variability of the Pacific or, specifically, of the Australian region.
On the other hand, causality analysis has been developed as a statistical problem by many researchers, such as Granger (1969). Realizing that causality is in fact a real physical notion, Liang (2008, 2016) found that it can be rigorously formulated from first principles, with analytical results explicitly obtained. In Liang's formalism, causality in both linear and highly nonlinear systems can be measured by information flow (IF). Liang (2015) then normalized the obtained IF in order to assess the relative importance of an identified causality in bivariate time series causal inference. For causality among a multitude of variables, Liang (2021) proposed a rigorous but easy-to-use algorithm and obtained impressive results in several extreme situations (see Liang 2021 for details).
In this study, we extend the investigation to the whole globe to determine the importance of different climate factors in influencing TC formation in the Australian region, using the recently developed rigorous and quantitative causality analysis tool (i.e., IF). Interestingly, besides reconfirming the existing relations, we find that Atlantic anomalies can affect the Australian region TCG remotely, and that this teleconnection from the Atlantic supplies new predictors that can significantly improve the seasonal forecast of TCG in the Australian region. In what follows, we first introduce the new tool and theory, and then show the analysis results. Possible physical mechanisms are suggested for the teleconnection between TCG in the Australian region and the Atlantic variability. From the analysis, three causal factors are examined in particular: the Atlantic meridional mode (AMM), the Atlantic multidecadal oscillation (AMO), and the north tropical Atlantic (NTA) sea surface temperature (SST). Finally, these factors are taken as predictors to build a statistical model for the prediction of the Australian region TCG.

Typhoon data
Following the definition by World Meteorological Organization (WMO), TC seasons begin on 1 July in the prior year.
Considering that the quality of the data prior to 1971 may be poor because of the lack of satellite coverage, we select the International Best Track Archive for Climate Stewardship (IBTrACS) version 3 for the period of 1972-2016 during the typhoon season (November-April, NDJFMA) in the Australian region (0-30° S, 100°-165° E). Only TCs with a maximum sustained wind speed of 17.2 m s⁻¹ or above and a lifetime of 48 h or more are considered, in order to minimize the uncertainty in identifying tropical depressions (Liu and Chan 2008) and to address the artificial trend in short-duration storms (Landsea et al. 2010).
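The selection criteria above can be sketched in a few lines of Python. This is only an illustration: the `Track` record is a hypothetical minimal structure (not the IBTrACS schema), genesis is taken at the first fix, and labeling each NDJFMA season by the year in which its November falls is an assumption made here for concreteness.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Track:
    """Hypothetical minimal best-track record (not the IBTrACS schema)."""
    genesis_time: datetime  # time of the first fix
    genesis_lat: float      # degrees north (negative = south)
    genesis_lon: float      # degrees east
    max_wind_ms: float      # lifetime maximum sustained wind, m/s
    lifetime_h: float       # hours between first and last fix


def in_region(t):
    """Australian region as defined in the text: 0-30 S, 100-165 E."""
    return -30.0 <= t.genesis_lat <= 0.0 and 100.0 <= t.genesis_lon <= 165.0


def season_year(t):
    """Season label (assumed: year of November); None outside NDJFMA."""
    month = t.genesis_time.month
    if month >= 11:
        return t.genesis_time.year
    if month <= 4:
        return t.genesis_time.year - 1
    return None


def seasonal_counts(tracks, first=1972, last=2016):
    """Count qualifying geneses per season label."""
    counts = {}
    for t in tracks:
        year = season_year(t)
        if year is None or not (first <= year <= last):
            continue
        # Thresholds from the text: wind >= 17.2 m/s and lifetime >= 48 h.
        if in_region(t) and t.max_wind_ms >= 17.2 and t.lifetime_h >= 48.0:
            counts[year] = counts.get(year, 0) + 1
    return counts
```

A storm forming in February 2000 is therefore counted in the season labeled 1999 under this convention; a different labeling would only shift the keys of the returned dictionary.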

Atmospheric and oceanic index data
Environmental variables including vorticity and vertical velocity are extracted from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis dataset (Kalnay et al. 1996). The extended reconstructed NTA (0-25° N, 90° W-15° E) SST and Outgoing Longwave Radiation (OLR) data are taken from the National Oceanic and Atmospheric Administration (NOAA). The ENSO index is represented by the standardized Niño-3.4 SST (5° S-5° N, 170° W-120° W) time series; the AMO and AMM indices (defined by Vimont and Kossin 2007) are obtained directly from the ESRL database of NCEP/NCAR data.

Methodology
In meteorology and oceanography, time-delayed correlation analysis is still the most commonly used tool for identifying causality. That practice has been criticized ever since Bishop Berkeley's criticism in 1710, and it has indeed failed in tests with both purposely designed dynamical systems (e.g., a simple Langevin system) and real-world problems (e.g., the influence of the Indian Ocean on ENSO; Liang 2014). Causality and IF have been equated in the logical sense: the causality between two variables is quantitatively expressed by the time rate of information flowing from one to the other (Liang 2016).
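According to the original form of IF, consider a d-dimensional random system:
$$d\mathbf{X} = \mathbf{F}(\mathbf{X}, t)\,dt + \mathbf{B}(\mathbf{X}, t)\,d\mathbf{W} \tag{1}$$
where $\mathbf{F}$ is the drift coefficient vector; $t$ is the parameter (time); $\mathbf{B} = (b_{ij})$ is the diffusion coefficient matrix; and $\mathbf{W}$ is the standard Wiener process vector. If the system is two-dimensional, the IF from variable $X_2$ to $X_1$ is (Liang 2008):
$$T_{2\to1} = -E\left[\frac{1}{\rho_1}\frac{\partial (F_1\rho_1)}{\partial x_1}\right] + \frac{1}{2}E\left[\frac{1}{\rho_1}\frac{\partial^2 \left((b_{11}^2 + b_{12}^2)\rho_1\right)}{\partial x_1^2}\right] \tag{2}$$
where $\rho_1$ is the marginal probability density of $X_1$ and $E$ denotes the mathematical expectation. Based on this, Liang (2014) performed a maximum likelihood estimation of Eq. (2) for causal analysis. Given two time series $X_1$ and $X_2$, the maximum likelihood estimator of the rate of the IF from $X_2$ to $X_1$ is:
$$T_{2\to1} = \frac{C_{11}C_{12}C_{2,d1} - C_{12}^2C_{1,d1}}{C_{11}^2C_{22} - C_{11}C_{12}^2} \tag{3}$$
where $C_{ij}$ denotes the covariance between $X_i$ and $X_j$, and $C_{i,dj}$ is determined as follows. Let $\dot{X}_j$ be the finite-difference approximation of $dX_j/dt$ using the Euler forward scheme:
$$\dot{X}_{j,n} = \frac{X_{j,n+k} - X_{j,n}}{k\Delta t} \tag{4}$$
with $k = 1$ or $2$ (see Liang 2014 for how to determine $k$) and $\Delta t$ being the time step. $C_{i,dj}$ in Eq. (3) is the covariance between $X_i$ and $\dot{X}_j$. Ideally, if $T_{2\to1} = 0$, then $X_2$ does not cause $X_1$; otherwise it is causal.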
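In order to quantify the relative importance of a detected causality, Liang (2015) developed an approach to normalizing the IF (denoted as NIF L(2015)):
$$\tau_{2\to1} = \frac{T_{2\to1}}{Z_{2\to1}} \tag{5}$$
with the normalizer
$$Z_{2\to1} = \left|T_{2\to1}\right| + \left|\frac{dH_1^{*}}{dt}\right| + \left|\frac{dH_1^{noise}}{dt}\right|$$
where $H_1$ is the marginal entropy of $X_1$, $dH_1^{*}/dt$ is the contribution from $X_1$ itself, and $dH_1^{noise}/dt$ represents the random effect (Liang 2015). The magnitude of Eq. (5) lies in [0, 1] and measures the importance of the IF transmitted from $X_2$ to $X_1$ relative to the other processes. The larger the value, the more significant the causal relationship from $X_2$ to $X_1$.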
For multivariate time series causality analysis, Liang (2021) proposed a generalization of the IF-based bivariate causal inference to the multivariate case. The corresponding computing method (denoted as NIF L(2021)) is given in Algorithm 1 of Liang (2021).
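As an illustration of the bivariate estimator of Liang (2014) described in this section, the following NumPy sketch computes the information flow rate from one series to another using sample covariances and an Euler forward difference. The function names and the synthetic test system are assumptions introduced here for illustration; the sketch covers neither the normalization of Liang (2015), nor the multivariate algorithm of Liang (2021), nor the significance test.

```python
import numpy as np


def information_flow(x1, x2, k=1, dt=1.0):
    """Estimate the information flow rate T_{2->1} from x2 to x1.

    Uses T = (C11*C12*C2d1 - C12**2*C1d1) / (C11**2*C22 - C11*C12**2),
    where Cij are sample covariances and d1 denotes the Euler forward
    difference of x1 with step k.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    dx1 = (x1[k:] - x1[:-k]) / (k * dt)  # finite-difference derivative
    a, b = x1[:-k], x2[:-k]              # align series with dx1
    c = np.cov(np.vstack([a, b, dx1]))   # 3 x 3 covariance matrix
    c11, c12, c22 = c[0, 0], c[0, 1], c[1, 1]
    c1d1, c2d1 = c[0, 2], c[1, 2]
    return (c11 * c12 * c2d1 - c12**2 * c1d1) / (c11**2 * c22 - c11 * c12**2)


def simulate_driven_pair(n, seed=0):
    """Hypothetical test system: x2 drives x1, but x1 does not drive x2."""
    rng = np.random.default_rng(seed)
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    for i in range(n - 1):
        x1[i + 1] = 0.5 * x1[i] + 0.5 * x2[i] + rng.normal()
        x2[i + 1] = 0.6 * x2[i] + rng.normal()
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate_driven_pair(20000)
    print("T(2->1):", information_flow(x1, x2))  # clearly nonzero
    print("T(1->2):", information_flow(x2, x1))  # close to zero
```

For the one-way coupled test system, the estimate from the driver to the driven series is clearly nonzero while the reverse estimate is close to zero, which is the asymmetry that correlation alone cannot provide.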

Correlation and causality analysis
Based on a time-delayed correlation analysis, we select AMM, AMO and NTA (0-25° N, 90° W-15° E) SST for August to October (ASO) for analysis. Figure 1a-c presents the unfiltered time series of their anomalies during ASO, overlaid with the Australian region TCG frequency during the typhoon season (NDJFMA), respectively. A relatively strong inverse relation is evident: TC frequency tends to increase during negative phases of AMM or AMO and in cold NTA SST anomaly years, and vice versa. The correlation coefficients are −0.26, −0.32 and −0.41, respectively (the latter two significant at the 95% level). The time series also present significant low-frequency variability, such as a multidecadal signal. We compute the causal relations between the TC frequency and the above three indices, together with an examination of the statistical significance of the resulting IF. Table 1 provides the quantitative details of both the correlation and the causality analyses. Note that the direction of a causality in the table is from its column index to its row index. As mentioned before, the causality values have been normalized so that their maximum magnitude is 1; the tabulated values are expressed in percent. Specifically, the NTA SST yields the most significant correlation coefficient (R), NIF L(2015) and NIF L(2021) statistics, of −41.818, 11.309 and 13.394, respectively. Second to NTA SST is the AMO. The AMM has the smallest R, NIF L(2015) and NIF L(2021), but these are still significant at the 95% confidence level except for R. Therefore, it may be safe to say that there is indeed an inverse relation between AMM and AMO anomalies in ASO and TCG during the typhoon season in the Australian region. However, we are still not sure about the existence of a direct connection between the NTA SST anomaly and the Australian TCG, as the former may be partly affected by multidecadal variations and ENSO.
To clarify, we first perform a wavelet analysis of the NTA SST index to see whether the relationship between the Australian region TCG and NTA SST resides on multidecadal scales. As shown in Fig. 2, apart from decadal scales, the interannual and interdecadal variations in the North Tropical Atlantic are negligible. The detrended and low-pass filtered NTA SST time series during 1972-2016 are then employed to perform the causality and correlation analyses (Table 1). The results reveal that the modified NTA SST index still has a significant relationship with the TCG frequency over the Australian basin.
Besides, considering that NTA SST has a robust lagged correlation with ENSO variability (Alexander and Scott 2002; Chiang and Sobel 2002), we remove the lagged effect of ENSO by linearly regressing the NTA SST in ASO on the Niño-3.4 SST of the previous season (May-July) and retaining the residual. The causality and correlation analyses are tabulated in Table 1; they confirm that the relation between TCG and the NTA SST with the lag effect removed is still significant at the 95% confidence level. These significant correlations agree well with those of Ramsay et al. (2008).
Overall, these results demonstrate that there is a statistically significant relationship between the NTA SST anomalies and the TCG in the Australian region. In the following, only the NTA SST time series with the lagged effect of ENSO removed are considered.
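The removal of the lagged ENSO effect by linear regression, and the significance check on a correlation coefficient, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are introduced here, the series in the usage example are synthetic, and the critical value of Student's t is supplied by the caller rather than computed.

```python
import numpy as np


def remove_linear_effect(y, x):
    """Residual of y after least-squares regression on x (with intercept)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


def pearson_r(a, b):
    """Pearson correlation coefficient between two series."""
    return float(np.corrcoef(a, b)[0, 1])


def critical_r(n, t_crit):
    """Smallest |r| that is significant for n samples, given the two-sided
    critical value t_crit of Student's t with n - 2 degrees of freedom."""
    return float(t_crit / np.sqrt(n - 2 + t_crit**2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nino_mjj = rng.normal(size=45)                  # synthetic MJJ Nino-3.4
    nta_aso = 0.6 * nino_mjj + rng.normal(size=45)  # synthetic ASO NTA SST
    nta_no_enso = remove_linear_effect(nta_aso, nino_mjj)
    print(pearson_r(nta_no_enso, nino_mjj))         # essentially zero
    print(critical_r(45, 2.017))                    # threshold for |r|
```

With 45 seasons (43 degrees of freedom) and a critical t value of about 2.02, the threshold is roughly |r| = 0.29, ignoring any reduction of the effective sample size by autocorrelation.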
After identifying the important climate-cyclone interactions in the Australian region, we explore whether there are distinct differences in the TC genesis location between positive and negative phases of AMM and AMO anomalies, and between warm and cool NTA SST anomaly years. Figure 3 presents the results for the AMM phases. Indeed, there is a significant difference in the number of TCs in this region, especially between 115° and 155° E, between the positive and negative phases. Specifically, the frequencies of TCG are lower in the positive phases than in the negative phases, and local minima exist in the northern subregion from 125° to 142.5° E (Dare and Davidson 2014). This suggests that the AMM signal is strong enough to affect the TCG in the Australian region. In contrast, for AMO or NTA SST, there seem to be no particular differences in the genesis location of the Australian TCs between phases, as shown in Figures S1a-S1b.
Table 1 Correlation coefficient (R), NIF L(2015) and NIF L(2021) between the TCGF over the Australian basin during typhoon season (NDJFMA) and the AMM, AMO and NTA SST indices for ASO, respectively. Raw refers to the scenario where the causality and correlation analyses are performed using unfiltered time series. In dtrend, the linear trend is removed from both time series. In noENSO, the analysis is repeated after removing the lagged ENSO effect only from the Atlantic signals.

Dynamical interpretation
Having demonstrated the teleconnection between the Atlantic variability and the TCG frequency in the Australian region, we now turn to explaining this remote effect dynamically and/or thermodynamically. We first perform a linear regression analysis of the AMM, AMO and NTA SST anomalies with SST and surface wind anomalies over the Pacific and eastern South Indian Ocean at various lags. The results are consistent with those of Ham et al. (2013) in the corresponding areas (please refer to Figures S2-S4).
In agreement with Ham et al. (2013), through a pair of low-level circulation responses, the variations of the Atlantic (i.e., AMM, AMO and NTA SST anomalies) enhance not only the convective activity over the Atlantic ITCZ but also that over the ITCZ in the vicinity of the Australian region, where about 85% of cyclones form (McBride and Keenan 2010). This remote connection linking the Atlantic to the Pacific during the typhoon season is also consistent with the Gill-type Rossby-wave response over the subtropical eastern Pacific, which produces an atmospheric flow extending to the west flank near the Australian region (Ham et al. 2013). From Figures S3-S5, it is then possible that the significant SST and circulation anomalies are triggered by the Atlantic signals and are maintained throughout the typhoon season over the Australian basin. Moreover, to see whether these large-scale anomalies exert an influence on the Australian TCG, we regress the environmental fields that are favorable for TC formation (i.e., OLR, vorticity, and vertical velocity) onto the AMM, AMO and NTA SST, respectively, following Huo et al. (2015). Since the three indices show almost the same relationship with the selected dynamic/thermodynamic fields (please refer to Figures S6-S7), here we only present the results for the AMO. Figure 4 presents the regressed fields of OLR, 850 hPa relative vorticity, and 500 hPa vertical velocity during the typhoon season in the Australian region onto the AMO in ASO.
Fig. 4 Lagged regressed fields of (a) OLR, (b) 850 hPa relative vorticity, and (c) 500 hPa vertical velocity during NDJFMA with respect to the interannual ASO AMO index. Regression coefficients exceeding the 95% confidence level are stippled.
All the factors shown in Fig. 4 have a sign that is consistent with the conclusion drawn from the previous causality and correlation analyses. For example, Fig. 4a shows that the subregion from 120° E to 135° E of the Australian basin is covered by negative OLR anomalies. Interestingly, the center of the OLR anomalies is located to the northwest of Australia, coincident with the climatological maximum in TC genesis, which is consistent with the simulated results of Walsh and Ryan (2000). Additionally, given that low-level cyclonic vorticity affects the formation and intensity of TCs over the Australian basin (Kuleshov et al. 2014), Fig. 4b does reveal a significantly enhanced (reduced) cyclonic vorticity during negative (positive) phases of AMO. It is also clearly seen from Fig. 4c that the vertical velocity favors increased (decreased) TC genesis during negative (positive) phases of AMO. In conclusion, there is indeed a strong physical basis for the inverse relation between AMO anomalies and TCG frequency in the Australian region, and the same holds for AMM and NTA SST.
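The regression of a gridded field onto a climate index, as used for the maps discussed above, can be sketched with NumPy as follows. The function name and the array layout (time, latitude, longitude) are assumptions made for illustration; the returned correlation map can be compared against a critical |r| to decide which grid points to stipple.

```python
import numpy as np


def regress_field(index, field):
    """Regress each grid point of `field` (time, lat, lon) onto `index` (time,).

    Returns the least-squares slope map (field units per index unit) and
    the pointwise correlation map.
    """
    index = np.asarray(index, dtype=float)
    field = np.asarray(field, dtype=float)
    ia = index - index.mean()           # index anomalies
    fa = field - field.mean(axis=0)     # field anomalies at each grid point
    cov = np.tensordot(ia, fa, axes=(0, 0))  # sum over time -> (lat, lon)
    slope = cov / np.sum(ia**2)
    corr = cov / (np.sqrt(np.sum(ia**2)) * np.sqrt(np.sum(fa**2, axis=0)))
    return slope, corr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    amo_aso = rng.normal(size=45)               # synthetic ASO index
    olr = rng.normal(size=(45, 4, 6))           # synthetic NDJFMA field
    slope, corr = regress_field(amo_aso, olr)
    print(slope.shape, corr.shape)
```

A lagged regression is obtained simply by pairing each ASO index value with the field averaged over the following NDJFMA season before calling the function.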

Prediction and discussion
It is clear now that there is a robust physical mechanism linking the Australian TC activity to the Atlantic Ocean variability. These results suggest that the indices relating to the Atlantic Ocean may be good predictors for the seasonal forecast of the annual number of TCs occurring in the Australian region. In this subsection, we attempt to investigate if the selected influences can help improve the prediction of the TCG frequency in the area.
Poisson regression (PR) is commonly utilized in climate science to predict TCG frequency (Boudreault et al. 2017; Caron et al. 2015; Kozar et al. 2012; Nicholls 1992; Magee et al. 2021). A new PR model armed with the Atlantic signals is therefore built to perform a seasonal TC forecast for the period of 1972-2016:
$$\log \mu_i = \beta_0 + \sum_{j} \beta_j x_{ij}$$
where $\mu_i$ denotes the expected TCG frequency (TCGF) with covariate values $x_{ij}$ for the $j$th predictor on the $i$th observation, and $\beta_0$ and $\beta_j$ are the regression coefficients. In order to achieve acceptable prediction performance, a step-by-step predictor selection is used to choose optimal predictor combinations (see Table 2). Considering ENSO as a dominant influence on TC activity in the Australian region, a predictor combining the Niño-3.4 SST index and previous observations of TCGF, denoted as ENSO&TCGF for simplicity, is included as a baseline for comparison. First, we employ the leave-one-out cross-validation technique to calculate the root-mean-squared error (RMSE) and correlation coefficient (R) to evaluate the model performance. It can be seen from the first line of Table 2 that the ENSO&TCGF-only model results in an RMSE and R of 3.172 and 0.631, respectively. With ENSO&TCGF as the base, the two-predictor combinations ENSO&TCGF + AMM, ENSO&TCGF + AMO, and ENSO&TCGF + NTA SST improve on the single-predictor forecast by 26.06%, 24.11% and 22.70%, respectively, in terms of R increase, and by about 31.09%, 30.03%, and 29.29% in terms of RMSE reduction. In other words, the addition of the Atlantic signals indeed improves the seasonal forecast model. The four-predictor model ENSO&TCGF + AMM + AMO + NTA SST gives the best RMSE and R statistics, which are 2.857 and 0.715, respectively (Table 2 and Fig. 5a).
This is quite a good result, since it increases R by at least 19.20% compared with previous models, whose correlations between TCG frequency and cross-validated hindcasts range from R = 0.44 to R = 0.60 (please refer to McDonnell and Holbrook 2004; Nicholls 1992; Solow and Nicholls 1990), and decreases RMSE by at least 45.06% (previously ranging from RMSE = 5.20 to RMSE = 6.20; please refer to Werner and Holbrook 2011).
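The relative improvements quoted above follow from simple percentage arithmetic against the best previously reported values (R = 0.60 and RMSE = 5.20). The short sketch below reproduces them from the rounded statistics given in the text; the helper name is introduced here for illustration, and small differences from the quoted figures are expected because the inputs are rounded.

```python
def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old


# Four-predictor model versus the best previously reported statistics.
r_gain = relative_change(0.715, 0.60)       # about a 19% increase in R
rmse_cut = -relative_change(2.857, 5.20)    # about a 45% reduction in RMSE

if __name__ == "__main__":
    print(f"R increase: {r_gain:.2f}%")
    print(f"RMSE reduction: {rmse_cut:.2f}%")
```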
Second, to further investigate the robustness of the optimal predictor set, and motivated by Werner and Holbrook (2011), the four-predictor model is trained on the observations of TCG during the period of 1972-2006 to hindcast the left-out 10 years. The results are shown in Fig. 5b. In general, the model gives a result highly correlated with the observed data, although some discrepancies still exist (e.g., from 2011 to 2012). Compared with the forecast of Werner and Holbrook (2011), the RMSE is reduced by approximately 33.31%, a substantial improvement.
To a large extent, it may be safe to say that Atlantic signals can act as robust predictors for the TCG frequency in the Australian region.
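The Poisson regression and leave-one-out cross-validation used in this section can be sketched as below. This is a minimal illustration, not the code used in the study: the log-link model is fitted by iteratively reweighted least squares with NumPy only, the function names are introduced here, the step-by-step predictor selection is not implemented, and the data in the usage example are synthetic.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + X @ b by iteratively reweighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])  # design matrix with intercept
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())                 # start from the mean count
    for _ in range(n_iter):
        eta = A @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                # working response
        new = np.linalg.solve(A.T @ (mu[:, None] * A), A.T @ (mu * z))
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    return beta


def predict_poisson(beta, X):
    """Expected counts for the predictor rows in X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    A = np.column_stack([np.ones(X.shape[0]), X])
    return np.exp(A @ beta)


def loocv_skill(X, y):
    """Leave-one-out cross-validated RMSE and correlation R."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i               # drop the i-th season
        beta = fit_poisson(X[keep], y[keep])
        pred[i] = predict_poisson(beta, X[i])[0]
    rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
    r = float(np.corrcoef(pred, y)[0, 1])
    return rmse, r


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(45, 2))               # synthetic predictors
    y = rng.poisson(np.exp(2.3 + 0.3 * X[:, 0] - 0.2 * X[:, 1]))
    print(loocv_skill(X, y))
```

Each candidate predictor set can be scored with `loocv_skill`, and a stepwise search would keep adding the predictor that most improves the cross-validated skill until no further gain is obtained.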

Conclusions
Although detecting the causal factors in the cyclone-climate interactions in the Australian region has been a field of extensive research during the past decades, the studies so far focused mainly on the Pacific and Indian Oceans.
In this study, we found that the Atlantic Ocean is actually of equal importance, through a teleconnection to the Australian region TCG. This influence is seen in both the linear correlation analysis and the information-theoretic approach. Specifically, we found that AMM, AMO, and NTA SST are representative of the influence, which enhances the convective activity not only over the Atlantic ITCZ but also over the ITCZ in the Australian region. They may influence the Australian region TC activity by affecting different factors that are favorable for TC formation during the typhoon season. This Atlantic-Pacific teleconnection is also consistent with the Gill-type Rossby-wave response over the subtropical eastern Pacific. Based on the above causal inference, we constructed a statistical model to forecast the TCG activity, using Poisson regression and step-by-step predictor selection and taking into account the identified influences from the Atlantic. Compared with conventional seasonal forecast models, this one shows an appreciable improvement in forecast skill. On the one hand, this substantiates the causal relations identified with the state-of-the-art causality analysis tool; on the other hand, it provides a more reliable TCG forecast model for future use.