# Late Holocene Asian summer monsoon dynamics from small but complex networks of paleoclimate data

DOI: 10.1007/s00382-012-1448-3

- Cite this article as:
- Rehfeld, K., Marwan, N., Breitenbach, S.F.M. et al. Clim Dyn (2013) 41: 3. doi:10.1007/s00382-012-1448-3


## Abstract

Internal variability of the Asian monsoon system and the relationship amongst its sub-systems, the Indian and East Asian Summer Monsoon, are not sufficiently understood to predict its responses to a future warming climate. Past environmental variability is recorded in paleoclimate proxy data. In the Asian monsoon domain many records are available, e.g. from stalagmites, tree-rings or sediment cores. They have to be interpreted in the context of each other, but visual comparison is insufficient. Heterogeneous growth rates lead to uneven temporal sampling. Therefore, computing correlation values is difficult because standard methods require coeval observation times, and sampling-dependent bias effects may occur. Climate networks are tools to extract system dynamics from observed time series, and to investigate Earth system dynamics in a spatio-temporal context. We establish paleoclimate networks to compare paleoclimate records within a spatially extended domain. Our approach is based on adapted linear and nonlinear association measures that are more efficient than interpolation-based measures in the presence of inter-sampling time variability. Based on this new method we investigate Asian Summer Monsoon dynamics for the late Holocene, focusing on the Medieval Warm Period (MWP), the Little Ice Age (LIA), and the recent period of warming in East Asia. We find a strong Indian Summer Monsoon (ISM) influence on the East Asian Summer Monsoon during the MWP. During the cold LIA, the ISM circulation was weaker and did not extend as far east. The most recent period of warming yields network results that could indicate a currently ongoing transition phase towards a stronger ISM penetration into China. We find that we could not have come to these conclusions using visual comparison of the data and conclude that paleoclimate networks have great potential to study the variability of climate subsystems in space and time.

### Keywords

- Asian summer monsoon
- Complex networks
- Irregular sampling
- Little ice age
- Medieval warm period

## 1 Introduction

The Intertropical Convergence Zone (ITCZ) plays a governing role in monsoonal circulation and variations of its mean northward extent have been linked with summer monsoon strength (Breitenbach et al. 2010; Gadgil 2003; Ma et al. 2012; Sinha et al. 2011). The defining geography (composition of landmass, mean altitude, position and extent of surrounding seas) however, is quite different for ISM and EASM. The extent to which the two sub-systems interacted in the past is a matter of current research (Cheng et al. 2012; Wang et al. 2005, 2010; Zhang et al. 2011; Zhou et al. 2011). As a third player, the mid-latitude westerlies dominate the area north and west of the (variable) monsoon boundary (Cheng et al. 2012). The relative strength of these circulation systems and thus their areas of influence, varied in the past (Herzschuh 2006; Mayewski et al. 2004; Wang et al. 2010), and our knowledge about the complex spatio-temporal processes and variability behind them is insufficient (Cook et al. 2010).

Numerous paleoclimatological studies have focused on the reconstruction of individual climatic parameters, such as moisture or precipitation (Borgaonkar et al. 2010; Managave et al. 2010; Pant et al. 1988; Ramesh et al. 2010; Singh et al. 2009; Wang et al. 2010; Yi et al. 2011; Zhang et al. 2011), temperature (Yi et al. 2011), or droughts (Borgaonkar et al. 2010; Cook et al. 2010; Sinha et al. 2011; Yi et al. 2011) by use of proxy records. Furthermore, linkages among the Asian Monsoon system and the North Atlantic realm (Gupta et al. 2003; Hong et al. 2003; Ma et al. 2012; Wang et al. 2001, 2005), El Niño/Southern Oscillation (ENSO) (Shukla et al. 2011), and solar forcing (Gupta 2005; Wang et al. 2005; Zhang et al. 2008) have been explored. However, the mechanism(s) and variability of the interactions between ISM and EASM during the Holocene (and beyond) remain far from being fully understood (Wang et al. 2005, 2010; Zhang et al. 2011). Using numerical meta-analysis and reconstructions of moisture indices, Wang et al. found an asynchronous evolution of the ISM and the EASM for the Holocene on centennial timescales (Wang et al. 2010). The paleoclimatic records used in the study of Wang et al. included only four records from India (out of a total of 92) and focused mainly on China and Tibet, with no record in the ISM domain below 27°N (Wang et al. 2010). It is important to note that the currently low number of datasets from the Indian peninsula might lead to systematic biases towards the Tibetan plateau and China, complicating or even precluding meaningful interpretation of results, a caveat that must be accounted for.

Based on ensemble runs of a coupled climate model run with anthropogenic forcing, May found an increase in monsoonal rainfall, accompanied by a decrease in the intensity of the overall lower-tropospheric large-scale circulation at a warming of 2°C relative to pre-industrial ISM conditions (May 2010). Derived from global climate modeling results and observations, an overall stagnation in precipitation but a redistribution towards extremes (prolonged dry and wet spells) was supported in Kumar et al. (2010). Decreasing reliability of rainfall and increased variability of precipitation amounts would have disastrous impacts on rain-fed agriculture all over Asia.

In the paleoclimatic context, we strive to understand whether the weakening of the large-scale circulation associated with a warming scenario, as found for the time period 2020–2200 AD in the modeling study by May (2010), is paralleled by an increased influence of the ISM on the EASM domain during the MWP (1100–700 years BP) and during the recent warm period (RWP, 1850–1980 AD), in contrast to an expected diminished influence during the LIA (100–400 years BP). Given that the Asian Summer Monsoon is, amongst other factors, differential-heating driven, and thus modulated, to some extent, by northern hemisphere temperature, we hypothesize that the eastward ISM penetration depth was higher during periods of extended northern hemisphere warmth (e.g. the MWP) than during cool periods and vice versa. We define the boundaries of LIA (MWP) in agreement with the timings given by Jones et al. (2001) and within the periods of relative cold (warmth) in the East Asian temperature reconstruction by Osborn and Briffa (2006).

On short (annual to multi-decadal) timescales, we are not aware of any study systematically investigating the interactions between both sub-systems. As we find that the understanding of any system is fundamental to comprehending its links to other systems, we aim to investigate the extent of interaction between the traditional ISM domain over continental India and the EASM domain over China. To this end we propose here the construction of paleoclimate networks, based on significant association between proxy records of past climate variability. Paleoclimate records have particularities compared to the data used in climate network studies up to now: they are heterogeneously sampled in time (1) and space (2), which, if ignored, leads to biased and possibly incorrect results. Previous climate network studies have focused on the analysis of gridded datasets, from reanalysis data (Donges et al. 2009, 2011; Gozolchiani et al. 2011; Steinhaeuser et al. 2010; Tsonis et al. 2006; Yamasaki et al. 2009) or recent observations (Ge-Li and Tsonis 2009; Malik et al. 2010, 2011) and were thus restricted to the recent, observational period. Paleoclimate records are, in contrast, spatio-temporally inhomogeneously distributed. However, due to the increasing number of (Asian monsoon) records published in the last decades (Wang et al. 2010), the spatio-temporal reconstruction of past climates becomes feasible (Cook et al. 2010; Wang et al. 2010). In contrast to previously analyzed climate networks, paleoclimate networks cannot make use of *direct* information about climate parameters (e.g. temperature) and have to rely on proxy data that are usually irregularly sampled in time and space. Generally, fewer datasets are available the further back in time the analysis is extended. Also, much less paleoclimate data is available from India than from China.

One option would be to include only datasets that span all time periods of interest and an equal number from both regions of interest (ISM and EASM domain). However, this would decrease the robustness and significance of the results. Therefore, we strive to sample all regions consistently in order to retain comparability for different time slices, and include all records in the database where they meet the temporal sampling requirements. Possible bias effects should nevertheless be kept in mind for the subsequent analysis and need to be discussed.

To improve spatial resolution and robustness of the estimates with increasing node numbers, we forsake the reconstruction of direct physical flows (which would limit us to using only precipitation or temperature reconstructions), but instead combine records of precipitation and temperature. We argue that temperature and precipitation amounts over land covary, as the moisture-carrying capacity of atmospheric flows increases with temperature. We do not claim that the two variables, especially in monsoonal and tropical climates, covary in a strictly linear sense, either positively or negatively, but that a (nonlinear) association between them probably exists. Trenberth (2005) found a negative correlation between monthly mean anomalies of boreal summer (MJJAS) surface air temperature and precipitation amount of reanalysis data (1979–2002) over much of India and China and stated that “neither precipitation nor temperature should be interpreted without considering the strong co-variability that exists”. Therefore, until a higher density of records for individual climate parameters is established, we believe it is justified to use both to reconstruct the flow of dynamical *information*, measured by the extent of linkages (significant associations) between the time series of individual nodes. Combining different archives increases the robustness of the analysis against individual archive-specific biases, e.g., trees might provide information where stalagmites cannot or vice versa. In contrast to other analysis methods, every node retains its individuality in the network, and its role in the final result, the network, can be assessed either visually (e.g. in force-weighted network representations) or quantitatively (by computing network statistics). Furthermore, should incompatibility be suspected, node removal is straightforward and does not require re-computation of the whole network.

Using published paleoclimate records from the ASM domain, we analyze late Holocene Asian monsoon dynamics during the MWP, the LIA, and the recent warm period (RWP, here: 1850–1980 AD). We review literature and methodology of complex (climate) networks in Sect. 2.1. In Sect. 2.2 we then set out to document paleoclimate network construction and introduce linear and nonlinear similarity measures adapted to paleoclimate data. We describe the ASM paleoclimate data in Sect. 3 and the results we obtain from the paleoclimate networks in Sect. 4. In Sect. 5 our results regarding the Asian monsoon synchronization for the past millennium are compared to previously published findings and we discuss the robustness and advantages of the paleoclimate network approach compared to the usually employed visual comparison.

## 2 Methods

We propose a new, complementary tool for the reconstruction and investigation of spatio-temporal dynamics of climate systems in the past: *Palaeoclimate networks*. The approach is inspired by climate networks which are a relatively new, but a powerful and increasingly popular tool to reconstruct Earth system dynamics. In the following we first describe climate networks and subsequently develop the paleoclimate network approach.

### 2.1 Climate networks

Climate networks are a relatively new tool to explore spatio-temporal variability of climate parameters and assess dynamical information flow between spatially distant regions (Donges et al. 2009, 2011; Malik et al. 2010) and the stability of the climate system and its teleconnections (Gozolchiani et al. 2011; Steinhaeuser et al. 2010; Tsonis and Swanson 2008; Yamasaki et al. 2009). They are inspired by complex network theory. Complex networks, from social networks through gene networks to citation networks, consist of two main components: *nodes*, or vertices, and *links*, also called edges. The nodes might represent actors, genes, or authors of scientific papers. The links can be drawn from co-starring in the same movie, sequential expression of genes, or co-authorships.

Climate networks are based on observations of climate dynamics (time series) at certain points, the nodes. Computed from these time series, pairwise similarity calculations [linear correlation or nonlinear interrelations, like mutual information (MI) (Donges et al. 2009) or recurrence-based measures (Feldhoff et al., submitted)] yield a correlation matrix with entries for each pair of nodes. This matrix is then thresholded using either a fixed value for the correlation or a prescribed *link density*. The resultant *adjacency matrix* **A** is a sparse binary matrix with the (*i*, *j*)th entry being non-zero if and only if the time series representing nodes *i* and *j* are significantly associated. Network statistics can subsequently be employed to assess overall characteristics of the network such as the degree distribution (e.g., how many links the individual nodes have) or more abstract measures such as betweenness, where *information flow* through the network is quantified.
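The thresholding step described above can be illustrated with a minimal Python sketch. It is not taken from the cited studies: the function name `adjacency_from_similarity`, the toy matrix and the tie handling are assumptions made for illustration only.

```python
import numpy as np

def adjacency_from_similarity(sim, threshold=None, link_density=None):
    """Threshold a symmetric similarity matrix into a binary adjacency matrix.

    Either a fixed `threshold` on the absolute similarity or a prescribed
    `link_density` (fraction of node pairs kept as links) must be given.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    iu = np.triu_indices(n, k=1)           # visit every node pair once
    strength = np.abs(sim[iu])
    if threshold is None:
        # keep the strongest fraction of pairs (ties may add a few extra links)
        n_links = int(round(link_density * strength.size))
        if n_links == 0:
            return np.zeros((n, n), dtype=int)
        threshold = np.sort(strength)[-n_links]
    adj = np.zeros((n, n), dtype=int)
    keep = strength >= threshold
    adj[iu[0][keep], iu[1][keep]] = 1
    return adj + adj.T                      # symmetric, zero diagonal

# Toy correlation matrix for four nodes (made-up numbers)
sim = np.array([[ 1.0, 0.8, 0.1, -0.6],
                [ 0.8, 1.0, 0.2,  0.3],
                [ 0.1, 0.2, 1.0,  0.05],
                [-0.6, 0.3, 0.05, 1.0]])
A = adjacency_from_similarity(sim, threshold=0.5)
degree = A.sum(axis=1)                      # number of links per node
```

With a threshold of 0.5 only the pairs (0, 1) and (0, 3) are linked, so node 0 has degree 2 and node 2 is isolated.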

### 2.2 Palaeoclimate networks

#### 2.2.1 Difference to recent climate networks

A major difference between modern observational or reanalysis data and proxy data is the heterogeneous sampling of the paleoclimate records. Whereas modern observations are available at regular (hourly, daily, or monthly) intervals, many paleoclimate proxies (e.g. from stalagmites or ice cores) are reconstructed with sampling intervals that vary intrinsically from sub-annual to centennial resolution. By nature, annually laminated sediments or tree ring chronologies should not suffer from these complications. However, missing data can occur in them as well, and it was recently reported that tree-ring based temperature reconstructions might be biased, as trees might be missing rings in exceptionally cold years after volcanic eruptions (Mann et al. 2012). With careful cross-dating, such flaws could be identified and corrected for in the final chronology. The final dataset would then, again, be irregular in time.

As they are reconstructed from natural archives with varying sedimentation rates, paleoclimate time series are generally unevenly sampled. They can contain hiatuses and might have poor chronological control. These features require special measures for similarity assessment, as physically meaningful signal reconstruction is often not feasible, and standard interpolation methods introduce strong bias effects (Babu and Stoica 2010; Rehfeld et al. 2011; Schulz and Stattegger 1997; Stoica and Sandgren 2006). We have recently shown that, using a Gaussian kernel-based correlation estimator, Pearson correlation can be estimated more efficiently than with interpolation (Rehfeld et al. 2011). Here, we additionally put forward an algorithm to estimate MI, a nonlinear dependence measure, for unevenly sampled data. In Sect. 2.2.2 we review these similarity measures and show that our MI estimation algorithm compares favorably to an approach using standard linear interpolation techniques. All records in one network are required to have recorded climate variability at comparable temporal resolution. For periods of interest in the range of a few centuries, annual to multi-annual resolution is required to meet the numerical demands of the estimators. Not all records, however, will cover the whole period of interest, and some will display large gaps. While our methodology is able to cope with such complications, individual significance tests for each pair of nodes, mimicking their temporal coverage, have to be conducted. This is in contrast to standard climate network construction, where usually a link density selecting, e.g., the 5 % strongest associations as links is used (Donges et al. 2009; Malik et al. 2010, 2011).

#### 2.2.2 Similarity measures for irregularly sampled time series

Linear dependence, or similarity in linear properties, between two time series (i.e. the dynamical processes behind them) is often estimated employing the cross correlation function (XCF) (Chatfield 2004; Rehfeld et al. 2011). The association between observations might, however, also be nonlinear and not follow a specific functional form, which cannot be captured by linear correlation. Bivariate (cross) mutual information as a measure of dependence additionally captures nonlinear associations (Dionisio et al. 2004), which is why we will use it along with correlation as similarity functions \(S_i(m\Delta t^{xy})\), with the index *i* indicating which measure was calculated and *m* representing a lag time step of width \(\Delta t^{xy}\). We use a lag vector resolution of \(\Delta t^{xy}=\max(\Delta t^x, \Delta t^y)\), choosing the larger of the average sampling intervals \(\Delta t^x\) and \(\Delta t^y\) of the two time series. The scales of variation of MI and XCF are different, but we do not employ the absolute values in the network analysis. We determine the significance of the numerical estimates with respect to critical values from surrogate data and subsequently convert to a binary scale (0 for no, 1 for significant association) that we can intercompare. Standard methods require regular observation intervals and therefore signal reconstruction on an evenly sampled grid. However, when the original irregularity is overcome by conventional interpolation methods, it causes a positive spectral bias towards low frequencies, and consequently high-frequency variability is underestimated (Babu and Stoica 2010; Rehfeld et al. 2011; Schulz and Stattegger 1997; Stoica and Sandgren 2006). Gap-filling and meaningful signal reconstruction are non-trivial, as, physically, the surrounding climate processes during archive growth (e.g. with sufficient moisture availability) and impeded growth (e.g. in a drought period) are potentially very different, and inferring potential observations of one from observations of the other is probably very error-prone. A negative coupling strength bias has been found for the pairwise correlation estimate of irregular time series, and linear Pearson correlation can be estimated more efficiently employing a Gaussian kernel-based, adapted, correlation estimator (Rehfeld et al. 2011).

*Gaussian kernel-based Pearson correlation* The main idea of Pearson correlation is to take a mean over concurrently observed and standardized products of observations from time series of stationary stochastic processes. Concurrency of observations is rare for unevenly sampled time series and would need to be forced via signal reconstruction to allow the application of standard methods. The key idea of the Gaussian kernel-based estimator is to calculate a weighted mean over products of standardized observations, avoiding signal alteration. The Gaussian weights rate, e.g., a product of observations that are (almost) concurrent higher than a product of observations that are far apart. The resultant estimator was tested on synthetic and real datasets and shown to be more efficient for irregular time series than other techniques (e.g. linear interpolation, inversion of the Lomb-Scargle periodogram) (Rehfeld et al. 2011).
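The weighted-mean idea can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the reference implementation of Rehfeld et al. (2011); the function name and the default bandwidth (a quarter of the larger mean sampling interval) are assumptions.

```python
import numpy as np

def gaussian_kernel_correlation(tx, x, ty, y, lag=0.0, bandwidth=None):
    """Kernel-weighted Pearson correlation for irregularly sampled series.

    Every product x_i * y_j of standardized observations is weighted by a
    Gaussian of the time difference t_j^y - t_i^x - lag, so that (almost)
    concurrent pairs dominate the weighted mean.
    """
    tx, x, ty, y = (np.asarray(a, dtype=float) for a in (tx, x, ty, y))
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    if bandwidth is None:
        # assumption: kernel width tied to the coarser mean sampling interval
        dt = max(np.diff(tx).mean(), np.diff(ty).mean())
        bandwidth = dt / 4.0
    d = ty[None, :] - tx[:, None] - lag     # all pairwise time differences
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    return float((w * np.outer(x, y)).sum() / w.sum())

# Two irregular samplings of the same slow oscillation (synthetic example)
rng = np.random.default_rng(42)
tx = np.cumsum(rng.gamma(2.0, 0.5, size=200))
ty = np.cumsum(rng.gamma(2.0, 0.5, size=200))
r = gaussian_kernel_correlation(tx, np.sin(2 * np.pi * tx / 50),
                                ty, np.sin(2 * np.pi * ty / 50))
```

Because neither series is interpolated, the estimate uses only the original observations; note that, unlike the standard Pearson coefficient, a weighted estimate of this kind is not strictly bounded by ±1.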

*Mutual information for irregularly sampled time series* Mutual information *MI*(*X*, *Y*) is a measure of the dependence (linear or nonlinear) between two random variables, *X* and *Y*. This measure from information theory can be interpreted as the uncertainty reduction in variable *X*, given that we observed *Y*. It is symmetric, i.e. relationships of opposite sign but the same association strength give the same MI. The measure yields a null result if, and only if, the two random variables, in our case time series of observations, are independent (Kraskov et al. 2004).

In its standard form it reads

\[ MI(X,Y) = \iint p_{x,y}(x,y)\,\log\frac{p_{x,y}(x,y)}{p_x(x)\,p_y(y)}\,dx\,dy, \]

where \(p_{x,y}\) is the two-dimensional joint probability density function of the variables *X* and *Y*, and \(p_x\) resp. \(p_y\) are the one-dimensional probability distributions of *X* resp. *Y*. Different estimators are applied to estimate mutual information, starting from the joint probability distribution, itself estimated from an *x*−*y* scatterplot. In case of irregular sampling, however, the bivariate observations \((X_t, Y_t)\) at regular observation points *t* required for a scatterplot are not readily available. We therefore perform a local reconstruction of the signal, estimating for each point *i*, \(\{t_i^x, x_i\}\), a local signal reconstruction by calculating a weighted mean of the signal \(\{t_j^y, y_j\}\), centering the weight around \(t_i^x\). If there are no or too few observations \(y_j\) available around \(t_i^x\), this reconstruction is not performed. From this we get a new, bivariate set of observations \(\{t_i^x, x_i, y_i^{rec}\}\). We then repeat the procedure by stepping through \(t_j^y\), which yields \(\{t_j^y, x_j^{rec}, y_j\}\). From these sets of observations we can estimate the joint density of *X* and *Y* using standard estimators for MI.

We have compared the performance of MI estimation for standard linear interpolation and our reconstruction scheme at varying sampling irregularities. We followed the sampling sensitivity analysis described in Rehfeld et al. (2011). We generated AR1 processes at very high time resolution and then re-sampled the observations onto the irregular observation times. The driving process is given by […]. For a lag *l*, the expected value is \(MI(X(t), Y(t+l)) = -0.5\log(1-\rho_{xy}^2(l))\), where \(\rho_{xy}(l) = \alpha = 0.8\), as the processes follow a bivariate normal distribution (Nazareth et al. 2007). We can then set out to estimate \(MI(X(t), Y(t+l))\) from the simulated time series and, comparing the result to the expected value, calculate the Root Mean Square Error (RMSE) of the estimators. We show the results in Fig. 2. With increasing sampling irregularity (i.e. larger gaps) the RMSE of the linear interpolation routine increases systematically. This effect is also visible for the Gaussian-kernel based signal reconstruction, but it is much milder. We therefore conclude that estimating MI using local Gaussian kernel reconstruction is *more efficient* than using standard interpolation.
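The local reconstruction scheme can be sketched as follows. This is a hedged illustration rather than the authors' code: the minimum-weight cut-off `min_weight`, the histogram-based plug-in MI estimator with 8 bins, and all function names are assumptions. The last line evaluates the analytical MI for a correlation of 0.8, which is about 0.51 nats.

```python
import numpy as np

def kernel_reconstruct(t_target, t_src, v_src, bandwidth, min_weight=0.5):
    """Gaussian-weighted mean of v_src around each target time.

    Returns NaN where the summed weight is below `min_weight`, a
    hypothetical stand-in for "no or too few observations nearby".
    """
    d = np.asarray(t_target, float)[:, None] - np.asarray(t_src, float)[None, :]
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    wsum = w.sum(axis=1)
    rec = np.full(wsum.shape, np.nan)
    ok = wsum >= min_weight
    rec[ok] = (w[ok] @ np.asarray(v_src, float)) / wsum[ok]
    return rec

def binned_mutual_information(a, b, bins=8):
    """Plug-in MI estimate (in nats) from a two-dimensional histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

def kernel_mi(tx, x, ty, y, bandwidth):
    """MI from the pooled pairs {x_i, y_i^rec} and {x_j^rec, y_j}."""
    y_rec = kernel_reconstruct(tx, ty, y, bandwidth)   # step through t^x
    x_rec = kernel_reconstruct(ty, tx, x, bandwidth)   # step through t^y
    a = np.concatenate([np.asarray(x, float), x_rec])
    b = np.concatenate([y_rec, np.asarray(y, float)])
    ok = ~np.isnan(a) & ~np.isnan(b)                   # drop failed reconstructions
    return binned_mutual_information(a[ok], b[ok])

# Analytical MI of a bivariate normal pair with correlation 0.8
expected_mi = -0.5 * np.log(1 - 0.8 ** 2)
```

A histogram estimator is used here only because it is short; any standard MI estimator could be applied to the pooled pairs instead.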

#### 2.2.3 Constructing a paleoclimate network

The adapted estimators (*gXCF* and *gMI*) form the basis for a network analysis of paleoclimate records, because by employing them we can hope to capture the true dependence structure with small sampling bias. Network construction is conducted according to the following steps:

1. In the first step, paleoclimate records in the study region, presumably representing one climatic component (e.g. monsoonal rainfall amounts), are identified and checked for comparability: while their time sampling does not have to be equal, the average sampling interval should be of the same order of magnitude. Within the time slice of interest, the record should consist of at least 100 observations, to ensure the power of the similarity tests.
2. In the second step we pre-process the suitable datasets. We limit the time series to a time window of width *W*. For each record we subtract a nonlinear trend, which we estimate by applying a Gaussian kernel smoother with a bandwidth of *W*/2. We choose the bandwidth such that we remove centennial-scale trends but do not smooth high-frequency (annual to decadal) variability. The data, within this time window, now has zero mean and unit variance.
3. In the third step, the degree of similarity is estimated for all pairwise combinations of records. Within the overlap of the individual pairs, we calculate lagged MI and Pearson correlation in the ‘standard’ way, involving interpolation to an average time scale (*iXCF* and *iMI*), and using the adapted estimators (*gXCF* and *gMI*). To compensate for possible dating uncertainties, we determine the largest absolute value of the similarity function \(S(m \Delta t^{xy})\) within time lags of *m* = 0 ± 1 around zero lag. As a result we get four matrices with MI, resp. correlation, estimates.
4. We then conduct pairwise significance tests for each similarity measure *S* as described in Rehfeld et al. (2011): we construct surrogate time series following the null hypothesis that both records are uncoupled, irregularly sampled autoregressive processes of order 1. The persistence time for the test time series is estimated from the original records. The similarity function *S*(*m*) for these artificial data is estimated 1,000 times, so that the critical values, the 2.5 and 97.5 % quantiles of the distribution of similarity estimates, can be determined.
5. Finally, these critical values are used to threshold the correlation matrices. If a significant correlation exists between the records *i* and *j*, i.e., \(S_{est}^{i,j} < S_{2.5}^{i,j}\) or \(S_{est}^{i,j} > S_{97.5}^{i,j}\), we set *A*(*i*, *j*) = 1. If no significant similarity is found we set the entry to zero. We repeat this for all four similarity estimators and obtain four adjacency matrices. We then sum the matrices to obtain the final, weighted, adjacency matrix for the network. The nodes *i* and *j* are linked if any *A*(*i*, *j*) > 0. Link weight scales between zero (no link) and four (all measures find a significant link). Employing *gMI*, *gXCF*, *iMI*, and *iXCF* all together, we can improve the robustness of the network detection, as the resulting link weight then reflects our certainty of a true similarity and is not likely to be due to the peculiarity of one measure.
6. The obtained network can now be visualized and analyzed.
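Steps 4 and 5 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published implementation: the persistence times are taken as given rather than estimated, `similarity` is a hypothetical callable with the signature `(tx, x, ty, y) -> float`, and the two-sided test is applied to every measure.

```python
import numpy as np

def ar1_surrogate(times, persistence, rng):
    """Unit-variance AR(1) process sampled at the given (irregular) times."""
    times = np.asarray(times, dtype=float)
    out = np.empty(times.size)
    out[0] = rng.standard_normal()
    for k in range(1, times.size):
        # autocorrelation decays exponentially with the sampling gap
        phi = np.exp(-(times[k] - times[k - 1]) / persistence)
        out[k] = phi * out[k - 1] + np.sqrt(1.0 - phi ** 2) * rng.standard_normal()
    return out

def is_significant(similarity, tx, x, ty, y, tau_x, tau_y, n_surr=1000, seed=0):
    """Two-sided test of a similarity estimate against uncoupled AR(1) surrogates."""
    rng = np.random.default_rng(seed)
    null = np.array([similarity(tx, ar1_surrogate(tx, tau_x, rng),
                                ty, ar1_surrogate(ty, tau_y, rng))
                     for _ in range(n_surr)])
    lo, hi = np.quantile(null, [0.025, 0.975])   # critical values
    s_est = similarity(tx, x, ty, y)
    return bool(s_est < lo or s_est > hi)

def weighted_adjacency(binary_matrices):
    """Sum the per-measure 0/1 adjacency matrices into link weights."""
    return np.sum([np.asarray(a, dtype=int) for a in binary_matrices], axis=0)

# Usage with a plain Pearson correlation on a shared, regular time axis
pearson = lambda tx, x, ty, y: np.corrcoef(x, y)[0, 1]
t = np.arange(100.0)
signal = np.sin(t / 5.0)
linked = is_significant(pearson, t, signal, t, signal, 2.0, 2.0, n_surr=200)
```

Because the surrogates are drawn on the original observation times of each record, the null distribution mimics the temporal coverage of each individual pair, which is why the test has to be repeated pair by pair.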

#### 2.2.4 Basic paleoclimate network measures

We calculate the *weighted node degree* \(D_i = \sum_j W_{i,j}\), given by the sum of link weights \(W_{i,j}\) of a node *i* linking it to all *j* others. The overall link density *L* is given by \(L = \frac{1}{4N} \sum_{i,j} W_{i,j}\), the sum of link weights divided by the possible sum of link weights, depending on the number of nodes *N* and involved similarity measures (here, 4). To understand the spatial distribution of our links, we define a third measure, *PConn*, the percentage of realized connections between subdomains. We define it as the fraction of realized vs. possible links between nodes west of 95° longitude (nodes in the traditional ISM domain) and nodes east of 95° longitude. We then generate 1,000 random networks, redistributing links randomly (at the adjacency matrix level), and estimate *PConn* from each. From the resultant distribution of \(PConn_{sim}\) we can find the fraction *p* of random networks that show a *lower* *PConn* than our observed \(PConn_{real}\).

Similarly, we calculate the *average link density* of all nodes and nodes east/west of the boundary to determine if they show uniform or differing characteristics.
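The measures of this section can be written down compactly. The sketch below is illustrative only: it implements the link density exactly as the formula is written in the text, treats a link as realized when its weight is non-zero, and shuffles a single binary adjacency matrix, which is a simplifying assumption about how the random networks are generated.

```python
import numpy as np

def weighted_degree(W):
    """D_i = sum_j W_ij for every node i."""
    return np.asarray(W).sum(axis=1)

def link_density(W, n_measures=4):
    """L = sum_ij W_ij / (n_measures * N), following the formula in the text."""
    W = np.asarray(W)
    return W.sum() / (n_measures * W.shape[0])

def pconn(A, west):
    """Fraction of realized vs. possible links between west and east nodes."""
    A = np.asarray(A) > 0
    west = np.asarray(west, dtype=bool)
    cross = A[np.ix_(west, ~west)]
    return cross.sum() / cross.size

def pconn_p_value(A, west, n_random=1000, seed=0):
    """Fraction of randomly rewired networks with a lower PConn than observed."""
    rng = np.random.default_rng(seed)
    A = (np.asarray(A) > 0).astype(int)
    n = A.shape[0]
    iu = np.triu_indices(n, k=1)
    links = A[iu]
    observed = pconn(A, west)
    lower = 0
    for _ in range(n_random):
        R = np.zeros((n, n), dtype=int)
        R[iu] = rng.permutation(links)     # redistribute links over node pairs
        R = R + R.T
        lower += pconn(R, west) < observed
    return lower / n_random

# Toy network: nodes 0 and 1 lie west of the boundary, nodes 2 and 3 east
W = np.array([[0, 2, 4, 0],
              [2, 0, 0, 0],
              [4, 0, 0, 0],
              [0, 0, 0, 0]])
west = [True, True, False, False]
D = weighted_degree(W)
pc = pconn(W, west)
```

In the toy network one of the four possible west-east pairs is linked, so `pc` is 0.25.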

## 3 Data

In our analysis we include published proxy data from the Asian monsoon domain between 66° and 116°E, and 10° to 39°N (Fig. 1). We include tree-ring and stalagmite data as well as one annually laminated sediment core (Von Rad et al. 1999), one ice core (Thompson et al. 2000) and one reconstruction of summer temperatures compiled from tree-ring data and historic documents (Yi et al. 2011). The data had to cover at least one of the periods (−30–100, 100–400 or 700–1100 years BP) with at least 100 observations.

Tree-ring chronologies (*rwl-crn* in Table 1) were used as provided. Raw tree ring width series (rwl) were assembled into chronologies by first detrending the individual tree series with a 50-year Gaussian kernel smoother (to remove youth bias), then standardizing, and finally averaging the individual trees for the corresponding years.
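The chronology assembly described above can be sketched as follows. Several details are assumptions, since the text does not specify them: the trend is removed by subtraction (not division), the 50-year figure is used as the standard deviation of the Gaussian kernel, and the function names are hypothetical.

```python
import numpy as np

def gaussian_smooth(years, values, bandwidth):
    """Gaussian kernel smoother evaluated at the observation years."""
    d = years[:, None] - years[None, :]
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    return (w @ values) / w.sum(axis=1)

def build_chronology(series, bandwidth=50.0):
    """Detrend, standardize and average ring-width series.

    `series` is a list of (years, ring_widths) pairs, one per tree.
    Returns the sorted years and the mean standardized index per year.
    """
    sums, counts = {}, {}
    for years, widths in series:
        years = np.asarray(years, dtype=float)
        widths = np.asarray(widths, dtype=float)
        resid = widths - gaussian_smooth(years, widths, bandwidth)  # remove growth trend
        resid = (resid - resid.mean()) / resid.std()                # zero mean, unit variance
        for yr, v in zip(years, resid):
            sums[yr] = sums.get(yr, 0.0) + v
            counts[yr] = counts.get(yr, 0) + 1
    yrs = np.array(sorted(sums))
    return yrs, np.array([sums[y] / counts[y] for y in yrs])

# Two synthetic trees with partly overlapping life spans
rng = np.random.default_rng(0)
tree_a = (np.arange(1900, 1950), rng.random(50) + 1.0)
tree_b = (np.arange(1920, 1970), rng.random(50) + 1.0)
yrs, chron = build_chronology([tree_a, tree_b])
```

The same smoother, with the bandwidth set to half the window width, would also serve for the detrending step of the network construction.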

Table 1: All paleoclimate records used in this study

| No | Name | Lat. [°N] | Lon. [°E] | Archive | Proxy | Reference |
|---|---|---|---|---|---|---|
| 1 | SO90-39KG-56KA | 25 | 66 | Marine | varve thickn. | Von Rad et al. (1999) |
| 2 | Akalagavi | 15 | 74 | Stal | δ | Yadava et al. (2004) |
| 3 | Karakoram | 36 | 75 | Tree | \*rainfall | Treydte et al. (2006) |
| 4 | ktrc | 10 | 77 | Tree | rwl-crn | Borgaonkar et al. (2010) |
| 5 | imrf | 13 | 77 | Tree | \*rainfall | Pant et al. (1988) |
| 6 | INDI019 | 30 | 78 | Tree | rwl-crn | Borgaonkar et al. (1994) |
| 7 | INDI021 | 30 | 79 | Tree | rwl-crn | Borgaonkar et al. (1994) |
| 8 | Jhumar | 19 | 82 | Stal | δ | Sinha et al. (2007) |
| 9 | Dandak | 19 | 82 | Stal | δ | |
| 10 | DasuopuC3 | 28 | 85 | Ice core | δ | Thompson et al. (2000) |
| 11 | Wah-Shikar | 25 | 92 | Stal | δ | Sinha et al. (2007) |
| 12 | CHIN006 | 36 | 98 | Tree | rwl | Sheppard et al. (2004) |
| 13 | CHIN005 | 37 | 99 | Tree | rwl | Sheppard et al. (2004) |
| 14 | CHIN017 | 29 | 99 | Tree | rwl | Cook et al. (2010) |
| 15 | CHIN019 | 29 | 100 | Tree | rwl | Cook et al. (2010) |
| 16 | CHIN021 | 29 | 100 | Tree | rwl | Cook et al. (2010) |
| 17 | HIN001a | 37 | 100 | Tree | rwl-crn | noaa-tree-5408; Zu, R.Z. |
| 18 | CHIN018 | 29 | 100 | Tree | rwl | Cook et al. (2010) |
| 19 | CHIN020 | 30 | 100 | Tree | rwl | Cook et al. (2010) |
| 20 | CHIN003 | 38 | 100 | Tree | rwl-crn | noaa-tree-5407; Zu, R.Z. |
| 21 | Wanxiang | 33 | 105 | Stal | δ | Zhang et al. (2008) |
| 22 | Dayu | 33 | 106 | Stal | δ | Tan et al. (2009) |
| 23 | VIET001 | 12 | 108 | Tree | rwl-crn | Buckley et al. (2010) |
| 24 | Jiuxian-c996-1 | 33 | 109 | Stal | δ | Cai et al. (2010) |
| 25 | Heshang | 30 | 110 | Stal | δ | Hu et al. (2008) |
| 26 | CHIN004ea | 34 | 110 | Tree | rwl-crn | noaa-tree-5352; Wu, X.D et al. |
| 27 | NCPrecipIndex | 37 | 112 | Historic + tree | \*JJA precip. | Yi et al. (2011) |
| 28 | Shihua 2003 | 39 | 116 | Stal | \*Temp | Tan and Liu (2003) |

## 4 Results

We derive small, but due to the spatial and archive-specific heterogeneities still very complex, networks from the datasets in Table 1. For each time period (MWP, LIA, late RWP) we select records fulfilling the data requirements described in Sect. 2.2. We subsequently describe the retrieved networks visually, qualitatively, and quantitatively.

### 4.1 Medieval warm period (MWP)

Table 2: Paleoclimate record composition and results obtained from the networks for the three considered time periods, MWP, LIA and RWP

| | MWP | LIA | RWP |
|---|---|---|---|
| Time frame [yrs BP] | 700–1100 | 100–400 | −30–100 |
| No. of records: all (tree/stal/other) | 10 (4/5/1) | 25 (16/6/3) | 22 (16/3/3) |
| No. of records West/East of 95°E | 4/6 | 10/15 | 8/14 |
| Weighted degree (mean / <95°E / >95°E) | 8.00 / 11.25 / 5.83 | 15.92 / 12.20 / 18.40 | 11.00 / 9.00 / 12.14 |
| PConn (p-val) | 0.24 (0.76) | 0.14 (0.16) | 0.13 (0.56) |

After pairwise similarity assessment and significance testing at the 95 %-level, we observe a well-connected network (Fig. 4). Still, the mean correlation levels for all measures (reported in Table 2) are not significantly different from zero (for gXCF and iXCF) and the intrinsic estimator bias of approximately 0.6 (for *gMI* and *iMI*). Note that though we report the upper and lower quantiles for MI, we only used the upper quantile to threshold the correlation matrix, as MI is a symmetric measure (see also Sect. 2.2.2).

Between the 10 nodes we find 22 links, which have an overall weight of 40 (link weights scale from zero to four, as described in Sect. 2.2). We find two links with highest certainty (weight = 4, Wanxiang \(\leftrightarrow\) Dandak and Wanxiang \(\leftrightarrow\) Shihua), showing a strong West-East connection. The Dandak record is also linked with high certainty to Jhumar cave, SO90-39-KG-KA and the tree-ring chronology CHIN006. It is the node with the highest weighted degree, followed by the Wanxiang record. The weighted node degree is visualized by the size of the nodes in Fig. 4. The tree-ring record from Vietnam, VIET001, is the node with the lowest degree; it is linked to only one other node, the easternmost marine record (1). Link weight is indicated in Fig. 4a, b by both the width and the darkness of the links. The nodes in the network in Fig. 4b are not placed according to their geographic origins but according to an iterative force-weighting algorithm. Linked nodes are attracted to each other, while nodes without connections are repelled. *Isolates*, i.e., only loosely connected nodes (here the VIET001 and CHIN005 tree-ring records), tend to be pushed to the margins, while hubs, i.e., nodes that are strongly connected throughout the network (here the Dandak stalagmite record), remain central.

Finally, we divide the nodes into two sections, West and East of 95°E, and estimate the regional degree and *PConn*, as defined in Sect. 2.2.4. Were the two domains actually asynchronous and independent, we would not expect to find a significant fraction of realized links between nodes across the artificial border and, as a consequence, we would expect *PConn* to be low. Assuming independence of the regions, we would also expect the node degree statistics on both sides to be homogeneous. However, at an average weighted degree of 8, we find that nodes in the West show a degree almost twice as high as those further East (Table 2). We find *PConn* = 0.24, so approximately one quarter of the possible links are realized. Conducting our simple statistical test, in which we redistribute the links randomly across the network for each similarity measure, we find that 76 % of these networks have *fewer* connections between the subnetworks, so the connectivity across the artificial border is rather high.
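The link-redistribution test described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an unweighted (binary) network, takes *PConn* to be the number of realized West-East links divided by the number of possible West-East node pairs, and draws surrogates by placing the same number of links uniformly at random. The function names, the toy node labels and the toy link list are hypothetical.

```python
import random
from itertools import combinations


def cross_link_fraction(links, west, east):
    """Fraction of possible West-East node pairs that are actually linked."""
    realized = sum(
        1 for a, b in links
        if (a in west and b in east) or (a in east and b in west)
    )
    return realized / (len(west) * len(east))


def fraction_of_surrogates_below(links, west, east, n_surrogates=2000, seed=0):
    """Share of random networks (same link count) with lower cross-border connectivity."""
    rng = random.Random(seed)
    all_pairs = list(combinations(sorted(west | east), 2))
    observed = cross_link_fraction(links, west, east)
    below = 0
    for _ in range(n_surrogates):
        # Assumption: links are redistributed uniformly over all node pairs.
        surrogate = rng.sample(all_pairs, len(links))
        if cross_link_fraction(surrogate, west, east) < observed:
            below += 1
    return observed, below / n_surrogates


# Hypothetical toy network: 4 western and 6 eastern nodes, 22 links in total.
west = {"W1", "W2", "W3", "W4"}
east = {"E1", "E2", "E3", "E4", "E5", "E6"}
links = [(w, e) for w in sorted(west) for e in ("E1", "E2", "E3")]  # 12 cross-border links
links += list(combinations(sorted(west), 2))                         # 6 links within the West
links += [("E1", "E2"), ("E4", "E5"), ("E4", "E6"), ("E5", "E6")]    # 4 links within the East

observed, share_below = fraction_of_surrogates_below(links, west, east)
print(f"cross-border fraction = {observed:.2f}, surrogates below = {share_below:.2f}")
```

A large share of surrogates below the observed value corresponds to the situation reported for the MWP, a small share to the situation reported for the LIA.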

### 4.2 Little ice age (LIA)

In the more recent period of the LIA (100–400 years BP) we were able to include 25 records, 16 from trees, 6 stalagmite and 3 other records (Records no. 1, 10 and 27, see Table 1). Again, the node distribution is spatially biased towards China, with two thirds of the records located east of 95°E.

108 links connect the nodes, with a weight sum of 199 and a weighted link density of ≈17 %. We find 5 links of highest and 16 of high certainty (Fig. 5). The ‘supernodes’ with the highest degree are the Chinese stalagmite record Dayu (sum of weights 27) and the tree-ring chronology CHIN018 (weight sum 26). The South Indian record of Akalagavi has the lowest link weight sum (5). At the same time, the Vietnamese tree-ring record VIET001, which was almost isolated during the MWP, is now well connected to the network (weight sum 14) and is associated with highest certainty with the tree-ring record CHIN018. In the force-weighted representation (Fig. 5b), however, it is still pushed outwards, similar to the almost isolated Akalagavi record from Southern India.

During the time period of the LIA, the average degree east of the artificial 95°E boundary is 30 % higher than on the Indian side of the boundary, while the overall weighted degree is almost twice as high as compared to the MWP. This is concordant with twice the number of available nodes. The estimated *PConn* is lower (0.14) across the border and relatively few, only 16 %, of the randomly generated networks have a lower connectivity.

### 4.3 Recent warm period (RWP)

For the RWP (−30–100 years BP, i.e., 1850–1980 AD) we included 22 records, of which 16 came from trees, three from stalagmites and three from other sources (Nos. 1, 10 and 27 in Table 1). Roughly 60 % of the nodes lie west of 95°E; the spatial bias is therefore slightly lower than in the preceding time intervals. There is no apparent overall association amongst all nodes, as the mean correlation levels lie well within the critical values given in Table 2.

The obtained network is rather sparsely connected (Fig. 6). We find 62 links between the 22 nodes with an overall link weight sum of 121. The overall weighted link density is ≈13 %. Only two links of highest certainty (Heshang \(\leftrightarrow\) CHIN001a; CHIN021 \(\leftrightarrow\) CHIN20) and seven of high certainty are observed. The most connected node is the Chinese tree-ring record CHIN017, and the Akalagavi record has the second highest weighted degree (17). SO90-39-KG is an isolated node in this time interval, with no link to the rest of the network, and the Indian tree-ring chronology INDI019 has a weighted degree of only 2. VIET001 has an above-average weighted degree (16) and is, like the South Indian Akalagavi record (17), well connected to the network; both are centrally located in the force-weighted network representation (Fig. 6b).
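The summary figures quoted for the three networks can be cross-checked with a few lines of arithmetic. The sketch below assumes that the mean weighted degree equals twice the link weight sum divided by the number of nodes, and that the weighted link density is the weight sum divided by the maximum attainable weight (four per possible node pair). These formulas are assumptions, since the definitions of Sect. 2.2 are not repeated here, but they reproduce the reported values; the function name is hypothetical.

```python
def network_summary(n_nodes, weight_sum, max_weight=4):
    """Return (mean weighted degree, weighted link density) of an undirected network."""
    possible_links = n_nodes * (n_nodes - 1) // 2
    # Each link contributes its weight to the degree of both of its end nodes.
    mean_weighted_degree = 2 * weight_sum / n_nodes
    # Assumed normalization: every possible link carrying the maximum weight.
    weighted_density = weight_sum / (max_weight * possible_links)
    return mean_weighted_degree, weighted_density


# (number of nodes, link weight sum) as reported in Sects. 4.1 to 4.3
periods = {"MWP": (10, 40), "LIA": (25, 199), "RWP": (22, 121)}
for name, (n_nodes, weight_sum) in periods.items():
    degree, density = network_summary(n_nodes, weight_sum)
    print(f"{name}: mean weighted degree = {degree:.2f}, weighted link density = {density:.2f}")
# MWP: mean weighted degree = 8.00, weighted link density = 0.22
# LIA: mean weighted degree = 15.92, weighted link density = 0.17
# RWP: mean weighted degree = 11.00, weighted link density = 0.13
```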

Although the across-border connectivity *PConn* is, at 0.13, lower for the RWP than for the previous LIA period, the significance of the estimate (*p* = 0.54) is low due to the overall lower number of connections (a lower average degree than in the LIA), and the result cannot be distinguished from a randomly generated network of the same link density.

### 4.4 Comparison of medieval warm period, little ice age, and the recent warm period

- The warmer MWP featured a high overall link density, a strong West-East connection and a higher node degree west of the artificial boundary.
- The colder LIA showed a lower link density, a lower West-East linkage and a higher degree east of 95°E longitude. Within the ISM domain, fewer links connect meridionally than zonally.
- During the relatively warmer RWP we derive the lowest overall link density and a medium West-East connectivity, consistent with a more uniform network.
- Although the net connectivity *PConn* is decreasing towards the present (0.24, 0.14, 0.13 for MWP/LIA/RWP), this is consistent with a decrease in link density (0.22, 0.17, 0.13). If we account for this effect by standardizing the fraction of realized zonal edges, dividing it by the average link density, we observe a pattern that is in accordance with the significance test results: *PConn*/*D* ≈ (1.1, 0.8, 1.0) is high 1,000 years ago, drops for the period of the LIA and is higher, though not at the MWP level, for the most recent RWP network. Consistent with this, the *p*-values we obtained show the same pattern (0.76, 0.16, 0.54). These *p*-values indicate how *PConn* is to be interpreted with respect to the null hypothesis of the network being homogeneous and random. The high value of *p* during the MWP points towards a stronger zonal linkage than expected from random graphs of the same link density. The low value for the LIA reflects a lower connectivity, which is inconsistent with an overall association between the areas east and west. The RWP network is practically random (*p* = 0.54, close to the median of the *PConn* from surrogate networks).
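As a quick check of the standardization step, the ratio *PConn*/*D* can be recomputed from the rounded values quoted in the text. This is only a sketch based on those rounded figures; the variable names are illustrative.

```python
# Rounded values as quoted in the text for MWP, LIA and RWP.
p_conn = {"MWP": 0.24, "LIA": 0.14, "RWP": 0.13}
link_density = {"MWP": 0.22, "LIA": 0.17, "RWP": 0.13}

# Standardize the cross-border connectivity by the overall link density.
standardized = {period: p_conn[period] / link_density[period] for period in p_conn}

for period, ratio in standardized.items():
    print(f"{period}: PConn / D = {ratio:.1f}")
# MWP: PConn / D = 1.1
# LIA: PConn / D = 0.8
# RWP: PConn / D = 1.0
```

A ratio near one means that links cross the 95°E boundary about as often as they occur anywhere else in the network.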

## 5 Discussion and conclusions

### 5.1 Medieval warm period

The MWP paleoclimate network, representing a period of northern hemispheric warmth, shows strong zonal connectivity between the subdomains, linking India and China very effectively. This strong eastward flow of dynamical information indicates a strong ISM circulation, with a strong ISM penetration into the mainland of China. A temperature modulation of ISM strength has been observed on decadal to millennial timescales (Cai et al. 2010; Cheng et al. 2012; Wang et al. 2005; Zhang et al. 2008) and is expected from model results (Schewe et al. 2012). Increased northern hemisphere temperature could have allowed an earlier retreat of the Tibetan High in spring parallel to a more northward intrusion of the ITCZ. This could then have resulted in an earlier ISM onset, and a prolonged and enhanced ISM season. We hypothesize that increased circulation allowed deeper eastward ISM penetration into China, and that the northern ITCZ is the main factor linking India and China during the MWP summers.

### 5.2 Little ice age

In contrast to the MWP, the cool LIA yields a comparably weaker information flux towards the East and strong regional associations within China, pointing towards increased regional-scale, or EASM, influence in this region. The low number of meridional links over India during the LIA and the disconnection between the ASM sub-systems could be explained if we invoke a southward mean ITCZ position, leading to a relative strengthening of local weather effects in India and China, and a disruption of the link between the ISM and EASM domains. At the same time, the Vietnamese tree-ring record is now strongly connected to sites in central China, and we find highly significant links across the Tibetan Plateau. A relative strengthening of the Tibetan High and an increased importance of local effects during this cold phase would explain these observations. In agreement with this, the (at present ISM-dominated) record from Wanxiang cave (Cai et al. 2010) was found to show a wetter MWP and RWP with stronger monsoon periods, and a drier LIA with weaker monsoon periods. A link between the Indian Dandak cave record, located centrally in the zonal ISM inflow corridor, and Wanxiang cave was observed for the onset phase of the LIA (Berkelhammer et al. 2010; Rehfeld et al. 2011). Unfortunately, we have no insight into this link during the entire LIA period because the Dandak record does not fully cover the LIA. However, for the Jhumar stalagmite record, which is located in close proximity to the Dandak site, we do not find highly significant multi-annual to decadal scale similarities during the LIA period, corroborating our hypothesis of a weakened teleconnection between India and China at that time.

### 5.3 Recent warm period

The paleoclimate network for the most recent time period indicates neither strong nor weak zonal information flow. Link orientation appears to be almost random, which could be consistent with a transition from the ‘cold state’ (emphasized Tibetan High and local effect importance, and decreased ISM meridional components) to a ‘warm, MWP-like, state’ (deep eastward ISM penetration, strong meridional links within India). This is also supported by node degree statistics, which show an equal distribution of links on both sides of the artificial 95°E boundary. Our observation period does, however, include the transition from the LIA (Osborn and Briffa 2006) as well as increasing anthropogenic impacts and alteration of the atmosphere, also in monsoonal Asia (Ruddiman 2003; Zhang et al. 2008), and we must be careful not to over-interpret these results.

Though the quantitative accordance between the two warmer periods is striking, the low spatio-temporal resolution of the MWP proxies is a potential source of uncertainty. While we strove to ensure comparability by sampling all regions in both networks, archive composition becomes tree-oriented toward the present. Although a source of uncertainty, the bias should be negligible, because the tree-specific link densities are not, or only slightly, higher than for the rest of the archives. This could be due to the fact that the tree sites we included, especially those in Central China, are located in mountainous areas, where strong geographic heterogeneities in the form of valleys and mountains induce local moisture flow divergences.

The low number of available paleoclimate proxy records from the late Holocene is the reason why we chose to combine temperature and precipitation-dominated records, based on the assumption of a functional relationship between the parameters. If a sufficient number of datasets representing the variability of one climate parameter across the Asian monsoon domain were available, we could attempt to reconstruct physical flows as in more recent climate network analysis (Donges et al. 2009; Malik et al. 2011), but at present such an analysis, at least for sub-decadal to decadal scale variability, is not feasible in the ASM domain. On decadal to centennial time scales, such an analysis might, however, be feasible with the inclusion of other terrestrial and marine archives (e.g. pollen, coral, or lacustrine records). Our study focused on the Asian monsoon, but it is equally possible, and informative, to use available paleoclimate records from other locations to study the regional response to forcing factors like the North Atlantic Oscillation or the El Niño Southern Oscillation (ENSO). Future extensions of this method may consider directionalities and indirect couplings, e.g., derived from recurrence-based methods (Feldhoff et al., submitted; Zou et al. 2011). Furthermore, it would be informative to use the new method for time intervals during interstadial, stadial, and interglacial times. Such studies could shed light on the variability of, e.g., monsoonal teleconnections during these periods.

### 5.4 Discussion of the paleoclimate network approach

[…] \(\delta^{18}\)O time series), but the advantage of the paleoclimate network approach is that we obtain figures for the *degree* of similarity, not only concerning the relationship between two proxy records, but also their ties to all other records included in the analysis. Therefore, to address the question (“How did the subsystems interact during the different time periods?”) we were able to compute a connectivity index from realized links connecting the subdomains. The results indicate that interaction was stronger during the MWP than during the LIA, and that the recent warming period shows more MWP-like conditions. Contemplating the time series in Fig. 3 by eye alone, we could not possibly have come to such a conclusion.

Uncertainties of the records should be incorporated into similarity assessment wherever possible. This can be done, for example, by comparing records, visually or numerically, on an *absolute* time scale (Breitenbach et al. 2012), where the dating errors are moved into the proxy domain and the time scale becomes certain. Given a proxy record with confidence bounds, it is possible to incorporate these uncertainties into the paleoclimate network approach numerically (i.e. via Monte Carlo simulations). A basic prerequisite for this, however, is access to dating information for all data that should be included, a requirement not met at the moment.

More generally, a paleoclimate network is a tool that enables us to obtain a spatio-temporal fingerprint of the climate system, a visual representation that summarizes what we can see by eye, and more. We could also use it to study proxy response to climate parameters, be they linear or nonlinear (Anchukaitis et al. 2006; Schleser et al. 1999), as it relies on association measures suitable for irregular sampling. Similarly, weather station data is often riddled with gaps, making it necessary to reconstruct the missing data or to cut the time periods to the sections of overlap. To compare such records amongst each other, and to proxy reconstructions, Gaussian kernel-based correlation estimation (Rehfeld et al. 2011) and mutual information are well-suited. Such a systematic validation could, for example, take place in the framework of *interacting* networks (Donges et al. 2011), or in a potential multivariate extension of the paleoclimate networks.
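To illustrate why kernel-based estimators suit irregularly sampled or gappy series, the following simplified sketch computes a Gaussian-kernel weighted correlation without interpolating either series. It is written in the spirit of the estimator cited above, not as a reproduction of it: the function name, the default kernel width (a quarter of the larger mean sampling interval) and the toy data are assumptions made for this example.

```python
import math


def gaussian_kernel_correlation(tx, x, ty, y, lag=0.0, width=None):
    """Kernel-weighted correlation of two irregularly sampled series at a given lag."""

    def standardize(values):
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        return [(v - mean) / std for v in values]

    if width is None:
        # Assumed default: a quarter of the larger mean sampling interval.
        mean_dt_x = (tx[-1] - tx[0]) / (len(tx) - 1)
        mean_dt_y = (ty[-1] - ty[0]) / (len(ty) - 1)
        width = max(mean_dt_x, mean_dt_y) / 4.0

    xs, ys = standardize(x), standardize(y)
    weighted_products = 0.0
    weight_total = 0.0
    for ti, xi in zip(tx, xs):
        for tj, yj in zip(ty, ys):
            # Observation pairs whose time difference is close to the lag get high weight.
            distance = (tj - ti) - lag
            weight = math.exp(-0.5 * (distance / width) ** 2)
            weighted_products += weight * xi * yj
            weight_total += weight
    return weighted_products / weight_total


def signal(t):
    """Toy signal with a period of 50 time units (hypothetical)."""
    return math.sin(2.0 * math.pi * t / 50.0)


# Two records of the same toy signal, observed at different, uneven times.
tx = [3.0 * i + 0.5 * (i % 3) for i in range(40)]
ty = [2.5 * i + 0.7 * (i % 2) for i in range(48)]
x = [signal(t) for t in tx]
y = [signal(t) for t in ty]

print(f"kernel correlation at lag 0: {gaussian_kernel_correlation(tx, x, ty, y):.2f}")
```

Because each pair of observations is weighted by how closely its time difference matches the lag, no common time axis is needed, which is the property that makes this family of estimators attractive for proxy and station data alike.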

We have attempted to reconstruct monsoonal dynamics of the last millennium using a combination of different paleoclimate archives and proxies from Asia. Using the paleoclimate network approach, we find that the warm climate of the Medieval Warm Period was characterized by a strong zonal ISM penetration into China, whereas during the cold Little Ice Age the meridional component within the EASM was strengthened. We hypothesize that the ITCZ (itself responding to a variety of factors) is the major influencing factor connecting the two sub-systems of the Asian monsoon domain during warm intervals. During cold periods, the Tibetan High would have forced a retreat of the ITCZ, and local effects would have become more dominant. Though we cannot, at present, make a statement about the future of the ISM strength, we find that the most recent period (1850–1980 AD) is dynamically more similar to the MWP than to the LIA.

## Acknowledgments

This research was financially supported by the German Federal Ministry of Education and Research (BMBF project PROGRESS, 03IS2191B), the German Science Foundation (DFG research group FOR 1380 “Himalaya: Modern and Past Climates (HIMPAC)”, the DFG graduate school GRK 1364 “Shaping Earth’s Surface in a Variable Environment”) and the Schweizer National Fond (SNF Sinergia grant CRSI22 132646/1). The authors would like to thank M. Yadava, R. Ramesh and H. Borgaonkar for providing data from India, as well as G. Helle and M. Freund for helpful discussions about tree-ring data. Software to analyze irregularly sampled time series using the methods in this paper can be found at http://tocsy.pik-potsdam.de.