1 Introduction

The Asian-Australian monsoon (A-AM) is an integral component of the Earth’s climate system, involving complex interactions among the atmosphere, hydrosphere and biosphere. The A-AM covers more than one-third of the global tropics from roughly 40°E to 160°E (Wang et al. 2001), and is home to more than half of the world’s population. The A-AM climate is characterized by strong seasonality as well as interannual variability in vector winds and precipitation (Lin and Wang 2002; Wang et al. 2001). The A-AM system includes several sub-monsoon systems, i.e., the South Asian, East Asian, and Australian monsoons (Wang et al. 2001, 2014). They have a far-reaching impact on regional and global climate (Webster and Yang 1992). The monsoon circulation transports abundant water vapor from the Pacific and Indian Oceans to the region and greatly affects the monsoon rainfall and water budget. The prediction and projection of future climate strongly rely on the simulation skills of climate models. Improving model performance in simulating the monsoon climate is therefore particularly important for the projection of future climate. An accurate forecast of monsoon variation supports productive agriculture in the region and helps to reduce damage caused by drought or flood (e.g., Webster and Jian 2011; Sperber et al. 2013; Zhou et al. 2016).

In terms of the evaluation of model performance in simulating the monsoon climate, most previous studies focused on scalar variables such as temperature, precipitation, and wind speed, and paid little attention to the vector wind fields. However, the seasonal reversal of vector winds is one of the most important features of the monsoon climate. A strong Asian summer monsoon circulation usually brings more precipitation and vice versa. Therefore, the simulated precipitation is strongly determined by how well climate models can simulate atmospheric circulation (e.g., Twardosz et al. 2011; Sperber et al. 2013; Zhou et al. 2016; Wei et al. 2016). In addition to climate studies, climate-related industries, e.g., wind power, agriculture, and ecology, also require comprehensive evaluation of model performance in simulating wind fields for the purpose of adaptation to future climate change (Pryor et al. 2005; Fitch and Moore 2007; Hout et al. 2008; Pryor and Barthelmie 2011; Rasmussen et al. 2011; Barthelmie and Pryor 2014). Efforts have been made to evaluate monsoon circulation by assessing the monsoon winds of climate models, but most of them focus on the zonal or meridional wind component alone (e.g., Zveryaev 2002; Chen et al. 2012; Sperber et al. 2013; Kawatani et al. 2016). For example, Zveryaev (2002) investigated the decadal–interdecadal variabilities of zonal winds in the Asian monsoon region. However, it is known that the meridional wind also plays a crucial role in transporting moisture and shaping the regional climate, especially for the East Asian monsoon. Likewise, Sperber et al. (2013) evaluated the effect of the meridional gradient of zonal wind anomalies on the interannual variability of rainfall in the East Asia region, but did not consider the vector nature of winds either. Evaluating the meridional and zonal winds separately may lose important information or yield a one-sided evaluation of the modeled wind field.
A few studies also evaluated climate models in terms of vorticity or divergence fields, which take both the zonal and meridional wind components into account (Mitovski et al. 2010; Seiler and Zwiers 2016). However, as secondary variables computed from the vector wind field, the vorticity and divergence have some limitations in representing the vector field accurately. For instance, the same vorticity or divergence field may correspond to different vector wind fields, because the spatial derivatives that define them discard part of the information in the wind field (adding a spatially uniform wind, for example, changes neither quantity). Thus, an accurate simulation of the vorticity or divergence field does not necessarily indicate an accurate simulation of the vector wind field.

Recently, Xu et al. (2016) defined a set of statistical quantities to measure the features of vector fields and in turn devised a vector field evaluation (VFE) diagram. The VFE diagram, which can be regarded as a generalized Taylor diagram (Taylor 2001), summarizes multiple aspects of model performance in simulating vector fields. In this study, we carry out an in-depth evaluation of the vector winds in the A-AM region simulated by the Coupled Model Intercomparison Project-5 (CMIP5) models with the support of the VFE method. We intend to assess the CMIP5 model skills in reproducing the spatial pattern and temporal variations of vector wind fields from the viewpoint of vector field evaluation. The evaluation helps to understand the errors of CMIP5 models and can potentially support the improvement of climate models.

In Sect. 2, we briefly introduce the models and data used in this study as well as the VFE method. Section 3 reports the performance of the models with regard to spatial pattern of climatological mean vector winds and the simulated temporal variation of vector winds including annual cycle and interannual variability. Section 4 investigates the interrelationship of model performances in simulating climatological mean, annual cycle and interannual variability. Conclusions are presented in Sect. 5.

2 Data and methodology

2.1 Data

The model data used are the first ensemble run of historical experiment from 37 CMIP5 models during the period of 1979–2005 (Table 1). The historical experiment was forced by observed natural and anthropogenic forcings with some of the models including time-evolving land cover (Taylor et al. 2012). Detailed model information can be obtained from http://cmip-pcmdi.llnl.gov/cmip5.

Table 1 CMIP5 models used in this study

The data used to validate the models are from six reanalysis datasets, i.e., the National Centers for Environmental Prediction (NCEP)-Department of Energy (DOE) Atmospheric Model Inter-comparison Project II reanalysis data (NCEP2), the National Centers for Environmental Prediction/National Center for Atmospheric Research Reanalysis Project (NNRP), the European Centre for Medium-Range Weather Forecasts Reanalysis-40 (ERA40), the European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERA-I), the Japan Meteorological Agency and the Central Research Institute of Electric Power Industry Reanalysis-25 (JRA-25) and Reanalysis-55 (JRA-55). Using multiple reanalysis datasets as validation data allows us to take observational uncertainty into account.

We examine the performance of 37 CMIP5 models by comparing the monthly mean vector winds in the Asian-Australian monsoon region (40°E–160°E, 30°S–45°N) from the first ensemble run of historical simulations against ensemble mean of six reanalysis data sets during the period from 1979 to 2005. All model results and reanalysis products are regridded to a common grid of 2.5° × 2.5°. To avoid the unrealistic values caused by data extrapolation, all data below the land surface are excluded from the evaluation.

2.2 Assessment methods

The statistical quantities and VFE diagram devised by Xu et al. (2016) are briefly introduced here. The VFE diagram is a generalized Taylor diagram (Taylor 2001) that can be used to evaluate vector fields. It provides a concise and informative evaluation of model performance in terms of three statistical quantities, i.e., the vector similarity coefficient (VSC), root-mean-square vector length (RMSL), and root-mean-square vector difference (RMSVD):

$${\text{VSC}}=\frac{{\mathop \sum \nolimits_{{{\text{i}}=1}}^{{\text{N}}} {{\varvec{A}}_i} \cdot {{\varvec{B}}_i}}}{{\sqrt {\mathop \sum \nolimits_{{{\text{i}}=1}}^{{\text{N}}} |{{\varvec{A}}_i}{|^2}}}{\sqrt {\mathop \sum \nolimits_{{{\text{i}}=1}}^{{\text{N}}} |{{\varvec{B}}_i}{|^2}} }},$$
(1)
$${L_A}=\sqrt {\frac{{\mathop \sum \nolimits_{{i=1}}^{N} |{{\varvec{A}}_i}{|^2}}}{N}} \;{\text{and}}\;{L_B}=\sqrt {\frac{{\mathop \sum \nolimits_{{i=1}}^{N} {{\left| {{{\varvec{B}}_i}} \right|}^2}}}{N}} ,$$
(2)
$${\text{RMSVD}}=\sqrt {\frac{1}{N}\mathop \sum \limits_{{i=1}}^{N} {\left| {{\varvec{A}}_i} - {{\varvec{B}}_i} \right|^2}} .$$
(3)

Here, A and B represent two vector fields, which consist of N discrete vectors (in time and/or space). Similar to the Pearson correlation coefficient, the VSC measures pattern similarity between two vector fields and it ranges from − 1 to 1. The RMSLs, i.e., \({L_A}\) and \({L_B}\), measure the mean and variance of the magnitudes of vector fields A and B, respectively. RMSVD describes the overall difference between two vector fields similar to the root mean squared difference (RMSD) between two scalar fields. These three statistical quantities can also be applied to evaluation of anomalous scalar fields. Under such a circumstance, the VSC, RMSL, and RMSVD become the Pearson correlation coefficient, standard deviation, and the root mean squared difference in the Taylor diagram, respectively. The VFE diagram can be flexibly applied to the full vector fields or vector anomaly fields for different applications. The centered pattern correlation excludes mean state from the statistics and is most commonly used for detection studies (Santer et al. 1993; Wigley et al. 2000). In this study, we employed the uncentered statistics to take both the mean state and anomaly statistics into account.
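The three statistics can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `vfe_stats` and the sample wind values are hypothetical, the vectors are unweighted (no area weighting), and RMSVD is taken as the square root of the mean squared vector difference, as its name implies. With these definitions the quantities satisfy a law-of-cosines relation, RMSVD² = L_A² + L_B² − 2·L_A·L_B·VSC, which is what allows them to be displayed together in one diagram.

```python
import math

def vfe_stats(A, B):
    """Return (VSC, L_A, L_B, RMSVD) for two equal-length lists of (u, v) vectors."""
    n = len(A)
    dot = sum(a[0] * b[0] + a[1] * b[1] for a, b in zip(A, B))
    sq_a = sum(a[0] ** 2 + a[1] ** 2 for a in A)
    sq_b = sum(b[0] ** 2 + b[1] ** 2 for b in B)
    sq_d = sum((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 for a, b in zip(A, B))
    vsc = dot / (math.sqrt(sq_a) * math.sqrt(sq_b))
    return vsc, math.sqrt(sq_a / n), math.sqrt(sq_b / n), math.sqrt(sq_d / n)

# Hypothetical winds (m/s): B has the same pattern as A but is 10% stronger.
A = [(5.0, 1.0), (-3.0, 2.0), (0.5, -4.0)]
B = [(1.1 * u, 1.1 * v) for u, v in A]

vsc, l_a, l_b, rmsvd = vfe_stats(A, B)
normalized_rmsl = l_b / l_a  # length of B relative to the reference A

# Law-of-cosines check linking the three statistics.
residual = rmsvd ** 2 - (l_a ** 2 + l_b ** 2 - 2.0 * l_a * l_b * vsc)
```

In this example the pattern is identical (VSC of 1), so the whole difference between the two fields comes from the 10% amplitude error, and `residual` vanishes to rounding error.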

To quantitatively compare and rank the performance of the various CMIP5 models, we calculated the model skill scores Sv1 and Sv2 proposed by Xu et al. (2016). Sv1 and Sv2 are similar to the model skill scores defined by Taylor (2001) except for vector fields. Sv1 and Sv2 are defined as:

$${S_{v1}}=\frac{{4(1+{R_v})}}{{{{\left( {\frac{{{L_A}}}{{{L_B}}}+\frac{{{L_B}}}{{{L_A}}}} \right)}^2}(1+{R_0})}},$$
(4)
$${S_{v2}}=\frac{{4{{(1+{R_v})}^4}}}{{{{\left( {\frac{{{L_A}}}{{{L_B}}}+\frac{{{L_B}}}{{{L_A}}}} \right)}^2}{{(1+{R_0})}^4}}},$$
(5)

where \({R_v}\) is the vector similarity coefficient between observation and simulation, and \({R_0}\) is the maximum similarity attainable. Here we assumed that \({R_0}\) = 1. The value of skill score approaches 1 in a perfect simulation. Note that Sv1 and Sv2 take both the VSC and RMSL into account. Sv1 places more emphasis on the simulation of amplitude of vector field. In contrast, Sv2 is more sensitive to the pattern similarity between two vector fields.
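A minimal sketch of Eqs. (4) and (5), assuming \({R_0}\) = 1 as in the text, may help to show how the two scores weight pattern and amplitude errors differently. The function name `skill_scores` and the input values are illustrative only.

```python
def skill_scores(vsc, l_a, l_b, r0=1.0):
    """Return (Sv1, Sv2) from the VSC and the two RMSLs, following Eqs. (4) and (5)."""
    ratio = l_a / l_b + l_b / l_a  # equals 2 when the two RMSLs match
    sv1 = 4.0 * (1.0 + vsc) / (ratio ** 2 * (1.0 + r0))
    sv2 = 4.0 * (1.0 + vsc) ** 4 / (ratio ** 2 * (1.0 + r0) ** 4)
    return sv1, sv2

perfect = skill_scores(1.0, 1.0, 1.0)        # identical fields
pattern_error = skill_scores(0.9, 1.0, 1.0)  # correct amplitude, imperfect pattern
amplitude_error = skill_scores(1.0, 1.2, 1.0)  # correct pattern, 20% too strong
```

For the same pattern error (VSC of 0.9 with matching RMSLs), Sv1 is 0.95 whereas Sv2 drops to about 0.81, which illustrates why Sv2 is the more pattern-sensitive score. A pure amplitude error affects both scores equally, because the RMSL ratio enters both in the same way.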

3 Evaluations of vector winds in the Asian-Australian monsoon region

3.1 Climatological means

The multi-model ensemble mean (MME) of the 37 CMIP5 models captures the main features of the Asian-Australian summer monsoon circulation well when compared with the multi-reanalysis dataset ensemble (MRE). For example, the MME and the reanalysis show a highly consistent summer monsoon circulation pattern, characterized by the 850-hPa cross-equatorial flow to the east of Africa and over the maritime continent, the westerly flow over South Asia, and the anticyclonic circulation over the western Pacific (Fig. 1a, b). The upper tropospheric circulation is dominated by the South Asian anticyclonic circulation and north-to-south cross-equatorial flow (Fig. 1d, e). The VSC between the MME and the reanalysis is 0.98 (0.99) and the normalized RMSL is 0.97 (0.97) for the 850-hPa (200-hPa) wind fields. Clearly, the climatological mean vector winds simulated by the MME are highly consistent with those in the MRE in terms of both the spatial pattern and the magnitude of vector winds. However, 60–80% of the models underestimate the strength of the Somali low-level jet, and consequently the wind speed to the east of Somalia is approximately 10% lower in the MME than in the MRE (Fig. 1c). Compared with the MRE, the MME shows a stronger southerly flow over eastern China, a slightly weaker monsoon trough over South Asia (Fig. 1a, c), and a weaker tropical easterly flow in the upper troposphere. More than 70% of the CMIP5 models underestimate the tropical upper-level easterly flow and subtropical westerly flow, but overestimate the westerly flow around 30°N in the Northern Hemisphere and that to the south of 10°S (Fig. 1d, f).

Fig. 1

The climatological mean 850-hPa and 200-hPa vector winds in summer (June–July–August) for a, d multi-reanalysis dataset ensemble (MRE) and b, e MME during 1979–2005. The shading in b, e represents the inter-model spread defined by the standard deviation of vector winds across 37 CMIP5 models. c, f The difference of vector wind fields between MME and MRE. The shaded area in c, f represents the percentage of models with greater-than-MRE wind speed to total number of models. The unit is m s−1 for the vector wind and their inter-model spread

To measure the spread of 37 CMIP5 models, we calculated the root mean square vector difference between individual model and MME at each grid as:

$${\sigma _v}=\sqrt {\frac{1}{N}\mathop \sum \limits_{{i=1}}^{N} {{\left( {{{\varvec{V}}_{CMIP5,i}} - {{\bar {{\varvec{V}}}}_{MME}}} \right)}^2}} =\sqrt {\frac{1}{N}\mathop \sum \limits_{{i=1}}^{N} \left[ {{{({u_{CMIP5,i}} - {{\bar {u}}_{MME}})}^2}+{{({v_{CMIP5,i}} - {{\bar {v}}_{MME}})}^2}} \right]} ,$$
(6)

where N equals 37. \({{\varvec{V}}_{CMIP5,i}}\) and \({\bar {{\varvec{V}}}_{MME}}\) represent the vector wind of the i-th CMIP5 model and the ensemble mean of the CMIP5 models, respectively. The primary inter-model spread of vector winds occurs in the Asian summer monsoon region between 5°N and 30°N, with the maximum spread located in the Somali jet stream region, northern India, and the Southeast Asian-western Pacific region (Fig. 1b). Conversely, the inter-model spread is very small in the upstream part of the Asian-Australian summer monsoon circulation, i.e., the southern Indian Ocean and Australia (Fig. 1e). The large spread of vector winds in the Asian summer monsoon region indicates that models may not be able to accurately describe the coupling between precipitation and circulation, since a large spread in the simulated precipitation climatology is also found where models show large discrepancies in reproducing the wind fields (not shown). The inter-model spread of 200-hPa vector winds mainly occurs in the East Asian subtropical westerly jet stream regions. In addition, the South Asian anticyclonic circulation also shows a large spread among the 37 CMIP5 models to the south of the Tibetan Plateau and over eastern Africa.
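The spread in Eq. (6) can be sketched for a single grid point as follows. The function name `vector_spread` and the four sample model winds are hypothetical, and plain Python is used only to keep the example dependency-free; for gridded fields the same arithmetic would normally be vectorized over latitude and longitude.

```python
import math

def vector_spread(members):
    """Return the ensemble-mean vector and the RMS vector difference about it.

    `members` is a list of (u, v) winds, one per model, at one grid point.
    """
    n = len(members)
    u_bar = sum(u for u, _ in members) / n
    v_bar = sum(v for _, v in members) / n
    mean_sq = sum((u - u_bar) ** 2 + (v - v_bar) ** 2 for u, v in members) / n
    return (u_bar, v_bar), math.sqrt(mean_sq)

# Hypothetical winds (m/s) from four models: two differ from the mean in
# speed only, two in direction only.
members = [(4.0, 0.0), (6.0, 0.0), (5.0, 1.0), (5.0, -1.0)]
mme, spread = vector_spread(members)
```

Here the ensemble mean is (5, 0) m s−1 and the spread is 1 m s−1. Disagreement in direction contributes to the spread just as disagreement in speed does, which a spread computed from wind speed alone would miss.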

In winter, the MME can also realistically reproduce the circulation patterns of the Asian-Australian winter monsoon, with a VSC of 0.96 (0.99) and a normalized RMSL of 1.1 (1.0) in the lower (upper) troposphere in comparison with the MRE. The typical characteristics of the monsoon circulation in the region can be captured by the MME, such as the north-to-south cross-equatorial flow over Indonesia and the subtropical westerly jet in the upper troposphere (Fig. 2a, b, d, e). However, the MME overestimates the strength of the monsoon trough over Indonesia and the upper-level westerlies over the Tibetan Plateau and western Pacific, where the inter-model spread is also large (Fig. 2b, c, e, f). The centers of inter-model spread of vector winds tend to appear in the regions where models show poor performance. A similar conclusion was reported in a previous study in terms of precipitation (Lee et al. 2010).

Fig. 2

Same as in Fig. 1 except for winter (December–January–February)

To quantitatively summarize the performance of CMIP5 models in simulating climatological mean vector winds, Fig. 3 illustrates multiple statistics of modeled vector winds in the VFE diagram. The statistics, i.e., VSC, RMSL, and RMSVD, of the six reanalysis datasets are very close to each other over the A-AM region (blue marks in Fig. 3), indicating that the difference among reanalysis products is fairly small. Thus, the observational uncertainty is negligible in our model evaluations. In the remaining analyses, we take the ensemble mean of the six reanalysis datasets (MRE) as observational data. The MME generally outperforms individual models in the simulation of vector wind climatology in terms of both the amplitude and spatial pattern. The VSCs of the various CMIP5 models range from 0.75 to 0.96, suggesting large differences among CMIP5 models in their ability to reproduce the spatial pattern of vector winds. The normalized RMSLs are generally greater than 1 in spring, autumn, and winter, which indicates that CMIP5 models systematically overestimate the magnitude of 850-hPa vector winds in the A-AM region. Models generally show better statistics in the upper troposphere than in the lower troposphere (Fig. 3). The better performance for upper-level winds is possibly due to the simpler atmospheric circulation patterns there.

Fig. 3

Normalized VFE diagram of climatological mean 850-hPa and 200-hPa vector winds in the Asian-Australian monsoon region (40°E–160°E, 30°S–45°N) for spring, summer, autumn, and winter. The MME and each model are denoted by a green point and red numbers, respectively. The blue marks denote the reanalysis datasets. All data are compared with the reference data (REF), i.e., the ensemble mean of the six reanalysis datasets

To clarify how much of the overall RMSVD is attributable to bias in the mean vector winds and how much is due to poor simulation of the anomaly field, we computed the RMSVD from the mean vector field and the anomaly vector field, respectively. The vector anomaly fields account for roughly 41–100% of the RMSVD of the full vector fields (the proportion for the RMSVD of the mean state is less than 59%), implying that the simulation of the anomaly component is the predominant error source in the climatological means. In other words, the area-mean vector wind is better reproduced by the models than the spatial pattern of vector winds.

Although a smaller RMSVD generally represents better correspondence between model results and observation, the RMSVD does not decrease monotonically as model performance improves (Xu et al. 2017). We therefore computed the model skill scores Sv1 and Sv2, which satisfy the monotonic relationship with model performance. Figure 4 summarizes and ranks model performance in simulating the vector wind climatology at various levels and in various seasons. The MME exhibits the best performance, with the CESM1-CAM5 model and three MPI-ESM models following closely behind. A clear improvement in model skill from the lower level to the upper level is observed (Figs. 3, 4), indicating that it is more challenging to simulate low-level vector winds than upper-level ones.

Fig. 4

Skill score Sv1 (upper left triangle) and Sv2 (lower right triangle) represent the performance of CMIP5 models in reproducing the climatological mean 850-hPa, 500-hPa, 200-hPa vector winds in different seasons

3.2 Annual cycle

The annual cycle of monsoon circulation, characterized by the seasonal reversal of vector winds, is one of the key features of the monsoon climate. As explained in Sect. 2, the model skill score Sv2 is more sensitive than Sv1 to the pattern similarity of vector winds, i.e., to how well the modeled vector winds resemble the observed ones. We therefore computed Sv2 at each grid point from the monthly mean climatology of vector winds of each individual model and the MRE. In this case, the Sv2 at each grid point represents how well the model can reproduce the annual cycle of vector winds. Generally, the annual cycle of vector winds is better captured in the upper troposphere than in the lower troposphere (Fig. 5). The overall Sv2 averaged across the 37 CMIP5 models is generally greater than 0.8 in the areas with strong low-level monsoon circulation, such as the southern Indian Ocean between 10°S and 25°S, the northern Indian Ocean, and the South China Sea (Fig. 5a). Sv2 is greater than 0.9 in the extratropical regions in the upper troposphere. In these regions, models can reproduce the observed annual cycle of vector winds well (Fig. 5b). Sv2 appears to be small in the vicinity of the Tibetan Plateau in the lower troposphere and in the tropical regions in both the lower and upper troposphere, suggesting a relatively larger bias in the modeled annual cycle in these regions. Note that large inter-model dispersion of Sv2 also tends to appear in regions with lower Sv2, e.g., the Tibetan Plateau, Iranian Plateau, and maritime continent in the lower troposphere and the tropics in the upper troposphere. Thus, models have difficulty capturing the annual cycle of vector winds, and show large inter-model spread, around complex terrain in the lower troposphere and over the equatorial Indian Ocean in the lower, middle and upper troposphere. This indicates that the topographic effects on monsoon circulation may not be well captured by the climate models.
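The grid-point calculation described above can be sketched as follows, assuming \({R_0}\) = 1 and twelve monthly-climatology vectors per grid point. The function name `sv2_annual_cycle` and the idealized annual cycle are hypothetical; the example simply compares a reference cycle with a copy that lags it by one month.

```python
import math

def sv2_annual_cycle(model, obs, r0=1.0):
    """Sv2 from 12 monthly-climatology (u, v) vectors at one grid point.

    Undefined (division by zero) if either series is zero in every month.
    """
    dot = sum(m[0] * o[0] + m[1] * o[1] for m, o in zip(model, obs))
    sq_m = sum(u * u + v * v for u, v in model)
    sq_o = sum(u * u + v * v for u, v in obs)
    vsc = dot / math.sqrt(sq_m * sq_o)
    l_m = math.sqrt(sq_m / len(model))
    l_o = math.sqrt(sq_o / len(obs))
    ratio = l_m / l_o + l_o / l_m
    return 4.0 * (1.0 + vsc) ** 4 / (ratio ** 2 * (1.0 + r0) ** 4)

# Idealized reference cycle (m/s) whose wind reverses between seasons.
obs = [
    (8.0 * math.cos(2.0 * math.pi * m / 12), 2.0 * math.sin(2.0 * math.pi * m / 12))
    for m in range(12)
]
late = obs[-1:] + obs[:-1]  # same cycle, lagging the reference by one month

score_same = sv2_annual_cycle(obs, obs)
score_late = sv2_annual_cycle(late, obs)
```

The lagged cycle has exactly the right amplitude, so its score falls below 1 purely because the phase error lowers the VSC. Repeating the calculation at every grid point gives a map of the kind described in the text.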

Fig. 5

Sv2 averaged over 37 CMIP5 models measuring the overall ability of CMIP5 models in simulating the annual cycle of climatological mean vector wind. Contour denotes the standard deviation of Sv2 representing the inter-model spread of CMIP5 models in simulating annual cycle of vector winds

To compare the performance of various CMIP5 models in reproducing the annual cycle of vector winds in the A-AM region, the VFE diagram illustrates multiple statistics calculated with the three-dimensional (time, latitude, longitude) vector winds within the A-AM region (Fig. 6). The statistics measure the overall performance of a climate model in reproducing the annual cycle of vector winds within the A-AM region. The normalized RMSLs in the VFE diagram are generally greater than 1, which indicates that the CMIP5 models generally overestimate the amplitude of the annual cycle of 850-hPa vector winds (Fig. 6). In contrast, the annual cycle of 200-hPa vector winds is well simulated by the CMIP5 models, as shown by the smaller RMSVD for all models. Note that the MME still outperforms any individual model in reproducing the annual cycle of vector winds.

Fig. 6

Same as in Fig. 3 except for annual cycle

By computing the RMSVD of the mean field and the anomaly field of vector winds, we find that the anomaly fields account for 35–81% (typically about 63%) of the errors of vector winds in the A-AM region. The skill scores, Sv1 and Sv2, computed from the three-dimensional vector winds show that the MME still outperforms any individual CMIP5 model in reproducing the annual cycle of vector winds. The comparison of Fig. 4 with Fig. 7 suggests that models that reproduce the climatological means well, e.g., the MME, CESM1-CAM5 and three MPI-ESM models, also show good performance in simulating the annual cycle of vector winds. Sv1 and Sv2 also reveal an improvement of model skill from the lower level to the upper level (Fig. 7), indicating that the models have difficulty simulating the annual cycle of lower tropospheric vector winds.

Fig. 7

Same as in Fig. 4 except for annual cycle of climatological mean vector winds

3.3 Interannual variability

The A-AM climate shows a very large interannual variability. An accurate simulation and prediction of the variability is still one of the most challenging tasks (Sperber and Palmer 1996; Wang et al. 2008; Boo et al. 2011; Song and Zhou 2014). The change of variability is considered more important than the change of mean state to the detection of extreme events and the design of model experiments (Katz and Brown 1992). Besides, the interannual variabilities of precipitation and temperature are closely related to that of the vector winds. Thus, assessing the variation of vector winds should be helpful for understanding the variation of monsoon rainfall and temperature. The amplitude of interannual variability of vector winds is defined as the standard deviation of the vector wind:

$${\sigma _v}=\sqrt {\frac{1}{N}\mathop \sum \limits_{{j=1}}^{N} {{\left( {{{\varvec{V}}_j} - \bar {{\varvec{V}}}} \right)}^2}} =\sqrt {\frac{1}{N}\mathop \sum \limits_{{j=1}}^{N} \left[ {{{({u_j} - \bar {u})}^2}+{{({v_j} - \bar {v})}^2}} \right]} ,$$
(7)

where \({{\varvec{V}}_j}\) and \(\bar {{\varvec{V}}}\) represent the seasonal mean vector wind of the j-th year and the climatological mean vector wind, respectively. N equals 27, the number of years from 1979 to 2005. The standard deviation of the vector wind field measures how far the vector wind fluctuates from its mean vector in terms of both magnitude and direction. The calculation of the vector wind interannual variability resembles that of the inter-model spread, which is defined by the root mean square difference between the MME and individual model results. It should be noted that the interannual variability analyzed here is computed along the time dimension, as the anomaly field is obtained by removing the field's own climatological mean, whereas the spread of models measures the overall difference between individual models and the MME.
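The point that Eq. (7) responds to changes in direction as well as magnitude can be illustrated with a small sketch. The function names and the four-year wind series below are hypothetical: the wind keeps a constant speed of 5 m s−1 while its direction alternates between +30° and −30° from year to year.

```python
import math

def vector_std(series):
    """Temporal standard deviation of a list of yearly (u, v) winds, as in Eq. (7)."""
    n = len(series)
    u_bar = sum(u for u, _ in series) / n
    v_bar = sum(v for _, v in series) / n
    return math.sqrt(sum((u - u_bar) ** 2 + (v - v_bar) ** 2 for u, v in series) / n)

def speed_std(series):
    """Standard deviation of the scalar wind speed, for comparison."""
    speeds = [math.hypot(u, v) for u, v in series]
    mean = sum(speeds) / len(speeds)
    return math.sqrt(sum((s - mean) ** 2 for s in speeds) / len(speeds))

# Hypothetical seasonal-mean winds: constant speed, alternating direction.
series = [
    (5.0 * math.cos(math.radians(angle)), 5.0 * math.sin(math.radians(angle)))
    for angle in (30.0, -30.0, 30.0, -30.0)
]
sigma_vector = vector_std(series)
sigma_speed = speed_std(series)
```

The vector standard deviation is 2.5 m s−1, whereas the standard deviation of wind speed is zero to rounding error, so a speed-based measure would report no interannual variability at all for this series.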

The maximum interannual variability of 850-hPa vector winds in summer occurs on the southern and northern sides of the Western Pacific subtropical high (WPSH) due to the interannual variation of the WPSH (Fig. 8a). Models can capture these variabilities but underestimate the amplitude by approximately 20%. On the other hand, models clearly overestimate the interannual variability of 850-hPa vector winds over the Bay of Bengal and the maritime continent by approximately 30% (Fig. 8a, b). In winter, the MRE shows maximum interannual variability in the tropical zone of 0°–15°S (Fig. 8d). The MME of the standard deviation of vector winds can still reasonably reproduce the overall spatial pattern of the interannual variability of vector winds but overestimates the amplitude over the Indo-China peninsula, the southern edge of the Tibetan Plateau, central China and the northwestern Pacific by approximately 30–50% (Fig. 8d, e). The tropical variability of 850-hPa vector winds in the MME appears to be linked with the Intertropical Convergence Zone (ITCZ) and its seasonal march. For example, similar to the ITCZ, the maximum variability of vector winds also shows a clear northward migration from the Southern Hemisphere in winter to the Northern Hemisphere in summer (Fig. 8a, b, d, e). The inter-model spread for interannual variability is measured by the inter-model standard deviation. In summer, models disagree with each other mostly in the tropical western Pacific as well as in regions with high topography, such as the vicinity of the Tibetan and Iranian plateaus, the Ethiopian Highlands, and the maritime continent (Fig. 8c). In winter, the spatial pattern of the inter-model spread of 850-hPa vector wind variability resembles that in summer, except that it shifts from the northern Indian Ocean-northwestern Pacific to the tropical Indian Ocean-maritime continent (Fig. 8c, f). This indicates the insufficient ability of models to simulate the interannual variability of low-level vector winds in regions with complex terrain.

Fig. 8

Interannual variability of 850-hPa vector winds of the a, d MRE, b, e MME and c, f the inter-model spread of 37 CMIP5 models in summer (JJA) and winter (DJF). The interannual variability is defined by the temporal standard deviation of vector winds over the period of 1979–2005

The 200-hPa vector winds show strong variability over the subtropical westerly zones in both hemispheres in summer, as well as the westerly jet regions and the Australian region in winter (Fig. 9a, d). The MME can successfully capture the spatial pattern of the variability but underestimates the magnitude of both the maximum and minimum (Fig. 9b, e). For example, the MME overestimates the vector wind variability by 40% in the East Asian-Pacific region and fails to capture the minimum over the Indo-Pacific region, overestimating the variability by 10–20% (Fig. 9b). In winter, the MME clearly underestimates the vector wind variability over western Australia by approximately 10–30% (Fig. 9e). The maximum inter-model spread of 200-hPa vector wind variability occurs over the western Indian Ocean and tropical western Pacific region in summer, as well as the equatorial Indian Ocean and maritime continent in winter (Fig. 9c, f).

Fig. 9

Same as in Fig. 8 except for 200-hPa vector winds

VFE diagrams indicate that CMIP5 models can well simulate the spatial pattern of interannual variability of vector winds (Fig. 10). The VSCs are about 0.95 for most models and can approach 0.99 for the MME. However, most models overestimate the strength of interannual variability of 850-hPa vector winds in the A-AM region especially in autumn and winter. Conversely, a few models, e.g., GISS-E2-H, GISS-E2-R, and inmcm4, underestimate the strength of interannual variability of 850-hPa vector winds. In the upper troposphere, CMIP5 models generally exhibit better statistics than the lower troposphere in terms of the interannual variability characterized by closer relationship to the reanalysis in both spatial pattern and magnitude.

Fig. 10

Same as in Fig. 3 except for interannual variability of vector winds

The MME shows the best performance in reproducing the interannual variability of vector winds in both the lower and upper troposphere. This is also confirmed by the ranking of Sv1 and Sv2 (Fig. 11). CMIP5 models generally display better performance in reproducing the interannual variability in the upper troposphere than in the lower troposphere. It is noteworthy that CNRM-CM5, MIROC5 and BCC-CSM1-1 rank as the top three of the 37 CMIP5 models in terms of the interannual variability of vector winds. However, MIROC5 and BCC-CSM1-1 rank outside the top 10 of the 37 models when it comes to the simulation of the climatological mean and annual cycle (Figs. 4, 7). This suggests that model performance in simulating the interannual variability of vector winds may not be directly linked to model ability in simulating the mean state and annual cycle.

Fig. 11

Same as Fig. 4 except for interannual variability of vector winds

4 Interrelationship of model performances in simulating climatological mean, annual cycle and interannual variability

The climatological mean, annual cycle and interannual variability are computed from the same set of data and represent three aspects of the modeled vector winds. These three aspects may or may not be linked to each other. If they are, an improvement in the climatological mean may lead to an improvement in the annual cycle or interannual variability. If they are not, different processes may govern these aspects.

Previous studies demonstrated that the interannual variability of the monsoon in a coupled model is closely correlated with its simulation of mean features (e.g., Fennessy et al. 1994; Sperber and Palmer 1996; Kang et al. 2002; Lee et al. 2010). Similarly, the decadal variability of the wind field can also be well captured by models with an improved mean state of sea surface temperature (Kajtar et al. 2017; McGregor et al. 2018). Moreover, the skill of an individual coupled model in predicting the interannual variability is positively correlated with its performance in predicting the annual cycle (Lee et al. 2010). Investigating the relationship of model performance in simulating the climatological mean, annual cycle, and interannual variability may help to understand the sources of errors in models and guide the improvement of climate models. For example, Ham and Kug (2015) introduced a methodology to improve the simulated interannual variability by correcting the climatological bias based on the relationship between the inter-model diversity of interannual variability and mean state. Correcting the mean bias of a general circulation model (GCM) can also help to improve the downscaled temperature variability to a certain degree (Xu and Yang 2012, 2015). Notably, the aforementioned studies focused on scalar variables of the monsoon, such as sea surface temperature and precipitation. The relationship between model performances in reproducing the climatological mean, annual cycle, and interannual variability has not yet been assessed in terms of vector winds.

Figure 12 shows the correlation coefficients of model skill scores across the 37 models, describing the interrelationship between climatological mean, annual cycle, and interannual variability. Of the 144 correlation coefficients, 51 are significant at the 0.01 level, which suggests that the abilities of climate models in simulating the climatological mean, annual cycle, and interannual variability are linked with each other to a certain degree. The positive correlations of model skill between climatological means and annual cycle stand out from the other comparisons, especially in the whole A-AM region. In contrast, the model performances in reproducing the vector wind climatology and interannual variability are not always well correlated. For example, only 5 of the 48 correlation coefficients between these two aspects are significant at the 0.01 level. Note that the climatological mean of vector winds is better correlated with its annual cycle for all monsoon regions, especially in summer and autumn for the A-AM region, where the significant correlation coefficients range from 0.84 to 0.9. This suggests that if a model simulates the climatological mean state of vector winds well, it can usually also reproduce the annual cycle of vector winds well. We also noted that the model performances in simulating climatological means are closely correlated with each other between different seasons. In other words, if a model shows good performance in simulating the spring climatology of vector winds, it also tends to better reproduce the summer, autumn and winter climatology.
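Each entry of such a correlation matrix can be sketched as a Pearson correlation between two lists of skill scores, one value per model. The example below is illustrative only: the function names and the five pairs of scores are hypothetical (the study uses 37 models), and the significance test actually applied is not specified in the text. The sketch therefore stops at the standard t statistic for a correlation coefficient, which would then be compared with a Student's t distribution with n − 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom (needs |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical skill scores of five models for two aspects of the simulation.
score_cm = [0.60, 0.70, 0.80, 0.90, 0.95]  # climatological mean
score_ac = [0.55, 0.72, 0.78, 0.88, 0.97]  # annual cycle

r = pearson_r(score_cm, score_ac)
t = t_statistic(r, len(score_cm))
```

A strongly positive `r` means that models ranking high on one aspect also tend to rank high on the other, which is how the entries of Fig. 12 are read in the text.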

Fig. 12

Correlation coefficients between model skill scores (average of Sv1 and Sv2) in simulating the climatological means (CM), annual cycle (AC), and interannual variability (IV). The skill scores are computed in the Asian-Australian monsoon region (A-AM: 40°E–160°E, 30°S–45°N), South Asian monsoon region (SAM: 50°E–100°E, 0°–25°N), East Asian monsoon region (EAM: 100°E–140°E, 0°–45°N) and North Australian monsoon region (NAM: 100°E–140°E, 0°–15°S) for different vertical levels and seasons, respectively. The correlation coefficients at the significance level of 0.01 are shown in red font

5 Discussions and conclusions

We assess the performance of CMIP5 models in reproducing the spatial pattern and temporal variations of vector wind fields in the Asian-Australian monsoon region. In our evaluation, the wind field is treated as a two-dimensional vector field, unlike most previous evaluations, in which the wind field was treated as one or two scalar fields. Both wind speed and wind direction are known to be of great importance in shaping regional climate. Therefore, our study is expected to provide a more comprehensive and reasonable evaluation of CMIP5 model performance in reproducing the monsoon circulation.

Our evaluation indicates that CESM1-CAM5 and three MPI-ESM models perform better than other CMIP5 models in reproducing the climatological mean vector winds in the A-AM region (Fig. 4). The MME simulation has much better skill than any individual CMIP5 model in reproducing the climatological mean of vector winds in terms of both pattern similarity and magnitude. However, the MME underestimates the strength of the summertime 850-hPa Somali jet stream and the South Asian monsoon trough (Fig. 1). Remarkable inter-model spread occurs in the South Asian and East Asian monsoon regions between 5–20°N, where precipitation also shows greater model spread. This may suggest that models still lack accuracy in describing the coupling between precipitation and circulation. In addition, regions with complex terrain also show large biases and model dispersion. This is likely due to the limitation of cumulus parameterization schemes in describing the interaction of complex dynamics and sharp moisture gradients associated with complex terrain (e.g., Ghan et al. 2002; Qian et al. 2010; Mehran et al. 2014). To investigate the possible cause of the systematic biases of vector winds, we decompose the vector wind into geostrophic and ageostrophic components. It turns out that CMIP5 models generally overestimate the strength of the geostrophic wind as well, while the ageostrophic components contribute little to the biases. The biases in the geostrophic component result from the models’ deficiency in simulating the gradient of geopotential height (figure not shown). Moreover, we find that the errors in climatological means largely stem from the spatial anomaly bias, whereas the models better capture the area-mean vector. This indicates that the models have difficulty accurately simulating the spatial pattern of vector winds. Note that previous studies reported that the CMIP5 models show a large zonal wind spread in the Indo-China peninsula and a large meridional wind spread in East Asia (Gong et al. 2014). Both inter-model spreads can be identified simultaneously by the vector field evaluation method in our study (Fig. 2b).
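For reference, the decomposition into geostrophic and ageostrophic components mentioned above can be written out as follows. The text does not give the formulas that were used, so the standard pressure-coordinate definitions are assumed here, with \(Z\) the geopotential height, \(g\) the gravitational acceleration, \(f = 2\Omega \sin\varphi\) the Coriolis parameter at latitude \(\varphi\), and \((u, v)\) the zonal and meridional wind components; all of these symbols are introduced for illustration only.

$$\begin{aligned}
u_g &= -\frac{g}{f}\frac{\partial Z}{\partial y}, & v_g &= \frac{g}{f}\frac{\partial Z}{\partial x}, \\
u_a &= u - u_g, & v_a &= v - v_g .
\end{aligned}$$

Under these definitions, a bias in the simulated horizontal gradient of \(Z\) maps directly onto a bias in the geostrophic wind, which is consistent with the attribution given above. Because \(f\) vanishes at the equator, the relation cannot be applied in the deep tropics; how the equatorial band was handled is not specified in the text.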

In terms of the annual cycle of vector winds in the A-AM region, CESM1-CAM5 and the three MPI models still rank at the top of the 37 CMIP5 models. Most CMIP5 models overestimate the annual cycle amplitude of 850-hPa vector winds. The MME still performs better than any individual CMIP5 model. CMIP5 models generally reproduce well the annual cycle of vector winds in the extratropical regions of the upper troposphere and in the strong low-level monsoon circulation. In contrast, models show large biases over complex terrain, such as the vicinities of the Tibetan Plateau and the maritime continent in the lower troposphere, where models also show large inter-model dispersion.

The models reasonably capture the temporal variance of 850-hPa vector winds in the A-AM region, although they overestimate its strength over the Bay of Bengal and the maritime continent in summer, as well as over the Indo-China peninsula, the southern edge of the Tibetan Plateau, central China, and the northwestern Pacific in winter. In tropical regions, the center of vector wind variability is closely related to the ITCZ, indicating a close coupling between wind and precipitation. Models disagree with each other mainly over the northwestern Pacific and regions with high topography. Investigations based on the zonal or meridional wind components failed to capture the large inter-model spread over the equatorial region (Gong et al. 2014), which is recognized as an area with significant climate variability. In contrast, our evaluation using the VFE method captures this inter-model spread (Fig. 5e).

The models generally simulate the climatological means, annual cycle, and interannual variability of vector winds better in the upper troposphere than in the lower troposphere. Models that simulate the vector wind climatology well can usually also reproduce the annual cycle well, and vice versa. Hence, the inherent bias in climate models reflected by mean states may still be the key for model development.

In general, CMIP5 models show large biases and inter-model spread in the vicinity of complex topography, which indicates that the topographic effect may not be well resolved by the models. Therefore, we further investigate the possible relationship between model resolution and model skill by defining an index:

$$\mathrm{Ir}=\frac{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(r_i-r_j\right)\cdot\left(S_{v2,i}-S_{v2,j}\right)}{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left|r_i-r_j\right|},\tag{8}$$

where i and j index different models, N is the number of models involved, and \({S_{v2}}\) and r are the model skill score and the model horizontal resolution, respectively. The model horizontal resolution r is defined by the number of grid cells. Ir tends to be positive if the performance of most models improves as resolution increases, and negative otherwise. Thus, the Ir index measures the overall relationship between model skill and model resolution. We compute Ir separately for the models from each of four modelling centers and for all 37 CMIP5 models. The Ir indices describe the relationship between model resolution and model skill in terms of climatological mean, annual cycle, and interannual variability in different seasons and at different levels (Fig. 13). Clearly, most Ir indices are positive, especially for the climatological mean and annual cycle of vector winds. This suggests that, in a statistical sense, model performance generally improves as model resolution increases. However, it should be emphasized that resolution is only one of the factors affecting model performance and may not be the most important one. Thus, an increase in model resolution does not guarantee better model performance. For example, MPI-ESM-LR (low resolution) and MPI-ESM-MR (medium resolution) show similar skill scores although their resolutions differ.
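The Ir index of Eq. (8) can be sketched in a few lines. The function name `resolution_skill_index` and the grid-cell counts and skill scores below are hypothetical values chosen for illustration, not results from the CMIP5 models.

```python
from itertools import combinations


def resolution_skill_index(resolution, skill):
    """Ir index of Eq. (8) for paired per-model resolutions and skill scores.

    resolution: number of grid cells of each model (r in Eq. 8)
    skill:      skill score of each model (S_v2 in Eq. 8)
    """
    if len(resolution) != len(skill):
        raise ValueError("resolution and skill must have the same length")
    # All unordered model pairs (i, j) with i < j, as in the double sum.
    pairs = list(combinations(range(len(resolution)), 2))
    numerator = sum(
        (resolution[i] - resolution[j]) * (skill[i] - skill[j]) for i, j in pairs
    )
    denominator = sum(abs(resolution[i] - resolution[j]) for i, j in pairs)
    if denominator == 0:
        raise ValueError("all models share one resolution; Ir is undefined")
    return numerator / denominator


# Hypothetical grid-cell counts (longitude x latitude) and skill scores.
grid_cells = [96 * 48, 128 * 64, 192 * 96, 288 * 192]
skill_rising = [0.80, 0.84, 0.88, 0.91]   # skill improves with resolution
skill_falling = [0.91, 0.88, 0.84, 0.80]  # skill degrades with resolution

ir_rising = resolution_skill_index(grid_cells, skill_rising)
ir_falling = resolution_skill_index(grid_cells, skill_falling)
print(f"Ir (skill rises with resolution): {100 * ir_rising:.2f}")
print(f"Ir (skill falls with resolution): {100 * ir_falling:.2f}")
```

Written this way, Ir is a mean of the skill difference between the higher-resolution and lower-resolution model of each pair, weighted by the absolute resolution difference of that pair. Pairs with nearly equal resolution therefore contribute little, whatever their skill difference.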

The results reported in this study may provide useful information for model development, as vector winds play a critical role in the climate system by transporting energy, moisture, momentum, and aerosols from one region to another. In addition, vector wind is one of the most important lateral boundary forcing variables in dynamical downscaling simulations. The mean states and interannual variation of wind in a GCM can significantly affect the downscaled temperature, precipitation, and extreme events (e.g., Xu and Yang 2012, 2015; Bruyère et al. 2013; Xu et al. 2018). Therefore, our evaluation can also guide model selection for dynamical downscaling simulations.

Fig. 13

Ir indices describing the relationship between model resolution and model skill in terms of climatological mean (CM), annual cycle (AC) and interannual variability (IV). Ir is computed with the models from CMCC, Had, IPSL, MIROC modelling centers and with 37 CMIP5 models, respectively. Three columns from left to right within each group of models represent 850, 500 and 200 hPa vector wind fields, respectively. All indices are multiplied by 100