1 Introduction

Since the Coupled Model Intercomparison Project (CMIP) was launched in 1995, coupled ocean-atmosphere general circulation models developed in dozens of research centers around the world (e.g., Table 1) have been compared and analyzed extensively. The program has improved our scientific understanding of the processes of Earth’s climate system and of our simulation capabilities in this field. CMIP also plays an important social role by contributing to the Intergovernmental Panel on Climate Change (IPCC). CMIP phase 3 (CMIP3) provided the scientific base for the Fourth Assessment Report (AR4) of IPCC published in 2007. CMIP phase 5 (CMIP5) was initiated in 2008, and the CMIP5 data are now available for analyses and are expected to provide new insights into our climate for the Fifth Assessment Report (AR5).

Table 1 A list of model names and modeling centers

Previous studies have shown that the performance of climate models in reproducing the present climate has improved from CMIP1 to CMIP3, with more sophisticated physical parameterizations and higher model resolutions (Reichler and Kim 2008). However, a perfect simulation is, of course, impossible, and discrepancies exist between the CMIP3 models and the observations. Of these discrepancies, "the double ITCZ bias" and "the cold tongue bias" are well known (Meehl et al. 2005).

Figure 1 shows the annual means of the observed precipitation (TRMM PR2A25) averaged for 1998–2007, plotted along with contours of the sea surface temperature (SST; HadISST) and the zero lines of the vertical pressure velocity at 500 hPa (ω500 = 0 hPa s−1; JRA), which indicate the boundaries between the large-scale subsiding regions and ascending regions. Very high SST accompanied by heavy precipitation is observed over the warm pool from the Indian Ocean to the western Pacific. From this warm pool, two convergence zones with heavy precipitation, the intertropical convergence zone (ITCZ) and the South Pacific convergence zone (SPCZ), extend eastward and southeastward, respectively. In general, greater amounts of precipitation are observed over oceans with higher SST because higher SST supplies more heat and moisture from the sea surface to the lower troposphere, thus fueling convective activity. However, some exceptions have been identified over subsidence regions such as the southeastern Pacific (indicated by a white box in Fig. 1), where only light precipitation associated with shallow convection is observed despite the relatively high SST. Many recent observation-based studies as well as model studies suggest the importance of environmental air humidity in the dynamical suppression of deep convection (Sherwood 1999; Jensen and Del Genio 2006; Wang et al. 2007; Holloway and Neelin 2009). For example, Takayabu et al. (2010) explained that since the mid-to-lower troposphere is very dry over large-scale subsidence regions, entrainment of dry environmental air into a convective parcel effectively reduces the parcel’s buoyancy, suppressing deep convection.

Fig. 1
Annual mean precipitation (color; units in the color scale below are mm day−1; TRMM PR2A25) and SST (contours; interval is 3 °C; HadISST) averaged from 1998 to 2007. Purple lines are contours of ω500 = 0 hPa s−1 (JRA) indicating the boundaries of the large-scale subsidence regions and ascending regions

The double ITCZ bias is a model’s overestimation of precipitation over the southeastern Pacific, corresponding to a fictitious ITCZ in the southern hemisphere (SH) alongside the observed ITCZ in the northern hemisphere (NH). Since this precipitation overestimation is often accompanied by a positive SST bias, previous studies have examined the possible reasons for the SST bias.

In the real climate, SST over the eastern Pacific is higher in the NH than in the SH corresponding to the convective ITCZ and the suppressed condition, respectively (Fig. 1). A fundamental reason for the north–south asymmetric distribution of SST is considered to be the northwest tilt of the American continent’s coastline (Philander et al. 1996). Ekman transport of ocean water driven by northeasterly surface winds in the NH is directed northwestward and that driven by southeasterly winds in the SH is directed southwestward. Therefore, the Ekman flow is nearly parallel to the coastline in the NH and perpendicular to the coastline in the SH. The resulting upwelling of cold water in the SH causes the asymmetric SST about the equator.

When the north–south SST gradient drives southerly winds across the equator, the Coriolis force acts to deflect the southerlies westward in the SH and eastward in the NH. Superposed on the background easterly trades, evaporation is enhanced in the SH and suppressed in the NH, thereby increasing the SST gradient (Xie and Philander 1994). The southerlies, which drive westward Ekman transport in the SH and eastward transport in the NH, further amplify the SST gradient through upwelling (Chang and Philander 1994). Moreover, radiation associated with clouds also works as a positive feedback. Klein and Hartmann (1993) reported that low-level stratus cloud cover is negatively correlated with SST over the southeastern Pacific. SST cooling increases the atmospheric stability and hence the stratus cloud cover, which reflects solar radiation back to space, thereby decreasing SST further (Philander et al. 1996). On the other hand, in climate models, the cross-equatorial southerly winds over the southeastern Pacific are underestimated and the positive SST bias appears (Mechoso et al. 1995; Xie et al. 2007; de Szoeke and Xie 2008). Moreover, many climate models fail to reproduce low-level stratus cloud in their shallow cloud schemes, which amplifies the SST bias (Bretherton et al. 2004).

However, the SST bias may not fully explain the double ITCZ bias because deep convection behaves differently even with the same SST owing to the dynamical suppression. Zhang et al. (2007) and Chikira (2010) reported that the double ITCZ appears even in atmospheric models with a prescribed SST. Song and Zhang (2009) and Chikira (2010) showed that the double ITCZ bias is mitigated when the convective parameterization scheme is modified. Hirota et al. (2011) examined the CMIP3 models and showed that the sensitivity of the deep convection to environmental humidity in the models with the double ITCZ bias is significantly weaker than that in the models without the double ITCZ. Moreover, they pointed out that the models using convective schemes that are sensitive to environmental air humidity (Nordeng 1994; Tokioka et al. 1988; Emori et al. 2001; Chikira 2010) successfully suppressed deep convection over the dry southeastern Pacific. This is consistent with the observational studies suggesting the importance of dynamical suppression over the southeastern Pacific.

Relatively cold SST with a contour of 26 °C extending west to approximately 130°W at the equator is called a cold tongue (Fig. 1). The cold tongue bias is a model discrepancy of excessive and narrow cold SST that extends too far west into the western Pacific. Recently, Zheng et al. (2012) analyzed the ocean heat budget for the CMIP3 models and showed that overly negative horizontal and vertical heat advection contribute to the cold SST bias. Moreover, they examined equatorial ocean circulations and thermal structures in models and suggested that the errors in horizontal advection are caused by overly strong westward surface currents, and that those in vertical advection are associated with excessive upwelling and vertical temperature gradients over the central to eastern Pacific. Meanwhile, Meehl et al. (2001) showed that models with the cold tongue bias tend to have overly strong easterly trades over the equatorial Pacific. The stronger easterlies cause stronger equatorial upwelling and surface evaporation, which contribute to the cold tongue bias (Mechoso et al. 1995). The enhanced zonal gradient of SST may feed back to the stronger easterlies, which is an important air-sea interaction controlling the cold tongue (Lin 2007). Previous studies also suggested the importance of the ocean model resolution for reducing the cold tongue bias (Imada and Kimoto 2012; Sakamoto et al. 2012). Roberts et al. (2009) showed that representation of the tropical instability waves is improved in models with a higher resolution, and the heat transport by the waves reduces the cold tongue bias. Although there are many studies discussing the double ITCZ and the cold tongue biases as described above, physical factors separating (or combining) these biases are not clearly understood.

In this study, we investigate the precipitation reproducibility in the CMIP5 multi-climate models compared to CMIP3, and discuss physical factors for the reproducibility and the differences. In Sect. 2, we describe the data used in our analyses. In Sect. 3, we examine the precipitation reproducibility of the models and explain how the double ITCZ and the cold tongue biases have changed (or not changed) from CMIP3 to CMIP5. Section 4 presents the discussion, and Sect. 5 provides the summary.

2 Data and method

In this study, we analyze daily output data from the historical experiments of the CMIP5 models and the twentieth-century experiments (20C3M) of the CMIP3 models available at the Program for Climate Model Diagnosis and Inter-comparison (PCMDI; http://www-pcmdi.llnl.gov). Only models with daily data and models not using flux corrections are analyzed. The models are listed in Table 1. The variables used in this study include precipitation, SST, temperature, humidity, and vertical pressure velocity (ω). We also use observed precipitation data from the Tropical Rainfall Measuring Mission Precipitation Radar 2A25 (TRMM PR2A25), SST compiled by the Hadley Center (HadISST), and the Japanese 25-year reanalysis (JRA25; Onogi et al. 2007).

For the HadGEM2-CC model, the inmcm4 model, and all the CMIP3 models, output data of ω are not available. In such cases, ω is diagnosed by integrating the continuity equation assuming ω = 0 at 100 hPa. The calculation’s accuracy is validated using 15 CMIP5 models with ω available and comparing the calculated value with the original model output (not shown). We also diagnose the apparent heat source Q1 defined by Yanai et al. (1973) for each model, and use the radiative heating rate QR calculated from the radiative transfer model using information from TRMM (L’Ecuyer and Stephens 2003, 2007). Q1 − QR represents mainly convective heating and is useful in understanding the vertical structure of the convective activity. Since QR is common to all the models, when comparing Q1 − QR among the models, we are actually comparing only Q1. Daily data are required for the calculation of Q1 and for analyses of its sensitivity to the environmental humidity in Sect. 4. See Hirota et al. (2011) for more information about the calculation of Q1 for the CMIP models.
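The diagnosis of ω from the continuity equation can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not the code used in this study: the function name `diagnose_omega`, the use of a precomputed horizontal divergence as input, and the trapezoidal integration are all choices made here for illustration. It integrates ∂ω/∂p = −∇·V downward from the level where ω = 0.

```python
import numpy as np

def diagnose_omega(divergence, pressure, p_top=10000.0):
    """Integrate the continuity equation d(omega)/dp = -div downward from p_top.

    divergence : horizontal divergence (s^-1) on pressure levels, shape (n_lev,)
    pressure   : pressure levels (Pa), strictly increasing, starting at p_top
    p_top      : level (Pa) where omega = 0 is assumed (100 hPa in the text)
    Returns omega (Pa s^-1) on the same levels.
    """
    divergence = np.asarray(divergence, dtype=float)
    pressure = np.asarray(pressure, dtype=float)
    if not np.isclose(pressure[0], p_top) or np.any(np.diff(pressure) <= 0):
        raise ValueError("pressure must start at p_top and increase downward")

    # Trapezoidal rule (an assumption of this sketch): layer-mean divergence
    # multiplied by the layer thickness in pressure.
    dp = np.diff(pressure)
    layer_mean_div = 0.5 * (divergence[1:] + divergence[:-1])

    omega = np.zeros_like(pressure)
    omega[1:] = -np.cumsum(layer_mean_div * dp)
    return omega
```

For a uniform divergence of 1e-6 s−1 between 100 and 1,000 hPa, this sketch gives ω = −0.09 Pa s−1 at 1,000 hPa, which is simply −∇·V times the 900 hPa depth of the column.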

All the data described above are linearly interpolated onto a 2.5° × 2.5° horizontal grid for comparison. In this study, climatological averages are defined as the average over 1981–2000 for all the models and 1998–2007 for TRMM/HadISST.

3 Precipitation reproducibility in CMIP5 and CMIP3

Figure 2 shows the precipitation distributions, SST, and zero lines of ω500 for each of the seventeen CMIP5 models. While the ITCZ is reproduced fairly well at 10°N in most models, the representations of the SPCZ vary widely among the models. CanESM2, MIROC5, and FGOALS-s2 reproduce the SPCZ extending southeastward from the western Pacific as in the observations. MRI-CGCM3 and inmcm4 overestimate the precipitation over the subsidence region of the southeastern Pacific, resulting in the double ITCZ problem. In GFDL-ESM2G, MPI-ESM-LR, and CSIRO-Mk3-6-0, the cold tongue of the equatorial Pacific extends too far into the western Pacific, whereas the deep convection over the southeastern Pacific is suppressed. Interestingly, these results for CMIP5 suggest that the double ITCZ and the cold tongue bias are not necessarily connected.

Fig. 2
Same as Fig. 1 but for the seventeen CMIP5 models. The SST contour level is 26 °C. Models are shown in the order of decreasing skill score (Fig. 3)

Following Hirota et al. (2011), we calculate the skill score defined by Taylor (2001) to evaluate the reproducibility of the annual mean precipitation distribution over the tropical oceans (30°S–30°N). We use the precipitation data obtained from the TRMM PR2A25 product as reference data. The score is defined as

$$ S\equiv \frac{(1+R)^4}{4\left( SDR + \frac{1}{SDR} \right)^2} $$
(1)

where R is the pattern correlation between the model and the reference data and SDR is the ratio of the spatial standard deviation of the model to that of the reference. Thus, this score quantifies the similarity in both distribution and amplitude between the spatial pattern of the model and that of the observation; it equals 1 for a perfect match (R = 1 and SDR = 1) and approaches 0 as R approaches −1 or as SDR departs far from 1. Figure 3 shows the skill scores of the CMIP5 models and the CMIP3 models. The score for the multi-model ensemble mean (MME) of the 17 CMIP5 models (0.64) is slightly higher than that for the 15 CMIP3 models (0.60), but the difference is not statistically significant compared to the inter-model variance of all the models. We refer to the models with the five highest and five lowest scores as the highest-scoring models (HSMs) and the lowest-scoring models (LSMs), respectively.
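As an illustration of Eq. 1, the score can be sketched in Python as follows. This is a minimal sketch rather than the code used in this study: the function name `taylor_skill_score` and the optional `weights` argument (for example, cos-latitude area weights) are assumptions introduced here, since the text does not state whether the pattern statistics are area weighted.

```python
import numpy as np

def taylor_skill_score(model, reference, weights=None):
    """Skill score S = (1 + R)^4 / (4 (SDR + 1/SDR)^2) from Eq. 1.

    model, reference : arrays of the same shape (e.g. ocean grid points only)
    weights          : optional weights of the same shape (hypothetical option)
    """
    model = np.asarray(model, dtype=float).ravel()
    reference = np.asarray(reference, dtype=float).ravel()
    if weights is None:
        w = np.ones_like(model)
    else:
        w = np.asarray(weights, dtype=float).ravel()
    w = w / w.sum()

    # Weighted spatial anomalies and standard deviations
    m_anom = model - np.sum(w * model)
    r_anom = reference - np.sum(w * reference)
    sd_model = np.sqrt(np.sum(w * m_anom**2))
    sd_ref = np.sqrt(np.sum(w * r_anom**2))

    pattern_corr = np.sum(w * m_anom * r_anom) / (sd_model * sd_ref)  # R
    sdr = sd_model / sd_ref                                           # SDR
    return (1.0 + pattern_corr) ** 4 / (4.0 * (sdr + 1.0 / sdr) ** 2)
```

A model field identical to the reference gives S = 1, whereas a field with the correct pattern but twice the amplitude (R = 1, SDR = 2) gives S = 16/25 = 0.64, which shows how strongly the score penalizes amplitude errors.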

Fig. 3
Skill scores (Taylor 2001) of precipitation distributions over the tropical oceans (30°S–30°N) averaged from 1981 to 2000 for (a) the CMIP5 models and (b) the CMIP3 models. The reference is the precipitation observed by TRMM PR2A25 during 1998–2007. d and c denote models with the 5 highest DI and the 5 lowest CT, respectively (see text for detail)

The CMIP5 models generally have higher horizontal resolution than the CMIP3 models, which may cause some difference in the Taylor skill score. To check this, we calculate the skill scores for the precipitation distribution smoothed by a spectral filter with truncation wave number of 21 (T21 ∼ 5.6°), which is about the lowest model resolution. The average scores for the smoothed precipitation are 0.71 in CMIP5 and 0.68 in CMIP3. The scores for the smoothed precipitation are higher than those for the unsmoothed precipitation because errors in the smaller scales are filtered out. The higher scores for the smoothed precipitation in CMIP5 than in CMIP3 imply improvements in the large-scale precipitation pattern.

Taylor skill scores for the precipitation reproducibility over different ocean basins of the tropics (30°S–30°N) are also investigated. Correlation coefficients between the scores over the entire tropics (0°–360°E) and those over the Indian Ocean (30°–110°E), the Pacific Ocean (110°E–70°W), and the Atlantic Ocean (70°W–30°E) are 0.12, 0.94, and 0.57, respectively. Thus, the scores over the entire tropics mainly correspond to those over the Pacific. Interestingly, the scores over the Indian Ocean are not correlated with those over the other ocean basins (−0.12 with the Pacific and −0.09 with the Atlantic). The differences among the ocean basins will be examined in the future, and this study focuses on the Pacific.

The precipitation distributions of the HSMs and the LSMs for CMIP5 and CMIP3 are shown in Figs. 4 and 5, respectively. As shown in Hirota et al. (2011), one of the most prominent differences between the HSMs and the LSMs for CMIP3 corresponds to the severity of the double ITCZ problem: a larger precipitation overestimate over the southeastern Pacific in the LSMs than in the HSMs. In contrast, for CMIP5, the precipitation over the southeastern Pacific is similar in the HSMs and LSMs, but an extended cold tongue bias at the equatorial Pacific is identified in the LSMs.

Fig. 4
Same as Fig. 1 but for composites of (a) the HSMs and (b) the LSMs of CMIP5

Fig. 5
Same as Fig. 1 but for composites of (a) the HSMs and (b) the LSMs of CMIP3

To compare the double ITCZ and the cold tongue biases more directly with the observations, we use two indices: a double ITCZ index (hereafter referred to as DI) defined as the average precipitation over the southeastern Pacific (150°–100°W, 20°–0°S; indicated by the white box in Fig. 1; Bellucci et al. 2010), and a cold tongue index (CT) defined as the SST anomaly over the central equatorial Pacific (180°E–150°W, 2.5°S–2.5°N) relative to that over the entire tropical oceans (0–360°E, 30°S–30°N). DI and CT for the CMIP5 and CMIP3 models are plotted in Fig. 6. Most of the CMIP5 and CMIP3 models have larger DI and smaller CT than the observations, indicating the double ITCZ bias and the cold tongue bias, respectively. Precipitation distributions for the CMIP5 models with the 5 highest (poor) and lowest (good) DI as well as with the 5 highest (good) and lowest (poor) CT are shown in Figs. 7 and 8. The poor DI models show a large precipitation overestimate over the southeastern Pacific, and the poor CT models have a cold SST bias over the equatorial Pacific. These selected poor models are marked in Fig. 3. They are generally associated with lower skill scores.
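The two indices can be sketched in Python as follows. This is an illustrative sketch under several assumptions that are not stated in the text: fields are on a regular latitude-longitude grid with longitudes in 0–360°E, land points are stored as NaN, and box averages are weighted by the cosine of latitude. The function names `box_mean`, `double_itcz_index`, and `cold_tongue_index` are hypothetical.

```python
import numpy as np

def box_mean(field, lat, lon, lat_range, lon_range):
    """Cos-latitude weighted mean of field (n_lat, n_lon) over a box.

    NaN points (assumed here to mark land) are excluded from the average.
    """
    lat_sel = (lat >= lat_range[0]) & (lat <= lat_range[1])
    lon_sel = (lon >= lon_range[0]) & (lon <= lon_range[1])
    sub = field[np.ix_(lat_sel, lon_sel)]
    w = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(lon_sel.sum())
    w = np.where(np.isnan(sub), 0.0, w)
    return np.nansum(sub * w) / w.sum()

def double_itcz_index(precip, lat, lon):
    """DI: mean precipitation over 150-100W (210-260E), 20S-0."""
    return box_mean(precip, lat, lon, (-20.0, 0.0), (210.0, 260.0))

def cold_tongue_index(sst, lat, lon):
    """CT: SST over 180E-150W (180-210E), 2.5S-2.5N minus the tropical mean."""
    central = box_mean(sst, lat, lon, (-2.5, 2.5), (180.0, 210.0))
    tropics = box_mean(sst, lat, lon, (-30.0, 30.0), (0.0, 360.0))
    return central - tropics
```

With these definitions, a larger DI means more precipitation in the southeastern Pacific box, and a smaller CT means a central equatorial Pacific that is colder relative to the tropical mean.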

Fig. 6
Inter-model relationship between CT (ordinate) and DI (abscissa) among (a) the CMIP5 models and (b) the CMIP3 models. TRMM/HadISST, JRA, HSM, LSM, MME are also shown. Red and blue letters indicate models of the HSMs and the LSMs, respectively

Fig. 7
Same as Fig. 1 but for composites of the models with (a) the 5 highest (poor) DI models and (b) the 5 lowest (good) DI models for CMIP5

Fig. 8
Same as Fig. 1 but for composites of the models with (a) the 5 highest (good) CT models and (b) the 5 lowest (poor) CT models for CMIP5

DI and CT are also calculated for the HSMs and the LSMs for CMIP5 and CMIP3 ("H" and "L" in Fig. 6). The HSMs and the LSMs of CMIP5 have similar DI values, but their CT values are significantly different (statistical significance is tested using the inter-model variance of all the models); the HSMs have relatively larger values close to the observations, while the LSMs have smaller values (Fig. 6a). The values of CT from the HSMs and the LSMs of CMIP3 are similar, while the DI value of the LSMs is significantly larger than those of the observations and the HSMs (Fig. 6b). Moreover, DI and CT for the MME of CMIP3 and CMIP5 show that DI is slightly improved (2.9 mm/day for CMIP3 → 2.6 mm/day for CMIP5) and CT remains similar (1.3 → 1.3 °C). Therefore, the number of models with a very large double ITCZ bias is reduced, whereas the cold tongue bias remains similar in CMIP5. Thus, it appears that the cold tongue bias stands out in the LSMs of the CMIP5 models.

The same conclusion is obtained even when we analyze only models that can be traced from CMIP3 to CMIP5. Table 2 shows the Taylor skill score, DI, and CT for these models. The average of the DI is slightly improved from 2.64 to 2.39 mm/day, while the skill score and CT remain similar. Even for the DI values, some models are degraded (e.g., CSIRO-Mk3-6-0 and GFDL-ESM2M). At least, we can say that the number of improved models exceeds the number of degraded models in terms of DI.

Table 2 The Taylor skill score (S), DI, and CT for models which can be traced from CMIP3 to CMIP5

Next, we perform an empirical orthogonal function (EOF) analysis on the inter-model correlation matrix of the climatological precipitation distribution over the tropical oceans (30°S–30°N) for CMIP5 to examine the relationship of the dominant inter-model discrepancies with DI and CT. The first and second modes explain 24 and 16 % of the total correlation, respectively, and are clearly separated from the higher modes (North et al. 1982). The first two eigen vectors are shown in Fig. 9. The first mode shows a zonally elongated positive-negative-positive structure over the tropical Pacific. The second mode shows a similar positive-negative-positive structure in the western Pacific, but the positive anomaly around 7°N is narrower. The negative anomaly is distributed more widely along the equator and spreads to the southeastern Pacific.

Fig. 9
(a) The first and (b) the second mode of the inter-model EOF analysis for climatological precipitation distributions in CMIP5 (mm day−1)

Then we calculate the principal components (PC) defined as the inner products of the eigen vectors and the precipitation anomaly of each model from the MME. The PC1 is significantly correlated with DI (the correlation coefficient is 0.67) but not significantly with CT (−0.16), whereas the PC2 is negatively correlated with DI (−0.56) and CT (−0.70). A larger PC1 implies a larger double ITCZ bias, and a larger PC2 suggests a worse cold tongue bias together with a smaller double ITCZ bias. Therefore, the severity of the double ITCZ and cold tongue biases is closely related to the dominant inter-model discrepancies.
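The inter-model EOF analysis and the projection onto the eigen vectors can be sketched as follows. This is one possible reading of the procedure and not the code used in this study. It assumes that the models play the role that time steps play in a conventional EOF analysis, that the "correlation matrix" is obtained by standardizing the anomalies at each grid point across models, and that no area weighting is applied; the function name `intermodel_eof` is hypothetical.

```python
import numpy as np

def intermodel_eof(precip, n_modes=2):
    """Inter-model EOF of climatological precipitation.

    precip : array (n_models, n_points), ocean grid points only
    Returns (eofs, pcs, explained_fraction) for the leading n_modes.
    """
    precip = np.asarray(precip, dtype=float)

    # Anomaly of each model from the multi-model ensemble mean (MME)
    anom = precip - precip.mean(axis=0)

    # Standardize across models at each grid point (correlation-matrix EOF;
    # an assumption of this sketch)
    std = anom.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    z = anom / std

    # The right singular vectors of z are the eigen vectors of the
    # point-by-point correlation matrix.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    eofs = vt[:n_modes]

    # PCs: inner product of each eigen vector with the model-minus-MME anomaly
    pcs = anom @ eofs.T
    return eofs, pcs, explained[:n_modes]
```

The resulting `pcs[:, 0]` and `pcs[:, 1]` hold one value per model and could then be correlated with DI and CT, for example with `np.corrcoef`.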

Finally, we perform an EOF analysis using the CMIP3 models and obtain very similar results. The inter-model correlation between PC1 defined using the eigen vectors for CMIP3 and CMIP5 is 0.92 (0.90 for PC2). However, the total correlation explained is 33 % (19 %), which is higher than that defined by CMIP5. This means that while the basic structure of the inter-model discrepancies is similar, the relative contribution has changed from CMIP3 to CMIP5. The larger contribution of the first mode in CMIP3 is consistent with the result that the major difference between the LSMs and HSMs for CMIP3 is the double ITCZ.

4 Discussion

Following Hirota et al. (2011), we analyze the sensitivity of the deep convection to changes in environmental air humidity to understand why the double ITCZ bias appears. Figure 10 shows the daily Q1 − QR heating profiles stratified against the relative humidity at 600 hPa (RH600) for the CMIP5 models with the 5 highest (poor) and lowest (good) DI over the tropical oceans (30°S–30°N). In both the poor and good DI models, deep convection is active when RH600 > 50 % and is mostly suppressed for RH600 < 50 %. This sensitivity of the deep convection to humidity is qualitatively similar, but differences appear when we subtract the poor DI models from the good DI models. A negative anomaly of the deep heating is identified in the profiles for RH600 < 40 % and a positive anomaly appears for RH600 > 60 %. Thus, the sensitivity of the deep convection to humidity is stronger in the good DI models (Fig. 10c). These anomalies become strongly significant when we analyze the sensitivity of the heating profiles just over the oceans with surface temperature near 26 °C (Fig. 10d), which corresponds to the SST over the southeastern Pacific. Therefore, over the dry subsidence region of the southeastern Pacific, the deep convection in the good DI models is suppressed because of its strong sensitivity to humidity, whereas that of the poor DI models is not suppressed, forming the double ITCZ bias. This situation is identical to that described by Hirota et al. (2011) for the CMIP3 models with the double ITCZ bias. Note that the positive anomaly of mid-level heating (1,000–600 hPa) for RH600 = 40–60 % indicates a slower transition of the mid-height convection to the deep convection with increasing environmental air humidity. Further analysis is needed to understand the roles of the mid-height convection.
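The stratification of heating profiles against RH600 can be sketched in Python as follows. This is a minimal sketch, not the code used in this study: the function name `stratify_by_rh` and the 10 % bin width are assumptions, and the daily Q1 − QR profiles and RH600 values are assumed to have already been collected into flat arrays over all ocean grid points and days.

```python
import numpy as np

def stratify_by_rh(heating, rh600, bin_edges=np.arange(0, 101, 10)):
    """Composite heating profiles in bins of relative humidity at 600 hPa.

    heating   : array (n_samples, n_levels) of daily Q1 - QR (K/day)
    rh600     : array (n_samples,) of relative humidity at 600 hPa (%)
    bin_edges : RH bin edges in % (10 % bins are an assumption of this sketch)
    Returns an array (n_bins, n_levels); empty bins are filled with NaN.
    """
    heating = np.asarray(heating, dtype=float)
    rh600 = np.asarray(rh600, dtype=float)
    n_bins = len(bin_edges) - 1

    # Bin index of each sample; RH exactly at the top edge goes to the last bin
    idx = np.clip(np.digitize(rh600, bin_edges) - 1, 0, n_bins - 1)

    composite = np.full((n_bins, heating.shape[1]), np.nan)
    for b in range(n_bins):
        sel = idx == b
        if sel.any():
            composite[b] = heating[sel].mean(axis=0)
    return composite
```

A difference field of the kind shown in Fig. 10c would then correspond to subtracting the composite for the poor DI models from that for the good DI models, bin by bin and level by level.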

Fig. 10
Q1 − QR (contour, K day−1) profiles stratified against relative humidity at 600 hPa (abscissa, %) for (a) the poor DI models, (b) the good DI models, and (c) their difference over the entire tropical oceans (30°S–30°N) for 1981–2000. (d) shows the difference but over oceans with SST near 26 °C instead of the entire tropical oceans. Negative values are indicated by dashed lines. Contour intervals are 1 in (a, b) and 0.2 in (c, d). Light and dark shadings in (c, d) denote significance levels of 90 and 95 %, respectively

As described in Sect. 3, DI is slightly improved from CMIP3 to CMIP5. We compare the sensitivity of deep convection to environmental humidity for CMIP5 and CMIP3 (Fig. 11). The anomalies of CMIP5 minus CMIP3 are significantly negative for RH600 < 40 % and positive for RH600 > 60 %, showing a stronger sensitivity in CMIP5. This is consistent with the stronger suppression of convection over the southeastern Pacific in the CMIP5 models.

Fig. 11
Same as Fig. 10c but for the difference of CMIP5 minus CMIP3

It is worth discussing differences between models developed at the same institute. MPI-ESM-LR is an updated version of MPI_ECHAM5. Although both versions use the Tiedtke (1989) cumulus parameterization scheme modified by Nordeng (1994), some parameters in the scheme have been modified (Mauritsen et al. 2012). Our analysis indicates that the sensitivity of deep convection to humidity is increased (not shown). This is consistent with convection being more suppressed over the southeastern Pacific in MPI-ESM-LR (DI = 2.1 mm/day) than in MPI_ECHAM5 (DI = 2.7 mm/day). FGOALS-s2 (DI = 2.52 mm/day) also shows an improvement over IAP (DI = 2.9 mm/day). Its convective parameterization was changed from the Zhang and McFarlane (1995) scheme to the Tiedtke (1989) and Nordeng (1994) scheme, in which deep convection requires large-scale moisture convergence over the depth of the clouds. The sensitivity is again increased in our analysis (not shown). The original Zhang and McFarlane (1995) scheme is known to have a large double ITCZ bias (Zhang and Song 2010). The replacement of the convective scheme is likely to contribute to the improvement in DI. MIROC5 also replaced its convective scheme, from Arakawa and Schubert (1974) with the empirical suppression of Emori et al. (2001) to Chikira and Sugiyama (2010) with a large entrainment rate in the lower-to-mid troposphere. Since DI is already small in the older versions (MIROC_M, MIROC_H, MIROC-ESM, MIROC-ESM-CHEM and MIROC4h) with the empirical suppression, the DI value of MIROC5 is not much changed (Fig. 6). However, the reproducibility of the precipitation distribution measured by the Taylor skill score is greatly improved, as shown in Fig. 3 (see Chikira and Sugiyama 2010 for details). The convective schemes in CSIRO-Mk3-6-0 (Rotstayn et al. 2012), CNRM-CM5 (Voldoire et al. 2013), and GFDL-ESM2 (Dunne et al. 2012) are not modified from their older versions, but the DI of CNRM-CM5 and GFDL-ESM2G is improved whereas that of CSIRO-Mk3-6-0 and GFDL-ESM2M is degraded. These results suggest that some other factors also influence the double ITCZ bias.

Next, the cold tongue bias is discussed. CT for the CMIP5 and CMIP3 models are plotted against the resolution of the ocean models (Fig. 12). Here, we define the ocean resolution as

$$ \sqrt{(360^{\circ} \times 180^{\circ})/(\hbox{number of horizontal grid points})}. $$

Fig. 12

Inter-model relationship between CT (abscissa; °C) and the horizontal resolution of the ocean models (ordinate; degree) for the CMIP5 (blue letters) and CMIP3 (red letters)
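The ocean resolution measure defined above reduces to a one-line calculation, sketched here in Python. The function name `ocean_resolution` is introduced only for illustration; as in the definition, the measure treats the grid as globally uniform.

```python
import math

def ocean_resolution(n_grid_points):
    """Effective horizontal resolution (degrees) of an ocean model grid.

    n_grid_points : total number of horizontal grid points of the ocean model
    """
    return math.sqrt(360.0 * 180.0 / n_grid_points)
```

For example, a grid of 360 × 180 points corresponds to 1.0°, and a grid of 144 × 72 points corresponds to 2.5°.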

Although some models have different resolutions at low and high latitudes, we ignore the inhomogeneity of the ocean resolution because the exact positions of the model native grids are not available for the CMIP3 models. We expect that the inhomogeneity does not influence our result because the correlation between the resolution over the tropics and that over the entire globe is very high (0.91) in CMIP5 (not shown). It seems that CT is generally higher (good) with higher resolution except for the two lowest resolution models, GISS_AOM and GISS_E_R. Although GISS_AOM and GISS_E_R have better CT values, their precipitation distribution is not well reproduced, with a very poor representation of the SPCZ (see Fig. 2 and Fig. 3 in Hirota et al. 2011), and their ocean circulations are unrealistic (Zheng et al. 2012). Thus, the two exceptions do not suggest that better simulations of the tropics are possible with lower resolution. Excluding these two models, the correlation coefficient between CT and ocean model resolution is −0.54. This relationship can also be identified in models from the same institute: CT of the CNRM models is improved from 1.2 to 1.7 °C as the resolution is increased from 1.5° to 0.8° (b1, b2 in Fig. 12); the lower resolution models of MIROC_M, MIROC-ESM, MIROC-ESM-CHEM, and MIROC5 have lower (poor) CT values compared to the higher resolution models of MIROC_H and MIROC4h (i1-i6). As described in Sect. 1, the representation of tropical instability waves (Roberts et al. 2009; Imada and Kimoto 2012) as well as circulations and thermal structures of the equatorial Pacific Ocean (Zheng et al. 2012) are known to affect the cold tongue bias. These oceanic features may be better represented in the higher resolution models. We also examine the trade winds over the equatorial Pacific for the 4 good CT models (excluding FGOALS-g2, whose surface wind is not available) and the 5 poor CT models from CMIP5, shown in Fig. 13 (similar results are also obtained for CMIP3; not shown). Consistent with the previous studies (e.g. Meehl et al. 2001), the lowest CT models show significantly stronger trades. Stronger trades enhance upwelling and evaporation over the central equatorial Pacific, and associated feedback processes may contribute to the cold tongue bias (Mechoso et al. 1995; Lin 2007).

Fig. 13
Zonal wind (m s−1) at the surface for JRA (black lines), the good CT models (red lines), and the poor CT models (blue lines) averaged over the equatorial Pacific (2.5°S–2.5°N)

Comparing the CMIP5 and CMIP3 MME, we have identified significant improvements in the model resolution (1.7° → 1.3°) but not in the CT values (Fig. 6) or the trade winds over the equatorial Pacific (not shown). Due to the large inter-model spread of CT (Fig. 12), discussing differences of the cold tongue bias between CMIP5 and CMIP3 is difficult. Moreover, the correlation between CT and the resolution is significant but not so high (−0.54), and this relationship does not hold for some models: CT values are degraded in models from CSIRO (c1–c3 in Fig. 12), GFDL (d1–d4), and MPI-ESM-LR (j1, j2) even though their resolution is not reduced. We are currently analyzing results from numerical experiments in which only the ocean resolution differs, to isolate the impacts of the resolution and to understand which factor is most dominant for the bias.

5 Summary

The reproducibility of the precipitation distribution over the tropical oceans for CMIP5 is investigated and compared to that for CMIP3. The Taylor skill score for the reproducibility of the CMIP5 MME (0.64) is slightly higher than that of CMIP3 (0.60), but the difference is not statistically significant. The inter-model EOF analysis indicates that the two dominant discrepancies of precipitation distribution among the CMIP5 models are closely related to the severity of the double ITCZ and the cold tongue biases. The first mode is positively correlated with the double ITCZ index (DI; correlation coefficient = 0.67), whereas the second mode is negatively correlated with DI (−0.56) and the cold tongue index (CT; −0.70). The eigen vectors for the inter-model EOF of the CMIP3 models are very similar to those of the CMIP5 models, but the total correlation explained is different. These results suggest that the basic structures of the inter-model discrepancies associated with the double ITCZ and the cold tongue biases are similar, but their relative contribution has changed from CMIP3 to CMIP5. Comparison of DI and CT for CMIP3 and CMIP5 indicates that the number of models with a very large double ITCZ bias is reduced, whereas the cold tongue bias remains similar.

As in Hirota et al. (2011) for CMIP3, the difference between models with and without the double ITCZ (the highest and the lowest DI models) is consistently explained by the sensitivity of the deep convection to relative humidity in the lower-to-mid troposphere. The sensitivity of convection to humidity is significantly weaker in the models with the double ITCZ; thus, the deep convection is more active in those models even over the dry subsidence region of the southeastern Pacific. In the models without the double ITCZ, the deep convection is successfully suppressed by entrainment of dry environmental air as in the observations. In contrast, the severity of the cold tongue bias is significantly correlated with the resolution of the ocean models. Moreover, the models with a severe cold tongue bias show overly strong trade winds over the equatorial Pacific. Higher resolution models probably simulate better ocean circulations and thermal structures as well as associated air-sea interactions over the equatorial Pacific.

Based on the statistically significant evidence provided in Sects. 3 and 4, we conclude that the sensitivity of deep convection to environmental humidity and a higher resolution of the ocean models with better equatorial trades would be important factors in reducing the double ITCZ bias and the cold tongue bias, respectively. However, it should be noted that there are many exceptions as discussed in Sect. 4. Because so many factors differ from one model to another, it is difficult to isolate and investigate the impact of a single physical factor by comparing the CMIP models. To further understand the reasons for the biases, complementary studies including sensitivity experiments with only one physical factor changed are necessary. We are currently investigating the impacts of modifications in convective schemes and model resolution through numerical experiments with MIROC5.