1 Introduction

The frequency and intensity of extreme weather events have increased in the past several decades, resulting in severe crop yield loss and farmer income reduction (Godfray et al. 2010; Lobell et al. 2011; Tao et al. 2013). Among such events, rainfall-related events—deficit rainfall and excessive rainfall—have showed harsh effects on crop production in China (Tao et al. 2003; Zhang, Chen et al. 2014; Zhang, Wang et al. 2014; Zhang et al. 2016). Drought, a typical agro-meteorological disaster caused by deficit rainfall, is the most frequent disaster for maize and wheat (frequency during 1991–2009: 48.6% and 79.2%) in China, followed by heavy rainfall (frequency during 1991–2009: 15.8% and 6.0%) (Zhang, Chen et al. 2014; Zhang, Wang et al. 2014). For rice, the area affected by drought has increased, since rainfall over land has decreased marginally and evaporation has increased due to warmer conditions (Tao et al. 2013; Ding et al. 2020). Previous studies have demonstrated that yield loss caused by rainfall-related disasters has been and is expected to be more frequent and more severe than that caused by other disasters (Tao et al. 2013; Piao et al. 2010; Zhang, Chen et al. 2014; Zhang, Wang et al. 2014). One potential financial adaptation to these extreme events is agricultural insurance, which has become a prominent issue in many countries and plays a vital role in transferring various weather risks.

Unfortunately, due to the expensive insurance premiums (World Bank 2007; Okhrin et al. 2013; Tuo 2016), the traditional farm-level agricultural insurances typically result in either low participation rates or high premium subsidies (Diaz-Caneja et al. 2009). The insurance literature indicates two main reasons for this fact. One is the adverse selection and moral hazards of farmers, which increase the transaction costs and deductibles (Hazell 1992; Goodwin 2001; Collier et al. 2009). The other is the system risk of insurance, which implies that risk exposure units are not independent, so the insurance may fail to transfer risks among farmers (Miranda and Glauber 1997; Cummins and Trainar 2009). Based on a simple observable parameter that is highly correlated with losses, weather index insurance (WII) is regarded as an effective tool to mitigate the first problem (Skees et al. 1997; Mahul 1999; Martin et al. 2001; Vedenov and Barnett 2004; Barnett and Mahul 2007; Zhang et al. 2017). The second problem also can be avoided because the precondition of index insurance is that risks of agricultural loss are somewhat spatially correlated (Goodwin 2001; Ibarra and Skees 2007; Barnett et al. 2008; Okhrin et al. 2013). Skees and Barnett (1999) defined yield loss risks as “in-between” risks, which are not completely and spatially correlated. The “in-between” risks are sufficiently covariated to apply one weather index within a region and to transfer risks among different regions. However, such “in-between” risks are also sufficiently idiosyncratic to cause basis risk. The basis risk of agricultural insurance is generally regarded as a mismatch between famers’ compensation and loss. Fortunately, a weather index highly associated with yield loss represents low basis risk (Ibarra and Skees 2007; Barnett et al. 2008). Although WII has been attracting attention in many countries, very few countries have focused on the performance of different weather indices across the whole nation, which restricts WII implementation and the adaptations to risk transfer and risk diversification.

Previous studies have indicated that spatial correlations between yield losses quickly decline as farm distances increase to hundreds of miles (Goodwin 2001; Wang and Zhang 2003; Okhrin et al. 2013). Thus in this study, we calculated the weather index–yield loss correlation at the county scale, which ranges from dozens of miles to hundreds of miles in China (Verburg and Chen 2000). A county can be defined as an “insurable county” when the weather index and yield loss correlation is significant at a given level. Based on insurable counties, a suitable weather index can generally be characterized by four factors: a large number of insurable counties, independent loss risk, effective risk pooling, and low basis risk. The more insurable counties identified by one weather index, the more likely the insurance premium could be calculated according to the statistical law of large numbers (Priest 1996; Wang and Zhang 2003). The independent loss risks could minimize the ruin probability in a given time period, and vice versa, according to the theory of risk pooling (Smith and Kane 1994; Wang and Zhang 2003). Finally, if one weather index cannot effectively pool risks within a region, the insurance market needs to diversify its products for transferring risks.

Our aim in the present study was to investigate which weather index can be used where at a regional scale for three main crops—maize, rice, and wheat—in China. Section 2 introduces the main cropping zones, the weather indices for the three main crops, and the analysis methods. Section 3 identifies which counties can be insured, and then conducts the analysis on basis risk, independent insurable counties, and effective risk pooling. Section 4 further discusses the innovativeness of this study and the differences between the weather indices.

2 Data and Methods

We first introduce the study areas and data for the three main crops in the present study, then describe the method for calculating the correlation between yield loss and weather index to determine insurable counties, and finally present the analysis methods for insurable counties.

2.1 Study Area and Data Sources

Based on the Chinese Landuse Dataset (Liu et al. 2003), counties with cropping areas greater than 10% were selected in five main agricultural zones for each crop (Table 1 and Fig. 1). In total, we selected 1091, 1385, and 1245 counties for maize, rice, and wheat, respectively. The total studied areas covered more than 90% of the total cropping area in China for each crop. The weather indices in the present study were based on daily rainfall data during the crop growth period (see details in Sect. 2.2.2). National standard stations (NSS) of the China Meteorological Administration (CMA)Footnote 1 provided coordinates of NSSs and daily weather data such as rainfall during 1980–2008. In total, 243, 310, and 224 NSSs were used to calculate weather indices for maize, rice, and wheat, respectively (Fig. 1). The phenology records of maize, rice, and wheat at 179, 354, and 205 agro-meteorological experimental stations in the study areas were also from CMA (Fig. 1). The records include dates of major phenological events such as planting, transplanting, flowering, and maturity from 1980 to 2008. Rainfall and phenology data for each county were from the nearest station. Yield loss in the present study was calculated from a county-level yield dataset (1980–2008) based on the Agricultural Yearbook of each county (published annually by the China Agriculture Press in Beijing). All these datasets have been used in previous studies (Liu et al. 2013; Wang et al. 2014; Zhang et al. 2016).

Table 1 Main agricultural zones of three crops in China
Fig. 1
figure 1

The location of the study region in China, national standard stations (NSS), and agro-meteorological stations of a maize, b rice, and c wheat. The roman numerals refer to the agricultural zones listed in Table 1

2.2 Determining Insurable Counties

When considering possible insurable counties for deficit rain and excessive rain, we first calculated weather-related yield loss for each county, then introduced the weather indices, and finally determined the insurable counties by analyzing the relationship between weather-related yield loss and weather index.

2.2.1 Weather-Related Yield Loss

The time series of yield consists of a trend and a non-trend component (Ker and Coble 2003; Coble et al. 2010; Wang et al. 2014). The trend yield (\(Y_{T}\)) is caused by changes in technological progress (for example, fertilization and irrigation) and long-term environmental impact (Ker and Coble 2003; Lobell and Field 2007). The non-trend yield (\(Y_{loss}\)) results from weather conditions and disasters and can be extracted by Eq. 1. Seven trend models are most frequently used to fit \(Y_{T}\)—linear, loglinear, moving average, exponential smoothing, Savitzky–Golay smoother, autoregressive integrated moving average (ARIMA), and robust locally weighted regression (Rlowess). Ye et al. (2015) found that the moving average model was the best trend model for crop yield assessment and insurance pricing after deleting outliers. Therefore, we defined outliers as the data outside the range of mean \(\pm 2\;{\text{SD}}\) (standard deviation) (Zhang, Chen et al. 2014; Zhang, Wang et al. 2014) and removed them from the present study. \(Y_{T}\) was calculated as the unweighted mean of the moving window centered on each year.

$$Y_{loss} = \frac{{Y_{T} - Y_{A} }}{{Y_{T} }}\,100\%$$
(1)

where \(Y_{loss}\) is the weather-related yield loss (%), and \({Y}_{A}\) and \({Y}_{T}\) are the actual yield and technological yield, respectively.

2.2.2 Weather Indices

We selected two types of indices to characterize deficit rain and excessive rain during the crop growth period—the anomaly-based index (AI) and the humidity-based index (HI) (Keyantash and Dracup 2002; Tank and Konnen 2003; Alexander et al. 2006; AQSIQ and SAC 2006; Zhang et al. 2011; AQSIQ and SAC 2015). The AI (Table 2) is defined as the percentage of the rainfall anomaly, which refers to below (for deficit rain, DAI) or above (for excessive rain, EAI) the long-term annual average values (Tank and Konnen 2003). The HI (Table 2) is defined as the difference between rainfall and potential evapotranspiration (PE) divided by the PE, where PE is based on the Thornthwaite equation (Cornwell and Harvey 2007). The HI is widely used to assess soil moisture imbalances and is a useful index in determining climatic suitability for agriculture (Yang et al. 2013; Ficklin et al. 2016). Humidity-based index values could be lower than − 0.4 (for deficit rain, DHI) or higher than 0.4 (for excessive rain, EHI). These indices represent extreme meteorological consequences and are regarded as reasonable measurements of agro-meteorological hazards (Hess and Syroka 2005) (also see Table 2 for the weather index details).

Table 2 Rainfall-related weather indices for three crops (maize, rice, wheat)

2.2.3 Insurable Counties

Pearson R has been widely used to capture the effect of climate variables on yield (Martin et al. 2001; Vedenov and Barnett 2004; Porter and Semenov 2005; Lobell and Field 2007) and could represent the magnitude of uninsured basis risk, which is the potential mismatch between the indemnity and actual losses suffered by farmers (Hazell et al. 2010; Binswanger-Mkhize 2012; Elabed et al. 2013). Hence, we used Pearson R to measure the correlation of weather indices and yield loss, and both variables were examined by de-autocorrelation to assure reliability of the correlation analysis (Zhang et al. 2000; Wang and Swail 2001; Alexander et al. 2006). If the correlation was very high, the uninsured basis risk was low. Counties at the 0.1 significance level were defined as the insurable counties in our study, which were subsequently classified into different categories, such as deficit-rain-insurable counties (the union of DAI- and DHI-insurable counties) and excessive-rain-insurable counties (the union of EAI- and EHI-insurable counties). Additionally, given the minimum sample size requirement (Bonett and Wright 2000; Chok 2010), counties with less than 10 event samples for one weather index were not included in the correlation analysis.

2.3 Analysis Methods

Given the vast differences in weather conditions and farming systems between agricultural zones, we investigated the performance of weather indices at the zonal scale after determining the insurable counties.

2.3.1 Identifying Insurable Counties and their Zonal Basis Risk

The number of insurable counties is directly related to the implementation of WII, and Pearson R indicates the level of basis risk. Thus, the primary objectives are to identify insurable counties and correlation levels of weather index and yield loss in different agricultural zones for the three main crops in China. We calculated the coverage ratio (the number of insurable counties/the number of all counties in one agricultural zone) to make it comparable among different zones. The values of Pearson R at the zonal scale were analyzed through comparing the kernel density and the boxplot of Pearson R. A greater median value in the boxplot corresponds to a lower basis risk. The kernel density of Pearson R can reveal the zonal distribution of basis risks, which may offer a potential structure to estimate the expectation of basis risk.

2.3.2 Estimating the Independence and Risk Pooling of Insurable Counties

Based on insurable counties, where the county-level yield losses can be effectively pooled is crucially important and might affect the selection of the best WII application at the zonal scale. Moreover, it is a common belief that independent risk exposures benefit the effectiveness of risk pooling (Miranda and Glauber 1997; Wang and Zhang 2003; Okhrin et al. 2013). Therefore, we first identified whether insurable counties were independent and then calculated the effectiveness of risk pooling in each zone.

A spatial clustering method, the Getis–Ord \(G_{i}^{*}\) statistic (\(G_{i}^{*}\)), was used to analyze the independence of the average yield loss (from 1980 to 2008) for insurable counties at the zonal scale. The \(G_{i}^{*}\) is also known as hotspot analysis (Getis and Ord 1992, 1996; Ord and Getis 1995; Mitchell 2005), which computes spatial correlation statistics (including z-score and p value) for one specific attribute of each data point/area. Only data points/areas with significant p values can be categorized into a cluster. The significant p value is defined as 0.1 in the present study. The cluster could be positively significant (hot cluster) or negatively significant (cold cluster). A hot cluster represents the high yield losses clustered together, and vice versa. Insurance could not pool loss risks among counties within a cluster because of the significant spatial correlation (Wang and Zhang 2003), but could pool loss risks among different clusters and between the clustered counties and the independent counties.

After identifying the clustered and independent counties in each agricultural zone, another important issue was whether yield loss could be effectively pooled. Here, we adopted Eq. 2 (Wang and Zhang 2003) to measure the effectiveness of risk pooling for insurable counties in each zone.

$$\emptyset = \frac{{{\text{Var}}(\bar{X})}}{{\mathop \sum \nolimits_{{i = 1}}^{N} {\text{Var}}(X(c_{i} ))/N}}$$
(2)

where N is the number of insurable counties in each zone, \(X(c_{i} )\) is the average yield loss at insurable county i from the year 1980 to 2008, and \(\bar{X} = (1/N)\mathop \sum \nolimits_{{i = 1}}^{N} X(c_{i} )\). When losses of each exposure unit are independently and identically distributed, the variance of average loss is \(\text{Var}(X(c_{i} ))/N^{2}\). Hence, the uncertainty of average loss could decrease as N increases and \(\emptyset\) equals to 1/N. However, given that yield loss distributions usually are not independent and identical to each other, \(\emptyset\) is likely higher than 1/N. Therefore, \(\emptyset \le 1 /N\) can indicate that the risk pooling of weather yield loss is effective (Wang and Zhang 2003).

3 Results

After calculating the number of insurable counties, zonal basis risk, independence of insurable counties, and the effectiveness of risk pooling, the results were individually displayed (Sects. 3.13.4) and comprehensively analyzed (Sect. 3.5).

3.1 Identifying Insurable Counties

The primary results of our study are where and how many insurable counties are in different agricultural zones. Thus, we mapped the insurable counties in orange for deficit rain and in blue for excessive rain, while the counties with event samples < 10 appear in grey (maize, Fig. 2; rice, Fig. 3, and wheat, Fig. 4). We also calculated the number and coverage ratio of insurable counties in each agricultural zone according to weather indices and hazards.

Fig. 2
figure 2

Spatial distribution and the number of insurable counties for maize: a deficit rain and b excessive rain

Fig. 3
figure 3

Spatial distribution and the number of insurable counties for rice: a deficit rain and b excessive rain

Fig. 4
figure 4

Spatial distribution and the number of insurable counties for wheat: a deficit rain and b excessive rain

We found that the number of AI-insurable counties was slightly less than that of HI-insurable counties in maize-planting areas (172 for DAI and 182 for DHI, Fig. 2a; 63 for EAI and 68 for EHI, Fig. 2b). Deficit-rain-insurable counties (266, Fig. 2a) covered more than twice the counties than those of excessive rain (108, Fig. 2b). Please note that because some counties were both AI-insurable and HI-insurable, the number of deficit-rain-insurable counties was less than the sum of DAI- and DHI-insurable counties; and the same phenomenon happened to excessive-rain-insurable counties. Spatially, more deficit-rain-insurable counties were distributed in Zones III, V, and VI (Fig. 2a-1, a-2), and more excessive-rain-insurable counties were scattered in Zones I and III (Fig. 2b-1, b-2). Additionally, the coverage ratios were greater than 0.1 in most zones for deficit rain (Zones I, III, V, and VI, Fig. 2b-3), but lower than 0.1 in most zones for excessive rain (Zones II, III, V, and VI, Fig. 2b-3).

Among the 1385 selected rice counties, the AI-insurable counties (274, Fig. 3a-1) outnumbered the HI-insurable counties (164, Fig. 3a-2) for deficit rain. However, excessive rain showed the opposite result—the HI-insurable counties (272, Fig. 3b-2) outnumbered the AI-insurable counties (199, Fig. 3b-1). This phenomenon occurred in most zones, that is, DAI performed better than DHI in Zones I, IV, VI, and VII (Fig. 3a-3), while EHI was superior to EAI in Zones I, IV, and VII (Fig. 3b-3). Spatially, the minimum number of insurable counties was found in Zone I for deficit rain (Fig. 3a-3) and in Zone V for excessive rain (Fig. 3b-3). With respect to the two hazards, more deficit-rain-insurable counties (405, Fig. 3a) were identified than excessive-rain ones (369, Fig. 3b).

With regard to the wheat-planting counties, the AI-insurable counties (DAI: 71; EAI: 196) outnumbered the HI-insurable counties (DHI: 59; EHI: 73) (Fig. 4). In total, more insurable counties were identified for excessive rain (254, Fig. 4b) than for deficit rain (117, Fig. 4a). Spatial characteristics were distinctly different between the east and the west for rainfall-related weather indices. Most deficit-rain-insurable counties were located in the west of the planting area (Fig. 4a-1, a-2), while the excessive-rain-insurable counties were mainly in the east of the planting area, especially in Zone IV (Fig. 4b-1, b-2). The coverage ratio of deficit-rain-insurable counties decreased from Zone II via III to the minimum in IV, but greatly increased to a high level in Zones V and VI (Fig. 4a-3). In contrast, for excessive rain, the coverage ratio gradually increased from Zone II via III, reached the maximum value in Zone IV, and considerably decreased in Zones V and VI. Additionally, EAI consistently identified more insurable counties than EHI, except for Zone VI (Fig. 4b-3).

3.2 Zonal Basis Risk

After identifying the numbers and locations of insurable counties, we further examined the zonal basis risk to determine the credibility of the weather index for the three main crops.

At the zonal scale, the ideal comparison on basis risk should use the expectation of Pearson R, which could be derived from the distribution of Pearson R for each agricultural zone. However, it was difficult to calculate the expectations because of the irregular distributions for various portfolios of weather indices, hazards, and agricultural zones (Fig. 5). Specifically, a few were under a regular distribution, for example, the bimodal distribution (Fig. 5a-2-I_DHI, 5b-1-I_EHI, 5b-3-III_EHI) and the normal distribution (Fig. 5b-3-VI_EHI). But the majority were under irregular distribution with a heavy tail, for example, the DHI in Zone VI for rice (Fig. 5a-2). Moreover, in some zones, it was totally impossible to calculate the expectation through simulating the distribution curve because of few insurable counties—for example, the basis risk of DAI in Zone IV for wheat (Fig. 5a-3) and the basis risk of EAI in Zone II for maize (Fig. 5b-1).

Fig. 5
figure 5

Boxplots and kernel density curves of Pearson R for: a deficit rain and b excessive rain. The y axis refers to the Pearson R value. Combinations of agricultural zone ID (see Table 1) and weather index (see Table 2) are on the x axis. The odd columns with colored shadow in each figure are for AI (DAI, EAI), and even columns for HI (DHI, EHI). The central axis divides the boxplot (at left) and the kernel density curve (at right)

Therefore, we mainly depended on the median of the Pearson R to compare basis risks. Higher median represents lower basis risk. For maize, the median of R of AI was higher than that of HI in Zones I, II, and III for deficit rain (0.46 vs 0.41, 0.40 vs 0.36, and 0.43 vs 0.42, respectively, Fig. 5a-1), but was smaller in Zones V and VI (0.41 vs 0.47 and 0.38 vs 0.44, respectively, Fig. 5a-1). In contrast, with regard to excessive rain for maize, the median of R of AI was lower than that of HI in Zones I and III (0.40 vs 0.43 and 0.39 vs 0.44, respectively, Fig. 5b-1), while it was higher in Zones V and VI (0.55 vs 0.37 and 0.44 vs 0.40, respectively, Fig. 5b-1). A similar phenomenon occurred with rice in Zone I, where AI was better than HI for deficit rain (0.43 vs 0.40, Fig. 5a-2-I_DAI) but worse for excessive rain (0.40 vs 0.47, Fig. 5b-2-I_DAI). No significantly consistent differences were observed in the rest of the zones for both deficit and excessive rains (Fig. 5a-2, b-2). As for wheat, more stable performance was identified by AI rather than by HI. Specifically, AI consistently performed better for excessive rain (Fig. 5b-3) than for deficit rain (Fig. 5a-3) in all five zones (0.46 vs 0.38, 0.44 vs 0.40, 0.46 vs 0.40, 0.40 vs 0.39, and 0.41 vs 0.40, respectively). However, this phenomenon applied to HI only in Zones II (0.45 vs 0.40), IV (0.45 vs 0), and VI (0.42 vs 0.36).

3.3 Independence of Insurable Counties

We used \(G_{i}^{*}\) to identify the spatial clusters and examine the independence of yield losses based on insurable counties in each zone for maize (Fig. 6), rice (Fig. 7), and wheat (Fig. 8). Counties with high yield loss assembled in the hot clusters, while counties with low yield loss assembled in the cold clusters.

Fig. 6
figure 6

Clustered and independent insurable counties for maize: a deficit rain and b excessive rain

Fig. 7
figure 7

Clustered and independent insurable counties for rice: a deficit rain and b excessive rain

Fig. 8
figure 8

Clustered and independent insurable counties for wheat: a deficit rain and b excessive rain

The results show that over half of the insurable counties were independent in most zones for the three crops (Figs. 6a-3, b-3, 7a-3, b-3, 8a-3, b-3), while the number and location of clustered counties varied greatly. Specifically, among all agricultural zones for maize, only insurable counties in Zone II were consistently independent regardless of the type of weather index and hazard (Fig. 6a-1, a-2, b-1). Both hot and cold clusters were identified in Zones I and III (Fig. 6a-1, a-2, b-1, b-2). As for Zones V and VI, insurable counties of deficit rain were partially clustered (Fig. 6a-1, a-2) but those of excessive rain were consistently independent (Fig. 6b-1, b-2). As for rice, insurable counties in Zone I were in clusters except for the ones identified by DHI (Fig. 7a-2). By contrast, insurable counties in Zone V were spatially independent except for the ones identified by DHI (Fig. 7a-2). More interestingly, clusters of yield loss were located in similar places in the rest of the zones regardless of the type of weather index and hazard. For example, high yield loss clustered in the north of Zone IV, while low yield loss clustered in the south of Zone IV for both deficit-rain-insurable (Fig. 7a-1, a-2) and excessive-rain-insurable counties (Fig. 7b-1, b-2). Among the wheat-insurable counties, AI-insurable counties were spatially independent in most zones for deficit rain (Zones III, IV, and V, Fig. 8a-1) and EHI-insurable counties were also independent in most zones (Zones II, III, and V, Fig. 8b-2). Additionally, AI identified more independently insurable counties than HI for both hazards (54 vs 47 for deficit rain, Fig. 8a-3; 88 vs 67 for excessive rain, Fig. 8b-3).

3.4 The Effectiveness of Risk Pooling

The pooling of loss risks (Tables 3, 4, 5) shows that the variance of average yield loss was consistently lower than the average variance of individual yield losses, which reduced the uncertainty of total compensation. However, not all WII cases could result in effective risk pooling. Among all agricultural zones for maize (Table 3), more zones were identified with effective risk pooling by HI for both hazards (Zones II and V for deficit rain; Zones I, V, and VI for excessive rain). As for rice (Table 4), only yield losses of HI-insurable counties could be effectively pooled for deficit rain (Zones I, IV, and VII), while AI-insurable counties could be pooled for excessive rain (Zones V and VI). Both AI and HI could identify effective risk pooling for rainfall-related hazards in wheat-cropping zones (Table 5), that is, AI in Zone IV and HI in Zone II for deficit rain; AI in Zones V and VI, and HI in Zone IV for excessive rain. Unfortunately, no weather index could effectively pool loss risks at the national scale.

Table 3 Effectiveness of risk pooling in each agricultural zone for maize
Table 4 Effectiveness of risk pooling in each agricultural zone for rice
Table 5 Effectiveness of risk pooling in each agricultural zone for wheat

More interestingly, one agricultural zone could be diagnosed as an effective risk pooling case even though some insurable counties within it were in a cluster. Specifically, among all effective risk pooling cases, both hot clusters and cold clusters were identified by DAI-insurable (Fig. 6a-1) and EHI-insurable (Fig. 6b-2) counties in Zone I for maize. A similar phenomenon also showed in DHI-insurable counties of Zone VII (Fig. 7a-2) and EAI-insurable counties of Zone VI for rice (Fig. 7b-1), and EAI-insurable counties of Zone V (Fig. 8b-1) and EHI-insurable counties of Zone IV for wheat (Fig. 8b-2). Wang and Zhang (2003) also concluded that correlated losses could be effectively pooled as long as the risk pool was sufficiently large.

3.5 Comprehensive Evaluation for Insurable Counties

All attributes of insurable counties separately investigated above are plotted in Fig. 9 for a comprehensive evaluation. The zonal number of insurable counties from Figs. 2, 3 and 4 is indicated by the y axis. The median values of zonal Pearson R from Fig. 5 are shown on the x axis. The size of each circle in the plot represents the number of independently insurable counties from Figs. 6, 7 and 8. We used the colored solid circles for zones with effective risk pooling, and the blank circles for the opposite cases (Tables 3, 4, 5). According to the four factors of a weather index (details in Sect. 1), the suitable weather index should be located in the upper right corner in the coordinate system, as well as in large size and colored circle. However, no cases simultaneously have the above four characteristics. Especially, all circles located in the top area of the plots are blank (Fig. 9), which might suggest that trade-offs between the number of insurable counties and the spatial correlation exist. Moreover, the zonal Pearson R was below 0.5 in all zones with the exception of EAI in Zone V for maize, which is far from satisfactory (Fig. 9). Considering the scaling up of WII was based on the independently insurable counties, we could mainly depend on the number of independently insurable counties to evaluate the performance of the weather index. As a result, DHI in Zone III is superior to others for maize (Fig. 9a-1); DAI in Zones IV and VII, EAI in Zone IV, and EHI in Zone IV are the most suitable weather index application for rice (Fig. 9a-1, b-2); and EAI in Zone IV (Fig. 9b-3) is the most feasible application for wheat.

Fig. 9
figure 9

Comprehensive evaluation for insurable counties: deficit rain (top row) and excessive rain (bottom row): a maize; b rice; c wheat

4 Discussion

After conducting the comprehensive evaluation on insurable counties for the three main crops in China, we further discuss the innovations of our study (Sect. 4.1), analyze differences between different weather indices (Sect. 4.2), and finally, summarize the shortcomings and uncertainties in this research (Sect. 4.3).

4.1 The Innovations of Our Study

The insurance premium rate for an individual exposure unit consists of the fair price, the buffer load, and the administrative cost (Cummins 1991; Goodwin and Smith 1995). The fair price refers to the probability of average loss, and the administrative cost is always assumed to be zero. However, the buffer load can be extremely high if the insured loss simultaneously strikes a large number of exposure units (Ibarra and Skees 2007). Unfortunately, agricultural losses tend to be spatially correlated and occur simultaneously because weather patterns are generally similar over large geographic areas. This phenomenon results in the systematic risk in agricultural insurance.

In order to design more suitable insurance products, many studies have concentrated on how to decrease the spatial correlation between yields or between weather indices. Goodwin (2001) found that the yield correlation was more persistent in bad years than normal years. Wang and Zhang (2003) quantified the positive correlation between yields, which died off quickly when the lag distance between two counties increased. For example, the positive correlation between yields fell sharply when the distance was longer than 570 miles for crops in the United States. Furthermore, researchers tried to minimize the spatial correlation through diversifying the insured areas. For example, Xu et al. (2010) claimed that for weather indices in Germany, the reduction of buffer load could be effective as the insured area became larger, even though that effectiveness was small. Okhrin et al. (2013) conducted a similar study in China. Their results showed that spatial diversification revealed a significant effect if several provinces were aggregated. But these studies did not analyze the correlation of weather index and yield loss—they assumed that yield loss always significantly relates to a single weather index, but this is not the case. In our present study, we examined the insurable counties through conducting Pearson R correlation between weather index and yield loss for all counties in the study area. The results showed that over half of the counties could not be insured by one type of weather index (Fig. 2 for maize, Fig. 3 for rice, and Fig. 4 for wheat). Therefore, the weather index-yield loss correlation should be a preliminary step before conducting spatial correlation analysis. The spatial correlation between yields or between weather indices will be more meaningful in insurable areas.

Another innovation of the present study is that we proposed that the scale of WII research should depend on the systematic risk of yield loss. Traditional agricultural insurance is implemented at farm level. Goodwin (2001) and Wang and Zhang (2003) focused on the county level. Xu et al. (2010) and Okhrin et al. (2013) used provincial data. Considering that the systematic risk of yield loss is the “in-between” risk (details in Sect. 1), we assumed that all farms at the proper regional scale were characterized by homogenous conditions—for example, the county scale in the present study. As for further investigations, the larger zonal scale might be more suitable so that we would explore the spatial effect of similar weather conditions and farming strategies.

Overall, based on previous studies, we focused on the insurable counties to analyze the size of the potential WII market and the risk pooling of yield losses for each agricultural zone, which makes sense in determining where and which weather index can be used at the regional scale for the three main crops—maize, rice, and wheat—in China.

4.2 Differences between Weather Indices

A simple weather index is the key to WII contracts, which has guided many pilot programs and reinsurance companies in practice (Skees 2008; Hellmuth et al. 2009). Although some indices are widely used—such as the standard precipitation index (SPI) and the Palmer drought severity index (PDSI)—the complicated computational steps or high-level data requirements make them less practical than AI and HI (Keyantash and Dracup 2002) in our study. The HI and AI focus on two different aspects, but both can reflect the dryness and wetness of the climate; AI directly measures the shortage of rainfall, and HI is based on the supply and demand concept of the water balance equation.

Given the different definitions, we could expect many differences between the results of the two indices. First, we used the number of insurable counties (Figs. 2, 3, 4), the basis risk (Fig. 5) at the zonal scale, the independent insurable counties (Figs. 6, 7, 8), and the effectiveness of pooling risks (Tables 3, 4, 5) to compare the performance of different weather indices. However, we could not assert which type of weather indices performed consistently better through comparing the four important characteristics. For example, for rice, deficit-rain-insurable counties outnumbered the excessive rain counties in Zones V and VI (Fig. 3a-3, b-3) regardless of the type of weather index. In the rest of the zones, AI was more suitable for deficit rain while HI was more suitable for excessive rain in most zones (Zones I, IV, and VII, Fig. 3a-3, b-3). As to excessive rain for maize, the median of R of AI was lower than that of HI in Zones I and III, but higher in Zones V and VI (Fig. 5b-1). Actually, their performances could meet the needs of different stakeholders—for example, the government prefers that as many counties as possible could be insured; insurers like the smaller basis risk; and effective risk pooling can lower buffer load, which will increase farmers’ willingness to participate in WII. Additionally, the combination of insurable counties of different cropping zones could enlarge the risk pool and may provide higher flexibility in loss risk transfer, which deserves more analysis in future research.

4.3 Shortcomings and Uncertainties

Although we have systematically developed the evaluation criteria for the weather index, there are some deficiencies in this study. Therefore, we provide the following suggestions for future work.

Hazard thresholds for weather indices were defined as 40% for AI and 0.4 for HI (Table 2). However, considering the differences in weather conditions and farming systems between agricultural zones, local adjustments to hazard thresholds may be needed. Relevant research is gaining increasing focus. For example, Hess et al. (2002) set different rainfall index thresholds to find out the trade-off between insurance coverage and cost in Meknes, Morocco. He et al. (2016) improved the rainfall anomaly index by using multiple thresholds to identify drought events in major water resource regions in China. Further, we used Pearson correlation to identify the linear relationship between weather index and yield loss, with the assumption of normal joint distribution among these variables (Nguyen-Huy et al. 2018). While a linear regression model could provide a quick overview of general trends, it might be strongly influenced by outliers—for example, extreme events in bad years could possibly mislead correlation results (Goodwin 2001; Hassani 2016). In general, it is worth careful consideration how such uncertainty will affect future insurance studies.

5 Conclusion

Weather index-based insurance has been designed as a sustainable market mechanism to transfer risk in low-income countries for single-peril crop insurance. Different from previous studies that only focused on yield loss or weather index across the whole study area, we argue that the correlation between weather index and yield loss should be examined. Specifically, Pearson R was used to calculate this correlation relationship in the present study. Based on insurable counties, this article provided systematic evaluation criteria to analyze the weather index, which included the number of insurable counties, the magnitude of basis risk, independent yield loss, and the effectiveness of risk pooling for each portfolio of weather index, hazard, and agricultural zone. The results revealed that the application of WII should be localized and diversified because one single weather index could not meet all criteria and also could not work for all regions and weather-related hazards. Also, our results should be helpful to both supply and demand clients since we have pointed out where and which weather index performed better than others. Specifically, these include DHI in Zone III for maize; DAI in Zones IV and VII, EAI in Zone IV, and EHI in Zone IV for rice; and EAI in Zone IV for wheat. Research based on insurable counties, such as for setting insurance premiums and reinsurance, should receive more attention in the future for the successful implementation of WII in practice.