Weather extremes, agriculture and the value of weather index insurance

This paper evaluates the potential value of weather index insurance for the agricultural sector in a high-income country (Germany). In our theoretical analysis, we model an index insurance market, a loss-based insurance market and a combination of both kinds of insurance, and compare the resulting expected utility of a risk-averse crop farmer. To find a suitable index, we conduct a panel estimation and evaluate the link between different weather variables and losses of crop farmers in Germany. According to our estimation, mean temperatures in summer have the highest potential for a valuable index insurance. Finally, we simulate the theoretical model using the results from the estimation and different thresholds for the definition of a natural catastrophe (NatCat). According to this simulation, index insurance is more attractive for lower and more frequently occurring losses, and loss-based insurance is more attractive for rare high losses. A combination of both kinds of insurance could be optimal for intermediate cases.


Introduction
Agriculture strongly depends on climatic conditions and is therefore significantly influenced by climate change. While moderate seasonal warming and/or increasing rainfall can have beneficial effects on some crops and in some regions, yields decrease if those climate variables exceed thresholds at the upper or lower tail of their distribution (Mishra and Sahu 2014; Lippert et al. 2009). Climate change, however, affects not only average weather conditions but also the variance of weather conditions (i.e. natural catastrophes, NatCats). McCarl et al. (2008, p. 1247) find "that higher variances in climate conditions tend to lower average crop yield and inflate yield variability". Although NatCats lead to income fluctuations for farmers and threaten their solvency (Mishra and Sahu 2014; Nordhaus 1993), these risks are often not insured. In the literature, several reasons for this have been identified (e.g. Goodwin 2001; Woodard et al. 2012). The supply side has to deal with systemic risk (high correlation of risks) and asymmetric information (with moral hazard and adverse selection problems). The high concentration of risks leads to large risks for the insurer. In addition, insurance creates high costs for risk and loss assessment (e.g. to avoid moral hazard).
These factors result in high prices for insurance products that cover losses from NatCats, which makes these products unattractive for farmers. Furthermore, farmers can often hope for government support and therefore have a lower incentive to purchase private insurance (charity hazard). As a result, these insurance products hardly exist or have to rely on public subsidies (Miranda and Farrin 2012).
In this paper, we evaluate whether index insurance could be a welfare-enhancing option for the German agricultural sector. In contrast to traditional loss-based insurance, claim payments of an index insurance do not depend on observed individual damages but on the development of a (weather) index. The two main advantages of index insurance are that it is cheap and that it limits moral hazard and adverse selection. The main disadvantage is the basis risk of the insurance buyer, which depends on the match between the index and the individual losses of the farmer. Index insurance was mainly designed for price-sensitive developing economies (e.g. Barnett et al. 2008). However, it could also help to deal with challenges in developed economies, where private insurance against extreme weather events (i.e. natural catastrophes) hardly exists (an exception is hail insurance).
Some papers explicitly look at index insurance in high-income countries. Kath et al. (2019), for example, assess the value of an index insurance for sugar cane producers in Australia by comparing (potential) revenue streams of farmers with and without index insurance. Kapphan et al. (2012) examine the effect of climate change scenarios on optimal weather (index) insurance contracts for crop farmers in Switzerland. The authors show that climate change could increase the attractiveness of weather insurance for both insurers and insured. However, the results depend on the insurers' ability to capture the effect of climate change and to adjust contracts accordingly.
Based on data from a discrete choice experiment, Achtnicht and Osberghaus (2019) evaluate the value of an index-based flood insurance for households in Germany. Their results indicate that most customers would prefer a traditional loss-based insurance. Mahul (2001) provides a theoretical analysis of insurance against climate risk in agriculture. In his model, the production function of a crop farmer depends on an observable and insurable random weather index as well as an uninsurable production shock. Mahul shows that the optimal insurance coverage depends positively on the correlation of the two risks. Gollier (2003) demonstrates that traditional static theoretical insurance models artificially inflate the value of insurance to risk-averse individuals. If there is no serial correlation between shocks, individuals can also self-insure through precautionary savings. Gollier (2003, p. 21) concludes that "only liquidity constrained households would purchase a generous insurance coverage. Wealthier people would mostly rely on their ability to time diversify their risks. They would limit their insurance purchase to catastrophic risks, i.e., risks whose largest potential loss exceed a large fraction of their annual income." While the above papers study hypothetical index insurance markets, several papers evaluate existing index insurance in developing economies. Cole et al. (2014) and Hochscherf (2017), for example, analyse the driving factors of the demand for index insurance in India based on panel data from a field experiment. They conclude that factors like risk exposure and insurance experience are important drivers of insurance demand.
We use a different approach to analyse the value of index insurance for German farmers. First, we develop a theoretical two-period model of a risk-averse farmer who is subject to a potential loss. The probability of this loss depends on whether there is a NatCat or not. In this setting, we calculate under which conditions the farmer would prefer an index insurance, which pays if there is a NatCat, to a (more expensive) traditional loss-based insurance, which pays if there is a loss. Our model adds value to the existing literature on index insurance by considering savings as a substitute for insurance and by building a foundation for a simulation with real-world data. Our results indicate that an index insurance would increase welfare if the probability of a loss is significantly higher under a NatCat than without a NatCat and, at the same time, the NatCat is relatively likely. Hence, the performance of the index insurance strongly depends on the weather index used and on the concrete definition of a NatCat, i.e. the trigger point of the index insurance.
In a second step, we therefore conduct an empirical panel estimation to see which weather variables have the strongest link to losses of crop farmers in Germany. While the link between weather and agricultural yields has been studied in a number of papers, many of these papers focus on average weather conditions (i.e. the climate) and average yields, or use weather variables as controls for analysing the short-term impact of different planting methods. We use data on winter wheat yields (the main crop cultivated in Germany) at district level from 1999 to 2019. For our main estimation, the corresponding weather variables are mean temperatures, the number of heat days, sunshine hours and precipitation. For all four variables we distinguish between values in spring and in summer, which constitutes a major value added of our paper. Hence, we have eight weather variables for the period 1999 to 2019, which we transform into district-level data (from grid and point data, respectively). Since the relationship between weather and yield might not be linear, we also estimate a second-order polynomial regression. According to both estimations, mean temperatures in summer have the strongest impact on losses. The strength of the impact remains when we use mean temperatures in summer as the only weather variable in a separate estimation.
In a final step, we simulate the theoretical model using data and results from the empirical estimation. The goal of our simulation is to find the optimal threshold of mean summer temperatures that qualifies as a NatCat and therefore triggers the payment of the index insurance. According to this simulation, index insurance is more attractive for lower and more frequently occurring losses, and loss-based insurance is more attractive for rare high losses. A combination of both kinds of insurance could be optimal for intermediate cases.
This three-step approach is one of the main contributions of our paper to the existing literature. Our model enables us to simulate insurance demand using empirical data. We believe that our approach is suitable and useful to analyse the potential value of insurance for the agriculture sector in Germany. Furthermore, besides looking separately at index and loss-based insurance, we also evaluate under which conditions a combination of both kinds of insurance can be optimal. In addition, we are able to indicate how such a product could be designed.
The paper is structured as follows. In the next section, we develop the theoretical model. Section 3 presents the data and the empirical estimation. In Sect. 4 we simulate the theoretical model and Sect. 5 offers some concluding remarks.

The value of insurance
In this section, we develop a theoretical model of a crop farmer to analyse whether an index insurance would lead to higher expected utility than a traditional (loss-based) insurance. For simplicity, we assume that the crop farmer only owns land in one region and grows only one kind of crop. Therefore, the farmer is not able to diversify or hedge income fluctuations.
The main advantage of loss-based insurance is that it pays if there is a loss, independent of whether the loss was the result of a NatCat or not. It is therefore ideally suited to limit income fluctuations, which is of high value for a risk-averse farmer. As discussed above, the drawback is that this insurance is relatively expensive.
Index insurance, in contrast, only pays if there is a NatCat, independent of whether the farmer suffered a loss or not. This creates basis risk for the customer: there can be a loss without any payment to the farmer, and the insurance can pay although there is no loss. The big advantage of an index insurance is, however, that it is relatively cheap: there are no costs for risk and loss assessment to avoid adverse selection and moral hazard effects. Costs for distribution and management are also relatively low. Furthermore, index risks can be sold relatively easily on financial markets and diversified with other (uncorrelated) risks.

Basic assumptions
We look at the optimal savings and insurance decision of a crop farmer. Crop can be consumed or stored and is the only numéraire in our model. In the present period, the farmer has the crop wealth $Y_1 = 1$ and decides how much he or she saves ($s$), spends on insurance [$p(i)$] and consumes [$1 - s - p(i)$]. In the future period, the farmer has the crop yield $Y_2 = 1$ and the savings of the first period (the interest rate is assumed to be zero). The farmer, however, faces the risk of suffering a loss $l$ ($0 < l < 1$). The overall probability of suffering this loss is $0 < \pi < 0.5$ (footnote 7). However, with probability $0 < \pi_c < 0.5$ (footnote 8), there is a NatCat, which makes losses more likely. If there is a NatCat, the probability of the loss is $\pi_h$, and if there is no NatCat, the probability is $\pi_l$. We define $\Delta = \pi_h - \pi_l$ as the difference between the NatCat loss probability $\pi_h$ and the non-NatCat loss probability $\pi_l$. We therefore get:

$$\pi = \pi_c \pi_h + (1 - \pi_c)\,\pi_l = \pi_l + \pi_c \Delta .$$

We look at two alternative kinds of insurance: a traditional loss-based insurance which pays $i_L$ if there is a loss (independent of whether it is the result of a NatCat or not), and an index insurance which pays $i_I$ if there is a NatCat (independent of whether there is a loss or not). We start the analysis by assuming that there is either only a loss-based insurance (L) or only an index insurance (I). Farmers maximize their expected utility by choosing optimal amounts of savings $s_{L,I}$ and insurance $i_{L,I}$. We assume that farmers have a standard logarithmic utility function ($U[X] = \ln[X]$) and are indifferent between utility in the first and second period (discount factor of one). Section 2.5 analyses the optimal decision of the crop farmer when both kinds of insurance are available.
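The decomposition of the loss probability can be checked numerically. The following is a minimal sketch, assuming the overall loss probability, the NatCat probability and the gap between the two conditional loss probabilities are called `pi`, `pi_c` and `delta`; the function name and the example values are illustrative assumptions and are not taken from the paper.

```python
def conditional_loss_probs(pi, pi_c, delta):
    """Return (pi_l, pi_h): loss probability without and with a NatCat.

    Assumes pi = pi_c * pi_h + (1 - pi_c) * pi_l and delta = pi_h - pi_l.
    """
    pi_l = pi - pi_c * delta
    pi_h = pi + (1.0 - pi_c) * delta
    if not (0.0 <= pi_l <= 1.0 and 0.0 <= pi_h <= 1.0):
        raise ValueError("parameters do not yield valid probabilities")
    return pi_l, pi_h


# Illustrative values only: 20% overall loss risk, 10% NatCat risk.
pi_l, pi_h = conditional_loss_probs(pi=0.2, pi_c=0.1, delta=0.5)

# The law of total probability must reproduce the overall loss probability.
assert abs(0.1 * pi_h + 0.9 * pi_l - 0.2) < 1e-12
```

The validity check matters in practice: for a given overall loss probability, not every combination of NatCat probability and gap is admissible.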

Loss-based insurance
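The loss-based insurance pays if the farmer suffers a loss. Hence, the probability of a claims payment is $\pi$. Since insurers have to cover fluctuations in aggregated losses (i.e. there is a NatCat or not) as well as costs for distribution, management, and risk and loss assessment, we assume that they charge a mark-up $\lambda_L \geq 0$ on the actuarially fair premium, where $(1 + \lambda_L)\pi < 1$. The resulting premium for insuring the amount $i_L$ is

$$p(i_L) = (1 + \lambda_L)\,\pi\, i_L .$$

The farmer maximizes the following expected utility function:

$$EU_L = \ln[1 - s_L - (1+\lambda_L)\pi i_L] + \pi \ln[1 + s_L - l + i_L] + (1-\pi)\ln[1 + s_L] \tag{1}$$

by choosing optimal savings ($s_L$) and insurance ($i_L$). The first order conditions are:

$$\frac{\partial EU_L}{\partial s_L} = -\frac{1}{1 - s_L - (1+\lambda_L)\pi i_L} + \frac{\pi}{1 + s_L - l + i_L} + \frac{1-\pi}{1 + s_L} = 0 \tag{2}$$

and

$$\frac{\partial EU_L}{\partial i_L} = -\frac{(1+\lambda_L)\pi}{1 - s_L - (1+\lambda_L)\pi i_L} + \frac{\pi}{1 + s_L - l + i_L} = 0 . \tag{3}$$

The corresponding second order derivatives are:

$$\frac{\partial^2 EU_L}{\partial s_L^2} = -\frac{1}{[1 - s_L - (1+\lambda_L)\pi i_L]^2} - \frac{\pi}{[1 + s_L - l + i_L]^2} - \frac{1-\pi}{[1 + s_L]^2} < 0 , \tag{4}$$

$$\frac{\partial^2 EU_L}{\partial i_L^2} = -\frac{(1+\lambda_L)^2\pi^2}{[1 - s_L - (1+\lambda_L)\pi i_L]^2} - \frac{\pi}{[1 + s_L - l + i_L]^2} < 0 \tag{5}$$

and

$$\frac{\partial^2 EU_L}{\partial s_L\,\partial i_L} = -\frac{(1+\lambda_L)\pi}{[1 - s_L - (1+\lambda_L)\pi i_L]^2} - \frac{\pi}{[1 + s_L - l + i_L]^2} < 0 . \tag{6}$$

Given these second order derivatives, the Hessian matrix is negative definite and, therefore, the second order condition for a maximum of expected utility is satisfied (footnote 9). The resulting optimal savings ($s_L$) and insurance ($i_L$) are:

$$s_L = \frac{(1-\pi)\,[2 - (1+\lambda_L)\pi l]}{2\,[1 - (1+\lambda_L)\pi]} - 1 \tag{7}$$

and

$$i_L = l + \frac{1}{1+\lambda_L} - \frac{\pi l}{2} - \frac{(1-\pi)\,[2 - (1+\lambda_L)\pi l]}{2\,[1 - (1+\lambda_L)\pi]} . \tag{8}$$

Footnote 7: The restriction $\pi < 0.5$ makes sure that the loss can be seen as a risk of a lower yield, instead of seeing a no-loss as a chance of a higher than expected yield. The results of the model, however, would also hold if the condition is relaxed to $\pi < 1$.

Footnote 8: The restriction $\pi_c < 0.5$ makes sure that the NatCat can be seen as an unusual weather condition, instead of seeing a non-NatCat as a surprisingly favourable condition. The results of the model, however, would also hold if the condition is relaxed to $\pi_c < 1$.

Footnote 9: The second order condition for an optimum is satisfied if the Hessian matrix of second order derivatives is negative (semi-)definite (i.e. all eigenvalues are non-positive). Writing $H_{ss}$, $H_{ii}$ and $H_{si}$ for the derivatives in Eqs. (4) to (6), the eigenvalues of the Hessian matrix are given by $\mu_{1,2} = \tfrac{1}{2}\left[H_{ss} + H_{ii} \pm \sqrt{(H_{ss} - H_{ii})^2 + 4H_{si}^2}\right]$. Since $H_{ss} + H_{ii} < 0$ and $H_{ss}H_{ii} - H_{si}^2 > 0$, the eigenvalues are both negative.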
Hence, the probability $\pi$ has a negative and the loss $l$ a positive effect on $i_L$. In other words, farmers especially want to insure low-probability/high-loss events. In addition, savings and insurance are to some degree substitutes. The more expensive insurance gets (higher $\lambda_L$), the less insurance farmers buy and the more they save. For $\lambda_L = 0$, farmers would choose full insurance ($i_L = l$) and would finance half of the insurance purchase by taking a loan ($s_L = -\pi l/2$). For $\lambda_L > (1-\pi)l / [2 - (1-\pi)l]$, savings become positive, and for:

$$\lambda_L \geq \frac{\sqrt{[2 - (2-\pi)l]^2 + 8\pi l} - [2 - (2-\pi)l]}{2\pi l} - 1 \tag{9}$$

insurance demand $i_L$ would be zero, as negative insurance is not allowed (in our model). As insurance demand depends positively on $l$ and negatively on $\pi$, this threshold also depends positively on $l$ and negatively on $\pi$. With $i_L = 0$, the optimal savings would be:

$$s_L = \frac{\sqrt{[2 - (2-\pi)l]^2 + 8\pi l} - [2 - (2-\pi)l]}{4} . \tag{10}$$
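The loss-based case can be reproduced numerically. The sketch below assumes logarithmic utility, a zero interest rate and a premium equal to the actuarially fair premium times one plus the mark-up, as described in the text. The closed-form expressions in the code were derived from the first order conditions for this illustration, and the names `pi`, `l` and `lam` (overall loss probability, loss size, mark-up) are assumptions rather than the paper's notation.

```python
import math


def expected_utility_loss(s, i, pi, l, lam):
    """Two-period expected log utility with loss-based cover i and savings s."""
    first_period = 1.0 - s - (1.0 + lam) * pi * i
    return (math.log(first_period)
            + pi * math.log(1.0 + s - l + i)
            + (1.0 - pi) * math.log(1.0 + s))


def optimal_loss_based(pi, l, lam):
    """Return (s, i) maximising expected utility, with i restricted to i >= 0."""
    m = 1.0 + lam
    loss_state = 1.0 / m - pi * l / 2.0                      # equals 1 + s - l + i
    no_loss_state = (1.0 - pi) * m * loss_state / (1.0 - m * pi)  # equals 1 + s
    i = loss_state - no_loss_state + l
    if i > 0.0:
        return no_loss_state - 1.0, i
    # Corner solution: cover is too expensive, the farmer only saves.
    q = 2.0 - (2.0 - pi) * l
    s = (math.sqrt(q * q + 8.0 * pi * l) - q) / 4.0
    return s, 0.0


# Illustrative values: a fair premium gives full cover financed partly by a loan.
print(optimal_loss_based(pi=0.2, l=0.5, lam=0.0))
```

With a mark-up of zero the sketch returns full cover and slightly negative savings, and with a sufficiently high mark-up it switches to the savings-only corner solution, in line with the discussion above.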

Index insurance
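The index insurance pays if there is a NatCat. As the information on NatCats is given by a publicly available index, the provision of an index insurance is significantly cheaper than the provision of a traditional insurance. Nevertheless, providing index insurance also involves costs, and we assume that the mark-up on the fair index insurance premium is $\lambda_I \geq 0$. Hence, the premium for insuring the amount $i_I$ is

$$p(i_I) = (1 + \lambda_I)\,\pi_c\, i_I .$$

The farmer now maximizes the following expected utility function:

$$EU_I = \ln[c_I] + \pi_c\pi_h \ln[1 + s_I - l + i_I] + \pi_c(1-\pi_h)\ln[1 + s_I + i_I] + (1-\pi_c)\pi_l \ln[1 + s_I - l] + (1-\pi_c)(1-\pi_l)\ln[1 + s_I] \tag{11}$$

by choosing optimal savings ($s_I$) and insurance ($i_I$), where $c_I = 1 - s_I - (1+\lambda_I)\pi_c i_I$ is first-period consumption. There are now four different cases in the second period: (i) there is an insured loss, (ii) there is no loss but the insurance pays, (iii) there is a loss but the insurance does not pay and (iv) there is no loss and the insurance does not pay. The first order conditions are:

$$\frac{\partial EU_I}{\partial s_I} = -\frac{1}{c_I} + \frac{\pi_c\pi_h}{1 + s_I - l + i_I} + \frac{\pi_c(1-\pi_h)}{1 + s_I + i_I} + \frac{(1-\pi_c)\pi_l}{1 + s_I - l} + \frac{(1-\pi_c)(1-\pi_l)}{1 + s_I} = 0 \tag{12}$$

and

$$\frac{\partial EU_I}{\partial i_I} = -\frac{(1+\lambda_I)\pi_c}{c_I} + \frac{\pi_c\pi_h}{1 + s_I - l + i_I} + \frac{\pi_c(1-\pi_h)}{1 + s_I + i_I} = 0 . \tag{13}$$

The corresponding second order derivatives are:

$$\frac{\partial^2 EU_I}{\partial s_I^2} = -\frac{1}{c_I^2} - \frac{\pi_c\pi_h}{[1 + s_I - l + i_I]^2} - \frac{\pi_c(1-\pi_h)}{[1 + s_I + i_I]^2} - \frac{(1-\pi_c)\pi_l}{[1 + s_I - l]^2} - \frac{(1-\pi_c)(1-\pi_l)}{[1 + s_I]^2} < 0 , \tag{14}$$

$$\frac{\partial^2 EU_I}{\partial i_I^2} = -\frac{(1+\lambda_I)^2\pi_c^2}{c_I^2} - \frac{\pi_c\pi_h}{[1 + s_I - l + i_I]^2} - \frac{\pi_c(1-\pi_h)}{[1 + s_I + i_I]^2} < 0 \tag{15}$$

and

$$\frac{\partial^2 EU_I}{\partial s_I\,\partial i_I} = -\frac{(1+\lambda_I)\pi_c}{c_I^2} - \frac{\pi_c\pi_h}{[1 + s_I - l + i_I]^2} - \frac{\pi_c(1-\pi_h)}{[1 + s_I + i_I]^2} < 0 . \tag{16}$$

As a result, the Hessian matrix is negative definite and the second order condition for a maximum of expected utility is satisfied. Given Eq. (13) and $\pi_l = \pi - \pi_c\Delta$, we can rewrite Eq. (12) to:

$$\frac{1 - (1+\lambda_I)\pi_c}{c_I} = (1-\pi_c)\left[\frac{\pi - \pi_c\Delta}{1 + s_I - l} + \frac{1 - \pi + \pi_c\Delta}{1 + s_I}\right] . \tag{17}$$

With $\pi_h = \pi + (1-\pi_c)\Delta$, Eq. (13) can be written as:

$$\frac{1+\lambda_I}{c_I} = \frac{\pi + (1-\pi_c)\Delta}{1 + s_I - l + i_I} + \frac{1 - \pi - (1-\pi_c)\Delta}{1 + s_I + i_I} . \tag{18}$$

The optimal demand for index insurance ($i_I$) therefore depends positively on $\Delta$ and negatively on $\lambda_I$. This is not surprising, as a higher $\lambda_I$ makes insurance more expensive and a higher $\Delta$ reduces basis risk. For $\Delta \leq 0$, insurance demand $i_I$ would be zero even if $\lambda_I = 0$. In this case, the resulting savings $s_I$ would be equal to Eq. (10). For $\lambda_I = 0$ and $\Delta = 1$ (which implies that $\pi_h = 1$, $\pi_l = 0$ and $\pi_c = \pi$) there would be full insurance ($i_I = l$) and savings would be $s_I = -\pi l/2$, which is equal to the savings under loss-based insurance with $\lambda_L = 0$ and $i_L = l$. Hence, with $\lambda_I = 0$, $\Delta = 1$ and $\lambda_L = 0$, the index and the loss-based insurance are identical.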
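Since the index insurance case has no simple closed-form solution, the optimum can be approximated numerically. The following sketch assumes the four second-period states listed above and logarithmic utility. It uses a nested ternary search, which relies on expected utility being concave in savings and cover; the search routine, the restriction of cover to the range from zero to the loss size, and all names (`pi`, `pi_c`, `delta`, `l`, `lam`) are assumptions made for this illustration and are not the paper's simulation code.

```python
import math


def eu_index(s, i, pi, pi_c, delta, l, lam):
    """Expected log utility with index cover i (paid only if a NatCat occurs)."""
    pi_l = pi - pi_c * delta           # loss probability without a NatCat
    pi_h = pi + (1.0 - pi_c) * delta   # loss probability with a NatCat
    first_period = 1.0 - s - (1.0 + lam) * pi_c * i
    return (math.log(first_period)
            + pi_c * pi_h * math.log(1.0 + s - l + i)          # insured loss
            + pi_c * (1.0 - pi_h) * math.log(1.0 + s + i)      # payout, no loss
            + (1.0 - pi_c) * pi_l * math.log(1.0 + s - l)      # uninsured loss
            + (1.0 - pi_c) * (1.0 - pi_l) * math.log(1.0 + s)) # no loss, no payout


def _argmax(f, lo, hi, iters=80):
    """Ternary search for the maximiser of a concave function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0


def optimal_index(pi, pi_c, delta, l, lam):
    """Return (s, i) maximising eu_index with 0 <= i <= l."""
    eps = 1e-9

    def best_s(i):
        upper = 1.0 - (1.0 + lam) * pi_c * i - eps  # keeps consumption positive
        return _argmax(lambda s: eu_index(s, i, pi, pi_c, delta, l, lam),
                       l - 1.0 + eps, upper)

    i_star = _argmax(lambda i: eu_index(best_s(i), i, pi, pi_c, delta, l, lam),
                     0.0, l)
    return best_s(i_star), i_star
```

In the limiting case where a loss occurs if and only if there is a NatCat and the premium is fair, the search should return approximately full cover, which gives a quick plausibility check against the loss-based case.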

Loss-based vs. index insurance
The focus of this paper is to analyse whether an index-based insurance could lead to a higher expected utility than a traditional loss-based insurance. Hence, we want to know under which conditions $EU_I > EU_L$. The mark-up $\lambda_L$ only affects the traditional loss-based insurance and should have a negative effect on its attractiveness, as it makes this kind of insurance more expensive. The derivative of the expected utility in Eq. (1) with respect to $\lambda_L$ is:

$$\frac{dEU_L}{d\lambda_L} = -\frac{\pi i_L}{1 - s_L - (1+\lambda_L)\pi i_L} + \frac{\partial EU_L}{\partial s_L}\frac{ds_L}{d\lambda_L} + \frac{\partial EU_L}{\partial i_L}\frac{di_L}{d\lambda_L} . \tag{19}$$

Since the assumed optimization behaviour of the farmers leads to $\partial EU_L/\partial s_L = \partial EU_L/\partial i_L = 0$, the mark-up $\lambda_L$ has a negative effect on expected utility (for $i_L > 0$). Hence, in line with intuition, the higher $\lambda_L$, the more likely it is that the index insurance leads to a higher expected utility than the loss-based insurance.
The mark-up on the index insurance $\lambda_I$ as well as the breakdown of the loss probability $\pi$ into $\pi_c$, $\pi_l$ and $\Delta$ only affect the index insurance. Just as $\lambda_L$ negatively affects the attractiveness of loss-based insurance, $\lambda_I$ has a negative effect on the expected utility from an index insurance, $EU_I$.
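Since $\pi_l = \pi - \pi_c\Delta$ and $\pi_h = \pi + (1 - \pi_c)\Delta$, we only have to look at the effect of $\Delta$ and $\pi_c$ on the expected utility of a farmer using index insurance ($EU_I$). The derivative of the expected utility in Eq. (11) with respect to $\Delta$ is:

$$\frac{dEU_I}{d\Delta} = \pi_c(1-\pi_c)\left\{\ln\frac{1 + s_I - l + i_I}{1 + s_I + i_I} - \ln\frac{1 + s_I - l}{1 + s_I}\right\} + \frac{\partial EU_I}{\partial s_I}\frac{ds_I}{d\Delta} + \frac{\partial EU_I}{\partial i_I}\frac{di_I}{d\Delta} . \tag{20}$$

Since optimization leads to $\partial EU_I/\partial s_I = \partial EU_I/\partial i_I = 0$, the difference in loss probabilities $\Delta$ has a positive effect on expected utility if:

$$(1 + s_I - l + i_I)(1 + s_I) > (1 + s_I + i_I)(1 + s_I - l) . \tag{21}$$

The left-hand side exceeds the right-hand side by $i_I l$, so for $i_I > 0$ this condition is fulfilled. The rationale for this result is that a higher $\Delta$ reduces the basis risk of the index insurance.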
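The derivative of the expected utility in Eq. (11) with respect to $\pi_c$ (holding $\pi$ and $\Delta$ constant) is:

$$\frac{dEU_I}{d\pi_c} = -\frac{(1+\lambda_I)\, i_I}{1 - s_I - (1+\lambda_I)\pi_c i_I} + [\pi + (1-2\pi_c)\Delta]\,\ln\frac{1 + s_I - l + i_I}{1 + s_I - l} + [1 - \pi - (1-2\pi_c)\Delta]\,\ln\frac{1 + s_I + i_I}{1 + s_I} + \frac{\partial EU_I}{\partial s_I}\frac{ds_I}{d\pi_c} + \frac{\partial EU_I}{\partial i_I}\frac{di_I}{d\pi_c} . \tag{22}$$

Again, optimization leads to $\partial EU_I/\partial s_I = \partial EU_I/\partial i_I = 0$. Hence, the NatCat probability $\pi_c$ has a positive effect on expected utility if:

$$[\pi + (1-2\pi_c)\Delta]\,\ln\frac{1 + s_I - l + i_I}{1 + s_I - l} + [1 - \pi - (1-2\pi_c)\Delta]\,\ln\frac{1 + s_I + i_I}{1 + s_I} > \frac{(1+\lambda_I)\, i_I}{1 - s_I - (1+\lambda_I)\pi_c i_I} . \tag{23}$$

For $l > 0$ and low levels of $\lambda_I$ and $\pi_c$ this condition is fulfilled and the expected utility depends positively on the NatCat probability. The rationale behind this result is that a higher $\pi_c$ makes the insurance more expensive and therefore less attractive, and the more so the higher $\lambda_I$. However, $\pi_c$ also has a positive impact on the expected pay-out and therefore on expected utility. If $\lambda_I$ is low, this positive effect outweighs the negative cost effect. If, however, index insurance is expensive (high $\lambda_I$), the NatCat probability $\pi_c$ would have a negative effect on expected utility. This result implies that (with a low mark-up $\lambda_I$) a very low threshold for the definition of a NatCat and, hence, a high probability of a NatCat would increase expected utility. However, the difference between the NatCat loss probability and the non-NatCat loss probability ($\Delta$) has a positive effect on expected utility as well. This difference likely increases with the severity of weather events and therefore also with a narrower definition of a NatCat. Hence, when using real-world data, there will likely be a trade-off between a high NatCat probability $\pi_c$ and a high $\Delta$.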
The overall loss probability $\pi$ and the extent of the loss $l$ affect the loss-based as well as the index insurance. While both variables obviously have a negative effect on expected utility, independent of the kind of insurance, the extent of the effect could be different. Hence, we have to compare the effect of the two variables on the loss-based-insurance expected utility ($EU_L$) with their effect on the index-insurance expected utility ($EU_I$). As shown above, for $\lambda_L = 0$, $\lambda_I = 0$ and $\Delta = 1$ both kinds of insurance are identical. The same applies to the case that $\lambda_L$ is equal to (9) and the combination of a high $\lambda_I$ and a low $\Delta$ leads to $i_I = 0$. Hence, in both extreme cases the effects of $\pi$ and $l$ are also identical. For the cases in between the extremes, a higher loss probability $\pi$ would make the index insurance relatively more attractive if:

$$\pi_c \ln\frac{1 + s_I - l + i_I}{1 + s_I + i_I} + (1-\pi_c)\ln\frac{1 + s_I - l}{1 + s_I} > \ln\frac{1 + s_L - l + i_L}{1 + s_L} - \frac{(1+\lambda_L)\, i_L}{1 - s_L - (1+\lambda_L)\pi i_L} . \tag{24}$$

Hence, a higher $\lambda_L$ would make the effect of $\pi$ on $EU_L$ more negative. A higher $\lambda_I$ and/or a lower $\Delta$ (or a lower $i_I$, respectively), in turn, would make the effect of $\pi$ on $EU_I$ more negative. Hence, the effect of $\pi$ on the difference between $EU_L$ and $EU_I$ is unclear and depends on $\lambda_L$, $\lambda_I$ and $\Delta$. The effect of $l$ on the difference between $EU_L$ and $EU_I$ is also unclear if $\lambda_L$, $\lambda_I$ and $\Delta$ have intermediate values (i.e. $0 < i_L, i_I < l$). A higher loss $l$ would make the index insurance relatively more attractive if:

$$\frac{\pi_c\pi_h}{1 + s_I - l + i_I} + \frac{(1-\pi_c)\pi_l}{1 + s_I - l} < \frac{\pi}{1 + s_L - l + i_L} . \tag{25}$$

Loss-based and index insurance
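So far, we have assumed that there is either a loss-based or an index insurance. This section analyses the optimal decision of the crop farmer when both kinds of insurance are available. In this case, expected utility is given by:

$$EU = \ln[c_1] + \pi_c\pi_h \ln[1 + s - l + i_L + i_I] + \pi_c(1-\pi_h)\ln[1 + s + i_I] + (1-\pi_c)\pi_l \ln[1 + s - l + i_L] + (1-\pi_c)(1-\pi_l)\ln[1 + s] , \tag{26}$$

where $c_1 = 1 - s - (1+\lambda_L)\pi i_L - (1+\lambda_I)\pi_c i_I$ is first-period consumption. The farmer now maximizes expected utility by choosing optimal savings ($s$), loss-based insurance ($i_L$) and index insurance ($i_I$). The first order conditions are:

$$\frac{\partial EU}{\partial s} = -\frac{1}{c_1} + \frac{\pi_c\pi_h}{1 + s - l + i_L + i_I} + \frac{\pi_c(1-\pi_h)}{1 + s + i_I} + \frac{(1-\pi_c)\pi_l}{1 + s - l + i_L} + \frac{(1-\pi_c)(1-\pi_l)}{1 + s} = 0 , \tag{27}$$

$$\frac{\partial EU}{\partial i_L} = -\frac{(1+\lambda_L)\pi}{c_1} + \frac{\pi_c\pi_h}{1 + s - l + i_L + i_I} + \frac{(1-\pi_c)\pi_l}{1 + s - l + i_L} = 0 \tag{28}$$

and

$$\frac{\partial EU}{\partial i_I} = -\frac{(1+\lambda_I)\pi_c}{c_1} + \frac{\pi_c\pi_h}{1 + s - l + i_L + i_I} + \frac{\pi_c(1-\pi_h)}{1 + s + i_I} = 0 . \tag{29}$$

Rearranging the first order conditions leads to:

$$\frac{1 - (1+\lambda_I)\pi_c}{(1-\pi_c)\, c_1} = \frac{\pi_l}{1 + s - l + i_L} + \frac{1-\pi_l}{1 + s} , \tag{30}$$

$$\frac{1+\lambda_L}{c_1} = \frac{\pi_c\pi_h/\pi}{1 + s - l + i_L + i_I} + \frac{(1-\pi_c)\pi_l/\pi}{1 + s - l + i_L} \tag{31}$$

and

$$\frac{1+\lambda_I}{c_1} = \frac{\pi_h}{1 + s - l + i_L + i_I} + \frac{1-\pi_h}{1 + s + i_I} . \tag{32}$$

The sum of the numerators on the right-hand side of each of these equations is always one. Hence, each left-hand side of Eqs. (30) to (32) is the weighted average of the corresponding two fractions on the right-hand side. As a consequence, for $i_I > 0$ and $i_L \leq l$, the right-hand side of Eq. (32) is at most $1/(1 + s - l + i_L + i_I)$, while the right-hand side of Eq. (31) is at least as large as this fraction, so that:

$$\frac{1+\lambda_I}{c_1} < \frac{1+\lambda_L}{c_1} . \tag{33}$$

Hence, a positive demand for index insurance $i_I$ requires that $\lambda_I < \lambda_L$. The rationale behind this result is that the attractiveness of index insurance is harmed not only by the mark-up but also by the basis risk. Evaluating the first order conditions at $i_I = 0$ (with $i_L > 0$) shows that the demand for index insurance is positive if:

$$\lambda_I < \lambda_L\,\frac{(1-\pi_c)\Delta}{1-\pi} . \tag{34}$$

Hence, for $\Delta > 0$, $\lambda_I = 0$ and $\lambda_L > 0$, there is always a positive demand for index insurance. However, since $i_I$ and $i_L$ are substitutes, for $i_L > 0$ the demand for index insurance is lower than without the possibility to purchase loss-based insurance.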
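Whether a farmer who already holds loss-based cover would add index cover can be checked by evaluating the marginal expected utility of the first unit of index cover. The sketch below is an illustration under the model assumptions stated in the text (logarithmic utility, zero interest rate, interior loss-based cover). The closed-form state consumption levels were derived from the first order conditions for this example, and the function and parameter names are assumptions, not the paper's notation.

```python
def marginal_value_of_index_cover(pi, pi_c, delta, l, lam_L, lam_I):
    """Marginal expected utility of index cover at i_I = 0.

    Assumes the farmer has already chosen savings and loss-based cover
    optimally and that loss-based cover is strictly positive.
    """
    m = 1.0 + lam_L
    pi_h = pi + (1.0 - pi_c) * delta                    # loss probability in a NatCat
    loss_state = 1.0 / m - pi * l / 2.0                 # consumption after an insured loss
    no_loss_state = (1.0 - pi) * m * loss_state / (1.0 - m * pi)
    if loss_state - no_loss_state + l <= 0.0:
        raise ValueError("loss-based cover is zero; the closed form does not apply")
    first_period = m * loss_state                       # first-period consumption
    return pi_c * (-(1.0 + lam_I) / first_period
                   + pi_h / loss_state
                   + (1.0 - pi_h) / no_loss_state)


# Illustrative values only: cheap index cover is attractive, dearer cover is not.
print(marginal_value_of_index_cover(0.2, 0.1, 0.5, 0.5, lam_L=0.3, lam_I=0.1) > 0)
print(marginal_value_of_index_cover(0.2, 0.1, 0.5, 0.5, lam_L=0.3, lam_I=0.2) > 0)
```

For these example values the sign switches at an index mark-up of roughly 0.17, which is well below the loss-based mark-up of 0.3 and illustrates that basis risk has to be compensated by a lower mark-up.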

Weather and agriculture in Germany
Although the relationship between crop yield and weather has already been studied intensively, most of the literature focuses on climate conditions, not on short-term weather fluctuations. Discussions like that between Deschênes and Greenstone (2007, 2012) and Fischer et al. (2012) also show that the results depend greatly on the data and the approach. Hence, further research addressing the challenges in estimating the link between weather and agricultural output is still required, especially in the area of short-term fluctuations. Trends in crop time series are another obstacle in estimations, because the overall production of crops has increased in recent decades (Food and Agriculture Organisation 2020), partially due to technological advancement but also due to warmer conditions caused by recent climate change. Although this may seem contradictory at first glance, one has to keep in mind that climate change also comes with an increasing number of extreme weather conditions which affect agricultural output negatively (Kapphan et al. 2019). Furthermore, as climate change affects the timing and length of seasons, it also has an effect on crops' life cycles. Crops are observed to adapt to climate change by earlier blooming and grain filling (Rezaei et al. 2000; Xiao et al. 2015), which can either be beneficial or leave them more vulnerable when exposed to more extreme weather events (Brown 2013). Such events can also have indirect effects. Bakker et al. (2005) argue that specific weather like heavy rainfall can benefit pests. With standardised seeds and planting practices, crops might also be more sensitive to pests and diseases and more strongly influenced by weather (Chen et al. 2004).
Another important issue to be considered is regional conditions, which lead to ambiguous results in the literature. Although most studies show that, in general, heat stress (e.g. Bakker et al. 2005; Brown 2013; Ferris et al. 1998) and drought stress (e.g. Eitzinger et al. 2013; Olesen et al. 2000; Torriani et al. 2007) decrease crop yield, a closer look reveals results that vary by country and season.
Brown (2013) finds a lack of rainfall in mid-summer to be beneficial in Scotland, while the results of Olesen et al. (2000) provide evidence that increasing precipitation in July has a significant negative effect on crops in Denmark. Gornott and Wechsung (2016) find for Germany that winter wheat seems to be sensitive to low water supply in early growing stages, i.e. in spring. The results regarding the effect of increased temperatures are just as ambiguous, especially because the related variables temperature and radiation seem to have counterbalancing effects. According to Brown (2013), crops in Scotland benefit from higher radiation in early growth stages but suffer from increased temperatures. The positive effect of higher radiation in spring is also found for Denmark by Olesen et al. (2000) and Kristensen et al. (2011). Bakker et al. (2005) and Atkinson et al. (2005) both find that radiation has a significant negative influence on crop yield in numerous central and southern regions of Europe. Increased temperatures are shown to have a negative effect on crop yield in England (Ferris et al. 1998), as well as in Denmark (Kristensen et al. 2011) and China (Zhang et al. 2013). By contrast, Bakker et al. (2005) find a positive effect for European countries, and Rezaei et al. (2000) confirm these findings for Germany.
In conclusion, even though dry spells and heat stress are shown to significantly lower yields, the exact relationship and interaction between different weather variables and yield is difficult to estimate. In addition to the challenges of choosing data and methods, there is also a temporal and a spatial component to consider.
From the results in the literature, we learn that the development stages of crops in spring and summer should be considered, which means using seasonal weather variables instead of annual averages, as suggested by Maddison et al. (2007). We proceed on the assumption that our results will resemble the general results in the literature. We expect higher temperatures and more heat days, as well as more precipitation in summer, to increase the probability of a loss. In contrast, precipitation in spring is expected to decrease the probability of a loss. No assumption can be made for the effect of radiation, because radiation is shown to have both positive and negative effects on crop yield.

Data
The data sets are created from several sources. We combine winter wheat yield and weather data for Germany in a time series from 1999 to 2019 at district level, which is the second lowest administrative level in Germany. The yield data is available from the regional department of the Federal Statistical Office of Germany (Statistisches Bundesamt). The weather data is obtained from the Climate Data Center of the German Weather Service (DWD), which provides data from observation stations throughout Germany as well as interpolated and modelled data grids. We go into more detail about the data sets in the following sections.
The winter wheat data set
The yield data set by Statistisches Bundesamt contains the annual yield of winter wheat in decitonnes per hectare (dt/ha, i.e. 0.1 tons per hectare) for every district from 1999 to 2019. In the years 2007, 2008, 2011 and 2016, there were reforms in the areal allocation of the districts. During that process, several districts were merged into bigger districts. To create a data set with complete time series for the currently (2021) valid districts, including the years before the reforms, the yields of the former districts are merged and averaged. Germany is currently divided into 16 states and altogether 401 districts. In consideration of land use, only the rural districts are used for further analysis. Rural districts are identified according to Landatlas (2018), a data source of the Federal Ministry of Food and Agriculture. The Ministry defines rural districts by the Thünen typology, which uses a relatively high proportion of agricultural land use, a lower settlement density, the proportion of one- or two-family houses and the distance to bigger centres. One of the five categories is considered "not rural". Therefore, the 98 districts which fall into this category (including cities and city areas like Hamburg and Berlin) are dropped and the analysis includes the remaining 303 districts. Of those, only 224 districts provide a complete time series for 1999 to 2019. Table 1 provides descriptive statistics of the data set. With regard to absolute production, the district mean, i.e. the average total production in the period 1999 to 2019, varies widely between individual districts (see Table 1).
As the aim of this section is to get a better grasp of yield losses, a variable was created as an alternative to absolute yield production. The new variable Dev_yield represents the deviation of a district's yield from the individual district mean (variable mean_yield), in percent of that mean. Figure 1 shows the deviation of the production in the districts for the period 1999 to 2019. The years 2003 and 2018 are known for high temperatures in summer and dry conditions. In Fig. 1 those years, along with 2011 and 2012, can clearly be spotted as years with heavy losses. On the other hand, the years 2004 and especially 2014 can be interpreted as productive years for farmers regarding winter wheat (i.e. low/negative losses).
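The construction of the deviation variable can be sketched in a few lines. The example below is a minimal illustration, assuming yields are stored per district and year; the data layout, the function name and the sign convention (negative values for yields below the district mean) are assumptions, and the numbers are made up.

```python
from statistics import mean


def deviation_from_district_mean(yields):
    """Percent deviation of each annual yield from its district's own mean.

    yields: {district: {year: yield in dt/ha}}
    returns: {district: {year: deviation in percent of the district mean}}
    """
    deviations = {}
    for district, series in yields.items():
        mean_yield = mean(series.values())
        deviations[district] = {
            year: 100.0 * (value - mean_yield) / mean_yield
            for year, value in series.items()
        }
    return deviations


# Hypothetical district with a poor harvest in 2018.
example = {"district_A": {2017: 80.0, 2018: 60.0, 2019: 100.0}}
print(deviation_from_district_mean(example)["district_A"][2018])
```

Measuring deviations relative to each district's own mean removes permanent differences in yield levels between districts, so that only the year-to-year fluctuations remain.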
The weather data set
From the literature, we know that precipitation, temperature and radiation are essential factors in the agricultural sector, especially in spring (March to May) and summer (June to August). Therefore, we include several indicators for those three weather variables in our estimation, starting with precipitation.
The DWD provides seasonal precipitation raster grids in which the total precipitation amount is given in mm/cm². These grids are the result of an interpolation project named REGNIE (for further details see DWD). The raster grids consist of 611 × 971 square grid cells covering the area of Germany (611 cells in east-west direction and 971 in north-south direction), which translates to a spatial resolution of 1 × 1 km. The grids are intersected with a multi-polygon shapefile of the districts. Each grid cell within the boundaries of a district polygon is allocated to that specific district. Cells which are only partially within the boundaries are still fully accounted for and are allocated to both overlapping districts, which means they are counted twice. Since the average district size is about 900 km², the bias which could result from the double counting is assumed to be negligible. For each season, year and district, the raster data is extracted and aggregated by averaging the cell values.
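The aggregation step can be illustrated with a small example. The sketch below assumes that the intersection of grid cells and district polygons has already been computed and is available as a mapping from each cell to the districts it touches; the data structures and names are hypothetical and do not reflect the authors' implementation.

```python
from collections import defaultdict


def district_means(cell_values, cell_districts):
    """Average raster cell values per district.

    cell_values: {cell_id: precipitation value}
    cell_districts: {cell_id: [district, ...]}; a border cell lists every
    district it overlaps and is therefore counted once for each of them.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, value in cell_values.items():
        for district in cell_districts.get(cell, []):
            totals[district] += value
            counts[district] += 1
    return {district: totals[district] / counts[district] for district in totals}


# Hypothetical example: cell 2 lies on the border between districts A and B.
values = {1: 100.0, 2: 200.0, 3: 300.0}
membership = {1: ["A"], 2: ["A", "B"], 3: ["B"]}
print(district_means(values, membership))
```

With districts of several hundred cells each, the contribution of a shared border cell to either district mean is small, which is the argument made above for treating the double counting as negligible.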
The station data provides three temperature variables: the lowest temperature measured, the mean temperature and the highest temperature measured. Data from all 810 available stations are used to calculate the mean seasonal temperatures. However, the daily mean temperature is provided by significantly more weather stations than the minimum and maximum temperatures. Therefore, the temperature variables used to create the heat index for spring and summer at district level are calculated from the daily data of 249 weather stations which have recorded continuously since at least 1999. The heat index gives the number of days

From the literature, we learned that radiation is an important factor for crop growth. The DWD provides monthly sums of sunshine hours for 439 weather stations since 1892. Hence, we use sunshine hours as an indicator for radiation in spring and summer.
Since weather is a system of interacting factors, correlation between these variables might bias the results. As displayed in Fig. 2, the weather variables in our dataset show some correlation, but it is low enough to use them in a common setting. Figure 2 also shows the expected positive relationship between sunshine hours and temperature, as well as the negative relationship between sunshine hours and precipitation.
After creating the indices, the point data are transformed to planar data using the common approach of Thiessen polygons. This is a simple method in which the space between two points is divided equally to create polygons, implemented with the R package dismo by Hijmans et al. (2017). The resulting polygons are converted into raster grids with the same resolution as the precipitation grids. This step provides the advantage of a weighted mean when the grids are again aggregated to district level for each season over the time series from 1999 to 2019. All weather variables defined, created and used in this work are displayed in Table 2.
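Rasterized Thiessen polygons are equivalent to giving every grid cell the value of its nearest station. The following minimal sketch uses Python with NumPy instead of the R package dismo; the function name `thiessen_raster`, the station coordinates and the values are hypothetical.

```python
import numpy as np

def thiessen_raster(station_xy, station_values, grid_x, grid_y):
    """Rasterize Thiessen polygons: each cell takes the value of its nearest station."""
    gx, gy = np.meshgrid(grid_x, grid_y)               # cell-centre coordinates
    cells = np.column_stack([gx.ravel(), gy.ravel()])  # shape (n_cells, 2)
    # Squared Euclidean distance from every cell to every station.
    d2 = ((cells[:, None, :] - station_xy[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return station_values[nearest].reshape(gy.shape)

# Two hypothetical stations at x = 0 and x = 10 and a strip of ten 1 x 1 cells.
stations = np.array([[0.0, 0.0], [10.0, 0.0]])
values = np.array([16.0, 20.0])
raster = thiessen_raster(stations, values, np.arange(0.5, 10, 1.0), np.array([0.5]))

# Averaging the cells of a district weights each station by the area of its
# polygon inside the district.
district_mean = float(raster.mean())
```

Here each station covers half of the strip, so the district mean is the equally weighted average of the two station values. With unequal polygon areas, the weights shift accordingly.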

Estimation approach
The dependent variable in each of our estimations is the deviation from the mean yield in percent for winter wheat. The ordinary least squares panel estimation is given by the following equation:
$$y_{dt} = \alpha_d + \beta X_{dt} + \gamma t + \varepsilon_{dt}$$
where $y_{dt}$ is the deviation from the average yield in district d at year t, $\alpha_d$ denotes the district fixed effects, $X_{dt}$ represents the weather variables in district d at year t and $\varepsilon_{dt}$ is the error term for unknown factors. The weather variables were standardized (denoted $X^{st}_{dt}$) in order to simplify the interpretation of the regression results. As discussed by Deschênes and Greenstone (2012), too many fixed effects take out too much variation. Hence, no time fixed effects are included. Instead, the trend is estimated by $\gamma t$. Brown (2013) argues that weather fluctuates more than agricultural inputs such as fertilizers, and as we know from the literature, more uniform agricultural practices most likely lead to a stronger influence of weather (Chen et al. 2004). Therefore, no additional controls are included either.
Two sets of weather variables are estimated. The first estimation includes mean temperature, sunshine hours and precipitation with the goal of determining the effects of fluctuations from average conditions. In the second estimation, the variable mean temperature is exchanged for the heat index to control for the effects of extreme conditions. Since the relationship between the weather variables and the deviation in winter wheat yields might not be linear, a second-order polynomial approach is also estimated with squared weather variables, described by $(X^{st}_{dt})^2$.
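The estimation approach can be sketched on synthetic data. The example below is an illustration and not the estimation code behind Tables 3 to 5. It assumes a z-score standardization (whether the mean is subtracted in the paper is an assumption), a balanced panel and a single weather variable; all names and parameter values are hypothetical.

```python
import numpy as np

def standardize(x):
    """Z-score standardization (assumption: subtract mean, divide by std)."""
    return (x - x.mean()) / x.std()

def within_ols(y, X, district):
    """OLS with district fixed effects via the within transformation."""
    y_w = y.astype(float).copy()
    X_w = X.astype(float).copy()
    for d in np.unique(district):
        idx = district == d
        y_w[idx] -= y_w[idx].mean()          # remove district mean of y
        X_w[idx] -= X_w[idx].mean(axis=0)    # remove district means of regressors
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef

# Synthetic balanced panel: 50 districts, 21 years (hypothetical values).
rng = np.random.default_rng(0)
n_districts, n_years = 50, 21
district = np.repeat(np.arange(n_districts), n_years)
year = np.tile(np.arange(n_years), n_districts).astype(float)
temp_st = standardize(rng.normal(17.7, 1.0, district.size))
alpha = rng.normal(0.0, 2.0, n_districts)[district]   # district fixed effects
y = (alpha + 3.0 * temp_st + 0.5 * temp_st**2 - 0.4 * year
     + rng.normal(0.0, 0.1, district.size))

# Second-order polynomial specification with a linear trend.
X = np.column_stack([temp_st, temp_st**2, year])
beta_temp, beta_temp_sq, gamma_trend = within_ols(y, X, district)
```

Dropping the squared column from `X` gives the baseline specification. The within transformation is one of several equivalent ways to absorb district fixed effects; district dummies would give the same slope estimates.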

Estimation results
The results (see Table 3) for the influence of temperature are in line with expectations. They confirm the findings of Ferris et al. (1998), Kristensen et al. (2011) and Chen et al. (2004), who found a negative effect of increasing temperatures on yields, but contrast with Bakker et al. (2005) and Rezaei et al. (2015), who found a positive effect of increasing temperature. Although the mean temperature in spring has no significant effect, days in spring with temperatures above 25 °C show a significant positive influence on losses. The conclusion by Mishra and Sahu (2014) and Lippert et al. (2009) that minor increases of the mean temperature do not have a significant effect, but that exceeding certain levels decreases yield, applies in our estimation for spring. The average temperature in summer is implied to have the strongest negative effect on winter wheat yield relative to the other variables. The effect of radiation was found to be ambiguous in the literature. However, our results imply that increasing sunshine hours in both spring and summer are unfavorable for winter wheat, which is in line with the results of Bakker et al. (2005), who estimated radiation effects for Germany. The influence of precipitation in spring is negative as expected, although no significant effect is found. The beneficial effect of sufficient rainfall during the early growing stages has also been found by Gornott and Wechsung (2016). As Germany is not typically known as a country suffering from water stress, the few years with extremely warm conditions (2003 and 2018) do not seem to be sufficient to influence the estimation results significantly. Precipitation in summer seems unfavorable for crop yield as it increases the negative deviation, which conforms with the findings of Olesen et al. (2000). This is in line with expectations, although precipitation only shows a significant effect in combination with the heat indices. A possible explanation can be found in the higher dynamics of the hydrological cycle under warming conditions, as the probability of heavy (and harmful) rainfall events increases with the increased water-holding capacity of warmer air.
The results of the second-order polynomial regression correspond to the baseline estimation (see Table 4, second row). The MeanTemp_st_su coefficients are the strongest in this setting, as in the baseline estimation. As expected, precipitation in spring is positive as long as it stays within a certain range, while precipitation in summer has a negative effect on yield. The gain in R² in the polynomial regression is low compared to the baseline model. Although the R² in each model is low, the results of our estimation approach still imply a significant link between weather and crop yield in Germany. Mean temperatures in summer have the strongest effect on yield fluctuations relative to the other variables. Therefore, this variable is examined further as a possible candidate for the calibration of the insurance model.
The second column of Table 4 displays the results of a second-order polynomial estimation with mean summer temperatures only, and the first column of Table 5 shows its OLS counterpart. The gain in R² in the polynomial estimation is very low, but the coefficient is lower by about 0.6. Table 5 also displays each weather variable in a separate OLS estimation. R² and the coefficient are highest for summer temperatures. It can be concluded that MeanTemp_st_su is a robust variable and that a possible correlation between MeanTemp_st_su and other variables does not bias the results. In Table 6, summer temperatures are compared for the full sample (1999 to 2019) and the half sample (1999 to 2009) as a robustness check. A possible explanation for the higher values in the half sample could be a higher variability of weather variables in several districts, since reports of unusual weather in Germany increased during that time. As the variable MeanTemp_st_su proves to be robust, it is a suitable index for simulating the insurance model in the next section.

Simulation for Germany
Our main research question is whether an index insurance could be a welfare-enhancing option for the German agricultural sector. From Sect. 2, we know that the answer to this question depends on the one hand on the price of (or mark-up on) a traditional loss-based insurance vs. the price of (or mark-up on) index insurance, and on the other hand on the trade-off between a high NatCat probability and a large difference between the loss probabilities with and without a NatCat. Hence, we are interested in a combination of an index and a NatCat definition which leads to a good fit to the loss events. According to Sect. 3.3, the mean temperature in summer is the most promising indicator for losses in winter wheat production in Germany.
In this section, we simulate our theoretical model by deriving the different parameters from the winter wheat and mean temperature data presented in Sect. 3.3. This implies in particular deriving the optimal threshold of a NatCat (i.e. the mean temperature that triggers payments of the index insurance) and the comparison of the calibrated expected utility of a farmer using a traditional loss based insurance and a farmer using the optimal index insurance.

Simulation of the loss based insurance
According to Eq. (1), the simulation of the loss-based insurance requires the overall loss probability, the extent of the loss $l$ and the mark-up on the fair premium of loss-based insurance.
The loss probability and the loss extent $l$ are based on the crop yield data presented in Sect. 3.1. Since the data only show the distribution of absolute crop yields or of the negative deviation from the corresponding district means, and not a binomial distribution of "loss" and "not a loss", we first have to define a loss. One option would be to define every negative deviation of the crop yield as a loss. However, in this case the loss probability would be about 50%. Hence, we only consider more material negative deviations as a loss.
Furthermore, we have to consider that there is a positive trend in yields and, therefore, a negative trend in the negative deviations of the crop yields from their mean. According to our estimation with mean temperature in summer as the only weather variable, there is a significant linear trend component of −0.567 per year. As this trend component captures both the (negative) trend of the yield deviation and the (positive) trend of the mean temperature in summer, we have to disentangle the two effects. By minimizing the mean squared error between a linear trend and the negative yield deviation, we get a trend of −0.395. The rest (i.e. 0.172) can be attributed to the increase in mean temperatures (see footnote 12). Hence, we consider material deviations of the negative yield deviations from their trend as a loss. According to Table 1, the standard deviation of the negative deviations (or losses) is about 10%. About 15% of the "losses" are more than 10% above their trend, and the average of these losses is about 17%. With this definition of a loss, the loss probability would be 15% and $l$ would be 17%. When we consider 20% (i.e. about two standard deviations) as the threshold for a loss, the loss probability would be 3.6% and $l$ = 26%. A loss threshold of 30% (about three standard deviations) would lead to a loss probability of 0.9% and $l$ = 35%.
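The loss definition can be sketched as follows. The example assumes that the trend is fitted by least squares over all observations and that $l$ is the average deviation from trend among the observations classified as a loss. Both are readings of the text, and the data are synthetic normal draws rather than the yield data of Sect. 3.1; the function name `loss_parameters` is hypothetical.

```python
import numpy as np

def loss_parameters(neg_dev, year, threshold):
    """Return (loss probability, average loss) for a given loss threshold.

    neg_dev: negative yield deviation from the district mean, in percent.
    year: observation year, used to fit a linear trend by least squares.
    threshold: minimum deviation above trend (in percent) that counts as a loss.
    """
    slope, intercept = np.polyfit(year, neg_dev, 1)
    excess = neg_dev - (intercept + slope * year)   # deviation from trend
    is_loss = excess > threshold
    prob = float(is_loss.mean())
    avg_loss = float(excess[is_loss].mean()) if is_loss.any() else float("nan")
    return prob, avg_loss

# Synthetic panel: 400 districts x 21 years, trend of -0.395 per year and a
# standard deviation of 10 (hypothetical, normally distributed deviations).
rng = np.random.default_rng(1)
year = np.tile(np.arange(21.0), 400)
neg_dev = -0.395 * year + rng.normal(0.0, 10.0, year.size)

prob_10, loss_10 = loss_parameters(neg_dev, year, 10.0)
prob_20, loss_20 = loss_parameters(neg_dev, year, 20.0)
```

Raising the threshold lowers the loss probability and raises the average loss, which is the pattern reported for the 10%, 20% and 30% thresholds. The exact values depend on the empirical distribution and are not reproduced by normal draws.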
The mark-up on the fair premium directly affects the relative attractiveness of the loss-based insurance. According to AXCO data for German property insurance, average loss ratios between 2000 and 2018 were about 73%, which translates into a mark-up of about 0.37. Insuring NatCats is likely more costly than average property insurance and, therefore, the relevant mark-up is likely higher. This is especially true if we consider a high loss threshold, which results in a low-probability/high-loss risk that is difficult to insure. On the other hand, digitalization could help to bring down the costs of distribution and management, which would result in a lower mark-up in the future.
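The translation of a loss ratio into a mark-up can be made explicit. The sketch assumes that the loss ratio is claims divided by premiums and that the premium equals (1 + mark-up) times the expected claims; the function names are hypothetical.

```python
def markup_from_loss_ratio(loss_ratio):
    """Mark-up on the fair premium implied by a loss ratio.

    Assumption: premium = (1 + markup) * expected claims, so that
    loss_ratio = expected claims / premium = 1 / (1 + markup).
    """
    return 1.0 / loss_ratio - 1.0

def loss_ratio_from_markup(markup):
    """Inverse mapping, under the same assumption."""
    return 1.0 / (1.0 + markup)

# Average loss ratio for German property insurance reported in the text.
german_markup = markup_from_loss_ratio(0.73)
```

Under this assumption a loss ratio of 73% gives a mark-up of about 0.37, in line with the figure in the text.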
From Eq. (9), we know the maximum mark-up on loss-based insurance that allows a positive insurance demand for a given combination of loss probability and $l$. For a 10% loss threshold (i.e. a loss probability of 15% and $l$ = 17%), the maximum mark-up for a positive insurance demand would be about 0.16 and therefore lower than the 0.37 for property insurance in general. This is one explanation for the fact that traditional loss-based insurance for crop yields hardly exists. If we consider a higher loss threshold, the maximum mark-up for a positive insurance demand also increases. With a 20% loss threshold, the maximum mark-up would be about 0.34 and therefore close to 0.37. With a loss threshold of 25% the maximum would be 0.44, and with a threshold of 30% it would be 0.53. Hence, the more extreme the risk, the more attractive the loss-based insurance becomes for the farmer. However, such low-probability/high-loss risks are difficult to insure and would likely come with an above-average mark-up.
As a robustness check, we repeat the analysis using data until 2009. Now, with a threshold of 10%, the loss probability is 13% and $l$ = 16% (instead of 15% and 17%), but the maximum mark-up for a positive insurance demand would still be about 0.16. For a threshold of 20%, the 2009 data would lead to a loss probability of 2.5% and $l$ = 27% (instead of 3.6% and 26%), and the maximum mark-up would be about 0.36 (instead of 0.34). For a threshold of 30%, the 2009 data would lead to a loss probability of 0.8% and $l$ = 36% (instead of 0.9% and 35%), and the maximum mark-up would be about 0.55 (instead of 0.53). Hence, the results of the simulation seem to be rather stable.
Footnote 12: The corresponding trend of the mean temperature in summer is 0.0536. When we consider that we use standardized mean temperatures (divided by the standard deviation of 1.036) and that the mean temperatures are multiplied by the parameter value 3.324, we get 0.172 as an adjusted trend component. Hence, the sum of the two trend components (0.172 and 0.395) is equal to the trend component in our estimation.
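The arithmetic of footnote 12 can be retraced in a few lines. The numbers are those given in the text; the variable names are hypothetical.

```python
# Values reported in the text.
temp_trend = 0.0536   # yearly trend of the mean summer temperature
temp_sd = 1.036       # standard deviation used for standardization
temp_coef = 3.324     # estimated coefficient of the standardized temperature
yield_trend = 0.395   # absolute value of the trend in the negative yield deviation

# Trend in the standardized temperature, scaled by its coefficient.
adjusted_temp_trend = temp_trend / temp_sd * temp_coef

# Sum of the two components, in absolute terms.
total_trend = adjusted_temp_trend + yield_trend
```

Rounded to three decimals, the adjusted temperature component is 0.172 and the sum is 0.567, which matches the absolute value of the trend component reported for the estimation.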

Simulation of the index insurance
According to Eq. (11), we need the NatCat probability, the conditional loss probabilities without and with a NatCat, the extent of the loss $l$ and the mark-up on index insurance to simulate the expected utility of a farmer using index insurance. To be able to compare the resulting expected utility with the expected utility of a farmer using a loss-based insurance, we have to use the same definition of a loss and hence the same $l$ and the same overall loss probability.
No data are available on the mark-up on index insurance in Germany. Given that index insurance does not require individual risk and loss assessments, the mark-up on index insurance should be lower than the mark-up on loss-based insurance. Most studies of index insurance focus on developing economies, where the mark-up on fair insurance premiums is much higher in general. According to Carter et al. (2017), the mark-up on agricultural index insurance in the U.S. is about 20-30%. However, mark-ups on property insurance in general are also higher in the U.S. than in Germany (0.64 compared to 0.37; see footnote 13). Hence, we assume a mark-up at the lower end of this range.
The NatCat probability depends on the definition of a NatCat. Since we use mean temperatures in summer as the relevant indicator, it is the fraction of observations (years and districts) with a mean temperature above a certain threshold. Given that there is a positive trend in mean temperatures, we have to look at the deviation of the temperatures from their trend. The mean of the mean temperatures in summer is 17.7 °C and the corresponding trend increases by 0.0536 each year (see footnote 14). If the relevant threshold for the deviation of the temperatures from their trend were zero, 44% of the observations would be a NatCat, i.e. the NatCat probability would be 44%. If the threshold were 2 °C (i.e. about 19.7 °C on average), the NatCat probability would be only 3.9%.
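The NatCat definition can be sketched in the same way as the loss definition. The example assumes a least-squares linear trend over all observations and uses synthetic, normally distributed temperatures, so it does not reproduce the 44% and 3.9% reported for the actual data; the function name `natcat_probability` is hypothetical.

```python
import numpy as np

def natcat_probability(temp, year, threshold):
    """Share of observations whose temperature exceeds its linear trend
    by more than `threshold` degrees."""
    slope, intercept = np.polyfit(year, temp, 1)
    deviation = temp - (intercept + slope * year)
    return float((deviation > threshold).mean())

# Synthetic panel: 400 districts x 21 years with mean 17.7, a yearly trend of
# 0.0536 and a standard deviation of 1.036 (normal draws are an assumption).
rng = np.random.default_rng(2)
year = np.tile(np.arange(21.0), 400)
temp = 17.7 + 0.0536 * (year - 10.0) + rng.normal(0.0, 1.036, year.size)

p0 = natcat_probability(temp, year, 0.0)
p1 = natcat_probability(temp, year, 1.0)
p2 = natcat_probability(temp, year, 2.0)
```

With symmetric draws the probability at a zero threshold is close to one half, whereas the text reports 44%. This suggests that the empirical deviations are not symmetric around their trend.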
The conditional loss probabilities with and without a NatCat depend on the combination of the loss definition and the NatCat definition. If we consider a 20% deviation loss threshold (loss probability of 3.6% and $l$ = 26%) and a 1 °C deviation as the threshold for a NatCat (NatCat probability of 14%), 13.2% of the NatCat observations would also be a loss, but only 2.0% of the non-NatCat observations. The difference between the two conditional loss probabilities would therefore be 11.2 percentage points. If we chose a 2 °C deviation as the NatCat threshold (NatCat probability of 3.9%), the difference would increase to 17.1 percentage points (20.0% with a NatCat and 2.9% without). Figure 3 displays the impact of the NatCat definition on the NatCat probability and on the difference between the conditional loss probabilities for two different loss definitions (10% and 20% negative deviation threshold). The threshold for the NatCat definition obviously has a negative impact on the NatCat probability. As expected, the threshold has (overall) a positive effect on the difference between the conditional loss probabilities, as more severe weather events are more likely to lead to losses.
As a robustness check, we repeat the analysis using data until 2009. Now, with a loss threshold of 20% and a 1 °C deviation as the threshold for a NatCat, 13% of the observations would be a NatCat (instead of 14% in the full sample). 14.3% of the NatCat observations would also be a loss (instead of 13.2%), but only 0.7% of the non-NatCat observations (instead of 2%). The difference would therefore be 13.6 percentage points instead of 11.2. With a 2 °C deviation as the threshold for a NatCat, 6.3% of the observations would be a NatCat (instead of 3.9% in the full sample). 20.6% of the NatCat observations would also be a loss (instead of 20.0%) and 1.3% of the non-NatCat observations (instead of 2.9%). The difference would therefore be 19.3 percentage points instead of 17.1. Hence, using only data up to 2009 would make index insurance more attractive to farmers (higher NatCat probability and larger difference between the conditional loss probabilities). Or, in other words, the shift in the distribution would have led to lower than expected profits for insurers and, hence, may have reduced the appeal of the index insurance product for insurers. However, overall the results of the simulation seem to be rather stable.
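The reported conditional probabilities can be cross-checked against the overall loss probabilities with the law of total probability. The numbers are those reported for the 20% loss threshold in the full sample and in the sample until 2009; the function and variable names are hypothetical.

```python
def overall_loss_probability(p_natcat, p_loss_natcat, p_loss_no_natcat):
    """Law of total probability over the events NatCat / no NatCat."""
    return p_natcat * p_loss_natcat + (1.0 - p_natcat) * p_loss_no_natcat

# (NatCat probability, loss probability with NatCat, loss probability without)
cases = {
    "full sample, 1 degree": (0.14, 0.132, 0.020),
    "full sample, 2 degrees": (0.039, 0.200, 0.029),
    "until 2009, 1 degree": (0.13, 0.143, 0.007),
    "until 2009, 2 degrees": (0.063, 0.206, 0.013),
}

# For each case: (difference of conditional probabilities, overall loss probability)
results = {
    name: (round(h - l, 3), round(overall_loss_probability(c, h, l), 3))
    for name, (c, h, l) in cases.items()
}
```

Both full-sample cases imply an overall loss probability of about 3.6%, and both cases for the sample until 2009 imply about 2.5%. These values agree with the loss probabilities reported for the 20% loss threshold.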
As shown in Sect. 2, the difference between the conditional loss probabilities and (for low mark-ups on index insurance) the NatCat probability have a positive effect on $EU_I$ and hence on the relative attractiveness of the index insurance. Therefore, if we increase the NatCat threshold, there is a trade-off between lowering the NatCat probability and increasing the difference between the conditional loss probabilities. For higher mark-ups on index insurance, however, the positive effect of the NatCat probability (and hence the negative effect of a higher NatCat threshold) is reduced. The goal of our simulation is to find a threshold that leads to the highest $EU_I$. Hence, we have to calculate the optimal levels of savings ($s_I$) and insurance demand ($i_I$) as well as the resulting expected utility. Rearranging condition (17) yields (37). To get the optimal levels of savings and insurance demand, we choose a value of $s_I$ which (sufficiently) fulfills condition (18). The corresponding $i_I$ is calculated using (37). Figure 4 displays the impact of the NatCat definition on the expected utility of a farmer using index insurance ($EU_I$) for a 10% loss threshold and different mark-ups on index insurance. For a low mark-up of 0.02, expected utility peaks at a NatCat threshold of 1.3 °C above trend. In line with our theoretical findings, a higher mark-up not only reduces expected utility but also shifts its peak to a higher NatCat threshold. For a mark-up of 0.08, insurance demand would only be positive (and therefore increase expected utility above its minimum) for NatCat thresholds between about 2 and 2.5 °C. For a higher mark-up, there would be no insurance demand independent of the NatCat definition. Therefore, there would only be demand for index insurance if the mark-up on the fair premium is well below the current range in the U.S. (0.2 to 0.3).

Fig. 4 Impact of the NatCat definition on expected utility ($EU_I$) for a 10% loss threshold and different mark-ups on index insurance

Index vs. loss-based insurance
As shown above, with a 10% loss threshold, there would be no demand for loss-based insurance. For higher loss thresholds, and therefore lower loss probabilities and higher losses, loss-based insurance becomes more attractive and farmers would be willing to pay a higher mark-up. With index insurance, however, it is the other way around. While with a 10% loss threshold there would be a positive demand for index insurance up to a mark-up of slightly above 0.08, for a 20% loss threshold the maximum mark-up would be about 0.06, and for a 25% loss threshold it would be only about 0.03. Figure 5 shows the impact of the loss definition on the demand (relative to the potential loss $l$) for index insurance ($i_I/l$) and loss-based insurance ($i_L/l$) as well as on the difference between the expected utility from index insurance and the expected utility from loss-based insurance (i.e. $EU_I - EU_L$). In order to have a certain range with positive insurance demand, the assumed mark-ups are rather low: 0.05 for index insurance and 0.2 for loss-based insurance. If a loss were defined as any negative deviation of yields from their trend (i.e. a loss threshold of 0%), there would be demand neither for index insurance nor for loss-based insurance. For a loss threshold of about 3%, the demand for index insurance becomes positive, and it reaches its peak at about 9%. For a loss threshold larger than 20%, demand for index insurance drops to zero again. Demand for loss-based insurance becomes positive at a loss threshold of about 13% and continues to increase with higher loss thresholds up to its maximum of 45%. In line with these results, index insurance is more attractive (i.e. $EU_I > EU_L$) for lower loss thresholds (between 3 and 15%) and loss-based insurance is more attractive (i.e. $EU_I < EU_L$) for higher loss thresholds. In other words, index insurance is more attractive for the lower and more frequently occurring losses and loss-based insurance is more attractive for rare high losses.
For a loss threshold between 13 and 20%, there would be a positive demand for index insurance as well as for loss-based insurance. Therefore, in this range it would be possible that the farmer purchases both kinds of insurance. Following Sect. 2.5, if both kinds of insurance are available, there is only demand for index insurance if the mark-up on index insurance is lower than the mark-up on loss-based insurance (which is given here), and index and loss-based insurance are substitutes. The latter implies that, if the loss threshold is only slightly higher than 13%, there would only be demand for index insurance, and if the loss threshold is only slightly below 20%, there would only be demand for loss-based insurance. However, for an intermediate loss threshold (i.e. about 16%), purchasing both index and loss-based insurance could be optimal.

Conclusions
Weather-related risks can significantly affect agricultural production but are very difficult to insure. In fact, even in high-income countries where insurance penetration in general is relatively high, these risks are hardly ever insured, at least not without public support. In this paper, we have evaluated the potential of index and loss-based insurance in enhancing protection and welfare of crop farmers in Germany.
For our evaluation we followed a three step approach. First, we have modeled a risk averse farmer and calculated under which conditions he or she would prefer a simple index insurance to a loss-based insurance and under which conditions a combination of both kinds of insurance could be optimal. Our results indicate that besides the different mark up on index and loss-based insurance, the result depends on the probability of a NatCat and the difference between the loss probability with and without a NatCat. Hence, the (relative) performance of the index insurance strongly depends on the used weather index and the concrete definition of a NatCat or trigger point for the index insurance.
In a second step, we have therefore conducted an empirical estimation in order to see which weather variables have the strongest link to losses of crop farmers in Germany. We have regressed losses on mean temperatures, number of heat days, sunshine hours and precipitation. For all four variables we have distinguished between spring and summer. According to our estimation, mean temperatures in summer have the highest potential as a single index for a valuable index insurance.
In a final step, we have simulated the theoretical model using the results from the estimation and using different thresholds for the definition of losses as well as for mean temperature in summer as a definition for a NatCat. According to this simulation, index-insurance is more attractive for the lower and more frequently occurring losses and loss-based insurance is more attractive for rare high losses. A combination of both kinds of insurance could be optimal for intermediate cases. However, the analysis has also demonstrated that with currently prevailing mark ups on the fair premiums for loss-based and index insurance, demand for both kinds of insurance would be (and is to some degree) zero.
The main contribution of our paper is this three-step approach: our model allows us to simulate insurance demand using empirical data. This enables us to derive theoretically founded results for the attractiveness of index and loss-based insurance for German crop farmers. In addition, besides looking separately at the two kinds of insurance, we also evaluate under which conditions a combination could be optimal.
Our approach has some shortcomings. In our model, we do not consider the possibility for farmers to diversify output risks over different kinds of crops or different regions. Therefore, in our model, insurance is more attractive than in reality. Furthermore, we assume that losses are a binary variable. While this assumption facilitates the solving and interpretation of the model, it is not very realistic. In addition, our model does not consider behavioral deviations of demand patterns including issues such as rank-dependence, reference-dependence or ambiguity aversion. Future work should aim to analyse the effect of such broadly documented behavioral aspects and, thereby, help to understand potential failures of index insurance products. Nevertheless, we believe that our simplifying approach provides useful insights regarding the potential of index and loss-based insurance for farmers in Germany. Finally, our model only looks at the demand side of the insurance market. Future work should aim to also analyse the effect of the different parameters on the supply of index and loss-based insurance.
A shortcoming of our empirical estimation is that the explanatory power (measured by R²) is relatively low. One reason for this might be that there are important variables which we do not control for. Another reason could be that the relationship between the weather variables and winter wheat yields is not linear. The gain in R² in the second-order polynomial estimation, however, was only minor. Nevertheless, the low explanatory power does not harm the results of our simulation with the binary NatCat indicator.