Determining the influence of Itaipu Lake on thermal conditions for soybean development in adjacent lands

Previous numerical simulations have suggested that the area adjacent to Itaipu Lake in Southern Brazil is significantly affecting the local thermal regime through development of a lake breeze. This has led to concerns that soybean growth and development, and consequently yield, has been affected by the creation of the artificial lake in this important agricultural region, but a systematic climatological study of the thermal effects of Itaipu Lake has not been conducted. The objectives of this study were to assess the spatial pattern of minimum and maximum air temperatures in a 10-km-wide area adjacent to Itaipu Lake as affected by distance from the water. Measurements were conducted over 3 years in seven transects along the shore of Itaipu Lake, with five weather stations placed in each transect. Phenological observations in soybean fields surrounding the weather stations were also conducted. Generalized additive models for location, scale, and shape (GAMLSS) analysis indicated no difference in the temperature time series as distance from water increased. Semivariograms showed that the random components in the air temperature were predominant and that there was no spatial structure to the signal. Wind direction measured over the three growing seasons demonstrated that, on average, the development of a lake breeze is limited to a few locations and a few hours of the day, supporting the temporal and spatial analysis. Phenological observations did not show differences in the timing of critical soybean stages. We suggest that the concerns that soybean development is potentially affected by the presence of Itaipu Lake are not supported by the thermal environment observed.


Introduction
The moderating effect of large water bodies on air temperature is well-known and is responsible for local climates that allow for productive agriculture in their vicinity, as for example, in the Great Lakes region of North America (Sanderson 2004). Artificially created lakes for hydroelectric power generation have the potential of affecting local climate and generating lake breezes (Klaic and Kvakic 2014) potentially causing a shift in conditions for agriculture in adjacent lands. Lake breezes are driven by the differential heating of water and land and affected by the magnitude of the land's sensible heat flux, geostrophic wind, water body dimensions, and surface roughness length amongst other geophysical variables (Crosman and Horel 2010).
Itaipu Lake is an artificial lake created on the Paraná river between Brazil and Paraguay when the Itaipu power generation facility became operational in 1982. The lake is 170 km long in the north-south direction, has an average width of 7 km (west-east), depth of 22 m for a total area of 1350 km 2 , and water volume of 29×10 9 m 3 . The lake perimeter on the Brazilian side is 1340 km long, and a protective vegetative strip with a mean width of 210 m surrounding the whole extension of the lake was established in 1982. Farmland composed of fertile soils in intensive agricultural production of soybeans, corn, and wheat is present beyond this vegetative strip. Due to the limited width of Itaipu Lake, its impact on the climate of the surrounding farm land should be limited, as onshore penetration of lake breezes is estimated at less than half of the lake width (Segal et al. 1997). However, recent Doppler radar observations of a small reservoir (mean width of~2 km) concluded that local circulations are also generated by small lakes (Asefi-Najafabady et al. 2012). In addition, numerical simulation and limited measurements of air temperature over Itaipu Lake and at an adjacent land station concluded that the lake is inducing and sustaining a local circulation pattern (Stivari et al. 2003) and reducing the regional thermal amplitude (Stivari et al. 2005). However, numerical simulations with horizontal grid spacing >2 km used in most studies limit the interactions between geophysical variables (e.g., roughness length, sensible heat flux), so that predicted lake breeze onshore penetration might not be realistic (Crosman and Horel 2010). Nevertheless, results from previous studies on the effect of Itaipu Lake (Stivari et al. 2003(Stivari et al. , 2005 have led to concerns that thermal conditions for an important regional crop such as soybeans could be affected, and a systematic climatological study for the area is needed. The objectives of this study were as follows: (1) to determine if the spatial pattern of minimum and maximum air temperatures in the area adjacent to Itaipu Lake is affected by distance from the water and (2) assess the effect of Itaipu Lake on available growing degree days and phenological stages of soybeans. Weather stations were installed in seven transects distributed approximately evenly in the north to south direction along the eastern shore of Itaipu Lake. Each transect was composed of five stations, placed at varying distances from the lake center (~300-1200 m to~10 km), and measurements were conducted over 3 years. Spatial and temporal statistics were applied to the collected data to determine if the distance from the lake explained the variability in observed thermal environment. Our hypotheses were that stations closer to the lake would show different trends in time compared to stations further away from the lake and a spatial pattern in weather variables would be observed.

Site characterization
The climate in the western region of Paraná state, where Itaipu Lake is located (~25°S and 54°W, Fig. 1a), is classified as humid subtropical (Cfa, according to the Köppen classification; Alvares et al. 2014). Mean monthly air temperature is above 22°C during the summer months (Dec-Mar) with a tendency of higher precipitation during these months but no defined dry season according to the Köppen classification (Alvares et al. 2014). Mean annual precipitation varies from 1600 to 2000 mm in the studied region (IAPAR 1994).
Weather stations were installed on the eastern shore of the Paraná river, along the Brazilian margin of Itaipu Lake, in seven transects located in the municipalities of Guaíra (GUA), Mercedes (MER), Pato Bragado (PBG), Santa Helena (two transects; one in the south, SHS, and one in the north, SHN), Itaipulândia (ITA), and Santa Terezinha de Itaipu (STI) (Fig. 1a). At each transect, areas of at least 1 ha were selected from commercial farms which were planted with the same cultivar of soybeans (cultivar V max ). Five meteorological stations were installed in these areas along each transect (Table 1), with the closest stations at a distance between 320 and 1220 m and the furthest 9.32 to 11.33 km from the lake shore (Fig. 1b). These distances were chosen based on the suggestion that lake breeze onshore flow is generally less than half of lake width (Segal et al. 1997) (for Itaipu Lake, on average, 3.5 km and a maximum of 6 km). The weather stations were set up in a grassed area within soybean fields and fenced to avoid vandalism (Fig. 1c). The following criteria were observed when selecting the location for weather station installation: elevation <300 m, average slope <8 % toward stages were observed throughout each growing season following procedure by Fehr and Caviness (1977).

Weather measurements
Air temperature and relative humidity sensors (CS215, Campbell Scientific) were mounted inside a radiation shield (41003-5A, Campbell Scientific) on a tower at a height of 2 m. The sensor characteristics are as follows: range −39.2°C to +60°C and accuracy for minimum values of ±0.2°C and for maximum values of ±0.5°C, with a resolution of 0.1°C. A data logger (CR1000, Campbell Scientific) was programmed to take readings every 10 s and store values every 1 min and save hourly maximum and minimum values. Daily maximum and minimum values were obtained over 24 h. Additional measurements consisted of wind speed and direction (03002-L34 RM Young) and incoming solar radiation (LI200X-L12, LI-COR).

Temporal statistical analysis
Data collected at each weather station was individually analyzed for temporal trends, and the fitted trends for stations within each transect were compared. Mean minimum and maximum air temperatures over 5 days for 150 days starting on October 1 of each year (2010,2011,2012) were obtained for the 35 weather stations. The 5-day mean was used to minimize the autocorrelation between observations according to the Durbin-Watson test, as determined by the function suggested by Zeileis and Hothorn (2002). Models were fitted to each time series (30 data points for each of 35 sites) using the generalized additive models for location, scale and shape (GAMLSS) tool proposed by Rigby and Stasinopoulos (2005). In a GAMLSS model, the observations Y i (for i=1,…, n; where i=time) are assumed to be independent in time with a distribution function F Y (y i ; θ i ), where θ i =(θ i1 ,…, θ im ) represents a vector of m distribution parameters accounting for position, scale, and shape. Monotonic link functions g k (·) (k=1,…, m) relate the parameters of the distribution to the design matrix of the selected covariate (in this case, time t). In this study, we considered six distributions: Gumbel, Weibull, gamma, logistic, lognormal, and normal. Each had two fitted parameters (theta1 and the-ta2), where theta1 is associated with location and theta2 with scale. A parametric formulation was assumed for the GAML SS (Rigby and Stasinopoulos 2005) with theta1 described as a function of time (t). The Akaike information criterion (AIC) was used to select the best fit model (Akaike 1974). The quality of the fitting procedure was evaluated by the residual statistics, the Filliben correlation coefficient (Filliben 1975), and visual inspection of worm plots (Buuren and Fredriks 2001). Model parameters were fitted using GAMLSS written in R software (R Core Team 2014). Comparison of model parameters for stations within a transect was performed to determine the existence of significant differences in temporal trends in minimum and maximum air temperatures at varying distances from the lake, for each growing season, using the confidence intervals calculated using the BSagri package (Schaarschmidt 2013).

Spatial statistical analysis
Semivariograms are used in geostatistics to model how a property changes in space. Assuming that there is spatial continuity between adjacent samples, the correlation between spatially adjacent variables is measured (or modeled) with a semivariogram function. For example, a set of n measurements of a property A i taken at locations x i which are placed at certain interval along a transect will give a variance of where Ā is the mean for A i for all locations. If we apply this to N pairs of values of A i separated by a distance h, we have the semivariogram γ(h): Semivariogram functions can have different shapes depending mostly on the nature of spatial variability of variable A i . We were interested in determining if air temperature measured along each transect of five weather stations (Table 1) varied independently or according to a determined spatial pattern. If the former was the case, a pure nugget semivariogram would be obtained indicating a lack of spatial pattern along each transect.
Experimental semivariograms along with their models and sample variance for five variables (minimum and maximum air temperatures, relative humidity, solar radiation, and growing degree days (GDD), for base temperature of 10°C accumulated from emergence to specific phenological stages) were calculated. Each of these variables corresponded to A i in Eqs. 1 and 2. The distance h in Eq. 2 corresponded to the difference in horizontal distance between all possible pairs of stations (1-2, 1-3, 1-4, 1-5, 2-3, 2-4, 2-5, 3-4, 3-5, 4-5). The selected phenological stages of soybean development were as follows: Ve, emergence; R1, beginning bloom; R6, beginning physiological maturity; and R8, full maturity) for the three growing seasons.
All calculations were done using R (R Core Team 2014). As part of an exploratory analysis, tests were performed to confirm that the datasets passed normality tests and that variables did not need to be transformed. A preliminary analysis under a spatiotemporal geostatistical framework (Pebesma 2012;Bivand et al. 2013) did not yield any spatiotemporal relationship, and further work using this approach was not continued.
Semivariograms were calculated using the function Bvariogram^from the gstat package (Pebesma 2004;Pebesma and Bivand 2005) calculated first using coordinates easting and northing for each transect location and then adding elevation as a coordinate. As results did not differ, only easting and northing were used to define the spatial location of each meteorological variable. Each semivariogram was calculated using meteorological data from 5 days around the date of occurrence of each selected phenological stage (a given date plus 2 days previous and 2 days following) for each site in order to increase the sample number (from one to five). As this results in multiple values of a variable for each location, a small shift was randomly added to the easting and northing coordinates (less than 1 mm) for each date because the algorithm used in gstat does not allow for repeated measurements on the same location.
An omnidirectional linear model with nugget was fitted to the experimental semivariogram using weighted least squares (weight N h ×h−2, with N h the number of point pairs and h the distance) using as starting values the variance of the sample, V (nugget=V×0.9, partial sill of the linear model=V×0.01, and range=100).

Temporal analysis
An overview of the minimum and maximum air temperature data recorded at all stations is given in Fig. 2. Medians are shown, as datasets were not normally distributed. No clear trends were observed between stations in any of the study years. The 5-day minimum and maximum means showed warming between October and February as expected and similar trends for the five stations in each transect (Figs. 3 and 4). Normal or Gumbel was the best fit model selected according to the AIC (data not shown) for all stations and years using the GAMLSS tool, with good quality of residual statistics as given by the Filliben correlation coefficient (>0.97) and worm plot inspection. The time series model adjusted for each station showed clear temporal trends, but model parameters (slope, intercept, sigma) did not show consistent differences between stations within a transect for minimum (Table 2) or maximum temperature (data not shown). For example, slopes were nearly identical and the difference in model intercept between stations varied from 0.5°C (stations 3 and 5, MER in 2010/2011) to 2.7°C (stations 3 and 5, SHS in 2010/2011). The weather stations with the highest intercept tended to be station 2 or 3, that is, located 3.4 to 5.5 km from the lakeshore, while the ones with the lowest intercept were located at distances between 7.3 and 11.8 km (stations 4 and 5). Over the three growing seasons, mean differences were 1.0, 0.9, 0.7, 1.5, 1.8, 1.0, and 0.9°C between the largest and smallest intercepts for sites GUA, MER, PBG, SHN, SHS, ITA, and STI, respectively, but these values were not significantly different.

Spatial analysis
Spatial analysis of all variables (minimum and maximum temperatures, relative humidity, solar radiation, and GDD) for all transects and all growing seasons studied showed a flat experimental semivariogram described by two models, nugget or   nugget + linear equation. An example of a semivariogram obtained for these variables (except solar radiation) and one location and time is shown in Fig. 5. The variance explained by the model (partial sill) and variance not explained by the model (nugget, or randomness of measured variable) obtained for GDD at each site and different phenological stages during the study years are shown in Table 3. The nugget corresponded from 87 to 92 % of the total variance for all semivariograms analyzed, and the partial sill was negligible. Semivariograms obtained for the other meteorological variables produced very similar results and are not shown.

Discussion
The derived intercept and slope of the times series models for all seasons studied showed the same trends for each station, that is, no significant difference in the fitted parameters, indicating similarity in the temperature time series between stations within the 10-km transect. This was contrary to the expected trend of decreasing minimum (increasing maximum) air temperature with distance from lakeshore typically observed for larger water bodies (Im et al. 2009). Likely largescale synoptic factors had an overriding effect producing similar air temperature conditions at all stations. In a study of the representativeness of European climate stations, Orlowsky and Seneviratne (2014) found high-rank-based correlations  between stations in regions where large-scale circulation controls daily variability in air temperature. Semivariograms for meteorological variables yielded models with large nuggets, almost equal to the variance, which is indicative of a lack of spatial structure; this was the case for all sites, years, and phenological stages analyzed (Table 3). This means that there was a lack of spatial pattern in the measured transects. Nugget values up to approximately 30 % of the total variance are expected for nonrandom spatial patterns (Wackernagel 2003). In our case, if the distance from the lake was the main predictive variable, much smaller nuggets would have been obtained for each transect. Janis and Robeson (2004) used a pointcentered semivariogram, applied to time-averaged diurnal temperature range, to determine how well 301 weather stations within the central USA represented climate over large spatial scales. They found significant spatial patterns with small nuggets for the majority of the stations, indicating Ve seedling emergence, R1 beginning bloom, R6 seed filling, R8 full maturity that stations were spatially representative of the large-scale climate. The scale of our study differs significantly from this analysis, given that the spatial separation of our stations within each transect was much smaller. However, B u r c s u e t a l . ( 2 0 0 1 ) d e m o n s t r a t e d t h e u s e o f semivariograms applied to remote sensing data to a somewhat similar question in forests, that is, identification of the distance within a transect where edge effects of a clearing propagated into a surrounding forest. We did not find any published applications of geostatistics to weather station transects, probably because this is an unusual configuration, not found in climate networks. The lack of spatial structure and high correspondence of time series models fitted to observations indicate that distance from Itaipu Lake does not have a significant differentiating effect on the thermal environment of adjacent land. Indeed, phenological observations confirmed that there was no difference in the thermal regime between transects as stages occurred approximately at the same time with very little variation between stations at a given site (Table 4).
The lack of temperature gradient induced by the lake presence can be explained by a weak to nonexistent lake breeze for most of the measurement period. Wind direction and speed according to time of day averaged over the three growing seasons (Fig. 6) show predominant wind directions from the east or northeast, which is consistent with large-scale average synoptic patterns (Wagner et al. 1989). A switch to south/southwest direction in the afternoon, consistent with the development of a lake breeze, can be observed at some of the locations (e.g., PBG). However, this lake breeze was of limited duration and clearly did not affect the overall thermal regime of the area surrounding the lake. The land surface sensible heat flux, ambient geostrophic wind, water body dimensions, and surface aerodynamic resistance are some of the geophysical variables that affect the development of sea and lake breezes (Crosman and Horel 2010). Strong land surface heating during daytime, and the associated sensible heat flux, will result in the horizontal temperature gradient required for lake breeze development. We suggest that intensive latent heat flux associated with soybean evapotranspiration would induce lower sensible heat fluxes than required for strong lake breeze development. In addition, the presence of the 200-m vegetative strip with a tall canopy of large aerodynamic roughness length would also be detrimental for lake breeze development as the frictional drag works to diminish the development of horizontal pressure gradients associated with lake breezes (Crosman and Horel 2010). Finally, the small width (~7 km) of the water body is another factor, as dimensions of the water body have a strong effect on lake breeze development.
Our results somewhat contradict previous analysis which suggested that the Itaipu Lake reduced the amplitude of the diurnal air temperature regionally (Stivari et al. 2005). However, this conclusion was based on comparison of temperature measured over the lake with land stations and air temperature temporal trends (before and after lake formation) that could have been confounded with other climatic trends. In a previous study, Stivari et al. (2005) found evidence that Itaipu Lake induced daytime lake breezes; however, authors cautioned that it is difficult to reproduce average circulation patterns with the numerical model used. Ideally, future studies could investigate the mechanisms controlling lake breeze development in the Itaipu Lake region through detailed measurements of surface-atmosphere energy exchange over contrasting land use, using ground-based flux towers and airborne flux measurements combined with numerical models (e.g., Samuelsson and Tjernström 2001). Contrasting land use on the Paraguayan and Brazilian side of Itaipu Lake provides further opportunities for testing the effect of controlling variables such as surface roughness and land sensible heat flux.

Conclusions
Detailed measurements of air temperature over three soybean growing seasons in transects adjacent to Itaipu Lake did not show a significant effect of distance from shore as shown by temporal and spatial analysis. The GAMLSS analysis allowed for adjustment of models to the time series obtained at each weather station, but fitted parameters indicated no difference in the time series as distance from lake increased. The semivariograms showed that the random components in the air temperature signal were predominant and that there was no spatial structure to the signal. Average wind direction measured over the three growing seasons demonstrated that the development of a lake breeze is limited to a few locations and a few hours of the day, supporting the spatial and temporal analyses. Soybean phenological observations are in agreement with these analyses, as no differences in the occurrence of significant stages were observed.