Spatio-temporal analysis of the extent of an extreme heat event

Evidence of global warming induced from the increasing concentration of greenhouse gases in the atmosphere suggests more frequent warm days and heat waves. The concept of an extreme heat event (EHE), defined locally based on exceedance of a suitable local threshold, enables us to capture the notion of a period of persistent extremely high temperatures. Modeling for extreme heat events is customarily implemented using time series of temperatures collected at a set of locations. Since spatial dependence is anticipated in the occurrence of EHE’s, a joint model for the time series, incorporating spatial dependence is needed. Recent work by Schliep et al. (J R Stat Soc Ser A Stat Soc 184(3):1070–1092, 2021) develops a space-time model based on a point-referenced collection of temperature time series that enables the prediction of both the incidence and characteristics of EHE’s occurring at any location in a study region. The contribution here is to introduce a formal definition of the notion of the spatial extent of an extreme heat event and then to employ output from the Schliep et al. (J R Stat Soc Ser A Stat Soc 184(3):1070–1092, 2021) modeling work to illustrate the notion. For a specified region and a given day, the definition takes the form of a block average of indicator functions over the region. Our risk assessment examines extents for the Comunidad Autónoma de Aragón in northeastern Spain. We calculate daily, seasonal and decadal averages of the extents for two subregions in this comunidad. We generalize our definition to capture extents of persistence of extreme heat and make comparisons across decades to reveal evidence of increasing extent over time.


Introduction
There is strong evidence of global warming due to the increasing concentration of greenhouse gases in the atmosphere (Lai and Dzombak 2019). This global warming suggests more frequent warm days, more frequent and persistent heat waves (Lemonsu et al. 2014;Alexander 2016) as well as events that break previous records by much larger margins (Fischer et al. 2021;Cebrián et al. 2021). The analysis of heat waves is particularly important due to the potential for serious anthropogenic, environmental, and economic impacts (Amengual et al. 2014;Campbell et al. 2018). Extreme heat raises significant health concerns in humans as it can result in death, change the range or niche for plants and animals, and lead to heatdriven peaks in electricity demands or lost crop income.
A challenge in analyzing heat waves stems from ongoing debate over its exact definition. According to the World Meteorological Office (WMO), a period persisting at least three consecutive days of marked unusual hot weather (maximum, minimum and daily average temperature) over a region with thermal conditions above given thresholds based on local climatological conditions can be considered a heat wave. This definition suggests that analyses of heat waves require temperature series at a daily scale, but offers no operational guidance with regard to the various choices of implementation. Khaliq et al. (2005) and Reich et al. (2014) used only maximum temperature, while Keellings and Waylen (2014) and, more recently, Abaurrea et al. (2018) considered both maximum and minimum temperatures.
While there is lack of agreement on heat wave definition in the literature (Perkins and Alexander 2013;Smith et al. 2013), the concept of an extreme heat event (EHE) is explicitly defined, capturing the notion of a period (number of consecutive days) of persistent extremely high temperatures. Specifically, EHE's are defined locally and are based on exceedance of a suitable local threshold. That is, useful thresholds should be based on local conditions; in the sequel, we adopt the 95th percentile of local daily maximum temperatures, derived using a ten year period. In the context of persistence of exceedance, it is evident that we need to model daily max temperatures since an EHE is defined at a given location as a run of consecutive daily temperature observations exceeding the given threshold for that location. Additionally, daily modeling enables assessment of important behaviors describing the nature of an EHE such as the duration, average exceedance, and maximum exceedance above the threshold. As a result, we define extent in terms of incidence of exceedance on a given day, developed from a daily max temperature model that is able to adequately represent both the central part and the upper tail of the temperature distribution. It is not a model only for the tails or observations above a threshold, i.e., left-truncated data (peaks over thresholds models) or for the extremes (generalized extreme value distribution models). Such modeling addresses different objectives.
Modeling for extreme heat events is customarily implemented using time series of temperatures over a window of time collected at particular locations. However, spatial aspects of this phenomenon are also of interest and should be introduced into the modeling process, particularly with interest in predicting EHE behavior at locations without available time series of temperatures. Since spatial dependence is anticipated in the occurrence of EHE's, a joint model for time series at different locations that incorporates spatial dependence is needed.
Recent work by Schliep et al. (2021) develops a spacetime model based on a point-referenced collection of temperature time series that enables the prediction of both the incidence and the characteristics of the EHE's occurring at any location in the region. Specifically, it offers direct spatial modeling for daily maximum temperatures which can then be used to characterize the EHE's. The challenge is that, while the bulk of the distribution, i.e., where most of the data is observed, is not extreme, the main interest for the model lies in the upper tail when attempting to characterize EHE's (Keellings and Waylen 2015;Shaby et al. 2016). To address this, a specification incorporating thresholding is introduced, i.e., a model which switches between two observed states, one that defines extreme heat days (those above the temperature threshold) and the other that defines non-extreme heat days (those below the temperature threshold). Again, these thresholds are obtained locally and assumed fixed. Importantly, this two-state structure allows temporal dependence of the observations but also permits the parameters controlling the effects of covariates and the spatial dependence to differ between the two states. We briefly review details of this modeling in Sect. 2.4.
With regard to risk assessment, the contribution of this work is to formalize the notion of the spatial ''extent'' of an extreme heat event and then to illustrate it using output from the Schliep et al. (2021) modeling work. Interest in characterizing the extent of heat waves is clear. For example, Lhotka and Kyselý (2015) proposed an extremity index that captures joint effects of spatial extent, temperature and duration of heat waves. Keellings and Moradkhani (2020) also developed a spatial metric combining heat wave frequency, magnitude, duration and areal extent to analyze the evolution of heat waves across the United States. Rebetez et al. (2009) compared the heatwave extent in Europe in 2003and 2006. Khan et al. (2019 found an increase of 1.36% of the affected area having both maximum and minimum temperature above the 95th percentile per decade in Pakistan. Lyon et al. (2019) projected the increase in the spatial extent of contiguous US summerheat waves using the CMIP5 archive under RCP4.5 and RCP8.5 scenarios. They found a substantial increase in spatial extent climate model employing projections for 2031-2055. However, all this work analyzes the extent using descriptive approaches and using observed or gridded data with no formal inference from probabilistic modeling. Some more formal definitions related to the concept of area under extreme conditions have been introduced in the statistical literature. For example, French and Sain (2013) present a method for constructing confidence regions for Gaussian processes that contain the true exceedance regions with some predefined probability. Extending this methodology, Hazra and Huser (2021) obtain confidence regions that contain joint threshold exceedances of surface sea temperatures in the Red Sea, using a semiparametric Bayesian spatial mixed-effects linear model. Bolin and Lindgren (2015) consider excursion sets, which are sets of points in an area where a spatial function is above a given threshold. Sommerfeld et al. (2018) develop confidence regions for these spatial excursion sets with an application to climate and Romero-Béjar et al. (2018) develop quantile-based spatiotemporal risk assessment of exceedances using this concept. Zhong et al. (2020) analyze spatial extent of heatwaves using a model based on max-infinitely divisible processes for annual temperature maxima. However, to our knowledge, the notion of the extent of an EHE has not been explicitly defined as a stochastic object. So, we take this up both conceptually and practically. Given a threshold surface over a specified region, for a given day, the extent of an EHE over the region is the proportion of the area of the region whose daily max temperature for the day is above the associated threshold surface. In the sequel we adopt a static threshold surface in order to assess change in extent over time. With different intentions, timevarying thresholds could be employed.
The basic idea is an extension of the so-called spatial cumulative distribution function (cdf) following the work of Lahiri et al. (1999) and Short et al. (2005). It is also discussed in Section 15.3 of Banerjee et al. (2014). The extent arises as a stochastic integral, i.e., an integral over a realization of a stochastic process. It is a random object as well as a conceptual object in the sense that it can not be observed and it can not be calculated in closed form. However, its moment properties can be calculated and approximate realizations can be obtained through Monte Carlo integration.
Attractively, we can study extents employing realizations from any spatio-temporal model fitted for daily max temperatures. Here, we adopt the model from Schliep et al. (2021) and do not do any additional modeling work. We can provide extents directly from the output of that model fitting. In different words, assessment of extent of an EHE is a post model fitting activity.
The region over which we study extents is the Comunidad Autónoma de Aragón region in northeastern Spain, located in the Ebro basin (85,362 km 2 ). The Ebro river flows from the NW to the SE through a valley bordered by the Pyrenees and the Cantabrian Range in the north and the Iberian System in the southwest. The maximum elevation is approximately 3400 m in the Pyrenees, 2600 m in the Cantabrian Range and Iberian System, and between 200-400 m in the central valley. In general, the area is characterized by a Mediterranean-continental dry climate with irregular rainfall, and a large temperature range. However, several climate subareas can be distinguished due to the heterogeneous orography and other influences, such as the Mediterranean sea to the east, and the continental conditions of the Iberian central plateau in the southwest. Zaragoza, the largest city in the region, is located in the central part of the valley, and experiences more extreme temperatures and drier conditions.
Our data are observational series from AEMET (the Spanish Meteorological Office). Only long term series with limited missing observations were considered, resulting in daily maximum temperatures for 18 sites across and around the Comunidad Autónoma de Aragón region for the years 1953-2015. The names and locations of the 18 sites are shown spatially in Fig. 1. Data from the years 1953-1962 were used to obtain the location-specific thresholds for extreme heat events; the data for the years 1963-2015 were used in the modeling. Specifically, thresholds were computed as the 95th percentile of daily maximum temperature for the months June, July, and August during the 10 years 1953-1962. Again, our objective is to see change in extent over time which requires a fixed qðsÞ threshold surface. While we could use fixed thresholds associated with any time window, using thresholds prior to the start of our modeling, using data not employed in our fitting, seemed to provide a sensible baseline.
The format of the paper is as follows. In Sect. 2, we present the details of the spatial extent and briefly review the Schliep et al. (2021) model. Section 3 shows the rich scope of extents we can calculate and compare. Section 4 provides a brief summary and future work.
2 Formalizing extent of extreme heat

The definition of extent
The definition of the spatial cdf (Lahiri et al. 1999) is attached to a realization of a stochastic process taking values in R, Z ¼ fZðsÞ : s 2 Dg, over a region D and, for a given w, is where jDj is the area of D. That is, it is the proportion of the realization over D that lies below w. It behaves like a cdf in the sense that it is nondecreasing and goes to 0 as w ! À1 and 1 as w ! 1. However, since it is a function of the process realization, it is a random variable and arises as a stochastic integral. It differs from the marginal cdf associated with the variable ZðsÞ at location s, PðZðsÞ wÞ which is a constant (as a feature of the distribution of ZðsÞÞ. Like stochastic integrals in general, it can not be calculated explicitly (but it can be approximated using Monte Carlo integration). However, some distributional properties, e.g., mean and variance, can be calculated.
To define the object of interest here, a spatial extent, let Y t ðsÞ denote the daily max temp for day t at location s. Suppose we consider a subregion B D of the study domain. Then, for a given w, the extent of the EHE in subregion B on day t is: Here, qðsÞ is the threshold surface over B. That is, it is the proportion of B which is experiencing extreme heat at least w degrees above or below (according to the sign of w) associated local thresholds on day t. In fact, it is referred to as a block average (Banerjee et al. 2014) in this case of indicator functions. Evidently, we can choose B as we wish. In Section 3 we consider two subregions of interest for comparison but, with interest in extent at a larger spatial scale, we might also consider the case where B ¼ D.
Regardless, since extent captures proportion of incidence within a region, a larger region does not imply a larger extent. However, an alternative definition of extent would specify a region where extent is anticipated to be high (or low) and then calculate the extent in order to provide quantification. EHE is applicable to the extent when w ¼ 0 and is probably of greatest interest but below, Sect. 3.4, we also look at the extent of more (w [ 0) or less (w\0) extreme heat events. The posterior predictive distribution for Ext t ðw; BÞ is needed for inference. We use a Monte Carlo integration to obtain an approximate realization of it by computing: 1ðY t ðs j Þ À qðs j Þ ! wÞ: Here, for a selected set of m locations in B with an associated set of thresholds, fqðs j Þ; j ¼ 1; 2; . . .; mg, fY t ðs j Þ; j ¼ 1; 2; . . .; mg is a posterior predictive realization of daily max temperatures for day t at the locations, s j . If we have a collection of these realizations, then we can obtain posterior samples of g Ext t ðw; BÞ for any choices of w. In this way, with arbitrarily many posterior predictive realizations, we can learn arbitrarily well about the posterior predictive distribution for Ext t ðw; BÞ.
To compute (2), given t, we need a realization fY t ðs j Þ; j ¼ 1; 2; . . .; mg. To consider arbitrary days within arbitrary years within arbitrary decades, we need a posterior predictive realization of a 50 year daily max time series  for each s j . Employing the output of the model fitting of Schliep et al. (2021), we can obtain such posterior predictive realizations using composition sampling (Banerjee et al. 2014, Chapter 6). Below, we obtain a collection of 500 such realizations, enabling a posterior predictive distribution for Ext t ðw; BÞ through g Ext t ðw; BÞ. These samples yield an empirical summary of the posterior predictive distribution for Ext t ðw; BÞ, hence, posterior inference regarding any features of Ext t ðw; BÞ that are of interest. Note that to compute (2) we first need to create the qðs j Þ. In the sequel, upon a regular gridding of B to fairly fine resolution, we obtained the centroids of the grid cells as our s j 's. Then, gathering elevations for these centroids from a digital terrain map (DTM), we kriged the qðs j Þ. That is, using the thresholds for the observed sites with their associated elevations, we employed a standard kriging model for the thresholds at the s j using their associated DTM elevations. Furthermore, we identified an additional 37 stations which had temperature data available between 1953-1962, yielding a total of 55 stations from which we developed the qðsÞ surface. In Figure OR.1 in the Online Resource we show the resulting qðsÞ surface along with the 55 stations. Fancier kriging could be imagined but here and illustratively, we confine ourselves to the above. Furthermore, we treat all of the kriged qðs j Þ as fixed in implementing the Monte Carlo integrations.
As a last remark, it is possible to create an empirical extent, employing the form in (2) but using only the available observed sites that are within B. With only 18 stations, only a few will be in B, yielding a single estimate that can assume only a few discrete values and with no uncertainty. In implementing the Monte Carlo integrations for (2) below, we employ m % 6000 -8000, yielding much smoother extents and with replication enabling us to see the distribution.

Some technical details: a digression
Here, we offer some theoretical insight into the distribution of a spatial extent by calculating the first and second moments under an illustrative Gaussian spatial first order autoregression model, a simplified version of the model that we work with in Sect. 2.4. In particular, consider the model Y t ðsÞ ¼ l t ðsÞ þ gðsÞþ qðY tÀ1 ðsÞÀ ðl tÀ1 ðsÞþ gðsÞÞÞ þ t ðsÞ where l t ðsÞ is a spatio-temporal drift term. Specific choices are adopted in Sect. 2.4. Here, gðsÞ is a mean 0 Gaussian process with covariance covðgðsÞ; gðs 0 ÞÞ ¼ r 2 hðs À s 0 Þ providing local spatial adjustment to the drift terms as well as spatial dependence across locations. The t ðsÞ are pure errors, independent and identically distributed as Nð0; s 2 Þ.

Elaborating extents
In climate applications, such as attempting to assess the effect of global warming on extreme temperatures, the main interest is not usually to characterize the distribution of the extent in a given day Ext t ðw; BÞ but rather to characterize the behavior of the extent across years or changes in its seasonal pattern. To this end, it is convenient to analyze the distribution of different averages of the daily extent over a given period of time, such as seasons or years. In defining average extents, we will use two time indexes, one for the day within year l and one for the year t, 1ðY t;l ðsÞ À qðsÞ ! wÞds: Then, we can average the daily extent over a period of days within the year, L, and over a period of years, T, as where n T and n L are the number of observations in the periods T and L respectively.
In the analysis of EHE, we consider: (i) decadal averages for a given day l and decade D, Av t2D Ext t;l ðw; BÞ ¼ 1 10 P t2D Ext t;l ðw; BÞ, (ii) the average extent over the summer months JJA for a given year t, Av l2JJA Ext t;l ðw; BÞ ¼ 1 92 P l2JJA Ext t;l ðw; BÞ, and (iii) the average extent over the summer months and a decade, Av t2D;l2JJA Ext t;l ðw; BÞ ¼ 1 920 P t2D P l2JJA Ext t;l ðw; BÞ: In calculating these quantities, we replace all integrals by their Monte Carlo approximations, obtaining the values Av t2T;l2L g Ext t;l ðw; BÞ. Further, using 500 posterior predictive samples of realizations of g Ext t;l ðw; BÞ, we obtain 500 samples of Av t2T;l2L g Ext t;l ðw; BÞ to supply an empirical summary of the posterior distribution of Av t2T;l2L Ext t;l ðw; BÞ. In the cases where the average is carried out only over one time index, a bigger size sample is obtained if the distribution is the same in a given period of time. For example, if the distribution of Av l2JJA Ext t;l ðw; BÞ is taken to be the same for all the years in a decade, t 2 D, we can obtain a sample of 5000 realizations of Av l2JJA g Ext t;l ðw; BÞ, 500 for each of the 10 years t 2 D, to consider empirically the posterior distribution of Av l2JJA Ext t;l ðw; BÞ in D. When it is assumed that the distribution of the ten years in D is the same, we modify notation to Av D l2JJA Ext t;l ðw; BÞ, and Av D l2JJA g Ext t;l ðw; BÞ. Persistence: In the context of global warming, a relevant feature in the analysis of extreme temperatures is persistence. We consider persistence as arising when the probability of being in an extreme heat state at day l þ 1 is higher if we were in an extreme heat state at day l than if we were not. To analyze this feature spatially through the extent, we consider the proportion of B at w degrees above threshold, for both of two consecutive days, l and l þ 1, denoted by 2 Ext t;l ðw; BÞ and defined as 2 Ext t;l ðw; BÞ ¼ 1 jBj Z B 1ðY t;l ðsÞ À qðsÞ ! wÞ 1ðY t;lþ1 ðsÞ À qðsÞ ! wÞds: More generally, r Ext t;l ðw; BÞ denotes an r-day EHE, that is an EHE persisting for r consecutive days starting at day l, with analogous definition. It is immediate that r Ext t;l ðw; BÞ rÀ1 Ext t;l ðw; BÞ . . . Ext t;l ðw; BÞ: This accords with the fact that the extents of two-day EHE's will be smaller than the extents of one-day EHE's. Again, we will use a Monte Carlo integration to obtain a realization, e.g., for the r ¼ 2 case: 1ðY t;l ðsÞ À qðsÞ ! wÞ 1ðY t;lþ1 ðsÞ À qðsÞ ! wÞ: In the analysis for our dataset/study area we confine ourselves to r ¼ 1 and r ¼ 2 since the probability of observing runs with r ! 3 is too small to show useful extents. As with the daily extents, we are usually more interested in the average of r Ext t;l ðw; BÞ over a time window of l's, t's, or both. These averages are defined as above but with 2 Ext t;l ðw; BÞ. Monte Carlo integrations for these averages are analogous to those for averages of Ext t;l ðw; BÞ; we only have to substitute r Ext t;l ðw; BÞ with r g Ext t;l ðw; BÞ. Useful displays: Displays we will develop for extent and persistence include the following. First, we will consider different subregions of Aragón in order to make comparison between regions. To do this we will examine evolution of extent or persistence across the JJA season. We do this averaged over a decade in order to enable decadal comparison. Further, with posterior predictive samples of extents for each day l within each year t, we will supply the entire posterior predictive distribution of some of these extents and persistences. We will also develop ''time to'' displays, showing the time to the first day in year t with extent or persistence greater than or equal to a specified v. Lastly, we will examine how extents and persistences vary over choices of w since there can be interest in different specification of local thresholds for extreme heat.

Reviewing the daily max temperature model
Returning to the notation at the start of this section, let Y t ðsÞ denote the daily maximum temperature at day t and location s. Schliep et al. (2021) propose a two-state model where the state for a given day defines whether or not the location is experiencing an extreme heat event. Given a threshold, qðsÞ at location s, let U t ðsÞ 2 f0; 1g denote the state at time t for location s where a value of 0 denotes the below threshold state and a 1 denotes the above threshold state. So, U t ðsÞ is a spatial binary time series process reflecting times of transition or state-switching. It is observed for each t at a monitored site but is latent elsewhere. With regard to extents, we note that Ext t ð0; BÞ ¼ 1 jBj R B U t ðsÞds. U t ðsÞ is a Markov process where the state U t ðsÞ depends only on the previous state U tÀ1 ðsÞ. Then, the distribution of Y t ðsÞ is specified explicitly given U t ðsÞ and, given the threshold, U t ðsÞ is a binary function of Y t ðsÞ. Furthermore, the transition probabilities between states are allowed to be a function of previous temperature. This specification ensures transition probabilities to be ''local'', i.e., to vary with location and to depend upon the previous day's maximum temperature at that location. The opposite would be expected if the maximum temperature of the previous day resulted in a non-extreme heat state.
The joint distribution for temperature and state is specified in a first order Markov fashion explicitly as follows. Given Y tÀ2 ðsÞ, and thus, U tÀ2 ðsÞ, for two consecutive time points, t À 1 and t, the joint distribution ½U tÀ1 ðsÞ; Y tÀ1 ðsÞ; U t ðsÞ; Y t ðsÞ is written as This formulation requires three model specifications: (i) ½Y t ðsÞjU t ðsÞ ¼ 0; Y tÀ1 ðsÞ, (ii) ½Y t ðsÞjU t ðsÞ ¼ 1; Y tÀ1 ðsÞ, and (iii) ½U t ðsÞjY tÀ1 ðsÞ. With qðsÞ denoting the threshold (quantile) for location s, truncated distributions are needed for (i) and (ii), i.e., ½Y t ðsÞ ¼ y1ðy\qðsÞÞ and ½Y t ðsÞ ¼ y1ðy ! qðsÞÞ, respectively. For (i), a truncated normal distribution is adopted, with autoregressive centering, TN l 0 t ðsÞ À q 0 ðY tÀ1 ðsÞ À l 0 tÀ1 ðsÞÞ; r 2;0 ðsÞ À Á IðÀ1; qðsÞÞ: Details for l 0 t ðsÞ are given below. For (ii), a truncated t-distribution is adopted, with autoregressive centering, Tt l 1 t ðsÞ À q 1 ðY tÀ1 ðsÞ À l 1 tÀ1 ðsÞÞ; r 2;1 ðsÞ À Á IðqðsÞ; 1Þ: Details for l 1 t ðsÞ are given below. As an aside, unlike the multivariate normal, the multivariate t-distribution captures upper tail extreme dependence (Chan and Li 2008) which may be desirable in looking at extreme heat events. Further, with multivariate t-distributions in both time and space, tail dependence is inherited in space as well.
Exploratory analysis of the observed data suggested a smaller variance for the above threshold daily maximum temperature distribution than for the below threshold daily maximum temperature distribution. Further, spatiallyvarying variances are introduced, expecting that variation in say, Jaca (in the Pyrenees in the north of the region) would be different from variation in say, Zaragoza (flat and central in the region).
For (iii) a probit link is employed to define U À1 ðp t ðsÞÞ U À1 ðPðU t ðsÞ ¼ 1jY tÀ1 ðsÞÞÞ g t ðsÞ with g t ðsÞ given below. Putting (i), (ii), and (iii) together, a mixture distribution for Y t ðsÞ results: (i) a truncated normal distribution for the bulk of the distribution, (ii) a truncated t-distribution for the upper tail of the distribution, and (iii) mixture weights according to PðU t ðsÞ ¼ 0Þ or PðU t ðsÞ ¼ 1Þ, respectively. As for the specifics of l t ðsÞ and g t ðsÞ, compacting notation, for l ðsÞ is modeled as a mean 0 Gaussian process with exponential covariance function. For c ð1Þ ½ t 365 þ1 , where ½ denotes the greatest integer function, thus counting years with this subscript and, as a result, the c's provide annual intercepts to allow for yearly shifts, i.e., for hotter or colder years. The sin and cos terms are introduced to capture annual seasonality with their coefficients reflecting associated amplitudes. This seasonality is critical to ensure that an annual daily maximum temperature trajectory over the course of a year at a location will provide sensible realizations. elevðsÞ is the elevation at s and latðsÞ is the latitude. Finally, q U t ðsÞ provides a centered AR(1) specification, bringing in the previous day's temperature, Y tÀ1 ðsÞ.
We note that, as other common mixture models with a ''cut point'' to capture skewness, the unconditional density obtained from this model is discontinuous at the threshold. No continuity constraints are needed to smooth the density since this discontinuity does not affect the calculation of the extents over the threshold. This model was satisfactorily validated using out-of-sample prediction of characteristics of extreme values in three locations, see Section 5.2 in Schliep et al. (2021) and a summary in Section OR.2 in the Online Resource.

The data and subregions
We analyze the evolution of the extent over time in two different areas around Aragón, a region located in northeastern Spain. The areas are quite different in topography and temperature, see Fig. 1, where the elevation and the thresholds that define a local EHE are shown.
• B1: Pyrenees, with latitude between 42.5N and 43N and longitude between À1:9W and 0.7E. It is a mountainous area with a high variability in elevation. The elevation of the stations varies from 442 to 1645 m asl, but some points in the area are over 3000 m. This leads to a high variability in the thresholds qðsÞ, that vary from 25.5 to 36 C. This region shows a high biodiversity, including the last glaciers in Spain. • B2: Central Ebro valley with latitude between 41. 5N and 42.5N and longitude between À1:9W and 0.7E. This region is more homogeneous both in elevation, ranging from 245 to 546 m, and in temperature. The thresholds qðsÞ vary from 33.8 to 37 C. This region is the most populated in Aragón, and the most important farming zones are located here.
Altogether, the posterior predictive time series result in very large data files from which extents and persistences are computed. For instance, for B1, we have 50 years by 92 days by 7881 grid centroids by 500 replicates yielding 1:81263 Â 10 10 points.
In the sequel we present some comparative analysis for the regions B1 and B2. We present in the text the analysis for both regions and displays for the Pyrenees (B1) region, with analogous displays for the Ebro Valley (B2) region in the Online Resource.

Analysis of the time evolution of the extent
To offer some quantification of global warming, we analyze the evolution of the extent of EHE's across years. In that regard, we also analyze persistence through the extent of two-day EHE's. We consider the averages of the extent in the summer period JJA, Av l2JJA Ext t;l ð0; BÞ and Av l2JJA 2 Ext t;l ð0; BÞ. We also compare the evolution in the regions B1 and B2. Figure 2 shows the plot of the posterior mean of Av l2JJA Ext t;l ð0; BÞ for B1 and B2 vs. year t. Both regions show an increasing trend; the magnitude of the slope is similar in both regions, although the mean extent in B2 is approximately 1% higher than in B1. The analogous plot for Av l2JJA 2 Ext t;l ð0; BÞ is also shown in Figure 2. Similar conclusions about the extent of two-day EHE's are obtained, although the magnitude is reduced almost to half, the increase over time is slower, and the difference between the mean extent in the two regions is reduced to 0.5%. These conclusions are in agreement with the analysis of the raw empirical EHE extents using the observed series of temperature, see Figure OR.4 in the Online Resource. However, that figure reveals the limitations of the empirical extent. The plots in Figure OR.4 show the high variability of the empirical extent, and the impossibility of inference to compare the two decades.
To study the entire distribution, we analyze Av D l2JJA Ext t;l ð0; BÞ using the 5000 realizations Av D l2JJA g Ext t;l ð0; BÞ. The boxplots of the distribution for each decade in B1 are shown in Fig. 3, and Table 1 gives some summary measures in the two regions. Apart from a small dip in D4, an almost linear increase is observed. Similar conclusions are obtained in B2. With regard to the evolution of the extent of two-day EHE's, the boxplots of Av D l2JJA 2 Ext t;l ð0; BÞ in Fig. 3, and Table OR.1 in the Online Resource, confirm that it is quite similar to the evolution of the extent of EHE's, but smaller and slower in magnitude.
To better illuminate the global increase of the extent in the summer season, the posterior density for the first and the last decades is shown in Fig. 3 and Figure OR r Ext t;l ð0; BÞ À Av D1 l2JJA r Ext t;l ð0; BÞ for r ¼ 1; 2 (superscript r ¼ 1 is omitted for simplicity) and B1 and B2. In both regions, the interval for the one-day EHE's contains zero, showing that it is possible that the JJA average extent of one year in the last decade is lower than in the first decade. However, it is unlikely according to the posterior probability PðD D5 D1 ðBÞ [ 0 j yÞ, which is 0.803 in B1 and 0.805 in B2. Similar conclusions are obtained for the two-day EHE's since Pð 2 D D5 D1 ðBÞ [ 0 j yÞ are 0.815 in B1 and 0.808 in B2.
Evidence of a significant increase over time in the decadal average is much stronger. Summary measures of the difference of the decadal average, r D D5;D1 ðBÞ ¼ Av t2D5;l2JJA r Ext t;l ð0; BÞ À Av t2D1;l2JJA r Ext t;l ð0; BÞ for r ¼ 1; 2 are also shown in Table 2. The CI's for the one-day EHE's are narrower than the corresponding CI's of the yearly averages and do not include zero. The posterior probabilities Pð r D D5;D1 ðBÞ [ 0 j yÞ are essentially 1 in both regions for r ¼ 1; 2, indicating that the decadal average in D5 is significantly higher than in D1 in all the cases. Another question of interest is the comparison of the increment of the extent between D1 and D5 in the two regions under study. Fig. 2 shows that the trends in B1 and B2 are quite parallel, and that the increase of the JJA average extent between D1 and D5 is quite similar in both regions. The posterior mean of the difference between B1 and B2 of the increase in the yearly average, D D5 D1 ðB1Þ À D D5 D1 ðB2Þ, is À0:002 and the 90% CI is ðÀ0:013; 0:009Þ while the CI of the decadal average, D D5;D1 ðB1Þ À D D5;D1 ðB2Þ is ðÀ0:006; 0:002Þ. That means that although there is a shift in the mean value of the average extent in both regions, there is no evidence of a significant difference in the increase of average extent, and both regions show a similar evolution over time.

Evolution of the seasonal pattern and the beginning of the summer
To analyze the evolution of the seasonal pattern in JJA of the extent Ext t;l ð0; BÞ, l ¼ 1; . . .; 92, and 2 Ext t;l ð0; BÞ, Fig. 4 shows the plot vs. day within year of the posterior mean of Av t2D Ext t;l ð0; BÞ in D1 and D5 in region B1. The seasonal pattern attains the maximum mean extent at the end of July in all cases, and a similar increase of the mean across decades, is observed in both regions. The analogous plot of Av t2D 2 Ext t;l ð0; BÞ, see Fig. 4, shows a smoother seasonal pattern of the extent of the two-day EHE's, and a smaller increase between decades than in the extent of the one-day EHE's. The same conclusions are obtained in region B2, and the corresponding plots are shown in Figure OR.6 in the Online Resource.
The increase between decades is quantified in Table 3 that summarizes the posterior distribution of the monthly averages Av D l2Jn Ext t;l ð0; BÞ, Av D l2Jl Ext t;l ð0; BÞ and Av D l2Ag Ext t;l ð0; BÞ, in D1 and D5. There is an increase in the posterior mean in the three months with a similar seasonal pattern in both regions: the highest absolute increase, more than 3%, is observed in July and the lowest, around 1.4%, in June. This increase is higher in the upper tail: almost 2% in June and 4% in July. The posterior probability PðAv D5 l2Jn Ext t;l ð0; BÞ [ Av D1 l2Jn Ext t;l ð0; BÞ j yÞ is 0.813 in B1 and 0.818 in B2, and in July and August they are 0.805 and 0.804, in both regions.

Time to beginning of EHE's
An important feature related to the seasonality is the beginning of extreme temperatures in summer, and the analysis of its change across years. To that end, we define the variable L t ðv; BÞ, the number of days to the first day l within the period JJA in year t with an extent higher or equal to v, that is the first day l such that Ext t;l ð0; BÞ ! v. If no extent over v is observed in a year, we set L t ðv; BÞ ¼ 92. Since extents over high thresholds are being analyzed, large extents are very rarely observed so we only consider here v ¼ 0:1. 2 L t ðv; BÞ, the first day with an extent of the two-day EHE's higher or equal than v, is defined analogously. Figure 5 shows the posterior mean of L t ð0:1; BÞ vs. year for B1 and B2, and the the analogous plot for 2 L t ð0:1; BÞ. A decreasing trend is observed in both variables and in both regions. However, the decrease of the mean of 2 L t ð0:1; BÞ across years is much greater, almost 30 days, versus less than 10, but with much higher variability. This variability is likely due to the low incidence of the considered event. Figure 6 shows the posterior density of the first day, estimated in a decade, L D t ð0:1; BÞ, in D1 and D5 in B1; the   analogous plot in B2 is shown in Figure OR.7 in the Online Resource. Table 4 summarizes the posterior mean, standard deviation and 90% CI for L D t ð0:1; BÞ, for the five decades in the observed period. The mean of the first day between the first and the last decade has decreased almost 7 days in B1 and 8 days in B2. The posterior probability PðL D1 t ð0:1; BÞ [ L D5 t ð0:1; BÞ j yÞ, that is the probability that the first day of the summer where more than 10% of the region is under extreme temperatures occurs earlier in the last decade, is 0.775 in B1 and 0.781 in B2.
A summary of 2 L D t ð0:1; BÞ analogous to that in Table 4 is presented in Table OR.2 in the Online Resource. Since the probability of a two-day EHE is quite low, it is likely that in a year all the extents of two-day EHE's are lower or equal than 0.1 and, consequently, the event Pð 2 L D t ð0:1; BÞ ¼ 92 j yÞ has a positive probability mass. However, this posterior probability decreases over the observed period. In B1, the percentage of years with L t ð0:1; BÞ ¼ 92 is 0.77 in D1 and 0.31 in D5, and 0.88 and 0.36 in B2. Further evidence that L D t ð0:1; BÞ is decreasing is that, conditionally to the fact that an extent higher than 0.1 occurs in a year, the posterior mean of 2 L D t ð0:1; BÞ has decreased more than 5 days between D1 and D5 in both regions.

Behavior of extent across choices of w
The previous sections consider results for Ext t;l ðw; BÞ only at w ¼ 0, that is the extent corresponding to the threshold used to define an EHE. Here, we consider the effect on extent by adjustment of the local thresholds to lower extreme temperatures and to higher extreme temperatures. To that end, we consider Ext t;l ðw; BÞ for a grid of values over and under the threshold, w ¼ À1:5; À1:0; À0:5; 0:0; 0:5; 1:0; 1.5 C.
To study the evolution across years and across threshold, Fig. 7 shows the posterior mean of Av l2JJA Ext t;l ðw; B1Þ vs.
year for the grid of w values, and the corresponding linear trends fitted to the means. An increasing trend is observed for all the w values but the trend is stronger for smaller w's. More precisely, the ratio between the slope for w ¼ À1:5 and w ¼ 1:5 is 0:00106=0:00018 ¼ 5:8. The finding is that the incidence of extents associated with more extreme thresholds is increasing at a slower rate than that for less extreme thresholds. This increase over time is not only observed in the mean, but in the entire distribution. As an example, we consider the decadal average of July, where Fig. 8 shows the posterior density of Av t2D;l2Jl Ext t;l ðw; B1Þ across w for D1 and D5; the analogous plot for B2 is shown in Figure OR.8 in the Online Resource. It is observed that the  shape of the distribution is very similar across w and across decades but, in both cases, there are clear shifts that are not homogeneous across w. The different evolution across w is observed if we compare the mean difference between decades. For example, in B1, the posterior mean difference E Av D5 l2Jl Ext t;l ðw; B1Þ À Av D1 l2Jl Ext t;l ðw; B1Þ j y À Á is 5% for w ¼ À1:5, while for w ¼ 1:5 is 1%. The change is also observed if we compare the mean differences between the range of w values across decades. Again in B1, the posterior mean difference E Av D l2Jl Ext t;l ðÀ1:5; B1Þ À Av D l2Jl Ext t;l ð1:5; B1Þ j y À Á is 18% in D5, while in D1 is 14%. These values together with the counterparts for B2 are summarized in Table OR.3 in the Online Resource. The change in the entire distribution is quantified by the posterior probability of the extent for a given w in the last decade being higher than in the first, P Av D5 l2Jl Ext t;l ðw; B1Þ [ Av D1 l2Jl Ext t;l ðw; B1Þ j y À Á : Table 5 summarizes these posterior probabilities and shows that there is a nonmonotonic decrease across w. Figure 9 shows the posterior mean of the average Av t2D Ext t;l ðw; B1Þ vs. day within year for D1 and D5 in order to compare the seasonal behavior of the extent across w; the analogous plot for B2 is shown in Figure OR.9 in the Online Resource. It can be seen that the seasonal pattern is smoother for larger w. In all the cases, the seasonal pattern is more pronounced in the last decade but the changes across w are not homogeneous, with stronger differences in smaller w values. More precisely, the seasonal pattern for w ¼ 1:5 in D5 is more pronounced than its counterpart in D1, approaching that of w ¼ 1 in D1, with probability P Av JJA t2D5 Ext t;l ð1:5; B1Þ [ Av JJA t2D1 Ext t;l ð1:5; B1Þ j y À Á , that is essentially 1, and P Av JJA t2D5 Ext t;l ð1:5; B1Þ [ Av JJA t2D1 À Ext t;l ð1; B1Þ j yÞ ¼ 0:18. For smaller w values, the differences are higher, with the posterior probabilities comparing the previous averages in w ¼ À1:5 and w ¼ À1:5, and in w ¼ À0:5 and w ¼ À1:5 equal to 1; in addition, the seasonal pattern of w ¼ À1:5 in D1 is only slightly more pronounced than the pattern in w ¼ À0:5 in D5.

Summary and future work
Notions of the spatial extent of heat waves and extreme heat events have been considered informally and descriptively in the climate community. Here we have introduced a formal probabilistic definition for extents of extreme heat events. For a specified region, for a given day, the definition of spatial extent takes the form of a block average over the region. It is an average of indicator variables which identify exceedance of a local threshold by the daily max temperature surface for the day at each location within the region. We demonstrate that extents can be calculated through Monte Carlo integration and can be obtained for realizations from arbitrary space-time autoregressive models for daily max temperatures. Using a dataset of daily max temperatures over 50 years, adopting a particular choice of model, working within a Bayesian framework, we obtained posterior predictive samples of daily temperature time series on a fairly fine grid scale to implement the Monte Carlo integrations. With these samples, we calculated daily, seasonal and decadal averages of the extents for two regions around the Comunidad Autónoma de Aragón in Spain. We generalized these extents to capture extents of persistence of extreme heat. We made comparisons across decades to reveal evidence of increasing extent over time. We also studied other features related to the extent of EHE's, for example, the first day in the period JJA with an extent higher than a given percentage, and the behaviour of extent across choices of the threshold. Following our approach, other extents yielding other comparisons may be developed according to the interest of the user.
With regard to the regions under study, Pyrenees and Ebro Valley, we found that a clear increase of the extent of EHE's across time is observed. For example, the posterior probability of the yearly average extent across JJA in the decade 2006-2015 being higher than in 1966-1975 is higher than 0.8. It is also found that the first day of the summer where the extent of EHE's is higher than 10% has decreased around seven days. The extent of EHE's defined with all the considered thresholds is increasing, but the incidence of extents associated with more extreme thresholds is increasing at a slower rate than that for less extreme thresholds.
Future work will explore the temporal evolution of geographic extent for different regions to enable comparison. It will also examine the challenges of working with regions at larger spatial scales. Alternatively, with suitable computing capability, we can consider investigating extents at higher spatial resolution than done here. Further, while here we work with actual weather data, another goal is to consider projection of future spatial extent using future climate scenarios. scholarship ORDEN CUS/581/2020, from Gobierno de Aragón. Lastly, we thank the editor and anonymous reviewers for their thoughtful comments.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Availability of data and code Code (in R), temperature series used to fit the model, and simulated series used to compute the extents are available upon request.

Declarations
Conflicts of interest The authors declare that they do not have financial or personal interest that can inappropriately influence this work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.