1 Introduction

According to the latest IPCC report (IPCC 2021), the Earth’s climate system is undergoing a warming that is unprecedented in rapidity and extent over at least the last millennia. The human contribution to this global change is unequivocal and is due to artificially increased greenhouse gas concentrations in the atmosphere. Since the warming and the associated hazardous effects are projected to continue in the upcoming decades, adaptation actions are deemed necessary to reduce exposure and vulnerability to climate change (IPCC 2022). There is thus a growing interest in high-resolution climate monitoring and modelling in order to provide suitable local information for adaptation planning in both urban and rural areas (Bhatta et al. 2020; Eingrüber et al. 2022; Marchi et al. 2020). In this context, there is an emerging use of quality data from densely distributed amateur weather stations to complement the usually sparse institutional networks (de Vos et al. 2018; Frustaci et al. 2022; Giazzi et al. 2022). Here we explore the potential added value of this kind of combined network in the Aterno Valley in Central Italy.

A climate adaptation strategy includes four main sequential elements: risk assessment, sectorial impact, economic assessment, and decision support systems (European Commission et al., 2021). The results from each phase feed the following one, and the quality of the input data is crucial. The first stage of the process, risk assessment, is where the contribution of climate data is most relevant. The main climate-related risks considered are heatwaves (Arbuthnott et al. 2016), droughts (Hagenlocher et al. 2019), forest fires (Moritz et al. 2014), desertification, heavy precipitation (Wheaton and Kulshreshtha 2017), windstorms, hailstorms, floods (Wilby and Keenan 2012), landslides (Gariano and Guzzetti 2016), and sea level rise (Azevedo De Almeida and Mostafavi, 2016; Graham et al. 2013). The sectors for which the impact may be relevant include water supply (Short et al. 2012; Wheeler and von Braun 2013), agriculture (Olesen and Bindi 2002; Porter and Semenov 2005), forestry (Kolström et al. 2011), energy (Varianou Mikellidou et al. 2018; Viguié et al. 2021), tourism (Soontiens-Olsen et al. 2023; Steiger et al. 2023), urban areas (Hunt and Watkiss 2011), cultural heritage (Tansey 2015; Xia et al. 2012), critical infrastructures (Wells et al. 2022), transport (Gössling et al. 2023) and health (Levy et al. 2016; Luber and McGeehin 2008; Tirado et al. 2010).

In this context, there is a growing need for data quality assurance (Acosta-Morel et al. 2021; Giazzi et al. 2022; Xiong et al. 2022) and high temporal/spatial resolution of climate data (Amengual et al. 2012; Berne et al. 2004; Bruni et al. 2015; Cheng et al. 2012; Cunsolo Willox et al. 2012; de Vos et al. 2018; Frustaci et al. 2022). This is especially true recognizing the importance of citizen involvement and capacity building in the adaptation planning development and application (Chapman et al. 2017; Cornes et al. 2020; de Vos et al. 2017; Gharesifard et al. 2017; Meier et al. 2017; Mitheu et al. 2022; Runkle et al. 2018; Yvonne et al. 2020). The advent of Internet-of-Things (IoT) technologies is also a clear call for more accurate and local data (Gharesifard and Wehn 2016; Kamel Boulos et al. 2011; Mohamed Firdhous and Sudantha, 2020; Ren et al. 2021). Indeed, the added value of amateur weather observational networks to institutional networks is an emerging element aimed at improving the density and the local representativeness of climate data.

In this work, we test the added value of an amateur weather network in the Aterno Valley of the Central Italian Apennines. In recent decades, mean temperatures in the region have been rising at a rate of about 4 °C/century (Curci et al. 2021; Scorzini et al. 2018) and precipitation shows a tendency to decrease, especially in winter (Scorzini and Leopardi 2019). This broad information should be refined for the local context using denser local observations, which are expected to improve the accuracy of climate information, especially for precipitation (Di Luca et al. 2012).

In the manuscript, we first illustrate the data sources and the methods used for quality assurance. We then show sample analyses of the weather variables considered (temperature, humidity, wind and precipitation) in order to check which advantages could be gained if the dataset were used for climate risk assessment.

2 Methods

2.1 Location and data

The area of interest selected for this work is a mountain valley in central Italy around the city of L’Aquila (Fig. 1). The valley is crossed by the Aterno river and has a floor at about 650–750 m a.s.l. The valley is confined by two mountain ranges, the Gran Sasso massif to the North-East, with an average altitude of about 2000 m a.s.l., and the Sirente-Velino massif to the South-West, of about 1500 m a.s.l. The total area in the figure is about 50 × 50 km, or 2500 km², and the most urbanized part of the valley axis, centered on the city of L’Aquila, is about 20 km long and 4 km wide, thus with an approximate extent of 80 km².

We use four data sources of observations from ground-based automated weather stations (Table 1; Fig. 1). The first two, which we term “institutional” in this work, are run by the public office for the hydrological service of the Abruzzo Region (19 stations) and by the University of L’Aquila (1 station). The former dataset is the same used at daily frequency in past work that included quality check and homogenization in order to be suitable for long-term climate analysis (Curci et al. 2021). The other two are from “non-institutional” networks, namely the amateur association MeteoAquilano (11 stations) and the private foundation Osservatorio Milano Duomo (1 station).

There are 20 institutional and 12 non-institutional stations over the full area of interest of Fig. 1: adding the non-institutional stations increases the number of sites by about 60%, which translates into an average station density rising from 0.84 to 1.3 stations per 100 km². Considering the main urban area around the city of L’Aquila, the number of stations increases from 5 to 15 (+200%) when adding the non-institutional sites, with the average station density rising from 6.2 to 18.7 stations per 100 km². Indeed, most non-institutional stations (10 out of 12) are located in the main urban area.
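
The density figures can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that density is expressed as stations per 100 km² (the unit with which the quoted urban values are consistent) and uses the rounded areas given in Sect. 2.1, so it reproduces the quoted numbers only approximately. The function name `station_density` is introduced here and is not part of the original analysis.

```python
def station_density(n_stations: int, area_km2: float) -> float:
    """Return the number of stations per 100 km^2 (assumed unit)."""
    return 100.0 * n_stations / area_km2


# Rounded areas from the text (assumptions of this sketch)
full_area_km2 = 50 * 50   # whole domain of Fig. 1
urban_area_km2 = 20 * 4   # most urbanized part of the valley axis

full_institutional = station_density(20, full_area_km2)
full_combined = station_density(20 + 12, full_area_km2)
urban_institutional = station_density(5, urban_area_km2)
urban_combined = station_density(15, urban_area_km2)

print(full_institutional, full_combined)
print(urban_institutional, urban_combined)
```

With these rounded inputs the urban densities come out at 6.25 and 18.75 stations per 100 km², and the combined full-area density at 1.28, close to the values quoted above.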

All 32 stations are equipped with a thermometer and a pluviometer (12 of which are non-institutional), 18 with an anemometer (12 non-institutional), 20 with a hygrometer (12 non-institutional), and only 2 with a pyranometer (both institutional). In addition to the augmented station density, the non-institutional network promises to significantly improve the monitoring of wind and humidity conditions in the valley.

We use three years of data, from 2017 to 2019.

Table 1 List of the sites of weather stations used in this study. The owner short names denote the hydrological office of the Abruzzo Region (ABR), University of L’Aquila (UNI), the amateur association MeteoAquilano (MAQ), the private foundation Osservatorio Milano Duomo (OMD). The first two are classified as “institutional” (type “I”), the second two “non-institutional” (type “N”). The last five columns report the presence (X) or absence (-) of sensors of temperature (Temp), precipitation (Prec), wind speed and direction (Wind), relative humidity (Hum), and global downward solar radiation (SRad).
Fig. 1 Location of the weather stations used in this study. In red the “institutional” stations, in blue the “non-institutional” stations. The reference numbers are those listed in Table 1. In the small panel on the top-right the approximate location of the area in central Italy. The rectangle around the city of L’Aquila denotes the approximate extent of the main urban area.

2.2 Quality checks

The raw data are reported and stored with variable averaging times, spanning from 1 min to 1 h. We average all data over 1-hour time slots for subsequent analysis. In general, quality check procedures are divided into two main categories: the first is applied to single stations and checks for the presence of unphysical values and of anomalous temporal sequences (e.g. many repeated values, abrupt jumps, etc.), while the second is applied to groups of nearby stations in order to test the spatial consistency of the data.
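
As an illustration of the hourly aggregation step, the following minimal sketch averages irregularly sampled records over 1-hour slots using only the Python standard library. The function name `hourly_means`, the data layout (a list of timestamp/value pairs) and the choice of labelling each slot by the start of the hour are assumptions of this sketch, not details given in the text.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def hourly_means(records):
    """Average (timestamp, value) pairs over 1-hour slots.

    Each slot is labelled by the start of the hour (an assumption).
    Missing values, given as None, are skipped.
    """
    slots = defaultdict(list)
    for stamp, value in records:
        if value is None:
            continue
        slot = stamp.replace(minute=0, second=0, microsecond=0)
        slots[slot].append(value)
    return {slot: fmean(values) for slot, values in sorted(slots.items())}


# Hypothetical sub-hourly temperature readings (degC)
example = [
    (datetime(2017, 7, 1, 10, 0), 20.0),
    (datetime(2017, 7, 1, 10, 30), 22.0),
    (datetime(2017, 7, 1, 10, 45), None),
    (datetime(2017, 7, 1, 11, 15), 25.0),
]
hourly = hourly_means(example)
```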

The data from the Abruzzo Region database (ABR) were previously screened and homogenized on a daily basis according to international standards as reported in Curci et al. (2021): we exclude from the analysis the data of those days that were flagged as problematic for the stations of the ABR network.

For hourly or subhourly timeseries of weather variables, several authors suggested QC procedures which are similar to those applied to daily data, but that account for the generally higher variability of subdaily data (Cerlini et al. 2020; DeGaetano 1997; Estévez et al. 2011; Giazzi et al. 2022; Lewis et al. 2021). For the detection of temporal (single-station) and spatial (stations intercomparison) outliers, we adopt a control method similar to that suggested by Giazzi et al. (2022), which is based on the check of the distribution of the anomalies (z-score) of the timeseries. The main reason for this choice is to keep the number of control parameters and thresholds at a minimum.

The first check on hourly data is for impossible values. Temperatures lower than −40 °C or higher than 50 °C are excluded. Negative values of precipitation, wind speed and solar radiation are also excluded. Wind direction must lie in the range 0–360°, and records with negative wind speed are removed. Relative humidity is retained only in the range 5–100%.
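
The range check can be sketched as a simple lookup of admissible limits. The limits below are those stated in the text; the variable keys and the function name `is_plausible` are hypothetical, and treating the limits themselves as valid values is an assumption.

```python
INF = float("inf")

# Admissible ranges from the text; boundary values are kept (assumption)
LIMITS = {
    "temperature": (-40.0, 50.0),        # degC
    "relative_humidity": (5.0, 100.0),   # %
    "wind_direction": (0.0, 360.0),      # degrees
    "precipitation": (0.0, INF),         # non-negative only
    "wind_speed": (0.0, INF),            # non-negative only
    "solar_radiation": (0.0, INF),       # non-negative only
}


def is_plausible(variable: str, value: float) -> bool:
    """Return True if the value lies inside the admissible range."""
    low, high = LIMITS[variable]
    return low <= value <= high


checks = [
    is_plausible("temperature", 51.0),
    is_plausible("temperature", -40.0),
    is_plausible("relative_humidity", 4.9),
    is_plausible("wind_speed", -0.1),
    is_plausible("precipitation", 0.0),
]
```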

The second check is on abrupt jumps in consecutive values. We found that only temperature and humidity needed such a check and we excluded records with temperature differences higher than 15 °C between consecutive values, and relative humidity with differences higher than 50%.
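
A minimal sketch of the jump check follows. The thresholds are those quoted above; the text does not say which of the two consecutive records is discarded, so flagging the later one is an assumption of this sketch, as are the names `MAX_JUMP` and `flag_jumps`.

```python
# Maximum admissible difference between consecutive hourly values
MAX_JUMP = {"temperature": 15.0, "relative_humidity": 50.0}


def flag_jumps(values, max_jump):
    """Flag a record when it differs from the previous one by more than max_jump.

    Gaps (None) are never flagged and break the comparison.
    """
    flags = [False] * len(values)
    for i in range(1, len(values)):
        previous, current = values[i - 1], values[i]
        if previous is None or current is None:
            continue
        if abs(current - previous) > max_jump:
            flags[i] = True
    return flags


# Hypothetical temperature series with an isolated spike
series = [10.0, 11.0, 30.0, 12.0, None, 40.0]
jump_flags = flag_jumps(series, MAX_JUMP["temperature"])
```

With this convention an isolated spike produces two flags, one for the jump up and one for the return to normal values, so in practice only the first of the pair may need to be removed.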

The third check is on consecutive repeated values. We allow a maximum number of identical values of 6 for temperature, 2880 (120 days) for precipitation, 12 for wind speed and relative humidity, 24 for solar radiation. These thresholds are selected based on visual inspection of the data and the timings of typical daily cycles and extended dry periods in the case of precipitation.
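
The repeated-value check amounts to measuring the length of runs of identical values. In the sketch below the thresholds are those listed above, while flagging the whole run once it exceeds the threshold is an assumption (one could equally keep the first admissible values); the names `MAX_REPEATS` and `flag_repeats` are hypothetical.

```python
# Maximum number of identical consecutive hourly values, from the text
MAX_REPEATS = {
    "temperature": 6,
    "precipitation": 2880,   # 120 days of hourly data
    "wind_speed": 12,
    "relative_humidity": 12,
    "solar_radiation": 24,
}


def flag_repeats(values, max_repeats):
    """Flag every record belonging to a run longer than max_repeats."""
    flags = [False] * len(values)
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the series or when the value changes
        if i == len(values) or values[i] != values[start]:
            if i - start > max_repeats:
                for j in range(start, i):
                    flags[j] = True
            start = i
    return flags


# Hypothetical series, checked with a deliberately small threshold
repeat_flags = flag_repeats([1.0, 1.0, 1.0, 2.0, 2.0], max_repeats=2)
```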

The fourth check is on excessive anomaly of the data. We first transform the variables in order to have more symmetric distributions, applying the logarithm to non-zero precipitation, and a power transformation to the others, with the exception of temperature which was not transformed. The exponent of the power transformation is 0.5 for wind speed, 2 for relative humidity and 0.25 for wind speed. We then compute the z-score for each transformed variable \( x\) as:

$$ z\text{-score}=\frac{x-\mathrm{median}}{\mathrm{IQR}}$$

where \( IQR\) is the interquartile range. The z-score is calculated from medians and IQRs as a function of month and hour of the day. We exclude from the analysis all records with a z-score higher than 3.
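
The single-station anomaly check can be sketched as follows, assuming the values have already been transformed. Medians and IQRs are computed separately for each (month, hour) group, as described above. The quartile method (`inclusive`), the use of the absolute value of the z-score for flagging, and the function name `robust_zscores` are assumptions of this sketch.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median, quantiles


def robust_zscores(records):
    """Return (x - median) / IQR for each (timestamp, value) pair.

    Median and IQR are computed per (month, hour) group. The score is None
    when the group has fewer than two values or a zero IQR.
    """
    groups = defaultdict(list)
    for stamp, value in records:
        groups[(stamp.month, stamp.hour)].append(value)

    stats = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # quartiles are undefined for a single value
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        stats[key] = (median(values), q3 - q1)

    scores = []
    for stamp, value in records:
        med, iqr = stats.get((stamp.month, stamp.hour), (None, None))
        scores.append((value - med) / iqr if iqr else None)
    return scores


# Hypothetical noon values on six July days, the last one anomalous
example = [
    (datetime(2017, 7, day, 12), value)
    for day, value in enumerate([10, 11, 12, 13, 14, 50], start=1)
]
scores = robust_zscores(example)
flags = [z is not None and abs(z) > 3 for z in scores]
```

In this toy group the median is 12.5 and the IQR is 2.5, so only the last value exceeds the threshold of 3.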

The fifth and last check is on spatial consistency of the data and also based on the z-score of transformed variables. For each station, we compute the 1st and the 99th percentiles of the z-score of the nearby stations in a radius of 50 km, and their standard deviations. We exclude from the analysis all records having a z-score higher than 99th percentile plus \( Y\) times the standard deviation, and lower than the 1st percentile minus \( Y\) times the standard deviation. We selected a multiplication factor \( Y\) of the standard deviation equal to 2 for temperature and precipitation, and to 4 for relative humidity and wind speed. We do not apply the spatial consistency check to solar radiation, since only two stations are available.
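
One possible reading of the spatial check is sketched below. It pools the z-scores of the neighbours within 50 km, takes their 1st and 99th percentiles and their standard deviation, and flags the records of the target station falling outside the resulting bounds. The pooling of neighbours, the use of the population standard deviation, the great-circle distance and all function names are assumptions; the text does not specify these details.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import pstdev, quantiles

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def spatial_bounds(neighbour_z, y):
    """Return (low, high) bounds: 1st/99th percentile -/+ y standard deviations."""
    percentiles = quantiles(neighbour_z, n=100, method="inclusive")
    p01, p99 = percentiles[0], percentiles[98]
    spread = pstdev(neighbour_z)
    return p01 - y * spread, p99 + y * spread


def flag_spatial(station_z, neighbour_z, y):
    """Flag the station z-scores lying outside the neighbour-based bounds."""
    low, high = spatial_bounds(neighbour_z, y)
    return [z < low or z > high for z in station_z]


# Hypothetical pooled z-scores of stations within 50 km, with Y = 2
neighbours = [-1.0, -0.5, 0.0, 0.5, 1.0]
spatial_flags = flag_spatial([0.0, 2.0, 3.0, -5.0], neighbours, y=2)
one_degree_km = haversine_km(42.0, 13.0, 43.0, 13.0)
```

A neighbour list for each station can then be built by keeping the sites for which `haversine_km` returns at most 50.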

In Table 2 we report the percentage of data excluded after the quality check procedures, for each step of the process. Less than 1% of temperature, precipitation and solar radiation data were flagged, less than 4% of relative humidity data (mostly because of repeated values and of single-station z-score exceedances), and about 6% of wind data (mostly because of repeated values). Illustrative examples of records flagged after the quality check procedures are displayed in Figure S1 in the supplementary material.

Table 2 Percentage of data flagged for each quality test and variable, on the timeseries 2017–2019 of all stations

In Figure S2 we display all the timeseries available, with histograms of the values after the quality checks. The timeseries for the stations of Scoppito, Tagliacozzo and Tornimparte are not complete for the three-year period and they cover just the last half of the period: for that reason we exclude these stations from the analyses over the whole period and we retain them only for specific short periods. In Figure S3 we display the monthly mean values at all stations for the period 2017–2019. The precipitation data at Aterno a Fagnano, Castel del Monte and Collepietro are unrealistically low, so we exclude them from further analysis.

3 Results

We analyze the datasets in order to highlight the representation of the spatial variability in the Aterno Valley, employing alternatively sensors from only institutional, only non-institutional, and mixed networks. This way, the possible added value of the non-institutional networks should emerge. We analyze both the “climatological” averages of spatial features, such as the urban heat island, and the reconstruction of specific short-term episodes, such as a heat wave period. We present the results organized for variables, namely temperature, precipitation, wind and relative humidity. We do not analyze solar radiation, since only the time series of two sensors, both institutional, are available.

3.1 Temperature

First we look at the urban heat island (UHI) effect using the full dataset 2017–2019. We calculate monthly mean temperatures at each station and visualize the data on selected stations alternatively along or across the valley axis as a function of the month. We also compute the hourly average of the temperature difference between pairs of stations representative of urban and rural areas, as a function of the month, in order to visualize the intensity of the UHI. The idea is derived from the seminal work of Colacino and Lavagnini (1982).
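
The UHI intensity described above can be sketched as the mean urban-minus-rural temperature difference grouped by month and hour of the day. The data layout (dictionaries keyed by hourly timestamps), the restriction to timestamps available at both stations, and the function name `uhi_by_month_hour` are assumptions of this sketch.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def uhi_by_month_hour(urban, rural):
    """Mean urban-minus-rural temperature difference per (month, hour).

    urban, rural: dicts mapping hourly timestamps to temperature (degC).
    Only timestamps present at both stations are used.
    """
    differences = defaultdict(list)
    for stamp in urban.keys() & rural.keys():
        differences[(stamp.month, stamp.hour)].append(urban[stamp] - rural[stamp])
    return {key: fmean(values) for key, values in sorted(differences.items())}


# Hypothetical hourly temperatures at an urban and a rural station
urban = {
    datetime(2017, 7, 1, 3): 18.0,
    datetime(2017, 7, 2, 3): 19.0,
    datetime(2017, 7, 1, 14): 29.0,
}
rural = {
    datetime(2017, 7, 1, 3): 16.0,
    datetime(2017, 7, 2, 3): 16.0,
    datetime(2017, 7, 1, 14): 30.0,
    datetime(2017, 7, 3, 3): 15.0,  # no urban counterpart, ignored
}
uhi = uhi_by_month_hour(urban, rural)
```

Positive values indicate an urban site warmer than the rural one; in this toy example the night-time value is positive and the daytime value negative.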

In the first row of Fig. 2 we show the along-axis transect of monthly mean temperature, using institutional only, non-institutional only, and mixed type sensors, respectively. The institutional network spans a larger area than the non-institutional one, thus the UHI effect is better visualized with the former dataset. However, the magnitude of the UHI, of the order of +2 °C in the warm months and +1 °C in the cold months, is similar in the two datasets. Mixing the station types yields a slightly more refined view because of the higher station density, for example highlighting a relative minimum in the UHI effect in the urban center, which is slightly cooler than the surrounding urban area. From Figure S4 we may see that the UHI effect is strongest at night, as shown by the daily minimum temperature field, while during daytime the temperature is more uniform in the valley, with daily maximum temperature slightly lower around the city center than in the surroundings.

The cross-axis transect of mean temperature, displayed in the second row of Fig. 2, shows that the temperature in the valley is on average higher by more than 2 °C with respect to the adjacent slopes. The inclusion of the non-institutional network allows also in this case a smoother representation of the spatial distribution. We notice that the effect highlighted here is a superimposition of the UHI and of the different altitudes of the stations (see Table 1).

In the third row of Fig. 2 we show the daily cycle of the UHI intensity as a function of the month. For this purpose, we select the stations in L’Aquila (urban area) and Pizzoli (rural area, mostly upwind of the city), since both locations have nearby institutional and non-institutional sensors. The mixed pair of the non-institutional station in the city and the institutional station in Pizzoli is the one with the most similar altitudes, thus the effect of the UHI should be better isolated. All the combinations of networks show a positive UHI (urban warmer than rural) at night and a negative UHI (urban cooler than rural) during the day. There is a sharp transition from positive to negative UHI just after sunrise, and then a more gradual switch from negative to positive near sunset. The magnitude of the UHI effect is lower with institutional data and higher with non-institutional data, while the mixed case is intermediate. The latter plot appears more regular than the others and probably more realistic, with UHI intensities in the range from −3 °C to +3 °C depending on the time of the day, consistent with UHI observed elsewhere (Di Bernardino et al. 2023; Oliveira et al. 2021). The dependence of the UHI on the choice of the pair of urban-rural stations and the significant intensity detected using the non-institutional stations warrant further analysis in the future.

Fig. 2 Spatial distribution of temperature around the city of L’Aquila in the Aterno Valley, using alternatively data from (a) institutional, (b) non-institutional, and (c) mixed networks. On the first row, the monthly along-axis valley transect (NW-SE) of mean temperature, on the second row the cross-axis valley transect (S-N), on the third row the daily cycle of the urban heat island intensity as a function of the month.

In Fig. 3 we illustrate the advantage of using a denser station network, thanks to the addition of non-institutional to institutional sensors, in reconstructing the average temperature over the valley during a heat wave event (29 July–10 August 2017). The map is produced with a universal kriging algorithm using topography as auxiliary spatial field. With the institutional stations only, the heat wave appears to be more intense, especially in the downslope valleys to the South-East of the city, while, adding the non-institutional stations, the maximum intensity is around the city core and decreases in the surroundings. The visual comparison with observed data reveals an unrealistically high temperature field in the valley when using the institutional network only, while the field looks much more consistent with the spatial distribution of the observations when using all stations. The maps of kriging standard errors for the two images are shown in the supplement (Figure S5): the estimated reduction of the final uncertainty is more than 0.5 °C.

Fig. 3 Reconstructed mean temperature during a heat wave event (29 Jul–10 Aug 2017) using alternatively (a) institutional only and (b) mixed institutional and non-institutional weather stations. The colored dots denote the observed values at the sensor locations.

A more quantitative evaluation of the improvement is given in Table S1 in the supplementary material, where we compare the average temperature observed at each site during the heat wave episode with the values interpolated at each location from the maps of Fig. 3. We found that in most cases there is an improvement of the comparison when adding the non-institutional stations, with a mean bias passing from 0.5 °C to 0.01 °C.

3.2 Precipitation

In Fig. 4 we display the annual precipitation map for the year 2019 over the area of interest. Using the institutional network, the broad features emerging are a dry valley (annual rainfall of about 600 mm) surrounded by wet mountains (up to more than 1200 mm), and a valley that is relatively wetter to the North-West than to the South-East. With the addition of the non-institutional network we have an even better representation of the large inhomogeneity of this complex terrain, with the reconstructed drier area restricted to the valley just around the city and to the hills to the South-East, and wetter slopes close to the city, to the South and to the North. The institutional network is thus sufficient to capture the general picture, but the non-institutional network may provide a better local characterization when needed.

Fig. 4 Reconstructed annual precipitation in 2019 using alternatively (a) institutional only and (b) mixed institutional and non-institutional weather stations. The colored dots denote the observed values at the sensor locations.

In Fig. 5 we show an example of a specific rain event in the period 21–24 August 2019. The period was characterized by strong anticyclonic conditions, with a high-pressure system extending from North Africa to Central Europe, and the intrusion of cold air in the middle troposphere, yielding favorable conditions for sparse showers and thunderstorms over the Apennine mountains. The map of the cumulated precipitation over the four days reconstructed using the institutional network nicely captures the narrow band where the storms reached the ground. Adding the non-institutional sensors, we get an even sharper reconstruction of the location impacted by the highest precipitation, in a quite limited area just to the North of the city. Although the kriging error remains similar in the two cases (Figure S6), the comparison of the observations with the values interpolated from the maps confirms the improvement gained with the addition of non-institutional stations, especially in reproducing the high end of the distribution (see Table S2).

Fig. 5 Reconstructed precipitation event (21–24 Aug 2019) using alternatively (a) institutional only and (b) mixed institutional and non-institutional weather stations. The colored dots denote the observed values at the sensor locations. Please note the different color scale in the two maps.

3.3 Wind

As reported in the data section, the institutional network contains 6 wind sensors while the non-institutional 12. In the main urban area there is only 1 institutional station providing wind data, while most of the non-institutional stations are located there. It is thus thanks to the non-institutional network that we are able to display a picture of the wind flow in the area.

In Fig. 6 we show the wind roses at the station locations on the map. The institutional station is near the center of the area and the solid lines denote the approximate profile of the valley walls. Considering the depth of the valley of about 800 m and its width of about 10 km, in the absence of strong synoptic forcing (i.e. more than 0.5 hPa / 100 km) the onset of a local thermally-driven circulation is expected (Whiteman 1990). The typical mountain-breeze system, with nighttime downslope katabatic flow and daytime upslope anabatic flow, was actually observed in the area (Curci et al. 2012 and references therein). The system is nicely visualized using the institutional stations in and around the valley, as shown in Fig. 6. Even without a careful screening of periods with low synoptic forcing, the average of the three years displays the prevalence of light downslope winds at night and of faster upslope winds during the day.

Fig. 6 Wind roses at each station along the Aterno Valley using 2017–2019 data separated in nighttime (left) and daytime (right). The red lines denote the approximate contour of the valley walls and the arrows the typical directions of the wind flow. The day/night splitting is done based on the astronomical sunrise and sunset times at each location (Ropkins and Carslaw 2012).

In Fig. 7 we show the along-axis transect of the average wind speed over the 2017–2019 period as a function of the month, and the corresponding average daily cycle for three selected stations. We use again both institutional and non-institutional stations. The plot shows a clear acceleration of the wind to the South-East of the city, passing from values of about 0.6 m/s to values up to 1.5 m/s. This may be explained by the Venturi effect due to the narrowing of the valley walls downwind of the city for a flow that prevails from the North-West, as also illustrated in Fig. 6. We also note a further local acceleration of the wind at the city center, which is located on a hill in the valley center at an elevation of about 50 m above the valley floor. This picture again nicely illustrates how the addition of data from the non-institutional network helps in better characterizing fine atmospheric features of the place.
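
As a rough consistency check on the Venturi interpretation, one may assume an idealized steady, incompressible flow channelled through a valley cross-section. The symbols \( A_1\), \( A_2\) (upwind and downwind cross-sectional areas) and \( v_1\), \( v_2\) (the corresponding mean wind speeds) are introduced here for illustration only and are not part of the original analysis. Mass continuity then gives:

$$ v_1 A_1 = v_2 A_2 \quad\Rightarrow\quad \frac{A_1}{A_2}=\frac{v_2}{v_1}\approx \frac{1.5}{0.6}=2.5$$

Under these assumptions the observed acceleration would correspond to a reduction of the effective flow cross-section by a factor of about 2.5. The real flow is neither perfectly confined nor steady, so this is only an order-of-magnitude indication.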

Fig. 7 Spatial distribution of wind speed (m/s) around the city of L’Aquila in the Aterno Valley, using data from both institutional and non-institutional networks. On the left, the monthly along-axis valley transect (NW-SE) of wind speed is shown. On the right, the average daily cycle at three selected stations is shown.

In Fig. 8 we illustrate the data recorded at three stations during a sample high wind day (22 December 2019). A low pressure system passed over Italy inducing a fast northwesterly flow over the valley, with wind speed higher than 6 m/s during the central part of the day. The wind is well aligned with the main NW-SE valley axis, as displayed by the wind roses. The direction is very similar at the station to the West (Pizzoli) and the one in the city center, while it is slightly rotated to the North at the station to the East (Poggio Picenze), possibly due to local orographic effects. The acceleration in the downwind part of the valley is evident also during this episode, with higher speeds in the South-East.

Fig. 8 Illustration of a sample high wind episode in the Aterno Valley on 22/12/2019. In the upper panel the wind speed time series (m/s) at three selected stations, in the bottom panel the corresponding wind rose.

3.4 Humidity

In Fig. 9 we show the effect of the city on humidity levels, the so-called urban dry or moisture islands (UDIs, UMIs) (Huang and Song 2023), using both institutional and non-institutional networks. As for the wind, this evaluation is made possible by the latter. We may notice a substantial drying effect of the city of about −20% at night and −10% during the day in terms of relative humidity. The difference is enhanced during the warmest months. The phenomenon is in broad agreement with observations worldwide (Meili et al. 2022) and it deserves more careful inspection in future work.

Fig. 9 Same as Fig. 7, but for relative humidity (%).

In Fig. 10 we show the comparison of the reconstructed average temperature and the average Humidex during the same heat wave period illustrated in Fig. 3. The Humidex index (Barnett et al. 2010) combines temperature and relative humidity in order to have a better prediction of the thermal comfort of an average person. As previously noted, the highest temperature was experienced in the city core and in the Eastern downwind area. Considering humidity, we see that the thermal discomfort is mitigated in the city core by lower humidity, while it is further enhanced in the downwind area to the East. A decreased discomfort on the slopes around the valley floor is also emerging from the Humidex map. Such evaluation is only possible with the aid of the relative humidity sensors of the non-institutional network.
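
For reference, a minimal sketch of a Humidex calculation from temperature and relative humidity is given below. It uses the common definition of Humidex as air temperature plus 5/9 of the vapour pressure excess over 10 hPa, with the saturation vapour pressure estimated from a Magnus-type approximation. The text does not state which vapour pressure formula was used, so the Magnus coefficients and the function names are assumptions of this sketch.

```python
from math import exp


def vapour_pressure_hpa(temp_c: float, rh_percent: float) -> float:
    """Actual vapour pressure (hPa) from a Magnus-type approximation."""
    saturation = 6.112 * exp(17.62 * temp_c / (243.12 + temp_c))
    return saturation * rh_percent / 100.0


def humidex(temp_c: float, rh_percent: float) -> float:
    """Humidex from air temperature (degC) and relative humidity (%)."""
    return temp_c + 5.0 / 9.0 * (vapour_pressure_hpa(temp_c, rh_percent) - 10.0)


# Same temperature, drier versus more humid air (hypothetical values)
dry_site = humidex(30.0, 30.0)
humid_site = humidex(30.0, 50.0)
```

At equal temperature the index is lower where the air is drier, which is the mechanism behind the mitigation of discomfort in the city core noted above.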

Fig. 10 Reconstructed (a) mean temperature and (b) Humidex index during a heat wave event (29 Jul–10 Aug 2017) using both institutional and non-institutional stations in the area around the city center.

4 Conclusions

We analyzed three years of data collected in the Aterno Valley around the city of L’Aquila in central Italy in order to verify the potential added value of combining institutional and non-institutional near-surface weather station networks. The non-institutional data were found to be of good quality, since only a relatively small fraction of them was discarded after temporal and spatial quality checks.

One main advantage of adding the non-institutional sensors to the database is the presence of many more observations of wind and relative humidity: indeed, only a limited number of institutional sensors monitor those variables in the area of interest.

For temperature and precipitation, the number of institutional sensors is already adequate to describe the spatial characteristics of the related fields; however, complementing them with the non-institutional stations helps to better characterize the spatial features over both long and short time scales. For example, the extent and intensity of the urban heat island can be better identified and the areas most impacted by a rainstorm better delimited.

The combined use of institutional and non-institutional wind observations allows the description of the wind flow in the valley. For example, we were able to visualize the broad features of the typical mountain-valley breeze system and the acceleration of the flow when passing through the narrowing of the valley.

The additional data on relative humidity allowed diagnostics that were not feasible with the institutional network only. We were able to detect the urban dry island effect and to calculate a thermal comfort index (Humidex) during a heat wave period.

In conclusion, the first analysis presented here showed that non-institutional networks have a clear potential for added value in improving the description of fine spatial features of temperature and precipitation fields and in allowing the spatial characterization of wind and humidity fields. More in-depth analysis is warranted at least for the phenomena related to the last two variables. The collaboration of public institutions with non-institutional entities in this context is thus highly encouraged, also in view of the increasing institutional effort in guiding climate adaptation planning.