Validation of ERA5-Land temperature and relative humidity on four Peruvian glaciers using on-glacier observations

Weather and climate conditions drive the evolution of tropical glaciers which play an important role as water reservoirs for Peruvian inhabitants in the arid coast and semi-arid Andean region. The scarcity of long-term high-quality observations over Peruvian glaciers has motivated the extensive use of reanalysis data to describe the climatic evolution of these glaciers. However, the representativeness and uncertainties of these reanalysis products over these glaciers are still poorly constrained. This study evaluates the ability of the ERA5-Land reanalysis (ERA5L) to reproduce hourly and monthly 2 m air temperature and relative humidity (T2m and Rh2m, respectively) over several Peruvian glaciers. We compared the ERA5L with data from four on-glacier automatic weather stations (AWS), whose hourly time series were completed with nearby stations, for the period January 2017 to December 2019. Results indicates a better performance of the reanalysis for T2m (r >0.80) than for Rh2m (∼0.4< r <∼0.6) in all four glaciers. Concerning the observations, both parameters show a daily cycle influenced by the presence of the glacier. This influence is more prominent during the dry months when the so-called glacier damping and cooling effects are stronger. On a monthly time scale, the ERA5L validation for both parameters are better in wet outer tropical sites (RMSE between ±0.2°C for T2m and between 3%–7% for Rh2m) rather than in dry outer tropical sites (RMSE between ±0.2°C for T2m and between 3%–7% for Rh2m). Among all sites considered in the study, the Rh2m bias is the highest in the Cavalca glacier (correlation of 0.81; RMSE 13%, MAE 11% and bias 8.3%) and the lowest in Artesonraju glacier (correlation of 0.96; RMSE 3%; MAE 2.3% and bias — 0.8%). Based on certain considerations outlined in this paper, it is appropriate to use ERA5L to characterize T2m and Rh2m conditions on Peruvian glaciers, particularly in the wet outer tropics.

Abstract: Weather and climate conditions drive the evolution of tropical glaciers which play an important role as water reservoirs for Peruvian inhabitants in the arid coast and semi-arid Andean region. The scarcity of long-term high-quality observations over Peruvian glaciers has motivated the extensive use of reanalysis data to describe the climatic evolution of these glaciers. However, the representativeness and uncertainties of these reanalysis products over these glaciers are still poorly constrained. This study evaluates the ability of the ERA5-Land reanalysis (ERA5L) to reproduce hourly and monthly 2 m air temperature and relative humidity (T2m and Rh2m, respectively) over several Peruvian glaciers. We compared the ERA5L with data from four on-glacier automatic weather stations (AWS), whose hourly time series were completed with nearby stations, for the period January 2017 to December 2019. Results indicates a better performance of the reanalysis for T2m (r >0.80) than for Rh2m (~0.4< r <~0.6) in all four glaciers. Concerning the observations, both parameters show a daily cycle influenced by the presence of the glacier. This influence is more prominent during the dry months when the so-called glacier damping and cooling effects are stronger. On a monthly time scale, the ERA5L validation for both parameters are better in wet outer tropical sites (RMSE between ±0.2°C for T2m and between 3%-7% for Rh2m) rather than in dry outer tropical sites (RMSE between ±0.2°C for T2m and between 3%-7% for Rh2m). Among all sites considered in the study, the Rh2m bias is the highest in the Cavalca glacier (correlation of 0.81; RMSE 13%, MAE 11% and bias

Validation of ERA5-Land temperature and relative humidity on four
Peruvian glaciers using on-glacier observations 1 Introduction Andean glaciers are among the fastest shrinking and largest contributors to sea-level rise on Earth (Dussaillant et al. 2019). Particularly, tropical Andean glaciers are experiencing an unprecedented retreat since their last maximum extension during the Little Ice Age (mid-17th-early 18th century) and it is more pronounced on small glaciers at low altitudes (Rabatel et al. 2013;Zemp et al. 2015). In the Peruvian Andes, the disappearance of glaciers in the Pacific basin is leading to seasonal fresh-water shortages in the hyperarid and populated coast. Moreover, glaciers play an important role in the customs and cultures of Andean communities (Bolin 2009;Chevallier et al. 2011;Drenkhan et al. 2015).
A big challenge for studying and modelling the recent evolution of the Peruvian glaciers is the unavailability of long-term continuous meteorological measurements, which are often sparse in mountainous areas (Richardson et al. 2004), especially at high elevations or in uninhabited regions (Rolland 2003). The difficulty of access, installation and maintenance of weather stations at high altitudes and especially on glaciers restricts the availability of accurate weather and climate data (Beniston et al. 1997). As a result, the few on-glacier automatic weather stations (AWS) available are generally not properly maintained, resulting in frequent errors and gaps in the collected data. Recent and accurate onglacier meteorological measurements are only available in 4 Peruvian mountain ranges (Cordillera Blanca, Cordillera Central, Nevado Coropuna and Cordillera Vilcanota). For all these reasons, numerous studies used reanalysis data to characterise the climatic conditions in mountain glaciers (e.g. Francou and Coudrain 2005;Pan et al. 2020;Schauwecker et al. 2014;Vuille et al. 2008). Although these data sets provide a spatially and temporally continuous series of tropospheric parameters, they are not free of uncertainties and biases from different sources. These inaccuracies in reanalysis are particularly relevant in high mountain regions, both in mid-latitudes and in the tropics (Hofer 2012;Trenberth et al. 2001;Wang and Zeng 2012) where low observations are assimilated in them. In addition, they often provide free-air rather than near-surface variables, neglecting complex interactions between the earth's surface and the boundary layer. A comparison between temperatures from NCEP-NCAR reanalysis and mountain observations around the world was performed by Pepin and Seidel (2005). Their results showed that free-air temperatures from NCEP-NCAR reanalysis were further adjusted to surface temperatures in mountain summits than in lower mountain sites. However, free-air temperatures may not be optimal to describe near-surface conditions over glaciers.
While most common reanalysis variables used to describe the atmospheric conditions over the Peruvian glaciers are temperature and humidity at 500 hPa pressure level (Chevallier et al. 2011;Francou et al. 2003Schauwecker et al. 2014;Veetil et al. 2017) only a few studies compared observations with reanalysis data. For the Vilcanota mountain range, Salzmann et al. (2013) compared monthly temperature from Santa Rosa manual weather station with 500 hPa temperature from the NCEP/NCAR reanalysis finding a high correlation. In the Cordillera Blanca, Hofer et al. (2012) used a regression analysis to evaluate the performance of a comprehensive assessment of 5 reanalyses in representing monthly air temperature. They concluded that the ERA-Interim presented the best abilities in representing monthly intra-seasonal and inter-annual air temperature variations at the Artesonraju Glacier weather station (4820 masl). In addition, climate variability in the Pariacaca mountain range (Cordillera Central in the present work) was described for the 1980-2017 period using monthly ERA5 data (Lopez-Moreno et al. 2020). More recently, Birkel et al. (2022) compared 2-metre temperature, precipitation and other climate variables from six widely used global reanalysis products (ERAI, CFSR, NNR1, JRA55, MERRA2 and the new ERA5) to provide context that can inform the calibration of ice core climate proxy records in the central Andes. In this study, 2 metre monthly temperature and precipitation from each reanalysis were validated against surface observations at different high-altitude sites in the central Andes. In other glacial regions such as the Antarctic Peninsula, the performance of surface ERA5 and ERA-Interim parameters were evaluated by comparing different variables with insitu observations (Hillebrand et al. 2020;Tetzner et al. 2019). These works concluded that ERA5 showed good skills in representing monthly temperature over glaciers.
The temperature and precipitation variability on Peruvian glaciers was also described using nearby offglacier weather stations (Mark and Seltzer 2005;Vuille et al. 2008). However, these weather stations are often located at lower altitudes and relatively distant from glaciers introducing errors related to the environmental lapse rate correction. Therefore, the climatic conditions over glaciers may not be properly described using off-glacier measurements. From autumn 2019, the new global ERA5-Land reanalysis (hereafter ERA5L) was released under the auspices of the Copernicus Climate Change Service. This product is producing an enhanced global dataset for the land component of the fifth generation of European Reanalysis (ERA5) providing an improved and higher-resolution description of the water and energy cycles at the surface level (Muñoz-Sabater et al. 2021). Due to its recent availability, it has not yet been validated with observations in the Andes and only a few studies have assessed its performance in capturing different surface variables in other regions (Camargo and Schmidt 2020; Cao et al. 2020;Pelosi et al. 2020;Sheridan et al. 2020).
The aim of this study is to assess the ability of the ERA5L to represent hourly and monthly 2 metres air temperature and relative humidity (hereinafter T2m and Rh2m respectively) in four different Peruvian glaciers. This is the first study that validates ERA5L using accurate on-glacier observations that were not assimilated in the reanalysis for the 2017-2019 period. Hence, our evaluation will contribute to a better understanding of the strengths and shortcomings of this reanalysis product in representing critical variables driving tropical glacier dynamics.

Study area
The Peruvian Andes glaciers represent around 70% of the tropical glaciers on Earth and are located entirely in the Outer Tropics (Kaser and Osmaston 2002). The climatic conditions in the outer tropical Andes are defined by a marked seasonality in cloudiness, humidity and precipitation, but a low annual thermal amplitude. Thus, subtropical conditions prevail during the dry season (DS) while tropical conditions during the wet season (WS) as described in Kaser and Osmaston (2002) and Rabatel et al. (2013). In general terms, fluctuations in upperlevel zonal winds are related to the moisture transport variability at the diurnal (Garreaud et al. 2009;Reuder and Egger 2006), intraseasonal (Falvey and Garreaud 2005;Garreaud 2000) and interannual time scales (Garreaud and Aceituno 2001;Garreaud et al. 2003;Vuille and Keimig 2004;Segura et al. 2016). However, in the southern Peruvian Andes negative (positive) anomalies in daily maximum and minimum temperatures (precipitation) during the DS were related to the presence of extratropical upper tropospheric troughs located off the southern coast of Peru (Bonshoms et al. 2020). In addition, the influence of dynamic and orographic convergence of moist air from the Amazon may have an impact on precipitation over the eastern Andes glaciers (Cordillera Vilcanota and, to a lesser intensity, in Cordillera Blanca). This study focuses on glaciers from four different mountain ranges of the Peruvian Andes: Cordillera Blanca (CB): The most extensively glaciated tropical mountain range with an ice surface of 448.81 km 2 (INAIGEM 2018). Despite its proximity to the Pacific Ocean, this mountain range is wellexposed to eastern wet Amazonian flows. The Artesonraju glacier is a valley type glacier with an estimated surface area of 5.4 km 2 in 2006 (Rabatel et al. 2013).
Cordillera Central (CC): This mountain range also known as Cordillera Pariacaca is located on the western slope of the central Peruvian Andes. Despite their limited glacier extension of 42.4 km 2 (INAIGEM 2018), some of their glaciers provide water to the extremely populated Peruvian coast. Due to proximity to the Pacific Ocean and the blocking of Amazonian flows by an eastern cordillera, the climate is drier than in Cordillera Blanca. The Chuecon glacier is a valley type glacier formed by the confluence of two tongues descending from Nevado Norma and Suiricochas. We calculated a glacier surface of ~2 km 2 in 2019, based on a Sentinel 2-L2A image (24-08-2019).
Cordillera Vilcanota (CV): A mountain range located in the eastern Peruvian Andes with a glacier surface of 255.44 km 2 (INAIGEM 2018). Wetter conditions prevail due to their proximity to the rainfall hot spot east of the Andes (Chavez and Takahashi 2017) and the occasional influence of upper tropospheric troughs during the austral winter (Bonshoms et al. 2020). The Quelccaya ice-cap with a surface of 42.8 km 2 in 2009 (Salzmann et al. 2013) has been considered the largest tropical glacier for many years (Salzmann et al. 2013;Thompson et al. 2006).
Nevado Coropuna (CO): A volcanic complex entirely covered by an ice cap with an estimated surface of 44.1±3.9 km 2 (Kochtitzky et al. 2018) surpassing Quelccaya as the largest ice cap in the tropics (Kochtitzky et al. 2018;Ubeda et al. 2018). The Cavalca glacier is a slope glacier which descends from the eastern peak (6305 masl).

AWS Data
Due to their high altitude, complex orography, difficult accessibility and severe weather conditions, meteorological measurements in the Peruvian glaciers are very scarce and often limited to short periods. Nevertheless, efforts conducted by national and international institutions led to the installation and maintenance of several AWS in different mountain ranges. As a result, at least 10 AWS are providing meteorological measurements on glacier environments (between ~4500-5800 metres above sea level). However, these AWS have short time series (less than 10 years long), frequent data gaps, and are managed by different institutions resulting in a serious constraint for long-term climatological analysis.
From the total AWS, we selected one main station (AWSmain) on each glacier for validation of the reanalysis data. The criteria used to choose the AWSmain were: The on-glacier location, the availability of artificially ventilated radiation shield and the lowest percentage of missing data ( Table 1). The AWSmain stations are equipped with thermohygrometers (Campbell Scientific. Model CS215-L11) with artificially ventilated radiation shields (Apogee, Model TS-100) in CBmain and CCmain. However, the thermo-hygrometers in CVmain (Campbell Scientific. Model EE181) and COmain (Campbell Scientific. Model HC2S3) are protected with naturally ventilated radiation shields (MetSpec. Model Rad10E and RM YOUNG. Model 41003-5, respectively). The use of artificially ventilated radiation shield results in more accurate temperature and relative humidity measurements over glaciers, especially around noon and under no wind conditions (Hardy et al. 1998;Georges and Kaser 2002).
The remaining AWS located near/over glaciers were labelled as secondary stations (AWSsecond) and used in the infilling gap process. All the AWSsecond Fig. 1 Map indicating the location of the four glaciers (black triangles) located within three different climate zones of the Peruvian Andes (Altitude above 3000 masl) according to Bonshoms et al. (2020). NWOT, Northern wet outer tropics; SWOT, Southern wet outer tropics; SDOT, Southern dry outer tropics. are located in the same basin of the analysed glaciers except the CVsecond1 which is located approximately 15 km away ( Fig. 2 and Table 1). To improve the results of the infilling gap process, nearby highaltitude complementary AWS (AWScomp), equipped with naturally ventilated radiation shields were also considered ( Fig. 3 and Table 2).

ERA5L Data
The fifth-generation ECMWF reanalysis ERA5 combines observations into globally complete fields using the laws of physics with the method of data assimilation. ERA5 assimilates new observations and input data which improves observed changes in climate forcing in comparison with the previous product (ERA-Interim) and at enhanced temporal and horizontal resolution. The new ERA5L reanalysis was created by forcing the land component of the ERA5 (Hersbach et al. 2020). This product is forced by the atmospheric analysis of ERA5, and hence the assimilated observations indirectly influence simulations. It is provided by the Copernicus Climate Change Service with the same temporal resolution as ERA5 (hourly resolution) but with a higher spatial resolution of 0.1° 0.1°. The main characteristics of this product were described in Muñoz-Sabater et al. (2021) and it is currently available from 1950 to the near-present time on https://cds.climate.copernicus.eu. The core of ERA5L is the ECMWF land surface model: the Carbon Hydrology-Tiled ECMWF Scheme for Surface Exchanges over Land (CHTESSEL). In the H-TESSEL scheme, each land grid-box is divided into fractions (tiles), with up to six fractions (bare ground, low and high vegetation, intercepted water, shaded   (CB, CC, CV and CO) with the location of the AWScomp (blue dots) and the analysed glaciers (yellow dots). The ERA5L grid is overlaid (grey grid) with the selected grids highlighted (blue grid). Altitudes over 3000 masl (dark grey) and glaciers (red spots) from the National glacier inventory (INAIGEM 2018) are highlighted. and exposed snow). Each fraction has its properties defining separate heat and water fluxes used in an energy balance equation solved for the tile skin temperature. As described in Muñoz-Sabater et al. (2021), grids with more than 50% of the area covered by glaciers are considered glacier grids by ERA5L, assuming a constant snow depth of 10 m. None of the four selected grid cells (Table 3) satisfies this condition.
We downloaded the hourly and monthly mean T2m and dew point temperature (Td2m) from the nearest grid point to each AWSmain (highlighted in blue colour in Fig. 3). These variables are in Kelvin degree and calculated by interpolating the lowest model level and the Earth's surface considering the atmospheric conditions. The ERA5L data is provided at Universal Time Coordinated (UTC) and hence, we extracted hourly data by considering the Local time in Peru (UTC-5) for appropriate comparison with the observations. Additionally, we accounted for the grid altitude (Table 3) as defined by the geopotential field of ERA5L available on: https://confluence.ecmwf.int/ display/CKB/ERA5-Land%3A+data+documentation.
Due to the lack of the Rh2m variable in ERA5L dataset, we obtained this variable through T2m and Td2m by applying the equation proposed by Lawrence (2004) and described as follows: Rh 100 5 • T2m Td2m (1)

Statistical Metrics
To evaluate their performance, the ERA5L reanalysis was compared to the AWSmain data. The inputs of the reanalyses were the hourly and monthly values derived from the nearest grid points to each AWS site. We converted hourly AWSmain observations to monthly means for the monthly assessment.
To evaluate the accuracy of ERA5L in representing the observed T2m and Rh2m on four  Peruvian glaciers, we incorporated the following statistics: the BIAS (ERA5L -in-situ measurements) indicator, to quantify the systematic errors, the Root Mean Square Error (RMSE) to provide an indication of the mean deviation of the reanalysis values compared to the observed values, the Mean Absolute Error (MAE) to represent the difference between datasets and the Pearson's linear correlation (r). The root mean square error (RMSE) is recommended for statistical analyses that combine errors with low correlation, high bias, and normal distribution (Bromwich and Fogt 2004), but this metric is sensitive to extreme values. The mean absolute error (MAE) indicates the average performance attributing the same weight to all errors and it tends to present values lower than that verified by the RMSE (Hillebrand et al. 2020).
Where is the hourly/monthly ERA5L value, is the hourly/monthly value AWSmain observation. Meanwhile, and correspond to the hourly/monthly averages of reanalysis and observations for the dry seasons of 2017-2019.

Environmental Lapse Rate correction
The Environmental Lapse Rate (ELR) is defined as the rate of temperature change with height and can be computed as follows: (6) where Γ is the ELR in degree Celsius per kilometre (°C·km -1 ), DT is the temperature difference in degree Celsius (°C) between two layers and DZ is the altitude difference between the two layers in geopotential meters (gpm). The Γ values are normally negative with a lower limit of -10°C·km -1 for the dry-adiabatic lapse rate and positive values under thermal inversions.
The ERA5L temperature, humidity and pressure are corrected to account for the altitude difference between the grid of the forcing and the higher resolution grid of ERA5L. This lapse rate correction takes into account an adjustment of the air temperature to the altitude difference by an environmental lapse rate (ELR) derived from the temperature vertical profiles of ERA5 in the lower troposphere (Dutra et al. 2020;Muñoz-Serrano et al. 2021). This ELR correction use Eq. 6 between 16 combinations of model levels centred between model levels 124 (500 m above the surface) and 116 (1200 m above the surface). Nevertheless, to compare the ERA5L T2m with that measured at the glaciers, we performed an additional ELR correction to adjust the ERA5L data to the altitude of the AWSmain. Despite the common use of the fixed mean elevation lapse rate (MELR = -6.5°C·km -1 ), its use is not recommended due to its high Spatio-temporal variability. In this regard, many studies have proven that the MELR is frequently higher than the real in many regions (e.g., Jobst et al. 2017;Minder et al. 2010;Shen et al. 2016;Wang et al. 2018). For the Peruvian Andes, Córdova et al. (2016) estimated an overall monthly mean ELR of -6.8°C·km -1 . However, the high spatial variability of this parameter can be expressed by the local differences found by Ubeda (2011) around Coropuna, where the monthly ELR ranged from -5.6°C·km -1 to -8.4°C·km -1 depending on the altitude of the weather stations considered. Therefore, we applied an hourly and monthly ELR correction through the vertical freeair temperature gradient between 600 hpa (~ 4330 gpm) and 500 hPa (~5760 gpm) using data from the ERA5. This additional ELR correction was based on Gao et al. (2017) and calculated as follows: 500 600 500 600 • 1000 (7) where corresponds to the environmental lapse rate for temperature in °C·km -1 , T to free-air temperature in degrees Celsius (°C), and Z to geopotential height in geopotential meters (gpm) at 500 hPa and 600 hPa, respectively. It is important to note that the ELR was calculated through free air temperatures at two pressure levels which are not influenced by surface conditions and processes. Therefore, we assume errors due to the non-consideration of the thermal effect of surface heat fluxes and the particular conditions of the glacial boundary layer. We decided not to apply a similar ELR correction to Rh2m due to numerous ERA5 values out of the physical limit of the variable (0-100%) and the complexity to establish a meaningful vertical gradient for Rh2m.

Imputation of missing data
Incomplete data is an unavoidable problem in dealing with most meteorological data sources (Kotsiantis et al. 2006). Due to significant differences in the percentage of gaps between the four AWSmain (Table 1), we opted to use an imputation method to obtain a continuous hourly time series in the four glaciers. Therefore, hourly gaps in the AWSmain were filled using data from nearby AWS (Table 1 and Table  2). Even though this procedure introduced slight errors in the AWSmain data, it allowed us to compare ERA5L with continuous observations. In addition, the complete hourly measurements contribute to reducing errors in the computation of monthly averages.
For the data imputation procedure, we used extended time series of the different AWS: up to ~6 years in CBmain, CV and CO and just over 3 years in CC. The use of a longer data period enabled a more efficient data training of the filling method, improving the results. As a first step, hourly data from the AWSmain was visually checked to detect suspicious outliers and inhomogeneities that could be attributed to sensor manipulation or faults. Secondly, we filled the 1-hour gaps in AWSmain and AWScomp with a linear interpolation between the previous and following measures. Finally, greater than 1-hour gaps were filled by adapting the Gradient Boosting infilling gap method suggested by Friedman et al. (2000) and Friedman (2001). The Gradient Boosting is a learning procedure that combines the outputs of many simple predictors to produce a powerful committee with performances improved over the single members. The approach is typically used with fixed-size decision trees as base learners and is therefore also known as gradient tree boosting (Biau et al. 2019). The method proposed by Körner et al. (2018) was used to fill in the gaps. Briefly, this model is trained for different predictors (gradient boosted regression trees) for a target variable during a training period. This model is then applied for times when the target value is not available. The model is non-linear and nonparametric and performs significantly better than others such as R-packages mice (van Buuren and Groothuis-Oudshoorn 2011), single or multiple linear regression methods or neural networks (Körner et al. 2018). The model was enhanced with the automatic creation of training and prediction sub-periods to obtain the best results with the available data. As a result, several models instead of one were used for gap filling. Besides, the previous xgboost (Chen and Guestrin 2016) and lightgbm (Ke et al. 2017) libraries were used. As predictors, we used hourly data from nearby AWSsecond and AWScomp described in section 3.1.
To assess the quality of the gap-filling methods, we tested the following cross-validation procedure: The first step was to randomly select 1000 consecutive T2m and Rh2m hourly values from COmain and the extremely close COsecond1. Secondly, the AWSmain values were removed and the gapfilling method was applied. Finally, the filled gaps were compared with the measured data. For most of the time steps with gaps, data were available in COsecond1 and therefore, the error sizes gave a good overview of the expected error sizes during the gapfilling process. However, for individual error values data from the nearby AWS was not available. Therefore, the cross-validation was carried out again, without using the data of the nearby AWS. The error size was correspondingly worse for this scenario. The procedure previously described was carried out 20 times in each case to reduce the proportion of randomness in the cross-validation. The following gap-filling methods were compared: a) Hourly linear interpolation with the next available station.
b) The R library "mice" considering all the available data with restrictions described above. c) Gradient boosting using all available data with the restrictions described above.
The evaluation of the cross-validation enables distinguishing the best performing gap-filling methods. The error sizes for cross-validation methods were depicted in Table 4. In that sense, the error measures are significantly improved by taking nearby stations (AWScomp) with supporting data. Furthermore, the hourly linear interpolation (a) performs better than the R-library "mice'' interpolation (b). However, gap-filling by Gradient Boosting (c) achieved the best performance overall (Table 4).

Hourly assessment and mean daily cycles
The ERA5L reanalysis provides land variables at an hourly scale enabling the evaluation of daily to sub-daily processes (C3S 2017). The seasonal relationship between hourly T2m and Rh2m datasets was evaluated through different statistics. There is no univocal definition of which months belong to the dry or wet season in the Peruvian Andes. Therefore, the four central months of the dry (MJJA) and wet (DJFM) seasons were considered as proposed in previous works (eg. Bonshoms et al. 2020;Francou et al. 2003;Kane 2000;Rosales et al. 2022;Segura et al. 2019). These four-month periods allowed us to reduce distortions that could arise due to regional differences in the length of both seasons. Moreover, we assessed the performance of ERA5L in capturing the mean daily T2m and Rh2m cycles (MDTC and MDHC, respectively), defined as the mean T2m and Rh2m for each hour. Hourly observations from the 4 glaciers were compared with the corresponding value from the nearest ERA5L grid-point for the period between 00:00 h of 1st January 2017 to 23:00 h of 31st December 2019 at local time.

Hourly T2m
A strong linear positive relationship between hourly ERA5L and observations was detected in the four sites with a high correlation coefficient (r >0.80) considering the entire period (Table 5). However, some seasonal differences in the statistics were detected with higher correlations and lower biases but higher RMSE and MAE during the DS, particularly in the driest sites (Table 5). During the WS positive biases were detected in CC and CO (drier sites) and negative in CB and CV (wetter sites). In that season, the error metrics were slightly lower (RMSE between 1.6°C and 2°C; MAE between 1.2°C and 1.5°C) without significant regional differences.
Differences by hour between ERA5L and observations were analysed using box plot graphs (Fig.  4, column B). Results indicate the highest variability of the hourly differences in CV for both seasons, around noon and early afternoon in the WS and from 0-12 hours in the DS. However, the median of differences in CV during the WS tended to be 0°C (Fig.  4, column B). The small negative ERA5L bias in CV was concentrated towards the lower values, resulting in a slight ERA5L underestimate during the nighttime hours. In the DS, hourly differences were reduced Table 4 Performance measures for three cross-validation methods applied to hourly 2m air temperature (T2m) and relative humidity (Rh2m) data from COmain and COsecond1 (automatic weather stations described in Table 1 with a greater variability from 0 to 6 hours. The boxplots clearly show the daytime (nighttime) underestimate (overestimate) of the reanalysed T2m. The linear relationship between both datasets was depicted in scatter plots (Fig. 4, column C). There was a fairly good linear fit in the four sites, especially during the WS. Furthermore, the great agreement between ERA5L and observations in CB during the WS and in CO during the DS was reflected in the concordance between the regression and identity lines. The ERA5L tended to underestimate (overestimate) lower (higher) T2m values in the four AWSmain for both seasons except in CV and CO during the WS when the slope of the linear regression line decreased indicating a contrary behaviour (Fig. 4, column C). Kernel estimation density plots supported the above findings by showing that both distributions fitted better during the WS with similar peaks. These plots allowed to highlight the distribution of biases over the range of values. During DS, the distribution of ERA5L was left-skewed, bimodal and spread over a wider range of values compared to observations (Fig.  4, column D). However, during the WS the T2m range decreased and fitted better the observations with a concordance in the values of the peaks, particularly in the wet outer tropics.
We assessed the skill of ERA5L in reproducing the observed mean daily T2m cycle (MDTC) defined as the mean T2m for each hour. The observed MDTC was characterised by low values due to the high altitude of the AWSmain and by low amplitude. In addition, small seasonal differences in the MDTC were found in the northern glaciers (Artesonraju and Chuecon) and greater differences in the southern glaciers. The MDTC of both datasets showed an extremely high positive correlation (r>0.97) and low error metrics. The RMSE ranged between 0.7°C and 1.3°C during the WS and between 1.2°C and 1.8°C during the DS decreasing from northern to southern sites. During the WS, the RMSE and MAE were lower than 1°C in the wetter sites (CB and CV) as shown in Table 6.
The amplitude of the MDTC was evaluated through the mean daily temperature range (MDTR) defined as the difference between the mean daily maximum and minimum T2m. The observed MDTR ranged approximately from 4°C to 5.5°C during the WS and from 4.5°C to 7°C in the DS (Table 7). A similar daily temperature range was observed by Wagnon et al. (1999) at the Zongo Glacier (Bolivia). The ERA5L amplified the MDTR in the four sites, with larger differences during the DS, when reanalysis overstated the MDTR with differences of around 2.4°C to 5.1°C (Table 7). During the WS, lower differences between the observed and modelled MDTR were lower than 1°C in the southern sites and around 3°C in the northern. The lowest difference in the MDTR was noted in CV and CO during the WS  (Table 7). As demonstrated in different studies, glaciers modify the thermal regime compared to off-glacier areas at the same altitude as a result of snow/ice cover and the development of a katabatic boundary layer (eg. Shea and Moore, 2010). Besides, the ice/ snow cover reduces air temperature above glaciers as a result of constant near 0°C surface temperatures. Therefore, the low amplitude in the observed MDTC over analysed glaciers can be partially attributed to the so-called glacier damping and cooling effects. To estimate the magnitude of both effects, we compared the MDTC of two nearby AWS located at similar altitudes, with naturally ventilated radiation shields and installed on/off the Artesonraju glacier CBsecond1 and CBsecond2, respectively (Table 1). As expected, the on-glacier AWS showed a slightly lower MDTR in comparison with the off-glacier AWS, during the WS (mean difference of 0.7°C) and the DS (mean difference of 0.4°C) as depicted in Fig. 5. In this regard, results were in accordance with the expected on-glacier damping effect. Besides, the on-glacier cooling effect was also captured in the comparison between the on/off-glacier AWS. This cooling effect was observed throughout the day during both seasons. Seasonal differences were found with a strengthening of the cooling effect during the DS when the mean cooling effect was around -1.1°C and the maximum cooling of around -1.4°C at 14:00 LH. During the WS, the mean cooling effect was lighter with a mean value of around 0.5°C but a similar maximum of around -1.2°C registered at 15:00 LH (Fig. 5).

Hourly Rh2m
The hourly Rh2m assessment revealed a moderate positive correlation in the four sites (~ 0.4 < r < ~0.6) considering the complete period (Table 8). Concerning seasonal differences in the ERA5L accuracy, higher correlation and lower biases, RMSE and MAE were found on the WS, particularly in the wet outer tropic sites. The reanalysis showed a better performance in reproducing hourly Rh2m in the wet outer tropic sites, with poorer results in CO for both seasons.  Contrary to observed in the hourly T2m assessment, correlations were higher during the WS with values decreasing from northern (CB r=0.62) to southern sites (CO r=0.45). The ERA5L biases in the wet outer tropic sites were around ±2% during the WS suggesting a good agreement of the reanalysis with observations in these glaciers. In that season, a negative ERA5L bias was observed in CV (-2.3%) while it was positive in the other sites. However, biases in CO were high in both seasons (biases between 12% and 14.6%).
The RMSE and MAE were also lower during the WS in the wet outer tropic sites (RMSE between 11.5% and 13.3%; MAE between 7.2% and 9.1%) as depicted in Table 8. During the DS, the RMSE and MAE increased in all four sites (Table 8). Among all four sites, lower RMSE and MAE were found in the wettest sites (CB and CV).
The hourly Rh2m differences between the ERA5L and the observations were analysed using box plots to understand the variability and timing of these differences (Fig. 6, column B). The variability was higher in the DS, particularly during nighttime, coinciding with positive ERA5L biases. On the contrary, during the WS, higher variability occurred in daytime hours in CV, and daytime and early nighttime hours in CB and CC. In CO, the variability was high during all the nighttime hours (Fig. 6, column B). Scatter plots comparing ERA5L and observations show a large dispersion of the data, which indicates highly significant differences between the reanalysis and the observations in the four sites (Fig. 6, column C). The poorer fit between the regression and identity lines as well as the lack of clustering of the measurements are evidence of discordance between databases. However, the accumulation of points around 100% values, indicates that the reanalysis tended to saturate the atmosphere when the observed conditions were drier. The Rh2m kernel density estimate distribution during the DS showed a significant disagreement between observations and ERA5L in CB and CC. In those sites, the peak in observations was centred between ~70 to 80% while around 90% in ERA5L. However, in CV and CO both distributions presented a better adjustment. During the WS, a rightward skewed distribution was observed in all four areas, both for ERA5L and observations (Fig. 6, column D). The best adjustment between both distributions was detected in CV during WS while the poorest in CB during the DS.
We also calculated the mean daily Rh2m cycle (MDHC) for both datasets. The MDHC comparison revealed notorious seasonal differences in the agreement between ERA5L and observations. During the WS, an extremely high correlation (r = ~0.9) was observed in the wet outer tropics sites but moderate in the dry outer tropics (r = ~0.5). Differences in the RMSE and MAE between wet and dry outer tropics were also relevant. On the other hand, the performance of ERA5L in reproducing the MDHC during the DS decreased significantly in the four sites, with an RMSE and MAE around 12-16% in the wet outer tropics sites and slightly higher in CO. In addition, Pearson's linear correlation decreased, particularly with a poor correlation in CB and CO ( Table 9). The observed MDHC showed nonsignificant changes in the seasonal pattern. However, the ERA5L magnified the amplitude of the MDHC during the DS in all four sites (Fig. 6, column A). The observed MDHC was characterised by marked seasonal differences in the magnitude of Rh2m with higher values in the WS. Regionally, higher values were found in the wet outer tropics sites, particularly in the Quelccaya ice cap and the Artesonraju glacier, in line with expectations. In the Cavalca glacier, the mean minimum Rh2m in both seasons was detected around 09:00-10:00 LH instead of around midday as could be expected. Similar behaviour was observed in the Artesonraju and less markedly in Quelccaya during the DS. During WS, the daily cycle of Rh2m in the wet outer tropic glaciers presented the typical pattern, with a minimum in concordance with the hours of higher T2m (around 13:00 LH).
Concerning the comparison of the MDHC, the ERA5L tended to overestimate (underestimate) the Rh2m during the nighttime (daytime), more prominently during the DS (Fig. 6, column A). Consistent with the findings for T2m, the amplitude of the ERA5L daily cycle was greater than that observed in the 4 glaciers, particularly during the DS (Table 9). Significant seasonal differences in the MDHC correlation were found, with high correlation for the WS but very low for the DS (Table 9). The observed MDHC exhibited no significant seasonal variation. However, the reanalysis MDHC showed a remarkable increase in amplitude from WS to DS.
During the WS, the observed mean minimum Rh2m was similar in the wet outer tropic sites (CB, CC and CV) with values of ~70-80% and lower in dry outer tropics (CO) with values of ~60%. During the Fig. 6 Comparative plots between ERA5L and AWSmain relative humidity (%) in the four areas (CB, CC, CV and CO) for the WS (Upper block) and DS (Lower block). Column A corresponds to the MDHC for ERA5L (red line) and AWSmain (black line); Column B depicts the hourly reanalysis bias. Box represent the interquartile range (IQR), the horizontal line correspond to the median and whiskers to 1.5 × IQR; Data dispersion is depicted in column C with the regression line (black dashed) and the identity line (black solid). The Kernel density estimation distribution for ERA5L (red line) and AWSmain (black line) in column D. The blue (orange) colour used in boxplots and scatterplots helps to distinguish wet (dry) season plots. DS, the observed mean minimum Rh2m decreased to ~55-60% in glaciers from mountain ranges more exposed to Amazon flows (CB and CV) and to ~40-45% in the Pacific basin glaciers (CC and CO). The ERA5L MDHR was higher during the DS in the four sites. However, this higher amplitude was unexpectedly not detected in the CB and CO observations. The observed mean daily Rh2m range (MDHR) in the four glaciers was fairly low ranging between ~15% and 25%, with slight seasonal differences (around ±5%). However, the seasonal differences in the MDHR for ERA5L were higher with seasonal variations around ±10% to 15% (Table 10).
The amplitude in the observed MDHC is also affected by the ice surface. To demonstrate this point, we also compared the MDHC between CBsecond1 and CBsecond2 (Fig. 7). The on-glacier AWS showed a slightly higher amplitude of the MDHC with a difference of 1.6% in the mean daily range during the WS. However, the amplitude of the MDHC during the DS was lower in the on-glacier AWS with a difference compared to the off-glacier AWS of around 3.7%. In addition, the comparison of the MDHC showed lower Rh2m in the on-glacier AWS throughout the day during the WS. In that season, the mean drying effect was around -3.8% with a maximum difference of -5.1% at 10:00 LH. Contrary, higher Rh2m was observed in the on-glacier AWS during the DS. The mean and  maximum difference was of a similar magnitude to those observed during the WS with a mean Rh2m difference of +3.2% and a maximum difference of 6.4% at 10:00 LH (Fig. 7).

Monthly T2m
The evolution of ERA5L and AWSmain monthly T2m and differences between datasets were analysed for the 2017-2019 period (Fig. 8). Monthly T2m over the four glaciers tended to describe an annual cycle with maximum values during the WS (DJFM) and lower during the DS (MJJA). However, regional differences were detected with a more pronounced seasonality in southern glaciers (Quelccaya and Cavalca) than in northern (Artesonraju and Chuecon).
Additionally, a secondary annual maximum around November was detected, and it was more pronounced in the westward glaciers (Cavalca and Chuecon) as depicted in Fig. 8. The lowest monthly T2m were measured during the DS of 2018 in CCmain (-1.0°C), CVmain (-7.1°C) and COmain (-5.9°C) but this minimum was not observed in CBmain (Fig. 8). Due to the higher altitude of the southern glaciers, freezing (below 0°C) monthly T2m occurred in all the considered months.
An extremely high positive correlation between datasets was detected for monthly T2m in the four glaciers (r≥0.9). Besides, the error metrics also pointed to a good adjustment of ERA5L to the observations with RMSE and MAE lower than 1°C (Table 11). The accuracy of the reanalysis was higher in the wet outer tropic sites than in the southern dry outer tropic site. The higher skill of the reanalysis in   Table 11. In CO, the ERA5L showed the highest correlation with observations (r=0.97) but also higher bias (bias=0.6°C), RMSE (0.9°C) and MAE (0.8°C). Besides, the ERA5L was able to capture the observed secondary peak around November despite the higher positive biases in CC, CV and CO. The highest ERA5L positive biases occurred from the end of the DS until the end of the WS in CC (maximum bias of 0.9°C in January 2019) and CO (maximum bias of 1.8°C in November 2018). However, in CB and CV we found no evident monthly predominance of positive biases. On the contrary, maximum negative ERA5L biases tended to present a similar magnitude to the positive. We only found a clear monthly pattern in CC and CO where negative biases were more frequent during the DS. The highest negative bias was -1.3°C detected in CO in August 2018. Interestingly, the ERA5L captured with high accuracy the monthly T2m in CV during the drier months of 2018 when the reanalysis tended to strongly underestimate this parameter in the other sites.
The performance of monthly ERA5L was highest in CB, CC and CV where the reanalysis captured with high accuracy the observed monthly T2m. In these sites, the ERA5L showed high correlations but also lower biases. Despite the high correlation detected in CO, the positive monthly ERA5L biases were higher and prolonged on this site. Therefore, the monthly ERA5L data could be used to reproduce monthly T2m in the Cavalca glacier considering an appropriate bias correction.
The reanalysis was able to reproduce the observed changes in the amplitude of the annual cycle and regional differences in its amplitude. However, the ERA5L tended to strongly amplify the annual cycle in the Cavalca glacier.

Monthly Rh2m
A well-defined annual cycle of Rh2m was detected in all four glaciers, with higher values during the WS (December to March) and lower during the DS (May to August), as shown in Fig. 9. The amplitude of the annual cycle was similar in the four glaciers, but magnitudes were lower in the Cavalca glacier. In that glacier, the monthly minimum Rh2m was around 40% during the DS while the maximum fluctuated between 70% and 80%. In addition, a secondary peak in monthly Rh2m was detected around September-October in all four glaciers but markedly in the Chuecon, Cavalca and Quelccaya glaciers. The evolution of monthly Rh2m over the period analysed showed similar behaviour in the three glaciers of the wet outer tropics. However, the monthly Rh2m in the Chuecon glacier was slightly lower during the drier months.
As detected in the hourly assessment, we found lower monthly biases (bias approximately ±10%) higher correlations (r>0.95) and lower RMSE (between 3.0% and 6.7%) and MAE (between 2.3% and 5.3%) in the wet outer tropics (Table 11). In CO, the ERA5L performance was worse with lower correlation (r=0.81) and higher bias (bias=8.3%), RMSE (13.0%) and MAE (11.0%). In addition, the ERA5L succeeded in capturing the observed secondary peak around September-October in CV, CC and with higher biases in CO (Table 11).
The highest positive biases of the ERA5L occurred in the transition months from the wet to the dry season in CC (maximum bias of 13.7% in June 2019) and CO (maximum bias of 25.8% in May 2019) and around the drier months in CV (maximum bias of 18.5% in July 2017). However, the ERA5L biases revealed an unclear monthly predominance in CB (maximum bias of 4.9% in January 2017). On the contrary, negative ERA5L biases tended to be lower in all four sites. Only a clear monthly pattern was detected in CV where negative biases were more frequent during the WS. The highest negative bias was -13.7% detected in CO in September 2018.
The performance of monthly ERA5L was highest in CB where the reanalysis captured with high accuracy the observed monthly Rh2m in the Artesonraju glacier. The largest negative ERA5L biases occurred in the DS of 2017 and 2019 with values of -6.0% and -8.0% respectively. On the contrary, the worst performance of the reanalysis was detected in CO, where the ERA5L tended to strongly overestimate the observed Rh2m, predominantly from April to June.

Discussion
Given its recent release, there are still no studies comparing ERA5L with on-glacier meteorological data. Therefore, our novel assessment cannot be directly compared with previous studies. This paper constitutes an important contribution to the comprehension of the atmospheric conditions over four Peruvian glaciers and contributes to evaluating the use of ERA5L for the long-term description of meteorological variables over these glaciers.

Hourly assessment
The accuracy of ERA5L in capturing hourly T2m was higher than for Rh2m in all the analysed glaciers. Besides, the reanalysis showed higher skills in representing both parameters in glaciers located in the wet outer tropics, particularly during the WS. In the case of T2m, the linear positive relationship (r >0.80) may be explained by the well-defined daily cycle, especially during the DS when higher atmospheric stability leads to a more homogeneous daily cycle. Furthermore, the negative ERA5L biases during that season were especially driven by the nighttime T2m underestimation of the reanalysis data. It suggests that the ERA5L tends to reinforce nocturnal cooling over glaciers, despite the fact that the on-glacier nocturnal temperature is lower than the off-glacier temperature at high altitudes. Concerning the hourly Rh2m assessment, the performance of ERA5L was better under wet conditions. In this regard, the reanalysis showed better skills during the WS, particularly in the wetter glaciers. Contrary to expectations, the ERA5L was unable to capture the hourly Rh2m variability under drier conditions and at night, when the ERA5L tended to overestimate Rh2m significantly. The lack of an ELR correction for Rh2m, but also the estimation of this parameter by Eq. 1 could introduce errors in the ER5L hourly and monthly data. In addition, strong sublimation during the DS could also have influenced the Rh2m differences between both data sets, especially during daylight hours.

Mean daily cycles
As expected, the observed and modelled MDTC showed a well-defined pattern with the minimum T2m during the early morning and a maximum around noon. The observed seasonal differences in the time of the mean maximum T2m may be influenced by cloudiness rather than changes in the amount of solar radiation, which do not vary significantly due to the low latitude of the sites.
The lower optical depth of greenhouse gases, due to a thin atmosphere in the high Andes, implies a reduced atmospheric absorption of terrestrial infrared radiation (Aceituno 1996). As a result, a large daily temperature range is observed in the tropical Andes (Lavado et al. 2012) being greater than the annual range (Kaser and Osmaston, 2002). However, the MDTC over glaciers is strongly influenced by the high albedo, low surface temperature, the katabatic winds and the boundary layer conditions resulting in a damping of the daily temperature range and colder air temperature (Ayala et al. 2015;Carturan et al. 2015;Greuell and Böhm 1998). In this regard, we found magnitudes of the mean glacial cooling effect at Artesonraju Glacier that were similar to WS biases between the hourly observations and the ERA5L.
The amplitude of the MDTC was lower during the WS when increased cloudiness and tropospheric moisture modify the surface radiative balance by reducing (increasing) the incoming short-wave (longwave) radiation. These changes in the radiative balance lead to an increase in minimum rather than a decrease in maximum T2m during that season (Table  6). In this sense, cloudy conditions contribute to the increase of minimum T2m through the effect of nighttime cloudiness on the incoming longwave radiation. The expected reduction of the observed mean maximum T2m due to increased cloudiness was not detected. The reduction in maximum temperatures due to increased cloudiness could be partly compensated by warmer upper troposphere conditions during the WS (austral summer). In addition, the greater amplitude in the MDTC detected at the southern glaciers responds to the increase of the annual solar radiation variability with latitude. The positive bias found during the WS at the drier sites was maintained throughout the day at CO and concentrated in the daytime hours at CC. This bias could be influenced by an underestimation of the vertical temperature gradient extracted from ERA5 that may not properly resolve the dry atmospheric conditions over these sites.
The logical decrease in Rh2m during the DS was appreciated both in ERA5L and observations at all four sites. The modelled MDHC showed the typical pattern with higher values at nighttime and lower around noon coinciding with maximum T2m. That pattern matches the observed during the WS in glaciers from the wet outer tropics where the error metrics indicate a good performance of the reanalysis. However, the pattern is slightly different in CO, where drier conditions prevail throughout the year. At this site, ERA5L showed a marked positive bias during the night hours in both seasons. During the DS, the pattern of the observed MDHC at wet outer tropic sites tends to approximate that observed in CO. The high atmospheric stability in the entire Peruvian Andes during the DS may favour lower regional differences in the MDHC. Furthermore, the enhanced sublimation increases Rh2m during the daytime hours dampening the diurnal decrease of this parameter.
The ice surface modifies the MDHC as demonstrated with the on/off glacier AWS comparison in the Artesonraju glacier. A drying effect was observed during the WS with lower Rh2m values over the glacier for each hour. The ERA5L tended to excessively amplify the MDHC during the DS. It may be related to particular effects of the glacier on this variable, but also to the sporadic and local presence of fog/clouds due to the presence of local orographic cloudiness not captured by the reanalysis due to its reduced spatial scale. The enhanced sublimation over glaciers during the DS, particularly during the hours of higher insolation, could explain the mean minimum Rh2m observed in the morning hours rather than around noon. Besides, the absence of an artificially-ventilated radiation shield in COmain may also introduce some errors in the observations, predominantly around noon under low wind conditions. Therefore the use of artificially ventilated radiation shields is strongly recommended.
Regarding the mean daily cycle of both variables, results indicate that ERA5L tended to amplify the MDTC at all four sites, with an overestimation (underestimation) of the observed T2m during the day (night) hours and vice versa for Rh2m. Differences in the amplitude of the daily cycles between the observations and ERA5L were influenced by the particular conditions of the katabatic boundary layer of the glacier. Damping and cooling effects on glaciers reduced the magnitude and amplitude of the observed MDTC compared to surrounding off-glacier regions, which are described by the reanalysis data.

Monthly assessment
Due to the tropical location of the glaciers, the annual fluctuation of the solar radiation is low leading to a weak T2m seasonality. However, the more pronounced seasonality in southern (CV and CO) than in northern sites (CB and CC) can be related to latitudinal increases in the solar radiation variability. Monthly maximum T2m and Rh2m were measured during the WS (DJFM) while lower during the DS (MJJA) in concordance with results obtained in different Peruvian mountain ranges by different studies (eg. López-Moreno et al. 2014;Veetil et al. 2017). Seasonal changes in cloudiness and moisture troposphere content modify the energy surface balance by changes in incoming long and short-wave radiation resulting in slight changes in day-time but significant in night-time T2m over glaciers. In addition, the secondary annual maximum in both variables was detected around November, particularly in the westward sites (CC and CO). The extremely high correlation (r >0.90) for T2m and a very high correlation (r > 0.8) for Rh2m supports the use of monthly ERA5L data to represent both variables over the Peruvian glaciers. However, the poorer performance of the reanalysis was detected in the southern dry outer tropic glacier, particularly for monthly Rh2m. The particularly strong and persistent dry atmospheric conditions in this region of the Peruvian Andes enhanced sublimation for most of the year. Thus, on-glacier Rh2m measurements are strongly affected by sublimation resulting in greater Rh2m values over glaciers during the daylight hours in comparison to off-glacier surroundings.
A secondary maximum in T2m (around November) and Rh2m (around September) were observed in the four sites. Concerning T2m this secondary maximum could be related to the persistence of sky clear conditions, lower albedo and the proximity to the summer solstice. All these factors influence the surface energy balance and lead to higher T2m, as was observed in the Cavalca glacier (SENAMHI 2016). Concerning Rh2m, the secondary maximum could be a consequence of the onset of the first rains coincidently with the beginning of the hydrological year in Peru (September-April). The progressive establishment of the low-level jet east of the Andes (Marengo et al. 2004) and the onset of moisture advection from the Amazon basin provide moist conditions on the Peruvian Andes. This secondary maximum could also be related to the effect in Rh2m caused by a strong sublimation at the end of the DS.

Error sources and research issues
Despite the identified differences between hourly and monthly ERA5L and the AWSmain observations, these cannot be exclusively attributed to errors in the reanalysis. Additional issues may have influenced the results and must be considered. We can classify these issues as General (affecting both variables) or Specific (only one variable).
General: -Whereas ERA5L averages an area, measurements are representative of a specific location with particular environmental characteristics. Therefore, ERA5L data provides mean values for a grid-cell (not considered a glacier by the reanalysis) rather than local observations over glaciers.
-Despite removing several outliers and doubtful data, errors in the observations may not be completely detected. In addition, the gap-filling process was performed with off-glacier AWS data not influenced by the presence of ice on the surface.
-The particular glacier surface and the establishment of katabatic flows and boundary layer generates a singular response of the T2m and thus Rh2m compared to off-glacier regions at the same altitude and pressure.
-The cooling of the near-surface air generates an increase in air density forcing a downward flow along the glacial valley. This flow may be stronger in Artesonraju, Chuecon, rather than in Cavalca or Quelccaya due to their location on steeper slopes. These local conditions produce an on-glacier cooling and damping effects. These effects were identified in the Artesonraju glacier but we assumed that also occurred in the other analysed glaciers.
-Wind speed and wind direction have a strong influence on T2m and Rh2m. However, this variable was not accounted in the analysis of the observed data.
-A minor source of variability in measurements, could be associated with variations in the height of the instrument due to partial station burial, at the end of the WS.
Specific: -In the on-glacier AWS, more accurate T2m measurements are obtained with sensors protected by artificially ventilated radiation shields (Georges and Kaser 2002). However, in CO and CV the AWSmain are equipped with naturally ventilated radiation shields.
-The hourly ELR correction applied to the ERA5L T2m to estimate the value at the altitude of the AWSmain is not error-free. The magnitude of these errors should be proportional to differences in the altitude. Therefore, errors were probably greater in CV and CO.
-The absence of an ELR correction for Rh2m may also introduce an error in the estimation of this variable.
-Eq. 1 was performed with the 2m air and dew point temperatures from the ERA5L to estimate Rh2m. However, this equation introduces errors especially for Rh2m values below 50% (Lawrence 2004). Therefore errors would be larger at the drier sites (CO and CC) and during the DS.
-The effect of glacial sublimation, predominantly during the daylight hours of the DS, can significantly increase the on-glacier Rh2m. It would contribute to larger daytime differences with the hourly ERA5L data during the DS.

Conclusions
The ability of ERA5L reanalysis to capture the hourly and monthly T2m and Rh2m variability in four different Peruvian glaciers was assessed using hourly on-glacier measurements from 2017 to 2019. This study provides an important reference for the use of ERA5L in future climate and glaciological studies on tropical glaciers. Furthermore, it provides an accurate description of the observed mean daily cycle and monthly evolution of both variables in different Peruvian glaciers through the analysis of unpublished meteorological data.
The ERA5L tends to amplify the on-glacier observed daily T2m and Rh2m cycles. In this regard, the reanalysis underestimates (overestimates) nighttime (daytime) and low (high) T2m, and vice versa for Rh2m. There are several reasons for these differences in amplitude, including the effects of the ice surface on measurements, the estimation of the ELR correction or differences in the spatial resolution of the two datasets. The mean glacier cooling and damping effects were strengthened during the DS in the Artesonraju glacier when differences between observations and the ERA5L were higher. The environmental lapse rate applied to the ERA5L T2m was shown to be effective, adjusting the reanalysed T2m quite accurately to those equivalent to the AWSmain altitude. However, the ELR correction may introduce slight errors due to the difficulty in determining the real ELR which has high Spatiotemporal variability. For Rh2m, the poorer performance of the reanalysis can be attributed to the lack of an ELR correction and to the estimation of this variable from two parameters (T2m and Td2m). Further, the different spatial scales between datasets make it impossible to match them perfectly. Therefore the ERA5L could be applicable to describe the climate on the Peruvian glaciers, but statistical or dynamical downscaling techniques are recommended to reduce the scale mismatch between reanalysis grids to individual sites.
The extremely high correlations (r>0.97) and low errors (0.7°C>RMSE<1.8°C; 0.6°C>MAE<1.6°C) between the observed and modelled MDTC indicate that ERA5L can be used to describe with relatively high accuracy the MDTC over glaciers, particularly during the WS in CV and CO. However, the small positive bias (~+1.0°C) detected in CO during the WS must be considered. During the DS, the ERA5L showed poorer performance in reproducing the MDTC, particularly in the northern wet outer tropics glaciers.
Regional differences found in the observed MDHC suggest a better ability of the model to reproduce the Rh2m in glaciers located in the wet outer tropics. In contrast, the reanalysis showed poorer results when it tried to reproduce this variable and its daily cycle in CO, especially during the DS when errors and biases were higher. However, the ERA5L succeeded in describing seasonal changes in the mean daily cycle of T2m and Rh2m with lower values during the DS.
Based on the good performance of the ERA5L in the monthly reproducibility of T2m in the four glaciers and of Rh2m in the wet outer tropics glaciers, this reanalysis is suitable for retrospective and climatic analyses of these parameters. However, the performance of ERA5L with respect to monthly Rh2m has been found to be quite poor in the dry tropical glacier.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Funding note: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.