Background & Summary

Large-sample datasets of hydrological variables across many catchments and long time periods are crucial for understanding and predicting hydrological variability in time and space1,2. These datasets are increasingly in demand due to the rise of data-intensive machine learning models3.

Following the publication of the MOPEX dataset in the early 2000s, there has recently been a broad movement to making large-sample hydrology (LSH) datasets available. Many of those were developed inspired by the Catchment Attributes and MEteorology for Large-sample Studies (CAMELS) initiative that compiled and made available full datasets for the contiguous United States1. Many countries and regions have embraced these or similar initiatives, including Australia4, Brazil5, Chile6, Great Britain2, Switzerland7, Central-Europe8, North America9, China10, Central Asia11 and Iceland12.

At the global scale, there are already some collection efforts for hydro-meteorological data. The Global Streamflow Indices and Metadata Archive (GSIM)13,14 provides streamflow indices for 35,000+ locations around the globe, but no extensive set of catchment landscape and meteorological attributes. Recently another global streamflow indices time series initiative took place enlarging the analysis to 41,000+ river branches worldwide and using different streamflow signatures to enrich the flow regime analysis15. Considering streamflow records, the Global Runoff Data Centre (GRDC)16 provides data for 10,000+ stations, but similar to the previous datasets, no catchment attributes and meteorological forcing time series are available. In addition, the GRDC data is only updated episodically, while the others do, to our knowledge, not provide any updates. More recently the Caravan3 dataset compilation was published as a global initiative for standardizing already open-source published streamflow datasets of initially 6,830 catchments, where catchment attributes and meteorological forcing were derived from gridded global products.

While global datasets offer easy access, they come with limitations. Firstly, their spatial coverage remains restricted, offering only a fraction of data available from national providers worldwide. The Caravan dataset, for example, originally covered Europe for only Great Britain, Austria and the Danube catchment as far downstream as the city of Bratislava (Slovakia). By now, there are multiple extensions for Denmark, Israel, Switzerland, Spain, Iceland and, most recently, a GRDC extension17 adding another 25 countries globally. Yet, for eastern and southern Europe publicly available data is still difficult to access. Secondly, such datasets are also limited in their temporal extent. For example, the CAMELS-GB2 covers the period from 1970 to 2015, while the LamaH-CE dataset8 spans from 1981 to 2017. Thirdly, existing large sample hydrology datasets, including the CAMELS databases, lack extensibility, making the accommodation of newly available data challenging.

Although most countries collect daily streamflow data at numerous river gauging stations, compiling a comprehensive hydrological dataset from this information presents significant challenges. Firstly, access to these data can be challenging. Some countries offer this data on the official websites of government agencies or associated data providers, while others provide it upon request. Official government websites are frequently available only in national languages, adding an extra layer of complexity. Gaining access can be intricate, involving navigation to a selection of stations and periods, which need to be downloaded individually. Secondly, substantial formatting and pre-processing are often necessary before the data can be effectively utilized. Finally, redistribution restrictions may hinder the republishing of country-specific data. These obstacles pose significant barriers to hydrological analyses of catchments in large-sample investigations, particularly given the short timeframes of typical research projects.

Here, we present “EStreams”, a platform consisting of two distinct products: (1) an extensive streamflow catalogue together with Python scripts for data direct access at the individual data providers and (2) a dataset of weekly, monthly, seasonal and annual indices, of streamflow, together with the associated catchment-averaged hydro-climatic signatures, meteorological time series and landscape descriptors for 17,130 catchments across 41 countries over pan-European territory. Currently, the dataset covers the period of 1900–2022.

While the focus of EStreams is on streamflow, the EStreams dataset also contains catchment aggregated meteorological forcing and landscape descriptors, typically necessary for hydrological analyses. These indices and descriptors were derived from various open source datasets and include climate18, geology19,20, hydrology and topography21,22,23,24, land use and land cover25,26,27, soil types28,29,30 and vegetation characteristics31,32. Similarly to streamflow, national providers often have more accurate information for such auxiliary data, but seldom they are easily accessible.

Unlike existing global datasets, which are relatively “static” as not easily updatable with new stations or recent time periods, EStreams is designed as “dynamic” by linking users to the original data providers. While “static” datasets may offer more accurate quality checks and are well-suited for applications such as benchmarking methods and models, many practical applications benefit from using the most up-to-date and dense data. This is particularly true for tasks like accurate streamflow predictions using data-intensive machine learning models.

Hence, our main contributions with this work are:

  1. i.

    Introducing the currently most extensive and extensible integrated collection of weekly, monthly, seasonal and annual indices of streamflow for Europe, along with catchment-aggregated meteorological and landscape variables (dataset).

  2. ii.

    Providing detailed metadata for streamflow gauges, including catchment boundaries, and a catalogue of the corresponding data providers.

  3. iii.

    Allowing reproducibility and extension by making available all codes used to retrieve the source data and aggregate them by catchment in an easy-to-use workflow, allowing users to directly and readily access the desired data from data providers.

The methodology employed to process the source data and obtain the current dataset and catalogue is illustrated in Fig. 1. This figure highlights the primary data sources, the general procedure, and the final outputs of EStreams. A detailed description of each step is provided in the Methods sections.

Fig. 1
figure 1

Framework of the methodology adopted in EStreams for deriving the Streamflow Catalogue, and the Dataset. The boxes with dashed lines represent the original, and the intermediate (pre-processed) data used in EStreams. The outputs are shown in pink (catalogue) and blue (dataset). *The landscape datasets encompass topography, soils, geology, hydrology, vegetation and land cover.

Methods

Streamflow data

Available stations

Daily streamflow data from 17,130 European river catchments with varying sizes and characteristics were aggregated from 41 countries and more than 50 different data providers. In some countries, such as Italy and Germany, multiple data providers contributed to the dataset. Figure 2a shows the distribution of the gauges with their respective catchment boundaries in the background. As can be seen in the figure, there is a significant variability in terms of station density, which is the highest in central Europe and the lowest in the South and the East. The time series records span the period 1900–2022, with varying length for each catchment, as shown in Fig. 2b. Central Europe features the longest time series, with many stations with records extending over 80 years. Figure 2c shows the evolution of the number of stations with measurements at a given time accounting for the discontinuity of stations over time. The plot shows an increasing trend in the number of gauging stations with concurrent records.

Fig. 2
figure 2

(a) Spatial distribution of the 17,130 streamflow gauges currently included in EStreams (in black dots) with their catchment boundaries in background (in blue) over Europe. (b) Spatial distribution of the streamflow with the colors representing the time series length in years. (c) Temporal evolution of station coverage. The plot shows the number of active stations in a given year, Although the curve accounts for dismissed stations, it still shows an increasing trend. Basemap from GeoPandas104.

The streamflow records were selected based on the following criteria: (i) they were available from official authorities in their respective country or from a recent open-access dataset, and (ii) they were open-source and easily accessible either via the internet or by e-mail request. The latter point emphasizes that no dataset requiring purchase for non-commercial access were included. It is important to note that freely available data do not necessarily come with a free redistribution license. Therefore, we cannot and do not make raw daily streamflow data directly available. Should the source data be necessary, we provide the EStreams catalogue of data sources to allow users easy and direct data access from the original repositories, including codes and instructions for data download and formatting. Compared to static databases of pre-compiled datasets currently available, our approach has two main advantages:

  1. i.

    Users can tailor the download to determine the desired spatial and temporal coverage, also making use of the provided descriptive statistics of the source data, such as regime characteristics or catchment properties.

  2. ii.

    Users can access the most up-to-date information directly from the data sources.

Table 1 provides an overview of the contributing countries, the number of streamflow gauges, and the data providers. France has the highest number of gauges (4,968), followed by Germany (2,093) and Spain (1,440). In contrast, Bulgaria (8 gauges) Moldova (2) and North Macedonia (1) have the lowest numbers of gauges.

Table 1 Overview of streamflow time series data available per country/region, with information about number of stations and data providers.

Streamflow gauges labelling

After the collection of the streamflow data and gauge information from each provider, the individual datasets were collated into a single dataset. In this process, each gauge was labelled with a unique 8-digit code. Consequently, each catchment was renamed according to its respective streamflow gauge. The 8-digit codes were generated using the following logic: the first two digits represent the country/region, the next two digits represent specifications about the data provider within regions that had more than one official provider, and the last four digits refer to the gauge counter for each country/region. For example, the gauge GB000045 represents Great Britain (GB), with only one provider (00), and the gauge number 0045. Similarly, ITIS0001 represents Italy (IT), with ISPRA (IS) as the data provider, and gauge number 0001. The gauges with records obtained from GRDC have the second two digits as “GR” (e.g., LVGR0001) to facilitate identification. This standardization ensures that all gauges are consistently labelled, providing users with a clear indication of the source and the number of records.

Identification of duplicate gauges

When compiling large streamflow datasets, there is a possibility of having duplicate records within the dataset that need to be identified and removed. This issue can arise when combining information from multiple sources and even within datasets obtained from a single data provider. To identify suspected duplicate records, we used a similar approach as used by the GSIM13, where for gauges originating from distinct data providers, we identified potential duplicate gauges by examining similarities in gauge and river names. We employed the Jaro-Winkler distance metric to quantify alphanumeric similarity, as discussed by Christen, 201233 with a threshold set at 0.70. We additionally considered spatial proximity, constraining pairs of stations within 1 km of each other. For gauges originating from the same data provider, we selected stations within a spatial proximity of 50 m and a delineated area difference below 1%. Gauges meeting these criteria were flagged as potential duplicates. The list of potential duplicates for each gauge is contained in the attribute duplicated_suspect within the gauges’ layer in the final EStreams dataset. Notably, all potential duplicates are preserved in EStreams, giving users the flexibility to choose their preferred station and data provider when duplicates are found. This approach ensures that users can tailor their dataset according to their specific needs and preferences.

Quality flags of records

Quality control of streamflow data is essential before undertaking any hydrological study. While some data providers include quality flags with each published record, this practice is not consistently available. Automatic checks are available but may be subjective, and their effectiveness has not yet been fully investigated34,35. For example, Do, 201813 employed an automatic detection criterion to identify and filter potentially suspect records based on negative values, consecutive repetitions, and outliers. However, these filtering criteria are not always reliable, as pointed out by Chen, 202315.

In this work, following the approach utilized by Chen, 202315, we adopt a two stages approach for quality checking the data, the first oriented at individual data points, and the second assessing the entire record. The first stage is primarily based on the quality flags from the original providers, when available, which for consistency are reclassified into four categories: “missing”, “no-flags”, “suspect” and “reliable”. First, all negative values were replaced with “not a number” (NaN) and flagged as “missing”. Then, values with a quality flag given by the data providers had their original labels reclassified as either “reliable”, “suspect” or “missing”. Finally, all data without a quality flag from the original providers were classified as “no-flag”. A complete overview of the mapping between the original flags and our four flags system is available in Supplementary Table 1.

In the second stage, we assessed the overall reliability of each entire time series based on the fraction of problematic data points as determined in the previous stage. This classification considered five criteria outlined in Table 2.

Table 2 Criteria used for the quality assessment of the streamflow gauges as in Chen, 202315.

A total of 7,430 stations had quality flags from their providers (about 43% of the total). Figure 3a shows that approximately 134 million data points (63.4% of the total) were classified as “no-flag”, 56 million data points (26.7%) as “reliable”, 3.9 million data points (1.9%) as “suspect”, and 16.8 million data points (8%) as “missing”. Regarding the gauge’s quality classification, Fig. 3b shows that most stations were categorized as either Class A or B (9,652), followed by Class E (3,317), Class C (2,827) and Class D (1,334). This classification allows users to filter the data depending on their needs. It is noteworthy that many national providers may offer only high-quality data for download. Therefore, even without explicit quality flags, the data can often be assumed to come from reliable stations. The quality flag for each gauge’s records is stored as the attribute gauge_flag within the gauges’ layer in the final EStreams dataset.

Fig. 3
figure 3

(a) Histogram of the streamflow data points according to their four data quality flags and (b) Histogram of the number of gauges according to their integrated data quality flag.

Basin delineation

Since catchment boundaries shapefiles were rarely available from national providers, this work adopted a semi-automatic delineation of catchment boundaries corresponding to streamflow gauges using Python scripts and QGIS software. We used the “delineator” python package36, which determines catchment boundaries using hybrid vector and raster-based methods. This package requires as input the latitude and longitude coordinates of the streamflow gauges and uses the MERIT-Hydro Digital Elevation Model (DEM)21. MERIT-Hydro is a digital elevation model developed to remove multiple error components from the existing spaceborne DEMs (SRTM3 v2.1 and AW3D-30m v1).

To appraise the accuracy of the delineated area, catchments were split into two categories: (i) catchments with a reported area from the data providers and (ii) catchments without this information. For gauges with available official catchment areas, the reported area was compared to the derived area, and the following workflow was adopted:

  1. i.

    First, we computed the “relative area difference” Arel as defined in Eq. 1. If |Arel| was below 10%, regardless of catchment size, the delineation was accepted, and the catchment was labelled with a quality flag of “0”.

  2. ii.

    Otherwise, the catchment delineation was visually inspected, potentially corrected as described below, and assigned a specific quality flag as detailed in Table 3, which provides an overview of the flags used and number of gauges corresponding to each flag.

    Table 3 Description of the catchment area quality flags adopted for the current catchment delineations and overview of the number of catchments per group.
$${{\boldsymbol{A}}}_{{\boldsymbol{rel}}}={\bf{100}}\times \frac{{{\boldsymbol{A}}}_{{\boldsymbol{EStreams}}}-\,{{\boldsymbol{A}}}_{{\boldsymbol{official}}}\,}{{{\boldsymbol{A}}}_{{\boldsymbol{official}}}}$$
(1)

where AEStream is the calculated area in EStreams and Aofficial is the reported official area.

The visual inspection was made using the river networks from both the MERIT-Hydro and EU-Hydro datasets37, Google Maps satellite imagery, and nearby catchments delineated and labelled with a quality flag of “0”. These three data sets were used as they represent independent sources and offer a good trade-off for evaluating the catchment delineation usability.

During the visual inspection, it was observed that some boundary discrepancies could be corrected with an adjustment in the streamflow gauge location. We assumed that uncertainties in the georeferenced system or the presence of close-by river branches could cause these discrepancies. For those catchments, the gauge location was moved (snapped) to the closest point within the MERIT-Hydro River network based on the gauge’s river and location names.

Catchments with |Arel| below 10% after the snap were labelled with a quality flag “1” indicating accepted delineation after the snap. The remaining catchments were classified with the criteria detailed in Table 3.

It is important to note that for some situations where human-influence such as canalization, water exports and specific lithologies like karstic systems, the actual catchment boundary delineation remains challenging. Hence, for catchments where |Arel| was above 10% and the visual inspection indicated such situations, we assigned a quality flag of “888”.

Finally, catchments where |Arel| was above 10%, and were not visually adjusted or accepted, were assigned to a quality flag ‘‘999”.

Out of a total of 17,130 stations, 15,775 (92%) had a reported catchment area from the data providers. Figure 4a shows the distribution of these streamflow gauges divided into two classes: gauges with |Arel| above 50% (in red), and those with |Arel| below 50% (in blue). Generally, gauges with high area discrepancies are located in regions of low relief, partly canalized landscapes and with high presence of lakes such as in Denmark, Sweden and Croatia.

Fig. 4
figure 4

(a) Relative absolute area difference |Arel| above 50% (in red) and below 50% (in blue). (b) Exceedance percentage of the |Arel|; the orange line marks the exceedance percentage corresponding to a |Arel| of 50%. (c) Bar plots showing the relative number of basins with areas above 50% for different basin area ranges (e.g., 0–100 km², 100–200 km², and >1,300 km²) relative to the total number of basins in each range. Basemap from GeoPandas104.

Figure 4b shows the exceedance percentage of |Arel| of these 15,775 catchments with a reported area. As indicated with the dashed orange line, the catchments with |Arel| above 50% was 8% (1,205 catchments). This analysis also shows that less than 17% of the catchments (2,712) had |Arel| above 10%.

Figure 4c focuses on catchments with |Arel| above 50% (1,205 catchments) and shows how the fraction of these catchment varies with catchment area. Notably, 17% of catchments under 100 km² exhibited |Arel| above 50%, while in all other ranges shown in the bar plot, the occurrence was below 5%. This analysis suggests that catchments with significant area differences tend to be relatively small.

Finally, for the 1,355 gauges (8% of the data) without catchment area information, the delineation was visually inspected, and a label was assigned to indicate the accuracy of the delineation based on the criteria shown in Table 3. Note that as it is not possible to calculate |Arel| for these catchments, the quality flags of “0” or “1” were never assigned to such basins. The visual inspection was again made using the river name, the river network provided by MERIT-Hydro and the EU-Hydro, Google Maps satellite imagery and nearby catchments delineated and labelled with a quality flag of “0”.

Hence, in the gauges’ layer stored in the final EStreams dataset, besides the original lat and lon coordinates, we included the lat_snap and lon_snap coordinates after the potential snap. The gauges layer also received an attribute called area_estreams, which express the AEStream. Additionally, we included the Arel as the attribute area_rel, and the qualitative flag as the attribute area_flag.

Catchment aggregated data

The EStreams dataset includes streamflow, meteorological, and landscape variables. For streamflow, we distinguish between dynamic streamflow indices and hydro-climatic signatures, which are further detailed in their respective sections. Meteorological variables are discussed in the “Meteorological records” section. Finally, landscape attributes were categorized into six groups (Topography, Soils, Geology, Hydrology, Vegetation, and Land Cover) and are described in the “Landscape attributes” section. All catchment aggregations were derived using the catchment boundaries and areas calculated by EStreams. For example, all streamflow indices and signatures were computed using the specific discharge (in mm/day) derived with the AEStreams areas.

Streamflow indices

In EStreams, streamflow data is presented in terms of “indices”, hence statistics of the daily data such as mean streamflow, maximum, minimum, percentiles and coefficient of variation, which are provided at annual, seasonal, monthly and weekly resolutions. The use of these indices is consistent with earlier works, such as the GSIM dataset13,14 and the CCl/WCRP/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI) (https://www.wcrp-climate.org/data-etccdi).

The use of indices instead of the daily data allows to make relevant climate information publicly available in cases where access to raw daily values is restricted. The selected indices, as discussed in the GSIM dataset13,14, are of high relevance and have been widely used in many hydrological studies, as they can facilitate the analysis of trends and changes in the regional water balance and the seasonal cycle.

The streamflow indices contained in EStreams are presented in Table 4, alongside with their units and temporal resolution. All the indices were computed for time-steps where at least 95% of the data was available, e.g., at annual time-step, the indices were computed for years where at least 347 days of data were available.

Table 4 Set of dynamic streamflow time series indices computed and made available at the present dataset.

Hydro-climatic signatures

In addition to the streamflow indices, we computed the same set of meteorological and hydrological signatures provided in the original CAMELS dataset1. Unlike streamflow indices, these signatures were calculated for the entire time period between 1950–2022 where data are available. Here we refer to these indices and signatures as hydro-climatic signatures (e.g., streamflow & precipitation mean, seasonality & aridity index, and runoff coefficient). For meteorology, we used precipitation and temperature derived from the Ensembles Observation (E-OBS) product18. This work used the “hydroanalysis” python package38 for the computation of these signatures.

The full list of signatures used is available in Table 5. We considered only catchments with more than one year of continuous measurements within the period of 1950–2022. Additionally, we also provide the number of years used for the signature’s computation (num_years), the start (start_date) and the end (end_date) of the observations between 1950–2022 to give a further overview of the period the signature refers to, considering separately the hydrological (hydro) and the climatic (climatic) signatures.

Table 5 Set of static hydro-climatic signatures.

Meteorological records

EStreams used E-OBS18 for meteorological forcing data records, which has been widely used in hydrological studies over Europe39,40,41,42. E-OBS provides a pan-European observational dataset of surface climate variables that is derived by statistical interpolation of in-situ measurements, collected from national data providers. It is an open-access database with daily records ranging from 1950-present. We used the ensemble mean dataset at a resolution of 0.25 degrees. Additionally, we used the temperature records from E-OBS to derive potential evapotranspiration (PET) using the Hargreaves formulation43 and the “pyet” python package44 for computation. Each catchment has 9 daily meteorological time series associated with it, which are illustrated in Table 6. The accuracy of E-OBS may be dependent on station density42, which varies across Europe. In order to account for this potential source of uncertainty, EStreams also includes information on the number of weather stations and density aggregated to a buffer of 10 km within each catchment boundary.

Table 6 Meteorological catchment attributes at daily resolution from 1950 to 2022.

Landscape attributes

A full overview of the landscape attributes contained in EStreams is shown in Table 7 and Table 8, with a short description, their units, and data provider. Regarding spatial coverage, except for the landcover & land use and soil types that have pan-European coverage, all the remaining products are global. Table 7 covers solely the fully static attributes, which are considered time invariant, such as elevation, soil types, main geology and mean vegetation indices. Conversely, Table 8 encompasses a group of attributes that are considered time variable, such as normalized difference vegetation index (NDVI), leaf-area index (LAI), irrigation and snow cover. These attributes are reported in time series at either monthly, yearly or in a specific number of years (e.g., irrigation and landcover) resolution.

Table 7 Set of static catchment attributes included in the present dataset.
Table 8 Set of the temporal catchment landscape attributes.

Topographical attributes were based on MERIT-Hydro21. Geology made use of the widely used Global Lithological Map Database (GLiM)19 and a gridded product for the estimation of the depth to bedrock20, which have been both used in several applications databases1,8,23. For the number of dams and of total upstream reservoir volume we used the Georeferenced global dams and reservoirs dataset22. A similar aggregation was performed for lakes using the HydroLakes dataset45. Vegetation indices and snow cover percentage made use of three MODIS products27,31,32 and were aggregated considering both temporal and static attributes. For irrigation, we decided to use the global dataset of the extent of irrigated land26, which ranges from 1900 to 2005, and has been already used in other studies13,14,23. The soil attributes were based on the European Soil Database Derived data (ESDD)28,29,30 and the land cover on the CORINE land cover dataset25. Both are widely used products which have been used in previous LSH datasets covering Europe7,8.

Data Records

The current version of the EStreams dataset and catalogue (v1.0) is stored at a Zenodo repository46 at https://doi.org/10.5281/zenodo.13154470. The repository is organized into the following subfolders:

  • streamflow_gauges: Contains two csv-files. One includes all the metadata associated with each of the 17,130 streamflow gauging stations such as location, river name, catchment area, and gauge elevation. The other file is the streamflow catalogue containing all the data provider information, further described in the following section.

  • shapefiles: Contains two shapefiles. One shapefile includes the derived catchment boundaries associated with each streamflow gauge, and the other shapefile marks the location of the streamflow gauges. Both files are referenced in WGS 84.

  • streamflow_indices: Contains one sub-folder per time resolution (weekly, monthly, seasonal and yearly) with a csv-file per computed index. The rows of each csv-file represent the time, and the columns represent the catchment.

  • meteorology: Contains one csv-file per catchment (17,130 in total), each containing all the daily aggregated meteorological forcing records for that catchment (as detailed in Table 6). The rows of each csv-file represent the time, and the columns represent each of the 9 meteorological variables.

  • attributes: Contains two sub folders. The static_attributes subfolder contains one csv-file per attribute group (i.e., topography, soils, geology, hydrology, vegetation and landcover) encompassing all the attributes shown in Table 7. The rows of the csv-file represent the gauging stations, and the columns represent the attribute variable. The temporal_attributes subfolder includes all the monthly or annual landscape attributes shown in Table 8. The csv-files in this subfolder are organized by gauging stations (rows), and attribute variables (columns), or as time series (each column represents one gauging station, and each row represents one date).

  • hydroclimatic_signatures: Contains one csv-file with all computed hydro-climatic signatures for all catchments. The rows of each csv-file represent the streamflow gauging station, and the columns represent each of the 25 derived signatures.

  • appendix: Contains three txt-files. One file provides descriptions of the lithological classes’ labels, another describes the landcover classes’ labels, and the third file includes licenses and data providers.

Streamflow data catalogue

An important component of EStreams is the streamflow catalogue, which provides complete guidance on how to retrieve the raw streamflow data used in this study to compute the streamflow statistics. Table 9 provides an overview and description of the attribute fields included in the catalogue.

Table 9 Attribute fields included in the European Streamflow Catalogue provided.

Particularly, the field license_redistribution specifies the data redistribution policy of the data provider. In cases where this information is unavailable, users are advised to proceed with caution regarding any redistribution or specific use of the data, and to contact the data provider directly. The catalogue also includes various links to individual data providers, covering the website, the license source, streamflow and gauges metadata. Up to four different links are provided because the websites for downloading the streamflow time series may differ from those for the gauges metadata.

The Zenodo repository46 (https://doi.org/10.5281/zenodo.13154470) supports versioning, which ensures reproducibility, benchmarking, and the extensibility of the dataset as new stations or time periods are added.

Additionally, Jupyter Notebook demonstrations are available at the GitHub repository47 (https://doi.org/10.5281/zenodo.13255133) showing not only how to use the catalogue but also allowing to directly retrieve and pre-process each of the daily records currently included in EStreams. The repository is linked to a GitHub page, enabling users to track potential changes in data providers, websites, and propose updates. This collaborative approach can lead to new releases of the catalogue, ensuring EStreams remains an updated and dynamic resource.

Gauges layer

A comprehensive overview of the gauges’ attributes and metadata included in this dataset is presented in Table 10. These attributes are designed to offer users complete guidance on data availability before downloading, thereby optimizing the data collection process. The attributes include the gauges names and location, data provider, topographic information, temporal data availability, quality and reliability descriptors, and nested catchments & flow order attributes. These attributes ensure that users have detailed information to facilitate the efficient retrieval and application of the streamflow data in various hydrological analyses.

Table 10 Description of the attributes of the streamflow gauges’ layer.

Catchments layer

The delineated boundary of each catchment is stored in the catchment layer. This layer includes the basin_id field, which is also used for the gauges, allowing a link between the two datasets. Additionally, the catchment layer also has the fields gauge_id, gauge_country (here named country), area_official (here named area_offic), area_estreams (here named area_estre), area_flag, area_rel, start_date, end_date, gauge_flag, gauges_upstream (here named upstream) and watershed_group (here named group), which were already described in Table 10. Note that area_official, area_estreams, gauge_country, gauges_upstream and watershed_group had their names reduced due to storage limitations in the shape files. These fields ensure consistency between the catchment and gauge datasets, facilitating seamless integration and analysis.

Technical Validation

Duplicate stations

This work provides, alongside the gauges’ metadata, information on potential candidates for duplication. This information is useful for users aiming to have a consistent dataset for their hydrological analysis. The results indicate that a total of 885 gauges are identified as potential duplicates, representing about 5% of the total. This means that more than 16,600 gauges in the dataset may be seen as unique gauging stations. The duplicates are divided into two types: gauges duplicated with other gauges within the same provider and gauges duplicated with other gauges within different providers.

These first types of duplicates often occur when gauges are discontinued and later reactivated as new stations, usually resulting in stations with non-overlapping time records but located at the same point. These cases are primarily found in France (449) and Finland (160). For example, stations FR001479 (1969–1999), FR001477 (1993–1999) and FR001478 (2015–2023) are flagged as duplicate suspects among each other.

Additionally, 163 gauges are identified as duplicates across different data providers. These typically represent gauging stations located at the boundaries between countries and are mainly found in Austria (33), Switzerland (36) and Czech Republic (51). Interestingly, FR004543 is the only gauge identified as duplicate both within the same provider (FR002217) and across different providers (CH000268).

Basin delineation validation

In this part of the study, we used the dataset provided by LamaH-CE8 for Austria, which includes both catchment boundaries and their respective officially reported areas. These were compared to the boundaries delineated using the methodology adopted in this work.

Figure 5a shows a scatter plot comparing the areas reported in LamaH-CE and those derived in EStreams. As expected, the scatter between the computed and reported areas is larger for smaller catchments. Figure 5b presents a histogram with the distribution of the relative absolute area difference |Arel| between the two areas (in %). Out of the total of 599 Austrian catchments, 539 had a |Arel| below 10%. This indicates that roughly 90% of the catchments were accurately delineated during the automatic part of the delineation process.

Fig. 5
figure 5

(a) Comparison of catchment boundary areas reported LamaH-CE8 against those delineated in this study. Both axes are presented in logarithmic scale to enhance visualization. (b) Histogram illustrating the |Arel| between the two sources of data. Most catchments exhibit |Arel| below 10%. Catchment AT000009 (EStreams) delineations are displayed (c) prior to manual adjustment of the outlet location and (d) following manual adjustment.

However, if we consider only catchments with areas above 100 km2 the number of catchments with |Arel| above 10% drops from 60 to only 21. After visual inspection, we concluded that the main cause of these discrepancies was associated either to the difficulties in the delineation of relatively small catchments, below 100 km2, or to small discrepancies between the streamflow gauge location in terms of the MERIT-Hydro network.

Figure 5c-d illustrate an example of the catchment delineation workflow for catchment AT000009. This catchment has an Aofficial of 1281.0 km2. Initially, AEStream derived an area of 4680.0 km2, which accounts for a Arel of +265.0%. Upon visual inspection, we realized that the inconsistency was due to the inaccurate location of the streamflow gauge in relation to the MERIT-Hydro River network (Fig. 5c). Since the outlet was not within the river network, the “delineator” python module used automatically moved it to the closest river network intersection, which had a much higher drainage area. After manually adjusting the streamflow gauge location, the delineation resulted in an area of 1,300.0 km2, an Arel of only +1.5% (Fig. 5d).

E-OBS assessment

Spatial coverage

EStreams used E-OBS to derive the catchment aggregated time series of meteorological variables. However, the number of stations used to produce the gridded dataset varies significantly from country to country. Here we provide a brief overview of the station densities used to derive the precipitation time series provided in E-OBS within each catchment. We present this analysis only for precipitation since it is considered the most important forcing input in hydrological studies and gives already a significant overview of the E-OBS network. To ensure a fair comparison, we considered a buffer of 10 km for the catchment boundaries and considered any station within this range to compute the number of stations.

Figure 6a illustrates the spatial distribution of the stations, revealing a large spatial variability in station density. Central and North Europe exhibit the highest density, with Germany and Poland taking leading in station density, while the density decreases significantly towards South and East.

Fig. 6
figure 6

(a) Overview of the spatial distribution of the stations used to derive the precipitation time series grided data available at E-OBS18. (b) Histogram of the stations per catchment. Due to the high distribution of densities the bins are not evenly spaced, and the first bin (in red) corresponds to the threshold of one station per 100 km2. Basemap from GeoPandas104.

Figure 6b presents the histogram of the station density per catchment included in EStreams. The x-axis is resampled to stations per 100 km2 to facilitate visualization, with the threshold of less than one station per 100 km2 marked in red. A total of 9,840 catchments have at least one precipitation gauge per 100 km2. This represents, a median of 1.2 stations per 100 km2. Considering absolute terms, we found a total of 14,153 gauges with at least one precipitation station within their boundaries.

This information enables users to be aware of the highly variable quality of the provided E-OBS data and make informed decisions, especially considering the critical role of accurate precipitation data in many hydrological applications. Like streamflow data, national providers typically offer much higher resolution precipitation data compared to global databases48. While retrieving this information was beyond the scope of this study, users may choose to leverage such local data sources, particularly in regions where station density is notably low, such as in the South, East, and West of Europe.

Validation of meteorological forcing

We further validated the aggregated precipitation derived from E-OBS comparing it to the reported time series available at CAMELS-CH7 and CAMELS-GB2. Given that the aggregation of the forcing variables used E-OBS gridded data with a resolution of 0.25 degrees, we opted to include only catchments with areas above 100 km2 in the comparison.

Figure 7a shows a scatter plot illustrating the daily precipitation from E-OBS and CAMELS. CAMELS-GB is represented in blue and CAMELS-CH in orange. A notable correspondence between the two sources is observable, with correlation coefficients of 0.89 for GB and 0.94 for CH. Generally, the scatter is lower in catchments with higher daily mean precipitation and an underestimation from E-OBS compared to the two sources is evident.

Fig. 7
figure 7

(a) Scatter plot of the long-term mean daily precipitation (1950–2022) considering the precipitation forcing time series derived from E-OBS18 and the provided in CAMELS-CH7 and CAMELS-GB2 and (b) Histogram of the correlation coefficient between the two data sources. The plots only show catchments with areas above 100 km2.

Figure 7b shows the distribution of the correlation coefficients between each daily time series of E-OBS and CAMELS. Again, it is possible to observe that most of the catchments presented a correlation above 0.8, indicating some agreement between the two precipitation sources. Overall, CAMELS-CH demonstrates higher correlation coefficients than CAMELS-GB. Despite this comparison only encompassing two different regions within the large span covered by EStreams, it was conducted using two independent sources. Hence, this analysis suggests that E-OBS, at least in countries where the station density is relatively high, provides a broadly consistent starting point for representing precipitation time series.

Usage Notes

Aggregated data

The original data used to aggregate the catchment attributes such as climate, geology, hydrology, land use and land cover, soil types and vegetation characteristics have all continental or global resolution. It should be kept in mind that such resolution is rather coarse compared to local information usually available at the national scales, but seldom easily accessible. We therefore recommend that users acknowledge these potential limitations when using the aggregated data. Additionally, we recommend users to also reference the original sources when using the aggregated data provided in EStreams.

Streamflow catalogue

We recognize that potential retrospective check and updates of streamflow time series by the data providers may alter the information of the gauges metadata provided here. We also acknowledge that potential changes in the data providers’ platforms may alter the available links in the catalogue. Therefore, we invite the users to access the latest version of the catalogue and dataset on the Zenodo repository46 page for potential updates.

Instructions for Python

We kindly request that future users of the EStreams’ codes read and follow carefully the instructions provided in the scripts. Specifically, (i) use the specified version of the Python modules (requirements.txt); (ii) clone the repository locally and keep all the original folders’ names; (iii) place the original data in their specified folder and with their expected filename and version; (iv) follow the pre-defined specified order of run for the available scripts (when necessary). Be aware that the potential main source of problems when running the scripts might be caused by not following these guidelines.