Feature-based Groundwater Hydrograph Clustering Using Unsupervised Self-Organizing Map-Ensembles

Hydrograph clustering helps to identify dynamic patterns within aquifer systems, an important foundation for characterizing groundwater systems and their influences, which is necessary to effectively manage groundwater resources. We develop an unsupervised modeling approach to characterize and cluster hydrographs on a regional scale according to their dynamics. We apply feature-based clustering to improve the exploitation of heterogeneous datasets, explore the usefulness of existing features and propose new features specifically useful to describe groundwater hydrographs. The clustering itself is based on a powerful combination of Self-Organizing Maps with a modified DS2L-Algorithm, which automatically derives the cluster number but also allows the user to influence the level of detail of the clustering. We further develop a framework that combines these methods with ensemble modeling, internal cluster validation indices, resampling and consensus voting to finally obtain a robust clustering result and remove arbitrariness from the feature selection process. Further, we propose a measure to sort hydrographs within clusters, useful for both interpretability and visualization. We test the framework with weekly data from the Upper Rhine Graben System, using more than 1800 hydrographs from a period of 30 years (1986-2016). The results show that our approach is adaptively capable of identifying homogeneous groups of hydrograph dynamics. The resulting clusters show both spatially known and unknown patterns, some of which correspond clearly to external controlling factors, such as intensive groundwater management in the northern part of the test area. This framework is easily transferable to other regions and, by adapting the describing features, also to other time series-clustering applications.


Introduction
The analysis and evaluation of groundwater level dynamics can contribute valuable information to assess quantitative groundwater availability, which is important to manage groundwater resources and secure water supply in many regions worldwide. As every hydrograph contains information about system properties (e.g. geology), artificial (e.g. withdrawal) and natural (e.g. streamflow interaction) environmental factors, hydrograph clustering is often helpful to identify common dynamics and to differentiate between signals resulting from external controlling factors and noise. This improves understanding of system dynamics, and forms the basis for further analysis including forecasting or scenario building. Popular methods for clustering hydrological time series are for example Cluster-Analysis (CA) (Naranjo-Fernández et al. 2020) and PCA (Haaf and Barthel 2018), each alone or as a combination of both (Machiwal and Singh 2015). Besides classical approaches, Artificial Neural Networks (ANN) offer innovative concepts to deal with larger sets of multidimensional data, for example by using Self-Organizing Maps (SOM) for unsupervised clustering. Several studies from different disciplines compare SOM to other well-established clustering methods like k-means and hierarchical clustering (HC). Some authors found that k-means performs equally well (He et al. 2004) or even better than SOM (Balakrishnan et al. 1994;Kumar and Dhamija 2010;Mingoti and Lima 2006); however, there is no consensus on this aspect in the literature, as other authors found SOM to be clearly superior to k-means (Kiang et al. 2006;Melo Riveros et al. 2019) and also to HC (Mangiameli et al. 1996). Often, SOM are even combined with k-means or HC methods, because interpreting a trained SOM structure is not trivial and second-level clustering is therefore usually applied. Besides classical clustering methods, algorithms specialized in the interpretation of trained SOM, such as DS2L (Cabanes et al. 2012), also exist.
In the hydrological context, SOM have been extensively used to analyze water quality and chemistry (Gholami et al. 2021). Applications to groundwater hydrographs are forecasting by using hybrid SOM-ANN models (Chang et al. 2016;Chang et al. 2014;Lin and Chen 2005;Moradkhani et al. 2004), hydrological event type clustering and classification (Abrahart and See 2000;Toth 2009), or catchment classification (Toth 2013). The clustering of groundwater hydrographs, especially by using SOM, has been carried out rather rarely so far. Han et al. (2016) used SOM to identify homogeneous clusters of groundwater level piezometers as a preprocessing step to forecasting with a step-wise cluster multi-site inference model. However, they tested the approach on a rather small number of wells (30) and more importantly, they used the time series directly as inputs. Approaches that use time series directly for clustering suffer from dependency on high-quality data (equal length, equal period, no gaps). Application of feature-based approaches can overcome this problem by using patchy input data (Wang et al. 2006). Features, in this case, are descriptive (statistical) measures of the time series, extracted e.g. from the time or frequency domain (Caiado et al. 2015). To apply a feature-based approach on groundwater level data, features taking the peculiarities of groundwater hydrographs into account are desirable. Heudorfer et al. (2019) present a comprehensive compilation of 45 possibly suited indices to describe groundwater dynamics. Their approach is very much related to the concept of hydrological signatures (McMillan et al. 2017), where features are designed to describe certain dynamic aspects in surface hydrology. Feature-based clustering of hydrological time series using Self-Organizing Maps has already been performed by Nourani et al. (2015), who used features based on wavelet decomposition to cluster a small number of wells on Ardabil plain, Iran. 
However, to the best knowledge of the authors, no approach is known yet that combines SOM-clustering with specifically designed features that describe the dynamic aspects of certain groundwater hydrographs.
In this study, we develop a robust, flexible, and semi-automated framework for groundwater hydrograph clustering. We chose feature-based time series clustering, which allows the use of time series with different periods and different lengths as well as missing and noisy data. Moreover, we present and explore several new features, which showed promising results and which are particularly suited to describe the dynamic aspects of groundwater hydrographs. We introduce a modification of a powerful clustering algorithm combination (SOM+DS2L) that allows the user to influence the level of detail of the clustering result, and implement Ensemble-Modeling-Techniques to remove arbitrariness from the feature selection process as well as to ensure a higher robustness of the clustering result. We apply the developed approach to the Upper Rhine Graben (URG) area in central Europe, based on a dataset of overall 1853 groundwater hydrographs. The motivation and later application is to reduce the workload of regional groundwater level forecasting by selecting representative hydrographs from the clustering result. Additionally, we aim for increased system understanding in terms of dynamic patterns and their main controlling factors.

Data and Study Area
The study area is the Upper Rhine Graben (URG), mainly located in southwestern Germany and northeastern France (Fig. 1a). It is the largest groundwater resource in central Europe (LUBW 2006), covering 80% of the drinking water demand of the region (Région Alsace -Strasbourg 1999) and is also intensively used for water extraction for irrigation and industrial purposes. The URG, a Cenozoic rift structure, 300 km long (N-S) and on average 40 km wide (E-W), is filled with sediments (mainly gravel and sand) with a total thickness of up to about 3500 m. Hydrogeologically, the uppermost Quaternary sediments are most important. They reach a thickness of more than 200 m in the southern part, which strongly decreases to about 30 m in the area around Karlsruhe. In the northern part of the URG, the Quaternary sediment thickness increases to up to 500 m and a multi-aquifer system exists due to several fine-clastic layers dividing the Quaternary sediments (Geyer et al. 2011;LUBW 2006).
The dataset used consists of 1853 weekly groundwater hydrographs from Germany and France, including one synthetic hydrograph with strong outlier characteristics to explore and illustrate additional properties of the clustering approach. The considered period ranges from October 1986 to September 2016 (30 years). The majority of the hydrographs show data for almost the entire period, the shortest length included being six years. We removed strong outliers conservatively and linearly interpolated small data gaps of up to one month. Figure 1a shows the study area in general (left) and the locations of the 1852 real wells included in the dataset (right). The dataset includes only wells from the uppermost aquifer within the Quaternary sediments, which causes e.g. the three major blank spaces on the map in Fig. 1a (right) due to locally changing geological conditions in these areas. Figure 1b sketches a strongly simplified E-W cross-section of the URG and illustrates that the regional groundwater dynamics are the result of a complex interaction of multiple factors, which we divided into processes (Pr), driving forces (DF), and governing parameters (GP) for the sake of a more systematic point of view. Processes are the physical processes that directly influence the groundwater levels (e.g. recharge). They are mostly driven by external driving forces (e.g. precipitation) and in most cases depend on one or several governing parameters (e.g. topography, land use). A detailed assessment of the importance of each factor can be found in the electronic supplement (Text S1).
Strongly simplified E-W cross-section of the URG, summarizing some influences on groundwater dynamic patterns (DF: driving force, GP: governing parameter, Pr: process); DF1 - artificial extraction/infiltration, DF2 - surface water interactions (a: floods), DF3 - regional flow systems, DF4 - weather/climate, DF5 - soil moisture; GP1 - topography, GP2 - vegetation/land use, GP3 - geology (aquifer type/material properties), GP4 - pressure state (free/confined), GP5 - mean depth to groundwater; Pr1 - recharge (a: direct/diffuse; b: direct/local; c: inter-aquifer-exchange; d: lateral), Pr2 - evapotranspiration, Pr3 - signal damping (low pass filter effect), Pr4 - in-/exfiltration, Pr5 - bank storage
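The gap-filling step mentioned in the data description (linear interpolation of gaps of up to one month) can be illustrated with a minimal sketch. It assumes that missing weekly values are stored as NaN and that "one month" corresponds to at most four consecutive weekly values; the function name `fill_short_gaps` and the parameter `max_gap` are hypothetical and not taken from the study.

```python
import math


def fill_short_gaps(values, max_gap=4):
    """Linearly interpolate runs of NaN that are at most `max_gap` long.

    Assumption: `values` is a weekly series with NaN for missing data.
    Longer gaps and gaps at the start or end of the series stay NaN.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if not math.isnan(filled[i]):
            i += 1
            continue
        start = i  # first missing index of this run
        while i < n and math.isnan(filled[i]):
            i += 1
        end = i  # first valid index after the run (or n)
        gap_len = end - start
        if start == 0 or end == n or gap_len > max_gap:
            continue  # edge gap or gap too long: leave as NaN
        left, right = filled[start - 1], filled[end]
        for k in range(start, end):
            frac = (k - start + 1) / (gap_len + 1)
            filled[k] = left + frac * (right - left)
    return filled
```

Longer gaps are deliberately left untouched here, because the feature-based approach described in the following section is meant to cope with incomplete series.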

Feature-Based Time Series Characterization
A proper feature set, depending on the unique hydrogeological conditions, is key to adequately describe and thus successfully cluster the data. Here, features are descriptive (statistical) indices that quantify the dynamics of groundwater hydrographs, similar to the concept of signatures in hydrology (McMillan et al. 2017). However, groundwater hydrographs generally differ considerably from surface water hydrographs, which makes many hydrological signatures inadequate for describing dynamic aspects of groundwater. Thus, there is a need for comprehensive testing of the transferability to the groundwater domain, as was done by Heudorfer et al. (2019). An important supporting tool for pre-selecting adequate features is a visual skill test to check the adequacy and the explanatory power of every single feature. Applying PCA or related methods can help to reduce the feature number by ruling out redundant features based on the explained variance. However, including correlated features can help to improve the result by up-weighting important aspects of the general dynamics. We explore this aspect with a correlation analysis of all selected features in the results section. In total, we tested a broad variety of feature candidates (> 50), including standard statistical measures, features derived from the literature (Heudorfer et al. 2019;Wang et al. 2006), as well as self-designed features to account for peculiarities of both the study area and groundwater hydrographs in general. In the following, we introduce those which have successfully passed the visual skill test for our data set. Skill test results that show the explanatory power of each feature are provided in the supplementary material (Figs. S1 to S13). Table 1 summarizes the feature calculation, the corresponding data basis, and the primary purpose or a short description for all used features.
For more details on the self-designed features, we refer to the supplementary material where we also present results on the robustness of the features against gaps, noise and time series length (Text S2 and S4, Tables S1−S3, Figs. S1−S13).
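To illustrate why feature-based characterization tolerates gaps and unequal series lengths, the following minimal sketch computes two simple descriptors from an incomplete weekly series. The formulas are assumptions chosen for illustration only: a population skewness and the standard deviation of week-to-week differences as a stand-in for flashiness. They are not the definitions from Table 1, and the names `skewness`, `sd_of_differences` and `feature_vector` are hypothetical.

```python
import math


def skewness(values):
    """Population skewness of the non-missing values (NaN = missing).

    Assumes at least two non-missing values.
    """
    x = [v for v in values if not math.isnan(v)]
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def sd_of_differences(values):
    """Standard deviation of week-to-week changes.

    Only pairs of adjacent, non-missing values contribute, so a gap
    removes the differences that touch it but nothing else.
    """
    diffs = [b - a for a, b in zip(values, values[1:])
             if not (math.isnan(a) or math.isnan(b))]
    mean = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))


def feature_vector(values):
    """Map one hydrograph of any length to a fixed-length description."""
    return {"skew": skewness(values), "sd_diff": sd_of_differences(values)}
```

Because every hydrograph is reduced to a vector of the same length, series covering different periods can be compared directly in feature space.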

Self-Organizing Map Clustering Using DS2L Algorithm
SOM perform a non-linear projection of multidimensional data onto a regular neuron lattice surface. They show characteristics of both clustering (local averaging) and data compression methods (topology preservation), which is a unique property and also an advantage of SOM compared to other cluster algorithms and projection methods (Kohonen 2014). Every neuron has clearly identifiable neighbors, which allows simple two-dimensional visual representations of multi-dimensional data. We apply a modified version of the density-based simultaneous two-level (DS2L)-algorithm (Cabanes et al. 2012) to automatically derive clusters from the trained SOM. DS2L detects clusters by analyzing data density and neighborhood connection-strength of the SOM. An adequate cluster number is automatically determined and the algorithm does not tend to produce clusters of equal size, both advantages compared to some well-established cluster algorithms (e.g. k-means or some hierarchical methods). We modify the DS2L-algorithm in such a way that the user can decide purely qualitatively whether the clustering should be performed more coarsely or more finely. On the chosen level of detail the cluster number is still determined automatically. For this, we implement three adjustment parameters for thresholds of data density and neighborhood connection-strength as well as to control the application of some algorithm steps. Together with the number of neurons (SOM-size), which also has an influence on the cluster result, the following four parameters must be optimized during the clustering process.
- SOM-size: normal (5√n), small (5√n ⋅ 0.25) or big (5√n ⋅ 4); options implemented in SOM-Toolbox (Vesanto 2005), n: number of samples
- NTH: DS2L-Neighborhood-Threshold, NTH ∈ ℤ with NTH ≥ −1; connection strength required to qualify as cluster border, −1 means connection strength is not used
- DR: Yes/No; DS2L-Density-Refinement, use density values for cluster determination
- DM: Yes/No; DS2L-Density-Merging, merge similar clusters based on a density-dependent index
Figure 2 summarizes the workflow of the approach applied in this study. A common problem with many feature-based approaches is the arbitrariness of feature selection. As shown by line I in Fig. 2, we implement an SOM-ensemble to find the best combination of all pre-selected features, whereby the cluster quality is judged by five different internal validation indices (Caliński-Harabasz criterion (CH), McClain-Rao criterion (MR), PBM-Index, Ratkowsky-Lance criterion (RL), C-Index). Line II in Fig. 2 shows a second SOM-Ensemble based on delete-d-jackknifing resampling. Its purpose is to simulate changes in the observational network by manipulating the input data set, and to obtain cluster results as robust as possible. The final cluster result is based on voting consensus. For visualization and evaluation, we rearrange all original time series of a cluster by their mean pairwise Pearson-correlation with all other cluster members. A weighting by the p-value of the respective single correlations lowers correlation values with low significance (which might arise from only short overlapping time periods). We define this value as the weighted intra-cluster correlation (R_W). A detailed description and discussion of the workflow is added to the supplementary material (Text S3). Besides the clustering itself, interpreting the results is very useful to improve system understanding in general.
This is especially the case for clusters that are not easily interpretable in terms of spatial location or dynamic aspects. Hence, we conduct detailed correlation analyses for factors mentioned in Fig. 1b, where reasonable additional data are available to perform meaningful statistics. For some, data are only available for part of the study area. We therefore link them also with features and not only with clusters. In this way, we avoid a bias for clusters with wells in areas without data. Furthermore, the dynamics within clusters are usually the result of a superposition of several influencing factors, which can make correlations rather challenging. Because of the easier metric interpretation, we focus on linear correlation analysis, although we are aware that non-linear relationships can also exist. In addition, we only mention significant correlations with p ≤ 0.05.
Fig. 2 Workflow of the presented methodology
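The weighted intra-cluster correlation used for ordering hydrographs within a cluster can be sketched as follows. The text only states that pairwise Pearson correlations are averaged and down-weighted by their p-values; the concrete weight (1 − p) and the Fisher z approximation of the p-value used here are assumptions made for this illustration, and all function names are hypothetical. The exact formulation is given in the supplementary material (Text S3).

```python
import math
from statistics import NormalDist


def pearson_with_p(a, b):
    """Pearson r and an approximate two-sided p-value.

    Only time steps where both series have data (no NaN) are used.
    The p-value uses the Fisher z normal approximation (assumption).
    """
    pairs = [(x, y) for x, y in zip(a, b)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if n < 4:
        return 0.0, 1.0  # too little overlap: treat as uninformative
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0, 1.0  # constant series: correlation undefined
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))
    r_safe = max(-0.999999, min(0.999999, r))  # keep atanh finite
    z = math.atanh(r_safe) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return r, p


def weighted_intracluster_correlation(series):
    """One score per series: mean of r * (1 - p) over all other members.

    Expects a list of at least two equally indexed series.
    """
    n = len(series)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                r, p = pearson_with_p(series[i], series[j])
                total += r * (1.0 - p)
        scores.append(total / (n - 1))
    return scores
```

Sorting the members of a cluster by descending score places the most representative hydrographs first, which is the kind of ordering used for the stacked plots in the results.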

Results and Discussion
We applied our approach to 1853 time series from the Upper Rhine Graben area (including one synthetic hydrograph). The feature pre-selection provided 13 features with good explanatory power regarding our specific dataset (Sect. 3.1/ Table 1). The cluster parameter combination used was: SOM-size: big, NTH = 0, DR: Yes, DM: No (Sect. 3.2). The best feature configuration derived from the first ensemble (115,005 members) included 9 out of 13 features.
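The first ensemble ranks feature combinations by internal validation indices. As an illustration of one of them, the following sketch computes the Caliński-Harabasz criterion from its standard definition, the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The function name and the toy data are hypothetical; how the five indices are combined into one ranking is described in the supplementary material and is not reproduced here.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k)).

    points: list of equal-length numeric tuples (feature vectors)
    labels: one cluster label per point
    Higher values indicate compact, well-separated clusters.
    """
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0  # B: size-weighted squared distance of centroids to overall mean
    within = 0.0   # W: squared distance of points to their own centroid
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))
```

For two candidate partitions of the same feature vectors, the one with the higher value would be preferred by this criterion.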
As stated in Sect. 3.1, we found that including correlated features improves the clustering results. A correlation analysis among the included features shows the highest absolute significant (p<0.05) correlations for the feature pairs Skew-Med01 (-0.81) and P52-RR (0.79), which is consistent with the meaning and calculation of these respective feature pairs (e.g. hydrographs with high annual periodicity often also show a regular range over the years, thus high RR values). A detailed correlation matrix of all features can be found in the supplementary material (Fig. S27).
The final cluster result consists of 18 clusters (Fig. 3a) with sizes ranging from 239 hydrographs in cluster 1, to only one hydrograph in cluster 18, which is the synthetic hydrograph with outlier characteristics (cluster numbers sorted in descending order by size). The five biggest clusters include almost 1000 of the 1853 hydrographs in total, eight clusters show sizes larger than 100, only five clusters show sizes below 50. Due to the huge amount of information, we summarize detailed information and graphics on every single cluster in the supplement (Figs. S28−S65). In the following, we only present selected results.
The boxplots in Fig. 3b show the feature value distributions within each cluster. For some clusters a clear feature importance can be derived. Cluster 2, for example, comprises mainly regular hydrographs dominated by the annual periodicity and with few other long- or short-term periodicities (high P52), as well as the annual maximum and minimum occurring very regularly during March and September, respectively (high SB). Reasons are comparably high recharge values in the middle of the Graben, typical for wells neither strongly dominated by margin inflows nor by the Rhine River. However, less straightforward feature combinations also exist, which are therefore harder to interpret. The same applies to the spatial distribution of the clusters. If there is no distinct grouping (e.g. as a result of a spatially limited, local influence on the dynamics), more effort is required to understand what processes, forces, or parameters might be the cause of the common dynamics.
Cluster 3 (Fig. 4) is an example of straightforward interpretation, where wells follow almost exclusively the Rhine River course. Thus, identifying interaction with surface water (DF2, Pr1b, Pr4, Pr5, Fig. 1b) as the dominant driving force is comparatively easy. Some wells of this cluster showing greater distances to the Rhine River are in turn closer to mid-sized rivers like the Neckar or Ill, where common dynamics can be expected due to similar overall conditions. The resulting hydrograph grouping reveals that despite data gaps and different time series lengths, a homogeneous grouping was still achieved by our approach. The weighted intra-cluster correlation values (R_W) are expressed by the coloring (the brighter the lower), by the sorting of the stacked time series and by the bars on the right. In general, with decreasing R_W-values towards the cluster borders the heterogeneity increases and the certainty of the cluster assignment of individual hydrographs decreases. Considering cluster 3, we can observe a distinct north-south gradient, which means that despite a changing dynamic along the river, grouping was still successful. However, other wells close to the Rhine River were sorted into different clusters, but indeed show different dynamics (compare clusters 7 and 9 in the supplement). In terms of feature values, the Rhine influence for cluster 3 is best expressed by feature SDdiff, describing the higher flashiness close to the river (Fig. 3). Other features are also in accordance. For example, Med01 values are comparably low, indicating that the hydrographs are more likely to be bound to some kind of baseflow level in combination with short and high peaks triggered by the streamflow.
Fig. 3 a) cluster sizes, b) feature value boxplots of all clusters. For a better graphical representation, Cluster 18 was omitted due to strong outlier characteristics. Boxplots including Cluster 18 can be found in the supplement
Fig. 4 Maps and stacked z-transformed hydrographs of selected clusters. Coloring and stacked order reflect the weighted intra-cluster correlation (R_W) also shown as bar-plot on the far right; Cluster 3 is mainly influenced by the Rhine River; Cluster 8 shows spatial grouping in the northern part and contains hydrographs with low annual periodicity and low variability; Cluster 15 groups hydrographs with outliers and inhomogeneities; Cluster 18 contains only the synthetic hydrograph, which is a heavy outlier compared to the whole dataset
Overall results show that in the north of the URG, predominantly hydrographs with small variability and weak annual periodicity occur, while especially the middle section of the URG shows highly seasonal and highly regular hydrograph patterns. The former is expressed mainly by clusters 1, 5, 8, 10, 16; the latter can be seen e.g. in clusters 2 and 4 (Figs. S29−S65). We selected cluster 8 (Fig. 4) to illustrate the low-variance case in the northern URG. Driving forces connected to this cluster are most certainly strong anthropogenic influences (DF1, Fig. 1b) because the cluster focuses spatially on an area with strong groundwater management efforts. Connections to generally lower groundwater recharge values (Pr1a, Fig. 1b) in the northern URG can also be drawn. Both factors can explain the smoothness as well as the comparatively weak annual periodicity and low variability of the hydrographs in cluster 8.
The approach successfully separates a small group of 16 hydrographs with outliers and significant inhomogeneities, which probably occur due to two major Rhine River weir locks (Strasbourg, Breisach) (cluster 15, Fig. 4). Furthermore, the synthetic hydrograph is put in a separate cluster (cluster 18, Fig. 4). Such clusters are rather based on single events or characteristics than on similar, highly correlated time series. Therefore, even for good clusters in terms of such events, R_W-values can be rather low.
In terms of system understanding, i.e. the correlation analysis of clusters and features with explaining factors, we found that the mean depth to groundwater (GP5, Fig. 1b) shows clear negative correlations (P52 (-0.45), RR (-0.44), SB (-0.29), SDdiff (-0.16)) with features describing the variability of hydrographs (e.g. seasonality, flashiness). Such variability is generally damped with increasing depth to groundwater. The complementary case applies to HPD (0.33) and LRec (0.29), which both tend to reach higher values for smoother hydrographs with little short-term variation. A clear relation to the clusters could not be found, though, probably due to the only minor variation of this parameter (70% of the wells < 5 m bgl on average), which makes a meaningful interpretation of the cluster development challenging. We observed only slight tendencies to greater or smaller depths to groundwater for some clusters. Another probable explanation could be that more dominant factors superimpose the effect of the depth to groundwater and are thus more decisive for cluster assignment.
We explored the connection of features and clusters to diffuse groundwater recharge (Pr1a, Fig. 1b) using GWN1000 data (BGR 2019). French wells (190) were excluded because no data are available for them. In accordance with the findings and explanations given for depth to groundwater, we found positive significant correlations for damping-sensitive features (RR (0.26), P52 (0.19), SB (0.07), SDdiff (0.05)). Further, it seems plausible that weak recharge signals correlate with important features for smoother hydrographs (LRec (-0.15), HPD (-0.14)). In agreement with spatial recharge data, we found that clusters showing mainly smooth hydrographs with lower variability (1,5,8,10,16) are connected to lower recharge in the northern URG; clusters showing higher annual periodicity and variability and which occur mainly in the middle part of the URG (2,4,6) are connected to higher recharge. Nonetheless, due to missing data for France, these relations must be interpreted with some caution.
For most of the area east of the Rhine River (Baden-Württemberg), we explored connections to the hydraulic conductivity within the uppermost aquifer (K-values, GP3, Fig. 1b) (LGRB 2007). Due to the spatially limited data, no meaningful correlation can be made with clusters; however, a reasonable number of wells (828) can still be assigned to a specific K-value. Categorical correlation analysis (Spearman) with features yields positive correlations for Skew (0.24) and SDdiff (0.18), probably because high conductivities can be found mainly close to the Rhine River. Similarly, Jumps (0.20) are probably often caused by anthropogenic influences (abstractions, ship locks), which in turn occur preferentially in regions of high conductivities. Other correlations indicate that smoother hydrographs (HPD (-0.34)), long descending hydrograph parts (LRec (-0.23)), boundedness preferentially to an upper bound (Med01 (-0.21)), as well as the yearly maximum during spring (SB (-0.18)) seem to be related to lower hydraulic conductivities for this subset of wells. This might sound counter-intuitive, since flashy behavior is often linked to lower hydraulic conductivities; however, the main reason for flashy behavior in this area is probably the influence of the Rhine River, where high conductivities occur.
The influence of streamflows (DF2, Pr1b/4/5, Fig. 1b) was explored as the general relationships between the distance to the Rhine River and feature values. The results confirm the relation to cluster 3. Further, we found clear relationships for clusters 7 and 9. Clusters 6 and 15 showed a weaker connection, but all of the mentioned clusters show a clear spatial relation to the Rhine River. Nevertheless, they show different dynamics, which supports the plausibility of the results. Clusters 3, 6, and 7 are closely related, but the flashiness of the hydrographs decreases from one to the other. Cluster 9 shows less periodicity than cluster 3, but both are visually similar and match for major dynamic peaks. It remains an open question what causes the different dynamics close to the streamflow. Smaller streamflows also seem to have a significant influence on groundwater, at least in the southern part of our test area (Longuevergne et al. 2007). Hence, we performed a detailed streamflow distance analysis based on the Strahler classes of all streams (Text S5) in the area, derived from the Copernicus EU-HYDRO Dataset (EEA 2017). We obtained similar findings, but found a much stronger influence for cluster 15 and also a slight influence of streamflows on cluster 12.
For most conducted analyses, the correlation values are significant but rather low. This illustrates that there are distinct relations but at the same time also a lot of interactions between the influences. Correlation is nevertheless a good indicator and shows that the features express important properties of the hydrographs and thus are well selected. On the other hand, low correlation also shows that a dynamic-based clustering is even more important, because simply grouping wells according to external factors is clearly not sufficient. Supplement Table S4 and Fig. S66 show a comprehensive overview of all explored correlations (R-values and significance).

Summary and Conclusions
In this work, we present the results of a newly developed semi-automated groundwater hydrograph clustering framework. We group hydrographs based exclusively on their dynamics by describing them with features specifically designed for important dynamic aspects of groundwater hydrographs. Hence, heterogeneous input data can be used, which we confirmed by a high robustness for most of our features, especially towards data gaps. The combination of the DS2L-algorithm with SOM allows automatic determination of the cluster number and great flexibility in terms of cluster size. It further allows the user to determine the level of detail of the clustering result. The application of two SOM-Ensembles helps to remove arbitrariness from the feature selection process, which is a common issue in feature-based clustering. In addition, it makes it possible to obtain robust and practice-oriented results even for groundwater observation networks that are subject to change over time. The combination of these methods therefore creates a solid clustering framework with advantages in terms of (i) making use of heterogeneous data, (ii) operating in a comparatively highly automated manner, still leaving possibilities to adapt to specific dataset characteristics and analysis goals, as well as (iii) obtaining robust, practice-oriented results. By exchanging the describing features, the presented framework is easily transferable to other time series-clustering applications in various domains. For cluster ordering and visualization, we propose the use of a weighted correlation measure (R_W).
The clustering results illustrate the above characteristics well. Similar dynamic patterns are derived from a large data set, which can be used for further processing (e.g. forecasting) and interpretation. Our results also show that the frequently made assumption that nearby wells have more similar dynamics than wells further apart is only partly true, even for wells in the same aquifer. Moreover, in some cases, there are similar dynamic patterns with no clear spatial reference, which makes it important to cluster wells according to their dynamics, rather than according to spatial proximity or common aquifer properties.
We confirmed that groundwater dynamics are a complicated interaction of highly diverse factors, some of which are hard to determine or are even poorly understood. This usually makes it very difficult to disentangle the contributions, not to mention the mostly incomplete information on such metadata. We mainly focused on framework development, motivated by the overarching goal of selecting representatives for forecasting purposes, which is why it only lies partly within the scope of this work to improve the understanding of the different factors contributing to groundwater dynamics. Thus, we have comparatively small or almost no variation in geological conditions, aquifer type and similar parameters, which is not the best starting point for a search for such correlations. Nevertheless, we hope that our approach can contribute to this general question, besides the improved system knowledge on a local scale, which a hydrograph grouping itself already provides. This applies especially because studies of groundwater dynamics and their connections to relevant driving forces are still comparatively rare (Giese et al. 2020). To fully exploit the potential of this method in contributing to the improvement of system knowledge, comprehensive data sets of potential influencing factors covering the complete study area should be available. The goal should be to link driving forces directly to features or indices. For this purpose, more systems should be studied to explore many different characteristics and system properties. We also presume that once a better understanding of dynamic-controlling factors is in place, a prediction for ungauged locations may be possible.