Abstract
Over the last 20 years, water resources in Portugal have been affected to the point that water storage has decreased by 20% since 2000. Creating strategies to manage water resources requires a comprehensive understanding of the factors influencing water storage and their effects over time. This study focuses on the evolution of Groundwater Deep Levels (GDL), applying a two-phase trend analysis methodology to examine the dynamic changes in GDL in a series of monitoring wells located in the Central and Southern sectors of the Left Bank of the Cenozoic Tagus-Sado Basin, Portugal. In the first phase of trend analysis, Factorial Analysis of Mixed Data (FAMD) was applied, followed by Hierarchical Classification Analysis (HCA). These techniques enabled us to identify distinct GDL trend profiles and generate interpretative maps illustrating their spatial distribution. In the second phase, the non-parametric Mann–Kendall Analysis (MKA) and Innovative Trend Analysis (ITA) were applied, providing a quantified confirmation of the trend profiles previously detected. These techniques allowed the identification of positive and negative hydrodynamic trends in distinct sections of the Basin: the SE sector is characterized by a significant increase in GDL associated with overexploitation, and the Central sector by a decrease in GDL. Nevertheless, significant depletion effects can result from natural factors such as prolonged droughts, and in certain regions, changes in geological and hydrothermal dynamics, such as Alpine-age faults, graben, and horst structures, may account for these alterations.
Introduction
Groundwater reservoirs play a crucial role in both the environment and the development of our 21st-century society. The preservation of these reservoirs is increasingly challenging due to a range of factors, both natural (such as dry seasons combined with low precipitation) and non-natural (such as over-exploitation, contamination, and climate change). This preservation dilemma has become a prominent global concern. According to the latest report by the Intergovernmental Panel on Climate Change (IPCC), southern European countries are at high risk of extreme drought, as well as of flooding, with a temperature increase equal to or greater than 1.5 °C, defined as a target value not to be exceeded (Pörtner et al. 2022). With these temperature increases, the hydrological cycle will be affected and, consequently, so will the aquifer systems.
In order to find a solution, the initial step involves researching the evolution of groundwater systems throughout time and identifying factors that can either enhance or hinder their effective management.
Analysis of GDL evolution through time is key to understanding the dynamics and the evolutionary conditions of groundwater systems. The GDL consists of the difference between the elevation of the top of the well and the groundwater level. If GDL increases, the groundwater level decreases; if GDL decreases, the groundwater level increases. This gives more direct information on the health of the groundwater system than precipitation and temperature data.
When applying GDL, it is crucial to take two assumptions into consideration. Firstly, GDL time series are influenced by both local and regional conditions and therefore show non-stationary behaviour. It is thus necessary to explore alternative options for their trend analysis. Traditional trend identification and significance tests are restrictive and make assumptions that may not be applicable to the monitoring of hydrodynamic parameters such as GDL. Understanding and determining the factors that influence the spatial and temporal variation of groundwater levels and hydrochemical parameters is essential to outline action and monitoring measures that allow the preservation of this resource.
For reference, many publications report the use of multivariate statistical analysis in the evolution of salinization conditions in coastal areas (Ferchichi et al. 2018; Soltani and Mellah 2023), in the study of water quality (Kantiranis et al. 2017; Wang et al. 2018; De Andrade Costa et al. 2020; Varol 2020; Ghashghaie et al. 2022; Ghaemi and Noshadi 2022; Wali et al. 2022), and in the variation of piezometric levels (Machiwal and Singh 2015; Adamovic et al. 2016; Rabiei et al. 2022). These techniques thus allow the relationships between variables to be analysed comprehensively and quantified (Shiker 2012; Javadinejad et al. 2019a).
Factorial Analysis (FA) is a set of multivariate statistical analysis techniques that helps understand the relationship between variables. By transforming data and simplifying the relationships between variables it is possible to highlight a smaller number of intrinsic characteristics, designated as factors (Pagès 2004). It is, therefore, also known as Common Factor Analysis (CFA) (Seth 2022).
Factors capture much of the information of a set of variables in the dataset by providing an understanding of the underlying concepts of its interrelations. FA can be applied to group the samples in a dataset based on their similarity for a certain characteristic. It can explore deeper factors that may not be evident in the dataset, reducing its multidimensionality. This technique can be handy for exploring relationships in a particular dataset category. The explored concepts or causes may not be immediately evident but may represent peculiarities or certain tendencies that may be difficult to identify or measure in a simpler way. FA is beneficial for the interpretation of groundwater-quality data and for understanding specific hydrogeochemical processes. Many studies applying multivariate statistics and FA in distinct contexts of groundwater study can be found in the related literature (Ruiz et al. 1990; Cerón et al. 2000; Love et al. 2004; Celestino et al. 2018; Krishnan et al. 2019; Panda et al. 2019; Fatahi et al. 2021).
The most widely used FA technique is Principal Component Analysis (PCA), which enables the relationships between quantitative variables to be assessed. There is another technique that simply assesses the relationship between qualitative variables, which is the Multiple Correspondence Analysis (MCA). There are several examples in the literature where factors identified with FA are considered in subsequent analyses, such as regression or unsupervised classification (cluster analysis). K-means Cluster Analysis and Hierarchical Classification Analysis (HCA) are the most common methods considered in this context (Celestino et al. 2018; Oh et al. 2020; Barbosa et al. 2021). However, in the presence of variables that contain both quantitative and qualitative information, it is not possible to use PCA or MCA.
Since this case study falls into this situation, the FAMD technique and its potential were tested. FAMD is an extension of the Factor Analysis (FA) method (Abdi 2003; Adamovic et al. 2016) used to analyse datasets with both continuous and categorical variables. In practical terms, it is a combination of the FA and MCA techniques that identifies the underlying dimensions or factors explaining the observed variation in a set of quantitative and qualitative variables.
In groundwater studies, the Hierarchical Classification Analysis (HCA) method can be combined with FA, PCA or FAMD factors, or used on its own, allowing a clearer grouping of data (Wang et al. 2015; Jiang et al. 2015; Bayo and Lopez-Castellano 2016; Al Naeem et al. 2019; Rao and Chaudhary 2019; Subba et al. 2019; Visbal-Cadavid et al. 2020; Rahbar et al. 2020; Batdelger et al. 2023).
The other assumption is that GDL trends may change significantly through time because they depend on distinct factors that cause variations in hydrodynamic conditions. Some of these causes may act independently in time, while others may be part of complex combined trending effects. In these cases, verification of positive or negative slope tendencies may be difficult or even impossible using simple linear approaches. In geosciences, parametric methods are widely used, although the data under analysis often do not meet the necessary normality criteria.
In hydrological studies it is incorrect to assume that the data are stationary and serially independent, as this does not correspond to reality (Helsel 1987; Anderson 2008; Mumby 2002; Riaz et al. 2016; Mirabbasi et al. 2020). Furthermore, parametric methods are highly sensitive to the presence of outliers in the data series, unlike non-parametric methods (Mirabbasi et al. 2020). Thus, it is natural to resort to non-parametric methods such as Mann–Kendall Analysis (MKA) and Şen's T Analysis, also known as Innovative Trend Analysis (ITA). According to Şen (2012, 2014), ITA is proposed as an improvement on the classical MKA, and there are some differences between the two techniques.
In MK analysis it is assumed that time series are independent and have no serial correlation. ITA considers the opposite situation: serial correlations in small series are usually significant for the interpretation of the phenomena as a whole (Şen 2017).
Like MK, ITA allows the identification of monotonic and holistic trends. However, ITA is a graphical method that allows trend patterns to be distinguished for low, medium, and high values in the data, and it therefore applies to non-stationary datasets (Şen 2012, 2014, 2017). This method was originally developed for hydro-climatological time series and has been increasingly used because of the non-stationarity, high magnitude and variability that are more and more often detected in distinct types of climate and environmental time series. Examples of its applicability can be found in recent works (Caloiero 2019; Javadinejad et al. 2019b; Alifujiang et al. 2020; Güçlü 2020; Minea et al. 2020; Achite et al. 2021; Zakwan 2021; Buri et al. 2022; Swain et al. 2022; Umar et al. 2022). In addition, Şen et al. (2019) proposed another innovative method, called Innovative Polygon Trend Analysis (IPTA), which has been used in several studies and investigations in Hydrogeology (Caloiero et al. 2018; Sanikhani et al. 2018; Kuriqi et al. 2020; Harkat and Kisi 2021; Ahmed et al. 2022). In this method, polygon patterns are obtained using information such as the mean, minimum, maximum, standard deviation or skewness parameters of the data at different time scales (daily, monthly, etc.) (Akçay et al. 2022). Compared to ITA, this method allows the information associated with one year to be represented and retains the trend, the magnitude and the slope of trend transitions between successive segments (e.g., days, months) (Akçay et al. 2022).
In the present study, given these two characteristics of GDL trends, FAMD was applied first to identify the temporal evolution of each well and to verify the existence of depletion or rising effects in the groundwater levels and their spatial distribution along the basin. After that, the results were combined with HCA to group the wells accordingly. Lastly, the ITA, IPTA and MK algorithms were applied to obtain a synthesis of the positive and negative trends of the GDL dataset, which are a direct consequence of distinct hydrodynamic conditions. Examples of the application of ITA to groundwater level data series can be found in Minea et al. (2020) and Zakwan (2021).
Materials and methods
Geological and hydrogeological characteristics of study area
The study area comprises the Central and Southern sectors of the aquifer systems of the Left Bank of the Tagus-Sado Basin (Portugal, Iberian Peninsula, Fig. 1). The Tagus-Sado Basin is a wide sedimentary basin formed by Cenozoic and Quaternary sediments. It consists of a long depression with NE-SW direction, limited to the W and N by Mesozoic formations and to the NE, E and SE by the Paleozoic substrate (Almeida et al. 2000).
The studied wells tap groundwater from the Upper Miocene, which includes various types of aquifers (semi-confined and confined systems). It is a complex multilayer aquifer system (Costa 1994), mainly composed of alternating sedimentary sequences of continental and marine facies resulting from transgression and regression phenomena controlled by the reactivation of graben and horst structures during the Alpine orogeny. These promote different hydrogeological behaviours and high spatial variability, thus allowing the creation of free, semi-confined and confined aquifer systems (Costa 1994; Fernandes and Silva 1998; Simões 1998; Simões and Legoinha 2014). This system is connected vertically with the free aquifer of Quaternary Pliocene–Pleistocene age (Zbyszewski et al. 1976; Antunes 1983; Manuppella et al. 1999; Pais et al. 2006), which is mainly constituted by fluvial deposits from deltaic systems associated with the last regression of the basin; in some parts of the basin these deposits reach depths of about 800 m. Most of the fault structures in the basin area are associated with the tectonic inversion of the Lusitanian Basin (Mesozoic) and the Paleozoic basement, as a result of the convergence of the African continent relative to the Iberian continental block (Kullberg et al. 2000).
In the North sector, the basin was controlled by structures with NE-SW direction (Carvalho et al. 1983–85; Cunha 1992). However, one of the most important structures, the Pinhal Novo fault, has a general NNW-SSE orientation and covers a wide zone of deformation (about 1.5 km) (Correia 2017), in which it shows a pattern of branching and anastomosing faults.
In the South sector, the basin was controlled by the reactivation of the late-Hercynian faults of the Iberian Pyrite Belt, which have a general NW–SE orientation, showing cleavage and folding planes compatible with a strong vergence to SW (Matos 2021). During this time, the Paleozoic basement was fragmented by Alpine and/or late-Variscan faults with NNE-SSW, NE-SW, NW–SE and E-W orientations, creating stratigraphic variations between the North and South sectors of the basin, as well as a differentiation from the other aquifer systems associated with the Tagus-Sado Basin (Oliveira et al. 1998, 2001).
Drainage of the basin occurs mainly at depth, and recharge of the aquifer system depends mostly on precipitation: water infiltrates into the soil and follows descending and lateral flow paths, which then communicate with the water lines. The flow is conditioned by the two main water lines (the Tagus River to the North and the Sado River to the South), where the discharge of the system occurs. In the coastal sectors, the flow goes towards the Atlantic Ocean (Almeida et al. 2000). There are other types of flow, such as local flows, whose discharge areas are the adjacent water lines and whose recharge areas are the interfluves. In these flows, descending and lateral flow directions predominate. Intermediate groundwater flows, associated with one or more basins of the main tributaries, may also occur (Almeida et al. 2000).
Data set information
The selected statistical techniques were applied to understand common characteristics underlying the GDL trends of a set of wells that are used for public monitoring purposes by the Portuguese National Water Resources Information System Entity, SNIRH (“Sistema Nacional de Informação de Recursos Hídricos”, Fig. 2).
The wells were selected following rigorous preliminary work. Those considered for the study are the wells that unambiguously represent the chronostratigraphic units selected for the case study and that have data series sufficiently representative of the chosen study period, from 1999 to 2021. They are listed in Table 1.
The large increases in GDL found in some monitoring wells, particularly in the South sector of the basin, were the initial motivation for this study (Fig. 3). To gain a comprehensive understanding of the results, the climatological evolution of the area between 2000 and 2020 was also analysed; it is shown in Fig. 4.
The data were collected from the Comporta, Moinhola, and Monte de Caparica weather stations (SNIRH). For an initial understanding of the climatological evolution between years, data were analysed every five years from the hydrological year 2000 onwards. These data show some evidence of a decrease in precipitation in the South area, according to the Comporta weather station.
At the Monte de Caparica and Moinhola stations, precipitation is decreasing, although with some fluctuation. However, the average temperature shows an erratic tendency and is not correlated with the precipitation rate. This suggests that other factors have contributed to this decrease over the last 20 years.
To understand how these tendencies manifest over time, for each technique, the selections of wells were considered according to the relevance of the information and the quantity of data.
Multivariate analysis – factorial analysis of mixed data
FAMD consists of a multivariate factorial analysis technique in which it is assumed that there is no dependency between the variables (independent variables), focusing simply on the relationship between the data.
The method combines the principles of Factorial Analysis (FA) with Multiple Correspondence Analysis (MCA). Data are first transformed using MCA to create a set of synthetic variables that capture the underlying structure of the categorical variables. Then, FA is applied to the synthetic and continuous variables to extract the common factors that explain the variance in the dataset.
Quantitative variables are scaled to unit variance, while qualitative variables are transformed into a disjunctive data table (Husson et al. 2017, 2020). Thus, this technique allows data to be analysed and the balance and influence of the two types of variables to be assessed (Abdi 2003; Pagès 2004; Adamovic et al. 2016).
This method provides a comprehensive analysis of the dataset by uncovering the relationship among the variables and identifying the most important factors that drive the variation.
According to Abdi (2003), Pagès (2004) and Adamovic et al. (2016), FAMD can be applied: (a) to data with few qualitative variables compared to quantitative variables, (b) when the number of individuals in the population under study is generally low.
In the analysis of results, the representation of individuals in the data population is performed directly from factors, as a projection on the first two dimensions, where quantitative variables are represented through the circle of correlations associated with the PCA analysis and qualitative variables are represented in the same way as in MCA, in which the categorical variable is represented at the centroid of the individual who has it (Adamovic et al. 2016).
This procedure yields the “cos2” indicators, which determine the representativeness of a variable. They correspond to the squared cosine of the angle between the vector issued from the position of the variable and its projection on the axis (Husson et al. 2017, 2020; Lê et al. 2008). “cos2” values are close to 1 when a variable is well represented in the projection. According to Adamovic et al. (2016), after performing this technique it is usually advisable to proceed with HCA to complete the classification of individuals into groups that represent distinct well trend patterns.
Regarding the mathematical formulation, according to Audigier et al. (2016), the first step of FAMD consists of coding the categorical variables using the indicator matrix of dummy variables. Let \(I\) be the number of individuals, \({K}_{1}\) the number of continuous variables, \({K}_{2}\) the number of categorical variables, and \(K={K}_{1}+{K}_{2}\) the total number of variables. The data are gathered in a matrix \({X}_{I\times J}\), where \({({x}_{j})}_{1\le j\le {K}_{1}}\) are continuous variables and \({({x}_{j})}_{{K}_{1}+1\le j\le J}\) are dummy variables. The total number of columns is \(J={K}_{1}+{\sum }_{k={K}_{1}+1}^{K}{q}_{k}\), where \({q}_{k}\) is the number of categories of the variable \(k\).
In FAMD, PCA is applied to the matrix \(\left(X-M\right){D}_{\Sigma }^{-\frac{1}{2}}\), where \({M}_{I\times J}\) is the matrix with each row being the vector of the means of each column of \(X\) and \({D}_{\Sigma }\) is the diagonal matrix \(diag\left({\sigma }_{{x}_{1}}^{2},\dots ,{\sigma }_{{x}_{{K}_{1}}}^{2},{p}_{{K}_{1}+1},\dots ,{p}_{j},\dots ,{p}_{J}\right)\), with \({\sigma }_{{x}_{j}}\) being the standard deviation of the continuous variable \({x}_{j}\) and \({p}_{j}\) being the proportion of individuals in the category \(j\) \((j={K}_{1}+1,\dots ,J)\). The matrix \({D}_{\Sigma }\) is the metric used to compute distances between rows. The loss function (known as the reconstruction error) that PCA minimizes for the matrix \(X\) is:
Thus, FAMD can be defined as minimizing:
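Both criteria can be written in the usual low-rank form. The expressions below are a sketch built from the definitions in this section and from the standard PCA formulation. The symbols \(U\), \(\Lambda\) and \(V\) (rank-\(S\) left singular vectors, squared singular values and right singular vectors) are assumed notation, chosen to match the SVD described in the text.

```latex
% PCA: reconstruction error of the rank-S approximation of X
\[
\left\| X - M - U \Lambda^{1/2} V^{\top} \right\|^{2}
\]
% FAMD: the same criterion applied to the weighted matrix
\[
\left\| (X - M)\, D_{\Sigma}^{-1/2} - U \Lambda^{1/2} V^{\top} \right\|^{2}
\]
```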
FAMD provides the best low-rank \((S<\left(J-{K}_{2}\right))\) approximation of the matrix \(\left(X-M\right){D}_{\Sigma }^{-\frac{1}{2}}\) in the least squares sense. The solution is given by the singular value decomposition (SVD) of the matrix \(\left(X-M\right){D}_{\Sigma }^{-\frac{1}{2}}\), with \({\widehat{U}}_{I\times S}\) the left singular vectors and \({\widehat{V}}_{J\times S}\) the right singular vectors associated with the \(S\) largest singular values gathered in the matrix \({\widehat{\Lambda }}_{S\times S}^{1/2}=diag\left(\sqrt{{\widehat{\lambda }}_{1}},\dots ,\sqrt{{\widehat{\lambda }}_{S}}\right)\). Notice that the maximum number of non-null eigenvalues is \(\left(J-{K}_{2}\right)\) because of the linear restrictions on the columns for the categorical variables (the row sum for each variable equals 1).
The specific weighting implies that the distances between two individuals in the initial space, \(i\) and \(i^{\prime}\) (before approximating the distances by keeping the first \(S\) dimensions obtained from the SVD), is:
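With the weights \(1/{\widehat{\sigma }}_{{x}_{k}}^{2}\) and \(1/{p}_{j}\) discussed in this section, the squared distance can be sketched as follows. This is a reconstruction from the stated weighting, not a verbatim copy of the original display; the index conventions (\(k\) for continuous variables, \(j\) for dummy columns) are assumptions.

```latex
\[
d^{2}(i, i') \;=\; \sum_{k=1}^{K_{1}} \frac{\left(x_{ik} - x_{i'k}\right)^{2}}{\widehat{\sigma}_{x_{k}}^{2}}
\;+\; \sum_{j=K_{1}+1}^{J} \frac{1}{p_{j}} \left(x_{ij} - x_{i'j}\right)^{2}
\]
```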
Weighting by \(\frac{1}{{\widehat{\sigma }}_{{x}_{k}}^{2}}\) ensures that units of continuous variables do not influence the (square) distance between individuals. Weighting by \(\frac{1}{{p}_{j}}\) unequivocally establishes that two individuals who belong to different categories for the same variable are significantly more distant from each other when one of them is in a rare category compared to when both of them belong to frequent categories. The marginal frequencies of the categorical variables play a crucial role in this method. Categories with a small frequency have a greater inertia than the others, and consequently, rare categories have a greater influence on the construction of the principal components. The specific weighting also implies that, in FAMD, the principal components maximize the associations with both continuous and categorical variables. More precisely, the first principal component \({f}_{1}\), maximizes:
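Given that the association is measured by \({R}^{2}\) for continuous variables and by \({\eta }^{2}\) for categorical variables, the criterion can be sketched as below. The arguments of \({R}^{2}\) and \({\eta }^{2}\) are written out here as an assumption about the original notation.

```latex
\[
\sum_{k=1}^{K_{1}} R^{2}\!\left(f_{1}, x_{k}\right) \;+\; \sum_{k=K_{1}+1}^{K} \eta^{2}\!\left(f_{1}, z_{k}\right)
\]
```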
With \({({z}_{k})}_{k={K}_{1}+1,\dots ,K}\) the categorical variables. The first principal component is the synthetic variable that has the highest correlation with both the continuous variables (measured by the coefficient of determination, \({R}^{2}\)) and the categorical variables (measured by the squared correlation ratio, \({\eta }^{2}\)). The second principal component is the synthetic variable that maximizes the criterion among variables orthogonal to the first principal component, and so on for the remaining principal components.
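The weighting described above can be illustrated with a minimal Python sketch. It only builds the centred, weighted table \(\left(X-M\right){D}_{\Sigma }^{-\frac{1}{2}}\) and compares squared distances between individuals; it does not perform the SVD step. The function names, the toy data and the use of the population standard deviation are illustrative assumptions, not the implementation used in the study.

```python
from math import sqrt


def famd_weighted_rows(quant, qual):
    """Return the rows of (X - M) D^{-1/2} for a small mixed dataset.

    quant: list of rows with continuous values (I x K1)
    qual:  list of rows with category labels   (I x K2)
    """
    n = len(quant)
    rows = [[] for _ in range(n)]
    # Continuous variables: centre and divide by the standard deviation
    # (population form, dividing by I; an assumption of this sketch).
    for k in range(len(quant[0])):
        col = [r[k] for r in quant]
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i, v in enumerate(col):
            rows[i].append((v - mean) / sd)
    # Categorical variables: dummy coding, centred by the category
    # proportion p_j and divided by sqrt(p_j).
    for k in range(len(qual[0])):
        col = [r[k] for r in qual]
        for cat in sorted(set(col)):
            p = col.count(cat) / n
            for i, v in enumerate(col):
                dummy = 1.0 if v == cat else 0.0
                rows[i].append((dummy - p) / sqrt(p))
    return rows


def squared_distance(a, b):
    """Squared Euclidean distance between two weighted rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy data: one continuous and one categorical variable.
quant = [[1.0], [1.0], [1.0], [2.0], [3.0], [4.0]]
qual = [["common_a"], ["common_b"], ["rare"],
        ["common_a"], ["common_b"], ["common_a"]]
rows = famd_weighted_rows(quant, qual)
d_common = squared_distance(rows[0], rows[1])  # two frequent categories
d_rare = squared_distance(rows[0], rows[2])    # one rare category involved
```

In this toy case the first three individuals share the same continuous value, so their distances come only from the categorical part, and the pair involving the rare category ends up farther apart, as the text describes.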
Multivariate analysis – hierarchical classification analysis
HCA was carried out in support of the FAMD results. It is a method of clustering data based on the similarity of the variables or cases.
In the case of FA, HCA can be used to identify groups of variables or cases that share similar factor scores. This can be particularly useful in identifying patterns or subgroups within a dataset that might not be immediately apparent from the initial FA.
Unsupervised classification is a technique used to determine homogeneous groups, defined as clusters, whose data presents similarities among themselves in a population. For this purpose, it includes techniques that use successive iterative processes, which process all objects in a set through divisive or agglomerative methods.
Agglomerative methods start with each case or variable as its own cluster and then merge clusters together based on their similarity, while divisive methods start with all cases or variables in one cluster and then split them into smaller clusters based on their dissimilarity. This process ends when all objects are processed (Almeida et al. 2007).
The choice of method will depend on the characteristics of the dataset and the research question being addressed. The HCA results are often presented in a dendrogram, where each linking step in the clustering process is represented by a connecting line (Smoliski et al. 2002; Granato et al. 2018). One common approach to conducting HCA in the context of Factorial Analysis is to use Ward's method, as in our case-study, which aims to minimize the sum of squared distances between the cases or variables within each cluster.
Other methods, such as single linkage or complete linkage, can also be used, depending on the research question and the characteristics of the dataset.
According to Nielsen (2016), this method evaluates all potential pairs of groups and merges the pair for which the increase in the sum of squared differences between each object and the centroid, comparing before and after the merger, is the smallest. The Ward linkage function is characterized as follows: to merge \({X}_{i}\) \(({n}_{i}=\left|{X}_{i}\right|)\) with \({X}_{j}\) \(({n}_{j}=\left|{X}_{j}\right|)\), consider the following Ward criterion:
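In its usual form, which is consistent with the centroid definition and the half squared Euclidean distance mentioned in this section, the criterion reads as below. This is offered as a sketch of the standard Ward criterion, assuming \(\Delta\) denotes the subset distance referred to in the text.

```latex
\[
\Delta\!\left(X_{i}, X_{j}\right) \;=\; \frac{n_{i}\, n_{j}}{n_{i} + n_{j}} \left\| c\!\left(X_{i}\right) - c\!\left(X_{j}\right) \right\|^{2}
\]
```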
where \(c\left({X}^{\prime}\right)\) denotes the centroid of the subset \({X}^{\prime}\subseteq X\): \(c\left({X}^{\prime}\right)=\frac{1}{\left|{X}^{\prime}\right|}{\sum }_{x\in {X}^{\prime}}x\). Observe that the distance between two elements induced from the subset distance \(\Delta\) is merely half of the squared Euclidean distance:
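For two singletons (\({n}_{i}={n}_{j}=1\)) the standard Ward criterion reduces to the expression below, shown here as a worked special case under that assumption.

```latex
\[
\Delta\!\left(\{x_{i}\}, \{x_{j}\}\right) \;=\; \frac{1 \cdot 1}{1 + 1} \left\| x_{i} - x_{j} \right\|^{2} \;=\; \frac{1}{2} \left\| x_{i} - x_{j} \right\|^{2}
\]
```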
According to the equation in each iteration, the centroid of each group is calculated, and the sum of squared error (SSE) is calculated as the Euclidean distance between each object and the centroid of the group. Then, for all possible pairwise groupings, the hypothetical centroid resulting from the merger of the two groups is calculated, along with the sum of the Euclidean distances from each object to the hypothetical centroid. For each evaluated group merger, the value resulting from the sum of the two distances calculated separately is compared with the distance calculated already as a group (it is always an increment), and the similarity matrix is updated with these increments. Once the matrix is fully populated, the pair of groups leading to the smallest increment is merged. At the beginning of the process, each object is a group with zero dispersion, so the sum of dispersions for all groups is zero. In the end, only one group will remain, so the total dispersion is the sum of the squares of the differences between each value and the mean of the respective variable. The distance recorded in the dendrogram is the sum of the dispersions of the involved groups.
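The merging procedure described above can be sketched in a few lines of Python. This is a naive illustration that uses the standard Ward increment \(\frac{{n}_{i}{n}_{j}}{{n}_{i}+{n}_{j}}{\Vert c({X}_{i})-c({X}_{j})\Vert }^{2}\) and re-evaluates all pairs at each step; the function names and the toy points are hypothetical, and it is not the software used in the study.

```python
def centroid(points):
    """Mean vector of a list of equally sized tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_increment(a, b):
    """Increase in within-group dispersion caused by merging groups a and b."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_clustering(points, n_clusters):
    """Agglomerate points until n_clusters groups remain.

    Returns the final groups and the list of increments, one per merger.
    """
    clusters = [[tuple(p)] for p in points]  # each object starts as a group
    increments = []
    while len(clusters) > n_clusters:
        best = None
        # Evaluate every pairwise merger and keep the smallest increment.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                inc = ward_increment(clusters[i], clusters[j])
                if best is None or inc < best[0]:
                    best = (inc, i, j)
        inc, i, j = best
        increments.append(inc)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, increments


# Hypothetical factor scores for five wells on two dimensions.
scores = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 50)]
groups, increments = ward_clustering(scores, n_clusters=3)
```

The quadratic search over all pairs is kept for readability; for real datasets a library routine with an efficient update of the increments would normally be preferred.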
Innovative trend analysis and innovative polygon trend analysis
ITA (Jiang et al. 2015) divides the time series into two equal subseries, the first and second halves of the record, each of which is then sorted separately in ascending order. The subseries are then plotted against each other, the \({X}_{i}\) subseries on the \(X\) axis and the \({X}_{j}\) subseries on the \(Y\) axis. With the subseries defined, the arithmetic mean of each subseries is obtained, allowing the slope value to be determined according to Eq. (1), where \(n\) is the total number of data in the time series.
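In the form given by Şen (2017), the slope is computed from the two subseries means as below. The symbols \(\overline{{X}_{1}}\) and \(\overline{{X}_{2}}\) follow the notation used in the next paragraph; the exact typesetting of Eq. (1) in the source is assumed.

```latex
\[
s \;=\; \frac{2\left(\overline{X_{2}} - \overline{X_{1}}\right)}{n}
\]
```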
If \(E\)(\(\overline{{X }_{1}}\)) = \(E\)(\(\overline{{X }_{2}}\)), then \(E\left(s\right)=0\), and the centroid of the trend line falls on the 1:1 line, indicating that there is no trend. This corresponds with the null hypothesis, \({H}_{0}\), where it is assumed that there is no trend if the calculated slope value, \({S}_{a}\), remains below a critical value, \({S}_{cr}\). Otherwise, an alternative hypothesis is applied, \({H}_{a}\), in which \({S}_{a}\)>\({S}_{cr}\) (Şen 2017). In this case, if \({X}_{1}\)>\({X}_{2}\), there is a negative trend, that is, there is a decrease, and if \({X}_{1}\)<\({X}_{2}\) there is a positive trend, that is, there is an increase (Şen 2017).
This can also be verified by the variance \({V(S}_{a}\)), in which,
In this situation the null hypothesis \({H}_{0}\) assumes that \({V(S}_{a})=E\left({S}_{a}^{2}\right)\) according to Eq. (3).
When \(E\left({\overline{{X }_{2}}}^{2}\right)= E\left({\overline{{X }_{1}}}^{2}\right)\), the variance is equal to:
The correlation coefficient between two values is given according to its autocorrelation which is equal to:
where \({S}_{{\overline{X} }_{1}}={S}_{{\overline{X} }_{2}}=S/\sqrt{n}\). Hence, the equation can be written as:
where the correlation coefficient corresponds to the arithmetic means between the two subseries. Thus, the standard deviation of the sampling slope value can be obtained from Eq. (7).
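The chain of expressions referred to in this derivation can be sketched as follows. It is reconstructed from the steps stated in the text (zero expected slope under \({H}_{0}\), equal second moments, \({S}_{{\overline{X} }_{1}}={S}_{{\overline{X} }_{2}}=S/\sqrt{n}\)) and from the formulation in Şen (2017). The symbol \(\rho\) for the correlation coefficient between the two subseries means is an assumption, and the equation numbers of the original are not reproduced.

```latex
\begin{align*}
V(S_{a}) &= \frac{4}{n^{2}} \left[ E\!\left(\overline{X_{2}}^{2}\right) - 2\,E\!\left(\overline{X_{2}}\,\overline{X_{1}}\right) + E\!\left(\overline{X_{1}}^{2}\right) \right] \\
V(S_{a}) &= \frac{8}{n^{2}} \left[ E\!\left(\overline{X_{2}}^{2}\right) - E\!\left(\overline{X_{2}}\,\overline{X_{1}}\right) \right] \\
\rho &= \frac{E\!\left(\overline{X_{2}}\,\overline{X_{1}}\right) - E\!\left(\overline{X_{2}}\right) E\!\left(\overline{X_{1}}\right)}{S_{\overline{X}_{2}}\, S_{\overline{X}_{1}}} \\
V(S_{a}) &= \frac{8}{n^{2}} \, \frac{S^{2}}{n} \left(1 - \rho\right) \\
\sqrt{V(S_{a})} &= \frac{2\sqrt{2}}{n\sqrt{n}} \; S \, \sqrt{1 - \rho}
\end{align*}
```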
Under these circumstances, the third-order moment of the slope variable is also equal to zero, and the same is true for all odd-order moments. For this reason, the slope follows a normal (Gaussian) Probability Distribution Function (PDF) with zero mean and the standard deviation given above, which is the basic criterion of this method (Almazroui and Şen 2020). As previously mentioned, this technique makes it possible to distinguish between “low”, “medium” and “high” series values. The data can be individualized and characterized into 9 subareas, according to Fig. 5, with the following interpretations (Almazroui and Şen 2020):
1. Trend lines running parallel to the 1:1 (45º) line imply a constant increase or decrease (B and C), while partial lines (D, E, I and L) encompass only part of the ratings ("low", "medium" or "high"). If the centroid is on the 1:1 (45º) line, then there is no trend;
2. Trend lines that are not parallel to the 1:1 (45º) line (F, G, H, J, K or M) imply a change in the deviation pattern over time;
3. Straight lines F and G have trends in standard deviation but not in arithmetic mean;
4. Trend lines H (K) and J (M) imply an increasing (decreasing) standard deviation trend.
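The ITA steps described in this section can be sketched in Python as follows. The sketch assumes that the series is split chronologically into two halves, that an odd trailing value is dropped, and that the correlation coefficient is estimated between the two sorted halves; these choices, the function names and the returned dictionary are illustrative assumptions rather than the procedure used in the study.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def innovative_trend_analysis(series):
    """Split, sort and compare the two halves of a time series."""
    n = len(series) - len(series) % 2       # drop last value if length is odd
    half = n // 2
    first = sorted(series[:half])           # plotted on the X axis
    second = sorted(series[half:n])         # plotted on the Y axis
    slope = 2.0 * (mean(second) - mean(first)) / n
    rho = pearson(first, second)
    # Standard deviation of the slope under the no-trend hypothesis.
    sigma_s = (2.0 * sqrt(2.0) / (n * sqrt(n))) * pstdev(series[:n]) \
        * sqrt(max(0.0, 1.0 - rho))
    # Points above (below) the 1:1 line indicate increase (decrease).
    above = sum(1 for x, y in zip(first, second) if y > x)
    below = sum(1 for x, y in zip(first, second) if y < x)
    return {"slope": slope, "sigma_s": sigma_s, "above": above, "below": below}


# Hypothetical GDL-like series rising by one unit per time step.
rising = innovative_trend_analysis(list(range(1, 11)))
falling = innovative_trend_analysis(list(range(10, 0, -1)))
```

For a GDL series, a positive slope means increasing depth to water, that is, a falling groundwater level. Comparing the slope with a critical value derived from `sigma_s` would complete the significance test, which is left out of this sketch.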
According to Şan et al. (2021), Şen et al. (2019) proposed the IPTA graphic method and revealed that it has certain advantages compared to other traditional trend methods in specific areas, such as the agricultural field and in hydrological studies. The ability to identify trends in a sequence is one of its significant characteristics that provides a highly productive basis for better linguistic and numerical interpretation and deduction.
If IPTA is applied to monthly data, according to Şen et al. (2019) and Şan et al. (2021), five processing steps are required. Step (1): monthly (e.g. Jan., Feb., Mar…) time series is divided into two equal periods. Step (2): basic statistics (e.g. mean, max) or desired criteria (e.g. uncertainty) for each month are calculated in both periods. Step (3): the first (second) period is placed on the horizontal (vertical) axis in the scatter chart, and 12 points are marked for 12 months. Step (4): the points of consecutive months are connected by straight lines forming a polygon (Fig. 2). Step (5): the slope and size of the line between consecutive points are calculated.
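For two consecutive points \(A({x}_{1},{y}_{1})\) and \(B({x}_{2},{y}_{2})\) of the polygon, the slope and length in Step (5) take the usual Cartesian form below. The point labels \(A\) and \(B\) are assumed from the notation \(\left|AB\right|\) used in the text.

```latex
\[
s \;=\; \frac{y_{2} - y_{1}}{x_{2} - x_{1}}, \qquad
\left| AB \right| \;=\; \sqrt{\left(x_{2} - x_{1}\right)^{2} + \left(y_{2} - y_{1}\right)^{2}}
\]
```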
For a segment AB joining two consecutive points, these quantities are:
\[{\varvec{s}}=\frac{{{\varvec{y}}}_{2}-{{\varvec{y}}}_{1}}{{{\varvec{x}}}_{2}-{{\varvec{x}}}_{1}}\]
\[\left|{\varvec{A}}{\varvec{B}}\right|=\sqrt{{\left({{\varvec{x}}}_{2}-{{\varvec{x}}}_{1}\right)}^{2}+{\left({{\varvec{y}}}_{2}-{{\varvec{y}}}_{1}\right)}^{2}}\]
where \({\varvec{s}}\) is the trend slope, \(\left|{\varvec{A}}{\varvec{B}}\right|\) is the trend length, \({{\varvec{x}}}_{1}\) and \({{\varvec{x}}}_{2}\) are the values of two consecutive points in the first period (horizontal axis), and \({{\varvec{y}}}_{1}\) and \({{\varvec{y}}}_{2}\) are the values of the same two points in the second period (vertical axis). In the Cartesian coordinate system, a 1:1 (45°) line is drawn; points below (above) the line represent a decreasing (increasing) trend (Şen 2012). The straight lines connecting the points provide information about the changes between successive months. If the slopes of the lines between consecutive months differ widely from each other, the contribution of the changes between months to the average change in the hydrometeorological series is significant, and vice versa (Şan et al. 2021).
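The five IPTA steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `ipta`, the input layout (a dictionary mapping month number 1 to 12 to a chronological list of yearly values) and the handling of vertical segments are hypothetical choices, and the mean is used as the monthly statistic although any other statistic could be passed in.

```python
from math import hypot
from statistics import mean


def ipta(monthly, stat=mean):
    """Build the IPTA polygon from monthly data.

    monthly: dict {month (1..12): [value_year1, value_year2, ...]}
    Returns the 12 polygon vertices and, for each edge between consecutive
    months (Dec is joined back to Jan), its length and slope.
    """
    points = []
    for month in range(1, 13):
        values = monthly[month]
        half = len(values) // 2                    # step 1: two equal periods
        x = stat(values[:half])                    # step 2: statistic, first period
        y = stat(values[half:2 * half])            # step 2: statistic, second period
        points.append((x, y))                      # step 3: one point per month

    segments = []
    for k in range(12):                            # step 4: connect consecutive months
        (x1, y1), (x2, y2) = points[k], points[(k + 1) % 12]
        length = hypot(x2 - x1, y2 - y1)           # step 5: trend length |AB|
        # step 5: trend slope s; a vertical edge is reported as infinite (assumption)
        slope = (y2 - y1) / (x2 - x1) if x2 != x1 else float("inf")
        segments.append((length, slope))
    return points, segments


# Hypothetical data: every month is 1 unit deeper in the second period
monthly = {m: [m, m, m + 1, m + 1] for m in range(1, 13)}
points, segments = ipta(monthly)
increasing_months = [m + 1 for m, (x, y) in enumerate(points) if y > x]
```

Months whose vertex lies above the 1:1 line (here, all twelve) show an increase between the two periods, while the edge lengths and slopes describe how strongly the change differs from one month to the next.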
Mann–Kendall analysis
According to several authors (Hamed 2008; Güçlü 2020; Mirabbasi et al. 2020), in the MKA developed by Mann (1945) and revised by Kendall in 1948 (NLJ 1948), the trend statistic is calculated according to Eqs. (8) and (9):
\[\mathrm{sgn}\left({x}_{j}-{x}_{i}\right)=\begin{cases}+1 & \text{if } {x}_{j}-{x}_{i}>0\\ 0 & \text{if } {x}_{j}-{x}_{i}=0\\ -1 & \text{if } {x}_{j}-{x}_{i}<0\end{cases}\tag{8}\]
\[S=\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\mathrm{sgn}\left({x}_{j}-{x}_{i}\right)\tag{9}\]
Each pair of values in a data series is classified in one of three subsets (-1, 0, 1) according to the sign of their difference (Eq. (8)). In Eq. (9), \({x}_{i}\) and \({x}_{j}\) are the data values at times \(i\) and \(j\), and \(n\) is the dataset length. A positive S value indicates an increasing trend; otherwise, it indicates a decreasing trend. When the series length is greater than 10 (n > 10), the distribution of S approaches a normal distribution with a mean equal to zero and a variance [\(Var\left(S\right)]\) given by Eq. (10):
\[Var\left(S\right)=\frac{n\left(n-1\right)\left(2n+5\right)-C}{18}\tag{10}\]
where \(C\) is a correction factor for the variance. When the data series contains tied (repeated) values, the parameter \(C\) is calculated by the following formula:
\[C=\sum_{i=1}^{m}{t}_{i}\left({t}_{i}-1\right)\left(2{t}_{i}+5\right)\]
where \(m\) represents the number of tied groups and \({t}_{i}\) is the number of tied data in group \(i\). If there are no tied groups, this step is skipped (\(C=0\)). After calculating the variance of the series data, the standard Z value is calculated according to the following equation:
\[Z=\begin{cases}\dfrac{S-1}{\sqrt{Var\left(S\right)}} & \text{if } S>0\\ 0 & \text{if } S=0\\ \dfrac{S+1}{\sqrt{Var\left(S\right)}} & \text{if } S<0\end{cases}\]
In the MK analysis, the null hypothesis \({H}_{0}\) represents “no significant trend in the time series”, and it is accepted at the significance level \(\alpha\) if:
\[\left|Z\right|\le {Z}_{1-\alpha /2}\]
If this condition is not verified, \({H}_{0}\) is rejected, and the alternative hypothesis (that is, the existence of a significant trend) is accepted (Dinpashoh et al. 2014).
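The test described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: it implements the standard textbook form of the Mann–Kendall test with the usual tie correction and a two-sided normal approximation; the function name `mann_kendall` and the default significance level of 0.05 are hypothetical choices.

```python
from collections import Counter
from math import erfc, sqrt


def mann_kendall(x, alpha=0.05):
    """Standard Mann-Kendall trend test with tie correction.

    Returns (S, Var(S), Z, two-sided p-value, reject_H0).
    """
    n = len(x)

    # S statistic: sum of the signs of all pairwise differences x_j - x_i, j > i
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = x[j] - x[i]
            s += (d > 0) - (d < 0)

    # Tie correction term: sum over groups of repeated values
    c = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - c) / 18.0

    # Standardised statistic with continuity correction
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0

    p_value = erfc(abs(z) / sqrt(2.0))   # two-sided p-value under the normal law
    return s, var_s, z, p_value, p_value < alpha


# Hypothetical yearly groundwater depths (m), for illustration only
depths = [12.1, 12.4, 12.3, 12.9, 13.0, 13.4, 13.3, 13.8, 14.2, 14.1, 14.6, 15.0]
s, var_s, z, p_value, significant = mann_kendall(depths)
```

A positive `z` with `significant` equal to `True` corresponds to rejecting the null hypothesis in favour of an increasing trend; with a series of identical values, `s` is zero and the function returns `z = 0.0` without dividing by the (zero) variance.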
In the graphical results of MKA, the information is divided into two lines: the prograde line, which corresponds to the \(Z(S)\) values, and the retrograde line, which represents \(Z^{*}(S)\). A point where the two lines touch marks an inversion of the tendency, which makes it possible to determine when the transition occurs. The arrangement of the lines gives information about the type of tendency and about the confidence level (CL) interval. If the prograde line is above the retrograde line, the tendency is positive, which means an increase in GDL; the opposite arrangement indicates a decrease in GDL. However, for the analysis to be correct, both lines must be inside the 95% CL interval. Where only one of the lines is inside that interval, the tendency is represented by the line inside the 95% CL prior to the inversion. For the null hypothesis to hold, the two lines must touch each other constantly, with a minimal gap between them.
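The text does not name the formulation behind the prograde and retrograde lines. One common way to obtain such a pair of curves is the sequential (Sneyers-type) Mann–Kendall test, and the sketch below assumes that formulation purely for illustration. The function names are hypothetical, and the sign and ordering convention used for the retrograde series is one of several found in practice.

```python
from math import sqrt


def sequential_u(x):
    """Progressive statistic u(t) of the sequential Mann-Kendall test."""
    u = [0.0]                       # undefined for a single value; set to 0
    t = 0
    for k in range(1, len(x)):
        # count earlier values that are smaller than the current one
        t += sum(1 for j in range(k) if x[j] < x[k])
        m = k + 1                   # number of values considered so far
        expected = m * (m - 1) / 4.0
        variance = m * (m - 1) * (2 * m + 5) / 72.0
        u.append((t - expected) / sqrt(variance))
    return u


def prograde_retrograde(x):
    """Forward series, and backward series computed on the reversed data."""
    forward = sequential_u(x)
    backward = [-v for v in sequential_u(x[::-1])][::-1]
    return forward, backward


def crossings(forward, backward):
    """Indices where the two curves cross (sign change of their difference)."""
    diff = [f - b for f, b in zip(forward, backward)]
    return [i for i in range(1, len(diff)) if diff[i - 1] * diff[i] < 0]


# Hypothetical steadily increasing series, for illustration only
series = list(range(1, 11))
forward, backward = prograde_retrograde(series)
change_points = crossings(forward, backward)
```

In this reading, a crossing of the two curves marks a candidate inversion point, and a curve that leaves the ±1.96 band (the 95% level of the standard normal law) indicates a significant tendency.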
Results
FAMD and HCA
Information was analysed between 2006 and 2021 for the monitoring wells defined in Table 2.
In this study, the years represent the quantitative variables and the wells represent the qualitative variables. Figure 6 presents the distribution of the qualitative variables (that is, the monitoring wells) along the Dimension 1 (Dim1) and Dimension 2 (Dim2) axes. It shows the projection of the contribution of each qualitative variable to Dim1 and Dim2 (axis values) together with its classification for the indicator “cos2” (colour gradient scale).
To better understand the projection results and the meaning of each dimension, the well “476/20” was not considered in the initial analysis because of its behaviour: this well shows a very marked downward trend, with a difference in GDL close to 24 m between 2006 and 2021 (Table 2), and behaves as an extreme outlier.
In this analysis, Dim1 and Dim2 represent about 72% of the variance. The distribution of qualitative variables is dispersed, indicating the existence of subpopulations with distinct tendencies in time. Dim1 is conditioned by the monitoring wells with the most irregular trends, from lower depths of groundwater levels (contribution close to -5, well 484/8) to higher depths of groundwater levels (contribution close to 9, well 453/18) (Fig. 6). Monitoring wells with the most constant trends are positioned close to “0” in Dim1, along Dim2 (Fig. 6).
Figure 7 presents the FAMD results considering well “476/20” and two other wells included at a later stage of the analysis (518/30 and 528/16). Despite the outlier behaviour of variable “476/20”, the variance results for Dim1 and Dim2 do not vary significantly. For this reason, and due to the importance of this monitoring well in the analysis, it was decided to consider the FAMD results with the outlier “476/20”.
In these conditions (Fig. 7), the monitoring wells with the highest scores in Dim2 present more accentuated non-constant trend patterns in the depth of groundwater levels. Dim2 is therefore conditioned by well “476/20”, and it ranges from constant stationary trends (negative score contributions in Dim2 and closer to “0” in Dim1) to less constant trends (increasing or decreasing trends, with positive score contributions in Dim2 and further from “0” in Dim1). Again, Dim1 is conditioned by the monitoring wells with the most irregular trends, from lower depths of groundwater levels (contribution of -5.4, well 484/8) to higher depths of groundwater levels (contribution of 9.1, well 458/18) (Fig. 7).
An increasing tendency in the depth of groundwater levels is also observed in the monitoring wells with the highest score contributions for Dim1, as is the case of wells 434/306, 442/94, 443/924, 445/7 and 453/18. The location of these monitoring wells can be observed in Fig. 8, which includes the interpretational kriging maps of the FAMD contribution scores (Table 3) for Dim1 and Dim2 and the subsequent HCA classification of monitoring wells in 4 clusters. These results highlight the subsector of the Basin with the deepest groundwater levels (scores with red and orange colours in Dim1), and the areas with groundwater levels closest to the ground surface (scores with blue and green colours in Dim1).
From the score mapping of Dim2 in Fig. 7, it is also possible to observe a more constant, stationary behaviour in the GDL of the monitoring wells located in the North sector of the Basin, while towards the South sector there is a growing trend to greater oscillations, mainly with an increase in the depth of groundwater levels, as is the case of wells 476/21, 476/20, 484/8, 518/30 and 528/16, which present positive values in Dim2.
Monitoring well 476/20 has the highest increasing tendency in the depth of groundwater levels (more than 20 m in depth). The score map of FAMD results for Dim1 and Dim2 (Fig. 8) gives a good perspective of the distinct GDL trend groups in the aquifer systems between 2006 and 2021.
For a better understanding of the characteristics of the distinct well trend groups, HCA was applied to the contribution score values obtained from FAMD. The data were subjected to an iterative process to find the number of clusters most adequate for the population. From that process, the number of clusters considered was 4 (Table 4). Figure 8 shows the location of the HCA clusters in the study area. HCA applied to the FAMD scores enhances the FAMD results, making it possible to have a finer discretization of the results.
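The clustering step can be illustrated with a small agglomerative procedure applied to two-dimensional scores. This is only a sketch under stated assumptions: the text does not state the linkage criterion, so Ward's minimum-variance criterion is assumed here; the well labels and score values are hypothetical and are not the values of Table 3; and the function names are illustrative.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def ward_cost(a, b):
    """Increase in within-cluster variance if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((p - q) ** 2 for p, q in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def hca(scores, n_clusters):
    """Agglomerative clustering of {label: (dim1, dim2)} down to n_clusters."""
    clusters = [[label] for label in scores]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [scores[w] for w in clusters[i]],
                    [scores[w] for w in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge the cheapest pair
        del clusters[j]
    return [set(c) for c in clusters]


# Hypothetical (Dim1, Dim2) scores for eight hypothetical wells
scores = {
    "w1": (0.0, 0.0), "w2": (0.1, 0.0),
    "w3": (5.0, 5.0), "w4": (5.1, 5.0),
    "w5": (-5.0, 5.0), "w6": (-5.0, 5.1),
    "w7": (9.0, -3.0), "w8": (9.0, -3.1),
}
groups = hca(scores, n_clusters=4)
```

Running the procedure for several values of `n_clusters` and comparing the resulting groups is one simple way to mimic the iterative search for an adequate number of clusters described above.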
ITA and MK
For the ITA method, monthly GDL information from each well was used to detect small variations in trends. All samples from Table 1, which covers the period from 2006 to 2021, were therefore used. The analysis also included 4 wells with data from 2011 to 2021 and 3 others from outside the main aquifer system, to understand the tendencies in some areas for which no conclusion was possible in the other analyses. The results are presented in Table 5 and in Appendix A (Figs. 11 to 19).
The ITA slope kriging results are presented in Fig. 9. There is a more pronounced increase in GDL in the South sector of the Basin, as previously verified with FAMD. There are also situations where a decrease in GDL is observed (432/855, 442/36, 442/537, 442/94, 444/355, 453/18, 453/395, 454/151 and 476/19), and in some cases a constant tendency is found (444/318 and 454/146).
Through the analysis of the graphs obtained from ITA, it is possible to verify two situations, which are also reflected in FAMD analysis:
- There are monitoring wells with constant tendencies and monitoring wells with irregular tendencies. The monitoring wells with a more constant behaviour mostly show positive trends, defined in the confidence interval (α). The monitoring wells with irregular behaviours (10 wells) can present general negative trends of GDL (in 453/151, 432/855 and 442/36) or general positive trends, as is the case of 420/105, 432/68, 432/800, 476/20, 476/21, 484/8, and 528/16. Higher slopes (positive and negative) present a more pronounced anomalous trend, suggesting that anomalous behaviours register a very significant variation in short periods, which is the case of well 476/20 (Fig. 6).
- The monitoring wells with constant tendency behaviour have smaller slope standard deviation values. All correlation values found are quite high, greater than 0.8 and in most cases close to 0.99, except for one monitoring well (420/105) that has a low correlation value, less than 0.5, which results from an episode in which the groundwater depth, at first, increased sharply.
Figure 9 shows that in the Southern sector and in a smaller area in the North of the study area, increasing tendencies of GDL (that is, increases in groundwater depth levels) occur, while the more coastal sectors and the areas close to estuaries present a slight tendency to decreases in GDL. However, the well 466/21 shows a different tendency, with an increase in GDL, which could result from exploitation in the area. Hence, from this result it is possible to visualize an initial transition to the decreasing of GDL, as shown in Appendix A.
MKA method was applied to complement ITA results, and to reverify the tendencies observed with the previous methods. Like ITA, the graphical results of MKA are also presented in Appendix A.
Most of the wells display a positive trend of GDL, except 432/855, 442/36, 442/537, 442/94, 444/355, 453/18, 453/395, 454/151 and 476/19, as also found with ITA.
Positive trends of GDL are predominant in 2006–2016 and 2018–2021, with some wells displaying a minor transition period in 2008–2009 and 2017–2018. In 2018–2021, the slope of both lines is significant, which shows that in the last years the rate of GDL increase was escalating.
Negative trends in GDL began to manifest after the beginning of 2007 and in 2010 and 2011. The wells 432/855 and 454/151 display a negative GDL trend between 2015 and 2021. In line with the previous analyses, the well 476/20 presents a very significant positive trend since the beginning of 2019. It is also possible to verify that wells 434/280, 434/306, and 443/924 have three to five breaking points at different periods, which reinforces the irregular profiles already detected. In general, wells with anomalous tendencies have significant slopes in ITA after the breaking points in MKA.
After analysing the ITA and MK results, the IPTA methodology was applied to complement the ITA analyses. Ten wells were analysed, 5 in the South sector and 5 in the North sector. The results, presented in Appendix B, provide some clarifications. In general, almost all wells that show an increase in GDL manifest this trend for most months of the year, especially the wells located in the southern sector of the study area. The wells with decreasing GDL show the opposite behaviour. This suggests that depletion and rising effects have been occurring consistently over the years.
On careful analysis, it becomes evident that some wells share a similar pattern in their monthly evolution. In wells 476/20 and 528/16 there is an inflexion between the months Dec-Jan, whose slope is accentuated and coincides with the maximum trend line, indicating a tendency for the depth to increase in the month of Dec over the years in comparison to the month of Jan. This suggests a decrease in the recharge rate in the southern sector in the winter months, together with other factors of a structural nature in the basin. The wells 445/7 and 466/21 show a more constant trend of increasing depth over the months, except for the month of Aug for the well 445/7, where there is a decrease in GDL, indicating the possibility of recharge in the system by anthropogenic means (agricultural areas) to mitigate the effects of drought. The well 476/21, in comparison to the remaining wells, shows an inversion between the months Sep-Oct, corresponding to one of the maximum trend lines, followed by the months Apr-May. In addition, the month of Jan shows no trend relative to the other months. This could be due to changes in the recharge effect and rising sea levels during the dry season. In 442/241, there is an increase in GDL over the months, especially in the summer months, which is conditioned by the recharge rate.
The wells 453/18 and 432/855 show irregular behaviour. The significant decrease observed in the well 432/855 could result from the combination of sea level rise and possibly precipitation. In the well 453/18, sea level rise combined with precipitation could also explain that effect in the summer, although it is not as significant as in the well 432/855. The well 476/19, on the other hand, shows a constant trend over the months, and in the months of Aug, Sep and Oct this effect is significant due to sea level rise. The well 420/12 shows variations in the tendency, where the months Jan, Apr, May and Aug present a decrease in GDL, Feb, Mar, Jun, Jul, Nov and Dec, present a decrease in GDL, and Sep and Oct do not show any tendency at all. This suggests a combination of factors, such as anthropic action and climate change.
Discussion and conclusions
The evaluation of annual and long-term changes in groundwater reserves, namely the variations in groundwater depth levels (GDL), is highly relevant in the context of hydrogeological resource management. Among other uses, GDL trend analysis makes it possible to estimate indirectly the effectiveness of the groundwater recharge rate, to determine its gradients, to understand the dynamic evolution of aquifer systems, and to design efficient and sustainable exploitation of groundwater wells.
In general, FAMD associated with HCA yields more reliable results in cases where the amount of available data is small.
However, there were some difficulties and inconsistencies in the clustering analysis process, probably due to the reduced number of observation wells when compared to the dimension of the study area, and to the extreme outlier behaviour of the well “476/20”, which affected the results.
When comparing FAMD and ITA, there are correspondences in the trends detected with both methods. ITA nevertheless enables a significantly more efficient analysis of periods and can effectively leverage a larger pool of observation data than FAMD.
In this case study, variations in GDL are likely related to deficiencies in the recharge rate of the aquifer system. From ITA it was possible to observe that the high slope values, that is, the most marked trends, occur in the central area of the Basin and the inverse situations in the bay areas.
With the IPTA analyses it is also possible to conclude that, in most of the years, the increases and decreases of GDL occur in almost all months. The monthly evolution also reveals that in some locations the combination of different factors severely conditions the transition between months, creating irregular polygons and showing that the evolution of GDL is not constant.
When ITA results are complemented with MKA, it is possible to conclude with a certain level of accuracy that, in the period 2006–2010, there was an initial increase of GDL in the center of the Basin, and from 2010 to 2015, the opposite occurred in the bay areas. In the South sector, there are significant variations of GDL in the last five years, in which the well 476/20 represents the highest increase in GDL. This change is undoubtedly visible in both ITA and MKA.
The higher increase tendencies in GDL can be related to insufficient recharge effects, caused by anthropic action, which is reflected by the intensification of human activity in the NE and SE sectors of the study area. In the North sector, industrialization, urban areas, and population growth may be inducing some of the detected localized increases in GDL.
In the SE sector, besides the influence of agricultural activities, the generalized increase in GDL suggests the probable influence of natural factors such as more pronounced drought effects, which have been evidenced by recent investigations in the area (Novo et al. 2020) and by the evolution of temperature and precipitation according to Fig. 4. Another possible reason for the increase in GDL is the occurrence of localized hydrodynamic changes due to the influence of Variscan faulting and hydrothermal systems present in the Paleozoic basement massif that underlies the studied aquifer systems (Oliveira et al. 1998, 2001; Matos 2008, 2021; Matos et al. 2009). The southern sector of the Tagus-Sado Basin is affected by Paleozoic structures corresponding to the Iberian Pyrite Belt, where VMS deposits are present (e.g., Lagoa Salgada) (Matos 2021). Distinct Alpine age faults intersect the area, compartmentalizing the Paleozoic basement in distinct horst and graben structures, and it is likely that some of these faults are in an active tectonic regime (Oliveira et al. 1998, 2001; Matos 2008, 2021; Matos et al. 2009).
It is also important to mention that in the SE sector of the basin, some areas are classified as having high rock temperatures down to a depth of 5 km (Anderson et al. 2011). In this geological context, the hydrodynamic conditions of the aquifer systems may undergo, in specific locations, slight or well-marked alterations in groundwater dynamics and hydrochemical composition. This information is corroborated by the analysis of hydrochemical facies from data ranging from 2000 to 2010 shown in Fig. 10, in which there is a progression of the tendency from the ocean to the continent in the NW sector, with the appearance of transition facies, and an increase of continental facies to the south.
Due to missing values in some data ranges, some of the results carry a certain level of uncertainty, which can hinder the observation of accurate tendencies in a few wells. This is therefore an exploratory study. Considering the detected tendencies, and the extent and complexity of the geological and hydrogeological conditions, it is advisable to expand the official ongoing monitoring plan and future research at regional scale.
Data availability
Not applicable.
References
Abdi H (2003) Factor Analysis of mixed data: methods and applications. In: Tashakkori A, Teddlie C (eds) Handbook of Mixed methods in social & behavioral research. Sage Publications, Thousand Oaks, CA, pp 401–425
Achite M, Ceribasi G, Ceyhunlu AI, Wałęga A, Caloiero T (2021) The innovative polygon trend analysis (IPTA) as a simple qualitative method to detect changes in environment—example detecting trends of the total monthly precipitation in semiarid area. Sustainability 13(22):12674. https://doi.org/10.3390/su132212674
Adamovic M, Branger F, Braud I, Kralisch S (2016) Development of a data-driven semi-distributed hydrological model for regional scale catchments prone to Mediterranean flash floods. J Hydrol 541:173–189. https://doi.org/10.1016/j.jhydrol.2016.03.032
Ahmed N, Wang G, Booij MJ, Ceribasi G, Bhat MS, Ceyhunlu AI, Ahmed A (2022) Changes in monthly streamflow in the Hindukush–Karakoram–Himalaya region of Pakistan using innovative polygon trend analysis. Stochastic Environ Res Risk Assess 36(3):811–830. https://doi.org/10.1007/S00477-021-02067-0
Akçay F, Bingölbali B, Akpınar A, Kankal M (2022) Trend detection by innovative polygon trend analysis for winds and waves. Front Mar Sci. https://doi.org/10.3389/fmars.2022.930911
Al Naeem MF, Yusoff I, Ng TF, Maity JP, Alias Y, May R, Alborsh HA (2019) A study on the impact of anthropogenic and geogenic factors on groundwater salinization and seawater intrusion in Gaza coastal aquifer, Palestine: an integrated multi-techniques approach. J Afr Earth Sci 156:75–93. https://doi.org/10.1016/j.jafrearsci.2019.05.006
Alifujiang Y, Abuduwaili J, Maihemuti B, Emin B, Groll M (2020) Innovative trend analysis of precipitation in the lake Issyk-Kul basin Kyrgyzstan. Atmosphere 11(4):332. https://doi.org/10.3390/atmos11040332
Almazroui M, Şen Z (2020) Trend analyses methodologies in hydro-meteorological records. Earth Syst Environ 4(4):713–738. https://doi.org/10.1007/s41748-020-00190-6
Almeida JS, Barbosa L, Pais AACC, Formosinho SJ (2007) Improving hierarchical cluster analysis: a new method with outlier detection and automatic clustering. Chemom Intell Lab Syst 87(2):208–217. https://doi.org/10.1016/j.chemolab.2007.01.005
Almeida C, Mendonça JJL, Jesus MR, Gomes AJ (2000) Sistemas aquíferos de Portugal Continental. Centro de Geologia da Fac. Ciências Univ. Lisboa Instituto da Água, vol. II e III. https://snirh.apambiente.pt/snirh/download/aquiferos_PortugalCont/Ficha_T3.pdf
Anderson MJ (2008) A new method for non-parametric multivariate analysis of variance. Austral Ecol 26(1):32–46. https://doi.org/10.1111/j.1442-9993.2001.01070.pp.x
Anderson EW, Antkowiak M, Butt R, Davis J, Dean J, Hillesheim M, Hotchkiss E, Hunsberger R, Kandt A, Lund J, Massey K, Robichaud R, Stafford B, Visser C (2011) Broad overview of energy efficiency and renewable energy opportunities for department of defense installations. (Report No NREL/TP-7A20–50172). National Renewable Energy Laboratory. https://doi.org/10.2172/1023698
Antunes MT (1983) Notícia Explicativa da Folha 39-C (Alcácer do Sal) da Carta Geológica de Portugal, na escala 1/50 000. Serviços Geológicas de Portugal, Lisboa, pp 39–47. https://geoportal.lneg.pt/pt/dados_abertos/cartografia_geologica/cgp50k/
Audigier V, Husson F, Josse J (2016) A principal component method to impute missing values for mixed data. Adv Data Anal Classif 10:5–26. https://doi.org/10.1007/S11634-014-0195-1
Barbosa S, Pinto M, Almeida JA, Carvalho E, Diamantino C (2021) Hydrochemical contamination profiling and spatial-temporal mapping with the support of multivariate and cluster statistical analysis. World Acad Sci Eng Tech Intern J Geol Environ Eng 15(9):243–252
Batdelger O, Tsujimura M, Tran DA, Zorigt B, Thuc PTB (2023) Identification of hydrogeochemical processes and controlling factors in groundwater and surface water using integrated approaches, Tuul River Basin (Ulaanbaatar, Mongolia). Environ science and Eng. Springer Nature, Cham, pp 167–198. https://doi.org/10.1007/978-3-031-17808-5_12
Bayo J, López-Castellanos J (2016) Principal factor and hierarchical cluster analyses for the performance assessment of an urban wastewater treatment plant in the Southeast of Spain. Chemosphere 155:152–162. https://doi.org/10.1016/j.chemosphere.2016.04.038
Buri ES, Keesara VR, Loukika NK (2022) Long-term trend analysis of observed gridded precipitation and temperature data over Munneru River basin, India. J Earth Syst Sci. https://doi.org/10.1007/s12040-022-01864-7
Caloiero T (2019) Evaluation of rainfall trends in the South Island of New Zealand through the innovative trend analysis (ITA). Theoret Appl Climatol 139(1–2):493–504. https://doi.org/10.1007/s00704-019-02988-5
Caloiero T, Coscarelli R, Ferrari E (2018) Application of the innovative trend analysis method for the trend analysis of rainfall anomalies in Southern Italy. Water Resour Manag 32(15):4971–4983. https://doi.org/10.1007/S11269-018-2117-Z
Celestino AEM, Cruz DP, Sánchez E, Reyes FV, Soto D (2018) Groundwater quality assessment: an improved approach to K-means clustering, principal component analysis and spatial analysis: a case study. Water 10(4):437. https://doi.org/10.3390/w10040437
Cerón JC, Jiménez-Espinosa R, Pulido-Bosch A (2000) Numerical analysis of hydrogeochemical data: a case study (Alto Guadalentı́n, southeast Spain). Appl Geochem. https://doi.org/10.1016/s0883-2927(99)00105-5
Correia J (2017) Contributo para a modelação 3D de horizontes geológicos com o auxílio de sísmica de reflexão e dados de sondagens e poços: um caso de estudo na Bacia Cenozóica do Baixo Tejo. Master dissertation, Nova School of Science and Technology. http://hdl.handle.net/10362/22223
Costa EF (1994) Notícia explicativa das folhas 7 e 8 da Carta Hidrogeológica de Portugal, Instituto Geológico e Mineiro, Lisboa, pp 83. https://geoportal.lneg.pt/pt/dados_abertos/cartografia_geologica/carta_hidrogeologica_200k
Cunha PP (1992) Estratigrafia e sedimentologia dos depósitos do cretácio superior e terciário de Portugal Central, a leste de Coimbra. Doctoral dissertation, University of Coimbra. https://estudogeral.sib.uc.pt/handle/10316/2015
De Andrade CD, De Azevedo JA, Santos MAD, Assumção RDSFV (2020) Water quality assessment based on multivariate statistics and water quality index of a strategic river in the Brazilian Atlantic forest. Sci Rep. https://doi.org/10.1038/s41598-020-78563-0
Dinpashoh Y, Mirabbasi R, Jhajharia D, Abianeh HZ, Mostafaeipour A (2014) Effect of short-term and long-term persistence on identification of temporal trends. J Hydrol Eng 19(3):617–625. https://doi.org/10.1061/(asce)he.1943-5584.0000819
Fatahi NR, Yaghoobi P, Reaisi VH, Ostad-Ali-Askari K, Nouri J, Maghsoudlou B (2021) Eco-hydrologic stability zonation of dams and power plants using the combined models of SMCE and CEQUALW2. Appl Water Sci. https://doi.org/10.1007/s13201-021-01427-z
Ferchichi H, Hamouda MB, Farhat B, Mammou AB (2018) Assessment of groundwater salinity using GIS and multivariate statistics in a coastal Mediterranean aquifer. Int J Environ Sci Tech 15(11):2473–2492. https://doi.org/10.1007/s13762-018-1767-y
Fernandes GP, Silva OM (1998) Contribuição para a caracterização hidrogeológica do sistema aquífero da bacia do sado. In 4 º Congresso da Água, A água como recurso estruturante do desenvolvimento, Centro de congressos da FIL, Lisboa. APRH. https://www.aprh.pt/congressoagua98/files/c_titul.htm
Ghaemi Z, Noshadi M (2022) Surface water quality analysis using multivariate statistical techniques: a case study of Fars Province rivers Iran. Environ Monit Assess. https://doi.org/10.1007/s10661-022-09811-1
Ghashghaie M, Eslami H, Ostad-Ali-Askari K (2022) Applications of time series analysis to investigate components of Madiyan-rood River water quality. Appl Water Sci. https://doi.org/10.1007/s13201-022-01693-5
Granato D, Santos JS, Escher GB, Ferreira BL, Maggio RM (2018) Use of principal component analysis (PCA) and hierarchical cluster analysis (HCA) for multivariate association between bioactive compounds and functional properties in foods: a critical perspective. Trends Food Sci Techn 72:83–90. https://doi.org/10.1016/j.tifs.2017.12.006
Güçlü YS (2020) Improved visualization for trend analysis by comparing with classical Mann-Kendall test and ITA. J Hydrol 584:124674. https://doi.org/10.1016/j.jhydrol.2020.124674
Hamed KH (2008) Trend detection in hydrologic data: the Mann-Kendall trend test under the scaling hypothesis. J Hydrol 349(3–4):350–363. https://doi.org/10.1016/j.jhydrol.2007.11.009
Harkat S, Kisi O (2021) Trend analysis of precipitation records using an innovative trend methodology in a semi-arid Mediterranean environment: Cheliff watershed case (Northern Algeria). Theoretical Appl Climatol 144(3–4):1001–1015. https://doi.org/10.1007/S00704-021-03520-4
Helsel DR (1987) Advantages of nonparametric procedures for analysis of water quality data. Hydrol Sci J 32(2):179–190. https://doi.org/10.1080/02626668709491176
Husson F, Lê S, Pagès J (2017) Exploratory multivariate analysis by example using R. Chapman and Hall/CRC eBooks. https://doi.org/10.1201/b21874
Husson F, Josse J, Lê S, Mazet J (2020) Package ‘FactoMineR’: multivariate exploratory data analysis and data mining. https://cran.rediris.es/web/packages/FactoMineR/FactoMineR.pdf
Javadinejad S, Hannah D, Ostad-Ali-Askari K, Krause S, Zalewski M, Boogaard F (2019a) The impact of future climate change and human activities on hydro-climatological drought, analysis and projections: using CMIP5 climate model simulations. Water Conserv Sci Eng 4:71–88. https://doi.org/10.1007/s41101-019-00069-2
Javadinejad S, Ostad-Ali-Askari K, Eslamian S (2019b) Application of multi-index decision analysis to management scenarios considering climate change prediction in the Zayandeh Rud river basin. Water Conserv Sci Eng 4(1):53–70. https://doi.org/10.1007/S41101-019-00068-3
Jiang Y, Guo H, Jia Y, Cao Y, Hu C (2015) Principal component analysis and hierarchical cluster analyses of arsenic groundwater geochemistry in the Hetao basin. Inner Mongolia Chemie Der Erde 75(2):197–205. https://doi.org/10.1016/j.chemer.2014.12.002
Kantiranis N, Mattas C, Pavlou A, Patrikaki O, Voudouris K (2017) Multivariate statistical analysis for the assessment of groundwater quality under different hydrogeological regimes. Environ Earth Sci. https://doi.org/10.1007/s12665-017-6665-y
Krishnan RA, Ansari J, Sundararajan M, John C, Saharuba PM (2019) Chapter 9 – groundwater quality assessment using multivariate statistical methods for chavara aquifer system, Kerala, India. In: Venkatramanan S, Prasanna MV, Chung SY (eds) GIS and geostatistical techniques for groundwater science. Elsevier, pp 113–131. https://doi.org/10.1016/B978-0-12-815413-7.00009-2
Kullberg MC, Kullberg JC, Terrinha P (2000) Tectónica da Cadeia da Arrábida. Tectónica das regiões de Sintra e Arrábida. Memórias Geociências Do Museu Nacional História Natural Da Universidade De Lisboa 2:35–84
Kuriqi A, Ali R, Pham QB, Gambini JM, Gupta V, Malik A, Linh NTT, Joshi Y, Anh DT, Nam VT, Dong X (2020) Seasonality shift and streamflow flow variability trends in central India. Acta Geophys 68:1461–1475. https://doi.org/10.1007/s11600-020-00475-4
Lê S, Josse J, Rennes A, Husson F (2008) FactoMineR: an R package for multivariate analysis. JSS J Statis Softw 25:1–18
Love DJ, Hallbauer D, Amos A, Hranova R (2004) Factor analysis as a tool in groundwater quality management: two southern African case studies. Phys Chem Earth Parts a/b/c 29(15–18):1135–1143. https://doi.org/10.1016/j.pce.2004.09.027
Machiwal D, Singh PK (2015) Understanding factors influencing groundwater levels in hard-rock aquifer systems by using multivariate statistical techniques. Environ Earth Sci 74(7):5639–5652. https://doi.org/10.1007/s12665-015-4578-1
Mann HB (1945) Nonparametric tests against trend. Econometrica 13(3):245–259. https://doi.org/10.2307/1907187
Manuppella G (Coord), Antunes MT, Pais J, Cardoso JL, Ramalho M, Rey J (1999) Notícia Explicativa da Folha 39-B (Setúbal) da Carta Geológica de Portugal, na escala 1/50 000. Serviços Geológicos de Portugal, Lisbon, pp 143. https://geoportal.lneg.pt/pt/dados_abertos/cartografia_geologica/cgp50k/
Matos JX, Sousa P, Ricardo J (2009) MAEPA - Área de Marateca. Caracterização Geológica, Geofísica e Geoquímica da Região de Palma-Serrinha-Cordoeira, Definição de Alvos de Sondagem. Technical Report, DPMM INETI, pp 66
Matos JX (2008) MAEPA - Área de Marateca. Caracterização Geológica e Mineira Preliminar das Janelas Paleozóicas de Vela, Serra do Loureiro, Palma, Serrinha, Clérigos e Cordoeira. Technical Report DPMM LNEG/INETI, pp. 17
Matos JX (2021) Alteração hidrotermal ácido-sulfato associada aos jazigos de sulfuretos maciços de Lagoa Salgada, Caveira, Lousal, Aljustrel e São Domingos (Faixa Piritosa Ibérica). PhD Thesis in Geology (specialization in Metallogeny), Geology Department, Faculty of Sciences, University of Lisbon, Lisbon, pp 435
Minea I, Boicu D, Chelariu OE (2020) Detection of groundwater levels trends using innovative trend analysis method in temperate climatic conditions. Water 12(8):2129. https://doi.org/10.3390/w12082129
Mirabbasi R, Ahmadi F, Jhajharia D (2020) Comparison of parametric and non-parametric methods for trend identification in groundwater levels in Sirjan plain aquifer, Iran. Hydrol Res 51(6):1455–1477. https://doi.org/10.2166/nh.2020.041
Mumby PJ (2002) Statistical power of non-parametric tests: a quick guide for designing sampling strategies. Mar Pollut Bull 44(1):85–87. https://doi.org/10.1016/s0025-326x(01)00097-2
Nielsen F (2016) Hierarchical clustering. In: Nielsen F (ed) Introduction to HPC with MPI for data science. Springer International Publishing, Cham, pp 195–211. https://doi.org/10.1007/978-3-319-21903-5_8
NLJ (1948) Rank correlation methods. By Maurice G. Kendall, M. [Pp. vii + 160. London: Charles Griffin and Co. Ltd., 42 Drury Lane, 1948. 18s.]. J Inst Actuar 75(1):140–141. https://doi.org/10.1017/s0020268100013019
Novo ME, Martins T, Henriques JM (2020) BINGO project: Impacts of climate change in groundwater in the lower Tagus - Coupling outputs from climate and recharge models with aquifer modelling. Report No 245/2020 – DHA/NRE. Laboratório Nacional de Engenharia Civil. http://repositorio.lnec.pt:8080/xmlui/handle/123456789/1012984
Oh J, Kim H, Yu S, Kim K, Yun ST (2020) Delineating the impacts of poultry burial leachate on shallow groundwater in a reclaimed agro-livestock farming area, using multivariate statistical analysis of hydrochemical data. Environ Sci Pollut Res 28(7):7742–7755. https://doi.org/10.1007/s11356-020-08178-5
Oliveira V, Matos JX, Bengala M, Sousa P (1998) Principais alinhamentos vulcânicos a norte da Falha de Grândola, sob formações da Bacia Terciária do Sado e sua potencialidade mineira no contexto da Faixa Piritosa Ibérica. In Actas V Congresso Nacional de Geologia Com. IGMT. 84 F. 2, pp.F15–18.
Oliveira V, Matos JX, Rosa C (2001) The NNW sector of the Iberian pyrite belt: new exploration perspectives for the next decade. In: Geode workshop, massive sulphide deposits in the Iberian pyrite belt: new advances and comparison with equivalent systems, Aracena, Spain, pp 34–35
Pagès J (2004) Analyse factorielle de données mixtes. Revue de Statist Appliq 52(4):93–111. http://www.numdam.org/item?id=RSA_2004__52_4_93_0
Pais J, Moniz C, Cabral J, Cardoso JL, Legoinha P, Machado S, Morais MA, Lourenço C, Ribeiro ML, Henriques P, Falé P (2006) Notícia Explicativa da Folha 34-D (Lisboa) da Carta Geológica de Portugal, na escala 1/50 000. Departamento de Geologia, INETI - Instituto Nacional de Engenharia, Tecnologia e Inovação, Lisbon, pp 74. https://geoportal.lneg.pt/pt/dados_abertos/cartografia_geologica/cgp50k/
Panda B, Radha VD, Chidambaram S, Arindam M, Thilagavathi R, Manikandan S, Thivya C, Ramanathan AL, Ganesh N (2019) Chapter 22 – Fluoride contamination in groundwater: a GIS and geostatistics reappraisal. In: Venkatramanan S, Prasanna MV, Chung SY (eds) GIS and geostatistical techniques for groundwater science. Elsevier, pp 309–322
Pörtner H, Roberts D, Tignor M, Poloczanska E, Mintenbeck K, Alegría A, Craig M, Langsdorf S, Löschke S, Möller V, Okem A, Rama B, Belling D, Dieck W, Götze S, Kersher T, Mangele P, Maus B, Mühle A, Weyer N (2022) Climate change 2022: impacts, adaptation and vulnerability. Working Group II contribution to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK and New York, NY, USA
Rabiei J, Sadat KM, Bagherpour S, Ebadi N, Karimi A, Ostad-Ali-Askari K (2022) Investigation of fire risk zones using heat-humidity time series data and vegetation. Appl Water Sci. https://doi.org/10.1007/s13201-022-01742-z
Rahbar A, Vadiati M, Talkhabi M, Nadiri AA, Nakhaei M, Rahimian M (2020) A hydrogeochemical analysis of groundwater using hierarchical clustering analysis and fuzzy C-mean clustering methods in Arak plain, Iran. Environ Earth Sci. https://doi.org/10.1007/s12665-020-09064-6
Rao NK, Chaudhary M (2019) Hydrogeochemical processes regulating the spatial distribution of groundwater contamination, using pollution index of groundwater (PIG) and hierarchical cluster analysis (HCA): a case study. Groundw Sustain Dev 9:100238. https://doi.org/10.1016/j.gsd.2019.100238
Riaz M, Tahir M, Arslan M (2016) Non-parametric versus parametric methods in environmental sciences. Bull Environ Stud 1(1):28–30
Ruiz F, Gomis V, Blasco P (1990) Application of factor analysis to the hydrogeochemical study of a coastal aquifer. J Hydrol 119(1–4):169–177. https://doi.org/10.1016/0022-1694(90)90041-u
Şan M, Akçay F, Linh NTT, Kankal M, Pham QB (2021) Innovative and polygonal trend analyses applications for rainfall data in Vietnam. Theor Appl Climatol 144(3–4):809–822. https://doi.org/10.1007/s00704-021-03574-4
Sanikhani H, Kisi O, Mirabbasi R, Meshram SG (2018) Trend analysis of rainfall pattern over the Central India during 1901–2010. Arab J Geosci 11(15):1–14. https://doi.org/10.1007/s12517-018-3800-3
Şen Z (2012) Innovative trend analysis methodology. J Hydrol Eng 17(9):1042–1046. https://doi.org/10.1061/(asce)he.1943-5584.0000556
Şen Z (2014) Trend identification simulation and application. J Hydrol Eng 19(3):635–642. https://doi.org/10.1061/(asce)he.1943-5584.0000811
Şen Z (2017) Innovative trend significance test and applications. Theor Appl Climatol 127(3–4):939–947. https://doi.org/10.1007/s00704-015-1681-x
Şen Z, Şişman E, Dabanli I (2019) Innovative polygon trend analysis (IPTA) and applications. J Hydrol 575:202–210. https://doi.org/10.1016/j.jhydrol.2019.05.028
Seth N (2022) What is Principal Component Analysis (PCA) vs. Factor Analysis? In: Blogs & Updates on Data Science, Business Analytics, AI Machine Learning. https://www.analytixlabs.co.in/blog/factor-analysis-vs-pca/. Accessed 21 May 2022
Shiker MAK (2012) Multivariate statistical analysis. Br J Sci 6(1):55–77
Simões M, Legoinha P (2014) Unidades hidrostratigráficas no Miocénico da Bacia do Tejo na região de Almada. Comunicações Geológicas 101(Especial II):717–720
Simões MM (1998) Contribuição para o Conhecimento Hidrogeológico do Cenozóico na Bacia do Baixo Tejo. PhD Thesis, FCT-NOVA Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa, Caparica. http://hdl.handle.net/10362/1152
Smoliński A, Walczak B, Einax JW (2002) Hierarchical clustering extended with visual complements of environmental data set. Chemom Intell Lab Syst 64(1):45–54. https://doi.org/10.1016/s0169-7439(02)00049-7
Soltani L, Mellah T (2023) Exploring farmers’ adaptation strategies to water shortage under climate change in the Tunisian semi-arid region. Environ Manag 71(1):74–86. https://doi.org/10.1007/s00267-022-01604-z
Subba Rao N, Srihari C, Deepthi Spandana B, Sravanthi M, Kamalesh T, Abraham Jayadeep V (2019) Comprehensive understanding of groundwater quality and hydrogeochemistry for the sustainable development of suburban area of Visakhapatnam, Andhra Pradesh, India. Hum Ecol Risk Assess: Int J 25(1–2):52–80. https://doi.org/10.1080/10807039.2019.1571403
Swain S, Sahoo S, Taloor AK, Mishra S, Pandey A (2022) Exploring recent groundwater level changes using innovative trend analysis (ITA) technique over three districts of Jharkhand, India. Groundw Sustain Dev 18:100783. https://doi.org/10.1016/j.gsd.2022.100783
Umar S, Lone MA, Goel N, Zakwan M (2022) Trend analysis of hydro-meteorological parameters in the Jhelum River basin, North Western Himalayas. Theor Appl Climatol 148(3–4):1417–1428. https://doi.org/10.1007/s00704-022-04014-7
Varol M (2020) Use of water quality index and multivariate statistical methods for the evaluation of water quality of a stream affected by multiple stressors: a case study. Environ Pollut 266:115417. https://doi.org/10.1016/j.envpol.2020.115417
Visbal-Cadavid D, Mendoza AM, De La Hoz-Dominguez EJ (2020) Use of factorial analysis of mixed data (FAMD) and hierarchical cluster analysis on principal component (HCPC) for multivariate analysis of academic performance of industrial engineering programs. Xinan Jiaotong Daxue Xuebao. https://doi.org/10.35741/issn.0258-2724.55.5.34
Wali SU, Alias N, Harun S, Umar K, Gada MA, Dankani IM, Kaoje IU, Usman AA (2022) Water quality indices and multivariate statistical analysis of urban groundwater in semi-arid Sokoto Basin, Northwestern Nigeria. Groundw Sustain Dev 18:100779. https://doi.org/10.1016/j.gsd.2022.100779
Wang H, Jiang X, Wan L, Han G, Guo H (2015) Hydrogeochemical characterization of groundwater flow systems in the discharge area of a river basin. J Hydrol 527:433–441. https://doi.org/10.1016/j.jhydrol.2015.04.063
Wang Y, Zhu GY, Yu R (2018) Assessment of surface water quality using multivariate statistical techniques: a case study in China. Irrigat Drainage Syst Eng. https://doi.org/10.4172/2168-9768.1000214
Zakwan M (2021) Trend analysis of groundwater level using innovative trend analysis. Springer eBooks, pp 389–405
Zbyszewski G, Antunes MT, Ferreira ODV (1976) Notícia Explicativa da Folha 39-A (Águas de Moura) da Carta Geológica de Portugal, na escala 1/50 000. Serviços Geológicos de Portugal, Lisbon, pp. 59. https://geoportal.lneg.pt/pt/dados_abertos/cartografia_geologica/cgp50k/
Acknowledgements
The authors acknowledge the support of the Department of Earth Sciences of FCT-NOVA, GeoBioTec, and Mineral Resources and Geophysics Unit of LNEG.
Funding
Open access funding provided by FCT|FCCN (b-on). The research was partially developed for a master's thesis in the Earth Sciences Department of the Faculty of Sciences and Technology (FCT), NOVA University of Lisbon, Portugal. This research was partially funded by FCT-Fundação para a Ciência e a Tecnologia, Portugal, grant number UIDB/04035/2020.
Author information
Contributions
Conceptualization, methodology, funding acquisition, S.B.; software, M.B.; investigation, S.B., M.B., and J.X.M.; resources, S.B. and J.X.M.; data curation, M.M.; validation, M.B., S.B., and J.X.M.; writing—original draft preparation, M.B. and S.B.; writing—review and editing, S.B. and J.X.M. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ferreira Branco, M., Barbosa, S.V. & Matos, J.X. Trend detection and depletion effects evidence in time series of groundwater levels in the southern sector of the left bank of the Tagus-Sado Basin (Portugal, Iberian Peninsula). Sustain. Water Resour. Manag. 10, 115 (2024). https://doi.org/10.1007/s40899-024-01083-6
DOI: https://doi.org/10.1007/s40899-024-01083-6