Introduction

The timing of flood events and the degree of seasonality are important characteristics of flood processes. The seasonality of annual maximum flows (AM) is one of the indicators of flood processes (Merz and Blöschl 2003). Studies on flood seasonality can help to recognize changes in flood driving processes (Hall 2014). Both the climate forcing mechanisms (for example, temperature changes and atmospheric patterns) and local soil and geophysical properties are reflected in the dates of floods. A useful basis for assessing the seasonality of environmental variables is circular statistics (Fisher 1993; Mardia and Jupp 2000). The method provides a practical approach for studying the timing of flood events (Burn 1997; Bayliss and Jones 1993). Seasonal indices based on circular statistics represent an important indicator of flood processes that can be used as a pooling characteristic in regional flood frequency analysis (Kriegerová and Kohnová 2005). New methods for identifying flood seasons have been introduced (Chen et al. 2013) that divide the flood season using the circular standard deviation of flood occurrences and of flood occurrences combined with flood magnitudes. The first advantage of applying circular instead of linear statistics to the dates of annual maximum flows (DAM) is that circular statistics reflect the closeness of dates that occur at the end and at the beginning of the hydrological year. A further advantage is that the dates of floods are almost error-free.

Circular statistics have been applied in measures of similarity in catchment hydrologic response (Burn 1997; Cunderlik and Burn 2002; Cunderlik et al. 2004; Castellarin 2001). The methods were used in studies on floods in Great Britain (Bayliss and Jones 1993), on the seasonality of rainfall- and snowmelt-induced floods in mid-sized catchments in Slovakia (Kriegerová and Kohnová 2005), on the seasonality of precipitation and runoff characteristics in Slovakia and Austria (Parajka et al. 2009), on seasonal variation in flood date in the Peak Over Threshold model (Ouarda et al. 1993), on flood seasonality regionalization (Ouarda et al. 2006), on the predicted impact of climate change on low flows in catchments in Germany (Demirel 2013) and on projected changes in flood seasonality under climate change in six catchments in Norway (Vormoor et al. 2015). A comprehensive statistical analysis of the dates of extreme precipitation at stations in the USA was conducted by Dhakal et al. (2015), who studied nonstationarity in seasonality. Circular statistics were also used by Blöschl et al. (2017), who revealed patterns of change in flood timing in many parts of Europe.

The main objective of the paper is to identify the probabilistic properties of the dates of winter, summer and annual maximum river flow using circular statistics and a circular theoretical distribution function. Three catchments with different hydrological regimes were selected for the study. To the best of the authors’ knowledge, methods such as identifying the theoretical distribution function as a mixture of von Mises distribution functions have not yet been applied to the dates of annual and seasonal maximum flow in the hydrological literature. All symbols and abbreviations used in this paper are listed in Table 1.

Table 1 List of symbols and abbreviations in alphabetical order

Data and study areas

The date of occurrence of summer maximum river flow, winter maximum river flow and annual maximum river flow was studied in the Zagożdżonka river (gauging station: Płachty Stare), in the Czarna Przemsza river (gauging station: Piwoń) and in the Poprad river (gauging station: Muszyna). The data for the Czarna Przemsza river and for the Poprad river were obtained from the Institute of Meteorology and Water Management National Research Institute, Poland (Polish acronym: IMGW-PIB). The data for the Zagożdżonka river were collected by the Department of Hydraulic Engineering, Warsaw University of Life Sciences SGGW.

All three rivers contribute to the water resources of the basin of the Vistula, the longest river in Poland.

The Zagożdżonka river is a left tributary of the Vistula river. The watershed is located in central Poland, ca. 100 km south of Warsaw. Its topography is typically lowland. Local depressions which do not contribute to direct runoff constitute a significant part of the area. In terms of long-term mean precipitation, the wettest month is July, with a rainfall depth equal to 13% of the annual value. In terms of discharge, the wettest month is March, with a mean value over 70% larger than the mean annual discharge. Readers interested in more detail on the hydrological conditions of the watershed are referred to Banasik and Hejduk (2012, 2013), Banasik et al. (2013), Hejduk and Hejduk (2014) and Kaznowska and Banasik (2011).

The Czarna Przemsza river has its source in the Kraków-Czȩstochowa Upland in southern Poland. This is a typical highland catchment. Most of the catchment is an agricultural area lying in the Piedmont Plateau with permeable soils.

The Poprad river has its source in the High Tatra Mountains, which are the highest part of the Carpathian Mountains. The river flows through part of Slovakia, forms the border between Slovakia and Poland and enters the Dunajec river in Poland. The Poprad river drains water from the Tatra Mountains where precipitation levels are very high. The river contributes considerably to the water resources of the Upper Vistula river basin, a region in Poland which is highly susceptible to flooding and where mountain rivers pose a very high flood hazard (Punzet 1978; Cyberski et al. 2006; Kundzewicz et al. 2016). Two main climatic regimes characterize the Poprad river basin upstream of the Muszyna station: (i) prolonged snow cover, low air temperature, small temperature inversion and very high annual precipitation, reaching 2000 mm, in the western, high mountainous part (upper course of the Poprad river), and (ii) a highland character with a substantial temperature inversion and lower annual precipitation, reaching 900 mm, in the eastern part (lower course) (Šatalová and Kenderessy 2017; Trizna 2004). The three catchments are shown in Fig. 1.

Fig. 1

The study areas; a the Zagożdżonka catchment, b the Czarna Przemsza catchment, c the Poprad catchment, d the catchments’ location on the map of Poland

Catchment characteristics and data are presented in Table 2.

Table 2 Catchment characteristics and discharge data. A is the area, EL is the mean elevation, SH = \(\frac{A}{L^2_{\text {max}}}\) is the shape coefficient, \(\bar{P}_{\text {tot}}\) is the mean annual precipitation total, \(\bar{Q}_{\text {AM}}\) is the mean annual maximum discharge; the study period is given in hydrological years, i.e. from 1st November to 31st October

The three catchments have mixed snowmelt/rainfall regimes. Therefore, the annual maximum flows are either summer or winter flows. Winter floods dominate in the Zagożdżonka catchment and in the Czarna Przemsza catchment while summer floods dominate in the Poprad catchment.

Methods

Circular statistics

In every catchment, the dates of the seasonal (winter, summer) and annual maximum flows were selected for the samples. Then, every calendar date of an annual maximum flow was converted to \(D_i, i=1, ..., n\), the day number of the maximum river flow within the hydrological year. Specifically, \(D_i=1\) for 1st November and, in a non-leap year, \(D_i=365\) for 31st October. Similarly, every date of a winter maximum flow was converted to \(D_i\), with \(D_i=1\) for 1st November and \(D_i=181\) for 30th April. For dates of summer maximum flows, in turn, \(D_i=1\) for 1st May and \(D_i=184\) for 31st October. Subsequently, every day number \(D_i\) was converted to an angular value (angular date) using the formula:

$$\begin{aligned} \Theta _i=\left\{ \begin{array}{ll} \frac{2\pi\cdot D_i}{365} &{} \hbox { if } D_i \hbox { is the day of annual maximum flow},\\ \frac{\pi\cdot D_i}{181} &{} \hbox { if } D_i \hbox { is the day of winter maximum flow},\\ \pi+\frac{\pi\cdot D_i}{184} &{} \hbox { if } D_i \hbox { is the day of summer maximum flow}.\\ \end{array} \right. \end{aligned}$$
(1)

In leap years, the denominators for annual and winter days above were increased by one. Finally, three samples of angular dates \(\Theta _i\) were obtained for every river, namely DWM (winter maxima dates), DSM (summer maxima dates), and DAM (annual maxima dates). The angular dates \(\Theta _i\) are in radians. The value \(\Theta _i\) is the counterclockwise angle between the vectors [1, 0] and \([x_i, y_i]\), both anchored at the origin (0, 0) and with endpoints on the unit circle, where \((x_i, y_i)=(\cos \Theta _i, \sin \Theta _i)\). Therefore, the dates \(\Theta _i\) can be depicted as points located on the unit circle (DAM), on the upper unit semicircle (DWM), and on the lower unit semicircle (DSM).
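The conversion in Eq. (1) can be sketched in a few lines of Python. This is only an illustration: the function name `angular_date` and its `kind` argument are hypothetical, and the leap-year rule (a hydrological year counts as a leap year when it contains 29 February) is an assumption about how the denominators were adjusted.

```python
import calendar
import math
from datetime import date

def angular_date(d: date, kind: str = "annual") -> float:
    """Angular date (radians) of calendar date d, following Eq. (1).

    kind is "annual", "winter" (1 Nov to 30 Apr) or "summer" (1 May to 31 Oct).
    """
    # The hydrological year starts on 1 November of the previous calendar year.
    start_year = d.year if d.month >= 11 else d.year - 1
    # Assumption: the year is "leap" if 29 February falls inside it.
    leap = 1 if calendar.isleap(start_year + 1) else 0
    day_in_year = (d - date(start_year, 11, 1)).days + 1
    if kind == "annual":
        return 2.0 * math.pi * day_in_year / (365 + leap)
    if kind == "winter":
        if 5 <= d.month <= 10:
            raise ValueError("not a winter date")
        return math.pi * day_in_year / (181 + leap)
    if kind == "summer":
        if not 5 <= d.month <= 10:
            raise ValueError("not a summer date")
        day_in_summer = (d - date(d.year, 5, 1)).days + 1
        return math.pi + math.pi * day_in_summer / 184
    raise ValueError("kind must be 'annual', 'winter' or 'summer'")
```

With this convention, 30th April maps to \(\pi\) in the winter sample and 31st October maps to \(2\pi\) in both the summer and the annual sample.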

The mean flood date \(\bar{\Theta }\in [0, 2\pi )\), a measure of location, is uniquely determined by the pair \((\bar{x}, \bar{y})=(\frac{1}{n}\sum _{i=1}^n \cos \Theta _i,\frac{1}{n}\sum _{i=1}^n \sin \Theta _i)\) as

$$\begin{aligned} \bar{\Theta }=\left\{ \begin{array}{ll} \arctan \frac{\bar{y}}{\bar{x}} &{} \hbox { if } \bar{x}>0, \bar{y} \ge 0,\\ 2\pi +\arctan \frac{\bar{y}}{\bar{x}} &{} \hbox { if } \bar{x}>0, \bar{y}<0,\\ \pi +\arctan \frac{\bar{y}}{\bar{x}} &{} \hbox { if } \bar{x}<0,\\ \frac{\pi }{2}&{} \hbox { if } \bar{x}=0, \bar{y}>0,\\ \frac{3\pi }{2}&{} \hbox { if } \bar{x}=0, \bar{y}<0.\\ \end{array} \right. \end{aligned}$$
(2)

If \(\bar{x}=0\) and \(\bar{y}=0\) then \(\bar{\Theta }\) is not defined.

The measures of variability enable the dispersion of the angular dates of flood occurrences around the mean flood date to be assessed [see e.g. Chen et al. (2013); Cunderlik and Burn (2002); Mardia and Jupp (2000)]. The mean resultant length, \(\bar{r}=\sqrt{\bar{x}^2+\bar{y}^2}\), is the most commonly used measure of dispersion. It should be noted that \(0\le \bar{r}\le 1\) and that \(\bar{r}\) close to 1 (to 0) implies little (large) variation and a high concentration (wide dispersion) of data. The sample circular variance \(CIV=1-\bar{r}\) and the circular standard deviation \(\sigma =\sqrt{-2\ln \bar{r}}\) are also used as measures of dispersion.
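As an illustration, the sample statistics defined above can be computed as follows. The function name `circular_summary` is hypothetical; `math.atan2` followed by a reduction modulo \(2\pi\) is used as a compact equivalent of the case distinction in Eq. (2).

```python
import math

def circular_summary(thetas):
    """Return (mean direction, mean resultant length, CIV, sigma) of angles in radians."""
    n = len(thetas)
    x_bar = sum(math.cos(t) for t in thetas) / n
    y_bar = sum(math.sin(t) for t in thetas) / n
    # Guard against rounding slightly above 1 for perfectly concentrated data.
    r_bar = min(math.hypot(x_bar, y_bar), 1.0)
    if r_bar == 0.0:
        # The mean direction is not defined when x_bar = y_bar = 0.
        return None, 0.0, 1.0, math.inf
    mean = math.atan2(y_bar, x_bar) % (2.0 * math.pi)
    civ = 1.0 - r_bar
    sigma = math.sqrt(-2.0 * math.log(r_bar))
    return mean, r_bar, civ, sigma
```

For two dates at \(\pi /4\) and \(3\pi /4\), for example, the mean direction is \(\pi /2\) and \(\bar{r}=\sqrt{2}/2\).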

Attention should be paid to the method of converting the day number into an angular date using formula (1) for winter and summer dates. Thanks to the formula, we have DWM\(\in (0, \pi ]\) and DSM\(\in (\pi , 2\pi ]\), which places the DWM on the upper semicircle and the DSM on the lower semicircle. However, because the denominators in formula (1) are different (they are the lengths of the winter and summer seasons in a non-leap year), the angular difference between two consecutive days is smaller in the DSM than in the DWM. For example, with \(D_i=10\) the angular winter date of the 10th Nov is \(10\pi /181\approx 0.174\), while the angular summer date of the 10th May is not \(\pi +0.174\) but \(\pi +10\pi /184\approx \pi +0.171\). The incompatibility between the DWM and the DSM is very low and results in a difference much lower than one day.

Circular distribution function

The von Mises distribution on the unit circle is often used because of its highly developed inference methods. Many tests of von Mises distributions are presented in Mardia and Jupp (2000), for example, tests of the mean direction and of the concentration parameter in one population and tests to ascertain the equality of mean directions or equality of concentration parameters in several populations (Dobson 1978; Stephens 1969; Upton 1973; Yamamoto and Yanagimoto 1995). Many of these methods are based on large-sample approximate statistics. The role of this distribution is similar to that of the normal distribution for linear data. The von Mises distribution function \(M(\mu , \kappa )\) has circular probability density function (PDF)

$$\begin{aligned} f(\Theta , \mu , \kappa )=\frac{1}{2\pi I_0(\kappa )} {\rm e}^{\kappa \cos (\Theta -\mu )}, \end{aligned}$$
(3)

where \(\mu \in [0, 2\pi )\) is a mean direction parameter and \(\kappa \ge 0\) is a concentration parameter which reflects the dispersion of the \(\Theta\) values around the mean direction \(\mu\). The parameter \(\kappa\) is small for variables with large variance and vice versa. The function \(I_0\) is the modified Bessel function of the first kind of order 0 where the modified Bessel function of the first kind of order m (\(m=0, 1, 2, ...\)) is (Fichtenholz 2007)

$$\begin{aligned} I_m(z)=\frac{1}{\pi }\int _0^{\pi }{\rm e}^{z\cos t}\cos {(mt)}\,\mathrm{d}t. \end{aligned}$$
(4)

The generalization of the von Mises distribution to higher-dimensional spheres is the von Mises–Fisher distribution, which is used in directional statistics.
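A minimal sketch of the density in Eq. (3) is given below. It evaluates the modified Bessel function through its standard power series \(I_m(z)=\sum _{k\ge 0}(z/2)^{2k+m}/(k!\,(k+m)!)\) rather than through an integral; the function names are hypothetical, and the series is assumed adequate only for \(z\ge 0\) of moderate size (very large \(\kappa\) would overflow).

```python
import math

def bessel_i(m: int, z: float) -> float:
    """Modified Bessel function of the first kind I_m(z) for z >= 0 (power series)."""
    half = z / 2.0
    term = half ** m / math.factorial(m)  # k = 0 term of the series
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= half * half / (k * (k + m))
        total += term
    return total

def von_mises_pdf(theta: float, mu: float, kappa: float) -> float:
    """Circular density of M(mu, kappa), Eq. (3)."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i(0, kappa))
```

For \(\kappa =0\) the density reduces to the circular uniform density \(1/(2\pi )\), which is a quick sanity check of the normalizing constant.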

The empirical PDF of the date of a seasonal or annual maximum may be unimodal or multimodal. Multimodality suggests the existence of several sub-populations in the dates; therefore, a mixture of von Mises distributions was used, with the PDF equal to

$$\begin{aligned} f(\Theta , p, \mu , \kappa )=\sum _{s=1}^S p_s\cdot \frac{\text {exp}(\kappa _s\,\cos (\Theta -\mu _s))}{2\pi I_0(\kappa _s)}, \end{aligned}$$
(5)

where \(p=(p_1, ..., p_S), \mu =(\mu _1, ..., \mu _S), \kappa =(\kappa _1, ..., \kappa _S)\). The parameters \(p_s\) are positive weights that sum to one and that reflect the contribution of every sub-population to the population of dates. The parameter \(\mu _s\) is the mode of the sth component distribution; it is also the mean direction of the sth sub-population. The parameter \(\kappa _s\) reflects the concentration around the mode, i.e. the larger the value of \(\kappa _s\), the greater the clustering around the sth mode. Finite mixtures of von Mises–Fisher distributions were introduced for directional data in Banerjee et al. (2005).
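The mixture density of Eq. (5) can be sketched as a weighted sum of single von Mises densities. The function names are hypothetical, and the series used for \(I_0\) is assumed adequate only for moderate \(\kappa _s\).

```python
import math

def bessel_i0(z: float) -> float:
    """Modified Bessel function I_0(z) for z >= 0, by its power series."""
    term = total = 1.0
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= (z / 2.0) ** 2 / (k * k)
        total += term
    return total

def mixture_pdf(theta, p, mu, kappa):
    """Density of a mixture of von Mises distributions, Eq. (5)."""
    if abs(sum(p) - 1.0) > 1e-9 or min(p) <= 0.0:
        raise ValueError("weights must be positive and sum to one")
    return sum(
        ps * math.exp(ks * math.cos(theta - ms)) / (2.0 * math.pi * bessel_i0(ks))
        for ps, ms, ks in zip(p, mu, kappa)
    )
```

Because every component integrates to one over the circle and the weights sum to one, the mixture integrates to one as well.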

The Maximum Likelihood Estimates (MLE) of the parameters \(\mu , \kappa\) of a single von Mises distribution are \(\hat{\mu }=\bar{\Theta }, \hat{\kappa }=A_1^{-1}(\bar{r})\) where \(A_1(z)=\frac{I_1(z)}{I_0(z)}\) is the ratio of the modified Bessel functions of the first kind of order 1 and 0. However, the problem of maximizing the log-likelihood function, both for a single von Mises distribution and for a mixture, cannot be solved analytically because it leads to an equation involving the inverse of the ratio of two Bessel functions of different orders. Thus, numerical procedures must be applied. The issue was tackled by many researchers for the von Mises–Fisher distribution, for example by Amos (1974), Dempster et al. (1977), Banerjee et al. (2005), Tanabe et al. (2007), Sra (2012), Hornik and Grün (2014), among others. In this paper, the method of Hornik and Grün (2014) was applied, in which bounds for the inverse of the ratio of Bessel functions were derived that improve on the previous approximate methods.
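To make the numerical step concrete, the sketch below inverts \(A_1\) by plain bisection. This is not the method of Hornik and Grün (2014); it is a simpler stand-in chosen for readability, with hypothetical function names, and it is restricted to \(\bar{r}\le 0.99\) because the power series used for the Bessel functions overflows for very large \(\kappa\).

```python
import math

def bessel_i(m: int, z: float) -> float:
    """Modified Bessel function I_m(z) for z >= 0, by its power series."""
    half = z / 2.0
    term = half ** m / math.factorial(m)
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= half * half / (k * (k + m))
        total += term
    return total

def a1(kappa: float) -> float:
    """A_1(kappa) = I_1(kappa) / I_0(kappa), increasing from 0 towards 1."""
    return bessel_i(1, kappa) / bessel_i(0, kappa)

def a1_inverse(r_bar: float, tol: float = 1e-10) -> float:
    """Solve A_1(kappa) = r_bar for kappa by bisection (assumes 0 <= r_bar <= 0.99)."""
    if not 0.0 <= r_bar <= 0.99:
        raise ValueError("r_bar outside the range supported by this sketch")
    lo, hi = 0.0, 1.0
    while a1(hi) < r_bar:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if a1(mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Given the sample statistics \(\bar{\Theta }\) and \(\bar{r}\), the single-component estimates are then \(\hat{\mu }=\bar{\Theta }\) and \(\hat{\kappa }\) equal to `a1_inverse` evaluated at \(\bar{r}\).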

The Expectation–Maximization (EM) algorithm (Dempster et al. 1977; McLachlan and Peel 2000) was used to obtain the maximum likelihood estimates of the parameters of the mixture of von Mises distributions. The EM algorithm was introduced as early as the 1950s by Ceppellini et al. (1955) in gene frequency estimation. In the first step (E-step) of the algorithm, each observation is associated with an unobserved indicator equal to one or zero depending on the component to which the observation belongs, and the expected value of the log-likelihood function for the complete data is computed. In the second step (M-step), this expected log-likelihood is maximized with respect to the parameters. The two steps are repeated until the parameter estimates converge. Several variants of the EM algorithm are known in the literature, for example soft-clustering (used in this paper) and hard-clustering. The high efficacy of these algorithms for fairly skewed empirical distribution functions was shown in Banerjee et al. (2005).
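The soft-clustering EM iteration described above can be sketched as follows. This is an illustrative stand-in and not the implementation used for the reported results: the function names, the fixed number of iterations, the user-supplied starting values and the cap `r_cap` on the component resultant length (which keeps \(\hat{\kappa }_s\) bounded) are all assumptions, and \(A_1\) is inverted by simple bisection.

```python
import math

TWO_PI = 2.0 * math.pi

def bessel_i(m, z):
    """Modified Bessel function I_m(z) for z >= 0, by its power series."""
    half = z / 2.0
    term = half ** m / math.factorial(m)
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= half * half / (k * (k + m))
        total += term
    return total

def a1_inverse(r, tol=1e-8):
    """Solve I_1(k) / I_0(k) = r for k by bisection (r is capped by the caller)."""
    lo, hi = 0.0, 1.0
    while bessel_i(1, hi) / bessel_i(0, hi) < r:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_i(1, mid) / bessel_i(0, mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def em_von_mises(thetas, p0, mu0, kappa0, iters=100, r_cap=0.99):
    """Soft-clustering EM for a mixture of von Mises distributions, Eq. (5)."""
    p, mu, kappa = list(p0), list(mu0), list(kappa0)
    n, S = len(thetas), len(p)
    for _ in range(iters):
        # E-step: posterior probability that each date belongs to each component.
        norm = [TWO_PI * bessel_i(0, k) for k in kappa]
        resp = []
        for t in thetas:
            w = [p[s] * math.exp(kappa[s] * math.cos(t - mu[s])) / norm[s]
                 for s in range(S)]
            total = sum(w)
            resp.append([ws / total for ws in w])
        # M-step: weighted versions of the single-component estimates.
        for s in range(S):
            n_s = sum(r[s] for r in resp)
            c = sum(r[s] * math.cos(t) for r, t in zip(resp, thetas))
            sn = sum(r[s] * math.sin(t) for r, t in zip(resp, thetas))
            p[s] = n_s / n
            mu[s] = math.atan2(sn, c) % TWO_PI
            kappa[s] = a1_inverse(min(math.hypot(c, sn) / n_s, r_cap))
    return p, mu, kappa
```

In practice the result depends on the starting values, so several initializations are usually compared by their log-likelihood; a component whose weight collapses to zero would need separate handling that this sketch omits.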

The final choice of the number of components S was based on the Akaike Information Criterion. The corrected version, namely the AICc, was used (Hurvich and Tsai 1989). The rationale for this choice is that the AICc is recommended when the number of parameters is a substantial fraction of the sample size because it tends to select a more parsimonious model than the AIC. It should also be noted that the mixture model (5) with S components has \(3S-1\) free parameters, i.e. as many as eight parameters for three components and eleven parameters for four components, while the samples of maxima dates have between fifty and seventy elements. This high number of parameters justifies the use of the AICc. The formula \(AICc=AIC+\frac{2(w+1)(w+2)}{n-w-2}\) was used where \(AIC=-2\log L(\Theta _1,..., \Theta _n, \hat{p}, \hat{\mu }, \hat{\kappa })+2w\) and where L is the likelihood function and \(w\) is the number of parameters. The model with the minimum value of the AICc was selected for further study.

To assess the goodness of fit, the congruence between empirical and theoretical quantiles of the same order was evaluated by means of \(r_c\), the circular correlation coefficient (Jammalamadaka and Sarma 1988; Jammalamadaka and SenGupta 2001). Suppose a sample of n pairs of angles is \((\Theta _{11},..., \Theta _{1n}), (\Theta _{21},..., \Theta _{2n})\), then

$$\begin{aligned} r_c=\frac{\sum _{i=1}^n \sin (\Theta _{1i}-\bar{\Theta }_1)\sin (\Theta _{2i}-\bar{\Theta }_2)}{\sqrt{\sum _{i=1}^n \sin ^2(\Theta _{1i}-\bar{\Theta }_1)\sum _{i=1}^n \sin ^2(\Theta _{2i}-\bar{\Theta }_2)}}, \end{aligned}$$
(6)

where \(\bar{\Theta }_1, \bar{\Theta }_2\) are the mean dates of the first and second sample, respectively. To test whether the circular correlation coefficient between populations of dates is significantly different from zero, the test statistic

$$\begin{aligned} z_r=r_c\sqrt{\frac{n\lambda _{20}\lambda _{02}}{\lambda _{22}}} \end{aligned}$$
(7)

is used, where \(\lambda _{kj}=\frac{1}{n}\sum _{i=1}^n\sin ^k(\Theta _{1i}-\bar{\Theta }_1) \sin ^j (\Theta _{2i}-\bar{\Theta }_2)\). If the null hypothesis is true, then the theoretical distribution of \(z_r\) is N(0, 1).
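The coefficient of Eq. (6) and the statistic of Eq. (7) can be computed together, as in the following sketch. The function names are hypothetical, and the two inputs are assumed to be equally long lists of paired angles in radians (here, empirical and theoretical quantiles of the same order).

```python
import math

def circ_mean(thetas):
    """Mean direction of a sample of angles (radians)."""
    return math.atan2(sum(math.sin(t) for t in thetas),
                      sum(math.cos(t) for t in thetas))

def circular_correlation(a, b):
    """Return (r_c, z_r) from Eqs. (6) and (7) for paired angle samples a and b."""
    n = len(a)
    mean_a, mean_b = circ_mean(a), circ_mean(b)
    sa = [math.sin(t - mean_a) for t in a]
    sb = [math.sin(t - mean_b) for t in b]
    r_c = sum(x * y for x, y in zip(sa, sb)) / math.sqrt(
        sum(x * x for x in sa) * sum(y * y for y in sb))
    lam20 = sum(x * x for x in sa) / n
    lam02 = sum(y * y for y in sb) / n
    lam22 = sum((x * y) ** 2 for x, y in zip(sa, sb)) / n
    z_r = r_c * math.sqrt(n * lam20 * lam02 / lam22)
    return r_c, z_r
```

Because both samples are centred at their own mean directions, rotating one sample by a constant angle leaves \(r_c\) unchanged.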

Next, Kuiper’s and Watson’s tests for uniformity were used (Mardia and Jupp 2000). Although these methods are designed for testing the uniformity of circular data, they can also be used for testing goodness of fit to any other continuous distribution function on a circle by taking \(2\pi F(\Theta _i)\) as the data sample, where F is the theoretical (hypothetical) cumulative distribution function (CDF). Kuiper’s test statistic is (Kuiper 1960; Mardia and Jupp 2000)

$$\begin{aligned} V =\max _{1\le i \le n} \left(U_i-\frac{i}{n}\right)-\min _{1\le i \le n} \left(U_i-\frac{i}{n}\right)+\frac{1}{n}, \end{aligned}$$
(8)

where \(U_i=\frac{\Theta _{(i)}}{2\pi }\) with dates ordered so that \(\Theta _{(1)}\le ... \le \Theta _{(n)},\, i=1, ..., n\). The statistic \(V\) is a measure of deviation between the empirical and theoretical CDFs. It is rotation-invariant.

Watson’s test statistic (Watson 1961; Mardia and Jupp 2000) is

$$\begin{aligned} U^2 = \sum _{i=1}^n\left( U_i-\bar{U}-\frac{i-1/2}{n}+\frac{1}{2} \right) ^2+\frac{1}{12n}. \end{aligned}$$
(9)

Watson’s \(U^2\) test is an analog of the Cramér–von Mises test for linear data. Approximations of critical values given in Stephens (1970) and Mardia and Jupp (2000) were used for both Kuiper’s and Watson’s tests.
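Both statistics are simple functions of the ordered values \(U_i\), as the following sketch of Eqs. (8) and (9) shows. The function names are hypothetical; the input is assumed to be a list of values in [0, 1], i.e. \(\Theta _i/(2\pi )\) for a uniformity test or \(F(\Theta _i)\) for a goodness-of-fit test.

```python
def kuiper_v(u):
    """Kuiper's statistic V, Eq. (8), for values u in [0, 1]."""
    n = len(u)
    d = [ui - i / n for i, ui in enumerate(sorted(u), start=1)]
    return max(d) - min(d) + 1.0 / n

def watson_u2(u):
    """Watson's statistic U^2, Eq. (9), for values u in [0, 1]."""
    n = len(u)
    u_bar = sum(u) / n
    return sum(
        (ui - u_bar - (i - 0.5) / n + 0.5) ** 2
        for i, ui in enumerate(sorted(u), start=1)
    ) + 1.0 / (12.0 * n)
```

For a perfectly regular sample \(U_i=(i-1/2)/n\) the statistics take their smallest values, \(V=1/n\) and \(U^2=1/(12n)\), and shifting all values by a constant modulo 1 leaves \(V\) unchanged, in line with the rotation invariance noted above.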

The estimation with the mixture of von Mises distributions was carried out for the DWM, DSM and DAM variables.

The non-parametric bootstrap procedure was implemented to estimate the confidence intervals of the parameters. The bootstrap samples of length n were drawn with replacement \(N=10^3\) times by sampling from the original sample. For every bootstrap sample, the parameters of the von Mises distribution function (or of a mixture) were estimated. Thus, N estimates of every parameter were obtained, \(\hat{p}, \hat{\mu }, \hat{\kappa }\). The lower and upper confidence limits of the parameter were the quantiles of order \(\frac{\alpha }{2}\) and \(1-\frac{\alpha }{2}\) of the sample of N estimates.
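The percentile bootstrap described above can be sketched generically. The function `bootstrap_ci` and its arguments are hypothetical, the estimator is passed in as a function of a resampled list of angles, and the empirical quantiles are taken as simple order statistics, which is an assumption about a detail the text does not specify.

```python
import math
import random

def mean_resultant_length(thetas):
    """Example estimator: mean resultant length of a sample of angles."""
    n = len(thetas)
    return math.hypot(sum(math.cos(t) for t in thetas) / n,
                      sum(math.sin(t) for t in thetas) / n)

def bootstrap_ci(sample, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits of estimator(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(
        estimator([rng.choice(sample) for _ in range(n)])  # resample with replacement
        for _ in range(n_boot)
    )
    lower = estimates[round(alpha / 2.0 * n_boot)]
    upper = estimates[round((1.0 - alpha / 2.0) * n_boot) - 1]
    return lower, upper
```

For an angular parameter such as \(\mu _s\), quantiles taken on \([0, 2\pi )\) can be misleading when the estimates straddle the origin; centring the estimates at their mean direction before taking quantiles is one possible remedy, mentioned here only as a caution.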

All calculations were carried out in R (R Core Team 2017, Lund et al. 2017, Hornik and Grün 2017, Tsagris et al. 2017). The significance level equal to \(\alpha =5\%\) and the confidence level equal to \(1-\alpha =95\%\) were used in this paper.

Results and discussion

Circular statistics

In the AM series, winter floods dominate over summer floods in proportions of 88% to 12% and 59% to 41% in the Zagożdżonka and Czarna Przemsza catchments, respectively. In the Poprad catchment this relation is reversed: 38% of the annual maxima are winter floods and 62% are summer floods.

The circular statistics (see Sect. 3.1) are presented in Table 3.

Table 3 Circular statistics of the DWM / DSM / DAM. The mean flood dates are given in radians (2nd column) and in days (3rd column) of hydrological year

In the Zagożdżonka river, the circular mean flood dates are on 4th March (DWM), 11th July (DSM), and 8th March (DAM). In the Czarna Przemsza and the Poprad rivers, the mean dates are, respectively, 28th February and 16th March (DWM), 1st July and 5th July (DSM), and 11th April and 29th May (DAM).

Water is retained in the snow cover during winter. The two main factors influencing the DWM are snow depth and temperature. Sometimes, the winter floods are amplified by rainfall. In the Zagożdżonka catchment, the warm periods during which the snow may melt usually occur at the end of winter, mainly in March (Hejduk and Hejduk 2014). Similar conditions are found in the Czarna Przemsza catchment, where negative temperatures only rarely occur in April. Therefore, the mean date of the DWM in these two rivers is comparable. The Poprad catchment differs from the other two catchments in the mean of the DWM because of different winter climate conditions (Sect. 2). In the western, high mountainous part, snow accumulates even in May and June due to negative temperatures, and the mean value of the DWM lags by several weeks in comparison with the Zagożdżonka and Czarna Przemsza catchments. This is due to extreme floods caused mainly by snowmelt in March, as well as rain or snow floods which appear in the later spring months, mostly in April. The mean values of the DSM are, in turn, comparable in all three catchments and fall between the end of June and the first ten days of July.

The concentration of the DWM is comparable in all three catchments because the values of \(\bar{r}, CIV\) and \(\sigma\) are similar. The largest variation of the DSM is observed in the Zagożdżonka river while the lowest dispersion is in the Poprad river. As for the DAM, the Czarna Przemsza river shows the largest variation in the date of maximum flow.

In Fig. 2, the rose diagrams of the DWM in the three catchments are shown. The mean flood date \(\bar{\Theta }\) is depicted in every figure. The length of the left arm of the angle is \(\bar{r}\), the mean resultant length. The arm is long if the dates are highly concentrated around \(\bar{\Theta }\) and short if the dates are more dispersed. The shapes for the Zagożdżonka and Czarna Przemsza rivers are similar, with a somewhat higher frequency in March in the Czarna Przemsza river. The shape of the Poprad diagram is markedly different because of two dominating frequencies, in March and in April, while maxima in other months are much less frequent.

Fig. 2

Rose diagram of the DWM in the a Zagożdżonka, b Czarna Przemsza, c Poprad rivers. Every bar represents one month with the height reflecting the number of winter maxima that occurred in this month during the study period. The angle \(\bar{\Theta }\) is the sample mean DWM and the length of the left arm of the angle is \(\bar{r}\), the mean resultant length value of the DWM (the longer the arm the higher the concentration).

In Fig. 3, the rose diagrams of the DSM in the three catchments are shown. The lowest dispersion, in the Poprad river, is reflected in a high \(\bar{r}\) value. It is induced by the highest July frequency. Extreme summer floods, caused predominantly by convective rains, are dominant in the Slovakian part of the High Tatra Mountains for all catchments. The shape of the summer Zagożdżonka diagram shows some similarity to a uniform distribution, which explains its large dispersion, reflected in a low \(\bar{r}\) value and in a high \(\sigma\) value in Table 3.

Fig. 3

Rose diagram of the DSM in the a Zagożdżonka, b Czarna Przemsza, c Poprad rivers. Every bar represents one month with the height reflecting the number of summer maxima that occurred in this month during the study period. The angle \(\bar{\Theta }\) is the sample mean DSM and the length of the left arm of the angle is \(\bar{r}\), the mean resultant length value of the DSM (the longer the arm the higher the concentration)

In Fig. 4, the rose diagrams of the DAM in the three catchments are depicted. In the Zagożdżonka catchment, the rose diagram is mostly stretched over the winter season, with the highest frequency in March, while the summer season is much less occupied. In the Czarna Przemsza catchment, the winter and summer parts are comparable, although the March frequency also dominates. The rose diagram shape in the Poprad catchment is unlike the two others because the summer season clearly dominates, with the highest frequencies in June and July. However, the March frequency is also quite high in the winter season. In the Poprad river, the annual highest flows only rarely occur in the months from August to November because of relatively low precipitation, and from December to February because precipitation accumulates in the snow cover.

Fig. 4

Rose diagram of the DAM in the a Zagożdżonka, b Czarna Przemsza, c Poprad rivers. Every bar represents one month with the height reflecting the number of annual maxima that occurred in this month during the study period. The angle \(\bar{\Theta }\) is the sample mean DAM and the length of the left arm of the angle is \(\bar{r}\), the mean resultant length value of the DAM (the longer the arm the higher the concentration)

It can be observed that, due to the dominating July frequency in the Poprad river and March frequency in the Zagożdżonka river, the mean date in the DAM is as many as 82 days later in the former than in the latter (see Table 3).

Circular distribution function

The parameters of the distribution were estimated using the MLE method. The numerical algorithm was based on the method presented in Hornik and Grün (2013, 2014). Using the results of the AICc criterion, shown in Table 4, the number of mixture components was identified as \(S=2\) in all three catchments in the DWM and as \(S=3\) and \(S=2\) in the Czarna Przemsza river and in the Poprad river in the DSM, respectively. The estimation failed in the Zagożdżonka river in the DSM. In this catchment, \(S=4\) was identified using the AICc criterion; however, huge values of the estimates of the concentration parameters, equal to several hundred, were obtained. This produced a rugged circular PDF curve with several distortions. This can be explained by the shape of the circular diagram of the DSM in the Zagożdżonka river in Fig. 3a, which is more similar to a uniform than to a peaked distribution. In the DAM, the number of components was identified as \(S=1\) in the Zagożdżonka river, \(S=3\) in the Czarna Przemsza river and \(S=2\) in the Poprad river. Therefore, every parameter among \(p, \mu , \kappa\) in formula (5) has two coordinates in the DWM in all three catchments, three and two coordinates in the DSM in the Czarna Przemsza and Poprad rivers, respectively, and one, three and two coordinates in the DAM in the Zagożdżonka, Czarna Przemsza and Poprad rivers, respectively.

Table 4 The AICc values for various S, the number of components in the mixture of the von Mises distribution functions (Eq. (5)) for DWM/DSM/DAM. The mean flood dates are given in radians (2nd column) and in days (3rd column) of hydrological year

The estimates are listed in Table 5. In the DAM, the \(\hat{\mu }\) value in the Zagożdżonka river, \(\hat{\mu }_1\) in the Poprad river and \(\hat{\mu }_1\) and \(\hat{\mu }_2\) in the Czarna Przemsza river are located in the winter season. It is worth observing that in the DAM in the Czarna Przemsza river, the estimate of the total weight of components with the circular mean date in the winter season, i.e. \(\hat{p}_1+\hat{p}_2\), approximately equals the contribution of the winter maxima (WM) to the AM series, namely 0.51 as against 0.59. Therefore, the long-term contribution of seasonal maxima to annual maxima is reflected in \(\hat{p}\) in the Czarna Przemsza river. In the Poprad river, the difference is greater and amounts to 0.25 as against 0.38 (see Sect. 4.1).

In the Zagożdżonka river, the second component prevails in the DWM (\(\hat{p}_2=0.51, \hat{\mu }_2=2.60, \hat{\kappa }_2=14.03\)), which confirms the dominating role of the March maximum flows because the angular value 2.60, after conversion, is located in March. This can also be observed in the March mode equal to 2.20 in the DAM. A large contribution of the second component (\(\hat{p}_2=0.32\)) and a large concentration \(\hat{\kappa }_2=30.34\) around \(\hat{\mu }_2=2.39\) are observed in the DAM in the Czarna Przemsza river. Similarly, the second component in the DWM has a very similar mode (\(\hat{\mu }_2=2.40\)) and contributes to a large degree to the DWM in the Czarna Przemsza river (\(\hat{p}_2=0.57, \hat{\kappa }_2=11.00\)). This can be explained by the dominating role of the March maximum flows. The role of the third component in the DAM is also considerable and shows the second dominant date in June (\(\hat{p}_3=0.49, \hat{\mu }_3=4.06, \hat{\kappa }_3=1.70\)). In the Poprad river, the dominating June and July frequency is reflected in a large contribution of the second component to the DAM (\(\hat{p}_2=0.75, \hat{\kappa }_2=1.6\)) with the mode at \(\hat{\mu }_2=4.10\).

Table 5 Estimates of the parameters of the mixture of von Mises distribution functions (Eq. (5)). Every estimate among \(\hat{p}, \hat{\mu }, \hat{\kappa }\) has two coordinates in all catchments in the DWM, three coordinates in Czarna Przemsza and two coordinates in Poprad in the DSM, and one coordinate in Zagożdżonka, three coordinates in Czarna Przemsza and two coordinates in Poprad in the DAM

To verify the hypothesis that the distribution function of the DWM, DSM and DAM is a von Mises distribution or a mixture of von Mises distributions, Kuiper’s and Watson’s tests [Eqs. (8), (9)] were applied to the \(2\pi F(\Theta _i)\) sample values, where F is the hypothetical CDF. Neither test rejected the null hypothesis of uniformity in any of the three catchments. Results of the goodness-of-fit analysis are shown in Table 6. Both Kuiper’s V and Watson’s \(U^2\) test statistics are lower than the critical values of these tests, equal to 1.747 and 0.187, respectively (Stephens 1970; Mardia and Jupp 2000). This means that the null hypothesis on the theoretical distribution function was not rejected. The circular correlation coefficient \(r_c\) between the empirical and theoretical dates of maximum river flow and the test statistic \(z_r\) are also shown in Table 6. Values of \(r_c\) are very close to 1, and values of \(z_r\) are much higher than the critical value equal to 1.645, which confirms the high congruence between the dates.

Table 6 Results of the goodness-of-fit analysis to the (mixture of) von Mises distribution function of the DWM / DSM / DAM

In Figs. 5 and 6, the circular estimates of the PDF of the DWM and DSM are shown. The shape of every estimate follows that of the rose diagram. It is worth observing that the body of the PDF plot is heavily concentrated on the quarter \((\frac{\pi }{2}, \pi )\) (days between the 92nd and 181st day of the winter season, i.e. from 31st Jan to 30th Apr) in the DWM in the Poprad river and on the quarter \((\pi , \frac{3\pi }{2})\) (182nd–273rd day of the hydrological year, from 1st May to 31st Jul) in the DSM in the Czarna Przemsza and Poprad rivers.

Fig. 5 The circular PDF estimate of the DWM in the (a) Zagożdżonka, (b) Czarna Przemsza, (c) Poprad rivers. Every thick blue point depicts the mean direction parameter estimate \(\hat{\mu }_s\) of the sth subpopulation. The area under the sth subpopulation is equal to \(p_s\) while the height of the sth part of the curve reflects the concentration around the \(\hat{\mu }_s\) value

Fig. 6 The circular PDF estimate of the DSM in the (a) Czarna Przemsza, (b) Poprad rivers. Every thick blue point depicts the mean direction parameter estimate \(\hat{\mu }_s\) of the sth subpopulation. The area under the sth subpopulation is equal to \(p_s\) while the height of the sth part of the curve reflects the concentration around the \(\hat{\mu }_s\) value

In Fig. 7, the circular PDF estimates of the DAM are shown. The main body of the PDF is concentrated on the quarter \((\frac{\pi }{2}, \pi )\) (92nd–181st day in the hydrological year, from 31st Jan to 30th Apr) in the Zagożdżonka river, on the interval \((\frac{\pi }{3}, \frac{5\pi }{3})\) (61st–304th day, from 31st Dec to 31st Aug) in the Czarna Przemsza river and on the semicircle \((\frac{\pi }{2}, \frac{3\pi }{2})\) (92nd–273rd day, from 31st Jan to 31st Jul) in the Poprad river. The plot is smooth in the Zagożdżonka and more diverse in the Czarna Przemsza and Poprad rivers. All densities differ only slightly from zero on dates with a very low frequency, i.e. from September to November in the Zagożdżonka river, from October to January in the Czarna Przemsza river and from November to February in the Poprad river. This means that the annual maximum flow is very unlikely to occur in these periods. It is worth observing that in the Zagożdżonka river, where a high \(\bar{r}\) value was obtained in the DAM (\(\bar{r}=0.59\)) due to a high concentration of dates, a single von Mises distribution was sufficient to reflect the distribution of the date of maximum flow. In the Poprad river, where the sample concentration was moderate (\(\bar{r}=0.50\)), two components in the mixture of von Mises distributions had to be used. In the Czarna Przemsza river, in turn, as many as three components were identified because of the lowest concentration (\(\bar{r}=0.43\)) of the dates of maximum flows. Comparing the results from Tables 3 and 5, perfect agreement between the mean date \(\bar{\Theta }\) and the \(\hat{\mu }\) value can be observed in the DAM in the Zagożdżonka river (because the MLE estimate of \(\mu\) is \(\bar{\Theta }\)). In the Poprad river, the modes \(\mu _1=2.42\) and \(\mu _2=4.10\) of the two components belong to the winter and summer season, respectively, and are similar to the mean angular dates of winter and summer maxima, equal to 2.33 and 4.25.

Fig. 7 The circular PDF estimate of the DAM in the (a) Zagożdżonka, (b) Czarna Przemsza, (c) Poprad rivers. Every thick blue point depicts the mean direction parameter estimate \(\hat{\mu }_s\) of the sth subpopulation. The area under the sth subpopulation is equal to \(p_s\) while the height of the sth part of the curve reflects the concentration around the \(\hat{\mu }_s\) value

In Table 7, the confidence intervals of the parameters are shown. The confidence intervals were obtained with ease if \(S=1\) (DAM, the Zagożdżonka river). However, they were derived with computational difficulty for mixtures because the EM algorithm diverged for certain bootstrap samples and the procedure had to be repeated until convergence. In the Czarna Przemsza river (DSM), the estimation of confidence intervals using the nonparametric bootstrap was not applicable because extremely large concentration parameter estimates were obtained in the bootstrap samples. The possible causes of the divergence of the algorithm are a very high or very low concentration of data in the bootstrap sample and the fact that the width of collapsing mixture components may become zero when many data values are repeated (Archambeau et al. 2003). This may also cause the large width of the confidence intervals and the lack of symmetry around several parameter estimates that can be observed in Table 7. Further studies on this issue are needed.
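For the simple case \(S=1\), a nonparametric bootstrap interval for the mean direction can be sketched as follows. This is only an illustration under stated assumptions: a percentile interval is used, the resampled mean directions are centred on the sample mean direction so that the interval is not distorted by wrapping at \(2\pi\), and all names are hypothetical. The bootstrap scheme used for Table 7, and the EM refitting needed for mixtures, are not reproduced here.

```python
import math
import random

def mean_direction(angles):
    """Sample mean direction in [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)

def wrap(delta):
    """Wrap an angular difference to the interval [-pi, pi)."""
    return (delta + math.pi) % (2 * math.pi) - math.pi

def bootstrap_ci_mean_direction(angles, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean direction (sketch)."""
    rng = random.Random(seed)
    n = len(angles)
    centre = mean_direction(angles)
    deviations = []
    for _ in range(n_boot):
        # Resample the dates with replacement and recompute the mean direction.
        resample = [angles[rng.randrange(n)] for _ in range(n)]
        deviations.append(wrap(mean_direction(resample) - centre))
    deviations.sort()
    lo = deviations[math.floor(alpha / 2 * (n_boot - 1))]
    hi = deviations[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    two_pi = 2 * math.pi
    return (centre + lo) % two_pi, centre, (centre + hi) % two_pi
```

Because a bootstrap sample contains repeated dates, the same resampling step applied to a mixture can produce components that collapse onto a few identical values, which is consistent with the divergence described above.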

Table 7 Confidence intervals of the parameters of the mixture of von Mises distribution functions with the number of components equal to \(S=2\) for all three rivers in the DWM, equal to \(S=3\) for the Czarna Przemsza and \(S=3\) for the Poprad rivers in the DSM and equal to \(S=1\) for the Zagożdżonka, \(S=3\) for the Czarna Przemsza and \(S=2\) for the Poprad rivers in the DAM

As mentioned in Sect. 1, estimation of the von Mises distribution (or of mixtures thereof) is numerically difficult. Therefore, apart from the Hornik and Grün algorithm (Hornik and Grün 2014), other methods were also applied to check whether the results depend on the method of approximation of the ratio of Bessel functions. The methods presented in Banerjee et al. (2005), Tanabe et al. (2007) and Hornik and Grün (2013) were used. In total, three additional methods were applied to each of the nine data series. It was observed that if the number of iterations was sufficiently high, i.e. at least \(10^6\), the AICc pointed at the same S values apart from two cases, namely in the Czarna Przemsza river (\(S=3\) in the DAM) and in the Zagożdżonka river (\(S=2\) in the DWM) in the method based on the Newton algorithm (Hornik and Grün 2014). A difference of approx. 0.5 in the values of the \(\mu _s\) parameters was observed only in the Czarna Przemsza river (DAM) in the methods based on the Newton and Newton–Fourier algorithms (Hornik and Grün 2013). This leads to the conclusion that the results were congruent. It should be noted that the number of components in the mixture (Eq. (5)) relied on the AICc value and that another criterion can lead to another number of components.
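The role of the ratio of Bessel functions can be illustrated for a single von Mises component, where the concentration estimate solves \(A(\kappa ) = I_1(\kappa )/I_0(\kappa ) = \bar{r}\). The sketch below is not the implementation used in the study: it takes the closed-form approximation \(\hat{\kappa } \approx \bar{r}(2-\bar{r}^2)/(1-\bar{r}^2)\) (the form given by Banerjee et al. (2005) for the circular case) as a starting value and refines it with Newton steps, using \(A'(\kappa ) = 1 - A(\kappa )^2 - A(\kappa )/\kappa\). The Bessel functions are evaluated by an unscaled power series, which is an assumption made for brevity and is suitable for moderate \(\kappa\) only.

```python
import math

def bessel_i(order, x, terms=200):
    """Modified Bessel function I_order(x) by power series (moderate x only)."""
    total = 0.0
    term = (x / 2) ** order / math.factorial(order)
    for k in range(terms):
        total += term
        term *= (x / 2) ** 2 / ((k + 1) * (k + 1 + order))
        if term < 1e-17 * total:
            break
    return total

def bessel_ratio(kappa):
    """A(kappa) = I_1(kappa) / I_0(kappa)."""
    return bessel_i(1, kappa) / bessel_i(0, kappa)

def kappa_banerjee(r_bar):
    """Closed-form approximation for the circular case."""
    return r_bar * (2 - r_bar ** 2) / (1 - r_bar ** 2)

def kappa_newton(r_bar, tol=1e-12, max_iter=100):
    """Solve A(kappa) = r_bar by Newton iteration from the approximation."""
    kappa = kappa_banerjee(r_bar)
    for _ in range(max_iter):
        a = bessel_ratio(kappa)
        step = (a - r_bar) / (1 - a * a - a / kappa)
        kappa -= step
        if abs(step) < tol:
            break
    return kappa

for r_bar in (0.43, 0.50, 0.59):   # mean resultant lengths quoted for the DAM
    print(r_bar, kappa_banerjee(r_bar), kappa_newton(r_bar))
```

For very large concentrations, such as those reported for some bootstrap samples, the unscaled series overflows, and exponentially scaled Bessel functions or asymptotic expansions would be needed instead.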

Results show that in the lowland Zagożdżonka catchment, with a low contribution of summer maxima to the total number of annual maxima, the AICc indicated only one component of the von Mises distribution as the estimate of the PDF of the DAM. In the highland Czarna Przemsza catchment and in the mountainous Poprad catchment, the number of components was larger than one for both seasonal and annual maxima dates. This reflected the large complexity of hydrological processes influencing the dates, namely sudden melting at the end of winter or heavy downpours from thunderstorm cells with high rainfall intensity in summer, which can lead to flash floods that cause the large diversity between maxima dates.
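The selection of the number of components by the AICc can be sketched as below. The standard small-sample form \(\mathrm{AICc} = -2\ln L + 2k + 2k(k+1)/(n-k-1)\) is assumed, with \(k = 3S-1\) free parameters for an S-component mixture (S mean directions, S concentrations and \(S-1\) independent weights). The log-likelihood values in the example are hypothetical and serve only to show the mechanics.

```python
def n_free_parameters(s):
    """Free parameters of an S-component von Mises mixture."""
    return 3 * s - 1

def aicc(log_likelihood, n_params, n_obs):
    """Akaike information criterion with small-sample correction."""
    aic = -2 * log_likelihood + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)

def select_components(log_likelihoods, n_obs):
    """Return the S with the lowest AICc and all scores.

    log_likelihoods maps S to the maximised log-likelihood for that S.
    """
    scores = {s: aicc(ll, n_free_parameters(s), n_obs)
              for s, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores

# Hypothetical log-likelihoods for S = 1, 2, 3 and a sample of 50 dates.
best_s, scores = select_components({1: -100.0, 2: -99.5, 3: -99.4}, n_obs=50)
print(best_s, scores)
```

In this hypothetical case the gain in likelihood from additional components is too small to offset the penalty, so a single component is selected, as was the case for the DAM in the Zagożdżonka river.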

The example of the DAM in the Poprad river shows that the mixture can cover the sample asymmetry because both densities with modes at 2.42 (rad) (first component) and 4.10 (rad) (second component) together contribute to a large part of the area under the PDF on the interval between these two values. Additionally, the contributions differ because \(\hat{p}_1=0.25\) and \(\hat{p}_2=0.75\). The contribution is therefore lower for the former and higher for the latter component because of the dominating role of the July dates, which makes the PDF estimate asymmetrical, with the larger part of the body on the lower semicircle. Similarly, the asymmetry was reflected in other PDFs, for example in the DWM in all three catchments and in the DSM in the Poprad river. In the Czarna Przemsza river, in turn, the mixture has three components in the DAM with mean directions at 1.46, 2.39 and 4.06 (rad). Because the March maxima dominate in the series, the PDF is asymmetrical and the main part of the body is concentrated on the upper semicircle. It can be observed in the DAM that most mean direction values \(\mu _s\) are located in the interval \((\frac{\pi }{2}, \frac{3\pi }{2})\) (days 91–273, from 30th Jan to 31st Jul), which makes the PDF curve estimate very thin in months from August to February (Poprad), from June to November (Zagożdżonka) and from September to December (Czarna Przemsza). This reflects a very low contribution of these months to the total DAM frequencies.
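The asymmetry produced by unequal weights can be reproduced numerically. The sketch below evaluates a two-component mixture of the assumed form \(f(\theta ) = \sum _s p_s \exp (\kappa _s \cos (\theta - \mu _s)) / (2\pi I_0(\kappa _s))\) with the weights and mean directions quoted above for the DAM in the Poprad river. The concentration values are hypothetical placeholders, since the estimates from Table 5 are not repeated in the text, and the function names are illustrative.

```python
import math

def bessel_i0(x, terms=200):
    """Modified Bessel function I_0(x) by power series (moderate x only)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2) ** 2 / (k + 1) ** 2
        if term < 1e-17 * total:
            break
    return total

def von_mises_pdf(theta, mu, kappa):
    """Density of a single von Mises component."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))

def mixture_pdf(theta, weights, mus, kappas):
    """Density of the mixture of von Mises components."""
    return sum(p * von_mises_pdf(theta, m, k)
               for p, m, k in zip(weights, mus, kappas))

def mass_on_arc(a, b, weights, mus, kappas, steps=20000):
    """Midpoint-rule integral of the mixture density over (a, b)."""
    h = (b - a) / steps
    return h * sum(mixture_pdf(a + (i + 0.5) * h, weights, mus, kappas)
                   for i in range(steps))

weights = (0.25, 0.75)   # from the text (Poprad, DAM)
mus = (2.42, 4.10)       # from the text (Poprad, DAM)
kappas = (2.0, 2.0)      # HYPOTHETICAL concentrations, for illustration only

upper = mass_on_arc(0.0, math.pi, weights, mus, kappas)
lower = mass_on_arc(math.pi, 2 * math.pi, weights, mus, kappas)
print(upper, lower)
```

With these placeholder concentrations, more than half of the probability mass falls on the lower semicircle \((\pi , 2\pi )\), in line with the dominance of the second component; the exact split depends on the concentration values actually estimated.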

A similar analysis was performed to assess changes in the dates of extreme precipitation at ten stations in the USA (Dhakal et al. 2015). From a statistical point of view, the main difference between Dhakal et al. (2015) and this study is that uniformity was tested by means of various statistical tests in Dhakal et al. (2015), whereas the fit of a mixture of von Mises distributions is tested in this paper.

Conclusions

The von Mises distribution can cover a large variety of both the sample mean circular values and the dispersion values, which are reflected in the mean direction and concentration parameters, respectively. The conclusion can be drawn that it is a useful estimate of a symmetrical or nearly symmetrical, unimodal empirical distribution function of the date of annual maximum flow.

If several sub-populations are identified in the sample of dates of maximum flows, the mixture of von Mises distributions can be used to properly reflect the sample multimodality. The AIC or the AICc criterion can be used for selecting the number of components. Further studies may examine the use of other criteria. However, the mixture was not useful when the empirical distribution was similar to uniform. In this case, the method is not recommended.

Attention should be drawn to the difficulty of parameter estimation of the mixture of von Mises distributions. Further studies can include testing various variants of the EM algorithm and various methods of estimating the concentration parameter.

The methods presented here can also be applied to dates of the maximum precipitation totals. Further studies may then include the linkage between the circular characteristics of precipitation and river flow. The next issue involves grouping catchments according to similarity measures based on the circular approach.