Stochastic modelling of rainfall for the island of Ireland

This paper analyses a recently created continuous 305-year (1711–2016) monthly rainfall series for the island of Ireland. The findings are as follows. The excess skewness in the monthly series may be eradicated by using a Box-Cox transformation with parameter equal to 0.6: a value very similar to that found for the U.K. and its regions. There is no evidence of either an overall stochastic trend or of evolving monthly seasonal patterns, but positive linear trends are found for January, March, and December and a negative linear trend is found for July. Analysis of the seasonal and annual series (which require no transformation) confirms the implication from the monthly data that winters have become progressively wetter and summers progressively drier, with the positive linear trend for winter being twice the size of the negative summer trend. Since there is no trend in either spring or autumn rainfall, annual rainfall shows a positive linear trend. Given that the rainfall series exists for over three centuries, breaks and structural shifts in the model were investigated. Five breaks were identified, three of which occurred in the early portion of the series during the eighteenth century. However, trends were found to be much more stable from the middle of the nineteenth century. For the seasonal series, only a single break, at 1790 for the winter series, was found: it was only after this break that winters became wetter; before then, winter rainfall had a negative trend. In terms of predictability, predictions from the model were found to be more volatile during the second half of the eighteenth century and again from 1976 onwards.


Introduction
Murphy et al. (2018) have recently created a continuous 305-year (1711–2016) monthly rainfall series for the island of Ireland, known as IoI_1711. They have also provided a detailed descriptive statistical analysis of the series but have not attempted any stochastic time series modelling of the type undertaken by Mills (2005, 2015, 2017) for the U.K. and its regions. The purpose of this paper is to undertake such an analysis on IoI_1711 to enable a wider perspective on the evolution of the series to be obtained. To this end, Sect. 2 outlines the model used for the analysis of the monthly IoI_1711 series and discusses how deterministic and stochastic trends and seasonal patterns may be identified, along with model estimation and testing. Section 3 provides a complementary analysis of the seasonal and annual series obtained from the monthly data. Given the long span of the data, Sect. 4 investigates the possibility of breaks and shifts in the model and in its predictability. Section 5 completes the paper by providing a summary, conclusions, and a comparison with the findings of Murphy et al. (2018).

A model for monthly rainfall

The basic model
Following Mills (2005, 2015, 2017), a basic model for a monthly rainfall series observed from time t = 1 to time t = T has been found to be

$$x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i s_{i,t} + \beta_i s_{i,t}\,t\right) + u_t \tag{1}$$

In Eq. (1), rainfall $x_t$ is transformed by the Box and Cox (1964) power transformation defined as

$$x_t^{(\lambda)} = \begin{cases} \left(x_t^{\lambda} - 1\right)/\lambda & \lambda \neq 0 \\ \log x_t & \lambda = 0 \end{cases}$$

This has been applied to ameliorate the skewness found in the raw data, a consequence of $x_t$ being bounded below at zero and possibly having a long right (positive) tail. Through its scaling property, the transformation helps to induce normality, linearity, and constancy of variance into the model. It is used here in its simplest form. It may be generalised in a variety of directions to deal, for example, with negative values and heteroskedasticity, but neither generalisation is needed here.
The $s_{i,t}$, $t = 1, 2, \ldots, T$, are "dummy" variables defined to take the value 1 in month $i$ and 0 elsewhere (where $i = 1$ signifies January, etc.). Their inclusion allows a deterministic monthly pattern to be modelled. The presence of the $s_{i,t}\,t$ "interaction" variables allows for the possibility of different monthly linear time trends. The $\alpha_i$ and $\beta_i$ parameters measure the intercept and slope of these trends, so that if $\beta_i \neq 0$ then the seasonal pattern for month $i$ evolves linearly over time.
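As a rough illustration of how the dummy and interaction variables enter the model, the following Python sketch builds the 24 regressors for a single observation. The function names and the convention that t = 1 falls in January are assumptions made for this example only; they do not come from the paper.

```python
def month_of(t: int) -> int:
    """Month index i (1 = January, ..., 12 = December) of observation t.

    Assumes, for illustration, that t = 1 is a January.
    """
    return (t - 1) % 12 + 1


def design_row(t: int) -> list[float]:
    """Regressors for observation t: the twelve dummies s_{i,t}
    followed by the twelve interactions s_{i,t} * t."""
    dummies = [1.0 if month_of(t) == i else 0.0 for i in range(1, 13)]
    interactions = [s * t for s in dummies]
    return dummies + interactions


def deterministic_part(t: int, alpha: list[float], beta: list[float]) -> float:
    """alpha_i + beta_i * t for the month i in which observation t falls."""
    row = design_row(t)
    coefficients = alpha + beta  # twelve intercepts, then twelve slopes
    return sum(c * x for c, x in zip(coefficients, row))
```

For any t exactly one dummy and one interaction term are non-zero, so each month has its own intercept and its own linear trend.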
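The error $u_t$ can, in general, follow a seasonal autoregressive-moving average (ARMA) process (see, for example, Mills (2019, chapter 8) for technical details and Mills (2014) for a discussion of such models in a meteorological context):

$$\phi(B)\,\Phi(B^{12})\,u_t = \theta(B)\,\Theta(B^{12})\,a_t \tag{2}$$

Here

$$\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p, \qquad \theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$$

are "non-seasonal" polynomials of orders $p$ and $q$ in the lag operator $B$, defined such that $B^j a_t \equiv a_{t-j}$, $a_t$ being zero mean white noise ($E(a_t) = 0$, $E(a_t a_{t-j}) = 0$ for all $j \neq 0$) with variance $E(a_t^2) = \sigma_a^2$. The "seasonal" polynomials

$$\Phi(B^{12}) = 1 - \Phi_1 B^{12} - \cdots - \Phi_P B^{12P}, \qquad \Theta(B^{12}) = 1 - \Theta_1 B^{12} - \cdots - \Theta_Q B^{12Q}$$

are of orders $P$ and $Q$, their presence allowing the error to be autocorrelated at seasonal lags, such as 12, 24, …, as well as at non-seasonal lags.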

Deterministic and stochastic trends and seasonality
More general models result if unit roots are allowed in the $\phi(B)$ and $\Phi(B^{12})$ polynomials. If the non-seasonal autoregressive polynomial contains a unit root, i.e., the characteristic equation associated with $\phi(B)$ contains a root of unity, then $\phi(B)$ can be factorised as

$$\phi(B) = (1 - B)\,\phi^*(B)$$

where $\phi^*(B)$ is a polynomial of order $p - 1$. Equation (1) then becomes, with $\nabla = 1 - B$ signifying the first-difference operator and $u_t^* = \nabla u_t$,

$$\nabla x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i \nabla s_{i,t} + \beta_i \nabla s_{i,t}\,t\right) + u_t^* \tag{3}$$

Noting that $\nabla s_{i,t} = s_{i,t} - s_{i+1,t}$ and $\nabla s_{i,t}\,t = (12 + i)(s_{i,t} - s_{i+1,t})$, where it is taken that $\nabla s_{12,t} = s_{12,t} - s_{1,t+1}$, Eq. (3) in turn becomes

$$\nabla x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i + (12 + i)\beta_i\right)\left(s_{i,t} - s_{i+1,t}\right) + u_t^* \tag{4}$$

Since $\nabla x_t^{(\lambda)} = \alpha + u_t^*$ would depict a random walk with drift $\alpha$, Eq. (4) may be interpreted as implying that $x_t^{(\lambda)}$ contains a stochastic, random walk, trend with differing seasonal drifts, i.e. each month evolves as a random walk with its own drift.
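Alternatively, suppose that the seasonal autoregressive polynomial contains a (seasonal) unit root:

$$\Phi(B^{12}) = (1 - B^{12})\,\Phi^*(B^{12})$$

Equation (1) then becomes, with $\nabla_{12} = 1 - B^{12}$ and $u_t^{**} = \nabla_{12} u_t$,

$$\nabla_{12} x_t^{(\lambda)} = \sum_{i=1}^{12}\left(\alpha_i \nabla_{12} s_{i,t} + \beta_i \nabla_{12} s_{i,t}\,t\right) + u_t^{**} \tag{5}$$

Since $\nabla_{12} s_{i,t} = 0$ and $\nabla_{12} s_{i,t}\,t = 12 s_{i,t}$, Eq. (5) becomes

$$\nabla_{12} x_t^{(\lambda)} = 12\sum_{i=1}^{12}\beta_i s_{i,t} + u_t^{**} \tag{6}$$

and $x_t^{(\lambda)}$ now contains a stochastic seasonal random walk with differing seasonal drifts.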
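The two seasonal-difference identities used to pass from Eq. (5) to Eq. (6) can be confirmed numerically. The short Python check below is only a sketch: the helper names are hypothetical and it assumes, for illustration, that t = 1 is a January.

```python
def s(i: int, t: int) -> int:
    """Seasonal dummy: 1 if observation t falls in month i, else 0.

    Assumes t = 1 is a January.
    """
    return 1 if (t - 1) % 12 + 1 == i else 0


def seasonal_diff(f, t: int):
    """Apply the seasonal difference (1 - B^12) to the sequence f at time t."""
    return f(t) - f(t - 12)


def identities_hold(T: int = 120) -> bool:
    """Check that (1 - B^12) s_{i,t} = 0 and (1 - B^12)(s_{i,t} t) = 12 s_{i,t}
    for every month i and every t from 13 to T."""
    for i in range(1, 13):
        for t in range(13, T + 1):
            d_dummy = seasonal_diff(lambda u: s(i, u), t)
            d_trend = seasonal_diff(lambda u: s(i, u) * u, t)
            if d_dummy != 0 or d_trend != 12 * s(i, t):
                return False
    return True
```

Because the dummies repeat exactly every 12 observations, seasonal differencing removes the monthly intercepts and leaves 12 times the slope of whichever month is being observed as the drift.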

Estimating and testing the model
The data provided by Murphy et al. (2018) are T = 3672 monthly observations on rainfall for the island of Ireland for the 305 years from 1711 to 2016, known as the IoI_1711 series. Figure 1 displays the histogram and empirical kernel density of the series, superimposed on which is a normal distribution with the same mean and standard deviation as IoI_1711. Although the distribution is not excessively kurtotic (the kurtosis measure is only 3.12), it is highly skewed to the right (the skewness measure is 0.52), as might be expected. Figure 2 shows the plot of the log-likelihood function for the Box-Cox transformation parameter $\lambda$ in Eq. (1). The maximum likelihood (ML) estimate is $\hat{\lambda} = 0.58$ with a 95% confidence interval running from 0.53 to 0.63. (ML estimation of, and the construction of a confidence interval for, the Box-Cox transformation parameter in models such as (1) is conveniently discussed in Mills (2019, chapter 2).) For convenience, $\lambda$ was thus set at the value of 0.6, and the histogram and empirical kernel density of the transformed series are shown in Fig. 3. Skewness has been eradicated (it is now just 0.03), and the distribution is close to the superimposed normal distribution.
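To make the role of the Box-Cox parameter concrete, the sketch below applies the transformation and searches a grid for the value that maximises a profile log-likelihood. It is a simplified illustration, not the estimation used in the paper: it assumes a constant-mean model rather than the full model (1), the function names are hypothetical, and the data are synthetic (the exponential of evenly spaced normal quantiles), chosen so that the log transformation, i.e. a parameter of 0, is the right answer.

```python
import math
from statistics import NormalDist, fmean


def box_cox(x: float, lam: float) -> float:
    """Box-Cox power transformation of a positive value x."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def skewness(values: list[float]) -> float:
    """Moment-based sample skewness, m3 / m2^(3/2)."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def profile_loglik(x: list[float], lam: float) -> float:
    """Profile log-likelihood of lam, up to a constant, assuming the
    transformed data are normal with a constant mean (a simplification)."""
    y = [box_cox(v, lam) for v in x]
    m = fmean(y)
    sigma2 = fmean([(v - m) ** 2 for v in y])
    n = len(x)
    jacobian = (lam - 1.0) * sum(math.log(v) for v in x)
    return -0.5 * n * math.log(sigma2) + jacobian


def ml_lambda(x: list[float], grid: list[float]) -> float:
    """Grid value of lam with the largest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(x, lam))


# Synthetic right-skewed data, for illustration only.
z = [NormalDist().inv_cdf((k - 0.5) / 200) for k in range(1, 201)]
x = [math.exp(v) for v in z]

grid = [k / 20 for k in range(-20, 41)]  # -1.00, -0.95, ..., 2.00
lam_hat = ml_lambda(x, grid)
transformed = [box_cox(v, lam_hat) for v in x]
```

Comparing `skewness(x)` with `skewness(transformed)` shows the same effect reported for IoI_1711: the raw values are right-skewed, while the transformed values are close to symmetric.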
To determine the most appropriate form of the combined model given by Eqs. (1) and (2), initial analysis using the information from the sample autocorrelation and partial autocorrelation functions, along with residual diagnostic checks from fitted models, established that the polynomial orders could be set at p = 3, q = 0, and P = Q = 1 (at most), leading to the model

$$x_t^{(0.6)} = \sum_{i=1}^{12}\left(\alpha_i s_{i,t} + \beta_i s_{i,t}\,t\right) + u_t, \qquad \left(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3\right)\left(1 - \Phi B^{12}\right)u_t = \left(1 - \Theta B^{12}\right)a_t \tag{7}$$

The estimates of this model are reported in the first column of Table 1. For there to be a random walk trend in rainfall, $\phi_1 + \phi_2 + \phi_3$ would have to be unity (the unit root condition). The estimates show clearly the absence of such a (stochastic) trend since $\hat{\phi}_1 + \hat{\phi}_2 + \hat{\phi}_3 = -0.195$, with a standard error of 0.0285. There is, though, evidence of non-seasonal autocorrelation as both $\hat{\phi}_1$ and $\hat{\phi}_3$ are significantly different from zero.

Fig. 2 Log-likelihood function for the Box-Cox transformation parameter λ and 95% confidence interval
Table 1 Estimates of Eq. (7) (columns: Eq. (7); Restricted Eq. (7))
Fig. 3 Histogram and empirical kernel density of Box-Cox transformed IoI_1711 with a normal distribution with same mean and standard deviation superimposed

[…] from zero, so that there is, in fact, no evidence of stochastic seasonality: the seasonal pattern does not evolve over time. However, the hypothesis of no deterministic monthly trends ($\beta_1 = \beta_2 = \cdots = \beta_{12} = 0$) may be conclusively rejected, although only four of the months have trends that are individually significant.
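The strength of the evidence against a random walk trend can be gauged informally by measuring how far the estimated autoregressive sum lies from unity, in units of its standard error. The snippet below uses only the two figures reported in the text. It is an informal gauge, since under a unit-root null such a ratio does not follow a standard t distribution.

```python
phi_sum = -0.195    # reported estimate of phi_1 + phi_2 + phi_3
std_error = 0.0285  # reported standard error of that sum

# Distance of the estimate from the unit-root value of 1, in standard errors.
distance = (phi_sum - 1.0) / std_error
print(f"{distance:.1f} standard errors from unity")
```

The estimate lies roughly 42 standard errors below unity.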
The residuals from this model exhibit no autocorrelation. The question of whether the deterministic seasonal model might contain a non-linear component was addressed by including additional quadratic and cubic trends, taking the form $s_{i,t}\,t^2$ and $s_{i,t}\,t^3$, but these were found to be insignificant (an F test for their inclusion has a marginal significance level of 0.65 when only quadratic trends are included and 0.60 when both quadratic and cubic trends are included).
The second column of Table 1 provides estimates of a restricted version of Eq. (7) in which all individually insignificant coefficients have been set to 0. These restrictions are clearly acceptable as the fit of the equation remains the same. The monthly trends from this restricted model, calculated by "inverting" the Box-Cox transformation, are shown in Fig. 4: i.e., if the predicted transformed rainfall for month $i$ in year $y$, where $y = 1$ corresponds to 1711, etc., is given by

$$\hat{x}_{i,y}^{(0.6)} = \hat{\alpha}_i + \hat{\beta}_i\left(12(y - 1) + i\right)$$

then the predicted rainfall itself is given by the inverted value

$$\hat{x}_{i,y} = \left(0.6\,\hat{x}_{i,y}^{(0.6)} + 1\right)^{1/0.6}$$

January, March, and December exhibit positive trends, so that rainfall in these months has increased over the three centuries, while the trend for July is negative, indicating that this month has become progressively drier. The slopes of these (non-linear) trends are rather small; however, January rainfall is predicted to have increased from 81 to 118 mm between 1711 and 2016, March rainfall from 64 to 81 mm, and December rainfall from 91 to 121 mm. July rainfall is predicted to have declined from 99 to 74 mm over the three centuries. The remaining 8 months show constant seasonal factors.
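The reason the fitted monthly trends are described as non-linear is that a straight line on the transformed scale becomes a curve once the Box-Cox transformation is inverted. The sketch below illustrates this for January using only the two endpoint predictions quoted above (81 mm and 118 mm). The estimated coefficients from Table 1 are not used, so the path is an approximation built from rounded endpoints, and the function names are hypothetical.

```python
LAM = 0.6  # Box-Cox parameter adopted in the paper


def box_cox(x: float, lam: float = LAM) -> float:
    """Box-Cox transformation for a non-zero parameter."""
    return (x ** lam - 1.0) / lam


def inv_box_cox(y: float, lam: float = LAM) -> float:
    """Inverse of the Box-Cox transformation for a non-zero parameter."""
    return (lam * y + 1.0) ** (1.0 / lam)


def trend_path(start_mm: float, end_mm: float, n_years: int) -> list[float]:
    """Rainfall implied by a trend that is linear on the transformed scale
    and passes through the two given endpoint rainfalls."""
    y0, y1 = box_cox(start_mm), box_cox(end_mm)
    step = (y1 - y0) / (n_years - 1)
    return [inv_box_cox(y0 + step * k) for k in range(n_years)]


# January, 1711 to 2016 inclusive, from the endpoints quoted in the text.
january = trend_path(81.0, 118.0, 306)
```

Because the inverse transformation is convex for a parameter of 0.6, the implied rainfall path lies slightly below the straight line joining 81 mm and 118 mm, which is why the curvature in the fitted trends is mild.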
Modelling the seasonal and annual rainfall data
Following Murphy et al. (2018), the IoI_1711 monthly series may be aggregated to the four seasons: winter (the sum of December of year $y - 1$, January of year $y$, and February of year $y$, i.e. $\mathrm{win}_y = x_{12,y-1} + x_{1,y} + x_{2,y}$), spring ($\mathrm{spr}_y = x_{3,y} + x_{4,y} + x_{5,y}$), summer ($\mathrm{sum}_y = x_{6,y} + x_{7,y} + x_{8,y}$) and autumn ($\mathrm{aut}_y = x_{9,y} + x_{10,y} + x_{11,y}$). An annual series may then be defined as $\mathrm{ann}_y = x_{1,y} + x_{2,y} + \cdots + x_{12,y}$. These series are displayed in Fig. 5; being observed once a year, they display no seasonality. Interestingly, these series do not require transformation, since at this level of aggregation no significant departures from normality are found in any of them, presumably because aggregation "averages out" many of the more extreme rainfall fluctuations observed at monthly frequencies. Fitted trend lines are also shown in Fig. 5. The fitted models are consistent with the findings from the monthly IoI_1711 series. Only the summer series exhibits any autocorrelation, and this is of just 1-year duration. Winter exhibits a positive trend in rainfall, which is approximately twice the size of the negative trend for summer. Neither spring nor autumn has a trend in rainfall, the positive March trend being dissipated in significance by the lack of trends in April and May rainfall. Consequently, annual rainfall has a positive trend, being approximately the average of the (absolute) winter and summer trends.
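The aggregation rules above translate directly into code. The following Python sketch assumes the monthly data are held in a dictionary keyed by (year, month), which is a choice made for this example only. Note that winter for a given year needs the previous December, so it cannot be formed for the first year of the sample.

```python
def seasonal_totals(monthly: dict[tuple[int, int], float], year: int) -> dict[str, float]:
    """Seasonal and annual rainfall totals for `year`.

    `monthly` maps (year, month) to rainfall, with month 1 = January.
    Winter uses December of the previous year, so a KeyError is raised
    if that observation is missing (as it is for the first sample year).
    """
    def x(month: int, y: int = year) -> float:
        return monthly[(y, month)]

    return {
        "winter": x(12, year - 1) + x(1) + x(2),
        "spring": x(3) + x(4) + x(5),
        "summer": x(6) + x(7) + x(8),
        "autumn": x(9) + x(10) + x(11),
        "annual": sum(x(m) for m in range(1, 13)),
    }
```

The four seasonal totals for a year do not in general sum to the annual total, because winter borrows December from the preceding calendar year.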

Breaks and changing predictability
Over such a long sample period, somewhat in excess of three centuries, it is quite conceivable that the model may have undergone one or more shifts over time. To investigate this possibility, a model closely related to Eq. (7) was investigated for structural shifts using the Bai (1997) and Bai and Perron (1998, 2003a, b) "sequentially determined" break test (see footnote 2). The statistics from this test, shown in Table 2, identify five breaks, at July 1739, December 1765, December 1786, February 1843, and August 1976. Interestingly, three of these breaks occur during the eighteenth century, a period for which Murphy et al. (2018) have "low confidence" in the reliability of the data. The seasonal trends estimated from this five-break model are shown in Fig. 6. The trends are quite volatile across the three breaks during the eighteenth century, but from the mid-1800s, the seasonal trends are rather stable within subsamples, with only one significantly negative linear trend (for September) during the fifth subsample from 1843 to 1976 and one significantly positive linear trend (for July) during the last subsample from 1976.
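The core step of a sequential break search is locating a single break at an unknown date by maximising an F statistic over all admissible dates. The sketch below shows that step for the simplest possible case, a shift in the mean of a series. It is a deliberately reduced illustration under stated assumptions: the paper applies the test to a full regression model, compares the statistics with Bai and Perron critical values, and repeats the search within subsamples, none of which is attempted here. The function names are hypothetical, and the `trim` argument plays the role of the trimming percentage that keeps each subsample from becoming too small.

```python
def ssr(values: list[float]) -> float:
    """Sum of squared deviations of the values from their mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def sup_f_break(y: list[float], trim: float = 0.05) -> tuple[int, float]:
    """Locate a single mean shift by maximising a Chow-type F statistic.

    Returns (k, F), where the first regime is y[:k] and the second is y[k:].
    Candidate breaks leave at least max(2, trim * T) observations per regime.
    Assumes the two-regime fit is never exact (non-zero residual variation).
    """
    T = len(y)
    h = max(2, int(trim * T))
    ssr_null = ssr(y)  # no break: one common mean
    best_k, best_f = -1, float("-inf")
    for k in range(h, T - h + 1):
        ssr_alt = ssr(y[:k]) + ssr(y[k:])  # one mean per regime
        f_stat = (ssr_null - ssr_alt) / (ssr_alt / (T - 2))
        if f_stat > best_f:
            best_k, best_f = k, f_stat
    return best_k, best_f


# Illustrative series: a level shift after 20 observations plus a small
# deterministic alternating disturbance.
y = [(0.0 if t < 20 else 1.0) + 0.1 * (-1) ** t for t in range(40)]
k_hat, f_max = sup_f_break(y)
```

In a sequential procedure the same search would then be rerun separately on `y[:k_hat]` and `y[k_hat:]`, adding a break whenever the statistic in a subsample exceeds the relevant critical value.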
Break tests were also performed on the seasonal and annual series. The only break that could be identified was for the winter series, at 1790. There is thus a trend towards drier winters in the years up to 1790, with the trend then being reversed towards wetter winters after this break point.
The potential for changes in predictability was also investigated (see footnote 3). To assess whether the pattern of rainfall has altered in predictability over time relative to the fitted models, moving residual standard deviations were computed both for the model fitted assuming no breaks and for the model with five breaks. The $n$-period moving residual standard deviation at time $t$, $\hat{\sigma}_{a,t,n}$, is defined from

$$\hat{\sigma}_{a,t,n}^2 = (n - 1)^{-1}\sum_{j=0}^{n-1}\hat{a}_{t-j}^2, \qquad t = n, n + 1, \ldots, T$$

so that the conventional residual standard deviation $\hat{\sigma}_a$ reported in Table 1 implicitly sets the moving window size $n$ equal to the sample size $T$. Figure 7 plots these moving residual standard deviations for n = 120, i.e. for a 10-year (decadal) moving window. Variation is much more pronounced during the eighteenth century, the first half of which exhibits less unpredictability than the second half, during which unpredictability was at its greatest. The period since 1976 has also exhibited a tendency towards greater unpredictability.
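The moving residual standard deviation is straightforward to compute once a residual series is available. The following Python sketch implements the definition with a window of the n most recent residuals and the divisor n − 1. The function name is hypothetical, and the residuals are taken as given rather than estimated here.

```python
import math


def moving_residual_sd(residuals: list[float], n: int) -> list[float]:
    """Moving residual standard deviations for t = n, n + 1, ..., T.

    Each value is sqrt(sum of the n most recent squared residuals / (n - 1)).
    The residuals are not re-centred within the window, following the
    definition in the text.
    """
    if n < 2 or n > len(residuals):
        raise ValueError("n must satisfy 2 <= n <= len(residuals)")
    out = []
    for end in range(n, len(residuals) + 1):
        window = residuals[end - n:end]
        out.append(math.sqrt(sum(a * a for a in window) / (n - 1)))
    return out
```

With monthly residuals, calling `moving_residual_sd(residuals, 120)` gives the decadal window used for Fig. 7, and setting `n` to the full length of the series returns a single value computed over the whole sample.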

Summary and concluding comments
Using a stochastic model that has already been successfully employed to model monthly rainfall series for the U.K. and its regions, this paper has demonstrated that this model can also be successfully fitted to the IoI_1711 series for the island of Ireland. The excess skewness in the monthly IoI_1711 data may be eradicated by using a Box-Cox transformation with parameter equal to 0.6, a value very similar to that found for the U.K. and its regions.

Footnote 2: The testing procedure involves the following steps: (i) begin with the full sample and perform a test of parameter constancy with unknown break using a standard Chow (1960) F test; (ii) if the test rejects the null hypothesis of constancy, determine the break date using an Andrews (1993) modified F test given by the largest F statistic over all possible break dates; (iii) the sample is then divided at this break date into two subsamples, and single unknown breakpoint tests in each subsample are performed. Each of the tests may be viewed as a test of the alternative of l + 1 = 2 breaks versus the null of l = 1 break. A breakpoint is then added whenever a subsample null is rejected; (iv) the procedure is then repeated until no subsample rejects the null hypothesis or until the maximum number of breakpoints, here five, is reached; and (v) the break dates are then refined by reestimation if they are obtained from a subsample containing more than one break. A "trimming percentage" is required to ensure that individual subsamples are not too small. Given the length of the series, a trim of 5% was chosen, with the tests using 5% critical values. The procedure is based on least squares estimation, which precludes ARMA errors: hence, the use of lagged dependent variables in Eq. (8) to model potential autocorrelation.
Footnote 3: The term "changes in predictability" refers to whether the goodness of fit of the estimated models alters, for better or worse, over the sample period.
There is no evidence of either an overall stochastic trend or of evolving monthly seasonal patterns, but positive linear trends are found for January, March, and December and a negative linear trend is found for July. Analysis of the seasonal and annual series (which require no transformation) confirms the implication from the monthly data that winters have become progressively wetter and summers progressively drier, with the positive linear trend for winter being approximately twice the size of the negative summer trend. Since there is no trend in either spring or autumn rainfall, annual rainfall shows an overall positive linear trend. Given that the IoI_1711 series exists for over three centuries, breaks in trend and structural shifts in the model were investigated. Five breaks were identified, three of which occurred in the early portion of the series during the eighteenth century. However, trends were found to be much more stable from the middle of the nineteenth century. For the seasonal series, only a single break, at 1790 for winter, was found: it was only after this break that winters became wetter, whereas before then winter rainfall had a negative trend. In terms of predictability, predictions from the model were found to be more volatile during the second half of the eighteenth century and again from 1976 onwards.
The formal modelling results presented in this paper may also be compared with the essentially descriptive findings of Murphy et al. (2018). The overall finding of increasingly wetter winters and drier summers complements their conclusions, as does the finding that most of the eighteenth century was characterised by drier winters, which Murphy et al. suggest may be a consequence of the under-catch of snowfall. They also point out that before 1790, confidence in the data is low and this may have led to the finding of multiple breaks and more volatile predictions in the models during the eighteenth century. The break analysis also complements their conclusions that trends were less significant from 1850 onwards and that trends computed from recent data are not necessarily representative of long-term trends. Of course, a perennial problem with all trend fitting techniques is their projection into the future. Given the results of the break analysis, one should be wary of projecting current trends too far. Comparisons with the findings for the U.K. regions given in Mills (2017) are necessarily limited by the much shorter sample periods available for the U.K. and the rather different focus of that paper. Perhaps the most noticeable difference is that the annual IoI_1711 series, which contains a positive trend, stands in contrast to all the U.K. regions, for which trends in rainfall are conspicuously absent.

Table 2 note: Critical values from Bai and Perron (2003b). 5% trimming; heterogeneous error distribution across breaks.
The above findings, which complement and enhance the descriptive analysis of Murphy et al. (2018), thus conclusively demonstrate the importance and usefulness of the formal modelling of rainfall series undertaken in this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Fig. 7 Moving residual standard deviations using a decadal window (series shown: residual moving standard deviation; residual moving standard deviation from breaks model)