# Trends in the average temperature in Finland, 1847–2013

DOI: 10.1007/s00477-014-0992-2

- Cite this article as:
- Mikkonen, S., Laine, M., Mäkelä, H.M. et al. Stoch Environ Res Risk Assess (2015) 29: 1521. doi:10.1007/s00477-014-0992-2

## Abstract

The change in the mean temperature in Finland is investigated with a dynamic linear model in order to define the sign and the magnitude of the trend in the temperature time series within the last 166 years. The data consist of gridded monthly mean temperatures. The grid has a 10 km spatial resolution, and it was created by interpolating a homogenized temperature series measured at Finnish weather stations. Seasonal variation in the temperature and the autocorrelation structure of the time series were taken into account in the model. The Finnish temperature time series exhibits a statistically significant trend, which is consistent with human-induced global warming. The mean temperature has very likely risen by over 2 °C in the years 1847–2013, which amounts to 0.14 °C/decade. The warming after the late 1960s has been more rapid than ever before. The increase in the temperature has been highest in November, December and January. The spring months (March, April, May) have also warmed more than the annual average, but the change in the summer months has been less evident. The detected warming clearly exceeds the global trend, which matches the postulation that the warming is stronger at higher latitudes.

### Keywords

Temperature change, Time series analysis, State space models

## 1 Introduction

The global average temperature has increased by about 0.8 °C since the mid-19th century. It has been shown (e.g., Bloomfield 1992; Gao and Hawthorne 2006; Wu and Zhao 2007; Keller 2009) that this increase is statistically significant and that it can, for the most part, be attributed to human-induced climate change (IPCC 2013; Foster and Rahmstorf 2011). A temperature increase is obvious also in regional and local temperatures in many parts of the world. However, compared with the global average temperature, the regional and local temperatures exhibit higher levels of noise, which has largely been removed from the global temperature due to the higher level of averaging. It is therefore not always clear that a regional or local warming signal, although apparent “to the naked eye” in the temperature data, can, under strict assumptions, be considered statistically significant. Because climate change is one of the most serious environmental issues today, the question of statistical significance in local and regional temperature trends is not only of scientific but also of public interest.

In this article, we consider the time series of Finnish average temperatures in 1847–2013. Because Finland is located in northern latitudes, it is subject to the polar amplification of climate change-induced warming, which is due to the enhanced melting of snow and ice and other feedback mechanisms (see, e.g., Screen and Simmonds 2010; Serreze and Barry 2011). Therefore, warming in Finland is expected to be approximately 50 % higher than the global average. At the same time, the location of Finland between the Atlantic Ocean and continental Eurasia causes the weather to be very variable, and thus the temperature signal is rather noisy.

The concept of trend in itself is not completely free of ambiguity (e.g., Wu et al. 2007). Ambient temperature time series, for example, exhibit autocorrelation created by processes that are not completely understood. Therefore, the choice of the autocorrelation model is somewhat arbitrary, which is reflected in the obtained trend and its significance level. It is relatively straightforward to calculate different averages and linear trends from the observed temperatures. However, to evaluate the significance of the observed changes relative to the natural year-to-year variability, and to give realistic uncertainty estimates of the trends, we need statistical modeling. In this paper we have used dynamic regression to model the seasonality and the background level of the average temperature in Finland for the years 1847–2013. As our model fits the observed data well and the non-modeled part of the variability, the model residuals, can be seen to consist of independent Gaussian noise, we can safely say that the uncertainty attributed to the trend values given here is well justified.

## 2 Data

Tietäväinen et al. (2010) created an over 160-year-long time series of monthly mean temperature grids with 10 km resolution for Finland. Homogenized station values of monthly mean temperature (Tuomenvirta 2001) from Finnish weather stations as well as monthly mean temperatures from selected weather stations in Sweden, Norway, and Russia near the Finnish border were used for the spatial interpolation. A kriging interpolation method (Matheron 1963; Ripley 1981), especially developed for climatological applications in Finland (Henttonen 1991), was used for creating the monthly mean temperature grids. As external forcing parameters, the kriging method took into account the geographical coordinates, elevation of the terrain, and the percentage of lakes and sea in each grid box. At the 10 km resolution, a total of 3,829 grid boxes were needed to cover the whole of Finland. Besides, according to Tietäväinen et al. (2010), this spatial model has previously been applied in climatological research projects conducted by Venäläinen and Heikinheimo (1997), Vajda and Venäläinen (2003), Venäläinen et al. (2005), Vajda (2007), and Ylhäisi et al. (2010).

The spatial representativeness of the observation station network is highly dependent on time. A meteorological observation network was initiated in 1846 by the Societas Scientiarum Fennica—The Finnish Society of Sciences and Letters (Finska Vetenskaps-Societeten). The extent of data is limited to temperature measurements from six stations in the first year of the time series in 1847. After decades of slow growth in the number of observation stations, in the 1880s, many new observation stations were established in different parts of southern and central Finland; however, in northern Finland, the first weather stations were not set up until the early 20th century. Therefore, data from Sweden and Norway is crucial. The number of stations used for the interpolation process increased continuously until the 1970s, when there were 179 stations in the network, after which it has slowly decreased. The density of the station network is still higher in southern and central Finland than in the northern part of the country. Stations outside of Finnish borders were removed from the kriging interpolation after 2002 and currently there are more than 120 stations in the network. More details on the station network can be found in Tietäväinen et al. (2010).

## 3 Statistical methods

A trend is a change in the statistical properties of the background state of a system (Chandler and Scott 2011). The simplest case is a linear trend, in which, when applicable, we need to specify only the trend coefficient and its uncertainty. Natural systems evolve continuously over time, and it is not always appropriate to approximate the background evolution with a constant trend. Furthermore, the time series can include multiple time dependent cycles, and they are typically non-stationary, i.e., their distributional properties change over time.

In this work, we apply dynamic regression analysis, using a dynamic linear model (DLM) approach, to the time series analysis of Finnish temperatures. A DLM is used to statistically describe the underlying processes that generate variability in the observations. The method effectively decomposes the series into basic components, such as level, trend, seasonality, and noise. The components can be allowed to change over time, and the magnitude of this change can be modeled and estimated. The part of the variability that is not explained by the chosen model is assumed to be uncorrelated noise, and we can evaluate the validity of this assumption with statistical model residual diagnostics.

Our model is, of course, just one possible way to describe the evolution of the observed temperatures. We see it as a very natural extension of the non-dynamic multiple linear regression model. The method allows us to estimate both the model states (e.g. time-varying trends) and the model parameters (e.g. variances related to temporal variability), and we can assess the uncertainties and statistical significance of the underlying features. In this study, we are not trying to use the model to predict future temperatures, but to detect trends by finding a description that is consistent with the observed temperature variability. To study the adequacy of our chosen model, we examine the model residuals to see if the modeling assumptions are fulfilled.

With a properly set-up and estimated DLM model, we can detect significant changes in the background state and estimate the trends. The magnitude of the trend is not prescribed by the modeling formulation, and the method does not favor finding a “statistically significant” trend. The statistical model provides a method to detect and quantify trends, but it does not directly provide explanations for the observed changes, i.e., whether for example natural variability or solar effects could explain the changes in the background level. Model diagnostics and the increase in the observational data will eventually falsify incorrect models and other poorly selected prior specifications (see e.g. Tarantola 2006).

Dynamic linear models are linear regression models whose regression coefficients can depend on time. This dynamic approach is well known and documented in the time series literature (Chatfield 1989; Harvey 1991; Hamilton 1994; Migon et al. 2005). These models are sometimes called structural time series models or hidden Markov models. The latter name comes from the fact that dynamic regression is best described by the state space approach, where the hidden state variables describe the time evolution of the components of the system. Modern computationally oriented references on the state space approach include Petris et al. (2009) and Durbin and Koopman (2012). The first describes the software package dlm for the R statistical language, which can be used to do the calculations described in this paper. We have used the Matlab software and computer code described in Laine et al. (2014). In this work, we use a DLM to explain variability in the temperature time series using components for a smoothly varying locally linear mean level, for a seasonal effect, and for noise that is allowed to have autoregressive correlation. The autoregressive stochastic error term is used to account for long-range dependencies, irregular cycles, and the effects of different forcing mechanisms that a model with only a second order random walk for the mean and stochastic seasonality does not suffice to explain.

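The general DLM can be written with an observation equation and a state evolution equation as

\[ y_{t} = F_{t} x_{t} + v_{t}, \quad v_{t} \sim N(0, V_{t}), \tag{1} \]

\[ x_{t} = G_{t} x_{t-1} + w_{t}, \quad w_{t} \sim N(0, W_{t}), \tag{2} \]

where \( y_{t} \) are the observations and \( x_{t} \) is the vector of unobserved states of the system at time *t*. The matrix \( F_{t} \) maps the hidden states to the observations, and the matrix \( G_{t} \) describes the evolution of the states. The observation uncertainty \( v_{t} \) and the model error \( w_{t} \) are assumed to be Gaussian, with covariance matrices \( V_{t} \) and \( W_{t} \). The time index *t* runs from 1 to *n*, the length of the time series to be analyzed. In this work, we analyze univariate temperature time series, but the framework would also allow the modeling of multivariate series. We use notation common to many time series textbooks, e.g., Petris et al. (2009).

The trend is modeled as a slowly varying change in the mean level of the process. A simple model for this is the local linear trend model, with a mean level \( \mu_{t} \) and a change in the level (trend) \( \alpha_{t} \) from time *t*-1 to time *t*. This system can be written by the equations

\[ y_{t} = \mu_{t} + \varepsilon_{\text{obs}}, \quad \varepsilon_{\text{obs}} \sim N(0, \sigma_{t}^{2}), \tag{3} \]

\[ \mu_{t} = \mu_{t-1} + \alpha_{t-1} + \varepsilon_{\text{level}}, \quad \varepsilon_{\text{level}} \sim N(0, \sigma_{\text{level}}^{2}), \tag{4} \]

\[ \alpha_{t} = \alpha_{t-1} + \varepsilon_{\text{trend}}, \quad \varepsilon_{\text{trend}} \sim N(0, \sigma_{\text{trend}}^{2}), \tag{5} \]

where the Gaussian stochastic “*ε*” terms are used for the observation uncertainty and for random dynamics of the level and the trend. In terms of the state space Eqs. (1) and (2) this model is written as

\[ x_{t} = \begin{bmatrix} \mu_{t} \\ \alpha_{t} \end{bmatrix}, \quad G_{\text{trend}} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad F_{\text{trend}} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad W_{\text{trend}} = \begin{bmatrix} \sigma_{\text{level}}^{2} & 0 \\ 0 & \sigma_{\text{trend}}^{2} \end{bmatrix}, \quad V_{t} = \begin{bmatrix} \sigma_{t}^{2} \end{bmatrix}. \tag{6} \]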

Note that only the state vector \( x_{t} \) and the observation uncertainty covariance (a \( 1 \times 1 \) matrix) depend on time *t*. Depending on the choice of the variances \( \sigma_{\text{level}}^{2} \) and \( \sigma_{\text{trend}}^{2} \), the mean state \( \mu_{t} \) will define a smoothly varying background level of the time series. In our analyses, we will set \( \sigma_{\text{level}}^{2} = 0 \) and estimate \( \sigma_{\text{trend}}^{2} \) from the observations. As noted by Durbin and Koopman (2012), this will result in an integrated random walk model for the mean level \( \mu_{t} \), which can be interpreted as a cubic spline smoother, with well-based statistical descriptions of the stochastic components.
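The effect of setting \( \sigma_{\text{level}}^{2} = 0 \) can be illustrated with a short simulation. The following is a minimal sketch, not the authors' code: it assumes the level recursion \( \mu_{t} = \mu_{t-1} + \alpha_{t-1} \) with a random-walk slope \( \alpha_{t} = \alpha_{t-1} + \varepsilon_{\text{trend}} \), and the function name, the series length and the value of `sigma_trend` are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_integrated_random_walk(n, sigma_trend, mu0=0.0, alpha0=0.0, seed=0):
    """Simulate a level mu_t whose slope alpha_t follows a random walk."""
    rng = np.random.default_rng(seed)
    eps_trend = rng.normal(0.0, sigma_trend, size=n)
    mu = np.empty(n)
    alpha = np.empty(n)
    mu[0], alpha[0] = mu0, alpha0
    for t in range(1, n):
        mu[t] = mu[t - 1] + alpha[t - 1]        # level update, no level noise
        alpha[t] = alpha[t - 1] + eps_trend[t]  # slope performs a random walk
    return mu, alpha, eps_trend

mu, alpha, eps_trend = simulate_integrated_random_walk(n=200, sigma_trend=0.01)

# The second difference of the level recovers the slope disturbances:
# mu[t+1] - 2*mu[t] + mu[t-1] = alpha[t] - alpha[t-1] = eps_trend[t]
second_diff = mu[2:] - 2.0 * mu[1:-1] + mu[:-2]
print(np.allclose(second_diff, eps_trend[1:-1]))
```

Because all randomness enters through the slope, the simulated level has no abrupt jumps, which is the smoothness property the text relates to a cubic spline smoother.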

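Autocorrelation can be modeled as a further state component. For a first order autoregressive model AR(1) with a coefficient *ρ* and an innovation variance, \( \sigma_{AR}^{2} \), we simply define

\[ G_{\text{AR}} = \begin{bmatrix} \rho \end{bmatrix}, \quad F_{\text{AR}} = \begin{bmatrix} 1 \end{bmatrix}, \quad W_{\text{AR}} = \begin{bmatrix} \sigma_{AR}^{2} \end{bmatrix}, \]

and both *ρ* and \( \sigma_{AR}^{2} \) can be estimated from the observations.

The individual model components are then combined into the full evolution and observation equations. The analysis proceeds to the estimation of the variance parameters and other parameters in the model formulation (e.g., the AR coefficient *ρ* in the matrix \( G_{\text{AR}} \)), and to the estimation of the model states by state space Kalman filter methods.

In the DLM analysis of Finnish temperatures, we use the model

\[ y_{t} = \mu_{t} + \gamma_{t} + \eta_{t} + \varepsilon_{t}, \quad t = 1, \ldots, n, \]

where \( y_{t} \) is the observed mean temperature at time *t*, \( \mu_{t} \) is the mean temperature level, \( \gamma_{t} \) is the seasonal component for monthly data, \( \eta_{t} \) is an autoregressive error component, and \( \varepsilon_{t} \) is the error term for the uncertainty in the observed temperature values. The simplification \( \sigma_{\text{level}}^{2} = 0 \) in Eq. (4) allows us to write a second difference process for the mean level \( \mu_{t} \) as

\[ \Delta^{2} \mu_{t} = \mu_{t-1} - 2 \mu_{t} + \mu_{t+1} = \varepsilon_{\text{trend}}, \quad \varepsilon_{\text{trend}} \sim N(0, \sigma_{\text{trend}}^{2}). \]

The seasonal component \( \gamma_{t} \) takes a different value for each month, with the condition that 12 consecutive months sum to zero, up to a random disturbance, for all *t*:

\[ \sum_{i=0}^{11} \gamma_{t-i} = \varepsilon_{\text{seas}}, \quad \varepsilon_{\text{seas}} \sim N(0, \sigma_{\text{seas}}^{2}). \]

The term \( \eta_{t} \) follows a first order autoregressive process, AR(1), with coefficient *ρ*:

\[ \eta_{t} = \rho \eta_{t-1} + \varepsilon_{\text{AR}}, \quad \varepsilon_{\text{AR}} \sim N(0, \sigma_{AR}^{2}). \]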
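To make the combination of a level, a seasonal component and an AR(1) error into one state space system more concrete, the sketch below assembles block-diagonal evolution and covariance matrices in NumPy. It is an illustration under stated assumptions rather than the implementation used in the paper: the seasonal block is assumed to take the common dummy-variable form with 11 states, the helper names `block_diag` and `build_dlm` are hypothetical, and the posterior means from the parameter table serve only as example inputs.

```python
import numpy as np

def block_diag(*blocks):
    """Place square blocks on the diagonal of a zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

def build_dlm(var_trend, var_seas, var_ar, rho, period=12):
    """Return (F, G, W) for level+trend, seasonal and AR(1) components."""
    # Local linear trend with zero level variance; states are [mu_t, alpha_t].
    G_trend = np.array([[1.0, 1.0], [0.0, 1.0]])
    W_trend = np.diag([0.0, var_trend])
    # Seasonal block (assumed dummy-variable form) with period-1 states:
    # [gamma_t, gamma_{t-1}, ..., gamma_{t-period+2}].
    m = period - 1
    G_seas = np.zeros((m, m))
    G_seas[0, :] = -1.0              # new effect = -(sum of previous m effects)
    G_seas[1:, :-1] = np.eye(m - 1)  # shift older effects down by one slot
    W_seas = np.zeros((m, m))
    W_seas[0, 0] = var_seas
    # AR(1) block with the single state eta_t.
    G_ar = np.array([[rho]])
    W_ar = np.array([[var_ar]])
    G = block_diag(G_trend, G_seas, G_ar)
    W = block_diag(W_trend, W_seas, W_ar)
    # The observation picks up mu_t, gamma_t and eta_t.
    F = np.zeros(G.shape[0])
    F[[0, 2, G.shape[0] - 1]] = 1.0
    return F, G, W

# Example inputs only: posterior means taken from the parameter table.
F, G, W = build_dlm(var_trend=0.00011, var_seas=0.0019, var_ar=2.3, rho=0.34)

# One noise-free step: the new seasonal effect cancels the previous eleven,
# so twelve consecutive effects sum to zero.
rng = np.random.default_rng(1)
x_prev = rng.normal(size=G.shape[0])
x_next = G @ x_prev
seasonal_sum = x_next[2] + x_prev[2:13].sum()
```

With the stochastic disturbance switched back on, the same sum equals the seasonal noise term instead of exactly zero, which lets the seasonal pattern drift slowly over time.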

In the model construction above, we have four unknown model parameters: the three variances for stochastic model evolution, \( \sigma_{\text{trend}}^{2} \), \( \sigma_{\text{seas}}^{2} \), \( \sigma_{AR}^{2} \), and the autoregressive coefficient *ρ*. If the values of these parameters are known, the state space representation and the implied Markov properties of the processes allow estimation of the marginal distributions of the states, given the observations and the parameters, by the Kalman filter and Kalman smoother formulas (Durbin and Koopman 2012). The Kalman smoother gives efficient recursive formulas to calculate the marginal distribution of the model states at each time *t* given the whole set of observations \( y_{t} , t = 1, \ldots ,n \). In a DLM these distributions are Gaussian and thus fully defined by a mean vector and a covariance matrix. In addition, the auxiliary parameter vector *θ* = [\( \sigma_{\text{trend}}^{2} \), \( \sigma_{\text{seas}}^{2} \), \( \sigma_{AR}^{2} \), *ρ*] can be estimated using a marginal likelihood function that is provided as a side product of the Kalman filter recursion. This likelihood can be used to estimate the parameter *θ* by the maximum likelihood method, and the obtained estimates can be plugged back into the equations. We instead use a Bayesian approach and Markov chain Monte Carlo (MCMC) simulation to estimate the posterior distribution of *θ* and to account for its uncertainty in the trend analysis.
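The way the Kalman filter yields both the filtered state means and the marginal likelihood can be sketched for a univariate observation as follows. This is a generic textbook-style recursion written for illustration, assuming time-invariant `F`, `G`, `W` and a scalar observation variance `V`; the function name and the toy constant-level data are hypothetical and unrelated to the Finnish temperature series.

```python
import numpy as np

def kalman_filter(y, F, G, W, V, x0, C0):
    """Filter y_t = F x_t + v_t, x_t = G x_{t-1} + w_t; return means and log-likelihood."""
    x = np.asarray(x0, dtype=float).copy()
    C = np.asarray(C0, dtype=float).copy()
    loglik = 0.0
    filtered = np.empty((len(y), x.size))
    for t, y_t in enumerate(y):
        # Prediction step: propagate the state mean and covariance.
        x = G @ x
        C = G @ C @ G.T + W
        if not np.isnan(y_t):            # missing observations are skipped
            e = y_t - F @ x              # one-step-ahead prediction error
            Q = F @ C @ F + V            # its variance (a scalar)
            K = C @ F / Q                # Kalman gain
            x = x + K * e
            C = C - np.outer(K, F @ C)
            loglik -= 0.5 * (np.log(2.0 * np.pi * Q) + e * e / Q)
        filtered[t] = x
    return filtered, loglik

# Toy example: a constant level observed with unit-variance noise.
rng = np.random.default_rng(0)
y = 3.0 + rng.normal(size=50)
y[10] = np.nan                            # one missing value
F = np.array([1.0])
G = np.array([[1.0]])
W = np.array([[0.0]])
filtered, loglik = kalman_filter(y, F, G, W, V=1.0, x0=[0.0], C0=[[1e6]])
```

The accumulated `loglik` is the quantity that a maximum likelihood or MCMC procedure would evaluate repeatedly for different values of the parameter vector.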

The level component \( \mu_{t} \) models the evolution of the mean temperature after the seasonal and irregular noise components have been filtered out. It allows us to study the temporal changes in the temperature. The trends can be studied visually, or by calculating trend-related statistics from the estimated mean level component \( \mu_{t} \). Statistical uncertainty statements can be given by simulating realizations of the level component using MCMC and the Kalman simulation smoother (Durbin and Koopman 2012; Laine et al. 2014).
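Once realizations of the level component are available, a trend-related statistic and its probability limits follow from simple sample summaries. The sketch below assumes the realizations are stored as an array with one row per draw and one column per month; the function name is hypothetical, and the synthetic linear ramps merely stand in for actual simulation smoother output.

```python
import numpy as np

def decadal_change_summary(level_draws, months_per_year=12, years=10):
    """Change between the means of the last and first `years` years, per draw."""
    k = months_per_year * years
    change = level_draws[:, -k:].mean(axis=1) - level_draws[:, :k].mean(axis=1)
    lower, upper = np.percentile(change, [2.5, 97.5])
    return lower, change.mean(), upper

# Synthetic stand-in for level realizations: 500 linear ramps over 40 years
# with hypothetical slopes (degrees Celsius per year).
rng = np.random.default_rng(0)
n_draws, n_months = 500, 12 * 40
years_axis = np.arange(n_months) / 12.0
slopes = rng.normal(0.02, 0.005, size=n_draws)
level_draws = slopes[:, None] * years_axis[None, :]

lower, mean, upper = decadal_change_summary(level_draws)
```

Computing the statistic separately for every draw and then taking percentiles propagates the uncertainty of the whole level path into the reported limits.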

The strength of the DLM method is its ability to estimate all model components, such as trends and seasonality, in one estimation step and to provide a conceptually simple decomposition of the observed variability. Furthermore, the analysis does not require assumptions about the stationarity of the series in the sense required, e.g., in classical ARIMA time series analyses. ARIMA analyses can in fact be seen as special cases of DLM analyses. For example, the simple local level and trend DLM of Eqs. (3–5) is equivalent to the ARIMA (0,2,2) model. In addition, the state space methods can easily handle missing observations, and they are extendible to non-linear state space models, to hierarchical parameterizations, and to non-Gaussian errors (e.g. Durbin and Koopman 2012; Gonçalves and Costa 2013). Details of the construction procedure of a DLM and of the estimation of model states and parameters can be found in Gamerman (2006) and in Petris et al. (2009). We use an efficient adaptive MCMC algorithm by Haario et al. (2006) and the Kalman filter likelihood to estimate the four parameters in *θ*. The details of the estimation procedure can be found in Laine et al. (2014), who use a similar DLM to study trends in stratospheric ozone concentrations. We also conducted our analyses with the dlm package in R (Petris 2010) to verify the computations.
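The ARIMA (0,2,2) statement can be checked with a short derivation. As a sketch, assume the local level and trend model is written as \( y_{t} = \mu_{t} + \varepsilon_{\text{obs},t} \), \( \mu_{t} = \mu_{t-1} + \alpha_{t-1} + \varepsilon_{\text{level},t} \) and \( \alpha_{t} = \alpha_{t-1} + \varepsilon_{\text{trend},t} \), with mutually independent disturbances and, as a simplifying assumption made here, a constant observation variance. Differencing once and twice gives

\[ \Delta y_{t} = y_{t} - y_{t-1} = \alpha_{t-1} + \varepsilon_{\text{level},t} + \varepsilon_{\text{obs},t} - \varepsilon_{\text{obs},t-1}, \]

\[ \Delta^{2} y_{t} = \Delta y_{t} - \Delta y_{t-1} = \varepsilon_{\text{trend},t-1} + \left( \varepsilon_{\text{level},t} - \varepsilon_{\text{level},t-1} \right) + \left( \varepsilon_{\text{obs},t} - 2 \varepsilon_{\text{obs},t-1} + \varepsilon_{\text{obs},t-2} \right). \]

The right-hand side involves only disturbances from times *t*, *t*-1 and *t*-2, so the autocovariance of \( \Delta^{2} y_{t} \) vanishes beyond lag 2. This is the defining property of a second order moving average process, and hence \( y_{t} \) behaves as an ARIMA (0,2,2) process with restricted coefficients.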

## 4 Results and discussion

The variance parameters in matrices \( V_{t} \) and \( W_{t} \) and the autocorrelation coefficient *ρ* used in the DLM were estimated using the MCMC simulation algorithm. The length of the MCMC chain was 10,000, and the last half of the chain was used for calculating the posterior values. The convergence of the MCMC algorithm was assessed using plots of the MCMC chain, by calculating convergence diagnostic statistics, and by estimating the Monte Carlo error of the posterior estimates.
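The post-processing of such a chain can be sketched as follows. The text does not state which estimator was used for the Monte Carlo error, so the batch means method shown here is an assumption, as are the function name, the number of batches and the synthetic autocorrelated chain that stands in for real MCMC output.

```python
import numpy as np

def posterior_summary(chain, n_batches=20):
    """Posterior mean, standard deviation and batch-means Monte Carlo error."""
    chain = np.asarray(chain, dtype=float)
    kept = chain[len(chain) // 2:]                 # keep the last half only
    batch_len = len(kept) // n_batches
    batches = kept[:batch_len * n_batches].reshape(n_batches, batch_len)
    batch_means = batches.mean(axis=1)
    mc_error = batch_means.std(ddof=1) / np.sqrt(n_batches)
    return kept.mean(), kept.std(ddof=1), mc_error

# Synthetic stand-in for a chain of length 10,000: an autocorrelated
# sequence fluctuating around a hypothetical value of 0.34.
rng = np.random.default_rng(0)
chain = np.empty(10_000)
chain[0] = 0.0
for i in range(1, chain.size):
    chain[i] = 0.34 + 0.9 * (chain[i - 1] - 0.34) + rng.normal(0.0, 0.01)

post_mean, post_std, mc_error = posterior_summary(chain)
```

Batch means account for the autocorrelation of successive MCMC samples, which a naive standard error based on the full sample size would ignore.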

Modeled decadal average temperatures [°C] with lower and upper limits of the 95 % probability limits as in Fig. 2

| Decade | Lower 95 % | Mean | Upper 95 % |
|---|---|---|---|
| 1840–1850 | 0.063 | 0.38 | 0.71 |
| 1850–1860 | 0.20 | 0.45 | 0.69 |
| 1860–1870 | 0.34 | 0.53 | 0.74 |
| 1870–1880 | 0.41 | 0.61 | 0.77 |
| 1880–1890 | 0.50 | 0.70 | 0.87 |
| 1890–1900 | 0.63 | 0.83 | 1.0 |
| 1900–1910 | 0.83 | 1.0 | 1.2 |
| 1910–1920 | 1.1 | 1.2 | 1.4 |
| 1920–1930 | 1.2 | 1.4 | 1.6 |
| 1930–1940 | 1.4 | 1.6 | 1.9 |
| 1940–1950 | 1.5 | 1.7 | 1.9 |
| 1950–1960 | 1.5 | 1.7 | 1.8 |
| 1960–1970 | 1.5 | 1.7 | 1.9 |
| 1970–1980 | 1.5 | 1.8 | 2.0 |
| 1980–1990 | 1.7 | 2.0 | 2.1 |
| 1990–2000 | 2.1 | 2.2 | 2.4 |
| 2000–2010 | 2.3 | 2.6 | 2.8 |
| 2010–2020 | 2.4 | 2.8 | 3.2 |

Prior and posterior means and corresponding relative standard errors shown in Fig. 3

| Parameter name | Posterior mean | Posterior relative standard error (%) | Prior mean | Prior relative standard error (%) |
|---|---|---|---|---|
| \( \sigma_{\text{trend}}^{2} \) | 0.00011 | 71 | 0.0004 | 200 |
| \( \sigma_{\text{seas}}^{2} \) | 0.0019 | 140 | 0.01 | 1,000 |
| \( \sigma_{\text{AR}}^{2} \) | 2.3 | 1.7 | 2.0 | 500 |
| *ρ* | 0.34 | 6.3 | prior is uniform(0,1) | |

Temperature change, between the last and the first 10 years, for each month

| Month | Lower 95 % | Mean | Upper 95 % |
|---|---|---|---|
| January | 2.3 | 3.2 | 4.4 |
| February | 1.0 | 1.9 | 2.8 |
| March | 2.2 | 2.8 | 3.5 |
| April | 2.1 | 2.5 | 3.2 |
| May | 2.5 | 3.0 | 3.5 |
| June | 1.0 | 1.4 | 1.8 |
| July | 0.2 | 0.7 | 1.4 |
| August | 0.1 | 0.6 | 1.1 |
| September | 0.3 | 0.7 | 1.4 |
| October | 1.4 | 1.9 | 2.4 |
| November | 3.2 | 3.9 | 5.0 |
| December | 3.8 | 4.8 | 5.9 |
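The monthly means in the table above can be cross-checked against the annual figure with a few lines of arithmetic. The sketch below simply averages the twelve tabulated means with equal weights, which is an approximation chosen here for illustration and not necessarily how the annual value was derived in the analysis; the variable names are illustrative.

```python
# Mean temperature change per month, copied from the table above (degrees Celsius).
monthly_change = {
    "January": 3.2, "February": 1.9, "March": 2.8, "April": 2.5,
    "May": 3.0, "June": 1.4, "July": 0.7, "August": 0.6,
    "September": 0.7, "October": 1.9, "November": 3.9, "December": 4.8,
}

# Equal-weight average over the twelve months.
annual_mean = sum(monthly_change.values()) / len(monthly_change)

# Months whose change exceeds the annual average.
above_average = [m for m, v in monthly_change.items() if v > annual_mean]

print(f"annual mean change: {annual_mean:.2f}")
print("above average:", ", ".join(above_average))
```

The equal-weight average is about 2.3 °C, and the months above it are January, the three spring months, November and December, in line with the seasonal pattern described in the text.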

## 5 Conclusions

By using an advanced statistical time series approach, a dynamic linear model (DLM), we were able to model the uncertainty caused by year-to-year natural variability and the uncertainty caused by the incomplete data and non-uniform sampling in the early observational years, and to estimate the uncertainty limits for the increase of the mean temperature in Finland. The Finnish temperature time series exhibits a statistically significant trend, which is consistent with human-induced global warming. Our analysis shows that the mean temperature has risen by a total of 2.3 ± 0.4 °C (95 % probability limits) during the years 1847–2013, which amounts to 0.14 °C/decade. The warming trend before the 1940s was close to linear for the whole period, whereas the temperature change in the mid-20th century was negligible. However, the warming after the late 1960s has been more rapid than ever before. Within the last 40 years the rate of change has varied between 0.2 and 0.4 °C/decade. The highest increases were seen in November, December and January. The spring months (March, April, May) have also warmed more than the annual average. Impacts of long-term cold season and spring warming have been documented, e.g., in later freeze-up and earlier ice break-up in Finnish lakes (Korhonen 2006) and in the advancement of the timing of leaf bud burst and flowering of native deciduous trees growing in Finland (Linkosalo et al. 2009). Although warming during the growing season months has been small in degrees Celsius, it has, in addition to other drivers (forest management, nitrogen deposition, CO2 concentration), resulted in an attributable increase in the growth of boreal forests in Finland since the 1960s (Kauppi et al. 2014). The analysis of a 166-year-long time series shows that the temperature change in Finland follows the global warming trend, which can be attributed to anthropogenic activities (IPCC 2013). The observed warming in Finland is almost twice as high as the global temperature increase (0.74 °C/100 years), which is in line with the notion that warming is stronger at higher latitudes.
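The per-decade rate and the comparison with the global figure follow from simple arithmetic on the numbers quoted above. The snippet below is only a worked check; it assumes a span of 166 years for 1847–2013, as stated in the text, and the variable names are illustrative.

```python
total_warming = 2.3            # degrees Celsius, 1847-2013
half_width = 0.4               # 95 % probability limits
span_years = 166               # length of the series as stated in the text
global_rate_per_century = 0.74

# Convert the total change to a per-decade rate, with its limits.
rate_per_decade = total_warming / span_years * 10
rate_low = (total_warming - half_width) / span_years * 10
rate_high = (total_warming + half_width) / span_years * 10

# Compare with the global rate expressed per decade.
global_rate_per_decade = global_rate_per_century / 10
ratio = rate_per_decade / global_rate_per_decade

print(f"Finland: {rate_per_decade:.2f} ({rate_low:.2f} to {rate_high:.2f}) per decade")
print(f"ratio to global rate: {ratio:.2f}")
```

The rate rounds to 0.14 °C/decade, and the ratio of roughly 1.9 corresponds to the statement that the warming in Finland is almost twice the global increase.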

## Acknowledgments

This work was funded by University of Eastern Finland strategic funding (Project ACRONYM), the Academy of Finland Centre of Excellence program (Project No. 1118615) and the Academy of Finland INQUIRE Project.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.