Introduction

Particulate matter (PM) consists of microscopic particles of solid or liquid matter suspended in the air. Its sources can be natural or anthropogenic. It has a significant impact on both climate and precipitation and therefore has both health and social costs. In particular, it affects the amount of incoming solar radiation and outgoing terrestrial radiation. Coarse particles have a diameter between 2.5 and 10 μm (PM10) and are known to be a very harmful form of air pollution given their ability to penetrate into the lungs and bloodstream and cause respiratory and heart diseases as well as premature death. Various countries have therefore set limits for particulates in the air, which are emitted during the combustion of vehicle engine fuels, braking and tyre wear. In particular, the European Union has defined in a series of directives the acceptable limits for exhaust emissions of new vehicles sold in the European Union and EEA member states.

Numerous studies have analysed the connection between pollution and harmful health effects (e.g. Schwartz and Marcus 1990; Anderson et al. 1996; Atkinson et al. 1999; Gardner and Dorling 1999). The present study contributes to another branch of the literature which focuses instead on modelling various pollutants such as sulphur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), ozone (O3), PM2.5 and PM10. For instance, Zamri et al. (2009) applied the Box-Jenkins ARIMA approach to model CO and NO2 in Malaysia and found an upward trend. Li et al. (2017) analysed air quality in Beijing from 2014 to 2016 using the spatio-temporal deep learning (StDL) model, the time delay neural network (TDNN) model, the ARMA model, the support vector regression (SVR) model and the long short-term memory neural network extended (LSTME) model and concluded that the LSTME model is the most suitable one for time series characterised by long-term dependence with optimal time delays. Naveen and Anu (2017) studied air quality in India using ARIMA, seasonal ARIMA (SARIMA) and other models. Pan and Chen (2008) is one of the few studies using long-memory autoregressive fractional integrated moving average (ARFIMA) models for air pollution data (in the case of Taiwan) and concluding that these are more accurate than autoregressive integrated moving average (ARIMA) models for such series. We also apply the latter type of framework for our analysis; however, instead of imposing a specific ARMA structure for the differenced process, we use white noise errors or alternatively the non-parametric approach of Bloomfield (1973), thus avoiding the issue of misspecification that might occur when choosing the short-run components.

It is clearly important to investigate the dynamics of air pollution in order to develop suitable models for prediction purposes and to design policies to manage air quality. This paper examines the statistical properties of daily PM10 in eight European capitals (Amsterdam, Berlin, Brussels, Helsinki, London, Luxembourg, Madrid and Paris) over the period 2014–2020 by applying a fractional integration framework. This framework is more general than the standard approach based on the classical dichotomy between I(0) stationary and I(1) non-stationary series, which is used in the vast majority of previous studies on air pollutants, since it allows for fractional as well as integer degrees of differencing and thus for a much wider set of stochastic behaviours. In particular, it enables the researcher to analyse the long-memory properties of the series of interest and the possible presence of trends, to test for mean reversion and to measure the degree of persistence and the speed of adjustment to the long-run equilibrium level. It therefore provides information about whether the effects of shocks are transitory or permanent, which is crucial for adopting appropriate policy measures.

The remainder of the paper is structured as follows: the ‘Methodology’ section outlines the methodology used for the analysis; the ‘Data’ section describes the data; the ‘Empirical results’ section presents the empirical results; and the ‘Conclusions’ section offers some concluding remarks.

Methodology

As mentioned above, we adopt a long-memory approach based on fractional integration. Long memory is a feature of time series that are characterised by a high degree of dependence between observations which are far apart in time. It has been found in many time series in different fields such as climatology (Gil-Alana 2005, 2008, 2017; Vyushin and Kushner 2009; Franzke 2012; Ludescher et al. 2016; Bunde 2017; Yuan et al. 2019; Bruneau et al. 2020), environmental sciences (Barros et al. 2016; Gil-Alana et al. 2016; Tiwari et al. 2016; Gil-Alana and Solarin 2018; Gil-Alana and Trani 2019; Xayasouk et al. 2020) and economics and finance (Gil-Alana and Moreno 2012; Abritti et al. 2017; Kalemkerian and Sosa 2020; Murialdo et al. 2020; Qiu et al. 2020).

There exist a variety of statistical models that can describe this type of behaviour; a very popular one among time series analysts is based on the concept of fractional integration, which occurs when the number of differences required to make a series stationary I(0) is a fractional value. More precisely, a time series is said to be integrated of order d or I(d) if it can be expressed as:

$$ {\left(1-B\right)}^d{x}_t={u}_t,\kern0.5em t=1,2,\dots, $$
(1)

where B is the backshift operator (Bx_t = x_{t-1}), the differencing parameter d can be any real value, and u_t is I(0), defined as a covariance stationary process with a spectral density function that is positive and bounded at all frequencies. This framework encompasses different cases such as short memory (d = 0), stationary long memory (0 < d < 0.5), non-stationary though mean-reverting processes (0.5 ≤ d < 1), unit roots (d = 1) and explosive patterns (d > 1).
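As an illustration of Eq. (1), the operator (1 − B)^d can be expanded as a binomial series in B, with coefficients π_0 = 1 and π_k = π_{k−1}(k − 1 − d)/k. The following Python sketch applies this filter to a short series. It assumes that pre-sample observations are zero (a common truncation, but an assumption here); the function names and the toy values are hypothetical and are not part of the original analysis.

```python
def frac_diff_weights(d, n):
    """Coefficients pi_0..pi_{n-1} of the expansion (1 - B)^d = sum_k pi_k B^k."""
    weights = [1.0]
    for k in range(1, n):
        # Recursion from the binomial series: pi_k = pi_{k-1} * (k - 1 - d) / k
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(x, d):
    """Apply (1 - B)^d to x, treating pre-sample values (t <= 0) as zero."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


series = [3.0, 5.0, 4.0, 6.0, 7.0]      # toy values, not PM10 data
same = frac_diff(series, 0.0)           # d = 0: series is left unchanged
first_diff = frac_diff(series, 1.0)     # d = 1: ordinary first differences
frac = frac_diff(series, 0.4)           # 0 < d < 0.5: stationary long memory
```

With d = 0 the series is returned unchanged, and with d = 1 the filter reduces to first differences (the first value is kept because pre-sample values are set to zero). For fractional d the weights decay slowly with the lag, which is what generates dependence between distant observations.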

Data

The series analysed is the daily average air quality taken from the World Air Quality Index (WAQI) at https://aqicn.org/map/world/es/. All data have been converted using the US EPA standard (United States Environmental Protection Agency). Specifically, we use daily data for seven years (2014–2020) concerning eight European capitals: Amsterdam, Berlin, Brussels, Helsinki, London, Luxembourg, Madrid and Paris. The series represents the daily level of air quality (PM10) measured in micrograms per cubic metre of air (μg/m³). The WAQI data come from the following original sources: Madrid, http://www.mambiente.madrid.es/opencms/opencms/calaire/ (Ayuntamiento de Madrid); Paris, http://www.airparif.asso.fr/ (AirParif, Association de surveillance de la qualité de l'air en Île-de-France); Amsterdam, https://www.luchtmeetnet.nl/ (RIVM); Luxembourg, https://environnement.public.lu/fr.html (Portail de l'Environnement du Grand-duché de Luxembourg); London, https://uk-air.defra.gov.uk/ (UK-AIR, air quality information resource, Defra, UK); Helsinki, https://www.ilmatieteenlaitos.fi/ilmanlaatu (Ilmanlaatu Suomessa); Brussels, https://www.irceline.be/en/ (Belgian Interregional Environment Agency); and Berlin, https://www.berlin.de/senuvk/umwelt/luftqualitaet/ (Luftqualität).

Table 1 reports the sample periods for each capital and provides some descriptive statistics for each series. It can be seen that Paris exhibits the highest mean value, while Helsinki has the lowest. Paris also has the most volatile series, while Luxembourg has the least volatile one.

Table 1 Descriptive statistics

Empirical results

We estimate the following model:

$$ {y}_t=\alpha +\beta t+{x}_t,\kern0.5em {\left(1-B\right)}^d{x}_t={u}_t,\kern0.5em t=1,2,\dots, $$
(2)

where y_t stands for PM10 in each European capital in turn, and x_t is an I(d) process such that the error term u_t is I(0). The disturbances are assumed to follow, in turn, a white noise process (see Tables 2 and 3) and an autocorrelated process (see Tables 4 and 5), where the latter is modelled using the exponential spectral framework of Bloomfield (1973). In all cases, we display the estimated values of d (and their associated 95% confidence bands) for three different specifications: (i) no deterministic terms in (2), i.e. we impose the restriction α = β = 0 (the results for this case are reported in the second column in Tables 2 and 4); (ii) an intercept only (see the third column in both tables); and (iii) an intercept and a linear time trend (see the fourth column in both tables). The estimated values reported in bold in these tables are those corresponding to our preferred specification, which has been selected on the basis of the statistical significance of the regressors.
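The estimation procedure itself is not spelled out in this section, so the following Python sketch should be read only as one simple way of fitting a model of the form (2) under white noise errors, not as the method used to produce the tables. It assumes a grid search over d: for each candidate value, y_t, the intercept and the time trend are all filtered by (1 − B)^d, the coefficients α and β are obtained by least squares, and the value of d with the smallest residual sum of squares is retained. The data are simulated, and all names and numbers are hypothetical.

```python
import random


def weights(d, n):
    """Coefficients of (1 - B)^d; pass -d to get those of (1 - B)^(-d)."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 - d) / k)
    return w


def apply_filter(w, x):
    """Truncated filter: values before the start of the sample are set to zero."""
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


def rss_given_d(y, d):
    """Residual sum of squares and OLS (alpha, beta) after filtering by (1 - B)^d."""
    n = len(y)
    w = weights(d, n)
    yd = apply_filter(w, y)
    z1 = apply_filter(w, [1.0] * n)                          # filtered intercept
    z2 = apply_filter(w, [float(t) for t in range(1, n + 1)])  # filtered trend
    # Normal equations for a two-regressor least-squares fit.
    a11 = sum(a * a for a in z1)
    a12 = sum(a * b for a, b in zip(z1, z2))
    a22 = sum(b * b for b in z2)
    b1 = sum(a * c for a, c in zip(z1, yd))
    b2 = sum(b * c for b, c in zip(z2, yd))
    det = a11 * a22 - a12 * a12
    alpha = (b1 * a22 - b2 * a12) / det
    beta = (a11 * b2 - a12 * b1) / det
    rss = sum((c - alpha * a - beta * b) ** 2 for a, b, c in zip(z1, z2, yd))
    return rss, alpha, beta


def estimate_d(y, grid):
    """Grid value of d that minimises the residual sum of squares."""
    return min(grid, key=lambda d: rss_given_d(y, d)[0])


# Simulated example (all numbers are illustrative, not estimates from the paper).
random.seed(0)
n, true_d, alpha, beta = 400, 0.4, 20.0, -0.01
u = [random.gauss(0.0, 1.0) for _ in range(n)]               # white noise u_t
x = apply_filter(weights(-true_d, n), u)                     # x_t = (1 - B)^(-d) u_t
y = [alpha + beta * (t + 1) + x[t] for t in range(n)]        # y_t = alpha + beta*t + x_t
grid = [i / 20 for i in range(0, 21)]                        # d = 0.00, 0.05, ..., 1.00
d_hat = estimate_d(y, grid)
print("estimated d on the grid:", d_hat)
```

Dropping the trend regressor, or both regressors, gives the analogues of specifications (ii) and (i). The autocorrelated case would additionally require a model for u_t, which this sketch does not attempt.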

Table 2 Estimates of d: white noise errors
Table 3 Estimated coefficients in the selected model: white noise errors
Table 4 Estimates of d: autocorrelated errors
Table 5 Estimated coefficients in the selected model: autocorrelated errors

When assuming that u_t is a white noise process, the intercept is found to be the only significant deterministic term in all cases; the estimated values of d are in the interval (0, 1), which implies long memory and fractional integration. They range between 0.39 (Amsterdam) and 0.62 (Madrid). For Amsterdam, the values are all within the stationary range (d < 0.5); for Brussels, London, Luxembourg and Paris, they are around the stationary boundary (d = 0.5), while non-stationarity (d ≥ 0.5) is found in the case of Helsinki, Berlin and Madrid.

When allowing for autocorrelation (Tables 4 and 5), the time trend appears to be negative and statistically significant in the case of Brussels, Berlin and Paris, which might reflect the anti-pollution policies adopted in these three capitals. In particular, a low emission zone (LEZ) was established in the Brussels region with the aim of meeting the European air quality standards and emission ceilings; in Berlin the German Climate Action Plan 2050 is being implemented to control air pollution by laying down environmental quality standards and emission reduction requirements; a LEZ based on Euro norm vehicle classification has also been introduced in Paris.

The estimated values of d are once more in the interval (0, 1), though they are now significantly smaller than in the previous case. In fact, they are all within the stationary range, specifically between 0.22 (Luxembourg) and 0.33 (Helsinki and Paris). These lower estimates are likely to reflect the competition between the fractional integration and Bloomfield parameters in describing time dependence between the observations. Both sets of estimates, under the assumption of white noise and autocorrelated errors, respectively, indicate that the degree of persistence is highest in the case of Brussels, Amsterdam and London and lowest in the case of Helsinki, Berlin, and Madrid; thus, the effects of shocks are more long-lived in the former capitals.
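To illustrate what an order of integration in (0, 1) implies for the effects of shocks, the sketch below computes the impulse responses of a pure I(d) process with white noise errors, i.e. the coefficients ψ_k in (1 − B)^(−d) = Σ ψ_k B^k, using ψ_0 = 1 and ψ_k = ψ_{k−1}(k − 1 + d)/k. The values of d are the smallest and largest estimates quoted in the text, plus d = 1 for comparison; the horizons and the function name are hypothetical choices made for illustration.

```python
def impulse_responses(d, horizon):
    """Effect of a unit shock after 0..horizon periods for (1 - B)^d x_t = u_t."""
    psi = [1.0]
    for k in range(1, horizon + 1):
        # Recursion for the expansion of (1 - B)^(-d)
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi


# Estimates of d quoted in the text, plus the unit-root case for comparison.
for d in (0.22, 0.33, 0.39, 0.62, 1.0):
    psi = impulse_responses(d, 365)
    print(f"d = {d:.2f}: effect after 7 days = {psi[7]:.3f}, "
          f"after 365 days = {psi[365]:.3f}")
```

For any d < 1 the responses decline towards zero, so shocks are transitory, but the decline is hyperbolic rather than exponential and is slower the larger d is. With d = 1 every response equals one and shocks are permanent.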

Conclusions

This paper has used fractional integration methods to obtain evidence on persistence and time trends in PM10 in eight European capitals (Amsterdam, Berlin, Brussels, Helsinki, London, Luxembourg, Madrid and Paris). This approach is more general than the standard ones used in most of the literature on air pollutants and thus is more informative about the time series properties of the series of interest. The results indicate that all of them display fractional integration with orders of integration in the range (0,1); this implies that mean reversion occurs and shocks do not have permanent effects. However, the degree of persistence is different in the eight capitals examined; in particular, the effects of shocks take longer to die away in the case of Brussels, Amsterdam and London. Such evidence should be taken into account by policymakers aiming to design effective measures to reduce pollution.

The estimated values of d are lower under the assumption of autocorrelated errors; in this case, three of the capitals examined (Brussels, Paris and Berlin) exhibit statistically significant negative time trends, which suggests that the policies they have adopted to reduce pollution (such as the establishment of LEZs) have been successful, at least to some extent.

Other issues that could be investigated in the case of PM10 are the following: seasonality with fractional integration (e.g. Gil-Alana and Robinson 2001; Bisognin and Lopes 2009; del Barrio Castro and Rachinger 2021); non-linearities and structural breaks, which are of particular interest given that both are strongly related to long memory and fractional integration (see, e.g. Diebold and Inoue 2001; Granger and Hyung 2004; Ohanissian et al. 2008; Kongcharoen 2013); and the forecasting performance of alternative specifications. All these topics are left for future work.