Functional ANOVA approaches for detecting changes in air pollution during the COVID-19 pandemic

Faced with novel coronavirus outbreak, the most hard-hit countries adopted a lockdown strategy to contrast the spread of virus. Many studies have already documented that the COVID-19 control actions have resulted in improved air quality locally and around the world. Following these lines of research, we focus on air quality changes in the urban territory of Chieti-Pescara (Central Italy), identified as an area of criticality in terms of air pollution. Concentrations of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {NO}_{{2}}$$\end{document}NO2, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {PM}_{{10}}$$\end{document}PM10, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {PM}_{2.5}$$\end{document}PM2.5 and benzene are used to evaluate air pollution changes in this Region. Data were measured by several monitoring stations over two specific periods: from 1st February to 10 th March 2020 (before lockdown period) and from 11st March 2020 to 18 th April 2020 (during lockdown period). The impact of lockdown on air quality is assessed through functional data analysis. Our work makes an important contribution to the analysis of variance for functional data (FANOVA). Specifically, a novel approach based on multivariate functional principal component analysis is introduced to tackle the multivariate FANOVA problem for independent measures, which is reduced to test multivariate homogeneity on the vectors of the most explicative principal components scores. Results of the present study suggest that the level of each pollutant changed during the confinement. Additionally, the differences in the mean functions of all pollutants according to the location and type of monitoring stations (background vs traffic), are ascribable to the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hbox {PM}_{{10}}$$\end{document}PM10 and benzene concentrations for pre-lockdown and during-lockdown tenure, respectively. FANOVA has proven to be beneficial to monitoring the evolution of air quality in both periods of time. This can help environmental protection agencies in drawing a more holistic picture of air quality status in the area of interest.


Introduction
After the discovery of the first case in Wuhan (China) in December 2019, the current outbreak of COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has dramatically affected all the countries ). On January 30 2020, the World Health Organization (WHO) declared worldwide public health emergency and in March 11 2020, due to widespread global infection, the WHO authorities categorised the new Coronavirus as pandemic (WHO 2020). To contain the virus and save lives, governments around the word have been taking a range of actions and measures, such as social and travel restrictions. More specifically, coronavirus pandemic has forced nations under partial or complete lockdowns, resulting in prohibition of unnecessary commercial activities in people's daily lives; prohibition of any types of gathering by residents; restrictions on private (vehicle) and public transportation. Different studies have already documented the effects of COVID-19 lockdown measures on many aspects of human activities, such as transportation (Mogaji 2020), renewable and sustainable energy (Hosseini 2020), health risk assessment (Gautam and Trivedi 2020;Ambade et al. 2021;Zoran et al. 2020;Gupta et al. 2021), tourism (Sigala 2020), commodity markets (Rajput et al. 2021). Certainly, COVID-19 has severe negative impact on the world activities as well as on local economy. In the major economics across the globe, lockdown will directly affect the Gross Domestic Product (GDP) of each country. On the contrary, efforts to restrict transmission of the SARS-CoV-2 have had outstanding effects on the ecosystems which are being greatly recovered. In many cities where lockdown measures have been implemented, the decline in economic activities, the nonfunctioning of industries, the drop in road transport, have contributed to mitigate air pollution. Bherwani et al. (2020) monetarily quantified the overall benefit due to pollution reduction over the potential economic loss sustained by local governments, as an outcome of lockdown.
Several researchers around the world reported that there is a considerable reduction of air pollution level across geographies. For instance, the findings of many studies showed a substantial enhancement in the air quality in the lockdown period globally (Lal et al. 2020;Dutheil et al. 2020;Gope et al. 2021;Venter et al. 2020;Gautam 2020b).
A visible improvement in air quality parameters was recorded in different areas of China and India (see, among others, Wang et al. 2020;Bao and Zhang 2020;Li et al. 2020;Gautam and Trivedi 2020;Mahato et al. 2020;Sharma et al. 2020;Agarwal et al. 2020;Gautam et al. 2021). The study of Kanniah et al. (2020) highlighted that large emission reduction in transportation and anthropogenic activities, resulted in a significant reduction in air pollution levels in the urban regions of Malaysia as well as in a sizeable reduction aerosol optical thickness concentration during March-April 2020 in comparison with the same period in 2019 and 2018. Kerimray et al. (2020) analyzed the effect of the lockdown from March 19 to April 14, 2020, on the concentrations of air pollutants in Almaty, Kazakhstan, finding out a substantial difference between concentrations of air pollutant recorded before and during lockdown. Otmani et al. (2020) gave evidence that the government decisions in response to COVID-19 had an important impact on the air pollution in Salé city (Morocco). Findings from that study showed that the difference between the concentrations recorded before and during the lockdown period were 75, 49 and 96% for PM 10 , SO 2 and NO 2 ; respectively. Berman and Ebisu (2020) investigated the changes in levels of air pollutants across USA during COVID-19 pandemic. The authors reported a significant reduction on NO 2 (up to À25:5%) and an overall decline in PM 2:5 , compared with pre-lockdown phase. Zabrano-Monserrate and Ruano (2020) studied the effects of quarantine policies on air pollutants concentrations in Quito, Ecuador. Using parametric methods, they detected a significant reduction of NO 2 and PM 2:5 since the introduction of lockdown measures. However, there was a noticeable growth in ozone levels. There were reports that documented a decline in NO 2 , NO x and an increase in surface O 3 at São Paulo in Brazil during the quarantine period (see, for instance, Nakada and Urban 2020; Dantas et al. 2020). The similar trends of reducing air pollution and increasing air quality due to introduction of lockdown were observed in large parts of Europe, such as France, Germany, Spain, and Italy (Sicard et al. 2020;Collivignarelli et al. 2020;Zambrano-Monserrate et al. 2020;Tobías et al. 2020). An overview of selected studies on air pollution and COVID-19 is provided in the Appendix. In this study, we focus on investigating the possible effects of the lockdown due to the COVID-19 pandemic on air quality in the Pescara-Chieti urban area, Abruzzo (Italy), identified as an area of criticality in terms of air pollution. Data of monitoring stations of the regional air quality network managed by the Regional Agency for the Environmental Protection (ARTA) of Abruzzo have been collected and examined. We compared data from 1st February to 10 th March 2020, before the beginning of the main limitations on personal mobility, with data from 11st of March to 18 th of April, during the adoption of lockdown restrictions. Measured concentrations of NO 2 , PM 10 , PM 2:5 and benzene were used to evaluate air pollution changes. Commonly, strategies used in monitoring air quality refer to descriptive statistics, box-plots, autocorrelation analysis and spatiotemporal models. Unfortunately, in the monitoring of environmental pollutants, the temporal observations of the different pollutants for the different stations have not always been referred to the same instants of time. As a result, the implementation of classical statistical procedures might be problematic. Besides, for interpretative purposes, it is convenient to rely on statistical methods able to capture the speed and acceleration of pollutants variation over time. For these reasons, in our research, to overcome the weakness of classical statistical procedures and to effectively detect to what extent extreme changes in human behaviour after the quarantine policies adopted by the Italian Government have affected air quality, we followed an approach based on functional data analysis (FDA). During the two last decades, it has emerged an important literature in this methodological framework. A comprehensive introduction to the foundations and applications of FDA can be found in Ramsay andSilverman (2002, 2005), whereas nonparametric functional methods are summarized in a monograph by Ferraty and Vieu (2006). It is well known that FDA extends the classical multivariate techniques to data whose observations are functions, usually curves, with the advantage of reducing a large number of discrete observations highly correlated for each curve to a functional form that conserves all relevant information.
Recently, the use of FDA methods for environmental data has received attention. For instance, Escabias et al. (2005), proposed a functional principal component logistic regression to estimate the risk of drought in terms of time evolution in temperatures in Canada. Gao and Niemeier (2008) use functional methods to model the dynamics of diurnal ozone and nitrogen oxides cycles, their interrelationships, and the multilevel spatio-temporal variations of ozone and nitrogen oxides measurements from Southern California. A functional model for forecasting the time evolution of a binary response from discrete time observations of a continuous time series, is introduced by Aguilera et al. (2008) to predict the risk of drought in a future period of time from monthly observations of El Nin˜o phenomenon. Martínez et al. (2014) and Martínez Torres et al. (2020) implement a model based on functional analysis to detect outliers in air quality samples, with the overall aim to achieve a better solution for the air quality control. Likewise, Sancho et al.  Gautam and Trivedi (2020).
Following these research streams, also in this study, rather than simply considering the data as vectors to apply classical multivariate analysis methods, which may lead to a loss of useful information, we explicitly exploit the functional form of environmental data. It is worth noting that from a physical point of view the processes involved in the production of pollutants, such as NO 2 and particulate matter (PM 10 , PM 2:5 ) are of continuous time type and thus turn out to be appropriate for the analysis of functional data. As a result, the use of models that account for the continuity of the whole trajectories along time seem to be more natural. The FDA paradigm makes it possible to work with the entire time spectrum of pollutants time series. In doing so, the FDA approach might bring additional information to be recovered from the data than in the vectorial approach, by looking at the smoothness of underlying functions and its derivatives. Furthermore, in contrast to most other methods commonly used to model trends in time series data, the functional techniques make nonparametric assumptions and there is no concern about correlation due to repeated measures. Our goal in this paper is to ascertain whether the level of each pollutant has changed during the lockdown period. In other terms, we want to test the equality of mean functions related to each pollutant in two different periods of time: before and during lockdown days. The theoretical framework involves the use of FDA tools for repeated measures, and in particular, the analysis of variance. In the literature there are not many works related to this matter for the field of FDA. In this work, the statistics proposed by Martinez-Camblor and Corral (2011) and Smaga (2020) to test the equality of two mean functions are extended by assuming a basis expansion of the sample curves. On the other hand, in order to check the differences between the temporal evolution of all pollutants in terms of the location of measuring stations, a novel approach for multivariate FANOVA for independent measures is introduced. This is based on multivariate functional principal components analysis of the sample curves of all pollutants and the problem is reduced to testing multivariate homogeneity or MANOVA (gaussian data) on the vectors of the most explicative principal components scores. This new approach solves the problems of high dimension and multicollinearity that affect the basis expansion approach developed in Gorecki and Smaga (2017) when the number of basis coefficients for each functional variable is large. The relevance of this research work lies in its contribution to extending the parametric and nonparametric functional homogeneity testing procedures proposed by Aguilera et al. (2021) for univariate functional data to the multivariate case.
In addition to this methodological development, this study provides empirical suggestions for local environmental authorities. The comparison of air pollutants allows to study whether the level of each pollutant decreased during the quarantine days. The suitable analysis of a different behaviour of pollutants at measuring sites provides ground to decide on whether to keep recording measurement at different locations or redesign the local surveillance network, with the overall aim to reduce air pollution.
The paper is organized as follows. Section 2 is dedicated to illustrate the studied area, the monitoring stations where air quality data have been collected and some explorative analysis of pollutants used for this research. Section 3 introduces the theoretical framework. In Section 4, our method is applied for checking differences between air quality data collected at the monitoring stations placed in the urban area of Chieti-Pescara. Finally, Section 5 concludes the paper.

Air quality data and studied period
This section gives details about the studied area with respect to its geographic characteristics and the monitoring stations used to collect air quality data. In addition, the dataset employed to derive insights into research problem has been analyzed with descriptive statistics and visualization tools. For high quality graphics, we used the ''openair'' R package (Carslaw and Ropkins 2012).

Description of studied region
In this work, we closely examine the air quality of the metropolitan area of Chieti-Pescara, situated in the Abruzzo region, along the Adriatic coast of central Italy. The Chieti-Pescara metropolitan area (Fig. 1) is a territory, identified according to a functional criterion, formed by six municipalities, Pescara, Montesilvano, Chieti, Francavilla al Mare, San Giovanni Teatino, Spoltore, covering a total area of 159.33 km 2 , and accounting for around 281,101 inhabitants at 31/12/2019.
The configuration of Chieti-Pescara urban area is limited by the sea, in the North-East, and by hilly reliefs in the South-West. The central city is formed by the two provincial capitals: Chieti, not in a central position for the municipalities of the province, and the city of Pescara, which are extremely close to each other (approximately 12 km). Pescara city, located on the centre of a metropolitan area (on the coast), is the administrative and commercial heart of Abruzzo and in a few decades, it has become the most populated city of the region, with 120,000 inhabitants. It developed on a flat territory, with a surface of 33.62 km 2 , whose urban area develops around the terminal stretch of the homonymous river and a restricted coastal area.
The Chieti-Pescara conurbation is characterized by a system of infrastructures, which is one of the strongholds of the Abruzzo: significant and industrial sites are located around this pole. However, the progressive growth of the industrial activity, the increased road travel, the urban expansion, make the metropolitan area the locus of growing environmental concerns, for the rising levels of energy and resource consumption, greenhouse gas emissions and air quality pollution. For these reasons, the conurbation of Chieti-Pescara has been identified as an area deserving mitigation measures to reduce air pollution by the ''Plan for the Protection of Air Quality'', drafted by the Abruzzo Region, in accordance with current Italian legislation (Legislative Decree 155/2010). Also, in the Plan, there is the inventory of the main sources of polluting emissions (updated to 2012), which in this area largely sees the contribution of non-industrial combustion plants (mainly domestic heating plants) as regards particulate emissions (78.5% of the total for PM 10 , 88.8% for PM 2:5 and 88.4% for benzene), while for nitrogen oxides road traffic prevails (49.7%), with industrial combustion plants in second place with 23.6%.

Data
This study tracks four pollutants over two specific periods: from 1st February to 10 th March 2020 (before lockdown period) and from 11 st March 2020 to 18 th April 2020 (strict lockdown period). The analyses include measures of NO 2 , PM 10 , PM 2:5 and benzene obtained from the automatic reporting platform, run by Regional Agency for the Environmental Protection of Abruzzo Region (ARTA). These variables are measured in micrograms per cubic meter (lg=m 3 ) and information are obtained from five monitoring sites. The spatial location of all five monitoring stations is shown in Fig. 1. The air monitoring stations of Pescara (Via Firenze) and Montesilvano are designed as Urban Traffic (UT) and are located roadside, where the pollution level is most influenced by traffic emissions from neighboring roads with medium-high traffic intensity; conversely, air quality data collected from the monitoring stations of Pescara (Teatro d'Annunzio), Chieti and Francavilla are deemed Urban Background (UB), located where the pollution level is not influenced mostly by emissions from specific sources and are representative of the population average exposure. Hourly measurements of pollutants have been collected from February to April 2020.

Descriptive statistics and graphical analysis
After the implementation of strict lockdown measures starting from 11 st March 2020, air pollution of the urban area of Chieti-Pescara has witnessed a substantial improvement. Table 1 highlights the net and percentage variations of each pollutant, in each monitoring site, before and during the lockdown. Early evidence of significant reduction of NO 2 (air pollutant mainly ascribable to fossil fuel combustion) as a consequence of lockdown, both in Asia and Europe, was showed by the data collected by NASA and ESA satellites (Gautam 2020a). In our studied area it can be also noticed that NO 2 has shown the most significant declining trend. In particular, the concentrations of this pollutant were approximately 50% lower compared to the previous average, that of pre-lockdown period. On the other hand, we recorded an increase of PM 10 and PM 2:5 concentrations during the lockdown weeks, whereas benzene levels dropped in the traffic measuring stations and increased in the background monitoring sites. A trend analysis of 24-h daily average data for the four pollutants was also considered for the above stated periods in all monitoring stations to better understand the impact in the levels of pollutants accumulation amid the lockdown period. Figure 2 allows to capture the changes in concentrations of four pollutants for the pre-lockdown and during-lockdown period. The reduction of NO 2 during the lockdown is clearly visible and marked in all monitoring sites and is due to the collapse of vehicular flows, even if differences in magnitude exist depending on the stations. As observed by several researches in other areas worldwide Donzelli et al. 2020), the particulate matter (PM 10 and PM 2:5 ) seems to be rather independent from the measures adopted during the COVID-19 nation-wide lockdown: our data show that background and traffic stations undergo an increase, whereas benzene levels dropped in the UT stations and increased in UB ones. This apparently strange behaviour of particulate matter can be explained taking into account the peculiarity of this pollutant. Airborne particulate matter can be regarded as a complex mixture of solid particles and liquid droplets generated by a wide variety of natural or anthropogenic sources, with different size and chemical composition. The fine fraction (PM 2:5 ) arises mainly from combustion (primary PM 2:5 ) or gas-to-particle conversion processes (secondary PM 2:5 ), while the coarse fraction (PM 10 Þ arises mainly from traffic -linked mechanical processes (road dust resuspension, brake and tyre wear emissions) (Grigoratos and Martini 2014) or comes from natural sources (marine aerosol, wind-blown soil, pollens). This implies that the monitoring sites might be under the effect of multiple emission sources, not linked to urban traffic. In this regard, we must note that in the lockdown period the average temperature in the area under study was only slightly higher than in the pre- This fact, together with the prolonged stay of the people at home, could have raised the emissions related to domestic heating. Besides, a pertinent amount of PM 10 and PM 2:5 variability has a meteorological origin. As known (Querol et al. 2004;Galindo et al. 2011;Wang et al. 2020), meteorological conditions play key roles in physico-chemical processes governing formation and transport of airborne pollutants and their variability may have nullified the reductions caused by the drop in emissions related to road traffic. Considering, for example, the concentrations of PM 2:5 in the ''Teatro d'Annunzio'' station, we note that the average value in the month of February and the first decade of March 2019 was 23.6 (lg/m 3 ), while in the same period of 2020 (the ''pre lockdown period'' of our analysis) the average was 15.3 (lg/m 3 ). This is just the effect of interannual variability, due to the different meteorological scenarios. As an extreme example of variability induced by weather conditions, we must mention a notable long-distance transport event of natural dust from the Central Asia (Aral Sea) occurred at the end of March (29-31 March), which produced a sudden increase of the PM 10 level (clearly visible in the graphs of Fig. 2). Concerning the benzene concentrations, as stated earlier, it appears that this pollutant exhibits very different behaviours in background stations compared to traffic ones. In the latter, there is a decline, albeit contained, due to the reduction of vehicular flows during the lockdown, while in the former there is stability or even slight increases, probably due to domestic heating systems, particularly those fuelled by wood or pellets. To get a comprehensive understanding of how lockdown policies have affected air pollution, we also look at the weekly concentrations of each pollutant at background and traffic stations before and during the restriction periods. From Figs. 3 and 4, it is evident that on Sunday, the traffic during the lockdown phase is virtually zero, therefore the concentrations of NO 2 , pollutant specifically linked to vehicle emissions, reduce more than in the other days of the week. It is worth noting that ''Teatro d'Annunzio'' measuring station seems to be affected by traffic emissions in an anomalous way (despite being a background site). Also the inspection of weekly concentrations reveals that the impact of restrictions measures on PM 10 and PM 2:5 is the most complex of the four pollutants studied: we are not able to detect consistent patterns with vehicular flows even if the small sample size, only five weeks and half, could affect the empirical findings. More in detail, among the background monitoring stations, ''Teatro d'Annunzio'' results more subject to the natural component of PM 10 , probably due to marine aerosol: this station is about 200 m from the sea, with no buildings in the way. Regarding the weekly benzene concentrations, the comparison between the two traffic monitoring sites indicates that ''Via Firenze'' is probably subject to the emissions arising from combustion (mainly domestic heating) compared to those due to traffic road. The results of FANOVA (see subsection 4.3) seem to point out a possible misclassification of this monitoring site, which is supposed to be more prone to urban traffic. Conversely, the traffic station ''Montesilvano'' being on the edge of the urban area is especially exposed to road traffic emissions.

Theoretical framework
In functional data analysis, the data are curves or more general functions that evolve over time, space or other continuous argument. However, the available data are usually vectors associated with the observation of a variable at a finite set of time points (longitudinal data). In fact, functional data can be seen as a particular case of high- dimensional data with a number of highly correlated variables that is usually much larger than the sample size. Because of this, the classical multivariate statistical methods are not usually efficient for functional data on account of problems related to the sample size and overfitting. In particular, it is well known that the multivariate parametric (ANOVA) and nonparametric approaches for testing homogeneity of both, independent and non independent (repeated measures) samples of vectors, do not work properly when the number of variables is large and many of them are not applicable when the dimension of the data exceeds the sample size (see, for example, Biswas and Ghosh 2014). The functional data analysis approaches solve these problems by introducing in the statistical models the true form of curves and considering the metric associated with the functional space they belong to. A detailed comparison of eleven existing functional tests for the one-way ANOVA problem for independent functional data was developed in an exhaustive simulation study from which guidelines for using the tests in practice were provided (Gorecki and Smaga 2015). Different FANOVA approaches are proposed in this section for univariate and multivariate samples of dependent and independent curves, respectively. Let X ijr ¼ ðX ijr1 ; :::; X ijrH Þ; i ¼ 1; :::; g; j ¼ 1; :::; n i , r ¼ 1; :::; R be a sample of curves. Note that g represents the number of independent groups, H is the number of observed response variables, R denotes the number of different periods of time (or conditions) where the response variable is observed (repeated measures) and n ¼ P g i¼1 n i is the sample size. It is considered that these curves are realizations of a H-dimensional stochastic process X ¼ ðX 1 ; :::; X H Þ, whose components are second order and continuous in quadratic mean stochastic processes with sample paths belonging to the Hilbert space L 2 ½T of squared integrable functions on T , with the natural inner product

FANOVA for repeated measures
The goal is to test the equality of mean functions associated with the observation of a functional variable in two different conditions or periods of time for the same subjects. For instance, the problem laid out in this paper about the evolution of the quality of the air before and during the lockdown. That is, whether the level of each pollutant has changed during the lockdown. The theoretical framework involves the use of tools for repeated measures, and in particular, the analysis of variance. In the literature there are not many works related to this matter for the field of FDA. Martinez-Camblor and Corral (2011) introduced the first testing procedure for this problem by keeping in mind the between group variability. They proposed three different approaches in order to approximate the null distribution. The first technique consisted of applying a bootstrap parametric method through re-sampling some Gaussian process involved. The second and third methods were based on non-parametric approaches via bootstrap and permutation tests. Later, Smaga (2019) proposed another perspective focused on the Box-Type approximation. In that study, the four existent methods were compared, turning out to be the Box-Type approximation the quickest option from the computational viewpoint. In relation to size control and power all of them gave similar results, and its finite sample behaviour was very satisfactory. However, both works agree that for very small sample size, the bootstrap tests are lightly nonconservative. Smaga (2020) adapted two new statistics from the classical paired t-test to the functional data framework. This new approach is more powerful than the testing procedures aforementioned because takes the within group variability into account as well. Here, the distributions of the statistics were also approximated by parametric methods based on derived asymptotic distributions as well as non-parametric bootstrap and permutation approaches. The simulation study proved that the asymptotic and Box-Type tests are not recommended because of their liberality. Smaga (2020) suggested the permutation tests, although the non-parametric bootstrap methods also work correctly. Nevertheless, it was emphasized that there are evidences that the procedures proposed tend to be nonconservative for small sample size.
In what follows, a single functional variable is considered because they are going to be dealt separately. In this context, it is assumed that the sample functions can be represented as X jr ðtÞ with t 2 T ¼ ½a; b, j ¼ 1; :::; n and r ¼ 1; :::; R, such that E½X jr ðtÞ ¼ l r ðtÞ. Only two different conditions or periods of time are evaluated in the current work (R ¼ 2). Besides, each trajectory can be expressed as X jr ðtÞ ¼ l r ðtÞ þ e jr ðtÞ where e jr ðtÞ are random functions centered in mean. In this kind of problem the pursued goal is to test the hypothesis H 0 : l 1 ðtÞ ¼ l 2 ðtÞ 8t 2 ½a; b H 1 : l 1 ðtÞ 6 ¼ l 2 ðtÞ for some t: (2011) proposed the following statistics in order to solve the statistical hypothesis testing

Martinez-Camblor and Corral
where X r ðtÞ ¼ n À1 P n j¼1 X jr ðtÞ is the mean function for each condition or period of time. This statistics avoid the homoscedasticity assumption.
Due to C n only takes the between group variability, Smaga (2020) proposed the following two statistics in order to consider both the between and within group variabilities Kðt; tÞ dt; E n ¼sup t2½a;b n X 1 ðtÞ À X 2 ðtÞ À Á 2 Kðt; tÞ Kðt; tÞ ¼ P n j¼1 ðX j1 ðtÞ À X 1 ðtÞÞ À ðX j2 ðtÞ À X 2 ðtÞÞ Â Ã 2 n À 1 : One of the biggest problems in the practice is that curves are observed in discrete time because it is impossible to observe a set of functions continuously in time. Thus, the first step would be to reconstruct the functional form of the curves approximately. Ferraty and Vieu (2006) proposed to use non-parametric techniques for this purpose, meanwhile Ramsay andSilverman (2002, 2005) suggested an approach based on basis expansion of each sample curve. This last strategy consists of assuming that curves belong to a finite-dimension space spanned by a basis f/ 1 ðtÞ; :::; / p ðtÞg, so that they can be expressed as X jr ðtÞ ¼ X p k¼1 a jrk / k ðtÞ ¼ a 0 jr /ðtÞ ; j ¼ 1; :::; n; r ¼ 1; 2; where a jrk represent the basis coefficients of the reconstruction for the corresponding sample curve with a jr ¼ ða jr1 ; :::; a jrp Þ 0 and /ðtÞ ¼ ð/ 1 ðtÞ; :::; / p ðtÞÞ 0 . Note that p must be sufficiently large to guarantee an accurate precision. Besides, it is necessary to choose properly the dimension and the type of the basis by keeping in mind the nature of the curves. There are numerous basis systems but the most employed ones are Fourier functions (for periodic data), B-spline (for non-periodic and smooth data) and wavelets (for curves with strong local behaviour). Finally, sample trajectories can be observed with error or without error. For the first case, least squares approximation is usually used in order to estimate the basis coefficients, whereas for the second scene some interpolation method could be applied. For more details about these methodologies, Ramsay and Silverman (2005) carried through an exhaustive study and Ramsay et al. (2009) extended theses aspects to the software R. C n , D n and E n can be computed by considering the basis expansion. In fact, it is direct to prove that X 1 ðtÞ À X 2 ðtÞ À Á 2 ¼ a 0 1 /ðtÞ À a 0 2 /ðtÞ Kðt; tÞ ¼ P n j¼1 ðX j1 ðtÞ À X 1 ðtÞÞ À ðX j2 ðtÞ À X 2 ðtÞÞ Â Ã 2 n À 1 ¼VarðX 1 ðtÞÞ À 2CovðX 1 ðtÞ; X 2 ðtÞÞ þ VarðX 2 ðtÞÞ ¼Ĉ 1 ðt; tÞ À 2Ĉ 12 ðt; tÞ þĈ 2 ðt; tÞ ¼/ðtÞ 0 ðR 1 À 2R 12 þR 2 Þ/ðtÞ; with d ¼ ðd 1 ; :::; d p Þ 0 ¼ a 1 À a 2 ¼ ða 11 ; :::; a 1p Þ 0 À ða 21 ; :::; a 2p Þ 0 where a rk ¼ n À1 P n j¼1 a jrk r ¼ 1; 2; k ¼ 1; :::; p. Besides, R r is the sample covariance matrix of the matrix A r of basis coefficients in the group r, whose elements are A r ¼ ða jrk Þ, andR 12 is the sample cross-covariance matrix between A 1 and A 2 . Note for major clarity that X r ¼ n À1 P n j¼1 a 0 jr /ðtÞ ¼ a 0 r /ðtÞ.

Multivariate FANOVA for independent measures
The following idea in this kind of analysis is a little bit different to the case of repeated measures. Now, the aim is to test the the equality of the mean functions coming from independent groups. For example, the evolution of level of benzene in the air in two different regions. If there is a response variable (e.g. level of benzene), the problem is known as Univariate FANOVA. Likewise, another fundamental aspect in these studies is the number of factors that determine the different groups. If it only exists one factor (e.g. regions) the problem is called one-way FANOVA. There are several existing methods for testing the one-way FANOVA problem (Faraway 1997;Cuevas et al. 2004;Zhang et al. 2013;Zhang 2014). On the other hand, Gorecki and Smaga (2015) made a detailed comparison of tests for the one-way ANOVA problem for functional data and presented tests based on a basis function representation.
These tests were inspired by the idea of the B-Spline method of Shen and Faraway (2004). In this line, Aguilera et al. (2021) suggested a novel approach by using Functional Principal Component Analysis (FPCA). This method consists of testing multivariate homogeneity on a vector of principal components scores. However, although there are available many works for the univariate case, the natural extension for the multivariate case (more than one functional response variable) is not a theme that it had been studied deeply. Some references are Jacques and Preda (2014), and Gorecki and Smaga (2017). Here, a novel approach based on multivariate FPCA is considered for dealing with the multivariate FANOVA problem. This new methodology can be seen as the extension of the parametric and nonparametric approaches proposed by Aguilera et al. (2021).
In the field of FDA, it is very common to deal with high dimension data. These type of data are defined as data associated to a great number of highly correlated variables where the sample size is too much small. For this reason, one of the most important technique in FDA is FPCA. This tool reduces the dimension of the problem and explains the main characteristics and modes of variation of the curves in terms of a reduce set of uncorrelated variables call functional principal components (PC's). Ramsay and Silverman (2002) presented the univariate approach and discussed the extension of FPCA to the case of bivariate functional data. This theory can be adapted for more than two response variables. PC's are obtained as some generalized linear combinations of the process variables with maximum variance. Formally, the m-th principal component scores are determined by  R T f mh ðtÞf m 0 h ðtÞdt ¼ 1 if m ¼ m 0 and 0 otherwise. These functions are obtained as the eigenfunctions of the eigenequation system Cf m ¼ k m f m , with C being the covariance operator and the sequence fk m g m ! 1 of positive real eigenvalues decreasing to zero indicate the amount of variance attributable to each component. The aforementioned system can be written in detail as follows: Highlight that each PC is a zero-mean random variable with maximum variance and uncorrelated with the remainder of PC's. Hence, in multidimensional context and similar to the univariate setting, this process admits the following orthogonal decomposition known as Karhunen-Loève expansion n ijm f m ðtÞ: The principal advantage of this decomposition is that curves can be approximated by means of a principal reconstruction in terms of the first q PC's, that is X q ij ðtÞ ¼ lðtÞ þ P q m¼1 n ijm f m ðtÞ. Normally, q is chosen so that the explained cumulative variability is as close as possible to 100%. With this approach, the dimension of the problem is considerably reduced.
Jacques and Preda (2014) explain the multivariate FPCA by means of the basis expansions and it is summarized in Schmutz et al. (2020). This approach is briefly explained hereafter. If the basis expansion is considered, X ij ðtÞ can be expressed as where the basis coefficients are gathered as a ij ¼ ða ij11 ; :::; a ij1p 1 ; a ij21 ; :::; a ij2p 2 ; :::; a ijH1 ; :::; a ijHp H Þ with p h being the number of basis functions for the h-th response variable and In general XðtÞ ¼ AU 0 ðtÞ, where A is the resultant matrix after joining by row all a ij . Thus, whether the mean vector is subtracted to each row of XðtÞ, the spectral decomposition of the covariance operator C becomes with R A being the covariance matrix of A, b m being a rowvector that contains the basis coefficients of f m ðtÞ ¼ UðtÞb 0 m and W ¼ R T UðtÞ 0 UðtÞdt being the matrix of inner products between basis functions with dimension P H h¼1 p h Â P H h¼1 p h . Since the showed spectral decomposition is true for all s, the expression can be reduced as , the multivariate FPCA is equivalent to the multivariate PCA of the matrix AW 1=2 , whose covariance matrix can be diagonalized as follows: Therefore, the PC's are given by In order to obtain the principal component scores for the multivariate case, Schmutz et al. (2019) implemented in the software R (R Core Team 2020) the package called ''funHDDC''. Once they are computed, there are suggested two different ways to solve the problem based on testing homogeneity on the vector of the first q principal components scores in the g groups. The first one is to perform univariate ANOVA on each principal component correcting the level of significance for the normality case. It is well-known whether the multivariate normality is suitable, the uncorrelatedness implies independence and then, it does not make sense to consider a multivariate approach. Otherwise, when the multivariate normality is not satisfied, the option is to apply non-parametric multivariate tests such as the extensions of the univariate Kruskal Wallis's test and Moods's test. Normally, it is recommended to use the permutation version of the tests when the sample size is small (Oja 2010).

Results
We now illustrate the use of testing procedures previously described to ascertain whether the level of each pollutant has changed during the lockdown period. As pointed out in Sect. 2, to reveal the impact of restriction measures due to the COVID-19 on the air quality, the obtained environmental datasets were divided in two time frames, of the same length (39 days): i) pre-lockdown (February 1, 2020 -March 10, 2020) and ii) during-lockdown (March 11, 2020-April 18, 2020. From a theoretical viewpoint, we have longitudinal functional data corresponding with the observation of the same functional variables in two different periods of time.

Functional reconstruction of pollutant curves
As a first step of our data analysis, we reconstructed the functional form of curves from the initial points that come from the discrete values measured in the study. To convert the discretely observed data to smooth functions, the reconstruction of curves is made by using a cubic B-spline smoothing. The B-spline functions are one of the most prominent spline basis, used for non-periodic functions, which is proven to be numerically stable and flexible (Ramsay and Silverman 2005). Initially, in tailoring a basis system to fit our data, we used 7 basis functions. This option is conservative: it allows to capture the trend of curves but not their local behaviour. To recover the underlying functions of the observed data, we were increasing the number of basis functions up to 20. This choice preserves important information about the real form of the curves. Figure 5 illustrates the shape the data would take after smoothing them into these basis systems. It is clear that the increase of the number of basis functions produces smaller differences between the smoothed sample curves and the observed data (see Fig. 6). Hereinafter, an approximation of each sample curve in terms of a basis of cubic B-splines of dimension 20 is considered.

FANOVA results for repeated measures
Before moving on more complex studies, we carried out a univariate analysis to evaluate the behaviour of each pollutant before and during lockdown. To statistically confirm the effect of lockdown on the mean of each pollutant, we first implemented FANOVA for repeated measures, as defined in Section 3.2. Specifically, we apply the statistics D n and E n which are the best to control the between and within group variability that there are behind the repeated measures design. In order to construct the tests based on these statistics, a permutation method is used to approximate their null distributions. This technique consists of a random permutation of each sample unit. Let us denote the original data by X ¼ ðX 1 ; X 2 ; . . .; X n Þ where X j ¼ ðX j;1 ; X j;2 Þ ðj ¼ 1; . . .; nÞ; and the resampling vectors by X Ã ¼ ðX 1 Ã; . . .; X n ÃÞ with X j Ã ¼ ðX j;1 Ã; X j;2 ÃÞ being a random permutation of the sample unit X j : This process is repeated D times, with D a number sufficiently large, so that D d n Ã and E d n Ã are calculated for each replication, being d ¼ 1; . . .; D. Later, p-values are obtained as the proportion of times that D d n Ã and E d n Ã overcome D n and E n , respectively. Here, the p-values were obtained from 2000 replications. The results of the proposed testing procedures are shown in Table 2. The p-values of all tests are less than the significance level a ¼ 0:05 for NO 2 , PM 10 and PM 2:5 . For benzene, E n shows no differences between both periods,  but it is very close to the limit region. Therefore, and taking into account the sample size, we have evidence to reject the null hypothesis for benzene and we state that there are also differences in the mean curves of this pollutant in the pre and during lockdown phases. These results statistically confirm the evidences already reported in Table 1 and discussed in Section 2.3. For some pollutants (PM 10 and PM 2:5 ), we recorded an increase during the lockdown tenure probably due to a greater employment of domestic heating systems and the key roles of particular meteorological conditions that govern the formation and transport of particulate matter. For benzene, we pointed out a differentiate behaviour according to the type of monitoring stations, influenced by different emission sources (traffic vs domestic heating). Finally, the collapse of vehicular traffic during the quarantine days justifies the steep decline of NO 2 in both types of measuring sites.

Multivariare FANOVA results for independent measures
Once the impact of the lockdown has been studied, a further step of our data analysis has involved the assessment of equality of mean functions of individual groups. In our context, the groups have been individuated according to the location of the monitoring sites. In more detail, our interest lies in investigating if the mean function of all the pollutants measured in the background stations is equal to that of the urban traffic ones. The multivariate analysis of variance is carried out both before and during lockdown tenure to detect differences attributable to the government restrictions. This comparison has been evaluated firstly globally, considering all the pollutants together, and then for each variable separately. In Table 3, the results for multivariate and univariate FANOVA based on FPCA are displayed. On this matter, four principal components are chosen for both cases (multivariate and univariate analysis), since more than a 99% of total variability is explained with four components in all situations. Besides, due to the fact that the normality is in question and the sample size is really small, the extension of the univariate Kruskal-Wallis's test with the permutation version is conducted by means of ''MNM'' R package. Looking at the results of multivariate FANOVA, we found that, in the pre-lockdown phase, the groups are different from each other and the main discrimination is ascribable to the PM 10 concentrations. Furthermore, it seems that there could be indications of significance as well regarding the benzene because the p-value is 0.186 and by increasing the sample size we could reject the homogeneity in this pollutant. Conversely, the multivariate test is not able to distinguish the two groups in the lockdown period.   In fact, the p-value for the multivariate test is equal to 0.30. However, when we carry out the univariate tests, we record significant differences between the groups in relation to the benzene. This appreciation is also corroborated by the visual inspections of Figs. 7 and 8. A possible explanation for these results can be found in the simulation study performed by Aguilera et al. (2021) where it was shown that these approaches tend to be very conservative for small sample sizes. As for fine particulate (PM 2:5 ), the results of univariate test confirm, as already known in literature, that this pollutant is ubiquitous in urban environment due to the widespread diffusion of the sources (traffic, domestic heating etc.) and the important role played by meteorological variables in the processes of its formation and transport.

Conclusions
Recent studies suggested that lockdown measures, adopted by the most hard-hit countries around the world in the Spring of 2020 to prevent the spread of COVID-19, have had a positive significant impact on air quality. In this paper, novel approaches for functional analysis of variance with univariate repeated measures and multivariate independent measures are presented, as a new methodology for a more effective understanding of the impact of lockdown on four critical air pollutants, measured in five monitoring sites in the urban area of Chieti-Pescara (Central Italy). Being a powerful approach to modelling temporal observations, which is complementary to the usual time series techniques, FDA allowed us to reconstruct the temporal profiles of the studied pollutants for the lockdown and unlock phases in each measuring station. We have found significant reduction in NO 2 levels during the lockdown period albeit some differences in magnitude are recorded according to the monitoring station. These results are in line with the findings of other published studies on this topic (Mahato et al. 2020;Berman and Ebisu 2020;Gautam and Trivedi 2020;Kerimray et al. 2020). Unlike the NO 2 pollutant, for particulate matter, that is for PM 10 and PM 2:5 , the monitoring stations experienced an increase during the quarantine weeks. Besides, less clear was the impact of lockdown on benzene levels: the concentrations of this pollutant were smaller in the traffic stations while an increasing trend was observed in the background measuring sites. Equally important was to determine if these differences were statistically significant. In this respect, the functional analysis of variance has proven to be beneficial to monitoring the evolution of air quality before and during the lockdown tenure and to assessing the equality of mean functions of individual groups, individuated according to the location of measuring sites. The considered FANOVA approaches based on basis expansion of sample curves, dimension reduction by using FPCA of pollutants curves and testing homogeneity on the vector of the most explicative principal component scores have made this analysis feasible providing contrasted evidence to reject the null hypothesis of equality in the mean functions of all pollutants, both in the time frame considered and the localization of monitoring stations. It should be noted that the results of multivariate FANOVA for independent measures, shown in Table 3, suggest a possible misclassification of the monitoring stations as concern the NO 2 , because the proposed technique failed to discriminate between UB and UT measuring sites, despite the fact that NO 2 , in urban areas, is a pollutant mostly produced by traffic emissions. The acknowledge of the presence of some redundant or misclassified monitoring stations will provide better support for managers to formulate a more adequate air pollution control strategy. For all air quality monitoring networks the identification of misclassified measurements is, in fact, an important task, not only for determining the cost of pollution monitoring, but also for determining the integrity of pollution information monitoring and the accuracy of air quality assessment. In general, the FDA framework has provided a valid understanding and knowledge of the temporal behaviour of air pollutants in a kind of controlled experiment such that offered by the lockdown. The COVID-19 restrictions reduced the anthropogenic emissions and created an ''unprecedented scenario'' in which the source of road traffic has been drastically dropped out.
We believe that the results of this study are of interest for environmental protection agencies involved in developing policies to achieve air quality improvements, encouraging them to establish mechanisms to reduce pollution emissions and properly redesign local monitoring network. Using satellite data and a network of more than 10,000 air quality stations, the authors find remarkable declines in ground-level nitrogen dioxide (NO 2 : -29 with 95% confidence interval -44 to -13%), Ozone (O 3 : -11; -20 to -2%) and fine particulate matter (PM 2:5 : -9; -28 to 10%) during the first two weeks of lockdown.
Gope et al (2021) Global (most polluted cities worldwide) The lockdown led to significant reduction of pollutant concentrations mainly due to vehicular traffic (PM 10 , PM 2:5 , BC, benzene, CO and N0 x ) This study investigated the changes in air pollution levels during the lockdown in terms of urban background and traffic air quality observed stations. After two weeks of lockdown, the most significant reduction was estimated for BC and NO 2 (-45 to -51%), pollutants mainly related to traffic emissions. A lower reduction was observed for PM 10 (-28 to -31.0%). By contrast, O 3 , levels increased (from þ33 to þ57%).
Sale City (Morocco) PM 10 , SO 2 , NO 2 Analysing air pollutants before and during the lockdown period, in this study it was found that PM 10 , SO 2 and NO 2 concentrations were reduced respectively by 75, 49 and 96%. Data on seven pollutants, collected over 34 monitoring stations in Delhi, were analyzed during pre-lockdown periods and during the lockdown. Empirical findings gave evidence that air quality significantly improved during lockdown, with reductions of 60% (PM 10 ), 39% (PM 2:5 ), 53% (NO 2 ) and 30% (CO) compared to 2019.
Gautam (2020) India AOD From this study, it was emerged an up-gradation of air quality in the Indian region just a week after lockdown restrictions took place, as a result of the sizeable reduction aerosol optical thickness concentration. Agarwal et al (2020) India and China NO 2 , PM 2:5 It has been recorded a visible improvement in air quality parameters in some cities of India and China, selected on the basis of their availability of historical air pollution data, population density, monitoring station network, and the number of positive COVID-19 cases per million people.
Malaysia and Southeast Asia AOD, PM 10 , PM 2:5 , NO 2 ,SO 2 , CO, 0 3 Over the Southeast Asia region, the analysis of air pollutants before and during the lockdown period reported a significant drop in AOD, PM 2:5 , PM 10 , NO 2 , SO 2 , and CO. Besides, the authors reported $ 40 and $ 70 % (in industrial and urban sites, respectively) reduction in AOD level in Malaysia during March-April 2020 as compared to the same period in 2019 and 2018. Kerimray et al. (2020) Almaty (Kazakhstan) PM 2:5 , CO, NO 2 , 0 3 , benzene, toluene This study reported that in Almaty (Kazakhstan) PM 2:5 concentration reduced by 21% with spatial variations of 6-34% compared to the average of the same days in 2018-2019. CO and NO 2 concentrations reduced by 49 and 35%, respectively while O 3 concentrations increased by 15% compared to the preceding 17 days before the lockdown. Finally, concentrations of benzene and toluene were 2-3 times higher than in the same seasons of 2015-2019.

Zambrano-Monserrate & Ruano (2020)
Ecuador (Quito) NO 2 , PM 2:5 , 0 3 The quarantine policies adopted by the government of Ecuador have led to a significant reduction of NO 2 and PM 2:5 concentrations. Specifically, it was found that the NO 2 concentrations of 2020 were, on average, 5.6 times less than the 2018 concentrations and 4.8 times less than those from 2019. Likewise, compared with these years, the PM 2:5 concentrations were 1.5 and 1.6 times lower, respectively. On the other hand, regarding O 3 concentrations, it was arisen that the ozone levels in 2020 were much higher than the levels in 2018 and 2019.
Acknowledgements This research was funded by project PID2020-113961GB-I00 of the Spanish Ministry of Science and Innovation (also supported by the FEDER program), project FQM-307 of the Government of Andalusia (Spain) and the PhD grant (FPU18/01779) awarded to Christian Acal. The authors also thank the support of the University of Granada, Spain, under project for young researchers PPJIB2020-01.
Funding Open access funding provided by Università degli Studi G. D'Annunzio Chieti Pescara within the CRUI-CARE Agreement.
Data availability All data used during the study are available from the corresponding author by request.

Declarations
Conflict of interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/. Data from four air quality stations in São Paulo Brazil were analyzed to assess air pollutant concentrations variations during the partial lockdown. Overall, drastic reductions on NOx (up to-77.3%), NO 2 (up to-54.3%), and CO (up to-64.8%) concentrations were observed in the urban area during partial lockdown compared to the five-year monthly mean. By contrast, an increase of approximately 30% in ozone concentrations was observed.