1 Introduction

Air pollution is considered one of the biggest health challenges in urban environments worldwide. There is a wide variety of urban air pollutants, such as carbon monoxide (CO), nitrogen oxides (\(\textrm{NO}_x\)), sulphur dioxide (\(\textrm{SO}_2\)), particulate matter (\(\textrm{PM}_{2.5}\) and \(\textrm{PM}_{10}\)) and ozone (\(\textrm{O}_3\)). In this work, we focus on ground-level ozone, as it has been shown to have serious health effects on humans and can also damage plants and trees (e.g. Karlsson et al. 2017), but the proposed methodology could also be applied to other pollutants.

Nowadays, developments in sensor technology and the IoT have made it possible to access data sources where air pollutants and related variables are continuously monitored. For instance, the UK Automatic Urban and Rural Network (AURN, https://uk-air.defra.gov.uk/networks/network-info?view=aurn) records large volumes of information, including the pollutants cited above. As the observed values can be considered realizations of a functional process, the application of functional data analysis (FDA) techniques may be useful in the assessment of the impact of air pollution on human health and ecosystems (see e.g. Ramsay and Silverman 2005; Ferraty and Vieu 2006; Manteiga and Vieu 2007; Ullah and Finch 2013 for a general view of this methodology). There are several studies available in the literature on functional methods applied to environmental data (e.g. Febrero et al. 2008; Delicado et al. 2010; Giraldo et al. 2010; Embling et al. 2012; Sancho et al. 2014; Xiao and Hu 2018). Many of these developments used the tools implemented in different R packages. Among them, we may highlight the packages fda (Ramsay et al. 2020), rainbow (Shang and Hyndman 2019), and fda.usc (Febrero and Oviedo 2012), which allow the application of descriptive, outlier detection, regression, classification, clustering, dimension reduction, variance analysis and bootstrap methods, among others.

Bootstrap methods for functional data can be of great interest for problem solving in many fields, including air quality data analysis (see e.g. McMurry and Politis 2011). They are often used to approximate characteristics of the distribution of statistics related to the process under study. For instance, among many other applications, this includes estimating the probability that a pollutant exceeds a certain threshold value (e.g. UK air quality guidelines state that the eight-hour average of ozone should not exceed 100 \(\upmu \textrm{g}\) \(\textrm{m}^{-3}\) more than 10 times a year). Classical bootstrap procedures have been employed for functional data analysis, including naive, parametric, and block bootstrap methods. For example, Ferraty et al. (2010) studied the asymptotic validity of naive and wild bootstrap methods for inference on a nonparametric functional regression model. Several resampling procedures specifically designed for functional data have also been proposed (de Castro et al. 2005; Politis and Romano 2010). Among them, we may highlight the smoothed bootstrap method proposed in Cuevas et al. (2006), where its performance is compared with that of the naive and parametric bootstrap methods. However, for the results obtained with a bootstrap procedure to be reliable, it must adequately reproduce the variability of the underlying process.

The proposed bootstrap procedure is an adaptation of the method developed by Castillo-Páez et al. (2019) for spatial data. The idea is to consider the functional process as a one-dimensional spatial process, so that repeated (independent) measurements are observed at some discretization points. This method requires the modelling of the variability of the process, which is done employing nonparametric techniques. Following the usual procedure in geostatistics, the dependence is modelled through the semivariogram. For this purpose, a new package npfda (Fernandez-Casal et al. 2023) has been developed, adapting the tools implemented in the npsp package (Fernandez-Casal 2023) for this particular case (see the supplementary material for more details).

This methodology was applied to ground-level ozone data. The data set consists of daily averages of ozone concentration (\(\upmu \textrm{g}\) \(\textrm{m}^{-3}\)) recorded over the period from 1988 to 2020 at the Yarner Wood monitoring site in the UK (available at https://uk-air.defra.gov.uk/data). These data were pre-processed, applying the usual outlier detection and data imputation methods, using the package climatol (Guijarro 2019). It is assumed that the observations corresponding to each year are (partial) realizations of a functional process, so the data consist of 33 curves observed at 365 discretization points. These curves are shown in Fig. 1. As an initial objective, we assume that we intend to make inferences about the annual trend of the ozone level; specifically, the estimation of the mean curve and the construction of confidence intervals (Sect. 4). However, this methodology can be used for a large number of problems, including the analysis of other pollution-related variables.

Fig. 1 Annual ozone curves (\({\upmu \textrm{g}\,\textrm{m}^{-3}}\)), from 1988 to 2020, at the Yarner Wood site in the UK (using a rainbow colour scale for the year)

The remainder of the paper is organized as follows. The general model, the nonparametric estimators and the proposed bootstrap method are presented in Sect. 2. The performance of this procedure is illustrated through numerical studies in Sect. 3, where the results are compared with those derived from the naive and smoothed bootstrap approaches. In Sect. 4, we describe an application of the proposed methodology to the ozone data. Finally, Sect. 5 contains a summary of the main conclusions and some final remarks.

2 Methodology

Suppose that \({\mathcal {S}}_{n}=\{Y_i(t)\}_{i=1}^{n}\), for \(t \in [a, b] \subset {\mathbb {R}}\), is a set of n independent observations of a functional variable Y(t) defined over \({\mathbb {R}}\), satisfying:

$$\begin{aligned} Y_i(t) = \mu (t) + \sigma (t)\varepsilon _{i}(t), \end{aligned}$$
(1)

where \(\mu (t)\) is the functional trend, \(\sigma ^2(t)\) the functional variance, and \(\varepsilon _{i}(t)\) a random error process with zero mean, unit variance and correlations

$$\begin{aligned} \textrm{Cov} \left( \varepsilon _{i}(t), \varepsilon _{i'}(t')\right) = \delta _{ii'} \rho \left( \left| t - t' \right| \right) , \end{aligned}$$

for \(1 \le i,i' \le n\) and \(a \le t,t' \le b\), where \(\delta _{ii'} = 1\) if \(i=i'\), \(\delta _{ii'} = 0\) if \(i \ne i'\) and \(\rho (\cdot )\) is the correlogram function.

In practice, each \(Y_i(t)\) is observed at a discrete set of points \(t_j \in [a, b] \subset {\mathbb {R}}\), with \(j=1,\ldots ,p\). This set of observations can be expressed as a matrix \({\textbf{Y}}\) of order \(n \times p\), with \({\textbf{Y}}_{ij} = Y_i(t_j)\). Furthermore, if \({\textbf{y}}_i = \left( Y_i(t_1), \ldots , Y_i(t_p)\right) ^\top \) is the vector corresponding to the i-th row of \({\textbf{Y}}\), the elements of its covariance matrix \(\textrm{Cov}({\textbf{y}}_i) = \varvec{\Sigma }_0 \) (within-curve covariance matrix) are

$$\begin{aligned} \left( \varvec{\Sigma }_0 \right) _{jj'} = \sigma (t_j) \sigma (t_{j'}) \rho \left( \left| t_j - t_{j'} \right| \right) , \end{aligned}$$

for \(i = 1, \ldots , n\). Consequently, \(\varvec{\Sigma }_0 = {\textbf{D}} \varvec{\Sigma }_{\varvec{\varepsilon }} {\textbf{D}}\), where \(\varvec{\Sigma }_{\varvec{\varepsilon }}\) (the within-curve correlation matrix) is the covariance matrix of \(\varvec{\varepsilon }_i = \left( \varepsilon _i(t_1), \ldots , \varepsilon _i(t_p)\right) ^\top \), for \(i = 1, \ldots , n\), and \({\textbf{D}}= \textrm{diag}(\sigma (t_1),\ldots ,\sigma (t_p))\). In practice, however, the dependence structure is estimated through the error semivariogram:

$$\begin{aligned} \gamma (u)= \frac{1}{2}\textrm{Var}(\varepsilon (t) - \varepsilon (t+u) ) = 1 - \rho (u). \end{aligned}$$

2.1 Nonparametric Estimation

The proposed procedure starts with the nonparametric estimation of the trend, the conditional variance and the dependence, following an iterative algorithm similar to the one described in Fernández-Casal et al. (2017). In this case, however, since multiple realizations of the process are available, a bias correction in the estimation of the small-scale variability appears to be unnecessary.

The trend is estimated by linear smoothing of

$$\begin{aligned} \left\{ (t_{j}, Y_i(t_{j})): 1 \le i \le n, 1 \le j \le p \right\} . \end{aligned}$$

This estimator can be written explicitly in terms of the sample means \({\bar{Y}}(t) = \frac{1}{n}\sum _{i}Y_{i}(t)\):

$$\begin{aligned} {\hat{\mu }}(t) = {\textbf{e}}_1^\top \left( {\textbf{X}}_{t}^\top {\textbf{W}}_{t}{\textbf{X}}_{t} \right) ^{-1}{\textbf{X}}_{t}^\top {\textbf{W}}_{t} \bar{{\textbf{y}}} = {\textbf{s}}_{t}^\top {\bar{{\textbf{y}}}} \end{aligned}$$
(2)

where \(\bar{{\textbf{y}}} = \left( {\bar{Y}}(t_1), \ldots , {\bar{Y}}(t_p)\right) ^\top \), \({\textbf{e}}_1=(1,0)^\top \), \({\textbf{X}}_{t}\) is a matrix with the j-th row equal to \(\left( 1, t_j - t \right) \), \({\textbf{W}}_{t}= \textrm{diag} \lbrace K_{h}(t_1 - t), \ldots , K_{h}(t_p - t) \rbrace \), \(K_{h}(u)= \frac{1}{h}K(\frac{u}{h})\), K is a kernel function and h is the bandwidth parameter.
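For illustration, a minimal base-R sketch of (2) follows, assuming the curves are stored in an \(n \times p\) matrix Y and the discretization points in a vector tt (names chosen here for illustration), with a Gaussian kernel by default:

    # Local linear trend estimate (2), computed by smoothing the sample means.
    loclin_trend <- function(Y, tt, h, K = dnorm) {
      ybar <- colMeans(Y)                  # sample means at each t_j
      sapply(tt, function(t) {
        X <- cbind(1, tt - t)              # design matrix X_t
        w <- K((tt - t) / h) / h           # kernel weights (diagonal of W_t)
        XtW <- t(X * w)                    # X_t' W_t
        solve(XtW %*% X, XtW %*% ybar)[1]  # e_1' (X_t' W_t X_t)^{-1} X_t' W_t ybar
      })
    }

Evaluating the estimator at the discretization points themselves, e.g. loclin_trend(Y, tt, h), yields the vector \(\left( {\hat{\mu }}(t_1), \ldots , {\hat{\mu }}(t_p)\right) \).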

The small-scale variability of the process, determined by the conditional variance and the temporal dependence of the error process, is estimated from the residuals \(r_{ij} = Y_i(t_j) - {{\hat{\mu }}}(t_j)\). An estimate of the conditional variance \({\hat{\sigma ^2}}(\cdot )\) is obtained by linear smoothing of:

$$\begin{aligned} \{(t_{j},r^2_{ij}): 1 \le i \le n, 1 \le j \le p \}, \end{aligned}$$

analogously to the trend estimate, using a bandwidth \(h_2\).

A pilot local linear estimate \({\hat{\gamma }}(\cdot )\) of the error semivariogram is obtained by linear smoothing of the semivariances

$$\begin{aligned} \left\{ \left( t_{j}-t_{j'}, \tfrac{1}{2} ({\hat{\varepsilon }}_{ij}- {\hat{\varepsilon }}_{ij'})^2 \right) : 1 \le i \le n, 1 \le j < j' \le p \right\} , \end{aligned}$$

computed from the standardized residuals \({\hat{\varepsilon }}_{ij} = r_{ij}/{\hat{\sigma }}(t_j)\). The corresponding bandwidth parameter will be denoted by \(h_3\). Additionally, as this estimator is not necessarily conditionally negative definite (so it cannot be used directly for prediction or simulation), a flexible Shapiro–Botha variogram model (Shapiro and Botha 1991) is fitted to the pilot estimates to obtain the final variogram estimate \({\bar{\gamma }}(\cdot )\).
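A sketch of this pilot estimate under the same conventions (E denotes the \(n \times p\) matrix of standardized residuals) might read:

    # Pilot local linear estimate of the error semivariogram at a grid of lags.
    # Since the weights depend only on the lag, the semivariances can be averaged
    # over the n curves before smoothing without changing the result.
    pilot_variogram <- function(E, tt, h3,
                                lags = seq(0, diff(range(tt)) / 2, length.out = 50)) {
      p <- ncol(E)
      idx <- which(upper.tri(diag(p)), arr.ind = TRUE)      # pairs j < j'
      u  <- abs(tt[idx[, 2]] - tt[idx[, 1]])                # lags |t_j - t_j'|
      sv <- colMeans((E[, idx[, 1]] - E[, idx[, 2]])^2) / 2 # mean semivariances
      sapply(lags, function(l) {
        X <- cbind(1, u - l)
        w <- dnorm((u - l) / h3)                            # Gaussian weights
        XtW <- t(X * w)
        solve(XtW %*% X, XtW %*% sv)[1]
      })
    }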

Although the choice of the kernel function is of secondary importance, the bandwidth parameters play an important role in the performance of the local linear estimators described above, since they control the shape and size of the local neighbourhoods used to compute the corresponding estimates, determining their smoothness. However, when the data are correlated, traditional smoothing parameter selection methods for nonparametric regression often fail to provide useful results (Opsomer et al. 2001). To take the dependence into account, we recommend the use of the “bias-corrected and estimated” generalized cross-validation criterion (CGCV) proposed in Francisco-Fernández and Opsomer (2005). In the case of the trend estimator \({\hat{\mu }}(\cdot )\), this method consists of selecting the bandwidth h that minimizes:

$$\begin{aligned} \textrm{CGCV}(h)=\frac{1}{p}\sum _{j=1}^{p}\left( \frac{{\bar{Y}}(t_j) - {\hat{\mu }}(t_j)}{1-\frac{1}{p}\textrm{tr}( {\textbf{S}} \hat{{\textbf{R}}}_{\bar{{\textbf{y}}}} )}\right) ^{2}, \end{aligned}$$

where \(\textrm{tr}({\textbf{A}})\) stands for the trace of a square matrix \({\textbf{A}}\), \({\textbf{S}}\) is the smoothing matrix, a \(p \times p\) matrix whose jth row is equal to \({\textbf{s}}_{t_j}\) (the smoother vector for \(t = t_j\)), and \(\hat{{\textbf{R}}}_{\bar{{\textbf{y}}}}\) is an estimate of the correlation matrix of the sample means \(\bar{{\textbf{y}}}\). This matrix can be easily obtained bearing in mind that:

$$\begin{aligned} \textrm{Cov} \left( {\bar{Y}}(t_j), {\bar{Y}}(t_{j'}) \right) = \tfrac{1}{n} \sigma (t_j) \sigma (t_{j'}) \rho \left( \left| t_j - t_{j'} \right| \right) . \end{aligned}$$
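A sketch of this criterion is given below; Rbar stands for the estimated \(p \times p\) correlation matrix of the sample means, which, by the expression above, has entries \(\rho \left( \left| t_j - t_{j'} \right| \right) \):

    # CGCV criterion for the trend bandwidth h (sketch).
    cgcv <- function(h, Y, tt, Rbar) {
      p <- length(tt)
      ybar <- colMeans(Y)
      S <- t(sapply(tt, function(t) {       # smoothing matrix: row j = s_{t_j}
        X <- cbind(1, tt - t)
        w <- dnorm((tt - t) / h)
        XtW <- t(X * w)
        (solve(XtW %*% X) %*% XtW)[1, ]
      }))
      denom <- 1 - sum(diag(S %*% Rbar)) / p
      mean(((ybar - S %*% ybar) / denom)^2)
    }
    # e.g. h <- optimize(cgcv, c(5, 100), Y = Y, tt = tt, Rbar = Rbar)$minimum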

An analogous procedure can be used to select the bandwidth \(h_2\) for the variance estimation. However, this method requires an estimate of the correlation matrix of the squared residuals (or of their sample means, if the previous approximation is used). Under the assumptions of normality and zero mean for the residuals, the covariance matrix of the squared residuals admits the following expression:

$$\begin{aligned} \varvec{\Sigma }_{{\textbf{r}}^2} = 2\varvec{\Sigma }_{{\textbf{r}}}\odot \varvec{\Sigma }_{{\textbf{r}}}, \end{aligned}$$
(3)

where \(\odot \) represents the Hadamard product and \(\varvec{\Sigma }_{{\textbf{r}}}\) the covariance matrix of the residuals (Ruppert et al. 1997), from which it is simpler to approximate the required correlations. The bandwidth parameter \(h_3\) for the estimation of the variogram could be selected, for instance, by minimizing the cross-validation relative squared error of the semivariogram estimates (see e.g. Fernández-Casal and Francisco-Fernández 2014). However, as this criterion does not take into account the dependence between the sample semivariances, the resulting bandwidth should be increased (for example, by multiplying it by a factor between 1.5 and 2) to avoid under-smoothing the variogram estimates.

The above criteria for the selection of optimal bandwidths for trend and variance estimation require an estimate of the small-scale variability of the process, leading to a circular problem. To avoid it, an iterative algorithm is used, starting with initial bandwidths h and \(h_2\) (e.g. obtained by any of the available methods for independent data). At each iteration, the bandwidths are selected using the variance and variogram estimates computed in the previous iteration, and the model components are re-estimated. The algorithm is considered to have converged when there are no significant changes in the selected bandwidths, indicating similar small-scale variability estimates. In practice, a single iteration of this algorithm is typically sufficient. This procedure is implemented in the npf.fit() function of the npfda package (Fernandez-Casal et al. 2023). More details are provided in the supplementary material.
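The loop below sketches the whole scheme, reusing loclin_trend(), pilot_variogram() and cgcv() from the previous sketches; sb_fit() is a hypothetical stand-in for the Shapiro–Botha fitting step (assumed to return the fitted variogram as a vectorized function), and the search range, pilot bandwidth and tolerance are merely illustrative:

    # Iterative selection of h and h2 accounting for dependence (sketch).
    u <- abs(outer(tt, tt, "-"))
    Rbar <- diag(length(tt))                 # initial step: assume independence
    rng <- diff(range(tt)) * c(0.05, 0.5)    # illustrative bandwidth search range
    h3  <- diff(range(tt)) / 10              # illustrative pilot bandwidth
    h.old <- h2.old <- Inf
    repeat {
      h  <- optimize(cgcv, rng, Y = Y, tt = tt, Rbar = Rbar)$minimum
      mu <- loclin_trend(Y, tt, h)                     # trend estimate
      r  <- sweep(Y, 2, mu)                            # residuals
      # by (3), the correlation matrix of the squared residuals is Rbar^2
      h2 <- optimize(cgcv, rng, Y = r^2, tt = tt, Rbar = Rbar^2)$minimum
      s2 <- loclin_trend(r^2, tt, h2)                  # conditional variance
      E  <- sweep(r, 2, sqrt(s2), "/")                 # standardized residuals
      gam <- sb_fit(pilot_variogram(E, tt, h3))        # hypothetical S-B fit
      Rbar <- 1 - gam(u); diag(Rbar) <- 1              # corr. of the sample means
      if (max(abs(c(h, h2) / c(h.old, h2.old) - 1)) < 0.1) break
      h.old <- h; h2.old <- h2              # typically one iteration suffices
    }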

2.2 Nonparametric Bootstrap

Using the nonparametric estimates of the trend \({{\hat{\mu }}}(\cdot )\), the variance \({\hat{\sigma ^2}}(\cdot )\) and the semivariogram \({\bar{\gamma }}(\cdot )\) obtained with the procedure described in the previous section, the proposed bootstrap algorithm is as follows (a base-R sketch of one replicate is given after the list):

1. Form the standardized residuals matrix \(\hat{{\textbf{E}}}\), whose ith row is equal to \(\hat{\varvec{\varepsilon }}_i = \hat{{\textbf{D}}}^{-1} ({\textbf{y}}_i - \hat{\varvec{\mu }})\), where \(\hat{{\textbf{D}}} = \textrm{diag}({\hat{\sigma }}(t_1),\ldots ,{\hat{\sigma }}(t_p))\) and \(\hat{\varvec{\mu }} = \left( {\hat{\mu }}(t_1), \ldots , {\hat{\mu }}(t_p)\right) ^\top \).

2. Construct an estimate \(\hat{\varvec{\Sigma }}_{\varvec{\varepsilon }}\) of the within-curve correlation matrix from \({\bar{\gamma }}(\cdot )\), and compute its Cholesky decomposition \(\hat{\varvec{\Sigma }}_{\varvec{\varepsilon }}={\textbf{U}}^\top {\textbf{U}}\).

3. Compute the uncorrelated standardized residuals \({\textbf{E}}=\hat{{\textbf{E}}}{\textbf{U}}^{-1}\) and scale them (jointly, by subtracting the overall sample mean and dividing by the overall sample standard deviation).

4. Use the scaled values to derive an independent bootstrap sample \({\textbf{E}}^{*}\) (by resampling the rows and columns of \({\textbf{E}}\)).

5. Compute the bootstrap errors \(\varvec{\varepsilon }^{*} = {\textbf{E}}^{*}{\textbf{U}}\).

6. Obtain the bootstrap sample \({\textbf{Y}}^{*}\), with

   $$\begin{aligned} {\textbf{y}}^{*}_i = \hat{\varvec{\mu }} + \hat{{\textbf{D}}} \varvec{\varepsilon }^{*}_i, \end{aligned}$$

   for \(i = 1, \ldots , n\).

7. Repeat steps 4–6 B times to obtain the B bootstrap replicates \(\left\{ {\textbf{Y}}^{*}_{1}, \ldots ,{\textbf{Y}}^{*}_{B}\right\} \).
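A minimal base-R sketch of one replicate (steps 1–6) is shown below; mu.hat and sd.hat denote the vectors of trend and standard deviation estimates at the discretization points, Sigma.eps the estimated within-curve correlation matrix, and step 4 is read here as independent resampling of the entries of \({\textbf{E}}\):

    # One NPB replicate (steps 1-6 of the algorithm above).
    npb_replicate <- function(Y, mu.hat, sd.hat, Sigma.eps) {
      n <- nrow(Y); p <- ncol(Y)
      Ehat <- sweep(sweep(Y, 2, mu.hat), 2, sd.hat, "/")       # step 1
      U <- chol(Sigma.eps)                                     # step 2: U'U
      E <- Ehat %*% solve(U)                                   # step 3: uncorrelate
      E <- (E - mean(E)) / sd(c(E))                            #   and rescale jointly
      Estar <- matrix(sample(E, n * p, replace = TRUE), n, p)  # step 4
      eps.star <- Estar %*% U                                  # step 5: recorrelate
      sweep(sweep(eps.star, 2, sd.hat, "*"), 2, mu.hat, "+")   # step 6
    }
    # Step 7: Ystar <- replicate(B, npb_replicate(Y, mu.hat, sd.hat, Sigma.eps))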

As stated in the Introduction, the replicates derived from this algorithm can be used to approximate characteristics of the distribution of a statistic under study. For example, they can be used to approximate the standard error and bias of an estimator (as illustrated in Sects. 3 and 4), as well as to compute confidence intervals (Sect. 4), among many other potential applications.

3 Simulation Results

This section presents various studies comparing the performance of the proposed nonparametric bootstrap method (NPB) with the smoothed bootstrap (SB) method proposed by Cuevas et al. (2006) and the naive bootstrap (NB) method. The SB algorithm is implemented in the fdata.bootstrap() function of the fda.usc package and can be summarized as follows:

1. Draw a standard bootstrap replicate \({\textbf{Y}}^{*}_0\) from \({\textbf{Y}}\), by uniform resampling of the rows \({\textbf{y}}_1, \ldots , {\textbf{y}}_n\).

2. Generate \({\textbf{Z}}\), such that each row \({\textbf{z}}_i =(Z_i(t_1), \ldots , Z_i(t_p))^\top \) is normally distributed with mean \({\textbf{0}}\) and covariance matrix \(\alpha \hat{\varvec{\Sigma }}_{{\textbf{Y}}}\), where \(\hat{\varvec{\Sigma }}_{{\textbf{Y}}}\) is the sample covariance matrix of the observed values \({\textbf{Y}}\) (an estimate of \(\varvec{\Sigma }_0\)) and \(\alpha \) is a smoothing parameter (controlling the amount of additional variability), and such that \({\textbf{z}}_i\) is independent of \({\textbf{z}}_{i'}\) if \(i \ne i'\) (\(\textrm{Cov} \left( Z_{i}(t_j), Z_{i'}(t_{j'})\right) = 0\)).

3. Compute the bootstrap sample as \({\textbf{Y}}^{*} = {\textbf{Y}}^{*}_0 + {\textbf{Z}}\).

4. Repeat steps 1–3 B times to obtain the B bootstrap replicates \(\left\{ {\textbf{Y}}^{*}_{1}, \ldots ,{\textbf{Y}}^{*}_{B}\right\} \).

The difficulty in applying this method in practice is the proper selection of the \(\alpha \) parameter. However, in the results shown below, we set \(\alpha = 0.05\) following the authors’ recommendation.

Note that the naive bootstrap (NB) can be obtained as a particular case when \(\alpha = 0\). In this case, steps 2 and 3 in the previous algorithm can be skipped, resulting in the naive bootstrap replicates \({\textbf{Y}}^* = {\textbf{Y}}_0^*\).
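Both procedures admit a short sketch in base R (MASS::mvrnorm() is used in step 2 because the sample covariance matrix is singular when \(n < p\), which mvrnorm() handles via its eigendecomposition):

    library(MASS)  # for mvrnorm()

    # One SB replicate; alpha = 0 gives a naive bootstrap (NB) replicate.
    sb_replicate <- function(Y, alpha = 0.05) {
      n <- nrow(Y); p <- ncol(Y)
      Ystar0 <- Y[sample(n, replace = TRUE), , drop = FALSE]   # step 1
      if (alpha == 0) return(Ystar0)                           # NB case
      Z <- mvrnorm(n, mu = rep(0, p), Sigma = alpha * cov(Y))  # step 2
      Ystar0 + Z                                               # step 3
    }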

Numerical studies were carried out to analyse the behaviour of the three bootstrap procedures (NPB, SB and NB) under different scenarios. In each case, \(N = 2000\) samples of \(n=25\), 50 and 100 curves, observed at \(p=101\) regular discretization points in the interval \(\left[ 0,1\right] \) and following model (1), were generated. In order to take into account the effect of different functional forms of the trend and variance, the following theoretical functions were considered: \(\mu _{1}(t)=2.5 + \sin (2\pi t)\) (nonlinear trend), \(\mu _{2}(t)=10t(1-t)\) (polynomial trend), \(\mu _{3}(t)= 2\) (constant trend), \(\sigma _{1}^{2}(t) = (\frac{15}{16} )^2 [1-(2t-1)^2]^2 + 0.1\) (nonlinear variance), \(\sigma _{2}^{2}(t)= 0.5 (1+t)\) (linear variance) and \(\sigma _{3}^{2}(t)=1\) (constant variance, i.e. the homoscedastic case). The random errors \(\varepsilon _{i}\) were normally distributed with zero mean, unit variance and isotropic exponential variogram:

$$\begin{aligned} \gamma _\varepsilon (u)=c_{0}+ (1 - c_0)\left( 1-\exp \left( -3\frac{\vert u\vert }{a}\right) \right) , \end{aligned}$$

(for \(u\ne 0\)), where \(c_{0}\) is the nugget effect (\(1 - c_0\) is the partial sill) and a is the practical range. The values considered in the simulations were \(a=0.3, 0.6, 0.9\), and \(c_0 = 0, 0.2, 0.5\). For instance, Fig. 2 provides an idea of the shape of the simulated samples in two of the studied scenarios.
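For reference, samples from model (1) under these settings can be generated along the following lines (sim_curves() is an illustrative name, not part of any package):

    # Simulate n curves from model (1) with exponential variogram.
    sim_curves <- function(n, p = 101, mu, sigma2, c0 = 0.2, a = 0.6) {
      tt <- seq(0, 1, length.out = p)
      u <- abs(outer(tt, tt, "-"))
      R <- (1 - c0) * exp(-3 * u / a)       # rho(u) = 1 - gamma(u), for u > 0
      diag(R) <- 1                          # the nugget only acts off-diagonal
      eps <- matrix(rnorm(n * p), n, p) %*% chol(R)   # correlated N(0, 1) errors
      list(t = tt, Y = sweep(sweep(eps, 2, sqrt(sigma2(tt)), "*"), 2, mu(tt), "+"))
    }
    # e.g. the first scenario of Fig. 2:
    sam <- sim_curves(25, mu = function(t) 2.5 + sin(2 * pi * t),
                      sigma2 = function(t) (15/16)^2 * (1 - (2*t - 1)^2)^2 + 0.1)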

Fig. 2 Simulated samples of size \(n=25\) with \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear), \(c_{0}=0.2\) and \(a=0.6\) (a), and with \(\mu _2\) (polynomial), \(\sigma _2^2\) (linear), \(c_{0}=0\) and \(a=0.9\) (b). The theoretical trends are shown in solid lines and the nonparametric estimates in black dashed lines

In each scenario, \(B=1000\) bootstrap replicates were obtained with the NPB, SB and NB methods. Their performance was analysed by comparing the results in the approximation of characteristics of two estimators. More specifically, we will consider the approximation of the bias and the standard error (se) of the nonparametric trend \({\hat{\mu }}(t)\) and conditional variance \({\hat{\sigma ^2}}(t)\) estimators described in Sect. 2.1. The general procedure to approximate the bias and the standard error of an estimator \({\hat{\theta }}(t)\) from bootstrap resamples is as follows (see the sketch after this list):

1. Derive B replicates \(\left\{ {\textbf{Y}}^{*}_{1}, \ldots ,{\textbf{Y}}^{*}_{B}\right\} \) from the original data.

2. Compute B estimates of \(\theta (t)\) from the B replicates, which will be denoted by \(\left\{ {{\hat{\theta }}^{*}_{1}(t)},\ldots , {{\hat{\theta }}^{*}_{B}(t)}\right\} \).

3. Approximate the bootstrap version of \(\sigma ({\hat{\theta }}(t))\) as follows:

   $$\begin{aligned} {\widehat{se}}^{*}({\hat{\theta }}^{*}(t)) = \left\{ \frac{1}{B-1}\sum _{b=1}^{B} \left( {\hat{\theta }}^{*}_{b}(t) - \bar{{\hat{\theta }}}^{*}(t) \right) ^2\right\} ^\frac{1}{2}, \end{aligned}$$
   (4)

   where \(\bar{{\hat{\theta }}}^{*}(t) = \sum _{b=1}^{B} {\hat{\theta }}^{*}_{b}(t) / B\).

4. In a similar way, obtain the bootstrap counterpart of \(\textrm{Bias} ({\hat{\theta }}(t))\) through

   $$\begin{aligned} {\widehat{\textrm{Bias}}}^{*}({\hat{\theta }}^{*}(t)) = \frac{1}{B} \sum _{b=1}^{B} \left( {\hat{\theta }}^{*}_b (t) - {\hat{\theta }}(t) \right) . \end{aligned}$$
   (5)
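If the B replicate estimates are stored column-wise in a \(p \times B\) matrix theta.star and the original estimate in theta.hat (names illustrative), (4) and (5) reduce to:

    se.boot   <- apply(theta.star, 1, sd)           # bootstrap standard error (4)
    bias.boot <- rowMeans(theta.star) - theta.hat   # bootstrap bias (5)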

To avoid the effect that the bandwidth selection criteria might have on the results, the local linear trend and variance estimators were computed using the bandwidths that minimized the corresponding (theoretical) mean average squared errors (MASE). For the trend estimator, this criterion can be expressed as follows:

$$\begin{aligned} \textrm{MASE}(h)= \frac{1}{p}({\textbf{S}}\varvec{\mu }-\varvec{\mu })^\top ({\textbf{S}}\varvec{\mu }-\varvec{\mu })+\frac{1}{np}\textrm{tr}({\textbf{S}} \varvec{\Sigma }_0{\textbf{S}}^\top ), \end{aligned}$$

where \(\varvec{\mu }=\left( \mu (t_1), \ldots , \mu (t_p)\right) ^\top \). An analogous approach was used in the case of the variance estimator, using (3) to approximate the corresponding covariance matrix.
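Given the smoothing matrix \({\textbf{S}}\) for a candidate bandwidth (computed as in the CGCV sketch of Sect. 2.1) and the theoretical \(\varvec{\mu }\) and \(\varvec{\Sigma }_0\), this criterion can be evaluated directly:

    # Theoretical MASE of the trend estimator for the bandwidth implicit in S.
    mase <- function(S, mu, Sigma0, n) {
      p <- length(mu)
      sum((S %*% mu - mu)^2) / p +                   # averaged squared bias
        sum(diag(S %*% Sigma0 %*% t(S))) / (n * p)   # averaged variance
    }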

At each simulation, the bias and standard error of the two estimators were approximated through (5) and (4). To measure the accuracy of these bootstrap approximations, mean squared errors (MSE) were computed, using theoretical values of \(\textrm{Bias} ({\hat{\theta }}(t))\) and \(\sigma ({\hat{\theta }}(t))\) approximated by simulation. For example, in the case of the approximation of the bias of the trend estimator:

$$\begin{aligned} \textrm{MSE}(t) = E\left\{ \left[ {\widehat{\textrm{Bias}}}^*({\hat{\mu }}^*(t)) - \textrm{Bias}({\hat{\mu }}(t))\right] ^2 \right\} . \end{aligned}$$

The averages of these errors over the discretization points will be denoted by AMSE.

Similar results were observed across the simulation scenarios, although only a few representative outcomes are presented here for brevity. Overall, the proposed method showed superior performance in approximating the bias of both estimators, as the bias approximations obtained with the SB and NB methods tended to be close to zero (thus failing to capture the actual bias), particularly for the trend estimator.

In addition, the results obtained with the SB and NB methods were more similar to each other than expected, given that the SB replicates have additional variability. Only slight differences between these two methods were observed when approximating the bias of the variance estimator. For example, Fig. 3 compares the theoretical values with the bootstrap approximations of the bias and the standard error of both estimators for \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear), \(n = 50\), \(c_{0}=0.2\) and \(a=0.6\).

Fig. 3 Comparison of the theoretical bias (left) and standard error (right) with their bootstrap approximations, for the local linear trend (top) and the variance (bottom) estimators, considering \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear), \(n = 50\), \(c_{0}=0.2\) and \(a=0.6\). The theoretical values are shown in solid lines, the NPB, SB and NB approximations in dashed, dotted and dot-dashed lines, respectively

Unexpectedly, the standard error approximations obtained with the SB and NB methods turned out to be slightly better than those obtained with the NPB method, especially when the sample size is small. For instance, Table 1 summarizes the errors obtained in the approximation of the bias and standard error of the estimators with the three bootstrap procedures for the different sample sizes, considering \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear), \(c_{0}=0.2\) and \(a=0.6\). It can be observed that, as the sample size n increases, the squared errors decrease, suggesting the consistency of the approximations obtained with all methods. A clear improvement is observed when using the SB or NB methods to approximate the standard error of the variance estimator with the smallest sample size, with very similar results for all methods as the number of observations increases. However, the NPB method outperforms the other methods at approximating the bias in all cases, especially for the variance estimator.

Table 1 Monte Carlo approximations of the AMSE \((\times 10^{2})\) of the bias and standard error bootstrap estimates, for the local linear trend \({{\hat{\mu }}}(t)\) and variance \({\hat{\sigma ^2}}(t)\) estimators, considering \(\mu _{1}\) (nonlinear), \(\sigma _1^2\) (nonlinear), \(c_{0}=0.2\), \(a=0.6\) and the different sample sizes

The influence of the temporal dependence on the bootstrap approximations was also studied. For instance, Table 2 shows the results obtained for the trend estimator \({{\hat{\mu }}}(t)\) for the different nugget (\(c_0\)) and practical range (a) values, with \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear) and \(n=100\). In these cases, the errors corresponding to the standard error approximations are quite similar for all three methods. As for the bootstrap estimates of the biases, as expected, the errors generally decrease as the nugget increases (which corresponds to weaker temporal dependence). This effect is particularly pronounced when the SB or NB method is used. A similar behaviour is observed when the practical range increases.

Table 2 Monte Carlo approximations of the AMSE \((\times 10^{2})\) of the bootstrap estimates of the bias and standard error of \({{\hat{\mu }}}(t)\), considering the different \(c_0\) and a values, with \(\mu _1\) (nonlinear), \(\sigma _1^2\) (nonlinear) and \(n=100\)

Finally, Table 3 illustrates the effect of the theoretical functional forms assumed in model (1) on the errors of the bias and standard error approximations of the variance estimator (for \(n=100\), \(c_0 = 0.2\) and \(a=0.6\)). Once again, the NPB method consistently outperforms the other methods in approximating biases across the different scenarios. When the variance model is fixed, similar results are obtained with all methods as the trend varies. However, for the same theoretical trend, different behaviours are observed when the functional form of the theoretical variance changes. While the error of the bias approximations increases notably with the SB and NB methods when simpler variance models are considered, a similar effect is observed with the NPB method in the standard error approximations. This may be attributed to the slight underestimation of the variance by the local linear estimator \({\hat{\sigma ^2}}(t)\) in these cases, resulting in a small negative bias that the SB and NB methods approximate with values close to zero, and producing slightly lower variability in the NPB method.

Table 3 Monte Carlo approximations of the AMSE \((\times 10^{2})\) of the bootstrap estimates of the bias and standard error of \({\hat{\sigma ^2}}(t)\), considering the different theoretical trend and variance functions, with \(n=100\), \(c_0 = 0.2\) and \(a=0.6\)

4 Application to Pollution Data

In this section, the practical performance of the proposed methodology is illustrated through its application to the data set of ground-level ozone concentrations briefly mentioned in the Introduction (\(n = 33\) and \(p = 365\)).

The iterative process described at the end of Sect. 2.1 was used to estimate the model components. As a stopping criterion, an absolute percentage difference between bandwidths of less than 10% was used. Two iterations were performed in this case (although only one would have been necessary, since the bandwidths selected for trend and variance estimation in the first iteration were nearly identical to those of the second; further details can be found in the supplementary material). The final trend estimate, computed with a bandwidth \(h = 36.077\) selected by the CGCV criterion, is shown in Fig. 4, where an increase in mean ozone levels is observed during springtime.

Fig. 4 Sample mean (dashed line) and nonparametric trend estimate (solid line) of the ozone data (grey dotted lines)

Then, from the final residuals, the variance estimate \({\hat{\sigma ^2}}(\cdot )\) (with a bandwidth \(h_2 = 33.106\) selected by the CGCV criterion), the pilot semivariogram estimates \({\hat{\gamma }}(\cdot )\) (with a bandwidth \(h_3 = 3.713\) selected by minimizing the CV relative squared error) and its Shapiro–Botha fit \({\bar{\gamma }}(\cdot )\) were computed. Figure 5a shows the standard deviation estimate, where an increase in the variability of the ozone concentration is observed at the beginning of summer and in winter. The variogram estimates are shown in Fig. 5b. The final variogram has a nugget effect of \({\hat{c}}_0 = 0.307\) (which may be interpreted as the proportion of independent variability) and a practical range \({\hat{a}} \approx 32.7\) (a distance beyond which the temporal correlation can be considered negligible).

With these nonparametric estimates, the NPB approach was applied to make inferences about the trend of the functional process. Thus, the bias and standard error of the local linear trend estimator were approximated with \(B = 2000\) replicates. Figure 6 shows an example of the results obtained: the bias-corrected trend estimate (solid line) and pointwise confidence intervals (dotted lines), computed by adding and subtracting two standard errors to the corrected trend estimate. The NPB method also allows the construction of pointwise confidence intervals using the basic percentile method (see e.g. Davison and Hinkley 1997, Section 5.2), obtaining practically identical results. (The basic bootstrap replicates are shown in dotted grey lines; see the supplementary material for further details.)
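These computations can be sketched as follows, assuming the replicates are stored in an \(n \times p \times B\) array Ystar and reusing the loclin_trend() sketch of Sect. 2.1 with the bandwidth \(h\) selected above (names illustrative):

    # Bias-corrected trend and pointwise CIs from the NPB replicates.
    mu.star <- apply(Ystar, 3, function(Yb) loclin_trend(Yb, tt, h))  # p x B
    mu.corr <- mu.hat - (rowMeans(mu.star) - mu.hat)  # bias correction, cf. (5)
    se.boot <- apply(mu.star, 1, sd)                  # bootstrap se, cf. (4)
    ci <- cbind(lower = mu.corr - 2 * se.boot, upper = mu.corr + 2 * se.boot)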

Fig. 5 Sample variance and nonparametric variance estimates (a), dashed and solid lines, respectively, and semivariogram estimates (b) of the ozone data

Fig. 6 Bias-corrected (solid line) and uncorrected (dashed line) trend estimates, pointwise confidence intervals (dotted lines) and basic bootstrap replicates (grey dotted lines) obtained from the application of the NPB method to the ozone data

5 Conclusion

The performance of the proposed methodology was validated through a simulation study, showing its good behaviour under different scenarios, considering distinct theoretical trend and variance functions and including several degrees of temporal dependence. The results were compared with those obtained with the SB and NB approaches, showing that the new method seems to be better at reproducing the process variability. Specifically, the NPB method proved to be much better at approximating the bias of the estimators considered, since the SB and NB methods tend to produce bias approximations close to zero. However, unexpectedly, the standard error approximations obtained with the SB and NB methods turned out to be slightly better than those obtained with the NPB method when the sample size is small.

To improve performance in the case of small samples, a correction for the bias due to the direct use of the residuals in the estimation of the small-scale variability, similar to that proposed in Fernández-Casal et al. (2017) for the spatial case, could be investigated.

The NPB method proposed in this study is designed for nonstationary heteroscedastic processes. However, it can be easily adapted to cases where either the mean or the variance is assumed to be constant, such as when using residuals from a functional regression model. If any of these assumptions is reasonable, the procedure could be simplified, and even better results could be expected. On the other hand, the proposed functional model may not be appropriate in certain cases. For example, in the ozone data set, it might be reasonable to assume that there is a yearly effect in the functional mean or in the variance. In such cases, more sophisticated estimators, such as semiparametric ones, could be considered for these components; the bootstrap procedure, however, would remain analogous. If, instead, it is not appropriate to assume that the distribution of the standardized errors is homogeneous, it would be necessary to modify the resampling procedure. These aspects, including the presence of dependence between curves, could be the subject of future research.

The NPB technique was used here to approximate characteristics of estimators and to construct confidence intervals. Moreover, it can also be employed in other inference problems, including hypothesis testing (e.g. related to the trend or variance functions), estimation of the probability that a pollutant concentration level exceeds air quality guidelines, and outlier detection (e.g. due to pollution episodes or sensor failures), among many others.