1 Introduction

The State of the Climate report by Blunden and Arndt (2020), corresponding to 2019, includes a series of analyses of fully monitored variables on a global scale. One of these variables is surface temperature, revealing that July 2019 was Earth’s hottest month on record. In Europe, 2019 was the second hottest year (following 2018), and 2014–2019 are Europe’s warmest years on record. The report indicates that the warming of land and ocean surfaces is reflected across the globe, with lake and permafrost temperatures increasing, as evidence of climate change.

However, as pointed out by Bloodhart et al. (2015), it is difficult for individuals to determine whether they have experienced the effects of climate change, given that their information refers to a reduced period of time and is usually restricted to a local scale. Nevertheless, understanding how people experience changes in local weather patterns is important, given that personal experiences are known to affect climate change beliefs (Goebbert et al. 2012) and risk perceptions. Hence, they may condition citizens’ support for prevention policies (see Howe 2018; Goebbert et al. 2012; Taylor et al. 2014; Bloodhart et al. 2015, among others). As pointed out by Goebbert et al. (2012) on a global scale, personal perceptions of weather patterns are usually noticed through temperature changes, and individuals usually take this indicator as evidence for making inferences about climate change. However, these subjective insights should be confirmed by constructing an appropriate statistical regression model that allows assessing whether relevant changes in temperature patterns could have occurred in different periods of time.

As a motivating example, consider daily temperature records in Santiago de Compostela (NW-Spain) for the period 2002–2019. Temperature curves from February 15, 2002 until June 28, 2005 can be seen in Fig. 1 (left), where the color scale indicates the day of the year when each curve was observed (note that the scale-palette preserves the periodicity of the data). In Fig. 1 (right), a functional boxplot (Sun and Genton 2011) is presented, plotting the 50% central region for the observed temperature curves. The vertical segments are the whiskers of the boxplot. Modeling the relation between the temperature curve (as a functional covariate) and the day of the year (as a circular response) for a certain period would allow investigating whether temperature curve patterns are stable over time or whether, on the contrary, observed temperature curves are displaced, by comparing the day predicted by such a model for a given temperature curve, observed in a different period, with the actual day corresponding to that curve.

Fig. 1 Daily temperature curves in Santiago de Compostela (Spain) from February 15, 2002 to June 28, 2005 (left) and the corresponding functional boxplot (right)

The novel methodology presented in this paper exploits the use of high-frequency temperature records, considering a (flexible) regression model with a functional covariate (the whole daily temperature curve) and a circular response (a day on a year basis). Circular data can be viewed as points on the unit circle so that, after fixing a direction and a sense of rotation, they can be expressed as angles. For a complete introduction to circular data, we refer to Mardia and Jupp (2000) or Ley and Verdebout (2017). This type of data appears in different applied fields, and some examples include wind directions (Fisher 1995), wave directions (Jona-Lasinio et al. 2012; Wang et al. 2015), or animal orientations (Scapini et al. 2002), among others. Regression analysis considering models where the response and/or the covariates are circular variables has been addressed in different papers. Parametric approaches were considered in Fisher and Lee (1992) and Presnell et al. (1998) for regression models with circular response and Euclidean covariates. The authors assumed a parametric distribution model for the circular response variable. An alternative parametric regression model was also analyzed in Kim and SenGupta (2017), considering that a multivariate circular variable depends on several circular covariates. Using nonparametric methods, Di Marzio et al. (2013) introduced a regression estimator for models with a circular response, a single real-valued covariate and independent errors, and also for errors coming from mixing processes. A similar approach was also applied to time series in Di Marzio et al. (2012). Following similar ideas, Meilán-Vila et al. (2021) proposed and studied local polynomial-type regression estimators for a model with a circular response and several real-valued covariates for independent data. This class of estimators was also analyzed in Meilán-Vila et al. (2021) in the presence of spatial correlation.

A natural extension of the multiple regression model used in Meilán-Vila et al. (2021) would be to consider a functional covariate belonging to an infinite-dimensional space. Nowadays, with the advances in data collection methods, more and more data are being recorded over a time interval or at several discrete time points with high frequency, producing functional data. For example, in many fields of applied sciences such as chemometrics (Abraham et al. 2003), biometrics (Gasser et al. 1998) and medicine (Antoniadis and Sapatinas 2007), among others, it is quite common to deal with observations that are curves or functions. The branch of statistics that analyzes this type of data is called functional data analysis (see, for instance, Ramsay and Silverman 2005; Ferraty and Vieu 2006; Kokoszka and Reimherr 2017). The literature on functional regression modeling is extensive (see, for example, Greven and Scheipl 2017; Morris 2015; Aneiros et al. 2019, 2022; Goia and Vieu 2016, for complete reviews and recent advances in this field), including parametric (in particular, linear) and nonparametric models. Assuming parametric conditions, early advances on functional regression were introduced by Ramsay and Silverman (2005), while a recent overview was provided in Febrero-Bande et al. (2017). Nonparametric methodologies became popular in the functional regression context with the book by Ferraty and Vieu (2006), and this topic has been widely addressed in the last decade (see Ling and Vieu 2018, for a survey). In this framework, if the explanatory variables are functional, nonparametric regression approaches are essentially based on adaptations of the Nadaraya–Watson estimator (Ferraty and Vieu 2002, 2006; Ling et al. 2020) or the local linear regression estimator (Aneiros-Pérez et al. 2011; Berlinet et al. 2011; Boj et al. 2008; Baíllo and Grané 2009; Ferraty and Nagy 2022).

As pointed out before, the aim of the present work is to propose and study a nonparametric regression estimator for a model with a circular response and a functional covariate. When the response variable is circular, the regression function can be defined as the minimizer of a circular risk function. It can be proved that the minimizer of this risk function is the inverse tangent function of the ratio between the conditional expectations of the sine and the cosine of the response variable. The proposal introduced in this work implicitly considers two regression models, one for the sine and another for the cosine of the response variable. Then, a nonparametric estimator for the regression function is directly obtained by calculating the inverse tangent function of the ratio of appropriate Nadaraya–Watson estimators for the two regression functions of the sine and cosine models. This approach has been considered previously. For instance, it was employed in Meilán-Vila et al. (2021) and Meilán-Vila et al. (2021) for regression models with a circular response, Euclidean covariates and independent or spatially correlated errors, respectively. As in any nonparametric approach, a crucial issue is the selection of an appropriate bandwidth or smoothing parameter. In this paper, cross-validation bandwidth selection methods adapted to the current framework are introduced and analyzed in practice.

This paper is organized as follows. In Sect. 2, the regression model for a circular response and a functional covariate is presented. In Sect. 3, the nonparametric estimator of the circular regression function is proposed. Expressions for its asymptotic bias and variance, as well as its asymptotic distribution, are included in Sect. 4. The finite sample performance of the estimator is studied through simulations in Sect. 5, and illustrated in Sect. 6 with the real data example on temperature curves. Finally, a discussion is provided in Sect. 7.

2 Functional-circular regression model

Let \(\{({\mathcal {X}}_i,\Theta _i)\}_{i=1}^n\) be a random sample from the random vector (\({\mathcal {X}}, \Theta \)), where \(\Theta \) is a circular random variable taking values on \({\mathbb {T}}=[0,2\pi )\), and \({\mathcal {X}}\) is a functional variable supported on E, a separable Banach space endowed with a norm \(\Vert \cdot \Vert \). This general framework includes \(L^p\), Sobolev and Besov spaces. The separability condition avoids measurability problems for the random variable \(\mathcal {X}\).

Assume that the circular random variable \(\Theta \) depends on the functional random variable \({\mathcal {X}}\) through the following regression model:

$$\begin{aligned} \Theta _i=[m(\mathcal {X}_i)+\varepsilon _i]({\texttt {mod}} \, 2\pi ), \quad i=1,\ldots ,n, \end{aligned}$$
(1)

where m is a smooth trend or regression function mapping E onto \({\mathbb {T}}\), mod stands for the modulo operation and \(\varepsilon _i, i=1,\ldots ,n\), is an independent sample of a circular random variable \(\varepsilon \), satisfying \({\textrm{E}}[\sin (\varepsilon )\mid \mathcal {X}=\chi ]=0\) and having finite concentration. In this setting, the circular regression function m in model (1) can be defined as the minimizer of the risk function \(\textrm{E}\{1-\cos [\Theta -m(\mathcal {X})]\mid \mathcal {X}=\chi \}\). The minimizer of this cosine risk is given by:

$$\begin{aligned} m(\chi )=\text{ atan2 }[m_1(\chi ),m_2(\chi )], \end{aligned}$$
(2)

where \(m_1(\chi )=\textrm{E}[\sin (\Theta )\mid \mathcal {X}=\chi ]\), \(m_2(\chi )=\textrm{E}[\cos (\Theta )\mid \mathcal {X}=\chi ]\) and the function \(\text{ atan2 }(y,x)\) returns the angle between the x-axis and the vector from the origin to the point (x, y). A short verification that (2) indeed minimizes the cosine risk is sketched right after model (4) below. With this formulation, \(m_1\) and \(m_2\) can be considered as the regression functions of two regression models with functional covariates, having \(\sin (\Theta )\) and \(\cos (\Theta )\) as their respective response variables. Specifically, we assume the models

$$\begin{aligned} \sin (\Theta _i)=m_1(\mathcal {X}_i)+\xi _i,\quad i=1,\ldots ,n, \end{aligned}$$
(3)

and

$$\begin{aligned} \cos (\Theta _i)=m_2(\mathcal {X}_i)+\zeta _i,\quad i=1,\ldots ,n, \end{aligned}$$
(4)

where \(m_1\) and \(m_2\) are regression functions mapping E onto \([-1,1]\). The \(\xi _i\) and the \(\zeta _i\) are independent error terms, absolutely bounded by 1, satisfying \(\textrm{E}(\xi \mid \mathcal {X}=\chi )=\textrm{E}(\zeta \mid \mathcal {X}=\chi )=0\). Additionally, for every \(\chi \in E\), set \(s_1^2(\chi )=\textrm{Var}(\xi \mid \mathcal {X}=\chi )\), \(s_2^2(\chi )=\textrm{Var}(\zeta \mid \mathcal {X}=\chi )\) and \(c(\chi )=\textrm{E}(\xi \zeta \mid \mathcal {X}=\chi )\). The assumption that models (3) and (4) hold simultaneously with (1) leads to certain relations between the variances and covariances of the errors in these models, as described below.
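For completeness, the fact that (2) minimizes the cosine risk can be checked directly with the cosine addition formula (a one-line verification, stated here only for the reader’s convenience): for any fixed angle \(a\in {\mathbb {T}}\),

$$\begin{aligned} \textrm{E}\{1-\cos (\Theta -a)\mid \mathcal {X}=\chi \}&=1-m_2(\chi )\cos (a)-m_1(\chi )\sin (a)\\&=1-[m_1^2(\chi )+m_2^2(\chi )]^{1/2}\cos [a-\text{ atan2 }(m_1(\chi ),m_2(\chi ))], \end{aligned}$$

which is minimized precisely at \(a=\text{ atan2 }[m_1(\chi ),m_2(\chi )]\).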

Set \(\ell (\chi ) =\textrm{E}[\cos (\varepsilon )\mid \mathcal {X}=\chi ]\), \(\sigma ^2_1(\chi )=\textrm{Var}[\sin (\varepsilon )\mid \mathcal {X}=\chi ]\), \(\sigma ^2_2(\chi )=\textrm{Var}[\cos (\varepsilon )\mid \mathcal {X}=\chi ]\) and \(\sigma _{12}(\chi )=\textrm{E}[\sin (\varepsilon )\cos (\varepsilon )\mid \mathcal {X}=\chi ]\). Then, using the sine and cosine addition formulas in model (1), it follows that, for \(i=1, \ldots , n\),

$$\begin{aligned} \sin (\Theta _i)=\sin [m(\mathcal {X}_i)] \cos (\varepsilon _i)+\cos [m(\mathcal {X}_i)] \sin (\varepsilon _i) \end{aligned}$$
(5)

and

$$\begin{aligned} \cos (\Theta _i)=\cos [m(\mathcal {X}_i)] \cos (\varepsilon _i)-\sin [m(\mathcal {X}_i)] \sin (\varepsilon _i). \end{aligned}$$
(6)

Hence, defining \(f_1(\chi )=\sin [m(\chi )]\) and \(f_2(\chi )=\cos [m(\chi )]\) and applying conditional expectations in (5) and (6), it holds that

$$\begin{aligned} m_1(\chi )=f_1(\chi )\ell (\chi )\quad \text{ and }\quad m_2(\chi )=f_2(\chi )\ell (\chi ). \end{aligned}$$
(7)

Note that \(f_1(\chi )\) and \(f_2(\chi )\) correspond to the normalized versions of \(m_1(\chi )\) and \(m_2(\chi )\), respectively. Indeed, taking into account that \(f_1^2(\chi )+f_2^2(\chi )=1\), it can be easily deduced that \(\ell (\chi )=[m^2_1(\chi )+m_2^2(\chi )]^{1/2}\). Hence, under model (1), \(\ell (\chi )\) amounts to the mean resultant length of \(\Theta \) given \(\mathcal {X}=\chi \) and, since it is assumed that \(\textrm{E}[\sin (\varepsilon )\mid \mathcal {X}=\chi ]=0\), it also corresponds to the mean resultant length of \(\varepsilon \) given \(\mathcal {X}=\chi \). The following explicit expressions for the conditional variances of the error terms involved in models (3) and (4), in terms of the conditional variances and covariance of the Cartesian coordinates of \(\varepsilon \), can be obtained:

$$\begin{aligned} s_1^2(\chi )= & {} f_1^2(\chi )\sigma ^2_2(\chi )+2f_1(\chi )f_2(\chi )\sigma _{12}(\chi )+f_2^2(\chi )\sigma ^2_1(\chi ), \end{aligned}$$
(8)
$$\begin{aligned} s_2^2(\chi )= & {} f_2^2(\chi )\sigma ^2_2(\chi )-2f_2(\chi )f_1(\chi )\sigma _{12}(\chi )+f_1^2(\chi )\sigma ^2_1(\chi ), \end{aligned}$$
(9)

as well as for the covariance between the error terms in (3) and (4),

$$\begin{aligned} c(\chi )&= f_1(\chi )f_2(\chi )\sigma ^2_2(\chi )-f_1^2(\chi )\sigma _{12}(\chi )\nonumber \\&\quad + f_2^2(\chi )\sigma _{12}(\chi )-f_1(\chi )f_2(\chi )\sigma ^2_1(\chi ). \end{aligned}$$
(10)

A nonparametric Nadaraya–Watson-type estimator of the circular regression function m in model (1) is presented and studied in what follows.

3 Nonparametric regression estimator

A Nadaraya–Watson-type estimator for \(m(\chi )\) in model (1) can be defined by replacing \(m_1(\chi )\) and \(m_2(\chi )\) in (2) by suitable Nadaraya–Watson estimators. Specifically, the following estimator:

$$\begin{aligned} {\hat{m}}_{h}(\chi )=\textrm{atan2}[{\hat{m}}_{1, h}(\chi ),{\hat{m}}_{2, h}(\chi )] \end{aligned}$$
(11)

is considered, where \({{\hat{m}}}_{1, h}(\chi )\) and \({{\hat{m}}}_{2, h}(\chi )\) denote the Nadaraya–Watson estimators of \( m_1(\chi )\) and \(m_2(\chi )\), respectively. The asymptotic properties (bias, variance and asymptotic normality) of estimator (11) are derived in the following section. For this purpose, assuming that models (3) and (4) hold, the asymptotic properties of \({{\hat{m}}}_{1, h}(\chi )\) and \({{\hat{m}}}_{2, h}(\chi )\) are used.

Considering models (3) and (4), Nadaraya–Watson estimators for \(m_j(\chi )\), \(j=1,2\), are respectively defined as:

$$\begin{aligned} {{\hat{m}}}_{j, h}(\chi )= & {} \left\{ \begin{array}{lc}\dfrac{\sum _{i=1}^n K(h^{-1}\Vert \mathcal {X}_i-\chi \Vert )\sin (\Theta _i)}{\sum _{i=1}^n K(h^{-1}\Vert \mathcal {X}_i-\chi \Vert )}&{} \text{ if }\ j=1,\\ \dfrac{\sum _{i=1}^n K(h^{-1}\Vert \mathcal {X}_i-\chi \Vert )\cos (\Theta _i)}{\sum _{i=1}^n K(h^{-1}\Vert \mathcal {X}_i-\chi \Vert )}&{}\text{ if }\ j=2,\end{array}\right. \end{aligned}$$
(12)

where K is a symmetric kernel and \(h=h(n)\) is a strictly positive real-valued bandwidth controlling the smoothness of the estimator. Note that, in this functional framework, the support of K should be contained in \({\mathbb {R}}^+\), since \( {\Vert \chi -\chi '\Vert }\ge 0\), for all \(\chi ,\chi '\in E\). If the kernel K is positive with support on [0, 1], the estimators given in (12) only consider the observations \(\sin (\Theta _i)\) and \(\cos (\Theta _i)\), respectively, associated with the curves \(\mathcal {X}_i\) such that \(\Vert \mathcal {X}_i-\chi \Vert \le {h}\), since \(K(h^{-1}\Vert \mathcal {X}_i-\chi \Vert )=0\) when the distance between \(\chi \) and \(\mathcal {X}_i\) is larger than h. The regression estimators given in (12) are the adaptations to the functional context of the classical finite-dimensional Nadaraya–Watson estimators for the sine and cosine regression models (Ferraty et al. 2007).
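To fix ideas, a minimal sketch of (11)–(12) in R (the environment used for the numerical analyses in this paper) is given below. It assumes that all curves are observed on a common equally spaced grid on [0, 1], so that the \(L^2\) norm can be approximated by a Riemann sum; the function name nw_circ and its arguments are ours and do not correspond to any existing package.

```r
## Minimal sketch of estimator (11) with the quadratic kernel K(u) = 1 - u^2
## on [0, 1) (the kernel used in Sect. 5). Curves are assumed to be stored as
## the rows of X, discretized on a common equally spaced grid on [0, 1].
nw_circ <- function(X, theta, chi, h) {
  # X: n x p matrix of discretized curves; theta: responses in [0, 2*pi)
  # chi: curve of length p at which the regression function is estimated
  delta <- 1 / (ncol(X) - 1)                       # grid spacing on [0, 1]
  d <- sqrt(rowSums(sweep(X, 2, chi)^2) * delta)   # approximate L2 distances
  w <- pmax(1 - (d / h)^2, 0)                      # kernel weights K(d / h)
  if (sum(w) == 0) return(NA)                      # no curve within distance h
  m1 <- sum(w * sin(theta)) / sum(w)               # Nadaraya-Watson estimate of m_1
  m2 <- sum(w * cos(theta)) / sum(w)               # Nadaraya-Watson estimate of m_2
  atan2(m1, m2) %% (2 * pi)                        # estimator (11)
}
```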

Although the choice of the kernel function is of secondary importance, the bandwidth parameter plays an important role in the performance of the Nadaraya–Watson estimators (12) and, consequently, of the regression estimator (11). When h is excessively large, the number of observations involved in the estimation method will also be too large, and an oversmoothed estimator is obtained. Conversely, if the bandwidth parameter is too small, few observations will be considered in the estimation procedure, and an undersmoothed estimator is computed. Therefore, as with any other kernel-based estimator, data-driven bandwidth selection methods are needed in practice. A cross-validation approach is used to select the bandwidth parameter h for (11) in the simulation study and in the real data application. Note that the same bandwidth is used for computing \({{\hat{m}}}_{1, h}\) and \({{\hat{m}}}_{2, h}\) in (12). If two different bandwidths were considered for the sine and cosine components, possibly different curves would be chosen for computing the regression estimators of the sine and cosine models, which seems quite unnatural, since the Cartesian coordinates are directly related to the angle itself.

4 Asymptotic results

In this section, the asymptotic bias and variance expressions for \({\hat{m}}_{h}(\chi )\), given in (11), are derived. Moreover, the asymptotic distribution of the proposed estimator is calculated. The proofs of all these theoretical results are collected in Appendix A.

4.1 Asymptotic bias and variance

To derive the asymptotic bias and variance of the estimator given in (11), the asymptotic properties of the Nadaraya–Watson estimators of \(m_j(\chi )\), \(j=1,2,\) given in (12), are required. Using the asymptotic results given in Ferraty et al. (2007), the asymptotic bias and variance of \({{\hat{m}}}_{j, h}(\chi )\), \(j=1,2\), are immediately obtained. These expressions, jointly with the covariance between \({{\hat{m}}}_{1, h}(\chi )\) and \({{\hat{m}}}_{2, h}(\chi )\), are collected in Lemma 1.

Let \(\varphi _{\chi }\) and \(\varphi _{j,\chi }\), \(j=1,2\), be the functions defined for all \(s\in {\mathbb {R}}\) by:

$$\begin{aligned} \varphi _{\chi }(s)= & {} {\mathbb {E}}\left\{ [m(\mathcal {X})-m(\chi )]\mid \Vert \mathcal {X}-\chi \Vert =s\right\} , \end{aligned}$$
(13)
$$\begin{aligned} \varphi _{j,\chi }(s)= & {} {\mathbb {E}}\left\{ [m_j(\mathcal {X})-m_j(\chi )]\mid \Vert \mathcal {X}-\chi \Vert =s\right\} , \end{aligned}$$
(14)

and denote by \(F_{\chi }\) the cumulative distribution function of the random variable \(\Vert \mathcal {X}-\chi \Vert \),

$$\begin{aligned} F_{\chi }(t)={\mathbb {P}}(\Vert \mathcal {X}-\chi \Vert \le t), \quad t\in {\mathbb {R}}. \end{aligned}$$

In addition, denoting, \(\forall s\in [0,1],\)

$$\begin{aligned} \tau _{\chi ,h}(s)=\dfrac{F_{\chi }(hs)}{F_{\chi }(h)}={\mathbb {P}}(\Vert \mathcal {X}-\chi \Vert \le hs\mid \Vert \mathcal {X}-\chi \Vert \le h), \end{aligned}$$

the following assumptions are required.

  1. (C1)

    The regression functions \(m_j\) and \(s_j^2\), for \(j=1,2\), are continuous in a neighborhood of \(\chi \in E\), and \(F_{\chi }(0)=0\).

  2. (C2)

    \(\varphi '_{j,\chi }(0)\) , for \(j=1,2,\) exist.

  3. (C3)

    The bandwidth h satisfies \(h\rightarrow 0\) and \(nF_{\chi }(h)\rightarrow \infty \), as \(n\rightarrow \infty \).

  4. (C4)

    The kernel K is supported on [0, 1] and has a continuous derivative on [0, 1). Moreover, \(K'(s)\le 0\) and \(K(1)>0\).

  5. (C5)

    For all \(s\in [0,1]\), \(\tau _{\chi ,h}(s)\rightarrow \tau _{\chi ,0}(s)\), when \(h\rightarrow 0\).

Assumptions (C1), (C3)–(C4) are similar to those classically employed in the finite-dimensional context. Regarding (C2), as pointed out by Ferraty et al. (2007), this condition avoids the difficulties that could appear when considering differential calculus on Banach spaces. Thus, regularity hypotheses on the regression function m, such as being twice continuously differentiable, are not required. In assumption (C5), the function \(\tau _{\chi ,0}\) appears in the constants involved in the main terms of the asymptotic expansions of the bias and the variance of the regression estimator (11). The closed-form expression of this function in some particular cases is provided in Ferraty et al. (2007, Proposition 1).

Lemma 1

Let \(\{(\mathcal {X}_i,\Theta _i)\}_{i=1}^n\) be a random sample from (\({\mathcal {X}}, \Theta \)) supported on \(E\times {\mathbb {T}}\). Under assumptions (C1)–(C5), for \(\chi \in E\) and \(j=1,2\),

$$\begin{aligned} \textrm{E}[{{\hat{m}}}_{j, h}(\chi )]-m_j(\chi )= & {} \varphi _{j,\chi }'(0)\dfrac{M_{\chi ,0}}{M_{\chi ,1}}h+o(h),\\ \textrm{Var}[{{\hat{m}}}_{j, h}(\chi )]= & {} \dfrac{s_j^2(\chi )}{nF_{\chi }(h)}\dfrac{M_{\chi ,2}}{M_{\chi ,1}^2}+o\left( \dfrac{1}{nF_{\chi }(h)}\right) ,\\ \textrm{Cov}[{{\hat{m}}}_{1, h}(\chi ),{{\hat{m}}}_{2, h}(\chi )]= & {} \dfrac{c(\chi )}{nF_{\chi }(h)}\dfrac{M_{\chi ,2}}{M_{\chi ,1}^2}+o\left( \dfrac{1}{nF_{\chi }(h)}\right) , \end{aligned}$$

where

$$\begin{aligned} M_{\chi ,0}= & {} K(1)-\int _0^1[sK(s)]'\tau _{\chi ,0}(s)ds,\\ M_{\chi ,1}= & {} K(1)-\int _0^1K'(s)\tau _{\chi ,0}(s)ds \end{aligned}$$

and

$$\begin{aligned} M_{\chi ,2}=K^2(1)-\int _0^1(K^2)'(s)\tau _{\chi ,0}(s)ds. \end{aligned}$$

Remark 1

In the case of regression models with Euclidean response and multiple covariates, the main term of the asymptotic bias of the Nadaraya–Watson estimator depends on the gradient and the Hessian matrix of the regression functions (see Härdle and Müller 2012, for further details). Nevertheless, in the present functional setting, the leading term of the asymptotic bias of \({{\hat{m}}}_{j, h}\) depends on the first derivative of the functions \(\varphi _{j,\chi }\) evaluated at zero.

Now, using the previous lemma, the following theorem provides the asymptotic bias and variance of the estimator \({{\hat{m}}}_{h}(\chi )\).

Theorem 1

Let \(\{(\mathcal {X}_i,\Theta _i)\}_{i=1}^n\) be a random sample from (\({\mathcal {X}}, \Theta \)) supported on \(E\times {\mathbb {T}}\). Under assumptions (C1)–(C5), the asymptotic bias of estimator \({\hat{m}}_{h}(\chi )\), for \(\chi \in E\), is given by:

$$\begin{aligned} \textrm{E}[{\hat{m}}_{h}(\chi )]-m(\chi )=\varphi _{\chi }'(0)\dfrac{M_{\chi ,0}}{M_{\chi ,1}}h+o(h), \end{aligned}$$
(15)

and the asymptotic variance is:

$$\begin{aligned} \textrm{Var}[{\hat{m}}_{h}(\chi )]=\dfrac{\sigma ^2_1(\chi )}{\ell ^2(\chi )}\dfrac{M_{\chi ,2}}{M_{\chi ,1}^2}\dfrac{1}{nF_{\chi }(h)}+o\left( \dfrac{1}{nF_{\chi }(h)}\right) . \end{aligned}$$
(16)

The asymptotic expressions of the bias and variance obtained in Theorem 1 can be used to derive the asymptotic mean squared error (AMSE) of \({\hat{m}}_{h}(\chi )\). An asymptotically optimal local bandwidth for \({\hat{m}}_{h}(\chi )\) can be selected by minimizing this error criterion with respect to h. A plug-in bandwidth selector could be defined by replacing the unknown quantities in the asymptotically optimal parameter by appropriate estimates. This problem is more complex in this functional context than in the finite-dimensional setting (Ferraty et al. 2007). Moreover, the design of global plug-in selectors would require obtaining integrated versions of the AMSE. Making this issue rigorous poses challenges that continue to be topics of current research. Therefore, in this paper, we avoid using plug-in bandwidth selectors and, as pointed out before, we employ a suitably adapted cross-validation technique to select the bandwidth for \({\hat{m}}_{h}\) in the numerical studies (simulations and real data application).
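Explicitly, adding the squared leading bias term and the leading variance term of Theorem 1 gives the pointwise approximation

$$\begin{aligned} \textrm{AMSE}[{\hat{m}}_{h}(\chi )]=\left[ \varphi _{\chi }'(0)\dfrac{M_{\chi ,0}}{M_{\chi ,1}}\right] ^2h^2+\dfrac{\sigma ^2_1(\chi )}{\ell ^2(\chi )}\dfrac{M_{\chi ,2}}{M_{\chi ,1}^2}\dfrac{1}{nF_{\chi }(h)}, \end{aligned}$$

whose minimizer in h involves \(\varphi _{\chi }'(0)\), \(\sigma ^2_1(\chi )\), \(\ell (\chi )\) and, through \(F_{\chi }\), the small ball probabilities of the functional covariate, which illustrates why plug-in selection is delicate in this framework.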

4.2 Asymptotic normality

Next, the asymptotic distribution of the proposed nonparametric regression estimator (11) is derived. Apart from the assumptions stated in Sect. 4.1 for computing its asymptotic bias and variance, the following condition is needed:

  1. (C6)

    \(\varphi '_{\chi }(0)\ne 0\) and \(M_{\chi ,0}\) in Lemma 1 is strictly positive.

Condition (C6) is required to ensure that the leading term of the bias does not vanish. The first part of this assumption is similar to the one stated in the finite-dimensional setting. The second part of (C6) is specific to the infinite-dimensional framework and is satisfied in some standard situations. For example, if \(\tau _{\chi ,0}(s)\ne {\mathbb {I}}_{(0,1]}(s)\) (where \({\mathbb {I}}_{(0,1]}\) denotes the indicator function on the set (0, 1]) and \(\tau _{\chi ,0}\) is continuously differentiable on (0, 1), or if \(\tau _{\chi ,0}(s)=\delta _1(s)\) (where \(\delta _1\) stands for the Dirac mass at 1), then \(M_{\chi ,0}>0\) for any kernel K satisfying (C4). For further details on condition (C6), we refer to Ferraty et al. (2007).

The following theorem provides the asymptotic distribution of the nonparametric regression estimator proposed in (11).

Theorem 2

Under assumptions (C1)–(C6), it can be proved that, as \(n\rightarrow \infty \),

$$\begin{aligned} \sqrt{n{F}_{\chi }(h)}\dfrac{\ell (\chi )M_{\chi ,1}}{\sqrt{\sigma ^2_1(\chi )M_{\chi ,2}}} [{\hat{m}}_{h}(\chi )-m(\chi )-\varvec{B}_h]\xrightarrow []{\mathcal {L}}N(0,1), \end{aligned}$$

where \(\xrightarrow []{\mathcal {L}}\) denotes convergence in distribution, with

$$\begin{aligned} \varvec{B}_h= \varphi _{\chi }'(0)\dfrac{M_{\chi ,0}}{M_{\chi ,1}}h. \end{aligned}$$

A simpler version of the previous theorem can be derived just by considering the following additional assumption:

  1. (C7)

    \(\lim _{n\rightarrow \infty }h\sqrt{n{F}_{\chi }(h)}=0\).

Condition (C7) allows the cancellation of the bias term \(\varvec{B}_h\) in Theorem 2. Under this assumption (and also if (C1)–(C6) hold), the pointwise asymptotic Gaussian distribution of the functional nonparametric regression estimator is given in Corollary 1.

Corollary 1

Under assumptions (C1)–(C7), it can be proved that, as \(n\rightarrow \infty \),

$$\begin{aligned} \sqrt{n{F}_{\chi }(h)}\dfrac{\ell (\chi )M_{\chi ,1}}{\sqrt{\sigma ^2_1(\chi )M_{\chi ,2}}}[{\hat{m}}_{h}(\chi )-m(\chi )]\xrightarrow []{\mathcal {L}} N(0,1). \end{aligned}$$

Notice that, in order to use the asymptotic distribution of \({\hat{m}}_{h}\) in practice, for inferential purposes, it is necessary to estimate the functions \(\ell \) and \(\sigma _1^2\). Moreover, both constants \(M_{\chi ,1}\) and \(M_{\chi ,2}\) have to be computed. Assuming a uniform kernel function, \(K(u)={\mathbb {I}}_{[0,1]}(u)\), it is immediate that \(M_{\chi ,1} = M_{\chi ,2} = 1\).

Corollary 2

Under assumptions (C1)–(C7), considering \(K(u)={\mathbb {I}}_{[0,1]}(u)\) and if \({\hat{\sigma }}_1^2\) and \({\hat{\ell }}\) are consistent estimators of \({\sigma }_1^2\) and \(\ell \), respectively, it can be proved that

$$\begin{aligned} \sqrt{n{\hat{F}}_{\chi }(h)}\dfrac{ {{\hat{\ell }}(\chi )}}{\sqrt{{\hat{\sigma }}^2_1(\chi )}}[{\hat{m}}_{h}(\chi )-m(\chi )]\xrightarrow []{\mathcal {L}} N(0,1) \text { as } n\rightarrow \infty , \end{aligned}$$

where

$$\begin{aligned} {\hat{F}}_{\chi }(h)=\dfrac{1}{n}\sum _{i=1}^n {{\mathbb {I}}_{\{{\Vert \mathcal {X}_i-\chi \Vert \le h}\}},} \end{aligned}$$

where \({\mathbb {I}}_{\{A \}}=1\) if A is true and 0 otherwise.

Remark 2

An immediate consequence of Corollary 2 is the possibility of constructing (asymptotic) confidence intervals for the circular regression function. Fixing a confidence level \(1-\alpha \), with \(\alpha \in (0,1)\), it can be easily obtained that an asymptotic confidence interval for m at \(\chi \in E\) is:

$$\begin{aligned} \left( {\hat{m}}_{h}(\chi )\mp z_{\alpha /2}\frac{1}{\sqrt{n{\hat{F}}_{\chi }(h)}}\frac{\sqrt{{\hat{\sigma }}^2_1(\chi )}}{ {{\hat{\ell }}(\chi )}}\right) , \end{aligned}$$

where \(z_{\alpha /2}\) denotes the \((1-\alpha /2)\)-quantile of the standard normal distribution and \(\hat{\sigma }_1^2(\chi )\) and \(\hat{\ell }(\chi )\) are (consistent) estimators of \(\sigma _1^2(\chi )\) and \(\ell (\chi )\), respectively.

Among the possible (consistent) conditional variance estimators for \(\sigma _1^2\), we suggest using the following method, which adapts to the functional-circular framework the approach studied in Vilar-Fernández and Francisco-Fernández (2006). In the present context, taking into account that

$$\begin{aligned} \sigma ^2_1(\chi )=\textrm{Var}[\sin (\varepsilon )\mid \mathcal {X}=\chi ] ={\mathbb {E}}[\sin ^2(\varepsilon )\mid \mathcal {X}=\chi ]-\{{\mathbb {E}}[\sin (\varepsilon )\mid \mathcal {X}=\chi ]\}^2, \end{aligned}$$

and under the assumption that \({\mathbb {E}}[\sin (\varepsilon )\mid \mathcal {X}=\chi ]=0\), only \({\mathbb {E}}[\sin ^2(\varepsilon )\mid \mathcal {X}=\chi ]\) has to be estimated. To do so, in a first step, considering model (1), the residuals from a nonparametric fit (using, for instance, the Nadaraya–Watson estimator with a pilot bandwidth) are obtained. Then, in a second step, the estimator of the conditional variance function \(\sigma _1^2\) is defined as the nonparametric regression estimator (employing again the Nadaraya–Watson estimator, with another bandwidth) that uses the squared sine of the residuals as the response variable.

Regarding the function \(\ell \), taking into account that \(\ell (\chi )=[m^2_1(\chi )+m_2^2(\chi )]^{1/2}\), a consistent estimator for this function is straightforwardly obtained by replacing \(m_j\) in this equation by the Nadaraya–Watson estimators \({\hat{m}}_{j,h}\), \(j=1,2\), defined in (12).
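A hedged sketch of this inferential pipeline in R is given below. It assumes the hypothetical nw_circ function from the sketch in Sect. 3, curves stored in the rows of a matrix X on a common grid, the uniform-kernel simplification of Corollary 2, and two user-chosen pilot bandwidths h0 (for the residuals) and h1 (for smoothing the squared sine of the residuals); it is only an illustration of the formulas above, not the implementation used in the paper.

```r
## Sketch of the (1 - alpha) confidence interval of Remark 2 at a curve chi.
circ_ci <- function(X, theta, chi, h, h0, h1, alpha = 0.05) {
  delta <- 1 / (ncol(X) - 1)
  d <- sqrt(rowSums(sweep(X, 2, chi)^2) * delta)      # approximate L2 distances
  # Step 1: residuals from a pilot Nadaraya-Watson fit of model (1)
  fit0 <- sapply(seq_len(nrow(X)), function(i) nw_circ(X, theta, X[i, ], h0))
  eps  <- (theta - fit0) %% (2 * pi)
  # Step 2: estimate sigma_1^2(chi) by smoothing sin^2 of the residuals
  w1 <- as.numeric(d <= h1)                           # uniform kernel weights
  sigma1_sq <- sum(w1 * sin(eps)^2) / sum(w1)
  # Estimate ell(chi) through the Nadaraya-Watson fits of the sine/cosine models
  w  <- as.numeric(d <= h)
  m1 <- sum(w * sin(theta)) / sum(w)
  m2 <- sum(w * cos(theta)) / sum(w)
  ell <- sqrt(m1^2 + m2^2)
  # Empirical small ball probability and interval of Remark 2 (uniform kernel)
  F_hat <- mean(d <= h)
  est   <- atan2(m1, m2) %% (2 * pi)                  # point estimate (11)
  half  <- qnorm(1 - alpha / 2) * sqrt(sigma1_sq) / (ell * sqrt(nrow(X) * F_hat))
  c(lower = (est - half) %% (2 * pi), estimate = est, upper = (est + half) %% (2 * pi))
}
```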

5 Simulation study

In this section, the regression estimator proposed in (11) is analyzed numerically by simulation, considering model (1) under different scenarios. For each scenario, 500 samples of size n (\(n=50, 100, 200\) and 400) are generated using the following regression functions:

$$\begin{aligned} \text {r1: } m(\mathcal {X})= & {} \textrm{atan2}\left( 0.5+\textstyle \int _0^1\mathcal {X}(t)dt,\textstyle \int _0^1\mathcal {X}(t)dt\right) ,\\ \text {r2: } m(\mathcal {X})= & {} \textrm{acos}\left( -0.3\textstyle \int _0^1\mathcal {X}(t)dt\right) +\dfrac{3}{2}\,\textrm{acos}\left( 0.4\textstyle \int _0^1\mathcal {X}(t)dt\right) , \end{aligned}$$

where the random curves are simulated from

$$\begin{aligned} \mathcal {X}(t)=30U(1-t)t^{1+U}, \quad t\in [0,1], \end{aligned}$$
(17)

where U is a uniformly distributed random variable on [0, 1], and the circular errors \(\varepsilon \) in (1) are drawn from a von Mises distribution \(vM(0,\kappa )\), with different concentrations (\(\kappa =5, 10\) and 15).
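As a hedged illustration, the data-generating mechanism of scenario r1 can be reproduced with a few lines of R code; the grid size (100 points), the seed and the use of circular::rvonmises for the von Mises errors are our own choices for this sketch.

```r
## Sketch of one simulated sample for regression function r1 (n = 100, kappa = 5).
library(circular)  # provides rvonmises()
set.seed(1)
n <- 100; kappa <- 5
tt <- seq(0, 1, length.out = 100)                              # grid on [0, 1]
U  <- runif(n)
X  <- t(sapply(U, function(u) 30 * u * (1 - tt) * tt^(1 + u))) # curves (17)
intX  <- rowMeans(X)                             # approximates int_0^1 X(t) dt
m     <- atan2(0.5 + intX, intX)                 # regression function r1
eps   <- as.numeric(rvonmises(n, mu = circular(0), kappa = kappa))
theta <- (m + eps) %% (2 * pi)                   # circular responses, model (1)
```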

As an illustration, Fig. 2 shows two realizations of simulated data of size \(n=100\) (model with regression function r1 in the top row and model with regression function r2 in the bottom row). Left plots display the circular regression functions evaluated at the curves generated from (17). Central panels present the random errors drawn from a von Mises distribution with zero mean direction and concentration \(\kappa =5\) (for the model with regression function r1) and \(\kappa =15\) (for the model with regression function r2). It can be seen that the errors in the top row, corresponding to \(\kappa =5\), present more variability than the ones generated with \(\kappa =15\). Right panels show the values of the response variables, obtained by adding the mean values in the left panels and the circular errors in the central panels.

Fig. 2 Illustration of model generation (model with regression function r1: top row; model with regression function r2: bottom row). Left: regression function evaluated at a random sample of size \(n=100\) of curves simulated from (17). Center: independent errors from a von Mises distribution with zero mean and concentration \(\kappa =5\) (for the model with regression function r1) and \(\kappa =15\) (for the model with regression function r2). Right: random responses obtained by adding the two previous plots

For each simulated sample, the Nadaraya–Watson-type estimator (11) is computed using the kernel \(K(u)=1-u^2\), \(u \in (0,1)\), and taking the \(L^2\) metric to calculate the distance between curves. Regarding the bandwidth h, it is selected by cross-validation, choosing the bandwidth parameter \({h}_{\textrm{CV}}\) that minimizes the function:

$$\begin{aligned} \textrm{CV}(h)=\sum _{i=1}^n \left\{ 1-\cos \left[ \Theta _i-{\hat{m}}_{h}^{(i)}(\mathcal {X}_i)\right] \right\} , \end{aligned}$$
(18)

where \({\hat{m}}_{h}^{(i)}\) denotes the Nadaraya–Watson-type estimator computed using all observations except for \((\mathcal {X}_i,\Theta _i)\).
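A rough sketch of this leave-one-out selection, again assuming the hypothetical nw_circ function from the sketch in Sect. 3 and a user-supplied grid of candidate bandwidths, could look as follows.

```r
## Leave-one-out cross-validation criterion (18) over a candidate grid h_grid.
cv_bandwidth <- function(X, theta, h_grid) {
  cv <- sapply(h_grid, function(h) {
    loo <- sapply(seq_len(nrow(X)), function(i)
      nw_circ(X[-i, , drop = FALSE], theta[-i], X[i, ], h))
    sum(1 - cos(theta - loo), na.rm = TRUE)   # circular loss in (18)
  })
  h_grid[which.min(cv)]                       # h_CV
}
```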

To evaluate the performance of the Nadaraya–Watson-type estimator \({\hat{m}}_h\), the circular average squared error (CASE), defined by Kim and SenGupta (2017):

$$\begin{aligned} \textrm{CASE}[{\hat{m}}_{h}(\chi )]=\frac{1}{n}\sum _{i=1}^n \left\{ 1- \cos \left[ m(\mathcal {X}_i) - {\hat{m}}_{h}(\mathcal {X}_i)\right] \right\} , \end{aligned}$$
(19)

is computed. Table 1 shows the average (over 500 replicates) of the CASE when h is selected by cross-validation. For comparative purposes, in each scenario, the average of the minimum value of the CASE, computed using the bandwidth \(h_{\textrm{CASE}}\) that minimizes (19), is also included in Table 1. Note that the bandwidth \(h_{\textrm{CASE}}\) cannot be computed in a practical situation, because the regression function is unknown. The corresponding CASE errors are included in this table as a benchmark. In the different scenarios, it can be seen that the average errors decrease when the sample size increases. In addition, as expected, results are better when the error concentration gets larger.

Table 1 Average CASE given in (19), over 500 replicates, using a cross-validation bandwidth, \({h}_{\textrm{CV}}\), and an optimal bandwidth, \({h}_{\textrm{CASE}}\), as a benchmark

The good performance of the nonparametric circular regression estimator (11) is also observed in Fig. 3. In this plot, the same scenarios as those considered to obtain Fig. 2 are used. On the left, the theoretical regression functions r1 and r2 are shown and, on the right, the corresponding estimates using a random sample generated from these models are presented. The representations in the top row correspond to the data simulated using the model with regression function r1 and those in the bottom row to the model with regression function r2.

Fig. 3 Theoretical regression function (left), jointly with the corresponding estimates (right), using the specific scenarios considered in Fig. 2, for the model with regression function r1 (top row) and r2 (bottom row)

To complete the study, we repeated the simulations using a k-nearest neighbor (kNN) version of our estimator (see Burba et al. 2009, for a study of the kNN method in a regression model with scalar response and a functional covariate). The results obtained were very similar to those achieved by the Nadaraya–Watson approach, with a slightly better performance of the Nadaraya–Watson-type estimator. For reasons of space, only the results for r1 using the kNN-type estimator are presented (Table 2). For comparison, we also include in this table the part of Table 1 relative to r1.

Table 2 Average CASE given in Eq. (19), over 500 replicates, using the kNN-type estimator (kNN)
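As a rough illustration of the kNN-type variant used for this comparison, a minimal sketch (our own, not the implementation behind Table 2) is given below; it reuses the hypothetical nw_circ function and the grid-based distance from the sketch in Sect. 3, taking as local bandwidth the distance to the (k+1)-th closest curve so that, for distinct distances, exactly k curves receive positive weight.

```r
## kNN-type variant of (11): data-driven local bandwidth at chi.
nw_circ_knn <- function(X, theta, chi, k) {
  delta <- 1 / (ncol(X) - 1)
  d <- sqrt(rowSums(sweep(X, 2, chi)^2) * delta)  # approximate L2 distances
  h_k <- sort(d)[min(k + 1, length(d))]           # local bandwidth from the data
  nw_circ(X, theta, chi, h_k)                     # same estimator, local h
}
```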

6 Real data illustration

There is a quite extensive literature on the computation of daily temperature averages or profiles, which are useful for summarizing temperature patterns (see Huld et al. 2006; Ma and Guttorp 2013; Bernhardt et al. 2018). More recently, there has been some interest in the identification of daily temperature extremes (see Glynis et al. 2021) or the association of the diurnal temperature range with mortality burden (see Cai et al. 2021). All these approaches are possible given the high-frequency temperature records that are available nowadays, which call for a proper statistical analysis using functional data methods (see Ruiz-Medina and Espejo 2012, for a spatial autoregressive functional approach, and Aguilera-Morillo et al. 2017, for spatial functional prediction). These works, tailored for a spatial analysis, do not consider the calendar time as a response.

Daily temperature curves (with records taken every 10 min, so each curve consists of 144 points) have been collected by Meteogalicia in Santiago de Compostela (NW-Spain) from February 15, 2002 until December 31, 2019. The northern region of Spain presents an oceanic climate, characterized by cool winters and warm summers, with mild temperatures in spring and fall. For the initial period from February 15, 2002 to June 28, 2005, the regression estimator proposed in (11) is computed using the available sample \((\mathcal {X}_i,\Theta _i)\), \(i=1, \ldots , n\), where \(\mathcal {X}_i\) denotes the whole daily temperature curve and \(\Theta _i\) the corresponding day on a year basis, for day i in that period (see Fig. 1 in the Introduction). The fitted model is obtained using the same kernel and metric as in the simulations and considering the cross-validation bandwidth selector obtained by minimizing (18), which gives \(h_{\textrm{CV}}=8.8139\). As explained in the Introduction, relevant changes in temperature patterns could be revealed by comparing the days predicted by the fitted model for some temperature curves, observed in a different period, with the actual days corresponding to those temperature curves. Note that the period selected for estimating the model contains enough data to proceed with our nonparametric method, and it is far enough back in time to mitigate temporal correlation with the prediction experiment carried out in what follows.

As a specific example, consider the period from May 21 to May 27, 2019. For this spring week, the 7 observed curves can be seen in Fig. 4 (left), with colors indicating the observed days. In Fig. 4 (right), the same 7 curves are colored according to the days predicted for them by the model fitted with the 2002–2005 data in Fig. 1 (left). According to the model for this early period, the curves observed during this week in May 2019 correspond to days in June, July and August. Hence, this week in May 2019 would be perceived as warmer with respect to the reference period.

Fig. 4 Daily temperature curves in Santiago de Compostela (Spain) from May 21, 2019 to May 27, 2019 (left). Colors indicate the observed day. Temperature curves colored by predicted days using the fitted model (right)

For a more general analysis, consider the temperature curves registered in 2019 and take a division by months. We denote the observed sample by \((\mathcal {X}^b_j,\Theta ^b_j)\), for \(j=1,\ldots ,n_b\) and \(b=1,\ldots ,12\), where \(n_b\) is the number of days in month b. Similarly to the measurement error considered for evaluating the performance of the estimator in (19), a circular average prediction error (CAPE) is defined for each month as:

$$\begin{aligned} \text{ CAPE}_b=\frac{1}{n_b}\sum _{j=1}^{n_b}\left\{ 1-\cos \left[ {\Theta ^b_j}-{{\hat{m}}}_h({\mathcal {X}^b_j})\right] \right\} , \end{aligned}$$
(20)

where \({{\hat{m}}}_h(\mathcal {X}^b_j)\), for \(j=1, \ldots , n_b\) and \(b=1,\ldots , 12\), is the predicted day in the year 2019 with the model fitted using the 2002–2005 data. Note that this prediction error is given in an easily interpretable scale: a perfect match between observed and predicted days for a sample of temperature curves yields a null value; if predictions are located 3 months later/before than the observations, then the prediction error takes value 1 (so a one-month time lag gives one third); and if predictions and observations are 6 months apart, then the prediction error takes value 2. Boxplots for the monthly prediction errors for 2019 are depicted in Fig. 5.
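As a toy illustration of this scale, the following R lines compute (20) for one hypothetical week: the observed days are those of May 21–27 in a non-leap year, while the predicted days are made-up values used only to show the computation.

```r
## Toy computation of the circular average prediction error (20).
day_to_angle <- function(day, period = 365) 2 * pi * (day - 1) / period
obs  <- 141:147                                # May 21-27 as days of the year
pred <- c(170, 181, 192, 175, 200, 168, 210)   # hypothetical predicted days
mean(1 - cos(day_to_angle(obs) - day_to_angle(pred)))   # CAPE for this week
```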

Fig. 5 Boxplots for the circular prediction errors, by months, for 2019. Red dots: circular average prediction error for each month. (Color figure online)

Note that there are seven months (April, May, June, September, October, November and December) that present prediction errors larger than 1/3, indicating that the mismatch between the observed day for a certain curve and the predicted day (with the model fitted for the early period) is larger than one month. On the other hand, it is observed in Fig. 5 that January/February and July/August in 2019 behave, more or less, as “usual” Januaries/Februaries or Julys/Augusts in the years considered in the training sample, perhaps with only a small “shift” in days for the temperature curves of these months. The experts from the meteorological service consulted by us indicated that this is the expected behavior for these months in a context of global warming. In the Atlantic climate (the one corresponding to Galicia, where the data were recorded), one of the manifestations of climate change is the disappearance of the seasons, specifically spring and fall. January/February and July/August are still the coolest and warmest months, respectively, in the region considered in this research, and they would be the months that distinguish the “periods” of the year.

Moreover, to analyze in more depth the direction of the displacements observed in Fig. 5, circular boxplots (Buttarazzi et al. 2018) of the differences between the observed days and those predicted by our nonparametric estimator are computed. Figure 6 shows these circular boxplots for all the months, that is, the circular boxplots of \({\Theta ^b_j}-{{\hat{m}}}_h({\mathcal {X}^b_j})\), for \(j=1,\ldots , n_b\) and \(b=1,\ldots ,12\). Note the large variability in predictions for April, May, June and October, as well as the small variability, but with a clear shift (prediction errors not centered at zero), for September (warmer than expected according to the fitted model), November and December (cooler than expected). Additionally, to rule out that 2019 was a singular year, we replicated this experiment for the years 2017, 2018 and 2020, obtaining a very similar pattern in all cases.

Fig. 6 Circular boxplots for the circular prediction errors, by months, for 2019. The months are in order from left to right and from top to bottom

We also repeated this study using the kNN-type estimator considered in the simulations. The results and conclusions derived are practically the same as those obtained with the Nadaraya–Watson-type estimator. Figure 7 shows the same information as that presented in Fig. 5, but using the kNN-type estimator.

As a final comment, the potential use of the proposed method to explain climate change effects on agriculture should be highlighted. The IPCC 2022 report (Pörtner HO et al. 2022, Chap. 5) indicates that temperature trend changes have modified the life cycle of crops (both shortening and prolonging them, depending on latitude). In some areas, these effects have led farmers to change their agricultural practices. In addition, the IPCC 2022 report (Pörtner HO et al. 2022) also notes that temperature variability directly affects both the characteristics of some harvests (e.g., acidity or colour of fruits) and the risk of pests and diseases. Understanding temperature variability throughout the year can therefore be useful for adapting agricultural practices.

Fig. 7 Boxplots for the circular prediction errors, by months, for 2019. Red dots: circular average prediction error for each month, using the kNN-type estimator. (Color figure online)

7 Discussion

Regression analysis of models involving non-Euclidean data represents a challenging problem, given the need to develop new statistical approaches for its study. In this paper, we focus on one of these non-standard settings. In particular, we consider a regression model with a circular response and a functional covariate. A nonparametric estimator of Nadaraya–Watson type for the corresponding regression function is proposed and analyzed from a theoretical and a practical point of view. Asymptotic bias and variance expressions, as well as the asymptotic normality of the nonparametric approach, are derived. The finite sample performance of the estimator is assessed by simulations and illustrated using a real data set. From a practical perspective, our methodology allows modeling the relation between a temperature curve (as a covariate) and a day (as a response), making it possible to illustrate how temperature patterns change on a local scale.

As in any kernel-based method, an important tuning parameter, called the bandwidth or smoothing parameter, has to be selected by the user. In this paper, a cross-validation approach was proposed for this task and used in the numerical studies. Other possible selectors of plug-in type or based on bootstrap resampling methods could also be designed. While the use of global plug-in bandwidths in the infinite-dimensional context can be very intricate, given the difficulty of dealing with integrated versions of the AMSE, a bootstrap bandwidth selection method could be formulated more easily. The idea would consist in approximating the CASE, given in (19) (or the mean of the CASE), by a bootstrap version of this error criterion using bootstrap samples obtained from bootstrap residuals. For this, two pilot bandwidths are needed, one to obtain the residuals and another to compute the bootstrap samples. This second pilot bandwidth can also be used to compute the Nadaraya–Watson-type estimator of m employed instead of the theoretical regression function in the bootstrap version of the CASE (19). Repeating this process many times, the bootstrap bandwidth would be obtained by minimizing the averaged bootstrap CASE. The theoretical justification of this functional-circular bootstrap procedure, as well as the design of practical rules to obtain both pilot bandwidths, are open problems beyond the scope of the present paper, but of interest for future research.

In this paper, we have focused on a Nadaraya–Watson-type estimator for m in model (1). However, a local linear-type estimator could also be defined. In that case, local linear regression estimators for models with a Euclidean response and a functional covariate are required. Moreover, using the asymptotic theory for these estimators (Baíllo and Grané 2009; Ferraty and Nagy 2022) and with arguments similar to those employed in the present paper for (11), asymptotic results for the local linear-type regression estimator could also be derived.

A possible extension of the model assumed in this paper would consist in including additional types of covariates. In a practical situation, this kind of more complex model could help obtain better predictions. The most natural way to address this problem (following our kernel methodology) would be to use product kernels, defined on possibly different spaces, for the estimators of the sine and cosine components of the response. For instance, the ideas in García-Portugués et al. (2013) for cylindrical density estimation, and the ones in Racine and Li (2004) or in Li and Racine (2004) for categorical data, can be employed to define the corresponding weights in this context. This type of product kernel estimator has recently been studied in nonparametric functional regression in Shang (2014) and in Selk and Gertheiss (2022), for models with scalar response and different types of covariates (functional, real-valued or discrete-valued), and such estimators could be directly employed in our functional-circular framework as regression estimators of the sine and cosine models. The important bandwidth selection problem in this context could be addressed by a cross-validation procedure or using a Bayesian approach (Shang 2014). We think that a deeper study of these extended models is an interesting point that could be addressed in further research.

It should be noticed that the temperature curves considered in our data illustration exhibit temporal dependence. The nonparametric approach for estimating the regression function is not affected by data dependencies, but such an issue should be accounted for when deriving the asymptotic properties and when designing a more appropriate bandwidth selector. Considering that the proposed estimator is computed as the atan2 of nonparametric regression estimators for the sine and cosine models, results on nonparametric regression for scalar responses and functional covariates with temporal dependence could be employed to address this issue (Attouch et al. 2010, 2013; Ferraty and Vieu 2006; Masry 2005; Ling et al. 2019; Kurisu 2022). Note also that our real data application considers a single location, but if the data were also collected with a high spatial frequency, it would be possible to proceed with a spatial analysis. For this, an extension to the infinite-dimensional context of the regression model assumed in Meilán-Vila et al. (2021) could be considered, using the results obtained in Ruiz-Medina and Espejo (2012) or in Aguilera-Morillo et al. (2017) for spatial functional data as part of the analysis.

The numerical analysis carried out in this research was performed with the statistical environment R (R Development Core Team 2022), using the functions supplied with the fda.usc package (Febrero-Bande and Oviedo de la Fuente 2012).