Abstract
Estimating the derivative of the mean of longitudinal and functional data is useful because it quantifies changes in the mean function and can be used in modeling the data. We propose a general method for estimating the derivative of the mean function that supports inference for both longitudinal and functional data, regardless of how sparsely the data are observed. We derive the \(L^2\) and uniform convergence rates of the local linear estimator of the derivative of the mean function, and then obtain the optimal weighting scheme under the \(L^2\) rate of convergence. The performance of the proposed method is evaluated in a simulation study and compared with an existing method. The method is then applied to a real data set on weight growth failure in children.
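As a rough illustration of the kind of estimator the abstract refers to, the following Python sketch fits a weighted local linear regression at a point and reads the derivative estimate off the slope. The function names, the Epanechnikov kernel, and the two weighting helpers (equal weight per observation and equal weight per subject, the two schemes discussed by Zhang and Wang 2016) are assumptions made for illustration; the estimator actually analysed in the paper is defined in its main text, which is not reproduced here.

```python
import numpy as np

def epanechnikov(u):
    """Compactly supported kernel: zero whenever |u| > 1."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def obs_weights(times):
    """Equal weight per observation: w_i = 1 / sum_i N_i."""
    n_i = np.array([len(ti) for ti in times])
    return np.full(len(times), 1.0 / n_i.sum())

def subj_weights(times):
    """Equal weight per subject: w_i = 1 / (n * N_i)."""
    n_i = np.array([len(ti) for ti in times])
    return 1.0 / (len(times) * n_i)

def local_linear_mean_derivative(t0, times, values, weights, h):
    """Weighted local linear fit at t0.

    times, values: lists of 1-D arrays, one pair per subject.
    weights: per-subject weights w_i with sum_i N_i * w_i = 1.
    Returns (b0, b1): estimates of the mean and of its derivative at t0.
    """
    t = np.concatenate(times)
    y = np.concatenate(values)
    # Repeat each subject's weight once per observation of that subject.
    w = np.concatenate([np.full(len(ti), wi) for ti, wi in zip(times, weights)])
    d = t - t0
    k = w * epanechnikov(d / h) / h
    # Normal equations of weighted least squares for y ~ b0 + b1 * (t - t0).
    s0, s1, s2 = k.sum(), (k * d).sum(), (k * d**2).sum()
    r0, r1 = (k * y).sum(), (k * d * y).sum()
    det = s0 * s2 - s1**2
    b0 = (s2 * r0 - s1 * r1) / det
    b1 = (s0 * r1 - s1 * r0) / det
    return b0, b1

# Simulated sparse design (hypothetical data): 200 subjects, 2 to 5 points each.
rng = np.random.default_rng(0)
times = [np.sort(rng.uniform(0, 1, size=m)) for m in rng.integers(2, 6, size=200)]
values = [np.sin(2 * np.pi * ti) + 0.1 * rng.standard_normal(len(ti)) for ti in times]
_, d_hat = local_linear_mean_derivative(0.5, times, values, obs_weights(times), h=0.15)
# The true derivative of sin(2*pi*t) at t = 0.5 is -2*pi; d_hat carries
# smoothing bias and noise, so it will only be in that neighbourhood.
```

Both weighting helpers satisfy \(\sum _{i=1}^n N_i \omega _i =1\), so either can be passed as `weights`; they differ only in how much influence subjects with many observations receive.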
References
Benko M, Härdle W, Kneip A (2009) Common functional principal components. Ann Stat 37:1–34
Cao G, Wang J, Wang L, Todem D (2012) Spline confidence bands for functional derivatives. J Stat Plan Inference 142(6):1557–1570
Cao H, Liu W, Zhou Z (2018) Simultaneous nonparametric regression analysis of sparse longitudinal data. Bernoulli 24(4A):3013–3038
Chen Y, Yao W (2017) Unified inference for sparse and dense longitudinal data in time-varying coefficient models. Scand J Stat 44:268–284
Dai W, Tong T, Genton M (2016) Optimal estimation of derivatives in nonparametric regression. J Mach Learn Res 17:1–25
Dai X, Müller H-G, Tao W (2018) Derivative principal component analysis for representing the time dynamics of longitudinal and functional data. Stat Sin 28:1583–1609
Ebrahimzadeh F, Hajizadeh E, Baghestani AR, Nazer MR (2018) Effective factors on the rate of growth failure in children below two years of age: a recurrent events model. Iran J Public Health 47(3):418–426
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. CRC Press, Boca Raton
Fan J, Zhang WY (2000) Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand J Stat 27:715–731
Grith MM, Wagner HH, Härdle WK, Kneip AA (2018) Functional principal component analysis for derivatives of multivariate curves. Stat Sin 28:2469–2496
Hall P, Müller H-G, Yao F (2009) Estimation of functional derivatives. Ann Stat 37(6A):3307–3329
Horváth L, Kokoszka P (2012) Inference for functional data with applications. Springer, New York
Hosseinioun N, Doosti H, Nirumand HA (2012) Nonparametric estimation of the derivatives of a density by the method of wavelet for mixing sequences. Stat Pap 53(1):195–203
Hsing T, Eubank R (2015) Theoretical foundations of functional data analysis, with an introduction to linear operators. Wiley, New York
Kara-Zaitri L, Laksaci A, Rachdi M, Vieu P (2017) Data-driven kNN estimation in nonparametric functional data-analysis. J Multivar Anal 153:176–188
Kokoszka P, Reimherr M (2017) Introduction to functional data analysis. Chapman & Hall/CRC Press, Boca Raton
Li Y, Hsing T (2010) Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Ann Stat 38:3321–3351
Lima IR, Cao G, Billor N (2018) M-based simultaneous inference for the mean function of functional data. Ann Inst Stat Math. https://doi.org/10.1007/s10463-018-0656-y
Liu B, Müller H-G (2009) Estimating derivatives for samples of sparsely observed functions, with application to on-line auction dynamics. J Am Stat Assoc 104:704–714
Liu W, Lu X (2011) Empirical likelihood for density-weighted average derivatives. Stat Pap 52(2):391–412
Ramsay JO, Silverman B (2002) Applied functional data analysis: methods and case studies. Springer, New York
Ramsay JO, Silverman B (2005) Functional data analysis. Springer, New York
Rossi F, Villa-Vialaneix N (2011) Consistency of functional learning methods based on derivatives. Pattern Recogn Lett 32(8):1197–1209
Pini A, Spreafico L, Vantini S, Vietti A (2019) Multi-aspect local inference for functional data: analysis of ultrasound tongue profiles. J Multivar Anal 170:162–185
Poyton AA, Varziri MS, McAuley KB, McLellan PJ, Ramsay JO (2006) Parameter estimation in continuous-time dynamic models using principal differential analysis. Comput Chem Eng 30(4):698–708
Simpkin AJ, Durban M, Lawlor DA, MacDonald-Wallis C, May MT, Metcalfe C, Tilling K (2018) Derivative estimation for longitudinal data analysis: examining features of blood pressure measured repeatedly during pregnancy. Stat Med 37:2836–2854
Srivastava A, Klassen E, Joshi S, Jermyn I (2011) Shape analysis of elastic curves in euclidean spaces. IEEE Trans Pattern Anal Mach Intell 33(7):1415–1428
Wang H, Zhong P-S, Cui Y, Li Y (2018) Unified empirical likelihood ratio tests for functional concurrent linear models and the phase transition from sparse to dense functional data. J R Stat Soc Ser B 80(2):343–364
Xiao J, Li X, Shi J (2019) Local linear smoothers using inverse Gaussian regression. Stat Pap 60:1225–1253
Zhang J-T, Chen J (2007) Statistical inferences for functional data. Ann Stat 35:1052–1079
Zhang X, Wang J-L (2016) From sparse to dense functional data and beyond. Ann Stat 44:2281–2321
Zhang X, Wang J-L (2018) Optimal weighting schemes for longitudinal and functional data. Stat Prob Lett 138:165–170
Zheng S, Yang L, Härdle W (2014) A smooth simultaneous confidence corridor for the mean of sparse functional data. J Am Stat Assoc 109:661–673
Zhou L, Lin H, Liang H (2018) Efficient estimation of the nonparametric mean and covariance functions for longitudinal and sparse functional data. J Am Stat Assoc 113:1550–1564
Acknowledgements
We thank Dr. Farzad Ebrahimzadeh for providing the data set on weight growth failure in children. We also gratefully acknowledge the support and resources of the Center for High Performance Computing at Shahid Beheshti University of Iran, and we thank the Associate Editor and two referees for their constructive comments.
Appendix
1.1 Proof of Lemma 1
Proof
Denote the equidistant partition on [0, 1] by \(\chi (\eta _n)\), in which the grid length is \(\eta _n \equiv \left( \sum _{i=1}^n N_i \omega _i^2 \log (1/\sum _{i=1}^n N_i \omega _i^2 )\right) ^{2+r}\). Then
where
For the second term on the right-hand side of Eq. (13), we deduce that
where
Note that Assumption (A1) implies that \(K(\cdot ) \le M_K\) for some constant \(M_K\); therefore
Since \(K\big (\frac{t_{ij}-s}{h}\big )=0 \) whenever \(\vert \frac{t_{ij}-s}{h}\vert >1 \) for some i and j, we have
where \(I_{\lbrace \cdot \rbrace }\) is the indicator function. Otherwise, if \( \vert \frac{t_{ij}-s}{h} \vert \le 1\) for some i and j, we can show that
Hence \(B_{2}=B_{21}+B_{22}=o_{p}(a_{n})\) follows directly from (15) and (16). For \( D_2 \) in (13), Jensen’s inequality implies that \( D_2=o(a_{n}) \). For the first term on the right-hand side of Eq. (13), define
then we can write for any \(M >0\),
By Markov's inequality and the independence of the \( v_{ij} \), we deduce that
for some constant C. Therefore, for a large M we have
The proof is complete by combining (13)–(17). \(\square \)
1.2 Proof of Theorem 3.1
Proof
Combining Eq. (9) and Assumptions (A1)–(A3), we conclude that
From Assumptions (A1) and (A2), we have \(f^2 (t)>0\) and \(h^2 f^{\prime 2}(t)\mu _k^2=O(h^2)\). Therefore, under Assumption (A5), \(f^2 (t)-h^2 f^{\prime 2}(t)\mu _k^2\) tends to a positive constant that is bounded away from zero with probability approaching one as \(h \rightarrow 0\), so we have
where \(M_{1}\), \(M_2\), \(M_{3}\), and \(M_4\) are constants. For any \(M>0\), by Markov's inequality,
Therefore, by Assumption (A5), the right-hand side of (19) tends to zero as \(M \rightarrow \infty \). Combining (18) and (19) completes the proof. \(\square \)
1.3 Proof of Theorem 3.2
Proof
We can write
By Lemma 5 in Zhang and Wang (2016), we deduce that
Using the fact that \(f^{\prime }(t)\), f(t) and \(\mu _k^2\) are bounded, and \(K_h(t_{ij}-t) =0\) if \(\vert t_{ij} -t \vert > h\), we have
\(\square \)
1.4 Proof of Theorem 3.3
Proof
For a fixed bandwidth h, minimizing the \(L^2\) rate is equivalent to minimizing \(b_n\) defined in Theorem 3.1, which has a unique minimizer. Applying the method of Lagrange multipliers under the constraint \(\sum _{i=1}^n N_i \omega _i =1\), we deduce that
\(\square \)
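The Lagrange-multiplier step can be illustrated on a generic quadratic criterion. The sketch below assumes the quantity being minimized has the form \(\sum _{i=1}^n c_i \omega _i^2\) with positive coefficients \(c_i\); the function name and the coefficients used in the example are placeholders and are not the terms of \(b_n\) from Theorem 3.1, which are not reproduced in this appendix. Under that assumption, stationarity of the Lagrangian gives \(2 c_i \omega _i = \lambda N_i\), so \(\omega _i\) is proportional to \(N_i / c_i\), and the constraint \(\sum _{i=1}^n N_i \omega _i =1\) fixes the scale.

```python
import numpy as np

def optimal_weights(n_obs, cost):
    """Minimise sum_i cost_i * w_i**2 subject to sum_i n_obs_i * w_i = 1.

    Stationarity of the Lagrangian gives 2 * cost_i * w_i = lam * n_obs_i,
    so w_i is proportional to n_obs_i / cost_i; the constraint fixes the scale:
        w_i = (n_obs_i / cost_i) / sum_j (n_obs_j**2 / cost_j).
    """
    n_obs = np.asarray(n_obs, dtype=float)
    cost = np.asarray(cost, dtype=float)
    raw = n_obs / cost
    return raw / np.sum(n_obs * raw)

# Placeholder example: four subjects with different numbers of observations.
h = 0.1
n_obs = np.array([2, 5, 10, 3])
# Hypothetical per-subject coefficients, chosen only to make the example concrete.
cost = n_obs / h + n_obs * (n_obs - 1)
w_opt = optimal_weights(n_obs, cost)
```

Because the criterion is strictly convex and the constraint is linear, the stationary point is the unique minimizer, which can be checked numerically by comparing `np.sum(cost * w_opt**2)` with the value at any other weight vector that satisfies the constraint.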
Cite this article
Sharghi Ghale-Joogh, H., Hosseini-Nasab, S.M.E. On mean derivative estimation of longitudinal and functional data: from sparse to dense. Stat Papers 62, 2047–2066 (2021). https://doi.org/10.1007/s00362-020-01173-5
DOI: https://doi.org/10.1007/s00362-020-01173-5
Keywords
- Functional/longitudinal data
- Mean derivative function
- Weighting schemes
- \(L^2\) convergence
- Uniform convergence