pp 1–25

Oracally efficient estimation for dense functional data with holiday effects

  • Li Cai
  • Lisha Li
  • Simin Huang
  • Liang Ma
  • Lijian Yang (corresponding author)
Original Paper


Existing functional data analysis literature has mostly overlooked data with spikes in the mean, such as a salesperson's weekly sporting goods sales, which spike around holidays. For such functional data, two-step estimation procedures are formulated for the population mean function and the holiday effect parameters, which correspond to the population sales curve and the spikes in sales during holiday periods, respectively. The estimators are based on spline smoothing of individual trajectories using non-holiday observations, and are shown to be oracally efficient in the sense that both the mean function and the holiday effects are estimated as efficiently as if all individual trajectories were known a priori. Consequently, an asymptotic simultaneous confidence band is established for the mean function, together with confidence intervals for the holiday effects. Two-sample extensions are also formulated, and simulation experiments provide strong evidence corroborating the asymptotic theory. Application to sporting goods sales data has led to a number of new discoveries.
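The two-step idea described above can be illustrated with a minimal sketch on simulated data. It is not the authors' implementation: it assumes a simplified model (smooth mean plus one random component, additive holiday dummies, and noise), uses a linear truncated-power basis as a stand-in for B-splines, and the names `spline_basis` and `two_step_estimate`, the holiday positions, and all numerical settings are hypothetical choices made only for illustration.

```python
import numpy as np

def spline_basis(x, knots):
    """Linear truncated-power basis: 1, x, and (x - k)_+ for each interior knot.
    (Assumption: a simple stand-in for the B-spline basis named in the abstract.)"""
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def two_step_estimate(Y, x, holiday_idx, n_knots=8):
    """Y: (n, N) array of trajectories on grid x; holiday_idx: holiday columns."""
    regular = np.setdiff1d(np.arange(x.size), holiday_idx)
    knots = np.linspace(x[0], x[-1], n_knots + 2)[1:-1]  # equally spaced interior knots
    B = spline_basis(x, knots)
    # Step 1: smooth every trajectory by least squares on non-holiday points only.
    coef, *_ = np.linalg.lstsq(B[regular], Y[:, regular].T, rcond=None)
    fitted = (B @ coef).T  # (n, N) smoothed trajectories, evaluated on the full grid
    # Step 2: average the smoothed trajectories for the mean function, and average
    # the holiday-time gaps between observations and smoothed trajectories.
    mean_hat = fitted.mean(axis=0)
    effects_hat = (Y[:, holiday_idx] - fitted[:, holiday_idx]).mean(axis=0)
    return mean_hat, effects_hat

# Simulated example (all values hypothetical): n curves on a dense grid of N points.
rng = np.random.default_rng(0)
n, N = 200, 104
x = np.arange(1, N + 1) / N
holiday_idx = np.array([25, 51, 77])
true_effects = np.array([2.0, 3.0, 1.5])
m = np.sin(2 * np.pi * x)                        # population mean function
scores = rng.normal(size=(n, 1))                 # one random component per curve
Y = m + scores * np.cos(np.pi * x) + rng.normal(scale=0.3, size=(n, N))
Y[:, holiday_idx] += true_effects                # add holiday spikes

mean_hat, effects_hat = two_step_estimate(Y, x, holiday_idx)
print("estimated holiday effects:", np.round(effects_hat, 2))
```

Excluding the holiday columns in the first step keeps the spikes from distorting the smooth fit, so the spikes can then be recovered as the average gap between the observations and the smoothed trajectories at holiday times.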


B-spline; Dummy variables; Functional data; Holiday effects; Oracle efficiency; Simultaneous confidence band

Mathematics Subject Classification

62M10 62G08 62P20 



This research was supported in part by National Natural Science Foundation of China Awards 11371272 and 11771240, and the Tsinghua University Center for Data-Centric Management in the Department of Industrial Engineering. Part of the research was carried out when the first author was a visitor at the Department of Statistics, Texas A&M University. The first author thanks the China Scholarship Council (CSC) for providing financial support to visit Texas A&M University. The helpful comments from Editor-in-Chief Lola Ugarte, an Associate Editor and two Reviewers are gratefully acknowledged.

Supplementary material

11749_2019_655_MOESM1_ESM.pdf (88 kb)
Supplementary material 1 (pdf 87 KB)



Copyright information

© Sociedad de Estadística e Investigación Operativa 2019

Authors and Affiliations

  1. School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, China
  2. Department of Industrial Engineering, Tsinghua University, Beijing, China
  3. Center for Statistical Science and Department of Industrial Engineering, Tsinghua University, Beijing, China
