Abstract
Let h(.) be a continuous, strictly positive probability density function on an interval [a, b], and let H(.) be its cumulative distribution function (cdf). Given a sample \(X_{1},\ldots ,X_{n}\) of independent, identically distributed variables, we wish to estimate H(.). The present work has two goals. The first is to propose an estimator of the cdf based on an orthogonal trigonometric series and to establish its statistical and asymptotic properties (bias, variance, mean squared error and mean integrated squared error, together with the convergence of each of these quantities, uniform convergence in probability, and the rate of convergence of the mean integrated squared error). The second is to introduce a new method for selecting the “smoothing parameter”. A simulation comparison between this method and Kronmal–Tarter’s method shows that the new method performs better in the sense of the mean integrated squared error.
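To make the construction concrete, here is a minimal Python sketch of a truncated trigonometric-series cdf estimator. The coefficient formulas \(\hat{A}_{0}=(\pi -\overline{X})/\sqrt{2\pi }\) and \(\hat{A}_{k}=\hat{\gamma }_{k}/k-\hat{\beta }_{k}/k+(-1)^{k+1}/(k\sqrt{2\pi })\) are those used in the appendix proofs; the basis \(\varphi _{k}(x)=(\cos kx+\sin kx)/\sqrt{2\pi }\) and the assumption that the data are supported on \([-\pi ,\pi ]\) are inferred from the kernel appearing in the proof of Theorem 1, so this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def cdf_estimate(sample, d, x):
    """Truncated trigonometric-series cdf estimate at the points x.

    Coefficient formulas follow the appendix:
      A0_hat = (pi - mean(X)) / sqrt(2*pi)
      Ak_hat = gamma_k/k - beta_k/k + (-1)^(k+1) / (k*sqrt(2*pi))
    The basis phi_k(x) = (cos kx + sin kx)/sqrt(2*pi) is inferred from the
    kernel in the proof of Theorem 1 (an assumption of this sketch).
    The sample is assumed to be supported on [-pi, pi].
    """
    s2pi = np.sqrt(2 * np.pi)
    # k = 0 term: A0_hat * phi_0(x) = (pi - mean) / (2*pi), constant in x
    H = np.full_like(np.asarray(x, float), (np.pi - sample.mean()) / (2 * np.pi))
    for k in range(1, d + 1):
        gamma_k = np.cos(k * sample).mean() / s2pi   # empirical cosine coefficient
        beta_k = np.sin(k * sample).mean() / s2pi    # empirical sine coefficient
        A_k = gamma_k / k - beta_k / k + (-1) ** (k + 1) / (k * s2pi)
        H += A_k * (np.cos(k * x) + np.sin(k * x)) / s2pi
    return H
```

For data on a general interval [a, b], one would first rescale the sample to \([-\pi ,\pi ]\) before calling `cdf_estimate` (the rescaling step is omitted here).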
References
Altman N, Leger C (1995) Bandwidth selection for kernel distribution function estimation. J Stat Plan Inference 46:195–214
Babu J, Canty A, Chaubey Y (2002) Application of Bernstein polynomials for smooth estimation of a distribution and density function. J Stat Plan Inference 105(2):377–392
Bradford R (1974) Estimation of distributions using orthogonal expansions. Ann Stat 2:454–463
Buckland S (1992) Fitting density functions with polynomials. J R Stat Soc [Ser A] 41:63–76
Butler W, Kronmal R (1985) Discrimination with polychotomous predictor variables using orthogonal functions. J Am Stat Assoc 80:443–448
Cencov N (1962) Evaluation of an unknown distribution density from observations. Soviet Math Dokl 3:1559–1562
Chaubey Y, Sen P (1996) On smooth estimation of survival and density functions. Stat Decis 14:1–22
Crain B (1973) A note on density estimation using orthogonal expansions. J Am Stat Assoc 68:964–965
Crain B (1976) More on estimation of the distributions using orthogonal expansions. J Am Stat Assoc 71:741–745
Donoho D, Johnstone I, Kerkyacharian G, Picard D (1996) Density estimation by wavelet thresholding. Ann Stat 24:508–539
Diggle P, Hall P (1986) The selection of terms in an orthogonal series density estimator. J Am Stat Assoc 81:230–233
Efron B, Tibshirani R (1993) An introduction to the bootstrap. Chapman and Hall, London
Efromovich S (1999) Nonparametric curve estimation: methods, theory and applications. Springer, New York
Efromovich S (2010) Orthogonal series density estimation. Wiley Interdiscip Rev Comput Stat 2:467–476
Fryer M (1976) A review of some non-parametric estimators of density functions. J Inst Math Appl 18:371–380
Greblicki W, Pawlak M (1981) Classification using the Fourier series estimate of multivariate density functions. IEEE Trans Syst Man Cybern 11:726–730
Hall P (1980) Estimating a density on the positive half line by the method of orthogonal series. Ann Inst Stat Math 32:351–362
Hall P (1981) On trigonometric series estimates of densities. Ann Stat 9:683–685
Hall P (1983a) Orthogonal series distribution function estimation, with applications. J R Stat Soc B 45:81–88
Hall P (1983b) Orthogonal series methods for both qualitative and quantitative data. Ann Stat 11:1004–1007
Hansen B, Lauritzen S (2002) Nonparametric Bayes inference for concave distribution functions. Stat Neerl 56(1):110–127
Hart J (1985) On the choice of truncation point in Fourier series density estimators. J Stat Comput Simul 21:95–116
Härdle W, Kerkyacharian G, Picard D, Tsybakov A (1999) Wavelets, approximation and statistical applications. Lecture Notes in Statistics, 129
He X, Shi P (1998) Monotone B-spline smoothing. JASA Theory Methods 93:643–650
Herrick D, Nason G, Silverman B (2001) Some new methods for wavelet density estimation. Sankhya A63:94–411
Ihaka R, Gentleman R (1996) R: a language for data analysis and graphics. J Comput Graph Stat 5(3):299–314
Jones M (1990) The performance of kernel density functions in kernel distribution function estimation. Stat Probab Lett 9:129–132
Kronmal R, Tarter M (1968a) The estimation of probability densities and cumulatives by Fourier series methods. J Am Stat Assoc 63:925–952
Kronmal R, Tarter M (1968b) Estimation of the cumulatives by Fourier series methods and application to the insertion problem. Proc ACM 23:491–497
Kronmal R, Tarter M (1970) On multivariate density estimates based on orthogonal expansions. Ann Math Stat 41:718–722
Lehmann E, Casella G (1998) Theory of point estimation, 2nd edn. Springer, New York
Lock M (1990) Optimizing density estimates based on unweighted and weighted mean integrated squared error. Unpublished PhD dissertation, University of California, Berkeley, Group in Biostatistics
Nadaraya E (1964) Some new estimates for distribution function. Theory Probab Appl 9:497–500
Ott J, Kronmal R (1976) Some classification procedures for multivariate binary data using orthogonal functions. J Am Stat Assoc 71:391–399
Perron F, Mengersen K (2001) Bayesian nonparametric modeling using mixtures of triangular distributions. Biometrics 57:518–528
Restle EM (1999) Estimating distribution functions with smoothing splines. Technical Report Statap. 1999.5 (DMA, EPF Lausanne, CH.)
Rudzkis R, Radavicius M (2005) Adaptive estimation of distribution density in the basis of algebraic polynomials. Theory Probab Appl 49:93–109
Saadi N, Adjabi S (2009) On the estimation of the probability density by trigonometric series. Commun Stat 38:3583–3595
Tarter M, Kronmal R (1976) An introduction to the implementation and theory of nonparametric density estimation. Am Stat 30:105–112
Tarter M, Lock M (1993) Model-free curve estimation. Chapman and Hall, New York
Wahba G (1976) Histosplines with knots which are order statistics. J R Stat Soc B 38:140–151
Walter G (1977) Properties of Hermite series estimation of probability density. Ann Stat 5:1258–1264
Watson G, Leadbetter M (1964) Hazard analysis I. Biometrika 51:175–184
Walter GG (1994) Wavelets and other orthogonal systems with applications. CRC Press, London
Acknowledgments
We are very thankful to the Editor, an Editorial Board member and the referees for their instructive comments and suggestions.
Appendix
Proof of Proposition 1
(i)
For \(k=0\), \(\mathbb {E}(\hat{A}_{0})=\mathbb {E}\big (\frac{\pi -\overline{X}}{\sqrt{2\pi }}\big )= \frac{\pi -\mu }{\sqrt{2\pi }}.\) On the other hand, \(A_{0} =\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi } H(x)dx.\) Applying integration by parts, we get
$$\begin{aligned} A_{0}= \frac{1}{\sqrt{2\pi }}\left[ [xH(x)]_{-\pi }^{\pi }-\int _{-\pi }^{\pi } x h(x)dx\right] = \frac{\pi -\mu }{\sqrt{2\pi }}=\mathbb {E}(\hat{A}_{0}). \end{aligned}$$For \(k \ne 0\)
$$\begin{aligned} \mathbb {E}(\hat{A}_{k})= & {} \mathbb {E}\left[ \frac{\hat{\gamma }_{k}}{k}-\frac{\hat{\beta }_{k}}{k}+\,\frac{(-1)^{k+1}}{k\sqrt{2\pi }}\right] =\frac{1}{k\sqrt{2\pi }}\mathbb {E}(\cos (kX))\\&-\,\frac{1}{k\sqrt{2\pi }}\mathbb {E}(\sin (kX))+\,\frac{(-1)^{k+1}}{k\sqrt{2\pi }} =\frac{\gamma _{k}}{k}-\frac{\beta _{k}}{k}+\,\frac{(-1)^{k+1}}{k\sqrt{2\pi }}, \end{aligned}$$where
$$\begin{aligned} \frac{\gamma _{k}}{k}=\frac{1}{k\sqrt{2\pi }}\int _{-\pi }^{\pi }\cos (kx) h(x)dx \; \; \hbox {and} \; \; \frac{\beta _{k}}{k}=\frac{1}{k\sqrt{2\pi }}\int _{-\pi }^{\pi }\sin (kx) h(x)dx. \end{aligned}$$Applying integration by parts, we get
$$\begin{aligned} \frac{\gamma _{k}}{k}= & {} \frac{1}{\sqrt{2\pi }}\left[ \frac{\cos (kx)}{k}H(x)\right] _{-\pi }^{\pi }+\,\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\sin (kx)H(x)dx,\\ \frac{-\beta _{k}}{k}= & {} \frac{1}{\sqrt{2\pi }}\left[ -\frac{\sin (kx)}{k}H(x)\right] _{-\pi }^{\pi }+\,\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\cos (kx)H(x)dx. \end{aligned}$$Consequently,
$$\begin{aligned} \frac{\gamma _{k}}{k}=\frac{(-1)^{k}}{k\sqrt{2\pi }}+\,\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\sin (kx)H(x)dx \; \; \hbox {and} \; \; \frac{-\beta _{k}}{k}=\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\cos (kx)H(x)dx. \end{aligned}$$Then, for \(k \ne 0\), we deduce that
$$\begin{aligned} \mathbb {E}(\hat{A}_{k})= & {} \frac{(-1)^{k}}{k\sqrt{2\pi }}+\,\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\sin (kx)H(x)dx +\,\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }\cos (kx)H(x)dx\\&+\,\frac{(-1)^{k+1}}{k\sqrt{2\pi }}=A_{k}. \end{aligned}$$
(ii)
For \(k=0\), \(\mathbb {V}ar(\hat{A}_{k})= \mathbb {V}ar(\hat{A}_{0}) = \mathbb {V}ar(\frac{\pi -\overline{X}}{\sqrt{2\pi }})= \frac{\alpha }{2\pi n}.\) For \(k\ne 0\), we have
$$\begin{aligned} \mathbb {V}ar(\hat{A}_{k})= & {} \frac{1}{2\pi nk^{2}}\mathbb {V}ar\left[ \cos (kX) -\sin (kX)+(-1)^{k+1}\right] \\= & {} \frac{1}{2\pi nk^{2}}-\frac{\beta _{2k}}{nk^{2}\sqrt{2\pi } }-\frac{(\gamma _{k}-\beta _{k})^{2}}{ nk^{2}}. \end{aligned}$$
(iii)
We have
$$\begin{aligned} \mathbb {C}ov(\hat{A}_{k},\hat{A}_{j})=\mathbb {E}(\hat{A}_{k}\hat{A}_{j})-\mathbb {E}(\hat{A}_{k})\mathbb {E}(\hat{A}_{j}). \end{aligned}$$$$\begin{aligned} \mathbb {E}(\hat{A}_{k}\hat{A}_{j})= & {} \frac{1}{kj}\left[ \mathbb {E}(\hat{\gamma }_{k}\hat{\gamma }_{j})-\mathbb {E}(\hat{\gamma }_{k}\hat{\beta }_{j})+\,\frac{(-1)^{j+1}}{\sqrt{2\pi }}\mathbb {E}(\hat{\gamma }_{k}) -\mathbb {E}(\hat{\beta }_{k}\hat{\gamma }_{j})+\mathbb {E}(\hat{\beta }_{k}\hat{\beta }_{j})\nonumber \right. \\&\left. -\frac{(-1)^{j+1}}{\sqrt{2\pi }}\mathbb {E}(\hat{\beta }_{k})+ \frac{(-1)^{k+1}}{\sqrt{2\pi }}\mathbb {E}(\hat{\gamma }_{j})-\frac{(-1)^{k+1}}{\sqrt{2\pi }}\mathbb {E}(\hat{\beta }_{j})+\frac{(-1)^{j+k+2}}{2\pi }\right] ,\nonumber \\ \end{aligned}$$(26)
and
$$\begin{aligned} \mathbb {E}(\hat{A}_{k})\mathbb {E}(\hat{A}_{j})= & {} \frac{1}{kj}\left[ \mathbb {E}(\hat{\gamma }_{k})-\mathbb {E}(\hat{\beta }_{k})+\frac{(-1)^{k+1}}{\sqrt{2\pi }}\right] \left[ \mathbb {E}(\hat{\gamma }_{j})-\mathbb {E}(\hat{\beta }_{j})+\frac{(-1)^{j+1}}{\sqrt{2\pi }}\right] .\nonumber \\ \end{aligned}$$(27)
Calculation of \(\mathbb {E}(\hat{\gamma }_{k} \hat{\gamma }_{j})\), \(\mathbb {E}(\hat{\gamma }_{k} \hat{\beta }_{j})\), \(\mathbb {E}(\hat{\beta }_{k} \hat{\gamma }_{j})\) and \(\mathbb {E}(\hat{\beta }_{k} \hat{\beta }_{j})\).
We have the following properties:
Then,
In addition
And
Finally,
Using similar calculations, we obtain the following results
and
Substituting (29), (30) and (31) in (26) and (27) respectively, we deduce that
\(\square \)
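As a quick numerical check of Proposition 1(i), the following Monte Carlo sketch compares \(\hat{A}_{k}\) with the exact coefficient \(A_{k}\) for the uniform distribution on \([-\pi ,\pi ]\), for which \(H(x)=(x+\pi )/(2\pi )\) and direct integration of \(A_{k}=\frac{1}{\sqrt{2\pi }}\int _{-\pi }^{\pi }(\sin kx+\cos kx)H(x)dx\) gives \(A_{k}=(-1)^{k+1}/(k\sqrt{2\pi })\) for \(k\ge 1\), with \(A_{0}=(\pi -\mu )/\sqrt{2\pi }=\sqrt{\pi /2}\). The example and its closed-form coefficients are ours, not taken from the paper.

```python
import numpy as np

def A_hat(sample, k):
    """Empirical coefficient A_k_hat, as defined in the proof of Proposition 1."""
    s2pi = np.sqrt(2 * np.pi)
    if k == 0:
        return (np.pi - sample.mean()) / s2pi
    gamma_k = np.cos(k * sample).mean() / s2pi
    beta_k = np.sin(k * sample).mean() / s2pi
    return gamma_k / k - beta_k / k + (-1) ** (k + 1) / (k * s2pi)

# Monte Carlo check of E(A_k_hat) = A_k for X ~ Uniform(-pi, pi)
rng = np.random.default_rng(1)
X = rng.uniform(-np.pi, np.pi, 500000)
for k in (1, 2, 3):
    A_true = (-1) ** (k + 1) / (k * np.sqrt(2 * np.pi))
    print(k, A_hat(X, k), A_true)
```

With a sample of this size the empirical coefficients agree with the exact ones to about three decimal places, consistent with the \(O(1/(nk^{2}))\) variance derived in part (ii).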
Proof of Property 1
(i)
For \(k=0\),
$$\begin{aligned} \lim _{n \longrightarrow \infty }\mathbb {V}ar(\hat{A}_{k})=\lim _{n \longrightarrow \infty }\frac{\alpha }{2\pi n} =0. \end{aligned}$$For \(k\ne 0\), we have
$$\begin{aligned} \lim _{n \longrightarrow \infty }\mathbb {V}ar(\hat{A}_{k})=\lim _{n \longrightarrow \infty }\left[ \frac{1}{2\pi nk^{2}}-\frac{\beta _{2k}}{nk^{2}\sqrt{2\pi }}-\frac{(\gamma _{k}-\beta _{k})^{2}}{ nk^{2}}\right] =0. \end{aligned}$$
(ii)
We have
$$\begin{aligned} \lim _{n \longrightarrow \infty }\mathbb {E} |\hat{A}_{k}-A_{k}|^{2}=\lim _{n \longrightarrow \infty }\mathbb {E} |\hat{A}_{k}-\mathbb {E}[\hat{A}_{k}]|^{2}=\lim _{n \longrightarrow \infty }\mathbb {V}ar(\hat{A}_{k}). \end{aligned}$$According to (i), \(\lim _{n \longrightarrow \infty }\mathbb {V}ar(\hat{A}_{k})=0\) for all \(k=0,1,\ldots \). We then deduce that
$$\begin{aligned} \lim _{n \longrightarrow \infty }\mathbb {E} |\hat{A}_{k}-A_{k}|^{2}=0. \end{aligned}$$
\(\square \)
Proof of Theorem 1
Since \(\mathbb {V}ar(X)\le \mathbb {E}(X^{2})\), one obtains
Set \(s=\sum _{k=0}^{d_{n}}\frac{1}{2\pi }[\cos k(y-x)+\sin k(y+x)]\). We deduce that
Then,
Since \(|\sin (kx)|\le k |\sin x|\), we have
We deduce that
\(\square \)
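The inequality \(|\sin (kx)|\le k|\sin x|\) used in the bound above is classical (it follows by induction on \(k\)); as a small sanity check of our own, not from the paper, it can be verified numerically over a fine grid:

```python
import numpy as np

# Brute-force check of |sin(kx)| <= k*|sin(x)| on a fine grid over [-pi, pi],
# the inequality used to bound the Dirichlet-type kernel in Theorem 1.
# A tiny tolerance absorbs floating-point rounding near the zeros of sin.
x = np.linspace(-np.pi, np.pi, 20001)
holds = all(
    np.all(np.abs(np.sin(k * x)) <= k * np.abs(np.sin(x)) + 1e-12)
    for k in range(1, 16)
)
```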
Proof of Theorem 4
Then,
In addition,
According to Property 3 and Property 1,
Consequently,
However,
It follows that
Then, we deduce that
\(\square \)
Saadi, N., Adjabi, S. & Gannoun, A. The selection of the number of terms in an orthogonal series cumulative function estimator. Stat Papers 59, 127–152 (2018). https://doi.org/10.1007/s00362-016-0756-9