Securing Density Estimates via Smooth Moment-Based Empirical Distribution Function Approximants

Abstract

This paper proposes an adaptive density estimation procedure that relies on moment-based approximants of splines passing through points obtained from a suitably adjusted and truncated empirical distribution function. More specifically, a four-parameter beta density is first fitted to the data in order to estimate the endpoints of the distribution, which are then combined with the data points. Interpolants of the continuity-corrected empirical distribution function evaluated at these points are then approximated by smooth functions involving polynomials. The density estimates are obtained by differentiating these approximants, and any quantile of the corresponding distribution can be evaluated directly from the associated distribution function. The Cramér-von Mises goodness-of-fit statistic is used as a measure of accuracy. Three illustrative examples are presented.

References

  • Anderson TW (1962) On the distribution of the two-sample Cramér-von Mises criterion. Ann Math Stat 33(3):1148–1159

  • Devroye L, Györfi L (1985) Nonparametric density estimation: the L1 view (Wiley Series in Probability and Statistics). Wiley, New York

  • Eloyan A, Ghosh SK (2011) Smooth density estimation with moment constraints using mixture distributions. J Nonparametr Stat 23(2):513–531

  • Fritsch FN, Carlson RE (1980) Monotone piecewise cubic interpolation. SIAM J Numer Anal 17(2):238–246

  • Girard S, Guillou A, Stupfler G (2012) Estimating an endpoint with high-order moments. Test 21(4):697–729

  • Gramacki A (2019) Nonparametric density estimation and its computational aspects. Springer Nature, New York

  • Hall P (1982) On estimating the endpoint of a distribution. Ann Stat 10(2):556–568

  • Hall P, Wang JZ (1999) Estimating the end-point of a probability distribution using minimum-distance methods. Bernoulli 5(1):177–189

  • Hall P, Wang JZ (2005) Bayesian likelihood methods for estimating the end point of a distribution. J R Stat Soc Series B 67(5):717–729

  • Izenman AJ (1991) Recent developments in nonparametric density estimation. J Am Stat Assoc 86:205–224

  • Jiang M, Provost SB (2014) A hybrid bandwidth selection methodology for kernel density estimation. J Stat Comput Simul 84(3):614–627. https://doi.org/10.1080/00949655.2012.721366

  • Li D, Peng L, Qi Y (2011) Empirical likelihood confidence intervals for the endpoint of a distribution function. Test 20(2):353–366

  • Loh WY (1984) Estimating an endpoint of a distribution with resampling methods. Ann Stat 12(4):1543–1550

  • Liao JG, Wu Y, Lin Y (2010) Improving Sheather and Jones’ bandwidth selector for difficult densities in kernel density estimation. J Nonparametr Stat 22(1):105–114

  • Olkin I, Spiegelman CH (1987) A semiparametric approach to density estimation. J Am Stat Assoc 82:858–865

  • Provost SB (2005) Moment-based density approximants. Math J 9(4):727–756

  • Schuster E, Yakowitz S (1985) Parametric/nonparametric mixture density estimation with application to flood-frequency analysis. Water Resour Bull 21:797–804

  • Scott DW (1985) Averaged shifted histograms: effective nonparametric density estimators in several dimensions. Ann Stat 13:1024–1040

  • Silverman BW (1998) Density estimation for statistics and data analysis. Routledge, Boca Raton

  • Wang JZ (2005) A note on estimation in the four-parameter beta distribution. Commun Stat B Simul Comput 34(3):495–501

  • Waterman MS, Whiteman DE (1978) Estimation of probability densities by empirical density functions. Int J Math Educ Sci Technol 9(2):127–137

  • Zareamoghaddam H, Provost SB, Ahmed SE (2017) A moment-based bivariate density estimation methodology applicable to big data modeling. J Probab Stat Sci 15(2):135–152

Acknowledgements

The financial support of the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged. We also wish to thank the reviewers for their valuable comments and suggestions.

Author information

Corresponding author

Correspondence to Serge B. Provost.

Ethics declarations

Conflict of interest

None of the authors has any competing interests, financial or otherwise, which could be construed as being directly or indirectly related to this work.

Appendix:

The Fritsch-Carlson Monotonic Cubic Interpolation

Given a set of data points \(\left( x_{1}, f_{1}\right) ,\left( x_{2}, f_{2}\right) , \ldots ,\left( x_{p}, f_{p}\right)\) where \(x_1< \cdots < x_p\) and \(f_1< \cdots < f_p\), this approach relies on Hermite interpolation with the requirement that the derivatives at the knots be \(f_i^{\prime } = s_i \tau _i\), \(i=1,\ldots ,p\), where

$$\begin{aligned} s_i = \left\{ \begin{array}{ll} \frac{f_2-f_1}{x_2-x_1}, & i=1,\\ \frac{1}{2} \left( \frac{f_{i}-f_{i-1}}{x_{i}-x_{i-1}} + \frac{f_{i+1}-f_{i}}{x_{i+1}-x_{i}} \right) , & i=2,\ldots ,p-1, \\ \frac{f_p-f_{p-1}}{x_p-x_{p-1}}, & i=p, \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} \tau _i = \left\{ \begin{array}{ll} \min \Big( \frac{3\,\delta _1}{\sqrt{s_1^2 + s_2^2}}, \, 1 \Big), & i=1,\\ \min \Big( \frac{3\,\delta _{i}}{\sqrt{s_{i}^2 + s_{i+1}^2}}, \, \frac{3\,\delta _{i-1}}{\sqrt{s_{i-1}^2 + s_{i}^2}}, \, 1 \Big), & i=2,\ldots ,p-1, \\ \min \Big( \frac{3\,\delta _{p-1}}{\sqrt{s_{p-1}^2 + s_p^2}}, \, 1 \Big), & i=p, \end{array} \right. \end{aligned}$$

with

$$\begin{aligned} \delta _i = \frac{f_{i+1} - f_{i}}{x_{i+1} - x_{i}}, \quad i = 1,\ldots ,p-1, \end{aligned}$$

the Hermite interpolation function on each interval \(\left[ x_{i}, x_{i+1}\right] , \ i=1,2, \ldots ,p-1,\) being

$$\begin{aligned} H(x)=&\left( 1+2 \frac{x-x_{i}}{x_{i+1}-x_{i}}\right) \left( \frac{x-x_{i+1}}{x_{i}-x_{i+1}}\right) ^{2} f_{i}+\left( 1+2 \frac{x-x_{i+1}}{x_{i}-x_{i+1}}\right) \left( \frac{x-x_{i}}{x_{i+1}-x_{i}}\right) ^{2} f_{i+1} \\&+\left( x-x_{i}\right) \left( \frac{x-x_{i+1}}{x_{i}-x_{i+1}}\right) ^{2} f_{i}^{\prime }+\left( x-x_{i+1}\right) \left( \frac{x-x_{i}}{x_{i+1}-x_{i}}\right) ^{2} f_{i+1}^{\prime } \,. \end{aligned}$$
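
The interpolation scheme can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. It assumes that the secant slopes are taken as differences in \(f\) divided by the corresponding differences in \(x\), as in Fritsch and Carlson (1980), that both sequences are strictly increasing, and that knots are indexed from zero. The function names `fc_derivatives` and `hermite_eval` are hypothetical.

```python
import math
from bisect import bisect_right


def fc_derivatives(x, f):
    """Limited knot derivatives s_i * tau_i for strictly increasing x and f."""
    p = len(x)
    if p < 2:
        raise ValueError("at least two points are required")
    # Secant slope on each interval [x_i, x_{i+1}] (assumed definition of delta_i).
    delta = [(f[i + 1] - f[i]) / (x[i + 1] - x[i]) for i in range(p - 1)]
    # Initial slopes: one-sided at the two ends, averaged secants in between.
    s = [delta[0]]
    s += [0.5 * (delta[i - 1] + delta[i]) for i in range(1, p - 1)]
    s += [delta[-1]]
    # Scaling factor required on each interval, capped at 1.
    scale = [min(3.0 * delta[i] / math.hypot(s[i], s[i + 1]), 1.0)
             for i in range(p - 1)]
    # An interior knot takes the smaller factor of its two adjacent intervals.
    tau = [scale[0]]
    tau += [min(scale[i - 1], scale[i]) for i in range(1, p - 1)]
    tau += [scale[-1]]
    return [s[i] * tau[i] for i in range(p)]


def hermite_eval(x, f, d, t):
    """Evaluate the cubic Hermite interpolant H at t in [x[0], x[-1]]."""
    i = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    a = (t - x[i]) / h            # (x - x_i) / (x_{i+1} - x_i)
    b = (t - x[i + 1]) / (-h)     # (x - x_{i+1}) / (x_i - x_{i+1})
    return ((1 + 2 * a) * b ** 2 * f[i]
            + (1 + 2 * b) * a ** 2 * f[i + 1]
            + (t - x[i]) * b ** 2 * d[i]
            + (t - x[i + 1]) * a ** 2 * d[i + 1])


# Illustrative data with a steep middle segment (made up for this sketch).
x = [0.0, 1.0, 2.0, 3.0]
f = [0.0, 0.1, 0.9, 1.0]
d = fc_derivatives(x, f)
grid = [hermite_eval(x, f, d, 3.0 * k / 300) for k in range(301)]
```

With these illustrative values, the averaged slope at the second knot (0.45) is large relative to the secant slope of the first interval (0.1), so the scaling factor falls below one there. Without it, the cubic on the first interval would overshoot and fail to be monotone.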

If one wishes to readily obtain distribution function (d.f.) values or to determine certain quantiles from \(\{x_1,\ldots ,x_n\},\) a sample of \(n\) ordered observations, a Fritsch-Carlson cubic spline ought to provide reasonably accurate values throughout the support of the distribution once it is applied to the points

$$\begin{aligned} \big \{(\ell ,0), (x_1, {\tfrac{1}{n}}-{\tfrac{1}{2n}}),\ldots , (x_n, {\tfrac{n}{n}}-{\tfrac{1}{2n}}) , (u,1)\big \}, \end{aligned}$$
(14)

where \(\ell\) and \(u\) denote the estimated endpoints.
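
As a rough illustration of how the points in (14) might be assembled and then used to read off d.f. values and quantiles, consider the following Python sketch. It assumes distinct observations and endpoint estimates that strictly bracket the sample. To keep the example short, a piecewise-linear interpolant stands in for the Fritsch-Carlson spline, and quantiles are obtained by bisection, which only requires the interpolant to be nondecreasing. The names `edf_points`, `linear_df` and `quantile` are hypothetical.

```python
from bisect import bisect_right


def edf_points(sample, lower, upper):
    """Points (l, 0), (x_(i), i/n - 1/(2n)) for i = 1..n, and (u, 1)."""
    xs = sorted(sample)
    n = len(xs)
    if not (lower < xs[0] and xs[-1] < upper):
        raise ValueError("endpoints must strictly bracket the sample")
    pts = [(lower, 0.0)]
    # i/n - 1/(2n) written as (2i - 1)/(2n).
    pts += [(x, (2 * i - 1) / (2 * n)) for i, x in enumerate(xs, start=1)]
    pts.append((upper, 1.0))
    return pts


def linear_df(pts):
    """Piecewise-linear stand-in for a monotone spline through pts."""
    xs = [p[0] for p in pts]
    fs = [p[1] for p in pts]

    def F(t):
        if t <= xs[0]:
            return 0.0
        if t >= xs[-1]:
            return 1.0
        i = bisect_right(xs, t) - 1
        w = (t - xs[i]) / (xs[i + 1] - xs[i])
        return fs[i] + w * (fs[i + 1] - fs[i])

    return F


def quantile(F, q, lower, upper, tol=1e-10):
    """Solve F(t) = q by bisection; F must be nondecreasing on [lower, upper]."""
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Made-up sample and endpoint estimates, for illustration only.
pts = edf_points([2.0, 1.0, 4.0, 3.0], lower=0.0, upper=5.0)
F = linear_df(pts)
median = quantile(F, 0.5, 0.0, 5.0)
```

Replacing `linear_df` with a monotone cubic interpolant leaves `quantile` unchanged, since the bisection step relies only on monotonicity.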

About this article

Cite this article

Provost, S.B., Yang, Z. & Ahmed, S.E. Securing Density Estimates via Smooth Moment-Based Empirical Distribution Function Approximants. J Indian Soc Probab Stat 23, 1–18 (2022). https://doi.org/10.1007/s41096-022-00119-4

  • DOI: https://doi.org/10.1007/s41096-022-00119-4

Keywords

  • Data modeling
  • Density estimation
  • Goodness-of-fit
  • Moments
  • Quantiles

JEL Classifications

  • C80
  • C14
  • C13