Notes on kernel density based mode estimation using more efficient sampling designs

Abstract

The mode is a measure of central tendency and is also the most probable value. In addition, it is not influenced by the tails of the distribution. In the literature, the properties and applications of mode estimation have been considered only under simple random sampling (SRS). However, ranked set sampling (RSS) is a structural sampling method that improves the efficiency of parameter estimation in many circumstances and typically leads to a reduction in the required sample size. In this paper we investigate some of the asymptotic properties of kernel density based mode estimation using RSS. We demonstrate that the kernel density based mode estimator under RSS is consistent and asymptotically normal, with smaller variance than under SRS. A simulation study supports the improved performance of mode estimation under RSS compared with SRS. The computational aspects are illustrated using a Duchenne muscular dystrophy data set.
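
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a standard normal population, perfect ranking within each set, a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and the usual equal-weight kernel estimator applied to the pooled RSS measurements. The function names (kde_mode, srs_sample, rss_sample, mode_variances) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def kde_mode(sample, bandwidth=None, grid_size=512):
    """Return the grid point that maximizes a Gaussian kernel density estimate."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not the paper's choice.
        bandwidth = 1.06 * x.std(ddof=1) * n ** (-1 / 5)
    grid = np.linspace(x.min(), x.max(), grid_size)
    # Unnormalized density: the maximizer does not depend on the constant factor.
    z = (grid[:, None] - x[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1)
    return grid[np.argmax(density)]


def srs_sample(rng, n):
    """Simple random sample of size n from a standard normal population."""
    return rng.standard_normal(n)


def rss_sample(rng, set_size, cycles):
    """Balanced ranked set sample of size set_size * cycles (perfect ranking).

    In each cycle, set_size sets of set_size units are drawn, and only the
    i-th smallest unit of the i-th set is measured.
    """
    draws = rng.standard_normal((cycles, set_size, set_size))
    ranked = np.sort(draws, axis=2)
    idx = np.arange(set_size)
    return ranked[:, idx, idx].ravel()


def mode_variances(set_size=4, cycles=10, reps=500, seed=0):
    """Monte Carlo variances of the KDE mode under SRS and RSS, equal sample size."""
    rng = np.random.default_rng(seed)
    n = set_size * cycles
    srs = [kde_mode(srs_sample(rng, n)) for _ in range(reps)]
    rss = [kde_mode(rss_sample(rng, set_size, cycles)) for _ in range(reps)]
    return np.var(srs, ddof=1), np.var(rss, ddof=1)


if __name__ == "__main__":
    v_srs, v_rss = mode_variances()
    print(f"variance of mode estimate, SRS: {v_srs:.4f}")
    print(f"variance of mode estimate, RSS: {v_rss:.4f}")
```

Both designs measure the same number of units, so any difference between the two Monte Carlo variances reflects the sampling design and not the sample size. The grid search is a simple stand-in for a proper optimizer and limits the precision of the estimated mode to the grid spacing.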

Acknowledgements

The authors would like to thank the reviewers and the associate editor for their valuable comments, which helped us to improve the manuscript.

Author information

Corresponding author

Correspondence to Hani Samawi.

Cite this article

Samawi, H., Rochani, H., Yin, J. et al. Notes on kernel density based mode estimation using more efficient sampling designs. Comput Stat 33, 1071–1090 (2018). https://doi.org/10.1007/s00180-017-0787-2

Keywords

  • Mode estimation
  • Density kernel estimation
  • Ranked set sampling
  • Simple random sample
  • Duchenne muscular dystrophy