Estimating Sample Skewness from Sample Data Summaries and Associated Evaluation of Normality

Mathematical Methods of Statistics

Abstract

We propose a method for estimating the sample skewness from given summary statistics and give explicit formulas for the most common scenarios. We show that the method provides a nearly unbiased estimator of the non-parametric skewness measure. We evaluate its performance empirically on real-life data sets of COVID-19 vaccination status. We also demonstrate how the method can be applied to detect skewness of the underlying distribution.
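For orientation, the following minimal Python sketch illustrates two textbook skewness measures related to the quantities named in the abstract. It is not the estimator proposed in the article, whose explicit formulas are not reproduced on this page. The function names `nonparametric_skewness` and `bowley_skewness` are hypothetical, chosen only for this illustration. The first function uses the standard definition (mean − median) / standard deviation and needs the raw sample; the second uses Bowley's quartile coefficient, which needs only the three quartiles of a reported summary.

```python
import statistics


def nonparametric_skewness(sample):
    """Standard non-parametric skewness: (mean - median) / standard deviation.

    Illustrative only; this requires the raw sample and is not the
    summary-based estimator developed in the article.
    """
    mean = statistics.fmean(sample)
    median = statistics.median(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's quartile skewness, computed from the three quartiles alone.

    Lies in [-1, 1]; 0 indicates the quartiles are symmetric about the median.
    """
    if q3 <= q1:
        raise ValueError("q3 must be strictly greater than q1")
    return (q3 + q1 - 2 * median) / (q3 - q1)


if __name__ == "__main__":
    symmetric = [1, 2, 3, 4, 5]
    right_skewed = [1, 2, 3, 4, 20]
    print(nonparametric_skewness(symmetric))
    print(nonparametric_skewness(right_skewed))
    print(bowley_skewness(1, 2, 5))
```

A value near zero from either measure is consistent with a symmetric distribution, while a clearly positive or negative value suggests right or left skew. Neither function constitutes a formal test of normality.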

Funding

The research of the first author was funded by the Natural Sciences and Engineering Research Council of Canada, grant RGPIN-2020-06733. The funding agency had no role in the study design, in the analysis and interpretation of data, in the writing of the report, or in the decision to submit the article for publication.

Author information

Contributions

Narayanaswamy Balakrishnan: conceptualization, formal analysis, investigation, methodology, funding acquisition, writing—original draft, writing—review and editing. Jan Rychtář: formal analysis, investigation, methodology, software, visualization, writing—original draft, writing—review and editing. Dewey Taylor: formal analysis, investigation, methodology, software, visualization, writing—original draft, writing—review and editing.

Corresponding authors

Correspondence to Narayanaswamy Balakrishnan, Jan Rychtář or Dewey Taylor.

About this article

Cite this article

Balakrishnan, N., Rychtář, J. & Taylor, D. Estimating Sample Skewness from Sample Data Summaries and Associated Evaluation of Normality. Math. Meth. Stat. 32, 260–273 (2023). https://doi.org/10.3103/S106653072304004X
