Inference about the arithmetic average of log transformed data

Curto, José Dias

doi:10.1007/s00362-022-01315-x

Regular Article
Published: 30 April 2022

Volume 64, pages 179–204, (2023)
Statistical Papers

José Dias Curto, ORCID: orcid.org/0000-0003-2012-9015

Abstract

A common practice in statistics is to log-transform highly skewed data and construct confidence intervals for the population average from the transformed data. However, a confidence interval computed from log-transformed data is an interval for the geometric average rather than the arithmetic average, and neglecting this can lead to misleading conclusions. In this paper, we consider an approach based on a regression of the two sample averages to convert the confidence interval for the geometric average into a confidence interval for the arithmetic average of the original untransformed data. The proposed approach is substantially simpler to implement than the existing methods, and an extensive Monte Carlo and bootstrap simulation study suggests that it outperforms them in terms of coverage probability, even at very small sample sizes. Real data examples are also analyzed, and they support the simulation findings.
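
The point that a back-transformed interval targets the geometric rather than the arithmetic average can be illustrated with a small simulation. The following is only a sketch under stated assumptions and is not the regression-based method proposed in the paper: it assumes log-normal data with parameters mu = 0 and sigma = 1, uses a normal-approximation interval on the log scale, and the function names `back_transformed_ci` and `coverage` are hypothetical.

```python
import math
import random
import statistics


def back_transformed_ci(data, level=0.95):
    """Normal-approximation CI for the mean of log(data), exponentiated.

    Assumption: n is large enough that a normal quantile is an adequate
    stand-in for the Student t quantile.
    """
    logs = [math.log(x) for x in data]
    n = len(logs)
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(mean_log - z * se_log), math.exp(mean_log + z * se_log)


def coverage(target, mu, sigma, n=200, reps=2000, seed=1):
    """Fraction of simulated intervals that contain `target`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(n)]
        lower, upper = back_transformed_ci(sample)
        hits += lower <= target <= upper
    return hits / reps


# Illustrative parameter values (assumptions, not taken from the paper).
mu, sigma = 0.0, 1.0
geometric_mean = math.exp(mu)                   # exp(E[log X])
arithmetic_mean = math.exp(mu + sigma**2 / 2)   # E[X] for a log-normal

cov_geo = coverage(geometric_mean, mu, sigma)
cov_arith = coverage(arithmetic_mean, mu, sigma)
print(f"coverage of geometric mean:  {cov_geo:.3f}")
print(f"coverage of arithmetic mean: {cov_arith:.3f}")
```

Under these assumptions the interval should cover the geometric average at close to the nominal rate and almost never cover the arithmetic average, which is the gap the paper sets out to close.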

Notes

We use “average” and “mean” interchangeably.
The use of bootstrapping was also suggested by one of the reviewers.

Acknowledgements

The author thanks the Editor-in-Chief and referees for their valuable comments and constructive suggestions. This work was supported by Fundação para a Ciência e a Tecnologia, Grant UIDB/00315/2020.

Author information

Authors and Affiliations

Instituto Universitário de Lisboa (ISCTE-IUL), BRU-UNIDE, Lisbon, Portugal
José Dias Curto
Department of Quantitative Methods for Management and Economics, Av. Prof. Aníbal Bettencourt, 1600-189, Lisbon, Portugal
José Dias Curto

Corresponding author

Correspondence to José Dias Curto.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Curto, J.D. Inference about the arithmetic average of log transformed data. Stat Papers 64, 179–204 (2023). https://doi.org/10.1007/s00362-022-01315-x

Received: 07 July 2021
Revised: 22 January 2022
Accepted: 10 April 2022
Published: 30 April 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00362-022-01315-x
