Abstract
A non-parametric transformation function is introduced to transform data to any continuous distribution. When transformation of data to normality is desired, the use of a suitable parametric pre-transformation function improves the performance of the proposed non-parametric transformation function. The resulting semi-parametric transformation function is shown empirically, via a Monte Carlo study, to perform at least as well as any parametric transformation currently available in the literature.
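The abstract does not spell out the construction, so the following is only a minimal sketch of one way such a semi-parametric transformation could be assembled, not the authors' implementation. It assumes a Box–Cox pre-transformation, a Gaussian-kernel estimate of the distribution function, and an illustrative bandwidth of order n^(-1/3); the names `box_cox`, `kernel_cdf`, and `to_normal` are hypothetical.

```python
import math
from statistics import NormalDist, stdev

STD_NORMAL = NormalDist()  # standard normal: cdf() and inv_cdf()


def box_cox(x, lam):
    """Parametric pre-transformation (Box-Cox); requires x > 0."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def kernel_cdf(sample, h):
    """Smooth distribution function estimate with a Gaussian kernel.

    F_hat(t) = (1/n) * sum_i Phi((t - x_i) / h), where h > 0 is the bandwidth.
    """
    n = len(sample)

    def F_hat(t):
        return sum(STD_NORMAL.cdf((t - xi) / h) for xi in sample) / n

    return F_hat


def to_normal(sample, lam=None, h=None):
    """Map a sample to approximate standard normality.

    Step 1 (optional, parametric): Box-Cox with parameter lam.
    Step 2 (non-parametric): z = Phi^{-1}(F_hat(y)).
    """
    y = [box_cox(x, lam) for x in sample] if lam is not None else list(sample)
    if h is None:
        # Assumption: an illustrative rule-of-thumb bandwidth, not a tuned one.
        h = 1.06 * stdev(y) * len(y) ** (-1.0 / 3.0)
    F_hat = kernel_cdf(y, h)
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [STD_NORMAL.inv_cdf(min(max(F_hat(t), eps), 1.0 - eps)) for t in y]
```

Replacing `STD_NORMAL.inv_cdf` with the quantile function of another continuous distribution would target that distribution instead of the normal. In practice `lam` and `h` would be chosen from the data; here they are left as plain arguments.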
Cite this article
Koekemoer, G., Swanepoel, J.W.H. A semi-parametric method for transforming data to normality. Stat Comput 18, 241–257 (2008). https://doi.org/10.1007/s11222-008-9053-3
DOI: https://doi.org/10.1007/s11222-008-9053-3