Abstract
We propose a methodology for informative goodness-of-fit testing that combines the merits of hypothesis testing and nonparametric density estimation. In particular, we construct a data-driven smooth test that selects the model using a weighted integrated squared error (WISE) loss function. When the null hypothesis is rejected, we suggest plotting the estimate of the selected model. This estimate is optimal in the sense that it minimises the WISE loss function. The procedure may be particularly helpful when the components of the smooth test are not diagnostic for detecting moment deviations. Although the approach relies mostly on the existing theory of (generalised) smooth tests and nonparametric density estimation, a few issues must be resolved to make the procedure applicable to a large class of distributions. In particular, we need an estimator of the variance of the smooth test components that is consistent over a large class of distributions for which the nuisance parameters are estimated by the method of moments. This estimator may also be used to construct diagnostic component tests.
The properties of the new variance estimator, the new diagnostic components and the proposed informative testing procedure are evaluated in several simulation studies. We demonstrate the new methods on testing for the logistic and extreme value distributions.
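To make the idea of a data-driven choice of terms concrete, the following is a minimal sketch, not the procedure of the paper. It assumes the simplest possible setting: a fully specified uniform null on [0, 1], orthonormal shifted Legendre polynomials as the basis, and a generic unweighted integrated squared error rule that keeps term j when the squared coefficient estimate exceeds twice its estimated variance. The function names (`orthonormal_legendre`, `select_terms`, `density_estimate`) and the cap `max_order` are illustrative choices, and the WISE criterion and the method-of-moments variance estimator described above are not reproduced here.

```python
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(x, j):
    """Degree-j shifted Legendre polynomial, orthonormal on [0, 1]."""
    coef = np.zeros(j + 1)
    coef[j] = 1.0
    x = np.asarray(x, dtype=float)
    # P_j(2x - 1) has squared norm 1 / (2j + 1) on [0, 1], hence the scaling.
    return np.sqrt(2 * j + 1) * legendre.legval(2.0 * x - 1.0, coef)


def select_terms(x, max_order=6):
    """Keep term j if theta_hat_j^2 > 2 * estimated Var(theta_hat_j).

    This is a generic (unweighted) integrated squared error rule,
    used here only as an assumed stand-in for a WISE-type criterion.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    theta = {}
    for j in range(1, max_order + 1):
        h = orthonormal_legendre(x, j)
        theta_hat = h.mean()            # coefficient estimate
        var_hat = h.var(ddof=1) / n     # empirical variance of the estimate
        if theta_hat**2 > 2.0 * var_hat:
            theta[j] = theta_hat
    return theta


def density_estimate(grid, theta):
    """Orthogonal series estimate: null density (1) plus selected terms."""
    grid = np.asarray(grid, dtype=float)
    est = np.ones_like(grid)
    for j, t in theta.items():
        est += t * orthonormal_legendre(grid, j)
    return est


# Illustration on simulated non-uniform data (seed chosen arbitrarily).
rng = np.random.default_rng(0)
sample = rng.beta(2.0, 5.0, size=500)
theta = select_terms(sample)
print("selected orders:", sorted(theta))
```

Plotting `density_estimate` over a grid on [0, 1] would give the kind of picture the abstract suggests inspecting after a rejection. A series estimate of this form can dip below zero, so a practical version would also need a correction step to obtain a bona fide density.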
Cite this article
Thas, O., Rayner, J.C.W., Best, D.J. et al. Informative Statistical Analyses Using Smooth Goodness of Fit Tests. J Stat Theory Pract 3, 705–733 (2009). https://doi.org/10.1080/15598608.2009.10411955
DOI: https://doi.org/10.1080/15598608.2009.10411955