A nonparametric assessment of model adequacy based on Kullback-Leibler divergence

Statistics and Computing

Abstract

A discrepancy measure to assess model fitness against a nonparametric alternative is proposed. First, a Polya tree prior is constructed so that the centering distribution is the null. Second, the prior is updated in the light of data to obtain the posterior centering distribution as the alternative. Third, a Kullback-Leibler divergence type of test statistic is derived to assess the discrepancy between the two centering distributions. The properties of the test statistic are derived, and a power comparison with several well-known test statistics is conducted. The use of the test statistic is illustrated using network traffic data.
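
The three steps in the abstract can be illustrated with a minimal sketch. It is not the paper's test statistic, whose exact form is not given here. It assumes a finite-depth Polya tree on dyadic quantile intervals of the null, branch parameters c·m² at level m (a common choice in the Polya tree literature), and a standard normal null. The names `normal_cdf`, `posterior_bin_probs`, `kl_to_null`, `depth`, and `c` are hypothetical.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the N(mu, sigma^2) null model (assumed null for this sketch)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def posterior_bin_probs(data, null_cdf, depth=4, c=1.0):
    """Posterior-mean probabilities of the 2**depth finest Polya tree bins.

    Bins are dyadic quantile intervals of the null, so the prior mean gives
    each bin probability 2**-depth. Each split has a Beta(alpha, alpha)
    prior with alpha = c * m**2 at level m (an assumed choice).
    """
    # Probability integral transform: under the null, u is Uniform(0, 1).
    u = [min(max(null_cdf(x), 0.0), 1.0 - 1e-12) for x in data]
    probs = [1.0]
    for m in range(1, depth + 1):
        n_bins = 2 ** m
        counts = [0] * n_bins
        for ui in u:
            counts[int(ui * n_bins)] += 1
        alpha = c * m ** 2
        new_probs = []
        for parent, p in enumerate(probs):
            left, right = counts[2 * parent], counts[2 * parent + 1]
            # Posterior mean of the left-branch probability.
            p_left = (alpha + left) / (2.0 * alpha + left + right)
            new_probs.append(p * p_left)
            new_probs.append(p * (1.0 - p_left))
        probs = new_probs
    return probs


def kl_to_null(probs):
    """KL divergence of the posterior bin probabilities from the uniform null bins."""
    q = 1.0 / len(probs)
    return sum(p * math.log(p / q) for p in probs if p > 0.0)


# Usage: data simulated from the null itself.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
print(kl_to_null(posterior_bin_probs(sample, normal_cdf)))
```

With no data the posterior mean equals the null and the divergence is zero; data that depart from the null move mass across bins and increase it. Turning this quantity into a test would still require a reference distribution, which the sketch does not attempt.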

Author information

Corresponding author

Correspondence to Ping-Hung Hsieh.

About this article

Cite this article

Hsieh, PH. A nonparametric assessment of model adequacy based on Kullback-Leibler divergence. Stat Comput 23, 149–162 (2013). https://doi.org/10.1007/s11222-011-9298-0

  • DOI: https://doi.org/10.1007/s11222-011-9298-0
