
Statistics and Computing, Volume 23, Issue 2, pp 149–162

A nonparametric assessment of model adequacy based on Kullback-Leibler divergence

  • Ping-Hung Hsieh

Abstract

A discrepancy measure to assess model fitness against a nonparametric alternative is proposed. First, a Polya tree prior is constructed so that the centering distribution is the null. Second, the prior is updated in the light of data to obtain the posterior centering distribution as the alternative. Third, a Kullback-Leibler divergence type of test statistic is derived to assess the discrepancy between the two centering distributions. The properties of the test statistic are derived, and a power comparison with several well-known test statistics is conducted. The use of the test statistic is illustrated using network traffic data.
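The three steps above can be illustrated with a minimal sketch. It is not the paper's test statistic; it is a simplified, discretized analogue under several assumptions that do not come from the abstract: the data have already been mapped to [0, 1) by the null CDF (so the null is uniform), the tree is truncated at a finite `depth`, the partition is dyadic, and the Beta parameters at level m are `c * m**2`. The function name `polya_tree_kl` and all parameter names are hypothetical.

```python
import math


def polya_tree_kl(u, depth=4, c=1.0):
    """Discretized KL divergence between posterior-mean and prior-mean
    bin probabilities of a finite Polya tree centered on Uniform(0, 1).

    u     : observations already transformed by the null CDF (assumption)
    depth : number of tree levels kept (assumption: finite truncation)
    c     : prior precision; level-m Beta parameters are c * m**2 (assumption)
    """
    n_bins = 2 ** depth

    # Count observations in each leaf (dyadic interval of width 2**-depth).
    counts = [0] * n_bins
    for x in u:
        counts[min(int(x * n_bins), n_bins - 1)] += 1

    # Posterior mean of each leaf probability: product of posterior-mean
    # branch probabilities along the path from the root to that leaf.
    post = []
    for leaf in range(n_bins):
        p = 1.0
        for m in range(1, depth + 1):
            alpha = c * m * m
            width = 2 ** (depth - m)      # leaves inside one level-m set
            child = leaf // width         # level-m set containing this leaf
            parent = child // 2           # its level-(m-1) parent set
            n_child = sum(counts[child * width:(child + 1) * width])
            n_parent = sum(counts[parent * 2 * width:(parent + 1) * 2 * width])
            # Beta(alpha, alpha) prior updated by the branch counts.
            p *= (alpha + n_child) / (2 * alpha + n_parent)
        post.append(p)

    # Under the prior, every leaf has mean probability 2**-depth (the null).
    prior = 1.0 / n_bins
    kl = sum(p * math.log(p / prior) for p in post)
    return kl, post
```

With no data the posterior mean equals the prior mean and the divergence is zero; data that pile up in one region of [0, 1) pull the posterior mean away from the null and the divergence grows. A larger `c` makes the prior harder to move, which shrinks the divergence for a fixed sample.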

Keywords

Goodness of fit; Nonparametric alternative; Packet train; Polya tree; Teletraffic data



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. College of Business, Oregon State University, Corvallis, USA
