AStA Advances in Statistical Analysis, Volume 94, Issue 3, pp 247–271

Symmetric and asymmetric rounding: a review and some new results

Original Paper

Abstract

Using rounded data to estimate moments and regression coefficients typically biases the estimates. We explore the bias-inducing effects of rounding and, in doing so, review widely dispersed and often half-forgotten results in the literature. Under appropriate conditions, these effects can be approximately rectified by versions of Sheppard’s correction formula. We discuss the conditions under which these approximations are valid and also investigate the efficiency loss caused by rounding. The rounding error corresponds to the measurement error of a measurement error model; its marginal distribution can be approximated by the uniform distribution, but it is not independent of the true value. To take account of rounding preferences (heaping), we generalize the concept of simple rounding to that of asymmetric rounding and consider its effect on the mean and variance of a distribution.
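As a rough illustration of the variance effect described in the abstract, the following sketch simulates simple rounding to a grid of width h and applies the classical Sheppard correction, which subtracts h^2/12 from the variance of the rounded data. The normal distribution, the grid width, the sample size, and all function names are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def round_to_grid(x, h):
    """Simple rounding: map x to the nearest multiple of h (ties go up)."""
    return h * math.floor(x / h + 0.5)


def sample_variance(values):
    """Unbiased sample variance computed with plain floats."""
    n = len(values)
    mean = math.fsum(values) / n
    return math.fsum((v - mean) ** 2 for v in values) / (n - 1)


def sheppard_corrected_variance(rounded, h):
    """Variance of the rounded data minus the Sheppard term h^2 / 12."""
    return sample_variance(rounded) - h ** 2 / 12


def demo(n=100_000, sigma=2.0, h=1.0, seed=0):
    # Assumption: true values are N(0, sigma^2); this choice is illustrative only.
    rng = random.Random(seed)
    true_values = [rng.gauss(0.0, sigma) for _ in range(n)]
    rounded = [round_to_grid(x, h) for x in true_values]
    return (
        sample_variance(true_values),             # variance before rounding
        sample_variance(rounded),                 # inflated by roughly h^2 / 12
        sheppard_corrected_variance(rounded, h),  # close to the first value
    )


if __name__ == "__main__":
    true_var, raw_var, corrected_var = demo()
    print(f"true: {true_var:.4f}  rounded: {raw_var:.4f}  corrected: {corrected_var:.4f}")
```

In this setting the grid width is small relative to the standard deviation, so the corrected variance should land close to the variance of the unrounded values. For coarse grids or distributions that are not smooth, the approximation can fail, which is the kind of validity condition the paper discusses.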

Keywords

Simple rounding; Asymmetric rounding; Euler–Maclaurin; Moments; Sheppard’s correction; Maximum likelihood

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Department of Statistics, University of Munich LMU, Munich, Germany
  2. Department of Economic History, University of Munich LMU, Munich, Germany
  3. Institute of Cancer Research, London, UK
