Bounds for Cell Entries in Two-Way Tables Given Conditional Relative Frequencies

  • Aleksandra B. Slavkovic
  • Stephen E. Fienberg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3050)

Abstract

In recent work on statistical methods for confidentiality and disclosure limitation, Dobra and Fienberg (2000, 2003) and Dobra (2002) generalized the Bonferroni-Fréchet-Hoeffding bounds for cell entries in k-way contingency tables given marginal totals. In this paper, we consider extensions of their approach that focus on upper and lower bounds for cell entries given arbitrary sets of marginals and conditionals. We give a complete characterization of the two-way table problem and discuss some implications for statistical disclosure limitation. In particular, we employ tools from computational algebra to describe the locus of all possible tables under the given constraints and discuss how this additional knowledge affects disclosure.
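
As a point of reference for the marginal-totals case mentioned above, the classical Fréchet bounds for a two-way table of counts can be sketched as follows. This is only an illustration of the well-known bounds max(0, r_i + c_j - n) <= n_ij <= min(r_i, c_j); it is not the paper's method for conditionals, and the function name `frechet_bounds` and the example totals are hypothetical.

```python
from typing import List, Tuple


def frechet_bounds(row_totals: List[int], col_totals: List[int]) -> List[List[Tuple[int, int]]]:
    """Return (lower, upper) bounds for every cell of a two-way table of counts.

    Illustrative helper (not from the paper): applies the classical Frechet
    bounds given only the row totals and column totals.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row totals and column totals must sum to the same grand total")
    if any(t < 0 for t in row_totals + col_totals):
        raise ValueError("marginal totals must be non-negative")

    bounds = []
    for r in row_totals:
        row = []
        for c in col_totals:
            lower = max(0, r + c - n)  # the cell cannot be smaller than this
            upper = min(r, c)          # the cell cannot exceed either of its margins
            row.append((lower, upper))
        bounds.append(row)
    return bounds


# Hypothetical 2x2 example: row totals (3, 7), column totals (4, 6), n = 10.
example = frechet_bounds([3, 7], [4, 6])
for row in example:
    print(row)
```

For two-way tables these bounds are sharp, so a narrow interval for a small cell signals a potential disclosure problem when the margins are released.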

Keywords

Confidentiality; Contingency tables; Integer programming; Linear programming; Markov bases; Statistical disclosure control; Tabular data

References

  1. Arnold, B., Castillo, E., Sarabia, J.M.: Specification of distributions by combinations of marginal and conditional distributions. Statistics and Probability Letters 26, 153–157 (1996)
  2. Arnold, B., Castillo, E., Sarabia, J.M.: Conditional Specification of Statistical Models. Springer, New York (1999)
  3. Balke, A., Pearl, J.: Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association 92(439), 1171–1176 (1997)
  4. Besag, J.: Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B 36(2), 192–236 (1974)
  5. Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge (1975)
  6. Dobra, A.: Statistical Tools for Disclosure Limitation in Multi-Way Contingency Tables. Ph.D. Thesis, Department of Statistics, Carnegie Mellon University (2002)
  7. Karr, A.F., Dobra, A., Sanil, A., Fienberg, S.E.: Software systems of tabular data releases. The International Journal on Uncertainty, Fuzziness and Knowledge-Based Systems 10, 529–544 (2002)
  8. Dobra, A., Fienberg, S.E.: Bounds for cell entries in contingency tables given marginal totals and decomposable graphs. Proceedings of the National Academy of Sciences 97(22), 11885–11892 (2000)
  9. Dobra, A., Fienberg, S.E.: Bounding entries in multi-way contingency tables given a set of marginal totals. In: Haitovsky, Y., Lerche, H.R., Ritov, Y. (eds.) Foundations of Statistical Inference: Proceedings of the Shoresh Conference 2000, pp. 3–16. Springer, Berlin (2003)
  10. Dobra, A., Fienberg, S.E., Trottini, M.: Assessing the risk of disclosure of confidential categorical data. In: Bernardo, J., et al. (eds.) Bayesian Statistics 7, Proceedings of the Seventh Valencia International Meeting on Bayesian Statistics, pp. 125–144. Oxford University Press, Oxford (2003)
  11. Edwards, D.: Introduction to Graphical Modeling, 2nd edn. Springer, New York (2000)
  12. Federal Committee on Statistical Methodology: Report on Statistical Disclosure Limitation Methodology. Statistical Policy Working Paper 22, Subcommittee on Disclosure Limitation Methodology. Office of Management and Budget, Executive Office of the President, Washington, DC (1994), http://ntl.bts.gov/docs/wp22.html
  13. Fienberg, S.E., Makov, U.E., Meyer, M.M., Steele, R.J.: Computing the exact distribution for a multi-way contingency table conditional on its marginal totals. In: Saleh, P.K.M.E. (ed.) Data Analysis from Statistical Foundations: A Festschrift in Honor of the 75th Birthday of D.A.S. Fraser, pp. 145–165. Nova Science Publishers, Huntington (2001)
  14. Gelman, A., Speed, T.P.: Characterizing a joint probability distribution by conditionals. Journal of the Royal Statistical Society. Series B 55(1), 185–188 (1993)
  15. Gelman, A., Speed, T.P.: Corrigendum: Characterizing a joint probability distribution by conditionals. Journal of the Royal Statistical Society. Series B 61(2), 483 (1999)
  16. Gutmann, S., Kemperman, H.J.B., Reeds, J.A., Shepp, L.A.: Existence of probability measures with given marginals. The Annals of Probability 19(4), 1781–1797 (1991)
  17. King, G.: A Solution to the Ecological Inference Problem. Princeton University Press, Princeton (1997)
  18. Lauritzen, S.L.: Graphical Models. Oxford University Press, Oxford (1996)
  19. Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge (2000)
  20. Pistone, J., Riccomagno, E., Wynn, H.P.: Algebraic Statistics - Computational Commutative Algebra in Statistics. Chapman & Hall/CRC, Boca Raton (2001)
  21. Rachev, S.T., Rüschendorf, L.: Mass Transportation Problems, vol. 1&2. Springer, New York (1998)
  22. Slavkovic, A.B.: Markov bases given fixed conditional distributions for two-way contingency tables (2003) (in preparation)
  23. Tian, J., Pearl, J.: Probabilities of causation: Bounds and identification. Technical Report (R-271) (April 2000)
  24. Whittaker, J.: Graphical Models in Applied Mathematical Multivariate Statistics. Wiley, New York (1990)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Aleksandra B. Slavkovic (1)
  • Stephen E. Fienberg (2)
  1. Department of Statistics, Carnegie Mellon University, Pittsburgh, USA
  2. Department of Statistics, Center for Automated Learning and Discovery, Center for Computer Communications and Security, Carnegie Mellon University, Pittsburgh, USA
