Beyond Simpson’s Paradox: One Problem in Data Science

  • Conference paper
Advances in Data Science and Classification

Abstract

This paper discusses, for various cases, the conditions under which Simpson’s paradox does not occur. The conditions are first derived from a descriptive point of view and then under assumed prior probability distributions for the parameters. The robustness of the results with respect to those prior distributions is also discussed. In practical terms, the result is expressed through the magnitude of the odds ratio (or relative risk): Simpson’s paradox does not occur if the odds ratio is greater than or less than a certain value, which depends on the case.
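To make the odds-ratio framing concrete, the following minimal Python sketch computes stratum-specific and pooled odds ratios for a 2x2x2 table and checks whether the direction of association reverses on pooling. The counts, the function names `odds_ratio` and `reverses`, and the reversal check itself are illustrative assumptions, not taken from the paper; the sketch only shows what the paradox looks like and does not reproduce the authors' conditions or thresholds.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table: group 1 (a successes, b failures)
    versus group 2 (c successes, d failures)."""
    return (a * d) / (b * c)


# Illustrative counts (assumed for this sketch, not data from the paper).
# Each stratum is (a, b, c, d) as defined in odds_ratio.
strata = {
    "stratum 1": (81, 6, 234, 36),
    "stratum 2": (192, 71, 55, 25),
}

# Odds ratio within each stratum.
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Collapse over the stratifying variable by summing cell counts.
pooled = tuple(sum(cells[i] for cells in strata.values()) for i in range(4))
pooled_or = odds_ratio(*pooled)


def reverses(stratum_values, pooled_value):
    """True if every stratum odds ratio lies on one side of 1
    and the pooled odds ratio lies on the other side."""
    values = list(stratum_values)
    all_above = all(v > 1 for v in values)
    all_below = all(v < 1 for v in values)
    return (all_above and pooled_value < 1) or (all_below and pooled_value > 1)


simpson = reverses(stratum_ors.values(), pooled_or)

for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.3f}")
print(f"pooled: OR = {pooled_or:.3f}")
print(f"reversal on pooling: {simpson}")
```

With these counts both stratum odds ratios exceed 1 while the pooled odds ratio falls below 1, so the check reports a reversal.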

Copyright information

© 1998 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Hayashi, C., Yamaoka, K. (1998). Beyond Simpson’s Paradox: One Problem in Data Science. In: Rizzi, A., Vichi, M., Bock, HH. (eds) Advances in Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-72253-0_9

  • DOI: https://doi.org/10.1007/978-3-642-72253-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64641-9

  • Online ISBN: 978-3-642-72253-0

  • eBook Packages: Springer Book Archive
