Median Polish with Power Transformations as an Alternative for the Analysis of Contingency Tables with Patient Data

  • Frank Klawonn
  • Katja Crull
  • Akiko Kukita
  • Frank Pessler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7231)


Contingency tables are a very common basis for investigating the effects of different treatments or influences on a disease or on the health state of patients. Many journals put a strong emphasis on p-values to support the validity of results. Therefore, even small contingency tables are analysed by techniques like the t-test or ANOVA. Both methods rest on normality assumptions for the underlying data. For larger data sets, this assumption is not so critical, since the underlying statistics are based on sums of (independent) random variables, which can be assumed to follow approximately a normal distribution, at least for a larger number of summands. For smaller data sets, however, the normality assumption often cannot be justified.

Robust methods like the Wilcoxon-Mann-Whitney U test or the Kruskal-Wallis test often do not lead to statistically significant p-values for small samples. Median polish is a robust alternative for analysing contingency tables that provides much more insight than a p-value alone.
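To make the idea concrete, the following is a minimal sketch of Tukey's median polish in Python with NumPy. It is not the authors' implementation; the function name `median_polish`, its parameters, and the small example table are assumptions made purely for illustration. The procedure decomposes each cell of a two-way table into an overall value, a row effect, a column effect, and a residual by alternately sweeping out row and column medians.

```python
import numpy as np

def median_polish(table, max_iter=10, tol=1e-8):
    """Additive fit: table[i, j] = overall + row[i] + col[j] + resid[i, j]."""
    resid = np.asarray(table, dtype=float).copy()
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        # Sweep row medians out of the residuals into the row effects.
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        # Move the median of the column effects into the overall value.
        shift = np.median(col)
        col -= shift
        overall += shift
        # Sweep column medians out of the residuals into the column effects.
        col_med = np.median(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        # Move the median of the row effects into the overall value.
        shift = np.median(row)
        row -= shift
        overall += shift
        # Stop once a full sweep no longer changes anything noticeably.
        if np.abs(row_med).sum() + np.abs(col_med).sum() < tol:
            break
    return overall, row, col, resid

# Hypothetical 3x3 table (made-up numbers): purely additive except for
# one unusually large value in the top-left cell.
table = np.array([[105.0, 15.0, 25.0],
                  [  6.0, 16.0, 26.0],
                  [  7.0, 17.0, 27.0]])

overall, row_eff, col_eff, resid = median_polish(table)
print("overall:", overall)
print("row effects:", row_eff)
print("column effects:", col_eff)
print("residuals:\n", resid)
```

In this made-up table, the unusual cell ends up isolated in a single large residual while the row and column effects are left essentially undisturbed, which is the kind of insight a single p-value does not provide.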

In this paper, we discuss different ways to apply median polish to contingency tables of medical data and show, using several examples, how to interpret the results. We also introduce a technique based on power transformations to find a suitable transformation of the data before applying median polish.
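The abstract does not state which family of power transformations is used or how the parameter is selected. As an illustration only, the sketch below shows one common family, the scaled (Box-Cox style) power transformation with the logarithm as the limiting case for an exponent of zero. The function name `power_transform` and the parameter name `lam` are assumptions, and no selection criterion for `lam` is implied.

```python
import numpy as np

def power_transform(x, lam):
    """Scaled power transformation (assumed family, not taken from the paper).

    Returns (x**lam - 1) / lam for lam != 0 and log(x) for lam == 0.
    Only defined here for strictly positive data.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("power_transform requires strictly positive values")
    if lam == 0:
        return np.log(x)
    return (x ** lam - 1.0) / lam

# Hypothetical positive measurements (made-up numbers).
values = np.array([1.0, 4.0, 9.0])
for lam in (1.0, 0.5, 0.0):
    print(lam, power_transform(values, lam))
```

A transformed table produced this way could then be passed to a median polish routine; how to pick a suitable exponent is the subject of the paper itself and is not reproduced here.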





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Frank Klawonn (1, 2)
  • Katja Crull (3)
  • Akiko Kukita (4)
  • Frank Pessler (5)
  1. Department of Computer Science, Ostfalia University of Applied Sciences, Wolfenbuettel, Germany
  2. Bioinformatics and Statistics, Helmholtz Centre for Infection Research, Braunschweig, Germany
  3. Department of Molecular Immunology, Helmholtz Centre for Infection Research, Braunschweig, Germany
  4. Department of Microbiology, Saga Medical School, Saga, Japan
  5. Department of Infection Genetics, Helmholtz Centre for Infection Research, Braunschweig, Germany
