Hypothesis Testing and Categorical Data

  • Kenneth Lange
Part of the Statistics for Biology and Health book series (SBH)


Most statistical geneticists are frequentists, and fairly traditional ones at that. In testing statistical hypotheses, they prefer pure significance tests or likelihood ratio tests based on large sample theory. Although one could easily dismiss this conservatism as undue reverence for Karl Pearson and R. A. Fisher, it is grounded in the humble reality of geneticists’ inability to describe precise alternative hypotheses and to impose convincing priors. In the first part of this chapter, we will review by way of example the large sample methods summarized so admirably by Cavalli-Sforza and Bodmer [4] and Elandt-Johnson [8]. Then we will move on to modern elaborations of frequentist tests for contingency tables. The novelty here lies not in geneticists’ inference philosophy, but in designing tests sensitive to certain types of departures from randomness and in computing p-values. Good algorithms permit exact or nearly exact computation of p-values and consequently relieve our anxieties about large sample approximations.


Keywords: Ulcer Patient; Linkage Equilibrium; Ethnic Association; Exact Inference; Large Sample Theory
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  [1] Agresti A (1992) A survey of exact inference for contingency tables. Stat Sci 7:131–177
  [2] Badner JA, Chakravarti A, Wagener DK (1984) A test of nonrandom segregation. Genetic Epidemiology 1:329–340
  [3] Barbour AD, Holst L, Janson S (1992) Poisson Approximation. Oxford University Press, Oxford
  [4] Cavalli-Sforza LL, Bodmer WF (1971) The Genetics of Human Populations. Freeman, San Francisco
  [5] Clarke CA, Price Evans DA, McConnell RB, Sheppard PM (1959) Secretion of blood group antigens and peptic ulcers. Brit Med J 1:603–607
  [6] De Braekeleer M, Smith B (1988) Two methods for measuring the non-randomness of chromosome abnormalities. Ann Hum Genet 52:63–67
  [7] de Vries RRP, Lai A Fat RFM, Nijenhuis LE, van Rood JJ (1976) HLA-linked genetic control of host response to Mycobacterium leprae. Lancet 2:1328–1330
  [8] Elandt-Johnson RC (1971) Probability Models and Statistical Methods in Genetics. Wiley, New York
  [9] Ewens WJ, Griffiths RC, Ethier SN, Wilcox SA, Graves JAM (1992) Statistical analysis of in situ hybridization data: Derivation and use of the z_max test. Genomics 12:675–682
  [10] Fuchs C, Kenett R (1980) A test for detecting outlying cells in the multinomial distribution and two-way contingency tables. J Amer Stat Assoc 75:395–398
  [11] Guo S-W, Thompson E (1992) Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 48:361–372
  [12] Hanash SM, Boehnke M, Chu EHY, Neel JV, Kuick RD (1988) Nonrandom distribution of structural mutants in ethylnitrosourea-treated cultured human lymphoblastoid cells. Proc Natl Acad Sci USA 85:165–169
  [13] Joag-Dev K, Proschan F (1983) Negative association of random variables with applications. Ann Stat 11:286–295
  [14] Kolchin VF, Sevast’yanov BA, Chistyakov VP (1978) Random Allocations. Winston, Washington DC
  [15] Lange K (1993) A stochastic model for genetic linkage equilibrium. Theor Pop Biol 44:129–148
  [16] Lazzeroni LC, Lange K (1997) Markov chains for Monte Carlo tests of genetic equilibrium in multidimensional contingency tables. Ann Stat (in press)
  [17] Mallows CL (1968) An inequality involving multinomial probabilities. Biometrika 55:422–424
  [18] Nijenhuis A, Wilf HS (1978) Combinatorial Algorithms for Computers and Calculators, 2nd ed. Academic Press, New York
  [19] Sandell D (1991) Computing probabilities in a generalized birthday problem. Math Scientist 16:78–82
  [20] Searle AG (1959) A study of variation in Singapore cats. J Genet 56:111–127
  [21] Spielman RS, McGinnis RE, Ewens WJ (1993) Transmission test for linkage disequilibrium: The insulin gene region and Insulin-Dependent Diabetes Mellitus (IDDM). Amer J Hum Genet 52:506–516
  [22] Terwilliger JD, Ott J (1992) A haplotype-based ‘haplotype relative risk’ approach to detecting allelic associations. Hum Hered 42:337–346
  [23] Uhrhammer N, Lange E, Porras E, Naiem A, Chen X, Sheikhavandi S, Chiplunkar S, Yang L, Dandekar S, Liang T, Patel N, Teraoka S, Udar N, Calvo N, Concannon P, Lange K, Gatti RA (1995) Sublocalization of an ataxia-telangiectasia gene distal to D11S384 by ancestral haplotyping of Costa Rican families. Amer J Hum Genet 57:103–111
  [24] Vogel F, Motulsky AG (1986) Human Genetics: Problems and Approaches, 2nd ed. Springer-Verlag, Berlin
  [25] Weir BS, Brooks LD (1986) Disequilibrium on human chromosome 11p. Genetic Epidemiology Suppl 1:177–183

Copyright information

© Springer Science+Business Media New York 1997

Authors and Affiliations

  • Kenneth Lange (1)
  1. Department of Biostatistics and Mathematics, University of Michigan, Ann Arbor, USA
