Mammalian Genome, Volume 17, Issue 6, pp 509–517

A review of statistical methods for expression quantitative trait loci mapping

  • Christina Kendziorski
  • Ping Wang


With high-throughput technologies now widely available, investigators can easily measure thousands of phenotypes for quantitative trait loci (QTL) mapping. Microarray measurements are particularly amenable to QTL mapping, as evidenced by a number of recent studies demonstrating utility across a broad range of biological endeavors. The early success stories have impelled a rapid increase in both the number and complexity of expression QTL (eQTL) experiments. Consequently, there is a need to consider the statistical principles involved in the design and analysis of these experiments and the methods currently being used. In this article we review these principles and methods and discuss the open questions most likely to yield significant progress toward increasing the amount of meaningful information obtained from eQTL mapping experiments.


Quantitative Trait Locus; False Discovery Rate; Quantitative Trait Locus Mapping; Scale Free Network; Recombinant Inbred



The authors thank Alan Attie, Meng Chen, Michael Newton, and Brian Yandell for useful discussions and two anonymous reviewers for comments that improved the manuscript. They also thank Stephanie Ciatti for extra help at home.



Copyright information

© Springer Science+Business Media, Inc. 2006

Authors and Affiliations

  1. Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, USA
  2. Department of Statistics, University of Wisconsin, Madison, USA
