Analyzing Illumina Gene Expression Microarray Data Obtained From Human Whole Blood Cell and Blood Monocyte Samples

  • Alexander Teumer
  • Claudia Schurmann
  • Arne Schillert
  • Katharina Schramm
  • Andreas Ziegler
  • Holger Prokisch
Part of the Methods in Molecular Biology book series (MIMB, volume 1368)


Microarray profiling of gene expression is widely applied in molecular biology and functional genomics. Experimental and technical variation makes both the statistical analysis of single studies and meta-analyses across studies very challenging. Here, we describe the analytical steps required to substantially reduce this variation in gene expression data without affecting true effect sizes. A software pipeline has been established using gene expression data from a total of 3358 whole blood cell and blood monocyte samples, all from three German population-based cohorts and measured on the Illumina HumanHT-12 v3 BeadChip array. In summary, adjustment for a few selected technical factors greatly improved the reliability of gene expression analyses. Such adjustments are particularly important for meta-analyses of different studies.

Key words

  • Gene expression analysis
  • Transcriptomics
  • Human whole blood cells
  • Human monocytes
  • Microarray
  • ILLUMINA HumanHT-12 BeadChip



This work was funded by the European Commission’s Seventh Framework Programme (FP7/2007-2013, HEALTH-F2-2011, grant agreement No. 277984, TIRCON), the BMBF (German Ministry of Education and Research) grants 03IS2061A, 03ZIK012, 01GS0834, 01GS0833, 01GS0831, 01KU0908A, 01KU0908B, 0315536F, and the National Genome Research Network NGFNplus Atherogenomics, the Federal State of Mecklenburg-West Pomerania, the Caché Campus program of the InterSystems GmbH, the Helmholtz Zentrum München (German Research Center for Environmental Health), the State of Bavaria, the German Center for Diabetes Research (DZD e.V.), the State of North-Rhine-Westphalia, the Leibniz Association (WGL Pakt für Forschung und Innovation), the government of Rheinland-Pfalz (“Stiftung Rheinland Pfalz für Innovation”, contract AZ 961–386261/733), the Johannes Gutenberg-University of Mainz, and its contract with Boehringer Ingelheim and PHILIPS Medical Systems, the Agence Nationale de la Recherche, France (contract ANR 09 GENO 106 01), the European Union (HEALTH-2011-278913), and the DZHK (German Centre for Cardiovascular Research).



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Alexander Teumer (1, 2)
  • Claudia Schurmann (2, 3, 4)
  • Arne Schillert (5, 6)
  • Katharina Schramm (7, 8, 9)
  • Andreas Ziegler (5, 6, 10, 11)
  • Holger Prokisch (7, 8, 9)

  1. Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
  2. Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany
  3. The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, USA
  4. The Genetics of Obesity and Related Metabolic Traits Program, Icahn School of Medicine at Mount Sinai, New York, USA
  5. Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Lübeck, Germany
  6. DZHK (German Centre for Cardiovascular Research), partner site Hamburg/Kiel/Lübeck, Lübeck, Germany
  7. Institute of Human Genetics, Helmholtz Zentrum München, Munich, Germany
  8. German Research Center for Environmental Health, Neuherberg, Germany
  9. Institute of Human Genetics, Technical University Munich, Munich, Germany
  10. Center for Clinical Trials, University of Lübeck, Lübeck, Germany
  11. School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
