Bayesian Systems-Based Genetic Association Analysis with Effect Strength Estimation and Omic Wide Interpretation: A Case Study in Rheumatoid Arthritis

Part of the Methods in Molecular Biology book series (MIMB, volume 1142)


In genetic association studies, the phenotypic, clinical, and environmental descriptors often form rich dependency structures. These descriptors may not be standardized, and may encompass multiple disease definitions and clinical endpoints that are only weakly influenced by individual (e.g., genetic) factors. Such loosely defined, complex intermediate clinical phenotypes are typically used in follow-up candidate gene association studies, e.g., after genome-wide analysis, to deepen the understanding of the associations and to estimate effect strength.

This chapter presents a methodology for such scenarios based on probabilistic graphical models, namely Bayesian networks within the Bayesian statistical framework. The method offers systematically scalable, comprehensive, hierarchical hypotheses about multivariate relevance. We discuss its workflow, from data engineering to semantic publication of the results. We give an overview of the construction, visualization, and interpretation of complex hypotheses related to the structural analysis of relevance. Furthermore, we illustrate the use of a dependency model-based relevance measure, which takes the structural properties of the model into account, for quantifying effect strength. Finally, we discuss the “interpretational” or translational challenge of a genetic association study, focusing on the fusion of heterogeneous omic knowledge to reintegrate the results into a genome-wide context.

Key words

Genetic association studies; Detailed phenotyping; Bayesian networks; Bayesian multilevel analysis of relevance; Bayesian structure-based effect strength estimation; Gene prioritization



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Measurement and Information Systems, Budapest University of Technology and Economics, Budapest, Hungary
  2. Department of Genetics, Cell and Immunobiology, Semmelweis University, Budapest, Hungary
  3. Department of Measurement and Information Systems, Budapest University of Technology and Economics (BME), Budapest, Hungary
