Molecular Genetics and Genomics, Volume 287, Issue 9, pp 679–698

RETRACTED ARTICLE: Candidate gene prioritization

  • Ali Masoudi-Nejad
  • Alireza Meshkin
  • Behzad Haji-Eghrari
  • Gholamreza Bidkhori


Candidate gene identification is typically labour intensive, requiring laboratory experiments to corroborate or disprove the hypothesis that a nominated candidate gene is the causative gene. The traditional approach to reducing the number of candidate genes entails fine-mapping studies using markers and pedigrees. Gene prioritization ranks candidate genes by their relevance to the biological process of interest, so that the most promising genes can be selected for further analysis. To date, many computational methods have focused on predicting candidate genes by analysing their inherent sequence characteristics, their similarity to known disease genes, and their functional annotation. In the last decade, several computational tools for prioritizing candidate genes have been proposed. Many of them are web-based tools, while others are standalone applications that are installed and run locally. This review takes a close look at gene prioritization criteria and candidate gene prioritization algorithms, and thus provides a comprehensive synopsis of the subject.


Candidate gene; Gene prioritization; Genetic disorder; Computational tools



We thank Joseph Hannon Bozorgmehr for help with the English editing of the manuscript.



Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Ali Masoudi-Nejad (1)
  • Alireza Meshkin (1)
  • Behzad Haji-Eghrari (1)
  • Gholamreza Bidkhori (1)

  1. Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran