Conceptual Thinking for In Silico Prioritization of Candidate Disease Genes

  • Protocol
In Silico Tools for Gene Discovery

Part of the book series: Methods in Molecular Biology ((MIMB,volume 760))

Abstract

Prioritizing the most likely etiological genes entails defining the set of characteristics that the underlying disease gene is most likely to have, then scoring candidates according to how well they fit this “perfect disease gene” profile. This requires a full understanding of the disease phenotype and its characteristics, together with any available data on the underlying genetics of the disease. Public databases provide enormous and ever-growing amounts of information that can be relevant to the prioritization of etiological genes. Computational approaches allow this information to be retrieved in an automated and exhaustive way, and can therefore facilitate comprehensive mining of it, including its combination with empirically generated data sets, when identifying the most likely candidate disease genes.
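
As an illustration only, the idea of scoring candidates against a “perfect disease gene” profile can be sketched as a weighted feature match. The feature names, weights, and gene symbols below are hypothetical placeholders, not values taken from the chapter, and the scoring rule (weighted fraction of matched profile features) is one simple assumption among many possible schemes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate gene and the annotation terms attached to it."""
    symbol: str
    annotations: frozenset


def score_candidate(candidate, profile):
    """Return the weighted fraction of profile features the candidate matches.

    `profile` maps a feature term to a non-negative weight (hypothetical values).
    """
    total = sum(profile.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight for term, weight in profile.items()
        if term in candidate.annotations
    )
    return matched / total


def prioritize(candidates, profile):
    """Return (symbol, score) pairs, best fit first; ties are broken by symbol."""
    scored = [(c.symbol, score_candidate(c, profile)) for c in candidates]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical "perfect disease gene" profile: feature -> weight.
PROFILE = {
    "expressed_in_affected_tissue": 2.0,
    "in_linkage_interval": 2.0,
    "relevant_pathway": 1.0,
}

# Hypothetical candidates with made-up annotations.
CANDIDATES = [
    Candidate("GENE_A", frozenset({"expressed_in_affected_tissue",
                                   "in_linkage_interval"})),
    Candidate("GENE_B", frozenset({"relevant_pathway"})),
    Candidate("GENE_C", frozenset({"expressed_in_affected_tissue",
                                   "in_linkage_interval",
                                   "relevant_pathway"})),
]

ranking = prioritize(CANDIDATES, PROFILE)
for symbol, score in ranking:
    print(f"{symbol}\t{score:.2f}")
```

In practice the annotations would be retrieved from public databases and combined with empirical data, and the weights would reflect how strongly each characteristic is expected of the disease gene; the sketch shows only the ranking step.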

Acknowledgments

This work was funded by the Medical Research Council of South Africa.

Corresponding author

Correspondence to Nicki Tiffin.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Tiffin, N. (2011). Conceptual Thinking for In Silico Prioritization of Candidate Disease Genes. In: Yu, B., Hinchcliffe, M. (eds) In Silico Tools for Gene Discovery. Methods in Molecular Biology, vol 760. Humana Press. https://doi.org/10.1007/978-1-61779-176-5_11

  • DOI: https://doi.org/10.1007/978-1-61779-176-5_11

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-175-8

  • Online ISBN: 978-1-61779-176-5

  • eBook Packages: Springer Protocols
