Role of In Silico Tools in Gene Discovery

Yu, Bing

doi:10.1007/s12033-008-9134-8

Review
Published: 20 December 2008

Volume 41, pages 296–306, (2009)
Molecular Biotechnology

Bing Yu¹

Abstract

Common complex diseases remain a major health challenge and involve the interaction of multiple genes and environmental factors. Discovering the relevant genes is difficult, although it is known that disease risk can originate from variation in an individual’s genome. Application of in silico tools can significantly improve the detection of genes and variation. Data mining and automated tracking of new knowledge facilitate locus mapping. At the gene search stage, in silico prioritization of candidate genes plays an indispensable role in dealing with linked or associated loci. In silico analysis can also differentiate subtle consequences of coding DNA variants, and it remains the major tool for predicting the potential effects of non-coding DNA variants on gene transcription and/or pre-mRNA splicing.

Acknowledgments

The author thanks Professor Ronald J. Trent and Dr Julia M. Morahan for their helpful discussion and comments on the manuscript. This work was partially supported by the Australian Research Council Discovery Grant DP0452019.

Author information

Authors and Affiliations

Department of Molecular & Clinical Genetics, Royal Prince Alfred Hospital and Central Clinical School, University of Sydney, G48A, Medical Foundation Building (K25), 92-94 Parramatta Road, Camperdown, NSW, 2050, Australia
Bing Yu

Corresponding author

Correspondence to Bing Yu.

About this article

Cite this article

Yu, B. Role of In Silico Tools in Gene Discovery. Mol Biotechnol 41, 296–306 (2009). https://doi.org/10.1007/s12033-008-9134-8

Received: 28 September 2008
Accepted: 01 December 2008
Published: 20 December 2008
Issue Date: March 2009
DOI: https://doi.org/10.1007/s12033-008-9134-8
