Abstract
The genetics of human phenotype variation, and especially the genetic basis of complex human diseases, can be better understood by knowing the functions of single nucleotide polymorphisms (SNPs). The main goal of this work is to predict deleterious non-synonymous SNPs (nsSNPs), so that the number of SNPs screened for association with disease can be reduced to those most likely to alter gene function. Using computational tools, we analyzed SNPs that can alter the expression and function of genes involved in colon cancer. To explore possible relationships between genetic mutation and phenotypic variation, we used several computational tools, namely Sorting Intolerant from Tolerant (SIFT; an evolution-based approach), Polymorphism Phenotyping (PolyPhen; a structure-based approach), PupaSuite, UTRScan and FASTSNP, to prioritize high-risk SNPs in coding regions (exonic non-synonymous SNPs) and non-coding regions (intronic SNPs and exonic SNPs in the 5′ and 3′ untranslated regions (UTRs)). We developed a semi-quantitative relative ranking strategy (applicable when no 3D structure is available) that can be adapted to a priori SNP selection or post hoc evaluation of variants identified in whole-genome scans or within haplotype blocks associated with disease. Lastly, we analyzed haplotype tagging SNPs (htSNPs) in the coding and untranslated regions of all the genes using force tag SNP selection in iHAP. The computational architecture proposed in this review integrates relevant biomedical information sources to provide a systematic analysis of complex diseases. We show a “real world” application of existing bioinformatics tools for SNP analysis in colon cancer.
Cite this article
George Priya Doss, C., Rajasekaran, R., Arjun, P. et al. Prioritization of candidate SNPs in colon cancer using bioinformatics tools: An alternative approach for a cancer biologist. Interdiscip Sci Comput Life Sci 2, 320–346 (2010). https://doi.org/10.1007/s12539-010-0003-3
DOI: https://doi.org/10.1007/s12539-010-0003-3