An interactive bovine in silico SNP database (IBISS)
An interactive bovine in silico SNP (IBISS) database has been created by clustering and aligning bovine EST and mRNA sequences. Approximately 324,000 EST and mRNA sequences were clustered into 29,965 clusters (yielding 48,679 consensus sequences) and 48,565 singletons. Variations detected in the multiple sequence alignment files were passed through a SNP screening regime to identify those more likely to be real polymorphisms than sequencing errors. A small subset of predicted SNPs was validated on a diverse set of bovine DNA samples using PCR amplification and sequencing. Fifty percent of the predicted SNPs in the “putative >1” category were polymorphic in the population sampled. The IBISS database is more than a SNP database; it is also a genomic database containing uniformly annotated predicted gene mRNA and protein sequences, gene structure, and genomic organization information.
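The general idea of screening alignment variations can be illustrated with a minimal sketch. It is not the IBISS screening regime, whose exact rules are not given here. It assumes, purely for illustration, that a variation is more credible when each competing allele is seen in more than one aligned sequence, which is one plausible reading of a "putative >1" style category. The function name `candidate_snps`, the parameter `min_minor_count`, and the toy reads are all hypothetical.

```python
from collections import Counter

def candidate_snps(alignment, min_minor_count=2):
    """Scan the columns of a gapped alignment (equal-length strings).

    A column is reported when at least two different bases are each
    observed min_minor_count or more times. This threshold is an
    illustrative assumption, not the published IBISS criterion.
    """
    hits = []
    for col, bases in enumerate(zip(*alignment)):
        # Count only unambiguous bases; gaps and N are ignored.
        counts = Counter(b for b in bases if b in "ACGT")
        supported = {b: n for b, n in counts.items() if n >= min_minor_count}
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits

# Toy alignment: column 2 has G/A with both alleles seen at least twice;
# column 5 has a single T, which looks more like a sequencing error.
reads = [
    "ACGTAC",
    "ACGTAC",
    "ACATAC",
    "ACATAT",
    "ACGTAC",
]
print(candidate_snps(reads))
```

Lowering `min_minor_count` to 1 would also report the single-read variant in the last column, which shows why a support threshold of this kind trades sensitivity for fewer false positives from sequencing errors.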
Keywords: Genome Browser, Bovine Genome, Putative SNPs, Bovine Sequence, Singleton Sequence
The authors would like to thank the entire Bioinformatics team at CSIRO LI and the staff at Electric Genetics for their assistance in using StackPACK to cluster these sequences. The authors also thank Bill Barendse and James Kijas for helpful and insightful discussions regarding specific areas of this paper. We thank Bill Barendse for cattle DNA samples and access to an ABI 377 DNA sequencer.