Abstract
An interactive bovine in silico SNP (IBISS) database has been created by clustering and aligning bovine EST and mRNA sequences. Approximately 324,000 EST and mRNA sequences were clustered into 29,965 clusters (yielding 48,679 consensus sequences) and 48,565 singletons. A SNP screening regime was applied to the variations detected in the multiple sequence alignment files to determine which SNPs are more likely to be real polymorphisms than sequencing errors. A small subset of predicted SNPs was validated on a diverse set of bovine DNA samples by PCR amplification and sequencing. Fifty percent of the predicted SNPs in the “putative >1” category were polymorphic in the population sampled. The IBISS database is more than a SNP database; it is also a genomic database containing uniformly annotated predicted gene mRNA and protein sequences, gene structure, and genomic organization information.
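As a rough illustration of how a screen over multiple sequence alignment columns might separate likely polymorphisms from isolated sequencing errors, the following minimal sketch flags a column only when its second most common base is seen in at least two reads. This is not the IBISS screening regime itself: the function name `candidate_snps`, the `min_minor_count` threshold, the toy reads, and the reading of "putative >1" as "minor allele supported by more than one sequence" are all assumptions made for the example.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Return (column, base_counts) for alignment columns whose second most
    common base appears in at least `min_minor_count` reads.

    Hypothetical helper: assumes all reads are pre-aligned strings of equal
    length, with '-' (or any non-ACGT character) marking gaps or ambiguity.
    """
    hits = []
    for col in range(len(aligned_reads[0])):
        # Count only unambiguous bases in this column.
        counts = Counter(
            read[col] for read in aligned_reads if read[col] in "ACGT"
        )
        ranked = counts.most_common(2)
        # A variant seen in a single read is treated as a possible
        # sequencing error and is not reported.
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            hits.append((col, dict(counts)))
    return hits


# Toy alignment (illustrative data only, not from the paper).
reads = [
    "ACGTAC",
    "ACGTAC",
    "ACATAC",
    "ACATAC",
    "ACGTTC",
    "AC-TAC",
]

print(candidate_snps(reads))
```

In this toy alignment, column 2 is reported because the minor base A occurs in two reads, while the single T in column 4 is ignored. Lowering `min_minor_count` to 1 would report both columns, at the cost of admitting more probable sequencing errors.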
Acknowledgments
The authors would like to thank the entire Bioinformatics team at CSIRO LI and the staff at Electric Genetics for their assistance in using StackPACK to cluster these sequences. The authors also thank Bill Barendse and James Kijas for helpful and insightful discussions regarding specific areas of this paper. We thank Bill Barendse for cattle DNA samples and access to an ABI 377 DNA sequencer.
Cite this article
Hawken, R.J., Barris, W.C., McWilliam, S. et al. An interactive bovine in silico SNP database (IBISS). Mamm Genome 15, 819–827 (2004). https://doi.org/10.1007/s00335-004-2382-4
DOI: https://doi.org/10.1007/s00335-004-2382-4