Mammalian Genome, Volume 15, Issue 10, pp 819–827

An interactive bovine in silico SNP database (IBISS)

  • Rachel J. Hawken
  • Wesley C. Barris
  • Sean M. McWilliam
  • Brian P. Dalrymple
Original Contributions

Abstract

An interactive bovine in silico SNP (IBISS) database has been created by clustering and aligning bovine EST and mRNA sequences. Approximately 324,000 EST and mRNA sequences were grouped into 29,965 clusters (yielding 48,679 consensus sequences) and 48,565 singletons. A SNP screening regime was applied to the variations detected in the multiple sequence alignment files to determine which SNPs are more likely to be real polymorphisms than sequencing errors. A small subset of predicted SNPs was validated on a diverse set of bovine DNA samples using PCR amplification and sequencing. Fifty percent of the predicted SNPs in the “putative >1” category were polymorphic in the population sampled. The IBISS database is more than a SNP database; it is also a genomic database containing uniformly annotated predicted gene mRNA and protein sequences, gene structure, and genomic organization information.
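
The abstract does not spell out the screening criteria, so the following is only a minimal sketch of the general idea of filtering alignment columns for candidate SNPs. It assumes that a variant is more credible when each allele is seen in more than one aligned sequence, which is one plausible reading of the “putative >1” label and not a confirmed description of the IBISS pipeline. The function name `candidate_snps`, the parameter `min_minor_count`, and the toy alignment are all hypothetical.

```python
from collections import Counter


def candidate_snps(alignment, min_minor_count=2):
    """Return (column, allele_counts) for alignment columns where at least
    two different bases are each seen in min_minor_count or more sequences.

    Assumption: `alignment` is a list of equal-length aligned strings;
    gaps ('-') and ambiguous bases such as 'N' are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for col in range(length):
        # Tally the characters observed in this column across all sequences.
        counts = Counter(seq[col].upper() for seq in alignment)
        # Keep only unambiguous bases with enough supporting sequences.
        supported = {
            base: n
            for base, n in counts.items()
            if base in "ACGT" and n >= min_minor_count
        }
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits


# Toy alignment (hypothetical data): column 2 has G in three reads and T in
# two reads; column 5 has a single A, which looks more like a sequencing error.
toy = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAA"]
print(candidate_snps(toy))
```

With the default threshold only column 2 is reported; lowering `min_minor_count` to 1 would also report column 5, which illustrates why singleton variants are treated with more suspicion.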

Keywords

Genome Browser; Bovine Genome; Putative SNPs; Bovine Sequence; Singleton Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

The authors would like to thank the entire Bioinformatics team at CSIRO LI and the staff at Electric Genetics for their assistance in using StackPACK to cluster these sequences. The authors also thank Bill Barendse and James Kijas for helpful and insightful discussions regarding specific areas of this paper. We thank Bill Barendse for cattle DNA samples and access to an ABI 377 DNA sequencer.

Copyright information

© Springer-Verlag 2004

Authors and Affiliations

  • Rachel J. Hawken (1)
  • Wesley C. Barris (1)
  • Sean M. McWilliam (1)
  • Brian P. Dalrymple (1)
  1. Queensland Bioscience Precinct, CSIRO Livestock Industries, St. Lucia, Australia
