Volume 138, Issue 4, pp 433–451

Rapidly developing functional genomics in ecological model systems via 454 transcriptome sequencing

  • Christopher W. Wheat


Next generation sequencing technology affords new opportunities in ecological genetics. This paper addresses how an ecological genetics research program focused on a phenotype of interest can quickly move from having no genetic resources to having various functional genomic tools. 454 sequencing and its error rates are discussed, followed by a review of de novo transcriptome assemblies that focuses on the first successful de novo assembly, which happens to be in an ecological model system (the Glanville fritillary butterfly). Potential future developments in 454 sequencing are also covered. Particular attention is paid, through a review of relevant studies in both model and non-model systems, to the difficulties ecological geneticists are likely to encounter. Various post-sequencing issues and applications of 454-generated data are presented (e.g. database management, microarray construction, and molecular marker and candidate gene development). How to use species with genomic resources to inform the study of those without is also discussed. In closing, some of the drawbacks of 454 sequencing are presented along with the future prospects of this technology.


454; Transcriptome sequencing; Ecological genetics; Functional genomics; Glanville fritillary butterfly; EST



I would like to thank Jim Marden, Ilkka Hanski, Hans Ellegren, Tom Mitchell-Olds, Cris Vera, Heiko Vogel, Roger Butlin, Scott Edwards, Jessica Hellman, and Juan Galindo for the conversations, experience, and feedback on the ideas presented in this paper. Two anonymous reviewers also provided very insightful feedback and are thanked for their effort. Funding during the writing of this paper was provided by the Academy of Finland (grant numbers 38604 and 44887, Finnish Centre of Excellence Programme, 2006–2011) and by grant EF-0412651 to J. H. Marden and I. Hanski from the US National Science Foundation.



Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  1. Metapopulation Research Group, Department of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
  2. Department of Biology, Pennsylvania State University, University Park, USA
