Quantitative Analysis of Synthesized Nucleic Acid Pools
Experimental evolution of RNA (or DNA) is a powerful method to isolate sequences with useful function (e.g., catalytic RNA), discover fundamental features of the sequence-activity relationship (i.e., the fitness landscape), and map evolutionary pathways or functional optimization strategies. However, the limitations of current sequencing technology create a significant undersampling problem which impedes our ability to measure the true distribution of unique sequences. In addition, synthetic sequence pools contain a non-random distribution of nucleotides. Here, we present and analyze simple models to approximate the true sequence distribution. We also provide tools that compensate for sequencing errors and other biases that occur during sample processing. We describe our implementation of these algorithms in the Galaxy bioinformatics platform.
Keywords: Unique Sequence, Selection Experiment, Fitness Landscape, Adaptor Ligation, Initial Pool
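To make the undersampling problem and the effect of non-random nucleotide composition concrete, the following is a minimal illustrative sketch, not the models or the Galaxy tools described in the abstract. It assumes each position in a synthesized sequence is drawn independently from fixed nucleotide frequencies, and it estimates how many distinct sequences are expected to appear at least once among a given number of molecules or reads. The function names, the independent-site assumption, and the example frequencies are all hypothetical choices made for illustration.

```python
import math


def sequence_probability(seq, freqs):
    """Probability of one specific sequence, assuming independent sites."""
    return math.prod(freqs[base] for base in seq)


def expected_unique(length, freqs, n_molecules):
    """Expected number of distinct sequences seen at least once.

    Sequences are grouped by base composition (a, c, g, t), because every
    sequence with the same composition has the same probability under the
    independent-site assumption. Each frequency must be strictly below 1.
    """
    total = 0.0
    for a in range(length + 1):
        for c in range(length - a + 1):
            for g in range(length - a - c + 1):
                t = length - a - c - g
                # Number of sequences sharing this composition (multinomial).
                n_seqs = math.factorial(length) // (
                    math.factorial(a) * math.factorial(c)
                    * math.factorial(g) * math.factorial(t)
                )
                p = (freqs["A"] ** a * freqs["C"] ** c
                     * freqs["G"] ** g * freqs["T"] ** t)
                # P(seen at least once) = 1 - (1 - p)^n, computed stably.
                p_seen = -math.expm1(n_molecules * math.log1p(-p))
                total += n_seqs * p_seen
    return total


# Hypothetical example values, chosen only for illustration.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
biased = {"A": 0.40, "C": 0.30, "G": 0.20, "T": 0.10}

length = 10                # 4**10 possible sequences
n_reads = 100_000          # far fewer reads than possible sequences
print(4 ** length)
print(expected_unique(length, uniform, n_reads))
print(expected_unique(length, biased, n_reads))
```

Under these assumptions, a biased composition always yields fewer expected distinct sequences than a uniform one for the same number of reads, because probability mass concentrates on sequences rich in the over-represented nucleotides. Grouping by composition keeps the calculation tractable without enumerating all 4^length sequences.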