In the “big data” era, research biologists are faced with analyzing new types of data that usually require some level of computational expertise. A number of programs and pipelines exist, but acquiring the expertise to run them, and then understanding their output, can be a challenge.
The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) has created an end-to-end analysis platform that allows researchers to take their raw reads, assemble a genome, annotate it, and then use a suite of user-friendly tools to compare it to any public data available in the repository. With close to 113,000 bacterial and more than 1,000 archaeal genomes, PATRIC creates a unique research experience with “virtual integration” of private and public data. PATRIC contains many diverse tools and functionalities for exploring both genome-scale and gene expression data, but this chapter focuses on assembly, annotation, and the downstream comparative analysis functionality that is freely available in the resource.