Abstract
With emerging sequencing technologies and falling costs, sequence data generation has expanded from a single individual to thousands of individuals of a species. The terabytes of sequence data generated from thousands of individuals are largely redundant, with the degree of redundancy depending on the level of sequence similarity within the population. Managing such large datasets and building a unique sequence catalogue from so large a population makes the information challenging to analyze, store, and retrieve. In this chapter, we discuss the practical haplotype graph (PHG), which addresses these challenges and retrieves required information such as variants and sequences more efficiently, enabling researchers to manage and assess large genomic data.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Ruperao, P., Gandham, P., Rathore, A. (2022). Construction of Practical Haplotype Graph (PHG) with the Whole-Genome Sequence Data. In: Edwards, D. (eds) Plant Bioinformatics. Methods in Molecular Biology, vol 2443. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2067-0_15
DOI: https://doi.org/10.1007/978-1-0716-2067-0_15
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2066-3
Online ISBN: 978-1-0716-2067-0
eBook Packages: Springer Protocols