Variant Calling from RNA-seq Data Using the GATK Joint Genotyping Workflow


Part of the book series: Methods in Molecular Biology ((MIMB,volume 2493))

Abstract

The Genome Analysis Toolkit (GATK), developed at the Broad Institute, provides state-of-the-art pipelines for germline and somatic variant discovery and genotyping. Unfortunately, the fully validated GATK pipeline for calling variants in RNAseq data is a per-sample workflow that does not include the recent improvements seen in modern workflows, especially the possibility of performing joint genotyping analysis. Here, we describe how modern GATK commands from distinct workflows can be combined to call variants in RNAseq samples. We provide a detailed tutorial that starts with raw RNAseq reads and ends with filtered variants, some of which were shown to be associated with bovine paratuberculosis.
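
As a rough illustration of how per-sample RNAseq steps might be chained with joint genotyping, the sketch below only assembles command lines and does not run them. It is not the chapter's own protocol: the tool order follows the publicly documented GATK RNAseq and germline workflows, and every file name, sample name, interval, workspace name, and filter threshold is a hypothetical placeholder to adapt.

```python
import shlex
from typing import List


def per_sample_commands(sample: str, ref: str, star_index: str,
                        r1: str, r2: str) -> List[List[str]]:
    """Build (not run) the per-sample steps: align, dedup, split, call GVCF."""
    prefix = f"{sample}."
    # STAR appends this fixed suffix to the prefix for sorted BAM output.
    aligned = f"{prefix}Aligned.sortedByCoord.out.bam"
    dedup = f"{sample}.dedup.bam"
    split = f"{sample}.split.bam"
    return [
        # Two-pass alignment; read groups are attached because GATK needs them.
        ["STAR", "--genomeDir", star_index, "--readFilesIn", r1, r2,
         "--readFilesCommand", "zcat", "--twopassMode", "Basic",
         "--outSAMtype", "BAM", "SortedByCoordinate",
         "--outSAMattrRGline", f"ID:{sample}", f"SM:{sample}", "PL:ILLUMINA",
         "--outFileNamePrefix", prefix],
        ["gatk", "MarkDuplicates", "-I", aligned, "-O", dedup,
         "-M", f"{sample}.dup_metrics.txt"],
        # Splits reads spanning introns (N in the CIGAR) into exon segments.
        ["gatk", "SplitNCigarReads", "-R", ref, "-I", dedup, "-O", split],
        # -ERC GVCF produces the per-sample intermediate for joint genotyping.
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", split,
         "-O", f"{sample}.g.vcf.gz", "-ERC", "GVCF",
         "--dont-use-soft-clipped-bases"],
    ]


def joint_commands(samples: List[str], ref: str, interval: str,
                   workspace: str = "rnaseq_db") -> List[List[str]]:
    """Build (not run) the cohort steps: consolidate, genotype, hard-filter."""
    consolidate = ["gatk", "GenomicsDBImport",
                   "--genomicsdb-workspace-path", workspace, "-L", interval]
    for sample in samples:
        consolidate += ["-V", f"{sample}.g.vcf.gz"]
    genotype = ["gatk", "GenotypeGVCFs", "-R", ref,
                "-V", f"gendb://{workspace}", "-O", "cohort.vcf.gz"]
    # Example thresholds only; tune them for the data set at hand.
    hard_filter = ["gatk", "VariantFiltration", "-R", ref,
                   "-V", "cohort.vcf.gz", "-O", "cohort.filtered.vcf.gz",
                   "--cluster-window-size", "35", "--cluster-size", "3",
                   "--filter-name", "FS", "--filter-expression", "FS > 30.0",
                   "--filter-name", "QD", "--filter-expression", "QD < 2.0"]
    return [consolidate, genotype, hard_filter]


if __name__ == "__main__":
    samples = ["cow01", "cow02"]  # hypothetical sample names
    for s in samples:
        for cmd in per_sample_commands(s, "ref.fa", "star_index",
                                       f"{s}_R1.fastq.gz", f"{s}_R2.fastq.gz"):
            print(shlex.join(cmd))
    for cmd in joint_commands(samples, "ref.fa", "chr1"):
        print(shlex.join(cmd))
```

Building the commands as lists keeps the per-sample and cohort stages visibly separate: the first four steps repeat for each sample, while consolidation, genotyping, and filtering run once over all GVCFs, which is what distinguishes joint genotyping from a per-sample workflow.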

Corresponding author

Correspondence to Jean-Simon Brouard.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Brouard, JS., Bissonnette, N. (2022). Variant Calling from RNA-seq Data Using the GATK Joint Genotyping Workflow. In: Ng, C., Piscuoglio, S. (eds) Variant Calling. Methods in Molecular Biology, vol 2493. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2293-3_13

  • DOI: https://doi.org/10.1007/978-1-0716-2293-3_13

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-2292-6

  • Online ISBN: 978-1-0716-2293-3

  • eBook Packages: Springer Protocols
