Poly(A)-Tag Deep Sequencing Data Processing to Extract Poly(A) Sites

Part of the Methods in Molecular Biology book series (MIMB, volume 1255)

Abstract

Polyadenylation [poly(A)] is an essential posttranscriptional processing step in the maturation of eukaryotic mRNA. Next-generation sequencing (NGS) technology has made it feasible to generate large-scale data for the intensive study of polyadenylation, particularly through deep sequencing of the transcriptome that targets the junction between the 3′-UTR and the poly(A) tail of the transcript. To take advantage of this unprecedented amount of data, we present an automated workflow that identifies polyadenylation sites by integrating NGS data cleaning, processing, mapping, normalizing, and clustering. In this pipeline, a series of Perl scripts is integrated to iteratively map single- or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same genome coordinate are grouped into one cleavage site, and internal priming artifacts are removed. An ambiguous region is then introduced to parse the genome annotation for cleavage site clustering. Finally, cleavage sites that lie within 24 nucleotides of one another, including sites from different samples, are clustered into poly(A) clusters. This procedure can be used to identify thousands of reliable poly(A) clusters from millions of NGS sequences in different tissues or treatments.
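
The grouping and clustering steps described above can be illustrated with a minimal Python sketch. It is not the authors' Perl pipeline: the function names, the tuple layout for a mapped PAT, and the chaining rule (a site joins the current cluster when it lies at most 24 nucleotides from the last site in that cluster, on the same chromosome and strand) are assumptions made for illustration. Mapping, internal priming removal, and annotation parsing are left out.

```python
from collections import Counter
from typing import Iterable, NamedTuple

# Hypothetical representation of a mapped PAT: (chromosome, strand, coordinate).
Site = tuple[str, str, int]


class Cluster(NamedTuple):
    chrom: str
    strand: str
    start: int      # coordinate of the first cleavage site in the cluster
    end: int        # coordinate of the last cleavage site in the cluster
    tag_count: int  # total number of PATs supporting the cluster


def group_pats(pats: Iterable[Site]) -> Counter:
    """Collapse PATs that share a genome coordinate into one cleavage site."""
    return Counter(pats)


def cluster_sites(sites: Counter, max_gap: int = 24) -> list[Cluster]:
    """Chain neighbouring cleavage sites into poly(A) clusters.

    Assumption: a site extends the current cluster when its distance to the
    cluster's last site is <= max_gap; the source only states the 24 nt range.
    """
    clusters: list[Cluster] = []
    # Sorting the keys orders sites by chromosome, then strand, then coordinate.
    for (chrom, strand, pos), count in sorted(sites.items()):
        last = clusters[-1] if clusters else None
        if (
            last is not None
            and last.chrom == chrom
            and last.strand == strand
            and pos - last.end <= max_gap
        ):
            clusters[-1] = last._replace(end=pos, tag_count=last.tag_count + count)
        else:
            clusters.append(Cluster(chrom, strand, pos, pos, count))
    return clusters


if __name__ == "__main__":
    # PATs pooled from several samples before clustering (made-up coordinates).
    pats = [
        ("Chr1", "+", 100),
        ("Chr1", "+", 100),
        ("Chr1", "+", 110),
        ("Chr1", "+", 200),
        ("Chr1", "-", 105),
    ]
    for cluster in cluster_sites(group_pats(pats)):
        print(cluster)
```

Pooling PATs from all samples before clustering is one simple way to let sites from different samples fall into the same cluster; per-sample tag counts could be tracked by adding a sample label to the tuple.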

Key words

  • Polyadenylation site
  • Next-generation sequencing
  • Genomic data
  • Poly(A) clusters
  • Bioinformatic processing
  • PAT-seq

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 61174161 and 61304141), the Natural Science Foundation of Fujian Province of China (No. 2012J01154), the Specialized Research Fund for the Doctoral Program of Higher Education of China (Nos. 20130121130004 and 20120121120038), the Fundamental Research Funds for the Central Universities in China (Xiamen University: No. 2013121025), the Xiamen Shuangbai Talent Plan (to QQL), and the US National Science Foundation (grant nos. IOS-0817829 and IOS-1353354 to QQL).

Author information

Corresponding author

Correspondence to Xiaohui Wu.

Copyright information

© 2015 Springer Science+Business Media New York

About this protocol

Cite this protocol

Wu, X., Ji, G., Li, Q.Q. (2015). Poly(A)-Tag Deep Sequencing Data Processing to Extract Poly(A) Sites. In: Hunt, A., Li, Q. (eds) Polyadenylation in Plants. Methods in Molecular Biology, vol 1255. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-2175-1_4

  • DOI: https://doi.org/10.1007/978-1-4939-2175-1_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-2174-4

  • Online ISBN: 978-1-4939-2175-1

  • eBook Packages: Springer Protocols