PSIM: pattern-based read simulator for RNA-seq analysis

Abstract

Mapping tools are fundamental to applications of next-generation sequencing (NGS) technologies. These tools are evaluated by how accurately they map each read back to its original location. Such evaluation requires a simulator that generates reads together with their true locations and errors, such as indels. In this paper, we propose a simulator, PSIM, that generates a set of artificial RNA segments (reads) with expression levels and errors based on a pattern-based SAM file. PSIM adopts the contour line transpose and interval section shuffle methods to generate a similar expression level. In addition, we show the similarity between a profile contour of the synthesized data and a reference sequence.
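To make the general idea of a read simulator concrete, the following is a minimal Python sketch. It is not PSIM's implementation and does not reproduce the contour line transpose or interval section shuffle methods, which the abstract names but does not describe. It only illustrates the basic loop: sample start positions from a per-position weight profile (a stand-in for an expression level), cut reads from a reference, inject substitution and indel errors, and record the true location so a mapper can be scored against it. The names `simulate_reads`, `add_errors`, `weights`, and `error_rate` are hypothetical.

```python
import random

BASES = "ACGT"

def add_errors(read, error_rate, rng):
    """Return (mutated read, error events).

    Each base independently receives a substitution, insertion, or
    deletion with probability error_rate (an assumed, uniform error model).
    """
    out = []
    events = []
    for offset, base in enumerate(read):
        if rng.random() >= error_rate:
            out.append(base)          # base is copied unchanged
            continue
        kind = rng.choice(["sub", "ins", "del"])
        events.append((offset, kind))
        if kind == "sub":
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "ins":
            out.append(base)
            out.append(rng.choice(BASES))
        # "del": nothing is emitted for this base
    return "".join(out), events

def simulate_reads(reference, weights, n_reads, read_length, error_rate, seed=0):
    """Sample reads whose start positions follow the given weight profile."""
    rng = random.Random(seed)
    n_starts = len(reference) - read_length + 1
    starts = rng.choices(range(n_starts), weights=weights[:n_starts], k=n_reads)
    reads = []
    for start in starts:
        true_seq = reference[start:start + read_length]
        seq, events = add_errors(true_seq, error_rate, rng)
        # The true start is kept so a mapper's output can be checked against it.
        reads.append({"start": start, "seq": seq, "errors": events})
    return reads
```

A mapping tool could then be scored by comparing the position it reports for each simulated read with the recorded `start` value.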

Acknowledgments

This research was supported by a grant from the KRIBB Research Initiative Program.

Author information

Corresponding authors

Correspondence to Sang-min Lee or Do-Hoon Lee.

About this article

Cite this article

Lee, Sm., Tak, H., Park, K. et al. PSIM: pattern-based read simulator for RNA-seq analysis. Multimed Tools Appl 74, 6465–6480 (2015). https://doi.org/10.1007/s11042-014-2108-x

Keywords

  • RNA-seq
  • Read simulator
  • Bioinformatics