Statistical and Bioinformatics Analysis of Data from Bulk and Single-Cell RNA Sequencing Experiments

  • Protocol
Translational Bioinformatics for Therapeutic Development

Part of the book series: Methods in Molecular Biology (MIMB, volume 2194)

Abstract

High-throughput sequencing (HTS) has revolutionized researchers’ ability to study the human transcriptome, particularly as it relates to cancer. Recently, HTS technology has advanced to the point where individual cells can be sequenced (i.e., “single-cell sequencing”). Before single-cell sequencing technology, HTS was performed on RNA extracted from a tissue sample consisting of multiple cell types (i.e., “bulk sequencing”). In this chapter, we review the bioinformatics and statistical methods used in the processing, quality control, and analysis of bulk and single-cell RNA sequencing data. We also discuss how these methods are being used to study tumor heterogeneity.

References

  1. Muller PA, Vousden KH (2013) p53 mutations in cancer. Nat Cell Biol 15(1):2–8. https://doi.org/10.1038/ncb2641

    Article  CAS  PubMed  Google Scholar 

  2. Baylin SB (2005) DNA methylation and gene silencing in cancer. Nat Clin Pract Oncol 2(Suppl 1):S4–S11. https://doi.org/10.1038/ncponc0354

    Article  CAS  PubMed  Google Scholar 

  3. Perou CM, Sorlie T, Eisen MB et al (2000) Molecular portraits of human breast tumours. Nature 406(6797):747–752. https://doi.org/10.1038/35021093

    Article  CAS  PubMed  Google Scholar 

  4. Cancer Genome Atlas Network (2012) Comprehensive molecular portraits of human breast tumours. Nature 490(7418):61–70

    Article  Google Scholar 

  5. Parker JS, Mullins M, Cheang MC et al (2009) Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 27(8):1160–1167. https://doi.org/10.1200/JCO.2008.18.1370. JCO.2008.18.1370 [pii]

    Article  PubMed  PubMed Central  Google Scholar 

  6. Sorlie T, Perou CM, Tibshirani R et al (2001) Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 98(19):10869–10874. https://doi.org/10.1073/pnas.191367098

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Sorlie T, Tibshirani R, Parker J et al (2003) Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 100(14):8418–8423. https://doi.org/10.1073/pnas.0932692100

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev 10(1):57–63. https://doi.org/10.1038/nrg2484

    Article  CAS  Google Scholar 

  9. Zhu S, Qing T, Zheng Y et al (2017) Advances in single-cell RNA sequencing and its applications in cancer research. Oncotarget 8(32):53763–53779. https://doi.org/10.18632/oncotarget.17893

    Article  PubMed  PubMed Central  Google Scholar 

  10. Bian S, Hou Y, Zhou X et al (2018) Single-cell multiomics sequencing and analyses of human colorectal cancer. Science 362(6418):1060–1063. https://doi.org/10.1126/science.aao3791

    Article  CAS  PubMed  Google Scholar 

  11. Navin NE (2015) Delineating cancer evolution with single-cell sequencing. Sci Transl Med 7(296):296fs229. https://doi.org/10.1126/scitranslmed.aac8319

    Article  CAS  Google Scholar 

  12. Lee MC, Lopez-Diaz FJ, Khan SY et al (2014) Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in cancer cells by RNA sequencing. Proc Natl Acad Sci U S A 111(44):E4726–E4735. https://doi.org/10.1073/pnas.1404656111

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Guo X, Zhang Y, Zheng L et al (2018) Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat Med 24(7):978–985. https://doi.org/10.1038/s41591-018-0045-3

    Article  CAS  PubMed  Google Scholar 

  14. Zheng C, Zheng L, Yoo JK et al (2017) Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell 169(7):1342–1356.e1316. https://doi.org/10.1016/j.cell.2017.05.035

    Article  CAS  PubMed  Google Scholar 

  15. Siegel RL, Miller KD, Jemal A (2019) Cancer statistics, 2019. CA Cancer J Clin 69(1):7–34. https://doi.org/10.3322/caac.21551

    Article  PubMed  Google Scholar 

  16. Cancer Genome Atlas Network (2015) Genomic classification of cutaneous melanoma. Cell 161(7):1681–1696. https://doi.org/10.1016/j.cell.2015.05.044

    Article  CAS  Google Scholar 

  17. Nirschl CJ, Suarez-Farinas M, Izar B et al (2017) IFNgamma-dependent tissue-immune homeostasis is co-opted in the tumor microenvironment. Cell 170(1):127–141.e115. https://doi.org/10.1016/j.cell.2017.06.016

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Gerber T, Willscher E, Loeffler-Wirth H et al (2017) Mapping heterogeneity in patient-derived melanoma cultures by single-cell RNA-seq. Oncotarget 8(1):846–862. https://doi.org/10.18632/oncotarget.13666

    Article  PubMed  Google Scholar 

  19. Kumar MP, Du J, Lagoudas G et al (2018) Analysis of single-cell RNA-Seq identifies cell-cell communication associated with tumor characteristics. Cell Rep 25(6):1458–1468.e1454. https://doi.org/10.1016/j.celrep.2018.10.047

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Tirosh I, Izar B, Prakadan SM et al (2016) Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352(6282):189–196. https://doi.org/10.1126/science.aad0501

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. Picelli S, Bjorklund AK, Faridani OR et al (2013) Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods 10(11):1096–1098. https://doi.org/10.1038/nmeth.2639

    Article  CAS  PubMed  Google Scholar 

  22. Hansen KD, Brenner SE, Dudoit S (2010) Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res 38(12):e131. https://doi.org/10.1093/nar/gkq224

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Benjamini Y, Speed TP (2012) Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res 40(10):e72. https://doi.org/10.1093/nar/gks001

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  24. Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 85(8):2444–2448. https://doi.org/10.1073/pnas.85.8.2444

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Cock PJ, Fields CJ, Goto N et al (2010) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res 38(6):1767–1771. https://doi.org/10.1093/nar/gkp1137

    Article  CAS  PubMed  Google Scholar 

  26. Li H, Handsaker B, Wysoker A et al (2009) The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079. https://doi.org/10.1093/bioinformatics/btp352

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Fuller CW, Middendorf LR, Benner SA et al (2009) The challenges of sequencing by synthesis. Nat Biotechnol 27(11):1013–1023. https://doi.org/10.1038/nbt.1585

    Article  CAS  PubMed  Google Scholar 

  28. Kim D, Pertea G, Trapnell C et al (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14(4):R36. https://doi.org/10.1186/gb-2013-14-4-r36

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Wang K, Singh D, Zeng Z et al (2010) MapSplice: accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Res 38(18):e178. https://doi.org/10.1093/nar/gkq622

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Dobin A, Davis CA, Schlesinger F et al (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29(1):15–21. https://doi.org/10.1093/bioinformatics/bts635

    Article  CAS  PubMed  Google Scholar 

  31. Wu TD, Reeder J, Lawrence M et al (2016) GMAP and GSNAP for genomic sequence alignment: enhancements to speed, accuracy, and functionality. Methods Mol Biol 1418:283–334. https://doi.org/10.1007/978-1-4939-3578-9_15

    Article  PubMed  Google Scholar 

  32. Lunter G, Goodson M (2011) Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res 21(6):936–939. https://doi.org/10.1101/gr.111120.110

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Li H, Ruan J, Durbin R (2008) Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res 18(11):1851–1858. https://doi.org/10.1101/gr.078212.108

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760. https://doi.org/10.1093/bioinformatics/btp324. btp324 [pii]

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9(4):357–359. https://doi.org/10.1038/nmeth.1923

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Trapnell C, Williams BA, Pertea G et al (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5):511–515. https://doi.org/10.1038/nbt.1621

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Pertea M, Pertea GM, Antonescu CM et al (2015) StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33(3):290–295. https://doi.org/10.1038/nbt.3122

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Haas BJ, Papanicolaou A, Yassour M et al (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8(8):1494–1512. https://doi.org/10.1038/nprot.2013.084

    Article  CAS  PubMed  Google Scholar 

  39. Grabherr MG, Haas BJ, Yassour M et al (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29(7):644–652. https://doi.org/10.1038/nbt.1883

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Schulz MH, Zerbino DR, Vingron M et al (2012) Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28(8):1086–1092. https://doi.org/10.1093/bioinformatics/bts094

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12:323. https://doi.org/10.1186/1471-2105-12-323

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Patro R, Mount SM, Kingsford C (2014) Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nat Biotechnol 32(5):462–464. https://doi.org/10.1038/nbt.2862

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30(7):923–930. https://doi.org/10.1093/bioinformatics/btt656

    Article  CAS  PubMed  Google Scholar 

  44. Anders S, Pyl PT, Huber W (2015) HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31(2):166–169. https://doi.org/10.1093/bioinformatics/btu638

    Article  CAS  PubMed  Google Scholar 

  45. Bullard JH, Purdom E, Hansen KD et al (2010) Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11:94. https://doi.org/10.1186/1471-2105-11-94

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Jiang L, Schlesinger F, Davis CA et al (2011) Synthetic spike-in standards for RNA-seq experiments. Genome Res 21(9):1543–1551. https://doi.org/10.1101/gr.121095.111

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Mortazavi A, Williams BA, McCue K et al (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):621–628. https://doi.org/10.1038/nmeth.1226. nmeth.1226 [pii]

    Article  CAS  PubMed  Google Scholar 

  48. Leek JT, Scharpf RB, Bravo HC et al (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev 11(10):733–739. https://doi.org/10.1038/nrg2825

    Article  CAS  Google Scholar 

  49. Leek JT, Storey JD (2007) Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet 3(9):1724–1735. https://doi.org/10.1371/journal.pgen.0030161

    Article  CAS  PubMed  Google Scholar 

  50. Risso D, Ngai J, Speed TP et al (2014) Normalization of RNA-seq data using factor analysis of control genes or samples. Nat Biotechnol 32(9):896–902. https://doi.org/10.1038/nbt.2931

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Hansen KD, Wu Z, Irizarry RA et al (2011) Sequencing technology does not eliminate biological variability. Nat Biotechnol 29(7):572–573. https://doi.org/10.1038/nbt.1910

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11(10):R106. https://doi.org/10.1186/gb-2010-11-10-r106

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  53. Robinson MD, Oshlack A (2010) A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 11(3):R25. https://doi.org/10.1186/gb-2010-11-3-r25

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  54. Smyth GK (2005) limma: linear models for microarray data. In: Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S (eds) Bioinformatics and computational biology solutions using R and Bioconductor. Springer, Berlin, pp 397–420

    Chapter  Google Scholar 

  55. Bolstad BM, Irizarry RA, Astrand M et al (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19(2):185–193

    Article  CAS  PubMed  Google Scholar 

  56. Pickrell JK, Marioni JC, Pai AA et al (2010) Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464(7289):768–772

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Li B, Ruotti V, Stewart RM et al (2010) RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4):493–500. https://doi.org/10.1093/bioinformatics/btp692

    Article  CAS  PubMed  Google Scholar 

  58. Wagner GP, Kin K, Lynch VJ (2012) Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci 131(4):281–285. https://doi.org/10.1007/s12064-012-0162-3

    Article  CAS  PubMed  Google Scholar 

  59. Conesa A, Madrigal P, Tarazona S et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17:13. https://doi.org/10.1186/s13059-016-0881-8

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4:14. https://doi.org/10.1186/1745-6150-4-14

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8(1):118–127. https://doi.org/10.1093/biostatistics/kxj037

    Article  PubMed  Google Scholar 

  62. Karpievitch YV, Nikolic SB, Wilson R et al (2014) Metabolomics data normalization with EigenMS. PLoS One 9(12):e116221. https://doi.org/10.1371/journal.pone.0116221

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Tracy CA, Widom H (1994) Level spacing distributions and the Bessel kernel. Commun Math Phys 161(2):289–309

    Article  Google Scholar 

  64. Johnstone IM (2001) On the distribution of the largest eigenvalue in principal components analysis. Ann Stat 29(2):295–327

    Article  Google Scholar 

  65. Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2(12):e190. https://doi.org/10.1371/journal.pgen.0020190

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Abbas-Aghababazadeh F, Li Q, Fridley BL (2018) Comparison of normalization approaches for gene expression studies completed with high-throughput sequencing. PLoS One 13(10):e0206312. https://doi.org/10.1371/journal.pone.0206312

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  67. Wang L, Feng Z, Wang X et al (2010) DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics 26(1):136–138. https://doi.org/10.1093/bioinformatics/btp612

    Article  CAS  PubMed  Google Scholar 

  68. Langmead B, Hansen KD, Leek JT (2010) Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol 11(8):R83. https://doi.org/10.1186/gb-2010-11-8-r83

    Article  PubMed  PubMed Central  Google Scholar 

  69. Li J, Witten DM, Johnstone IM et al (2012) Normalization, testing, and false discovery rate estimation for RNA-sequencing data. Biostatistics 13(3):523–538. https://doi.org/10.1093/biostatistics/kxr031

    Article  PubMed  Google Scholar 

  70. Auer PL, Doerge RW (2011) A two-stage Poisson model for testing RNA-seq data. Stat Appl Genet Mol Biol 10(1):Article 26

    Article  Google Scholar 

  71. Srivastava S, Chen L (2010) A two-parameter generalized Poisson model to improve the analysis of RNA-seq data. Nucleic Acids Res 38(17):e170. https://doi.org/10.1093/nar/gkq670

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  72. Robinson MD, Smyth GK (2007) Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23(21):2881–2887. https://doi.org/10.1093/bioinformatics/btm453. btm453 [pii]

    Article  CAS  PubMed  Google Scholar 

  73. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1):139–140. https://doi.org/10.1093/bioinformatics/btp616. btp616 [pii]

    Article  CAS  PubMed  Google Scholar 

  74. Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15(12):550. https://doi.org/10.1186/s13059-014-0550-8

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Di Y, Schafer DW, Cumbie JS et al (2011) The NBP negative binomial model for assessing differential gene expression from RNA-Seq. Stat Appl Genet Mol Biol 10(1):24

    Article  Google Scholar 

  76. Zhou YH, Xia K, Wright FA (2011) A powerful and flexible approach to the analysis of RNA sequence count data. Bioinformatics 27(19):2672–2678. https://doi.org/10.1093/bioinformatics/btr449

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Van De Wiel MA, Leday GG, Pardo L et al (2013) Bayesian analysis of RNA sequencing data by estimating multiple shrinkage priors. Biostatistics 14(1):113–128. https://doi.org/10.1093/biostatistics/kxs031

    Article  Google Scholar 

  78. Hardcastle TJ, Kelly KA (2010) baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics 11:422. https://doi.org/10.1186/1471-2105-11-422

    Article  PubMed  PubMed Central  Google Scholar 

  79. Smyth GK (2004) Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol 3:Article 3. https://doi.org/10.2202/1544-6115.1027

    Article  Google Scholar 

  80. Ritchie ME, Phipson B, Wu D et al (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 43(7):e47. https://doi.org/10.1093/nar/gkv007

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Law CW, Chen Y, Shi W et al (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 15(2):R29. https://doi.org/10.1186/gb-2014-15-2-r29

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  82. Li J, Tibshirani R (2013) Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-Seq data. Stat Methods Med Res 22(5):519–536. https://doi.org/10.1177/0962280211428386

    Article  PubMed  Google Scholar 

  83. Tarazona S, Garcia-Alcalde F, Dopazo J et al (2011) Differential expression in RNA-seq: a matter of depth. Genome Res 21(12):2213–2223. https://doi.org/10.1101/gr.124321.111

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  84. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B Methodol 57(1):289–300

    Google Scholar 

  85. Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100(16):9440–9445

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Storey JD (2002) A direct approach to false discovery rates. J R Stat Soc B Methodol 64(Pt. 3):479–498

    Article  Google Scholar 

  87. Bland JM, Altman DG (1995) Multiple significance tests: the Bonferroni method. BMJ 310(6973):170. https://doi.org/10.1136/bmj.310.6973.170

    Article  CAS  PubMed  Google Scholar 

  88. Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70

    Google Scholar 

  89. Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75(4):800–802

    Article  Google Scholar 

  90. Newman AM, Liu CL, Green MR et al (2015) Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 12(5):453–457. https://doi.org/10.1038/nmeth.3337

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Thorsson V, Gibbs DL, Brown SD et al (2018) The immune landscape of cancer. Immunity 48(4):812–830.e814. https://doi.org/10.1016/j.immuni.2018.03.023

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  92. Li T, Fan J, Wang B et al (2017) TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res 77(21):e108–e110. https://doi.org/10.1158/0008-5472.CAN-17-0307

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  93. Aran D, Hu Z, Butte AJ (2017) xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol 18(1):220. https://doi.org/10.1186/s13059-017-1349-1

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  94. Hashimshony T, Wagner F, Sher N et al (2012) CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep 2(3):666–673. https://doi.org/10.1016/j.celrep.2012.08.003

    Article  CAS  PubMed  Google Scholar 

  95. Islam S, Zeisel A, Joost S et al (2014) Quantitative single-cell RNA-seq with unique molecular identifiers. Nat Methods 11(2):163–166. https://doi.org/10.1038/nmeth.2772

    Article  CAS  PubMed  Google Scholar 

  96. Picelli S, Faridani OR, Bjorklund AK et al (2014) Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc 9(1):171–181. https://doi.org/10.1038/nprot.2014.006

    Article  CAS  PubMed  Google Scholar 

  97. Macosko EZ, Basu A, Satija R et al (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161(5):1202–1214. https://doi.org/10.1016/j.cell.2015.05.002

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Zheng GX, Terry JM, Belgrader P et al (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049. https://doi.org/10.1038/ncomms14049

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  99. Ziegenhain C, Vieth B, Parekh S et al (2017) Comparative analysis of single-cell RNA sequencing methods. Mol Cell 65(4):631–643.e634. https://doi.org/10.1016/j.molcel.2017.01.023

    Article  CAS  PubMed  Google Scholar 

  100. Svensson V, Natarajan KN, Ly LH et al (2017) Power analysis of single-cell RNA-sequencing experiments. Nat Methods 14(4):381–387. https://doi.org/10.1038/nmeth.4220

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  101. Ilicic T, Kim JK, Kolodziejczyk AA et al (2016) Classification of low quality cells from single-cell RNA-seq data. Genome Biol 17:29. https://doi.org/10.1186/s13059-016-0888-1

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  102. Lun AT, McCarthy DJ, Marioni JC (2016) A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res 5:2122. https://doi.org/10.12688/f1000research.9501.2

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  103. Satija R, Farrell JA, Gennert D et al (2015) Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33(5):495–502. https://doi.org/10.1038/nbt.3192

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  104. Zhao C, Hu S, Huo X et al (2017) Dr.seq2: a quality control and analysis pipeline for parallel single cell transcriptome and epigenome data. PLoS One 12(7):e0180583. https://doi.org/10.1371/journal.pone.0180583

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  105. McCarthy DJ, Campbell KR, Lun AT et al (2017) Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 33(8):1179–1186. https://doi.org/10.1093/bioinformatics/btw777

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  106. Finak G, McDavid A, Yajima M et al (2015) MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16:278. https://doi.org/10.1186/s13059-015-0844-5

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  107. Lun AT, Bach K, Marioni JC (2016) Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol 17:75. https://doi.org/10.1186/s13059-016-0947-7

    Article  CAS  PubMed  Google Scholar 

  108. Kharchenko PV, Silberstein L, Scadden DT (2014) Bayesian approach to single-cell differential expression analysis. Nat Methods 11(7):740–742. https://doi.org/10.1038/nmeth.2967

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Jiang Y, Zhang NR, Li M (2017) SCALE: modeling allele-specific gene expression by single-cell RNA sequencing. Genome Biol 18(1):74. https://doi.org/10.1186/s13059-017-1200-8

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  110. Liu Z, Lou H, Xie K et al (2017) Reconstructing cell cycle pseudo time-series via single-cell transcriptome data. Nat Commun 8(1):22. https://doi.org/10.1038/s41467-017-00039-z

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  111. McDavid A, Finak G, Gottardo R (2016) The contribution of cell cycle to heterogeneity in single-cell RNA-seq data. Nat Biotechnol 34(6):591–593. https://doi.org/10.1038/nbt.3498

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  112. Wang J, Huang M, Torre E et al (2018) Gene expression distribution deconvolution in single-cell RNA sequencing. Proc Natl Acad Sci U S A 115(28):E6437–E6446. https://doi.org/10.1073/pnas.1721085115

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  113. Vallejos CA, Risso D, Scialdone A et al (2017) Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat Methods 14(6):565–571. https://doi.org/10.1038/nmeth.4292

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  114. Cole MB, Risso D, Wagner A et al (2019) Performance assessment and selection of normalization procedures for single-cell RNA-Seq. Cell Syst 8(4):315–328.e318. https://doi.org/10.1016/j.cels.2019.03.010

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  115. Bacher R, Chu LF, Leng N et al (2017) SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 14(6):584–586. https://doi.org/10.1038/nmeth.4263

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  116. Jia C, Hu Y, Kelly D et al (2017) Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data. Nucleic Acids Res 45(19):10978–10988. https://doi.org/10.1093/nar/gkx754

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  117. Vallejos CA, Marioni JC, Richardson S (2015) BASiCS: Bayesian analysis of single-cell sequencing data. PLoS Comput Biol 11(6):e1004333. https://doi.org/10.1371/journal.pcbi.1004333

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  118. Prabhakaran S, Azizi E, Carr A et al (2016) Dirichlet process mixture model for correcting technical variation in single-cell gene expression data. JMLR Workshop Conf Proc 48:1070–1079

    PubMed  PubMed Central  Google Scholar 

  119. Azizi E, Prabhakaran S, Carr A et al (2017) Bayesian inference for single-cell clustering and imputing. Genomics Comput Biol 3(1):e46. https://doi.org/10.18547/gcb.2017.vol3.iss1.e46

    Article  Google Scholar 

  120. Gong W, Kwak IY, Pota P et al (2018) DrImpute: imputing dropout events in single cell RNA sequencing data. BMC Bioinformatics 19(1):220. https://doi.org/10.1186/s12859-018-2226-y

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  121. Huang M, Wang J, Torre E et al (2018) SAVER: gene expression recovery for single-cell RNA sequencing. Nat Methods 15(7):539–542. https://doi.org/10.1038/s41592-018-0033-z

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  122. Mongia A, Sengupta D, Majumdar A (2019) McImpute: matrix completion based imputation for single cell RNA-seq data. Front Genet 10:9. https://doi.org/10.3389/fgene.2019.00009

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  123. Li WV, Li JJ (2018) An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nat Commun 9(1):997. https://doi.org/10.1038/s41467-018-03405-7

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  124. Linderman GC, Zhao J, Kluger Y (2018) Zero-preserving imputation of scRNA-seq data using low-rank approximation. bioRxiv:397588. https://doi.org/10.1101/397588

  125. Chen C, Wu C, Wu L et al (2018) scRMD: imputation for single cell RNA-seq data via robust matrix decomposition. bioRxiv:459404. https://doi.org/10.1101/459404

  126. van Dijk D, Sharma R, Nainys J et al (2018) Recovering gene interactions from single-cell data using data diffusion. Cell 174(3):716–729.e727. https://doi.org/10.1016/j.cell.2018.05.061

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  127. Ronen J, Akalin A (2018) netSmooth: network-smoothing based imputation for single cell RNA-seq. F1000Res 7:8. https://doi.org/10.12688/f1000research.13511.3

    Article  PubMed  PubMed Central  Google Scholar 

  128. Wagner F, Yan Y, Yanai I (2017) K-nearest neighbor smoothing for high-throughput single-cell RNA-Seq data. bioRxiv:217737. https://doi.org/10.1101/217737

  129. Zhang L, Zhang S (2018) Comparison of computational methods for imputing single-cell RNA-sequencing data. IEEE/ACM Trans Comput Biol Bioinform. https://doi.org/10.1109/TCBB.2018.2848633

  130. Andrews TS, Hemberg M (2018) False signals induced by single-cell imputation. F1000Res 7:1740. https://doi.org/10.12688/f1000research.16613.2

  131. Buettner F, Natarajan KN, Casale FP et al (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol 33(2):155–160. https://doi.org/10.1038/nbt.3102

  132. Katayama S, Tohonen V, Linnarsson S et al (2013) SAMstrt: statistical test for differential expression in single-cell transcriptome with spike-in normalization. Bioinformatics 29(22):2943–2945. https://doi.org/10.1093/bioinformatics/btt511

  133. Ding B, Zheng L, Zhu Y et al (2015) Normalization and noise reduction for single cell RNA-seq experiments. Bioinformatics 31(13):2225–2227. https://doi.org/10.1093/bioinformatics/btv122

  134. Lun ATL, Calero-Nieto FJ, Haim-Vilmovsky L et al (2017) Assessing the reliability of spike-in normalization for analyses of single-cell RNA sequencing data. Genome Res 27(11):1795–1806. https://doi.org/10.1101/gr.222877.117

  135. Vieth B, Parekh S, Ziegenhain C et al (2019) A systematic evaluation of single cell RNA-Seq analysis pipelines: library preparation and normalisation methods have the biggest impact on the performance of scRNA-seq studies. bioRxiv:583013. https://doi.org/10.1101/583013

  136. Buttner M, Miao Z, Wolf FA et al (2019) A test metric for assessing single-cell RNA-seq batch correction. Nat Methods 16(1):43–49. https://doi.org/10.1038/s41592-018-0254-1

  137. Haghverdi L, Lun ATL, Morgan MD et al (2018) Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 36(5):421–427. https://doi.org/10.1038/nbt.4091

  138. Stuart T, Butler A, Hoffman P et al (2018) Comprehensive integration of single cell data. bioRxiv:460147. https://doi.org/10.1101/460147

  139. Kiselev VY, Andrews TS, Hemberg M (2019) Challenges in unsupervised clustering of single-cell RNA-seq data. Nat Rev Genet 20(5):273–282. https://doi.org/10.1038/s41576-018-0088-9

  140. Brennecke P, Anders S, Kim JK et al (2013) Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10(11):1093–1095. https://doi.org/10.1038/nmeth.2645

  141. Fan J, Salathia N, Liu R et al (2016) Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis. Nat Methods 13(3):241–244. https://doi.org/10.1038/nmeth.3734

  142. Usoskin D, Furlan A, Islam S et al (2015) Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing. Nat Neurosci 18(1):145–153. https://doi.org/10.1038/nn.3881

  143. Hyvarinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430

  144. Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396. https://doi.org/10.1162/089976603321780317

  145. Van Der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605

  146. Hicks SC, Townes FW, Teng M et al (2018) Missing data and technical variability in single-cell RNA-sequencing experiments. Biostatistics 19(4):562–578. https://doi.org/10.1093/biostatistics/kxx053

  147. Risso D, Perraudeau F, Gribkova S et al (2018) A general and flexible method for signal extraction from single-cell RNA-seq data. Nat Commun 9(1):284. https://doi.org/10.1038/s41467-017-02554-5

  148. Kobak D, Berens P (2018) The art of using t-SNE for single-cell transcriptomics. bioRxiv:453449. https://doi.org/10.1101/453449

  149. Wattenberg M, Viegas F, Johnson I (2016) How to use t-SNE effectively. Distill.pub. https://doi.org/10.23915/distill.00002

  150. Linderman GC, Rachh M, Hoskins JG et al (2019) Fast interpolation-based t-SNE for improved visualization of single-cell RNA-seq data. Nat Methods 16(3):243–245. https://doi.org/10.1038/s41592-018-0308-4

  151. McInnes L, Healy J, Melville J (2018) UMAP: uniform manifold approximation and projection for dimension reduction. arXiv e-prints

  152. Becht E, McInnes L, Healy J et al (2018) Dimensionality reduction for visualizing single-cell data using UMAP. Nat Biotechnol 37:38. https://doi.org/10.1038/nbt.4314

  153. Tung PY, Blischak JD, Hsiao CJ et al (2017) Batch effects and the effective design of single-cell gene expression studies. Sci Rep 7:39921. https://doi.org/10.1038/srep39921

  154. Andrews TS, Hemberg M (2018) Identifying cell populations with scRNASeq. Mol Asp Med 59:114–122. https://doi.org/10.1016/j.mam.2017.07.002

  155. Navin NE (2014) Cancer genomics: one cell at a time. Genome Biol 15(8):452. https://doi.org/10.1186/s13059-014-0452-9

  156. Wang Y, Navin NE (2015) Advances and applications of single-cell sequencing technologies. Mol Cell 58(4):598–609. https://doi.org/10.1016/j.molcel.2015.05.005

  157. Duo A, Robinson MD, Soneson C (2018) A systematic performance evaluation of clustering methods for single-cell RNA-seq data. F1000Res 7:1141. https://doi.org/10.12688/f1000research.15666.2

  158. Kiselev VY, Kirschner K, Schaub MT et al (2017) SC3: consensus clustering of single-cell RNA-seq data. Nat Methods 14(5):483–486. https://doi.org/10.1038/nmeth.4236

  159. Wang B, Zhu J, Pierson E et al (2017) Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nat Methods 14(4):414–416. https://doi.org/10.1038/nmeth.4207

  160. Grun D, Lyubimova A, Kester L et al (2015) Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525(7568):251–255. https://doi.org/10.1038/nature14966

  161. Zurauskiene J, Yau C (2016) pcaReduce: hierarchical clustering of single cell transcriptional profiles. BMC Bioinformatics 17:140. https://doi.org/10.1186/s12859-016-0984-y

  162. Lin P, Troup M, Ho JW (2017) CIDR: ultrafast and accurate clustering through imputation for single-cell RNA-seq data. Genome Biol 18(1):59. https://doi.org/10.1186/s13059-017-1188-0

  163. Zeisel A, Munoz-Manchado AB, Codeluppi S et al (2015) Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347(6226):1138–1142. https://doi.org/10.1126/science.aaa1934

  164. Guo M, Wang H, Potter SS et al (2015) SINCERA: a pipeline for single-cell RNA-Seq profiling analysis. PLoS Comput Biol 11(11):e1004575. https://doi.org/10.1371/journal.pcbi.1004575

  165. Chen J, Schlitzer A, Chakarov S et al (2016) Mpath maps multi-branching single-cell trajectories revealing progenitor cell progression during development. Nat Commun 7:11988. https://doi.org/10.1038/ncomms11988

  166. Senabouth A, Lukowski SW, Alquicira Hernandez J et al (2017) ascend: R package for analysis of single cell RNA-seq data. bioRxiv:207704. https://doi.org/10.1101/207704

  167. Ester M, Kriegel H-P et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon

  168. Jiang L, Chen H, Pinello L et al (2016) GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biol 17(1):144. https://doi.org/10.1186/s13059-016-1010-4

  169. Trapnell C, Cacchiarelli D, Grimsby J et al (2014) The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32(4):381–386. https://doi.org/10.1038/nbt.2859

  170. Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci U S A 105(4):1118–1123. https://doi.org/10.1073/pnas.0706851105

  171. Blondel VD, Guillaume J-L, Lambiotte R et al (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008

  172. Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80(5):056117. https://doi.org/10.1103/PhysRevE.80.056117

  173. Levine JH, Simonds EF, Bendall SC et al (2015) Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162(1):184–197. https://doi.org/10.1016/j.cell.2015.05.047

  174. Ding J, Shah S, Condon A (2016) densityCut: an efficient and versatile topological approach for automatic clustering of biological data. Bioinformatics 32(17):2567–2576. https://doi.org/10.1093/bioinformatics/btw227

  175. Xu C, Su Z (2015) Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31(12):1974–1980. https://doi.org/10.1093/bioinformatics/btv088

  176. Wolf FA, Angerer P, Theis FJ (2018) SCANPY: large-scale single-cell gene expression data analysis. Genome Biol 19(1):15. https://doi.org/10.1186/s13059-017-1382-0

  177. Baran Y, Sebe-Pedros A, Lubling Y et al (2018) MetaCell: analysis of single cell RNA-seq data using k-NN graph partitions. bioRxiv:437665. https://doi.org/10.1101/437665

  178. Xie P, Gao M, Wang C et al (2019) SuperCT: a supervised-learning framework for enhanced characterization of single-cell transcriptomic profiles. Nucleic Acids Res. https://doi.org/10.1093/nar/gkz116

  179. Aran D, Looney AP, Liu L et al (2019) Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol 20(2):163–172. https://doi.org/10.1038/s41590-018-0276-y

  180. Li J, Smalley I, Schell MJ et al (2017) SinCHet: a MATLAB toolbox for single cell heterogeneity analysis in cancer. Bioinformatics 33(18):2951–2953. https://doi.org/10.1093/bioinformatics/btx297

  181. Ferrall-Fairbanks MC, Ball M, Padron E et al (2019) Leveraging single-cell RNA sequencing experiments to model intratumor heterogeneity. JCO Clin Cancer Informatics 3:1–10. https://doi.org/10.1200/cci.18.00074

  182. Yang X, Liu D, Liu F et al (2013) HTQC: a fast quality control toolkit for Illumina sequencing data. BMC Bioinformatics 14:33. https://doi.org/10.1186/1471-2105-14-33

  183. Patel RK, Jain M (2012) NGS QC Toolkit: a toolkit for quality control of next generation sequencing data. PLoS One 7(2):e30619. https://doi.org/10.1371/journal.pone.0030619

  184. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114–2120. https://doi.org/10.1093/bioinformatics/btu170

  185. Cox MP, Peterson DA, Biggs PJ (2010) SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11:485. https://doi.org/10.1186/1471-2105-11-485

  186. Trapnell C, Pachter L, Salzberg SL (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25(9):1105–1111

  187. Robertson G, Schein J, Chiu R et al (2010) De novo assembly and analysis of RNA-seq data. Nat Methods 7(11):909–912. https://doi.org/10.1038/nmeth.1517

Author information

Corresponding author

Correspondence to Brooke L. Fridley.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Yu, X., Abbas-Aghababazadeh, F., Chen, Y.A., Fridley, B.L. (2021). Statistical and Bioinformatics Analysis of Data from Bulk and Single-Cell RNA Sequencing Experiments. In: Markowitz, J. (eds) Translational Bioinformatics for Therapeutic Development. Methods in Molecular Biology, vol 2194. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0849-4_9

  • DOI: https://doi.org/10.1007/978-1-0716-0849-4_9

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0848-7

  • Online ISBN: 978-1-0716-0849-4

  • eBook Packages: Springer Protocols
