Prokaryotic single-cell RNA sequencing by in situ combinatorial indexing

Blattman, Sydney B.; Jiang, Wenyan; Oikonomou, Panos; Tavazoie, Saeed

doi:10.1038/s41564-020-0729-6

Letter
Published: 25 May 2020

Volume 5, pages 1192–1201, (2020)

Sydney B. Blattman^1,2,3^na1,
Wenyan Jiang^1,2,3^na1,
Panos Oikonomou ORCID: orcid.org/0000-0001-7387-0312^1,2,3 &
Saeed Tavazoie ORCID: orcid.org/0000-0003-2183-4162^1,2,3

Abstract

Despite longstanding appreciation of gene expression heterogeneity in isogenic bacterial populations, affordable and scalable technologies for studying single bacterial cells have been limited. Although single-cell RNA sequencing (scRNA-seq) has revolutionized studies of transcriptional heterogeneity in diverse eukaryotic systems^1,2,3,4,5,6,7,8,9,10,11,12,13, the application of scRNA-seq to prokaryotes has been hindered by their extremely low mRNA abundance^14,15,16, lack of mRNA polyadenylation and thick cell walls^17. Here, we present prokaryotic expression profiling by tagging RNA in situ and sequencing (PETRI-seq), a low-cost, high-throughput prokaryotic scRNA-seq pipeline that overcomes these technical obstacles. PETRI-seq uses in situ combinatorial indexing^11,12,18 to barcode transcripts from tens of thousands of cells in a single experiment. PETRI-seq captures single-cell transcriptomes of Gram-negative and Gram-positive bacteria with high purity and low bias, with median capture rates of more than 200 mRNAs per cell for exponentially growing Escherichia coli. These characteristics enable robust discrimination of cell states corresponding to different phases of growth. When applied to wild-type Staphylococcus aureus, PETRI-seq revealed a rare subpopulation of cells undergoing prophage induction. We anticipate that PETRI-seq will have broad utility in defining single-cell states and their dynamics in complex microbial communities.


**Fig. 2: PETRI-seq captures transcriptomes of single *E. coli* and *S. aureus* cells with high purity and low bias.**

**Fig. 3: PCA distinguishes between exponential- and stationary-phase single *E. coli* cells through mRNA expression patterns.**

Data availability

Raw data have been submitted to the Gene Expression Omnibus under accession number GSE141018. Source data are also provided for all figures. All of the figures except for Fig. 1 include original data. An overview of all of the experiments is provided in Supplementary Table 4. A count matrix for the three primary PETRI-seq experiments is provided in Supplementary Table 6.

Code availability

Relevant code for this manuscript is available from the corresponding author on request; current PETRI-seq code and protocols are available at https://tavazoielab.c2b2.columbia.edu/PETRI-seq/.

References

Tang, F. et al. mRNA-seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
Ramsköld, D. et al. Full-length mRNA-seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
Fan, H. C., Fu, G. K. & Fodor, S. P. A. Expression profiling. Combinatorial labeling of single cells for gene expression cytometry. Science 347, 1258367 (2015).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Bose, S. et al. Scalable microfluidics for single-cell RNA printing and sequencing. Genome Biol. 16, 120 (2015).
Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
Picelli, S. Single-cell RNA-sequencing: the future of genome biology is now. RNA Biol. 14, 637–650 (2016).
Sheng, K., Cao, W., Niu, Y., Deng, Q. & Zong, C. Effective detection of variation in single-cell transcriptomes using MATQ-seq. Nat. Methods 14, 267–270 (2017).
Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).
Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
Taniguchi, Y. et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538 (2010).
Bartholomäus, A. et al. Bacteria differently regulate mRNA abundance to specifically respond to various stresses. Philos. Trans. R. Soc. A 374, 20150069 (2016).
Moran, M. A. et al. Sizing up metatranscriptomics. ISME J. 7, 237–243 (2013).
de Lange, N., Tran, T. M. & Abate, A. R. Electrical lysis of cells for detergent-free droplet assays. Biomicrofluidics 10, 024114 (2016).
Amini, S. et al. Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing. Nat. Genet. 46, 1343–1349 (2014).
Hodson, R. E., Dustman, W. A., Garg, R. P. & Moran, M. A. In situ PCR for visualization of microscale distribution of specific genes and gene products in prokaryotic communities. Appl. Environ. Microbiol. 61, 4074–4082 (1995).
Bloom, J. D. Estimating the frequency of multiplets in single-cell RNA sequencing from cell-mixing experiments. PeerJ 6, e5578 (2018).
Okayama, H. & Berg, P. High-efficiency cloning of full-length cDNA. Mol. Cell. Biol. 2, 161–170 (1982).
Kivioja, T. et al. Counting absolute number of molecules using unique molecular identifiers. Nat. Methods 9, 72–74 (2012).
Yang, S. et al. Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 21, 57 (2020).
Young, M. D. & Behjati, S. SoupX removes ambient RNA contamination from droplet based single-cell RNA sequencing data. Preprint at bioRxiv https://doi.org/10.1101/303727 (2020).
Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417–441 (1933).
Hafemeister, C. & Satija, R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296 (2019).
Gentry, D. R., Hernandez, V. J., Nguyen, L. H., Jensen, D. B. & Cashel, M. Synthesis of the stationary-phase sigma factor σ^s is positively regulated by ppGpp. J. Bacteriol. 175, 7982–7989 (1993).
Almirón, M., Link, A. J., Furlong, D. & Kolter, R. A novel DNA-binding protein with regulatory and protective roles in starved Escherichia coli. Genes Dev. 6, 2646–2654 (1992).
Traxler, M. F. et al. The global, ppGpp-mediated stringent response to amino acid starvation in Escherichia coli. Mol. Microbiol. 68, 1128–1148 (2008).
Chen, H., Shiroguchi, K., Ge, H. & Xie, X. S. Genome-wide study of mRNA degradation and transcript elongation in Escherichia coli. Mol. Syst. Biol. 11, 781 (2015).
Vargas-Garcia, C. A., Ghusinga, K. J. & Singh, A. Cell size control and gene expression homeostasis in single-cells. Curr. Opin. Syst. Biol. 8, 109–116 (2018).
Diep, B. A. et al. Complete genome sequence of USA300, an epidemic clone of community-acquired meticillin-resistant Staphylococcus aureus. Lancet 367, 731–739 (2006).
Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Wheeler, D. L. GenBank. Nucleic Acids Res. 35, D21–D25 (2007).
Saint, M. et al. Single-cell imaging and RNA sequencing reveal patterns of gene expression heterogeneity during fission yeast growth and adaptation. Nat. Microbiol. 4, 480–491 (2019).
Grün, D., Kester, L. & van Oudenaarden, A. Validation of noise models for single-cell transcriptomics. Nat. Methods 11, 637–640 (2014).
Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A. & Tyagi, S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5, 877–879 (2008).
Abraham, J. M., Freitag, C. S., Clements, J. R. & Eisenstein, B. I. An invertible element of DNA controls phase variation of type 1 fimbriae of Escherichia coli. Proc. Natl Acad. Sci. USA 82, 5724–5727 (1985).
Deutsch, D. R. et al. Extra-chromosomal DNA sequencing reveals episomal prophages capable of impacting virulence factor expression in Staphylococcus aureus. Front. Microbiol. 9, 1406 (2018).
Balasubramanian, S., Osburne, M. S., BrinJones, H., Tai, A. K. & Leong, J. M. Prophage induction, but not production of phage particles, is required for lethal disease in a microbiome-replete murine model of enterohemorrhagic E. coli infection. PLoS Pathog. 15, e1007494 (2019).
Blattman, S. B., Jiang, W., Oikonomou, P. & Tavazoie, S. Prokaryotic single-cell RNA sequencing by in situ combinatorial indexing. Preprint at bioRxiv https://doi.org/10.1101/866244 (2019).
Kuchina, A. et al. Microbial single-cell RNA sequencing by split-pool barcoding. Preprint at bioRxiv https://doi.org/10.1101/869248 (2019).
Brauner, A., Fridman, O., Gefen, O. & Balaban, N. Q. Distinguishing between resistance, tolerance and persistence to antibiotic treatment. Nat. Rev. Microbiol. 14, 320–330 (2016).
Girgis, H. S., Harris, K. & Tavazoie, S. Large mutational target size for rapid emergence of bacterial persistence. Proc. Natl Acad. Sci. USA 109, 12740–12745 (2012).
Franzosa, E. A. et al. Sequencing and beyond: integrating molecular ‘omics’ for microbial community profiling. Nat. Rev. Microbiol. 13, 360–372 (2015).
Lee, T. S. et al. BglBrick vectors and datasheets: a synthetic biology platform for gene expression. J. Biol. Eng. 5, 12 (2011).
Zaslaver, A. et al. A comprehensive library of fluorescent transcriptional reporters for Escherichia coli. Nat. Methods 3, 623–628 (2006).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
Smith, T., Heger, A. & Sudbery, I. UMI-tools: modelling sequencing errors in unique molecular identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Santos-Zavaleta, A. et al. RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12. Nucleic Acids Res. 47, D212–D220 (2019).
Taboada, B., Ciria, R., Martinez-Guerrero, C. E. & Merino, E. ProOpDB: Prokaryotic Operon DataBase. Nucleic Acids Res. 40, D627–D631 (2012).
Fu, G. K., Hu, J., Wang, P. & Fodor, S. P. A. Counting individual DNA molecules by the stochastic attachment of diverse labels. Proc. Natl Acad. Sci. USA 108, 9026–9031 (2011).
Tange, O. GNU Parallel 2018 (Ole Tange, 2018).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).
Huang, Y., Sheth, R. U., Kaufman, A. & Wang, H. H. Scalable and cost-effective ribonuclease-based rRNA depletion for transcriptomics. Nucleic Acids Res. 48, e20 (2020).
Armour, C. D. et al. Digital transcriptome profiling using selective hexamer priming for cDNA synthesis. Nat. Methods 6, 647–649 (2009).
He, S. et al. Validation of two ribosomal RNA removal methods for microbial metatranscriptomics. Nat. Methods 7, 807–812 (2010).
Zhulidov, P. A. et al. Simple cDNA normalization using kamchatka crab duplex-specific nuclease. Nucleic Acids Res. 32, e37 (2004).

Acknowledgements

We thank the members of the Tavazoie laboratory for discussions and comments on early drafts of the manuscript; and P. Sims for suggestions during the early development of PETRI-seq. S.T. is supported by award no. 5R01AI077562 from the National Institutes of Health. S.B.B. is supported by a National Science Foundation Graduate Research Fellowship (no. DGE 16-44869). W.J. is supported by a fellowship from the Jane Coffin Childs Fund.

Author information

These authors contributed equally: Sydney B. Blattman, Wenyan Jiang.

Authors and Affiliations

Department of Biological Sciences, Columbia University, New York City, NY, USA
Sydney B. Blattman, Wenyan Jiang, Panos Oikonomou & Saeed Tavazoie
Department of Systems Biology, Columbia University, New York City, NY, USA
Sydney B. Blattman, Wenyan Jiang, Panos Oikonomou & Saeed Tavazoie
Department of Biochemistry and Molecular Biophysics, Columbia University, New York City, NY, USA
Sydney B. Blattman, Wenyan Jiang, Panos Oikonomou & Saeed Tavazoie

Contributions

W.J., S.B.B. and S.T. conceived the study. S.B.B., W.J. and S.T. designed experiments. S.B.B. and W.J. performed experiments and data analysis. P.O. assisted with computational analysis. S.B.B., W.J. and S.T. wrote the paper.

Corresponding author

Correspondence to Saeed Tavazoie.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Experimental and computational pipelines for PETRI-seq.

a–c, Experimental pipeline for PETRI-seq. PETRI-seq libraries can be prepared in just 2.5 days. (a) Detailed schematic of steps for cell preparation, which is started at the end of day 1 and finished on day 2. (b) Detailed schematic of steps for split-pool barcoding, which is entirely done on day 2. (c) Detailed schematic of steps for library preparation, which can be completed (up to sequencing) on day 3 (or later, if preferred). d, Computational pipeline for PETRI-seq analysis after sequencing. e, Structure of contig elements in read 1 after Illumina sequencing of PETRI-seq. To reduce the length of the sequence, barcodes overlap by one base (indicated by asterisk) with the adjacent linker sequence. f, Representative ‘knee plot’ used to select BCs for further analysis. The threshold line at 25,000 BCs is inclusive to facilitate additional filtering after collapsing PCR duplicates to UMIs. g, Representative histogram of reads per UMI. A threshold line was set for each library. For this library, only UMIs with more than 3 reads were kept for downstream analysis. Threshold line at log₁₀(3). h, Species mixing plot with all BCs containing >0 UMIs for library 1.06SaEc. BCs with fewer than 20 UMIs per cell were removed from further analysis. Line segments at x = 20 and y = 20. i, Distribution of E. coli BCs from species mixing plot in h. BCs above the threshold line were used for further analysis and considered single E. coli cells. Threshold line at log₂(20). j,k, PCAs of E. coli (orange) and S. aureus (blue) BCs from library 1.06SaEc. For calculation of principal components, rRNA operons were omitted and counts were normalized and scaled as described in methods. In j, all S. aureus and E. coli BCs with greater than 20 total UMIs and greater than 0 mRNAs are included (13,786 S. aureus, 1,153 E. coli). In k, only BCs with greater than or equal to 15 mRNA UMIs are included (6,683 S. aureus, 800 E. coli). For 100% of S. aureus BCs, PC1 < 0.05, and for 100% of E. coli BCs, PC1 > 4.
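
The two filtering steps described in panels g and h can be illustrated with a minimal sketch. It assumes reads have already been assigned to a (barcode, UMI) pair; the function names, the input layout and the use of total UMIs per barcode are illustrative simplifications, not the authors' pipeline. The default values mirror the thresholds quoted for this library.

```python
from collections import Counter

def filter_umis(reads_per_umi, min_reads_exclusive=3):
    """Keep (barcode, umi) pairs supported by more than `min_reads_exclusive` reads.

    `reads_per_umi` maps (barcode, umi) -> read count (hypothetical layout).
    """
    return {key for key, n_reads in reads_per_umi.items()
            if n_reads > min_reads_exclusive}

def select_barcodes(kept_umis, min_umis=20):
    """Count surviving UMIs per barcode and drop barcodes with fewer than `min_umis`."""
    umis_per_barcode = Counter(barcode for barcode, _umi in kept_umis)
    return {bc: n for bc, n in umis_per_barcode.items() if n >= min_umis}

# Toy input: BC1 has 25 well-supported UMIs plus one weak UMI, BC2 has only 10.
toy_reads = {("BC1", f"umi{i}"): 5 for i in range(25)}
toy_reads[("BC1", "weak")] = 2
toy_reads.update({("BC2", f"umi{i}"): 5 for i in range(10)})

kept = filter_umis(toy_reads)
cells = select_barcodes(kept)
```

In this toy input only BC1 survives both filters; in practice the read threshold would be chosen per library, as the caption notes.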

Source Data

Extended Data Fig. 2 Development and preliminary optimization of PETRI-seq.

a, qPCR after in situ RT with random hexamers shows higher yield of rpsB cDNA from fixation without media (pelleting before) than fixation with media (formaldehyde added to culture) [n = 3 technically independent samples (dots), p = 0.012, 2-sided t-test]. Bars show mean abundance. b, Transcriptome stabilized by RNAprotect after 2-minute spin was highly correlated with transcriptomes stabilized immediately by either RNAprotect or flash freezing. Pearson’s r is reported. c, RNA purified from E. coli cells after 16-hour 4% formaldehyde fixation (‘Fixed Bulk’) was highly correlated with non-fixed RNA (‘Standard Bulk’). 2,617 operons included. Pearson’s r is reported. d, qPCR after in situ RT with rpsB-specific primer (SB10) showed similar yield when cells were resuspended in 50% ethanol (n = 2 technically independent samples). e, qPCR after in situ RT with random hexamers shows improved yield of rpsB cDNA after lysozyme treatment (n = 3 technically independent samples [dots], p = 0.001, 2-sided t-test). Bars show mean abundance. f, qPCR after DNase treatment or incubation with only DNase buffer confirmed in situ DNase treatment efficacy (n = 8 technically independent samples [dots], p = 0.035, 2-sided t-test). Bars show mean abundance. g, qPCR after in situ RT with rpsB-specific primer (SB10) confirmed DNase inactivation, as yield was unchanged (n = 2 technically independent samples [dots]). Bars show mean proportion. h, Gel of 775-bp PCR fragment after 1-hour incubation with DNase-treated cells confirmed DNase inactivation. Right-most lane: DNase was directly added to PCR product. Experiment conducted one time. i, Aggregated PETRI-seq UMIs from DNase-treated and untreated libraries were highly correlated. Pearson’s r reported. j, Bioanalyzer traces of RNA purified after in situ DNase treatment and cell lysis (methods). k, Imaging after E. coli cell preparation. Images for all libraries looked similar (n = 8). 
l, qPCR after bulk RT and ligation (methods) confirmed effective ligation with a 16-base linker. Minor increase (1.5×) in ligation efficiency was detected (p = 0.001, n = 3 technically independent samples [dots], 2-sided t-test). Bars show mean proportion. m, qPCR after in situ RT showed cDNA retention after AMPure purification (n = 4 technically independent samples, p = 0.69, 2-sided t-test). Bars show mean abundance. n,o, Second-strand synthesis yielded more mRNAs and operons per cell (p < 10⁻³⁰⁰, 2-sided Mann-Whitney U) than template switching. 10,000 BCs are included from unoptimized PETRI-seq (Experiment 1.08). Boxplots within violins show interquartile range (black box) and median (white circle).

Source Data

Extended Data Fig. 3 Quantification of intercellular contamination using E. coli and S. aureus cells.

After defining single E. coli and S. aureus cells (Fig. 2b, Experiment 1.06SaEc), we examined levels of cross-contamination within single cells. Similar analysis for Experiment 2.01 is shown in Extended Data Fig. 7c, d. a, Quantification of S. aureus-aligned UMIs assigned to E. coli cells after standard PETRI-seq alignment (edit distance ≤1). Reads mapping equally well to both species are discarded. Bottom: Scatterplots of E. coli UMIs vs. absolute (left) or percent (right) S. aureus UMIs assigned to each E. coli cell. Top: Cumulative distributions corresponding to scatterplots. b, Quantification of E. coli-aligned UMIs assigned to S. aureus cells after standard alignment. Bottom: Scatterplots of S. aureus UMIs vs absolute (left) or percent (right) E. coli UMIs assigned to each S. aureus cell. Top: Cumulative distributions corresponding to scatterplots. c, mRNAs per E. coli cell in a. d, mRNAs per S. aureus cell in b. e,f, Same analysis as (a,b) but using more stringent alignment (edit distance = 0) to better understand source of contamination. g, mRNAs per E. coli cell in e. h, mRNAs per S. aureus cell in f. i,j, To further understand the impact of alignment on apparent cross-contamination, we used stringent alignment to map UMIs for a library of only E. coli (Experiment 1.10). Total UMIs (i) or percent of UMIs (j) assigned to S. aureus were determined after stringent alignment for a PETRI-seq library prepared with only E. coli. S. aureus UMIs are computational artifacts. E. coli cells include a mean of 0.02% S. aureus aligned UMIs, indicating that the majority of interspecies contamination observed in e is not caused by incorrect alignment. To quantify contamination, we needed to correct percentages of inter-species alignment based on species abundance in the library (25% of UMIs aligned to E. coli, 75% S. aureus) to predict the percent of UMIs in a given single-cell derived from any other cell (whether or not the same species). 
We predict a ‘corrected contamination rate’, or percent of UMIs in a single-cell transcriptome derived from another cell, of 0.19–0.36% \(\left(\frac{0.14}{0.75} = 0.19;\ \frac{0.09}{0.25} = 0.36\right)\).
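
The correction above can be written out as a short calculation. This is a sketch under the assumption that contaminating UMIs are drawn from all cells in proportion to each species' share of the library, so only the fraction coming from the other species is detectable by alignment. The function name is hypothetical, and the pairing of each observed percentage with a species is inferred from the denominators in the caption.

```python
def corrected_contamination(observed_pct, other_species_fraction):
    """Scale an observed inter-species percentage to a total contamination estimate.

    observed_pct: percent of a cell's UMIs aligning to the other species.
    other_species_fraction: share of library UMIs from that other species (0-1).
    """
    if not 0 < other_species_fraction <= 1:
        raise ValueError("other_species_fraction must be in (0, 1]")
    return observed_pct / other_species_fraction

# Values quoted in the caption: the library is 75% S. aureus and 25% E. coli UMIs.
rate_vs_s_aureus = corrected_contamination(0.14, 0.75)
rate_vs_e_coli = corrected_contamination(0.09, 0.25)
print(round(rate_vs_s_aureus, 2), round(rate_vs_e_coli, 2))  # 0.19 0.36
```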

Source Data

Extended Data Fig. 4 Further evaluation of PETRI-Seq for E. coli and S. aureus in Experiment 1.06SaEc.

a,b,c, Breakdown of total aligned UMIs (a,b) or reads (c) per cell for PETRI-seq exponential GFP- and RFP-expressing E. coli (a), PETRI-seq exponential S. aureus (b), and bulk exponential wild-type E. coli (c). Left: Stacked bar shows breakdown of sense and anti-sense alignments. Right: Pie shows breakdown of rRNA and mRNA alignments within the sense fraction. d, Distributions of mRNA UMIs (left) and operons (right) per S. aureus cell. 13,785 cells are included. 2 cells were omitted as they contained zero mRNAs. Boxplots within violins show interquartile range (black box) and median (white circle). e, Distributions of mRNA UMIs (left) and operons (right) per E. coli cell in five sub-populations, including GFP cells (contain GFP plasmid transcripts), RFP cells (contain RFP plasmid transcripts), ambiguous cells (contain no plasmid transcripts), and either RFP or GFP and ambiguous cells. Three ambiguous cells classified as E. coli in Fig. 2b were omitted as they contained zero mRNAs. Boxplots within violins show interquartile range (black box) and median (white circle). f, Distribution of total RNAs per GFP-containing exponential E. coli cell. 609 cells are included. g, Left, growth curves for P_rplN-GFP, P_tet-RFP, and MG1655 (no plasmid) cells with and without aTc. Right, doubling times calculated from the growth curves. P_tet-RFP had a significantly longer doubling time than all other strains/conditions when induced with aTc (n = 4; p = 2.2 × 10⁻⁵, 2.5 × 10⁻⁵, 2.1 × 10⁻⁵, 3.6 × 10⁻⁵, 2.6 × 10⁻⁵ [for each sample moving left to right], 2-sided t-test), which might explain fewer mRNA UMIs in these cells.

Source Data

Extended Data Fig. 5 Further evaluation of growth phase characterization by PETRI-seq.

a, PCA of Experiment 1.06 (biological replicate of 1.10) shows that PETRI-seq can reproducibly distinguish between stationary and exponential cells by projecting cells onto the principal components calculated from the first library (bottom). 2,724 cells are included. 1,551 cells are left of the threshold (PC1 = 0.34), and 1,173 cells are right of the threshold. mRNA UMIs captured per cell on either side of the threshold line are shown (top). b, PCA as in Fig. 3b, but UMI counts were normalized using sctransform^26. c, Expression along PC1 (Fig. 3b, Experiment 1.10) of operons with the most positive or negative PC1 loadings (z-scored moving average, size = 1,000 cells). d, Distribution of mRNA UMIs per cell (Experiment 1.10) on either side of the threshold line in Fig. 3b. Grey cells (without plasmid UMIs) are included. Only cells with greater than 14 mRNA UMIs per cell were included, as cells with fewer were excluded from the PCA. 4,878 cells are left of the threshold, and 2,509 cells are right of the threshold. e,f, Breakdown of total aligned UMIs per cell for Experiment 1.10 for cells above and below the PC1 threshold in Fig. 3b. In e, exponential E. coli (above the threshold) are shown and in f, stationary E. coli (below the threshold) are shown. Left: Stacked bar shows breakdown of sense and anti-sense alignments. Right: Pie shows breakdown of rRNA and mRNA alignments within the sense fraction.
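
One way to compute a "z-scored moving average" like the one in panel c is sketched below. The caption gives only the window size, so the order of operations (smooth first, then z-score the smoothed trace) and all function names are assumptions. The input is taken to be one operon's per-cell expression, already sorted by each cell's PC1 value.

```python
from statistics import mean, pstdev

def moving_average(values, window):
    """Mean of every full window of consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]  # slide the window by one cell
        averages.append(running / window)
    return averages

def zscore(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

def zscored_moving_average(expression_sorted_by_pc1, window=1000):
    """Smooth expression along PC1, then z-score the smoothed trace."""
    return zscore(moving_average(expression_sorted_by_pc1, window))

# Toy trace: expression switches on halfway along PC1.
trace = zscored_moving_average([0, 0, 1, 1], window=2)
```

Z-scoring after smoothing puts operons with very different absolute expression on a common scale, which is convenient when overlaying their trends along PC1.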

Source Data

Extended Data Fig. 6 Additional optimization of PETRI-seq by increasing ligation primer concentration and adding detergent during barcoding.

a, Increasing the concentration of round 3 ligation primers by 4x relative to previous experiments (1.06SaEc and 1.10) increases mRNA UMIs per cell 2.7-fold for GFP-expressing exponential (green) and RFP-expressing stationary E. coli cells (red). Boxplots within violins show interquartile range (black box) and median (white circle). b, Adding detergent (tween-20) to cells before ligation 1 and after ligation 3 increased mRNA UMIs per cell 1.4-fold relative to original PETRI-seq for wild-type exponential E. coli cells. Boxplots within violins show interquartile range (black box) and median (white circle). c, With 10x more RT primer relative to original PETRI-seq, we observed a shift in the breakdown of sense/anti-sense and mRNA/rRNA UMIs. Left: Stacked bar shows breakdown of sense and anti-sense alignments. Right: Pie shows breakdown of rRNA and mRNA alignments within the sense fraction. Proportions of anti-sense RNAs and sense rRNAs are significantly increased. We hypothesized that any condition effectively increasing the intracellular concentration of RT primers could lead to this undesirable shift. For this reason, detergent was only ever added after RT to avoid further permeabilizing cells and increasing the effective concentration of RT primer. d, Combining detergent treatment and increased ligation primer (for both rounds) resulted in higher mRNA capture for wild-type exponential E. coli cells. Detergent again increased mRNA UMIs per cell (1.5-fold). Boxplots within violins show interquartile range (black box) and median (white circle). e, Optimized PETRI-seq (4x ligation primer, detergent treatment) resulted in S. aureus transcriptomes with a median of 43 mRNA UMIs per cell (left) and 35 operons per cell (right). Boxplots within violins show interquartile range (black box) and median (white circle). f,g, Breakdown of total aligned UMIs per cell for optimized PETRI-seq (Experiment 2.01) for exponential (f) and stationary E. coli (g). 
Left: Stacked bar shows breakdown of sense and anti-sense alignments. Right: Pie shows breakdown of sense rRNA and mRNA alignments. h,i, Distributions of total UMIs per E. coli (h) and S. aureus (i) BCs in Experiment 2.01. Given higher capture, we imposed higher thresholds for distinguishing cells from background than used previously (Extended Data Fig. 1i). E. coli BCs with more than 128 total UMIs (threshold line in h) and S. aureus BCs with more than 32 total UMIs (threshold line in i) were considered cells.

Source Data

Extended Data Fig. 7 Multiplet frequency and intercellular contamination for optimized PETRI-seq.

a, Species mixing plot for PETRI-seq with 4x ligation primers and no detergent. The multiplet frequency is 0.7%, which is 5-fold higher than the Poisson expectation of 0.14% for 2,423 BCs. b, Species mixing plot for PETRI-seq with 4x ligation primers and detergent (Experiment 2.01). The multiplet frequency is 2.8%, which is 4.7-fold higher than the Poisson expectation of 0.6% for 10,797 BCs. This indicates that compared to no detergent, detergent treatment did not significantly increase multiplet frequency relative to the Poisson expectation. In (a,b), E. coli BCs with > 128 total UMIs and S. aureus BCs with > 32 total UMIs were included. c,d, Quantification of cross-contamination for PETRI-seq with 4x ligation primers and no detergent (c, same experiment as a) or 4x ligation primers and detergent (d, Experiment 2.01 as in b). Scatterplots show the percent of total UMIs for each cell aligned to the incorrect species. Reads were aligned using the stringent alignment (edit distance = 0) described in Extended Data Fig. 3. Top left: Percent of S. aureus UMIs in exponential E. coli cells (based on first round barcode). Top right: Percent of S. aureus UMIs in stationary E. coli cells (based on first round barcode). Bottom left: Percent of E. coli UMIs in S. aureus cells barcoded with exponential E. coli (based on first round barcode). Bottom right: Percent of E. coli UMIs per S. aureus cell barcoded with stationary E. coli (based on first round barcode). As described in Extended Data Fig. 3, we used these inter-species contamination rates to predict a corrected contamination rate (including intra-species contamination). Though higher than the contamination rates observed in the previous species mixing experiment (Extended Data Fig. 3e, f), these rates are comparable to previous findings for eukaryotic scRNA-seq methods^23,24 and are not affected by detergent treatment (c vs. d). 
Furthermore, we anticipate that contamination could be reduced by additional washing prior to cell lysis (see ‘Future directions for optimization’ in Methods).
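
The Poisson expectations quoted for panels a and b can be approximated with a short calculation. The sketch below is not the authors' code: it assumes cells are spread uniformly at random over 96 × 96 × 96 barcode combinations (three rounds of 96-well barcoding is an assumption here, as is the function name `expected_multiplet_fraction`) and reports the fraction of occupied barcodes expected to hold more than one cell.

```python
import math

# Assumption: three rounds of barcoding with 96 barcodes per round.
N_BARCODE_COMBINATIONS = 96 ** 3


def expected_multiplet_fraction(n_observed_bcs, n_combinations=N_BARCODE_COMBINATIONS):
    """Fraction of occupied barcodes expected to hold two or more cells.

    Cells per barcode combination are modelled as Poisson with rate
    lam = n_observed_bcs / n_combinations. Observed BCs stand in for the
    number of cells, which is a close approximation when lam is small.
    """
    lam = n_observed_bcs / n_combinations
    p_occupied = -math.expm1(-lam)      # P(k >= 1) = 1 - exp(-lam)
    p_single = lam * math.exp(-lam)     # P(k == 1)
    return 1.0 - p_single / p_occupied  # P(k >= 2 | k >= 1)


# BC counts and observed multiplet frequencies taken from the caption.
for n_bcs, observed in [(2423, 0.007), (10797, 0.028)]:
    expected = expected_multiplet_fraction(n_bcs)
    print(f"{n_bcs} BCs: expected {expected:.2%}, observed {observed:.1%}")
```

Under these assumptions the expected fractions come out near 0.14% and 0.6%, in line with the values in the caption; the exact calculation used in the study may differ.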

Source Data

Extended Data Fig. 8 Comparison of plasmid-labeled (Experiment 1.10) and RT-labeled (Experiment 2.01) mixed growth stage libraries reveals minimal cross-contamination between E. coli cells barcoded together.

In Experiment 2.01, exponential and stationary cells were prepared separately and then barcoded independently during RT. In contrast, the RFP-expressing stationary cells and GFP-expressing exponential cells barcoded in Experiment 1.10 were combined for fixation and barcoded together, resulting in more opportunity for cross-contamination. Experiment 2.01 is thus a useful reference to quantify this cross-contamination. To account for differences in the capture efficiency for the two experiments, cells were down-sampled to 30 mRNA UMIs. a, PCA for all 4 cell types reveals that the two stationary populations are biologically distinct, possibly because they were grown independently to slightly different ODs, and RFP cells were induced with aTc. In contrast, the two exponential populations appear very similar. b, PC1 was calculated using only the stationary cells from both experiments. Right: The receiver operating characteristic (ROC) shows that PC1 is a strong classifier of the two states. c, PC1 was calculated using only exponential cells from both experiments. Right: The ROC shows that PC1 is a weak classifier of the two exponential states with performance similar to random assignment (Area Under the ROC Curve [AUC]=0.5). d, PC1 was calculated using wild-type exponential cells from Experiment 2.01, GFP-expressing exponential cells from Experiment 1.10, and RFP-expressing stationary cells from Experiment 1.10 in order to quantify cross-contamination between the GFP and RFP cells using the wild-type exponential cells from Experiment 2.01 as a reference. Right: ROC shows that PC1 is a strong classifier of exponential and stationary cells. The probability that the PC1 value of a wild-type exponential cell is lower than the PC1 value of a stationary RFP cell is 99.9% (AUC = 0.999), while the probability that the PC1 value of a GFP exponential cell is lower than the PC1 value of a stationary RFP cell is 99.67% (AUC = 0.9967). 
Thus, relative to the wild-type reference, an additional 23 out of 10,000 cell pairs (1 exponential, 1 stationary) will be incorrectly ranked for the GFP exponential cells due to cross-contamination in the GFP cells. Finally, we confirmed that in the original library for Experiment 1.10, the relative representation of UMIs from exponential and stationary cells was roughly equal (50.3% stationary, 45.6% exponential), indicating that the cross-contamination analysis for the GFP exponential population would be reciprocal for the RFP stationary population.
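
The reading of AUC as a ranking probability, and the figure of 23 pairs per 10,000, can be illustrated as follows. This is a minimal sketch rather than the analysis code used in the study; the function names and the toy PC1 values are hypothetical, and only the two AUC values come from the caption.

```python
def auc_probability(lower_group, higher_group):
    """Probability that a random value from lower_group is below one from higher_group.

    This pairwise ranking probability equals the area under the ROC curve;
    tied pairs count as half.
    """
    wins = 0.0
    for a in lower_group:
        for b in higher_group:
            if a < b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(lower_group) * len(higher_group))


def extra_misranked_pairs(auc_reference, auc_test, n_pairs=10_000):
    """Additional misranked pairs in the test group relative to the reference."""
    return round((auc_reference - auc_test) * n_pairs)


# Toy PC1 values (hypothetical, for illustration only).
exponential_pc1 = [-2.1, -1.8, -1.5, 0.4]
stationary_pc1 = [0.2, 1.1, 1.9, 2.4]
print(auc_probability(exponential_pc1, stationary_pc1))

# AUC values quoted in the caption: wild-type reference vs. GFP exponential cells.
print(extra_misranked_pairs(0.999, 0.9967))
```

The difference between the two AUC values, 0.999 − 0.9967 = 0.0023, corresponds to the 23 pairs per 10,000 attributed to cross-contamination.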

Source Data

Extended Data Fig. 9 Defining consensus transcriptional states of sub-populations by aggregating single-cell transcriptomes.

a, Correlation between mRNA abundances from 3,547 aggregated wild-type exponential cells (Experiment 2.01) vs. bulk preparation from fixed exponential wild-type E. coli cells. The Pearson correlation coefficient (r) was calculated for 2,150 out of 2,612 total operons, excluding those with zero counts in either library (grey points), or for all 2,612 operons. Bulk library was prepared from the same cells as the PETRI-seq library. b, Bottom: The correlation between the aggregated mRNA counts of single exponential cells (PETRI-seq) and the bulk exponential library increases as more single cells are included. Correlations were calculated from log₁₀(TPM + 1) for each sample. Top: Difference between top curve and bottom curve in plot below, based on best-fit lines (y = ln(x) + b, r > 0.98). c, Correlation between RNA abundances from 4,627 aggregated wild-type stationary cells (Experiment 2.01) vs. bulk preparation from fixed wild-type stationary E. coli cells. The Pearson correlation coefficient (r) was calculated for 2,050 out of 2,612 total operons, excluding those with zero counts in either library (grey points), or for all 2,612 operons. Bulk library was prepared from the same cells as the PETRI-seq library. d, Bottom: The correlation between the aggregated mRNA counts of single stationary cells (PETRI-seq) and the bulk stationary library increases as more single cells are included. Correlations were calculated from log₁₀(TPM + 1) for each sample. Top: Difference between top curve and bottom curve in plot below, based on best-fit lines (y = ln(x) + b, r > 0.98).
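
The comparison of aggregated single cells with a bulk library can be sketched as below. This is an illustrative outline under stated assumptions, not the authors' pipeline: counts are scaled to sum to one million without length normalization (a stand-in for the TPM values named in the caption), normalization is done before zero-count operons are dropped, and all function names and toy counts are hypothetical.

```python
import math


def log_cpm(counts):
    """log10(x + 1) after scaling counts to sum to one million.

    Assumption: no length normalization, so this is a counts-per-million
    stand-in for TPM.
    """
    total = sum(counts)
    return [math.log10(c * 1e6 / total + 1.0) for c in counts]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def aggregate(cells):
    """Sum per-operon counts over a list of single-cell count vectors."""
    return [sum(column) for column in zip(*cells)]


def pseudobulk_vs_bulk(cells, bulk, drop_zeros=True):
    """Correlate aggregated single-cell counts with bulk counts.

    With drop_zeros=True, operons with zero counts in either library are
    excluded, mirroring the first of the two calculations in the caption.
    """
    agg = aggregate(cells)
    x, y = log_cpm(agg), log_cpm(bulk)
    if drop_zeros:
        keep = [i for i, (a, b) in enumerate(zip(agg, bulk)) if a > 0 and b > 0]
        x, y = [x[i] for i in keep], [y[i] for i in keep]
    return pearson_r(x, y)


# Toy data (hypothetical): three cells, four operons, one bulk library.
cells = [[1, 0, 0, 2], [0, 1, 0, 3], [2, 0, 0, 5]]
bulk = [30, 10, 4, 100]
print(pseudobulk_vs_bulk(cells, bulk))
print(pseudobulk_vs_bulk(cells, bulk, drop_zeros=False))
```

Calling `pseudobulk_vs_bulk` on progressively larger subsets of cells would give the kind of curve described for panels b and d.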

Source Data

Extended Data Fig. 10 PETRI-seq detects rare transcriptional states and candidate genes with highly variable expression.

a, PCA detects rare transcriptional states among 6,663 S. aureus cells. A small sub-population of 28 cells (red) expressed operons from the φSA3usa phage. b, Distribution of PC1 loadings for all operons included in the S. aureus analysis. Eight operons from the φSA3usa phage have the highest PC1 loadings. c, Map of genomic region³³ surrounding φSA3usa in the genome of S. aureus strain USA300. Red arrows indicate phage operons upregulated along PC1. d, Percent of mRNA UMIs mapped to the φSA3usa phage for the 28 cells containing phage UMIs. Three cells are composed of >77% phage transcripts. e, Noise (σ²/μ²) versus mean (μ) for operon expression within an S. aureus population of 6,663 cells. 676 operons are included. The circled operon (red) is SAUSA300_1933-1925, which deviated significantly from the rest of the distribution (z-score = 20.6 [determined by residuals from linear regression (see Methods)], p = 10⁻⁹⁴, FDR < 0.01). f,g, Noise (σ²/μ²) versus mean (μ) for operon expression in either exponential (f) or stationary (g) E. coli populations from Experiment 2.01. 1,960 operons are included in (f) and 1,219 operons in (g). Five operons significantly (FDR < 0.01, z-scores determined by residuals from linear regression [see Methods]) deviated from the other operons in (f): sip-dctR (z-score = 7.3, p = 3 × 10⁻¹³), murJ (z-score = 6.7, p = 3 × 10⁻¹¹), fimAICDFGH (z-score = 5.4, p = 7 × 10⁻⁸), mdtL (z-score = 4.8, p = 1 × 10⁻⁶), rnhA (z-score = 4.6, p = 4 × 10⁻⁶). fimAICDFGH, which encodes the type I fimbriae system, has been shown previously to exhibit population-level phase variation that is mediated by transcriptional control³⁷. In (e-g), lines at y = -x indicate Poisson noise where σ² = μ. Operon counts were normalized for each cell before plotting. Operons with fewer than 6 raw total UMIs and a mean less than 0.002 after normalization were excluded.
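
The noise-versus-mean analysis in panels e-g can be sketched as follows. When σ² = μ, the noise σ²/μ² equals 1/μ, so log₁₀(noise) = −log₁₀(μ), which is presumably why the Poisson reference appears as the line y = -x on logarithmic axes. The code below is a hedged illustration, not the procedure from Methods: the function names, the choice of population variance, the plain least-squares fit in log-log space, and the toy counts are all assumptions.

```python
import math


def mean_and_noise(values):
    """Return (mu, sigma^2 / mu^2) for one operon's normalized counts across cells."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n  # population variance
    return mu, var / mu ** 2


def residual_z_scores(xs, ys):
    """Fit y = a*x + b by least squares, then standardize the residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    # Least-squares residuals have zero mean, so this is their standard deviation.
    sd = math.sqrt(sum(r ** 2 for r in residuals) / n)
    return [r / sd for r in residuals]


def noise_z_scores(operon_counts):
    """Map operon name -> z-score of its noise relative to the log-log trend.

    Every operon needs a positive mean and a non-zero variance so that both
    logarithms are defined.
    """
    names = list(operon_counts)
    stats = [mean_and_noise(operon_counts[name]) for name in names]
    log_mu = [math.log10(mu) for mu, _ in stats]
    log_noise = [math.log10(noise) for _, noise in stats]
    return dict(zip(names, residual_z_scores(log_mu, log_noise)))


# Hypothetical normalized counts for four operons across six cells.
toy = {
    "operon_a": [1, 0, 2, 1, 0, 2],
    "operon_b": [4, 5, 3, 6, 4, 2],
    "operon_c": [0, 0, 9, 0, 0, 0],
    "operon_d": [10, 8, 12, 9, 11, 10],
}
for name, z in noise_z_scores(toy).items():
    print(f"{name}: z = {z:+.2f}")
```

Operons with large positive z-scores sit above the trend, which is the sense in which operons such as fimAICDFGH are described as deviating from the rest of the distribution.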

Source Data

Supplementary information

Supplementary Information

Supplementary Figs. 1 and 2, and Tables 1 and 2.

Reporting Summary

Supplementary Tables 3–5

Supplementary Table 3: 96-well oligonucleotides used for PETRI-seq barcoding. Supplementary Table 4: overview of experiments included in this study. Supplementary Table 5: supplementary statistical data for Fig. 3, Extended Data Fig. 2 and Extended Data Fig. 5.

Supplementary Table 6

Count matrix for experiments 1.06SaEc, 1.10 and 2.01, and Bulk Libraries. Anti-sense operons were excluded. BCs with the prefix SB346 are from experiment 1.06SaEc; 394A from 1.10; and SB442 from 2.01. Bulk libraries for stationary-phase RFP-expressing E. coli cells (SB369) and exponential-phase GFP-expressing E. coli cells (SB371) are also included; reads, rather than UMIs, are reported for bulk libraries. Operon names with the prefix ‘U00096:’ originate from E. coli, whereas operons with the prefix ‘CP000255:’ originate from S. aureus.

Source data

Source Data Fig. 2

Raw source data.

Source Data Fig. 3

Raw source data.

Source Data Extended Data Fig. 1

Raw source data.

Source Data Extended Data Fig. 2

Raw source data.

Source Data Extended Data Fig. 3

Raw source data.

Source Data Extended Data Fig. 4

Raw source data.

Source Data Extended Data Fig. 5

Raw source data.

Source Data Extended Data Fig. 6

Raw source data.

Source Data Extended Data Fig. 7

Raw source data.

Source Data Extended Data Fig. 8

Raw source data.

Source Data Extended Data Fig. 9

Raw source data.

Source Data Extended Data Fig. 10

Raw source data.

About this article

Cite this article

Blattman, S.B., Jiang, W., Oikonomou, P. et al. Prokaryotic single-cell RNA sequencing by in situ combinatorial indexing. Nat Microbiol 5, 1192–1201 (2020). https://doi.org/10.1038/s41564-020-0729-6

Received: 26 November 2019
Accepted: 23 April 2020
Published: 25 May 2020
Issue Date: October 2020
DOI: https://doi.org/10.1038/s41564-020-0729-6
Springer Nature Limited
