Rapid construction of a whole-genome mutant library by combining haploid stem cells and inducible self-inactivating PiggyBac transposon

Genome-scale screening is a powerful method for exploring phenotypes of interest, and numerous screening systems have been built to identify functional sites. Classical genetic screening in eukaryotes has been performed using chemical mutagens (Chen et al., 2000), transposon-mediated gene trapping (Dupuy et al., 2005), and the CRISPR-Cas9 system (Wang et al., 2014). These approaches have helped to decipher many biological processes. Although numerous screening methods have been developed (Schneeberger, 2014), most of them are time-consuming and expensive in library design, preparation, and screening. In addition, classical screening systems identify target genes with high false-positive rates because it is very difficult to ensure that only one target is disrupted in a single cell. These disadvantages, which are common in typical genome-wide screening, hinder the widespread application of forward genetic screening. In previous reports, the PiggyBac (PB) transposon was used for insertional mutagenesis in mammals (Wang et al., 2009). More recently, the “Slingshot” screening method, a self-inactivating PB transposon system for tamoxifen-inducible insertional mutagenesis, was reported (Kong et al., 2010). Although the Slingshot system has been used for gain-of-function, loss-of-function, and genome-wide screens, it is difficult to ensure only one insertional mutation per cell because multiple copies of the Slingshot vector enter each cell. Here, we developed a new screening system named “One-Shot”, which is based on the Slingshot method and haploid embryonic stem cells (haESCs) (Li et al., 2012). In the “One-Shot” screening system, we integrated a PiggyBac-based gene-trapping cassette into the mRosa26 locus via CRISPR-Cas9-assisted homologous recombination in mouse haESCs, and we excluded random insertions by thymidine kinase (TK) negative selection (Fig. 1A).
By design, correctly integrated clones carry only one copy of the trapping cassette and show the expected drug resistance (Fig. S1A and S1B). Furthermore, without induction by 4-hydroxytamoxifen (4-OHT), “One-Shot” haESCs can be cryopreserved and expanded homogeneously. Once the “One-Shot” system was established, screening could be initiated by adding 4-OHT to the medium. This allowed the PB transposase-oestrogen receptor ligand-binding domain fusion (PBase-ERT2) to translocate into the nucleus and induce transposition (Fig. 1B). In addition, the PB transposase (PBase) in “One-Shot” was self-inactivated after 4-OHT treatment to prevent re-transposition of the trapping element (Fig. 1C). These data indicate that “One-Shot” was integrated at the intended position in haESCs and was expressed in a spatiotemporally controlled manner. Because haESCs can be differentiated while remaining haploid (He et al., 2017), we differentiated “One-Shot” haESCs into haploid neural stem cell-like cells (haNSCLCs) to obtain “One-Shot” haNSCLCs (Figs. 1D and S1C), which broadens the scope of application of the “One-Shot” system. To evaluate the footprints of “One-Shot” in the genome, we expanded “One-Shot” haESCs and “One-Shot” haNSCLCs to populations of 4 × 10^7 cells each, and then added 4-OHT to trigger transposition events. After 3 days of 4-OHT induction, cells were harvested and the insertion sites were amplified using splinkerette PCR (Horn et al., 2007). Next, PCR products from “One-Shot” haESCs and “One-Shot” haNSCLCs were sequenced separately by next-generation sequencing. As expected in the absence of phenotypic selection, PiggyBac-mediated transpositions were distributed on both the forward and reverse strands of the genome (Fig. 1E), and footprints of the trapping element in the “One-Shot” system were distributed across the whole genome (Fig. 1F). A global overview of the read distribution showed that the PB transposon could insert into many genomic features. Consistent with a previous report, we observed a preference for integration within transcription units (Fig. 1G) (Ding et al., 2005). Furthermore, we picked 15 clones from the transposed “One-Shot” haESCs library using flow cytometry, and inserted sites for each clone were isolated via barcodes encoded


Southern blotting
After PCR analysis, 3 clones were selected for Southern blotting. Genomic DNA of the clones was extracted using a kit (MicroElute Genomic DNA Kit, Omega, D3096-01), and 30 µg of genomic DNA from each clone was digested with KpnI (New England Biolabs, R3142L) and AgeI (New England Biolabs, R3552L) for 16 h. The digested DNA was fractionated on 0.8% agarose gels and analyzed using a standard Southern blotting procedure. The probe was a 399 bp fragment amplified from the OS-TK plasmid. Primers for the probe are listed in Table S3.

Splinkerette PCR
Splinkerette PCR comprised three steps: genome digestion, adapter ligation, and nested PCR amplification (Horn et al., 2007). In the first step, 0.5 µg of genomic DNA was digested at 37°C with BstYI (New England Biolabs, R0523L) for 16 h, and the endonuclease was then inactivated at 80°C for 30 min. The digested DNA was ligated to the annealed splinkerette adapter at 4°C for 16 h. The ligation mixture was used as the template for nested PCR, with primers designed against both PB terminal repeats and the splinkerette adapter. Sequences of the adapter and primers are listed in Table S3.

Genome coverage evaluation
For genome coverage evaluation, 4 × 10^7 cells (haESCs and haNSCLCs, respectively) were induced to trap the genome, and the insertion sites were isolated using splinkerette PCR and high-throughput sequencing. The second-round nested PCR primers were used as barcodes to filter reads. We then extracted the sequences containing 'TTAA' using cutadapt (Martin, 2011) and aligned them to the mouse reference genome from Ensembl release 97 using the bwa aligner (Li and Durbin, 2010). PCR duplicates in the alignments were removed using gatk (McKenna et al., 2010), and the cleaned alignments were used to count reads in windows of 100,000 bp using deeptools2 (Ramirez et al., 2016) and plotted with circos (Krzywinski et al., 2009). Enrichment of reads in genomic features was analyzed using ALFA (Bahin et al., 2019).
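
The filtering and window-counting steps above can be illustrated with a minimal Python sketch. It is not the cutadapt or deeptools2 interface: the function names, the representation of alignments as (chromosome, 0-based start) pairs, and the example data are all assumptions made for illustration.

```python
from collections import Counter

WINDOW_SIZE = 100_000  # bp, the window size stated in the text


def keep_ttaa_reads(reads):
    """Keep only reads that contain the 'TTAA' motif (case-insensitive).

    Stand-in for the cutadapt filtering step; `reads` is a list of strings.
    """
    return [read for read in reads if "TTAA" in read.upper()]


def count_reads_per_window(alignments, window_size=WINDOW_SIZE):
    """Count alignments per fixed-size genomic window.

    `alignments` is an iterable of (chromosome, start) pairs, assumed to be
    deduplicated already. Keys of the result are (chromosome, window_start).
    """
    counts = Counter()
    for chrom, start in alignments:
        window_start = (start // window_size) * window_size
        counts[(chrom, window_start)] += 1
    return counts


# Hypothetical example data
filtered = keep_ttaa_reads(["ccttaagg", "ACGTACGT"])
example = [("chr1", 5), ("chr1", 99_999), ("chr1", 100_000), ("chr2", 250_123)]
window_counts = count_reads_per_window(example)
```

Under these assumptions, windows with non-zero counts correspond to regions of the genome reached by at least one insertion.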

Insert sites annotation
First, reads were filtered or demultiplexed using cutadapt and then aligned to the mouse reference genome with bwa as described above. Strand orientation of alignments and trapping events was analyzed with bedtools (Quinlan and Hall, 2010) and Linux scripts. "Sense reads" denotes reads whose orientation matches that of both the trapping element of the "One-Shot" system and the gene. "Sense rate" and "dispersion" were calculated using the following formulas:
Trapped genes (with more than 50 total reads) were functionally annotated using DAVID bioinformatics resources (Huang da et al., 2009). All PCR duplicates in this analysis were removed using the gatk toolkit.
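
As a rough illustration of the orientation analysis, the Python sketch below tallies sense and antisense reads per gene and keeps genes with more than 50 total reads. It assumes that a read counts as "sense" when its strand equals the gene strand, which is one reading of the definition above; the gene names and input tuples are hypothetical, and the "sense rate" and "dispersion" formulas are not reproduced here.

```python
from collections import defaultdict

MIN_TOTAL_READS = 50  # genes need MORE than this many reads (from the text)


def tally_orientation(read_hits):
    """Tally sense and antisense reads per gene.

    `read_hits` is an iterable of (gene, read_strand, gene_strand) tuples
    with strands given as "+" or "-". Assumption: a read is "sense" when
    its strand equals the strand of the gene it overlaps.
    """
    tallies = defaultdict(lambda: {"sense": 0, "antisense": 0})
    for gene, read_strand, gene_strand in read_hits:
        key = "sense" if read_strand == gene_strand else "antisense"
        tallies[gene][key] += 1
    return dict(tallies)


def genes_above_threshold(tallies, min_total=MIN_TOTAL_READS):
    """Return gene names whose total read count exceeds `min_total`."""
    return sorted(
        gene
        for gene, t in tallies.items()
        if t["sense"] + t["antisense"] > min_total
    )


# Hypothetical example: GeneA has 55 reads in total, GeneB has 10
hits = (
    [("GeneA", "+", "+")] * 40
    + [("GeneA", "-", "+")] * 15
    + [("GeneB", "-", "-")] * 10
)
tallies = tally_orientation(hits)
selected = genes_above_threshold(tallies)
```

In practice the (gene, read strand, gene strand) tuples would come from overlapping alignments with a gene annotation, for example with bedtools as described above.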

Barcodes encoded splinkerette PCR
Fifteen barcodes were designed, and the barcode sequences are listed in Table S3. To isolate insertion sites from single clones, the insertion sites of each clone were amplified by splinkerette PCR and then pooled for high-throughput sequencing. Cleaned raw data were demultiplexed and mapped to the mouse reference genome, and alignments without PCR duplicates were counted in windows of 100,000 bp. Windows were then sorted by the number of alignments, and the top 5 windows for each clone were shown in scatter plots.
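
The demultiplexing and window-ranking logic can be sketched in Python as follows. This is only an illustration under stated assumptions: barcodes are taken to sit at the start of each read and to match exactly, the two barcode sequences shown are hypothetical (the real ones are in Table S3), and the function names are invented for this example.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign each read to the clone whose barcode prefixes it.

    `barcodes` maps clone name -> barcode sequence. The barcode is stripped
    from the read; reads matching no barcode are dropped.
    """
    by_clone = defaultdict(list)
    for read in reads:
        for clone, barcode in barcodes.items():
            if read.startswith(barcode):
                by_clone[clone].append(read[len(barcode):])
                break
    return dict(by_clone)


def top_windows(window_counts, n=5):
    """Return the `n` windows with the most alignments, highest first.

    Ties are broken by window key so the order is deterministic.
    """
    ranked = sorted(window_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


# Hypothetical barcodes and reads
barcodes = {"clone1": "ACGT", "clone2": "TGCA"}
reads = ["ACGTTTAAGG", "TGCATTAACC", "ACGTTTAACA", "GGGGTTAA"]
demuxed = demultiplex(reads, barcodes)

# Hypothetical per-window alignment counts for one clone
counts = {("chr1", 0): 3, ("chr2", 100_000): 7, ("chr3", 0): 1}
best = top_windows(counts, n=2)
```

For a clone carrying a single insertion, one window would be expected to dominate the ranking, which is what the top-5 scatter plots are meant to show.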

Differentiation screening
"One-Shot" Rex1-GFP haESCs (OsRG-haESCs) were purified by FACS and expanded to a population of 4.5 × 10^7 cells. These OsRG-haESCs were divided into 3 libraries and cultured in suspension with 4-hydroxytamoxifen for 3 days to form mutated embryoid bodies (EBs) in differentiation medium (N2B27 medium without 2i and LIF), and were then plated onto fibronectin (Gibco, G1890)-coated dishes. After 32 days of differentiation, GFP-positive cells from each library were collected separately by flow cytometry. Two libraries were sequenced and analyzed directly, and GFP-positive cells from the third library were plated onto a feeder-coated dish in N2B27 medium supplemented with 2i and LIF for expansion. The insertion sites of GFP-positive colonies from the third library were then isolated using splinkerette PCR and high-throughput sequencing. Genes with more than 50 reads were considered candidates for maintenance of self-renewal.

Puromycin screening
"One-Shot" haESCs were purified for haploid cells before puromycin screening, and 4.5 × 10^7 "One-Shot" haESCs were treated with 4-OHT for 3 days to generate a mutant library. Mutated cells were cultured in ESC medium with puromycin (0.5 μg/mL; Gibco, A1113803) for 7 days, and the surviving cells were then dissociated to identify insertion sites using splinkerette PCR and high-throughput sequencing. For puromycin screening, genes with at least 20 reads were treated as candidates.

Quantitative real-time polymerase chain reaction (qRT-PCR)
Total RNA was extracted from cells using the PureLink™ RNA Mini Kit (Invitrogen, 12183018A), and 2 µg of total RNA was used to synthesize complementary DNA with M-MLV Reverse Transcriptase (Promega, M1705) according to the manufacturer's instructions. An appropriate amount of cDNA was then added to a qRT-PCR reaction based on SYBR Green Realtime PCR Master Mix (TOYOBO, QPS-201) to evaluate transcript abundance. All samples were run in triplicate, and data were normalized to the housekeeping gene Gapdh and analyzed by the delta-delta Ct method. Primers used for qRT-PCR are listed in Table S3.
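
For readers unfamiliar with delta-delta Ct analysis, the Python sketch below shows the standard 2^(-ΔΔCt) calculation with triplicate Ct values averaged before normalization. It assumes roughly 100% amplification efficiency for both primer pairs; the function name and the Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_expression(target_sample, ref_sample, target_control, ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values:
      target_* : gene of interest, ref_* : housekeeping gene (e.g. Gapdh),
      *_sample : treated cells,    *_control : control cells.
    Assumes ~100% amplification efficiency for both primer pairs.
    """
    delta_sample = mean(target_sample) - mean(ref_sample)      # dCt, sample
    delta_control = mean(target_control) - mean(ref_control)   # dCt, control
    delta_delta = delta_sample - delta_control                 # ddCt
    return 2 ** (-delta_delta)


# Hypothetical triplicate Ct values
fold = relative_expression(
    target_sample=[24.0, 24.5, 23.5],   # mean 24.0
    ref_sample=[18.0, 18.0, 18.0],      # mean 18.0 -> dCt = 6.0
    target_control=[26.0, 26.0, 26.0],  # mean 26.0
    ref_control=[18.0, 18.0, 18.0],     # mean 18.0 -> dCt = 8.0
)
# ddCt = 6.0 - 8.0 = -2.0, so fold = 2 ** 2.0 = 4.0
```

A fold value above 1 indicates higher abundance of the target transcript in the sample than in the control after normalization to the housekeeping gene.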

AP staining
AP staining was performed according to the instructions of the alkaline phosphatase kit (Beyotime, C3206).