Single cell RNA and immune repertoire profiling of COVID-19 patients reveal novel neutralizing antibody

There is no doubt that COVID-19 outbreak is currently the biggest public health threat


Single-cell 5' V(D)J data processing
The analysis pipelines in Cell Ranger (10X Genomics, version 3.1.0) were used for single cell sequencing data processing. Gene expression matrix files were generated using cellranger count with -transcriptome = refdata-cellranger-GRCh38-3.0.0 for each sample. V(D)J sequence assembly and paired clonotype calling were performed using cellranger vdj with -reference = refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0 for each sample. Outputs of cellranger count for individual samples were integrated using cellranger aggr with --normalize = none. Raw counts for each individual were normalized by scaling to the total number of UMI, per cell, and log transformed by NormalizeData function in R package Seurat (version: 3.0.2) (Butler et al., 2018;Stuart et al., 2019). Batch effects were removed and data sets from each individual were integrated by the standard Seurat v3 integration workflow (Stuart et al., 2019). Cell-cycle scoring was done by CellCycleScoring function in Seurat. Then cell cycle scores were regressed during data scaling by ScaleData function in Seurat. Normalized distributions were subsequently scaled to have 0 mean, and standard deviation of 1. Principal component analysis was computed based on the scaled expression values of variable genes. First 30 principal components were used to construct a KNN graph using FindNeighbors function in Seurat, the edge weights between any two cells based on the shared overlap in their local neighborhoods (Jaccard similarity). To cluster the cells, Louvain algorithm was used to iteratively group cells together, with the goal of optimizing the standard modularity function. First 20 principal components were used to generate tSNE projections. Cell types were annotated by manually checking the results from CIBERSORT (Newman et al., 2019) and SingleR (Aran et al., 2019), which leverages reference transcriptomic datasets of pure cell types to infer the cell of origin of each single cell independently. Red blood cells, platelets, doublet cells, and other artifacts were removed from the downstream analysis.

iSMART groups analysis
For each repertoire, we extracted CDR3s with more than 5 reads, to include effector/memory B cell population at maximum. As a result, each sample provided 85K to 370K sequences. We processed the BCR clustering using iSMART  for each patient separately. This is based on the previous observation that inter-patient BCR sharing is negligible compared to inter-patient TCR sharing (Hu et al., 2019). We then combined iSMART results from all patients and obtained 74,634 BCR groups. To select the groups that are more likely to be antigen-specific, and suitable for downstream modeling ( Figure S2), we imposed the following selection criteria: 1) There are at least five distinct CDR3 nucleotide sequences in the group; 2) The group must have evidence of IgM to IgG or IgM to IgA class switch recombination; 3) At least one CDR3 in the group must be found in the single cell data, to acquire paired light chain information. After selection, there are 347 remaining BCR groups, and we reported the ones with highest clonal abundance within each group.

Reconstruction of BCR lineage tree
The input data, grouped BCRs (heavy chains only), were generated by iSMART  from BCR sequences mixed from BCR-Seq and 10x Genomics Immune Profiling in each sample. Clustal Omega (Sievers et al., 2011) was used to perform multiple sequence alignment (MSA) of BCR nucleic acid sequences in each BCR group. Using model GY94, Igphyml (Hoehn et al., 2017) successfully generated phylogenetic trees from the results of multiple sequence alignment. Ete3 (Huerta-Cepas et al., 2016) was used to conduct visualization and filtration on the phylogenetic trees.

Analysis of BCR class-switch recombination
The class switch (CS) took BCR groups generated by iSMART  as input. For each sample, we calculated the number of BCR groups shared between any two isotypes. Considering the biological order of CS events in immunological process, we filtered out the pair of IGHA1 and IGHA2 and adjusted the CS direction of each pairs of isotypes to the right order, remaining 15 CS events in the end (IGHM -> IGHG3/ IGHG1 /IGHG2/ IGHA1/ IGHA2, IGHG3 -> IGHG1/ IGHG2/ IGHA1/ IGHA2, IGHG1 -> IGHG2/ IGHA1/ IGHA2, IGHG2 -> IGHA1/ IGHA2). For each sample, we used Fisher's exact test to identify significantly enriched CS events, and the p values were corrected using the Benjamini-Hochberg method.

The measure of BCR diversity
We used D50 value to estimate the BCR diversity for each sample. We decreasingly ordered the clone counts for top 10,000 clonotypes with highest clonal frequency. The D50 value was defined as:

∑ ∑
Where the Thr=10,000, is the clone count of the i th BCR clonotype (i ≤ Thr) in sample j.

Screening for neutralizing antibodies
The candidate antibody sequences were screened from 347 BCR group ( Figure S2). We selected 347 BCR heavy chains in each group according to the following conditions: 1) The sequence came from single cell BCR data, since it can provide enough information about paired chains and V(D)J segments to complete antibody assembly; 2) Keep the heavy chain with the highest expression. Finally, 347 natural antibody sequences were obtained by pairing 347 heavy chains with their corresponding light chains in single cell BCR data.

Monoclonal antibody expression and purification
Paired heavy chain and light chain cDNA were selected for codon optimization and cloned into the expression vector containing the constant region of human IgG1. HEK293 cells were transfected with the equal amounts of heavy chain and light chain plasmids to express IgG monoclonal antibodies. The protein supernatant was purified using affinity chromatography to obtain recombinant monoclonal antibodies.

ELISA quantification
The enzyme-linked immunosorbent assay (ELISA) plates were coated with 0.01 μg/ml and 5 μg/ml SARS-CoV-2 S protein, as well as 0.01 μg/ml and 1 μg/ml SARS-CoV-2 RBD region in phosphate-buffered saline (PBS) at 4°C overnight. After standard washing and blocking, 1 μg/ml antibodies were added to each well and incubated at room temperature for 2 hours. The plates were washed and incubated with 0.08 μg/ml goat anti-human IgG (H+L)/HRP (JACKSON) for 1 h at room temperature. Using chromogen solution as the substrate, the absorbance at 450 nm was measured by a microplate reader. For the competition ELISA, Biotinylated RBD (0.3 μg/ml) was immobilized on the ELISA plates. After standard washing, 0.05 μg/ml ACE2-His was added to the wells, and then the diluted antibodies were added immediately. After incubation at room temperature for 2 hours, the plate was washed and anti-His/HRP (1:3000) was added. After incubation at room temperature for 1 hour, the absorbance at 450 nm was measured by a microplate reader. Compared with negative control, the inhibition rate of ACE2/RBD binding was calculated.

Pseudovirus neutralization assay
The human ACE2 overexpression stable HEK293T cells (293T-ACE2 cells) (2 × 10 4 cells/well) were seeded in 96-well plates. For the neutralization assay, 50 μl SARS-CoV-2 pseudoviruses (~2 × 10 4 RLU) were incubated with serial dilutions of antibodies for 1 h at 37 °C, then added to the 96-well 293T-ACE2 cells. Luciferase luminescence value was detected by chemiluminescence method, and the pseudovirus neutralization inhibition rate of the antibody was calculated according to the luciferase luminescence value to evaluate the neutralization effect of the antibody.

SARS-CoV-2 neutralization assay
All experiments with infectious SARS-CoV-2 were performed in the biosafety level 4 facilities in the Harbin Veterinary Research Institute (HVRI) of the Chinese Academy of Agricultural Sciences (CAAS), which is approved for such use by the Ministry of Agriculture and Rural Affairs of China. The neutralizing activity of the antibody GD1-69 was assessed using a plaque reduction neutralization test assay as described previously . Briefly, serial dilutions of GD1-69 with DMEM medium were incubated with 50 PFU of the SARS-CoV-2 at 37 °C for 1 h, and then added to pre-plated Vero E6 cell monolayers in 24-well plates. The plates were incubated in CO2 incubator at 37 °C for 48 h with agarose overlay. Neutralizing titer was calculated as the maximum antibody dilution yielding a 50% reduction in the number of plaques relative to that for control IgG protein.

Statistical Analysis
All statistical analyses and visualization were performed using R the statistical programming language (version 3.6.1). Two sample tests were performed using two-sided Wilcoxon rank sum test. If multiple tests were performed for a single analysis, we used BH procedure to correct for FDR. For all the boxplots displayed in the figures, the middle line defines the median value, with borders of the boxes indicating the 25% (Q1) and 75% (Q3) quartiles of the data. Lower and upper whiskers corresponded to Q1-1.5IQR and Q3+1.5IQR, where IQR is short for inter-quartile range. Figure S3. Visualization of Ig isotype co-occurrence within B cell clusters by samples. Lines connecting two nodes indicate the sharing between two corresponding Ig subclasses, with darker color for higher enrichment level by fisher-exact test and wider width for higher sharing count. The enrichment level is the ratio of observed and expected switches if assuming Ig isotypes are independently distributed among B cell clusters. Figure S4. Lineage tree of BCRs heavy chains group 140 from patient P5 (A), group 6 from patient P1 (B) and group 94 from patient P11 (C), generated by iSMART with aligned cDNA sequences on the right. The sequences were mixed from BCR-seq and 10x Genomics Immune Profiling and only heavy chains were remained for analysis. Each node represents a BCR heavy chain, colored by corresponding isotype. The size of node represents BCR frequency in the group. Table S1. Data information of 16 COVID-19 patients Table S2. The summary of single cell RNA sequencing data Table S3. The summary of BCR-seq data Table S4. The filtered 347 BCRs of BCR repertoire analysis. Table S5. The result of ELISA test of 100 antibody candidates.