Emergence in southern France of a new SARS-CoV-2 variant harbouring both N501Y and E484K substitutions in the spike protein

SARS-CoV-2 variants have become a major virological, epidemiological, and clinical concern, particularly with regard to the risk of escape from vaccine-induced immunity. Here, we describe the emergence of a new variant, with the index case returning from travel in Cameroon. For 13 SARS-CoV-2-positive patients living in the same geographical area of southeastern France, a qPCR test for screening variant-associated mutations showed an atypical combination. The genome sequences were obtained by next-generation sequencing with Oxford Nanopore Technologies on GridION instruments within about 8 h. Analysis revealed 46 nucleotide substitutions and 37 deletions, resulting in 30 amino acid substitutions and 12 deletions. Fourteen of the amino acid substitutions, including N501Y and E484K, and nine deletions are located in the spike protein. This genotype pattern led to the establishment of a new Pangolin lineage, named B.1.640.2, that is a phylogenetic sister group to the old B.1.640 lineage, which has now been renamed B.1.640.1. The lineages differ by 25 nucleotide substitutions and 33 deletions. The combination of mutations in these isolates and their phylogenetic position indicate, based on our previous definition, that they represent a new variant, which we have named “IHU”. These data are a further example of the unpredictability of the emergence of SARS-CoV-2 variants, and of their possible introduction into a given geographical area from abroad. Supplementary Information The online version contains supplementary material available at 10.1007/s00705-022-05385-y.

SARS-CoV-2 emerged in China in December 2019 and was declared a pandemic 21 months ago [1]. We have shown since the summer of 2020 that several SARS-CoV-2 variants have emerged in southeastern France and have caused distinct epidemics, either successive or superimposed [2,3]. We also reported that these variants were often introduced from abroad but could also originate from mink. As of December 31, 2021, in our institute, SARS-CoV-2 from almost 43,000 patients had been genotyped, by next-generation sequencing (NGS) of the complete genomes for more than 23,000 patients, and by multiple variant-specific qPCR assays for a more exhaustive assessment of their spread. Since the summer of 2020, and with the emergence of the Alpha variant at the end of 2020, SARS-CoV-2 variants have become a major virological, epidemiological, and clinical concern, particularly regarding the risk of escape from vaccine-induced immunity [4][5][6][7]. Here, we describe the emergence in southeastern France of a new variant of possible Cameroonian origin.
The index case, aged between 40 and 50 and living in a small town in southeastern France, was first diagnosed as infected with SARS-CoV-2 by real-time reverse transcription PCR (qPCR) performed on a nasopharyngeal sample collected in mid-November 2021 at a private medical biology laboratory (Table 1). The person had been vaccinated against SARS-CoV-2 and had returned from travel to Cameroon three days previously. Mild respiratory symptoms arose the day before diagnosis. Subsequent screening for three spike gene mutations in a qPCR assay, as performed routinely in France in cases of SARS-CoV-2 positivity, revealed an atypical combination: L452R negativity, E484K positivity, and E484Q negativity (Pentaplex assay, ID solutions, Grabels, France). This did not correspond to the pattern of the Delta variant, which accounted for almost all SARS-CoV-2 infections at that time (Table 1). Respiratory samples collected from seven other SARS-CoV-2-positive patients living in the same geographical area of southern France exhibited the same combination of mutations in the qPCR assay used for screening. These patients included two adults and five children (<15 years of age).
Extracted RNA was reverse transcribed using SuperScript IV (Thermo Fisher Scientific), and a second cDNA strand was synthesized using a LunaScript RT SuperMix kit (New England Biolabs, Beverly, MA, USA) and then amplified using a multiplex PCR protocol according to the ARTIC procedure (https://artic.network/) with the ARTIC nCoV-2019 V3 panel of primers (IDT, Coralville, IA, USA). Finally, NGS was performed using a ligation sequencing kit and a GridION instrument from Oxford Nanopore Technologies (Oxford, UK), following the manufacturer's instructions. Subsequently, fastq files were processed using the ARTIC field bioinformatics pipeline (https://github.com/artic-network/fieldbioinformatics).
NGS reads were basecalled using Guppy (4.0.14) and aligned to the Wuhan-Hu-1 reference genome sequence (GenBank accession no. MN908947.3) using minimap2 (v2.17-r941) (https://github.com/lh3/minimap2) [8]. The ARTIC tool align_trim was used to softmask primers from the read alignment and to cap the sequencing depth at a maximum of 400. The identification of consensus-level variant candidates was performed using the Medaka (0.11.5) workflow (https://github.com/artic-network/artic-ncov2019). This strategy allowed assembly of the complete viral genome sequence from NGS reads obtained within 30 min of the run for samples with qPCR cycle threshold (Ct) values between 15 and 27. SARS-CoV-2 genomes were classified into Nextclade and Pangolin lineages using web applications (https://clades.nextstrain.org/; https://cov-lineages.org/pangolin.html) [9][10][11]. The sequences were deposited in the GISAID sequence database (https://www.gisaid.org/) [12] (Table 1). Phylogenies were reconstructed using the nextstrain/ncov tool (https://github.com/nextstrain/ncov) and visualized using Auspice (https://docs.nextstrain.org/projects/auspice/en/stable/).
Respiratory samples collected before December 1, 2021, from five other SARS-CoV-2-positive patients living in the same city or borough as the index case were identified by NGS as containing the IHU variant (Table 1). The viral genome sequences from these patients were determined using the same procedure as for the first eight cases. Analysis of the viral genome sequences revealed the presence of 46 nucleotide substitutions and 37 deletions, resulting in 30 amino acid substitutions and 12 deletions (Fig. 1a; Supplementary Tables S1 and S2). Fourteen amino acid substitutions and nine amino acid deletions were found in the spike protein.
These include the substitutions N501Y and E484K, which are present in the Beta, Gamma, Theta, and Omicron variants [5,13], F490S, which is present in the Lambda variant, and P681H, which is present in the Lambda and Omicron variants. In the other structural proteins, amino acid changes include two substitutions in the nucleocapsid protein and one in the membrane protein. In the non-structural proteins, the amino acid changes include one substitution each in the proteins Nsp2, Nsp4, Nsp6, Nsp12 (RNA-dependent RNA polymerase), and Nsp13 (helicase); two substitutions in Nsp14 (3'-5' exonuclease); and three deletions in Nsp6. Finally, in the regulatory proteins, amino acid changes include two substitutions in ORF3a, one in ORF8, and one in ORF9b. In addition, codon 27 of the ORF8 gene is changed to a stop codon, as in the Alpha variant [14]. Some members of the Marseille-4 variant lineage (B.1.160), which predominated in the Marseille geographical area between August 2020 and February 2021 [3], also exhibit a stop codon in the ORF8 gene, but at another position.
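The substitution labels used above follow the usual reference-residue, position, new-residue notation (for example, N501Y). As a minimal sketch, assuming only that notation, the Python below parses such labels and looks up the variants that the text names as sharing each spike substitution. The names `Substitution`, `parse_substitution`, `SHARED_WITH`, and `variants_sharing` are hypothetical helpers introduced for illustration, and the lookup table merely restates the sentence above; it is not an independent database.

```python
import re
from typing import Dict, List, NamedTuple


class Substitution(NamedTuple):
    ref: str   # amino acid in the reference (Wuhan-Hu-1) sequence
    pos: int   # 1-based position in the protein
    alt: str   # amino acid in the variant


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'N501Y' into (ref, pos, alt)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


# Illustrative table restating the text above (hypothetical helper data).
SHARED_WITH: Dict[str, List[str]] = {
    "N501Y": ["Beta", "Gamma", "Theta", "Omicron"],
    "E484K": ["Beta", "Gamma", "Theta", "Omicron"],
    "F490S": ["Lambda"],
    "P681H": ["Lambda", "Omicron"],
}


def variants_sharing(label: str) -> List[str]:
    """Return the variants listed in the text for a given spike substitution."""
    parse_substitution(label)  # validate the label format first
    return SHARED_WITH.get(label, [])


if __name__ == "__main__":
    for name in sorted(SHARED_WITH, key=lambda s: parse_substitution(s).pos):
        print(name, parse_substitution(name), variants_sharing(name))
```

Sorting by parsed position rather than by label keeps the output in spike-coordinate order, which is how such substitutions are usually reported.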
Nextclade (https://clades.nextstrain.org/) identified a 20A lineage. Pangolin (https://cov-lineages.org/pangolin.html) identified a B.1.640 lineage in primary analysis but a B.1 lineage with the -usher (Ultrafast Sample placement on Existing tRee; https://genome.ucsc.edu/cgi-bin/hgPhyloPlace) option, which showed the phylogenetic placement of the genomes we obtained as an outgroup of the B.1.640 lineage and their clustering with a genome sequence obtained in late October in France (Ile-de-France) (EPI_ISL_5926666). The B.1.640 lineage corresponds to a variant first identified in France in April 2021, in Indonesia in August 2021, and in the Republic of the Congo (Brazzaville) in September 2021, and it was involved in a cluster of cases in Brittany, France, around mid-October 2021 [15]. As of December 31, 2021, 371 genome sequences were available from the GISAID database, including 275 from France and 29 from [...] Fig. S1, Supplementary Tables S1 and S2). However, the spike genes of these two lineages differ by seven mutations. In addition, 25 nucleotide substitutions and 33 nucleotide deletions located elsewhere in the genome differ between the two genotypes. The pattern of mutations therefore indicates that the sequences determined in this study represent a new variant, which we have named "IHU" (in reference to our institute), based on our previous definition [3]. A phylogenetic analysis performed using the nextstrain/ncov tool (https://github.com/nextstrain/ncov) also showed that the B.1.640 and IHU variants were most closely related to each other but comprised two divergent branches (Fig. 1b; Supplementary Fig. S2). Phylogeny reconstruction showed three major clusters. The first one included the 13 genomes obtained in our laboratory and one additional genome obtained in France in December 2021.
A second cluster included seven genomes obtained from patients sampled in India, the United Kingdom, Germany, and the USA between mid-November and early December 2021. A third cluster included three genomes obtained from patients sampled in France in late October and mid-November 2021. As the index case was [...]
We analyzed a structural model of the complete spike protein of the IHU variant, generated by incorporating its specific mutational profile into the spike protein structure of the original 20B SARS-CoV-2 (Wuhan-Hu-1 isolate with the D614G substitution) [16] and fixing all gaps in the pdb file by incorporating the missing amino acids using the Robetta protein structure prediction tool (https://robetta.bakerlab.org/), followed by energy minimization using the Polak-Ribière algorithm as described previously (Fig. 1c) [17]. In the N-terminal domain (NTD), the deletion of amino acids 134-145 is predicted to significantly affect the neutralizing epitope. Other changes involve amino acids at positions 96 and 190: in the Wuhan-Hu-1 isolate, E96 and R190 induce a turn in the NTD secondary structure through electrostatic interactions with each other. This interaction is conserved between the substituted amino acids 96Q and 190S, which suggests the co-evolution of these changes. In the receptor binding domain (RBD), in addition to the well-known substitutions N501Y and E484K, several changes were predicted to significantly affect the neutralizing epitopes. Outside the RBD, P681H is located in the cleavage site of the S1-S2 subunits of the spike and is observed in other variants, including the recently emerged Omicron variant [13]. In addition, the D1139H substitution affects an amino acid that is involved in the fusion of the virus and the infected cell. Also, D614G is combined with T859N in the IHU variant.
Interestingly, in the Wuhan-Hu-1 isolate, the amino acids D614 and T859 from two subunits of the trimeric spike are face to face and lock the trimer in a closed conformation. Although the D614G substitution already allows the trimer conformation to be unlocked, this is predicted to be facilitated even more in the presence of the additional substitution T859N.
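The first step of the modelling described above, writing a mutational profile onto a reference, can be illustrated at the sequence level. The sketch below is an assumption-laden toy: `apply_profile` is a hypothetical helper, the six-residue reference string is invented, and nothing here reproduces the structural work (gap filling with Robetta and energy minimization), which operates on three-dimensional coordinates rather than on strings.

```python
from typing import Dict, Set, Tuple


def apply_profile(
    reference: str,
    substitutions: Dict[int, Tuple[str, str]],
    deletions: Set[int],
) -> str:
    """Apply substitutions and deletions to a reference protein sequence.

    Positions are 1-based and always refer to the reference numbering,
    so deletions do not shift the coordinates of later substitutions.
    substitutions maps position -> (expected reference residue, new residue).
    """
    mutated = []
    for pos, residue in enumerate(reference, start=1):
        if pos in deletions:
            continue  # residue is absent from the variant
        if pos in substitutions:
            expected, new = substitutions[pos]
            if residue != expected:
                raise ValueError(
                    f"position {pos}: reference has {residue}, expected {expected}"
                )
            residue = new
        mutated.append(residue)
    return "".join(mutated)


if __name__ == "__main__":
    # Toy data only: an invented six-residue reference, not the spike protein.
    toy_reference = "MEKRDT"
    toy_substitutions = {2: ("E", "Q"), 4: ("R", "S")}
    toy_deletions = {3}
    print(apply_profile(toy_reference, toy_substitutions, toy_deletions))
```

Checking the expected reference residue before substituting guards against off-by-one numbering errors, which are easy to make when a profile mixes deletions and substitutions.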
Seven patients were involved in intrafamilial cases, two being the index case and a relative whose viral genome exhibited seven nucleotide differences (99.98% identity). All 13 IHU-variant-positive samples showed the same combination of spike mutations identified using real-time qPCR techniques: negativity for 452R and 484Q, positivity for 484K, and, when tested, positivity for 501Y [18] and 681H. We also used a TaqPath COVID-19 kit (Thermo Fisher Scientific, Waltham, USA), which gave positive signals for all three genes targeted (ORF1, S, and N). Thus, the IHU variant could be distinguished in qPCR screening assays from the Delta variant (L452R positive) and the Omicron variant (L452R negative and negative for S gene detection by the TaqPath COVID-19 assay) co-circulating in southern France. Finally, scanning electron microscopy using an SUV 5000 microscope (Hitachi High-Technologies Corporation, Tokyo, Japan) [19] allowed a quick visualization of the virus from a respiratory sample (Fig. 1d).
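The screening logic in the paragraph above can be written as a small decision rule. This is a minimal sketch that encodes only the patterns stated in the text (Delta: L452R positive; Omicron: L452R negative with no S gene signal in the TaqPath assay; IHU: L452R negative, E484K positive, E484Q negative, S gene detected). The function name `classify_screening` and its return labels are hypothetical, and a real laboratory workflow would confirm any such call by genome sequencing.

```python
def classify_screening(
    l452r: bool,
    e484k: bool,
    e484q: bool,
    s_gene_detected: bool,
) -> str:
    """Map qPCR screening results to the pattern they are compatible with.

    Encodes only the rules stated in the text; any other combination
    is reported as unclassified rather than guessed.
    """
    if l452r:
        return "Delta-compatible"
    if not s_gene_detected:
        # L452R negative with S gene target failure in the TaqPath assay
        return "Omicron-compatible"
    if e484k and not e484q:
        return "IHU-compatible"
    return "unclassified"


if __name__ == "__main__":
    # Pattern reported for the 13 IHU-variant-positive samples.
    print(classify_screening(l452r=False, e484k=True, e484q=False,
                             s_gene_detected=True))
```

The rule tests L452R first because, in the text, that single marker is sufficient to separate Delta from the other two patterns.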
Overall, these observations show once again the unpredictability of the emergence of new SARS-CoV-2 variants and their possible introduction from abroad, and they exemplify the difficulty in controlling such introductions and subsequent spread. They also confirm the value of the SARS-CoV-2 genomic surveillance that we started at the very beginning of the pandemic in the Marseille geographical area as soon as we diagnosed the first SARS-CoV-2 infection [19] and that we expanded during the summer of 2020 [2,3]. Such a surveillance program was implemented at the national level in 2021 through the French Emergen consortium (https://www.santepubliquefrance.fr/dossiers/coronavirus-covid-19/consortium-emergen). It is too early to speculate on the virological, epidemiological, or clinical features of this IHU variant based on these 13 cases. To address these questions, respiratory samples from infected patients were inoculated onto Vero E6 cells as described previously [20] in order to assess the susceptibility of this variant to neutralization by anti-spike antibodies elicited by vaccination or prior infection [21].
[...] company and a founder of a microbial culture company (Culture Top). None of the other authors have conflicts of interest to declare. The funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.
Ethics approval This study has been approved by the ethics committee of University Hospital Institute (IHU) Méditerranée Infection (N°2021-029). Access to the patients' biological and registry data issued from the hospital information system was approved by the data protection committee of Assistance Publique-Hôpitaux de Marseille (APHM) and was recorded in the European General Data Protection Regulation registry under number RGPD/APHM 2019-73.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.