Complete genome analysis of African swine fever virus responsible for outbreaks in domestic pigs in 2018 in Burundi and 2019 in Malawi

Several African swine fever (ASF) outbreaks in domestic pigs have been reported in Burundi and Malawi, but whole-genome sequences of circulating outbreak viruses in these countries are limited. In the present study, complete genome sequences of ASF viruses (ASFV) that caused the 2018 outbreak in Burundi (BUR/18/Rutana) and the 2019 outbreak in Malawi (MAL/19/Karonga) were produced using the Illumina next-generation sequencing (NGS) platform and compared with other previously described ASFV complete genomes. The complete nucleotide sequences of BUR/18/Rutana and MAL/19/Karonga were 176,564 and 183,325 base pairs long with GC content of 38.62 and 38.48%, respectively. The MAL/19/Karonga virus had a total of 186 open reading frames (ORFs) while the BUR/18/Rutana strain had 151 ORFs. After comparative genomic analysis, the MAL/19/Karonga virus showed greater than 99% nucleotide identity with other complete nucleotide sequences of p72 genotype II viruses previously described in Tanzania, Europe and Asia, including the Georgia 2007/1 isolate. The Burundian ASFV BUR/18/Rutana exhibited 98.95 to 99.34% nucleotide identity with genotype X ASFV previously described in Kenya and in the Democratic Republic of the Congo (DRC). The serotyping results classified the BUR/18/Rutana and MAL/19/Karonga ASFV strains in serogroups 7 and 8, respectively. The results of this study provide insight into the genetic structure and antigenic diversity of ASFV strains circulating in Burundi and Malawi. This is important in order to understand the transmission dynamics and genetic evolution of ASFV in eastern Africa, with the ultimate goal of designing an efficient risk management strategy against ASF transboundary spread.


Introduction
The aetiology of African swine fever (ASF) is ASF virus (ASFV), a linear double-stranded DNA arbovirus with a genome size ranging between 170 and 194 kilobase pairs (kbp), and the only member of the genus Asfivirus, family Asfarviridae (Alonso et al. 2018). However, a potential new member of the Asfarviridae family designated as Abalone asfa-like virus (AbALV) has been recently reported (Matsuyama et al. 2020). The outcome of ASF infection in domestic pigs and Eurasian wild boars depends on the virulence of the causative ASFV and ranges from acute to chronic disease, with mortality rates approaching 100% in naïve populations (Karger et al. 2019; Pikalo et al. 2019). Due to its high mortality rate, the unavailability of a commercial vaccine or effective treatment, and trade restrictions on domestic pigs and pork products across countries, ASF is considered the most serious threat to the global domestic pig industry (Costard et al. 2009; Couacy-Hymann 2019; Onzere et al. 2018). Transmission of ASF is through direct contact between infected and susceptible domestic pigs or wild boars, ingestion of contaminated pork products, contact with infected fomites, indirect transmission through carcasses in the habitat in the case of wild boars, or bites by infected soft ticks of the Ornithodoros moubata complex (Chenais et al. 2018; Penrith and Vosloo 2009). Soft ticks of the O. moubata complex act as vectors of ASFV while, in eastern and southern Africa, asymptomatically infected wild suids, mainly warthogs (Phacochoerus africanus), play an important role as ASFV reservoirs (Jori et al. 2013). ASFV infection of other wild suid species such as bush pigs (Potamochoerus larvatus) and giant forest hogs (Hylochoerus meinertzhageni) has been previously reported, but their role in the epidemiology of the virus is not well known (Penrith et al. 2019).
Domestic pigs and the pig farming systems in Africa, South of the Sahara, have been reported to play an important role in ASFV transmission and spread (Mwiine et al. 2019; Yona et al. 2020), while the high stability of ASFV in pork products is cited as the major factor in ASFV spread across long distances. For instance, the first escape of the virus from Africa to Portugal in 1957 and again in 1960 was associated with airplane waste containing contaminated pork products that was used for pig feeding, while contaminated ship waste was cited as the origin of ASFV introduction in Georgia in 2007 (Rowlands et al. 2008). More than 33 countries of Africa, South of the Sahara, where the disease is endemic, have reported ASF, and ASFV is becoming more prevalent in European and Asian countries, threatening global food and nutritional security (Penrith et al. 2019).
The ASFV genome varies in size between 170 and 194 kilobase pairs (kbp) with a conserved central region of about 125 kbp, in addition to the left variable region (LVR) of 38 to 47 kbp and the right variable region (RVR) of 13 to 16 kbp (de Villiers et al. 2010). The variation in genome length among ASFV strains is caused by the gain or loss of members of the five multigene families (MGF) of ASFV found in the LVR and the RVR, namely MGFs 100, 110, 300, 360 and 530/505 (Alonso et al. 2018). Previous studies have reported between 151 and 167 ORFs in ASFV genomes (de Villiers et al. 2010). However, an increasing number of studies have reported more than 167 ORFs in ASFV genomes, especially in strains belonging to ASFV p72 genotype II, including seven Polish isolates collected between 2016 and 2017 with 187 to 190 ORFs (Mazur-Panasiuk et al. 2019) and the ASFV strain Belgium/Etalle/wb/2018 detected in wild boar in Belgium in 2018 with 186 ORFs (Gilliaux et al. 2019). A study that analyzed 12 complete genomes of ASFV strains collected in Sardinia, Italy, from 1978 to 2014 reported 231 ORFs in four isolates and 235 ORFs in eight ASFV isolates, with 66 ORFs defined as uncharacterized (Torresi et al. 2020).
Based on partial nucleotide sequence analysis of the B646L gene that encodes the major capsid protein p72, 24 (I-XXIV) ASFV genotypes have been identified and all of these have been reported to circulate in Africa, South of the Sahara (Achenbach et al. 2017; Lubisi et al. 2007; Quembo et al. 2018). Previous studies have reported ASFV p72 genotypes II, V, VIII and XII in Malawi while only ASFV p72 genotype X was reported in Burundi (Hakizimana et al. 2020a, b; Lubisi et al. 2005). Currently, only 3 complete and fully annotated ASFV strains belonging to p72 genotype X are available in GenBank, including two strains from Kenya and one from the Democratic Republic of the Congo (DRC) (Bisimwa et al. 2021; de Villiers et al. 2010). However, despite the endemic status of ASF in Burundi, no ASFV from that country has been fully sequenced. In addition, no ASFV p72 genotype II strain from Malawi has been subjected to complete genome sequencing. In this study, we report the complete genome sequences of ASFV p72 genotype X (BUR/18/Rutana) responsible for the 2018 ASF outbreak in Burundi and ASFV p72 genotype II (MAL/19/Karonga) that caused an outbreak during 2019 in Malawi.

Sequencing of the ASFV complete genome
Collection of the samples used in this study and subsequent ASF confirmation and genotyping have been previously described (Hakizimana et al. 2020a, b). Viral DNA was extracted from tissue samples using the Quick-DNA™ Miniprep Plus Kit (Zymo Research Corporation, CA, USA), following the manufacturer's instructions. The integrity and quality of the extracted DNA were assessed by 1% agarose gel electrophoresis run for 30 min at 160 V with 0.5 μL of sample DNA loaded. The starting genomic DNA for complete genome sequencing was quantified by the PicoGreen method (Invitrogen, Catalog # P7589) using Victor 3 fluorometry (PerkinElmer Life and Analytical Sciences, Shelton, USA). The TruSeq Nano DNA Kit (Catalog # 20015964) was used for library preparation according to the manufacturer's protocol, and an Illumina NovaSeq6000 instrument with a 2 × 150 bp configuration was used for sequencing. Quality control of the prepared library was done on a 2100 Bioanalyzer using a DNA 1000 chip (Agilent Technologies, USA) while library quantification was performed using real-time polymerase chain reaction (qPCR) according to the Illumina qPCR Quantification Protocol Guide (Catalog # SY-930-1010). The libraries were sequenced to produce approximately 28 million paired-end reads (4 GB) per sample.
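The reported yield can be checked with simple arithmetic. The sketch below assumes that "28 million paired-end reads" counts individual 150 bp reads, which is an interpretation rather than something the text states; the function names are illustrative. The depth figure is only a theoretical upper bound, since reads from tissue samples are expected to derive largely from host DNA.

```python
def sequencing_yield(n_reads: int, read_length: int) -> int:
    """Total number of sequenced bases for n_reads reads of fixed length."""
    return n_reads * read_length


def max_theoretical_depth(total_bases: int, genome_length: int) -> float:
    """Upper bound on mean depth if every base came from the viral genome."""
    return total_bases / genome_length


if __name__ == "__main__":
    # Assumption: 28 million individual reads of 150 bp each.
    total = sequencing_yield(28_000_000, 150)
    print(f"Yield: {total / 1e9:.1f} Gb")
    # Genome length of MAL/19/Karonga as reported in this study.
    print(f"Depth upper bound: {max_theoretical_depth(total, 183_325):,.0f}x")
```

Under this assumption the yield is about 4.2 Gb, which is consistent with the approximately 4 GB per sample reported above.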

Assembly and annotation of the ASFV genome
Adapter sequences and low-quality reads were trimmed using Trim Galore version 0.6.4 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) with cutadapt version 2.8; the Phred quality score cutoff was set to 30 with a minimum read length of 75 nucleotides. The quality of the filtered sequence data was assessed using FastQC version 0.11.9 (Andrews 2010). The quality-filtered reads were de novo assembled using SPAdes version 3.13.1 (Bankevich et al. 2012) and Megahit version 1.2.9 (Li et al. 2015). The assembly contigs were mapped to the reference genome using Burrows-Wheeler Aligner (BWA) version 0.7.17 with the maximal exact matches (mem) algorithm (Li 2013) and the QUAST program version 5.0.2 (Gurevich et al. 2013) was used to evaluate the quality of the assembly. The longest overlapping scaffolds were assembled to generate the ASFV complete genomes. The Genome Annotation Transfer Utility (GATU) software (Tcherepanov et al. 2006) was used for annotation of the assembled ASFV genomes using Georgia 2007/1 (GenBank accession number NC_044959.2) and Ken05/Tk1 (GenBank accession number NC_044945.1) as reference genomes. The basic local alignment search tool for nucleotides (BLASTN) version 2.11.0+ (Zhang et al. 2000) was used for pairwise nucleotide alignment and for searching the GenBank nucleotide database for nucleotide identity. Multiple sequence alignment was carried out using the MAFFT program version 7.221 (Katoh and Standley 2013). The evolutionary history was inferred using the maximum likelihood method with 1000 bootstrap replications, and evolutionary distances were calculated using the Kimura 2-parameter model (Kimura 1980) as implemented in MEGA X (Kumar et al. 2018).
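The read-processing and assembly steps above can be sketched as a list of command lines. This is a minimal illustration, not the exact invocations used in the study: only the Phred cutoff of 30 and the minimum read length of 75 come from the text, while the file names, the output directory layout and all other options are assumptions. The helper `trimmed_name` encodes the usual Trim Galore naming of paired-end outputs and is hypothetical in the sense that it was not part of the described workflow.

```python
from pathlib import Path


def trimmed_name(fastq: str, mate: int) -> str:
    """Assumed Trim Galore paired-end output name for a gzipped FASTQ input."""
    name = Path(fastq).name
    for ext in (".fastq.gz", ".fq.gz"):
        if name.endswith(ext):
            return f"{name[:-len(ext)]}_val_{mate}.fq.gz"
    raise ValueError(f"unexpected FASTQ extension: {fastq}")


def build_commands(r1: str, r2: str, reference: str, outdir: str = "asfv_run"):
    """Return argument lists for each step; nothing is executed here."""
    out = Path(outdir)
    t1 = str(out / "trimmed" / trimmed_name(r1, 1))
    t2 = str(out / "trimmed" / trimmed_name(r2, 2))
    spades_contigs = str(out / "spades" / "contigs.fasta")
    megahit_contigs = str(out / "megahit" / "final.contigs.fa")
    return [
        # Adapter and quality trimming (Phred >= 30, reads >= 75 nt).
        ["trim_galore", "--paired", "--quality", "30", "--length", "75",
         "--output_dir", str(out / "trimmed"), r1, r2],
        # Quality report on the filtered reads (output directory must exist).
        ["fastqc", "--outdir", str(out / "fastqc"), t1, t2],
        # De novo assembly with two assemblers.
        ["spades.py", "-1", t1, "-2", t2, "-o", str(out / "spades")],
        ["megahit", "-1", t1, "-2", t2, "-o", str(out / "megahit")],
        # Map contigs to the reference; bwa mem writes SAM to standard output.
        ["bwa", "index", reference],
        ["bwa", "mem", reference, spades_contigs],
        # Assembly quality evaluation against the reference.
        ["quast.py", spades_contigs, megahit_contigs,
         "-r", reference, "-o", str(out / "quast")],
    ]


if __name__ == "__main__":
    # Hypothetical input file names, for illustration only.
    for argv in build_commands("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                               "reference.fasta"):
        print(" ".join(argv))
```

Each argument list could be passed to `subprocess.run`, with the standard output of `bwa mem` redirected to a SAM file.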

Characteristics of the complete genomes of Burundian and Malawian ASFV strains
Complete genome sequences of the ASFV strains responsible for the 2018 outbreak in Rutana region, South-eastern Burundi (BUR/18/Rutana), and the 2019 outbreak in Karonga district, northern Malawi (MAL/19/Karonga), were determined in this study. The strains BUR/18/Rutana and MAL/19/Karonga belong to ASFV p72 genotypes X and II, respectively, as previously described through partial genome amplification and sequencing targeting specific genomic regions (Hakizimana et al. 2020a, b ORFs as highlighted by the whole-genome alignment of homologous genes between the ASFV strains described in this study and the corresponding reference genomes (Fig. 1). For MAL/19/Karonga, a total of 44 multigene family (MGF) members were identified within the genome including MGF 100 (3 members), MGF 110 (10 members), MGF 300 (3 members), MGF 360 (18 members) and MGF 505 (10 members). Furthermore, 36 MGF members were identified within the genome of BUR/18/Rutana strain including MGF 100 (1 member), MGF 110 (8 members), MGF 300 (3

Comparative genomic analysis
In a BLASTN search of GenBank using the complete genome sequences, the MAL/19/Karonga virus was most closely related, with 99.97% nucleotide identity, to the Tanzania/Rukwa/2017/1 ASFV strain (GenBank accession number LR813622), a p72 genotype II virus collected in South-western Tanzania from an infected domestic pig during an ASF outbreak in 2017. The percentage of nucleotide identity was greater than 99% with other complete genomic sequences of ASFV belonging to p72 genotype II isolated in Europe and Asia, including the Georgia 2007/1 isolate. On the other hand, the BUR/18/Rutana ASFV strain exhibited 99.34%, 99.08% and 98.95% nucleotide identity with the Uvira B53 (Bisimwa et al. 2021), Ken05/Tk1 (Bishop et al. 2015) and Kenya 1950 (GenBank accession number AY261360) ASFV p72 genotype X strains, respectively (Table 1). Phylogenetic reconstruction using complete genomes clustered the MAL/19/Karonga and BUR/18/Rutana viruses into ASFV genotypes II and X, respectively (Fig. 2).
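To make the two quantities used in this comparison concrete, the sketch below computes percent nucleotide identity and the Kimura 2-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], from two pre-aligned sequences, where P and Q are the proportions of transitions and transversions. It is an illustration under stated assumptions, not the BLASTN or MEGA X implementation: columns with gaps or ambiguous bases are simply skipped here, whereas BLASTN counts gaps in the alignment length, so values can differ slightly. The function name is illustrative.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def identity_and_k2p(seq_a: str, seq_b: str):
    """Return (percent identity, Kimura 2-parameter distance) for aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (an assumption of this sketch)
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    identity = 100.0 * (1.0 - p - q)
    k2p = -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))
    return identity, k2p


if __name__ == "__main__":
    # Toy aligned sequences, not data from this study.
    ident, dist = identity_and_k2p("ACGTACGTAC", "ACGTACGTAT")
    print(f"identity = {ident:.2f}%, K2P distance = {dist:.4f}")
```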

Determination of the serogroups of Burundian and Malawian ASFV strains based on EP402R (CD2v) gene sequences
In order to classify the ASFV strains described in this study among the eight previously determined serogroups based on the ASFV hemadsorption inhibition (HAI) properties, we compared their sequences of the EP402R gene, which encodes the major ASFV antigen protein CD2v, with those of selected isolates representing each serogroup retrieved from GenBank. High nucleotide sequence variation was observed among the compared sequences, and the serotyping results classified the BUR/18/Rutana and MAL/19/Karonga ASFV strains in serogroups 7 and 8, respectively (Fig. 3). The Burundian ASFV strain grouped together with two previously described strains belonging to serogroup 7, namely the Uvira B53 ASFV strain collected during an ASF outbreak in South Kivu province of the Democratic Republic of the Congo (DRC) in 2019 and the Uganda ASFV strain (Bisimwa et al. 2021; Malogolovkin et al. 2015a, b), whereas the MAL/19/Karonga ASFV strain clustered together with strains belonging to ASFV serogroup 8 previously described in Europe and Asia.

Discussion
The limited knowledge of the genetic variation of ASFV has hindered the development of effective control and prevention strategies, including vaccine, diagnostic test and antiviral treatment development (Arabyan et al. 2019; Bao et al. 2021; Torresi et al. 2020; Urbano et al. 2021). Partial nucleotide sequencing of specific ASFV genomic regions is conventionally used to determine ASFV genotypes and to discriminate closely related ASFV strains. However, in order to obtain adequate information on transmission dynamics, genetic variation and molecular evolution of different ASFV strains, complete genome sequencing is required. To date, despite the regular reports of ASFV p72 genotype II in different countries of eastern and southern Africa, only one fully annotated complete genome of genotype II from those countries is publicly available, namely Tanzania/Rukwa/2017/1, collected in South-western Tanzania from an infected domestic pig during an outbreak in 2017 (Njau et al. 2021). No ASFV p72 genotype II strain from Malawi has been subjected to complete genome sequencing and no ASFV strain from Burundi has been fully sequenced. In the present study, complete genome sequences of the ASFV p72 genotype X responsible for the 2018 outbreak in Burundi and the genotype II virus that caused the 2019 ASF outbreak in Malawi were generated using Illumina NGS technology. The complete genome sequences generated in this study were closely related to previously described ASFV strains available in the GenBank database, belonging to ASFV p72 genotype X for the BUR/18/Rutana strain from Burundi and to genotype II for the MAL/19/Karonga strain from Malawi. In addition, serotyping results classified the BUR/18/Rutana and MAL/19/Karonga ASFV strains into ASFV serogroups 7 and 8, respectively.
The Burundian ASFV strain was most closely related to the Uvira B53 ASFV strain collected during an ASF outbreak in South Kivu province of the DRC (Bisimwa et al. 2021), with 99.34% nucleotide identity. These findings are in agreement with the results of studies using partial nucleotide sequencing, in which relatedness between those two ASFV strains was reported (Bisimwa et al. 2020; Hakizimana et al. 2020b), highlighting the possibility of transboundary spread of genotype X viruses between Burundi and the DRC, as previously speculated. Furthermore, the Malawian ASFV strain described in this study was most closely related to the Tanzania/Rukwa/2017/1 ASFV strain collected in South-western Tanzania from an infected domestic pig during an ASF outbreak in 2017 (Njau et al. 2021), with 99.97% nucleotide identity. The high nucleotide similarity between ASFV p72 genotype II strains circulating in Malawi and Tanzania has been previously reported by studies using partial nucleotide sequencing, suggesting a common source and transboundary spread of ASFV between these two countries (Hakizimana et al. 2020a; Misinzo et al. 2012). In addition, the Malawian ASFV strain had more than 99% nucleotide identity with ASFV p72 genotype II viruses previously described in Europe and Asia, suggesting a possible common ancestor of these ASFV strains, as previously speculated (Hakizimana et al. 2020a; Misinzo et al. 2012; Quembo et al. 2018; Rowlands et al. 2008).
Comparative genomic analysis revealed genetic variation in the ASFV strains described in this study compared to previously described ASFV genomes available in GenBank. For instance, the DP96R gene, reported as absent in the Uvira B53 ASFV strain, was present in the BUR/18/Rutana and MAL/19/Karonga strains with 93.6% and 100% nucleotide identity with the Ken05/Tk1 ASFV p72 genotype X and Georgia 2007/1 ASFV p72 genotype II reference genomes, respectively. The DP96R gene encodes the UK protein, potentially involved in determining ASFV virulence in domestic pigs (Zsak et al. 1998), and its presence in the BUR/18/Rutana and MAL/19/Karonga ASFV strains may explain the high virulence of these strains, as evidenced by the high mortality rate during the 2018 and 2019 ASF outbreaks in Rutana region of Burundi and Karonga district in northern Malawi, as previously described (Hakizimana et al. 2020a, b). In addition, the K196R and B119L (9GL) genes, which encode the thymidine kinase and sulfhydryl oxidase enzymes, respectively, and have also been described as virulence factors for ASFV (Rodríguez et al. 2015), were present in the BUR/18/Rutana and MAL/19/Karonga ASFV strains.
Previous studies have reported important genetic variation within the members of the MGFs located at both ends of the ASFV genome, resulting in differences in the genome size of different ASFV strains (Torresi et al. 2020; Urbano et al. 2021). In the present study, several single-nucleotide polymorphisms (SNPs) and complete ORF deletions were observed within different MGF members. For instance, four MGF members (MGF 100-1R, MGF 110-7L, MGF 110-8L and MGF 110-9L) absent in the BUR/18/Rutana strain were also missing in the Uvira B53 strain, as previously reported (Bisimwa et al. 2021). The MGF 360-1Lb gene was truncated in the MAL/19/Karonga strain, and the same observation was reported in the China/2018/AnhuiXCGQ ASFV strain collected during an ASF outbreak in domestic pigs in Anhui province of China in September 2018. In addition, a deletion of almost all members of MGF 110 was reported in the Estonia 2014 ASFV strain (Zani et al. 2018). The impact of these genetic variations on the phenotypes of the ASFV strains described in this study is subject to further investigation.
The protein pEP402R, a homologue of the T-lymphocyte surface antigen CD2 encoded by the EP402R gene, is located in the lipoprotein membrane of the outer viral envelope and plays an important role in the adhesion of erythrocytes to infected cells (hemadsorption) and the binding of ASFV particles to host erythrocytes during infection (Alejo et al. 2018; Dixon et al. 2019). This gene has been used to define eight viral antigenic types called serogroups (Malogolovkin and Kolbasov 2019). The results of the present study showed that the BUR/18/Rutana and MAL/19/Karonga ASFV strains may share hemadsorption properties with ASFV strains belonging to serogroups 7 and 8, respectively (Fig. 3). It has been reported that ASFV isolates classified into the same serotype show cross-protective responses to challenge during vaccine development experiments (Malogolovkin et al. 2015a, b; Sánchez et al. 2019). Thus, the determination of ASFV serogroups was suggested as a perfect tool for discriminating ASFV strains with different virulence and for predicting the efficacy of a specific ASFV vaccine (Burmakina et al. 2016). Recently, genetic signatures specific to each ASFV serotype have been described, with the potential to further elucidate the genetic and antigenic diversity of ASFV (Malogolovkin et al. 2020; Urbano et al. 2021). Interestingly, the PPPKPC amino acid sequence was repeated 4 and 3 times in the BUR/18/Rutana and MAL/19/Karonga ASFV strains, respectively. Similar tandem amino acid repeat sequences within the EP402R (CD2v) gene were reported in the Uvira B53 ASFV strain (Bisimwa et al. 2021).
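Counting such repeats in a translated CD2v sequence can be sketched as follows. This is an illustrative approach, not the method used in this study; the text does not specify whether the repeats are strictly contiguous, so the sketch reports both the longest uninterrupted run and the total number of occurrences. The protein strings are toy examples, not real CD2v sequences, and the function names are hypothetical.

```python
import re


def longest_tandem_run(protein: str, motif: str = "PPPKPC") -> int:
    """Largest number of back-to-back copies of motif found in protein."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(protein)]
    return max(runs, default=0)


def total_occurrences(protein: str, motif: str = "PPPKPC") -> int:
    """Number of non-overlapping copies of motif anywhere in protein."""
    return protein.count(motif)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    toy_four = "MKT" + "PPPKPC" * 4 + "SLE"
    toy_three = "MKT" + "PPPKPC" * 3 + "SLE"
    print(longest_tandem_run(toy_four), longest_tandem_run(toy_three))
```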
In conclusion, the results of this study provided important insight into the genetic structure of the ASFV p72 genotype X responsible for the 2018 outbreak in Burundi and the genotype II virus that caused the 2019 ASF outbreak in Malawi. Additionally, the strains BUR/18/Rutana and MAL/19/Karonga were classified into ASFV serogroups 7 and 8, respectively. These results will serve as a backbone for possible future investigations concerning molecular evolution, transmission dynamics, diagnostic improvement and control strategies for ASFV.
March 2011 related to the practice of veterinary medicine in Burundi while in Malawi, the Control and Diseases of Animals Act (CAP 66:02 of 1967) and the rule 6 of the Swine Fever Rules G.N. 209/1968 were followed. Oral consent was obtained from the domestic pig owners before sampling of their dead domestic pigs and documented in the Veterinary Officer registry.

Conflict of interest
The authors declare no competing interests.

Disclaimer
The funder had no role in study design, data collection and analysis, decision to publish and in the preparation of this manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.