Genome Characteristic of Bordetella parapertussis Isolated from Iran

Pertussis also known as whooping cough is a respiratory infection in humans particularly with severe symptoms in infants and usually caused by Bordetella pertussis. However, Bordetella parapertussis can also cause a similar clinical syndrome. During 2012 to 2015, from nasal swabs sent from different provinces to the pertussis reference laboratory of Pasture Institute of Iran for pertussis confirmation, seven B. parapertussis isolates were identified by bacterial culture, biochemical tests, and the presence of IS1001 insertion in the genome. The expression of pertactin (Prn) as one the major virulence factor for bacterial adhesion was investigated using western blot. Moreover, the genomic characteristic of one recently collected isolate, IRBP134, from a seven-month infant was investigated using Illumina NextSeq sequencing protocol. The results revealed the genome with G+C content 65% and genome size 4.7 Mbp. A total of 81 single nucleotide polymorphisms and 13 short insertions and deletions were found in the genome compared to the B. parapertussis 12822 as a reference genome showing ongoing evolutionary changes. A phylogeny relationship of IRBP134 was also investigated using global B. parapertussis available genomes. Supplementary Information The online version contains supplementary material available at 10.1007/s00284-022-03009-x.


Introduction
Bordetella parapertussis is a Gram-negative bacterial pathogen that colonise in the respiratory tract and cause the vaccine-preventable disease known as whooping cough. However, the severity of the disease caused by B. parapertussis is thought to be shorter in duration and milder than B. pertussis, the main responsible pathogen for pertussis in human [1,2].
Comparative genome analysis of Bordetella species revealed that B. pertussis and B. parapertussis were independently evolved from B. bronchiseptica-like ancestors [3]. B. parapertussis diverged to two distinct lineages: one causes whooping cough in infants and the other infects sheep [1].
Bordetella parapertussis and B. pertussis share same virulence factors including pertactin (Prn), dermonecrotic toxin, filamentous haemagglutinin (Fha) and adenylate cyclase [4]. However, the pertussis toxin (Ptx) as one of the major virulence factors is only expressed in B. pertussis, since the ptx operon in B. parapertussis is dysfunctional due to the mutation in the ptx promoter and coding region [3]. In addition, unlike B. pertussis, B. parapertussis isolated from human is oxidase negative and harbour IS1001 insertion element in the genome [3] which can differentiate the clinical isolates from B. pertussis (IS481) using rt-PCR in diagnostic laboratories [5]. Despite high vaccine coverage against pertussis, there are still some pertussis epidemics in many countries including Iran [6,7]. Unlike numerous studies reported B. pertussis as the main cause of these epidemics [8] few studies reported a widespread infection caused by B. parapertussis [9]. In Iran, previous reports showed the lower isolation rate of B. parapertussis compared to B. pertussis [10].
Unlike B. pertussis, there are a few studies investigating the genomic and proteomic characteristics of B. parapertussis or analysing recently collected isolates [3,[11][12][13][14]. Here, we investigate the expression of pertactin (Prn) as one of the important immunologic antigens that are responsible in the bacterial adhesion to host cells and are involved in the acellular pertussis vaccine. Furthermore, the genomic characteristic and microevolutionary changes of one recently isolated B. parapertussis in Iran was analysed using whole genome sequencing approach and compared with global isolates.

Bacterial Isolation and Identification
Nasal swabs of pertussis suspected patients were sent to pertussis reference laboratory of Pasture Institute of Iran for bacterial isolation and infection confirmation during the 2012-2015 pertussis epidemic in Iran [6]. As previously described [15], samples were cultured on Regan-Lowe medium containing charcoal agar and 10% defibrinated sheep blood and incubated at 37 °C for 72 h. B. parapertussis isolates were confirmed by a combination of colony morphology, Gram stain and conventional biochemical tests such as oxidase and real-time PCR. DNA extracted using High Pure PCR Template Preparation Kit (Roche Diagnostics GmbH, Mannheim, Germany) and real-time PCR was performed by targeting IS481, IS1001 and IS1002 with designed primers [16] to confirm the presence of IS1001 for B. parapertussis as recommended by WHO [5].

Western Blot
Western blot analysis was performed to investigate the expression of pertactin in our isolates. Briefly, isolates were suspended in phosphate-buffered saline (PBS) and boiled at55 °C for 30 min after bacterial subculture on Bordet Gengou agar with 15% sheep blood at 37 °C for 72 h. B. pertussis strain Tohama I (Gene Bank Accession Number BX470248) and Klebsiella (ATCC 13883) were used as a positive and negative control, respectively. Bacterial proteins were separated by 10% sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) (Bio-Rad).
After electrophoresis, the proteins were transferred to polyvinylidene fluoride (PVDF) membranes at 300 V for 1 h. The membranes were blocked with skim milk in PBS for overnight. The 220-kDa Fha protein and 69-kDa prn was detected using a mouse anti-Prn antibody (NIBSC, UK), then incubated with a horseradish peroxidase (HRP)-conjugated anti-mouse antibody. After a final wash, membranes were developed with Metal Enhanced DAB Substrate (Thermo Fisher Scientific, Waltham, MA, USA).

Whole Genome Sequencing
Bordetella parapertussis isolate IRBP134 collected from a fully vaccinated 7-month-old female baby in autumn 2015 and was selected for whole genome sequencing. DNA was extracted and purified from pure culture as described previously [17]. DNA libraries were prepared with the insert size of 150 bp paired-end using NexteraXT DNA kit (Illumina) and sequenced on the Nextseq (Illumina) with a minimum coverage of 150-fold. The raw reads were submitted to the GeneBank database under the Biosample number SAMN18214790.

Bioinformatics
De novo assembly and genome annotation were performed as described previously [6,17]. Assemblies were also submitted to CRISPRfinder and Phaster for CRISPR (clustered interspaced short palindromic repeats) and prophage prediction in the genome, respectively [18,19]. Reads were mapped against B. parapertussis strain 12822 (GenBank: NC_002928.3), that is used as a reference genome in most studies, and SNPs and indels were detected as previously described [6,17]. Virulence-associated gene analysis and the multilocus sequence typing (MLST) were performed using the Bacterial Isolates Genome Sequence Database (BIGSdb) at https:// bigsdb. paste ur. fr/ borde tella//. The maximum parsimony algorithm was used to construct phylogenetic tree by MEGA7 [20]. Tree-Bisection-Reconnection (TBR) was used to search optimal trees. Bootstrap analysis was based on 1000 replicates and B. parapertussis 12822 was used as reference genome.

Detection of B. parapertussis in Clinical Samples
From 4923 swabs sent to Pasture pertussis reference laboratory during 2012 to 2015, seven B. parapertussis isolates were confirmed of which four isolates collected from unvaccinated infants with the age of 2-month-old or less and three were collected from fully vaccinated patients older than 6-month-old (Table 1). Tehran as a capital and Eastern Azarbayjan, a north-western province, each had three isolates. These two provinces had the most B. pertussis isolates in recent years as reported previously [6,15]. During 2012-2015 around 112 B. pertussis isolates collected from different provinces in Iran [6] showing low isolation rate of B. parapertussis in the country with 50 years whole cell pertussis vaccination history. In recent years, there are some reports showing the increase of pertussis cases caused by B. parapertussis especially in countries with acellular pertussis (ACV) immunisation program [9,11,21,22]. ACV has been introduced in most developed countries since 1990s for immunisation due to the reported side effects of WCV and usually contained three (Ptx, Prn and Fha) or five components (additional fimbriae Fim2 and Fim3) [23]. The increase in the B. parapertussis isolation might be due to the fact that it might have better fitness under the ACV pressure since it does not express Ptx, as the major virulence protein that are involved in all types of one to five component ACVs [24].

Prn Expression
Pertactin (Prn) is an important surface antigen as an adhesion factor in B. pertussis and B. parapertussis and included in ACVs. B. parapertussi isolates that do not express Prn were reported in France in recent years [11,25]. B. pertussis strains that do not express Prn have been reported in many countries including Australia where the majority of collected clinical isolates in recent years are Prn negative [17,26]. Like B. pertussis the majority of collected B. parapertussis isolates (94.3%) in France since 2007 do not express Prn that suggested to be due to the ACV vaccine pressure [27]. This phenotype is caused by a deletion of one Adenine in region I of the prn gene (position 988, 12.1%, 4/33) or a Guanine in region II (position 1895, 75.8%, 25/33) both of which lead to a stop codon [27]. It is shown that prnnegative B. pertussis have better fitness under ACV pressure [28]. The emergence of Prn negative B. parapertussis isolates could be a global concern especially in countries with ACV immunisation program since this phenotype can easily escape vaccine pressure by not expressing Ptx and Prn as two main components of ACVs. We previously showed that no mutation or disruption was found in the prn gene in current circulating predominant B. pertussis isolates in Iran and no Prn negative isolate was reported from Iran with WCV immunisation program [6,15,16]. Here, western blot was carried out to investigate the expression of Prn in our B. parapertussis isolates and it showed all seven isolates express Prn and confirmed the WCV immunisation did not affect phenotype evolution of this species in Iran.

General Genome Features
We sequenced one recently collected B. parapertussis isolate, IRBP134, from fully vaccinated infant. The Nextseq sequencing generated 6,897,792 paired reads with GC content 65% and coverage rate 231. De novo assembly generated 72 scaffolds with genome size of 4,720,964 bps with N50, 106,547.
The genome annotation showed the total of 4620 potential coding sequences and 55 RNA including 63 tRNA as well as one large and one small subunit of ribosomal RNA.
PHASTER tool was used to identify phage region in the genome and showed one incomplete phage regions with average size 9.7 kb and GC content 68.17% encoding 11 proteins. The sequence of potential prophages in the genome is identified and categorised as intact, incomplete or questionable based on the identity score [19].
CRISPR (Clustered Regulatory Interspersed Short Palindromic Repeats) systems were first discovered in E. coli in 1987 and later in other species. It is based on the generating specific CRISPR RNA (crRNA) which target invasive RNA/DNA sequences and cleave it into multiple smaller sequences by the endonuclease activity of CRISPRassociated (cas) proteins [29]. Based on the CRISPR/cas database [30], seven CRISPR sequences were found in the genome which is consistent with the average number of CRISPR sequences in other submitted B. pertussis isolates in the databases (Supplementary File 1). Furthermore, two  File 1). We aligned the nucleotide sequences of these two identified Cas proteins in NCBI database using BLASTn tool and found the sequences are identical with mfd gene in B. parapertussis encoding transcription-repair coupling factor (Mfd) that associates elongation transcription complexes in bacteria and helps RNA polymerase to finish the transcription [31]. Therefore, the identified Cas-associated protein in our genome needs to be investigated further to be confirmed as Cas protein since in the CRISPRcas database there was no Cas-related protein to be identified for this species.

Microevolutionary Analysis
Studies showed pertussis vaccines using B. pertussis as vaccine seed can protect body against B. parapertussis as well [5]. There are numerous reports showing allelic variation, genome reduction and ongoing microevolutionary adaptation in currently circulating B. pertussis around the word especially in countries switched to acellular vaccine [17,[32][33][34]. Vaccine pressure particularly the switch from whole cell vaccine to acellular vaccine was one of the main reasons for pathogen adaptation [35][36][37]. Since B. parapertussis causes pertussis with milder symptoms, they isolated and reported less than B. pertussis and there are very few studies investigating the adaptation of clinical B. parapertiussis isolates [3,9,11].
Our study showed IRBP134 has the same allelic profile of major virulence-associated genes (ptxS1-ptxS5, prn, fim2, fim3, cyaA, bvgA, bvgS) compared to the reference genome. The only exception is brkB gene that like other B. parapertussis isolates [25], IRBP134 carries allele 6 of brkB gene, encoding a cytoplasmic membrane protein called Bordetella serum resistance (Supplementary File 2). This allele variation is a result of a non-synonymous SNP changing polar Threonine to nonpolar Alanine in BrkB in position 742 of an immunologic protein of B. pertussis. It plays an important role in B. pertussis as a virulence factor mediating adhesion of bacterium to the host cell and is also shown to be expressed in B. parapertussis [38]. The multilocus sequence typing (MLST) analysis according to 7-gene scheme [39] shows IRBP134 belonged to ST19 as it is reported for B. parapertussis strain 12822, FR6242 and some other recently collected isolates [25].
To investigate the genomic microevolution of Iranian B. parapertussis, reads were mapped against B. parapertussis strains 12822 as a reference and a total of 82 SNPs found of which 68 were in coding regions (Supplementary File 2).
From a total of 14 mutations found in intergenic region, ten were in the promoter region of genes including a promoter region of petA encoding ubiquinol-cytochrome C reductase iron-sulphur subunit. It is a respiratory chain protein that generates an electrochemical potential coupled to ATP synthesis. Another important intergenic mutation was in the promoter region of bfrE, the virulence-associated gene in B. pertussis, encoding probable TonB-dependent receptor for iron transport [3].
From a total of 13 indels (four genic and nine intergenic), four located in coding regions leading to the frameshift mutations of which three were in pseudogenes which may convert them to the active genes (Supplementary File 2). There was no gene insertion or deletion in the genome of the isolate compared to the reference genome.

Global Relationships of B. parapertussis Isolates
The phylogeny relationship of the 103 available human B. parapertussis genomes including, IRBP134, was investigated using 896 SNPs against B. parapertussis strain 12,822 ( Figure 1). Isolates were collected between 1974 and 2018 from seven countries mainly from the United States (Supplementary File 3) [14,25]. From a total of 103 B. parapertussis isolates collected during 1974 to 2018, 95 isolates make a large lineage and separated from nine isolates including the reference genome by 17 SNPs including two nonsynonymous-SNPs in rnC and brkB encoding ribonuclease III and Bordetella serum resistance protein, respectively. As discussed previously, the nsSNP in brkB gene resulted in allele variation from 2 in the reference genome to 6 (Fig. 1). The nsSNP in rnC also caused allelic shift from allele 3 to allele 1. The lineage then was separated into two clades as Clade 1 and Clade 2, each with nine clade-specific SNPs. Clade-1 with 21 (20%) isolates mostly collected before 2000 and share nine common SNPs of which seven located in genes including acpS, ddlB, BPP0416, BPPP0452, BPP2138, BPP3371 and BPP3541. IRBP134 located in this clade and made a subgroup with two other isolates from the USA (FDAARGOS177,1935) and Switzerland (502474-16, 2016) separating from other isolates in Clade 1 with 22 SNPs including three intergenic and 19 within genes such as ppc, oplaH and atpG. FDAARGOS177 which is an FDA standard reference strain and IRBP134 shared 13 SNPs, one intergenic and 12 in genes including pyrB and trpB and BPP0058 that encodes 50S ribosomal protein, and differed with 2 novel SNPs for IRBP134. The two nsSNPs located in the BPP2476 and BPP3004 encoding hypothetical protein and putative cytochrome C, respectively, grouped as clade2 and separated from clade. The majority of recently collected isolates from different countries were grouped in Clade2 with nine SNPs all located in genes of which three located in lolD, rnC, thiD encoding lipoprotein releasing system ATP-binding protein, ribonuclease III and phosphmethylpyrimidines kinase.

Conclusion
To summarise, we identified seven B. parapertussis isolates from pertussis cases during 2012-2015. Unlike B. parapertussis isolates collected from countries with ACV vaccination that do not express Prn, all Iranian isolates express Prn as one of the major components of ACV vaccine confirming the ACV vaccine pressure on the Prn expression. The global phylogeny analysis showed IRBP134 was grouped with an isolate from the USA and located in the a b 3 SNPs Fig.1 Phylogenetic relationships of Iranian B. parapertussis isolate (IRBP134) with global isolates. Maximum parsimony phylogeny using 895 SNPs from a total of 104 B. parapertussis isolates from different countries, mainly from the USA-Blue-and other countries such as the UK, France, China, Austria and Switzerland (details of the isolates are available in Supplementary File 3). B. parapertussis strain 12,822 was used as a reference. a Majority of isolates (95) carried allele 6 of brkB as a virulence-associated gene and clustered in two clades as clade 1 and clade 2 each with nine unique SNPs. b The Iranian isolate IRBP134 located in clade 1 which mostly consists of isolates collected before 2000 and make a subclade with two isolates from Switzerland and the USA with 22 SNPs. IRBP134 separated from the USA isolate, FDAARGOS177, with two non-synonymous SNPs located in BPP2476 and BPP3004. Most of the recently collected isolates from the USA, UK, China and France are in Clade -2 with common 9 SNPs clade 1. To the best of our knowledge, this is one of the first reports investigating the whole genomic features of recently isolated B. parapertussis. Our results revealed few mutations leading to ongoing genomic adaptation in our isolate. To investigate mutation rate and its effect on the fitness of B. parapertussis isolates in Iran, more clinical B. parapertussis isolates are needed to be collected from the country to be analysed in terms of genomics or proteomics.
Author Contributions FS and AS designed the research study. FS provided fund for the study. MN performed the bacterial isolation and Real Time PCR confirmation. SS performed DNA extraction and purification for WGS and western blot analysis. AS analysed the genomic data, CRISPR, phage analysis and wrote the paper. ACYT and BL performed sequencing of extracted DNA. All authors read and approved the final version of the paper.
Funding Open Access funding enabled and organized by CAUL and its Member Institutions. This work was supported financially by Pasteur Institute of Iran grant number 968.

Data Availability
The raw reads were submitted to the GeneBank database under the Biosample number SAMN18214790.
Code Availability Not applicable.

Declarations
Conflict of interest All authors declare that they have no conflict of interest.

Ethical Approval Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.