Phylogenomic characterisation of a novel corynebacterial species pathogenic to animals

The genus Corynebacterium includes species of biotechnological, medical and veterinary importance. An atypical C. ulcerans strain, W25, was recently isolated from a case of necrotizing lymphadenitis in a wild boar. In this study, we analysed the genome sequence of this strain and compared its phenotypic and virulence properties with those of other corynebacterial pathogens. Phylogenomic analyses revealed that strain W25 belongs to a novel species along with PO100/5 and KL1196. The latter strains were isolated from a pig and a roe deer, respectively; hence, this species appears to be associated with animals. The isolate W25 is likely a non-toxigenic tox gene-bearing (NTTB) strain and may have a compromised ability to adhere to pharyngeal and laryngeal epithelial cells due to potential loss of gene function in the spaBC and spaDEF pilus gene clusters. A number of corynebacterial virulence genes are present, including pld encoding phospholipase D. Therefore, this strain may be able to cause severe invasive infections in animals and zoonotic infections in humans.


Introduction
Corynebacterium is a diverse genus that includes species of biotechnological, medical and veterinary importance (Bernard and Funke 2015). One of the corynebacterial species, Corynebacterium ulcerans, is an important zoonotic pathogen that is often acquired from canine pets and causes diphtheria-like infections in humans (Hacker et al. 2016; Mattos-Guaraldi et al. 2014). C. ulcerans has also been isolated from other animals, including camels, cattle, cats, goats, ground squirrels, monkeys, pigs, otters and whales.
Two Spa gene clusters, spaDEF and spaBC, have been reported among C. ulcerans strains (Subedi et al. 2018; Trost et al. 2011). Pilus gene clusters encode surface pili that play a key role in adhesion to and invasion of host cells (Broadway et al. 2013; Reardon-Robinson and Ton-That 2014). Variation in the number of pilus gene clusters and gain or loss of gene function was found to correlate with differences in the severity of infection by Corynebacterium diphtheriae, another important human pathogen closely related to C. ulcerans (Grosse-Kock et al. 2017; Ott et al. 2010; Sangal et al. 2015).
We have recently isolated an atypical C. ulcerans strain, W25, associated with necrotizing lymphadenitis in a wild boar and published the genome sequence (Busch et al. 2019). While the size of the genome is consistent with other C. ulcerans genomes, the G + C content of W25 was approximately 1.0% higher than that of other C. ulcerans strains (Busch et al. 2019). This may reflect significant differences in the gene content and virulence properties of this strain compared with other C. ulcerans isolates. Therefore, we compared the phenotypic and virulence properties of this strain and performed a comparative genomic analysis against other C. ulcerans isolates.

Bacterial strains and culture conditions
Eight corynebacterial strains were included for phenotypic characterisation (Table 1). The strains were cultured in Brain Heart Infusion (BHI) broth at 37°C and were incubated overnight in a shaking incubator.

Strain identification
Strain W25 was analysed by MALDI-TOF mass spectrometry as previously described (Alibi et al. 2015). Biochemical tests were performed for strain W25 using the standard approach (Efstratiou and George 1999). Antimicrobial susceptibility testing was carried out on Müller-Hinton agar as described in detail on the EUCAST website (http://www.eucast.org).

Multiplex PCR
For differentiation of Corynebacterium species, a multiplex colony PCR, based on five genes, rpoB, 16S rRNA, pld, dtxR and tox, was performed using the oligonucleotides listed in Table 2. For colony PCR, a loopful of freshly grown bacteria was resuspended in 500 µl of sterile deionized water and boiled for 10 min at 95°C. The suspension was centrifuged at 13,000 × g for 1 min and 1 µl of the supernatant was used as template. Multiplex PCR was carried out in a Primus 96 advanced thermocycler (Peqlab, Erlangen) using previously described conditions (Torres Lde et al. 2013). The amplicons were separated by electrophoresis on a 3% agarose gel.

Detection of C. diphtheriae toxin production
The Elek test (an immunoprecipitation reaction) was performed as described in the Manual for Diphtheria Laboratory Diagnosis (Efstratiou and George 1999; Mazurova et al. 1998). Korinetoksagar (State Research Center for Microbiology and Biotechnology, Obolensk, Russia) with 15% fetal calf serum was used to grow the bacterial strains. Filter paper strips (8.0 × 1.3 cm) were impregnated with 0.25 ml (500 IU in 1 ml) of purified diphtheria antitoxin (Microgen, Russia) and placed on the centre of the agar plates. Recommended control strains, toxigenic C. diphtheriae NCTC 10648 and non-toxigenic C. diphtheriae NCTC 10356, and test cultures were transferred to the plate at a distance of 6-7 mm from the strip edge. The Elek test was analysed after 24 h of incubation at 37°C.

SDS-PAGE and Western blotting
Corynebacterium ulcerans strains were incubated at 37°C in a shaking incubator in BHI broth (Oxoid, Wesel) and were grown to an OD600 of 0.4-0.6. For toxin production, 2,2′-bipyridyl was added at a final concentration of 0.5 mM during exponential phase and bacterial strains were incubated for a further 2 h under iron starvation conditions (Moreira et al. 2003). The cells were harvested by centrifugation. Protein extraction, separation of proteins by SDS gel electrophoresis and immuno-detection of diphtheria toxin with human serum were carried out as described previously (Möller et al. 2019).

Genome sequences
A scaffold was generated from the draft assembly of strain W25 (Accession number: VFEM00000000) against the genome of C. ulcerans strain PO100/5 using the MeDuSa web-server (Bosi et al. 2015). The genome sequences of 28 other C. ulcerans strains and of the type strains of closely related corynebacterial species, C. diphtheriae, Corynebacterium belfanti and Corynebacterium pseudotuberculosis, were obtained from GenBank (Supplementary Table 1).

Phylogenomic analyses
The 16S rRNA gene sequence (1509 bp in size) was extracted from the genome sequence of strain W25 using RNAmmer v1.2 (Lagesen et al. 2007). The reference 16S rRNA gene sequences of all corynebacterial strains were obtained from GenBank. The nucleotide sequences were aligned using MUSCLE (Edgar 2004) and a phylogenetic tree was constructed from the resulting sequence alignment (1155 bp in size after excluding sites with gaps) using IQ-TREE with 100,000 ultra-fast bootstraps and 100,000 SH-aLRT tests (Nguyen et al. 2015). The tree was visualised using iTOL (Letunic and Bork 2016). Pairwise average nucleotide identities (ANI) were calculated among C. ulcerans genome sequences and the type strains of closely related corynebacterial species using FastANI (Jain et al. 2018). Digital DNA-DNA hybridisation (dDDH) values were calculated using the Genome-to-Genome Distance Calculator 2.1 (Auch et al. 2010a, b). The genome sequence of strain W25 was also analysed using PathoBacTyper (Tsai, Liu and Soo 2017), the TrueBac™ ID cloud system (Ha et al. 2019) and the Type (Strain) Genome Server (Meier-Kolthoff and Göker 2019). All genome sequences were annotated using Prokka v 1.12 (Seemann 2014) and compared using Roary v 3.12.0 with an identity cut-off of 70% (Page et al. 2015; Tange 2011). A maximum-likelihood tree was calculated from the core genomic sequence alignment, after removing the sites with missing data, using IQ-TREE with 100,000 ultra-fast bootstraps and 100,000 SH-aLRT tests (Nguyen et al. 2015).
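The step of excluding alignment sites with gaps before tree building can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function name `drop_gapped_columns` and the toy sequences are hypothetical, and the sketch assumes that gaps are encoded as `-` or `.` and that all aligned sequences have equal length.

```python
def drop_gapped_columns(aligned_seqs, gap_chars="-."):
    """Return the alignment with every column that contains a gap removed.

    aligned_seqs: list of equal-length strings (one per aligned sequence).
    gap_chars: characters treated as gaps (an assumption of this sketch).
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the column indices where no sequence has a gap character.
    keep = [i for i in range(length)
            if all(seq[i] not in gap_chars for seq in aligned_seqs)]
    return ["".join(seq[i] for i in keep) for seq in aligned_seqs]


# Toy alignment (hypothetical data, not 16S rRNA sequences).
toy_alignment = ["ACG-TA", "AC-GTA", "ACGGTA"]
trimmed = drop_gapped_columns(toy_alignment)
print(trimmed)
print(len(trimmed[0]), "gap-free sites retained")
```

In the same way, the 16S rRNA alignment described above was reduced to 1155 gap-free positions before the tree was inferred.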

Identification of virulence genes
The known virulence genes from pathogenic corynebacteria, including C. diphtheriae, C. pseudotuberculosis and C. ulcerans, were searched for in the genome of strain W25 using protein BLAST searches (Altschul et al. 1997; Camacho et al. 2009).

Identification of genes involved in starch metabolism
Glycoside hydrolases that are responsible for hydrolysis of amylose and amylopectin were identified from the KEGG pathway for starch and sucrose metabolism (https://www.genome.jp/kegg-bin/show_pathway?map00500) and were searched for among the protein sequences of strains W25, PO100/5 and KL1196, C. ulcerans strains BR-AD22 and NCTC 12077, and C. pseudotuberculosis DSM 20689 obtained from GenBank (Supplementary Table 1) using the protein-protein PSI-BLAST algorithm (Altschul et al. 1997). A "KEGG-inferred" database was created from the identified glycoside hydrolase sequences.
A two-step protein BLAST search strategy was applied to refine these results and to identify enzymes conserved among all corynebacterial strains. In the first step, protein sequences from all six strains were searched against the "KEGG-inferred" database using BLASTP (Camacho et al. 2009). The query sequences with significant similarity to sequences in the database (≥ 90% coverage, ≥ 60% identity and e-value ≤ 1e-165) were aligned using Clustal-Omega (Sievers et al. 2011). The hierarchical clustering in the multiple sequence alignment allowed protein groups to be distinguished, which were used to record the presence or absence of enzymes in individual strains. In the second step, the proteins from each hierarchical cluster in the sequence alignment were used as queries against the entire proteome of the six corynebacterial strains. This provided an additional confirmation of the presence or absence of a given enzyme in particular strains.
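The similarity thresholds applied in the first search step can be expressed as a simple filter. The following Python sketch is illustrative only: the `Hit` record, its field names and the example hits are hypothetical and do not reproduce the BLASTP output format or any result from this study; only the three cut-offs are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One query-versus-database hit (hypothetical record layout)."""
    query: str
    subject: str
    identity_pct: float
    coverage_pct: float
    evalue: float


def passes_thresholds(hit, min_cov=90.0, min_ident=60.0, max_evalue=1e-165):
    """Apply the cut-offs stated in the text: coverage >= 90%,
    identity >= 60% and e-value <= 1e-165."""
    return (hit.coverage_pct >= min_cov
            and hit.identity_pct >= min_ident
            and hit.evalue <= max_evalue)


# Invented example hits, used only to exercise the filter.
hits = [
    Hit("protein_1", "hydrolase_A", 85.0, 98.0, 1e-180),  # passes all three
    Hit("protein_2", "hydrolase_A", 55.0, 99.0, 1e-170),  # identity too low
    Hit("protein_3", "hydrolase_B", 92.0, 70.0, 1e-200),  # coverage too low
    Hit("protein_4", "hydrolase_B", 95.0, 95.0, 1e-50),   # e-value too high
]
significant = [h.query for h in hits if passes_thresholds(h)]
print(significant)
```

Only hits that satisfy all three criteria would be carried forward to the multiple sequence alignment.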

Identification and biochemical characteristics of strain W25
Strain W25 was initially identified by MALDI-TOF mass spectrometry as C. ulcerans with a score of 2.065. Multiplex PCR amplified fragments of 16S rRNA, rpoB and tox genes with a faint DNA band for pld gene for strain W25 (Fig. 1), a profile consistent with other C. ulcerans isolates as the primers were designed to amplify the fragments of rpoB and tox genes for C. diphtheriae, C. pseudotuberculosis and C. ulcerans, 16S rRNA for C. ulcerans and C. pseudotuberculosis, pld for C. pseudotuberculosis and dtxR for C. diphtheriae strains.
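The primer design described above implies an expected band pattern for each species, which can be written down as a lookup. The Python sketch below is a simplified illustration, not a diagnostic tool: the names `PRIMER_TARGETS`, `expected_bands` and `matching_species` are hypothetical, and treating the tox band as optional (because it depends on whether a strain carries the gene) is an assumption of this sketch.

```python
SPECIES = ("C. diphtheriae", "C. pseudotuberculosis", "C. ulcerans")

# Species each primer pair was designed to amplify, as stated in the text.
PRIMER_TARGETS = {
    "rpoB": {"C. diphtheriae", "C. pseudotuberculosis", "C. ulcerans"},
    "tox": {"C. diphtheriae", "C. pseudotuberculosis", "C. ulcerans"},
    "16S rRNA": {"C. ulcerans", "C. pseudotuberculosis"},
    "pld": {"C. pseudotuberculosis"},
    "dtxR": {"C. diphtheriae"},
}


def expected_bands(species):
    """Genes for which an amplicon is expected in the given species."""
    return {gene for gene, targets in PRIMER_TARGETS.items()
            if species in targets}


def matching_species(observed_bands, optional=frozenset({"tox"})):
    """Species whose expected pattern equals the observed one,
    ignoring bands listed as optional (assumption: tox is strain-dependent)."""
    observed_core = set(observed_bands) - optional
    return [s for s in SPECIES
            if expected_bands(s) - optional == observed_core]


# Clearly visible bands reported for W25; the faint pld band is left out here.
strong_bands_w25 = {"16S rRNA", "rpoB", "tox"}
print(matching_species(strong_bands_w25))
```

How a faint band such as pld should be scored is a matter of judgement that this lookup does not attempt to resolve.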
Isolate W25 was found to produce H₂S on Tinsdale medium when stabbed into the surface and was urease-positive (Table 3). The strain was positive for the reverse CAMP reaction, i.e., it produced phospholipase D inhibiting β-haemolysis by Staphylococcus aureus on blood-agar plates. The strain was also positive for hydrolase activity and was able to utilise glucose as a carbon source (Table 3). W25 could not hydrolyse gelatine, reduce nitrate or ferment starch and was negative for toxin production according to the Elek test (Table 3). The strain was sensitive to all antibiotics tested (Supplementary Table 2). The extensive search for glycoside hydrolases revealed the absence of two enzymes, 1,4-alpha-amylase and a type I pullulanase, in strain W25. These enzymes are involved in starch hydrolysis and their absence explains the inability of strain W25 to ferment starch (Table 4).

Phylogenomic characterisation
The maximum-likelihood tree from the 16S rRNA gene also grouped strain W25 with C. ulcerans isolates (Fig. 2). The 16S rRNA sequence of strain W25 showed 99.65% similarity with the 16S rRNA gene of C. ulcerans strain CD361 and 99.31% similarity with the 16S rRNA gene of C. ulcerans strain NCTC 7910. The core genome phylogeny showed that W25 separated from other C. ulcerans genomes and formed a distinct cluster with two other strains, PO100/5 and KL1196 (Fig. 3). The latter two strains were also submitted to GenBank as C. ulcerans isolates. The remaining 26 C. ulcerans strains formed two distinct subgroups, which is in agreement with our previous study showing the existence of two lineages within this species (Subedi et al. 2018). The composition of these lineages is highly consistent between the two studies, with five additional strains (211, FH2016-1, NCTC 7908, NCTC 7910 and NCTC 8639) grouped in lineage 1 and two additional strains (03-8664 and NCTC 8666) grouped in lineage 2 in this study (Fig. 3).
Pairwise dDDH values of strain W25 with other C. ulcerans isolates and type strains of C. belfanti, C. diphtheriae and C. pseudotuberculosis indicate that strain W25 belongs to a novel species along with PO100/5 and KL1196 (Table 5). These results are also confirmed by the ANI values. The ANI values between strains W25, PO100/5 and KL1196 were > 99%, consistent with them being the same species, but were < 92% between these strains and C. ulcerans genomes (Supplementary Table 3).
An analysis of the genome sequence of strain W25 using PathoBacTyper (Tsai et al. 2017) showed a 29% coverage rate against a Corynebacterium variabile strain, and the Type (Strain) Genome Server (Meier-Kolthoff and Göker 2019) indicated that this strain belongs to a novel species. Similarly, the TrueBac™ ID cloud system revealed C. ulcerans to be the closest species, with 90.85% ANI (90.6% ANI coverage), 99.72% similarity between the 16S rRNA genes, 94.95% similarity between the recA genes and 99.09% sequence similarity between the rplC genes of the two strains.
A comparative genomic analysis of all C. ulcerans strains and strains W25, PO100/5 and KL1196 using Roary (Page et al. 2015) with a minimum BLASTP identity of 70% revealed a pangenome encompassing 4525 genes, of which 1555 genes belonged to the core genome. The number of genes in individual genomes varied between 2159 and 2529; therefore, > 61% of each genome is conserved between the two species.
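The conserved fraction quoted above follows directly from the gene counts. As a short worked example in Python (the variable and function names are illustrative; the numbers are those reported in the text):

```python
CORE_GENES = 1555          # genes in the core genome
PANGENOME_GENES = 4525     # genes in the pangenome
MIN_GENOME_GENES = 2159    # smallest gene count of an individual genome
MAX_GENOME_GENES = 2529    # largest gene count of an individual genome


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# The largest genome gives the lower bound on the conserved fraction,
# the smallest genome gives the upper bound.
lowest_share = percent(CORE_GENES, MAX_GENOME_GENES)
highest_share = percent(CORE_GENES, MIN_GENOME_GENES)
pangenome_share = percent(CORE_GENES, PANGENOME_GENES)

print(f"core genes as share of an individual genome: "
      f"{lowest_share:.1f}% to {highest_share:.1f}%")
print(f"core genes as share of the pangenome: {pangenome_share:.1f}%")
```

The lower bound, obtained with the largest genome, is the source of the "> 61%" figure.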
Only 30 genes were found to be unique to C. ulcerans strains, including 23 genes encoding hypothetical proteins (Supplementary Table 4). Homologs of six of the remaining seven genes, encoding aminopeptidase N, cysteine-tRNA ligase, 1,4-dihydroxy-2-naphthoyl-CoA synthase, heat-inducible transcription repressor HrcA, putative fluoride ion transporter CrcB and putative propionyl-CoA carboxylase beta chain 5 (AccD5), are also present among strains of the novel group. However, RNA polymerase sigma factor YlaC appears to be unique to C. ulcerans strains. In contrast, 238 genes were unique to strains W25, PO100/5 and KL1196 and absent among C. ulcerans isolates (Supplementary Table 5). Again, 92% (220 genes) of these genes encode hypothetical/putative proteins. Some of the unique genes, encoding ABC transporter ATP-binding proteins, a UDP-glucose 6-dehydrogenase, the glucose-specific EIIA component of the phosphotransferase system, and proteins involved in the hemin transport system (HmuU), the vitamin B12 import system (BtuC) and resistance to daunorubicin/doxorubicin (DrrA), have other copies or homologs that are conserved across all the strains and may not cause any functional variation between C. ulcerans strains and those belonging to the novel group. Similarly, copies of the gene encoding the sulfate/thiosulfate import ATP-binding protein (CysA) are present among some C. ulcerans strains.
Fig. 2 Maximum likelihood tree from the alignment of 16S rRNA sequences for all Corynebacterium species. The scale bar represents nucleotide substitutions per site
Eight genes, encoding a component of the ammonia channel (amt), a bgl operon antiterminator (BglG), D-amino acid dehydrogenase (DadA), glucose-specific EIICBA components of the phosphotransferase system, oligopeptide transport system permease protein (OppB), a putative peptidase (cp29_00169) and a transcriptional regulatory protein DesR, are unique to strains W25, PO100/5 and KL1196 and may be responsible for minor functional variations between these species (Supplementary Table 5).
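Group-specific genes of this kind can be derived from a gene presence/absence table with simple set operations. The Python sketch below uses invented strain and gene labels and assumes that "unique" means present in every strain of one group and absent from every strain of the other; the exact criterion applied to the Roary output is not stated in the text.

```python
def group_specific_genes(presence, group, others):
    """Genes found in every strain of `group` and in no strain of `others`.

    presence: dict mapping a strain label to the set of gene clusters it carries.
    """
    in_all_of_group = set.intersection(*(presence[s] for s in group))
    in_any_other = set.union(*(presence[s] for s in others))
    return in_all_of_group - in_any_other


# Hypothetical presence/absence data (labels are placeholders, not real genes).
presence = {
    "novel_1": {"core_1", "core_2", "novel_only_1", "shared_x"},
    "novel_2": {"core_1", "core_2", "novel_only_1"},
    "ulcerans_1": {"core_1", "core_2", "ulcerans_only_1", "shared_x"},
    "ulcerans_2": {"core_1", "core_2", "ulcerans_only_1"},
}
novel_group = ["novel_1", "novel_2"]
ulcerans_group = ["ulcerans_1", "ulcerans_2"]

unique_to_novel = group_specific_genes(presence, novel_group, ulcerans_group)
unique_to_ulcerans = group_specific_genes(presence, ulcerans_group, novel_group)
print(sorted(unique_to_novel))
print(sorted(unique_to_ulcerans))
```

A gene such as `shared_x`, which occurs in only some strains of both groups, is excluded from both lists under this definition.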

Expression of the tox gene in strain W25
Diphtheria toxin is the main virulence factor in toxigenic corynebacteria (Sangal and Hoskisson 2014). The tox gene was amplified in the multiplex PCR reaction (Fig. 1). A Western blot using human antiserum against the toxin was negative for strain W25 (Fig. 4A). Diphtheria toxin was detected in the extract from strain KL756 upon induction of iron starvation (Fig. 4A). We also performed an Elek test to check the expression of the gene. The Elek test is an agar gel immunodiffusion assay in which horse diphtheria antitoxin diffuses towards diphtheria toxin produced by toxigenic strains. A precipitation line forms near the bacterial colonies at the zone of equivalence. No precipitation lines were observed for W25, suggesting that this strain is non-toxigenic (Fig. 4B).
A BLAST search with the protein sequence of the toxin from C. diphtheriae NCTC 13129 (DIP0222) showed significant similarity with the protein encoded by the gene cp29_02234 in strain W25, which was annotated as encoding a hypothetical protein. A nucleotide sequence alignment of the tox gene, including 100 bp upstream and 100 bp downstream regions, from strains W25, PO100/5, KL1196, C. diphtheriae NCTC 13129 (DIP0222) and C. ulcerans 0102 (CULC0102_0213) revealed that the gene in strains W25 and KL1196 has a two-base (GG) insertion at position 48 (Supplementary Fig. 1), which introduced a frameshift, rendering it a pseudogene. To further test the toxin expression, RNA hybridization experiments were carried out using the tox+ C. ulcerans strain KL756 as the positive control and the tox− strain 809 as the negative control. Expression of the 16S rRNA gene was confirmed in all three strains (Fig. 5A), whereas the transcript for the tox gene was only detected for strain KL756 when iron starvation was induced by bipyridyl (Fig. 5B). These results confirmed that W25 is a non-toxigenic tox gene-bearing (NTTB) strain.
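Why a two-base insertion disrupts the gene can be shown with a toy example. The Python sketch below uses a short invented sequence, not the tox gene, and hypothetical helper names; it only demonstrates that an insertion whose length is not a multiple of three changes every downstream codon.

```python
def insert_after(seq, position, bases):
    """Insert `bases` after the 1-based `position` of `seq`."""
    return seq[:position] + bases + seq[position:]


def codons(seq):
    """Split a sequence into complete codons, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def causes_frameshift(inserted_bases):
    """An insertion shifts the reading frame unless its length is a multiple of 3."""
    return len(inserted_bases) % 3 != 0


# Invented 18-nt open reading frame: ATG AAA CCC GGG TTT TAA.
wild_type = "ATGAAACCCGGGTTTTAA"
# Insert GG after position 6 (toy coordinates, chosen for illustration).
mutant = insert_after(wild_type, 6, "GG")

print(codons(wild_type))
print(codons(mutant))
print(causes_frameshift("GG"))
```

In the toy mutant the codons upstream of the insertion are unchanged, whereas all downstream codons differ and the original stop codon is no longer read in frame.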

Pilus gene clusters in strain W25
Similar to C. ulcerans strains, two pilus gene clusters, of the spaBC and spaDEF types, have been identified in strain W25 (Subedi et al. 2018; Trost et al. 2011). The genes encoding sortase A and the SpaB fimbrial subunit are present in strain W25; however, the C-terminal region of the SpaC fimbrial subunit is truncated and the corresponding gene is annotated as two smaller genes (Fig. 6). Therefore, SpaBC type pili in this strain may not be functional.
Similarly, the spaD, srtC and spaF genes are truncated and each is annotated as two smaller genes, and SpaE is missing approximately 65 amino acid residues at the N-terminal region. A small gene encoding a hypothetical protein is also present between spaE and spaF (Fig. 6). Therefore, SpaDEF type pili may also be absent in strain W25.

Other virulence genes in strain W25
When searched for other corynebacterial virulence genes, strain W25 was found to possess most of the other virulence genes present in C. ulcerans strains, including cpp (corynebacterial protease), pld (phospholipase D), nanH (neuraminidase, sialidase), vsp1 and vsp2 (trypsin-like serine protease) and cwlH (hydrolase; cell wall peptidase; Table 6). A gene annotated as ripA (peptidoglycan endopeptidase) shows 91% identity with the rpfI gene in C. ulcerans, with a deletion of 34 amino acids from position 299 to 332. However, two virulence genes, rbp and tspA, were absent in strain W25.

Discussion
C. ulcerans and C. pseudotuberculosis are pathogens adapted to canine and ovine hosts, respectively, but can cause zoonotic infections in humans (Bregenzer et al. 1997; Hacker et al. 2016; Peel et al. 1997). In this study, we have characterised an isolate, W25, which was identified as an atypical C. ulcerans based on MALDI-TOF analysis, multiplex PCR using 16S rRNA, rpoB and tox genes (Fig. 1) and other biochemical characteristics (Table 3).
Genome-based metrics have been extensively used to define novel bacterial species (Nouioui et al. 2018, 2019; Sangal et al. 2016, 2018). The dDDH and ANI cut-off values for defining new species are 70% and 95%, respectively (Auch et al. 2010b; Tiedje 2005, 2007). The genome sequence of strain W25 showed > 98% dDDH and > 99% ANI values against the genome sequences of strains PO100/5 and KL1196, clearly indicating that these strains belong to the same species. These strains are phylogenetically closely related to other pathogenic corynebacteria, C. diphtheriae, C. pseudotuberculosis and C. ulcerans (Figs. 2 and 3). However, dDDH and ANI values were below the species cut-off between W25 and the type strains of these species (Table 5; Supplementary Table 3), suggesting that this strain belongs to a novel Corynebacterium species along with two other strains, PO100/5 and KL1196. These results are also confirmed by other bacterial identification platforms, including PathoBacTyper (Tsai et al. 2017), the TrueBac™ ID cloud system (www.truebacid.com) and the Type (Strain) Genome Server (Meier-Kolthoff and Göker 2019). W25 was isolated from a case of necrotizing lymphadenitis in a wild boar (Busch et al. 2019), PO100/5 from a pig and KL1196 from a roe deer. Therefore, this species also appears to be prevalent among animals.
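The species cut-offs discussed above amount to two threshold comparisons. The Python sketch below encodes them; the function names are hypothetical, treating a value equal to the cut-off as supporting the same species is an assumption of this sketch, and the example inputs are the bounds and values quoted in the text rather than the full pairwise tables.

```python
DDDH_CUTOFF = 70.0  # % dDDH, species boundary quoted in the text
ANI_CUTOFF = 95.0   # % ANI, species boundary quoted in the text


def ani_supports_same_species(ani_pct):
    """True if the ANI value reaches the species cut-off."""
    return ani_pct >= ANI_CUTOFF


def dddh_supports_same_species(dddh_pct):
    """True if the dDDH value reaches the species cut-off."""
    return dddh_pct >= DDDH_CUTOFF


# W25 versus PO100/5 and KL1196: reported as > 99% ANI and > 98% dDDH,
# so the lower bounds already exceed both cut-offs.
print(ani_supports_same_species(99.0), dddh_supports_same_species(98.0))

# W25 versus C. ulcerans: 90.85% ANI according to the TrueBac ID analysis.
print(ani_supports_same_species(90.85))
```

Both metrics therefore place W25 with PO100/5 and KL1196, while the ANI against C. ulcerans falls below the species boundary.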
One of the key biochemical differences separating strain W25 from C. ulcerans was the inability of the former to ferment starch (Table 3). We found that the protein sequences for pullulanase type I and 1,4-alpha-amylase, enzymes that are involved in starch metabolism, are absent in strain W25 as well as in PO100/5 and KL1196 (Table 4). The genes encoding these enzymes are inactive (pseudogenes) due to frameshift mutations among these isolates. While both genes are present in C. ulcerans, the gene encoding 1,4-alpha-amylase is absent in C. pseudotuberculosis (Table 4). In general, C. pseudotuberculosis strains do not ferment starch (Dorella et al. 2006). Therefore, both enzymes seem to be important for starch metabolism and the absence of either of them may compromise the ability to ferment starch.
Interestingly, the genome of strain W25 has been annotated to carry the tox gene, encoding a diphtheria-like toxin, but the protein was not detectable in the Western blot using human antiserum against the toxin (Fig. 4A) or in the Elek test (Fig. 4B). Furthermore, no transcript of the tox gene was observed in the RNA hybridization experiment (Fig. 5A, B). The gene has a two-base (GG) insertion at position 48, which has introduced a frameshift (Supplementary Fig. 1).
Two pilus gene clusters (spaBC and spaDEF) have been identified in strain W25; however, both of them show potential loss of gene function (Fig. 6). A spaBC cluster is also present in C. ulcerans, which lacks the spaA gene encoding a major pilin subunit (Subedi et al. 2018; Trost et al. 2011). SpaABC type pili are known to interact with pharyngeal epithelial cells (Mandlik et al. 2007; Reardon-Robinson and Ton-That 2014) and homodimeric or heterodimeric SpaB/SpaC proteins were suggested to facilitate this interaction (Trost et al. 2011). However, the spaC gene encoding the tip protein is also truncated in strain W25. Similarly, multiple genes of the spaDEF cluster are truncated in strain W25 (Fig. 6). These pili in C. diphtheriae are known to interact with laryngeal epithelial cells (Mandlik et al. 2007; Reardon-Robinson and Ton-That 2014). The spaDEF cluster is composed of five genes: spaD, spaE and spaF, encoding the major pilin subunit, minor subunit and the tip protein, respectively, and two sortase-encoding genes, srtB and srtC, responsible for assembly of the pilus (Mandlik et al. 2007; Reardon-Robinson and Ton-That 2014). Therefore, it is possible that the ability of this strain to interact with pharyngeal or laryngeal epithelial cells is compromised. However, strain W25 possesses a number of virulence genes present among C. ulcerans and C. pseudotuberculosis strains, which may enable it to cause severe invasive infections (Table 6). For example, phospholipase D is a well-characterised virulence-associated protein responsible for significant macrophage death (McKean et al. 2007). Similarly, the protease and hydrolase activities of other proteins have been found to contribute to the virulence properties of pathogenic corynebacteria (Trost et al. 2010, 2011).

Conclusions
Isolate W25 is biochemically similar to C. ulcerans strains in that it can produce H₂S on Tinsdale medium, is positive for the reverse CAMP reaction and hydrolase activity, and is able to utilise glucose as a carbon source (Table 3). This strain was previously defined as an atypical C. ulcerans but belongs to a novel species that includes two other strains, PO100/5 and KL1196, which were independently isolated from animals. The isolate is likely an NTTB strain with a compromised ability to adhere to pharyngeal and laryngeal epithelial cells due to the loss of multiple genes in the spaBC and spaDEF pilus gene clusters. However, a number of corynebacterial virulence genes are present, which may enable the strain to cause severe invasive infections in animals and zoonotic infections in humans.
Author contribution AB and VS conceived the idea and designed the study. JM, VM and WG performed the phenotypic characterisation of corynebacterial strains. LM and VS analysed the genomic data. JM, AB and VS drafted the manuscript. All authors read and approved the manuscript.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Human and animal participants This study does not involve any human participants or animal experiments.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.