Introduction

Streptococcus agalactiae, or Group B Streptococcus (GBS), is a Gram-positive coccus and prominent cause of invasive disease worldwide. Invasive GBS infections occur in pregnant women and neonates as well as older adults and at-risk populations, especially those with underlying conditions [1]. Among adults, GBS infection can present as bacteremia, endocarditis, osteomyelitis, septic arthritis, pneumonia, skin and soft tissue infection, and urinary tract infections [1, 2].

There are currently 10 recognized serotypes (capsule types) for GBS (Ia, Ib, II, III, IV, V, VI, VII, VIII, and IX) [1]. Serotypes responsible for causing most disease vary between patient demographics and geographic location [3, 4]. Globally, serotypes Ia, III, and V account for most invasive maternal disease cases; serotypes Ia and III for early and late onset infant invasive disease; and among non-pregnant adults serotypes V, Ia, and III are predominant, followed by II and Ib [2, 4]. Currently, there are over 1900 sequence types (STs) of GBS within several clonal complexes (https://pubmlst.org/; accessed April 10, 2022), of which CC1, CC10, CC17, CC19, and CC23 are most associated with colonization and invasion of humans [4].

Serotype VIII GBS has been a rare serotype worldwide but relatively more dominant within Japan [5] and has emerged in several additional countries as a predominant colonizer. In South Korea, serotype VIII was first reported in 2000, representing 2% of isolates collected from both adults and infants in a tertiary care hospital from 1993 to 1996, but has since become a predominant serotype among GBS colonized pregnant women, representing 20% of isolates collected from 2017 to 2019 [6, 7]. Outside of Japan and South Korea, serotype VIII has been isolated in many countries; however, the levels of maternal colonization and infant disease due to serotype VIII are low, with most studies reporting < 3% of colonizing/invasive isolates as serotype VIII [4, 8,9,10]. Adult invasive serotype VIII has been detected in several locations such as Canada, Denmark, Japan, Portugal, Taiwan, and the USA [4, 10,11,12,13,14,15]. A global systematic review and meta-analysis of invasive GBS among non-pregnant adults reported between 1975 and 2018 found serotype VIII accounted for 0.16% of serotyped isolates [2].

From 2014 to 2020 in Alberta, Canada, the most prevalent serotypes of invasive GBS in order of prevalence were III, Ia, Ib, II, V, IV, and VI [14, 16]. During this period, seven cases of invasive serotype VIII were identified (0.45% of the total cases), more than double the proportion seen in a previous study that identified 3 cases (0.2% of total cases) from 2003 to 2013 [14, 16]. In this work, we characterized these emerging invasive serotype VIII isolates in Alberta collected over the period of January 2003 to September 2021 by determining the multilocus sequence types (ST), antimicrobial resistance and virulence factor carriage, phylogenetic relationships, and pangenome of these isolates. Characterization of serotype VIII isolates in Alberta will contribute to both our understanding of this emerging rare serotype within the province and to global surveillance efforts.

Materials and methods

Identification of capsular polysaccharide type VIII GBS isolates

Neonatal invasive GBS disease is listed as a notifiable disease in Alberta, requiring GBS isolates to be submitted to the Alberta Public Health Laboratory for capsular polysaccharide (CPS) typing and antimicrobial susceptibility testing [14]. For other age groups, clinical microbiology laboratories in Alberta submit invasive GBS isolates to the Public Health Laboratory for CPS typing as a part of an ongoing passive surveillance program. Prior to January 2017, CPS typing was performed using a double immunodiffusion method with CPS-specific antisera developed in rabbits [17]. From 2017 to present, a real-time PCR assay utilizing hydrolysis probes was used for CPS typing [18]. All identified serotype VIII GBS isolates were confirmed as serotype VIII by whole genome sequencing (WGS).

Antimicrobial susceptibility testing

Antimicrobial susceptibility testing for penicillin, erythromycin, clindamycin, vancomycin, and chloramphenicol was conducted by disk diffusion (Oxoid Limited, Nepean, ON, Canada). Zone diameter breakpoints were interpreted according to the Clinical and Laboratory Standards Institute (CLSI) M100 Performance Standards for Antimicrobial Susceptibility Testing current at the time [19].

Genomic sequencing

Genomic DNA for sequencing was extracted with a modified protocol for the MagaZorb DNA Mini-Prep Kit (Promega). Briefly, colonies were incubated overnight at 35 °C in Bacto Todd-Hewitt Broth (500 µL; BD Biosciences). Cultures were centrifuged at 6010 RCF for 2 min, supernatant removed, and cells resuspended in 12 mM Tris (500 µL). Centrifugation was repeated and cells resuspended in mutanolysin/hyaluronidase lysis solution (62 µL; 10 µL 3000 U/mL mutanolysin (Sigma), 2 µL 30 mg/mL hyaluronidase (Sigma), and 50 µL 10 mM Tris). Lysozyme (15 µL, 100 mg/mL; Sigma) was added and incubated for 1 h at 37 °C with shaking at 700 rpm (Eppendorf ThermoMixer F1.5). Proteinase K solution (20 µL) and RNase A (20 µL, 20 mg/mL; Qiagen or Invitrogen) were added and tubes incubated at room temperature for 5 min. ATL lysis buffer (200 µL) was added and tubes incubated for 2 h at 56 °C with shaking at 900 rpm (Eppendorf ThermoMixer F1.5). Extracts were centrifuged at 9391 RCF for 2 min, and wash, binding, and elution steps were completed with the KingFisher mL Purification System (Thermo Scientific) with Qiagen Buffer EB. Genomes were sequenced either by NextSeq 500/550 or Illumina MiSeq with paired end reads. Genome coverages ranged from 81 to 99% relative to isolate 3 (accession SAMN31119272). Genome sequences were assembled with Shovill v1.1.0 (https://github.com/tseemann/shovill), annotated with Bakta v1.2.2 (database v3.0) [20], and assembly quality assessed with Quast v5.0.2 (Online Resource 1) [21, 22].

Bioinformatic analysis

In silico MLST was performed on assembled genomes with MLST v2.19.0 (https://github.com/tseemann/mlst) using schemes from PubMLST [23], and clonal complexes were determined using https://pubmlst.org/. The graph of MLST data was generated with Tableau Desktop v2021.2 and Inkscape v0.92.3. The pangenome for Bakta annotated sequences was calculated with Anvi’o v7.1 using the workflow for microbial pangenomics with default parameters and visualized with anvi-display-pan and Inkscape v0.92.3 [24,25,26,27,28]. Antibiotic resistance genes were identified with the Resistance Gene Identifier (RGI) v5.2.0 and the Comprehensive Antibiotic Resistance Database (CARD) v3.1.4 [29]. Virulence factors were identified with abricate v1.0.1 (https://github.com/tseemann/abricate) and the core Virulence Factor Database (VFDB) (30-Dec-2021) [30]. Additional virulence factors were identified using blastn v2.12.0+ searches of full pilus gene sequences (PI-2b, GCF_000012705.1; PI-1b major subunit allele, WP_000777405.1) [31] and partial alp/rib gene sequences, with a 95% nucleotide identity cutoff to determine gene presence [10].
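The gene presence rule described above (a blastn hit at or above 95% nucleotide identity) can be illustrated with a minimal Python sketch. It assumes the hits were exported in the standard 12-column BLAST tabular layout with gene sequences as queries and contigs as subjects; the output format, the query/subject orientation, and the absence of a coverage filter are assumptions for illustration and are not stated in the methods. The example hits are hypothetical.

```python
import csv
import io

# Standard 12-column BLAST tabular layout (assumed export format).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

IDENTITY_CUTOFF = 95.0  # percent nucleotide identity, as stated in the text


def genes_present(blast_tsv, cutoff=IDENTITY_CUTOFF):
    """Return {subject: set of query genes with a hit at or above the cutoff}."""
    present = {}
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["pident"]) >= cutoff:
            present.setdefault(row["sseqid"], set()).add(row["qseqid"])
    return present


# Hypothetical hits for illustration only; not data from this study.
example = io.StringIO(
    "alp1\tcontigA\t99.2\t450\t4\t0\t1\t450\t100\t549\t0.0\t800\n"
    "rib\tcontigA\t88.0\t300\t36\t0\t1\t300\t900\t1199\t1e-90\t330\n"
    "alp1\tcontigB\t95.0\t450\t22\t0\t1\t450\t10\t459\t0.0\t700\n"
)
calls = genes_present(example)
print(calls)
```

In this toy input the rib hit falls below the cutoff and is not called, while a hit at exactly 95.0% is retained because the comparison is inclusive; whether the cutoff was applied inclusively in the study is not specified.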

Phylogenetic relationships were assessed using core single nucleotide polymorphisms (SNPs). Core SNPs were extracted from sequence reads and aligned with Snippy v4.6.0 using the assembled genome of isolate 3 as a reference (https://github.com/tseemann/snippy). Regions of recombination within the alignment were determined with Gubbins v3.1.4 using marginal ancestral reconstruction with IQ-TREE v2.0.3 [32, 33] and masked with maskrc-svg v0.5 (https://github.com/kwongj/maskrc-svg). SNP distance matrices were determined for both the unmasked and masked alignments using snp-dists v0.8.2 (https://github.com/tseemann/snp-dists). A maximum likelihood phylogenetic tree of the masked alignment was constructed with IQ-TREE v2.2.0 using model K3Pu+F+I, 1000 ultrafast bootstraps, and 1000 bootstrap replicates for the SH-like approximate likelihood ratio test (SH-aLRT) [33,34,35]. Trees were visualized with FigTree v1.4.4 (https://github.com/rambaut/figtree) and Inkscape v0.92.3.
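For readers unfamiliar with pairwise SNP distance matrices, the following Python sketch shows the underlying calculation on a toy alignment. It is not the snp-dists implementation; it assumes that only positions where both sequences carry an unambiguous base (A, C, G, or T) are compared, so gaps and masked positions (N) do not count as differences. The sequence names and sequences are hypothetical.

```python
from itertools import combinations

VALID_BASES = set("ACGT")  # assumption: ambiguous bases and gaps are ignored


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both bases are unambiguous and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def distance_matrix(alignment):
    """Build a symmetric {name: {name: distance}} matrix from an alignment."""
    names = list(alignment)
    matrix = {name: {name: 0} for name in names}
    for first, second in combinations(names, 2):
        d = snp_distance(alignment[first], alignment[second])
        matrix[first][second] = d
        matrix[second][first] = d
    return matrix


# Hypothetical toy alignment; the N mimics a masked recombinant position.
toy_alignment = {
    "isolate_a": "ACGTACGTAC",
    "isolate_b": "ACGTACGTAT",
    "isolate_c": "ACNTTCGTGT",
}
toy_matrix = distance_matrix(toy_alignment)
for name, row in toy_matrix.items():
    print(name, row)
```

Because masking replaces recombinant regions with N, distances computed this way on a masked alignment can only be equal to or smaller than those from the unmasked alignment.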

Results

Epidemiology and antibiotic resistance profiles of GBS serotype VIII in Alberta

Between January 2003 and September 2021, 14 serotype VIII GBS isolates were identified. All isolates were recovered between 2009 and 2021, with one isolate per year for 2009, 2011, 2013, and 2017; two isolates per year from 2018 to 2020; and four isolates in 2021 (Fig. 1). There were equal numbers of males and females, and ages ranged from 24 to 79, with two cases (14%) in individuals under 50, three (21%) in those aged 50–60, and nine (64%) in patients over 60 (Table 1). Two isolates (isolate 6 and isolate 9) were recovered from the same patient 173 days apart. Specimens were isolated from blood (n = 9), knee fluid (n = 2), right leg (n = 1), thumb tissue (n = 1), and a toe bone (n = 1) (Table 1). All serotype VIII isolates were susceptible to penicillin, erythromycin, clindamycin, chloramphenicol, and vancomycin except for isolate 2, which exhibited erythromycin and inducible clindamycin resistance (Table 1).

Fig. 1

Streptococcus agalactiae serotype VIII cases in Alberta from January 2003 to September 2021. Bars are colored by multilocus sequence type (MLST)

Table 1 Demographic and clinical descriptors of the 14 Streptococcus agalactiae serotype VIII cases from January 2003 to September 2021

ST42 is the most common serotype VIII sequence type in Alberta

The 14 genomes comprised three MLST types: ST1, ST2, and ST42 (Fig. 1; Table 1), and one isolate (isolate 14) was not assigned an ST. Of the three identified sequence types, ST42 was predominant, comprising 7/14 total isolates and all cases identified in 2021. The sequence types were further clustered by clonal complex and fell within CC1 (ST1, ST2) and CC19 (ST42).

Alberta serotype VIII isolates carry few antimicrobial resistance genes

Genomes of serotype VIII isolates were searched for antibiotic resistance ontologies (AROs) (Table 2) [29]. All isolates carried mprF, which provides peptide antibiotic resistance. Three isolates (isolates 3, 4, and 14) had strict hits (imperfect hits that meet curated blastp bit score cutoffs) to tetO, which is implicated in tetracycline resistance [29]. Isolate 2 was the only isolate with a strict hit to ermA (Table 2).

Table 2 Antimicrobial resistance genes carried by invasive Streptococcus agalactiae serotype VIII isolates

Alberta serotype VIII isolates carry multiple virulence factors including potential protein vaccine targets

Genomes were also surveyed for virulence factors [30]. Virulence factors present in all isolates included capsular operons, exotoxin production (hemolysin/cytolysin, CAMP factor), and exoenzyme production (hyaluronidase) (Table 3). More variation was present among adherence-related genes identified, with 13/14 genomes possessing C5a peptidase, laminin-binding surface protein (Lmb), and most PI-1 pilus-associated genes. Pilus gene GBS-RS03565 (major subunit protein) was only present in 6/14 isolates (43%). In isolates that did not carry GBS-RS03565, an alternative major subunit protein (BP-1b) was identified by blastn [36]. In addition, pilus island PI-2b, which is absent in the VFDB dataset, was identified by blastn in all isolates, with a split between two contigs in isolate 10.

Table 3 Virulence factor genes carried by invasive Streptococcus agalactiae serotype VIII in Alberta

Isolates were also surveyed by blastn for the presence of members of the Alpha/Rib protein family, which are current targets for vaccine development (Table 4) [37]. Alp1 was present in 43% of isolates, followed by Alp2/3 (29%), Rib (14%), and Alpha (7%). All genomes except isolate 10 carried at least one of the Alpha/Rib gene sequences (Online Resource 2). The type of Alpha/Rib family protein carried corresponded to ST: ST42 = alp1, ST1 = alp2/3, and ST2 = rib.

Table 4 Alpha/Rib family surface proteins carried by invasive Streptococcus agalactiae serotype VIII in Alberta

Genome similarity is greatest within Alberta serotype VIII STs

To examine the phylogenetic relationships among serotype VIII isolates, a maximum likelihood tree was generated from a core SNP alignment masked for recombination (Fig. 2). The isolates grouped primarily by ST, with ST42 (CC19) forming a monophyletic group and ST2 nesting within the ST1 group (CC1). Isolate 14, which did not have an assigned ST, formed a sister group to the ST1/ST2 cluster. Isolates 6 and 9, which were sampled from the same patient 173 days apart, also clustered together. Summary statistics of SNP alignments in relation to the reference sequence can be found in Online Resource 3.

Fig. 2

Core SNP maximum likelihood phylogeny masked for recombination of Streptococcus agalactiae serotype VIII isolates. Dates shown are collection years, and isolates are colored by multilocus sequence type (MLST). Tree is midpoint rooted, and branch supports are SH-aLRT support (%)/ultrafast bootstrap support (%). The reference strain is isolate 3. CC, clonal complex; MLST, multilocus sequence type

To determine relative numbers of SNPs between isolates, pairwise SNP distances were calculated from the unmasked and masked (recombinant regions removed) core SNP alignments (reference = isolate 3; Online Resource 4). There was a decrease in the number of SNPs between the unmasked and masked alignment ranging from 0 to 95% reduction, with a mean of 73% and median of 89%. For the masked alignment, the number of SNPs between different STs ranged from 80 to 490 SNPs overall, with 80–112 SNPs between ST1 and ST2, 381–490 between ST1 and ST42, and 348–436 between ST2 and ST42. Within STs, the number of masked SNPs ranged from 17 to 119 SNPs overall, with 79–119 SNPs between ST1 isolates, 40 SNPs between ST2 isolates, and 17–94 SNPs among ST42 isolates. Isolate 14, which did not fall into an MLST group, shared the fewest numbers of masked SNPs with isolate 1 (n = 95 SNPs; ST2), isolate 7 (n = 96; ST2), isolate 12 (n = 94; ST1), and isolate 13 (n = 99; ST1). The fewest number of SNPs were seen between isolates 6 and 9 (n = 17 masked and unmasked), which were isolated from the same patient and are both part of ST42.
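The percent reduction in SNPs after recombination masking can be expressed, for each isolate pair, as 100 × (unmasked − masked) / unmasked. The Python sketch below applies this to three pairs. Only the first pair (17 SNPs both unmasked and masked, as reported for isolates 6 and 9) comes from the text; the other two pairs and their labels are hypothetical values chosen for illustration, and treating a pair with zero unmasked SNPs as 0% reduction is an assumption.

```python
from statistics import mean, median


def percent_reduction(unmasked, masked):
    """Percent of SNPs removed by masking; 0.0 if there were none to remove."""
    if unmasked == 0:
        return 0.0  # assumption: define the reduction as zero in this case
    return 100.0 * (unmasked - masked) / unmasked


# (unmasked, masked) pairwise SNP counts.
pairs = {
    ("isolate 6", "isolate 9"): (17, 17),   # reported in the text
    ("pair_x", "pair_y"): (2000, 100),      # hypothetical
    ("pair_x", "pair_z"): (1000, 110),      # hypothetical
}

reductions = {pair: percent_reduction(u, m) for pair, (u, m) in pairs.items()}
for pair, value in reductions.items():
    print(pair, f"{value:.1f}%")
print("mean:", round(mean(reductions.values()), 1))
print("median:", median(reductions.values()))
```

A pair with no SNPs in recombinant regions, such as isolates 6 and 9, contributes a 0% reduction and pulls the mean below the median, which is the same direction of skew seen in the reported summary (mean 73%, median 89%).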

Alberta serotype VIII isolates share a core genome representing 66% of the pangenome

The core genome of the fourteen serotype VIII isolates was composed of 1662 genes, representing 66% of pangenome (Fig. 3). The accessory genome (present in 2 or more, but less than 14 isolates) was composed of 588 genes (23% of pangenome), and singletons (present in only one isolate) made up 268 genes (11% of pangenome). Singleton gene counts ranged from 0 to 64 genes, with isolate 14, which did not have an assigned ST, carrying the highest number of singleton genes (n = 64). Among isolates with MLSTs, most singletons were carried by ST1 and ST2 isolates, with isolate 7 (ST2) carrying the most (n = 54 singletons). Among ST42 isolates, isolate 8 carried the largest number of singletons (n = 23).
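The partition used here (core = present in all 14 genomes, accessory = present in 2 to 13 genomes, singleton = present in exactly one genome) and the reported percentages can be checked with a short Python sketch. The function name is hypothetical and the sketch is independent of Anvi’o; the gene counts are those reported above.

```python
N_GENOMES = 14  # number of serotype VIII isolates in the pangenome


def classify_cluster(n_genomes_with_gene, n_total=N_GENOMES):
    """Assign a gene cluster to core, accessory, or singleton by occurrence."""
    if not 1 <= n_genomes_with_gene <= n_total:
        raise ValueError("occurrence must be between 1 and the genome count")
    if n_genomes_with_gene == n_total:
        return "core"
    if n_genomes_with_gene == 1:
        return "singleton"
    return "accessory"


# Gene counts reported in the text.
counts = {"core": 1662, "accessory": 588, "singleton": 268}
total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}

print("pangenome size:", total)
for name, share in shares.items():
    print(f"{name}: {counts[name]} genes ({share}%)")
```

The three categories sum to 2518 genes, and the rounded shares reproduce the 66%, 23%, and 11% stated above.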

Fig. 3

Pangenome calculation of invasive Streptococcus agalactiae serotype VIII in Alberta. One thousand six hundred sixty-two core genes were present in all genomes, 588 accessory genes were present in at least two but fewer than 14 genomes, and 268 genes were only present in one genome (singletons). Each genome is colored by multilocus sequence type (MLST). SCG, single-copy gene cluster (excludes paralogs); MLST, multilocus sequence type

Discussion

In this study, we characterized serotype VIII GBS isolates identified in Alberta from January 2003 to September 2021 [14, 16]. The goal was to characterize this emerging group within Alberta primarily through genomic analyses, as there are no previous genome-based studies focused on this group to the best of our knowledge. We evaluated basic demographics, STs, antimicrobial resistance, virulence factors, and phylogenetic and pangenomic relationships between isolates.

Since 2017, the frequency of serotype VIII GBS infections has increased in Alberta, primarily represented by ST42 (Fig. 1). As of April 2022, 43 isolates with a capsular serotype and/or genotype of VIII were present in PubMLST, the majority of which were part of ST1 (CC1) (34/43 isolates) [23]. ST2 serotype VIII isolates were limited within the database, with only four isolates identified in Australia, three of which appeared to be invasive, having been isolated from blood and joint fluid (PubMLST IDs 5714, 5748, 10570). In general, limited information is available for serotype VIII ST42 (CC19), as most serotype VIII isolates identified have been a part of CC1 [38]. In PubMLST, only seven ST42 isolates were present, six from the USA and one from the UK, of which only the latter isolate was designated as serotype VIII (PubMLST ID 21210) [39]. To the best of our knowledge, no information about this isolate exists apart from its designation as a clinical GBS isolate. One study characterizing invasive GBS isolates in the USA across multiple states from 2015 to 2017 found a pattern of ST distribution among serotype VIII isolates similar to that in Alberta [10]. In the US study, 17 serotype VIII isolates from a total of 6336 invasive GBS were identified. The majority of the isolates were ST42 (10/17 isolates), followed by ST2 (n = 4), ST1 (n = 1), and two not assigned a sequence type. In addition, all serotype VIII isolates were identified in adults over 18 years, with eight isolates from individuals 65 years and older. These results are similar to what we have described here in Alberta and may represent the emergence of serotype VIII ST42 infection, especially among older adults in North America.

In our study, antimicrobial resistance was limited, with only one isolate exhibiting erythromycin and inducible clindamycin resistance (Table 1). Alberta has seen increases in resistance to erythromycin and clindamycin among invasive GBS, especially among adult isolates [14]. In a study comparing changes in resistance rates among GBS isolates in Alberta between 2003–2013 and 2014–2020, erythromycin non-susceptibility rose from 36.9 to 50.8% and clindamycin non-susceptibility rose from 21.0 to 45.8% between the two time periods [14, 16]. High rates of erythromycin and clindamycin resistance have been observed in several countries across the globe among both colonizing and invasive isolates [40]. The most widespread factors conferring resistance to macrolides are erm (erythromycin ribosome methylation) genes, including ermA and ermB, among which ermB is most common in GBS [40]. In our study, the one isolate exhibiting erythromycin/inducible clindamycin resistance was found to carry a gene with high similarity to ermA (Table 2), which has previously been associated with this phenotype. Among 18 macrolide-resistant colonizing GBS isolates in Brazil, 39% were found to harbor ermA, which included isolates exhibiting both macrolide resistance and constitutive/inducible macrolide-lincosamide-streptogramin B resistance [41]. The presence of erm genes has been shown to correlate well with resistant phenotypes, and the ermA hit is therefore plausibly responsible for the erythromycin and inducible clindamycin resistance of our isolate [42].

A vaccine for GBS has been listed as a priority by the World Health Organization [43]. Vaccines in current development include both polysaccharide-conjugate vaccines and protein-only vaccines that target surface antigens of GBS [44]. While polysaccharide-conjugate vaccine formulations that include serotype VIII have undergone preclinical testing, none of the formulations currently in clinical trials include serotype VIII due to its low global burden [44, 45]. In contrast, protein-only vaccines have the potential to be used broadly for protection against all serotypes by targeting universal groups of surface proteins [44]. Attractive choices for protein vaccine development include Alpha/Rib family proteins, which are abundant surface proteins among GBS isolates [44]. A systematic review that included 7193 GBS isolates found that 99% of adult invasive disease and 93% of infant invasive disease isolates carried genes for at least one of Alp1/Epsilon, Alp2/3, Alpha C, or Rib surface protein [4]. All but one of the Alberta serotype VIII isolates possessed at least one Alpha/Rib protein (Table 4; Online Resource 2). Other surface proteins of interest as vaccine targets include pili structures, which are also widespread in GBS [4, 46]. Among Alberta serotype VIII isolates, at least one pilus operon was present in each isolate (Table 3). Interestingly, 57% (n = 8) of the Alberta isolates had an alternative major subunit allele for PI-1, BP-1b, which was first reported in a study of invasive and colonizing GBS isolates in Canada [36]. Among the 1332 isolates included in the Canadian study, 51 carried the alternative backbone protein (major subunit), among which was one serotype VIII (ST42) [36].

We used core SNP alignments masked for recombination to build a phylogenetic tree of our serotype VIII isolates, which primarily clustered by ST/CC (Fig. 2). Among our serotype VIII was a single isolate, isolate 14, which was not assigned an MLST. Core SNP phylogenetic analysis showed that this isolate formed a sister group to the CC1 cluster, demonstrating sequence similarity to this clonal complex (Fig. 2). The inability to determine the ST of isolate 14 suggests this may be a new ST, perhaps generated through gene duplication or deletion events as a result of recombination, or not represented within the current MLST scheme. Recombination within the genomes of our serotype VIII isolates was evident in the comparison of the masked and unmasked pairwise SNP distances, with an average 73% reduction in SNPs after recombination masking (Online Resource 4).

Multiple studies have characterized the pangenome of GBS [47,48,49,50]. The pangenome of the 14 Alberta serotype VIII isolates was composed of 2518 genes, of which 1662 genes were core (Fig. 3). This is in good agreement with previous studies that have calculated core genomes among human-associated S. agalactiae isolates by diverse methodologies to be 1806 genes, 1658 genes, and 1472 genes [47,48,49]. Isolate 14, which had no assigned ST, carried the largest number of singleton genes in our pangenomic analysis (n = 64) (Fig. 3), further highlighting the uniqueness of this isolate within our collection. Interestingly, several of our isolates did not possess any singleton genes, which is in contrast with studies that have characterized the GBS pangenome as “open,” expanding with the addition of each new genome [47, 50]. It is possible that the lack of singletons among some of our isolates, primarily ST42, may represent clonal expansion and as such there would be fewer differences in gene content expected between isolates.

Here, we presented the genomic characterization of invasive serotype VIII GBS within Alberta. Limitations of this study include our small sample size; the use of draft genomes, which may impact gene presence/absence determination; and the interpretation of results for isolate 10, which had lower depth of coverage and a poorer assembly. Future directions for this work include the continued surveillance and genomic characterization of invasive serotype VIII in Alberta to determine if the upward trend of invasive infection continues. Globally, invasive serotype VIII GBS is uncommon, and consequently, our understanding of this emerging group is limited. Monitoring and characterization of global isolates are necessary to both inform the epidemiology of this organism and guide vaccine development for the prevention of invasive GBS infection among at-risk adults.