Background

Streptococcus pyogenes, or Group A Streptococcus (GAS), is an exclusively human pathogen responsible for a broad variety of clinical manifestations ranging from pharyngitis and impetigo to invasive diseases, such as necrotizing fasciitis and toxic shock syndrome. Some strains can also trigger autoimmune diseases, such as acute rheumatic fever, rheumatic heart disease and glomerulonephritis [1]. GAS infections are a major cause of morbidity and mortality worldwide. Severe GAS diseases have an estimated prevalence of at least 18.1 million cases and cause approximately 517,000 deaths per year [2].

M protein is a surface component of GAS and one of the main virulence factors due to its anti-phagocytic properties [3]. This protein contains a hypervariable amino-terminal end that serves as the substrate for emm-typing, the gold standard for strain identification. More than 220 different emm-types have been described [4]. Systematic epidemiological reviews clearly highlight significant differences in emm-type distribution across different regions of the world. A relatively limited number of emm-types is recovered from high-income settings, while a much higher diversity of strains circulates in low-income settings [5, 6]. A complementary typing system, emm-pattern typing, is based on the presence and arrangement of emm and emm-like genes located in the mga locus within the S. pyogenes genome. This classification is correlated with tissue tropism as follows: A-C emm-pattern isolates are usually recovered from throat infections, D emm-pattern strains are usually isolated from the skin (impetigo), and E emm-pattern strains are recovered from both biological sites [7, 8].

Sanderson-Smith et al. recently proposed a functional classification of the emm-types into clusters according to the phylogenetic origin and microbiological characteristics of the strain. The cluster classification enables comparison between strains and serves as a tool for vaccine development [9].

GAS contains numerous genes encoding virulence factors, such as streptococcal pyrogenic exotoxins (Spe proteins). These proteins constitute a family of bacterial toxins with powerful mitogenic effects on T cells expressing a particular Vβ domain of the T cell receptor molecule, inducing non-specific polyclonal activation of the immune system by binding directly to class II MHC molecules [10]. Several studies have reported that Spe exotoxin content is correlated with emm-types and associated with clinical manifestations [11-13]. Spe exotoxins most likely contribute to the severity of GAS infections. However, the exact molecular mechanism involved in specific pathologies is still not understood [14].

To date, no vaccine against GAS is available; however, several candidates based on both the N- and C-terminal portions of the M protein are in different stages of development [15]. Briefly, the 30-valent vaccine is based on the highly variable amino-terminal region of the M protein [16], and the J8 candidate vaccine is a construct built on a minimal B-cell epitope from the C-repeat region [17].

The StreptInCor candidate vaccine is based on amino acid sequences of the conserved region of the M5 protein. This candidate vaccine, in contrast to the others, contains both B and T cell epitopes to provide a strong protective immune response [18].

Although GAS infections are common in several regions of Brazil, only a few studies on the prevalence, emm-type profiles and virulence factors of the strains are available [19-21]. Here, we describe the emm-type and superantigen profiles of the most prevalent strains in Sao Paulo and assess the theoretical coverage of the StreptInCor candidate vaccine.

Methods

S. pyogenes strain collection

GAS isolates were obtained from patients treated at the Clinical Hospital, School of Medicine, University of Sao Paulo, Sao Paulo, and the Special Clinical Microbiology Laboratory (LEMC), Federal University of Sao Paulo, Sao Paulo, Brazil, between 2001 and 2008. The bacterial samples were defined according to their isolation sites (skin, throat and other invasive sites).

Institutional Review Board (IRB) approval was obtained from the Heart Institute Ethics Committee (CAPPesq; approval number-0646/07) at the University of Sao Paulo. Patient informed consent was waived because this study is a retrospective analysis of strains from a microbiology collection.

The GAS diagnostic criteria were based on beta hemolysis on blood agar and sensitivity to bacitracin. The specimens were then cultured on sheep blood agar (Vetec, Brazil), grown in Todd-Hewitt broth (Himedia, India) to an OD600 of 0.4 and stored at −80 °C.

DNA isolation, emm-typing, patterning and emm-cluster distribution

The genomic DNA extraction, emm-gene PCR amplification and sequencing and emm-type identification were performed according to the protocol described by the CDC (http://www.cdc.gov/ncidod/biotech/strep/strepblast.html) using the primers MF2 and MR1 for amplification and sequencing, respectively, as previously described [19]. The emm-pattern for each emm-type was deduced using the table of correspondence provided by a recent multi-center study [4]. The emm-cluster classification of the strains identified in this study was based on the new functional classification recently proposed by Sanderson-Smith et al. [9].

Superantigen profile

To identify the superantigen genes carried by each strain, singleplex PCR reactions were performed using specific primers as previously described for speA, speC, speG, speH, speI, speJ, ssa [13] and smeZ [12]. speB (cysteine protease) was used as a positive control in the PCR reactions.

Statistical analysis

A Simpson Reciprocal Index (1/D) of 1 corresponds to a theoretical situation in which only one emm-type/cluster is recovered, representing the lowest diversity possible. The maximum Simpson Reciprocal Index corresponds to the total number of emm-types/clusters recovered in one area. Higher values indicate greater diversity. The Simpson Index (D) was calculated using the following formula: $D = \sum (n/N)^2$, where “n” is the total number of isolates of a given emm-type or belonging to a given cluster and “N” is the total number of isolates of all the emm-types/clusters recovered in an area [22, 23]. Confidence intervals were calculated as previously described [24].
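As an illustration of the formula above, the reciprocal index can be computed from a list of emm-type (or emm-cluster) labels, one per isolate. This is a minimal sketch and not the software used in this study; the function name and the example isolates are hypothetical, and the confidence interval calculation [24] is not reproduced.

```python
from collections import Counter


def simpson_reciprocal_index(labels):
    """Return 1/D, where D = sum((n/N)**2) over all emm-types or clusters.

    labels: one emm-type (or emm-cluster) label per isolate.
    """
    counts = Counter(labels)          # n for each emm-type/cluster
    total = sum(counts.values())      # N, the total number of isolates
    if total == 0:
        raise ValueError("at least one isolate is required")
    d = sum((n / total) ** 2 for n in counts.values())
    return 1.0 / d


# Hypothetical isolates, for illustration only (not data from this study).
example = ["emm1"] * 5 + ["emm12"] * 3 + ["emm87"] * 2
print(round(simpson_reciprocal_index(example), 2))
```

A collection containing a single emm-type returns 1, and a collection in which k emm-types are equally represented returns k, which matches the bounds described above.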

M protein sequence analyses

Complete M protein sequences and C-repeat annotations for each emm-type included in this study were derived from a previous study [4]. Multiple protein sequence alignments were obtained using MUSCLE as implemented in Geneious® version R8.
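The sequence identities reported in the Results are derived from such alignments. The sketch below shows one common way to compute percent identity between two rows of an alignment; it is an assumption for illustration, not the Geneious implementation. The function name, the gap symbol and the toy sequences are hypothetical, and the choice to count a residue aligned to a gap as a mismatch is only one of several definitions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of the same alignment.

    Columns where both rows have a gap are ignored; a residue aligned
    to a gap counts as a mismatch (an assumption, definitions vary).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue                  # column carries no information
        compared += 1
        if a == b:                    # identical residues (never two gaps here)
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned peptides, for illustration only (not vaccine or M protein data).
print(percent_identity("ACDE-FGH", "ACDQKFGH"))
```

Under this definition an insertion in one sequence lowers the identity, so emm-types carrying extra residues relative to the reference would score lower even when the flanking residues match.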

Results

emm-types

The distribution of emm-types among the 229 GAS isolates is described in Table 1. The clinical origin was known for 214 isolates. Most samples were associated with invasive infection (n = 123, 57 %), whereas the remaining samples were recovered from throat (n = 57, 27 %) and skin infections (n = 34, 16 %). Forty-eight different emm-types were identified. The most frequent emm-types were emm1 (22 %), emm87 (8 %), emm22 (7 %), emm12 (7 %), emm77 (6 %), emm6 (6 %), emm89 (5 %), emm33 (3 %), emm75 (3 %) and emm3 (3 %) (Fig. 1). Taken together, these emm-types accounted for 69 % of the GAS isolates. To better understand the strain diversity present in our study, and its likely consequence for multivalent vaccine coverage, we calculated the reciprocal Simpson index of diversity, which was 12.7 (95 % CI, 10.1-17.0).

Table 1 Distribution of emm-types among 229 GAS isolates obtained during 2004-2008

Fig. 1 Frequency of emm-types. A total of 48 emm-types were represented in the collection. Abbreviation: GAS, group A streptococcus

emm-pattern and emm-cluster distribution

We inferred the emm-pattern for 213 of 214 emm-types; the exception was emm127 (previously named st223). Pattern E and A-C emm-types were present at similar proportions (43 % and 38 %, respectively), whereas pattern D strains were less frequent (18 %).

The strains were classified according to emm-cluster and fit into 12 of the 19 different emm-clusters. Most strains belonged to emm-cluster A-C3 (21 %), followed by E4 (20 %), E3 (13 %), D4 (12 %), single protein cluster clade Y (9 %), A-C4 and E6 (7 % each), A-C5, E1 and E2 (3 % each), and D2 and D5 (1 % each) (Table 2).

Table 2 emm-cluster classification

Superantigen profile

The superantigen gene profile was analyzed in 219 of 229 isolates (96 %). The chromosomally located superantigen genes smeZ, speG, and speJ were present in 219 (95.6 %), 201 (88 %) and 79 (35 %) isolates, respectively. The speG and smeZ genes were present at high frequencies in all strains, whereas speJ was absent or uncommon in diverse emm-types and presented a higher frequency only in emm1, emm33 and emm87 (n = 72, 86 %). Among the phage-encoded genes, speC was the most prevalent (n = 109, 48 %), followed by ssa (n = 61, 27 %), speA (n = 43, 19 %), speH (n = 37, 16 %), and speI (n = 31, 14 %). Among the most prevalent emm-types, speA was present in emm3 (100 %) and emm1 (62 %) but in only one sample of emm6. The speC gene was associated with all strains but was less frequent in emm1, emm3, emm183 and emm75 (n = 12, 33 %) and more frequent in the remaining strains (n = 38, 93 %). Additionally, speI was absent or less frequent in most samples, except for emm12 and emm183 (53 % and 60 %, respectively). Finally, speH was also absent or uncommon in most emm-types and occurred at a higher frequency only in emm183, emm12 and emm78 (n = 40, 72 %), and ssa was absent in only one isolate, with a frequency range of 7-86 % (Table 3).

Table 3 Superantigen profile of the most frequent emm-types identified in Sao Paulo, Brazil

Vaccine coverage

The theoretical coverage capacity of the StreptInCor candidate vaccine was assessed by aligning its amino acid sequence with the M protein C-terminal region of the 46 emm-types identified here for which a complete sequence was available (the complete M protein sequence was missing for both emm127 and emm99). The identities ranged from 59.7 % to 94.5 % (mean of 71 %). Some emm-types presented an insertion of 7 amino acid residues in their sequences, as previously described (Fig. 2).

Fig. 2 In silico analysis of StreptInCor coverage capacity. Amino acid sequence alignment of the StreptInCor candidate vaccine with the 46 emm-types identified here (the complete M protein sequence was missing for both emm127 and emm99)

Discussion

Streptococcus pyogenes is an important human pathogen responsible for several invasive and non-invasive diseases in Brazil and worldwide. In this study, we characterized 229 invasive and non-invasive Streptococcus pyogenes samples from patients treated at the Clinical Hospital in Sao Paulo, Brazil. A great diversity of emm-types was observed: forty-eight emm-types were found in the 229 samples, with the 10 most frequent emm-types accounting for 69 % of all isolates. In terms of GAS strain diversity, a Simpson Reciprocal Index of 1 corresponds to a theoretical situation in which only one emm-type/cluster has been recovered, representing the lowest diversity possible. The maximum value of the Simpson Reciprocal Index corresponds to the total number of emm-types/clusters recovered in one area. The higher the value, the greater the diversity. The reciprocal Simpson index of diversity found in this study was relatively low (12.7) when compared to the index of 26.72 for Brasilia (in the central region of Brazil) [19]. On the other hand, our results were similar to those reported for high-income suburbs of Salvador, in northeastern Brazil [20].

The distribution of the strains identified in this study is comparable to that found in other countries, particularly in high-income countries in Asia, the Middle East and Latin America, in which emm1 and emm12 were the most common types, as reviewed by Steer [6]. Interestingly, emm1, emm12 and emm89 have also been found in various studies conducted recently in several countries in Europe and in China; these types were frequently correlated with invasive and/or non-invasive isolates [25]. emm77 had a high frequency in the invasive isolates found here. In addition, this strain has been associated with non-invasive diseases in Germany [26] and was found in both invasive and non-invasive isolates in Spain [12]. Among the 229 isolates, E and A-C emm-patterns were found in similar proportions, whereas pattern D was less frequent. Interestingly, studies from Brasilia, in the central region of Brazil [19], revealed a higher proportion of E and D patterns (51 % and 36 %, respectively), whereas A-C patterns were rarely observed (9.5 %). The data demonstrate the variability of streptococcal strains in Brazil, which may be related to socio-economic differences and may extend to other countries in which there are also social disparities.

The associations between emm-types and superantigens may also play a role in the clinical manifestation of S. pyogenes infection.

In this study, the chromosomally encoded genes smeZ and speG occurred at high frequency in nearly all isolates (95.6 % and 88 %, respectively); both were present in all emm-types at high frequencies (>70 %), except speG in emm77 (43 %), in accordance with a variety of other studies [12, 27-29].

The other chromosomal gene, speJ, was present in only 35 % of isolates and was absent in diverse emm-types, similar to other studies [12, 29, 30].

Among the phage-encoded genes, speC was the most prevalent, detected in 48 % of the isolates, followed by ssa (27 %), speA (19 %), speH (16 %), and speI (14 %). The speC, ssa, speH and speI genes presented frequencies similar to those found in other studies, whereas the speA gene generally had a lower frequency in our samples [25, 30, 31]. speA was present in emm3 (100 %), emm1 (62 %) and only one sample of emm6. The speA gene has been commonly detected among emm1 isolates in several studies [32].

Currently, no anti-streptococcal vaccine is available, despite extensive efforts in animal models of streptococcal disease. Several candidate vaccines are in different stages of development. Among them, the 30-valent vaccine contains short peptides from the highly variable amino-terminal region of the M protein [16], and the J8 vaccine candidate comprises a 12 amino acid minimal B-cell epitope from the C-repeat region flanked by 16 amino acids of a yeast DNA-binding protein, conjugated to diphtheria toxoid [17].

The vaccine candidate developed by our group, called StreptInCor, is based on the M5 protein C-terminal region [18], specifically the C2 and C3 regions, which are conserved among serotypes. Through in silico analysis with predicted amino acid sequence alignment, the StreptInCor candidate vaccine showed sequence identity ranging from 59.7 % to 94.5 % (mean of 71 %) with 46 of the 48 emm-types described here, which is an important property for the probability of protection. In previous work, we described the structural, chemical, and biological properties of the StreptInCor peptide and demonstrated that the molecule is stable, which is an important property for a vaccine candidate. The possibility of the StreptInCor vaccine candidate epitope being processed by antigen-presenting cells (APCs), generating diverse peptides, has also been previously demonstrated. That approach indicated that the vaccine epitope could be recognized by any individual, thus enabling a broad coverage capacity to trigger specific immunity [33].

The efficacy of this vaccine in animal models was evaluated in inbred and outbred mice, and a strong humoral response with high IgG production was observed [18]. Immunized Swiss mice challenged with the emm1 strain had a survival rate of 87 % at 21 days compared with lower survival in controls (53 %) [34].

Similar results have been observed in HLA class II transgenic mice, which also presented a specific and long-lasting immune response without developing deleterious reactions after one year. These results indicated that StreptInCor is a safe candidate vaccine [35].

In addition, the four most common emm-types included here (emm1, emm12, emm22 and emm87) were opsonized by StreptInCor-induced antibodies [36]. The strains identified here fit into 12 of the 19 different emm-clusters and exhibited diverse phylogenetic origins and, consequently, different mechanisms of infection and of escape from the host immune system, supporting the hypothesis that StreptInCor vaccination would likely protect against infection caused by strains from different emm-clusters.

Conclusions

This is the first study investigating the epidemiology of streptococcal strains in Sao Paulo, one of the largest cities in the world. These data enabled evaluation of the StreptInCor candidate vaccine coverage capacity against diverse M-types, indicating that the vaccine candidate would likely induce protection against the diverse strains observed worldwide.