Background

Streptococcus pneumoniae, found in the upper respiratory tract of healthy children and adults, causes a range of infections including meningitis, septicemia, pneumonia, sinusitis, and otitis media. Children aged <2 years and adults aged ≥65 years are particularly susceptible [1]. According to the Morbidity and Mortality Weekly Report of April 26, 2013 [2], an estimated 14.5 million cases of serious pneumococcal disease (including pneumonia, meningitis, and sepsis) occur each year in children aged <5 years worldwide, resulting in approximately 500,000 deaths, mostly in low- and middle-income developing countries.

The reasons for the high morbidity and mortality caused by pneumococci are not clearly understood. The pathogenicity of pneumococci has been linked to various virulence factors such as the capsule, the cell wall and its component polysaccharides, pneumolysin, PspA, complement factor H-binding component, autolysin, neuraminidase, peptide permeases, hydrogen peroxide, and IgA1 protease [3,4,5]. Capsular polysaccharide (CPS) is the primary virulence factor, and is also used to categorize S. pneumoniae into more than 90 different serotypes [6,7,8]. The capsule is important for the survival of bacteria at the infection site as it provides resistance to phagocytosis [9].

Pneumococcal CPS is generally synthesized by the Wzx/Wzy-dependent pathway, except for types 3 and 37, which are produced by the synthase pathway [10, 11]. Most genes required for synthesis of the capsule are within the capsule polysaccharide synthesis (cps) operon, which ranges from 10 kb (serotype 3) to 30 kb (serotype 38). The cps operon is flanked by dexB at the 5′ end and aliA at the 3′ end. Neither of these genes participates in capsule synthesis. The 5′ end of the cps locus starts with the regulatory and processing genes wzg, wzh, wzd, and wze (also known as cpsABCD), which are conserved with high sequence identity in all serotypes, followed by the central region consisting of serotype-specific genes [12, 13].

Pneumococcal serotyping is necessary for epidemiological and vaccine impact studies. It also aids in understanding the pathogenicity of the organism and in closely monitoring the emergence of non-vaccine strains, replacement serotypes, and new serovars [14, 15]. Widespread use of pneumococcal vaccines has led to replacement with serotypes that are not included in the vaccines. Continuous monitoring of serotypes is therefore essential for epidemiological surveillance and long-term vaccine impact studies [16,17,18,19,20].

Several phenotypic and genotypic methods are currently used to identify pneumococcal group and type. The phenotypic serotyping methods (capsular swelling reaction, latex agglutination, and coagglutination tests) are costly, require skilled personnel, and cannot detect all serotypes. Genotypic typing methods that assess genome variation include sequential multiplex polymerase chain reaction (PCR), sequential real-time PCR, restriction fragment length polymorphism (RFLP), microarray, sequetyping, and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) analysis. In addition to general applicability and a high discriminatory power, these genotypic assays are economical, detect pneumococci directly from the clinical specimen, and detect emerging serovars, replacement strains, and vaccine escape recombinants [21]. However, many of these methods are multistep and intricate, and do not discriminate all serotypes [22,23,24,25,26].

It is crucial to develop a robust, simple method with complete serotype coverage for serotype detection and pneumococcal serogroup/serotype surveillance [27]. Herein, the authors describe an innovative serotyping approach that relies on sequencing of assembly genes located in the capsular operon to identify all pneumococcal serotypes.

Methods

Reference strains

Ninety-one reference serotype strains of S. pneumoniae were obtained from Statens Serum Institut (SSI), Copenhagen, Denmark (Table 1).

Table 1 PCRSeqTyping results for 91 SSI strains

Clinical isolates

Twenty-eight clinical isolates of S. pneumoniae were selected from isolates submitted to the Central Research Laboratory, KIMS Hospital, Bangalore (Table 2). They were isolated from blood (n = 23), cerebrospinal fluid (CSF) (n = 3), and pleural fluid (n = 2).

Table 2 Serotype distribution of the clinical isolates of Streptococcus pneumoniae from Central Research Laboratory, KIMS Hospital, Bangalore, India

Media and culture conditions

Strains were stored in skim milk, tryptone, glucose, and glycerol (STGG) medium at −80 °C. They were cultured on 5% sheep blood agar (Chromogen, Hyderabad) for 18–24 h at 37 °C with 5% CO2. The isolates were characterized as S. pneumoniae by colony morphology, alpha hemolysis, bile solubility, and optochin susceptibility.

Serotyping

Quellung reaction was performed using Pneumotest kit and type-specific antisera (SSI, Denmark), as recommended by the manufacturer.

PCRSeqTyping

The PCRSeqTyping assay was performed in two steps. Step I involved PCR amplification and sequencing of the cpsB gene from genomic DNA. Based on the cpsB sequence data, the 91 serotypes were divided into a non-homologous group (Group I, 59 serotypes) and a homologous group (Group II, 32 serotypes). The homologous group was further subdivided into 10 subgroups based on sequence homology. Step II involved PCR and sequencing of each homology group using group-specific primers in order to identify the unique serotypes.

Nucleic acid extraction

Genomic DNA was extracted from bacterial strains using QIAamp DNA mini kit (Qiagen, Germany), as per the manufacturer’s protocol.

PCR amplification

PCR was performed using the primers designed by Leung et al. [26] with modifications. Primers used in the study were cps1-FP (5′-GCAATGCCAGACAGTAACCTCTAT-3′), cps2-RP (5′-CCTGCCTGCAAGTCTTGATT-3′) and cps-2538-RP (5′-CTTTACCAACCTTTGTAATCCAT-3′). The reaction mixture was modified to contain 50–100 ng of genomic DNA, 0.75 units of XT-5 polymerase (3 U/μl, Merck; a mixture of the thermostable enzymes Taq DNA polymerase and proof-reading [PR] polymerase), 1X XT5A-Assay buffer, 1 μl of deoxynucleoside triphosphates (dNTPs, 2.5 mM each [Fermentas, United States]), 1 μl of forward primer (100 ng/μl), and 1 μl of reverse primer mix (100 ng/μl). The final reaction volume was made up to 25 μl with DNase/RNase-free distilled water (Gibco, United States). Thermal cycling was performed in a GeneAmp PCR system 9700 (Applied Biosystems, United States) under the following conditions: 94 °C for 5 min, followed by 35 amplification cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 1 min, and a final extension at 72 °C for 5 min. The PCR products were separated by electrophoresis on a 1.2% agarose gel for 45 min at 80 V in 1X Tris-acetate EDTA buffer. Ethidium bromide-stained DNA products were visualized under ultraviolet (UV) illumination, and the size of the DNA products was determined using a 1-kb DNA molecular size marker (Fermentas).
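The quantities in the protocol above lend themselves to a quick sanity check. The following is a minimal sketch, not part of the published method, that derives the enzyme stock volume, the programmed cycling time (ramping between temperatures is ignored), and the length and GC content of each primer. The function names are illustrative assumptions; the numeric inputs and primer sequences are taken from the text.

```python
# Illustrative arithmetic only; inputs are copied from the protocol text.

PRIMERS = {
    "cps1-FP": "GCAATGCCAGACAGTAACCTCTAT",
    "cps2-RP": "CCTGCCTGCAAGTCTTGATT",
    "cps-2538-RP": "CTTTACCAACCTTTGTAATCCAT",
}


def polymerase_volume_ul(units=0.75, stock_u_per_ul=3.0):
    """Volume of enzyme stock (in microlitres) that delivers the requested units."""
    return units / stock_u_per_ul


def cycling_hold_time_s(initial_s=300, cycles=35, steps_s=(30, 30, 60), final_s=300):
    """Sum of programmed hold times in seconds; ramp times are not included."""
    return initial_s + cycles * sum(steps_s) + final_s


def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


if __name__ == "__main__":
    print("Enzyme stock per reaction (ul):", polymerase_volume_ul())
    print("Programmed cycling time (min):", cycling_hold_time_s() / 60)
    for name, seq in PRIMERS.items():
        print(name, "length:", len(seq), "GC fraction:", round(gc_fraction(seq), 2))
```

With the stated values, 0.75 units from a 3 U/μl stock corresponds to 0.25 μl of enzyme per reaction, and the programmed holds add up to 80 min.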

Sequencing and data analysis

PCR products were purified using the QIAquick PCR purification kit (Qiagen, Germany) following the manufacturer’s protocol. Purified PCR products were sequenced using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) and analyzed on an ABI 3730xl Genetic Analyzer (Applied Biosystems). Sequencing was performed in one direction using the forward primer (cps1), 5′-GCA ATG CCA GAC AGT AAC CTC TAT-3′, and the Long Seq module (ABI). The DNA sequences obtained were analyzed for sequence similarity against the GenBank database (http://www.ncbi.nlm.nih.gov/blast) and then assigned to a serotype [26]. The serotype of each cpsB nucleotide sequence was assigned from the GenBank entry with the highest BLAST bit score and > 99% sequence identity to the query amplicon nucleotide sequence.
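The assignment rule above (highest bit score among hits with > 99% identity) can be sketched in a few lines. This example assumes the BLAST hits have been exported in the standard 12-column tabular layout (BLAST+ output format 6), which the study does not state; the study used the NCBI web interface. The example rows, isolate name, and subject identifiers are invented for illustration.

```python
# Columns of BLAST tabular output (outfmt 6):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore

# Hypothetical rows, invented purely to exercise the functions below.
EXAMPLE_TABULAR = "\n".join([
    "\t".join(["isolate_01", "ref_serotype_14", "99.9", "1000", "1", "0",
               "1", "1000", "1", "1000", "0.0", "1840"]),
    "\t".join(["isolate_01", "ref_serotype_9V", "97.5", "1000", "25", "0",
               "1", "1000", "1", "1000", "0.0", "1700"]),
])


def parse_tabular_hits(text):
    """Parse tabular BLAST lines into dicts holding subject, identity, and bit score."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.split("\t")
        hits.append({
            "subject": cols[1],
            "identity": float(cols[2]),
            "bitscore": float(cols[11]),
        })
    return hits


def best_hit(hits, min_identity=99.0):
    """Return the highest-bit-score hit with identity above the threshold, or None."""
    passing = [h for h in hits if h["identity"] > min_identity]
    if not passing:
        return None
    return max(passing, key=lambda h: h["bitscore"])


if __name__ == "__main__":
    hit = best_hit(parse_tabular_hits(EXAMPLE_TABULAR))
    print(hit["subject"] if hit else "no hit above threshold")
```

Returning None when no hit passes the threshold keeps an untypeable sequence distinct from a confident assignment.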

Homology group assignment and PCRSeqTyping

Homology groups

Amplifiable serotypes that shared identical interceding sequences (e.g. serotypes 2 and 41A, or 7B and 40) were placed into 10 groups based on their homology by in silico analysis of the cpsB region. Individual primer sets were designed for each subgroup. Sequetyping data obtained in Step I were used to assign the homologous strains to subgroups (Fig. 1). Serotypes were considered homologous when the highest bit score was shared between two or more serotypes (i.e. the same amount of nucleotide variation between query and database sequences), and were then assigned to one of the 10 groups (Table 3).

Fig. 1 Homology group assignment for 91 pneumococcal serotypes

Table 3 Primers used in PCRSeqTyping assay

For homologous strains, a second round of PCR was performed using group specific primers as specified in Table 3. PCR products were subjected to sequencing reaction. The nucleotide sequence data was used to assign the serotype.
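The homology rule described in this section (two or more serotypes sharing the highest bit score) determines whether an isolate is typed in Step I or passed on to Step II. A minimal sketch of that decision follows, assuming hits are supplied as (serotype, bit score) pairs. The function names and the scores in the example are hypothetical; only the pairing of serotypes 2 and 41A comes from the text.

```python
def top_scoring_serotypes(hits):
    """Return the sorted serotypes that share the highest bit score.

    hits: iterable of (serotype, bitscore) pairs.
    """
    hits = list(hits)
    if not hits:
        return []
    top = max(score for _, score in hits)
    return sorted({serotype for serotype, score in hits if score == top})


def classify(hits):
    """Label a Step I result as 'unique', 'homologous' (needs Step II), or 'no_hit'."""
    candidates = top_scoring_serotypes(hits)
    if len(candidates) == 1:
        return ("unique", candidates)
    if len(candidates) > 1:
        return ("homologous", candidates)
    return ("no_hit", candidates)


if __name__ == "__main__":
    # Hypothetical scores: serotypes 2 and 41A tie, so Step II would be required.
    print(classify([("2", 1800.0), ("41A", 1800.0), ("14", 1500.0)]))
    print(classify([("14", 1800.0), ("9V", 1700.0)]))
```

A "homologous" result would then be mapped to one of the 10 homology groups so that the matching primer set from Table 3 can be used; that lookup is omitted here because it depends on the table contents.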

Results

PCRSeqTyping results for reference strains

The 91 pneumococcal serotype reference strains (sourced from SSI) were tested with PCRSeqTyping protocol. All 91 strains were amplified using the modified method. In Step I of amplification and sequencing, 59 strains of the non-homologous group (Group I) were correctly assigned to their respective serotype. There were 32 strains (Group II) identified along with their homologous type. The homologous types were correctly assigned to their respective type in Step II by performing a second round of amplification using group specific primers and sequencing. Quellung reaction performed using Pneumotest kit (SSI), in parallel with PCRSeqTyping, showed 100% concordant results (Table 1).

The results were further evaluated by blinded testing of PCRSeqTyping. Samples were coded and evaluated in random order. No discrepancies were found between the serotypes assigned by Quellung reaction and by PCRSeqTyping for any of the reference strains.

PCRSeqTyping results for clinical isolates

The twenty-eight pneumococcal isolates tested in the study were from children <5 years with invasive pneumococcal disease. The predominant serotypes were 1, 6B, 19A, 19F, 14 and 7F (Table 2). PCRSeqTyping results and serotyping results by Quellung reaction were in concordance, without any discrepancies. Among the 28 isolates, 25 were assigned to their serotype with the first step of PCRSeqTyping. Three isolates belonging to the homologous group were subsequently identified with the second step of PCRSeqTyping.

Discussion

There is renewed interest in pneumococcal capsular typing techniques as a result of the increased complexity in the management of pneumococcal disease and the widespread use of pneumococcal vaccines [8]. The ability to differentiate pneumococcal strains efficiently is essential for tracking emerging serovars and for epidemiological investigations. The limitations of the Quellung serotyping method and of many DNA-based typing protocols for S. pneumoniae (PCR, restriction fragment length polymorphism, hybridization assays, microarrays, and sequencing) are well known. Different PCR strategies, namely multiplex PCR, sequential PCR, serotype-specific PCR, and real-time multiplex PCR [25, 28–36], targeting serotype-specific regions of cps could detect only 22 serotypes uniquely, and 48 serotypes along with their homologous types [37, 38]. Although these methods cover a limited number of serotypes, PCR is a widely used technique, which avoids the use of serological reagents and requires specific expertise to conduct.

Methods using multiple restriction enzymes and long cps fragments [39, 40] for PCR make the amplification difficult and inconsistent. Another protocol based on sequencing of the regulatory region of cps [30, 31] shows poor resolution with cross-reactivity of serotypes. An approach targeting serotype-specific glycosyl transferase genes [6] was only tested for serogroup 6 and serotype 19F. The cross-reactivity of serotypes, the requirement for a higher number of primers, and poor resolution limit the wide usage of these methods.

With the characterization of the cps locus of 92 serotypes [13], Leung et al. [26] developed the sequetyping protocol using a single primer pair, which binds in all pneumococcal serotypes. Recently, several research groups [27, 41,42,43] have published their results using the sequetyping assay. Limitations of the sequetyping protocol were as follows: (i) only 84 serotypes out of 92 were predicted to be amplified by in silico analysis; (ii) cross-reacting serotypes (30/84) belonging to homologous groups could not be uniquely identified; and (iii) considering the central 732 bp region of the cpsB amplicon which could be sequenced, only 46 of 54 serotypes could be sequetyped.

In the first step of this study’s modified approach, successful amplification of all 91 serotypes was achieved with the addition of a new reverse primer to amplify serotypes 25A, 25F and 38 specifically. Additionally, the XT-5 polymerase used in the PCR amplification reactions contains Taq DNA polymerase and Pfu enzyme. This enzyme blend utilizes the powerful 5′-3′ polymerase activity of Taq DNA polymerase and the 3′-5′ exonuclease-mediated proof-reading activity of PR polymerase, resulting in high-fidelity PCR products [44]. A PCR annealing temperature of 50 °C and an extension time of 1 min were found to be optimal for amplification of the cpsB gene of all 91 strains.

The serotypes were grouped into homologous (32) and non-homologous (59) based on cpsB sequence. Non-homologous types were identified uniquely. The 32 homologous strains were further subdivided into 10 groups (HG 1–10) based on their sequence similarity. Homology group-specific primers were designed and evaluated for their ability to differentiate between strains. HG primers were designed to be able to assign the serotype accurately with second step of PCR and sequencing.

The limitation of using the 732 bp region of the cpsB amplicon in the sequetyping assay, which resulted in prediction of only 46 of 54 serotypes, was overcome with the use of the Long Seq module. With this modification, quality reads of approximately 1.0 kb were obtained in a single sequencing reaction. This provided good-quality reads up to the end of the PCR template, allowing identification of cross-reacting serotypes (15B/15C, 7F/7A, 18B/18C, 9L/9N, 17F/33C, 12A/46, 6C/6D) which have a single SNP in the cpsB region.

A 100% concordance between the serotype results of PCRSeqTyping and Quellung testing was seen for the 28 clinical isolates. Moving forward, the study will be extended to serotype a larger number of clinical isolates and clinical samples. A limitation of the protocol lies in quantification and serotype identification in multiple carriage; however, studies are underway to address these issues. For multiple carriage, the PCR amplicon obtained in the first step will be subcloned into a T/A cloning vector and the individual clones will be sequenced to assign the specific serotype. As the corresponding cpsB gene sequences of the recently discovered serotypes 6E, 6F, 6G, 6H, 11E, 20A, 20B and 23B1 [45,46,47] were unavailable at the time of the study design, they will be included in future studies.

In the study’s center, the typing cost with Pneumotest Kit (SSI, Denmark) was US$35/isolate, while PCRSeqTyping cost was US$10 for Group I (non-homologous strains) and US$15 for Group II (homologous strains). With the easy availability of outsourced sequencing services, the accurate and reliable PCRSeqTyping test can be adopted in a regular microbiology laboratory, even without the sequencing facility.
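To make the cost comparison concrete, the per-isolate figures above can be applied to the isolate counts reported in the Results. This is a rough sketch that assumes costs scale linearly with the number of isolates and ignores fixed costs such as equipment and labor; the function names are illustrative.

```python
# Per-isolate costs in US$, as reported for the study's center.
COST_QUELLUNG = 35
COST_GROUP_I = 10   # non-homologous strains, typed in Step I
COST_GROUP_II = 15  # homologous strains, needing Step II as well


def pcrseqtyping_cost(n_group_i, n_group_ii):
    """Total PCRSeqTyping cost for a set of isolates, assuming linear scaling."""
    return n_group_i * COST_GROUP_I + n_group_ii * COST_GROUP_II


def quellung_cost(n_isolates):
    """Total Pneumotest kit cost for the same number of isolates."""
    return n_isolates * COST_QUELLUNG


if __name__ == "__main__":
    # Clinical isolates: 25 typed in Step I, 3 needing Step II.
    print("Clinical isolates:", pcrseqtyping_cost(25, 3), "vs", quellung_cost(28))
    # Reference strains: 59 in Group I, 32 in Group II.
    print("Reference strains:", pcrseqtyping_cost(59, 32), "vs", quellung_cost(91))
```

Under these assumptions, the 28 clinical isolates would cost US$295 by PCRSeqTyping against US$980 by Quellung, and the 91 reference strains US$1,070 against US$3,185.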

This modified typing method has several advantages over other reported methods. It involves techniques with a workflow that many microbiology laboratories can easily implement. The high throughput PCRSeqTyping method features good discriminatory power, reproducibility, and portability, making it suitable for epidemiological studies. The assay has the flexibility of incorporating additional primers for the characterization of emerging serotypes. An added advantage of this method is that raw data from experiments can be reanalyzed upon the addition of new entries to the serotyping database.

Conclusion

PCRSeqTyping assay is a cost-effective alternative to currently available phenotypic and molecular typing methods. The method is simple to perform, robust, and economical. It can identify all 91 serotypes specifically and uniquely.