Introduction

The acquisition of new genetic material by horizontal gene transfer (HGT) is a major driver of bacterial genome evolution and is mediated by Mobile Genetic Elements (MGEs). The term “mobilome” indicates the entire set of MGEs of the microbiome [1]. MGEs are responsible for the spread of resistance and virulence genes in microbial communities [2,3,4]. Thus, characterization of the mobilome is crucial for studying the acquisition and dissemination of antibiotic resistance determinants in a bacterial population [5]. Even though new metagenomic approaches, both whole and targeted [1, 6, 7], have been implemented, functional studies of MGEs are still required [8, 9]. Integrative and Conjugative Elements (ICEs) are MGEs commonly found in bacteria, where they can constitute up to 25 % of the genome [5, 10,11,12,13,14]. One of the most studied ICEs of Gram-positive bacteria is Tn916, a conjugative transposon originally found in Enterococcus faecalis, which carries the tet(M) tetracycline resistance gene and is considered the prototype of the Tn916-Tn1545 family of ICEs [15,16,17,18,19]. Conjugative transposons of the Tn916-Tn1545 family can insert at multiple integration sites in the chromosome [20], while other ICEs, such as Tn5253, SXT, Tn5397, and ICESt1, integrate at a single specific site [21,22,23,24,25,26]. We previously characterized Tn5253, a 64,528-bp composite ICE of Streptococcus pneumoniae containing the ICE Tn5251 of the Tn916-Tn1545 family and the Ωcat(pC194) element, which carry the tet(M) and cat resistance determinants, respectively [27,28,29]. Tn5253 was found integrated at an 83-bp specific integration site (attB) located in the essential gene rbgA of the S. pneumoniae chromosome [26, 28, 30]. The ICE was shown to excise from the pneumococcal chromosome with production of (i) circular forms in which the ends of the element were joined by an 84-bp sequence (attTn) and (ii) a reconstituted chromosomal attB. Once integrated into the chromosome, Tn5253 was flanked by the attL site, identical to attB, and the attR site, identical to attTn. Pneumococcal mobilome analysis showed the frequent presence of Tn5253-like elements in multidrug-resistant S. pneumoniae strains and the maintenance of the element in all derivative isolates [31,32,33,34]. In this work, in order to contribute to mobilome characterization, we first conducted a functional characterization of the Tn5253 integration site by analyzing attB in Tn5253-carrying transconjugants obtained in S. pneumoniae strains with different genetic backgrounds and in strains belonging to other bacterial species. We then investigated the presence of the Tn5253 attB site in the complete microbial genomes available in public databases.

Results and discussion

Tn5253 integration sites and circularization in different pneumococcal transconjugants

Representative Tn5253-carrying transconjugants were obtained in S. pneumoniae with different genetic backgrounds, namely TIGR4, A66 and SP18-BS74 [28] (Table 1). DNA sequence analysis of Tn5253-chromosome junction fragments showed that: (i) Tn5253 integration occurred at a specific integration site (attB) located in the rbgA gene of the pneumococcal chromosome [26], (ii) attL was identical to attB, (iii) attR was identical to attTn, as already described for D39 and its derivative strains [26], and (iv) attB sites among these pneumococcal strains were not identical, with their size varying from 41 nucleotides (variant attB13 in SP18-BS74) to 83 nucleotides (variant attB2 in TIGR4 and A66) (Fig. 1). We also analysed the nucleotide sequence of Tn5253 junction fragments in the original Tn5253-carrying clinical strain BM6001 and in DP1322, in which Tn5253 was transferred by transformation of a crude lysate from BM6001 [35]. The attL sequences of BM6001 and DP1322 were identical and belonged to an 84-bp-long variant (attB5, Fig. 1), since integration of Tn5253 by transformation occurred via homologous recombination between DNA sequences flanking the Tn5253 att sites. Tn5253 was found to excise from the pneumococcal chromosome with consequent production of circular forms, containing the attTn site, and reconstitution of the attB site [26]. To investigate whether different pneumococcal genetic backgrounds influence the excision and circularization of Tn5253, quantitative PCR on cell lysates was used to quantify the excision of Tn5253 and attB reconstitution in liquid pneumococcal cultures (Table 2). Interestingly, the transconjugant FR56, derived from SP18-BS74, produced Tn5253 circular forms and reconstituted attB sites at very high frequency (1.2 × 10^-2 and 1.9 × 10^-3 copies per chromosome, respectively). However, these results did not correlate with the conjugation frequency, which was 6.1 × 10^-6, indicating that the frequency of circularization is not the only limiting factor of the conjugation process. Neither circular forms nor reconstituted attB of Tn5253 could be detected in the TIGR4 background (<3.6 × 10^-5 and <3.5 × 10^-4, respectively), correlating with the absence of conjugal transfer (<9.9 × 10^-8). Analysis of Tn5253 integration in pneumococci with different genetic backgrounds revealed that the element always integrates downstream of nucleotide position 20 of the rbgA coding sequence (CDS) (Fig. 1). rbgA is an essential gene encoding the ribosomal biogenesis GTPase protein involved in the assembly of the 50S ribosomal subunit [36]. Integration of Tn5253 leads to the duplication of the integration site, restoring the CDS and preserving cell viability. Site-specific integration of MGEs often occurs at the 5’ or 3’ end of genes, such as those coding for tRNAs or ribosomal proteins, which are essential and conserved among different bacterial species. This characteristic allows MGEs to cross the species boundary and favors their spread within bacterial communities.
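The arrangement of att sites described above can be illustrated with a toy string model. The following Python sketch is only a schematic: the sequences ATT_B, ATT_TN and ELEMENT are short invented placeholders (not real Tn5253 or rbgA sequences) and the function names are hypothetical. It reproduces the observed layout, in which attL is identical to attB and attR is identical to attTn, and shows that excision yields a circular form carrying attTn together with a reconstituted attB.

```python
# Toy model of Tn5253 site-specific integration and excision.
# All sequences are invented placeholders, NOT real Tn5253 or rbgA sequences.

ATT_B = "ggcattgcaa"    # hypothetical chromosomal integration site (attB)
ATT_TN = "ggcTttgcaa"   # hypothetical joint of the circular form (attTn)
ELEMENT = "NNNNNNNN"    # stands in for the body of the element


def integrate(chromosome: str, att_b: str, element: str, att_tn: str) -> str:
    """Insert the element so that attL equals attB and attR equals attTn."""
    pos = chromosome.find(att_b)
    if pos < 0:
        raise ValueError("attB not found in chromosome")
    end = pos + len(att_b)
    # attB stays on the left (attL); attTn ends up on the right (attR),
    # so the att site is duplicated and the downstream sequence is unchanged.
    return chromosome[:end] + element + att_tn + chromosome[end:]


def excise(chromosome: str, att_b: str, element: str, att_tn: str):
    """Return (chromosome with reconstituted attB, circular form with attTn)."""
    integrated = att_b + element + att_tn
    pos = chromosome.find(integrated)
    if pos < 0:
        raise ValueError("integrated element not found")
    restored = chromosome[:pos] + att_b + chromosome[pos + len(integrated):]
    circular_form = element + att_tn  # element ends joined by attTn
    return restored, circular_form


chromosome = "AAAA" + ATT_B + "CCCC"
with_element = integrate(chromosome, ATT_B, ELEMENT, ATT_TN)
restored, circle = excise(with_element, ATT_B, ELEMENT, ATT_TN)
```

In this simplified picture the sequence downstream of the att site is untouched by integration, which is the string-level analogue of the duplication that restores the rbgA CDS.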

Table 1 Bacterial strains and relevant properties
Fig. 1

Allelic variants of the Tn5253 attB integration site in S. pneumoniae. Tn5253 attB is located in the essential pneumococcal rbgA gene. In the 85 complete S. pneumoniae genomes, 13 allelic variants of attB were found. The 83-bp variant 1 (attB1) is the most frequent, carried by 38 genome strains (44.7 % of the analyzed genomes), including the D39 strain, its rough derivative R6, and the A66 strain, and was used as the reference for the alignment. Variant 2 is carried by TIGR4 and 8 other genome strains and contains 2 nucleotide substitutions. BM6001, the original Tn5253-carrying clinical strain, and DP1322, in which Tn5253 was transferred by transformation from BM6001, harbour variant 5. SP18-BS74, whose draft genome is available, harbours the 41-bp variant 13. Within the sequences, identical nucleotides are indicated by periods and substitutions are in red. Dashes are inserted to improve the alignment. The 20 nucleotides belonging to the rbgA coding sequence are underlined.

Table 2 PCR quantification of Tn5253 circular forms and reconstituted attB integration sites in Tn5253-carrying transconjugants

Tn5253 integration sites and circularization in Streptococcus and Enterococcus

We then extended the Tn5253 functional analysis to streptococci and enterococci by characterizing Tn5253 circular forms and integration sites in the transconjugants obtained in S. agalactiae H36B, S. pyogenes SF370, S. gordonii V288, and E. faecalis OG1SS and JH2-2 backgrounds (Table 1). In all bacterial hosts, Tn5253 integration occurred in the orthologous rbgA genes (Fig. 2). As found in S. pneumoniae, in all the bacterial hosts tested: (i) attL was identical to attB regardless of the bacterial strain harbouring the element and attR was identical to attTn, suggesting a polarization of the Tn5253 integration process, (ii) integration always occurred downstream of an 11-bp conserved sequence, namely the last 11 nucleotides of the attB sites, (iii) the length of the attB site corresponded to the length of the duplication after Tn5253 integration, and (iv) attB site duplication restored the rbgA CDS. It is worth noting that in E. faecalis, attB duplication modifies the predicted rbgA gene product (Fig. 3). Tn5253 produced circular forms at a similar frequency in S. agalactiae (2.9 × 10^-5 copies per chromosome) and S. pyogenes (3.0 × 10^-5 copies per chromosome), while no circular forms were detected in the E. faecalis JH2-2 genetic background (<2.7 × 10^-7 copies per chromosome) (Table 2). Reconstituted attB sites were found in all streptococci and enterococci tested at a frequency ranging from 3.7 × 10^-5 (in S. pyogenes) to 1.7 × 10^-2 (in the E. faecalis JH2-2 background) copies per chromosome. In E. faecalis, Tn5253 excision and circularization are strain dependent: a representative transconjugant obtained in the OG1SS background produced circular forms and reconstituted attB sites at 1.4 × 10^-4 and 6.8 × 10^-3 copies per chromosome, respectively, whereas transconjugant FR50, obtained in the JH2-2 background, produced reconstituted attB sites at a frequency of 1.7 × 10^-2 copies per chromosome but did not produce circular forms (<2.7 × 10^-7). Conjugation frequency was lower than circularization frequency in all the tested strains except S. pyogenes FR68. Many other factors are likely to be important in the conjugation process, such as the expression of a capsular polysaccharide [37], the cell wall thickness, the surface charges, and the ability of the conjugation pore to establish a stable contact between cells from different species.

Fig. 2

Tn5253 attB integration sites in the orthologous rbgA genes of other bacterial species. Genome sequence analysis identified Tn5253 attB sites in the orthologous rbgA genes of 18 other bacterial species, with a size ranging from 33 nucleotides in S. gordonii to 84 nucleotides in Streptococcus mitis. The 17-bp E. faecalis attB was first found experimentally by PCR and sequencing of the Tn5253-chromosome junction fragments of our E. faecalis transconjugants. The 17 nucleotides were then used as a probe for database interrogation. Within the same bacterial species, different strains can harbour different allelic variants (up to 7 in S. equi). The sequence of the most represented allelic variant was used for the sequence alignment and its frequency is reported. The S. pneumoniae D39 attB variant 1 was used as the reference. Tn5253 chromosomal integration, in the original S. pneumoniae host as in the other functionally characterized streptococcal and enterococcal hosts (shaded), always occurs downstream of an 11-bp conserved sequence, namely the last 11 nucleotides of the attB sites. These 11 nucleotides (boxed in blue) are also conserved among the attB sites of other bacterial species. Within the sequences, identical nucleotides are indicated by periods and substitutions are in red. Dashes are inserted to improve the alignment. The 20 nucleotides belonging to the rbgA coding sequence are underlined.

Fig. 3

Tn5253 attB site and integration in the E. faecalis chromosome. (A) Tn5253 attB (represented as a blue box) in E. faecalis is composed of 17 nucleotides, which correspond to the first 17 nucleotides of the rbgA CDS (light blue arrow). In streptococci and enterococci, Tn5253 always integrates downstream of a conserved 11-nucleotide sequence (boxed in red). The nucleotide sequence of attB and the deduced amino acid sequence are reported. (B) Site-specific integration of Tn5253 into rbgA causes integration site duplication, restoring an intact CDS. The integration of Tn5253 into the bacterial chromosome seems to be polarized, since attTn (orange box) always flanks the element (light orange box) at the right end. In E. faecalis, integration site duplication results in the acquisition of an additional codon (GCT → Alanine) in the rbgA CDS. attB and attTn sites are not drawn to scale. The 84-nucleotide sequence of attTn and the deduced amino acid sequence of the RbgA N-terminal end are reported. The single-letter amino acid code is used.

Genome sequence analysis of Tn5253 attB site in S. pneumoniae

To complement the experimental data, a genome-wide investigation of Tn5253 attB among pneumococci was carried out. The database of 85 complete S. pneumoniae genomes (accessed in August 2021) was interrogated using the 83-bp attB as a query. Sequence homology analysis identified thirteen allelic variants of attB (Fig. 1, Table S1). In 75 genomes (88.2 %), the attB site was 83 or 84 nucleotides in length, while in 10 (11.8 %) it was 41 nucleotides. The 83-bp attB variant 1 is the most frequent variant, carried by 38 genome strains (44.7 % of the analyzed genomes), including the D39 strain, its rough derivative R6, and the classical type 3 Avery’s strain A66. Variant 2 is carried by TIGR4 and 8 other genome strains (10.6 %) and contains two nucleotide substitutions. Variant 13 is harboured by G54 and 9 other genome strains (11.8 %) and contains only the last 41 nucleotides of variant 1. In addition, SP18-BS74, whose draft genome is available, also harbours variant 13. Variants 5 and 7 are each carried by 7 strains (8.2 %), variant 4 by 5 strains (5.9 %), and variants 6 and 9 by 2 strains each (2.4 %). The remaining 5 variants (3, 8, 10, 11 and 12) were each found in only one strain. In thirteen pneumococcal genomes, carrying the attB variants 1, 2, 7, 11 and 12, Tn5253-like elements were integrated into the pneumococcal chromosome, resulting in the duplication of the attB site.
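The variant frequencies quoted above follow directly from the per-variant genome counts. The short Python sketch below recomputes them, rounded to one decimal place. It assumes that variants 5 and 7 are carried by 7 strains each and variants 6 and 9 by 2 strains each (a reading under which the counts sum to the 85 genomes analyzed); the names `variant_counts` and `percent` are illustrative only.

```python
# Genome counts per attB allelic variant, as read from the text
# (assumption: "7 strains" and "2 strains" apply to each listed variant).
variant_counts = {
    1: 38,   # D39, R6, A66 and others
    2: 9,    # TIGR4 plus 8 other genome strains
    13: 10,  # G54 plus 9 other genome strains
    5: 7, 7: 7,
    4: 5,
    6: 2, 9: 2,
    3: 1, 8: 1, 10: 1, 11: 1, 12: 1,
}

total = sum(variant_counts.values())


def percent(count: int, total: int) -> float:
    """Frequency as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


frequencies = {v: percent(n, total) for v, n in sorted(variant_counts.items())}

for variant, freq in frequencies.items():
    print(f"attB variant {variant}: {variant_counts[variant]} genomes ({freq} %)")
```

Checking that `total` equals the number of genomes interrogated is a quick way to confirm that no variant has been double-counted or omitted.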

Genome sequence analysis of Tn5253 attB site in other bacterial species

Genome analysis was extended to the 58,138 complete microbial genomes (accessed in August 2021). Homology search identified the Tn5253 attB site in 18 other bacterial species, including the functionally characterized S. agalactiae, S. pyogenes and S. gordonii hosts (Fig. 2, Table S2). The 17-bp E. faecalis attB was first found experimentally by sequencing the Tn5253-chromosome junction fragments obtained by inverse PCR from our E. faecalis transconjugants. The 17 nucleotides were then used as a probe for database interrogation. Tn5253 attB was located in orthologous rbgA genes and its size ranged from 17 nucleotides in E. faecalis to 84 nucleotides in Streptococcus mitis. Alignment of the attB sequences obtained from the different bacterial species confirms the presence of the 11-bp conserved sequence. In principle, all of these attB sites could allow Tn5253 integration; however, a Tn5253-like element was found integrated, with attB duplication, in only one genome, namely Streptococcus mitis SVGS_061.

Conclusions

In the present paper we conducted a functional characterization of the Tn5253 attB site in S. pneumoniae and other streptococcal and enterococcal species and found that: (i) during conjugal transfer, Tn5253 integrated in the S. pneumoniae rbgA gene or in the orthologous rbgA genes of the other bacterial hosts, (ii) Tn5253 produced circular forms containing the attTn site at a frequency that was species- and strain-dependent, and (iii) reconstitution of the attB site was species- and strain-dependent. Through a DNA homology search conducted in the complete microbial genome database, we also found that: (i) thirteen allelic variants of the Tn5253 attB site were present in the complete S. pneumoniae genomes and their size ranged from 41 to 84 nucleotides, and (ii) in other bacterial species, Tn5253 attB is located in orthologous rbgA genes with a size ranging from 17 to 84 nucleotides. Tn5253 integration, in the original S. pneumoniae host as in the other streptococcal and enterococcal hosts, always occurs downstream of an 11-bp conserved sequence located in the rbgA CDS. Genome analysis revealed that these 11 nucleotides, corresponding to the last 11 nucleotides of the attB sites, are also conserved among the attB sites of other bacteria and can be considered the core of the integration site. In conclusion, even though a huge number of bacterial genomes are available, an in silico analysis combined with a functional characterization of the mobilome has been reported in only a few cases. In this work, a functional characterization of the Tn5253 attB integration site, combined with genome sequence analysis, contributed to elucidating the potential of Tn5253 for horizontal gene transfer among different bacterial species.

Materials and methods

Bacterial strains, growth, and mating conditions

Bacterial strains and their relevant properties are reported in Table 1. Both streptococcal and enterococcal strains were grown in tryptic soy broth or tryptic soy agar (Difco) supplemented, where appropriate, with antibiotics. Plate mating conjugation experiments were performed as previously described [38]. Briefly, donor and recipient cells were grown until the end of exponential phase, mixed at a 1:10 ratio, collected by centrifugation, plated and incubated for 4 h. Cells were harvested by scraping the plates and recombinant strains were selected by a multilayer plating procedure in the presence of the appropriate antibiotics. Transconjugant FR39 was obtained from a mating experiment in which FP58 [29] was the donor of Tn5253 and HB565, a streptomycin-resistant derivative of the type 3 Avery strain A66 [14, 39, 40], was the recipient.

Bacterial lysate preparation

Bacterial cultures (1 ml) were harvested in exponential phase (OD590 about 0.2, roughly corresponding to 5 × 10^8 CFU/ml) and centrifuged at 11,000 × g for 2 min. Pneumococcal lysates were obtained by using lysis solution (0.1 % DOC, 0.008 % SDS) as already reported [26]. Streptococcal and enterococcal cell pellets were resuspended in 90 µl of protoplasting buffer (25 % sucrose, 100 mM Tris pH 7.2, 5 mM EDTA), then lysozyme (for E. faecalis) or mutanolysin (for S. agalactiae, S. gordonii and S. pyogenes) was added at a final concentration of 1 mg/ml or 20 µg/ml, respectively, and the mixtures were incubated at 37 °C for 1 h. Protoplasts were centrifuged at 3,000 × g for 15 min, resuspended in 100 µl of dH2O, heated at 85 °C for 5 min and kept on ice until use.

PCR, inverse PCR, sequencing

PCR experiments and direct DNA sequencing of PCR amplicons were carried out essentially as already described [28, 29]. Briefly, PCR reactions were carried out in a 25-µl reaction mixture containing 1X DreamTaq buffer, 100 µM dNTPs, 1.5 mM MgCl2, 10 pmol of each primer, 0.2 U of DreamTaq enzyme, and 1 µl of bacterial culture. Inverse PCR, used to amplify the Tn5253-chromosome junctions, was performed with pairs of divergent primers targeting the Tn5253 ends as described [29]. A total of 100 ng of each unpurified PCR fragment was used as template in sequencing reactions carried out with the BigDye Terminator v3.1 Cycle Sequencing Kit.

Quantitative real-time PCR

A LightCycler 1.5 apparatus (Roche) and the KAPA SYBR FAST qPCR kit Master Mix Universal (2X) (Kapa Biosystems) were used for real-time PCR experiments according to a previously described protocol [26]. Quantification of Tn5253 circular intermediates and reconstituted pneumococcal attB was obtained with the primer pairs IF327/IF328 and IF496/IF356, respectively [26]. The reconstituted attB site was quantified in S. agalactiae with the primer pair IF560/IF561, which amplified a 353-bp fragment, in S. gordonii with IF544/IF545, which amplified a 396-bp fragment, in S. pyogenes with IF509/IF510, which amplified a 249-bp fragment, and in E. faecalis with IF525/IF532, which amplified a 480-bp fragment (Table S3). A standard curve for the gyrB gene was used to standardize results, and melting curve analysis was performed to differentiate the amplified products from primer dimers, as reported [26].
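The way a standard curve can be used to express results as copies per chromosome is sketched below in Python. This is a generic illustration and not the protocol of reference [26]: it assumes a linear standard curve of the form Cq = slope × log10(copies) + intercept, applies the same curve to the target amplicon and to the single-copy chromosomal gene gyrB, and uses invented slope, intercept and Cq values. The function names are hypothetical.

```python
def copies_from_cq(cq: float, slope: float, intercept: float) -> float:
    """Invert a linear standard curve Cq = slope * log10(copies) + intercept."""
    return 10 ** ((cq - intercept) / slope)


def copies_per_chromosome(cq_target: float, cq_gyrb: float,
                          slope: float, intercept: float) -> float:
    """Target copies normalized to gyrB copies (one gyrB copy per chromosome)."""
    target = copies_from_cq(cq_target, slope, intercept)
    chromosomes = copies_from_cq(cq_gyrb, slope, intercept)
    return target / chromosomes


# Invented example values (assumptions, not measurements from this study):
SLOPE = -3.32      # about 100 % amplification efficiency
INTERCEPT = 38.0   # Cq expected for a single template copy

# A target detected 10 cycles later than gyrB in the same lysate.
ratio = copies_per_chromosome(cq_target=28.0, cq_gyrb=18.0,
                              slope=SLOPE, intercept=INTERCEPT)
print(f"{ratio:.1e} copies per chromosome")
```

With a shared curve the intercept cancels out of the ratio, so the result depends only on the Cq difference and the slope; separate curves per amplicon would require one slope and intercept for each.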

Microbial database interrogation and sequence analysis

Homology searches of the databases available at the National Center for Biotechnology Information were conducted using the Microbial Nucleotide BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&BLAST_SPEC=MicrobialGenomes), selecting the complete genomes database. Default parameters were used and only alignments with significant e-values were considered. To confirm the results, we built a stand-alone database containing only the genomes of interest and searched it with the BLAST software. Sequence analysis was carried out with BioEdit 7.2.5 (http://bioedit.software.informer.com/). Multiple DNA sequence alignments were performed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/).
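Retaining only alignments with significant e-values can be sketched in Python as follows. The example assumes that the stand-alone BLAST search was saved in tabular format (`-outfmt 6`), whose twelve default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The threshold of 1e-5, the function name `significant_hits` and the three sample hits are assumptions made for illustration, since the text does not state the cutoff used.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def significant_hits(tabular_text: str, max_evalue: float = 1e-5) -> list:
    """Return hits whose e-value is at or below max_evalue (hypothetical cutoff)."""
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= max_evalue:
            hits.append({
                "subject": row["sseqid"],
                "identity": float(row["pident"]),
                "length": int(row["length"]),
                "evalue": evalue,
            })
    return hits


# Invented sample output: two full-length matches and one short, weak match.
example = (
    "attB1\tgenome_A\t100.000\t83\t0\t0\t1\t83\t1500\t1582\t2.1e-36\t154\n"
    "attB1\tgenome_B\t97.590\t83\t2\t0\t1\t83\t9000\t9082\t4.5e-33\t143\n"
    "attB1\tgenome_C\t100.000\t12\t0\t0\t5\t16\t300\t311\t3.2\t24.3\n"
)

hits = significant_hits(example)
for hit in hits:
    print(hit["subject"], hit["length"], hit["evalue"])
```

For very short queries, such as the 17-nucleotide E. faecalis attB, e-values are necessarily larger, so in practice the cutoff would need to be chosen with the query length in mind.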