Staphylococcus aureus is a relatively clonal microorganism that lives in close contact with human beings [1–3]. Although multiple human body sites can be colonized, the anterior nares are the most frequent carriage site for S. aureus, and nasal carriage appears to play a key role in the epidemiology and pathogenesis of infection [3]. It is assumed that S. aureus’ conserved core genome composition provides the organism with the potential to interact intimately with its human host, whereas the accessory genome, including “core-variable genes”, harbors the specific colonization and virulence genes rendering it infectious [4, 5]. Interestingly, a recent study using microarrays could not identify specific genes, alone or in combination, that were associated with invasive disease. It was suggested that S. aureus gene combinations necessary for invasive disease may also be necessary for nasal colonization and that the ability to cause invasive disease is mostly dependent on host factors [6].

A variety of molecular strategies has been used to distinguish and catalogue core genome versus core-variable and mobile elements in S. aureus [6]. Clonality has been defined on the basis of, among others, multi-locus sequence typing (MLST) and high-throughput amplified fragment length polymorphism (ht-AFLP) studies, whereas core-variable and mobile elements have been identified by, for instance, DNA array hybridization studies [2, 6, 7].

Large-scale data obtained by different genome screening methods, focussing on core genomic versus hypermutable regions, have rarely been linked in order to compare molecular type assignments. We provide here an example of such an analysis. We have compared whole genome polymorphism (as defined by ht-AFLP) and short sequence repeat length variation (using multi-locus variable number of tandem repeat analysis [MLVA]) for 994 S. aureus strains isolated from both healthy carriers and invasive infections. Ht-AFLP is a whole genome typing method that scans for polymorphism in restriction sites and the nucleotides bordering these sites. As such it documents nucleotide sequence variation, insertions, and deletions across entire genomes. Recently, MLVA has been introduced as a typing method for a large number of bacterial pathogens [8–10]. In MLVA, the variability in the numbers of short tandem repeated sequences is utilized to create DNA fingerprints for epidemiological studies [9]. We visualize here the genetic variability of S. aureus based on ht-AFLP versus MLVA and define overlaps in the inter-strain relatedness.

Materials and methods

Bacterial isolates and DNA isolation

Carriage and clinical isolates of methicillin-susceptible S. aureus (n = 994) have been described before [7]. Strains analysed in our study included carriage (n = 805), bacteremic (n = 132), skin- and soft-tissue infection (n = 17), and impetigo-derived isolates (n = 40) from children and elderly individuals. These strains were isolated at different time points (between 1997 and 2002) from persons living in the greater Rotterdam area in The Netherlands. DNA was isolated from all isolates using culture on blood agar and automated DNA extraction using the Roche MagnaPure and the Bacterial DNA III isolation kit (Roche, Almere, The Netherlands).

Amplified fragment length polymorphism (AFLP) and multiple-locus variable number of tandem repeat analysis (MLVA)

All strains were genetically typed using a high-throughput AFLP (ht-AFLP) approach as described previously [7]. Optimal enzyme and primer combinations were selected using the predictive software package Recomb (Keygene NV, Wageningen, The Netherlands). Bacterial DNA was digested with the enzymes MboI and Csp6I and the linker oligonucleotide pairs for MboI (5’-CTCGTAGACTGCGTACC-3’ and 5’-GATCGGTACGCAGTCTAC-3’) and for Csp6I (5’-GACGATGAGTCCTGAC-3’ and 5’-TAGTCAGGACTCAT-3’) were ligated. Subsequently, a nonselective preamplification was performed using the MboI (5’-GTAGACTGCGTACCGATC-3’) and Csp6I primers (5’-GACGATGAGTCCTGACTAC-3’). Finally, a 33P-labelled MboI primer containing one selective nucleotide (either +C or +G) and a Csp6I primer containing two selective nucleotides (+TA) were used. Amplified material was analysed using polyacrylamide slab gels and autoradiography. Marker fragments (147 AFLP markers per isolate) were scored and a binary table, scoring marker fragment absence (0) or presence (1), was compiled [7, 11].
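The scoring step can be illustrated with a minimal sketch. It assumes that each isolate is represented by the set of marker fragments observed on its gel lane; the isolate names, marker labels, and the function name `build_binary_table` are hypothetical and are not part of the published protocol.

```python
from typing import Dict, List, Set


def build_binary_table(
    observed: Dict[str, Set[str]], markers: List[str]
) -> Dict[str, List[int]]:
    """Score each marker as present (1) or absent (0) for every isolate."""
    return {
        isolate: [1 if marker in fragments else 0 for marker in markers]
        for isolate, fragments in observed.items()
    }


# Hypothetical example with 4 markers instead of the 147 scored per isolate.
markers = ["m1", "m2", "m3", "m4"]
observed = {
    "isolate_A": {"m1", "m3"},
    "isolate_B": {"m1", "m2", "m3"},
}
table = build_binary_table(observed, markers)
```

Each row of such a table is a fixed-length binary profile, which is the form required for the categorical comparison used in the data analysis.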

In addition, multiple-locus variable number of tandem repeat analysis (MLVA) was performed for all isolates [12, 13]. In short, repetitive DNA from the V8 serine protease (sspA), protein A (spa), Ser-Asp-rich fibrinogen-binding protein (sdrCDE), clumping factor B (clfB), clumping factor A (clfA), fibronectin-binding protein (fnBP), collagen adhesion A (cna), methicillin-resistant surface protein (pls), and cell wall surface-anchored protein (sas) genes was amplified in a single multiplex PCR, which also included the non-repeat-containing methicillin resistance (mecA) gene. Amplified material obtained by PCR on purified genomic DNA (1 ng) was subjected to capillary electrophoresis using the Agilent 2100 BioAnalyzer (Agilent, Palo Alto, CA, USA), generating 90-second tracing files containing fluorescence values for amplicons ranging from 50 base pairs to 11 kbp. The BioAnalyzer reagent kit contains upper and lower molecular weight markers used to normalize all profiles.
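How two internal markers can be used to bring profiles onto a common scale is sketched below. The exact normalization applied by the instrument and analysis software is not described here, so the linear two-point rescaling, the reference range of 0 to 100, and the function name `normalize_positions` are assumptions made for illustration only.

```python
from typing import List


def normalize_positions(
    peaks: List[float],
    lower_obs: float,
    upper_obs: float,
    lower_ref: float = 0.0,
    upper_ref: float = 100.0,
) -> List[float]:
    """Linearly rescale peak positions so that the lower and upper markers
    land on fixed reference values (assumed two-point normalization)."""
    if upper_obs == lower_obs:
        raise ValueError("lower and upper marker positions must differ")
    scale = (upper_ref - lower_ref) / (upper_obs - lower_obs)
    return [lower_ref + (p - lower_obs) * scale for p in peaks]


# Hypothetical run: markers detected at 20 s and 80 s of the trace,
# with two amplicon peaks in between.
run = normalize_positions([20.0, 35.0, 50.0, 80.0], lower_obs=20.0, upper_obs=80.0)
```

After rescaling, peaks from different runs can be compared on the same axis even when absolute migration times drift between runs.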


Multi-locus sequence typing (MLST) was carried out for a selection of 53 (5.3%) of the 994 S. aureus strains using DNA arrays [14]. These strains were equally distributed across the AFLP dendrogram by selecting approximately one out of 20 strains going from top to bottom through the AFLP dendrogram [7].
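The selection principle resembles systematic sampling along the dendrogram order, as in the sketch below. The strain labels, the starting offset, and the function name `select_every_nth` are hypothetical. A strict step of 20 over 994 strains yields 50 strains, whereas 53 were typed in the study, which is consistent with the selection having been approximate.

```python
from typing import List, Sequence


def select_every_nth(ordered: Sequence[str], step: int = 20, start: int = 0) -> List[str]:
    """Pick every `step`-th item from a list ordered top to bottom."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(ordered[start::step])


# Hypothetical leaf order of the AFLP dendrogram (top to bottom).
leaf_order = [f"strain_{i:03d}" for i in range(1, 995)]
subset = select_every_nth(leaf_order, step=20)
```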

Data analysis

For the comparative analyses of both the AFLP and the MLVA data sets, minimum spanning trees (MSTs) were calculated using the Bionumerics software package (Applied-Maths, Sint-Martens-Latem, Belgium). The binary AFLP data were clustered using a categorical coefficient. Complexes were created if the maximum neighbor distance was two changes (except for Fig. 1a, for which the maximum neighbor distance was 16 changes). The MLVA profiles obtained by capillary electrophoresis were normalized based on the lower and upper band present in all samples, and band positions were automatically assigned by the Bionumerics software and manually adjusted where needed. Band-based Dice clustering was performed using a 0.5% band position tolerance and 1% optimization. The distance matrix that was thus obtained was used to construct an MST using a 6% similarity bin size to designate MLVA profiles as being identical. For the construction of MLVA complexes a maximum neighbor distance of one change was used.
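The principles behind these calculations can be sketched as follows. This is not the Bionumerics implementation: the sketch uses a plain Prim's algorithm without the software's tie-breaking rules, treats a complex as a group of profiles connected by MST edges no longer than the cut-off, assumes band positions normalized to a 0 to 100 scale so that a tolerance of 0.5 stands in for 0.5%, and does not model the optimization or similarity bin settings. All function names and example profiles are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int, float]


def categorical_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of marker positions at which two binary profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def dice_similarity(a: Sequence[float], b: Sequence[float], tol: float = 0.5) -> float:
    """Band-based Dice coefficient: 2 * shared / (bands in a + bands in b).
    Two bands match when their positions differ by at most `tol`."""
    a, b = sorted(a), sorted(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    total = len(a) + len(b)
    return 2.0 * shared / total if total else 1.0


def minimum_spanning_tree(dist: List[List[float]]) -> List[Edge]:
    """Prim's algorithm on a full symmetric distance matrix."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    if n:
        best[0] = 0.0
    edges: List[Edge] = []
    for _ in range(n):
        # Add the closest node that is not yet part of the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def complexes(n: int, edges: List[Edge], max_neighbor_distance: float) -> List[List[int]]:
    """Group nodes joined by MST edges no longer than the cut-off."""
    root = list(range(n))

    def find(x: int) -> int:
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for u, v, d in edges:
        if d <= max_neighbor_distance:
            root[find(u)] = find(v)
    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical binary profiles (6 markers instead of 147).
profiles = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0],
]
dist = [[categorical_distance(p, q) for q in profiles] for p in profiles]
tree = minimum_spanning_tree(dist)
strict = complexes(len(profiles), tree, max_neighbor_distance=2)
relaxed = complexes(len(profiles), tree, max_neighbor_distance=16)
```

In this toy example the strict cut-off separates the fourth profile from the other three, whereas the relaxed cut-off merges all four into one complex, which mirrors the effect of using two versus 16 changes on the same tree.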

Fig. 1

Minimum spanning trees based on: (a) AFLP data (n = 994) where complexes were created if the maximum neighbour distance was 16 changes, (b) AFLP data (n = 994) where complexes were created if the maximum neighbour distance was two changes, and (c) MLVA data (n = 994) where complexes were created using a cut-off value of 94%

Results and discussion

Figure 1 shows the type assignments for all of the strains based on separate analyses of the ht-AFLP and MLVA datasets. The ht-AFLP data of the 994 S. aureus isolates are visualized using two minimum spanning trees (MSTs) with different cut-off values (Fig. 1a and b with a maximum neighbor distance of 16 and two, respectively). It should be emphasized that the (dotted) lines between the clusters do not indicate a directional evolutionary relationship between these different clusters. As described earlier, the genuine population structure of S. aureus can be subdivided into three major AFLP clusters (denoted 1, 2, and 3) and two minor AFLP clusters (denoted 4 and 5) (Fig. 1a) [7, 15]. In addition, two very small outlier clusters are visible (brown complex [n = 3] and dark blue complex [n = 2]). The AFLP MST in Fig. 1a essentially confirms the clonal structure of S. aureus, which is in contrast with those of other pathogenic species such as Streptococcus pneumoniae [11, 16].

Figure 1b shows that the three major and two minor AFLP clusters could be subdivided into several sub-clusters by applying more stringent segregation parameters during MST construction. The largest cluster (blue) is genetically heterogeneous, in contrast with the two other major clusters (red [2] and green [3]). This is in agreement with the previously reported data on the natural population structure of S. aureus [7]. Major AFLP clusters 2 and 3 are genetically homogeneous lineages and correspond with MLST clonal complexes 30 and 45, respectively. In the analysis where we applied a maximum neighbor distance of two changes for the formation of AFLP complexes or clusters (Figs. 1b, 2a, and 3a), 22 different sub-clusters with more than five S. aureus isolates were identified, and 17 of these sub-clusters or complexes carried more than ten isolates. Several of these clusters coincided with MLST sequence types (ST) (Fig. 2a). Earlier studies showed that AFLP clustering matches the major clonal complexes as defined by MLST [7, 15].

Fig. 2

Minimum spanning trees based on: (a) the AFLP data (n = 994), with MLST data of 53 S. aureus isolates included in the figure; each complex was assigned a different colour, and (b) the MLVA data (n = 994), where the colours are based on the different S. aureus AFLP complexes shown in Fig. 2a

Fig. 3

Minimum spanning trees based on: (a) AFLP data (n = 994) and (b) MLVA data (n = 994). The colours which are used in this figure are based on the source (carriage or invasive) of the S. aureus isolates. Light green are carriage isolates (n = 805); red are “blood isolates” (n = 149) including a few skin- and soft-tissue infection isolates (n = 17); and blue are impetigo-isolates (n = 40)

When the same set of strains was analysed by MLVA (Fig. 1c), the layout of the MST clearly revealed enhanced variability: a large number of minor genetic variants were visualized when a cut-off similarity value of 94% was used to identify types. However, the majority of the clonal complexes defined in Fig. 1b co-segregate again on the basis of MLVA. For example, the red, green, and yellow complexes (clusters 2, 3, and 4, respectively), each of which groups closely related strains by AFLP, also remain distinct groups after MLVA-based MST construction. In short, despite the apparently increased complexity of the MLVA MST, there is still similarity with the topology of the AFLP MST. Obviously, deviations can be observed as well.

Finally, Fig. 3 displays the association between the AFLP and MLVA MSTs and the clinical origin of the strains. The conclusion is that both invasive and colonizing isolates are present in all S. aureus clusters, whether these are defined by AFLP (Fig. 3a) or MLVA (Fig. 3b). In the case of the impetigo isolates, a certain degree of clonal expansion could be hypothesized on the basis of the presence of several strains in a single AFLP cluster. Furthermore, some of the AFLP clusters (Fig. 3a) show a slight overrepresentation of bacteremia isolates compared to the average number of such isolates in the other clusters, which corroborates earlier data [7]. However, it should be noted that studies using MLST data could not identify hypervirulent clones of S. aureus [2, 17]. In addition, a recent study using microarrays was unable to identify any association between lineage or gene and invasive isolates [6].

It is thought that the generation of genetic variation in hypermutable loci such as short sequence repeats proceeds with enhanced speed, even in clonal species such as S. aureus [18]. We show here that there is indeed a difference in the rate of whole genome versus repeat variation in S. aureus, with repeat-derived type assignments showing enhanced diversification (see Fig. 2a versus b). However, a clear topological overlap in the MSTs calculated on the basis of complex repeat patterns versus those calculated on the basis of whole genome polymorphism can still be observed. This suggests that even hypervariation of repeat loci remains within the framework set out by overall genome variation. This is in agreement with earlier studies showing that typing techniques using highly variable genes (e.g. adhesion genes) are at least as informative for phylogenetic reconstruction as the more slowly evolving housekeeping genes (e.g. MLST) [19–23].

In conclusion, we demonstrate here that for a clonal bacterial pathogen such as S. aureus, analysis of DNA targets with different inherent degrees of genetic variability results in overlapping molecular type assignments. This suggests that despite the enhanced variability of repeats, clusters of strains remain traceable. In other words, measuring repeat polymorphisms in clonal microorganisms provides a solid basis for genetic type assignment useful for application in epidemiological tracing.