Introduction

Toxoplasma gondii is a zoonotic protozoan parasite that uses domestic cats and other felids as definitive hosts and causes clinical disease in both humans and animals [1,2,3,4]. It was recently ranked second out of 24 important foodborne parasites in Europe [5,6,7].

Globally, T. gondii has a complex population structure [8]. While populations of this parasite in many regions of the world belong to a few clonal lineages [9], those observed in South America are much more diverse [8, 10].

A frequently used genotyping technique targets microsatellite (MS) sequences [11]. MS sequences are ubiquitous and polymorphic in the genomes of virtually all organisms [12]. For T. gondii typing, usually a set of up to 15 markers located on 11 different chromosomes of the T. gondii genome is used, including eight lineage typing markers (B18, M33, TUB2, XI.1, TgM-A, W35, IV.1, and B17) and seven fingerprinting markers (N61, M48, N83, N82, N60, M102, and AA). Fingerprinting markers are more polymorphic and were shown to resolve different isolates, applicable to both archetypal (type I, II, or III) and non-archetypal lineages [11].

MS sequences need to be amplified by multiplex or singleplex PCR using primer pairs, with one of the primers per pair labeled by a fluorophore. Subsequently, amplicons are separated on a capillary sequencer, including a size standard in each run, which allows determination of the lengths of the amplified MS fragments. Usually, three different fluorophores are used, which allows examination of a larger number of amplified fragments simultaneously in a single run on the capillary sequencer [11].

Another frequently used technique to type T. gondii is PCR-restriction fragment length polymorphism (PCR-RFLP), which can resolve T. gondii genotypes, but is—in contrast to MS typing—less suitable for differentiating parasites of the same lineage. PCR-RFLP T. gondii typing involves multiplex or singleplex PCR to amplify up to 11 markers, which are distributed over eight chromosomes, and the apicoplast [13].

Multilocus MS typing is used by many laboratories around the world [9, 14,15,16,17]. It is largely unknown, however, to what extent the lineage typing and fingerprinting results obtained by different laboratories are comparable. This is a challenge because a One Health approach, e.g., combining larger data sets on T. gondii genotypes from different sectors and across countries, is needed to better understand the molecular epidemiology and transmission pathways of T. gondii.

To evaluate consistency in T. gondii MS typing, a ring trial was established among five European laboratories. Laboratories had different levels of experience with this typing technique, had slightly modified the original protocol, and used—at least in part—different laboratory equipment, reagents, and software. This ring trial led to the identification of major reasons for differences in MS typing. The results were used to establish harmonized guidelines for laboratories on implementing MS typing of T. gondii.

Materials and methods

Participating laboratories

Five European laboratories (A–E) participated. One laboratory (B) had previously established and published an MS typing method and served as the reference laboratory [11]. Another laboratory (C) had introduced the technique 4 years ago, one laboratory (E) 2 years ago, and the two remaining laboratories (A and D) very recently. Laboratory E organized sets of samples, shipment, and collection of results.

Origin of samples

The ring trial was divided into three consecutive parts. The first part was planned to assess the capacity to type the archetypal lineages of T. gondii (types I, II, and III) and to evaluate the effect of DNA concentration on the accuracy of results. The samples comprised DNA aliquots from reference strains representing the three lineages, i.e., RH [18] (type I), ME49 [19] (type II), and NED [20] (type III), in different dilutions. In addition, a sample from a type II × III recombinant strain (D200273; DNA of the T. gondii isolate TGA32090; provided by the Biological Resource Centre (BRC) Toxoplasma [http://www.toxocrb.com/]) was included (Table 1). Samples with the two highest DNA concentrations were provided only once, while the three lowest concentrations were provided two to four times in the panel.

Table 1 Composition of the sample set of the first part of the ring trial on microsatellite typing of Toxoplasma gondii

The second part was planned to assess the ability to discriminate different T. gondii type II strains using fingerprinting markers (Table 2). The DNA samples corresponded to 10 different T. gondii type II isolates, as confirmed by PCR-RFLP, and were provided in duplicate in the panel. Concentrations of DNA were adjusted so that they were similar to the 10⁻¹ dilution of the first part.

Table 2 Composition of the sample sets of the second and third parts of the ring trial on microsatellite typing of Toxoplasma gondii

Finally, the last part was established to confirm that the laboratories were able to identify non-archetypal genotypes by MS typing. The panel consisted of DNAs from non-archetypal (n = 7) and archetypal (n = 2) T. gondii strains, provided in duplicate (Table 2). Concentrations of DNA were similar to those of the 10⁻¹ dilution of the first part.

For each part, T. gondii DNAs were diluted in bovine carrier DNA with a concentration of 100 ng/μL. Two samples (first part) or one sample (second and third parts) of bovine carrier DNA alone were included as negative controls. The trial was blinded for all participants, including the organizing laboratory (E); only in the third part did the operators know about the non-archetypal nature of some of the isolates included, but they were unaware of the identity and order of the samples.

Irrespective of laboratory-specific protocols, each laboratory was asked to use 5 μL template DNA per reaction in the multiplex typing PCR.

After completing each part, interlaboratory divergences were assessed and discussed among the participants. All laboratories tried to improve their protocols and procedures for the subsequent part. The aim was to improve the individual typing results of each of the participating laboratories and to harmonize MS typing by using and extending the internal guidelines of laboratory B.

Questionnaire to assess divergence from original method

A questionnaire was distributed to collate the technical and methodological details in each laboratory. During online meetings, further details on individual protocols, such as use of particular rules to analyze sequencing profiles, were recorded.

Statistics

After receipt of the data, all results were compiled in tables for each part and studied separately (Supplementary File Table 1). For each sample, the coded number of the organizing laboratory and of the external laboratory was registered in an Excel file along with the operator, the software used, the typing results, the Ct value obtained by initial real-time PCR, the sample volume used for the reactions, the identified genetic type, and the number of typing markers identified. To compare the results among laboratories, the R software (R version 4.1.2, https://cran.r-project.org/) was used for linear regression, and the R packages “binom,” “ggpubr,” “ggplot2,” and “cowplot” were used for calculating confidence intervals and preparing graphical representations of the results.

Results

Questionnaire results

All participants used the MS typing technique as reported previously [11]. However, questionnaire data and subsequent communication during online meetings revealed a number of deviations from the original protocol, even in the laboratory where the method had been initially established (laboratory B). Interestingly, one of the laboratories (A) had replaced the fluorophore HEXFl with VICFl, another (D) had replaced NEDFl with TAMRAFl, and two laboratories (B, E) had replaced NEDFl with Atto550Fl, for three or two of the MS marker regions, respectively (Table 3).

Table 3 Sets of fluorophores used by the laboratories (A-E) participating in a ring trial on microsatellite typing of Toxoplasma gondii

Further differences were related to the types of sequencing devices, the size standards, and the software used to assess the fragment length of amplified MS regions (Table 4).

Table 4 Technical or methodological details of the microsatellite typing technique as retrieved by questionnaire and further personal communications from laboratories participating in a ring trial on microsatellite typing of Toxoplasma gondii

PCR results to quantify T. gondii DNA in samples

In the first part, linear regression analysis of the Ct values reported by each laboratory on serially diluted DNAs of reference isolates revealed R² values between 0.577 and 0.710 for individual laboratories (Fig. 1; Table 5). Comparison of the individual regression line equations revealed that the Ct values reported by the laboratories differed. Laboratories D and E reported the lowest Ct values and laboratories A and B the highest for the same given strain (Table 5). Overall, the pairwise comparisons between Ct values reported by the different laboratories revealed R² values between 0.84 and 0.94. The proportions of recognized positive samples ranged from 85.4% (laboratory D) to 100% (laboratory B) (Table 5). All laboratories reported negative results for negative control samples.
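The regression of Ct value on DNA concentration can be illustrated with a minimal Python sketch. It is not the authors' analysis, which was run in R. The Ct values are invented for illustration, the function name is hypothetical, and the concentrations assume a tenfold series starting at 1 ng/μL; only dilutions 1 to 3 are given explicitly in the text.

```python
import math


def fit_ct_regression(concentrations_ng_per_ul, ct_values):
    """Ordinary least squares of Ct on log10(concentration).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log10(c) for c in concentrations_ng_per_ul]
    y = list(ct_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical example: assumed tenfold dilution series and invented Ct values.
concentrations = [1.0, 0.1, 0.01, 0.001, 0.0001]
ct = [22.1, 25.6, 28.8, 32.5, 35.4]
slope, intercept, r2 = fit_ct_regression(concentrations, ct)
print(f"slope = {slope:.2f} Ct per log10, intercept = {intercept:.2f}, R^2 = {r2:.3f}")
```

A slope close to −3.32 Ct per tenfold dilution would correspond to a doubling of product in every cycle, so fitting each laboratory separately gives a quick view of how the laboratory-specific protocols differ in slope and intercept.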

Fig. 1

Real-time PCR results to quantify specific DNA in the samples of the first part of the ring trial on Toxoplasma gondii microsatellite typing according to the dilution of samples. A Median, 25–75% quantile (box), minimum and maximum (whiskers) of Ct values reported by all participants of the ring trial. B Median, 25–75% quantile (box), minimum and maximum (whiskers) of Ct values stratified for laboratories participating in the ring trial (Lab)

Table 5 Summarized results of the linear correlation between Ct value and DNA concentration in the samples of the first part of the ring trial on microsatellite typing for Toxoplasma gondii

Typing archetypal T. gondii and impact of DNA concentration

The identification of the genetic type of samples was based on lineage typing markers (TUB2, W35, TgM-A, B18, B17, M33, IV.1, and XI.1). The proportion of correct identifications of the canonical types I, II, and III and a type II × III recombinant decreased depending on the dilution of the samples. If a participant had added a question mark to the result or provided an ambiguous typing result, NA was recorded. There were no differences in the typing results between the two software tools used by laboratory A and the two operators from laboratory E.

At the highest DNA concentrations, the 1st and 2nd dilutions, 71% (20/28) and 75% (21/28) of the results provided by all participants were correct. At the two following concentrations (3rd and 4th dilution), 52% (29/56) and 32% (36/112) of the typing outcomes reported were correct. In contrast, analysis of the 5th dilution yielded no correct identification (0/112) (Fig. 2). Overall, not only the proportion of incorrect typing results increased with higher dilution of the T. gondii DNA, but also the proportion of undetermined types, i.e., from 18% (5/28) or 14% (4/28) for the 1st and 2nd dilution to 41% (23/56), 62% (69/112), or 88% (98/112) for the 3rd, 4th, and 5th dilutions, respectively (Fig. 2).
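The proportions reported above, together with approximate 95% confidence intervals, can be recomputed with a short Python sketch. The counts are taken from the text. The interval method is an assumption: the source names the R package "binom" but not the method used, so the Wilson score interval is shown here as one common choice, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.959963984540054):
    """Two-sided Wilson score interval for a binomial proportion (about 95% by default)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Counts as reported in the text: (correct, undetermined, total) per dilution.
counts = {
    "1st": (20, 5, 28),
    "2nd": (21, 4, 28),
    "3rd": (29, 23, 56),
    "4th": (36, 69, 112),
    "5th": (0, 98, 112),
}

for dilution, (correct, undetermined, total) in counts.items():
    low, high = wilson_interval(correct, total)
    print(
        f"{dilution} dilution: correct {correct}/{total} = {correct / total:.1%} "
        f"(95% CI {low:.1%} to {high:.1%}); "
        f"undetermined {undetermined}/{total} = {undetermined / total:.1%}"
    )
```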

Fig. 2

Relationship between typing results and DNA concentration: Proportion and 95% confidence intervals of correct (green) and false (orange) typing or typing not possible (blue) in the samples of the first part of the ring trial on Toxoplasma gondii microsatellite typing according to the dilution of samples for all laboratories and operators (i.e., type I, II, III, or II × III recombinant). Note: Number of replicates per strain DNA varied according to sample dilution (please refer to Table 1)

If only the results for dilutions 1, 2, and 3 were included, small differences relative to the results provided by the reference laboratory (B), with minimum or maximum deviations of up to 2 bp, were often recorded. In 27 cases, minimum or maximum values exceeding 2 bp were observed (Table 6). Most (63%; 17/27) deviations occurred in results reported by laboratory A. Some of the deviations were extreme and ranged up to 28 or 30 bp (laboratories A and C; Table 6).

Table 6 Results of the first part of the ring trial on microsatellite typing for Toxoplasma gondii in relation to results provided by laboratory B as a reference: Median [minimum; maximum] differences in length observed for each marker compared to the results provided by the reference laboratory (laboratory B). Median differences exceeding 1 bp are typed in bold. Minimum and maximum values exceeding 2 bp are underlined and in italics. Note: The use of Atto550 for N60 and M102 by reference laboratory B was a deviation from the original method. The analysis was restricted to sample dilutions 1, 2, and 3
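The summary used in Table 6, i.e., median [minimum; maximum] difference per marker relative to the reference laboratory with flags at 1 bp and 2 bp, could be computed along the following lines. This is a minimal Python sketch with invented fragment sizes and hypothetical function names. Applying the thresholds to absolute values is an assumption, since the source does not say how negative differences were treated.

```python
from statistics import median


def summarize_differences(lab_sizes, reference_sizes):
    """Return (median, minimum, maximum) of per-sample size differences in bp.

    Samples without a result (None) in either laboratory are skipped.
    """
    diffs = [
        lab - ref
        for lab, ref in zip(lab_sizes, reference_sizes)
        if lab is not None and ref is not None
    ]
    if not diffs:
        return None
    return median(diffs), min(diffs), max(diffs)


def flag_summary(summary, median_limit=1, range_limit=2):
    """Flag a summary in the way the table caption describes (absolute values assumed)."""
    med, low, high = summary
    return {
        "median_exceeds_1bp": abs(med) > median_limit,
        "range_exceeds_2bp": abs(low) > range_limit or abs(high) > range_limit,
    }


# Hypothetical sizes (bp) for one marker in six samples; not ring trial data.
reference = [142, 142, 140, 140, 147, 147]
laboratory = [138, 137, None, 136, 143, 142]
summary = summarize_differences(laboratory, reference)
print(summary, flag_summary(summary))
```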

Median differences in the typing results relative to reference laboratory B were not equally distributed among the laboratories. Overall, the majority (70%; 16/23) of the major differences (i.e., median differences > 1 bp) occurred in particular markers, for which participants employed primers labeled with different fluorophores compared to reference laboratory B. In lineage typing, major differences were only observed between laboratory A and the reference laboratory (100%; 4/4). In the affected marker regions, laboratory A had used VICFl instead of HEXFl to label fragments (Table 6). In fingerprinting analysis, all laboratories reported differences of > 1 bp relative to laboratory B. Of 20 differences, 12 occurred in cases with differences in labeling (Table 6). If only median differences of > 2 bp were counted, 78% (7/9) of the differences occurred in cases with differences in fluorophore labeling (NEDFl instead of Atto550Fl and vice versa, and TAMRAFl instead of NEDFl; details on primer labeling in Table 3).

Based on the results of the first part of the ring trial, it was observed that the fluorophore attached to primers for amplification of MS markers may have had an impact on the fragment sizes determined by capillary sequencing (Table 7). Thus, the literature on this topic was reviewed and differences in publications on the MS typing of RH, ME49, and NED strains were observed on 20 occasions (Table 7). In the vast majority of these cases (n = 17), the laboratory had used an alternative to the originally reported fluorophore, i.e., HEXFl was replaced with VICFl, NEDFl with Atto550Fl, or NEDFl with TAMRAFl (Table 7). Also, the reference laboratory B had recently started to replace NEDFl with Atto550Fl for the amplification of the N60Fl and M102Fl markers (Table 7).

Table 7 Results of the first part of the ring trial on microsatellite typing for Toxoplasma gondii in relation to literature data: Median differences in length observed for each marker of reference strains compared to the results provided in literature, i.e., for RH [31] or ME49 and NED [32]. Only markers are listed, for which laboratories used other fluorophores than those reported in the original reference. In the case of median differences exceeding 1 bp, entries are typed in bold. Fluorophores, different from the original description, are also indicated in bold. The analysis was restricted to sample dilutions 1, 2, and 3

Fingerprinting T. gondii type II

Divergences in fragment length determination

Fingerprinting markers (M48, M102, N60, N82, AA, N61, and N83) can be used to differentiate strains within the same lineage. In the second part of the ring trial, laboratories B and E reported identical fingerprinting results in all regions of the duplicates of 10 different strains, i.e., on 70 occasions (10 strains, seven fingerprinting regions). While laboratory A reported non-existing differences in two (3%), laboratories C and D reported non-existing differences in seven (10%) or 18 (25%) of 70 occasions, respectively (Table 8).
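The duplicate check described above amounts to comparing the two profiles of each strain marker by marker. A minimal Python sketch follows; the data layout, function name, and fragment sizes are hypothetical and only illustrate the counting. With 10 strains and seven fingerprinting markers, it yields the 70 comparisons mentioned in the text.

```python
FINGERPRINTING_MARKERS = ("M48", "M102", "N60", "N82", "AA", "N61", "N83")


def count_duplicate_discordances(duplicates):
    """Count marker-level disagreements between duplicate samples.

    duplicates maps a strain name to a pair of profiles; each profile maps a
    marker name to a fragment size in bp. Returns (discordant, compared).
    """
    discordant = 0
    compared = 0
    for profile_a, profile_b in duplicates.values():
        for marker in FINGERPRINTING_MARKERS:
            size_a = profile_a.get(marker)
            size_b = profile_b.get(marker)
            if size_a is None or size_b is None:
                continue  # marker not typed in at least one replicate
            compared += 1
            if size_a != size_b:
                discordant += 1
    return discordant, compared


# Hypothetical profile (sizes in bp); the second replicate differs at N60 only.
base = {"M48": 229, "M102": 176, "N60": 142, "N82": 111, "AA": 265, "N61": 91, "N83": 310}
example = {"strain_1": (base, dict(base, N60=144))}
print(count_duplicate_discordances(example))
```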

Table 8 Failure in identifying duplicates in samples of the second part of the ring trial on microsatellite typing for Toxoplasma gondii per laboratory

Interlaboratory divergence in fragment length determination

In the second part of the ring trial, most of the differences recorded relative to reference laboratory B did not exceed minimum or maximum deviations of 2 bp. Unlike in the first part of the ring trial, minimum or maximum values exceeding 2 bp were observed in fewer cases (n = 10; Table 9). Most (60%; 6/10) of these deviations occurred in results reported by laboratory A. However, deviations were far less extreme as compared to the first part and ranged up to 6 bp in one laboratory (laboratory E; Table 9).

Table 9 Second part of the ring trial on microsatellite typing for Toxoplasma gondii: Median [minimum; maximum] differences in length observed for each marker compared to the reference laboratory. Median differences exceeding 1 bp are typed in bold. Minimum and maximum values exceeding 2 bp are underlined and in italics. Note: The use of Atto550Fl for N60 and M102 by reference laboratory B was a deviation from the original method

Typing non-archetypal T. gondii strains

Divergence in fragment length determination in duplicated samples

In the third part, not only fingerprinting markers but also typing markers varied between the isolates. Since the samples had been provided in duplicate, it was possible to assess the extent to which duplicates were correctly recognized. Compared to the second part of the ring trial, the ability to recognize samples with identical profiles increased for all laboratories except laboratory C (i.e., n = 7 in second part but n = 10 in third part). The results for one marker (IV.1) were not available for analysis in the case of one isolate (FRENCHGUIANA15) in laboratories A, C, and E, because they failed to amplify this marker (Table 10).

Table 10 Failure in identifying duplicates in samples of the third part of the ring trial on microsatellite typing for Toxoplasma gondii per laboratory

Interlaboratory divergence in fragment length determination

In the third part, most of the differences recorded relative to reference laboratory B did not exceed minimum or maximum deviations of 2 bp. Compared to the other parts of the ring trial, minimum or maximum values exceeding 2 bp were observed in a small number of cases (n = 7; Table 11). The majority (71%; 5/7) of such deviations occurred in the results of laboratories that had chosen fluorophores that differed from those used by the reference laboratory (Table 11).

Table 11 Third part of the ring trial on microsatellite typing for Toxoplasma gondii: Median [minimum; maximum] differences in length observed for each marker compared to the results reported by the reference laboratory. Median differences exceeding 1 bp are typed in bold. Minimum and maximum values exceeding 2 bp are underlined and in italics. Note: The use of Atto550Fl for N60Fl and M102Fl by reference laboratory B was a deviation from the original method

All laboratories correctly identified archetypal T. gondii type III or type II variants, although laboratories C and D did not report the variation in this isolate (Table 12). All laboratories, except laboratory C, recognized all non-archetypal strains as such. Laboratory C misclassified Africa 1 as type I and Caribbean 1, 2, and 3 as type III, and provided the result “Unclassified” for two Amazonian isolates. The remaining laboratories correctly identified Caribbean 1, 2, and 3, determined the Amazonian isolate as Unclassified or as Amazonian, and the type III-like isolate as Unclassified (laboratory D), type III variant (laboratories B and E), or South American 4-like (laboratory A).

Table 12 Typing results reported by the laboratories (A–E) compared to results reported in the literature: The set provided in the third part of the ring trial on microsatellite typing for Toxoplasma gondii comprised two strains with an archetypal and seven strains with a non-archetypal genotype

Effects due to use of different fluorophore labeling and different suppliers for primers

To confirm that the different fluorophores used caused differences in the apparent sizes of amplified PCR products, comparative experiments were performed using the DNAs from RH, ME49, and NED reference strains and the primer pairs corresponding to the marker regions N60, M102, and AA provided by the different laboratories, labeled with NEDFl, TAMRAFl or Atto550Fl. Capillary sequencing as well as the assessment of profiles was done in laboratory E. The results obtained in laboratory E using reagents provided by reference laboratory B were identical with those previously obtained by laboratory B.

Different fluorophore labeling

NEDFl-labeled N60 fragments were 4 to 5 bp shorter and M102 fragments 2 bp shorter as compared to the Atto550Fl-labeled reference (Table 13). TAMRAFl-labeled N60 was 2 bp shorter compared to the Atto550Fl-labeled reference, while the M102 fragments had the same length.

Table 13 Effect of different fluorophores and primer suppliers on microsatellite fragment sizes: Differences in microsatellite (MS) typing for Toxoplasma gondii between Atto550Fl- and NEDFl-labeled MS fragments N60, M102, and AA for reference T. gondii strains RH, ME49, and NED using reagents provided to laboratory E by laboratories participating in the ring trial on T. gondii microsatellite typing

Different suppliers for primers

The Atto550Fl-labeled N60 fragments were identical to the reference Atto550Fl-labeled fragments, if primer pairs supplied by company EU were used by laboratory E. In contrast, if Atto550Fl-labeled primer pairs bought from company ME were used by laboratory E, fragments were 2 bp longer than the reference Atto550Fl-labeled fragments (Table 13). In the case of the AA marker, TAMRAFl and Atto550Fl (EU) fragments were 2 bp longer and Atto550Fl (ME) fragments were 4 bp longer than reference NEDFl fragments.
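Comparative measurements of this kind can be turned into a marker-specific numerical adjustment. The following Python sketch shows one possible approach and is not the procedure used in the ring trial: the offset per marker is estimated as the median difference between a laboratory's own results for the reference strains and the expected sizes, and is then subtracted from the sizes measured in unknown samples. All sizes and function names are hypothetical.

```python
from statistics import median


def estimate_offsets(own_reference_sizes, expected_reference_sizes):
    """Per-marker offset in bp: median of (own size - expected size) over reference strains.

    Both arguments map marker -> {strain: size in bp}. With an even number of
    strains the median may be a half-integer.
    """
    offsets = {}
    for marker, own_by_strain in own_reference_sizes.items():
        expected_by_strain = expected_reference_sizes.get(marker, {})
        diffs = [
            own_by_strain[strain] - expected_by_strain[strain]
            for strain in own_by_strain
            if strain in expected_by_strain
        ]
        if diffs:
            offsets[marker] = median(diffs)
    return offsets


def adjust_profile(profile, offsets):
    """Subtract the marker-specific offset; markers without an offset stay unchanged."""
    return {marker: size - offsets.get(marker, 0) for marker, size in profile.items()}


# Hypothetical sizes (bp) for the three reference strains; not values from Table 13.
expected = {
    "N60": {"RH": 150, "ME49": 148, "NED": 152},
    "M102": {"RH": 170, "ME49": 176, "NED": 190},
}
own = {
    "N60": {"RH": 154, "ME49": 153, "NED": 156},
    "M102": {"RH": 172, "ME49": 178, "NED": 192},
}
offsets = estimate_offsets(own, expected)
print(offsets)
print(adjust_profile({"N60": 158, "M102": 180, "AA": 265}, offsets))
```

Because the shift may depend on fragment size as well as on the fluorophore and the primer supplier, a single offset per marker is a simplification that would need to be checked against the reference DNAs in each laboratory.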

Discussion

Typing of T. gondii strains is important to study the global population structure of the parasite. Genomic diversity of T. gondii may influence the epidemiology of the parasite, affecting, for example, definitive host and intermediate host adaptation [23, 33, 34]. In addition, some T. gondii genotypes are reported to have a higher virulence for particular hosts than other genotypes [35, 36]. Such differences in virulence may exist between different host species, but also at the intra-host-species level [37, 38].

Multilocus MS typing was established more than one decade ago [11] and has proven to be suitable to discriminate T. gondii strains at the level of lineages globally [8, 15, 39] as well as on the intra-lineage level [17, 40]. Essentially, laboratories currently use this technique to study strains and clinical samples from different geographical areas. As a consequence, differences in typing results between laboratories could introduce bias in population genetic studies comparing MS genotypes from different geographical locations. This is also true within the same geographical region such as in Europe with several laboratories using MS genotyping [9].

Our study revealed numerous differences in MS typing protocols, although all participants of the ring trial, including the reference laboratory, referred to the original description of the MS typing methodology [11]. Laboratories used different real-time PCR procedures to quantify T. gondii DNA prior to typing, different fluorophores (Table 3), capillary sequencers, size standards, and different software tools to assess fragment length of amplified marker regions (Table 4). Differences in allele identification, which was supported by various software tools, were noted. Only particular software, not available to all participants, made it possible to ease and automatize allele identification (Table 4), and some laboratories were not able to automatize allele identification in the software they used. Some of the applied tools (i.e., GeneMapper and Geneious Prime) allowed the definition of loci and bins to ease allele identification. The inclusion of characterized reference DNAs can help to define loci and subsequently the respective bins. Furthermore, participating laboratories had different levels of experience in T. gondii MS typing, which ranged from several years to a few weeks.

The first part of this ring trial focused on lineage typing and the effect of T. gondii DNA concentrations on typing. As usually done for field samples, each laboratory tried to quantify T. gondii DNAs in the samples. Each laboratory used a different real-time PCR protocol (Table 4); however, overall high correlation coefficients between Ct values and DNA content in samples were observed. Nevertheless, Ct values differed by more than 3 Ct units in some cases (Table 5). This must be kept in mind for the interpretation of results reported in the literature. Even so, laboratory-specific Ct values provide valuable information for optimizing DNA concentrations of samples (e.g., field samples) subsequently used for MS typing.

The results revealed that from dilution 3 (0.01 ng/μL T. gondii DNA) onwards, the proportion of samples for which laboratories were no longer able to determine the lineage type increased. However, the proportion of incorrect lineage typing results reported did not increase with decreasing DNA content. The results were consistent among the participating laboratories and relative to the results of the reference laboratory only up to dilution 2 (0.1 ng/μL T. gondii DNA). Thus, it seems to be important to estimate the level of T. gondii DNA concentration in samples prior to MS typing, and to use this information to select those samples for which lineage typing (and subtyping) is most likely possible, or to optimize DNA concentration for typing (Fig. 3).

Fig. 3

Possible and observed effects on the Toxoplasma gondii microsatellite marker fragment size determination: Steps affected in the microsatellite typing workflow and recommendations for optimization

It should be noted that not only a limited concentration of DNA may negatively influence the accuracy of determining the correct fragment size, but also an excess of T. gondii DNA may cause problems. In the first part of this ring trial, it was noted that the proportion of correct lineage typing was lower in dilution 1 (1 ng/μL T. gondii DNA) than in dilution 2 (0.1 ng/μL T. gondii DNA) samples. It has been noted previously that an overrepresentation of target DNA may cause so-called minus-A peaks during capillary electrophoresis [41]. Minus-A peaks can occur if a number of amplified fragments lack a terminal adenine at the 3′ end, which is usually added by many DNA polymerases without the use of the template. We studied the occurrence of minus-A peaks for the markers M33 and M102. Both markers showed double peaks, where the intensity of the first peak (assumed to be a minus-A peak as detailed in the typing guidelines, provided as Supplementary File Text 1) increased with increasing DNA concentrations, while the intensity of the second peak (assumed to be the correct peak) decreased. This can cause incorrect results if the operator or the software normally chooses the highest peak as the correct one.
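The pitfall of always choosing the highest peak can be illustrated with a small Python sketch of an allele-calling rule that takes minus-A peaks into account. The rule, its thresholds (`min_ratio`, `tolerance`), and the peak values are assumptions made for illustration; they are not taken from the typing guidelines in Supplementary File Text 1 or from any analysis software.

```python
def call_allele(peaks, min_ratio=0.2, tolerance=0.3):
    """Call a fragment size from (size_bp, height) peaks.

    Starts from the tallest peak. If a companion peak lies about 1 bp above it
    and reaches at least min_ratio of its height, the tallest peak is treated
    as a putative minus-A peak and the longer fragment is called instead.
    """
    if not peaks:
        return None
    tallest_size, tallest_height = max(peaks, key=lambda peak: peak[1])
    companions = [
        (size, height)
        for size, height in peaks
        if abs((size - tallest_size) - 1.0) <= tolerance
        and height >= min_ratio * tallest_height
    ]
    if companions:
        return max(companions, key=lambda peak: peak[1])[0]
    return tallest_size


# Hypothetical double peak at high DNA concentration: the first peak is taller.
print(call_allele([(168.1, 5200), (169.0, 3100)]))
# Hypothetical profile at lower concentration: the second peak dominates.
print(call_allele([(168.1, 800), (169.0, 4000)]))
```

In both hypothetical profiles the rule calls the longer fragment, whereas a rule based on peak height alone would call different sizes for the same allele depending on the DNA concentration.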

Major differences, mainly affecting fingerprinting markers, were observed in results reported in all parts of the ring trial, especially among those laboratories that used different fluorophores for labeling forward primers (Tables 6, 7, 9, and 11). Comparative experiments performed exclusively in laboratory E, but using primers from the other participating laboratories, confirmed these observations (Table 13). Effects on apparent fragment sizes in capillary sequencing due to differences in fluorophore labeling, especially for fluorescein and rhodamine dyes, have been reported previously [42]. However, these effects and their root causes have remained largely understudied. The fluorophores used in our study, Atto550Fl and TAMRAFl, are rhodamine dyes, while NEDFl belongs to the fluorescein dyes. The previous study reported that TAMRAFl-labeled fragments tended to be larger than NEDFl-labeled fragments and that this effect depended on the fragment size, i.e., the smaller the fragment, the stronger the retardation in capillary electrophoresis of TAMRAFl relative to NEDFl-labeled fragments [42]. Results of comparative experiments with various reagents in laboratory E (Table 13) were mainly in accord with this observation for markers labeled with NEDFl in the original protocol. The strongest effects of a 4–5-bp retardation in Atto550Fl relative to NEDFl-labeled fragments were observed in the smallest fragment N60 (149–151 bp) and a 2-bp retardation in the larger fragments M102 (168–192 bp) and AA (265–267 bp). TAMRAFl-labeled fragments also appeared to be 2 bp larger relative to NEDFl-labeled fragments, but size-dependent differences could not be determined.

In contrast to TAMRAFl and Atto550Fl, VICFl (used in laboratory A instead of HEXFl) belongs to the fluorescein-like dyes, similar to HEXFl, so no effects on fragment size were expected. This was confirmed in our analysis.

It should be mentioned here that an additional retardation of 2 bp was noted when Atto550Fl-labeled primer pairs, used to amplify N60, M102, or AA, had been purchased from the company ME and not from the companies AB or EU (Table 13). The reasons for the differences related to the primer supplier remain unknown. A potential error in the order of the primers was excluded and it should be noted that all primer pairs with different sequences from this supplier were affected. Most likely, the differences seem to be linked to primer production. Differences in the chemical reactions applied to label primers with fluorophores may be possible.

Thus, in general it seems to be important to validate new reagents by using defined reference DNAs, ideally included in each run of capillary sequencing (Fig. 3). In addition, comparative experiments with defined reference DNAs should become mandatory, if the method is newly established in a laboratory or even if previously used primers are replaced by new ones (Fig. 3). In our view, it is unlikely that different PCR kits or enzymes contribute to differences in the amplified fragments, but this was not assessed in our ring trial because all participants used the same multiplex PCR kit.

Results for MS typing were discussed among the participants in web-based meetings. Overall, an improvement of typing results relative to those of the reference laboratory was observed between the first and second parts of the ring trial, probably because participants gained experience and were given access to a laboratory internal guideline established in reference laboratory B. While one of the laboratories with little previous experience (laboratory A) obtained results that differed in determined fragment sizes relative to results provided by the reference laboratory by up to 28 bp (Table 6), including only dilution 1 and 2 results, this was no longer the case in further parts of the ring trial. In the second and third parts, deviations of a maximum of 5 bp were observed (Tables 9 and 11). This clearly shows the need for guidance, if T. gondii MS typing is newly established in a laboratory (Fig. 3).

So-called stutter peaks are frequently reported in MS typing (examples are displayed in the typing guidelines, provided as Supplementary File Text 1) and are caused by slippage of the DNA polymerase. They occur more often when the number of MS repeats is >20, and less frequently if the repeat number is <10 [41]. Specific guidelines (laboratory-specific guidelines similar to the guidelines provided in Supplementary File Text 1) may provide help to identify the correct fragment size (Fig. 3). However, stutter peaks remain a problem in MS typing, because it may not always be possible to determine the true variation in repeat numbers in the original DNA.

In the final part of the ring trial, DNA from strains not belonging to the archetypal lineages types I, II, and III was analyzed in the participating laboratories. Although the differences in the results were minor, especially for the typing markers (Table 12), none of the exotic strains was correctly classified by the semi-automatic system in place in one of the laboratories (laboratory C) (Table 2), due to limited references in the system version. Based on the results of laboratory C, correct typing would have been possible if additional non-archetypal references had been added to the system. This highlights the challenges of automatization for an organism with substantial genetic variation. One of the isolates was classified by some of the laboratories as type II, and by others as type II variant or type II-like, which shows that clear rules or guidelines are also necessary for MS-based classification of lineages and for the nomenclature of genotypes (Fig. 3).

In conclusion, the results of this interlaboratory ring trial suggest that harmonization of MS typing appears to be possible, which might allow the combination of larger data sets on T. gondii genotypes. This is an important prerequisite to study and unravel the molecular epidemiology of this parasite. The use of different fluorophores to label fragments during amplification was identified as a major source of divergence. After numerical adjustments of fragment size results, based on comparative analyses using defined reference DNAs, differences due to the use of other fluorophores no longer presented a problem and results were comparable to those previously reported in the literature. In addition, minor differences of 2 bp could be attributed to different primer suppliers. Further minor differences probably resulted from limited experience, less suitable software for assessing capillary electrophoresis profiles, and missing software options to automatize allele identification. These observations are not only important for typing T. gondii, but may also be relevant for other applications of MS typing (e.g., forensic identification and relatedness testing, cell line identification, or population studies).