This study is an attempt to determine whether there is any significant variability between and within the diagnostic assays currently in use at clinical laboratories in Northern Europe, using a large and well-characterized panel of serum samples from patients and controls. The results show high intra-assay correlation between laboratories using the same diagnostic assay (especially for IgG) and lower correlation between laboratories using different diagnostic assays. Both the intra- and inter-assay comparisons showed more specific results with high concordance for IgG, and lower for IgM. Interestingly, we found an increased seroprevalence among blood donors compared to previous studies from the same geographical areas.
The IgM results showed more heterogeneity than the IgG results, not only between the different diagnostic assays, but also between laboratories using the same diagnostic assay. This may suggest reproducibility problems in the IgM assays, but is more likely a result of the different cut-off values used in the different diagnostic assays, or of the fact that some laboratories [1, 7, 8, 9, 10] had numerous samples (> 20) with borderline results, which in this study were classified as positive in the statistical analyses. The low specificity of the IgM assays was expected, since it is well known that IgM antibodies are less mature and less specific than IgG antibodies, and false-positive IgM reactions due to cross-reactivity are difficult to overcome. In everyday clinical microbiology practice, IgM interpretation may indeed be challenging and should be performed cautiously.
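To illustrate why the handling of borderline results matters, the following minimal Python sketch computes sensitivity and specificity under both conventions (borderline counted as positive, as in this study, or as negative). All counts are hypothetical and chosen for illustration only; they are not data from this study, and the names `Counts`, `sensitivity` and `specificity` are our own.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Number of samples per assay result category (hypothetical values)."""
    positive: int
    borderline: int
    negative: int

    @property
    def total(self) -> int:
        return self.positive + self.borderline + self.negative


def sensitivity(cases: Counts, borderline_as_positive: bool) -> float:
    """Fraction of true cases that the assay reports as positive."""
    hits = cases.positive + (cases.borderline if borderline_as_positive else 0)
    return hits / cases.total


def specificity(controls: Counts, borderline_as_positive: bool) -> float:
    """Fraction of controls that the assay reports as negative."""
    rejected = controls.negative + (0 if borderline_as_positive else controls.borderline)
    return rejected / controls.total


# Hypothetical IgM results for one laboratory; NOT taken from the study.
cases = Counts(positive=40, borderline=10, negative=50)
controls = Counts(positive=5, borderline=20, negative=175)

for as_positive in (True, False):
    label = "positive" if as_positive else "negative"
    print(
        f"borderline counted as {label}: "
        f"sensitivity={sensitivity(cases, as_positive):.3f}, "
        f"specificity={specificity(controls, as_positive):.3f}"
    )
```

With these illustrative counts, counting borderline results as positive raises the sensitivity but lowers the specificity, which is the direction of the effect suggested above for laboratories with many borderline IgM results.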
The IgG assays showed high concordance and more homogeneous results both between and within assays. The EuroImmun IgG assay showed a slightly lower sensitivity with no major gain in specificity compared to the other assays. This is in line with previous studies showing a higher sensitivity and specificity for assays based on recombinant antigens compared to whole cell lysate. However, a previous study showed the opposite, with superior sensitivity and negative predictive value in combination with low specificity and positive predictive value. Both this study and the one by Kodym et al. 2018 include a low number of laboratories, which makes it hard to draw any firm conclusions. However, Kodym et al. suggest that the EuroImmun IgM/IgG assay may serve as a screening test to be used together with a confirming immunoblot. Overall, the serological methods showed high concordance and comparable sensitivity and specificity regarding IgG both within and between assays, while the IgM assays showed more heterogeneous and less sensitive results. This implies that if laboratories were to analyse only Borrelia-specific IgG in serum, patients and clinicians would receive more or less the same test result irrespective of which laboratory performed the analysis. However, IgM results differ considerably more between laboratories and methods, and our data suggest that IgM testing in serum adds little diagnostic value to IgG testing in suspected LB cases, since the sensitivity for IgM is lower (with the possible exception of the Enzygnost Borrelia IgM assay) and its use results in loss of specificity. Also, the positive rate of IgM in sera from patients with other diseases is higher than among blood donors, illustrating the well-known risk of false-positive reactivities. Taken together, IgM testing in serum samples is a diagnostic tool that is difficult to handle correctly, and its value in clinical diagnostics of LB may be questioned.
It is important to keep in mind, though, that in this study, we have not included CSF samples or samples from children, and the clinical value of Borrelia-specific intrathecal IgM index or IgM testing in pediatric sera cannot be assessed here.
Commercial assays are marketed using different antigens or combinations of antigens. Comparisons of diagnostic assays based on different antigens will show lower analytical correlation. This is consistent with biology, since the reactivities to different antigens or antigen combinations can be expected to be statistically conditionally independent. Antibodies develop differently in different individuals and different assays detect different antibodies, which may result in both strong and weak correlation, as when no antibody response is measured in one assay and high reactivity is measured in another. Reports on external quality assurance (round robins, in which a smaller number of samples are tested in many laboratories) have likewise shown some variation between laboratories. The Enzygnost IgG assay uses a mix of whole cell detergent extract and recombinant VlsE from the three main B. burgdorferi s.l. species pathogenic to humans, whereas the Liaison IgG assay, according to the kit insert, is based solely on recombinant VlsE from B. garinii (PBi). If this is correct, the sensitivity of the Liaison IgG assay may be lower in samples from Northern Europe, where B. afzelii is the most prevalent infecting genospecies. However, in a previous study evaluating a recombinant Borrelia line immunoblot assay, the recombinant VlsE of B. garinii (PBi) displayed the highest sensitivity for both IgM and IgG detection. The study also showed that the most sensitive antigen for IgG in all LB stages, especially in early manifestations like EM and acute LNB, is VlsE followed by DbpA and p58, while VlsE of B. afzelii (PKo) reacted poorly with samples from patients with ACA and LA (late manifestations). The poor reactivity for LA might be explained by the rare observation of B. burgdorferi sensu stricto (s.s.) in ticks and patients from Northern Europe.
However, cross-reactivity of VlsE between different species may occur, and different species are more likely to cause certain clinical signs and symptoms; e.g., B. garinii has been associated with more distinct symptoms and more pronounced intrathecal inflammation in LNB, while B. afzelii, in Europe, is often associated with skin manifestations like EM and ACA [23, 24]. In Europe, there are at least five different species that are known to be pathogenic to humans. A previous study has shown a higher specificity for the Enzygnost assay in both IgM and IgG compared to the Liaison assay, which is in line with our findings, especially for IgM, indicating that recombinant VlsE antigens obtained from all three B. burgdorferi genospecies pathogenic to humans improve the diagnostic sensitivity for LB with sustained specificity. Most of the diagnostic assays in this study include VlsE as antigen. VlsE epitopes provoke an early antibody response, which is not detectable in ELISAs prepared from whole-cell sonicates of cultured B. burgdorferi bacteria, since the VlsE antigen is not expressed by the bacteria in vitro. The present study shows that, except for the Enzygnost IgM assay, there is no gain in sensitivity from analyzing the samples for both IgM and IgG if VlsE is used as antigen in the IgG test. The use of IgM may instead result in specificity problems. However, if excluding IgM testing is considered, the serodiagnostic IgG assay should include either shared antigens or antigens from the different pathogenic species.
The Recomwell IgG assay follows the principle of using a panel of several recombinant antigens in the same ELISA (p100, OspC, VlsE, p18), with the aim of increasing the sensitivity. However, in this study, this assay showed a lower specificity without noticeable gain in sensitivity, and the Recomwell IgG assay has low screening value (AUC < 0.50) in consecutive patients with other diseases. The low specificity of the Recomwell IgG assay implies that, if it is used, it would be advisable to add a second confirmatory assay such as another ELISA or an immunoblot.
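The expected effect of adding a confirmatory assay can be sketched with the standard formulas for serial testing, in which a sample is reported as positive only if both the screening and the confirmatory assay are positive. The sketch below assumes that the two assays are conditionally independent given the true disease status, which is a simplification that may not hold for assays sharing antigens. The function name `serial_two_tier` and all performance figures are hypothetical and are not estimates from this study.

```python
def serial_two_tier(
    screen_sens: float,
    screen_spec: float,
    confirm_sens: float,
    confirm_spec: float,
) -> tuple[float, float]:
    """Combined (sensitivity, specificity) of a screen-then-confirm strategy.

    A sample counts as positive only if BOTH assays are positive.
    Assumes conditional independence of the two assays given disease status.
    """
    # A true case is detected only if both assays detect it.
    combined_sens = screen_sens * confirm_sens
    # A control is falsely positive only if both assays are falsely positive.
    combined_spec = 1.0 - (1.0 - screen_spec) * (1.0 - confirm_spec)
    return combined_sens, combined_spec


# Hypothetical performance figures for illustration only.
sens, spec = serial_two_tier(
    screen_sens=0.90, screen_spec=0.80,
    confirm_sens=0.95, confirm_spec=0.95,
)
print(f"combined sensitivity={sens:.3f}, combined specificity={spec:.3f}")
```

Under these assumptions, confirmation raises the specificity above that of either assay alone at the cost of some sensitivity, which is the trade-off to weigh when a low-specificity screening ELISA is combined with an immunoblot.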
When using blood donors as controls, awareness of the seropositivity rate in the local population is crucial, as this may be used as a pointer in clinical interpretation of the results, especially in patients with typical symptoms. However, it is of less importance if the specificity of the assay is high. The seroprevalence in blood donors in this study is in agreement with previous studies done in Kalmar [11, 28], a region close to Jönköping County, indicating an increase in seroprevalence in the healthy population over the years. It is known that a high seroprevalence for both IgM and IgG can be found in a healthy population in Borrelia endemic areas, which is in line with the results in this study.
This study included two control groups, blood donors and patients with other diseases. The results show that healthy blood donors consistently yield higher specificity than controls with other diseases. The high seropositivity among patients with other diseases arises because patients under investigation for symptoms that could be attributed to a tick-borne infection were referred to the specialized centers for further investigation (e.g., a lumbar puncture) partly due to their seropositivity. The presence of antibodies in this case does not prove the occurrence of an active infection or disease, since antibodies, especially IgG, may persist for at least 10–20 years. However, it cannot be excluded that some of the positive results reflect on-going LB, but in the referral center, seropositivity in serum is of little diagnostic value. In a systematic review by Leeflang et al., it is recommended that “Future diagnostic accuracy studies should be prospectively planned cross-sectional studies, done in settings where the test will be used in practice”. This study follows these recommendations using patients referred for suspected LB, later classified as patients with other diseases. Thus, a future prospectively planned cross-sectional study should collect samples in the flow of patients at the time of the first suspicion of LB, not after referral of the patient. This is, however, hardly feasible to carry out for practical reasons, as recruitment should be done in primary care involving a large number of clinics.
We are aware that exclusion of the patient group with suspected LB (n = 24) from the statistical analyses may have changed the estimated test performance. However, we could not determine whether these patients should be included in the patient group or the control group. Another drawback of the study population is that it did not include children. Examining the diagnostic performance of the assays in paediatric patients would have been of interest, since the seroprevalence may differ from that in adults, and children often present with neurological symptoms early in the course of LNB, when antibody production is low and hard to detect and the laboratory diagnosis therefore remains uncertain [15, 30].