Introduction

Primary antibody deficiencies (PADs) form the majority of primary immunodeficiencies (PIDs) and are characterized by an inability to produce a clinically effective antibody response [1, 2]. PADs represent a heterogeneous group of disorders such as common variable immunodeficiency (CVID), IgG subclass deficiency, and specific antibody deficiency (SpAD) [3]. The reported prevalence of PAD varies considerably from 1:1700 to 1:25,000, partly owing to the suspected large number of undiagnosed patients [4,5,6]. The clinical presentation encompasses a wide range of symptoms including increased susceptibility to respiratory and gastro-intestinal tract infections, auto-immunity, and an increased risk of certain malignancies [2, 6].

Owing to the heterogeneous presentation and low prevalence, diagnosis of PAD can be challenging. This is evident from the reported median delay in diagnosis of between 2 and 10 years, which has not improved substantially over the past five decades [7,8,9,10,11,12]. This diagnostic delay is associated with increased morbidity and mortality, as effective therapies are available [12,13,14]. A timely diagnosis may also result in substantial healthcare cost savings, even when taking the cost of treatment into consideration [15]. Reducing the diagnostic delay of PAD is thus of key importance [12].

To this end, we have developed an algorithm that can be used to detect patients with a high risk of PAD in a primary care setting [16]. An advantage of focusing on primary care is that most patients initially present their complaints to a general practitioner (GP), especially in countries where the GP has a gatekeeper function to secondary care. This allows detection of PAD patients in an early phase. In addition, primary care electronic health records (EHRs) encompass a comprehensive overview of the symptoms for which a patient has sought medical care. In contrast, in secondary care, usually, only the symptoms for which a patient has been referred are documented structurally. For example, if a patient is referred for suspected inflammatory bowel disease, the secondary care EHR might not include recurrent respiratory tract infections for which a patient has visited the GP. Focusing on primary care thus allows screening for a broad range of PAD symptoms at an early stage.

The algorithm is based on structured EHR data including diagnostic codes, antibiotic prescriptions, laboratory results, and the number of visits to the GP. Focusing on structured EHR data enables the application of the algorithm in an automated manner to large databases. The aim of this study was to clinically validate and optimize the algorithm by applying it to a primary care database. Patients identified by the algorithm as being at increased risk of PAD were invited for laboratory evaluation of immunoglobulin levels and referred to an immunologist if deemed clinically necessary.

Methods

Details on the algorithm have been reported previously and in Table S1 [16]. In short, the algorithm was developed using EHR data from PAD patients (University Medical Centre Utrecht), aggregated subgroup data from control groups (Julius General Practitioner Network (JHN) Utrecht), literature, and clinical expertise [17]. The algorithm encompasses 107 items within eight categories: “Antibiotic prescriptions,” “Respiratory tract infections” (RTI), “Gastro-intestinal (GI) complaints,” “Other infections,” “Auto-immune symptoms,” “Malignancies, lymphoproliferative- and other symptoms,” “Laboratory values,” and “Number of visits to the GP.”

In the current study, the algorithm was applied to a JHN data set containing 61,172 EHRs from 13 general practices. All patients in the JHN database were offered an opt-out prior to registration. EHR data were extracted for a certain period before the “censoring date” (e.g., 4 years for antibiotics; Table S1). Usually, this was the date of application of the algorithm to the database (8 February 2022). For certain uncommon diagnoses that can be both a complication of PAD and also the cause of a secondary antibody deficiency (SAD; e.g., non-Hodgkin lymphoma), the censoring date was the registration date of this ambiguous diagnosis (Table S1).

As the estimated prevalence of PAD is 1:1700–1:25,000, 2–36 PAD cases were a priori expected to be present in the data set of 61,172 patients [4,5,6]. Previous PID-screening studies selected 0.1–0.4% of their population for further analysis [18, 19]. Based on the above, expert opinion, and feasibility, we aimed to screen the 400 highest scoring EHRs to confirm eligibility. Of the remaining patients, we aimed to include at least 100 patients (0.2% of the data set) for laboratory analysis. See Fig. 1 for an overview of the study flow. We focused on patients aged 12–70 years because PAD usually presents in the second to fourth decade of life, because of restrictions regarding study participation of patients < 12 years, and because differing clinical presentations have been described for children versus adults [8, 20,21,22,23].

Fig. 1
figure 1

Overview of study flow. EHR electronic health record, GP general practitioner, ICPC International Classification of Primary Care, JHN Julius General Practitioner Network, PAD primary antibody deficiency. aPrevalence was determined based on the 10 PAD patients in this data set. bThis selection included patients with a rank number < 400 due to an error in data extraction (see the “Results” section and Table S6)

Exclusion criteria (Table 1) for which International Classification of Primary Care (ICPC) codes were available were applied by the algorithm to the population of 61,172 patients and include previously diagnosed (secondary) immunodeficiencies. Subsequently, exclusion criteria were verified manually in the 400 EHRs selected for screening. This was performed in a two-step manner owing to COVID-19 restrictions. First, pseudonymized EHRs were screened remotely from a secured server, and subsequently, remaining patients were discussed with the GP on location.

Table 1 Exclusion criteria

Patients that remained eligible after screening were invited for participation through a letter from their GP. Participation consisted of a single visit with analysis of serum immunoglobulins and calculated globulin, and the Early Warning Signs (EWS) questionnaire [24, 25]. GPs were advised to refer patients with reduced immunoglobulins for further evaluation of PAD to an immunologist or infectious disease specialist, except if it concerned solitary reduced IgG4 as this has little clinical relevance [26]. In addition, GPs were advised to refer the 10% highest scoring patients, to account for SpAD which presents without concomitant reduced immunoglobulins [27]. Lastly, GPs were advised to consult an internist in case of incidental findings of elevated immunoglobulins that were not suspect for PAD. Six months after inclusion, GPs were contacted to verify referral outcomes. PAD was classified according to the International Union of Immunological Societies criteria by a clinical immunologist [28]. This study was approved by the Medical Research Ethics Committee NedMec under protocol number NL74 944.041.20. All patients included for laboratory analysis provided written informed consent. For the TRIPOD checklist, see Table S2.

Descriptive statistics are presented as means with standard deviations, medians with interquartile ranges, or frequencies with percentages. To compare continuous data between PAD and non-PAD patients, t-tests or non-parametric Wilcoxon rank sum tests (for two groups) were performed or for two groups or more ANOVA or Kruskal-Wallis (non-parametric) tests. For categorical characteristics, chi-square tests were used, or Fisher’s exact test if small cell frequencies were expected (< 5).

In addition to the original algorithm (version 1), we explored three alternative algorithms (versions 2–4) to optimize predictive performance within the subset of high-risk patients with a confirmed PAD/non-PAD diagnosis. We performed penalized logistic regression analyses, with the presence of PAD as a dependent variable. In the original algorithm (version 1), the eight categories were weighted equally. In version 2, category weights were adjusted based on Ridge regression coefficients (λ = one standard error), with the category scores as independent variables. In version 3, the items (e.g., “pneumonia”) per category (e.g., “RTI”) were first grouped using a principal component (PC) analysis to prevent overfitting due to the large number of items compared with the number of patients. The number of PCs was based on an eigenvalue of ≥ 1. Items were grouped in the PC where they had the highest contribution, or based on clinical rationale if they were not present in this data set. Version 3 was derived by first determining the weight per item group, and subsequently the weight per category using ridge regression. In version 4, we explored the addition of new variables that were not available during algorithm development or have an ambiguous relationship to PAD, i.e., use of immunosuppressant medication in the past 4 years, presence of an ICPC code for chronic obstructive pulmonary disease or malignancy, ≥ 6 GP visits in the past 2–4 years, and the EWS score. These variables and the total score of the optimal algorithm (from versions 1–3) were combined in a Lasso regression. Algorithm 4 consisted of the retained variables. The predictive performance of all algorithms was determined with the area under the curve–receiver operating characteristics curve (AUC-ROC), sensitivity, and specificity using optimal cut-offs based on Youden’s index or 100% sensitivity. Statistics were performed in R version 4.2.0 (The R Foundation for Statistical Computing, Vienna, Austria).

Results

The algorithm was applied to 61,172 EHRs; 5284 patients were excluded based on ICPC codes (see Table 1) of which 8 had a previous PAD diagnosis. From the remaining 55,888 patients, 400 high-ranking patients were selected for EHR screening. Of these 400 patients, 104 were included for immunoglobulin assessment. From these, 16 were referred, of whom 10 were diagnosed with a PAD (Table S3). Sixteen patients were not referred despite referral advice, of which 6 were deemed valid and 10 invalid (Table S4). For example, a sufficient explanation for frequent antibiotic use was deemed a valid reason, while “referral was too much effort” was deemed invalid, as PAD cannot be excluded in this case. The valid non-referred patients (n = 6) and the patients without referral advice (n = 72) were labelled as “unlikely PAD diagnosis” (n = 78). For 6 referred patients PAD could not be confirmed nor excluded (Table S5); together with the invalid non-referred patients (n = 10), these were labelled as “inconclusive” (n = 16). The prevalence of PAD in the subpopulations selected with each step of the study is shown in Fig. 1. In the general population, the prevalence of PAD is estimated to be 1:1700–1:25,000 [4,5,6]. In the 400 patients selected for EHR screening, prevalence was estimated at 1:40, patients in whom immunoglobulin analyses were performed at ~ 1:10, and in those referred for suspected PAD at ~ 1:2. This can be translated to a number needed to screen of 40, a number needed to test of 10, and a number needed to refer of 2 to identify one patient with PAD.

Initially, we aimed to select the 400 highest scoring EHRs for screening. After termination of the study, it appeared, however, that lower ranks were also screened owing to a data-extraction error concerning antibiotic prescriptions, about which the ethical committee was informed (see Table S6 for details). The included patients were still within the highest scoring 2% of the total population of 61,172 patients (Figure S1). An unintended benefit of this occurrence is that it allowed us to study patients with a wider range of ranks. Three of the newly identified PAD patients had a rank lower than 400 (528, 657, and 791), as well as one patient with an inconclusive diagnosis (803). It thus appears that the initially estimated cut-off point of 21.5 (based on 400 highest ranks) was too strict, a finding which would have remained undetected if we had only screened the top 400 EHRs. A cut-off of ≥ 17 (corresponding to a rank of 1000) may be more suitable, as all confirmed and inconclusive PAD cases are well within this range.

The baseline characteristics are shown in Table 2; there were no statistically significant differences between groups of included patients (i.e., “PAD diagnosis,” “unlikely PAD,” and “inconclusive diagnosis”). The algorithm scores are shown in Table 3. Statistically significant differences were present for the total score and for the categories “Antibiotic prescriptions” and “RTIs,” but not for other categories. Most points were scored in the categories “Antibiotic prescriptions,” “RTIs,” and “Visits to the GP,” while points were rarely scored for “Other infections” (e.g., meningitis, osteomyelitis; Table S1) and “Auto-immune symptoms.” There were no previously registered reduced immunoglobulin levels in the EHRs of included patients, most likely because these were not requested by GPs: total IgG and IgM were determined in only one patient, and IgG subclasses in none.

Table 2 Baseline characteristics of included and screened patients
Table 3 Algorithm score, total and per category

The serum immunoglobulin results are shown in Table 4. PAD patients had significantly lower IgM, total IgG, and IgG1, IgG2, and IgG3 subclass levels than “unlikely PAD” patients. There were no statistical differences for IgA nor IgG4 subclass levels. Concerning calculated globulin, PAD patients had a significantly lower median value, but none of the patients had a value below the diagnostic cut-off of 18 g/L [25].

Table 4 Serum measurements of immunoglobulins (gram/liter)

To optimize the efficiency of the screening approach, we aimed to improve the predictive performance within the subset of high-risk patients identified by the original algorithm, i.e., the 88 confirmed PAD/unlikely PAD cases (see Table 5). When considering the total data set of 61,172 patients with 10 new PAD patients and all others assumed to be free of PAD, the original algorithm has an estimated AUC-ROC of 0.99 (95% confidence interval (CI) 0.99–0.99). However, when the original algorithm (version 1) is applied to the subset of 104 high-risk patients, the AUC-ROC is 0.58 (95% CI 0.39–0.78). In version 2 of the algorithm, category weights were adjusted as follows based on the ridge regression coefficients: “GI-complaints” was reduced to 0.5; “Antibiotic prescriptions,” “RTIs,” and “Malignancy, lymphoproliferative- and other symptoms” remained as 1; and “Auto-immune symptoms”, “GP-visits,” and “Other infections” were increased to 2. This did not improve algorithm performance; the AUC-ROC remained as 0.58. The weights per item group and per category in version 3 of the algorithm, based on the ridge regression coefficients, are shown in Table S7. This version improved the AUC-ROC to 0.80. In version 4, only the variables “algorithm score version 3” and “EWS score” were retained, but this did not improve the AUC compared with version 3 (AUC-ROC 0.80).

Table 5 Predictive performance of different versions of the algorithm

Based on these results, we suggest a two-step PAD screening approach using algorithm versions 1 and 3 (see Fig. 2). First, version 1 can be applied to a primary care database to distinguish low-risk from high-risk patients using a cut-off of ≥ 17. This cut-off is based on the lowest score of the identified PAD and inconclusive patients of 18, maintaining a safety margin of 1 point. Within this subset of high-risk patients, algorithm version 3 can be applied to improve the distinction between PAD and non-PAD patients using a cut-off of ≥ 15.5 based on Youden’s index (sensitivity 90%, specificity 68%). A cut-off of ≥ 9.5 points based on a 100% sensitivity may also be considered, although this greatly reduced the specificity to 9%. For the remaining patients, the EHR should be screened to confirm the absence of exclusion criteria. Patients satisfying inclusion/exclusion criteria can then be invited for immunoglobulin analysis by their GP. Referral to an immunologist can be advised for patients with reduced immunoglobulins with the exception of isolated reduced IgG4 as this has little clinical relevance [26]. Referral can also be advised for patients with a high algorithm version 3 score of ≥ 21.5 (based on specificity ~ 90%) to account for SpAD, as this can present without reduced immunoglobulin levels.

Fig. 2
figure 2

EHR electronic health record, PAD primary antibody deficiency, SpAD specific antibody deficiency. aReduced immunoglobulins, with the exception of isolated reduced IgG4 as this has little clinical relevance[26]. Refer patients with a high algorithm version 3 score of ≥ 21.5 (specificity~90%) to account for SpAD. bBased on extrapolation from our current study data using multiple imputation

We estimated the costs for the proposed screening approach based on our population of 61,172 patients. See Fig. 2 for the expected numbers of patients per step of the proposed screening approach. The total estimated cost for this screening approach is €52,586.68, assuming that all invited patients participate. These costs include manual EHR screening for 296 patients, immunoglobulin assessment for 149 patients, and two visits to an academic hospital including additional laboratory assessments for 46 patients (see Table S8 for details). Based on the 10 patients we identified in this study, the estimated cost per detected patient is €5258.66. Initial costs for development of a certified digital screening tool were not taken into account. Potentially, the costs for EHR screening could be reduced by automating the process using text-mining techniques.

Discussion

In the current study, we aimed to detect PAD patients using a screening algorithm in primary care. From a population of 61 172 patients, we included 104 high-risk patients for laboratory analysis, of whom finally 10 PAD patients were identified. A priori, we expected 2–36 PAD patients to be present in our total study population of 61,172, based on the prevalence of PAD in the general population [4,5,6]. As we identified 18 PAD patients in total (i.e., 10 new diagnoses and eight previous diagnoses), this is well within the range of expected PAD patients. Our original screening algorithm (version 1) performed well when distinguishing low-risk from high-risk patients, as the PAD prevalence in included patients was 1:10, compared with 1:1700–1:25,000 in the general population. Within the subset of high-risk patients, the original algorithm did not perform well (AUC-ROC 0.58), but predictive performance improved with an optimized version of the algorithm (AUC-ROC 0.80). We therefore propose a screening approach combining these two algorithms. To our knowledge, this is the first study to identify new PAD patients within a primary care adolescent/adult population using a screening algorithm.

The 10 PAD patients identified in this study had either an isolated IgG subclass deficiency or SpAD. Similar to Rider et al., we did not encounter selective IgA deficiency, which is a more prevalent but also milder type of PAD [29]. Possibly IgA-deficient patients were not classified as high risk because the majority are asymptomatic [30]. We also did not encounter more severe diagnoses such as CVID, which may explain the absence of patients with reduced calculated globulin levels [25, 31]. As CVID has a relatively low prevalence, cases may be encountered when applying the algorithm to a larger population. In addition, PAD cannot be excluded for high-risk patients who did not respond to the invitation from their GP nor for patients for whom referral advice was ignored. It is therefore possible that undetected (more severe) PAD cases are present in our included study population.

When designing the PAD screening algorithm, we were inherently limited by the available (diagnostic) codes in primary care, which tend to be more general (e.g., “Pneumonia”) rather than specific (e.g., “Mycoplasma Pneumonia”). Further limitations include that we initially intended to screen the 400 highest screening EHRs, but owing to a data-extraction error, lower ranking EHRs were also screened. An unintended benefit of this error was that it allowed for the study of patients with a wider range of algorithm scores. As three newly identified PAD patients had a rank lower than 400, we can conclude that our initially estimated cut-off point was too strict. This would have remained undetected if we had screened only the top ranking 400 patients. Lastly, as our optimized algorithm (version 3) was developed in a relatively small data set of high-risk patients, it should be validated in another population.

As for all screening methods, cost-effectiveness should be taken into account before considering implementation. Our initial rough estimate of the costs per detected PAD patient with our proposed screening approach is €5258. This is comparable to the costs per detected patient for other screening programmes in the Netherlands, such as for breast cancer (€9300), colon cancer (€5000), and cervical cancer (€8400) [32,33,34]. Of note, these screening programmes include more invasive interventions such as colonoscopy, compared with the laboratory assessment in our approach. The estimated annual cost savings for early PID diagnosis are $85,882 (≈ €81,157) for patients without immunoglobulin replacement therapy and $6500–$55,882 (≈ €6066–€52,158) for patients with this therapy [35,36,37]. When assuming that our approach would reduce the diagnostic delay of PAD by 3 years (median diagnostic delay 2–10 years [7,8,9,10,11,12]), this would imply a cost saving of €18,198–€243,471 over 3 years per detected PAD patient. This seems well proportionate to the expected cost per detected PAD patient of €5258. It is however important to note that the estimations for cost savings are based on PID patients in general rather than specifically PAD and that these studies were performed in other countries. A full cost-effectiveness analysis specific to our PAD screening approach would therefore be of interest.

The impact of participation in screening and a possible subsequent PAD diagnosis for individual patients should be taken into account. Our screening approach selects undiagnosed patients that have registered complaints associated with PAD. An early diagnosis could offer these patients an explanation for their complaints, and future complications may be prevented by starting adequate treatment. The benefits of an early diagnosis are thus likely to outweigh the burdens.

Other efforts to reduce the diagnostic delay of immunodeficiencies include the approaches developed by Rider et al. and Mayampurath et al., which could be complementary to the screening algorithm described in the current study [29, 38,39,40]. For example, these approaches focus mainly on secondary/tertiary care and use International Classification of Diseases diagnostic codes, while our approach specifically focuses on primary care and incorporates ICPC codes. Considering that diagnostic coding practices differ per country (ICPC being used in 27 countries), it is important to develop screening tools for both systems [41]. In addition, the approach by Rider et al. is based on paediatric data, while we specifically focus on PAD in an adolescent/adult primary care population aged 12–70 years. We chose to target this population as different cut-offs are to be expected in a paediatric population owing to the higher frequency of RTIs and because most PADs present between the second and fourth decade of life [8, 20,21,22,23]. X-linked agammaglobulinemia (XLA) is an exception, as this presents during the first few years of life. However, the diagnostic delay of XLA is limited (e.g., reported median of 1 year compared with 7.5 years for PAD in general [8]), and the proposed addition of XLA to newborn screening is likely a more effective diagnostic strategy for this particular PAD [42, 43].

In conclusion, in the current study, we were able to identify 10 new PAD patients from a primary care population of 61,172 patients using a PAD screening algorithm. We also present an optimized screening approach including a revised algorithm to improve predictive performance within high-risk patients. This approach may aid in the prevention of morbidity and mortality by reducing diagnostic delay of PAD and appears to be cost-effective based on a limited analysis. Future studies should address further validation of the proposed screening approach in other populations and a full cost-effectiveness analysis.