Background

Clinical registries and observational cohorts are essential for studying disease course, treatment effect, and safety in real-world patients. Studying rare exposures and outcomes requires very large study populations, which can be achieved through collaborative research across countries. Many countries have established clinical rheumatology registries [1,2,3,4,5,6,7,8,9,10,11,12,13]; however, differences in their design, data availability, and completeness pose a challenge when researchers pool data from multiple countries [14, 15].

In rheumatoid arthritis (RA), two surveys conducted among 25 European clinical cohorts and registries, and 14 biological disease-modifying anti-rheumatic drugs (bDMARD) registries under the European Alliance of Associations for Rheumatology (EULAR), suggested that existing heterogeneity in the data collection represents a limitation for data merging and collaborative research. As an example, the registries used diverse methods and instruments for measuring patient-reported outcomes, hampering direct comparability and interpretation [16,17,18,19].

The EuroSpA Research Collaboration Network (RCN) is a scientific collaboration among European clinical registries, collecting information on patients with spondyloarthritis (SpA), including axial SpA (axSpA) and psoriatic arthritis (PsA). The individual registries collect a broad range of clinical data relevant for the everyday management of patients with SpA (www.eurospa.eu). However, specific knowledge about the commonalities and differences in data collection across the 16 participating registries is limited. The experience from RA clinical registries [16, 17] highlighted the need for a similar cross-country exploration of data collection practices in SpA to gain a better understanding of the data used in pooled analyses. Ultimately, such knowledge may guide the design and interpretation of future collaborative studies. Furthermore, as recently suggested in the European Medicines Agency Patients Registries Initiative [20], collaborative research would benefit from a defined set of commonly collected variables with high data availability.

The objective of this study was therefore to explore the design of European registries collecting information on axSpA and PsA, including the commonalities and differences in (1) the set-up, clinical data collection, and funding; (2) data availability and completeness; and (3) the wording, recall period, and scale of patient-reported outcome measures (PROMs).

Methods

The study consisted of three parts: (1) an online survey designed to capture aspects of registry set-up and clinical data collection, (2) data availability and completeness analysis performed on real-world data collected through EuroSpA, and (3) investigation of the wording, recall period and scale used for selected patient-reported outcome measures (PROMs).

Online survey regarding registry design

The survey data were collected and managed using the Research Electronic Data Capture (REDCap) tool, a secure, web-based software platform designed to support data capture for research studies [21, 22]. The survey covered the following 12 themes: general registry information (e.g., set-up, infrastructure for data collection, funding), data management, demographics, diagnosis, disease characteristics, medication, safety, PROMs, lifestyle, laboratory measures, imaging, and comorbidities. The number of individual questions in each theme varied from 9 (safety) to 56 (general registry information); the full survey is included as Supplementary material. Each registry assigned 1–3 persons with thorough knowledge of the registry, hereafter called “registry experts,” to complete the survey. Two investigators (LL, LØ) then reviewed the responses for inconsistencies and missingness. Next, the same two investigators conducted a one-hour semi-structured interview via video link to supplement and validate the survey responses. A common interview guide was shared with the registry experts ahead of the interview (see Supplementary material).

Patient data availability and completeness assessment of uploaded datasets

Considering the themes explored in the online survey, data availability across registries and data completeness across variables were investigated. A variable was considered available if it was collected in the registry; data completeness was reported for each available variable. We used patient data that had been prospectively collected in the registries and uploaded onto a secure server by the individual registries for secondary use in the EuroSpA collaboration. Data were pseudonymized, i.e., personal identifiers had been removed and replaced with placeholder values prior to upload. Previous EuroSpA studies have been based on data uploaded in a similar manner [23, 24]. For the current study, we included data on patients with a clinical diagnosis of axSpA or PsA, aged 18 years or older, and followed in one of the participating registries from the start of their first course of biological (b) DMARD or targeted synthetic (ts) DMARD therapy between 2000 and 2021. Data from the baseline visit of the first b/tsDMARD treatment course were used for this study. A baseline visit was defined as a visit from 4 weeks before to 4 weeks after the treatment initiation date, with priority given to the closest visit before treatment start. Baseline visit data included age, time since diagnosis, clinical disease characteristics, medication, PROMs, and inflammatory markers. Other variables, e.g., HLA-B27, lifestyle, comorbidities, and classification criteria, were considered patient-specific and were included independently of the baseline visit, if available in the registry. The availability of variables not accessible for evaluation in the uploaded data was instead based on the survey responses provided by the registry experts.
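The baseline visit rule and the completeness measure described above can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names, the record layout (one dictionary per patient), and the coding of missing values as None are hypothetical. The sketch also assumes that the 4-week window is inclusive at both ends, that a visit on the treatment initiation date counts as a visit before treatment start, and that the closest visit after treatment start is used when no earlier visit falls within the window.

```python
from datetime import date, timedelta
from typing import Optional, Sequence

WINDOW = timedelta(weeks=4)  # 4 weeks on either side of treatment initiation


def select_baseline_visit(visit_dates: Sequence[date],
                          treatment_start: date) -> Optional[date]:
    """Return the baseline visit date, or None if no visit lies in the window."""
    # Keep visits within 4 weeks of treatment start (bounds assumed inclusive).
    in_window = [d for d in visit_dates if abs(d - treatment_start) <= WINDOW]
    if not in_window:
        return None
    # Priority: the closest visit on or before the treatment start date.
    before = [d for d in in_window if d <= treatment_start]
    if before:
        return max(before)
    # Assumed fallback: the closest visit after treatment start.
    return min(in_window)


def completeness(records: Sequence[dict], variable: str) -> Optional[float]:
    """Percentage of records with a non-missing value for `variable`.

    Returns None when the variable is not collected at all (not available).
    Missing values are assumed to be coded as None.
    """
    if not any(variable in record for record in records):
        return None
    filled = sum(1 for record in records if record.get(variable) is not None)
    return 100.0 * filled / len(records)


if __name__ == "__main__":
    start = date(2020, 3, 1)
    visits = [date(2020, 2, 10), date(2020, 2, 25), date(2020, 3, 3)]
    print(select_baseline_visit(visits, start))
    print(completeness([{"crp": 5.0}, {"crp": None}], "crp"))
```

Separating "not available" (None) from "available but incomplete" (a percentage) mirrors the distinction drawn in the text between data availability and data completeness.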

Wording, recall period, and scale used for selected patient-reported outcome measures (PROMs)

In the online survey, the registry experts reported the specific wording (translated into English when necessary), recall period, and scale (numeric rating scale, NRS, or visual analog scale, VAS) used in the patient global, pain, and fatigue assessments. Further details were explored during the follow-up interview, and the reported scale was verified by visual inspection of the distribution of patient scores in the uploaded data.
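In the study, the scale check was done by visual inspection. As a rough programmatic stand-in only, the sketch below guesses whether a set of scores sits on a 0–10 or a 0–100 range from the largest observed value. The function name and the decision rule are hypothetical, and missing scores are assumed to be coded as None. The rule would misclassify a 0–100 scale on which no patient scored above 10, which is one reason inspecting the full distribution remains preferable.

```python
from typing import Iterable, Optional


def infer_scale_max(scores: Iterable[Optional[float]]) -> Optional[int]:
    """Guess the upper bound of the scale (10 or 100) from the observed scores.

    Returns None when there are no non-missing scores to judge from.
    This is a crude heuristic, not a substitute for inspecting the distribution.
    """
    observed = [s for s in scores if s is not None]
    if not observed:
        return None
    # Any value above 10 cannot come from a 0-10 scale.
    return 10 if max(observed) <= 10 else 100


if __name__ == "__main__":
    print(infer_scale_max([0, 3, 7.5, None, 10]))
    print(infer_scale_max([12, 55, None, 80]))
```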

Results

Registries from 15 countries participated: ATTRA (Czech Republic), DANBIO (Denmark), ERSBTR (Estonia), ROB-FIN (Finland), ICEBIO (Iceland), GISEA (Italy), AmSpA (Netherlands), NOR-DMARD (Norway), Reuma.pt (Portugal), RRBR (Romania), biorx.si (Slovenia), BIOBADASER (Spain), SRQ (Sweden), SCQM (Switzerland), and BSRBR-AS (UK). BSRBR-AS and AmSpA collected data on axSpA only. Data availability and completeness were assessed in a total of 33,948 patients (axSpA: 21,330, PsA: 12,618).

Online survey regarding registry design

Table 1 presents an overview of the 15 registries, based on the online survey and follow-up interviews. The full survey is included as Supplementary material. A diagnosis was registered using the International Classification of Diseases, tenth revision (ICD-10) in 5 registries, classification criteria in 2 registries, and expert opinion in 1 registry. In the remaining 7 registries, all three methods could be applied (Table 1). Treatment with b/tsDMARDs was registered in all registries, while treatments with conventional synthetic (cs) DMARDs, non-steroidal anti-inflammatory drugs (NSAIDs), and glucocorticoids were registered in 14, 8, and 11 registries, respectively (Table 1). The estimated coverage of eligible patients ranged from 0.5% (Netherlands) to 100% (Romania) for both diagnoses (Table 1). The sources of funding for the registry activities differed: research grants in 7/14 registries (covering 2–80% of expenses), the public sector in 4/14 (covering 10–100%), industry in 12/14 (20–100%), and other sources in 3/14 (10–100%) (Table 1). The funding was further explored during the follow-up interviews and covered expenditures related to the development and running of IT platforms, dedicated research nurses, secretaries, data managers, and statisticians.

Table 1 Set-up of 15 registries in EuroSpA

Patient data availability and completeness assessment of uploaded datasets

In Table 2, data availability and completeness are presented in pooled and stratified data (treatment courses initiated before vs. after January 1, 2015, and axSpA vs. PsA), and in Fig. 1 data are further stratified by b/tsDMARD history and registry. Age, sex, disease duration, C-reactive protein (CRP), and details regarding b/tsDMARDs were available in all 15 registries, with data completeness ranging from 85 to 100% (Table 2). Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) scores were also available in all registries, but data completeness varied by time period (later time period: 71% vs. earlier: 54%) and diagnosis (axSpA: 78% vs. PsA: 39%) (Table 2). Data completeness for variables describing peripheral involvement, such as swollen/tender joint counts and the Health Assessment Questionnaire (HAQ), was higher in PsA (50–85%) than in axSpA (16–58%). Conversely, variables designed to evaluate axial involvement, such as the BASDAI and the Bath Ankylosing Spondylitis Functional and Metrology Indices (BASFI and BASMI), had higher data completeness in axSpA (39–78%) than in PsA (7–39%) (Table 2). All PROMs had higher data completeness in the later time period (68–86%) than before 2015 (50–79%) (Table 2). Variables describing uveitis and peripheral musculoskeletal manifestations (enthesitis and dactylitis) of SpA were more complete than those describing comorbid conditions (diabetes, cardiovascular disease, and kidney disease) (Table 2).

Table 2 Results regarding data availability and completeness
Fig. 1

Data completeness for variables collected in axSpA (upper panel) and PsA (lower panel) overall and stratified by time period for initiation of a b/tsDMARD treatment course, b/tsDMARD history, and registry. Legend: Unless otherwise stated, we used secondary pseudonymized baseline data from initiation of the first biologic (b) or targeted synthetic (ts) disease-modifying anti-rheumatic drug (DMARD) treatment on patients with a clinical diagnosis of axial spondyloarthritis (axSpA) or psoriatic arthritis (PsA), 18 years or older, followed in one of the participating registries since the start of their first b/tsDMARD between 2000 and 2021. Sweden provided data on secukinumab-treated patients only. ASAS, Assessment of SpondyloArthritis international Society; CASPAR, Classification Criteria for Psoriatic Arthritis; MASES, Maastricht ankylosing spondylitis enthesitis index; PASI, Psoriasis Area and Severity Index; NAPSI, Nail Psoriasis Severity Index; BASMI, Bath Ankylosing Spondylitis Metrology Index; cs, conventional synthetic; NSAID, non-steroidal anti-inflammatory drug; PROMs, patient-reported outcome measures; BASDAI, Bath Ankylosing Spondylitis Disease Activity Index; BASFI, Bath Ankylosing Spondylitis Functional Index; HAQ, Health Assessment Questionnaire; CRP, C-reactive protein; ESR, erythrocyte sedimentation rate; HLA-B27, Human Leukocyte Antigen subtypes B*2701-2759; EMMs, extra-musculoskeletal manifestations. *Baseline data on patients who initiated a TNFi between January 1, 2009 and December 31, 2018 (alcohol); **baseline data on patients who initiated a new b/tsDMARD between January 1, 2015 and May 31, 2022 (prednisolone); ***baseline data on patients initiating a later line b/tsDMARD (1 prior or ≥2 prior)

Variables not available in the uploaded data

Additional variables, such as physical activity, intramuscular and intra-articular use of glucocorticoids, EuroQol-5 Dimensions (EQ-5D), other comorbid conditions, imaging, and adverse events were available in some registries, as reported by the registry experts (Supplementary Table S1). Data completeness for these variables was not available in this study.

Wording, recall period, and scale used for selected patient-reported outcome measures (PROMs)

An overview of selected PROMs used in axSpA across registries is presented in Table 3, and a similar overview for PsA in Supplementary Table S2. For both diagnoses, differences in the wording, recall period, and scale were observed. For patient global, the questions referred to either “overall impact due to disease activity” or “overall impact due to the rheumatic disease”. For patient pain, the questions referred to either “pain due to the rheumatic disease,” “spinal pain,” or pain non-specifically. For patient fatigue, the questions referred to either “unusual fatigue/tiredness,” “fatigue due to the disease,” or fatigue non-specifically. For the patient global, pain, and fatigue assessments, the recall periods varied from “at the moment” to “last week,” and the assessments were performed using either numeric rating scales (NRS) from 0 to 10 or from 0 to 100, or visual analog scales (VAS). The BASDAI and BASFI were assessed using either NRS 0–10 or 100 mm/10 cm VAS.

Table 3 Overview of selected patient-reported outcome measures in axSpA across registries

Discussion

In this study, we investigated the design and data collection in 15 European SpA registries, covering ≈34,000 patients with axSpA and PsA. By collecting details of coverage, recruitment, funding, and assessment of PROMs in the participating registries, we have provided insights into potential challenges when attempting to pool data. High data completeness was observed in core demographic, clinical, and treatment-related variables, and moreover, we observed an increased data completeness of PROMs in recent years.

This study is the first to comprehensively characterize the commonalities and differences across European SpA registries. Heterogeneity across registries has been acknowledged as a factor in interpreting pooled data since EuroSpA was established in 2017, and this study provides further insights into such differences [14, 15, 34, 35]. In RA, two collaborative cross-country studies concluded that further collaboration would benefit from harmonization of data collection [16, 17]. Similarities between our study and the RA studies include the survey-based collection of information from registry experts regarding different aspects of European registries. Our study, however, adds further weight by incorporating real-world data uploaded by the registries for assessment of data completeness.

We noted large variation in coverage across registries, with some covering up to 100% of eligible patients and others only a small proportion. This implies that some registry cohorts may be generally representative of patients with SpA in that country or region, whereas other cohorts may be highly selected. Such heterogeneity should be considered when pooling data across registries. Another interesting finding was that in some registries, a diagnosis could be assigned using several methods, i.e., ICD-10, classification criteria, or expert opinion, while in two registries, classification criteria were the only method used. This may reflect that the registries have different main purposes: some are primarily clinical, while others are mainly used for research. How a diagnosis is established matters because the concordance between clinical diagnoses and fulfillment of classification criteria is not complete, and the clinical characteristics of the patients may also differ according to the diagnostic strategy. In a recent study, 83% of patients with a clinical axSpA diagnosis (ICD-10 of all axSpA diagnoses combined) fulfilled either Assessment of SpondyloArthritis international Society (ASAS) or modified New York classification criteria, and those fulfilling the criteria were more often men and HLA-B27 positive but had less enthesitis [36]. To gain more insight, future work could investigate how the different registration strategies are balanced in the registries.

The frequencies of missingness in our data were similar to the collated estimates previously reported by Radner et al. in European RA registries for disease duration, patient global score, patient pain, HAQ, joint counts, and CRP (0–20%) and for treatment with NSAIDs (20–40%), while our data were more complete regarding cigarette smoking and fatigue [16]. However, it should be noted that the frequencies presented by Radner et al. were self-reported estimates, while in this study they were calculated from real data [16]. As could be expected, the BASDAI and BASFI, which are measures developed for use in patients with ankylosing spondylitis, had more complete data in axSpA than in PsA patients, probably reflecting that the majority of the latter have a phenotype with predominantly peripheral involvement. It could also suggest that axial PsA is not routinely looked for in the clinical encounter and that tools to assess the axial domain of PsA are therefore not applied in a subset of patients. In general, routine registration of PsA patients may be challenged by the heterogeneity of PsA and the large number of potentially affected domains.

Interestingly, we found higher data completeness across all PROMs in the later time period (after 2015), which may be a sign of an increasing focus on patient engagement, as illustrated by the implementation of online digital solutions that facilitate data collection using touch screens and apps [37,38,39].

Our evaluation of PROMs across registries revealed differences in the use of wording, recall period, and scale. The differences were most evident for the patient global, pain and fatigue scores, which could reflect that no specific wording, recall, or scale for the assessment of these concepts has been recommended across rheumatic diseases. However, some variation in the use of scale was still observed for the BASDAI and BASFI although these have been validated in several countries [25,26,27,28,29,30,31,32,33].

Regarding the wording, only rough comparisons should be made due to the probable semantic differences following the translation of the original questions performed by the registry experts. Possible explanations of the differences observed in our study are many, given the heterogeneity of the registries in general. For instance, we could speculate that data collection practices in axSpA and PsA might have been influenced by RA registries since the movement towards including PROMs as outcome measures in rheumatology started with the development of a core set for endpoints in RA [40]. Several years later, recommendations for AS-specific scores and scales for spinal pain, patient global, and fatigue were proposed in the ASAS core set [41, 42].

In line with this theory, we have seen that the majority of the SpA registries included in our study ask about pain in more general terms and not about spinal pain specifically. Conversely, since widespread pain has been shown to be a strong predictor of poor outcome [43], and spinal pain is already included in the BASDAI, the registries may also have made an active decision to consider pain more generally. The impact of such cross-registry differences in PROM wording, recall period, and scale on data from pooled analyses has not been investigated.

Some limitations to our study should be noted. First, since the online survey and follow-up interviews were conducted in a small group of experts from each registry, we cannot exclude that the responses might have differed slightly had other registry experts been assigned the task. This limitation would, however, mainly apply to the areas where we have not presented real data for verification, e.g., registry set-up (including coverage, funding, and data management), safety, lifestyle, and imaging. Next, since all registries except one (BSRBR-AS in the UK) collect data in a language other than English, the patient assessments were translated by the registry experts from the original language to English to compare the wording. Such a translation should ideally have been done by a native speaker with good knowledge of both languages and then translated back by a similarly knowledgeable bilingual [44]. Furthermore, the study revealed that some key patient variables were collected in all registries, whereas considerable heterogeneity in data availability was observed for other variables. Also, the wording, recall periods, and scales used for patient assessments differed across registries. Finally, we observed variation in data completeness of patient-reported outcomes over time, with an increase in recent years, perhaps reflecting a larger emphasis on their relevance.

Conclusions

This study has uncovered considerable variation in the design of axSpA and PsA registries across fifteen countries in Europe. Moreover, differences in the availability and completeness of data in general, and in the wording, recall periods, and scales used for patient assessments, contributed to the heterogeneity. This study might serve as a basis for examining how differences in the current data collection across registries affect pooled analyses, thereby informing the potential need for a more unified strategy in future collaborative research.