Background

Rheumatoid arthritis (RA) is a chronic inflammatory joint disease that strongly affects quality of life [1–3]. The core set of outcome measures for RA clinical trials includes three patient-reported outcomes (PROs): pain, physical functioning and patient global assessment of disease activity [4]. In addition, the Outcome Measures in Rheumatology (OMERACT) group, an independent initiative of international health professionals and patient experts interested in outcome measures in rheumatology, has recommended that fatigue should also be measured in all RA clinical trials; furthermore, sleep quality and ability to work have both been mentioned as important outcomes and targeted for further instrument development [5–7]. Several questionnaires now exist to assess these PROs separately [8–10].

Recently the Rheumatoid Arthritis Impact of Disease (RAID) score was developed as an initiative of the European League Against Rheumatism (EULAR) to combine the most important PROs in one measure [11]. This patient-centered response index for clinical trials in RA was to replace the use of different questionnaires per outcome without missing important information, and to assess changes in outcome over time [12]. In 2011, the construct validity, reliability and sensitivity to change of the RAID score were documented in an international study. A strong correlation was found with other disease activity measures (e.g., patient global assessment visual analogue scale (R = 0.76)), as well as a high reliability (intraclass correlation coefficient: 0.90) and a good sensitivity to change (standardized response mean: 0.9) [13]. As the domains of the RAID were initially identified by a relatively small group of patient partners (n = 10), we decided to assess the content validity of the RAID questionnaire as well as comprehensibility before implementation of the RAID score in the Netherlands.
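As an aside on the sensitivity-to-change statistic quoted above, the standardized response mean (SRM) is conventionally the mean of the paired change scores divided by the standard deviation of those changes. The sketch below only illustrates that definition; the function name and the patient scores are hypothetical and are not data from the cited validation study.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Return mean(change) / SD(change) for paired scores.

    Change is computed as baseline minus follow-up, so a positive SRM means
    improvement on a scale where lower scores are better (as in the RAID).
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must contain paired scores")
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)  # stdev is the sample SD (n - 1)


# Hypothetical 0-10 scores for five patients before and after treatment.
baseline_scores = [6.0, 7.0, 5.0, 8.0, 6.0]
follow_up_scores = [4.0, 6.0, 4.0, 5.0, 5.0]
srm = standardized_response_mean(baseline_scores, follow_up_scores)
print(f"SRM = {srm:.2f}")
```

By common convention, an SRM of about 0.8 or higher is read as a large responsiveness, which is the sense in which a value of 0.9 is described as good.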

Materials and methods

Development of the RAID score

The development of the RAID score has been described elsewhere [12]. In brief, a priori the developers decided to create an instrument with seven domains. In the first phase, domains were initially elaborated through a ‘focus group’-type meeting where 10 patients (one patient per country, ten countries participated) identified 17 health domains on which RA had impact, based on patients’ personal experiences. To reduce the list of domains, 100 new patients ranked these domains in order of decreasing impact of RA on their lives.

In phase 2, items to measure the candidate domains were identified. One or several items, instruments or whole questionnaires were selected for each domain. When no validated instruments were available, a numerical rating scale was formulated by the group and validated with the 10 patients with arthritis who participated in the first phase. In the third phase, relative weights of the candidate domains were obtained, in which the second part of phase 1 was repeated with 500 patients. In the last phase, the generalizability of the preliminary RAID was assessed by comparing ranks of importance of chosen domains across countries (using data from phase 1), and by analyzing weights across disease and demographic characteristics. The final RAID score covers seven health domains: pain, problems with daily functioning, fatigue, sleep, physical well-being, emotional well-being and coping with RA, each represented by a numerical rating scale ranging from 0 to 10 [12]. The complete questionnaire and related calculation rules can be found in Additional file 1.
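To make the scoring concrete, the sketch below computes a RAID score as a weighted sum of the seven 0 to 10 ratings. The weights are an assumption taken from the published RAID scoring rules and are not stated in the text above, so they should be checked against Additional file 1 before any use; the function and key names are hypothetical.

```python
# Assumed domain weights (they sum to 1.00); verify against Additional file 1.
RAID_WEIGHTS = {
    "pain": 0.21,
    "functional_disability": 0.16,
    "fatigue": 0.15,
    "sleep": 0.12,
    "physical_well_being": 0.12,
    "emotional_well_being": 0.12,
    "coping": 0.12,
}


def raid_score(ratings):
    """Return the weighted RAID score (0 = best, 10 = worst).

    `ratings` maps each of the seven domains to a 0-10 numerical rating.
    Missing domains are rejected here; how the official rules treat
    missing values is not covered by this sketch.
    """
    missing = set(RAID_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for domain in RAID_WEIGHTS:
        if not 0 <= ratings[domain] <= 10:
            raise ValueError(f"{domain} rating must lie between 0 and 10")
    return sum(weight * ratings[domain] for domain, weight in RAID_WEIGHTS.items())


# Hypothetical patient: high pain and fatigue, moderate scores elsewhere.
example = {
    "pain": 8,
    "functional_disability": 5,
    "fatigue": 7,
    "sleep": 4,
    "physical_well_being": 5,
    "emotional_well_being": 3,
    "coping": 4,
}
print(round(raid_score(example), 2))
```

Because the weights sum to one, the total stays on the same 0 to 10 scale as the individual items, with pain contributing most under the assumed weighting.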

The translation of the questionnaire into Dutch was performed by the RAID developers. Two independent researchers (with Dutch as their mother tongue) translated the questionnaire into Dutch. After reaching consensus, back-translation was performed by the same two researchers. A multidisciplinary consensus meeting was held in which the original version and the back-translation were compared and in the last step the final version was pre-tested on five patients [12].

Current study: content validity

Content validity as defined by the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) group addresses the question: are all items included in the RAID score an adequate reflection of the construct to be measured? Content validity is often considered one of the most important measurement properties and is therefore addressed first, before looking at other measurement properties such as construct validity. Construct validity is defined as the degree to which RAID scores are consistent with hypotheses that were made in advance (e.g., about the relation of scores to scores of other instruments), based on the assumption that the RAID score validly measures the construct to be measured [14].

This study used a part of the COSMIN checklist to evaluate the content validity of the RAID [15, 16]. This checklist was developed to evaluate the methodological quality of studies on measurement properties [15]. The items in box D of the COSMIN checklist were slightly adapted to evaluate the quality of the instrument. Evidence was collected to answer the following four questions, based on box D of the COSMIN checklist, as evidence for content validity: 1) Do all items refer to relevant aspects of the construct to be measured? 2) Are all items relevant for the study population? 3) Are all items relevant for the purpose of the measurement instrument? And 4) do all items together comprehensively reflect the construct to be measured?

Focus group discussions

To gain insight into the domains of health on which RA has an impact, patients were asked to participate in focus group discussions (FGDs). An FGD is a form of qualitative research that retrieves data on patients’ experiences, opinions, perspectives and assumptions [17]. In an FGD, patients are encouraged to interact with each other. In this study, patients were asked to describe the impact of RA on their life (question 2 of box D), and whether they felt that the RAID score covered all items that should be measured to assess the impact of RA on their life (question 4 of box D). We also wanted to assess the comprehensibility of the RAID. Therefore, in the second part of the FGD, patients were asked if they felt that the RAID questions were understandable and clearly defined (see Table 1 for the structure of the focus group discussions).

Table 1 Basic structure of the focus group discussions

Study sample focus group discussions

To gather as diverse a range of patient opinions as possible, patients with diagnosed RA (≥18 years and not suffering from multiple co-morbidities) were recruited through two patient associations in Amsterdam and surroundings to participate in one of three planned FGDs (maximum 10 patients per group). These patient associations distributed an invitation letter to their members.

Data collection

All questions were open-ended [18]. Table 1 shows the FGD structure used. Data (on the domains mentioned by the patients) were gathered until saturation was reached, meaning that focus group interviews continued until no new categories, themes or explanations emerged from the patients [19]. Patients were informed that the FGDs were moderated (LHvT), taped and anonymously transcribed (BSB) by researchers not involved in the clinical management of any of the patients. Under Dutch law, this study does not need approval from an Ethical Review Board. However, all patients gave oral informed consent at the start of the group sessions on the use of their provided responses, which was recorded.
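The saturation rule described above can be sketched as a simple bookkeeping exercise: track which themes each successive session adds, and stop recruiting once a session adds none. This is a minimal illustration only. The allocation of themes to sessions below is hypothetical and does not reproduce the actual coding of the transcripts.

```python
def new_themes_per_session(sessions):
    """For each session, return the set of themes not raised in any earlier session."""
    seen = set()
    fresh_by_session = []
    for themes in sessions:
        fresh = set(themes) - seen
        fresh_by_session.append(fresh)
        seen |= fresh
    return fresh_by_session


def saturation_reached(sessions):
    """True when the most recent session contributed no new themes."""
    if not sessions:
        return False
    return not new_themes_per_session(sessions)[-1]


# Hypothetical theme sets per focus group (illustrative only).
fgd1 = {"work", "coping", "pain", "fatigue", "relationships"}
fgd2 = {"work", "daily activities", "leisure", "emotions", "pain"}
fgd3 = {"work", "fatigue", "relationships", "emotions"}

for number, fresh in enumerate(new_themes_per_session([fgd1, fgd2, fgd3]), start=1):
    print(f"FGD {number}: {len(fresh)} new theme(s)")
print("Saturation reached:", saturation_reached([fgd1, fgd2, fgd3]))
```

In practice the judgement is made by the analysts on categories, themes and explanations together, so a set comparison like this captures only the counting part of the decision.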

Data analyses

To thoroughly investigate all expressions of the patients, the focus group moderator asked in-depth questions such as: ‘In what way do you mean that?’ or ‘Could you provide us with an example?’ The data were analyzed using the interpretative phenomenological analysis method, which assesses content as well as the underlying cognitive and emotional concepts [20, 21]. A bottom-up approach was applied to avoid prior assumptions and minimize bias. Two researchers experienced in qualitative research (MMtW and LHvT) systematically and independently analyzed the transcripts to identify relevant themes, and subsequently agreed on the same set of major themes during a consensus meeting. The researchers also discussed whether all RAID items could potentially show change when the disease improves or deteriorates (question 3 of COSMIN box D).

World Health Organization International Classification of Functioning, Disability and Health core set for RA comparison

In order to answer question 1 of COSMIN box D, the domains in the RAID score were compared to the domains in the World Health Organization (WHO) comprehensive core set of International Classification of Functioning, Disability and Health (ICF) for RA to see whether all items refer to relevant aspects of the construct to be measured (see Additional file 2) [22]. The ICF makes it possible to assess not only the medical aspects of the patients’ disease, but also to take into account all aspects of their life such as participation and environment [23]. ICF core sets, such as the ICF core set for RA, have been developed to describe a patient’s level of functioning specifically for one disease as only the relevant categories for this disease are listed in the core set [24]. Previously, several studies assessed the validity of the ICF core set for RA. In one of these studies, patients almost entirely confirmed the domains included in the core set in several focus group meetings [25].

Results

Focus group discussions

The two patient organizations sent the invitation letter to all their members (approximately 600). A total of 23 patients were willing to participate in the focus groups, but five subsequently declined, mostly for lack of time. The remaining 18 patients were split equally among three FGDs. Mean age was 60 years (standard deviation 17 years), median disease duration 14 years (range 2–50 years), and 89 % of patients were women. Approximately 50 % of the patients had a high educational level (>17 years) and 17 % a low level (≤12 years). Most patients were either married or living together (83 %). Regarding work status, 28 % performed paid work (17 % worked ≥32 hours per week), 28 % were retired and 39 % were fully work disabled. The results of the interviews are presented using summaries and quotes. During the third focus group interview, no new categories, themes or explanations emerged from the patients (data saturation). Therefore, no new patients had to be recruited.

Performing or maintaining their job was one of the first things patients mentioned in all three FGDs. Most patients lost their job due to their disease. This made patients uncertain about their financial situation and it affected their self-confidence, both leading to feelings of stress and increasing their perceived disease activity. Patients who were still working noticed that pain and fatigue decreased their work productivity. Quotes 1–4 in Table 2 concern this topic.

Table 2 Quotes of participants on work and coping with the disease

A second major issue for patients was the difficulty in coping with their disease. RA can fluctuate with regard to disease activity during the day, making it possible to perform an activity at one point in time but impossible at the next. Patients mentioned that they experienced difficulties with coping and had to make other choices due to their disease. They indicated that this was because it felt as if their body would ‘abandon’ them while performing an activity. Quotes 5–8 in Table 2 concern this topic.

The third domain that patients discussed was the influence of their disease on relationships with others, such as partners, children, and friends. Patients experienced a lack of understanding and consideration on days that their disease was very active; this was attributed to the relatively ‘invisible’ nature of RA compared with, for example, a broken leg. Due to the disease, the distribution of roles changed. For example, partners would have to perform more household tasks than before the patient was diagnosed with RA, or patients were not able to engage in activities with friends as they used to. As a result, irritations, tension and stress could occur in the relationship with others, sometimes leading to the loss of a partner or friend. Quotes 9–11 in Table 3 concern this topic.

Table 3 Quotes of participants on relationships with others and performing activities in daily life and leisure time

A fourth domain that was influenced by RA was performing activities in daily life and in spare time. Regarding routine daily activities, patients reported having difficulty getting dressed, taking care of their children, cooking, going to the toilet and performing household activities. Regarding leisure-time activities, patients mentioned that it became difficult to perform their sports, to go on vacation, or perform a hobby such as painting. Quotes 12–14 in Table 3 concern this topic.

Three other domains that patients mentioned as having an impact on their life were having pain, being fatigued and being emotional. Having pain and being fatigued restricts functioning in all aspects of life, but having the disease also results in emotional fluctuations. For example, patients experience anger towards their disease as it sometimes limits their functioning, have the urge to cry due to pain, or get frustrated due to the inability to perform a task they were able to perform the day before. Quotes 15–20 in Table 4 concern this topic.

Table 4 Quotes of participants on pain, fatigue and emotions

Of the seven domains in the RAID score, five were indicated as being relevant for the study population: a) coping with the disease; b) functional disability assessment (activities performed in daily life); c) pain; d) fatigue, and e) emotional well-being. The domains sleep and physical well-being were mentioned only briefly or not at all in the FGDs. Patients considered the domains work, relationships with third parties and leisure time activities important, and noted that these are missing from the RAID score.

During their consensus meeting, the two researchers discussed whether the RAID score can be used as an evaluative instrument, i.e., to assess change over time, for example the effect of drugs on the impact of the disease [12]. They considered all items potentially changeable, and therefore the whole RAID score was considered relevant for an evaluative purpose.

Comprehensibility

In general, the numerical rating scales may cause difficulties in interpretation. In the Netherlands such questions are often answered by comparing them with school grades (a grade 10 is outstanding and grade 1 is very poor; however, in practice only the grades 4–9 are used). In other words, the full range of the scale may not be used, and patients need to reverse the anchors, as low rather than high scores in the scales reflect the preferred condition, which was perceived as counterintuitive.

Although the patients found the questions on daily physical functioning, fatigue, sleep and emotional well-being clearly defined, they also mentioned that it was difficult to confidently ascribe fatigue or sleeping problems to RA, as these are also influenced by other circumstances. The questions about daily physical functioning and emotional well-being were interpreted by the patients as having problems with performing household activities, and coping with their disease in the past week, respectively. The questions might be clearer if concrete descriptions of activities were provided instead of broad concepts, e.g., by referring to hobbies and work instead of daily physical functioning.

Patients indicated that the questions on pain, physical well-being and coping with RA were difficult to understand, and not clearly formulated. Patients described pain to be dependent on the type of activities that are performed, and influenced by different (environmental) circumstances. Therefore, pain perception can change every hour making it difficult to give an average rating over 1 week. They stated that time frames other than 1 week (for example 1 day) might be more appropriate to use. Concerning the question on coping, patients indicated they felt this question covered the same domain as the question on emotional well-being. They were unclear whether the subject of this question concerns coping with RA emotionally, in daily functioning or in general. For coping, a time period of 3 months was found to be more appropriate than 1 week.

Concerning the question on physical well-being, it seems that translation errors have occurred. For example, in the Dutch version a sentence has been added to the question on physical well-being: ‘considering your physical well-being (apart from pain, inflammation and fatigue)’, which is not mentioned in the English version (see Additional file 1). Patients indicated that it is impossible to set aside the pain, inflammation and tiredness in answering this question, as these items influence the functioning of their body and therefore their physical well-being.

WHO ICF core set for RA comparison

The WHO ICF core set for RA comprises 96 categories, divided over the components ‘Body structures’, ‘Body functions’, ‘Activities and participation’, and ‘Environmental factors’ (see Additional file 2) [22]. Table 5 shows the results of the comparison of the components of the ICF core set and the domains as measured in the RAID score. Most of the domains are formulated as broad concepts and cover several items of the ICF core set. Fatigue and coping are not mentioned in the ICF core set. Fatigue could be linked to the third level categories ‘b1300 Energy level’ and ‘b4552 fatigability’ of the entire WHO ICF framework, but it is not specifically and explicitly mentioned in the core set [24–27]. The component ‘Environmental factors’ of the core set is not covered in the RAID score, nor are the levels d660–d920 of the component ‘Activities and participation’. Concerning pain, the ICF core set contains far more specific pain categories, specified per body part at the third level, which are not addressed by the RAID. Finally, all categories of ‘Body structures’ (e.g., structure of lower or upper extremity) and ‘Body functions’ (e.g., mobility of joint functions, sensations related to muscles and movement functions) are also not covered in the RAID. However, such items are not expected to be measured in a PRO measure.

Table 5 Comparison of the RAID score with the WHO ICF core set for RA

In conclusion, five out of the seven items from the RAID score refer to three domains of the WHO ICF core set. RAID adds two domains not covered by the WHO ICF, and omits four, of which two are outside the scope of a PRO measure.

Discussion

Taking into account the impact of RA disease activity on the daily life of patients (the aim of the RAID score) is of high importance to rheumatology clinical practice. Our study confirms the importance of five domains of the RAID (coping with the disease; functional disability assessment (activities performed in daily life); pain; fatigue, and emotional well-being). Our patients also noted the omission of questions on (paid) work, relationships with others, and activities performed in spare time.

In the comparison with the ICF core set, all categories of the components ‘Body functions’ and ‘Body structures’ were covered in the domains sleep, physical well-being, pain and emotional well-being. Approximately 75 % of the categories in the component ‘Activities and participation’ and none of the categories in the component ‘Environmental factors’ were covered by the RAID score. On the other hand, the domains fatigue and coping which are known to be of high importance to RA patients are not incorporated in the ICF core set for RA, pointing to content validity problems with the ICF core set [7, 27].

Patients also pointed out some issues with comprehension. These might partly be explained by translation errors, but also by different interpretations of the numerical rating scales, by the 1-week time frame to refer to, and by the lack of examples in the questions. The Dutch version of the RAID needs to be modified to reflect the English version, and during the development of this paper we learned this was indeed done in 2015 (see online: http://www.eular.org/tools_products_.cfm).

The strength of our study lies in the three FGDs performed with RA patients, as this method is an effective way to discover people’s ideas, feelings and needs about a subject, in this case the impact of RA on several domains of their life [17]. Another advantage of FGDs is that they offer a more natural setting than individual interviews, as patients influence and are influenced by others, just as in daily life, providing the opportunity for discussion and consensus [17]. Our patients were not provided with literature or the RAID score in advance, so they had no prior knowledge of this subject. The ten patients who participated in the original development of the RAID score are all OMERACT patient research partners. These patients have more knowledge of research than average patients. Although 50 % of our patients were also highly educated, the rest had moderate or low educational levels. Therefore we think our population, albeit small, better represents the entire RA population. We did find it striking that the domains sleep and physical well-being were mentioned only briefly or not at all by the patients in our study, whereas the OMERACT patient research partners, who are also patients, did mention these items. We do not have an explanation for this difference in outcome.

A first limitation of this study concerns the recruitment of patients, which could have biased the results. As patient organizations distributed the invitation letter, we do not know the exact number of invited patients. It is likely that patients who had noticed that their disease altered their life situation were more likely to participate in the FGDs than patients who did not notice a major impact of RA on their lives. Also, our patients all had established RA, perhaps limiting generalizability. We did not collect data on comorbidities that patients might have, and these might also have an impact on a patient’s life. Secondly, the transcript analysts were familiar with the domains of the RAID score, so bias towards these domains might have occurred. However, as saturation was reached after the third FGD and patients were only asked about their own experiences, we think this potential bias has a minimal impact. Finally, the domains noted by Dutch patients might be weighed differently by patients in other countries.

Without question, the RAID is a major advance in the assessment of impact of RA disease activity on everyday life, and is useful in its current form. But at its presentation, we noted that some of the domains assessed by the RAID are already in the RA core set, and cautioned against double counting [28]. In this study we suggest other domains that might be added to enhance content validity from the patient perspective. Several studies have pointed out that domains such as (paid) work, relationships with others, and activities performed in spare time are important and have impact on a patient’s ‘identity’ and quality of life [5–7, 28, 29]. A future upgrade of the tool should take these findings into consideration.

Conclusion

In conclusion, the Dutch version of the RAID score has fairly good content validity. More research is needed to confirm whether the domains (paid) work, relationships with others and activities performed in spare time are important in other patient groups. If so, these should be considered in a future upgrade.

Ethics statement

Under Dutch law, this study does not need approval from an Ethical Review Board. However, all patients gave oral informed consent at the start of the group sessions on the use of their provided responses, which was recorded.

Consent statement

All authors consent to the content of this article.