Introduction

Ankylosing spondylitis (AS) is a chronic, immune-mediated, inflammatory rheumatic disease that causes destruction and fusion of the spinal vertebrae, produces structural changes on radiographs in the sacroiliac joints, and may affect the peripheral joints and entheses [1, 2]. With an estimated global prevalence of approximately 0.02–0.50%, AS is associated with significant clinical and economic burden [3]. According to recent recommendations set forth by the Assessment of SpondyloArthritis international Society and European League Against Rheumatism, as well as the American College of Rheumatology, the Spondylitis Association of America, and the Spondyloarthritis Research and Treatment Network, the goals of treatment of AS are to reduce symptoms, maintain spinal flexibility and normal posture, reduce functional limitations, maintain work ability, and decrease disease complications [4, 5]. Patients are traditionally treated with nonsteroidal anti-inflammatory drugs, physical therapy, and exercise. However, some patients continue to experience active disease despite these regimens; for these patients, international recommendations suggest treatment with biologics such as tumor necrosis factor inhibitors (adalimumab, certolizumab pegol, etanercept, golimumab, and infliximab) or the fully human interleukin-17A inhibitor secukinumab, the only other biologic therapy with an alternative mechanism of action approved for AS [4, 5].

In addition to changes in clinical and laboratory measures of disease activity, patient-reported outcomes (PROs) represent important measures of patients’ firsthand accounts of their health status and response to treatment [6,7,8]. Several disease-specific PRO instruments, which have been developed and validated with input from patients and physicians, are commonly used in clinical trials in patients with AS, including the Assessment of SpondyloArthritis international Society Health Index (ASAS-HI), Ankylosing Spondylitis Quality of Life Questionnaire (ASQoL), Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), Bath Ankylosing Spondylitis Functional Index (BASFI), and the Health Assessment Questionnaire for Spondyloarthropathies (HAQ-S) [9,10,11,12,13]. Many clinical trials have demonstrated the effect of tumor necrosis factor inhibitors and secukinumab in improving clinical and patient-reported outcomes in patients with AS [14,15,16,17,18,19]; however, real-world studies which have thoroughly evaluated the concepts captured within these instruments are limited in patients with AS.

Guided surveys are often used to better understand patient experiences, and the use of the internet as a resource has helped to broaden the range of patients with AS who participate in these studies [20,21,22]. The pervasiveness of the internet in everyday lives has given rise to online health communities, which are becoming increasingly popular among patients to voluntarily share their experiences and concerns, and to provide support for other patients [23]. Online communities exist for patients with AS through organizations (e.g., PatientsLikeMe, KickAS) or even through groups created on larger online platforms (e.g., Facebook, Twitter) [24,25,26]. Members of these communities may share information about their signs and symptoms (e.g., back pain and morning stiffness), ask questions about different treatments (e.g., effectiveness and/or side effects), or just describe their personal experiences with the disease (e.g., the journey to diagnosis or coping with symptoms/treatments) in the hope that they can gain insight into their condition, provide comfort, or serve as a guide for others. PatientsLikeMe provides an up-to-date summary of basic demographics, common symptoms reported by patients, and treatments used, as well as perceived effectiveness and side effects of these treatments [25]. These online communities and social media networks provide members with a wealth of information on firsthand patient experiences and how patients, family members, and caregivers are living and coping with their disease. Data from online sources can be collated and analyzed to influence how patients and providers make decisions regarding disease management and treatment; however, few studies have systematically organized or examined these data across multiple online platforms.

Hence, the social media realm represents an opportunity to increase understanding of patient experiences that are not necessarily captured in clinical trial settings or with guided surveys. Information available in this barrier-free space will be accessed by patients and is potentially influential in providing support and in shaping their decision-making process. This study aimed to better understand AS disease burden and its impact on patients’ lives by describing functional impairments as reported online by patients. These patient experiences were then compared with concepts within existing AS-specific PRO instruments to identify key aspects of AS that are not captured.

Methods

Data Source

From an aggregate of thousands of publicly available online health care sources that are commonly indexed by major search engines (e.g., Google, Bing), English-language narratives posted between January 2010 and May 2016 were collected from 52 online sources for analysis of functional impairment in patients with AS (Table S1). Preexisting and unsolicited narratives and/or conversations relating to real-world experiences with AS and associated treatments were used; for the purpose of this analysis, non-English narratives were excluded from consideration. Data sources included general social networks, patient–doctor Q&A sites, treatment review forums, health social networks, and disease forums. All data were obtained from publicly available, patient-led discussions. Content such as patient education materials or published literature articles, which does not contain actual experiences, was excluded from this study. The final sources were ultimately chosen based on whether members of a given website reported having AS and ≥ 1 symptom or functional impairment. Strict standards and protocols were followed when crawling websites, abiding by the “robots exclusion standard” and adhering to each site’s robots.txt terms, including crawl delay directives, user-agents, disallow directives, sitemaps, etc. Only publicly available, non-password-protected information indexed by Google, permitted for crawling, was pre-identified for this study. No password-protected sites were accessed, nor were any personally identifiable patient data used in the generation of the study results, i.e., no direct identifiers were extracted, and any quasi-identifiers were transformed into aggregate forms (e.g., age in years → age ranges, with n number of bands). No patient recruitment was conducted.
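As an illustration of the crawling safeguards described above, the following minimal Python sketch checks candidate URLs against a site's robots.txt rules and reads its crawl-delay directive using the standard-library `urllib.robotparser` module. The robots.txt content, the user-agent name, and the URLs are hypothetical, and the crawler actually used in the study is not described in this level of detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download the file
# from the site root before requesting any page.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

USER_AGENT = "study-crawler"  # hypothetical user-agent name


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl_plan(parser: RobotFileParser, urls, user_agent: str = USER_AGENT):
    """Return (allowed_urls, delay_seconds), honoring Disallow and Crawl-delay."""
    allowed = [url for url in urls if parser.can_fetch(user_agent, url)]
    delay = parser.crawl_delay(user_agent) or 0  # None when no delay is declared
    return allowed, delay


rules = load_rules(ROBOTS_TXT)
allowed, delay = crawl_plan(rules, [
    "https://forum.example.org/threads/morning-stiffness",
    "https://forum.example.org/private/member-profile",
])
print(allowed, delay)
```

In this sketch the disallowed profile page is dropped before any request is made, and the returned delay would be used as the pause between successive requests to the same site.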

To reduce potential duplication of patients in the study sample (i.e., those patients who reported experiences as users or members across multiple online platforms), statistical models were used to compare factors across reports (e.g., username, time-stamps of reports, concepts contained within the reports). If these factors appeared highly similar across users, the profiles were tagged and flagged for review. Flagged profiles were then passed through a manual curation process to ensure that every attempt was made to remove duplicates.
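The statistical models used for duplicate detection are not specified here, so the Python sketch below substitutes a simple heuristic for illustration only: it averages username similarity, overlap of extracted concepts, and closeness of time-stamps, and flags cross-site profile pairs that score above a threshold for manual review. The field names, the equal weighting, the 24-hour window, and the 0.7 threshold are all assumptions.

```python
from datetime import datetime
from difflib import SequenceMatcher
from itertools import combinations


def username_similarity(a: str, b: str) -> float:
    """Character-level similarity of two usernames, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def concept_overlap(a: set, b: set) -> float:
    """Jaccard overlap between two sets of extracted concepts."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def time_proximity(a: datetime, b: datetime, window_hours: float = 24.0) -> float:
    """1.0 for simultaneous posts, falling linearly to 0.0 at the window edge."""
    gap_hours = abs((a - b).total_seconds()) / 3600.0
    return max(0.0, 1.0 - gap_hours / window_hours)


def flag_possible_duplicates(profiles, threshold: float = 0.7):
    """Return (id, id, score) for cross-site pairs that should be reviewed manually."""
    flagged = []
    for p, q in combinations(profiles, 2):
        if p["site"] == q["site"]:
            continue  # assumption: only cross-platform duplicates are of interest
        score = (username_similarity(p["username"], q["username"])
                 + concept_overlap(p["concepts"], q["concepts"])
                 + time_proximity(p["posted"], q["posted"])) / 3.0
        if score >= threshold:
            flagged.append((p["id"], q["id"], round(score, 2)))
    return flagged


# Hypothetical profiles for illustration.
profiles = [
    {"id": "A1", "site": "forum-a", "username": "spondy_mike",
     "posted": datetime(2015, 3, 1, 9, 0), "concepts": {"back pain", "fatigue", "anxiety"}},
    {"id": "B7", "site": "forum-b", "username": "SpondyMike",
     "posted": datetime(2015, 3, 1, 11, 0), "concepts": {"back pain", "fatigue"}},
    {"id": "B9", "site": "forum-b", "username": "jane_doe_77",
     "posted": datetime(2012, 6, 10, 8, 0), "concepts": {"depression"}},
]
flagged = flag_possible_duplicates(profiles)
print(flagged)
```

The output of such a step is only a review queue; as described above, the decision to merge or remove profiles remained a manual curation task.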

Data Qualification and Categorization

RLytics is a software-based data analytics platform, built on a combination of customized natural language-processing algorithms/technologies that extract and structure medical, clinical and functional concepts from unstructured healthcare data. Algorithms comprise both taxonomy-based and semantic models that have been trained and refined by expert curation. Using this natural language-processing platform (RLytics), and through manual expert curation, functional impairments and symptoms related to AS were extracted from patient narratives and classified into 6 high-level concepts based on the analysis of commonly occurring concepts within existing PRO instruments: social, physical, emotional, cognitive, role activity (SPEC-R), and general (consisting of nonspecific narratives, e.g., “feeling unwell”).

The SPEC-R framework is built on curated taxonomies that are used to codify and structure key elements of interest from unstructured patient narratives. Each concept within the SPEC-R framework had a structured, tabulated output that was then manually reviewed by research analysts trained in medical coding from unstructured data. For example, a patient may post the following experience to social media: “I was given drug X for my AS, but I am really anxious about any potential side effects”; from this statement, high-level concepts can be extracted [e.g., the reporter (patient), medical condition (AS), treatment (drug X), medical condition/symptom (anxiety)]. The structured outputs were initially reviewed separately by two individuals, who independently flagged errors for further evaluation. Agreed upon “flags” between the two reviewers were accepted for removal. Divergent flags were qualitatively assessed by both reviewers, with a third research analyst participating in the review. If these three individuals could not agree upon whether the concept output should be flagged, it was removed from the analysis. This framework allows for unguided patient commentary to be translated into structured patient data for further analysis. These broad SPEC-R categories provided a starting point for organization of the data, with additional levels of detail added through subconcept groupings that are more granular in nature (Table 1).
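The reviewer workflow described above can be summarized as a small decision rule. The Python sketch below is one reading of it, under two assumptions that the text implies but does not state outright: outputs flagged by neither reviewer are kept, and a unanimous three-person decision to flag leads to removal. The function and type names are hypothetical.

```python
from enum import Enum
from typing import Optional, Sequence


class Decision(Enum):
    KEEP = "keep"
    REMOVE = "remove"


def adjudicate(reviewer_a_flags: bool,
               reviewer_b_flags: bool,
               panel_votes: Optional[Sequence[bool]] = None) -> Decision:
    """Resolve one structured concept output.

    panel_votes holds the three votes (True = flag as an error) cast by the
    two reviewers and a third analyst when the initial flags diverge.
    """
    if reviewer_a_flags and reviewer_b_flags:
        return Decision.REMOVE  # agreed-upon flags are accepted for removal
    if not reviewer_a_flags and not reviewer_b_flags:
        return Decision.KEEP    # assumption: unflagged outputs are retained
    if panel_votes is None or len(panel_votes) != 3:
        raise ValueError("divergent flags need three panel votes")
    # Unanimous "not an error" keeps the output; a unanimous flag or any
    # disagreement among the three removes it from the analysis.
    return Decision.KEEP if not any(panel_votes) else Decision.REMOVE


print(adjudicate(True, False, [True, False, True]))
```

Under this rule, disagreement always resolves conservatively: an output that the panel cannot agree on is excluded rather than retained.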

Table 1 Overview of categories of SPEC-R concepts and subconcepts

All these subconcepts have been derived from existing categorizations of symptomatology and functions from existing medical dictionaries and taxonomies, such as those established by the Medical Dictionary for Regulatory Activities or the International Classification of Functioning, Disability and Health developed by the World Health Organization. While these “off-the-shelf” taxonomies comprise only standardized terminologies (e.g., correctly spelled, formalized terms), RLytics contains “curated” versions of these taxonomies, which include (often multiple) verbatim variations and common misspellings of standardized terms, and are subsequently mapped to standardized terms (e.g., “grazed knee” → “open wound of knee and lower leg”). A similar natural language processing approach was used to identify whether medical conditions reported were hypothetical (i.e., reported but not experienced) or as a “true experience” from someone who has actually experienced the condition (e.g., “I/me,” “my son/daughter,” “my husband/wife”).
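To make the idea of a "curated" taxonomy concrete, the following Python sketch maps verbatim variants and misspellings to standardized terms and SPEC-R concepts, and applies a crude check for whether a statement is a first-hand experience. It is a toy stand-in rather than the RLytics implementation: the lexicon entries, the regular expressions, and the function names are all hypothetical.

```python
import re

# Hypothetical miniature curated taxonomy:
# verbatim variant or misspelling -> (standardized term, SPEC-R concept)
CURATED_TERMS = {
    "anxious": ("anxiety", "emotional"),
    "anxiuos": ("anxiety", "emotional"),  # common misspelling
    "worn out": ("fatigue", "physical"),
    "exhausted": ("fatigue", "physical"),
    "back hurts": ("pain", "physical"),
    "burden to my family": ("burden on family", "social"),
}

# Crude cues, assumed for illustration only.
FIRST_PERSON = re.compile(r"\b(i|me|my)\b", re.IGNORECASE)
HYPOTHETICAL = re.compile(r"\b(what if|might|could|in case)\b", re.IGNORECASE)


def extract_concepts(narrative: str):
    """Return unique (standardized term, SPEC-R concept) pairs found in the text."""
    text = narrative.lower()
    found = []
    for variant, mapped in CURATED_TERMS.items():
        if re.search(r"\b" + re.escape(variant) + r"\b", text):
            found.append(mapped)
    return list(dict.fromkeys(found))  # drop repeats, keep first-seen order


def is_true_experience(narrative: str) -> bool:
    """True when the text reads as first-hand and contains no hypothetical cue."""
    return bool(FIRST_PERSON.search(narrative)) and not HYPOTHETICAL.search(narrative)


post = "I was given drug X for my AS, but I am really anxious about any potential side effects"
print(extract_concepts(post), is_true_experience(post))
```

A production system would need far richer handling of negation, reported speech, and family-member reporters (e.g., "my son/daughter"); the sketch only shows how variant spellings can resolve to one standardized term.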

For comparison, the same SPEC-R categorization was applied to 5 AS-specific PRO instruments with questions across multiple domains commonly used in clinical studies of patients with AS: ASAS-HI, ASQoL, BASDAI, BASFI, and HAQ-S (81 items from PRO instruments curated, 40 total subconcepts identified) [9,10,11,12,13]. These PRO instruments were qualitatively selected by the study sponsor following an analysis of PRO instruments identified with a basic search on clinicaltrials.gov and PubMed using relevant AS keywords. This was not a formal systematic literature review, but instead a semi-targeted automated search, with a qualitative manual review/selection of these final 5 instruments by the clinical experts/authors of this report.

The ASAS-HI was developed with input from experts and patients as a disease-specific questionnaire based on categories of the International Classification of Functioning, Disability and Health [9, 27, 28]. The 17-item questionnaire addresses categories of pain, emotional function, sleep, sexual function, mobility, self-care, community life, and employment. The ASQoL is an 18-item questionnaire which was also developed based on patient input, and assesses the impact of AS on activities of daily living, fatigue, pain, sleep, independence, relationships, and mood [10]. The BASDAI and BASFI instruments were both designed by medical professionals in collaboration with patients for the rapid self-assessment of disease activity and functional ability in patients with AS [11, 12]. The BASDAI is a 6-item questionnaire that assesses aspects of fatigue, pain, discomfort, and morning stiffness on a visual analog scale [11]. Similarly, the BASFI was designed as an 8-item questionnaire to assess ability to perform daily activities (e.g., putting on socks or reaching up to a high shelf without help or aids) on a visual analog scale [12]. The HAQ-S was modified from the original HAQ by adding 5 items specifically related to patients with AS (driving a car, using a rearview mirror, carrying a bag of heavy groceries, sitting for long periods of time, and working at a desk) to the original 20-item HAQ Disability Index [13, 29].

Questions from existing PRO instruments were classified into the same broad concepts and lower-level subconcepts. For example, one item on the ASAS-HI states, “I often get frustrated” [9]; this would be categorized under the emotional concept, specifically the “anger/frustration” lower-level subconcept. The 5 AS-specific PRO instruments were then compared with key concepts and subconcepts extracted from patient narratives to evaluate the capability of each instrument to capture frequently reported patient experiences and to determine the need to refine existing PRO instruments based on identification of potential deficits.

This article does not contain any new studies with human participants or animals performed by any of the authors.

Results

Data Source and Patient Population

A total of 34,780 narratives from 3449 patients with AS who reported ≥ 1 functional impairment were collected from 52 online sources for assessment of functional impairment (Fig. 1). Of the 34,780 narratives included in this study, 46.6% were collected from general health social networking sites (e.g., DailyStrength.org, MedHelp), 24.9% from disease-specific patient forums (e.g., Spondylitis Association of America, KickAS), 13.5% from general health forums (e.g., Patient.info, eHealthForum.com), 8.3% from treatment review forums (e.g., Askapatient.com, Drugs.com), 6.0% from patient-doctor Q&A sites (e.g., HealthTap.com), and 0.5% from mainstream social media sites (e.g., Twitter, Facebook).

Fig. 1

Source of narratives of patients with ankylosing spondylitis reported in online communities

Of the 3449 patients with AS, information on age and sex was available for 702 patients (20.3%). The median age was 38 years, and 402 patients (57.2%) were female. Among the 1627 patients who had geographic information available, most patients (82.7%) were located in North America, followed by Europe (9.7%), Oceania (4.1%), Asia (2.7%), Africa (0.6%), and Central America (0.2%).

SPEC-R Analysis of Patient Narratives

Of the 34,780 narratives collected from 3449 patients for this analysis, 5.1% of patient narratives were correlated to the social concept, 86.7% to the physical, 32.5% to emotional, 23.6% to cognitive, 8.7% to role activity, and 69.1% to general (e.g., “feeling unwell”; Fig. 2). Within the social concept, lack of independence was expressed by nearly 1 in 4 patients [e.g., “I only wish not to be crippled and live independently (until the) last moment and not to feel ashamed from limited range of mobility”], which was followed by the feeling of being a burden on the family in approximately 1 in 5 patients (e.g., “I feel like a burden to my family and I have trouble meeting people or going out”; Fig. 3). Based on an analysis of lower-level concepts and subconcepts, regaining independence and relieving the family were considered unmet needs that are of high value to patients with AS.

Fig. 2

Patient narratives across all categories of the social, physical, emotional, cognitive, and role activity (SPEC-R) analysis. General concepts consisted of nonspecific narratives (e.g., “feeling unwell”)

Fig. 3

Patient narratives for each lower-level concept of the social, physical, emotional, cognitive, and role activity (SPEC-R) analysis

Pain (75.3%), fatigue (23.2%), and musculoskeletal disorders (22.9%) were among the primary physical concepts reported by patients (Fig. 3); these concepts were associated with general pain, muscle pain/weakness, a feeling of sluggishness, and muscle stiffness [e.g., “(I) hate going to bed because just lying down caused me extra pain”]. Anxiety (58.8%) and associated conditions such as fear and nervousness were the most commonly reported emotional concepts (Fig. 3); patients also reported depression (30.5%), anger/frustration (16.7%), and sadness (14.7%).

Cognitive and role activity impairments were reported by 23.6% and 8.7% of all patients, respectively. Patients reported a wider variety of lower-level cognitive concepts, with < 15% of patients with cognitive impairments reporting mental impairment (13.6%), impulsivity (12.4%), and problems with balance/coordination (12.1%), memory (8.8%), speech (7.1%), and concentration (3.2%; Fig. 3). Role activity concepts and subconcepts of interest were primarily focused around work/school issues involving performance (34.6%), unemployment/dropping out (19.9%), and absenteeism (15.0%; Fig. 3); other issues, such as self-care, parenting, and economic circumstances, were reported by < 15% of patients with role activity impairments.

Analysis of AS-Specific PRO Instruments

None of the instruments used to evaluate AS in this analysis captured all of the major concepts and subconcepts discussed by patients (Table 2). Pain was by far the most commonly reported issue and was discussed by nearly two-thirds (65.3%) of all patients included in this analysis. Notably, pain was also the only subconcept that was covered by all 5 of the PRO instruments included in this analysis (ASAS-HI, ASQoL, BASDAI, BASFI, and HAQ-S). Some of the other most common concepts reported by patients from all narratives, such as asthenia (19.9%), musculoskeletal impairment (19.9%), depression (9.9%), and anger/frustration (5.4%), were effectively captured by ≥ 2 of the PRO instruments. However, commonly reported emotional concepts such as anxiety (19.1%), and cognitive concepts such as mental impairment (3.2%), were not adequately addressed by any of the existing PRO instruments evaluated in this analysis.
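The percentages quoted for all patients in this section are lower than the within-concept percentages shown in Fig. 3 because the denominators differ. The Python sketch below illustrates one reading that reproduces the reported figures: multiplying each within-concept percentage by the share of patients reporting that high-level concept. This relationship is inferred from the published numbers and is not a method stated by the authors. Fatigue is omitted because the same calculation gives approximately 20.1%, which differs slightly from the 19.9% quoted for asthenia.

```python
# Share of patients reporting each high-level concept (%), from the SPEC-R analysis.
CONCEPT_SHARE = {"physical": 86.7, "emotional": 32.5, "cognitive": 23.6}

# Subconcept -> (parent concept, % of patients within that concept), from Fig. 3.
WITHIN_CONCEPT = {
    "pain": ("physical", 75.3),
    "musculoskeletal impairment": ("physical", 22.9),
    "anxiety": ("emotional", 58.8),
    "depression": ("emotional", 30.5),
    "anger/frustration": ("emotional", 16.7),
    "mental impairment": ("cognitive", 13.6),
}

# Percentages of all patients as quoted in the PRO instrument analysis.
REPORTED_OVERALL = {
    "pain": 65.3,
    "musculoskeletal impairment": 19.9,
    "anxiety": 19.1,
    "depression": 9.9,
    "anger/frustration": 5.4,
    "mental impairment": 3.2,
}


def overall_share(subconcept: str) -> float:
    """Estimate the % of all patients as concept share x within-concept share."""
    concept, within = WITHIN_CONCEPT[subconcept]
    return round(CONCEPT_SHARE[concept] * within / 100.0, 1)


for name, reported in REPORTED_OVERALL.items():
    print(f"{name}: computed {overall_share(name)}%, reported {reported}%")
```

For example, 86.7% of patients reported a physical concept and 75.3% of those reported pain, which corresponds to about 65.3% of all patients.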

Table 2 Summary of PRO instrument coverage of key SPEC-R concepts

Discussion

This study collated and analyzed approximately 35,000 unguided patient narratives from online sources to determine which concepts are most commonly reported online in patients with AS, and to evaluate whether these concepts are adequately captured by commonly used PRO instruments. Overall, patients in this analysis exhibited significant physical burden: 86.7% of patients reporting functional impairments described having ≥ 1 physical aspect of the disease, three-quarters of whom discussed pain associated with their disease. Emotional and cognitive concepts represented the next most frequently discussed aspects of the disease, appearing in approximately 1 in 3 and 1 in 4 patients, respectively; this is consistent with known reports of increased risk of psychiatric disorders in patients with AS [30, 31]. Fewer than 10% of patients discussed issues related to the social or role activity concepts. There are multiple possible interpretations of this finding: it is possible that AS may not severely restrict professional and social lives, or that patients simply do not prioritize these aspects of their lives in their discussions online, compared with their experiences of pain, for example, which would impact all aspects of a person’s life.

Overall, the AS-specific PRO instruments were more useful in assessing some of the key physical and social aspects of AS including pain, fatigue, and lack of independence, but many other SPEC-R subconcepts, such as anxiety, mental impairment, and impulsivity, were not captured by any of the PRO instruments used in this analysis. Pain was the only subconcept that was captured by all 5 PRO instruments in this analysis. It was by far the most frequently reported issue in patients with AS (65.3%), and its frequency exceeded that of the next most common physical subconcepts, fatigue and musculoskeletal impairment (19.9% each), by > 45 percentage points. Fatigue and musculoskeletal impairment were each captured by 3 of 5 PRO instruments, including the ASAS-HI and BASDAI for both.

Although emotional problems were the next most commonly discussed concepts associated with AS, current PRO instruments were limited in their ability to capture these issues. Anxiety was reported by approximately 1 in 5 patients and represents a major aspect of the disease for patients with AS; however, none of the 5 PRO instruments were equipped to examine problems related to anxiety. Only 2 of the 5 PRO instruments (ASAS-HI and ASQoL) included items to address depression and anger/frustration, the next most commonly described emotional problems associated with AS. Similarly, although cognitive impairments were reported by nearly 1 in 4 patients in this analysis, none of the PRO instruments included in this analysis captured mental impairment or impulsivity, and only 2 of the 5 PRO instruments (ASAS-HI and HAQ-S) probed issues related to balance and coordination. Social and role activity concepts were not as frequently discussed online by patients with AS compared with the physical, emotional, or cognitive issues; among the most common social and role activity issues reported online, only “lack of independence” was covered by > 1 PRO instrument included in this analysis. These results suggest a gap in using these particular disease-specific PRO instruments, as some aspects of AS that are important to patients are commonly discussed online (e.g., anxiety and cognitive impairments) but are not being adequately addressed by current tools. Other general health questionnaires such as the 36-item Short Form Health Survey (SF-36) [32], the EuroQol 5 Dimensions questionnaire (EQ-5D), and the Work Productivity and Activity Impairment Questionnaire (WPAI) [6, 33] include items pertaining to depression/anxiety, mental health, and/or work productivity, and may be used to complement other AS-specific tools. 
Because all 5 PRO instruments are equipped to capture pain in patients with AS, the choice of PRO instrument (both AS-specific and generic) could be tailored to each situation and guided by its intended use, depending on which aspect(s) of the disease are most important to an individual (e.g., work productivity, quality of life, emotional issues, family burden, or work/school performance). These findings highlight the benefits and potential shortcomings of each PRO instrument, with the use of a particular instrument in clinical practice potentially linked to clinician familiarity with the instrument and/or convenience of its use.
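The idea of tailoring instrument choice to the aspects of disease that matter most can be sketched as a simple lookup. The Python example below encodes only the coverage relationships stated explicitly in the text of this article (not the full Table 2) and ranks instruments by how many priority subconcepts they cover. The function name and the ranking rule are illustrative assumptions rather than a validated selection method.

```python
INSTRUMENTS = ("ASAS-HI", "ASQoL", "BASDAI", "BASFI", "HAQ-S")

# Coverage stated in the text only; subconcepts whose covering instruments
# are not fully named in the text (e.g., fatigue) are deliberately left out.
COVERAGE = {
    "pain": set(INSTRUMENTS),
    "depression": {"ASAS-HI", "ASQoL"},
    "anger/frustration": {"ASAS-HI", "ASQoL"},
    "balance/coordination": {"ASAS-HI", "HAQ-S"},
    "anxiety": set(),
    "mental impairment": set(),
    "impulsivity": set(),
}


def rank_instruments(priorities):
    """Rank instruments by number of priority subconcepts covered.

    Returns (ranking, gaps), where ranking is a list of (instrument, count)
    and gaps lists priorities that no AS-specific instrument covers.
    Raises KeyError for subconcepts missing from COVERAGE.
    """
    counts = {name: sum(name in COVERAGE[p] for p in priorities)
              for name in INSTRUMENTS}
    ranking = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    gaps = [p for p in priorities if not COVERAGE[p]]
    return ranking, gaps


ranking, gaps = rank_instruments(["pain", "depression", "anxiety"])
print(ranking)
print(gaps)
```

Any subconcept returned in `gaps` would need a complementary generic questionnaire, in line with the suggestion above that general health instruments may be used alongside AS-specific tools.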

The innovative study design allowed for the aggregation and analysis of a large amount of patient-level data that would otherwise not be captured in typical observational studies; however, the results should be interpreted within the context of some limitations. Online platforms are largely barrier-free and accessible to the public; however, those patients making the effort to discuss their experiences online may not be representative of other patients with AS because they may be more heavily invested in understanding their disease compared with the overall AS population. Furthermore, patients who have taken the initiative to seek out self-help or other patient organizations are better informed about the disease and may have better function or less work impairment [34]. Because of this potential for the inclusion of a subset of motivated patients, it is important to note that patients (especially the highly active) often participate in multiple conversations, sometimes across multiple websites, with other patients in a short time frame. It is therefore possible that the same patient will report the same experience (e.g., prescription of a tumor necrosis factor inhibitor) more than once within a single “session” online, with a potential risk of double-counting patients, which may, in turn, skew the data. Using posting metadata such as timestamps and structured concepts within narratives, we can reduce the double-counting of patients, and do so in a replicable and scalable way. However, this level of duplication occurs with extremely low frequency and, in our opinion, its impact on the wider analysis is negligible with or without such correction.
Also, although we limited this analysis to English-language narratives, basic demographic information (e.g., age, sex, geographic location) was available for only a portion of patients included in this analysis, while more specific demographic, clinical, and disease characteristics of patients (e.g., employment, insurance, comorbidities, disease activity/function, treatment) were largely unknown; therefore, it is difficult to compare the findings of this analysis to those of other observational studies, or to generalize the findings to larger AS populations or specific geographic locations. In addition, we did not have detailed information on employment or relationship status of patients included in this analysis, making it challenging to interpret how much AS truly impacted social or role activity concepts, or how relevant these concepts were to individual patients. Among those patients whose demographic information was available, a higher proportion included in this analysis were female, with a median age of 38.0 years. AS is often associated with male predominance, with the ratio of men to women ranging from 3:1 to 2:1 [35]; however, the higher female prevalence in this analysis (57.2%) is consistent with observations that females generally have a greater online presence compared with males. For example, data collected from PatientsLikeMe (as of January 2018) showed that 70.0% of members with AS were female [25]. The median age of patients in this study was consistent with self-reported data from PatientsLikeMe and similar to the age of patients enrolled in many other observational studies of AS [22, 36,37,38]. Additionally, all statements and experiences were based on self-reporting without any way to verify their accuracy; therefore, it is possible that some aspects of the disease may be overestimated or underestimated relative to the general AS population, and/or may be affected by other conditions or side effects of treatments.
Lastly, this analysis focused on 5 AS-specific PRO instruments that are commonly used in clinical practice. Other disease-specific questionnaires and general instruments commonly used in clinical studies, such as the Bath Ankylosing Spondylitis Global Score [39], the Ankylosing Spondylitis Disease Activity Score [40], the SF-36 [32], the WPAI [6, 33], and the EQ-5D, may cover different subsets of issues faced by patients with AS than those identified in the present analysis.

Conclusions

This is the first comprehensive analysis of real-world patient experiences aggregated from multiple online platforms and social media sources. This analysis used a natural language processing platform and manual expert curation to extract information about issues relevant to patients with AS from unstructured online narratives. Our study shows that patients are proactively discussing their AS experiences online and generating large volumes of data, which may serve as a supplement to other costly and time-consuming recruitment initiatives to collect patient-reported data [41]. These results confirm the high unmet need in patients with AS and provide additional insights into patient-reported disease burden and functional burden. Patients with AS do not report on their disease in the same format as traditional PRO instruments (i.e., a Likert scale), and this analysis cannot serve as a replacement for traditional PRO approaches; however, these data do complement existing strategies (e.g., surveys and focus groups) and help to further probe the issues that patients find most relevant to their disease and daily life, particularly anxiety and mental impairment.