Background

Bipolar disorder is a leading cause of poor health worldwide, with a lifetime prevalence of 2.4% [1]. The results of treatment are often suboptimal: 60–85% of patients receiving treatment experience at least one manic or depressive relapse in 4–5 years, and residual symptoms, functional disability, and cognitive impairment between relapses are common [2]. At least one third of patients do not respond to their treatment [3]. Poor adherence to medication affects 20–60% of patients [4] and is associated with hospitalizations, recurrent mood episodes, and an increased risk of suicide attempts [5]. The deficit in adherence is comparable to that seen for other common chronic conditions, where it is estimated to range from 35% to 72% [6]. The symptoms of the disease and the effects of medicines can vary by patient and over time, and they are often unpredictable. For many patients, treatment based on trial and error is unavoidable.

Medicine optimization is a framework targeting many of the reasons for suboptimal treatment. The approach is defined as “a person-centered approach to safe and effective medicine use to ensure people obtain the best possible outcomes from their medicines.” [7].

It should be supported by appropriate tools, such as patient decision aids; clinical decision support; and self-management, monitoring, and communication systems [7]. Medicine optimization integrates the patient’s values and preferences with the best available evidence, measurement of outcomes, and shared decision-making [8].

Importantly, none of the technologies deemed essential to medicine optimization have unequivocally been shown to improve patient health. Patient decision aids can increase general knowledge, enhance the proportion of patients that select a treatment consistent with their values, foster more involvement, and improve communication [9]. However, no significant effect on health outcomes has been found. Clinical decision support systems might improve morbidity outcomes and inappropriate prescriptions but do not significantly reduce adverse events or mortality [10,11,12]. Electronic reminders can improve short-term medication adherence, but 61% of m-health chronic disease management systems and adherence tools do not significantly improve clinical outcomes [13, 14].

One reason for these modest health benefits might be that the technologies exist in separate silos: one for patients and a different one for clinicians. Systems involving both the patient and the clinician are more likely to succeed [15], but only a small minority of decision support systems include patient decision aids, and even fewer allow patients and physicians to jointly interact with the system [16]. To adequately inform clinicians, systems should include real-time monitoring and evaluation [17].

Current prescription guidelines provide additional requirements for a merged system; the information must be evidence-based, and contraindications and potential adverse effects must be considered. Outcomes important to the individual patient should guide the assessment of treatments, and systematic monitoring and review should be facilitated [18, 19]. Adherence, symptoms, and other medical data should be monitored, enabling treatment assessments and decisions to be informed by longitudinal data [20].

Suboptimal medicine use is not the only reason bipolar disorder patients have life expectancies 10 to 20 years shorter than the general population. Other prominent factors are poor lifestyle behaviors, including poor diet, a lack of exercise, and smoking. A lack of coordinated care and management and limited social support also figure among the causes [21]. To holistically support the optimization of treatment, an application should therefore focus on medicines and other interventions, such as a healthy lifestyle and social support.

The aim of this paper is to describe the development and usability testing of a comprehensive health-optimization system for patients with bipolar disorder.

Methods

Theoretical frameworks

In shared decision-making [22], patients and healthcare personnel weigh the pros and cons of treatment options and try to find the option most compatible with the patient’s personal preferences [22, 23]. The main principles of shared decision-making are operationalized in and supported by patient decision aids, a broadly defined tool expected to comply with more than 40 internationally agreed upon requirements [24, 25].

The second framework applied in this research was multi-criteria decision analysis (MCDA) [26], a sub-discipline of operations research. MCDA approaches are increasingly being used to optimize healthcare decisions [27].

MCDA partly overlaps with shared decision-making. An MCDA-based approach that has been extensively researched was the original conceptual and practical stimulus for this work [28,29,30,31]. The MCDA framework was used in design, to model the system database, to integrate input from multiple sources consistently, and to connect the main system modules into a coherent whole.

Single-subject research designs (SSRD) are used to establish whether a treatment is effective for an individual rather than for a group [32] and served as a third framework. This type of design enables patients and clinicians to minimize the element of trial and error in finding the optimal treatment and ensures that this process is systematic and effective. Two main tools within SSRD, also called “n = 1 trials,” are visual inspection and statistical analysis tools. As treatments are added or removed, visual inspection of graphs allows the viewers to gauge the effects of the changes on the patient’s health. More than 400 studies have been conducted using SSRDs [33].
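As a minimal illustration of the statistical side of an n = 1 trial, the sketch below summarizes a patient's self-reported scores per treatment phase, so that phases with and without a treatment can be compared. The function name `phase_means`, the phase labels, and the scores are hypothetical; the analyses actually implemented in the system are not specified here.

```python
from statistics import mean


def phase_means(observations):
    """Summarize consecutive treatment phases in a single-subject series.

    observations: chronologically ordered (phase_label, score) pairs.
    Returns one (phase_label, mean_score, n_observations) tuple per
    consecutive run of the same phase label.
    """
    summary = []
    current, scores = None, []
    for phase, score in observations:
        # A change of label closes the previous phase.
        if phase != current and scores:
            summary.append((current, mean(scores), len(scores)))
            scores = []
        current = phase
        scores.append(score)
    if scores:
        summary.append((current, mean(scores), len(scores)))
    return summary


# Hypothetical daily symptom scores (higher = worse), not study data.
series = [
    ("baseline", 6), ("baseline", 8),
    ("treatment", 4), ("treatment", 2),
    ("baseline", 7),
]
for phase, avg, n in phase_means(series):
    print(f"{phase}: mean={avg}, n={n}")
```

Keeping repeated phases separate (rather than pooling all "baseline" days) preserves the order of withdrawals and reintroductions, which is what visual inspection of the corresponding graph relies on.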

Requirements

The main objective was to create a system enabling long-term follow-up and optimization of treatment and health based on the frameworks. The project was initially a work package in an innovation and research program created to improve suboptimal clinical practices [34].

Formally starting in 2010, the requirement analysis process included interviews and discussions with patients, health professionals, hospital ICT departments, and researchers investigating clinical decision support and patient decision aids. We performed an analysis of personalization in current patient decision aids [35], reviewed existing solutions, performed focus group interviews eliciting users’ needs [36] and reviewed current MCDA-based decision dashboards [37,38,39].

A detailed analysis and description of how the system should work in clinical practice was developed [40]. In summary, “the system will assist the evaluation, monitoring, selection, and follow-up of pharmacotherapy in chronic disorders. It will be used repeatedly to improve decisions over time and visualize trends in crucial decision components. It will continuously provide opinions about what constitutes the best decision – calculated from continuously updated, personalized evidence, and the individual’s preferences. The system will be designed as a clinical decision support system, a self-management system, and a tool for communication and reflection” (minor amendments made for readability).

In accordance with software requirements used in Norwegian hospitals, the system was constructed and is maintained on Windows Server 2012 R2 using the programming languages C#, JS, and the database Azure SQL Server 2016. For details, see Additional file 1.

The main programmer and the project leader analyzed the initial requirements and constructed a functional design description with a sufficient level of detail for system design. New descriptions were added, discussed, and improved in the project management tool JIRA® throughout the project (Additional file 2).

As of January 2017, nearly all features of the system pertained to one of the following six health-optimization strategies: find the best, safe treatment; find the best dosage; increase adherence; live more healthily; get support from professionals; and improve the decision process.

Additional file 3 presents the features supporting each strategy.

Development

Information development methods

To develop evidence-based and user-centric descriptions of options, outcomes, expected performances, and background information, we performed a systematic review and conducted focus group interviews with people with bipolar disorder. First, we used patient input to create descriptions of outcomes [36, 41]. Ratings for all treatment options were then calculated from estimates in a network meta-analysis and from surveys conducted by the research team among patients and clinical experts [42]. Descriptions of options included contraindications and were developed from summaries of product characteristics (SPCs) and the drug database MicroMedix. Descriptions of bipolar disorder, the decisions targeted in the system, and shared decision-making incorporated patients’ views from the focus groups, all processes described in detail elsewhere [36]. The texts were reviewed by patients and clinicians and refined in multiple iterations.

Usability evaluation

Participants and settings

From August 2014 to August 2016, 78 potential users participated in one or more usability testing sessions: 39 laypeople, 23 patients, 5 nurses, 2 general practitioners, and 9 psychiatrists. So as not to burden patients unnecessarily and to develop a generic platform applicable to different conditions and users, features deemed not to be condition-specific were tested with laypeople, and identified issues were corrected before testing with patients and clinicians.

Patients were recruited from the Facebook page of the Norwegian patient organization and via Bipolar UK. Inclusion criteria for patients were: 1) aged between 18 and 65, 2) a diagnosis of bipolar disorder, and 3) seeing a general practitioner or psychiatrist regularly. Clinicians were recruited from Diakonhjemmet, Innlandet hospital trusts and the primary health care service in Norway. Laypeople were mainly recruited from Facebook and a local college in Norway. From mid-2015, all participants received a gift card after test completion equivalent to USD 40.

Methods for evaluating usability

Formative usability testing

In formative usability tests, qualitative reactions to user interface concepts and designs are obtained [43, 44]. Guided by the conceptual framework of user-centered design, each test was conducted using a rigorous process that began with reading aloud a predefined script with information and instructions. Tasks and scenarios relevant for the core functionalities (Additional file 2) were developed by the researchers, designer, and QA personnel and piloted with at least one person [45, 46]. Participants were instructed to think aloud during the tests. The tests were performed with one individual at a time; all tasks were neutrally presented, and no assistance was given. Voice and screen activities were recorded and later analyzed by the team. After the session, we asked the users about any difficulties encountered and whether they had suggestions for improvement. A list of issues was maintained and updated after each session, and issues were prioritized in accordance with the number of users experiencing each problem. Table 1 summarizes the phases of testing.

Table 1 Testing phases

System usability scale

The System Usability Scale (SUS) allows evaluation of platforms and software [47] and is considered an industry standard for measuring usability [43, 48, 49]. The average SUS score for 500 products is 68 [43]; a product with a SUS score above this level is often referred to as “good,” whereas a score above 85 is considered “excellent” [48, 50].
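For readers unfamiliar with how a SUS score is derived, the sketch below applies the standard scoring rule for the ten-item questionnaire: each item is answered on a 1–5 scale, odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name `sus_score` and the example responses are illustrative only.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for ten item responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            total += response - 1   # odd items: positively worded
        else:
            total += 5 - response   # even items: negatively worded
    return total * 2.5


# Hypothetical respondent, not study data.
print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 2]))
```

Because each item contributes 0 to 4 points, a respondent who answers "3" throughout scores 50, which is why SUS scores should not be read as percentages.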

Summative usability testing

In summative usability testing, a product is quantitatively evaluated with representative users and tasks to measure usability, efficiency, and satisfaction [43, 51]. We piloted the summative test with seven patients and then clarified the instructions and tasks given. Eight scenarios and tasks with expected completion times identical to those of a healthy, advanced internet user were presented to five patients. Task completion rate and the number and types of errors were measured. Each participant completed a questionnaire (Additional file 4) and the SUS scale (Additional file 5).

Long-term pilot

Two doctor-patient dyads tested the system for four weeks. Participants provided SUS scores three times and answered a questionnaire developed by the researchers (Additional file 4).

Statistical methods and analysis

Standard descriptive statistical methods were used to describe the participants’ demographics and to summarize the data. SUS scores were calculated according to the SUS scoring manual [48]. To compare SUS scores between the different user groups, a one-way ANOVA test was used (IBM SPSS Statistics 22.0, NY, USA).
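The group comparison was run in SPSS; purely as an illustration of what a one-way ANOVA computes, the sketch below derives the F statistic from the between-group and within-group sums of squares using only the Python standard library. The function name `one_way_anova_f` and the example scores are hypothetical, and the p-value (which requires the F distribution) is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical SUS scores for three user groups, not study data.
patients = [70.0, 82.5, 65.0, 77.5]
laypeople = [72.5, 60.0, 80.0]
providers = [67.5, 75.0, 85.0]
print(one_way_anova_f([patients, laypeople, providers]))
```

An F value close to 1 indicates that the group means differ no more than would be expected from the spread within groups.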

Comparative assessment

Two authors (KN and ØE) identified modules in the system mathematically integrated by the MCDA-based algorithm and modules providing support for n = 1 trials. The modules in the system (Fig. 1) were compared to the AGREE and IPDAS requirements for clinical practice guidelines and patient decision aids, respectively (Table 2).

Fig. 1

System modules. Simplified overview of the system as of June 2017 from the end users’ perspective as a loop of collecting and personalizing information, then using decision support panels to gauge the results and select interventions. The results of those interventions are then collected and a new loop starts

Table 2 Comparison to clinical practice guidelines and patient decision aids

Results

Description of the system

Fig. 2 presents the health-optimization system in context.

Fig. 2

System in context. Overview of system use in optimizing the patient’s health as of June 2017. Patients collect data on their smartphones; these data are integrated with default data from research. The patient and doctor use three types of decision support panels presenting the processed data to make informed decisions on health-promoting interventions together

Functionality

In the authoring suite, authors and co-authors invited by authors can create systems for any long-term condition, for instance bipolar disorder. Authors can tailor the system using different functionalities available in the suite. This condition-specific system is personalized by the individual patient and/or the clinician on the personal website. In the smartphone app, patients enter data about their health state and compliance with interventions and can receive reminders for adhering to medicines and lifestyle measures. On the website, patients and clinicians can inspect decision support panels and statistics summarizing and visualizing the patient’s data, integrated with research, personalized according to the patient’s personal priorities, and adjusted for uncertainty (Fig. 2, Additional file 6). Healthcare personnel and other collaborators can be invited by the patient to use the system and contribute according to rights granted by the patient. Table 3 presents the main features of the system.

Table 3 Main optimization strategies and system features

In chronic conditions with a strong evidence base, all modules in the system are potentially relevant (Additional file 7).

Additional file 8 provides examples of user interfaces.

Textual and numerical descriptions

MCDA prescribes that to make a decision, decision-makers need information about which options are available and which outcomes are relevant, as well as descriptions of the options and outcomes and the expected performance of all options on all outcomes. Overall, 143 dedicated pages containing numerical and textual descriptions of 17 treatment options, 7 outcomes, and 119 ratings were developed. Sixty additional pages present the condition, decision, shared decision-making, and theoretical frameworks. Users can switch between an English and a Norwegian version.

Results from the formative usability tests

Formative usability tests with a total of 69 participants were performed between October 2014 and July 2016 (Additional file 9). In all, 82 usability issues of varying scope and importance were identified (Additional file 10). Overall, 52% of all issues concerned layout and 28% concerned information delivery. SUS scores collected during this period are presented in Additional file 9.

Feedback on the first version resulted in three major redesigns.

Patients were not satisfied with the system being limited to optimization of medication and wanted a more complete disease management system. The patients’ demands led to the development of adherence support, functionality for selecting and monitoring lifestyle measures, the possibility to enter future therapy and support group appointments, and easy access to their detailed treatment plan.

By late 2016, it became clear that many patients felt overwhelmed when first using the site because of the new functionalities. After redesign, features are no longer presented to the patient unless selected specifically.

As of May 2017, more than 3,600 improvements and bug fixes have been implemented.

Results from the summative usability tests

All participants (n = 5) were women, and the mean (± SD) age was 38 (±7) years. Three had bipolar disorder type I, one had bipolar disorder type II, and one did not know the subtype. All patients had completed at least 12 years of education and reported that their internet literacy was good or very good. The mean ± SD (median) SUS score from the summative test was 78 ± 18 (75). With one exception, all participants completed all tasks on time. Participants committed no errors during the 37 tasks completed overall. All the participants strongly agreed that the system helped them to find the treatment option most in accordance with their preferences.

Mean system usability score

SUS scores were collected from 19 patients, 11 laypeople, and 8 healthcare providers. The mean ± SD (median) SUS score for all usability tests performed in 2015 and 2016 was 71 ± 18 (73). Fig. 3 presents the median SUS scores per user group. The SUS scores did not differ significantly between groups (p = 0.626).

Fig. 3

Boxplot of System Usability Scale scores based on roles. Median system usability scores for patients, laypeople and healthcare personnel

The mean ± SD (median) SUS score from the summative test was 78 ± 18 (75), suggesting above-average usability and a “good product” [48].

Piloting of formative, long-term usability testing

One doctor-patient dyad in general practice and another in an outpatient clinic tested the system for four weeks. A registered nurse (SK) who knew the system well taught patients and doctors how to use the system. During consultations, patients and clinicians used the panels to assess the patient’s state, evaluate medications and their dosages, and reflect on priorities and decisions. The second dyad experienced a firewall problem in the hospital, and an iOS update caused the patient’s app to freeze. Both dyads reported the system to be helpful and were generally positive.

Qualitative feedback

Participants were generally supportive of the health-optimization system. Patients and physicians were particularly satisfied with the panels enabling comparison of options, the possibility to explore how different importance weights influenced the ranking of options, and the Timeline graphs. Here is a sample of quotes from the formative usability testing:

“Brilliant tool. I especially liked how it improved communication with my doctor” (patient)

“The best app I have tested for my condition. This system will be useful for many doctors and patients” (patient)

“The system gave our conversation a head start and helped the patient and me concentrate on what was most important to her” (doctor)

Comparative assessment

Twenty-two out of the 24 modules in the system could be integrated by means of MCDA-based algorithms (Additional file 7). This finding resonated with the authors’ experience that the MCDA framework accommodated and helped resolve most issues that appeared during development.

Thirteen of the 24 main modules were relevant for n = 1 trials in everyday clinical practice (Additional file 7). As the effects of treatments in bipolar disorder can take months or years to establish, graphs can be displayed for different time periods. Statistical summaries of results and monitoring fidelity are provided for all treatment plans (Additional file 8).

The AGREE and IPDAS quality criteria for clinical practice guidelines and patient decision aids together require only 5 out of 23 core features in the health-optimization system (Table 2).

Discussion

Principal findings

We have developed a shared digital platform equipping people with bipolar disorder and their clinicians and caretakers with 21 features supporting the selection of and adherence to health-optimizing interventions. To the best of our knowledge, the system is the first MCDA-based patient and clinician decision aid that integrates longitudinal patient data. The feedback from patients and clinicians has been positive, with high satisfaction levels and perception of usefulness.

The system now includes support for communication and coordination, facilitation of timely decisions, support for the patient’s self-care, and improved compliance [52]. SUS scores and qualitative feedback from patients and clinicians indicate that it might be feasible for patients and clinicians to use the system to collaborate in optimizing treatment.

Results in context

Like the health-optimization system, clinical practice guidelines and patient decision aids aim to support evidence-based practice, commonly defined as “making decisions about how to promote health or provide care by integrating the best available evidence with practitioner expertise and other resources and with the characteristics, state, needs, values, and preferences of those who will be affected” [53]. However, the three systems differ significantly; the vast majority of components in the health-optimization system are not part of internationally agreed upon requirements for these two genres.

For instance, clinical practice guidelines and patient decision aids do not integrate patients’ weighting of outcomes with estimated treatment effects into a mathematically valid ranking of treatments presenting their expected value for the individual. Although a ranking of treatments is often provided in guidelines, this order is not personalized to the individual based on his or her individual characteristics and weighting of outcomes (preferences). Patient decision aids generally do not provide a ranking; rather, they leave the integration of these preferences and the estimated treatment effects to the cognitive abilities of the patient and clinician, with no patient-specific data. The system contrasts sharply with both approaches by integrating all available information into an overall expected value score.
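To make the idea of an overall expected value score concrete, the sketch below uses a weighted-sum model, one of the most common MCDA aggregation rules: each option's ratings on the outcomes are multiplied by the patient's normalized importance weights and summed. This is an assumption chosen for illustration; the aggregation rule and uncertainty adjustment actually used in the system are not described in this section. The function name `rank_options`, the option and outcome labels, and all numbers are hypothetical.

```python
def rank_options(ratings, weights):
    """Rank options by weighted-sum score.

    ratings: {option: {outcome: rating on a common 0-1 scale}}
    weights: {outcome: importance assigned by the patient (any positive scale)}
    Returns a list of (option, score) pairs, best first.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalize so the weights sum to 1 and scores stay on the 0-1 scale.
    norm = {outcome: w / total for outcome, w in weights.items()}
    scores = {
        option: sum(norm[outcome] * perf[outcome] for outcome in norm)
        for option, perf in ratings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ratings (1 = best possible performance), not study data.
ratings = {
    "Option A": {"relapse_prevention": 0.8, "weight_stability": 0.3},
    "Option B": {"relapse_prevention": 0.5, "weight_stability": 0.9},
}

# Two hypothetical patients with different priorities.
print(rank_options(ratings, {"relapse_prevention": 3, "weight_stability": 1}))
print(rank_options(ratings, {"relapse_prevention": 1, "weight_stability": 3}))
```

With identical ratings, the two weightings produce opposite rankings, which illustrates why a ranking that ignores individual preferences cannot be assumed to hold for a given patient.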

The health-optimization system differs from traditional patient decision aids in additional regards. First, patient preferences – the relative importance of the outcomes – are elicited to reflect the full and actual range of what to expect from the options [54]. Second, the patient and clinician use identical decision support panels on a common platform. Third, all relevant information from the past and present, from heterogeneous sources, is used to provide a ranking of the treatments.

Strengths and limitations

Only convenience samples of patients participated in the various parts of the study. The SUS scores concerned different versions and components of the system, were collected at different time-points and should therefore be interpreted with caution.

The number of participants in the summative test was small. Paradoxically, testing with large groups of patients may reduce the quality of the results [43]. The ideal group size for usability testing is debated and depends on the product and the purpose of testing [55]. Usability testing with five people has been found to reveal 80–85% of issues identified in larger groups and has been referred to as a “magic number” [43, 56].
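The 80–85% figure is usually explained with a simple problem-discovery model in which each participant independently detects any given issue with probability p, so that n participants are expected to reveal a share 1 − (1 − p)^n of the issues. The sketch below evaluates this model; the value p = 0.31 is the average detection rate commonly quoted in the usability literature and is an assumption here, not a quantity estimated in this study.

```python
def expected_share_found(n_users, p_detect=0.31):
    """Expected share of usability issues revealed by n_users participants.

    Assumes each participant detects each issue independently with
    probability p_detect (0.31 is an assumed, commonly quoted average).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    return 1.0 - (1.0 - p_detect) ** n_users


for n in (1, 3, 5, 10, 15):
    print(n, round(expected_share_found(n), 3))
```

Under this assumption, five participants are expected to reveal roughly 84% of the issues, and each additional participant adds progressively less; products with rarer or more heterogeneous problems (lower p) need larger samples.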

Participants in the study were highly motivated and generally computer-literate. Thus, the generalizability of the usability results to the general patient population, particularly among people with low digital literacy, is uncertain.

Implications for future research

Many m- and e-health technologies have substantial technical and conceptual shortcomings, limiting their potential as health-optimization technologies. There is therefore a need for innovative systems and for rigorous research evaluations investigating their effects on health outcomes. The same platform and functionalities used to create a system for bipolar disorder are currently being used to develop long-term health-optimization systems for patients with chronic conditions, such as HIV, heart disease and COPD.

Conclusions

Partly based on current technological genres, we have produced an evidence-based system for the optimal selection of and adherence to interventions in bipolar disorder. The results of feasibility testing are generally positive. If the system is found to improve patient-important outcomes in future research, clinicians might consider prescribing the use of a health-optimization system as a companion to the treatment itself.

Availability and requirements

  • Project name: A health-optimization system for chronic disorders

  • Project home page: https://decidetreatment.org

  • Operating system(s): Platform independent

  • Programming language: C#, JS

  • Other requirements: Windows Server 2012 R2, C# 6.0, EntityFramework 6.1.3, HTML 5, CSS 3.0, web browsers from the past three years.

  • License: Prototype license: MIT Expat, Creative Commons Attribution 4.0 International

  • Any restrictions to use by non-academics: License needed