Introduction

Digital apps are burgeoning in mental healthcare, with tens of thousands available to download. Despite this booming billion-dollar mental health and wellness app industry, there is a lack of serious discussion about the quality of evidence, including trial design, on which digital therapeutics are based [1•]. In clinical research, the randomized placebo-controlled trial is considered the gold standard for evaluating treatments. Unfortunately, when it comes to interpreting the nature and function of placebos, all that glitters is not gold, and clinical research is beset with conceptual confusions that undermine the robust design of controls [2, 3]. These confusions directly generate downstream problems for appraising the effectiveness of treatments in clinical trials. Across treatment modalities, poor design compromises the validity of inferences that can be made about the treatments under scrutiny. This review explores the nature of placebos and the challenges of designing adequate placebos in pharmacology and psychotherapy, and applies the lessons learned to the domain of digital mental healthcare.

What Are Placebos?

The term “placebo” has two distinct meanings (see Table 1). The first refers to treatments that may be offered to patients in clinical settings even though the clinician does not believe the intervention will be effective for the presenting symptoms. These kinds of clinical placebos are often demarcated into “pure” and “impure” categories. The former includes sugar pills (typically, microcrystalline cellulose) or ‘inert’ saline creams; the latter are interventions that may have potent effects, but not for the patient’s presenting ailments: for example, prescribing vitamin pills as a remedy for itchiness, or antibiotics to treat viral infections. Placebos may be prescribed to instill hope or comfort, as a means for practitioners to save face, or as a method of soliciting beneficial placebo effects (of which more shortly) [4]. Despite ethical concerns associated with their usage (namely, provider deception and diminished patient autonomy), studies show that deceptive placebo prescribing is not uncommon in ambulatory contexts [5].

Table 1 Definitions

The second, distinct usage of the term placebo arises in methodological contexts, where placebos, properly understood, are instruments for measuring the effectiveness of a treatment within randomized controlled trials (RCTs). Adequately designed, placebos control for a wide variety of noise arising in both the treatment and placebo arms of clinical trials, including natural history (over time, some people will get better), Hawthorne effects (people may behave differently when enrolled in trials, knowing they are being monitored), response biases (people unintentionally report inaccurate outcomes, e.g., by subconsciously attempting to please investigators or reporting what they believe they are expected to say), and placebo/nocebo effects (health changes that arise from psychobiological mechanisms relating to the expectancies that treatments will be effective, or harmful) [3]. Combined, this undifferentiated amalgam of potential outcomes is referred to as the “placebo response” [6].
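
A simplified additive decomposition makes the logic of placebo control explicit (an idealization with illustrative symbols; real responses need not combine additively):

$$\bar{Y}_{\text{verum}} = \delta + (h + e_H + e_R + e_P), \qquad \bar{Y}_{\text{placebo}} = h + e_H + e_R + e_P$$

where $\delta$ is the specific treatment effect, $h$ is natural history, and $e_H$, $e_R$, and $e_P$ stand for Hawthorne effects, response biases, and placebo/nocebo effects; the parenthesized sum is the placebo response. The between-arm difference recovers $\delta$ only if every non-specific component is matched across arms, which is precisely what adequate placebo design aims to achieve.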

To effectively screen out this noise, participants and researchers should ideally be unable to guess whether recruits are allocated to the treatment or the placebo—if they can, this is referred to as “breaking blind.” This means, in turn, that the placebo should be indistinguishable from the specific treatment (the “verum”) being investigated. Indeed, the delivery of treatments, including subtle cues embedded within provider communication, can interfere with outcomes; even aesthetic factors associated with the context of care, including the provider’s treatment room and attire, can influence expectations [7•, 8]. Careful attention to the design and administration of placebos in clinical trials is therefore needed. A placebo is not merely a kind of thing, but a measuring tool whose design must be well considered. For example, if the treatment is a drug delivered in a tablet that is red and round, with an acidic flavor, then ideally the placebo should be devised, as far as possible, to mimic the appearance, taste, and smell of the tablet containing the real drug. The key difference is that the placebo should not contain the active drug ingredient.

Notice that due consideration of placebos requires that investigators decide, in advance of designing the trial, what they hypothesize the locus of the treatment to be. Notably, this does not mean they must offer a mechanistic account of why the treatment works, only that they identify the constituent which they consider to be potentially therapeutically effective for a given set of symptoms [2].

However, confusions surrounding placebo concepts reveal that clinical investigators often assume placebos in RCTs can be understood merely as “inert” treatments, such as the sugar pills or saline injections that may occasionally be offered in clinical settings [2, 9]. Demonstrating this lack of awareness of the importance of placebo design, placebo characteristics are reported in fewer than 10% of drug trials [9], and one review of 36 clinical trials found that 44% of placebo controls in pharmacology trials were not matched to the active intervention in terms of physical properties [10].

Placebos in Psychotherapy

While there are considerable design challenges associated with devising placebos in pharmacology, the problems are even greater in clinical psychology [11, 12]. There are currently hundreds of versions of psychotherapy, each with highly divergent, specific treatment components; however, all versions of therapy typically share what are referred to as “common factors,” including an empathic practitioner, a rationale conferred for the therapy, and the provision of positive cues and expectations by the practitioner that the treatment will be successful [13]. Controlling for common factors, which may influence effect sizes, poses a major problem for investigators. Ensuring double blinding is another major concern: in psychotherapy trials, practitioners know whether they are delivering a “verum” treatment to participants, who, in turn, may readily surmise that they are receiving a bona fide treatment and not a placebo [14]. Relatedly, researcher allegiance to particular psychotherapy modalities influences effect sizes, perhaps via the communication of expectancies [15].

Regrettably, as in pharmacology research, the need for robust placebos as instruments to adequately screen out noise and measure effect sizes is still poorly grasped by investigators [11]. For example, even among leading researchers, placebos are often conceived as “passive controls”—that is to say, patients are allocated to waitlists, usual care, or no treatment [16]. However, passive controls do not constitute placebo controls, and studies show that participants allocated to waitlists often experience a worsening of symptoms, perhaps as a result of nocebo effects [17]. Alternatively, psychotherapy researchers sometimes describe placebos as “active controls,” “attention controls,” or “non-directive controls” [16]. Examples include relaxation training, leisure reading, or talking with a practitioner about hobbies, daily events, books, or movies. Treatments compared against these kinds of controls produce smaller relative effect sizes than treatments compared against waitlists or no-treatment controls [18], likely because such controls partially capture placebo responses, narrowing the difference between arms. Notwithstanding, conceived in this way, placebos in psychotherapy still risk overestimating treatment effect sizes because they do not yet adequately control for the full composite of placebo responses.

Designing adequate placebos in psychotherapy is therefore a major methodological concern that should be squarely acknowledged [11]. Although there are no straightforward solutions, one approach is to design placebos so that they are—as far as possible—structurally equivalent to the treatment under scrutiny. For example, placebo interventions can be matched to the treatment in the number of sessions, the duration of sessions, the format, the provision of a convincing rationale, and whether patients discuss topics that appear consistent with the treatment or placebo rationale [12]. Patients could also be asked to guess their treatment allocation, to better assess the extent to which participants break blind and, thereby, the adequacy of the placebo control. Relatedly, clinical researchers should be blinded to treatment allocations, and, where feasible, practitioners could be blinded to the trial rationale.

Placebos in Digital Mental Health

Digital therapeutics include a variety of internet- and app-based technologies aimed at helping patients to manage or treat health conditions. Many of these interventions attempt to automate specific psychotherapy modalities by translating face-to-face interventions into self-guided therapy programs, such as internet-based cognitive behavioral therapy (CBT). However, as in other domains of healthcare, the quality of evidence depends on the adequacy of the controls that are implemented, and, unsurprisingly, the challenges associated with devising placebos in pharmacology and psychotherapy extend to the design of placebos in digital mental healthcare.

A narrative review conducted in January 2022 of digital therapeutics listed on the DTx Alliance product list that did not involve physical devices (e.g., wearables and sensors) identified a lack of placebo literacy among researchers in describing and naming control conditions [1•]. Analyzing a total of fourteen RCTs, Lutz et al. found that half used unblinded waitlist or treatment-as-usual controls while the remainder used some form of sham control. However, even among the studies that attempted to devise placebo controls, most lacked a clear description of the nature of the control. Relatedly, the authors noted that control condition terminology varied between studies, with no explicit discussion of how design choices were motivated. Furthermore, no study reported the blinding checks recommended by the Consolidated Standards of Reporting Trials (“CONSORT”) statement [19]. As such, it was unclear whether investigators or patients were blind to participant allocation. Taken together, it is unclear whether effect sizes were attributable to the specific therapeutic constituents of the treatments under scrutiny or to placebo responses [20].

A recent FDA-approved virtual reality intervention for lower back pain, embedding a CBT-based program, offers another instructive example of the limitations of current digital placebo design [21]. The study protocol included a placebo control and blinded participants and study statisticians to treatment assignment. Participants allocated to the verum treatment (“EaseVRx”) and the placebo (“Sham VR”) were each instructed to complete one virtual reality session daily for a total of 56 days, with investigators noting that the average duration of sessions was closely matched between the two groups. These measures went further than most digital health RCTs in attending to placebo design. The authors concluded that, on average, participants allocated to EaseVRx experienced higher user satisfaction, and superior and clinically meaningful symptom reduction for pain intensity and pain-related interference with activity, mood, and stress, compared with those allocated to sham VR.

However, on closer inspection, deficits in the design of the placebo control, and a lack of clarity about why design choices were made, undermine concrete inferences about the effectiveness of the intervention. There is a conspicuous lack of clarity about the hypothesized “engine of treatment” of EaseVRx: namely, whether the researchers consider the internet-based CBT component, the VR delivery, the design of the interface, or some combination of these factors to be therapeutically decisive. As it stands, the study does not isolate and control for which of these potentially remedial factors makes a difference, further constraining inferences about which aspects of the verum might be effective.

Examining the delivery of the treatments, EaseVRx comprised 3D images with immersive skills training, including psycho-educational videos and a rich explanation of the purported rationale for the treatment. In contrast, sham VR comprised only 2D nature footage with “neutral music” selected to be neither “overly relaxing, aversive, nor distracting,” with no educational instruction or rationale for the footage proffered to participants. Interface design also diverged between treatment allocations: aside from the immersive CBT exercises, patients in the EaseVRx condition viewed “high-resolution 360 videos with therapeutic voiceovers, music, guided breathing, and sound effects designed to maximize the relaxation response and participant engagement.” Since the quality of the design, and what is communicated about the intervention, may subtly influence responder biases and contribute to participants breaking blind, the variety of measures taken in the EaseVRx arm might well have augmented patients’ expectations about the effectiveness of the treatment (thereby enhancing placebo effects) and influenced patient health behaviors (so-called Hawthorne effects), boosting placebo responses.

It is important to note that these kinds of limitations are not unique to the EaseVRx study. From digital games and mindfulness apps [22] to smartphone-based apps for schizophrenia [23], a wide range of digital therapeutics have been hastily heralded as effective without due caution being given to the quality of the placebo arm in clinical trials.

Discussion

Placebo controls are prized as the gold standard of evidence-based medicine, but a variety of misconceptions about the nuances of their function within RCTs compromise their practical value. In particular, confusions associated with key definitions (see Table 1) contribute to what has been characterized as placebo illiteracy [6]. This inattentiveness to the fundamentals of, and justification for, robust placebo design—in digital healthcare as in other health domains—is consequential. Lack of due diligence in placebo design may also be one route to the “replication crisis,” which refers to the discovery that classic—that is, highly cited—findings often cannot be reproduced in subsequent studies [24, 25]. Prasad and Cifu, for example, found that around 40% of the well-established clinical practices they examined were later shown to be ineffective or harmful, so-called “medical reversals” [26, 27]. Medical reversals encompass medications, surgeries, and even public health programs. The extent of medical reversals raises pressing concerns about questionable methodological practices, including the adequacy of placebo controls in trial design and subsequent replications.

Given the lack of regulatory standards or expectations about how to design placebos, and the wide variety of interpretations of what constitutes an adequate placebo in digital contexts, I close by offering key recommendations to help investigators devise better controls.

Choosing a Placebo Control

Waitlists do not constitute placebo controls and offer no capacity to screen out placebo responses. To test the effectiveness of a digital intervention, researchers must make design choices, and these are contingent on the mode of delivery and the features of the treatment under scrutiny. This, in turn, requires researchers to be explicit in identifying the hypothesized locus of the treatment—that is, the constituent(s) that they believe may be therapeutic.

Structural Isomorphism

After identifying the components of the treatment that they wish to test, investigators should strive to design placebos that are structurally isomorphic to the intervention under scrutiny, except for the hypothesized locus of treatment. For example, the quality, aesthetics, and usability of the digital interface should be identical, as far as possible, between arms; the number of sessions, the duration of sessions, training and education, and the disclosure of a treatment rationale should likewise be matched in format and design. Complete structural isomorphism may not always be achievable, but investigators should nonetheless strive for equivalence, as sketched below.
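
As a concrete illustration, a trial team might encode both arms as design specifications and verify mechanically that they differ only at the hypothesized locus. The sketch below is purely illustrative; the field names and values are hypothetical and not drawn from any of the trials discussed above.

```python
# Hypothetical design specifications for a verum arm and a placebo arm.
# Structural isomorphism demands that every structural feature is matched,
# except the hypothesized locus of treatment (here, the session content).
VERUM = {
    "sessions": 56,                      # number of sessions
    "session_minutes": 10,               # duration per session
    "format": "self-guided app",         # mode of delivery
    "rationale_given": True,             # convincing rationale disclosed
    "interface_quality": "high",         # aesthetics/usability tier
    "content": "CBT skills training",    # hypothesized locus of treatment
}
PLACEBO = {**VERUM, "content": "matched non-therapeutic material"}

# Automated check: the arms may differ only at the hypothesized locus.
differences = {key for key in VERUM if VERUM[key] != PLACEBO[key]}
assert differences == {"content"}, f"arms differ beyond the locus: {differences}"
```

Treating the design as data in this way forces investigators to state the locus explicitly and makes any unintended divergence between arms visible before the trial begins.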

If a combination of factors is considered therapeutically relevant, so-called dismantling studies—whereby a standard treatment is compared with the same treatment minus one constituent component—offer another route to isolating effect sizes [12], as sketched below. This may permit investigators to derive more valid inferences about which constituents of the treatment carry the therapeutic burden. Researchers should offer detailed descriptions of placebo design, including attempts at structural equivalence, and justification for the decisions made.
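
To make the dismantling logic concrete, the minimal sketch below estimates the effect attributable to a single removed component as the difference in mean outcomes between the full and dismantled arms, with a percentile bootstrap confidence interval. All numbers are invented for illustration only.

```python
# Minimal dismantling-study analysis: full intervention vs. the same
# intervention with one component removed. All values are invented.
import random
from statistics import mean

random.seed(0)
full = [4.1, 3.8, 5.0, 2.9, 4.4, 3.6, 4.9, 3.2]        # symptom-score change, full arm
dismantled = [3.5, 2.9, 4.1, 2.6, 3.8, 3.0, 4.0, 2.7]  # same arm minus one component

# Point estimate of the removed component's contribution.
component_effect = mean(full) - mean(dismantled)

# Percentile bootstrap for an approximate 95% confidence interval.
boot = []
for _ in range(10_000):
    f = [random.choice(full) for _ in full]
    d = [random.choice(dismantled) for _ in dismantled]
    boot.append(mean(f) - mean(d))
boot.sort()
low, high = boot[249], boot[9749]

print(f"component effect = {component_effect:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```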

Blinding of Participants and Co-researchers

Both the study researchers involved in administering treatments and the participants should be blinded to allocation. Researchers should be blind to patient allocation at all times to avoid investigator bias. Where necessary, participant interactions should be conducted by clinician-researchers blinded to the study hypotheses.

To reduce opportunities for participants in the placebo arm to break blind, a rationale should also be delivered for the placebo, and this should be pre-tested to determine its plausibility. Researchers might also ask participants to guess whether they have been assigned to the placebo or the treatment, to ascertain the robustness of the control, as illustrated below.
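
One established way to analyze such allocation guesses is the Bang blinding index, computed separately for each arm. A minimal sketch follows; the guess counts are hypothetical.

```python
def bang_blinding_index(correct: int, incorrect: int, dont_know: int) -> float:
    """Bang blinding index for a single trial arm.

    Ranges from -1 to 1. Values near 0 are consistent with random
    guessing (blinding preserved); values near 1 suggest participants
    broke blind; values near -1 suggest systematic opposite guessing.
    """
    total = correct + incorrect + dont_know
    if total == 0:
        raise ValueError("no allocation guesses recorded")
    return (correct - incorrect) / total

# Hypothetical counts from a placebo arm: 30 guessed "placebo" (correct),
# 25 guessed "treatment" (incorrect), and 45 answered "don't know".
print(bang_blinding_index(correct=30, incorrect=25, dont_know=45))  # 0.05
```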

Conclusion

Designing placebos in digital therapeutics presents multiple challenges. However, unlike psychotherapy contexts, where controlling for therapist effects and double blinding pose methodologically fraught problems, digital interventions offer a domain in which researchers can better control for factors arising in the context of the face-to-face visit [12]. To future-proof digital healthcare, and to ensure the effective translation of digital therapeutics into practice, trials need to be well designed and methodologically robust. This will require much greater reflection and studied caution about the nuances of placebo concepts, as well as greater humility and candor with respect to trial limitations.