Introduction

An important public health challenge of the twenty-first century is to increase individuals’ levels of physical activity and to reduce their sedentary behavior. One third of the adult population worldwide does not reach the public health guidelines for recommended levels of physical activity [1], and almost one in five Europeans report sitting more than 7.5 h per day [2]. This is worrisome, given that physical inactivity and excessive sedentary behavior independently increase the risk of non-communicable diseases, and can shorten life expectancy [3,4,5,6,7].

A promising approach to break individuals’ unhealthy habits and promote healthy behavior (e.g. increase physical activity and decrease sedentary behavior) is to make subtle changes to the micro-environment in which individuals make decisions, an approach termed ‘choice architecture’ or ‘nudging’ [8,9,10,11,12]. The micro-environment refers to relatively small settings, such as homes and workplaces [13, 14]. Choice architecture is built on the principle that human decision making is often based on automatic and/or heuristic thought processes, rather than effortful deliberate processes alone [15,16,17,18]. These automatic thought processes play a considerable role in daily behavior, including habits [19]. Habits are context-response associations in memory that develop as individuals repeat behavior in daily life [9]; once a habit is formed, merely perceiving a certain context can automatically trigger the associated behavioral response [17].

Choice architecture interventions are applied in the physical, social and/or information environment [10, 15, 20]. In the physical environment, for instance, individuals can be prompted to take the stairs instead of the elevator through footprints on the floor that lead to the stairwell [21]. An example of an intervention in the social environment is the use of social norms, which can be either descriptive (i.e. providing information about the behavior of others) or injunctive (i.e. providing information about others’ approval) [22]. Finally, the information environment includes interventions that alter the way in which messages are presented or framed, for example in terms of gains (i.e. emphasizing the benefits of the desired behavior) or losses (i.e. emphasizing consequences of the undesired behavior) [11, 20, 23, 24].

In recent decades, choice architecture has gained momentum in the field of public health and health promotion [10, 12, 25, 26]; however, its theoretical principles originate in a long tradition of judgment and decision making research [16, 20, 27]. Past research has demonstrated that choice architecture interventions can effectively change behavior in a variety of health domains [10]. Nevertheless, choice architecture in the domain of physical activity and sedentary behavior has received relatively little attention compared to, for example, dietary behavior (e.g. [28,29,30]). The current review therefore focuses on choice architecture in the domain of physical activity and sedentary behavior.

Two scoping reviews have previously provided an overview of studies using choice architecture interventions to promote physical activity [14, 31], though both reviews only sparsely reported on the effectiveness of the interventions on physical activity. Moreover, there is still a lack of insight regarding the extent to which choice architecture interventions can effectuate durable behavior change after removal of the intervention [15]. It is important to make a distinction between initial behavior change and maintenance of behavior change [32], especially since interventions that effectuate behavior change during the intervention often fail to maintain this change in the long term after removal of the intervention [8, 9]. Finally, a more extensive insight into the effectiveness of choice architecture interventions could be obtained by looking at changes in behavioral intentions and health outcomes related to physical activity and sedentary behavior. It should be noted, however, that changes in intentions do not always equate to changes in behavior [33].

The aim of the current systematic literature review is therefore to summarize studies on micro-environmental choice architecture interventions that encourage physical activity or discourage sedentary behavior in adults, and to describe the effectiveness of those interventions on these behaviors – and on related intentions or health outcomes – in presence of the intervention and after removal of the intervention (i.e. post-intervention, regardless of the time elapsed).

Methods

This systematic literature review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [34]. The review was prospectively registered with the International Prospective Register of Systematic Reviews (PROSPERO) on October 26, 2018 (PROSPERO 2018: CRD42018102999).

Definitions

For the purpose of this review, choice architecture interventions were defined as interventions that alter the presentation of a choice through information or through the physical or social micro-environment in which individuals make decisions, with the intention of changing health-related choices and behaviors. This definition was based on the descriptions of choice architecture by Hollands et al. (2013), Thaler and Sunstein (2008) and Münscher et al. (2016) [10, 11, 14]. In addition, we specified three types of environments in which choice architecture interventions can be applied: the physical, social and information environment. We did not consider interventions that (a) are conducted in the macro-environment, such as the construction of parks and bicycle paths in a city, (b) limit freedom of choice, such as mandates, (c) make use of economic instruments, such as financial incentives, (d) have commercial purposes or (e) solely aim to raise awareness [10, 14].

The outcome measures of interest were (a) the intention or motivation to be physically active/less sedentary; (b) behavioral measures of physical activity or sedentary behavior; and (c) anthropometric and cardiovascular health outcomes (e.g. change in body weight and blood pressure). Outcomes could be self-reported, measured by wearable health monitoring devices, or assessed through biometric measurements.

Search strategy

In collaboration with a medical librarian (LS), a comprehensive search was performed in the bibliographic databases PubMed, Embase, PsycINFO (via Ebsco) and the Cochrane Library from inception to December 13, 2019. Search terms included controlled terms (MeSH in PubMed, Emtree in Embase and thesaurus terms in PsycINFO) as well as free-text terms. In terms of Population, Intervention, Comparison, Outcome and Study design (PICOS), the search strategy included terms for Intervention (e.g. ‘choice architecture’), Outcome (e.g. ‘health behavior’) and Study design (e.g. ‘randomized controlled trial’); Population and Comparison were manually checked during the article selection phase. For most terms, synonyms and closely related words were included. A search filter was used to limit the results to experimental and quasi-experimental studies. The search was performed without date or language restriction. The full search strategies for all databases can be found in Additional file 1. Retrieved articles were imported into EndNote and subsequently de-duplicated using the Bramer method [35]. Additional references were obtained by hand-searching reference lists of included articles (backward search) and by citation search for included articles (forward search).

Eligibility criteria

Articles were eligible for inclusion if they (a) investigated the effect of a choice architecture intervention on physical activity or sedentary behavior, the intention to engage in these behaviors and/or associated health outcomes; (b) studied an adult population (aged 18 years and over); and (c) contained an experimental or quasi-experimental study design. To determine whether the studies derived from the search contained a choice architecture intervention, we used the abovementioned operational definition of this term and the taxonomy of choice architecture techniques from Münscher et al. (2016) [11]. Following from this, interventions did not necessarily need to be labeled as ‘choice architecture’ by the original studies. Articles were excluded if (a) they were written in a language other than English; (b) the study population consisted entirely of individuals with a communicable disease, psychiatric disorder or cancer; or (c) a combination of choice architecture and other behavioral change techniques was used, because this would interfere with our aim to attribute the effect to the choice architecture component(s) separately.

Article selection

Rayyan, an internet-based software program that facilitates collaboration among reviewers, was used for the study selection process [36]. As a first step, one researcher (LL) screened all titles against the eligibility criteria. Subsequently, abstracts of the remaining articles were screened by two researchers independently. In this phase, one researcher (LL) covered all articles, and two other researchers (JJ, OD) each covered a different half of the articles. The degree of inter-rater agreement was 81.3% for abstract assessments. One researcher (LL) subsequently screened all full texts against the eligibility criteria. In case of doubt, two other researchers (JJ, OD) were consulted. Disagreements between reviewers were resolved through discussion.
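As an illustration of how an agreement figure such as the one above can be obtained, the following minimal sketch computes simple percent agreement between two screeners. It assumes that agreement was calculated as the share of identical decisions, which the text does not state explicitly; the function name and the example decisions are hypothetical and are not data from this review.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Return the share of items (in %) on which two raters made the same decision."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must assess the same set of abstracts.")
    if not rater_a:
        raise ValueError("At least one abstract is required.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical screening decisions for eight abstracts (illustration only).
decisions_first = ["include", "exclude", "exclude", "include",
                   "exclude", "exclude", "include", "exclude"]
decisions_second = ["include", "exclude", "include", "include",
                    "exclude", "exclude", "exclude", "exclude"]

agreement = percent_agreement(decisions_first, decisions_second)
print(f"{agreement:.1f}%")
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would require a different calculation.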

Data extraction

One researcher (LL) extracted data from the included studies using a standardized form. Extracted data included study design, setting, target behavior, population characteristics, sample size, details of the intervention and comparison condition, intervention technique, type of environment (physical, social and/or information environment), outcome measurement and findings in presence of the intervention and after removal of the intervention. Outcomes were categorized as in presence of the intervention if the intervention was present at the moment of measurement, or if the effect was measured directly after exposure to the intervention. Outcomes were categorized as after removal of the intervention if the intervention was no longer present at the moment of measurement. An exception to this applied to interventions conducted in the information environment, since these interventions were typically of much shorter duration. For these interventions, the following cut-off points were used: in presence of the intervention: measurements directly after exposure to the intervention up to 1 week after exposure to the intervention; after removal of the intervention: measurements > 1 week post-intervention.
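The categorization rule described above can be sketched as a small decision function. This is an illustrative reading of the rule rather than the procedure used during data extraction: the function and parameter names are hypothetical, and a measurement taken directly after exposure is modeled here as zero days after exposure.

```python
INFORMATION_CUTOFF_DAYS = 7  # "up to 1 week after exposure" for information-environment studies


def categorize_measurement(environment: str,
                           intervention_present: bool,
                           days_after_exposure: float) -> str:
    """Classify a measurement as 'in presence' or 'after removal' of the intervention."""
    if days_after_exposure < 0:
        raise ValueError("days_after_exposure must be zero or positive.")
    if environment == "information":
        # Exception for short, message-based interventions: a one-week cut-off applies.
        if days_after_exposure <= INFORMATION_CUTOFF_DAYS:
            return "in presence"
        return "after removal"
    # Physical and social environment: the intervention must still be present,
    # or the measurement must follow exposure directly (assumed: zero days).
    if intervention_present or days_after_exposure == 0:
        return "in presence"
    return "after removal"


# A stair-use count taken two weeks after the posters were taken down.
print(categorize_measurement("physical", False, 14))
# A questionnaire completed five days after reading a framed message.
print(categorize_measurement("information", False, 5))
```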

Intervention effectiveness was determined by the statistical significance of the effect (significant/not significant) as reported by the original studies. Unless otherwise specified, significant effects reported in the current review refer to effects in the healthy direction. For studies with multiple post-intervention measurements, we reported the outcomes of the measurement most distant from the end of the intervention. Studies that reported both significant and non-significant effects on the same outcome variable (e.g. a significant effect on physical activity for women, but not for men) were labeled ‘mixed effects’. Note that in this review, significant effects in experimental studies with pre- and post-measures refer to a significant increase in the intervention condition compared to the comparison condition over time (i.e. baseline compared to follow-up), whereas significant effects in studies with a factorial design refer to a significant increase in one condition compared to another condition.
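The labeling rules in this paragraph can be expressed compactly. The sketch below is a hypothetical illustration: the function names and the input format (one significance flag per reported effect on the same outcome, and a mapping from days after the end of the intervention to significance) are assumptions made for the example.

```python
from typing import Dict, Iterable


def label_effect(significant_flags: Iterable[bool]) -> str:
    """Label one outcome variable from the significance of its reported effects."""
    flags = list(significant_flags)
    if not flags:
        raise ValueError("At least one reported effect is required.")
    if all(flags):
        return "significant"
    if not any(flags):
        return "not significant"
    # Both significant and non-significant effects on the same outcome.
    return "mixed effects"


def most_distant_follow_up(follow_ups: Dict[int, bool]) -> bool:
    """Return the result of the measurement furthest from the end of the intervention.

    `follow_ups` maps days after the end of the intervention to significance.
    """
    if not follow_ups:
        raise ValueError("At least one follow-up measurement is required.")
    return follow_ups[max(follow_ups)]


# Significant for women but not for men: labeled 'mixed effects'.
print(label_effect([True, False]))
# Follow-ups at 7 and 90 days: only the 90-day result is reported.
print(most_distant_follow_up({7: True, 90: False}))
```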

Quality assessment

The methodological quality assessment served to inform interpretation of findings, rather than to determine study eligibility. Study quality was assessed independently by two researchers (LL and JJ) using the QualSyst tool from Kmet et al. (2004) [37], allowing assessment of both experimental and quasi-experimental studies. The tool consisted of fourteen items to be scored ‘Yes’ (2), ‘Partial’ (1), ‘No’ (0) or ‘Not applicable’ (N/A), depending on the degree to which specific criteria were met or reported. Aspects covered include quality of study design, confounders, blinding, selection bias and misclassification bias. Discrepancies in assessments between reviewers were resolved through discussion. For each study, a summary score was calculated by summing the total score obtained and dividing it by the total possible score. A quality score of ≥.75 indicates strong quality, a score between .55 and .75 moderate quality, and a score ≤ .55 weak quality.
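The summary score and quality categories can be illustrated with a short calculation. The sketch assumes that items rated ‘Not applicable’ are left out of the total possible score, which is a common way of applying the tool but is not spelled out above; the function names and the example ratings are hypothetical.

```python
from typing import Sequence

ITEM_POINTS = {"yes": 2, "partial": 1, "no": 0}
MAX_POINTS_PER_ITEM = 2


def summary_score(ratings: Sequence[str]) -> float:
    """Total score obtained divided by total possible score.

    Assumption: items rated 'n/a' do not count towards the total possible score.
    """
    applicable = [r for r in ratings if r != "n/a"]
    if not applicable:
        raise ValueError("At least one applicable item is required.")
    obtained = sum(ITEM_POINTS[r] for r in applicable)
    possible = MAX_POINTS_PER_ITEM * len(applicable)
    return obtained / possible


def quality_label(score: float) -> str:
    """Map a summary score to the quality categories used in this review."""
    if score >= 0.75:
        return "strong"
    if score > 0.55:
        return "moderate"
    return "weak"  # scores of .55 or lower


# Hypothetical ratings on the fourteen items (illustration only).
ratings = ["yes"] * 9 + ["partial"] * 3 + ["no"] + ["n/a"]
score = summary_score(ratings)  # 21 points out of 26 possible
print(round(score, 2), quality_label(score))
```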

Data synthesis

High heterogeneity between studies with regard to study design, intervention characteristics, type of outcome measure and outcome measure assessment prevented performance of a meta-analysis. Instead, we synthesized extracted data by narratively summarizing the characteristics, quality and findings of the included studies. After summarizing the content of interventions, one researcher (LL) inductively identified different intervention techniques by checking the intervention components described in the studies against our operational definition of choice architecture and the choice architecture techniques described by Münscher et al. [11]. Techniques reported in the current review were termed in line with the general choice architecture literature as much as possible (e.g. [10]). The synthesis was structured around (a) the choice architecture techniques identified and (b) the effectiveness of interventions in changing intentions, behaviors or health outcomes in presence of the intervention and after removal of the intervention.

Results

Study selection

Figure 1 shows the flow diagram of the study selection process. The database searches initially identified 6841 records, of which 4798 remained after removal of duplicates from the database searches. Backward and forward citation searches identified 2768 records. A total of 202 full-text articles were assessed for eligibility. Eighty-four articles were included in this review, comprising 88 unique studies.

Fig. 1 PRISMA flow diagram of the study selection process. * One study measured both physical activity and sedentary behavior [38]

Study characteristics

Table 1 summarizes study characteristics of the included studies. Studies were conducted in the United States (n = 38), Europe (n = 37), Canada (n = 6), China (n = 4) or Australia (n = 3). The number of participants across included studies ranged from 30 [47] to 9729 [116].

Table 1 Characteristics and key findings of included studies

Design and setting

Thirty-three studies applied an experimental research design [38, 40, 51, 61, 86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101, 103, 104, 106,107,108, 111, 113, 115, 117,118,119,120]1: a pretest-posttest design (n = 18), factorial design (n = 11), cluster randomized design (n = 2), post-test only design (n = 1) or cross-over design (n = 1). The remaining 55 studies used a quasi-experimental design (38, 40–49, 51–59, 61–84, 101, 104, 108, 110, 112, 114, 116, 121)1: a time-series design (n = 45), pretest-posttest design (n = 7) or post-test only design (n = 3). Field experiments were most frequently conducted at the workplace (n = 19) (40, 42, 43, 53, 56, 57, 60, 63, 64, 68, 73, 78–80, 108, 110, 112, 116, 121)1, followed by public transport locations (n = 11) (38, 45, 48, 55, 65, 69, 71, 74, 76, 77)1, university campuses (n = 11) [48, 50, 52, 53, 55, 59, 62, 63, 70, 71, 76, 114], shopping malls (n = 10) (41, 48, 65–67, 81–84, 110)1, hospitals (n = 2) [45, 73], and the home environment (n = 2) [60, 111]. A total of 23 studies were conducted in a laboratory setting (85–87, 89–103, 105–107)1. The remaining studies implemented an intervention through a mobile phone application or website (n = 8) [38, 47, 113, 115,116,117,118, 120], mobile text messages (n = 2) [51, 89] or e-mail (n = 2) [40, 119].

Study outcome

Of the included studies, 86 studies targeted physical activity and within these studies, seventeen measured the intention to be more physically active [86, 87, 89,90,91,92,93,94, 96, 99, 104,105,106,107,108, 122]1, 74 measured physical activity behavior [38, 39, 41,42,43,44,45,46,47,48,49,50,51,52,53, 55,56,57,58,59,60,61,62,63,64,65,66,67,68,69, 71,72,73,74,75,76,77,78,79,80,81,82,83,84,85, 87, 89, 90, 93,94,95,96,97,98, 100,101,102,103, 105, 106, 109,110,111,112,113,114,115,116,117,118,119,120]1 and four measured health outcomes [40, 54, 60, 111]. A total of three studies targeted sedentary behavior, of which one measured the intention to be less sedentary [121] and two measured sedentary behavior [38, 47]; none of the studies measured health outcomes. Individuals’ intentions to become physically active or less sedentary were usually measured by one to three questionnaire items on a 5-, 6- or 7-point scale. Physical activity was assessed with objective measuring devices (n = 11), including pedometers and accelerometers, validated questionnaires (n = 10), such as the International Physical Activity Questionnaire [123], and other self-report tools (n = 7), such as activity logs [97, 98, 105]. Studies that measured stair use (n = 48) counted the number of individuals that climbed the stairs within a certain time interval (mostly a few hours a day, during multiple weeks) by using observers (n = 36) or automatic (infrared) counters (n = 12). Three studies measured enrollment or attendance at exercise classes. Sedentary behavior was either assessed objectively [38, 47], for example by the SenseWear Mini Armband monitor [47], or observed by researchers [121]. Health outcomes were determined through biometric measurements, including body weight and blood pressure. 
The median duration of interventions was 21 days (range: 1 day to 24 months) and the median period between the end of an intervention and the most distant follow-up measurement was 28 days (range: 1 day to 3 months).

Quality of the included studies

Table 1 presents the summarized quality scores for all studies. The majority of included studies (n = 70) were of high methodological quality. The remaining studies (n = 18) were of moderate quality, with the lowest quality score being 0.61 [51]. Most of the moderate quality studies investigated the effectiveness of message framing. The relatively low quality score of these studies was mainly due to a lack of blinding of investigators and participants or to a failure to report an estimate of variance for the main results. A complete overview of quality ratings on all items can be found in Additional file 2.

Intervention effectiveness

Effectiveness in presence of intervention versus after removal

Overall, the effectiveness of interventions was more often measured in presence of the intervention (n = 80) than after removal of the intervention (n = 34). For intentions measured in presence of the intervention, four studies reported effective interventions [96, 106, 108, 121], four reported mixed effects [88, 92, 104, 107] and six reported no effect [86, 89, 91, 99, 102, 105]. Among the relatively low number of studies (n = 5) that measured intention after removal of the intervention, one reported effectiveness [94], whereas four did not [87, 90, 93, 96].

For behavior, 67.6% of the interventions were effective (38, 40, 41, 43, 45–49, 52, 56–61, 65–70, 73, 75–84, 102, 109, 110, 112–120)1, 13.2% showed mixed effects [51, 52, 55, 64, 72, 73, 75, 103] and 19.1% did not show an effect in presence of the intervention [43, 45, 56, 63, 65, 66, 89, 95, 101, 105, 109, 111]. After removal of the intervention, 47.1% of the interventions showed a significant effect (40, 43, 45, 47, 49, 58, 61, 67, 77, 82, 86, 97, 110, 118)1, 14.7% showed mixed effects [39, 52, 75, 112, 115] and 38.2% did not show an effect [43, 49, 69, 73, 81, 90, 93, 94, 97, 100, 106, 109, 114]. An explorative analysis of characteristics of the studies that reported a significant effect after removal of the intervention revealed that on average, effective interventions lasted longer (7.3 weeks) than interventions that showed no effect (3.7 weeks). Message-framing studies were excluded from this explorative analysis, since these studies involved one-shot interventions.

Of the four studies that measured health outcomes in presence of choice architecture, one study [40] reported a significant effect on aerobic fitness, but not on other health outcomes; one study [54] reported a reduction in cholesterol levels, but no effect on BMI or blood pressure; and two studies [60, 111] reported no effect.

Intervention techniques

From the 88 included studies, we derived six different choice architecture intervention techniques, each of which is discussed below. Some intervention techniques were almost always applied in the same environment (physical, social and/or information); in these cases, the corresponding environment is specified in parentheses.

Prompting (physical and information environment)

Fifty-three studies used prompting (38–84, 108, 110, 114)1, most importantly the use of point-of-choice prompts, such as posters, signs, stair-riser banners and directional footprints on the floor to promote stair use. Prompting interventions lasted between 1 day [41] and 3.5 years [64]. Among the 50 studies that looked into the effect of prompting on physical activity in presence of the intervention, 37 (74.0%) reported a significant effect (38, 40, 41, 43, 45, 47–49, 52, 56–61, 65–70, 73, 75–84, 110, 114)1, eight (16.0%) reported mixed effects [51, 52, 55, 64, 72, 73, 75, 79] and seven (14.0%) reported no effect [43, 45, 56, 63, 65, 66, 109]. Twenty-one studies measured the effect of prompts on physical activity after removal of the intervention; twelve (57.1%) reported a significant effect (40, 43, 45, 47, 49, 58, 61, 67, 77, 82, 110)1, three (14.3%) reported mixed effects [39, 52, 75] and six (28.6%) reported no effect [43, 49, 69, 73, 81, 114].

Prompts consisting of a message differed in the topic emphasized; most prompts emphasized the relationship between physical activity and health (n = 25) (38, 41, 44, 45, 47–49, 51, 54, 55, 58, 64–68, 72, 77, 80, 82–84, 108, 110)1, caloric expenditure (n = 13) [44, 53, 58, 59, 62, 70,71,72, 76, 77, 79, 82, 83], physical fitness (n = 6) (49, 54, 67, 72, 80, 110)1 or saving time (n = 6) (44, 45, 54, 65, 77)1. The messages showed significant effects on physical activity in 91.7% (11/12) of the studies that emphasized caloric expenditure [44, 53, 58, 59, 62, 70, 71, 76, 77, 82, 83], in 72% (18/25) of the studies that emphasized the relationship between physical activity and health (38, 41, 45, 47–49, 51, 58, 65–67, 77, 80, 82–84, 110)1, in 67% (4/6) of the studies that emphasized physical fitness [50, 68, 81, 110], and in 50% (3/6) of the studies that emphasized saving time [46, 66, 78]. In six studies, stair use was prompted by making the staircase more pleasant or attractive, for instance by decorating it with artwork and/or by playing music [48, 60, 61, 64, 74, 80]. Five out of these six studies (83.3%) reported a significant effect on physical activity [48, 60, 61, 74, 80], although it should be noted that some interventions were combined with other choice architecture intervention components. Two studies prompted physical activity through e-mail or mobile phone messages that emphasized the health benefits of physical activity; one reported effectiveness [81] and one mixed effectiveness [51]. One study showed significantly reduced sedentary behavior by prompting physical activity breaks through mobile phone messages [47].

Message framing (information environment)

Twenty-four studies compared the effect of a message framed in a certain way with a similar message framed in a different way on individuals’ physical activity intentions and/or behaviors (85–107)1. The majority of these studies compared gain-framed messages with loss-framed messages (n = 21).

Out of the eleven studies [86, 88, 89, 91, 92, 96, 102, 104,105,106,107] that measured physical activity intentions in presence of the intervention, five (45.5%) [88, 92, 96, 106, 107] showed that gain-framed messages were more effective than loss-framed messages. However, two of the studies showing effectiveness were of moderate quality [88, 107]. Among the six studies (88, 94, 95, 102, 104)1 that measured physical activity behavior in presence of the intervention, one study (16.7%) [103] demonstrated that gain-framed messages were more effective in changing physical activity.

After removal of the intervention, five studies [87, 90, 93, 94, 96] measured the effect of gain- versus loss-framed messages on intentions and nine studies [87, 90, 93, 94, 96,97,98, 100, 106] measured the effect on behavior. None of the five studies reported a difference between gain- and loss-framed messages in their effect on intentions. In three of the nine studies [87, 96, 98], gain-framed messages produced a larger increase in physical activity than loss-framed messages.

Other types of framing included for example a credible versus non-credible source message (n = 2) [93, 94] or a narrative versus non-narrative message (n = 2) [92, 107]. Among these studies, one [94] reported a significant difference between messages: the credible source resulted in higher exercise intentions than the non-credible source.

Social influence (social environment)

Twelve studies used social influence interventions (108–118)1, including descriptive social norms (n = 4) (108–110)1, behavioral modeling (n = 3) [111,112,113], encouragement of competition between individuals or teams (n = 4) [114,115,116,117], and facilitation of social comparison through information about the performance of others (n = 3) [113, 115, 118]. All studies providing a descriptive social norm (i.e. messages that specify the prevalence of a specific behavior), except one [109], reported a significant effect on behavior in presence of the intervention. One study found a significant increase in physical activity, as well as a significant decrease in sedentary behavior [38]. Three studies using descriptive social norms also measured physical activity after removal of the intervention (108, 110)1. Among these, two studies (110)1 reported effectiveness; however, both studies were of moderate quality.

Within the three studies in which behavioral modeling was applied (i.e. demonstration of the desired behavior by another person), two studies (66.7%) [112, 113] reported a significant increase in physical activity in presence of the intervention, whereas one did not [111]. One study also measured the effectiveness after removal of the intervention and reported mixed effects: stair use only remained elevated after removal of the intervention in one of the two intervention buildings [112].

The four interventions that encouraged competition all effectively increased physical activity in presence of the intervention. For two of the interventions, effects were also measured after removal of the intervention [114, 115]; these effects were significant in one study [115]. Finally, the three studies that provided information about physical activity performances of others all reported a significant effect on physical activity during the intervention. Measures after removal of the intervention were performed in two of these studies [115, 118]; the effect was significant in one study [118].

Feedback

Feedback was used as an intervention technique in eight studies [38, 47, 106, 115, 116, 118,119,120]. These interventions consisted of behavioral feedback on one’s level or performance of physical activity [38, 106, 115, 116, 118,119,120], or on time spent in sedentary behavior [47]; all reported a significant effect on behavior in presence of the intervention. Three studies also measured the effectiveness after removal of the intervention [106, 115, 118]; one study [118] found a significant increase in physical activity and one study found a significant increase in physical activity for one condition, but not for another condition [115].

Default change

One study changed the default (i.e. a sit-stand desk was placed at stand-up height instead of sitting height) to encourage sedentary office workers to use the desk in a standing position [121]. The results showed that the intention to work standing up significantly increased from pre- to post-measure.

Anchoring

One study used anchoring to increase daily steps – participants were either assigned a 5000 step goal (low anchor), or a 10,000 step goal (high anchor) – and reported that the high anchor condition resulted in a significantly higher number of daily steps compared to the low anchor condition [120].

Discussion

Summary of evidence

The aim of this systematic review was to summarize studies on micro-environmental choice architecture interventions that encourage physical activity or discourage sedentary behavior in adults, and to describe the effectiveness of those interventions on these behaviors – and on related intentions or health outcomes – in presence of the intervention and after removal of the intervention. Within the 88 included studies, six broad choice architecture intervention techniques were distinguished, namely – in order of decreasing frequency – prompting, message framing, social influence, feedback, default change and anchoring. In the physical environment, we encountered mostly prompting interventions; in the social environment mostly social influence interventions and in the information environment mostly message framing studies. A great majority of studies targeted physical activity, predominantly stair use, while only three studies focused on reducing sedentary behavior. The results of the review suggest that choice architecture interventions effectively encourage stair use in adults, especially in presence of the intervention. However, since we did not assess effect sizes and only a few studies reported follow-up outcomes, it remains unclear how meaningful these increases in stair use are on an individual level.

Consistent with previous research on health behavior change interventions in general [9], a higher proportion of studies reported a significant effect on behavior in presence of the intervention compared to after removal of the intervention. The presence of an intervention likely disrupted habitual behavior [124] (e.g. elevator use) and motivated the choice for a different, healthier option (e.g. the stairs) by bringing existing beliefs (such as ‘taking the stairs is good for my health’) into consciousness, while removal of the intervention probably decreased the salience of beliefs about the healthy option [69]. According to Wood & Neal (2016), behavior change interventions of longer duration tend to be more successful, because they allow for formation of new habits [9]. Indeed, the results of the current review suggest that interventions that had lasted longer were more successful in maintaining increases in physical activity after removal of the intervention. This finding should, however, be interpreted with caution since we did not control for other factors (e.g. moment of follow-up measurement). These findings raise the question: how long should choice architecture interventions generally last to promote habit formation? A study by Kaushal et al. (2015) demonstrated that individuals needed at least six weeks of regular gym workouts to establish new exercise habits [125]; according to a study by Lally et al. (2010), the duration of habit formation varies highly between individuals, ranging from 18 to 254 days [126]. A potential disadvantage of a choice architecture intervention of longer duration in the physical environment could be that individuals become accustomed to it and therefore no longer notice it [39].

A relatively high number of studies that examined social influence as a choice architecture technique reported significant changes in behavior, especially in the presence of the intervention; eight out of ten studies reported increased physical activity, and the only study that targeted sedentary behavior reported a decrease in sedentary behavior. Descriptive social norm interventions may be effective because people generally fear ostracism and experience a robust need to belong, which drives them to behave appropriately and receive approval [127, 128]. The effectiveness of social norm interventions has also been demonstrated in other domains, such as alcohol consumption among college students (e.g. [129]). Due to the limited number of social influence studies identified in the current review, we cannot draw conclusions regarding the most effective type of social influence intervention.

With regard to message framing, our review predominantly identified studies that compared gain-framed messages with loss-framed messages. It should be noted that comparisons between message framing conditions differ from the comparisons that were made in most of the other studies included in this review; in the latter, intervention effects were often compared with ‘no intervention’. In contrast to Gallagher and Updegraff (2012) [24], who reported in their meta-analytic review that gain-framed messages promoted prevention behaviors (including physical activity) more effectively than loss-framed messages, we found no favorable effect of gain-framed messages over loss-framed messages on physical activity. This inconsistency in findings may be explained by the fact that our review included more recently published message framing studies (i.e. [89, 90, 95, 97, 101]), the majority of which did not report a significant effect on physical activity. For intentions to engage in physical activity, the findings of our review did not show a favorable effect of gain-framed over loss-framed messages either.

It is hard to assess the effectiveness of interventions that used feedback, default change or anchoring as a choice architecture technique, because studies on those techniques were underrepresented. Moreover, most of the studies that contained feedback also contained another choice architecture technique, which hampered assessment of the effectiveness of feedback alone. Studies on sedentary behavior were underrepresented as well. This may be explained by the fact that, in contrast to those of physical inactivity, the adverse health effects of excessive sedentary behavior have only relatively recently been fully recognized [4, 130].

It must be noted that the choice architecture intervention techniques reviewed are not necessarily new compared to the behavior change techniques (BCTs) described in previous taxonomies of choice architecture and in taxonomies of BCTs more generally. For example, some BCTs from the Behavior Change Taxonomy of Michie et al. (2013) (e.g. ‘Restructuring the physical environment’ and ‘Restructuring the social environment’) cover choice architecture techniques that were identified by the current review [131]. In our review, we used terms for choice architecture techniques as they are commonly used in the choice architecture literature (e.g. ‘default change’ and ‘anchoring’), because those terms refer to more specific techniques than those in the taxonomy of Michie et al. [10, 12]. In addition to choice architecture techniques, a wide variety of other BCTs exists [131], such as social support or punishment, but our review did not look at combinations of choice architecture and such other BCTs. Since we excluded multicomponent interventions, we cannot assess whether choice architecture techniques change physical activity and sedentary behavior more effectively alone or in combination with other BCTs. However, the exclusive focus of our review on choice architecture interventions permits attribution of the effects specifically to those interventions.

Strengths and limitations

Important strengths of our review are the addition of backward and forward citation searches to the database searches and the assessment of study quality by two independent reviewers. This review also has several limitations. Firstly, accurate assessment of intervention effectiveness was impeded by the fact that (a) few studies adopted a controlled experimental research design; (b) few studies used objective measurement tools; and (c) we reported the effects of interventions in terms of statistical significance, which is less informative than assessment of effect sizes. Moreover, maintenance of behavior change is hard to assess based on this review, due to the often short-term nature of follow-up measures and the fact that we reported outcomes in terms of ‘presence or absence of the intervention’, without taking the elapsed time at follow-up into account. High heterogeneity between studies with regard to study design, intervention characteristics, type of outcome measure and outcome measure assessment prevented us from conducting a meta-analysis; therefore, we were limited in comparing the effectiveness of interventions between studies. Since the vast majority of studies measured only stair use, the results cannot be generalized to physical activity as a whole. Another limitation relates to the quality assessment: the majority of studies were considered ‘high’ quality, which is improbable considering other literature reviews on physical activity. Therefore, it may be that we selected too liberal a cut-point for ‘high quality’ and/or that the QualSyst tool lacks sensitivity. Furthermore, despite the extensive search strategy, relevant articles may have been missed. Although this limitation applies to all systematic literature reviews, it may be especially the case for this review because there is no commonly shared operational definition of choice architecture. The term choice architecture may thus cover many different intervention techniques that are termed differently in the literature. We attempted to minimize this limitation by developing an operational definition of choice architecture and by including different concepts and examples related to choice architecture in our search strategy (e.g. nudging, behavioral economics, decision environment). Furthermore, the initial screening of titles was performed by only one researcher. However, this is unlikely to have influenced the results, since articles were retained for the next screening phase whenever the researcher was in doubt about their eligibility. Finally, we did not assess the risk of publication bias [132].

Conclusions

This systematic literature review extends the work of Forberger et al. (2019) [31] and Hollands et al. (2013) [14] by providing a systematic and comprehensive overview of studies that used choice architecture interventions to encourage physical activity or to discourage sedentary behavior in adults. The results of the current review suggest that prompting is a promising choice architecture technique to increase stair use over elevator or escalator use. For prompting, but also for other choice architecture techniques, it seems that intervention effectiveness decreases after removal of the intervention, which may be because study participants had not (yet) developed the promoted behavior into a habit. The effectiveness of the choice architecture techniques social influence, feedback, default change and anchoring is hard to assess based on this review, since studies using those techniques were underrepresented. Finally, only a few studies targeted sedentary behavior or types of physical activity other than stair use, such as active commuting and exercise during leisure time, which highlights the need for additional studies on those behaviors. To allow reliable assessment of behavior change and its maintenance, future studies must use objective measurement tools and a controlled experimental research design with (long-term) follow-up measures.