1 Introduction

Traditional shopping involves browsing stores in a physical environment, where product attributes such as size and color play pivotal roles. With the rapid digitalization of many aspects of society, these decisions are frequently made through digital user interfaces (UI) on e-commerce platforms [1]. The shift towards e-commerce was further amplified by the COVID-19 pandemic, which forced many private and business interactions to migrate online [2].

Digital decision-making, however, contrasts with its offline counterpart. The vastness of online content can engender information overload, especially when users face a multitude of product options and their respective features [3,4,5]. This scenario sets the stage for digital nudging (DN), that is, deliberate UI design features that modify user behavior in online decision contexts [6,7,8,9]. DN mechanisms are omnipresent and continuously shape choices in such environments [10], positioning DN in e-commerce as an emergent research domain [11]. While DN has the potential to enhance online decision-making, for example by suggesting default choices based on a user’s previous selections [12] or by reducing the purchase rate of incompatible products through information nudges on check-out pages [13], it also carries the risk of triggering reactance or backfire effects in users [14], which raises a series of ethical conundrums [15].

At the heart of the DN discourse lies a key question: how do users interpret these subtle digital prompts? Grasping this is critical, as it underpins the very effectiveness of any nudging strategy, ensuring alignment with user expectations [16]. Most examinations of this question have been observational, with researchers passively observing patterns without active interventions [17, 18]. This approach, especially in a field replete with psychological implications like nudging, can be restrictive. Its limitations become more pronounced without an appropriate control group for benchmarking. Observational studies lack randomization, which results in lower internal validity than experimental designs [19, 20]. With a multitude of external influences potentially affecting online decisions, the need for precision intensifies. Therefore, Lembcke et al. [21] suggest employing application-oriented research designs to obtain a better grasp of DN facets, including both strengths and weaknesses. Hence, there is an unmistakable call to transition from observational studies to intervention-focused research designs.

While nudging research does deploy intervention methodologies [22,23,24,25,26,27], how users perceive digital nudges within e-commerce remains underemphasized. Previous studies examined nudging interventions in e-grocery [22,23,24,25,26,27], aiming at the choice of sustainable alternatives [22,23,24,26], without a focus on user perceptions. Few studies have combined intervention and perception. One example is Michels et al. [28], who examined DN’s role in sustainable shipping choices and users’ ethical perceptions using Román’s scales [29]. Their work, emblematic of the budding research in this area, showed the power of DN but also highlighted potential user reservations. It signifies a broader need: to cohesively integrate user perceptions with intervention data, thus offering a holistic view of DN in e-commerce. With a relative void in understanding user perceptions in this context [16], our research aims to bridge this gap, pivoting toward an individual-centric perspective. We ask: How do digital nudging interventions shape individual perceptions?

To illuminate this subject, we conducted an online experiment, probing if perceptions fluctuate based on DN design nuances. Grounded in both descriptive and inferential statistical analyses, our findings aim to steer e-commerce designers toward crafting enhanced user experiences [7]. We evaluate our findings with experts from information systems and psychology. In doing so, we heed the calls for a deeper probe into DN’s implications for e-commerce [30], further enriching this blossoming research domain [31].

The remainder of this paper is structured as follows. Sect. 2 delves into the theoretical underpinnings of DN within the e-commerce realm, elucidating key perception factors and formulating relevant hypotheses. Sect. 3 illustrates the research model that underlies our investigation. Building on that foundation, Sect. 4 describes our research methodology, while Sect. 5 presents the outcomes of our group variance analysis and of the expert evaluation. In Sect. 6, we reflect on the study’s limitations and ramifications, offering insights to both seasoned and emerging e-commerce platform designers. Sect. 7 provides our concluding remarks and synthesis.

2 Theoretical background and hypotheses development

2.1 Digital nudging concepts in e-commerce

Nudges represent aspects of choice architecture that predictably sway people’s actions without prohibiting any choices or drastically altering their economic motivations [32]. The idea of DN has its roots in the nudging theory from an analog setting, originally posited by Thaler and Sunstein [32]. Their foundational theory was anchored in evidence from psychology and behavioral economics. Contrary to the classical economic model of the consistently rational homo economicus, they highlighted that human actions often deviate from rationality due to cognitive, emotional, and social dynamics. To empower people to make informed decisions, it is imperative to grasp their cognitive processes [32]. Dual process theories, a staple in social psychology, postulate that individuals employ two cognitive systems during decision-making: the intuitive “automatic” (system 1) and the deliberate “reflective” (system 2) [33, 34]. Research indicates that routine activities predominantly harness the automatic system, rendering decisions prone to heuristics and biases [35]. Notable among these heuristics are availability (easily recalling events), anchoring (relying on reference points for assessments), and representativeness (leaning on stereotypes) [36]. Nudging, then, is the art of shaping choice environments (both digital and analog) to either exploit or mitigate these heuristics and biases [7]. For this research, we define DN as leveraging UI elements to guide user actions in online decision contexts.

A platform, in business terms, is a structure fostering value-driven interactions between external providers and consumers [37]. It provides an open infrastructure and integrates effective governance mechanisms [37]. The platforms’ objective is to harmonize the exchange of services and (social) goods among erstwhile unaffiliated users, generating value for all involved [37]. E-commerce embodies online transactions – buying and selling of commodities or services [38]. In this realm, the platform morphs into a digital intermediary, pairing sellers with buyers to enable these value-based interactions [37]. Sellers, through this digital channel, confer benefits upon buyers by delivering goods or services [39]. Conversely, sellers garner not only traditional monetary remunerations but also a “social currency”, encompassing data and feedback, enriching them with intangible economic benefits [37, 39].

2.2 Hypotheses development

2.2.1 Technology-related perception factors: perceived usefulness and perceived ease of use

Individuals utilize e-commerce platforms for their online transactions [40]. Their engagement is significantly influenced by both the utility and usability of these platforms. The technology acceptance model (TAM), a cornerstone of technology acceptance theories, emphasizes the constructs of perceived usefulness (PU) and perceived ease of use (PEOU) as key indicators of user behaviors towards IT adoption [41].

Perceived usefulness gauges an individual’s belief in the utility of a technology for their needs [42]. It essentially captures their subjective evaluation of the benefit offered by the IT artifact, in this case, the e-commerce platform [43]. As such, PU serves as a proxy for how an IT artifact boosts individual performance [41]. On the other hand, PEOU assesses how effortlessly an individual perceives the use of a system [42]. It mirrors the cognitive effort needed to utilize IT [43], denoting the relative effortlessness of navigating an IT artifact [40].

In relation to DN within e-commerce, there is a scarcity of research exploring its impact on primary factors such as PU and PEOU. Nonetheless, insights from parallel domains, like social media and e-learning acceptance, shed light on DN’s potential effects on perceived usability and effort. In their exploration of social media acceptance, Ren and Liu [44] investigated the mediating role of DN in social media app adoption. They established that DN bridges the relationship between PEOU and social media acceptance. Yet, its mediating influence between PU and social media acceptance remained unsubstantiated. Wambsganss et al. [45] delved into the realm of e-learning, examining the influence of automated feedback and social comparison nudging on students’ argumentation writing skills. Their mixed-method study on PEOU revealed that students benefited from automated feedback combined with social comparison nudging. Another study by Wambsganss et al. [46] evaluated the self-evaluation nudge’s efficacy in e-learning systems, considering PU and PEOU to gauge the design’s perceptual accuracy. The results painted a promising picture for the interplay of social nudges and adaptive feedback in promoting self-regulated learning.

Furthermore, Venkatesh [47] uncovered that individuals resort to certain anchors, termed facilitating conditions, when forming PEOU judgments about novel IT artifacts. Digital nudges aim to streamline decision-making by lessening cognitive or physical strain [12] and thus epitomize such facilitating conditions. Research by Jesse et al. [48] in online food choices revealed that a combined nudge (default and social norm) not only boosted the likelihood of nudged item selection but also expedited the decision-making process, suggesting a positive PEOU shift. The practicality of IT tools is often tethered to their complexity, with surging complexities escalating associated costs [49]. Digital nudging can streamline intricate choices [7,50]. For example, online systems can ease individual choice by guiding a user through a process. In the realm of technology acceptance, it is underscored that information satisfaction bolsters PU [51]. Significantly, Jesse et al.’s study [48] emphasized that their nudge did not undermine participants’ satisfaction or confidence.

In synthesis, DN likely exerts a positive influence on PU and PEOU. Thus, we propose:

H1 In an e-commerce setting, digital nudge interventions amplify the perceived usefulness compared to scenarios without nudges.

H2 Digital nudge interventions enhance the perceived ease of use in e-commerce contexts relative to those without nudges.

2.2.2 Channel-related perception factors: trust and perceived privacy risk

To understand e-commerce adoption, it is essential to consider more than just technology-based antecedents, which primarily emphasize the technological interface of a platform. The channel dimension of an e-commerce platform plays a pivotal role [43]. In the context of e-commerce and digital platforms, the term “channel” typically denotes the medium or interface through which interactions or transactions occur [52]. Channel-related perception factors refer to the perceptions, beliefs, and attitudes that users or consumers hold concerning a particular mode or channel of communication or distribution.

With the proliferation of e-commerce platforms available for selection, trust in online transactions has evolved into a fundamental aspect of the e-commerce landscape [53]. We refer to the definition of trust as provided by McKnight et al. [54] and Lankton et al. [55, p. 883] as “trusting beliefs in technology, which are beliefs that a specific technology has the attributes necessary to perform as expected in a given situation in which negative consequences are possible”. Key facets of online trusting beliefs highlighted in literature encompass benevolence, integrity, and ability [56]. Benevolence underscores the trustee’s commitment to prioritize the truster’s welfare over self-interest [57]. Integrity encompasses enduring values like consistency and sincerity [57]. Ability focuses on competencies within a domain that certify the entity in question [56].

Thaler and Sunstein’s conception of nudging [32] asserts that nudges, while altering the choice architecture, should not restrict any choice options. However, in practice, “dark patterns” arise, that is, nudges that deviate from their intended purpose for unethical gains [58]. Djurica and Figl’s study [59] seeks to understand the impact of DN on product selection and the user’s perception of e-commerce platforms. They hypothesize that e-commerce websites deploying digital nudges (e.g., defaults) might be viewed more favorably, provided the nudge simplifies the user’s decision-making process and highlights the most appropriate choice for them [59]. Yet, Steffel et al. [60] argue that if users perceive nudges as mere tools to inflate platform profits or to cater to platform interests, this could erode trust in the platform. Bongard-Blanchy et al. [61] noted that users often recognize manipulative tactics employed through platform design interfaces, especially within e-commerce. From their study on dark patterns, Maier and Harr [62] anticipate a long-term diminishment of users’ trusting beliefs towards platforms utilizing these manipulative tactics. Zanker [63] emphasizes that comprehensive explanations are instrumental for the efficacy of a recommendation system, cultivating users’ trusting beliefs in its outputs. Such explanations provide insight, considering both user profiles and product features, and act as bridges connecting the two [63]. Social nudges, which are designed to act as a precursor to a recommender system, give the user a hint about the personalization of the recommendation through the interface. However, they lack a clear, mediating explanation.

Given that digital nudges are essentially design alterations in user interfaces [7] and considering the absence of a clear explanation for our nudge intervention, we posit that digital nudges might undermine trust in e-commerce platforms. Consequently, we hypothesize:

H3 Digital nudges have an adverse impact on trust perceptions within an e-commerce context when compared to scenarios without nudging.

Perceived privacy risk (PPR) stands out as a significant component of perceived risk [64]. It becomes particularly pronounced in environments like e-commerce platforms, where vast amounts of user data are harvested and analyzed. This risk stems from potential unauthorized access to, or illicit dissemination of, sensitive consumer details [40]. In essence, PPR reflects the degree of an individual’s readiness to share personal data, weighed against potential privacy infringements [65].

Barev et al. [66] examined the effects of two nudges, specifically framing and social nudges, on information sharing behaviors in digital workplaces. Their findings revealed that the social nudge was perceived as intrusive and seemed to exploit individual vulnerabilities. Kroll and Stieglitz [67], while exploring privacy-centric factors that influence self-disclosure on social networks, discovered that digital nudges can sometimes evoke heightened privacy apprehensions. Huang et al. [68] point out that mishandling sensitive user data, especially within recommender systems, could lead to harmful exposure. While recommender systems, acting as digital nudges, are designed to aid users in making decisions by minimizing cognitive strain [69], such personalized suggestions might come across as overbearing to some users.

Given these insights, it can be inferred that digital nudges might elevate PPR on e-commerce platforms. In other words, such interventions might be perceived as encroachments on privacy. Therefore, we hypothesize:

H4 Digital nudge interventions amplify the perception of privacy risk in an e-commerce context, compared to scenarios without nudges.

3 Research model

The TAM, foundational in information systems (IS) research, has been key in deciphering individual acceptance of IT and IS [47]. It underscores the paramount antecedents, PU and PEOU, that dictate the intention to use a technology. Over time, the TAM has evolved, integrating channel-specific perception factors suited to different contexts like online shopping and e-commerce. A notable adaptation is by Wang and Benbasat [70], who introduced the integrated Trust-TAM, tailored for online recommendation agents. This model posits that alongside PU and PEOU, trusting beliefs in such agents amplify consumers’ intent to engage with them. Additionally, research works, such as those by Pavlou [40] and Gefen et al. [43], have delved into the multifaceted nature of trust within e-commerce. While Pavlou [40] enriched the TAM framework by weaving in perceived risk and trust, Gefen et al. [43] identified both technology-driven (like PU and PEOU) and trust-centric antecedents as crucial. The significance of both technology and channel perception factors for IS and IT acceptance has been reaffirmed across diverse studies, largely leveraging variance-based structural equation modeling. Yet, the majority of these acceptance investigations are observational, where researchers passively gauge patterns without instigating active interventions. Especially for psychological effects, such as those at play in nudging, gauging a nudge’s impact is challenging without a control group for comparison. Thus, it is imperative to transition from hypothetical observational studies to intervention-based research.

Research on nudging within e-commerce settings has predominantly focused on intervention studies, often overlooking the importance of understanding users’ perceptions of digital nudges. Our study seeks to enhance the existing literature on e-commerce DN by incorporating nudging interventions with the foundational constructs of the TAM — specifically, PU and PEOU — along with their adaptations that emphasize TRUST and PPR. These elements are critical in influencing e-commerce decision-making processes [69,70,71]. By doing so, our research not only contributes to e-commerce DN literature but also extends its implications to human-computer interaction (HCI) studies. It highlights the pivotal role of user perception in the development and execution of digital nudges, stressing the importance of integrating psychological and behavioral insights with technological advancements to develop more effective and user-centric e-commerce platforms.

For our nudging interventions, we turn to social norms (SN), default (DF) settings, and scarcity warnings (SW). Social norms can be described as the understood “rules and standards” within a group that guide or restrict behavior without legal enforcement [72, p. 152]. A tangible representation of this in the e-commerce domain is the recommendation lists that users often encounter on product pages. For instance, users might see a prompt like “Customers who bought this item also bought […]”, indicating preferences of previous customers [33, p. 641]. Limayem et al. [73] highlighted the influential role of such norms, noting that these social parameters can considerably sway online shopping behaviors [74]. A default represents a choice made in advance, which users tend to stick with [75]. This adherence to a pre-made choice is rooted in the natural human preference for the status quo [76]. In the e-commerce realm, examples include pre-ticked boxes for specific delivery options during the checkout process [33]. Defaults are recognized as a prevalent and influential digital nudge [59]. Scarcity warnings hinge on the premise that individuals often prioritize potential losses over prospective gains [76]. This behavior manifests in e-commerce as notifications like “In high demand!” or “-45% only today!” Such alerts prompt users to act swiftly to avoid missing out [33]. E-commerce platforms frequently deploy such scarcity indications, especially concerning limited time or product availability [77].

Our selection of these specific digital nudges rests on two grounds. First, they resonate deeply within DN research, echoing their fundamental psychological underpinnings. A growing body of recent DN research delves into the nuances of default settings and social norm nudges, exploring their underlying psychological catalysts [22,25,48,78,79]. The effects linked to social norms, status quo bias, and loss aversion are recurrent themes in psychological studies centered on users [33]. Second, our study gleans insights from the digital nudge design blueprint proposed by Mirsch et al. [80], which sheds light on quintessential DN principles [33]. The blueprint accentuates the influence of social norms, showcases the pivotal role of default settings through status quo bias, and emphasizes the compelling force of scarcity cues rooted in loss aversion [80].

The derived research model is illustrated in Fig. 1.

Fig. 1 Research model (based on [40, 43])

4 Research approach

4.1 Research design

To address our research question (RQ), we adopted a quantitative approach, devising an online experiment centered on a hypothetical e-commerce scenario. We developed four distinct versions of a simulated e-commerce platform named “Shopera.” The versions differed solely in the digital nudge employed: social norms, default settings, scarcity warnings, or, in the control version, no nudge at all. We tasked participants with selecting a smartphone on Shopera, a product chosen for its gender-neutral appeal and ubiquity. In crafting this exercise, we meticulously avoided references to specific brands, proprietary language, or distinctive selling points to minimize external influences and home in on the potential effects of the digital nudges. Our design for these nudges drew inspiration from the structured methodology proposed by Mirsch et al. [80].

Following this methodology, the first phase comprised the contextualization provided by the Shopera scenario, and the second phase addressed the ideation and design of the DN concepts.

In the third phase, we transitioned to the technical realization of the digital nudges, crafting four distinctive versions of Shopera via Shopify [81, 82]. The interventions included SN, DF settings, and SW, complemented by a no nudge (NN) version that served as the control condition. The SN nudge was realized through a collaborative recommendation system [74]. This system draws inspiration from the “social navigation” concept, wherein individuals tend to align their actions based on observed behaviors of others, seeking cues and direction [83]. To reinforce this in our study, we incorporated cues like “customers who bought smartphone XT10 also bought”, followed by visuals of ancillary items frequently bought alongside, such as smartphone cases, screen protectors, and power banks, as visualized in the upper left corner of Fig. 2 (see Footnote 1). The DF nudge was implemented via a button indicating a pre-chosen memory capacity (see upper right corner of Fig. 2). The SW was conveyed through the message “Only 1 item left in stock!” (see lower left corner of Fig. 2). In the NN variant, participants encountered a straightforward interface without any particular design interventions. This version acted as our baseline, equipped with standard interface components: an image of the smartphone on offer, its price, an “add to cart” button, and an indicator that the product is hosted by “merchant912” on the Shopera platform (see lower right corner of Fig. 2).

Fig. 2 Operationalization of social norms nudge (1, upper left corner), default nudge (2, upper right corner), scarcity warning nudge (3, lower left corner) and no nudge (4, lower right corner)

Table 1 provides a detailed summary of the digital nudges, their corresponding psychological foundations, and their implementation in our study, while Fig. 3 illustrates the research design.

Fig. 3 Research design (based on [26, 40, 43])

Our digital nudge design process concluded with an evaluation phase: experts were interviewed about potential causes of the main findings of our quantitative research and about their implications for research and practice (cf. Sect. 6).

4.2 Data collection and analysis

4.2.1 Online experiment

Our study commenced with a comprehensive online experiment that introduced participants to the simulated e-commerce platform, “Shopera”. The participants were tasked with purchasing a smartphone through this platform, following which they were asked to respond to questions regarding their perceived experience. The constructs and items within the experiment were derived from existing validated measures (cf. Appendix 1).

The PU construct included items that assess the platform’s design efficiency in facilitating discovery of relevant information for individual product searches, its ability to provide interesting and valuable information, and its overall usefulness [85]. The PEOU construct addressed factors that facilitate understanding of the content, identification of critical information [86], and the overall ease of platform comprehension [40]. Considering the varied objectives of different stakeholders such as platform designers, who may have nudger-driven goals like product sales [87], it was essential to assess users’ perceptions of the platform’s trustworthiness. The items included in this construct focused on the honesty, competence, and effectiveness of the platform’s content and whether it served the users’ best interest [88], as well as the users’ general propensity to trust digital platforms [40]. The items in the PPR construct focused on the perceived riskiness of sharing personal data with the platform and the inherent uncertainty associated with it [89]. Furthermore, participants were asked to indicate the degree to which the platform fostered their freedom of choice, an essential evaluation criterion of digital nudges in the process model of Meske and Potthoff [90] and a fundamental criterion for ethical nudging [91] (see Footnote 2). All items were measured on a 5-point Likert scale.

In preparation for the primary study, we carried out a pretest using a smaller but representative sample of our target population. The aim of the pretest was to examine the clarity, relevance, and appropriateness of the experiment questions, and to estimate the time required to complete the experiment. The pretest sample consisted of 20 participants who were chosen carefully to reflect the diversity and characteristics of our primary study’s intended respondents. We strived to ensure this sample included a mix of gender, age, and educational backgrounds that mirrored our larger participant pool. The selection process for the pretest participants involved both purposeful and convenience sampling strategies. We reached out to potential participants from our personal and professional networks who matched our desired demographic profile. This approach, although less random, ensured that we received feedback from individuals who were similar to our intended study participants. During the pretest, participants were asked to provide feedback on the clarity of the instructions, the relevance of the questions to the study’s objectives, the usability of the experimental interface, and any difficulties they encountered while completing the experiment. The insights from the pretest allowed us to mitigate potential issues and enhance the reliability and validity of our final experimental instrument.

Data collection for our study was carried out from February to March 2022 using convenience sampling, a method chosen for its practicality and efficiency in reaching a broad participant base. We sourced participants through the market research platform SurveyCircle [92], social network survey groups on platforms such as Facebook, LinkedIn, and Xing, and via word-of-mouth referrals. This approach enabled us to efficiently gather a diverse pool of respondents (see the demographics in Sect. 5.1.1).

From the 341 respondents initially engaged, 68 were excluded based on our data cleaning criteria, including incomplete records, failure in attention-checking questions, extraordinarily short completion times, or conspicuous response patterns. This left us with a total of 273 participants for the final analysis. In our nonequivalent groups design, these 273 participants were assigned to one of four distinct versions of Shopera, our simulated e-commerce platform. The assignment process aimed to adhere to equivalent demographics across groups as much as possible. This was achieved by monitoring the demographic distribution (age, gender, education level, e-commerce usage frequency) of respondents as they were enrolled and directing them towards specific versions of Shopera to maintain demographic parity across groups. This careful management of participant distribution was essential in mitigating potential demographic biases and ensuring a balanced representation in each group.

Participants were therefore divided as follows: 66 experienced the SN version, 71 each interacted with the DF and SW versions, and 65 were engaged with the NN version. This non-randomized grouping was a pragmatic decision aligned with the constraints of our online experimental methodology. Our focus was on observing the distinct impacts of each digital nudge in a setting that simulates real-world online shopping environments. However, the nonequivalent nature of group formation necessitated thorough validation checks. We conducted robustness tests to assess baseline demographic differences between the groups. This step was vital in addressing potential biases and ensuring the integrity and comparability of our findings across the different Shopera variants (see Sect. 5.1.1 for the results).
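The sample accounting reported above can be checked with a short calculation. The following is a minimal sketch in Python (not the R code used for the study's analysis); it relies only on figures stated in the text, and all variable names are illustrative assumptions.

```python
# Figures as reported in the text.
initial_respondents = 341
excluded = 68
group_sizes = {"SN": 66, "DF": 71, "SW": 71, "NN": 65}

# Respondents remaining after data cleaning.
final_sample = initial_respondents - excluded

# The four group sizes should add up to the cleaned sample.
assert final_sample == sum(group_sizes.values())

# Share of the final sample assigned to each Shopera version, in percent.
shares = {group: round(100 * n / final_sample, 1) for group, n in group_sizes.items()}

print(final_sample)
print(shares)
```

The shares show that the four groups are of similar, though not identical, size, which is consistent with the monitored (non-randomized) assignment described above.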

Data analysis was performed using the programming language R. Before proceeding with the analysis of variance (ANOVA) to test the hypotheses, we checked whether the necessary assumptions were met. We first conducted Shapiro-Wilk tests to assess the normality of the distribution of our dependent variables. This was complemented by Levene’s test, which assesses homoscedasticity, that is, the equality of variances across groups. We then performed one-way ANOVAs where the data satisfied the assumptions of normality and homogeneity of variance, while Kruskal-Wallis tests were used for data not meeting these criteria. To control for the risk of Type I errors stemming from multiple comparisons, we applied the Bonferroni correction and the Benjamini-Hochberg procedure to the resulting p-values from these tests. Post-hoc tests were subsequently conducted for pairwise comparisons, allowing us to discern significant differences across groups in order to address our RQ. These methodological steps were intended to ensure the robustness and validity of our results.
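To illustrate how the two multiple-comparison corrections named above differ, the following minimal sketch implements them in plain Python. It is not the study's R code; the function names are illustrative, and the raw p-values are hypothetical placeholders rather than results of the experiment.

```python
from typing import List, Sequence


def bonferroni(p_values: Sequence[float]) -> List[float]:
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Walk through the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order (1 = smallest)
        # Scale by m / rank and enforce monotonicity via the running minimum.
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four tests (e.g., one per hypothesis); NOT study results.
raw_p = [0.010, 0.040, 0.030, 0.200]

print("Bonferroni:", bonferroni(raw_p))
print("Benjamini-Hochberg:", benjamini_hochberg(raw_p))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two, whereas Benjamini-Hochberg controls the false discovery rate and therefore never yields larger adjusted p-values than Bonferroni for the same input.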

4.2.2 Expert evaluation

To better integrate the results into existing expertise on the causes of DN mechanisms and their implications for research, we conducted a qualitative follow-up study in the form of expert interviews.

To achieve this evaluation aim, a sample of eleven experts was selected (cf. Table 2). As the research topic is addressed by both behavioral psychology and information systems, we selected experts from both fields. This allows for an informed assessment of the main findings of the online experiment from different disciplines.

Table 2 Experts for the evaluation

Regarding the information systems discipline, seven experts were interviewed, comprising one professor (E2), one postdoctoral researcher (E4), and five research assistants (E7 – E11).

Regarding the discipline of psychology, four experts were interviewed comprising two professors (E1 and E3) and two research assistants (E5 and E6).

Based on the research interest described above, a standardized interview guideline was designed [93], consisting of five topics. First, interviewees were presented with a detailed overview of the experimental design, including an explanation of the RQ and how the digital nudge interventions were operationalized. Second, experts were given the opportunity to ask questions to ensure their comprehension of the experimental design and the aims of the investigation. Third, interviewees were asked to review the results and assess potential causes and implications from their perspective. Fourth, interviewees were presented with the main findings as deduced by the authors and asked to comment on them. Fifth, interviewees were asked to state their research focus.

The interview slides were continuously refined during each individual interview session. For instance, to facilitate identification of the current perception factor under discussion, we have added elements and used color highlighting. The items related to the perception factors were presented solely in the third topic to ensure their presence in the discussion section. This approach adheres to the qualitative research principle of openness [93].

The interviews were conducted remotely between November and December 2023, and topics two to five were recorded (see Footnote 3). The recorded interviews had an average duration of 32 min, and the interview guideline was completed in all instances.

5 Results

5.1 Online experiment

5.1.1 Demographics, between-group descriptive statistics, and group equivalence analysis

Among the 273 study participants, the majority (72.2%) were women, while men accounted for 27.5% and non-binary individuals for 0.4% of the sample. The age distribution was as follows: 0.4% under 17 years, 89.7% within the 18 to 34 years bracket, 7.3% ranging from 35 to 54 years, and 2.6% exceeding 55 years. The majority of the participants held a university degree, with approximately 70% having obtained a bachelor’s degree or higher. Regular use of e-commerce platforms, denoted as usage at least once a month, was reported by 82.3% of the participants.

Table 3 shows metrics regarding the announced data cleaning process (cf. Section 4.2.1):

We further analyzed the descriptive statistics across the four groups (SN, DF, SW, and NN), the details of which are summarized in Table 4. The SN group reported the highest average PU (M = 3.88, SD = 0.52) and PEOU (M = 4.11, SD = 0.45), both accompanied by the lowest standard deviations. TRUST also scored highest in this group (M = 3.69, SD = 0.54). Conversely, the SW group exhibited the highest mean PPR (M = 3.07, SD = 0.84), while the lowest standard deviation for PPR was found in the SN group (SD = 0.83). The largest inter-group disparity was observed in the TRUST construct (SN: M = 3.69, SD = 0.54; SW: M = 2.87, SD = 0.74), while the PPR construct exhibited the smallest variance (SW: M = 3.07, SD = 0.84; SN: M = 2.51, SD = 0.83).

Table 4 Descriptive statistics

To assess reliability, which is important for interpreting study effects and testing results, Cronbach’s alpha was calculated as an internal consistency coefficient [88, 89]. The Cronbach’s alpha values for each perception factor can be found in Appendix 1 (Measurement items of the questionnaire). The internal consistency ranges from moderate (α of PEOU = 0.65) through good (α of TRUST = 0.78) to reliable (α of PU = 0.85, α of PPR = 0.85) [90, 91].
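For readers unfamiliar with the coefficient, the standard formula is α = k/(k − 1) · (1 − Σ item variances / variance of the summed scale), where k is the number of items. The following Python sketch applies it to a small, hypothetical set of item responses; the data and the function name are assumptions for illustration and do not reproduce the study’s questionnaire.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    All items must cover the same respondents in the same order.
    """
    k = len(item_scores)
    # Summed scale score per respondent.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# HYPOTHETICAL responses of four respondents to two items (not study data).
items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

The sketch uses sample variances (n − 1 in the denominator) for both the items and the total score; because the formula only involves their ratio, using population variances throughout would give the same value.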

In assessing the comparability of our nudging groups, we performed Chi-square tests on the demographic variables: gender, age, highest level of education, and frequency of e-commerce platform usage. Our statistical analysis yielded non-significant results across all demographic variables. This indicates that the distribution of these variables is similar among the groups, strengthening the validity of our experimental conditions (see Table 5).
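The logic of such a group equivalence check can be sketched as a Pearson chi-square test of independence on a contingency table of a demographic variable by group. In the Python example below, the column totals match the reported group sizes (66, 71, 71, 65), but the split within each group is invented purely for illustration; the function name and the two row categories are likewise assumptions.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# HYPOTHETICAL counts; columns are SN, DF, SW, NN, rows are two categories
# of one demographic variable. Only the column totals come from the paper.
counts = [
    [48, 51, 52, 46],
    [18, 20, 19, 19],
]
statistic, df = chi_square_independence(counts)

# 5% critical value of the chi-square distribution with 3 degrees of freedom.
CRITICAL_05_DF3 = 7.815
print(f"chi-square = {statistic:.3f}, df = {df}")
print("groups differ at the 5% level:", statistic > CRITICAL_05_DF3)
```

A statistic below the critical value, as in this toy table, corresponds to a non-significant result, meaning there is no evidence that the variable is distributed differently across the groups.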

Table 5 Baseline differences between the nudging groups

Figure 4 presents the results from the analysis of perceived freedom of choice, stratified by the four groups: SN, DF, SW, and NN. Across the three digital nudge groups, the share of high to very high perceived freedom of choice responses was above 60% (SN: 68.2%, DF: 69%, SW: 60.6%). The NN group was comparable at 67.7%. The mean response value for perceived freedom of choice across the digital nudge groups ranged from neutral to high (SN: 3.7, DF: 3.7, SW: 3.5), paralleling the NN group’s value of 3.5.

Fig. 4 Results of perceived freedom of choice

5.1.2 Hypotheses testing

In our investigation of the significance of differences across digital nudge groups for the various constructs, we first performed Shapiro-Wilk tests to assess univariate normality. The tests yielded significant results (p < .05) for TRUST, PPR, and PEOU, implying that these constructs were not normally distributed across all groups. In contrast, we observed non-significant results (p > .05) for PU across all groups, which is consistent with a normal distribution. Furthermore, Levene’s test indicated that heteroscedasticity did not pose an issue across groups. Consequently, we opted for non-parametric tests for TRUST, PEOU, and PPR, while using a parametric test for PU.

The conducted one-way ANOVA for the construct PU did not highlight any significant group differences at the 0.05 level, F(1, 271) = 0.606, p = .437. Consequently, H1 is not supported.

In examining the differences among the digital nudge groups for the constructs PEOU, TRUST, and PPR, we used the Kruskal-Wallis test because these constructs were not normally distributed. Our findings were as follows. For PEOU, no significant group differences were evident (Chi-square = 7.508, df = 3, p > .05); hence, H2 is not supported. For TRUST, there were significant group differences (Chi-square = 53.004, df = 3, p < .05). Lastly, PPR also exhibited significant group differences (Chi-square = 13.238, df = 3, p < .05).
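The reported significance levels can be cross-checked from the test statistics alone. The Python sketch below assumes the usual large-sample approximation, in which the Kruskal-Wallis statistic follows a chi-square distribution with 3 degrees of freedom for four groups, and uses the closed-form upper-tail probability that exists for this particular case. The function name is hypothetical, and the resulting p-values are approximations rather than figures taken from the study.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with df = 3."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Test statistics as reported in the text (df = 3 in all three cases).
reported = {"PEOU": 7.508, "TRUST": 53.004, "PPR": 13.238}

p_values = {name: chi2_sf_df3(stat) for name, stat in reported.items()}
for name, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: chi-square = {reported[name]}, p = {p:.3g} ({verdict})")
```

Under this approximation, the PEOU statistic lies just below the 5% critical value of about 7.81, whereas the TRUST and PPR statistics lie clearly above it, in line with the reported pattern.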

Following these significant findings, we proceeded with a post-hoc analysis using the Wilcoxon rank sum test to further investigate which specific digital nudge groups showed significant differences from the control group (see Table 6). In the case of the TRUST construct, we found significant differences between the DF and NN groups, as well as between the SW and NN groups (p < .01). Referring to the data in Table 4, the DF group had an average TRUST score of 3.3, while the SW group registered a mean value of 2.87. These figures stand in contrast to the NN group, which recorded a higher average of 3.64 for TRUST. This discrepancy indicates that participants in the DF and SW groups exhibited reduced trust in the platform, thereby supporting our hypothesis H3.

However, for the PPR construct, the NN group showed no significant differences when compared to the intervention groups. Thus, H4 is not supported. The significant Kruskal-Wallis result for the PPR construct can be primarily attributed to the differences between the SN and SW groups. To account for the increased risk of Type I errors due to multiple testing, we applied the Bonferroni correction and the Benjamini-Hochberg procedure to the p-values obtained from the above tests. The Bonferroni-corrected p-values remained significant for TRUST and PPR (p < .05/4), indicating robust group differences. The Benjamini-Hochberg procedure also confirmed these significant findings for TRUST and PPR (q < 0.05), suggesting a notable impact of our digital nudge groups on these constructs. The lack of significant differences in the other constructs persisted after the corrections. Thus, these additional analyses further substantiate our initial findings and suggest the robustness of our significant results.
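The two corrections can be sketched in a few lines of Python. The p-values below are illustrative placeholders chosen to be roughly in line with the reported test statistics; they are assumptions, not values taken from the study, and the function names are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# ILLUSTRATIVE p-values for the four omnibus tests (assumed, not reported).
constructs = ["PU", "PEOU", "TRUST", "PPR"]
raw = [0.437, 0.057, 1e-10, 0.004]

bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
for name, p, pb, q in zip(constructs, raw, bonf, bh):
    print(f"{name}: raw = {p:.3g}, Bonferroni = {pb:.3g}, BH = {q:.3g}")
```

Comparing a raw p-value against .05/4 is equivalent to comparing its Bonferroni-adjusted value against .05. With these placeholder inputs, only TRUST and PPR stay below .05 under both procedures.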

Table 6 Results from the Wilcoxon rank sum test

5.2 Expert evaluation

The interviews were coded and analyzed using QDA Miner Lite. The results for potential causes and implications for research/practice are shown below for each perception factor (cf. Table 7 regarding PU/PEOU; cf. Table 8 regarding TRUST; cf. Table 9 regarding PPR). In the following, extracts from the results are presented per perception factor.

Regarding the findings from the online experiment for the perception factors PU/PEOU, many experts were not surprised and proposed that too many other factors in the vignettes were perceived as more decisive than the nudges. The experts provided few implications for these potential causes, referring inter alia to the mapping of individual effects with different scenarios or gradation levels.

Regarding the findings of the online experiment on the perception factor TRUST, experts were unsurprised. They stressed that in the case of the DF nudge intervention, the most sophisticated option for the user is pre-selected. The interviews with experts uncovered numerous potential causes and implications. The social norms nudge intervention was also discussed during the expert interviews, even though this intervention neither increased nor reduced trust, unlike the DF and SW interventions. The experts stated that the SN nudge intervention is advisory in nature, while DF and SW put the user under pressure.

Regarding the findings of the online experiment on the perception factor PPR, some experts were surprised that the findings differed from those of TRUST. The timing of the nudge interventions was proposed as a potential cause, as users did not share personal data or create purchase data.

Table 7 Results regarding perceived usefulness/ perceived ease of use
Table 8 Results regarding trust
Table 9 Results regarding perceived privacy risk

6 Discussion

The objective of our research was to explore the influence of digital nudge interventions on user perceptions within e-commerce environments. We delved into the theoretical underpinnings of DN, outlined key user perception constructs (PU, PEOU, TRUST, PPR), and employed an online experiment to discern perceptual differences among various DN intervention cohorts. Experts were interviewed regarding potential causes and implications for research and practice for the main findings of our quantitative research and will be referenced with their ID. Table 10 provides a concise overview of our main findings (MF), potential causes (PC), and their implications (IR/IP). The contents of Table 10 will be discussed and elaborated upon in the subsequent sections.

Table 10 Main findings, potential causes, and derived implications for research and practice

6.1 Implications for research and practice

6.1.1 Technology-related perception factors: perceived usefulness and perceived ease of use

In evaluating the technology-related perception factors PU and PEOU, our analysis revealed no significant distinctions between any digital nudge intervention and the NN group (MF1). Examining the descriptive statistics in Table 4 further substantiates this, as the mean values for both PU and PEOU remain uniformly elevated, without remarkable variance (MF2).

One explanation for the users’ low involvement relating to these technology-related perception factors may be the product category of smartphones representing an everyday object or a commodity these days (E3, E6) (PC1). This circumstance could make it more difficult to cover the constructs of PU and PEOU. The detected inconsequential nature of DN in affecting user perception is congruent with emerging findings in HCI research concerning nudge efficacy. Mertens et al. [94] highlighted a prevalent publication bias influencing documented outcomes. Their extensive survey, which categorized behavioral domains via Münscher et al.’s [95] choice architecture technique taxonomy, identified interventions accentuating decision structure as more impactful than those centered on decision information or assistance. Maier et al. [96] further deduced from their Bayesian analysis an evident gap in the affirmation of nudges as reliable behavior modification tools. This gap underscores an imperative for heightened scrutiny on the perception and ensuing efficacy of nudges, particularly within e-commerce sector product categories. Future research should examine individuals’ perception of nudges in relation to PU and PEOU by implementing various product categories to enhance user involvement (E7) (IR1).

A secondary rationale addressing the users’ low involvement might stem from the study’s vignette format, which restricts the ability to navigate or engage with the platform (E1, E3) (PC2). From a methodological standpoint, vignette studies are frequently employed in DN research [26, 97]. These studies offer certain advantages: they allow factors to be displayed concurrently, which enhances realism in contrast to survey items [98], and they avoid social desirability bias [99]. In contrast, they entail certain disadvantages, such as losses in external validity [66]. Given these observations, we recommend the following research direction: Future research should examine individuals’ perception of nudges in relation to PU and PEOU by implementing a more realistic experimental setting that permits navigating on the platform to enhance user involvement (IR2). This should balance the methodological trade-offs.

Another underlying reason for the observed findings might be that the users’ low involvement stems from the habituation effects of these nudge interventions regarding PU and PEOU (E6, E8, E10, E11) (PC3). Users who are aware of or accustomed to DN on e-commerce platforms will reflect their attitude towards the nudging or persuading instance [59, 100]. This attitude could be reflected in a low level of user involvement. Therefore, it is crucial to conduct empirical mapping, which could occur during the experiment as a manipulation check (E3). Manipulation checks verify the efficacy of inducing the independent variable [101], the DN interventions in our case. A follow-up study could ask users, after they have been presented with the stimuli, whether they recognized anything. The insights gained could provide an answer to this potential cause. If users are accustomed to these nudges, future research could investigate this as a potential explanation: What are the digital nudges that users typically encounter? Which ones do users know? Thus, future studies should examine the perception of nudging, specifically in terms of PU and PEOU, by empirically demonstrating potential habituation effects of nudging, assessed by a manipulation check (IR3).

6.1.2 Channel-related perception factors: trust and perceived privacy risk

In relation to the channel-associated perception factor TRUST, discernible group disparities were observed: DF exhibited lesser trust-inducement than NN (MF3), and similarly, SW was found less trust-inducing than NN (MF4). With regard to SN, no significant group differences were observed compared to NN (MF5).

A plausible explanation for MF3 and MF4 may be users’ interpretation of DF and SW as “dark patterns” (PC4), a specific ethical phenomenon elucidated by Gray et al. [102]. This term was coined to describe “instances where designers use their knowledge of human behavior (e.g., psychology) and the desires of end users to implement deceptive functionality that is not in the user’s best interest” [102, p. 1]. Although this concept remains somewhat underexplored in HCI academia, it has garnered attention among practitioners and media, shedding light on the risks of manipulative design approaches [102]. To raise awareness about these dark patterns within user interfaces, Brignull et al. [103] curated an online repository showcasing deceptive e-commerce tactics [104]. Hence, users might perceive DF and SW as veiled attempts to steer them towards platform actions that could undermine their interests or lead to potential drawbacks, like stress, pressure (E4, E5, E7, E10, E11), or financial setbacks. Such digital nudges, when misaligned with user welfare, stand in contrast to Thaler and Sunstein’s [105] principle of libertarian paternalism regarding nudges. Steffel et al. [60] emphasized that individuals tend to respond adversarially upon sensing a DF that prioritizes the platform designer or choice architect’s agenda over their own well-being. Adding another layer to this perspective, in an interview with Meske and Amojo [31], Weinmann noted the apparent diminished efficacy of defaults in online spaces compared to offline settings, suggesting that online consumers, frequently encountering defaults, have grown more vigilant [31].

Another underlying reason for MF3 and MF4 might be that users felt their choice freedom was curtailed, leading to a reactive stance (E3, E7) as described in Brehm’s reactance theory [106] (PC5). This is counter to the intended impact of digital nudges. Participants might have misconstrued the DF, possibly assuming they could not alter the “256GB” selection (E2, E5, E6, E7, E10, E11). The experiment did not expressly convey that this choice was modifiable. Notably, a larger proportion in the NN group perceived their choice freedom as either very high (12.3% vs. 11.3%) or high (55.4% vs. 49.3%) compared to the SW group. However, the NN group also had more respondents viewing their choice freedom as either low (21.5% vs. 19.7%) or very low (4.6% vs. 1.4%). By comparison, while the “very high” response for DF (22.5%) exceeded that of NN (12.3%), the “high,” “low,” and “very low” responses trailed behind NN (46.5% < 55.4%, 9.9% < 21.5%, and 4.2% < 4.6%).

Hence, we emphasize this practical recommendation: Decision makers should be explicitly informed about their autonomy in choices, especially the flexibility to modify preselected options (E1) (IP1). Information nudges could be implemented via mouse overs couched in terms of benefits, e.g. regarding storage options: “If you choose the largest one now, it is suitable because you can shoot a lot of videos with it, play games, so if you are a gamer, use it” (E2).

A third hypothesis for MF3 and MF4 is that users might be wary of nudges targeting their intuitive reactions, potentially inducing a reactive response (PC6). As categorized by Caraban et al. [12], DF and SW are nudges targeting our instinctual system 1, whereas SN aims at our deliberative system 2. Philosophically speaking, there is a natural inclination to favor system 2 nudges since they seem more respectful of user autonomy and bolster their agency [107]. To address this, we offer this practical insight: Emphasizing to users that nudges like DF and SW are designed to offset the biases and heuristics inherent to system 1 could prove beneficial (IP2). That is, these nudges could be enriched with boosting elements (E3). Boosts are designed to enable certain behaviors by enhancing existing competencies or developing new ones. Additionally, their purpose is to preserve personal agency and to empower individuals to exert that agency [108]. Platform designers should consider how to combine nudging and boosting. In this vein, it would be interesting to examine how the combination of nudging and boosting elements affects different dimensions of the trust construct (E5), as Calefato et al. [109] investigate affective and cognitive trust dimensions.

An underlying reason why no significant group differences were observed between SN and NN (MF5) might be the distinct nature of the SN nudge intervention in contrast to the other two nudge interventions in this study (E5, E6, E8, E10) (PC7). The SN nudge intervention was operationalized through the cue “customers who bought smartphone XT10 also bought”, followed by images of commonly purchased add-ons (cf. Section 4.1). In contrast, the DF and SW nudge interventions, as operationalized (cf. Section 4.1), relate directly to the purchase of the smartphone, the product in question. Thus, with respect to the TRUST construct, the users could perceive the SN nudge intervention as neutral. In order to gain deeper insights into the mood of the user/customer, narrative accompanying research should be conducted (E3) (IR4). Narrative research involves gathering and then reviewing people’s lived experiences [110]. Through an accompanying study, a comprehensive perspective on the perception of digital nudges regarding trust could be obtained. It would be interesting to consider reviews of merchants by other users as a different operationalization of social norms nudging and to examine its effect on user perception. It has been shown that negative reviews of a retailer with whom a shopper has previously done business decrease repurchase behavior [111]. Research in the ECR community has suggested the redesign of review sort interfaces by integrating options such as perceived helpfulness [112].

Concerning PPR, there were no discernible differences between any digital nudge intervention and NN (MF6). This suggests that the specific implementation of DN did not significantly influence individuals’ perceptions of privacy risk.

One possible explanation for MF6 is the timing of the nudge interventions (E2): Users are in the process of deciding on smartphone specifications, rather than in the buying process where the entry of personal information is required (PC8). Even though Dinev and Hart [113] highlighted privacy concerns as a major impediment to e-commerce adoption, participants in this study may not have primarily associated the nudge interventions with privacy risk (E1, E2, E3, E4, E6, E9). Process models for nudge development have stressed that the effectiveness of several digital nudges depends on their prompt delivery [114]. This leads to our proposed research directive: Future investigations should delve deeper into analyzing diverse time points of nudging and their influence on the perception of privacy risks in the e-commerce landscape (IR5).

Another perspective on MF6 is that the study’s portrayal of digital nudges, as illustrated in Fig. 2, seemed somewhat non-invasive when placed on a hypothetical “nudge-invasiveness scale” (PC9). There was no indication of automated data collection and analysis, nor were specifics detailed, such as which prior purchase influenced the product recommendations. Supporting this notion, Schöning et al. [115] demonstrated that privacy concerns within mobile health bonus programs can be influenced by personalized nudging tailored to users’ cognitive styles. Their findings emphasized that privacy perceptions can be significantly shaped by cognitive preferences [115]. This suggests that for nudges to be more effective, they should be tailored to the individual traits of the user. Dalecke and Karlsen [91] have even proposed a dynamic model for crafting smart nudges that take into account specific data forms, such as user activities, locations, and nudge history. From this, we derive another research implication: Further exploration is needed to determine the impact of highly personalized and invasive smart nudges on privacy perception (IR6). In this context, the enhancement of users’ understanding of the technology behind these interventions [116] might be interesting to examine (E11).

In the course of the qualitative inquiry, it was found that the included perception factors are heterogeneous in nature (E9) and could be interdependent, e.g. PPR could be mediated by trust perception (E6).

6.2 Limitations

This paper acknowledges several limitations, categorized across the pillars of validity, objectivity, and generalizability inherent to social research [117].

In terms of validity, it is pertinent to mention that “Shopera” functioned as a purely experimental platform. Consequently, gauging authentic e-commerce platform usage was infeasible. The findings were derived from a controlled laboratory environment, and the vignette study reflected only a fraction of a genuine e-commerce experience. Without the latitude to engage with or navigate the platform, an accurate appraisal of the constructs PU and PEOU, which are indicative of an artifact’s utility and usability, is difficult. This limited exposure could have similarly curtailed a definitive assessment of the nudge impacts, causing a misalignment with prevailing literature on these constructs. To illustrate, consider the SN nudge: Recommender systems aim to alleviate user effort and associated uncertainties in product exploration [118], while also being heralded as strategies to boost sales [119].

The study’s objectivity encountered challenges, primarily when assessing digital nudges independently from other platform components. This concern became pronounced in some voluntary feedback from experimental participants. For instance, one individual associated the smartphone image with an Apple-branded iPhone and felt its displayed price was too low, arousing skepticism. Another self-proclaimed “technology enthusiast” expressed dissatisfaction with the platform’s insufficient technical details regarding the smartphone. Such feedback suggests participants evaluated the e-commerce platform holistically; they perceived digital nudges not as standalone elements but in conjunction with other platform features. This holistic perception significantly complicates the evaluation of digital nudges within e-commerce environments. Experimental vignette designs are particularly suitable for collecting subjective assessments [120]. As such, the study’s objectivity is compromised due to design choices, with potential repercussions on its validity and reliability. To some extent, this limitation is mitigated by the expert interviews, which assess the implications of the main findings.

Pertaining to generalizability, our study encountered constraints due to its sampling approach. A significant 89.7% of the participants fell within the 18 to 34 age bracket, a distribution likely shaped by our chosen recruitment channels. Given that the entire cohort consisted solely of German respondents, the study’s reach is narrowed further. Furthermore, the quasi-experimental, nonequivalent groups design, while valuable for observing the impact of digital nudges in an online environment, may have influenced the diversity and representativeness of our sample. The lack of randomization in participant assignment to different digital nudge scenarios could also limit the extent to which our results can be generalized. These factors highlight the need for future research that employs a more globally diverse participant pool and potentially more randomized experimental designs. Such studies would enhance the understanding of digital nudges across varied demographics and cultural contexts, providing a more comprehensive view of their impact on Internet users worldwide.

7 Conclusion

The digital landscape is evolving, and with it, the techniques used to influence user perceptions. This research was propelled by a noted knowledge gap: how do users perceive the digital nudges implemented by platform designers? With the aim of exploring the influence of specific DN interventions on key perception factors (PU, PEOU, TRUST, and PPR), we conducted an online experiment, a subsequent statistical analysis, and a qualitative evaluation of the main findings with eleven experts from the information systems and psychology domains. In particular, our results show that the TRUST scores in the DF and SW groups were significantly lower than in the control group (NN), indicating that trust in a platform differs depending on the type of nudge.

The insights derived from this study can be pivotal for platform designers. The findings underscore the nuanced implications digital nudges can have on user perceptions. As e-commerce continues its growth trajectory, understanding these perceptions becomes paramount. Moreover, as technology forges ahead, bringing forth diverse devices and interfaces, the art and science of DN will be of increasing significance. Future HCI studies in this domain would benefit from investigating the evolution of user perceptions across these emerging platforms and interfaces.