To test the hypotheses, an empirical study was conducted in a medium-sized city in eastern Germany. The experiment was designed as a one-factor (stress/no-stress), between-subjects study in a laboratory setting (Kraus et al. 2016). First, each participant entered an experimental room containing a desk, chair, camera, and a bowl of water. After being welcomed by the female experimenter, the participant was told that the study dealt with “relationship marketing.” No further information was provided. After the briefing, participants in the stress condition were treated according to the Socially Evaluated Cold-Pressor Test (SECPT), as described by Schwabe et al. (2008). Although this is a laboratory method of stress induction, it triggers physical reactions comparable to those in real-life stress situations (Cheng 2001). The SECPT is well established in psychological stress research; it combines social and psychological elements of the Trier Social Stress Test (TSST; Kirschbaum et al. 1993) with physical elements of the Cold Pressor Test (CPT; Velasco et al. 1997) for stress induction. Consequently, the SECPT includes physical (cold water) and social (being watched and analyzed) stressors. Thus, the method is suitable both for participants who respond to social and psychological stimuli and for those who do not respond to psychological stimuli but do respond to physical stress induction. According to the procedure, the participant was asked to immerse one hand in cold water (1–4 °C) for either 3 min or as long as possible. At the same time, the female experimenter observed the participant and made notes, including objective stress criteria (e.g., eyelid movement, tremor). She also completed the same stress measure (Tansik and Routhieaux 1999) that the participant subsequently received, but with altered wording to capture the experimenter’s perception of the participant’s stress level (see Table 2).
Meanwhile, the camera recorded the participants, who had been told that three male communication researchers would analyze their facial expressions after the experiment. In contrast to Schwabe et al. (2008), who investigated only male respondents, the present experiment included both genders to increase generalizability. According to Schwabe et al. (2008), being observed by a person of the opposite sex raises the stress level. To account for this, all participants were told that the communication researchers were all male. This ensured that every participant believed they were being observed by at least one person of the opposite sex (the female experimenter for male participants; the male communication researchers for female participants). In the no-stress condition, respondents immersed their hands in warm water (33–38 °C) without observation by the experimenter (although she was passively present) and without camera recording.
In the second part of the experiment, participants were told to test service employees of a hotel-booking hotline. They were instructed to call a booking hotline and obtain a tentative offer for two separate hotel rooms (for themselves and a superior) near the fairground of another medium-sized city in eastern Germany. To ensure sufficient interaction time to accurately assess the hotline employee (at least 1:30 min), this task included asking for a vegan breakfast. Participants were provided with fictional names (male and female) and email addresses to be used in the task.
The fictional booking hotline was staffed by a male call-center professional who was instructed to be consistently polite and to recommend the same two (fictional) hotels, including prices and distances from the fairground. This call-center professional was not aware of the treatment group to which the participant was assigned. After the telephone conversation, the call-center professional filled out the same stress scale (Tansik and Routhieaux 1999) as the participants and the female experimenter, based on his impression of the participant’s voice, to obtain a more extensive evaluation of the participant’s stress level.
After finishing the call, participants were asked to fill out a paper-based questionnaire. The seven-point Likert scales for IPL (Nicholson et al. 2001) were slightly modified to fit the case of employee contact via telephone. In addition to the item from the original scale, “I like this service employee as much as other people I know,” we added “I like this service employee more than other people I know,” and “I like this service employee less than other people I know.” This modification was necessary to better assess the assumed direction (positive or negative) in cases of low agreement. The questionnaire also included repurchase intention (three items; Maxham and Netemeyer 2002), WoM intention (four items; Zeithaml et al. 1996), and a stress measurement (three items; Tansik and Routhieaux 1999). Participants were also asked how they thought the booking company would behave toward regular customers, based on the constructs of perceived relationship quality and perceived relationship investment (trust, three items; commitment, three items; satisfaction, three items; and investment, three items; all four adapted from DeWulf et al. 2001). The questionnaire ended with demographics and several questions related to personal aspects. In this section, participants were also asked whether they smoked and how many cigarettes they smoked daily because, according to Kirschbaum et al. (1993), nicotine consumption affects participants’ responsiveness to stress. All multi-item scales used seven-point Likert-type formats, were translated into German, and were verified by back translation. Item wordings are provided in Tables 1 and 2. After completion of the experiment, participants were debriefed via email.
The sample included 104 participants (55% male) with a mean age of 29.73 years (range: 20–64 years). Sixty-three percent were students; the other participants were employees or workers (20%), freelancers (7%), job seekers (5%), or others (5%). Two participants were excluded due to an incomplete questionnaire and calling task. Participants were recruited by attendees of a seminar course as part of the universities’ master’s programs. As an incentive, all participants were given the opportunity to participate in a lottery for three Amazon vouchers (30 euros each). To prevent any physical harm, people with cardiovascular diseases were excluded in advance (Schwabe et al. 2008). Participants were randomly assigned to one of the two groups. Overall, 52 participants were assigned to the no-stress treatment and 50 to the stress treatment.
Analysis and results
Overview of approaches
The analysis is structured as follows. First, the preconditions for analyzing the individual hypotheses were tested. For all multi-item variables, indices were used to represent the underlying construct. All constructs met the minimum requirements for reliability in terms of Cronbach’s α (> .7; Table 3), dimensionality checks in exploratory factor analysis (EFA), and convergent validity (confirmatory factor analysis, CFA, with average variance extracted, AVE, > .5; Table 3). To verify discriminant validity, the Fornell–Larcker criterion (Fornell and Larcker 1981) was applied. All multi-item constructs were found to be discriminant, as the average variance extracted was consistently larger than the squared correlations with all other constructs. The stress level measure drew on three different sources (ratings by the participant, the experimenter, and the “employee”) to account for: (a) social desirability biases, because participants may state that they were not nervous or anxious, contrary to their apparent behavior (e.g., red face, tremor, rapid eyelid movement); and (b) experimenter biases in the evaluation of stress level, possibly caused by knowledge of the condition (stress or no stress).
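The two checks named above can be illustrated with a minimal sketch. It assumes hypothetical item responses, AVE values, and inter-construct correlations; none of the numbers below come from the study, and the helper names are illustrative only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def fornell_larcker_ok(ave, correlations):
    """True if every squared inter-construct correlation is smaller than
    the AVE of both constructs involved (Fornell-Larcker criterion)."""
    return all(r ** 2 < min(ave[a], ave[b]) for (a, b), r in correlations.items())


# Hypothetical responses of five participants to three items (seven-point scale).
items = [[5, 6, 3, 4, 7], [4, 6, 3, 5, 7], [5, 5, 2, 4, 6]]
alpha = cronbach_alpha(items)

# Hypothetical AVE values and inter-construct correlations.
ave = {"trust": 0.62, "commitment": 0.58, "satisfaction": 0.71}
corr = {
    ("trust", "commitment"): 0.55,
    ("trust", "satisfaction"): 0.60,
    ("commitment", "satisfaction"): 0.48,
}
discriminant = fornell_larcker_ok(ave, corr)

print(f"alpha = {alpha:.2f}, reliable = {alpha > 0.7}, discriminant = {discriminant}")
```

In this sketch a construct passes when α exceeds .7 and when its AVE exceeds every squared correlation with the other constructs, which mirrors the thresholds stated in the text.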
Second, having established the validity of our constructs, we applied two approaches to address the multi-stage nature of our chain of effects to test the hypotheses. Following Bagozzi and Yi (1989), we used a SEM-based multi-stage model with treatment groups being a dummy variable (0 = no stress group, 1 = stress group). Further, we applied multiple mediation models following Preacher and Hayes (2008) that model all proposed mediations: (a) treatment → stress level → IPL; (b) stress level → IPL → relationship quality/relationship investment; and (c) IPL → relationship quality/relationship investment → behavioral intention variables. Again, stress was used as a dummy variable with identical coding.
SEM-approach to the chain of effects
With this approach, we investigated whether the chain of effects proposed by the hypotheses holds together. To do so, a dummy variable for the treatment and index scores for the continuously measured variables were entered into a covariance-based structural equation model estimated via maximum likelihood (Bagozzi and Yi 1989). Our models were set up with all variables from a previous stage being considered as predictors. For example, relationship satisfaction was modeled as being influenced by stress treatment, stress level, and IPL. Figure 2 depicts the results. With respect to the dummy stress treatment variable, it should be noted that the unstandardized path coefficient to stress level (b = .318, t(101) = 2.205, p < .05) estimates the overall difference in stress level between the two groups (M_no-stress = 3.21, SD = .820; M_stress = 3.53, SD = .637), equivalent to a t test or one-factor ANOVA. Stress level itself promoted IPL (b = .321, t(101) = 2.236, p < .05), supporting H1. In turn, IPL increased relationship satisfaction (b = .231, t(101) = 1.998, p < .05), relationship trust (b = .238, t(101) = 3.172, p < .01), relationship commitment (b = .407, t(101) = 3.723, p < .001), and relationship investment (b = .285, t(101) = 2.532, p < .05). Thus, hypotheses H3a–d were confirmed. Further, IPL influenced WoM intention (b = .330, t(101) = 2.124, p < .01) but not repurchase intention (b = .185, t(101) = 1.474, p > .05). Of the relationship constructs, relationship satisfaction (b = .166, t(101) = 1.417, p > .05) and relationship commitment (b = − .109, t(101) = 1.278, p > .05) had no effect on WoM intention, while relationship trust (b = .302, t(101) = 2.333, p < .05) and relationship investment (b = .307, t(101) = 2.998, p < .01) did, confirming H4b and H4d, but rejecting H4a and H4c.
Likewise, relationship satisfaction (b = .215, t(101) = 2.139, p < .05) and relationship investment (b = .367, t(101) = 3.924, p < .001) influenced repurchase intention, but neither relationship trust (b = .058, t(101) = .524, p > .05) nor relationship commitment (b = − .030, t(101) = .299, p > .05) did. Hence, H4e and H4h were confirmed, but not H4f or H4g. From an exploratory perspective, IPL explained between 7.1% (relationship satisfaction) and 30.1% (WoM intention) of the variance in those outcomes. The stress manipulation had a statistical power of .703 (p = .05, one-sided). Among the significant effects (p < .05), the effect of relationship investment on repurchase intention had the highest power (.988), and the effect of IPL on relationship satisfaction had the lowest (.742).
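The equivalence between the dummy-variable coefficient and a two-group t test can be checked roughly from the reported summary statistics. The sketch below assumes a pooled-variance t test and the group sizes given in the sample description (52 and 50); because it starts from rounded means and standard deviations, it only approximates the reported b = .318 and t = 2.205, and this classical test has n1 + n2 − 2 degrees of freedom.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance two-sample t statistic computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m2 - m1) / se, df


# Reported group means and SDs; group sizes taken from the sample description.
mean_diff = 3.53 - 3.21  # corresponds to the dummy coefficient b
t_stat, df = pooled_t(3.21, 0.820, 52, 3.53, 0.637, 50)

print(f"mean difference = {mean_diff:.2f}, t({df}) = {t_stat:.2f}")
```

The mean difference of about .32 matches the unstandardized path coefficient, which is what the text means by the coefficient being equivalent to a t test.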
Linear regressions for mediations
Next, we turned to the second approach and tested whether the proposed mediations were present. Again, a dummy variable was used to capture the stress treatment groups. In line with Preacher and Hayes (2008), a bootstrapping approach (5000 resamples) was used to obtain the standard errors and significance tests (t test) of the indirect effects as well as 95% confidence intervals (LCI and UCI denote the lower and upper confidence limits). Again, predictors from a previous stage were also modeled. Note that the indirect effects were calculated post hoc and were not used for hypothesis testing.
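The bootstrapping logic can be sketched as follows. This is a simplified single-mediator illustration with a percentile confidence interval, not the multiple-mediator models estimated in the study; the data are synthetic, and all function names and effect sizes are assumptions made for the example.

```python
import random


def cov(u, v):
    """Sample covariance of two equally long lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a * b, with a from M ~ X and b the coefficient of M in Y ~ X + M (OLS)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, resamples=5000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * resamples)]
    upper = estimates[int((1 + level) / 2 * resamples) - 1]
    return lower, upper


# Synthetic data (not the study's): X = treatment dummy, M = mediator, Y = outcome.
rng = random.Random(42)
x = [i % 2 for i in range(102)]
m = [3.0 + 1.0 * xi + rng.gauss(0, 0.7) for xi in x]
y = [1.0 + 0.8 * mi + rng.gauss(0, 0.7) for mi in m]

est = indirect_effect(x, m, y)
lci, uci = bootstrap_ci(x, m, y, resamples=2000)  # the study used 5000 resamples
print(f"indirect effect = {est:.3f}, LCI = {lci:.3f}, UCI = {uci:.3f}")
```

A mediation is read as significant when the interval between LCI and UCI excludes zero, which is how the intervals reported below can be interpreted.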
Hypothesis H2 states that a higher stress level is associated with higher IPL for the service employee. Stress treatment significantly increased stress level (b = .318, t(101) = 2.183, p < .05), and stress level in turn raised IPL (b = .321, t(101) = 2.444, p < .05), confirming H1. The indirect effect indicated a significant mediation (indirect effect = .102, p < .05, LCI = .002, UCI = .250).
In line with Nicholson et al. (2001), a positive effect of IPL on the (perceived) relationship quality constructs and relationship investment was expected. IPL promoted relationship satisfaction (b = .231, t(101) = 2.228, p < .05) and thereby mediated the effect of stress level (indirect effect = .073, p < .05, LCI = .002, UCI = .181). This mediation was also found for relationship trust (b = .238, t(101) = 2.758, p < .01, indirect effect = .076, p < .05, LCI = .001, UCI = .176), relationship commitment (b = .407, t(101) = 3.399, p < .001, indirect effect = .133, p < .05, LCI = .020, UCI = .282), and relationship investment (b = .285, t(101) = 2.588, p < .05, indirect effect = .092, p < .05, LCI = .006, UCI = .218). Thus, hypotheses H3a, H3b, H3c, and H3d were confirmed. Results showed that perceived relationship investment was, as expected, positively affected by IPL. That is, personal sympathy increases the likelihood that a customer evaluates a company as caring and active. Likewise, relationship commitment, defined as “a willingness to maintain a relationship with a firm” (Grégoire et al. 2009, p. 20), was also positively affected by IPL.
Further, it was assumed that relationship quality and relationship investment increase the behavioral intention constructs. However, with respect to the dependent variable WoM intention, this hypothesis could only be confirmed for relationship investment (b = .307, t(101) = 2.222, p < .05, indirect effect = .087, p < .05, LCI = .005, UCI = .213), but not for relationship satisfaction (b = .166, t(101) = 1.000, p > .05, indirect effect = .038, p > .05, LCI = − .039, UCI = .149), relationship trust (b = .302, t(101) = 1.799, p > .05, indirect effect = .072, p > .05, LCI = − .004, UCI = .188), or relationship commitment (b = − .109, t(101) = 1.060, p > .05, indirect effect = −.044, p > .05, LCI = −.147, UCI = .040). Hence, only H4d could be confirmed; H4a–c were rejected. Instead, IPL directly influenced satisfaction, trust, and commitment.
For repurchase intention, the same pattern was found. Relationship investment (b = .367, t(101) = 2.794, p < .01, indirect effect = .104, p < .05, LCI = .014, UCI = .236), but not relationship satisfaction (b = .215, t(101) = 1.365, p > .05, indirect effect = .038, p > .05, LCI = − .039, UCI = .149), relationship trust (b = .058, t(101) = .362, p > .05, indirect effect = .014, p > .05, LCI = − .067, UCI = .098), or relationship commitment (b = − .030, t(101) = .307, p > .05, indirect effect = − .012, p > .05, LCI = − .098, UCI = .070), significantly mediated the effect of IPL on repurchase intention (Table 4). Consequently, H4h was confirmed, but not H4e–g. On an exploratory basis, IPL did not directly influence satisfaction, trust, and commitment.
It is noted that statistical power did not differ from the first approach, ranging from .706 for stress treatment on stress level to .978 for relationship investment on WoM intention. An interesting effect was found in both approaches. Stress level itself had a negative impact on relationship trust (SEM: b = − .271, t(101) = 2.436, p < .05; Regression: b = − .271, t(101) = 2.334, p < .05). That is, despite stress having a beneficial effect on IPL, and IPL increasing trust, stress also undermined the trustworthiness of a service employee to some degree. No other substantial direct effect of stress level or stress treatment was found.
The two approaches described above yielded partly inconsistent results. While the first hypotheses (H1, H2, H3a–d) were confirmed in both approaches, the findings for hypotheses H4a–h were uncertain. For this reason, H4a–c and H4e–g were rejected; only the mediation through relationship investment for both outcomes, WoM intention and repurchase intention, could be confirmed here. Table 5 summarizes the final conclusions for all hypotheses.
Finally, we considered gender (0 = male, 1 = female), student status (0 = non-student, 1 = student), and smoking (0 = non-smoking, 1 = smoking) as control variables in the regressions to check robustness. All proposed relationships remained stable. However, smoking significantly affected relationship commitment (b = .283, t(101) = 3.113, p < .01), a finding for which we have no plausible explanation and which we therefore regard as random. A chi-square test revealed no significant association between smoking and treatment group assignment (χ2 = .171, p > .05; non-smokers in no-stress group = 41, non-smokers in stress group = 42, smokers in no-stress group = 11, smokers in stress group = 8). Thus, we can rule out that a substantially larger share of smokers or non-smokers was assigned to one treatment group.
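The chi-square statistic can be recomputed from the four reported cell counts. The sketch below assumes that the Yates continuity correction was applied, which the text does not state; under that assumption the counts reproduce the reported value of about .171.

```python
from math import erfc, sqrt


def chi_square_2x2(table, yates=True):
    """Chi-square test of independence for a 2x2 table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:  # continuity correction (assumed, not stated in the text)
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of the chi-square distribution, df = 1
    return chi2, p_value


# Rows: no-stress group, stress group; columns: non-smokers, smokers (reported counts).
table = [[41, 11], [42, 8]]
chi2, p_value = chi_square_2x2(table)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")  # chi2 = 0.171
```

Without the correction (`yates=False`) the same counts give a larger statistic, but the conclusion of no significant association is unchanged.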