1 Introduction

Forecasts are the basis for a wide range of managerial decisions (Butler & Ghosh, 2015; Chen et al., 2015a). In recent years, organizations have adopted several approaches to using new forms of data as well as new techniques to analyse them (Teoh, 2018). A prominent example is the use of customer data to implement smart, connected products (Porter & Heppelmann, 2014). In the field of management accounting, new forecasting approaches are a manifestation of this trend, and human forecasters are increasingly forced to interact with complex algorithms (Dietvorst et al., 2018).

With regard to forecasting, algorithms have been shown to be more accurate than human forecasters in many fields (Grove et al., 2000). This holds true even for simple forms of algorithms like linear models (Dawes, 1979). Given algorithms’ superior forecasting accuracy, one would expect human forecasters to integrate algorithms into their daily work (Dietvorst et al., 2015; Logg et al., 2019; Venkatesh & Davis, 2000). However, the literature shows a variety of cases in which forecasters reject superior algorithms to the detriment of their own work results (Castelo et al., 2019; Dietvorst et al., 2015; Grove et al., 2000; Prahl & van Swol, 2017). This phenomenon has a long history (Dawes, 1979; Meehl, 1954), and reasons for rejecting superior algorithms range from forecasters’ overconfidence (Logg et al., 2019) to irrational overreaction to new information (Remus et al., 1995).

A seminal study by Dietvorst et al. (2015) defines algorithm aversion as follows: human forecasters refuse to use algorithms that perform far better than themselves after they see an algorithm err. This effect is of high relevance because perfect accuracy is impossible in forecasting (Dietvorst et al., 2015). However, previous research also shows that forecasters are willing to use algorithms with which they have not previously interacted to improve their forecasts. Logg et al. (2019) refer to this willingness as algorithm appreciation. In contrast, algorithm aversion seems to be affected by different aspects of forecasters’ work environment or task specificities. For example, it is mitigated when forecasters can modify the results of an algorithm, even though doing so worsens forecasting accuracy (Dietvorst et al., 2018), or when the task is perceived as objective rather than subjective (Castelo et al., 2019).

Although research on algorithms in forecasting and algorithm aversion has been ongoing for more than 60 years, there are still large gaps in the literature (Burton et al., 2019). The growing availability of algorithms increases the need for a better understanding of how algorithm aversion occurs and how working environments affect it (Castelo et al., 2019). Prior studies have largely neglected the working environment to which forecasters are subjected when they create forecasts and suffer from algorithm aversion. Following Shaw and Gupta (2015) and Merchant and van der Stede (2017), future work must address the question of the conditions under which forecasters direct their actions toward the best performance and thus achieve the best forecasting accuracy. Logg et al. (2019) summed up this discussion by noting that “algorithm aversion is not as straightforward as prior literature suggests, nor as contemporary researchers predict.” This study is therefore motivated by the question: does the working environment of forecasters mitigate algorithm aversion?

We study the environment of forecasters and their decision-making process when interacting with algorithms. More specifically, we draw on the work of Bonner and Sprinkle (2002) to define the most relevant environmental variables for the forecasting task of Dietvorst et al. (2015). We thereby replicate and extend the work of Dietvorst et al. (2015) with three environmental variables emerging from forecasters’ daily work routines. First, we induce time pressure to investigate forecasters’ behaviour when they have only a limited amount of time for their forecasts. Second, we assign a “do your best” goal regarding forecasting accuracy. Third, we give our participants data input decision rights, that is, the right to choose which information the algorithm processes. To integrate these management accounting variables into the work of Dietvorst et al. (2015), we conducted an experimental study on Amazon’s Mechanical Turk with 1840 participants.

Our study contributes to the management accounting and forecasting literature in four ways. First, we contribute to the literature on algorithm aversion by demonstrating that time pressure mitigates algorithm aversion without forecasters needing to modify the algorithm’s outcome and thereby worsening its forecasting accuracy. Second, we add important knowledge to the literature on time pressure by showing that forecasters under time pressure willingly use algorithms as a meaningful way to escape the uncertainty and tension they experience. Third, conversely, a “do your best” goal is ineffective in mitigating algorithm aversion. Fourth, giving forecasters more decision rights by letting them choose the data that the algorithm processes does not affect their algorithm aversion.

2 Theoretical background on algorithm aversion in forecasting

An algorithm is essentially a sequence of mathematical calculations with a specific goal, defined as “a procedure for computing a function” (Rogers, 1987). Algorithms can capture and evaluate large, unstructured sets of data, often referred to as “big data” (Logg et al., 2019; Vitale et al., 2020). Research on algorithms in decision-making dates back to the work of Meehl (1954) and Dawes (1979). Meehl (1954) was the first to describe the psychological aversion to algorithms, emphasizing the fundamental superiority of statistical models over expert opinions and how experts nonetheless assess these superior-performing models with irrational scepticism. In a similar fashion, Dawes (1979) shows how simple linear models can predict students’ success better than experts can. Highhouse (2008) argues that the phenomenon of a forecaster rejecting an algorithm may be rooted in the belief that human forecasters can perform perfectly, whereas algorithms cannot. Grove et al. (2000) confirmed this phenomenon in a meta-analysis of 136 studies, also finding algorithms to display an average superiority of 10% over expert opinions.

Dietvorst et al. (2015) proposed the term algorithm aversion to describe this phenomenon in the presence of performance feedback in forecasting. Performance refers to the accuracy of a forecast, while feedback refers to receiving insights on previous performance—in this case, the performance of forecasts created either by humans or algorithms. The findings by Dietvorst et al. (2015) serve as a framework for research on algorithm aversion in our study. In several incentivized forecasting experiments, they gave participants the opportunity to choose either their own assessment or that of an algorithm. Most participants initially opted for the algorithm and were subsequently shown the algorithm’s results. The transparency of the algorithm’s forecast made the participants aware that the algorithm did not produce perfect results. As soon as the participants became aware of the algorithm’s potential for erroneous results, the rate of rejection increased. This rejection persisted after participants were told that the algorithm, despite not being perfect, performed better than humans on average. This finding is not in line with agency theory, which assumes that individuals maximize their benefits and thus choose the forecast that generates the highest revenues in the long run (Baiman, 1990; Eisenhardt, 1989). This rationale, however, no longer seems to hold in the context of algorithm aversion. It is important to note that Dietvorst et al. (2015) do not postulate a general algorithm aversion in the sense that people will consistently reject any algorithm. Rather, algorithm aversion refers to the increasing rejection of an algorithm once it has been perceived as capable of producing erroneous results.

Performance feedback has often been studied in the literature on forecasting and decision-making, typically with a focus on how and why feedback affects performance (Ashton, 1990; Chen et al., 2015). Research has shown that performance feedback has different effects on forecasters. Depending on the characteristics of forecasters, it can improve their performance (Ashton, 1990; Lourenco et al., 2018), worsen it (Akın & Karagözoğlu, 2017), or not affect it at all when forecasters are self-confident. In the context of algorithm aversion, there is a clear picture of how performance feedback affects decision-making: the effect of feedback is examined not with respect to the forecaster’s accuracy, but with respect to the likelihood that the forecaster will trust an algorithm or themselves. It is postulated that forecasters prefer their own forecasts if they receive performance feedback on both their own performance and the superior performance of an algorithm. Therefore, performance feedback on the algorithm’s forecasting accuracy is the cause of algorithm aversion (Dietvorst et al., 2015). If there is no performance indication of the algorithm’s forecasting accuracy, algorithm aversion does not occur. This adds a new dimension to the impact of feedback, one that existing approaches explain less well. However, there is no evidence regarding whether this effect of performance feedback in algorithm aversion might be affected by other variables.

A further explanation of algorithm aversion considers confidence in the algorithm as a mediating factor. Algorithm aversion can be seen as a loss of confidence in an algorithm once its imperfect performance is perceived by a forecaster. Confidence describes the forecaster’s expectations regarding the accuracy of a forecast. The initial expectations a forecaster holds regarding an algorithm might be driven by past experiences and are therefore unstable over time. Interaction with algorithms over many years in everyday and professional life and the continuous improvement of algorithm performance might lead to increasingly positive experiences with the outcome of algorithms and therefore higher confidence in unknown algorithms (Al-Htaybat & Alberti-Alhtaybat, 2017; Appelbaum et al., 2017; Quattrone, 2016). This could result in two developments concerning algorithm aversion. First, without feedback regarding a specific algorithm’s performance, the use of algorithms in forecasts will increase because there is less evidence leading forecasters to set low expectations and reject unknown algorithms. Second, if the forecaster receives feedback about the imperfect performance of the algorithm, they become uncertain about the performance of that algorithm. This means that with a higher starting level of confidence, a larger loss of confidence can occur, which could increase algorithm aversion over time. In addition, it must be noted that algorithm aversion is only harmful for organizations if it leads to worse forecasting accuracy. Theory holds, however, that forecasters reject even algorithms that are more accurate than they themselves are.

Organizations cannot prevent forecasters from repeatedly collaborating with algorithms and thus from receiving performance feedback in the future. It is therefore important to investigate how algorithm aversion can be lowered. The following hypothesis development discusses influencing variables that we expect to mitigate algorithm aversion when performance feedback is present.

3 Hypothesis development on mitigating algorithm aversion

Dietvorst et al. (2018) were able to show that algorithm aversion is mitigated by the possibility of forecaster intervention in the results of the algorithm, although such interventions generally worsened the forecasts (see also Carbone et al., 1983 or Goodwin & Fildes, 1999). Remarkably, it does not matter whether the forecaster can change the result of the algorithm strongly or only marginally; even small adjustments significantly mitigate algorithm aversion. Due to the effect of worsening forecast accuracy, this might not always be a practical solution.

Logg et al. (2019) follow Dietvorst et al. (2015, 2018) in conducting a series of experiments to answer the following research question: In what cases do forecasters trust the advice of algorithms more than that of experts? Logg et al. (2019) coined the term algorithm appreciation as a parallel to the term algorithm aversion to describe people’s acceptance of algorithms before receiving performance feedback. They show that algorithm appreciation is reduced when the role of the self is involved: people disregard advice in general and show a preference for their own judgment. Due to potentially large errors or unintended functions of algorithms, humans need to maintain a critical scepticism toward algorithms with which they have no experience (McKinney et al., 2017). Castelo et al. (2019) show that algorithm aversion is especially likely to occur when tasks have a subjective character. Consequently, algorithm aversion can be mitigated by increasing the perceived objectivity of a task. The research thus shows that other factors influence algorithm aversion.

Forecasters are “social beings with complex and somewhat changeable motivations, not as isolated operators of stable (probably profit-maximizing) decision models” (Luft, 2016, p. 9). As a result, their decision-making processes are subject to several environmental influencing factors. Beyond the opportunity to change an algorithm’s results and the distinction between more subjective and more objective tasks, there is a large gap regarding forecasters’ daily work environment.

First, during the preparation of forecasts, deadlines play a major role. Data can arrive on short notice and must be processed quickly, which usually leads to time pressure in the preparation of forecasts. Second, realistic payment is usually a fixed remuneration; forecasters’ salaries are rarely tied to the accuracy of a forecast, which is why they tend to simply “do their best” for a forecast without facing reduced remuneration in case of poor performance. Third, forecasts are a recurring task, and forecasters know the basis of their data. They can therefore judge to what extent the data are suitable for the preparation of a forecast and decide for themselves which data their forecasting systems should work with. Linking these circumstances to accounting-relevant variables, we draw on the framework of Bonner and Sprinkle (2002), who point out environmental variables that have been widely investigated in the management accounting literature: time pressure, assigning a “do your best” goal, and assigning decision rights for the data input (Bonner & Sprinkle, 2002).

3.1 The impact of time pressure on algorithm aversion

The availability of data is continuously increasing in terms of both scope and speed. In a fast-moving and volatile environment, there is less time for organizations and their respective employees to turn vast amounts of data into accurate forecasts (Camerer et al., 2004; Spiliopoulos & Ortmann, 2018). In cases where data arrive shortly before the forecast is due, processing the information for a forecast can become time critical. While time pressure can result from the forecasting task as such, a reduction in the time available for a task can also be induced by supervisors. For example, deadlines can be shortened, or data can be passed on later. The importance and effects of time pressure have been examined widely in the literature (Kelly et al., 2011; Lambert et al., 2017; Pietsch & Messier, 2017; Spilker, 1995; Wegier & Spaniol, 2015).

In the existing literature on algorithm aversion, forecasters do not face any time constraints on their forecasts. When time pressure occurs or is induced, there is less time for the forecaster to think of different options and possibilities that might lead to a change in behaviour (Wegier & Spaniol, 2015). Behavioural research has shown that time pressure causes people to feel anxious, lose confidence in their own judgment, and decrease their effort towards a task (Pietsch & Messier, 2017). These findings appear in the literature on choking under pressure, which finds that people’s performance worsens when they lose confidence in their own skills (Beilock & Carr, 2001) and that people may even quit when performance expectations are very high and they no longer believe in themselves (Dai et al., 2018). These negative effects on forecasters mainly affect their self-confidence. Time pressure can, however, also promote a shift towards new ways and strategies of problem-solving (Mather & Lighthall, 2012).

In summary, the literature suggests that introducing time pressure will lower forecasters’ confidence in their own forecasts. Time pressure should not affect the way forecasters evaluate the performance of the algorithm. By decreasing confidence only in their own forecasts, we expect time pressure to mitigate algorithm aversion. We therefore hypothesize:

H1

Time pressure on the human forecaster mitigates algorithm aversion.

3.2 The impact of a “do your best” goal on algorithm aversion

In the literature on algorithm aversion, a payment scheme that encourages forecasters to achieve the best possible, if not perfect, forecast is typically applied. This means that a specific and difficult goal is assigned through an incentive structure that allows for few errors in forecasting accuracy. In the real working environment of forecasters, there are usually no employment contracts that link payment to the accuracy of a forecast. Forecasters receive a fixed salary independent of their individual performance. The targets for a forecast are therefore formulated independently of monetary incentives and can be assigned in the sense of a “do your best” goal. In the following, we therefore discuss how different goal settings can affect algorithm aversion and why we expect a more realistic setting with a “do your best” goal to mitigate algorithm aversion.

One of the most widely proven effects in psychology and management science is that specific and difficult goals lead to better performance than vague “do your best” goals (Locke & Latham, 1990, 2013). This effect has been shown convincingly in over 90% of laboratory and field studies (Locke et al., 1981). In some cases, however, research has shown that specific goals do not lead to better performance (Chen et al., 2015a) or do not affect performance at all (Akın & Karagözoğlu, 2017). Ordóñez et al. (2009) note that while goals focus attention, “[u]nfortunately, goals can focus attention so narrowly that people overlook other important features of a task”. Their arguments suggest several potential reasons why specific goals may have contributed to algorithm aversion so far and why a “do your best” goal could reduce it. These include the fact that specific goals narrow people’s focus, thus hindering adaptation, new working methods, and learning effects (see also Earley et al., 1989; Wood et al., 1990). Such a narrow focus might direct forecasters to their own performance instead of a careful comparison with the performance of the algorithm. In addition, Webb et al. (2013) add that more challenging goals hinder people from using new strategies to become more efficient; instead, people start working harder using their existing methods. This means that difficult goals can increase performance but can also hinder the discovery of new, more efficient working routines. Polzer and Neale (1995) show that goals hinder people from thinking about task dimensions that are not directly affected by the goal.

Following van Dyck et al. (2005), difficult goals hinder people from truly processing new information within the task. Latham and Locke (2013) also note that the fear of making mistakes increases under difficult goals. They suggest that mistakes must be accepted and embraced so that people can properly judge them and learn from them. Such a work environment in which errors are accepted can be established with a “do your best” goal. Setting a “do your best” goal by telling people that they receive their incentive regardless of their performance is expected to be positively related to performance and goal achievement (Gold et al., 2014; Seckler et al., 2017; van Dyck et al., 2005). Building on these arguments, we conclude that difficult goals hinder people from truly processing the superior yet not perfect performance of the algorithm because of forecasters’ narrow focus on the algorithm’s mistakes. To mitigate algorithm aversion, we therefore apply a “do your best” goal and hypothesize:

H2

A “do your best” goal regarding forecasting accuracy mitigates algorithm aversion.

3.3 The impact of data input decision rights on algorithm aversion

Fildes et al. (2009) show that 90% of the results of algorithms in forecasts are not accepted directly but rather are changed by the forecaster. This indicates forecasters’ desire for decision rights when interacting with algorithms. Dietvorst et al. (2018) show that algorithm aversion can be mitigated consistently by letting people modify imperfect forecasts produced by an algorithm. However, modifying the forecast of an algorithm has an unintended effect: adjusting the result of the algorithm causes the accuracy of the forecast to decrease (Carbone et al., 1983; Dietvorst et al., 2018; Goodwin & Fildes, 1999; Remus et al., 1995). Fildes et al. (2009) note that small changes in forecasts impair their accuracy, while bigger changes often lead to improvement. Particularly when forecasters believe they hold current negative evidence about certain information in certain areas that the algorithm does not have in its historical data, it might be useful for them to exclude this information from the algorithm’s information processing (Remus et al., 1995). This implies that forecasters have some need to select the data when they interact with algorithms. It is therefore important to understand how forecasters handle and value information for their own forecast and for that of the algorithm.

The literature on algorithm aversion often describes a predefined algorithm that is built on a particular database. Forecasters must also work with that database. By interacting with the database and the algorithm, forecasters get a feel for the difficulty of the task and for how to derive a forecast from the data. It is important to understand that not all data have the same relevance for the quality of a forecast. The forecaster thus learns over time which data are particularly important for a good forecast.

When gaining experience with the data that underlie an algorithm’s forecast, a human forecaster might think the algorithm is weighting the information incorrectly. This implies that there should be a way to mitigate algorithm aversion by allowing the forecaster to select the information the algorithm processes and therefore giving them data input decision rights. We expect the ability to do so to raise confidence in the algorithm, especially when forecasters have experience with the data and might believe that they know where possible algorithm miscalculations are rooted. We therefore hypothesize:

H3

Data input decision rights for the human forecaster mitigate algorithm aversion.

4 Research design

We conducted an experimental study following the design and procedure of Dietvorst et al. (2015). We collected a similar sample size and used the same data sources. However, we extended the study by introducing three additional variables, which we expect to mitigate algorithm aversion. The following sections illustrate how we conducted our study, the measures we employed, and the structure of each experimental condition.

4.1 Participants

Selection and payment of our participants followed Dietvorst et al. (2015). We conducted our study on Amazon’s Mechanical Turk. Overall, we had 3032 participants and applied an exclusion rate of 38–39%, similar to Dietvorst et al. (2015) (see Table 1). We excluded 865 participants who did not answer the main dependent variable of choosing the algorithm or their own forecast for their incentivized forecast. Another 47 participants were excluded for failing the attention check, and 280 participants failed to answer the belief questions. This left us with 1840 participants. The sample has an average age of 39 years and is 46% female. Table 1 offers an overview comparing the sample of our study with that of Dietvorst et al. (2015).

Table 1 Comparing sample to Dietvorst et al. (2015)

4.2 Experimental design, manipulation, and measurement

In our setting, the participants (n = 1840) were randomly assigned to one of eight conditions (Table 2), which can be separated into two main clusters. One cluster did not gain any experience with the algorithm’s forecasting performance or their own forecasting performance before making an incentivized forecast (algorithm appreciation conditions), and the other did gain such experience (algorithm aversion conditions). In two experimental conditions, no further adjustments were made; these conditions therefore replicate Dietvorst et al. (2015), and we refer to them as “replication”. By comparing participants’ reliance on algorithms between these clusters, algorithm aversion can be shown.

Table 2 Number of participants per condition

We applied the following payment rule to determine the bonus payment for each participant: every participant received $1 for finishing the task. Participants received an additional $1 for a perfect forecast. For each unit of error, this amount was reduced by $0.15. The number of units of error in each forecast is our measure of forecasting accuracy. These measures are important for comparing our findings with prior findings on algorithm aversion, forecasting accuracy, and bonus payments.
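To illustrate the payment rule, the following minimal sketch computes the payout for a single incentivized forecast. The floor at $0 for large errors is our assumption, as the minimum accuracy bonus is not spelled out here.

```python
def bonus_payment(units_of_error: int) -> float:
    """Payout for one incentivized forecast under the stated payment rule:
    $1 for finishing the task plus an additional $1 for a perfect forecast,
    reduced by $0.15 per unit of error (floored at $0; the floor is an assumption)."""
    base_payment = 1.00                                     # for finishing the task
    accuracy_bonus = max(0.0, 1.00 - 0.15 * units_of_error)  # $1 minus $0.15 per unit of error
    return base_payment + accuracy_bonus

# Example: a forecast that is 3 units of error off earns $1.00 + $0.55 = $1.55.
print(bonus_payment(3))
```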

For Hypotheses H1–H3, we investigate those participants who receive performance feedback and therefore suffer from algorithm aversion. To do so, we set the algorithm aversion replication condition as our control condition and compare it with the other three algorithm aversion conditions.

Participants in the time pressure condition were given 12 s to make a decision. We pre-tested not only whether the given time would be short enough to trigger time pressure, but also whether the task could still be fulfilled with sufficient care under this pressure. The payment rule was the same as in the replication conditions. We refer to this condition as the “time pressure” condition.

The participants in the condition with a “do your best” goal were told that they would receive a fixed incentive of $2, regardless of the accuracy of their incentivized forecasting performance. They were told to “do their best” to achieve the best accuracy in the upcoming forecasting task. We refer to this condition as the “do your best” condition.

Participants in the condition with data input decision rights over the information the algorithm processes were told that the algorithm could process only 4 of the 5 variables shown and that they could choose which ones to use. The payment rule was the same as for the replication conditions. We refer to this condition as the “data input decision rights” condition. Table 2 shows the final number of participants we gathered for each condition.

4.3 Experimental task and procedures

We started the study with a screening question to make sure participants were paying attention. After the screening question, participants were asked to solve a forecasting task. Specifically, they were asked to estimate “the rank of 1 U.S. state in terms of the number of airline passengers who departed from that state”. They were given a list of five different pieces of information about the state. They were then told that they were going to receive a prediction from an algorithm that was “developed by transportation analysts”.

Participants in the algorithm aversion conditions then completed 10 practice forecasts (stage 1 forecasts) to become accustomed to the task and the data. After each forecast, they received information about their own forecasting accuracy, the algorithm’s accuracy, and the state’s true rank. The algorithm was the same as in Dietvorst et al. (2015) and could perform better or worse than the participants. Participants in the algorithm appreciation conditions proceeded directly to their incentivized forecast (stage 2 forecast) without gaining experience with the data or the algorithm. Before the incentivized forecast, each participant had to decide whether to tie their incentive to their own forecast or to the algorithm’s forecast.

Since we conducted the study on the same platform as the original, we added a question asking respondents to indicate whether they had previously participated in a similar task. Following Logg et al. (2019), we excluded these participants. We also excluded from our analyses some explorative questions that were asked after the main dependent variable question.

Each participant received information about their specific condition. Participants in all eight conditions were told that not all of the information shown to them would be equally relevant to the forecasting result. To prevent participants in the data input decision rights condition from regretting their first selection and therefore rejecting the algorithm, they could adjust their selection (4 out of 5 input variables) before the incentivized stage 2 forecast. During the stage 1 forecasts, the actual algorithm was not changed; thus, these participants always received the same feedback on the performance of the algorithm as the other conditions. Figure 1 provides a brief description of the experimental procedure.

Fig. 1 Structure of experimental procedure

Each participant also had to answer questions regarding the confidence and belief measures from Dietvorst et al. (2015) (see Table 3). In the literature, decreasing confidence in the algorithm due to performance feedback operates as a mediator for algorithm aversion. The belief measures were collected to closely replicate the study by Dietvorst et al. (2015), and we used them to test whether participants understood the task. Belief questions 1 and 2, as well as confidence questions 3 and 4, were asked in a randomized order. We also asked participants for their age, gender, and highest level of education.

Table 3 Confidence and belief measures

5 Results

In the following, we first present the analyses for algorithm aversion in our replication conditions. For a reliable evaluation of the results, we compare them with those of Dietvorst et al. (2015). Second, we present the hypothesis tests on the mitigating effects of time pressure, “do your best” goals, and forecasters’ data input decision rights on algorithm aversion.

5.1 Comparing results to Dietvorst et al. (2015)

In the algorithm appreciation replication condition, 67% of participants chose the algorithm to determine their incentivized forecast. Feedback on their own performance and that of the algorithm reduces this figure to 47% in the algorithm aversion replication condition (see Fig. 2). This means that algorithm aversion in our replication conditions is significant: χ2(1, N = 438) = 18.85, p < 0.001.
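As an illustration, the following sketch reproduces this kind of two-by-two test. The cell counts are approximated from the reported percentages under the assumption of an even split of the N = 438 participants across the two replication conditions, so the resulting statistic only roughly matches the reported χ2 = 18.85.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Assumed condition sizes (even split of N = 438); exact cell counts are not reported above.
n_appreciation, n_aversion = 219, 219
chose_algorithm = np.round([0.67 * n_appreciation, 0.47 * n_aversion]).astype(int)
chose_own = np.array([n_appreciation, n_aversion]) - chose_algorithm

# 2 x 2 contingency table: rows = choice (algorithm vs. own forecast), columns = condition
table = np.array([chose_algorithm, chose_own])
chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi2({dof}, N = {table.sum()}) = {chi2:.2f}, p = {p:.4f}")
```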

Fig. 2 Descriptive statistics on algorithm aversion compared to Dietvorst et al. (2015)

Comparing our findings with those of Dietvorst et al. (2015), algorithm aversion seems to have increased since 2014. The percentage of participants who chose to tie their incentive to the algorithm without feedback increased from 54% in Dietvorst et al. (2015) to 67% in the present study. The isolated consideration of the feedback conditions also shows an increase in the general acceptance of algorithms from 42 to 47%. The delta representing algorithm aversion increased from 12 percentage points in Dietvorst et al. (2015) to 19 percentage points in our study.

The rejection of the algorithm is only harmful to forecasters—and, in the long term, to organizations—when the algorithm performs better than the forecaster. We therefore provide further evidence of the superior performance of the utilized algorithm (see Table 4). The forecasters’ lower accuracy shown in Table 4 leads to low bonuses for our participants. Forecasters who relied on their own forecast earned a $0.27 bonus, while those who relied on the algorithm earned $0.49. This difference of $0.22 is significant (t(437) = −10.53, p < 0.001).

Table 4 Forecasting performance: means (standard deviation)

5.2 Analysis for mitigated algorithm aversion

In order to investigate whether algorithm aversion is mitigated by the three new variables, we first establish algorithm aversion within each condition by comparing each algorithm appreciation condition with the corresponding algorithm aversion condition for the three new variables. Algorithm aversion is significant within all conditions (time pressure: χ2(1, N = 486) = 9.566, p = 0.002; “do your best”: χ2(1, N = 445) = 24.422, p < 0.001; data input decision rights: χ2(1, N = 471) = 20.335, p < 0.001).

Building on algorithm aversion within each condition, we further analysed algorithm aversion across conditions by comparing each treatment condition to the algorithm aversion replication condition (see Fig. 3). Participants’ preference for the algorithm in the algorithm aversion conditions is as follows: in the replication condition, 47% of the participants chose the algorithm. The data input decision rights condition behaves very similarly to the control condition. While in the “do your best” condition more than 50% of the participants chose the algorithm, one can see the significant influence of time pressure in the almost 56% preference for the algorithm.

Fig. 3 Descriptive statistics on mitigating algorithm aversion

We tested Hypotheses H1–H3 using a one-sided chi-square test between the algorithm aversion conditions in each case, with the algorithm aversion replication condition as the reference (see Table 5).

Table 5 Hypothesis test for mitigated algorithm aversion

Table 5 shows significant support for H1. That is, time pressure mitigates algorithm aversion. Meanwhile, we find no support for H2 or H3.

6 Additional analysis: the role of forecasters’ confidence in their own forecast

The effect of performance feedback in the context of algorithm aversion is described by Dietvorst et al. (2015) as an influence on the forecaster’s confidence in the algorithm and in their own forecast (see Fig. 4). The cause of algorithm aversion is described as the relationship between performance feedback as the independent variable, confidence in the algorithm as a mediator, and the choice between one’s own and the algorithm’s forecast as the dependent variable. Confidence in one’s own forecast has also been tested as a mediator but did not have any significant effects (Dietvorst et al., 2015).

Fig. 4 Potential role of confidence on algorithm aversion

A forecaster who receives performance feedback on their accuracy and on that of an algorithm will lose confidence in the algorithm but not in their own performance, even if the forecaster’s performance is worse. Confidence in the algorithm acts as a mediator between the experience with the algorithm and the forecaster’s choice. Confidence in one’s own forecast, however, has no influence on the forecaster’s decision-making process.

Given this role of confidence, we broaden the analysis to include the influence of confidence in the algorithm’s forecast and confidence in one’s own forecast. It is of particular interest why only time pressure mitigated algorithm aversion. A comparison of the confidence measures reveals a different picture: we find no significant changes in confidence in the algorithm, but we do observe a significant change in confidence in one’s own forecast under time pressure. Thus, time pressure makes forecasters feel increasingly insecure about their own forecasts.

Table 6 shows that due to time pressure, confidence in the forecaster’s own forecast is significantly lower than in the control condition. No other significant differences in confidence between our conditions can be shown.

Table 6 Comparing confidence measures to the algorithm aversion replication condition

We therefore test whether the significant mitigation of algorithm aversion due to time pressure (H1) is related to reduced confidence in the forecaster’s own forecast. We find that confidence in one’s own forecast mediates the mitigation of algorithm aversion. We calculate a binary mediation analysis with a 95% confidence interval around the indirect effect. The sample comprises 442 participants from the algorithm aversion replication condition and the time pressure condition. The dependent variable is the forecaster’s choice to tie the incentive to either the algorithm or the forecaster’s own forecast. The independent variable is whether a participant experienced time pressure, and the mediator is confidence in one’s own forecast. We can thus provide further explanation and support for our hypothesis on the influence of time pressure: with a 95% CI of [0.0145, 0.3177], confidence in one’s own forecast mediates the mitigation of algorithm aversion when forecasters perceive time pressure. In the overall picture, the knowledge on algorithm aversion can thus be expanded as follows. The basic aversion to algorithms whose performance is known to a forecaster results from reduced confidence in the algorithm; here, trust in the forecaster’s own forecast does not yet play a role. If, however, trust in the algorithm is already lost, the resulting aversion can be mitigated by a reduction in confidence in one’s own forecast.
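A minimal sketch of such a binary mediation analysis with a bootstrapped confidence interval around the indirect effect is shown below. The data frame and its column names (time_pressure, confidence_own, chose_algorithm) are hypothetical, and the simple product-of-coefficients approach shown here may differ in detail from the estimation used in the study, so its interval need not coincide numerically with the reported [0.0145, 0.3177].

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def indirect_effect(data: pd.DataFrame) -> float:
    # a-path: treatment (time pressure dummy) -> mediator (confidence in own forecast)
    a = smf.ols("confidence_own ~ time_pressure", data=data).fit().params["time_pressure"]
    # b-path: mediator -> binary choice of the algorithm, controlling for treatment
    b = smf.logit("chose_algorithm ~ confidence_own + time_pressure",
                  data=data).fit(disp=0).params["confidence_own"]
    return a * b  # product-of-coefficients estimate of the indirect effect

def bootstrap_ci(data: pd.DataFrame, reps: int = 5000, seed: int = 1) -> np.ndarray:
    # Percentile bootstrap 95% CI around the indirect effect
    rng = np.random.default_rng(seed)
    estimates = [indirect_effect(data.sample(len(data), replace=True, random_state=rng))
                 for _ in range(reps)]
    return np.percentile(estimates, [2.5, 97.5])
```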

7 Discussion

Algorithm aversion describes the effect whereby forecasters reject a superior-performing algorithm as soon as they recognize it does not have perfect forecasting performance. Algorithm aversion is counterintuitive, as one would expect forecasters to choose a superior algorithm and not choose their own forecasts to determine their forecasting incentive. This behaviour is harmful to any organization. Hence, as algorithms are increasingly common in forecasters’ work, algorithm aversion needs to be better understood, as well as the circumstances that mitigate it.

In the first section of the results, we replicated Dietvorst et al. (2015) to show that people still reject algorithms after seeing their erroneous results. Since forecasters are becoming increasingly used to working with algorithms, relying on them has become a daily routine for many people, leading them to build confidence in algorithms even when they have no performance indications about a particular algorithm (Griffin & Wright, 2015; Quattrone, 2016). This means that in the absence of performance feedback regarding an algorithm’s accuracy, forecasters have considerable confidence in it and show algorithm appreciation in their forecasting tasks. This is somewhat remarkable because participants have no knowledge about the structure and reliability of the algorithm; the algorithm could fail completely. Even though our participants earned more money when relying on the algorithm, this potential for failure calls for more scepticism towards unknown algorithms (McKinney et al., 2017). As any algorithm can fail, questioning its function and performance is crucial to prevent the negative effects of poorly performing algorithms (Fildes et al., 2009).

Once forecasters see an algorithm’s performance, they dismiss it in favour of their objectively poorer individual forecasting performance; even if the algorithm is not the most precise in terms of forecasting accuracy, it still outperforms the participants by far. We show that rejecting algorithms leads to lower forecasting accuracy and thus to costly algorithm aversion. Organizations therefore need to ensure that their algorithms perform acceptably well as soon as they are released. In times when algorithms are used to make forecasts for events with significant social impact, it is, against the background of algorithm aversion, all the more important to avoid issuing bad early forecasts. Instead, an attempt should first be made to produce a reliable forecast, even if doing so takes longer, as this increases credibility and protects people in the long term. These findings make contributions to theory and have practical implications.

We extended these conclusions by integrating three variables from the management accounting literature that we expected to mitigate algorithm aversion. These variables represent specific circumstances of forecasters’ daily work. We found that time pressure can significantly mitigate algorithm aversion. The other two variables we tested—a “do your best” goal and data input decision rights—had no influence on algorithm aversion.

We hypothesized that time pressure mitigates algorithm aversion. Under conditions where forecasters receive performance feedback for both their own forecast and that of an algorithm, they usually choose their own forecast to rely on. When time pressure occurs, they start using the algorithm more and more. This results from the fact that confidence in one’s own forecast mediates the forecaster’s decision-making when that forecast is made under time pressure. When confidence in one’s own forecast decreases due to time pressure, algorithm aversion decreases as well.

Based on the existing evidence on algorithm aversion, our results offer additional suggestions on how to mitigate algorithm aversion without interfering with the result of an algorithm and therefore without worsening it (Dietvorst et al., 2018; Remus et al., 1995). This finding has several important implications and makes time pressure a meaningful instrument for mitigating algorithm aversion in practice. Time pressure can be used as a deliberate control instrument or can arise naturally from the forecasting task itself. In the first case, data can be withheld by management, or deadlines for preparing a forecast can be shortened. In the case of natural time pressure in certain forecasts, the resulting pressure on forecasters can be faced calmly. The stress of time pressure will make forecasters rely on the algorithm because they rely less on their own skills. These findings are in line with the literature on time pressure, which shows that time pressure causes people to feel anxious or to lose confidence in their own judgment (Pietsch & Messier, 2017).

Our results add an important aspect to the literature on time pressure. Time pressure is usually perceived as something negative that leads to uncertainty (Pietsch & Messier, 2017; Wegier & Spaniol, 2015), which in turn leads to poor performance or limited creativity in finding new solutions. Combining the two strands of literature yields an interesting extension. In our study, the predictable effect of time pressure occurs, and participants lose confidence in their performance. At the same time, an algorithm is available to them as an alternative way of reaching the given goal, enabling them to escape from the uncertainty. This shift is of particular importance because it allows participants to forecast more accurately and to defuse the pressure situation. They benefit in two ways, intentionally and unintentionally: lower pressure and better forecasts.

To prevent negative goal effects such as a narrow focus on errors or slower learning, we established a “do your best” goal to mitigate algorithm aversion (Gold et al., 2014; Seckler et al., 2017), reasoning that difficult goals hinder people from truly processing the superior yet not perfect performance of the algorithm because of forecasters’ narrow focus on the algorithm’s mistakes. The “do your best” goal was established with a fixed incentive, and participants were told to “do their best” with respect to forecasting accuracy. Goal theory predicts that when people are told to do their best, they perceive a broader range of acceptable outcomes, which shifts their focus from performing the task to finding new and alternative solutions (Locke & Latham, 2013; Webb et al., 2013). Surprisingly, assigning such a “do your best” goal did not affect forecasters’ algorithm aversion, and thus our corresponding hypothesis must be rejected. In line with prior findings (Akın & Karagözoğlu, 2017), participants in this condition behaved like those in the control conditions, with one meaningful difference in terms of practical implications: due to the fixed incentive, they earned by far the highest bonus payments while working with the same algorithm. Simply rewarding forecasters regardless of their forecasting accuracy therefore does not appear to be a proper way to address algorithm aversion. Even though they perceive a broader range of acceptable outcomes, they do not change their behaviour toward the use of algorithms (Akın & Karagözoğlu, 2017). Participants did not appear to test the algorithm or use it out of curiosity. Given the increased payment without positive effects, this approach should even be avoided.

With regard to enabling forecasters with decision rights over the data the algorithm processes, we hypothesized that by influencing the information input, participants would become more confident in the algorithm, which would mitigate algorithm aversion (Dietvorst et al., 2018; Kren & Liao, 1988). The literature suggests that algorithm aversion can be mitigated by letting people modify imperfect forecasts produced by an algorithm; however, adjusting the result of the algorithm causes the accuracy of the forecast to decrease (Carbone et al., 1983; Dietvorst et al., 2018; Goodwin & Fildes, 1999; Remus et al., 1995). In contrast, we find almost no divergence in behaviour between participants who have data input decision rights and those who do not. Regarding decision rights when working with algorithms, it must therefore be stated that algorithm aversion appears to be reducible only when forecasters can manipulate the output of an algorithm, not its input.

Our study is subject to several limitations. First, we could not ensure that participants had not taken on a similar task before and therefore may have had experience with the task. We used a screening question, but participants may not have answered truthfully. Second, for the data input decision rights condition, there is a possibility that confidence in the algorithm was unintentionally influenced by the reduced amount of information input compared with the other conditions, as participants thought their algorithm processed only four input sources. Third, in line with the study we replicate, our sample does not consist of forecasting experts. Data input decision rights might be a more effective way of reducing algorithm aversion when experts instead of laypeople are involved; such rights might raise their overconfidence and thereby act as a factor that decreases algorithm aversion (Logg et al., 2019). Fourth, based on the original incentive plan, forecasters who relied on their own forecast earned a $0.27 bonus, while those who relied on the algorithm earned $0.49. The expected value for the “do your best” conditions differed from this original payment scheme: a “do your best” goal implies a fixed incentive that is not linked to participant performance, which raised the expected value to a $1 bonus regardless of the forecaster’s performance. This was necessary to ensure that participants were not demotivated by smaller potential maximum incentives. Finally, there is a bias in the experimental design with respect to forecasting accuracy because of training: participants in the feedback conditions had an advantage of 10 trial runs over participants who were not in the feedback conditions. We did not further investigate what impact, besides algorithm aversion, this might have had.

Our findings pose challenges and provide avenues for future research in several ways. First, it should be noted that the negative effect of seeing the algorithm perform (feedback) is very stable in this study and could only be affected by time pressure (Akın & Karagözoğlu, 2017; Bandiera et al., 2013; Lourenco et al., 2018). The question therefore arises of how it can be further addressed. The feedback was given to the participants in the form of a comparison of their absolute assessment and that of the algorithm. The feedback could be presented differently to mitigate the undesired feedback effect. Percentage deviations per forecast and the mean deviation of the forecasts could be shown to the forecaster. A performance summary could be given to the forecaster to present feedback in a neutral and objective way and to avoid effects that could lead to an overreaction due to large discrepancies in the algorithm’s performance (Petropoulos et al., 2016; Remus et al., 1995) or to a few very good human forecasts promoting overconfidence in the forecaster’s performance (Choi & Hui, 2014; Grieco & Hogarth, 2009; Logg et al., 2019). Furthermore, the presentation of feedback could be optimized by new technologies such as voice output (Bentley et al., 2018) or KPI dashboards, thus leading to greater acceptance.