1 Introduction

Hurricane Katrina struck the U.S. Gulf Coast in 2005, disabling most of the oil drilling and refining capacity in the region, which produces approximately 7% of the oil consumed in the U.S. (Mouawad 2005). Meanwhile, gasoline consumers reacted to the potential disruption, resulting in price spikes and long lines for gasoline (Mouawad and Romero 2005), as well as chaotic gasoline-buying patterns (Gold et al. 2005).

In the days and weeks following Katrina, therefore, the demand end of the gasoline supply chain temporarily experienced increased order volatility, while the supply end experienced less volatility since drillers and refiners were operating full-tilt at their (newly reduced) capacity. This suggests that demand volatility was greater than supply volatility, the reverse of the classical bullwhip effect (BWE). Indeed, we postulate in this paper that a reverse bullwhip effect (RBWE) occurs during and immediately after supply disruptions. We present evidence for the RBWE using both a variant of the “beer game” and a simulation study. Moreover, our study shows that human behavior adds a further layer of variability to systems under supply disruptions, suggesting that the modeling of operational disruptions needs to take human reactions into account. This introduces new challenges for designing and managing flexible supply chains.

The classical bullwhip effect (BWE) describes the amplification of order variability as one moves upstream in the supply chain. The BWE was formally introduced and analyzed by Lee et al. (1997b), and has since drawn extensive attention from both academia and industry. However, recent evidence suggests that the BWE does not prevail in general. Baganha and Cohen (1998) study the quantity of shipments from manufacturers and that of sales from wholesalers and retailers in the USA from 1978 to 1985. They conclude, by examining the coefficient of variation, that it is the wholesalers rather than the manufacturers who see the largest variance of demand (i.e., orders from retailers). This implies that the wholesalers actually smooth the orders received from the retailers rather than amplifying them. Moreover, Cachon et al. (2007) perform a detailed empirical study at the industry level and show that only 47% of industries studied exhibit the BWE, while the remaining 53% show the reverse, again demonstrating that order variance tends to be largest in the middle of the supply chain. Similarly, although a number of studies have confirmed the presence of the BWE in an experimental setting using the beer game, several of these studies (Croson and Donohue 2003, 2006; Croson et al. 2004; Kaminsky and Simchi-Levi 2000; Wu and Katok 2006) find a substantial portion of trials in which the opposite effect occurs. Our findings help to explain the BWE, or lack thereof, identified by these empirical and experimental studies.

In this paper, we show that both the BWE and the RBWE can occur during the beer game in the presence of supply disruptions. In order to examine and verify the kind of behaviors that trigger the BWE and RBWE, we perform statistical analyses on the results of the beer game experiment and use simulation both to validate the findings of the experiment and to generate additional insight. Previous beer game studies (e.g., Sterman 1989; Croson and Donohue 2006) have suggested that the BWE is caused by demand uncertainty and an underweighting of the supply line (i.e., partially ignoring on-order inventory when setting order quantities). Our results confirm these findings. However, we also find that some players put more emphasis on the supply line in the presence of disruptions. Moreover, we observe that some players either increase or decrease their order quantities significantly during supply disruptions. For those players, we find the inverse result to earlier studies: that overreaction to supply disruptions and overweighting of the supply line can cause the RBWE. Using the terminology of our bullwhip metaphor in Sect. 1.1, overreaction to supply shocks serves as an amplifying factor, overweighting of the supply line serves as a damping factor, and both cause the RBWE.

When the BWE occurs, the demand/order variability increases upstream. In this case, as Glatzel et al. (2009) advocate, additional flexible resources are needed upstream, because the supply chain needs to be able to react quickly in the face of highly volatile demand. However, when the RBWE occurs, it raises a particular challenge for flexible supply chain design, since in this case the supply and demand processes are highly interdependent, unlike the independence assumption typically made in the flexibility literature (e.g. Tomlin and Wang 2005; Lim et al. 2008). If the flexible design ignores the RBWE, either over- or under-investment may occur, depending on whether the customers decrease or increase their orders during disruptions. Our beer game experiment provides a behavioral model to describe these demand patterns. Moreover, our simulation study helps us understand how the upstream supply process affects the demand pattern at each stage in the supply chain. Both studies will be useful in the development of future models for the design of flexible supply chains.

1.1 The bullwhip metaphor

To illustrate the RBWE, we extend the common metaphor of the supply chain as a string or whip, with the left-hand side representing upstream supply and the right-hand side representing downstream demand. Demand variability is represented as a vibration applied to the right end of the string. It is well known that a base-stock policy is optimal at each stage of a serial supply chain (and thus the BWE does not occur) if demands and purchase prices are stationary, upstream supply is infinite with a fixed lead time, and there is no fixed order cost (Lee et al. 1997b). In this case, vibrations (demand changes) are transmitted without modification up the string, as in Fig. 1a. It has been argued (Sterman 1989) that demand spikes act as shocks applied to the right end of the string, and that these shocks amplify as they move up the string, causing the BWE (Fig. 1b).

Fig. 1 String vibrations (a) with no amplification, (b) with a demand vibration and BWE, (c) with a supply shock and RBWE, (d) with a demand vibration, a fixed point, and umbrella pattern, and (e) with a demand vibration, a supply shock, and umbrella pattern. Thick lines above strings plot wave amplitude

Now suppose that a shock is applied to the left end of the string instead of the right (Fig. 1c). The wave then initiates upstream and amplifies as it propagates downstream—the RBWE.

It is also possible for both effects to occur simultaneously. For example, if the left end of the string acts as a “fixed point” (that is, it is immovable), then vibrations will tend to first amplify and then dampen as they move up the string (Fig. 1d). Such a fixed point may represent an upstream supply shortage: The upstream stage utilizes 100% of its (now reduced) capacity, so it has no variability in its order quantities. The fixed point may also represent ordering behaviors that tend to dampen, rather than amplify, order variability. Either way, the fixed point ensures that the vibration (demand) amplitude first amplifies and then dampens as one moves upstream; that is, the BWE occurs downstream and the RBWE occurs upstream. We call this the umbrella pattern because a plot of the order variability stage-by-stage resembles an umbrella. The umbrella pattern may also occur when exogenous supply and demand shocks both occur (both ends of the string are perturbed—Fig. 1e). In both cases, order volatility (wave amplitude) is smallest at the ends of the supply chain (string) and largest in the middle.

The umbrella pattern occurs frequently in our beer game experiment and simulation study. It also is the shape observed at a macro level by Baganha and Cohen (1998) and Cachon et al. (2007). We believe our experiments help to explain the results found by these studies.

The discussion above identifies two types of RBWE. In one type, the upstream end of the supply chain is fixed, and the volatility tends to reduce as it approaches the fixed point (Fig. 1d), while in the other, a shock upstream amplifies as it propagates downstream (Fig. 1c, e). We refer to the former type of RBWE as damping-type and the latter as amplification-type; we discuss both types in this paper.

The remainder of the paper is organized as follows. In Sect. 2, we review the relevant literature. In Sect. 3, we explain the basic settings for our experiments. Sections 4 and 5 discuss the results of our beer game and simulation experiments, respectively. Finally, we summarize our conclusions in Sect. 6.

2 Literature review

The BWE was first described by Forrester (1958), although the term “bullwhip effect” was coined by managers at Procter & Gamble and introduced into the literature by Lee et al. (1997a, b), who suggest four causes of the BWE: demand forecasting, rationing game, order batching, and price fluctuations. Lee et al. (1997b) show that all four causes can result from rational, optimizing behaviors on the part of managers. Prior to the work by Lee et al., it was generally thought that only irrational behaviors caused the BWE.

Sterman (1989) introduces the beer game and observes an order amplification due to the underweighting of the supply line—that is, players tend to ignore some or all of their pipeline inventory and instead base their ordering decisions primarily on their on-hand inventory. The behavioral study by Sterman (1989) and the theoretical results developed by Lee et al. (1997b) have stimulated numerous theoretical studies on the causes of the BWE, as well as a number of additional beer game experiments that attempt to reconcile the theories with actual human behavior, and to explore other behavioral causes of the BWE. Since this paper is concerned with behavioral causes for both the BWE and the RBWE, we first review the literature on the beer game. We refer readers who are interested in theoretical analyses of the BWE to the survey by Lee et al. (2004).

Kaminsky and Simchi-Levi (2000) find that a reduction in order information delay and shipment lead-time results in lower total supply chain costs but not in a reduction of order variability amplification. Chen and Samroengraja (2000) make the mean and standard deviation of the normally distributed demand known to every player and find that, though the four operational causes are removed, the BWE still occurs. Croson and Donohue (2003) observe a decrease in the magnitude of the BWE in their point-of-sale (POS) treatment group, who know the realized customer demand, when compared with their control group, who know only the underlying demand distribution. The primary reason for the decrease is that the participants in the POS treatment group almost equally utilize the realized customer demand and the order information from their immediate downstream stage in their ordering decisions. Steckel et al. (2004) show that POS information can actually increase a team’s costs when it distracts the participants under certain types of customer demand.

Croson and Donohue (2006) tell the participants the status of the inventory across the supply chain at any point in time. The magnitude of the BWE decreases compared with the situation in which participants are not provided with such information, because upstream stages use downstream inventory information to anticipate and adjust their orders. Oliva and Gonçalves (2007) suggest that participants respond differently to their own on-hand inventory and backorders due to the difference between holding and backorder costs and find that participants tend to ignore their own backorders rather than over-reacting to them and placing panicked orders. Wu and Katok (2006) show that effective communication along with learning can significantly diminish the magnitude of BWE. Croson et al. (2004) show that the BWE still persists even if the demand is constant and known to every player. They attribute this to “coordination risk”; that is, players place larger than necessary orders to protect themselves against the risk that other players will not behave optimally.

Despite its potential to systematically investigate the outcomes generated by various ordering behaviors, simulation has rarely been used in the BWE literature. An exception is Chatfield et al. (2004), who use simulation to study the effect of players’ behaviors in a beer game in which all stages are managed by computer, rather than by live players. Their order functions are very similar to the base stock policy studied by Chen et al. (2000). Chatfield et al. find that an increase in the variance of the stochastic lead time results in greater BWE, while information sharing dampens BWE. Furthermore, they provide three forecasting models under stochastic lead times. The model that forecasts demand and lead-time separately leads to higher forecast variability and therefore higher order variability than the other two, one ignoring lead-time uncertainty and the other forecasting lead-time demand. The BWE reaches its maximum magnitude when demand and lead-time are estimated separately.

The theoretical studies on the rationing game by Rong et al. (2008) and on the interactions among capacity, price and demand by Rong et al. (2009) are the first studies of the RBWE in the literature to analyze the operational causes of the RBWE in the presence of supply uncertainty. Rong et al. (2008) show that the BWE occurs between retailers and customers and that the RBWE occurs between suppliers and retailers when retailers compete for scarce supply under a standard mechanism used by the supplier to allocate the available supply. Rong et al. (2009) show that when customers react not only to the price itself but also to changes in price, some pricing strategies implemented by the supplier may lead to the RBWE. These two papers provide a theoretical development of the RBWE, while the aim of the present paper is to address the behavioral causes of the RBWE.

Current studies on supply disruptions assume that decision makers are rational optimizers. Parlar and Berkin (1991), Berk and Arreola-Risa (1994), Parlar and Perry (1995, 1996), Gupta (1996), Mohebbi (2003), Snyder (2008), and many others modify classical inventory models to cope with supply disruptions. Tomlin (2006) examines how the optimal mitigation strategy (backup inventory, supplier redundancy, or some combination) changes as the characteristics of the disruptions change. Snyder and Tomlin (2008) take into consideration the benefit of advance warning of supply disruptions. Babich et al. (2007) study the impact of supplier default risk on the relationship between one retailer and multiple suppliers. Kim et al. (2006), Hopp and Liu (2006), Snyder and Shen (2006) and Schmitt et al. (2008) extend the study of supply uncertainty to multi-echelon supply chains.

Since managers have limited experience in dealing with supply disruptions due to their low probability of occurrence, it may be difficult to apply the models cited in the previous paragraph in practice. Moreover, just as Thietart and Forgues (1995) suggest that the “butterfly effect” (that is, a small variation at one point may cause a large variation of the whole system) can exist in organizations, so, too, can a small disruption be amplified by irrational decision makers within a supply chain. Therefore, studying human behavior under disruptions is important. One of the main contributions of our paper is to examine people’s behavior when they face supply disruptions, and the impact of this behavior on order-variability propagation in a multi-echelon setting.

3 Basic settings

In our beer game experiment and simulation, we study a 4-stage serial supply chain under periodic review. Stages 1–4 correspond to the retailer, wholesaler, distributor, and manufacturer, respectively. The retailer receives demand from an external customer. Since our study focuses on the impact of supply shocks, we remove the demand shocks from the beer game and fix the demand at 50 units per period. (The exception is the simulation study in Sect. 5.1, in which we examine the impact of supply-line weighting under demand uncertainty only.)

The manufacturer (stage 4) has a production capacity that limits the quantity it may order in a given period. The production capacity, \(\xi_t\), is a random variable. Each time period is classified as either an “up” or a “down” period. During up periods, \(\xi_t\) is larger than the demand observed by the retailer; during down periods, it is smaller. This setting is consistent with the supply disruption literature cited in Sect. 2. The difference is that most papers set \(\xi_t = 0\) during down periods, but we set \(\xi_t > 0\) to model the situation in which capacity is reduced but not totally eliminated during a disruption. See Sects. 3.1 and 5.2 for more details on the supply process \(\xi_t\).

In each period, each stage i experiences the following sequence of events:

  1. The shipment from stage i + 1 shipped two periods ago arrives at stage i (that is, the lead-time is 2). If i = 4, stage i + 1 refers to the external supplier.

  2. The order placed by stage i − 1 in the current period arrives at stage i. If i = 1, stage i − 1 refers to the external customer.

  3. Stage i determines its order quantity and places its order to stage i + 1.

  4. The order from stage i − 1 is satisfied using the current on-hand inventory, and excess demands are backordered. Holding and/or stockout costs are incurred.

To reflect the modern data-processing environment (e.g., EDI) and to maintain consistency with our assumption that information about disruptions is propagated instantaneously, we assume (unlike Sterman 1989 and most subsequent papers) no order information delay, i.e., stage i receives order information in the same period that the order is placed by stage i − 1.
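The four events above can be sketched for a single stage as follows. This is only an illustrative sketch, not the authors' Excel implementation: the `Stage` class, its field names, and the `order_rule` callback are hypothetical, and the quantity shipped by the upstream stage is passed in as an argument rather than simulated.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Tuple

LEAD_TIME = 2  # a shipment arrives two periods after it is sent


@dataclass
class Stage:
    on_hand: int = 0
    backorders: int = 0
    # pipeline[0] is the shipment that arrives in the current period
    pipeline: Deque[int] = field(default_factory=lambda: deque([0] * LEAD_TIME))

    def step(self, incoming_order: int, shipped_by_supplier: int,
             order_rule: Callable[["Stage", int], int]) -> Tuple[int, int]:
        # Event 1: the shipment sent two periods ago arrives.
        self.on_hand += self.pipeline.popleft()
        # Event 2: the downstream order arrives with no information delay.
        demand = incoming_order + self.backorders
        # Event 3: the stage chooses its own (non-negative) order quantity.
        order = max(0, order_rule(self, incoming_order))
        # Event 4: demand is met from on-hand stock; the excess is backordered.
        shipped = min(self.on_hand, demand)
        self.on_hand -= shipped
        self.backorders = demand - shipped
        # Whatever the supplier ships this period enters the pipeline.
        self.pipeline.append(shipped_by_supplier)
        return order, shipped
```

For example, a stage that simply passes its incoming order upstream would use `order_rule=lambda stage, d: d`.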

We examine the presence of the BWE or RBWE at each stage individually. Let \(\sigma_i\) be the standard deviation of orders placed by stage i across the time horizon. When \(\sigma_i > \sigma_{i-1}\), stage i amplifies its order variability; i.e., the bullwhip effect (BWE) occurs at stage i. If \(\sigma_i < \sigma_{i-1}\), then the reverse bullwhip effect (RBWE) occurs at stage i instead.

If \(\sigma_{i+1} > \sigma_i\) for all i ≤ 3, then the system exhibits pure BWE, and when \(\sigma_{i+1} < \sigma_i\) for all i ≤ 3, the system exhibits pure RBWE. There are also several other possible shapes for the pattern of order standard deviations across the supply chain. For example, when \(\sigma_{i+1} > \sigma_i\) for i = 1, 2 and \(\sigma_{i+1} < \sigma_i\) for i = 3, the supply chain exhibits the umbrella pattern; this pattern is natural when the downstream part of the supply chain is affected primarily by demand uncertainty while the upstream is affected primarily by the capacity process.
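The definitions above translate directly into a small classifier. The sketch below is illustrative only: the function names are hypothetical, the input list is assumed to hold the customer-demand SD at index 0 followed by the order SDs of stages 1 to 4, and "umbrella" is recognized only for the specific example pattern given in the text.

```python
from typing import List, Sequence


def stage_effects(sigmas: Sequence[float]) -> List[str]:
    """Label each stage i >= 1 by comparing sigma_i with sigma_{i-1}.

    sigmas[0] is the SD of external customer demand (stage 0);
    sigmas[i] is the SD of the orders placed by stage i.
    """
    labels = []
    for i in range(1, len(sigmas)):
        if sigmas[i] > sigmas[i - 1]:
            labels.append("BWE")
        elif sigmas[i] < sigmas[i - 1]:
            labels.append("RBWE")
        else:
            labels.append("none")
    return labels


def system_pattern(sigmas: Sequence[float]) -> str:
    """Classify the chain using the comparisons among stages 1..4 only."""
    effects = stage_effects(sigmas)[1:]  # drop the stage-1-vs-customer label
    if all(e == "BWE" for e in effects):
        return "pure BWE"
    if all(e == "RBWE" for e in effects):
        return "pure RBWE"
    if effects == ["BWE", "BWE", "RBWE"]:
        return "umbrella"
    return "other"
```

With constant demand (SD 0) and made-up order SDs of 5, 9, 12, and 7, `system_pattern([0, 5, 9, 12, 7])` returns `"umbrella"`.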

3.1 Experimental design

Our beer game setup is motivated in part by consumer buying patterns for gasoline following Hurricane Katrina. Customers were aware of the supply shock (but not of the magnitude of its downstream effect), and many of them filled their cars at the beginning of the shock in order to avoid future shortages and price fluctuations. In our beer game experiment, we create capacity shocks during the game to observe how players behave during supply disruptions. All players know when a capacity shock is occurring, but only the player in the role of the manufacturer knows its severity.

Our beer game experiment was conducted using a Microsoft-Excel-based implementation written by the authors. Our computerized implementation gives players more information about the status of the system than in the traditional board version of the game. Figure 2 shows the game’s user interface. Players can easily acquire information about their own on-hand inventory, backorders, on-order inventory, and in-transit inventory, as well as backorders at their suppliers.

Fig. 2 Screenshot of the beer game

Each player is randomly assigned to a team and role, and players do not know the team and role that the other players have been assigned to. No communication is allowed during the game.

Our implementation automates the information-transfer process: when a player places an order, it is transmitted electronically to his or her supplier, and when orders are shipped, the delivery quantity is transmitted downstream electronically. This reduces transaction errors, speeds the playing of the game, and enforces the no-communication rule since players do not know who their teammates are.

Following Sterman (1989), we set the holding and backorder costs to $0.50 and $1.00, respectively, at every stage of the supply chain. In order to focus the study on the supply uncertainty faced by the whole supply chain, the demand is deterministic. We fix the demand at 50 units per week and make this known to every player.
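As a small illustration of these cost parameters, the sketch below computes one stage's cost for one period. It assumes that both costs are charged per unit per period and that a stage never holds stock and backorders simultaneously, so that the sign of the inventory level determines which cost applies; the function name is hypothetical.

```python
HOLDING_COST = 0.50    # dollars per unit held (assumed per period)
BACKORDER_COST = 1.00  # dollars per unit backordered (assumed per period)


def period_cost(inventory_level: float) -> float:
    """Cost for one period, given inventory level = on-hand - backorders."""
    if inventory_level >= 0:
        return HOLDING_COST * inventory_level
    return BACKORDER_COST * (-inventory_level)
```

Under these assumptions, holding 20 units costs $10.00 for the period, while a backlog of 20 units costs $20.00.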

The manufacturer has a capacity limitation on his or her order size. Since most of the literature on supply disruptions (e.g., Parlar and Berkin 1991; Berk and Arreola-Risa 1994; Parlar and Perry 1995, 1996; Gupta 1996; Mohebbi 2003) uses a Markov process to model disruptions and recoveries, we also assume that the capacity fluctuates throughout the game following a two-state discrete-time Markov process. The “up” state corresponds to full capacity \(\xi_u\) and the “down” state corresponds to disrupted capacity \(\xi_d\). We fix \(\xi_u = 60\) and allow \(\xi_d\) to vary randomly according to a normal distribution with mean 40 and variance 4. The capacity \(\xi_d\) is different for each disruption but the same for every period during a given disruption. The transition probability from the up state to the down state is \(p_d\), and that from the down state to the up state is \(p_u\). The stationary probabilities of being in the up and down states are therefore \(p_u/(p_d + p_u)\) and \(p_d/(p_d + p_u)\), respectively. To ensure the stability of the system, we require \((\xi_u p_u + E[\xi_d]\,p_d)/(p_d + p_u) > 50\), the demand per period. In the experiment, we set \(p_d = 0.2\) and \(p_u = 0.3\). Hence the average capacity is \((60 \times 0.3 + 40 \times 0.2)/0.5 = 52\).
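The capacity process and the stability check can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the chain is assumed to start in the up state, and the state transition is assumed to be drawn at the end of each period; none of these details is specified in the text.

```python
import random
from typing import List, Tuple

XI_UP = 60.0         # full capacity
XI_DOWN_MEAN = 40.0  # mean disrupted capacity
XI_DOWN_SD = 2.0     # variance 4
P_DOWN = 0.2         # P(up -> down)
P_UP = 0.3           # P(down -> up)


def stationary_probs(p_down: float = P_DOWN, p_up: float = P_UP) -> Tuple[float, float]:
    """Return (P[up], P[down]) for the two-state chain."""
    total = p_down + p_up
    return p_up / total, p_down / total


def mean_capacity() -> float:
    """Long-run average capacity; must exceed the demand of 50 for stability."""
    pi_up, pi_down = stationary_probs()
    return XI_UP * pi_up + XI_DOWN_MEAN * pi_down


def sample_capacity_path(periods: int, seed: int = 0) -> List[float]:
    """Simulate capacities; xi_d is drawn once per disruption and then held."""
    rng = random.Random(seed)
    up, xi_down = True, XI_DOWN_MEAN  # assumption: the game starts in the up state
    path = []
    for _ in range(periods):
        path.append(XI_UP if up else xi_down)
        if up and rng.random() < P_DOWN:
            up = False
            xi_down = rng.gauss(XI_DOWN_MEAN, XI_DOWN_SD)
        elif not up and rng.random() < P_UP:
            up = True
    return path
```

With the experimental parameters, `stationary_probs()` gives probabilities of 0.6 (up) and 0.4 (down), and `mean_capacity()` gives 52, up to floating-point rounding.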

The players were told that the manufacturer occasionally experiences disruptions, that during a disruption, the capacity decreases to a random value for a random number of periods, and that after a disruption ends, the capacity returns to its normal value. Players were not given the actual values of the disruption parameters (\(p_u\), \(p_d\), \(\xi_u\), \(\xi_d\)). They were told, however, that \(\xi_u \ge 50\) and that \(\xi_d\) is random with mean less than 50.

Each team faced the same sample path of the disruption process, which was generated randomly before the experiment began and then repeated to ensure that fair comparisons could be made from team to team. This sample path includes 17 down periods during the 50-period horizon, occurring during the same periods for every team. Players were informed that every team would face the same sample path.

In the real world, when there is a supply disruption, the further away a company is from the disruption source, the less information it generally has about the disruption. For example, when a fire occurred at a Philips semiconductor plant in 2000 (Sheffi 2005), the managers at telecom operators’ retail stores are unlikely to have known how serious the problem was. In order to reflect this situation in our game, players were notified (via an indicator in the beer game screen) whether a disruption was in progress, but they did not know how severe the disruption was. The exception is the manufacturer, who can indirectly determine the capacity in any period since the program prompts him or her for a new order quantity if the quantity entered exceeds the capacity.

Our experiment consisted of 92 participants (23 teams of 4 players each) from Lehigh University, including 84 undergraduate students and 8 graduate students. Roughly one-third of the participants received a cash incentive for playing the game (the other two-thirds played the game as part of a course they were enrolled in). For those receiving cash, the amount of the award was scaled based on the teams’ performance in a manner similar to that described by Croson and Donohue (2003).

Each team played for up to 1 h and 45 min. The introduction lasted 20 min, followed by a roughly 10-min practice round in which the participants played for five periods to familiarize themselves with the software environment; the results of this practice round were discarded. The remaining time was used for the actual experiment. The maximum number of periods that each team played was 50, and most teams completed at least 40 periods.

We exclude teams that completed fewer than 30 periods’ worth of game play since it is reasonable to assume that those teams may not have understood the game well enough to play fluently. Based on this criterion, 4 of the 23 teams (teams 9–12) are omitted from the results below. In addition, team 6 was omitted because its retailer and wholesaler had mean order quantities of 77.4 and 91.2, respectively, both of which are more than four standard deviations above the mean order quantity for all retailers and wholesalers (and significantly more than the demand of 50 per period). Therefore, the results reported below include a total of 18 teams consisting of 72 participants.

3.2 Order functions

It is well known that a stationary base-stock policy is optimal in a serial supply chain if the back-order cost only occurs at the most downstream stage of the supply chain and all parameters are stationary. Lee and Whang (1999) show that such a policy is still optimal for decentralized supply chains by introducing specific performance measures in the decentralized system. However, in most of the beer game literature, the backorder cost occurs at every stage. We are not aware of any literature that addresses the structure of the optimal ordering policy for individual-utility-maximizing players in a serial supply chain in which information is not shared among supply chain stages and holding and backorder costs are incurred at every stage. Moreover, although there has been a great deal of research on supply chain disruptions recently, the study of multi-echelon supply chains under disruptions is fairly limited (see, e.g., Hopp and Liu 2006). Consequently, the optimal ordering policy is unknown for centralized serial supply chains with supply disruptions, nor is the optimal behavior known for individual-utility-maximizing players under our beer game settings.

In light of these difficulties, Sterman (1989) postulates a model that expresses a player’s order quantity as a function of several random state variables in order to analyze whether the players put more focus on their inventory level or their on-order inventory:

$$ O^i_{t}=\hbox{max}\left\{0,\hat{O}^{i-1}_{t+1}+\alpha^{i}_{b}(IL^{i}_{t}-a^{i}_{b})+ \beta^{i}_{b}(IP^{i}_{t}-IL^{i}_{t}-b^{i}_{b})\right\}.$$
(1)

We refer to this as the base order function. The state variables used in the function are as follows:

  • \(IL^i_t\): Inventory level (on-hand inventory − backorders) at stage i after event 2 (i.e., after observing its demand but before placing its order) in period t.

  • \(IP^i_t\): Inventory position (on-hand inventory + on-order inventory − backorders) at stage i after event 2 in period t.

  • \(O^i_t\): Order quantity placed by stage i in event 3 in period t. If i = 0, \(O^i_t\) represents demand from the external customer.

  • \(\hat{O}^i_t\): Forecast of order quantity that will be placed by stage i in period t. This forecast is calculated by stage i + 1 after event 2 in period t − 1 using exponential smoothing:

    $$ \hat{O}^{i-1}_{t}=\eta O^{i-1}_{t-1}+(1-\eta )\hat{O}^{i-1}_{t-1}, $$
    (2)

    where η is the smoothing factor, 0 ≤ η ≤ 1.
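The exponential-smoothing update in (2) is a one-line computation. The sketch below is illustrative; the function and argument names are hypothetical.

```python
def smooth_forecast(prev_order: float, prev_forecast: float, eta: float) -> float:
    """One step of exponential smoothing, as in Eq. (2).

    prev_order    -- the downstream order observed in the previous period
    prev_forecast -- the forecast that had been made for that period
    eta           -- smoothing factor in [0, 1]
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return eta * prev_order + (1.0 - eta) * prev_forecast
```

Setting `eta = 1` makes the forecast equal to the most recent order, while `eta = 0` never updates the initial forecast.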

In (1), \(a_b\) and \(b_b\) represent target values for the inventory level (IL) and supply-line inventory (IP − IL), respectively. The constants \(\alpha_b\) and \(\beta_b\) are adjustment parameters controlling the change in order quantity when the actual inventory level and the supply line, respectively, deviate from the desired targets. (The subscript b stands for “base”.) Sterman based this order function on the anchoring and adjustment method proposed by Tversky and Kahneman (1979). It accounts for changes in the demand, inventory level, and supply line dynamically, even when the demand and supply processes are unknown. \(\hat{O}^{i-1}_{t+1}\) is treated as the anchor, serving as a starting point for the order quantity, while the remaining part is the adjustment to correct the initial decision based on the inventory level and supply line. The order quantity placed by stage 4 in period t is bounded by its capacity \(\xi_t\) in that period. Therefore the actual order placed by stage 4 in period t is \(\min\{O^4_t, \xi_t\}\).

The relationship between \(|\alpha_b|\) and \(|\beta_b|\) determines how the supply line is weighted: \(|\alpha_b| > |\beta_b|\), \(|\alpha_b| = |\beta_b|\), and \(|\alpha_b| < |\beta_b|\) correspond to underweighting, equal weighting, and overweighting the supply line, respectively.
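The base order function (1), the capacity bound at stage 4, and the weighting rule can be sketched as follows. The function names are hypothetical, and the negative coefficient values used in the usage example are purely illustrative; they are not estimates from the experiment.

```python
def base_order(forecast: float, inventory_level: float, inventory_position: float,
               alpha_b: float, a_b: float, beta_b: float, b_b: float) -> float:
    """Base order function, Eq. (1): anchor (forecast) plus two adjustments."""
    supply_line = inventory_position - inventory_level  # IP - IL
    raw = (forecast
           + alpha_b * (inventory_level - a_b)   # inventory-level adjustment
           + beta_b * (supply_line - b_b))       # supply-line adjustment
    return max(0.0, raw)


def manufacturer_order(order: float, capacity: float) -> float:
    """Stage 4's actual order is capped by the capacity of the period."""
    return min(order, capacity)


def supply_line_weighting(alpha_b: float, beta_b: float) -> str:
    """Compare |alpha_b| with |beta_b| as described in the text."""
    if abs(alpha_b) > abs(beta_b):
        return "underweighting"
    if abs(alpha_b) < abs(beta_b):
        return "overweighting"
    return "equal weighting"
```

For instance, with a forecast of 50, an inventory level of 30 against a target of 40, a supply line exactly on its target of 100, and illustrative coefficients of -0.5 and -0.2, `base_order(50, 30, 130, -0.5, 40, -0.2, 100)` orders 55 units, and `supply_line_weighting(-0.5, -0.2)` reports underweighting.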

The base order function does not capture the difference in players’ behavior during up and down periods. To address this difference, we introduce a new order function, which we call the disruption order function. This function assumes that each player knows only that a disruption has occurred but does not know the precise impact it has on inventories, as is often the case in real-world disruptions. The disruption order function is as follows. (The subscript d on the coefficients stands for “disruption.”)

$$ O^i_{t}=\hbox{max}\left\{0,\hat{O}^{i-1}_{t+1}+\alpha^{i}_{d}(IL^{i}_{t}-a^{i}_{d})+ \beta^{i}_{d}(IP^{i}_{t}-IL^{i}_{t}-b^{i}_{d}) +\gamma^{i}_{d} S_t\right\}, $$
(3)

where \(S_t\) is a public signal to indicate whether there is a supply disruption in the system; that is, \(S_t = 1\) if stage 4 is in the down state and 0 otherwise. If \(\gamma^i_d < 0\), then the player will order less during a disruption (e.g., to reduce potential backorders at its supplier). If \(\gamma^i_d = 0\), the player ignores disruptions, while if \(\gamma^i_d > 0\), then the player orders more during a disruption (e.g., to protect against possible future disrupted periods). As in the base order function, the actual order quantity placed by stage 4 is given by \(\min\{O^4_t, \xi_t\}\).
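The disruption order function (3) differs from the base function only by the signal term. The sketch below is a self-contained illustration; the function name and the coefficient values in the usage example are hypothetical.

```python
def disruption_order(forecast: float, inventory_level: float,
                     inventory_position: float,
                     alpha_d: float, a_d: float,
                     beta_d: float, b_d: float,
                     gamma_d: float, disrupted: bool) -> float:
    """Disruption order function, Eq. (3)."""
    s_t = 1 if disrupted else 0  # public disruption signal S_t
    supply_line = inventory_position - inventory_level
    raw = (forecast
           + alpha_d * (inventory_level - a_d)
           + beta_d * (supply_line - b_d)
           + gamma_d * s_t)      # shift applied only during down periods
    return max(0.0, raw)
```

With the same illustrative state as before (forecast 50, inventory level 30, inventory position 130, targets 40 and 100, coefficients -0.5 and -0.2), a player with `gamma_d = 10` orders 55 units in an up period and 65 units in a down period; a strongly negative `gamma_d` can drive the order to zero.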

Our two proposed order functions are used to study (1) whether or not players pay more attention to the supply line in the presence of disruptions; and (2) whether or not players order differently during down periods compared with up periods. As we mentioned above, a mathematical model of rational players is unavailable under our beer game settings. Therefore, these two order functions should be considered as tentative models for player behavior. It is an interesting topic of future research to test the robustness of the results gained from our two order functions versus other order functions.

4 Beer game results

Our interest is primarily in the presence or absence of the BWE and RBWE during supply disruptions, rather than over the course of the entire horizon. Prior to performing our experiment, we conjectured that the RBWE would occur more often than the BWE during down periods, but that the BWE may dominate when the standard deviation (SD) of orders is calculated over all periods. This reflects the suggestion that supply disruptions cause the RBWE and the fact that, taken across all periods, the upstream stage’s order process is actually more volatile because the capacity changes themselves cause order variance. Except for Table 1 below, the results reported in the remainder of Sect. 4 include observations from down periods only.

Table 1 SD of orders for each player

Our findings may be summarized as follows. In Sect. 4.1, we show that fewer than half of the players in our beer game experiment exhibited the BWE (that is, had \(\sigma_i > \sigma_{i-1}\) for a given i) during supply disruptions, while slightly over half exhibited the RBWE, with the BWE more likely to occur downstream and the RBWE more likely to occur upstream. In Sect. 4.2, we examine the relationship between the BWE/RBWE and supply-line weighting and show that the players who underweight [overweight] tend to exhibit the BWE [RBWE]. Moreover, in Sect. 4.3, we show that some players decrease their order quantities during disruptions since the players are evaluated based on the whole team’s performance. The decrease in order quantity in response to disruptions and the players’ supply-line weighting type together provide an explanation of why some players exhibit the RBWE while others do not.

4.1 The existence of BWE and RBWE

One of the major purposes of our version of the beer game is to test the prevalence of the BWE in an environment that is different from the standard beer game setup. Our results suggest that the BWE no longer dominates when supply disruptions are present. Table 1 contains the SD of the players’ orders for all teams except the five omitted teams (see Sect. 3.1). Standard deviations are reported both across all the periods and for down periods only. The column labels R, W, D, and M represent retailer, wholesaler, distributor, and manufacturer, respectively. In Table 1, if “†” appears beside a number in the “All Periods” or “Down Periods Only” column, then the corresponding player exhibits the RBWE during the whole time horizon or during down periods, respectively.

Table 1 indicates that 51.4% of players (37 out of 72) exhibited RBWE during down periods, confirming our conjecture that the RBWE at least sometimes occurs during disruptions. Even when taken across all periods, 30.6% of players (22 out of 72) exhibited RBWE. The middle stages of the supply chain (wholesalers and distributors) have the greatest order SDs, on average. In the sections below, we investigate why some players exhibit RBWE while others do not.
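The raw comparison that underlies these counts (σ i > σ i−1 for the BWE, the reverse for the RBWE) can be sketched in Python. This is only an illustration: the function names, the use of the sample SD, and the example numbers are hypothetical, and no significance test is applied.

```python
import statistics

def down_period_sd(orders, is_down):
    """Sample SD of one player's orders, restricted to down periods.

    `orders` and `is_down` are equal-length sequences (hypothetical inputs);
    at least two down periods are needed for the sample SD to exist.
    """
    down_orders = [o for o, down in zip(orders, is_down) if down]
    return statistics.stdev(down_orders)

def classify_effects(order_sds, demand_sd=0.0):
    """Label each stage by comparing its order SD with the SD it faces.

    `order_sds` lists stages from downstream (retailer) to upstream
    (manufacturer). The retailer is compared with the customer demand SD,
    which is 0 when demand is constant.
    """
    labels = []
    faced = demand_sd
    for sd in order_sds:
        if sd > faced:
            labels.append("BWE")
        elif sd < faced:
            labels.append("RBWE")
        else:
            labels.append("neither")
        faced = sd  # the next stage upstream faces this stage's orders
    return labels

# Hypothetical SDs for R, W, D, M (not taken from Table 1).
example = classify_effects([4.0, 9.0, 7.5, 2.0])
print(example)  # ['BWE', 'BWE', 'RBWE', 'RBWE']
```

With these made-up numbers the order SD peaks in the middle of the chain, so the two downstream stages are labeled BWE and the two upstream stages RBWE.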

The RBWE can occur at wholesalers, distributors and manufacturers, both during down periods and across all periods. Figure 3 provides a graphical representation of the order SD of each team during down periods. The dashed line represents the mean of the order standard deviation over all 18 teams, while the solid lines represent the individual teams. The general trend in Fig. 3 is an “umbrella” shape, with demand variability increasing (BWE) downstream but decreasing (RBWE) upstream. Except for the retailer, the “average” player in each role (dashed line) exhibits RBWE.

Fig. 3 SD of orders during supply disruptions. Solid lines represent individual teams’ SDs; dashed line represents mean SD

We applied Spearman’s rank correlation test (significance level 0.1) to determine whether the differences in SD between orders and demands (i.e., the differences among the numbers in Table 1) are statistically significant. The results are summarized in Table 2, which lists the number of players in each role who exhibited statistically significant BWE or RBWE. (Details of this and all other statistical tests can be found in the Appendix.)

Table 2 Statistically significant occurrences of BWE and RBWE

Table 2 indicates that, during down periods, 21 out of 72 players (29.2%) exhibited statistically significant RBWE and 27 out of 72 (37.5%) exhibited BWE. Across all periods, 13 players (18.1%) exhibit RBWE and 36 players (50.0%) exhibit BWE. Note, however, that since the retailer faces constant demand, every retailer by definition exhibits BWE (or no BWE/RBWE). When retailers are excluded from the analysis, 21 out of 54 players (38.9%) exhibit RBWE during disruptions and 10 players (18.5%) exhibit BWE.

The majority of manufacturers exhibited statistically significant RBWE during down periods because their orders are bounded by the reduced capacity. On the other hand, few manufacturers exhibit RBWE when taken across all periods, since the capacity changes necessarily cause order variability on the part of the manufacturer. Similarly, retailers automatically exhibit BWE because they face no demand variability. Taken together, these observations confirm our conjecture that during disruptions, the BWE occurs downstream and the RBWE upstream. The picture is less clear in the middle of the supply chain. There is no clear pattern to whether the BWE or RBWE dominates at wholesalers and distributors during supply disruptions. In the next two sections, we explore the question of why some wholesalers and distributors exhibit BWE, some exhibit RBWE, and some exhibit neither. In addition, we explore the difference between damping-type and amplification-type RBWEs. In the remainder of Sect. 4, we consider BWE and RBWE during disrupted periods only.

4.2 BWE/RBWE and supply-line weighting

Sterman (1989) suggests that one of the main causes of the BWE is that people tend to weight the supply line less than the inventory level when choosing an order quantity. But when the system faces supply uncertainty rather than demand uncertainty, does players’ behavior change?

To address the relationship between supply-line weighting and the occurrence of the BWE/RBWE, we first estimate the parameters of the base order function for each individual player (except for the manufacturers since their orders are bounded by the capacity). See section “Base order function regression” in the Appendix for details.
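As a rough illustration of how supply-line weights might be estimated and classified, the sketch below fits orders to the inventory level and the supply line by ordinary least squares and then compares |α| with |β|. It is not the regression procedure described in the Appendix: the linear form, the function names, and the synthetic data are assumptions, and the truncation of orders at zero is ignored.

```python
def fit_weights(orders, inventory, supply_line):
    """Least-squares fit of orders ~ c + alpha*inventory + beta*supply_line.

    Centering all three series removes the intercept, leaving a 2x2 system
    of normal equations that is solved in closed form.
    """
    n = len(orders)
    mean_o = sum(orders) / n
    mean_i = sum(inventory) / n
    mean_s = sum(supply_line) / n
    x = [v - mean_i for v in inventory]
    y = [v - mean_s for v in supply_line]
    o = [v - mean_o for v in orders]
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(p * q for p, q in zip(x, y))
    sxo = sum(p * q for p, q in zip(x, o))
    syo = sum(p * q for p, q in zip(y, o))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("inventory and supply line are collinear")
    alpha = (syy * sxo - sxy * syo) / det
    beta = (sxx * syo - sxy * sxo) / det
    return alpha, beta

def weighting_type(alpha, beta):
    """Classify by relative magnitude of the two weights."""
    if abs(beta) < abs(alpha):
        return "underweight"
    if abs(beta) > abs(alpha):
        return "overweight"
    return "equal"

# Synthetic, noise-free data generated with alpha = -0.2 and beta = -0.1.
inventory = [(7 * t) % 13 for t in range(30)]
supply_line = [(5 * t) % 11 for t in range(30)]
orders = [50 - 0.2 * i - 0.1 * s for i, s in zip(inventory, supply_line)]
alpha, beta = fit_weights(orders, inventory, supply_line)
print(weighting_type(alpha, beta))  # underweight
```

A player whose fitted supply-line weight is smaller in magnitude than the inventory weight is treated as underweighting the supply line; whether that difference is statistically significant is a separate question that this sketch does not address.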

Our main focus is the behavior of wholesalers and distributors, since the behavior of these stages is the least predictable. The results in section “Base order function regression” show that 23 out of 36 wholesalers and distributors (63.9%) underweight the supply line, while the other 13 overweight it. This result is significantly different from that of Croson and Donohue (2003, 2006), who find that 98% of 172 players underweight the supply line.

An F-test indicates that 21 out of 36 wholesalers and distributors (58.3%) over- or underweight the supply line at a statistically significant level. Of these, 16 underweight the supply line and 5 overweight it. We are interested in the relationship between under-/overweighting and BWE/RBWE during disruptions. Players who overweight pay more attention to the supply line, and therefore it is reasonable to assume that these players want the on-order quantity to be stable more than players who underweight do. Our conjecture is that players who underweight tend to exhibit BWE, while players who overweight tend to exhibit RBWE. Table 3 lists the number of wholesalers and distributors falling into each category. Only players who exhibit statistically significant under-/overweighting and BWE/RBWE are included. Note that all of the players who underweight the supply line exhibit BWE (or neither), while all but one of the players who overweight the supply line exhibit RBWE (or neither).

Table 3 Under-/overweighting of supply line and BWE/RBWE for wholesalers and distributors

We conclude that underweighting the supply line is still a major cause of the BWE, even if supply uncertainty is introduced to the system. On the other hand, the presence of supply disruptions causes players to think more about the supply line, hence overweighting it more; those who do are more likely to exhibit RBWE. This overweighting causes a damping-type RBWE; by overweighting the supply line, players dampen, rather than amplify, the order variability. In our experiment, we believe that the emphasis on disruptions happens to be the cause for players to overweight; however, it is possible that there are also other causes for overweighting (and consequent damping-type RBWE) that are not related to disruptions.

4.3 BWE/RBWE and reaction to supply disruptions

To evaluate players’ behavior during disruptions, we estimated the parameters for the disruption order function using the same statistical regression procedure as in Sect. 4.2 and section “Base order function regression” of the Appendix, except that we use the disruption order function in place of the base order function. Again we omit manufacturers from our regression analysis since their order sizes are constrained by the capacity.

The results, given in detail in Table 5 in section “Disruption order function regression” of the Appendix, indicate that 39 out of 54 players (72.2%) have γ d < 0: 14 retailers, 10 wholesalers, and 15 distributors. This suggests that players are very likely to decrease their order quantities during supply disruptions to prevent backorders at their suppliers. This result surprised us at first, but it is intuitive since the beer game is centralized, and players are evaluated based on the performance of the whole team. Over-ordering during a disruption does not provide any benefit (since it does not improve upstream supply) and it hurts the team by incurring additional stockout penalties. This setting is different from the Hurricane Katrina example, in which customers act in their own best interests.

A significance test on these data indicates that for 14 out of 54 players (25.9%), the under-ordering during disruptions is statistically significant at a significance level of 0.1; these consisted of 6 retailers, 3 wholesalers, and 5 distributors. Two players (3.7%) exhibit statistically significant over-ordering. For 9 out of the 18 teams (50.0%), one of the two downstream players (retailer or wholesaler) orders significantly less during disruptions. The downstream reaction in these teams causes a (negative) demand shock for the middle-stage players (wholesaler and distributor), who then face both supply and demand shocks. There is no clear pattern as to whether these middle-stage players exhibit BWE, RBWE, or neither; it depends on whether these players under- or overweight the supply line (i.e., whether they pay more attention to their customers or their suppliers).

The data indicate that 8 out of 18 retailers either increase or decrease their order quantity significantly during disruptions in our beer game experiment. The resulting demand shock may be amplified by the middle-stage players (if they tend to underweight the supply line) or may be dampened (if they tend to overweight the supply line). The interaction between players’ reaction to disruptions and their partners’ supply-line weighting causes either the BWE or the RBWE; this relationship deserves further study. When the RBWE does occur for these teams, it is (at least partly) an amplification-type RBWE, since downstream players cause abnormally high volatility in reaction to the supply disruption.

5 Simulation experiment

Although the beer game can provide valuable insights into players’ individual behaviors, it can be difficult to draw general inferences from such an experiment for two reasons. First, the total cost of the supply chain is highly dependent on the behavior of the individuals in the chain and on the arrangement of players in the team. Second, from previous studies (Croson and Donohue 2006; Sterman 1989), we know that individuals behave quite differently from each other, and the behaviors of the participants in a given team may interact strongly. The same player, if assigned to different groups, may even behave differently due to the effect of other players. For example, a player may be relatively rational if her downstream partner can control his order variability, but her behavior may be more chaotic if her partner’s orders are volatile and unpredictable.

Another drawback of the beer game experiment is the time limit. To achieve some sort of stable behavior, the participants need to learn the order patterns of their customer and supplier, and this may take a long time. This suggests that the estimated parameters in the order function vary over time at the beginning of the horizon. However, the time limitation prevents the beer game from being played long enough to achieve stability.

In addition, there are differences between the incentives in the beer game and those in a real business setting. In the beer game, the total cost of the supply chain is the performance measure, while real businesses care about their own profit, not (directly) that of their partners. This may lead to different values of the parameters in the order function; e.g., the customers in the Hurricane Katrina example may have positive values of γ d instead of negative ones. Finally, due to time and cost considerations, it is possible to test only a limited set of assumptions in the beer game (e.g., one type of disruption process).

These drawbacks can be addressed using a simulation study, in which all stages are operated by a computer rather than by humans, following pre-defined ordering rules. Such a study makes it convenient to perform what-if analyses regarding different ordering behaviors, disruption processes, and so on. It is also trivial to run the system long enough to achieve an approximately steady state. Thus, our simulation study complements our beer game experiment and may be viewed as serving a sensitivity analysis role.

To perform our study, we used the freeware simulation software of Snyder (2006), which simulates multi-echelon supply chains with stochastic supply and/or demand. Each stage can have its own ordering function, following any of several types of inventory policies, including base stock, (r, Q), (s, S), and various anchoring and adjustment order functions. More information about the supply chain assumptions made by the simulation can be found in the documentation that accompanies the software (Snyder 2006).

Our goal is to simulate the system with different parameter values to determine the impact of various behaviors on ordering patterns across the whole system. For each setting of the parameters, we simulated the system for 10 trials, each consisting of 1,000 periods with a 100-period warm-up interval.

In Sect. 5.1, we establish the relationship between the BWE/RBWE and supply-line weighting type under demand uncertainty only in order to validate our simulation model against previous studies. Then, in Sect. 5.2, we investigate the impact of supply-line weighting type and disruption reaction on the BWE/RBWE. Finally, in Sect. 5.3, we consider the impact of different capacity processes on the BWE/RBWE.

5.1 Effect of supply-line weighting under demand uncertainty only

Previous experimental studies, such as Sterman (1989), show through regression analysis that underweighting the supply line is a major factor in causing the BWE. To evaluate the relationship between the BWE/RBWE and supply-line weighting type, as well as the magnitude of the BWE/RBWE under demand uncertainty only, we use a 2-stage model consisting of a retailer and a wholesaler. The demand follows a normal distribution with mean 50 and standard deviation 10. There is no capacity limit at the wholesaler. The exponential smoothing factor η is fixed to 0 based on the assumption that the customer demand process is known to both stages. We set a = 10 and b = 100. Using the base order function (Sect. 4.2), we vary α b and β b from 0.1 to 0.9 in increments of 0.1, respectively.
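To make the setup concrete, the following Python sketch simulates a retailer and a wholesaler that both follow a linear anchoring-and-adjustment rule. It is not the simulation software used in the study. The functional form of the order rule, the lead time of two periods, the sequence of events within a period, and all names are assumptions made for illustration; only the demand distribution, a = 10, b = 100, the 1,000-period horizon, and the 100-period warm-up come from the text.

```python
import random
import statistics

def simulate_two_stage(alpha, beta, periods=1000, warmup=100, lead=2,
                       a=10.0, b=100.0, mean=50.0, sd=10.0, seed=0):
    """Return (retailer order SD, wholesaler order SD) after the warm-up.

    Assumed rule: order = max(0, mean + alpha*(IL - a) + beta*(SL - b)),
    where IL is net inventory and SL is the supply line. The targets a and b
    only shift mean levels in this sketch.
    """
    rng = random.Random(seed)
    on_hand = [a, a]                      # index 0 = retailer, 1 = wholesaler
    backlog = [0.0, 0.0]
    pipeline = [[mean] * lead, [mean] * lead]  # shipments in transit
    history = [[], []]

    def order_quantity(i, supply_line):
        level = on_hand[i] - backlog[i]
        return max(0.0, mean + alpha * (level - a) + beta * (supply_line - b))

    def serve(i, demand):
        # Add new demand to the backlog, then ship as much as stock allows.
        backlog[i] += demand
        shipped = min(on_hand[i], backlog[i])
        on_hand[i] -= shipped
        backlog[i] -= shipped
        return shipped

    for t in range(periods):
        for i in (0, 1):
            on_hand[i] += pipeline[i].pop(0)        # receive oldest shipment
        demand = max(0.0, rng.gauss(mean, sd))
        serve(0, demand)
        # Retailer's supply line: goods in transit plus wholesaler backorders.
        order0 = order_quantity(0, sum(pipeline[0]) + backlog[1])
        pipeline[0].append(serve(1, order0))
        # Wholesaler's supplier is uncapacitated and ships the full order.
        order1 = order_quantity(1, sum(pipeline[1]))
        pipeline[1].append(order1)
        if t >= warmup:
            history[0].append(order0)
            history[1].append(order1)
    return statistics.stdev(history[0]), statistics.stdev(history[1])

sd_r, sd_w = simulate_two_stage(alpha=-0.2, beta=-0.1)
print(sd_r, sd_w)
```

Sweeping alpha and beta over a grid and plotting sd_w - sd_r would give a surface comparable in spirit to a contour plot of the BWE/RBWE, although the values will depend on the assumptions listed above.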

Figure 4 shows a contour plot of the BWE/RBWE at the wholesaler (i.e., the difference in order standard deviations between the retailer and wholesaler) under different weights on the inventory level and supply line. The x- and y-axes plot α b and β b . It is evident from Fig. 4 that there is a break roughly along the line |α b | = |β b |, that is, equal weighting between on hand inventory and the supply line. When |α b | is sufficiently larger than |β b | (underweighting the supply line), the retailer’s order SD is smaller than the wholesaler’s; that is, the BWE occurs. The less weight is placed on the supply line, the greater the magnitude of the BWE is. On the other hand, when |α b | < |β b | (overweighting the supply line), the system exhibits RBWE. However, the magnitude of the RBWE appears to be insensitive to the weight placed on the supply line.

Fig. 4 Impact of weights on BWE/RBWE

Our simulation confirms previous experimental studies on the relationship between the BWE and underweighting the supply line and suggests further that the RBWE occurs when players overweight the supply line. The magnitude of the RBWE appears to be smaller than that of the BWE and less sensitive to the degree of supply-line weighting. On the other hand, in a recent survey by Muthukrishnan and Shulman (2006), 65% of the 2,990 responding executives believe that supply risk has increased during the past 5 years. Thus, today’s managers may rely more heavily on the supply line when making inventory decisions. This can diminish the magnitude of the BWE or even create the RBWE. As shown in our beer game experiment, when supply disruptions are a major factor, some players may change their behavior and put more emphasis on the supply line, resulting in the RBWE.

5.2 Impact of reaction to supply disruptions

To examine the impact of supply disruptions in a serial supply chain, we use a 4-stage model, as in the beer game setting. The simulation study shows that the reaction to supply disruptions downstream serves as a trigger of order volatility. The relative weights between on-hand and supply-line inventory of each player largely determine the pattern of order standard deviations throughout the supply chain.

We use the disruption order function (Sect. 3.2) to model players’ reaction to disruptions. To isolate the effect of downstream volatility, and because the results in Sect. 4.3 indicate that downstream players are most likely to react to supply disruptions, we set γ d  ≠ 0 at stage 1 only; that is, only stage 1 changes its order quantity in response to disruptions. For the underweighting case, we set α d  = −0.2 and β d  = −0.1. For the overweighting case, we set α d  = −0.1 and β d  = −0.2. This choice of α and β makes the underweighting and overweighting significant while keeping their values within the typical ranges observed in Table 5. The parameters η, a d , and b d are set the same as their counterparts in Sect. 5.1.

The capacity distribution is almost the same as in the beer game, except the down-state capacity is fixed rather than random. The order placed by stage 4 is bounded above by ξ d or ξ u (depending on the state). We assume ξ d  < 50 < ξ u .

We investigate three cases, representing two extremes and a more realistic hybrid scenario: (1) all stages overweight the supply line (OW); (2) all stages underweight the supply line (UW); (3) stages 3 and 4 overweight the supply line and stages 1 and 2 underweight the supply line (OUW). In case UW, all players follow traditional beer game behaviors. In case OW, players are aware of the supply disruptions and put more weight on the supply line. Case OUW represents the case in which players closer to the supply pay more attention to the supply line while players closer to the demand pay more attention to their own inventory.

We vary γ d to model stage 1’s reaction to disruptions. The order SD of each stage during disruptions is shown in Fig. 5. We fix ξ u  = 60, ξ d  = 40 and p d  = 0.1 and vary p u among 0.2, 0.5 and 0.8 to represent slow, medium, and quick recovery, respectively. Stage 4’s order SD during disruptions is zero or close to zero because of the tight capacity.
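The capacity process described here can be sketched as a two-state Markov chain. The Python below is an illustrative reading of the text, not the simulation software used in the study: it assumes that p d is the per-period probability of failing while up, that p u is the per-period probability of recovering while down, and that the chain starts in the up state. The function name is hypothetical.

```python
import random

def simulate_capacity(p_d, p_u, xi_u, xi_d, periods, seed=0):
    """Generate a capacity sequence from a two-state (up/down) Markov chain."""
    rng = random.Random(seed)
    up = True  # assumed initial state
    capacities = []
    for _ in range(periods):
        capacities.append(xi_u if up else xi_d)
        if up:
            up = rng.random() >= p_d  # fail with probability p_d
        else:
            up = rng.random() < p_u   # recover with probability p_u
    return capacities

# Slow-recovery setting from the text: xi_u = 60, xi_d = 40, p_d = 0.1, p_u = 0.2.
caps = simulate_capacity(0.1, 0.2, 60, 40, 100_000)
down_fraction = caps.count(40) / len(caps)
print(down_fraction)
```

The empirical fraction of down periods can be compared with the stationary distribution of the chain to check that a run is long enough.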

Fig. 5 Effect of stage 1’s reaction to supply disruptions. a OW p u = 0.2, b UW p u = 0.2, c OUW p u = 0.2, d OW p u = 0.5, e UW p u = 0.5, f OUW p u = 0.5, g OW p u = 0.8, h UW p u = 0.8, i OUW p u = 0.8

From Fig. 5, when γ d = 0, there is still some order variability at all stages; i.e., the supply disruption transfers downstream. That is because the supply disruption affects the downstream stages through changes in their inventory levels and supply lines. But if the retailer does not react to disruptions (much), then the order variability is limited, especially when the disruptions are not severe.

When |γ d | is large enough, the retailer generates sufficient demand uncertainty for his or her upstream partner. Underweighting the supply line then magnifies the order shock such that the BWE appears from stage 1 to stage 3 consistently in case UW and from stage 1 to stage 2 consistently in case OUW. At the same time, overweighting the supply line can effectively reduce the order shock, which causes the RBWE for the wholesaler and distributor in case OW and the distributor in case OUW. However, the sign of γ d is not an important factor in determining the order SD since the order SDs are quite symmetric in Fig. 5. Note also that, as p u increases, the order SDs decrease, reflecting the improved reliability of the system.

Figure 6 depicts the order SD for each case in Fig. 5, taken at γ d = −10. It represents the general pattern of order SDs that can occur in the supply chain. The beer game experiment indicates that people may respond to supply disruptions differently. This is also shown in Fig. 6, where the order standard deviation can exhibit various types of curves. In reality, if the manufacturer and the retailer have more control over the supply chain (for example, if the manufacturer produces a popular brand and/or the retailer has the advantage of a strong sales channel), then companies close to the manufacturer may weigh the supply line more while companies close to the retailer may weigh the inventory level more. Consequently, the chance of having an “umbrella” pattern of order SDs is high.

Fig. 6 Simulated order SD when γ d = −10

We conclude that, for sufficiently large |γ d |, the retailer generates significant demand uncertainty for the wholesaler. Interestingly, the magnitude of the reaction matters more than the direction, as evidenced by the approximate symmetry with respect to the y-axis in Fig. 5. Moreover, the players’ weighting types determine the ordering of SDs, and therefore the presence of the BWE, RBWE, or umbrella pattern.

Players’ reactions to supply disruptions can be considered as a magnifying force, created by the supply disruption and propagating downstream in the form of an amplification-type RBWE. At the same time, overweighting the supply line can be considered as a damping force that decreases vibrations that are propagating upstream, creating a damping-type RBWE.

5.3 Impact of disruption process

In the base and disruption order functions, it is the variability in the inventory level and supply line that creates order fluctuations during supply disruptions. The inventory level and supply line are, in turn, highly dependent on the capacity process. However, we find that the order SD does not increase monotonically with respect to the failure probability. It usually reaches its highest point when the failure probability is roughly equal to 0.5. In contrast, the order SD decreases monotonically with respect to the recovery probability. Moreover, we find that the recovery probability has more impact on the order SD than the failure probability does.

We examine the impact of the capacity process on order variability in Fig. 7. We set γ d  = −10. In parts a–c, we set p u  = 0.9 and vary p d , while in parts d–f, we set p d  = 0.1 and vary p u .

Fig. 7 Effect of the frequency of supply disruptions. a OW p u = 0.9, b UW p u = 0.9, c OUW p u = 0.9, d OW p d = 0.1, e UW p d = 0.1, f OUW p d = 0.1

When all stages overweight the supply line (parts a and d), the RBWE is evident between stages 1 and 4 for all values of p d and p u , though for some values, σ3 > σ2. When all players underweight (parts b and e), the BWE prevails strongly for stages 1–3, though the SD is small at stage 4 due to the tight capacity. Finally, when upstream players overweight and downstream players underweight (parts c and f), the umbrella pattern is apparent, with the smallest order SDs at stages 1 and 4 and the largest at stages 2 and 3.

As the frequency of supply disruptions increases in parts a–c, the order SD at each stage does not react monotonically. This can be explained as follows. The number of up periods follows a geometric distribution with parameter p d . The mean and variance of the number of up periods decrease as the failure probability increases. The inventory level and supply line approach a stationary level when the number of up periods increases. If the failure probability is small, the variance of the number of up periods is large but the inventory and supply line become stable at the end of the last up period due to a high mean number of up periods. If the failure probability is large, the change in the inventory level and supply line is quick because the mean number of up periods is small, but the variance of the number of up periods is low. Therefore, the greatest fluctuation in the inventory and supply line at the beginning of the disruption is achieved when p d  ≈ 0.5. Put another way, the system is relatively predictable if p d is either large or small, but is less predictable otherwise. Therefore, the order standard deviation reaches its highest point when the failure probability is in the medium range.
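The two moments used in this argument have a simple closed form. As a sketch, let N denote the number of consecutive up periods in one run, and assume that N counts at least one period (support 1, 2, ...) with per-period failure probability p d; the symbol N is introduced here only for illustration.

```latex
\[
  \Pr(N = k) = (1 - p_d)^{k-1} p_d, \qquad k = 1, 2, \ldots
\]
\[
  \mathbb{E}[N] = \frac{1}{p_d}, \qquad
  \operatorname{Var}[N] = \frac{1 - p_d}{p_d^{2}}
\]
```

Both quantities fall as p d rises. For example, p d = 0.1 gives a mean of 10 and a variance of 90, p d = 0.5 gives a mean of 2 and a variance of 2, and p d = 0.9 gives a mean of about 1.11 and a variance of about 0.12.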

As the recovery speed increases in parts d–f, the order SDs at each stage decrease monotonically. The reason is that the longer a disruption lasts, the further the inventory level falls, so the order quantity grows with the number of down periods, which expands the range of order quantities. Faster recovery shortens disruptions and narrows this range. Therefore, the order SD decreases with the recovery probability.

In Fig. 8, we set p u  = p d ; that is, the failure and recovery probabilities are the same, and therefore so are the stationary probabilities of being up and down. We set γ d  = −10, and ξ u and ξ d to 65 and 45 to ensure that the overall capacity is sufficient to meet the demand. As p u and p d increase, disruptions become more frequent but shorter. As this happens, Fig. 8 is closer to parts d–f than it is to parts a–c in Fig. 7. This indicates that the recovery process dominates the disruption process in terms of order SD. This can also be verified from the magnitude of Fig. 7, where the maximum value of the y-axis in parts a–c is half of that in parts d–f.
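The claim about the stationary probabilities, and the check that capacity is sufficient, can be written out as a short worked calculation. The symbols π u and π d for the stationary probabilities of the up and down states are introduced here for illustration, and the mean demand of 50 is assumed to carry over from Sect. 5.1.

```latex
\[
  \pi_u \, p_d = \pi_d \, p_u, \qquad \pi_u + \pi_d = 1
  \quad\Longrightarrow\quad
  \pi_u = \frac{p_u}{p_u + p_d}, \qquad \pi_d = \frac{p_d}{p_u + p_d}
\]
\[
  p_u = p_d \;\Longrightarrow\; \pi_u = \pi_d = \tfrac{1}{2}, \qquad
  \pi_u \, \xi_u + \pi_d \, \xi_d = \tfrac{1}{2}(65) + \tfrac{1}{2}(45) = 55 > 50
\]
```

Under these assumptions the long-run average capacity exceeds the mean demand by 5 units per period regardless of the common value of p u and p d.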

Fig. 8 Effect of frequency of supply disruptions with equal up and down probabilities. a OW, b UW, c OUW

We conclude that, just as in the beer game, players’ supply-line weighting type has a significant impact on whether the BWE, RBWE, or umbrella pattern is present. Moreover, the order SDs are not monotonic with respect to p d but decrease monotonically with p u . The recovery process, therefore, seems to have a stronger impact on order SDs than the disruption process.

6 Conclusion

Managers face uncertainty not only from the demand side but also from the supply side. The past several years have seen a range of high-profile disruptions or near-disruptions, including Y2K, September 11th, SARS, the Indian Ocean tsunami, and Hurricane Katrina. These low-probability, high-impact events have a tremendous impact on the supply chain, as do smaller, less newsworthy disruptions that happen on a regular basis. In this paper, we studied potential forms of ordering behavior during disruptions by introducing supply uncertainty into the beer game and a simulation model.

From our beer game experiment and simulation studies, we conclude that the BWE is not a ubiquitous phenomenon and suggest that a reverse phenomenon, the RBWE, often occurs because of supply disruptions. We have identified two independent ways to generate the RBWE: overweighting of the supply line and overreacting to capacity shocks. The first causes the RBWE by smoothing the order pattern upstream (damping-type RBWE). The second propagates supply disruptions downstream in the form of increased order volatility (amplification-type RBWE). Both causes provide some explanation as to why recent empirical studies have concluded that the BWE is not as prevalent as previously thought (Baganha and Cohen 1998; Cachon et al. 2007).

Behavioral supply chain research addresses the question of how people behave in various settings and the effect of that behavior on the supply chain as a whole. Our descriptive models examine the relationship between individual behaviors and order variability. We believe they can serve as a foundation for studies involving questions of designing and managing flexible supply chains. Possible research questions might include:

  • When supply disruptions occur, how should a manager of site(s) in the middle of the supply chain manage flexibility in the presence of both upstream shortages and demand changes due to human reactions?

  • As the BWE and RBWE indicate that both demand and supply uncertainty may be amplified, where should flexibility be added to the supply chain? Should it be close to the sources of uncertainty or near the locations with the most direct impact on supply chain performance?

  • The current literature on flexibility (see the recent reviews by Buzacott and Mandelbaum 2009, Chou et al. 2009 in this issue of the journal) provides many means to plan and manage flexible resources. Will increased flexibility reduce or increase the BWE and RBWE? Would chaining or other well studied flexible structures still be preferable if BWE/RBWE is a major concern of the decision maker? If not, how should the principles and guidelines on flexible structure design be changed?

  • How can contracts be designed between two parties in a supply chain to share the cost and benefit of flexibility to cope with both demand and supply uncertainty?