1 Introduction

In this paper, we show that endogenous business cycles (inventory cycles) arise from a combination of nonconvex costs and economic interactions among firms. In particular, we show that the aggregate of randomly behaving microeconomic agents generates deterministic collective behavior via interactions.

Economic fluctuations are certainly an important issue in economics, but what causes such fluctuations? This natural and fundamental question has not yet been answered in economics. For example, Cochrane (1994) demonstrates that popular economy-wide shocks (e.g., monetary shocks or oil prices) fail to explain the bulk of economic fluctuations. He writes, “What shocks are responsible for economic fluctuations? Despite at least two hundred years in which economists have observed fluctuations in economic activity, we still are not sure” (p. 295). We cannot resort to mysterious aggregate exogenous shocks to explain aggregate fluctuations.

However, because an economy is composed of many firms, it might be expected that aggregate fluctuations stem from firm-specific shocks and inherit some properties from them. At the micro level, economic activities are characterized by lumpiness and discreteness. Managers temporarily shut down plants or change the number of shifts for inventory adjustment. This behavior clearly contradicts the standard production-smoothing theory in microeconomics textbooks. In fact, the production-smoothing theory has been empirically rejected (see Blinder and Maccini 1991). It is found that, when some fixed costs exist (e.g., ordering costs), the cost curve is kinked and nonconvexity emerges, which implies that the cost-minimizing strategy of firms is production bunching (or the bunching of orders). This theory can account for the stylized fact that production is more volatile than sales (e.g., Hall 2000). The aim of this paper is to investigate how these firm-level characteristics are related to aggregate fluctuations.

There are two different views concerning the effect of microeconomic characteristics on aggregate fluctuations. One is that microeconomic characteristics disappear at the macroscopic level. Indeed, less attention has been paid to the role of idiosyncratic shocks in the macroeconomic literature simply because these shocks are considered to average out in the aggregate by the law of large numbers (LLN). Lucas (1977)’s argument is a typical one.Footnote 1 According to this view, the observed aggregate fluctuations must be explained by the presence of shocks that have a common origin across firms in the economy. By definition, they are aggregate shocks.

On the other hand, another view, which has attracted much attention in recent years, emphasizes the effects of interactions between sectors (or firms), especially input–output linkages. In fact, positive comovement across sectors is a salient feature of the business cycle. In contrast to the LLN argument, it is emphasized that the effects of interactions between sectors (or firms) through input–output linkages, which propagate idiosyncratic shocks throughout the economy, cause the aggregate fluctuations that are unexplained by the usual aggregate shocks (e.g., Long and Plosser 1983; Carvalho 2010; Foerster et al. 2011; Acemoglu et al. 2012; Carvalho and Gabaix 2013; for a review, see Carvalho 2014). The key element of models used in these studies is the existence of sectors that have disproportional impacts on the entire economy. This is due to the heterogeneity of input–output linkages; that is, sectors are not equally intense material suppliers. Shocks to general purpose technologies such as oil, electricity, and iron and steel propagate to all sectors through the input–output linkages because most sectors rely on them. In this sense, the microeconomic shocks accounting for aggregate fluctuations in these studies can be regarded as “pseudo–macroeconomic” shocks. There are other strands of literature that are related to our analysis, for example, Bak et al. (1993) and Durlauf (1993), where nonconvex technology and (local) interactions are explicitly considered. Bak et al. (1993) demonstrate that small shocks to final goods can cause an “avalanche” of production increases via supply chains.

Even though such interactions explain how aggregate fluctuations can be caused by microeconomic shocks, there exist broad distinctions between our model and previous studies. In contrast to Carvalho (2010) and Acemoglu et al. (2012), we assume that each firm is small compared to the economy as a whole and can hardly influence the outcome of the economy on its own. Furthermore, in contrast to Bak et al. (1993), in which shocks to final goods are assumed to be exogenous, we assume that demand for the products depends on the overall economic condition. We assume that on the one hand, the behavior of a firm is affected by the state of the economy as a whole, but on the other hand, the economy is composed of the firms themselves. In other words, the macroscopic state of the economy not only is an aggregation of the firms, it also prescribes the macroeconomic environment in which the firms engage in business activities. This feedback loop generates rich interesting phenomena. This idea is closely related to the “macro-micro loop” emphasized by Hahn (2002), where a macro variable acts as an externality. We show that this mechanism can generate collective behavior that is different from the motion of an individual firm.

On this point, our approach is close in spirit to heterogeneous interacting agent models (see, e.g., Delli Gatti et al. 2009; Stiglitz and Gallegati 2011; for a survey, see Hommes 2006), especially to Aoki’s methods (Aoki 1996, 2002; Aoki and Shirai 2000; Aoki and Yoshikawa 2007). Aoki and coauthors have developed the application of jump Markov processes, where the evolution of the probability distribution is described by master equations. Although there is no doubt that Aoki’s methods expand the scope of macroeconomic analysis, there exist some difficulties and situations that cannot be dealt with in his framework (see Sect. 4.1). In particular, in our model, firms’ inventories are distributed continuously and affect firms’ choice of production. That is, the system is described by an infinite-dimensional random variable, which is the distribution of inventories (and production).

By using the propagation of chaos instead of Aoki’s methods, we present an alternative method to investigate how the system (i.e., the probability distribution) behaves and changes its properties when parameters are changed. On the basis of cost-function nonconvexity and the feedback effect, we show that a regular cyclical movement at the macroscopic level emerges given that the effect exceeds a certain threshold. This cyclical movement is endogenous and is an explanation for the Kitchin cycle.

The rest of the paper is organized as follows. Section 2 discusses the firm behavior characterized by nonconvexities, which can explain the empirical puzzle that the volatility of production is larger than that of sales. Section 3 discusses the importance of inventory movement for understanding business cycles. Section 4 contains our main results and shows that the simple LLN cannot be applied and that an endogenous movement emerges. Section 5 concludes.

2 Firm behavior: production and inventory

The standard cost function has been assumed to be convex in output and in the change of output in standard microeconomic textbooks. This means that for cost minimization, the manager of a firm must smooth its production by using inventories as a buffer stock given that there exist sales fluctuations (production-smoothing models; see, e.g., Holt et al. 1960). This implies that production is less variable than sales. However, this prediction is known to be inconsistent with the empirical data (see Blinder and Maccini 1991). In particular, the correlation between sales and inventories is positive, not negative as predicted by production-smoothing models. Namely, production is more volatile than sales. The time evolution of inventories that firms have cannot be explained by the motive for a buffer stock.

Blinder and Maccini (1991) present a well-known (S, s) model in which a firm places an order of size \(S-s\) whenever its inventories reach the lower bound s. They show that it is optimal for a firm to place infrequent large orders when fixed costs of ordering exist, leading to bunched orders. The inventory series is characterized by a sawtooth pattern. The (S, s) model is strongly supported by empirical data (e.g., Hall and Rust 2000). Although they emphasize retail and wholesale inventories, in other words, the lumpiness of the delivery process, the bunching of orders by the retail sector can induce production bunching in the manufacturing sector even though the latter has the usual increasing marginal costs. For example, Cooper and Haltiwanger (1992) point out this possibility, saying, “Downstream bunching of orders by retailers may be the source of upstream production bunching by manufactures” (p. 116).

In relation to these studies, a close examination of data at the micro level (especially for the automobile industry) reveals that changes in production are quite lumpy. Managers may shut their plants down for a week or change the number of shifts, thus varying production. Ramey (1991), Cooper and Haltiwanger (1992), Bresnahan and Ramey (1994), and Hall (2000) focus on the nonconvexity of the cost function to explain these behaviors. They show that when there are fixed costs associated with opening a plant and adding an additional shift, production bunching is an optimal strategy. For example, Cooper and Haltiwanger (1992) present a simple model and show that a start-up cost for a production run and a constant marginal cost of production lead to production bunching.

Fig. 1 A nonconvex cost function. The horizontal (vertical) axis is quantity (costs)

To illustrate how the cost function associated with such fixed costs might look, a simple nonconvex cost function is depicted in Fig. 1. If a manager has to produce, on average, output \(Q \equiv (A+B)/2\), the average cost can be reduced by alternating between production at A and B rather than production at Q, that is, by production bunching. Namely, even if there is no fluctuation in sales (or a small one around Q), firms optimally change their production level substantially, and therefore, the nonconvexity leads to excess volatility of production. Furthermore, this nonconvexity is quantitatively important to explain the variation of output. Bresnahan and Ramey (1994) write, “[M]ost of the variance of output comes from varying hours over the nonconvex portions of the cost function, rather than from varying hours over the convex portions of the cost function” (p. 610).
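As a minimal numerical sketch of the bunching argument, assume the cost structure of the simple model mentioned above: a fixed start-up cost whenever output is positive plus a constant marginal cost. The names `F` and `c` and all numerical values below are illustrative assumptions, not taken from Fig. 1.

```python
def cost(q, F=1.0, c=0.5):
    """Assumed nonconvex cost: start-up cost F if q > 0, plus constant marginal cost c."""
    return 0.0 if q == 0 else F + c * q

A, B = 0.0, 2.0            # low and high production levels (illustrative)
Q = (A + B) / 2            # output that must be produced on average

cost_smooth = cost(Q)                      # produce Q in every period
cost_bunched = (cost(A) + cost(B)) / 2     # alternate between A and B

print(cost_smooth, cost_bunched)
```

Under these assumptions, alternating between A and B pays the start-up cost in only half of the periods, so the average cost falls by F/2 relative to producing Q every period.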

The question then arises as to whether the automobile industry is representative of all manufacturing or is a special case. On this point, Mattey and Strongin (1997) consider two extremes of technology types. “Pure assemblers” adjust their output by varying plants’ work period, that is, temporary plant shutdowns, adding or dropping shifts, and adding overtime hours (Saturday work). The automobile and transportation industries are typical examples. The other type is “pure continuous processing” operations, where output adjustment is carried out by varying the instantaneous flow rates of production rather than the work-period margin. Mattey and Strongin (1997) conclude that “pure assembly” is a better characterization for manufacturing on the ground that the plant work-period margin is commonly used. Moreover, among these output-adjustment margins, changes in the number of shifts are quantitatively important. Bresnahan and Ramey (1994) show that at a quarterly frequency, changes in the number of shifts account for \(40\,\%\) of plant-level output volatility in the automobile industry and are the most important contributor to the variation of output. Shapiro (1996) shows that close to half of the changes in employment in U.S. manufacturing take place on late shifts. Thus, we focus on changes in the number of shifts in the following analysis.

The discussion above suggests that the behavior of firms is as follows. For the sake of simplicity, we assume that the firms choose one of two production states, high and low (the same simplification can be found in the literature; see, for example, Bak et al. (1993) and Durlauf (1993)). Suppose that a manufacturing firm has sufficient inventories (or the wholesale and retail inventories that the firm supplies) and demand is low. The firm chooses a low production state (e.g., one-shift production) to reduce its inventories. After eliminating the excess inventories, the firm waits for demand to improve. If this happens, the firm adds a new shift to the existing line and increases its output. Even if the sales forecast is overestimated, it is optimal for a manager to maintain the high production for a while because of the fixed costs. After it replenishes its inventories, the firm lays off the workers on the second shift and returns to the initial state.

Note that the above pattern of behavior is not deterministic, but is exposed to various idiosyncratic shocks. Suppose first that the demand (or sales) of a firm indexed by \(i \in \{ 1,\ldots , N \}\), \(s^i_t\), fluctuates around \(\overline{S^i}\),

$$\begin{aligned} s^i_t = \overline{S^i} + \xi ^i_t \end{aligned}$$
(1)

Here, \(\xi ^i_t\) represents a temporary demand shock with mean 0, that causes unintended inventory investment, and N is the number of firms in this economy. We write \(\xi ^i_t \equiv - \frac{\sigma _2 dW^i_{2,t}}{dt}\), where \(W_{2, t}^i\) is a standard Brownian motion, \(\sigma _2 > 0\) is a constant, and \(\frac{dW^i_{2, t}}{dt}\) is the formal derivative with respect to t. Because we focus on the fluctuations around \(\overline{S^i}\) instead of the level of sales itself, we normalize \(\overline{S^i}=0\) and, thus, expected aggregate sales \(\sum _{i=1}^{N} \overline{S^i} =0\). By definition, inventory investment can be written as the difference between production and sales:

$$\begin{aligned} dy_t^i = (x_t^i - s^i_t)dt \end{aligned}$$
(2)

where \(x_t^i\) denotes the production of firm i.Footnote 2 We assume that production is described by a motion in a double-well potential:

$$\begin{aligned} dx_t^i=(-V'(x_t^i) - e y_t^i)dt + \sigma _1 dW^i_{1,t}, \ \ \ V(x) = \frac{1}{4}x^4 - \frac{1}{2}x^2 \end{aligned}$$
(3)

where \(W_{1,t}^i\) is a standard Brownian motion and \(\sigma _1, e > 0\) are constants. The stochastic term represents various idiosyncratic shocks that affect the target level of production—for example, changes in the price of materials. The potential function V(x) is shown in Fig. 2.

Fig. 2 Potential function V(x)

The region around \(-1 (+1)\) corresponds to the low (high) production state. This model is a generalization of a two-state Markov chain. Suppose, for example, that \(e=0\). Because \(-1\) and 1 are the local minima of V, \(x_t^i\) stays near one of them until a large shock occurs, at which point \(x_t^i\) moves toward the other local minimum. Thus, the path of \(x_t^i\) alternates between low and high production.Footnote 3 The second term on the right-hand side of Eq. (3), \(e y^i_t\), represents the effect of inventories on the choice of the production level. That is, if \(y_t^i\) is large, the manager is likely to choose low production around \(x_t^i=-1\).
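For reference, the location of the two wells and the height of the barrier between them follow directly from the definition of V in Eq. (3):

$$\begin{aligned} V'(x)&= x^3 - x = x(x-1)(x+1), \quad V''(x) = 3x^2 - 1, \\ V''(\pm 1)&= 2 > 0, \quad V''(0) = -1 < 0, \\ V(0) - V(\pm 1)&= 0 - \left( \tfrac{1}{4} - \tfrac{1}{2} \right) = \tfrac{1}{4} \end{aligned}$$

Hence \(x=\pm 1\) are local minima and \(x=0\) is a local maximum, and when \(e=0\) a switch between the two production states requires shocks large enough to carry \(x_t^i\) over a barrier of height 1/4.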

Combining these equations, the behavior of firm i is described by the following two-dimensional stochastic differential equations:

$$\begin{aligned} dx^i_t= & {} (-V'(x^i_t) - e y^i_t)dt + \sigma _1 dW^i_{1,t}, \quad V(x) = \frac{1}{4}x^4 - \frac{1}{2}x^2 \nonumber \\ dy^i_t= & {} x^i_t dt + \sigma _2 dW^i_{2,t} \end{aligned}$$
(4)

where \(W_{k,t}^i, k=1,2\) are independent Brownian motions, and \(\sigma _1, \sigma _2 > 0\) represent the intensities of idiosyncratic shocks.Footnote 4
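The second line of Eq. (4) follows by substituting Eq. (1) into Eq. (2) and using the normalization \(\overline{S^i}=0\) together with the definition of \(\xi ^i_t\):

$$\begin{aligned} dy_t^i = (x_t^i - s_t^i)dt = \left( x_t^i - \overline{S^i} - \xi ^i_t \right) dt = x_t^i dt + \sigma _2 \frac{dW^i_{2,t}}{dt} dt = x_t^i dt + \sigma _2 dW^i_{2,t} \end{aligned}$$

Because of the minus sign in the definition of \(\xi ^i_t\), an unexpectedly high demand corresponds to a negative increment \(\sigma _2 dW^i_{2,t}\) and therefore lowers inventories.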

These equations reproduce the firm behavior discussed above. Suppose that \(x_t^i\) is near \(-1\) and \(y_t^i>0\): the firm has sufficient inventories and chooses low production. Because of the effect of \(y_t^i\), \(x_t^i\) stays around \(-1\) until \(y_t^i\) is sufficiently reduced. When \(y_t^i < 0\), production \(x_t^i\) is pushed up by the shortage of inventories. Once it passes the top of the potential barrier (around 0), \(x_t^i\) moves toward high production (+1), and the inventories are replenished. The stochastic terms \(\sigma _1 dW^i_{1,t}\) and \(\sigma _2 dW^i_{2,t}\) represent idiosyncratic shocks to firm i. For example, a good market condition \(\sigma _2 dW^i_{2,t} < 0\) reduces the inventories beyond expectation and \(x^i_t\) might stay around \(+1\) longer. Sample paths of Eq. (4) are depicted in Figs. 3 and 4. Figure 3 shows that \(x_t^i\) oscillates between \(+1\) and \(-1\) with the stochastic noise. Figure 5 shows the result of a numerical simulation of \(N=20{,}000\) independent copies of Eq. (4); the histogram of production is clearly bimodal.
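A sample path of Eq. (4) can be sketched with a standard Euler-Maruyama discretization. The scheme, the initial state, the random seed, and the function name `simulate_firm` are assumptions made for illustration; only \(\sigma _1^2 = \sigma _2^2 = 1/4\), \(e=0.1\), and \(\Delta t = 0.01\) are taken from the figure captions.

```python
import numpy as np

def simulate_firm(n_steps=20_000, dt=0.01, e=0.1, sigma1=0.5, sigma2=0.5, seed=0):
    """Euler-Maruyama discretization of Eq. (4) for a single firm (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Brownian increments: independent N(0, dt) draws for W_1 and W_2.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2))
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    x[0], y[0] = -1.0, 0.0          # assumed initial state: low production, no inventories
    for t in range(n_steps):
        drift_x = -(x[t] ** 3 - x[t]) - e * y[t]   # -V'(x) - e*y
        x[t + 1] = x[t] + drift_x * dt + sigma1 * dW[t, 0]
        y[t + 1] = y[t] + x[t] * dt + sigma2 * dW[t, 1]
    return x, y

x, y = simulate_firm()
print(x.min(), x.max())
```

Running many such paths with different seeds and collecting the terminal values of `x` would give a histogram of the kind described for Fig. 5, under the same assumptions.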

Fig. 3 A sample path of production x in Eq. (4) with \(\sigma _1^2 = \sigma _2^2 = 1/4\) and \(e=0.1\). The interval of a single time step, \(\Delta t\), is 0.01. The horizontal axis is the number of steps

Fig. 4 A sample path of inventories y in Eq. (4) with \(\sigma _1^2 = \sigma _2^2 = 1/4\) and \(e=0.1\)

Fig. 5 Histogram of \(N=20{,}000\) independent copies of \(x_t^i\)

3 Inventory investment and business cycles

3.1 Importance of inventory investment

As is well known, inventory investment behavior is a key element in explaining aggregate fluctuations. For example, Blinder and Maccini (1991) demonstrate that the drop in inventory investment accounted for \(87\,\%\) of the drop in GNP during the average postwar recession in the United States. In addition, a large part of short-run fluctuations (business cycle frequencies) are explained by the behavior of inventory investment. Blinder (1981) says, “Inventory fluctuations are important in business cycles; indeed, to a great extent, business cycles are inventory fluctuations” (p. 500).

Furthermore, there is a consensus in the empirical literature that inventory movements are procyclical and that production is more volatile than sales at the sector and aggregate levels (see Ramey and West 1999 and the references cited in Sect. 2). As discussed in the previous section, these features contradict the production-smoothing theory, which predicts countercyclical inventory movements and smooth production. Thus, from a macroscopic point of view, inventories are considered destabilizing factors because recessions are aggravated by declining inventories (e.g., Metzler 1941).

Interestingly, these “stylized facts” seem to depend on which frequencies we examine. Wen (2005) examines quarterly aggregate data from the U.S. and OECD countries and shows that production and inventories exhibit drastically different behaviors at low and high frequencies. According to his analysis, the procyclicality of inventory investment can be observed only at relatively low cyclical frequencies such as business-cycle frequencies (about 8–40 quarters per cycle). On the other hand, at a high frequency (2–3 quarters per cycle), production is less volatile than sales and inventory investment is strongly countercyclical. This can be due to managers being unable to handle unexpected demand shocks at such high frequencies because of the sluggishness involved when making production adjustments, wherein inventories act as buffer stock as the production-smoothing theory predicts.

As discussed above, the choice of frequencies (or time scales) matters. For example, Hall (2000) examines weekly data for automobile assembly plants and shows that two nonconvex margins (changes in the number of shifts and temporary plant shutdowns) play an important role in explaining production behavior. He particularly emphasizes intermittent production, for example, weeklong temporary plant shutdowns (for a recent application of this model, see Copeland et al. 2011). Although weeklong shutdowns used to vary output are relevant to high-frequency (weekly or monthly) production behavior, they are not suitable for explaining business cycles. Bresnahan and Ramey (1994) show that changes in the number of shifts are substantially more important at the quarterly frequency than at the weekly frequency. In fact, they are the most important contributor to the quarter-on-quarter variation of output. Bresnahan and Ramey (1994) write, “While closing the plant temporarily might be important for the week-to-week variation in output, it might not be as important at the quarterly frequency” (p. 609). This is why, in Sect. 2, we emphasize changes in the number of shifts to explain the business cycle.

4 Aggregation: interaction effects

4.1 Interaction

In Sect. 2, we discussed the importance of lumpiness at the micro level. However, it is unclear whether this lumpiness has any impact on business cycles. Because an economy is composed of a large number of firms, each exposed to idiosyncratic shocks, it might be expected that lumpiness is irrelevant at the macroscopic level. Indeed, the conventional notion is that microeconomic behaviors would cancel each other out by LLN and that aggregate exogenous shocks are needed to explain business cycles (e.g., Lucas 1977). According to this view, without aggregate shocks, microeconomic structures such as cost-function nonconvexity have no aggregate implications.

However, recent theoretical investigations present another possibility: Aggregate fluctuations can result from microeconomic shocks. The distinct feature of these models is input–output linkages through which the shocks propagate to other sectors (e.g., Long and Plosser 1983; Acemoglu et al. 2012; Foerster et al. 2011; for a review, see Carvalho 2014). In these models, a positive productivity shock to sector i increases not only sector i’s output, but also the output of other sectors that use good i for materials. In particular, when sectoral outdegrees follow a fat-tailed distribution, aggregate volatility decays at a lower rate as the size of the economy tends to infinity (Carvalho 2010; Acemoglu et al. 2012). Shocks to general purpose technologies such as oil, electricity, and iron and steel propagate to all sectors because most sectors rely on them. Significant asymmetry—the presence of hubs—leads to aggregate fluctuations. In this sense, their models are closely related to the “granular hypothesis” (Gabaix 2011) that there exist sectors (or firms) that have a disproportional impact on aggregate fluctuations.Footnote 5

Although the literature discussed above does not take into account nonconvex technology, nonconvexity and interaction are explicitly considered in Bak et al. (1993) and Durlauf (1993). Assuming production bunching and supply chains, Bak et al. (1993) demonstrate that the cascade of production is caused by a small shock to final goods (“avalanche”). Although the shocks to final goods are exogenous in their model, it is plausible to assume that the shocks also depend on economic conditions. That is, if a large fraction of firms expanded their production, there would be increases in national income (or GDP) and in the sales of final goods. In general, a firm is affected by the condition of the entire economy, while the economy consists of the firms themselves. This feedback (or interaction) effect is an important aspect, and will be shown to be the origin of business cycles. In this respect, our model is related to Durlauf (1993), who explores the role of complementarities (the positive spillover effect) and the resulting stationary probability distribution. Assuming that each individual industry chooses one of two types of production (technologies 1 and 2), he shows that when strong enough, these complementarities lead to multiple equilibria. Although the assumption of binary choice (technology 1 or 2) simplifies the analysis significantly, more heterogeneous situations can also be considered. In our model, firms’ inventories are distributed continuously and affect firms’ choice of production. Inventories act as a state variable of a firm and their behavior cannot be described by the binary choice model. To discuss the evolution at the macroeconomic level, we have to deal with the distribution, which is an infinite-dimensional variable. In this sense, our model is more heterogeneous than that of Durlauf (1993). Of course, there is no a priori reason to assume that the resulting distribution is stationary. Thus, we require an alternative framework to investigate the time evolution of an economy at the macroscopic level.

A series of studies conducted by Aoki (Aoki 1996, 2002; Aoki and Shirai 2000; Aoki and Yoshikawa 2007) address this problem and present a framework called jump Markov processes. In this framework, given that the transition rate from one state to another one is specified, the evolution of the probability distribution is described by the master equation. Although there is no doubt that Aoki’s methods expand the scope of macroeconomic analysis and can be applied to various problems (for an application to the Diamond search model, see Aoki and Shirai (2000)), there exist some difficulties, and this framework is not suited for our problem. In particular, in our model, firms’ inventories are heterogeneous and distributed continuously, and a diffusion process is considered. In such a situation, it is difficult to explicitly specify the transition rate and the master equation. In addition, it is difficult to investigate the nonstationary behavior of the probability distribution by solving master equations. We use the propagation of chaos instead of Aoki’s methods to investigate how the probability distribution behaves, without directly seeking the probability distribution itself.

4.2 Model with interaction

We consider an economy consisting of a large number of firms and add an interaction term to Eq. (4). We assume that the sales of a firm i depend on the conditions of the overall economy:

$$\begin{aligned} s^{i, N}_t = h \langle x \rangle + \xi ^i_t, \ \ \langle x \rangle \equiv \frac{1}{N}\sum _{j=1}^N x_t^{j, N} \end{aligned}$$
(5)

where \(0<h<1\) and N is the number of firms in the economy.Footnote 6 The assumption that h is less than 1 means that the sales do not increase as much as national income does. In other words, h is the marginal propensity to consume. In addition, firms are assumed to adjust their production depending on the expectation of the sales. Incorporating these effects into our model, Eq. (4) gets modified as follows:

$$\begin{aligned} dx_t^{i, N}= & {} (-V'(x_t^{i, N}) - e y_t^{i, N} + D (E[s_t^{i, N}] - x_t^{i, N}))dt + \sigma _1 dW_{1,t}^i, \nonumber \\ V(x)= & {} \frac{1}{4}x^4 - \frac{1}{2}x^2\\ dy_t^{i, N}= & {} (x_t^{i, N} - s_t^{i, N})dt\nonumber \end{aligned}$$
(6)

where \(E[s_t^{i, N}]\) refers to the expectation of \(s_t^{i, N}\) and \(D >0\). The stochastic term \(\sigma _1 dW_{1,t}^i\) includes estimation errors of \(E[s_t^{i, N}]\) by managers. Because \(\xi ^i_t\) has mean 0, Eq. (5) gives \(E[s_t^{i, N}]= h \langle x \rangle \), and we finally obtain

$$\begin{aligned} dx_t^{i, N}= & {} (-V'(x_t^{i, N}) - e y_t^{i, N} + D (h \langle x \rangle - x_t^{i, N}))dt + \sigma _1 dW_{1,t}^i, \nonumber \\ V(x)= & {} \frac{1}{4}x^4 - \frac{1}{2}x^2 \\ dy_t^{i, N}= & {} (x_t^{i, N} - h \langle x \rangle ) dt + \sigma _2 dW_{2,t}^{i}\nonumber \end{aligned}$$
(7)

The term \(D(h \langle x \rangle - x_t^{i, N})\) means that the production of a firm, i, rises if the marginal propensity to consume h times the average of production \(\langle x \rangle \) is higher than the production of the firm i. On the other hand, by definition, \(\langle x \rangle \) consists of all firms in the economy. Through this feedback mechanism, an individual firm interacts with all other firms (global interactions). In this sense, D can be viewed as the strength of the interaction effects. Note that our model should not be interpreted to mean that each firm knows the current production of all other firms. A key assumption in our model is the “macro-micro loop” (Hahn 2002). Even if each firm focuses on the sales of its own product, the macroeconomic environment represented by \(\langle x \rangle \) affects its sales. Due to this effect, the firm, whether intentionally or unintentionally, interacts with all other firms.
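The interacting system in Eq. (7) can be simulated in the same way as a single firm, with the cross-sectional mean \(\langle x \rangle \) recomputed at every step. The sketch below is only illustrative: the Euler-Maruyama scheme, the initial distribution, the function name `simulate_economy`, and in particular the values of `D` and `h` are assumptions, and no claim is made that these values produce cycles.

```python
import numpy as np

def simulate_economy(N=500, n_steps=1_000, dt=0.01, e=0.1, D=1.0, h=0.5,
                     sigma1=0.5, sigma2=0.5, seed=0):
    """Euler-Maruyama sketch of Eq. (7); returns the path of the average production <x>."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=N)   # assumed i.i.d. initial production
    y = rng.normal(0.0, 1.0, size=N)   # assumed i.i.d. initial inventories
    mean_x = np.empty(n_steps + 1)
    mean_x[0] = x.mean()
    sqrt_dt = np.sqrt(dt)
    for t in range(n_steps):
        m = x.mean()                                   # <x>, shared by all firms
        dW1 = rng.normal(0.0, sqrt_dt, size=N)
        dW2 = rng.normal(0.0, sqrt_dt, size=N)
        # Both increments use the state at the start of the step.
        dx = (-(x ** 3 - x) - e * y + D * (h * m - x)) * dt + sigma1 * dW1
        dy = (x - h * m) * dt + sigma2 * dW2
        x, y = x + dx, y + dy
        mean_x[t + 1] = x.mean()
    return mean_x

mean_x = simulate_economy()
print(mean_x[:5])
```

Each firm only needs the scalar `m`, which reflects the point made above that firms interact through the macroeconomic environment rather than by observing one another directly.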

Using the empirical measure defined by \(U^{(N)}_t=\frac{1}{N}\sum _{j=1}^{N}\delta _{z_t^{j, N}}\) (\(\delta _z\) denotes the Dirac measure at z), the system of equations in (7) can be written as

$$\begin{aligned} dz_t^{i,N}= & {} \sigma dW_t^i + f(z_t^{i,N})dt + \frac{1}{N} \sum _{j=1}^{N} b(z_t^{i,N},z_t^{j,N}) dt \end{aligned}$$
(8)
$$\begin{aligned}= & {} \sigma dW_t^i + f(z_t^{i,N})dt + \left( \int b(z_t^{i, N}, z) U^{(N)}_t(dz) \right) dt \end{aligned}$$
(9)

Here,

$$\begin{aligned} z_t^{i,N}= & {} \left( \begin{array}{c} x_t^{i,N} \\ y_t^{i,N} \\ \end{array} \right) , \ \ f(z_t^{i,N}) = \left( \begin{array}{c} -(x_t^{i,N})^3 + x_t^{i,N} - e y_t^{i,N} \\ x_t^{i,N} \\ \end{array} \right) , \ \ \sigma = \left( \begin{array}{c@{\quad }c} \sigma _1 &{} 0 \\ 0 &{} \sigma _2 \\ \end{array} \right) \nonumber \\ dW_t^i= & {} \left( \begin{array}{c} dW_{1,t}^i \\ dW_{2,t}^i \\ \end{array} \right) , \ \ b(z_t^{i,N},z_t^{j,N}) = \left( \begin{array}{c} D (h x_t^{j,N} - x_t^{i,N}) \\ - h x_t^{j, N} \\ \end{array} \right) \nonumber \end{aligned}$$

It should be noted that the state of the economy is determined by the empirical measure \(U_t^{(N)}\), which is also a random variable. Equation (9) means that on the one hand, \(U_t^{(N)}\) consists of \(\{ z_t^{i, N} \}_{i=1,\ldots ,N}\) by definition, but on the other hand, individual process \(z_t^{i, N}\) is affected by \(U_t^{(N)}\). The feedback mechanism mentioned above is represented by interaction term \(\int b(z_t^{i, N}, z) U^{(N)}_t(dz)\).
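Because b is affine in its second argument, the interaction term in Eq. (9) depends on the empirical measure only through the cross-sectional mean \(\langle x \rangle \), which recovers the corresponding terms of Eq. (7):

$$\begin{aligned} \int b(z_t^{i, N}, z) U^{(N)}_t(dz) = \frac{1}{N} \sum _{j=1}^{N} \left( \begin{array}{c} D (h x_t^{j,N} - x_t^{i,N}) \\ - h x_t^{j, N} \\ \end{array} \right) = \left( \begin{array}{c} D (h \langle x \rangle - x_t^{i,N}) \\ - h \langle x \rangle \\ \end{array} \right) \end{aligned}$$

Adding this vector to \(f(z_t^{i,N})\) gives the drift \(-V'(x_t^{i,N}) - e y_t^{i,N} + D(h \langle x \rangle - x_t^{i,N})\) for production and \(x_t^{i,N} - h \langle x \rangle \) for inventories.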

For technical reasons, we consider the following modified version of Eq. (8):

$$\begin{aligned} dz_t^{i,N}= & {} f(z_t^{i,N})dt + g(z_t^{i,N}) dW_t^i + \frac{1}{N} \sum _{j=1}^{N} \tilde{b}(z_t^{i,N},z_t^{j,N}) dt \nonumber \\ \tilde{b}(z_t^{i,N},z_t^{j,N})= & {} \left( \begin{array}{c} \tilde{b}_1(z_t^{i,N},z_t^{j,N}) \\ \tilde{b}_2(z_t^{j,N}) \\ \end{array} \right) \nonumber \\ \tilde{b}_1(z_t^{i,N},z_t^{j,N})= & {} \left\{ \begin{array}{l@{\quad }l} D (h x_t^{j,N} - x_t^{i,N}) &{} \mathrm{if} \ \ |D (h x_t^{j,N} - x_t^{i,N})| \le K_1 \\ K_1 &{} \mathrm{if} \ \ D (h x_t^{j,N} - x_t^{i,N}) > K_1 \\ - K_1 &{} otherwise \end{array} \right. \\ \tilde{b}_2(z_t^{j,N})= & {} \left\{ \begin{array}{l@{\quad }l} - h x_t^{j,N} &{} \mathrm{if} \ \ |- h x_t^{j,N}| \le K_2 \\ K_2 &{} \mathrm{if} \ \ - h x_t^{j,N} > K_2 \\ - K_2 &{} otherwise \end{array} \right. \nonumber \end{aligned}$$
(10)

where we have replaced the interaction b with \(\tilde{b}\), \(K_1, K_2 >0 \), and the diffusion coefficient g is understood to be the constant matrix \(\sigma \) of Eq. (8). Although this technical assumption is needed for the following proposition, it is not expected to substantially affect the behavior of \(z_t^i\) and \(U_t^{(N)}\) given that \(K_1\) and \(K_2\) are sufficiently large.Footnote 7
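The truncation in Eq. (10) is a componentwise clamp of each pairwise interaction to \([-K_1, K_1]\) and \([-K_2, K_2]\). A minimal sketch follows; the function name `truncated_interaction` and the test values are hypothetical, and the pairwise matrix costs O(N²) memory, so this form is only meant for small N.

```python
import numpy as np

def truncated_interaction(x, D, h, K1, K2):
    """Return (1/N) * sum_j b~(z_i, z_j) of Eq. (10) for productions x of shape (N,)."""
    # pair[i, j] = D * (h * x_j - x_i), clamped to [-K1, K1] before averaging over j.
    pair = D * (h * x[None, :] - x[:, None])
    b1 = np.clip(pair, -K1, K1).mean(axis=1)      # one value per firm i
    # The second component depends on x_j only, so the average is common to all firms.
    b2 = np.clip(-h * x, -K2, K2).mean()
    return b1, b2

x = np.array([-1.0, 0.5, 2.0])
b1_loose, b2_loose = truncated_interaction(x, D=1.0, h=0.5, K1=100.0, K2=100.0)
b1_tight, b2_tight = truncated_interaction(x, D=1.0, h=0.5, K1=0.1, K2=0.1)
print(b1_loose, b2_loose)
```

When the bounds are not binding, the result coincides with the untruncated terms \(D(h \langle x \rangle - x^i)\) and \(-h \langle x \rangle \). When they bind, the average of the clamped pairwise terms generally differs from the clamp of the average, which is why the sketch averages after clamping, as in Eq. (10).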

Next, we introduce the corresponding mean-field equation given by

$$\begin{aligned} dz_t^i= & {} f(z_t^i)dt + g(z_t^i)dW_t^i + \Bigl ( \int \tilde{b}(z_t^i,z) u_t(dz) \Bigr ) dt \nonumber \\ u_t(dz)= & {} \mathrm{the \ law \ of} \ \ z_t^i \end{aligned}$$
(11)

Assuming that initial condition \(\{ z^i_0 \}_{i=1,\ldots ,N}\) is drawn independently from the identical distribution, \(u_0\), we obtain the following results.

Proposition 1

  1. The mean-field equation (11) is well-posed; that is, there exists a unique solution on [0, T] for any \(T > 0\).

  2. The process \(z_t^{i,N}\) described by Eq. (10) converges in law to the solution of the mean-field equation (11), \(z_t^i\), with speed \(1/\sqrt{N}\); that is,

    $$\begin{aligned} \sup _{N} \sqrt{N} E[\sup _{t \le T} \Vert z_t^{i,N} - z_t^{i}\Vert ] < \infty \end{aligned}$$
    (12)

  3. For any \(k \in \mathbb {N} \) and any k-tuple \((i_1,\ldots ,i_k)\), the law of the process \((z_t^{i_1,N}, \ldots , z_t^{i_k,N}, t \le T)\) converges to \(u_t \otimes \cdots \otimes u_t\).

Proof

Given (10), the interaction \(\tilde{b}\) is bounded; that is, \(\Vert \tilde{b} \Vert ^2 \le K\) for some \(K > 0\). Hence, \(\tilde{b}\) satisfies the linear growth condition on the interactions (H3) in Baladron et al. (2012). Further, the remaining conditions on \(f\), \(g\), and \(\tilde{b}\) (H1, H2, and H4 in Baladron et al. (2012), respectively) hold. Therefore, our claim follows from Theorems 2 and 4 in Baladron et al. (2012).

Property 3 in Proposition 1 is called the propagation of chaos.Footnote 8 This property means that the probability distribution of \((z_t^1, \ldots ,z_t^k)\) evolves as if each element is independent when \(N \rightarrow \infty \). Namely, the motions of k tagged particles approach independent copies of Eq. (11), regardless of the interactions represented by \(\int \tilde{b}(z_t^{i, N}, z) U^{(N)}_t(dz)\). The underlying reason is clarified by the following proposition.

Proposition 2

Property 3 in Proposition 1 is equivalent to \(U^{(N)}_t\) (\(M(\mathbb {R}^2)\)-valued random variables, where \(M(\mathbb {R}^2)\) denotes the set of probability measures on \(\mathbb {R}^2\)) converging in law to constant random variable \(u_t\) (Proposition 2.2 in Sznitman 1991).

This proposition states that the empirical measure \(U^{(N)}_t\) tends to concentrate near \(u_t\), the solution of the mean-field equation (11). In other words, while each element behaves stochastically, the empirical distribution converges to the deterministic one, \(u_t\), as \(N \rightarrow \infty \). In this sense, this can be considered as a form of the law of large numbers. Because of this property, the interaction term \(\int \tilde{b}(z_t^{i, N}, z) U^{(N)}_t(dz)\) converges to the deterministic term \(\int \tilde{b}(z_t^i,z) u_t(dz)\), and therefore, from the point of view of an individual firm, there is no difference between the usual drift term and the interaction effects. This is why the law of k tagged particles (firms) can be described by the product of \(u_t\), \(u_t \otimes \cdots \otimes u_t\). Of course, the solution of Eq. (11), \(u_t\), is different from that of the equation without the interaction term \(\int \tilde{b}(z_t^i,z) u_t(dz)\). It should be noted that there is no a priori reason to assume that \(u_t\) is stationary. In fact, as we will see later, it shows cyclical movement at some parameter values. In what follows, we study the behavior of \(u_t\) when we change the parameters.

4.3 Stability analysis

In the previous section, we demonstrated that \(\displaystyle \lim \nolimits _{N \rightarrow \infty } U^{(N)}_t = u_t\). As Eq. (11) shows, on the one hand, \(u_t\) is the law of \(z_t^i\) by definition, but on the other hand, the stochastic process \(z_t^i\) also depends on \(u_t\). In other words, the equation is nonlinear with respect to \(u_t\), and an explicit solution is infeasible in practice. Therefore, to investigate the evolution of \(u_t\), an approximation method is needed. It should be noted that in our model [Eq. (7)], each firm depends on \(\langle x \rangle \) and \(\langle y \rangle \) instead of \(u_t\) itself, and our primary concern is the behavior of \(\langle x \rangle \) and \(\langle y \rangle \).Footnote 9 Instead of investigating \(u_t\) directly, we consider the dynamics of lower moments of \(u_t\) (see, e.g., Dawson 1983; Zaks et al. 2005; Kawai et al. 2004).

Setting \(\varphi = (x - \langle x \rangle )^n(y - \langle y \rangle )^m\) and using Ito’s formula, we have

$$\begin{aligned} d \varphi= & {} n(x - \langle x \rangle )^{n-1}(y - \langle y \rangle )^{m}dx^{i}_{t} + m(x - \langle x \rangle )^{n}(y - \langle y \rangle )^{m-1}dy^{i}_{t} \nonumber \\&+ \frac{1}{2}n(n-1)(x - \langle x \rangle )^{n-2}(y - \langle y \rangle )^{m}\sigma ^{2}_{1}dt \nonumber \\&+ \frac{1}{2}m(m-1)(x - \langle x \rangle )^{n}(y - \langle y \rangle )^{m-2}\sigma ^{2}_{2}dt \end{aligned}$$
(13)

Using the Taylor expansion around \(\langle x \rangle \) in (7), substituting \(d x_t^i\) and \(d y_t^i\) into (13), and then taking the expectation of both sides of (13), we obtain the following dynamical systems of moments:

$$\begin{aligned} \dot{ \langle x \rangle }= & {} \langle x \rangle - \langle x \rangle ^3 - 3 \mu _{2,0} \langle x \rangle - \mu _{3,0} - e \langle y \rangle - D (1-h) \langle x \rangle \nonumber \\ \dot{ \langle y \rangle }= & {} (1-h) \langle x \rangle \nonumber \\ \dot{\mu }_{2,0}= & {} -2 D \mu _{2,0} -2 e \mu _{1,1} + 2 (1 -3 \langle x \rangle ^2) \mu _{2,0} - 6 \langle x \rangle \mu _{3,0} - 2 \mu _{4,0} +\sigma _1^2 \\ \dot{\mu }_{1,1}= & {} -D \mu _{1,1} - e \mu _{0,2} + (1-h) \mu _{2,0} + (1 - 3\langle x \rangle ^2)\mu _{1,1} -3 \langle x \rangle \mu _{2,1} - \mu _{3,1} \nonumber \\ \dot{\mu }_{0,2}= & {} 2 (1-h) \mu _{1,1} + \sigma _2^2\nonumber \end{aligned}$$
(14)

where \(\mu _{n,m}=\langle (x - \langle x \rangle )^n (y - \langle y \rangle )^m \rangle \) and \(\langle \cdot \rangle \) denotes the expectation with respect to \(u_t\), that is, \(\langle \varphi \rangle = \int \varphi (z) u_t(dz)\). An overdot denotes the time derivative.Footnote 10

Now, we focus on state \(\langle x \rangle = \langle y \rangle = 0\) (called the disordered state). Note that the disordered state is the stationary solution without interaction and corresponds to the situation where idiosyncratic shocks cancel each other out (i.e., the LLN argument holds), and therefore, no aggregate fluctuations appear. Moreover, \(\langle x \rangle = \langle y \rangle = 0\) is always a solution with an arbitrary value of D because of the symmetry in our model. At this stationary solution, other moments are determined by Eq. (14):

$$\begin{aligned} 0= & {} - 2 D \mu _{2,0}^* - 2 e \mu _{1,1}^* + 2 \mu _{2,0}^* - 2 \mu _{4,0}^* +\sigma _1^2 \end{aligned}$$
(15)
$$\begin{aligned} 0= & {} - D \mu _{1,1}^* - e \mu _{0,2}^* + (1-h) \mu _{2,0}^* + \mu _{1,1}^* - \mu _{3,1}^* \end{aligned}$$
(16)
$$\begin{aligned} 0= & {} 2 (1-h) \mu _{1,1}^* + \sigma _2^2 \end{aligned}$$
(17)

We then apply the Gaussian approximation to investigate the linear stability of the disordered state. The Gaussian approximation means that we approximate the system by the Gaussian distribution with time-varying parameters (see Zaks et al. 2005; Kawai et al. 2004). Because all the moments of Gaussian distributions are determined by the lower moments (\(\langle x \rangle \), \(\langle y \rangle \), \(\mu _{2,0}\), \(\mu _{1,1}\), \(\mu _{0,2}\)), Eq. (14) becomes a closed-form expression. Specifically, \(\mu _{3,0}=0, \ \ \mu _{3,1} = 3 \mu _{2,0} \mu _{1,1}\), and \(\mu _{4,0} = 3 \mu _{2,0}^2\) are used in our model.
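To make the closure concrete, the right-hand side of Eq. (14) under the Gaussian approximation can be written as a function of the five lower moments. This is a sketch under stated assumptions: the function name `moment_rhs` and the argument names are hypothetical, and \(\mu _{2,1}=0\) is not listed in the text but is used here because all third-order central moments of a Gaussian distribution vanish.

```python
import numpy as np


def moment_rhs(state, D, h, e, sigma1_sq, sigma2_sq):
    """Right-hand side of Eq. (14) closed with Gaussian moment relations."""
    mx, my, m20, m11, m02 = state
    # Gaussian closure: odd central moments vanish, fourth-order ones factorize.
    m30 = 0.0
    m21 = 0.0  # assumption: third-order central moment, zero for a Gaussian
    m31 = 3.0 * m20 * m11
    m40 = 3.0 * m20 ** 2

    d_mx = mx - mx ** 3 - 3.0 * m20 * mx - m30 - e * my - D * (1.0 - h) * mx
    d_my = (1.0 - h) * mx
    d_m20 = (-2.0 * D * m20 - 2.0 * e * m11
             + 2.0 * (1.0 - 3.0 * mx ** 2) * m20
             - 6.0 * mx * m30 - 2.0 * m40 + sigma1_sq)
    d_m11 = (-D * m11 - e * m02 + (1.0 - h) * m20
             + (1.0 - 3.0 * mx ** 2) * m11
             - 3.0 * mx * m21 - m31)
    d_m02 = 2.0 * (1.0 - h) * m11 + sigma2_sq
    return np.array([d_mx, d_my, d_m20, d_m11, d_m02])
```

The returned vector can be passed to any standard ODE integrator. At \(\langle x \rangle = \langle y \rangle = 0\) the first two components vanish for any values of the second moments, which is the disordered state discussed in the text.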

Next, we conduct a standard linear stability analysis. The Jacobian of the five-dimensional system of (\(\langle x \rangle \ \langle y \rangle \ \mu _{2,0} \ \mu _{1,1} \ \mu _{0,2}\)) can be written in a block diagonal form:

$$\begin{aligned} \left( \begin{array}{c@{\quad }c} A &{} \mathbf{0} \\ \mathbf{0} &{} B \\ \end{array} \right) \end{aligned}$$
(18)

A is a \(2 \times 2\) matrix and B is a \(3 \times 3\) matrix. Therefore, the behavior of \(\langle x \rangle \) and \(\langle y \rangle \) around the disordered state can be determined solely by

$$\begin{aligned} A = \left( \begin{array}{c@{\quad }c} 1-3 \mu _{2,0}^* - D(1-h) &{} -e \\ (1-h) &{} 0 \\ \end{array} \right) \end{aligned}$$
(19)

The eigenvalues are given by

$$\begin{aligned} \lambda _{\pm } = \frac{1}{2}\Bigl (1-3 \mu _{2,0}^* - D(1-h) \pm \sqrt{ (1-3 \mu _{2,0}^* - D(1-h))^2 - 4 (1-h) e}\Bigr ) \end{aligned}$$
(20)
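As a quick numerical check of Eq. (20), the closed-form eigenvalues can be compared with those returned by a generic eigenvalue routine applied to the matrix A in Eq. (19). The function names below are hypothetical, and the parameter values in the usage note are purely illustrative.

```python
import cmath

import numpy as np


def jacobian_A(mu20_star, D, h, e):
    """The 2x2 block A of Eq. (19)."""
    a = 1.0 - 3.0 * mu20_star - D * (1.0 - h)
    return np.array([[a, -e], [1.0 - h, 0.0]])


def eigenvalues_closed_form(mu20_star, D, h, e):
    """Eigenvalues lambda_+ and lambda_- from Eq. (20)."""
    a = 1.0 - 3.0 * mu20_star - D * (1.0 - h)
    # Use a complex square root so a negative discriminant is handled.
    root = cmath.sqrt(complex(a * a - 4.0 * (1.0 - h) * e))
    return 0.5 * (a + root), 0.5 * (a - root)
```

For example, with \(h = 0.9\) and \(e = 0.1\), choosing \(\mu _{2,0}^* = \frac{1}{3}(1-D(1-h))\) makes the trace of A zero, so both routines return a purely imaginary pair with modulus \(\sqrt{(1-h)e} = 0.1\).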

We examine when the stability of the disordered state is lost—that is, the real parts of the eigenvalues become 0. From (20), \(\mu _{2,0}^* = \frac{1}{3}(1-D(1-h))\). From (17), \(\mu _{1,1}^* = - \frac{\sigma _2^2}{2(1-h)}\). Substituting these values into Eq. (15), we obtain the following condition:

$$\begin{aligned} f_h(D) \equiv D h (1 - D + Dh) = \frac{3}{2} \bigl ( \sigma _1^2 + \frac{e \sigma _2^2}{1-h} \bigr ) \equiv \frac{3}{2} \sigma ^2 \end{aligned}$$
(21)

\(\sigma ^2(\equiv \sigma _1^2 + \frac{e \sigma _2^2}{1-h})\) represents the intensity of idiosyncratic shocks. The left-hand side, \(f_h(D)\), can be interpreted as the degree of interaction that generates “order” (or collective behaviors) in the system. In particular,

$$\begin{aligned} \lim _{h \rightarrow 1} f_h(D) = D \end{aligned}$$
(22)
Fig. 6 Equation (21)

When the two parameters, \(\sigma \) and D, satisfy this relation, a bifurcation occurs. The interpretation is as follows. When the interaction effect, D, is below the critical point, \(D^*\), idiosyncratic shocks dominate the system. These shocks disturb and impede the generation of collective behaviors, and the system is close to the one with no interaction. Therefore, the simple LLN argument holds and the stationary distribution, with \(\langle x \rangle = \langle y \rangle = 0\), is stable. The microeconomic structure (e.g., lumpiness) is irrelevant for an explanation of aggregate fluctuations.

However, when D exceeds critical point \(D^*\), the situation changes completely. The linear stability analysis above shows that the stationary distribution with \(\langle x \rangle = \langle y \rangle = 0\) is no longer stable. That is, idiosyncratic shocks do not prevent the interaction from generating collective behaviors in the system (Fig. 6). In fact, as we will see later, regular cyclical behavior is observed at the aggregate level. In the next subsection, we conduct numerical simulations.
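Because \(f_h(D) = D h (1 - D + Dh)\) is quadratic in D, condition (21) can be rewritten as \(h(1-h) D^2 - h D + \frac{3}{2} \sigma ^2 = 0\) and solved directly. The sketch below does this; the function names are hypothetical, and taking the smaller root as the critical point \(D^*\) is an assumption based on the interpretation that stability is lost when D first reaches the threshold from below.

```python
import math


def shock_intensity(sigma1_sq, sigma2_sq, e, h):
    """sigma^2 as defined below Eq. (21)."""
    return sigma1_sq + e * sigma2_sq / (1.0 - h)


def f_h(D, h):
    """Left-hand side of Eq. (21)."""
    return D * h * (1.0 - D + D * h)


def critical_D(sigma_sq, h):
    """Smaller root of f_h(D) = 1.5 * sigma_sq, or None if no real root.

    Assumes 0 < h < 1 so that the quadratic coefficient is positive.
    """
    a = h * (1.0 - h)
    b = -h
    c = 1.5 * sigma_sq
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # shocks too strong: the threshold is never reached
    return (-b - math.sqrt(disc)) / (2.0 * a)
```

With \(\sigma _1^2 = \sigma _2^2 = 1/4\), \(e = 0.1\), and \(h = 0.9\), the shock intensity is \(\sigma ^2 = 0.5\) and the smaller root is approximately 0.92.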

4.4 Simulation

Figures 7, 8, 9, 10, 11, 12 and 13 show simulation results for \(\langle x \rangle \) and \(\langle y \rangle \) of Eq. (7) with different values of D (other parameters are fixed and \(N=20{,}000\)). In Fig. 7 with a small value of D, there is no observable aggregate behavior: only a small variation around \(\langle x \rangle = \langle y \rangle = 0\) exists. This variation is considered to be a finite-size effect due to finite N. The result is consistent with our analysis in the previous section. The microeconomic shocks cancel each other out; therefore, production bunching (or nonconvexity) at the firm level plays no role in aggregate fluctuations.
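A simulation of this kind can be sketched with an Euler-Maruyama scheme for the interacting system. The code below is illustrative only and is not the authors' implementation. The drift of \(x^i_t\) is an assumption reconstructed from the mean equation in (14) and the untruncated interaction in (10), and the drift of \(y^i_t\) follows the form \(x^i_t - h \langle x \rangle \). The step size, the initial distribution, and the function name `simulate_mean_field` are also assumptions.

```python
import numpy as np


def simulate_mean_field(N=2000, T=50.0, dt=0.01, D=0.32, h=0.9, e=0.1,
                        sigma1_sq=0.25, sigma2_sq=0.25, seed=0):
    """Euler-Maruyama sketch; returns the paths of the cross-sectional means."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    # Assumed initial condition: small independent Gaussian draws.
    x = rng.normal(0.0, 0.1, N)
    y = rng.normal(0.0, 0.1, N)
    mean_x = np.empty(n_steps + 1)
    mean_y = np.empty(n_steps + 1)
    mean_x[0], mean_y[0] = x.mean(), y.mean()
    s1 = np.sqrt(sigma1_sq * dt)
    s2 = np.sqrt(sigma2_sq * dt)
    for k in range(n_steps):
        mx = x.mean()  # empirical mean that feeds back into every firm
        # Assumed drift of x: cubic term, inventory feedback, interaction.
        dx = (x - x ** 3 - e * y + D * (h * mx - x)) * dt
        dx += s1 * rng.standard_normal(N)
        # Drift of y: own production minus the common part of sales.
        dy = (x - h * mx) * dt + s2 * rng.standard_normal(N)
        x = x + dx
        y = y + dy
        mean_x[k + 1], mean_y[k + 1] = x.mean(), y.mean()
    return mean_x, mean_y
```

Calling `simulate_mean_field(D=0.1)` and `simulate_mean_field(D=0.32)` and plotting the two returned arrays against time gives a way to compare the weak-interaction and strong-interaction regimes; the defaults use a smaller N than in the text to keep the run short.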

Fig. 7 Simulation of Eq. (7) with \(\sigma _1^2 = \sigma _2^2 = 1/4 \), \(e=0.1\), \(h=0.9\), and \(D=0.1\). The solid (dashed) line is \(\langle x \rangle \) (\(\langle y \rangle \))

Fig. 8 Equation (7) with \(D=0.32\). The other parameters are the same as in Fig. 7

Fig. 9 Histogram of \(x^i_t\) when \(\langle x \rangle = 0.00\). The parameters are the same as in Fig. 8

Fig. 10 Histogram of \(y^i_t\) when \(\langle x \rangle = 0.00\). It corresponds to an economy going through a phase of contraction due to excess inventories, \(\langle y \rangle = 0.56\)

Fig. 11 Histogram of \(y^i_t\) when \(\langle x \rangle = 0.00\). It corresponds to an economy going through a phase of expansion, \(\langle y \rangle = - 0.52\)

Fig. 12 Histogram of \(x^i_t\) when \(\langle x \rangle = 0.45\)

Fig. 13 Histogram of \(x^i_t\) when \(\langle x \rangle = -0.46\)

Figure 8 shows that when the interaction effect is large enough to compensate for the disturbance caused by idiosyncratic shocks, a different aggregate behavior appears and an endogenous cyclical movement is observed. This is consistent with the fact that the eigenvalues in (20) have a nonzero imaginary part near the bifurcation point. Interestingly, the movement at the macroscopic level is more regular than at the firm level.

Figure 10 shows the histogram of y when \(\langle x \rangle = 0.00\) and \(\langle y \rangle > 0\), that is, when the economy has excess inventories. This corresponds to a situation in which the economy goes through a phase of contraction to reduce the excess inventories. However, it should be noted that there is heterogeneity among firms and, as Fig. 10 shows, some firms’ inventories are running short. The same argument can be applied to Fig. 12, where business is good, \(\langle x \rangle > 0\). Even under this favorable business condition, there exist firms that choose low production depending on their states. The motions of \(\langle x \rangle \) and \(\langle y \rangle \) describe the average behavior of firms in the economy.

However, the cyclical behavior of \(\langle x \rangle \) and \(\langle y \rangle \) is observed at values of D significantly below the critical value, \(D^* = 0.92\), predicted by the stability analysis in the previous section. This is related to the fact that the resulting distribution is different from a Gaussian distribution. In particular, the marginal distribution of \(x_t^i\) shows clear bimodality. In Fig. 14, we estimate the spectral density of the cycle of \(\langle x \rangle \). This density peaks at 0.014; that is, the period of the cycle is 71. On the other hand, from (20), the eigenvalues near the bifurcation point are approximately \(\pm i \sqrt{(1-h) e}\), so the frequency is approximately given by \(\sqrt{(1-h) e} / (2 \pi ) = 0.016\). The period predicted by the stability analysis is \(1/0.016 = 63\), which is relatively close to the estimated value. Therefore, although the critical value of D is overestimated, we conclude that the qualitative feature of our model is captured by the stability analysis.Footnote 11
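One simple way to locate the dominant frequency of a simulated series of \(\langle x \rangle \) is a raw periodogram. The estimator used for Fig. 14 is not specified in the text, so the sketch below is an assumption rather than a reproduction; the function name `dominant_frequency` is hypothetical, and `dt` is the sampling interval of the series in model time units.

```python
import numpy as np


def dominant_frequency(series, dt=1.0):
    """Frequency (cycles per unit time) at which the periodogram peaks."""
    series = np.asarray(series, dtype=float)
    centered = series - series.mean()  # remove the zero-frequency component
    power = np.abs(np.fft.rfft(centered)) ** 2
    freqs = np.fft.rfftfreq(centered.size, d=dt)
    k = 1 + np.argmax(power[1:])  # skip the zero-frequency bin
    return freqs[k]
```

The implied period is the reciprocal of the returned frequency. A raw periodogram is noisy for stochastic series, so smoothing or averaging over segments may be preferable in practice.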

Fig. 14 Spectral density of \(\langle x \rangle \)

This cyclical behavior of \(\langle x \rangle \) and \(\langle y \rangle \) is closely related to the well-known Kitchin cycle, which is usually explained as follows (see, e.g., Korotayev and Tsirel 2010). Suppose that firms observe an improvement in their commercial situation. They manage the increase in demand by increasing production. Demand is met by supply, but supply gradually becomes excessive because it takes some time for businesspeople to realize that supply exceeds demand. This time lag generates an unexpected increase in inventories, which leads to a reduction of production so as to decrease excessive inventories. After inventories are sufficiently reduced, a new cycle of demand increase is initiated. According to this explanation, the origin of these cycles is information time lags.

At first glance, as shown in Fig. 8, the behavior of \(\langle x \rangle \) and \(\langle y \rangle \) appears to be consistent with the above scenario. An increase in \(\langle x \rangle \) is an increase in demand that leads to an increase in \(\langle y \rangle \). The cycle of \(\langle y \rangle \) lags behind that of \(\langle x \rangle \). However, it should be noted that information time lags at the firm level are not assumed in our model. Because of nonconvex technology, businesspeople optimally choose low or high production and increase or decrease inventories. Furthermore, in contrast to Carvalho (2010), Acemoglu et al. (2012), and Gabaix (2011), each firm has a negligible impact on \(\langle x \rangle \) and \(\langle y \rangle \) when N is large. \(\langle x \rangle \) and \(\langle y \rangle \) are the average \(x^i_t\) and \(y^i_t\) values, respectively, of all firms in the economy; therefore, there is no representative firm corresponding to the motion of \(\langle x \rangle \) and \(\langle y \rangle \). Indeed, as shown in Figs. 3, 4, and 8, the behavior of \(\langle x \rangle \) and \(\langle y \rangle \) is different from that of an individual firm, \(x^i_t\) and \(y^i_t\). The behavior of \(\langle x \rangle \) and \(\langle y \rangle \) is a type of collective behavior that can only be observed at the macroscopic level.

Last, it is worth noting that the relation between the cyclical behaviors of \(\langle x \rangle \) and \(\langle y \rangle \) can be written explicitly. Summing both sides of Eq. (2) over i and dividing them by N, we obtain

$$\begin{aligned} \frac{1}{N} \sum _{i=1}^N dy_t^i= & {} \frac{1}{N} \sum _{i=1}^N (x_t^i - s^i_t)dt = \frac{1}{N} \sum _{i=1}^N (x_t^i - h \langle x \rangle - \xi ^i_t)dt\nonumber \\= & {} \left( (1 - h) \langle x \rangle - \frac{1}{N} \sum _{i=1}^N \xi ^i_t \right) dt \end{aligned}$$
(23)

Taking the limit, \(N \rightarrow \infty \), we obtain the simple relation \(\dot{ \langle y \rangle } = (1 - h) \langle x \rangle \) by LLN. This means that aggregate inventory investment (change in inventories) comoves with aggregate production without a time lag. This prediction is consistent with empirical data (see, e.g., Table 2 in Stock and Watson 1999).
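The LLN step rests on the cross-sectional average of independent shocks shrinking as N grows. As a rough numerical illustration, the sketch below estimates the standard deviation of the average of N independent Gaussian draws. The Gaussian form of the shocks, the value of `sigma`, and the function name `averaged_shock_std` are assumptions made only for this example.

```python
import numpy as np


def averaged_shock_std(N, n_draws=2000, sigma=0.5, seed=0):
    """Sample standard deviation of (1/N) * sum_i xi_i over repeated draws."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(0.0, sigma, size=(n_draws, N))
    return shocks.mean(axis=1).std()
```

Quadrupling N should roughly halve the returned value, in line with the \(1/\sqrt{N}\) scaling, so the averaged shock term becomes negligible relative to \((1-h) \langle x \rangle \) for large N.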

5 Concluding remarks

This paper investigates the relationship between microeconomic structures and business cycles. The standard production-smoothing theory has been empirically rejected in the literature; therefore, we focused on the nonconvex cost function. This hypothesis, which has empirical support, can explain the excess volatility of production. The issue is whether this microeconomic structure has a nontrivial effect at the aggregate level. In particular, our model explicitly takes into account the feedback loop—that is, the macroscopic state of the economy not only represents firms’ aggregation but also prescribes the macroeconomic environment experienced by firms. If this effect is taken into account, this problem becomes complicated. We need to deal with the evolution of the distribution of production and inventories, that is, an infinite-dimensional random variable.

To investigate this problem, the propagation of chaos approach is useful. It shows that whereas each element behaves stochastically, the empirical distribution converges to the deterministic distribution as \(N \rightarrow \infty \). This does not imply that the distribution is stationary. In fact, the feedback loop together with nonconvex technology generates rich and interesting phenomena and has been shown to be the origin of business cycles.

The standard linear stability analysis shows that the disordered state corresponding to the LLN argument loses its stability once the interaction effect exceeds a critical point. This means that the interaction effect generates “order” (or collective behaviors) in the system. With the help of numerical simulations, we have demonstrated that the resulting aggregate behavior shows regular cyclical movement without any aggregate exogenous shocks. This endogenous business cycle is an explanation for the Kitchin cycle. It should be noted that there is no representative firm corresponding to \(\langle x \rangle \) and \(\langle y \rangle \) and that the behavior of \(\langle x \rangle \) and \(\langle y \rangle \) is different from that of an individual firm, \(x^i_t\) and \(y^i_t\). This is one example of the collective behaviors that can be observed only at the aggregate level and are crucial to macroeconomic analysis.

Finally, there exist other microeconomic behaviors that are characterized by lumpiness (e.g., Cooper and Haltiwanger 2006). Investigating how microeconomic characteristics affect aggregate fluctuations via interactions is a promising subject for future research.