1 Introduction

Policy makers aim for well functioning financial markets, because of the key roles financial markets have for a society. They are essential financing tools catalysing the development of enterprises and fostering innovation. Price setting by the equilibrium between supply and demand of multiple investors with varied sources of information usually ensures the “correct” valuation, allowing an efficient and rational allocation of resources to the different sectors of the economy. Because investors allocate funds to firms with promising future standing and/or growth prospects, financial markets are inherently forward looking. In other terms, they are predictors of future economic growth. Financial markets also provide storage of value. The wealth effect of investors feeling richer upon stock market appreciation is well documented to boost consumption in a virtuous circle of economic expansion.

This ideal description is at odds with a more complex reality in which phases of excess volatility (Shiller, 1981) are the norm rather than the exception, and exuberant market regimes (Shiller, 2006) are sometimes followed by crashes (Johansen & Sornette, 2002; Sornette, 2003). Such turbulence has increasingly characterised financial markets in the last three decades, with the dot-com bubble that crashed in 2000, the bubble in real estate and the financial securitisation of mortgages that crashed in 2008, the market boom fuelled by quantitative easing that ended in 2018-Q1, and the short-lived market exuberance of 2019 that ended with the market crash triggered by Covid-19 in March 2020 (see e.g. Sornette & Cauwels, 2014, 2015b).

Arguably, policy makers should have a strong interest in stabilizing financial markets by decreasing excess volatility and damping market turbulences such as bubbles and crashes, but without sacrificing economic growth. Biswas et al. (2020) even show that the burst of a bubble can result in persistent aggregate economic activity below the pre-bubble trend, thus reducing the ex-ante welfare. However, in practice, the stance of most policy makers seems to be summed up by the remarks by Chairman Alan Greenspan at a symposium sponsored by the Federal Reserve Bank of Kansas City, Jackson Hole, Wyoming (August 30, 2002): “As events evolved, we recognized that, despite our suspicions, it was very difficult to definitively identify a bubble until after the fact—that is, when its bursting confirmed its existence.” Greenspan further confirmed that “instead of trying to contain a putative bubble by drastic actions with largely unpredictable consequences, we chose, as we noted in our mid-1999 congressional testimony, to focus on policies to mitigate the fallout when it occurs and, hopefully, ease the transition to the next expansion” Greenspan (2004). Acting to prick a bubble when the bubble is on its way up is generally considered impossible or ill-advised for fear of false positives and the danger that the remedy might be worse than the disease. Apart from some exceptions, the attitude of central banks has thus been in general to act vigorously only after a crash occurred, to provide liquidity as well as cheaper access to credit in the form of lower interest rates.

On the other side of the debate, Cecchetti et al. (2000, 2002) develop a number of arguments for how asset price misalignments should be used to guide central bank policy. In particular, they show that interest rates should respond to stock price bubbles in order to dampen the overall volatility in economic activity. The theoretical work by Bernanke and Gertler (2000, 2001) finds that direct asset price targeting might have undesirable side-effects, but that changes in asset prices can help to forecast inflationary or deflationary pressure and that flexible inflation targeting provides macroeconomic and financial stability. In his review of six general arguments favoring monetary targeting of asset bubbles, Roubini (2006) suggested that the standard arguments against policy intervention to thwart bubbles do not hold up under close scrutiny. In particular, in the case of endogenous bubbles (defined by their susceptibility to monetary policies), optimal monetary policy calls for attempting to control the bubble. Wadhwani (2008) reviews the justifications for “leaning against the wind”, using a number of indicators such as loan-to-value ratios, growth of the value of bank assets and so on. Mishkin (2011) argues that policy intervention should monitor credit market conditions and use macroprudential measures to restrain over-exuberance in credit markets as well as financial imbalances. Ikeda (2019) develops a dynamic model with rational bubbles in which a bubble-led boom reduces firms’ borrowing constraints and keeps inflation from rising. Ramsey-optimal monetary policy is shown to call for tightening to curb the boom.

There has been some research implying that specific policies can decrease the amplitude of bubbles. For example, Caballero and Krishnamurthy (2006) show the beneficial impact of liquidity requirements, sterilization of capital inflows and structural policies on alleviating the bubble risk. Miao et al. (2015) show that property taxes and credit policy can prevent the formation of a land bubble in an agent-based model with representative households. Fischbacher et al. (2013), Bao and Zong (2019) and Galí et al. (2021) study the relationship between interest rate policies and bubble dynamics in the laboratory. Fischbacher et al. (2013) find a small impact of their interest rate policy on bubbles, but observe a significant impact of disclosing the possibility of reserve requirements on the size of bubbles. Bao and Zong (2019) observe a significant decrease of the bubble size when the interest rate is reduced sharply in response to price deviations. Galí et al. (2021) show in their setting that “leaning against the wind” by increasing interest rates in response to asset price increases reduces the price of the asset bubble on the short term but tends to exacerbate asset price bubbles on the longer term. They also stress that expectations are backward-looking, with adaptive and trend extrapolating elements, rather than rational. Blot et al. (2020) assess the dynamic impact of monetary policy shocks on a new bubble indicator based on a principal component analysis to extract the common pattern of structural, econometric and statistical empirical approaches. Their main result based on their chosen examples is that restrictive monetary policy (“leaning against the wind”) cannot help deflating stock or housing price bubbles.

Here, we complement these burgeoning theoretical, empirical and experimental approaches by deploying a realistic agent-based model (ABM) that supports transient endogenous “super-exponential” bubbles followed by crashes, in order to test the consequences of direct policy intervention on the overall financial market and the risk-adjusted performance of the investors. ABMs belong to the broader class of computational economic models and simulate individual operations and interactions of multiple agents. ABMs can incorporate heterogeneous beliefs and have become more widely used in economics in recent years (Kirman, 2012). In principle, ABMs have the great advantage of allowing for the introduction of arbitrary levels of complexity and heterogeneity among a large population of economic agents, thus providing the means to approach a realistic description of real economic and financial systems. ABMs are well suited to represent out-of-equilibrium phenomena such as financial bubbles and are not constrained to stationary conditions. They can describe transient dynamics, including convergence to or deviations from equilibrium (Sornette, 2014). Consequently, they can be used to test the dynamics of an inherently nonlinear impact of a policy in the presence of complex feedback loops. Kyle (1985) and Black (1986) were among the first who developed ABMs of financial markets that consider the behavior of traders who are influenced by trends and price patterns instead of only the fundamental value of an asset. Later, De Long et al. (1990, 2007) developed an ABM in which financial bubbles emerge from positive feedback caused by noise traders. Extensive reviews of ABMs of financial markets can be found in Samanidou et al. (2007), and more recently in Dieci and He (2018) and Hommes and LeBaron (2018).

In the present paper, we build on the ABM consisting of fundamentalists and noise traders developed by Kaizoji et al. (2015) and extended by Westphal and Sornette (2020). It features realistic-looking bubbles and crashes, while reproducing the most important stylized facts of financial markets. The set-up can incorporate the feedbacks of policy decisions on the market and on the other traders. Thus, this market model is used to analyse the impact and consequences of policy intervention on the overall market and to explore the existence of a trade-off between economic growth and financial stability objectives.

In our previous research (Westphal & Sornette, 2020), we investigated the impact of a “dragon-rider”, i.e. a third class of investors in addition to fundamentalists and noise traders, who exploit their ability to diagnose financial bubbles from the endogenous price history to determine optimal entry and exit trading times. The name “dragon-rider” is based on the empirical observation that crashes that follow bubbles are exceptional events, outliers of strong significance (Johansen & Sornette, 2002, 2010), which are named “dragon-kings” to emphasise their special status and specific amplifying mechanisms (Sornette, 2009; Sornette & Ouillon, 2012) (for a pedagogical introduction, see https://en.wikipedia.org/wiki/Dragon_King_Theory). Using calibrations of the price dynamics with models of bubbles viewed as transient super-exponential episodes, the dragon-riders obtain a diagnostic of the presence or absence of a bubble and use this information to “ride” the ascending price bubble that they exit when they assess that the burst is close.

Here, we consider the different situation where yet another class of investors is introduced in addition to fundamentalists and noise traders, the “dragon-slayers”. As the name implies, the dragon-slayers are taken to represent policy makers whose goal is to prevent the development of bubbles, and thus avert the ensuing crashes. While dragon-riders strive to maximize their risk-adjusted return by exploiting financial bubbles, dragon-slayers aim at reducing or even suppressing bubbles and their subsequent damaging crashes. Their metric of success is not necessarily a risk-adjusted return but how much they calm the markets and prevent the big swings of bubbles and crashes. The analysis followed by our dragon-slayers to decide how to intervene can be decomposed into three components: an estimation of the long-term growth rate of the asset, a diagnostic of a growing bubble, and the anticipation of the crash. The latter two are quantified by a crash probability that is estimated via a logistic function whose argument is a measure of the realised excess return over the expected return of the asset. Equipped with their diagnostic, the dragon-slayers increase their allocation to the stock market when they detect a small incipient positive bubble so as to pile up assets that they are then able to sell countercyclically when they diagnose too much exuberance. Building an inventory of the risky asset is indeed necessary in order to sell sufficiently large amounts of it later to stop the bubble from growing. One could think that this would not be needed for a policy maker who is not supposed to be wealth limited, since a central bank can in principle print unlimited quantities of its own currency. The issue is that the intervention is not about buying a depressed asset to prevent its further crash but rather about selling it astutely to lean countercyclically against the exuberant price growth.
One could thus think that this could be performed by short selling, but this is not a viable general solution, because short selling is limited by the amount of equity available for borrowing, which would be insufficient to implement the interventions needed to make a difference at the whole market level, as we shall see below. The dragon-slayers follow the opposite strategy for negative bubbles (see e.g. Sornette & Cauwels, 2015a, for a definition). We investigate the intended and possible unintended consequences of these interventions and quantify the trade-off between financial stability and economic growth.

We find that the policy maker succeeds in preventing bubbles and crashes in our ABM. In simulations without bubbles, the policy maker behaves similarly to the fundamentalists and his impact is negligible, following the principle of “Primum non nocere”. In simulations where bubbles form spontaneously as a result of the noise traders’ strategies, the policy maker’s intervention reduces the average drawdown by a factor of two when his market impact becomes significant. We find that the policy maker’s intervention improves all analysed metrics of market returns, including volatility, skewness, kurtosis and VaR, making the market less turbulent and more stable. The combination of fewer bubbles and crashes, lower market risks and a stable long-term growth rate means that the policy maker’s intervention improves the performance of all investors as measured by their risk-adjusted return, increasing the Sharpe ratios from approximately 0.3 to 0.5 for noise traders and from 0.6 to 0.8 for fundamentalists as the market impact of the policy maker increases to the level of the fundamentalists. We also test the sensitivity of these results to variations of the key parameters of the strategy of the policy maker and find very robust outcomes. In particular, the conclusions are unchanged even when the long-term expected return of the risky asset is severely miscalibrated.

In the financial literature, the dragon-slayers’ interventions are called open market interventions and have become one of the favoured tools of central banks, complementing more standard monetary policies via interest rate guidance. Open market operations usually involve buying or selling government bonds in the open markets. Market intervention in the form of stock purchases is a more recent phenomenon, which seems to be growing in importance as the severity of economic and market stresses has been mounting. Stock purchases have been documented as a policy measure to end a crash and help the economy to recover, for example in Hong Kong in 1998 and in China in 2015 (Su et al., 2002; Huang et al., 2019). In Hong Kong, the government purchased stocks that are constituents of the Hang Seng Index to restore investors’ confidence. The total investment of HK$118 billion (or US$15 billion) in the 33 constituent stocks of the Hang Seng made the government one of the largest shareholders, owning between 2.49 and 12.28% of the outstanding shares of the individual companies. The purchases reversed the trend of declining stock prices, and the higher stock prices persisted even after the intervention period ended (Su et al., 2002). After the crash in the Chinese stock market in mid-June 2015, the government directly and indirectly purchased stocks of more than 1000 firms between July and September. These purchases increased stock demand, reduced default probabilities, and increased liquidity. This increased the value of a subsample of the firms by about RMB 206 billion (Huang et al., 2019). Between 2008 and 2011, the Swiss franc appreciated from 1.6 CHF/EUR to less than 1.1 CHF/EUR. As a response to this massive “over-valuation” and in order to maintain price stability, the Swiss National Bank (SNB) has been intervening in the foreign exchange market, building reserve assets that exceeded Switzerland’s GDP in 2016.
As a consequence of its interventions, the SNB became a large public investor globally: in its latest 13F SEC filing for Q2 2020, the SNB disclosed 2437 total holdings and a US stock portfolio valuation of about 118 billion USD. Following the burst of Japan’s asset price bubble in 1990, the Bank of Japan has been taking an ever expanding role in its fight against deflation. After having exhausted the orthodox policy arsenal, the Bank of Japan started buying equity ETFs in 2010. By April 2019, it had become a top ten shareholder of more than 50% of publicly traded companies. In reaction to the March 2020 market crash, Governor Haruhiko Kuroda announced on March 16 that the central bank would double the pace of its equity purchases to $113 billion per year.

Notwithstanding these examples, outright interventions by central banks on equity markets remain rare and one could be skeptical about the relevance of our work. In contrast, it has become common practice in the last decade for central banks to buy government bonds. Consider the case of Europe, which has been in an extraordinary period of banking and sovereign stress since 2009. The European Central Bank reacted with non-standard measures to deal with the sovereign debt crisis, such as the Long-Term Refinancing Operations in December 2011 and February 2012, and the Outright Monetary Transactions program (OMT) in summer 2012. These programs were initially controversial because they amount to an indirect funding of the sovereign debts of European countries, which is formally forbidden by the Maastricht and Lisbon treaties that explicitly prohibit monetary financing of public debt. Formally, the European Central Bank can only be a lender of last resort to financial institutions. The “whatever it takes” speech by Mario Draghi in the summer of 2012 saved the euro, calmed markets and bought time for reforms. It is a prime example of a change of “rule” where interventions that were considered “unthinkable” before the crisis become the norm thereafter. It is in this spirit that we offer our investigation, keeping in mind that, while rare, outright interventions by central banks on equity markets could become much more common if new extraordinary adverse market developments require it. With the on-going and future impacts of the Covid-19 crisis and of the energy transition to address climate change, such scenarios might not be so far-fetched. Our results suggest that there are opportunities for central banks to intervene on financial markets with potentially beneficial results.

The paper is structured as follows. Section 2 presents the ABM consisting of fundamentalists and noise traders, which is used to test the policy interventions, and describes how the strategy of the dragon-slayer (policy maker) is constructed. Section 3 describes the influence of the dragon-slayer on the price dynamics and analyses the consequences on price peaks and drawdowns. In Sect. 4, the impact of the dragon-slayer’s intervention on the risk-adjusted return of the traders is analysed together with the quantification of the fraction of wealth accumulated by the dragon-slayer. The observed improvement on market risk properties resulting from the dragon-slayer’s intervention is appraised with respect to parameter miscalibration risks. This is done by varying the parameters controlling the strategy of the dragon-slayer. Section 4 concludes.

2 The Market Model

The market in the ABM consists of two types of assets: a risky asset, which is a dividend paying stock, and a risk-free asset, which pays a constant return in each time-step and represents a risk-free government bond or a bank account. These assets are traded by three types of investors, fundamentalists (f), noise traders (n), and a dragon-slayer (d). Fundamentalists are rational risk-averse investors who invest by maximizing their expected utility under a constant relative risk aversion (CRRA) utility function in each time-step. The noise traders invest based on the price momentum and social imitation. The dragon-slayer trades with the objective of preventing bubbles and crashes in the price of the risky asset. The set-up of the model follows the ABM developed by Kaizoji et al. (2015) and modified by Westphal and Sornette (2020) to introduce dragon-riders as mentioned in the introduction. The present version of the model replaces the dragon-riders of Westphal and Sornette (2020) by dragon-slayers. In the absence of financial bubbles, the dragon-riders of Westphal and Sornette (2020) behave like fundamentalists. Therefore we simplify the set-up by focusing on the interaction of dragon-slayers with fundamentalists and noise traders.

2.1 Market Set-Up

The traders decide in each time-step how to allocate their wealth between the two assets. The risk-free asset has perfectly elastic supply and pays a constant return \(r_{f}\). In contrast, the price \(P_t\) of the risky asset is defined endogenously by demand and supply. The asset pays a dividend \(d_t\) in each time-step. The dividend process is a discrete stochastic growth process following Westphal and Sornette (2020). It is defined as

$$\begin{aligned} d_t=d_{t-1}\left( 1+r_t^d\right) ~, \end{aligned}$$
(1)

where the growth rate \(r_t^d\) is a Gaussian process with mean value \(r_d>0\) and variance \(\sigma _d^2\),

$$\begin{aligned} r_t^d=r_d+\sigma _du_t, \end{aligned}$$
(2)

with \(u_t {\mathop {\sim }\limits ^{iid}}\mathcal {N}(0,1)\). This specification of the dividend process ensures that it has a completely negligible probability of going negative and we have never encountered such realisations in our simulations.
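As an illustration, the dividend process of Eqs. (1) and (2) can be simulated in a few lines of Python. This is only a minimal sketch under stated assumptions, not the simulation code used for the paper: the function name, the seed and the numerical parameter values are hypothetical and chosen for illustration.

```python
import random


def simulate_dividends(d0, r_d, sigma_d, n_steps, seed=0):
    """Simulate d_t = d_{t-1} * (1 + r_t^d) with r_t^d = r_d + sigma_d * u_t."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    dividends = [d0]
    for _ in range(n_steps):
        growth = r_d + sigma_d * rng.gauss(0.0, 1.0)      # Eq. (2)
        dividends.append(dividends[-1] * (1.0 + growth))  # Eq. (1)
    return dividends


# Hypothetical parameter values, for illustration only.
path = simulate_dividends(d0=1.0, r_d=0.0002, sigma_d=0.0002, n_steps=1000)
```

The dividend can only change sign if \(1+r_t^d<0\), i.e. if \(u_t<-(1+r_d)/\sigma _d\). For the hypothetical values above this corresponds to a Gaussian draw of roughly 5000 standard deviations below the mean, which illustrates why negative dividends are not observed in practice.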

The difference between the return of the risky asset and that of the risk-free asset is called the excess return. It is defined as the sum of the capital return \(r_t=\frac{P_t}{P_{t-1}}-1\), where \(P_t\) is the price of the risky asset at time t, and the dividend return \(d_t/P_{t-1}\) of the risky asset, minus the risk-free rate \(r_f\):

$$\begin{aligned} r_{excess, t}= r_{t} + \frac{d_{t-1} \cdot (1+r_t^d)}{P_{t-1}}-r_f. \end{aligned}$$
(3)
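A direct transcription of Eq. (3) may help to fix the timing conventions. The sketch below is illustrative only; the function and argument names are hypothetical.

```python
def excess_return(p_prev, p_now, d_prev, r_div_growth, r_f):
    """Excess return of Eq. (3) over the period from t-1 to t."""
    capital_return = p_now / p_prev - 1.0                    # r_t
    dividend_return = d_prev * (1.0 + r_div_growth) / p_prev  # d_t / P_{t-1}
    return capital_return + dividend_return - r_f


# Price rises from 100 to 101, dividend of 1 with zero growth, r_f = 0.1 %.
example = excess_return(100.0, 101.0, 1.0, 0.0, 0.001)
```

In this example the capital return and the dividend return each contribute 1 %, so that the excess return is about 1.9 % after subtracting the risk-free rate.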

2.2 Fundamentalist Strategy

At each time-step, fundamentalists invest a fraction \(x_t^f\) of their wealth in the risky asset and the remaining fraction into the risk-free asset, such that they maximize their expected utility with a constant relative risk aversion (CRRA) utility function over one period. In other words, they are myopic investors who update at each time step their investment decision based on the new information on the risky asset price and its dividend that is obtained at the end of each period. The one-period optimisation is chosen because it quite realistically captures the bounded rationality of real human investors and it allows us to keep the mathematical formulation simple. It also represents reasonably well the large universe of actively managed funds that, contrary to the belief that their highly sophisticated technical skills would give them an edge, overwhelmingly underperform. Indeed, several studies have shown that only about 1% of professional funds outperform the market (corresponding to the buy-and-hold strategy) as a result of genuine skill (Barras et al., 2010; Harvey & Liu, 2020). The rest of the outperformers (roughly 20% of the funds) are just lucky, while the remaining \(\approx 80\%\) of funds underperform. In this approach, we follow the tradition of previous investigations, such as Chiarella et al. (2006), Hommes and Wagener (2009), Kaizoji et al. (2015), and Westphal and Sornette (2020). Moreover, in the absence of transaction costs (and other limitations on trading), a greedy strategy that only considers one period at a time is optimal, since performance for the current period does not depend on previous holdings (Boyd et al., 2016).
Since most of the effort in developing a good trading algorithm goes into forming good forecasts of the expected return (Campbell et al., 1997; Grinold, 1999), and given the large noise and difficulties inherent in the one-period return prediction, a multi-period approach to forecast returns would be difficult to justify at the concrete operational level.

Each fundamentalist is equipped with the same information and utility function. Therefore, each fundamentalist decides on the same optimal allocation of their wealth. Thus, the fundamentalists’ investment can be considered at the aggregate level as the optimization problem of one representative agent who invests a fraction \(x_t^f\) of the cumulative wealth \(W_t^f\) in the risky asset.

The CRRA utility function with risk aversion parameter \(\gamma \) is defined as (Ljungqvist & Sargent, 2012):

$$\begin{aligned} U(W)={\left\{ \begin{array}{ll} \log (W) &{} \text { for } \gamma =1\\ \frac{W^{1-\gamma }}{1-\gamma } &{} \text { for } \gamma \ne 1 \end{array}\right. } \end{aligned}$$
(4)

In each time-step, the fundamentalists solve the maximization problem

$$\begin{aligned} \max \limits _{x_t^f} \textbf{E}_{t-1}\left[ U(W_t^f(x_t^f))\right] . \end{aligned}$$
(5)

The wealth \(W_t^f\) evolves as a function of \(x_t^f\). This evolution consists of the wealth \(W_{t-1}^f \cdot x_{t-1}^f\) invested in the risky asset, which enjoys the return \(r_t\) on the risky asset and the dividend payment \(d_t\) per share, plus the wealth \(W_{t-1}^f\cdot \left( 1-x_{t-1}^f\right) \) invested in the risk-free asset paying an interest at the risk-free interest rate \(r_f\):

$$\begin{aligned} W_t^f=\; & {} W_{t-1}^f \left( 1+ x_{t-1}^f \cdot \left( r_{t}+\frac{d_{t}}{P_{t-1}} \right) + \left( 1-x_{t-1}^f \right) \cdot r_f \right) \nonumber \\= \;& {} W_{t-1}^f \cdot \left( 1+x_{t-1}^f \cdot r_{excess, t}+r_f \right) , \end{aligned}$$
(6)

where \(r_{excess, t}\) is defined by expression (3). In first order approximation and assuming \(d_t \ll P_t\) (Kaizoji et al., 2015), the resulting fraction of wealth invested in the risky asset with CRRA coefficient \(\gamma \) is given by

$$\begin{aligned} x_{t-1}^f=\frac{1}{\gamma } \frac{E_{t-1}[r_{excess,t}]}{Var_{t-1}[r_{excess,t}]} = \frac{E_{r_{t}}^f+\frac{d_{t-1}}{P_{t-1}}(1+r_d) -r_f}{\gamma (\sigma ^2+\frac{d_{t-1}^2\cdot \sigma _d^2}{P_{t-1}^2})} \approx \frac{E_{r_{t}}^f+\frac{d_{t-1}}{P_{t-1}}(1+r_d) -r_f}{\gamma \sigma ^2} \end{aligned}$$
(7)

where \(E_{r_{t}}^f\) is the fundamentalists’ expectation of the return of the risky asset and \(\sigma ^2\) is its expected variance. Thus, the first order approximation reduces the CRRA optimisation to a mean-variance optimisation. Denoting the number of shares invested in the risky asset by \(n_t^f:=\frac{x_t^fW_t^f}{P_t}\), the excess demand of the fundamentalists for the risky asset is described by the following equation:

$$\begin{aligned} \Delta D_{t-1\rightarrow t}^f:=\;&n_{t}^fP_{t}-n_{t-1}^fP_{t} \nonumber \\ =\;&x_{t}^fW_{t}^f -x_{t-1}^fW_{t-1}^f\frac{P_{t}}{P_{t-1}} \\ =\;&W_{t-1}^f \left( x_{t}^f\left[ x_{t-1}^f \left( r_{t}+\frac{d_{t}}{P_{t-1}} -r_f \right) + r_f+1 \right] -x_{t-1}^f\frac{P_{t}}{P_{t-1}} \right) \nonumber \end{aligned}$$
(8)

together with (7). Note that the demand is \(\Delta D_{t-1\rightarrow t}^f = n_{t}^fP_{t}-n_{t-1}^fP_{t}\) and not \(\Delta D_{t-1\rightarrow t}^f = n_{t}^fP_{t}-n_{t-1}^fP_{t-1}\), since the change in the desired number of shares is paid at the price \(P_t\).
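The approximate allocation rule (7) and the excess demand (8) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used for the simulations: function names, argument names and numerical values are hypothetical, and the expected return is passed in as a given number because its estimation is not specified at this point of the text.

```python
def fundamentalist_fraction(exp_return, d_prev, p_prev, r_d, r_f, gamma, sigma2):
    """Approximate (mean-variance) form of Eq. (7)."""
    expected_excess = exp_return + d_prev / p_prev * (1.0 + r_d) - r_f
    return expected_excess / (gamma * sigma2)


def excess_demand(w_prev, x_prev, x_now, p_prev, p_now, d_now, r_f):
    """Eq. (8): value of shares bought (+) or sold (-), settled at price P_t."""
    r_t = p_now / p_prev - 1.0
    # Wealth update of Eq. (6).
    w_now = w_prev * (1.0 + x_prev * (r_t + d_now / p_prev - r_f) + r_f)
    return x_now * w_now - x_prev * w_prev * p_now / p_prev


# Hypothetical numbers: expected excess return 0.04 %, gamma = 2, sigma^2 = 4e-4.
x_f = fundamentalist_fraction(0.0005, 0.0, 100.0, 0.0002, 0.0001, 2.0, 0.0004)

# Constant fraction of 50 %, price rising from 10 to 11, no dividend, r_f = 0.
demand = excess_demand(100.0, 0.5, 0.5, 10.0, 11.0, 0.0, 0.0)
```

In the second example the demand is negative: an investor who keeps a constant risky fraction must sell shares after a price increase, which illustrates the contrarian character of a rebalancing strategy.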

2.3 Noise Trader Strategy

The noise traders’ investment strategy is based on the analysis of the asset’s historical price returns and on the opinion of other noise traders. As in Kaizoji et al. (2015) and Westphal and Sornette (2020), each individual noise trader is either invested in the risky asset or in the risk-free asset and does not diversify their portfolio. In each time-step, the noise traders decide independently and probabilistically to keep their current position or to switch to the other asset. The switching probability is influenced by the opinion of the other noise traders and the price momentum. The price momentum \(H_t\) is defined as the exponential moving average of the return of the risky asset

$$\begin{aligned} H_t = \theta H_{t-1}+(1-\theta ) r_t, \end{aligned}$$
(9)

where \(0\le \theta \le 1\) controls the time-span \(\sim 1/(1-\theta )\) of the noise traders’ memory.

The number of noise traders invested in the risky asset is denoted as \(N_t^{+}\), the number of noise traders invested in the risk-free asset is \(N_t^{-}\), and the total number of noise traders is \(N^{n}=N_t^{+}+N_t^{-}\). This total number of noise traders is kept constant during the simulations. The opinion index \(s_t\) describes the collective opinion towards the risky asset compared to the risk-free asset in each time-step. It is defined as

$$\begin{aligned} s_t=\frac{N_t^{+} - N_t^{-} }{N^n} ~~~~\in [-1, 1]. \end{aligned}$$
(10)

The probability at time-step t for a noise trader who is invested in the risky asset to switch his investment position to the risk-free asset is denoted as \(p_t^+\). Correspondingly, the probability for a noise trader who is invested in the risk-free asset to switch to the risky asset is denoted as \(p_t^-\). These probabilities are given by

$$\begin{aligned} p_t^{\pm } = \frac{p}{2} \left( 1 \mp \kappa _t (s_t + H_t)\right) , \end{aligned}$$
(11)

where the constant p controls the average holding time of each asset type.
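The momentum (9), the opinion index (10) and the switching probabilities (11) can be combined in a short sketch. All function names and numerical values are hypothetical. Clipping the probabilities to the interval [0, 1] is an assumption of this sketch and is not stated in the text.

```python
def update_momentum(h_prev, r_t, theta):
    """Exponential moving average of returns, Eq. (9)."""
    return theta * h_prev + (1.0 - theta) * r_t


def opinion_index(n_plus, n_minus):
    """Opinion index s_t in [-1, 1], Eq. (10)."""
    return (n_plus - n_minus) / (n_plus + n_minus)


def switching_probabilities(p, kappa, s_t, h_t):
    """Eq. (11): returns (p_plus, p_minus)."""
    signal = kappa * (s_t + h_t)
    p_plus = 0.5 * p * (1.0 - signal)   # risky -> risk-free
    p_minus = 0.5 * p * (1.0 + signal)  # risk-free -> risky

    def clip(q):
        # Assumption of this sketch: keep the result a valid probability.
        return min(1.0, max(0.0, q))

    return clip(p_plus), clip(p_minus)


# Hypothetical state: 75 % of noise traders hold the risky asset, H_t = 1 %.
s = opinion_index(75, 25)
p_plus, p_minus = switching_probabilities(p=0.2, kappa=1.0, s_t=s, h_t=0.01)
```

With a positive opinion index and positive momentum, \(p_t^-\) exceeds \(p_t^+\): traders holding the risk-free asset are more likely to switch into the risky asset than the reverse, which is the positive feedback at the origin of the bubbles.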

The time-dependent parameter \(\kappa _t\) determines the strength of social imitation and momentum following, which are assumed here to be controlled by the same parameter. Financial markets are characterised by periods of exuberance alternating with periods of pessimistic mood (Sornette, 2003; Shiller, 2006). We propose to account for the existence of different regimes and for the random switches between them by allowing the noise traders to shift between periods when they have a large tendency to herd and when they have more heterogeneous opinions. Thus, their susceptibility to herding is regime dependent. This incorporates the influence of exogenous factors such as economic and geopolitical regimes in the model. We account for these characteristics by allowing the coupling strength \(\kappa _t\) to be time-dependent according to a discretized Ornstein–Uhlenbeck process

$$\begin{aligned} \kappa _t = \kappa _{t-1} + \eta _\kappa (\mu _\kappa -\kappa _{t-1}) +\sigma _\kappa v_t, \end{aligned}$$
(12)

where \(\eta _\kappa \) is the strength of mean reversion that controls the persistence time \(\sim 1/\eta _\kappa \) of deviations from the mean \(\mu _\kappa \), driven by fluctuations with standard deviation \(\sigma _\kappa \) with \(v_t{\mathop {\sim }\limits ^{iid}}\mathcal {N}(0,1)\). The expected value of the Ornstein-Uhlenbeck process \(\kappa _t\) starting at an initial value \(\kappa _0\) is

$$\begin{aligned} E[\kappa _t] = \kappa _0\cdot \exp (-\eta _\kappa \cdot t)+\mu _\kappa (1-\exp (-\eta _\kappa \cdot t)). \end{aligned}$$
(13)

Thus, the estimated time to revert from a value \(\kappa _0>\kappa _c\) to a value \(\kappa _c >\mu _\kappa \) is

$$\begin{aligned} \Delta T = \frac{1}{\eta _\kappa } \log \left( \frac{\kappa _0-\mu _\kappa }{\kappa _c -\mu _\kappa } \right) . \end{aligned}$$
(14)
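The recursion (12), the expected value (13) and the reversion time (14) can be checked against each other numerically. The sketch below is illustrative only; function names and parameter values are hypothetical. Note that the exact mean of the discrete recursion (12) is \(\mu _\kappa +(\kappa _0-\mu _\kappa )(1-\eta _\kappa )^t\), which expression (13) approximates for small \(\eta _\kappa \).

```python
import math
import random


def simulate_kappa(kappa0, mu, eta, sigma, n_steps, seed=0):
    """Discretized Ornstein-Uhlenbeck process of Eq. (12)."""
    rng = random.Random(seed)
    path = [kappa0]
    for _ in range(n_steps):
        prev = path[-1]
        path.append(prev + eta * (mu - prev) + sigma * rng.gauss(0.0, 1.0))
    return path


def expected_kappa(kappa0, mu, eta, t):
    """Expected value after t steps, Eq. (13)."""
    decay = math.exp(-eta * t)
    return kappa0 * decay + mu * (1.0 - decay)


def reversion_time(kappa0, kappa_c, mu, eta):
    """Expected time to revert from kappa0 down to kappa_c > mu, Eq. (14)."""
    return math.log((kappa0 - mu) / (kappa_c - mu)) / eta


# Hypothetical values: mean 0.98, start at 1.2, threshold 1.0, eta = 0.001.
delta_t = reversion_time(kappa0=1.2, kappa_c=1.0, mu=0.98, eta=0.001)
check = expected_kappa(kappa0=1.2, mu=0.98, eta=0.001, t=delta_t)
```

By construction, evaluating (13) at the time given by (14) returns the threshold \(\kappa _c\), so `check` is equal to 1.0 up to rounding errors.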

Aggregating the independent investment decisions over all noise traders amounts to considering an equivalent representative noise trader who decides on the fraction \(x_t^n\) of his wealth invested in the risky asset. The risky fraction is then given by

$$\begin{aligned} x_t^n=\frac{N_t^+}{N^n}~~~~\in [0,1]. \end{aligned}$$
(15)

The fraction of wealth invested in the risky asset then evolves as

$$\begin{aligned} x_t^n=\frac{1}{N^n}\left( \sum _{k=1}^{N_{t-1}^+}\left( 1-\xi _k(p_{t-1}^{+})\right) +\sum _{l=1}^{N_{t-1}^-}\xi _l(p_{t-1}^{-}) \right) , \end{aligned}$$
(16)

where \(\xi (p)\) are Bernoulli random numbers. The drawing of a Bernoulli random number corresponds to the decision of each individual noise trader to switch to the other asset or to stay invested as in the previous time-step.
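One step of the aggregation (16) can be sketched by drawing one Bernoulli variable per noise trader. This is a minimal illustration, not the code used for the paper: the function name and the numbers of traders are hypothetical.

```python
import random


def update_noise_traders(n_plus, n_minus, p_plus, p_minus, rng):
    """One step of Eq. (16): returns the new pair (N_plus, N_minus)."""
    # Each trader holding the risky asset leaves it with probability p_plus.
    leave_risky = sum(rng.random() < p_plus for _ in range(n_plus))
    # Each trader holding the risk-free asset switches with probability p_minus.
    enter_risky = sum(rng.random() < p_minus for _ in range(n_minus))
    new_plus = n_plus - leave_risky + enter_risky
    return new_plus, n_plus + n_minus - new_plus


rng = random.Random(42)  # fixed seed (assumption) for reproducibility
n_plus, n_minus = update_noise_traders(600, 400, 0.05, 0.15, rng)
x_n = n_plus / (n_plus + n_minus)  # risky fraction of Eq. (15)
```

The total number of noise traders is conserved by construction, in line with the constant \(N^n\) of the model.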

The corresponding aggregated wealth equation has the same structure as (6):

$$\begin{aligned} W_{t}^n=W_{t-1}^n \cdot \left( 1+x_{t-1}^n \cdot \left( r_{t} + \frac{d_{t}}{P_{t-1}} \right) + (1-x_{t-1}^n) \cdot r_f \right) . \end{aligned}$$
(17)

Combining the previous equations, the resulting aggregated excess demand from the noise traders for the risky asset is described by the following equation:

$$\begin{aligned} \Delta D_{t-1\rightarrow t}^n = W_{t-1}^n \left( x_{t}^n \left[ x_{t-1}^n \left( r_{t} + \frac{d_{t}}{P_{t-1}}-r_f \right) +r_f +1 \right] -x_{t-1}^n \frac{P_{t}}{P_{t-1}}\right) \end{aligned}$$
(18)

together with equations (16) and (11).
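The excess demand (18) can be evaluated directly once the two consecutive risky fractions, the return and the prices are known. The sketch below is a direct transcription of the formula; the function and argument names are assumptions, and the inputs in the example are illustrative only.

```python
def excess_demand(wealth_prev, x_new, x_prev, r_t, dividend, price_prev, price_new, r_f):
    """Excess demand of Eq. (18) for a trader with risky fractions x_prev, x_new.

    wealth_prev : wealth W_{t-1}
    x_prev      : risky fraction held from t-1 to t
    x_new       : risky fraction targeted at t
    r_t         : return of the risky asset from t-1 to t
    """
    # Wealth growth factor over the period, as in the wealth equation (17).
    growth = x_prev * (r_t + dividend / price_prev - r_f) + r_f + 1.0
    return wealth_prev * (x_new * growth - x_prev * price_new / price_prev)


# No price change, no dividend, zero risk-free rate: demand is W * (x_new - x_prev).
print(excess_demand(100.0, 0.6, 0.5, 0.0, 0.0, 10.0, 10.0, 0.0))
```

In this degenerate case the trader simply buys the amount needed to raise his risky fraction from 0.5 to 0.6 of an unchanged wealth.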

2.4 Dragon-Slayer Strategy

The dragon-slayer’s objective is to prevent bubbles and crashes by predicting them and trading the risky asset. He builds an expectation of the future return based on his diagnostic of the bubble, his anticipation of the drawdown that is expected to result from the burst of the bubble and his expectation of the long-term growth rate. Using this expected return, he maximizes his expected utility similarly to the fundamentalists. The strategy is designed such that the dragon-slayer increases his allocation to the risky asset when the bubble starts to grow, in order to sell his shares later and thereby lean countercyclically against the ongoing exuberant growth of the bubble. Moreover, the utility maximization ensures a reasonable control of the risk-adjusted return to prevent excessive losses or the need for the creation of money. The latter would have other unintended consequences in the economy linked to the financial market. By acting similarly to a standard investor, the dragon-slayer avoids creating additional market disruptions and avoids becoming a special target of arbitrageurs.

First, the dragon-slayer calculates the excess return momentum \(y_t\) above the long-term return \(\bar{r}\) in units of daily return as an exponential moving average with memory parameter a:

$$\begin{aligned} y_t = a \cdot y_{t-1} + (1-a) \cdot (r_{t-1} - \bar{r}). \end{aligned}$$
(19)

Note that \(y_t\) is knowable by the dragon-slayer at time \(t-1\) when \(r_{t-1}\) is observed. The index t of \(y_t\) is used to indicate that the excess return momentum \(y_t\) is used by the dragon-slayer to decide on his allocation to the risky asset over the period from \(t-1\) to t. In comparison to the price momentum \(H_t\) used by the noise traders, the parameter \(y_t\) is more sensitive to accelerations of the price growth.
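The recursion (19) is a standard exponential moving average and can be sketched as follows. The function name and the initial condition \(y_0=0\) are assumptions; the example feeds a constant return to show that \(y_t\) converges to the constant excess return \(r-\bar{r}\) on a time scale \(1/(1-a)\).

```python
def excess_return_momentum(returns, a, r_bar, y0=0.0):
    """Eq. (19): y_t = a*y_{t-1} + (1-a)*(r_{t-1} - r_bar).

    returns[i] plays the role of r_{t-1}; the i-th output is the
    corresponding y_t, available as soon as r_{t-1} is observed.
    """
    y = y0
    out = []
    for r in returns:
        y = a * y + (1.0 - a) * (r - r_bar)
        out.append(y)
    return out


# Constant daily return of 0.1% against a long-term return of 0.016%.
ys = excess_return_momentum([0.001] * 2000, a=0.98, r_bar=0.00016)
print(ys[0], ys[49], ys[-1])  # slow build-up towards 0.001 - 0.00016
```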

Using \(y_t\), the dragon-slayer estimates the probability that the overpricing will result in a crash according to the logistic function:

$$\begin{aligned} \lambda _t=1/(1+e^{-(\vert y_t \vert -l_y)/s}) \end{aligned}$$
(20)

The threshold \(l_y\) describes the level of overpricing that the dragon-slayer defines as excessive and s quantifies the confidence of the dragon-slayer in the existence of overpricing. The choice of the logistic function follows a long tradition in social sciences, medical fields and machine learning (Birnbaum & Chavez, 1997; Carbone & Hey, 2000). It amounts to a linear dependence of the logarithm of the odds of a crash on the excess return momentum \(\vert y_t\vert \). It enjoys several convenient properties; for instance, its inverse (the logit) is the negative of the derivative of the binary entropy function. It is also a central element of the probabilistic Rasch model for measurement, with applications in psychology and other areas (Alagumalai et al., 2005).
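A minimal sketch of Eq. (20) follows, using \(l_y=0.008\) and \(s=0.0005\) as in the figures; the function name is an assumption. It also checks numerically that the log-odds of a crash are linear in \(\vert y_t\vert \) with slope \(1/s\).

```python
import math


def crash_probability(y, l_y, s):
    """Eq. (20): logistic crash probability as a function of |y_t|."""
    return 1.0 / (1.0 + math.exp(-(abs(y) - l_y) / s))


l_y, s = 0.008, 0.0005
for y in (0.0, 0.004, 0.008, 0.009, 0.012):
    lam = crash_probability(y, l_y, s)
    log_odds = math.log(lam / (1.0 - lam))
    # The log-odds equal (|y| - l_y) / s, i.e. they are linear in |y|.
    print(y, lam, log_odds)
```

The probability equals 1/2 exactly at the threshold \(\vert y_t\vert =l_y\) and is negligible for \(y_t\) close to zero, since \(l_y/s=16\) for these parameters.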

The dragon-slayer invests in the risky asset when he detects a small deviation of the return from the long-term growth rate, in order to build an inventory that he will be able to draw from later to fight a possible future market exuberance. Then, when the deviation between the current growth rate and the long-term growth rate exceeds the dragon-slayer’s tolerance level, he starts to sell the risky asset that he has accumulated earlier, in order to counter further price increases.

The dragon-slayer’s main objective is to prevent bubbles and crashes. However, as a policy maker, he is also obligated not to squander the wealth (or money creation power) he is entrusted with and to invest it in a way that ensures a reasonable amount of risk-adjusted return. Therefore, the dragon-slayer’s strategy is embedded in the same framework as the fundamentalist strategy, maximizing the expected utility with the same risk-aversion parameter \(\gamma \). The only difference between pure fundamentalists and dragon-slayers lies in the way they form their expectation of the future return. The inclusion of bubble and crash forecasts in the expected return of the risky asset by the dragon-slayer implicitly results in the desired counter-cyclical investment strategy, as we shall see. Moreover, this formulation ensures that the strategy of the dragon-slayer converges towards that of the fundamentalists in the absence of bubbles.

At time \(t-1\), the dragon-slayer forms an expectation of the risky asset return according to the following expression

$$\begin{aligned} E_{r_t}^{d}=\overbrace{(1-\lambda _t) \cdot w_y \cdot y_t}^{(I)}-\overbrace{\lambda _t \cdot {{\,\textrm{sign}\,}}(y_t) \cdot w_y \cdot l_y}^{(II)}+ \overbrace{\bar{r}}^{(III)} \end{aligned}$$
(21)

with \(w_y\) being the weight that the dragon-slayer gives to his bubble diagnostic. This expected return \(E_{r_t}^{d}\) is time-dependent and is formed over a time horizon proportional to \(1/(1-a)\), as seen from expression (19).

Term (I) in (21) corresponds to the diagnostic by the dragon-slayer of the degree with which the risky asset can deliver a return above the long term value. This occurs when the excess return momentum \(y_t\) becomes positive but not too large so that the crash probability \(\lambda _t\) remains small. This corresponds to \(-l_y \le y_t \le l_y\).

Term (II) captures the impact on the expected return \(E_{r_t}^{d}\) stemming from the anticipation of the drawdown. This part is proportional to the threshold \(l_y\) that is tolerated in the price momentum, which is also the expected amplitude of the drawdown. Furthermore, \({{\,\textrm{sign}\,}}(y_t)\) incorporates the direction of the bubble. Term (II) is significant when the estimated probability \(\lambda _t\) that the bubble is going to crash is close to 1, which corresponds to \(\vert y_t\vert >l_y\).

Term (III) is the long-term growth rate of the risky asset, which is equal to the fundamentalists’ expectation of the future return and is equal to the dividend growth rate \(r_d\) (Westphal & Sornette, 2020). This term is not influenced by the bubble diagnostic of the dragon-slayer. When the excess return momentum \(y_t\rightarrow 0\), \(E_{r_t}^{d}\longrightarrow \bar{r}\), which means that the investment allocation of the dragon-slayer converges towards that of the fundamentalists.
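The interplay of the three terms of (21) can be made concrete with the short sketch below, which uses the parameter values quoted in the caption of Fig. 1. The function names are assumptions. The example illustrates that the expected return equals \(\bar{r}\) at \(y_t=0\), exceeds it for moderate positive \(y_t\), and falls below it once \(y_t\) is well above \(l_y\).

```python
import math


def sign(v):
    """Sign function with sign(0) = 0."""
    return (v > 0) - (v < 0)


def expected_return(y, l_y, s, w_y, r_bar):
    """Eq. (21): expected return of the dragon-slayer."""
    lam = 1.0 / (1.0 + math.exp(-(abs(y) - l_y) / s))  # Eq. (20)
    term_i = (1.0 - lam) * w_y * y            # (I) bubble expected to continue
    term_ii = lam * sign(y) * w_y * l_y       # (II) anticipated drawdown
    return term_i - term_ii + r_bar           # (III) long-term growth rate


params = dict(l_y=0.008, s=0.0005, w_y=0.035, r_bar=0.00016)
for y in (-0.012, -0.004, 0.0, 0.004, 0.008, 0.012):
    print(y, expected_return(y, **params))
```

The deviation of the expected return from \(\bar{r}\) is an odd function of \(y_t\), so negative bubbles are treated symmetrically to positive ones.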

After building his expectation of the return of the risky asset according to (21) at time \(t-1\), the dragon-slayer chooses to allocate the fraction \(x_{t-1}^{d}\) of his wealth to the risky asset, which is given by expression (22). This fraction \(x_{t-1}^{d}\) holds from \(t-1\) to t, at which time the dragon-slayer observes \(r_t\) and recalculates the new value of the excess return momentum \(y_{t+1}\) cascading into a new value \(E_{r_{t+1}}^{d}\) and thus of his allocation \(x_{t}^{d}\), and so on. To determine \(x_{t-1}^{d}\), the dragon-slayer uses the same maximisation process as the fundamentalist, using a CRRA utility with a risk aversion level \(\gamma \). The difference with Eq. (7) is the use of \(E_{r_t}^{d}\) rather than \(\bar{r}=r_d\) for the one-period expected return of the risky asset. This yields

$$\begin{aligned} x_{t-1}^{d}= \frac{E^{d}_{r_{t}}+\frac{d_{t-1}}{P_{t-1}}(1+r_d) -r_f}{\gamma \sigma ^2} \end{aligned}$$
(22)

The dragon-slayer decides on a level \(l_y\) of mispricing that seems unreasonable (or unsustainable) to him. This means that, whenever this level is exceeded, the dragon-slayer thinks the asset is in a bubble and is going to crash soon. For any \(0<y_t<l_y\), the dragon-slayer expects the asset to grow further and his estimation of the expected return of the risky asset for the near future is larger than \(r_d\). However, when \(y_t>l_y\), he expects a crash, which lowers his expected return below \(r_d\) as a result of the drawdown anticipated when the bubble bursts. For a negative bubble corresponding to an underpricing of the risky asset, the above reasoning applies mutatis mutandis by changing \(y_t\) into \(|y_t|\).
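Expression (22) is a one-line computation once the expected return is known. In the sketch below, the function name and all numerical inputs (dividend-price ratio, \(r_f\), \(\gamma \), \(\sigma ^2\)) are hypothetical values chosen for illustration and should not be read as the parameters of Table 2.

```python
def risky_fraction(expected_ret, dividend_prev, price_prev, r_d, r_f, gamma, sigma2):
    """Eq. (22): risky fraction from expected return, dividend yield and risk."""
    dividend_yield = dividend_prev / price_prev * (1.0 + r_d)
    return (expected_ret + dividend_yield - r_f) / (gamma * sigma2)


# Hypothetical inputs: daily dividend yield 0.01%, r_f = 0.004% per day,
# gamma = 1 and a daily return variance of 4e-4.
common = dict(dividend_prev=0.01, price_prev=100.0, r_d=0.00016,
              r_f=0.00004, gamma=1.0, sigma2=0.0004)
print(risky_fraction(0.00016, **common))  # expectation equal to r_bar = r_d
print(risky_fraction(0.00020, **common))  # higher expectation, larger allocation
```

Since (22) is linear in the expected return, any bubble diagnostic that raises \(E_{r_t}^{d}\) above \(\bar{r}\) raises the allocation above that of the fundamentalists by a proportional amount.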

To estimate the relevant values of \(l_y\), we first determine the standard deviation \(\sigma _y\) of \(y_t\). Expression (19) shows that, under statistical stationary conditions, the standard deviation \(\sigma _y\) of \(y_t\) is given by

$$\begin{aligned} \sigma _y = {1-a \over \sqrt{1-a^2}} \sigma _r \end{aligned}$$
(23)

where \( \sigma _r\) is the standard deviation of \(r_{t-1} - \bar{r}\), i.e. of the returns. The memory parameter a controls the time scale \(\sim {1 \over 1-a}\) over which the dragon-slayer estimates the excess return momentum with (19). Typically, this time scale is of the order of months to years. In simulations below, we take \(a=0.98\) as an illustration, corresponding to a time scale of two calendar months if one unit of discrete time is taken to represent one trading day. This means that a is close to 1 and can be written as \(a=1-\epsilon \) with \(\epsilon \ll 1\). To first-order in \(\epsilon \), expression (23) becomes

$$\begin{aligned} \sigma _y = \sqrt{\epsilon \over 2}~ \sigma _r~. \end{aligned}$$
(24)

Expression (24) implies that the typical amplitude of the fluctuations of \(y_t\) is significantly reduced compared with the typical amplitude of the daily returns of about 1%. This is the expected effect of performing a moving average. With \(a=0.98\), we obtain \(\sigma _y = 0.1 \sigma _r \approx 0.1\%\). In the following, we will present simulation results for the choice \(l_y = 0.8\%\), which is 8× the standard deviation \(\sigma _y\) of \(y_t\). This numerical example shows that, in the absence of a serious mispricing, the dragon-slayer expects a crash with very low probability. Figure 2 below confirms that \(y_t\) indeed rarely reaches the level \(l_y\), except after long periods of large price increase.
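The orders of magnitude quoted above can be checked with a few lines. The sketch below evaluates the exact expression (23) and its approximation (24) for \(a=0.98\) and \(\sigma _r=1\%\); the function names are assumptions.

```python
import math


def sigma_y_exact(a, sigma_r):
    """Eq. (23): stationary standard deviation of y_t."""
    return (1.0 - a) / math.sqrt(1.0 - a * a) * sigma_r


def sigma_y_approx(a, sigma_r):
    """Eq. (24): leading-order approximation for a = 1 - eps, eps << 1."""
    return math.sqrt((1.0 - a) / 2.0) * sigma_r


a, sigma_r, l_y = 0.98, 0.01, 0.008
exact = sigma_y_exact(a, sigma_r)
approx = sigma_y_approx(a, sigma_r)
print("exact :", exact)
print("approx:", approx)
print("threshold in units of sigma_y:", l_y / approx)
```

For \(a=0.98\) the two expressions differ by about half a percent, so the threshold \(l_y\) indeed sits at roughly eight standard deviations of \(y_t\).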

Figure 1 shows the risky fraction of the dragon-slayer given by (22) with (21) as a function of the excess return momentum \(y_t\). If the dragon-slayer does not detect any overpricing (or underpricing) (\(y_t=0\)), the expected return is equal to that of the fundamentalists. The future expected return increases with \(y_t\) as the dragon-slayer expects the bubble to grow further, until \(y_t\) becomes too large and he expects a crash.

Analogously to the fundamentalists, the dragon-slayer’s excess demand for the risky asset is

$$\begin{aligned} \Delta D_{t-1\rightarrow t}^{d}= W_{t-1}^d \left( x_{t}^d\left[ x_{t-1}^d \left( r_{t}+\frac{d_{t}}{P_{t-1}} -r_f \right) + r_f +1 \right] -x_{t-1}^d\frac{P_{t}}{P_{t-1}} \right) \end{aligned}$$
(25)

with Eq. (22).

Fig. 1

The dragon-slayer’s risky fraction (22) as a function of the excess return momentum \(y_t\). The parameters are \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\). Three values of s are shown, where s defined in expression (20) quantifies the confidence of the dragon-slayer in the existence of overpricing. Positive (resp. negative) values of \(y_t\) correspond to positive (resp. negative) bubble regimes, namely overpricing (resp. underpricing) of the risky asset

2.5 Market Clearing and Price Equation

As presented for the fundamentalists, noise traders and dragon-slayer above, each trader decides on his excess demand for the next time-step according to Eqs. (8), (18), and (25) respectively. The price is obtained from the market clearing condition, which balances demand and supply according to Walras’ theory of general equilibrium (Walras, 1954):

$$\begin{aligned} 0=\Delta D_{t-1\rightarrow t}^{d} +\Delta D_{t-1\rightarrow t}^n +\Delta D_{t-1\rightarrow t}^f~. \end{aligned}$$
(26)

This yields a quadratic equation for the price at the next time-step, which has a unique positive solution giving the price \(P_t\). The full equations and resulting price equation can be found in Appendix 1.1.

3 Impact of the Dragon-Slayer on the Price Dynamics

3.1 General Conditions of the Simulations

The market parameters that are used in all simulations are listed in Table 2. They are chosen such that each time-step corresponds to approximately one trading day. In the present work, we do not attempt to calibrate the model to empirical data but focus on typical parameters that produce realistic financial price time series to demonstrate the impact of dragon-slayers. We recall that our model is able to match the stylized facts of financial markets, such as the autocorrelation functions of signed and absolute returns and the fat-tailed property of returns, as demonstrated in previous works (Kaizoji et al., 2015; Westphal & Sornette, 2020).

We investigate two classes of markets:

  • markets with bubbles obtained with time varying social imitation strength \(\kappa \) following an Ornstein-Uhlenbeck (OU) process (12) shown in Fig. 2;

  • markets without bubbles obtained for constant \(\kappa \) shown in Fig. 3.

These figures plot the time evolution of realisations of the price path \(P_t\), the risky fractions \(x_t^i\) invested by the three trader types, the excess return momentum \(y_t\), and the wealth of the three trader types. The price trajectory \(P_t\) exhibits bubbles and crashes in the simulation with OU \(\kappa \). The noise traders’ risky fraction and wealth increase during the bubble regimes, but crash together with the price. The fundamentalists’ risky fraction fluctuates less than the other traders’ risky fractions, and they decrease their exposure to the risky asset during the bubble regime by investing proportionally to the dividend-price ratio. Therefore, their wealth is smaller than the noise traders’ wealth during bubbles, but exceeds it in the long term. In the absence of bubbles, the dragon-slayer’s risky fraction is the same as that of the fundamentalists. However, during bubbles, the excess return momentum \(y_t\) increases and the dragon-slayer strategy deviates from the fundamentalist strategy. His risky fraction increases until \(y_t\) exceeds the threshold \(l_y\), which triggers the dragon-slayer to sell the risky asset. In the simulations with constant \(\kappa \), which represent markets without bubbles, the dragon-slayer strategy fluctuates around the fundamentalists’ strategy.

The Ising-like structure of the noise traders’ decision making (Kaizoji et al., 2015; Westphal & Sornette, 2020) allows for a phase transition between a disordered and an ordered regime. In the disordered regime, the noise traders’ opinions are heterogeneous; in the ordered regime, the noise traders polarize, which leads to an increased demand for one of the two assets. This is reflected in the price time-series as a positive or negative bubble. In the simulations with constant \(\kappa \), we choose its value in the subcritical regime, at 98% of the critical value \(\kappa _c\). In simulations with OU \(\kappa \), the parameter has the same mean value \(0.98 \kappa _c\), but fluctuates around it according to a mean reverting OU process. Thus, there are transient regimes in which \(\kappa \) is larger than the critical value. This describes regimes where the noise traders tend to polarize their decisions, as a result of the spontaneous collective organisation of individuals who interact repeatedly and sufficiently strongly.

Fig. 2

Example of a simulated price path \(P_t\) with OU \(\kappa \). The upper panel shows the price \(P_t\) in linear-log scale, the second panel shows the risky fraction invested by the three trader types, and the third panel shows the excess return momentum \(y_t\), with the dotted lines indicating the threshold \(\pm l_y\). The last panel shows the wealth of the three trader types over time. The dragon-slayer has no market impact in the simulation, i.e. the simulation is performed in the case where his wealth is negligible compared to that of the other traders. The parameters used in the strategy of the dragon-slayer are \(a=0.98\), \(l_y=0.008\), \(w_y=0.035\), and \(s=0.0005\). The 5000 time steps that are shown correspond to approximately 20 years, given that one time-step corresponds to 1 trading day

Fig. 3

Same as figure 2, but with constant \(\kappa =0.98 \kappa _c\)

3.2 Dragon-Slayers with Negligible Market Impact

In our ex-post analysis, we are interested in characterising how well the dragon-slayer diagnoses bubbles and predicts crashes. As a preliminary step, we need to identify the price peaks, which can be considered to be the target proxies of the dragon-slayers. We thus define that a price peak occurs at time scale k at time-step \(t_i\) if

$$\begin{aligned} P_{t_i} \ge P_{t_j}~~~~~ \forall t_j \in [t_i-k, t_i+k], \end{aligned}$$
(27)

where \(P_t\) is the price at time t and k is the minimum distance between two peaks. Thus, a peak occurs at a given time if the price at this time is at least as large as the prices at the k preceding and the k following time-steps. In this analysis, the minimum distance between two peaks is chosen to be \(k=250\) trading days, which corresponds to approximately one calendar year.
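Definition (27) translates into a simple window scan, sketched below. The function name is hypothetical, and the treatment of the boundaries is an assumption: only time-steps with a complete window of k points on each side are considered, so that the first and last k points of a series are never labelled as peaks. With the non-strict inequality of (27), a flat top is reported at each of its points.

```python
def price_peaks(prices, k):
    """Indices t_i with P[t_i] >= P[t_j] for all t_j in [t_i - k, t_i + k]."""
    peaks = []
    for i in range(k, len(prices) - k):
        window = prices[i - k:i + k + 1]
        if prices[i] >= max(window):
            peaks.append(i)
    return peaks


toy_prices = [1, 2, 3, 2, 1, 2, 5, 2, 1]
print(price_peaks(toy_prices, k=2))  # local maxima within +/- 2 steps
print(price_peaks(toy_prices, k=4))  # only the centre point has a full window
```

The scan costs of the order of \(k\) operations per time-step, which is unproblematic for series of a few tens of thousands of points and \(k=250\).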

Figure 4 shows how the dragon-slayer’s bubble diagnostics correlate with the price trajectory, its peaks and subsequent drawdowns. It displays a simulated price path with its major peaks, together with the crash probability \(\lambda (y_t)\) estimated by the dragon-slayer according to expression (20). The black dotted lines and black triangles mark the occurrence times and price heights of the peaks identified ex post for comparison.

In this example, among the 7 peaks diagnosed according to (27), 5 are correctly predicted (true positives) by the condition that \(\lambda _t\) exhibits a well-defined peak, while only 3 are correctly predicted if the condition is more stringent, for instance that \(\lambda _t\) should be larger than 0.2. Two peaks, at \(t \approx 1000\) and \(t \approx 4000\), are not detected by the dragon-slayers (false negatives), because they occur rather close to previous peaks and are relatively smaller than their close predecessors.

It is possible to optimise the prediction performance in terms of sensitivity and specificity, for instance, by varying the parameters \(a, l_y, s, w_y\) involved in the definition of the dragon-slayer strategy. We refrain from such optimisation in order to focus on the robustness of our conclusions. We examine below how the properties of the price dynamics of the risky asset and the wealth dynamics of the three trader types change upon varying the model and strategy parameters.

Fig. 4

Simulated price path (logarithmic scale on the left axis) with the peaks identified ex post indicated as triangles (see main text for their definition) and the crash probability \(\lambda _t\) estimated by the dragon-slayer and given by expression (20) on the right axis. The parameters are \(a=0.95\), \(l_y=0.004\), \(\bar{r}=0.00016\), and \(s=0.0005\)

3.3 Dragon-Slayers with Significant Market Impact

We analyse the impact of the dragon-slayer on the price time-series of the risky asset by increasing his initial wealth from 0 to 50% of the total initial wealth of the three trader types. The ABM is simulated with 15 different initial fractions of the total wealth allocated to the dragon-slayer (0%, 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 48%, 49%, 50%). The initial wealth of the noise traders is kept constant, while the initial wealth of the fundamentalists decreases by the same amount that the initial wealth of the dragon-slayer increases. This ensures that the impact on the price is due to the increase of the dragon-slayer’s wealth and not due to a decrease of the noise traders’ presence in the market. This is further motivated by the fact that the dragon-slayers are in fact a sub-species of fundamentalists, as explained above. Our procedure thus ensures a constant fraction of all types of fundamentalists and also of the noise traders, who are crucial for the creation of bubbles and crashes. Each scenario is simulated for the same 1000 random seeds and the same set of parameters. Unless stated otherwise, the dragon-slayer parameters are \(a=0.98\), \(l_y=0.008\), \(\bar{r}=0.00016\) and \(s=0.0005\). The total simulation duration is \(T=25000\).

First, the impact of the dragon-slayer’s wealth is shown qualitatively by comparing the price paths with different fractions of the dragon-slayer’s wealth in Fig. 5. Then, Fig. 6 shows the impact of the dragon-slayer’s wealth on the number and amplitudes of the peaks and drawdowns in the price of the risky asset, Table 1 quantifies the impact of the dragon-slayer’s wealth on the long-term growth of the market, while Fig. 7 illustrates the impact of the dragon-slayer’s wealth on the return distribution of the risky asset.

Fig. 5

Section of a price path (left) and the corresponding fractions invested in the risky asset for the three trader types (right). Each row of panels shows a different dragon-slayer wealth fraction in the market, from 0% (top) to 50% (bottom). The social coupling strength \(\kappa \) follows the OU process (12). The dragon-slayer parameters are \(a=0.98\), \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\), and \(s=0.0005\)

Figure 5 shows sections of a price path and the corresponding fractions invested by the three trader types for OU \(\kappa \) (12); the corresponding figure with the same random seed is shown for constant \(\kappa \) in Appendix Fig. 10. In this example, the price of the risky asset exhibits a large bubble around \(t=1250\) in the simulation where the dragon-slayer impact is negligible. This peak disappears in the simulations performed under the same conditions but with an initial wealth fraction of the dragon-slayer of 20% or larger. In the cases with 10% and 30%, the main bubble is slightly decreased in amplitude while some secondary peaks appear in the case of 30%. In the 40% and 50% cases, the price trajectory becomes very similar to those obtained with constant \(\kappa \). In the price path with 20%, all bubbles are eliminated. This shows that the effect of the dragon-slayer is not deterministically monotonic as a function of his wealth fraction and needs to be defined probabilistically, as it depends on the specific random realisation of the price process created by the noise traders. We thus need to perform a detailed statistical analysis over many realisations to obtain meaningful conclusions. This analysis is developed below.

In the right panels in Fig. 5, the risky fraction of the dragon-slayer is more volatile for small initial wealth fractions. However, as his initial wealth increases, his strategy becomes more similar to the fundamentalist strategy and the risky fraction fluctuates closely around the fundamentalists’ risky fraction. By investing proportionally to the dividend-price ratio, the fundamentalist strategy has a stabilizing effect on the market. Despite the absence of bubbles, the noise traders’ risky fraction fluctuates a lot. This is different from the simulations with constant \(\kappa \), where the noise traders’ risky fraction remains between 0.3 and 0.7 most of the time. The fundamentalists’ risky fraction is not affected significantly by the change of initial wealth of the dragon-slayer. In the simulations with constant \(\kappa \), the impact of increasing the initial wealth of the dragon-slayer is much smaller. The price path and risky fractions seem similar to the reference simulation without the dragon-slayer in all six scenarios.

Fig. 6

Quantification of the impact of the dragon-slayer strategy on the price of the risky asset. The average number of peaks and the average peak-to-valley-drawdown are shown as a function of the initial dragon-slayer wealth fraction in the risky asset. The average quantities are calculated over 1000 simulations with 1000 different random seeds. The error bars represent one standard deviation. Each quantity is calculated over price realisations occurring between \(t=5000\) and \(t=17500\), which corresponds to approximately 50 years. The dragon-slayer parameters are \(a=0.98\), \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\), and \(s=0.0005\)

With the ex post identification of peaks given in Eq. (27), Fig. 6 presents a quantitative description of the dragon-slayer’s impact on bubbles and crashes in terms of two metrics: the number of price peaks and the average amplitude of the peak-to-valley drawdown (referred to as the average peak size). The valley following a peak occurs at the time when the price takes its minimum value between two consecutive peaks. The corresponding size of a drawdown \(d_{t_i}\) is defined as the difference between the log-price at the time of the peak (\(t_i\)) and the log-price at the time of the consecutive valley.

$$\begin{aligned} d_{t_i}=\log (P_{t_{peak}=t_i})-\log (P_{t_{valley}}). \end{aligned}$$
(28)
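The drawdown definition (28) can be sketched as follows, given the peak indices obtained from (27). The function name is hypothetical. Since the valley is defined as the minimum between two consecutive peaks, this sketch assumes that the last peak of a series, which has no successor, is simply skipped.

```python
import math


def peak_to_valley_drawdowns(prices, peaks):
    """Eq. (28): log-price drop from each peak to the following valley.

    prices : sequence of positive prices
    peaks  : increasing list of peak indices, e.g. from definition (27)
    """
    drawdowns = []
    for start, end in zip(peaks, peaks[1:]):
        # The valley is the minimum price between two consecutive peaks.
        valley = min(prices[start:end + 1])
        drawdowns.append(math.log(prices[start]) - math.log(valley))
    return drawdowns


toy_prices = [1, 2, 3, 2, 1, 2, 5, 2, 1]
dd = peak_to_valley_drawdowns(toy_prices, peaks=[2, 6])
print(dd)  # one drawdown, from the peak at index 2 to the valley at index 4
```

Because (28) is a difference of log-prices, a drawdown of 57% in these units corresponds to a price that falls to \(e^{-0.57}\approx 0.57\) of its peak value, i.e. a loss of about 43% in simple terms.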

Figure 6 shows the average number of peaks calculated over 1000 simulations with 1000 different random number seeds as a function of the initial dragon-slayer wealth fraction. The number of peaks is calculated over 12,500 time-steps, which corresponds to approximately 50 years, and for a minimum distance \(k=250\) between price peaks (see definition (27)). The figure shows that, for OU \(\kappa \), the average number of peaks decreases by 5.7% when increasing the dragon-slayer’s wealth fraction from 0% to 50%. In contrast, for constant \(\kappa \), the average number of peaks remains approximately constant. This result is not necessarily bad news, as it is not surprising that peaks subsist in random price trajectories and we should not expect a strong effect for this metric. In contrast, the average peak size is significantly reduced by the dragon-slayer, almost by a factor 2. For OU \(\kappa \), it decreases from 57% in the absence of the dragon-slayer to 33% when the dragon-slayer completely replaces all fundamentalists. Thus, we can conclude that the dragon-slayer has a very strong impact in essentially eliminating all bubbles in the dynamics of the risky asset. Figures 11 and 12 show that the dragon-slayer with a shorter memory length (\(a=0.95\)), as defined in expression (19), prevents more bubbles and reduces the size of the drawdowns more efficiently than the dragon-slayer with slower reactions characterised by \(a=0.99\).

Table 1 Annualized growth rate of the price of the risky asset estimated from \(t=5000\) to \(t=25000\) and averaged over 1000 simulations with different random seeds for each initial wealth fraction of the dragon-slayer

While the dragon-slayer prevents bubbles and crashes, one could worry that this would come at the cost of impacting the long-term average return of the risky asset. By presenting the average return and its standard deviation calculated over 1000 simulations with different random seeds, Table 1 shows that this is not the case. Each value is the annualized growth-rate \(r_a\) between \(t=5000\) and \(t=25000\) calculated as

$$\begin{aligned} r_a = {250 \over 25000-5000} \log \left( {P_{25000} \over P_{5000}}\right) . \end{aligned}$$
(29)

The first 5000 time-steps are removed to avoid the influence of transients at the beginning of the simulations. The simulations have been performed with \(r_d=0.00016\) per time step (day), which corresponds to an annualised growth rate of dividends of 4%. Thus, theoretically, the long-term growth rate of the risky asset should also be equal to 4% (Westphal & Sornette, 2020). For all scenarios with different dragon-slayer fractions, we find that the empirical mean value of the return of the risky asset is less than one standard deviation away from the theoretical value. Thus, one can conclude that the dragon-slayer does not change the long-term growth rate of the risky asset.
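As a consistency check of (29), the sketch below applies the formula to a synthetic price that grows deterministically at \(r_d=0.00016\) per time step and recovers the 4% annualised rate. The function name and the synthetic price series are assumptions for illustration.

```python
import math


def annualised_growth(prices, t_start=5000, t_end=25000, steps_per_year=250):
    """Eq. (29): annualised log growth rate between t_start and t_end."""
    span = t_end - t_start
    return steps_per_year / span * math.log(prices[t_end] / prices[t_start])


# Synthetic price growing at exactly r_d per time step (no noise).
r_d = 0.00016
synthetic = [math.exp(r_d * t) for t in range(25001)]
print(annualised_growth(synthetic))  # close to 250 * r_d = 0.04
```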

Figure 7 shows with box plots how the distribution of returns of the risky asset is influenced by the dragon-slayer fraction for OU \(\kappa \). It displays the median, quartiles and range of the variance, skewness, excess kurtosis and \(\hbox {VaR}_{1\%}\) calculated over 1000 simulations with different random seeds. As observed in the analysis above, the impact on simulations with constant \(\kappa \) is very small; the corresponding figure can be found in Appendix Fig. 13. For OU \(\kappa \), the first panel shows that the median of the variance of the return decreases from \(1.47\cdot 10^{-4}\) to \(0.56\cdot 10^{-4}\) (almost a factor 3) in the presence of the dragon-slayer. Already a fraction of 5% of initial wealth owned by the dragon-slayer decreases the variance by 28%. The skewness of the return is pushed closer to zero as the impact of the dragon-slayer increases. Without the dragon-slayer, the median skewness of the return is \(-\)0.081 while, with 50% dragon-slayer, the skewness is \(-\)0.037. The median excess kurtosis decreases by 71.1% from 1.50 to 0.43 when the dragon-slayer’s wealth fraction increases from 0 to 50%. The median of the absolute value of the 1%-VaR decreases from 0.0294 to 0.0180 for the same change of the dragon-slayer’s wealth fraction. The interquartile ranges of all four measures also decrease with the increase of the dragon-slayer’s wealth fraction in the market. In particular, the decreases of the variance and of the absolute value of the 1%-VaR demonstrate clearly that the dragon-slayer stabilizes the market. A final noteworthy observation is that the beneficial impact of the dragon-slayer is stronger for small wealth fractions and that its marginal effect decreases as his wealth fraction increases. Thus, already a small initial dragon-slayer’s wealth fraction can stabilize the market.
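For concreteness, the four statistics can be computed from a return series as sketched below. The estimator choices are assumptions, since the text does not specify them: population (biased) moments are used for the variance, skewness and excess kurtosis, and the 1%-VaR is taken as the empirical 1% quantile of the returns (a negative number whose absolute value is reported above). The function name is hypothetical.

```python
import random


def return_statistics(returns, var_level=0.01):
    """Variance, skewness, excess kurtosis and empirical VaR of a return series."""
    n = len(returns)
    mean = sum(returns) / n
    centred = [r - mean for r in returns]
    m2 = sum(c ** 2 for c in centred) / n   # population variance
    m3 = sum(c ** 3 for c in centred) / n
    m4 = sum(c ** 4 for c in centred) / n
    ordered = sorted(returns)
    return {
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        # Simple empirical quantile: the int(var_level * n)-th smallest return.
        "var": ordered[int(var_level * n)],
    }


# Gaussian returns with 1% daily volatility, as a reference case.
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 0.01) for _ in range(20000)]
print(return_statistics(gaussian))
```

For Gaussian returns the skewness and the excess kurtosis are close to zero and the 1%-VaR is close to \(-2.33\,\sigma _r\), which gives a reference against which the fat-tailed simulated returns can be compared.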

Fig. 7

Boxplots of the dragon-slayer’s impact on the statistics of returns of the risky asset for OU \(\kappa \), calculated over 1000 simulations with different random seeds for each dragon-slayer fraction. The red lines show the median, the bottom and top of the boxes correspond to the first and third quartiles (25% and 75% quantiles) and the whiskers indicate the largest and lowest observed values within 1.5× the interquartile range. The four panels show the variance, skewness, excess kurtosis, and 1%-VaR of the return of the risky asset. For each simulation, the statistics are calculated over the time interval from \(t=5000\) to \(t=17500\). The dragon-slayer parameters are \(a=0.98\), \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\), and \(s=0.0005\)

4 Impact of the Dragon-Slayer on the Traders’ Performance

The previous section showed how the dragon-slayer removes bubbles from the risky asset price and decreases its volatility without influencing the long-term growth rate of the asset. Here, we focus on how his presence affects the performance of the three types of traders present in the market.

Figure 8 shows the average Sharpe ratio of the three trader types, calculated over 1000 simulations with different random seeds, as a function of the initial dragon-slayer’s wealth fraction in the market. The corresponding figures for \(a=0.95\) and \(a=0.99\) can be found in Appendix Figs. 14 and 15. The Sharpe ratio is calculated over the interval \(t\in [5000, 17500]\), which corresponds to 50 years. Figure 8 shows that, for OU \(\kappa \), for which bubbles emerge naturally from the strategies of the noise traders, the risk-adjusted return of all three trader types increases with increasing initial wealth fraction of the dragon-slayer. The increases of the Sharpe ratios are economically very significant, from approximately 0.3 to 0.5 for noise traders, from 0.6 to 0.8 for fundamentalists and from 0.8 to close to 1 for the dragon-slayer, as his wealth fraction increases from 0 to 50%. In contrast, for constant \(\kappa \), the Sharpe ratios remain approximately constant. This is consistent with the observation from Fig. 3, which shows very little impact of the dragon-slayer on the price dynamics of the risky asset for constant \(\kappa \), for which bubbles do not appear. This comes from the fact that, in the absence of bubbles, the dragon-slayer’s strategy reduces to that of the fundamentalists. This leads us to conclude that, in the absence of bubbles, the performance of traders remains unchanged when increasing the wealth of the dragon-slayer, while it improves significantly when the dragon-slayer removes bubbles that were otherwise present.
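The text does not spell out the Sharpe ratio formula. One common convention, assumed in the sketch below, is the mean of the per-step wealth returns in excess of \(r_f\), divided by their sample standard deviation and annualised with a factor \(\sqrt{250}\). The function name and the toy wealth path are hypothetical.

```python
import math


def annualised_sharpe(wealth, r_f, steps_per_year=250):
    """Annualised Sharpe ratio of a wealth path (assumed convention).

    wealth : sequence of wealth values W_t, one per time step
    r_f    : risk-free rate per time step
    """
    excess = [wealth[i + 1] / wealth[i] - 1.0 - r_f for i in range(len(wealth) - 1)]
    n = len(excess)
    mean = sum(excess) / n
    variance = sum((e - mean) ** 2 for e in excess) / (n - 1)  # sample variance
    return math.sqrt(steps_per_year) * mean / math.sqrt(variance)


# Toy wealth path alternating between +2% and 0% per step.
toy_wealth = [1.0]
for step in range(1000):
    toy_wealth.append(toy_wealth[-1] * (1.02 if step % 2 == 0 else 1.0))
print(annualised_sharpe(toy_wealth, r_f=0.0))
```

The toy path has a mean return equal to its standard deviation per step, so its annualised Sharpe ratio is close to \(\sqrt{250}\approx 15.8\); realistic wealth paths such as those of Fig. 8 yield values of order one.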

Fig. 8

Average Sharpe ratios of the three trader types as a function of the dragon-slayer’s wealth fraction for OU \(\kappa \) (solid lines) and constant \(\kappa \) (dashed lines) calculated over 1000 realisations with different random seeds and over approximately 12,500 time steps. The error bars represent one standard deviation. The dragon-slayer parameters are \(a=0.98\), \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\), and \(s=0.0005\)
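The annualised Sharpe ratios discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `annualised_sharpe` is hypothetical, excess returns are measured against a constant per-step risk-free rate, and the factor of 250 trading days per year is inferred from the statement that 12,500 time steps correspond to 50 years.

```python
import math

# 12,500 time steps are stated to correspond to 50 years, i.e. 250 steps/year.
TRADING_DAYS_PER_YEAR = 250


def annualised_sharpe(daily_returns, daily_risk_free=0.0):
    """Annualised Sharpe ratio from per-step (daily) returns.

    Assumption: excess returns are taken relative to a constant daily
    risk-free rate; the exact convention of the source is not shown here.
    """
    excess = [r - daily_risk_free for r in daily_returns]
    n = len(excess)
    mean = sum(excess) / n
    # Sample variance with the usual n - 1 denominator.
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(TRADING_DAYS_PER_YEAR)


# Tiny worked example with four hypothetical daily returns.
print(annualised_sharpe([0.01, -0.01, 0.02, 0.0]))
```

For a simulated wealth path, the input would be the trader's returns restricted to the evaluation window, for example `returns[5000:17500]`, and the resulting ratio would then be averaged over the random seeds.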

While Fig. 8 provides an in-depth analysis of one specific dragon-slayer strategy corresponding to a specific set of parameters, Fig. 9 analyses the sensitivity of the traders’ Sharpe ratio to four parameters of the dragon-slayer strategy and to the average growth rate \(\bar{r}\) of the risky asset. The four upper panels (resp. the bottom panel) show the average annualized Sharpe ratios of the traders with OU \(\kappa \) (solid line) and constant \(\kappa \) (dotted line) as a function of one of the dragon-slayer strategy parameters (resp. \(\bar{r}\)).

Each scenario is simulated 1000× with different random seeds. The filled circles indicate the mean value of the Sharpe ratios, each calculated from \(t=5000\) to \(t=17500\). The error bars represent one standard deviation. The dragon-slayer is endowed with 10% of the initial wealth, and this fraction is kept fixed over all simulations when varying the parameters. The Sharpe ratios with 0% dragon-slayer are included as reference values. The first panel shows the sensitivity of the Sharpe ratios to variations in the memory parameter \(a\) used by the dragon-slayer to estimate the excess return momentum \(y_t\) (19). We scan the values \(a\in [0.9, 0.95, 0.98, 0.99, 0.996]\), which means that the memory length \(1/(1-a)\) is varied between 10 and 250 trading days. For OU \(\kappa \), the performances of the fundamentalists and noise traders are better in the presence of a dragon-slayer with small \(a\), corresponding to a short memory. The performance is found to be best for \(a=0.95\), which corresponds to a memory of 20 trading days. For all analysed values of \(a\), the presence of the dragon-slayer is beneficial to the fundamentalists. The noise traders also enjoy improved performance when the dragon-slayer is present for all \(a\) except for \(a=0.996\) (250 days), where their Sharpe ratio is slightly below the reference value. As expected, the impact of the dragon-slayer is small for constant \(\kappa \), as the Sharpe ratios are very close to the reference Sharpe ratios for all analyzed memory parameters. For constant \(\kappa \), for both fundamentalists and noise traders, the Sharpe ratio in the presence of the dragon-slayer is slightly above the reference value for \(a \le 0.95\) and slightly below it for \(a>0.95\). In general, a dragon-slayer with a shorter memory length can react faster to changes in the momentum, performs better and is more beneficial to the market.
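The link between the memory parameter and the reaction speed can be illustrated with a short sketch. The recursion used below, \(y_t = a\,y_{t-1} + (1-a)(r_t-\bar{r})\), is an assumed standard form of an exponential moving average of the excess return and is not copied from expression (19); the function names are hypothetical.

```python
def memory_length(a):
    """Effective memory 1/(1-a) of an exponential moving average, in time steps."""
    return 1.0 / (1.0 - a)


def excess_return_momentum(returns, a, r_bar, y0=0.0):
    """Exponential moving average of the excess return r_t - r_bar.

    Assumed recursion (illustrative, not taken from the source):
        y_t = a * y_{t-1} + (1 - a) * (r_t - r_bar)
    """
    y = y0
    path = []
    for r in returns:
        y = a * y + (1.0 - a) * (r - r_bar)
        path.append(y)
    return path


# Memory lengths for the scanned values of a.
for a in (0.9, 0.95, 0.98, 0.99, 0.996):
    print(f"a = {a}: memory of about {memory_length(a):.0f} trading days")

# Reaction to a hypothetical persistent excess return of 0.1% per day:
# after 10 steps, the short-memory estimate is much closer to 0.001.
burst = [0.00016 + 0.001] * 10
print(excess_return_momentum(burst, a=0.9, r_bar=0.00016)[-1])
print(excess_return_momentum(burst, a=0.99, r_bar=0.00016)[-1])
```

Under this assumed recursion, a smaller \(a\) lets \(y_t\) track a new excess-return regime within a few weeks, which is consistent with the faster reaction described in the text.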

Fig. 9

Dependence of the Sharpe ratios of the three types of traders on four parameters of the dragon-slayer’s strategy and on the average growth rate \(\bar{r}\) of the risky asset. The filled circles represent the mean values over 1000 simulations with different random seeds, with the error bars representing one standard deviation. The dragon-slayer is given an initial wealth corresponding to 10% of the total initial wealth over all traders. The solid lines correspond to OU \(\kappa \), and the dotted lines correspond to constant \(\kappa \). The reference Sharpe ratios averaged over 1000 simulations in the absence of the dragon-slayer are shown in blue-grey for the fundamentalists and green-grey for the noise traders. In each panel, a single parameter is varied, while the other parameters have the default values from the parameter set \(a=0.98\), \(w_y=0.035\), \(l_y=0.008\), \(\bar{r}=0.00016\) (corresponding to an annualised return of 4%), and \(s=0.0005\). The Sharpe ratios are annualized values calculated over the time interval from \(t=5000\) to \(t=17500\). In the bottom panel, the range of variation of \(\bar{r}\) from 0.00008 to 0.00024 (daily) corresponds to a range from 2 to 6% annualised

The second panel of Fig. 9 shows the Sharpe ratios of the three trader types with 10% dragon-slayer as a function of the overpricing threshold \(l_y\in [0.00016, 0.0008, 0.0032, 0.008, 0.016]\). With the long-term daily growth rate \(r_d=0.00016\) (4% annualised), this is equivalent to an excess return threshold between \(r_d\) and \(100 \cdot r_d\). We find that the risk-adjusted return of the three traders is larger with the larger thresholds \(l_y\). With small \(l_y\)’s, the dragon-slayer tends to overreact to small deviations of the risky asset price from the long-term trend controlled by the average return \(r_d\). Thus, for \(l_y=0.0032\), the dragon-slayer performs even worse than the other traders. The fundamentalists’ average Sharpe ratio slightly decreases to 0.5342 when a dragon-slayer with \(l_y=0.0032\) is present in the market. The threshold \(l_y=0.0032\) means that the dragon-slayer considers a persistent excess return of 0.32% per day to be unsustainable. While this is 20× the long-term growth rate, it is significantly smaller than the expected daily volatility, which is approximately 1%. Thus, the dragon-slayer jumps frequently between the buying and selling regimes of the risky asset, which are illustrated in Fig. 1 outside of the dotted lines representing the threshold levels. These rapid portfolio reallocations result in a destabilisation of the market. For thresholds that are larger than 0.5%, the presence of the dragon-slayer improves the performance of the other traders. The conclusion, which should not be a surprise, is that the dragon-slayer should err on the side of discriminating diagnostics of bubbles to avoid over-reacting to too many false positives.
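To make the orders of magnitude in this paragraph explicit, the short sketch below expresses each scanned threshold as a multiple of the long-term daily growth rate and as a fraction of the approximate daily volatility. The numerical values are those quoted in the text; the function name `threshold_table` and the dictionary layout are illustrative choices.

```python
R_D = 0.00016      # long-term daily growth rate (4% annualised), from the text
DAILY_VOL = 0.01   # approximate daily volatility quoted in the text
THRESHOLDS = (0.00016, 0.0008, 0.0032, 0.008, 0.016)  # scanned values of l_y


def threshold_table(thresholds=THRESHOLDS, r_d=R_D, daily_vol=DAILY_VOL):
    """Express each threshold relative to the growth rate and to the volatility."""
    rows = []
    for l_y in thresholds:
        rows.append({
            "l_y": l_y,
            "multiple_of_r_d": round(l_y / r_d, 6),
            "fraction_of_daily_vol": round(l_y / daily_vol, 6),
        })
    return rows


for row in threshold_table():
    print(row)
```

The three smallest thresholds sit well below one third of the daily volatility, so that noise alone can push a short-memory momentum estimate across them; this is one way to read the overreaction described above.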

The weight \(w_y\) enters the determination of the expectation \(E_{r_t}^{d}\) of the return by the dragon-slayer in expression (21). A dragon-slayer who chooses \(w_y=0\) is identical to the fundamentalists, while the larger \(w_y\) is, the more he is concerned with bubbles. Thus, \(w_y\) controls the amplitude of the expectation of the dragon-slayer concerning the market return, as shown in Fig. 1. The third panel of Fig. 9 shows that, for OU \(\kappa \), a medium weight value between \(w_y=0.05\) and 0.1 is optimal for all traders. For all analyzed values of \(w_y\), the traders obtain better Sharpe ratios than without the dragon-slayer. However, for constant \(\kappa \), large weights (\(w_y>0.05\)) slightly decrease the risk-adjusted return of the traders compared to the reference value, because the dragon-slayer relies less on the stabilizing fundamentalist strategy of investing proportionally to the dividend-price ratio.

The parameter \(s\) enters the definition of the probability \(\lambda _t\) that the overpricing will result in a crash according to expression (20). It controls the reaction to changes in \(y_t\) near the threshold \(l_y\). As illustrated in Fig. 1, a large \(s\) results in a slower and smoother reaction, while a small \(s\) results in an immediate readjustment of the portfolio when the threshold \(l_y\) is reached. The fourth panel in Fig. 9 shows that the Sharpe ratios of all traders are monotonically decreasing as a function of \(s\). Thus, smaller values of \(s\) are beneficial for all trader types, implying that the traders benefit from a dragon-slayer who reacts determinedly to the detected overpricing. The Sharpe ratios for constant \(\kappa \) are not influenced by a change of \(s\), because \(y_t\) fluctuates around 0 and only the reaction to larger deviations is influenced by \(s\). For OU \(\kappa \), the traders benefit from the presence of the dragon-slayer for all analyzed values of \(s\). The Sharpe ratios are on average significantly larger than in the reference simulations without the dragon-slayer.
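The role of \(s\) as a smoothing scale can be sketched with a generic logistic function centred on the threshold. The functional form below, \(\lambda = 1/(1+e^{-(y-l_y)/s})\), is an assumption made for illustration only: expression (20) is not reproduced in this section, and the actual definition may differ, for instance in how negative deviations are treated. The function name `crash_probability` is hypothetical.

```python
import math


def crash_probability(y, l_y=0.008, s=0.0005):
    """Logistic crash probability as a function of the excess return momentum y.

    Assumed form (illustrative, not the source's expression (20)):
        lambda = 1 / (1 + exp(-(y - l_y) / s))
    """
    z = (y - l_y) / s
    # Branch on the sign of z so that exp() is only called on non-positive
    # arguments, which avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Compare a sharp (small s) and a smooth (large s) response around l_y = 0.008.
for y in (0.006, 0.0075, 0.008, 0.0085, 0.010):
    sharp = crash_probability(y, s=0.0005)
    smooth = crash_probability(y, s=0.005)
    print(f"y = {y:.4f}: s=0.0005 -> {sharp:.3f}, s=0.005 -> {smooth:.3f}")
```

With this assumed form, a small \(s\) makes the probability switch almost step-like at \(l_y\), whereas a large \(s\) spreads the transition over a wide range of \(y_t\), in line with the qualitative description above.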

The bottom panel of Fig. 9 shows the average Sharpe ratios of all traders as a function of the expected long-term growth rate \(\bar{r}\) of the risky asset. In the default parameter setting, the dragon-slayer uses the true long-term growth rate of the market, which is equal to the dividend growth rate \(r_d\). This is identical to the fundamentalist strategy. However, in real markets, it is difficult to have an accurate estimation of the long-term growth rate of an asset. The figure shows that, even with a wrong estimation of the true growth rate, the traders benefit from the presence of the dragon-slayer. When the dragon-slayer underestimates the growth rate and uses 0.00008 (2% annualised) or 0.00012 (3% annualised) instead of the correct 0.00016 (4% annualised), the traders perform even better than with the true growth rate. When choosing a smaller \(\bar{r}\), the overpricing \(y_t\) increases by the difference between \(\bar{r}\) and the real growth rate, because it is defined as the exponential moving average of the difference between observed return and expected long-term return. However, with an inaccurate estimation of \(\bar{r}\), the reaction to bubbles is not symmetric anymore. Thus, choosing a smaller \(\bar{r}\) has the same impact as shifting the threshold \(l_y\) to the left. In any case, the most important practical conclusion from the simulations presented in this bottom panel is that the beneficial influence of the dragon-slayer is robust with respect to an error of up to 50% (2–6% around the true 4%) in the estimation of the long-term growth rate. This is not surprising given that transient bubbles are characterised by much larger short-term growth rates, which make their detection robust with respect to a miscalibration of the long-term growth rate.
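The size of the shift induced by a miscalibrated \(\bar{r}\) can be checked with a small sketch. It assumes the same illustrative recursion \(y_t = a\,y_{t-1} + (1-a)(r_t-\bar{r})\) with \(y_0=0\), which is not taken from expression (19), and a simple (non-compounded) annualisation with 250 trading days, which reproduces the percentages quoted in the text. The function names are hypothetical.

```python
TRADING_DAYS_PER_YEAR = 250  # 12,500 steps are stated to correspond to 50 years


def annualise(daily_rate):
    """Simple (non-compounded) annualisation of a daily rate."""
    return daily_rate * TRADING_DAYS_PER_YEAR


def momentum_shift(a, r_true, r_used, steps):
    """Offset in y_t caused by using r_used instead of r_true.

    Under the assumed recursion y_t = a*y_{t-1} + (1-a)*(r_t - r_bar) with
    y_0 = 0, the offset between the two estimates obeys
        d_t = a * d_{t-1} + (1 - a) * (r_true - r_used),
    independently of the realised returns r_t.
    """
    d = 0.0
    for _ in range(steps):
        d = a * d + (1.0 - a) * (r_true - r_used)
    return d


for r_used in (0.00008, 0.00012, 0.00016, 0.0002, 0.00024):
    shift = momentum_shift(a=0.98, r_true=0.00016, r_used=r_used, steps=2000)
    print(f"r_bar = {r_used:.5f} ({annualise(r_used):.0%} annualised): "
          f"shift in y_t = {shift:+.5f}, i.e. {shift / 0.008:+.1%} of l_y = 0.008")
```

Under these assumptions, the largest miscalibration considered shifts \(y_t\) by 0.00008, about 1% of the default threshold \(l_y=0.008\), which is consistent with the robustness reported above.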

In summary, Fig. 9 shows that the dragon-slayer strategy significantly improves the performance of the other traders over a wide range of strategy parameters. Shorter memory lengths, larger overpricing thresholds, a medium weight on the bubble and crash diagnostic, and a small expected long-term growth rate bring the largest benefit to the three trader types.

5 Conclusion

We have presented an extension of a previously studied agent-based model (ABM) originally developed by Kaizoji et al. (2015), which is characterised by the spontaneous formation of bubbles and crashes emerging from the interaction between fundamentalists and noise traders. We have introduced a third type of trader, called the dragon-slayer, who represents a policy maker with the objective of preventing bubbles and crashes by trading between a risky and a risk-free asset. The dragon-slayer diagnoses burgeoning bubbles by forming an expectation of the future return of the risky asset in the form of an exponential moving average of the excess return over the long-term return. When this excess return momentum exceeds a threshold that the dragon-slayer estimates as an unsustainable level, he forms a prediction that a crash may happen with a probability given by a logistic function of the excess return momentum. Equipped with this bubble diagnostic, the dragon-slayer constructs his trading strategy similarly to the fundamentalists but with the advantage of using a real-time dynamical estimation of a transient expected excess return. Specifically, the policy maker invests in the risky asset when he detects a sufficiently large deviation of the average excess momentum from the long-term growth rate in order to construct an inventory that he draws upon later to fight future market exuberance. Then, when this deviation between the current growth rate and the long-term growth rate exceeds the policy maker’s tolerance level, he starts to sell the risky asset that he has accumulated earlier, in a countercyclical manner, to prevent future price increases.

We have found that the dragon-slayer succeeds in preventing bubbles and crashes in the ABM. In simulations where market parameters prevent the formation of bubbles, the dragon-slayer behaves similarly to the fundamentalists and his impact is negligible. This is a good property in the sense that any cure should first follow the principle of “Primum non nocere” (first, do no harm). In simulations where bubbles form spontaneously as a result of the noise traders’ strategies, the average drawdown is decreased from 57% in the absence of the dragon-slayer to 33% when the dragon-slayer is initialized with 50% of the total wealth, so that his market impact is very significant. Concomitantly, the average number of peaks is reduced by 5.7%. The stabilising effect of the intervening policy maker is also reflected in the return dynamics of the risky asset. An initial wealth fraction of just 5% for the dragon-slayer reduces the variance of the return by 28%. A larger wealth fraction reduces the variance by up to 62%. Furthermore, the skewness of the returns is pushed closer to zero, and the dragon-slayer significantly decreases the excess kurtosis and the absolute value of the 1%-VaR.

We find that, while removing bubbles, the dragon-slayer strategy does not affect the long-term growth rate of the risky asset. For all analyzed scenarios, the growth rate is found to be close to the theoretical value, which is determined by the growth rate of the dividend process underlying the risky asset. This combination of fewer bubbles and crashes, and the stability of the long-term growth rate, leads the dragon-slayer to provide improved performance for the other traders in terms of their risk-adjusted return. Thus, the dragon-slayer increases the wealth of all market participants.

The economic mechanism explaining the successful reduction of bubbles and crashes is simply based on the market impact of the dragon-slayers and the robustness of their diagnostic of both the nucleating period of bubbles and the times when they become ripe. The build-up of the inventory of the risky asset during the ascending phase of the bubble is done sufficiently smoothly so as not to exacerbate the bubble dynamics too much. The progressive sell-off by dragon-slayers when they deem that the market has appreciated too much beyond the long-term expected return ensures an orderly deflation of the bubble, again via their market impact.

Finally, we have tested the sensitivity of these results to variations of the key parameters of the strategy of the dragon-slayer. We investigated the average Sharpe ratios of fundamentalists and noise traders with a dragon-slayer possessing a wealth amounting to 10% of the total wealth. We found that the dragon-slayer strategy is beneficial to the other traders over a wide range of parameters of his strategy. In general, a faster reacting dragon-slayer with a shorter memory length provides the largest benefit to the other traders. Furthermore, the traders perform better in the presence of the dragon-slayer in the whole analysed range of expected growth rates of the risky asset. Thus, a dragon-slayer who uses a miscalibrated market growth rate that is as much as 50% off the true growth rate still provides a significant increase of the investment performance for all traders.

In sum, our simulations have shown that direct intervention in the stock market to prevent bubbles and drawdowns can be very effective and beneficial for all involved traders.