1 Introduction

Stock markets are a key component of an economy based on the principles of capitalism. Their primary function is to provide a straightforward and convenient mechanism to transfer ownership of an asset claim in a formal trading venue that is governed by rules of trade. A trade is consummated when a price is agreed upon. Prices, however, often exhibit rapid fluctuations that at times are characterized by large and often unexpected changes. Taken together, a sequence of price changes involving a broad group of stocks in a relatively short time period that results in a meaningful cumulative decrease in their paper values is referred to as a market crash. As Glasserman and Young (2016) point out, interconnectedness is a defining characteristic of global and domestic modern financial systems. Although this phenomenon may provide transactional benefits, it also enhances the fragility of the system through common risk exposure via the ownership of similar assets, liquidity shocks, and macroeconomic shocks in general; network complexity makes these risks, and the market crashes they can trigger, difficult to manage.

The purpose of this paper is (1) to model the price movements exhibited by individual stocks during the 2010 Flash Crash using a Hawkes process excitation matrix and (2) to interpret its entries in the context of complex networks and crowd behavior from the Granger causality and Adaptive Market Hypothesis perspectives. The analyses provide insights into the development of market design initiatives, trading strategies, and risk management methods that incorporate an intra-day, endogenous perspective.

Stock market crashes are not uncommon. In addition to the Flash Crash, recent crashes include Black Monday (October 19, 1987), the Dotcom Bubble Burst (March 10, 2000), the 2008 Financial Crisis (September 29, 2008), and the 2020 Pandemic (February 19, 2020). We explore the nature of the Flash Crash because its short length (36 min from the start through recovery) severely limits the number and resultant impacts of external factors, such as market intervention by government or stock exchange regulators that might affect the behavior of the market during its drawdown and subsequent recovery.

We focus on the 30 stocks that make up the Dow Jones Industrial Average (DJIA). The DJIA companies are large, blue chip firms chosen to represent the bulk of US economic activity. The Hawkes excitation matrix measures the influence these 30 stocks have on each other and themselves and how these influences may evolve over time as traders learn about and react to the ever-changing environment. We use individual stock sub-second transactions data, which are the most granular data available at the time of the crash. This granularity permits the analysis of sequential transactions that often occur within milliseconds of each other, making it possible to construct a complex system containing multiple feedback loops that depict the impacts of a stock not only on itself but also on the other 29 stocks in the network. Thus, we provide a new lens on the behavior of stocks during the Flash Crash, which will help to understand the nature of other market crashes and other disruptions.

The Hawkes (2018) process is a stochastic model that describes the frequency of events within a specific time interval, allowing the occurrence of one or more of these events to change the likelihood of triggering further events in the near future. Because of this characteristic, Hawkes processes have been used to quantify the endogenous and exogenous price effects in various asset trading markets. Our main focus is on the endogenous effects measured by the excitation matrix, a square matrix whose columns represent the stocks that are the influencers and whose rows indicate the stocks being influenced. A stock’s self-influence is measured by the matrix’s principal diagonal. To provide a dynamic perspective, we estimate the excitation matrix using a rolling sample. This approach allows us to examine how the excitation matrix evolves before, during and after the crash. Similar to any other statistical model, however, the Hawkes process mines the statistical association between the stocks but it cannot reconstruct the true causal relationship between them in terms of specific types of human cognitive biases.

Nevertheless, Etesami et al. (2016) demonstrate that the specification of the linear multivariate Hawkes model permits us to interpret the feedback relationships in the Granger-cause sense (Granger 1969, 2004). We analyze these relationships from the perspective of the Adaptive Market Hypothesis (Lo 2004), which uses the ideas and concepts associated with behavioral finance to help understand why stock markets behave as they do. The Adaptive Market Hypothesis recognizes that traders, being human, are not perfectly rational, although at times they might act as if they were, and are subject to social and psychological biases. These biases may affect their assessments of current and future stock prices and these assessments may change as trading progresses.

We find that the influences between the 30 DJIA stocks increase on average during the Flash Crash and then revert close to their pre-crash level. The level of influence, however, varies greatly among stocks before, during and after the crash. Moreover, stocks that are strongly (weakly) influenced by other stocks are only weakly (strongly) influenced by their past behavior. Furthermore, although the industry sector of the 30 stocks does not seem to have an effect on the influences between them, the specific trading venue in which the transaction occurs does have a noticeable impact, suggesting that at times the ability to provide liquidity may vary among venues, which supports the notion of fragmented markets. These influence differences suggest the possibility that there may be worthwhile dynamic diversification strategies and market microstructure policies that have not yet been fully identified or exploited.

The remainder of the paper is divided into five sections. In Sect. 2, we present background information related to the 2010 Flash Crash and discuss possible price discovery network implications. We devote Sect. 3 to describing the role of complexity in the stock market, including the specification of the multivariate Hawkes model and how trader behavior can be thought to be embedded in a model’s excitation matrix.Footnote 1 In Sect. 4, we discuss our data and their source, as well as provide several important descriptive statistics and caveats where appropriate. We also illustrate the interpretation of the excitation matrix and its corresponding network diagram for a subsample of DJIA stocks. We present our empirical results in Sect. 5 in three parts: (1) the 30 DJIA as a group using a simple average, (2) the 30 individual stocks and (3) the 30 stocks divided into industry sector and trading venue communities. In the last section, Sect. 6, we discuss our findings in a broader economic context, giving emphasis to their potential usage in portfolio risk measurement and management, stock index construction and the impact of various types of composition changes, and implications for fragmented markets.

2 Crashes and price discovery implications

Crashes and their recoveries may be long or short and may or may not be followed by an economic recession or even a depression. Moreover, other than using arbitrary rules of thumb, it is often difficult to determine precisely when crashes begin and end. Numerous studies have been conducted to pin down the cause(s) of these crashes and many possibilities have been suggested. The cause(s) may be large or small, exogenous or endogenous, but, whatever the size or origination, the cause(s) must be able to promote the strong need for many stockholders, with large or small positions, to attempt to divest themselves of their holdings, at least temporarily. We focus on the timelines and possible causes of flash crashes suggested in the related literature, as well as the implications for modeling crash behavior.

2.1 2010 Flash crash

The Flash Crash began on Thursday, May 6, 2010. US stock markets had been highly volatile since their openings as a result of, some believe, the disappointing economic news from Europe concerning its then-ongoing Greek debt crisis. As a result, O’Hara (2015) points out, buy and sell orders were becoming increasingly unbalanced. The Commodity Futures Trading Commission (CFTC) and the Securities and Exchange Commission (SEC) in joint reports (CFTC-SEC 2010a; b) indicate that the crash began about 2:32 p.m. EST and lasted until 3:08 p.m. or 14:32 to 15:08 in 24-h clock time. For the next 13 min or so after 14:32, stock prices continued to drop, with the largest declines occurring in the last few minutes. At the nadir of the crash, roughly $1 trillion in the paper value of market capitalization was lost. By 14:45 markets began to stabilize, although there were still extreme movements in the prices of some stocks. Between 15:00 and 15:08 (the generally agreed upon end of the recovery), markets became noticeably less volatile as prices approached their pre-crash values and orderly trading resumed, although it took four additional trading days to reach May 10th’s high. No external stimulus, government or otherwise, was involved and the decrease was not quite large enough to trigger the market-wide circuit breaker that was instituted as a result of the 1987 Black Monday Crash. In the following days, stock market trading and prices appear to have proceeded in a “normal” manner and an economic depression or even a recession did not follow the day’s seemingly unexpected and unsettling intraday event.

The cause(s) of the Flash Crash is unclear. The joint reports (CFTC-SEC 2010a; b) attribute the start of the crash to a large fundamental trader placing an order for 75,000 ($4.1 billion) E-mini S&P 500 futures contracts to hedge against an existing equity position. Aldrich et al. (2017), however, suggest that the flash crash was the result of the continued presence of many large sell orders and the corresponding widespread withdrawal of liquidity, i.e., the decrease in the number of contracts quoted close to the best price. Similarly, Easley et al. (2012) posit that this decrease may have resulted from cross-arbitragers withdrawing from the market because they were unable to make a profit on their transactions. They also indicate that the Volume-Synchronized Probability of Informed Trading (VPIN), a statistical measure of market toxicity, increased about 60 min prior to the start of the crash, thereby signaling a possible short-term liquidity crisis. Menkveld and Yueshen (2019) highlight the impacts of a fragmented marketplace on the flash crash from a cross-arbitrage perspective using E-mini contracts that trade on the Chicago Mercantile Exchange and S&P 500 ETF Trust (SPY) positions that are tradable on eight different exchanges. They suggest that liquidity differences among the various SPY trading venues may have contributed to the crash, although Kirilenko et al. (2017) document that the trading pattern of high-frequency traders did not change during the crash.

2.2 Price discovery hypotheses and stock markets crashes

The function of a stock market is to bring together various traders who want to buy or sell stock so that they can jointly determine the value of a specific stock. Bid prices and ask prices are made by buyers and sellers, respectively. When a bid and an ask meet, a transaction occurs. A market for a stock is considered liquid if there are many posted bid and ask prices that are very close to each other. An illiquid market is one in which there are only a very small number of bid prices and ask prices quoted and these quotes are far apart. In this type of market, a transaction will not be made unless there are major price concessions by one or both of the two parties. This buyer–seller interaction, whether or not there is a transaction, is typically referred to as the price discovery process. There are three main hypotheses concerning the way that investors engage in determining what they believe to be the true price of a stock: (1) the Efficient Market Hypothesis, (2) the Fractal Market Hypothesis, and (3) the Adaptive Market Hypothesis.

The Efficient Market Hypothesis (EMH) is the foundation of much of current finance theory and its relevant applications. It maintains that stock market participants make transaction decisions based on rational expectations created by rational individuals using all relevant and available information. The EMH is validated, at least on average, by examining actual prices and how quickly information is incorporated into them. For example, for the weak version of this hypothesis (the information set contains only market generated information), validation occurs when daily stock prices are characterized by a random walk, which in economic parlance means that the stock market is “efficient.”Footnote 2 Market efficiency implies that each transaction price is an equilibrium price because it considers all necessary information. Thus, Fama (1998) maintains that instances in which the random walk hypothesis does not appear to explain the behavior of stock returns, such as stock market crashes, should be considered anomalies, i.e., situations that occur by chance and should be discarded. After all, he (1998, p. 291) opines “…a model should be judged … on how it explains the big picture.”

The Fractal Market Hypothesis (FMH) developed by Peters (1991, 1994) relies on Mandelbrot’s (1963) concepts that are embedded in fractal geometry.Footnote 3 In the stock market context, the geometric shape is usually thought of as a triangle with its left (right) side depicting a positive (negative) price change and its base measured in units of time, which, in a general sense, is referred to as scale (Mandelbrot 1963, 2001; Mandelbrot and Hudson 2004). Triangles with different scales are combined to create patterns that resemble stock movements over time and are characterized by the presence of continuous discontinuities. The FMH posits that a stock market consists of many types of investors who have different investment horizons (scales) with correspondingly different information needs, although all investors may be aware of all available information.Footnote 4 For instance, fundamental investors may consider their time frame to be the business cycle while short-term traders may focus only on the day in which they are trading. Any significant change in trading horizons by either group may result in an overall shift of demand and supply of various stocks and this shift is reflected in stock price turbulence, which, among other factors, may reflect a lack of liquidity. As a result, the FMH is consistent with the notion that market crashes are not anomalies. Nevertheless, it is silent on the way that traders make the transition from one investment horizon to another.Footnote 5

The Adaptive Market Hypothesis (AMH), which is largely attributed to Lo (2004), rests firmly on the notion that traders are human, have various cognitive biases, and do not operate in isolation but instead engage in various types of crowd behavior. In doing so, they not only gather objective information but also learn from others and other traders learn from them ad infinitum. Lo (2004) mentions that these biases include, but are not limited to, overconfidence, overreaction, loss aversion, herding (a.k.a. flocking), mental accounting, and regret.Footnote 6 Bouchard (2010) emphasizes that herding is an important, persistent bias and typically is a result of traders imitating each other, a behavior that reflects traders’ concern that other traders may be smarter or better informed. Burztyn et al. (2014) maintain that the learning channel between traders is more effective when those who are less informed observe the behavior of traders who are thought to be more sophisticated (successful).Footnote 7,Footnote 8 Moreover, Park and Sabourian (2011) show that traders herd when information is widely dispersed and, consequently, believe that extreme outcomes are likely. In this case, herding is a rational response in the “bounded” sense that tends to result in more volatile stock prices and lower liquidity and, thereby, is self-reinforcing.Footnote 9 In summary, traders’ beliefs concerning stock prices may contain biases and these beliefs may evolve over time as trading progresses.

From a philosophical point of view, hypotheses are formal conjectures that are made to explain what is observed. Before a conjecture is accepted as potential truth, it should be based on logical reasoning and subject to rigorous testing. As more data become available and new tests are devised, existing hypotheses that previously could not be rejected are found to be acceptable under certain conditions or rejected in their entirety. Thus, a hypothesis, unless it is shown to be true by irrefutable evidence available now or in the future, such as a mathematical proof, can never be considered true in an absolute sense (Popper 1959/2002).

How do the EMH, FMH, and AMH fare with respect to Popper’s (1959/2002) philosophic view? Despite his defense of the EMH, Fama (1998) characterizes stock market crashes as anomalies. Kuhn (1970), however, believes evidence that is not consistent with the current paradigm should cause the emergence of new paradigms. Christensen and Raynor (2003, p. 27) agree with this sentiment and believe that instead we should do “…anomaly-seeking (italics theirs) research, not anomaly-avoiding research.” In contrast, the FMH tackles the issue of crashes by stating that they are not anomalies but are a result of investors moving from one fractal scale to a fractal with a smaller scale and eventually returning to their original scale. At the present time, there does not appear to be an in-depth analysis of the ways in which investors obtain the information relevant to changing fractal scales nor how they accomplish this change over time. The AMH differs from both the EMH and the FMH because it involves sequential investor learning not only from outside sources but also the behavior of other investors through feedback loops and cognitive biases.Footnote 10

AMH’s perspective on human behavior is compatible with the econophysics view that stock markets are complex systems. For instance, Sornette (2003/2017, p. 280) maintains that a stock market is a complex system and, similar to other complex systems, “…has an endogenous origin and exogenous shocks only serve as triggering or tipping point factors.” The endogenous origin refers to the cooperative behavior of the market’s participants, and the exogenous shocks to events that occur external to the market. Arthur (1999) opines that the notion of complex systems applied to a stock market contradicts the EMH because complexity in this context involves cooperative behavior of the market’s participants.

3 Stock market complexity, modeling, and human behavior

In this section, we discuss market complexity in terms of a network, its relationship to the concepts of endogenous and exogenous, and how we statistically measure and interpret these results from the perspective of the AMH.

3.1 Complex networks

Much of the literature on complex systems focuses on physical or biological topics.Footnote 11 Ladyman et al. (2012), however, maintain that important complex features are also found in social systems, including stock markets. They indicate that these markets are highly complex because there are numerous potential buyers and sellers who often randomly interact with one another, thereby providing ongoing feedback to each other and themselves. Buyers and sellers need not be human; transactions can be undertaken by algorithms developed by humans or by artificial intelligence. It is the feedback process that creates the endogeny present in the stock market. This process, however, is dynamic and traders may face Knightian uncertainty, i.e., the probability of stock price moves or the factors that might cause these changes are unknown. As a result, these traders learn about and adapt to the changing market environment, which is sometimes made more difficult in a changing technological environment.Footnote 12

According to Kuhlmann (2014), crashes can occur endogenously or exogenously. Endogeny results in a market where traders search for an equilibrium in a stock’s price but never attain it. As a result, extreme events like crashes can occur when some of the traders in the system adapt to or adopt the behaviors of other traders. In this regard, Ladyman et al. (2012) maintain that, although traders may interact with each other in a way that is disordered, their actions as a group are organized because the traders not only communicate information to each other, but they do so by using a defined set of formal and informal rules. Frank et al. (2019) provide evidence that in upstairs markets traders interact with each other and sometimes appear to engage in reciprocal agreements, especially if the traders belong to the same investment house.

Any specific exogenous event can be relatively large or small. Yet, regardless of the event type, it acts as a trigger that initiates a crash with a significant number of stockholders trying to sell all or a significant number of their stock holdings, thereby driving stock prices down. Fleeing from the market is an example of what psychologists have dubbed “fight or flight” behavior, the two physiological responses by humans facing real or perceived danger (Cannon 1929).Footnote 13 In this context, the act of fleeing (selling) is, according to Goldstein and Kopin (2007), ingrained behavior triggered by a person’s sympathetic nervous system. This behavior does not depend on whether the event or events actually occur; it can also arise from a fear (or an anxiety) that they will occur. From this perspective, market recoveries occur in the same manner. Previous stockholders seek to regain their earlier exposure to equities or new stockholders enter the market because they fear missing the possibility of significant profits in the future.Footnote 14

Some types of events can be either endogenous or exogenous, and their designation depends on the context in which they occur. For example, the lack of liquidity is often proffered as a potential cause of a market crash. Liquidity could be exogenously decreased by a halt in trading as a result of a circuit breaker being activated. It could also be endogenously lessened by market-makers reducing their market presence because of dwindling trading profits. The first reason is the result of market microstructure rules and the second by market toxicity, i.e., an imbalance between buyers and sellers. With respect to toxicity, Liu et al. (2021) provide empirical evidence that there is a strong negative relationship between herding and liquidity.

How does this behavior manifest itself in a stock market context? We address this issue in the next section, which focuses on the ways in which individual stocks are connected in complex networks and how these connections exhibit their strength. According to Kuhlmann (2014, p. 1124), these networks seem to be able to “…identify the common underlying structural mechanism.”

3.2 Statistical methods and approaches

We use the multivariate Hawkes process to model the complex microstructure behavior of the DJIA 30 stocks. Hawkes (2018) provides a brief summary of his process and gives a useful and extensive bibliography pertaining to the model’s development and finance applications, with a special mention of Bacry et al. (2015) for high-frequency applications.Footnote 15 Previous applied research in this area that is particularly relevant to our study includes Filimonov and Sornette (2012) and Aït-Sahalia et al. (2015). Filimonov and Sornette (2012) examine the behavior of the E-mini S&P 500 contract, which is traded on the Chicago Mercantile Exchange, from 1998 to 2010. They report that over this period the portion of the price changes attributed to the endogeny of this financial instrument increased dramatically and reached almost 100% during the Flash Crash. Aït-Sahalia et al. (2015) extend this approach by combining Hawkes and diffusion processes to model the joint time series behavior of the S&P 500 (U.S.), FTSE 100 (U.K.), Nikkei 225 (Japan), Hang Seng (Hong Kong), and IPC (Mexico) stock indexes.Footnote 16 They use daily open and close data for various sub-periods (because of lack of data for some indexes) within the overall time span beginning January 2, 1980, and ending April 30, 2013. They present significant evidence of endogenous behavior within each market and similar relationships between various market pairs, with the latter phenomenon suggesting the presence of contagion among the markets examined.

In contrast to the above studies, we model the activity level of each individual stock and the timing of its activities. For each stock, we study its price-changing events: a price-changing event is a transaction at a price different from that of its immediately previous transaction. Without ambiguity, we refer to such price-changing events simply as events. These events act as a proxy for a change in the traders’ view of the paper value of the stock in question as a result of their on-going learning experiences. O’Hara (2015) maintains that positive price changes signal a mixture of good news and negative changes are consistent with bad news, with both types of price changes containing some noise. Passive trades are transactions not associated with a price change. Because there are multiple stocks considered, price-changing events from each stock are treated as unique.
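As a concrete illustration of this event definition, the following sketch (with hypothetical tick data and a field layout of our own choosing) filters a trade tape down to price-changing events:

```python
# Keep a trade only if its price differs from the immediately previous
# trade's price; trades at an unchanged price are passive and dropped.
# The tick data below are hypothetical (timestamp in seconds, price in dollars).
trades = [
    (52320.001, 40.10),
    (52320.004, 40.10),  # passive: same price as previous trade
    (52320.007, 40.09),  # price-changing event
    (52320.012, 40.09),  # passive
    (52320.015, 40.11),  # price-changing event
]

def price_changing_events(tape):
    events = []
    prev_price = None
    for t, p in tape:
        if prev_price is not None and p != prev_price:
            events.append(t)  # record only the event time
        prev_price = p
    return events

print(price_changing_events(trades))  # [52320.007, 52320.015]
```

In a multivariate setting, this filter would be applied to each of the 30 stocks separately, yielding one event-time sequence per stock.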

3.2.1 The Hawkes modelFootnote 17

Mathematically, let \(t_{i}^{s}\) denote the time of the ith event of stock \(s\) with the stocks indexed from 1 to 30, i.e., \(s \in \left\{ {1,2, \ldots ,30} \right\}\). The Hawkes process assumes that at any time t, the probability \(P_{s}\) that an event of stock s will occur in the next dt time units is determined by the instantaneous rate \(\lambda_{s} \left( t \right)\): \(P_{s} \approx \lambda_{s} \left( t \right) \cdot dt\). The rate of events \(\lambda_{s} \left( t \right)\) is modeled as a function of the occurrences of previous events from all the stocks (including itself):

$$ \lambda_{s} \left( t \right) = \mu_{s} + \mathop \sum \limits_{{s^{\prime} = 1}}^{30} \mathop \sum \limits_{{i:t_{i}^{s^{\prime}} < t}} a_{ss^{\prime}} g\left( {t - t_{i}^{s^{\prime}} } \right), $$
(1)

where (\(i:t_{i}^{s^{\prime}} < t\)) corresponds to all the events of stock \(s^{\prime}\) that occurred before time t; \(a_{ss^{\prime}} \) captures the impact of stock \(s^{\prime}\) on stock s; \(\mu_{s}\) is the baseline rate of events of stock s that is independent of previous events; and \(g\left( {t - t_{i}^{s^{\prime}} } \right)\) is the memory kernel that models how the effect from each previous event decays over time. More detailed descriptions of the variables are given in Table 1.
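As a minimal numerical sketch of Eq. (1), the code below evaluates \(\lambda_{s}(t)\) for a toy two-stock system. For illustration we assume an exponential memory kernel \(g(\tau) = \beta e^{-\beta \tau}\) (a common choice, not necessarily the one used in the paper), which integrates to 1 so that \(a_{ss^{\prime}}\) retains its interpretation as the expected number of triggered events; all numerical values are hypothetical.

```python
import numpy as np

def intensity(s, t, mu, a, event_times, beta=1.0):
    """Evaluate Eq. (1): lambda_s(t) = mu_s + sum over s', i of a[s, s'] * g(t - t_i)."""
    lam = mu[s]
    for s_prime, times in enumerate(event_times):
        # Sum the decayed influence of every past event of stock s_prime,
        # using the exponential kernel g(tau) = beta * exp(-beta * tau).
        lam += a[s, s_prime] * sum(
            beta * np.exp(-beta * (t - ti)) for ti in times if ti < t
        )
    return lam

mu = np.array([0.2, 0.1])            # baseline rates mu_s (hypothetical)
a = np.array([[0.3, 0.5],            # a[s, s']: impact of stock s' on stock s
              [0.0, 0.4]])
event_times = [[1.0, 2.5], [2.0]]    # hypothetical event times per stock

lam0 = intensity(0, 3.0, mu, a, event_times)
print(round(lam0, 4))                # about 0.6065
```

Note that the rate jumps upward at every event and then decays back toward the baseline, which is what allows one event to raise the near-term likelihood of further events.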

Table 1 Hawkes model symbols and definitions

The above model for \(\lambda_{s} \left( t \right)\) applies to each stock s, and models for different stocks are coupled together as the events of stock s explicitly depend on the events of other stocks \(s^{\prime}\). We can rewrite Eq. (1) compactly with matrices. Let \({\varvec{\lambda}}\left( t \right)\), \({\varvec{\mu}}\left( t \right)\), \(A\), and \({\varvec{g}}\left( t \right)\) represent the combined matrices for \(\lambda_{s}\), \(\mu_{s}\), \(a_{ss^{\prime}}\), and \(\mathop \sum \limits_{{i:t_{i}^{s^{\prime}} < t}} g\left( {t - t_{i}^{s^{\prime}} } \right)\), respectively. Eq. (1) then becomes:

$$ {\varvec{\lambda}}\left( t \right) = {\varvec{\mu}}\left( t \right) + A \cdot {\varvec{g}}\left( t \right). $$
(2)

Equation (2) highlights the important role that excitation matrix A plays in the Hawkes process. It is the crux of our analysis because it captures the interactions between all of the stocks. Etesami et al. (2016) point out that the excitation matrix is equivalent to a minimal generative model and this type of model, according to Quinn et al. (2011), measures directed information that can be interpreted as Granger causality (Granger 1969, 2004). Thus, A’s columns depict the stocks that trigger the effects, and its rows denote the stocks that are affected. Consequently, the principal diagonal represents the self-induced impact (self-influence) on each of the 30 stocks, and the other 870 cells in the matrix represent the impact of an individual stock on another individual stock (cross-influence). The number contained in each of the matrix’s 900 cells is approximately the average number of events of the corresponding row stock triggered by one event of the corresponding column stock.
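To make this reading of the matrix concrete, the sketch below builds a toy 3-stock excitation matrix with hypothetical values and extracts the self-influences (principal diagonal) and the cross-influence each stock receives (row sums) and exerts (column sums). As an additional diagnostic not discussed above, the spectral radius of A gauges the overall level of endogeny: values below 1 imply that each event triggers, on average, a dying-out cascade (a stable, subcritical process).

```python
import numpy as np

# Toy 3-stock excitation matrix (hypothetical values).
# Columns are the influencers, rows the influenced; entry [r, c] is roughly
# the average number of events of stock r triggered by one event of stock c.
A = np.array([
    [0.40, 0.10, 0.05],
    [0.05, 0.30, 0.20],
    [0.12, 0.18, 0.25],
])

self_influence = np.diag(A)                          # principal diagonal
influence_received = A.sum(axis=1) - self_influence  # cross-influence each row stock receives
influence_exerted = A.sum(axis=0) - self_influence   # cross-influence each column stock exerts

# Spectral radius of A: below 1 means the event cascades die out on average.
rho = max(abs(np.linalg.eigvals(A)))

print(self_influence.round(2))      # [0.4  0.3  0.25]
print(influence_received.round(2))  # [0.15 0.25 0.3 ]
print(influence_exerted.round(2))   # [0.17 0.28 0.25]
print(round(float(rho), 2))         # 0.55
```

In the paper's setting the same row and column summaries, computed on the estimated 30 by 30 matrix over a rolling window, give the per-stock influence measures discussed in the results.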

3.2.2 Model estimation

The model is estimated using the Bayesian method as adopted by Linderman and Adams (2014) and their Python package “pyhawkes” (https://github.com/slinderman/pyhawkes), which has been found efficient and reliable in discovering latent network structure in both synthetic and real data. Below we briefly describe the intuition behind their Bayesian inference method. (See Linderman and Adams (2014) for details.)

Let \(p(\left\{ {t_{i}^{s} } \right\}|{\varvec{\mu}}, A, {\varvec{g}})\) be the likelihood function of observing the sequences of events \(\left\{ {t_{i}^{s} } \right\}_{s = 1}^{30}\) given the parameters \({\varvec{\mu}}, A, {\varvec{g}},\) which is well-defined as a result of the assumptions of the Hawkes model. A traditional approach to estimating the parameters is to compute their maximum likelihood estimate by maximizing this likelihood function. Because of the complex structure of the likelihood function, however, there is no closed-form solution nor is there a straightforward approach to optimize it. Nevertheless, Linderman and Adams (2014) develop a novel and efficient Bayesian inference algorithm by taking advantage of the superposition property of the Hawkes model, which simply states that the total response equals the sum of the individual responses.

Before introducing the Bayesian inference method, we note one extra step in Linderman and Adams (2014) and decompose the excitation matrix A into two parts: \(A = A^{\prime} \times W\), where \(A^{\prime}\) is a binary matrix modeling the structure of the network (\(A^{\prime}_{ij} = 1\) if there is an edge between nodes i and j and \(A^{\prime}_{ij} = 0\) otherwise), and W is a non-negative weight matrix that models the strength of the edges between nodes. The advantage of this separation is that recent advances in random graph models can be used to describe the network structure and impose separate beliefs about the strength and the structure of the network as Bayesian priors.
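A small sketch of this decomposition with hypothetical values follows; we read the product elementwise, so that the weights are zeroed out wherever the structure matrix records no edge:

```python
import numpy as np

# Binary structure matrix A': which directed edges exist in the network.
A_prime = np.array([[1, 0],
                    [1, 1]])

# Non-negative weight matrix W: how strong each potential edge would be.
W = np.array([[0.4, 0.9],
              [0.2, 0.3]])

# Elementwise product: the excitation is W's strength where an edge exists,
# and exactly zero where A' records no edge.
A = A_prime * W
print(A)  # [[0.4 0. ]
          #  [0.2 0.3]]
```

Separating structure from strength lets the priors express distinct beliefs, e.g., that the network is sparse (few edges) while the edges that do exist are strong.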

Given the likelihood function \(p(\left\{ {t_{i}^{s} } \right\}|{\varvec{\mu}}, A^{\prime},W, {\varvec{g}})\), the Bayesian inference method proceeds as follows; it is an example of Gibbs sampling (a Markov chain Monte Carlo procedure) with modifications (Linderman and Adams 2015). In particular, assuming \({\Pi }\left( {\varvec{\mu}} \right)\) is a prior distribution of \({\varvec{\mu}}\), by Bayes’ rule, the posterior distribution of \({\varvec{\mu}}\) is: \(p({\varvec{\mu}}|\left\{ {t_{i}^{s} } \right\},A^{\prime},W, {\varvec{g}}) = p(\left\{ {t_{i}^{s} } \right\}|{\varvec{\mu}}, A^{\prime},W, {\varvec{g}}){{ \Pi }}\left( {\varvec{\mu}} \right)/p\left( {\left\{ {t_{i}^{s} } \right\}|A^{\prime},W, {\varvec{g}}} \right)\). We sample \({\varvec{\mu}}\) from this posterior as its estimate. Similarly, we derive the posterior distributions of other parameters \(A^{\prime},W, {\varvec{g}}\), respectively, and sample from their posteriors. This process is then repeated multiple times and the average of the samples is used as the estimate for each parameter.
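To make the posterior-sampling step concrete in the simplest conjugate case (our illustration of the general mechanics, not the package's internal algorithm): with a Gamma(a, b) prior on a scalar baseline rate and n events attributed to the background over a window of length T, Bayes' rule gives a Gamma(a + n, b + T) posterior, from which a Gibbs sweep can draw directly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Conjugate Gamma-Poisson update for a single baseline rate mu:
# prior Gamma(a, b) with shape a and rate b; n events attributed to the
# background process over a window of length T seconds (all values hypothetical).
a, b = 2.0, 1.0
n, T = 120, 60.0

# Posterior is Gamma(a + n, b + T); NumPy parameterizes by scale = 1 / rate.
samples = rng.gamma(shape=a + n, scale=1.0 / (b + T), size=10_000)

posterior_mean = (a + n) / (b + T)   # exact posterior mean: 122 / 61 = 2.0
print(round(samples.mean(), 2))      # close to 2.0 events per second
```

In the full model, one such conditional draw is made for each parameter block in turn (the baselines, the structure, the weights, the kernel), which is the Gibbs sweep described above.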

Linderman and Adams (2014) show that their method achieves better results in discovering network structures than standard estimation methods on both synthetic and real data. One contributing factor to their success is the direct modeling of network structure. Another advantage of their Bayesian method is the capacity to impose prior distributions on the parameters, which to some extent regularizes noise in the interactions between the stocks. In addition, the Bayesian approach is equivalent to updating the parameters, including the components of the excitation matrix, from samples of past values. Conceptually, this is similar to the human behavior underlying the AMH: both involve learning from the past and adapting behavior in an attempt to optimize outcomes.

4 Data and descriptive information

In this section, we describe the sample stocks and their sources and demonstrate with actual data how to use and interpret the excitation matrix. We also divide the Flash Crash timeline into five economic periods, provide some important statistics on price changes, and discuss the issues that arise when using statistics based on large samples.

4.1 Data source and composition

To explore the viability of the Hawkes processes to model the pre-crash, crash, and post-crash stock price behavior, we focus on the 30 stocks that comprised the Dow Jones Industrial Average (DJIA) at the time of the crash. These companies, which are shown in Table 2, are very large, publicly traded, and US-based, and most were originally listed on the New York Stock Exchange (NYSE), with the remainder on the Nasdaq exchange (NQNM).Footnote 18 Their stocks can be traded on their original listing exchange as well as on eight other exchanges, provided they hold a dual listing there. These venues include either NYSE or NQNM and seven smaller exchanges. Trades can also be made off-exchange, but these transactions must be reported to a separate unit overseen by the industry. The behavior of stocks not in the DJIA and of all other financial instruments related to any stock or group of stocks is considered exogenous, and their impacts are treated accordingly.

Table 2 The 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010, with selected industrial sector and price information

Virtually all of the 30 companies are household names and represent almost all of the major sectors in the US economy, i.e., consumer staples, industrial materials, financials, telecommunications, energy, consumer discretionary, information technology, and health care. Of the major sectors, only transportation and utilities are not represented by a stock in the index. As a group, the 30 DJIA companies accounted for approximately 22% of the market value of all traded US stocks around the time of the flash crash.

Our data were obtained from Nanex, which provides real-time option and stock price data via its NxCore product. Data are archived by Nanex as transactions and quotes arrive from the various exchanges and are time-stamped at millisecond time intervals. When the data were collected, Nanex’s timestamp was the most granular available. Nanex's data are generated by activity from all US exchanges where a given stock is traded, which is not necessarily where it was originally listed. The primary data extracted from the Nanex feed used in our analysis are transaction prices and their time stamps.

For each of the DJIA 30 stocks listed in Table 2, we provide its open, close, and low prices on May 6. The time each stock reached its lowest price is also included. We also plot the transaction price standardized by its open price for each of the 30 stocks in Fig. 1. For comparison, we include the standardized price series on the day (May 6, Fig. 1-middle) of the flash crash as well as the standardized prices from one day before (May 5, Fig. 1-left) and one day after (May 7, Fig. 1-right). There are noticeable breaks in the price series between the days. This is the result of the markets closing at the end of the trading day, thereby enabling the effects of news and overnight trading to be acted on by the market at its opening the next day.

Fig. 1

Prices for the DJIA 30 stocks on 5/5/2010 (Left), 5/6/2010 (Middle), and 5/7/2010 (Right) from 9:30 to 16:00 (x-axis) each day. The price series of each stock is standardized by its opening price in order to fit all the series in the same plot. Subtracting 1.0 from the result of this standardization procedure creates a measure of return based on the stock’s price at the beginning of the standardization period

The price series behavior on May 6 is markedly different from that of the adjacent two days, with large abrupt drops occurring around 14:30. Before and after the flash crash and its recovery, prices tend to move up and down in small increments and do not seem to follow a trend. Statistically, this pattern has often been modeled using a continuous-time Markov process with Brownian motion after converting the price series to continuous returns by taking the first difference of the natural logarithm of prices. Economically, this type of pattern is typically attributed to normal transaction activity, such as traders avoiding buying or selling an unusually large position in a short period of time.
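The two transformations mentioned here, standardization by the opening price (as in Fig. 1) and conversion to continuous returns, are straightforward; the prices below are hypothetical:

```python
import numpy as np

prices = np.array([40.00, 40.20, 39.90, 40.10])  # hypothetical trade prices

# Standardize by the opening price, as in Fig. 1; subtracting 1.0
# turns the standardized series into a return measured from the open.
standardized = prices / prices[0]
returns_from_open = standardized - 1.0

# Continuous (log) returns: first difference of the natural log of prices.
log_returns = np.diff(np.log(prices))
```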

To see clearly the timing and the magnitude of the price drops of each of the 30 DJIA stocks on May 6th, we show the lowest standardized price for each stock (y-axis) and the time each stock reaches its nadir (x-axis) in Fig. 2. The time points for these prices are clustered between 14:45 and 14:48, with Walmart (WMT) being the first stock (14:45:29.2) to reach its lowest price and Kraft Foods (KFT) being the last (14:47:58.8). Although most stocks dropped about 10%, 3M (MMM) dropped 20% and Procter & Gamble (PG) dropped more than 35%. The observations in Figs. 1 and 2 are consistent with the reports from the CFTC-SEC (2010a; b) and also reveal the distinguishing feature of a flash crash, i.e., large cumulative declines in a very short time period and a corresponding rapid recovery.

Fig. 2

Time (x-axis) each stock reached its lowest price on May 6, 2010. Stock prices (y-axis) are standardized by their corresponding opening prices. Stock names corresponding to the stock symbols in the figure are given in Table 2

4.2 Modeling excitation matrix dynamics

Because the behaviors of stocks may vary over time, especially during the flash crash, we do not fit the Hawkes model to all the data combined, as this would mask the dynamics of the stocks over time. Instead, we divide the data into overlapping time windows. The length of the rolling window is five minutes, and the window moves five seconds at each step. In total, there are 2,160 instances of the moving window between 13:00 and 16:00 plus one startup window immediately prior to 13:00. Thus, each five-minute moving window, on average, lengthens the rolling sample by 1.67% to accommodate new observations and shortens it by the same percentage by deleting the oldest observations. The Hawkes model is then fitted to the DJIA 30 stock data in every window.
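The window bookkeeping can be sketched as follows, with times measured in seconds relative to 13:00 so that negative values fall before 13:00:

```python
# Enumerate the overlapping estimation windows: each window is a
# (start, end) pair spanning 300 s, and successive windows advance 5 s.
WINDOW = 5 * 60        # window length: 5 min = 300 s
STEP = 5               # step size: 5 s
SESSION = 3 * 60 * 60  # 13:00 to 16:00 = 10,800 s

windows = [(end - WINDOW, end) for end in range(0, SESSION + STEP, STEP)]
# windows[0] is the startup window ending exactly at 13:00; the remaining
# 2,160 windows end between 13:00:05 and 16:00:00, matching the text.
```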

For the visualizations that follow, the parameter estimates from each window are plotted against the right boundary of the window. From the 5-min history before each time point, we estimate the Hawkes model and use it to characterize the stocks at that time point. Accordingly, there is one collection of parameters (e.g., baseline rate \(\mu_{s}\) and excitation matrix A) estimated from each window. Since the parameters at each time point are estimated using information from the 5-min window prior to it, their calculated effects are not instantaneous; rather, they behave like a moving average, which better reflects the investor learning process.Footnote 19

An example of the excitation matrix for five randomly picked stocks during the 13:00:00–13:05:00 window is shown in Fig. 3, together with its network representation. Column names denote the stocks that influence, and row names indicate the stocks being influenced. We use Bank of America (BAC) and Travelers (TRV) to illustrate the interpretation of this matrix. BAC positively influences itself (0.44) and, to a much lesser extent, TRV (0.03). In contrast, TRV's influence on itself is relatively small (0.10), and it has no effect on BAC (0.00). These relations are graphically depicted in the network diagram, with arrowheads showing the direction of influence and the thickness of the arrow shaft indicating the relative size of the influence. That TRV has no effect on BAC is shown by the lack of an arrow. Self-influence is not shown in the network diagram because it would only add unnecessary detail. Nevertheless, the excitation matrix entries show that BAC exhibits the most self-influence, substantially higher than that of the other stocks in our example.

Fig. 3

The excitation matrix for five randomly picked stocks during the 13:00:00–13:05:00 window (left) and the corresponding influence network (right). Directional arrows indicate source and recipient of the influence, and their thickness represents the strength of influence. Stock names corresponding to the stock symbols in the figure are given in Table 2. As indicated by the principal diagonal of the matrix, all stocks exhibit self-influence with BAC having the highest value and MSFT having the lowest. Arrows in the diagram that indicate self-influence are not provided for visual simplicity
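Reading the matrix is mechanical once the row/column convention is fixed. The snippet below encodes the BAC/TRV sub-block quoted in the text; the helper function `influence` is ours, introduced only for illustration:

```python
import numpy as np

# A 2x2 slice of the Fig. 3 excitation matrix (rows = influenced,
# columns = influencing), using the BAC/TRV values quoted in the text.
stocks = ["BAC", "TRV"]
A = np.array([[0.44, 0.00],   # BAC row: BAC <- BAC, BAC <- TRV
              [0.03, 0.10]])  # TRV row: TRV <- BAC, TRV <- TRV

def influence(a, source, target):
    """Influence of `source` on `target`: column = source, row = target."""
    return a[stocks.index(target), stocks.index(source)]
```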

To extend this example, consider two distinct scenarios: (1) none of the five stocks influences itself or the others, and (2) each of the five stocks influences only itself. In the first case, the excitation matrix would contain only zero entries. In the second, the principal diagonal cells of the matrix would have non-zero values, but all the other entries would be zero. Adding noise to the example makes the scenarios a bit more difficult to picture because non-zero values would be added to the excitation matrix's entries. Nevertheless, because excitation matrices are calculated using a rolling sample approach in conjunction with a Bayesian algorithm, the impact of noise on the matrices should be mitigated.

We explain the impact of each of the excitation matrices using the following summary measures. First, we consider the density of the influence network between the stocks, which is the number of links in the network.Footnote 20 Next, we consider the influence strengths. Following the extant convention, for a specific window we define the self-reflexivity of the market during that period as the average of the diagonal entries of the excitation matrix A. In a similar manner, we label the average network link strength (i.e., the mean of the nonzero off-diagonal entries of the excitation matrix A) as cross-reflexivity (sometimes referred to as mutual-reflexivity), which can alternately be thought of as the average interaction strength of the market.

We also construct three measures for individual stocks: self-influence, out-influence, and in-influence. The self-influence of a stock is the impact of the stock on itself and is the value indicated by the stock's position on the excitation matrix's principal diagonal, i.e., the intersection of the stock's row and column entries. In contrast, the out-influence of a stock is the impact of this stock on the 29 other DJIA stocks, or the weighted out-degree of this stock in the influence network. Mathematically, this quantity is the sum of the corresponding column in the matrix \(A\) less the value of its diagonal entry. Correspondingly, the in-influence of a stock is the impact of the 29 other stocks on it and is measured by the stock's row sum less the self-influence of the stock being measured. Thus, all the information contained in the excitation matrix is used by these three influence measures.
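Under the stated conventions (rows = influenced, columns = influencing), all of the summary measures defined above reduce to simple operations on the excitation matrix. A minimal sketch:

```python
import numpy as np

def summary_measures(A):
    """Window-level and per-stock summaries of an excitation matrix A
    (rows = influenced stocks, columns = influencing stocks)."""
    n = A.shape[0]
    off = ~np.eye(n, dtype=bool)          # mask for off-diagonal entries
    links = A[off][A[off] > 0]            # nonzero cross-influences
    return {
        "density": int((A[off] > 0).sum()),            # links in the network
        "self_reflexivity": float(np.diag(A).mean()),  # mean diagonal entry
        "cross_reflexivity": float(links.mean()) if links.size else 0.0,
        "self_influence": np.diag(A),                  # diagonal entries
        "out_influence": A.sum(axis=0) - np.diag(A),   # column sum less diagonal
        "in_influence": A.sum(axis=1) - np.diag(A),    # row sum less diagonal
    }
```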

The excitation matrix is similar to the variance–covariance matrix often used to measure the risk associated with stock portfolios in that it measures how the stocks are related to each other. However, the excitation matrix differs in three important ways. First, it does not require the time series to be synchronized, which high-frequency trading data typically are not. Second, the excitation matrix is asymmetric (or directed), so a stock can have a specific impact on another stock without the reverse being the case. Finally, it does not suffer from the Epps (1979) effect, which typically renders the variance–covariance matrix unreliable for high-frequency data.Footnote 21

4.3 Economic periods and sample size implications

All the figures and tables that follow are split into five economic periods: pre-crash, crash, nadir, recovery, and post-recovery. These periods are determined ex post and are defined in Table 3. The time period spanned by the crash, nadir, and recovery periods corresponds to the SEC's (2010a; b) crash period. Our nadir period is structured so that it contains the lowest price observation for each of the 30 stocks as depicted in Fig. 2. Table 3 also contains information about the number of transactions in each of the five periods and in the five periods in aggregate. The table also provides the frequency of transactions (the average number of milliseconds between transactions) and the proportion of transactions associated with price increases (the complement of this proportion is associated with price decreases).

Table 3 Economic crash and recovery periods and selected trade information on May 6, 2010

As displayed in Table 3, price-changing trades, often referred to as active trades, never exceed 50% of total trades. In the pre-crash stage, they account for only 8.49% of the total number of trades. The percentage monotonically increases to 44.62% in the nadir period and decreases to 19.31% in the post-recovery stage. The total number of active trades across the five economic periods combined is slightly over one million, which translates to one transaction occurring every 10.1 ms, on average. The frequency of trades monotonically increases from one every 22.1 ms in the pre-crash period to one every 2.8 ms in the nadir period and then monotonically decreases to one every 7.8 ms in the post-recovery period.Footnote 22 This U-shaped pattern is roughly echoed by the proportion of trades associated with positive price changes. This proportion drops from 49.1% in the pre-crash period to 48.3% in the crash period, rises to 50.4% in the nadir period, and subsequently becomes relatively flat for the remaining two periods.

Recall that our Hawkes model specification requires that a price change signals the existence of an event regardless of whether the price change is negative or positive. Because our event measure is a binary variable and transaction profits can be made on price changes in either direction, we would expect each type of price change to account for roughly 50% of the total. Z-scores for the trade proportion for each of the five economic periods and the five periods combined are provided in Table 3. We use a two-tailed Z-score to test the null hypothesis that the positive and negative price-change trades each account for 50% of the active trades. A one-tailed Z-score is used to test the hypothesis that a positive (negative) price change is less (more) than or equal to 50%.
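The Z statistic for a one-sample test of proportion takes its textbook form here; the helper names below are ours, introduced for illustration:

```python
import math

def proportion_z(successes, n, p0=0.5):
    """One-sample Z statistic for a proportion against the null value p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def two_tailed_p(z):
    """Two-tailed p value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

For the one-tailed test, half of the two-tailed p value is used, with the sign of z determining which direction dominates.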

Turning first to the two-tailed Z-scores, we find that the values for all periods combined and for four of the five sub-periods are statistically significant using any standard critical p value, indicating that the null hypothesis that the negative and positive price changes are equally balanced is rejected. The exception is the nadir period, which is insignificant using the traditional 0.10 critical value. The one-tailed Z-scores help to pinpoint the source of the imbalances. Negative price changes in the pre-crash and crash periods statistically dominate the positive price changes, and the reverse is the case for the recovery and post-recovery periods. Over the entire test period, the number of negative price changes dominates.

Thus, from the above statistics, it would seem that in all of the above cases we should reject the null hypothesis that the behavior of the two types of price changes is the same, even if the actual numbers are only slightly different from the hypothesized values. This conclusion, however, is an example of the "Large Sample Fallacy" (LSF). In the case of the Z-test and similar statistical tests, the LSF arises because the standard error in the statistic's denominator shrinks with the square root of the sample size, so that as the sample becomes larger the Z value becomes larger and the p value of the test becomes smaller. As a result, the size of the universe from which the sample was extracted is not considered, so that by relative standards a large sample may be only a small fraction of the total population.
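The mechanism is easy to demonstrate: holding the observed proportion fixed while growing the sample drives the Z statistic past any conventional critical value, even though the economic content of a 49.9% proportion does not change:

```python
import math

def z_stat(p_hat, n, p0=0.5):
    """Z statistic for an observed proportion p_hat with sample size n."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# The observed proportion is held fixed at 49.9%, yet |Z| grows like
# sqrt(n): an economically negligible deviation from 50% becomes
# "statistically significant" once the sample is large enough.
z_values = {n: z_stat(0.499, n) for n in (1_000, 100_000, 10_000_000)}
```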

As pointed out by Lin et al. (2013), among others, several approaches have been suggested to mitigate this problem, which is becoming increasingly important in the era of "Big Data." The two most popular appear to be (1) decreasing the acceptable p value and (2) focusing on whether the actual finding is meaningful in the context of the phenomenon under investigation. The two approaches are not mutually exclusive, although in practice they are often treated as if they were. The first approach reduces Type I error, which makes the null hypothesis more difficult to reject, and requires the p value signaling statistical significance to be specified ahead of time.Footnote 23 The second involves determining whether the difference between the null hypothesis and what is observed is substantive or, in our case, economically meaningful. As argued and demonstrated by Ziliak and McCloskey (2008), failing to address the latter may have dire consequences. We adopt the second approach and advance two arguments supporting the position that positive and negative price changes should be considered the same, since the purpose of the price-change variable is only to count how many trades occurred that contained new information.

First, a popular view of a crash is that some major negative information is noted by the market participants, and as stock prices begin to fall, some sort of market contagion takes effect and prices drop in unison. The reverse holds true during a recovery. This view may be exacerbated by media hype and the fact that the public neither has access to nor is aware of high-frequency data; our results do not support it. For example, as displayed in Table 3, 48.26% of the price-changing events are positive during the crash period, and 50.41% are positive during the recovery period. For the official crash period, which includes the previous two periods plus the nadir period, the positive events account for 49.62% of the total. Numerically, these are very close to the 50% neutral value and indicate that market participants engage in price discovery: they tend to find and exploit profit opportunities regardless of the direction of the market's general movement in order to learn the true value of the market after the shock. Frank et al. (2019) suggest that traders who engage in this process together are an example of Adam Smith's (1776/2020) well-known "invisible hand" metaphor at work, which in our case moves price toward a moving equilibrium value.

Second, market regulators throughout the world are concerned about the effectiveness of the price discovery process. A majority of them, including those in the USA, have adopted some sort of circuit breaker, which temporarily stops trading for a short period of time, despite the fact that theoretical and empirical research is mixed concerning its usefulness (see, e.g., Ackert (2012) and Sifat and Mohamad (2018)). Circuit breakers can be market-wide or focused on individual securities. Typically, they are concerned with falling prices. Thus, the main argument favoring this approach is that it provides a cooling off period. This delay gives traders an opportunity to come to grips with their cognitive biases and, as a result, to better evaluate the reasons for the price drop in an effort to make better trading decisions, or to adjust parameters in their algorithms. The contrary view is that the delay only postpones trading and may exacerbate the price decline when traders try to change their trading strategies in an attempt to game the system as the circuit breaker trigger price approaches.Footnote 24

On June 19, 2010, approximately one month after the May 6 flash crash, US regulators established a single-stock circuit breaker to guard against falling prices. On April 5, 2011, the Financial Industry Regulatory Authority (FINRA), along with several securities exchanges, suggested replacing the single-stock circuit breaker with a "limit up-limit down" (LULD) circuit breaker. The main reason for the replacement was that this type of mechanism would handle not only downside volatility but also upside volatility. On May 31, 2012, the Securities and Exchange Commission approved the proposal for a trial run, and on April 11, 2019, it gave the LULD permanent status.Footnote 25

5 Empirical findings

We first present the empirical results for the 30 DJIA stocks as a group and then focus on the stocks' impacts upon themselves and upon each other. We also report the possible presence of subgroups based on industry sector and trading venue. For convenience, we refer to the 30 stocks taken together as the "market," individually simply as a "stock," by company name, or by stock symbol, and to subgroups as communities. All results are based on excitation matrices calculated using the five-minute rolling window.

5.1 DJIA market

We first examine the density of the influence network (i.e., the number of edges in the network) between the stocks, which is shown in Fig. 4-top. Recalling that the Hawkes process uses only information about the stocks' trading events, it is notable that the two time series (average market price and network density) almost collapse onto each other.Footnote 26 Specifically, the two series drop simultaneously when the crash starts, reach their bottoms at the same time, and recover concurrently. While the network density follows the market price closely, the influence strength shows a different pattern. The cross-reflexivity of the market (i.e., the average of the cross-influences between the 30 stocks) is shown in Fig. 4-bottom. The abrupt increase of cross-reflexivity around 14:32 reflects the beginning of the sudden decline in stock prices. The two plots together suggest that, when the crash starts, network density decreases, but the interaction strength for the remaining links increases dramatically, indicating that although market activity increases, it is concentrated among fewer stocks. The average cross-reflexivity reaches its highest point several minutes after the average price reaches its lowest value, possibly because traders are unable to determine exactly when the market reached its lowest point while the activity level of the market is still high. After reaching their extreme points, price and cross-reflexivity both tend to return to their approximate pre-crash levels, although price is not quite as high nor is cross-reflexivity quite as low.

Fig. 4

Network density (blue), i.e., number of links in the influence network (Top) and cross-reflexivity (blue), i.e., average of the cross-influences between the 30 stocks (Bottom), and the average standardized price of all DJIA 30 stocks (orange) from 13:00 to 16:00 (market close) (color figure online)

Recalling that our model includes three types of effects (the exogenous effect, i.e., the average baseline rate \(\mu_{s}\); the self-reflexivity; and the cross-reflexivity), we further examine the relative strengths of these three effects over time. Figure 5 shows the proportion of each type of effect relative to their sum. The exogenous effect, which captures stocks outside the DJIA and information exogenous to the market, is small (blue curve), accounting for less than 1% most of the time. The self-reflexivity (orange curve) takes a larger proportion than the cross-reflexivity (green curve), but the former starts to decay and the latter to grow when the crash starts (around 14:32), and the two become closer during the crash. They roughly return to pre-crash levels after the crash. The overall pattern of the cross-reflexivity is similar to that of the exogenous effect, which suggests that the latter may be dominated by stocks that are not part of the DJIA.

Fig. 5

Proportions of exogenous effect (blue), self-reflexivity (orange), and cross-reflexivity (green) from 13:00 to 16:00 (market close). The sum of the three influences at each time point equals one (color figure online)

5.2 DJIA stocks

To determine the major influencers in the market, for each company we calculate the average out-influence, i.e., the impact of a particular stock on the other 29 stocks, in the pre-crash through post-recovery periods (see Table 3 for specific dates). As we show in Table 4, the pattern of out-influence varies among the stocks, and it varies among the five economic periods. In addition to the average out-influence for each stock in each economic period, we provide the stock's average ranking of out-influence. As indicated in Table 4, the three strongest out-influence stocks over time are Bank of America (BAC), ExxonMobil (XOM), and JPMorgan Chase (JPM), and the three weakest are 3M (MMM), DuPont (DD), and Travelers (TRV).

Table 4 Average out-influences and corresponding ranks for each of the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010

In a format similar to Tables 4 and 5, Table 6 reports the self-influence data for each of the 30 DJIA stocks. As shown in this table, the three strongest stocks with respect to self-influence are Bank of America (BAC), Microsoft (MSFT), and ExxonMobil (XOM), and the three weakest are Chevron (CVX), United Technologies (UTX), and DuPont (DD). Two observations merit special mention. First, ExxonMobil is one of the strongest stocks with respect to out-influence (rank: 01) and self-influence (rank: 03) but one of the weakest with respect to in-influence (rank: 30). Second, although Chevron is in neither the strong nor the weak category with respect to out-influence (rank: 10) or in-influence (rank: 10), it is clearly in the weakest self-influence category (rank: 28). Taken together, these observations suggest that ExxonMobil exhibits more market power, but this market power may not be related to industry sector, since ExxonMobil and Chevron are both in the energy sector and are the only two DJIA stocks in this category (see Table 2).Footnote 27

Table 5 Average in-influences and corresponding ranks for each of the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010
Table 6 Average self-influences and corresponding ranks for each of the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010

Out-influences and in-influences are much larger than self-influences, although the averages of all three influences changed in magnitude as the market moved through the pre-crash, crash, nadir, recovery, and post-recovery periods. As shown in Table 7, self-influence and out-influence are positively correlated in all periods. In contrast, in-influence is always negatively correlated with both the self-influence and out-influence measures. In absolute terms, the correlations of the three pairs of influences in the crash period are smaller than those in the pre-crash period. Beginning in the recovery period, these correlations tend to move back toward their pre-crash levels.

Table 7 Correlations among out-influence, in-influence and self-influence ranks for the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010 by Economic Period

To further investigate the behavior of the three different types of influence, we calculate the means of the out-influence, in-influence, and self-influence for the 30 Dow Jones stocks as the environment changes from pre-crash to crash, crash to nadir, and so forth. We conjecture that the sequential means may depend in some way on the previous mean, e.g., the mean in the nadir period may depend on the mean in the crash period. Thus, we use a paired t-test to examine the way in which the means evolve over time. The results of these tests are given in Table 8. We cannot reject the null hypothesis that the mean out-influence and the mean in-influence do not change between the sample periods. This is not the case for self-influence: all of the paired t-tests are statistically significant. The mean self-influence decreases during the crash and increases as the market recovers.

Table 8 Out-influence, in-influence, and self-influence changes for the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010 by Economic Period

5.3 DJIA communities

As previously mentioned, the stock market can be thought of as a dynamic network whose nodes are the individual stocks and whose links reflect the influences generated by price-changing trades. A useful descriptor of the large-scale structure of a network is modularity.Footnote 28 Modularity quantifies the degree to which a network can be partitioned into communities, or, in our case, groups or clusters of stocks that may be related to one another in some fashion. Networks with high modularity have dense connections between nodes within the same community but sparse connections between nodes in different communities.Footnote 29 Thus, modularity differs across network partitions; in applications, however, modularity typically refers to the maximum modularity over partitions, and we follow that convention. To find the best partitions and the corresponding modularity scores, we use the Python package leidenalg (https://github.com/vtraag/leidenalg) by Traag et al. (2019). For completely random networks, modularity will be close to zero, and the larger the modularity, the more fragmented the network.
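While the partitions themselves are found with leidenalg, the modularity score Q of a given partition can be computed directly from its definition. The sketch below is a simplified, undirected version (the influence network in the paper is directed and weighted):

```python
import numpy as np

def modularity(adj, communities):
    """Newman-Girvan modularity Q of a partition of an undirected network:
    Q = (1 / 2m) * sum_ij (A_ij - k_i * k_j / 2m) * delta(c_i, c_j)."""
    adj = np.asarray(adj, dtype=float)
    k = adj.sum(axis=1)                   # node degrees
    two_m = adj.sum()                     # twice the number of edges
    c = np.asarray(communities)
    same = c[:, None] == c[None, :]       # delta(c_i, c_j)
    expected = np.outer(k, k) / two_m     # expected edge weight at random
    return float(((adj - expected) * same).sum() / two_m)
```

For two disconnected dyads split into their natural two communities, this returns the textbook value Q = 0.5, while lumping all nodes into one community gives Q = 0.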

Similar to some of our earlier analyses, the modularity of our stock network is plotted in Fig. 6 along with the stock price series from the pre-crash period to the post-recovery period. As shown in Fig. 6, modularity increases during the crash, peaks during the nadir period (i.e., the network is at its most fragmented), and decreases in the recovery period. Compared to these three middle periods, the pre-crash and post-recovery periods are both more homogeneous and less fragmented, as indicated by their very low modularity values.

Fig. 6

Network modularity (blue) and average standardized price for all DJIA 30 stocks (orange) from 13:00 to 16:00 (market close) (color figure online)

Numerous previous studies, e.g., King (1966), Cavaglia et al. (2000), and Fan et al. (2016), suggest the presence of an industry effect such that the prices of stocks in the same industry move together because they face the same production and demand issues. Other studies, e.g., O'Hara and Ye (2011), Menkveld and Yueshen (2019), and Tivan et al. (2020), argue that stock markets may be fragmented, i.e., there are many markets that serve the same general purpose, but, at times, they may not be well connected to each other and, thus, have the potential to create current or latent liquidity problems.

We explore these two topics in more detail by examining the existence of industry sector and stock trading venue communities using modularity and the normalized mutual information (NMI) statistic.Footnote 30 In our case, the NMI measures the similarity that exists between any two partitions of the same set of objects (i.e., the 30 DJIA stocks). If the NMI value is one, the partitions completely overlap each other, indicating that there is no difference between them. However, if the NMI value is zero, then the partitions do not overlap, signaling that there is no similarity between them. Newman and Girvan (2004) report that in their studies typical values range from 0.30 to 0.70, with values above 0.70 being quite rare. Accordingly, for each excitation matrix we cluster the stocks using the modularity method and compute the NMI statistic between the identified stock communities and the industry sectors. We make the same calculation for trading venues and stock trade reporting units.
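For reference, the NMI between two partitions can be computed directly from its definition. The sketch below uses the common \(\sqrt{H(a)H(b)}\) normalization, which is one of several variants in the literature:

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """Normalized mutual information between two partitions of the same
    objects, normalized by sqrt(H(a) * H(b)); 1 means the partitions
    coincide (up to relabeling), 0 means they share no information."""
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    cab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information: sum over joint cells of p(a,b) log(p(a,b)/(p(a)p(b))).
    mi = sum((c / n) * math.log(c * n / (ca[a] * cb[b]))
             for (a, b), c in cab.items())
    ha, hb = entropy(ca), entropy(cb)
    return mi / math.sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0
```

Note that the NMI is invariant to relabeling the communities: only the groupings matter, not the names assigned to them.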

The calculated NMI values between the nine industry sectors (see Table 2 for their names and symbols) and the corresponding identified communities are plotted in Fig. 7. A review of Fig. 7 indicates that the NMI does not have any significant trend and its value remains relatively small, i.e., about 0.30, throughout the time period under examination. This finding in conjunction with the modularity results depicted in Fig. 6 suggests that the network becomes more clustered during the crash, but these clusters do not appear to materially overlap with industry sectors. In other words, the impact of the crash is more likely to spread among the different industry sectors than within the industry sectors.

Fig. 7 Normalized mutual information (NMI) statistics between identified stock communities and industry sectors (blue) and the average standardized price for all DJIA 30 stocks (orange) from 13:00 to 16:00 (market close) (color figure online)

Another possible candidate for understanding the network modularity pattern is the trading venue. As mentioned in Sect. 4.1, the DJIA stocks are traded on 10 different trading venues, nine of which are stock exchanges. Each exchange is responsible for accounting for all of the information corresponding to each of its trades. To be traded on an exchange, a stock must be listed on that exchange. Since the DJIA companies are large and well known, it is common for them to be listed on many exchanges. Although each listing must be purchased, most chief financial officers believe that the additional cost is worth the potential additional liquidity. The tenth venue is the Nasdaq Trade Reporting Facility (NTRF), a partnership between Nasdaq and the Financial Industry Regulatory Authority (FINRA). It is responsible for collecting trade data for all internal trades, including crossing networks and dark pools operated by broker-dealer firms as well as trading desks that execute internal trades for large investment houses. In the 1980s, these ten trading venues were linked together under the auspices of the National Market System (NMS). To improve competition among the venues, the NMS created a consolidated tape, permitting all market participants to view the transactions and quotes from each venue at the same time. This arrangement led O’Hara and Ye (2011) to suggest that the USA does not have multiple markets but rather a single market that supplies its participants with multiple access points, each access point being one of the ten trading venues.

The NMI values between the identified stock communities and the ten trading venues are plotted in Fig. 8. Unlike the NMI plot for industries, this plot contains patterns that are significant when compared to the lower benchmark of 0.30. In particular, 30 min before the crash the NMI increases and becomes significant around 14:15, but it then decreases and returns to insignificance a few minutes before the crash. At the beginning of the crash (14:32), it increases again and remains significant throughout the entire crash period. It then decreases until 14:55 but rises again until the end of the recovery period; except for a very few seconds, all of the NMI values are significant. Throughout the post-recovery period, the NMI remains relatively steady and significant. Although not a perfect match, the NMI pattern for the trading venue community segmentation is clearly reminiscent of the modularity pattern for the market as a whole (Fig. 6).

Fig. 8 Normalized mutual information (NMI) statistics between identified stock communities and trade reporting market venues (blue) and the average standardized price for all DJIA 30 stocks (orange) from 13:00 to 16:00 (market close) (color figure online)

To probe more deeply into the NMI fragmentation pattern, we examine the market shares of each of the 10 venues. The names of these trading (and reporting) venues are shown in Table 9, along with market share percentages for each of the five market periods individually and in total. Nasdaq led the venues’ market share with approximately 46%. The New York Stock Exchange (NYSE) and BATS Global Markets (BATS) rank second and third. Together, the top three venues account for slightly over 79% of the price-changing trades. On the low end of the market shares are the Cincinnati Stock Exchange (CINC), the Chicago Stock Exchange (CHIC), and CBOE Global Markets (CBOE), which together amount to 0.21%. The venue composition of the three market share categories varies little over the five economic periods. Although the differences in market share are relatively small, the Cincinnati Stock Exchange (CINC) increased its market share in the nadir and recovery periods and switched positions with the Nasdaq Trade Reporting Facility (NTRF), which lost market share in the same two periods.

Table 9 Percentage trade volume of trade reporting venue for the 30 Dow Jones Industrial Average (DJIA) Stocks on May 6, 2010 by Economic Period

There are, however, notable differences in venue trading activity between adjacent market periods. For example, the top three venues (NQEX, NYSE, and BATS) accounted for nearly 80% of the trades in the pre-crash period, dropped to 79% during the crash, and continued to fall to 75% at the market’s nadir, a market share drop of approximately five percentage points. Their combined market share began to increase during the recovery period and continued to rise to approximately 79%. These large market share venues did not, however, operate in tandem. From the pre-crash period to the nadir period, NQEX’s market share dropped by 11.15 percentage points; the corresponding changes for NYSE and BATS are an increase of 6.33 percentage points and a decrease of 0.05 percentage points, respectively. As a result, almost five percentage points were picked up by the seven other trading venues, although up and down patterns are also exhibited by these smaller venues. These changes in market share strongly suggest that traders are willing to move from one venue to another in an effort to find the best price for the size of their transactions.

Gomber et al. (2016) conclude their extensive survey on the nature of market fragmentation versus consolidation by suggesting that the economic welfare costs and benefits of stock markets mostly depend on the way they handle issues of price discovery and adequate liquidity. Although there are economies of scale that strongly favor consolidation, they maintain that because of differences in trader behavior, such as trading motives and order sizes as well as the need for quickness, it may be difficult to design a single market that can satisfy the needs of all traders. Recent research by Nicole et al. (2020), however, posits that the presence of heterogeneous agents may not be a reason for the existence of multiple markets. Instead, their work on market fragmentation using agent-based modeling leads them to suggest that traders’ preference for multiple markets may be the result of their adaptive behaviors.

The industry and trading venue community results differ greatly. We suggest that this difference stems from the length of the crash as well as from high-frequency trading. When the crash and its recovery are very short, as is the case with the Flash Crash, a large majority of the trading is endogenous, the result of traders seeking the best deal regardless of venue in order to gain profit or minimize loss. In contrast, industry information does not change over such a short horizon and, hence, is exogenous. For longer crash and recovery periods (months or even years), we expect there would be an industry effect, since industry information would most likely be part of the price discovery process employed by the traders.

6 Discussion and concluding remarks

Our empirical findings indicate that the DJIA 30 stocks exhibit the characteristics of self- and cross-reflexivity as well as out-influence and in-influence, suggesting that past movements in the prices of these stocks influence not only the future prices of the stocks themselves but also those of the other stocks that make up the index. The out- and in-influence interactions between the stocks vary before, during, and after the 2010 Flash Crash. Nevertheless, the rank correlation between in-influence and either out-influence or self-influence is strongly negative. Taken as a whole, the self-influence of the 30 Dow Jones stocks declines from the pre-crash period to the bottom (nadir) and then increases in the recovery period.
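The negative association we report can be measured with a Spearman rank correlation. A minimal stdlib-only sketch follows, assuming no tied influence values; the numbers standing in for the estimated in- and out-influences are made up for illustration.

```python
def spearman(x, y):
    """Spearman rank correlation (no tie handling) of two equal-length lists."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    # Pearson correlation computed on the ranks.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Illustrative only: stocks with high in-influence tend to rank low on out-influence.
in_influence  = [0.9, 0.4, 0.7, 0.2]
out_influence = [0.1, 0.8, 0.3, 0.6]
rho = spearman(in_influence, out_influence)  # ≈ -0.8 (strongly negative)
```

Rank correlation is the natural choice here because the influence estimates are on arbitrary scales, and only their orderings across stocks are comparable.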

In addition, we find that the US trading venues are usefully viewed as a tightly connected network that traders tend to use to their advantage. As O’Hara and Ye (2011) suggest, the venues compete with each other to attract volume and, hopefully, increase profits by offering lower transaction costs and faster execution speeds, although these advantages may also be associated with greater volatility in the short run, which may cause some traders, e.g., market-makers, to temporarily exit the market, possibly resulting in the loss of liquidity. Traders have the ability to access all of the trading venues and can quickly search for the best deal for them and their clients. For extremely large buy or sell orders, fragmentation allows the traders to split their orders among trading venues to try to lessen the impact of the transaction.

Our findings have implications for the performance of stocks in terms of risk and return, market microstructure design, and stock index composition. Turning first to stock portfolio performance, modern risk management has focused on various measures of expected return and volatility. This approach was initiated by Markowitz (1952) using the statistical concepts of the mean and variance of returns. Building on this development, Sharpe (1964, 1994) created a risk-return performance ratio (the Sharpe Ratio), which implicitly assumes that the relevant return distributions are Gaussian.Footnote 31 This assumption, however, is often not true. Recognizing that return distributions are typically asymmetric and thick-tailed, Sortino (2001) designed a performance measure (the Sortino Ratio) that focuses on downside risk, i.e., only the left tail of the return distribution.Footnote 32 Applying either the Sharpe Ratio or the Sortino Ratio to high-frequency data, however, is problematic because of the Epps effect (see fn. 21). Aït-Sahalia and Hurd (2016) have developed a capital asset pricing model in which the stocks being considered are described by mutually exciting Hawkes processes. They show in a dynamic context that the optimal portfolio composition changes in response to changes in the jump intensities of individual stocks. An important result of their model when applied to risky assets and a risk-free asset is that, because of the excitation relationship among stocks, a jump in one causes the investor to sell all the stocks and invest the proceeds in the risk-free asset. This behavior suggests that, in an environment that recognizes the possibility of contagion, a very few or even a single stock could trigger a crash, flash or otherwise.
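The contrast between the two performance measures can be sketched as follows; this is an illustrative per-period version on excess returns, omitting annualization and any risk-free adjustment.

```python
import statistics

def sharpe_ratio(excess_returns):
    """Mean excess return divided by the standard deviation of all returns."""
    return statistics.mean(excess_returns) / statistics.pstdev(excess_returns)

def sortino_ratio(excess_returns, target=0.0):
    """Mean excess return divided by the downside deviation below `target`.

    Only observations below the target contribute to the risk term,
    so upside volatility is not penalized.
    """
    downside = [min(r - target, 0.0) ** 2 for r in excess_returns]
    downside_dev = (sum(downside) / len(excess_returns)) ** 0.5
    return statistics.mean(excess_returns) / downside_dev
```

Because the Sortino Ratio ignores upside variation, it exceeds the Sharpe Ratio for a series whose volatility is not concentrated in the left tail, which is the asymmetry the text describes.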

Second, as we mention in Sect. 4.2, the Securities and Exchange Commission (SEC) adopted the limit up-limit down (LULD) circuit breaker rule to mitigate the possibility of a market crash. The idea was to try to constrain the volatility of a specific stock in the hope that this would reduce the possibility of a market crash induced by this stock. Our results, however, show that the contagion (or influence) between stocks is not symmetric, i.e., in-influence and out-influence are not identical and are negatively related. This finding suggests that a more refined microstructure arrangement might have different LU and LD limits with the magnitude of these limits determined by the magnitude of the two respective influences. If having different limits is not administratively practical, an alternative to consider is to have symmetric limits with the values being determined by the most out- and in-influencing stocks.

Third, the composition of stock indexes is routinely changed to reflect structural changes in the economy. This is typically done by adding stocks of strong companies in growing industries and eliminating stocks of companies that have weakened in declining industries. Our results, however, indicate that industry may not be the important factor in the construction of a high-frequency stock index. Instead, what is important is the influence effect that stocks have on each other and on themselves. Moreover, although current daily indexes are designed to handle the effects of stock price differences when their composition is modified, there does not appear to be a rigorous quantitative assessment of the possible cross-influence differences between the stocks being added and those being deleted. Not accounting for these differences in some way (even qualitatively) may render the dynamical behavior of the index before and after the change incomparable.

In sum, because of the high level of technology that is being used in the major stock and similar markets worldwide and the likelihood of continued high-frequency and algorithmic trading, additional work should be done exploring the implications of the Aït-Sahalia and Hurd (2016) capital asset pricing model using real transactions data with a focus on measuring portfolio performance and trading strategies. Work should also be done on the implications of stock cross-influences and on the rules of trade and their impact on prices and liquidity restrictions, as well as the design of stock indexes that are not only used to monitor the overall performance of the market, but also as an input to various asset pricing models. Additionally, efforts should be directed toward the ways that market fragmentation affects the overall self-, in- and out-influences of individual stocks. Finally, the Hawkes model quantitatively describes the behavior of the stock market during the 2010 Flash Crash. Nevertheless, more intricate versions of the process may be needed to better understand a market crash that may last weeks, months, or even years. One possibility is to add higher-order nonlinear terms or jump processes to the model, e.g., stochastic cusp catastrophe models.

We leave these and other similar interesting topics to future research. Nevertheless, we note that Kuhlmann (2014) believes that social complexity in economics/finance is not often explicitly included in its models or theory because its impact through institutional arrangements and interactions is not sufficiently recognized. We concur with his view and urge that financial research, especially that involving market microstructure, should be guided not only by statistical models and measures, but also by the notion of the complex nature of human behavior and how this behavior is reflected in the research scheme.