Path-dependent behavior and information leakage in financial markets

Testa, Alessia

doi:10.1007/s00199-018-1102-3


Research Article
Open access
Published: 01 February 2018

Volume 67, pages 909–949, (2019)

Economic Theory


Alessia Testa ORCID: orcid.org/0000-0001-7579-2653¹


Abstract

I develop a new mechanism that exploits the leakage of information in financial markets to deliver herding and contrarian behavior, which I label “path-dependent behavior.” The mechanism is related to the role of word-of-mouth communication in transmitting information about stocks among traders. In practice, this occurs via dealing interactions and a notable phenomenon called “the broker’s ear.” I find that, for a suitably long trading history, path-dependent behavior is more likely the better the quality of traders’ private information. Herding in the direction of the true state of the world occurs almost surely for any initial beliefs, and it improves price informativeness. An external observer who underestimates/overestimates the rate of information leakage will always overestimate/underestimate the quality of private information. I also show that when the quality of private information is very high and the asymmetry of information between traders and market maker is pronounced, the market is expected to herd 50% of the time. Nonetheless, this has little impact on order imbalance, excluding cases exhibiting catastrophic price behavior.


1 Introduction

Over the past twenty-five years, the market microstructure literature (see Avery and Zemsky 1998, AZ henceforth, and Cipriani and Guarino 2014) has linked the phenomenon of event uncertainty (Easley and O’Hara 1992) to herd behavior. In this paper, I introduce a new mechanism which describes a form of information leakage in financial markets that can generate herd and contrarian behavior, phenomena that I group under the label of path-dependent behavior (PDB).^{Footnote 1} Like event uncertainty, this mechanism exploits the asymmetry of information between traders and the market maker setting the price. Unlike event uncertainty, it does not require a shock in the value of the asset. Changing the nature of the informational asymmetry has important implications in terms of tractability,^{Footnote 2} allowing the model to link beliefs and price formation throughout time to information quality and market structure.

The proposed mechanism is related to the role of word-of-mouth communication and networks in transmitting information about stocks among investors. In his book Irrational Exuberance, Shiller (2015) notes that “word-of-mouth transmission of ideas appears to be an important contribution to day-to-day or hour-to-hour stock market fluctuations” (p. 155). In the same spirit, Hong et al. (2005) study the holdings and trades of mutual fund managers working in the same city to test the hypothesis that they exchange ideas by word-of-mouth. They find that this prediction is strongly confirmed by the data at the level of both holdings and trades of thousands of securities.^{Footnote 3}

I think of the interaction of traders—whether acting as brokers or as dealers—on trading floors (physical and virtual) as forming an information network. I model the exchange/leakage of information about an asset by assigning each trader a type and by assuming that traders on the same information network can observe each other’s types. I assume two types: type I traders are always informed and receive a signal about the (unknown) value of the asset; type II traders are informed with some known probability and uninformed otherwise. If uninformed, traders buy and sell with equal probability. I model the trading activity following Glosten and Milgrom (1985). Each period, a trader is selected randomly to trade one unit of the asset with a competitive, risk neutral and uninformed market maker. The market maker does not belong to the network governing the flow of information about the asset. He posts a bid and an ask price at which he buys and sells without knowing the type or the information of his trading counterpart. Information is leaked because, through the observation of types, traders gain a better assessment of whether the market activity is more or less revealing of the value of the asset.^{Footnote 4}

Herding occurs after a long enough sequence of type I buys which causes traders to update their beliefs as if they had full information about the signal realizations. In other words, information leaks from the long sequence of type I buys. At the same time, the market maker has to consider the eventuality that the trades he observes are generated by noise. This causes price rigidity to the point where traders find it advantageous to buy regardless of their signal. On the other hand, contrarianism is the consequence of type II trading activity. The probability of noise conditional on observing a type II trader is higher than the overall fraction of noise traders. This causes the price to overreact to type II trading activity and causes traders to ignore their private information to go against the market.

As mentioned, AZ appeal to event uncertainty to generate herd behavior. Under event uncertainty, a shock may cause the asset value to change. Traders receive a signal informing them both about the occurrence of the shock and about the likelihood of each possible new value, whereas the market maker can only learn about the shock by observing their trading behavior. AZ conduct comparative statics with respect to information quality for a given trading history. Due to the recursive nature of their model, their analysis is limited to obtaining results “one step at a time.” They find that, holding beliefs fixed at each point in time, herding occurs for values of the signal precision below some threshold. In contrast, using the new mechanism in my model, I can link the quality of information and market structure to price and belief formation over time, and to the likelihood of different trading histories. This allows me to formally prove that, for any trading history, there exists a threshold for the quality of private information above which PDB occurs with positive probability.^{Footnote 5}

Moreover, I find that, after some critical time period in the trading history, there exists a threshold for the quality of private information above which the probability of PDB is higher the higher the quality of the traders’ information.^{Footnote 6} Early on, the trading history does not have much weight (i.e., does not contain much information), and an increase in the quality of private information makes it more difficult for the traders to disregard their private signal to follow or go against the crowd. For a long enough trading history and for a high enough information quality, the weight of history increases relatively more than the information contained in the private signal, the higher the precision of the latter. This increases the probability of PDB.

PDB makes the price more volatile^{Footnote 7}; however, it does not lead to extreme price behavior such as booms and crashes. No extreme price behavior is generated by traders’ communication, even if the market maker is not aware of information leaking, as long as he knows the quality of private information and the overall level of noise trading. In fact, during a PDB episode, traders are aware that the trading activity is completely uninformative. This gives the market maker the ability to catch up on the information that is available to the traders through their interaction. Overall, price informativeness is improved as more information is available to the traders, triggering herding in the direction of the true state of the world, which, on average, suppresses the opinion of traders receiving incorrect signals.^{Footnote 8}

In fact, in the limit, for any initial beliefs, PDB occurs almost surely in the form of herd behavior in the direction of the correct state of the world, while the likelihood of contrarianism converges to zero.^{Footnote 9} This is because contrarianism requires a low rate of information leakage, but when the rate of information leakage is low, the price is too sticky, which hinders contrarianism. In general, for any level of information leakage, the market structure outlined above is characterized by a high degree of price stickiness which helps herding but hinders contrarianism.

Studying the limit (for $t\rightarrow \infty $) behavior of a market with information leakage, I find that an increase in the quality of information results in a monotonic increase in order imbalance.^{Footnote 10} In this case, market participants will not be confused between poorly and well informed markets, as long as they are aware that the market structure allows for herding. In contrast, when a market analyst misjudges the market structure by either overestimating or underestimating the leakage of information in the market, he will always underestimate or overestimate the quality of private information, respectively.^{Footnote 11} Specifically, an analyst underestimating the information leakage attributes too much of the order imbalance to the quality of the private information rather than to the fact that this is leaking.

In particular, when an analyst is completely unaware of information leakage, the overestimation of the signal precision is non-monotonic in the quality of information.^{Footnote 12} There are three factors contributing to this result. First, at low levels, increasing the signal precision increases the speed of buildup of the difference in beliefs between the traders and the market maker, triggering herding. Second, at low levels, increasing the signal precision makes the time of recovery from herding longer, as the informational content of the last non-herding trade is larger. The latter constitutes the informational gap that the market maker needs to eliminate in order to catch up with the traders and to recover normal trading. At low levels of the signal precision, both factors increase the distortion in inference about information quality. Third, as the signal quality increases, the probability of receiving an incorrect signal decreases. Since herding goes, on average, in the direction of the true state of the world, the less likely the realization of an incorrect signal, the less likely herding is to alter trading activity. For a sufficiently high level of the signal precision, this third effect takes over and the distortion in the inference about information quality starts decreasing.

Despite the mild impact of PDB on the price, traders can spend a large amount of time herding. In particular, when all information available is leaked, the signal precision is very high, and the level of noise trading is neither too high nor too low, the market is expected to herd 50% of the time.^{Footnote 13} Clearly, both the facts that the quality of information is high and that it is made perfectly available to all traders exacerbate the asymmetry of information between the traders and the market maker. Too much noise trading would not generate enough information to be leaked, causing the price to be sticky, whereas low levels of noise trading allow the market maker to keep up with the information in trading activity.

1.1 The role of the market maker

Modeling the market maker as an agent completely cut out of the information network needs some further discussion. The market maker can be interpreted as an arbitrary price fixing mechanism, such as a Walrasian auctioneer, or as a traditional liquidity provider. The market maker is not completely uninformed. But, because he is not directly interacting with traders second by second, in the middle of the fray, he can find himself relatively uninformed compared to speculators and brokers. Speculators or brokers have the opportunity, the ability and the incentive to share information with one another. As a result, information can leak and spread among informed traders for a variety of reasons, some voluntary and some involuntary, leaving the market maker at an informational disadvantage.

For instance, once a trade is executed there is more to gain than to lose in letting others know that “some known bank” or “a significant market participant” has just placed its bets one way or another. In fact, traders want others to do the same in order to move the market in a favorable direction (MacKenzie 2008a; Van Bommel 2003).

Another reason to voluntarily share information lies in the reciprocity of traders’ relationships with one another. Information about the latest transactions and mood witnessed on the local trading floor is repaid by granting favorable prices and volunteering further information. These links of reciprocity facilitate the flow of information and allow traders to track the market “as it is made” (Knorr Cetina and Bruegger 2002). Belonging to a business network enables a trader to gain an information edge and to profit from it before others.

Moreover, sharing information is a major part of what clients expect from their brokers. While they would not explicitly reveal the identity of market actors, “there is a grey area where euphemisms can be used [such as] ‘the usual German’ has just done something” (MacKenzie 2008b).

Information can leak involuntarily when a broker who is not a member of an exchange, or who does not have access to a trading venue, needs to pass on his order to another broker for execution. Before the order hits the market, many parties might get sight or hear rumors of it. For instance, where trading floors are still physical places, traders overhear conversations around them. Sociologists conducting field studies on trading floors witness a well-known phenomenon called the “broker’s ear.” MacKenzie (2008b) reports the following statement by a trader: “When you’re on the desk you’re expected to hear everyone else’s conversations as well, because they are all relevant to you, and if you are on the phone speaking to someone about what’s going on in the market there could be a hot piece of information coming in with one of your colleagues that you would want to tell your clients, so you’ve got to be able to hear it coming in as you’re speaking to the person.”

The information channels outlined so far might also offer a way to understand how the market maker in AZ can be completely unaware of an informational event (e.g., new management, a merger), even though enough traders to generate herding receive this information from contacts, before any public announcement has been made. In support of this, Schindler (2007) reports, based on his 2003 survey of traders in stocks, bonds, FX and commodity markets, that about 70% of traders answered that when they overhear rumors, or have other first-hand information, they spread them to a few priority people in the hope that later they will reciprocate and do the same. This way, traders build information networks that evolve in such a way that for a while those in a network know more than others in the market. This pattern has been confirmed by about 80% of the traders surveyed.

With regard to older floor-based trading, Baker (1984) studies network patterns and formation within crowds in markets for stock options. The larger the crowd, the greater the limitations on communication due to noise and physical separation. To overcome this, traders organize themselves in multiple cliques. Market makers cannot necessarily monitor the behavior in all the cliques. Baker (1984) reports the detrimental effects of large crowds on communication as described by a veteran market maker: “In really large crowds [$\cdots $] it’s noisy; you can’t hear. It happens when the stock is changing. Some people trade, and they tell others, and then lots of people are coming over. There are some aberrations sometimes.”

Finally, assuming that the market maker does not belong to any information network might be considered as a modeling artifice to understand the subset of traders who will herd. Only traders who think they are more informed than the market maker will herd. In contrast, traders who occupy a marginal position within an information network would rationally take the price as a more accurate valuation of the asset than their own. Hence, such traders would behave as in a Glosten–Milgrom market and always follow their signal. In this sense, the model does not exclude the possibility that the market maker is more informed than some traders. Only the traders who are more connected would engage in PDB. These are the traders I am interested in.

1.2 Other related literature

Other mechanisms have been exploited to generate herd and contrarian behavior in financial markets. Notably, Park and Sabourian (2011) generalize event uncertainty by pointing out that a U-shaped signal with positive or negative bias (i.e., a signal moving probability mass from moderate to extreme states with a bias toward positive or negative states, respectively) is necessary and almost sufficient to generate herd behavior. Hence, it is the shape of the signal and not the multiple layers of uncertainty that is responsible for herding. Their paper also shows that herding is possible, if not even simpler to characterize, when signals have the monotone likelihood ratio property.

In Chari and Kehoe (2004), endogenous timing in the trading decision brings the usual trade-off between investing and waiting to invest. Once this trade-off is resolved, future information will never be revealed and all traders hurry to decide independently of it.

In Lee (1998) information remains trapped because of transaction costs, which makes it unprofitable for traders to act on their private signal. While information is trapped, only traders with good news buy the asset, increasing its price. Once a trader with sufficiently high-quality information sells, all the negative information finally reaches the market making the price collapse.

In Dasgupta and Prat (2008) and Dasgupta and Prat (2006) portfolio managers ignore their private information to follow the crowd because of career concerns. Good portfolio managers receive correlated signals; thus, they prefer to ignore their private information as investors are more likely to believe that a manager is good but unlucky when he fails along with others, rather than when he fails alone.

Finally, Bose et al. (2008) model the interaction between an exogenous sequence of informed buyers and a monopolist seller who sets the price in order to learn from the buyers’ decisions. As in the literature mentioned so far, prices adjust to reflect the information revealed from past trades. But in addition, the monopolist sets prices so as to control the learning process. Initially, the seller charges high separating prices to allow the buyers to learn, while he eventually induces a purchase cascade by setting a pooling price.

2 The model

There is a countable number of risk neutral agents/traders $N=\left\{ 1,2,\ldots \right\} $ who are selected randomly and anonymously to trade with a perfectly competitive and risk neutral market maker. They trade one unit of the only asset in the economy, which can take value $V\in \left\{ 0,1\right\} $. Traders act sequentially and only once in their lifetime. Time is discrete, $t\in \left\{ 1,2,\ldots ,T\right\} $, where T is the time when the asset is liquidated and the capital gain (loss) is realized. At each t, the market maker posts a bid $B_{t}$ and an ask $A_{t} $ price at which he commits to trade. Agents, if called to trade, decide whether to buy, sell or not trade given those prices. A generic action/trade at t is denoted by $a_{t}$, and the realized price at time t by $ V_{t}^{m}$.

Agents can be of two types: they are either type I traders with probability $ \left( 1-\lambda \right) $, or they are type II traders. Type I traders are informed with probability one and receive a signal $\sigma $ about the value of the asset, while type II traders are either informed with probability $ \left( 1-\mu \right) $ or noise traders with probability $\mu $. The draws of types and the draws determining whether a type II trader is informed are independent across traders and of each other. Noise traders trade for liquidity reasons and they are assumed to buy and sell with equal probability.^{Footnote 14}

Signals can be either high (H) or low (L) and, conditional on V, they are independent. The probability that a signal reveals the true state is $p> \frac{1}{2}$, i.e., $\Pr \left\{ \sigma =H\mid V=1\right\} =\Pr \left\{ \sigma =L\mid V=0\right\} =p$, where the initial common prior is $\pi _{0}=\Pr \left\{ V=1\right\} =\frac{1}{2}$. I use the convention that informed traders receive their signal only at the moment in which they are called to trade. This is without loss of generality and it implies that, before being called to trade, traders share the same valuation for the asset.

Each period, the selected trader is assigned a type which can be observed by the other traders but not by the market maker. Hence, if a trader is observed to be of type I, he is automatically recognized to be informed, although his signal cannot be observed by anybody else but himself. If the selected trader is observed to be of type II, his fellow traders cannot distinguish whether he is informed or a noise trader. The market maker does not know either the type or the signal. He just receives the trading order and executes it, while the other traders observe the realized price.
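The sampling scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `draw_trade` and its arguments are hypothetical, and informed traders are assumed to follow their signal (i.e., no PDB is in place and "no trade" never occurs).

```python
import random

def draw_trade(V, p, lam, mu, rng):
    """Draw one trader; return (tau, action), tau = 1 for type I, 0 for type II."""
    tau = 1 if rng.random() < 1 - lam else 0        # type I with probability 1 - lambda
    informed = tau == 1 or rng.random() < 1 - mu    # type II is informed w.p. 1 - mu
    if informed:
        signal_correct = rng.random() < p           # signal reveals V w.p. p
        signal_high = (V == 1) == signal_correct    # H if correct and V=1, or wrong and V=0
        action = "buy" if signal_high else "sell"   # assumption: trader follows the signal
    else:
        action = "buy" if rng.random() < 0.5 else "sell"  # noise trader
    return tau, action
```

Fellow traders would see both elements of the returned pair, whereas the market maker would see only the action.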

The different possible transactions in every period together with all the possible bid and ask prices form the space $\varOmega =\left\{ \text {buy, sell, no trade}\right\} \times \left[ 0,1\right] ^{2}$, which is identical for all $t\in \left\{ 1,2,\ldots ,T\right\} $. The space of all possible trading sequences is $\mathcal {H}=\prod \nolimits _{t=1}^{T}\varOmega _{t}$, where $ \varOmega _{t}=\varOmega $. Call $\mathcal {F}$ the algebra on $\mathcal {H}$, and $ \left\{ \mathcal {F}_{t}\right\} $ the corresponding filtration.

Each trader’s information is composed of three parts: the trading history, the vector of types of those who traded before him, and his private signal $ \sigma ^{i}$. Formally, indicate with $\tau _{t}$ a random variable that takes value 1 if the trader at t is of type I and 0 if the trader at t is of type II. Call $\mathcal {N}=\varPi _{t=1}^{T}\left\{ 1,0\right\} _{t}$ the history of types and $\left\{ \mathcal {T}_{t}\right\} $ the corresponding filtration. Then, each trader’s information structure at time t is represented by the filtration

$$\begin{aligned} \left\{ \mathcal {I}_{t}^{i}\right\} =\left\{ \mathcal {F}_{t},\mathcal {T}_{t},\sigma ^{i}\right\} . \end{aligned}$$

Call $\pi _{t}^{i}$ the posterior probability that, at time t, agent i assigns to the event that the true value of the asset is 1, formally:

$$\begin{aligned} \pi _{t}^{i}= & {} \Pr \left( V=1\mid a_{t},\mathcal {I}_{t}^{i}\right) \\= & {} \frac{\Pr \left( a_{t}\mid V=1,\mathcal {F}_{t},\mathcal {T}_{t},\sigma ^{i}\right) \pi _{t-1}^{i}}{\Pr \left( a_{t}\mid V=1,\mathcal {F}_{t}, \mathcal {T}_{t},\sigma ^{i}\right) \pi _{t-1}^{i}+\Pr \left( a_{t}\mid V=0, \mathcal {F}_{t},\mathcal {T}_{t},\sigma ^{i}\right) \left( 1-\pi _{t-1}^{i}\right) }, \end{aligned}$$

where $\sigma ^{i}=\varnothing $ if i is not trading at t. Correspondingly, the traders’ valuation of the asset is

$$\begin{aligned} V_{t}^{i}=E\left[ V\mid a_{t},\mathcal {I}_{t}^{i}\right] =\pi _{t}^{i} \end{aligned}$$
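As a rough illustration of the recursion above, the following Python sketch applies Bayes' rule one observation at a time. The function names are hypothetical, and the likelihoods used for a type I trade assume that no PDB is in place, so that a type I buy reveals a high signal and a type I sell a low one.

```python
def bayes_update(prior, like_v1, like_v0):
    """Posterior Pr(V=1) given Pr(obs | V=1) = like_v1 and Pr(obs | V=0) = like_v0."""
    num = like_v1 * prior
    return num / (num + like_v0 * (1 - prior))

def observe_type_one(prior, action, p):
    """Fellow traders' update after a type I trade (assumes no PDB)."""
    if action == "buy":                       # a buy reveals a high signal
        return bayes_update(prior, p, 1 - p)
    return bayes_update(prior, 1 - p, p)      # a sell reveals a low signal

def with_own_signal(public_belief, signal, p):
    """Valuation of the trader called to trade, after adding the private signal."""
    if signal == "H":
        return bayes_update(public_belief, p, 1 - p)
    return bayes_update(public_belief, 1 - p, p)
```

Starting from the common prior of one half, a type I buy followed by a type I sell returns the belief to one half, since the two revealed signals offset each other.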

It is understood that the optimal decision for an informed trader i called to trade at time t is

$$\begin{aligned} \text {buy if }\quad V_{t}^{i}> & {} A_{t}, \end{aligned}$$

(1)

$$\begin{aligned} \text {sell if }\quad V_{t}^{i}< & {} B_{t}. \end{aligned}$$

(2)

Call $\pi _{t}^{m}$ the probability that the market maker assigns to the event that $V=1$ given the trading history at the end of time t:

$$\begin{aligned} \pi _{t}^{m}= & {} \Pr \left( V=1\mid a_{t},\mathcal {F}_{t}\right) \nonumber \\= & {} \frac{\Pr \left( a_{t}\mid \mathcal {F}_{t},V=1\right) \pi _{t-1}^{m}}{\Pr \left( a_{t}\mid \mathcal {F}_{t},V=1\right) \pi _{t-1}^{m}+\Pr \left( a_{t}\mid \mathcal {F}_{t},V=0\right) \left( 1-\pi _{t-1}^{m}\right) }. \end{aligned}$$

(3)

Correspondingly, the market maker’s valuation of the asset is:

$$\begin{aligned} V_{t}^{m}=E\left[ V\mid a_{t},\mathcal {F}_{t}\right] =\pi _{t}^{m}\text {.} \end{aligned}$$

In setting the price at the beginning of time t, the market maker does not know whether he will be facing a buy or a sell order. Conditional on a buy or a sell, he posts an ask and a bid price, respectively, so that the zero profit condition is satisfied:

$$\begin{aligned} A_{t}= & {} E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] =E\left[ V\mid \mathcal {F}_{t},V_{t}^{i}>A_{t}\right] \end{aligned}$$

(4)

$$\begin{aligned} B_{t}= & {} E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {sell}\right] =E\left[ V\mid \mathcal {F}_{t},V_{t}^{i}<B_{t}\right] . \end{aligned}$$

(5)
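For intuition, conditions (4) and (5) can be evaluated in closed form when no PDB is possible. The sketch below is an illustration rather than the general pricing rule: it assumes that every informed trader follows his signal and that every selected trader trades, so that the probability of a sell is one minus the probability of a buy. The function name `quotes` is hypothetical.

```python
def quotes(pi_m, p, lam, mu):
    """Zero-profit (bid, ask) given the market maker's belief pi_m, assuming no PDB."""
    # The market maker cannot see types, so he pools type I and type II traders.
    buy_v1 = (1 - lam) * p + lam * (mu / 2 + (1 - mu) * p)
    buy_v0 = (1 - lam) * (1 - p) + lam * (mu / 2 + (1 - mu) * (1 - p))
    ask = pi_m * buy_v1 / (pi_m * buy_v1 + (1 - pi_m) * buy_v0)
    sell_v1, sell_v0 = 1 - buy_v1, 1 - buy_v0
    bid = pi_m * sell_v1 / (pi_m * sell_v1 + (1 - pi_m) * sell_v0)
    return bid, ask
```

Because noise dilutes the information in an order, the quotes straddle the market maker's belief more tightly than the valuations of traders holding a signal of precision p.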

Due to the presence of a price mechanism, I adopt the same definition of herding and contrarianism as Avery and Zemsky (1998) and Park and Sabourian (2011). Roughly, an agent is herding if he disregards his private signal to trade in the direction of the market, while an agent engages in contrarian behavior if he disregards his private signal to trade against the market. Following Avery and Zemsky (1998) and Park and Sabourian (2011), the definitions are given abstracting from bid and ask prices.

Definition 1

(Herding–Contrarianism) A trader with signal $\sigma ^{i}$ engages in herd behavior at time t if he buys when $V_{1}^{i}\left( \sigma ^{i}\right)<V_{1}^{m}<V_{t}^{m}$ or if he sells when $V_{1}^{i}\left( \sigma ^{i}\right)>V_{1}^{m}>V_{t}^{m}$; and buying (or selling) is strictly preferred to other actions.

A trader with signal $\sigma ^{i}$ engages in contrarian behavior at time t if he buys when $V_{1}^{i}\left( \sigma ^{i}\right) <V_{1}^{m}$ and $ V_{t}^{m}<V_{1}^{m}$, or if he sells when $V_{1}^{i}\left( \sigma ^{i}\right) >V_{1}^{m}$ and $V_{t}^{m}>V_{1}^{m}$; and buying (or selling) is strictly preferred to other actions.
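Definition 1 can be read as a simple classification rule. The Python sketch below is one hypothetical encoding of it: the function name `classify` and its argument names are illustrative, and the caller is assumed to have already checked that the action is strictly preferred to the alternatives.

```python
def classify(action, v1_signal, v1_market, vt_market):
    """Label a strictly preferred trade at t according to Definition 1.

    v1_signal: trader's valuation at t = 1 given only his signal
    v1_market: market maker's valuation at t = 1
    vt_market: market maker's valuation at time t
    """
    if action == "buy" and v1_signal < v1_market:
        if vt_market > v1_market:
            return "herding"        # buys with the market despite a low initial valuation
        if vt_market < v1_market:
            return "contrarian"     # buys against a falling market
    if action == "sell" and v1_signal > v1_market:
        if vt_market < v1_market:
            return "herding"
        if vt_market > v1_market:
            return "contrarian"
    return "neither"
```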

As AZ point out, for herd buying to occur three things need to happen. First, without observing any trading history, the trader sells at $t=1$. Second, the history of trades must be positive. Third, despite the increase in price, the trader must be willing to buy after having observed the trading history. Herding can be interpreted as a situation where the price has not moved as much as the trader’s valuation after observing a positive trading history. Correspondingly, for contrarian buying to occur three things need to happen. First, the trader sells at $t=1$. Second, the trading history must lead to a decrease in the price. Third, the trader must be willing to buy after observing the trading history. Contrarianism is the consequence of the price reacting too much to the trading history compared to the traders’ valuation. In general, PDB, whether this is herding or contrarianism, is triggered when traders who would have followed their signal at $t=1$ disregard it after observing the trading history.

2.1 Traders’ updating and market maker’s pricing rule

Traders’ updating

As informed traders receive a signal only when called to trade, before being called all share the same valuation of the asset. In particular, all agree on whether the conditions for PDB are met. Since they differ only by their signal, if one of them engages in PDB at t, all of them would. It follows that a type I trader engaging in PDB at t does not release any information to the other traders. Similarly, if the conditions for PDB are in place and a type II trader is observed, his actions are uninformative as well because he is either engaging in PDB or he is a noise trader. In both cases, $V_{t}^{i}=V_{t-1}^{i}$ for every i.

Consider a time t when the conditions for PDB are not met. The traders’ valuation is determined by the trading history through the difference between the number of high signals and the number of low signals implicitly “observed” through type I trades, and by the difference between the number of type II buys and the number of type II sells observed up to time t. Formally, let $h_{t}$ and $l_{t}$ be the number of type I buys and sells, and $b_{t}^{i}$ and $s_{t}^{i}$ be the number of type II buys and sells observed up to time t, respectively. Then, for every i,

$$\begin{aligned} V_{t}^{i}=\frac{p^{h_{t}-l_{t}}\left[ \frac{\mu }{2}+\left( 1-\mu \right) p \right] ^{b_{t}^{i}-s_{t}^{i}}}{p^{h_{t}-l_{t}}\left[ \frac{\mu }{2}+\left( 1-\mu \right) p\right] ^{b_{t}^{i}-s_{t}^{i}}+\left( 1-p\right) ^{h_{t}-l_{t}}\left[ \frac{\mu }{2}+\left( 1-\mu \right) \left( 1-p\right) \right] ^{b_{t}^{i}-s_{t}^{i}}}. \end{aligned}$$

(6)
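Equation (6) translates directly into code. The sketch below is a plain transcription under the same conditions as the formula (common prior of one half, no PDB so far); the function name `trader_valuation` and the shorthand `q` for the probability that a type II trader buys when $V=1$ are notation introduced here for illustration.

```python
def trader_valuation(h, l, b, s, p, mu):
    """Common trader valuation from Eq. (6).

    h, l: number of type I buys and sells observed so far
    b, s: number of type II buys and sells observed so far
    """
    q = mu / 2 + (1 - mu) * p                       # Pr(type II buys | V=1); 1 - q if V=0
    up = p ** (h - l) * q ** (b - s)                # weight on V = 1
    down = (1 - p) ** (h - l) * (1 - q) ** (b - s)  # weight on V = 0
    return up / (up + down)
```

Only the differences $h_{t}-l_{t}$ and $b_{t}^{i}-s_{t}^{i}$ matter, so a buy and a sell by traders of the same type cancel out.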

Market maker’s pricing rule

Given the trading history, the market maker fixes bid and ask prices conditional on the traders’ strategies, as formalized by (4) and (5). When the conditions for PDB are not met, the strategy of an informed trader is to buy upon the reception of a high signal and to sell upon the reception of a low signal. If the conditions for PDB are in place, the market maker needs to account for the fact that an informed trader buys or sells regardless of his signal.

I focus the analysis on the ask price as the bid price is determined similarly. First I derive the pricing rule in an example to illustrate the ideas before providing the general rule. In the example I show how the market maker computes the expected value of the asset in the eventuality that traders might be herding as early as $t=3$. I then generalize the example for the case in which path-dependent buying occurs for the first time at t. Finally, I show that this expected value constitutes a rational expectations price.

Example Consider a trading history $\mathcal {F}_{3}=\left\{ \left( B,V_{1}^{m}\right) ,\left( B,V_{2}^{m}\right) \right\} $, and suppose that up until $t=3$ no possibility of PDB has arisen. Then, $\mathcal {F}_{3}$ is compatible with four “type histories” $\left\{ \mathcal {G}_{3}^{j}\right\} _{j=1}^{4}$:

$$\begin{aligned} \mathcal {F}_{3}=\left( B,B\right) \;\rightarrow \;\left\{ \mathcal {G}_{3}^{j}\right\} _{j=1}^{4}=\left\{ \begin{array}{ll} \left( \left( B,\tau =1\right) ,\left( B,\tau =1\right) \right) &{} \leftarrow \mathcal {G}_{3}^{1} \\ \left( \left( B,\tau =1\right) ,\left( B,\tau =0\right) \right) &{} \leftarrow \mathcal {G}_{3}^{2} \\ \left( \left( B,\tau =0\right) ,\left( B,\tau =1\right) \right) &{} \leftarrow \mathcal {G}_{3}^{3} \\ \left( \left( B,\tau =0\right) ,\left( B,\tau =0\right) \right) &{} \leftarrow \mathcal {G}_{3}^{4} \end{array} \right. \end{aligned}$$

Indicate with $E\left[ V\mid \mathcal {G}_{3}^{j}\right] $ the valuation of a trader who has observed $\mathcal {G}_{3}^{j}$. Suppose that p, $\lambda $ and $\mu $ are such that a trader observing $\mathcal {G}_{3}^{1}$ is going to herd if the market maker sets the ask price at $t=3$ conditioning on the fact that only noise traders and traders with a high signal are buying. Then, such a price cannot be the equilibrium price. Facing a buy order at $t=3$, the market maker needs to consider the following scenarios:

$$\begin{aligned} \left( B,B,B\right) \rightarrow \left\{ \begin{array}{c} \mathcal {G}_{3}^{1}\left. \begin{array}{c} \nearrow \\ \rightarrow \\ \searrow \end{array} \right. \left. \begin{array}{c} L \\ H \\ noise \end{array} \right. \\ \mathcal {G}_{3}^{2},\mathcal {G}_{3}^{3},\mathcal {G}_{3}^{4}\left. \begin{array}{c} \nearrow \\ \searrow \end{array} \right. \left. \begin{array}{c} H \\ noise \end{array} \right. \end{array} \right. \ \end{aligned}$$

The market maker needs to compute $E\left[ V\mid \mathcal {F}_{3},a_{3}=\text { buy}\right] $ allowing for the possibility that a trader on $\mathcal {G}_{3}^{1}$ buys with a low signal. Then, if this expected value is still smaller than $E\left[ V\mid \mathcal {G}_{3}^{1},\sigma ^{3}=H\right] $, he can set $A_{3}=E\left[ V\mid \mathcal {F}_{3},a_{3}=\text {buy}\right] $ as the rational expectations price. $\square $

To generalize the previous example to the case where PDB occurs for the first time at a generic t, define

$$\begin{aligned} \Sigma =\Big \{ \left( B,\tau =0\right) ,\left( S,\tau =0\right) ,\left( B,\tau =1\right) ,\left( S,\tau =1\right) \Big \} \end{aligned}$$

to be the type space. Moreover, define the set of all possible T-dimensional vectors of type sequences as $\Sigma ^{T}=\prod \nolimits _{t=1}^{T}\Sigma _{t}$, where $\Sigma _{t}=\Sigma $ for every t. Call $\mathcal {G}$ the algebra on $\Sigma ^{T}$ and $\left\{ \mathcal {G}_{t}\right\} $ its generic filtration.

At time t, traders observe a type history $\mathcal {G}_{t}^{j}$ and compute $E\left[ V\mid \mathcal {G}_{t}^{j}\right] $. The market maker observes the trading history $\mathcal {F}_{t}$ which, without any previous possibility of PDB, is compatible with $2^{t}$ type histories $\left\{ \mathcal {G}_{t}^{j}\right\} _{j=1}^{2^{t}}$, as illustrated below:

$$\begin{aligned} \begin{array}{c} \\ \mathcal {F}_{t} \\ \\ \end{array} \rightarrow \,\, \quad \left\{ \begin{array}{c} \mathcal {G}_{t}^{1} \\ \mathcal {G}_{t}^{2} \\ \vdots \\ \mathcal {G}_{t}^{2^{t}} \end{array} \right. \left. \begin{array}{c} \rightarrow E\left[ V\mid \mathcal {G}_{t}^{1}\right] \\ \rightarrow E\left[ V\mid \mathcal {G}_{t}^{2}\right] \\ \vdots \\ \rightarrow E\left[ V\mid \mathcal {G}_{t}^{2^{t}}\right] \end{array} \right. \end{aligned}$$
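The mapping from one observed trading history to its compatible type histories can be sketched in a few lines of Python. This is only an illustration: the function name `type_histories` and the encoding of trades as the strings `"B"`/`"S"` are assumptions made here, and the count is expressed in terms of the number n of observed trades, each of which may be type I ($\tau =1$) or type II ($\tau =0$).

```python
from itertools import product


def type_histories(trades):
    """Return every type history compatible with an observed trade sequence.

    trades: sequence of "B"/"S" actions as seen by the market maker.
    Each trade may be type I (tau = 1) or type II (tau = 0), so n observed
    trades are compatible with 2**n type histories.
    """
    # product((1, 0), ...) lists the all-type-I assignment first,
    # matching the ordering G^1, ..., G^4 used in the example.
    return [tuple(zip(trades, taus))
            for taus in product((1, 0), repeat=len(trades))]


# The two-buy history from the example yields four type histories.
histories = type_histories(("B", "B"))
for j, g in enumerate(histories, start=1):
    print(f"G^{j}:", g)
```

The first element returned is the history in which every trade is type I, which for an all-buy trading history is the path $\mathcal {G}_{t}^{j^{*}}$ with the highest trader valuation.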

The traders’ valuation along the $\mathcal {G}_{t}^{j}$ paths depends on the number and the type of buys and sells as specified in (6). In particular, the higher the number of type I buys and the number of type II sells, the larger the difference $E\left[ V\mid \mathcal {G}_{t}^{j}\right] -E \left[ V\mid \mathcal {F}_{t}\right] $ and the easier for path-dependent buying to occur. For any trading history, without any possibility of PDB in the past, there is a unique $\mathcal {G}_{t}^{j^{*}}$ such that $E \left[ V\mid \mathcal {G}_{t}^{j^{*}}\right] =\max _{j}E\left[ V\mid \mathcal {G}_{t}^{j}\right] $. Along $\mathcal {G}_{t}^{j^{*}}$, all buys are of type I and all sells are of type II. Indicate with $b_{t}^{m}$ the total number of buy orders and with $s_{t}^{m}$ the total number of sell orders observed at the end of time t. If path-dependent buying occurs for the first time at t, it must be happening on $\mathcal {G}_{t}^{j^{*}}$ . If $b_{t}^{m}>s_{t}^{m}$ then it is herd buying, while if $ b_{t}^{m}<s_{t}^{m}$ it is contrarian buying. Hence, when computing the ask price for the case of first-time path-dependent buying, the market maker only needs to consider this unique type path, whose associated value I indicate with $V_{t-1}^{B}=E\left[ V\mid \mathcal {G}_{t}^{j^{*}}\right] $.

The probability of a single type history is the probability of r successes in t Bernoulli trials, where r is the number of type I trades in the history. Formally,

$$\begin{aligned} \Pr \left( \mathcal {G}_{t}^{j}\mid \mathcal {F}_{t}\right) =\left( {\begin{array}{c}t\\ r\end{array}}\right) \left[ \Pr \left( \tau =0\right) \right] ^{t-r}\left[ \Pr \left( \tau =1\right) \right] ^{r}\text {.} \end{aligned}$$

For each time t, define the probability of the type history corresponding to the highest valuation $V_{t-1}^{B}$ as

$$\begin{aligned} \eta _{t}^{B}=\Pr \left( \mathcal {G}_{t}^{j^{*}}\mid \mathcal {F} _{t}\right) =\left( {\begin{array}{c}t\\ b_{t}\end{array}}\right) \lambda ^{t-b_{t}}\left( 1-\lambda \right) ^{b_{t}}\text {,} \end{aligned}$$
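For a numerical feel, the displayed expression for $\eta _{t}^{B}$ can be evaluated directly. The sketch below simply transcribes the formula as printed; the function name `eta_buy` and the argument names are illustrative choices, not notation from the paper.

```python
from math import comb


def eta_buy(t, b, lam):
    """Evaluate eta_t^B = C(t, b) * lam**(t - b) * (1 - lam)**b as displayed.

    lam is the probability of a type II trade (tau = 0), so 1 - lam is the
    probability of a type I trade (tau = 1); b is the number of buys.
    """
    if not 0 <= b <= t:
        raise ValueError("need 0 <= b <= t")
    return comb(t, b) * lam ** (t - b) * (1 - lam) ** b


print(eta_buy(3, 2, 0.5))
```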

Appendix A.1 shows that, given a trading history $\mathcal {F}_{t} $ such that no PDB could have occurred before time t and such that on the type path $\mathcal {G}_{t}^{j^{*}}$ an informed trader buys regardless of his signal, the following expected value constitutes a rational expectations equilibrium price:

$$\begin{aligned} A_{t}= & {} E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] \nonumber \\= & {} \frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}+\eta _{t}^{B}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{B}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] +\eta _{t}^{B}\left( 1-\lambda \mu \right) \left[ \left( 1-p\right) V_{t-1}^{B}+p\left( 1-V_{t-1}^{B}\right) \right] }.\nonumber \\ \end{aligned}$$

(7)

If the probability of PDB $\eta _{t}^{B}$ is zero, (7) reduces to the expected value of the asset given that only noise traders or informed traders with a high signal are going to buy at t. If $\eta _{t}^{B}>0$, the market maker needs to consider that on the type path $\mathcal {G}_{t}^{j^{*}}$ an informed trader buys with a low signal. Conditional on $ V=1$, the probability of an informed trader with a low signal is $\left( 1-\lambda \mu \right) \left( 1-p\right) $ and $V_{t-1}^{B}$ is the valuation of a trader who has observed $\mathcal {G}_{t}^{j^{*}}$.
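The pricing rule (7) is straightforward to evaluate numerically. The following is a minimal sketch that transcribes (7) term by term; the function name `ask_price`, the argument names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
def ask_price(v_m, v_b, eta, p, lam, mu):
    """Ask price from equation (7).

    v_m: market maker's valuation V_{t-1}^m
    v_b: trader valuation V_{t-1}^B on the highest-valuation type path
    eta: probability eta_t^B of that type path
    p, lam, mu: signal precision, probability of a type II trade, and
                probability of noise conditional on type II
    """
    noise = lam * mu            # overall probability of a noise trader
    informed = 1.0 - noise      # overall probability of an informed trader
    num = (noise / 2 + informed * p) * v_m + eta * informed * (1 - p) * v_b
    den = (noise / 2
           + informed * (p * v_m + (1 - p) * (1 - v_m))
           + eta * informed * ((1 - p) * v_b + p * (1 - v_b)))
    return num / den


# Hypothetical values: compare a period without PDB (eta = 0) to one with PDB.
a_normal = ask_price(v_m=0.6, v_b=0.9, eta=0.0, p=0.7, lam=0.5, mu=0.5)
a_pdb = ask_price(v_m=0.6, v_b=0.9, eta=0.1, p=0.7, lam=0.5, mu=0.5)
print(a_normal, a_pdb)
```

With `eta=0.0` the function collapses to the standard Bayesian update after a buy order, and for these illustrative inputs the ask with `eta=0.1` is slightly higher, in line with the comparison discussed in the text.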

In Appendix A.1 I show that the ask price in periods of PDB, i.e., when $\eta _{t}^{B}>0$, is always higher than the ask price in periods without PDB, i.e., when $\eta _{t}^{B}=0$. The reason is that, although the market maker is factoring in the possibility of an informed trader buying with a low signal, this trader has a very high asset valuation along the type path $\mathcal {G}_{t}^{j^{*}}$, namely $V_{t-1}^{B}$. This also implies that the bid-ask spread in periods of PDB is larger than in “normal” periods of trading, confirming the observation that volatility is higher when traders herd. By setting a larger spread the market maker recoups the losses incurred due to the adverse selection caused by information leakage.

3 Herding and contrarian behavior

This section establishes the conditions for PDB to occur for the first time at t. Typically, the reasoning used to prove the existence of herd behavior invokes the activity of noise traders: in finite time any trading history has positive probability; noise trading can always generate trading histories compatible with herding, hence the existence of herding. In my model, type realizations are as important for herding as the trading history they are compatible with. In fact, noise can generate trading histories whose compatible type histories meet the conditions for herding, but the very fact of assuming noise to generate those histories rules out herding. Herd buying (selling) is a consequence of the under-reaction of the market maker’s price to a long enough history of type I buys (sells). Noise can lead to contrarianism because during the realization of type II trades the price moves too much for the amount of information actually present in the market. Contrarian buying (selling) is a consequence of the excess of information the market maker assigns to a sequence of type II sells (buys).

Recall that p, $\lambda $ and $\mu $, respectively, are the signal precision, the probability of type II trades and the probability of being a noise trader conditional on being type II. Further, recall that $h_t$ and $l_t$ are the number of high and low signals implicitly observed by the traders along the type history, and that $b^i_t$ and $s^i_t$ are the number of type II buys and sells. Also, $b^m_t$ and $s^m_t$ are the number of buys and sells (of both types) observed by the market maker along the trading history up to time t.

The proof of the existence of PDB is achieved in two steps. First, Theorem 1 establishes conditions on p, $\lambda $ and $\mu $ involving both the trading history and the compatible type realizations under which herd and contrarian buying occur with probability one. Then, Theorem 2 shows that trading histories satisfying the conditions for PDB exist with positive probability.

Theorem 1

(Conditions for path-dependent behavior) Consider a trading history $\mathcal {F}_{t}$ such that no path-dependent behavior could have occurred before t. Then, for every $\lambda \in \left( 0,1\right) $ and $\mu \in \left( 0,1\right] $, and for every $\mathcal {G}^{j}_{t}$ compatible with $\mathcal {F}_{t}$ such that $h_{t-1}-l_{t-1}=1$ and $b_{t-1}^{i}-s_{t-1}^{i}=0$ do not occur simultaneously at $t-1$:

1. If $h_{t-1}-l_{t-1}<\frac{2}{\lambda \mu }-1+\left( b_{t-1}^{i}-s_{t-1}^{i}\right) \left( \frac{1}{\lambda }-1\right) $ there exists a cutoff level $\frac{1}{2}<p_t^{*}\left( \lambda ,\mu \right) <1$ such that path-dependent buying occurs at t if and only if $1>p>p_t^{*}\left( \lambda ,\mu \right) $;
2. If $h_{t-1}-l_{t-1}\ge \frac{2}{\lambda \mu }-1+\left( b_{t-1}^{i}-s_{t-1}^{i}\right) \left( \frac{1}{\lambda }-1\right) $ path-dependent buying occurs at t for every value of $p\in \left( \frac{1}{ 2},1\right) $.

Moreover,
3. $\frac{\partial p_t^{*}}{\partial \lambda }<0$ if and only if $ b_{t}^{m}>s_{t}^{m}$;
4. $\frac{\partial p_t^{*}}{\partial \mu }<0$ for any $\mathcal {F} _{t}$ leading to path-dependent behavior.

Proof

See Appendix A.2. $\square $

The theorem states that for any given type history $\mathcal {G}^{j}_{t}$, there exists a threshold $ p_t^{*}\left( \lambda ,\mu \right) $ above which path-dependent buying takes place for the first time at t. The larger p, the larger the effect of the asymmetry of information between the traders and the market maker for any level of noise and type composition. If type I trades contain relatively more information for the traders than for the market maker and type II trades are assigned relatively more information by the market maker than by the traders, then a long enough sequence of type I buys or type II sells is going to make the difference in beliefs $\pi _{t-1}^{i}-\pi _{t-1}$, and hence in valuations $V_{t-1}^{i}-V_{t-1}^{m}$, increase more the higher the signal precision.

When the precision $p=1$, the absolute difference in valuations between the traders and the market maker is always (weakly) positive and all informed traders act alike, not because they are herding but because they all receive the same, perfectly revealing information. In contrast, when $p=\frac{1}{2}$, the signal is completely uninformative and traders do not learn anything from it or by observing the other market participants’ behavior. The price remains equal to its initial value $\pi _{0}$ and no learning takes place.

It is important to notice that, in equilibrium, the case considered in point (2) of Theorem 1 does not occur. If the event in point (2) occurred in equilibrium, the threshold $p_t^{*}\left( \lambda ,\mu \right) $ would already have decreased to $\frac{1}{2}$. As the traders’ signal precision p is strictly greater than $\frac{1}{2}$, the type history leading to the condition in point (2) would have satisfied the condition in point (1) in a prior period, triggering PDB. To illustrate, consider a type history $\mathcal {G}_{t}$ such that the inequality in point (1) holds. Corresponding to this history there exists a cutoff $p_t^{*}\left( \lambda ,\mu \right) $, such that PDB occurs if and only if the signal precision is larger than the cutoff. Hence, if the signal precision is smaller than $p_t^{*}\left( \lambda ,\mu \right) $, PDB does not occur at t. Then, for the cutoff $p_t^{*}\left( \lambda ,\mu \right) $ to decrease below the signal precision p, the type history must evolve so that the inequality of point (1) holds less tightly. As the type history unfolds in such a direction, eventually, the cutoff falls below the signal precision. Once this happens, PDB is triggered.

Define $\varGamma =\frac{1-\lambda }{1-\lambda \mu }$ to be the rate at which information is “leaked,” i.e., the probability that a trader is of type I given that he is informed. The numerator is equal to the probability that an informed trader is known by the rest of the market to be informed, while the denominator is the overall probability of informed trading. When $\lambda =0$ there are no type II traders and all information is leaked. There is no asymmetry of information because the market maker knows that all the traders are observed to be informed. By increasing $\lambda $, the asymmetry of information initially increases and then it decreases to a situation where there are only type II traders and no information is leaked. Notice that $\varGamma =1$ also when the probability of noise conditional on observing a type II trader is $\mu =1$. All information is leaked; however, traders retain an informational advantage over the market maker in detecting noise traders, making PDB possible.
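The boundary cases of the leakage rate described above are easy to verify numerically. The snippet below is a small sketch; the function name `leakage_rate` and the chosen parameter values are illustrative only.

```python
def leakage_rate(lam, mu):
    """Leakage rate Gamma = (1 - lam) / (1 - lam * mu).

    lam: probability of a type II trade
    mu:  probability of a noise trader conditional on type II
    """
    informed = 1.0 - lam * mu   # overall probability of informed trading
    if informed == 0.0:
        # lam = mu = 1: every trader is noise, so Gamma is undefined.
        raise ValueError("Gamma is undefined when lam * mu = 1")
    return (1.0 - lam) / informed


print(leakage_rate(0.0, 0.3))   # no type II traders: everything is leaked
print(leakage_rate(0.5, 1.0))   # all type II traders are noise
print(leakage_rate(1.0, 0.5))   # only type II traders: nothing is leaked
print(leakage_rate(0.5, 0.5))   # an interior case
```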

Increasing either $\lambda $ or $\mu $ leads to an increase in the overall level of noise. However, by increasing $\mu $, although the total amount of information decreases, more of it is leaked, increasing the asymmetry of information between traders and the market maker monotonically. It follows that the effect of an increase in the conditional noise $\mu $ is always beneficial to the occurrence of PDB. In case of herding, it dampens the market maker’s price adjustment to type I trading. In case of contrarianism, it makes the price too reactive to type II trading. Hence, an increase in $\mu $ enlarges the set of values of the signal precision leading to PDB.

When the condition for path-dependent buying is consistent with herding, namely when $b_{t}^{m}>s_{t}^{m}$, the cutoff $p_t^{*}$ above which a trader buys with a low signal decreases with $\lambda $ or, equivalently, it increases with the leakage rate $\varGamma $. This is because herding needs the price to be sticky. A larger leakage rate makes the price more reactive to the traders’ activity. Hence, for a given realization of the type history $\mathcal {G}_{t}$ satisfying the conditions of Theorem 1, if the price is more reactive due to an increase in the leakage rate, the signal threshold $p_t^{*}$ increases. Lower signal precisions can no longer generate enough asymmetry of information in the interpretation of the trading history for herding to occur.

When $b_{t}^{m}<s_{t}^{m}$, the conditions of Theorem 1 are consistent with contrarian buying. Contrarianism needs the price to be “too reactive” to the realization of the type history. An increase in the leakage rate makes the price more reactive, which results in a decrease in the threshold $p_t^{*}$ for a decrease in $\lambda $. In fact, when the price reacts more to a given realization of the type history $\mathcal {G}_{t}$, even lower levels of the signal precision are compatible with contrarian buying.

Next, I show that PDB occurs with positive probability except when $p=\frac{1}{2}$ or 1. The following notation is introduced for the log-likelihood ratios of various events:

$$\begin{aligned} L= & {} \log \frac{\Pr \left( a_{t}=\text {buy}\mid V=1\text {, }\tau _{t}=1\right) }{\Pr \left( a_{t}=\text {buy}\mid V=0\text {, }\tau _{t}=1\right) }, \\ L^{\mu }= & {} \log \frac{\Pr \left( a_{t}=\text {buy}\mid V=1\text {, }\tau _{t}=0\right) }{\Pr \left( a_{t}=\text {buy}\mid V=0\text {, }\tau _{t}=0\right) }, \\ L^{\lambda }= & {} \log \frac{\Pr \left( a_{t}=\text {buy}\mid V=1\right) }{\Pr \left( a_{t}=\text {buy}\mid V=0\right) }, \end{aligned}$$

where $L\ge L^{\lambda }\ge L^{\mu }$ for every p, $\lambda $ and $\mu $.
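The three log-likelihood ratios can be computed from p, $\lambda $ and $\mu $ under the following reading of the model, which is an assumption of this sketch rather than a formula quoted from the text: a type I trader ($\tau =1$) is informed for sure, a type II trader ($\tau =0$) is a noise trader with probability $\mu $, a noise trader buys with probability $\frac{1}{2}$, and an informed trader buys when his signal is high. The helper name `llr` is hypothetical.

```python
from math import log


def llr(p, noise):
    """Log-likelihood ratio of a buy order when a fraction `noise` of the
    relevant traders buys with probability 1/2 and the remaining traders
    follow a signal of precision p (with 1/2 < p < 1)."""
    buy_if_high = noise / 2 + (1 - noise) * p        # Pr(buy | V = 1)
    buy_if_low = noise / 2 + (1 - noise) * (1 - p)   # Pr(buy | V = 0)
    return log(buy_if_high / buy_if_low)


def log_likelihood_ratios(p, lam, mu):
    """Return (L, L_lambda, L_mu) for type I trades, all trades pooled,
    and type II trades, respectively."""
    return llr(p, 0.0), llr(p, lam * mu), llr(p, mu)


L, L_lam, L_mu = log_likelihood_ratios(p=0.7, lam=0.5, mu=0.5)
print(L, L_lam, L_mu)
```

Because the share of noise is 0, $\lambda \mu $ and $\mu $ in the three cases, the ordering $L\ge L^{\lambda }\ge L^{\mu }$ follows from the ratio being decreasing in the noise share.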

Theorem 2

(Existence of path-dependent behavior) For $p\in \left\{ \frac{1}{2},1\right\} $ path-dependent behavior does not occur for any trading history $\mathcal {F}_{t}$ and any $\lambda ,\mu \in \left[ 0,1\right] $. For $p\in \left( \frac{1}{2},1\right) $, $0<\lambda <1$ and $\mu \ne 0,$ path-dependent behavior occurs with positive probability. In particular, path-dependent buying occurs with positive probability for the first time at t whenever $\mathcal {F}_{t}$ is such that

$$\begin{aligned} b_{t-1}^{m}\ge \frac{L+L^{\lambda }}{L-L^{\lambda }}-s_{t-1}^{m}\frac{ L^{\lambda }-L^{\mu }}{L-L^{\lambda }}. \end{aligned}$$

(8)

Proof

See Appendix A.3. $\square $

For an intuition of Theorem 2, consider the case where the market opens with a series of $b_{t-1}^{m}$ type I buys in the first $t-1$ periods. Then, the number of high signals $h_{t-1}=b_{t-1}^{m}$ and if at t an individual with a low signal is called to trade, he will herd at t for given values of p, $\lambda $ and $\mu $ if $b_{t-1}^{m}\ge \left[ L^{\lambda }+L\right] /\left[ L-L^{\lambda }\right] $ since $s^m_{t-1}=0$. Suppose, instead, that the market opens with a sequence of $ s_{t-1}^{m}$ type II sells in the first $t-1$ periods. Then, $b_{t-1}^{m}=0$ and an individual with a low signal who is called to trade at t engages in contrarian buying if $s_{t-1}^{m}\ge \left[ L+L^{\lambda }\right] /\left[ L^{\lambda }-L^{\mu }\right] $. In the extreme cases where $\lambda =0$ and $\lambda =1$, or where $\mu =0$, there is no asymmetric information between the traders and the market maker, making PDB impossible.
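The two opening sequences described above translate into minimum run lengths that follow from condition (8). The sketch below assumes the same expressions for $L$, $L^{\lambda }$ and $L^{\mu }$ as implied by a market in which noise traders buy with probability $\frac{1}{2}$ (an assumption of the sketch); the function names and the parameter values are illustrative.

```python
from math import ceil, log


def llr(p, noise):
    """Log-likelihood ratio of a buy with noise share `noise` (assumed form)."""
    return log((noise / 2 + (1 - noise) * p)
               / (noise / 2 + (1 - noise) * (1 - p)))


def rhs_condition_8(p, lam, mu, sells=0):
    """Right-hand side of (8): the lower bound on the number of buys."""
    L, L_lam, L_mu = llr(p, 0.0), llr(p, lam * mu), llr(p, mu)
    return (L + L_lam) / (L - L_lam) - sells * (L_lam - L_mu) / (L - L_lam)


def pdb_run_lengths(p, lam, mu):
    """Smallest opening run of type I buys that triggers herd buying and
    smallest opening run of type II sells that triggers contrarian buying.
    Requires 0 < lam < 1 and 0 < mu <= 1 so both denominators are positive."""
    L, L_lam, L_mu = llr(p, 0.0), llr(p, lam * mu), llr(p, mu)
    herd = ceil((L + L_lam) / (L - L_lam))
    contrarian = ceil((L + L_lam) / (L_lam - L_mu))
    return herd, contrarian


print(rhs_condition_8(0.7, 0.5, 0.5))
print(pdb_run_lengths(0.7, 0.5, 0.5))
```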

Since the market maker only observes the trading history $\mathcal {F}_{t}$, he will never know for sure whether PDB is happening at any time t. Traders, on the other hand, observe the type realization and are aware of PDB taking place. Traders do not update their beliefs during times of PDB, while the price keeps moving with the trading activity. For this reason, no informational cascades in the sense of Bikhchandani et al. (1992) will ever happen in this market. Eventually, the price will catch up with the market beliefs and normal trading will recover.

3.1 Market limit behavior and price informativeness

Theorems 1 and 2 establish the existence of PDB. However, how likely is PDB? Proposition 1 states that starting from any trading history, herd behavior occurs almost surely.

Proposition 1

Consider any starting trading history $\mathcal {F}_{\tau }$ generating beliefs $\pi _{\tau -1}^{i}$ and $\pi _{\tau -1}^{m}.$ For any continuation trading history $\mathcal {F}_{\tau +t}$, path-dependent behavior arises almost surely as $t\rightarrow \infty $: it takes the form of herd buying when $V=1$ and of herd selling when $V=0$. Moreover, as $t\rightarrow \infty $, contrarian behavior almost never happens.

Proof

See Appendix A.4. $\square $

At time t, the probability of path-dependent buying is equal to^{Footnote 15}

$$\begin{aligned} \Pr \left[ \left( L-L^{\lambda }\right) \left( h_{t-1}-l_{t-1}\right) >L+L^{\lambda }+\left( L^{\lambda }-L^{\mu }\right) \left( b_{t-1}^{i}-s_{t-1}^{i}\right) \right] . \end{aligned}$$

(9)

On the left-hand side of the inequality, the difference $h_{t-1}-l_{t-1}$ (the difference in high and low signals) captures how much type I realizations matter at t, as type realizations matter only if the corresponding imbalance between buys and sells is large. The term $ L-L^{\lambda }$ captures how sticky the price is in response to type I trades. Overall, $\left( L-L^{\lambda }\right) \left( h_{t-1}-l_{t-1}\right) $ captures how much price stickiness matters for a given realization of type I trades at t. On the right-hand side of the inequality, the difference $ b_{t-1}^{i}-s_{t-1}^{i}$ (type II buys and sells), captures how much type II realizations matter at t. The term $L^{\lambda }-L^{\mu }$ captures how over-reactive the price is in response to type II trades. Overall, $\left( L^{\lambda }-L^{\mu }\right) \left( b_{t-1}^{i}-s_{t-1}^{i}\right) $ captures how much price over-reactivity matters for a given realization of type II trades at t.
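The inequality inside (9) can be checked for a given type history, and compared with the bound on $h_{t-1}-l_{t-1}$ that appears in Theorem 1. The sketch below again assumes that noise traders buy with probability $\frac{1}{2}$ when computing the log-likelihood ratios; all function names and numerical inputs are hypothetical.

```python
from math import log


def llr(p, noise):
    """Log-likelihood ratio of a buy with noise share `noise` (assumed form)."""
    return log((noise / 2 + (1 - noise) * p)
               / (noise / 2 + (1 - noise) * (1 - p)))


def pdb_buying_condition(p, lam, mu, signal_diff, type2_diff):
    """Evaluate the inequality inside (9).

    signal_diff: h_{t-1} - l_{t-1}, the net number of high signals
    type2_diff:  b^i_{t-1} - s^i_{t-1}, the net number of type II buys
    """
    L, L_lam, L_mu = llr(p, 0.0), llr(p, lam * mu), llr(p, mu)
    return (L - L_lam) * signal_diff > L + L_lam + (L_lam - L_mu) * type2_diff


def theorem1_bound(lam, mu, type2_diff):
    """Bound on h - l from Theorem 1: 2/(lam*mu) - 1 + (b^i - s^i)(1/lam - 1)."""
    return 2 / (lam * mu) - 1 + type2_diff * (1 / lam - 1)


lam, mu = 0.5, 0.5
print(theorem1_bound(lam, mu, 0))
# Net high signals below the bound: buying with a low signal needs a high p.
print(pdb_buying_condition(0.7, lam, mu, 5, 0),
      pdb_buying_condition(0.9, lam, mu, 5, 0))
# Net high signals above the bound: the inequality holds even for p near 1/2.
print(pdb_buying_condition(0.51, lam, mu, 8, 0))
```

For these illustrative inputs the inequality fails at $p=0.7$ and holds at $p=0.9$ when the net number of high signals is below the bound, which is the cutoff pattern described in point (1) of Theorem 1.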

As $t\rightarrow \infty $, only herd behavior in the direction of the true state of the world survives. In the limit, the importance of $L+L^{\lambda }$ vanishes, and the inequality in (9) is satisfied only when $V=1$. When $V=1$, and as $t\rightarrow \infty $, noise buys and sells cancel out, whereas informed trading creates an order imbalance in the direction of buys. Hence, the condition for path-dependent buying is satisfied by herd buying. When $V=0$, as $t\rightarrow \infty $, the probability in (9) goes to zero, which rules out any form of path-dependent buying when $V=0$. A similar and symmetric argument can be made to establish that only herd selling survives in the limit, when $V=0$. Intuition would suggest that contrarian selling could take place when $V = 1$ for high levels of $\lambda $. That is, for high enough $\lambda $ the complement of condition (9) could hold. This is not the case. Although the rate at which information is leaked is low and the realization of type II traders makes traders’ beliefs incorporate information very slowly, when $\lambda $ is high the price is even stickier. Contrarianism relies on the over-reactivity of the price, which is dampened for high levels of $\lambda $.

To illustrate, notice that when $t\rightarrow \infty $ and $V=1$, the inequality in (9) can be written as:

$$\begin{aligned} \varGamma L+\left( 1-\varGamma \right) L^{\mu }>L^{\lambda } \end{aligned}$$

(10)

The left-hand side of (10) expresses how information is, on average, incorporated in traders’ beliefs given the rate of information leakage, while the right-hand side of (10) expresses how information is incorporated in the price. When the rate at which information is leaked equals zero (i.e., when $\lambda =1$), (10) is satisfied with equality as $L^{\lambda }=L^{\mu }\,$. In this instance, there is no asymmetric information between traders and the market maker: price and beliefs are always aligned like in a market à la Glosten and Milgrom. When $\lambda =0$ there is no noise and the rate at which information is leaked equals one. Then, (10) is satisfied with equality as $L^{\lambda }=L$: no PDB is possible as the market maker fully incorporates information in the price. Figure 1 illustrates that (10) holds strictly for all $\lambda \in \left( 0,1\right) $ when $V=1$, making contrarian selling not possible.
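Inequality (10) and its two boundary cases can be explored numerically. The following sketch computes the gap between the left-hand and right-hand sides of (10), assuming as before that noise traders buy with probability $\frac{1}{2}$; the function names and the values $p=0.7$, $\mu =0.5$ are illustrative assumptions.

```python
from math import log


def llr(p, noise):
    """Log-likelihood ratio of a buy with noise share `noise` (assumed form)."""
    return log((noise / 2 + (1 - noise) * p)
               / (noise / 2 + (1 - noise) * (1 - p)))


def leakage_gap(p, lam, mu):
    """Left-hand side minus right-hand side of (10):
    Gamma * L + (1 - Gamma) * L_mu - L_lambda."""
    gamma = (1 - lam) / (1 - lam * mu)
    L, L_lam, L_mu = llr(p, 0.0), llr(p, lam * mu), llr(p, mu)
    return gamma * L + (1 - gamma) * L_mu - L_lam


for lam in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(lam, leakage_gap(0.7, lam, 0.5))
```

The gap is zero at $\lambda =0$ and $\lambda =1$ and positive at the interior values tried here, consistent with the discussion of (10).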

Overall, in this market, the price is always expected to be sticky relative to the traders’ beliefs, leading to herding almost surely, as formalized in Proposition 1. The limit behavior of this market alternates periods of normal trading and herd buying when $V=1$, and normal trading and herd selling when $V=0$.

The next result is a corollary to Proposition 1. Together with Proposition 1, it allows us to conclude that, in the limit, the price of a market affected by PDB is more informative of the true state of the world than in a Glosten–Milgrom market.

Corollary 1

Herd behavior decreases the deviation of the asset price from its true value, namely $E\left[ |V^{m}_{t+1}-V|\mid V, \mathcal {F}_{t}\right] $. In the limit, as $t\rightarrow \infty $, herd behavior improves price informativeness compared to a market à la Glosten and Milgrom.

Proof

See Appendix A.5. $\square $

3.2 Probability of path-dependent behavior

The previous section established that herding occurs almost surely in the limit. Proposition 2 helps to understand how the probability of PDB evolves over time and with respect to the quality of private information p.

Proposition 2

Define $t^{**}=\frac{1}{1-\lambda }+1$. For every $\lambda \in \left( 0,1\right) $ and for every $\mu \in \left( 0,1\right] $, if $t>t^{**}$ there exists $p^{**}\left( \lambda ,\mu ,t\right) $ such that:

1) PDB is more likely the better the quality of traders’ private information for $p>p^{**}$;
2) PDB is less likely the better the quality of traders’ private information for $p<p^{**}$.

Moreover, $\frac{\partial p^{**}}{\partial t}<0.$ If $t<t^{**}$, PDB is less likely the better the quality of traders’ private information for every $p\in \left( \frac{1}{2} ,1\right) $.

Proof

See Appendix A.6. $\square $

The effect of the signal precision p on PDB is twofold. On the one hand a high p exacerbates the asymmetry of information coming from the traders’ and the market maker’s different interpretations of the trading history. Given some trading history, this will hold more weight in generating PDB, the higher is p. On the other hand, a high p makes it more difficult for a trader to disregard his precise private signal to follow the crowd. PDB occurs when the first effect is stronger than the latter.

In order for PDB to occur, the trading history needs to overwhelm individuals’ private information. Prior to $t^{**}$ the trading history does not have much weight and an increase in the signal precision makes it less likely for a trader to disregard his private information to follow the information contained in the relatively short trading history. The likelihood of PDB is decreasing in p.

After $t^{**}$ the trading history has more weight. For low values of $p<p^{**}$ it is still the case that the likelihood of PDB is decreasing in p. However, for $p>p^{**}$, an increase in p makes the traders’ and the market maker’s different interpretations of the trading history matter more, increasing the probability of PDB.

When $\lambda $ is small (but not equal to zero), $t^{**}$ is small. In this case, the leakage rate is high and the trading history can soon overwhelm individuals’ private information. The larger the leakage rate, the sooner this asymmetry of information matters.
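The dependence of $t^{**}$ on $\lambda $ can be tabulated directly from its definition in Proposition 2. The function name `t_double_star` and the values of $\lambda $ below are chosen only for illustration.

```python
def t_double_star(lam):
    """Time threshold from Proposition 2: t** = 1 / (1 - lam) + 1,
    defined for 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    return 1.0 / (1.0 - lam) + 1.0


for lam in (0.01, 0.5, 0.9):
    print(lam, t_double_star(lam))
```

As $\lambda $ approaches zero the threshold approaches 2 trading rounds, while it grows without bound as $\lambda $ approaches one.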

The probability of PDB is likely to unfold as follows. Initially, the trading history does not have much weight and increasing the signal precision strengthens the traders’ private signal, making it more difficult for them to act against it to follow the direction of the trading history. After a number of trading rounds $t^{**}$, the trading history gains weight, which makes a signal precision higher than $p^{**}$ favor PDB. The more time passes, the lower the threshold $p^{**}$ above which the likelihood of PDB is increasing in the quality of information.

4 Learning from the trading activity

Do poorly informed markets and well-informed markets act alike? This point was first raised by AZ in their analysis of composition uncertainty. According to AZ, there is composition uncertainty “when the probability of traders of different types is not common knowledge,” where a type is defined by a trader’s signal precision. AZ observe that “trading patterns in a market with many poorly informed traders and herding mimic the trading patterns in a market with well-informed traders. In a poorly informed market, a sequence of buy orders is natural because of herding. In a well-informed market, a sequence of buy orders is also natural because the agents tend to have the same (very informative) private signal. Without the knowledge of the composition of the market, it can then become difficult to distinguish whether a sequence of buy orders reveals a large amount of information about value uncertainty (because the market is well informed) or almost none at all (because the market is poorly informed and informed traders are herding).”

The market modeled in this paper does not address composition uncertainty. In fact, even if we regard the two types as having different signal precisions, with type I traders having precision p and type II traders having an average precision $\frac{\mu }{2}+\left( 1-\mu \right) p$, the probability of the realization of a type, $\lambda $, is common knowledge. Making $\lambda $ not common knowledge would change the way the market maker sets the price, as he would have to learn both about the value of the asset and about $\lambda $. The analysis of such a modified market is nontrivial^{Footnote 16} and it is beyond the scope of this paper. In any case, a market maker who knows the overall probability of noise trading $\lambda \mu $ and the quality of private information p, but who is unaware of information leakage, would set prices à la Glosten and Milgrom without too much disruption to the trading activity. In fact, as shown in Appendix A.1, the rational expectations price cannot avert PDB any more than the price of a naive market maker who is not aware of PDB taking place. The only effect of the unawareness of the market maker would be on the time the market takes to recover from PDB, as price movements would be slower.

The questions addressed here are: what can an external observer or market analyst infer by observing the trading activity? How does misinterpreting the market structure (parametrized by $\lambda $) affect the estimation of the quality of information in the market?

These questions have more than just academic relevance. Wrongly estimating the informational content of trades could have repercussions in the evaluation of adverse selection faced in a market. As an example of how this can lead to inefficiencies, think of an investment manager or a trader formulating trading strategies to maximize the value of the portfolio under management. The trade-off faced is between acting immediately and incurring liquidity costs, or waiting in the hope for more favorable terms of trade and potentially being exposed to opportunity costs. Underestimating adverse selection by underestimating the informational content of trades will likely lead to acting too early and incurring high costs of execution. On the other hand, overestimating the costs of adverse selection could lead one to wait too long and see the price move prior to execution, eroding the value of the investment.

Consider a market having information precision p, probability of noise trading $\lambda \mu $, and leakage rate $\varGamma = \frac{1-\lambda }{1-\lambda \mu }$. Furthermore, consider an analyst who knows the correct probability of noise trading $\lambda \mu $ but who doesn’t know the information leakage rate. In this section I consider what such an observer infers from observing the trading history.

Let $\widehat{\varGamma }$ be the analyst’s estimate of $\varGamma $ and suppose this estimate is either an overestimate $\left( \widehat{\varGamma }>\varGamma \right) $ or an underestimate $\left( \widehat{\varGamma } < \varGamma \right) $. The estimate must be a percentage of the total information and so $\widehat{\varGamma }\in \left[ 0,1\right] $, which implies the analyst’s estimate of the fraction of type II traders $\widehat{\lambda } \ge \lambda \mu $.

The following proposition states that an analyst who underestimates (overestimates) the leakage rate will always overestimate (underestimate) the quality of private information when trying to infer it from the trading history, namely $\widehat{p}>p$ ($\widehat{p}<p$). Conversely, if the analyst overestimates (underestimates) the quality of private information, it must be that he is underestimating (overestimating) the leakage rate, namely $\widehat{\varGamma }<\varGamma $ (or $\widehat{\lambda }>\lambda $) in the first case and $\widehat{\varGamma }>\varGamma $ (or $\widehat{\lambda }<\lambda $) in the second.

Proposition 3

Consider a market having signal precision p and information leakage rate given by $\varGamma =\frac{1-\lambda }{1-\lambda \mu }$ , with $0\le \lambda ,\mu \le 1$. Consider a market analyst guessing a leakage rate $\widehat{\varGamma }=\frac{1-\widehat{\lambda }}{1-\widehat{\lambda }\widehat{\mu }}$ such that $\widehat{\lambda }\widehat{\mu }=\lambda \mu $. Then, $\widehat{p}\ge p$ if and only if $\widehat{\varGamma }\le \varGamma $.

Proof

See Appendix A.7. $\square $

To illustrate, consider an analyst who studies the trading data of a single stock from the TAQ dataset, which contains both the quotes (bid and ask prices) and the price at which each single transaction has occurred. From this data, using the standard Lee-Ready algorithm (Lee and Ready 1991) to classify trades, a history of order imbalance $b^m_t-s^m_t$ for the stock can be obtained.

The analyst, observing the trading data after markets have closed, can observe information events and realizations of V. Using these observations, the analyst partitions the time series around information events in order to estimate the quality of information in the market. Information events could be known events such as dividend announcements, central bank announcements, jobs reports, etc., or informal news events that the analyst can infer from price movements. The important thing is that the analyst can determine the new value of the asset after the information event has occurred. In the language of the present model, the analyst is able to study the trading history conditioning on the true value of the asset, $V=1$ or $V=0$.

Consider the extreme case where the analyst embarks on his estimation completely unaware of information leakage, i.e., $\widehat{\varGamma }=0$. The analyst believes he is observing a GM market where the probability of a noise trader is $\widehat{\lambda } \widehat{\mu }=\lambda \mu $, and the signal precision is $\widehat{p}$ instead of p. Let $OI_{T}^{GM}$ be the Glosten–Milgrom order imbalance at time T. Then, conditional on $V=1$ (this is without loss of generality, as conditioning on $V=0$ is analogous), the analyst believes the observed order imbalance $b^m_T-s^m_T$ satisfies the following equation:^{Footnote 17}

$$\begin{aligned} E\left[ OI_{T}^{GM}\right] =\left( 1-\widehat{\lambda }\widehat{ \mu } \right) \left( 2\widehat{p}-1\right) T. \end{aligned}$$

(11)
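Equation (11) can be checked against the multinomial probabilities of footnote 17. The sketch below assumes that, absent herding, the outcomes h and $b^{i}$ are recorded as buys and l and $s^{i}$ as sells; this reading of the notation, the function names and the parameter values are assumptions made for illustration.

```python
import random


def outcome_probabilities(p, lam, mu):
    """Per-period probabilities of (h, l, b_i, s_i) when V = 1 (footnote 17)."""
    return [
        (1 - lam) * p,
        (1 - lam) * (1 - p),
        lam * (1 - mu) * p + lam * mu / 2,
        lam * (1 - mu) * (1 - p) + lam * mu / 2,
    ]


def expected_gm_imbalance(p, lam, mu, T):
    """Expected buys minus sells over T periods when every trader follows his signal."""
    q_h, q_l, q_b, q_s = outcome_probabilities(p, lam, mu)
    return T * (q_h + q_b - q_l - q_s)


def simulated_gm_imbalance(p, lam, mu, T, seed=0):
    """One simulated path: +1 for h or b_i, -1 for l or s_i."""
    rng = random.Random(seed)
    draws = rng.choices([1, -1, 1, -1],
                        weights=outcome_probabilities(p, lam, mu), k=T)
    return sum(draws)


p, lam, mu, T = 0.7, 0.5, 0.4, 100_000   # hypothetical values
closed_form = (1 - lam * mu) * (2 * p - 1) * T   # right-hand side of (11)
print(closed_form, expected_gm_imbalance(p, lam, mu, T),
      simulated_gm_imbalance(p, lam, mu, T))
```

The four probabilities sum to one, and the difference between the buy and sell probabilities collapses to $\left( 1-\lambda \mu \right) \left( 2p-1\right) $, which is the per-period term in (11).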

In the actual market the analyst is observing, the information is leaked at a rate $\varGamma >0 $. The observed order imbalance at time T actually satisfies:

$$\begin{aligned} E\left[ OI_{T} \right] =\left( 1-\lambda \mu \right) \left( 2p-1\right) T+2\left( 1-\lambda \mu \right) \left( 1-p\right) \frac{Tn}{F+n} \end{aligned}$$

(12)

where the second term on the right-hand side is the expected additional order imbalance $AOI_{T}$ generated by PDB, F is the expected time between herding episodes and n is the expected length of a herding episode. Both F and n are functions of $p$, $\lambda $ and $\mu $, and their analytical expressions are given by (32) and (33), respectively, in Appendix A.7.

The precision $\widehat{p}$ the analyst needs in order to reconcile the observed order imbalance to the behavior of a GM market is such that the right-hand side of equation (11) equals the right-hand side of equation (12). Substituting $\lambda \mu = \widehat{\lambda } \widehat{\mu } $ and rearranging, the distortion $\mathcal {D}=\widehat{p}-p$ in the inference of the signal precision by an external observer is equal to:

$$\begin{aligned} \mathcal {D}=\frac{\left( 1-p\right) }{\frac{F}{n}+1}. \end{aligned}$$

(13)
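The step from (11) and (12) to (13) can be reproduced numerically. In the sketch below, F and n are passed in as plain numbers because their closed forms belong to Appendix A.7; the values used are hypothetical and serve only to show that equating the two right-hand sides yields $\widehat{p}-p=\frac{1-p}{F/n+1}$.

```python
def herding_share(F, n):
    """Expected fraction of periods spent herding, n / (F + n)."""
    return n / (F + n)


def observed_imbalance_per_period(p, noise, F, n):
    """Right-hand side of (12) divided by T; noise = lambda * mu."""
    return ((1 - noise) * (2 * p - 1)
            + 2 * (1 - noise) * (1 - p) * herding_share(F, n))


def inferred_precision(imbalance_per_period, noise_hat):
    """Solve (11) for p_hat given the observed imbalance per period."""
    return 0.5 * (imbalance_per_period / (1 - noise_hat) + 1)


def distortion(p, F, n):
    """Equation (13): D = (1 - p) / (F / n + 1)."""
    return (1 - p) / (F / n + 1)


# Hypothetical inputs: F and n are NOT computed from the model here.
p, noise, F, n = 0.8, 0.2, 30.0, 10.0
p_hat = inferred_precision(observed_imbalance_per_period(p, noise, F, n), noise)
print(p_hat - p, distortion(p, F, n))
```

Since $\frac{1}{F/n+1}=\frac{n}{F+n}$, the distortion is the probability of an incorrect signal multiplied by the expected share of time spent herding.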

This is plotted in Fig. 2, which shows that, for every value of $\lambda $, there exists a precision level $\bar{p }$ such that the distortion in the inference of the signal precision introduced by PDB is increasing for $p<\bar{p}$ and decreasing for $p>\bar{p}$.

The distortion $\mathcal {D}$ can be split into three components: the expected time between herding episodes F, the expected length of a herding episode n, and the likelihood of an incorrect signal $1-p$. The graphs of F and n (see Fig. 5 in Appendix A.7) are, respectively, decreasing and increasing in the signal precision for any level of $\lambda $ and $\mu $. This means that, through both F and n, a higher signal precision contributes to a larger $\mathcal {D}$. In particular, when p is high, it takes less time to build a difference in beliefs between traders and market maker leading to herd behavior. Moreover, once in a herding episode, it takes longer for the belief of the market maker to catch up with the belief of the traders. This is because the higher the signal precision, the higher the informational content of the last trade before herding starts, which is the one pushing the value of an informed trader past the bid and ask prices. For normal trading to resume, the market maker needs to make up for this misalignment in beliefs during herding.^{Footnote 18}

The positive impacts of p on the distortion $\mathcal {D}$ through F and n last until some level of precision $\bar{p}$ above which the negative effect through $1-p$ takes over. Then, as p increases, the likelihood of receiving an incorrect signal decreases, which reduces the distortion herding has on trading decisions by hiding incorrect signals.

In a market where information leaks, herding is expected to impact the trading activity by hiding incorrect signals in a way that increases the order imbalance in the direction of the true state of the world. Overall, when the signal precision is either high or low, the additional order imbalance due to information leakage is not very large. In the former case, this is because herding is relatively infrequent. In the latter case, although herding is frequent, it goes, on average, in the direction of the true state of the world, hiding mostly correct signals. Thus, the impact on the actual trading activity is small.

Consider now an observer who is aware of the possibility of informational leakages. The left panel of Fig. 3 plots the level curves of the expected additional order imbalance as a fraction of the total trading periods: $\frac{E\left[ AOI_{T}\right] }{T}$. In expectation, the same level can be generated by different pairs $\left( p,\lambda \right) $, showing that the trade-off between p and $\lambda $ is non-monotonic. This reflects the non-monotonic, differential impact that a variation in p and $\lambda $ has on the price and on the traders’ beliefs, as highlighted by Eq. (10) and by Fig. 1.

When information is leaked at a high rate ($\lambda $ small), the price is very reactive and an increase in $\lambda $ has a larger impact in making the price sticky than on traders being exposed to a smaller amount of information, increasing the overall $E\left[ AOI_{T}\right] $. In order to bring it back to a lower level, the quality of information available to the market needs to decrease.

When information is leaked at a low rate ($\lambda $ large) the price is already very sticky and a change in $\lambda $ does not affect it very much. An increase in $ \lambda $ makes the traders less informed. This needs to be counterbalanced by a better quality of information (i.e., larger p) to keep $E\left[ AOI_{T} \right] $ constant.

The trade-off between p and $\lambda $ is muted for very high levels of the signal precision. When p is very close to one, the number of incorrect signals that are hidden by herding decreases, and so does $E\left[ AOI_{T} \right] $. Similarly, the trade-off is not very strong when p is close to 0.5, as only very little herd behavior is expected.

The right panel of Fig. 3 plots the level curves for (12) as a fraction of the total trading periods and illustrates that, despite the non-monotonic behavior of the trade-off between p and $\lambda $, the magnitude of $E\left[ AOI_{T}\right] $ is negligible compared to the bulk of order imbalance generated by periods of normal trading. The latter displays a monotonic trade-off between p and $\lambda $, which dominates. In the long run, an analyst who knows the correct leakage ratio can learn p from observing the trading activity.

The fact that the impact of $E\left[ AOI_{T}\right] $ is negligible does not mean that the market does not spend a lot of time herding. Proposition 2 establishes that herding is more likely, under suitable conditions, for higher levels of the signal precision. This is precisely when herding mainly “hides” signals that would have made the traders act in the same direction as herding does. Figure 4 plots the volumetric $\mu $-slices for the percentage of time the market is expected to be herding as a function of p and $\lambda $, which is defined as $\frac{n}{F+n}$. The left-hand panel slices the volume at $\mu =0.8$, while the right-hand panel slices it at $ \mu =1$. For these two values, regardless of the amount and quality of information, it is either almost all leaked ($\mu =0.8$) or it is all leaked ($\mu =1$). In the latter case, for a very high quality of information ($ p>0.95 $) there are levels of $\lambda $ at which the market is expected to spend half of the time herding. The intuition is again that what makes herding likely is the right balance between informational leakage and price stickiness, which occurs for moderate values of $\lambda $. Even for lower levels of $\mu $, Fig. 4 shows that very precise information is disregarded more than 30% of the time.

5 Conclusion

In this paper I construct a mechanism that leads to PDB by exploiting the informational leakage naturally occurring on trading floors. My formulation has the advantage of recovering PDB with a smaller state space than the state space required for event uncertainty. This enables the model to link the formation of price and market beliefs to the quality of information throughout time. The result is that the comparative statics can be conducted in terms of exogenous variables only, leading to the conclusion that, given any trading history, there is always a threshold for the signal precision above which PDB occurs. A further implication of the model is that only herding in the direction of the true state of the world survives in the limit, without causing catastrophic effects on the price and actually improving its informativeness. Moreover, as the weight of the trading history increases, higher signal precisions increase the probability of PDB.

Medium levels of information leakage create the right mix of price stickiness and asymmetry of information such that, when information is very precise, the market spends a sizeable amount of time herding. A market analyst who underestimates/overestimates the rate of information leakage will overestimate/underestimate the informational content of trades, with consequences for the profitability of his trading strategies.

Besides allowing for PDB, I believe this mechanism is interesting per se, as a way to model the exchange of information among traders. Its simplicity is helpful to bring it to the data in order to estimate the rate at which information is leaked in markets, a phenomenon that has been widely witnessed and studied by the financial sociology literature. Future work can estimate the model’s parameters via maximum likelihood, using a similar methodology to Easley et al. (1997) and Cipriani and Guarino (2014).

Notes

The name alludes to the fact that, to generate herding and contrarianism, the sequence in which buys and sells arrive matters as much as their number. See, for instance, Cipriani and Guarino (2014).
Event uncertainty requires three states of the world, which makes AZ’s model recursive. The model presented in this paper requires only two states of the world to generate PDB.
An extensive literature documents the role of information networks and word-of-mouth to identify information transfers in security markets. In particular: Hong et al. (2004) employ social networks to understand the investment behavior of individual investors, whereas Cohen et al. (2008) use education networks to identify the impact of information transfer between managers and corporate boards in security markets. For a formal model of word-of-mouth communication, see Ellison and Fudenberg (1995).
The assumptions on the availability of information to the market maker will be discussed in Sect. 1.1.
This is formalized in Theorem 1.
This is formalized in Proposition 2.
The fact that PDB generates larger bid-ask spreads is a consequence of the market maker’s pricing rule formalized in Appendix A.1.
This is formalized in Corollary 1.
This is formalized in Proposition 1.
This is illustrated in Fig. 3.
This is formalized in Proposition 3.
This is illustrated in Fig. 2.
This is illustrated in Fig. 4.
The case where noise traders do not trade is omitted. Noise traders can be thought of as individuals coming to the market because they experience a liquidity shock or the need to hedge risk. These events force them to modify their portfolio, so one needs to actively trade to qualify as noise trader.
The probability of PDB at any time also depends on the initial beliefs $\pi _{\tau -1}^{i}$ and $\pi _{\tau -1}^{m}$. In the appendix I show that, in the limit, the starting beliefs do not matter, so I abstract from them in the discussion.
Due to its complexity, AZ carry out the analysis of price paths under composition uncertainty by means of numerical simulation.
When $V=1$, the vector $X_{T}=(h_{T}, l_{T}, b_{T}^{i}, s_{T}^{i})$ has multinomial distribution $\left( T;\mathbf {\mu }\right) $ where:
$$\begin{aligned} \mathbf {\mu }= \begin{bmatrix} \left( 1-\lambda \right) p \\ \left( 1-\lambda \right) \left( 1-p\right) \\ \lambda \left( 1-\mu \right) p+\frac{\lambda \mu }{2} \\ \lambda \left( 1-\mu \right) \left( 1-p\right) +\frac{\lambda \mu }{2} \end{bmatrix} \end{aligned}$$
It follows that for every T, $E[X_{T}]=T\mathbf {\mu }$.
In particular, $n>1$ only if the last trade before herding is of type I.
This cannot be proved analytically, but it can be shown by computation.
This abstracts from any initial beliefs. If, in expectation, only herding occurs, after each herding episode beliefs would be $\pi ^{i}>\pi ^{m}$, which implies that F is smaller than the one computed here. The overall effect would be in favor of a stronger distortion in the inference of p, but qualitatively it would not change.

References

Avery, C., Zemsky, P.: Multidimensional uncertainty and herd behavior in financial markets. Am. Econ. Rev. 88, 724–748 (1998)
Baker, W.E.: The social structure of a national securities market. Am. J. Sociol. 89(4), 775–811 (1984)
Bikhchandani, S., Hirshleifer, D., Welch, I.: A theory of fads, fashion, custom, and cultural change as informational cascades. J. Polit. Econ. 100(5), 992–1026 (1992)
Bose, S., Orosel, G., Ottaviani, M., Vesterlund, L.: Monopoly pricing in the binary herding model. Econ. Theory 37(2), 203–241 (2008). https://doi.org/10.1007/s00199-007-0313-9
Chari, V.V., Kehoe, P.J.: Financial crises as herds: overturning the critiques. J. Econ. Theory 119(1), 128–150 (2004)
Cipriani, M., Guarino, A.: Estimating a structural model of herd behavior in financial markets. Am. Econ. Rev. 104(1), 224–251 (2014)
Cohen, L., Frazzini, A., Malloy, C.: The small world of investing: Board connections and mutual fund returns. J. Polit. Econ. 116(5), 951–979 (2008)
Dasgupta, A., Prat, A.: Financial equilibrium with career concerns. Theor. Econ. 1(1), 67–93 (2006)
Dasgupta, A., Prat, A.: Information aggregation in financial markets with career concerns. J. Econ. Theory 143(1), 83–113 (2008)
Easley, D., O’Hara, M.: Time and the process of security price adjustment. J. Financ. 47(2), 577–605 (1992)
Easley, D., Kiefer, N.M., O’Hara, M.: One day in the life of a very common stock. Rev. Financ. Stud. 10(3), 805–835 (1997)
Easley, D., López de Prado, M.M., O’Hara, M.: Flow toxicity and liquidity in a high-frequency world. Rev. Financ. Stud. 25(5), 1457–1493 (2012)
Ellison, G., Fudenberg, D.: Word-of-mouth communication and social learning. Q. J. Econ. 110(1), 93–125 (1995)
Glosten, L.R., Milgrom, P.R.: Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. J. Financ. Econ. 14(1), 71–100 (1985)
Hong, H., Kubik, J.D., Stein, J.C.: Social interaction and stock-market participation. J. Financ. 59(1), 137–163 (2004)
Hong, H., Kubik, J.D., Stein, J.C.: Thy neighbor’s portfolio: word-of-mouth effects in the holdings and trades of money managers. J. Financ. 60(6), 2801–2824 (2005)
Knorr Cetina, K., Bruegger, U.: Global microstructures: the virtual societies of financial markets. Am. J. Sociol. 107(4), 905–950 (2002)
Lee, C., Ready, M.J.: Inferring trade direction from intraday data. J. Financ. 46(2), 733–746 (1991)
Lee, I.H.: Market crashes and informational avalanches. Rev. Econ. Stud. 65(4), 741–759 (1998)
MacKenzie, D.: An address in mayfair. London Review of Books. http://www.lrb.co.uk/v30 (23) (2008a)
MacKenzie, D.: What’s in a number? the importance of libor. Real-World Econ. Rev. 47(3), 237–242 (2008b)
Park, A., Sabourian, H.: Herding and contrarian behavior in financial markets. Econometrica 79(4), 973–1026 (2011)
Schindler, M.: Rumors in financial markets: insights into behavioral finance. Wiley, New York (2007)
Shiller, R.J.: Irrational exuberance. Princeton University Press, Princeton (2015)
Van Bommel, J.: Rumors. J. Financ. 58(4), 1499–1520 (2003)

Author information

Authors and Affiliations

University of Portsmouth, Richmond Building, Portland Road, Portsmouth, PO1 3DE, UK
Alessia Testa

Corresponding author

Correspondence to Alessia Testa.

Additional information

To my father Giuseppe, a life enthusiast who never stops inspiring me.

I thank the editor, the associate editor and two anonymous referees for providing very useful comments which have greatly improved the paper. I would also like to thank Amil Dasgupta, Ian Jewitt, Pete Kyle, Meg Meyer, Sujoy Mukerji, Han Ozsoylev, Alexis Stanfors, Bruno Strulovici and Dezsö Szalay for their useful comments and suggestions. I also benefited from conversations with Godfrey Keller, Andrea Patacconi, Robert Ritz and Chris Wallace.

A Appendix

A.1 Market Maker Pricing Rule

Define $J\left( \mathcal {F}_t\right) =\left\{ j:\mathcal {G}_{t}^{j} \text { is compatible with } \mathcal {F}_t\right\} $. Then, $\left\{ \mathcal {G}_{t}^{j}\right\} _{j\in {J\left( \mathcal {F}_t\right) }}$ is the family of type histories at time t compatible with $\mathcal {F}_{t}$. To simplify notation, I will equivalently use $\left\{ \mathcal {G}_{t}^{j}\right\} _{j}$ when it does not lead to confusion. Suppose that at t the first possibility of herd/contrarian buying arises on $\mathcal {G}_{t}^{j^{*}}$, where $j^{*}\in \arg \max _{j\in {J\left( \mathcal {F}_t\right) }}E^{i} \left[ V\mid \mathcal {G}_{t}^{j}\right] $. We know that $j^{*}$ is unique. Indicate with $V_{t-1}^{B}$ the valuation of the asset by a trader i who has observed $\mathcal {G}_{t}^{j^{*}}$. We can write the market maker’s valuation of the asset as the weighted sum of the value of the asset along the paths $\left\{ \mathcal {G}_{t}^{j}\right\} _{j}$ with weights given by the probability of each of the paths given the trading history. We can then rewrite $V_{t-1}^{m}$ as

$$\begin{aligned} V_{t-1}^{m}= & {} E\left[ V\mid \mathcal {F}_{t}\right] =\Pr \left( V\ =1\mid \mathcal {F}_{t}\right) =\frac{\Pr \left( \mathcal {F}_{t}\mid V=1\right) \Pr \left( V=1\right) }{\Pr \left( \mathcal {F}_{t}\right) } \nonumber \\= & {} \frac{\sum _{i}\Pr \left( \mathcal {G}_{t}^{i}\mid V=1\right) \Pr \left( V=1\right) }{\Pr \left( \mathcal {F}_{t}\right) } \nonumber \\= & {} \frac{\sum _{i}\frac{\Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) }{\Pr \left( V=1\right) }\Pr \left( V=1\right) }{\Pr \left( \mathcal {F}_{t}\right) }=\frac{\sum _{i}\Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) }{ \Pr \left( \mathcal {F}_{t}\right) } \end{aligned}$$

(14)

Using the fact that $\Pr \left( \mathcal {F}_{t}\mid \mathcal {G} _{t}^{i}\right) =1$,

$$\begin{aligned} \Pr \left( \mathcal {G}_{t}^{i}\mid \mathcal {F}_{t}\right) \Pr \left( \mathcal {F}_{t}\right) =\Pr \left( \mathcal {F}_{t}\mid \mathcal {G} _{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) =\Pr \left( \mathcal {G}_{t}^{i}\right) \text {.} \end{aligned}$$

It follows that

$$\begin{aligned} \frac{\Pr \left( \mathcal {G}_{t}^{i}\right) }{\Pr \left( \mathcal {F} _{t}\right) }=\Pr \left( \mathcal {G}_{t}^{i}\mid \mathcal {F}_{t}\right) \end{aligned}$$

(15)

and we can rewrite (14) as

$$\begin{aligned} V_{t-1}^{m}=\sum _{i}\Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\mid \mathcal {F}_{t}\right) . \end{aligned}$$

(16)
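Equations (14) to (16) describe the market maker's valuation as a probability-weighted average of the valuations along the compatible type histories. A minimal sketch, with made-up path probabilities and posteriors, shows the normalization in (15) and the averaging in (16); the function name and all numbers are illustrative assumptions.

```python
def market_maker_valuation(path_probs, path_posteriors):
    """V^m as in (16).

    path_probs[i]      : Pr(G_t^i) for each type history compatible with F_t
    path_posteriors[i] : Pr(V = 1 | G_t^i) along that history
    """
    prob_history = sum(path_probs)   # Pr(F_t): compatible paths partition F_t
    weights = [q / prob_history for q in path_probs]   # Pr(G_t^i | F_t), eq. (15)
    return sum(v * w for v, w in zip(path_posteriors, weights))


# Three hypothetical type histories compatible with the same trading history.
path_probs = [0.02, 0.05, 0.03]
path_posteriors = [0.9, 0.6, 0.3]
print(market_maker_valuation(path_probs, path_posteriors))
```

The market maker's valuation always lies between the lowest and the highest valuation across compatible paths, which is the source of the wedge between $V_{t-1}^{m}$ and the valuation $V_{t-1}^{B}$ on the path where herd or contrarian buying first arises.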

To ease notation, we use $a_{t}=B$ to indicate a buy. Using the law of conditional expectations and Bayes’ rule we can write (3) as

$$\begin{aligned}&\Pr \left( V=1\mid a_{t}=B,\mathcal {F}_{t}\right) =\frac{\Pr \left( \mathcal {F}_{t},a_{t}=B,V=1\right) }{\Pr \left( a_{t}=B,\mathcal {F} _{t}\right) }, \\&\quad =\frac{\Pr \left( a_{t}=B,\mathcal {F}_{t},V=1\right) }{\Pr \left( a_{t}=B, \mathcal {F}_{t},V=1\right) +\Pr \left( a_{t}=B,\mathcal {F}_{t},V=0\right) }, \\&\quad =\frac{\Pr \left( \mathcal {F}_{t}\mid a_{t}=B,V=1\right) \Pr \left( a_{t}=B,V=1\right) }{\Pr \left( \mathcal {F}_{t}\mid a_{t}=B,V=1\right) \Pr \left( a_{t}=B,V=1\right) +\Pr \left( \mathcal {F}_{t}\mid a_{t}=B,V=0\right) \Pr \left( a_{t}=B,V=0\right) }. \end{aligned}$$

Since $\left\{ \mathcal {G}_{t}^{i}\right\} _{i}$ “partitions” $\mathcal {F}_{t}$,

$$\begin{aligned}&\Pr \left( V=1\mid a_{t}=B,\mathcal {F}_{t}\right) \\&\quad =\frac{\sum _{i=1}^{2^{t}}\Pr \left( \mathcal {G}_{t}^{i}\mid a_{t}=B,V=1\right) \Pr \left( a_{t}=B,V=1\right) }{\sum _{i=1}^{2^{t}}\left[ \Pr \left( \mathcal {G}_{t}^{i}\mid a_{t}=B,V=1\right) \Pr \left( a_{t}=B,V=1\right) +\Pr \left( \mathcal {G}_{t}^{i}\mid a_{t}=B,V=0\right) \Pr \left( a_{t}=B,V=0\right) \right] }, \\&\quad =\frac{\sum _{i=1}^{2^{t}}\Pr \left( a_{t}=B\mid V=1,\mathcal {G} _{t}^{i}\right) \Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) }{\sum _{i=1}^{2^{t}}\left[ \Pr \left( a_{t}=B\mid V=1,\mathcal {G}_{t}^{i}\right) \Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) +\Pr \left( a_{t}=B\mid V=0,\mathcal {G} _{t}^{i}\right) \Pr \left( V=0\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\right) \right] }. \end{aligned}$$

Dividing both the numerator and the denominator by $\Pr \left( \mathcal {F}_{t}\right) $ and using (15), the numerator can be written as

$$\begin{aligned} \sum _{i=1}^{2^{t}}\Pr \left( a_{t}=B\mid V=1,\mathcal {G}_{t}^{i}\right) \Pr \left( V=1\mid \mathcal {G}_{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\mid \mathcal {F}_{t}\right) . \end{aligned}$$

(17)

On $\mathcal {G}_{t}^{j^{*}}$, the probability of a buy order differs from the probability in other type paths, as traders herd and buy regardless of their signal. In particular,

$$\begin{aligned} \Pr \left( a_{t}=B\mid V=1,\mathcal {G}_{t}^{i}\right)= & {} \left[ \frac{ \lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] \text { for }i\ne j^{*}, \\ \Pr \left( a_{t}=B\mid V=1,\mathcal {G}_{t}^{i}\right)= & {} \left[ \frac{ \lambda \mu }{2}+\left( 1-\lambda \mu \right) \left( p+1-p\right) \right] \text { for }i=j^{*}. \end{aligned}$$

We can then rewrite (17) as

$$\begin{aligned}&\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] \underset{V_{t-1}^{m}}{\underbrace{\sum _{i=1}^{2^{t}}\Pr \left( V=1\mid \mathcal {G} _{t}^{i}\right) \Pr \left( \mathcal {G}_{t}^{i}\mid \mathcal {F}_{t}\right) }} \\&\quad +\,\left( 1-\lambda \mu \right) \left( 1-p\right) \underset{V_{t-1}^{B}}{ \underbrace{\Pr \left( V=1\mid \mathcal {G}_{t}^{j^{*}}\right) }}\Pr \left( \mathcal {G}_{t}^{j^{*}}\mid \mathcal {F}_{t}\right) . \end{aligned}$$

It follows that

$$\begin{aligned}&E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] \nonumber \\&\quad =\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}+\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{B}\eta _{t}^{B}}{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}+\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{B}\eta _{t}^{B}+\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left( 1-p\right) \right] \left( 1-V_{t-1}^{m}\right) +\left( 1-\lambda \mu \right) p\left( 1-V_{t-1}^{B}\right) \eta _{t}^{B}}\nonumber \\ \end{aligned}$$

(18)

where $\eta _{t}^{B}=\Pr \left( \mathcal {G}_{t}^{j^{*}}\mid \mathcal {F}_{t}\right) \,$ is the probability of the path leading to herding/contrarianism. Q.E.D.

It is left to show that (18) is indeed a rational expectations price. In order to do so, I introduce a “naive” market maker, meaning a market maker who sets bid and ask prices assuming that every informed trader follows his signal even when the conditions for PDB are in place. Correspondingly, his pricing rule satisfies:

$$\begin{aligned} A_{t}^{naive}= & {} E\left[ V\mid \mathcal {F}_{t},V_{t}^{i}\left( \mathcal {F} _{t},\mathcal {T}_{t},\sigma ^{i}=H\right)>A_{t}^{naive}\right] \text {,} \\ B_{t}^{naive}= & {} E\left[ V\mid \mathcal {F}_{t},V_{t}^{i}\left( \mathcal {F} _{t},\mathcal {T}_{t},\sigma ^{i}=L\right) <B_{t}^{naive}\right] \, . \end{aligned}$$

I am going to show that if a trader buys with a low signal at $A_{t}^{naive}$ then he still buys at $E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] $, which is then the competitive rational expectation equilibrium price.

Lemma 1

Consider a trading history $\mathcal {F}_{t}$ and a type history $ \mathcal {G}_{t}^{j}$ compatible with it such that $V_{t}^{i}\left( \mathcal {G }_{t}^{j},\sigma ^{i}=L\right) >A_{t}^{naive}$. Then,

$$\begin{aligned} V_{t}^{i}\left( \mathcal {G}_{t}^{j},\sigma ^{i}=L\right) >E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] . \end{aligned}$$

The converse is also true. Hence, $A_{t}=E\left[ V\mid \mathcal {F}_{t},a_{t}= \text {buy}\right] $ is a rational expectation equilibrium price.

Proof

$V_{t}^{i}\left( \mathcal {G}_{t}^{j},\sigma ^{i}=L\right) >A_{t}^{naive}$ is equivalent to

$$\begin{aligned} \frac{\left( 1-p\right) V_{t-1}^{i}}{\left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) }>\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] }, \end{aligned}$$

rearranging,

$$\begin{aligned}&\left( 1-p\right) V_{t-1}^{i}\left\{ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] \right\} \\&\quad >\left[ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right] \left[ \frac{\lambda \mu }{2 }+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}. \end{aligned}$$

Adding $\eta _{t}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{i}\left\{ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right\} $ to both sides,

$$\begin{aligned}&\left( 1-p\right) V_{t-1}^{i}\left\{ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] \right. \\&\quad \left. +\,\eta _{t}\left( 1-\lambda \mu \right) \left[ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right] \right\} \\&>\left[ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right] \left\{ \left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}\right. \\&\quad \left. +\,\eta _{t}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{i}\right\} . \end{aligned}$$

Rearranging,

$$\begin{aligned}&\frac{\left( 1-p\right) V_{t-1}^{i}}{\left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) } \\&\quad >\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}+\eta _{t}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{i}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] +\eta _{t}\left( 1-\lambda \mu \right) \left\{ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right\} }, \end{aligned}$$

which is equivalent to $V_{t}^{i}\left( \mathcal {G}_{t}^{j},\sigma ^{i}=L\right) >E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] $. Hence, $A_{t}=E\left[ V\mid \mathcal {F}_{t},a_{t}=\text {buy}\right] $ is a rational expectations price. $\square $

One might be tempted to think that the previous result is driven by the fact that the price of a sophisticated market maker needs to be lower than the price of a naive market maker because the former is accounting for both a high and a low signal driving the buy order on $\mathcal {G}_{t}^{j^{*}}$ . This last intuition is true, but it is only part of the story: a buy order makes $\mathcal {G}_{t}^{j^{*}}$ and its associated high prior $ V_{t-1}^{B}$ more likely in the eyes of the market maker, inducing an overall increase in his valuation. In general we always have that $ A_{t}>A_{t}^{naive}.$ Correspondingly, $B_{t}<B_{t}^{naive}$, confirming the observation that price volatility increases both during periods of herding and during periods of contrarianism.

Lemma 2

Consider a trading history $\mathcal {F}_{t}$ and a type history $ \mathcal {G}_{t}^{j}$ compatible with it such that $E\left[ V\mid \mathcal {G} _{t}^{j},\sigma ^{i}=L\right] >A_{t}^{naive}$. Then $A_{t}>A_{t}^{naive}$. The converse is also true.

Proof

$A_{t}>A_{t}^{naive}$ is equivalent to

$$\begin{aligned}&\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}+\eta _{t}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{i}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] +\eta _{t}\left( 1-\lambda \mu \right) \left\{ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right\} } \\&\quad >\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] }. \end{aligned}$$

Rearranging,

$$\begin{aligned}&\left\{ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] \right\} \eta _{t}\left( 1-\lambda \mu \right) \left( 1-p\right) V_{t-1}^{i} \\&\quad >\eta _{t}\left( 1-\lambda \mu \right) \left\{ \left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) \right\} \left[ \frac{\lambda \mu }{ 2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \frac{\left( 1-p\right) V_{t-1}^{i}}{\left( 1-p\right) V_{t-1}^{i}+p\left( 1-V_{t-1}^{i}\right) }>\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] V_{t-1}^{m}}{\frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left[ pV_{t-1}^{m}+\left( 1-p\right) \left( 1-V_{t-1}^{m}\right) \right] }, \end{aligned}$$

which concludes the proof. $\square $
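Lemmas 1 and 2 can be illustrated numerically. The sketch below codes the low-signal valuation, the naive ask and the sophisticated ask exactly as they appear in the inequalities of the two proofs, then draws random parameter values and checks that the three comparisons always agree. The sampling ranges, the function names and the tie tolerance are illustrative assumptions; the check is an illustration, not a substitute for the proofs.

```python
import random


def low_signal_valuation(p, v_i):
    """Trader's valuation after a low signal, starting from prior v_i."""
    return (1 - p) * v_i / ((1 - p) * v_i + p * (1 - v_i))


def naive_ask(p, noise, v_m):
    """Ask set by a market maker who ignores herding; noise = lambda * mu."""
    num = (noise / 2 + (1 - noise) * p) * v_m
    den = noise / 2 + (1 - noise) * (p * v_m + (1 - p) * (1 - v_m))
    return num / den


def sophisticated_ask(p, noise, v_m, v_i, eta):
    """Ask that accounts for herd buying on a path of probability eta."""
    num = ((noise / 2 + (1 - noise) * p) * v_m
           + eta * (1 - noise) * (1 - p) * v_i)
    den = (noise / 2 + (1 - noise) * (p * v_m + (1 - p) * (1 - v_m))
           + eta * (1 - noise) * ((1 - p) * v_i + p * (1 - v_i)))
    return num / den


def lemmas_hold(trials=10_000, seed=0, tol=1e-9):
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.uniform(0.55, 0.95)
        noise = rng.uniform(0.05, 0.90)
        v_m, v_i = rng.uniform(0.05, 0.95), rng.uniform(0.05, 0.95)
        eta = rng.uniform(0.05, 1.0)
        v_low = low_signal_valuation(p, v_i)
        a_naive = naive_ask(p, noise, v_m)
        a = sophisticated_ask(p, noise, v_m, v_i, eta)
        if abs(v_low - a_naive) < tol:   # skip numerical ties
            continue
        herd_naive = v_low > a_naive     # condition in both lemmas
        if herd_naive != (v_low > a) or herd_naive != (a > a_naive):
            return False
    return True


print(lemmas_hold())
```

The agreement reflects the fact that the sophisticated ask is a weighted average of the naive ask and the low-signal valuation, so it always lies between the two.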

We have established that $E\left[ V\mid \mathcal {G}_{t}^{j^{*}},\sigma ^{i}=L\right]>A_{t}>A_{t}^{naive}$. This is easily interpreted if we observe that, by moving from the highest of the valuations to the lowest, we are “losing” something either at the level of information or at the level of rationality. In fact, if $E\left[ V\mid \mathcal {G}_{t}^{j^{*}},\sigma ^{i}=L\right] $ is the valuation of a fully informed and fully rational agent, $A_{t}$ is the valuation of a partially informed and fully rational agent, to conclude with $A_{t}^{naive}$, which is the valuation of a partially informed and partially rational agent.

One consequence of these results is that, at every time period t, investigating the conditions for path-dependent behavior under the pricing rule of a naive market maker is equivalent to studying the same market under a sophisticated market maker. This does not mean that using the pricing rule of a market maker who is always naive is equivalent to using the pricing rule of a market maker who is not. The simplification is used to check conditions at a specific t, not as the market maker’s pricing rule over time.

1.2 A.2 Proof of Theorem 1

Path-dependent buying occurs when $E\left[ V\mid \mathcal {F}_{t},\mathcal {T} _{t},\sigma ^{i}=L\right] >A_{t}$. By proposition (1), this is equivalent to $E\left[ V\mid \mathcal {F}_{t},\mathcal {T}_{t},\sigma ^{i}=L \right] >A_{t}^{naive}$ which, in the case of first-time herding, can be written as

$$\begin{aligned}&\frac{p^{\left( h_{t-1}-l_{t-1}-1\right) }\left[ \frac{\mu }{2}+\left( 1-\mu \right) p\right] ^{b_{t-1}^{i}-s_{t-1}^{i}}}{p^{\left( h_{t-1}-l_{t-1}-1\right) }\left[ \frac{\mu }{2}+\left( 1-\mu \right) p\right] ^{b_{t-1}^{i}-s_{t-1}^{i}}+\left( 1-p\right) ^{\left( h_{t-1}-l_{t-1}-1\right) }\left[ \frac{\mu }{2}+\left( 1-\mu \right) \left( 1-p\right) \right] ^{b_{t-1}^{i}-s_{t-1}^{i}}} \nonumber \\&>\frac{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] ^{b_{t-1}^{m}-s_{t-1}^{m}+1}}{\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] ^{b_{t-1}^{m}-s_{t-1}^{m}+1}+\left[ \frac{\lambda \mu }{ 2}+\left( 1-\lambda \mu \right) \left( 1-p\right) \right] ^{b_{t-1}^{m}-s_{t-1}^{m}+1}}. \end{aligned}$$

(19)

As long as no PDB has occurred yet, given the number and type of buys and sells, their sequence does not change the market maker’s and traders’ valuations at t. Setting $\delta _{t}=h_{t-1}-l_{t-1}$, $\delta _{t}^{i}=b_{t-1}^{i}-s_{t-1}^{i}$, $\delta _{t}^{m}=b_{t-1}^{m}-s_{t-1}^{m}$, $\gamma _{t}=\frac{\delta _{t}-1}{\delta _{t}+1}$ and

$$\begin{aligned} K\left( p\right) =\left[ \frac{\left[ \frac{\mu }{2}+\left( 1-\mu \right) p \right] \left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left( 1-p\right) \right] }{\left[ \frac{\mu }{2}+\left( 1-\mu \right) \left( 1-p\right) \right] \left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p\right] }\right] ^{\frac{\delta _{t}^{i}}{\delta _{t}+1}}, \end{aligned}$$

the previous condition is equivalent to

$$\begin{aligned} \varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right)= & {} p^{\gamma _{t}}\frac{\lambda \mu }{2}K\left( p\right) +p^{\gamma _{t}}\left( 1-p\right) \left( 1-\lambda \mu \right) K\left( p\right) \nonumber \\&-\left( 1-p\right) ^{\gamma _{t}}\frac{\lambda \mu }{2}-\left( 1-p\right) ^{\gamma _{t}}\left( 1-\lambda \mu \right) p>0. \end{aligned}$$

(20)

Notice that (20) is never satisfied whenever both $\delta =1$ and $\delta ^i=0$ at the same time at $t-1$. Moreover, notice that $0<K\left( p\right) \le 1$ for every $p\in \left[ \frac{1}{2},1 \right] $. As $K\left( \frac{1}{2}\right) =1,$ we have that $\varDelta \left( \frac{1}{2}\right) =0$, while $\varDelta \left( 1\right) >0$ for every $\lambda $, $\mu $, $\delta _{t}$ and $\delta _{t}^{i}$. The derivative of $\varDelta $ with respect to p is

$$\begin{aligned} \varDelta ^{\prime }\left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right)&=\gamma _{t}p^{\gamma _{t}-1}\frac{\lambda \mu }{2}K\left( p\right) +p^{\gamma _{t}} \frac{\lambda \mu }{2}K^{\prime }\left( p\right) \\&\quad \,+\,\gamma _{t}p^{\gamma _{t}-1}\left( 1-p\right) \left( 1-\lambda \mu \right) K\left( p\right) -p^{\gamma _{t}}\left( 1-\lambda \mu \right) K\left( p\right) \\&\quad \,+\,p^{\gamma _{t}}\left( 1-p\right) \left( 1-\lambda \mu \right) K^{\prime }\left( p\right) +\gamma _{t}\left( 1-p\right) ^{\gamma _{t}-1}\frac{\lambda \mu }{2}\\&\quad \,-\,\left( 1-p\right) ^{\gamma _{t}}\left( 1-\lambda \mu \right) +\gamma _{t}\left( 1-p\right) ^{\gamma _{t}-1}\left( 1-\lambda \mu \right) p. \end{aligned}$$

At $p=\frac{1}{2}$ this is equal to

$$\begin{aligned} \varDelta ^{\prime }\left( \frac{1}{2},\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) =\left( \frac{1}{2}\right) ^{\gamma _{t}-1}\left\{ \gamma _{t}-\left( 1-\lambda \mu \right) -\frac{\delta _{t}^{i}}{\delta _{t}+1}\mu \left( 1-\lambda \right) \right\} , \end{aligned}$$

whose sign behaves as follows:

$$\begin{aligned} \varDelta ^{\prime }\left( \frac{1}{2}\right)\ge & {} 0 \quad \text {when }\delta _{t}\ge \frac{2}{\lambda \mu }-1+\delta _{t}^{i}\left( \frac{1 }{\lambda }-1\right) , \\ \varDelta ^{\prime }\left( \frac{1}{2}\right)< & {} 0 \quad \text {when } \delta _{t}<\frac{2}{\lambda \mu }-1+\delta _{t}^{i}\left( \frac{1}{\lambda } -1\right) . \end{aligned}$$

The function $\varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) =0$ implicitly defines $\delta _{t}=\varphi _{t}\left( p,\lambda ,\mu ,\delta _{t}^{i}\right) $, everywhere but at $\left( \frac{1}{2},\lambda ,\mu ,\delta _{t}^{i}\right) $. We can find $\varphi _{t}\left( p,\lambda ,\mu ,\delta _{t}^{i}\right) $ explicitly:

$$\begin{aligned} \varphi _{t}\left( p,\lambda ,\mu ,\delta _{t}^{i}\right) =\frac{ L+L^{\lambda }}{L-L^{\lambda }}+\delta _{t}^{i}\frac{L^{\lambda }-L^{\mu }}{ L-L^{\lambda }}. \end{aligned}$$

(21)
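As a numerical sanity check of (21), the following sketch evaluates inequality (19) directly and compares it with the threshold form $\delta _{t}>\varphi _{t}$. It assumes that $L$, $L^{\mu }$ and $L^{\lambda }$ are the log-likelihood ratios of a buy when the share of noise trades is 0, $\mu $ and $\lambda \mu $ respectively, and that $\delta _{t}^{m}=\delta _{t}+\delta _{t}^{i}$; the function names are illustrative and not part of the model.

```python
import math

def llr(p, noise):
    """Log-likelihood ratio of a buy when a share `noise` of trades is noise.
    Assumed reading of L (noise=0), L^mu (noise=mu), L^lambda (noise=lam*mu)."""
    up = noise / 2 + (1 - noise) * p
    down = noise / 2 + (1 - noise) * (1 - p)
    return math.log(up / down)

def phi(p, lam, mu, delta_i):
    """Threshold (21) on delta_t = h_{t-1} - l_{t-1}."""
    L, L_mu, L_lam = llr(p, 0.0), llr(p, mu), llr(p, lam * mu)
    return (L + L_lam) / (L - L_lam) + delta_i * (L_lam - L_mu) / (L - L_lam)

def condition_19(p, lam, mu, delta, delta_i):
    """Inequality (19) evaluated directly, assuming delta_m = delta + delta_i."""
    nm = lam * mu
    ti_up, ti_dn = mu / 2 + (1 - mu) * p, mu / 2 + (1 - mu) * (1 - p)
    mm_up, mm_dn = nm / 2 + (1 - nm) * p, nm / 2 + (1 - nm) * (1 - p)
    # Trader with a low signal: one high signal is cancelled, hence delta - 1.
    trader_up = p ** (delta - 1) * ti_up ** delta_i
    trader_dn = (1 - p) ** (delta - 1) * ti_dn ** delta_i
    # Naive ask: the market maker prices in one additional buy.
    k = delta + delta_i + 1
    lhs = trader_up / (trader_up + trader_dn)
    rhs = mm_up ** k / (mm_up ** k + mm_dn ** k)
    return lhs > rhs
```

For instance, `condition_19(0.7, 0.5, 0.4, delta, 0)` can be compared with `delta > phi(0.7, 0.5, 0.4, 0)` over a range of integer `delta`; under the stated assumptions the two should agree.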

Define $l=\partial L/\partial p$. Differentiating (21) with respect to p we find:

$$\begin{aligned} \frac{\partial \varphi _{t}}{\partial p}=\frac{\left( l^{\lambda }+l\right) \left( L-L^{\lambda }\right) -\left( L^{\lambda }+L\right) \left( l-l^{\lambda }\right) +\delta _{t}^{i}\left[ \left( l^{\lambda }-l^{\mu }\right) \left( L-L^{\lambda }\right) -\left( l-l^{\lambda }\right) \left( L^{\lambda }-L^{\mu }\right) \right] }{\left( L-L^{\lambda }\right) ^{2}} \end{aligned}$$

The sign of $\frac{\partial \varphi _{t}}{\partial p}$ depends on the value of $\delta _{t}^{i}$. In particular, $\frac{\partial \varphi _{t}}{\partial p}$ is positive if and only if

$$\begin{aligned} \delta _{t}^{i}<2\frac{lL^{\lambda }-l^{\lambda }L}{l^{\lambda }L-lL^{\lambda }-l^{\mu }\left( L-L^{\lambda }\right) +L^{\mu }\left( l-l^{\lambda }\right) }=D\left( p,\lambda ,\mu \right) , \end{aligned}$$

where $D\left( p,\lambda ,\mu \right) <0$ for every p, $\lambda $ and $\mu $ and $\frac{\partial D\left( p,\lambda ,\mu \right) }{\partial p}>0$.

Case 1: $\delta _{t}<\frac{2}{\lambda \mu }-1+\delta _{t}^{i}\left( \frac{1}{\lambda }-1\right) .$ When $p=\frac{1}{2}$ we found that $\varDelta ^{\prime }\left( \frac{1}{2}\right) <0\,$. Since $\varDelta \left( 1\right) >0$ for every $\lambda $, $\mu $, $\delta _{t}$ and $\delta _{t}^{i}$, we can conclude that $\varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) $ cuts the x-axis at least once. Suppose that $\varDelta \left( p^{*},\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) =0$ and that $ \delta _{t}^{i}<D\left( p^{*},\lambda ,\mu \right) $. This means that if we increase p from $p^{*}$ to $p^{\prime }\in \left( p^{*},1\right] $, the level of $\delta _{t}$ needed to keep $\varDelta $ at zero increases. As $\varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) $ is increasing in $\delta _{t}$, this means that, for given $\delta _{t}$ and $ \delta _{t}^{i}$, if $\varDelta $ is cutting the x-axis at $p^{*}$ it must be doing it from above. However, as $\varDelta \left( 1\right) >0$, it must be the case that it is cutting it again from below. This last fact is not possible, as $\frac{\partial D\left( p,\lambda ,\mu \right) }{\partial p}>0$, which implies that for any other $p^{\prime }\in \left( p^{*},1\right] $, we continue to have $\delta _{t}^{i}<D\left( p^{\prime },\lambda ,\mu \right) $: any crossing of the x-axis as p increases must occur from above. We conclude that we must have $\delta _{t}^{i}>D\left( p^{*},\lambda ,\mu \right) $, and any crossing of the x-axis for $p\in \left( \frac{1}{2},1\right] $ must be occurring once and from below.
Case 2: $\delta _{t}\ge \frac{2}{\lambda \mu }-1+\delta _{t}^{i}\left( \frac{1}{\lambda }-1\right) .$ As $\varDelta ^{\prime }\left( \frac{1}{2}\right) \ge 0$ and $\varDelta \left( 1\right) >0$, and given that we have just established that $\varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) $ can never cross the x-axis from above, it follows that $\varDelta \left( p,\lambda ,\mu ,\delta _{t},\delta _{t}^{i}\right) >0$ for every $p\in \left( \frac{1}{2},1\right] $.

The analysis of the previous two cases shows that path-dependent buying occurs either for high values or for any value of the signal precision. If $b_{t}^{m}>s_{t}^{m}$ path-dependent buying coincides with herd buying, if $b_{t}^{m}<s_{t}^{m}$ it coincides with contrarian buying.
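The two cases can be illustrated numerically. The sketch below codes the left-hand side of (20) and, in Case 1, locates the crossing point $p^{*}$ by bisection; in Case 2 it simply reports that no sign change exists on the search interval. The function names, the bisection bracket $[0.51,1]$ and the tolerance are my own illustrative choices, not part of the proof.

```python
import math

def delta_fn(p, lam, mu, delta, delta_i):
    """Left-hand side of (20), with K(p) and gamma_t as defined in the text."""
    nm = lam * mu
    gamma = (delta - 1) / (delta + 1)
    ti_up, ti_dn = mu / 2 + (1 - mu) * p, mu / 2 + (1 - mu) * (1 - p)
    mm_up, mm_dn = nm / 2 + (1 - nm) * p, nm / 2 + (1 - nm) * (1 - p)
    K = ((ti_up * mm_dn) / (ti_dn * mm_up)) ** (delta_i / (delta + 1))
    return p ** gamma * K * mm_dn - (1 - p) ** gamma * mm_up

def case_threshold(lam, mu, delta_i):
    """Value of delta_t separating Case 1 (below) from Case 2 (at or above)."""
    return 2 / (lam * mu) - 1 + delta_i * (1 / lam - 1)

def p_star(lam, mu, delta, delta_i, lo=0.51, hi=1.0, tol=1e-12):
    """Bisection for the zero of (20); returns None when there is no
    negative-to-positive sign change on [lo, hi] (as in Case 2)."""
    f = lambda p: delta_fn(p, lam, mu, delta, delta_i)
    if not (f(lo) < 0 < f(hi)):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With $\lambda =0.5$ and $\mu =0.4$, a value such as $\delta _{t}=5$ falls in Case 1 and `p_star` returns an interior crossing point, whereas $\delta _{t}=12$ falls in Case 2 and `p_star` returns `None`.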

To study the effect of $\lambda $ on $p^{*}$, notice that $\varDelta >0$ if and only if $\varPsi =\left( \delta _{t}-1\right) L+\delta _{t}^{i}L^{\mu }-\left( \delta _{t}^{m}+1\right) L^{\lambda }>0$. As $L^{\lambda }$ is decreasing in $\lambda $, $\varPsi $ increases in $\lambda $ whenever $ b_{t}^{m}<s_{t}^{m}\,$, and it decreases in $\lambda $ when $ b_{t}^{m}>s_{t}^{m}$, for every $p\in \left( \frac{1}{2},1\right] $. It follows that $\frac{\partial p^{*}}{\partial \lambda }>0$ when $ b_{t}^{m}<s_{t}^{m}$, and that $\frac{\partial p^{*}}{\partial \lambda } <0$ when $b_{t}^{m}>s_{t}^{m}$. Moreover, $\varPsi $ is increasing in $\mu $ for every $p\in \left( \frac{1}{ 2},1\right] $. It follows that $\frac{\partial p^{*}}{\partial \mu }<0.$

1.3 A.3 Proof of Theorem 2

Consider $p=\frac{1}{2}$. Then both the market maker’s and the traders’ valuations will be equal to $\frac{1}{2}$ for every $\mathcal {F}_{t}$, every $\sigma $ and every $\lambda ,\mu \in \left[ \frac{1}{2},1\right] $. Since we have defined path-dependent buying only in the case where buying is strictly preferred to any other action, herding is not possible.

Consider $p=1$. Then for any $\mathcal {F}_{t}$ and every $\lambda ,\mu \in \left[ \frac{1}{2},1\right] $, $E\left[ V\mid \mathcal {F}_{t},\mathcal {T} _{t},\sigma ^{i}\right] =1$ if $\sigma ^{i}=H$ and $E \left[ V\mid \mathcal {F}_{t},\mathcal {T}_{t},\sigma ^{i}\right] =0$ if $ \sigma ^{i}=L$, so traders always follow their signal and no PDB is possible.

For $p\in \left( \frac{1}{2},1\right) $, we have from (19) that path-dependent buying occurs at t whenever

$$\begin{aligned} h_{t-1}-l_{t-1}>\frac{L+L^{\lambda }}{L-L^{\lambda }}+\left( b_{t-1}^{i}-s_{t-1}^{i}\right) \frac{L^{\lambda }-L^{\mu }}{L-L^{\lambda }} \end{aligned}$$

whose left-hand side is maximized for $l_{t-1}=0$ and whose right-hand side is minimized for $b_{t-1}^{i}=0$. The path just described, where all buys are type I and the sells are type II, is a unique type path compatible with $ \mathcal {F}_{t}$, and the one with the highest possible traders’ valuation of the asset associated with it. It follows that path-dependent buying has positive probability of happening at t whenever

$$\begin{aligned} b_{t-1}^{m}>\frac{L+L^{\lambda }}{L-L^{\lambda }}-s_{t-1}^{m}\frac{ L^{\lambda }-L^{\mu }}{L-L^{\lambda }} \end{aligned}$$

(22)

I cannot appeal to noise trading to argue that a trading history satisfying the previous inequality has positive probability, and thereby prove the existence of herding: if the trading history were in fact generated by at least one noise buy, herding could no longer occur. However, if the market maker assigns positive probability to herd buying given the trading history, then consistency implies that, from an ex-ante perspective, this probability must be positive.

1.4 A.4 Proof of Proposition 1

Rearranging (19), from the point of view of time $\tau $, the probability of path-dependent buying at $T=\tau +t$ is:

$$\begin{aligned} \Pr \left( \left( h_{T-1}-l_{T-1}\right) \left( L-L^{\lambda }\right) >L+L^{\lambda }+\left( L^{\lambda }-L^{\mu }\right) \left( b_{T-1}^{i}-s_{T-1}^{i}\right) +\varPi _{\tau }\right) . \end{aligned}$$

(23)

where $\varPi _{\tau }=\log \frac{\pi _{\tau -1}^{m}\left( 1-\pi _{\tau -1}^{i}\right) }{\pi _{\tau -1}^{i}\left( 1-\pi _{\tau -1}^{m}\right) }.$

When $V=1$, the vector $X=(h_{T-1}, l_{T-1}, b_{T-1}^{i}, s_{T-1}^{i})$ has a multinomial distribution $\left( T-1;\mathbf {\mu }\right) $ where:

$$\begin{aligned} \mathbf {\mu }= \begin{bmatrix} \left( 1-\lambda \right) p \\ \left( 1-\lambda \right) \left( 1-p\right) \\ \lambda \left( 1-\mu \right) p+\frac{\lambda \mu }{2} \\ \lambda \left( 1-\mu \right) \left( 1-p\right) +\frac{\lambda \mu }{2} \end{bmatrix} \end{aligned}$$
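The cell probabilities above can be checked to sum to one, and they pin down the expected net counts that drive the limit argument. A minimal sketch follows; the function names are illustrative, and the only inputs are the four entries of the vector just given.

```python
def trade_probs(p, lam, mu):
    """Cell probabilities of (h, l, b^i, s^i) conditional on V = 1."""
    return (
        (1 - lam) * p,
        (1 - lam) * (1 - p),
        lam * (1 - mu) * p + lam * mu / 2,
        lam * (1 - mu) * (1 - p) + lam * mu / 2,
    )

def expected_net_counts(p, lam, mu, T):
    """Expected values of h - l and b^i - s^i after T - 1 trades."""
    q_h, q_l, q_b, q_s = trade_probs(p, lam, mu)
    n = T - 1
    return n * (q_h - q_l), n * (q_b - q_s)
```

The ratio of the two expected net counts equals $\frac{1-\lambda }{\lambda \left( 1-\mu \right) }$ for any $p>\frac{1}{2}$, since the common factor $\left( T-1\right) \left( 2p-1\right) $ cancels.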

Asymptotically, as $T\rightarrow \infty $, the multinomial distribution converges to a multivariate normal distribution $N\left( \left( T-1\right) \mathbf {\mu },\Sigma \right) $, where $\Sigma $ is the covariance matrix. Then, conditional on $V=1$, as $t\rightarrow \infty $, $h_{\tau +t-1}>l_{\tau +t-1}$ and $b_{\tau +t-1}^{i}>s_{\tau +t-1}^{i}$. The probability in (23) can be written as:

$$\begin{aligned}&\Pr \left( \lim _{t\rightarrow \infty }\left( \frac{h_{T-1}-l_{T-1}}{ b_{T-1}^{i}-s_{T-1}^{i}}-\frac{L+L^{\lambda }+\varPi _{\tau }}{\left( L-L^{\lambda }\right) \left( b_{T-1}^{i}-s_{T-1}^{i}\right) }\right)>\frac{ L^{\lambda }-L^{\mu }}{L-L^{\lambda }}\right) \\&\quad =\Pr \left( \frac{1-\lambda }{\lambda \left( 1-\mu \right) }>\frac{ L^{\lambda }-L^{\mu }}{L-L^{\lambda }}\right) . \end{aligned}$$

The ratio $\left( L^{\lambda }-L^{\mu }\right) /\left( L-L^{\lambda }\right) $ is decreasing in p for every $\lambda $ and $\mu $.^{Footnote 19} Moreover, by L’Hôpital’s rule,

$$\begin{aligned} \lim _{p\rightarrow \frac{1}{2}}\frac{L^{\lambda }-L^{\mu }}{L-L^{\lambda }}= \frac{\left( 1-\lambda \right) }{\lambda }\text {,} \end{aligned}$$

which is always smaller than $\left( 1-\lambda \right) /\left[ \lambda \left( 1-\mu \right) \right] $ for every $\lambda $ and $\mu >0$. Hence, for $T\rightarrow \infty $ the probability in (23) is equal to 1.

We can conclude that, in the limit, path-dependent buying coincides with herd buying.
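The limit and the monotonicity used in this argument can be illustrated numerically. The sketch below assumes, as in the rest of the appendix, that $L$, $L^{\mu }$ and $L^{\lambda }$ are the log-likelihood ratios of a buy with noise shares 0, $\mu $ and $\lambda \mu $; the helper names are hypothetical.

```python
import math

def llr(p, noise):
    """Log-likelihood ratio of a buy when a share `noise` of trades is noise
    (assumed reading of L, L^mu and L^lambda)."""
    up = noise / 2 + (1 - noise) * p
    down = noise / 2 + (1 - noise) * (1 - p)
    return math.log(up / down)

def leakage_ratio(p, lam, mu):
    """The ratio (L^lambda - L^mu) / (L - L^lambda)."""
    L, L_mu, L_lam = llr(p, 0.0), llr(p, mu), llr(p, lam * mu)
    return (L_lam - L_mu) / (L - L_lam)

def asymptotic_bound(lam, mu):
    """The limit of (h - l) / (b^i - s^i), i.e. (1 - lam) / (lam * (1 - mu))."""
    return (1 - lam) / (lam * (1 - mu))
```

Evaluating `leakage_ratio` close to $p=\frac{1}{2}$ approximates $\left( 1-\lambda \right) /\lambda $, and on a grid of larger p the ratio stays below `asymptotic_bound(lam, mu)`, which is the inequality that delivers probability one.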

At T, the probability of path-dependent selling is equal to:

$$\begin{aligned} \Pr \left( \left( h_{T-1}-l_{T-1}\right) \left( L-L^{\lambda }\right) +L+L^{\lambda } <\left( b_{T-1}^{i}-s_{T-1}^{i}\right) \left( L^{\lambda }-L^{\mu }\right) +\varPi _{\tau } \right) \end{aligned}$$

(24)

As $T\rightarrow \infty $, this probability becomes $\Pr \left( \frac{ 1-\lambda }{\lambda \left( 1-\mu \right) }<\frac{L^{\lambda }-L^{\mu }}{ L-L^{\lambda }}\right) $. Since we have just proved that the complement set has probability one, it must be that the probability of contrarian selling goes to zero as $T\rightarrow \infty $.

Similarly, it can be shown that when $V=0$ herd selling happens almost surely, whereas herd buying and contrarian behavior almost never happen.

1.5 A.5 Proof of Corollary 1

Parts of the proof of this corollary largely follow the intuition behind the proof of AZ’s Proposition 10. Assume $V=1$ and let $\phi ^{\prime }_{h}=E\left[ 1-V^{m}_{t+1} \mid V=1, \mathcal {F}_{t}\right] $ during periods of herd buying in a market where information leaks. Let $\phi _{v}$ be the same quantity, but in a Glosten–Milgrom market where herding is not possible. Formally,

$$\begin{aligned} \phi ^{\prime }_{h}= & {} 1-E\left[ \pi ^{m}_{t+1} \mid V=1\right] \nonumber \\= & {} 1-\left[ Pr\left( a_{t}=\text {buy} \mid V=1\right) A^{\prime }_{t}+Pr\left( a_{t}=\text {sell} \mid V=1\right) B_{t}\right] \nonumber \\= & {} 1-\left[ \left( 1-\frac{\lambda \mu }{2}\right) A^{\prime }_{t}+\frac{ \lambda \mu }{2}B_{t}\right] , \end{aligned}$$

(25)

where $A^{\prime }_{t}$ is the rational expectation ask price as derived in Appendix A.1. Indicate with $\phi _{h}$ the same quantity as in equation (25), but replacing $A^{\prime }_{t}$ with the naive price $A_{t}$. The quantity $\phi _{v}$ is derived similarly, where $ Pr\left( a_{t}=\text {buy}\mid V=1\right) $ and $Pr\left( a_{t}=\text {sell}\mid V=1\right) $ are the probabilities of a buy and a sell in a Glosten–Milgrom market:

$$\begin{aligned} \phi _{v}= & {} 1-\left\{ \left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) p \right] A_{t}+\left[ \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) \left( 1-p \right) \right] B_{t}\right\} \end{aligned}$$

The price is more informative of the true state during herd buying than during periods of normal trading if the deviation of the asset price from its value is smaller during herding than during periods of normal trade, that is if $\phi ^{\prime }_{h}<\phi _{v}$. In particular,

$$\begin{aligned} \phi ^{\prime }_{h}-\phi _{v}<\phi _{h}-\phi _{v}=\left( 1-p\right) \left( 1-\lambda \mu \right) \left( B_{t}-A_{t}\right) , \end{aligned}$$

which is always negative.
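The identity $\phi _{h}-\phi _{v}=\left( 1-p\right) \left( 1-\lambda \mu \right) \left( B_{t}-A_{t}\right) $ follows by subtracting the two expressions term by term. A short numerical check is sketched below; the ask and bid values passed in are arbitrary placeholders with $B_{t}<A_{t}$, not prices derived from the model.

```python
def phi_herd(A, B, lam, mu):
    """Expected pricing error during herd buying, as in (25) with the naive ask A."""
    nm = lam * mu
    return 1 - ((1 - nm / 2) * A + (nm / 2) * B)

def phi_gm(A, B, p, lam, mu):
    """Expected pricing error in a Glosten-Milgrom market with the same A and B."""
    nm = lam * mu
    buy = nm / 2 + (1 - nm) * p
    sell = nm / 2 + (1 - nm) * (1 - p)
    return 1 - (buy * A + sell * B)

def gap_closed_form(A, B, p, lam, mu):
    """Closed form (1 - p)(1 - lam*mu)(B - A) for phi_h - phi_v."""
    return (1 - p) * (1 - lam * mu) * (B - A)
```

Because the bid lies below the ask, the closed form is negative whenever $p<1$ and $\lambda \mu <1$.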

Analogously, one can prove that herd selling reduces $E\left[ V^{m}_{t+1}-0 \mid V=0, \mathcal {F}_{t}\right] $ compared to the same quantity in a non-herding market.

Appendix A.4 proves that, as $t\rightarrow \infty $, only herding in the direction of the asset’s true value survives a.s. Then, in the limit, herd behavior improves price informativeness.

1.6 A.6 Proof of Proposition 2

Without loss of generality, assume $V=1$. Path-dependent buying occurs at t whenever the quadruple $\left( h_{t-1},l_{t-1},b_{t-1}^{i},s_{t-1}^{i}\right) $ is such that:

$$\begin{aligned} \left( h_{t-1}-l_{t-1}\right) \left( L-L^{\lambda }\right) >L+L^{\lambda }+\left( L^{\lambda }-L^{\mu }\right) \left( b_{t-1}^{i}-s_{t-1}^{i}\right) +\varPi _{0} \end{aligned}$$

(26)

where $\varPi _{0}$, the log-ratio of the initial beliefs, will be ignored for the sake of the comparative statics. In expectation, (26) becomes

$$\begin{aligned} \left( t-1\right) \left( 2p-1\right) \left[ \left( 1-\lambda \right) L+\lambda \left( 1-\mu \right) L^{\mu }-L^{\lambda }\left( 1-\lambda \mu \right) \right] >L+L^{\lambda } \end{aligned}$$

(27)

Proving that (27) is less strict as p increases is equivalent to proving that (26) is more likely to hold at t the larger p, which makes PDB more likely the higher the quality of private information.

Define the function

$$\begin{aligned} G\left( p,\lambda ,\mu ,t\right)= & {} \left( t-1\right) \left( 2p-1\right) \left[ \left( 1-\lambda \right) L+\lambda \left( 1-\mu \right) L^{\mu }\right. \nonumber \\&\left. -\,L^{\lambda }\left( 1-\lambda \mu \right) \right] -L-L^{\lambda }. \end{aligned}$$

(28)

PDB is more likely the higher the quality of information p whenever G is increasing in p. Define $g\left( p,\lambda ,\mu ,t\right) =\frac{\partial G }{\partial p}$:

$$\begin{aligned} g\left( p,\lambda ,\mu ,t\right)= & {} \left( t-1\right) \Big \{\left( 2p-1\right) \left[ \left( 1-\lambda \right) l+\lambda \left( 1-\mu \right) l^{\mu }-l^{\lambda }\left( 1-\lambda \mu \right) \right] \nonumber \\&+\,2\left[ \left( 1-\lambda \right) L+\lambda \left( 1-\mu \right) L^{\mu }-L^{\lambda }\left( 1-\lambda \mu \right) \right] \Big \} -l-l^{\lambda }.\qquad \end{aligned}$$

(29)
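The functions G in (28) and g in (29) can be coded directly, which allows a finite-difference check that g is indeed the derivative of G. The sketch assumes that $L$, $L^{\mu }$, $L^{\lambda }$ are the log-likelihood ratios with noise shares 0, $\mu $, $\lambda \mu $, so that $l$, $l^{\mu }$, $l^{\lambda }$ are their derivatives in p; all function names are illustrative.

```python
import math

def llr(p, noise):
    """Assumed form of L (noise=0), L^mu (noise=mu), L^lambda (noise=lam*mu)."""
    up = noise / 2 + (1 - noise) * p
    down = noise / 2 + (1 - noise) * (1 - p)
    return math.log(up / down)

def dllr(p, noise):
    """Derivative of llr in p: (1 - noise) / (up * down), because up + down = 1."""
    up = noise / 2 + (1 - noise) * p
    down = noise / 2 + (1 - noise) * (1 - p)
    return (1 - noise) / (up * down)

def _mix(f, p, lam, mu):
    # Common bracket (1-lam) f(0) + lam (1-mu) f(mu) - (1-lam*mu) f(lam*mu).
    return ((1 - lam) * f(p, 0.0) + lam * (1 - mu) * f(p, mu)
            - (1 - lam * mu) * f(p, lam * mu))

def G(p, lam, mu, t):
    """Equation (28)."""
    return ((t - 1) * (2 * p - 1) * _mix(llr, p, lam, mu)
            - llr(p, 0.0) - llr(p, lam * mu))

def g(p, lam, mu, t):
    """Equation (29), the derivative of G with respect to p."""
    inner = (2 * p - 1) * _mix(dllr, p, lam, mu) + 2 * _mix(llr, p, lam, mu)
    return (t - 1) * inner - dllr(p, 0.0) - dllr(p, lam * mu)
```

At $p=\frac{1}{2}$ all log-likelihood ratios vanish, so `g(0.5, lam, mu, t)` reduces to $-l-l^{\lambda }$ regardless of t.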

g is continuously differentiable in $p\in \left[ \frac{1}{2},1\right) $ and t. Then, $g\left( p,\lambda ,\mu ,t\right) =0$ implicitly defines the function

$$\begin{aligned} \tau \left( p;\lambda ,\mu \right) =\frac{l+l^{\lambda }}{\left( 2p-1\right) \left[ \left( 1-\lambda \right) l+\lambda \left( 1-\mu \right) l^{\mu }-l^{\lambda }\left( 1-\lambda \mu \right) \right] +2\left[ \left( 1-\lambda \right) L+\lambda \left( 1-\mu \right) L^{\mu }-L^{\lambda }\left( 1-\lambda \mu \right) \right] }+1. \end{aligned}$$

Lemma 3

The function $\tau \left( p;\lambda ,\mu \right) $ is strictly decreasing in $p\in \left[ \frac{1}{2},1\right] \,$ for every value of $ \lambda ,\mu \in \left[ 0,1\right] $.

Proof

Defining:

$$\begin{aligned} Z\left( p;\lambda ,\mu \right)= & {} \left( 1-\lambda \right) L+\lambda \left( 1-\mu \right) L^{\mu }-L^{\lambda }\left( 1-\lambda \mu \right) \\ z\left( p;\lambda ,\mu \right)= & {} \left( 1-\lambda \right) l+\lambda \left( 1-\mu \right) l^{\mu }-l^{\lambda }\left( 1-\lambda \mu \right) \\ z^{\prime }\left( p;\lambda ,\mu \right)= & {} \frac{\partial z\left( p;\lambda ,\mu \right) }{\partial p}\text {, }l^{\prime }=\frac{\partial l}{\partial p} \text {, }l^{\lambda \prime }=\frac{\partial l^{\lambda }}{\partial p} \end{aligned}$$

allows us to write $\tau \left( p;\lambda ,\mu \right) $ in more compact terms:

$$\begin{aligned} \tau \left( p;\lambda ,\mu \right) =\frac{l+l^{\lambda }}{\left( 2p-1\right) z+2Z}+1. \end{aligned}$$

Then, the relationship between t and p (the number of trading rounds that keep the likelihood of PDB constant as the quality of information changes) is given by the sign of the following derivative:

$$\begin{aligned} \frac{\partial \tau \left( p;\lambda ,\mu \right) }{\partial p}=\frac{\left( l^{\prime }+l^{\lambda \prime }\right) \left[ \left( 2p-1\right) z+2Z\right] -\left( l+l^{\lambda }\right) \left[ 4z+\left( 2p-1\right) z^{\prime }\right] }{\left[ \left( 2p-1\right) z+2Z\right] ^{2}}, \end{aligned}$$

(30)

which is negative if and only if its numerator is negative. The numerator is the difference of two positive terms, whose sign cannot be determined in closed form. Numerical computation over $\mu $, $\lambda $ and p shows that, for all interior values of $\mu $ and $\lambda $, the derivative is maximized at $p=\frac{1}{2}$, and that in all cases this maximum is bounded below zero. The overall maximum occurs at the boundary, when $\mu =\lambda =0$ or $ \lambda =1$, where $\frac{\partial \tau \left( p;\lambda ,\mu \right) }{ \partial p}=0$.

$\square $

At $p=\frac{1}{2}$, $g\left( \frac{1}{2},\lambda ,\mu ,t\right) =-4\left( 2-\lambda \mu \right) <0.$ Moreover,

$$\begin{aligned} \lim _{p\rightarrow 1}g\left( p,\lambda ,\mu ,t\right) =\left\{ \begin{array}{c} +\infty \text { when }t>\frac{1}{\left( 1-\lambda \right) }+1 \\ -\infty \text { when }t<\frac{1}{\left( 1-\lambda \right) }+1 \end{array} \right. \end{aligned}$$

Suppose that $t>\frac{1}{\left( 1-\lambda \right) }+1$. Then, g crosses the p-axis at least once. The claim is that g cannot cross the p-axis from above, meaning that it either crosses the p-axis only once (when $t> \frac{1}{\left( 1-\lambda \right) }+1$) or never (when $t<\frac{1}{\left( 1-\lambda \right) }+1$). By contradiction, suppose that g crosses the p-axis from above, so that

$$\begin{aligned} g\left( p^{\prime },\lambda ,\mu ,t\right) =0\text { for some }p^{\prime } . \end{aligned}$$

Holding everything else equal, $g\left( p,\lambda ,\mu ,t\right) <0$ for every $p^{\prime }<p<p^{\prime }+\varepsilon $ in the right neighborhood of $ p^{\prime }$. In order to keep g at zero as p increases, t needs to decrease, as it is shown in Lemma 3 that $\frac{\partial \tau \left( p;\lambda ,\mu \right) }{\partial p}<0$. However, by inspecting g’s functional form, $\frac{\partial g}{\partial t}\ge 0$ holds for every $ p,\lambda ,\mu $. It follows that g cannot cross the p-axis from above.

When $\lambda =1$, $\frac{1}{\left( 1-\lambda \right) }+1=+\infty $: no information leaks and PDB is not possible. When $\lambda =0$ or when $\mu =0,$ both G and g are always negative: both terms multiplying t in (28) and (29) go to zero, so there is no possibility to increase t in order to make G and g positive.

In conclusion, when $t>\frac{1}{\left( 1-\lambda \right) }+1$, there exists $ p^{**}\left( \lambda ,\mu ,t\right) $ such that $G\left( p,\lambda ,\mu ,t\right) $ is increasing in p if and only if $p\in \left( p^{**},1\right) $.

1.7 A.7 Proof of Proposition 3

The statement can be proven by looking at order imbalance, which provides a proxy for the quantity and direction of the information present in the market (see for instance Easley et al. (2012)). In this respect, the following lemma will prove useful.

Lemma 4

Consider a market where the total probability of noise trading is $\lambda \mu $ and where information is leaked at a rate $\varGamma $. The expected order imbalance at time T is:

$$\begin{aligned} E\left[ OI_{T}\right] =E\left[ b_{T}^{m}-s_{T}^{m}\right] =\left( 1-\lambda \mu \right) \left( 2p-1\right) T+2\left( 1-\lambda \mu \right) \left( 1-p\right) \frac{Tn}{F+n} \end{aligned}$$

where F is the expected time between herding episodes and n is the expected length of a herding episode. Both F and n are functions of $ p,\lambda $ and $\mu $ and their analytical expressions are given by (32) and (33), respectively.

Proof

Consider a market at T and suppose that $V=1$. The expected order imbalance in a Glosten–Milgrom market is given by (11). In a market with PDB, we need to account only for the possibility of herd buying, as this is the only PDB that is expected when $V=1$. Indicate with $\mathcal {H}$ the number of expected herd buying episodes and with n their expected length. In a market with PDB and $V=1$ the expected difference between buys and sells is

$$\begin{aligned} E\left[ b^{m}_{T}-s^{m}_{T}\right] =\left( 1-\lambda \mu \right) \left( 2p-1\right) T+2\left( 1-\lambda \mu \right) \left( 1-p\right) n\mathcal {H}, \end{aligned}$$

(31)

where $n\mathcal {H}$ is multiplied by $\left( 1-\lambda \mu \right) \left( 1-p\right) $ because only herding periods hiding low signals impact the difference between buys and sells, and it is multiplied by 2 because a hidden low signal results in one more buy and one less sell, increasing the difference by 2. Indicate with F the expected time between herding periods. Then F is the first integer such that, in expectation, (19) holds, so that herding is just about to be triggered:^{Footnote 20}

$$\begin{aligned} E\left[ h_{F}-l_{F}\right] L-L+E\left[ b_{F}^{i}-s_{F}^{i}\right] L^{\mu } \ge E\left[ b_{F}^{m}-s_{F}^{m}\right] L^{\lambda }+L^{\lambda }. \end{aligned}$$

Taking the expectations and rearranging:

$$\begin{aligned} F=\left\lceil \frac{L^{\lambda }+L}{\left( 2p-1\right) \left[ \left( 1-\lambda \right) \left( L-L^{\lambda }\right) -\lambda \left( 1-\mu \right) \left( L^{\lambda }-L^{\mu }\right) \right] }\right\rceil \end{aligned}$$

(32)

At T, the expected number of herding episodes $\mathcal {H}$ can be calculated as:

$$\begin{aligned} \mathcal {H}=\frac{T-n\mathcal {H}}{F}\Leftrightarrow \mathcal {H}=\frac{T}{F+n} \end{aligned}$$

The average duration n of a herding episode can be calculated by noticing that n is such that during herding the market maker’s price needs to catch up with the informational content of the last Type I buy.

The impact of the period before herding starts if a Type I buy is realized is:

$$\begin{aligned}&\left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m} }{1-\pi _{-1}^{m}}\right) -\left( \log \frac{\pi _{-2}^{i}}{1-\pi _{-2}^{i}} -\log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}}\right) \\= & {} \left( \log \frac{\pi _{-2}^{i}}{1-\pi _{-2}^{i}}+L-\left( \log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}}+L^{\lambda }\right) \right) -\left( \log \frac{ \pi _{-2}^{i}}{1-\pi _{-2}^{i}}-\log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}} \right) \\= & {} L-L^{\lambda } \end{aligned}$$

The impact of herding in recovering from the gap accumulated in the $-1$ period is

$$\begin{aligned}&\left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m} }{1-\pi _{-1}^{m}}\right) -\left( \log \frac{\pi _{-1+n}^{i}}{1-\pi _{-1+n}^{i}}-\log \frac{\pi _{-1+n}^{m}}{1-\pi _{-1+n}^{m}}\right) \\= & {} \left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m} }{1-\pi _{-1}^{m}}\right) -\left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}} -\log \frac{\pi _{-1}^{m}}{1-\pi _{-1}^{m}}-n^{\prime }L^{\lambda }\right) \\= & {} n^{\prime }L^{\lambda } \end{aligned}$$

Then normal trading recovers when $n^{\prime }L^{\lambda }\ge L-L^{\lambda } $, meaning that $n^{\prime }\ge \frac{L-L^{\lambda }}{L^{\lambda }}.$ As trading starts normally as soon as the price moves by more than the information contained in the last trade prior to herding, we can take $ n^{\prime }=\left\lceil \frac{L-L^{\lambda }}{L^{\lambda }}\right\rceil $.

The impact of the period before herding starts if a Type II sell is realized is:

$$\begin{aligned}&\left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m} }{1-\pi _{-1}^{m}}\right) -\left( \log \frac{\pi _{-2}^{i}}{1-\pi _{-2}^{i}} -\log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}}\right) \\= & {} \left( \log \frac{\pi _{-2}^{i}}{1-\pi _{-2}^{i}}-L^{\mu }-\left( \log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}}-L^{\lambda }\right) \right) -\left( \log \frac{\pi _{-2}^{i}}{1-\pi _{-2}^{i}}-\log \frac{\pi _{-2}^{m}}{1-\pi _{-2}^{m}}\right) \\= & {} L^{\lambda }-L^{\mu } \end{aligned}$$

The impact of herding in recovering from the gap accumulated in the $-1$ period is

$$\begin{aligned}&\left( \log \frac{\pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m} }{1-\pi _{-1}^{m}}+n^{\prime \prime }L^{\lambda }\right) -\left( \log \frac{ \pi _{-1}^{i}}{1-\pi _{-1}^{i}}-\log \frac{\pi _{-1}^{m}}{1-\pi _{-1}^{m}} \right) \\= & {} n^{\prime \prime }L^{\lambda } \end{aligned}$$

Then normal trading recovers when $n^{\prime \prime }L^{\lambda }\ge L^{\lambda }-L^{\mu }$, meaning that $n^{\prime \prime }\ge \frac{L^{\lambda }-L^{\mu }}{ L^{\lambda }}.$ As trading starts normally as soon as the price moves by more than the information contained in the last trade prior to herding, we can take $n^{\prime \prime }=\left\lceil \frac{L^{\lambda }-L^{\mu }}{L^{\lambda }}\right\rceil $. Then the average duration of a herding episode is:

$$\begin{aligned} n=\left( 1-\lambda \right) \frac{L-L^{\lambda }}{L^{\lambda }}+\lambda \frac{ L^{\lambda }-L^{\mu }}{L^{\lambda }} \end{aligned}$$

(33)

Figure 5 shows that F is a decreasing function of p while n is an increasing function of p for any $\lambda $ and $\mu $ in $\left[ 0,1 \right] $. $\square $
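Lemma 4 can be made concrete with a small calculation. The sketch below codes F from (32), n from (33) and the two components of the expected order imbalance, again under the assumption that $L$, $L^{\mu }$, $L^{\lambda }$ are the log-likelihood ratios with noise shares 0, $\mu $, $\lambda \mu $. The function names and the guard against a non-positive denominator are my own additions.

```python
import math

def llr(p, noise):
    """Assumed form of L (noise=0), L^mu (noise=mu), L^lambda (noise=lam*mu)."""
    up = noise / 2 + (1 - noise) * p
    down = noise / 2 + (1 - noise) * (1 - p)
    return math.log(up / down)

def herding_gap(p, lam, mu):
    """F in (32): expected number of periods between herding episodes."""
    L, L_mu, L_lam = llr(p, 0.0), llr(p, mu), llr(p, lam * mu)
    drift = (2 * p - 1) * ((1 - lam) * (L - L_lam) - lam * (1 - mu) * (L_lam - L_mu))
    if drift <= 0:
        # Guard added for this sketch; (32) is only meaningful for positive drift.
        raise ValueError("non-positive denominator in (32)")
    return math.ceil((L_lam + L) / drift)

def herding_length(p, lam, mu):
    """n in (33): expected length of a herding episode."""
    L, L_mu, L_lam = llr(p, 0.0), llr(p, mu), llr(p, lam * mu)
    return (1 - lam) * (L - L_lam) / L_lam + lam * (L_lam - L_mu) / L_lam

def expected_order_imbalance(p, lam, mu, T):
    """Returns (Glosten-Milgrom component, herding component) of E[OI_T]."""
    F, n = herding_gap(p, lam, mu), herding_length(p, lam, mu)
    base = (1 - lam * mu) * (2 * p - 1) * T
    herd = 2 * (1 - lam * mu) * (1 - p) * T * n / (F + n)
    return base, herd
```

Calling `expected_order_imbalance(0.7, 0.5, 0.4, 1000)` returns the two components separately, which makes it easy to see how small the herding term is relative to the Glosten–Milgrom term for these illustrative parameter values.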

Consider an analyst observing an order imbalance equal to $b^m_t-s^m_t$ and attributing it to

$$\begin{aligned} E\left[ \widehat{OI}_{T}\right] =\left( 1-\widehat{\lambda } \widehat{\mu }\right) \left( 2\widehat{p}-1\right) T+2\left( 1-\widehat{\lambda }\widehat{\mu }\right) \left( 1-\widehat{p}\right) \frac{T\widehat{n}}{\widehat{F}+\widehat{n}} \end{aligned}$$

such that $\widehat{\lambda }\widehat{\mu }=\lambda \mu $. To complete the proof of Proposition 3, I will first show that if the analyst correctly believes that $\widehat{\lambda }=\lambda $, then it must be that $\widehat{p}=p$ and vice-versa. Then, the proof will continue to establish that $\widehat{\lambda }>\lambda $ if and only if $\widehat{p}>p$.

1.
$\widehat{\lambda }=\lambda $ implies $\widehat{p}=p.$ Suppose that the analyst observes an order imbalance equal to $b^m_t-s^m_t$ and that he believes $\widehat{\lambda }=\lambda $ (i.e., his assessment of the probability of type II traders is correct). In expectation, the order imbalance can be described by equation (12). If $\widehat{\lambda }=\lambda $, then it must be that $\widehat{\mu }=\mu $ to satisfy the requirement that $\widehat{\mu }=\frac{\lambda \mu }{\widehat{\lambda }}$. Simplifying, subtracting one from both sides and changing sign, we can rewrite $E\left[ \widehat{OI }_T\right] =E\left[ OI_T\right] $ as
$$\begin{aligned} \frac{\left( 1-\widehat{p }\right) }{1+\frac{\widehat{n}}{\widehat{F}}}=\frac{\left( 1-p\right) }{1+\frac{n}{F}} \end{aligned}$$
(34)
As $\frac{\widehat{n}}{\widehat{F}}$ is increasing in $\widehat{p}$, the entire fraction on the left-hand side of equation (34) is decreasing in $\widehat{p}$, for every $\lambda $ and $\mu $, which makes $E\left[ \widehat{OI }_T\right] $ increasing (recall the change of sign) in $\widehat{p}$ for every $\lambda $ and $\mu $.

Suppose, by contradiction, that $\widehat{p}>p$. As $\frac{\partial E\left[ OI_T\right] }{\partial p}>0$, $\widehat{\lambda }=\lambda $ and $\widehat{\mu }=\mu $, it must be that $E\left[ \widehat{OI }_T\right] >E\left[ OI_T\right] $. Contradiction. An analogous argument can be made when we assume that $ \widehat{p}<p$.
2.
$\widehat{p }=p$ implies $\widehat{\lambda }=\lambda .$ As $ \widehat{p }=p$, setting $E\left[ \widehat{ OI}_T\right] =E\left[ OI_T\right] $ is equivalent to $\frac{\widehat{n }}{\widehat{F}}=\frac{n}{F}$. When $\widehat{\lambda }\widehat{\mu }$ is held constant at $\lambda \mu $, the derivative of $\frac{\widehat{n }}{\widehat{F}}$ with respect to $\widehat{\lambda }$ is equal to
$$\begin{aligned} \left. \frac{\partial \frac{\widehat{n}}{\widehat{F }}}{\partial \widehat{\lambda }}\right| _{\widehat{\lambda }\widehat{\mu }=\lambda \mu }= & {} \Big \{ \left[ \left( 1-\widehat{\lambda } \right) L-L^{\lambda }+\widehat{\lambda } L^{\mu }+\lambda \mu \left( L^{\lambda }-L^{\mu }\right) \right] \left( 2L^{\lambda }-L-L^{\mu }\right) \\&-\left( L-L^{\mu }\right) \left[ \left( 1-\widehat{\lambda }\right) \left( L-L^{\lambda }\right) +\widehat{\lambda } \left( L^{\lambda }-L^{\mu }\right) \right] \Big \} \left[ \left( L^{\lambda }+L\right) L^{\lambda }\right] ^{-2}. \end{aligned}$$
The following lemma is useful to determine the sign of the derivative.

Lemma 5

For any $p\in \left[ \frac{1}{2},1\right] $ and for any $\lambda ,\mu \in \left[ 0,1\right] $, $\left( 1-\lambda \right) L-L^{\lambda }+\lambda L^{\mu }\ge 0$.

Proof

At $p=\frac{1}{2}$, the LHS is equal to zero for every $\lambda $ and $\mu $. The derivative of the LHS with respect to $p$ is:

$$\begin{aligned} \frac{1-\lambda }{p\left( 1-p\right) }+\frac{\lambda \left( 1-\mu \right) }{ \left( 1-\frac{\mu }{2}\right) \frac{\mu }{2}+\left( 1-\mu \right) ^{2}p\left( 1-p\right) }-\frac{\left( 1-\lambda \mu \right) }{\left( 1-\frac{ \lambda \mu }{2}\right) \frac{\lambda \mu }{2}+\left( 1-\lambda \mu \right) ^{2}p\left( 1-p\right) } \end{aligned}$$

This is nonnegative if and only if:

$$\begin{aligned} \lambda \mu ^{2}( 1-\lambda ) ( 2p-1)^{2}\left[ ( 1-\mu ) ( 1-\lambda \mu ) ( 1-4p( 1-p) ) +( 1-\mu ) +2-\lambda \mu \right] \ge 0 \end{aligned}$$

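By inspection, this inequality is always satisfied: each factor is nonnegative because $\lambda ,\mu \in \left[ 0,1\right] $ and $1-4p\left( 1-p\right) =\left( 2p-1\right) ^{2}\ge 0$. As the LHS is zero at $p=\frac{1}{2}$ and nondecreasing in $p$, it is nonnegative on $\left[ \frac{1}{2},1\right] $. $\square $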
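
As a sanity check, Lemma 5 can be verified numerically on a grid. The sketch below is illustrative only and rests on an assumption: the log-likelihood ratios are taken to be $L=\ln \frac{p}{1-p}$, $L^{\mu }=\ln \frac{(1-\mu )p+\mu /2}{(1-\mu )(1-p)+\mu /2}$, and $L^{\lambda }$ of the same form with $\lambda \mu $ in place of $\mu $. These forms are inferred from the derivative displayed in the proof, not restated from the definitions in the main text. The function names and grid sizes are hypothetical choices.

```python
import math

def log_odds(p, x):
    """Assumed log-likelihood ratio with noise weight x (inferred, not from the text).

    x = 0 gives L = ln(p / (1 - p)); x = mu gives L^mu; x = lam * mu gives L^lam.
    """
    num = (1.0 - x) * p + x / 2.0
    den = (1.0 - x) * (1.0 - p) + x / 2.0
    return math.log(num / den)

def lemma5_lhs(p, lam, mu):
    """(1 - lam) * L - L^lam + lam * L^mu, the left-hand side of Lemma 5."""
    return ((1.0 - lam) * log_odds(p, 0.0)
            - log_odds(p, lam * mu)
            + lam * log_odds(p, mu))

def lemma5_derivative(p, lam, mu):
    """Derivative of the LHS with respect to p, as displayed in the proof."""
    q = p * (1.0 - p)
    lm = lam * mu
    return ((1.0 - lam) / q
            + lam * (1.0 - mu) / ((1.0 - mu / 2.0) * mu / 2.0 + (1.0 - mu) ** 2 * q)
            - (1.0 - lm) / ((1.0 - lm / 2.0) * lm / 2.0 + (1.0 - lm) ** 2 * q))

# Grid over p in [0.5, 0.99] (p = 1 is excluded because L diverges there)
# and over lam, mu in [0, 1].
N = 25
p_grid = [0.5 + 0.49 * i / N for i in range(N + 1)]
unit_grid = [i / N for i in range(N + 1)]

min_lhs = min(lemma5_lhs(p, lam, mu)
              for p in p_grid for lam in unit_grid for mu in unit_grid)
min_derivative = min(lemma5_derivative(p, lam, mu)
                     for p in p_grid for lam in unit_grid for mu in unit_grid)

# Central finite difference at one interior point, to check that the assumed
# log-likelihood ratios are consistent with the derivative in the proof.
p0, lam0, mu0, h = 0.8, 0.3, 0.6, 1e-6
finite_diff = (lemma5_lhs(p0 + h, lam0, mu0) - lemma5_lhs(p0 - h, lam0, mu0)) / (2.0 * h)
derivative_gap = abs(finite_diff - lemma5_derivative(p0, lam0, mu0))

print("smallest LHS on grid:", min_lhs)
print("smallest derivative on grid:", min_derivative)
print("finite-difference gap:", derivative_gap)
```

Under these assumed forms, both minima should be zero up to floating-point error, attained on the boundary of the parameter set (for instance at $p=\frac{1}{2}$ or $\lambda \in \{0,1\}$), and the finite-difference gap should be negligible. A grid check of this kind does not replace the proof; it only guards against algebra slips.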

Given the result in Lemma 5, we can conclude that $\left. \frac{\partial \frac{\widehat{n }}{\widehat{F}}}{\partial \widehat{\lambda }} \right| _{\widehat{\lambda }\widehat{\mu }=\lambda \mu }<0$. Suppose, by contradiction, that $\widehat{\lambda }>\lambda $. Then, it must be that $ \frac{\widehat{n }}{\widehat{F}}<\frac{n}{F}$ along the $\left( \widehat{ \lambda }\widehat{\mu }=\lambda \mu \right) $-manifold. Contradiction. An analogous argument can be made when we assume that $\widehat{\lambda }<\lambda $.

3.
$\widehat{\lambda }>\lambda $ implies $\widehat{p }>p$. Suppose, instead, that $\widehat{p }<p$. Since $\frac{\partial E\left[ \widehat{OI}_T\right] }{\partial \widehat{p }}>0$ (the higher the quality of information, the stronger the asymmetry of information between traders and market maker, which results in a larger order imbalance, in expectation), ceteris paribus it must be that $E\left[ \widehat{OI }_T\right] <E\left[ OI_T \right] $. In order to re-establish equality between the observed $OI_T$ and the expected order imbalance imputed by $E\left[ \widehat{OI}_T\right] $, as $ \left. \frac{\partial E\left[ \widehat{OI}_T\right] }{\partial \widehat{\lambda }}\right| _{\widehat{\lambda }\widehat{\mu }=\lambda \mu }<0$ (a higher $\lambda $ results in a lower leakage ratio, which leads to a lower order imbalance, in expectation), one would need $\widehat{\lambda }<\lambda $, which leads to a contradiction.
4.
$\widehat{p }>p$ implies $\widehat{\lambda }>\lambda $. Suppose, by contradiction, that $\widehat{\lambda }<\lambda $. Since $ \left. \frac{\partial E\left[ \widehat{OI}_T\right] }{\partial \widehat{\lambda }}\right| _{\widehat{\lambda }\widehat{\mu }=\lambda \mu }<0$, ceteris paribus it must be that $E\left[ \widehat{OI }_T\right] >E\left[ OI_T \right] $ along the $\left( \widehat{\lambda }\widehat{\mu }=\lambda \mu \right) $-manifold. In order to re-establish equality between the observed $ OI_T$ and the expected order imbalance imputed by $E\left[ \widehat{OI }_T\right] $, as $\frac{\partial E\left[ \widehat{OI}_T\right] }{\partial \widehat{p }}>0$, one would need $\widehat{p }<p$, which leads to a contradiction.

As $\widehat{\lambda }=\lambda $ if and only if $ \widehat{p}=p$, and $\widehat{\lambda }>\lambda $ if and only if $\widehat{p}>p$, it follows that $\widehat{p}\ge p$ if and only if $\widehat{\varGamma }\le \varGamma $.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Testa, A. Path-dependent behavior and information leakage in financial markets. Econ Theory 67, 909–949 (2019). https://doi.org/10.1007/s00199-018-1102-3

Received: 15 July 2016
Accepted: 18 January 2018
Published: 01 February 2018
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s00199-018-1102-3
