1 Introduction

In electronic trading, agents are exposed to the risks that stem from latency in the marketplace. Latency is the aggregate of the time lags associated with the various stages of a trade. These stages, which occur in succession and are separated in time by random delays, include: the exchange sends quotes to an agent, the agent receives the quotes, the agent processes information and sends an instruction to the exchange, the exchange receives and handles the instruction, and finally, the outcome is notified to the agent. During the latency period, it is likely that the exchange will process other instructions that modify the limit order book (LOB) and possibly affect the outcome of the agent’s instructions. In particular, liquidity takers face the risk that the prices they target are not available by the time the exchange processes their order because the best quotes they observed were stale and therefore updated during the latency period. If the market moves against the liquidity taker’s interest, the order will not be filled or will be filled at worse prices – the outcome depends on the type of liquidity-taking order sent by the trader. The risks faced by liquidity providers are different; e.g., their limit orders may be adversely selected by a taker.

At high frequencies (milliseconds, microseconds), the best quotes in the LOB tend to flicker. Flickers are unpredictable short-lived deviations of a few ticks from the best quotes, and are the result of very frequent occurrences of rapid sequences of post-and-cancel activity in the LOB and the less frequent arrival of aggressive orders that consume liquidity that is immediately replenished. Thus over the latency period, the main source of risk faced by liquidity takers is from flickers in the book, and, to a lesser extent, the risk that stems from unexpected changes in the fundamental best quotes by the time their orders are processed. If latency were zero, traders would not face these risks because they would always take liquidity at the prices and quantities they observe in the LOB; however, all market participants face latency and latency is random.

In this paper, we show how a trader can execute a large position in a financial instrument when the trader faces random latency in the marketplace. The dynamics of the best bid price in the LOB consist of the ‘fundamental’ best quote of the instrument and the flickers (similarly for the best ask price). Innovations in the fundamental best quote are driven by a stochastic process, and the size and arrival times of flickers in the quotes are represented by a marked point process. The trader employs marketable limit orders (MLOs) to execute the position over a time window; MLOs are liquidity-taking orders that specify a price limit and are for immediate execution only. Filled MLOs have permanent price impact: a filled buy (sell) MLO exerts upward (downward) pressure on the fundamental value of the instrument. However, if the price limit of the MLO precludes it from being filled, the exchange cancels the order and there is no price impact because other market participants cannot observe missed trades.

In our model, the trader controls the price limit of the MLO and controls when to submit it to the exchange, both of which largely depend on the trader’s degree of urgency to execute the position. An impatient liquidity taker will send sell (buy) MLOs with price limits that are below (above) the best bid (ask) price they observed in the LOB. This strategy increases the probability that the MLO is filled when processed by the exchange and caps how far the MLO is allowed to walk the LOB when there are adverse price changes over the latency period. On the other hand, we show that patient liquidity takers use MLOs predominantly to send speculative trades. Speculative MLOs seek a price improvement relative to the fundamental best quote observed by the trader when sending the order to the exchange. Patient liquidity takers do not use their speed to finalise their execution programme ahead of time, or to hit the best quotes they observe in the LOB; they use speed mainly to send as many speculative MLOs as possible during the trading horizon – fast traders have many opportunities to retry missed orders before reaching the end of the trading window.

We use proprietary foreign exchange data from the LMAX Exchange (henceforth LMAX) between September 2019 and February 2020 for ten currency pairs to study the order types that liquidity takers use in the foreign exchange market and to analyse the effect of latency on the efficacy (i.e., hit and miss) of liquidity-taking orders. In all pairs, MLOs trade more volume than any other type of liquidity-taking order; in particular, more than market orders (MOs), which are for immediate execution and walk the LOB until filled in full. Though MLOs offer protection against adverse price moves, traders concede that the order may be cancelled by the exchange because clearing prices would breach the price limit instructed in the MLO. For example, in the EUR/USD pair, 41.0% of the MLOs were filled in full, 1.9% were partially filled, and 57.1% were missed. The MLOs that were partially or fully filled represent 56.9% of the volume traded in the pair, while MOs represent 13.5% of the total volume. The limit rates of missed MLOs were, on average, three ticks away from the best quote in the LOB when they were processed by the exchange.

We use the LMAX data to implement and to benchmark the performance of the trader’s random-latency-optimal execution strategy against four strategies: (i) send MOs over a trading window and assume latency is zero, (ii) send MLOs over a trading window and assume that latency is deterministic, (iii) employ a discrete time-weighted average price (TWAP) strategy that sends MOs at equally spaced time intervals over the execution window, (iv) send one MO to execute the entire order at the beginning of the trading window, where, unrealistically, we assume that there is enough liquidity at the best bid in the LOB to fill the entire order.

Estimates of model parameters for the EUR/USD currency pair are obtained with data between 1 August 2019 and 31 August 2019, and trading strategies are implemented with data between 1 September 2019 and 29 February 2020. We show that the performances of the random-latency-optimal strategy and of the deterministic-latency-optimal strategy are statistically the same. When the trader is patient, both strategies outperform the other benchmarks by a cash amount that is greater than the transaction costs paid by liquidity takers in foreign exchange markets, and around news events, the value of the outperformance increases to between two and ten times the value of the transaction costs. The better performance of the latency-optimal strategies stems from the speculative MLOs that are filled during the trading interval, and from the price protection of the MLOs against adverse price moves. In the EUR/USD pair, the number and the value of filled speculative MLOs increase when market activity increases because the probability and the size of positive flickers in the best bid rate increase during heightened market activity – this explains the considerable outperformance of the latency-optimal strategies around news events. Finally, we show that the value of the outperformance decreases as the degree of impatience of the trader increases because the strategy sends fewer speculative MLOs – very impatient traders do not send speculative MLOs; all their MLOs are for price protection.

As far as we are aware, this is the first work to appear in the literature that shows how to optimally execute a position in an asset when the trader faces random latency. Closest to our work are the papers by Øksendal and Sulem [25] and Bruder and Pham [9], which investigate general impulse control problems with deterministic delay: [25] studies an infinite-horizon control problem with deterministic delay and an arbitrary number of pending orders, while [9] investigates a finite-horizon stochastic control problem with any finite number of pending orders. The key mathematical novelty that distinguishes our work is that delays are stochastic. From a modelling perspective, this requires an explicit definition of the \(\sigma \)-algebra of the trader and an account of the new terms that arise as a consequence of stochastic delays.

Although our model focuses on an optimal execution problem with stochastic latency, our framework is general and can be applied to control problems in which the outcomes of the controlled actions are observed with stochastic delay. In financial applications, our model may be implemented in all asset classes that trade in electronic LOBs with liquidity-taking orders that consider price limits and immediate time-in-force of the liquidity-taking orders (equity and foreign exchanges that offer MLOs include LMAX, LSE, Chi-X, CBOE, NASDAQ, CME, and NYSE). In contrast, all models of optimal execution in the extant literature assume that the trader operates in the marketplace with zero latency and employ only MOs to take liquidity; see e.g. Almgren and Chriss [4], Almgren [2], Bayraktar and Ludkovski [6], Alfonsi et al. [1], Guéant et al. [21], Guilbaud and Pham [22], Cartea et al. [13], Cartea and Jaimungal [10], Guéant [19], Cartea and Jaimungal [11], Barger and Lorig [5] and the monographs by Cartea et al. [12, Sect. 6] and Guéant [20, Part II].

In the literature, there is work that discusses how latency affects liquidity-taking strategies, market making and passive trading. Cartea and Sánchez-Betancourt [15] show how traders can adjust the price limit of MLOs to target a fill ratio in a given currency pair when the trader sends a sequence of orders over a trading interval (finite or infinite). In market making, Gao and Wang [18] show how an agent provides liquidity to the LOB of large-tick stocks when latency is deterministic and fixed during the trading interval. In passive trading (i.e., trading with limit orders), the work by Moallemi and Sağlam [24] quantifies the cost of latency in equity (NASDAQ).

In the next section, we employ high-frequency foreign exchange data to show that MLOs protect orders from adverse price moves and also receive price improvements when prices move in the trader’s interest over the latency period. Section 3 introduces the trader’s optimal execution problem with random latency and characterises the random-latency-optimal strategy as the solution of a Hamilton–Jacobi–Bellman quasi-variational inequality (HJBQVI), and develops two new benchmarks where the investor faces deterministic latency or faces no latency. Section 4 compares the performance of the trader’s random-latency-optimal strategy with that of the benchmarks. Section 5 concludes, and we collect some proofs in the appendices.

2 Order types and data

2.1 Orders and revealed preferences

In order-driven electronic markets, the basic building blocks to trade are orders that provide liquidity and orders that take liquidity. When a trader sends an order to the exchange, the order contains two specific instructions: quote type and time-in-force. The quote type specifies the main feature of the order: price limit (i.e., LO), no price limit (i.e., MO), a trigger (i.e., stop orders), among others. The time-in-force refers to how long the order is active in the market. The shortest time-in-force is immediate execution and the longest time-in-force is typically for the rest of the trading day. Two prominent liquidity-taking order types that have an immediate execution time-in-force are immediate-or-cancel (IoC) and fill-or-kill (FoK). IoC is an order to buy or to sell assets that must be executed immediately, in full or in part, while obeying the order’s price limit, and any portion of the volume of the order that cannot be filled at the desired price limit is cancelled. FoK is an order to buy or to sell assets that must be executed immediately in full or it is cancelled.
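The difference between the two time-in-force instructions can be illustrated with a minimal sketch. The function `match_sell`, the representation of the bid side as a list of (rate, quantity) pairs, and the sample book are hypothetical and chosen for illustration only; they do not describe the matching engine of any particular exchange.

```python
from typing import List, Tuple

Level = Tuple[float, float]  # (bid rate, quantity available), best bid first


def match_sell(levels: List[Level], qty: float, limit: float, tif: str) -> float:
    """Return the quantity filled for a sell order with a price limit.

    Only bid levels with a rate at or above the limit are eligible.
    tif is "IOC" (immediate-or-cancel) or "FOK" (fill-or-kill).
    """
    eligible = sum(q for rate, q in levels if rate >= limit)
    if tif == "FOK":
        # Fill-or-kill: the whole quantity is filled or nothing is.
        return qty if eligible >= qty else 0.0
    if tif == "IOC":
        # Immediate-or-cancel: fill what is eligible, cancel the remainder.
        return min(qty, eligible)
    raise ValueError(f"unknown time-in-force: {tif}")


# Hypothetical bid side of the book (assumption, not taken from the data set).
bids = [(1.10000, 5.0), (1.09999, 3.0), (1.09998, 10.0)]

# A sell order for 10 units with limit 1.09999 finds only 8 eligible units.
ioc_fill = match_sell(bids, 10.0, 1.09999, "IOC")  # partial fill
fok_fill = match_sell(bids, 10.0, 1.09999, "FOK")  # cancelled in full
```

With the same limit and quantity, the IoC order is filled in part while the FoK order is cancelled, which is the distinction drawn in the text.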

Combinations of the various features constitute the type of order that market participants send to the exchange. Here, we focus on the types of liquidity-taking order that are designed to protect traders against the frictions that stem from latency. We define MLOs as liquidity-taking orders that have a price limit and the time-in-force is IoC or FoK. An MO is an MLO without price limit; the MLOs we consider have a finite price limit and are for immediate execution. MLOs protect liquidity takers from adverse price movements that occur between the time the trader makes a decision to trade (on a possibly stale quote) and the time the exchange matches the order with a limit order resting in the book (when possible).

Traders reveal their preferences when they choose a type of liquidity-taking order for a trade or a sequence of trades whose outcome is contingent on latency. Their choices demonstrate that traders balance the cost of completing a trade and the costs of price protection. A trader who must complete a trade without delay will choose an MO to guarantee execution in full – and expose the trade to price movements over the latency period. In contrast, a trader who can afford to miss the trade if the price is ‘not right’ will choose an MLO. Thus in exchange for price protection against adverse price movements, the trader concedes that the trade might not get filled. On the other hand, if prices move in a favourable direction, the trader receives a price improvement, i.e., the MLO is executed at a better price than the trader’s decision price. Indeed, MLOs are also used by traders to complete a trade only if the price improves over the latency period. In the next section, we provide summary statistics of the use of MLOs in foreign exchange markets.

2.2 Data

We present descriptive statistics for ten currency pairs in the foreign exchange spot market at LMAX. The data are stamped at a microsecond frequency and the range is from 1 September 2019 to 29 February 2020 – in Sect. 4, we use data from August 2019 to estimate model parameters. For each currency pair, we have: the aggregate volume of the limit sell orders posted at the best ask rate; the aggregate volume of the limit buy orders posted at the best bid rate; liquidity-taking orders sent to the LOB, including the type of order and trader identification; rate limits of MLOs; full fills, partial fills, and missed liquidity-taking trades. Finally, our data set does not contain the liquidity posted at the LOB beyond the best bid and best ask rates, but does contain the average rate paid or received for all liquidity-taking orders, including those that walked the LOB beyond the best quotes.

Table 1 shows the percentage of the total number of trades that are MLOs, MOs and others, and also shows the percentages by traded volume. Recall that the MLOs we consider are IoC and FoK with an immediate time-in-force. In practice, the exchange processes the order as soon as it arrives and matches it (if possible). Processing usually takes under 80 microseconds and is followed by a message from the exchange to the trader to notify the outcome.

Table 1 Percentage of liquidity-taking by order type. Period: 1 September 2019 to 29 February 2020. For each currency pair, we use bold to highlight the highest percentage by number of trades and by volume traded

The category ‘others’ includes the remaining order types that consume liquidity, for example, marketable orders: (i) limit orders that are good-until-cancel and good-for-day, (ii) stop orders good-for-day and good-until-cancel, and (iii) dark limit orders good-for-day and good-until-cancel.

2.3 Flickering of best rates in the LOB

Figure 1 shows snippets of the evolution of best quotes and liquidity-taking activity for the currency pair EUR/USD on 3 September 2019 at around 2.00pm British summer time – the data are timestamped to the microsecond. Solid circles denote filled sell MLOs, squares denote filled sell MOs, and empty circles denote sell MLOs that were not filled because the limit sell rate in the order was higher than the best bid rate in the LOB when the exchange processed the order.

Fig. 1

Best bid rate (blue line), best ask rate (red line), missed MLOs (empty circles), filled MLOs (full circles), filled MOs (squares). Top panel: 2.00pm to 2.01pm (60 seconds). Middle panel: 2.00.08pm to 2.00.14pm (6 seconds). Bottom panel: 2.00.12pm to 2.00.13pm (1 second). Microsecond data from EUR/USD pair, 3 September 2019

The top and middle panels of Fig. 1 show the best bid rate and sell liquidity-taking orders between 2.00.00pm and 2.01.00pm (i.e., one minute) and between 2.00.08pm and 2.00.14pm (i.e., six seconds), respectively. It is clearly visible that the best bid rate goes through unpredictable flickers, which are extremely short-lived deviations of a significant number of ticks from the ‘fundamental’ or ‘true’ value of the best bid rate. These flickers are the result of cancellations and arrivals of limit orders at the best bid rates, and of liquidity that is consumed by aggressive orders and immediately replenished by the arrival of limit orders.

The bottom panel shows a one-second snippet, between 2.00.12pm and 2.00.13pm, of the best bid rate and the best ask rate and the liquidity-taking orders, all of which were buy MOs. In most cases, flickers on either side of the LOB cause a short-lived widening of the quoted spread of the LOB.

From visual inspection, flickers in the best bid (best ask) rate are more often negative (positive) than positive (negative). This is a prevalent feature in the dynamics of the best quotes in the LOB. We return to this in Sect. 4 when we estimate model parameters and discuss the role of the distribution of the flickers in the performance of the execution strategies.

2.4 Distribution of flickers: hit and miss

We proceed to discuss the limit rates employed by traders who send MLOs to buy and to sell the EUR/USD currency pair. We employ the same data as that in Table 1 – the results for the remaining currency pairs are similar. To analyse both the hit and miss performance of the MLOs and the distribution of the flickers in the best quotes, we compute the slippage-price-improvement (SPI) measure

$$ \text{SPI}_{i}=(L_{i} - M_{i}) I_{i} . $$
(2.1)

Here, \(L_{i}\) denotes the exchange rate limit of the order, i.e., the maximum (minimum) rate that the trader is willing to pay (receive) per unit that she wishes to buy (sell); \(M_{i}\) is the exchange rate that the order would pay (receive) per unit bought (sold) if the order is filled in full. The indicator \(I_{i}\) determines the direction of the order: \(I_{i}\) takes the value \(+1\) when the trader sends a buy MLO and the value \(-1\) when the trader sends a sell MLO. The index \(i=1,\dots ,n\) runs over the MLOs, where \(n\) is the total number of MLOs. Thus when \(\text{SPI}_{i}\) is positive, the order was filled (relative to the limit rate \(L_{i}\)) with a slack of \(\text{SPI}_{i}\) ticks, which we refer to as price improvement. Similarly, when \(\text{SPI}_{i}\) is negative, the order missed (fully or in part) by \(|\text{SPI}_{i}|\) ticks (relative to the limit rate \(L_{i}\)), which we refer to as slippage.
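As an illustration of (2.1), the following sketch computes SPI in ticks for a single order. The function name `spi_ticks` and the sample rates are hypothetical; the tick size of \(10^{-5}\) is the one quoted for EUR/USD in the caption of Fig. 2 and would differ for other pairs.

```python
def spi_ticks(limit_rate: float, market_rate: float, side: int,
              tick: float = 1e-5) -> float:
    """Slippage-price-improvement (L - M) * I, expressed in ticks.

    side is +1 for a buy MLO and -1 for a sell MLO.
    tick defaults to 1e-5, the EUR/USD tick size (assumption for other pairs).
    """
    if side not in (1, -1):
        raise ValueError("side must be +1 (buy) or -1 (sell)")
    return (limit_rate - market_rate) * side / tick


# Hypothetical buy MLO: limit above the rate it would pay, so SPI is positive
# (filled with slack, i.e. price improvement).
spi_buy = spi_ticks(limit_rate=1.10005, market_rate=1.10002, side=+1)

# Hypothetical sell MLO: limit above the rate it would receive, so SPI is
# negative (missed, i.e. slippage).
spi_sell = spi_ticks(limit_rate=1.10005, market_rate=1.10002, side=-1)
```

Both examples use the same pair of rates; only the direction indicator changes the sign, and the magnitude is about three ticks up to floating-point rounding.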

For each MLO, the quantity in (2.1) could understate or overstate slippage and price improvement relative to the trader’s decision rate; note that we compute SPI relative to the limit rate instructed in the order. Our data set does not contain the time-stamps of when the trader submitted the MLO to the exchange, nor does it contain the rate observed by the trader when she decided to trade (i.e., the decision rate). We have the time-stamp of when the order is processed by the exchange. Thus we do not know the best bid and best ask rates and quantities posted in the LOB when the trader decided to send the order.

The trader may find it optimal to send a buy MLO with a limit rate that is below the observed best ask rate or to send a sell MLO with a limit rate that is above the observed best bid rate. In other words, traders may send speculative MLOs that are filled only if there is a price improvement relative to the quotes they observe when deciding to trade – in Sect. 3.6 below we return to this point and show that it is optimal to send MLOs that target a price improvement.

Figure 2 shows histograms in log-scale of SPI for the MLOs in the EUR/USD currency pair, where the \(x\)-axes in both panels are truncated to lie between −25 and 25 ticks. To gain insights into the limit rates of the MLOs and into the distribution of the flickers, the top panel shows the histogram of SPI with all the MLOs sent to the exchange, and the bottom panel shows a histogram of SPI with the MLOs that aimed at the best quote, i.e., the trader’s decision rate and the limit rate she instructed in the MLO are the same. We consider that the limit rate of an MLO is equal to the decision rate of the trader if at any point within the previous 150 ms of the processing time of the MLO, there was a best quote in the LOB equal to the limit rate of the MLO (we obtain similar results when we assume that the time period of the look-back window is 100 ms). The histograms are for both IoCs and FoKs.
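The look-back rule used to select the MLOs in the bottom panel can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function `aimed_at_best_quote`, the microsecond integer timestamps, the tolerance `tol`, and the convention that a best quote stays in force until the next update are all hypothetical choices.

```python
from bisect import bisect_right
from typing import Sequence


def aimed_at_best_quote(quote_times_us: Sequence[int],
                        quote_rates: Sequence[float],
                        processed_us: int,
                        limit_rate: float,
                        window_us: int = 150_000,
                        tol: float = 1e-9) -> bool:
    """True if a best quote equal to limit_rate was in force at some point in
    [processed_us - window_us, processed_us].

    quote_times_us are sorted update times of the best quote on the relevant
    side of the book; quote_rates[i] is in force from quote_times_us[i] until
    the next update (assumption).
    """
    start = processed_us - window_us
    # Quote in force at the start of the window: last update at or before start.
    lo = max(bisect_right(quote_times_us, start) - 1, 0)
    # One past the last update at or before the processing time.
    hi = bisect_right(quote_times_us, processed_us)
    return any(abs(r - limit_rate) <= tol for r in quote_rates[lo:hi])


# Hypothetical best-bid updates (times in microseconds).
times = [0, 100_000, 300_000, 320_000]
rates = [1.10000, 1.10002, 1.10001, 1.10003]

hit = aimed_at_best_quote(times, rates, processed_us=400_000, limit_rate=1.10002)
miss = aimed_at_best_quote(times, rates, processed_us=400_000, limit_rate=1.10000)
```

Setting `window_us=100_000` gives the 100 ms variant mentioned in the text.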

Fig. 2

Histograms of SPI between −25 ticks and 25 ticks in log-scale for the EUR/USD, 1 September 2019 to 29 February 2020. One tick is \(10^{-5}\) USD. Top: all MLOs. Bottom: MLOs with limit rate equal to trader’s decision rate

In both panels, the histograms are skewed to the left. The skewness may be due to informed traders in the market and due to the distribution of the flickers. Informed traders send buy (sell) MLOs in anticipation of an increase (decrease) in the exchange rate. Thus all else being equal, their trading strategies skew the distribution of SPI to the left because it is more likely that a buy (sell) MLO misses the trade when the exchange rate drifts up (down). Informed MLOs are included in the top panel. In addition, the skew in the histogram also results from asymmetry in the distribution of the size of the flickers – this is better represented in the bottom panel where the limit rate of the MLO is equal to the trader’s decision rate; see also the bottom panel of Fig. 1, where most flickers on the best bid (ask) rate are negative (positive). Below we show that flickers in the best bid rate are negatively skewed and flickers in the best ask rate are positively skewed.

2.5 Liquidity: make and take

In this subsection, we report various statistics of the volume posted at the best quotes in the LOB and the volume of the filled liquidity-taking orders of the ten currency pairs we study. The key message is that liquidity-taking orders hardly ever consume all the liquidity available at the best quotes in the LOB.

Table 2 reports statistics of the liquidity-taking and the liquidity-provision activity during the period 1 September 2019 to 29 February 2020 between 9.00am and 4.00pm British summer time. Columns 2–4 describe the liquidity-taking activity. The mean, median and standard deviation are in 10’000 units of the base currency. For example, the mean value of the traded quantity in the pair EUR/USD is 20.94, which, in the base currency EUR, is a mean of 209’400 EUR. The last four columns of the table summarise statistics of the spread and of the best quotes in the LOB: the time-weighted spread for each currency pair, the time-weighted quantity, the median quantity available at the best quotes, and the mode quantity at the best quotes.
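The text does not spell out how the time-weighted statistics are computed. A common convention, assumed here, weights each observed value by the length of time it remains in force. The function `time_weighted_average` and the sample numbers are hypothetical.

```python
from typing import Sequence


def time_weighted_average(times: Sequence[float],
                          values: Sequence[float],
                          end_time: float) -> float:
    """Time-weighted average of a piecewise-constant series.

    values[i] is in force from times[i] until times[i + 1]; the last value is
    in force until end_time. Assumed convention, not taken from the source.
    """
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equal length")
    if end_time <= times[0]:
        raise ValueError("end_time must be after the first observation")
    total = 0.0
    for i, v in enumerate(values):
        t_next = times[i + 1] if i + 1 < len(times) else end_time
        total += v * (t_next - times[i])
    return total / (end_time - times[0])


# Hypothetical spread (in ticks) observed at times 0, 2 and 3, window ends at 4.
tws = time_weighted_average([0.0, 2.0, 3.0], [1.0, 3.0, 2.0], end_time=4.0)
```

The same function applies to the quantity posted at the best quotes, which would give a time-weighted quantity under the same assumed convention.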

Table 2 TWS: time-weighted spread. TWQ: time-weighted quantity at best quotes. Period: 1 September 2019 to 29 February 2020 between 9.00am and 4.00pm. Each unit of quantity is a lot of 10’000 units of the base currency

Table 3 shows the percentage of occurrences when the traded quantity is greater than: the time-weighted quantity at the best quotes; the median quantity at the best quotes; and the mode of the quantity at the best quotes. Note that the information in columns 1, 3, 5 is the same as that in the last three columns of Table 2.

Table 3 TWQ: time-weighted quantity available at best quotes. TQ: traded quantity. Q ask/bid: quantity posted at best ask rate and best bid rate. Period: 1 September 2019 to 29 February 2020 between 9.00am and 4.00pm. Each unit of quantity is for 10’000 units of the base currency

We observe that in all currency pairs, the proportion of trades that cannot be filled with the liquidity available at the best quotes is between 0.1% and 8.8%. In the model that follows, we assume that the MLOs sent by the trader to the exchange will not walk the LOB.

3 Optimal execution with random latency

We present a model of optimal execution with random latency. We focus on the execution of a large order that is divided into child orders that are sent to the market over a finite trading horizon. However, we remark that our model can also be used to determine the optimal strategy for executing as few as one order over a trading horizon. Our discussion is framed within foreign exchange markets; however, our model is general and applicable in all asset classes in which instruments are traded in a visible electronic LOB. Also, our framework may be applied in general stochastic control problems in which the outcomes of the agent’s actions are known at a random future date.

3.1 Exchange rate dynamics and execution of MLOs

As discussed above, flickers are short-lived deviations; so here we assume that only the flicker that occurs at the processing time \(\tilde{t}_{0}\) is relevant when the exchange processes the trader’s sell MLO. Therefore, the trader models the best bid rate \(\hat{S}=(\hat{S}_{t} )_{t\geq 0}\) as

$$ \hat{S}_{t} = S_{t} + F_{t}, $$
(3.1)

where \(S=( S_{t} )_{t\geq 0}\) denotes the fundamental best bid rate, and \(F=( F_{t} )_{t\geq 0}\) denotes the flickers that affect the bid rate only at processing times of MLOs. The model (3.1) does not include flickers that occur between processing times of the MLOs sent by the trader – it models paths that are relevant for the trader. In the sequel, we refer to \(\hat{S}_{t}\) as the observed best bid rate, and in Sect. 4, we show how the trader employs the data of the LOB to obtain the fundamental best bid rate \(S_{t}\).

Formally, the notification time of the order and the size of the flicker are modelled by the background marked point process (MPP) \(\mathcal{N}=(T_{n}, Z_{n})_{n\geq 1}\) with random measure \(\mathfrak {p}(\mathrm {d}t,\mathrm {d}z)\). The background MPP dictates the stopping time when the outcome of each order is notified to the trader and dictates the flicker that affects the fundamental bid rate when each order is processed. That is, the trader sends a sell MLO to the exchange at time \(t\) and the exchange notifies the outcome of the order at time \(\tilde{t}\), where \(\tilde{t}\) is the next stopping time (after \(t\)) when a mark (i.e., flicker) of the MPP arrives. More precisely, let \(n\geq 1\) be such that \(T_{n-1}\leq t< T_{n}\); then the notification time \(\tilde{t}\) is \(T_{n}\) and the flicker \(F_{\tilde{t}}\) is \(Z_{n}\). Note that the value \(F_{\tilde{t}}\) of the flicker at the notification time is independent of the trader’s information at time \(t\).
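The rule \(T_{n-1}\leq t< T_{n}\) amounts to looking up the first arrival of the background process strictly after the send time. A minimal sketch, in which the function `notify` and the sample path are hypothetical:

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def notify(send_time: float,
           arrival_times: Sequence[float],
           marks: Sequence[float]) -> Tuple[float, float]:
    """Return (notification time, flicker) for an order sent at send_time.

    arrival_times = [T_1, T_2, ...] is sorted; marks = [Z_1, Z_2, ...].
    The notification time is the first T_n with T_n > send_time.
    """
    n = bisect_right(arrival_times, send_time)
    if n == len(arrival_times):
        raise ValueError("no arrival after send_time on this simulated path")
    return arrival_times[n], marks[n]


# Hypothetical path of the background marked point process.
T = [0.4, 1.1, 2.5]
Z = [0.0, -2.0, 1.0]

first = notify(0.5, T, Z)     # sent between T_1 and T_2
boundary = notify(1.1, T, Z)  # sent exactly at T_2, notified at T_3
```

The strict inequality matters: an order sent exactly at an arrival time is notified at the next arrival, consistent with \(t< T_{n}\).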

We assume that the background process \(\mathcal {N}\) is non-explosive and the interarrival times \((T_{i}-T_{i-1})_{i\geq 1}\) are independent and identically distributed exponential random variables with parameter \(\lambda >0\) and \(T_{0}=0\). The flickers \((Z_{n})_{n\geq 1}\) that affect the fundamental best bid rate at processing times take values in \(\mathbb{R}\) and are i.i.d. with law

$$ \nu (\mathrm {d}z)= p_{0} \delta _{\{0\}}(\mathrm {d}z) + \underbrace{p_{+} \eta _{+} e^{-\eta _{+} z} \boldsymbol{1}_{\{z>0\}} \, \mathrm {d}z}_{ \text{{price improvement}}}+ \underbrace{p_{-} \eta _{-} e^{\eta _{-} z} \boldsymbol{1}_{\{z< 0\}} \, \mathrm {d}z}_{ \text{{slippage}}} , $$
(3.2)

where \(\delta _{\{0\}}(\mathrm {d}z)\) is the Dirac measure at zero, \(\eta _{+},\eta _{-}\in (0,\infty )\) and \(p_{0},p_{+},p_{-}\in \mathbb{R}_{+}\) with \(p_{0}+p_{+}+p_{-}=1\).
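The law (3.2) is a three-component mixture and can be sampled directly. The sketch below is illustrative: the function `sample_flicker` and the parameter values are hypothetical and are not the estimates reported in the paper. The sample mean is compared with the mean implied by (3.2), \(p_{+}/\eta _{+}-p_{-}/\eta _{-}\).

```python
import random


def sample_flicker(rng: random.Random, p0: float, p_plus: float, p_minus: float,
                   eta_plus: float, eta_minus: float) -> float:
    """Draw one flicker from the mixture in (3.2)."""
    if abs(p0 + p_plus + p_minus - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    u = rng.random()
    if u < p0:
        return 0.0                           # atom at zero: no flicker
    if u < p0 + p_plus:
        return rng.expovariate(eta_plus)     # positive flicker, rate eta_plus
    return -rng.expovariate(eta_minus)       # negative flicker, rate eta_minus


# Hypothetical parameter values, chosen only for illustration.
params = dict(p0=0.5, p_plus=0.2, p_minus=0.3, eta_plus=1.0, eta_minus=0.5)
rng = random.Random(0)
samples = [sample_flicker(rng, **params) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
zero_fraction = sum(1 for z in samples if z == 0.0) / len(samples)
theoretical_mean = (params["p_plus"] / params["eta_plus"]
                    - params["p_minus"] / params["eta_minus"])
```

With these illustrative parameters the negative component carries more probability and a heavier tail, so the mean flicker in the bid rate is negative.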

Let \(T\in (0,\infty )\). It is straightforward to see that

$$ \mathbb{E}\big[\big(\mathfrak {p}([0,T], \mathbb{R} )\big)^{2}\big]< \infty \qquad \text{and}\qquad \mathbb{E}\left [\int _{0}^{T}\int _{ \mathbb{R}}\left |z\right | \mathfrak {p}(\mathrm {d}t,\mathrm {d}z)\right ]< \infty . $$
(3.3)

The law in (3.2) is a modelling choice; the framework we develop here is valid for any law \(\nu (\mathrm {d}z)\) that satisfies the two conditions in (3.3).

3.2 Outcome of trade attempts: miss or fill

The information flow that the trader observes is encoded in the filtration \(\mathbb{F}=(\mathcal{F}_{t})_{t\geq 0}\) defined by

$$ \mathcal{F}_{t}=\sigma \big(W_{s},\mathfrak {p}([0,s],A) : A\in \mathcal{B}( \mathbb{R}), s\leq t\big), $$

where \((W_{s})_{s\geq 0}\) is a standard Brownian motion. Let \(\tau \) be an \(\mathbb{F}\)-stopping time, and set \(N_{t}=\mathfrak {p}((0,t],\mathbb{R})\) for \(t\in (0,T]\) and \(N_{0}=0\), where \(N_{t}\) denotes the total number of marks of the background process \(\mathcal {N}\) up to time \(t\). We define the notification time \(\tilde {\tau }\) of a trade attempt by

$$ \tilde {\tau }=\inf \{t>\tau : \Delta N_{t}>0 \}, $$

where \(\Delta N_{t}=N_{t}-N_{t-}\). Thus \(\tilde {\tau }\) denotes the time when the exchange notifies the outcome of the trade attempt sent at time \(\tau <\tilde {\tau }\). Henceforth, we put a tilde \(\tilde{\ } \) on stopping times to denote that they are notification times.

The following lemma shows that: (i) the notification times are stopping times, and (ii) the times between trade attempts and notification times are i.i.d. with exponential distribution.

Lemma 3.1

Let \((\tau _{i})_{i\in \mathbb{N}}\) be a sequence of \(\mathbb{F}\)-stopping times, where the index \(i\) denotes the \(i\)th trade attempt, and let \((\tilde {\tau }_{i})_{i\in \mathbb{N}}\) be the sequence of notification times of the outcome of each trade attempt. Let \((\tau _{i})_{i\in \mathbb{N}}\) and \((\tilde {\tau }_{i})_{i\in \mathbb{N}}\) satisfy \(\tilde {\tau }_{i}\leq \tau _{i+1}\) for all \(i\in \mathbb{N}\). Then:

(i) \((\tilde {\tau }_{i})_{i\in \mathbb{N}}\) are \(\mathbb{F}\)-stopping times.

(ii) \((\tilde {\tau }_{i}-\tau _{i})_{i\in \mathbb{N}}\) constitute a collection of independent and identically distributed random variables that are exponentially distributed with parameter \(\lambda >0\).

Proof

\({\mathrm{{(i)}}}\) Let \(j\in \mathbb{N}\) and \(t\geq 0\). Then

$$\begin{aligned} \{\tilde {\tau }_{j}\leq t\}&=\{\tau _{j}\leq t\} \cap \{N_{t}>N_{t\wedge \tau _{j}}\}\in \mathcal{F}_{t} \end{aligned}$$

because \(\{\tau _{j}\leq t\}\in \mathcal{F}_{t} \) and \(\{N_{t}>N_{t\wedge \tau _{j}}\}\in \mathcal{F}_{t}\).

\({\mathrm{{(ii)}}}\) For \(i\in \mathbb{N}\) and \(t\geq 0\), we have

$$\begin{aligned} \mathbb{P} [\tilde {\tau }_{i}-\tau _{i}\geq t ]&=\mathbb{P} [\inf \{s>0 : \Delta N_{s+\tau _{i}}>0\}\geq t ] \\ &=\mathbb{P} [\inf \{s>0 : \Delta N_{s}>0\}\geq t ]=e^{-\lambda t} \end{aligned}$$

because \((N_{\tau _{i}+s}-N_{\tau _{i}})_{s\geq 0}\) is, by the strong Markov property, a Poisson process with parameter \(\lambda \). Finally, the collection \((\tilde {\tau }_{i}-\tau _{i})_{i\in \mathbb{N}}\) has the independence property because the intervals \([\tau _{i},\tilde {\tau }_{i})\), \(i\in \mathbb{N}\), are non-overlapping. □
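Part (ii) of Lemma 3.1 can be checked numerically. The sketch below simulates the background Poisson process and a simple hypothetical sending rule (wait a fixed time after each notification before sending the next order), then compares the average latency with \(1/\lambda \). The function `simulate_latencies`, the sending rule, and the parameter values are assumptions made for illustration.

```python
import random
from bisect import bisect_right
from typing import List


def simulate_latencies(rng: random.Random, lam: float, horizon: float,
                       wait: float) -> List[float]:
    """Latencies (notification time minus send time) along one simulated path."""
    # Arrival times of the background Poisson process on [0, horizon].
    arrivals: List[float] = []
    t = 0.0
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            break
        arrivals.append(t)

    latencies: List[float] = []
    send = 0.0
    while True:
        n = bisect_right(arrivals, send)  # first arrival strictly after send
        if n == len(arrivals):
            break                         # no notification before the horizon
        notification = arrivals[n]
        latencies.append(notification - send)
        # Hypothetical rule: next order is sent a fixed time after notification.
        send = notification + wait
    return latencies


lam = 10.0
latencies = simulate_latencies(random.Random(0), lam=lam, horizon=2000.0, wait=0.05)
mean_latency = sum(latencies) / len(latencies)
```

Because the send times depend only on past notifications, they are stopping times, and the average latency should be close to \(1/\lambda \) regardless of the waiting time chosen.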

Next, we define the auxiliary process \(\mathsf {F}=(\mathsf {F}_{t})_{t\geq 0}\) to determine the outcome of the trade attempts. The value of \(\mathsf {F}_{t}\) is the most recent mark up to time \(t\). Thus \(\mathsf {F}\) satisfies

$$ \mathrm {d}\mathsf {F}_{t}=\int _{\mathbb{R}} (z-\mathsf {F}_{t-} ) \mathfrak {p}(\mathrm {d}t, \mathrm {d}z), \qquad \mathsf {F}_{0}=0, $$

where we recall that \(\mathfrak {p}(\mathrm {d}t,\mathrm {d}z)\) is the random measure of the background process \(\mathcal {N}\). When the trader sends a trade at time \(\tau \) and the notification time is \(\tilde {\tau }\), the value \(F_{\tilde {\tau }}\) of the flicker is \(\mathsf {F}_{\tilde {\tau }}\).

At the notification time \(\tilde {\tau }\), if the sell MLO is an FoK with limit rate \(\mathfrak {l}\), it will be filled in full if \(\mathfrak {l}\leq S_{\tilde {\tau }-}+ \mathsf {F}_{\tilde {\tau }}\); otherwise the MLO misses the trade and is cancelled by the exchange. To formally define a miss and a fill of a trade attempt, we proceed as follows. Let \(\tilde{H}(x)=\boldsymbol{1}_{\{x\leq 0\}}\) be a step function. The outcome of the sell MLO is determined by the step function \(\tilde{H}(\mathfrak {l}-S_{\tilde {\tau }-}- \mathsf {F}_{\tilde {\tau }})\), which takes the value 1 when the trader receives the notification of a fill or the value 0 when the notification is of a miss. When the value of the flicker is positive and the MLO is filled, the liquidity providers that were offering liquidity at the best quote are likely to be adversely selected. Next, approximate \(\tilde{H}\) with a \(\mathcal{C}^{\infty}\) function \(f^{\varepsilon}\), where the parameter \(\varepsilon \) controls the convergence of \(f^{\varepsilon}\) to \(\tilde{H}\) as \(\varepsilon \searrow 0\). For example, to determine the fill or miss outcome of a trade attempt, we use instead of \(\tilde{H}(x)\) the sigmoid function evaluated at \(-x/\varepsilon \). We recall that the sigmoid function is given by \(S(x) = 1/(1+\exp (-x))\). We work with an approximation to the step function \(\tilde{H}\) to preserve the continuity of the impulse operator. To simplify notation, we drop the superscript \(\varepsilon \) and refer to \(f^{\varepsilon}\) as \(f\).
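The smoothing step can be sketched in a few lines of Python. The sketch compares the step function \(\tilde{H}\) with the sigmoid evaluated at \(-x/\varepsilon \); the function names are hypothetical, and the branching on the sign of \(x/\varepsilon \) is an implementation choice to avoid floating-point overflow, not part of the model.

```python
import math


def hard_fill(x):
    """Step function: 1 if x <= 0 (fill), 0 otherwise (miss)."""
    return 1.0 if x <= 0 else 0.0


def smooth_fill(x, eps):
    """Sigmoid evaluated at -x/eps, i.e. 1 / (1 + exp(x / eps))."""
    z = x / eps
    if z >= 0:
        e = math.exp(-z)  # safe: the argument is non-positive
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))  # safe: the argument is negative
```

Away from \(x=0\), the two functions agree up to an error that vanishes as \(\varepsilon \searrow 0\); at \(x=0\) the sigmoid returns \(1/2\) whereas the step function returns 1.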

3.3 Admissible strategies

The trader controls when to send MLOs and controls the limit rate of the MLOs; this is summarised in the execution strategy \({\alpha}=(\tau _{i},\mathfrak {l}_{i})_{i\geq 1}\). The limit rates \((\mathfrak {l}_{i})_{i\geq 1}\) of the MLOs are \(\mathcal{F}_{\tau _{i}}\)-measurable and take values in a non-empty compact set \(L\subseteq \mathbb{R}\). We assume that the trader does not attempt a trade until the outcome of the previous MLO is known. Thus we require that the \(\mathbb{F}\)-stopping times in the sequence \((\tau _{i})_{i\geq 1}\) satisfy \(\tau _{i+1}\geq \tilde {\tau }_{i}> \tau _{i}\) for \(i\in \mathbb{N}\). Specifically, the trader’s set of admissible strategies is

$$\begin{aligned} \mathcal{A}= \{ {\alpha}=(\tau _{i},\mathfrak {l}_{i})_{i\geq 1} &: \text{for each $i\geq 1, \tau _{i}$ is an $\mathbb{F}$-stopping time,} \\ & \phantom{=} \tau _{i+1}\geq \tilde {\tau }_{i}> \tau _{i}, \mathfrak {l}_{i} \text{ is valued in $L$, and $\mathfrak {l}_{i}$ is $\mathcal{F}_{\tau _{i}}$-measurable} \} . \end{aligned}$$

If at the terminal time \(T\), there is an outstanding MLO in the exchange, the order is cancelled. The trader keeps track of pending orders in the exchange with the process \(k_{t}(\alpha )\) (for notational convenience, we use \(k(t,\alpha )\)), which returns the value 1 if the trader is waiting to be notified of the outcome of a trade attempt, or the value 0 if there is no pending order waiting to be processed by the exchange.

Hence, \(k_{t}(\alpha )=\operatorname{card}\{i\in \mathbb{N}:\tau _{i}\leq t, \tilde {\tau }_{i}>t \}\in \{0, 1\} \), which is adapted to \(\mathbb{F}\) because for a fixed \(t\in [0,T]\) and \(i\in \mathbb{N}\), we have that \(\tau _{i}\) and \(\tilde {\tau }_{i}\) are \(\mathbb{F}\)-stopping times. Thus \(\{\tau _{i}\leq t\}\cap \{\tilde {\tau }_{i}>t\}\in \mathcal{F}_{t}\).
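As a small illustration of the pending-order process, the Python sketch below counts the trade attempts that have been sent but not yet notified at time \(t\). The function name and the example times are hypothetical.

```python
def pending_orders(t, send_times, notify_times):
    """Number of attempts i with tau_i <= t < tilde tau_i."""
    return sum(
        1
        for tau, tau_tilde in zip(send_times, notify_times)
        if tau <= t < tau_tilde
    )
```

For an admissible strategy, the constraint \(\tau _{i+1}\geq \tilde {\tau }_{i}>\tau _{i}\) guarantees that the count is either 0 or 1.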

3.4 System dynamics

In our framework, MLOs have permanent price impact when the order is filled. Note that if the MLO is not filled, the trader is the only market participant who knows that the trade attempt was not successful; thus, missed MLOs do not have price impact. If market participants had access to the information on missed trades, they would learn about buy and sell pressure in the market and would adjust their liquidity-provision and liquidity-taking strategies.

All trade attempts are of size one unit, which we refer to as child orders. The size of the parent order is \(\mathfrak{M}>0\) child orders, which is the initial inventory the trader seeks to liquidate. If the limit rate of the sell MLO to execute a child order is less than or equal to the best bid rate, we assume that there is enough liquidity to fill the MLO without walking the book, in which case the trader is indifferent between sending an IoC or an FoK to the exchange. Thus we assume that trades do not have temporary price impact; it is straightforward to include this price impact in the price dynamics of our model. In Table 3, we see that the majority of liquidity-taking orders do not walk the LOB. Traders specify quantities that can be filled with the volumes displayed at the best bid and best ask rates in the market.

For an execution strategy \(\alpha =(\tau _{i}, \mathfrak {l}_{i})_{i\geq 1}\in \mathcal{A}\), the trader monitors the system

$$ X^{\alpha }_{t}=(S^{\alpha }_{t}, Q^{\alpha }_{t}, C^{\alpha }_{t}), $$

where \((S^{\alpha }_{t})_{t\geq 0}\) is the fundamental best bid rate process, \((Q^{\alpha }_{t})_{t\geq 0}\) is the inventory of the agent, and \((C^{\alpha }_{t})_{t\geq 0}\) is the cash process. The dynamics of the fundamental best bid rate process are

$$\begin{aligned} S^{\alpha }_{t}&=S_{0}+\int _{0}^{t} b(S^{\alpha }_{u}) \, \mathrm {d}u + \int _{0}^{t} \sigma (S^{\alpha }_{u}) \, \mathrm {d}W_{u} - \kappa \sum _{ \tilde {\tau }_{i}\leq t} f(\mathfrak {l}_{i}-S^{\alpha }_{\tilde {\tau }_{i}-}-\mathsf {F}_{ \tilde {\tau }_{i}}), \end{aligned}$$
(3.4)

and the cash and the inventory process satisfy, respectively,

$$\begin{aligned} C^{\alpha }_{t}&=\sum _{\tilde {\tau }_{i}\leq t} f(\mathfrak {l}_{i}-S^{\alpha }_{ \tilde {\tau }_{i}-}-\mathsf {F}_{\tilde {\tau }_{i}}) (S^{\alpha }_{\tilde {\tau }_{i}-}+\mathsf {F}_{ \tilde {\tau }_{i}}), \\ Q^{\alpha }_{t}&=\mathfrak{M}-\sum _{\tilde {\tau }_{i}\leq t} f(\mathfrak {l}_{i}-S^{ \alpha }_{\tilde {\tau }_{i}-}-\mathsf {F}_{\tilde {\tau }_{i}}) . \end{aligned}$$

In the fundamental bid rate in (3.4), the functions \({b},\sigma :\mathbb{R}\to \mathbb{R}\) are Lipschitz-continuous. The function \(b\) is the drift of the fundamental bid rate and \({\sigma}\) is the volatility of the innovations. The last term on the right-hand side represents the permanent price impact that filled MLOs have on the fundamental best bid rate, where \(\kappa \geq 0\) is the permanent impact parameter of the bid rate. Recall that \(f\) is the \(\mathcal{C}^{\infty}\) approximation of the step function that flags when a trade is filled or missed, and that all MLOs sent by the trader are of size one. We assume that the affected fundamental best bid rate \((S^{\alpha}_{t})_{t\geq 0}\) is bounded by the unaffected fundamental best bid rate \((S_{t})_{t\geq 0}\) as follows: for any execution strategy \(\alpha \in \mathcal{A}\), we have that \(\sup _{0\leq t\leq T} |S^{\alpha }_{t} | \leq \sup _{0\leq t \leq T}\left |S_{t}\right | + \kappa N_{T}\). This assumption is non-restrictive. If the affected fundamental best bid rate is assumed to be always positive, then \(S^{\alpha }_{t}\leq S_{t}\) for \(t\in [0,T]\) implies that \(\sup _{0\leq t\leq T} |S^{\alpha }_{t} | \leq \sup _{0\leq t \leq T}\left |S_{t}\right |\leq \sup _{0\leq t\leq T}\left |S_{t}\right | + \kappa N_{T}\). Alternatively, if \(b,\sigma \) are constant functions, the assumption is also satisfied.

The trader intervenes in the system at time \(\tau _{i}\) (i.e., sends a sell MLO), and due to latency, the outcome is known at the notification time \(\tilde {\tau }_{i} > \tau _{i}\) when the system evolves from \(X_{\tilde {\tau }_{i}-}^{\alpha }\) to \(X_{\tilde {\tau }_{i}}^{\alpha }=\Gamma (X^{\alpha }_{\tilde {\tau }_{i}-}, \mathsf {F}_{ \tilde {\tau }_{i}}, \mathfrak {l}_{i})\). Here, the function \(\Gamma :\mathbb{R}^{3}\times \mathbb{R}\times L\to \mathbb{R}^{3}\) is the impulse operator

$$ \Gamma (s,q,c,\mathsf {f},\ell )=\big(s-\kappa f(\ell -s-\mathsf {f}), q-f( \ell -s-\mathsf {f}), c+(s+\mathsf {f}) f(\ell -s-\mathsf {f})\big), $$
(3.5)

which describes how the system (i.e., best bid rate, inventory, cash) changes at notification times. If the sell MLO is filled, the first component on the right-hand side of (3.5) shows that the fundamental bid rate decreases by \(\kappa \); the second component shows that the inventory decreases by one unit; and the last component shows that the amount of cash increases by the fundamental bid rate plus the flicker. If the sell MLO is not filled, the fundamental bid rate is not affected, and the cash and inventory positions do not change.
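The impulse operator (3.5) translates directly into code. The Python sketch below takes the fill function as an argument, so that either the step function or its smooth approximation can be used; the function names and the numerical values used to exercise it are hypothetical.

```python
def hard_fill(x):
    """Step function: 1 if x <= 0 (fill), 0 otherwise (miss)."""
    return 1.0 if x <= 0 else 0.0


def impulse(s, q, c, flicker, limit, kappa, fill=hard_fill):
    """Impulse operator (3.5): returns the post-notification (s, q, c)."""
    w = fill(limit - s - flicker)  # 1 for a fill, 0 for a miss (or in between)
    return (s - kappa * w, q - w, c + (s + flicker) * w)
```

For instance, with \(s=1.1\), \(q=10\), \(c=0\), limit rate \(1.1\) and \(\kappa =10^{-5}\), a flicker of \(+10^{-4}\) gives a fill, so the state becomes \((1.1-10^{-5},\, 9,\, 1.1001)\), whereas a flicker of \(-10^{-4}\) gives a miss and leaves the state unchanged.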

We denote by \(\vert \cdot \vert \) the Euclidean norm, and the operator \(\Gamma \) is a continuous function that satisfies

$$ \sup _{(y,\ell )\in \mathbb{R}^{4}\times L} \frac{ \vert \Gamma (y,\ell )\vert}{1+\vert y \vert}< \infty . $$
(3.6)

The initial state of the system is \(X_{0}=(S_{0},\mathfrak{M},0)\). Here, \(S_{0}\) is the initial value of the fundamental bid rate, we recall that \(\mathfrak{M}\) is the number of child orders (lots of equal size) that the trader wishes to liquidate, and the initial value of the cash account is zero. The controlled system \(X^{\alpha }\) is the solution to the SDE

$$\begin{aligned} X^{\alpha }_{t} &=X_{0}+\int _{0}^{t} \boldsymbol{b}(X^{\alpha }_{u}) \, \mathrm {d}u + \int _{0}^{t} \boldsymbol{\sigma}(X^{\alpha }_{u}) \, \mathrm {d}W_{u} \\ &\quad{} +\sum _{\tilde {\tau }_{i}\leq t} \big(\Gamma (X^{\alpha }_{\tilde {\tau }_{i}-}, \mathsf {F}_{\tilde {\tau }_{i}},\mathfrak {l}_{i})-X^{\alpha }_{\tilde {\tau }_{i}-}\big), \end{aligned}$$
(3.7)

where \(\boldsymbol{b}:\mathbb{R}^{3}\to \mathbb{R}^{3}\) and \(\boldsymbol{\sigma}:\mathbb{R}^{3}\to \mathbb{R}^{3}\) are given by \(\boldsymbol{b}(x_{1},x_{2},x_{3})=(b(x_{1}),0,0)\) and \(\boldsymbol{\sigma}(x_{1},x_{2},x_{3})=(\sigma (x_{1}),0,0)\), respectively, and both are Lipschitz-continuous. We now fix a finite horizon \(T<\infty \) and an execution strategy \({\alpha}\in \mathcal{A}\). It follows that

$$ \mathbb{E}\Big[\sup _{0\leq s\leq T} \lvert X^{ {\alpha}}_{s} \rvert ^{2} \Big]< \infty , $$
(3.8)

a result we employ below to show that the value function of the trader is well defined.

3.5 Liquidity taking with stochastic delay

For all \(t\in [0,T)\), the agent has either one or no pending order in the exchange. Thus we define two sets of admissible strategies: (i) admissible strategies with one pending order, and (ii) admissible strategies with no pending order. If at time \(t\in [0,T)\), there is one pending order with price limit \(\ell \in L\), then the set of admissible strategies is

$$ \mathcal{A}_{t,\ell}= \{ {\alpha}=(\tau _{i},\mathfrak {l}_{i})_{i\geq 1} \in \mathcal{A} : \tau _{1}=t, \mathfrak {l}_{1}=\ell \}, $$

and if there is no pending order, the set of admissible strategies is

$$ \mathcal{A}_{t}= \{ {\alpha}=(\tau _{i},\mathfrak {l}_{i})_{i\geq 1}\in \mathcal{A} : \tau _{1}\geq t \} . $$

For any \((t,{x},\mathsf {f})\in [0,T]\times (\mathbb{R}\times (-\infty , \mathfrak{M}]\times \mathbb{R})\times \mathbb{R}\) with \({x}=(s,q,c)\) and for a pending order with limit price \(\ell \in L\) and \(\alpha \in \mathcal{A}_{t,\ell}\), we denote by \(X^{t,{x},\mathsf {f},\ell ,\alpha}\) the solution to (3.7) for \(t\leq s \leq T\) with initial data \(X_{t}={x}\) and \(\mathsf {F}_{t}=\mathsf {f}\). One can drop the dependence on \(\mathsf {f}\) in \(X^{t,{x},\ell ,\alpha}\) because for any \(\mathsf {f}_{1},\mathsf {f}_{2}\in \mathbb{R}\), we have \(\mathsf {F}^{t,\mathsf {f}_{1}}_{\tilde {t}}=\mathsf {F}^{t,\mathsf {f}_{2}}_{\tilde {t}}\). That is, we write

$$\begin{aligned} X^{t,{x},\ell ,\alpha}_{s} &={x} +\int _{t}^{s} \boldsymbol{b}(X^{t,{x}, \ell ,\alpha}_{u}) \,\mathrm {d}u + \int _{t}^{s} \boldsymbol{\sigma}(X^{t,{x}, \ell ,\alpha}_{u}) \, \mathrm {d}W_{u} \\ & \phantom{=:}{} + \sum _{t< \tilde {\tau }_{i}\leq s} \big(\Gamma (X^{t,{x},\ell ,\alpha}_{ \tilde {\tau }_{i}-},\mathsf {F}^{t,0}_{\tilde {\tau }_{i}},\mathfrak {l}_{i})-X^{t,{x},\ell , \alpha}_{\tilde {\tau }_{i}-}\big) . \end{aligned}$$

Similarly, when there is no pending order at time \(t\), we denote by \(X^{t,{x},\alpha}\) the solution to (3.7) for \(t\leq s \leq T\) with \(X_{t}={x}\), for any \((t,{x})\in [0,T] \times (\mathbb{R}\times (-\infty , \mathfrak{M}]\times \mathbb{R})\).

Fix \((t,{x})\in [0,T]\times \mathbb{R}\times (-\infty , \mathfrak{M}] \times \mathbb{R}\), \(\ell \in L\), \(\alpha _{1}\in \mathcal{A}_{t,\ell}\) and \(\alpha _{0}\in \mathcal{A}_{t}\). It follows from (3.6), Gronwall’s lemma, the Burkholder–Davis–Gundy inequality and (3.3) that there exists a constant \(C>0\) such that

$$\begin{aligned} &\mathbb{E}\Big[\sup _{t\leq s\leq T} \lvert X^{t,{x},\ell ,\alpha _{1}}_{s} \rvert ^{2}\Big]< C (1+ \left \lvert {x}\right \rvert ^{2} ), \\ & \mathbb{E}\Big[\sup _{t\leq s\leq T} \lvert X^{t,{x},\alpha _{0}}_{s} \rvert ^{2}\Big]< C (1+\left \lvert {x}\right \rvert ^{2} ) . \end{aligned}$$

The trader evaluates the performance of the execution strategy \(\alpha \) with the function

$$ \Pi ( {\alpha})=g(X^{\alpha}_{T})+\int _{0}^{T} h(X^{\alpha}_{s}) \, \mathrm {d}s+ \sum _{\tilde {\tau }_{i}\leq T} \mathfrak{C}(X^{\alpha}_{\tilde {\tau }_{i}-}, \mathsf {F}_{\tilde {\tau }_{i}},\mathfrak {l}_{i}), $$

where \(g:\mathbb{R}^{3}\to \mathbb{R}\), \(h:\mathbb{R}^{3}\to \mathbb{R}\), \(\mathfrak{C}:\mathbb{R}^{3}\times \mathbb{R}\times L\to \mathbb{R}\) are

$$\begin{aligned} g(s,q,c) &=c+q \big(s + \zeta -a (q-1)\big)(1-\rho ), \\ h(s,q,c)&=-\phi q^{2}, \\ \mathfrak{C}(s,q,c,\mathsf {f},\ell )&=-\rho (s+\mathsf {f}) f(\ell -s-\mathsf {f}) . \end{aligned}$$
(3.9)

Here, \(a\geq 0\) is the terminal inventory penalty parameter, and \(\phi \geq 0\) is the running inventory penalty parameter.

The function \(g\) consists of three terms: (i) \(c\), the cash accumulated by the strategy, (ii) \(q (s+\zeta )\), the mark-to-market value of the inventory (net of fees), which includes the expected value \(\zeta =\int _{\mathbb{R}} z \nu (\mathrm {d}z)\) of the flicker, and (iii) \(a q (q-1)\), the costs of walking the LOB (net of fees) when the terminal inventory is greater than one child order.

The function \(h\) represents the urgency of the agent to complete the execution programme. Everything else being equal, the higher the value of \(\phi \geq 0\), the more quickly the trader liquidates inventory at the beginning of the trading interval. Each trader decides the urgency of their trade or set of trades. For example, the value of the urgency parameter is arbitrarily high for a trader who requires immediate execution. On the other hand, a patient liquidity taker who has time to retry missed trades will set the value of \(\phi \) close to or at zero.

The function \(\mathfrak{C}\) represents transaction costs per unit of inventory traded. These costs are based on the notional traded, which is \((s+\mathsf {f}) f(\ell -s-\mathsf {f})\) per unit of inventory, and \(\rho \in (0,1)\) is the transaction cost parameter.
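The three components of the performance criterion in (3.9) can be written as plain functions. The Python sketch below is a direct transcription; the function names are hypothetical, and the fill function is passed as an argument so that either the step function or its smooth approximation \(f\) can be used.

```python
def terminal_reward(s, q, c, zeta, a, rho):
    """g(s, q, c) = c + q (s + zeta - a (q - 1)) (1 - rho)."""
    return c + q * (s + zeta - a * (q - 1)) * (1.0 - rho)


def running_reward(q, phi):
    """h(s, q, c) = -phi q^2; depends on the inventory only."""
    return -phi * q**2


def transaction_cost(s, flicker, limit, rho, fill):
    """C(s, q, c, f, l) = -rho (s + f) fill(l - s - f)."""
    return -rho * (s + flicker) * fill(limit - s - flicker)
```

With zero terminal inventory, `terminal_reward` returns the cash \(c\); with one lot left, the term \(a(q-1)\) vanishes, so no cost of walking the LOB is charged.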

The controlled processes \((C^{\alpha }_{t})_{0\leq t\leq T}\), \((S^{\alpha }_{t} )_{0\leq t\leq T}\) and \((Q^{\alpha }_{t})_{0\leq t\leq T}\) obey the bounds, which do not depend on the strategy \(\alpha \) or \(t\in [0,T]\),

$$\begin{aligned} |C^{\alpha }_{t} | &\leq \sum _{i=1}^{N_{T}} |Z_{i} | + N_{T} \sup _{0\leq u \leq T} S_{u} + \kappa N^{2}_{T}, \\ |S^{\alpha }_{t} | &\leq \sup _{0\leq u \leq T} |S_{u} | + \kappa N_{T} , \\ |Q^{\alpha }_{t} | &\leq \mathfrak{M} + N_{T} . \end{aligned}$$
(3.10)

Furthermore, we see that

$$ \sup _{({x},\mathsf {f},\ell )\in \mathbb{R}^{3}\times \mathbb{R}\times L} \frac{\left |g({x})\right |+\left |h({x})\right |+\left |\mathfrak{C}({x},\mathsf {f},\ell )\right |}{1+\left |\mathsf {f}\right |+\left \lvert {x}\right \rvert ^{2}}< \infty , $$

which ensures that the value functions we present below are well defined, i.e., they are finite as a consequence of the inequality above and (3.8) and (3.10).

When there is one pending order in the exchange, the trader’s performance criterion and value function are

$$ J_{1}(t,{x},\ell ,\alpha )=\mathbb{E}\bigg[g(X^{t,{x},\ell , \alpha }_{T})+\int _{t}^{T} h(X^{t,{x},\ell ,\alpha }_{s})\, \mathrm {d}s+ \sum _{\tilde {\tau }_{i}\leq T} \mathfrak{C}(X^{t,{x},\ell ,\alpha }_{ \tilde {\tau }_{i}-},\mathsf {F}^{t,0}_{\tilde {\tau }_{i}},\mathfrak {l}_{i})\bigg] $$

and

$$ v_{1}(t,{x},\ell )=\sup _{\alpha \in \mathcal{A}_{t,\ell}}J_{1}(t,{x}, \ell ,\alpha ), $$
(3.11)

respectively, for \((t,{x})\in [0,T]\times \mathbb{R}\times (-\infty , \mathfrak{M}] \times \mathbb{R}\), \(\ell \in L\), \(\alpha \in \mathcal{A}_{t,\ell}\).

Similarly, when the trader does not have an order pending in the exchange, we have

$$ J_{0}(t,{x},\alpha )=\mathbb{E}\bigg[g(X^{t,{x},\alpha }_{T})+ \int _{t}^{T} h(X^{t,{x},\alpha }_{s}) \,\mathrm {d}s+\sum _{\tilde {\tau }_{i} \leq T} \mathfrak{C}(X^{t,{x},\alpha }_{\tilde {\tau }_{i}-},\mathsf {F}^{t,0}_{ \tilde {\tau }_{i}}, \mathfrak {l}_{i})\bigg], $$

for \((t,{x})\in [0,T]\times \mathbb{R}\times (-\infty , \mathfrak{M}] \times \mathbb{R}\), \(\alpha \in \mathcal{A}_{t}\), with the corresponding value function

$$ v_{0}(t,{x})=\sup _{\alpha \in \mathcal{A}_{t}}J_{0}(t,{x},\alpha ) . $$
(3.12)

In Appendix A, we show in Theorem A.1 that the trader’s value functions satisfy the dynamic programming principle (DPP). The part of the proof that deals with measurable selection arguments is omitted (see Bouchard and Touzi [8] for the derivation of the dynamic programming equation in the sense of viscosity solutions). In Sect. A.1, we derive the HJBQVI satisfied by the value functions, and in Sect. A.2, we study the viscosity properties of the value functions.

3.6 Benchmarks

In Appendix B, we develop two new strategies to benchmark the performance of the random-latency-optimal trading strategy. Throughout, most of the notation is the same as in the previous sections. We derive optimal execution strategies when latency is zero in the marketplace, and when latency is greater than zero and deterministic, respectively. In both cases, the trader solves an impulse control problem. In the first model, the strategy determines the optimal times to send MOs to the exchange, and in the second, the strategy determines the timing and the limit rate of each MLO. We remark that most approaches in the execution literature assume that agents send MOs at a continuous rate. Although agents cannot continuously trade in the market, this assumption is convenient because in some cases one finds strategies in closed form; see Cartea et al. [12, Sect. 6] and Guéant [20, Part II]. One notable exception is the paper by Cartea and Jaimungal [10], where the authors solve an impulse control problem in which the trader employs both MOs and limit orders to execute a large position in a financial instrument when there is zero latency in the marketplace.

4 Performance of execution strategy

We employ ultra-high-frequency market data for the currency pair EUR/USD to compare the performance of the random-latency-optimal strategy (RLOS) developed in Sect. 3 with four benchmarks (of which the first two are characterised in Appendix B): (i) deterministic-latency-optimal strategy (DLOS), (ii) zero-latency-optimal strategy (ZLOS), (iii) time-weighted-average-price (TWAP), and (iv) execution now (ENOW). (RLOS, DLOS and ZLOS are computed using finite differences. Convergence results for these algorithms remain to be studied.) TWAP sends MOs of equal size and at equally spaced time intervals over the trading window. ENOW is a hypothetical benchmark that assumes there is enough liquidity at the best bid rate in the LOB to execute all the inventory with one MO, with zero latency, at the beginning of the trading window.

Recall that in our formulation, the fundamental bid rate \(S_{t}\) is an input to RLOS, DLOS, ZLOS. However, market participants observe the best bid rate posted on the LOB of the exchange, a rate that consists of the fundamental bid rate \(S_{t}\) and short-lived deviations. Here, we employ LOB observations of the best bid rates and of the quoted spreads in the LOB of the exchange to estimate the fundamental bid rate \(S_{t}\). In Sect. 2.3, the bottom panel of Fig. 1 depicted the evolution of the best rates in the LOB of the EUR/USD currency pair, where we saw that most flickers cause short-lived widening of the quoted spread: most flickers in the best bid rate are negative, and most flickers in the best ask rate are positive. Thus we assume here that at time \(t\), the fundamental best bid rate \(S_{t}\) is the bid rate in the LOB the last time the spread in the LOB was less than \(\upsilon \) ticks. Throughout, we assume that \(\upsilon =11\) ticks; our results are robust to various choices of the value of \(\upsilon \).
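The spread filter described above can be sketched as a single pass over the quotes. In the Python sketch below, the function name, the tick size argument and the rounding of the spread to a whole number of ticks (to guard against floating-point error) are assumptions; before the first observation with a narrow spread, the estimate is undefined and is returned as `None`.

```python
def fundamental_bid(bids, asks, tick, upsilon):
    """Carry forward the best bid observed the last time the quoted
    spread was below `upsilon` ticks."""
    estimates, last = [], None
    for bid, ask in zip(bids, asks):
        spread_in_ticks = round((ask - bid) / tick)
        if spread_in_ticks < upsilon:
            last = bid  # narrow spread: accept the posted bid as fundamental
        estimates.append(last)
    return estimates
```

For example, with a tick of \(10^{-5}\) and \(\upsilon =11\), a quote with a spread of 30 ticks is treated as a flicker, and the estimate stays at the previously accepted bid.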

The trader’s execution horizon is \(T = 6\) seconds, and the inventory to liquidate is \(\mathfrak {M} =10\) lots, where each lot is €500’000; so the objective is to exchange €5’000’000 into USD. The execution value of ENOW in a frictionless market is the value of \(\mathfrak {M}\) lots exchanged at the best bid rate in the LOB at the start of the trading horizon. For TWAP and ZLOS, the MO for each lot is processed with the liquidity posted at the best bid rate in the LOB of the exchange, and if necessary, the order will walk down until filled in full. Similarly, if the limit rate of the MLO allows it, RLOS and DLOS will also walk down the LOB if the liquidity in the best quotes is not enough – we do so even though the strategies characterised in this paper assume that orders of size one do not walk the LOB.
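For concreteness, a TWAP schedule for this setup can be sketched as follows. The text specifies orders of equal size at equally spaced times; placing the first order at the start of the window is an assumption of this Python sketch, as is the function name.

```python
def twap_schedule(horizon, lots):
    """Return (time, size) pairs: one lot per order, equally spaced,
    with the first order at time 0 (an assumption)."""
    step = horizon / lots
    return [(i * step, 1) for i in range(lots)]
```

With a horizon of 6 seconds and 10 lots, this gives one MO of one lot every 0.6 seconds.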

To compute RLOS, DLOS, ZLOS, we assume that the fundamental best bid rate satisfies

$$ S_{t}=S_{0} + \sigma W_{t}, $$
(4.1)

where \(\sigma \) is a volatility parameter. However, we remark that when we implement the liquidation strategies, we do not simulate the fundamental best bid rate and do not simulate the flickers that affect the MLO when it is processed by the exchange. Instead, we use the best bid rate in the LOB of LMAX. The only aspect of the model that we simulate is the latency of the trader that employs the strategy – in Sect. 4.5, we perform robustness checks with respect to latency. Thus the results we report are robust to model and parameter misspecification. We assume that our trades do not impact the dynamics of the LOB. Ideally, one would like to test the model when other market participants react to the activity generated by the latency-optimal strategy; however, it is not possible to endogenise the reaction of other market participants to our trades.

We employ data from 1 August 2019 to 31 August 2019 to estimate the parameters of the model, and data between 1 September 2019 and 29 February 2020 to implement and compute the performance of RLOS and the benchmark strategies. Each day, we implement the liquidation strategy over three ten-minute windows. We use the data in the ten minutes starting at 9.00am, at 2.00pm and at 7.00pm for each trading day between 1 September 2019 and 29 February 2020. Times reported in this paper are in London time. At 9.00am and 2.00pm, the foreign exchange activity tends to be very high because by 9.00am, many financial markets in London are open and very active, and many financial markets in the US open at around 2.00pm. Later in the afternoon, foreign exchange activity decreases, and by 7.00pm, many markets worldwide are outside the hours of their main trading activity.

4.1 Model parameters

We employ data from August 2019 to estimate the parameters of the model as follows:

(i) Volatility of fundamental best bid rate dynamics: We compute the quadratic variation of the fundamental best bid rate to estimate the volatility parameter \(\sigma \) in (4.1). The estimates over the three ten-minute windows are: (9.00am to 9.10am) \(\hat{\sigma}_{\text{9am}}=2.1\times 10^{-4}\); (2.00pm to 2.10pm) \(\hat{\sigma}_{\text{2pm}}=2.9\times 10^{-4}\); (7.00pm to 7.10pm) \(\hat{\sigma}_{\text{7pm}}=1.7\times 10^{-4}\). The estimate of the volatility from 9.00am to 8.00pm is given by \(\hat{\sigma}=1.9\times 10^{-4}\); we use this parameter estimate in Sect. 4.4 when trading around news events.

(ii) Distribution of the size of the flicker that affects the MLOs at processing times: Recall that the law of the flickers is

$$ \nu (\mathrm {d}z)= p_{0} \delta _{\{0\}}(\mathrm {d}z) + \underbrace{p_{+} \eta _{+} e^{-\eta _{+} z} \boldsymbol{1}_{\{z>0\}} \, \mathrm {d}z}_{ \text{{price improvement}}}+ \underbrace{p_{-} \eta _{-} e^{\eta _{-} z} \boldsymbol{1}_{\{z< 0\}} \, \mathrm {d}z}_{ \text{{slippage}}} . $$

Agents use their trade activity to estimate their expected latency in the marketplace; here, we assume that their estimate is one of 10, 30, 60, 90 ms. Next, for a given expected latency and for a value of \(\upsilon \) to obtain the fundamental best bid rate, agents estimate the parameters of the distribution of flickers as follows: employ LOB data to study the hit and miss of MLOs with limit rates equal to the fundamental best bid rate when latency is exponentially distributed; use ten minutes of LOB data (beginning at 9.00am, 2.00pm and 7.00pm) of each trading day in August to simulate over one million MLOs; compute the SPIs of the MLOs; and finally, obtain maximum likelihood estimates of the parameters \(p_{0}, p_{+}, p_{-}, \eta _{+}, \eta _{-}\) for each ten-minute trading interval and for each expected latency.

(iii) Terminal penalty: We employ the liquidity-taking trades that walk the LOB to compute the value of the terminal penalty parameter \(a\). For each order, we look at the excess traded quantity above the liquidity posted at the best quotes and perform a linear regression where the explanatory variable is the excess traded quantity, bought or sold, and the dependent variable is the excess rate paid or received in the quote currency for walking the book. The slope of the regression for data in August 2019 is \(\hat{a}=1.80\times 10^{-5}\), which the agent uses to compute the optimal strategies. On the other hand, to test the performance of the strategy from September 2019 to February 2020, we use \(\hat{a}=1.61\times 10^{-5}\), which is the estimate obtained from the data in that test period; the trader does not know this value when computing the strategies.
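Steps (i) and (ii) can be sketched in Python as follows. Under (4.1), the quadratic variation of \(S\) over a window of length \(T\) is \(\sigma ^{2}T\), which gives the volatility estimator below; the time unit of `horizon` is an assumption. For the flicker law, the likelihood factorises, so the maximum likelihood estimates of \(p_{0},p_{+},p_{-}\) are the sample frequencies, and those of \(\eta _{+},\eta _{-}\) are the reciprocals of the mean absolute size of the positive and negative observations. The function names and the dictionary layout are hypothetical.

```python
import math


def volatility_from_quadratic_variation(path, horizon):
    """sigma_hat = sqrt(sum of squared increments / horizon)."""
    qv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
    return math.sqrt(qv / horizon)


def fit_flicker_law(samples):
    """MLE for a law with an atom at 0 and exponential tails on each side."""
    n = len(samples)
    pos = [z for z in samples if z > 0]
    neg = [-z for z in samples if z < 0]
    zeros = n - len(pos) - len(neg)
    return {
        "p_zero": zeros / n,
        "p_plus": len(pos) / n,
        "p_minus": len(neg) / n,
        # rate of an exponential law: reciprocal of the sample mean
        "eta_plus": len(pos) / sum(pos) if pos else None,
        "eta_minus": len(neg) / sum(neg) if neg else None,
    }


def mean_flicker(params):
    """zeta = p_plus / eta_plus - p_minus / eta_minus."""
    return (params["p_plus"] / params["eta_plus"]
            - params["p_minus"] / params["eta_minus"])
```

A useful check is that `mean_flicker` applied to the fitted parameters reproduces the sample mean of the observations.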

We assume that the value of the permanent price impact parameter \(\kappa \) is zero because we use LOB data to run the simulations; so we cannot include the permanent impact that the trader’s orders would have had on the market. Recall that in the implementations, the MLOs walk the book if necessary (while obeying the price limit), and the strategies incorporate the fees paid to the exchange. Thus we account for all trading frictions except the friction arising from the permanent price impact. The central scenarios we study assume that the liquidity trader is patient; so we set the value of the urgency parameter \(\phi \) to zero. Finally, the transaction cost parameter is \(\rho =3\times 10^{-6}\), i.e., 3 USD per million USD traded, which is the fee paid by active traders in the foreign exchange market. Table 4 summarises the model parameters for RLOS, DLOS, ZLOS to trade between 9.00am and 9.10am when expected latency is 30 ms. In the interest of space, we do not report the parameters for the other scenarios we study.

Table 4 Trading window 9.00am to 9.10am. Fundamental best bid rate is obtained with spread filter of \(\upsilon = 11\) ticks. EL stands for expected latency

4.2 Simulations and performance measure

There are 128 days of market data in the period from 1 September 2019 to 29 February 2020. For each day, we split the ten-minute window into 100 intervals of six seconds. For each interval of six seconds, we combine the one market run of the best bid rate posted on the LOB with 100 simulated runs of the random latency that the trader would face in each trade attempt. Thus the total number of runs for each ten-minute window (9.00am to 9.10am, 2.00pm to 2.10pm, 7.00pm to 7.10pm) is \(128\times 100\times 100 = \text{1'280'000}\). In each run, the strategy exchanges 5’000’000 EUR into USD, which is 10 times the volume that one normally observes posted at the best bid rate in the exchange, i.e., \(\mathfrak {M} =10\) lots of 500’000 EUR each.

For RLOS, DLOS, ZLOS, we recall that the cost of liquidating terminal inventory with one MO is as follows: if \(Q_{T-} = 1\) lot, the order does not walk the LOB. However, if \(Q_{T-}\geq 2\), the first lot does not walk the book, and the remaining \(Q_{T-} -1\) lots walk the LOB, so that the USD received from exchanging the final position in EUR is \(Q_{T-} (S_{T} +\zeta - a (Q_{T-}-1))(1-\rho )\); see (3.9).

The measure of performance is

$$ \frac{\mathbb{E} [{\mathrm{TC}}_{T}(\text{RLOS}) ]-\mathbb{E} [{\mathrm{TC}}_{T}(\text{B}) ]}{V_{0}} \times 10^{6}, $$
(4.2)

where \({\mathrm{TC}}_{T}\) is the terminal cash in USD obtained by the strategy, \(V_{0}\) is the value of the position to exchange at the beginning of the trading window, and B represents one of the four benchmark strategies. The numerator of (4.2) is in USD and the denominator is in EUR; so the units of the performance measure are $/€M, i.e., USD per million of EUR exchanged. Finally, when the benchmark is ENOW, the performance measure is ‘implementation shortfall’, which is a widely used benchmark to assess implicit costs and the liquidity of markets. ENOW is seldom achieved because large orders need much more than the liquidity available at the best quotes of the LOB; see Almgren [3] and Cartea et al. [12, Sect. 4]. Generally, one computes implementation shortfall with the midprice. Here, we use the best bid rate for a liquidation problem.
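The performance measure (4.2) can be computed from the simulated runs as in the Python sketch below, where the expectations are replaced by sample means. The function name and the numbers used to exercise it are hypothetical and are not results from the paper.

```python
from statistics import fmean


def performance(tc_strategy, tc_benchmark, v0):
    """Difference of mean terminal cash (USD), per million EUR of
    initial position v0."""
    return (fmean(tc_strategy) - fmean(tc_benchmark)) / v0 * 1e6
```

For instance, if a strategy receives on average 10 USD more than the benchmark when exchanging 5'000'000 EUR, the measure is \(10/(5\times 10^{6})\times 10^{6}=2\) USD per million EUR.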

4.3 Results: performance of RLOS and benchmarks

In this subsection, we present the results of the execution strategy for the three ten-minute trading windows that start at 9.00am, 2.00pm and 7.00pm for various levels of the trader’s expected latency in the marketplace. Later, in Sect. 4.4, we examine the performance of the strategies around news announcements. The main finding we report is the superior performance of the latency-optimal strategies RLOS and DLOS. Both RLOS and DLOS perform similarly; we do not find evidence at 95% confidence to reject the hypothesis that their mean performance is the same. The outperformance of RLOS over DLOS is positive in most scenarios, but is always less than $0.5/€M in absolute value. RLOS and DLOS produce similar performances because the optimal limit rates coincide for most of the simulations when rounded to the nearest tick size. The cash value of the outperformance is approximately the same as the fees paid by traders in foreign exchange markets; the value of the outperformance considerably increases during news events. The outperformance stems from the speculative MLOs sent by RLOS and DLOS, which are more valuable when the market is more active.

There are two key insights. First, the performance of MLOs depends on the distribution of the flickers. When activity in the markets increases, the value of MLOs increases because both the probability of receiving a price improvement and the size of the improvements increase; the value of MLOs is highest during news events. On the other hand, an increase in either the size or the probability of slippage during times of heightened activity has little effect on the performance of latency-optimal strategies because the limit rates in the sell MLOs prevent orders from walking down the LOB. This is in contrast with strategies that send MOs (e.g. TWAP and ZLOS) because these orders are exposed to adverse changes in the rates. Second, traders use speed to complete the execution programme with as many MLOs as possible because they can retry missed speculative MLOs before reaching the end of the trading interval. Thus the lower the latency of the trader, the more opportunities the trader has to fill speculative MLOs, and the better the performance of latency-optimal strategies that use MLOs relative to strategies that employ MOs.

Cartea and Sánchez-Betancourt [15] estimate the range of the stochastic latency of an active trader in the foreign exchange market to be between 10 and 50 ms. Table 5 reports the results of the performance measure (4.2) over the window 9.00am to 9.10am for a patient trader with 10, 30, 60 and 90 ms of expected latency. For every million EUR exchanged into USD, RLOS receives approximately between 5.14 and 2.65 USD more than TWAP, between 4.56 and 2.34 USD more than ZLOS, and between 1.42 and 3.91 USD more than ENOW. Finally, the average USD amount that RLOS receives is statistically the same as that received by DLOS, so the table does not report the results for DLOS. Here, two means are statistically the same if the Student \(t\)-test does not reject the hypothesis that the two means are the same with \(95\%\) confidence. All outperformances we report are statistically different from zero. For example, in Table 5, the outperformance of RLOS over TWAP when the expected latency is 30 ms is $3.80/€M. In this case, we reject the null hypothesis that the two means are the same with a \(p\)-value of \(2.15\times 10^{-5}\).
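The comparison of two mean revenues described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `welch_z_test` is hypothetical, the revenue samples are synthetic, and the \(p\)-value uses a normal approximation to the Student \(t\) distribution, which is only reasonable for large samples.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_z_test(x, y):
    """Two-sided test of equal means based on Welch's statistic.

    Assumption of this sketch: the p-value is computed with a normal
    approximation to the Student t distribution (large samples only).
    """
    nx, ny = len(x), len(y)
    se = math.sqrt(variance(x) / nx + variance(y) / ny)
    stat = (fmean(x) - fmean(y)) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Synthetic revenues in USD per million EUR (illustrative numbers only,
# not taken from the tables of the paper).
rng = random.Random(0)
rlos = [rng.gauss(3.8, 60.0) for _ in range(20_000)]
twap = [rng.gauss(0.0, 60.0) for _ in range(20_000)]

stat, p_value = welch_z_test(rlos, twap)
reject_at_95 = p_value < 0.05  # True means the two means differ at 95% confidence
```

With real simulation output, `rlos` and `twap` would be replaced by the per-run revenues of the two strategies.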

Table 5 Values of performance measure (4.2) from 9.00am to 9.10am, \(T=6\) seconds, \(\phi =0\). The revenues from RLOS and from DLOS (not shown) are statistically the same

The results in Table 5 show that the value of the outperformance of RLOS (and DLOS) decreases as the value of the expected latency increases – we provide two stylised facts of the strategies to explain the results in the table. First, as expected latency increases, it is more likely that at the end of the trading interval, all strategies will fall short of liquidating the target inventory. In the extreme case where expected latency is arbitrarily large, it is very likely that by the terminal time \(T\), most of the target inventory will remain in euros, i.e., is not exchanged into USD. Thus at time \(T\), the remaining lots of currency pairs will be liquidated with one MO, where the trader pays the costs of walking the book; the costs are the same for all strategies (except for ENOW).

Second, for a fixed trading window and for a fixed liquidation target, faster traders (i.e., with lower expected latency) have on average more opportunities than slow traders to send MLOs to the exchange (recall that in our model, the trader cannot send another trade if her previous order is pending in the exchange). Therefore, compared with slow traders, faster traders send more (and fill more) MLOs that seek a rate improvement, which explains the results we report in the table; we return to this point below. In an extreme case, a patient trader with nearly zero latency will use her speed advantage to send a large number of MLOs that seek price improvements, and due to the flickers in the best bid rate in the LOB, the trader will exchange a large proportion of the initial inventory in EUR into USD with these speculative MLOs. A very fast and very patient trader will not employ her speed in the marketplace to liquidate the target inventory quickly and ahead of time; the trader will use her low latency to fill as many speculative trades as possible to complete the execution target.
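The link between expected latency and the number of opportunities to trade can be sketched with a short simulation. This is an illustration under stated assumptions rather than the model of the paper: latencies are drawn from an exponential distribution, the trader sends the first order at time zero, and she sends a new order as soon as the previous one has been processed. The function names are hypothetical.

```python
import random


def count_attempts(horizon_s, mean_latency_s, rng):
    """Count orders sent in [0, horizon_s) when a new order can be sent
    only after the previous one has been processed."""
    t, attempts = 0.0, 0
    while t < horizon_s:
        attempts += 1
        # Assumption: the latency of each order is exponential with the given mean.
        t += rng.expovariate(1.0 / mean_latency_s)
    return attempts


def mean_attempts(horizon_s, mean_latency_ms, runs=1_000, seed=0):
    """Average number of attempts over independent simulated runs."""
    rng = random.Random(seed)
    total = sum(
        count_attempts(horizon_s, mean_latency_ms / 1000.0, rng)
        for _ in range(runs)
    )
    return total / runs


# Six-second horizon and the four expected latencies used in the text.
averages = {ms: mean_attempts(6.0, ms) for ms in (10, 30, 60, 90)}
```

Under these assumptions, the average number of attempts is close to the horizon divided by the expected latency, so it falls as latency rises.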

Our results show that the slower the trader, the less she benefits from latency-optimal strategies. On the other hand, an implication of our findings is that slow traders may profit from investing in co-location, software and other services to increase their speed in the marketplace if the improvement in the performance of the execution strategies outweighs the costs of reducing their latency, which largely depends on the volumes traded.

To gain further insights, we report summary statistics of the behaviour of the strategies from 9.00am to 9.10am when expected latency is 30 ms. Table 6 reports: the average number of trade attempts (there are 200 periods of 30 ms for an execution horizon of six seconds); the average number of filled orders; the proportion of runs for which the strategy sends an MO at time \(T\) to liquidate the terminal inventory and pays extra costs from walking the LOB (i.e., the proportion of runs for which \(Q_{T-}\geq 2\)); the proportion of the number of attempts for which the limit rate (LR) of the MLO is lower than the fundamental best bid rate \(S_{t}\) (the trader is willing to walk the LOB to complete the trade); the average number of attempts for which \(\text{LR} = S_{t}\); and the average number of attempts where \(\text{LR} > S_{t}\), which are speculative trades that target a rate improvement (relative to the fundamental exchange rate).

Table 6 Expected latency is 30 ms, \(T=6\) seconds, so expected number of attempts is 200, and \(\phi =0\). LR: limit rate of MLO. \(S_{t}\): fundamental best bid rate when the trader decides to send the MLO
Table 7 Expected latency is 30 ms, \(T=6\) seconds, so expected number of attempts is 200, and \(\phi = 10^{-3}\). LR: limit rate of MLO. \(S_{t}\): fundamental best bid rate when the trader decides to send the MLO

On average, RLOS and DLOS employ a similar number of MLO attempts, of which RLOS fills 9.95 and DLOS fills 9.65 (these figures exclude the MOs sent, if necessary, at time \(T\)). Approximately 97% of the attempted MLOs by both strategies are speculative, of which 2.4% (2.3%) sent by RLOS (DLOS) were filled. As discussed above, filled speculative MLOs are the main source of the superior performance of latency-optimal strategies. Within the trading window, the limit rates of the speculative MLOs become less ambitious if the inventory is not on target and the terminal date approaches. However, the latency-optimal strategies cannot guarantee that at time \(T\), the inventory in EUR is drawn to \(Q_{T-}< 2\). The table shows that in approximately 0.3% (6.9%) of the runs, RLOS (DLOS) arrives at the end of the trading horizon with two or more lots that are liquidated with one MO. Finally, the percentage of runs where RLOS outperforms is 57% for DLOS, 61% for ZLOS, 67% for TWAP and 55% for ENOW.

As Table 6 reports, when the value of the urgency parameter \(\phi \) is zero, RLOS and DLOS miss approximately 95% of attempts with MLOs because the majority of the MLOs seek a price improvement that does not materialise. To examine the role of the urgency parameter, we implement the execution strategies described above for a range of values of \(\phi \) and report a summary of the findings for \(\phi =1\times 10^{-4}\) and \(\phi =1\times 10^{-3}\). These two choices represent impatient liquidity takers who have more urgency to liquidate the position than the patient trader with \(\phi =0\).

When \(\phi =1\times 10^{-4}\), RLOS misses 85% of the MLOs and outperforms TWAP by 1.68 USD per million euro exchanged. Similarly, when \(\phi =1\times 10^{-3}\), RLOS misses 2% of the MLOs and outperforms TWAP by 0.79 USD per million euro exchanged. The performance of latency-optimal strategies is similar to that of TWAP because the urgency with which RLOS must liquidate inventory precludes the strategy from sending speculative trades, and because the limit rates of the MLOs are set low to increase the probability of a fill at the expense of price protection. In Table 7, we observe that RLOS sends nearly all MLOs with a limit rate below the fundamental best bid rate to ensure a prompt liquidation and does not send speculative MLOs – due to the trader’s impatience, in 99.97% of the simulations, the strategy completes the liquidation early. These results are in stark contrast with those reported in Table 6, where the trader was patient and used the entire window to complete the execution programme with as many speculative MLOs as possible, and where the majority of orders sent are retries of missed speculative MLOs.

When we amalgamate the MLOs of all traders in our data set between 1 September 2019 and 29 February 2020 in the EUR/USD currency pair, we find that 57.1% of MLOs sent by all traders missed the trade. On the other hand, when we look at the MLOs sent by each trader, we find that approximately 35% of the traders miss up to 25% of their MLOs, and approximately 6% of traders miss between 75% and 100% of their MLOs – these traders also send other types of orders to the exchange (e.g. MOs), see Table 1. Finally, one cannot tell from the data which trades are part of an execution programme (i.e., child orders) or which trades are a single execution that is not part of a larger order. However, as discussed above, our model is designed to execute one stand-alone order or to execute a large parent order that is split into child orders.

In the remainder of this subsection, we focus on a trader with 30 ms of expected latency. Table 8 reports the performance measure when the ten-minute execution windows start at 9.00am, 2.00pm and 7.00pm. We find that the outperformance of RLOS over ZLOS, TWAP and ENOW is lowest when the start time is 7.00pm, which is the least active period we study and the one for which the probability of positive flickers in the best bid rate is lowest, so MLOs are less valuable.

Table 8 Values of performance measure (4.2) from 9.00am to 9.10am (first column), from 2.00pm to 2.10pm (second column), and from 7.00pm to 7.10pm (third column), \(T=6\) seconds, and expected latency is 30 ms

Next, we study the performance of RLOS when the trader estimates 30 ms of expected latency in the market, but her actual expected latency is 20 ms (underestimated speed in the marketplace), 30 ms (correct speed estimate) or 40 ms (overestimated speed in the marketplace). The results, reported in Table 9, show that the value of the outperformance of RLOS and DLOS over the other strategies is slightly higher (lower) when the agent underestimates (overestimates) her speed, i.e., when her actual expected latency is lower (higher) than her estimate. Recall that over the trading window, for a lower (higher) value of the expected latency, the trader has more (fewer) attempts on average to execute her inventory.

Table 9 Values of performance measure (4.2) starting at 9.00am, \(T=6\) seconds, and \(\phi =0\)

4.4 Trading around news events

Throughout the calendar year, many pieces of news are released according to a schedule. Market participants know the timing of the release, but do not know the content. In general, trade activity and volatility of the exchange rates tend to increase around the time of the news release. Here, we employ the EUR/USD news event information provided by FXStreet between 1 September 2019 and 29 February 2020 (see https://www.fxstreet.com/economic-calendar). During this period, there are 117 news events marked as high impact for EUR and for USD, and the time of the release of each event is timestamped with precision up to one minute.

We use the estimate \(\hat{\sigma}=1.9\times 10^{-4}\), which is the volatility of the fundamental best bid between 9.00am and 8.00pm (see Sect. 4.1), to compute the trading strategies around the time of the arrival of the scheduled news; results are robust to employing volatility estimates computed with the data around news events in August 2019. The execution horizon is \(T=6\) seconds and the agent trades in windows of 1, 3, 5, 7 and 10 minutes around the news event. For example, the ISM manufacturing index was scheduled to be released at 2.00pm on 3 September 2019; so for the one-minute window, we employ market data from 1.59.30pm to 1.59.36pm for the first simulation, data from 1.59.36pm to 1.59.42pm for the second simulation, and so on, until the tenth simulation from 2.00.24pm to 2.00.30pm. For each six-second execution interval, we perform 100 simulations of the random latency faced by the trader. Thus for the one-minute window, the study consists of \(117\times 10\times 100 = \text{117'000}\) simulations. Table 10 reports the results of the performance measure in (4.2).
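The construction of the execution intervals and the simulation count can be sketched as follows. The helper `execution_intervals` is hypothetical; it assumes that every window is centred on the release time, which matches the one-minute example in the text but is an assumption for the longer windows. Timestamps are naive because the source does not state a time zone.

```python
from datetime import datetime, timedelta


def execution_intervals(release, window_minutes, horizon_seconds=6):
    """Split a window centred on `release` into back-to-back execution intervals."""
    half = timedelta(minutes=window_minutes) / 2
    step = timedelta(seconds=horizon_seconds)
    start, end = release - half, release + half
    intervals = []
    while start + step <= end:
        intervals.append((start, start + step))
        start += step
    return intervals


# ISM manufacturing index release used as the example in the text.
release = datetime(2019, 9, 3, 14, 0, 0)
one_minute = execution_intervals(release, window_minutes=1)

n_events, n_latency_draws = 117, 100  # figures stated in the text
total_simulations = n_events * len(one_minute) * n_latency_draws
```

For the one-minute window this yields ten six-second intervals per event, from 1.59.30pm to 2.00.30pm, and the simulation count quoted in the text.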

Table 10 Performance measure around news events, \(T=6\) seconds, \(\phi =0\), and expected latency is 30 ms

The value of the outperformance of RLOS over ZLOS and TWAP is approximately between two and ten times the value reported in Table 5. This significant improvement in the performance of RLOS follows from an increase in the number of speculative MLOs that are filled. Table 11 shows the ex-post parameters of the distribution of flickers faced by the trader with 30 ms of expected latency. It is clear that during news announcements, the value of speculative MLOs is higher because the probability of a price improvement (respectively, slippage) and the size of the positive (respectively, negative) flickers are greater than those of the distribution of flickers for the ten-minute trading windows that start at 9.00am, 2.00pm and 7.00pm.

Table 11 Period is from 1 September 2019 to 29 February 2020. Expected latency is 30 ms
Table 12 Performance measure from 9.00am to 9.10am, \(T=6\) seconds, and expected latency 30 ms

4.5 Robustness checks

In the results we showed above, the fundamental best bid rate was the current or most recent observed best bid rate when the spread was less than \(\upsilon =11\) ticks; recall that one tick is \(10^{-5}\) USD. Table 12 shows that the outperformance of RLOS over the benchmarks is robust for \(\upsilon \in \{7,9,11,13, \infty \}\) ticks, where expected latency is 30 ms. Note that when \(\upsilon =\infty \) ticks, the fundamental best bid rate is the observed bid rate, i.e., \(\hat{S}_{t} = S_{t}\).
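The spread filter that defines the fundamental best bid rate can be sketched as below. The function name, the sample quotes, and the fallback used before the first tight quote are assumptions made for illustration; the text only specifies that the fundamental rate is the current or most recent best bid observed when the spread is below \(\upsilon \) ticks.

```python
import math

TICK = 1e-5  # one tick in USD, as stated in the text


def fundamental_best_bid(quotes, max_spread_ticks=11):
    """Return, for each (bid, ask) quote, the current or most recent best bid
    observed while the spread was below `max_spread_ticks` ticks."""
    result, last_valid = [], None
    for bid, ask in quotes:
        spread_ticks = round((ask - bid) / TICK)
        if spread_ticks < max_spread_ticks:
            last_valid = bid
        # Assumption: before the first tight quote, fall back to the observed bid.
        result.append(bid if last_valid is None else last_valid)
    return result


# Hypothetical quotes: the second one has a 23-tick spread and is filtered out.
quotes = [(1.10000, 1.10002), (1.09980, 1.10003), (1.10001, 1.10003)]
filtered = fundamental_best_bid(quotes)
unfiltered = fundamental_best_bid(quotes, max_spread_ticks=math.inf)
```

With an infinite threshold no quote is filtered, so the output coincides with the observed best bid.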

The value of the outperformance of RLOS and DLOS over other benchmarks peaks for \(\upsilon =11\) ticks. In addition, we note that for the outperformance of RLOS over TWAP, any two consecutive results are statistically the same. For example, the outperformances of RLOS over TWAP for \(\upsilon =9\) and \(\upsilon =11\) are statistically the same. Finally, the performances of RLOS and DLOS are robust to misspecification of the volatility parameter of the fundamental best rate. We ran simulations with volatility estimates \(\sigma \in \{1, 2, 3\}\times 10^{-4}\) and expected latency 30 ms. The performance measures are the same as those reported above to within plus or minus $0.4/€M.

5 Conclusions

We have solved a general stochastic impulse control problem in which there is a stochastic delay between the action and its outcome. As an application, an investor liquidates a large position in a financial instrument when there is latency in the marketplace. We have derived the optimal strategies for stochastic and for deterministic latencies, and compared their performance with that of three benchmarks, including TWAP. During normal trading hours, we have found that the latency-optimal strategies outperform the benchmarks by an amount similar to that of the fees paid by traders in the foreign exchange market. Around news events, the value of the outperformance increases to between two and ten times the value of the fees. The superior performance of the latency-optimal strategies stems from both the speculative MLOs sent by the trader and the rate protection provided by the MLOs. We have shown that fast and patient traders use their superior speed to execute the target inventory with as many speculative MLOs as possible; they do not use their speed advantage to finish the execution programme early, or to hit the best quotes they observe in the LOB.

In financial applications, interesting research problems include a study of the effect that latency has on the financial performance of electronic trading strategies, e.g. market making, pairs trading and other statistical arbitrage strategies. In particular, one could try to extend the recent works of Kalsi et al. [23] and Cartea et al. [14] to account for latency. In these works, the authors use techniques from rough path theory to solve relevant algorithmic trading problems, where it is assumed that there is no delay between trade attempts and executions. Finally, from a mathematical point of view, a challenging problem is to extend the current framework to include other distributions of the delay in the marketplace, and to extend the model so that there can be more than one pending order in the system when latency is stochastic.