Abstract
Pairs trading is a typical example of a convergence trading strategy. Investors simultaneously buy relatively underpriced assets and sell relatively overpriced assets to exploit temporary mispricing. This study examines optimal pairs trading strategies under symmetric and nonsymmetric trading constraints. Under the assumption that the price spread of a pair of correlated securities follows a mean-reverting Ornstein-Uhlenbeck (OU) process, analytical trading strategies are obtained under a mean-variance (MV) framework. Model estimation and empirical studies on trading strategies are conducted using data on pairs of stocks and futures traded on China’s securities market. The results indicate that the pairs trading strategies perform fairly well.
1 Introduction
Statistical arbitrage trading strategies have been widely used in financial markets. Their implementation may restrain excessive speculation and enhance market liquidity. A convergence trade is a statistical arbitrage trade that exploits the mispricing of two assets with similar trends in future payoffs. As reported by Liu and Timmermann (2013), convergence trades include merger arbitrage (risk arbitrage), pairs trading (relative value trades), on-the-run/off-the-run bond trades, tranched structured securities, and arbitrage between the same stocks trading in different markets. Pairs trading was pioneered by Gerry Bamberger, and further developed by Nunzio Tartaglia’s quantitative group at Morgan Stanley in the 1980s (Gatev et al. 2006). The core idea of pairs trading is to sell the overpriced security and buy the underpriced security when the price spread widens, and to close the position when the price spread converges. Huck (2010) proposed a general and flexible framework for the selection of pairs together with a multi-step-ahead forecast method. We refer the reader to Whistler (2004) and Reverre (2001) for more details about pairs trading.
Studies on pairs trading primarily focus on three major approaches, namely, the distance approach, the stochastic spread approach and the cointegration approach. The distance approach is a trading strategy that attempts to make a profit when the sum of squared differences between two stock prices crosses a prescribed threshold (Nath 2003). Despite its straightforward structure, the distance method lacks forecasting ability with respect to the convergence time and the expected holding period (Do et al. 2006). The stochastic spread approach (Elliott et al. 2005) describes the temporary divergence in the prices of two correlated securities. The divergence in prices may be attributed to liquidity shortages, and is expected to converge to an equilibrium level in the future. Song and Zhang (2013) explored optimal stopping problems by maximizing the overall return under the mean-reverting assumption. Sperling and Siu (2018) further considered regime-switching by extending the model reported by Göncü and Akyildirim (2016). The cointegration approach is based on the premise that a pair of asset price series is cointegrated. Vidyamurthy (2004) and Gatev et al. (2006) pioneered the cointegration approach in pairs trading research. This approach was further developed by Lin et al. (2006) using optimal loss protection. Explicit optimal portfolio trading strategies were derived under the MV and expected utility objective functions (Liu and Timmermann 2013; Chiu and Wong 2013; Chiu and Wong 2015). Due to its tractability and flexibility, we consider the cointegration approach in this study.
Markowitz (1952) pioneered the MV paradigm for portfolio selection in a single-period modelling framework. The MV criterion has been further investigated in the discrete-time multiperiod setting (Li and Ng 2000), in continuous time with bankruptcy prohibition (Bielecki et al. 2005), and in a mean-risk formulation (Cui et al. 2017). The expected utility framework has also been studied widely in the context of the portfolio selection problem since the pioneering works of Merton (1969, 1971). These two frameworks represent different investment preferences of various market participants, and have attracted considerable attention in the finance literature. Mudchanatongsuk et al. (2008) and Tourin and Yan (2013) explored optimal pairs trading strategies with expected utility of terminal wealth. Inspired by these two works, we study the optimal pairs trading strategies of MV-preference investors. Wang and Zhou (2020) identified two main reasons for the popularity of the MV criterion. First, the MV criterion is intuitively appealing from a practical perspective: it is transparent in capturing the trade-off between risk and return, which is one of the main concerns of traders and investors. Second, the MV criterion leads to the theoretically intriguing issue of Bellman’s inconsistency inherent to the underlying stochastic control problems. It may be noted that in some cases, the MV criterion leads to a simple solution to the portfolio selection problem with a practically meaningful interpretation, though the challenging issue of Bellman’s inconsistency needs to be resolved before the simple solution is obtained. As noted in, for example, Bielecki et al. (2005), the basic concept of the MV model is a foundation of neoclassical finance theory, including the mutual fund theorem and the capital asset pricing model.
In the MV framework, the failure of the iterated-expectations property prevents the application of the traditional dynamic programming approach. This renders optimality conceptually unclear (Björk and Murgoci 2010). A pre-commitment strategy aims to find a strategy or a control that maximises the initial value function at a fixed starting time point, while disregarding the fact that a decision maker or investor may have an incentive to deviate from the initial policy at a later time (Dang and Forsyth 2016; Kryger and Steffensen 2010). However, this strategy is not time-consistent. Specifically, when the same problem is solved at a later time, the resulting optimal control will be different from that obtained at the starting time. To address this time-inconsistency, Basak and Chabakauri (2010) adopted a game-theoretic approach to solve a continuous-time MV problem for an investor who updates her nonlinear MV objective by taking future updates into account in a time-consistent manner, and derived an equilibrium control policy. For more details about time-consistent equilibrium controls, we refer the reader to Strotz (1955), Krusell and Smith (2003), Björk et al. (2014) and Huang and Nguyen-Huu (2018). In this study, we consider time-consistent trading strategies for the pairs trading problems.
Building on existing works such as Mudchanatongsuk et al. (2008), Basak and Chabakauri (2010), Tourin and Yan (2013), and Gu et al. (2020), an optimal trading strategy is formulated as a dynamic MV portfolio selection problem. The price spread of two correlated securities is modelled by an OU process, which captures the mean-reverting property of the price spread. In Mudchanatongsuk et al. (2008) and Tourin and Yan (2013), the expected utility maximisation objective was considered using the Bellman principle. The objective of this study is to investigate time-consistent pairs trading strategies with an MV objective. By employing the approach based on the total variance formula in Basak and Chabakauri (2010), the original optimization problem is transformed into a quadratic form, and an analytical solution is obtained. To explore the potential implementation of the proposed approach, empirical studies on the optimal trading strategies are conducted using data on pairs of stocks and futures traded on China’s securities market.
In summary, the key contributions of our paper are as follows. First, a closed-form optimal trading strategy is obtained under the assumption that the spread of the asset prices follows an OU process, and the portfolio weights allocated to the two assets are symmetric. Second, we extend the model setup to allow for nonsymmetric portfolio weights. This leads to a more general trading strategy. Third, we calibrate the model parameters for different pairs of assets from the Chinese securities market, including stocks and futures, to validate the analytical optimal solutions.
The paper is structured as follows. The next section presents the model setup for pairs trading adopted from Mudchanatongsuk et al. (2008). Section 3 formulates the optimal pairs trading problems as dynamic MV problems under two different settings. The time-consistent solutions to the problems in both situations are presented. Section 4 presents empirical illustrations, and finally, Sect. 5 concludes the paper. The proofs and derivations of some results are provided in the “Appendix”.
2 The model dynamics in pairs trading
In this section, the dynamics for the price spread and the pairs trading strategies are described in a continuous-time modeling framework, as in Mudchanatongsuk et al. (2008). A continuous-time financial market is considered, where the time parameter set is [0, T] (i.e., \(t \in [0, T]\)). Hereafter, we simply use the (continuous) time index t without referring to the time parameter set for convenience. The uncertainties are described by a complete probability space \((\Omega , \mathcal{F}, \mathbb {P})\), where \(\mathbb {P}\) is a real-world probability measure. Now we consider three tradeable securities in the market, namely, a risk-free asset and two risky assets, where the price dynamics of the two risky assets are assumed to be cointegrated. We also impose some standard assumptions for a perfect market as follows. There are no transaction costs or taxes in trading these securities, and short selling is allowed. The main purpose of this study is to obtain optimal time-consistent pairs trading strategies, and the method may also be applicable when transaction costs or taxes are considered.
Let r be the continuously compounded rate of interest, which is assumed to be a positive constant for simplicity. The price of the risk-free asset at time t is denoted by M(t) and it satisfies the following differential equation:
$$\begin{aligned} dM(t) = r M(t)\,dt. \end{aligned} \tag{1}$$
Let A(t) and B(t) denote the prices of the pair of assets A and B at time t, respectively. We assume that the price of stock B follows the geometric Brownian motion:
$$\begin{aligned} dB(t) = \mu B(t)\,dt + \sigma B(t)\,dZ(t), \end{aligned} \tag{2}$$
where \(\mu \) and \(\sigma \) are the constant drift and volatility, respectively; \(\{ Z(t) \}\) is a standard Brownian motion.
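Let X(t) denote the price spread of stocks A and B at time t, which is defined as follows:
$$\begin{aligned} X(t) = \ln A(t) - \ln B(t). \end{aligned} \tag{3}$$
To capture the mean-reverting property, we assume that the above price spread follows an OU process:
$$\begin{aligned} dX(t) = k\big(\theta - X(t)\big)\,dt + \eta \,dW(t), \end{aligned} \tag{4}$$
where \(\{W(t) \}\) is another standard Brownian motion; \(k>0\) is the rate of mean reversion; \(\theta \) is the long-term mean of the process; \(\eta >0\) is the volatility of the price spread; \(\rho \) is the instantaneous correlation coefficient between the two Brownian motions \(\{Z(t)\}\) and \(\{W(t)\}\). Therefore, by a straightforward calculation (Itô's lemma applied to \(A(t)=B(t)e^{X(t)}\)), we obtain
$$\begin{aligned} dA(t) = \Big (\mu + k\big(\theta - X(t)\big) + \frac{\eta ^2}{2} + \rho \sigma \eta \Big )A(t)\,dt + \sigma A(t)\,dZ(t) + \eta A(t)\,dW(t). \end{aligned} \tag{5}$$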
The information structure of the model is specified by a filtration \(\{\mathcal {F}_t\}\), which is the natural filtration generated by the two correlated Brownian motions \(\{ W (t) \}\) and \(\{ Z (t) \}\) augmented by the \(\mathbb {P}\)-null sets. For notational convenience, we denote the conditional expectation and the conditional variance given \(\mathcal {F}_t\) as \(E_t (\cdot )\) and \(Var_t(\cdot )\) respectively under the probability measure \(\mathbb {P}\). We calibrate the proposed model by following an approach based on the maximum likelihood estimation method proposed by Mudchanatongsuk et al. (2008).
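To make the model dynamics concrete, the following minimal sketch simulates one path of the spread and the two asset prices. It assumes the spread is the log-price spread \(X(t)=\ln A(t)-\ln B(t)\), as in Mudchanatongsuk et al. (2008), and uses an Euler-Maruyama step for the OU process; the function name `simulate_pair`, its argument names, and the parameter values in the usage note are hypothetical and are not taken from the paper.

```python
import numpy as np

def simulate_pair(k, theta, eta, mu, sigma, rho, x0, b0, horizon, n_steps, seed=0):
    """Simulate the spread X (OU), asset B (GBM) and asset A = B * exp(X).

    Assumption: X is the log-price spread ln A - ln B. The OU process is
    discretised with an Euler-Maruyama step, so the path is approximate.
    """
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    sqrt_dt = np.sqrt(dt)
    # Brownian increments dZ and dW with instantaneous correlation rho.
    dz = sqrt_dt * rng.standard_normal(n_steps)
    dw = rho * dz + np.sqrt(1.0 - rho**2) * sqrt_dt * rng.standard_normal(n_steps)

    # Mean-reverting spread: dX = k (theta - X) dt + eta dW.
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        x[i + 1] = x[i] + k * (theta - x[i]) * dt + eta * dw[i]

    # Geometric Brownian motion for B, simulated exactly in log space.
    log_steps = (mu - 0.5 * sigma**2) * dt + sigma * dz
    log_b = np.log(b0) + np.concatenate(([0.0], np.cumsum(log_steps)))
    b = np.exp(log_b)
    a = b * np.exp(x)  # price of A implied by the spread definition
    return x, a, b
```

For instance, `simulate_pair(2.0, 0.1, 0.2, 0.05, 0.3, 0.5, 0.0, 10.0, 1.0, 252)` produces one year of daily values; these inputs are illustrative only.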
3 The dynamic MV problem
In what follows, the optimal pairs trading problems are formulated as MV portfolio selection problems, following Basak and Chabakauri (2010) and Gu and Steffensen (2015). The MV problems are solved by employing the dynamic programming principle, and two cases with different trading constraints are discussed. In the first case, the portfolio weights invested in the two risky assets are assumed to sum to zero. This constraint is relaxed in the second case. In both cases, the problems are formulated as quadratic optimization problems and solved by combining the Feynman-Kac formula with the resulting Hamilton-Jacobi-Bellman (HJB) equation. The main results, namely the time-consistent optimal solutions for the dynamic MV problems in the two cases, are provided in Propositions 1 and 3.
3.1 Case I
Let V(t) be the value of a self-financing pairs trading portfolio. We denote by h(t) and \(\hat{h}(t)\) the portfolio weights invested in stocks A and B at time t, respectively. In this model, we assume that stocks A and B can only be traded as a pair. Specifically, we are only allowed to short one of them and long the other in equal units. Thus, we require \(h(t)=-\hat{h}(t)\). The wealth process V(t) becomes:
Substituting Eq. (2) and Eq. (5) into Eq. (6) gives:
We define \(\pi (t):= V(t)h(t)e^{r (T-t)}\), where V(t)h(t) denotes the present amount invested in the stocks. Eq. (7) can then be rewritten as follows:
or equivalently,
The objective of the dynamic MV problem is given by:
where \(\lambda <0\). Note that by the joint Markov property of (X(t), V(t)) with respect to the filtration \(\{ \mathcal{F}_t \}\), the conditional expectation \(E_t \) and conditional variance \(Var_t\) are indeed of the form \(E( \cdot \mid X(t), V (t))\) and \(Var( \cdot \mid X(t), V (t))\), respectively.
Suppose that \(\pi ^*(\cdot )\) denotes the time-consistent control and \(V^*(\cdot )\) denotes the respective wealth process. Then, we define the value function as follows:
For brevity, we also write J(t, X(t), V(t)) as \(J_t\) in what follows. We consider the situation where decisions are made in the time horizon \([t, t + \tau ]\), for \(\tau > 0\). The decision maker must decide a strategy \(\{ \pi (s) \}_{s \in [t, t+\tau ]}\) with the objective function \(E_t[J_{t+\tau }]+\lambda Var_t[E_{t+\tau }(V(T))]\). It is known that the decision maker follows the equilibrium law \(\pi ^*(s)\) after time \(t+\tau \). The objective function is different from the traditional dynamic one in the sense that there is a time-consistent adjustment term \(\lambda Var_t[E_{t+\tau }(V(T))]\). The presence of this time-consistent adjustment term implies that \(\{ \pi ^*(s) \}_{s \ge t+\tau }\) may not be optimal at time t, in addition to the failure of Bellman’s optimality principle. The time-consistent adjustment term \(\lambda Var_t[E_{t+\tau }(V(T))]\) arises due to the “Total Variance Formula” (Basak and Chabakauri 2010). Applying HJB dynamic programming techniques that account for time consistency, the dynamic MV problem with the objective function in Eq. (10) and the dynamic budget constraint in Eq. (6) can be solved. The solution is presented in the following proposition.
Proposition 1
A time-consistent solution to the dynamic MV problem in Eq. (10) with the dynamic budget constraint in Eq. (6) is given by:
The respective optimal weight in pairs trading is given by:
Proof
The proof is given in the “Appendix”.
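For reference, the “Total Variance Formula” behind the time-consistent adjustment term is the standard law of total variance. Written in the notation of this section, for terminal wealth and the two conditioning times \(t\) and \(t+\tau \), it reads:

$$\begin{aligned} Var_t\big (V(T)\big ) = E_t\big [Var_{t+\tau }\big (V(T)\big )\big ] + Var_t\big [E_{t+\tau }\big (V(T)\big )\big ]. \end{aligned}$$

As a sketch of why the adjustment term appears, assume an objective of the form \(E_t[V(T)]+\lambda Var_t[V(T)]\) (this form is inferred here from the adjustment term and is not quoted from the paper). Combining the identity above with the tower property \(E_t[V(T)]=E_t[E_{t+\tau }(V(T))]\) gives

$$\begin{aligned} E_t[V(T)]+\lambda Var_t[V(T)] = E_t\big [E_{t+\tau }(V(T))+\lambda Var_{t+\tau }(V(T))\big ] + \lambda Var_t\big [E_{t+\tau }(V(T))\big ], \end{aligned}$$

so the first term is the expected continuation value and the second is the extra variance term that ordinary dynamic programming does not account for.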
Remark 1

Proposition 1 implies that with an increase in volatility \(\sigma \) or an increase in the correlation coefficient \(\rho \), the investor allocates more funds to risky assets. This makes intuitive sense, because when \(\sigma \) increases, the amount of uncertainty also increases. This may lead to more opportunities for arbitrage. Furthermore, with an increase in the correlation of price pairs, the price spread tends to converge. This may lead to higher profits upon investing in risky securities.

From the expression \(\pi ^*(t)\) in Eq. (12), we can see that \(\pi ^*(t)= O((T-t)^2)\). We also obtain that
$$\begin{aligned} h^*(t) =\frac{\pi ^*(t)}{V(t) e^{r(T-t)}} \rightarrow 0, \end{aligned}$$ when \(T \rightarrow \infty \). This means that when T is sufficiently large, the optimal weight in pairs trading is very small. This suggests that, to limit volatility risk, traders may tend to hold small positions when the trading period is long.
Proposition 2
(Verification Theorem) Assume that \(\tilde{J}\) is a solution of Eq. (18) with terminal condition \(\tilde{J}(T,X(T),V(T))=V(T)\), and that the control \(\pi ^*\) realizes the supremum in Eq. (18). Then \(\pi ^*\) is an equilibrium control and the corresponding value function is \(\tilde{J}\).
Proof
For any perturbation \(\pi ^{\epsilon ,u}(s):= u{\varvec{1}}_{s\in [t,t+\epsilon )}+\pi ^*(s){\varvec{1}}_{s\in [t+\epsilon ,T]}\), we aim to prove that
We skip the details of the proof, as it is similar to the proof of Theorem 7.1 in Björk and Murgoci (2010).
3.2 Case II
In the above analysis, we require that \(h(t)=-\hat{h}(t)\). The general situation where this trading constraint is relaxed is considered in this subsection. In this case, the wealth equation for \(\{ V (t) \}\) is given by:
This implies that
Let
Then
The control problem becomes:
As in Case I, the conditional expectation \(E_t \) and conditional variance \(Var_t\) are of the form \(E( \cdot \mid X(t), V (t))\) and \(Var( \cdot \mid X(t), V (t))\), respectively. Given the optimal policy \(\hat{{\varvec{\pi }}}^*(\cdot )\) and the respective wealth process \(\hat{V}^*(\cdot )\), the value function \(\hat{J}\) is defined as follows:
and we sometimes write \(\hat{J}_t\) for short.
The main result of this case is presented in the following proposition.
Proposition 3
A time-consistent solution to the dynamic MV problem in Eq. (15) with the dynamic budget constraint in Eq. (13) is given by:
where g is given by:
and \(\tilde{A}=\mu -r+\rho \sigma \eta +\frac{\eta ^2}{2}\). The respective optimal weights, therefore, are given by:
Proof
The proof of this proposition is given in the “Appendix”.
Remark 2

Similarly to Case I, \(H^*(t)\rightarrow {\varvec{0}}\) when \(T\rightarrow \infty \). This coincides with the previous case and verifies again that the investor would be more cautious after a long period.

Similarly to Proposition 2, for the corresponding verification theorem, one may refer to the specific case of Theorem 7.1 in Björk and Murgoci (2010).

Tourin and Yan (2013) analyze the optimal pairs trading strategies with the exponential utility function \(U(w)=-e^{-\gamma w}\). The optimal strategies under our setup with the exponential utility function are given as follows:
$$\begin{aligned} \hat{{\varvec{\pi }}}^*_{TY}(t)= \left( \begin{array}{c} \frac{1}{\gamma (\eta ^2\sigma ^2)}\left\{ [k(\theta x)+\mu +\frac{\eta ^2}{2}+\rho \sigma \eta ][k(Tt)+1]+ \frac{k^2(Tt)^2(\eta ^22\sigma ^2)}{4}\right\} \\ \frac{\mu }{\gamma \sigma ^2}\frac{k(Tt)[k(\theta x)+\mu +\frac{\eta ^2}{2}+\rho \sigma \eta ]}{\gamma (\eta ^2\sigma ^2)} \frac{k^2(Tt)^2(\eta ^22\sigma ^2)}{4\gamma (\eta ^2\sigma ^2)} \end{array} \right) \end{aligned}$$when \(r=0\). For investors with MV preference when \(r=0\), the optimal strategies are given as follows:
$$\begin{aligned}&\hat{{\varvec{\pi }}}^*(t)=\frac{1}{2\lambda (1\rho ^2)\eta ^2}\\&\left( \begin{array}{c} [2k(\theta x)+\rho \sigma \eta +\frac{\eta ^2}{2}\frac{\rho \eta \mu }{\sigma }][k(Tt)+1] +k^2(Tt)^2(\rho \sigma \eta +\frac{\eta ^2}{2})k(\theta x)\\ k(Tt)[2k(\theta x)+\rho \sigma \eta +\frac{\eta ^2}{2}\frac{\rho \eta \mu }{\sigma }]k^2(Tt)^2(\rho \sigma \eta +\frac{\eta ^2}{2}) +N \end{array} \right) , \end{aligned}$$where
$$\begin{aligned} N=\frac{(\eta ^2+\rho \eta \sigma )\mu }{\sigma ^2}\frac{\sigma ^2+\rho \eta }{\sigma }k(\theta x)\frac{\eta (\sigma +\rho \eta )(2\rho \sigma +\eta )}{2\sigma }. \end{aligned}$$ The optimal strategies for investors with different preferences thus differ markedly from each other.
Mudchanatongsuk et al. (2008) consider expected power utility investors with “symmetric” positions (the same as Case I in our setting); the optimal results obtained there are also quite different from ours, which are obtained with the MV criterion. Tourin and Yan (2013) investigate expected exponential utility investors with “asymmetric” positions (the same as Case II in our setting) allocated to each risky asset. The results above demonstrate the differences between their optimal strategies and ours. In summary, market participants with different preferences behave heterogeneously. Furthermore, it is unclear whether the properties discussed in Remarks 1 and 2 would still hold for the optimal solutions obtained by Mudchanatongsuk et al. (2008) and Tourin and Yan (2013).
4 Empirical experiments
In this section, some examples of stocks and futures are presented to illustrate our results. From a number of stocks traded on the Chinese securities market, we selected three correlated pairs from different industries, with the sample period from 31 December 2012 to 31 March 2016 (3.25 years): Huatai Securities Co., Ltd and Haitong Securities Co., Ltd; Qiming Information Technology Co., Ltd and YGSoft Co., Ltd; Shanghai Pudong Development Bank and China Merchants Bank. The data are obtained from the Flush software and only trading-day data are given. This results in a total of 787 sample observations. The futures pair considered, over the sample period from 1 February 2016 to 31 August 2016, is au1612 and au1702. In both cases, daily closing prices are employed. By applying the calibration method illustrated in Mudchanatongsuk et al. (2008), the related parameters are estimated with the selected training datasets. For details about the analytical formulas for the parameter estimates, please refer to the “Appendix” of Mudchanatongsuk et al. (2008).
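As an illustration of how the spread parameters can be estimated from daily data, the sketch below fits \((k,\theta ,\eta )\) of an OU process through its exact AR(1) discretisation, for which conditional Gaussian maximum likelihood coincides with least squares. This is a standard textbook estimator and is not necessarily identical to the closed-form estimates in Mudchanatongsuk et al. (2008); the function name `fit_ou` and the choice of `dt` are hypothetical.

```python
import numpy as np

def fit_ou(x, dt):
    """Estimate (k, theta, eta) of dX = k (theta - X) dt + eta dW.

    Uses the exact discretisation X[i+1] = c + phi * X[i] + eps with
    phi = exp(-k dt), c = theta (1 - phi) and
    Var(eps) = eta^2 (1 - phi^2) / (2 k).
    """
    x = np.asarray(x, dtype=float)
    x_prev, x_next = x[:-1], x[1:]
    # Least-squares slope and intercept of X[i+1] on X[i].
    phi, c = np.polyfit(x_prev, x_next, 1)
    if not 0.0 < phi < 1.0:
        raise ValueError("no mean reversion detected (phi outside (0, 1))")
    resid = x_next - (c + phi * x_prev)
    k = -np.log(phi) / dt
    theta = c / (1.0 - phi)
    # The Gaussian MLE of the residual variance divides by n (no ddof).
    eta = np.sqrt(np.mean(resid**2) * 2.0 * k / (1.0 - phi**2))
    return k, theta, eta
```

With daily closing prices, one would pass the observed spread series and a step such as `dt = 1 / 252` (an assumed day-count convention). The drift and volatility of B and the correlation \(\rho \) would need separate estimates, which this sketch does not cover.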
Now we focus on the three pairs of stocks. Figures 1, 3 and 5 present the dynamics of the pairs of stock prices, which show that the three price pairs converge at some time points. For illustration, we assume the interest rate r and the risk coefficient \(\lambda \) to be \(5\%\) and \(-1.5\) respectively. By using the moving-window method, we conduct out-of-sample testing for all stock datasets. We investigate the log-returns of our pairs trading strategies from 02 January 2014 to 31 March 2016 (2.25 years) and update the parameters on each trading day during this period. Specifically, we estimate the related parameters for each trading day by using the data of the previous year, and update them accordingly. One sample path of investors’ wealth obtained from the time-consistent pairs trading strategies in cases I and II (\(V^*(\cdot )\) and \(\hat{V}^*(\cdot )\) respectively) with an initial endowment of 100 units is presented in each of Figs. 2, 4 and 6, where the blue lines represent the wealth dynamics obtained by applying the purely-buy-and-sell-securities strategy (with strict constraints), i.e. case I. The red lines represent the wealth dynamics obtained by applying the trading strategy with relaxed constraints, i.e. case II. Figures 2, 4 and 6 indicate the effectiveness of our strategies by comparing them with the wealth dynamics (yellow lines) obtained using conservative investment strategies, which place all endowments in bank accounts. All three figures show that the asymmetric strategies always dominate the symmetric ones. This phenomenon is reasonable, because the strategies in case II are more flexible. Moreover, since our model is asymmetric in the two assets, different assignments of the risky assets to A and B in Eq. (3) yield distinct optimal results. The optimal wealths obtained with alternative choices of A and B are presented in the “Appendix”. Investors may use the maximum likelihood estimation method to determine the configuration of the risky asset pairs.
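The moving-window procedure described above, in which the parameters used on a given trading day are estimated only from the preceding window of observations, can be sketched as follows. The helper `rolling_estimates`, its arguments, and the idea of passing the estimator as a callable are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def rolling_estimates(series, window, estimator):
    """Re-estimate parameters on every day from the previous `window` points.

    For each index t >= window, `estimator` sees series[t - window:t] only,
    so the value observed on day t itself is never used (out-of-sample).
    Returns a dict mapping t to whatever `estimator` returns.
    """
    series = np.asarray(series, dtype=float)
    if window <= 0 or window >= len(series):
        raise ValueError("window must be positive and shorter than the series")
    estimates = {}
    for t in range(window, len(series)):
        estimates[t] = estimator(series[t - window:t])
    return estimates
```

In the setting of this section, `series` would be the daily spread, `window` roughly the number of trading days in one year, and `estimator` a calibration routine for the OU parameters; all three choices are left to the user here.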
For a deeper investigation of these experiments, we simulated the scenarios 1000 times, and the statistical results of the investors’ annual log-returns are shown in Table 1. In this table, S.D. stands for standard deviation. Table 1 indicates that for each pair of selected stocks, the mean of the annual yield (log-returns) under relaxed constraints dominates the respective result under strict constraints. This phenomenon is consistent with the results shown in Figs. 2, 4 and 6.
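Summary statistics of this kind can be computed from simulated terminal wealths along the following lines. The sketch assumes that the annual log-return of a path is \(\ln (V(T)/V(0))\) divided by the horizon in years, which is one common convention and may differ from the one used for Table 1; the function name is hypothetical.

```python
import numpy as np

def annual_log_return_stats(terminal_wealth, initial_wealth, years):
    """Mean and sample standard deviation of annualised log-returns.

    Assumption: annual log-return = ln(V_T / V_0) / years for each path.
    """
    w = np.asarray(terminal_wealth, dtype=float)
    if np.any(w <= 0.0):
        raise ValueError("log-returns are undefined for non-positive wealth")
    log_returns = np.log(w / initial_wealth) / years
    # ddof=1 gives the sample (unbiased-variance) standard deviation.
    return log_returns.mean(), log_returns.std(ddof=1)
```

Here `terminal_wealth` would hold one value per simulated scenario, for example 1000 entries, with `initial_wealth = 100` matching the initial endowment used in the figures.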
Now, we examine the corresponding results for the selected pair of futures. By setting \(r=5\%\) and \(\lambda =-1.5\), we provide the parameter estimates using the datasets in the period from 1 February 2016 to 31 May 2016. The price dynamics of the two futures are depicted in Fig. 7. Subsequently, we investigate the wealth dynamics using the time-consistent pairs trading strategies in cases I and II (\(V^*(\cdot )\) and \(\hat{V}^*(\cdot )\) respectively) and the conservative strategy with an initial endowment of 100 units from 1 June 2016 to 31 August 2016 (Fig. 8). Due to the short testing period (1 June to 31 August 2016), we did not update the parameters. The wealth dynamics of the three strategies in Fig. 8 show that the results of this example are in agreement with those for the stock pairs. Table 2 reports the log-returns of investors with different risk parameters \(\lambda \) during the testing period (with 1000 simulations). We notice that the mean of the log-returns decreases as \(\lambda \) decreases. This is reasonable, because when the risk parameter \(\lambda \) decreases, the investor becomes more risk averse. This may result in lower expected profits.
Thus, the obtained results show exceptional performance of the strategies. The following implicit assumptions may explain this phenomenon. First, the liquidity of the strategies, especially for shorting assets, is assumed to be quite high. Second, we ignore the related transaction costs. Third, the pairs that we have chosen exhibit strong convergence trends, while short-run arbitrage opportunities do not always exist in reality.
5 Conclusion
This study provides analytical equilibrium control strategies for the optimal MV problem of pairs trading. Specifically, we assume that the price spread of a pair of correlated risky securities follows a mean-reverting OU process. Explicit time-consistent results are derived by solving optimization problems using the dynamic programming approach, and we examine the explicit solutions using selected stocks and futures traded on China’s securities market. The numerical experiments indicate that our pairs trading strategies yield an annual profit with a modest standard deviation.
In this work, we mainly focus on exploring optimal strategies in an ideal market. In reality, however, fund trading faces many constraints, for instance, limitations on short selling, regulatory constraints, and other market regulations. Furthermore, funds are always confronted by liquidity and funding risks. Adapting our proposed strategies to these issues is a potential direction for future research.
References
Basak S, Chabakauri G (2010) Dynamic mean-variance asset allocation. Rev Financ Stud 23(8):2970–3016
Bielecki TR, Jin H, Pliska SR, Zhou XY (2005) Continuous-time mean-variance portfolio selection with bankruptcy prohibition. Math Financ 15(2):213–244
Björk T, Murgoci A (2010) A general theory of Markovian time-inconsistent stochastic control problems. Working paper
Björk T, Murgoci A, Zhou XY (2014) Mean-variance portfolio optimization with state-dependent risk aversion. Math Financ 24(1):1–24
Chiu M, Wong H (2013) Optimal investment for an insurer with cointegrated assets: CRRA utility. Insur Math Econ 52(1):52–64
Chiu M, Wong H (2015) Dynamic cointegrated pairs trading: mean-variance time-consistent strategies. J Comput Appl Math 290:516–534
Cui X, Li X, Li D, Shi Y (2017) Time consistent behavioral portfolio policy for dynamic mean-variance formulation. J Oper Res Soc 68(12):1647–1660
Dang D, Forsyth P (2016) Better than pre-commitment mean-variance portfolio allocation strategies: a semi-self-financing Hamilton-Jacobi-Bellman equation approach. Eur J Oper Res 250(3):827–841
Do B, Faff R, Hamza K (2006) A new approach to modeling and estimation for pairs trading. In: Proceedings of 2006 Financial Management Association European Conference, Stockholm
Elliott R, Van Der Hoek J, Malcom W (2005) Pairs trading. Quant Finance 5(3):271–276
Gatev E, Goetzmann W, Rouwenhorst K (2006) Pairs trading: performance of a relative average arbitrage rule. Rev Financ Stud 19:797–827
Gu J, Steffensen M (2015) Optimal portfolio liquidation and dynamic meanvariance criterion (November 9, 2015). https://ssrn.com/abstract=2687999 or http://dx.doi.org/10.2139/ssrn.2687999
Gu JW, Si SJ, Zheng H (2020) Constrained utility deviation-risk optimization and time-consistent HJB equation. SIAM J Control Optim 58:866–894
Göncü A, Akyildirim E (2016) A stochastic model for commodity pairs trading. Quant Finance 16(12):1843–1857
Huang Y, Nguyen-Huu A (2018) Time-consistent stopping under decreasing impatience. Finance Stochast 22(1):69–95
Karatzas I, Shreve SE (2012) Brownian motion and stochastic calculus. Springer, Berlin
Krusell P, Smith AA (2003) Consumption and savings decisions with quasi-geometric discounting. Econometrica 71(1):365–375
Kryger EM, Steffensen M (2010) Some solvable portfolio problems with quadratic and collective objectives. Working paper https://ssrn.com/abstract=1577265
Li D, Ng W (2000) Optimal dynamic portfolio selection: multiperiod mean-variance formulation. Math Financ 10:387–406
Lin Y, Mccrae M, Gulati CM (2006) Loss protection in pairs trading through minimum profit bounds: a cointegration approach. J Appl Math Decis Sci 1:14
Liu J, Timmermann A (2013) Optimal convergence trade strategies. Rev Financ Stud 26(4):1048–1086
Markowitz H (1952) Portfolio selection. J Finance 7(1):77–91
Merton RC (1969) Lifetime portfolio selection under uncertainty: the continuous-time case. Rev Econ Stat 51(3):247–257
Merton RC (1971) Optimum consumption and portfolio rules in a continuous-time model. J Econ Theory 3(4):373–413
Mudchanatongsuk S, Primbs JA, Wong WH (2008) Optimal pairs trading: a stochastic control approach. In: American Control Conference, pp 1035–1039
Nath P (2003) High frequency pairs trading with US treasury securities: risks and rewards for hedge funds. Available at SSRN: http://ssrn.com/abstract=565441
Reverre S (2001) Complete arbitrage deskbook. McGraw Hill, New York
Sperling A, Siu TK (2018) A Markov-switching model with general innovations for pairs trading. Unpublished Manuscript
Song Q, Zhang Q (2013) An optimal pairs-trading rule. Automatica 49:3007–3014
Strotz RH (1955) Myopia and inconsistency in dynamic utility maximization. Rev Econ Stud 23(3):165–180
Tourin A, Yan R (2013) Dynamic pairs trading using the stochastic control approach. J Econ Dyn Control 37(10):1972–1981
Vidyamurthy G (2004) Pairs trading: quantitative methods and analysis. Wiley, New York
Wang H, Zhou XY (2020) Continuous-time mean-variance portfolio selection: a reinforcement learning framework. Math Financ 30(4):1273–1308
Weiss N (2005) A course in probability. Addison-Wesley, Boston
Whistler M (2004) Trading pairs: capturing profits and hedging risk with statistical arbitrage strategies. Wiley, New York
Acknowledgements
We would like to express our gratitude to the Editor, Associate Editor and anonymous referees for their thorough reviews and their helpful comments and suggestions. This research work was supported by the Research Grants Council of Hong Kong under Grant Number 17301519, the National Natural Science Foundation of China under Grant Numbers 71601044, 11671158 and 11801262, the Fundamental Research Funds for the Central Universities 2242020S30030, IMR, RAE Research Fund, Faculty of Science, Seed Funding for Basic Research, The University of Hong Kong and Seed Funding of the HKU-TCL Joint Research Centre for Artificial Intelligence.
Appendix
Proof of Proposition 1
Proof
By the law of total variance (e.g., Weiss 2005),
Substituting the above equation into the value function J gives:
Since \(E_t(V^*(T))=E_t(E_{t+\tau }(V^*(T)))\), we have:
This implies that as \(\tau \) becomes small,
By substituting Eq. (9) into Eq. (11),
where c(x, t) represents the sum of the second and third terms in the above equation for convenience.
By Eq. (9),
Define \({f(X(t),t)} := E_t(V^*(T))-e^{r(T-t)}V(t)\), which is the expected gain or loss of the investor over the horizon \(T-t\) under the time-consistent control. Then
which is the same as \(E_t(V^*(T))-e^{r(T-t)}V(t)\).
Eq. (18) becomes:
subject to \(J_T=V(T)\) and the constraint Eq. (7).
By Basak and Chabakauri (2010), f is a function of x and t only. By applying Itô’s lemma and the Feynman-Kac Theorem (Theorem 7.6, Karatzas and Shreve (2012)), Eq. (20) gives:
where \(\mathcal{{D}}c\) denotes the Dynkin operator on the function c(x, t), and it is defined as follows:
We obtain that
Applying the Feynman-Kac theorem to f gives:
Substituting \(\pi ^*(t)\) into Eq. (23) gives:
We have an ansatz for f:
With Eq. (24), we obtain a system of ODEs:
Solving the system of ODEs in Eq. (25), we obtain that
Since \(f(X(T),T)=0\), we can also solve the unknown constants as follows:
Substituting f into Eq. (22) yields the reported result.
Proof of Proposition 3
Proof
Similar to the proof of Proposition 1, we have:
Define \({\hat{f}(X(t),t)}:=E_t(\hat{V}^*(T))-e^{r(T-t)}V(t)\) and by Eq. (14):
By combining Eq. (14) and Eq. (16), the value function \(\hat{J}\) can be separated as \({\hat{J}(t,X(t),V(t))}=e^{r(T-t)}V(t)+\hat{c}({X(t)},t)\) (see Basak and Chabakauri 2010). Applying the above equations and the Feynman-Kac Theorem (Theorem 7.6, Karatzas and Shreve (2012)), the recursive equation (27) becomes:
Notice that the objective function can be written as the following quadratic form:
where
Define:
The objective function (29) is equivalent to:
Since Q is a symmetric positive definite matrix, this is a convex optimization problem and the optimal solution is given by \(\hat{{\varvec{\pi }}}^*(t)=Q^{-1}{\varvec{b}}\). By applying the Feynman-Kac theorem to \(\hat{f}\), we have:
As before, we have an ansatz for \(\hat{f}\):
For notational convenience, in what follows, we denote \(\tilde{A}=\mu +\frac{1}{2}\eta ^2+\rho \sigma \eta -r\).
where
Let
Substituting \(Q^{-1}\) into the above equations gives:
Then Eq. (33) can be written as follows:
By solving the above system of differential equations, we have:
Consequently,
where g is given by:
and \(\tilde{A}=\mu -r+\rho \sigma \eta +\frac{\eta ^2}{2}\). The solution follows accordingly.
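The quadratic step in the proof above relies on a generic fact: for a symmetric positive definite matrix Q, the unique minimiser of \(\frac{1}{2}{\varvec{\pi }}^{\top }Q{\varvec{\pi }}-{\varvec{b}}^{\top }{\varvec{\pi }}\) is \(Q^{-1}{\varvec{b}}\). The sketch below illustrates this numerically. The exact objective of the paper is not reproduced here, so the generic quadratic form, the function name `solve_quadratic`, and the numerical values in the usage note are assumptions made purely for illustration.

```python
import numpy as np

def solve_quadratic(Q, b):
    """Return the minimiser of 0.5 * p' Q p - b' p for symmetric positive definite Q.

    The first-order condition Q p = b is solved through a Cholesky
    factorisation, which raises LinAlgError if Q is not positive definite.
    """
    Q = np.asarray(Q, dtype=float)
    b = np.asarray(b, dtype=float)
    # np.linalg.cholesky reads only one triangle, so check symmetry explicitly.
    if not np.allclose(Q, Q.T):
        raise ValueError("Q must be symmetric")
    lower = np.linalg.cholesky(Q)       # Q = lower @ lower.T
    y = np.linalg.solve(lower, b)       # forward solve: lower y = b
    return np.linalg.solve(lower.T, y)  # backward solve: lower.T p = y
```

For example, with the made-up inputs `Q = [[2.0, 0.5], [0.5, 1.0]]` and `b = [1.0, -1.0]`, the returned vector satisfies `Q @ p = b` up to rounding, and any perturbation of it increases the quadratic objective.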
Alternative choices of A and B
Cite this article
Zhu, DM., Gu, JW., Yu, FH. et al. Optimal pairs trading with dynamic mean-variance objective. Math Meth Oper Res 94, 145–168 (2021). https://doi.org/10.1007/s00186-021-00751-z
Keywords
 Dynamic mean-variance (MV)
 Ornstein-Uhlenbeck (OU)
 Pairs trading
 Time inconsistency