Optimization of relative arbitrage


In stochastic portfolio theory, a relative arbitrage is an equity portfolio which is guaranteed to outperform a benchmark portfolio over a finite horizon. When the market is diverse and sufficiently volatile, and the benchmark is the market or a buy-and-hold portfolio, functionally generated portfolios introduced by Fernholz provide a systematic way of constructing relative arbitrages. In this paper we show that if the market portfolio is replaced by the equal-weighted or entropy-weighted portfolio, among many others, no relative arbitrages can be constructed under the same conditions using functionally generated portfolios. We also introduce and study a shape-constrained optimization problem for functionally generated portfolios in the spirit of maximum likelihood estimation of a log-concave density.


References

  • Amari, S.I., Cichocki, A.: Information geometry of divergence functions. Bull Polish Acad Sci Tech Sci 58(1), 183–195 (2010)
  • Banner, A.D., Fernholz, D.: Short-term relative arbitrage in volatility-stabilized markets. Ann Finance 4(4), 445–454 (2008)
  • Billingsley, P.: Convergence of Probability Measures, vol. 493. Wiley, New Jersey (2009)
  • Bouchey, P., Nemtchinov, V., Paulsen, A., Stein, D.M.: Volatility harvesting: why does diversifying and rebalancing create portfolio growth? J Wealth Manag 15(2), 26–35 (2012)
  • Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)
  • Chuaqui, M., Duren, P., Osgood, B.: Schwarzian derivative criteria for valence of analytic and harmonic mappings. In: Mathematical Proceedings of the Cambridge Philosophical Society, vol. 143, pp. 473–486. Cambridge University Press, Cambridge (2007)
  • Chuaqui, M., Duren, P., Osgood, B., Stowe, D.: Oscillation of solutions of linear differential equations. Bull Aust Math Soc 79(01), 161–169 (2009)
  • Cule, M., Samworth, R.: Theoretical properties of the log-concave maximum likelihood estimator of a multidimensional density. Elect J Stat 4, 254–270 (2010)
  • Cule, M., Samworth, R., Stewart, M.: Maximum likelihood estimation of a multi-dimensional log-concave density. J R Stat Soc Ser B (Stat Methodol) 72(5), 545–607 (2010)
  • DeMiguel, V., Garlappi, L., Uppal, R.: Optimal versus naive diversification: How inefficient is the 1/n portfolio strategy? Rev Financ Stud 22(5), 1915–1953 (2009)
  • Dümbgen, L., Rufibach, K.: Maximum likelihood estimation of a log-concave density and its distribution function: basic properties and uniform consistency. Bernoulli 15(1), 40–68 (2009)
  • Emery, M., Meyer, P.A.: Stochastic Calculus in Manifolds. Springer, New York (1989)
  • Fernholz, D., Karatzas, I.: On optimal arbitrage. Ann Appl Probab 20(4), 1179–1204 (2010)
  • Fernholz, D., Karatzas, I.: Optimal arbitrage under model uncertainty. Ann Appl Probab 21(6), 2191–2225 (2011)
  • Fernholz, E.R.: Stochastic Portfolio Theory. Applications of Mathematics. Springer, New York (2002)
  • Fernholz, E.R., Karatzas, I.: Stochastic portfolio theory: an overview. In: Ciarlet, P.G. (ed.) Handbook of Numerical Analysis, vol. 15, pp. 89–167. Elsevier, Amsterdam (2009)
  • Fernholz, R.: Portfolio Generating Functions. Quantitative Analysis in Financial Markets, River Edge. World Scientific, Singapore (1999)
  • Fernholz, R., Garvy, R., Hannon, J.: Diversity-weighted indexing. J Portf Manag 24(2), 74–82 (1998)
  • Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I: Fundamentals, vol. 305. Springer, New York (1996)
  • Hsu, J.C., Chow, T.m., Kalesnik, V., Little, B.: A survey of alternative equity index strategies. Financ Anal J 67(5), 37–57 (2011)
  • Koenker, R., Mizera, I.: Quasi-concave density estimation. Ann Stat 38(5), 2998–3027 (2010)
  • Lang, R.: A note on the measurability of convex sets. Arch Math 47(1), 90–92 (1986)
  • Pal, S., Wong, T.K.L.: Energy, entropy, and arbitrage. ArXiv e-prints (1308.5376) (2013)
  • Pal, S., Wong, T.K.L.: The geometry of relative arbitrage. ArXiv e-prints (1402.3720v4) (2014)
  • Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1997)
  • Rockafellar, R.T., Wets, R.J.B.: Variational analysis. Grundlehren der Mathematischen Wissenschaften. Springer, New York (1998)
  • Ruf, J.: Optimal Trading Strategies Under Arbitrage. Ph.D. thesis, Columbia University (2011)
  • Seregin, A., Wellner, J.A.: Nonparametric estimation of multivariate convex-transformed densities. Ann Stat 38(6), 3751 (2010)
  • Strong, W.: Generalizations of functionally generated portfolios with applications to statistical arbitrage. ArXiv e-prints (1212.1877) (2012)


Acknowledgements

The author would like to thank Soumik Pal for his constant guidance and support during the preparation of the paper, Tatiana Toro for helpful discussions about the proof of Theorem 1, and Jiashan Wang for help with numerical optimization. He also thanks the anonymous referee who spotted an error in the original definition of the support condition and suggested the current definition. The referee’s valuable comments greatly improved the presentation of the paper.

Author information


Corresponding author

Correspondence to Ting-Kam Leonard Wong.

Appendix: Proofs of Theorems 4 and 5

First we will state and prove some lemmas from convex analysis.

Lemma 10

Let \(p_0 \in \varDelta ^{(n)}\) be fixed and let \({\mathcal C}_0\) be the collection of positive concave functions \(\varPhi \) on \(\varDelta ^{(n)}\) satisfying \(\varPhi (p_0) = 1\). Then any sequence in \({\mathcal C}_0\) has a subsequence which converges locally uniformly on \(\varDelta ^{(n)}\) to a function in \({\mathcal C}_0\).


Proof

By (Rockafellar 1997, Theorem 10.9), it suffices to prove that \({\mathcal C}_0\) has a uniform upper bound (the lower bound is immediate since functions in \({\mathcal C}_0\) are non-negative). We first derive an upper bound in the one-dimensional case. Let \(f\) be a non-negative concave function on the real interval \([a, b]\). Let \(x_0 \in (a, b)\) and suppose \(f(x_0) = 1\). Let \(x \in [a, x_0]\) and write \(x_0 = \uplambda x + (1 - \uplambda )b\) for some \(\uplambda \in [0, 1]\). By concavity,

$$\begin{aligned} 1 = f(x_0) \ge \uplambda f(x) + (1 - \uplambda ) f(b) \ge \uplambda f(x). \end{aligned}$$

Hence

$$\begin{aligned} f(x) \le \frac{1}{\uplambda } = \frac{b - x}{b - x_0} \le \frac{b - a}{b - x_0}, \quad x \in [a, x_0]. \end{aligned}$$

The case \(x \in [x_0, b]\) can be handled similarly, and we get

$$\begin{aligned} f(x) \le \frac{b - a}{\min \{|x_0 - a|, |x_0 - b|\}}, \quad x \in [a, b]. \end{aligned} \tag{45}$$

Now let \(\varPhi \in {\mathcal C}_0\). Applying (45) to the restrictions of \(\varPhi \) to line segments in \(\varDelta ^{(n)}\) containing \(p_0\), we get

$$\begin{aligned} \varPhi (p) \le \frac{\text {diam}\left( \varDelta ^{(n)}\right) }{\text {dist}\left( p_0, \partial \varDelta ^{(n)}\right) }, \quad p \in \varDelta ^{(n)}, \end{aligned}$$

where \(\text {diam}(\varDelta ^{(n)})\) is the diameter of \(\varDelta ^{(n)}\) and \(\text {dist}(p_0, \partial \varDelta ^{(n)})\) is the distance from \(p_0\) to the boundary of \(\varDelta ^{(n)}\). This completes the proof of the lemma. \(\square \)
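As a numerical sanity check of the one-dimensional bound in the proof above, the following sketch builds non-negative concave functions on \([a, b]\) as pointwise minima of affine functions, rescales them so that \(f(x_0) = 1\), and compares them with \((b - a)/\min \{|x_0 - a|, |x_0 - b|\}\). The function names, the random construction and the sample sizes are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def random_concave(rng, a, b, n_pieces=5):
    """Return a non-negative concave function on [a, b] (a minimum of affine pieces).

    Each piece is the line through (a, ya) and (b, yb) with ya, yb >= 0,
    so every piece, and hence their minimum, is non-negative on [a, b].
    """
    ya = rng.uniform(0.0, 2.0, size=n_pieces)
    yb = rng.uniform(0.0, 2.0, size=n_pieces)

    def f(x):
        x = np.asarray(x, dtype=float)
        t = (x[..., None] - a) / (b - a)
        return np.min((1.0 - t) * ya + t * yb, axis=-1)

    return f

def check_bound(seed=0, trials=200, a=0.0, b=1.0):
    """Largest observed ratio f(x) / bound over random concave f with f(x0) = 1."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(a, b, 501)
    worst = 0.0
    for _ in range(trials):
        f = random_concave(rng, a, b)
        x0 = rng.uniform(a + 0.05, b - 0.05)
        scale = f(x0)
        if scale <= 1e-12:
            continue  # cannot normalize a function that vanishes at x0
        values = f(grid) / scale          # rescaled so that the value at x0 is 1
        bound = (b - a) / min(x0 - a, b - x0)
        worst = max(worst, float(values.max() / bound))
    return worst

worst_ratio = check_bound()
```

If the bound holds, `worst_ratio` should never exceed 1, up to floating-point error.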

Lemma 11

Let \((\pi , \varPhi ), (\pi ^{(k)}, \varPhi ^{(k)}) \in {\mathcal {FG}}\), \(k \ge 1\). Suppose \(\varPhi ^{(k)}\) converges locally uniformly on \(\varDelta ^{(n)}\) to \(\varPhi \). Let \(p \in \varDelta ^{(n)}\) be a point at which \(\varPhi \) is differentiable. Then given \(\varepsilon > 0\), there exists \(\delta > 0\) and a positive integer \(k_0\) such that \(\Vert \pi ^{(k)}(q) - \pi (p)\Vert < \varepsilon \) whenever \(k \ge k_0\) and \(q \in B(p, \delta )\). In particular, \(\pi ^{(k)}\) converges \(m\)-almost everywhere to \(\pi \) as \(k \rightarrow \infty \).


Proof

It is clear that \(\log \varPhi ^{(k)}\) also converges locally uniformly to \(\log \varPhi \). We will use a well-known convergence result for the superdifferentials of concave functions, see (Hiriart-Urruty and Lemaréchal 1996, Theorem 6.2.7). Indeed, the proof of (Hiriart-Urruty and Lemaréchal 1996, Theorem 6.2.7) implies a slightly stronger statement than the theorem. Namely, for any \(p \in \varDelta ^{(n)}\) and any \(\varepsilon > 0\), there exist a positive integer \(k_0\) and \(\delta > 0\) such that

$$\begin{aligned} \begin{aligned} \partial \log \varPhi ^{(k)}(q)&\subset \partial \log \varPhi (p) + B(0, \varepsilon ), \quad k \ge k_0, \quad q \in B(p, \delta ), \\ \partial \log \varPhi (q)&\subset \partial \log \varPhi (p) + B(0, \varepsilon ), \quad q \in B(p, \delta ). \end{aligned} \end{aligned} \tag{46}$$

Suppose \(\varPhi \) is differentiable at \(p\). Then \(\partial \log \varPhi (p)\) is a singleton. By Lemma 2, there are measurable selections \(\xi ^{(k)}\) and \(\xi \) of \(\partial \log \varPhi ^{(k)}\) and \(\partial \log \varPhi \) respectively such that

$$\begin{aligned} \begin{aligned} \pi ^{(k)}_i(q)&= q_i \left( \xi ^{(k)}_i(q) + 1 - \sum _{j = 1}^n q_j\xi ^{(k)}_j(q)\right) , \\ \pi _i(q)&= q_i \left( \xi _i(q) + 1 - \sum _{j = 1}^n q_j\xi _j(q)\right) , \\ \end{aligned} \end{aligned}$$

for all \(q \in \varDelta ^{(n)}\), \(i = 1, \ldots , n\), and \(k \ge 1\).

For each \(i = 1, \ldots , n\), consider the map \(G_i\) defined by

$$\begin{aligned} (q, \xi ) \in \varDelta ^{(n)} \times T\varDelta ^{(n)} \mapsto q_i\left( \xi _i + 1 - \sum _{j = 1}^n q_j \xi _j\right) . \end{aligned}$$

The map \(G = (G_1, \ldots , G_n)\) is clearly jointly continuous. We have \(\pi (q) = G(q, \xi (q))\) and \(\pi ^{(k)}(q) = G(q, \xi ^{(k)}(q))\).
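The map \(G\) is simple to evaluate numerically. The sketch below is an illustration under stated assumptions: it implements \(G\) and applies it to the gradient of \(\log \varPhi \) for the geometric mean \(\varPhi (p) = (p_1 \cdots p_n)^{1/n}\), which is known to generate the equal-weighted portfolio. The function names are hypothetical, and the gradient is taken in the ambient space \({\mathbb {R}}^n\) rather than in the tangent space; this does not change the output, because adding a constant to every component of \(\xi \) leaves \(G(q, \xi )\) unchanged when \(\sum _j q_j = 1\).

```python
import numpy as np

def portfolio_from_gradient(q, xi):
    """Evaluate G(q, xi)_i = q_i * (xi_i + 1 - sum_j q_j * xi_j)."""
    q = np.asarray(q, dtype=float)
    xi = np.asarray(xi, dtype=float)
    return q * (xi + 1.0 - np.dot(q, xi))

def grad_log_geometric_mean(q):
    """Ambient gradient of log Phi for Phi(q) = (q_1 * ... * q_n)^(1/n)."""
    q = np.asarray(q, dtype=float)
    return 1.0 / (q.size * q)

q = np.array([0.5, 0.3, 0.2])          # an arbitrary point of the open simplex
weights = portfolio_from_gradient(q, grad_log_geometric_mean(q))
```

Here `weights` comes out as the equal-weighted vector, and the components of \(G(q, \xi )\) sum to one for any \(\xi \).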

By (46), for any \(\varepsilon > 0\), there exists \(k_0\) and \(\delta > 0\) such that

$$\begin{aligned} \Vert \xi ^{(k)}(q) - \xi (p)\Vert < \varepsilon , \quad \Vert \xi (q) - \xi (p)\Vert < \varepsilon \end{aligned} \tag{47}$$

for all \(k \ge k_0\) and \(q \in B(p, \delta )\). The claim of the lemma follows from (47) and the joint continuity of \(G\) at \((p, \xi (p))\). The last statement follows since a finite concave function on \(\varDelta ^{(n)}\) is differentiable \(m\)-almost everywhere (Rockafellar 1997, Theorem 25.5). \(\square \)

Proof of Theorem 4

(i) The existence of an optimal solution will be proved by a compactness argument. Suppose \(\{(\pi ^{(k)}, \varPhi ^{(k)})\}\) is a maximizing sequence for (33). By scaling, we may assume \(\varPhi ^{(k)}(p_0) = 1\) where \(p_0 \in \varDelta ^{(n)}\) is fixed. By Lemma 10, we may replace it by a subsequence such that \(\varPhi ^{(k)}\) converges locally uniformly on \(\varDelta ^{(n)}\) to a positive concave function \(\varPhi \) on \(\varDelta ^{(n)}\). By Lemma 2(ii), \(\varPhi \) generates a portfolio \(\pi \).

Case 1. \({{\mathbb {P}}}\) is absolutely continuous. By Lemma 11, \(\pi ^{(k)}\) converges \(m\)-almost everywhere to \(\pi \). Let \(T^{(k)}\) and \(T\) be the L-divergences of \((\pi ^{(k)}, \varPhi ^{(k)})\) and \((\pi , \varPhi )\) respectively. Recall that \({{\mathbb {P}}}\) is supported on \(K \times K\) where \(K \subset \varDelta ^{(n)}\) is compact. For \(x \in \overline{\varDelta ^{(n)}}\) and \(p, q \in K\), we have

$$\begin{aligned} 1 + \left\langle \frac{x}{p}, q - p \right\rangle = \sum _{i = 1}^n x_i \frac{q_i}{p_i} \le \sum _{i = 1}^n \frac{x_i}{p_i} \le \frac{1}{\min _{p \in K, 1 \le i \le n} p_i}. \end{aligned} \tag{48}$$

Also \(\varPhi ^{(k)} \rightarrow \varPhi \) uniformly on \(K\). Hence the family of L-divergences \(\{T, T^{(1)}, T^{(2)}, \ldots \}\) is uniformly bounded on \(K \times K\). By Lebesgue’s dominated convergence theorem, we have

$$\begin{aligned} \lim _{k \rightarrow \infty } \int T^{(k)}\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}} = \int T\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}. \end{aligned}$$

Thus \((\pi , \varPhi )\) is optimal.
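The uniform bound used in Case 1 can be illustrated numerically. The sketch below assumes that the L-divergence has the form \(T(q \mid p) = \log \left( 1 + \left\langle \pi (p)/p, q - p \right\rangle \right) - \log \left( \varPhi (q)/\varPhi (p)\right) \); this definition belongs to the main text of the paper and is not restated in this appendix, so it should be read as an assumption here. The compact set \(K\) is taken, hypothetically, to be \(\{p : p_i \ge 0.05\}\), and the pair \((\pi , \varPhi )\) is the equal-weighted portfolio with the geometric mean.

```python
import numpy as np

def l_divergence(pi_p, phi, p, q):
    """Assumed form: T(q | p) = log(1 + <pi(p)/p, q - p>) - log(phi(q) / phi(p))."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    growth = 1.0 + np.dot(np.asarray(pi_p, dtype=float) / p, q - p)
    return float(np.log(growth) - np.log(phi(q) / phi(p)))

def geometric_mean(p):
    return float(np.exp(np.mean(np.log(p))))

rng = np.random.default_rng(1)
n, floor = 4, 0.05   # hypothetical choice: K = {p in the simplex : p_i >= floor}

def sample_K():
    """Random point of the simplex whose coordinates are all at least `floor`."""
    return floor + (1.0 - n * floor) * rng.dirichlet(np.ones(n))

equal_weights = np.full(n, 1.0 / n)
pairs = [(sample_K(), sample_K()) for _ in range(1000)]
divergences = [l_divergence(equal_weights, geometric_mean, p, q) for p, q in pairs]
growth_factors = [1.0 + np.dot(equal_weights / p, q - p) for p, q in pairs]
```

On this sample the growth factors stay below \(1/\min p_i \le 20\), and the divergences are non-negative, which for this particular pair is the arithmetic-geometric mean inequality.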

Case 2. \({{\mathbb {P}}}\) is discrete and has masses at \((p(j), q(j))\). Since \(\overline{\varDelta ^{(n)}}\) is compact, by a diagonal argument we can extract a further subsequence (still denoted by \(\{(\pi ^{(k)}, \varPhi ^{(k)})\}\)) such that \(\lim _{k \rightarrow \infty } \pi ^{(k)}(p(j))\) exists for each \(j\). Now we can redefine \(\pi \) on \(\{p(1), p(2), \ldots \}\) such that \(\pi (p(j)) = \lim _{k \rightarrow \infty } \pi ^{(k)}(p(j))\) for each \(j\). Since we only modify \(\pi \) at countably many points, \(\pi \) is still Borel measurable. Now we may apply Lebesgue’s dominated convergence theorem and conclude that \((\pi , \varPhi )\) is optimal.

(ii) Suppose \((\pi ^{(1)}, \varPhi ^{(1)})\) and \((\pi ^{(2)}, \varPhi ^{(2)})\) are optimal solutions. Define \(\pi = \frac{1}{2} \pi ^{(1)} + \frac{1}{2} \pi ^{(2)}\) which is generated by the geometric mean \(\varPhi = \sqrt{\varPhi ^{(1)}\varPhi ^{(2)}}\) (Lemma 8). Also let \(T\), \(T^{(1)}\) and \(T^{(2)}\) be the L-divergences of \((\pi , \varPhi )\), \((\pi ^{(1)}, \varPhi ^{(1)})\) and \((\pi ^{(2)}, \varPhi ^{(2)})\) respectively. By concavity of the L-divergence (Lemma 9), we have

$$\begin{aligned} \int T\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}} \ge \frac{1}{2} \left( \int T^{(1)}\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}} + \int T^{(2)}\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}\right) . \end{aligned} \tag{49}$$

Hence \((\pi , \varPhi )\) is also optimal. It follows from (49) and the strict concavity of the logarithm that

$$\begin{aligned} \left\langle \frac{\pi ^{(1)}(p)}{p}, q - p \right\rangle = \left\langle \frac{\pi ^{(2)}(p)}{p}, q - p \right\rangle \end{aligned}$$

for \({{\mathbb {P}}}\)-almost all \((p, q)\).
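To spell out this step, here is a sketch which assumes that the L-divergence has the form \(T(q \mid p) = \log \left( 1 + \left\langle \pi (p)/p, q - p \right\rangle \right) - \log \left( \varPhi (q)/\varPhi (p)\right) \), as defined in the main text and not restated in this appendix. Write \(a_i = 1 + \left\langle \pi ^{(i)}(p)/p, q - p \right\rangle > 0\) for \(i = 1, 2\) (notation introduced only for this remark). Since \(\pi = \frac{1}{2}\pi ^{(1)} + \frac{1}{2}\pi ^{(2)}\) and \(\log \varPhi = \frac{1}{2}\log \varPhi ^{(1)} + \frac{1}{2}\log \varPhi ^{(2)}\), the terms involving the generating functions cancel and

$$\begin{aligned} T\left( q \mid p \right) - \frac{1}{2}\left( T^{(1)}\left( q \mid p \right) + T^{(2)}\left( q \mid p \right) \right) = \log \frac{a_1 + a_2}{2} - \frac{1}{2}\log a_1 - \frac{1}{2}\log a_2 \ge 0, \end{aligned}$$

with equality if and only if \(a_1 = a_2\), by strict concavity of the logarithm. Because \((\pi , \varPhi )\), \((\pi ^{(1)}, \varPhi ^{(1)})\) and \((\pi ^{(2)}, \varPhi ^{(2)})\) all attain the optimal value, the integral of this non-negative difference is zero, so \(a_1 = a_2\) for \({{\mathbb {P}}}\)-almost all \((p, q)\), which is the displayed identity.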

If \({{\mathbb {P}}}\) is absolutely continuous and satisfies the support condition, then for \(m\)-almost all \(p\) for which \(f(p) > 0\), we have

$$\begin{aligned} \left\langle \frac{\pi ^{(1)}(p)}{p}, v \right\rangle = \left\langle \frac{\pi ^{(2)}(p)}{p}, v \right\rangle \end{aligned}$$

for all tangent vectors \(v\). This and the fact that \(\pi ^{(1)}(p), \pi ^{(2)}(p) \in \overline{\varDelta ^{(n)}}\) imply that \(\pi ^{(1)}(p) = \pi ^{(2)}(p)\) \(m\)-almost everywhere on \(\{p: f(p) > 0\}\). \(\square \)

Proof of Theorem 5

By scaling, we may assume that \({\widehat{\varPhi }}^{(N)}(p_0) = \varPhi (p_0) = 1\) for all \(N \ge 1\). By Lemma 10, any subsequence of \(\{{\widehat{\varPhi }}^{(N)}\}\) has a further subsequence which converges locally uniformly to a positive concave function \({\widehat{\varPhi }}\) on \(\varDelta ^{(n)}\). Replacing \(\{{\widehat{\varPhi }}^{(N)}\}\) by such a convergent subsequence, we may assume that \({\widehat{\varPhi }}^{(N)} \rightarrow {\widehat{\varPhi }}\) locally uniformly on \(\varDelta ^{(n)}\). Let \({{\widehat{\pi }}}\) be any portfolio generated by \({\widehat{\varPhi }}\) (which exists by Lemma 2(ii)). We claim that \(({{\widehat{\pi }}}, {\widehat{\varPhi }})\) is optimal and hence \({{\widehat{\pi }}} = \pi \) \(m\)-almost everywhere on \(\{p: f(p) > 0\}\).

Let \({{\widehat{T}}}^{(N)}\), \({{\widehat{T}}}\) and \(T\) be the L-divergences of \(({{\widehat{\pi }}}^{(N)}, {\widehat{\varPhi }}^{(N)})\), \(({{\widehat{\pi }}}, {\widehat{\varPhi }})\) and \((\pi , \varPhi )\) respectively. By the optimality of \(({{\widehat{\pi }}}^{(N)}, {\widehat{\varPhi }}^{(N)})\) for the measure \({{\mathbb {P}}}_N\), we have

$$\begin{aligned} \int {{\widehat{T}}}^{(N)}\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}_N \ge \int T\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}_N, \quad N \ge 1. \end{aligned} \tag{50}$$

We would like to let \(N \rightarrow \infty \) in (50). The L-divergence \(T\left( q \mid p \right) \) is clearly continuous on \(K \times K\) (note that \(K\) is compact). By the definition of weak convergence, we have

$$\begin{aligned} \lim _{N \rightarrow \infty } \int T\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}_N = \int T\left( q \mid p\right) {\mathrm {d}} {{\mathbb {P}}}. \end{aligned}$$

Suppose we can prove that

$$\begin{aligned} \lim _{N \rightarrow \infty } \int {{\widehat{T}}}^{(N)}\left( q \mid p \right) {\mathrm {d}} {{\mathbb {P}}}_N = \int {{\widehat{T}}}\left( q \mid p\right) {\mathrm {d}} {{\mathbb {P}}}. \end{aligned} \tag{51}$$

Then letting \(N \rightarrow \infty \) in (50), we have

$$\begin{aligned} \int {{\widehat{T}}}\left( q \mid p\right) {\mathrm {d}} {{\mathbb {P}}} \ge \int T\left( q \mid p\right) {\mathrm {d}} {{\mathbb {P}}}, \end{aligned}$$

so \(({{\widehat{\pi }}}, {\widehat{\varPhi }})\) is optimal for the measure \({{\mathbb {P}}}\). Since \({{\mathbb {P}}}\) satisfies the support condition by assumption, by Theorem 4(ii) \({{\widehat{\pi }}}\) and \(\pi \) are equal \(m\)-almost everywhere on \(\{p: f(p) > 0\}\).

Thus we only need to prove (51). Here the technicality lies in the fact that both the integrands and the measures change with \(N\), so standard integral convergence theorems do not apply.

The main idea is to use the local uniform convergence property in Lemma 11 and approximate the integrals in (51) by Riemann sums. Let \(\varepsilon > 0\) be given. We will construct two partitions \(\{A_k\}_{k = 0}^{k_0}\), \(\{B_{\ell }\}_{\ell = 1}^{\ell _0}\) of \(K\), points \(p_k \in A_k\), \(q_{\ell } \in B_{\ell }\) and a positive integer \(N_0\) with the following properties:

  (i) \(A_k \times B_{\ell }\) is a \({{\mathbb {P}}}\)-continuity set, i.e., \({{\mathbb {P}}}(\partial (A_k \times B_{\ell })) = 0\). Thus, by the Portmanteau theorem (see Billingsley 2009), we have

    $$\begin{aligned} \lim _{N \rightarrow \infty } {{\mathbb {P}}}_N(A_k \times B_{\ell }) = {{\mathbb {P}}}(A_k \times B_{\ell }). \end{aligned}$$

    So for \(N \ge N_0\) where \(N_0\) is sufficiently large, we have

    $$\begin{aligned} \left| {{\mathbb {P}}}_N(A_k \times B_{\ell }) - {{\mathbb {P}}}(A_k \times B_{\ell })\right| < \frac{\varepsilon }{k_0\ell _0} \end{aligned}$$

    for all \(k\), \(\ell \).

  (ii) \({{\mathbb {P}}}(A_0 \times K) < \varepsilon \) and \({{\mathbb {P}}}_N(A_0 \times K) < \varepsilon \) for \(N \ge N_0\).

  (iii) For \(N \ge N_0\), \(p \in A_k\), \(q \in B_{\ell }\), \(1 \le k \le k_0\) and \(1 \le \ell \le \ell _0\), we have

    $$\begin{aligned} \left| {{\widehat{T}}}^{(N)}\left( q \mid p \right) - {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) \right| < \varepsilon , \quad \left| {{\widehat{T}}}\left( q \mid p \right) - {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) \right| < \varepsilon . \end{aligned}$$

  (iv) \(\left| \log {\widehat{\varPhi }}^{(N)}(p) - \log {\widehat{\varPhi }}(p) \right| < \varepsilon \) for \(p \in K\) and \(N \ge N_0\). (This is immediate since \({\widehat{\varPhi }}^{(N)}\) converges uniformly to \({\widehat{\varPhi }}\) on \(K\) and \({\widehat{\varPhi }}\) is positive on \(K\).)

Suppose these objects have been constructed. Then for \(N \ge N_0\) we can approximate the integrals as follows. By (ii) and (iii), we have

$$\begin{aligned}&\left| \int {{\widehat{T}}}\left( q \mid p\right) {\mathrm {d}}{{\mathbb {P}}} - \sum _{\ell = 1}^{\ell _0} \sum _{k = 1}^{k_0} {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) {{\mathbb {P}}}(A_k \times B_{\ell })\right| \nonumber \\&\quad \le \left| \int _{A_0 \times K} {{\widehat{T}}}\left( q \mid p\right) {\mathrm {d}}{{\mathbb {P}}} \right| + \sum _{\ell = 1}^{\ell _0} \sum _{k = 1}^{k_0} \int _{A_k \times B_{\ell }} \left| {{\widehat{T}}}\left( q \mid p \right) - {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) \right| {\mathrm {d}} {{\mathbb {P}}} \nonumber \\&\quad \le \varepsilon \max _{p, q \in K} {{\widehat{T}}}\left( q \mid p\right) + \varepsilon . \end{aligned} \tag{52}$$

Similarly, we have

$$\begin{aligned}&\left| \int {{\widehat{T}}}^{(N)}\left( q \mid p\right) {\mathrm {d}}{{\mathbb {P}}}_N - \sum _{\ell = 1}^{\ell _0} \sum _{k = 1}^{k_0} {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) {{\mathbb {P}}}_N(A_k \times B_{\ell }) \right| \nonumber \\&\quad \le \varepsilon \max _{p, q \in K} {{\widehat{T}}}^{(N)}\left( q \mid p\right) + \varepsilon . \end{aligned} \tag{53}$$

By (48) and uniform convergence of \(\{{\widehat{\varPhi }}^{(N)}\}\) on \(K\), we can bound \(\max _{p, q \in K} {{\widehat{T}}}\left( q {\,\mid \,} p\right) \) and \( \max _{p, q \in K} {{\widehat{T}}}^{(N)}\left( q \mid p\right) \) by a constant \(C\). Using (i) and (iii), we get

$$\begin{aligned}&\left| \sum _{k, \ell } {{\widehat{T}}} \left( q_{\ell } \mid p_k \right) {{\mathbb {P}}}_N(A_k \times B_{\ell }) - \sum _{k, \ell } {{\widehat{T}}} \left( q_{\ell } \mid p_k \right) {{\mathbb {P}}}(A_k \times B_{\ell }) \right| \nonumber \\&\quad \le \sum _{k, \ell } {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) \left| {{\mathbb {P}}}_N(A_k \times B_{\ell }) - {{\mathbb {P}}}(A_k \times B_{\ell })\right| \nonumber \\&\quad \le k_0\ell _0C\frac{\varepsilon }{k_0\ell _0} = C\varepsilon . \end{aligned} \tag{54}$$

Combining (52), (53) and (54), we have the estimate

$$\begin{aligned} \left| \int {{\widehat{T}}}^{(N)}\left( q \mid p\right) {\mathrm {d}}{{\mathbb {P}}}_N - \int {{\widehat{T}}}\left( q \mid p\right) {\mathrm {d}}{{\mathbb {P}}}\right| \le (3C + 2)\varepsilon , \quad N \ge N_0, \end{aligned}$$

and so (51) holds.
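For completeness, the constant \(3C + 2\) comes from a three-term triangle inequality. Write \(S_N\) and \(S\) for the Riemann sums \(\sum _{k, \ell } {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) {{\mathbb {P}}}_N(A_k \times B_{\ell })\) and \(\sum _{k, \ell } {{\widehat{T}}}\left( q_{\ell } \mid p_k\right) {{\mathbb {P}}}(A_k \times B_{\ell })\) (notation introduced only for this remark). Then, using the bound \(C\) on both maxima,

$$\begin{aligned} \left| \int {{\widehat{T}}}^{(N)} {\mathrm {d}}{{\mathbb {P}}}_N - \int {{\widehat{T}}} {\mathrm {d}}{{\mathbb {P}}}\right|&\le \left| \int {{\widehat{T}}}^{(N)} {\mathrm {d}}{{\mathbb {P}}}_N - S_N\right| + \left| S_N - S\right| + \left| S - \int {{\widehat{T}}} {\mathrm {d}}{{\mathbb {P}}}\right| \\&\le (C\varepsilon + \varepsilon ) + C\varepsilon + (C\varepsilon + \varepsilon ) = (3C + 2)\varepsilon . \end{aligned}$$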

It remains to construct the sets \(\{A_k\}\), \(\{B_{\ell }\}\), the points \(p_k\), \(q_{\ell }\) and \(N_0\) satisfying (i)-(iv). Before we begin, we note the fact that the boundary of any convex subset of \(\varDelta ^{(n)}\) has \(m\)-measure zero (Lang 1986, Theorem 1). Let \(\varepsilon > 0\) be given. By (Rockafellar 1997, Theorem 10.6), the family \(\{{\widehat{\varPhi }}, {\widehat{\varPhi }}^{(1)}, {\widehat{\varPhi }}^{(2)}, \ldots \}\) is uniformly Lipschitz on \(K\). Also, it is not difficult to verify that there exists a constant \(L > 0\) so that

$$\begin{aligned} \left| \log \left( 1 + \left\langle \frac{x}{p}, q - p \right\rangle \right) - \log \left( 1 + \left\langle \frac{x}{p'}, q' - p' \right\rangle \right) \right| \le L\left( \Vert p - p'\Vert + \Vert q - q'\Vert \right) \end{aligned}$$

for all \(x \in \overline{\varDelta ^{(n)}}\) and \(p, p', q, q' \in K\). It follows that the family of L-divergences \(\{{{\widehat{T}}}, {{\widehat{T}}}^{(1)}, {{\widehat{T}}}^{(2)} \ldots \}\) is uniformly Lipschitz on \(K \times K\). Thus there exists \(\delta _0 > 0\) such that if \(p, p', q, q' \in \varDelta ^{(n)}\), then

$$\begin{aligned} \left| {{\widehat{T}}}^{(N)}\left( q' \mid p' \right) - {{\widehat{T}}}^{(N)}\left( q \mid p \right) \right| < \frac{\varepsilon }{2} \quad \text {and} \quad \left| {{\widehat{T}}}\left( q' \mid p' \right) - {{\widehat{T}}}\left( q \mid p \right) \right| < \varepsilon \end{aligned} \tag{55}$$

whenever \(\Vert q - q'\Vert < \delta _0\), \(\Vert p - p'\Vert < \delta _0\).

Let \(D\) be the set of points in \(K\) at which \({\widehat{\varPhi }}\) is differentiable. Then \(K {\setminus } D\) has \(m\)-measure zero by (Rockafellar 1997, Theorem 25.5). Let \(\varepsilon ' > 0\) be arbitrary. By Lemma 11, for each \(p \in D\) there exist \(0 < \delta (p) \le \delta _0\) and a positive integer \(N_0(p)\) such that \(\left\| {{\widehat{\pi }}}^{(N)}(q) - {{\widehat{\pi }}}(p)\right\| < \varepsilon '\) for all \(N \ge N_0(p)\) and \(q \in B(p, \delta (p))\).

Since \(K\) is compact, it is separable, and so is \(D\) as a subset of \(K\). The collection \(\{B(p, \delta (p))\}_{p \in D}\) forms an open cover of \(D\) and hence there exists a countable subcover. By the continuity of measure, for any \(\eta > 0\) there exist \(p_1, \ldots , p_{k_0} \in D\) such that

$$\begin{aligned} m(A_0) < \eta , \quad A_0 := K {\setminus } \bigcup _{k = 1}^{k_0} B(p_k, \delta (p_k)). \end{aligned}$$

Since \(\partial A_0 \subset \partial K \cup \bigcup _k \partial B(p_k, \delta (p_k))\), \(\partial (A_0 \times K)\) has \(m\)-measure zero and hence \(A_0 \times K\) is a \({{\mathbb {P}}}\)-continuity set. Since \({{\mathbb {P}}}\) is absolutely continuous, choosing \(\eta > 0\) sufficiently small we have

$$\begin{aligned} {{\mathbb {P}}}(A_0 \times K) < \varepsilon , \end{aligned}$$

and by weak convergence we have \({{\mathbb {P}}}_N(A_0 \times K) < \varepsilon \) for \(N\) sufficiently large, so (ii) holds. Let \(A_1 = B(p_1, \delta (p_1)) \cap K\) and define \(A_k = \{p_k\} \cup (B(p_k, \delta (p_k)) \cap K) {\setminus } (A_1 \cup \cdots \cup A_{k-1})\), \(k = 2, \ldots , k_0\). If \(N \ge \max _{1 \le k \le k_0} N_0(p_k)\), we have

$$\begin{aligned} \left\| {{\widehat{\pi }}}^{(N)}(p) - {{\widehat{\pi }}}(p_k)\right\| < \varepsilon ', \quad p \in A_k, \quad k = 1, \ldots , k_0. \end{aligned} \tag{56}$$

Next choose \(q_1, \ldots , q_{\ell _0} \in K\) such that \(K \subset \bigcup _{\ell = 1}^{\ell _0} B(q_{\ell }, \delta _0)\). Define \(B_1 = B(q_1, \delta _0) \cap K\) and \(B_{\ell } = \{q_{\ell }\} \cup (B(q_{\ell }, \delta _0) \cap K) {\setminus } (B_1 \cup \cdots \cup B_{\ell -1})\), \(\ell = 2, \ldots , \ell _0\). Again it is clear that \(\partial (A_k \times B_{\ell })\) has \(m\)-measure zero, so \(A_k \times B_{\ell }\) is a \({{\mathbb {P}}}\)-continuity set. So (i) holds for \(N\) sufficiently large. Finally, if we choose \(\varepsilon ' > 0\) small enough in (56), we have

$$\begin{aligned} \left| {\widehat{T}}^{(N)}\left( q \mid p \right) - {\widehat{T}}\left( q \mid p_k \right) \right| < \frac{\varepsilon }{2}, \quad p \in B(p_k, \delta _0), \quad q \in \varDelta ^{(n)} \end{aligned}$$

for \(N\) sufficiently large. This and (55) imply (iii) and the proof of Theorem 5 is complete. \(\square \)
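The way the cells \(A_k\) are cut out of a finite family of balls can be sketched in code. The example below is a simplified illustration rather than the construction used in the proof: a random point cloud in the unit square stands in for \(K\), the centers and radii are arbitrary stand-ins for \(p_k\) and \(\delta (p_k)\), and the adjustment that forces \(p_k \in A_k\) is ignored. Each point receives the index of the first ball containing it, and points covered by no ball receive the label 0, playing the role of \(A_0\).

```python
import numpy as np

def disjointify(points, centers, radii):
    """Label each point with the first ball containing it (1-based), or 0 if none.

    Label k >= 1 plays the role of A_k = B(p_k, r_k) minus the earlier cells;
    label 0 plays the role of the leftover set A_0.
    """
    points = np.asarray(points, dtype=float)
    labels = np.zeros(len(points), dtype=int)
    # Visit the balls from last to first so that earlier balls overwrite later ones.
    for k in range(len(centers), 0, -1):
        inside = np.linalg.norm(points - centers[k - 1], axis=1) < radii[k - 1]
        labels[inside] = k
    return labels

rng = np.random.default_rng(2)
points = rng.uniform(0.0, 1.0, size=(2000, 2))   # hypothetical stand-in for K
centers = rng.uniform(0.0, 1.0, size=(15, 2))    # stand-ins for p_1, ..., p_{k_0}
radii = np.full(15, 0.2)                         # stand-ins for delta(p_k)
labels = disjointify(points, centers, radii)
leftover_fraction = float(np.mean(labels == 0))  # empirical analogue of m(A_0)
```

By construction the labels define a partition of the point cloud: every point belongs to exactly one cell, and adding more balls can only shrink the leftover cell.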

Cite this article

Wong, TK.L. Optimization of relative arbitrage. Ann Finance 11, 345–382 (2015).


Keywords

  • Stochastic portfolio theory
  • Relative arbitrage
  • Functionally generated portfolio
  • Shape-constrained optimization
  • Portfolio management

JEL Classification

  • G11
  • C61