1 Introduction

In this paper, we argue that it is more important for an investor to consider an orthogonal Sharpe ratio than a risk-adjusted return when evaluating how to blend opportunity sets. The reason is that the former (contrary to the latter) provides sufficient information on how to form a mean-variance efficient portfolio. To state our case, we present a geometric approach to portfolio theory, in part based on the framework outlined in Bermin and Holm (2021b). More precisely, we consider an opportunity set consisting of N primary assets and a numéraire asset, such that a self-financing trading strategy can, for a fixed point in time, be seen as an element in \(\mathbb {R}^N\). This vector space is further endowed with a natural inner product, through the instantaneous covariance matrix of logarithmic excess returns, which thus forms a Hilbert space. As shown in Bermin and Holm (2021b), the instantaneous rate of excess portfolio return can then be represented as the inner product of the corresponding trading strategy and the growth optimal Kelly trading strategy. It is this unique feature that allows us to formulate a geometric approach to portfolio theory by means of a single vector (i.e., the growth optimal Kelly vector) and the inner product in our Hilbert space.

Since the growth optimal Kelly vector (equivalently, the Kelly criterion) plays a central role in this study, we pay homage to the original contributors Kelly (1956) and Latané (1959). Kelly stressed that the important quantity to look at is the logarithmic excess return (rather than the excess return) and argued: “The reason has nothing to do with the value function which [the investor] attached to his money, but merely with the fact that it is the logarithm which is additive in repeated bets and to which the law of large numbers applies”. While early promoters of the optimal growth theory existed, see, for instance, Hakansson and Ziemba (1995), Thorp (2011), and the references therein, the Kelly criterion was nonetheless subject to severe criticism over the years, see MacLean et al. (2011) and Ziemba (2015) for a historical account. The main arguments against Kelly’s result relate to the riskiness of the trading strategy and to the deviation from the expected utility approach in von Neumann and Morgenstern (1947). To address the former, MacLean et al. (1992) introduced the so-called fractional Kelly strategies, in which a constant fraction of the wealth is invested in the growth optimal Kelly strategy and the remaining fraction in the bank account. This definition was later extended in a series of papers (Bermin and Holm 2021a, b, 2023) to allow for more general leverage, with the authors defining Kelly strategies simply as those trading strategies for which the instantaneous Sharpe ratio, see Nielsen and Vassalou (2004), is maximal. Hence, by design, all Kelly strategies lie on the efficient (local) frontier in the sense of Markowitz (1952) and Tobin (1958), but have different risk. When further combined with the observation in Platen (2006) that an investor who, for a fixed level of volatility, prefers a higher rate of excess return to a lower one chooses to allocate wealth in proportion to the growth optimal Kelly strategy, it becomes apparent that Kelly strategies can be seen as the natural extension of the single-period mean–variance efficient portfolios. The main difference is that, in a multi-period setting, it is never efficient to leverage more than the Kelly criterion, as shown in, for instance, Bermin and Holm (2023) and Davis and Lleo (2021).

Another interesting aspect of the growth optimal Kelly strategy is that the reciprocal of the wealth process can be regarded as an admissible stochastic discount factor, see Long (1990). We refine this result by showing that the corresponding market price of risk vector is identical to the growth optimal Kelly vector, albeit expressed in coordinates of a different basis, when the market is complete. Hence, an immediate consequence of our geometrical approach is that we strengthen the connection between these, sometimes, separate fields of research. It also provides new means to use portfolio theory in order to value derivatives in incomplete markets.

The main motivation, though, for writing this paper is to explain and clarify the geometrical principles behind risk-adjusted returns, in particular Jensen’s alpha as introduced in Jensen (1964) and the corresponding beta parameter. The first observation we make is that given any trading strategy with nonzero alpha, we can apply leverage to reach any targeted alpha. In fact, we argue that neither a higher alpha, for a fixed beta, nor a lower beta, for a fixed alpha, is strictly better for an investor. While alpha describes the excess return of a particular portfolio, formed in such a way that it is locally uncorrelated with the reference asset, this portfolio does not tell us how best to trade in order to reach the maximal instantaneous Sharpe ratio, or equivalently to be on the efficient (local) frontier. We circumvent the problems with alpha and beta by rather studying the Sharpe ratio of the alpha-generating trading strategy. With the knowledge of the orthogonal Sharpe ratios, we can easily construct any trading strategy on the efficient frontier. Overall, we find that while the alpha/beta approach has severe limitations (especially in higher dimensions), only minor conceptual modifications are needed to complete the picture. However, these minor modifications (e.g., using orthogonal Sharpe ratios rather than risk-adjusted returns) can only be appreciated once a full geometric approach to portfolio theory is developed.

In addition, we derive a number of intermediate results that are of interest by themselves. We show that the growth optimal Kelly vector on a subspace equals the orthogonal projection of the growth optimal Kelly vector onto that subspace. We also show that the length of any growth optimal Kelly vector equals its instantaneous Sharpe ratio. A financial interpretation of these two results is that the maximal Sharpe ratio decreases as we reduce the opportunity set. We further show that the instantaneous correlation between an arbitrary trading strategy and its corresponding growth optimal Kelly strategy can be expressed as the ratio between their Sharpe ratios. Additionally, we derive a general bound for the correlation between two arbitrary trading strategies in terms of their Sharpe ratios and the Sharpe ratio of the corresponding growth optimal Kelly strategy. By analyzing the level sets of various financial quantities, we also find that points in the mean–variance space cannot, in general, be associated with a unique trading strategy. Only the points on the efficient frontier (that is, those with maximal Sharpe ratio) can be uniquely identified. For such trading strategies, collinear to the growth optimal Kelly vector, we formalize the notion of relative value trading that is implicit in Platen (2006) and Bermin and Holm (2021a). This allows us to explicitly quantify the additional return an investor can obtain for a fixed level of risk. Thereafter, we apply geometric principles to derivative pricing and introduce the concept of pricing by means of No Added Relative Value (NARV, for short). We say that this concept applies to a given asset, relative to an initial portfolio, when no value can be added by augmenting the initial opportunity set with the given asset. It follows that the NARV price of a derivative is defined such that its orthogonal Sharpe ratio equals zero and that this price further corresponds to the no-arbitrage price of the so-called minimal martingale measure (Föllmer and Schweizer 1991); a result first derived in Bermin and Holm (2021a), albeit with quite different methods. Finally, we show how to extend the geometric approach such that risk can be measured against an arbitrary asset, different from the numéraire, as described in Bermin and Holm (2021b).

In order to derive our results, we use tensor analysis. While this is not a standard approach in the financial literature, it greatly simplifies the notation and geometrical understanding, compared to a formalism based on matrix algebra. We hope that readers will agree with us once they are past the initial hurdle.

The paper is organized as follows: In Sect. 2, we briefly recap the framework laid out in Bermin and Holm (2021b). In Sect. 3, we introduce basic notations from geometric algebra, after which we establish the portfolio framework in a number of subsections. In Sect. 4, we investigate Jensen’s alpha; starting with the simple case of how to efficiently trade in two assets and following up with the general case of how to efficiently trade in two opportunity sets. Section 5 deals with relative value trading and its connection to derivative pricing, while in Sect. 6 we briefly explain how to adjust risk against an asset different from the numéraire.

2 Basic portfolio theory

We consider a capital market consisting of a number of primary assets \((P_{0},P_{1},\ldots ,P_{N})\) expressed in some common numéraire unit, say US dollar. An asset related to a dividend paying stock is seen as a fund with the dividends re-invested. All assets are assumed to be positive adapted continuous processes living on a filtered probability space \((\Omega ,\mathcal {F},\mathbb {F}, \mathbb {P})\), where \(\mathbb {F}=\lbrace \mathcal {F}(t):t\ge 0\rbrace \) is a right-continuous increasing family of \(\sigma \)-algebras such that \(\mathcal {F}(0)\) contains all the \(\mathbb {P}\)-null sets of \(\mathcal {F}\). As usual we think of the filtration \(\mathbb {F}\) as the carrier of information and henceforth we assume that the filtration is generated by a standard Brownian motion W of dimension \(M\ge N\). We further let \(P_{0}\) be the numéraire asset of the economy, describing how the value of the numéraire unit changes over time, and introduce the relative prices \(P_{ 0 \vert n }=P_{n}/P_{0}\) according to

$$\begin{aligned} \frac{dP_{0\vert n}(t)}{P_{0\vert n}(t)}=b_{0\vert n}(t)dt+\sum _{m=1}^{M}\Sigma _{0\vert n,m}(t)dW^{m}(t),\quad n\in \lbrace 1,\ldots ,N\rbrace , \end{aligned}$$

for some \(\mathbb {F}\)-adapted, \(\mathbb {R}^{N}\)-valued, excess return process \(b_0\) and some \(\mathbb {F}\)-adapted, \(\mathbb {R}^{N\times M}\)-valued, volatility process \(\Sigma _0\). We also require \(P_0>0\) a.s. and impose the mild regularity condition

$$\begin{aligned} \int _0^T(\Vert b_0(t)\Vert _{\mathbb {R}^{N}}+\sum _{n=1}^N\sum _{m=1}^M\Sigma _{0\vert n,m}^2(t))dt<\infty ,\quad \hbox {a.s.}, \end{aligned}$$

such that the relative asset prices are well defined over the horizon [0, T].

An investor can trade in the assets, and throughout this paper we assume that there are no transaction fees, that short-selling is allowed, that trading takes place continuously in time, and that trading activity does not impact the asset prices. We define a trading strategy as an \(\mathbb {F}\)-predictable vector process \(w=(w^{1},\ldots ,w^{N})'\), representing the proportion of wealth invested in each asset, and we let \(X_{w}\) denote the corresponding wealth process. We also set \(X_{0\vert w}=X_{w}/P_{0}\). In order to analyze the performance of a trading strategy, we further let \(X_{u\vert w}=X_{w}/X_{u}\) refer to the ratio of portfolios using the trading strategy w and the reference strategy u, respectively. Of course, only reference strategies u satisfying the constraint \(X_u>0\) a.s. over some time horizon [0, T] are considered admissible. In this setup, the self-financing condition, see Geman et al. (1995), and the asset dynamics read

$$\begin{aligned} \frac{dX_{u\vert w}(t)}{X_{u\vert w}(t)}&=w^{0}(t)\frac{dP_{u\vert 0}(t)}{P_{u\vert 0}(t)}+\sum _{n=1}^{N}w^{n}(t)\frac{dP_{u\vert n}(t)}{P_{u\vert n}(t)},\quad w^{0}(t)=1-\sum _{n=1}^{N}w^{n}(t),\\ \frac{dP_{u\vert n}(t)}{P_{u\vert n}(t)}&=b_{u\vert n}(t)dt+\sum _{m=1}^{M}\Sigma _{u\vert n,m}(t)dW^{m}(t),\quad n\in \lbrace 0,1,\ldots ,N\rbrace , \end{aligned}$$

for some \(\mathbb {F}\)-adapted, \(\mathbb {R}^{N+1}\)-valued, excess return process \(b_u\) and some \(\mathbb {F}\)-adapted, \(\mathbb {R}^{(N+1)\times M}\)-valued, volatility process \(\Sigma _u\). Note that, by setting u or w to \(\textbf{0}=\left( 0,\ldots ,0\right) '\), the wealth process \(X_{\textbf{0}}\) is seen to be proportional to the market numéraire asset \(P_{0}\). Hence, the local characteristics of \(X_{\textbf{0}\vert w}\) and \(X_{0\vert w}\) are identical. The instantaneous rate of return of the trading strategy w, in excess of the reference strategy u, can be expressed as:

$$\begin{aligned} b_{u\vert w}(t)=w^{0}(t)b_{u\vert 0}(t)+\sum _{n=1}^{N}w^{n}(t)b_{u\vert n}(t), \end{aligned}$$
(1)

such that \(b_{u\vert \textbf{0}}=b_{u\vert 0}\) and \(b_{u\vert e_{n}}=b_{u\vert n}\), for \(e_{n}=(0,\ldots ,0,1,0,\ldots ,0)'\) being the n’th coordinate vector corresponding to the investable assets. Note that when \(u=\textbf{0}\) and the numéraire asset is taken to be locally risk-free (i.e., a bank account), we measure the rate of excess return relative to the interest rate. We further define the instantaneous covariance process (by means of the quadratic covariation process, see Karatzas and Shreve 1988) and the corresponding instantaneous correlation and variance processes

$$\begin{aligned} V_{u\vert v,w}(t)&=\frac{\textrm{d}}{\textrm{dt}}[\log X_{u\vert v},\log X_{u\vert w}](t),\\ \rho _{u\vert v,w}(t)&=\frac{V_{u\vert v,w}(t)}{\sigma _{u\vert v}(t)\sigma _{u\vert w}(t)},\quad \sigma ^2_{u\vert w}(t)=V_{u\vert w,w}(t). \end{aligned}$$

We also let \(\mu _{u\vert w}\) denote the rate of logarithmic excess return and claim that a simple application of Itô’s formula yields

$$\begin{aligned} \mu _{u\vert w}(t)=b_{u\vert w}(t)-\frac{1}{2}\sigma ^{2}_{u\vert w}(t). \end{aligned}$$
(2)
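For completeness, the step behind Eq. (2) can be sketched as follows, assuming only that \(X_{u\vert w}\) is a positive continuous semimartingale. The symbol \(M_{u\vert w}\), denoting the local martingale part of \(dX_{u\vert w}/X_{u\vert w}\), is introduced here purely for illustration and is not part of the notation used elsewhere. Applying Itô’s formula to the logarithm, and using Eq. (1) for the drift, gives

$$\begin{aligned} d\log X_{u\vert w}(t)&=\frac{dX_{u\vert w}(t)}{X_{u\vert w}(t)}-\frac{1}{2}\frac{d[X_{u\vert w}](t)}{X^{2}_{u\vert w}(t)}\\ &=\left( b_{u\vert w}(t)-\frac{1}{2}\sigma ^{2}_{u\vert w}(t)\right) dt+dM_{u\vert w}(t), \end{aligned}$$

where the second line uses \(d[X_{u\vert w}](t)/X^{2}_{u\vert w}(t)=d[\log X_{u\vert w}](t)=\sigma ^{2}_{u\vert w}(t)dt\). The drift of \(\log X_{u\vert w}\) is, by definition, \(\mu _{u\vert w}\).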

Additionally, we follow Bermin and Holm (2021b) and introduce a few more important concepts. First, we define the generalized instantaneous Sharpe ratio

$$\begin{aligned} s_{u\vert w}(t)=\frac{b_{u\vert w}(t)}{\sigma _{u\vert w}(t)}. \end{aligned}$$
(3)

Second, we define the relative leverage risk process \(k_{u\vert w}\) and the relative drawdown process \(R_{u\vert w}\) according to

$$\begin{aligned} k_{u\vert w}(t)=\frac{\sigma ^{2}_{u\vert w}(t)}{b_{u\vert w}(t)},\quad R_{u\vert w}(t)=\mathcal {R}\left( {k_{u\vert w}(t)}\right) ,\quad \mathcal {R}(k)=\frac{k}{2-k}. \end{aligned}$$
(4)

As shown in Bermin and Holm (2021b), any bankruptcy-avoiding trading strategy holding the relative leverage risk process \(k_{u\vert w}\) constant over time, at say a level \(k\in (0,2)\), has a maximal drawdown distribution given by the simple analytical formula

$$\begin{aligned} \mathbb {P}\left( \inf _{0\le t <\infty }\log \frac{X_{u\vert w}(t)}{X_{u\vert w}(0)} \le -n\mathcal {R}\left( k\right) \right) =e^{-n},\quad n\ge 0. \end{aligned}$$

While the formula provides an intuitive interpretation for the relative drawdown risk, we do not necessarily require that the relative leverage risk process is kept constant over time. Instead, we directly associate drawdown risk with the process \(R_{u\vert w}\) and recall from Bermin and Holm (2021b) that this process shares many of the properties seen in coherent and convex risk measures, see Artzner et al. (1999), Föllmer and Schied (2002) for further details.
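To make the drawdown formula concrete, the following minimal Python sketch evaluates \(\mathcal {R}(k)\) and the drawdown probability. It then cross-checks the result against the classical law of the running minimum of a Brownian motion with positive drift, under the simplifying assumption (ours, for illustration) that \(b_{u\vert w}\) and \(\sigma _{u\vert w}\) are constant. The function names and the numerical inputs are hypothetical.

```python
import math


def drawdown_risk(k: float) -> float:
    """R(k) = k / (2 - k) from Eq. (4), defined for 0 < k < 2."""
    if not 0.0 < k < 2.0:
        raise ValueError("k must lie in (0, 2)")
    return k / (2.0 - k)


def drawdown_probability(x: float, k: float) -> float:
    """P(inf_t log(X(t)/X(0)) <= -x) for constant leverage risk k and x >= 0."""
    return math.exp(-x / drawdown_risk(k))


# Hypothetical constant inputs: excess return b and volatility sigma.
b, sigma = 0.08, 0.20
k = sigma**2 / b                 # relative leverage risk, Eq. (4)
R = drawdown_risk(k)             # relative drawdown risk, Eq. (4)

# Cross-check (assumes log X(t) = mu * t + sigma * W(t) with mu > 0):
# the running minimum satisfies P(inf <= -x) = exp(-2 * mu * x / sigma^2).
mu = b - 0.5 * sigma**2          # logarithmic excess return, Eq. (2)
x = 0.25                         # drawdown level in log units
p_classical = math.exp(-2.0 * mu * x / sigma**2)
p_formula = drawdown_probability(x, k)
```

In this constant-coefficient case, \(2\mu /\sigma ^{2}=(2-k)/k=1/\mathcal {R}(k)\), so the two probabilities coincide.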

What makes the proposed framework compelling is that all quantities can be computed from the instantaneous covariance and rate of excess return processes. The dependency on the reference strategy can further be removed as explained below. Define the covariance matrix process of the investable assets, relative to the market numéraire, by \(V_{0}=\Sigma _0\Sigma ^{\prime }_0\). Assume that the matrix \(V_{0}\) is a.s. positive definite such that it generates an inner product of the form \(\langle v,w\rangle _{V_{0}}=v'V_{0}w\). It now follows, from the self-financing property and Eq. (1), that

$$\begin{aligned} V_{0\vert v,w}(t)=\langle v(t),w(t)\rangle _{V_{0}(t)},\quad b_{0\vert w}(t)=\langle w_{*}(t),w(t)\rangle _{V_{0}(t)}, \end{aligned}$$
(5)

where the particular trading strategy \(w_{*}=V_0^{-1}b_0\) is commonly known as the growth optimal Kelly strategy. Moreover, using Eqs. (2) and (5), it is easily seen that the growth optimal Kelly strategy can be characterized as

$$\begin{aligned} w_{*}(t)=\mathop {\mathrm {arg\,max}}\limits _{w(t)}\mu _{0\vert w}(t)=\mathop {\mathrm {arg\,max}}\limits _{w(t)}\mu _{u\vert w}(t),\quad \forall u. \end{aligned}$$

The observation that \(w_*\) is independent of the reference strategy follows from the alternative representation \(X_{u\vert w}=X_{0\vert w}/X_{0\vert u}\), which implies that \(\mu _{u\vert w}=\mu _{0\vert w}-\mu _{0\vert u}\). Finally, we extend Eq. (5) to an arbitrary reference strategy as described below.

Proposition 1

For every reference strategy u, the instantaneous covariance process \(V_{u}\) and the rate of excess return process \(b_{u}\) equal

$$\begin{aligned} V_{u\vert v,w}(t)=V_{0\vert v-u,w-u}(t),\quad b_{u\vert w}(t)=V_{0\vert w_*-u,w-u}(t). \end{aligned}$$

Proof

The proof follows from straightforward calculations, see Bermin and Holm (2021b). \(\square \)

Hence, the minimal representation of the framework is given by the quantities \((w_{*},V_{0})\). Once these terms are specified everything else is computable. Having established the connection between an arbitrary reference strategy and the market numéraire, we now focus on the case where the latter is used as the reference strategy and provide details, at a later stage, on how to generalize the results derived.
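The claim that everything is computable from \((w_{*},V_{0})\) can be illustrated numerically. The sketch below, written with NumPy, uses a hypothetical two-asset market (all numbers are made up for illustration) and implements Eqs. (2), (3), (5) and Proposition 1 as plain functions. The function names are ours and do not appear in the original framework.

```python
import numpy as np

# Hypothetical two-asset market: volatilities, correlation and excess returns.
vols = np.array([0.20, 0.30])
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
V0 = np.outer(vols, vols) * corr       # covariance matrix V_0
b0 = np.array([0.06, 0.09])            # excess returns b_0
w_star = np.linalg.solve(V0, b0)       # Kelly strategy w_* = V_0^{-1} b_0


def cov(v, w, u=None):
    """V_{u|v,w} = <v - u, w - u>_{V_0} (Proposition 1); u=None means u = 0."""
    u = np.zeros_like(w) if u is None else u
    return (v - u) @ V0 @ (w - u)


def excess_return(w, u=None):
    """b_{u|w} = <w_* - u, w - u>_{V_0} (Proposition 1)."""
    return cov(w_star, w, u)


def log_return(w, u=None):
    """mu_{u|w} = b_{u|w} - sigma_{u|w}^2 / 2, Eq. (2)."""
    return excess_return(w, u) - 0.5 * cov(w, w, u)


def sharpe(w, u=None):
    """s_{u|w} = b_{u|w} / sigma_{u|w}, Eq. (3)."""
    return excess_return(w, u) / np.sqrt(cov(w, w, u))


w = np.array([0.4, 0.3])               # an arbitrary trading strategy
u = np.array([0.5, 0.5])               # an arbitrary reference strategy
```

With these definitions one can check, for instance, that `excess_return(w)` agrees with `b0 @ w`, that `log_return(w, u)` equals `log_return(w) - log_return(u)`, and that `log_return(w_star)` equals half the squared Sharpe ratio of the Kelly strategy.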

Kelly’s approach to portfolio allocation is fundamentally different from the expected utility approach of von Neumann and Morgenstern (1947) and consequently different from the (single period) mean–variance approach of Markowitz (1952). Yet, the final results are very similar to those obtained by Markowitz. Kelly started by considering the long-term performance of a trading strategy in conjunction with the law of large numbers. In our setting, and with appropriate technical regularity conditions (Bermin and Holm 2021b), this means that

$$\begin{aligned} \lim _{T\rightarrow \infty }\frac{1}{T}\log \frac{X_{0\vert w}(T)}{X_{0\vert w}(0)}=\lim _{T\rightarrow \infty }\frac{1}{T}\int _0^T\mu _{0\vert w}(t)dt,\quad a.s. \end{aligned}$$
(6)

Hence, by applying the growth optimal Kelly strategy \(w_*=\mathop {\mathrm {arg\,max}}\limits _w\mu _{0\vert w}\), such that \(\mu _{0\vert w_*}=\frac{1}{2}s^2_{0\vert w_*}\), Kelly noted that "our gambler’s capital will surpass, with probability one, that of any other gambler apportioning his money differently". It can also be shown that \(w_*\) is (locally) a mean–variance efficient strategy, since its instantaneous Sharpe ratio is maximal. The fact that optimal long-term capital growth requires optimal instantaneous Sharpe ratio allocations is an interesting property.

This brings us to the core of the paper: how can an investor increase the instantaneous Sharpe ratio of a portfolio? In order to answer this question, we follow Nielsen and Vassalou (2004) and apply a Taylor expansion to the term \(s_{0\vert w+\varepsilon v}\). Using Proposition 1, we compute

$$\begin{aligned} s_{0\vert w+\varepsilon v}(t)&=s_{0\vert w}(t)+ \frac{ \alpha _{0\vert v,w}(t)}{\sigma _{0\vert w}(t)}\varepsilon +\mathcal {O}(\varepsilon ^{2}), \end{aligned}$$
(7)
$$\begin{aligned} \alpha _{0\vert v,w}(t)&=b_{0\vert v}(t)-\beta _{0\vert v,w}(t)b_{0\vert w}(t),\quad \beta _{0\vert v,w}(t)=\frac{\sigma _{0\vert v}(t)\rho _{0\vert v,w}(t)}{\sigma _{0\vert w}(t)}. \end{aligned}$$
(8)

We recognize \(\alpha _{0\vert v,w}\) as Jensen’s alpha, see Jensen (1964), describing how the instantaneous excess return of the trading strategy v is risk-adjusted with respect to the trading strategy w. The adjustment equals the product between the risk parameter beta and the instantaneous excess return of the trading strategy w. It is apparent from Eq. (7) that for \(\varepsilon \) sufficiently small we can always improve the Sharpe ratio if Jensen’s alpha is different from zero. While theoretically interesting this observation has, as explained in Nielsen and Vassalou (2004), limited practical applicability since only infinitesimal contributions are considered.
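The first-order expansion in Eq. (7) can be checked numerically. The following sketch, under hypothetical market data chosen only for illustration, computes Jensen’s alpha and beta from Eq. (8) and compares \(\alpha _{0\vert v,w}/\sigma _{0\vert w}\) with a central finite-difference estimate of the derivative of the Sharpe ratio at \(\varepsilon =0\). The function names are ours.

```python
import numpy as np

# Hypothetical market data (illustrative only).
V0 = np.array([[0.04, 0.03], [0.03, 0.09]])   # covariance matrix V_0
b0 = np.array([0.06, 0.09])                    # excess returns b_0


def sharpe(w):
    """Instantaneous Sharpe ratio s_{0|w} = b_{0|w} / sigma_{0|w}, Eq. (3)."""
    return (b0 @ w) / np.sqrt(w @ V0 @ w)


def jensen_alpha_beta(v, w):
    """Jensen's alpha and beta of strategy v with respect to w, Eq. (8)."""
    beta = (v @ V0 @ w) / (w @ V0 @ w)         # equals sigma_v * rho / sigma_w
    alpha = b0 @ v - beta * (b0 @ w)
    return alpha, beta


w = np.array([1.0, 0.0])                       # current portfolio: asset 1 only
v = np.array([0.0, 1.0])                       # candidate addition: asset 2
alpha, beta = jensen_alpha_beta(v, w)
sigma_w = np.sqrt(w @ V0 @ w)

eps = 1e-6
# Central finite difference of eps -> s_{0|w + eps v} at eps = 0.
slope_fd = (sharpe(w + eps * v) - sharpe(w - eps * v)) / (2 * eps)
slope_eq7 = alpha / sigma_w                    # first-order coefficient in Eq. (7)
```

For these inputs the two slopes agree up to discretization error, which illustrates why a nonzero alpha always permits a small improvement of the Sharpe ratio.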

In this paper, we show how to calculate the optimal instantaneous Sharpe ratio when the opportunity set is enlarged. In doing so, we emphasize the non-trivial geometry governing risk adjustments. The main conclusion is that an investor who wants to run a locally efficient trading strategy cares more about the orthogonal Sharpe ratio than the risk-adjusted return. We stress that this is a static analysis, carried out for a given and fixed point in time, and consequently we often suppress the time dimension to facilitate the reading.

3 Basic geometry

We start by giving a very brief introduction to geometry, including tensors and tensor notation. For additional details, see, for instance, Dodson and Poston (1991). The reason for choosing this path is that we sometimes need to study the geometry from the viewpoint of different coordinate systems. Consequently, it is beneficial to work with a coordinate-free representation. Compared to the matrix notation of linear algebra, the tensor notation also makes it clearer how the components transform under linear transformations of the basis vectors. In order to easily distinguish components from basis vectors (and tensors), we write the latter in bold.

Throughout this paper, let U be an N-dimensional vector space over \(\mathbb {R}\) such that U is isomorphic to \(\mathbb {R}^{N}\). A typical element of U is denoted by \(\textbf{w}\) and corresponds to a trading strategy at a given point in time. Expressed in terms of the standard basis \(\lbrace \textbf{e}_{1},\ldots ,\textbf{e}_{N}\rbrace \) this means that \(\textbf{w}=w^{1}(t)\textbf{e}_{1}+\cdots +w^{N}(t)\textbf{e}_{N}\) for some vector of components w(t). Similarly, we let \(\textbf{w}_{*}\) denote the growth optimal Kelly vector with components \(w_{*}(t)\) in the standard basis. We also fix the inner product \(\textbf{V}_0(\textbf{v},\textbf{w} )=\langle v(t),w(t)\rangle _{V_0(t)}\), representing the random variable \(V_{0\vert v,w}(t)\), and note that \(\mathcal {H}=(U,\textbf{V}_{0})\) is a Hilbert space.

We further let \(U^*\) denote the dual vector space containing all linear forms on U. The elements of the dual space are referred to as covectors, or 1-forms, and a typical example is the instantaneous rate of excess return. We define the covector \(\textbf{b}_{0}(\textbf{w})=\textbf{V}_0(\textbf{w}_{*},\textbf{w})\) such that it represents the random variable \(b_{0\vert w}(t)\). Hence, \(\textbf{b}_{0}(\textbf{e}_n)\) corresponds to the n’th term of the component vector \(b_{0}(t)=(b_{0\vert 1}(t),\ldots ,b_{0\vert N}(t))'\) for the investable assets. The notion of the dual space is important throughout this work, and from linear algebra we know that the dual space \(U^*\) is itself a vector space of the same dimension as U. Moreover, as U is finite-dimensional the map into its double dual space \(U^{**}\) is a natural isomorphism; whence \(U^{**}\) can be identified with the original vector space. This means that we can also regard \(\textbf{w}\) as the linear form \(\textbf{w}(\textbf{b}_{0})=\textbf{b}_{0}(\textbf{w})\).

For any vector basis \(\lbrace \textbf{u}_1,\ldots ,\textbf{u}_N\rbrace \) of U, there exists a dual vector basis \(\lbrace \textbf{u}^1,\ldots ,\textbf{u}^N\rbrace \) for which \(U^{*}={{\,\textrm{span}\,}}(\textbf{u}^1,\ldots ,\textbf{u}^N)\). We express the canonical dual basis, using the Kronecker delta, according to \(\textbf{u}^i(\textbf{u}_j)=\delta _j^i\). One notes that in the special case where the vector basis \(\lbrace \textbf{u}_1,\ldots ,\textbf{u}_N\rbrace \) is orthonormal the canonical dual basis takes the same form as the vector basis, which consequently allows for considerable simplifications. In our situation, however, this is not the case. The standard basis is related to the investable assets, which are assumed to be correlated with each other. To further explain the relationship, we apply a linear transformation to the standard basis. It follows, using Einstein summation for repeated indices, that if \(\bar{\textbf{e}}_i=A_i^j \textbf{e}_j\) then \(\bar{\textbf{e}}^i=(A^{-1})_j^i \textbf{e}^j\). We verify this statement using the linearity of covectors

$$\begin{aligned} \bar{\textbf{e}}^i(\bar{\textbf{e}}_j)=(A^{-1})_k^i {\textbf {e}}^k( A_j^l {\textbf {e}}_l)=A_j^l (A^{-1})_k^i \delta _l^k=A_j^l (A^{-1})_l^i=\delta _j^i. \end{aligned}$$

Similarly, we see that vector components also transform inversely to the basis vectors: \(\textbf{w}=w^i(t) \textbf{e}_i=\bar{w}^i(t)\bar{\textbf{e}}_i=\bar{w}^i(t)A_i^j\textbf{e}_j\) implies that \(\bar{w}^i(t)= (A^{-1})_j^i w^j(t)\). For a covector, though, the components transform similarly to the vector basis, that is, with \(\textbf{b}_{0}=b_{0\vert i}(t) \textbf{e}^i=\bar{b}_{0\vert i}(t)\bar{\textbf{e}}^i\) we obtain \(\bar{b}_{0\vert i}(t)=A_i^j b_{0\vert j}(t)\). Furthermore, the asset–asset covariance matrix \(V_0(t)\) generating the inner product \(\textbf{V}_0=V_{0\vert i,j}(t)\textbf{e}^i\otimes \textbf{e}^j=\bar{V}_{0\vert i,j}(t)\varvec{\bar{e}}^i\otimes \varvec{\bar{e}}^j\) transforms as \(\bar{V}_{0\vert i,j}(t)=A_i^kA_j^lV_{0\vert k,l}(t)\).
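The transformation rules above translate directly into matrix operations, which the following NumPy sketch illustrates for a hypothetical three-asset example. We store the change of basis as `A[i, j]` \(=A_i^j\), a storage convention that is our own choice, and verify that the scalars \(\textbf{b}_0(\textbf{w})\) and \(\textbf{V}_0(\textbf{v},\textbf{w})\) do not depend on the basis used to compute them.

```python
import numpy as np

# Hypothetical components in the standard basis (illustrative numbers).
V0 = np.array([[0.04, 0.01, 0.00],
               [0.01, 0.09, 0.02],
               [0.00, 0.02, 0.16]])   # covariance components V_{0|i,j}
b0 = np.array([0.05, 0.07, 0.10])     # covector components b_{0|i}
w = np.array([0.5, 0.3, -0.2])        # vector components w^i
v = np.array([0.1, -0.4, 0.6])        # vector components v^i

# Change of basis e_bar_i = A_i^j e_j, stored as A[i, j] = A_i^j (det A = 7).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
A_inv = np.linalg.inv(A)

# Vector components transform with the inverse: w_bar^i = (A^{-1})_j^i w^j.
w_bar = A_inv.T @ w
v_bar = A_inv.T @ v
# Covector components transform like the basis: b_bar_i = A_i^j b_j.
b_bar = A @ b0
# (0,2)-tensor components: V_bar_ij = A_i^k A_j^l V_kl.
V_bar = A @ V0 @ A.T

# The tensors themselves are basis independent, so these scalars must agree.
excess_return_std, excess_return_new = b0 @ w, b_bar @ w_bar
covariance_std, covariance_new = v @ V0 @ w, v_bar @ V_bar @ w_bar
```

The components change with the basis, while the excess return and the covariance of the two strategies remain the same in both coordinate systems.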

The framework briefly outlined above is that of tensor analysis. The takeaway is that a tensor is always independent of the chosen basis but that the components change in such a way as to reflect the basis used. More formally, we regard a (p, q)-tensor \(\textbf{T}\) as an element of the space

$$\begin{aligned} \underbrace{U\otimes \cdots \otimes U}_{p}\otimes \underbrace{U^*\otimes \cdots \otimes U^*}_q, \end{aligned}$$

such that \(\textbf{T}\) maps p covectors (recall that we identify \(U^{**}\) with U) and q vectors to \(\mathbb {R}\) in a coordinate-free and multilinear way. It is important to understand, however, that in order to compute the function value in \(\mathbb {R}\) we must always choose a particular basis and identify the corresponding components.

In Table 1, we highlight the main tensors used in this paper. With the abstract tensor notation, we observe that the instantaneous rate of excess return covector \(\textbf{b}_{0}=\textbf{V}_{0}(\textbf{w}_{*})\) is the metric dual of the growth optimal Kelly vector. In other words, \(\textbf{w}_{*}\in \mathcal H=(U,\textbf{V}_0)\) is the Riesz representation of \(\textbf{b}_{0}\in \mathcal {H}^{*}=(U^{*},\textbf{V}_0^{-1})\), where the inner product \(\textbf{V}_0^{-1}=(V_0^{-1}(t))^{ij}\textbf{e}_i\otimes \textbf{e}_j\) is generated by the inverse covariance matrix \(V_0^{-1}(t)\) of the investable assets, such that \(\Vert \textbf{w}_{*}\Vert _\mathcal {H}=\Vert \textbf{b}_{0}\Vert _{\mathcal {H}^*}\). Similarly, we can also write \(\textbf{w}_{*}=\textbf{V}_0^{-1}(\textbf{b}_0)\), with the interpretation that \(\textbf{w}_*\) is an element of the double dual space \(U^{**}\cong U\), such that \(\textbf{w}_*(\textbf{e}^n)=\textbf{e}^n(\textbf{w}_*)\) equals the n’th component of \(w_*(t)\). The (1, 1)-tensor \(\textbf{P}_{0}=P_{0\vert j}^i(t)\textbf{e}_i\otimes \textbf{e}^j\) is a projection operator mapping a covector and a vector to \(\mathbb {R}\). More commonly, though, we regard it as a map from either U onto U or from \(U^*\) onto \(U^*\). Finally, let us mention that we have chosen to represent other key financial quantities with the same notation, although they are not tensors. For instance, we let \(\textbf{s}_0(\textbf{w})=\textbf{b}_0(\textbf{w})/\sqrt{\textbf{V}_0(\textbf{w},\textbf{w})}\) denote the instantaneous Sharpe ratio and note that this quantity is, strictly speaking, not a tensor due to the nonlinear scaling \(\textbf{s}_0(\lambda \textbf{w})=\textrm{sign}(\lambda )\textbf{s}_0(\textbf{w})\).

Table 1 Summary of main tensors

Next we present additional results related to the growth optimal Kelly vector, with the purpose both to motivate the use of tensors and to present a framework suitable for geometric analysis.

3.1 Absence of arbitrage

In order to highlight the power of tensor analysis, we provide an enlightening example of when it is important to consider vectors rather than simply components for a particular basis.

Recall that in Sect. 2 we presented the basic portfolio theory directly in terms of the components corresponding to the standard basis, such that the components of the asset–asset covariance matrix \(V_0(t)\) equaled \(\Sigma _{0}(t)\Sigma _{0}'(t)\). It is well known that the absence of arbitrage implies the existence of an \(\mathbb {F}\)-adapted process \(\theta =(\theta ^1,\ldots ,\theta ^M)'\), see for instance (Karatzas and Shreve 1999), satisfying

$$\begin{aligned} \Sigma _{0\vert n,m}(t)\theta ^m(t)=b_{0\vert n}(t). \end{aligned}$$

We call \(\theta \) the market price of risk process and notice that in a complete market, where \(M=N\), this process equals \(\theta (t)=\Sigma ^{-1}_0(t)b_0(t)\). Consequently, in a complete market it follows, from Eq. (5), that we can express the growth optimal Kelly strategy as

$$\begin{aligned} \textbf{w}_*=w^i_*(t) \textbf{e}_i=\theta ^a(t)(\Sigma _0^{-1}(t))_a^j\textbf{e}_j=\theta ^a(t)\bar{\textbf{e}}_a. \end{aligned}$$

Moreover, since \(\lbrace \bar{\textbf{e}}_1,\ldots ,\bar{\textbf{e}}_N\rbrace \) is a basis of U we see that \(\theta ^a(t)\bar{\textbf{e}}_a\) naturally describes the market price of risk vector \(\varvec{\Theta }\in U\). We summarize the observations below.

Theorem 2

In a complete market, where \(M=N\), the market price of risk vector \(\varvec{\Theta }\) is identical to the growth optimal Kelly vector \(\textbf{w}_*\). That is, with \(\varvec{\Theta }=\theta ^a(t)\bar{\textbf{e}}_a\), the components and the basis vectors relate according to

$$\begin{aligned} \begin{array}{ll} \theta ^a(t)=w_*^{j}(t)(\Sigma _0(t))_j^a,&{}\qquad \quad \bar{\textbf{e}}_a=(\Sigma _0^{-1}(t))_a^j\textbf{e}_j,\\ w_*^{i}(t)=\theta ^a(t)(\Sigma _0^{-1}(t))_a^i,&{}\qquad \quad \textbf{e}_i=\left( \Sigma _0(t)\right) _i^a\bar{\textbf{e}}_a, \end{array} \end{aligned}$$

such that

$$\begin{aligned} \textbf{s}_0(\textbf{w}_*)=\Vert \textbf{w}_*\Vert _\mathcal {H}=\Vert \varvec{\Theta }\Vert _\mathcal {H}=\Vert \theta (t)\Vert _{\mathbb {R}^N}. \end{aligned}$$

Note further that we can always choose a new orthonormal basis \(\lbrace \check{\textbf{e}}_1,\ldots ,\check{\textbf{e}}_N\rbrace \), through a standard orthogonal coordinate transformation, such that

$$\begin{aligned} \varvec{\Theta }=\textbf{s}_0(\textbf{w}_*)\check{\textbf{e}}_1,\quad \check{\textbf{e}}_1=\frac{\textbf{w}_*}{\Vert \textbf{w}_*\Vert _\mathcal {H}}. \end{aligned}$$

Proof

Given that \(\textbf{w}_*=\varvec{\Theta }\) we need to show that \(\Vert \textbf{w}_*\Vert _\mathcal {H}=\textbf{s}_0(\textbf{w}_*)\) and \(\Vert \varvec{\Theta }\Vert _\mathcal {H}=\Vert \theta (t)\Vert _{\mathbb {R}^N}\). Direct calculations using Eqs. (3) and (5) yield

$$\begin{aligned} \textbf{s}^2_0(\textbf{w}_*)=\frac{\textbf{b}^2_0(\textbf{w}_*)}{\textbf{V}_0(\textbf{w}_*,\textbf{w}_*)}=\textbf{V}_0(\textbf{w}_*,\textbf{w}_*)=\Vert \textbf{w}_*\Vert ^2_\mathcal {H}. \end{aligned}$$

Furthermore, since the components of \(\textbf{w}_*\), with respect to the standard basis, satisfy \(w'_*(t)=\theta '(t)\Sigma _0^{-1}(t)\) and \(V_0(t)\) admits the decomposition \(\Sigma _{0}(t)\Sigma _{0}'(t)\), it follows that

$$\begin{aligned} \Vert \textbf{w}_*\Vert ^2_\mathcal {H}=\langle w_*(t),w_*(t)\rangle _{V_0(t)}=w'_*(t)V_0(t)w_*(t)=\theta '(t)\theta (t), \end{aligned}$$

which we recognize as the square of the Euclidean norm in \(\mathbb {R}^N\). \(\square \)

By taking a geometric approach, we identify the growth optimal Kelly vector with the market price of risk vector in a complete market. The key observation is that in algebra and analysis the latter vector is typically expressed using components from a basis different from the standard basis, which muddies the water and hides the fact that the length of the vector equals its instantaneous Sharpe ratio. With this introduction to Kelly trading, we proceed by investigating how to characterize the growth optimal Kelly vector on subspaces.
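Theorem 2 lends itself to a short numerical check. The sketch below assumes a hypothetical complete market with \(M=N=2\) and an invertible volatility matrix (the numbers are illustrative only). It computes the market price of risk and the Kelly components separately, and then confirms the component relations of the theorem together with the equality of the three norms.

```python
import numpy as np

# Hypothetical complete market (M = N = 2) with invertible volatility matrix.
Sigma0 = np.array([[0.20, 0.00],
                   [0.15, 0.26]])
b0 = np.array([0.06, 0.09])                  # excess returns b_0
V0 = Sigma0 @ Sigma0.T                       # V_0 = Sigma_0 Sigma_0'

theta = np.linalg.solve(Sigma0, b0)          # market price of risk, Sigma_0 theta = b_0
w_star = np.linalg.solve(V0, b0)             # Kelly components, w_* = V_0^{-1} b_0

# Theorem 2: w_*^i = theta^a (Sigma_0^{-1})_a^i, i.e. w_*' = theta' Sigma_0^{-1}.
w_from_theta = np.linalg.solve(Sigma0.T, theta)
# Theorem 2: theta^a = w_*^j (Sigma_0)_j^a, i.e. theta' = w_*' Sigma_0.
theta_from_w = Sigma0.T @ w_star

norm_H = np.sqrt(w_star @ V0 @ w_star)       # ||w_*||_H
sharpe_star = (b0 @ w_star) / norm_H         # s_0(w_*), Eq. (3)
norm_theta = np.linalg.norm(theta)           # Euclidean norm of theta
```

Here `norm_H`, `sharpe_star` and `norm_theta` coincide up to rounding, in line with the statement that the length of the Kelly vector equals its instantaneous Sharpe ratio.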

3.2 The opportunity set and projections

So far we have considered the opportunity set to consist of N numéraire-based investable assets. The first observation to be made is that this dimension is local in time since new assets might be available for investment in the future, while other assets might cease to exist for various reasons. However, for a given point in time, the dimension of the opportunity set can also vary from investor to investor and below we aim to clarify the geometry governing such reductions or expansions.

The approach we follow is to consider a subspace \(U_1\subseteq U\). Any vector \(\textbf{w}\in U_1\) can be expressed as \(\textbf{w}=w_\textbf{v}^j(t)\textbf{v}_j\) for a given basis \(\lbrace \textbf{v}_1,\ldots ,\textbf{v}_{N_1}\rbrace \), \(N_1\le N\), of \(U_1\). Hence, the \(N_1\) investable assets of the opportunity set \(U_1\) are linear combinations of the N investable assets in U. We can further translate the representation to the standard basis of U, by setting \(\textbf{v}_j=v_j^i\textbf{e}_i\), such that \(\textbf{w}=w^i(t)\textbf{e}_i\) with \(w^i(t)=w_\textbf{v}^j(t)v^i_j\). Below we show how to characterize the growth optimal Kelly vector on \(U_1\), defined by

$$\begin{aligned} \textbf{w}_*[U_1]=\mathop {\mathrm {arg\,max}}\limits _{\textbf{w}\in U_1}\varvec{\mu }_0(\textbf{w}), \end{aligned}$$

in terms of \(\textbf{w}_*\). However, first we introduce a technical result.

Lemma 3

Let \(\mathcal {H}_1=(U_1,\textbf{V}_0)\) be a subspace of \(\mathcal {H}\). The orthogonal projection \(\textbf{P}_{0\vert U_1}(\textbf{w})\) of a vector \(\textbf{w} \in \mathcal {H}\) onto \(\mathcal {H}_1\) is unique and satisfies

$$\begin{aligned} \textbf{V}_0(\textbf{w},\textbf{P}_{0\vert U_1}(\textbf{x}))=\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{w}),\textbf{x})=\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{w}),\textbf{P}_{0\vert U_1}(\textbf{x})),\quad \forall \textbf{x}\in \mathcal {H}. \end{aligned}$$

Furthermore, the orthogonal projection admits the representation

$$\begin{aligned} \textbf{P}_{0\vert U_1} (\textbf{w}) =\sum _{i\ge 1} \frac{\textbf{V}_{0}(\textbf{w}, \textbf{v}_i)}{\textbf{V}_{0}(\textbf{v}_i, \textbf{v}_i)} \textbf{v}_i, \end{aligned}$$

for any orthogonal sequence \(\lbrace \textbf{v}_i\rbrace _{i\ge 1}\) spanning \(U_1\).

Proof

For details about the proof, we refer to Luenberger (1997). \(\square \)

It is worth mentioning that the functional representation of the orthogonal projection is more complicated when expanded in a non-orthogonal basis; a topic we return to later in this paper. With that being said, we now return to the growth optimal Kelly vector and highlight the financial connection.

Theorem 4

For \(\mathcal {H}_1=(U_1,\textbf{V}_0)\), let \(\mathcal {H}_1\subseteq \mathcal {H}\). Then,

$$\begin{aligned} \textbf{w}_{*}[U_1] = \textbf{P}_{0\vert U_1}(\textbf{w}_*),\quad \Vert \textbf{w}_{*}[U_1]\Vert _\mathcal {H}=\textbf{s}_0(\textbf{w}_{*}[U_1]). \end{aligned}$$

Proof

Let \(\lbrace \textbf{v}_i\rbrace _{i\le N_1}\), \(N_1=\dim (U_1)\), be an orthogonal sequence spanning \(U_1\) such that any vector \(\textbf{w}\) in \(U_1\) takes the form \(\textbf{w}=\lambda ^j\textbf{v}_j\). It now follows from Eqs. (2) and (5) that

$$\begin{aligned} \mu _{0}(\lambda ^j\textbf{v}_j)=\lambda ^j \textbf{V}_0(\textbf{w}_*,\textbf{v}_j)-\frac{1}{2}\lambda ^j\lambda ^k \textbf{V}_0(\textbf{v}_j,\textbf{v}_k). \end{aligned}$$

Hence, the rate of excess logarithmic return is maximal when

$$\begin{aligned} 0 = \frac{\partial }{\partial \lambda ^i} \mu _{0}(\lambda ^j\textbf{v}_j) = \textbf{V}_0(\textbf{w}_*,\textbf{v}_i)-\lambda ^k \textbf{V}_0(\textbf{v}_i,\textbf{v}_k). \end{aligned}$$

Since \(\lbrace \textbf{v}_i\rbrace _{i\le N_1}\) is an orthogonal sequence, we see that \(\lambda ^i = \textbf{V}_0(\textbf{w}_*,\textbf{v}_i)/\textbf{V}_0(\textbf{v}_i,\textbf{v}_i)\). The first part of the proof follows by identifying the terms with those in Lemma 3. Having identified the growth optimal Kelly vector on a subspace as a projection, we again apply Lemma 3 to obtain

$$\begin{aligned} \textbf{b}_0(\textbf{w}_*[U_1])&=\textbf{V}_0(\textbf{w}_*,\textbf{P}_{0\vert U_1}(\textbf{w}_*))=\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{w}_*),\textbf{P}_{0\vert U_1}(\textbf{w}_*))=\Vert \textbf{w}_{*}[U_1]\Vert _\mathcal {H}^2, \end{aligned}$$

from which the proof concludes. \(\square \)

A different explanation can be seen from the expression \(\mu _{0}(\textbf{v})=\frac{1}{2}(\Vert \textbf{w}_*\Vert ^2_\mathcal {H}-\Vert \textbf{w}_*-\textbf{v}\Vert ^2_\mathcal {H})\), which shows that the maximum over a subspace is attained at the point with minimal distance to the growth optimal Kelly vector. Hence, for any subspace, the line from this unique point to \(\textbf{w}_*\) is orthogonal to the subspace, and the point therefore coincides with the orthogonal projection of \(\textbf{w}_*\) onto the subspace.
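Theorem 4 can be illustrated with a small numerical sketch. The market data (`Sigma0`, `theta`) and the basis `B` of \(U_1\) are hypothetical. The function `project` uses the standard Gram-matrix formula for a projection in a non-orthogonal basis, which is a textbook linear-algebra result and not a formula taken from this section; `project_lemma3` evaluates the sum in Lemma 3 after a Gram–Schmidt step.

```python
import numpy as np

def inner(V0, x, y):
    """Inner product V_0(x, y) = x' V0 y."""
    return x @ V0 @ y

def project(V0, B, w):
    """Orthogonal projection of w onto span(columns of B), Gram-matrix form."""
    G = B.T @ V0 @ B
    return B @ np.linalg.solve(G, B.T @ V0 @ w)

def project_lemma3(V0, B, w):
    """Same projection via the Lemma 3 sum, after Gram-Schmidt in V_0."""
    vs = []
    for j in range(B.shape[1]):
        v = B[:, j].astype(float)
        for u in vs:
            v = v - inner(V0, v, u) / inner(V0, u, u) * u
        vs.append(v)
    return sum(inner(V0, w, v) / inner(V0, v, v) * v for v in vs)

def mu0(V0, w_star, w):
    """Excess logarithmic return: V_0(w_*, w) - V_0(w, w) / 2."""
    return inner(V0, w_star, w) - 0.5 * inner(V0, w, w)

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

# Hypothetical (non-orthogonal) basis of a two-dimensional subspace U_1.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0]])

w1 = project(V0, B, w_star)               # candidate for w_*[U_1]
w1_alt = project_lemma3(V0, B, w_star)    # Lemma 3 representation
norm_w1 = np.sqrt(inner(V0, w1, w1))
sharpe_w1 = inner(V0, w_star, w1) / norm_w1
```

Perturbing `w1` inside \(U_1\), for instance by `B @ np.array([0.1, -0.2])`, should never increase `mu0`, and `norm_w1` should equal `sharpe_w1`.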

Remark 1

By setting \(\textbf{b}_0[U^*_1]=\textbf{V}_0(\textbf{w}_*[U_1])\) one notes, from Lemma 3, that

$$\begin{aligned} \textbf{b}_0[U^*_1](\textbf{v})=\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{w}_*),\textbf{v})=\textbf{V}_0(\textbf{w}_*,\textbf{P}_{0\vert U_1}(\textbf{v}))=\textbf{b}_0(\textbf{v}),\quad \textbf{v}\in \mathcal {H}_1\subseteq \mathcal {H}. \end{aligned}$$

The interpretation is that \(\textbf{b}_0[U^*_1]\) can be expressed in any dual basis spanning \(U_1^*\), while \(\textbf{b}_0\) must be expanded in a dual basis spanning \(U^*\). Similarly, we sometimes write \(\textbf{V}_0[U^*_1]\) when we wish to emphasize that the inner product can be expanded using a dual basis spanning \(U^*_1\). While the components of the expansions change for every chosen basis, it is important to remember that the mapping to the real numbers does not. Consequently, we often omit the reference to the subspace for ease of readability.

We further see that we can regard the Hilbert space \(\mathcal {H}\), corresponding to the investable assets for a given investor, as a subspace of the Hilbert space \(\mathcal {\bar{H}}=(\bar{U},\textbf{V}_0)\) representing all the world’s assets. What this means is that when analyzing optimal portfolio allocations, for a particular investor, we only have to consider the covariance structure of the investable assets for that investor. This follows as, restricted to a subspace \(U_1\), we only need to find the components of \(\textbf{V}_0\) for a given basis (meaning the investable assets) of \(U_1\). It is quite remarkable that we can equate the growth optimal Kelly vector on any subspace with a projection of the worldwide growth optimal Kelly vector. This feature further implies that a growth optimal Kelly vector can be expressed as a nested sequence of projections \(\textbf{w}_*[U_K]=\textbf{P}_{0\vert U_K}\cdots \textbf{P}_{0\vert U_1}(\textbf{w}_*)\), \(U_K\subset \cdots \subset U_1\). The financial interpretation of such a nested sequence is best understood by a simple application of the Cauchy–Schwarz inequality, stating that \(\Vert \textbf{P}_{0\vert U_k}(\textbf{w})\Vert _\mathcal {H}\le \Vert \textbf{w}\Vert _\mathcal {H}\) for all \(\textbf{w}\in \mathcal {H}\). Consequently, \(\Vert \textbf{w}_*[U_k]\Vert _\mathcal {H}\le \Vert \textbf{w}_*[U_{k-1}]\Vert _\mathcal {H}\), which implies (see Theorem 4) that \(\textbf{s} _0(\textbf{w}_*[U_k])\le \textbf{s} _0(\textbf{w}_*[U_{k-1}])\). Hence, each time we reduce the dimension of the investable assets, for instance, by replacing some assets by a mutual fund, the maximal instantaneous Sharpe ratio cannot increase.
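The nested-projection property and the resulting ordering of maximal Sharpe ratios can be sketched numerically. All inputs are hypothetical: `B1` spans a two-dimensional subspace \(U_1\), and `B2` is a single mutual fund holding equal weights in the two basis strategies of \(U_1\), so that \(U_2\subset U_1\). The Gram-matrix projection formula is a standard result and is an assumption relative to this section.

```python
import numpy as np

def project(V0, B, w):
    """Orthogonal projection of w onto span(columns of B) with respect to V0."""
    G = B.T @ V0 @ B
    return B @ np.linalg.solve(G, B.T @ V0 @ w)

def norm_H(V0, w):
    """Norm induced by the inner product V_0."""
    return np.sqrt(w @ V0 @ w)

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

B1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, -1.0]])               # basis of U_1
B2 = B1 @ np.array([[0.5], [0.5]])         # one mutual fund inside U_1, spans U_2

w1 = project(V0, B1, w_star)               # w_*[U_1]
w2_nested = project(V0, B2, w1)            # P_{U_2} P_{U_1} w_*
w2_direct = project(V0, B2, w_star)        # P_{U_2} w_*

# By Theorem 4 the norms below are the maximal Sharpe ratios on U, U_1 and U_2.
sharpes = [norm_H(V0, w_star), norm_H(V0, w1), norm_H(V0, w2_direct)]
print(sharpes)
```

The two ways of computing the Kelly vector on \(U_2\) should agree, and the printed list should be non-increasing.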

3.3 Level sets, correlations, and reflections

Modern portfolio theory is largely based on the geometric principle that the level sets of the instantaneous Sharpe ratio are cones. In other words, if we set \(C_s=\lbrace \textbf{w}\in \mathbb {R}^N:\textbf{s}_0(\textbf{w})=s\rbrace \), then for each \(\textbf{w}\in C_s\), and positive scalar \(\lambda >0\), we have \(\lambda \textbf{w}\in C_s\). Markowitz (1952) and Tobin (1958) used this property to derive the well-known efficient mean–variance frontier, characterized by the set of trading strategies for which the Sharpe ratio is maximal. Kelly (1956) and Latané (1959), however, argued that by leveraging too hard (that is, using too high a \(\lambda \)) the logarithmic excess return, as opposed to the excess return, eventually becomes negative. Following Bermin and Holm (2021b), we illustrate this feature using the concept of relative leverage (or drawdown) risk

$$\begin{aligned} \textbf{b}_0(\textbf{w})=\frac{1}{\textbf{k}_0(\textbf{w})}\varvec{\sigma }^2_0(\textbf{w}),\quad \varvec{\mu }_0(\textbf{w})=\left( \frac{1}{\textbf{k}_0(\textbf{w})}-\frac{1}{2}\right) \varvec{\sigma }^2_0(\textbf{w}). \end{aligned}$$
(9)

As can be seen, the instantaneous excess return is strictly positive if and only if \(\textbf{k}_0(\textbf{w})>0\), while the instantaneous logarithmic excess return is strictly positive if and only if \(0<\textbf{k}_0(\textbf{w})<2\).

In order to visualize the framework geometrically, we first claim that, for \(\textbf{w}\in \mathcal {H}\), the level sets of \(\varvec{\sigma }_0(\textbf{w})\), \(\varvec{\mu }_0(\textbf{w})\) and \(\textbf{k}_0(\textbf{w})\) are spheres of dimension \(N-1\), while the level sets of \(\textbf{b}_0(\textbf{w})\) are hyperplanes of dimension \(N-1\). Furthermore, there exists a sphere of dimension \(N-2\), for which all trading strategies are equivalent with respect to the quantities just mentioned.

Proposition 5

The various level sets can be characterized as follows:

$$\begin{array}{llll} \hbox {Level} & \hbox {Topology} & \hbox {Center} & \hbox {Radius}\\ \textbf{b}_0(\textbf{w})=b & \mathbb {R}^{N-1} & \frac{b}{\textbf{s}^2_*}\textbf{w}_* & -\\ \sigma _0(\textbf{w})=\sigma & S^{N-1} & 0 & \sigma \\ \mu _0(\textbf{w})=\mu & S^{N-1} & \textbf{w}_* & \textbf{s}_*\sqrt{1-\frac{2\mu }{\textbf{s}^2_*}}\\ \textbf{k}_0(\textbf{w})=k & S^{N-1} & \frac{1}{2}k\textbf{w}_* & \frac{1}{2}k\textbf{s}_*\\ \begin{array}{l} \textbf{b}_0(\textbf{w})=b\\ \varvec{\sigma }_0(\textbf{w})=\sigma \end{array} & S^{N-2} & \frac{b}{\textbf{s}^2_*}\textbf{w}_* & \sigma \sqrt{1-\left( \frac{b}{\sigma \textbf{s}_*}\right) ^2} \end{array}$$

where we have set \(\textbf{s}_*=\textbf{s}_0(\textbf{w}_*)\) for convenience. Note further that the joint level sets of \((\textbf{b}_0,\sigma _0)\) imply level sets for \((\mu _0,\textbf{k}_0)\).

Proof

We, unconventionally, express the quantities using the norm on \(\mathcal {H}\) according to

$$\begin{aligned} \varvec{\sigma }_0^2(\textbf{w})&=\Vert \textbf{w}\Vert ^2_\mathcal {H},\quad \varvec{\mu }_0(\textbf{w})=\frac{1}{2}\Vert \textbf{w}_*\Vert _\mathcal {H}^2-\frac{1}{2}\Vert \textbf{w}-\textbf{w}_*\Vert ^2_\mathcal {H},\\ \textbf{k}_0^2(\textbf{w})&=\frac{4}{\Vert \textbf{w}_*\Vert _\mathcal {H}^2}\Vert \textbf{w}-\frac{1}{2}\textbf{k}_0(\textbf{w})\textbf{w}_*\Vert ^2_\mathcal {H}. \end{aligned}$$

We also note from Lemma 3 that

$$\begin{aligned} \textbf{w}_\parallel =\textbf{P}_{0\vert {{\,\textrm{span}\,}}(\textbf{w}_*)}(\textbf{w})=\frac{\textbf{V}_0(\textbf{w},\textbf{w}_*)}{\textbf{V}_0(\textbf{w}_*,\textbf{w}_*)}\textbf{w}_*=\frac{\textbf{b}_0(\textbf{w})}{\Vert \textbf{w}_*\Vert _\mathcal {H}^2}\textbf{w}_*, \end{aligned}$$

such that, with \(\textbf{w}_\perp =\textbf{w}-\textbf{w}_\parallel \), we have

$$\begin{aligned} \Vert \textbf{w}_\perp \Vert ^2_\mathcal {H}=\Vert \textbf{w}-\textbf{w}_\parallel \Vert ^2_\mathcal {H}=\sigma _0^2(\textbf{w})-\frac{\textbf{b}^2_0(\textbf{w})}{\Vert \textbf{w}_*\Vert _\mathcal {H}^2}. \end{aligned}$$

Finally, we identify the center point and the radius of the expressions. We also recall that the norm of the growth optimal Kelly vector equals its instantaneous Sharpe ratio as shown in Theorem 4. The proof concludes from the observation that \((\mu _0,\textbf{k}_0)\) can be expressed in terms of \((\textbf{b}_0,\sigma _0)\). \(\square \)
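The centers and radii in Proposition 5 can be verified for a single trading strategy. The sketch below uses hypothetical market data and a hypothetical strategy `w`; it computes \(\textbf{b}_0\), \(\varvec{\sigma }_0\), \(\varvec{\mu }_0\) and \(\textbf{k}_0\) from Eq. (9) and compares the distances from `w` to each center with the stated radii.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)
s_star = np.sqrt(w_star @ V0 @ w_star)     # s_* = s_0(w_*)

def dist_H(x, y):
    """Distance between x and y in the norm induced by V_0."""
    d = x - y
    return np.sqrt(d @ V0 @ d)

w = np.array([1.0, -0.5, 2.0])             # hypothetical trading strategy
b = w_star @ V0 @ w                        # excess return b_0(w)
sigma = np.sqrt(w @ V0 @ w)                # volatility sigma_0(w)
mu = b - 0.5 * sigma**2                    # logarithmic excess return mu_0(w)
k = sigma**2 / b                           # relative leverage risk, from Eq. (9)

# Distances from w to the centers listed in Proposition 5.
radius_mu = dist_H(w, w_star)
radius_k = dist_H(w, 0.5 * k * w_star)
center_b = b / s_star**2 * w_star
radius_joint = dist_H(w, center_b)

# Radii predicted by Proposition 5 (abs(k) covers a possibly negative k).
pred_mu = s_star * np.sqrt(1.0 - 2.0 * mu / s_star**2)
pred_k = 0.5 * abs(k) * s_star
pred_joint = sigma * np.sqrt(1.0 - (b / (sigma * s_star))**2)
```

Each computed distance should match its predicted radius, and `w - center_b` should be orthogonal to `w_star`, which is the hyperplane statement for \(\textbf{b}_0\).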

The importance of the growth optimal Kelly vector can be explained from the observation that the instantaneous excess return is invariant with respect to trading strategies orthogonal to \(\textbf{w}_*\). That is, with \(\textbf{v}=\textbf{v}_\parallel +\textbf{v}_\perp \), where \(\textbf{v}_\parallel \) and \(\textbf{w}_*\) are collinear while \(\textbf{v}_\perp \) and \(\textbf{w}_*\) are perpendicular, one sees that

$$\begin{aligned} \textbf{b}_0(\textbf{v})=\textbf{V}_0(\textbf{w}_*,\textbf{v}_\parallel +\textbf{v}_\perp )=\textbf{V}_0(\textbf{w}_*,\textbf{v}_\parallel )+\textbf{V}_0(\textbf{w}_*,\textbf{v}_\perp )=\textbf{b}_0(\textbf{v}_\parallel ). \end{aligned}$$

However, since the volatility increases with \(\textbf{v}_\perp \), through the formula \(\varvec{\sigma }^2_0(\textbf{v})=\varvec{\sigma }^2_0(\textbf{v}_\parallel )+\varvec{\sigma }^2_0(\textbf{v}_\perp )\), it is clear that the instantaneous Sharpe ratio is maximal for trading strategies collinear to \(\textbf{w}_*\). Hence, as can be seen from Fig. 1, the instantaneous efficient mean–variance frontier (minimal variance for a fixed excess return) consists of all vectors collinear to the growth optimal Kelly vector. These vectors can, however, equally be represented by different constrained optimization problems, such as minimal relative leverage risk for a fixed (logarithmic) excess return, to give an example. Thus, the risk measures introduced in Bermin and Holm (2021b) fit well with the (local) mean–variance approach.

Fig. 1 This figure shows the vectors \(\textbf{v}=k\hat{{\textbf {v}}}\) and \(\textbf{v}_r=k\hat{{\textbf {v}}}_r\), where the latter is a reflection of the first through the line spanned by the growth optimal Kelly vector \(\textbf{w}_*\). We also highlight the level sets of \(\textbf{k}_0\) (black), \(\textbf{b}_0\) (blue), \(\varvec{\mu }_0\) (green) and those of \(\varvec{\sigma }_0\) (red)

We proceed by considering the projection of the growth optimal Kelly vector on the subspace spanned by a single vector \(\textbf{v}\). By the use of Eq. (4) and Lemma 3, we define

$$\begin{aligned} \hat{\textbf{v}}=\frac{1}{\textbf{k}_0( \textbf{v})} \textbf{v}=\frac{\textbf{V}_0(\textbf{w}_*, \textbf{v})}{\textbf{V}_0(\textbf{v}, \textbf{v})}\textbf{v}=\textbf{P}_{0\vert {{\,\textrm{span}\,}}(\textbf{v})}(\textbf{w}_*)=\textbf{w}_{*}[{{\,\textrm{span}\,}}( \textbf{v})]. \end{aligned}$$
(10)

Bermin and Holm (2021b) call trading strategies generated in this way generalized growth optimal Kelly strategies and show that these strategies have the same relative drawdown/leverage risk as the growth optimal Kelly strategy. Their proof is a direct consequence of the simple relationships \(\textbf{k}_0(\lambda \textbf{v})=\lambda \textbf{k}_0(\textbf{v})\) and \(\textbf{k}_0(\textbf{w}_*)=1\). Since \(\hat{{\textbf {v}}} \) is the orthogonal projection of the growth optimal Kelly vector onto \(U_1={{\,\textrm{span}\,}}(\textbf{v})\), the vector \(\textbf{w}_*-\hat{{\textbf {v}}} \) is further perpendicular to \(\hat{{\textbf {v}}} \). Consequently, as shown in Fig. 1, the angle between the vectors \(\textbf{w}_*\) and \(\hat{\textbf{v}}\) satisfies \(\cos {\varphi _{\textbf{w}_*,\hat{{\textbf {v}}} }}=\Vert \hat{{\textbf {v}}} \Vert _\mathcal {H}/\Vert \textbf{w}_*\Vert _\mathcal {H}\). The financial interpretation of the angle between vectors is the correlation and through the relationship \(\varvec{\rho }_0(\textbf{v},\textbf{w})=\cos {\varphi _{\textbf{v},\textbf{w}}}\) we obtain Roll’s result, see Roll (1977), that any efficient trading strategy (i.e., collinear to the growth optimal Kelly vector) satisfies the CAPM equation.

Theorem 6

For \(\mathcal {H}_1=(U_1,\textbf{V}_0)\) let \(\textbf{v}\in \mathcal {H}_1\subseteq \mathcal {H}\). Then,

$$\begin{aligned} \textbf{s}_0(\textbf{v})=\varvec{\rho }_0(\textbf{v},\lambda \textbf{w}_*[U_1])\textbf{s}_0(\lambda \textbf{w}_*[U_1]),\quad \lambda >0. \end{aligned}$$

Proof

Without loss of generality, we set \(\lambda =1\). Lemma 3 then yields

$$\begin{aligned} \varvec{\rho }_0(\textbf{v},\textbf{w}_*[U_1])=\frac{\textbf{V}_0(\textbf{v},\textbf{P}_{0\vert U_1}(\textbf{w}_*))}{\Vert \textbf{v}\Vert _\mathcal {H}\Vert \textbf{w}_*[U_1]\Vert _\mathcal {H}}=\frac{\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{v}),\textbf{w}_*)}{\Vert \textbf{v}\Vert _\mathcal {H}\Vert \textbf{w}_*[U_1]\Vert _\mathcal {H}}=\frac{\textbf{b}_0(\textbf{P}_{0\vert U_1}(\textbf{v}))}{\sigma _0(\textbf{v})\Vert \textbf{w}_*[U_1]\Vert _\mathcal {H}}. \end{aligned}$$

Hence, for \(\textbf{v}\in \mathcal {H}_1\subseteq \mathcal {H}\), the proof concludes by the use of Theorem 4. \(\square \)
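A short numerical sketch of Theorem 6 follows. The market data, the basis `B` of \(U_1\), the strategy `v` and the leverage `lam` are all hypothetical, and the projection is computed with the standard Gram-matrix formula, which is an assumption relative to this section.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

def project(B, w):
    """Orthogonal projection of w onto span(columns of B) with respect to V0."""
    G = B.T @ V0 @ B
    return B @ np.linalg.solve(G, B.T @ V0 @ w)

def sharpe(w):
    """Instantaneous Sharpe ratio s_0(w) = b_0(w) / sigma_0(w)."""
    return (w_star @ V0 @ w) / np.sqrt(w @ V0 @ w)

def corr(x, y):
    """Instantaneous correlation rho_0(x, y)."""
    return (x @ V0 @ y) / np.sqrt((x @ V0 @ x) * (y @ V0 @ y))

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0]])                # basis of U_1
w1 = project(B, w_star)                    # w_*[U_1]
v = B @ np.array([0.7, -0.2])              # arbitrary strategy in U_1
lam = 2.5                                  # arbitrary positive leverage

lhs = sharpe(v)
rhs = corr(v, lam * w1) * sharpe(lam * w1)
print(lhs, rhs)
```

The two printed values should coincide for any positive `lam`, and `abs(lhs)` should not exceed `sharpe(w1)`.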

Since the correlation between any vector and the growth optimal Kelly vector equals the ratio of their instantaneous Sharpe ratios, we again see that \(\vert \textbf{s}_0(\textbf{v})\vert \le \textbf{s}_0(\textbf{w}_*[U_1])\) for any vector \(\textbf{v}\in U_1\). Furthermore, the correlation is in fact bounded by the various Sharpe ratios as shown below.

Corollary 7

For \(\mathcal {H}_1=(U_1,\textbf{V}_0)\) let \(\textbf{v}, \textbf{w} \in \mathcal {H}_1\subseteq \mathcal {H}\). Then

$$\begin{aligned} \left| \rho _0(\textbf{v},\textbf{w}) - \frac{\textbf{s}_0(\textbf{v})\textbf{s}_0(\textbf{w})}{\textbf{s}^2_0(\textbf{w}_*[U_1])} \right| \le \sqrt{\left( 1 -\frac{\textbf{s}^2_0(\textbf{v})}{\textbf{s}^2_0(\textbf{w}_*[U_1])} \right) \left( 1-\frac{\textbf{s}^2_0(\textbf{w})}{\textbf{s}^2_0(\textbf{w}_*[U_1])} \right) }, \end{aligned}$$

with equality if \(\dim (U_1)=2\).

Proof

Define \(\textbf{v}_{\perp } = \textbf{v} - \textbf{P}_{0\vert {{\,\textrm{span}\,}}(\textbf{w}_*[U_1])}(\textbf{v})\) and \(\textbf{w}_{\perp } = \textbf{w} - \textbf{P}_{0\vert {{\,\textrm{span}\,}}(\textbf{w}_*[U_1])}(\textbf{w})\). Direct calculations, using Lemma 3, then show that

$$\begin{aligned} \textbf{V}_0(\textbf{v}_\perp ,\textbf{w}_\perp )=\sigma _0(\textbf{v})\sigma _0(\textbf{w}) \left( \rho _0(\textbf{v},\textbf{w})-\rho _0(\textbf{w}_*[U_1],\textbf{v})\rho _0(\textbf{w}_*[U_1],\textbf{w})\right) , \end{aligned}$$

which yields

$$\begin{aligned} \rho _0(\textbf{v}_\perp ,\textbf{w}_\perp )=\frac{\rho _0(\textbf{v},\textbf{w})-\rho _0(\textbf{w}_*[U_1],\textbf{v})\rho _0(\textbf{w}_*[U_1],\textbf{w})}{\sqrt{1-\rho ^2_0(\textbf{w}_*[U_1],\textbf{v})}\sqrt{1-\rho ^2_0(\textbf{w}_*[U_1],\textbf{w})}}. \end{aligned}$$

Since \(\vert \rho _0(\textbf{v}_\perp ,\textbf{w}_\perp )\vert \le 1\) the first part of the proof follows from Theorem 6.

We further note that if \(\dim (U_1)=2\) then \(\textbf{v},\textbf{w}\) and \(\textbf{w}_*[U_1]\) lie in the same plane. This means that the angle \(\varphi _{\textbf{v},\textbf{w}}=\varphi _{\textbf{w}_*[U_1],\textbf{v}}+\varphi _{\textbf{w}_*[U_1],\textbf{w}}\), or \(\varphi _{\textbf{v},\textbf{w}}=2\pi -\varphi _{\textbf{w}_*[U_1],\textbf{v}}-\varphi _{\textbf{w}_*[U_1],\textbf{w}}\), or \(\varphi _{\textbf{v},\textbf{w}}=\pm (\varphi _{\textbf{w}_*[U_1],\textbf{v}}-\varphi _{\textbf{w}_*[U_1],\textbf{w}})\), such that \(\varphi _{\textbf{v},\textbf{w}}\in [0,\pi ]\). By inspecting each case, we find that

$$\begin{aligned} \cos \varphi _{\textbf{v},\textbf{w}}=\cos \varphi _{\textbf{w}_*[U_1],\textbf{v}}\cos \varphi _{\textbf{w}_*[U_1],\textbf{w}}\mp \sin \varphi _{\textbf{w}_*[U_1],\textbf{v}}\sin \varphi _{\textbf{w}_*[U_1],\textbf{w}}, \end{aligned}$$

where the sign preceding the sine functions is negative for the first two representations of \(\varphi _{\textbf{v},\textbf{w}}\) and positive for the latter two. The proof now follows from Theorem 6. \(\square \)
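The bound in Corollary 7, and the equality case for a two-dimensional subspace, can be illustrated numerically. The sketch below is a hedged example with hypothetical data: `gap_and_bound` returns the left-hand and right-hand sides of the inequality for a subspace spanned by the columns of `B`.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

def project(B, w):
    """Orthogonal projection of w onto span(columns of B) with respect to V0."""
    G = B.T @ V0 @ B
    return B @ np.linalg.solve(G, B.T @ V0 @ w)

def sharpe(w):
    return (w_star @ V0 @ w) / np.sqrt(w @ V0 @ w)

def corr(x, y):
    return (x @ V0 @ y) / np.sqrt((x @ V0 @ x) * (y @ V0 @ y))

def gap_and_bound(B, v, w):
    """Left- and right-hand sides of Corollary 7 on U_1 = span(columns of B)."""
    s1 = sharpe(project(B, w_star))        # maximal Sharpe ratio on U_1
    rv, rw = sharpe(v) / s1, sharpe(w) / s1
    gap = abs(corr(v, w) - rv * rw)
    bound = np.sqrt((1.0 - rv**2) * (1.0 - rw**2))
    return gap, bound

# Case dim(U_1) = 3: inequality only.
B3 = np.eye(3)
gap3, bound3 = gap_and_bound(B3, np.array([1.0, 0.2, -0.3]),
                             np.array([-0.4, 1.0, 0.5]))

# Case dim(U_1) = 2: the bound is attained.
B2 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, -1.0]])
gap2, bound2 = gap_and_bound(B2, B2 @ np.array([0.7, -0.2]),
                             B2 @ np.array([-0.3, 1.1]))
```

In the three-dimensional case `gap3` should not exceed `bound3`, while in the two-dimensional case `gap2` and `bound2` should agree up to rounding.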

In Fig. 1, we also plot the reflection of the vector \(\textbf{v}\) with respect to the growth optimal Kelly vector. Hence, by setting \(\textbf{v}_r=\textbf{v}_\parallel -\textbf{v}_\perp \), such that

$$\begin{aligned} \textbf{v}_r=2\textbf{v}_\parallel -\textbf{v}=2\textbf{P}_{0\vert {{\,\textrm{span}\,}}(\textbf{w}_*)}(\textbf{v})-\textbf{v}=2\frac{\textbf{V}_0(\textbf{w}_*,\textbf{v})}{\textbf{V}_0(\textbf{w}_*,\textbf{w}_*)}\textbf{w}_*-\textbf{v}, \end{aligned}$$

straightforward calculations yield

$$\begin{aligned} \textbf{b}_0(\textbf{v}_r)=\textbf{b}_0(\textbf{v}),\quad \textbf{V}_0(\textbf{v}_r,\textbf{v}_r)=\textbf{V}_0(\textbf{v},\textbf{v}). \end{aligned}$$

From these expressions, it follows that the instantaneous excess logarithmic return, Sharpe ratio and relative drawdown/leverage risk are also invariant when the vector \(\textbf{v}\) is replaced by the reflection vector \(\textbf{v}_r\). The fact that we can, in general, identify two distinct trading strategies with identical local characteristics is a result of great importance in order to fully understand the widely used mean–variance framework. By construction, we also note that

$$\begin{aligned} \textbf{V}_0(\textbf{v}_r,\textbf{w})=\textbf{V}_0(\textbf{w}_r,\textbf{v})=2\frac{\textbf{V}_0(\textbf{w}_*,\textbf{v})\textbf{V}_0(\textbf{w}_*,\textbf{w})}{\textbf{V}_0(\textbf{w}_*,\textbf{w}_*)}-\textbf{V}_0(\textbf{v},\textbf{w}), \end{aligned}$$

which, together with Theorems 4 and 6, implies the identity

$$\begin{aligned} \varvec{\rho }_0(\textbf{v}_r,\textbf{w})=\varvec{\rho }_0(\textbf{w}_r,\textbf{v})=2\frac{\textbf{s}_0(\textbf{v})\textbf{s}_0(\textbf{w})}{\textbf{s}^2_0(\textbf{w}_*)}-\varvec{\rho }_0(\textbf{v},\textbf{w}). \end{aligned}$$

Hence, for every pair of correlated trading strategies \((\textbf{v},\textbf{w})\) we can always find new pairs \((\textbf{v}_r,\textbf{w})\) and \((\textbf{v},\textbf{w}_r)\) with modified correlation but with otherwise identical characteristics, see Bermin and Holm (2023) for additional details related to this result. Note also that in the particular case where \(\textbf{w}=\textbf{v}\), we obtain

$$\begin{aligned} \varvec{\rho }_0(\textbf{v}_r,\textbf{v})=2\varvec{\rho }_0^2(\textbf{w}_*,\textbf{v})-1=2\cos ^2\varphi _{\textbf{w}_*,\textbf{v}}-1=\cos 2\varphi _{\textbf{w}_*,\textbf{v}}, \end{aligned}$$

which confirms that the angle \(\varphi _{\textbf{v}_r,\textbf{v}}=2\varphi _{\textbf{w}_*,\textbf{v}}\) as illustrated in Fig. 1.
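The reflection identities above lend themselves to a direct numerical check. In this sketch the market data and the strategies `v` and `w` are hypothetical; `reflect` implements the formula for \(\textbf{v}_r\) given above.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)
s_star2 = w_star @ V0 @ w_star             # squared Sharpe ratio of w_*

def b0(x):
    """Excess return b_0(x) = V_0(w_*, x)."""
    return w_star @ V0 @ x

def sharpe(x):
    return b0(x) / np.sqrt(x @ V0 @ x)

def corr(x, y):
    return (x @ V0 @ y) / np.sqrt((x @ V0 @ x) * (y @ V0 @ y))

def reflect(x):
    """Reflection of x through the line spanned by w_*: 2 x_parallel - x."""
    return 2.0 * b0(x) / s_star2 * w_star - x

v = np.array([1.0, 0.2, -0.3])             # hypothetical strategies
w = np.array([-0.4, 1.0, 0.5])
v_r, w_r = reflect(v), reflect(w)

rho_vr_w = corr(v_r, w)
rho_pred = 2.0 * sharpe(v) * sharpe(w) / s_star2 - corr(v, w)
rho_vr_v = corr(v_r, v)
cos_2phi = 2.0 * corr(w_star, v)**2 - 1.0
```

Excess return and variance should be unchanged by the reflection, while the correlation with another strategy shifts as predicted.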

We have shown that the only trading strategies which are locally unique, in the sense mentioned above, are the so-called (fractional) Kelly strategies, \(\textbf{w}=k\textbf{w}_*\), first introduced in MacLean et al. (1992). For these trading strategies, characterized by having maximal instantaneous squared Sharpe ratio, one easily verifies that

$$\begin{aligned} \varvec{\mu }_0(k\textbf{w}_*)=\frac{1}{2}k(2-k)\textbf{s}^2_0(\textbf{w}_*),\quad \varvec{\sigma }^2_0(k\textbf{w}_*)=k^2\textbf{s}^2_0(\textbf{w}_*). \end{aligned}$$
(11)

Consequently, we notice that a Kelly strategy is efficient if the relative leverage risk \(\textbf{k}_0(k\textbf{w}_*)=k\in [0,1]\), since otherwise we can always lower the volatility without reducing the logarithmic excess return, see also Bermin and Holm (2023) and Davis and Lleo (2021). Finally, we briefly discuss how the geometric framework can be used to visualize trade-offs between risk and return. For instance, from Fig. 1 we deduce how to lower the relative leverage/drawdown risk of an arbitrary trading strategy, at no cost to the logarithmic excess return, by employing an efficient Kelly strategy. We illustrate the approach by calculating the fraction \(k\in [0,1]\) such that \(\varvec{\mu }_0(k\textbf{w}_*)=\varvec{\mu }_0({\hat{{\textbf {v}}}})\), using geometric principles only. One sees that the radius of the circle describing the level sets of the logarithmic excess return can be expressed in two different ways: \(\sin \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}}\Vert \textbf{w}_*\Vert _\mathcal {H}\) and \((1-k)\Vert \textbf{w}_*\Vert _\mathcal {H}\). Hence, with

$$\begin{aligned} k=1-\sin \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}}=1-\sqrt{1-\cos ^2\varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}}}=1-\sqrt{1-\varvec{\rho }^2_0(\textbf{w}_*,{\hat{{\textbf {v}}}})}, \end{aligned}$$

the relative leverage/drawdown risk is reduced from \(\textbf{k}_0({\hat{{\textbf {v}}}})=1\) to \(\textbf{k}_0(k\textbf{w}_*)=k\le 1\), without affecting the excess logarithmic return. In much the same way it follows that, for a fixed logarithmic excess return, the trading strategies with lowest volatility are the Kelly strategies. This observation is a direct consequence of Proposition 5, stating that the level sets of the volatility are spheres centered at the origin.
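The geometric construction of the fraction \(k\) can be reproduced numerically. The following sketch uses hypothetical market data and a hypothetical strategy `v`; it forms \(\hat{\textbf{v}}\) via Eq. (10), computes \(k\) from the correlation with \(\textbf{w}_*\), and compares the two strategies.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

def b0(x):
    return w_star @ V0 @ x                 # excess return

def var0(x):
    return x @ V0 @ x                      # variance sigma_0^2

def mu0(x):
    return b0(x) - 0.5 * var0(x)           # logarithmic excess return

def corr(x, y):
    return (x @ V0 @ y) / np.sqrt(var0(x) * var0(y))

v = np.array([1.0, 0.2, -0.3])             # hypothetical trading strategy
v_hat = b0(v) / var0(v) * v                # Eq. (10): projection of w_* on span(v)

rho = corr(w_star, v_hat)
k = 1.0 - np.sqrt(1.0 - rho**2)            # fraction of the Kelly strategy

k0_v_hat = var0(v_hat) / b0(v_hat)         # relative leverage risk of v_hat
mu_v_hat = mu0(v_hat)
mu_kelly = mu0(k * w_star)
vol_v_hat = np.sqrt(var0(v_hat))
vol_kelly = np.sqrt(var0(k * w_star))
```

Under these assumptions the fractional Kelly strategy `k * w_star` should deliver the same logarithmic excess return as `v_hat`, with a volatility no larger than that of `v_hat`.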

4 Risk-adjusted returns

We are now ready to apply the geometric approach to study the concept of risk-adjusted returns. That is, we quantify the excess (logarithmic) return an investor can achieve by augmenting the opportunity set and, at the same time, we provide geometric interpretations of Jensen’s alpha and the beta parameter. While these quantities are considered fundamental by many portfolio managers, the amount of information they carry is rather limited. In fact, as pointed out in Eqs. (7) and (8), the only information contained in Jensen’s alpha is the sign, indicating whether to add a long or short infinitesimal position in an asset to an existing portfolio in order to increase the instantaneous Sharpe ratio. However, the knowledge of alpha and beta is, by itself, not enough to determine how to form the portfolio with maximal instantaneous Sharpe ratio. Alpha and beta fail to be self-contained because of the easily verifiable scaling properties

$$\begin{aligned} \alpha _{0}(\lambda _1 \textbf{w}_1, \lambda _2 \textbf{w}_2) = \lambda _1 \alpha _0(\textbf{w}_1, \textbf{w}_2), \quad \beta _{0}(\lambda _1 \textbf{w}_1, \lambda _2 \textbf{w}_2) = \frac{\lambda _1}{\lambda _2} \beta _{0}(\textbf{w}_1, \textbf{w}_2). \end{aligned}$$

In other words, given arbitrary trading and reference strategies, represented by \((\textbf{w}_1,\textbf{w}_2)\), we can apply leverage \((\lambda _1 \textbf{w}_1,\lambda _2 \textbf{w}_2)\) to achieve any targeted alpha and beta. Moreover, even when the reference strategy is held fixed, \(\lambda _2=1\), one sees that Jensen’s alpha scales similarly to the instantaneous excess return \(\textbf{b}_0\), thus ignoring the fact that by increasing the leverage the instantaneous logarithmic excess return \(\varvec{\mu }_0\) eventually turns negative. We therefore suggest a slightly modified measure, which we call the orthogonal Sharpe ratio, that instead quantifies how much the instantaneous Sharpe ratio can be improved.
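The scaling properties can be made concrete with a small sketch. It assumes the standard definitions \(\varvec{\beta }_0(\textbf{w}_1,\textbf{w}_2)=\textbf{V}_0(\textbf{w}_1,\textbf{w}_2)/\textbf{V}_0(\textbf{w}_2,\textbf{w}_2)\) and \(\alpha _0(\textbf{w}_1,\textbf{w}_2)=\textbf{b}_0(\textbf{w}_1)-\varvec{\beta }_0(\textbf{w}_1,\textbf{w}_2)\textbf{b}_0(\textbf{w}_2)\); the latter is an assumption here, since the defining equations lie outside this section. Market data, strategies and leverage factors are hypothetical.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

def b0(x):
    return w_star @ V0 @ x                 # excess return

def mu0(x):
    return b0(x) - 0.5 * (x @ V0 @ x)      # logarithmic excess return

def beta(x, y):
    """Beta of x with respect to the reference strategy y."""
    return (x @ V0 @ y) / (y @ V0 @ y)

def alpha(x, y):
    """Jensen's alpha under the assumed standard definition."""
    return b0(x) - beta(x, y) * b0(y)

w1 = np.array([1.0, 0.5, 0.2])             # trading strategy (positive excess return)
w2 = np.array([0.3, 0.3, 0.4])             # reference strategy
lam1, lam2 = 3.0, 0.5                      # arbitrary leverage factors

alpha_scaled = alpha(lam1 * w1, lam2 * w2)
beta_scaled = beta(lam1 * w1, lam2 * w2)

# Leverage beyond which the logarithmic excess return of w1 is negative.
lam_crit = 2.0 * b0(w1) / (w1 @ V0 @ w1)
mu_overlevered = mu0((lam_crit + 1.0) * w1)
```

Alpha grows linearly in `lam1` regardless of `lam2`, whereas `mu_overlevered` is negative once the leverage exceeds `lam_crit`, which is the effect that alpha ignores.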

In order to formulate our approach we first introduce some terminology. Given two subspaces \(U_1,U_2\) with trivial intersection, \(U_1 \cap U_2=\lbrace \textbf{0}\rbrace \), we let \(U_1\oplus U_2\) denote the direct sum and recall the similar concept for Hilbert spaces

$$\begin{aligned} \mathcal {H}_1\oplus \mathcal {H}_2=\left( U_1,\textbf{V}_{0}\left[ U^*_1\right] \right) \oplus \left( U_2,\textbf{V}_{0}\left[ U^*_2\right] \right) =\left( U_1\oplus U_2,\textbf{V}_{0}\left[ U^*_1\right] \oplus \textbf{V}_{0}\left[ U^*_2\right] \right) . \end{aligned}$$

Hence, \(\mathcal {H}=\mathcal {H}_1\oplus \mathcal {H}_2\) if \(U=U_1\oplus U_2\) and \(U_1\perp U_2\). As mentioned in Remark 1, there is no real conceptual gain in explicitly expressing the space for which the inner product can be expanded in some basis. Consequently, from here onward, we simply write \(\textbf{V}_0\oplus \textbf{V}_0\) unless there is ambiguity. We also write \(\textbf{w}_*\) when referring to \(\textbf{w}_*[U]\) and \(U=U_1\oplus U_2\). The following results show the importance of the Hilbert space direct sum decomposition.

Proposition 8

Let \(\mathcal {H}=\mathcal {H}_1\oplus \mathcal {H}_2\). Then

$$\begin{aligned} \textbf{w}_*&= \textbf{w}_*[U_1]+\textbf{w}_*[U_2],\\ \textbf{s}_0^2(\textbf{w}_*)&= \textbf{s}_0^2(\textbf{w}_*[U_1])+\textbf{s}_0^2(\textbf{w}_*[U_2]),\\ \textbf{b}_0(\textbf{w}_*)&= \textbf{b}_0(\textbf{w}_*[U_1])+\textbf{b}_0(\textbf{w}_*[U_2]). \end{aligned}$$

Proof

Since \(\mathcal {H}=\mathcal {H}_1\oplus \mathcal {H}_2\), there is a unique decomposition \(\textbf{w}_*=\textbf{w}_1+\textbf{w}_2\), such that \(\textbf{w}_i\in \mathcal {H}_i\). Because \(U_1\perp U_2\), we can further identify \(\textbf{w}_i\) with \(\textbf{P}_{0\vert U_i}(\textbf{w}_*)\), from which the first result follows by Theorem 4. The second result also follows from Theorem 4, since \(U_1\perp U_2\), while the third result follows from the dual representation \(\mathcal {H}^*=\mathcal {H}^*_1\oplus \mathcal {H}^*_2\). \(\square \)
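Proposition 8 can be illustrated by constructing an orthogonal decomposition explicitly. In this sketch the market data and the basis `B1` of \(U_1\) are hypothetical; \(U_2\) is taken as the orthogonal complement of \(U_1\) in \(\mathbb {R}^3\), obtained by removing from the third standard basis vector its projection onto \(U_1\). The Gram-matrix projection formula is a standard result assumed here.

```python
import numpy as np

# Hypothetical three-asset market.
Sigma0 = np.array([[0.20, 0.00, 0.00],
                   [0.06, 0.25, 0.00],
                   [0.03, -0.05, 0.30]])
theta = np.array([0.40, 0.30, 0.20])
V0 = Sigma0 @ Sigma0.T
w_star = np.linalg.solve(Sigma0.T, theta)

def project(B, w):
    """Orthogonal projection of w onto span(columns of B) with respect to V0."""
    G = B.T @ V0 @ B
    return B @ np.linalg.solve(G, B.T @ V0 @ w)

def b0(x):
    return w_star @ V0 @ x                 # excess return

def sharpe2(x):
    """Squared instantaneous Sharpe ratio."""
    return b0(x)**2 / (x @ V0 @ x)

B1 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, -1.0]])               # basis of U_1
e3 = np.array([0.0, 0.0, 1.0])             # not contained in U_1
v_perp = e3 - project(B1, e3)              # spans the complement of U_1
B2 = v_perp[:, None]                       # basis of U_2, orthogonal to U_1

w1 = project(B1, w_star)                   # w_*[U_1]
w2 = project(B2, w_star)                   # w_*[U_2]
```

With this construction the Kelly vector, the squared Sharpe ratio and the excess return should all split additively across \(U_1\) and \(U_2\).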

Consequently, for a growth optimal Kelly trader the (logarithmic) excess return related to an augmentation of the opportunity set is directly linked to finding the Hilbert space direct sum decomposition of two arbitrary (and thus not necessarily orthogonal) vector spaces \(U_1,U_2\subseteq U\). Henceforth, we let \(U^\perp _{2\vert 1}\) denote the orthogonal complement of \(U_1\) in U, while \(U^\perp _{1\vert 2}\) denotes the orthogonal complement of \(U_2\) in U, such that

$$\begin{aligned} \textbf{P}_{0\vert U_{2\vert 1}^\perp }=\textbf{1}_U-\textbf{P}_{0\vert U_1},\quad \textbf{P}_{0\vert U_{1\vert 2}^\perp }=\textbf{1}_U-\textbf{P}_{0\vert U_2}. \end{aligned}$$
(12)

This shows that \(\mathcal {H}=\mathcal {H}_1\oplus \mathcal {H}_{2\vert 1}^\perp =\mathcal {H}_{1\vert 2}^\perp \oplus \mathcal {H}_2\), where

$$\begin{aligned} \mathcal {H}_1\oplus \mathcal {H}_{2\vert 1}^\perp =\left( U_1\oplus U_{2\vert 1}^\perp ,\textbf{V}_0\oplus \textbf{V}_0\right) ,\quad \mathcal {H}_{1\vert 2}^\perp \oplus \mathcal {H}_2=\left( U_{1\vert 2}^\perp \oplus U_2,\textbf{V}_0\oplus \textbf{V}_0\right) . \end{aligned}$$

The financial interpretation is that a growth optimal Kelly trader in \(U_1\) should add the orthogonal vector \(\textbf{w}_*[U_{2\vert 1}^\perp ]\) to be growth optimal in U, while a growth optimal Kelly trader in \(U_2\) should add the orthogonal vector \(\textbf{w}_*[U_{1\vert 2}^\perp ]\). In order to establish a connection to the alpha and beta parameters, we further show that the instantaneous excess return covectors \(\textbf{b}_0[U_{2\vert 1}^{\perp *}]\) and \(\textbf{b}_0[U_{1\vert 2}^{\perp *}]\) are related to alpha, while the orthogonal projection operators \(\textbf{P}_{0\vert U_{2\vert 1}^\perp }\) and \(\textbf{P}_{0\vert U_{1\vert 2}^\perp }\) are linked to beta. We also stress that while the primary market might consist of, say, N numéraire-based assets, we generally assume that only some mutual funds are available for investment. Consequently, \(\dim (U)=\dim (U_1)+\dim (U_2)\le N\). For ease of readability, we choose to present our results in two steps: first we consider the simple case where \(\dim (U_1)=\dim (U_2) =1\) and thereafter we consider the general case. As always, most of the results carry over to higher dimensions albeit with some modifications.

4.1 Kelly solution in two dimensions

Consider a market with only two investable assets such that \(\dim (U)=2\). We stress that each asset can be thought of as a mutual fund, with positions in a much larger asset universe. Let further \(\textbf{v}_1,\textbf{v}_2\in U\) be two linearly independent vectors (each corresponding to a particular trading strategy) and set \(U_i={{\,\textrm{span}\,}}(\textbf{v}_i)\), for \(i=1,2\). Then, as \(U_1\cap U_2=\lbrace \textbf{0}\rbrace \), we have \(U=U_1\oplus U_2\). However, since the vectors \(\textbf{v}_1,\textbf{v}_2\) are typically not orthogonal, we cannot yet form the Hilbert space direct sum. For this reason, we also consider the alternative decompositions \(U=U_1\oplus U^\perp _{2\vert 1}\) and \(U=U^\perp _{1\vert 2}\oplus U_2\). It should come as no surprise that calculating the risk-adjusted quantities can be greatly simplified if we use orthogonal basis vectors but that eventually we want to represent the risk adjustments using the natural basis vectors \((\textbf{v}_1,\textbf{v}_2)\). Hence, our first goal is to construct basis vectors \((\textbf{v}_1,\textbf{v}_{2\vert 1})\) and \((\textbf{v}_{1\vert 2},\textbf{v}_2)\), where \(\textbf{v}_{2\vert 1}\) is some vector spanning \(U^\perp _{2\vert 1}\) and similarly for \(\textbf{v}_{1\vert 2}\). Below we show how to use the projection operators to construct three such sets of basis vectors and in doing so we derive a geometrical interpretation of alpha and beta.

Fig. 2
This figure shows the orthogonal decompositions \(U=U_1\oplus U^\perp _{2\vert 1}=U^\perp _{1\vert 2}\oplus U_2\) for two separate cases. The growth optimal Kelly vector satisfies \(\textbf{w}_*={\hat{{\textbf {v}}}}_1+{\hat{{\textbf {v}}}}_{2\vert 1}={\hat{{\textbf {v}}}}_{1\vert 2}+{\hat{{\textbf {v}}}}_2\), which implies a representation \(\textbf{w}_*=w^1_{*}\textbf{v}_1+w^2_{*}\textbf{v}_2\) in the non-orthogonal decomposition \(U=U_1\oplus U_2\). We use the notations: \(\rho _{1,2}=\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\), \(\beta _1^2=\varvec{\beta }_0(\textbf{v}_1,\textbf{v}_2)\), \(\beta _2^1=\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\) and also highlight the level sets of \(\textbf{k}_0(\textbf{w})=k\), for \(k\in \lbrace 1,2,\pm \infty \rbrace \).

Given the non-orthogonal natural basis \((\textbf{v}_1,\textbf{v}_2)\), we recall the canonical dual basis \((\textbf{v}^1,\textbf{v}^2)\). By the use of Lemma 3, we immediately get

$$\begin{aligned} \textbf{P}_{0\vert U_1}=\frac{\textbf{V}_0(\textbf{v}_1)}{\textbf{V}_0(\textbf{v}_1,\textbf{v}_1)}\textbf{v}_1=\varvec{\beta }_0(\cdot ,\textbf{v}_1)\textbf{v}_1,\quad \textbf{P}_{0\vert U_2}=\frac{\textbf{V}_0(\textbf{v}_2)}{\textbf{V}_0(\textbf{v}_2,\textbf{v}_2)}\textbf{v}_2=\varvec{\beta }_0(\cdot ,\textbf{v}_2)\textbf{v}_2, \end{aligned}$$

which identifies beta as being linked to the component of a projection tensor. Since the latter are (1,1)-tensors, we can further expand them using the canonical dual basis, and from Eq. (12) we further obtain

$$\begin{aligned} \textbf{P}_{0\vert U^\perp _{2\vert 1}}=\sum _{i=1}^2\left( \textbf{v}_i-\textbf{P}_{0\vert U_1}\left( \textbf{v}_i\right) \right) \textbf{v}^i=\left( \textbf{v}_2-\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\textbf{v}_1\right) \textbf{v}^2,\\ \textbf{P}_{0\vert U^\perp _{1\vert 2}}=\sum _{i=1}^2\left( \textbf{v}_i-\textbf{P}_{0\vert U_2}\left( \textbf{v}_i\right) \right) \textbf{v}^i=\left( \textbf{v}_1-\varvec{\beta }_0(\textbf{v}_1,\textbf{v}_2)\textbf{v}_2\right) \textbf{v}^1. \end{aligned}$$

We can now easily identify orthogonal vectors by setting

$$\begin{aligned} \textbf{v}_{2\vert 1}=\textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{v}_2)=\textbf{v}_2-\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\textbf{v}_1,\\ \textbf{v}_{1\vert 2}=\textbf{P}_{0\vert U^\perp _{1\vert 2}}(\textbf{v}_1)=\textbf{v}_1-\varvec{\beta }_0(\textbf{v}_1,\textbf{v}_2)\textbf{v}_2. \end{aligned}$$

In Fig. 2, we display the geometry of the orthogonal decompositions \(U_1\oplus U^\perp _{2\vert 1}\) and \(U^\perp _{1\vert 2}\oplus U_2\), indicating the role of beta as being the components of a projection operator. From the degenerate case, plot (b), we further notice that, say, \({\hat{{\textbf {v}}}}_{2\vert 1}=\textbf{w}_*[U^\perp _{2\vert 1}] = \textbf{0}\) is not equivalent to \({\hat{{\textbf {v}}}}_2=\textbf{w}_*[U_2] = \textbf{0}\). Having constructed the two auxiliary coordinate systems \((\textbf{v}_1,\textbf{v}_{2\vert 1})\) and \((\textbf{v}_{1\vert 2},\textbf{v}_2)\), we proceed by investigating their local properties.
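As a minimal numerical sketch of this construction, the snippet builds \(\textbf{v}_{2\vert 1}\) and \(\textbf{v}_{1\vert 2}\) from beta and checks orthogonality under the covariance inner product. The volatilities and the correlation are hypothetical inputs, and the helper name `inner` is ours; strategies are written in coordinates relative to the natural basis \((\textbf{v}_1,\textbf{v}_2)\).

```python
# Hypothetical inputs: volatilities and correlation of two mutual funds.
sigma1, sigma2, rho = 0.20, 0.30, 0.5

# Gram (covariance) matrix of V_0 in the natural basis (v1, v2).
V = [[sigma1**2, rho * sigma1 * sigma2],
     [rho * sigma1 * sigma2, sigma2**2]]

def inner(x, y):
    """Inner product V_0(x, y) of two strategies given in (v1, v2) coordinates."""
    return sum(x[i] * V[i][j] * y[j] for i in range(2) for j in range(2))

v1, v2 = [1.0, 0.0], [0.0, 1.0]

beta_21 = inner(v2, v1) / inner(v1, v1)   # beta_0(v2, v1)
beta_12 = inner(v1, v2) / inner(v2, v2)   # beta_0(v1, v2)

# Orthogonalized vectors: v_{2|1} = v2 - beta_21 v1 and v_{1|2} = v1 - beta_12 v2.
v2_given_1 = [v2[0] - beta_21 * v1[0], v2[1] - beta_21 * v1[1]]
v1_given_2 = [v1[0] - beta_12 * v2[0], v1[1] - beta_12 * v2[1]]

print(beta_21, beta_12)
# Both inner products vanish up to floating-point rounding.
print(inner(v2_given_1, v1), inner(v1_given_2, v2))
```

With these inputs \(\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)=\rho \sigma _2/\sigma _1\), so the two betas differ unless the volatilities coincide.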

Proposition 9

The characteristics of the vectors \(\textbf{v}_{2\vert 1}\) and \(\textbf{v}_{1\vert 2}\) are given by:

$$\begin{aligned} \textbf{b}_0(\textbf{v}_{2\vert 1})=\varvec{\alpha }_0(\textbf{v}_2,\textbf{v}_1),\quad \varvec{\sigma }^2_0(\textbf{v}_{2\vert 1})=\varvec{\sigma }^2_0(\textbf{v}_2)-\varvec{\beta }^2_0(\textbf{v}_2,\textbf{v}_1)\varvec{\sigma }^2_0(\textbf{v}_1),\\ \textbf{b}_0(\textbf{v}_{1\vert 2})=\varvec{\alpha }_0(\textbf{v}_1,\textbf{v}_2),\quad \varvec{\sigma }^2_0(\textbf{v}_{1\vert 2})=\varvec{\sigma }^2_0(\textbf{v}_1)-\varvec{\beta }^2_0(\textbf{v}_1,\textbf{v}_2)\varvec{\sigma }^2_0(\textbf{v}_2). \end{aligned}$$

Proof

We illustrate the proof for the vector \(\textbf{v}_{2\vert 1}\).

$$\begin{aligned} \textbf{b}_0(\textbf{v}_{2\vert 1})&=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{2\vert 1})=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{2})-\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\textbf{V}_0(\textbf{w}_*,\textbf{v}_{1}),\\&=\textbf{b}_0(\textbf{v}_{2})-\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\textbf{b}_0(\textbf{v}_{1})=\varvec{\alpha }_0(\textbf{v}_2,\textbf{v}_1),\\ \textbf{V}_0(\textbf{v}_{2\vert 1},\textbf{v}_{2\vert 1})&=\textbf{V}_0(\textbf{v}_2,\textbf{v}_2)+\varvec{\beta }^2_0(\textbf{v}_2,\textbf{v}_1)\textbf{V}_0(\textbf{v}_1,\textbf{v}_{1})-2\varvec{\beta }_0(\textbf{v}_2,\textbf{v}_1)\textbf{V}_0(\textbf{v}_1,\textbf{v}_{2}),\\&=\textbf{V}_0(\textbf{v}_2,\textbf{v}_2)-\varvec{\beta }^2_0(\textbf{v}_2,\textbf{v}_1)\textbf{V}_0(\textbf{v}_1,\textbf{v}_1). \end{aligned}$$

The proof for \(\textbf{v}_{1\vert 2}\) is done analogously and is thus omitted. \(\square \)

It follows that the new strategies \(\textbf{v}_{2\vert 1}\) and \(\textbf{v}_{1\vert 2}\) have different relative leverage/drawdown risk than the original strategies \(\textbf{v}_{2}\) and \(\textbf{v}_{1}\), respectively. It is also important to notice that the knowledge of alpha and beta alone is not sufficient to calculate the corresponding Sharpe ratios \(\textbf{s}_0(\textbf{v}_{2\vert 1})\) and \(\textbf{s}_0(\textbf{v}_{1\vert 2})\), since these quantities additionally depend on the volatility of each trading strategy. We paraphrase this observation as:

$$\begin{aligned}&\text {Larger alpha with fixed beta is not necessarily better}.\\&\text {Smaller beta with fixed alpha is not necessarily better}. \end{aligned}$$
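To illustrate the first statement, the following sketch compares two hypothetical funds with the same beta relative to \(\textbf{v}_1\). It uses Proposition 9 to write the orthogonal Sharpe ratio as alpha divided by the residual volatility; all numbers and the function name `orthogonal_sharpe` are assumptions made for illustration only.

```python
import math

def orthogonal_sharpe(alpha, beta, sigma_new, sigma_base):
    """s_0(v_{2|1}) = alpha / sqrt(sigma_2^2 - beta^2 sigma_1^2), using Proposition 9."""
    residual_var = sigma_new**2 - beta**2 * sigma_base**2
    if residual_var <= 0.0:
        raise ValueError("beta and volatilities imply |correlation| >= 1")
    return alpha / math.sqrt(residual_var)

sigma_base = 0.20                                      # volatility of the existing strategy v1
fund_a = dict(alpha=0.02, beta=0.5, sigma_new=0.15)
fund_b = dict(alpha=0.03, beta=0.5, sigma_new=0.30)    # larger alpha, same beta

s_a = orthogonal_sharpe(sigma_base=sigma_base, **fund_a)
s_b = orthogonal_sharpe(sigma_base=sigma_base, **fund_b)

# Fund B has the larger alpha but the smaller orthogonal Sharpe ratio,
# because its residual volatility is much higher.
print(s_a, s_b)
```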

Furthermore, while it is pleasant to be able to interpret the excess return of the vectors \(\textbf{v}_{2\vert 1}\) and \(\textbf{v}_{1\vert 2}\) in terms of alpha, we must remember that the only purpose of these vectors is to span the spaces \(U^\perp _{2\vert 1}\) and \(U^\perp _{1\vert 2}\). Hence, any linear scaling of these vectors would serve equally well since the ultimate goal is to find \({\hat{{\textbf {v}}}}_{2\vert 1}\) and \({\hat{{\textbf {v}}}}_{1\vert 2}\). For this reason, we rather prefer to express any risk adjustment in terms of Sharpe ratios as described below. This approach further reduces the number of free variables.

Definition 1

We call \(\textbf{s}_0(\textbf{v}_{2\vert 1})\) the orthogonal Sharpe ratio of \(\textbf{v}_2\) given \(U_1\) and define the corresponding orthogonal Sharpe ratio of \(\textbf{v}_1\) given \(U_2\) analogously.

Theorem 10

The growth optimal Kelly vector admits the representation

$$\begin{aligned} \textbf{w}_*&=\frac{\textbf{s}_0(\textbf{v}_{1\vert 2})}{\varvec{\sigma }_0(\textbf{v}_1)\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2)}}\textbf{v}_{1}+\frac{\textbf{s}_0(\textbf{v}_{2\vert 1})}{\varvec{\sigma }_0(\textbf{v}_2)\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2)}}\textbf{v}_{2}, \end{aligned}$$

where the instantaneous orthogonal Sharpe ratios equal

$$\begin{aligned} \textbf{s}_0(\textbf{v}_{2\vert 1})=\frac{\textbf{s}_0(\textbf{v}_2)-\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\textbf{s}_0(\textbf{v}_1)}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2)}},\quad \textbf{s}_0(\textbf{v}_{1\vert 2})=\frac{\textbf{s}_0(\textbf{v}_1)-\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\textbf{s}_0(\textbf{v}_2)}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2)}}. \end{aligned}$$

Furthermore, the squared Sharpe ratio of the growth optimal Kelly strategy satisfies

$$\begin{aligned} \textbf{s}_0^2(\textbf{w}_*)&=\textbf{s}_0^2(\textbf{v}_1)+\textbf{s}_0^2(\textbf{v}_{2\vert 1})=\textbf{s}_0^2(\textbf{v}_{1\vert 2})+\textbf{s}_0^2(\textbf{v}_2),\\ \textbf{s}_0^2(\textbf{w}_*)&=\frac{\textbf{s}_0^2(\textbf{v}_1)+\textbf{s}_0^2(\textbf{v}_2)-2\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\textbf{s}_0(\textbf{v}_1)\textbf{s}_0(\textbf{v}_2)}{1-\varvec{\rho }_0^2(\textbf{v}_1,\textbf{v}_2)}. \end{aligned}$$

Proof

By applying Proposition 8 and the notation in Eq. (10), we have

$$\begin{aligned} \textbf{w}_*={\hat{{\textbf {v}}}}_1+{\hat{{\textbf {v}}}}_{2\vert 1}=\frac{1}{\textbf{k}_0(\textbf{v}_1)}\textbf{v}_1+\frac{1}{\textbf{k}_0(\textbf{v}_{2\vert 1})}\textbf{v}_{2\vert 1},\\ \textbf{w}_*={\hat{{\textbf {v}}}}_{1\vert 2}+{\hat{{\textbf {v}}}}_2=\frac{1}{\textbf{k}_0(\textbf{v}_{1\vert 2})}\textbf{v}_{1\vert 2}+\frac{1}{\textbf{k}_0(\textbf{v}_2)}\textbf{v}_2. \end{aligned}$$

We then transform these results to the coordinates \((\textbf{v}_1,\textbf{v}_2)\). Straightforward calculations yield

$$\begin{aligned} \textbf{w}_*=\frac{1}{\textbf{k}_0(\textbf{v}_{1\vert 2})}\textbf{v}_1+\frac{1}{\textbf{k}_0(\textbf{v}_{2\vert 1})}\textbf{v}_2. \end{aligned}$$

The proof now follows from Propositions 8 and 9. \(\square \)
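The representation in Theorem 10 can be checked numerically. The sketch below assumes hypothetical values for the excess returns, volatilities and correlation, and compares the weights of Theorem 10 with the direct solution of \(\textbf{V}_0(\textbf{w}_*,\textbf{v}_i)=\textbf{b}_0(\textbf{v}_i)\), \(i=1,2\), obtained by Cramer's rule.

```python
import math

# Hypothetical inputs: excess returns b_i, volatilities and correlation.
b1, b2 = 0.06, 0.03
sigma1, sigma2, rho = 0.20, 0.15, 0.4

s1, s2 = b1 / sigma1, b2 / sigma2            # stand-alone Sharpe ratios
root = math.sqrt(1.0 - rho**2)

# Orthogonal Sharpe ratios as stated in Theorem 10.
s_2g1 = (s2 - rho * s1) / root
s_1g2 = (s1 - rho * s2) / root

# Kelly weights in the natural basis (v1, v2) as stated in Theorem 10.
w1 = s_1g2 / (sigma1 * root)
w2 = s_2g1 / (sigma2 * root)

# Cross-check: solve the 2x2 system V w = b with Cramer's rule.
c12 = rho * sigma1 * sigma2
det = sigma1**2 * sigma2**2 - c12**2
w1_direct = (b1 * sigma2**2 - c12 * b2) / det
w2_direct = (sigma1**2 * b2 - c12 * b1) / det

# Squared Sharpe ratio of the Kelly strategy in its three equivalent forms.
s_star_sq = (s1**2 + s2**2 - 2.0 * rho * s1 * s2) / (1.0 - rho**2)
print(w1, w1_direct, w2, w2_direct)
print(s_star_sq, s1**2 + s_2g1**2, s_1g2**2 + s2**2)
```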

The importance of this result lies in the fact that any trading strategy with maximal instantaneous Sharpe ratio is proportional to the growth optimal Kelly vector, or in other words all efficient trading strategies are of the form \(\textbf{w}=k\textbf{w}_*\), \(k\ge 0\). Below we provide two examples highlighting the behavior of the growth optimal Kelly vector in the degenerate cases.

Example 1

Suppose that \(\textbf{s}_0(\textbf{v}_2)=0\), such that \(\textbf{s}_0(\textbf{w}_*[U_2])=0\). Then, as shown in Theorem 4, \(\Vert \textbf{w}_*[U_2]\Vert _\mathcal {H}=0\), or equally \(\textbf{w}_*[U_2]=\textbf{0}\). But this does not imply that one should not invest in \(\textbf{v}_2\) when the opportunity set equals \(U_1\oplus U_2\). Rather, Theorem 10 gives

$$\begin{aligned} \textbf{w}_*=\frac{\textbf{s}_0(\textbf{v}_1)}{\varvec{\sigma }_0(\textbf{v}_1)(1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2))}\textbf{v}_{1}-\frac{\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\textbf{s}_0(\textbf{v}_1)}{\varvec{\sigma }_0(\textbf{v}_2)(1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2))}\textbf{v}_{2}. \end{aligned}$$

Example 2

Suppose that \(\textbf{v}_2=k\textbf{w}_*\), such that \(\textbf{s}^2_0(\textbf{v}_2)=\textbf{s}_0^2(\textbf{w}_*)\). Then, as shown in Theorem 10, \(\textbf{s}_0(\textbf{v}_{1\vert 2})=0\), or equally \(\textbf{s}_0(\textbf{v}_1)=\varvec{\rho }_0(\textbf{v}_1,\textbf{v}_2)\textbf{s}_0(\textbf{v}_2)\). But this implies that

$$\begin{aligned} \textbf{w}_*=\frac{\textbf{s}_0(\textbf{v}_{2\vert 1})}{\varvec{\sigma }_0(\textbf{v}_2)\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_1,\textbf{v}_2)}}\textbf{v}_{2}=\frac{\textbf{s}_0(\textbf{v}_2)}{\varvec{\sigma }_0(\textbf{v}_2)}\textbf{v}_2=\textbf{w}_*[U_2]. \end{aligned}$$
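Both degenerate cases can be reproduced with a few lines of code. The sketch below evaluates the weights of Theorem 10 for hypothetical inputs; the function name `kelly_weights_2d` and all numbers are assumptions chosen for illustration.

```python
def kelly_weights_2d(s1, s2, sigma1, sigma2, rho):
    """Weights of w_* in the basis (v1, v2), following Theorem 10."""
    one_minus_rho_sq = 1.0 - rho**2
    w1 = (s1 - rho * s2) / (sigma1 * one_minus_rho_sq)
    w2 = (s2 - rho * s1) / (sigma2 * one_minus_rho_sq)
    return w1, w2

sigma1, sigma2, rho = 0.20, 0.10, 0.6

# Example 1: the new asset has zero Sharpe ratio but still receives a
# nonzero (here negative) weight, because it hedges part of the risk in v1.
ex1 = kelly_weights_2d(s1=0.5, s2=0.0, sigma1=sigma1, sigma2=sigma2, rho=rho)

# Example 2: v2 is itself efficient, so s1 = rho * s2 and v1 drops out.
s2 = 0.5
ex2 = kelly_weights_2d(s1=rho * s2, s2=s2, sigma1=sigma1, sigma2=sigma2, rho=rho)

print(ex1)   # the second weight is nonzero
print(ex2)   # the first weight vanishes, the second is s2 / sigma2 up to rounding
```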

Hence, an investor who enlarges his opportunity set from \(U_1\) to \(U_1\oplus U_2\) may well trade in the new asset even if it has zero Sharpe ratio. Moreover, such an investor may also fully discard his existing trading strategy in favor of only trading the asset in \(U_2\) (even though the initial portfolio has nonzero Sharpe ratio). Finally, we take advantage of the two-dimensional framework and present a purely geometric approach to identify the maximal Sharpe ratio and implicitly, thereby, the orthogonal Sharpe ratios.

Fig. 3
This figure shows that the orthogonal decompositions \(U=U_1\oplus U^\perp _{2\vert 1}=U^\perp _{1\vert 2}\oplus U_2\) form a cyclic quadrilateral. The circle, in which the quadrilateral is inscribed, corresponds to the level set of \(\textbf{k}_0(\textbf{w})=1\), which is centered at \(\textbf{w}_*/2\) with a radius of AC/2. The quadrilateral is cyclic because opposite angles sum to \(\pi \). Furthermore, the diagonals relate to the sides by Ptolemy’s celebrated formula \(BD\cdot AC=CD\cdot AB + AD\cdot BC\).

Example 3

From Fig. 3 and Ptolemy’s formula, we know that

$$\begin{aligned} BD\cdot \Vert \textbf{w}_*\Vert _\mathcal {H}=\Vert {\hat{{\textbf {v}}}}_{2\vert 1}\Vert _\mathcal {H}\Vert {\hat{{\textbf {v}}}}_2\Vert _\mathcal {H}+\Vert {\hat{{\textbf {v}}}}_1\Vert _\mathcal {H}\Vert {\hat{{\textbf {v}}}}_{1\vert 2}\Vert _\mathcal {H}, \end{aligned}$$

where BD represents the distance between the vectors \({\hat{{\textbf {v}}}}_1\) and \({\hat{{\textbf {v}}}}_2\). We divide both sides by \(\Vert \textbf{w}_*\Vert ^2_\mathcal {H}\) and identify the ratios on the right-hand side with angles according to

$$\begin{aligned} \frac{BD}{\Vert \textbf{w}_*\Vert _\mathcal {H}}=\sin \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{1}}\cos \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{2}}+\cos \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{1}}\sin \varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{2}}=\sin (\varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{1}}+\varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{2}}). \end{aligned}$$

Since \(\varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{1}}+\varphi _{\textbf{w}_*,{\hat{{\textbf {v}}}}_{2}}=\varphi _{{\hat{{\textbf {v}}}}_{1},{\hat{{\textbf {v}}}}_{2}}\), it now follows from the law of cosines that

$$\begin{aligned} \Vert \textbf{w}_*\Vert ^2_\mathcal {H}=\frac{BD^2}{\sin ^2\varphi _{{\hat{{\textbf {v}}}}_{1},{\hat{{\textbf {v}}}}_{2}}}=\frac{\Vert {\hat{{\textbf {v}}}}_1\Vert ^2_\mathcal {H}+\Vert {\hat{{\textbf {v}}}}_2\Vert ^2_\mathcal {H}-2\cos \varphi _{{\hat{{\textbf {v}}}}_{1},{\hat{{\textbf {v}}}}_{2}}\Vert {\hat{{\textbf {v}}}}_1\Vert _\mathcal {H}\Vert {\hat{{\textbf {v}}}}_2\Vert _\mathcal {H}}{1-\cos ^2\varphi _{{\hat{{\textbf {v}}}}_{1},{\hat{{\textbf {v}}}}_{2}}}, \end{aligned}$$

which is the form presented in Theorem 10.
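The geometric argument can also be verified in coordinates. The sketch below works in an orthonormal coordinate system of the plane, which we assume represents the Hilbert space, so that the norm is Euclidean. The point labels A, B, C, D are our reading of Fig. 3, with A the origin, C the tip of \(\textbf{w}_*\), and B, D the tips of \({\hat{{\textbf {v}}}}_1\) and \({\hat{{\textbf {v}}}}_2\); all numbers are hypothetical and chosen such that \(\textbf{w}_*\) lies between the two directions.

```python
import math

def project(w, u):
    """Orthogonal projection of w onto the line spanned by the unit vector u."""
    dot = w[0] * u[0] + w[1] * u[1]
    return (dot * u[0], dot * u[1])

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

A = (0.0, 0.0)
C = (0.4, 0.3)                              # tip of the Kelly vector w_*
u1 = (1.0, 0.0)                             # unit direction of v1
u2 = (math.cos(1.2), math.sin(1.2))         # unit direction of v2
B, D = project(C, u1), project(C, u2)       # tips of hat v_1 and hat v_2

# Ptolemy's formula: BD * AC = CD * AB + AD * BC.
lhs = dist(B, D) * dist(A, C)
rhs = dist(C, D) * dist(A, B) + dist(A, D) * dist(B, C)

# Squared norm of w_* recovered from BD and the angle between v1 and v2.
cos_phi = u1[0] * u2[0] + u1[1] * u2[1]
norm_sq = dist(B, D) ** 2 / (1.0 - cos_phi ** 2)
print(lhs, rhs)
print(norm_sq, dist(A, C) ** 2)
```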

4.2 Kelly solution in arbitrary dimensions

Here we provide the solution to the problem of adding one opportunity set to another. The main difficulty lies in the fact that both opportunity sets consist of correlated assets, with correlations both within each set and across the two sets. In order to understand how the new assets affect the portfolio allocation, we orthogonalize the covariance matrix, seen as a block matrix of the two sets of assets, much as was done previously where the two sets only held one asset each. Although computing inverses of block matrices is well understood, using matrix formalism to solve our problem is, if not impossible, at least very difficult.

In order to formulate the approach mathematically, we consider two subspaces \(U_1\) and \(U_2\) of dimension \(N^1\) and \(N^2\), respectively. Each subspace is spanned by some linearly independent trading strategies and we use the notation \(U_n={{\,\textrm{span}\,}}(\textbf{v}_{1_n},\ldots ,\textbf{v}_{N^n_n})\) to describe them. We further assume, without loss of generality, that \(U_1\cap U_2=\lbrace \textbf{0}\rbrace \) and form the direct sum \(U=U_1\oplus U_2\), such that \(\dim (U)=N^1+N^2\). For convenience, we also introduce the notation

$$\begin{aligned} i_1=i,\quad i_2=N^1+i, \end{aligned}$$
(13)

such that we can identify the trading strategies in U when needed. Having defined our usage of multi-indices, we proceed by expanding the projection tensors, see Eq. (12), according to

$$\begin{aligned} \textbf{P}_{0\vert U^\perp _{2\vert 1}}=\sum _{n=1}^2\sum _{i_n=1_n}^{N^n_n}\left( \textbf{v}_{i_n}-\textbf{P}_{0\vert U_1}(\textbf{v}_{i_n})\right) \textbf{v}^{i_n}=\sum _{i_2=1_2}^{N^2_2}\left( \textbf{v}_{i_2}-\textbf{P}_{0\vert U_1}(\textbf{v}_{i_2})\right) \textbf{v}^{i_2},\\ \textbf{P}_{0\vert U^\perp _{1\vert 2}}=\sum _{n=1}^2\sum _{i_n=1_n}^{N^n_n}\left( \textbf{v}_{i_n}-\textbf{P}_{0\vert U_2}(\textbf{v}_{i_n})\right) \textbf{v}^{i_n}=\sum _{i_1=1_1}^{N^1_1}\left( \textbf{v}_{i_1}-\textbf{P}_{0\vert U_2}(\textbf{v}_{i_1})\right) \textbf{v}^{i_1}, \end{aligned}$$

where the canonical dual basis vectors satisfy \(\textbf{v}^{j_k}(\textbf{v}_{i_l})=\delta ^{j_k}_{i_l}\). In order to further highlight the similarities with the two-dimensional case, we use Einstein summation and write: \(\textbf{P}_{0\vert U_1}(\textbf{v}_{i_2})=\varvec{\beta }_{i_2}^{k_1}\textbf{v}_{k_1}\), \(\textbf{P}_{0\vert U_2}(\textbf{v}_{i_1})=\varvec{\beta }_{i_1}^{k_2}\textbf{v}_{k_2}\), in terms of some generalized beta parameters. However, before we show how to compute the components of these projections, we first introduce some notation.

Definition 2

For every subspace \(\mathcal {H}_0=(U_0,\textbf{V}_0)\) of \(\mathcal {H}\), let \(\lbrace \textbf{V}_{0\vert U_0}^{i,j}\rbrace \) denote the inverse of the Gram matrix on \(\mathcal {H}_0\subseteq \mathcal {H}\), such that for any chosen basis \(\lbrace \textbf{v}_k\rbrace _{k\le K}\) of \(U_0\), with \(\dim (U_0)=K\), we have

$$\begin{aligned} \textbf{V}_0(\textbf{v}_{i},\textbf{v}_{k})\textbf{V}_{0\vert U_0}^{j,k}=\delta _{i}^{j},\quad {i},{j}\in \lbrace {1},\ldots ,{K}\rbrace . \end{aligned}$$

Similarly, we let \(\lbrace \varvec{\rho }_{0\vert U_0}^{i,j}\rbrace \) denote the inverse of the corresponding correlation matrix on \(\mathcal {H}_0\), such that

$$\begin{aligned} \rho _{0\vert U_0}^{i,j}=\sigma _0(\textbf{v}_i)\textbf{V}_{0\vert U_0}^{i,j}\sigma _0(\textbf{v}_j),\quad \varvec{\rho }_0(\textbf{v}_{i},\textbf{v}_{k})\varvec{\rho }_{0\vert U_0}^{j,k}=\delta _i^j,\quad i,j\in \lbrace {1},\ldots ,{K}\rbrace . \end{aligned}$$

Lemma 11

Let \(\mathcal {H}_0=(U_0,\textbf{V}_0)\) be an arbitrary K-dimensional subspace of \(\mathcal {H}\) and let \(\lbrace \textbf{v}_k\rbrace _{k\le K}\) be a basis of \(U_0\). Then, the projection

$$\begin{aligned} \textbf{P}_{0\vert U_0}(\textbf{w})=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{v}_{k}, \end{aligned}$$

of \(\textbf{w}\in \mathcal {H}\) onto \(\mathcal {H}_0\subseteq \mathcal {H}\) is orthogonal.

Proof

We first show that \(\textbf{P}_{0\vert U_0}\) is indeed a projection.

$$\begin{aligned} \textbf{P}_{0\vert U_0}(\textbf{P}_{0\vert U_0}(\textbf{w}))&=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{V}_0(\textbf{v}_{k},\textbf{v}_{a})\textbf{V}_{0\vert U_0}^{a,b}\textbf{v}_{b},\\&=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\delta ^{b}_{k}\textbf{v}_{b}=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{v}_{k}=\textbf{P}_{0\vert U_0}(\textbf{w}). \end{aligned}$$

Next, we show that the projection is orthogonal

$$\begin{aligned} \textbf{V}_0(\textbf{P}_{0\vert U_0}(\textbf{w}),\textbf{P}_{0\vert U_0}(\textbf{x}))&=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{V}_0(\textbf{x},\textbf{v}_{a})\textbf{V}_{0\vert U_0}^{a,b}\textbf{V}_0(\textbf{v}_{k},\textbf{v}_{b}),\\&=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{V}_0(\textbf{x},\textbf{v}_{a})\delta ^{a}_{k},\\&=\textbf{V}_0(\textbf{w},\textbf{v}_{j})\textbf{V}_{0\vert U_0}^{j,k}\textbf{V}_0(\textbf{x},\textbf{v}_{k})=\textbf{V}_0(\textbf{P}_{0\vert U_0}(\textbf{w}),\textbf{x}). \end{aligned}$$

Hence, \(\textbf{P}_{0\vert U_0}(\textbf{w})\perp \textbf{x}-\textbf{P}_{0\vert U_0}(\textbf{x})\) which concludes the proof. \(\square \)
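A minimal sketch of Lemma 11 for a two-dimensional subspace of a three-asset market is given below. The covariance matrix, the two basis strategies and the helper names are hypothetical; the code evaluates the projection formula with a non-orthogonal basis and checks idempotence and orthogonality numerically.

```python
# Hypothetical 3-asset covariance matrix defining V_0(x, y) = x^T SIGMA y.
SIGMA = [[0.040, 0.012, 0.006],
         [0.012, 0.090, 0.018],
         [0.006, 0.018, 0.0225]]

def inner(x, y):
    """Inner product V_0(x, y) of two strategies in asset coordinates."""
    return sum(x[i] * SIGMA[i][j] * y[j] for i in range(3) for j in range(3))

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def project(w, basis):
    """P(w) = V_0(w, v_j) V^{j,k} v_k for a two-element, possibly non-orthogonal basis."""
    gram = [[inner(a, b) for b in basis] for a in basis]
    gram_inv = inverse_2x2(gram)
    coeff = [sum(inner(w, basis[j]) * gram_inv[j][k] for j in range(2))
             for k in range(2)]
    return [sum(coeff[k] * basis[k][i] for k in range(2)) for i in range(3)]

basis = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]   # two correlated mutual funds
w = [0.2, -0.1, 0.7]
x = [0.3, 0.4, -0.2]

Pw, Px = project(w, basis), project(x, basis)
PPw = project(Pw, basis)

idempotence_error = max(abs(PPw[i] - Pw[i]) for i in range(3))
orthogonality_error = abs(inner(Pw, [x[i] - Px[i] for i in range(3)]))
print(idempotence_error, orthogonality_error)   # both zero up to rounding
```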

We stress that the above result generalizes Lemma 3 by allowing the basis vectors to be non-orthogonal. Hence, rather than working with a non-observable abstract vector basis, we can directly consider the investable assets. Having identified the generalized beta parameters, we now construct the vectors

$$\begin{aligned} \textbf{v}_{i_2\vert 1}=\textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{v}_{i_2})=\textbf{v}_{i_2}-\beta _{i_2}^{k_1}\textbf{v}_{k_1},\quad \beta _{i_2}^{k_1}=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\textbf{V}^{j_1,k_1}_{0\vert U_1}, \\ \textbf{v}_{i_1\vert 2}=\textbf{P}_{0\vert U^\perp _{1\vert 2}}(\textbf{v}_{i_1})=\textbf{v}_{i_1}-\beta _{i_1}^{k_2}\textbf{v}_{k_2},\quad \beta _{i_1}^{k_2}=\textbf{V}_0(\textbf{v}_{i_1},\textbf{v}_{j_2})\textbf{V}^{j_2,k_2}_{0\vert U_2}, \end{aligned}$$

such that \({{\,\textrm{span}\,}}( \textbf{v}_{1_2\vert 1},\ldots ,\textbf{v}_{N^2_2\vert 1})=U^\perp _{2\vert 1}\perp U_1\) and \({{\,\textrm{span}\,}}( \textbf{v}_{1_1\vert 2},\ldots ,\textbf{v}_{N^1_1\vert 2})=U^\perp _{1\vert 2}\perp U_2\). The local properties of these orthogonal vectors are summarized below.

Proposition 12

The characteristics of the vectors \(\lbrace \textbf{v}_{i_2\vert 1}\rbrace \) and \(\lbrace \textbf{v}_{i_1\vert 2}\rbrace \) are summarized by their instantaneous Sharpe ratios

$$\begin{aligned} \textbf{s}_0(\textbf{v}_{i_2\vert 1})=\frac{\textbf{s}_0(\textbf{v}_{i_2})-\varvec{\rho }_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\varvec{\rho }_{0\vert U_1}^{j_1,k_1}\textbf{s}_0(\textbf{v}_{k_1})}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_2},\textbf{P}_{0\vert U_1}(\textbf{v}_{i_2}))}},\\ \textbf{s}_0(\textbf{v}_{i_1\vert 2})=\frac{\textbf{s}_0(\textbf{v}_{i_1})-\varvec{\rho }_0(\textbf{v}_{i_1},\textbf{v}_{j_2})\varvec{\rho }_{0\vert U_2}^{j_2,k_2}\textbf{s}_0(\textbf{v}_{k_2})}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_1},\textbf{P}_{0\vert U_2}(\textbf{v}_{i_1}))}}, \end{aligned}$$

and their instantaneous volatilities

$$\begin{aligned} \varvec{\sigma }_0(\textbf{v}_{i_2\vert 1})=\varvec{\sigma }_0(\textbf{v}_{i_2})\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_2},\textbf{P}_{0\vert U_1}(\textbf{v}_{i_2}))},\\ \varvec{\sigma }_0(\textbf{v}_{i_1\vert 2})=\varvec{\sigma }_0(\textbf{v}_{i_1})\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_1},\textbf{P}_{0\vert U_2}(\textbf{v}_{i_1}))}. \end{aligned}$$

Furthermore, the instantaneous correlation between the vectors, in each basis, equals

$$\begin{aligned} \rho _0(\textbf{v}_{i_2\vert 1},\textbf{v}_{j_2\vert 1})=\frac{\varvec{\rho }_0(\textbf{v}_{i_2},\textbf{v}_{j_2})-\varvec{\rho }_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\varvec{\rho }_{0\vert U_1}^{j_1,k_1}\varvec{\rho }_0(\textbf{v}_{k_1},\textbf{v}_{j_2})}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_2},\textbf{P}_{0\vert U_1}(\textbf{v}_{i_2}))}\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{j_2},\textbf{P}_{0\vert U_1}(\textbf{v}_{j_2}))}},\\ \rho _0(\textbf{v}_{i_1\vert 2},\textbf{v}_{j_1\vert 2})=\frac{\varvec{\rho }_0(\textbf{v}_{i_1},\textbf{v}_{j_1})-\varvec{\rho }_0(\textbf{v}_{i_1},\textbf{v}_{j_2})\varvec{\rho }_{0\vert U_2}^{j_2,k_2}\varvec{\rho }_0(\textbf{v}_{k_2},\textbf{v}_{j_1})}{\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{i_1},\textbf{P}_{0\vert U_2}(\textbf{v}_{i_1}))}\sqrt{1-\varvec{\rho }^2_0(\textbf{v}_{j_1},\textbf{P}_{0\vert U_2}(\textbf{v}_{j_1}))}}, \end{aligned}$$

where

$$\begin{aligned} \varvec{\rho }^2_0(\textbf{w},\textbf{P}_{0\vert U_n}(\textbf{w}))=\varvec{\rho }_0(\textbf{w},\textbf{v}_{j_n})\varvec{\rho }_{0\vert U_n}^{j_n,k_n}\varvec{\rho }_0(\textbf{v}_{k_n},\textbf{w}),\quad n\in \lbrace 1,2\rbrace . \end{aligned}$$

Proof

We only show how to compute the terms for \(\textbf{v}_{i_2\vert 1}\) since \(\textbf{v}_{i_1\vert 2}\) is treated similarly. First note that

$$\begin{aligned} \beta _{i_2}^{k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{l_1})=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\textbf{V}_{0\vert U_1}^{j_1,k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{l_1})=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\delta _{l_1}^{j_1}=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{l_1}). \end{aligned}$$

We therefore obtain

$$\begin{aligned} \textbf{V}_0(\textbf{v}_{i_2\vert 1},\textbf{v}_{j_2\vert 1})&=\textbf{V}_0(\textbf{v}_{i_2}-\beta _{i_2}^{k_1}\textbf{v}_{k_1},\textbf{v}_{j_2}-\beta _{j_2}^{l_1}\textbf{v}_{l_1}),\\&=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_2})-\beta _{i_2}^{k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{j_2})-\beta _{j_2}^{l_1}\textbf{V}_0(\textbf{v}_{l_1},\textbf{v}_{i_2})\\&\quad +\beta _{j_2}^{l_1}\beta _{i_2}^{k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{l_1}),\\&=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_2})-\beta _{i_2}^{k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{j_2}),\\&=\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_2})-\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\textbf{V}_{0\vert U_1}^{j_1,k_1}\textbf{V}_0(\textbf{v}_{k_1},\textbf{v}_{j_2}). \end{aligned}$$

We also calculate the generalized alpha representation

$$\begin{aligned} \textbf{b}_0(\textbf{v}_{i_2\vert 1})=\textbf{b}_0(\textbf{v}_{i_2})-\beta _{i_2}^{k_1}\textbf{b}_0(\textbf{v}_{k_1})=\textbf{b}_0(\textbf{v}_{i_2})-\textbf{V}_0(\textbf{v}_{i_2},\textbf{v}_{j_1})\textbf{V}_{0\vert U_1}^{j_1,k_1}\textbf{b}_0(\textbf{v}_{k_1}). \end{aligned}$$

The proof concludes by replacing \(\textbf{V}_{0\vert U_1}\) with \(\varvec{\rho }_{0\vert U_1}\), as in Definition 2. \(\square \)
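The equivalence of the covariance and correlation formulations can be illustrated as follows. In this sketch, which rests on hypothetical inputs, assets 0 and 1 span \(U_1\) and asset 2 spans \(U_2\); the orthogonal Sharpe ratio of the new asset is computed once from the generalized alpha and beta and once from the correlation formula of Proposition 12.

```python
import math

# Hypothetical inputs: assets 0 and 1 span U_1, asset 2 spans U_2.
vol = [0.20, 0.30, 0.15]
corr = [[1.0, 0.2, 0.2],
        [0.2, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
b = [0.06, 0.06, 0.03]                       # instantaneous excess returns
sharpe = [b[i] / vol[i] for i in range(3)]
cov = [[corr[i][j] * vol[i] * vol[j] for j in range(3)] for i in range(3)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

U1 = range(2)
cov_inv = inverse_2x2([[cov[i][j] for j in U1] for i in U1])
rho_inv = inverse_2x2([[corr[i][j] for j in U1] for i in U1])

# Covariance route: generalized beta, generalized alpha, residual variance.
beta = [sum(cov[2][j] * cov_inv[j][k] for j in U1) for k in U1]
alpha = b[2] - sum(beta[k] * b[k] for k in U1)
resid_var = cov[2][2] - sum(beta[k] * cov[k][2] for k in U1)
s_cov = alpha / math.sqrt(resid_var)

# Correlation route: the formulas stated in Proposition 12.
r_sq = sum(corr[2][j] * rho_inv[j][k] * corr[k][2] for j in U1 for k in U1)
s_corr = (sharpe[2] - sum(corr[2][j] * rho_inv[j][k] * sharpe[k]
                          for j in U1 for k in U1)) / math.sqrt(1.0 - r_sq)

print(s_cov, s_corr)                                          # orthogonal Sharpe ratio
print(math.sqrt(resid_var), vol[2] * math.sqrt(1.0 - r_sq))   # orthogonal volatility
```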

It is of course a matter of taste which financial quantities to use when describing the local characteristics and here we deviate from Proposition 9 by focusing on orthogonal Sharpe ratios, correlations and volatilities. The main reason for choosing these quantities is that the magnitude of both the Sharpe ratio and the correlation does not depend on leverage. The drawback is that neither quantity is a tensor, which means that sometimes it is easier to work with \((\textbf{b}_0,\textbf{V}_0)\). We now present the multi-dimensional version of Theorem 10.

Theorem 13

The growth optimal Kelly vector admits the representation

$$\begin{aligned} \textbf{w}_*=\textbf{s}_0(\textbf{v}_{i_1\vert 2})\varvec{\rho }_{0\vert U^\perp _{1\vert 2}}^{i_1,j_1}\sigma _0^{-1}(\textbf{v}_{j_1\vert 2})\textbf{v}_{j_{1}}+\textbf{s}_0(\textbf{v}_{i_2\vert 1})\varvec{\rho }_{0\vert U^\perp _{2\vert 1}}^{i_2,j_2}\sigma ^{-1}_0(\textbf{v}_{j_2\vert 1})\textbf{v}_{j_{2}}. \end{aligned}$$

Furthermore, the squared Sharpe ratio of the growth optimal Kelly strategy satisfies

$$\begin{aligned} \textbf{s}_0^2(\textbf{w}_*)=\textbf{s}_0^2(\textbf{w}_*[U_1])+\textbf{s}_0^2\left( \textbf{w}_*[U^\perp _{2\vert 1}]\right) =\textbf{s}_0^2\left( \textbf{w}_*[U^\perp _{1\vert 2}]\right) +\textbf{s}_0^2(\textbf{w}_*[U_2]), \end{aligned}$$

where

$$\begin{aligned} \textbf{s}_0^2(\textbf{w}_*[U_k])&=\textbf{s}_0(\textbf{v}_{i_k})\varvec{\rho }_{0\vert U_k}^{i_k,j_k}\textbf{s}_0(\textbf{v}_{j_k}),\quad k\in \lbrace 1,2\rbrace ,\\ \textbf{s}_0^2\left( \textbf{w}_*[U^\perp _{2\vert 1}]\right)&=\textbf{s}_0(\textbf{v}_{i_2\vert 1})\varvec{\rho }_{0\vert U^\perp _{2\vert 1}}^{i_2,j_2}\textbf{s}_0(\textbf{v}_{j_2\vert 1}), \\ \textbf{s}_0^2\left( \textbf{w}_*[U^\perp _{1\vert 2}]\right)&=\textbf{s}_0(\textbf{v}_{i_1\vert 2})\varvec{\rho }_{0\vert U^\perp _{1\vert 2}}^{i_1,j_1}\textbf{s}_0(\textbf{v}_{j_1\vert 2}). \end{aligned}$$

Proof

From Theorem 4 and Proposition 8, we have

$$\begin{aligned} \textbf{w}_*=\textbf{P}_{0\vert U_1}(\textbf{w}_*)+\textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w}_*)=\textbf{P}_{0\vert U^\perp _{1\vert 2}}(\textbf{w}_*)+\textbf{P}_{0\vert U_2}(\textbf{w}_*). \end{aligned}$$

Consequently, Lemma 11 gives us the equivalent expressions

$$\begin{aligned} \textbf{w}_*&=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_1})\textbf{V}_{0\vert U_1}^{i_1,j_1}\textbf{v}_{j_1}+\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_2\vert 1})\textbf{V}_{0\vert U^\perp _{2\vert 1}}^{i_2,j_2}\textbf{v}_{j_2\vert 1},\\ \textbf{w}_*&=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_1\vert 2})\textbf{V}_{0\vert U^\perp _{1\vert 2}}^{i_1,j_1}\textbf{v}_{j_1\vert 2}+\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_2})\textbf{V}_{0\vert U_2}^{i_2,j_2}\textbf{v}_{j_2}, \end{aligned}$$

such that straightforward calculations yield

$$\begin{aligned} \Vert \textbf{w}_*\Vert ^2_\mathcal {H}&=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_1})\textbf{V}_{0\vert U_1}^{i_1,j_1}\textbf{V}_0(\textbf{w}_*,\textbf{v}_{j_1})+\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_2\vert 1})\textbf{V}_{0\vert U^\perp _{2\vert 1}}^{i_2,j_2}\textbf{V}_0(\textbf{w}_*,\textbf{v}_{j_2\vert 1}),\\ \Vert \textbf{w}_*\Vert ^2_\mathcal {H}&=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_1\vert 2})\textbf{V}_{0\vert U^\perp _{1\vert 2}}^{i_1,j_1}\textbf{V}_0(\textbf{w}_*,\textbf{v}_{j_1\vert 2})+\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_2})\textbf{V}_{0\vert U_2}^{i_2,j_2}\textbf{V}_0(\textbf{w}_*,\textbf{v}_{j_2}). \end{aligned}$$

Next, we represent \(\textbf{w}_*\) in terms of the basis vectors \(\lbrace \textbf{v}_{i_1}\rbrace \) and \(\lbrace \textbf{v}_{i_2}\rbrace \). Similar to the proof of Theorem 10, we pick terms from each of the two representations to arrive at

$$\begin{aligned} \textbf{w}_*=\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_1\vert 2})\textbf{V}_{0\vert U^\perp _{1\vert 2}}^{i_1,j_1}\textbf{v}_{j_{1}}+\textbf{V}_0(\textbf{w}_*,\textbf{v}_{i_2\vert 1})\textbf{V}_{0\vert U^\perp _{2\vert 1}}^{i_2,j_2}\textbf{v}_{j_{2}}. \end{aligned}$$

Finally, we use Definition 2 to express the results in terms of correlations rather than covariances. \(\square \)
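For \(N^1=2\) and \(N^2=1\), Theorem 13 can be checked against a direct solution of the linear system for the Kelly vector. The sketch below uses a hypothetical covariance matrix and excess return vector together with a small Gaussian elimination routine of our own. For a single new asset the inverse correlation factor on \(U^\perp _{2\vert 1}\) reduces to one, so the weight of the new asset becomes its generalized alpha divided by its residual variance.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Hypothetical market: assets 0 and 1 span U_1, asset 2 spans U_2.
cov = [[0.040, 0.012, 0.006],
       [0.012, 0.090, 0.018],
       [0.006, 0.018, 0.0225]]
b = [0.06, 0.06, 0.03]

w_star = solve(cov, b)                                   # Kelly vector on U
s_sq_total = sum(w_star[i] * b[i] for i in range(3))     # s_0^2(w_*)

gram_u1 = [row[:2] for row in cov[:2]]
w_u1 = solve(gram_u1, b[:2])                             # Kelly vector on U_1
s_sq_u1 = sum(w_u1[i] * b[i] for i in range(2))          # s_0^2(w_*[U_1])

# Generalized beta, alpha and residual variance of asset 2 given U_1.
beta = solve(gram_u1, [cov[0][2], cov[1][2]])
alpha = b[2] - sum(beta[k] * b[k] for k in range(2))
resid_var = cov[2][2] - sum(beta[k] * cov[k][2] for k in range(2))
s_sq_orth = alpha**2 / resid_var                         # squared orthogonal Sharpe ratio

print(s_sq_total, s_sq_u1 + s_sq_orth)    # Pythagorean decomposition
print(w_star[2], alpha / resid_var)       # weight of the new asset
```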

In order to verify that the above formula collapses to Theorem 10 when \(N^1=N^2=1\), we notice that in this case \(\textbf{w}_*=\textbf{k}^{-1}_0(\textbf{v}_{1_1\vert 2})\textbf{v}_{1_1}+\textbf{k}^{-1}_0(\textbf{v}_{1_2\vert 1})\textbf{v}_{1_2}\). The correspondence then follows from converting the index references for each subspace \(U_1,U_2\) to index references in U, as explained in Eq. (13). We continue with an example highlighting the benefits of diversification seen in higher dimensions. Loosely speaking, we can think of the example as adding an asset to a trading strategy in, say, the S&P 500, versus adding the asset to the opportunity set of the index.

Example 4

Let \(N^1>1\) and \(N^2=1\). In this example, we study the difference in trading the assets \(\lbrace \textbf{v}_{1_1},\ldots ,\textbf{v}_{N^1_1},\textbf{v}_{1_2}\rbrace \) versus trading only in \(\lbrace \textbf{w}_*[U_1],\textbf{v}_{1_2}\rbrace \). For the sake of simplicity, we introduce a new orthogonal basis \(\lbrace \check{\textbf{e}}_{i_1} \rbrace \) on \(U_1={{\,\textrm{span}\,}}(\textbf{v}_{1_1},\ldots ,\textbf{v}_{N^1_1})\), such that

$$\begin{aligned} \check{\textbf{e}}_{1_1}=\frac{\textbf{w}_*[U_1]}{\Vert \textbf{w}_*[U_1]\Vert _\mathcal {H}}. \end{aligned}$$

The inverse correlation matrix corresponding to the new Gram matrix on \(U_1\) then takes the form \({\check{\rho }}_{0\vert U_1}^{j_1,k_1}=\delta ^{j_1,k_1}\). Moreover, since \(\textbf{s}_0({\check{\textbf{e}}}_{i_1})=\textbf{s}_0(\textbf{w}_*[U_1])\), if \(i=1\), and zero otherwise, Proposition 12 yields

$$\begin{aligned} \textbf{s}^2_0(\textbf{w}_*[U^\perp _{2\vert 1}])=\textbf{s}_0^2(\textbf{v}_{1_2\vert 1})= \frac{\left( \textbf{s}_0(\textbf{v}_{1_2})-\varvec{\rho }_0(\textbf{v}_{1_2},{\check{\textbf{e}}}_{1_1})\textbf{s}_0({\check{\textbf{e}}}_{1_1})\right) ^2}{1-\sum _{i=1}^{N^1}\varvec{\rho }^2_0(\textbf{v}_{1_2},{\check{\textbf{e}}}_{i_1})}. \end{aligned}$$

We now compare this result with a growth optimal Kelly strategy on \(\check{U}=\check{U}_1\oplus U_2\), where \(\check{U}_1={{\,\textrm{span}\,}}(\textbf{w}_*[U_1])={{\,\textrm{span}\,}}({\check{\textbf{e}}}_{1_1})\). Consequently, we have \(\textbf{s}_0(\textbf{w}_*[\check{U}^\perp _{2\vert 1}])=\textbf{s}_0(\textbf{w}_*[U^\perp _{2\vert 1}])\vert _{N^1=1}\), and from Theorem 13 we calculate

$$\begin{aligned} \frac{\textbf{s}^2_0(\textbf{w}_*[U])-\textbf{s}^2_0(\textbf{w}_*[\check{U}])}{\left( \textbf{s}_0(\textbf{v}_{1_2})-\varvec{\rho }_0(\textbf{v}_{1_2},\textbf{w}_*[U_1])\textbf{s}_0(\textbf{w}_*[U_1])\right) ^2}=\frac{1}{1-\sum _{i=1}^{N^1}\varvec{\rho }^2_0(\textbf{v}_{1_2},{\check{\textbf{e}}}_{i_1})}-\frac{1}{1-\varvec{\rho }^2_0(\textbf{v}_{1_2},{\check{\textbf{e}}}_{1_1})}. \end{aligned}$$

This non-negative quantity equals zero if and only if \(\varvec{\rho }_0(\textbf{v}_{1_2},{\check{\textbf{e}}}_{i_1})=0\), for \(2_1\le i_1 \le N_1^1\). In general, though, diversification has a positive effect on the maximal Sharpe ratio and thereby on the (logarithmic) excess return for any Kelly trader.
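The size of the diversification effect is easy to evaluate once the correlations with the orthogonal basis are known. The sketch below assumes hypothetical values for the Sharpe ratios and for the correlations between the new asset and the basis vectors \({\check{\textbf{e}}}_{i_1}\), with the first entry referring to the direction of \(\textbf{w}_*[U_1]\); the function name is ours.

```python
def orth_sharpe_sq(s_new, s_kelly_u1, rho):
    """Squared orthogonal Sharpe ratio of the new asset given U_1.

    rho[i] is the correlation of the new asset with the i-th orthogonal
    basis vector of U_1, and rho[0] refers to the direction of w_*[U_1].
    """
    numerator = (s_new - rho[0] * s_kelly_u1) ** 2
    return numerator / (1.0 - sum(r * r for r in rho))

s_new, s_kelly_u1 = 0.2, 0.5          # hypothetical Sharpe ratios
rho = [0.3, 0.4, -0.2]                # hypothetical correlations, N^1 = 3

# Trading all assets of U_1 together with the new asset.
full = s_kelly_u1**2 + orth_sharpe_sq(s_new, s_kelly_u1, rho)
# Trading only the fund w_*[U_1] together with the new asset.
fund_only = s_kelly_u1**2 + orth_sharpe_sq(s_new, s_kelly_u1, rho[:1])
gain = full - fund_only

# The gain vanishes when the new asset is uncorrelated with the remaining basis vectors.
no_gain = (orth_sharpe_sq(s_new, s_kelly_u1, [0.3, 0.0, 0.0])
           - orth_sharpe_sq(s_new, s_kelly_u1, [0.3]))
print(gain, no_gain)
```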

Without going into details, we mention that the previous example can easily be generalized to the situation where both \(N^1,N^2>1\). Here, one finds that

$$\begin{aligned} \textbf{s}^2_0(\textbf{w}_*[U])=\textbf{s}^2_0(\textbf{w}_*[\tilde{U}]),\quad \tilde{U}={{\,\textrm{span}\,}}(\textbf{w}_*[U_1])\oplus {{\,\textrm{span}\,}}(\textbf{w}_*[U_2]), \end{aligned}$$

if and only if \(\textbf{w}_*[U_1]\perp {\check{\textbf{e}}}_{2_2},\ldots ,{\check{\textbf{e}}}_{N^2_2}\) and \(\textbf{w}_*[U_2]\perp {\check{\textbf{e}}}_{2_1},\ldots ,{\check{\textbf{e}}}_{N^1_1}\), where \(\lbrace {\check{\textbf{e}}}_{i_k}\rbrace _{i\ge 1}\) denotes an orthogonal basis in \(U_k\), \(k\in \lbrace 1,2\rbrace \), such that \(\check{\textbf{e}}_{1_k}=\textbf{w}_*[U_k]/\Vert \textbf{w}_*[U_k]\Vert _\mathcal {H}\).

We conclude by noting that Jensen’s alpha, as a risk-adjusted return, has a number of shortcomings. First, it does not specify the risk metric under which we can quantify adjusted excess return for a given risk level. Second, it does not answer the question of how to form mean–variance efficient trading strategies. Third, it does not readily generalize to higher dimensions, since the diversification effect is not taken into account. In contrast, we argue that the Kelly approach, in combination with the orthogonal Sharpe ratio, brings clarity to the picture, and in Sect. 5 we provide further evidence supporting this claim.

5 Relative value trading

We investigate the connection between relative value trading and option pricing as highlighted in Bermin and Holm (2021a). As shown in Sect. 3.3, for a fixed level of logarithmic excess return, it is always favorable to use an efficient Kelly strategy, both in terms of relative leverage/drawdown risk and in terms of volatility. These properties follow from the fact that a Kelly strategy, by design, has maximal instantaneous Sharpe ratio and that it is never optimal to leverage more than the growth optimal Kelly strategy. We therefore choose to study the transformation of one efficient Kelly strategy to another as we enlarge the opportunity set. In other words, we start with a trading strategy \(\textbf{w}_1=k_1\textbf{w}_*[U_1]\), \(k_1\in [0,1]\) and investigate the impact of extending the asset universe to \(U=U_1\oplus U_2\), when the new Kelly strategy \(\textbf{w}=k\textbf{w}_*\) is used. From Eq. (11), we then have

$$\begin{aligned} \begin{array}{ll} \mu _0(\textbf{w}_1)=\frac{1}{2}k_1(2-k_1)\textbf{s}_0^2(\textbf{w}_*[U_1]), &{}\mu _0(\textbf{w})=\frac{1}{2}k(2-k)\textbf{s}_0^2(\textbf{w}_*),\\ \sigma ^2_0(\textbf{w}_1)=k^2_1\textbf{s}_0^2(\textbf{w}_*[U_1]), &{}\sigma ^2_0(\textbf{w})=k^2\textbf{s}_0^2(\textbf{w}_*),\\ \textbf{k}_0(\textbf{w}_1)=k_1, &{}\textbf{k}_0(\textbf{w})=k. \end{array} \end{aligned}$$

Hence, for a fixed relative leverage risk, \(k=k_1\), the logarithmic excess return increases as

$$\begin{aligned} \mu _0(\textbf{w})-\mu _0(\textbf{w}_1)=\frac{1}{2}k_1(2-k_1)\left( \textbf{s}^2_0(\textbf{w}_*)-\textbf{s}^2_0(\textbf{w}_*[U_1])\right) \ge 0. \end{aligned}$$

If, instead, we keep the volatility fixed by setting \(k=k_1\textbf{s}_0(\textbf{w}_*[U_1])/\textbf{s}_0(\textbf{w}_*)\), then

$$\begin{aligned} \mu _0(\textbf{w})-\mu _0(\textbf{w}_1)=k_1\textbf{s}_0(\textbf{w}_*[U_1])\left( \textbf{s}_0(\textbf{w}_*)-\textbf{s}_0(\textbf{w}_*[U_1])\right) \ge 0. \end{aligned}$$

Conversely, for a fixed logarithmic excess return we find that

$$\begin{aligned} k=1\pm \sqrt{1-k_1(2-k_1)\frac{\textbf{s}^2_0(\textbf{w}_*[U_1])}{\textbf{s}^2_0(\textbf{w}_*)}}. \end{aligned}$$

By choosing the efficient strategy with lowest variance (that is the one for which \(k\in [0,1]\)), we obtain after some algebraic manipulations

$$\begin{aligned} \frac{\textbf{k}_0(\textbf{w})-\textbf{k}_0(\textbf{w}_1)}{1-k_1}&=1-\sqrt{1+\frac{k_1(2-k_1)}{(1-k_1)^2}\frac{(\textbf{s}^2_0(\textbf{w}_*)-\textbf{s}^2_0(\textbf{w}_*[U_1]))}{\textbf{s}^2_0(\textbf{w}_*)}}\le 0,\\ \frac{\varvec{\sigma }_0(\textbf{w})-\varvec{\sigma }_0(\textbf{w}_1)}{\textbf{s}_0(\textbf{w}_*)-k_1\textbf{s}_0(\textbf{w}_*[U_1])}&=1-\sqrt{1+2k_1\frac{\textbf{s}_0(\textbf{w}_*[U_1])(\textbf{s}_0(\textbf{w}_*)-\textbf{s}_0(\textbf{w}_*[U_1]))}{(\textbf{s}_0(\textbf{w}_*)-k_1\textbf{s}_0(\textbf{w}_*[U_1]))^2}} \le 0. \end{aligned}$$

The conclusion to be drawn is that, when restricted to trading strategies with maximal instantaneous Sharpe ratio, it is almost always beneficial to enlarge the opportunity set. By doing so, we can either increase the logarithmic excess return for a given relative leverage risk (or volatility) level or reduce the relative leverage risk (or volatility) for a given logarithmic excess return level. The only time when no value can be added, relative to the initial portfolio, is when \(\textbf{s}_0(\textbf{w}_*)=\textbf{s}_0(\textbf{w}_*[U_1])\). Here, the direct sum representation degenerates, see Fig. 2, and below we provide a number of equivalent conditions for when this happens.
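The three matching rules discussed above (fixed relative leverage, fixed volatility, fixed logarithmic excess return) can be checked with a short Python sketch. The function names and the numerical values of \(k_1\), \(\textbf{s}_0(\textbf{w}_*[U_1])\) and \(\textbf{s}_0(\textbf{w}_*)\) are hypothetical and chosen only for illustration.

```python
import math

def kelly_stats(k, s_star):
    """(mu, sigma, k) of the Kelly strategy w = k * w_star with maximal Sharpe ratio s_star."""
    return 0.5 * k * (2.0 - k) * s_star**2, k * s_star, k

def enlarge(k1, s1, s):
    """Leverage k on the enlarged set under the three matching rules (requires s >= s1 > 0)."""
    same_leverage = k1
    same_volatility = k1 * s1 / s
    # Root of k(2 - k) s^2 = k1 (2 - k1) s1^2 lying in [0, 1]
    same_return = 1.0 - math.sqrt(1.0 - k1 * (2.0 - k1) * s1**2 / s**2)
    return same_leverage, same_volatility, same_return

k1, s1, s = 0.5, 0.4, 0.5  # hypothetical inputs
mu1, sigma1, _ = kelly_stats(k1, s1)
k_lev, k_vol, k_ret = enlarge(k1, s1, s)
mu_lev, _, _ = kelly_stats(k_lev, s)
mu_vol, sigma_vol, _ = kelly_stats(k_vol, s)
mu_ret, sigma_ret, _ = kelly_stats(k_ret, s)

print(mu_lev - mu1, mu_vol - mu1, k_ret - k1, sigma_ret - sigma1)
```

For these inputs the first two differences are positive and the last two are negative, in line with the inequalities stated above.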

Proposition 14

Let \(\mathcal {H}=(U_1\oplus U^\perp _{2\vert 1},\textbf{V}_0\oplus \textbf{V}_0)\). The following conditions are equivalent

$$\begin{aligned} \begin{array}{ll} \textbf{w}_*=\textbf{w}_*[U_1], &{} \textbf{w}_*[U^\perp _{2\vert 1}] =\textbf{0},\\ \textbf{b}_0(\textbf{w}_*)=\textbf{b}_0(\textbf{w}_*[U_1]), &{}\textbf{b}_0(\textbf{w}_*[U^\perp _{2\vert 1}]) =0,\\ \textbf{s}_0(\textbf{w}_*)=\textbf{s}_0(\textbf{w}_*[U_1]), &{}\textbf{s}_0(\textbf{w}_*[U^\perp _{2\vert 1}]) =0, \end{array}\end{aligned}$$

and

$$\begin{aligned} \textbf{b}_0(\textbf{w})&=\textbf{b}_0(\textbf{P}_{0\vert U_1}(\textbf{w})),\quad \forall \textbf{w}\in \mathcal {H},\\ \textbf{s}_0(\textbf{w})&=\varvec{\rho }_0(\textbf{w},\textbf{w}_*[U_1])\textbf{s}_0(\textbf{w}_*[U_1]),\quad \forall \textbf{w}\in \mathcal {H}. \end{aligned}$$

Proof

Since \(\textbf{w}_*=\textbf{w}_*[U_1]+\textbf{w}_*[U^\perp _{2\vert 1}]\), \(\textbf{b}_0(\textbf{w}_*)=\textbf{b}_0(\textbf{w}_*[U_1])+\textbf{b}_0(\textbf{w}_*[U^\perp _{2\vert 1}])\), and \(\textbf{s}^2_0(\textbf{w}_*)=\textbf{s}^2_0(\textbf{w}_*[U_1])+\textbf{s}^2_0(\textbf{w}_*[U^\perp _{2\vert 1}])\), the conditions in the first set are trivially equivalent. Next, we prove that

$$\begin{aligned} \textbf{s}^2_0\left( \textbf{w}_*[U^\perp _{2\vert 1}]\right) =0\Leftrightarrow \textbf{b}_0(\textbf{w}) = \textbf{b}_0(\textbf{P}_{0\vert U_1}(\textbf{w})), \quad \forall \textbf{w}\in \mathcal {H}. \end{aligned}$$

We first note, since the projection operator \(\textbf{P}_{0\vert U^\perp _{2\vert 1}}\) is orthogonal, that

$$\begin{aligned} \textbf{V}_0\left( \textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w}_*),\textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w}_*)\right) =\textbf{V}_0\left( \textbf{w}_*,\textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w}_*)\right) =\textbf{b}_0\left( \textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w}_*)\right) . \end{aligned}$$

Hence, \(\textbf{s}^2_0(\textbf{w}_*[U^\perp _{2\vert 1}])=\Vert \textbf{w}_*[U^\perp _{2\vert 1}]\Vert ^2_\mathcal {H}=0\) if and only if the covector \(\textbf{b}_0\circ \textbf{P}_{0\vert U^\perp _{2\vert 1}} = \textbf{0}\). But this is equivalent to \(\textbf{b}_0(\textbf{w})=\textbf{b}_0(\textbf{P}_{0\vert U_1}(\textbf{w}))\), for all \(\textbf{w}\in \mathcal {H}\), since

$$\begin{aligned} \textbf{b}_0\circ \textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w})=\textbf{b}_0\left( \textbf{P}_{0\vert U^\perp _{2\vert 1}}(\textbf{w})\right) =\textbf{b}_0\left( \textbf{w}-\textbf{P}_{0\vert U_1}(\textbf{w})\right) =\textbf{b}_0(\textbf{w})-\textbf{b}_0\left( \textbf{P}_{0\vert U_1}(\textbf{w})\right) , \end{aligned}$$

which proves the statement. Finally, we notice that

$$\begin{aligned} \textbf{b}_0(\textbf{w})=\textbf{b}_0(\textbf{P}_{0\vert U_1}(\textbf{w}))=\textbf{V}_0(\textbf{w}_*,\textbf{P}_{0\vert U_1}(\textbf{w}))=\textbf{V}_0(\textbf{P}_{0\vert U_1}(\textbf{w}_*),\textbf{w}), \end{aligned}$$

is equivalent to

$$\begin{aligned} \textbf{s}_0(\textbf{w})=\varvec{\rho }_0(\textbf{P}_{0\vert U_1}(\textbf{w}_*),\textbf{w})\Vert \textbf{P}_{0\vert U_1}(\textbf{w}_*)\Vert _\mathcal {H}, \end{aligned}$$

from which the proof concludes by Theorem 4. \(\square \)
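The last two conditions of Proposition 14 can be checked numerically in the degenerate case. The sketch below assumes asset coordinates, a hypothetical covariance matrix representing \(\textbf{V}_0\), and a hypothetical Kelly vector that already lies in \(U_1={{\,\textrm{span}\,}}(\textbf{v}_1,\textbf{v}_2)\); the helper names are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical covariance matrix of three assets; U1 = span(v1, v2), U2 = span(v3).
cov = np.array([[0.04, 0.01, 0.012],
                [0.01, 0.09, -0.015],
                [0.012, -0.015, 0.0625]])

# Degenerate case: the Kelly vector of U has no component outside U1.
w_star = np.array([0.6, 0.5, 0.0])
b = cov @ w_star  # b_0(v_i) = V_0(w_*, v_i)

def project_onto_u1(w):
    """V_0-orthogonal projection of w onto U1, in asset coordinates."""
    coeffs = np.linalg.solve(cov[:2, :2], cov[:2, :] @ w)
    return np.append(coeffs, 0.0)

def sharpe(w):
    """Instantaneous Sharpe ratio s_0(w) = b_0(w) / sigma_0(w)."""
    return (b @ w) / np.sqrt(w @ cov @ w)

def corr(v, w):
    """Instantaneous correlation rho_0(v, w)."""
    return (v @ cov @ w) / np.sqrt((v @ cov @ v) * (w @ cov @ w))

samples = rng.normal(size=(5, 3))  # arbitrary trading strategies in U
gaps_b = [abs(b @ w - b @ project_onto_u1(w)) for w in samples]
gaps_s = [abs(sharpe(w) - corr(w, w_star) * sharpe(w_star)) for w in samples]

print(max(gaps_b), max(gaps_s))
```

Both gaps are zero up to rounding for every sampled strategy, as the proposition predicts when \(\textbf{w}_*=\textbf{w}_*[U_1]\).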

Below we explain how the concept \(\textbf{s}_0(\textbf{w}_*)=\textbf{s}_0(\textbf{w}_*[U_1])\) can be applied to the pricing of derivatives. We call the pricing rule No Added Relative Value (NARV, for short), with the meaning that the price of an asset is set such that there is no added value, relative to an existing Kelly portfolio, in trading the asset.

5.1 Derivative pricing

We now describe how to price a derivative on one or several assets in a space \(U_1\). We let \(\textbf{v}_\pi \) denote a trading strategy that only takes positions in the derivative. As previously explained, a Kelly trader with opportunity set \(U_1\) can add value to his portfolio by extending the opportunity set if \(U_1\cap {{\,\textrm{span}\,}}(\textbf{v}_\pi ) =\lbrace \textbf{0}\rbrace \) and \(\textbf{s}_0(\textbf{w}_*[U_1\oplus {{\,\textrm{span}\,}}(\textbf{v}_\pi )])\ne \textbf{s}_0(\textbf{w}_*[U_1])\). In what follows, we analyze the meaning of these two conditions and highlight the connection with derivative pricing by means of no-arbitrage.

First, we observe that if \(\textbf{v}_\pi \in U_1\) then \(U_1\cap {{\,\textrm{span}\,}}(\textbf{v}_\pi ) \ne \lbrace \textbf{0}\rbrace \), with the interpretation that \(U_1\) is instantaneously a complete market for valuing the derivative. From Theorem 6, we then have

$$\begin{aligned} \textbf{s}_0(\textbf{v}_\pi )=\varvec{\rho }_0(\varvec{\textbf{v}_\pi },\textbf{w}_*[U_1])\textbf{s}_0(\textbf{w}_*[U_1]). \end{aligned}$$
(14)

Note that when the derivative is written on one asset only, such that \(\textbf{v}_\pi =\lambda \textbf{v}_{1_1}\), a repeated use of Theorem 6 verifies the well-known expression \(\textbf{s}_0(\textbf{v}_\pi )=\pm \textbf{s}_0(\textbf{v}_{1_1})\), with the sign depending on whether we are, for instance, considering a call or a put option. If we further require Eq. (14) to hold for each fixed point in time until the expiry of the derivative, the corresponding price is uniquely defined once we specify the terminal payoff of the derivative. We identify the price as the no-arbitrage price of Merton (1973), allowing for a synthetic replication of the terminal payoff by dynamically trading in the underlying assets.

Next, let us assume that \(U_1\cap {{\,\textrm{span}\,}}(\textbf{v}_\pi ) =\lbrace \textbf{0}\rbrace \), such that \(\textbf{v}_\pi \notin U_1\). In this case, we say that \(U_1\) is instantaneously an incomplete market with respect to the derivative. From Proposition 14, it then follows that the No Added Relative Value (NARV) price is characterized by

$$\begin{aligned} \textbf{s}_0(\textbf{w}_*[U_1\oplus {{\,\textrm{span}\,}}(\textbf{v}_\pi )])=\textbf{s}_0(\textbf{w}_*[U_1])\Leftrightarrow \textbf{s}_0(\textbf{v}_\pi )=\varvec{\rho }_0(\varvec{\textbf{v}_\pi },\textbf{w}_*[U_1])\textbf{s}_0(\textbf{w}_*[U_1]). \end{aligned}$$

Hence, the local characteristics of the NARV price are identical to those of the no-arbitrage price in a complete market.

In order to further explain the properties of NARV pricing, we let \(\textbf{v}_\pi \in U_1\oplus U_2\), for some set \(U_2\). The interpretation is that \(U_1\oplus U_2\) is instantaneously a complete market or equally that the instantaneously incomplete market \(U_1\) has been completed by adding the opportunity set \(U_2\). The unique price of the derivative then satisfies \(\textbf{s}_0(\textbf{v}_\pi )=\varvec{\rho }_0(\varvec{\textbf{v}_\pi },\textbf{w}_*[U_1 \oplus U_2])\textbf{s}_0(\textbf{w}_*[U_1\oplus U_2])\), as shown in Theorem 6. Consequently, the market completion adds no value, relative to \(U_1\), if \(\textbf{w}_*[U_1\oplus U_2]=\textbf{w}_*[U_1]\). But, as shown in Proposition 14, this is equivalent to

$$\begin{aligned} \textbf{s}_0(\textbf{w})=\varvec{\rho }_0(\textbf{w},\textbf{w}_*[U_1])\textbf{s}_0(\textbf{w}_*[U_1]),\quad \forall \textbf{w}\in U_1\oplus U_2. \end{aligned}$$

Hence, in this case, the functional form of the local characteristics is similar for the derivative \(\textbf{w}=\textbf{v}_\pi \) and for the assets \(\textbf{w}\in U_2\). In order to explain the significance of this observation let us consider a market exhibiting stochastic volatility. We assume that \(U_1\) consists of only one asset and that we want to value, say, a call option with strike \(K_1\). Moreover, we further assume that the price of the derivative (represented by \(\textbf{v}_{\pi _1}\)) is uniquely defined once we augment the opportunity set with another call option (represented by \(\textbf{v}_{\pi _2}\)) with, say, strike \(K_2\). Then it is reasonable to claim, since we a priori do not know the price of either derivative, that it should not matter in which order we complete the market and this is exactly what NARV pricing achieves.

Another way to characterize the NARV prices is by recalling Theorem 2, where it was proved that in a complete market the market price of risk vector is identical to the growth optimal Kelly vector. Consequently, if \(\textbf{w}_*[U_1\oplus U_2]=\textbf{w}_*[U_1]\), Proposition 14 alternatively states that the NARV prices can be computed using a market price of risk process satisfying

$$\begin{aligned} \varvec{\Theta }[U_1\oplus U_2]=\textbf{w}_*[U_1\oplus U_2]=\textbf{w}_*[U_1]\Rightarrow \varvec{\Theta }[U^\perp _{2\vert 1}]=\textbf{w}_*[U^\perp _{2\vert 1}]=\textbf{0}, \end{aligned}$$

for every fixed point in time. In the finance literature, the probability measure associated with such a market price of risk process is called the minimal martingale measure and was first introduced in Föllmer and Schweizer (1991). While the connection between Kelly trading and derivative pricing has been derived in Bermin and Holm (2021a), we believe our geometrical approach provides additional insights; notably by realizing that the market price of risk vector and the growth optimal Kelly vector are identical in a complete market.

Finally, we stress that should the market price of a derivative not equal the minimal martingale measure price, a Kelly trader can always add value to his portfolio by enlarging the opportunity set with the derivative.
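As a one-instant illustration of this statement, the Python sketch below computes the excess return of the derivative implied by the NARV condition, \(\textbf{b}_0(\textbf{v}_\pi )=\textbf{V}_0(\textbf{v}_\pi ,\textbf{w}_*[U_1])\), and the gain in squared maximal Sharpe ratio when the derivative is mispriced. It works in asset coordinates with hypothetical local characteristics; it does not price the derivative over its lifetime, which would require the condition to hold at every point in time together with the terminal payoff.

```python
import numpy as np

def narv_excess_return(cov11, b1, c):
    """Excess return implied by s_0(v_pi) = rho_0(v_pi, w_*[U1]) * s_0(w_*[U1])."""
    return float(c @ np.linalg.solve(cov11, b1))  # equals V_0(v_pi, w_*[U1])

def max_sharpe_sq(cov, b):
    """Squared Sharpe ratio of the growth optimal Kelly vector."""
    return float(b @ np.linalg.solve(cov, b))

# Hypothetical local characteristics (assumed inputs)
cov11 = np.array([[0.04, 0.01], [0.01, 0.09]])  # covariance matrix on U1
b1 = np.array([0.03, 0.05])                     # excess returns on U1
c = np.array([0.012, -0.015])                   # covariances of the derivative with v1, v2
var_pi = 0.0625                                 # instantaneous variance of the derivative

cov = np.block([[cov11, c[:, None]], [c[None, :], np.array([[var_pi]])]])
b_narv = narv_excess_return(cov11, b1, c)

def added_value(b_pi):
    """Increase in squared maximal Sharpe ratio from adding the derivative."""
    return max_sharpe_sq(cov, np.append(b1, b_pi)) - max_sharpe_sq(cov11, b1)

kelly_weights_narv = np.linalg.solve(cov, np.append(b1, b_narv))
gain_at_narv = added_value(b_narv)
gain_mispriced = added_value(b_narv + 0.01)
resid_var = var_pi - c @ np.linalg.solve(cov11, c)  # variance orthogonal to U1

print(kelly_weights_narv[2], gain_at_narv, gain_mispriced)
```

At the NARV level the Kelly weight in the derivative and the gain both vanish, whereas a deviation of 0.01 in the excess return yields a gain equal to the squared deviation divided by the variance of the derivative orthogonal to \(U_1\), that is, the squared orthogonal Sharpe ratio.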

6 Comment on risk relativity

Here we briefly outline how our framework can be extended to cover the situation where risk is measured relative to an asset different from the numéraire. As an illustrative example, we may consider a fund manager who benchmarks his performance against, say, bitcoin but reports his earnings in dollars. This leads us to develop a Kelly-like theory for hyperplanes, which do not necessarily pass through the origin and, hence, are not vector spaces but merely affine spaces. We proceed as follows: given that \(U={{\,\textrm{span}\,}}(\textbf{v}_1,\ldots ,\textbf{v}_N)\), we consider a K-dimensional hyperplane A, with \(K\le N\), defined such that for any point \(\textbf{w}\in A\) one can find coefficients \(\lbrace \lambda ^i\rbrace _{1\le i\le K}\) satisfying \(\textbf{w}=\textbf{v}_0+\lambda ^i(\textbf{v}_i-\textbf{v}_0)\), for some arbitrary point \(\textbf{v}_0\). With \(\textbf{u}\) denoting the reference vector, two situations can now occur. Either \(\textbf{u}\) belongs to A or the reference vector lies outside of the hyperplane. In this paper we only consider the first case, which allows us to choose \(\textbf{v}_0=\textbf{u}\). It follows that we can translate the hyperplane to the origin by subtracting the reference vector and form the vector space \(A_\textbf{u}=A-\textbf{u}\). We then define the Hilbert space \(\mathcal {H}_\textbf{u}=(A_\textbf{u},\textbf{V}_\textbf{u})\), where the inner product in \(A_\textbf{u}\) relates to that in U according to

$$\begin{aligned} \textbf{V}_\textbf{u}(\textbf{v}_\textbf{u},\textbf{w}_\textbf{u})=\textbf{V}_0(\textbf{v}-\textbf{u},\textbf{w}-\textbf{u}), \quad \textbf{v}_\textbf{u},\textbf{w}_\textbf{u}\in A_\textbf{u}. \end{aligned}$$
(15)

One notes that the zero vector is the origin in each of the vector spaces U and \(A_\textbf{u}\), but when expressed in terms of U the origin of \(A_\textbf{u}\) equals the point associated with the reference vector. Following the notation in Bermin and Holm (2021b), we then define, in accordance with Proposition 1, the financial quantities

$$\begin{aligned} \textbf{b}_\textbf{u}(\textbf{w})&=\textbf{V}_0(\textbf{w}_*-\textbf{u},\textbf{w}-\textbf{u}), \quad \varvec{\sigma }^2_\textbf{u}(\textbf{w})=\textbf{V}_0(\textbf{w}-\textbf{u},\textbf{w}-\textbf{u}),\end{aligned}$$
(16)
$$\begin{aligned} \varvec{\rho }_\textbf{u}(\textbf{v},\textbf{w})&=\frac{\textbf{V}_0(\textbf{v}-\textbf{u},\textbf{w}-\textbf{u})}{\sqrt{\textbf{V}_0(\textbf{v}-\textbf{u},\textbf{v}-\textbf{u})\textbf{V}_0(\textbf{w}-\textbf{u},\textbf{w}-\textbf{u})}}, \end{aligned}$$
(17)

where, as usual, \(\textbf{w}_*=\textbf{w}_*[U]\). We also set \(\varvec{\mu }_\textbf{u}(\textbf{w})=\textbf{b}_\textbf{u}(\textbf{w})-\frac{1}{2}\varvec{\sigma }^2_\textbf{u}(\textbf{w})\), \(\textbf{s}_\textbf{u}(\textbf{w})=\textbf{b}_\textbf{u}(\textbf{w})/\varvec{\sigma }_\textbf{u}(\textbf{w})\), and \(\textbf{k}_\textbf{u}(\textbf{w})=\varvec{\sigma }_\textbf{u}(\textbf{w})/\textbf{s}_\textbf{u}(\textbf{w})\). Note that while these definitions are natural from a financial point of view, they come with the drawback that the tensor properties of \(\textbf{b}_\textbf{u}\) and \(\varvec{\sigma }^2_\textbf{u}\) are lost. One way to overcome this issue is to define \(\textbf{b}_\textbf{u}(\mathbf {w_u})=\textbf{V}_0(\textbf{w}_*-\textbf{u},\mathbf {w_u})\) and \(\varvec{\sigma }^2_\textbf{u}(\mathbf {w_u})=\textbf{V}_0(\mathbf {w_u},\mathbf {w_u})\). Which notation is most convenient may vary from application to application. With that being said, we continue by defining the growth optimal Kelly vector on the hyperplane \(A=\textbf{u}+A_\textbf{u}\) according to

$$\begin{aligned} \textbf{w}_{*}[A]=\mathop {\mathrm {arg\,max}}\limits _{\textbf{w}\in A}\varvec{\mu }_\textbf{u}(\textbf{w})=\textbf{u}+\mathop {\mathrm {arg\,max}}\limits _{\textbf{w}_\textbf{u}\in A_\textbf{u}}\varvec{\mu }_\textbf{u}(\textbf{u}+\textbf{w}_\textbf{u}), \end{aligned}$$
(18)

such that \(\textbf{w}_*[A]=\textbf{w}_*\) if \(A_\textbf{u}\) and U share the same point space. Similar to affine subspaces, we then define subspaces of the hyperplane A as being generated by subspaces of the corresponding vector space \(A_\textbf{u}\). Moreover, for any subspace \(A_{\textbf{u}1}\subseteq A_\textbf{u}\) we let \(\textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}\) denote the orthogonal projection of \(A_\textbf{u}\) onto \(A_{\textbf{u}1}\), see Lemma 11 for related details. This allows us to generalize Theorems 4 and 6 as below.

Fig. 4

This figure shows the orthogonal decompositions of the translated vector space \(A_\textbf{u}\) (black) and those of the initial vector space U (gray). The growth optimal Kelly vector is invariant with respect to the translation vector \(\textbf{u}\), that is \(\textbf{w}_*=\textbf{u}+\textbf{w}_{\textbf{u}*}\), which implies that Kelly strategies in \(A_\textbf{u}\) correspond to the trading strategies \(\mathbf {w_u}=k(\textbf{w}_*-\textbf{u})\). The growth optimal Kelly vector \(\textbf{w}_{\textbf{u}*}=\varvec{\widehat{v_u}}_1+\varvec{\widehat{v_u}}_{2\vert 1}=\varvec{\widehat{v_u}}_{1\vert 2}+\varvec{\widehat{v_u}}_2\), further admits a representation \(\textbf{w}_{\textbf{u}*}=w^1_{\textbf{u}*}\textbf{v}_{\textbf{u}1}+w^2_{\textbf{u}*}\textbf{v}_{\textbf{u}2}\) in the non-orthogonal decomposition \(A_\textbf{u}=A_{\textbf{u}1}\oplus A_{\textbf{u}2}\). We use the notations: \(\rho _{1,2}=\varvec{\rho }_\textbf{u}(\textbf{v}_1,\textbf{v}_2)\), \(\beta _1^2=\varvec{\beta }_\textbf{u}(\textbf{v}_1,\textbf{v}_2)\), \(\beta _2^1=\varvec{\beta }_\textbf{u}(\textbf{v}_2,\textbf{v}_1)\) and also highlight the level sets of \(\textbf{k}_\textbf{u}(\textbf{w})=k\), for \(k\in \lbrace 1,2,\pm \infty \rbrace \)

Corollary 15

For \(\mathcal {H}_{\textbf{u}1}=(A_{\textbf{u}1},\textbf{V}_\textbf{u})\subseteq \mathcal {H}_\textbf{u}\), let \(A_{1}=\textbf{u}+A_{\textbf{u}1}\) be the associated subspace of A. Then,

$$\begin{aligned} \textbf{w}_{*}[A_{1}] -\textbf{u}= \textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}(\textbf{w}_{*}-\textbf{u}),\quad \Vert \textbf{w}_{*}[A_{1}] -\textbf{u}\Vert _{\mathcal {H}_\textbf{u}}=\textbf{s}_\textbf{u}(\textbf{w}_{*}[A_{1}]). \end{aligned}$$

Proof

Straightforward calculations, setting \(\textbf{w}_{*\textbf{u}}=\textbf{w}_*-\textbf{u}\) and assuming \(\mathbf {w_u}\in A_{\textbf{u}1}\), yield

$$\begin{aligned} \mu _\textbf{u}(\textbf{u}+\mathbf {w_u})&=\textbf{V}_\textbf{u}(\textbf{w}_{*\textbf{u}},\mathbf {w_u})-\frac{1}{2}\textbf{V}_\textbf{u}(\textbf{w}_{\textbf{u}},\mathbf {w_u}),\\&=\textbf{V}_\textbf{u}(\textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}(\textbf{w}_{*\textbf{u}}),\mathbf {w_u})-\frac{1}{2}\textbf{V}_\textbf{u}(\textbf{w}_{\textbf{u}},\mathbf {w_u}),\\&=\frac{1}{2}\Vert \textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}(\textbf{w}_{*\textbf{u}})\Vert ^2_{\mathcal {H}_\textbf{u}}-\frac{1}{2}\Vert \mathbf {w_u}-\textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}(\textbf{w}_{*\textbf{u}})\Vert ^2_{\mathcal {H}_\textbf{u}}. \end{aligned}$$

Hence, \(\mathop {\mathrm {arg\,max}}\limits _{\mathbf {w_u}\in A_{\textbf{u}1}}\mu _\textbf{u}(\textbf{u}+\mathbf {w_u}) = \textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}(\textbf{w}_{*\textbf{u}})\), from which the first part of the proof follows. The second part is a direct consequence of \(\textbf{P}_{\textbf{u}\vert A_{\textbf{u}1}}\) being an orthogonal projection. \(\square \)
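Corollary 15 can be illustrated numerically with the following Python sketch. It assumes asset coordinates, a hypothetical covariance matrix and excess return vector, and a hypothetical reference vector \(\textbf{u}\) that is fully invested in the third asset; the subspace \(A_{\textbf{u}1}\) is taken to be spanned by \(\textbf{v}_1-\textbf{u}\) and \(\textbf{v}_2-\textbf{u}\).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs
cov = np.array([[0.04, 0.01, 0.012],
                [0.01, 0.09, -0.015],
                [0.012, -0.015, 0.0625]])
b = np.array([0.03, 0.05, 0.02])
w_star = np.linalg.solve(cov, b)   # growth optimal Kelly vector on U
u = np.array([0.0, 0.0, 1.0])      # reference vector: fully invested in asset 3

def inner(v, w):
    """Inner product V_0(v, w)."""
    return v @ cov @ w

def mu_u(w):
    """mu_u(w) = b_u(w) - sigma_u^2(w) / 2, with b_u and sigma_u^2 as in Eq. (16)."""
    return inner(w_star - u, w - u) - 0.5 * inner(w - u, w - u)

# A_u1 is spanned by the columns of D: v1 - u and v2 - u
D = np.column_stack([np.eye(3)[:, 0] - u, np.eye(3)[:, 1] - u])
coeffs = np.linalg.solve(D.T @ cov @ D, D.T @ cov @ (w_star - u))
p = D @ coeffs                     # orthogonal projection of w_* - u onto A_u1
w_star_a1 = u + p                  # candidate for w_*[A_1]

sharpe_u = inner(w_star - u, p) / np.sqrt(inner(p, p))  # s_u(w_*[A_1])
norm_u = np.sqrt(inner(p, p))                           # norm of w_*[A_1] - u
# Random perturbations within A_1 should never increase mu_u
perturbed = [mu_u(w_star_a1 + D @ rng.normal(scale=0.1, size=2)) for _ in range(100)]

print(sharpe_u, norm_u, mu_u(w_star_a1) - max(perturbed))
```

The Sharpe ratio and the norm coincide, and no perturbation within \(A_1\) improves on the projected point, consistent with the two statements of the corollary.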

Corollary 16

For \(\mathcal {H}_{\textbf{u}1}=(A_{\textbf{u}1},\textbf{V}_\textbf{u})\subseteq \mathcal {H}_\textbf{u}\), let \(A_{1}=\textbf{u}+A_{\textbf{u}1}\) be the associated subspace of A. Then, for \(\textbf{v}\in A_{1}\), we have

$$\begin{aligned} \textbf{s}_\textbf{u}(\textbf{v})=\varvec{\rho }_\textbf{u}(\textbf{v},\lambda \textbf{w}_{*}[A_{1}])\textbf{s}_\textbf{u}(\lambda \textbf{w}_{*}[A_{1}]),\quad \lambda >0. \end{aligned}$$

Proof

The proof follows similarly to that of Theorem 6 and is thus omitted. \(\square \)

In fact, all the results derived throughout this paper are presented in such a way that they can be modified by simply changing the reference vector. For instance, simple calculations, assuming \(\dim (A_\textbf{u})=2\), yield

$$\begin{aligned} \textbf{s}^2_{\textbf{u}}(\textbf{w}_*)=\textbf{s}^2_{\textbf{u}}(\textbf{v}_1)+\textbf{s}^2_\textbf{u}(\textbf{v}_{2\vert 1})=\textbf{s}^2_\textbf{u}(\textbf{v}_{1\vert 2})+\textbf{s}^2_{\textbf{u}}(\textbf{v}_2), \end{aligned}$$
(19)

where the orthogonal Sharpe ratios equal

$$\begin{aligned} \textbf{s}_\textbf{u}(\textbf{v}_{2\vert 1})=\frac{\textbf{s}_\textbf{u}(\textbf{v}_2)-\rho _\textbf{u}(\textbf{v}_1,\textbf{v}_2)\textbf{s}_\textbf{u}(\textbf{v}_1)}{\sqrt{1-\rho _\textbf{u}^2(\textbf{v}_1,\textbf{v}_2)}},\\ \textbf{s}_\textbf{u}(\textbf{v}_{1\vert 2})=\frac{\textbf{s}_\textbf{u}(\textbf{v}_1)-\rho _\textbf{u}(\textbf{v}_1,\textbf{v}_2)\textbf{s}_\textbf{u}(\textbf{v}_2)}{\sqrt{1-\rho _\textbf{u}^2(\textbf{v}_1,\textbf{v}_2)}}. \end{aligned}$$

We visualize the role of what is considered to be the risk-free asset in Fig. 4. Although the lines \((A_{\textbf{u}1}, A_{\textbf{u}2})\), spanned by \(\textbf{u}+\lambda ^1(\textbf{v}_1-\textbf{u})\) and \(\textbf{u}+\lambda ^2(\textbf{v}_2-\textbf{u})\), respectively, are very different from the lines \((U_1,U_2)\), spanned by \(\lambda ^1\textbf{v}_1\) and \(\lambda ^2\textbf{v}_2\), the direct sums \(A_{\textbf{u}1}\oplus A_{\textbf{u}2}\) and \(U_1\oplus U_2\) have the same point space. Consequently, \(\textbf{w}_*[A]=\textbf{w}_*\) and the trading strategies with maximal Sharpe ratio (i.e., the Kelly strategies) are now of the form \(\textbf{w}=\textbf{u}+k(\textbf{w}_*-\textbf{u})\). For such trading strategies one easily verifies that

$$\begin{aligned} \varvec{\mu }_\textbf{u}(\textbf{w})=\frac{1}{2}k(2-k)\textbf{s}^2_\textbf{u}(\textbf{w}_*),\quad \varvec{\sigma }^2_\textbf{u}(\textbf{w})=k^2\textbf{s}^2_\textbf{u}(\textbf{w}_*),\quad \textbf{k}_\textbf{u}(\textbf{w})=k. \end{aligned}$$
(20)

Hence, we recover the well-known Kelly expressions. For higher dimensions, Eq. (19) must, however, be modified as described in Theorem 13. We leave the details to the reader. Finally, we stress that the restricted growth optimal Kelly vectors \(\textbf{w}_*[A_{1}]\), \(A_{1}\subseteq A\), can change considerably with respect to the chosen reference vector \(\textbf{u}\), even though \(\textbf{w}_*[A]\) is invariant.
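Equations (19) and (20) can be verified numerically in two dimensions with the sketch below. The covariance matrix, the Kelly vector, the reference vector and the leverage `K` are all hypothetical, and the helper names are introduced only for this illustration.

```python
import math

COV = ((0.04, 0.01), (0.01, 0.09))  # hypothetical covariance matrix representing V_0
W_STAR = (0.7, 0.4)                 # hypothetical growth optimal Kelly vector
U_REF = (0.2, -0.1)                 # hypothetical reference vector u
V1, V2 = (1.0, 0.0), (0.0, 1.0)     # the two primary assets

def inner(v, w):
    return sum(v[i] * COV[i][j] * w[j] for i in range(2) for j in range(2))

def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))

def b_u(w):
    return inner(sub(W_STAR, U_REF), sub(w, U_REF))

def sigma_u(w):
    return math.sqrt(inner(sub(w, U_REF), sub(w, U_REF)))

def s_u(w):
    return b_u(w) / sigma_u(w)

def rho_u(v, w):
    return inner(sub(v, U_REF), sub(w, U_REF)) / (sigma_u(v) * sigma_u(w))

def s_orth(v, given):
    """Orthogonal Sharpe ratio of v with respect to `given`, relative to u."""
    r = rho_u(given, v)
    return (s_u(v) - r * s_u(given)) / math.sqrt(1.0 - r * r)

# Eq. (19): both orthogonal decompositions of the squared maximal Sharpe ratio
total = s_u(W_STAR) ** 2
split_12 = s_u(V1) ** 2 + s_orth(V2, V1) ** 2
split_21 = s_u(V2) ** 2 + s_orth(V1, V2) ** 2

# Eq. (20): Kelly strategy w = u + K (w_* - u)
K = 0.6
w_k = tuple(u + K * (ws - u) for u, ws in zip(U_REF, W_STAR))
mu_k = b_u(w_k) - 0.5 * sigma_u(w_k) ** 2
k_k = sigma_u(w_k) / s_u(w_k)

print(total, split_12, split_21, mu_k, k_k)
```

Changing `U_REF` alters the individual terms in each decomposition while both sums continue to agree with the squared maximal Sharpe ratio relative to the chosen reference vector.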

7 Conclusions

In this paper, we present a geometric approach to portfolio theory, with the aim of explaining the geometrical principles behind risk-adjusted returns, in particular Jensen’s alpha. We find that while the alpha/beta approach has severe limitations (especially in higher dimensions), only minor conceptual modifications are needed to complete the picture. However, these minor modifications (e.g., using orthogonal Sharpe ratios rather than risk-adjusted returns) can only be appreciated once a full geometric approach to portfolio theory is developed. In particular, we show how to create trading strategies on the efficient (local) frontier, in the sense of Markowitz (1952) and Tobin (1958), having maximal instantaneous Sharpe ratio. The approach taken is strongly linked to the Kelly criterion and the growth optimal Kelly vector.

Additionally, we derive a number of intermediate results that are of interest by themselves. For instance, we show that in a complete market the so-called market price of risk vector is identical to the growth optimal Kelly vector, albeit expressed in coordinates of a different basis. We further show that the instantaneous correlation between an arbitrary trading strategy and its corresponding growth optimal Kelly strategy can be expressed as the ratio between their Sharpe ratios. By analyzing the level sets of various financial quantities, we also find that points in the mean–variance space cannot, in general, be associated with a unique trading strategy. Only the points on the efficient frontier (that is, those with maximal Sharpe ratio) can be uniquely identified. For such trading strategies, collinear to the growth optimal Kelly vector, we formalize the notion of relative value trading that is implicit in Platen (2006) and Bermin and Holm (2021a). We then apply geometric principles to investigate derivative pricing and introduce the concept of pricing by means of No Added Relative Value (NARV, for short). We say that this concept applies when the orthogonal Sharpe ratio of the derivative equals zero. Using simple geometric arguments, we show that NARV pricing is identical to no-arbitrage pricing with the so-called minimal martingale measure (Föllmer and Schweizer 1991), a result first derived in Bermin and Holm (2021a), albeit with very different methods. We further show that should the market price of a derivative not equal the minimal martingale measure price, a Kelly trader can always add value to his portfolio by enlarging the opportunity set with the derivative.