Model uncertainty is a challenge inherent in many applications of mathematical models. Optimization is generally carried out under one particular model. This model, however, might be misspecified due to statistical estimation errors, incomplete information, biases, and various other reasons. In that sense, any specified model must be understood as an approximation of the unknown “true” model. Difficulties arise since a strategy that is optimal under the approximating model might perform poorly under the true model. A natural way to deal with model uncertainty is to consider worst-case optimization.

Model uncertainty, also called Knightian uncertainty in reference to the seminal book by Knight [10], has been addressed in numerous papers. Gilboa and Schmeidler [9] and Schmeidler [27] formulate rigorous axioms on preference relations that account for risk aversion and uncertainty aversion. A robust utility functional in their sense is a mapping

$$\begin{aligned} X\mapsto \inf _{Q\in \mathcal {Q}}{{\,\mathrm{\mathbb {E}}\,}}_Q\bigl [U(X)\bigr ], \end{aligned}$$

where U is a utility function and \(\mathcal {Q}\) a convex set of probability measures. Chen and Epstein [4] give a continuous-time extension of this multiple-priors utility. In Maccheroni et al. [15] the authors thoroughly axiomatize the robust approach to utility maximization via so-called ambiguity-averse preferences.

Optimal investment decisions under such preferences are investigated in Quenez [23] and Schied [25]. An extension of those results by means of a duality approach is given in Schied [26]. Uncertainty about both drift and volatility in a continuous-time Brownian framework under multiple priors is studied by Lin and Riedel [13]. Further papers addressing drift uncertainty in financial markets are Garlappi et al. [8] and Biagini and Pınar [2]. The latter also focuses on ellipsoidal uncertainty sets, as we do in this work. Neufeld and Nutz [18] incorporate jumps of the price process by considering a Lévy process setup.

A relation between model uncertainty and portfolio diversification is investigated in a recent paper by Pham et al. [22]. Pflug et al. [21] study a one-period risk minimization problem under model uncertainty and show convergence of the optimal strategy to the uniform diversification strategy. Our results generalize these findings to a continuous-time utility maximization problem and provide an explanation for the good performance of the uniform diversification strategy also in a continuous-time setting.

The optimization problem that we address here is a utility maximization problem in a continuous-time financial market. The most basic utility maximization problem in a Black–Scholes market is the Merton problem of maximizing expected utility of terminal wealth. It can be written in the form

$$\begin{aligned} V(x_0) = \sup _{\pi \in \mathcal {A}(x_0)} {{\,\mathrm{\mathbb {E}}\,}}\bigl [U(X^\pi _T)\bigr ], \end{aligned}$$

where \(U:\mathbb {R}_+\rightarrow \mathbb {R}\) is a utility function, \(X^\pi _T\) denotes the terminal wealth that is achieved when using strategy \(\pi \), and \(\mathcal {A}(x_0)\) is the class of admissible strategies starting with initial capital \(x_0\). Merton [16] solves this problem for power and logarithmic utility in a multivariate financial market model and gives a corresponding optimal strategy. However, the setup of the problem assumes that an investor knows the market parameters, in particular the drift \(\mu \) of asset returns. This is a rather unrealistic assumption since drift parameters are notoriously difficult to estimate. To obtain strategies that are robust with respect to a possible misspecification of the drift we consider the worst-case optimization problem

$$\begin{aligned} {\overline{V}}(x_0) = \sup _{\pi \in \mathcal {A}(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U(X^\pi _T)\bigr ]. \end{aligned}$$

Here, we write \({{\,\mathrm{\mathbb {E}}\,}}_\mu [\cdot ]\) for the expectation with respect to a measure \(\mathbb {P}^\mu \) under which the drift of the asset returns is \(\mu \in \mathbb {R}^d\), with d denoting the number of risky assets in the market. The set \(K\subseteq \mathbb {R}^d\) is called the uncertainty set. Our aim is to study the structure of optimal strategies, as well as their asymptotic behavior as the uncertainty set K grows. Since investors facing large uncertainty usually do not invest in the risky assets at all, we restrict the class of admissible strategies by imposing a constraint that prevents a pure bond investment. We focus on ellipsoidal uncertainty sets K, see (4).

Our main results consist firstly in finding an explicit representation of the optimal strategy and the worst-case drift parameter for the robust utility maximization problem with constrained strategies and ellipsoidal uncertainty sets. Secondly, by using this explicit representation, a minimax theorem of the form

$$\begin{aligned} \sup _{\pi \in \mathcal {A}(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U(X^\pi _T)\bigr ] = \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U(X^\pi _T)\bigr ] \end{aligned}$$

is proven. Thirdly, we show that the optimal strategy converges to a generalized uniform diversification strategy. If K is a ball, this is the equal weight strategy, corresponding to uniform diversification. This result is somewhat surprising since in the limit the optimal strategy no longer depends on the volatility structure of the assets. In that sense, our results help to explain the popularity of uniform diversification strategies by the presence of uncertainty in the model.

The paper is organized as follows. In Sect. 2 we state our multivariate, possibly incomplete, Black–Scholes type financial market model and introduce the robust utility maximization problem. Our main results are given in Sect. 3, where we solve our optimization problem for power and logarithmic utility. The main idea is to solve the dual problem explicitly and then to show that the solution forms a saddle point of the problem. We give representations of the optimal strategy and the worst-case drift parameter and prove a minimax theorem. In Sect. 4 we study the asymptotic behavior of the optimal strategy and the worst-case parameter as the degree of uncertainty goes to infinity. We show that the optimal strategy converges to a generalized uniform diversification strategy, where by uniform diversification we mean the equal weight or 1/d strategy for the investment in the risky assets. Furthermore, we analyze the influence of the investor’s risk aversion on the speed of convergence and investigate measures for the performance of the optimal robust strategies. Section 5 gives an outlook on more general financial market models with stochastic drift processes for which we state a suitable problem formulation. Our results can then be used to derive an explicit representation of the optimal strategy as well as a minimax theorem in the more general model as well. For better readability, all proofs are collected in Appendix A.

Notation. We use the notation \(I_d\) for the identity matrix in \(\mathbb {R}^{d\times d}\) as well as \(e_i\), \(i=1,\dots ,d\), for the i-th standard unit vector in \(\mathbb {R}^d\), and \(\mathbf {1}_d\) for the vector in \(\mathbb {R}^d\) containing a one in every component. For brevity we write \(\mathbb {R}_+=(0,\infty )\). By \(\langle \cdot ,\cdot \rangle \) we denote the scalar product on \(\mathbb {R}^d\times \mathbb {R}^d\) with \(\langle x,y\rangle =x^\top y\) for \(x,y\in \mathbb {R}^d\). If \(x\in \mathbb {R}^d\) is a vector, \(\Vert x\Vert \) denotes the Euclidean norm of x.

Robust utility maximization problem

Financial market model

We consider a continuous-time financial market with one risk-free and various risky assets. By \(T>0\) we denote some finite investment horizon. Let \((\Omega , \mathcal {F}, \mathbb {F}, \mathbb {P})\) be a filtered probability space where the filtration \(\mathbb {F}=(\mathcal {F}_t)_{t\in [0,T]}\) satisfies the usual conditions. All processes are assumed to be \(\mathbb {F}\)-adapted. The risk-free asset \(S^0\) is of the form \(S^0_t=\mathrm {e}^{rt}\), \(t\in [0,T]\), where \(r\in \mathbb {R}\) is the constant risk-free interest rate. Aside from the risk-free asset, investors can also invest in \(d\ge 2\) risky assets. Their return process \(R=(R^1,\dots ,R^d)^\top \) is defined by

$$\begin{aligned} \mathrm {d}R_t = \nu \,\mathrm {d}t + \sigma \,\mathrm {d}W_t, \quad R_0=0, \end{aligned}$$

where \(W=(W_t)_{t\in [0,T]}\) is an m-dimensional Brownian motion under \(\mathbb {P}\) with \(m\ge d\), allowing for incomplete markets. Further, \(\nu \in \mathbb {R}^d\) and \(\sigma \in \mathbb {R}^{d\times m}\), where we assume that \(\sigma \) has full rank equal to d.

We introduce model uncertainty by assuming that the true drift of the stocks is only known to be an element of some set \(K\subseteq \mathbb {R}^d\) with \(\nu \in K\) and that investors want to maximize their worst-case expected utility when the drift takes values within K. The value \(\nu \) can be thought of as an estimate for the drift that was for instance obtained from historical stock prices. Changing the drift from \(\nu \) to some \(\mu \in K\) can be expressed by a change of measure. For this purpose, define the process \((Z^\mu _t)_{t\in [0,T]}\) by

$$\begin{aligned} Z^\mu _t = \exp \Bigl (\theta (\mu )^\top W_t -\frac{1}{2}\Vert \theta (\mu )\Vert ^2 t\Bigr ), \end{aligned}$$

where \(\theta (\mu )=\sigma ^\top (\sigma \sigma ^\top )^{-1}(\mu -\nu )\). We can then define a new measure \(\mathbb {P}^\mu \) by setting \(\frac{\mathrm {d}\mathbb {P}^\mu }{\mathrm {d}\mathbb {P}} = Z^\mu _T\). Note that since \(\theta (\mu )\) is a constant, the process \((Z^\mu _t)_{t\in [0,T]}\) is a strictly positive martingale. Therefore, \(\mathbb {P}^\mu \) is a probability measure that is equivalent to \(\mathbb {P}\) and we obtain from Girsanov’s Theorem that the process \((W^\mu _t)_{t\in [0,T]}\), defined by \(W^\mu _t = W_t-\theta (\mu )t\), is a Brownian motion under \(\mathbb {P}^\mu \). We can thus rewrite the return dynamics as

$$\begin{aligned} \mathrm {d}R_t = \nu \,\mathrm {d}t + \sigma \,\mathrm {d}W_t = \nu \,\mathrm {d}t + \sigma \bigl (\mathrm {d}W^\mu _t+\theta (\mu )\,\mathrm {d}t\bigr ) = \mu \,\mathrm {d}t + \sigma \,\mathrm {d}W^\mu _t, \end{aligned}$$

and see that a change of measure from \(\mathbb {P}\) to \(\mathbb {P}^\mu \) corresponds to changing the drift in the return dynamics from \(\nu \) to \(\mu \). For brevity we therefore write \({{\,\mathrm{\mathbb {E}}\,}}_\mu [\cdot ]\) for the expectation under the measure \(\mathbb {P}^\mu \) and \({{\,\mathrm{\mathbb {E}}\,}}[\cdot ]={{\,\mathrm{\mathbb {E}}\,}}_\nu [\cdot ]\) for the expectation under our reference measure \(\mathbb {P}=\mathbb {P}^\nu \).
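The role of \(\theta (\mu )\) can be checked numerically. The following minimal sketch assumes a hypothetical market with \(d=3\) and \(m=4\); the function name `girsanov_kernel` and all numbers are illustrative and not taken from the paper. It verifies that \(\sigma \theta (\mu )=\mu -\nu \), which is exactly what turns the drift \(\nu \) into \(\mu \) in the return dynamics.

```python
import numpy as np

def girsanov_kernel(sigma, mu, nu):
    """Return theta(mu) = sigma^T (sigma sigma^T)^{-1} (mu - nu)."""
    return sigma.T @ np.linalg.solve(sigma @ sigma.T, mu - nu)

# Hypothetical market: d = 3 risky assets, m = 4 Brownian drivers (full rank d).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
nu = np.array([0.05, 0.07, 0.06])   # reference drift (illustrative)
mu = np.array([0.02, 0.09, 0.04])   # alternative drift (illustrative)

theta = girsanov_kernel(sigma, mu, nu)

# sigma @ theta recovers the drift shift mu - nu, so that
# nu dt + sigma (dW^mu + theta dt) = mu dt + sigma dW^mu.
drift_shift = sigma @ theta
print(np.allclose(drift_shift, mu - nu))
```

Solving the linear system with `np.linalg.solve` avoids forming \((\sigma \sigma ^\top )^{-1}\) explicitly, which is a design choice of this sketch rather than part of the model.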

An investor’s trading decisions are described by a self-financing trading strategy \((\pi _t)_{t\in [0,T]}\) with values in \(\mathbb {R}^d\). The entry \(\pi ^i_t\), \(i=1, \dots , d\), is the proportion of wealth invested in asset i at time t. The corresponding wealth process \((X^\pi _t)_{t\in [0,T]}\) given initial wealth \(x_0>0\) can then be described by the stochastic differential equation

$$\begin{aligned} \mathrm {d}X^\pi _t = X^\pi _t\Bigl ( r\,\mathrm {d}t + \pi _t^\top (\mu -r\mathbf {1}_d)\,\mathrm {d}t + \pi _t^\top \sigma \,\mathrm {d}W^\mu _t \Bigr ), \quad X^\pi _0=x_0, \end{aligned}$$

for any \(\mu \in K\). We require trading strategies to be \(\mathbb {F}^R\)-adapted, where \(\mathbb {F}^R=(\mathcal {F}^R_t)_{t\in [0,T]}\) is the filtration generated by the returns, i.e. \(\mathcal {F}^R_t=\sigma ((R_s)_{s\in [0,t]})\). The admissibility set is defined as

$$\begin{aligned} \mathcal {A}(x_0) = \biggl \{\pi =(\pi _t)_{t\in [0,T]} \;\bigg |\; \pi \text { is } \mathbb {F}^R\text {-adapted}, \; X^\pi _0=x_0,\\ \; {{\,\mathrm{\mathbb {E}}\,}}_\mu \biggl [\int _0^T \Vert \sigma ^\top \pi _t\Vert ^2\,\mathrm {d}t\biggr ]<\infty \text { for all } \mu \in K\biggr \}. \end{aligned}$$

Our robust portfolio optimization problem can then be formulated as

$$\begin{aligned} {\overline{V}}(x_0) = \sup _{\pi \in \mathcal {A}(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ], \end{aligned} \tag{1}$$

where \(U_\gamma :\mathbb {R}_+\rightarrow \mathbb {R}\), \(\gamma \in (-\infty ,1)\), is a power or logarithmic utility function, i.e. \(U_\gamma (x)=\frac{x^\gamma }{\gamma }\) for \(\gamma \ne 0\) denotes power utility and \(U_0(x)=\log (x)\) denotes logarithmic utility.

Constraint on the admissible strategies

In the following, our aim is to investigate problem (1) in detail. First, we make the observation that for a large degree of model uncertainty the trivial strategy \(\pi \equiv 0\) becomes optimal both for logarithmic and for power utility. This result has been shown in a similar setting by Biagini and Pınar [2, Sec. 3.1–3.2] who address in addition to the finite horizon setting also the case with an infinite time horizon.

Proposition 2.1

Let \(\gamma \in (-\infty ,1)\) and \(K\subseteq \mathbb {R}^d\). If \(r\mathbf {1}_d\in K\), then the strategy \((\pi _t)_{t\in [0,T]}\) with \(\pi _t=0\) for all \(t\in [0,T]\) is optimal for the optimization problem

$$\begin{aligned} \sup _{\pi \in \mathcal {A}(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

This observation implies that as the level of uncertainty about the true drift parameter exceeds a certain threshold, it is optimal for investors to not invest anything in the stocks.
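The mechanism behind this observation can be illustrated for constant strategies. The sketch below is an illustration under stated assumptions, not the proof: it restricts to constant \(\pi \) and uses the standard lognormal computation \({{\,\mathrm{\mathbb {E}}\,}}_\mu [U_\gamma (X^\pi _T)]=U_\gamma (x_0\mathrm {e}^{f(\pi ,\mu )T})\) with \(f(\pi ,\mu )=r+\pi ^\top (\mu -r\mathbf {1}_d)-\frac{1}{2}(1-\gamma )\Vert \sigma ^\top \pi \Vert ^2\). The name `growth_rate` and the market data are hypothetical. Under the drift \(\mu =r\mathbf {1}_d\), every constant strategy yields a rate of at most r, while \(\pi \equiv 0\) yields exactly r under every drift.

```python
import numpy as np

def growth_rate(pi, mu, sigma, r, gamma):
    """Rate f with E_mu[U_gamma(X_T)] = U_gamma(x0 * exp(f * T)) for constant pi."""
    pi = np.asarray(pi, dtype=float)
    return r + pi @ (mu - r) - 0.5 * (1.0 - gamma) * np.sum((sigma.T @ pi) ** 2)

# Hypothetical market data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
d, r = 3, 0.02
mu_riskless = r * np.ones(d)          # the drift r * 1_d, assumed to lie in K

rng = np.random.default_rng(1)
max_rate = {}
for gamma in (-2.0, 0.0, 0.5):
    rates = [growth_rate(rng.normal(size=d), mu_riskless, sigma, r, gamma)
             for _ in range(1000)]
    max_rate[gamma] = max(rates)      # best rate among the sampled strategies

# The zero strategy earns exactly r, whatever the drift is.
zero_rate = growth_rate(np.zeros(d), np.array([0.3, -0.1, 0.2]), sigma, r, 0.5)
print(max_rate, zero_rate)
```

Since the worst case over K is at most the value under \(r\mathbf {1}_d\), no sampled strategy has a worst-case rate above r.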

Remark 2.2

Proposition 2.1 could be reformulated in terms of robust utility functionals by assuming only that a martingale measure is in the ambiguity set. The statement of the proposition is in line with Øksendal and Sulem [19, 20] where the authors obtain a similar result for optimality of \(\pi \equiv 0\). They consider a jump diffusion model with a worst-case approach where the market chooses a scenario from a fixed but very comprehensive set of probability measures. In contrast, it is shown in Zawisza [29] that, if the model allows for stochastic interest rate r, the optimal strategy does not invest exclusively in the bond. Lin and Riedel [14] show that, when there is a large degree of uncertainty about interest rates, the investor even puts all money in the risky assets.

Investing everything in the risk-free asset is a sensible but very extreme reaction to model uncertainty. We are interested in finding out which strategies are reasonable under high model uncertainty if investors still want to invest a part of their wealth into the risky assets, or, alternatively, if they are forced to invest due to some external requirements. For that purpose, we introduce a constraint on our strategies that prevents investors from solely investing in the bond. Consider for some \(h>0\) the admissibility set

$$\begin{aligned} \mathcal {A}_h(x_0)=\bigl \{ \pi \in \mathcal {A}(x_0) \,\big |\, \langle \pi _t,\mathbf {1}_d\rangle = h \text { for all } t\in [0,T] \bigr \}. \end{aligned}$$

We do not want to exclude short-selling, so negative entries of \(\pi \) are possible. Taking \(h=1\) would imply that investors are not allowed to invest anything in the risk-free asset. They must then distribute all of their wealth among the risky assets. For instance, a constraint of the form \(\langle \pi _t,\mathbf {1}_d\rangle = h>0\) typically applies to mutual funds whose investors are required to hold a certain amount in risky assets. Moreover, DeMiguel et al. [6] study how constraining the norm of portfolio weight vectors in a one-period model can improve portfolio performance in the presence of estimation errors.

Remark 2.3

The admissibility set \(\mathcal {A}_h(x_0)\) might seem unnecessarily restrictive at first glance. Instead of fixing \(\langle \pi _t,\mathbf {1}_d\rangle =h\) one might want to consider utility maximization among the larger class of strategies \(\pi \) with \(\langle \pi _t,\mathbf {1}_d\rangle \ge h\). However, we are mainly interested in the asymptotic behavior of the optimal strategies as the level of uncertainty increases. It is intuitively clear that, when uncertainty is large, investors seek to invest as little as possible in the risky assets. Therefore, we consider optimization among strategies in \(\mathcal {A}_h(x_0)\) and use our results to show that enlarging the class of admissible strategies asymptotically does not change the value of the optimization problem, see Sect. 4.2.

A duality approach

In this section we solve for power or logarithmic utility \(U_\gamma \) and for specific uncertainty sets K the optimization problem

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned} \tag{3}$$

Remark 3.1

In the situation with logarithmic utility and uncertainty sets that are balls in some p-norm, \(p\in [1,\infty )\), it is possible to carry over methods from a one-period risk minimization problem as in Pflug et al. [21] to our continuous-time robust utility maximization problem. If \(K=\{\mu \in \mathbb {R}^d\,|\,\Vert \mu -\nu \Vert _p\le \kappa \}\), then for every \(\varepsilon >0\) there exists a \(\kappa _0>0\) such that for all \(\kappa \ge \kappa _0\) the strategy \(\pi ^*(\kappa )\) that is optimal for

$$\begin{aligned} \sup _{\begin{array}{c} \pi \in \mathcal {A}_h(x_0)\\ \pi \text { deterministic} \end{array}} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [\log (X^\pi _T)\bigr ] \end{aligned}$$


satisfies

$$\begin{aligned} \biggl \Vert \frac{1}{T}\int _0^T\Bigl (\pi _s^*(\kappa )-\frac{h}{d}\mathbf {1}_d\Bigr )\mathrm {d}s\biggr \Vert _q<\varepsilon , \end{aligned}$$

where \(q\in (1,\infty ]\) with \(\frac{1}{p}+\frac{1}{q}=1\). See Westphal [28, Thm. 3.4] for a proof. This shows that the optimal strategy among the deterministic ones converges, as uncertainty increases, to a uniform diversification strategy \(\pi ^u\) with \(\pi ^u_t=\frac{h}{d}\mathbf {1}_d\) for every \(t\in [0,T]\). Hence, as uncertainty about the true drift parameter goes to infinity, investors split the proportion h of their money more and more evenly among all risky assets.

This approach has several drawbacks. Firstly, we can follow the ideas from Pflug et al. [21] in continuous time only for logarithmic utility and uncertainty sets K that are balls in p-norm. Secondly, we have to restrict ourselves to the class of deterministic strategies to be able to use their methods. However, it is by no means clear a priori that an optimal strategy for our problem should be deterministic. In fact, in many worst-case optimization problems it is even beneficial to use randomized strategies, see Delage et al. [5]. Lastly, the above result does not yield an explicit solution to the robust optimization problem; it only gives asymptotic results for large levels of uncertainty. To overcome these problems we follow here a different approach that works for both power and logarithmic utility and that results in an explicit solution of the optimization problem.

We study the case where the uncertainty set is an ellipsoid in \(\mathbb {R}^d\) centered around the reference parameter \(\nu \), i.e.

$$\begin{aligned} K=\bigl \{ \mu \in \mathbb {R}^d \,\big |\, (\mu -\nu )^\top \varGamma ^{-1}(\mu -\nu ) \le \kappa ^2 \bigr \}. \end{aligned} \tag{4}$$

Here, \(\kappa >0\), \(\nu \in \mathbb {R}^d\), and \(\varGamma \in \mathbb {R}^{d\times d}\) is symmetric and positive definite. The matrix \(\varGamma \) determines the shape of the ellipsoid, the value of \(\kappa \) its size. Higher values of \(\kappa \) correspond to more uncertainty about the true drift.

By means of \(\varGamma \) we can model that some (linear combinations of) drifts are known with a higher degree of accuracy than others. A special case discussed in the literature is \(\varGamma =\sigma \sigma ^\top \), see e.g. Biagini and Pınar [2]. Other forms of \(\varGamma \) can also be motivated. For \(\varGamma =I_d\) we simply get a ball in the Euclidean norm with radius \(\kappa \) and center \(\nu \). By setting \(\varGamma \) equal to a diagonal matrix different from the identity we can give different weights to the uncertainty of the single asset drifts.

More generally, assume that the reference drift parameter \(\nu \) is obtained as the value of an unbiased estimator \({\hat{\mu }}\) for the true drift, say from observing historical returns. Then the covariance matrix \({{\,\mathrm{cov}\,}}({\hat{\mu }})\) is a reasonable choice for \(\varGamma \), because then the uncertainty set K constitutes a natural (asymptotic) confidence region for the true drift. This flexibility in the form of \(\varGamma \) is especially useful for a generalization of our model to a setting with time-dependent drift and uncertainty sets, see Sect. 5, where we give a short outlook on Sass and Westphal [24]. In that follow-up work a time-dependent uncertainty set is constructed based on filtering techniques.
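To make the effect of the shape matrix concrete, the following sketch tests membership in K for two of the choices mentioned above, \(\varGamma =I_d\) and \(\varGamma =\sigma \sigma ^\top \). The helper `in_uncertainty_set` and all numerical values are assumptions made for illustration only. The same drift can lie inside the Euclidean ball and outside the volatility-shaped ellipsoid of equal \(\kappa \).

```python
import numpy as np

def in_uncertainty_set(mu, nu, Gamma, kappa):
    """Check (mu - nu)^T Gamma^{-1} (mu - nu) <= kappa^2."""
    diff = np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float)
    return bool(diff @ np.linalg.solve(Gamma, diff) <= kappa ** 2)

# Hypothetical market data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
nu = np.array([0.05, 0.07, 0.06])
mu = nu + np.array([0.10, 0.00, 0.00])   # shift only the first drift
kappa = 0.3

# Gamma = I_d: Euclidean ball of radius kappa around nu.
in_ball = in_uncertainty_set(mu, nu, np.eye(3), kappa)

# Gamma = sigma sigma^T: ellipsoid shaped by the return covariance.
in_vol_ellipsoid = in_uncertainty_set(mu, nu, sigma @ sigma.T, kappa)

print(in_ball, in_vol_ellipsoid)
```

With \(\varGamma =\sigma \sigma ^\top \) the admissible drift deviations scale with the volatilities, so a shift of 0.10 in an asset with roughly 20% volatility is already too large for \(\kappa =0.3\).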

Solution of the non-robust problem

To solve the optimization problem (3) we first address the non-robust constrained utility maximization problem under a fixed parameter \(\mu \in \mathbb {R}^d\). We repeatedly make use of a specific matrix that we introduce in the following lemma.

Lemma 3.2

Consider the matrix

$$\begin{aligned} D = \begin{pmatrix} 1 &{} &{} 0 &{} -1 \\ &{}\ddots &{} &{}\vdots \\ 0 &{} &{} 1 &{} -1 \end{pmatrix}\in \mathbb {R}^{(d-1)\times d}. \end{aligned}$$

Then, given that \(\sigma \in \mathbb {R}^{d\times m}\) has rank d, \(D\sigma \) has rank \(d-1\).

The matrix D defined in the lemma above comes up naturally in calculations when using the constraint \(\langle \pi _t,\mathbf {1}_d\rangle =h\) in the form \(\pi ^d_t = h-\sum _{i=1}^{d-1} \pi ^i_t\). This can be seen as a reduction of the problem from d dimensions to \(d-1\) dimensions. For better readability of the calculations below we introduce the following notation.
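The reduction can be made explicit in a few lines. The sketch below builds D, checks the rank statement of Lemma 3.2 on a hypothetical \(\sigma \), and lifts a reduced weight vector \({\widetilde{\pi }}\in \mathbb {R}^{d-1}\) to \(\pi =D^\top {\widetilde{\pi }}+he_d\). The names `constraint_matrix` and `pi_tilde` as well as the numbers are illustrative assumptions.

```python
import numpy as np

def constraint_matrix(d):
    """D in R^{(d-1) x d}: identity block followed by a column of -1."""
    return np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])

# Hypothetical volatility matrix with full rank d = 3 (m = 4 drivers).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
d = sigma.shape[0]
D = constraint_matrix(d)

# Lemma 3.2: D @ sigma has rank d - 1 when sigma has rank d.
rank_D_sigma = np.linalg.matrix_rank(D @ sigma)

# Lift reduced weights to a full strategy: the last weight absorbs the
# constraint, pi^d = h - sum_{i<d} pi^i.
h = 1.0
pi_tilde = np.array([0.7, -0.4])      # hypothetical reduced weights
e_d = np.eye(d)[-1]
pi = D.T @ pi_tilde + h * e_d

print(rank_D_sigma, pi, pi.sum())
```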

Definition 3.3

We define the matrix \(A\in \mathbb {R}^{d\times d}\) and the vector \(c\in \mathbb {R}^d\) by

$$\begin{aligned} A&= D^\top (D\sigma \sigma ^\top D^\top )^{-1}D, \\ c&= e_d-D^\top (D\sigma \sigma ^\top D^\top )^{-1}D\sigma \sigma ^\top e_d = (I_d-A\sigma \sigma ^\top )e_d, \end{aligned}$$

where \(D\in \mathbb {R}^{(d-1)\times d}\) is as given in Lemma 3.2 and \(e_d\) is the d-th standard unit vector in \(\mathbb {R}^d\).

Note that we assume \(\sigma \in \mathbb {R}^{d\times m}\) to have full rank, hence by the previous lemma we know that \(D\sigma \) has full rank, in particular \(D\sigma \sigma ^\top D^\top =D\sigma (D\sigma )^\top \) is nonsingular. Using this notation we give the optimal strategy for the constrained optimization problem given a fixed drift \(\mu \). The possible incompleteness of the market does not complicate our approach here. The reason is that, for determining the optimal strategy, we can essentially reduce the problem to an unconstrained lower-dimensional financial market where the optimal strategy can be obtained as a classical Merton strategy.
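For concreteness, A and c can be computed as follows. This is a sketch on hypothetical data; the function name `reduction_matrices` is an assumption. It also checks two identities that follow from \(D\mathbf {1}_d=0\), namely \(A\mathbf {1}_d=0\) and \(\langle c,\mathbf {1}_d\rangle =1\), which are what make any vector of the form \(aA\mu +hc\) with \(a\in \mathbb {R}\) sum to h.

```python
import numpy as np

def reduction_matrices(sigma):
    """Return (A, c) from Definition 3.3 for a volatility matrix sigma (d x m)."""
    d = sigma.shape[0]
    D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
    Sigma = sigma @ sigma.T
    # A = D^T (D Sigma D^T)^{-1} D, computed via a linear solve.
    A = D.T @ np.linalg.solve(D @ Sigma @ D.T, D)
    # c = (I_d - A Sigma) e_d is the last column of I_d - A Sigma.
    c = (np.eye(d) - A @ Sigma)[:, -1]
    return A, c

# Hypothetical volatility matrix (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
A, c = reduction_matrices(sigma)
ones = np.ones(sigma.shape[0])

print(np.allclose(A @ ones, 0.0), np.isclose(c @ ones, 1.0))
```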

Proposition 3.4

Let \(\mu \in \mathbb {R}^d\). Then the optimal strategy for the optimization problem

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] \end{aligned}$$

is the strategy \((\pi _t)_{t\in [0,T]}\) with

$$\begin{aligned} \pi _t = \frac{1}{1-\gamma }A\mu +hc \end{aligned}$$

for all \(t\in [0,T]\), with A and c as in Definition 3.3.

In the proof the d-dimensional constrained problem is reduced to a \((d-1)\)-dimensional unconstrained problem. Using the form of the optimal strategy in the \((d-1)\)-dimensional market which is known from Merton [16] yields the following representation for the optimal expected utility from terminal wealth.
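A numerical sanity check of Proposition 3.4 is sketched below, under two assumptions that go beyond the text: competitors are restricted to constant strategies, and expected utility is compared through the rate \(f(\pi ,\mu )=r+\pi ^\top (\mu -r\mathbf {1}_d)-\frac{1}{2}(1-\gamma )\Vert \sigma ^\top \pi \Vert ^2\), which satisfies \({{\,\mathrm{\mathbb {E}}\,}}_\mu [U_\gamma (X^\pi _T)]=U_\gamma (x_0\mathrm {e}^{fT})\) for constant \(\pi \). Function names and data are hypothetical. Feasible competitors are generated as \(\pi +D^\top z\), which keeps the sum of weights equal to h.

```python
import numpy as np

def reduction_matrices(sigma):
    """Return (D, A, c) with A and c as in Definition 3.3."""
    d = sigma.shape[0]
    D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
    Sigma = sigma @ sigma.T
    A = D.T @ np.linalg.solve(D @ Sigma @ D.T, D)
    c = (np.eye(d) - A @ Sigma)[:, -1]
    return D, A, c

def growth_rate(pi, mu, sigma, r, gamma):
    """Rate f(pi, mu) for a constant strategy pi."""
    return r + pi @ (mu - r) - 0.5 * (1.0 - gamma) * np.sum((sigma.T @ pi) ** 2)

# Hypothetical market data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
mu = np.array([0.05, 0.07, 0.06])
r, gamma, h = 0.01, -1.0, 1.0

D, A, c = reduction_matrices(sigma)
pi_opt = A @ mu / (1.0 - gamma) + h * c       # strategy from Proposition 3.4

# Random feasible competitors: D^T z sums to zero, so the constraint is kept.
rng = np.random.default_rng(0)
competitors = [pi_opt + D.T @ rng.normal(scale=0.5, size=D.shape[0])
               for _ in range(1000)]
best_rate = growth_rate(pi_opt, mu, sigma, r, gamma)
max_other = max(growth_rate(p, mu, sigma, r, gamma) for p in competitors)

print(pi_opt.sum(), best_rate, max_other)
```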

Corollary 3.5

Let \(\mu \in \mathbb {R}^d\). Then the optimal expected utility from terminal wealth is

$$\begin{aligned} \begin{aligned}&\sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] \\&= {\left\{ \begin{array}{ll} \frac{x_0^\gamma }{\gamma }\exp \Bigl (\gamma T\Bigl ( {\widetilde{r}}+\frac{1}{2(1-\gamma )}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr )^\top ({\widetilde{\sigma }}{\widetilde{\sigma }}^\top )^{-1}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr )\Bigr )\Bigr ), &{}\gamma \ne 0,\\ \log (x_0) + \Bigl ( {\widetilde{r}}+\frac{1}{2}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr )^\top ({\widetilde{\sigma }}{\widetilde{\sigma }}^\top )^{-1}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr ) \Bigr )T, &{}\gamma =0, \end{array}\right. } \end{aligned} \end{aligned}$$


where

$$\begin{aligned} \begin{aligned} {\widetilde{\sigma }}&=D\sigma , \\ {\widetilde{r}}&=(1-h)r+he_d^\top \mu -\frac{1}{2}(1-\gamma )\Vert h\sigma ^\top e_d \Vert ^2, \\ {\widetilde{\mu }}&=D\mu - h(1-\gamma )D\sigma \sigma ^\top e_d+{\widetilde{r}}\mathbf {1}_{d-1}. \end{aligned} \end{aligned}$$
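The transformed parameters can be cross-checked against a direct evaluation. The sketch below, on hypothetical data and with hypothetical variable names, computes the exponent \({\widetilde{r}}+\frac{1}{2(1-\gamma )}({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1})^\top ({\widetilde{\sigma }}{\widetilde{\sigma }}^\top )^{-1}({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1})\) and compares it with the rate \(r+\pi ^\top (\mu -r\mathbf {1}_d)-\frac{1}{2}(1-\gamma )\Vert \sigma ^\top \pi \Vert ^2\) evaluated at \(\pi =\frac{1}{1-\gamma }A\mu +hc\). The latter expression is the standard lognormal formula for constant strategies and is used here as an assumption of the check.

```python
import numpy as np

# Hypothetical market data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
mu = np.array([0.05, 0.07, 0.06])
r, gamma, h = 0.01, -1.0, 0.6
d = sigma.shape[0]

D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
Sigma = sigma @ sigma.T

# Transformed parameters of the (d-1)-dimensional market.
tilde_sigma = D @ sigma
tilde_r = (1 - h) * r + h * mu[-1] - 0.5 * (1 - gamma) * h ** 2 * Sigma[-1, -1]
excess = D @ mu - h * (1 - gamma) * (D @ Sigma[:, -1])   # tilde_mu - tilde_r * 1
rate_reduced = tilde_r + excess @ np.linalg.solve(
    tilde_sigma @ tilde_sigma.T, excess) / (2 * (1 - gamma))

# Direct evaluation at the strategy from Proposition 3.4.
A = D.T @ np.linalg.solve(D @ Sigma @ D.T, D)
c = (np.eye(d) - A @ Sigma)[:, -1]
pi = A @ mu / (1 - gamma) + h * c
rate_direct = r + pi @ (mu - r) - 0.5 * (1 - gamma) * np.sum((sigma.T @ pi) ** 2)

print(rate_reduced, rate_direct)
```

The two rates agree up to rounding, so inserting either one into \(\frac{x_0^\gamma }{\gamma }\exp (\gamma T\,\cdot \,)\) gives the same value.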

The previous results give a representation of the optimal strategy and the optimal expected utility of terminal wealth under the constraint \(\langle \pi _t,\mathbf {1}_d\rangle = h\), given that the drift parameter \(\mu \) is known. Of course, both the strategy and the terminal wealth then depend on \(\mu \). However, we aim at solving the robust utility maximization problem

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

For that purpose, we address in the next step the question of what the worst possible parameter \(\mu \) would be for the investor, given that she reacts optimally, i.e. by applying the strategy from Proposition 3.4. This corresponds to solving the dual problem

$$\begin{aligned} \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

Note here that we do not know yet whether equality holds between our original problem and the corresponding dual problem. In general the solution of the dual problem may not be of great help. In the following, after deriving the solution to the dual problem, we prove a minimax theorem that establishes the desired equality. Results from the literature, e.g. from Quenez [23], do not directly carry over to our setting as we discuss in Remark 3.9 below.

The worst-case parameter

From Corollary 3.5 we have a representation of the optimal expected utility of terminal wealth, depending on the transformed parameters \({\widetilde{r}}\), \({\widetilde{\mu }}\) and \({\widetilde{\sigma }}\). Note that for any \(\gamma \in (-\infty ,1)\), minimizing this expression in \(\mu \) is equivalent to minimizing

$$\begin{aligned} {\widetilde{r}}+\frac{1}{2(1-\gamma )}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr )^\top ({\widetilde{\sigma }}{\widetilde{\sigma }}^\top )^{-1}\bigl ({\widetilde{\mu }}-{\widetilde{r}}\mathbf {1}_{d-1}\bigr ). \end{aligned}$$

We now plug in the representations of \({\widetilde{r}}\), \({\widetilde{\mu }}\) and \({\widetilde{\sigma }}\) from the corollary and obtain

$$\begin{aligned} \begin{aligned}&(1-h)r+he_d^\top \mu -\frac{1}{2}(1-\gamma )\Vert h\sigma ^\top e_d \Vert ^2\\&+\frac{1}{2(1-\gamma )}\bigl (D\mu - h(1-\gamma )D\sigma \sigma ^\top e_d\bigr )^\top (D\sigma \sigma ^\top D^\top )^{-1}\bigl (D\mu - h(1-\gamma )D\sigma \sigma ^\top e_d\bigr ). \end{aligned} \end{aligned}$$

Our aim is to minimize the above expression in \(\mu \). We see that many terms do not depend on \(\mu \). The minimization is therefore equivalent to the minimization of

$$\begin{aligned} \begin{aligned}&\frac{1}{2(1-\gamma )}\mu ^\top D^\top (D\sigma \sigma ^\top D^\top )^{-1}D\mu + h\Bigl ( e_d^\top \mu -(D\sigma \sigma ^\top e_d)^\top (D\sigma \sigma ^\top D^\top )^{-1}D\mu \Bigr ) \\&= \frac{1}{2(1-\gamma )}\mu ^\top A\mu +hc^\top \mu \end{aligned} \end{aligned} \tag{6}$$

on the ellipsoid K, where A and c were introduced in Definition 3.3. To make this minimization problem easier, we apply a transformation to the elements \(\mu \in K\). For that purpose, note that since \(\varGamma \in \mathbb {R}^{d\times d}\) is assumed to be symmetric and positive definite, there exists some nonsingular matrix \(\tau \in \mathbb {R}^{d\times d}\) such that \(\varGamma =\tau \tau ^\top \). The matrix \(\tau \) can be obtained for example by the Cholesky decomposition. Then we can rewrite the constraint \((\mu -\nu )^\top \varGamma ^{-1}(\mu -\nu ) \le \kappa ^2\) as

$$\begin{aligned} \begin{aligned} \kappa ^2\ge (\mu -\nu )^\top (\tau \tau ^\top )^{-1}(\mu -\nu )&=(\mu -\nu )^\top (\tau ^\top )^{-1}\tau ^{-1}(\mu -\nu )\\&=\bigl (\tau ^{-1}(\mu -\nu )\bigr )^\top \bigl (\tau ^{-1}(\mu -\nu )\bigr ). \end{aligned} \end{aligned}$$

Hence, for an arbitrary \(\mu \in K\) we define \(\rho :=\tau ^{-1}(\mu -\nu )\) so that \(\mu =\nu +\tau \rho \) and \(\Vert \rho \Vert \le \kappa \). We can then rewrite (6) as

$$\begin{aligned} \begin{aligned} \frac{1}{2(1-\gamma )}&\mu ^\top A\mu +hc^\top \mu \\&= \frac{1}{2(1-\gamma )}\bigl ((\tau \rho )^\top A\tau \rho +2\nu ^\top A\tau \rho +\nu ^\top A\nu \bigr ) +hc^\top \tau \rho +hc^\top \nu \\&= \frac{1}{2(1-\gamma )}\rho ^\top \tau ^\top A\tau \rho +\Bigl (\frac{1}{1-\gamma }A\nu +hc\Bigr )^\top \tau \rho +\frac{1}{2(1-\gamma )}\nu ^\top A\nu +hc^\top \nu . \end{aligned} \end{aligned}$$

Minimizing (6) in \(\mu \in K\) is therefore equivalent to minimizing the function \(g:B_\kappa (0)\rightarrow \mathbb {R}\) with

$$\begin{aligned} g(\rho )=\frac{1}{2(1-\gamma )}\rho ^\top \tau ^\top A\tau \rho +\Bigl (hc+\frac{1}{1-\gamma }A\nu \Bigr )^\top \tau \rho \end{aligned}$$

in \(\rho \) and then setting \(\mu =\nu +\tau \rho \). The behavior of g is determined to a large extent by the matrix A from Definition 3.3. So we analyze properties of A next.
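The change of variables can be verified numerically. The sketch below uses a Cholesky factor for \(\tau \) and checks, on hypothetical data, that \(\Vert \rho \Vert ^2=(\mu -\nu )^\top \varGamma ^{-1}(\mu -\nu )\) and that the quadratic objective in \(\mu \) equals \(g(\rho )\) plus the constant \(\frac{1}{2(1-\gamma )}\nu ^\top A\nu +hc^\top \nu \). The matrix `Gamma` and all other numbers are assumptions made for illustration.

```python
import numpy as np

# Hypothetical data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
Gamma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.0625]])     # symmetric, positive definite
nu = np.array([0.05, 0.07, 0.06])
mu = np.array([0.02, 0.09, 0.04])
gamma, h = -1.0, 1.0
d = len(nu)

# A and c as in Definition 3.3.
D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
Sigma = sigma @ sigma.T
A = D.T @ np.linalg.solve(D @ Sigma @ D.T, D)
c = (np.eye(d) - A @ Sigma)[:, -1]

# Gamma = tau tau^T with tau lower triangular; rho = tau^{-1} (mu - nu).
tau = np.linalg.cholesky(Gamma)
rho = np.linalg.solve(tau, mu - nu)
quad_form = (mu - nu) @ np.linalg.solve(Gamma, mu - nu)

# Objective in mu versus g(rho) plus the mu-independent constant.
objective_mu = mu @ A @ mu / (2 * (1 - gamma)) + h * c @ mu
g_rho = (rho @ tau.T @ A @ tau @ rho / (2 * (1 - gamma))
         + (h * c + A @ nu / (1 - gamma)) @ tau @ rho)
constant = nu @ A @ nu / (2 * (1 - gamma)) + h * c @ nu

print(np.isclose(rho @ rho, quad_form), np.isclose(objective_mu, g_rho + constant))
```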

Lemma 3.6

The matrix A is symmetric and positive semidefinite and \(\mathrm {ker}(A)=\mathrm {span}(\{\mathbf {1}_d\})\).

We immediately deduce that also \(\tau ^\top A\tau \in \mathbb {R}^{d\times d}\) is symmetric and positive semidefinite with \(\mathrm {ker}(\tau ^\top A\tau )=\mathrm {span}(\{\tau ^{-1}\mathbf {1}_d\})\). Having collected these properties of the matrix A and of \(\tau ^\top A\tau \) enables us to find the parameter \(\rho \) that minimizes \(g(\rho )\) on the set \(B_\kappa (0)\).
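These spectral properties are easy to inspect numerically. The following sketch, on hypothetical data, computes the eigenvalues of \(\tau ^\top A\tau \) with a symmetric eigensolver and checks that the smallest one vanishes up to rounding, that the second one is strictly positive, and that the corresponding eigenvector is parallel to \(\tau ^{-1}\mathbf {1}_d\). The tolerances are ad hoc choices of this sketch.

```python
import numpy as np

# Hypothetical data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
Gamma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.0625]])
d = sigma.shape[0]

D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
A = D.T @ np.linalg.solve(D @ sigma @ sigma.T @ D.T, D)
tau = np.linalg.cholesky(Gamma)

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
lam, V = np.linalg.eigh(tau.T @ A @ tau)

# Normalized kernel direction tau^{-1} 1_d, i.e. v_1 up to sign.
kernel_dir = np.linalg.solve(tau, np.ones(d))
v1 = kernel_dir / np.linalg.norm(kernel_dir)

print(lam, abs(V[:, 0] @ v1))
```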

Lemma 3.7

Let \(0=\lambda _1<\lambda _2\le \cdots \le \lambda _d\) denote the eigenvalues of \(\tau ^\top A\tau \), and let further \(v_1=\frac{1}{\Vert \tau ^{-1}\mathbf {1}_d\Vert }\tau ^{-1}\mathbf {1}_d, v_2,\dots ,v_d\in \mathbb {R}^d\) denote the respective orthogonal eigenvectors with \(\Vert v_i\Vert =1\) for all \(i=1,\dots , d\). Then the minimum of the function \(g:B_\kappa (0)\rightarrow \mathbb {R}\) with

$$\begin{aligned} g(\rho )=\frac{1}{2(1-\gamma )}\rho ^\top \tau ^\top A\tau \rho +\Bigl (hc+\frac{1}{1-\gamma }A\nu \Bigr )^\top \tau \rho \end{aligned}$$

on the domain \(B_\kappa (0)=\{ \rho \in \mathbb {R}^d \,|\, \Vert \rho \Vert \le \kappa \}\) is attained by the vector

$$\begin{aligned} \rho ^*=-\sum _{i=1}^d \biggl ({\frac{\lambda _i}{1-\gamma }+\frac{h}{\psi (\kappa )\Vert \tau ^{-1}\mathbf {1}_d\Vert }}\biggr )^{-1}\biggl \langle h\tau ^\top c+\frac{\lambda _i}{1-\gamma }\tau ^{-1}\nu , v_i\biggr \rangle v_i, \end{aligned}$$

where \(\psi (\kappa )\in (0,\kappa ]\) is uniquely determined by \(\Vert \rho ^*\Vert =\kappa \).

Note that \(\psi (\kappa )\) in the above lemma is the unique value in \((0,\kappa ]\) that makes \(\rho ^*\) lie on the boundary of \(B_\kappa (0)\). In the representation \(\rho ^*=\sum _{i=1}^d a_iv_i\) we have \(a_1=-\psi (\kappa )\), i.e. \(\psi (\kappa )\) is the negative of the coefficient belonging to \(v_1\). Recall that \(v_1\) is the eigenvector of \(\tau ^\top A\tau \) for the eigenvalue zero, hence it plays an important role in the minimization of the function g above. In Sect. 4 we will study the asymptotic behavior for large uncertainty \(\kappa \). It will turn out that asymptotically \(v_1\) is the dominant component in the representation \(\rho ^*=\sum _{i=1}^d a_iv_i\), a claim that we show by analyzing the asymptotic behavior of \(\psi (\kappa )\). The previous lemma now yields the solution of the dual problem to our original optimization problem.
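One way to evaluate Lemma 3.7 in practice is sketched below. The lemma only characterizes \(\psi (\kappa )\) implicitly; the bisection used here is an assumption of this sketch and relies on the observation that each coefficient of \(\rho ^*\) grows in absolute value with \(\psi \), so \(\Vert \rho ^*\Vert \) is nondecreasing in \(\psi \). All data and function names are hypothetical. The result is compared with random points of the ball as a plausibility check, not as a proof of minimality.

```python
import numpy as np

# Hypothetical data (not from the paper).
sigma = np.array([[0.20, 0.00, 0.00, 0.05],
                  [0.06, 0.25, 0.00, 0.00],
                  [0.03, 0.04, 0.30, 0.02]])
Gamma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.0625]])
nu = np.array([0.05, 0.07, 0.06])
h, gamma, kappa = 1.0, -1.0, 0.5
d = len(nu)

D = np.hstack([np.eye(d - 1), -np.ones((d - 1, 1))])
Sigma = sigma @ sigma.T
A = D.T @ np.linalg.solve(D @ Sigma @ D.T, D)
c = (np.eye(d) - A @ Sigma)[:, -1]
tau = np.linalg.cholesky(Gamma)
M = tau.T @ A @ tau

lam, V = np.linalg.eigh(M)     # ascending eigenvalues, orthonormal columns v_i
lam[0] = 0.0                   # lambda_1 = 0 holds up to rounding
n = np.linalg.norm(np.linalg.solve(tau, np.ones(d)))    # ||tau^{-1} 1_d||
# proj[i] = < h tau^T c + lambda_i / (1 - gamma) tau^{-1} nu, v_i >
proj = h * (V.T @ (tau.T @ c)) + lam / (1 - gamma) * (V.T @ np.linalg.solve(tau, nu))

def rho_of_psi(psi):
    """Candidate minimizer from Lemma 3.7 for a given value of psi."""
    return -V @ (proj / (lam / (1 - gamma) + h / (psi * n)))

def solve_psi(kappa, iterations=200):
    """Bisection on (0, kappa] for the psi with ||rho_of_psi(psi)|| = kappa."""
    lo, hi = 0.0, kappa
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(rho_of_psi(mid)) < kappa:
            lo = mid
        else:
            hi = mid
    return hi

def g(rho):
    lin = tau.T @ (h * c + A @ nu / (1 - gamma))
    return rho @ M @ rho / (2 * (1 - gamma)) + lin @ rho

psi = solve_psi(kappa)
rho_star = rho_of_psi(psi)

# Random points in the ball B_kappa(0) for comparison.
rng = np.random.default_rng(0)
directions = rng.normal(size=(2000, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
samples = kappa * rng.uniform(size=(2000, 1)) ** (1.0 / d) * directions
g_samples = np.array([g(s) for s in samples])

print(psi, np.linalg.norm(rho_star), g(rho_star), g_samples.min())
```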

Theorem 3.8

Let \(0=\lambda _1<\lambda _2\le \cdots \le \lambda _d\) denote the eigenvalues of \(\tau ^\top A\tau \), and let further \(v_1=\frac{1}{\Vert \tau ^{-1}\mathbf {1}_d\Vert }\tau ^{-1}\mathbf {1}_d, v_2,\dots ,v_d\in \mathbb {R}^d\) denote the respective orthogonal eigenvectors with \(\Vert v_i\Vert =1\) for all \(i=1,\dots , d\). Then

$$\begin{aligned} \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] = {{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\bigl [U_\gamma (X^{\pi ^*}_T)\bigr ], \end{aligned}$$

where


$$\begin{aligned} \mu ^*=\nu -\tau \sum _{i=1}^d \biggl ( \frac{\lambda _i}{1-\gamma }+\frac{h}{\psi (\kappa )\Vert \tau ^{-1}\mathbf {1}_d\Vert } \biggr )^{-1}\biggl \langle h\tau ^\top c+\frac{\lambda _i}{1-\gamma }\tau ^{-1}\nu , v_i\biggr \rangle v_i \end{aligned}$$

for \(\psi (\kappa )\in (0,\kappa ]\) that is uniquely determined by \(\Vert \tau ^{-1}(\mu ^*-\nu )\Vert =\kappa \), and where \((\pi ^*_t)_{t\in [0,T]}\) is for all \(t\in [0,T]\) defined by

$$\begin{aligned} \pi ^*_t = \frac{1}{1-\gamma }A\mu ^* +hc. \end{aligned}$$

Remark 3.9

The preceding theorem solves the problem

$$\begin{aligned} \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

This is the corresponding dual problem to our original optimization problem

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ], \end{aligned}$$

but in general the values of these two problems do not coincide. There are, of course, special cases in which the supremum and the infimum do interchange. Such results are called minimax theorems in the literature. In a portfolio optimization setting that is similar to ours, a minimax theorem has been shown in Quenez [23]. There, the author applies classical techniques from Kramkov and Schachermayer [11, 12] for incomplete markets and embeds them into a multiple-priors framework. However, there are two main points that distinguish our setting from the one in Quenez [23]. Firstly, the results in that paper are only shown for non-negative utility functions and are therefore not directly applicable to power utility \(U_\gamma \) with a negative \(\gamma \). Secondly, the constraint \(\langle \pi _t,\mathbf {1}_d\rangle =h\) that we put on the admissible trading strategies alters the structure of attainable terminal wealths so that it would be necessary to adjust the proofs and check several technical assumptions.

In addition, note that a minimax theorem does not yet provide the form of the optimal strategy (or the worst-case drift). To obtain an explicit representation of these, we would still need to go through the calculations done in this section. In the following, we will use the explicit representation of the optimal strategy for (7) to show that it indeed also solves (8) and that in this case, the supremum and the infimum can be interchanged.

A minimax theorem

The following representation of \(\pi ^*\) is useful for proving our minimax theorem.

Lemma 3.10

The strategy \(\pi ^*\) from Theorem 3.8 satisfies

$$\begin{aligned} \pi ^*_t = -\frac{h}{\psi (\kappa )\Vert \tau ^{-1}\mathbf {1}_d\Vert }\varGamma ^{-1}(\mu ^*-\nu ) \end{aligned}$$

for all \(t\in [0,T]\).

The preceding lemma characterizes the strategy \(\pi ^*\), which is the best strategy an investor can choose when the drift of stocks is \(\mu ^*\). In the following we show that, vice versa, \(\mu ^*\) is also the parameter the market has to choose to minimize the investor’s expected utility of terminal wealth, given that the investor applies strategy \(\pi ^*\). It then follows that the point \((\pi ^*,\mu ^*)\) is a saddle point of our problem, i.e. it holds

$$\begin{aligned} {{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\bigl [U_\gamma (X^{\pi }_T)\bigr ] \le {{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\bigl [U_\gamma (X^{\pi ^*}_T)\bigr ] \le {{\,\mathrm{\mathbb {E}}\,}}_{\mu }\bigl [U_\gamma (X^{\pi ^*}_T)\bigr ] \end{aligned}$$

for all \(\mu \in K\) and \(\pi \in \mathcal {A}_h(x_0)\). This property is essential for proving our minimax theorem. Note that the inequality

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] \le \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] \end{aligned}$$

always holds when interchanging supremum and infimum, see Ekeland and Temam [7, Ch. VI, Prop. 1.1], for example. For the reverse inequality the saddle point property is needed.
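
As a finite toy illustration of this inequality (not of the continuous problem treated here), one can tabulate a payoff in a small matrix whose rows stand for the investor's choices and whose columns stand for the market's choices. Both matrices below are made-up numbers. The sketch only shows that max-min never exceeds min-max and that the two values agree when a saddle point exists.

```python
import numpy as np

def max_min(payoff):
    """Investor commits first: best row, judged by its worst column."""
    return payoff.min(axis=1).max()

def min_max(payoff):
    """Market commits first: worst column, judged by its best row."""
    return payoff.max(axis=0).min()

# Hypothetical payoff with a saddle point at row 0, column 1:
# the entry 1.0 is the minimum of its row and the maximum of its column.
with_saddle = np.array([[3.0, 1.0, 4.0],
                        [2.0, 0.0, 5.0]])

# Hypothetical payoff without a saddle point (matching-pennies type).
without_saddle = np.array([[1.0, -1.0],
                           [-1.0, 1.0]])

for name, payoff in [("with saddle", with_saddle),
                     ("without saddle", without_saddle)]:
    print(name, "| max-min:", max_min(payoff), "| min-max:", min_max(payoff))
```

In the second matrix the gap between the two values is strict, which is why the saddle point property is needed for the reverse inequality.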

Theorem 3.11

Let \(K=\{ \mu \in \mathbb {R}^d \,|\, (\mu -\nu )^\top \varGamma ^{-1}(\mu -\nu ) \le \kappa ^2 \}\). Then the parameter \(\mu \) that attains the minimum in

$$\begin{aligned} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^{\pi ^*}_T)\bigr ] \end{aligned}$$

is \(\mu ^*\), where both \(\mu ^*\) and \(\pi ^*\) are defined as in Theorem 3.8. In particular, it follows that

$$\begin{aligned} \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] = {{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\bigl [U_\gamma (X^{\pi ^*}_T)\bigr ] = \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

The previous theorem establishes duality between our original robust utility maximization problem and the dual problem where supremum and infimum are interchanged. Additionally, we now also know the solution to our original problem. The optimal strategy for our constrained robust utility maximization problem is given in a nearly explicit way. Note that the parameter \(\mu ^*\) in Theorem 3.8 is not given explicitly since the parameter \(\psi (\kappa )\) is defined implicitly. However, \(\psi (\kappa )\) can be found by a straightforward numerical root search for a monotone function. For this reason, determining \(\mu ^*\) and \(\pi ^*\) numerically does not pose any problems.
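
The root search mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Definition 3.3 is not repeated in this section, so the closed forms used for \(A\) and \(c\) are an assumed choice that satisfies the identities of Lemma 4.7; \(\varGamma =\tau \tau ^\top \) is inferred from the two descriptions of K; and all market parameters are hypothetical. The function name `robust_solution` is illustrative.

```python
import numpy as np

def constraint_terms(sigma):
    """A and c in an assumed closed form that satisfies Lemma 4.7."""
    Sigma_inv = np.linalg.inv(sigma @ sigma.T)
    one = np.ones(sigma.shape[0])
    w = Sigma_inv @ one
    return Sigma_inv - np.outer(w, w) / (one @ w), w / (one @ w)

def robust_solution(sigma, tau, nu, h, gamma, kappa, iters=200):
    """Return (psi, mu_star, pi_star) from the formulas of Theorem 3.8."""
    A, c = constraint_terms(sigma)
    u_norm = np.linalg.norm(np.linalg.solve(tau, np.ones(len(nu))))  # ||tau^{-1} 1_d||
    lam, V = np.linalg.eigh(tau.T @ A @ tau)   # eigenvalues in ascending order
    lam = np.clip(lam, 0.0, None)              # remove round-off below zero
    # b_i = <h tau^T c + lam_i / (1 - gamma) * tau^{-1} nu, v_i>
    b = V.T @ (h * tau.T @ c) + lam / (1 - gamma) * (V.T @ np.linalg.solve(tau, nu))

    def rho(psi):
        eta = h / (psi * u_norm)
        return -V @ (b / (lam / (1 - gamma) + eta))

    # ||rho(psi)|| is increasing in psi and is at least kappa at psi = kappa,
    # so plain bisection on (0, kappa] finds the root of ||rho(psi)|| = kappa.
    lo, hi = 0.0, kappa
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(rho(mid)) < kappa:
            lo = mid
        else:
            hi = mid
    psi = hi
    mu_star = nu + tau @ rho(psi)
    pi_star = A @ mu_star / (1 - gamma) + h * c
    return psi, mu_star, pi_star

# Hypothetical three-asset market and uncertainty ellipsoid.
sigma = np.array([[0.20, 0.00, 0.00],
                  [0.05, 0.25, 0.00],
                  [0.02, 0.04, 0.30]])
tau = np.diag([1.0, 0.8, 1.2])
nu = np.array([0.05, 0.08, 0.03])
h, gamma, kappa = 1.0, -1.0, 0.3

psi, mu_star, pi_star = robust_solution(sigma, tau, nu, h, gamma, kappa)

# Cross-check against the representation of pi* in Lemma 3.10.
u_norm = np.linalg.norm(np.linalg.solve(tau, np.ones(3)))
lemma_310 = -h / (psi * u_norm) * np.linalg.solve(tau @ tau.T, mu_star - nu)

print("psi(kappa):", psi)
print("mu*:", mu_star)
print("pi*:", pi_star, "sum:", pi_star.sum())
print("matches Lemma 3.10:", np.allclose(pi_star, lemma_310))
```

Bisection is used here only because it needs nothing beyond monotonicity; any bracketing root finder would serve equally well.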

Remark 3.12

One can think of other reasonable sets K for modelling uncertainty about the drift parameter \(\mu \). Our duality approach can also be applied to the optimization problem with

$$\begin{aligned} K=\bigl \{\mu \in \mathbb {R}^d \,\big |\, \mathbf {1}_d^\top \mu =b\bigr \} \end{aligned}$$

for some \(b\in \mathbb {R}\). The motivation for this uncertainty set is that one has an estimate for the performance of a stock index, and therefore the overall average performance of the stocks, but not for the single stocks themselves. In that case, one can show that the optimal strategy for the optimization problem

$$\begin{aligned} \inf _{\mu \in K} \sup _{\pi \in \mathcal {A}_h(x_0)} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] \end{aligned}$$

is \((\pi ^*_t)_{t\in [0,T]}\) with \(\pi ^*_t = \frac{h}{d}\mathbf {1}_d\) for all \(t\in [0,T]\). The worst-case parameter \(\mu ^*\) can be determined explicitly given the eigenvalues and eigenvectors of the matrix A. Further, one can show a minimax theorem in analogy to Theorem 3.11. The optimal strategy is here just a uniform diversification strategy given the constraint on the bond investment. In the next section we show how this fits into the framework of our results for ellipsoidal uncertainty sets when we let the degree of uncertainty \(\kappa \) go to infinity.

Asymptotic behavior as uncertainty increases

In this section we consider again the setting with ellipsoidal uncertainty sets as in (4) and investigate what happens as the degree of uncertainty changes. Since K is an ellipsoid, we increase the degree of uncertainty about the true drift parameter by increasing the radius \(\kappa \); a lower value of \(\kappa \) corresponds to more precise knowledge of the true drift.

Limit of worst-case parameter and optimal strategy

In the following, we address in detail the asymptotic behavior of the worst-case parameter and the optimal strategy as uncertainty increases, i.e. as \(\kappa \) goes to infinity. To underline the dependence on the degree of uncertainty, we write \(\mu ^*=\mu ^*(\kappa )\) and \(\pi ^*=\pi ^*(\kappa )\) in the following.

Remark 4.1

The other asymptotic regime \(\kappa \rightarrow 0\) corresponds to a more and more precise knowledge of the true drift. It is easy to see that

$$\begin{aligned} \lim _{\kappa \rightarrow 0}\mu ^*(\kappa )=\nu \quad \text {and}\quad \lim _{\kappa \rightarrow 0}\pi _t^*(\kappa )=\frac{1}{1-\gamma }A\nu +hc \end{aligned}$$

for all \(t\in [0,T]\). This means that the worst-case parameter converges to the reference drift \(\nu \) and the optimal strategy to the best constrained strategy, given that the drift equals \(\nu \). So we retrieve in the limit \(\kappa \rightarrow 0\) the setting without model uncertainty.

We now focus on \(\kappa \rightarrow \infty \). Note that the only quantity in the representation of \(\mu ^*\) from Theorem 3.8 that depends on \(\kappa \) is \(\psi (\kappa )\).

Lemma 4.2

It holds \(\lim _{\kappa \rightarrow \infty } \frac{\psi (\kappa )}{\kappa }=1\).

From this lemma we gain insights into the asymptotic behavior of \(\mu ^*\).

Proposition 4.3

It holds

$$\begin{aligned} \lim _{\kappa \rightarrow \infty } \frac{1}{\kappa }\tau ^{-1}\bigl (\mu ^*(\kappa )-\nu \bigr )=-v_1=-\frac{1}{\Vert \tau ^{-1}\mathbf {1}_d\Vert }\tau ^{-1}\mathbf {1}_d\end{aligned}$$

and


$$\begin{aligned} \lim _{\kappa \rightarrow \infty } \frac{1}{\kappa }\mu ^*(\kappa ) = -\tau v_1 = -\frac{1}{\Vert \tau ^{-1}\mathbf {1}_d\Vert }\mathbf {1}_d. \end{aligned}$$

Hence, asymptotically the direction of the worst-case parameter is \(-\mathbf {1}_d\). This means that, as \(\kappa \) tends to infinity, the worst drift that the market can choose for an investor who applies the optimal strategy \(\pi ^*\) is a drift vector whose entries are all equal and negative. We have the following result for the asymptotic behavior of the investor’s optimal strategy.

Theorem 4.4

For any \(t\in [0,T]\) it holds

$$\begin{aligned} \lim _{\kappa \rightarrow \infty } \pi ^*_t(\kappa )=\frac{h}{\mathbf {1}_d^\top \varGamma ^{-1}\mathbf {1}_d}\varGamma ^{-1}\mathbf {1}_d. \end{aligned}$$

The theorem shows that the optimal strategy \(\pi ^*(\kappa )\) converges as the degree of uncertainty \(\kappa \) goes to infinity. If \(\varGamma =\sigma \sigma ^\top \), then the limit is a multiple of the minimum variance portfolio. Another interesting special case is \(\varGamma =I_d\), i.e. when K is simply a ball with radius \(\kappa \). In that case we have

$$\begin{aligned} \lim _{\kappa \rightarrow \infty } \pi ^*_t(\kappa )=\frac{h}{d}\mathbf {1}_d\end{aligned}$$

for any \(t\in [0,T]\), hence the optimal strategy converges to a uniform diversification strategy, given by \(\frac{h}{d}\mathbf {1}_d\) at each point in time. In other words, investors who are forced to invest a total fraction of \(h>0\) in the risky assets will, in the limit as \(\kappa \) goes to infinity, diversify their portfolio uniformly. For general \(\varGamma \) we shall speak of a generalized uniform diversification strategy.
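
The limit in Theorem 4.4 only involves \(\varGamma \) and \(h\), so it is easy to evaluate. The following small sketch computes it for the ball \(\varGamma =I_d\) and for a hypothetical diagonal \(\varGamma \) chosen purely for illustration. The function name `limit_strategy` is not from the paper.

```python
import numpy as np

def limit_strategy(Gamma, h):
    """h * Gamma^{-1} 1_d / (1_d^T Gamma^{-1} 1_d), the limit in Theorem 4.4."""
    one = np.ones(Gamma.shape[0])
    w = np.linalg.solve(Gamma, one)
    return h * w / (one @ w)

h = 1.0

# Gamma = I_d: the uncertainty set is a ball and the limit is uniform.
ball = limit_strategy(np.eye(4), h)

# Hypothetical ellipsoid that is wider in the direction of asset 0.
tilted = limit_strategy(np.diag([4.0, 1.0, 1.0, 1.0]), h)

print("ball:  ", ball)
print("tilted:", tilted)
```

In the second case the weights still sum to \(h\), but the asset with the larger drift uncertainty receives a smaller share, which is the sense in which the limit is a generalized uniform diversification.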

This asymptotic behavior of the optimal strategy is striking because the limit is independent of the volatility matrix \(\sigma \). In combination with the structure of the function g in Lemma 3.7 this indicates that it might also be possible to allow for misspecified volatility. For a high level of uncertainty the optimal strategy is dominated by the matrix \(\varGamma \) shaping the uncertainty ellipsoid whereas both the volatility structure of the assets and the reference drift \(\nu \) become negligible. This effect is caused by the investor’s reaction to the worst-case drift parameter \(\mu ^*\) which, as shown in Proposition 4.3, behaves asymptotically like a multiple of \(\mathbf {1}_d\). The best reaction from the investor’s point of view is to diversify among all assets, weighted by the uncertainty structure \(\varGamma \). In the special case where K is a ball, this leads to a uniform diversification strategy. This result is in line with Pflug et al. [21] who show convergence of the optimal strategy to the uniform diversification strategy in a risk minimization setting with increasing model uncertainty.

Remark 4.5

Note that plugging \(\kappa =\infty \) into the definition of the ellipsoid yields the uncertainty set \(K=\mathbb {R}^d\), so that in fact every drift parameter \(\mu \in \mathbb {R}^d\) is deemed possible by the investor. Then one easily obtains the worst-case utility

$$\begin{aligned} \inf _{\mu \in \mathbb {R}^d} {{\,\mathrm{\mathbb {E}}\,}}_{\mu }[U_\gamma (X^\pi _T)]= {\left\{ \begin{array}{ll} 0, &{} \gamma \in (0,1),\\ -\infty , &{} \gamma \in (-\infty ,0], \end{array}\right. } \end{aligned}$$

for any admissible \(\pi \). Hence, every strategy performs equally badly in the limit case. In particular, plugging \(\kappa =\infty \) into the ellipsoid in the first place does not provide us with the optimal limit strategy of Theorem 4.4.

The intuition is that, as long as the uncertainty set is bounded, there exists a worst-case drift to which the investor can react in an optimal way. Nevertheless, as uncertainty goes to infinity, the expected utility achieved by the best strategy is also driven to \(-\infty \) if \(\gamma \in (-\infty ,0]\), and to zero if \(\gamma \in (0,1)\).

Relaxing the investment constraint

We use the above results to show that, as uncertainty \(\kappa \) goes to infinity, our robust optimization problem yields the same optimal value as a slightly different optimization problem with a more general class of admissible strategies. Recall that we have so far considered for \(h>0\) the set

$$\begin{aligned} \mathcal {A}_h(x_0)=\bigl \{ \pi \in \mathcal {A}(x_0) \,\big |\, \langle \pi _t,\mathbf {1}_d\rangle = h \text { for all } t\in [0,T] \bigr \} \end{aligned}$$

as the class of admissible strategies. Requiring \(\langle \pi _t,\mathbf {1}_d\rangle \ge h\) instead of \(\langle \pi _t,\mathbf {1}_d\rangle = h\) obviously enlarges this set. In the following, we show for logarithmic utility that maximizing worst-case expected utility among bounded strategies in this larger set asymptotically leads to the same value as our original problem. We write \(K=K(\kappa )\) for the uncertainty ellipsoid with radius \(\kappa \).

Proposition 4.6

Define for \(h>0\) the admissibility set

$$\begin{aligned} \mathcal {A}'_h(x_0)=\bigl \{ \pi \in \mathcal {A}(x_0) \,\big |\, \langle \pi _t,\mathbf {1}_d\rangle \ge h \text { for all } t\in [0,T] \bigr \} \end{aligned}$$

and let \(M>0\). Then there exists a \(\kappa _M>0\) such that for all \(\kappa \ge \kappa _M\) it holds

$$\begin{aligned} \sup _{\begin{array}{c} \pi \in \mathcal {A}'_h(x_0)\\ \Vert \pi \Vert \le M \end{array}} \inf _{\begin{array}{c} \mu \in K(\kappa )\\ \end{array}} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [\log (X^\pi _T)\bigr ] \le \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K(\kappa )} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [\log (X^\pi _T)\bigr ]. \end{aligned}$$

Here we use \(\Vert \pi \Vert \le M\) as a short notation for \(\Vert \pi _t\Vert \le M\) for all \(t\in [0,T]\).

For power utility, the result is slightly weaker. We first give a lemma that states some useful equalities concerning the matrix A and vector c from Definition 3.3.

Lemma 4.7

For the matrix A and the vector c we have

$$\begin{aligned} A\sigma \sigma ^\top A=A, \quad c^\top \sigma \sigma ^\top A=0 \quad \text {and}\quad c^\top \mathbf {1}_d=1. \end{aligned}$$
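
These identities are easy to confirm numerically once \(A\) and \(c\) are available. Definition 3.3 is not repeated in this section, so the sketch below assumes the closed forms \(c=\varSigma ^{-1}\mathbf {1}_d/(\mathbf {1}_d^\top \varSigma ^{-1}\mathbf {1}_d)\) and \(A=\varSigma ^{-1}-\varSigma ^{-1}\mathbf {1}_d\mathbf {1}_d^\top \varSigma ^{-1}/(\mathbf {1}_d^\top \varSigma ^{-1}\mathbf {1}_d)\) with \(\varSigma =\sigma \sigma ^\top \). This is an assumption made for illustration, and the volatility matrix is hypothetical.

```python
import numpy as np

d = 5
# Hypothetical invertible (lower triangular) volatility matrix.
sigma = 0.2 * np.eye(d) + 0.03 * np.tril(np.ones((d, d)))
Sigma = sigma @ sigma.T
one = np.ones(d)

# Assumed closed forms for c and A (see the lead-in).
w = np.linalg.solve(Sigma, one)
c = w / (one @ w)
A = np.linalg.inv(Sigma) - np.outer(w, w) / (one @ w)

first = np.allclose(A @ Sigma @ A, A)       # A sigma sigma^T A = A
second = np.allclose(c @ Sigma @ A, 0.0)    # c^T sigma sigma^T A = 0
third = np.isclose(c @ one, 1.0)            # c^T 1_d = 1
kernel = np.allclose(A @ one, 0.0)          # 1_d spans the kernel direction of A

print(first, second, third, kernel)
```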

The next proposition gives a result similar to Proposition 4.6 for power utility. We define a different enlarged admissibility set \({\overline{\mathcal {A}}}_h(x_0)\) in this case. The reason is that, in contrast to the logarithmic utility case, we cannot ensure that we can restrict to deterministic strategies in \(\mathcal {A}'_h(x_0)\).

Proposition 4.8

Let \(\gamma \ne 0\) and \(h>0\) and define the admissibility set

$$\begin{aligned} {\overline{\mathcal {A}}}_h(x_0)=\bigcup _{h'\ge h} \mathcal {A}_{h'}(x_0). \end{aligned}$$

Then there exists a \(\kappa '>0\) such that for all \(\kappa \ge \kappa '\) it holds

$$\begin{aligned} \sup _{\pi \in {\overline{\mathcal {A}}}_h(x_0)} \inf _{\mu \in K(\kappa )} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ] = \sup _{\pi \in \mathcal {A}_h(x_0)} \inf _{\mu \in K(\kappa )} {{\,\mathrm{\mathbb {E}}\,}}_\mu \bigl [U_\gamma (X^\pi _T)\bigr ]. \end{aligned}$$

The previous propositions show that as uncertainty increases it is reasonable for investors to choose strategies \(\pi \) with \(\langle \pi _t,\mathbf {1}_d\rangle \) as small as possible. Even if the class of admissible strategies is enlarged, the optimal value will for large uncertainty be attained by a strategy from \(\mathcal {A}_h(x_0)\). This is in line with the intuition from Proposition 2.1, where we have seen that as uncertainty exceeds a certain threshold, investors prefer not to invest anything in the risky assets.

Risk aversion and speed of convergence

As the class of admissible strategies we now take again

$$\begin{aligned} \mathcal {A}_h(x_0)=\bigl \{ \pi \in \mathcal {A}(x_0) \,\big |\, \langle \pi _t,\mathbf {1}_d\rangle = h \text { for all } t\in [0,T] \bigr \} \end{aligned}$$

for some \(h>0\). We have seen in Sect. 4.1 that the optimal strategy \(\pi ^*(\kappa )\) for our robust optimization problem with ellipsoidal uncertainty sets K converges as the level of uncertainty \(\kappa \) goes to infinity. If the uncertainty set K is a ball, then the limit is a uniform diversification strategy \(\frac{h}{d}\mathbf {1}_d\). In the following, we illustrate this convergence by an example and investigate what influence the risk aversion parameter \(\gamma \) has on the speed of convergence. Note that for our class of utility functions, the value \(1-\gamma \) is equal to the Arrow–Pratt measure of relative risk aversion. The smaller \(\gamma \) is, the more risk-averse the investor is.

Example 4.9

We consider a market with \(d=8\) risky assets. The volatility matrix has the form

[figure a: volatility matrix \(\sigma \)]

Investors use strategies from \(\mathcal {A}_h(x_0)\) with \(h=1\). Further, we take \(\varGamma =I_d\) and \(\nu =\frac{3}{10}\mathbf {1}_d\) as parameters of the uncertainty ellipsoid. Note that for this choice of the parameter \(\nu \) the optimal strategy in the situation without model uncertainty, i.e. with \(\kappa =0\), does not depend on \(\gamma \): since \(A\mathbf {1}_d=0\), we have \(A\nu =0\), and the strategy \(\frac{1}{1-\gamma }A\nu +hc\) reduces to \(hc\). We then compute the constant optimal portfolio composition \(\pi ^*(\kappa )\) based on different values of \(\gamma \) and for all \(\kappa \in (0,0.5)\), and plot the result in Fig. 1 against \(\kappa \). For any fixed level of uncertainty \(\kappa \), the optimal composition \(\pi ^*(\kappa )\) is plotted as a stacked plot where every color corresponds to one stock.

For small values of \(\kappa \), the optimal strategy \(\pi ^*\) is negative in some components, so that the positive positions add up to more than one. As \(\kappa \) becomes larger, the composition gets closer and closer to the uniform diversification vector. When comparing the different subplots one sees that the convergence is faster for higher values of \(\gamma \), an effect that has been shown to hold in general, see Westphal [28, Rem. 5.9]. This might be surprising at first glance since one expects a more risk-averse investor to choose a “safer” strategy sooner than a less risk-averse investor does. However, the effect becomes more intuitive when keeping in mind that we address a robust optimization problem where an investor is confronted with the worst possible drift parameter in the uncertainty set. An investor with a high, positive value of \(\gamma \) would, in the non-robust problem, invest in the assets with the allegedly highest drift. In the worst-case market this undiversified strategy would allow the market to choose a very extreme drift parameter with high absolute values for exactly these assets. This implies that a less risk-averse investor is much more exposed to the market’s choice of a drift parameter. To make up for this, the investor diversifies more, an effect that can even be amplified by the constraint with \(h=1\). The optimal robust strategy therefore converges very fast, so that even for small values of uncertainty \(\kappa \) the investor is already driven into the uniform diversification strategy.
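
A sweep over \(\kappa \) of the kind shown in Fig. 1 can be sketched as follows. The volatility matrix below is hypothetical and much smaller than the one in Example 4.9, so the numbers are not those of the figure. The closed forms for \(A\) and \(c\) are again an assumption consistent with Lemma 4.7. Instead of the eigen-expansion, this sketch solves the stationarity condition of g on the sphere \(\Vert \rho \Vert =\kappa \) with a linear solve and bisection on the multiplier \(\eta =h/(\psi (\kappa )\Vert \tau ^{-1}\mathbf {1}_d\Vert )\).

```python
import numpy as np

def robust_strategy(Sigma, tau, nu, h, gamma, kappa, iters=200):
    """pi*(kappa), obtained by bisection on the multiplier eta."""
    d = len(nu)
    one = np.ones(d)
    w = np.linalg.solve(Sigma, one)
    c = w / (one @ w)                                        # assumed form of c
    A = np.linalg.inv(Sigma) - np.outer(w, w) / (one @ w)    # assumed form of A
    M = tau.T @ A @ tau / (1 - gamma)
    q = tau.T @ (h * c + A @ nu / (1 - gamma))

    def rho(eta):
        return -np.linalg.solve(M + eta * np.eye(d), q)

    # psi <= kappa gives the lower bracket, ||rho(eta)|| <= ||q|| / eta the upper.
    lo = h / (kappa * np.linalg.norm(np.linalg.solve(tau, one)))
    hi = np.linalg.norm(q) / kappa
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(rho(mid)) > kappa:   # norm decreases in eta
            lo = mid
        else:
            hi = mid
    mu_star = nu + tau @ rho(hi)
    return A @ mu_star / (1 - gamma) + h * c

# Hypothetical four-asset market; Gamma = I_d and nu = 0.3 * 1_d as in the example.
sigma = np.array([[0.20, 0.00, 0.00, 0.00],
                  [0.05, 0.25, 0.00, 0.00],
                  [0.05, 0.05, 0.30, 0.00],
                  [0.05, 0.05, 0.05, 0.35]])
Sigma, tau, nu, h = sigma @ sigma.T, np.eye(4), 0.3 * np.ones(4), 1.0

def distance_to_uniform(gamma, kappa):
    pi = robust_strategy(Sigma, tau, nu, h, gamma, kappa)
    return np.linalg.norm(pi - h / 4 * np.ones(4))

for gamma in (-2.0, 0.5):
    row = [distance_to_uniform(gamma, k) for k in (0.01, 0.1, 1.0, 1000.0)]
    print("gamma =", gamma, "distances:", row)
```

Comparing the rows for the two values of \(\gamma \) gives a rough impression of the speed of convergence in this made-up market; it is not a substitute for the general result cited above.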

Fig. 1 Optimal portfolio composition \(\pi ^*\) plotted against \(\kappa \) for different values of \(\gamma \). The model parameters are given in Example 4.9. For any \(\gamma \), we observe convergence to a uniform diversification strategy. For larger values of \(\gamma \), convergence appears to take place faster than for smaller values of \(\gamma \)

Measures of robustness performance

We have seen that introducing uncertainty in our utility maximization problem leads to more diversified strategies. The question arises what an investor gains from using robust strategies and what downside comes with behaving in a robust way in situations where it is not necessary. These two antithetic effects can be rated by the measures cost of ambiguity and reward for distributional robustness that have been studied in a different context in Analui [1, Sec. 3.4].

For our robust maximization problem, the center \(\nu \) of the uncertainty ellipsoid can be seen as an estimate of the true drift of the stocks. If an investor were sure that the estimate was correct, she would simply maximize \({{\,\mathrm{\mathbb {E}}\,}}_\nu [U_\gamma (X^\pi _T)]\). From Proposition 3.4 we know that the optimal strategy is then of the form \(({\hat{\pi }}_t)_{t\in [0,T]}\) with

$$\begin{aligned} {\hat{\pi }}_t = \frac{1}{1-\gamma }A\nu +hc \end{aligned}$$

for all \(t\in [0,T]\). In the presence of uncertainty, the solution to our utility maximization problem is the strategy \((\pi ^*_t)_{t\in [0,T]}\) with

$$\begin{aligned} \pi ^*_t = \frac{1}{1-\gamma }A\mu ^*+hc \end{aligned}$$

for all \(t\in [0,T]\), see Theorem 3.11. We now define measures for the robustness performance that consider the difference in the corresponding certainty equivalents when using \({\hat{\pi }}\) or \(\pi ^*\).

Definition 4.10

We define the cost of ambiguity as

$$\begin{aligned} {{\,\mathrm{COA}\,}}= U_\gamma ^{-1}\big ({{\,\mathrm{\mathbb {E}}\,}}_\nu \big [U_\gamma \big (X^{{\hat{\pi }}}_T\big )\big ]\big ) -U_\gamma ^{-1}\big ({{\,\mathrm{\mathbb {E}}\,}}_\nu \big [U_\gamma \big (X^{\pi ^*}_T\big )\big ]\big ) \end{aligned}$$

and the reward for distributional robustness as

$$\begin{aligned} {{\,\mathrm{RDR}\,}}= U_\gamma ^{-1}\big ({{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\big [U_\gamma \big (X^{\pi ^*}_T\big )\big ]\big ) -U_\gamma ^{-1}\big ({{\,\mathrm{\mathbb {E}}\,}}_{\mu ^*}\big [U_\gamma \big (X^{{\hat{\pi }}}_T\big )\big ]\big ). \end{aligned}$$

The cost of ambiguity captures how large the loss in the certainty equivalent is when using the robust strategy \(\pi ^*\), given that the estimate \(\nu \) of the drift was actually correct. Note that \({\hat{\pi }}\) is the best strategy given drift \(\nu \) and that \(U_\gamma ^{-1}\) is a strictly increasing function, hence \({{\,\mathrm{COA}\,}}\) is non-negative. The reward for distributional robustness reflects how much an investor is rewarded when using the robust strategy \(\pi ^*\) compared to the “naive” strategy \({\hat{\pi }}\), assuming that indeed the worst possible drift parameter \(\mu ^*\) is the true one. We see that \({{\,\mathrm{RDR}\,}}\) is also non-negative since \(\pi ^*\) maximizes expected utility given \(\mu ^*\).
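
For constant strategies both measures can be evaluated in closed form. The sketch below rests on several assumptions that go beyond this section: a zero interest rate and wealth dynamics \(\mathrm {d}X_t=X_t\pi ^\top (\mu \,\mathrm {d}t+\sigma \,\mathrm {d}W_t)\), which give the certainty equivalent \(x_0\exp \bigl (T(\pi ^\top \mu -\tfrac{1}{2}(1-\gamma )\pi ^\top \sigma \sigma ^\top \pi )\bigr )\); the same assumed closed forms for \(A\) and \(c\) that satisfy Lemma 4.7; and hypothetical market parameters. The worst-case drift is obtained by bisection on the multiplier of the constraint \(\Vert \rho \Vert =\kappa \).

```python
import numpy as np

def constraint_terms(Sigma):
    """A and c in an assumed closed form that satisfies Lemma 4.7."""
    one = np.ones(len(Sigma))
    w = np.linalg.solve(Sigma, one)
    return np.linalg.inv(Sigma) - np.outer(w, w) / (one @ w), w / (one @ w)

def worst_case_drift(Sigma, tau, nu, h, gamma, kappa, iters=200):
    """mu* via bisection on the multiplier eta of the ball constraint."""
    A, c = constraint_terms(Sigma)
    d = len(nu)
    M = tau.T @ A @ tau / (1 - gamma)
    q = tau.T @ (h * c + A @ nu / (1 - gamma))
    rho = lambda eta: -np.linalg.solve(M + eta * np.eye(d), q)
    lo = h / (kappa * np.linalg.norm(np.linalg.solve(tau, np.ones(d))))
    hi = np.linalg.norm(q) / kappa
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(rho(mid)) > kappa:
            lo = mid
        else:
            hi = mid
    return nu + tau @ rho(hi)

def certainty_equivalent(pi, mu, Sigma, gamma, T=1.0, x0=1.0):
    """Certainty equivalent of a constant strategy (zero interest rate assumed)."""
    return x0 * np.exp(T * (pi @ mu - 0.5 * (1 - gamma) * pi @ Sigma @ pi))

# Hypothetical three-asset market with a ball as uncertainty set.
sigma = np.array([[0.20, 0.00, 0.00],
                  [0.05, 0.25, 0.00],
                  [0.02, 0.04, 0.30]])
Sigma, tau = sigma @ sigma.T, np.eye(3)
nu = np.array([0.05, 0.08, 0.03])
h, gamma, kappa = 1.0, -1.0, 0.3

A, c = constraint_terms(Sigma)
mu_star = worst_case_drift(Sigma, tau, nu, h, gamma, kappa)
pi_hat = A @ nu / (1 - gamma) + h * c        # naive strategy for drift nu
pi_star = A @ mu_star / (1 - gamma) + h * c  # robust strategy

coa = (certainty_equivalent(pi_hat, nu, Sigma, gamma)
       - certainty_equivalent(pi_star, nu, Sigma, gamma))
rdr = (certainty_equivalent(pi_star, mu_star, Sigma, gamma)
       - certainty_equivalent(pi_hat, mu_star, Sigma, gamma))
print("COA:", coa, "RDR:", rdr)
```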

Remark 4.11

A different definition of \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) is possible where one measures the difference in expected utility rather than the difference of the certainty equivalents. The asymptotic behavior of the reward for distributional robustness for large uncertainty is then heavily affected by the parameter \(\gamma \) of the investor’s utility function. In particular, as \(\kappa \) goes to infinity, the reward for distributional robustness goes to zero if \(\gamma >0\) and to infinity if \(\gamma <0\).

Proposition 4.12

Independently of \(\gamma \in (-\infty ,1)\) it always holds \({{\,\mathrm{COA}\,}}\ge {{\,\mathrm{RDR}\,}}\).

Furthermore, \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) converge as \(\kappa \) goes to infinity. We write \({{\,\mathrm{COA}\,}}(\kappa )\) and \({{\,\mathrm{RDR}\,}}(\kappa )\) to emphasize the dependence on the degree of uncertainty.

Proposition 4.13

As \(\kappa \) goes to infinity, \({{\,\mathrm{COA}\,}}(\kappa )\) converges to a non-negative limit and \({{\,\mathrm{RDR}\,}}(\kappa )\) goes to zero.

Figure 2 illustrates the behavior of \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) as functions of the level of uncertainty \(\kappa \). We consider a market with \(d=8\) stocks, where the underlying market parameters are those from Example 4.9. The figure shows \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) plotted against \(\kappa \) for different values of \(\gamma \). Note that the scaling in the second row of subfigures is different from the scaling in the first row. The absolute values of \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) become smaller as \(\gamma \) increases.

We observe that the qualitative behavior of \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) is the same for any value of the risk aversion coefficient \(\gamma \). For any fixed \(\gamma \) and \(\kappa \), \({{\,\mathrm{RDR}\,}}\) never exceeds \({{\,\mathrm{COA}\,}}\), a property that we have proven in Proposition 4.12. As \(\kappa \) increases, \({{\,\mathrm{COA}\,}}\) goes to a finite positive limit, whereas \({{\,\mathrm{RDR}\,}}\) tends to zero, as we have shown in Proposition 4.13.

Fig. 2 The behavior of \({{\,\mathrm{COA}\,}}\) and \({{\,\mathrm{RDR}\,}}\) plotted against uncertainty radius \(\kappa \) for different values of the risk aversion coefficient \(\gamma \). The parameters are those from Example 4.9

Outlook on stochastic drift and time-dependent uncertainty sets

In this section we give a brief outlook on how the results of this paper can also be applied in more general financial market models with a stochastic drift process. This generalization is the topic of our follow-up work Sass and Westphal [24]. Here we only give a short outline of the setup to illustrate the relevance of this work.

In Sass and Westphal [24] the results of the present paper are generalized to a financial market with a stochastic drift process and time-dependent uncertainty sets \((K_t)_{t\in [0,T]}\). This is motivated by the idea that information about the hidden drift process, as e.g. obtained from filtering techniques, might change over time. A surplus of information should then be reflected in a smaller uncertainty set. More precisely, we assume that under the reference measure returns follow the dynamics

$$\begin{aligned} \mathrm {d}R_t = \nu _t\,\mathrm {d}t+\sigma \,\mathrm {d}W_t, \end{aligned}$$

where the reference drift \((\nu _t)_{t\in [0,T]}\) is adapted to the filtration \((\mathcal {G}_t)_{t\in [0,T]}\) representing the investor’s information. This is justified by a separation principle where one performs a filtering step before solving the optimization problem, i.e. \((\nu _t)_{t\in [0,T]}\) represents the investor’s filter for the drift process. We introduce a time-dependent uncertainty set \((K_t)_{t\in [0,T]}\) that is a set-valued stochastic process adapted to \((\mathcal {G}_t)_{t\in [0,T]}\), meaning that the investor knows the realization of \(K_t\) at time t.

It is not obvious how to set up a worst-case optimization problem in this time-dependent setting. The problem lies in the fact that the realization of the uncertainty sets \((K_t)_{t\in [0,T]}\) is not known initially but gets revealed over time. A worst-case drift process \((\mu _t)_{t\in [0,T]}\) is characterized by being the worst one with the property that \(\mu _t\in K_t\) for all \(t\in [0,T]\). However, optimization with respect to this worst-case drift process is not feasible for an investor since it is not known initially. Instead, it makes sense to consider the following local approach. For any fixed \(t\in [0,T]\), the current uncertainty set \(K_t\) is known. Given this \(K_t\), investors take model uncertainty into account by assuming that in the future the worst possible drift process having values in \(K_t\) will be realized, i.e. the worst drift process from the class

$$\begin{aligned} \mathcal {K}^{(t)}=\bigl \{(\mu ^{(t)}_s)_{s\in [t,T]} \,\big |\, \mu ^{(t)}_s\in K_t \text { and } \mu ^{(t)}_s \text { is }\mathcal {G}_t\text {-measurable for each }s\in [t,T]\bigr \}. \end{aligned}$$

Investors then solve at each time \(t\in [0,T]\) the local optimization problem

$$\begin{aligned} \sup _{\pi ^{(t)}\in \mathcal {A}_h(t,x)} \inf _{\mu ^{(t)}\in \mathcal {K}^{(t)}} {{\,\mathrm{\mathbb {E}}\,}}_{\mu ^{(t)}}\Bigl [U\bigl (X^{t,x,\pi ^{(t)}}_T\bigr )\Bigr ]. \end{aligned}$$

Here, we write \(X^{t,x,\pi ^{(t)}}_s\) for the wealth at \(s\in [t,T]\) when starting at t with x and using strategy \(\pi ^{(t)}\in \mathcal {A}_h(t,x)\), where the admissibility set is defined analogously to \(\mathcal {A}_h(x_0)\) for strategies starting at t. This leads to an optimal strategy \((\pi ^{(t),*}_s)_{s\in [t,T]}\). In our continuous-time setting this decision will be revised as soon as \(K_t\) changes, possibly continuously in time. The realized optimal strategy of the investor is then given by \(\pi ^*_t=\pi ^{(t),*}_t\) for any \(t\in [0,T]\).

This setup of the local optimization problems is reasonable from an investor’s point of view. The uncertainty sets \(K_t\) change continuously in time due to new incoming information along with return observations, for example. Naturally, the optimal strategy of the investor will then also be adapted continuously. In Sass and Westphal [24] it is shown in detail how the results of this paper can be used to solve the more complicated problem described above. An explicit representation of the optimal strategy and a minimax theorem can be derived. Those results then also apply to much more general financial market models. The convergence results from Sect. 4, however, do not have a straightforward analogue in the setting with time-dependent uncertainty sets.

Remark 5.1

Initially it is not clear whether we have an inconsistent control problem, cf. Björk et al. [3], in our original formulation (3). But for the special case of (11) with a constant uncertainty set K, the results in Sass and Westphal [24] show that one obtains at time t the same optimal risky fractions as when starting at time 0. In combination with the Bellman principle, which implies that at time t we only need the information \(X_t=x\), this proves that our robust utility maximization problem with optimal solution \(\pi ^*\) obtained in Sect. 3 is time-consistent. A generalization that allows for more probability measures than those corresponding to a constant drift, in a formulation based on a robust utility functional, may raise consistency issues and would need assumptions on the structure of this set. This may then be treated as in Müller [17] under appropriate conditions.