Dynamic portfolio choice: a simulation-and-regression approach

Optimization and Engineering

Abstract

Simulation-and-regression algorithms have become a standard tool for solving dynamic programs in many areas, in particular financial engineering and computational economics. In virtually all cases, the regression is performed on the state variables, for example on current market prices. However, it is possible to regress on decision variables as well, and this opens up new possibilities. We present numerical evidence of the performance of such an algorithm, in the context of dynamic portfolio choices in discrete-time (and thus incomplete) markets. The problem is fundamentally the one considered in some recent papers that also use simulations and/or regressions: discrete time, multi-period reallocation, and maximization of terminal utility. In contrast to that literature, we regress on decision variables and rely neither on Taylor expansions nor on derivatives of the utility function. Only basic tools are used, bundled in a dynamic programming framework: simulations (which can be black-boxed) as a representation of exogenous state variable dynamics; regression surfaces, as non-anticipative representations of expected future utility; and nonlinear or quadratic optimization, to identify the best portfolio choice at each time step. The resulting approach is simple, highly flexible and offers good performance in time and precision.

Notes

  1. As outlined in Brandt (2010), a financial market is complete when all future outcomes (states) are spanned by the payoffs of traded assets. As shown in Cox and Huang (1989, 1991), the assumption of complete markets simplifies dynamic portfolio choices and can lead to closed-form solutions for the investment policies. Such complete markets are, however, typically associated with a continuous-time setting. Here, we examine a discrete-time trading problem for which the market is incomplete. This framework does not allow explicit solutions except in a few very specific cases.

  2. We found the papers by Koijen et al. and Section 6.6.6 of Bertsekas (2012) after the paper was written up; in fact, we came to the method we propose by generalizing simulation-and-regression techniques used in option pricing.

  3. It can be optimal for a multi-period investor to invest myopically under specific circumstances. The first is the case of constant investment opportunities, i.e. i.i.d. returns. The second is the case of stochastic but unhedgeable investment opportunities: the available assets cannot be used to hedge against changes in future investment opportunities, so a myopic policy is optimal. Finally, a logarithmic utility function also yields an optimal myopic policy. See Brandt (2010) for more details.

  4. Such a modification would, however, be outside the scope of this paper, as it would require non-trivial changes. For example, with proportional transaction costs as in Balduzzi and Lynch (1999), the decisions of the investor, in addition to being conditioned on potential state variable values, should also be conditioned on an inherited allocation, which is the result of the previous allocation and the realization of random returns. At each reallocation point, the investor should assess whether the potential gain in expected utility from a new optimal allocation is sufficient to justify the loss associated with the transaction costs triggered by the new allocation.

  5. This continuous function is a diminutive version of the regression surface \(L_{k,t}(x,s)\) that will be introduced in Sect. 3.3.

  6. In order to reduce the sample variability, a moment matching simulation is used to generate the standard normal random variates. See, for example, Barraquand (1995) or Boyle et al. (1997).

  7. We did check that with a fine discretization of time and larger bounds on weights, the LSMC–S&D solution converges to the quasi-analytic solution of Wachter.

  8. Wachter’s model uses a CRRA utility function. As is well known from the financial literature, with these preferences, optimal portfolios are independent of the wealth level and there is no real need to build a grid of wealth. We nevertheless proceed as if such a grid were required, in order to assess the performance of the proposed method in a wealth-dependent context.

  9. As discussed in DeMiguel et al. (2014) and the references therein, VAR processes can capture stock return serial dependence in a statistically significant manner. In any case, our goal was not to find the best predictors of future returns but to develop a problem that scales easily, so as to test our numerical approach.

  10. In fact, for wider decision spaces, if extreme leverage is allowed for example, the algorithm simply becomes slower.

  11. Note that while we report results based on the “realized values” approach, we observed that results that are based on the “regression surface” approach were very similar.

  12. To avoid solutions with allocation in only one or two assets, which are arguably less “interesting”, we arbitrarily changed parameter \(a_{0,1}\) from 0.0055 to 0.0062 for the four-assets (three risky) case below, and parameters \(a_{0,1}\) and \(a_{0,3}\) from 0.0057 (both cases) to 0.0068 and 0.0062 respectively, for the five-assets (four risky) case.

  13. In fact, these results illustrate that our quadrature approach is an imperfect benchmark, albeit the best we have. Indeed our problem is not simply a matter of computing a numerical integration (where quadrature is routinely used as a benchmark), rather it involves numerical integration within a scheme of dynamic programming, with underlying optimizations and interpolations.

  14. The model is presented in terms of lost wealth for simplicity of exposition, given that conditional value at risk is typically defined in terms of a random variable that we wish to be as small as possible. Minimizing lost wealth is equivalent to maximizing excess wealth.

  15. The time 0 return is known at \(t=0\). This known return value is \(R_{0}^{(a)}\) in the simulated sample of the table. The return at \(t=0\) for the second sample path appearing in the table has been simulated by starting the simulation at \(t=-1\). As will become clear later in the example, this is done in order to avoid singular matrices in the regressions at time \(t=0\).

References

  • Artzner P, Delbaen F, Eber J-M, Heath D (1999) Coherent measures of risk. Math Finance 9:203–228

  • Balduzzi P, Lynch A (1999) Transaction costs and predictability: some utility cost calculations. J Financ Econ 52:47–78

  • Barraquand J (1995) Numerical valuation of high dimensional multivariate European securities. Manag Sci 41:1882–1891

  • Bellini F, Bignozzi V (2015) Elicitable risk measures. Quant Finance 15(5):725–733

  • Bertsekas DP (2012) Dynamic programming and optimal control, 4th edn, vols I and II. Athena Scientific, Belmont

  • Boyle P, Broadie M, Glasserman P (1997) Monte Carlo methods for security pricing. J Econ Dyn Control 21:1263–1321

  • Bradtke SJ, Barto AJ (1996) Linear least-squares algorithms for temporal difference learning. Mach Learn 22:33–57

  • Brandt M (2010) Portfolio choice problems. In: Ait-Sahalia Y, Hansen LP (eds) Handbook of financial econometrics, Volume 1: Tools and techniques. North Holland, New York, pp 269–336

  • Brandt M, Goyal A, Santa-Clara P, Stroud J (2005) A simulation approach to dynamic portfolio choice with an application to learning about return predictability. Rev Financ Stud 18:831–873

  • Brennan M, Schwartz E, Lagnado R (1997) Strategic asset allocation. J Econ Dyn Control 21:1377–1403

  • Carroll Chr (2006) The method of endogenous gridpoints for solving dynamic stochastic optimization problems. Econ Lett 91:312–320

  • Cox JC, Huang CF (1989) Optimum consumption and portfolio policies when asset prices follow a diffusion process. J Econ Theory 49:33–83

  • Cox JC, Huang CF (1991) A variational problem occurring in financial economics. J Math Econ 20:465–487

  • Dammon R, Spatt C, Zhang H (2000) Optimal consumption and investment with capital gains taxes. Rev Financ Stud 14:583–616

  • DeMiguel V, Nogales F, Uppal R (2014) Stock return serial dependence and out-of-sample portfolio performance. Rev Financ Stud 27:1031–1073

  • Denault M, Simonato J-G (2017) Dynamic portfolio choices by simulation-and-regression: revisiting the issue of value function vs portfolio weight recursions. Comput Oper Res 79:174–189

  • Denault M, Simonato J-G, Stentoft L (2013) A simulation-and-regression approach for stochastic dynamic programs with endogenous state variables. Comput Oper Res 40:2760–2769

  • Detemple J, Garcia R, Rindisbacher M (2003) A Monte Carlo method for optimal portfolios. J Finance 58:401–446

  • Detemple J, Garcia R, Rindisbacher M (2005) Intertemporal asset allocation: a comparison of methods. J Bank Finance 29:2821–2848

  • Garlappi L, Skoulakis G (2009) Numerical solutions to dynamic portfolio problems: the case for value function iteration using Taylor expansion. Comput Econ 33:193–207

  • Garlappi L, Skoulakis G (2010) Solving consumption and portfolio choice problems: the state variable decomposition method. Rev Financ Stud 23:3346–3400

  • Koijen R, Nijman T, Werker B (2010) When can life cycle investors benefit from time-varying bond risk premia? Rev Financ Stud 23:741–780

  • Kuhn D, Wiesemann W, Georghiou A (2011) Primal and dual linear decision rules in stochastic and robust optimization. Math Program Ser A 130:177–209

  • Lagoudakis MG, Parr R, Littman ML (2002) Least-squares methods in reinforcement learning for control. In: Vlahavas IP, Spyropoulos CD (eds) Methods and applications of artificial intelligence, Lecture Notes in Computer Science, vol 2308, Second Hellenic Conference on AI, SETN 2002, Springer, pp 249–260

  • Longstaff F, Schwartz E (2001) Valuing American options by simulations: a simple least squares approach. Rev Financ Stud 14:113–148

  • Nijman T, Werker B, Koijen R (2007) Appendix to: when can life-cycle investors benefit from time-varying bond risk premia? Working paper, Network for Studies on Pensions, Aging and Retirement

  • Rockafellar RT, Uryasev S (2000) Optimization of conditional value-at-risk. J Risk 2:21–41

  • Rocha P, Kuhn D (2012) Multistage stochastic portfolio optimisation in deregulated electricity markets using linear decision rules. Eur J Oper Res 216:397–408

  • Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge. Unpublished, updated 2012 edition available on the web (Spring 2015)

  • Tsitsiklis J, Van Roy B (2001) Regression methods for pricing complex American-style options. IEEE Trans Neural Netw 12:694–703

  • van Binsbergen J, Brandt M (2007) Solving dynamic portfolio choice problems by recursing on optimized portfolio weights or on the value function? Comput Econ 29:355–367

  • Wachter J (2002) Portfolio and consumption decisions under mean-reverting returns: an exact solution for complete markets. J Financ Quant Anal 37:63–91

  • Watkins CJCH (1989) Learning from delayed rewards. Ph.D. thesis, Cambridge University

  • Watkins CJCH, Dayan P (1992) Q-learning. Mach Learn 8:279–292

  • Weissensteiner A (2009) A Q-learning approach to derive optimal consumption and investment strategies. IEEE Trans Neural Netw 20:1234–1243

  • Zenios SA (2007) Practical financial optimization, decision making for financial engineers. Blackwell, London

Acknowledgements

The authors acknowledge the financial support of HEC-Montréal and NSERC (E.D. grant 386416-2010; M.D. grant 227838-2011).

Author information

Corresponding author

Correspondence to Michel Denault.

Appendix: Introductory example: regression on a state variable

In this appendix, we extend the introductory example of Sect. 3.2 to the case where we regress on a state variable as well. To keep it compact and clear, the section is written with variables, not numbers.

Consider a two-period example for a portfolio consisting of one risky and one risk-free asset with a constant gross risk-free rate \(R_{f}\). The investor, endowed with current wealth \(W_{0}\) at time \(t=0\), wants to maximize the utility of wealth at time \(t=2\) by choosing an optimal asset allocation at times \(t=0\) and \(t=1\), without any short sales or borrowing.

Denote the grid of portfolio weights used here as

$$\begin{aligned} x=\left\{ \begin{array}{cccc} x_{1},&x_{2},&x_{3},&x_{4} \end{array} \right\} , \end{aligned}$$

while the grid of wealths is:

$$\begin{aligned} \begin{array}{cc} t=0 & t=1\\ \hline & w_{1,1}\\ w_{0} & \\ & w_{2,1} \end{array} \end{aligned}$$

In this example, the return process has an AR(1) structure. Hence, when forming a portfolio at \(t=0\), the known return at \(t=0\) is a state variable (see Note 15). To keep the example simple, only two sample paths of returns are simulated at each period:

$$\begin{aligned} \begin{array}{cccc} \text{Path} & \text{Period 0} & \text{Period 1} & \text{Period 2}\\ \hline a & R_{0}^{(a)} & R_{1}^{(a)} & R_{2}^{(a)}\\ b & R_{0}^{(b)} & R_{1}^{(b)} & R_{2}^{(b)} \end{array} \end{aligned}$$
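As a rough illustration of how such a sample could be produced, the sketch below simulates an AR(1) return process with NumPy. The function name, the parameter values (`c`, `phi`, `sigma`) and the choice of the known return as the unobserved value at \(t=-1\) are assumptions made for illustration only; they are not taken from the paper. The known return is placed on the first path, and the other paths receive a simulated value at \(t=0\), in the spirit of Note 15.

```python
import numpy as np

def simulate_ar1_returns(r0_known, n_paths, n_periods, c, phi, sigma, seed=0):
    """Simulate R_0, ..., R_T on n_paths paths, with
    R_{t+1} = c + phi * R_t + sigma * eps_{t+1}, eps ~ N(0, 1).

    All parameter names and values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    returns = np.empty((n_paths, n_periods + 1))
    # Assumption: the value at t = -1 is set equal to the known return,
    # and one AR(1) step from t = -1 gives a simulated R_0 on each path.
    returns[:, 0] = c + phi * r0_known + sigma * rng.standard_normal(n_paths)
    # By convention, the known value at t = 0 sits on the first path.
    returns[0, 0] = r0_known
    for t in range(n_periods):
        eps = rng.standard_normal(n_paths)
        returns[:, t + 1] = c + phi * returns[:, t] + sigma * eps
    return returns

# Two paths (a and b) over periods 0, 1, 2, as in the table above.
R = simulate_ar1_returns(r0_known=0.01, n_paths=2, n_periods=2,
                         c=0.004, phi=0.2, sigma=0.05)
print(R.shape)  # (2, 3)
```

Row 0 of `R` plays the role of path a and row 1 the role of path b. In practice many more paths would be simulated; two are kept here only to mirror the example.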

In a second step, the recursive dynamic programming procedure is implemented with the above quantities. Starting at \(t=1\), one period before maturity, we perform the following operations, for each state of wealth at this time period. We first generate a sample of wealth at \(t=2\). For example, using the wealth level \(w_{1,1}\), it is possible to generate a wealth at \(t=2\) using the first weight and path a of the simulated return with

$$\begin{aligned} W\left( x_{1},R_{2}^{(a)}\right) =w_{1,1}\times \left( x_{1}R_{2}^{(a)}+R_{f}\right) . \end{aligned}$$

Using all the possible combinations of simulated returns at \(t=2\) and portfolio weights on the grid, the end of period sample of wealths at \(t=2\) is:

$$\begin{aligned} \left[ \begin{array}{cccccc} W\left( x_{1},R_{2}^{(a)}\right)&\cdots&W\left( x_{4},R_{2}^{(a)}\right)&W\left( x_{1},R_{2}^{(b)}\right)&\cdots&W\left( x_{4},R_{2}^{(b)}\right) \end{array} \right] ^\intercal . \end{aligned}$$
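The sample above can be sketched in a few lines of Python. The function below applies the wealth formula exactly as written in the text, \(w\times (xR+R_{f})\), to every weight and path combination; the function name and the numerical inputs are hypothetical and serve only as an illustration.

```python
import numpy as np

def terminal_wealth_sample(w, weights, returns, r_f):
    """Return W(x_i, R^(j)) = w * (x_i * R^(j) + r_f) for all pairs.

    The output is ordered as in the text: all weights for the first
    path, then all weights for the second path, and so on.
    """
    weights = np.asarray(weights, dtype=float)
    returns = np.asarray(returns, dtype=float)
    grid = w * (np.outer(returns, weights) + r_f)  # shape (n_paths, n_weights)
    return grid.ravel()

# Hypothetical inputs: four weights, two simulated returns at t = 2.
weights = [0.0, 1 / 3, 2 / 3, 1.0]
sample = terminal_wealth_sample(w=100.0, weights=weights,
                                returns=[0.05, -0.02], r_f=1.01)
print(sample.shape)  # (8,)
```

With four weights and two paths the sample has eight entries, matching the eight rows of the regression system that follows.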

Computing the utility of these end of period wealths with the utility function u(W), the following linear regression system can be formed:

$$\begin{aligned} \left[ \begin{array}{c} u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \\ u\left( W\left( \cdot \right) \right) \end{array} \right] =\left[ \begin{array}{cccccc} 1 &{}\quad x_{1} &{}\quad x_{1}^{2} &{}\quad R_{1}^{(a)} &{}\quad \left( R_{1}^{(a)}\right) ^{2} &{}\quad x_{1}R_{1}^{(a)}\\ 1 &{}\quad x_{2} &{}\quad x_{2}^{2} &{}\quad R_{1}^{(a)} &{}\quad \left( R_{1}^{(a)}\right) ^{2} &{}\quad x_{2}R_{1}^{(a)}\\ 1 &{}\quad x_{3} &{}\quad x_{3}^{2} &{}\quad R_{1}^{(a)} &{}\quad \left( R_{1}^{(a)}\right) ^{2} &{}\quad x_{3}R_{1}^{(a)}\\ 1 &{}\quad x_{4} &{}\quad x_{4}^{2} &{}\quad R_{1}^{(a)} &{}\quad \left( R_{1}^{(a)}\right) ^{2} &{}\quad x_{4}R_{1}^{(a)}\\ 1 &{}\quad x_{1} &{}\quad x_{1}^{2} &{}\quad R_{1}^{(b)} &{}\quad \left( R_{1}^{(b)}\right) ^{2} &{}\quad x_{1}R_{1}^{(b)}\\ 1 &{}\quad x_{2} &{}\quad x_{2}^{2} &{}\quad R_{1}^{(b)} &{}\quad \left( R_{1}^{(b)}\right) ^{2} &{}\quad x_{2}R_{1}^{(b)}\\ 1 &{}\quad x_{3} &{}\quad x_{3}^{2} &{}\quad R_{1}^{(b)} &{}\quad \left( R_{1}^{(b)}\right) ^{2} &{}\quad x_{3}R_{1}^{(b)}\\ 1 &{}\quad x_{4} &{}\quad x_{4}^{2} &{}\quad R_{1}^{(b)} &{}\quad \left( R_{1}^{(b)}\right) ^{2} &{}\quad x_{4}R_{1}^{(b)} \end{array} \right] \left[ \begin{array}{c} \beta _{1}\\ \beta _{2}\\ \beta _{3}\\ \beta _{4}\\ \beta _{5}\\ \beta _{6} \end{array} \right] +\left[ \begin{array}{c} \varepsilon _{1}\\ \varepsilon _{2}\\ \varepsilon _{3}\\ \varepsilon _{4}\\ \varepsilon _{5}\\ \varepsilon _{6}\\ \varepsilon _{7}\\ \varepsilon _{8} \end{array} \right] , \end{aligned}$$

where the elements on the left-hand side are the utilities of the wealths at \(t=2\), and the rows of the regressor matrix consist of a constant and all first- and second-degree monomials in the portfolio weight and the return at \(t=1\) (the state variable). In this system, \(\varepsilon _{i}\) are random errors with zero expected value, and \(\beta _{1}\) to \(\beta _{6}\) are unknown coefficients, whose values can be estimated by ordinary least squares regression. Using these estimated coefficients, the approximate expected utility, conditional on the first value of the state variable, can be written as a continuous function of the portfolio weight, which can be used to find the optimal portfolio weight as:

$$\begin{aligned} \widehat{x}_{1,1}^{\left( a\right) }=\underset{0\le x\le 1}{\arg \max }\,\,\widehat{\beta }_{1}+\widehat{\beta }_{2}x+\widehat{\beta }_{3}x^{2} +\widehat{\beta }_{4}R_{1}^{(a)}+\widehat{\beta }_{5}\left( R_{1}^{(a)}\right) ^{2}+\widehat{\beta }_{6}xR_{1}^{(a)} \end{aligned}$$

which yields a function value at the optimum of \(\widehat{v}_{1,1}^{(a)}\), corresponding to the regression surface value. Using the optimal weight \(\widehat{x}_{1,1}^{(a)}\) and the realized return \(R_{2}^{(a)}\), it is possible to compute the realized value \(\overline{v}_{1,1}^{(a)}=u(w_{1,1}\cdot (\widehat{x}_{1,1}^{(a)}R_{2}^{(a)}+R_{f}))\).
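Because the fitted surface is quadratic in the single weight \(x\), the maximization over \(0\le x\le 1\) can be done by comparing the two endpoints with the interior stationary point, when it exists. The sketch below illustrates this for the one-risky-asset case only; the function name and the coefficient values are hypothetical, and a general-purpose optimizer would be needed with several risky assets.

```python
def best_weight(beta, r, lower=0.0, upper=1.0):
    """Maximize b1 + b2*x + b3*x^2 + b4*r + b5*r^2 + b6*x*r over [lower, upper].

    Returns the maximizing weight and the surface value at that weight.
    """
    b1, b2, b3, b4, b5, b6 = beta

    def surface(x):
        return b1 + b2 * x + b3 * x**2 + b4 * r + b5 * r**2 + b6 * x * r

    candidates = [lower, upper]
    if b3 != 0.0:
        # Stationary point of the quadratic in x, for a fixed state value r.
        x_star = -(b2 + b6 * r) / (2.0 * b3)
        if lower < x_star < upper:
            candidates.append(x_star)
    # If the quadratic is convex in x, the stationary point is a minimum
    # and one of the endpoints wins the comparison below.
    x_best = max(candidates, key=surface)
    return x_best, surface(x_best)

# Hypothetical coefficients and state value, for illustration only.
x_hat, v_hat = best_weight((1.0, 0.5, -0.8, 0.3, -0.2, 0.4), r=0.05)
```

Here `x_hat` corresponds to the optimal weight and `v_hat` to the regression surface value at the optimum. The realized value is obtained separately, by applying the utility function to the wealth generated by `x_hat` and the realized return.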

Similarly, the approximate expected utility, conditional on the second value of our state variable can be written as a continuous function of the portfolio weights, which can be used to find the optimal portfolio weight as:

$$\begin{aligned} \widehat{x}_{1,1}^{\left( b\right) }=\underset{0\le x\le 1}{\arg \max }\,\,\widehat{\beta }_{1}+\widehat{\beta }_{2}x+\widehat{\beta }_{3}x^{2} +\widehat{\beta }_{4}R_{1}^{(b)}+\widehat{\beta }_{5}\left( R_{1}^{(b)}\right) ^{2}+\widehat{\beta }_{6}xR_{1}^{(b)} \end{aligned}$$

which yields a function value at the optimum of \(\widehat{v}_{1,1}^{(b)}\), corresponding to the regression surface value. The realized value is computed as \(\overline{v}_{1,1}^{(b)}=u(w_{1,1}\cdot (\widehat{x}_{1,1}^{(b)}R_{2}^{(b)}+R_{f}))\). Similar calculations for the other state of wealth lead to the table below, which summarizes the computed quantities for each state of wealth:

$$\begin{aligned} \begin{array}{ccccc} w_{k,1} & \widehat{x}_{k,1}^{(j)} & \widehat{v}_{k,1}^{(j)} & W\left( \widehat{x}_{k,1}^{(j)},R_{2}^{(j)}\right) & \overline{v}_{k,1}^{(j)}\\ \hline w_{1,1} & \widehat{x}_{1,1}^{(a)} & \widehat{v}_{1,1}^{(a)} & W\left( \widehat{x}_{1,1}^{(a)},R_{2}^{(a)}\right) & \overline{v}_{1,1}^{(a)}\\ & \widehat{x}_{1,1}^{(b)} & \widehat{v}_{1,1}^{(b)} & W\left( \widehat{x}_{1,1}^{(b)},R_{2}^{(b)}\right) & \overline{v}_{1,1}^{(b)}\\ w_{2,1} & \widehat{x}_{2,1}^{(a)} & \widehat{v}_{2,1}^{(a)} & W\left( \widehat{x}_{2,1}^{(a)},R_{2}^{(a)}\right) & \overline{v}_{2,1}^{(a)}\\ & \widehat{x}_{2,1}^{(b)} & \widehat{v}_{2,1}^{(b)} & W\left( \widehat{x}_{2,1}^{(b)},R_{2}^{(b)}\right) & \overline{v}_{2,1}^{(b)} \end{array} \end{aligned}$$

Using the above values, the optimal weight at \(t=0\) can be computed using a similar procedure. We first compute a sample of wealth at \(t=1\) using all combinations of simulated returns and portfolio weights. This sample is

$$\begin{aligned} \left[ \begin{array}{cccccc} W\left( x_{1},R_{1}^{(a)}\right)&\cdots&W\left( x_{4},R_{1}^{(a)}\right)&W\left( x_{1},R_{1}^{(b)}\right)&\cdots&W\left( x_{4},R_{1}^{(b)}\right) \end{array} \right] ^\intercal . \end{aligned}$$

Using these, we can generate a sample of points on the value function at \(t=1\) that will be used as dependent variables in a regression. This can be done by linear interpolation with the pairs of wealths and realized values (the quantities summarized in the above table). For example, the value corresponding to \(W( x_{1},R_{1}^{(a)}) \) is obtained by a linear interpolation computed with the quantities below:

$$\begin{aligned} \begin{array}{cc} \text{Wealth} & \text{Value}\\ \hline w_{1,1} & \overline{v}_{1,1}^{(a)}\\ W\left( x_{1},R_{1}^{(a)}\right) & \\ w_{2,1} & \overline{v}_{2,1}^{(a)} \end{array} \end{aligned}$$

while a value corresponding to \(W( x_{4},R_{1}^{(b)}) \) would be computed with

$$\begin{aligned} \begin{array}{cc} \text{Wealth} & \text{Value}\\ \hline w_{1,1} & \overline{v}_{1,1}^{(b)}\\ W\left( x_{4},R_{1}^{(b)}\right) & \\ w_{2,1} & \overline{v}_{2,1}^{(b)} \end{array} \end{aligned}$$
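The interpolation step can be sketched with `numpy.interp`, which performs piecewise-linear interpolation on an increasing grid. The function name and the numbers below are hypothetical. Note that `numpy.interp` holds the end values constant outside the grid, which is an assumption of this sketch; the text does not say how wealths falling outside the grid are handled.

```python
import numpy as np

def interpolate_value(wealth, wealth_grid, realized_values):
    """Linearly interpolate the realized values at a simulated wealth.

    wealth_grid holds the grid wealths at t = 1 (for example w_{1,1}, w_{2,1})
    and realized_values the matching realized values along one path.
    """
    wealth_grid = np.asarray(wealth_grid, dtype=float)
    realized_values = np.asarray(realized_values, dtype=float)
    if np.any(np.diff(wealth_grid) <= 0):
        raise ValueError("wealth_grid must be strictly increasing")
    return np.interp(wealth, wealth_grid, realized_values)

# Hypothetical grid and realized values along one path.
v_mid = interpolate_value(100.0, [80.0, 120.0], [1.0, 2.0])
v_out = interpolate_value(130.0, [80.0, 120.0], [1.0, 2.0])
print(v_mid, v_out)  # 1.5 2.0
```

The second call shows the clamping behaviour: a wealth above the largest grid point receives the value attached to that grid point.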

Let us denote these interpolated values by \(v( x_{1},R_{1}^{(a)})\) and \(v( x_{4},R_{1}^{(b)})\). With similar computations for the other values of the sample of simulated wealth at \(t=1\), we obtain the following linear system

$$\begin{aligned} \left[ \begin{array}{c} v\left( x_{1},R_{1}^{(a)}\right) \\ v\left( x_{2},R_{1}^{(a)}\right) \\ v\left( x_{3},R_{1}^{(a)}\right) \\ v\left( x_{4},R_{1}^{(a)}\right) \\ v\left( x_{1},R_{1}^{(b)}\right) \\ v\left( x_{2},R_{1}^{(b)}\right) \\ v\left( x_{3},R_{1}^{(b)}\right) \\ v\left( x_{4},R_{1}^{(b)}\right) \end{array} \right] =\left[ \begin{array}{cccccc} 1 &{}\quad x_{1} &{}\quad x_{1}^{2} &{}\quad R_{0}^{(a)} &{}\quad \left( R_{0}^{(a)}\right) ^{2} &{}\quad x_{1}R_{0}^{(a)}\\ 1 &{}\quad x_{2} &{}\quad x_{2}^{2} &{}\quad R_{0}^{(a)} &{}\quad \left( R_{0}^{(a)}\right) ^{2} &{}\quad x_{2}R_{0}^{(a)}\\ 1 &{}\quad x_{3} &{}\quad x_{3}^{2} &{}\quad R_{0}^{(a)} &{}\quad \left( R_{0}^{(a)}\right) ^{2} &{}\quad x_{3}R_{0}^{(a)}\\ 1 &{}\quad x_{4} &{}\quad x_{4}^{2} &{}\quad R_{0}^{(a)} &{}\quad \left( R_{0}^{(a)}\right) ^{2} &{}\quad x_{4}R_{0}^{(a)}\\ 1 &{}\quad x_{1} &{}\quad x_{1}^{2} &{}\quad R_{0}^{(b)} &{}\quad \left( R_{0}^{(b)}\right) ^{2} &{}\quad x_{1}R_{0}^{(b)}\\ 1 &{}\quad x_{2} &{}\quad x_{2}^{2} &{}\quad R_{0}^{(b)} &{}\quad \left( R_{0}^{(b)}\right) ^{2} &{}\quad x_{2}R_{0}^{(b)}\\ 1 &{}\quad x_{3} &{}\quad x_{3}^{2} &{}\quad R_{0}^{(b)} &{}\quad \left( R_{0}^{(b)}\right) ^{2} &{}\quad x_{3}R_{0}^{(b)}\\ 1 &{}\quad x_{4} &{}\quad x_{4}^{2} &{}\quad R_{0}^{(b)} &{}\quad \left( R_{0}^{(b)}\right) ^{2} &{}\quad x_{4}R_{0}^{(b)} \end{array} \right] \left[ \begin{array}{c} \beta _{1}\\ \beta _{2}\\ \beta _{3}\\ \beta _{4}\\ \beta _{5}\\ \beta _{6} \end{array} \right] +\left[ \begin{array}{c} \varepsilon _{1}\\ \varepsilon _{2}\\ \varepsilon _{3}\\ \varepsilon _{4}\\ \varepsilon _{5}\\ \varepsilon _{6}\\ \varepsilon _{7}\\ \varepsilon _{8} \end{array} \right] , \end{aligned}$$

and the optimal portfolio allocation as:

$$\begin{aligned} \widehat{x}_{1,0}^{\left( a\right) }=\underset{0\le x\le 1}{\arg \max }\,\,\widehat{\beta }_{1}+\widehat{\beta }_{2}x+\widehat{\beta }_{3}x^{2} +\widehat{\beta }_{4}R_{0}^{(a)}+\widehat{\beta }_{5}\left( R_{0}^{(a)}\right) ^{2}+\widehat{\beta }_{6}xR_{0}^{(a)}. \end{aligned}$$

We note here that only one optimization is performed since only one value of the state variable (the known value) is relevant at this point. This value has been placed, by convention, in the first sample path when simulating the returns in the preliminary step.
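The regression systems at \(t=1\) and \(t=0\) share the same six regressors, so a single fitting routine can serve both steps. The following is a minimal sketch using `numpy.linalg.lstsq`; the function names, the synthetic coefficients and the use of 50 paths are assumptions for illustration. More than two paths are used because, with only two distinct values of the state variable, the columns \(1\), \(R\) and \(R^{2}\) cannot all be linearly independent; in that case `lstsq` still returns a (minimum-norm) solution, but the coefficients are not uniquely determined.

```python
import numpy as np

def design_matrix(x, r):
    """Rows are [1, x, x^2, r, r^2, x*r], as in the regression systems above."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2, r, r**2, x * r])

def fit_surface(x, r, y):
    """Ordinary least squares estimate of beta_1, ..., beta_6."""
    beta, *_ = np.linalg.lstsq(design_matrix(x, r),
                               np.asarray(y, dtype=float), rcond=None)
    return beta

# Synthetic check: every weight is paired with every simulated state value.
weights = np.array([0.0, 1 / 3, 2 / 3, 1.0])
rng = np.random.default_rng(1)
state = rng.normal(0.0, 0.1, size=50)    # hypothetical state values
x = np.tile(weights, state.size)         # x_1..x_4 repeated for each path
r = np.repeat(state, weights.size)       # each state value repeated 4 times
beta_true = np.array([1.0, 0.5, -0.8, 0.3, -0.2, 0.4])  # made-up coefficients
y = design_matrix(x, r) @ beta_true      # noise-free responses
beta_hat = fit_surface(x, r, y)
```

With noise-free synthetic responses, `beta_hat` recovers `beta_true` up to rounding. In the algorithm itself, `y` would hold utilities of terminal wealth (at \(t=1\)) or interpolated values (at \(t=0\)), and at \(t=0\) the fitted surface would be maximized only at the known value of the state variable.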

Table 8 Example 1, portfolio weights
Table 9 Example 2, portfolio weights
Table 10 Example 3, portfolio weights
Table 11 Example 4, portfolio weights, four-assets case
Table 12 Example 4, portfolio weights, five-assets case
Table 13 Example 4. Certainty equivalents
Table 14 Example 5, Optimal portfolio weights under mean return versus CVaR tradeoff—three risky assets, one period
Table 15 Example 5, sub-optimality levels with respect to the polynomial degrees
Fig. 1 Example 5: Efficient set frontier of performances of portfolios obtained from solving the mean-CVaR model with different risk aversion levels when using four different resolution schemes for the inner dynamic programming step. Note that the “times” and “plus” markers overlap in two locations, which emphasizes the fact that these methods are incapable of identifying portfolios that trade off expected excess wealth against its conditional value at risk through the procedure that is described

Cite this article

Denault, M., Delage, E. & Simonato, JG. Dynamic portfolio choice: a simulation-and-regression approach. Optim Eng 18, 369–406 (2017). https://doi.org/10.1007/s11081-017-9347-4
