
Sparse mean–variance customer Markowitz portfolio optimization for Markov chains: a Tikhonov’s regularization penalty approach

Published in Optimization and Engineering

Abstract

This paper studies penalty-regularized expected utilities and investigates the applicability of the method to the mean–variance Markowitz customer portfolio optimization problem. To avoid a proliferation of solutions, we penalize large values by introducing a least-squares penalty term, known as the Tikhonov regularization term. Tikhonov’s regularization is one of the most popular approaches for solving discrete ill-posed problems and, in our case, it plays a fundamental role in ensuring convergence to a unique portfolio solution. We first provide the parameter conditions under which the penalty-regularized expected utility of a given optimal portfolio admits a unique solution. A crucial issue in Tikhonov’s regularization is the proper choice of the regularization parameter, because it can modify (sometimes significantly) the shape of the original functional. The main objective of this paper is to derive a method for choosing the regularization in an optimal way. To solve the problem, the parameters of the regularized poly-linear optimization problem are balanced simultaneously. We then prove that the solution of the regularized Markowitz portfolio optimization problem converges to an exact solution (with the minimal weighted norm). We consider a projection gradient method for finding the extremal points and prove its convergence. We show how to select the parameters of the algorithm in order to guarantee the convergence of the suggested procedure. Finally, we present a numerical example to illustrate the practical implications of the theoretical results for a penalty-regularized portfolio optimization problem.




Author information

Corresponding author

Correspondence to Julio B. Clempner.

Appendices

Appendix A: Proof of Theorem 1

Proof

Let us prove that the Hessian matrix H associated with the Markowitz portfolio given in Eq. (14) is strictly negative definite, i.e., we prove that for all \(c\in \bar{C}_{adm}\), \(\upsilon \in {\varUpsilon } _{adm}\) and some \(\mu ,\delta >0\)

$$\begin{aligned} \begin{bmatrix} \frac{\partial ^{2}}{\partial c^{2}}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&\frac{\partial ^{2}}{\partial \upsilon \partial c}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right) \\ \frac{\partial ^{2}}{\partial c\partial \upsilon }{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&\frac{\partial ^{2}}{\partial \upsilon ^{2}}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right) \end{bmatrix} <0 \end{aligned}$$
(34)

Defining

$$\begin{aligned} V\left( \upsilon \right) :=\frac{1}{2}\dfrac{\partial }{\partial \upsilon } \bar{\upsilon }^{\intercal }\left( \upsilon \right) ={\mathrm {diag}}\left( \upsilon _{1|1},\ldots ,\upsilon _{1|M};\upsilon _{2|1},\ldots ,\upsilon _{2|M};\ldots ;\upsilon _{N|1},\ldots ,\upsilon _{N|M}\right) \end{aligned}$$

and noting that

$$\begin{aligned} \dfrac{\partial }{\partial c}\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) ^{2}&= {} 4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) c \\ \dfrac{\partial ^{2}}{\partial c^{2}}\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) ^{2}&= {} 4\dfrac{\partial }{\partial c}\left[ \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) c\right] =4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8cc^{\intercal } \\ \dfrac{\partial }{\partial \upsilon }\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) ^{2}&= {} 4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) \upsilon \\ \dfrac{\partial ^{2}}{\partial \upsilon ^{2}}\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) ^{2}&= {} 4\dfrac{\partial }{\partial \upsilon }\left[ \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) \upsilon \right] =4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8\upsilon \upsilon ^{\intercal } \end{aligned}$$

by (14) we have

$$\begin{aligned} \dfrac{\partial }{\partial \upsilon }{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&= {} \mu \left[ \tilde{W}c+\xi \left( \upsilon ^{\intercal }\tilde{W} c\right) \tilde{W}c-\xi V\left( \upsilon \right) \bar{W}c\right] \nonumber \\&\quad -\, 2\left[ \upsilon ^{\intercal }e-\upsilon ^{+}\right] _{+}e-\left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}{\varPsi } c-4\delta \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) \upsilon \end{aligned}$$
(35)
$$\begin{aligned} \dfrac{\partial }{\partial c}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&= {} \mu \left[ \tilde{W}\upsilon +\xi \left( \upsilon ^{\intercal }\tilde{W} c\right) \tilde{W}\upsilon -\dfrac{\xi }{2}\bar{W}\bar{\upsilon }\left( \upsilon \right) \right] \nonumber \\&\quad -\,\sum \limits _{j=1}^{N}\left( \bar{\pi }_{j}-\bar{e}_{j}\right) \left( \bar{\pi }_{j}-\bar{e}_{j}\right) ^{\intercal }c-\left( e^{\intercal }c-1\right) e \nonumber \\&\quad -\,\left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}{\varPsi } \upsilon -4\delta \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) c \end{aligned}$$
(36)

implying

$$\begin{aligned} \dfrac{\partial ^{2}}{\partial \upsilon ^{2}}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&= {} \mu \xi \left[ \tilde{W}cc^{\intercal }\tilde{W}-\mathrm { diag}\left\{ \bar{W}c\right\} \right] \nonumber \\&\quad -\, 2\chi \left( \upsilon ^{\intercal }e-\upsilon ^{+}>0\right) ee^{\intercal }-\chi \left( c^{\intercal }{\varPsi } \upsilon -b_{ineq}>0\right) {\varPsi } cc^{\intercal }{\varPsi } \nonumber \\&\quad -\,\delta \left[ 4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8\upsilon \upsilon ^{\intercal } \right] \end{aligned}$$
(37)
$$\begin{aligned} \dfrac{\partial ^{2}}{\partial c^{2}}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&= {} \mu \xi \tilde{W}\upsilon \upsilon ^{\intercal }\tilde{W} -\sum \limits _{j=1}^{N}\left( \bar{\pi }_{j}-\bar{e}_{j}\right) \left( \bar{\pi }_{j}-\bar{e}_{j}\right) ^{\intercal }-ee^{\intercal } \nonumber \\&\quad -\,\chi \left( c^{\intercal }{\varPsi } \upsilon -b_{ineq}>0\right) {\varPsi } \upsilon \upsilon ^{\intercal }{\varPsi } \nonumber \\&\quad -\,\delta \left[ 4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8cc^{\intercal }\right] \end{aligned}$$
(38)
$$\begin{aligned} \dfrac{\partial ^{2}}{\partial c\partial \upsilon }{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right)&= {} \left[ \dfrac{\partial ^{2}}{\partial \upsilon \partial c}{\varPhi } _{\mu ,\delta }\left( \upsilon ,c\right) \right] ^{\intercal }\nonumber \\ {}&= {} \mu \left[ \tilde{W}+\xi \left[ \tilde{W}\upsilon c^{\intercal }\tilde{W} +\left( \upsilon ^{\intercal }\tilde{W}c\right) \tilde{W}\right] -\xi \bar{W} V\left( \upsilon \right) \right] \nonumber \\&\quad -\, \underset{\left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}\left( {\varPsi } +{\varPsi } \upsilon c^{\intercal }{\varPsi } ^{\intercal }\right) }{\underbrace{ \dfrac{\partial }{\partial c}\left( \left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}{\varPsi } c\right) }}-8\delta c\upsilon ^{\intercal }\nonumber \\&= {} \mu \left[ \tilde{W}+\xi \left[ \tilde{W}\upsilon c^{\intercal }\tilde{W} +\left( \upsilon ^{\intercal }\tilde{W}c\right) \tilde{W}\right] -\xi \bar{W} V\left( \upsilon \right) \right] \nonumber \\&\quad -\,\left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}\left( {\varPsi } +{\varPsi } \upsilon c^{\intercal }{\varPsi } ^{\intercal }\right) -8\delta c\upsilon ^{\intercal } \end{aligned}$$
(39)
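The closed-form derivatives of the quartic Tikhonov term \(\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) ^{2}\) used in (35)–(39) can be checked numerically. The following is a minimal sketch, not part of the original proof: it compares the gradient and Hessian block in \(c\) against central finite differences on random vectors. The function names, the vector length 6 and the step sizes are illustrative assumptions.

```python
import numpy as np

def tikhonov_term(c, v):
    """Quartic Tikhonov term (||c||^2 + ||v||^2)^2."""
    return (c @ c + v @ v) ** 2

def grad_c(c, v):
    """Closed-form gradient in c: 4 (||c||^2 + ||v||^2) c."""
    return 4.0 * (c @ c + v @ v) * c

def hess_cc(c, v):
    """Closed-form Hessian block in c: 4 (||c||^2 + ||v||^2) I + 8 c c^T."""
    return 4.0 * (c @ c + v @ v) * np.eye(c.size) + 8.0 * np.outer(c, c)

def numerical_grad_c(c, v, h=1e-6):
    """Central finite-difference gradient of the term with respect to c."""
    g = np.zeros_like(c)
    for i in range(c.size):
        e = np.zeros_like(c)
        e[i] = h
        g[i] = (tikhonov_term(c + e, v) - tikhonov_term(c - e, v)) / (2 * h)
    return g

def numerical_hess_cc(c, v, h=1e-5):
    """Central finite differences applied to the closed-form gradient."""
    H = np.zeros((c.size, c.size))
    for j in range(c.size):
        e = np.zeros_like(c)
        e[j] = h
        H[:, j] = (grad_c(c + e, v) - grad_c(c - e, v)) / (2 * h)
    return H

# Random test point (illustrative values only, not data from the paper).
rng = np.random.default_rng(0)
c = rng.random(6)
v = rng.random(6)

grad_err = np.max(np.abs(grad_c(c, v) - numerical_grad_c(c, v)))
hess_err = np.max(np.abs(hess_cc(c, v) - numerical_hess_cc(c, v)))
print("max gradient error:", grad_err)
print("max Hessian error:", hess_err)
```

By symmetry of the term in \(c\) and \(\upsilon\), the same check applies to the \(\upsilon\) derivatives after swapping the two arguments.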

Notice that (34) is fulfilled if

$$\begin{aligned}&\left[ \begin{array}{cc} \frac{\partial ^{2}}{\partial c^{2}}{ {\varPhi } }_{\mu ,\delta }\left( \upsilon ,c\right) &{}\quad \frac{\partial ^{2}}{\partial \upsilon \partial c} { {\varPhi } }_{\mu ,\delta }\left( \upsilon ,c\right) \\ \frac{\partial ^{2}}{\partial c\partial \upsilon }{ {\varPhi } }_{\mu ,\delta }\left( \upsilon ,c\right) &{}\quad \frac{\partial ^{2}}{\partial \upsilon ^{2}}{ {\varPhi } }_{\mu ,\delta }\left( \upsilon ,c\right) \end{array}\right] \\&\quad = -\,\delta \left[ \begin{array}{cc} { 4}\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8cc^{\intercal } &{}\quad 8c\upsilon ^{\intercal } \\ 8\upsilon c^{\intercal } &{}\quad 4\left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM}+8\upsilon \upsilon ^{\intercal } \end{array} \right] \\&\qquad +F_{\mu ,\delta }^{0}\left( \upsilon ,c\right) + \left[ \begin{array}{ll} F_{\mu ,\delta }^{1}\left( \upsilon ,c\right) &{}\quad 0 \\ 0 &{}\quad F_{\mu ,\delta }^{2}\left( \upsilon ,c\right) \end{array}\right] \\&\qquad +\,F_{\mu ,\delta }^{3}\left( \upsilon ,c\right) +F_{\mu ,\delta }^{4}\left( \upsilon ,c\right) \\&\quad \le -4\delta \left[ \begin{array}{cc} \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM} &{} \quad 0 \\ 0 &{} \quad \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) I_{NM\times NM} \end{array} \right] +F_{\mu ,\delta }^{3}\left( \upsilon ,c\right) <0 \end{aligned}$$

where

$$\begin{aligned} F_{\mu ,\delta }^{0}\left( \upsilon ,c\right)&= {} -8\delta \begin{bmatrix} cc^{\intercal }&\quad c\upsilon ^{\intercal } \\ \upsilon c^{\intercal }&\quad \upsilon \upsilon ^{\intercal } \end{bmatrix} =-8\delta \begin{bmatrix} c \\ \upsilon \end{bmatrix} \begin{bmatrix} c \\ \upsilon \end{bmatrix} ^{\intercal }\le 0 \\ F_{\mu ,\delta }^{1}\left( \upsilon ,c\right)&= {} -ee^{\intercal }\le 0 \\ F_{\mu ,\delta }^{2}\left( \upsilon ,c\right)&= {} -2\chi \left( \upsilon ^{\intercal }e-\upsilon ^{+}>0\right) ee^{\intercal }\le 0 \\ F_{\mu ,\delta }^{3}\left( \upsilon ,c\right)&= {} \begin{bmatrix} { \mu \xi \tilde{W}\upsilon \upsilon }^{\intercal }{ \tilde{W}} ^{\intercal }&\tau \\ \tau ^{\intercal }&{ \mu \xi }\left[ { \tilde{W}cc}^{\intercal }{ \tilde{W}}^{\intercal }-{\mathrm {diag}}\left\{ \bar{W}c\right\} \right] \end{bmatrix} \end{aligned}$$

where

$$\begin{aligned}\tau&= {} { \mu }\left[ { \tilde{W}+\xi }\left( { \tilde{W} \upsilon c}^{\intercal }{ \tilde{W}}^{\intercal }+\left( { \upsilon }^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V} \left( { \upsilon }\right) \right) \right] \\ \tau ^{\intercal }&= {} { \mu }\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}^{\intercal }+\left( { \upsilon }^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] ^{\intercal } \end{aligned}$$

Notice that

$$\begin{aligned} F_{\mu ,\delta }^{4}\left( \upsilon ,c\right) =-\left[ c^{\intercal }{\varPsi } \upsilon -b_{ineq}\right] _{+}\left[ \begin{array}{cc} {\varPsi } \upsilon \upsilon ^{\intercal }{\varPsi } &{} \left( {\varPsi } +{\varPsi } \upsilon c^{\intercal }{\varPsi } ^{\intercal }\right) \\ \left( {\varPsi } +{\varPsi } c\upsilon ^{\intercal }{\varPsi } ^{\intercal }\right) &{} {\varPsi } cc^{\intercal }{\varPsi } \end{array} \right] \le 0 \end{aligned}$$

since

$$\begin{aligned}&\left( \begin{array}{l} \upsilon \\ c \end{array} \right) ^{\intercal }\left[ \begin{array}{cc} {\varPsi } \upsilon \upsilon ^{\intercal }{\varPsi } &{} \left( {\varPsi } +{\varPsi } \upsilon c^{\intercal }{\varPsi } ^{\intercal }\right) \\ \left( {\varPsi } +{\varPsi } c\upsilon ^{\intercal }{\varPsi } ^{\intercal }\right) &{} {\varPsi } cc^{\intercal }{\varPsi } \end{array} \right] \left( \begin{array}{l} \upsilon \\ c \end{array} \right) \\&\quad = \left( \begin{array}{l} \upsilon \\ c \end{array} \right) ^{\intercal }\left[ \begin{array}{l} {\varPsi } \upsilon \left( \upsilon ^{\intercal }{\varPsi } \upsilon \right) +{\varPsi } c+{\varPsi } \upsilon \left( c^{\intercal }{\varPsi } ^{\intercal }c\right) \\ {\varPsi } \upsilon +{\varPsi } c\left( \upsilon ^{\intercal }{\varPsi } ^{\intercal }\upsilon \right) +{\varPsi } c\left( c^{\intercal }{\varPsi } c\right) \end{array} \right] \\&\quad = \left( \upsilon ^{\intercal }{\varPsi } \upsilon \right) ^{2}+\left( \upsilon ^{\intercal }{\varPsi } c+c^{\intercal }{\varPsi } \upsilon \right) +2\left( \upsilon ^{\intercal }{\varPsi } \upsilon \right) \left( c^{\intercal }{\varPsi } ^{\intercal }c\right) +\left( c^{\intercal }{\varPsi } c\right) ^{2}\\&\quad = \left( \upsilon ^{\intercal }{\varPsi } \upsilon +c^{\intercal }{\varPsi } c\right) ^{2}+\upsilon ^{\intercal }{\varPsi } \upsilon +c^{\intercal }{\varPsi } ^{\intercal }c\ge 0 \end{aligned}$$

This implies

$$\begin{aligned} 4\delta \left( \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}\right) \left[ \begin{array}{cc} I_{NM\times NM} &{} 0 \\ 0 &{} I_{NM\times NM} \end{array} \right] >F_{\mu ,\delta }^{3}\left( \upsilon ,c\right) \end{aligned}$$

and

$$\begin{aligned}&\delta \left[ \begin{array}{cc} I_{NM\times NM} &{} 0 \\ 0 &{} I_{NM\times NM} \end{array} \right] >\dfrac{1}{4}\dfrac{F_{\mu ,\delta }^{3}\left( \upsilon ,c\right) }{ \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\\&\quad = \dfrac{\mu }{4} \begin{bmatrix} \xi \tilde{W}\dfrac{\upsilon \upsilon ^{\intercal }}{ \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\tilde{W}^{\intercal }&\phi \\ \phi ^{\intercal }&\dfrac{\xi }{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\left[ \tilde{W}cc^{\intercal }\tilde{W}-{\mathrm {diag}}\left\{ \bar{W}c\right\} \right] \end{bmatrix} \end{aligned}$$

where

$$\begin{aligned}\phi &=\dfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c }^{\intercal }{ \tilde{W}}+\left( { \upsilon }^{\intercal } { \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] }{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}} \\ \phi ^{\intercal }&=\dfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}+\left( { \upsilon } ^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] ^{\intercal }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}} \end{aligned}$$

or equivalently,

$$\begin{aligned} \begin{bmatrix} \delta I_{NM\times NM}-\dfrac{{ \mu }}{4}{ \xi \tilde{W}}\tfrac{ { \upsilon \upsilon }^{\intercal }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}{ \tilde{W}}^{\intercal }&- \dfrac{{ \mu }}{4}\tfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}+\left( { \upsilon } ^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] }{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}} \\ -\dfrac{{ \mu }}{4}\tfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}+\left( { \upsilon } ^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] ^{\intercal }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}&\delta I_{NM\times NM}-\dfrac{{ \mu }}{4}\tfrac{{ \xi }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\left[ { \tilde{W}cc} ^{\intercal }{ \tilde{W}}-{\mathrm {diag}}\left\{ \bar{W}c\right\} \right] \end{bmatrix} >0 \end{aligned}$$

for all \(\upsilon \in {\varUpsilon } _{adm}\) and \(c\in \bar{C}_{adm}\). By the Schur complement, a necessary and sufficient condition for this is that

$$\begin{aligned} A&:= {} \delta I_{NM\times NM}-\dfrac{{ \mu }}{4}{ \xi \tilde{W}} \tfrac{{ \upsilon \upsilon }^{\intercal }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}{ \tilde{W}}^{\intercal }\\&>\dfrac{{ \mu }}{4}\tfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}+\left( { \upsilon } ^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] ^{\intercal }}{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\\&\times \left[ \delta I_{NM\times NM}-\dfrac{{ \mu }}{4}\tfrac{{ \xi }}{ \left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}\left[ { \tilde{W}cc}^{\intercal }{ \tilde{W}}-{\mathrm {diag}}\left\{ \bar{W}c\right\} \right] \right] ^{-1}\\&\times \dfrac{{ \mu }}{4}\dfrac{\left[ { \tilde{W}+\xi }\left( { \tilde{W}\upsilon c}^{\intercal }{ \tilde{W}}+\left( { \upsilon } ^{\intercal }{ \tilde{W}c}\right) \tilde{W}-{ \bar{W}V}\left( { \upsilon }\right) \right) \right] }{\left\| c\right\| ^{2}+\left\| \upsilon \right\| ^{2}}:=B \end{aligned}$$
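The Schur complement step can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it uses the standard criterion that a symmetric block matrix \(\begin{bmatrix} A & C \\ C^{\intercal } & D \end{bmatrix}\) is positive definite if and only if \(D>0\) and \(A-CD^{-1}C^{\intercal }>0\), and compares it with a direct eigenvalue test. The matrices `P`, `Q`, `R` are random stand-ins for the \(\tilde{W}\)-dependent blocks, and the helper names are hypothetical.

```python
import numpy as np

def is_pd(M):
    """Return True if the symmetric matrix M is positive definite."""
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

def schur_pd(A, C, D):
    """Test positive definiteness of [[A, C], [C^T, D]] via the Schur complement.

    The block matrix is positive definite iff D > 0 and A - C D^{-1} C^T > 0.
    """
    if not is_pd(D):
        return False
    S = A - C @ np.linalg.solve(D, C.T)
    return is_pd((S + S.T) / 2)  # symmetrize to remove round-off asymmetry

# Random stand-ins for the blocks (assumptions, not values from the paper).
rng = np.random.default_rng(1)
n = 4
G = rng.standard_normal((n, n))
P = G @ G.T                      # positive semidefinite stand-in, upper-left
G = rng.standard_normal((n, n))
Q = G @ G.T                      # positive semidefinite stand-in, lower-right
R = rng.standard_normal((n, n))  # stand-in for the off-diagonal block

results = []
for delta in (0.1, 1.0, 10.0, 1000.0):
    A = delta * np.eye(n) - P
    D = delta * np.eye(n) - Q
    C = -R
    full = np.block([[A, C], [C.T, D]])
    results.append((delta, schur_pd(A, C, D), is_pd(full)))

for delta, via_schur, direct in results:
    print(delta, via_schur, direct)
```

The two tests agree for every value of `delta`, and a sufficiently large `delta` makes the block matrix positive definite, which is the mechanism the proof relies on when it bounds \(\delta\) from below.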

The last matrix inequality holds for all \(\upsilon \in {\varUpsilon } _{adm}\) and \(c\in \bar{C}_{adm}\) if (here we use \(A\ge A^{\prime }>B^{\prime }\ge B\))

$$\begin{aligned} \delta -\dfrac{\mu }{4}\xi \underset{s}{\max }\left( \tilde{W}_{s}^{2}\right)&\ge {} 0 \\ \delta -\dfrac{\mu }{4}\xi \underset{s}{\max }\left( \tilde{W}_{s}^{2}\right)&> {} \dfrac{\mu }{4}\left\| \dfrac{\tilde{W}}{\varepsilon ^{2}\left( NM\right) }+\xi \tilde{W}^{2}\right\| \end{aligned}$$

being fulfilled if

$$\begin{aligned} \delta >{ \mu }\left\| { \tilde{W}}\right\| \left( \frac{ { \xi }}{2}\left\| { \tilde{W}}\right\| +\dfrac{1}{ 4\varepsilon ^{2}\left( NM\right) }\right) \end{aligned}$$
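To see how the final bound on \(\delta\) follows from the preceding pair of inequalities, one can evaluate both sides for a concrete matrix. The sketch below makes several assumptions that the source does not state: the norm is taken to be the spectral norm, \(\tilde{W}_{s}\) is read as the entries of \(\tilde{W}\), \(\tilde{W}^{2}\) as the matrix product, and \(\varepsilon ^{2}\left( NM\right)\) as the product \(\varepsilon ^{2}\cdot NM\). The function names and the numerical values are hypothetical.

```python
import numpy as np

def delta_lower_bound(W, mu, xi, eps, N, M):
    """Right-hand side of delta > mu ||W|| (xi/2 ||W|| + 1 / (4 eps^2 N M))."""
    w = np.linalg.norm(W, 2)  # spectral norm (assumed choice of norm)
    return mu * w * (0.5 * xi * w + 1.0 / (4.0 * eps**2 * N * M))

def delta_requirement(W, mu, xi, eps, N, M):
    """Smallest delta permitted by the second inequality of the preceding pair.

    W_s is read as the entries of W and W^2 as the matrix product (assumptions).
    """
    max_sq = np.max(W**2)
    rhs = 0.25 * mu * np.linalg.norm(W / (eps**2 * N * M) + xi * (W @ W), 2)
    return 0.25 * mu * xi * max_sq + rhs

# Illustrative values only, not taken from the paper's numerical example.
N, M = 2, 2
W = np.diag([1.0, 2.0, 0.5, 1.5])
mu, xi, eps = 0.1, 1.0, 0.5

bound = delta_lower_bound(W, mu, xi, eps, N, M)
requirement = delta_requirement(W, mu, xi, eps, N, M)
print("closed-form bound:", bound)
print("requirement from the inequality pair:", requirement)
```

Under these assumptions the closed-form bound is never smaller than the requirement, because \(\max _{s}\tilde{W}_{s}^{2}\le \left\| \tilde{W}\right\| ^{2}\) and the triangle inequality gives \(\left\| \tilde{W}/(\varepsilon ^{2}NM)+\xi \tilde{W}^{2}\right\| \le \left\| \tilde{W}\right\| /(\varepsilon ^{2}NM)+\xi \left\| \tilde{W}\right\| ^{2}\). For a diagonal matrix with non-negative entries, as in this example, the two values coincide.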

So, \(H<0\), which means that the penalty function (14) is strongly concave for all \(\upsilon \in {\varUpsilon } _{adm}\) and \(c\in \bar{C}_{adm}\) and, hence, has a unique maximal point, denoted below by \(\upsilon ^{*}\left( \mu ,\delta \right)\), \(c^{*}\left( \mu ,\delta \right)\). \(\square\)

Appendix B: Proof of Theorem 2

Proof

Before we prove Theorem 2, let us introduce the following notation

$$\begin{aligned} c={\mathrm {col}}\left[ c_{i|k}\right] ,\quad \upsilon ={\mathrm {col}}\left[ \upsilon _{i|k}\right] , \quad \omega ={\mathrm {col}}\left[ \upsilon _{i|k}\eta _{i|k}\right] ,\quad \eta ={\mathrm {col}}\left[ \eta _{i|k}\right] , \end{aligned}$$
$$\begin{aligned} A_{eq}c&= {} \left( \begin{array}{l} \left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\pi _{j|ik}c_{i|k}-\sum \limits _{k=1}^{M}c_{j|k}\right] _{j=1,N} \\ {\mathbf {e}}_{NM}^{\intercal } \end{array} \right) =b_{eq} \\ A_{ineq}&:= {} \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\upsilon _{i|k}c_{i|k}\eta _{i|k}=\omega ^{\intercal }c \\ A_{ineq1}&:= {} \frac{\partial }{\partial c}A_{ineq}=\omega \\ A_{ineq2}&:= {} \frac{\partial }{\partial \upsilon }A_{ineq}={\mathrm {col}} \left[ c_{i|k}\eta _{i|k}\right] \end{aligned}$$

so that

$$\begin{aligned}&\dfrac{\partial }{\partial \upsilon }\left( \frac{1}{2}\left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\upsilon _{i|k}c_{i|k}\eta _{i|k} -b_{ineq}\right] _{+}^{2}\right) \\&\quad =\left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\upsilon _{i|k}\eta _{i|k}c_{i|k}-b_{ineq}\right] _{+}{\mathrm {col}}\left[ c_{i|k}\eta _{i|k}\right] \\&\quad = A_{ineq2}\left[ A_{ineq2}^{\intercal }\upsilon _{n}-b_{ineq}\right] _{+}=0 \\&\dfrac{\partial }{\partial c}\left( \frac{1}{2}\sum \limits _{j=1}^{N}\left( \left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\pi _{j|ik}c_{i|k}-\sum \limits _{k=1}^{M}c_{j|k}\right] -\left( b_{eq}\right) _{j}\right) ^{2}\right) ={\mathrm {col}}\left[ Q_{\alpha |\beta } \right] \\&Q_{\alpha |\beta }:=\sum \limits _{j=1}^{N}\left[ \left( \left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}\pi _{j|ik}c_{i|k}-\sum \limits _{k=1}^{M}c_{j|k}\right] -\left( b_{eq}\right) _{j}\right) \left[ \pi _{j|\alpha \beta }-\chi _{j,\alpha } \right] \right] \end{aligned}$$

where \(\chi _{j,\alpha }=1\) if \(\alpha =j\) and 0 otherwise

$$\begin{aligned} \dfrac{\partial }{\partial c}\left( \frac{1}{2}\left( \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}c_{i|k}-1\right) ^{2}\right)&= {} \left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}c_{i|k}-1\right] \mathbf {e}, \quad \mathbf {e}:=(1,\ldots ,1)^{\intercal }\in {\mathbb {R}}^{NM} \\ \frac{\partial }{\partial c}\left( \frac{1}{2}\left[ \omega ^{\intercal }c-b_{ineq}\right] _{+}^{2}\right)&= {} \left[ \omega ^{\intercal }c-b_{ineq} \right] _{+}\omega =A_{ineq1}\left[ A_{ineq1}^{\intercal }c-b_{ineq}\right] _{+} \end{aligned}$$
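The gradient of the one-sided quadratic penalty on the inequality constraint can be verified in the same way. This is a minimal sketch with made-up data: `omega` plays the role of \(\omega ={\mathrm {col}}\left[ \upsilon _{i|k}\eta _{i|k}\right]\) and `b` the role of \(b_{ineq}\); the vectors and helper names are assumptions for illustration only.

```python
import numpy as np

def penalty(c, omega, b):
    """One-sided quadratic penalty 0.5 * max(omega^T c - b, 0)^2."""
    return 0.5 * max(omega @ c - b, 0.0) ** 2

def penalty_grad(c, omega, b):
    """Closed-form gradient [omega^T c - b]_+ * omega."""
    return max(omega @ c - b, 0.0) * omega

def fd_grad(f, x, h=1e-6):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Illustrative data (not from the paper).
omega = np.array([0.5, 1.0, 2.0])
b = 1.0
c_violating = np.array([1.0, 1.0, 1.0])  # omega^T c = 3.5 > b, penalty active
c_feasible = np.array([0.1, 0.1, 0.1])   # omega^T c = 0.35 < b, penalty inactive

err_active = np.max(np.abs(
    penalty_grad(c_violating, omega, b)
    - fd_grad(lambda x: penalty(x, omega, b), c_violating)
))
grad_inactive = penalty_grad(c_feasible, omega, b)
print("max error when the constraint is violated:", err_active)
print("gradient when the constraint holds:", grad_inactive)
```

The penalty contributes to the gradient only where the constraint is violated, which is why the indicator \(\chi \left( \cdot >0\right)\) appears in the second derivatives of Appendix A.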

By the strict convexity property (15) for any \(y:=\left( c^{\intercal },\upsilon ^{\intercal }\right) ^{\intercal }\) \((\upsilon \in {\varUpsilon } _{adm}\) and \(c\in \bar{C}_{adm})\) and the extremal vector \(y_{n}^{*}\) we have

$$\begin{aligned} 0&\ge {} \left( y_{n}^{*}-y\right) ^{\intercal }\dfrac{\partial }{\partial y }{\varPhi } _{\mu _{n},\delta _{n}}\left( y_{n}^{*}\right) \nonumber \\&= {} \left( \upsilon _{n}^{*}-\upsilon \right) ^{\intercal }\dfrac{\partial }{ \partial \upsilon }{\varPhi } _{\mu _{n},\delta _{n}}\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\left( c_{n}^{*}-c\right) ^{\intercal }\dfrac{ \partial }{\partial c}{\varPhi } _{\mu _{n},\delta _{n}}\left( \upsilon _{n}^{*},c_{n}^{*}\right) \nonumber \\&= {} \left( \upsilon _{n}^{*}-\upsilon \right) ^{\intercal }\left( \mu _{n} \dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +A_{ineq2}\left[ A_{ineq2}^{\intercal }\upsilon _{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\upsilon _{n}^{*}\right) \nonumber \\&\quad +\left( c_{n}^{*}-c\right) ^{\intercal }\left( \mu _{n}\dfrac{ \partial }{\partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +A_{eq}\left[ A_{eq}^{\intercal }c_{n}^{*}-b_{eq}\right] \right. \nonumber \\&\quad + \left. \left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}c_{n,i|k}^{*}-1 \right] \mathbf {e}+A_{ineq1}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}c_{n}^{*}\right) \end{aligned}$$
(40)

Selecting in (40) \(c:=c^{*}\in \bar{C}_{adm}^{*}\) (\(c^{*}\) is one of the admissible portfolio solutions such that \(A_{eq}c^{*}=b_{eq}\) and \(A_{ineq1}c^{*}-b_{ineq}=0\)), and \(\upsilon =\upsilon ^{*}\) (satisfying \(A_{ineq2}\upsilon ^{*}-b_{ineq}\le 0\)), we have that

$$\begin{aligned} 0& \ge {} \left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\left( \mu _{n}\dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +A_{ineq2}\left[ A_{ineq2}^{\intercal }\upsilon _{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\upsilon _{n}^{*}\right) \\&\quad + \left( c_{n}^{*}-c^{*}\right) ^{\intercal }\left( \mu _{n}\dfrac{ \partial }{\partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +A_{eq}\left[ A_{eq}^{\intercal }c_{n}^{*}-b_{eq}\right] +\left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}c_{n,i|k}^{*}-1\right] \mathbf {e}\right. \\&\quad +\left. A_{ineq1}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}c_{n}^{*}\right) \\&= {} \mu _{n}\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal } \dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\left( A_{ineq2}\left[ A_{ineq2}^{\intercal }\upsilon _{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\upsilon _{n}^{*}\right) \\&\quad +\,\mu _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }\dfrac{\partial }{ \partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\underset{ \ge 0}{\underbrace{\left( c_{n}^{*}-c^{*}\right) ^{\intercal }A_{eq}A_{eq}^{\intercal }\left( c_{n}^{*}-c^{*}\right) }}\\&\quad +\underset{\ge 0}{\underbrace{\left[ \sum \limits _{i=1}^{N}\sum \limits _{k=1}^{M}c_{n,i|k}^{*}-1\right] ^{2}}}+\left( c_{n}^{*}-c^{*}\right) ^{\intercal }A_{ineq1}^{\intercal }\left[ A_{ineq1}^{\intercal }\left( c_{n}^{*}-c^{*}\right) \right] _{+}\\&\quad +\left( c_{n}^{*}-c^{*}\right) ^{\intercal }A_{ineq1}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }c_{n}^{*} \end{aligned}$$

implying

$$\begin{aligned} 0\ge & {} \mu _{n}\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\left( A_{ineq2}\left[ A_{ineq2}^{\intercal }\upsilon _{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\upsilon _{n}^{*}\right) \nonumber \\&+\,\mu _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }\dfrac{\partial }{ \partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\left( c_{n}^{*}-c^{*}\right) ^{\intercal }A_{ineq1}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}+\delta _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }c_{n}^{*} \end{aligned}$$
(41)

Notice that

$$\begin{aligned}&\left( c_{n}^{*}-c^{*}\right) ^{\intercal }A_{ineq1}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}\\&\quad =\left( A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}-A_{ineq1}^{\intercal }c^{*}+b_{ineq}\right) \left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}\\&\quad = \underset{\ge 0}{\underbrace{\left( A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right) \left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq}\right] _{+}}}\\&\qquad -\, \underset{\le 0}{\underbrace{\left( A_{ineq1}^{\intercal }c^{*}-b_{ineq}\right) }}\left[ A_{ineq1}^{\intercal }c_{n}^{*}-b_{ineq} \right] _{+}\ge 0 \end{aligned}$$

Analogously,

$$\begin{aligned} \left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }A_{ineq2} \left[ A_{ineq2}^{\intercal }\upsilon _{n}^{*}-b_{ineq}\right] _{+}\ge 0 \end{aligned}$$

Using both of these inequalities in (41) yields

$$\begin{aligned}&0\ge \mu _{n}\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\delta _{n}\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon _{n}^{*} \\&\quad +\,\mu _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }\dfrac{\partial }{ \partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\delta _{n}\left( c_{n}^{*}-c^{*}\right) ^{\intercal }c_{n}^{*} \end{aligned}$$

Dividing both sides of this inequality by \(\delta _{n}\) we get

$$\begin{aligned} 0\ge & {} \dfrac{\mu _{n}}{\delta _{n}}\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\dfrac{\partial }{\partial \upsilon }f\left( \upsilon _{n}^{*},c_{n}^{*}\right) +\dfrac{\mu _{n}}{\delta _{n}} \left( c_{n}^{*}-c^{*}\right) ^{\intercal }\dfrac{\partial }{ \partial c}f\left( \upsilon _{n}^{*},c_{n}^{*}\right) \nonumber \\&+\,\left( \upsilon _{n}^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon _{n}^{*}+\left( c_{n}^{*}-c^{*}\right) ^{\intercal }c_{n}^{*} \end{aligned}$$
(42)

Notice that the sequences \(\left\{ c_{n}^{*}\right\}\), \(\left\{ \upsilon _{n}^{*}\right\}\) are bounded on \(\bar{C}_{adm}\otimes {\varUpsilon } _{adm}\). Since by supposition (19) \(\dfrac{\mu _{n}}{\delta _{n}}\underset{n\rightarrow \infty }{\rightarrow } 0\), from (42) we may conclude that any partial limit points \(c_{\infty }^{*}\in \bar{C}_{adm}^{*}\) and \(\upsilon _{\infty }^{*}\in {\varUpsilon } _{adm}^{*}\) (which exist for any bounded sequence by the Bolzano–Weierstrass theorem and, obviously, may not be unique) satisfy

$$\begin{aligned} 0\ge \left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c_{\infty }^{*}+\left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon _{\infty }^{*} \end{aligned}$$
(43)

By the identities

$$\begin{aligned} \left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c_{\infty }^{*}&= {} \left\| c_{\infty }^{*}-c^{*}\right\| ^{2}+\left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c^{*} \\ \left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon _{\infty }^{*}&= {} \left\| \upsilon _{\infty }^{*}-\upsilon ^{*}\right\| ^{2}+\left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon ^{*} \end{aligned}$$

the inequality (43) leads to

$$\begin{aligned} 0\ge & {} \left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c_{\infty }^{*}+\left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon _{\infty }^{*}=\left\| c_{\infty }^{*}-c^{*}\right\| ^{2}+\left\| \upsilon _{\infty }^{*}-\upsilon ^{*}\right\| ^{2}\\&+\left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c^{*}+\left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon ^{*}\\\ge & {} \left( c_{\infty }^{*}-c^{*}\right) ^{\intercal }c^{*}+\left( \upsilon _{\infty }^{*}-\upsilon ^{*}\right) ^{\intercal }\upsilon ^{*}=\left( y_{\infty }^{*}-y^{*}\right) y^{*} \end{aligned}$$

The inequality \(0\ge \left( y_{\infty }^{*}-y^{*}\right) y^{*}\) exactly represents the necessary and sufficient condition that the point \(y^{*}\) is the minimum point of the function \(\left\| y\right\| ^{2}\) on the set \({\mathcal {Y}}_{adm}=\bar{C}_{adm}^{*}\otimes {\varUpsilon } _{adm}^{*}\) corresponding to \(\delta =0\). Indeed, the function \(\left\| y\right\| ^{2}\) attains its minimum on the set \({\mathcal {Y}}_{adm}\) at the point \(y^{*}\), which satisfies the necessary and sufficient condition

$$\begin{aligned} 0\ge \left( y-y^{*}\right) \frac{\partial }{\partial y}\left\| y^{*}\right\| ^{2}=2\left( y-y^{*}\right) y^{*} \end{aligned}$$

valid for any admissible \(y\in {\mathcal {Y}}_{adm}\). \(\square\)
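The selection of the minimal-norm solution can be illustrated on a much simpler problem than the one treated in the proof. The following is a minimal sketch, assuming an unconstrained least-squares functional with a non-unique minimizer in place of the penalized portfolio functional; the matrix `A`, the vector `b` and the helper `tikhonov_solution` are hypothetical and do not appear in the paper.

```python
import numpy as np

def tikhonov_solution(A, b, delta):
    """Minimizer of 0.5*||A x - b||^2 + 0.5*delta*||x||^2 (closed form).

    Hypothetical helper: an unconstrained stand-in for the regularized
    functional, used only to illustrate the role of the Tikhonov term.
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)

# Toy ill-posed problem (assumption): x1 + x2 = 2 has infinitely many solutions.
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

# Among all solutions, the pseudoinverse returns the one of minimal Euclidean norm.
x_min_norm = np.linalg.pinv(A) @ b

# As delta decreases, the regularized minimizer approaches the minimal-norm solution.
deltas = [1.0, 1e-2, 1e-4, 1e-6]
errors = [np.linalg.norm(tikhonov_solution(A, b, d) - x_min_norm) for d in deltas]
for d, err in zip(deltas, errors):
    print(f"delta = {d:.0e}   distance to minimal-norm solution = {err:.3e}")
```

For every positive `delta` the regularized problem has a unique minimizer, and the distance to the minimal-norm solution shrinks as `delta` decreases, which is the behaviour the theorem establishes in the constrained setting.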

Appendix C: Proof of Lemma 1

Proof

From (41) we have

$$\begin{aligned} \left\| A_{ineq1}c_{n}^{*}-b_{ineq}\right\| \le K_{1}\sqrt{\delta _{n}},\quad K_{1}={\mathrm {const}}>0 \end{aligned}$$

implying

$$\begin{aligned} A_{ineq1}c_{n}^{*}-b_{ineq}\le K_{1}\sqrt{\delta _{n}}e-u_{n}^{*}\le K_{1}\sqrt{\delta _{n}}e,\quad \left\| e\right\| =1 \end{aligned}$$

where the vector inequality is treated component-wise. We also have

$$\begin{aligned} \left\| c_{n}^{*}-\hat{c}_{n}\right\| ^{2}\le \underset{ A_{ineq1}{ c}-{ b}_{ineq}{ \le K}_{1}\sqrt{\delta _{n}} { e},\quad { c\in \bar{C}}_{{ adm}}}{\max }\quad \underset{y\in \bar{C}_{adm}}{\min }\left\| c-y\right\| ^{2}:=d\left( \delta _{n}\right) \end{aligned}$$

We introduce the new variable

$$\begin{aligned} \tilde{c}:=\left( 1-\nu _{n}\right) c+\nu _{n}{\mathring{c}}\in \bar{C}_{adm} \end{aligned}$$
(44)

where

$$\begin{aligned} 0<\nu _{n}:=\frac{K_{1}\sqrt{\delta _{n}}}{K_{1}\sqrt{\delta _{n}}+\underset{ j=1,\ldots ,M_{1}}{\min }\left| \left( A_{ineq1}{\mathring{c}}-b_{ineq}\right) _{j}\right| }<1 \end{aligned}$$

and \({\mathring{c}}\) satisfies the Slater condition (18). For the new variable \(\tilde{c}\), with \(c=\dfrac{\tilde{c}-\nu _{n}{\mathring{c}}}{1-\nu _{n}}\), we have

$$\begin{aligned} A_{ineq1}\tilde{c}-b_{ineq}&= {} \left( 1-\nu _{n}\right) A_{ineq1}c+\nu _{n}A_{ineq1}{\mathring{c}}-b_{ineq}\\&= {} \left( 1-\nu _{n}\right) \left( A_{ineq1}c-b_{ineq}\right) +\left( 1-\nu _{n}\right) b_{ineq}+\nu _{n}\left( A_{ineq1}{\mathring{c}}-b_{ineq}\right) \\ &\quad +\,\nu _{n}b_{ineq}-b_{ineq}=\left( 1-\nu _{n}\right) \left( A_{ineq1}c-b_{ineq}\right) +\nu _{n}\left( A_{ineq1}{\mathring{c}} -b_{ineq}\right) \\ &\le {} \left( 1-\nu _{n}\right) K_{1}\sqrt{\delta _{n}}e+\dfrac{K_{1}\sqrt{\delta _{n}}}{K_{1}\sqrt{\delta _{n}}+\underset{j=1,\ldots ,M_{1}}{\min }\left| \left( A_{ineq1}{\mathring{c}}-b_{ineq}\right) _{j}\right| }\left( A_{ineq1}{\mathring{c}}-b_{ineq}\right) \\ &= {} \dfrac{K_{1}\sqrt{\delta _{n}}}{K_{1}\sqrt{\delta _{n}}+\underset{ j=1,\ldots ,M_{1}}{\min }\left| \left( A_{ineq1}{\mathring{c}}-b_{ineq}\right) _{j}\right| } \\&\quad \times \left( \underset{j=1,\ldots ,M_{1}}{\min }\left| \left( A_{ineq1}{\mathring{c}} -b_{ineq}\right) _{j}\right| e+\left( A_{ineq1}{\mathring{c}} -b_{ineq}\right) \right) \le 0 \end{aligned}$$

Therefore,

$$\begin{aligned} d\left( \delta _{n}\right)&= {} \underset{A_{ineq1}c-b_{ineq}\le K_{1}\sqrt{\delta _{n}}e,\ c\in \bar{C}_{adm}}{\max }\quad \underset{y\in \bar{C}_{adm}}{\min }\left\| c-y\right\| ^{2}\\&\le {} \underset{A_{ineq1}\tilde{c}-b_{ineq}\le 0,\ \tilde{c}\in \bar{C}_{adm}}{\max }\left\| \dfrac{\tilde{c}-\nu _{n}{\mathring{c}}}{1-\nu _{n}}-\tilde{c}\right\| ^{2}\\&= {} \dfrac{\nu _{n}^{2}}{\left( 1-\nu _{n}\right) ^{2}}\underset{A_{ineq1}\tilde{c}-b_{ineq}\le 0,\ \tilde{c}\in \bar{C}_{adm}}{\max }\left\| \tilde{c}-{\mathring{c}}\right\| ^{2}\le K_{2}\delta _{n},\quad 0<K_{2}<\infty \end{aligned}$$

Thus, \(\left\| c_{n}^{*}-\hat{c}_{n}\right\| \le \sqrt{d\left( \delta _{n}\right) }\le \sqrt{K_{2}}\sqrt{\delta _{n}}\), which proves (23). \(\square\)
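The convex-combination step (44) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the matrix `A`, the vector `b`, the Slater point `c_slater`, the tolerance `eps` (standing in for \(K_{1}\sqrt{\delta _{n}}\)) and the helper `restore_feasibility` are all hypothetical, and `e` is taken as the vector of ones so that the bound is applied component-wise.

```python
import numpy as np

def restore_feasibility(c, c_slater, A, b, eps):
    """Pull a point with A c - b <= eps (component-wise) back into {A c - b <= 0}.

    Mirrors the construction c_tilde = (1 - nu) c + nu c_slater with
    nu = eps / (eps + min_j |(A c_slater - b)_j|). Names are hypothetical.
    """
    slack = np.abs(A @ c_slater - b)           # strictly positive at a Slater point
    nu = eps / (eps + slack.min())
    return (1.0 - nu) * c + nu * c_slater, nu

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))
c_slater = rng.normal(size=3)
b = A @ c_slater + 1.0 + rng.random(4)         # A c_slater - b <= -1 < 0 (Slater)

# Build a slightly infeasible point whose largest violation equals eps exactly.
eps = 0.05
s = b - A @ c_slater                           # slacks at the Slater point
direction = A[0]                               # guarantees (A @ direction)[0] > 0
Ad = A @ direction
pos = Ad > 0
t = np.min((eps + s[pos]) / Ad[pos])
c = c_slater + t * direction

c_tilde, nu = restore_feasibility(c, c_slater, A, b, eps)
print("largest violation before:", np.max(A @ c - b))
print("largest violation after: ", np.max(A @ c_tilde - b))
print("nu =", nu, " displacement =", np.linalg.norm(c - c_tilde))
```

The displacement equals `nu * ||c - c_slater||`, and `nu` is proportional to `eps` for small `eps`, which is the mechanism behind the \(\sqrt{\delta _{n}}\) rate in the lemma.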

Appendix D: Proof of Theorem 3

Proof

(Theorem 3 on the convergence of the projection gradient method)

In view of (21) and using the projection property

$$\begin{aligned} \left\| \Pr \left\{ z_{n-1}-\gamma _{n}\dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \right\} -z_{n}^{*}\right\| \le \left\| z_{n-1}-\gamma _{n}\dfrac{\partial }{\partial z} {\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) -z_{n}^{*}\right\| \end{aligned}$$

it follows that

$$\begin{aligned} Z_{n}&= {} \left\| z_{n}-z_{n}^{*}\right\| ^{2}\nonumber \\ &\le {} \left\| \left( z_{n-1}-z_{n-1}^{*}\right) -\gamma _{n}\dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) +\left( z_{n-1}^{*}-z_{n}^{*}\right) \right\| ^{2}\nonumber \\&= {} Z_{n-1}+\gamma _{n}^{2}\left\| \dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \right\| ^{2}+\left\| \left( z_{n-1}^{*}-z_{n}^{*}\right) \right\| ^{2}\nonumber \\&\quad -\,2\gamma _{n}\left( z_{n-1}-z_{n-1}^{*}\right) ^{\intercal }\dfrac{ \partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \nonumber \\&\quad +\,2\left( z_{n-1}-z_{n-1}^{*}\right) ^{\intercal }\left( z_{n-1}^{*}-z_{n}^{*}\right) -2\gamma _{n}\left( z_{n-1}^{*}-z_{n}^{*}\right) ^{\intercal }\dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \end{aligned}$$
(45)
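The projection property invoked above is the non-expansiveness of the Euclidean projection onto a closed convex set: projecting a point cannot move it farther from any point that already lies in the set. A minimal numerical check follows, assuming a box \([0,1]^{5}\) as a stand-in for the admissible set (the paper's set is different); `project_box` is a hypothetical helper.

```python
import numpy as np

def project_box(z, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (stand-in for the admissible set)."""
    return np.clip(z, lo, hi)

rng = np.random.default_rng(1)
max_ratio = 0.0
for _ in range(1000):
    u = rng.normal(scale=2.0, size=5)                      # arbitrary point, e.g. after a gradient step
    z_star = project_box(rng.normal(scale=2.0, size=5))    # a point inside the set
    lhs = np.linalg.norm(project_box(u) - z_star)          # distance after projecting u
    rhs = np.linalg.norm(u - z_star)                       # distance before projecting u
    max_ratio = max(max_ratio, lhs / rhs)

print("largest observed ratio ||Pr{u} - z*|| / ||u - z*|| =", max_ratio)
```

The observed ratio never exceeds one, which is what allows the projection to be dropped when bounding \(Z_{n}\).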

By the inequalities [see inequalities (21.17) and (21.36) in Poznyak (2008)] we can conclude that

$$\begin{aligned}&\left\| \dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \right\| ^{2}\\&\quad =\left\| \left[ \dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) -\dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}^{*}\right) \right] +\dfrac{\partial }{ \partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}^{*}\right) \right\| ^{2}\\&\quad \le \left( 1+\vartheta _{n}\right) \left\| \dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) -\dfrac{\partial }{\partial z} {\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}^{*}\right) \right\| ^{2}\\&\qquad + \left( 1+\vartheta _{n}^{-1}\right) \left\| \dfrac{\partial }{\partial z} {\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}^{*}\right) \right\| ^{2}\le \left( 1+\vartheta _{n}\right) L_{\nabla }Z_{n-1}+\left( 1+\vartheta _{n}^{-1}\right) d \end{aligned}$$

where \(\left\| \dfrac{\partial }{\partial z}{\varPhi } _{\mu _{n},\delta _{n}}\left( z_{n-1}^{*}\right) \right\| ^{2}\le d\) and

$$\begin{aligned}&\left( z_{n-1}-z_{n-1}^{*}\right) ^{\intercal }\dfrac{\partial }{\partial z}\Phi _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \ge l_{n}Z_{n-1},\quad l_{n}=\left( \mu _{n}\lambda ^{-}+\delta _{n}\right) \\&\left| \left( z_{n-1}-z_{n-1}^{*}\right) ^{\intercal }\left( z_{n-1}^{*}-z_{n}^{*}\right) \right| \le \left\| z_{n-1}^{*}-z_{n}^{*}\right\| \sqrt{Z_{n-1}} \\&\left| \left( z_{n-1}^{*}-z_{n}^{*}\right) ^{\intercal }\dfrac{\partial }{\partial z}\Phi _{\mu _{n},\delta _{n}}\left( z_{n-1}\right) \right| \\&\quad \overset{\vartheta >0}{\le }\left\| z_{n-1}^{*}-z_{n}^{*}\right\| \sqrt{\left( 1+\vartheta \right) L_{\nabla }Z_{n-1}+\left( 1+\vartheta ^{-1}\right) d}\\&\quad \le \left\| z_{n-1}^{*}-z_{n}^{*}\right\| \left[ \left( 1+\vartheta ^{1/2}\right) \sqrt{L_{\nabla }}\sqrt{Z_{n-1}}+\left( 1+\vartheta ^{-1/2}\right) \sqrt{d}\right] \end{aligned}$$
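The bound on the squared gradient norm rests on an elementary inequality for the squared norm of a sum. For completeness, a standard derivation is sketched here; the vectors \(a\) and \(b\) are generic placeholders, and the argument is not quoted from the cited reference.

```latex
\begin{aligned}
\left\| a+b\right\| ^{2} &= \left\| a\right\| ^{2}+2a^{\intercal }b+\left\| b\right\| ^{2},\\
2a^{\intercal }b &\le 2\left( \sqrt{\vartheta }\left\| a\right\| \right) \left( \frac{\left\| b\right\| }{\sqrt{\vartheta }}\right) \le \vartheta \left\| a\right\| ^{2}+\vartheta ^{-1}\left\| b\right\| ^{2},\\
\left\| a+b\right\| ^{2} &\le \left( 1+\vartheta \right) \left\| a\right\| ^{2}+\left( 1+\vartheta ^{-1}\right) \left\| b\right\| ^{2},\quad \vartheta >0 .
\end{aligned}
```

The second line combines the Cauchy–Schwarz inequality with \(2xy\le x^{2}+y^{2}\). The square-root form used in the last estimate then follows from \(\sqrt{x+y}\le \sqrt{x}+\sqrt{y}\) and \(\sqrt{1+\vartheta }\le 1+\vartheta ^{1/2}\).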

Then, from (45) for \(m=n-1\) we obtain

$$\begin{aligned} Z_{n}&\le {} Z_{n-1}+\gamma _{n}^{2}\left[ \left( 1+\vartheta \right) L_{\nabla }Z_{n-1}+\left( 1+\vartheta ^{-1}\right) d\right] \\&\quad +\, 2\left( K_{1}^{2}\left| \mu _{n}-\mu _{n-1}\right| ^{2}+K_{2}^{2}\left| \delta _{n}-\delta _{n-1}\right| ^{2}\right) -2\gamma _{n}\left( \mu _{n}\lambda ^{-}+\delta _{n}\right) Z_{n-1}\\&\quad +\, 2\left( K_{1}\left| \mu _{n}-\mu _{n-1}\right| +K_{2}\left| \delta _{n}-\delta _{n-1}\right| \right) \sqrt{Z_{n-1}}\\&\quad +\, 2\gamma _{n}\left( K_{1}\left| \mu _{n}-\mu _{n-1}\right| +K_{2}\left| \delta _{n}-\delta _{n-1}\right| \right) \\ \quad &\times \left[ \left( 1+\vartheta ^{1/2}\right) \sqrt{L_{\nabla }}\sqrt{Z_{n-1}} +\left( 1+\vartheta ^{-1/2}\right) \sqrt{d}\right] \end{aligned}$$

or, equivalently,

$$\begin{aligned} Z_{n}\le Z_{n-1}\left( 1-\alpha _{n-1}\right) +\bar{\delta }_{n-1}\sqrt{ Z_{n-1}}+\beta _{n-1} \end{aligned}$$
(46)

where

$$\begin{aligned} \left. \begin{aligned} \alpha _{n-1}&=2\gamma _{n}\left( \mu _{n}\lambda ^{-}+\delta _{n}\right) -\gamma _{n}^{2}\left( 1+\vartheta \right) L_{\nabla } \\&=2\gamma _{n}\left( \mu _{n}\lambda ^{-}+\delta _{n}\right) \left[ 1-\dfrac{ \gamma _{n}\left( 1+\vartheta \right) L_{\nabla }}{2\left( \mu _{n}\lambda ^{-}+\delta _{n}\right) }\right] \\&\ge \gamma _{n}\delta _{n}2\left( 1+o\left( 1\right) \right) \left[ 1-\dfrac{ \gamma _{n}\left( 1+\vartheta \right) L_{\nabla }}{2\delta _{n}\left( o\left( 1\right) +1\right) }\right] \ge K_{\alpha }\gamma _{n}\delta _{n} \\ \bar{\delta }_{n-1}&=2\left( K_{1}\left| \mu _{n}-\mu _{n-1}\right| +K_{2}\left| \delta _{n}-\delta _{n-1}\right| \right) \left[ 1+\gamma _{n}\left( 1+\vartheta ^{1/2}\right) \sqrt{L_{\nabla }}\right] \\&\le K_{\delta }\left( \left| \mu _{n}-\mu _{n-1}\right| +\left| \delta _{n}-\delta _{n-1}\right| \right) \\ \beta _{n-1}&=\gamma _{n}^{2}\left( 1+\vartheta ^{-1}\right) d+2\left( K_{1}^{2}\left| \mu _{n}-\mu _{n-1}\right| ^{2}+K_{2}^{2}\left| \delta _{n}-\delta _{n-1}\right| ^{2}\right) \\&\quad +\,2\gamma _{n}\left( K_{1}\left| \mu _{n}-\mu _{n-1}\right| +K_{2}\left| \delta _{n}-\delta _{n-1}\right| \right) \left( 1+\vartheta ^{-1/2}\right) \sqrt{d} \\&\le \gamma _{n}^{2}K_{\beta ,1}+\gamma _{n}\left( \left| \mu _{n}-\mu _{n-1}\right| +\left| \delta _{n}-\delta _{n-1}\right| \right) K_{\beta ,2} \\&\quad +\left( \left| \mu _{n}-\mu _{n-1}\right| ^{2}+\left| \delta _{n}-\delta _{n-1}\right| ^{2}\right) K_{\beta ,3} \end{aligned} \right\} \end{aligned}$$
(47)

Using the inequality

$$\begin{aligned} Z_{n}^{r}\le \left( 1-r\right) \theta _{n}^{r}+\frac{r}{\theta _{n}^{1-r}} Z_{n},\quad r\in \left( 0,1\right) ,\quad \theta _{n}>0 \end{aligned}$$

for \(r=1/2\) and \(\sqrt{\theta _{n}}=\dfrac{\bar{\delta }_{n-1}}{2\alpha _{n-1}\left( 1-\rho \right) }\), \(\rho \in \left( 0,1\right)\), inequality (46) can be reduced to the following one:

$$\begin{aligned} Z_{n}\le & {} Z_{n-1}\left( 1-\alpha _{n-1}\left[ 1-\dfrac{\bar{\delta }_{n-1}}{ 2\alpha _{n-1}\sqrt{\theta _{n}}}\right] \right) +\left[ \beta _{n-1}+\frac{1 }{2}\bar{\delta }_{n-1}\sqrt{\theta _{n}}\right] \nonumber \\&= {} Z_{n-1}\left( 1-\alpha _{n-1}\rho \right) +\left[ \beta _{n-1}+\dfrac{\bar{ \delta }_{n-1}^{2}}{4\left( 1-\rho \right) \alpha _{n-1}}\right] \end{aligned}$$
(48)
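The passage from (46) to (48) can be spelled out as follows. This is a worked sketch of the case \(r=1/2\) only, using the elementary fact \(\left( \sqrt{Z}-\sqrt{\theta }\right) ^{2}\ge 0\) rather than the general inequality.

```latex
\begin{aligned}
\sqrt{Z_{n-1}} &\le \frac{1}{2}\sqrt{\theta _{n}}+\frac{Z_{n-1}}{2\sqrt{\theta _{n}}}
\quad \text{since}\quad Z_{n-1}+\theta _{n}\ge 2\sqrt{Z_{n-1}\theta _{n}},\\
\bar{\delta }_{n-1}\sqrt{Z_{n-1}} &\le \frac{\bar{\delta }_{n-1}}{2\sqrt{\theta _{n}}}Z_{n-1}+\frac{1}{2}\bar{\delta }_{n-1}\sqrt{\theta _{n}},\\
\sqrt{\theta _{n}}=\frac{\bar{\delta }_{n-1}}{2\alpha _{n-1}\left( 1-\rho \right) }
&\;\Longrightarrow \;
\frac{\bar{\delta }_{n-1}}{2\sqrt{\theta _{n}}}=\alpha _{n-1}\left( 1-\rho \right) ,\qquad
\frac{1}{2}\bar{\delta }_{n-1}\sqrt{\theta _{n}}=\frac{\bar{\delta }_{n-1}^{2}}{4\left( 1-\rho \right) \alpha _{n-1}} .
\end{aligned}
```

Substituting the second line into (46) gives the coefficient \(1-\alpha _{n-1}+\alpha _{n-1}\left( 1-\rho \right) =1-\alpha _{n-1}\rho\) in front of \(Z_{n-1}\), which is the first term of (48).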

By Theorem 16.14 in Poznyak (2008) \(Z_{n}\underset{n\rightarrow \infty }{\rightarrow }0\) if

$$\begin{aligned} \sum \limits _{n=0}^{\infty }\alpha _{n}=\infty ,\quad \dfrac{\beta _{n-1}}{ \alpha _{n-1}}+\dfrac{\bar{\delta }_{n-1}^{2}}{\alpha _{n-1}^{2}}\underset{ n\rightarrow \infty }{\rightarrow }0 \end{aligned}$$

which is equivalent to (25). This completes the proof. \(\square\)
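The behaviour of the recursion (48) can be explored numerically. The sketch below iterates (48) with equality (the worst case) for power-law sequences \(\gamma _{n}=n^{-g}\), \(\delta _{n}=n^{-d}\), \(\mu _{n}=n^{-m}\). These exponents, the unit constants and the function `simulate_recursion` are assumptions made for illustration only; they are chosen so that the two displayed conditions hold and are not the parameter choice of the paper.

```python
import numpy as np

def simulate_recursion(N=20000, g=0.6, d=0.1, m=0.3, rho=0.5, Z0=1.0):
    """Iterate Z_n = Z_{n-1} (1 - rho*alpha) + beta + delta_bar^2 / (4 (1 - rho) alpha).

    Hypothetical setup: all constants K in the bounds (47) are set to 1 and
    the parameter sequences are power laws n**(-g), n**(-d), n**(-m).
    """
    n = np.arange(1, N + 1, dtype=float)
    gamma, delta, mu = n ** -g, n ** -d, n ** -m
    # |mu_n - mu_{n-1}| and |delta_n - delta_{n-1}|; the first difference is set to 0.
    d_mu = np.abs(np.diff(mu, prepend=mu[0]))
    d_delta = np.abs(np.diff(delta, prepend=delta[0]))
    alpha = gamma * delta                       # lower bound on alpha with K_alpha = 1
    delta_bar = d_mu + d_delta                  # upper bound with K_delta = 1
    beta = gamma**2 + gamma * delta_bar + d_mu**2 + d_delta**2
    Z = np.empty(N + 1)
    Z[0] = Z0
    for k in range(N):
        forcing = beta[k] + delta_bar[k] ** 2 / (4.0 * (1.0 - rho) * alpha[k])
        Z[k + 1] = Z[k] * (1.0 - rho * alpha[k]) + forcing
    return Z

Z = simulate_recursion()
for k in (10, 100, 1000, 10000, 20000):
    print(f"n = {k:6d}   Z_n = {Z[k]:.5f}")
```

With these exponents \(\sum \alpha _{n}\) diverges because \(g+d\le 1\), and \(\beta _{n}/\alpha _{n}\) vanishes because \(g>d\), so the simulated \(Z_{n}\) decays after an initial transient. Choosing \(g\le d\) in the same script violates the second condition and is a simple way to see why it is needed.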

Cite this article

Clempner, J.B., Poznyak, A.S. Sparse mean–variance customer Markowitz portfolio optimization for Markov chains: a Tikhonov’s regularization penalty approach. Optim Eng 19, 383–417 (2018). https://doi.org/10.1007/s11081-018-9374-9


  • DOI: https://doi.org/10.1007/s11081-018-9374-9
