Abstract
We consider the stochastic variational inequality problem in which the map is expectation-valued in a component-wise sense. Much of the available convergence theory and rate statements for stochastic approximation schemes are limited to monotone maps. However, non-monotone stochastic variational inequality problems are not uncommon and are seen to arise from product pricing, fractional optimization problems, and subclasses of economic equilibrium problems. Motivated by the need to address a broader class of maps, we make the following contributions: (1) we present an extragradient-based stochastic approximation scheme and prove that the iterates converge to a solution of the original problem under either pseudomonotonicity requirements or a suitably defined acute angle condition. Such statements are shown to be generalizable to the stochastic mirror-prox framework; (2) under strong pseudomonotonicity, we show that the mean-squared error in the solution iterates produced by the extragradient SA scheme converges at the optimal rate of \({{\mathcal {O}}}\left( \frac{1}{{K}}\right) \), statements that were hitherto unavailable in this regime. Notably, we optimize the initial steplength by obtaining an \(\epsilon \)-infimum of a discontinuous nonconvex function. Similar statements are derived for mirror-prox generalizations and can accommodate monotone SVIs under a weak-sharpness requirement. Finally, both the asymptotics and the empirical rates of the schemes are studied on a set of variational problems where it is seen that the theoretically specified initial steplength leads to significant performance benefits.
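As a minimal illustration of the extragradient stochastic approximation scheme discussed above, consider a one-dimensional toy problem. The map, noise model, steplength sequence \(\gamma_k = \gamma_0/k\), and all constants below are illustrative assumptions, not the paper's experimental setup:

```python
import random

# Toy sketch (not the paper's setup): stochastic extragradient on VI(X, F)
# with X = [-1, 1] and F(x) = 2x, which is strongly monotone and hence
# strongly pseudomonotone. Samples of F carry zero-mean Gaussian noise.

def proj(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto X = [lo, hi]."""
    return max(lo, min(hi, x))

random.seed(0)
x, gamma0 = 1.0, 1.0
for k in range(1, 2001):
    gamma = gamma0 / k                                       # diminishing steplength
    # extrapolation step using one sampled map evaluation
    y = proj(x - gamma * (2 * x + random.gauss(0.0, 0.1)))
    # update step using an independent sample at the extrapolated point
    x = proj(x - gamma * (2 * y + random.gauss(0.0, 0.1)))

print(abs(x))  # distance to the unique solution x* = 0
```

The two projected steps per iteration, each with a fresh sample, are the defining feature of the extragradient scheme; a single-step stochastic gradient update would use only the first evaluation.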
Notes
Note that this paper references our conference paper and a preprint of the current paper.
We prefer not to qualify the initial steplength as “optimal” since the error bound in general is a function of \(\gamma _0\) and \(\beta \).
Acknowledgements
The authors are grateful to Dr. Farzad Yousefian for his valuable suggestions on a previous version. We particularly appreciate the comments of the referees and the editor, all of which have led to significant improvements in the manuscript.
This research has been partially supported by NSF Awards 1246887 (CAREER), 1538193 and 1408366.
Appendix
Lemma 7
Consider the function \(t_0(\gamma _0)\) defined as
$$\begin{aligned} t_0(\gamma _0) \triangleq \frac{\gamma _0^2}{2\sigma \gamma _0 - \lfloor 2\sigma \gamma _0 \rfloor }, \end{aligned}$$where \(\sigma \) denotes the strong pseudomonotonicity constant. Then the following hold:
- (a)
A minimizer of \(t_0(\gamma _0)\) cannot exist in an interval where \(2\sigma \gamma _0 \in (n,n+1)\) for an integer \(n > 1\).
- (b)
The infimum of \(t_0(\gamma _0)\) is given by the following:
$$\begin{aligned} {{ f^* \triangleq }} \inf _{\gamma _0} \left\{ t_0(\gamma _0) \mid 1< \ 2\sigma \gamma _0 \ < 2 \right\} = \frac{1}{\sigma ^2}. \end{aligned}$$ - (c)
Define an \(\epsilon \)-infimum of \(t_0(\gamma _0)\), denoted by \(f^*_{\epsilon }\), as any value satisfying \(f^*_{\epsilon } \le f^* + \beta \epsilon \) for some \(\beta > 0\). Then, for \(\epsilon \in (0,1/2)\), such an \(f^*_{\epsilon }\) is achieved by \(\gamma _0 = \frac{2-\epsilon }{2\sigma }\) and satisfies \({{f_{\epsilon }^*}} \le f^* + \frac{2\epsilon }{\sigma ^2}\).
Proof
-
(a)
We begin by observing that if \(2\sigma \gamma _0 \in \mathbb {Z}_+\), then \(t_0(\gamma _0) = +\infty .\) Consequently, any minimizer of \(t_0(\gamma _0)\) must satisfy \(2\gamma _0 \sigma \not \in \mathbb {Z}_+\). We proceed to show that \(t_0(\gamma _0)\) does not admit a minimizer in \((n,n+1)\) where \(n > 1\). Assume the contrary, i.e., there exists a minimizer \(\gamma _0^*\) satisfying \(2\gamma _0^*\sigma \in (n,n+1)\). Consider \({\tilde{\gamma }}_0 \triangleq \gamma _0^* - \frac{1}{2\sigma }\), so that \(2\sigma {\tilde{\gamma }}_0 \in (n-1,n)\) and the fractional parts of \(2\sigma {\tilde{\gamma }}_0\) and \(2\sigma \gamma _0^*\) coincide. Since \({\tilde{\gamma }}_0 < \gamma _0^*\), the denominators below agree while the numerator shrinks, whence \(t_0({{\tilde{\gamma }}}_0) < t_0(\gamma _0^*)\), contradicting the minimality of \(\gamma _0^*\):
$$\begin{aligned} t_0({\tilde{\gamma }}_0) = \left( \frac{{\tilde{\gamma }}_0^2}{2\sigma {\tilde{\gamma }}_0 - \lfloor 2 \sigma {\tilde{\gamma }}_0 \rfloor } \right) < \left( \frac{(\gamma ^*_0)^2}{2\sigma \gamma ^*_0 - \lfloor 2 \sigma \gamma ^*_0 \rfloor } \right) = t_0(\gamma _0^*). \end{aligned}$$ -
(b)
From (a), any minimizer must satisfy \(2\gamma _0^* \sigma \in (1,2)\), on which interval \(t_0(\gamma _0)\) reduces to \(\gamma _0^2/(2\sigma \gamma _0 - 1).\) Consider the following optimization problem:
$$\begin{aligned} \inf _{ \gamma _0} \left\{ \frac{\gamma _0^2}{(2\sigma \gamma _0 - 1)} \mid 1< 2\sigma \gamma _0 < 2 \right\} . \end{aligned}$$We observe that \(t_0(\gamma _0)\) is a strictly decreasing function by noting that
$$\begin{aligned} t'_0(\gamma _0)= & {} \frac{2\gamma _0}{2\sigma \gamma _0-1} -\frac{ 2\sigma \gamma _0^2 }{(2\sigma \gamma _0 -1)^2} = \frac{2\gamma _0}{2\sigma \gamma _0-1} \left( 1-\frac{\sigma \gamma _0}{2\sigma \gamma _0-1} \right) \\= & {} \frac{2\gamma _0}{2\sigma \gamma _0-1} \left( \frac{\sigma \gamma _0-1}{2\sigma \gamma _0-1} \right) < 0, \end{aligned}$$since \(\sigma \gamma _0 < 1\). It follows that the infimum is approached at the right end-point, as \(\sigma \gamma _0 \rightarrow 1\), implying that
$$\begin{aligned} \inf _{ \gamma _0} \left\{ \frac{\gamma _0^2}{(2\sigma \gamma _0 - 1)} \mid 1< 2\sigma \gamma _0 < 2 \right\} = \frac{1}{\sigma ^2}. \end{aligned}$$ -
(c)
Suppose \({\widehat{\gamma }}_0 = \frac{2-\epsilon }{2\sigma }\) where \(\epsilon \in (0,1/2)\). Then we have that
$$\begin{aligned} f^*_{\epsilon } - f^* = \frac{(2-\epsilon )^2}{4\sigma ^2 (1-\epsilon )} - \frac{1}{\sigma ^2} \le \frac{1}{\sigma ^2} \left( \frac{1}{1-\epsilon } - 1\right) = \frac{1}{\sigma ^2}\frac{\epsilon }{1-\epsilon } \le \frac{2}{\sigma ^2} \epsilon , \end{aligned}$$where the last inequality follows from \(\epsilon < 1/2\). It follows that \({{f^*_{\epsilon } \le f^* + \frac{2\epsilon }{\sigma ^2}}}.\)\(\square \)
Example 1
Unfortunately, while one can derive an infimum of the above discontinuous optimization problem, this infimum cannot be achieved and the problem lacks a minimizer, as proved in the above result. Yet, this infimum is informative in developing an approximate \(\epsilon \)-solution, as part (c) shows. We proceed to use this \(\epsilon \)-infimum in deriving rate statements and demonstrate this result through an example. Suppose \(\sigma = 0.1\sqrt{1.3}\). Then \(t_0(\gamma _0)\) is shown as a solid line with discontinuities in Fig. 1, while the dashed flat line displays the infimum \(1/\sigma ^2\).
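The claims of Lemma 7 can also be checked numerically. The sketch below uses the value \(\sigma = 0.1\sqrt{1.3}\) from this example; the grid and tolerances are arbitrary choices:

```python
import math

sigma = 0.1 * math.sqrt(1.3)   # the value used in Example 1

def t0(g0):
    """t_0(gamma_0) = gamma_0^2 / (2*sigma*gamma_0 - floor(2*sigma*gamma_0)),
    taken as +inf when 2*sigma*gamma_0 is a positive integer."""
    x = 2 * sigma * g0
    den = x - math.floor(x)
    return math.inf if den == 0 else g0 ** 2 / den

f_star = 1 / sigma ** 2        # the infimum from Lemma 7(b)

# On the region 2*sigma*gamma_0 > 1, t0 never dips below the infimum ...
for i in range(1, 20001):
    g0 = (1 + 0.001 * i) / (2 * sigma)
    assert t0(g0) >= f_star - 1e-6

# ... and the choice gamma_0 = (2 - eps)/(2*sigma) from part (c)
# comes within 2*eps/sigma^2 of it.
eps = 0.1
assert t0((2 - eps) / (2 * sigma)) <= f_star + 2 * eps / sigma ** 2
```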
Lemma 8
Consider the following recursion: \( a_{k+1} \le (1-2c\theta /k) a_k + {\textstyle {1\over 2}}\theta ^2 M^2/k^2, \) where \(\theta \) and \(M\) are positive constants, \(a_k \ge 0\), and \((1-2c\theta ) < 0\). Let \(\bar{k} \triangleq \lfloor 2c\theta \rfloor \), \(\kappa \triangleq 1 + \frac{\bar{k}-1}{2c\theta - \bar{k}}\), and \(u \triangleq \max \left( \frac{\kappa \theta ^2 M^2}{2c\theta -1}, 2a_1\right) \). Then for \(k \ge 1\), we have that
$$\begin{aligned} 2 a_{k} \le \frac{u}{k}. \end{aligned}$$
Proof
We begin by noting that \(\kappa \ge 1\), since \(\bar{k} \ge 1\) and \(2c\theta - \bar{k} > 0\).
We consider the following cases for k.
Case 1: Consider \(k = 1\). Then the following holds: \(a_{2} \le (1-2c\theta )a_1 + \frac{1}{2}\theta ^2 M^2.\) Since \((2c\theta -1) > 0\), we may rearrange the inequality as follows:
$$\begin{aligned} (2c\theta -1)a_1&\le -a_2 + \frac{1}{2} \theta ^2M^2 \le \frac{1}{2} \theta ^2M^2 \; \text {or} \; 2a_1 \le \frac{1}{2c\theta -1} \theta ^2M^2 \\ \Longrightarrow 2a_1&\le \max \left( (2c\theta -1)^{-1}\theta ^2M^2, 2a_1\right) \\&\le \max \left( (2c\theta -1)^{-1}\theta ^2M^2\kappa ,2a_1 \right) , \end{aligned}$$where \(\kappa > 1\).
Case 2: \(1 < k \le \bar{k}\). Recall that when \(k \le \bar{k}\), we have that
$$\begin{aligned} (1-2c\theta /k) \le (1-2c\theta /\bar{k}) = \left( 1-(2c\theta ) /(\lfloor 2c\theta \rfloor )\right) < 0. \end{aligned}$$Then the following holds:
$$\begin{aligned} a_{k+1}&\le \left( \left( 1-\frac{2c\theta }{k}\right) a_{k} + \frac{1}{2}\frac{\theta ^2 M^2}{k^2} \right) \\ a_k \left( \frac{2c\theta }{k}-1\right)&\le -a_{k+1} + \frac{\theta ^2M^2}{2k^2} \le \frac{\theta ^2 M^2}{2k^2}\\ \Longrightarrow a_k&\le \frac{\theta ^2M^2}{2k^2}\left( \frac{k}{2c\theta -k}\right) = \frac{\theta ^2M^2}{2k(2c\theta -k)}. \end{aligned}$$By the definition of \(\kappa \) and \(u \triangleq \max \left( \frac{\kappa \theta ^2 M^2}{(2c\theta -1)}, {2a_1}\right) \), we may conclude the following:
$$\begin{aligned} 2a_{k} \le \frac{\theta ^2M^2}{k(2c\theta -k)}&= \frac{\theta ^2M^2}{k(2c\theta -1)}\left( \frac{2c\theta -1}{2c\theta -k}\right) \\&\le \frac{\theta ^2M^2}{k(2c\theta -1)}\left( \frac{2c\theta -k+\bar{k}-1}{2c\theta -k}\right) \\&= \frac{\theta ^2M^2}{k(2c\theta -1)} \left( 1+\frac{\bar{k}-1}{2c\theta -k}\right) \\&\le \frac{\theta ^2M^2}{k(2c\theta -1)} \left( 1+\frac{\bar{k}-1}{2c\theta -{\bar{k}} }\right) = \frac{\theta ^2M^2}{k(2c\theta -1)} \kappa \\&\le {1\over k}\max \left( \frac{\kappa \theta ^2 M^2}{(2c\theta -1)}, {2a_1}\right) = \frac{u}{k}. \end{aligned}$$Case 3: \(k > \bar{k}\). We proceed inductively. Suppose \(2a_k \le \frac{u}{k}\) holds for some \(k > \bar{k}\), where \(u \triangleq \max \left( \theta ^2M^2 (2c\theta -1)^{-1}\kappa , 2a_1 \right) \). We show that the bound then holds for \(k+1\), noting that \((1-\frac{2c\theta }{k}) > 0\) since \(k > \bar{k}\):
$$\begin{aligned} a_{k+1}&\le \left( 1-\frac{2c\theta }{k}\right) \frac{u}{2k} + \frac{1}{2}\left( \frac{\theta ^2M^2}{k^2}\right) \\&= \left( 1-\frac{2c\theta }{k}\right) \frac{u}{2k} + \frac{(2c\theta -1)}{2k} \left( \frac{\theta ^2M^2}{(2c\theta -1)k}\right) \\&\le \left( 1-\frac{2c\theta }{k}\right) \frac{u}{2k} + \frac{(2c\theta -1)}{2k} \left( \frac{\theta ^2M^2\kappa }{(2c\theta -1)k}\right) \\&\le \left( \frac{u}{2k}-\left( \frac{2c\theta }{k}\right) \frac{u}{2k}\right) + \left( \frac{2c\theta -1}{2k}\right) \frac{u}{k} \\&= \frac{u}{2k}-\frac{1}{k}\left( \frac{u}{2k}\right) \le \frac{u}{2k} -\frac{1}{k+1}\left( \frac{u}{2k}\right) = \frac{u}{2(k+1)}. \end{aligned}$$
\(\square \)
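The bound of Lemma 8 can be illustrated by running the recursion with equality. The constants below are hypothetical choices satisfying \((1-2c\theta ) < 0\), with \(a_1\) taken small enough that the sequence stays nonnegative:

```python
import math

# Hypothetical constants: 2*c*theta = 2.5 > 1, M = 1, and a small a_1 so
# that running the recursion with equality keeps a_k nonnegative.
c, theta, M, a = 1.0, 1.25, 1.0, 0.3

kbar = math.floor(2 * c * theta)                      # kbar = 2
kappa = 1 + (kbar - 1) / (2 * c * theta - kbar)       # kappa = 3
u = max(kappa * theta ** 2 * M ** 2 / (2 * c * theta - 1), 2 * a)

for k in range(1, 10001):
    assert 2 * a <= u / k + 1e-12                     # the bound of Lemma 8
    # run the recursion with equality (the tightest admissible sequence)
    a = (1 - 2 * c * theta / k) * a + 0.5 * theta ** 2 * M ** 2 / k ** 2
```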
Lemma 9
Consider the following problem: \(\min \left\{ h(\gamma _0) g(z) \mid \gamma _0 \in \Gamma _0, z \in {{\mathcal {Z}}}\right\} ,\) where \(h\) and \(g\) are positive functions over \(\Gamma _0\) and \({\mathcal {Z}}\), respectively. If \({\bar{\gamma }}_0\) and \({\bar{z}}\) denote global minimizers of \(h(\gamma _0)\) and \(g(z)\) over \(\Gamma _0\) and \({{\mathcal {Z}}}\), respectively, then the following holds:
$$\begin{aligned} \min _{\gamma _0 \in \Gamma _0, z \in {\mathcal {Z}}} h(\gamma _0)\, g(z) = h({\bar{\gamma }}_0)\, g({\bar{z}}). \end{aligned}$$
Proof
The proof has two steps. First, since \(h\) and \(g\) are positive, we note that \( \min _{\gamma _0 \in \Gamma _0, z \in {\mathcal {Z}}} h(\gamma _0) g(z) \ge \left( \min _{\gamma _0 \in \Gamma _0} h(\gamma _0)\right) \left( \min _{z \in {\mathcal {Z}}} g(z)\right) = h({\bar{\gamma }}_0) g({\bar{z}}), \) implying that at any global minimizer \((\gamma _0^*,z^*)\),
$$\begin{aligned} h(\gamma _0^*)\, g(z^*) \ge h({\bar{\gamma }}_0)\, g({\bar{z}}). \end{aligned}$$
Second, since \(({\bar{\gamma }}_0, {\bar{z}}) \in \Gamma _0 \times {\mathcal {Z}}\) is feasible, the optimal value \(h(\gamma _0^*)g(z^*)\) is no larger than the value at this feasible point, or
$$\begin{aligned} h(\gamma _0^*)\, g(z^*) \le h({\bar{\gamma }}_0)\, g({\bar{z}}). \end{aligned}$$
By combining the preceding two inequalities, the result follows. \(\square \)
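A brief numerical illustration of Lemma 9, using hypothetical positive functions \(h\) and \(g\) over interval domains and a coarse grid:

```python
import math

# Hypothetical positive functions over interval domains:
# h is minimized at gamma_0 = 2 with value 1; g at z = -pi/2 with value ~1.
h = lambda g0: 1.0 + (g0 - 2.0) ** 2      # positive on Gamma_0 = [0, 5]
g = lambda z: 2.0 + math.sin(z)           # positive on Z = [-2, 2]

G0 = [0.01 * i for i in range(0, 501)]            # grid on Gamma_0
Z = [-2.0 + 0.01 * i for i in range(0, 401)]      # grid on Z

joint_min = min(h(x) * g(z) for x in G0 for z in Z)
separable_min = min(h(x) for x in G0) * min(g(z) for z in Z)

# Per Lemma 9, minimizing the product jointly decouples into
# minimizing each positive factor separately.
assert abs(joint_min - separable_min) < 1e-12
```

The decoupling is what lets the paper optimize the initial steplength \(\gamma_0\) independently of the remaining variables in its error bound.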
Cite this article
Kannan, A., Shanbhag, U.V. Optimal stochastic extragradient schemes for pseudomonotone stochastic variational inequality problems and their variants. Comput Optim Appl 74, 779–820 (2019). https://doi.org/10.1007/s10589-019-00120-x