Solving Mixed Variational Inequalities Beyond Convexity

We show that Malitsky’s recent Golden Ratio Algorithm for solving convex mixed variational inequalities remains convergent in a certain nonconvex framework as well, making it, to the best of our knowledge, the first iterative method in the literature for solving generalized convex mixed variational inequalities. We illustrate this result by numerical experiments.


Introduction
Mixed variational inequalities are general problems which encompass as special cases several problems from continuous optimization and variational analysis, such as minimization problems, linear complementarity problems, vector optimization problems and variational inequalities, with applications in economics, engineering, physics, mechanics and electronics (see [8,9,11,12,23,32–34,36] among others).
The first motivation behind this study comes from [19,20], where existence results for mixed variational inequalities involving quasiconvex functions were provided, hence the question of numerically solving such problems arose naturally. As far as we know, the literature contains only iterative methods for solving mixed variational inequalities without convexity assumptions where the involved functions are taken continuous, in works like [32–34]; however, these algorithms are either implicitly formulated or merely conceptual schemes, and no implementations of them are known to us. Note, moreover, that these methods involve inner loops or two forward steps per iteration. In order to provide algorithms for solving mixed variational inequalities involving noncontinuous nonconvex functions, we can envision at least two ways: to extend algorithms from the corresponding convex case, or to combine methods of proximal point type for minimizing quasiconvex functions (cf. [24,35]) with known algorithms for solving variational inequalities. In this note we contribute to the first direction, leaving the other for future research.
More precisely, we show that the recent Golden Ratio Algorithm due to Malitsky (cf. [27]) remains convergent for certain nonconvex mixed variational inequalities as well, making it, as far as we know, the first iterative method (of proximal point type) in the literature for solving nonsmooth mixed variational inequalities beyond convexity. The governing operators of the mixed variational inequalities which turn out to be solvable via the Golden Ratio Algorithm need to satisfy only a quite general monotonicity type hypothesis, while the involved functions can be prox-convex, a generalized convexity notion recently introduced in [13], whose definition is satisfied by several families of quasiconvex and weakly convex functions, among others. The theoretical advances are illustrated and confirmed by numerical experiments, where some interesting facts regarding the "conflict" between the operator and the function involved in the mixed variational inequalities can be noticed even if only toy examples are considered.
The structure of the paper is as follows. In Sect. 2, we set up notation and preliminaries, reviewing some standard facts on generalized convexity. In Sect. 3, we show that the Golden Ratio Algorithm converges for certain nonconvex mixed variational inequalities, while in Sect. 4 we computationally solve in MATLAB two mixed variational inequalities where the involved functions are prox-convex (but not convex) on the constraint set, in particular an application in oligopolistic markets.

Preliminaries and Basic Definitions
We denote by ⟨·, ·⟩ the inner product of R^n and by ‖·‖ the Euclidean norm. The indicator function of a nonempty set K ⊆ R^n is defined by ι_K(x) := 0 if x ∈ K, and ι_K(x) := +∞ elsewhere. By B(x, δ) we mean the closed ball centered at x ∈ R^n with radius δ > 0. By convention, we take 0 · (+∞) = +∞.
Given any x, y, z ∈ R^n and β ∈ R, we have

‖βx + (1 − β)y − z‖² = β‖x − z‖² + (1 − β)‖y − z‖² − β(1 − β)‖x − y‖².   (2.1)

The proximity operator of parameter γ > 0 of a function h : R^n → R ∪ {+∞} is defined by

Prox_{γh}(z) := argmin_{x ∈ R^n} { γ h(x) + (1/2)‖x − z‖² },  z ∈ R^n.

Let K be a closed and convex set in R^n, h : K → R with K ∩ dom h ≠ ∅ a proper function, and f : K × K → R be a real-valued bifunction. We say that f is monotone if f(x, y) + f(y, x) ≤ 0 for all x, y ∈ K, and h-pseudomonotone if, for all x, y ∈ K,

f(x, y) + h(y) − h(x) ≥ 0  implies  f(y, x) + h(x) − h(y) ≤ 0.

Every monotone bifunction is h-pseudomonotone, but the converse statement is not true in general. Furthermore, if h ≡ 0, then 0-pseudomonotonicity coincides with the usual pseudomonotonicity, see [16].
A new class of generalized convex functions, which includes some classes of quasiconvex and weakly convex functions among others, was introduced in [13].

Definition 2.1 Let K be a closed set in R^n and h : R^n → R be a proper function such that K ∩ dom h ≠ ∅. We say that h is prox-convex on K (with prox-convex value α) if there exists α > 0 such that for every z ∈ K, Prox_{h+ι_K}(z) ≠ ∅, and

⟨z − x̄, y − x̄⟩ ≤ α (h(y) − h(x̄))  for all y ∈ K and all x̄ ∈ Prox_{h+ι_K}(z).   (2.8)

Some basic properties of prox-convex functions are recalled below.

Remark 2.1 (i) Let K be a closed set in R^n and h : R^n → R be a proper prox-convex function on K such that K ∩ dom h ≠ ∅. Then the map z → Prox_{h+ι_K}(z) is single-valued, i.e., by a slight abuse of notation one can write x̄ = Prox_{h+ι_K}(z) (see [13, Proposition 3.3]), and firmly nonexpansive on K, that is, given z_1, z_2 ∈ K, x_1 ∈ Prox_{h+ι_K}(z_1) and x_2 ∈ Prox_{h+ι_K}(z_2), one has ‖x_1 − x_2‖² ≤ ⟨x_1 − x_2, z_1 − z_2⟩.
(ii) If h is a proper, lower semicontinuous and convex function, then h is prox-convex with prox-convex value α = 1 (see [13, Proposition 3.4]), but the converse statement does not hold in general. Indeed, the function h(x) = −x² − x is prox-convex on K = [0, 1] (with prox-convex value α for every α > 0) without being convex. Furthermore, a class of quasiconvex functions which are prox-convex as well is analyzed in [13, Section 3.2].
(iii) As was proved in [13, Theorem 4.1], the classical proximal point algorithm remains convergent for optimization problems consisting in minimizing prox-convex functions, too.
(iv) In the literature one can find similar works where the proximity operator of a function is considered with respect to a given set like above, for instance [3,14], where the employed functions are not split from the corresponding sets either.
(v) We are not aware of any universal recipe for computing the proximity operator of a prox-convex function with respect to a given set. The most direct approach (used for the numerical experiments presented in Sect. 4 as well as in [13]) is to determine it by hand by means of basic calculus techniques for determining the minimum of a function over a set. Alternatively, in order to computationally derive proximal points of functions over sets (the algorithms actually use such points at certain iterates, so closed formulae for the corresponding proximity operators are not really necessary), one could employ, or adapt if necessary, the software packages from [40] (in Julia) or [26] (in Python). Related to this issue is the computation of proximal points of a convex function over a nonconvex set, a contribution in this direction being available in the recent work [15]. Note also that in [17] a method based on the computation of proximal points of piecewise affine models of certain classes of nonconvex functions is proposed for determining the proximal points of the latter, while in the convex case one finds a symbolic computation method in [25] and an interior one in [10]; however, the possible ways of extending them towards nonconvex frameworks or towards functions restricted to sets require further investigation.
(vi) To the best of our knowledge, besides the convex and prox-convex functions, only the weakly convex ones have single-valued proximity operators (as noted in [17]). This feature is very relevant when constructing proximal point algorithms, as the new iterate is uniquely determined and does not have to be picked from a set (that needs to be found and can also be empty).
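As a complement to item (v), the brute-force approach can be sketched numerically. The following Python snippet (an illustration with hypothetical helper names, not part of the toolchain used in Sect. 4) evaluates Prox_{γh+ι_K}(z) on an interval by grid search, using the nonconvex function from item (ii) as a test case.

```python
def prox_num(h, lo, hi, z, gamma=1.0, n=200_001):
    """Brute-force evaluation of Prox_{gamma*h + iota_K}(z) for an interval
    K = [lo, hi]: minimize gamma*h(x) + 0.5*(x - z)**2 over a fine grid."""
    best_x, best_v = lo, float("inf")
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        v = gamma * h(x) + 0.5 * (x - z) ** 2
        if v < best_v:
            best_x, best_v = x, v
    return best_x

# Nonconvex example from Remark 2.1(ii): h(x) = -x^2 - x on K = [0, 1].
h = lambda x: -x**2 - x

# For gamma < 1/2 the prox objective is strongly convex, so the stationary
# point (z + gamma)/(1 - 2*gamma), clipped to K, is the unique proximal
# point -- in line with the single-valuedness noted in item (i).
p = prox_num(h, 0.0, 1.0, 0.1, gamma=0.25)   # (0.1 + 0.25)/0.5 = 0.7
```

The grid search is, of course, only viable in low dimensions; its value here is as an independent check of closed formulae derived by hand.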
For a further study on generalized convexity we refer to [4,16] among other works.

Nonconvex Mixed Variational Inequalities
Let K ⊆ R^n be nonempty, T : R^n → R^n be an operator and h : K → R a real-valued function. The mixed variational inequality problem (or variational inequality of the second kind) consists in finding x̄ ∈ K such that

⟨T(x̄), y − x̄⟩ + h(y) − h(x̄) ≥ 0 for all y ∈ K.   (3.1)

We denote its solution set by S(T; h; K).

Remark 3.1
Following [23,31], problem (3.1) can be seen as a reformulation of a Walrasian equilibrium model or of an oligopolistic equilibrium model, where (after some transformations explained in [23, Section 2]) the (monotone) operator T represents the demand, while h the supply. While in [23] the function representing the supply is taken convex (however no economical reason behind this choice is given), in [31] it is DC (difference of convex). Another concrete application that can be recast as a mixed variational inequality problem can be found in [12] (see also [11]), where electrical circuits involving devices like transistors are studied by means of mixed variational inequalities. There the involved (monotone) operator is linear, while the electrical superpotential of the practical diodes is represented by the involved function (taken convex in order to fit the theoretical model, again without any physical explanation). In [36] a dual variational formulation for strain of an elastoplasticity model with hardening is reformulated as a mixed variational inequality, while in [28] the same happens with the frictional contact of an elastic cylinder with a rigid obstacle, in the antiplane framework.
On the other hand, if T ≡ 0, then (3.1) reduces to the constrained optimization problem of minimizing h over K. When h is convex, (3.1) is a convex mixed variational inequality, for which one can find various (proximal point type) algorithms in the literature, for instance in [5,27,37,41,43–45]. However, when involving nonconvex functions, mixed variational inequalities become harder to approach. For instance, the solution sets of such problems may be empty even for a compact and convex set K, as was noted in [30, Example 3.1] (see also [20, page 127]): there one takes T = I (the identity map) together with a suitable nonconvex function h and, assuming the existence of a solution x̄ ∈ K to (3.1), testing (3.1) with suitably chosen y ∈ K leads to a contradiction. The reason behind this situation is the "conflict" between the properties of the operator T and those of the function h. Since we are searching for a point where an inequality is satisfied for all the considered variables, if h goes downwards (for instance, when it is concave), then T needs to go upwards faster, and vice versa. This conflict is restricted to nonconvex functions, since, when h is convex, (3.1) has a solution on any compact and convex set K. Existence results for the nonconvex and noncoercive setting can be found in [19–21], in which this conflict appears in a theoretical assumption called the mixed variational inequality property (see [42, Definition 3.1]).
For problem (3.1), let us consider the following additional assumptions:
(A1) T is an L-Lipschitz-continuous operator on K, where L > 0;
(A2) h is a lower semicontinuous prox-convex function on K with prox-convex value α > 0;
(A3) T and h satisfy the following generalized monotonicity condition on K (cf. [27,39]):

⟨T(y), y − x̄⟩ + h(y) − h(x̄) ≥ 0 for all y ∈ K and all x̄ ∈ S(T; h; K).

Remark 3.3 (i) If T is h-pseudomonotone on K (in particular, monotone), then T satisfies assumption (A3) (see [39]). Indeed, take x̄ ∈ S(T; h; K). Then ⟨T(x̄), y − x̄⟩ + h(y) − h(x̄) ≥ 0 for every y ∈ K and, since T is h-pseudomonotone, this yields

⟨T(y), y − x̄⟩ + h(y) − h(x̄) ≥ 0 for every y ∈ K,

i.e., T satisfies assumption (A3). Note also that the generalized monotonicity condition (A3) can be further weakened to [27, Equation (71)] without losing the convergence of the algorithm proposed below.
(ii) Results guaranteeing existence of solutions to mixed variational inequalities beyond convexity may be found in [19–21].
The algorithm we consider for solving (3.1) was recently introduced by Malitsky in [27] with the involved function taken convex.
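For orientation, the iteration can be sketched as follows. This is a minimal Python illustration, assuming the Golden Ratio Algorithm updates from [27] (a convex combination with ratio φ followed by a proximal step), applied to a one-dimensional instance with a closed-form proximal map; it is a sketch, not the MATLAB code used in Sect. 4.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio

def graal(T, prox, x0, lam, iters=1000, tol=1e-12):
    """Golden Ratio Algorithm sketch (cf. Malitsky [27]):
    z^k   = ((PHI - 1) x^k + z^{k-1}) / PHI,
    x^{k+1} = prox(z^k - lam * T(x^k)),
    where prox(w) evaluates Prox_{lam*h + iota_K}(w)."""
    x = z = x0
    for _ in range(iters):
        z = ((PHI - 1) * x + z) / PHI        # averaged (convex combination) point
        x_new = prox(z - lam * T(x))         # proximal step
        if abs(x_new - x) <= tol and abs(x_new - z) <= tol:
            return x_new                     # stopping criterion reached
        x = x_new
    return x

# Illustrative instance: K = [0, 1], T = identity, h(y) = -y^2 - y.
# For lam < 1/2 the prox objective is convex, so clipping the stationary
# point (w + lam)/(1 - 2*lam) to [0, 1] gives the proximal point.
lam = 0.2
prox = lambda w: min(max((w + lam) / (1 - 2 * lam), 0.0), 1.0)
x_star = graal(lambda x: x, prox, x0=0.3, lam=lam)
```

On this instance the iterates settle at the solution x̄ = 1 after a few dozen steps, consistently with the convergence analysis below.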
As a first result, we validate the stopping criterion of Algorithm 1 in our generalized convex framework: if (A2) holds and x^{k+1} = x^k = z^k for some k ∈ N, then the stopping point of Algorithm 1 is a solution of (3.1), i.e., x^k ∈ S(T; h; K). In the following statements, the sequences {x^k}_k and {z^k}_k are the ones generated by Algorithm 1.
In order to state the convergence of Algorithm 1, we first show the following general result, whose proof follows the steps of that of [27, Theorem 1].
Proof By applying relation (2.8) to Eq. (3.4) for k + 1 and k, we obtain two inequalities. Take x = x̄ ∈ S(T; h; K) in the first one and x = x^{k+1} in the second one. By adding these inequalities, and since x̄ − x^{k+1} = x̄ − x^k + x^k − x^{k+1} and α > 0, we arrive at (3.8). Since T satisfies assumption (A3), it follows from (3.8) that (3.9) holds. By applying (2.1), we obtain (3.10), and replacing (3.10) in Eq. (3.9) yields (3.11). Finally, since T is L-Lipschitz-continuous, the result follows by employing the corresponding estimate in Eq. (3.11).
As a consequence, we have the following statement.
Then there exists γ ≥ 1 large enough such that condition (3.12) holds. Now we present the following convergence result for Algorithm 1.

Proof Take an arbitrary x̄ ∈ S(T; h; K). It follows from Lemma 3.1 that

‖x^k − z^k‖ → 0 when k → +∞,   (3.14)

hence {x^k}_k has at least one cluster point. Passing to subsequences if necessary, the same holds for {z^k}_k, too. By using (3.3), we obtain (3.15). Therefore, it follows from Eqs. (3.14) and (3.15) that ‖x^{k+1} − x^k‖ → 0 when k → +∞. Now let x̃ be an arbitrary cluster point of the sequence {x^k}_k generated by our algorithm, and recall Eq. (3.6). Since h is lower semicontinuous and T is L-Lipschitz-continuous, by taking the limit in Eq. (3.6) (passing to subsequences if needed) and by using Eq. (3.15), we obtain that x̃ ∈ S(T; h; K). Therefore, every cluster point of {x^k}_k is a solution of problem (3.1).

Remark 3.5 (i)
As explained before, the conflict between the properties of the function and those of the operator in the nonconvex case (something similar happens in multiobjective optimization, where one has several conflicting objective functions) is controlled in Algorithm 1 via condition (3.12). Note, however, that if T = 0, then L = 0 and (3.12) holds automatically. On the other hand, if h is convex (in particular ι_K), then α = 1, so condition (3.12) becomes "L ∈ ]0, φ/2]", which is usual in the convex case. Note also that one can find in the literature various methods for determining the Lipschitz constants of the involved operators, see for instance [38]. (ii) In contrast to the methods proposed in [27,37,43] and references therein, Algorithm 1 is not restricted to convex functions h. Other recent algorithms for solving mixed variational inequalities where the involved function is convex can be found in [5,41,44,45].
The convergence statement for this modified algorithm is similar to Theorem 3.1, and its proof follows analogously. However, when assuming h to be prox-convex with prox-convex value α > 0, it is not known whether λh (for some λ > 0) is prox-convex or not. While (2.8) implies a similar inequality for λh with the prox-convex value λα, it is not guaranteed that Prox_{h+ι_K}(z) ≠ ∅ yields Prox_{λh+ι_K}(z) ≠ ∅.

Numerical Experiments
We implemented the modification of Algorithm 1 presented in Remark 3.6 in MATLAB R2019b (Win64) on a Lenovo Yoga 260 laptop with Windows 10 and an Intel Core i7-6500U CPU with 2.59 GHz and 16 GB RAM. As stopping criterion of the algorithm we considered the situation when the absolute values of both differences, between x^{k+1} and x^k and between x^{k+1} and z^k, are not larger than the considered error ε > 0. First we consider a toy example, where one can already note the "conflict" between the properties of the involved operator and function, and then an application in oligopolistic markets.
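The stopping criterion just described can be sketched as a small helper (a Python illustration; the experiments themselves were run in MATLAB):

```python
def should_stop(x_new, x_prev, z, eps):
    """Stopping rule from the experiments: the absolute values of the
    componentwise differences x^{k+1} - x^k and x^{k+1} - z^k must both
    stay within the tolerance eps."""
    return (max(abs(a - b) for a, b in zip(x_new, x_prev)) <= eps
            and max(abs(a - b) for a, b in zip(x_new, z)) <= eps)
```

Checking both differences matters: x^{k+1} = x^k alone does not yet certify a solution, as Proposition-type results above also require agreement with z^k.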

Theoretical Example
We consider first the following mixed variational inequality, a special case of (3.1): find x̄ ∈ K := [0, 1] such that

⟨x̄, y − x̄⟩ + h(y) − h(x̄) ≥ 0 for all y ∈ K,   (4.1)

where T = I and h(y) = −y² − y. The function h is prox-convex on K with prox-convex value α for every α > 0, and problem (4.1) has the unique solution x̄ = 1 ∈ K. The fact that y → ⟨T(x), y⟩ is increasing on K while h is decreasing on K adds additional complications to problem (4.1). The value of the proximity operator of parameter αλ of h (for fixed α, λ > 0) over K at z − λT(x) ∈ K, for x, z ∈ K, is, in the situations considered below, where αλ < 1/2, equal to (z + λ(α − x))/(1 − 2λα) when this lies in K; otherwise it is 0 when λ(x − α) ≥ z and 1 in the remaining case, i.e., when z + 3αλ − λx > 1. In case αλ ≥ 1/2 the analysis can be done in a similar manner. Hence, as λαh is prox-convex, we can employ for solving (4.1) the modification of Algorithm 1 mentioned in Remark 3.6 with constant stepsize λα.
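The closed-form expression above can be cross-checked against direct minimization. The sketch below (an illustration, with hypothetical helper names) compares the stated formula, for αλ < 1/2, with a grid search over K = [0, 1]; since the prox objective is then convex, the three cases of the formula amount to clipping the stationary point to [0, 1].

```python
def prox_closed(z, x, lam, alpha):
    """Closed-form prox of parameter alpha*lam of h(y) = -y^2 - y over
    K = [0, 1] at the point z - lam*x, assuming alpha*lam < 1/2; the
    boundary cases of the formula reduce to clipping to [0, 1]."""
    return min(max((z + lam * (alpha - x)) / (1 - 2 * lam * alpha), 0.0), 1.0)

def prox_brute(z, x, lam, alpha, n=200_001):
    """Direct minimization of alpha*lam*h(y) + 0.5*(y - (z - lam*x))^2
    over a fine grid on [0, 1], as an independent check."""
    w = z - lam * x
    best_y, best_v = 0.0, float("inf")
    for i in range(n):
        y = i / (n - 1)
        v = alpha * lam * (-y**2 - y) + 0.5 * (y - w) ** 2
        if v < best_v:
            best_y, best_v = y, v
    return best_y
```

For instance, with z = 0.2, x = 0.5, λ = 0.2 and α = 1, both routines return the interior value 0.5.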
In Tables 1 and 2 we present the computational results obtained by implementing this algorithm for various choices of the involved parameters and of the allowed error ε > 0. In all the situations presented below we give the number of iterations needed by the program to reach the solution x = 1 of (4.1).

Application in Oligopolistic Equilibrium
The Nash–Cournot oligopolistic market equilibrium is a fundamental model in economics (see [23,31] for more on this and related references). It assumes that there are n companies producing a common homogeneous commodity. For i ∈ {1, …, n}, company i has a strategy set D_i ⊆ R_+, a cost function ϕ_i defined on the strategy set D = D_1 × ⋯ × D_n of the model, and a profit function f_i that is usually defined as price times production minus costs (of producing the considered output). Each company is interested in maximizing its profit by choosing the corresponding production level under knowledge of the demand of the market and of the production of the competition (seen as input parameters). A commonly used solution notion in this model is the celebrated Nash equilibrium. A point (strategy) x̄ = (x̄_1, …, x̄_n) ∈ D is said to be a Nash equilibrium point of this Nash–Cournot oligopolistic market model if

f_i(x̄) ≥ f_i(x̄[x_i]) for all x_i ∈ D_i and all i ∈ {1, …, n},

where the vector x̄[x_i] is obtained from x̄ by replacing x̄_i with x_i. Following some transformations described in detail in [23,31], the problem of determining a Nash equilibrium of a Nash–Cournot oligopolistic market situation can be rewritten as a mixed variational inequality of type (3.1), where the involved operator captures various parameters and additional information, while the function h is the sum of the cost functions of the considered companies, each of them depending on a different variable that represents the corresponding production.
Following the oligopolistic equilibrium model analyzed in [31], where some of the involved cost functions are convex and some DC, we consider the mixed variational inequality corresponding to a model with 5 companies, whose cost functions ϕ_1, …, ϕ_5 are defined as follows. The cost functions ϕ_1, ϕ_3 and ϕ_5 are prox-convex (see [13]), while ϕ_2 and ϕ_4 are convex. The functions ϕ_1 and ϕ_3 were employed in the similar application considered in [31] (following the more applied paper [2]) as (DC) cost functions for oligopolistic equilibrium problems, while ϕ_4 is the Huber loss function (cf. [18]) that is used in various contexts in economics and machine learning for quantifying costs. This model also includes functions with negative values, such as ϕ_1, which quantify the possibility of having negative costs, for instance in case of a surplus of energy on the market (as discussed in the recent works [1,7]). To make the problem more challenging from a mathematical point of view, we consider the involved convex cost functions defined over the whole space. Unlike (4.1), the present problem is unbounded in the second and fourth arguments (note that the corresponding constraint sets are not compact). Note, moreover, that in contrast to the literature on algorithms for DC optimization problems, where usually only critical points are determinable (and not optimal solutions), for the DC functions that are also prox-convex, proximal point methods are capable of delivering global minima (on the considered sets). In our numerical experiments for solving (3.1) (denoted (OMP) in this case), where T(x) = Ax, x ∈ R^5, with A ∈ R^{5×5} a real symmetric positive semidefinite matrix of norm 1 (for simplicity), and h(x_1, …, x_5) = Σ_{i=1}^{5} ϕ_i(x_i), we employed the modification of Algorithm 1 mentioned in Remark 3.6 with constant stepsize λα, for fixed α, λ > 0. In each experiment the matrix A is randomly generated and scaled in order to have L = 1.
The proximity operator of (the separable function) h has as components those of the involved functions, which are known (cf. [6,13,29]). For instance, the value of the proximity operator of parameter γ > 0 of ϕ_1 over [0, 2] at z ∈ [0, 2] is, when γ < 1/2, equal to (z + γ)/(1 − 2γ) when this lies in [0, 2]; otherwise it is 0 when z < −γ and 2 in the remaining case, while in case γ ≥ 1/2 the analysis can be done in a similar manner. In the following numerical implementations we apply these formulae for γ = αλ < 1/2.
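The componentwise structure mentioned above can be sketched as follows. This is an illustration only: the first component uses the closed form stated for ϕ_1 over [0, 2], while the quadratic component is a hypothetical stand-in for one of the convex cost functions, not one of the actual ϕ_i.

```python
def prox_separable(prox_list, w):
    """For a separable h(x) = sum_i phi_i(x_i), the proximity operator
    acts componentwise: (Prox_{gamma h}(w))_i = Prox_{gamma phi_i}(w_i)."""
    return [p(wi) for p, wi in zip(prox_list, w)]

gamma = 0.25  # kept below 1/2, as in the experiments

# Component proxes: the stated closed form for phi_1 over [0, 2], and the
# standard prox of the convex quadratic t -> t^2/2 over the whole space
# (an illustrative stand-in for a convex cost function).
prox_phi1 = lambda w: min(max((w + gamma) / (1 - 2 * gamma), 0.0), 2.0)
prox_quad = lambda w: w / (1 + gamma)

p = prox_separable([prox_phi1, prox_quad], [0.5, 1.0])
```

This separability is what makes the 5-dimensional prox step in the oligopoly experiment as cheap as five scalar computations.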
In Tables 3 and 4 we present the computational results obtained by implementing the mentioned algorithm in matlab for various choices of the involved parameters and of the allowed error ε > 0. In all the situations presented below we give the number of iterations and the CPU time needed by the program to reach a suitable approximation of the solution of the corresponding oligopolistic equilibrium problem. As one can notice, unlike in the experiments for solving (4.1) where the choice of λ and α does not influence the number of iterations in a significant manner, in the present case this (like the CPU time) depends on the values of the two mentioned parameters.

Conclusions
We contribute to the study of mixed variational inequalities beyond convexity by showing that a recent proximal point type algorithm due to Malitsky remains convergent for a class of nonconvex mixed variational inequalities, too. The convergence statement holds when there is a certain relation between the Lipschitz constant of the governing operator of the mixed variational inequality and the prox-convex value of the involved function, which can be seen as a way to control the "conflict" between the properties of the mentioned operator and function. This conflict belongs to the nature of nonconvex mixed variational inequality problems since it is no longer present when the operator vanishes, and collapses to a standard assumption when the function is convex. Since this is the first (as far as we know) proximal point type algorithm for iteratively determining a solution to a generalized convex mixed variational inequality, this relation has not been analyzed in the literature yet.
In a subsequent work, we plan to continue the study of nonconvex mixed variational inequalities and, in particular, to provide a deeper analysis of the mentioned relation, as well as to extend beyond convexity other recent algorithms for solving convex mixed variational inequalities and methods of proximal point type for minimizing quasiconvex functions towards nonconvex mixed variational inequalities.