
Abstract

We consider a bilevel programming problem modeling optimal toll assignment on an abstract network of toll and free highways. A public authority or a private lease company runs the toll roads and makes decisions at the upper level, assigning the tolls so as to maximize its profit. The lower-level decision makers (highway users) seek an equilibrium among themselves while distributing their transportation flows along routes that minimize their total travel costs subject to satisfying the demand for their goods/passengers. Our model extends previous ones by adding quadratic terms to the lower-level costs, thus reflecting mutual traffic congestion on the roads. Moreover, as a new feature, the lower-level quadratic costs are no longer separable, i.e., they are functions of the total flow along each arc (highway). To solve the bilevel programming problem, we develop a heuristic algorithm making use of sensitivity analysis techniques for quadratic programs. As a remedy against getting stuck at a local maximum of the upper-level objective function, we adapt the well-known "filled function" method, which brings us to a vicinity of another local maximum point. A series of numerical experiments conducted on test models of small and medium size shows that the new algorithm is competitive.


Notes

  1. The proof of Theorem 1 is given in Appendix 1.

  2. The procedure for computing the ARSB is presented in Appendix 2.1.

  3. The procedure for computing the Jacobian \({\partial x}/{\partial t}\) is presented in Appendix 2.2.

  4. The procedure for the FF method is presented in Appendix 3.1.

  5. The algorithms are presented in Appendix 3.2.

  6. The proof of Theorem 2 can be found in Hadigheh et al. [10].

  7. In Boot [1], the QP problem is stated as a maximization problem.

  8. The FF proposed in Kalashnikov et al. [13] is the one used in our algorithms.

  9. In Kalashnikov et al. [12], the optimization problem is stated as a minimization problem.

References

  1. J.C.G. Boot, On sensitivity analysis in convex quadratic programming problems. Oper. Res. 11, 771–786 (1963)

  2. L. Brotcorne, Operational and strategic approaches to traffic routers’ problems (in French). Ph.D. dissertation, Université Libre de Bruxelles (1998)

  3. L. Brotcorne, F. Cirinei, P. Marcotte, G. Savard, An exact algorithm for the network pricing problem. Discret. Optim. 8(2), 246–258 (2011)

  4. L. Brotcorne, F. Cirinei, P. Marcotte, G. Savard, A Tabu search algorithm for the network pricing problem. Comput. Oper. Res. 39(11), 2603–2611 (2012)

  5. S. Dempe, T. Starostina, Optimal toll charges: fuzzy optimization approach, in Methods of Multicriteria Decision Theory and Applications, ed. by F. Heyde, A. Löhne, C. Tammer (Shaker Verlag, Aachen, 2009), pp. 29–45

  6. S. Dempe, V.V. Kalashnikov, G.A. Pérez, N.I. Kalashnykova, Bilevel Programming Problems: Theory, Algorithms and Applications to Energy Networks (Springer, Berlin-Heidelberg, 2015)

  7. S. Dempe, A.B. Zemkoho, Bilevel road pricing: theoretical analysis and optimality conditions. Ann. Oper. Res. 196(1), 223–240 (2012)

  8. M. Didi-Biha, P. Marcotte, G. Savard, Path-based formulation of a bilevel toll setting problem, in Optimization with Multi-Valued Mappings: Theory, Applications and Algorithms, ed. by S. Dempe, V.V. Kalashnikov (Springer Science, Boston, MA, 2006), pp. 29–50

  9. J.G. Flores-Muñiz, V.V. Kalashnikov, V. Kreinovich, N.I. Kalashnykova, Gaussian and Cauchy functions in the filled function method: why and what next: on the example of optimizing road tolls. Acta Polytechnica Hung. 14(13), 237–250 (2017)

  10. A.G. Hadigheh, O. Romanko, T. Terlaky, Sensitivity analysis in convex quadratic optimization: simultaneous perturbation of the objective and right-hand-side vectors. Algorithmic Oper. Res. 2, 94–111 (2007)

  11. B. Jansen, Interior Point Techniques in Optimization: Complementarity, Sensitivity and Algorithms (Springer Science+Business Media, Dordrecht, 1997)

  12. V.V. Kalashnikov, F. Camacho, R. Askin, N.I. Kalashnykova, Comparison of algorithms solving a bilevel toll setting problem. Int. J. Innov. Comput. Inf. Control 6(8), 3529–3549 (2010)

  13. V.V. Kalashnikov, R.C. Herrera, F. Camacho, N.I. Kalashnykova, A heuristic algorithm solving bilevel toll optimization problems. Int. J. Logist. Manag. 27(1), 31–51 (2016)

  14. V.V. Kalashnikov, N.I. Kalashnykova, R.C. Herrera, Solving bilevel toll optimization problems by a direct algorithm using sensitivity analysis, in Proceedings of the 2011 New Orleans International Academic Conference, New Orleans, LA, March 21–23, 2011, pp. 1009–1018

  15. V.V. Kalashnikov, V. Kreinovich, J.G. Flores-Muñiz, N.I. Kalashnykova, Structure of filled functions: why Gaussian and Cauchy templates are most efficient. Int. J. Comb. Optim. Probl. Inform. 7 (2017, to appear)

  16. M. Labbé, P. Marcotte, G. Savard, A bilevel model of taxation and its applications to optimal highway pricing. Manag. Sci. 44(12), 1608–1622 (1998)

  17. M. Labbé, P. Marcotte, G. Savard, On a class of bilevel programs, in Nonlinear Optimization and Related Topics, ed. by G. Di Pillo, F. Giannessi (Kluwer Academic Publishers, Dordrecht, 2000), pp. 183–206

  18. S. Lohse, S. Dempe, Best highway toll assigning models and an optimality test (in German). Preprint Nr. 2005-6, Fakultät für Mathematik und Informatik, TU Bergakademie Freiberg, Freiberg (2005)

  19. T.L. Magnanti, R.T. Wong, Network design and transportation planning: models and algorithms. Transp. Sci. 18(1), 1–55 (1984)

  20. P. Marcotte, Network design problem with congestion effects: a case of bilevel programming. Math. Program. 34(2), 142–162 (1986)

  21. G.E. Renpu, A filled function method for finding a global minimizer of a function of several variables. Math. Program. 46(1), 191–204 (1990)

  22. S. Roch, G. Savard, P. Marcotte, Design and analysis of an algorithm for Stackelberg network pricing. Networks 46(1), 57–67 (2005)

  23. Z. Wan, L. Yuan, J. Chen, A filled function method for nonlinear systems of equalities and inequalities. Comput. Appl. Math. 31(2), 391–405 (2012)

  24. Z.Y. Wu, M. Mammadov, F.S. Bai, Y.J. Yang, A filled function method for nonlinear equations. Appl. Math. Comput. 189(2), 1196–1204 (2007)

  25. Z.Y. Wu, F.S. Bai, Y.J. Yang, M. Mammadov, A new auxiliary function method for general constrained global optimization. Optimization 62(2), 193–210 (2013)


Acknowledgements

The authors’ research activity was financially supported by the SEP-CONACYT (Mexico) grants CB-2013-01-221676 and FC-2016-01-1938.


Appendix 1: Proof of Theorem 1

Proof

We are going to show that the Nash equilibrium problem (5)–(9) and the quadratic programming problem (10)–(14) are equivalent. To do so, we first state both problems in matrix form. Let \(\{t_a\mid a\in A_1\}\) satisfy (3) and (4); then we can define the vector \(z\in \mathbb {R}^M\) whose ath component is given by \(c_a\) if \(a\in A_2\) and by \(t_a+c_a\) if \(a\in A_1\). Thus, the Nash equilibrium problem (5)–(9) is given as follows:

$$\begin{aligned} x^k \in \varPsi _k(t,x^{-k}),\ \forall k\in K; \end{aligned}$$
(56)

where

$$\begin{aligned} \varPsi _k(t,x^{-k})= \underset{ x^k}{{\text {Argmin}}}&f_k( x^k)={z}^T x^k +\displaystyle \sum \limits _{k\ne \ell \in K} { x^k}^TD^{k,\ell }x^\ell +\frac{1}{2}{ x^k}^TD^{k,k} x^k,\end{aligned}$$
(57)
$$\begin{aligned} \text {subject to}& \qquad \qquad \displaystyle B^k x^k=b^k,\end{aligned}$$
(58)
$$\begin{aligned}&x^k\le q-\displaystyle \sum \limits _{k\ne \ell \in K}x^\ell ,\end{aligned}$$
(59)
$$\begin{aligned}&x^k\ge 0. \end{aligned}$$
(60)

Here, for \(k,\ell \in K\), the components of the matrix \(D^{k,\ell }\in \mathbb {R}^{M\times M}\) are the congestion factors \(d^{k,\ell }_{a,e}\), \(a,e\in A\); the matrix \(B^k\in \mathbb {R}^{\eta \times M}\) and the vector \(b^k\in \mathbb {R}^\eta \) correspond to the equality constraints (7); and the vector \(q\in \mathbb {R}^M\) has the capacity upper bounds \(q_a\), \(a\in A\), as its components. Using the above notation, the quadratic programming problem (10)–(14) is given by:

$$\begin{aligned} x\in \varPsi (t); \end{aligned}$$
(61)

where

$$\begin{aligned} \varPsi (t)=\underset{ x}{{\text {Argmin}}}&f( x)=\displaystyle \sum \limits _{k\in K} {z}^T x^k+\frac{1}{2}{ x}^TD x,\end{aligned}$$
(62)
$$\begin{aligned} \text {subject to}& \qquad B^k x^k=b^k,\ \forall k\in K,\end{aligned}$$
(63)
$$\begin{aligned}&\displaystyle \sum \limits _{\ell \in K} x^\ell \le q,\end{aligned}$$
(64)
$$\begin{aligned}&x\ge 0. \end{aligned}$$
(65)

The matrix D is a \(\kappa \times \kappa \) block matrix whose block components are the matrices \(D^{k,\ell }\) (thus, \(D\in \mathbb {R}^{M\kappa \times M\kappa }\)). Since \(d^{k,\ell }_{a,e}=d^{\ell ,k}_{e,a}\), we have \(D^{k,\ell }={D^{\ell ,k}}^T\) for all \(k,\ell \in K\); moreover, without loss of generality we can suppose that the matrices D and \(D^{k,\ell }\), \(k,\ell \in K\), are symmetric (and positive semi-definite, as we have assumed). Then, the programs appearing in (56)–(60) and program (61)–(65) are differentiable and convex quadratic programming problems with linear constraints, so they can be equivalently transformed into nonlinear systems of equations and inequalities using the KKT conditions. Therefore, in order to show the equivalence of problems (56)–(60) and (61)–(65), it suffices to demonstrate that the KKT conditions of one of the problems lead to a solution of the KKT conditions of the other problem with the same solution vector x. The KKT conditions for problem (56)–(60) are as follows:

$$\begin{aligned}&\displaystyle \frac{d f_k}{d x^k}+\mu ^k+{B^k}^T\lambda ^k=z+\sum \limits _{\ell \in K} D^{k,\ell }x^\ell +\mu ^k+{B^k}^T\lambda ^k\ge 0,&\end{aligned}$$
(66)
$$\begin{aligned}&B^k x^k=b^k,&\end{aligned}$$
(67)
$$\begin{aligned}&x^k\le q-\displaystyle \sum \limits _{k\ne \ell \in K}x^\ell ,&\end{aligned}$$
(68)
$$\begin{aligned}&\mu ^k\left( \displaystyle \sum \limits _{\ell \in K} x^\ell - q\right) =0,&\end{aligned}$$
(69)
$$\begin{aligned}&x^k,\mu ^k\ge 0,&\end{aligned}$$
(70)

where \(\mu ^k\in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \), for all \(k\in K\). The KKT conditions for problem (61)–(65) are:

$$\begin{aligned}&\displaystyle \frac{\partial f}{\partial x^k}+\mu +{B^k}^T\lambda ^k=z+\sum \limits _{\ell \in K} D^{k,\ell }x^\ell +\mu +{B^k}^T\lambda ^k\ge 0,\ \forall k\in K,&\end{aligned}$$
(71)
$$\begin{aligned}&B^k x^k=b^k,\ \forall k\in K,&\end{aligned}$$
(72)
$$\begin{aligned}&\displaystyle \sum \limits _{\ell \in K} x^\ell \le q,&\end{aligned}$$
(73)
$$\begin{aligned}&\mu \left( \displaystyle \sum \limits _{\ell \in K} x^\ell - q\right) =0,&\end{aligned}$$
(74)
$$\begin{aligned}&x,\mu \ge 0,&\end{aligned}$$
(75)

where \(\mu \in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \), \(k\in K\). Now we prove that the KKT conditions (66)–(70), for all \(k\in K\), and (71)–(75) are equivalent. Let \(x^k,\mu ^k\in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \), \(k\in K\), satisfy (66)–(70) for all \(k\in K\). Now, let’s choose a new vector \( {\mu }\in \mathbb {R}^M\) as follows:

$$\begin{aligned} {\mu }_a=\max \limits _{k\in K}\{\mu _a^k\},\ a\in A. \end{aligned}$$
(76)

Then, \( {\mu }\ge \mu ^k\) for all \(k\in K\), and so:

$$\begin{aligned} z+\sum \limits _{\ell \in K} D^{k,\ell }x^\ell + {\mu }+{B^k}^T\lambda ^k\ge z+\sum \limits _{\ell \in K} D^{k,\ell }x^\ell +\mu ^k+{B^k}^T\lambda ^k \ge 0,\ \forall k\in K, \end{aligned}$$
(77)

which satisfies (71). It is easy to see that condition (68) is the same for any \(k\in K\) and equivalent to condition (73). Moreover, let \(a\in A\). If

$$\begin{aligned} x_a^k<q_a-\sum \limits _{k\ne \ell \in K}x^\ell _a, \end{aligned}$$
(78)

then,

$$\begin{aligned} \sum \limits _{\ell \in K}x^\ell _a-q_a<0, \end{aligned}$$
(79)

and hence, \(\mu _a^k=0\) for all \(k\in K\), so \( {\mu }_a=0\) and therefore:

$$\begin{aligned} {\mu }_a\left( \sum \limits _{\ell \in K}x^\ell _a-q_a\right) =0. \end{aligned}$$
(80)

On the other hand, if

$$\begin{aligned} x_a^k=q_a-\sum \limits _{k\ne \ell \in K}x^\ell _a, \end{aligned}$$
(81)

then,

$$\begin{aligned} \sum \limits _{\ell \in K}x^\ell _a-q_a=0, \end{aligned}$$
(82)

and so

$$\begin{aligned} {\mu }_a\left( \sum \limits _{\ell \in K}x^\ell _a-q_a\right) =0. \end{aligned}$$
(83)

Therefore,

$$\begin{aligned} {\mu }_a\left( \sum \limits _{\ell \in K}x^\ell _a-q_a\right) =0,\ \forall a\in A, \end{aligned}$$
(84)

so condition (74) is satisfied. Finally, conditions (67) and (70), for all \(k\in K\), imply conditions (72) and (75). Therefore, the vectors \(x^k,\mu \in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \), \(k\in K\), satisfy (71)–(75). Conversely, let \(x^k,\mu \in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \), \(k\in K\), satisfy (71)–(75). Then, for a fixed \(k\in K\), we have that:

$$\begin{aligned}&z+\sum \limits _{\ell \in K} D^{k,\ell }x^\ell +\mu +{B^k}^T\lambda ^k\ge 0,&\end{aligned}$$
(85)
$$\begin{aligned}&B^k x^k=b^k,&\end{aligned}$$
(86)
$$\begin{aligned}&\displaystyle x^k\le q-\sum \limits _{k\ne \ell \in K} x^\ell ,&\end{aligned}$$
(87)
$$\begin{aligned}&\mu \left( \displaystyle \sum \limits _{\ell \in K} x^\ell - q\right) =0,&\end{aligned}$$
(88)
$$\begin{aligned}&x^k,\mu \ge 0.&\end{aligned}$$
(89)

Therefore, the vectors \(x^k,\mu \in \mathbb {R}^M\) and \(\lambda ^k\in \mathbb {R}^\eta \) satisfy (66)–(70), for all \(k\in K\). \(\quad \square \)
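
The key step in the proof is the aggregation (76) of the per-player multipliers into a single multiplier for the shared capacity constraint. The following minimal sketch (hypothetical helper names; NumPy assumed) illustrates that construction and numerically checks the complementarity condition (74), assuming each pair \((x^k,\mu^k)\) already satisfies the per-player conditions (68)–(70).

```python
import numpy as np

def aggregate_multiplier(mu_k, x_k, q, tol=1e-9):
    """Illustration of (76): mu_a = max_k mu_a^k, plus the complementarity
    check (74)/(84) for the joint capacity constraint.
    mu_k and x_k are arrays of shape (|K|, M); q has shape (M,)."""
    mu = mu_k.max(axis=0)               # construction (76)
    slack = q - x_k.sum(axis=0)         # q - sum_l x^l, nonnegative by (73)
    # (74): mu_a * (sum_l x_a^l - q_a) = 0 for every arc a
    assert np.all(np.abs(mu * slack) < tol)
    return mu
```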

Appendix 2.1: The Procedure for Computing the ARSB

Let’s consider the primal quadratic programming (QP) problem:

$$\begin{aligned} \underset{x}{{\text {minimize}}}&\varphi (x)=\displaystyle c^Tx+\frac{1}{2}x^TQx,\end{aligned}$$
(90)
$$\begin{aligned} \text {subject to}& \qquad Ax=b,\end{aligned}$$
(91)
$$\begin{aligned}&x\ge 0, \end{aligned}$$
(92)

where \(Q\in \mathbb {R}^{n\times n}\) is a symmetric positive semi-definite matrix, \(A\in \mathbb {R}^{m\times n}\), \(c\in \mathbb {R}^n\) and \(b\in \mathbb {R}^m\) are fixed data, and \(x\in \mathbb {R}^n\) is the unknown vector. The Wolfe-Dual of the latter QP problem is given by:

$$\begin{aligned} \underset{u,y,s}{{\text {maximize}}}&\psi (u,y,s)=\displaystyle b^Ty-\frac{1}{2}u^TQu,\end{aligned}$$
(93)
$$\begin{aligned} \text {subject to}& \qquad A^Ty+s-Qu=c,\end{aligned}$$
(94)
$$\begin{aligned}&u,s\ge 0, \end{aligned}$$
(95)

where \(u,s\in \mathbb {R}^n\) and \(y\in \mathbb {R}^m\) are unknown vectors.

The feasible regions of (90)–(92) and (93)–(95) are denoted by \(\mathscr {QP}\) and \(\mathscr {QD}\), and their associated optimal solution sets by \(\mathscr {QP}^*\) and \(\mathscr {QD}^*\), respectively. It is well known that for any optimal solutions of (90)–(92) and (93)–(95) we have \(Qx = Qu\) and \(s^T x = 0\), which is equivalent to \(x_is_i=0\) for all \(i\in \{1,\dots ,n\}\) (since \(x,s\ge 0\)). It is obvious that there are optimal solutions with \(x = u\). Since we are only interested in the solutions where \(x = u\), u will henceforth be replaced by x in the dual problem. It is easy to show that for any two optimal solutions \((x^*,y^*,s^* )\) and \((\tilde{x},\tilde{y},\tilde{s})\) of (90)–(92) and (93)–(95) it holds that \(Qx^*=Q\tilde{x}\), \(c^T x^*=c^T \tilde{x}\) and \(b^T y^*=b^T \tilde{y}\), and consequently, \(\tilde{x}^T s^*=\tilde{s}^T x^*=0\).

The optimal partition of the index set \(\mathscr {I}=\{1,\dots ,n\}\) is defined as:

$$\begin{aligned} \mathscr {B}= & {} \{i\in \mathscr {I}\mid x_i>0\text { for an optimal solution }x\in \mathscr {QP}^*\},\end{aligned}$$
(96)
$$\begin{aligned} \mathscr {N}= & {} \{i\in \mathscr {I}\mid s_i>0\text { for an optimal solution }(x,y,s)\in \mathscr {QD}^*\},\end{aligned}$$
(97)
$$\begin{aligned} \mathscr {T}= & {} \mathscr {I}\setminus (\mathscr {B}\cup \mathscr {N}), \end{aligned}$$
(98)

and denoted by \(\pi =(\mathscr {B},\mathscr {N},\mathscr {T})\). The support set of a vector v is defined as \(\sigma (v)=\{i\in \mathscr {I}\mid v_i>0\}\). An optimal solution (xys) is called maximally complementary if it possesses the following properties:

$$\begin{aligned} x_i>0&\text {if and only if}&i\in \mathscr {B},\end{aligned}$$
(99)
$$\begin{aligned} s_i>0&\text {if and only if}&i\in \mathscr {N}. \end{aligned}$$
(100)

For any maximally complementary solution (xys) the relations \(\sigma (x)=\mathscr {B}\) and \(\sigma (s)=\mathscr {N}\) hold. The existence of a maximally complementary solution is a direct consequence of the convexity of the optimal sets \(\mathscr {QP}^*\) and \(\mathscr {QD}^*\). It is known that the interior point methods (IPM) find a maximally complementary solution as the limit solution.
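
As a small illustration, the optimal partition (96)–(98) can be read off directly from a maximally complementary pair \((x,s)\), e.g., the limit point returned by an interior-point solver. The sketch below (our own naming) assumes NumPy arrays and a tolerance for deciding strict positivity.

```python
import numpy as np

def optimal_partition(x, s, tol=1e-9):
    """Recover (B, N, T) from a maximally complementary solution:
    sigma(x) = B, sigma(s) = N, T collects the remaining indices."""
    idx = np.arange(len(x))
    B = set(idx[x > tol])
    N = set(idx[s > tol])
    T = set(idx) - B - N
    return B, N, T
```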

The perturbed QP problem is:

$$\begin{aligned} \underset{x}{{\text {minimize}}}&\varphi _\lambda (x)=\displaystyle (c+\lambda \varDelta c)^Tx+\frac{1}{2}x^TQx,\end{aligned}$$
(101)
$$\begin{aligned} \text {subject to}& \qquad \qquad Ax=b,\end{aligned}$$
(102)
$$\begin{aligned}&x\ge 0, \end{aligned}$$
(103)

where \(\varDelta c\in \mathbb {R}^n\) is a nonzero perturbation vector and \(\lambda \) is a real parameter (in our Algorithms 1 and 2, \(\varDelta c=e_i\), where \(e_i\) is the canonical basis vector corresponding to the ith arc). The optimal value function \(\phi (\lambda )\) denotes the optimal value of (101)–(103) as a function of the parameter \(\lambda \). Thus, we define the dual perturbed problem corresponding to (93)–(95) as follows:

$$\begin{aligned} \underset{x,y,s}{{\text {maximize}}}&\psi _\lambda (x,y,s)=\displaystyle b^Ty-\frac{1}{2}x^TQx,\end{aligned}$$
(104)
$$\begin{aligned} \text {subject to}& \quad A^Ty+s-Qx=c+\lambda \varDelta c,\end{aligned}$$
(105)
$$\begin{aligned}&x,s\ge 0. \end{aligned}$$
(106)

Let \(\mathscr {QP}_\lambda \) and \(\mathscr {QD}_\lambda \) denote the feasible sets of problems (101)–(103) and (104)–(106), respectively. Their optimal solution sets are analogously denoted by \(\mathscr {QP}_\lambda ^*\) and \(\mathscr {QD}_\lambda ^*\).

Let us denote the domain of \(\phi (\lambda )\) by:

$$\begin{aligned} \varLambda =\{\lambda \in \mathbb {R}\mid \mathscr {QP}_\lambda \ne \emptyset \text { and }\mathscr {QD}_\lambda \ne \emptyset \}. \end{aligned}$$
(107)

Since it is assumed that (90)–(92) and (93)–(95) have optimal solutions, it follows that \(\varLambda \ne \emptyset \).

For \(\lambda ^*\in \varLambda \), let \(\pi =\pi (\lambda ^*)\) denote the optimal partition. We introduce the following notation:

$$\begin{aligned} \mathscr {O}(\pi )= & {} \{\lambda \in \varLambda \mid \pi (\lambda )=\pi \},\end{aligned}$$
(108)
$$\begin{aligned} \mathscr {S}_\lambda (\pi )= & {} \left\{ (x,y,s)\ \Bigg \vert \ {\begin{aligned}&x\in \mathscr {QP}_\lambda ,\ (x,y,s)\in \mathscr {QD}_\lambda ,\ x_\mathscr {B}>0,\\&x_{\mathscr {N}\cup \mathscr {T}}=0,\ s_\mathscr {N}>0,\ s_{\mathscr {B}\cup \mathscr {T}}=0 \end{aligned}} \right\} ,\end{aligned}$$
(109)
$$\begin{aligned} \overline{\mathscr {S}}_\lambda (\pi )= & {} \left\{ (x,y,s)\ \Bigg \vert \ \begin{aligned}&x\in \mathscr {QP}_\lambda ,\ (x,y,s)\in \mathscr {QD}_\lambda ,\ x_\mathscr {B}\ge 0,\\&x_{\mathscr {N}\cup \mathscr {T}}=0,\ s_{\mathscr {N}}\ge 0,\ s_{\mathscr {B}\cup \mathscr {T}}=0 \end{aligned} \right\} ,\end{aligned}$$
(110)
$$\begin{aligned} \varLambda (\pi )= & {} \{\lambda \in \varLambda \mid \mathscr {S}_\lambda (\pi )\ne \emptyset \},\end{aligned}$$
(111)
$$\begin{aligned} \overline{\varLambda }(\pi )= & {} \{\lambda \in \varLambda \mid \overline{\mathscr {S}}_\lambda (\pi )\ne \emptyset \}. \end{aligned}$$
(112)

Here \(\mathscr {O}(\pi )\) denotes the set of parameter values for which the optimal partition \(\pi \) is constant. Further, \(\mathscr {S}_\lambda (\pi )\) is the primal-dual optimal solution set of maximally complementary optimal solutions of the perturbed primal and dual QP problems for the parameter value \(\lambda \in \mathscr {O}(\pi )\). Next, \(\varLambda (\pi )\) denotes the set of parameter values for which the perturbed primal and dual problems have an optimal solution (xys) such that \(\sigma (x)=\mathscr {B}\) and \(\sigma (s)=\mathscr {N}\). Finally, \(\overline{\mathscr {S}}_\lambda (\pi )\) is the closure of \(\mathscr {S}_\lambda (\pi )\) for all \(\lambda \in \varLambda (\pi )\) and \(\overline{\varLambda }(\pi )\) is the closure of \(\varLambda (\pi )\).

Theorem 2

Let \(\lambda ^*\in \varLambda (\pi )\) and let \((x^*,y^*,s^* )\) be a maximally complementary solution of (101)–(103) and (104)–(106) with the optimal partition \(\pi =(\mathscr {B},\mathscr {N},\mathscr {T})\). Then the left and right extreme points of the closed interval \(\overline{\varLambda }(\pi )=[\lambda _\ell ,\lambda _u ]\) that contains \(\lambda ^*\) are obtained by minimizing and maximizing \(\lambda \) over \(\overline{\mathscr {S}}_\lambda (\pi )\), respectively, i.e., by solving:

$$\begin{aligned} \underset{\lambda ,x,y,s}{{\text {minimize}}}&\lambda _\ell (\lambda )=\lambda ,\end{aligned}$$
(113)
$$\begin{aligned} \text {subject to}& \qquad Ax=b,\end{aligned}$$
(114)
$$\begin{aligned}&x_\mathscr {B}\ge 0,\end{aligned}$$
(115)
$$\begin{aligned}&x_{\mathscr {N}\cup \mathscr {T}}= 0,\end{aligned}$$
(116)
$$\begin{aligned}&A^Ty+s-Qx-\lambda \varDelta c=c,\end{aligned}$$
(117)
$$\begin{aligned}&s_{\mathscr {N}}\ge 0,\end{aligned}$$
(118)
$$\begin{aligned}&s_{\mathscr {B}\cup \mathscr {T}}=0, \end{aligned}$$
(119)

and

$$\begin{aligned} \underset{\lambda ,x,y,s}{{\text {maximize}}}&\lambda _u(\lambda )=\lambda ,\end{aligned}$$
(120)
$$\begin{aligned} \text {subject to}& \qquad Ax=b,\end{aligned}$$
(121)
$$\begin{aligned}&x_\mathscr {B}\ge 0,\end{aligned}$$
(122)
$$\begin{aligned}&x_{\mathscr {N}\cup \mathscr {T}}= 0,\end{aligned}$$
(123)
$$\begin{aligned}&A^Ty+s-Qx-\lambda \varDelta c=c,\end{aligned}$$
(124)
$$\begin{aligned}&s_{\mathscr {N}}\ge 0,\end{aligned}$$
(125)
$$\begin{aligned}&s_{\mathscr {B}\cup \mathscr {T}}=0. \end{aligned}$$
(126)

Appendix 2.2: The Procedure for Computing the Jacobian Matrix

Let’s consider the quadratic programming problem:

$$\begin{aligned} \underset{x=(\xi _1,\dots ,\xi _n)}{{\text {minimize}}}&\varphi (x)=\displaystyle a^Tx+\frac{1}{2}x^TBx\ \left( \sum \limits _{i=1}^n\alpha _i\xi _i+\frac{1}{2}\sum \limits _{i=1}^n\sum \limits _{j=1}^n\beta _{i,j}\xi _i\xi _j\right) ,\end{aligned}$$
(127)
$$\begin{aligned} \text {subject to}& \qquad Cx\le d\ \left( \sum \limits _{i=1}^n\gamma _{h,i}\xi _i\le \delta _h,\ h\in \{1,\dots ,t\}\right) . \end{aligned}$$
(128)

To guarantee the existence of a unique global solution, we will assume that the symmetric matrix B is positive definite (see Footnotes 6 and 7).

Suppose that we know the subset \(S\subset \{1,\dots ,t\}\) of the t constraints (128) such that, when we minimize (127) subject to the constraints belonging to S taken as exact equalities, we obtain the vector \(x^S\) that solves (127) and (128). For any set \(S\ne \emptyset \), \(x^S\) is defined as the vector minimizing (127) subject to the constraints that belong to S taken as exact equalities. The actual minimization process is carried out with the help of Lagrangians as follows. Differentiate

$$\begin{aligned} a^Tx+\frac{1}{2}x^TBx+(u^S)^T(C_Sx-d_S), \end{aligned}$$
(129)

with respect to x and \(u^S\), and equate the resulting expressions to zero, to get:

$$\begin{aligned}&a+Bx+{C_S}^Tu^S=0,&\end{aligned}$$
(130)
$$\begin{aligned}&C_Sx=d_S.&\end{aligned}$$
(131)

Solving for \(x^S\) and \(u^S\) we find consecutively:

$$\begin{aligned}&x^S=-B^{-1}a-B^{-1}{C_S}^Tu^S,&\end{aligned}$$
(132)
$$\begin{aligned}&u^S=-{(C_SB^{-1}{C_S}^T)}^{-1}(C_SB^{-1}a+d_S),&\end{aligned}$$
(133)

and hence,

$$\begin{aligned} x^S=-B^{-1}a+B^{-1}{C_S}^T{(C_SB^{-1}{C_S}^T)}^{-1}(C_SB^{-1}a+d_S). \end{aligned}$$
(134)

It is to be noticed that \({(C_SB^{-1}{C_S}^T)}^{-1}\) will always exist when \(C_S\) has full row-rank; this will be assumed. If the set S contains m elements, as we will assume throughout, this implies \(m\le n\). Notice that all expressions are continuous in the elements of a.

Quadratic programming theory provides the following necessary and sufficient conditions for \(x^S\) to solve (127) and (128):

$$\begin{aligned} x^S\text { is feasible: }Cx^S\le d, \end{aligned}$$
(135)

and

$$\begin{aligned} u^S\ge 0. \end{aligned}$$
(136)

To begin, we will exclude the case of degeneracy, which, by definition, occurs when either (135) holds with equality for a constraint not in S (in \(\overline{S}\), say, where \(S\cap \overline{S}=\emptyset \) and \(S\cup \overline{S}=\{1,\dots ,t\}\)) or (136) holds with equality for some constraint (by necessity) in S. Excluding degeneracy implies that we can rewrite conditions (135) and (136) as follows:

$$\begin{aligned}&C_Sx^S= d_S,\ C_{\overline{S}}x^S< d_{\overline{S}},&\end{aligned}$$
(137)
$$\begin{aligned}&u^S>0.&\end{aligned}$$
(138)

It will be clear that (in the absence of degeneracy) infinitesimal changes in the elements of a do not affect the set S with the property that minimizing (127) subject to \(C_S x=d_S\) produces the solution vector to problem (127) and (128). For if \(u^S>0\) originally, it will remain so under infinitesimal changes; if \(C_{\overline{S}}x^S< d_{\overline{S}}\) originally, it will remain so as well; and the row rank of \(C_S\) will remain m. On the other hand, infinitesimal changes in the elements of a will, of course, influence the solution vector \(x^S\).

These changes can be derived by differentiating (134) with respect to a. Thus, we find:

$$\begin{aligned} \frac{\partial x^S}{\partial a}=-B^{-1}+B^{-1}{C_S}^T{(C_SB^{-1}{C_S}^T)}^{-1}C_SB^{-1}, \end{aligned}$$
(139)

the desired Jacobian matrix.
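
For completeness, here is a small NumPy sketch (our own naming) of the closed-form expressions (132)–(134) and (139). It assumes, exactly as stated above, that B is symmetric positive definite and \(C_S\) has full row rank.

```python
import numpy as np

def active_set_solution_and_jacobian(a, B, C_S, d_S):
    """x^S and u^S from (132)-(134) and the Jacobian dx^S/da from (139)."""
    B_inv = np.linalg.inv(B)
    M = C_S @ B_inv @ C_S.T                                       # C_S B^{-1} C_S^T
    u_S = -np.linalg.solve(M, C_S @ B_inv @ a + d_S)              # (133)
    x_S = -B_inv @ a - B_inv @ C_S.T @ u_S                        # (132)/(134)
    J = -B_inv + B_inv @ C_S.T @ np.linalg.solve(M, C_S @ B_inv)  # (139)
    return x_S, u_S, J
```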

Appendix 3.1: The Procedure for the FF Method

Let \(u=u(t)\) be a differentiable function defined over a polyhedral set \(T\subset \mathbb {R}^n\). For simplicity, we assume that any local maximum point of this function provides a positive value.

Definition 1

Let \(\overline{t}_0,t^*\in T\) satisfy \(\overline{t}_0\ne t^*\) and \(u(\overline{t}_0)\ge (4/5)u(t^*)\). A continuously differentiable function \(Q_{t^*}=Q_{t^*}(t)\) is said to be a filled function (FF) for the maximization problem

$$\begin{aligned}&\underset{t}{{\text {maximize}}}\;\;u(t),\end{aligned}$$
(140)
$$\begin{aligned}&\text {subject to}\;\;t\in T, \end{aligned}$$
(141)

at the point \(t^*\in T\) with \(u(t^*)>0\), if:

  1. \(t^*\) is a strict local minimizer of \(Q_{t^*}=Q_{t^*}(t)\) on T.

  2. Any local maximizer \(\overline{t}\) of \(Q_{t^*}=Q_{t^*}(t)\) on T satisfies \(u(\overline{t})>(8/5)u(t^*)\), or \(\overline{t}\) is a vertex of T.

  3. Any local maximizer \(\hat{t}\) of the optimization problem (140)–(141) with \(u(\hat{t})\ge (9/5)u(t^*)\) is a local maximizer of \(Q_{t^*}=Q_{t^*}(t)\) on T.

  4. Any \(\tilde{t}\in T\) with \(\nabla Q_{t^*}(\tilde{t})=0\) satisfies \(u(\tilde{t})>(8/5)u(t^*)\).

Now, to construct a typical FF in the sense of Definition 1, define two auxiliary functions as follows.

For arbitrary \(t\) and \(t^*\in T\), denote \(b=u(t^*)>0\) and \(v=u(t)\), and define:

$$\begin{aligned} g_b(v):={\left\{ \begin{array}{ll} 0,&{}\displaystyle \text {if }v\le \frac{2}{5}b,\\ \displaystyle 5-\frac{30}{b}v+\frac{255}{4b^2}v^2-\frac{125}{4b^3}v^3,&{}\displaystyle \text {if }\frac{2}{5}b\le v\le \frac{4}{5}b,\\ 1,&{}\displaystyle \text {if }v\ge \frac{4}{5}b, \end{array}\right. } \end{aligned}$$
(142)

and

$$\begin{aligned} s_b(v):={\left\{ \begin{array}{ll} \displaystyle v-\frac{2}{5}b,&{}\displaystyle \text {if }v\le \frac{2}{5}b,\\ \begin{aligned} \displaystyle 5-\frac{8}{5}b+\left( 8-\frac{30}{b}\right) v&{}-\frac{25}{2b}\left( 1-\frac{9}{2b}\right) v^2\\ {} &{}\displaystyle +\frac{25}{4b^2}\left( 1-\frac{5}{b}\right) v^3, \end{aligned} &{}\text {if }\frac{2}{5}b\le v\le \frac{4}{5}b,\\ 1,&{}\displaystyle \text {if }\frac{4}{5}b\le v\le \frac{8}{5}b,\\ \displaystyle 1217-\frac{2160}{b}v+\frac{1275}{b^2}v^2-\frac{250}{b^3}v^3,&{}\displaystyle \text {if }\frac{8}{5}b\le v\le \frac{9}{5}b,\\ 2,&{}\displaystyle \text {if }v\ge \frac{9}{5}b. \end{array}\right. } \end{aligned}$$
(143)

Now, given a point \(t^*\in T\) such that \(u(t^*)>0\) we define the following FF:

$$\begin{aligned} Q_{\rho ,t^*}(t):=-\exp (-{\Vert t-t^*\Vert }^2)g_{\frac{2}{5}u(t^*)}(u(t))-\rho s_{\frac{2}{5}u(t^*)}(u(t)), \end{aligned}$$
(144)

where \(\rho >0\) is a parameter (see Footnote 8).
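
For reference, the sketch below evaluates the FF (144) from the auxiliary functions (142)–(143); the polynomial coefficients are transcribed verbatim from the formulas above, and the function and parameter names are our own.

```python
import numpy as np

def g(b, v):
    # auxiliary function (142), coefficients as printed above
    if v <= 2 * b / 5:
        return 0.0
    if v <= 4 * b / 5:
        return 5 - 30 * v / b + 255 * v**2 / (4 * b**2) - 125 * v**3 / (4 * b**3)
    return 1.0

def s(b, v):
    # auxiliary function (143), coefficients as printed above
    if v <= 2 * b / 5:
        return v - 2 * b / 5
    if v <= 4 * b / 5:
        return (5 - 8 * b / 5 + (8 - 30 / b) * v
                - 25 / (2 * b) * (1 - 9 / (2 * b)) * v**2
                + 25 / (4 * b**2) * (1 - 5 / b) * v**3)
    if v <= 8 * b / 5:
        return 1.0
    if v <= 9 * b / 5:
        return 1217 - 2160 * v / b + 1275 * v**2 / b**2 - 250 * v**3 / b**3
    return 2.0

def filled_function(t, t_star, u, rho):
    """The FF (144): Gaussian factor around t_star plus the rho-weighted s-term."""
    b = 2 * u(t_star) / 5
    v = u(t)
    return -np.exp(-np.linalg.norm(t - t_star)**2) * g(b, v) - rho * s(b, v)
```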

Based on Wu et al. [25] we have the following theorem:

Theorem 3

Assume that the function \(u=u(t)\) is continuously differentiable and there exists a polyhedron \(T\subset \mathbb {R}^n\) and a point \(t_0\in T\) such that \(u(t)\le (4/5)u(t_0)\) for any \(t\in \mathbb {R}^n\setminus {\text {Int}}(T)\). Let \(\overline{t}_0,t^*\in T\), \(\overline{t}_0\ne t^*\), satisfy the inequality \(u(t^*)-u(\overline{t}_0)\le (2/5)u(t^*)\). Then:

  1.

    There exists a value \(\rho _{t^*}^1\ge 0\) such that when \(\rho >\rho _{t^*}^1\), any local maximizer \(\overline{t}\) of the problem

    $$\begin{aligned} \underset{t}{{\text {maximize}}}&Q_{\rho ,t^*}(t),\end{aligned}$$
    (145)
    $$\begin{aligned} \text {subject to}& \quad t\in T, \end{aligned}$$
    (146)

    obtained via the search starting from \(\overline{t}_0\), satisfies \(\overline{t}\in {\text {Int}}(T)\).

  2.

    There exists a value \(\rho _{t^*}^2>0\) such that whenever \(0<\rho <\rho _{t^*}^2\), any stationary point \(\tilde{t}\in T\) with \(\tilde{t}\ne t^*\) of the function \(Q_{\rho ,t^*}(t)\) satisfies the following estimate:

    $$\begin{aligned} u(\tilde{t})>\frac{8}{5}u(t^*). \end{aligned}$$
    (147)

Appendix 3.2: The Benchmark Algorithms to Compare With

The Derivative-Free Quasi-Newton Algorithm

Step 0: Define \(e=\{e_a\mid a\in A_1\}\) as the set of canonical basis vectors. Let \(\tau ,\varepsilon >0\) and \(j=0\). Set an arbitrary toll vector \(t^j\) and minimize the objective function f(x) of the lower level quadratic programming problem (10)–(14) in order to obtain the optimal response \(x(t^j)\), and compute the value of the leader's objective function \(\varPsi (t^j,x(t^j))=F(t^j,x(t^j))\).

Step 1: For the toll variables compute the following approximation

$$\begin{aligned} \varphi _a^j=\frac{\partial \varPsi }{\partial t_a}(t^j,x(t^j))\approx \frac{\varPsi (t^j+e_a\tau ,x(t^j+e_a\tau ))-\varPsi (t^j-e_a\tau ,x(t^j-e_a\tau ))}{2\tau }, \end{aligned}$$
(148)

where \(a\in A_1\). Now, obtain the approximation of the gradient vector as follows:

$$\begin{aligned} \nabla \varPsi (t^j,x(t^j))\approx \begin{pmatrix} \varphi _1^j\\ \varphi _2^j\\ \vdots \\ \varphi _{M_1}^j \end{pmatrix} =\phi _j. \end{aligned}$$
(149)

Step 2: For \(j=0\), set \(B_j\) to the \(M_1\times M_1\) identity matrix and compute \(s_j=B_j\phi _j\) as the search direction at the current iteration. With i as a counter starting from \(i=0\), set the step size \(\alpha _i=1\) and compute \(\varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\) in order to obtain the best \(\alpha _i\) value. In the case when \(j>0\), \(B_j\) is computed as specified in Step 5.

Step 3: We can separate this step into two stages:

Stage 1: If \(\varPsi (t^j,x(t^j))+\varepsilon \alpha _i\phi ^Ts_j<\varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\), then, starting from \(\alpha _i=1\), increase the step size by \(\alpha _{i+1}=1.5\alpha _i\). Keep increasing \(\alpha _i\) (with \(i:=i+1\)) until \(\varPsi (t^j,x(t^j))+\varepsilon \alpha _i\phi ^Ts_j\ge \varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\) or \(t^j+\alpha _is_j\ge t^{\max }\); select the penultimate \(\alpha _i\) value as the best one, that is, for \(i:=i-1\) compute \(t^{j+1}:=t^j+\alpha _is_j\). Go to Step 4.

Stage 2: Otherwise, i.e., if for \(\alpha _i=1\) the inequality \(\varPsi (t^j,x(t^j))+\varepsilon \alpha _i\phi ^Ts_j\ge \varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\) holds, decrease \(\alpha _i\) by \(\alpha _{i+1}=\alpha _i/1.5\), compute \(\varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\), and keep decreasing \(\alpha _i\) until the desired inequality \(\varPsi (t^j,x(t^j))+\varepsilon \alpha _i\phi ^Ts_j<\varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\) is achieved. Under this scheme, select the last \(\alpha _i\) value as the best one.

Step 4: Using the values \(t^{j+1}:=t^j+\alpha _is_j\), \(x(t^{j+1})=x(t^j+\alpha _is_j)\), and \(\varPsi (t^{j+1},x(t^{j+1}))=\varPsi (t^j+\alpha _is_j,x(t^j+\alpha _is_j))\) find the approximation to the gradient for \(t^{j+1}\) as in Step 1, that is, \(\phi _{j+1}\approx \nabla \varPsi (t^{j+1},x(t^{j+1}))\) and compute

$$\begin{aligned} d_j=t^{j+1}-t^j, \end{aligned}$$
(150)
$$\begin{aligned} y_j=\phi _{j+1}-\phi _j, \end{aligned}$$
(151)

and

$$\begin{aligned} \begin{aligned} \lambda _j=\varPsi (t^{j+1}-d_j,x(t^{j+1}-d_j))&+\varPsi (t^{j+1}+d_j,x(t^{j+1}+d_j))\\ {}&-2\varPsi (t^{j+1},x(t^{j+1})). \end{aligned} \end{aligned}$$
(152)

Step 5: Finally, determine the updated matrix

$$\begin{aligned} B_{j+1}=B_j-\frac{B_jd_j{d_j}^TB_j}{{d_j}^TB_jd_j}+\frac{\lambda _jy_j{y_j}^T}{{({d_j}^Ty_j)}^2}, \end{aligned}$$
(153)

and use it to find the next direction. Update iteration counter j as \(j:=j+1\) and go to Step 2. Keep iterating until

$$\begin{aligned} \Vert \varPsi (t^{j+1},x(t^{j+1}))-\varPsi (t^{j},x(t^{j}))\Vert \le \varepsilon . \end{aligned}$$
(154)

Select \(t^{j+1}\) and \(x(t^{j+1})\) as the approximate toll and flow solution vectors for the TOP, and the objective function value \(\varPsi (t^{j+1},x(t^{j+1}))\) as an acceptable solution.
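
The rank-two update (153) is the only nonstandard ingredient of the scheme; a minimal NumPy sketch (our own naming) is:

```python
import numpy as np

def update_matrix(B_j, d_j, y_j, lam_j):
    """Matrix update (153) producing B_{j+1} for the next search direction."""
    Bd = B_j @ d_j
    return (B_j
            - np.outer(Bd, Bd) / (d_j @ Bd)
            + lam_j * np.outer(y_j, y_j) / (d_j @ y_j) ** 2)
```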

The Sharpest-Ascent Algorithm

In this algorithm, we make use again of the Jacobian matrix dx/dt, so we require again that the matrix \(D=\{d^{k,\ell }_{a,e}\mid a,e\in A;\ k,\ell \in K\}\) is positive definite.

Step 0: Let \(\delta , \varepsilon >0\) and \(j=0\). Set an arbitrary toll vector \(t^j\) and minimize the lower level quadratic programming problem (10)–(14), in order to obtain the optimal response \(x(t^j)\). Compute the leader’s objective function \(\varPsi (t^j,x(t^j))=F(t^j,x(t^j))\).

Step 1: For the toll variables, using the Jacobian matrix dx/dt, compute the partial derivatives

$$\begin{aligned} \frac{\partial \varPsi }{\partial t_a}(t^j,x(t^j))=\sum \limits _{k\in K}\left( x_a^k(t^j)+t^j\cdot \frac{d x_a^k}{d t}(c+t^j)\right) , \end{aligned}$$
(155)

where \(a\in A_1\). Now, obtain the objective function’s gradient:

$$\begin{aligned} \phi _j=\nabla \varPsi (t^j,x(t^j))= \begin{pmatrix} \displaystyle \frac{\partial \varPsi }{\partial t_1}(t^j,x(t^j))\\ \displaystyle \frac{\partial \varPsi }{\partial t_2}(t^j,x(t^j))\\ \vdots \\ \displaystyle \frac{\partial \varPsi }{\partial t_{M_1}}(t^j,x(t^j)) \end{pmatrix}. \end{aligned}$$
(156)
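
As an illustration of Step 1, the sketch below assembles (155)–(156) from per-commodity flows and Jacobians; the array layout (x_k of shape (|K|, M1) holding the toll-arc flows, dx_dt of shape (|K|, M1, M1) with entry [k, a, e] equal to the derivative of \(x_a^k\) with respect to \(t_e\), obtained as in Appendix 2.2) is our own assumption.

```python
import numpy as np

def leader_gradient(t, x_k, dx_dt):
    """Partial derivatives (155) stacked into the gradient (156)."""
    K, M1 = x_k.shape
    grad = np.zeros(M1)
    for a in range(M1):
        grad[a] = sum(x_k[k, a] + dx_dt[k, a, :] @ t for k in range(K))
    return grad
```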

Step 2: Starting from \(j=0\), assign \(d_j=\phi _j\) as the search direction at the current iteration, set i as a counter starting from \(i=0\), establish \(\alpha _i=1\) as the step size, and compute \(\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\) in order to obtain the best step size (i.e., the best \(\alpha _i\) value).

Step 3: First, compare the expressions \(\varPsi (t^j,x(t^j))+\delta \alpha _i\phi ^Td_j\) and \(\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\). If the following inequality does not hold, go directly to Step 4; if it is valid, continue within this step:

$$\begin{aligned} \varPsi (t^j,x(t^j))+\delta \alpha _i\phi ^Td_j<\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j)). \end{aligned}$$
(157)

Starting from \(\alpha _i=1\), increase its value by \(\alpha _{i+1}=1.5\alpha _i\). Keep increasing \(\alpha _i\) (with \(i:=i+1\)) until \(\varPsi (t^j,x(t^j))+\delta \alpha _i\phi ^Td_j\ge \varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\) or \(t^j+\alpha _id_j\ge t^{\max }\); select the penultimate \(\alpha _i\) value as the best one, that is, set \(i:=i-1\). Go to Step 5.

Step 4: In this case, with \(\alpha _i=1\) and the inequality \(\varPsi (t^j,x(t^j))+\delta \alpha _i\phi ^Td_j\ge \varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\) holding, decrease \(\alpha _i\) by \(\alpha _{i+1}=\alpha _i/1.5\), compute \(\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\), and keep decreasing \(\alpha _i\) until the following inequality becomes valid: \(\varPsi (t^j,x(t^j))+\delta \alpha _i\phi ^Td_j<\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\).

Step 5: Consider the values \(t^{j+1}=t^j+\alpha _id_j\), \(x(t^{j+1})=x(t^j+\alpha _id_j)\), and \(\varPsi (t^{j+1},x(t^{j+1}))=\varPsi (t^j+\alpha _id_j,x(t^j+\alpha _id_j))\) as the current ones and return to Step 1. Keep iterating until

$$\begin{aligned} \Vert \varPsi (t^{j+1},x(t^{j+1}))-\varPsi (t^{j},x(t^{j}))\Vert \le \varepsilon . \end{aligned}$$
(158)

Conclude by selecting the vectors \(t^{j+1}\) and \(x(t^{j+1})\) as the approximate toll and flow solution vectors for the TOP, and the objective function value \(\varPsi (t^{j+1},x(t^{j+1}))\) as an acceptable solution.

The Nelder-Mead Algorithm

Unlike the previous algorithms, the Nelder-Mead algorithm is intended for global optimization. First, note that we are interested in solving the following problem:

$$\begin{aligned} \underset{x}{{\text {maximize}}}&f(x),\end{aligned}$$
(159)
$$\begin{aligned} \text {subject to}&x\in \mathbb {R}^n, \end{aligned}$$
(160)

where \(f:\mathbb {R}^n\rightarrow \mathbb {R}\) is not necessarily continuous (see Footnote 9). Each iteration starts with a non-degenerate simplex in \(\mathbb {R}^n\) and finishes with another simplex in \(\mathbb {R}^n\) different from the previous one. A non-degenerate simplex in \(\mathbb {R}^n\) is the convex polyhedron formed by \(n+1\) non-coplanar points \(x_1,x_2,\dots ,x_{n+1}\in \mathbb {R}^n\), that is, points that do not all lie on the same hyperplane of \(\mathbb {R}^n\). Let us suppose that the vertices of the initial simplex are ordered so that:

$$\begin{aligned} f_1\ge f_2\ge \dots \ge f_{n+1}, \end{aligned}$$
(161)

where \(f_i=f(x_i)\), \(i\in \{1,2,\dots ,n+1\}\).

Since we are maximizing f, we consider \(x_1\) as the best vertex and \(x_{n+1}\) as the worst. We define the diameter of a simplex S as

$$\begin{aligned} {\text {diam}}(S)=\max \limits _{1\le i,j\le n+1}\Vert x_i-x_j\Vert . \end{aligned}$$
(162)

The parameters \(\rho \), \(\delta \), \(\gamma \) and \(\sigma \) are used at each iteration and must satisfy that:

$$\begin{aligned} \delta >1,\ 0<\rho<\delta ,\ 0<\gamma<1,\text { and }0<\sigma <1. \end{aligned}$$
(163)

The default values commonly used are:

$$\begin{aligned} \rho =1,\ \delta =2,\ \gamma =\frac{1}{2},\text { and }\sigma =\frac{1}{2}. \end{aligned}$$
(164)

The kth iteration of the Nelder-Mead algorithm is described as follows:

Step 1 (Sort): Order the \(n+1\) vertices of the simplex as in (161).

Step 2 (Reflect): Calculate the centroid of the n best points:

$$\begin{aligned} \hat{x}=\sum \limits _{i=1}^n\frac{x_i}{n}. \end{aligned}$$
(165)

Compute the reflection point:

$$\begin{aligned} x_r=\hat{x}+\rho (\hat{x}-x_{n+1})=(1+\rho )\hat{x}-\rho x_{n+1}. \end{aligned}$$
(166)

Calculate \(f_r=f(x_r)\). If \(f_1\ge f_r>f_{n}\), accept \(x_r\) as the new simplex vertex, eliminate the worst vertex and finish the iteration.

Step 3 (Expand): If \(f_r>f_1\) compute the expansion point

$$\begin{aligned} x_e=\hat{x}+\delta (x_r-\hat{x})=\hat{x}+\rho \delta (\hat{x}-x_{n+1})=(1+\rho \delta )\hat{x}-\rho \delta x_{n+1}, \end{aligned}$$
(167)

and evaluate \(f_e=f(x_e)\). If \(f_e>f_r\) accept \(x_e\), eliminate the worst vertex and finish the iteration. In the other case (\(f_e\le f_r\)), accept \(x_r\), eliminate the worst vertex and finish the iteration.

Step 4 (Contract): If \(f_r\le f_n\), perform a contraction between \(\hat{x}\) and the better of \(x_{n+1}\) and \(x_r\).

4.a (External contraction): If \(f_n\ge f_r\ge f_{n+1}\), calculate

$$\begin{aligned} x_{ec}=\hat{x}+\gamma (x_r-\hat{x})=\hat{x}+\rho \gamma (\hat{x}-x_{n+1})=(1+\rho \gamma )\hat{x}-\rho \gamma x_{n+1}, \end{aligned}$$
(168)

and evaluate \(f_{ec}=f(x_{ec})\). If \(f_{ec}\ge f_r\), accept \(x_{ec}\), eliminate the worst vertex and finish the iteration. In the other case, go to Step 5.

4.b (Internal contraction): If \(f_r\le f_{n+1}\), calculate

$$\begin{aligned} x_{ic}=\hat{x}-\gamma (\hat{x}-x_{n+1})=(1-\gamma )\hat{x}+\gamma x_{n+1}, \end{aligned}$$
(169)

and evaluate \(f_{ic}=f(x_{ic})\). If \(f_{ic}\ge f_{n+1}\), accept \(x_{ic}\), eliminate the worst vertex and finish the iteration. In the other case, go to Step 5.

Step 5 (Shrink): Evaluate f at the n points \(y_i=x_1+\sigma (x_i-x_1)\), \(i\in \{2,3,\dots ,n+1\}\). The vertices of the simplex for the next iteration will be \(x_1,y_2,\dots ,y_{n+1}\).

The algorithm stops when \({\text {diam}}(S)<\varepsilon \), for some \(\varepsilon >0\), and \(x_1\) is taken as the best point for f.
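
In practice these steps need not be coded by hand: SciPy ships a Nelder-Mead implementation. Since it minimizes, the leader's objective is negated; the wrapper Psi(t) below is the same hypothetical routine as before, evaluating the upper-level objective after re-solving the lower-level QP.

```python
from scipy.optimize import minimize

def maximize_nelder_mead(Psi, t0, eps=1e-6):
    """Maximize Psi by minimizing -Psi with scipy's Nelder-Mead;
    eps plays the role of the stopping tolerance on the simplex."""
    res = minimize(lambda t: -Psi(t), t0, method="Nelder-Mead",
                   options={"xatol": eps, "fatol": eps})
    return res.x, -res.fun
```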


Copyright information

© 2020 Springer Nature Switzerland AG


Cite this chapter

Kalashnikov, V., Flores Muñiz, J.G., Kalashnykova, N. (2020). Bilevel Optimal Tolls Problems with Nonlinear Costs: A Heuristic Solution Method. In: Kosheleva, O., Shary, S., Xiang, G., Zapatrin, R. (eds) Beyond Traditional Probabilistic Data Processing Techniques: Interval, Fuzzy etc. Methods and Their Applications. Studies in Computational Intelligence, vol 835. Springer, Cham. https://doi.org/10.1007/978-3-030-31041-7_28
