1 Introduction

Considered here is a parametrized family of primal problems

$$\begin{aligned} P(y):\text { minimize }c_{0}(x)\text { subject to }c_{i}(x)\le y_{i},\text { }i\in I. \end{aligned}$$
(1)

Each function \(c_{i},\) \(i\in {\mathcal {I}}:=\left\{ 0\right\} \cup I,\) maps the same general space \({\mathbb {X}}\) into the set \({\mathbb {R}}\) of reals. The non-empty index ensemble (or list) I is finite, and \(0\notin I.\)

For interpretation, let \(x\in {\mathbb {X}}\) denote some activity choice and \(y_{i}\in {\mathbb {R}}\) be the quantity available of input \( i\in I\). Regard the parameter vector \(y:=(y_{i})_{i\in I}\in {\mathbb {Y}}:={\mathbb {R}}^{I}\) as a bundle of allowances, commodities or production factors, added to some already given endowment \(y^{0}=(y_{i}^{0})\) . With no loss of generality, posit \(y^{0}={\mathbf {0}}\).

Also for interpretation, choice x generates monetary revenue \( r(x):=-c_{0}(x),\) and it entails consumption \(c_{i}(x)\) of resource i. Suppose the latter be linearly valued by some unit price \( \lambda _{i}\ge 0\). Consequently, upon facing problem instance P(0), the decision-maker seeks to maximize own profit

$$\begin{aligned} \pi (x,\lambda ):=r(x)-\sum _{i\in I}\lambda _{i}c_{i}(x). \end{aligned}$$
(2)

Equivalently, he attempts to minimize the standard Lagrangian

$$\begin{aligned} L(x,\lambda ):=c_{0}(x)+\sum _{i\in I}\lambda _{i}c_{i}(x)=-\pi (x,\lambda ). \end{aligned}$$
(3)

Can Lagrangian (3) facilitate computation of optima? And then, seen as prices, are multipliers bounded?

Many mathematicians, including optimizers, know that these two questions are closely connected. Further, most economists consider Lagrange multipliers as prices, emerging somehow as shadows. Finally, every game theorist regards Lagrangian duality as a non-cooperative, two-player game.

Thus, three different fields shed light on Lagrangian (3), each offering a particular perspective. Historically, mathematics and mechanics came first, and these remain at the forefront. Economics and game theory appeared later.Footnote 1 Turning the said time order around, this paper inquires: May economic and game theory inform about eventual boundedness or existence of multipliers - and about their interpretation or nature?

For background and motivation - and no less important: to fix some notations - we begin with two examples. Neither offers any novelty, hence both can be skipped. Each has \({\mathbb {X}}={\mathbb {Y}}={\mathbb {R}}\) and \(I=\left\{ 1\right\} \).

Example 1.1

(on differential and topological instability). When \(c_{0}(x)=x\) and \(c_{1}(x)=x^{2}\), problem P(0) has unique solution \(x=0.\) However, the customary “optimality condition” \(\partial _{x}L(0,\lambda )=0\) implies the absurdity \(1=0\). What went wrong here? Regarding regularity of data, note that:

  • The functions \(c_{0}\) and \(c_{1}\) are both convex and differentiable; one can hardly ask for more.

  • Further, when \(y>0,\) problem P(y) has a feasible set

    $$\begin{aligned} X(y):=\left\{ x\in {\mathbb {X}}\text { }\left| \text { }c_{1}(x)\le y\right. \right\} \end{aligned}$$
    (4)

    which is bounded, closed, convex, with non-empty interior - each property being both convenient and desirable.

  • Yet, modulo the convention \(\inf \varnothing =+\infty ,\) the perturbed best cost function

    $$\begin{aligned} y\in {\mathbb {Y}}\mapsto c(y):=\inf \left\{ c_{0}(x)\text { }\left| \text { } c_{1}(x)\le y\right. \right\} \in {\mathbb {R}}\cup \left\{ +\infty \right\} , \end{aligned}$$
    (5)

    which equals \(-y^{1/2}\) when \(y\ge 0\), \(+\infty \) elsewhere, has empty subdifferential

    $$\begin{aligned} \partial c(y):=\left\{ y^{*}\text { }\left| \text { }c({\hat{y}})\ge c(y)+y^{*}(\hat{y}-y)\text { for each }\hat{y}\in {\mathbb {Y}}\right. \right\} \end{aligned}$$
    (6)

    at \(y\le 0.\) Moreover, there is differential instability at 0 since \(\partial c(y)=-y^{-1/2}/2\rightarrow -\infty \) as \(y\searrow 0,\) but \( \partial c(0)=\varnothing .\)

  • Also, as \(y\rightarrow 0^{+},\) the set X(y) (4) shrinks monotonically to X(0), but no continuous mapping sends the set X(y),  \( y>0 \), in one-to-one manner onto X(0); there is topological instability at \(y=0.\)

  • Even the extended Lagrangian

    $$\begin{aligned} L_{0}(x,\lambda ):=\lambda _{0}c_{0}(x)+\sum _{i\in I}\lambda _{i}c_{i}(x), \text { with } \lambda =(\lambda _{i})_{i\in {\mathcal {I}}}\ge 0, \end{aligned}$$
    (7)

    offers no rescue. Indeed, \(\partial _{x}L_{0}(0,\lambda )=0\) entails \( \lambda _{0}=0\) alongside the totally empty and useless information that \( \lambda _{1}c_{1}^{\prime }(0)=0.\)

In short: neither the standard Lagrangian nor its extension helps here. Yet, sometimes the first serves to clarify whether the problem at hand is well posed or not - as illustrated next.
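
Before turning to that illustration, the instability in Example 1.1 can also be seen numerically. The following minimal sketch (Python, assuming NumPy is available; it is not part of the formal development) tabulates the best cost function \(c(y)=-\sqrt{y}\) and its derivative as \(y\searrow 0\):

```python
import numpy as np

# Example 1.1: c0(x) = x, c1(x) = x^2, so c(y) = -sqrt(y) for y >= 0, +infinity otherwise.
def value_function(y):
    return -np.sqrt(y) if y >= 0 else np.inf

def derivative(y):
    # c'(y) = -1 / (2 sqrt(y)) for y > 0; the subdifferential at y = 0 is empty.
    return -0.5 / np.sqrt(y)

for y in [1.0, 1e-2, 1e-4, 1e-8]:
    print(f"y = {y:8.1e}   c(y) = {value_function(y):+.4f}   c'(y) = {derivative(y):+.2e}")
# The derivative tends to -infinity as y -> 0+, so no finite multiplier exists at y = 0.
```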

Example 1.2

(On eventual purchase of input). Let \( c_{0}(x)=\exp (-x)\) and \(c_{1}(x)=-x.\) No feasible x and \(\lambda \ge 0\) satisfy \(\frac{\partial }{\partial x}L(x,\lambda )=0.\) In fact, instance P(0) cannot be solved. What is lacking is an upper bound on x or a strictly positive lower bound on \(\lambda \).

Suppose the ineffective constraint be replaced with one which bites, namely: \(x\le y\) for some \(y>0.\) Further suppose the single input sells - in some exogenous market - at unit price \(y^{*}>0.\) Then, if the directional derivative \(c^{\prime }(y;1):=\lim _{s\rightarrow 0^{+}}[c(y+s1)-c(y)]/s\) satisfies

$$\begin{aligned} c^{\prime }(y;1)+y^{*}<0, \text { or more generally, if }y^{*}\notin -\partial c(y), \end{aligned}$$

some more input ought be purchased; see Lemma 4.1 or Sect. 5.
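
To make the purchase criterion of Example 1.2 concrete, here is a small illustrative sketch (Python with NumPy; the exogenous price \(y^{*}=0.3\) and the step size are made-up numbers). With the binding constraint \(x\le y\), the best cost is \(c(y)=\exp (-y)\), so \(c^{\prime }(y;1)=-\exp (-y)\), and further purchase pays exactly while \(y^{*}<\exp (-y)\):

```python
import numpy as np

# Example 1.2 with the binding constraint x <= y:
#   c(y) = min{exp(-x) : x <= y} = exp(-y),  c'(y; 1) = -exp(-y).
y_star = 0.3          # hypothetical exogenous unit price of the input
y = 0.0               # current endowment
while np.exp(-y) > y_star:      # purchase criterion c'(y; 1) + y_star < 0
    y += 0.1                    # buy a small additional amount
print(f"purchases stop near y = {y:.1f}; the no-trade point is -ln(y*) = {-np.log(y_star):.3f}")
```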

Applicability of Lagrangian (3) requires, of course, existence of at least one multiplier (vector). Moreover, these had better be bounded. Viewing these matters from the vantage points of economic and game theory, we proceed as follows:

Section 2 prepares the ground by valuing activity choice \(x\in {\mathbb {X}}\) via shadow prices \(y^{*}\) on resource bundles \(y\in {\mathbb {Y}}\).

Section 3 links single agent’s best choice to arbitrage - and to multi-agent or multi-objective efficiency.

Section 4 views Lagrangian relaxation in economic terms.

Section 5 considers options for (resource) trade alongside eventual arbitrage.

Section 6 concludes by connecting bounded multipliers, via generalized convexity, to game theory and existence of saddle values .

Our motivation stems from frequent needs, in communication or didactics, to interpret Lagrangian duality in economic or game terms - and to offer more on that account than what is common. While valuation is bread and butter in economics LeRoy and Werner (2001), game theory Osborne and Rubinstein (1994) rather studies strategic behavior. But both fields accommodate several decision-makers.

Accordingly, we address diverse readers. Included are economists, game theorists, mathematicians and operations researchers. Apologies extend to economists, and others, for accommodating diverse decision spaces. For convenience, the reader may hold on to a Euclidean space \({\mathbb {X}}\) throughout. Mathematical novelties and prerequisites are few. Likewise, we beg mathematicians' pardon for invoking many economic concepts - not all essential, but they facilitate interpretation.Footnote 2

Our modest aim is to emphasize some easily and frequently overlooked bridges between distinct fields. In addition, we want to facilitate students’ appreciation of–and encounters with–Lagrangian (3).

Notations Write \(c_{1}(x):=[c_{i}(x)]_{i\in I}\), and order \( {\mathbb {Y}}={\mathbb {R}}^{I}\) in customary component-wise manner, to restate problem (1) as

$$\begin{aligned} P(y):\text { minimize }c_{0}(x)\text { subject to }c_{1}(x)\le y. \end{aligned}$$
(8)

By standing assumption: P(0) (8) is feasible with finite best cost c(0) (5).

Until other notice, let \({\mathbb {X}}\) be any real vector space of whatever dimension, and suppose each function \(c_{i}:{\mathbb {X}}\rightarrow {\mathbb {R}}\), \(i\in {\mathcal {I}}=\left\{ 0\right\} \cup I,\) is Gâteaux differentiable–in short, just differentiable. This means that the derivative

$$\begin{aligned} c_{i}^{\prime }(x;d):=\lim _{s\rightarrow 0^{+}}\frac{c_{i}(x+sd)-c_{i}(x)}{s} \end{aligned}$$

exists along each direction \(d\in {\mathbb {X}}\) and is linear in that variable. On this premise, for notational convenience, we write \(c_{i}^{\prime }(x)\) for the functional \(d\in {{\mathbb {X}}}\mapsto c_{i}^{\prime }(x;d)=:c_{i}^{\prime }(x)d\in {{\mathbb {R}}}.\)

When \(c^{\prime }(0;d)\) is well defined for value function (5), clearly, \(d\ge 0\Longrightarrow c^{\prime }(0;d)\le 0.\) Of particular interest then are \(c^{\prime }(0;\pm e_{i})\) for unit vectors \( e_{i}=(0,..,0,1,0,..)\in {\mathbb {Y}}\), the instance \(+\) (resp. −) reflecting a monetary bid (resp. ask) for buying (resp. selling) marginal amounts of resource i.

As long as \({\mathbb {X}}\) remains linear, we use \(x^{*}\in {\mathbb {X}} ^{*}\) as shorthand to signify a linear function \(x\in {\mathbb {X}}\mapsto x^{*}(x)=:x^{*}x\in {\mathbb {R}}\). If moreover, \({\mathbb {X}}\) is topological and locally convex, take each such \(x^{*}\) to be continuous as well.

Similarly, the notation \(y^{*}\in {\mathbb {Y}}^{*}\) should be construed as some price regime, free of arbitrage, hence linear LeRoy and Werner (2001), which values resource use \(y\in {\mathbb {Y}}\). We write \( y^{*}\ge 0\) to indicate a linear function \(y^{*}:{\mathbb {Y}}\rightarrow {\mathbb {R}}\) such that \(y^{*}y:=y^{*}\cdot y\ge 0\) whenever \( y\ge 0.\)

A larger (resource endowment) y expands the feasible set (4), thereby offering more freedom of choice. Accordingly, in view of the decision-maker’s willingness to pay for greater flexibility, assume \(y\le \hat{y}\Longrightarrow y^{*}y\le y^{*}\hat{y}\). So, quite naturally, \(y^{*}\ge 0.\)Footnote 3

As one might expect, when \({\mathbb {X}}\) is linear, \(x^{*}\) and \(y^{*}\) will connect to local linearizations of \(c_{0}\) and \(c_{1}\), respectively, at some point \(x\in {\mathbb {X}}\) of special notice. This feature is taken up first.

2 Shadow pricing

Suppose some \(y^{*}\in {\mathbb {Y}}^{*}\) valuates any resource bundle \( y\in {\mathbb {Y}}\). Construed as “price vector”, \(y^{*}\) might derive endogenously from the data of problem P(0) - like a shadow (see Proposition 4.2). In that case, \(y^{*}\) may attest to local optimality of x (see Proposition 3.2). Alternatively, \(y^{*}\) could be given exogenously (see Proposition 5.1 and its corollary). Anyway, for now, just imagine that whatever \(y\in {\mathbb {Y}}\) can be “bought” at linear cost \(y^{*}y\in {\mathbb {R}}.\)

Until other notice, let \({\mathbb {X}}\) be any real linear space, maybe of infinite dimension. It's helpful then to couch some arguments as though they derive from linear programming. So, construe \(x\in {\mathbb {X}}\) as some activity plan, recorded in coordinate-free manner. For any linear functionals \(a_{i}\in {\mathbb {X}}^{*},i\in I,\) the operator \(A:=\left[ a_{i}\right] \) - viewed as “matrix with \(a_{i}\) in row i”Footnote 4 - maps activity space \({\mathbb {X}}\) linearly into resource space \({\mathbb {Y}}\), by

$$\begin{aligned} \chi \in {\mathbb {X}}\mapsto A\chi :=[a_{i}\chi ]_{i\in I}\in {\mathbb {Y}}. \end{aligned}$$
(9)

Conversely, the adjoint or transposed operator \(A^{*}\) maps any resource price \(y^{*}\in {\mathbb {Y}}^{*}\) into a corresponding activity price \(x^{*}=A^{*}y^{*}:=y^{*}A\in {\mathbb {X}}^{*}\) by

$$\begin{aligned} \chi \in {\mathbb {X}}\mapsto x^{*}\chi =(A^{*}y^{*})\chi :=y^{*}(A\chi )\in {\mathbb {R}}; \end{aligned}$$
(10)

see Example 4.1.

Now, as long as \({\mathbb {X}}\) remains linear, and \(c_{i}\) is Gâteaux differentiable, let \(a_{i}=c_{i}^{\prime }(x)\) denote its derivative at some fixed \(x\in {\mathbb {X}}\), and view \(A=[c_{i}^{\prime }(x)]_{i\in I}=:c_{1}^{\prime }(x)\) as a linear operator from \({\mathbb {X}}\) to \({\mathbb {Y}} \) (9).

Two prices \(x^{*}=r^{\prime }(x)\in {\mathbb {X}}^{*}\) and \(y^{*}=\lambda =(\lambda _{i})\in {\mathbb {Y}}_{+}^{*}\) might then “rationalize” choice x if \(x^{*}=A^{*}y^{*}=y^{*}A,\) meaning \(r^{\prime }(x)=\lambda c_{1}^{\prime }(x).\) That is, \(\lambda \) tests x for local optimality in so far as marginal revenues should equal marginal costs. Put differently: all profit margins must be nil in that \(\partial _{x}\pi (x,\lambda )=0\) (2). Equivalently, by (3), \( \partial _{x}L(x,\lambda )=0\).

Along the same line, declare factor i scarce, and the constraint \( c_{i}(x)\le 0\) active or binding, iff \(c_{i}(x)=0\) - as is signalled by writing \(i\in I(x).\) In economic, formal or intuitive terms, \(i\notin I(x)\Longleftrightarrow c_{i}(x)<0 \Longleftrightarrow \) factor i is disposable and overabundant. Hence, by endogenous valuation, it commands unit price \(\lambda _{i}=0.\) Together these considerations motivate the following:

Definition 2.1

(Lagrange multipliers as shadow prices). With \({\mathbb {X}}\) linear and each \(c_{i}\) differentiable, the vector \(\lambda =(\lambda _{i})\in {\mathbb {R}}^{I}=:{\mathbb {Y}}^{*}\) is declared a Lagrange multiplier or shadow price at a local optimum x to problem P(0) iff

$$\begin{aligned} \lambda \ge 0,\quad \partial _{x}L(x,\lambda )=0,\text { and }\lambda _{i}c_{i}(x)=0\text { for each }i\in I. \end{aligned}$$
(11)

Thus what emerged here, merely by shadow pricing (of resources), are the usual Karush-Kuhn-Tucker necessary optimality conditions (11) that apply to Lagrangian (3) for (Gâteaux) differentiable instances of problem P(0); see also Akgül (1984), Bertsekas and Ozdaglar (2002), Jie and Yan (2021).
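
For a concrete check of (11), consider the toy instance \(c_{0}(x_{1},x_{2})=x_{1}^{2}+x_{2}^{2}\) with the single constraint \(c_{1}(x)=1-x_{1}-x_{2}\le 0\); its optimum is \(x=(1/2,1/2)\) with shadow price \(\lambda =1\). The sketch below (Python with NumPy; purely illustrative) simply verifies the three conditions numerically:

```python
import numpy as np

# Toy instance: minimize c0(x) = x1^2 + x2^2 subject to c1(x) = 1 - x1 - x2 <= 0.
x = np.array([0.5, 0.5])          # candidate (here global) optimum
lam = 1.0                         # candidate shadow price

grad_c0 = 2 * x                   # gradient of the objective at x
grad_c1 = np.array([-1.0, -1.0])  # gradient of the constraint at x

stationarity    = grad_c0 + lam * grad_c1   # should be the zero vector
feasibility     = 1 - x.sum()               # should be <= 0
complementarity = lam * (1 - x.sum())       # should be 0

print("stationarity   :", stationarity)
print("feasibility    :", feasibility)
print("complementarity:", complementarity, " lambda >= 0:", lam >= 0)
```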

Many routes lead to (11). Included are: differential calculus, theorems of the alternative, variational analysis and separation of (convex) sets. The new avenue or narrative, chosen here, differs from the said ones by invoking pricing - or a two-person non-cooperative game Osborne and Rubinstein (1994): Primal problem P(0) is embedded into a market-like scenario in which some dual (exogenous and) self-interested player sets prices - be these shadows or not. Moreover, if prudent, the latter party should not permit arbitrage (see Sect. 5).

Multipliers need not be unique. But they form a closed convex set, endogenous to problem P(0), and depending on x. Reflecting the decision-maker’s wish to ascertain or test local optimality at \(x\in X(0),\) his pricing ought comply with two time-honoured maxims - each standing on axiomatic, empirical, intuitive or theoretical grounds:

  • First comes the principle of equimargins: For efficiency, marginal revenues must equal corresponding marginal costs: \(r^{\prime }(x)=\lambda c_{1}^{\prime }(x)\).

  • Second, if x is feasible, no overabundant input has positive price:

    $$\begin{aligned} c_{i}(x)<0\Longrightarrow \lambda _{i}=0. \end{aligned}$$

    Put differently: unless a resource be scarce, it cannot command any user cost. Consequently, by shadow pricing:

  • Total input cost should be nil: \(\sum _{i\in I}\lambda _{i}c_{i}(x)=0\).Footnote 5

    Clearly, if \(r^{\prime }(x)=0\), all this means that problem P(0) isn’t really constrained at x. Then, trivially \(\lambda =0\) fits as shadow price. This particular instance merits minor interest. So, henceforth suppose \(c_{0}^{\prime }(x)\ne 0\) whenever x is locally optimal for P(0). Then, every such solution to problem P(0) is indeed constrained, and each reasonable \(\lambda \) must be non-zero. Put differently, in purely verbal and economic terms:

  • No reasonable shadow pricing justifies absence of marginal costs. Restated in mathematical terms:

    $$\begin{aligned} \text {No non-zero }(\lambda _{i})_{i\in I(x)}\ge 0\text { satisfies }\sum _{i\in I(x)}\lambda _{i}c_{i}^{\prime }(x)=0. \end{aligned}$$
    (12)

    (12) is the Mangasarian-Fromovitz constraint qualification–to which we shall return below.

For the rest of this section and the entire subsequent one, let the space \({\mathbb {X}}\) be Euclidean or Hilbert–or more generally, reflexive Banach.Footnote 6 On this premise, what does absence of shadow prices imply?

For argument, what comes next is a theorem of alternativesFootnote 7, in a form which puts prices next to geometry:

Farkas' lemma (1902) (shadow prices versus eventual improvement). For any \(x\in {\mathbb {X}}\), either there is a shadow price \(\lambda \) (11) or some input bundle \(d\in {\mathbb {X}}\) gives

$$\begin{aligned} r^{\prime }(x)d>0\ \text { and }\ c_{i}^{\prime }(x)d\le 0\ \text { for every }i\in I. \end{aligned}$$
(13)

Inequality \(r^{\prime }(x)d>0\) ensures that a sufficiently small step \(s>0\) along d gives greater revenue: \(r(x+sd)>r(x).\) At the same time, \( c_{i}^{\prime }(x)d\le 0\) indicates, but does not generally guarantee, that \(c_{i}(x+sd)\le c_{i}(x).\) But at least, some purchase along d appears worthwhile - and indeed so it is, granted the following property: call \(c_{i}\) locally affine-like at x iff

$$\begin{aligned} c_{i}^{\prime }(x)d\le 0\Longrightarrow c_{i}(x+sd)\le c_{i}(x)\text { for sufficiently small step }s>0. \end{aligned}$$
(14)

Given affine-like constraints, inequalities (13) already shed light on the special nature of linearly constrained programs. The proof of the following is left to the reader.

Proposition 2.1

(on locally affine-like binding constraints). Let all \(c_{i},i\in I,\) be continuous here. If (13) holds, with \(c_{i}\) locally affine-like (14) at \(x\in X(0)\) for each \( i\in I(x)\), then adjusting resource use along direction d is feasible and strictly improving in that

$$\begin{aligned} x+sd\in X(0) \text { and }c_{0}(x+sd)<c_{0}(x) \end{aligned}$$

for sufficiently small step-size \(s>0.\) In particular, if each constraint function \(c_{i}\) is affine, then each locally optimal solution x to problem P(0) comes with a shadow price. \(\square \)
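
The dichotomy of Farkas' lemma lends itself to computation: search for a shadow price by linear programming and, failing that, for an improving direction d as in (13). A minimal sketch with scipy.optimize.linprog follows; the gradients are those of Example 1.1 at \(x=0\), where no shadow price exists, and the box \(|d|\le 1\) is there only to keep the second program finite.

```python
import numpy as np
from scipy.optimize import linprog

# Farkas alternative at x = 0 for Example 1.1: c0'(0) = 1, c1'(0) = 0.
grad_c0 = np.array([1.0])                 # objective gradient
G = np.array([[0.0]])                     # rows are gradients c_i'(0)

# (a) Try to find a shadow price: lambda >= 0 with  grad_c0 + G^T lambda = 0.
res_price = linprog(c=np.zeros(G.shape[0]), A_eq=G.T, b_eq=-grad_c0,
                    bounds=[(0, None)] * G.shape[0])
print("shadow price exists:", res_price.success)

# (b) Otherwise look for an improving direction: maximize r'(x) d = -grad_c0 @ d
#     subject to c_i'(x) d <= 0, over the box |d_j| <= 1.
res_dir = linprog(c=grad_c0, A_ub=G, b_ub=np.zeros(G.shape[0]),
                  bounds=[(-1, 1)] * grad_c0.size)
print("improving direction d =", res_dir.x, " with r'(x) d =", -grad_c0 @ res_dir.x)
# Note: c1 is not locally affine-like at 0, so this d need not preserve feasibility.
```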

3 Local optimality, efficiency and arbitrage

Apart from locally affine-like constraints, as in Proposition 2.1, when is direction d (13) feasible? For that purpose, recall a chief theorem on alternatives, phrased in economic jargon here:

Gordan’s theorem (1873)Footnote 8: Given a non-empty, finite set of price vectors \(a_{i}\in {\mathbb {X}}^{*},\) \(i\in {\mathbb {I}},\) exactly one of the following two systems has a solution:

$$\begin{aligned} \underline{\text {Either}}\text { some bundle }d\in {\mathbb {X}}\text { has negative costs: }a_{i}d<0\text { for each }i\in {\mathbb {I}}, \end{aligned}$$
(15)
$$\begin{aligned} \underline{\text {or}}\text {: }\sum _{i\in {\mathbb {I}}}\lambda _{i}a_{i}=0\text { with }\sum _{i\in {\mathbb {I}}}\lambda _{i}=1\text { and each }\lambda _{i}\ge 0. \end{aligned}$$
(16)

In particular, when \(a_{i}=c_{i}^{\prime }(x)\), and \(s>0\) is sufficiently small, (15) implies that \(c_{i}(x+sd)<c_{i}(x)\) for each \( i\in {\mathbb {I}}\). If moreover, \(x\in X(0),\) \({\mathbb {I}}=\left\{ 0\right\} \cup I(x),\) and each \(c_{i},i\notin {\mathbb {I}}\) is continuous at x,  that point cannot be locally optimal for problem P(0). Indeed, a sufficiently small step \(s>0\) along d gives strict feasibility and strictly reduced cost.

So, by Gordan’s theorem, for any locally optimal solution x to P(0),  alternative (16) must hold with \({\mathbb {I}}={\mathcal {I}}=\left\{ 0\right\} \cup I.\) Equivalently, at each such point, regarding the extended Lagrangian \(L_{0}\) (7), the Fritz John condition is in vigor:

$$\begin{aligned} \text {There exists }\lambda \gneqq 0\text { such that }\partial _{x}L_{0}(x,\lambda )=0\text { and }\lambda _{i}c_{i}(x)=0\ \forall \, i\in I. \end{aligned}$$
(17)

Clearly, (11) obtains from (17) when \(\lambda _{0}>0.\) Otherwise, consider the reduced index set \({\mathbb {I}}=I(x).\) Then, for economic intuition, (16) implies that some proper pricing \( \lambda \gvertneqq 0\) of inputs annuls all marginal costs: \(\lambda c_{1}^{\prime }(x)=0.\) In this case, \(x\in X(0)\) and \(\partial _{x}\pi (x,\lambda )=0\) would imply \(r^{\prime }(x)=0,\) thereby telling that P(0) appears unconstrained. So, as argued earlier, for economic realism, suppose that whenever x is locally optimal for P(0), no proper input pricing makes all marginal costs nil. In this way, with a view towards (16), an equivalent restatement of (12) comes up:

$$\begin{aligned} \text {no }(\lambda _{i})\ge 0\text { solves }\sum _{i\in I(x)}\lambda _{i}c_{i}^{\prime }(x)=0\ \text { and }\ \sum _{i\in I(x)}\lambda _{i}=1. \end{aligned}$$
(18)

From (17) and (18) follows forthwith:

Theorem 3.1

(the necessary KKT condition). If problem P(0) admits a local optimum x at which (18) holds, then (11) is solvable for at least one shadow price \(\lambda \).

\(\square \)

By Gordan’s theorem, (18) amounts to the Mangasarian-Fromovitz constraint qualification at x Mangasarian and Fromovitz (1967):

$$\begin{aligned} \text {Some direction }d\in {\mathbb {X}}\text { gives }c_{i}^{\prime }(x)d<0\text { for each }i\in I(x). \end{aligned}$$
(19)

Then, provided each \(c_{i},i\notin I(x)\), be continuous, a minor move along d ensures strict feasibility.
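
Qualification (19) can itself be verified by a small linear program: seek a direction d and a slack t with \(c_{i}^{\prime }(x)d+t\le 0\) for every active i, and maximize t over a bounded box; a strictly positive optimum certifies (19). A sketch of this device (Python with SciPy; the gradients shown are illustrative, not taken from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def mfcq_holds(active_grads, tol=1e-9):
    """Check (19): does some d give c_i'(x) d < 0 for every active i?
    Solve  max t  s.t.  c_i'(x) d + t <= 0,  |d_j| <= 1,  0 <= t <= 1."""
    G = np.asarray(active_grads, dtype=float)     # one row per active constraint
    m, n = G.shape
    A_ub = np.hstack([G, np.ones((m, 1))])        # decision variables are (d, t)
    c = np.zeros(n + 1); c[-1] = -1.0             # linprog minimizes, so use -t
    bounds = [(-1, 1)] * n + [(0, 1)]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), bounds=bounds)
    return res.success and -res.fun > tol

# Two illustrative cases with two active constraints in R^2:
print(mfcq_holds([[1, 0], [0, 1]]))     # True: d = (-1, -1) works
print(mfcq_holds([[1, 0], [-1, 0]]))    # False: the gradients are positively dependent
```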

Remark 3.1

(on constraint qualifications). Granted (18), each “binding gradient” \(c_{i}^{\prime }(x),i\in I(x),\) must be non-zero - unlike in Example 1.1. Moreover, those vectors cannot be positively dependent. A fortiori, it would suffice that they be linearly independent.

Conversely, for a Slater condition, if each binding \(c_{i},\) \(i\in I(x),\) is locally starshaped at some strictly feasible point \(x^{0}\) in that each \(c_{i}(x^{0})<0,\) and

$$\begin{aligned} c_{i}({\rho }x^{0}+(1-{\rho })x)\le {\rho } c_{i}(x^{0})+(1-{\rho })c_{i}(x)={\rho }c_{i}(x^{0})<0=c_{i}(x) \end{aligned}$$

for sufficiently small \(\rho \in (0,1),\) then

$$\begin{aligned} \frac{c_{i}(x+{\rho }(x^{0}-x))-c_{i}(x)}{{\rho }}\le c_{i}(x^{0})<0. \end{aligned}$$

Hence, if each \(c_{i}\) is continuous, letting \({\rho }\rightarrow 0^{+}\), (19) holds with \(d=x^{0}-x\). \(\ \Diamond \)

Negation of (18), alongside (11), precludes that \( c_{0}^{\prime }(x)=0.\) Put differently: if Fermat’s classical optimality condition \(c_{0}^{\prime }(x)=0\) fails for each locally optimal \(x\in X(0),\) add some compromise \(\lambda c_{1}=\sum _{i\in I}\lambda _{i}c_{i}\) to criterion \(c_{0}\). Granted shadow pricing (11), the said compromise restores Fermat’s optimality condition, but this time for \( L(\cdot ,\lambda ).\)

As upshot so far: it imports, quite naturally, that the objective of problem P(0) not be fully detached from its constraints. Then, the decision-maker, while worshipping own profit (2), might price resources to test for local versions of arbitrage or Pareto efficiency:

Proposition 3.1

(on shadow pricing versus arbitrage). Let all \(c_{i},i\in I,\) be continuous. Suppose that x admits no shadow price (11) and that qualification (19) holds. Then x cannot be locally optimal for problem P(0). In fact, there exists a direction d along which any price \(y^{*}\ge 0\) on resources, with active part \(y^{*}(x):=[y_{i}^{*}]_{i\in I(x)}\ne 0,\) offers local pure arbitrage. That is, strictly more revenue can be had at negative cost:

$$\begin{aligned} r(x+sd)>r(x)\ \text { and }\ y^{*}c_{1}(x+sd)<0\ \text { for sufficiently small }s>0. \end{aligned}$$
(20)

Proof

Let \(d^{M}\) be a direction which suits (19), and \(d^{F}\) one which suits (13). Then, provided \(\delta >0\) is sufficiently small, \(d:=\delta d^{M}+d^{F}\) becomes a feasible direction for which \(r^{\prime }(x)d>0.\) In addition, all \( c_{i}(x+sd)<c_{i}(x)\) for small enough \(s>0.\) So, (20) follows. But then clearly, x cannot be locally optimal for P(0). \(\square \)

When valid at some locally minimal solution x to P(0), qualification (19) entails two important facts: first, the set of shadow prices is non-empty; second, that set must be bounded:

Proposition 3.2

(on bounded shadow prices). Let x be a locally optimal solution to problem P(0). Then the shadow prices \(\lambda \) (11) form a non-empty compact set iff (19) holds.

Proof

was first given by Gauvin (1977). Diethard Klatte (2022) generously gave us the following simpler argumentFootnote 9: Consider the closed convex set

$$\begin{aligned} \Lambda (x):=\left\{ \lambda =(\lambda _{i})\in {\mathbb {R}}_{+}^{I(x)}\text { } \left| \text { }c_{0}^{\prime }(x)+\sum _{i\in I(x)}\lambda _{i}c_{i}^{\prime }(x)=0\right. \right\} \end{aligned}$$

of reduced KKT multipliers. It's non-empty by Theorem 3.1, and (18) implies that the recession cone \(0^{+}\Lambda (x)=\lim _{s\rightarrow 0^{+}}s\Lambda (x)\) reduces to \(\left\{ 0\right\} .\) Hence \(\Lambda (x)\) must be bounded; see (Rockafellar 1970, Theorem 8.4) or (Schrijver 1986, Section 8.2). \(\square \)
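
Numerically, the compactness asserted in Proposition 3.2 can be probed by minimizing and maximizing each coordinate \(\lambda _{i}\) over the polyhedron \(\Lambda (x)\); finite values in every direction confirm boundedness. A sketch under these assumptions (Python with SciPy; the gradients below are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def multiplier_bounds(grad_c0, active_grads):
    """Bounds of Lambda(x) = {lambda >= 0 : grad_c0 + sum_i lambda_i grad_c_i = 0},
    one (min, max) pair per active constraint; None marks an unbounded direction."""
    G = np.asarray(active_grads, dtype=float)     # rows: gradients of active c_i
    m = G.shape[0]
    out = []
    for i in range(m):
        e = np.zeros(m); e[i] = 1.0
        lo = linprog(e,  A_eq=G.T, b_eq=-np.asarray(grad_c0, float),
                     bounds=[(0, None)] * m)
        hi = linprog(-e, A_eq=G.T, b_eq=-np.asarray(grad_c0, float),
                     bounds=[(0, None)] * m)
        out.append((lo.fun if lo.success else None,
                    -hi.fun if hi.success else None))
    return out

# Example: c0'(x) = (1, 1), active gradients (-1, 0) and (0, -1): Lambda(x) = {(1, 1)}.
print(multiplier_bounds([1, 1], [[-1, 0], [0, -1]]))
```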

Hitherto P(0) was presented as the planning problem of an isolated, single decision-maker - maybe somewhat single-minded. It appears fitting, therefore, to play down eventual egocentricity, methodological individualism or narrow-mindedness on his part. Accordingly, concluding this section is a complementary view on problem P(0). Accommodated right here is an extended ensemble \({\mathcal {I}}\) of different agents or objectives, each represented by a corresponding utility criterion \(u_{i}: {\mathbb {X}}\rightarrow {\mathbb {R}}\). Let label 0 indicate any selected member of \({\mathcal {I}}\) \((=\left\{ 0\right\} \cup I).\)

Given a benevolent planner, concerned with overall welfare of citizen ensemble \({\mathcal {I}}\) - or a decision-maker who pursues improvement of multiple criteria \(u_{i}\), \(i\in {\mathcal {I}}\) - he would declare the profile of utility levels \((\underline{u}_{i})\in {\mathbb {R}}^{ {\mathcal {I}}}\) locally Pareto efficient at x iff the program:

$$\begin{aligned} \text {maximize }\,u_{0}(x)\,\text { subject to }\,u_{i}(x)\ge \underline{u} _{i}\,\text { for each }\,i\in I, \end{aligned}$$
(21)

has x as locally optimal solution with \(u_{0}(x)=\underline{u}_{0}.\)

Proposition 3.3

(on multi-agent or multi-objective efficiency). Let x be any locally optimal solution to problem (21) with \(u_{0}(x)=\underline{u}_{0}.\) Then, under (19), with \(c_{i}:=-u_{i}+\underline{u}_{i}\), there exists a profile of welfare weights \(\lambda =(\lambda _{i})_{i\in {\mathcal {I}}}\ge 0\) with \(\lambda _{0}\ne 0\) such that

$$\begin{aligned} \sum _{i\in {\mathcal {I}}}\lambda _{i}[u_{i}(x)-\underline{u}_{i}]=0 \,{{and }}\,\sum _{i\in {\mathcal {I}}}\lambda _{i}u_{i}^{\prime }(x)=0, \end{aligned}$$

with \(\lambda _{i}=0\) whenever \(i\in I\) is strictly satisfied, meaning \(u_{i}(x)>\underline{u}_{i}\). \(\square \)

4 Lagrangian relaxation

So far, optimality condition (11) came up just by shadow pricing. Stepping back, this section recalls that the Lagrangian itself also emerges via linear pricing of “right hand side resource” vectors \(y\in {\mathbb {Y}}\).

For greater generality, and with a view towards discrete optimization where L plays a central role Schrijver (1986), henceforth: \({\mathbb {X}}\) need no longer be a vector space. Consequently, from here onwards, we cannot require that any \(c_{i},\) \(i\in {\mathcal {I}},\) be differentiable.

Further, allowing an extended-valued cost criterion \(c_{0}:{\mathbb {X}} \rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \), its non-empty effective domain \(\left\{ x\in {\mathbb {X}}\text { }\left| \text { } c_{0}(x)\in {\mathbb {R}}\right. \right\} =:domc_{0}\) may well be a strict subset of \({\mathbb {X}}\). So, hereafter, instead of (4) as feasible set, use

$$\begin{aligned} X(y):=\left\{ x\in domc_{0}\quad \left| \text { }c_{1}(x)\le y\right. \right\} . \end{aligned}$$

In addition, suppose that any locally optimal solution x to problem P(0) is a global optimum - maybe after shrinking \( domc_{0}\). Accordingly, being concerned now with saddle-points, what is sought is a set \(\Lambda \) of shadow prices or multipliers, independent of x.

Since \(y^{*}0=0\), the option to purchase some suitable “perturbation” \(y\in {\mathbb {Y}}\), at expense \(y^{*}y,\) brings no harm or hurdle. In fact, such an opportunity opens a genuinely relaxed problem:

$$\begin{aligned} \mathcal {R(}y^{*}\mathcal {)}:\text { minimize }\,c_{0}(x)+y^{*}y \,\text { subject to }\,c_{1}(x)\le y. \end{aligned}$$

\(\mathcal {R(}y^{*}\mathcal {)}\) comprises two decisions: primal choice \(x\in {\mathbb {X}}\) alongside “perturbation” \(y\in {\mathbb {Y}}\). Note that \(\inf \mathcal {R(}y^{*}\mathcal {)=-\infty }\) unless \(y^{*}\ge 0.\) Also note a simple economic feature, easily overlooked: Problem \(\mathcal {R(}y^{*}\mathcal {)}\) denominates objective \(c_{0}(x)\) and perturbation cost \(y^{*}y\) in one and the same “currency”. Moreover, the two terms, being measured by the same rod, enter in additive separable manner. Either is accounted for by some “money commodity” or by letting the value of one term denominate the other.

For any \(x\in {\mathbb {X}}\) and non-negative \(y^{*}\in {\mathbb {Y}}^{*}\), using the minimal resource bundle \(y=c_{1}(x)\) in \(\mathcal {R(}y^{*}\mathcal {)}\) gives the cheapest resource cost

$$\begin{aligned} \inf \left\{ y^{*}y\quad \left| \quad c_{1}(x)\le y\right. \right\} =y^{*}c_{1}(x) \end{aligned}$$
(22)

across \({\mathbb {Y}}\), and it ensures feasibility of x. Thus, Lagrangian L (3) emerges once again:

Proposition 4.1

(Lagrangian and relaxation). For any \( \lambda =y^{*}\ge 0,\) the relaxed problem \(\mathcal {R(} y^{*}\mathcal {)}\) reduces and simplifies by (22) to

$$\begin{aligned} \text {minimize }L(x,\lambda )=c_{0}(x)+\lambda c_{1}(x)\text { with respect to }x\in {\mathbb {X}}. \end{aligned}$$
(23)

If, for fixed x, \(L(x,\lambda )\) is “maximized” with respect to \(\lambda \), it follows from

$$\begin{aligned} \sup _{\lambda \ge 0}\lambda c_{1}(x)=\left\{ \begin{array}{ll} 0 &{} \text {when }x\text { is feasible for }P(0)\text {, and} \\ +\infty &{} \text {otherwise,} \end{array} \right. \end{aligned}$$

that complementarity condition \(\lambda c_{1}(x)=0\) holds for each pair \((x,\lambda )\) of interest. Thus, total resource cost is nil, and

$$\begin{aligned} \sup _{\lambda \ge 0}\inf _{x\in {\mathbb {X}}}L(x,\lambda )\le \inf P(0):=c(0)=\inf _{x\in {\mathbb {X}}}\sup _{\lambda \ge 0}L(x,\lambda ). \end{aligned}$$
(24)

The inequality in (24) reflects that the optimal value of \( \mathcal {R(}y^{*}\mathcal {)}\) is, of course, less than or equal to c(0) for each \(y^{*}\ge 0.\)
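
To illustrate the inequality in (24) - and how a gap can open once convexity is lost, as in discrete optimization - the sketch below evaluates the dual function \(\lambda \mapsto \inf _{x}L(x,\lambda )\) by enumeration on a tiny binary instance (all data are made up for illustration):

```python
import itertools
import numpy as np

# Tiny binary instance: minimize c0(x) = -(5*x1 + 4*x2)
# subject to c1(x) = 3*x1 + 3*x2 - 4 <= 0,  x in {0,1}^2.
X = list(itertools.product([0, 1], repeat=2))
c0 = lambda x: -(5 * x[0] + 4 * x[1])
c1 = lambda x: 3 * x[0] + 3 * x[1] - 4

primal_opt = min(c0(x) for x in X if c1(x) <= 0)                 # inf P(0)

def dual(lam):                                                   # inf_x L(x, lam)
    return min(c0(x) + lam * c1(x) for x in X)

lam_grid = np.linspace(0, 5, 1001)
dual_opt = max(dual(l) for l in lam_grid)                        # sup_lam inf_x L

print(f"inf P(0) = {primal_opt},  sup_lam inf_x L = {dual_opt:.3f},"
      f"  duality gap = {primal_opt - dual_opt:.3f}")
```

Here \(\inf P(0)=-5\) while \(\sup _{\lambda \ge 0}\inf _{x}L(x,\lambda )=-19/3\), so the exact gap equals 4/3; the coarse grid reproduces this up to its resolution.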

Instead of a linear charge for resource use, the decision-maker might be debited by some non-linear penalty function \(y\in {\mathbb {Y}}\mapsto \lambda (y)\) Bertsekas and Ozdaglar (2002), Jie and Yan (2021). Anyway, what stand out are those “multipliers” \(\lambda (\cdot )\ge 0,\) if any, which bring cost down to c(0):

Definition 4.1

(Lagrange multipliers). For general decision space \({\mathbb {X}}\) , any \(\lambda \ge 0\) such that

$$\begin{aligned} c(0)=\inf P(0)=\inf _{x\in {\mathbb {X}}}L(x,\lambda ) \end{aligned}$$
(25)

is called a Lagrange multiplier. Together these constitute a set \(\Lambda \subseteq {\mathbb {R}}_{+}^{I}.\)

Proposition 4.2

(Lagrange multipliers as turned-around subgradients). With general decision space \({\mathbb {X}}\), \( \lambda =y^{*}\) is a Lagrange multiplier (25) iff it’s a negative subgradient of best cost (5), meaning: \(-\lambda \in \partial c(0)\) (6).

Proof

For any fixed pair \((y^{*},y)\in {\mathbb {Y}}_{+}^{*} \times {\mathbb {Y}}\), (5) implies that

$$\begin{aligned} c(y)+y^{*}y=\inf _{x\in {\mathbb {X}}}\left\{ c_{0}(x)+y^{*}y\, \left| \quad c_{1}(x)\le y\right. \right\} . \end{aligned}$$

In particular, choosing \(y^{*}=\lambda \) as Lagrange multiplier (25), and keeping whatever \(y\in {\mathbb {Y}}\) on the left hand side, it follows from (24) that

$$\begin{aligned} c(y)+y^{*}y\ge \inf _{x\in {\mathbb {X}}} [\inf _{\hat{y}\in {\mathbb {Y}}}\left\{ c_{0}(x)+y^{*}\hat{y} \left| c_{1}(x)\le \hat{y}\right. \right\} ]=\inf _{x\in {\mathbb {X}}}L(x,\lambda )=c(0). \end{aligned}$$

Consequently, \(-\lambda \in \partial c(0)\) (6). The converse inclusion \(\partial c(0)\subseteq -\Lambda \) is straightforward.Footnote 10 \(\square \)

Example 4.1

(Linear programming and multipliers as dual solutions). Returning briefly to a vector space \({\mathbb {X}}\), let \(a_{i}\in {\mathbb {X}}^{*}\) and \(c_{i}(x):=a_{i}x-y_{i}^{0}\) for \(i\in I.\) Given linear objective \(c_{0}^{*}\in {\mathbb {X}}^{*}\) and \(Ax:=[a_{i}x]\), consider the primal program P(y):

$$\begin{aligned} c(y):=\inf \left\{ c_{0}^{*}x\text { }\left| \quad Ax\le y^{0}+y\right. \right\} . \end{aligned}$$

The corresponding dual program:

$$\begin{aligned} \sup _{y^{*}}\inf _{x}L(x,y^{*})&=\sup _{y^{*}}\inf _{x}\left\{ c_{0}^{*}x+y^{*}(Ax-y^{0}-y)\right\} \\ &=\sup _{y^{*}}\inf _{x}\left\{ (c_{0}^{*}+A^{*}y^{*})x-y^{*}(y^{0}+y)\right\} \\ &=\sup _{y^{*}}\left\{ \begin{array}{ll} -y^{*}(y^{0}+y) &{} \text {if }c_{0}^{*}+A^{*}y^{*}=0, \\ -\infty &{} \text {otherwise,} \end{array} \right. \end{aligned}$$

has \(y^{*}\) as optimal solution iff c(y) is finite and \(-y^{*}\in \partial c(y).\) Indeed, this follows by subtracting \(c(y)=-y^{*}(y^{0}+y) \) from \(c(\hat{y})\ge -y^{*}(y^{0}+\hat{y})\). In particular, for \(y=0,\) it obtains that \(-y^{*}=-\lambda \in \partial c(0)\); see also Akgül (1984), Gauvin (2000) and Proposition 4.2.

So, when is \(c(0)<+\infty \)? Put differently: when does the linear inequality system \( a_{i}x\le y_{i}^{0},\) \(i\in I,\) admit a solution \(x\in X(0)\) (4)? Ky Fan (1968) studied this question for arbitrary index set I, finite (as here) or infinite. Following the proof of his Theorem 1, let \( {\mathbb {X}}\) be locally convex, Hausdorff and reflexive. Then feasibility obtains iff \((0,-1)\) does not belong to the closed convex cone spanned by \((a_{i},y_{i}^{0})\in {\mathbb {X}}^{*}\times {\mathbb {R}},\) \(i\in I.\) For elaborations on finite I, or for \(\sup \left\{ x^{*}x:x\in X(0)\right\} \) with any fixed \(x^{*}\in {\mathbb {X}}^{*},\) see Fan (1968). \(\Diamond \)
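
As a numerical companion to Example 4.1, the sketch below solves a small LP with scipy.optimize.linprog (HiGHS backend) and reads off the dual variables: for the inequality constraints, the reported marginals are the sensitivities of the optimal value with respect to the right-hand side, i.e. a subgradient of c, so the Lagrange multipliers are their negatives. The data are arbitrary illustrations.

```python
import numpy as np
from scipy.optimize import linprog

# Small illustrative LP: minimize c0^T x  subject to  A x <= b,  x >= 0.
c0 = np.array([-1.0, -2.0])
A  = np.array([[1.0, 1.0],
               [0.0, 1.0]])
b  = np.array([4.0, 3.0])

res = linprog(c0, A_ub=A, b_ub=b, bounds=[(0, None)] * 2, method="highs")
print("primal solution x    :", res.x)                     # expect (1, 3)
print("optimal value c(y)   :", res.fun)                   # expect -7
print("marginals (= dc/db)  :", res.ineqlin.marginals)     # expect (-1, -1)
print("Lagrange multipliers :", -res.ineqlin.marginals)    # lambda = -subgradient
```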

As seen above, and below in Sect. 5, it's desirable that \(\partial c(0)\) be bounded. But, as already indicated by Example 1.2, and more generally here, suppose some (exogenous) resource price \(y^{*}\) resides outside \( -\partial c(0)\), whence it doesn't derive directly from problem P(0). Then, some resource trade might be worthwhile:

Lemma 4.1

(on exogenous pricing). Let a cost function \(c: {\mathbb {Y}}\rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) be locally convexFootnote 11 near a reference point y with non-empty compact subdifferential \(\partial c(y)\) (6). Then, for any price \({y}^{*}\in {{\mathbb {Y}}}^{*}\), direction \(d \in {{\mathbb {Y}}}\), and sufficiently small \(s>0\) ,

$$\begin{aligned} c^{\prime }(y;d)+y^{*}d<0\Longrightarrow c(y+sd)+y^{*}(sd)<c(y). \end{aligned}$$

When \(y^{*}\notin -\partial c(y)\) , such a direction d exists, and a sufficiently small step \(s>0\) along d, bought for expense \(y^{*}(sd)=sy^{*}d,\) reduces cost below c(y) (5).

Proof

From \(c(y+sd)=c(y)+sc^{\prime }(y;d)+o(s)\) and \(c^{\prime }(y;d)+y^{*}d<0\) follows that for sufficiently small step \(s>0,\)

$$\begin{aligned} c(y+sd)+y^{*}(sd)\approx c(y)+s\left[ c^{\prime }(y;d)+y^{*}d\right] <c(y). \end{aligned}$$

Note that \(c^{\prime }(y;d)=\max \partial c(y)d:=\max \left\{ y^{*}d \quad \left| \text { }y^{*}\in \partial c(y)\right. \right\} =:\gamma .\) So, for suitable direction, take any d which separates \( y^{*}\) strictly from \(-\partial c(y)\) in that \(y^{*}d<-\gamma \). \(\square \)

Returning now to the inequality in (24), it raises two questions: First, when is \(\sup _{\lambda \ge 0}\inf _{x\in {\mathbb {X}} }L(x,\lambda )=\inf P(0)?\) This topic is addressed in Sect. 6 with a view towards generalized convexity conditions - and two-person games of Stackelberg sort Osborne and Rubinstein (1994).

Second, shouldn’t (endogenous) shadow prices be compared somehow to (exogenous) market quotations, if any? This question is taken up next.

5 Arbitrage

Part of problem P(y),  \(y=0,\) is to valuate nearby (right hand side) resource vectors \(y\approx 0.\) In fact, that’s a major aim behind standard Lagrangian duality.Footnote 12 But valuation is also the sine qua non for manifold other mechanisms (and institutions)–say, auctions, exchanges or markets - these residing beside or beyond problem P(0).

Indeed, few optimization problems come fully detached from exogenous transactions and related pricing. To indicate here how and why, consider direct deals between two interlocutors. One is the decision-maker behind problem P(0)–construed, say, as Robinson Crusoe (\(\mathcal{RC}\)) on his island. The other is a merchant who sails by and cries out:

“I sell resource \(i\in I\) at unit price \(\bar{y} _{i}^{*}\) but buy at unit price \(\underline{y}_{i}^{*}\ge 0\), both in gold.”

Reasonably, to avoid self-inflicted loss, the said merchant precludes arbitrage LeRoy and Werner (2001) with \(\bar{y}_{i}^{*}\ge \underline{y} _{i}^{*}.\) Together these prices define a box \(Y^{*}:=\Pi _{i\in I}[ \underline{y}_{i}^{*},\bar{y}_{i}^{*}].\)

For the rest of this section, suppose \(\mathcal{RC}\) invokes a non-empty compact set \(\Lambda \) of personal price vectors, rationalized by the requirement that

$$\begin{aligned} \partial c(0)\subseteq -\Lambda . \end{aligned}$$
(26)

Then, if \(\mathcal{RC}\) contemplates some purchase of resource i, he prudently bids no unit price above

$$\begin{aligned} \underline{\lambda }_{i}:=\inf \left\{ \lambda _{i}\text { }\left| \text { }\lambda \in \Lambda \right. \right\} . \end{aligned}$$
(27)

By contrast, if considering some sale of that resource, he ought ask no smaller unit price than

$$\begin{aligned} \bar{\lambda }_{i}:=\sup \left\{ \lambda _{i}\text { }\left| \text { } \lambda \in \Lambda \right. \right\} . \end{aligned}$$
(28)

Being prudent and consistent, by (27) and (28), he operates with bid-ask spread \(\bar{\lambda }_{i}-\underline{\lambda }_{i}\ge 0\) for input i. Now, if he can trade "resources" at exogenous prices not in \(\Lambda \) (26), that option embeds his self-sufficient (autarky) problem P(0) into a more attractive, market-like setting:Footnote 13

Proposition 5.1

(on market pricing and outside options). If resource i be supplied at unit price \( \bar{y}_{i}^{*}<\underline{\lambda }_{i}\) (27), it’s worthwhile for the decision-maker to buy some amount. Likewise, if \(\underline{y}_{i}^{*}>\bar{\lambda }_{i}\) (28), gain obtains by selling a bit of that resource.

Proof

It suffices to consider the case \(\bar{y}_{i}^{*}< \underline{\lambda }_{i}\). Let direction \(d\in {\mathbb {Y}}\) equal the i-th unit vector \((\ldots ,0,1,0,\ldots )\) so as to have, by Proposition 4.2:

$$\begin{aligned} \max _{y^{*}\in Y^{*}}y^{*}d+c^{\prime }(0;d)&=\bar{y}_{i}^{*}+\sup \left\{ y^{*}d\text { }\left| \text { }y^{*}\in \partial c(0)\right. \right\} \\ &\le \bar{y}_{i}^{*}+\sup \left\{ -\lambda d\text { }\left| \text { }\lambda \in \Lambda \right. \right\} \\ &\le \bar{y}_{i}^{*}-\inf \left\{ \lambda d\text { }\left| \text { }\lambda \in \Lambda \right. \right\} =\bar{y}_{i}^{*}-\underline{\lambda }_{i}<0. \end{aligned}$$

Now invoke Lemma 4.1 to conclude. \(\square \)
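
Operationally, Proposition 5.1 reduces to comparing the merchant's quotes with the bounds (27)-(28). A tiny sketch (Python; all numbers are made up, and the merchant's quotes respect his own no-arbitrage requirement):

```python
# Proposition 5.1 as a decision rule for one resource (illustrative numbers only).
lam_lo, lam_hi = 2.0, 3.0     # underline/overline lambda_i from (27)-(28)
sell_quote     = 1.5          # merchant sells to the decision-maker at this price
buy_quote      = 1.2          # merchant buys from the decision-maker at this price

if sell_quote < lam_lo:
    print("buy some resource i: exogenous price", sell_quote, "< internal bid", lam_lo)
elif buy_quote > lam_hi:
    print("sell some resource i: exogenous price", buy_quote, "> internal ask", lam_hi)
else:
    print("no profitable trade in resource i")
```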

Beyond bilateral direct deals, consider next a multi-agent, multi-commodity market. Denote by \(y_{i}^{*b}\) the unit price some participant bids upon demanding some amount of resource i. Similarly, let \(y_{i}^{*a}\) be the unit price the same agent asks upon supplying some amount of resource i. His quotations, if any, are construed as commitments.

A pure demander has \(y_{i}^{*b}\) finite but \(y_{i}^{*a}=+\infty \); a pure supplier has \(y_{i}^{*b}=-\infty \) and \( y_{i}^{*a}\in {\mathbb {R}}\). Unlike these, any participant who quotes both \(y_{i}^{*b}\) and \(y_{i}^{*a}\) as finite numbers, can be construed as a broker or intermediary. Anyway, for own rationality, everybody complies with \(y_{i}^{*b}\le y_{i}^{*a}\).

Let \(\bar{y}_{i}^{*}\) be maximal among individually posted bid prices \(y_{i}^{*b}\). Likewise, let \(\underline{y}_{i}^{*}\) equal the minimal ask price \(y_{i}^{*a}\) posted in the market. A bid-ask spread \(\bar{y}_{i}^{*}-\underline{y}_{i}^{*}>0\) cannot persist long; it invites and offers arbitrage, hence disappears quickly. Conversely, when \(\bar{y}_{i}^{*}-\underline{y}_{i}^{*}<0\), market i remains inactive. Hence equilibrium prevails if quotations \(\bar{y}_{i}^{*}=\underline{y}_{i}^{*}=y_{i}^{*}\), \(i\in I,\) form a price vector \(y^{*}=(y_{i}^{*})\) which balances and clears the market(s). The law of one price then reigns LeRoy and Werner (2001).

It merits mention that the economy at hand operates best when monetized. This simply means that trade proceeds in goods for money (say, fiat bills or gold).Footnote 14 The next result derives directly from Proposition 5.1:

Corollary 5.1

(on problem P(0) and market equilibrium). Suppose the decision-maker behind problem P(0) enters a market for bundles \(y\in {\mathbb {Y}}\) in which equilibrium prevails at some price \(y^{*}\notin \Lambda .\) Suppose his traded volume be so small that \(y^{*}\) isn’t affected.

If \(y_{i}^{*}<\underline{\lambda }_{i}\) (27), he can profitably buy some amount of resource i . Likewise, when \(y_{i}^{*}>\overline{\lambda }_{i}\) (28), he may gain by selling a bit of that resource. \(\square \)

To synthesize these arguments, reintroduce the reference endowment \( y^{0}\in {\mathbb {Y}}\), maybe not nil. Problem \(P(y^{0})\) has the same objective \(c_{0}\), but now constraint function \(c_{1}-y^{0}\) in place of \( c_{1}.\) Suppose the value function (5) has non-empty subdifferential \(\partial c(y^{0})\) (6). Clearly, resource trade changes \(y^{0}\) and thereby problem \(P(y^{0}).\) A step-wise or continuous process \(t\longmapsto y(t)=y\) could emanate from \(y(0)=y^{0}\). By Lemma 4.1, the following holds:

Proposition 5.2

(on continued trade). Resource trade ought continue as long as \(\partial c(y)\) remains non-empty compact, and the price \(y^{*}\notin -\partial c(y).\)

6 Two-stage, two-player games and the duality gap

There remains the question: when is

$$\begin{aligned} \sup _{\lambda \ge 0}\inf _{x\in {\mathbb {X}}}L(x,\lambda )=\inf P(0)? \end{aligned}$$

For that important issue, extended Lagrangian \(L_{0}\) (7) needs some qualification so as to hold the leading multiplier \(\lambda _{0}\) strictly away from 0. By contrast, standard Lagrangian L (3) may accommodate an unbounded multiplier set, but it fixes \(\lambda _{0}>0.\)

The best of both worlds would obtain if the set \(\Lambda \) of Lagrange multipliers (Definition 4.1) already were bounded. Granted linear space \( {\mathbb {X}}\) and differentiable data, qualification (19) ensures precisely this property; see Proposition 3.2.

However, discrete optimization - a major field for use of Lagrangians - must often do without linear structure (and differentiability). On that premise, to conclude, presume that \(\Lambda \) indeed be non-empty compact and convex. This hypothesis facilitates discussion of game equilibrium - and of the inequality in (24).

For that discussion, view Lagrangian (3) here as formalizing a two-person, non-cooperative game. To wit, a primal player \( {\mathcal {P}}\) chooses "activity" or "strategy" x - in some space \({\mathbb {X}} \), not necessarily linear - to minimize operating cost \(c_{0}(x)\) plus expense \(\sum _{i\in I}\lambda _{i}c_{i}(x)\) for resource use. Opposed and unfriendly to him is a dual player \({\mathcal {D}}\) who sets price vector \(\lambda \ge 0\) to maximize own revenue \(\sum _{i\in I}\lambda _{i}c_{i}(x)\) for resource supply. (So, players’ "payoffs" don’t sum to zero.)

An extensive form of the game might allow play to unfold over two stages.Footnote 15 What matters then is: who moves last, and what does he know at that stage?

For one protocol, player \({\mathcal {D}}\) chooses \(\lambda \) last, already knowing x. He may then hold his interlocutor \({\mathcal {P}}\) to feasible choice and away from purchase of "resources", yet permit best cost \(c(0)=\inf P(0)=\inf _{x\in {\mathbb {X}}}\left[ \sup _{\lambda \ge 0}L(x,\lambda )\right] \). This was already brought out by the equality in (24).

In the turned-around protocol of play, \({\mathcal {D}}\) commits \( \lambda \) first, not knowing x. Maintaining rational expectations, he fears or predicts a worst possible response x. So, up front, he faces the dual problem

$$\begin{aligned} D:\text { maximize }\inf _{x\in {\mathbb {X}}}L(x,\lambda )\text { with respect to }\lambda \ge 0. \end{aligned}$$

Denote the optimal value by \(\sup D.\) The last arrangement may leave player \( {\mathcal {D}}\) disadvantaged: there could be a duality gap \(\inf P(0)-\sup D>0\). That quantity measures the benefit the last mover derives from waiting in a two-person conflict.

How much benefit might \({\mathcal {P}}\) realize by waiting? When is that benefit nil? The first question - in principle, fairly simple - bears on compactness of effective domains and continuity of objectives. The second question is more demanding; it also invokes convexity. Addressing only the second, it’s convenient, and it serves generality, to allow that \({\mathbb {X}}\) isn’t necessarily some vector space. Also, let \(\lambda \) be a multiplier or price regime of any sort. At this juncture, convexity enters, albeit generalized:

Definition 6.1

(convex-like functions). Given any non-empty set X , a function \(f:X\rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) is called convex-like iff for any \(x,\hat{x} \in X\) and real numbers \(\rho ,\hat{\rho }\ge 0,\) satisfying \(\rho +\hat{\rho }=1,\) there exists some \(\bar{x}\in X\) such that

$$\begin{aligned} f(\bar{x})\le \rho f(x)+\hat{\rho }f(\hat{x}). \end{aligned}$$

Any \(f:X\rightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) which has a minimum is convex-like. And clearly, when X is a subset of a linear space \( {\mathbb {X}}\) on which f is quasi-convex, that function is convex-like.

Now, instead of Lagrangian L (3), more generally invoke a bivariate function \({\mathcal {L}}:X\times \Lambda \rightarrow {\mathbb {R}}\), for which the following derives from Fan (1953), Kneser (1952) and Sion (1958)Footnote 16:

Theorem 6.1

(on saddle value). Let X be any non-empty set and \(\Lambda \) a proper compact convex subset of a topological vector space. Suppose a bivariate function \({\mathcal {L}}:X\times \Lambda \rightarrow {\mathbb {R}}\) is convex-like in x and concave and upper semicontinuous in \(\lambda .\) Then, there is a (lop-sided) saddle valueFootnote 17:

$$\begin{aligned} \max _{\lambda \in \Lambda }\inf _{x\in X}{\mathcal {L}}(x,\lambda )=\inf _{x\in X}\max _{\lambda \in \Lambda }{\mathcal {L}}(x,\lambda ). \end{aligned}$$
(29)

If moreover, X is compact in some topology with \( {\mathcal {L}}(x,\lambda )\) lower semicontinuous in x,  then some saddle point \((x,\lambda )\in X\times \Lambda \) realizes the mini-max value:

$$\begin{aligned} \min {\mathcal {L}}(X,\lambda )={\mathcal {L}}(x,\lambda )=\max {\mathcal {L}} (x,\Lambda ). \end{aligned}$$

As usual, to pass from extremum to attainment, one needs appropriate compactness and semicontinuity.

Finally, to close the circle, returning to Lagrangian L (3), a main concern throughout was compactness (and concavity) in multipliers:

Corollary 6.1

(on saddle value of the standard Lagrangian). Here \({\mathcal {L}}=L\) (3). Let X be any non-empty set, \({\mathbb {Y}}\) a linear topological space, \(c_{1}\) a mapping from X into \({\mathbb {Y}}\), and \(\Lambda \) a proper compact convex subset of the dual space \({\mathbb {Y}}^{*}\). Suppose \(L(\cdot ,\lambda ):X\rightarrow {\mathbb {R}}\) attains a minimum for each fixed \(\lambda \in \Lambda .\) Then, (29) holds.

In particular, if X is compact in some topology with \( L(x,\lambda )\) lower semicontinuous in x,  then there exists some saddle point \((x,\lambda )\in X\times \Lambda \).
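
As a closing numerical illustration of (29), take the convex instance \(c_{0}(x)=x^{2}\), \(c_{1}(x)=1-x\) on \(X=[-2,2]\) with \(\Lambda =[0,4]\). Then \(L(x,\lambda )=x^{2}+\lambda (1-x)\), and both orders of optimization give the common value 1 at the saddle point \((x,\lambda )=(1,2)\). The grid-based sketch below (Python with NumPy) is purely illustrative:

```python
import numpy as np

# Convex instance: c0(x) = x^2, c1(x) = 1 - x, so L(x, lam) = x^2 + lam*(1 - x).
xs   = np.linspace(-2, 2, 801)       # compact strategy set X
lams = np.linspace(0, 4, 801)        # compact multiplier set Lambda

L = xs[:, None] ** 2 + lams[None, :] * (1 - xs[:, None])   # L[i, j] = L(x_i, lam_j)

sup_inf = L.min(axis=0).max()        # max over lam of min over x
inf_sup = L.max(axis=1).min()        # min over x of max over lam

print(f"sup inf = {sup_inf:.4f},  inf sup = {inf_sup:.4f}")   # both approx 1 = inf P(0)
```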

7 Conclusion

Lagrangian (3) comes up in optimization, continuous and discrete, as well as in two-player, non-cooperative games. Either setting invites interpretation of multipliers as endogenous prices, fully synthesized within the problem at hand. In this optic, they had better exist and be bounded. Moreover, they ought compete well against exogenous prices, quoted in outside markets.