1 Introduction

Game theory and microeconomics—henceforth just called theory—abounds in agent-based models of decision problems. However, even among those models concerned with just a single agent, many display three objectionable features. First, the agent concerned should, with little or no hesitation, leap directly to a best choice. Second, his behavior ought to be totally goal-oriented—and never affected by any cost of change. Third, he is often depicted as fully detached from history, precedent or status quo.

Such modelling of individual behavior invites immediate objections. Indeed, an agent’s choice may emerge step-wise; his cost of change can be considerable; and each arrival stems from some point of departure.

It’s comforting, therefore, that algorithms geared toward best or better choice have—at least since Cauchy (1847)—been couched as iterative processes. These require more than just one step. It also comforts that recent decades have brought forward procedures that expressly account for adjustment costs.Footnote 1

If several agents take part, an important fourth feature adds extra complexity, namely: How can participants foresee or observe actions taken by others—and respond to these?

Regrettably, much theory sidesteps all four features. In fact, large parts move straight to steady states or terminal outcomes, if any, called equilibria (Osborne and Rubinstein 1994; Vega-Redondo 2003). Thereby, many studies shun queries as to equilibrium emergence and strategic learning (Fudenberg and Levine 1994; Peyton Young 2004). Serious issues about perfection, selection and stability of equilibrium then escape attention (van Damme 1991; Harsanyi and Selten 1988; Samuelson 1997; Selten 1975).

In social sciences, various concepts of steady-state solutions exert considerable attraction—for good reasons. Each instance describes how parties behave, communicate or fare in equilibrium. However, off such special states, the underlying concept can, by itself, hardly explain attainment of a solution.Footnote 2 To emphasize or justify some “focal” state as particularly plausible, at least one stable process ought eventually approximate or reach that distinguished outcome (Fisher 1983).

Two hypotheses have often been invoked to this end. Each is tempting, but neither is quite attractive. One posits that agents, even out of equilibrium, behave as in it. The other presumes that each party acts throughout as though fully foresighted, marvelously competent, and perfectly rational.

More realistic approaches ought to tolerate imperfections in agents’ capacities to choose, foresee or know.Footnote 3 Accordingly, here below, local perceptions replace global views, and improvements substitute for full optimization. While seeking own betterment, agents adapt, but usually in somewhat moderate or myopic manner (Miller and Page 2007). If so, might they eventually come to a halt? And then, where?

These questions motivate the paper. By way of preparation, Sect. 2 considers just one agent, isolated from others. By contrast, Sect. 3 lets him play normal-form games among non-cooperative strategists. There, convexity, in one form or another, plays important roles. Section 4 dispenses with convexity, letting play unfold in metric spaces. Provided the cost of change exceeds the distance, the metric setting may frame proximal procedures and Nash equilibria in new ways; see Theorem 4.1 and its corollary. Section 5 concludes by briefly considering extensive-form games of Stackelberg variety.

The two main novelties come late in the paper: Theorem 4.2 leans on Caristi’s metric fixed-point theorem to establish existence of strong Nash equilibria. Theorem 5.1 provides topological assumptions for convergence to Stackelberg solutions of the principal-agent sort.

The paper addresses various readers, including operations researchers and computer scientists concerned with multi-agent interaction, optimizers using proximal procedures, and theorists who consider fixed points or games in metric spaces.

2 Preliminaries concerning the single agent

This section introduces notation and preliminaries. To begin with, and to simplify, it considers just one agent.Footnote 4 At the outset, he holds a “position” x. If deviating from x to \(\hat{x}\), that transition gives him net benefit \(b(\hat{x}\left| x\right. ).\) His improvement or betterment

$$\begin{aligned} (\hat{x},x)\mapsto b(\hat{x}\left| x\right. )\in \mathbb {R\cup }\left\{ -\infty \right\} \end{aligned}$$

equals \(-\infty \) if \((\hat{x},x)\notin X\times X\) for some non-empty viability set X in the ambient space \(\mathbb {X}\) of alternatives. The “probabilistic” notation \(b(\hat{x} \left| x\right. )\) emphasizes that the agent, while conditioned by his point of departure x, seeks a suitable point of arrival \(\hat{x}.\) In particular, given \(x\in X\), he might

$$\begin{aligned} \text {maximize }b(\hat{x}\left| x\right. )\text { subject to }\hat{x} \in X. \end{aligned}$$
(1)

Many formalized decision problems mention no point of departure—or, implicitly, regard the latter as being of no or negligible importance. Moreover, upon leaping directly to a very best choice, the agent seemingly incurs no cost for “dislodging” himself.Footnote 5 To wit, classical and customary instances let

$$\begin{aligned} b(\hat{x}\left| x\right. )=\beta (\hat{x})-\beta (x) \end{aligned}$$

for some gross benefit function \(\beta :\mathbb {X\rightarrow R\cup } \left\{ -\infty \right\} \), having effective domain \(X:=\beta ^{-1}( \mathbb {R})\). Since x is sunk already, this model incorporates no adjustment costs. The agent in question appears fully goal-driven—and never troubled by friction or inertia. More realistically, proximal point methods (Rockafellar 1976; Teboulle 1997) posit

$$\begin{aligned} b(\hat{x}\left| x\right. )=\beta (\hat{x})-\beta (x)-c(\hat{x}\left| x\right. ) \end{aligned}$$
(2)

for some (adjustment) cost function \(c:\mathbb {X\times X\rightarrow R}_{+} \mathbb {\cup }\left\{ +\infty \right\} \) which vanishes on the diagonal: \( c(x\left| x\right. )=0\) \(\forall x\in X\).Footnote 6 No symmetry is presumed; it may well happen that \(c(\hat{x}\left| x\right. )\ne c(x\left| \hat{x}\right. )\); the forward fare can differ from the backward one. It often appears natural though, that c satisfies the triangle inequality: \(c(\hat{x}\left| x\right. )+c(\check{x} \left| \hat{x}\right. )\ge c(\check{x}\left| x\right. ).\) Then, (2) makes a direct move \(x\rightarrow \check{x}\) preferable to any indirect one \(x\rightarrow \hat{x}\rightarrow \check{x}.\)Footnote 7
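The preference for direct moves can be verified in one line: summing the net benefits (2) of the two legs and invoking the triangle inequality gives

```latex
\begin{aligned}
b(\hat{x}\mid x) + b(\check{x}\mid\hat{x})
  &= \beta(\check{x}) - \beta(x)
     - \bigl[c(\hat{x}\mid x) + c(\check{x}\mid\hat{x})\bigr] \\
  &\le \beta(\check{x}) - \beta(x) - c(\check{x}\mid x)
   \;=\; b(\check{x}\mid x).
\end{aligned}
```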

(2) supports the standing interpretation that \(b(\hat{x}\left| x\right. )\) denotes additional benefit in arrival state \(\hat{x},\) net of cost \(c(\hat{x}\left| x\right. )\) incurred upon departing directly from x.

If \(\hat{x}=x,\) the agent stays put; otherwise, he moves. A move from x to \(\hat{x}\) is declared (strictly) improving iff \(b(\hat{x}\left| x\right. )>0\). Naturally, suppose that staying put entails no improvement; that is, \(b(x\left| x\right. )\le 0\) for all \(x\in X.\) In many cases, \(b(x\left| x\right. )=0.\)

Stationary states stand out by allowing no improvement. They solve problem (1) and amount to contingent fixed points:

Definition 2.1

(stationary states). \(x\in X\) is declared stationary for the bivariate mapping \((\hat{x},x)\in X\times X\mapsto b(\hat{x}\left| x\right. )\in \mathbb {R}\) iff

$$\begin{aligned} x\in \arg \max \left\{ b(\hat{x}\left| x\right. ): \hat{x}\in X\right\} . \end{aligned}$$

Each stationary point x makes \(b(\cdot \left| x\right. )\le 0\) on X,  and in most cases \(b(x\left| x\right. )=0.\)

This framing of the agent’s decision problem raises the question: Is there some stationary state? The following positive (albeit particular) answer is just a restatement of Ky Fan’s inequality (Aubin and Ekeland 1984; Fan 1972):

Theorem 2.1

(on existence of stationary states). Suppose X is a non-empty compact convex subset of a topological vector space \(\mathbb {X}.\) Also suppose \(b(\hat{x}\left| x\right. )\) is quasi-concave in \(\hat{x}\in X,\) lower semicontinuous in \(x\in X,\) and \(b(x\left| x\right. )\le 0\) \(\forall x.\) Then there exists at least one stationary state. \(\square \)

Theorem 2.1 points to topological vector spaces as tractable settings. It also emphasizes the important role of closed convex preferences.Footnote 8

Granted existence of at least one stationary state, how might the agent eventually reach one of those—and come to rest there? As in Polak (1997) and Zangwill (1969), it is convenient to model his step-wise adjustments in terms of a point-to-set correspondence \(A:X\rightrightarrows X\). It appears natural to have

$$\begin{aligned} A(x)\subseteq \left\{ \hat{x}\in X:b(\hat{x}\left| x\right. )\ge 0\right\} . \end{aligned}$$
(3)

From some accidental or historical point \(x^{0}\in X,\) there emanates an iterative process

$$\begin{aligned} x^{k+1}\in A(x^{k})\text {, }k=0,1,\ldots \end{aligned}$$
(4)
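As an elementary numeric illustration (not taken from the paper), the workhorse model (2) with a quadratic adjustment cost turns process (4) into a classical proximal-point iteration. The gross benefit \(\beta \) and the cost weight `lam` below are assumed toy choices:

```python
# A minimal sketch of workhorse model (2): net benefit of a move equals
# gross gain beta(x_hat) - beta(x) minus an adjustment cost c(x_hat | x).
# Here beta(x) = -(x - 3)**2 and c(x_hat | x) = (lam/2)*(x_hat - x)**2,
# both illustrative choices, not the paper's.

def beta(x):
    return -(x - 3.0) ** 2

def prox_step(x, lam=1.0):
    # argmax over x_hat of beta(x_hat) - (lam/2)*(x_hat - x)**2,
    # available in closed form for this quadratic beta.
    return (6.0 + lam * x) / (2.0 + lam)

x = 0.0                # accidental or historical point of departure x^0
for _ in range(60):
    x = prox_step(x)   # iterative process (4)

print(round(x, 6))     # converges to the stationary state x = 3
```

Each step strictly improves unless the iterate is already stationary, matching Definition 2.1: at x = 3 the update maps the point to itself.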

Proposition 2.1

(on appropriate cluster points). Let X be a closed subset of a topological space \(\mathbb {X}\). Suppose (4) generates a summable sequence \(k\mapsto b(x^{k+1}\left| x^{k}\right. )\ge 0\). Further suppose that each non-stationary point \(x\in X\) has some neighborhood \(\mathcal {N}\) and number \(\delta >0\) such that

$$\begin{aligned} b(\hat{\chi }\left| \chi \right. )\ge \delta \text { for all } \hat{ \chi }\in A(\chi )\text { when }\chi \in X\cap \mathcal {N}\text {.} \end{aligned}$$
(5)

Then, either the sequence \((x^{k})\) is finite with a stationary last point—or, every cluster point of the infinite sequence must be stationary.

Proof

In the viable set X, let \(x=\lim _{k\in K}x^{k}\) for some infinite subsequence K of natural numbers. Suppose x isn’t stationary. With no loss of generality, take \(x^{k}\in \mathcal {N}\) for all \(k\in K.\) Then one obtains the contradiction

$$\begin{aligned} +\infty >\sum _{k=0}^{\infty }b(x^{k+1}\left| x^{k}\right. )\ge \sum _{k\in K}b(x^{k+1}\left| x^{k}\right. )=+\infty . \end{aligned}$$

\(\square \)

Granted (3), if process (4) reaches a stationary point x (Definition 2.1), and

$$ \begin{aligned} \hat{x}\in A(x) \& \text { }\hat{x}\ne x\Longrightarrow b(\hat{x} \left| x\right. )>0\text {,} \end{aligned}$$
(6)

then

$$\begin{aligned} A(x)=\left\{ x\right\} \text { and }b(x\left| x\right. )=0. \end{aligned}$$
(7)

For these reasons, the instance (2), (3), always with \( c\ge 0,\) will be central and henceforth called the workhorse model. \(\square \)

The results of this section apply directly to interaction among several agents \(i\in I\), each just choosing his component \(x_{i}\) of an overall profile \(x=(x_{i}).\) Such settings are commonly framed as games—considered next.

3 Non-cooperative games

Henceforth I always denotes a finite ensemble of “players”, at least two of them. Then, by a strategy profile \(x=(x_{i})\) is meant a mapping \(i\in I\mapsto x_{i}\in X_{i}\) where \(X_{i}\) is a non-empty “viability set” in some ambient space \(\mathbb {X}_{i}\) of alternatives. Let \(\mathbb {X}:=\Pi _{i\in I}\mathbb {X}_{i}\). Further, posit \(X:=\Pi _{i\in I}X_{i},\) unless otherwise noted.

Given a strategy profile \(x=(x_{i})\in X\), suppose member \(i\in I\) anticipates net benefit \(b_{i}(\hat{x}_{i}\left| x\right. )\in \mathbb {R}\) upon deviating unilaterally, within his viability set \(X_{i}\), from actual strategy \(x_{i}\) to another one \(\hat{x}_{i}.\) Then, by assumption, \(b_{i}( \hat{x}_{i}\left| x\right. )\) already incorporates any cost of change. In terms of the residual profile \(x_{-i}:=(x_{j})_{j\ne i}\), actually implemented by his rivals, player i acts as though the updated profile equals \((\hat{x}_{i},x_{-i}).\) That belief is justified iff he alone deviates.

Definition 3.1

(Non-cooperative stationary states). A strategy profile \(x=(x_{i})\in X\) is declared multi-agent stationary (or simply stationary) iff

$$\begin{aligned} x_{i}\in \arg \max \left\{ b_{i}(\hat{x}_{i}\left| x\right. ):\hat{x} _{i}\in X_{i}\right\} \forall i\in I. \end{aligned}$$
(8)

In much-studied but special instances, multi-agent stationarity adds nothing to the concept of Nash equilibrium (Osborne and Rubinstein 1994):

Proposition 3.1

(on multi-agent stationary states as Nash equilibria). Suppose player \(i\in I\) seeks to maximize gross benefit \(\beta _{i}:\mathbb {X\rightarrow R\cup }\left\{ -\infty \right\} ,\) with \(dom\beta _{i}:=\beta _{i}^{-1}(\mathbb {R})=X\), and that

$$\begin{aligned} b_{i}(\hat{x}_{i}\left| x\right. )=\beta _{i}(\hat{x}_{i},x_{-i})- \beta _{i}(x_{i},x_{-i}). \end{aligned}$$
(9)

Then, a state \(x\in X\) is multi-agent stationary (8) iff it’s a Nash equilibrium of the non-cooperative game \(G:=(\beta _{i},X_{i})_{i\in I}\), meaning

$$\begin{aligned} x_{i}\in \arg \max \left\{ \beta _{i}(\hat{x}_{i},x_{-i}):\text { }\hat{x}_{i}\in X_{i}\right\} \text { }\forall i\in I.\text { } \end{aligned}$$

\(\square \)

It’s objectionable that Proposition 3.1 mentions no adjustment costs. Each player appears fully goal-driven. Nobody is ever troubled by friction or inertia. More realistically, instead of (9), following the lead of proximal point methods and workhorse model (2), (3), one may posit

$$\begin{aligned} b_{i}(\hat{x}_{i}\left| x\right. )=\beta _{i}(\hat{x}_{i},x_{-i})- \beta _{i}(x)-c_{i}(\hat{x}_{i}\left| x\right. ) \end{aligned}$$
(10)

for some cost function \(c_{i}:\mathbb {X}_{i}\times \mathbb {X}\rightarrow \mathbb {R}_{+}\mathbb {\cup }\left\{ +\infty \right\} \) which is nil when \(\hat{ x}_{i}=x_{i}\).Footnote 9 That function could be asymmetric in the agent’s own arguments \((\hat{x}_{i},x_{i})\). The dependence of \(c_{i}(\hat{x}_{i}\left| x\right. )\) on the entire profile x fits games featuring congestion (Rosenthal 1973) or use of common resources (Flåm 2017). That feature of (10) differs from the two-player games in Attouch et al. (2007) where \(c_{i}(\hat{ x}_{i}\left| x\right. )=c_{i}(\hat{x}_{i}\left| x_{i}\right. )\).

If a Nash solution isn’t unique, (10) also bears on equilibrium refinement, selection and stability (van Damme 1991; Harsanyi and Selten 1988); see the remark below on strong equilibria, motivating Theorem 4.2. While \(\hat{x}_{i}\mapsto \beta _{i}(\hat{x}_{i},x_{-i})\) is the customary Nash maximand, (10) subtracts a perturbation \( c_{i}(\hat{x}_{i}\left| x\right. )\ge 0\) which disappears in equilibrium but is apt to affect behavior elsewhere; see Geanakoplos (2000).Footnote 10 Indeed, out of equilibrium, considerable cost of change may steady a player’s trembling hand and focus his mind.
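A numeric sketch of (10), under assumed quadratic benefits and adjustment costs (both hypothetical, not the paper's): each player's proximal better response subtracts a cost \(c_{i}=\frac{\lambda }{2}(\hat{x}_{i}-x_{i})^{2}\), yet iterated play still settles at the Nash equilibrium, where the perturbation vanishes.

```python
# Hypothetical two-player game with beta_i(x) = -(x_i - 0.5*x_{-i} - 1)**2
# and adjustment cost c_i = (lam/2)*(x_hat_i - x_i)**2, as in (10).
# Each stage, both players apply the closed-form proximal better response.

def prox_response(own, other, lam=1.0):
    # argmax over z of -(z - (0.5*other + 1))**2 - (lam/2)*(z - own)**2
    target = 0.5 * other + 1.0
    return (2.0 * target + lam * own) / (2.0 + lam)

x1, x2 = 0.0, 5.0                   # arbitrary points of departure
for _ in range(80):
    x1, x2 = prox_response(x1, x2), prox_response(x2, x1)

print(round(x1, 4), round(x2, 4))   # Nash equilibrium (2.0, 2.0)
```

At the limit both best-response conditions hold with zero adjustment cost, so the rest point is a Nash equilibrium of the unperturbed game, consistent with Proposition 3.1.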

In whatever form, \(b_{i}(\hat{x}_{i}\left| x\right. )\) is meant to measure cardinal betterment for player i. Even with merely ordinal comparisons, there is a noteworthy link to characterization and existence of stationary points:

Proposition 3.2

(on concave ordinal improvements). For each \(i\in I\), suppose \(X_{i}\) is a non-empty compact convex subset of some topological vector space \(\mathbb {X}_{i}\). Further suppose that

$$\begin{aligned} b_{i}(\hat{x}_{i}\left| x\right. )>0\text { }\Longrightarrow \text { } \beta _{i}(\hat{x}_{i},x_{-i})>\beta _{i}(x) \end{aligned}$$
(11)

with gross benefit function \(\beta _{i}:X\rightarrow \mathbb {R}\) concave in \(\hat{x}_{i}\in X_{i}\) and jointly continuous in \(x\in X.\) Then, there exists at least one Nash equilibrium in the game \(G=(\beta _{i},X_{i})_{i\in I}\). Each such equilibrium x is stationary (Definition 2.1) for

$$\begin{aligned} b(\hat{x}\left| x\right. ):=\sum _{i\in I}[\beta _{i}(\hat{x} _{i},x_{-i})-\beta _{i}(x)]. \end{aligned}$$
(12)

When moreover, \(b_{i}(x_{i}\left| x\right. )=0\) \(\forall i,\) that equilibrium is multi-agent stationary (8).

Proof

\(b(\hat{x}\left| x\right. )\) is concave in \(\hat{x},\) continuous in x, and \(b(x\left| x\right. )=0\) for all \(x\in X.\) By Ky Fan’s inequality (Theorem 2.1), there exists a point \(x\in X\) such that \(b( \hat{x}\left| x\right. )\le 0\) for all \(\hat{x}\in X.\) Consequently, \( \beta _{i}(\hat{x}_{i},x_{-i})\le \beta _{i}(x)\) for each \(\hat{x}_{i}\in X_{i}\) and \(i\in I\). So, \(x=(x_{i})\) is a Nash equilibrium and stationary (Definition 2.1) for (12). Provided \(b_{i}(x_{i}\left| x\right. )=0\) \( \forall i,\) condition (8) is also satisfied. \(\square \)

Maintaining the conditions on \(\beta _{i},\) Proposition 3.2 fits instance (10) when \(c_{i}(\hat{x}_{i}\left| x\right. )\ge 0\) is convex in \(\hat{x}_{i}\) and continuous in x. It then also falls within the framework of Theorem 3.1.

Monderer and Shapley (1996) studied non-cooperative games \( G=(\beta _{i},X_{i})_{i\in I}\) in which

$$\begin{aligned} \beta _{i}(\hat{x}_{i},x_{-i})-\beta _{i}(x)>0\Longrightarrow P(\hat{x} _{i},x_{-i})-P(x)>0 \end{aligned}$$

for some player-independent generalized ordinal potential \(P: \mathbb {X\rightarrow R\cup }\left\{ -\infty \right\} \) with P finite on X. Then, P may replace \(\beta _{i}\) in (11).
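Monderer and Shapley's condition can be checked mechanically in a small example. The two-road congestion game below (an assumed toy instance, not from the paper) admits Rosenthal's potential, so every strictly improving unilateral move raises P, and better-response play terminates at a Nash equilibrium.

```python
# Hypothetical two-road congestion game (Rosenthal's class): each of two
# players picks road 'A' or 'B'; delay equals the number of users on the
# chosen road, and benefit is minus that delay.  Rosenthal's potential
# P(x) = -sum over roads of (1 + 2 + ... + load) rises at every strictly
# improving unilateral deviation, so better-response play must terminate.

def load(profile, road):
    return sum(1 for r in profile if r == road)

def benefit(profile, i):
    return -load(profile, profile[i])

def potential(profile):
    return -sum(k for road in "AB" for k in range(1, load(profile, road) + 1))

profile = ["A", "A"]          # both start on the same road
improved = True
while improved:
    improved = False
    for i in (0, 1):
        for road in "AB":
            trial = list(profile)
            trial[i] = road
            if benefit(trial, i) > benefit(profile, i):
                assert potential(trial) > potential(profile)  # P strictly rises
                profile = trial
                improved = True

print(sorted(profile))        # players split across roads: ['A', 'B']
```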

In many games, strategic interaction works via objectives and constraints; see (Flåm and Ruszczyński 2008; Flåm 2017) and references therein. In fact, besides the individual restrictions that \(x_{i}\in X_{i}\) \(\forall i,\) choice could also be subject to coupling constraints in that each strategy profile \(x=(x_{i})\) must belong to a non-empty, “non-rectangular” subset \( X\varsubsetneq \Pi _{i\in I}X_{i}\). Then, still with cost of strategy change embedded in \(b_{i}(\hat{x}_{i}\left| x\right. )\), letting

$$\begin{aligned} b(\hat{x}\left| x\right. ):=\sum _{i\in I}b_{i}(\hat{x}_{i}\left| x\right. ), \end{aligned}$$
(13)

a profile \(x\in X\) is stationary—and declared a generalized Nash equilibrium—iff Definition 2.1 holds. Theorem 2.1 immediately entails

Theorem 3.1

(on stationary states and generalized Nash equilibria). Suppose X is a non-empty compact convex subset of a topological vector space \(\mathbb {X}\). If \((\hat{x},x)\in X\times X\longmapsto b(\hat{x}\left| x\right. )\in \mathbb {R}\) (13) is quasi-concave in \(\hat{x}\), lower semicontinuous in x, and \(b(x\left| x\right. )\le 0\) \(\forall x\in X\), then there exists a generalized Nash equilibrium. \(\square \)

Remark

(on strong Nash equilibria). With several players present, expression (13) hides, or glosses over, the important fact that, in general, individual benefits need be neither comparable nor transferable—whence they do not easily add up to a meaningful overall criterion. So, while format (13) may shed light on computation or existence of stationary points, it doesn’t necessarily suit models of disequilibrium dynamics.

All the same, workhorse instance (2), (3) is apt to select strong Nash equilibria, withstanding joint deviations of several players. Partly for that reason, partly for the value of a non-standard, novel setting, the next section makes that instance central.

4 Cost of displacement in metric spaces

This section specializes one feature but generalizes another. The space \( \mathbb {X}\) is now metric, but convexity plays no role any longer. A main motivation is to loosen the grip that convexity conditions hold on game theory.

Here, to bring different problems under a common umbrella and save space, \( x\in \mathbb {X}\) denotes either the position of an isolated optimizer or the strategy profile \(x=(x_{i})\) of several interacting agents \(i\in I\).

For greater flexibility in modelling step-wise adaptations, one may replace the time-invariant A in (4) with stage-dependent correspondences \( A^{k}:\mathbb {X\rightrightarrows X}\) to have

$$\begin{aligned} x^{k+1}\in A^{k}(x^{k})\text {, }k=0,1,\ldots \end{aligned}$$
(14)

Definition 4.1

(On asymptotic closure and regularity) In a metric space \((\mathbb {X},d)\), a limiting correspondence \(A: \mathbb {X}\rightrightarrows \mathbb {X}\) closes the sequence \((A^{k})\) if

$$\begin{aligned} (\hat{x}^{k},x^{k})\rightarrow (\hat{x},x)\text { with }\hat{x} ^{k}\in A^{k}(x^{k})\text { implies }\hat{x}\in A(x). \end{aligned}$$
(15)

\((A^{k})\) is declared asymptotically regular if \( x^{k+1}\in A^{k}(x^{k})\) yields \(d(x^{k+1},x^{k})\rightarrow 0\).Footnote 11

The rest of this section only considers workhorse model (2), (3) with \(c\ge 0.\) Suppose the effective domain \( dom\beta :=\beta ^{-1}(\mathbb {R})=:X\) of the gross benefit function \( \beta :\mathbb {X\rightarrow R\cup }\left\{ -\infty \right\} \) is closed.

For illustration of (14), replace the functions \(\beta \), c with stage-dependent versions \(\beta ^{k}:\mathbb {X\rightarrow R\cup }\left\{ -\infty \right\} \) with \(dom\beta ^{k}\) containing X, and \(c^{k}:\mathbb {X\times X\rightarrow R}_{+}\mathbb {\cup }\left\{ +\infty \right\} ,\) and posit

$$\begin{aligned} A^{k}(x):=\left\{ \hat{x}\in X:\beta ^{k+1}(\hat{x})-\beta ^{k}(x)\ge c^{k}( \hat{x}\left| x\right. )\right\} . \end{aligned}$$
(16)

Proposition 4.1

(on convergence). Let \((\mathbb {X},d)\) be complete metric, \(X\subseteq \mathbb {X}\) non-empty closed, and the sequence \((A^{k})\) (16) closed by the workhorse correspondence

$$\begin{aligned} x\in X\rightrightarrows A(x)=\left\{ \hat{x}\in X:\beta (\hat{x} )-\beta (x)\ge c(\hat{x}\left| x\right. )\ge 0\right\} . \end{aligned}$$

Suppose any sequence \(x^{k+1}\in A^{k}(x^{k}),\) \(k=0,1,\ldots ,\) has \(\lim \sup \beta ^{k}(x^{k})\) finite and

$$\begin{aligned} c^{k}(x^{k+1}\left| x^{k}\right. )\ge \delta d(x^{k+1},x^{k}) \end{aligned}$$

for some number \(\delta >0\). Then, \((A^{k})\) is asymptotically regular, and any sequence \((x^{k})\) generated by (14) & (16) converges to a stationary point (Definition 2.1).

Proof

By telescoping,

$$\begin{aligned} +\infty >\lim \sup \beta ^{k}(x^{k})-\beta ^{0}(x^{0})\ge \sum _{k=0}^{ \infty }c^{k}(x^{k+1}\left| x^{k}\right. )\ge \delta \sum _{k=0}^{\infty }d(x^{k+1},x^{k}). \end{aligned}$$
(17)

Since the metric space is complete, (17) implies that \( x^{k}\rightarrow x\) for some unique limit x. Also by (17), there is asymptotic regularity: \(d(x^{k+1},x^{k})\rightarrow 0.\) Hence, by closure (15), \(x\in A(x)\), and stationarity obtains. \(\square \)

Without convexity here, existence of a stationary point (Definition 2.1) requires other arguments. For these, a minor addition to Caristi’s theorem (Caristi 1976) will do:

Theorem 4.1

(on singleton fixed points of a correspondence). Let the space \((X,d)\) be complete metric, \( A:X\rightrightarrows X\) a point-to-set correspondence with non-empty values, and \(\beta :X\rightarrow \mathbb {R}\) bounded above and upper semicontinuous. If, for each \(x\in X\) and \(\hat{x}\in A(x),\)

$$\begin{aligned} \beta (\hat{x})-\beta (x)\ge d(\hat{x},x), \end{aligned}$$
(18)

then, A has a fixed point x at which A(x) reduces to a singleton (7). \(\square \)

Proof

The argument, following Kirk (1976), is included for completeness. The binary relation

$$\begin{aligned} \hat{x}\succsim x\Longleftrightarrow {\beta }(\hat{x})-{\beta } (x)\ge d(\hat{x},x) \end{aligned}$$

defines a partial order on X. Consider totally ordered subsets of X, all containing some fixed reference point. By Zorn’s lemma, there exists a totally ordered \(\mathcal {X}\subseteq X\) which is maximal under set inclusion \(\subseteq \). Let \(\mathcal {X}=\left\{ x_{n}:n\in N\right\} \) for some index set or “net” N, totally ordered in the same manner: \(x_{\hat{n}}\succsim x_{n}\Longleftrightarrow \hat{n}\succsim n.\)

The net \(n\in N\mapsto {\beta }(x_{n})\in \mathbb {R}\) is increasing. Being bounded above, \(\lim _{n\uparrow }{\beta }(x_{n})=:r\in \mathbb {R}\) is well defined (Kelley 1955). Consequently, given any \(\varepsilon >0,\) there exists \(n(\varepsilon )\in N\) such that

$$\begin{aligned} n\succsim n(\varepsilon )\Longrightarrow r\ge {\beta }(x_{n})\ge r-\varepsilon . \end{aligned}$$

Therefore, if \(\hat{n}\succsim n\succsim n(\varepsilon ),\) then

$$\begin{aligned} \varepsilon \ge {\beta }(x_{\hat{n}})-{\beta }(x_{n})\ge d(x_{ \hat{n}},x_{n}). \end{aligned}$$

This proves that \((x_{n})_{n\in N}\) is a Cauchy net (Kelley 1955), hence has a limit x. Now, in the last string, let \(\hat{n}\uparrow \), and invoke the upper semicontinuity of \(\beta ,\) to get

$$\begin{aligned} {\beta }(x)-{\beta }(x_{n})\ge \lim _{\hat{n}\uparrow } { \beta }(x_{\hat{n}})-{\beta }(x_{n})\ge d(x,x_{n}), \end{aligned}$$

and thereby, \(x\succsim x_{n}\). Since \(\mathcal {X}\) is maximal under inclusion, \(x\in \mathcal {X}\). Also, for any \(\hat{x}\in A(x),\) it follows from (18) that \(\hat{x}\succsim x\). So, again since \(\mathcal {X}\) is maximal under inclusion, \(\hat{x}\in \mathcal {X}\). But thereby, \(x\succsim \hat{x}\). The upshot is that \(\left\{ x\right\} =A(x).\) \(\square \)

Specializing A to be the workhorse model (2) and (3), with \(c\ge 0\), it follows forthwith:

Corollary 4.1

(on singleton stationary points). Let \((X,d)\) be a complete metric space, and \(\beta :X\rightarrow \mathbb {R}\) bounded above and upper semicontinuous. For each \(x\in X\), suppose the set

$$\begin{aligned} A(x)\subseteq \left\{ \hat{x}\in X \left| b(\hat{x} \left| x\right. ):={\beta }(\hat{x})-{\beta }(x)-c(\hat{x} \left| x\right. )\ge 0\right. \right\} \end{aligned}$$

is non-empty, with \(c(\cdot \left| x\right. )\ge d(\cdot ,x)\) upper semicontinuous.Footnote 12 Then, there exists a stationary point x (Definition 2.1) at which

$$\begin{aligned} \left\{ x\right\} =A(x)=\arg \max \left\{ b(\hat{x}\left| x\right. ):\hat{x}\in X\right\} . \end{aligned}$$

\(\square \)

Concluding this section is a metric existence theorem on strong Nash equilibrium. As is customary, a player may impact the objectives of others but, less commonly, also their constraints. Moreover, the selected equilibrium becomes strong by withstanding joint deviations. From Theorem 4.1 and Corollary 4.1 it follows directly:

Theorem 4.2

(on strong Nash equilibria). Let agent \(i\in I\) choose his strategy \(x_{i}\) in a subset \(X_{i}\) of a complete metric space \((\mathbb {X}_{i},d_{i}).\) Suppose non-cooperative play is constrained to a non-empty closed set \( X\subseteq \Pi _{i\in I}X_{i}\) in the product space \(\mathbb {X}=\Pi _{i\in I}\mathbb {X}_{i}\), endowed with a compatible metric d. Let agent i seek maximization of his own benefit \( (x_{i},x_{-i})=x\in X\mapsto \beta _{i}(x_{i},x_{-i})\in \mathbb {R}\), assumed bounded above and upper semicontinuous. Then, under the hypotheses of Corollary 4.1, and using workhorse model A with specification (12), there exists a Nash equilibrium \(x\in X\) which is strong in that

$$\begin{aligned} (x_{i})_{i\in \mathcal {I}}=:x_{\mathcal {I}}\in \arg \max \left\{ \sum _{i\in \mathcal {I}}\beta _{i}(\hat{x}_{\mathcal {I}},x_{-\mathcal {I}}):(\hat{x}_{ \mathcal {I}},x_{-\mathcal {I}})\in X\right\} \forall \mathcal {I} \subseteq I. \end{aligned}$$

\(\square \)

5 Stackelberg games

Stationarity (8) captures non-cooperative equilibrium in strategic-form games where players commit their moves once and simultaneously. That form hardly illuminates extensive-form instances where players act step by step.Footnote 13 In such games, some “protocol” governs who moves next—on the basis of which information.

To illustrate some of the difficulties that emerge, this section concludes by considering, in some generality, the simplest yet utterly important setting of Stackelberg sort (Lignola and Morgan 2017). Merely two players take part: \(I=\left\{ \pm 1\right\} \). Each moves just once, but in a specific order.

The leading player 1 first chooses some \(x_{1}\in X_{1}\). Observing that choice, the other player \(-1\) responds, choosing some \(x_{-1}\in X_{-1}.\) Thereafter, each collects his benefit \(\beta _{i}(x_{i},x_{-i})\). Every set \( X_{i}\) is a compact topological space. As before, let \(X:=\Pi _{i\in I}X_{i}.\)

In essence, the follower reduces to a strategic dummy. He is just an “agency”, a robot who selects some best response

$$\begin{aligned} x_{-1}\in \mathcal {R}(x_{1}):=\arg \max \left\{ \beta _{-1}(\hat{x}_{-1},x_{1}): \hat{x}_{-1}\in X_{-1}\right\} . \end{aligned}$$
(19)

By contrast, up front, the leader ought

$$\begin{aligned} \text {maximize }\beta _{1}(x_{1},\mathcal {R}(x_{1}))\text { subject to } x_{1}\in X_{1}. \end{aligned}$$

His task is often rather demanding. He had better foresee or guess— or outright be told—the entire response correspondence \(\mathcal {R}(\cdot )\) (19). Moreover, if some \(\mathcal {R}(x_{1})\) isn’t a singleton, which selection therein appears appropriate?Footnote 14

Suppose the two agents play this stage game iteratively, each remembering his most recent choice. Can they eventually learn—and implement—an equilibrium of the underlying Nash–Stackelberg variety?Footnote 15 For a positive answer, suppose players, upon entering stage \(k=0,1,\ldots ,\) with most recent choices \(x_{i}^{k}\in X_{i}\) already sunk, use conditional benefit functions

$$\begin{aligned} (x_{i},x_{-i})\mapsto \beta _{i}^{k}(x_{i}\left| x_{i}^{k}\right. ,x_{-i})\le \beta _{i}(x_{i},x_{-i}) \forall i. \end{aligned}$$
(20)

Inequality (20) reflects two features. First, agent i incurs some non-negative cost by deviating from his last choice \(x_{i}^{k}\). Second, each conditional benefit \(\beta _{i}^{k}\) underestimates the true version \( \beta _{i}\). Suppose that

$$\begin{aligned} x^{k}\rightarrow _{X}x\Longrightarrow \lim \sup \beta _{1}^{k}(\chi _{1}\left| x_{1}^{k}\right. ,x_{-1}^{k})\ge \lim \inf \beta _{1}(\chi _{1},x_{-1}^{k})\text { }\forall \chi _{1}\in X_{1}\text {.} \end{aligned}$$
(21)

Similarly, suppose the condition \(x_{-1}^{k}\rightarrow _{X_{-1}}x_{-1}\) & \( \chi ^{k}\rightarrow _{X}\chi \) implies

$$\begin{aligned} \lim \sup \beta _{-1}^{k}(\chi _{-1}^{k}\left| x_{-1}^{k}\right. ,\chi _{1}^{k})\ge \lim \inf \beta _{-1}(\chi ^{k}). \end{aligned}$$
(22)

Assumptions (21) and (22) capture that ultimately, as play settles, adjustment costs disappear. For (21), provided the leader’s choice \(x_{1}^{k}\in X_{1}\) converges to some \(x_{1}\), asymptotically his cost of change doesn’t affect his benefit. For (22), provided \(x_{-1}^{k}\rightarrow _{X_{-1}}x_{-1}\) and the strategy pair \(\chi ^{k}\in X\) converges to some \(\chi ,\) asymptotically the follower’s cost of change has no impact on his benefit.Footnote 16

At stage k the leader believes or expects that the follower will apply a single-valued response function \(r^{k}:X_{1}\rightarrow X_{-1}\)—not necessarily a selection of \(\mathcal {R}\) (19). His belief or expectation must, however, be approximately rational in so far as, for any most recent \(\chi _{1},\chi _{-1}\),

$$\begin{aligned} {\beta }_{-1}^{k}(r^{k}(\chi _{1})\left| \chi _{-1}\right. ,\chi _{1})\ge \sup _{\hat{\chi }_{-1}\in \text { }X_{-1}}{\beta }_{-1}^{k}( \hat{\chi }_{-1}\left| \chi _{-1}\right. ,\chi _{1})-\varepsilon ^{k}\text { with }\varepsilon ^{k}\rightarrow 0^{+}. \end{aligned}$$
(23)

On these premises, at stage k,  the leader chooses an update

$$\begin{aligned} x_{1}^{k+1}\in A_{1}^{k}(x_{1}^{k}):=\arg \max {\beta } _{1}^{k}(\cdot \left| x_{1}^{k}\right. ,r^{k}(\cdot )). \end{aligned}$$
(24)

After observing \(x_{1}^{k+1},\) the follower comes up with a best response

$$\begin{aligned} x_{-1}^{k+1}\in A_{-1}^{k}(x_{-1}^{k},x_{1}^{k+1}):=\arg \max {\beta } _{-1}^{k}(\cdot \left| x_{-1}^{k}\right. ,x_{1}^{k+1}). \end{aligned}$$
(25)

The resulting process is not construed as realizing equilibrium play in an infinitely repeated stage game. More simply, the limit should just qualify as a Nash outcome in a single interaction over two stages. Note that, because of the sequential mode of play, the coupled updates (24), (25) do not fit (16). Nonetheless, the following holds:
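The scheme (24), (25) can be caricatured numerically. In the sketch below, all functional forms are assumed, not the paper's: the follower's response functions \(r^{k}=r\) are exact (so \(\varepsilon ^{k}=0\) in (23)), and the leader pays a quadratic adjustment cost whose weight vanishes, in the spirit of (20)–(22).

```python
# Stylized Stackelberg learning, a toy instance only.  The follower's exact
# response is r(x1) = 0.5*x1 (from beta_{-1} = -(x_{-1} - 0.5*x1)**2), and
# the leader maximizes beta_1(x1, x_{-1}) = -(x1 - 4)**2 + x_{-1}, paying
# at stage k a quadratic adjustment cost with vanishing weight lam_k.

def r(x1):
    return 0.5 * x1                     # follower's response function

x1 = 0.0                                # leader's initial, sunk choice
for k in range(40):
    lam = 1.0 / (k + 1)                 # vanishing adjustment-cost weight
    # closed-form argmax over z of -(z-4)**2 + r(z) - (lam/2)*(z - x1)**2,
    # the leader's update (24) for these quadratic data
    x1 = (8.5 + lam * x1) / (2.0 + lam)

print(round(x1, 4), round(r(x1), 4))    # Stackelberg solution (4.25, 2.125)
```

As the cost weight fades, the fixed point of the update solves the leader's unperturbed problem, and the follower's reply lies in \(\mathcal {R}(x_{1})\), matching conclusion (28).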

Theorem 5.1

(convergence in Stackelberg games). Let hypotheses (20)–(23) be in force. Suppose each function \(x\in X\mapsto {\beta }_{i}(x)\) is upper semicontinuous, and that the leader’s objective \({\beta } _{1}(x_{1},x_{-1})\) is lower semicontinuous in \(x_{-1}\in X_{-1}.\) Also suppose that for any point \(\chi =(\chi _{1},\chi _{-1})\in X\) and sequence \(\chi _{1}^{k}\in X_{1}\rightarrow \chi _{1}\), there exists a sequence \(\chi _{-1}^{k}\in X_{-1}\rightarrow \chi _{-1}\) such that \(\chi ^{k}:=(\chi _{1}^{k},\chi _{-1}^{k})\) yields

$$\begin{aligned} \lim \inf {\beta }_{-1}(\chi ^{k})\ge {\beta }_{-1}(\chi ). \end{aligned}$$
(26)

If \(r^{k}\) converges continuously to some \( r:X_{1}\rightarrow X_{-1}\), meaning

$$\begin{aligned} x_{1}^{k}\rightarrow _{X_{1}}x_{1}\Longrightarrow r^{k}(x_{1}^{k})\rightarrow r(x_{1}), \end{aligned}$$
(27)

then it holds for each limit \(x_{1}=\lim x_{1}^{k}\in X_{1}\) of the leader’s play (24) that

$$\begin{aligned} x_{1}\in \arg \max {\beta }_{1}(\cdot ,r(\cdot ))\text { and } x_{-1}:=r(x_{1})\in \arg \max {\beta }_{-1}(\cdot ,x_{1})=\mathcal {R} (x_{1}). \end{aligned}$$
(28)

If, moreover, \((X_{1},d)\) is a metric space and \( d(x_{1}^{k+1},x_{1}^{k})\rightarrow 0,\) then each cluster point \(x_{1}\) of \((x_{1}^{k})\) satisfies (28).

Proof

Player 1 chooses \(x_{1}^{k}\) at stage k. Suppose \( x_{1}=\lim x_{1}^{k}\). Then, by continuous convergence of the response functions (27),

$$\begin{aligned} x_{-1}:=\lim r^{k}(x_{1}^{k+1})=\lim r^{k}(x_{1}^{k})=r(x_{1}) \end{aligned}$$

is well defined. With \(x=(x_{i}),\) it holds for any \(\chi _{1}\in X_{1}\) that

$$\begin{aligned} {\beta }_{1}(x)&\ge \lim \sup {\beta } _{1}(x_{1}^{k+1},r^{k}(x_{1}^{k+1}))\ge ^{{(20)}} \lim \sup {\beta }_{1}^{k}(x_{1}^{k+1}\left| x_{1}^{k}\right. ,r^{k}(x_{1}^{k+1})) \\&\ge ^{{(24)}} \lim \sup {\beta } _{1}^{k}(\chi _{1}\left| x_{1}^{k}\right. ,r^{k}(\chi _{1})) \\&\ge ^{{(21)}} \lim \inf {\beta }_{1}(\chi _{1},r^{k}(\chi _{1}))\ge {\beta }_{1}(\chi _{1},r(\chi _{1})). \end{aligned}$$

The first inequality derives from the upper semicontinuity of \({ \beta }_{1}.\) The last follows from the lower semicontinuity of \( { \beta }_{1}(\chi _{1},\cdot )\) and (27). Thus, \(x_{1}\in \arg \max {\beta }_{1}(\cdot ,r(\cdot ))\).

Further, for the same sequence \(x_{1}^{k}\rightarrow x_{1}\) and any \(\chi _{-1}\in X_{-1}\) there exists a sequence \(\chi _{-1}^{k}\rightarrow _{X_{-1}}\chi _{-1}\) such that (26) holds with \((x_{1}^{k},\chi _{-1}^{k})\rightarrow (x_{1},\chi _{-1})\). So,

$$\begin{aligned} {\beta }_{-1}(x)&\ge \lim \sup {\beta } _{-1}(r^{k}(x_{1}^{k+1}),x_{1}^{k+1})\ge ^{{(20)}} \lim \sup {\beta }_{-1}^{k}(r^{k}(x_{1}^{k+1})\left| x_{-1}^{k}\right. ,x_{1}^{k+1}) \\&\ge ^{{(23)}}\lim \sup [{\beta }_{-1}^{k}(\chi _{-1}^{k+1}\left| x_{-1}^{k}\right. ,x_{1}^{k+1})-\varepsilon ^{k}]\text { (since }\varepsilon ^{k}\rightarrow 0^{+}\text {)} \\&\ge ^{{(22)}}\lim \inf {\beta }_{-1}(\chi _{-1}^{k+1},x_{1}^{k+1})\ge {\beta }_{-1}(\chi _{-1},x_{1}). \end{aligned}$$

The first inequality derives from the upper semicontinuity of \({ \beta }_{-1};\) the last from (26). Thus, \(x_{-1}\in \arg \max {\beta }_{-1}(\cdot ,x_{1})\), and the proof is complete. \(\square \)

Under admittedly numerous assumptions, Theorem 5.1 establishes existence and learning of a Nash–Stackelberg equilibrium in an extensive game featuring just two players, each moving once, the leader ahead of the follower. It appears that several followers could be accommodated.

Admittedly, the modelling leaves several open ends. In particular, what sorts of approximations \({\beta }_{i}^{k}\) might be expedient? What learning scheme, if any, could justify particular response functions \(r^{k}\)? And when will these functions converge continuously? These questions go beyond the scope of this paper. Suffice it to say that, for finite-action games, fictitious play may offer insights (Brown 1951; Fudenberg and Levine 1994; Robinson 1951; Shapley 1964). For games with continuous action spaces, see the proximal point procedures in Caruso et al. (2018).
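To indicate how response functions might be learned rather than postulated, here is a hedged sketch of fictitious play (Brown 1951) in a symmetric 2×2 coordination game; the payoff matrix is an assumption chosen for illustration, not taken from the paper.

```python
import numpy as np

# Fictitious play in an assumed symmetric 2x2 coordination game: each
# player best-responds to the empirical frequency of the opponent's past
# actions, so anticipated responses are learned from history.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])                # payoff to own action (row) vs. opponent's (column)

counts = [np.ones(2), np.ones(2)]         # each player's counts of the opponent's actions
for _ in range(200):
    beliefs = [c / c.sum() for c in counts]
    a0 = int(np.argmax(A @ beliefs[0]))   # player 0's best reply to player 1's record
    a1 = int(np.argmax(A @ beliefs[1]))   # player 1's best reply to player 0's record
    counts[0][a1] += 1                    # player 0 observes player 1's action
    counts[1][a0] += 1                    # player 1 observes player 0's action

# empirical play concentrates on the payoff-dominant profile (action 0, action 0)
print(int(np.argmax(counts[0])), int(np.argmax(counts[1])))
```

In this assumed game the empirical frequencies settle on the payoff-dominant equilibrium; in general, of course, fictitious play need not converge in actions, which is one reason the continuous convergence (27) is a substantive hypothesis.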

Much criticism of economics and operations research centers on the paradigm of optimizing behavior (Scitovsky 1992). Both fields of inquiry depict agents who know what they are doing and do the best they can. This paper rather emphasizes that agents, within their circumstances, just contend with—but invariably seek—eventual improvements, if any. On that simpler premise, some equilibrium may eventually obtain.