1 Introduction

The fixed point problem is a well-established subject in the area of nonlinear analysis [3, 4, 7], which is usually formulated in the following form:

$$ (\mathcal {{P}}_{0}): \quad \mathbf {x}= {F}(\mathbf {x}), $$
(1)

where \({F}:\mathcal {X}_{a} \rightarrow \mathcal {X}_{a}\) is a nonlinear mapping and \(\mathcal {X}_{a} \) is a subset of a normed space \(\mathcal {X}\). Problem \((\mathcal {{P}}_{0})\) appears extensively in engineering and the sciences, for example, in equilibrium problems, mathematical economics, game theory, and numerical methods for nonlinear dynamical systems. A general form of the equilibrium problem was first considered by Nikaido and Isoda in 1955 as an auxiliary problem to establish existence results for Nash equilibrium points in non-cooperative games [4346]. Mathematically speaking, the nonlinear operator \(F(\mathbf {x})\) could be any arbitrarily given vector-valued function. Therefore, the formula \((\mathcal {{P}}_{0})\) for the fixed point problem is too abstract. Although it can be used to “model” a large class of mathematical problems, one must pay a price: it is impossible to develop a unified mathematical theory with powerful real-world applications. This dilemma is due to a gap between mathematics and physics. As indicated by V.I. Arnold [1]: “In the middle of the twentieth century it was attempted to divide physics and mathematics. The consequences turned out to be catastrophic.” Indeed, during the past sixty years extensive research on fixed point problems has mainly focused on this abstract form. It turns out that the majority of theories and methods for solving this nonlinear problem are based on linear iteration [33, 34, 47, 50]. This paper will provide a different approach. For simplicity’s sake, we assume that \(\mathcal {X}_{a}\) is a convex open set in \({\mathbb {R}}^{n}\) with a norm \(\Vert \mathbf {x}\Vert \) induced by the bilinear form \(\langle *, * \rangle :\mathcal {X}\times \mathcal {X}\rightarrow {\mathbb {R}}\).

Lemma 1

If F is a potential operator, i.e., there exists a real-valued function \({P}:\mathcal {X}_{a} \rightarrow {\mathbb {R}}\) such that \({F}(\mathbf {x}) = \nabla {P}(\mathbf {x})\), then \((\mathcal {{P}}_{0})\) is equivalent to the following stationary point problem:

$$ {\bar {\mathbf {x}}}= \arg \mathrm {sta}\biggl\{ \Pi (\mathbf {x})= {P}(\mathbf {x})- \frac{1}{2} \Vert \mathbf {x}\Vert ^{2} \bigm| \forall \mathbf {x}\in \mathcal {X}_{a} \biggr\} . $$
(2)

Otherwise, \((\mathcal {{P}}_{0})\) is equivalent to the following global minimization problem:

$$ {\bar {\mathbf {x}}}= \arg \min \biggl\{ \Pi (\mathbf {x})= \frac{1}{2} \bigl\Vert {F}(\mathbf {x}) - \mathbf {x}\bigr\Vert ^{2} \bigm| \forall \mathbf {x}\in \mathcal {X}_{a} \biggr\} . $$
(3)

Proof

First we assume that \({F}(\mathbf {x})\) is a potential operator, then x is a stationary point of \(\Pi (\mathbf {x})\) if and only if \(\nabla \Pi (\mathbf {x}) = \nabla {P}(\mathbf {x}) - \mathbf {x}= 0\), thus, x is also a solution to \((\mathcal {{P}}_{0})\) since \({F}(\mathbf {x}) = \nabla {P}(\mathbf {x})\).

Now we assume that \({F}(\mathbf {x})\) is not a potential operator. By the fact that \(\Pi (\mathbf {x}) = \frac{1}{2} \Vert {F}(\mathbf {x}) - \mathbf {x}\Vert ^{2} \ge 0 \ \forall \mathbf {x}\in \mathcal {X}\), the vector \({\bar {\mathbf {x}}}\) is a global minimizer of \(\Pi (\mathbf {x})\) if and only if \({F}({\bar {\mathbf {x}}}) - {\bar {\mathbf {x}}}= 0 \). Thus, \({\bar {\mathbf {x}}}\) must be a solution to \((\mathcal {{P}}_{0})\). □
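As a small numerical illustration of the second case of Lemma 1, the sketch below takes a hypothetical affine map on \({\mathbb {R}}^{2}\) whose Jacobian is not symmetric (so it cannot be the gradient of any potential) and minimizes \(\Pi (\mathbf {x}) = \frac{1}{2} \Vert F(\mathbf {x}) - \mathbf {x}\Vert ^{2}\) by plain gradient descent. The map, the step size, and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
# Hypothetical non-potential map F(x) = A x + b with A = [[0, 0.5], [-0.5, 0]].
# A is not symmetric, so F is not the gradient of a scalar function.
def F(x):
    x1, x2 = x
    return (0.5 * x2 + 1.0, -0.5 * x1 + 2.0)


def residual(x):
    """r(x) = F(x) - x; a fixed point is exactly a zero of r."""
    Fx = F(x)
    return (Fx[0] - x[0], Fx[1] - x[1])


def Pi(x):
    """Least-squares functional of problem (3): 0.5 * ||F(x) - x||^2."""
    r = residual(x)
    return 0.5 * (r[0] ** 2 + r[1] ** 2)


def grad_Pi(x):
    """Gradient (A - I)^T r(x), written out for the 2x2 matrix A above."""
    r = residual(x)
    return (-1.0 * r[0] - 0.5 * r[1], 0.5 * r[0] - 1.0 * r[1])


def minimize_residual(x0, step=0.5, iters=100):
    """Plain gradient descent on Pi (an assumed, deliberately simple solver)."""
    x = x0
    for _ in range(iters):
        g = grad_Pi(x)
        x = (x[0] - step * g[0], x[1] - step * g[1])
    return x


if __name__ == "__main__":
    x_bar = minimize_residual((0.0, 0.0))
    print("minimizer:", x_bar, "Pi:", Pi(x_bar), "F(x_bar):", F(x_bar))
```

For this particular map the global minimum value of \(\Pi \) is zero, and the minimizer coincides with the unique solution of \(\mathbf {x}= F(\mathbf {x})\), as the lemma states.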

By the facts that the global minimizer of an unconstrained optimization problem must be a stationary point and

$$ \frac{1}{2} \bigl\Vert {F}(\mathbf {x})-\mathbf {x}\bigr\Vert ^{2} = {P}(\mathbf {x}) - \frac{1}{2} \Vert \mathbf {x}\Vert ^{2}, \qquad {P}(\mathbf {x})= \frac{1}{2} \bigl\langle {F}(\mathbf {x}) , {F}(\mathbf {x}) \bigr\rangle - \bigl\langle \mathbf {x}, {F}(\mathbf {x}) \bigr\rangle + \Vert \mathbf {x}\Vert ^{2} , $$
(4)

the global minimization problem (3) is a special case of the stationary point problem (2). Mathematically speaking, if a fixed point problem has a trivial solution, then \({F}(\mathbf {x})\) must be a homogeneous operator, i.e., \({F}(0) = 0\). For general problems, \({F}(\mathbf {x})\) should have a nonhomogeneous term \(\mathbf {f}\in {\mathbb {R}}^{n}\). Thus, we can let

$$\begin{aligned} {P}(\mathbf {x})={W}(D\mathbf {x})- \langle \mathbf {x}, \mathbf {f}\rangle , \end{aligned}$$
(5)

where \(D: \mathcal {X}\rightarrow \mathcal {{W}}\subset {\mathbb {R}}^{m}\) is a linear operator, \({W}:\mathcal {{W}}\rightarrow {\mathbb {R}}\) is a so-called objective function. Objectivity is a basic concept in continuum physics [6, 41] and mathematical modeling [18, 19]. Its mathematical definition is given in Gao’s book (Definition 6.1.2 [11]).

Definition 1

(Objectivity)

Let \(\mathcal{R} \) be a proper orthogonal group, i.e., \(\mathbf {R}\in {\mathcal{R}} \) if and only if \(\mathbf {R}^{T} = \mathbf {R}^{-1} \), \(\det \mathbf {R}= 1\). A set \(\mathcal {{W}}_{a} \) is said to be objective if

$$ \mathbf {R}\mathbf {w}\in \mathcal {{W}}_{a} \quad \forall \mathbf {w}\in \mathcal {{W}}_{a}, \forall \mathbf {R}\in {\mathcal{R}}. $$

A real-valued function \({W}:\mathcal {{W}}_{a} \rightarrow {\mathbb {R}}\) is said to be objective if

$$ {W}(\mathbf {R}\mathbf {w}) = {W}(\mathbf {w}) \quad \forall \mathbf {w}\in \mathcal {{W}}_{a}, \forall \mathbf {R}\in {\mathcal{R}}. $$
(6)

Geometrically speaking, an objective function does not depend on rigid rotation of the system considered, but only on a certain measure of its variable. In the Euclidean space \(\mathcal {{W}}\subset {\mathbb {R}}^{m}\), the simplest objective function is the \(\ell _{2}\)-norm \(\Vert \mathbf {w}\Vert \) in \({\mathbb {R}}^{m}\) as we have \(\Vert {\mathbf{R}} \mathbf {w}\Vert ^{2} = \mathbf {w}^{T} {\mathbf{R}}^{T} {\mathbf{R}} \mathbf {w}= \Vert \mathbf {w}\Vert ^{2} \ \forall {\mathbf{R}} \in {\mathcal{R}}\). For general \(F(\mathbf {x})\), we can see from (4) that \(\frac{1}{2} \Vert {F}(\mathbf {x}) \Vert ^{2} \) and \(\frac{1}{2} \Vert \mathbf {x}\Vert ^{2}\) are objective functions. By the fact that \(\mathbf {x}= F(\mathbf {x})\), we know that \(\langle \mathbf {x}, F(\mathbf {x}) \rangle \) is also an objective function. Therefore, for a given fixed point problem, the corresponding \(\Pi (\mathbf {x})\) is naturally an objective function.
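Condition (6) can be spot-checked numerically. The sketch below works in \({\mathbb {R}}^{2}\), where every proper rotation is given by a single angle, and tests whether a function is unchanged under a few sample rotations. The helper names, sample points, and angles are assumptions made for illustration; passing such a test is only a necessary check on the sampled data, not a proof of objectivity.

```python
import math


def rotate(w, theta):
    """Apply the proper rotation R(theta) in SO(2) to the vector w."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * w[0] - s * w[1], s * w[0] + c * w[1])


def W_norm(w):
    """Quadratic function 0.5 * ||w||^2, which depends only on the norm of w."""
    return 0.5 * (w[0] ** 2 + w[1] ** 2)


def W_linear(w):
    """A linear function of w; it changes when w is rotated."""
    return w[0]


def is_objective(W, samples, angles, tol=1e-12):
    """Check W(R w) == W(w) on the given sample points and rotation angles."""
    return all(
        abs(W(rotate(w, th)) - W(w)) <= tol for w in samples for th in angles
    )


if __name__ == "__main__":
    samples = [(1.0, 2.0), (-0.5, 0.3)]
    angles = [0.1, 1.0, 2.5]
    print("0.5*||w||^2 passes:", is_objective(W_norm, samples, angles))
    print("linear w[0] passes:", is_objective(W_linear, samples, angles))
```

The quadratic function passes because it depends on \(\mathbf {w}\) only through \(\Vert \mathbf {w}\Vert \), whereas the linear function fails already on the first sample.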

Physically, an objective function is governed by the intrinsic physical law of the system, which does not depend on observers. Because of Noether’s theorem, the objective function \({W}(\mathbf {w})\) should be an SO(n)-invariant and this invariant is equivalent to a certain conservation law (see Sect. 6.1.2 [11]). Therefore, objectivity is essential for any real-world mathematical model. It was emphasized by P.G. Ciarlet that objectivity is not an assumption, but an axiom [6].

From the viewpoint of systems theory, if x represents the output (or the state, configuration, etc.) of the system, then the nonhomogeneous term f can be viewed as the input (or the control, applied force, etc.), which depends on each given problem. Correspondingly, the linear term \(\langle \mathbf {x}, \mathbf {f}\rangle \) in (5) can be called the subjective function [18, 19]. Let \(\mathcal {X}_{a} = \{ \mathbf {x}\in \mathcal {X}\mid D\mathbf {x}\in \mathcal {{W}}_{a} \}\). The fixed point problem \((\mathcal {{P}}_{0})\) can be reformulated into the following stationary point problem:

$$ (\mathcal{P} ): {\bar {\mathbf {x}}}= \arg \mathrm {sta}\biggl\{ \Pi (\mathbf {x})= {W}( D\mathbf {x})-\frac{1}{2} \Vert \mathbf {x}\Vert ^{2} - \langle \mathbf {x}, \mathbf {f}\rangle \bigm| \forall \mathbf {x}\in \mathcal {X}_{a} \biggr\} . $$
(7)

From the theory of nonconvex analysis, any nonconvex function can be written as a d.c. (difference of convex) function [35]. Therefore, the fixed point problem is actually equivalent to a d.c. programming problem. By the fact that \(\mathcal {X}\) and \(\mathcal {{W}}\) are two different spaces with different scales (dimensions), the problem \((\mathcal {{P}})\) can be used to study general problems in multi-scale complex systems.

For a potential operator, a fixed point is just a stationary point, which can be easily found by traditional linear iteration methods. For a non-potential operator, the fixed point must be a global minimizer. Due to the lack of a global optimality condition in the traditional theory of nonlinear optimization, solving a general nonconvex minimization problem is considered to be NP-hard in global optimization and computer science. However, this paper will show that many of these nonconvex problems can be solved in an elegant way.

2 Methods

According to the Brouwer fixed point theorem, we know that any continuous function from the closed unit ball in an n-dimensional Euclidean space to itself must have a fixed point. Generally speaking, for any given nontrivial input, a well-defined system should have at least one nontrivial response.

Definition 2

(Properly- and well-posed problems [18])

The problem \((\mathcal {{P}})\) is called properly posed if, for any given nontrivial input \(\mathbf {f}\neq 0 \), it has at least one nontrivial solution. It is called well-posed if the solution is unique.

Clearly, this definition is more general than Hadamard’s well-posed problems in dynamical systems since the continuity condition for the solution is not required. Physically speaking, any real-world problem should be well-posed since all natural phenomena exist uniquely. But practically, it is difficult to model a real-world problem precisely. Therefore, properly posed problems are allowed for the canonical duality theory. This definition is important for understanding challenging problems in complex systems.

Example 1

(Manufacturing/production systems)

In management science, the output is a vector \(\mathbf {x}\in {\mathbb {R}}^{n}\), which could represent the products of a manufacturing company. The input \(\mathbf {f}\in {\mathbb {R}}^{n}\) can be considered as market price (or demand). Therefore, the subjective function \(\langle \mathbf {x}, \mathbf {f}\rangle = \mathbf {x}^{T} \mathbf {f}\) in this example is the total income of the company. The products are produced by workers \(\mathbf {w}\in {\mathbb {R}}^{m}\). Due to the cooperation, we have \(\mathbf {w}= D\mathbf {x}\), where \(D\in {\mathbb {R}}^{m\times n}\) is a matrix. Workers are paid salary \(\boldsymbol {\sigma }= \partial {W}(\mathbf {w})\); therefore, the objective function \({W}(\mathbf {w})\) is the cost (in this example, W is not necessarily objective since the company is a man-made system). Let \(\frac{1}{2} {\alpha }\Vert \mathbf {x}\Vert ^{2}\) be the profit that the company must make, where \({\alpha }> 0\) is a parameter. Then \(\Pi (\mathbf {x}) = {W}(D\mathbf {x}) + \frac{1}{2} {\alpha }\Vert \mathbf {x}\Vert ^{2} - \mathbf {x}^{T} \mathbf {f}\) is the target, and the minimization problem \(\min \Pi (\mathbf {x})\) leads to the equilibrium equation

$$ {\alpha }\mathbf {x}= \mathbf {f}- D^{T} \partial _{\mathbf {w}} {W}(D\mathbf {x}) . $$

This is a fixed point problem. The cost function \({W}(\mathbf {w})\) could be convex for a small company, but it is usually nonconvex for big companies to allow some people to have the same salaries.
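To make the fixed point structure of Example 1 concrete, the following sketch assumes the simplest convex cost \({W}(\mathbf {w}) = \frac{1}{2} \Vert \mathbf {w}\Vert ^{2}\), so that \(\partial _{\mathbf {w}} W = \mathbf {w}\) and the equilibrium equation reads \({\alpha }\mathbf {x}= \mathbf {f}- D^{T} D \mathbf {x}\). It then iterates the map \(F(\mathbf {x}) = (\mathbf {f}- D^{T} D\mathbf {x})/{\alpha }\). The matrix D, the vector f, and the value of α are hypothetical data, chosen so that the iteration is a contraction.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def F(x, D, f, alpha):
    """Fixed point map x -> (f - D^T dW(Dx)) / alpha for W(w) = 0.5*||w||^2."""
    w = matvec(D, x)                      # w = D x
    sigma = w                             # sigma = dW/dw = w (assumed quadratic cost)
    back = matvec(transpose(D), sigma)    # D^T sigma
    return [(fi - bi) / alpha for fi, bi in zip(f, back)]


def fixed_point(D, f, alpha, iters=200):
    """Plain fixed point iteration x_{k+1} = F(x_k), started from zero."""
    x = [0.0] * len(f)
    for _ in range(iters):
        x = F(x, D, f, alpha)
    return x


if __name__ == "__main__":
    D = [[1.0, 0.0], [1.0, 1.0]]   # hypothetical 2x2 cooperation matrix
    f = [1.0, 2.0]                 # hypothetical price vector
    alpha = 4.0                    # large enough that D^T D / alpha is a contraction
    print("fixed point:", fixed_point(D, f, alpha))
```

With these data the iteration converges to the solution of the linear system \(({\alpha }\mathbf {I} + D^{T} D)\mathbf {x}= \mathbf {f}\). For a nonconvex cost W, this simple iteration carries no such guarantee.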

Example 2

(Lagrange mechanics)

In analytical mechanics, the configuration \(\mathbf {x}\in \mathcal {X}\subset \mathcal {C}^{1}[I; {\mathbb {R}}^{n}]\) is a continuous vector-valued function of time \(t\in I \subset {\mathbb {R}}\). Its components \(\{ { x}_{i} \}\) (\(i = 1, \dots , n\)) are known as the Lagrangian coordinates. The input \(\mathbf {f}(t)\) is a given force vector function in \({\mathbb {R}}^{n}\). Therefore, the subjective functional in this case is \(\langle \mathbf {x}, \mathbf {f}\rangle = \int _{I} \mathbf {x}(t) \cdot \mathbf {f}(t) {\,\mbox{d}t}\). The total action of the system is

$$ \int _{I} L(\mathbf {x}, \dot{ \mathbf {x}} ) {\,\mbox{d}t},\quad L= T(\dot{ \mathbf {x}} ) - V( \mathbf {x}), $$

where T is the kinetic energy density, V is the potential density, and \(L= T- V\) is the standard Lagrangian density. In this case, the linear operator \(D = \partial _{t}\) is the derivative with respect to time. Together, \(\Pi (\mathbf {x}) =\int _{I} [ T(\dot{ \mathbf {x}} ) - V(\mathbf {x}) - \mathbf {x}^{T} \mathbf {f}] {\,\mbox{d}t}\) is called the total action. For Newtonian mechanics, the kinetic energy is a quadratic (objective) function \(T(\mathbf {v}) = \frac{1}{2} m \Vert \mathbf {v}\Vert ^{2} \). Its stationary condition leads to the Euler–Lagrange equation:

$$ - m \ddot{\mathbf {x}}= \mathbf {f}+ \nabla V( \mathbf {x}) . $$
(8)

A finite difference method for solving this second-order differential equation leads to a fixed point problem [42]. It is well known that if the potential energy \(V(\mathbf {x})\) is convex, the operator \({F}= \mathbf {f}+ \nabla V( \mathbf {x}) \) is monotone and the problem \((\mathcal {{P}}_{0})\) has a stable fixed point solution. Correspondingly, the system has a stable trajectory. Otherwise, the system could have chaotic solutions. The relation between chaos in nonlinear dynamical systems and NP-hardness in computer science has been discovered recently [37].
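One way to see how a discretization of (8) produces a fixed point problem is sketched here, assuming a uniform grid \(t_{k} = t_{0} + kh\) with step \(h > 0\) and a central difference for \(\ddot{\mathbf {x}}\); the particular scheme used in [42] may differ. Writing \(\mathbf {x}_{k} \approx \mathbf {x}(t_{k})\) and \(\mathbf {f}_{k} = \mathbf {f}(t_{k})\), equation (8) becomes

$$ - \frac{m}{h^{2}} ( \mathbf {x}_{k+1} - 2 \mathbf {x}_{k} + \mathbf {x}_{k-1} ) = \mathbf {f}_{k} + \nabla V(\mathbf {x}_{k}) , $$

and solving for \(\mathbf {x}_{k}\) gives

$$ \mathbf {x}_{k} = \frac{1}{2} ( \mathbf {x}_{k+1} + \mathbf {x}_{k-1} ) + \frac{h^{2}}{2m} \bigl( \mathbf {f}_{k} + \nabla V(\mathbf {x}_{k}) \bigr) , $$

which has the form \(\mathbf {x}= F(\mathbf {x})\) for the stacked vector of grid values.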

Example 3

(Post-buckling of nonlinear Gao beam)

In large deformation solid mechanics, the correct nonlinear beam theory that can be used to model post-buckling phenomena was proposed by Gao in 1996 [8], which is governed by a fourth-order nonlinear differential equation:

$$ \chi _{xxxx} - \frac{3}{2} {\alpha }\chi _{x}^{2} \chi _{xx} + {\lambda }\chi _{xx} =q , $$
(9)

where \(\chi (x)\) is the deflection of the beam, which is a scalar-valued function over its domain \([0,L]\), where L is the beam length, \({\alpha }> 0\) is a material constant, the parameter λ depends on the axial force, and \(q(x)\) is a given distributed lateral load. Clearly, this nonlinear differential equation can be written as the following fixed point problem:

$$ \chi (x) = F \bigl(x, \chi (x) \bigr), \qquad F(x,\chi ) = \int _{0}^{x} \int _{0}^{t} \int _{0}^{s} {\alpha }\biggl( \frac{1}{2} \chi _{x}^{3} - {\lambda }\chi _{x} \biggr) \,\mbox{d}s \, \mbox{d}t\, \,\mbox{d}x+ f(x), $$
(10)

where the function \(f(x)\) depends on both the lateral load \(q(x)\) and boundary conditions. In this case, \(F(x,\chi (x))\) is a nonlinear integration operator. This fixed point problem is equivalent to the stationary point problem

$$ \chi = \arg \mathrm {sta}\biggl\{ \Pi (\chi ) = \int _{0}^{L} \biggl[ \frac{1}{2} \chi ^{2}_{xx} + \frac{1}{2} {\alpha }\biggl( \frac{1}{2} \chi ^{2}_{x} -{\lambda }\biggr) ^{2} - q(x) \chi \biggr] \,\mbox{d}x\bigm| \chi \in \mathcal {X}_{a} \biggr\} . $$
(11)

It was indicated in [13] that if \({\lambda }< {\lambda }_{c}\), the Euler buckling load defined by

$$ {\lambda }_{c} = \inf \frac{\int _{0}^{L} \chi ^{2}_{xx} \,\mbox{d}x}{{\alpha }\int _{0}^{L} \chi ^{2}_{x} \,\mbox{d}x}, $$

the total potential \(\Pi (\chi )\) is a convex functional, and problem (11) has only one fixed point. In this case, the beam is in a pre-buckling state. It was proved recently (see Lemma 2.1 and Theorem 2.1 in [40]) that there exists a constant \({\lambda }^{G}_{c} > {\lambda }_{c}\) such that if \({\lambda }> {\lambda }^{G}_{c}\), then \(\Pi (\chi )\) is nonconvex, i.e., the so-called double-well potential, and the beam is in a post-buckling state. In this case, problem (11) has three fixed points \(\chi _{i}(x)\), \(i=1,2,3\), at each \(x \in [0,L]\): one global minimizer of \(\Pi (\chi )\), which corresponds to a globally stable post-buckling state of the beam, one local minimizer, which corresponds to a locally stable post-buckling state, and one local maximizer of \(\Pi (\chi )\), which corresponds to an un-buckled state. The combination of these three solutions at each \(x \in [0,L]\) forms a solution set with \(3^{\infty }\) strong solutions on \([0,L]\) to the nonlinear differential equation (9). It was proved in [25] that for certain lateral load distributions \(q(x)\), both the global and local minimum solutions could be nonsmooth and cannot be captured by any Newton-type method. Numerical approaches to this nonlinear differential equation are considered to be NP-hard by traditional theories and methods. In order to solve this challenging nonconvex stationary problem, a canonical dual finite element method has been developed recently [2]. The numerical results show that the locally stable post-buckling configuration is extremely sensitive to the external load \(q(x)\) and to the numerical precision used in the program.

For unilateral post-buckling problems, the feasible set \(\mathcal {X}_{a}\) usually has inequality constraints. For example, for a simply supported beam on a rigid foundation subjected to a downward lateral load \(q(x) \ \forall x\in [0,L]\), this feasible set is a convex cone:

$$ \mathcal {X}_{a} = \bigl\{ \chi (x) \in C^{2}[0,L]\mid \chi (x) \ge 0 \ \forall x\in [0, L], \chi (0)=\chi (L) = 0, \chi _{xx}(0)=\chi _{xx}(L)=0\bigr\} . $$

Due to the inequality constraint in \(\mathcal {X}_{a}\), the stationary condition of problem (11) leads not only to the so-called variational inequality [38]

$$ \int _{0}^{L} (\chi - \bar{\chi } ) \delta \Pi ( \bar{\chi }) \,\mbox{d}x\ge 0 \quad \forall \chi (x) \in \mathcal {X}_{a}, $$
(12)

where \(\delta \Pi (\chi ) = \chi _{xxxx} - \frac{3}{2} {\alpha }\chi _{x}^{2} \chi _{xx} + {\lambda }\chi _{xx} - q\) is the Gâteaux derivative of \(\Pi (\chi )\), but also to the well-known complementarity condition

$$ \biggl( \chi _{xxxx} - \frac{3}{2} {\alpha }\chi _{x}^{2} \chi _{xx} + {\lambda }\chi _{xx} - q \biggr) \chi (x) = 0\quad \forall x \in [0,L]. $$
(13)

Since the contact region (i.e., on which \(\chi (x) = 0\)) remains unknown till the problem is solved, problem (11) is the combination of the nonlinear free-boundary value problem, non-monotone variational inequality, and the nonconvex variational analysis. This problem could be one of the most challenging problems in nonconvex analysis, which deserves serious study in the future.

Canonical duality-triality is a methodological theory which can be used not only for modeling complex systems within a unified framework, but also for solving real-world problems with a unified methodology. This theory was developed originally from Gao and Strang’s work for solving the following nonsmooth/nonconvex variational problem [30]:

$$ \inf \bigl\{ {P}(\mathbf {u}) = {W}(D \mathbf {u}) - U(\mathbf {u}) \mid \mathbf {u}\in \mathcal {U}_{a} \bigr\} , $$
(14)

where the variational argument \(\mathbf {u}(\mathbf {x})\) is a deformation field, D is a differential operator such that the deformation gradient \(\mathbf {w}=D \mathbf {u}\) is a two-point tensor field, \({W}(\mathbf {w})\) is an internal (or free) energy which must be an objective function of w [6, 41], while \(U(\mathbf {u})\) is an external energy which must be a linear functional, i.e., \(U(\mathbf {u}) = \langle \mathbf {u}, \mathbf {f}\rangle \) such that \(\partial U(\mathbf {u}) = \mathbf {f}\) is a given external force field. Thus, the difference \({P}(\mathbf {u}) \) is the well-known total potential energy in nonlinear elasticity. This variational problem (14) covers the most challenging problems in nonconvex analysis and nonlinear partial differential equations. By the objectivity of the free energy \({W}(\mathbf {w})\), there must exist an objective tensor \(\mathbf {c}= \mathbf {w}^{T} \mathbf {w}\) and a real-valued function \(V(\mathbf {c})\) such that \({W}(\mathbf {w}) = V(\mathbf {c}(\mathbf {w}))\) [6]. In finite deformation theory and differential geometry, this objective measure \(\mathbf {c}= \mathbf {c}^{T}\) is the well-known Cauchy–Riemann strain tensor and \(V(\mathbf {c})\) is usually a canonical function, i.e., the duality relation \(\mathbf {c}^{*} = \partial V(\mathbf {c})\) is a bijection (say the St. Venant–Kirchhoff material [23]). These basic truths in nonlinear analysis lay a foundation for the canonical duality theory. This is the reason why this theory can be used to solve analytically a large class of nonconvex variational problems and their associated partial differential equations, including Einstein’s special relativity equation [12], Kantorovich’s optimal mass transfer problem [39], chaotic dynamics [37, 42], global optimization [16, 28], phase transitions in solids [32], post-buckling of large deformed beam [2], nonlinear PDEs in 3-dimensional finite deformation theory [9, 10, 20, 23], etc.
For those problems that cannot be solved analytically, numerical discretization (such as the finite element method) can always be used so that the general nonconvex variational problem (14) can be approximately reformulated as a global optimization problem in \({\mathbb {R}}^{n}\). By the fact that the discretized \(W(\mathbf {w})\) may not be an objective function, the canonical duality theory has been generalized for solving general nonconvex and discrete optimization problems [5, 15, 17, 26, 27, 29, 36, 49] as well as the most challenging bi-level knapsack problems and topology optimization in multi-scale complex systems [21, 22].

However, the well-defined objectivity in nonlinear analysis and physics has been seriously misused in optimization and mathematical programming, where the so-called objective function is allowed to be any arbitrarily given function. As a consequence, Gao–Strang’s work has been mistakenly challenged by M.D. Voisei and C. Zalinescu [48]. By oppositely choosing linear functions as the objective function W (see Example 3.1 in [48]) and nonlinear functions as the external energy \(U(\mathbf {u})\) (see Examples 3.2, 3.4, 3.5, and 3.6 in [48]), they produced a series of “counter examples” that led to absurd conclusions including “The hope for reading an optimization theory with diverse applications is ruined by the manner in which [30] is written and the fact that the majority of the results in [30] are false.” These conceptual mistakes verified Arnold’s declaration [1]: “A teacher of mathematics, who has not got to grips with at least some of the volumes of the course by Landau and Lifshitz, will then become a relict like the one nowadays who does not know the difference between an open and a closed set.” A comprehensive review on the canonical duality theory and breakthrough from the recent challenges are given in [24].

The goal of this paper is to apply the canonical duality theory for solving the challenging fixed point problem. The rest of this paper is arranged as follows. Based on the concept of objectivity, the canonical dual for the fixed point problem, its analytical solution, and global optimality condition are presented in the next section. Applications to a general fixed point problem with sum of exponential functions and nonconvex polynomial are discussed in Sect. 4.1. Analytical solutions for a general fixed point problem with a sum of logarithmic and quadratic functions are given in Sect. 4.2. The paper ends with conclusions and future work.

3 Results and discussion

According to the canonical duality, the linear measure \(\epsilon = D \mathbf {x}\) cannot be used directly for studying duality relation due to the objectivity. Also, the linear operator cannot change the nonconvexity of \(W(D\mathbf {x})\). We first introduce the canonical transformation.

Definition 3

(Canonical function and canonical transformation)

A real-valued function \(V:\mathcal {E}_{a} \rightarrow {\mathbb {R}}\) is called canonical if the duality mapping \(\partial V: \mathcal {E}_{a} \rightarrow \mathcal {E}_{a}^{*}\) is one-to-one and onto.

For a given nonconvex function \(W:\mathcal {{W}}_{a} \rightarrow {\mathbb {R}}\), if there exist a geometrically admissible mapping \({\Lambda }:\mathcal {{W}}_{a} \rightarrow \mathcal {E}_{a}\) and a canonical function \(V:\mathcal {E}_{a} \rightarrow {\mathbb {R}}\) such that

$$ W(\boldsymbol {\epsilon }) = V\bigl( {\Lambda }(\boldsymbol {\epsilon }) \bigr), $$
(15)

then transformation (15) is called the canonical transformation and \(\boldsymbol {\xi }= {\Lambda }(\boldsymbol {\epsilon }) \) is called the canonical measure.

By this definition, the one-to-one duality relation \(\boldsymbol {\varsigma }= \partial V(\boldsymbol {\xi }) : \mathcal {E}_{a} \rightarrow \mathcal {E}^{*}_{a}\) implies that the canonical function \(V(\boldsymbol {\xi })\) is differentiable and its conjugate function \(V^{*}:\mathcal {E}^{*}_{a} \rightarrow {\mathbb {R}}\) can be uniquely defined by the Legendre transformation [11]

$$ V^{*}(\boldsymbol {\varsigma }) = \bigl\{ \langle \boldsymbol {\xi }; \boldsymbol {\varsigma }\rangle - V(\boldsymbol {\xi }) \mid \boldsymbol {\varsigma }= \partial V(\boldsymbol {\xi }) \bigr\} , $$
(16)

where \(\langle \boldsymbol {\xi }; \boldsymbol {\varsigma }\rangle \) represents the bilinear form on \(\mathcal {E}\) and its dual space \(\mathcal {E}^{*}\). In this case, \(V:\mathcal {E}_{a} \rightarrow {\mathbb {R}}\) is a canonical function if and only if the following canonical duality relations hold on \(\mathcal {E}_{a} \times \mathcal {E}^{*}_{a}\):

$$ \boldsymbol {\varsigma }= \partial V(\boldsymbol {\xi }) \quad \Leftrightarrow \quad \boldsymbol {\xi }= \partial V^{*}(\boldsymbol {\varsigma }) \quad \Leftrightarrow \quad V(\boldsymbol {\xi }) + V^{*}(\boldsymbol {\varsigma }) = \langle \boldsymbol {\xi }; \boldsymbol {\varsigma }\rangle . $$
(17)
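As a worked instance of (16) and (17), consider the scalar quadratic function \(V(\xi ) = \frac{1}{2} {\alpha }(\xi - {\lambda })^{2}\) with \({\alpha }> 0\); this particular choice is an illustrative assumption, suggested by the double-well term in (11). The duality mapping \(\varsigma = \partial V(\xi ) = {\alpha }(\xi - {\lambda })\) is a bijection on \({\mathbb {R}}\) with inverse \(\xi = \varsigma /{\alpha }+ {\lambda }\), so V is canonical and the Legendre transformation gives

$$ V^{*}(\varsigma ) = \xi \varsigma - V(\xi ) = \biggl( \frac{\varsigma }{{\alpha }} + {\lambda }\biggr) \varsigma - \frac{1}{2{\alpha }} \varsigma ^{2} = \frac{1}{2 {\alpha }} \varsigma ^{2} + {\lambda }\varsigma , \qquad \partial V^{*}(\varsigma ) = \frac{\varsigma }{{\alpha }} + {\lambda }= \xi . $$

Substituting \(\varsigma = {\alpha }(\xi - {\lambda })\) also confirms the third relation in (17):

$$ V(\xi ) + V^{*}(\varsigma ) = {\alpha }(\xi - {\lambda })^{2} + {\lambda }{\alpha }(\xi - {\lambda }) = {\alpha }(\xi - {\lambda }) \xi = \xi \varsigma . $$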

Let \(Q(\mathbf {x})=\frac{1}{2}\Vert \mathbf {x}\Vert ^{2} + \langle \mathbf {x}, \mathbf {f}\rangle \). Replacing \(V(\Lambda (\mathbf {x})) \) in the target function \(\Pi (\mathbf {x})\) by the Fenchel–Young equality \(V(\boldsymbol {\xi }) = \langle \boldsymbol {\xi }; \boldsymbol {\varsigma }\rangle - V^{*}(\boldsymbol {\varsigma })\), the Gao-Strang total complementary function (see [14]) \(\Xi : \mathcal {X}_{a} \times \mathcal {E}^{*}_{a} \rightarrow \mathbb{R}\) can be defined by

$$ \Xi (\mathbf {x}, \boldsymbol {\varsigma }) = \bigl\langle \Lambda (\mathbf {x}) ; \boldsymbol {\varsigma }\bigr\rangle - V^{*}( \boldsymbol {\varsigma }) - Q(\mathbf {x}). $$
(18)

By this total complementary function, the canonical dual of \(\Pi (\mathbf {x})\) can be obtained as

$$ \Pi ^{d}(\boldsymbol {\varsigma }) = \inf \bigl\{ \Xi (\mathbf {x}, \boldsymbol {\varsigma }) \mid \mathbf {x}\in \mathcal {X}\bigr\} = Q^{\Lambda }(\boldsymbol {\varsigma }) - V^{*}(\boldsymbol {\varsigma }), $$
(19)

where \(Q^{\Lambda }:\mathcal {E}^{*}_{a}\rightarrow \mathbb{R}\cup \{ - \infty \}\) is the so-called Λ-conjugate of \(Q(\mathbf {x})\) defined by (see [14])

$$ Q^{\Lambda }(\boldsymbol {\varsigma }) = \mathrm {sta}\bigl\{ \bigl\langle \Lambda (\mathbf {x}) ; \boldsymbol {\varsigma }\bigr\rangle - Q(\mathbf {x})\mid \mathbf {x}\in \mathcal {X}\bigr\} . $$
(20)

Let \(\mathcal {S}_{a} \subset \mathcal {E}^{*}_{a}\) be an admissible set on which \(Q^{{\Lambda }}(\boldsymbol {\varsigma })\) is well-defined. If \({\Lambda }(\mathbf {x})\) is a homogeneous quadratic operator, i.e., \({\Lambda }({\alpha }\mathbf {x}) = {\alpha }^{2} {\Lambda }(\mathbf {x})\), then the total complementary function takes the form

$$ \Xi (\mathbf {x}, \boldsymbol {\varsigma }) = \frac{1}{2} \bigl\langle \mathbf {x}, \mathbf {G}(\boldsymbol {\varsigma }) \mathbf {x}\bigr\rangle - V^{*}(\boldsymbol {\varsigma }) - \langle \mathbf {x}, \mathbf {f}\rangle , $$
(21)

where \(\mathbf {G}(\boldsymbol {\varsigma }) = \mathbf {H}(\boldsymbol {\varsigma }) - \mathbf {I}\), \(\mathbf {H}(\boldsymbol {\varsigma }) = \nabla ^{2}_{\mathbf {x}} \langle {\Lambda }(\mathbf {x}); \boldsymbol {\varsigma }\rangle \), and I is an identity matrix in \(\mathcal {X}\). In this case, the Λ-conjugate \(Q^{{\Lambda }}\) is simply defined by

$$ Q^{\Lambda }(\boldsymbol {\varsigma }) = -\frac{1}{2} \bigl\langle \mathbf {G}^{-1} ( \boldsymbol {\varsigma }) \mathbf {f}, \mathbf {f}\bigr\rangle , $$
(22)

and \(\mathcal {S}_{a} = \{ \boldsymbol {\varsigma }\in \mathcal {E}^{*}_{a}\mid \det \mathbf {G}(\boldsymbol {\varsigma }) \neq 0 \}\). Thus, the canonical dual problem \((\mathcal {{P}}^{d})\) can be proposed in the following:

$$ \bigl(\mathcal {{P}}^{d} \bigr): \quad \bar {\boldsymbol {\varsigma }}= \arg \mathrm {sta}\bigl\{ \Pi ^{d}(\boldsymbol {\varsigma }) \mid \boldsymbol {\varsigma }\in \mathcal {S}_{a} \bigr\} . $$
(23)
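Formula (22) can be checked directly from definition (20). The short derivation below assumes that \(\mathbf {G}(\boldsymbol {\varsigma })\) is symmetric (it is built from a Hessian) and invertible. For a homogeneous quadratic \({\Lambda }\), we have \(\langle {\Lambda }(\mathbf {x}); \boldsymbol {\varsigma }\rangle - Q(\mathbf {x}) = \frac{1}{2} \langle \mathbf {x}, \mathbf {G}(\boldsymbol {\varsigma }) \mathbf {x}\rangle - \langle \mathbf {x}, \mathbf {f}\rangle \), whose stationary condition in \(\mathbf {x}\) is \(\mathbf {G}(\boldsymbol {\varsigma }) \mathbf {x}= \mathbf {f}\). Substituting \(\mathbf {x}= \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}\) gives

$$ Q^{\Lambda }(\boldsymbol {\varsigma }) = \frac{1}{2} \bigl\langle \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}, \mathbf {f}\bigr\rangle - \bigl\langle \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}, \mathbf {f}\bigr\rangle = -\frac{1}{2} \bigl\langle \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}, \mathbf {f}\bigr\rangle . $$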

By the canonical duality theory, we have the following results.

Theorem 1

(Analytic solution and complementary-dual principle)

For a given f, if \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}_{a}\) is a solution to \((\mathcal {{P}}^{d})\), then

$$ {\bar {\mathbf {x}}}= \mathbf {G}^{-1}(\bar {\boldsymbol {\varsigma }})\mathbf {f}$$
(24)

is a solution to the problem \((\mathcal {{P}})\) and

$$ \Pi ({\bar {\mathbf {x}}})=\Xi ({\bar {\mathbf {x}}},\bar {\boldsymbol {\varsigma }})=\Pi ^{d}( \bar {\boldsymbol {\varsigma }}). $$
(25)

If \(F(\mathbf {x})\) is a potential operator, then \({\bar {\mathbf {x}}}\) is also a solution to the fixed point problem \((\mathcal {{P}}_{0})\). If \(F(\mathbf {x})\) is a non-potential operator, then \({\bar {\mathbf {x}}}\) is a solution to the fixed point problem \((\mathcal {{P}}_{0})\) only if \({\bar {\mathbf {x}}}\) is a global minimizer of \(\Pi (\mathbf {x})\).

Proof

By the canonical duality theory we know that \(({\bar {\mathbf {x}}}, \bar {\boldsymbol {\varsigma }})\) is a critical point of \(\Xi (\mathbf {x}, \boldsymbol {\varsigma })\) if and only if \({\bar {\mathbf {x}}}\) is a critical point of \(\Pi (\mathbf {x})\) and \(\bar {\boldsymbol {\varsigma }}\) is a critical point of \(\Pi ^{d}(\boldsymbol {\varsigma })\). It is easy to prove that the criticality condition \(\nabla \Xi ({\bar {\mathbf {x}}}, \bar {\boldsymbol {\varsigma }}) = 0\) leads to the following canonical equations:

$$ \mathbf {G}(\bar {\boldsymbol {\varsigma }}) {\bar {\mathbf {x}}}= \mathbf {f},\qquad {\Lambda }({\bar {\mathbf {x}}}) = \partial V^{*}(\bar {\boldsymbol {\varsigma }}). $$
(26)

The first equation is the canonical equilibrium equation, which leads to the analytical solution (24). By the canonical duality, the canonical duality equation \({\Lambda }({\bar {\mathbf {x}}}) = \partial V^{*}(\bar {\boldsymbol {\varsigma }})\) leads to the complementary-duality relation (25). The theorem is proved by Lemma 1. □
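Theorem 1 can be illustrated on a one-dimensional double-well problem. The sketch below assumes \(D = 1\), \({\Lambda }(x) = \frac{1}{2} x^{2}\), and \(V(\xi ) = \frac{1}{2} {\alpha }(\xi - {\lambda })^{2}\), so that \(\Pi (x) = \frac{1}{2} {\alpha }(\frac{1}{2} x^{2} - {\lambda })^{2} - \frac{1}{2} x^{2} - f x\), \(G(\varsigma ) = \varsigma - 1\), and \(\Pi ^{d}(\varsigma ) = - \frac{f^{2}}{2(\varsigma - 1)} - \frac{\varsigma ^{2}}{2{\alpha }} - {\lambda }\varsigma \). The parameter values, the bisection brackets (which are specific to these values), and all function names are assumptions made for illustration only.

```python
ALPHA, LAM, F_IN = 1.0, 1.0, 0.5   # illustrative parameters (assumed)


def Pi(x):
    """Primal function: double-well term minus 0.5*x^2 minus f*x."""
    return 0.5 * ALPHA * (0.5 * x * x - LAM) ** 2 - 0.5 * x * x - F_IN * x


def dPi(x):
    """Derivative of Pi; its zeros are the primal stationary points."""
    return ALPHA * (0.5 * x * x - LAM) * x - x - F_IN


def Pi_d(s):
    """Canonical dual function for G(s) = s - 1 and V*(s) = s^2/(2 alpha) + lam*s."""
    return -F_IN ** 2 / (2.0 * (s - 1.0)) - s * s / (2.0 * ALPHA) - LAM * s


def dPi_d(s):
    """Derivative of Pi_d; its zeros are the dual stationary points."""
    return F_IN ** 2 / (2.0 * (s - 1.0) ** 2) - s / ALPHA - LAM


def bisect(g, lo, hi, iters=200):
    """Bisection for a root of g on [lo, hi], assuming a sign change."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Brackets valid for the parameters above; G(s) = s - 1 is singular at s = 1.
BRACKETS = [(-1.0, -1.0 / 3.0), (-1.0 / 3.0, 1.0 - 1e-9), (1.0 + 1e-9, 10.0)]


def dual_solutions():
    """Return pairs (s_bar, x_bar) with x_bar = G(s_bar)^{-1} f as in (24)."""
    pairs = []
    for lo, hi in BRACKETS:
        s_bar = bisect(dPi_d, lo, hi)
        pairs.append((s_bar, F_IN / (s_bar - 1.0)))
    return pairs


if __name__ == "__main__":
    for s_bar, x_bar in dual_solutions():
        print(s_bar, x_bar, dPi(x_bar), Pi(x_bar) - Pi_d(s_bar))
```

For each of the three dual stationary points, the recovered \(\bar{x}\) makes the primal derivative vanish and the gap \(\Pi (\bar{x}) - \Pi ^{d}(\bar{\varsigma })\) is zero up to rounding, in line with (24) and (25).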

Theorem 1 shows that the solution to the fixed point problem depends analytically on the canonical dual solution, and there is no duality gap between the primal problem (\(\mathcal {{P}}\)) and the canonical dual problem (\(\mathcal {{P}}^{d}\)). By the fact that the problem \((\mathcal {{P}})\) may have many fixed points, in order to identify the extremality of these fixed points, we assume that the canonical function \(V:\mathcal {E}_{a} \rightarrow {\mathbb {R}}\) is convex and introduce two open sets:

$$\begin{aligned}& \mathcal {S}_{a}^{+} = \bigl\{ \boldsymbol {\varsigma }\in \mathcal {S}_{a} \mid \mathbf {G}(\boldsymbol {\varsigma })\succ 0 \bigr\} , \\& \mathcal {S}_{a}^{-} = \bigl\{ \boldsymbol {\varsigma }\in \mathcal {S}_{a} \mid \mathbf {G}(\boldsymbol {\varsigma })\prec 0 \bigr\} , \end{aligned}$$

where \(\mathbf {G}(\boldsymbol {\varsigma })\succ 0\) means that G is a positive definite matrix, and \(\mathbf {G}(\boldsymbol {\varsigma }) \prec 0\) means that G is a negative definite matrix. Also according to the terminology used in the canonical duality theory, a neighborhood of a critical point is an open set containing only one critical point.

Theorem 2

(Triality theorem)

Suppose that \(\bar {\boldsymbol {\varsigma }}\) is a solution to \((\mathcal {{P}}^{d})\) and \({\bar {\mathbf {x}}}=\mathbf {G}^{-1}(\bar {\boldsymbol {\varsigma }})\mathbf {f}\). If \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}_{a}^{+}\), then \({\bar {\mathbf {x}}}\) is a globally stable fixed point and

$$\begin{aligned} \Pi (\bar{\mathbf {x}}) =\min_{\mathbf {x}\in \mathcal {X}_{a} } \Pi (\mathbf {x})=\max _{\boldsymbol {\varsigma }\in \mathcal {S}_{a}^{+}} \Pi ^{d}(\boldsymbol {\varsigma })=\Pi ^{d}(\bar{ \boldsymbol {\varsigma }}). \end{aligned}$$
(27)

If \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}_{a}^{-}\), then \({\bar {\mathbf {x}}}\) is a local maximizer of \(\Pi (\mathbf {x})\) iff \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}_{a}^{-}\) is a local maximizer of \(\Pi ^{d}\) and on the neighborhood \(\mathcal {X}_{o} \times \mathcal {S}_{o} \subset \mathcal {X}_{a} \times \mathcal {S}_{a}^{-} \) of \(({\bar {\mathbf {x}}},\bar {\boldsymbol {\varsigma }})\), we have

$$ \Pi ({\bar {\mathbf {x}}}) =\max_{\mathbf {x}\in \mathcal {X}_{o}} \Pi (\mathbf {x})=\max _{\boldsymbol {\varsigma }\in \mathcal {S}_{o}} \Pi ^{d}(\boldsymbol {\varsigma })=\Pi ^{d}( \bar {\boldsymbol {\varsigma }}). $$
(28)

Moreover, \({\bar {\mathbf {x}}}\) is a locally unstable fixed point of F if F is a potential operator.

If \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}_{a}^{-}\) and \(\dim \mathcal {X}_{a} = \dim \mathcal {S}_{a}\), then \({\bar {\mathbf {x}}}\) is a local minimizer iff \(\bar {\boldsymbol {\varsigma }}\in \mathcal {S}^{-}_{a}\) is a local minimizer of \(\Pi ^{d}(\boldsymbol {\varsigma })\) and on the neighborhood \(\mathcal {X}_{o} \times \mathcal {S}_{o} \subset \mathcal {X}_{a} \times \mathcal {S}_{a}^{-} \),

$$ \Pi ({\bar {\mathbf {x}}}) =\min_{\mathbf {x}\in \mathcal {X}_{o}} \Pi (\mathbf {x})=\min _{\boldsymbol {\varsigma }\in \mathcal {S}_{o}} \Pi ^{d}(\boldsymbol {\varsigma })=\Pi ^{d}( \bar {\boldsymbol {\varsigma }}) . $$
(29)

Moreover, \({\bar {\mathbf {x}}}\) is a locally stable fixed point of F if F is a potential operator.

This theorem is an application of the triality theory. A detailed proof can be found in [31]. Statement (27) is the so-called canonical min-max duality. This statement shows that the globally stable fixed point problem is equivalent to a concave maximization problem

$$ \bigl(\mathcal {{P}}^{\sharp } \bigr): \quad \max \bigl\{ \Pi ^{d}( \boldsymbol {\varsigma }) \mid \boldsymbol {\varsigma }\in \mathcal {S}^{+}_{a} \bigr\} . $$
(30)

Since the feasible space \(\mathcal {S}^{+}_{a}\) is an open convex set, this canonical dual problem can be solved easily by well-developed nonlinear optimization techniques. The second statement (28) is the canonical double-max duality and the third statement (29) is the canonical double-min duality. For a potential operator F, these two statements can be used to identify locally unstable and stable fixed points, respectively.

4 Applications

4.1 Exponential and polynomial functions

As our first application, the objective function is assumed to be

$$\begin{aligned} {W}(D\mathbf {x})=\alpha \exp \biggl(\frac{1}{2} \Vert D_{1} \mathbf {x}\Vert ^{2} \biggr) +\frac{1}{2} \beta \biggl( \frac{1}{2} \Vert D_{2} \mathbf {x}\Vert ^{2} -\lambda \biggr)^{2}, \end{aligned}$$

where \(D_{1} \in \mathbb{R}^{m\times n}\) and \(D_{2} \in \mathbb{R}^{p\times n}\) are two given matrices and α, β, λ are real numbers. Clearly, for a given \({\lambda }> 0\), \({W}(D\mathbf {x})\) is nonconvex and

$$\begin{aligned} F(\mathbf {x})=\nabla P(\mathbf {x})=\alpha \exp \biggl(\frac{1}{2} \Vert D_{1} \mathbf {x}\Vert ^{2} \biggr) \bigl(D_{1}^{T} D_{1} \bigr)\mathbf {x}+\beta \biggl(\frac{1}{2} \Vert D_{2} \mathbf {x}\Vert ^{2} -\lambda \biggr) \bigl(D_{2}^{T}D_{2} \bigr)\mathbf {x}-\mathbf {f} \end{aligned}$$

is a non-monotone operator. In this case, the fixed point problem \((\mathcal {{P}}_{0})\) can be equivalently written as

$$\begin{aligned} {\bar {\mathbf {x}}}= \arg \mathop{\mathrm {sta}}_{\mathbf {x}\in {\mathbb {R}}^{n} } \biggl\{ \Pi (\mathbf {x})=\alpha \exp \biggl(\frac{1}{2} \Vert D_{1} \mathbf {x}\Vert ^{2} \biggr) +\frac{1}{2} \beta \biggl(\frac{1}{2} \Vert D_{2} \mathbf {x}\Vert ^{2} -\lambda \biggr)^{2}-\frac{1}{2} \Vert \mathbf {x}\Vert ^{2}-\mathbf {x}^{T}\mathbf {f}\biggr\} . \end{aligned}$$

Clearly, this nonlinear fixed point problem in \({\mathbb {R}}^{n}\) is difficult to solve by traditional methods. However, by the canonical duality theory, it can be reduced to a canonical dual problem in \({\mathbb {R}}^{2}\), which is much easier to solve.

The canonical measure in this problem can be given as

$$\begin{aligned} \boldsymbol {\xi }= {\Lambda }(\mathbf {x})= \left ( \textstyle\begin{array}{@{}c@{}} \xi _{1} \\ \xi _{2} \end{array}\displaystyle \right ) =\left ( \textstyle\begin{array}{@{}c@{}} \frac{1}{2} \Vert D_{1}\mathbf {x}\Vert ^{2}\\ \frac{1}{2} \Vert D_{2}\mathbf {x}\Vert ^{2} \end{array}\displaystyle \right ) : \quad {\mathbb {R}}^{n} \rightarrow \mathcal {E}_{a} = \bigl\{ \boldsymbol {\xi }\in {\mathbb {R}}^{2}\mid \xi _{1}, \xi _{2} \ge 0 \bigr\} . \end{aligned}$$

Correspondingly, the canonical function is

$$\begin{aligned} V(\boldsymbol {\xi }) = \left ( \textstyle\begin{array}{@{}c@{}} V_{1}(\xi _{1}) \\ V_{2}(\xi _{2}) \end{array}\displaystyle \right ) = \left ( \textstyle\begin{array}{@{}c@{}} \alpha \exp (\xi _{1}) \\ \frac{1}{2} \beta (\xi _{2}-\lambda )^{2} \end{array}\displaystyle \right ) , \end{aligned}$$

and the canonical dual variable is

$$\begin{aligned} \boldsymbol {\varsigma }= \left ( \textstyle\begin{array}{@{}c@{}} \varsigma _{1} \\ \varsigma _{2} \end{array}\displaystyle \right ) =\left ( \textstyle\begin{array}{@{}c@{}} \nabla V_{1}(\xi _{1})\\ \nabla V_{2}(\xi _{2}) \end{array}\displaystyle \right ) =\left ( \textstyle\begin{array}{@{}c@{}} \alpha \exp (\xi _{1})\\ \beta (\xi _{2}-\lambda ) \end{array}\displaystyle \right ) :\quad \mathcal {E}_{a} \rightarrow \mathcal {E}^{*}_{a} = \bigl\{ \boldsymbol {\varsigma }\in {\mathbb {R}}^{2} \mid \varsigma _{1} \ge \alpha , \varsigma _{2} \ge-\lambda\beta\bigr\} . \end{aligned}$$

By the Legendre transformation, the conjugate function \(V^{*}(\boldsymbol {\varsigma })\) is uniquely defined as

$$\begin{aligned} V^{*}(\boldsymbol {\varsigma }) =\left ( \textstyle\begin{array}{@{}c@{}} V_{1}^{*}(\varsigma _{1})\\ V_{2}^{*}(\varsigma _{2}) \end{array}\displaystyle \right ) = \left ( \textstyle\begin{array}{@{}c@{}} (\ln (\varsigma _{1}/\alpha )-1) \varsigma _{1} \\ \frac{1}{2 \beta } \varsigma _{2}^{2} + \lambda \varsigma _{2} \end{array}\displaystyle \right ) . \end{aligned}$$
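Both conjugates can be checked component by component. The short derivation below is only a sketch; it assumes \(\alpha > 0\) and \(\beta > 0\), so that each \(V_{i}\) is strictly convex and the duality relation \(\varsigma _{i} = \nabla V_{i}(\xi _{i})\) can be inverted for \(\xi _{i}\):

$$\begin{aligned} \varsigma _{1} = \alpha \exp (\xi _{1}) \quad &\Longrightarrow \quad \xi _{1} = \ln (\varsigma _{1}/\alpha ), \qquad V_{1}^{*}(\varsigma _{1}) = \xi _{1} \varsigma _{1} - \alpha \exp (\xi _{1}) = \bigl(\ln (\varsigma _{1}/\alpha ) - 1 \bigr) \varsigma _{1}, \\ \varsigma _{2} = \beta (\xi _{2} - \lambda ) \quad &\Longrightarrow \quad \xi _{2} = \frac{\varsigma _{2}}{\beta } + \lambda , \qquad V_{2}^{*}(\varsigma _{2}) = \xi _{2} \varsigma _{2} - \frac{1}{2} \beta (\xi _{2} - \lambda )^{2} = \frac{1}{2\beta } \varsigma _{2}^{2} + \lambda \varsigma _{2}. \end{aligned}$$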

Since the canonical measure in this application is a homogeneous quadratic operator, the total complementary function \(\Xi : \mathbb{R}^{n}\times \mathcal {E}_{a}^{*}\rightarrow \mathbb{R}\) has the following form:

$$\begin{aligned} \Xi (\mathbf {x},\boldsymbol {\varsigma })=\frac{1}{2}\mathbf {x}^{T} \mathbf {G}(\boldsymbol {\varsigma })\mathbf {x}-\mathbf {x}^{T} \mathbf {f}- \bigl(\ln (\varsigma _{1}/\alpha )-1 \bigr)\varsigma _{1}- \biggl(\frac{1}{2 \beta }\varsigma _{2}^{2} + \lambda \varsigma _{2} \biggr), \end{aligned}$$

where

$$\begin{aligned} \mathbf {G}(\boldsymbol {\varsigma })=\varsigma _{1} D_{1}^{T} D_{1} +\varsigma _{2} D_{2}^{T} D_{2} - \mathbf {I}. \end{aligned}$$

On the canonical dual feasible space \(\mathcal {S}_{a}= \{ \boldsymbol {\varsigma }=[\varsigma _{1},\varsigma _{2}]^{T} \in \mathcal {E}_{a}^{*} \mid \det ( \mathbf {G}(\boldsymbol {\varsigma })) \neq 0 \}\), the canonical dual problem can be formulated as

$$\begin{aligned} \bigl(\mathcal {{P}}^{d} \bigr):\quad \bar {\boldsymbol {\varsigma }}&= \arg \mathop{\mathrm {sta}}_{ \boldsymbol {\varsigma }\in \mathcal {S}_{a}} \biggl\{ P^{d}(\boldsymbol {\varsigma })= -\frac{1}{2} \mathbf {f}^{T} \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}- \bigl(\ln (\varsigma _{1}/\alpha )-1 \bigr) \varsigma _{1} \\ &\quad {}- \biggl(\frac{1}{2 \beta }\varsigma _{2}^{2} + \lambda \varsigma _{2} \biggr) \biggr\} . \end{aligned}$$
(31)
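The passage from \(\Xi \) to the dual function in (31) can be spelled out in two steps. This is a sketch under the assumption that \(\boldsymbol {\varsigma }\in \mathcal {S}_{a}\), so that the symmetric matrix \(\mathbf {G}(\boldsymbol {\varsigma })\) is invertible:

$$\begin{aligned} \nabla _{\mathbf {x}} \Xi (\mathbf {x},\boldsymbol {\varsigma }) = \mathbf {G}(\boldsymbol {\varsigma })\mathbf {x}-\mathbf {f}= 0 \quad &\Longrightarrow \quad \mathbf {x}= \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}, \\ \Xi \bigl(\mathbf {G}^{-1}(\boldsymbol {\varsigma })\mathbf {f},\boldsymbol {\varsigma }\bigr) &= \frac{1}{2} \mathbf {f}^{T} \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}- \mathbf {f}^{T} \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}- V_{1}^{*}(\varsigma _{1}) - V_{2}^{*}(\varsigma _{2}) \\ &= -\frac{1}{2} \mathbf {f}^{T} \mathbf {G}^{-1}(\boldsymbol {\varsigma }) \mathbf {f}- \bigl(\ln (\varsigma _{1}/\alpha )-1 \bigr) \varsigma _{1} - \biggl(\frac{1}{2 \beta }\varsigma _{2}^{2} + \lambda \varsigma _{2} \biggr). \end{aligned}$$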

Example 1

Let \(n=2\), \(\alpha =6\), \(\beta =8\), \(\lambda =1\), and

$$\begin{aligned} D_{1}=\left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} 2&0\\ 0&3 \end{array}\displaystyle \right ] , \qquad D_{2}= \left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} 4&0\\ 0&5 \end{array}\displaystyle \right ] , \qquad \mathbf {f}=\left [ \textstyle\begin{array}{@{}c@{}} 2\\ 1 \end{array}\displaystyle \right ] , \end{aligned}$$

then the primal function is (see Fig. 1)

$$\begin{aligned} \Pi (x_{1},x_{2})= 6 \exp \bigl(2x_{1}^{2}+4.5x_{2}^{2} \bigr)+4 \bigl(8x_{1}^{2}+12.5x_{2}^{2}-1 \bigr)^{2}-\frac{1}{2} \bigl(x_{1}^{2}+x_{2}^{2} \bigr)-2x_{1}-x_{2}. \end{aligned}$$

The corresponding canonical dual function is

$$\begin{aligned} &\Pi ^{d}(\varsigma _{1},\varsigma _{2}) \\ &\quad =- \frac{1}{2} \biggl(\frac{4}{4\varsigma _{1}+16\varsigma _{2}-1}+\frac{1}{9\varsigma _{1}+25\varsigma _{2}-1} \biggr) - \varsigma _{1} \bigl(\ln (\varsigma _{1}/6)- 1 \bigr) - \biggl( \frac{1}{16}\varsigma _{2}^{2}+\varsigma _{2} \biggr). \end{aligned}$$

Its graph is shown by Fig. 2. It is easy to find that the canonical dual problem \((\mathcal {{P}}^{d}) \) has three solutions:

$$\begin{aligned}& \boldsymbol {\varsigma }^{1} =[7.38697,-1.39206]^{T} \in \mathcal {S}_{a}^{+}, \\& \boldsymbol {\varsigma }^{2} =[6.00566,-7.97189]^{T}\in \mathcal {S}_{a}^{-} , \\& \boldsymbol {\varsigma }^{3} =[7.3106,-2.23695]^{T}\in \mathcal {S}_{a}^{-}. \end{aligned}$$

By Theorem 1 we have three primal solutions:

$$\begin{aligned}& \mathbf {x}^{1} =[0.318731, 0.0325932]^{T}, \\& \mathbf {x}^{2} =[-0.0191337,-0.00683777]^{T}, \\& \mathbf {x}^{3} =[-0.264945, 0.112718]^{T} . \end{aligned}$$

Figure 1. Graphs of \(\Pi (x_{1},x_{2})\) and its contour for Example 1

Figure 2. Graphs of \(\Pi ^{d}(\varsigma _{1}, \varsigma _{2})\) and its contour for Example 1

It is easy to check that

$$\begin{aligned}& \Pi \bigl(\mathbf {x}^{1} \bigr)=\Pi ^{d} \bigl(\boldsymbol {\varsigma }^{1} \bigr)=6.78671 , \\& \Pi \bigl(\mathbf {x}^{2} \bigr)=\Pi ^{d} \bigl(\boldsymbol {\varsigma }^{2} \bigr)=10.0225 , \\& \Pi \bigl(\mathbf {x}^{3} \bigr)=\Pi ^{d} \bigl(\boldsymbol {\varsigma }^{3} \bigr)=7.99906. \end{aligned}$$

By Theorem 2 we know that \(\mathbf {x}^{1} \) is a global minimizer of \(\Pi (\mathbf {x})\), \(\mathbf {x}^{2}\) is a local maximizer of \(\Pi (\mathbf {x})\), and \(\mathbf {x}^{3}=[-0.264945, 0.112718]^{T}\) is a local minimizer of \(\Pi (\mathbf {x})\) (see Fig. 1). By the fact that

$$\begin{aligned}& x_{1}^{i} = F_{1} \bigl(x_{1}^{i},x_{2}^{i} \bigr)=6\exp \bigl(2 \bigl(x_{1}^{i} \bigr)^{2}+4.5 \bigl(x_{2}^{i} \bigr)^{2} \bigr)4x_{1}^{i}+8 \bigl(8 \bigl(x_{1}^{i} \bigr)^{2}+12.5 \bigl(x_{2}^{i} \bigr)^{2}-1 \bigr)16x_{1}^{i}-2, \\& x_{2}^{i} = F_{2} \bigl(x_{1}^{i},x_{2}^{i} \bigr)=6\exp \bigl(2 \bigl(x_{1}^{i} \bigr)^{2}+4.5 \bigl(x_{2}^{i} \bigr)^{2} \bigr)9x_{2}^{i}+8 \bigl(8 \bigl(x_{1}^{i} \bigr)^{2}+12.5 \bigl(x_{2}^{i} \bigr)^{2}-1 \bigr)25x_{2}^{i}-1 \end{aligned}$$

hold for all \(i=1,2,3\), we know that \(\{ \mathbf {x}^{i}\} \) (\(i=1,2,3\)) are all fixed points.
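The check above can be reproduced numerically. The following Python sketch is not part of the original computation (the function names `primal_from_dual` and `fixed_point_map` are hypothetical, and it relies on \(D_{1}\), \(D_{2}\) being diagonal in Example 1). It recovers each \(\mathbf {x}^{i}=\mathbf {G}^{-1}(\boldsymbol {\varsigma }^{i})\mathbf {f}\) from the reported dual solutions and evaluates the fixed point residual \(\Vert \mathbf {x}^{i} - F(\mathbf {x}^{i})\Vert _{\infty }\):

```python
import math

# Data of Example 1: alpha, beta, lambda and the diagonals of D1^T D1, D2^T D2.
ALPHA, BETA, LAM = 6.0, 8.0, 1.0
D1TD1 = (4.0, 9.0)     # D1 = diag(2, 3)
D2TD2 = (16.0, 25.0)   # D2 = diag(4, 5)
F_VEC = (2.0, 1.0)


def primal_from_dual(s1, s2):
    """x = G(s)^{-1} f for the diagonal G(s) = s1*D1^T D1 + s2*D2^T D2 - I."""
    return tuple(f / (s1 * a + s2 * b - 1.0)
                 for a, b, f in zip(D1TD1, D2TD2, F_VEC))


def fixed_point_map(x):
    """F(x) = alpha*exp(xi1)*D1^T D1 x + beta*(xi2 - lambda)*D2^T D2 x - f."""
    xi1 = 0.5 * sum(a * v * v for a, v in zip(D1TD1, x))
    xi2 = 0.5 * sum(b * v * v for b, v in zip(D2TD2, x))
    s1 = ALPHA * math.exp(xi1)
    s2 = BETA * (xi2 - LAM)
    return tuple(s1 * a * v + s2 * b * v - f
                 for a, b, v, f in zip(D1TD1, D2TD2, x, F_VEC))


# Dual solutions as reported in the text (rounded values).
duals = [(7.38697, -1.39206), (6.00566, -7.97189), (7.3106, -2.23695)]

results = []
for s in duals:
    x = primal_from_dual(*s)
    residual = max(abs(v - fv) for v, fv in zip(x, fixed_point_map(x)))
    results.append((s, x, residual))
    print(s, x, residual)
```

Because the dual solutions are entered with only five or six significant digits, the residuals are small but not zero.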

4.2 Logarithmic and quadratic function

In this application, we let

$$\begin{aligned} {W}(D\mathbf {x})=c_{1} \Vert D\mathbf {x}\Vert ^{2}+c_{2} \Vert D\mathbf {x}\Vert ^{2} \log \Vert D\mathbf {x}\Vert ^{2}, \end{aligned}$$

where \(D\in \mathbb{R}^{m\times n}\) is a matrix and \(c_{1}\), \(c_{2}\) are real numbers. Clearly, \({W}(D\mathbf {x})\) is nonconvex and

$$\begin{aligned} F(\mathbf {x})=\nabla P(\mathbf {x}) =2c_{1} \bigl(D^{T} D\bigr)\mathbf {x}+2c_{2} \bigl( \bigl(D^{T} D\bigr) \mathbf {x}\bigr) \bigl( \log \Vert D\mathbf {x}\Vert ^{2}+1 \bigr)-\mathbf {f} \end{aligned}$$

is non-monotone. The fixed point problem \(\mathbf {x}= F(\mathbf {x})\) can be reformulated as

$$\begin{aligned} (\mathcal {{P}}): \quad {\bar {\mathbf {x}}}= \arg \mathrm {sta}\biggl\{ \Pi (\mathbf {x})=c_{1} \Vert D\mathbf {x}\Vert ^{2}+c_{2} \Vert D\mathbf {x}\Vert ^{2} \log \Vert D\mathbf {x}\Vert ^{2} -\frac{1}{2} \Vert \mathbf {x}\Vert ^{2}-\mathbf {x}^{T}\mathbf {f}\mid \mathbf {x}\in {\mathbb {R}}^{n} \biggr\} . \end{aligned}$$

By using the canonical measure

$$ \xi = {\Lambda }(\mathbf {x})= \Vert D\mathbf {x}\Vert ^{2} : \quad {\mathbb {R}}^{n} \rightarrow \mathcal {E}_{a} = {\mathbb {R}}^{+}= \{ \xi \in {\mathbb {R}}\mid \xi \ge 0 \}, $$

the canonical function is \(V(\xi )=c_{1} \xi +c_{2} \xi \log \xi \), the canonical dual variable is \(\varsigma = \nabla V(\xi ) = c_{1} + c_{2} (\log \xi +1)\), and the Legendre conjugate of V is

$$\begin{aligned} V^{*}(\varsigma )=c_{2} \exp \biggl[\frac{1}{c_{2}}(\varsigma -c_{1})-1 \biggr], \end{aligned}$$

which is convex on its domain \(\mathcal {E}^{*}_{a} = {\mathbb {R}}\). In this case, we have the total complementary function

$$\begin{aligned} \Xi (\mathbf {x},\varsigma )=\frac{1}{2}\mathbf {x}^{T} \mathbf {G}(\varsigma )\mathbf {x}-\mathbf {x}^{T} \mathbf {f}-c_{2} \exp \biggl[\frac{1}{c_{2}}(\varsigma -c_{1})-1 \biggr], \end{aligned}$$

where \(\mathbf {G}(\varsigma )=2\varsigma D^{T} D- \mathbf {I}\) and the canonical dual problem is

$$ \bigl(\mathcal {{P}}^{d} \bigr): \quad \bar {\varsigma }= \arg \mathrm {sta}\biggl\{ P^{d}(\varsigma )= -\frac{1}{2} \mathbf {f}^{T} \mathbf {G}^{-1}(\varsigma ) \mathbf {f}-c_{2} \exp \biggl[\frac{1}{c_{2}}(\varsigma -c_{1})-1 \biggr] \Bigm|\det \mathbf {G}(\varsigma ) \neq 0 \biggr\} . $$
(32)

Example 2

We first let \(m=n=2\), \(c_{1}=-8\), \(c_{2}=10\), and

$$\begin{aligned} D=\left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} 3&0\\ 0&4 \end{array}\displaystyle \right ] , \qquad \mathbf {f}=\left [ \textstyle\begin{array}{@{}c@{}} -5\\ 2 \end{array}\displaystyle \right ] . \end{aligned}$$

The primal function

$$\begin{aligned} \Pi (x_{1},x_{2})= -8 \bigl(9x_{1}^{2}+16x_{2}^{2} \bigr)+10 \bigl(9x_{1}^{2}+16x_{2}^{2} \bigr) \log \bigl(9x_{1}^{2}+16x_{2}^{2} \bigr) -\frac{1}{2} \bigl(x_{1}^{2}+x_{2}^{2} \bigr)+5x_{1}-2x_{2} \end{aligned}$$

is nonconvex and its graph is shown in Fig. 3. The corresponding canonical dual function is

$$\begin{aligned} \Pi ^{d}(\varsigma )=-\frac{1}{2} \biggl(\frac{25}{18\varsigma -1}+ \frac{4}{32\varsigma -1} \biggr) -10\exp \bigl[0.1(\varsigma +8)-1 \bigr]. \end{aligned}$$

For this example, the one-dimensional canonical dual problem \((\mathcal {{P}}^{d})\) can be solved easily (by using Mathematica) to obtain three solutions in total (see Fig. 4):

$$\begin{aligned} \varsigma ^{1}=0.969642 > \varsigma ^{2} =-0.955077 > \varsigma ^{3} = -91.0174. \end{aligned}$$

Correspondingly, the three primal solutions are

$$ \mathbf {x}^{1} = \left [ \textstyle\begin{array}{@{}c@{}} -0.303886\\ 0.0666033 \end{array}\displaystyle \right ] , \qquad \mathbf {x}^{2} = \left [ \textstyle\begin{array}{@{}c@{}} 0.274855 \\ -0.0633664 \end{array}\displaystyle \right ] , \qquad \mathbf {x}^{3} = \left [ \textstyle\begin{array}{@{}c@{}} 0.00305006 \\ -0.000686446 \end{array}\displaystyle \right ] . $$

It is easy to check that \(\mathbf {x}^{i} = F(\mathbf {x}^{i})\), \(i=1,2,3\). Therefore, \(\{\mathbf {x}^{i}\}\) are fixed points. Here \(\mathbf {G}(\varsigma ) = \operatorname{diag}(18\varsigma -1, 32\varsigma -1)\) is positive definite exactly when \(\varsigma > 1/18\). Since \(\varsigma ^{1} \in \mathcal {S}^{+}_{a} = \{ \varsigma \in {\mathbb {R}}\mid \varsigma > 1/18 \} \), we know that \(\mathbf {x}^{1}\) is a globally stable fixed point. It is easy to check that \(\mathbf {x}^{2} \) is a locally stable fixed point, \(\mathbf {x}^{3}\) is a locally unstable fixed point, and

$$\begin{aligned}& \Pi \bigl(\mathbf {x}^{1} \bigr)=\Pi ^{d} \bigl(\varsigma ^{1} \bigr)=-9.84726, \\& \Pi \bigl(\mathbf {x}^{2} \bigr)=\Pi ^{d} \bigl(\varsigma ^{2} \bigr)=-6.69103, \\& \Pi \bigl(\mathbf {x}^{3} \bigr)=\Pi ^{d} \bigl(\varsigma ^{3} \bigr)=0.00739894 . \end{aligned}$$
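The one-dimensional dual problem of Example 2 can also be solved without a computer algebra system. The sketch below is an illustration only: it replaces the Mathematica computation by plain bisection on the derivative \(\mathrm{d}\Pi ^{d}/\mathrm{d}\varsigma = \Vert D\mathbf {x}(\varsigma )\Vert ^{2} - \exp [(\varsigma -c_{1})/c_{2} - 1]\) with \(\mathbf {x}(\varsigma ) = \mathbf {G}^{-1}(\varsigma )\mathbf {f}\). The three brackets are assumptions chosen by hand so that the derivative changes sign and \(\mathbf {G}(\varsigma )\) stays nonsingular inside each of them, and all function names are hypothetical.

```python
import math

# Data of Example 2: c1, c2, the diagonal of D^T D for D = diag(3, 4), and f.
C1, C2 = -8.0, 10.0
DTD = (9.0, 16.0)
F_VEC = (-5.0, 2.0)


def primal_from_dual(s):
    """x(s) = G(s)^{-1} f with the diagonal G(s) = 2*s*D^T D - I."""
    return tuple(f / (2.0 * s * d - 1.0) for d, f in zip(DTD, F_VEC))


def dual_value(s):
    """Pi^d(s) = -0.5 * f^T G^{-1}(s) f - c2 * exp((s - c1)/c2 - 1)."""
    x = primal_from_dual(s)
    return (-0.5 * sum(f * v for f, v in zip(F_VEC, x))
            - C2 * math.exp((s - C1) / C2 - 1.0))


def dual_slope(s):
    """Derivative of Pi^d: ||D x(s)||^2 - exp((s - c1)/c2 - 1)."""
    x = primal_from_dual(s)
    xi = sum(d * v * v for d, v in zip(DTD, x))
    return xi - math.exp((s - C1) / C2 - 1.0)


def primal_value(x):
    """Pi(x) = c1*xi + c2*xi*log(xi) - 0.5*||x||^2 - x^T f, xi = ||D x||^2."""
    xi = sum(d * v * v for d, v in zip(DTD, x))
    return (C1 * xi + C2 * xi * math.log(xi)
            - 0.5 * sum(v * v for v in x)
            - sum(v * f for v, f in zip(x, F_VEC)))


def bisect(g, lo, hi, iters=200):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hand-chosen brackets (an assumption), one per stationary point.
brackets = [(0.5, 2.0), (-2.0, -0.5), (-200.0, -20.0)]
roots = [bisect(dual_slope, lo, hi) for lo, hi in brackets]

for s in roots:
    x = primal_from_dual(s)
    print(s, x, primal_value(x), dual_value(s))
```

At each computed stationary point the primal and dual values agree up to rounding, which is the complementary-dual relation stated above.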

Figure 3. Graph of \(\Pi (x_{1},x_{2})\) and its contour for Example 2

Figure 4. Graph of \(\Pi ^{d}(\varsigma )\) for Example 2

Example 3

We now let \(m=3\), \(n=2\), \(c_{1}=-15\), \(c_{2}=9\), and

$$\begin{aligned} D=\left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} 0.3&0.2\\ 0.5&0.6\\ 0.4&0.7 \end{array}\displaystyle \right ] , \qquad \mathbf {f}=\left [ \textstyle\begin{array}{@{}c@{}} 1\\ 4 \end{array}\displaystyle \right ] , \end{aligned}$$

then the primal function is

$$\begin{aligned} \Pi (x_{1},x_{2}) &= 9 \bigl(0.5x_{1}^{2}+1.28x_{1}x_{2}+0.89x_{2}^{2} \bigr) \log \bigl(0.5x_{1}^{2}+1.28x_{1}x_{2}+0.89x_{2}^{2} \bigr) \\ &\quad {}-15 \bigl(0.5x_{1}^{2}+1.28x_{1}x_{2}+0.89x_{2}^{2} \bigr)-\frac{1}{2} \bigl(x_{1}^{2}+x_{2}^{2} \bigr)-x_{1}-4x_{2}. \end{aligned}$$

Its graph is a nonconvex surface in \({\mathbb {R}}^{3}\), which has multiple critical points, but their locations cannot be found precisely as the surface is rather flat around these critical points (see Figs. 5–7). However, its canonical dual is a single-valued function

$$\begin{aligned} \Pi ^{d}(\varsigma )=-\frac{1}{2} \left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} 1&4 \end{array}\displaystyle \right ] \left [ \textstyle\begin{array}{@{}c@{\quad}c@{}} \varsigma -1&1.28 \varsigma \\ 1.28 \varsigma & 1.78 \varsigma -1 \end{array}\displaystyle \right ] ^{-1} \left [ \textstyle\begin{array}{@{}c@{}} 1\\ 4 \end{array}\displaystyle \right ] -9\exp \biggl[\frac{1}{9}(\varsigma +15)-1 \biggr], \end{aligned}$$

and from its graph, we can see clearly that it has five critical points in total (see Figs. 8 and 9). These critical points can be easily obtained by Mathematica:

$$\begin{aligned} \varsigma ^{1}=20.396 > \varsigma ^{2}=17.9735 > \varsigma ^{3} =1.46219 > \varsigma ^{4}=-0.881733 > \varsigma ^{5}=-52.7144 . \end{aligned}$$

By Theorem 1, we have all the primal solutions:

$$\begin{aligned}& \mathbf {x}^{1} = \left [ \textstyle\begin{array}{@{}c@{}} -21.57\\ 16.065 \end{array}\displaystyle \right ] , \qquad \mathbf {x}^{2} = \left [ \textstyle\begin{array}{@{}c@{}} 18.937 \\ -13.93 \end{array}\displaystyle \right ] , \qquad \mathbf {x}^{3} = \left [ \textstyle\begin{array}{@{}c@{}} 2.130\\ 0.008 \end{array}\displaystyle \right ] , \\& \mathbf {x}^{4} = \left [ \textstyle\begin{array}{@{}c@{}} 0.546 \\ -1.797 \end{array}\displaystyle \right ] , \qquad \mathbf {x}^{5} = \left [ \textstyle\begin{array}{@{}c@{}} 0.323 \\ -0.272 \end{array}\displaystyle \right ] . \end{aligned}$$

Since \(F(\mathbf {x})\) is a potential operator, these stationary points are all fixed points of \(F(\mathbf {x})\). It is easy to find that the matrix \(\mathbf {G}(\varsigma ) \) has two singularity points: \(\varsigma _{1} = 19.266\) and \(\varsigma _{2} = 0.367\); therefore,

$$ \mathcal {S}^{+}_{a} = \{ \varsigma \in {\mathbb {R}}\mid \varsigma > 19.266 \}, \qquad \mathcal {S}^{-}_{a} = \{ \varsigma \in {\mathbb {R}}\mid \varsigma < 0.367 \}. $$

By the facts that \(\varsigma ^{1} \in \mathcal {S}^{+}_{a}\) and \(\varsigma ^{5}\in \mathcal {S}^{-}_{a}\), we know that \(\mathbf {x}^{1}\) is a globally stable fixed point and \(\mathbf {x}^{5}\) is a locally unstable fixed point. Although \(\varsigma ^{4} \in \mathcal {S}^{-}_{a}\) is a local minimizer of \(\Pi ^{d}(\varsigma )\), we cannot say whether \(\mathbf {x}^{4}\) is a locally stable fixed point since \(\dim \mathcal {X}_{a} = 2 \neq \dim \mathcal {S}_{a} = 1\). But by the complementary-dual principle and the order of the canonical dual solutions \(\{\varsigma ^{i} \}\), we have

$$\begin{aligned} \Pi \bigl(\mathbf {x}^{1} \bigr)&=\Pi ^{d} \bigl(\varsigma ^{1} \bigr)=-190.381 \\ &< \Pi \bigl(\mathbf {x}^{2} \bigr)=\Pi ^{d} \bigl(\varsigma ^{2} \bigr)=-110.759 \\ &< \Pi \bigl(\mathbf {x}^{3} \bigr)=\Pi ^{d} \bigl(\varsigma ^{3} \bigr)=-21.7036 \\ & < \Pi \bigl(\mathbf {x}^{4} \bigr)=\Pi ^{d} \bigl(\varsigma ^{4} \bigr)=-12.5735 \\ &< \Pi \bigl(\mathbf {x}^{5} \bigr)=\Pi ^{d} \bigl(\varsigma ^{5} \bigr)=0.332915. \end{aligned}$$
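The quantities used in Example 3 can be recomputed with a few lines of Python. The sketch below is illustrative only (all names are hypothetical). It forms \(\mathbf {G}(\varsigma ) = 2\varsigma D^{T} D - \mathbf {I}\), finds the two singularity points as the roots of \(\det \mathbf {G}(\varsigma ) = 0\), classifies \(\mathbf {G}(\varsigma ^{i})\) by Sylvester's criterion for \(2\times 2\) symmetric matrices, and compares \(\Pi (\mathbf {x}^{i})\) with \(\Pi ^{d}(\varsigma ^{i})\) at the reported (rounded) dual solutions:

```python
import math

# Data of Example 3.
D = [(0.3, 0.2), (0.5, 0.6), (0.4, 0.7)]
F_VEC = (1.0, 4.0)
C1, C2 = -15.0, 9.0

# Entries of the symmetric 2x2 matrix D^T D.
A11 = sum(r[0] * r[0] for r in D)
A12 = sum(r[0] * r[1] for r in D)
A22 = sum(r[1] * r[1] for r in D)


def g_matrix(s):
    """Entries (g11, g12, g22) of G(s) = 2*s*D^T D - I."""
    return 2.0 * s * A11 - 1.0, 2.0 * s * A12, 2.0 * s * A22 - 1.0


def classify(s):
    """Definiteness of G(s) from its determinant and leading entry."""
    g11, g12, g22 = g_matrix(s)
    det = g11 * g22 - g12 * g12
    if det < 0:
        return "indefinite"
    if det == 0:
        return "singular"
    return "positive definite" if g11 > 0 else "negative definite"


def primal_from_dual(s):
    """x(s) = G(s)^{-1} f via the explicit 2x2 inverse."""
    g11, g12, g22 = g_matrix(s)
    det = g11 * g22 - g12 * g12
    f1, f2 = F_VEC
    return ((g22 * f1 - g12 * f2) / det, (g11 * f2 - g12 * f1) / det)


def primal_value(x):
    """Pi(x) = c1*xi + c2*xi*log(xi) - 0.5*||x||^2 - x^T f, xi = ||D x||^2."""
    x1, x2 = x
    xi = A11 * x1 * x1 + 2.0 * A12 * x1 * x2 + A22 * x2 * x2
    return (C1 * xi + C2 * xi * math.log(xi)
            - 0.5 * (x1 * x1 + x2 * x2) - (x1 * F_VEC[0] + x2 * F_VEC[1]))


def dual_value(s):
    """Pi^d(s) = -0.5 * f^T G^{-1}(s) f - c2 * exp((s - c1)/c2 - 1)."""
    x1, x2 = primal_from_dual(s)
    return (-0.5 * (F_VEC[0] * x1 + F_VEC[1] * x2)
            - C2 * math.exp((s - C1) / C2 - 1.0))


# det G(s) = a*s^2 + b*s + 1; its two roots are the singularity points.
a = 4.0 * (A11 * A22 - A12 * A12)
b = -2.0 * (A11 + A22)
disc = math.sqrt(b * b - 4.0 * a)
singular = sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])
print("singularity points:", singular)

# Dual solutions as reported in the text (rounded values).
duals = [20.396, 17.9735, 1.46219, -0.881733, -52.7144]
for s in duals:
    x = primal_from_dual(s)
    print(s, classify(s), x, primal_value(x), dual_value(s))
```

Since the dual solutions are entered as rounded values, the primal and dual values agree only approximately; \(\mathbf {x}^{1}\) and \(\mathbf {x}^{2}\) are the most sensitive because \(\varsigma ^{1}\) and \(\varsigma ^{2}\) lie close to a singularity point of \(\mathbf {G}(\varsigma )\).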

Figure 5. Graph of \(\Pi (\mathbf {x})\) and its contour for Example 3

Figure 6. Graph of \(\Pi (\mathbf {x})\) and its contour for Example 3 around \(\mathbf {x}^{1}\)

Figure 7. Graph of \(\Pi (\mathbf {x})\) and its contour for Example 3 around \(\mathbf {x}^{2}\)

Figure 8. Graph of \(\Pi ^{d}(\varsigma )\) for Example 3

Figure 9. Graph of \(\Pi ^{d}(\varsigma )\) for Example 3 around \(\varsigma ^{2}\)

5 Conclusions

Based on the canonical duality theory, a unified model is proposed such that general fixed point problems can be reformulated as a global optimization problem. This model is directly related to many other challenging problems in variational inequality, d.c. programming, chaotic dynamics, nonconvex analysis/PDEs, post-buckling of large deformed structures, phase transitions in solids, computer science, etc. (see [24] and the references cited therein). By the complementary-dual principle, all the fixed points can be obtained analytically in terms of the canonical dual solutions. Their stability and extremality are identified by the triality theory. Applications are illustrated by problems governed by nonconvex polynomial, exponential, and logarithmic functions. Our examples show that both globally stable and locally stable/unstable fixed points in \({\mathbb {R}}^{n}\) can be obtained easily by solving the associated canonical dual problems in \({\mathbb {R}}^{m}\) with \(m< n\). However, the local stability condition for those fixed points \({\bar {\mathbf {x}}}(\bar {\boldsymbol {\varsigma }})\) with indefinite \(\mathbf {G}(\bar {\boldsymbol {\varsigma }})\) still remains unknown, and it deserves serious study in the future. Also, the results presented in this paper can be generalized to problems with nonsmooth potential functions.