1 Introduction

In [1], Z. Boros proved that if D is an open, connected subset of \({\mathbb {R}}^2\), then a continuous function \(F : D \longrightarrow {\mathbb {R}}\,\), which is strictly monotone in one of its variables, fulfills the equations

$$\begin{aligned} F(x+t,y) = \Psi _1 (F(x,y),t) \quad \text{ and } \quad F(x,y+s) = \Psi _2 (F(x,y),s) \end{aligned}$$

with some unknown functions \( \Psi _1 \,,\, \Psi _2 \) if, and only if, there exist real numbers \( a \,,\, b \) and a strictly monotone, continuous real function f such that F can be represented as

$$\begin{aligned} F(x,y) = f(a x + b y) \, . \end{aligned}$$

Motivated by this result, it seems natural to consider the analogous version of the above mentioned system of composite functional equations for \(n \ge 2\) variables. Therefore, in our work we are going to investigate the system of functional equations

$$\begin{aligned} F(x_1+t_1, x_2, \dots , x_n)&= \Psi _1(F (x_1, x_2, \dots , x_n), t_1) \quad (1) \\ F(x_1, x_2 + t_2, \dots , x_n)&= \Psi _2(F (x_1, x_2, \dots , x_n), t_2) \quad (2) \\&\vdots \\ F(x_1, x_2, \dots , x_n + t_n)&= \Psi _n(F (x_1, x_2, \dots , x_n), t_n) \quad (n). \end{aligned}$$
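Before proceeding, a small numerical sketch (not part of the argument) may be helpful: for a function of the form \(F(x) = f(a_1 x_1 + \dots + a_n x_n)\) with f strictly monotone and continuous, one natural choice of the unknown functions is \(\Psi _i(u,t) = f(f^{-1}(u) + a_i t)\). The coefficient vector and the choice \(f = \exp \) below are illustrative assumptions.

```python
import math

# Sketch (sample data, not from the paper): check numerically that
# F(x) = f(<a, x>) solves the system (1)-(n), taking
# Psi_i(u, t) = f(f^{-1}(u) + a_i * t).
a = (2.0, -1.0, 0.5)           # assumed coefficient vector (n = 3)
f, f_inv = math.exp, math.log  # strictly monotone f and its inverse

def F(x):
    return f(sum(ai * xi for ai, xi in zip(a, x)))

def Psi(i, u, t):
    return f(f_inv(u) + a[i] * t)

x, t = (0.3, 1.2, -0.7), 0.9
for i in range(3):
    shifted = list(x)
    shifted[i] += t
    assert math.isclose(F(shifted), Psi(i, F(x), t))
print("system (1)-(n) verified for the sketched F")
```

Any other strictly monotone, continuous f (with the corresponding inverse) would work the same way; only the values of \(\Psi _i\) change.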

Our purpose is to obtain some decomposition theorems for the continuous solutions of this system of equations, similar to the one in [1]. However, we will take an approach which fundamentally differs from the arguments applied in [1]. The main reason for using alternative methods is that the ideas used by Boros do not seem easily applicable for the higher dimensional system of equations.

Therefore we will utilize the geometrical meaning of the functional equations. Namely, one can easily check that each one of the equations \((1) - (n)\) ensures that if \(F(x) = F(y)\) holds for some points of the domain of definition of F, then \(F(x + t) = F(y + t)\) is also fulfilled for vectors \(t \in {\mathbb {R}}^n\) which are parallel to the corresponding coordinate axis (of course, assuming that F is defined in \(x+t\) and \(y+t\)). It seems reasonable to investigate whether an analogous property holds for arbitrary translation vectors as well. Hence we will introduce the concept of translation invariant functions: if S is a non-empty subset of \({\mathbb {R}}^n\) then the function \(F : S \longrightarrow {\mathbb {R}}\) is said to be translation invariant if \(F(x) = F(y)\) implies \(F(x + t) = F(y + t)\) for all vectors \(t \in {\mathbb {R}}^n\) such that \(x+t \, , y+t \in S\).

In the most extensive part of our work we will examine continuous, translation invariant functions with the purpose of formulating a decomposition theorem similar to the one in [1]. Then we will show that the obtained results can be applied to characterize the continuous solutions of \((1) - (n)\). As we will see, this can be done without any major difficulties, as the solutions of the system of equations are locally translation invariant, so they behave similarly to 'proper' translation invariant functions. Another advantage of our approach is that we do not need to assume strict monotonicity in any of the variables of the solutions. This means that, as a particular case, we immediately obtain a stronger version of the main result of [1].

Finally, we are going to present an application of our results in the field of mathematical economics. One can find numerous works (e.g. [2, 7]) which deal with the problem of how to characterize certain types of so-called utility functions. However, it is common that strong regularity conditions (such as higher-order differentiability) are assumed, which exclude many possible utility functions. On the other hand, our decomposition theorem provides a characterization of the frequently used Cobb–Douglas type utility functions, with the help of a system of composite functional equations, while assuming relatively weak regularity.

2 Preliminaries

Firstly, we will collect some fundamental notions and propositions concerning convex geometry and particular sets of metric spaces. Throughout our work we consider \({\mathbb {R}}^n\) with its standard inner product, norm and the induced topology. In order to avoid possible technical difficulties, we shall always assume that the dimension n is at least 2 (we are going to explicitly state if this is not the case).

We will use the concepts of affine/convex set, affine/convex combination and affine/convex hull in the usual sense. Special affine/convex sets, such as lines, line segments and hyperplanes will also be used in the standard way. These definitions can be found, for instance, in the first chapter of the monograph [4] of Lay. Let us introduce some notations.

Notation 1

Let \(x,y \in {\mathbb {R}}^n\). The line passing through x and y will be denoted by \(\text{ l }(x,y)\), and the line segment joining x and y will be denoted by \(\mathrm{s}(x,y)\). Furthermore, let

$$\begin{aligned} \mathrm{int} \, \mathrm{s}(x,y) = \{ \, (1-\lambda )x + \lambda y \ \vert \ \lambda \in \, ]0\,,1[ \, \} \end{aligned}$$

be the interior of the segment. Moreover, for fixed vectors \(x \in {\mathbb {R}}^n\) and \(0 \ne a \in {\mathbb {R}}^n\), the hyperplane that passes through x and has normal vector a will be denoted by \(H(x,a)\), i.e.

$$\begin{aligned} H(x, a) = \lbrace y \in {\mathbb {R}}^n\ \vert \ \langle y-x , a \rangle = 0 \rbrace . \end{aligned}$$

Remark 2.1

The affine/convex hull of finitely many points \(x_1, \dots ,x_k \in {\mathbb {R}}^n\) will be denoted by \( \mathrm{Aff}(x_1, \dots ,x_k) \) and \(\mathrm{conv}(x_1, \dots ,x_k) \, \), respectively. Since the affine/convex hull of a set S consists of all the affine/convex combinations of the points of S (see [4, Theorem 2.22]), we get that these sets have the form

$$\begin{aligned} \mathrm{Aff}(x_1, \dots ,x_k) = \big \{ \, \sum _{j=1}^k \lambda _j x_j \ \vert \ \lambda _j \in {\mathbb {R}}, \, \sum _{j=1}^k \lambda _j = 1 \, \big \}, \\ \mathrm{conv}(x_1, \dots ,x_k) = \big \{ \, \sum _{j=1}^k \lambda _j x_j \ \vert \ \lambda _j \in [0,1], \, \sum _{j=1}^k \lambda _j = 1 \, \big \}. \end{aligned}$$

As it is well-known, the affine subsets of \({\mathbb {R}}^n\) can be characterized as the translates of linear subspaces. Namely, if \(A \subset {\mathbb {R}}^n\) is affine, then there exists a unique linear subspace \(L \subset {\mathbb {R}}^n\) such that \(A = p + L\), for any \(p \in A\) (cf. [4, Theorem 2.13]). Motivated by this result, we will refer to affine sets of \({\mathbb {R}}^n\) as affine subspaces (as it is common in the literature).

We will also use the following notion: the dimension of an affine subspace A is the dimension of the (uniquely determined) linear subspace belonging to A. For example, the \(n-1\) dimensional affine subspaces of \({\mathbb {R}}^n\) are the hyperplanes. Furthermore, the affine hull of a set \(S \subset {\mathbb {R}}^n\) will be called the affine subspace generated by S.

Finally, we shall mention affine independence and give some equivalent conditions.

Definition 2.2

Let \(k \in {\mathbb {N}}\), \(x_1, \dots , x_k \in {\mathbb {R}}^n\) and \(\lambda _1, \dots , \lambda _k \in {\mathbb {R}}\). The points \(x_1, \dots , x_k \) are said to be affinely independent if

$$\begin{aligned} \sum _{j=1}^k \lambda _j x_j = 0 \ \text{ and } \ \sum _{j=1}^k \lambda _j = 0 \end{aligned}$$

implies \(\lambda _j = 0\), for every \(j = 1, \dots ,k \).

Proposition 2.3

Let \(x_1, \dots , x_k \in {\mathbb {R}}^n\). Then the following statements are equivalent:

  1. i)

    \(x_1, \dots , x_k\) are affinely independent.

  2. ii)

    For any \(i \in \{1, \dots , k \}\), the system of vectors

    $$\begin{aligned} \lbrace x_j-x_i \ \vert \ j = 1, \dots , k \text{ and } j \ne i \} \end{aligned}$$

    is linearly independent.

  3. iii)

    The affine subspace generated by \(x_1, \dots , x_k\) has dimension \(k-1\).

The results of this statement can be found in [4], mostly in the form of exercises (e.g. [4, Exercise 2.27]) and remarks. Now we formulate a corollary of the proposition, which later will be an important auxiliary tool.

Lemma 2.4

Let \(x_1, \dots ,x_k \in {\mathbb {R}}^n\) be affinely independent and let

$$\begin{aligned} y \notin \mathrm{Aff}(x_1, \dots ,x_k). \end{aligned}$$

Furthermore, let us consider some non-zero scalars \(\lambda _1, \dots , \lambda _k \in {\mathbb {R}}\setminus \{ 0 \}\), and define

$$\begin{aligned} y_j := y + \lambda _j (x_j-y) \qquad (j = 1, \dots , k). \end{aligned}$$

Then \(y, y_1, \dots ,y_k\) are affinely independent.

Proof

Since \(x_1, \dots ,x_k\) are affinely independent and \(y \notin \mathrm{Aff}(x_1, \dots ,x_k) \), it follows that \(x_1, \dots ,x_k\) together with y are also affinely independent and therefore \(x_1-y, \dots ,x_k-y\) are linearly independent. Multiplying some of these vectors by non-zero scalars does not affect their linear independence. This means that the vectors \(y_j - y = \lambda _j (x_j-y)\) for \(j = 1, \dots , k\) are linearly independent, so \(y, y_1, \dots ,y_k\) are affinely independent. \(\square \)
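The lemma is easy to illustrate numerically; the sketch below (with arbitrarily chosen points and non-zero scalars, not taken from the paper) tests affine independence via condition ii) of Proposition 2.3, i.e. via the rank of the difference vectors.

```python
# Illustrative check of Lemma 2.4 (sample data): affine independence
# is tested through the rank of the difference vectors, which is
# condition ii) of Proposition 2.3.

def rank(rows):
    # Plain Gaussian elimination over floats.
    m = [list(r) for r in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if abs(m[i][c]) > 1e-12), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and abs(m[i][c]) > 1e-12:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs) == len(points) - 1

x = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # affinely independent pair
y = (0.0, 0.0, 1.0)                     # y is not in Aff(x1, x2)
lam = (2.0, -0.5)                       # non-zero scalars
ys = [tuple(yi + l * (xi - yi) for xi, yi in zip(xj, y))
      for xj, l in zip(x, lam)]
assert affinely_independent([y] + ys)
print("y, y_1, ..., y_k are affinely independent")
```

Replacing `lam` by any other non-zero scalars leaves the assertion true, in accordance with the lemma; a zero scalar would collapse \(y_j\) onto y and destroy the independence.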

In the next part we will summarize some important topological notions and theorems. Since we will work in \({\mathbb {R}}^n\), in this section we also restrict ourselves to metric spaces only, instead of a more general approach. We shall use the concept of open sets, closed sets as well as compact sets and connected sets in the usual sense (as they are introduced, for instance, in the monographs of Rudin [6] or Sutherland [8]).

An open ball with center x and radius r will be denoted by \(B(x,r)\). We will frequently use the fact that the image of a connected set under a continuous function is connected, while the image of a compact set under a continuous function is compact. A well-known consequence of the latter statement is that if X is a metric space, \(\emptyset \ne K \subset X\) is compact and \(f: K \longrightarrow {\mathbb {R}}\) is a continuous function, then f attains its minimum and maximum on the set K. The proof of these previous statements can be found in [8, Propositions 13.15, 12.13 and Corollary 13.18].

Finally, we enumerate two statements that later will play a crucial role in many of the main results of our work. Both of them are often used in the proofs of some classical theorems of complex analysis.

Proposition 2.5

Let \((X,d)\) be a metric space, \(K \subset X\) be compact, \(\emptyset \ne C \subset X\) be closed and assume \(K \cap C = \emptyset \). Then there exists \(r >0\) such that for every \(x \in K\) it holds that \(B(x,r) \subset X \setminus C\).

Proposition 2.6

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be open, connected and let \(x,y \in D\). Then there exists a polygonal path in D joining x and y.

Remark 2.7

By a polygonal path in D joining x and y we mean the union of line segments

$$\begin{aligned} \mathrm{s}(z_{k-1},z_k) \subset D \, , \qquad (k= 1, \dots , m) \end{aligned}$$

where \(z_0, z_1, \dots z_m \in D\) such that \(z_0 = x\) and \(z_m = y\).

We shall note that the proof of a somewhat weaker statement – namely that any open, connected subset of \({\mathbb {R}}^n\) is path-connected – can be found in the monograph of Sutherland [8, Proposition 12.25]. The author also mentions that the proof could be applied to the case of polygonal paths as well, without any major adjustments.

3 Translation invariant functions

In this section we introduce the concept of translation invariance, and then formulate various propositions for continuous, translation invariant functions. We will show that such functions – if the domain of definition is connected and open – can be represented as the composition of a strictly monotone, continuous, real valued function and a linear functional.

3.1 Translation invariant functions on affine sets

Definition 3.1

Let \(S \subset {\mathbb {R}}^n\) be a nonempty set and \(F : S \longrightarrow {\mathbb {R}}\) be a function. We say that the function F is translation invariant, if \(F(x) = F(y)\) implies \(F(x+t)=F(y+t)\), for any vectors \(x,y,t \in {\mathbb {R}}^n\) such that \(x ,y , x+t ,y+t \in S\).

Sometimes it is more convenient to use another condition which will be referred to as local translation invariance.

Definition 3.2

Let \(S \subset {\mathbb {R}}^n\) be a nonempty set and \(F : S \longrightarrow {\mathbb {R}}\) be a function. Suppose that for any \(x,y \in S\) and for any \(0 < r \in {\mathbb {R}}\) such that \(B(x,r) \subset S\) and \(B(y,r) \subset S\) the following assertion holds: if \(F(x) = F(y)\) then \(F(x+h) = F(y+h)\) is fulfilled for any \(h \in B(0,r)\). Then the function F is said to be locally translation invariant.

We recall a notation which will be used later in many of our proofs: let x be an arbitrary real number, then

$$\begin{aligned} \lfloor x \rfloor&= \max \lbrace k \in {\mathbb {Z}}\ \vert \ k \le x \rbrace \quad \text{(the floor of } x \text{), and} \\ \lceil x \rceil&= \min \lbrace k \in {\mathbb {Z}}\ \vert \ x \le k \rbrace \quad \text{(the ceiling of } x \text{)}. \end{aligned}$$

Proposition 3.3

Let \(\emptyset \ne K \subset {\mathbb {R}}^n\) be an open, convex set and let \(F : K \longrightarrow {\mathbb {R}}\) be locally translation invariant. Then F is translation invariant.

Proof

Let \(x,y \in K\) and \(0 \ne t \in {\mathbb {R}}^n\) be arbitrary vectors such that \(x+t,y+t \in K\) and \(F(x) = F(y)\). Then the convexity of K provides \(\mathrm{s}(x, x+t) \subset K\) and \(\mathrm{s}(y, y+t) \subset K\). These line segments are compact while \({\mathbb {R}}^n\setminus K\) is closed, thus, due to Proposition 2.5, there exists \(r > 0\) such that

$$\begin{aligned} T_x := \bigcup _{z \, \in \, s(x, x+t)} B(z, r) \subset K \quad \text{ and } \quad T_y := \bigcup _{z \, \in \, s(y, y+t)} B(z, r) \subset K \end{aligned}$$

hold. Now let us define the positive integer N and vector h as

$$\begin{aligned} N := \left\lceil \frac{2 \Vert t \Vert }{r} \right\rceil \quad \text{ and } \quad h := \frac{1}{N} \, t . \end{aligned}$$

Hence we get that

$$\begin{aligned} \Vert h \Vert = \left\Vert\frac{1}{N} t \right\Vert= \frac{1}{N} \Vert t \Vert \le \frac{1}{\frac{2 \Vert t \Vert }{r}} \Vert t \Vert = \frac{r}{2} < r. \end{aligned}$$

Now the local translation invariance implies \(F(x + h) = F(y+h)\). We shall repeat the translation with h (and use the local translation invariance) N times for the appropriate starting points \(x+kh\), \(y+kh\) (\(k= 1, \dots ,N-1\)). Then we obtain \(F(x + t) = F(x + N h) = F(y+Nh) = F(y+t)\). Thus we have proven the translation invariance of F. \(\square \)
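The step construction above is purely arithmetic and can be sanity-checked numerically; the vector t and the radius r below are arbitrary sample values.

```python
import math

# Sketch of the step construction in Proposition 3.3 (sample values):
# with N = ceil(2*||t|| / r) and h = t / N, each step has length
# ||h|| <= r/2 < r, and N steps of h reproduce the translation t.
t = (3.0, -4.0)  # ||t|| = 5
r = 0.7
norm = math.hypot(*t)
N = math.ceil(2 * norm / r)
h = tuple(ti / N for ti in t)
assert math.hypot(*h) <= r / 2 < r
assert all(math.isclose(N * hi, ti) for hi, ti in zip(h, t))
print(f"N = {N} steps of length {math.hypot(*h):.4f} < r = {r}")
```

The factor 2 in the definition of N is what guarantees the strict inequality \(\Vert h \Vert < r\) even in the boundary case \(2 \Vert t \Vert / r \in {\mathbb {Z}}\).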

The next lemma will be crucial for our later investigations.

Lemma 3.4

Let \(\emptyset \ne S \subset {\mathbb {R}}^n\) and let \(F : S \longrightarrow {\mathbb {R}}\) be a continuous, translation invariant function. Furthermore, let \(x,y \in S\) such that \(F(x) = F(y) = \alpha \), and suppose \(\mathrm{s}(x,y) \subset S\). Then \(F(p) = \alpha \) for every \(p \in \mathrm{s}(x,y)\).

Proof

We may assume \(x \ne y\), otherwise the statement is trivial. In the first step we show that for all \(0 < r \in {\mathbb {R}}\) there exist \(u,v \in \mathrm{s}(x,y)\) such that \(F(u) = F(v)\), \(u \in \mathrm{s}(x,v)\) and \(0< \Vert v - u \Vert < r\). Let us introduce two notations:

$$\begin{aligned} T := s(x,y) \quad \text{ and } \quad e:= \frac{y-x}{\Vert y-x \Vert }. \end{aligned}$$

Since the line segment T is compact, the continuous function F attains its extrema on T. Moreover, as \(F(x) = F(y)\), there exists \(m \in \mathrm{int} \, \mathrm{s}(x,y)\) such that \(F(m) = \max \{F(z) \ \vert \ z \in T \}\) or \(F(m) = \min \{F(z) \ \vert \ z \in T \}\). We investigate only the first case, the other one can be handled analogously. Now let us choose \(\varepsilon \in \, ]0 , r[\) such that

$$\begin{aligned} S_{-} := \mathrm{s}\left( m - \frac{\varepsilon }{2} e, m\right) \subset T \quad \text{ and } \quad S_{+} := \mathrm{s}\left( m, m + \frac{\varepsilon }{2} e\right) \subset T. \end{aligned}$$

Then \(I_{-} := F(S_{-})\) and \(I_{+} := F(S_{+})\) are closed intervals in \({\mathbb {R}}\) as they are images of compact, connected line segments under the continuous function F. Observe that if \(I_{-} = \{F(m)\}\) then \(u = m - \frac{\varepsilon }{2} e \, , v = m\) is an appropriate pair of vectors. Similarly, if \(I_{+} = \{F(m)\}\) then we may choose \(u = m \, , v = m + \frac{\varepsilon }{2} e\). Finally, if \(I_{-} = [a_{-} \, ,F(m)]\) and \(I_{+} = [a_{+} \, ,F(m)]\) for some real numbers \(a_{-} < F(m)\), \(a_{+} < F(m)\), then there exist \(u \in S_{-} , v \in S_{+}\) such that

$$\begin{aligned} F(u) = \max \{ a_{-} \, , a_{+} \} = F(v). \end{aligned}$$

Obviously, \(0< \Vert u-v \Vert \le \varepsilon < r\) and \(u \in \mathrm{s}(x,v)\), so we have obtained an adequate pair of points.

In the second part of the proof we show that we are able to give a partition of the segment consisting of finitely many sub-segments such that the length of each new segment is at most r, and at each endpoint of these segments F has value \(\alpha \). For this purpose our first observation is that using the translation invariance for the previously obtained points u and v and for the translating vector \(x-u\), we get

$$\begin{aligned} F(x + v-u) = F(v + x-u) = F(u + x-u) = F(x) = \alpha . \end{aligned}$$

Now let \(N := \displaystyle \left\lfloor \, \frac{\Vert y-x \Vert }{\Vert v-u \Vert } \, \right\rfloor \) and let us define the points

$$\begin{aligned} z_k := x + k(v-u) \qquad (k= 0,1,\dots , N) ~~ \text{ and } ~~ z_{N+1} := y. \end{aligned}$$

Since \(\Vert v - u \Vert < r\), it is clear that \(\Vert z_{k-1} - z_k \Vert < r\) for all \(k = 1, \dots , N\). Furthermore, from the definition of N it is easy to see that \(\Vert z_N - y \Vert < r\) also holds. On the other hand, we have already shown \(F(z_0) = F(z_1) = \alpha \). Due to the translation invariance, for any \(k \in \{ 1, \dots ,N-1\}\), if \(F(z_{k-1}) = F(z_k) = \alpha \) then

$$\begin{aligned} F(z_{k+1}) = F(z_k + v-u) = F(z_{k-1} + v-u) = F(z_k) = \alpha . \end{aligned}$$

Therefore by induction we get that \(F(z_k) = \alpha \) for all \(k = 0,1, \dots ,N+1\). That is, \(z_0, z_1, \dots , z_{N+1}\) gives the desired partition. Observe that if \(p \in T\) is arbitrary then there exists \(j \in \{0, 1, \dots ,N \}\) such that \(\Vert z_j - p \Vert <r\). Indeed, if \(j = \displaystyle \left\lfloor \, \frac{\Vert p-x \Vert }{\Vert v-u \Vert } \, \right\rfloor \) then \(p \in \mathrm{s}(z_j, z_{j+1})\).

For the final step of the proof let \(p \in T\) be arbitrary. Applying the previous construction for \(r = \frac{1}{n}\) (for all \(n \in {\mathbb {N}}\)), we get a sequence of points \((x_n) : {\mathbb {N}}\longrightarrow T\) such that \(F(x_n) = \alpha \) and \(\Vert x_n - p \Vert < \frac{1}{n}\). Now \(x_n \rightarrow p\) so, due to the continuity of F, we have

$$\begin{aligned} F(p) = F(\lim _{n \rightarrow \infty } x_n) = \lim _{n \rightarrow \infty } F(x_n) = \lim _{n \rightarrow \infty } \alpha = \alpha . \end{aligned}$$

Since p was an arbitrary point of the segment \(\mathrm{s}(x,y)\), we have verified our statement. \(\square \)
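The partition built in the second part of the proof is easy to reproduce numerically; the segment, the pair u, v and the bound r below are illustrative choices on the real line.

```python
import math

# Sketch of the partition in the proof of Lemma 3.4 (sample data):
# z_k = x + k(v - u) with N = floor(||y - x|| / ||v - u||); every
# consecutive pair of partition points is closer than r.
x, y = 0.0, 10.0  # a segment on the real line, for simplicity
r = 0.8
u, v = 3.0, 3.5   # a pair with 0 < ||v - u|| < r
step = v - u
N = math.floor(abs(y - x) / abs(step))
z = [x + k * step for k in range(N + 1)] + [y]   # z_0, ..., z_N, z_{N+1} = y
assert all(abs(b - a) < r for a, b in zip(z, z[1:]))
print(f"partition with N + 2 = {len(z)} points, mesh < r = {r}")
```

The last gap is below r because \(\Vert z_N - y \Vert = \Vert y-x \Vert - N \Vert v-u \Vert < \Vert v-u \Vert < r\), exactly as the choice of N guarantees.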

The following corollary generalizes this lemma.

Corollary 3.5

Let \(\emptyset \ne S \subset {\mathbb {R}}^n\) and \(F : S \longrightarrow {\mathbb {R}}\) be a continuous, translation invariant function. Furthermore, let \(x,y \in S\) such that \(\mathrm{s}(x,y) \subset S\). Under these assumptions it holds that if \(u, v \in \mathrm{s}(x,y)\) (\(u \ne v\)) such that \(F(u) = F(v) = \alpha \), then \(F(p) = \alpha \) for all \(p \in \mathrm{s}(x,y)\).

Proof

Let \(p \in \mathrm{s}(x,y)\) be arbitrary. The point p can be written as an affine combination of u and v:

$$\begin{aligned} p = (1-\lambda ) u + \lambda v. \end{aligned}$$

Without loss of generality, we may suppose \(\lambda \ge 0\) (by exchanging the role of u and v, if necessary). Let

$$\begin{aligned} k := \lfloor \lambda \rfloor \quad \text{ and } \quad \mu := \lambda - k, \end{aligned}$$

hence \(\mu \in [0,1[\) is fulfilled. Let us also define some particular points of the segment \(\mathrm{s}(x,y)\):

$$\begin{aligned} z_j := u + j(v-u) \qquad (j = 0 \, , 1, \dots , k ) \quad \text{ and } \quad w := (1- \mu ) u + \mu v. \end{aligned}$$

This definition ensures that the introduced points are indeed on the segment \(\mathrm{s}(x,y) \). In fact, one can easily see that they are elements of \(\mathrm{s}(u,p)\); in particular, \(p = w + k(v-u)\). According to the assumptions of the statement \( F(z_0) = F(z_1) = \alpha \) holds, therefore by induction we may claim that \(F(z_j) = \alpha \) is fulfilled for all \(j = 0 \, , 1, \dots , k\). Indeed, the translation invariance of F means that if \(F(z_{j-1}) = F(z_{j}) = \alpha \), then

$$\begin{aligned} F(z_{j+1}) = F(z_{j} + (v-u)) = F(z_{j-1} + (v-u)) = F(z_{j}) = \alpha \end{aligned}$$

for all \(j = 1, \dots , k-1\). Furthermore w and \(\mathrm{s}(u,v)\) satisfy the assumptions of Lemma 3.4, thus \(F(w) = \alpha \). This means that, by applying translation invariance once again, we get

$$\begin{aligned} F(p) = F (w + k(v-u)) = F (u + k(v-u)) = F (z_k) = \alpha , \end{aligned}$$

which had to be proven. \(\square \)
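The decomposition \(\lambda = k + \mu \) used above can be verified directly; the numbers below are an arbitrary example on the real line.

```python
import math

# Sketch of the decomposition in Corollary 3.5 (sample values):
# writing lambda = k + mu with k = floor(lambda) and mu in [0, 1),
# the point p = (1 - lambda)u + lambda*v equals w + k(v - u),
# where w = (1 - mu)u + mu*v lies on s(u, v).
u, v = 1.0, 4.0
lam = 2.7
k, mu = math.floor(lam), lam - math.floor(lam)
p = (1 - lam) * u + lam * v
w = (1 - mu) * u + mu * v
assert 0 <= mu < 1
assert math.isclose(p, w + k * (v - u))
print("p = w + k(v - u) confirmed")
```

This is precisely why F needs to be evaluated only at w (via Lemma 3.4) and then translated k times by \(v-u\).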

Remark 3.6

If the domain of definition of F is convex, then the two previous propositions can be summarized as follows:

Let \(K \subset {\mathbb {R}}^n\) be convex, \(F : K \longrightarrow {\mathbb {R}}\) be a continuous, translation invariant function. Suppose \(x,y \in K\) and suppose that there exist \(u, v \in \mathrm{s}(x,y)\) (\(u \ne v\)) such that \(F(u) = F(v) = \alpha \). Then \(F(p) = \alpha \) holds for all points \(p \in \mathrm{s}(x,y)\).

Finally, we will formalize a theorem that is a further generalization of the previous results. Namely, instead of two points, we will consider the affine hull of finitely many points having the same function value. This time we assume local translation invariance which will be more convenient for later applications of the statement.

Theorem 3.7

Let \(\emptyset \ne K \subset {\mathbb {R}}^n\) be open and convex, \(F : K \longrightarrow {\mathbb {R}}\) be a continuous, locally translation invariant function, and let \(x_1, x_2, \dots , x_k \in K\). It holds that if \(F(x_1) = F(x_2) = \dots = F(x_k) = \alpha \), then \(F(p) = \alpha \) is fulfilled for every \(p \in \mathrm{Aff}(x_1, x_2, \dots , x_k) \cap K\).

Proof

Since K is open and convex, Proposition 3.3 implies that F is translation invariant.

In the case of \(k = 1\) we have nothing to prove, so from now on we assume \(k \ge 2\). Let us observe that \(\mathrm{conv}(x_1, \dots , x_k) \subset K\), since K is convex. Firstly, we will show that the value of F is \(\alpha \) in all points of the convex hull.

Under the assumptions of the theorem, we can apply Lemma 3.4 which ensures that if \(p \in \mathrm{s}(x_i, x_j)\) for some \(i,j \in \{1, \dots , k \}\), then \(F(p) = \alpha \) holds. Now let us consider an arbitrary point of \(\mathrm{conv}(x_1, \dots , x_k) \subset K\), in the form of a convex combination of \(x_1, \dots , x_k\). If the number of non-zero coefficients in this convex combination is at most two, then the function value of this point is \(\alpha \), according to the cited lemma.

Let \(r \ge 3\), and let us assume that the value of F is \(\alpha \) at every point of \(\mathrm{conv}(x_1, \dots , x_k) \subset K\) which can be expressed as a convex combination of at most \(r-1\) vectors from the system \(x_1, \dots , x_k\). Now let \(p = \lambda _1 x_{i_1} + \dots + \lambda _r x_{i_r}\) for some \(\{i_1, \dots , i_r \} \subset \{1, \dots , k \}\) and \(\lambda _1, \dots , \lambda _r \in [0,1]\) with \(\lambda _1 + \dots + \lambda _r = 1\). We may suppose that none of the coefficients is 0 or 1, since otherwise p is a convex combination of at most \(r-1\) of the vectors and the induction hypothesis completes the proof. Under this assumption we can do the following calculations:

$$\begin{aligned} p = (1 - \lambda _r) \left( \frac{\lambda _1}{1-\lambda _r} x_{i_1} + \dots + \frac{\lambda _{r-1}}{1-\lambda _r} x_{i_{r-1}} \right) + \lambda _r x_{i_r} = (1 - \lambda _r){\tilde{p}} + \lambda _r x_{i_r}, \end{aligned}$$

and here for the point

$$\begin{aligned} {\tilde{p}} := \frac{\lambda _1}{1-\lambda _r} x_{i_1} + \dots + \frac{\lambda _{r-1}}{1-\lambda _r} x_{i_{r-1}} \end{aligned}$$

we have \(F({\tilde{p}}) = \alpha \), due to the inductive hypothesis.

As \(p \in \mathrm{s}({\tilde{p}}, x_{i_r})\), applying Lemma 3.4 we get \(F(p) = \alpha \), which means that the function F is constant on the whole \(\mathrm{conv}(x_1, \dots , x_k)\). In particular, we have obtained \(F(c) = \alpha \), where

$$\begin{aligned} c := \frac{1}{k} x_1 + \dots + \frac{1}{k} x_k \, . \end{aligned}$$

Finally, let us choose an arbitrary \(q \in \mathrm{Aff}(x_1, \dots , x_k) \cap K\) such that \(q \ne c\). We will show that \(\mathrm{int} \, \mathrm{s}(c,q) \cap \mathrm{conv}(x_1, \dots , x_k) \ne \emptyset \). Let us assume

$$\begin{aligned} q = \mu _1 x_1 + \dots + \mu _k x_k \, \end{aligned}$$

with some \(\mu _1, \dots , \mu _k \in {\mathbb {R}}\) such that \(\mu _1 + \dots + \mu _k = 1\), and define

$$\begin{aligned} M := \max \left\{ \, \vert \mu _j - \frac{1}{k} \vert \ \bigg \vert \ j = 1, \dots , k \, \right\} ~~~ \text{ and } ~~~ \varepsilon := \frac{1}{k M} \, . \end{aligned}$$

Now \(c \ne q\) grants that \(M > 0\) and therefore \(\varepsilon > 0\,\), too. Hence for all \(j = 1, \dots , k \) we have

$$\begin{aligned} -M \le \mu _j - \frac{1}{k} \le M \ \text{ thus } \ -\frac{1}{k} \le \varepsilon \left( \mu _j - \frac{1}{k} \right) \le \frac{1}{k} \ \Longrightarrow \ 0 \le \frac{1}{k} + \varepsilon \left( \mu _j - \frac{1}{k} \right) \le \frac{2}{k} \le 1. \end{aligned}$$

This means that the point \( v = c + \varepsilon (q-c) \, \in \mathrm{int} \, \mathrm{s}(c,q) \) is an element of the convex hull \(\mathrm{conv}(x_1, \dots , x_k)\). Indeed,

$$\begin{aligned} v&= c + \varepsilon (q-c) \\&= \frac{1}{k} x_1 + \dots + \frac{1}{k} x_k + \varepsilon \left( \mu _1 x_1 + \dots + \mu _k x_k - \frac{1}{k} x_1 - \dots - \frac{1}{k} x_k \right) \\&= \left( \frac{1}{k} + \varepsilon \left( \mu _1 - \frac{1}{k}\right) \right) x_1 + \dots + \left( \frac{1}{k} + \varepsilon \left( \mu _k - \frac{1}{k}\right) \right) x_k, \end{aligned}$$

and here the affine coordinates of v are from the interval [0, 1], according to the calculation above. Thus v is indeed in the convex hull. Therefore \(F(v) = \alpha \), so applying Corollary 3.5 to the segment \(\mathrm{s}(c,q)\), which contains both c and v, we also get \(F(q) = \alpha \), which was to be verified. \(\square \)
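The coefficient estimate at the end of the proof can be checked numerically; the affine coordinates \(\mu _j\) below are an arbitrary example with sum 1.

```python
# Sketch of the final step of Theorem 3.7 (sample coordinates):
# with M = max_j |mu_j - 1/k| and eps = 1/(k*M), the coefficients
# 1/k + eps*(mu_j - 1/k) of v = c + eps*(q - c) lie in [0, 1].
mu = (0.6, -0.3, 0.9, -0.2)  # affine coordinates of q, sum = 1
k = len(mu)
assert abs(sum(mu) - 1.0) < 1e-9
M = max(abs(m - 1 / k) for m in mu)
eps = 1 / (k * M)
coeffs = [1 / k + eps * (m - 1 / k) for m in mu]
assert 2 / k <= 1
assert all(-1e-9 <= c <= 2 / k + 1e-9 for c in coeffs)
assert abs(sum(coeffs) - 1.0) < 1e-9   # v is a convex combination
print("affine coordinates of v:", [round(c, 4) for c in coeffs])
```

Note that the coefficients automatically sum to 1, since the correction terms \(\varepsilon (\mu _j - \frac{1}{k})\) sum to \(\varepsilon (1 - 1) = 0\).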

3.2 Decomposition of translation invariant functions

In the previous subsection we showed that if a continuous, (locally) translation invariant function F, defined on an open, convex set, takes the same value in some points of its domain, then F is constant on the affine subspace spanned by these points. However, the dimension of that subspace will be critical for us later. We are going to present some results concerning the dimension of such subspaces. In order to proceed to that question, we need to prove a general existence theorem for continuous functions.

Theorem 3.8

Let \(p \in {\mathbb {R}}^n\) and \( 0 < \varepsilon \in {\mathbb {R}}\) be arbitrary, and let \(F : B(p, \varepsilon ) \longrightarrow {\mathbb {R}}\) be a continuous function. Then there exist \(x_1, \dots ,x_n \in B(p, \varepsilon )\) such that \(x_1, \dots ,x_n\) are affinely independent and

$$\begin{aligned} F(x_1) = \dots = F(x_n). \end{aligned}$$

Proof

We will prove the statement by induction on the dimension of the domain of definition. The one-dimensional case is trivial. From now on, let us assume that \(n \ge 1\) and that the theorem holds in \({\mathbb {R}}^n\); we will prove it in \({\mathbb {R}}^{n+1}\).

Therefore let us fix an arbitrary point \(p = (p_1, \dots ,p_n, p_{n+1} ) \in {\mathbb {R}}^{n+1}\) and a radius \(\varepsilon > 0\). We may introduce the following notations:

$$\begin{aligned} B := B(p, \varepsilon ) \quad \text{ and } \quad {\widetilde{B}} := \lbrace (x_1, \dots , x_n) \in {\mathbb {R}}^n\ \vert \ (x_1, \dots , x_n, p_{n+1}) \in B \rbrace . \end{aligned}$$

Furthermore, let us define the function \({\widetilde{F}}\) as follows:

$$\begin{aligned} {\widetilde{F}} : {\widetilde{B}} \longrightarrow {\mathbb {R}}, \qquad {\widetilde{F}}(x_1, \dots , x_n) := F(x_1, \dots , x_n, p_{n+1}). \end{aligned}$$

One can easily check that actually

$$\begin{aligned} {\widetilde{B}} = B({\tilde{p}}, \varepsilon ) \subset {\mathbb {R}}^n, \quad \text{ where } \quad {\tilde{p}} = (p_1, \dots , p_n), \end{aligned}$$

holds, while the continuity of F implies that \({\widetilde{F}}\) is also a continuous function. Hence we can apply the inductive hypothesis for the domain \({\widetilde{B}}\) and the function \({\widetilde{F}}\). Namely, there exist affinely independent points \( {\tilde{y}}_1, \dots , {\tilde{y}}_n \in {\widetilde{B}} \subset {\mathbb {R}}^n\) such that

$$\begin{aligned} {\widetilde{F}}({\tilde{y}}_1) = \dots = {\widetilde{F}}({\tilde{y}}_n). \end{aligned}$$

The (\(n-1\) dimensional) hyperplane in \({\mathbb {R}}^n\) passing through these points will be denoted by \({\widetilde{A}}\). Now let us introduce

$$\begin{aligned} y_j&:= ({\tilde{y}}_j , p_{n+1}) \in {\mathbb {R}}^{n+1} \qquad (j = 1, \dots , n), \quad \text{ moreover } \\ U&:= \lbrace (x, p_{n+1}) \in {\mathbb {R}}^{n+1} \ \vert \ x \in {\widetilde{B}} \rbrace \ \text{ and } \ A := \lbrace (x, p_{n+1}) \in {\mathbb {R}}^{n+1} \ \vert \ x \in {\widetilde{A}} \rbrace . \end{aligned}$$

Obviously, the points \(y_1, \dots , y_n \in U\) are affinely independent in \({\mathbb {R}}^{n+1}\) and the affine subspace generated by them is A. Moreover,

$$\begin{aligned} F(y_1) = \dots = F(y_n) \end{aligned}$$

follows from the definition of \({\widetilde{F}}\). Let us denote this common function value by \(\alpha \).

Now if there exists \(z \in B \setminus A \) such that \(F(z) = \alpha \), then the proof is finished. Indeed, according to Lemma 2.4, the system \(z, y_1, \dots , y_n\) is affinely independent, consisting of \(n+1\) points which all have the same value of F.

Therefore, in the next step, we consider the case when there is no point in \(B \setminus A\) with value \(\alpha \). Then we may assume that there exists \( z \in U \setminus A \) such that \(F(z) > \alpha \) (otherwise, if the value of every point in \(U \setminus A\) is less than \(\alpha \), then we should continue the proof using the function \(-F\) and replacing \(\alpha \) by \(- \alpha \)).

For the remaining part of the proof let us fix such a point \(z \in U \setminus A\). Now if there exists \(w \in B \setminus U\) such that \(F(w) < \alpha \), then we shall argue as follows: as B is convex, it holds that \(\mathrm{s}(w,z) \subset B\). Furthermore, as the segment \(\mathrm{s}(w,z)\) is connected and F is continuous, we get that \(F(\mathrm{s}(w,z))\) is an interval. Therefore there has to exist some \(v \in \mathrm{int} \, \mathrm{s}(w, z)\) such that \(F(v) = \alpha \).

Now \(z \in U\) and \(w \notin U\) entails that \(v \notin U\). Indeed, \(v \in U\) would imply \(w \in \mathrm{Aff}\, U\) which – considering that \(\mathrm{Aff}\, U \cap B = U \cap B\) – contradicts the assumption \(w \in B \setminus U\). So for the vector v we have obtained \(F(v) = \alpha \,\), while on the other hand \(v \notin U\) is fulfilled which also implies \(v \notin A\). But this is a contradiction as we have assumed earlier that no such vector exists.

Therefore the only possible case is when \(F(w) > \alpha \) holds for every \(w \in B \setminus U \). Using again the continuity of F in a similar manner, we can get that there exists a vector \(v \in B \setminus U\) such that \( \alpha< F(v) < F(z)\). With further basic arguments using the continuity, we can conclude the existence of points

$$\begin{aligned} x_j \in \mathrm{int} \, \mathrm{s}(z, y_j) \quad \text{ such } \text{ that } \quad F(x_j) = F(v) \qquad (j = 1, \dots ,n). \end{aligned}$$

This means that, for every index \(j = 1, \dots ,n\), there exists a real number \(\lambda _j \, \in ]0,1[\) so that

$$\begin{aligned} x_j = z + \lambda _j (y_j - z). \end{aligned}$$

In order to complete the proof we shall use Lemma 2.4: the points \(x_1, \dots , x_n\) are affinely independent – as z was fixed in the complement of A – and for a similar reason (namely, \(v \in B \setminus U\)) the points \(v, x_1, \dots , x_n\) are an affinely independent system, as well. Due to our construction, all of these \(n+1\) points have function value F(v). \(\square \)

Combining this theorem with the results of the previous subsection we can verify an important property of locally translation invariant functions.

Corollary 3.9

Let \(p \in {\mathbb {R}}^n\) and \( 0 < \varepsilon \in {\mathbb {R}}\) be arbitrary, and \(F : B(p, \varepsilon ) \longrightarrow {\mathbb {R}}\) be continuous, locally translation invariant. Then there exists \(0 \ne a \in {\mathbb {R}}^n\) such that

$$\begin{aligned} F \vert _{H(p,a) \cap B(p,\varepsilon )} \text{ is } \text{ constant }. \end{aligned}$$

Proof

Observe that \(B(p, \varepsilon )\) is a convex set, hence F is in fact translation invariant. According to Theorem 3.8 there exists a system of affinely independent points \(x_1, \dots , x_n \in B(p, \frac{\varepsilon }{2})\) such that \(F(x_1)= \dots = F(x_n)\). Let us denote the hyperplane \(\mathrm{Aff}(x_1, \dots , x_n)\) by H and let a denote a normal vector of H. Using Theorem 3.7 we get

$$\begin{aligned} F \vert _{H \cap B(p,\varepsilon )} \equiv F(x_1). \end{aligned}$$

If \(p \in H\) then \(H = H(p,a)\) and \(F(p) = F(x_1)\), so the proof is finished. Otherwise, if \(p \notin H\) holds then let us consider the following points:

$$\begin{aligned} y_j := x_j + (p - x_1) \qquad (j=1, \dots , n ). \end{aligned}$$

We shall observe that

$$\begin{aligned} \Vert y_j - p \Vert = \Vert x_j + p - x_1 - p \Vert \le \Vert x_j - p \Vert + \Vert x_1 - p \Vert < \frac{\varepsilon }{2} + \frac{\varepsilon }{2} = \varepsilon \, , \end{aligned}$$

thus \(y_j \in B(p, \varepsilon )\). On the other hand \(y_1 = p\) holds. Hence, from the translation invariance, the equalities \(F(y_j) = F(p)\) follow for all indices \(j = 1, \dots , n\). It is also clear that the system \(y_1, \dots , y_n\) is affinely independent, while the affine hull of these points is \(H + (p - x_1) = H(p, a)\). Therefore we may apply Theorem 3.7 once again and obtain

$$\begin{aligned} F \vert _{H(p,a) \cap B(p,\varepsilon )} \equiv F(p), \end{aligned}$$

which had to be proven. \(\square \)

The meaning of the previous statement is that, for every interior point p in the n dimensional domain of definition of a continuous, locally translation invariant function, there exists an \(n-1\) dimensional open disc with center p such that the function is constant on this disc. We immediately note that, of course, these are not open balls in the standard topology of \({\mathbb {R}}^n\): they are intersections of proper open balls and appropriate (\(n-1\) dimensional) hyperplanes. However, for the sake of simplicity, we will continue to use the term \(n-1\) dimensional disc when it is not confusing.

We would like to show that under certain circumstances we can claim more than this local property. Namely, that a continuous, locally translation invariant function is globally constant on parallel hyperplanes, if the domain of definition is open and connected. In order to prove such a statement, we will need the following technical lemma.

Lemma 3.10

Let n be a positive integer and consider some nonempty, closed intervals \(I_1 = [a_1, b_1] \subset {\mathbb {R}}\, , \dots \, , I_n = [a_n, b_n] \subset {\mathbb {R}}\) such that \(I_i\) and \(I_{i+1}\) have a common endpoint, i.e.

$$\begin{aligned} \lbrace a_i \, , b_i \rbrace \cap \lbrace a_{i+1} \, , b_{i+1} \rbrace \, \ne \emptyset \qquad ( i = 1, \dots , n-1). \end{aligned}$$

Let us also assume that \(I_1\) and \(I_n\) have a common endpoint as well, i.e. \(\lbrace a_1 \, , b_1 \rbrace \cap \lbrace a_n \, , b_n \rbrace \, \ne \emptyset \).

Moreover let X be an arbitrary set, and for every \(k = 1 ,\dots , n\) consider \(f_k : I_k \longrightarrow X\) such that for any index \(i = 1, \dots , n-1\) the functions \(f_i\) and \(f_{i+1}\) have the same function value in at least one of the common endpoints of the intervals \(I_i\) and \(I_{i+1}\), i.e.

$$\begin{aligned} \exists v \in \lbrace a_i \,,\, b_i \rbrace \cap \lbrace a_{i+1} \,,\, b_{i+1} \rbrace \, : \, f_i(v) = f_{i+1} (v) \qquad (i = 1, \dots , n-1). \end{aligned}$$

Furthermore let us suppose that for all indices \(i,j = 1 ,\dots , n\) (\(i \ne j\)) the following holds: if \(I_i \cap I_j \ne \emptyset \) and \(f_i(x) = f_j(x)\) for some element \(x \in I_i \cap I_j\), then \( f_i \vert _{I_i \cap I_j} = f_j \vert _{I_i \cap I_j} \).

Under these assumptions we may claim that if w is a common endpoint of \(I_1\) and \(I_n\) then \(f_1(w) = f_n (w)\).

Remark 3.11

We shall note that in this lemma we did not exclude singleton intervals. This fact will be important later in some proofs.

Proof

In the first place we verify an auxiliary statement from which we can easily deduce the implication of the lemma. The statement is the following: for arbitrary indices \(i \ne j\) it holds that if \(I_i \cap I_j \ne \emptyset \) then there exists \(p \in I_i \cap I_j\) such that \(f_i(p) = f_j(p)\). Therefore let i and j be fixed indices and let us introduce the notation \(m = \vert i -j \vert \).

We will prove this statement by induction on m. The case of \(m = 1\) follows directly from the assumptions of the lemma: \(I_i\) and \(I_{i+1}\) have a common endpoint where the corresponding function values are equal (\(i = 1, \dots , n-1\)).

Now let \(m \ge 2\) and suppose that the statement holds for smaller natural numbers. Let us consider the intervals \(I_i\) and \(I_j\) where \(j = i+m\) and assume \(M = I_i \cap I_j \ne \emptyset \). We can show that there exists an index k such that \(i< k < j\) and \(M \cap I_k \ne \emptyset \).

We will consider four cases, depending on the order of the endpoints of \(I_i\) and \(I_j\). We shall note that these cases are not mutually exclusive, although together they cover all possibilities.

  1.

    \(a_i \le a_j \le b_j \le b_i\). In this case \(M = [a_j, b_j]\), therefore \(I_{j-1} \cap M \ne \emptyset \) follows from the assumptions of the lemma, as (at least) one endpoint of \(I_{j-1}\) is contained in \(I_j \, \), hence in \(M \,\) as well.

  2.

    \(a_j \le a_i \le b_i \le b_j\). Then \(M = [a_i, b_i]\) and therefore \(I_{i+1} \cap M \ne \emptyset \,\), for analogous reasons as before.

  3.

    \(a_i< a_j \le b_i < b_j\). In this case \(M = [a_j, b_i]\). If \(b_i\) is a common endpoint of \(I_{i+1}\) and \(I_i \, \), then \(k = i+1\) is a suitable choice. Similarly, if \(a_j\) is a common endpoint of \(I_{j-1}\) and \(I_j \,\), then \(k = j-1\) is appropriate. If neither of these are fulfilled then we have \(a_i \in I_{i+1}\) and \(b_j \in I_{j-1}\). Clearly, this means

    $$\begin{aligned} a_i, \, b_j \in \bigcup _{l=i+1}^{j-1} I_l \, . \end{aligned}$$

    Let us observe that if two nonempty, closed intervals of \({\mathbb {R}}\) have a common point, then their union is also a closed interval. Because of that, the set \(\bigcup _{l=i+1}^{j-1} I_l\) is a closed interval as well. Therefore \(a_j \in \bigcup _{l=i+1}^{j-1} I_l\) holds as \(a_i< a_j < b_j\) was supposed. This means that there exists a required index \(i< k <j\) such that \(I_k \cap M \ne \emptyset \).

  4.

    \(a_j< a_i \le b_j < b_i\). This means \(M = [a_i, b_j]\). We may use an argument analogous to the previous case: if \(a_i\) is a common endpoint of \(I_{i+1}\) and \(I_i \, \) , then choose \(k = i+1\). Similarly, if \(b_j\) is a common endpoint of \(I_{j-1}\) and \(I_j \,\) , then \(k = j-1\) is appropriate. Otherwise \(a_j \in I_{j-1}\) and \(b_i \in I_{i+1}\), thus

    $$\begin{aligned} a_j, \, b_i \in \bigcup _{l=i+1}^{j-1} I_l \, \text{, } \text{ which } \text{ is } \text{ a } \text{ closed } \text{ real } \text{ interval. } \end{aligned}$$

    Hence \(a_i \in \bigcup _{l=i+1}^{j-1} I_l\) so there exists \(i< k < j\) such that \(a_i \in I_k \), consequently \(I_k \cap M \ne \emptyset \).

Therefore we have investigated every possibility, and we have obtained that a required interval \(I_k\) must exist. Let \(p \in I_k \cap M\). Apply the inductive hypothesis, first to the intervals \(I_i\) and \(I_k\), then to \(I_k\) and \(I_j\): according to the hypothesis there exist elements \(x \in I_i \cap I_k\) and \(y \in I_k \cap I_j\) such that \(f_i(x) = f_k(x)\) and \(f_k(y) = f_j(y)\). The assumptions of the lemma ensure

$$\begin{aligned} f_i \vert _{I_i \cap I_k} = f_k \vert _{I_i \cap I_k} ~~~ \text{ and } ~~~ f_k \vert _{I_k \cap I_j} = f_j \vert _{I_k \cap I_j}, \end{aligned}$$

in particular \(f_i(p) = f_k(p) = f_j(p)\).

Therefore we have verified the auxiliary statement formulated at the beginning of the proof. The implication of the lemma is a straightforward corollary of that: since \(I_1 \cap I_n \ne \emptyset \) was assumed, there exists \(p \in I_1 \cap I_n\) such that \(f_1(p) = f_n(p)\), thus for any common endpoint w (or, in fact, for any point of the intersection \(I_1 \cap I_n\)) the equation

$$\begin{aligned} f_1(w) = f_n(w) \text{ follows } \text{ from } f_1 \vert _{I_1 \cap I_n} = f_n \vert _{I_1 \cap I_n} \, . \end{aligned}$$

\(\square \)
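To make the hypotheses tangible, here is a minimal numeric sketch of Lemma 3.10 (the concrete intervals, the function g and all identifiers are our own choices, not part of the paper): if every \(f_k\) is the restriction of one fixed function g, the compatibility assumptions hold automatically, and the conclusion can be checked at the common endpoint of \(I_1\) and \(I_n\).

```python
# Toy instance of Lemma 3.10 with n = 3; all data below are our own choices.
def g(x):
    return x * x  # one fixed real function; each f_k is its restriction

intervals = [(0.0, 1.0), (1.0, 2.0), (0.0, 2.0)]  # I_1, I_2, I_3
f = [g, g, g]                                     # f_k : I_k -> R

# Adjacent intervals share an endpoint where the corresponding f_k agree.
for (a1, b1), (a2, b2), fi, fj in zip(intervals, intervals[1:], f, f[1:]):
    common = {a1, b1} & {a2, b2}
    assert common and all(fi(v) == fj(v) for v in common)

# Conclusion of the lemma at the common endpoint w of I_1 and I_n:
w = 0.0
assert f[0](w) == f[-1](w)
```

Of course the lemma is interesting precisely when the \(f_k\) are a priori unrelated; the sketch only illustrates the shape of the hypotheses and of the conclusion.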

From now on, for any fixed vector \(a \in {\mathbb {R}}^n\), \(p_a\) will denote the inner product with a (as a linear functional):

$$\begin{aligned} p_a (x) = \langle x \,,\, a \rangle \qquad (x \in {\mathbb {R}}^n). \end{aligned}$$

Proposition 3.12

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be an open, connected set and \(F : D \longrightarrow {\mathbb {R}}\) be a continuous, locally translation invariant function. Then there exists \(0 \ne a \in {\mathbb {R}}^n\) such that \(p_a(x) = p_a(y)\) implies \(F(x) = F(y)\) for all \(x,y \in D\).

Proof

Firstly, let \(x,y \in D\) be two arbitrary, but different points. As D is an open, connected set, there exists a polygonal path in D, joining x and y (see Proposition 2.6). Precisely: there exist \(z_0, z_1, \dots , z_m \in D\) such that

$$\begin{aligned} z_0 = x, \ z_m = y \ \text{ and } \ S_k := \mathrm{s}(z_{k-1}, z_k) \subset D \qquad (k = 1, \dots , m). \end{aligned}$$

The polygonal path \(S= S_1 \cup \dots \cup S_m\) is compact while the set \({\mathbb {R}}^n\setminus D\) is closed. Therefore, according to Proposition 2.5, there exists a real number \(r > 0\) such that

$$\begin{aligned} T := \bigcup _{s \in S} B(s, r) \subset D. \end{aligned}$$

Observe that the open sets

$$\begin{aligned} T_k := \bigcup _{s \in S_k} B(s, r) \qquad (k = 1, \dots , m) \end{aligned}$$

are convex, so the restrictions \(F \vert _{T_k}\) are translation invariant, according to Proposition 3.3. Using Corollary 3.9 we get that there exists \(0 \ne a \in {\mathbb {R}}^n\) such that F is constant over the \(n-1\) dimensional disc \(H(z_0,a) \cap B(z_0,r)\).

We will show that for this vector a the implication of our proposition holds. Firstly, the translation invariance on \(T_1\) implies that, for each \(s \in S_1 \,\), F is constant over the translated disc \(H(s,a) \cap B(s,r)\); in particular, \(F \vert _{H(z_1,a) \cap B(z_1,r)}\) is also constant. By analogous reasoning for \(T_2\), \(T_3\) and so on, we obtain that \(F \vert _{H(z_m,a) \cap B(z_m,r)}\) is constant, that is, \(F \vert _{H(y,a) \cap B(y,r)}\) is constant. Since \(y \in D\) can be chosen arbitrarily, we get the following: for all \(z \in D\) there exists \(\varepsilon > 0\) such that \(F \vert _{H(z,a) \cap B(z,\varepsilon )}\) is constant.

Using Theorem 3.7 we can deduce immediately that, for all \(z \in D\) and \(r>0\) such that \(B(z,r) \subset D\), it holds that

$$\begin{aligned} F \vert _{H(z,a) \cap B(z,r)} \text{ is } \text{ constant. } \end{aligned}$$

From now on assume that \(p_a(x) = p_a(y)\) and consider the previous construction of S and T. For every \(k \in \{ 1, \dots , m \}\) the sets \(I_k = p_a(S_k) \subset {\mathbb {R}}\) are images of the compact, connected line segments \(S_k\) under the continuous function \(p_a\), therefore they are also compact and connected. This means that they are bounded, closed intervals.

Moreover, if \(p_a(z_{k-1}) < p_a(z_k)\) holds for some index \(k \in \{ 1, \dots , m \}\), then by using the bilinearity of the inner product we obtain

$$\begin{aligned} p_a((1-\lambda )z_{k-1} + \lambda z_k ) &= (1-\lambda )p_a(z_{k-1}) + \lambda p_a(z_k) \\ &= p_a(z_{k-1}) + \lambda (p_a(z_k) - p_a(z_{k-1}) ) \end{aligned}$$

where \(\lambda \in [0,1]\). From this calculation it is clear that in this case

$$\begin{aligned} p_a(z_{k-1}) \le p_a(z) \le p_a(z_k), \ \text{ holds } \text{ for } \text{ any } z \in S_k. \end{aligned}$$

For similar reasons, we may claim that if \(p_a(z_{k-1}) > p_a(z_k)\) is fulfilled for an index \(k \in \{ 1, \dots , m \}\) then

$$\begin{aligned} p_a(z_k) \le p_a(z) \le p_a(z_{k-1}), \ \text{ holds } \text{ for } \text{ any } z \in S_k \, , \end{aligned}$$

and if \(p_a(z_{k-1}) = p_a(z_k)\), then

$$\begin{aligned} p_a(z_k) = p_a(z) = p_a(z_{k-1}) \text{ holds } \text{ for } \text{ all } z \in S_k \, . \end{aligned}$$

As a summary of the three cases we get that, for all \(k = 1, \dots , m\), we have

$$\begin{aligned} I_k = \left[ \ \min \{ p_a(z_{k-1}), p_a(z_k) \} \, , \, \max \{ p_a(z_{k-1}), p_a(z_k) \} \ \right] . \end{aligned}$$
(1)

We proceed with some further investigation of these three types of intervals \(I_k\).

  1.

    If \(p_a(z_{k-1}) < p_a(z_k)\) then \( I_k = \left[ p_a(z_{k-1}) , p_a(z_k) \right] \). For such intervals it is easy to see that the function \(p_a \vert _{S_k} : S_k \longrightarrow I_k\) is bijective. Surjectivity is obvious while injectivity can be justified as follows: if \(p_a(s) = p_a(t)\) were true for two different points \( s,t \in S_k \), then this would imply

    $$\begin{aligned} \langle s,a \rangle = \langle t,a \rangle , \text{ and } \text{ therefore } \langle s-t,a \rangle = 0 \Longrightarrow t \in H(s, a). \end{aligned}$$

    Hence the whole line segment \(S_k\) would be contained in the affine subspace \(H(s, a)\), in particular \( p_a(z_{k-1}) = p_a(z_k)\) would hold, but this is a contradiction. So \(p_a \vert _{S_k}\) is indeed a bijection, thus the definition of the function

    $$\begin{aligned} f_k : I_k \longrightarrow {\mathbb {R}}\qquad f_k(c) := F(p_a ^{-1} (c)) \end{aligned}$$

    is correct. Note that for the endpoints of \(I_k\) the following equations hold:

    $$\begin{aligned} f_k(p_a(z_{k-1})) = F(z_{k-1}) \text{ and } f_k(p_a(z_k)) = F(z_k) . \end{aligned}$$
  2.

    If \(p_a(z_{k-1}) > p_a(z_k)\) then \( I_k = \left[ p_a(z_k) , p_a(z_{k-1}) \right] \). By analogous reasoning one can see that \(p_a \vert _{S_k} : S_k \longrightarrow I_k\) is a bijection, so we may define a function \(f_k\) with the same formula

    $$\begin{aligned} f_k : I_k \longrightarrow {\mathbb {R}}\qquad f_k(c) := F(p_a ^{-1} (c)). \end{aligned}$$

    In particular, at the endpoints of \(I_k\) it holds that \(f_k(p_a(z_k)) = F(z_k)\) and \(f_k(p_a(z_{k-1})) = F(z_{k-1}) \).

  3.

    If \(p_a(z_{k-1}) = p_a(z_k)\) then

    $$\begin{aligned} I_k = \left[ \, p_a(z_{k-1}) , p_a(z_{k-1}) \, \right] = \left[ p_a(z_k) , p_a(z_k) \right] = \{ p_a(z_k)\} = \{ p_a(z_{k-1})\}. \end{aligned}$$

    These singleton intervals occur if and only if the segment \(S_k\) is contained in a hyperplane with normal vector a. However, that being so, we may apply Theorem 3.7 to the convex set \(T_k = \bigcup _{s \in S_k} B(s, r)\). As the function F is constant on the \(n-1\) dimensional disc \(H(z_k, a) \cap B(z_k, r) \,\) with value \(F(z_k)\), Theorem 3.7 yields \(F \vert _{S_k} \equiv F(z_k)\). Nevertheless, in the case of these singleton intervals the definition of \(f_k\) remains the same:

    $$\begin{aligned} f_k : I_k \longrightarrow {\mathbb {R}}\qquad f_k \left( p_a(z_k) \right) := F(z_k). \end{aligned}$$

As a summary we may claim that for every index \( k = 1, \dots , m\) the (not necessarily distinct) endpoints of the interval \(I_k\) defined above are exactly the values of \(p_a\) at the endpoints of the segment \(S_k\), in the correct order. Furthermore, the value of \(f_k\) at any endpoint of \(I_k\) is equal to the value of F at the corresponding endpoint of \(S_k \,\).

In the next part of the proof we will show that the intervals \(I_k\) and functions \(f_k\) fulfill the assumptions of Lemma 3.10.

First of all, due to Eq. (1) the intervals \(I_i\) and \(I_{i+1}\) have a common endpoint, namely \(p_a(z_i)\) (\(i= 1, \dots , m-1)\). Furthermore, since \(z_0 = x\) and \(z_m = y\), it holds that \(p_a(z_0) = p_a(z_m)\) which means that \(I_1\) and \(I_m\) have a common endpoint. Moreover it is also clear that at an appropriate common endpoint of two intervals with adjacent indices, the corresponding function values are equal, because the construction above implies

$$\begin{aligned} f_i(p_a(z_i)) = F(z_i) = f_{i+1}(p_a(z_i)) \qquad ( i = 1, \dots , m-1 ). \end{aligned}$$

To check the last assumption of Lemma 3.10, let us consider two intervals

$$\begin{aligned} I_i = p_a( \, \mathrm{s}(z_{i-1}, z_i) \,) \ \text{ and } \ I_j = p_a( \, \mathrm{s}(z_{j-1}, z_j) \,), \end{aligned}$$

where \(i,j \in \{ 1, \dots , m \}\), \(i \ne j\) such that \(f_i(c) = f_j(c)\) is fulfilled for some element \(c \in I_i \cap I_j\).

We shall prove \(f_i \vert _{I_i \cap I_j} = f_j \vert _{I_i \cap I_j}\). If either \(I_i\) or \(I_j\) is a singleton, then it holds automatically. Thus the only interesting case is when both intervals are proper (none of them is a singleton) and their intersection is also proper.

Let \(d \in I_i \cap I_j\) such that \(d \ne c\). That being so, there exist (uniquely determined) points

$$\begin{aligned} v, \, {\tilde{v}} \in S_i \quad \text{ and } \quad w, \, {\tilde{w}} \in S_j \end{aligned}$$

such that \(v \ne {\tilde{v}}\) and \(w \ne {\tilde{w}}\), moreover \(p_a(v) = p_a(w) = c\) and \(p_a({\tilde{v}}) = p_a({\tilde{w}}) = d\). Besides that, let us define two unit vectors

$$\begin{aligned} b_1 = \frac{1}{\Vert {\tilde{v}} - v \Vert }({\tilde{v}}-v) \quad \text{ and } \quad b_2 = \frac{1}{\Vert {\tilde{w}} - w \Vert }({\tilde{w}}-w). \end{aligned}$$

It is easy to see that \(\langle b_1, a \rangle \ne 0\) and \(\langle b_2, a \rangle \ne 0\). Otherwise \(S_i\) or \(S_j\) would be contained in a hyperplane orthogonal to a, which would entail that its image under \(p_a\) shrinks to a single point, but this was excluded. Furthermore, by calculating

$$\begin{aligned} \langle b_1, a \rangle &= \frac{1}{\Vert {\tilde{v}} - v \Vert } \langle {\tilde{v}}-v , a \rangle = \frac{\langle {\tilde{v}},a \rangle - \langle v,a \rangle }{\Vert {\tilde{v}}-v \Vert } = \frac{d-c}{\Vert {\tilde{v}} - v \Vert } \ \text{ and } \\ \langle b_2, a \rangle &= \frac{1}{\Vert {\tilde{w}} - w \Vert } \langle {\tilde{w}}-w , a \rangle = \frac{\langle {\tilde{w}} , a \rangle - \langle w , a \rangle }{\Vert {\tilde{w}}-w \Vert } = \frac{d-c}{\Vert {\tilde{w}} - w \Vert } \end{aligned}$$

we obtain that the sign of \(\langle b_1, a \rangle \) and \(\langle b_2, a \rangle \) is the same. Without loss of generality we may assume \( \vert \langle b_1, a \rangle \vert \le \vert \langle b_2, a \rangle \vert \) (if necessary, define the unit vectors inversely, i.e. use the points of \(S_j\) for \(b_1\) and use the points of \(S_i\) for \(b_2\)). This way we may introduce

$$\begin{aligned} \lambda := \frac{\langle b_1, a \rangle }{\langle b_2, a \rangle } \in \, ]0,1]. \end{aligned}$$

We shall observe that this implies, for all \(s \in S\),

$$\begin{aligned} s + \frac{r}{2} b_1 \in B\left( s, r\right) \subset T \quad \text{ and } \quad s + \lambda \frac{r}{2} b_2 \in B\left( s, r\right) \subset T . \end{aligned}$$

Furthermore, it also holds that

$$\begin{aligned} \left\langle w + \lambda \frac{r}{2} b_2 - \left( w + \frac{r}{2} b_1 \right) \, , \, a \right\rangle &= \frac{r}{2} \, \langle \lambda b_2 - b_1 \, , a \rangle \\ &= \frac{r}{2} \left( \frac{\langle b_1, a \rangle }{\langle b_2, a \rangle } \langle b_2, a \rangle - \langle b_1, a \rangle \right) = 0 \end{aligned}$$

which means \(w + \lambda \frac{r}{2} b_2 \in H(w +\frac{r}{2} b_1, a) \).

Here, since \(F(v) = f_i(c) = f_j(c) = F(w)\), the local translation invariance of F yields

$$\begin{aligned} F\left( v + \frac{r}{2} b_1\right) = F\left( w + \frac{r}{2} b_1\right) . \end{aligned}$$

We should also keep in mind that we already concluded that F is constant on the \(n-1\) dimensional disc \(B(w + \frac{r}{2} b_1, r) \cap H(w + \frac{r}{2} b_1 , a)\). Now

$$\begin{aligned} \Vert w + \lambda \frac{r}{2} b_2 - \left( w + \frac{r}{2} b_1 \right) \Vert = \frac{r}{2} \Vert b_1 - \lambda b_2 \Vert \le \frac{r}{2} \left( \Vert b_1 \Vert + \Vert \lambda b_2 \Vert \right) = \frac{r}{2} (1+ \lambda ) \le r. \end{aligned}$$

Observe that the first inequality is strict, because otherwise we would have \(b_1 = \gamma (-\lambda b_2)\) for some \(\gamma \ge 0\). However, this would imply

$$\begin{aligned} 0 < \lambda = \frac{\langle b_1, a \rangle }{\langle b_2, a \rangle } = \frac{\langle \gamma (-\lambda b_2), a \rangle }{\langle b_2, a \rangle } = \gamma (-\lambda ) \le 0 \end{aligned}$$

which is a contradiction. Hence we have in fact \(w + \lambda \frac{r}{2} b_2 \in B (w + \frac{r}{2} b_1, r)\). As \(w + \lambda \frac{r}{2} b_2 \in H(w +\frac{r}{2} b_1, a) \) was obtained previously, we may claim that

$$\begin{aligned}&w + \lambda \frac{r}{2} b_2 \in B \left( w + \frac{r}{2} b_1, r\right) \cap H(w + \frac{r}{2} b_1 , a) \quad \text{ and } \text{ therefore } \\&\quad F\left( w + \lambda \frac{r}{2} b_2\right) = F\left( w + \frac{r}{2} b_1\right) = F\left( v + \frac{r}{2} b_1\right) . \end{aligned}$$

We shall observe that this process works not only for \(\frac{r}{2}\), but also for an arbitrary real number \(0 < \varrho \le \frac{r}{2}\), yielding

$$\begin{aligned} F(w + \lambda \varrho b_2) = F(w + \varrho b_1) = F(v + \varrho b_1). \end{aligned}$$

With some basic calculations analogous to the previously discussed ones it is easy to see that besides \(F(w + \lambda \varrho b_2) = F(v + \varrho b_1)\), the equation \(p_a( v + \varrho b_1) = p_a( w + \lambda \varrho b_2)\) holds as well, for any \(\varrho \in ] 0 , \frac{r}{2} ]\).

Now let us introduce the notation

$$\begin{aligned} N := \left\lfloor \frac{2\Vert {\tilde{v}}-v \Vert }{r} \right\rfloor . \end{aligned}$$

We shall execute this translation process N times for \(\frac{r}{2}\), starting every step from the most recently obtained points \(v + \frac{ l r}{2}b_1\) and \(w + \lambda \frac{ l r}{2}b_2\) where \( l = 0, 1, \dots ,N-1\). This way one gets

$$\begin{aligned} F\left( v + \frac{N r}{2} b_1\right) = F\left( w + \lambda \frac{N r}{2} b_2\right) . \end{aligned}$$

If necessary, we may do one final step, but instead of \(\frac{r}{2}\) now with some \(0< \delta < \frac{r}{2}\). Hence we get that \(F({\tilde{v}}) = F({\tilde{w}})\).
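The counting of steps can be sketched numerically (a toy instance; the values of r, v and \({\tilde{v}}\) below are our own, with \(b_1\) taken along the first coordinate axis): after N full steps of length \(\frac{r}{2}\), the remaining distance \(\delta \) is strictly smaller than \(\frac{r}{2}\), so at most one final shorter step reaches \({\tilde{v}}\).

```python
import math

# Step counting from the proof: N = floor(2 * ||v~ - v|| / r).
r = 0.5
v, v_t = (0.0, 0.0), (1.3, 0.0)   # v_t plays the role of v~; b1 = (1, 0)
dist = math.dist(v, v_t)
N = math.floor(2 * dist / r)      # number of full steps of length r/2
delta = dist - N * (r / 2)        # length of the final step (possibly 0)
assert 0 <= delta < r / 2
# After N steps of length r/2 and one step of length delta we arrive at v~:
end = (v[0] + N * (r / 2) + delta, v[1])
assert math.isclose(end[0], v_t[0]) and end[1] == v_t[1]
```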

Finally, we shall utilize the fact that \(I_i\) and \(I_j\) are proper intervals and not singletons. Therefore the functions \(f_i\) and \(f_j\) were defined with the help of the bijective restrictions \(p_a \vert _{S_i}\) and \(p_a \vert _{S_j}\). Hence, for the considered number \(d \in I_i \cap I_j \,\), we have

$$\begin{aligned} f_i(d) = F\left( (p_a \vert _{S_i})^{-1}(d)\right) = F({\tilde{v}}) = F({\tilde{w}}) = F\left( (p_a \vert _{S_j})^{-1}(d)\right) = f_j(d). \end{aligned}$$

As d was an arbitrary element of \(I_i \cap I_j\), we have checked that all the assumptions of Lemma 3.10 are fulfilled. Applying the lemma to \(p_a(z_0) = p_a(z_m)\), which is a common endpoint of \(I_1\) and \(I_m \,\), we obtain \(f_1(p_a(z_0)) = f_m(p_a(z_m))\). Considering that \(z_0 = x\) and \(z_m = y\), this means

$$\begin{aligned} F(x) = f_1(p_a(x)) = f_1(p_a(z_0)) = f_m(p_a(z_m)) = f_m(p_a(y))= F(y), \end{aligned}$$

which completes our proof. \(\square \)

Corollary 3.13

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be an open, connected set and \(F : D \longrightarrow {\mathbb {R}}\) be a continuous, locally translation invariant function. If there exist \(x \in D\) and \(\varepsilon > 0\) such that \(B(x, \varepsilon ) \subset D\) and \(F \vert _{B(x, \varepsilon )}\) is constant, then F is constant on the whole set D.

Proof

Let \(y \in D \setminus \{ x \}\) be an arbitrary point, and let us choose a vector \(0 \ne b \in {\mathbb {R}}^n\) such that \(\langle b , y-x \rangle = 0\) and therefore \(p_b(x) = p_b(y)\) holds. In the proof of Proposition 3.12 we have shown that, for any \(x \in D\), \(0 \ne a \in {\mathbb {R}}^n\) and \(r > 0\), if \(B(x,r) \subset D\) and F is constant on the \(n-1\) dimensional disc \(H(x,a) \cap B(x,r)\), then F is globally constant on the hyperplanes orthogonal to a. Now \(F \vert _{B(x,\varepsilon )}\) is constant, so in particular for the vector b we get that \(p_b(x) = p_b(y)\) implies \(F(x) = F(y)\). Since y was an arbitrary point of D, we have shown that F is constant on the whole domain. \(\square \)

Before proving our main result we recall a well-known fact about real functions.

Theorem 3.14

Let \(I \subset {\mathbb {R}}\) be an interval and \(f : I \longrightarrow {\mathbb {R}}\) be a continuous, injective function. Then f is strictly monotone.

Remark 3.15

The previous statement is listed among some other fundamental properties of continuous real functions in [5, Theorem 8.1]. The main idea of the proof is the intermediate value property of continuous, real valued functions.

Theorem 3.16

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be a connected, open set and let \(F : D \longrightarrow {\mathbb {R}}\) be a continuous, locally translation invariant function. Then either F is constant, or there exists a vector \(0 \ne a = (a_1, \dots , a_n) \in {\mathbb {R}}^n\) and a strictly monotone, continuous function \(f : p_a(D) \longrightarrow {\mathbb {R}}\) such that

$$\begin{aligned} F(x_1, \dots , x_n) = f (a_1 x_1 + \dots + a_n x_n) \end{aligned}$$

holds for all \((x_1, \dots , x_n) \in D \,\).

Proof

According to Proposition 3.12 there exists a vector \( 0 \ne a \in {\mathbb {R}}^n\) such that \(x,y \in D \,\), \(p_a(x) = p_a(y)\) implies \(F(x) = F(y)\). Now \(p_a(D) \subset {\mathbb {R}}\) is connected since \(p_a\) is continuous and D is connected. Thus \(p_a(D)\) is a nonempty real interval. Obviously it cannot be a singleton, because that would mean that the set D is contained in an \(n-1\) dimensional hyperplane, which contradicts that D is open.

For every \(c \in p_a(D)\) let us consider an element \(x_c \in D\) such that \(p_a(x_c) = c\). Then we shall define the function f in the following way:

$$\begin{aligned} f : p_a(D) \longrightarrow {\mathbb {R}}\qquad f(c) := F(x_c). \end{aligned}$$

Let us observe that the properties of a mentioned before ensure that f is well-defined (i.e. independent of the choice of \(x_c\)). Indeed, if we use some other \(\tilde{x_c}\) in the definition of f instead of \(x_c\), then \(p_a(\tilde{x_c}) = c = p_a(x_c)\) implies \(F(\tilde{x_c}) = F(x_c) = f(c)\), so f remains the same.

Now we may show that f is continuous. For this purpose let us fix an arbitrary \(c_0 \in p_a(D)\) and choose \(y_0 \in D\) so that \(p_a(y_0) = c_0\). Since D is open, there exists a real number \(r_0 >0\) such that \(B(y_0,r_0) \subset D\). Therefore, if \(r < \frac{r_0}{\Vert a \Vert } \,\), then \(y_0 + \lambda a \in D\) holds for all \(\lambda \in [-r,r]\). This means that by using the notation \(S := \mathrm{s}(y_0 -ra , y_0 +ra)\) we have that \(S \subset D\) and \(c_0\) is an interior point of \(p_a(S)\). Moreover, since S is compact and connected, we get that \(p_a(S)\) is a closed interval while clearly it is a proper interval, as S is not contained in any hyperplane orthogonal to a. Furthermore, it is easy to check that \(p_a \vert _S : S \longrightarrow p_a(S)\) is a bijective function. Therefore

$$\begin{aligned} f(c) = F(p_a^{-1}(c)) \quad \text{ holds } \text{ for } \text{ all } c \in p_a(S) . \end{aligned}$$

Here \(p_a \vert _S\) is a restriction of a linear function, so it is continuous. As its domain of definition is a compact line segment, we get that the inverse of \(p_a \vert _S\) is continuous, too. Hence f is a composition of two continuous functions, therefore f itself is continuous on \(p_a(S)\). Recall that \(c_0\) is an interior point of \(p_a(S)\), so f is continuous in an open neighborhood of \(c_0\). Since \(c_0 \in p_a(D)\) was arbitrary, we have obtained that f is continuous on its whole domain of definition \(p_a(D)\).

We shall prove that if f is not injective then F is constant. If there exist \(s, t \in p_a(D)\), \(s < t\) such that \(f(s) = f(t)\), then this means that there exist \(x_s , x_t \in D \) such that

$$\begin{aligned} p_a(x_s) \ne p_a(x_t) \ \text{ but } \ F(x_s) = F(x_t). \end{aligned}$$

Now if \(f \vert _{[s,t]}\) is constant, then clearly F is also constant on an open ball and hence constant on the whole domain D, according to Corollary 3.13. Therefore we may assume that f attains one of its extrema at an interior point of the interval \([s,t]\). Suppose that there exists \(u \in \, ]s,t[\) such that \(f(u) = \max \{f(c) \ \vert \ c \in [s,t] \}\) (the case when the minimum is attained at an interior point can be handled analogously). Let \(x_u \in D\) such that \(p_a(x_u) = u\). Then there exists \(\varepsilon > 0\) such that \(B := B(x_u, \varepsilon \Vert a \Vert ) \subset D\).

Now \(T := \mathrm{s}(x_u - \varepsilon a , x_u +\varepsilon a) \subset B\). Moreover f attains a local maximum at u and therefore, due to the Darboux property of the continuous function \(f \vert _{p_a(T)}\), there exist \(x_1, x_2 \in T\) and \(c \in {\mathbb {R}}\) such that \(p_a(x_1)< u < p_a(x_2)\) and

$$\begin{aligned} F(x_1) = f(p_a(x_1)) = c = f(p_a(x_2)) = F(x_2). \end{aligned}$$

This implies

$$\begin{aligned} F \vert _{H(x_1,a) \cap B} \equiv c \equiv F \vert _{H(x_2,a) \cap B} \, . \end{aligned}$$

Consequently, there exist \(n+1\) affinely independent points in B where F has the same value (namely c). According to Theorem 3.7, F is then constant on B and therefore on D as well.

So we have shown that if F is not constant on D then f must be injective. But f is also continuous hence it is strictly monotone as well. \(\square \)
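For a concrete instance of the decomposition in Theorem 3.16 (our example, not taken from the paper): the function \(F(x,y) = e^{2x + 3y}\) on \({\mathbb {R}}^2\) has the stated form with \(a = (2,3)\) and the strictly increasing \(f = \exp \), and its translation invariance can be checked numerically.

```python
import math

# F(x, y) = exp(2x + 3y) = f(a_1 x + a_2 y) with a = (2, 3), f = exp.
a = (2.0, 3.0)

def F(x, y):
    return math.exp(a[0] * x + a[1] * y)

# Translation invariance: F(p) == F(q) forces F(p + t) == F(q + t).
p, q = (1.0, 0.0), (-2.0, 2.0)        # both satisfy 2x + 3y == 2
assert math.isclose(F(*p), F(*q))
t = (0.4, -0.7)                       # an arbitrary translation vector
assert math.isclose(F(p[0] + t[0], p[1] + t[1]),
                    F(q[0] + t[0], q[1] + t[1]))
```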

4 Continuous solutions of a system of functional equations

Let us consider a given set \(\emptyset \ne S \subset {\mathbb {R}}^n\) and a function \(F : S \longrightarrow {\mathbb {R}}\,\). For every \(k = 1, \dots , n\) let us define the sets \(E_k(S,F) \subset {\mathbb {R}}^2\) in the following way:

$$\begin{aligned} E_k(S,F) =&\lbrace \, (u,v) \in {\mathbb {R}}^2 \ \vert \ \exists (x_1, \dots , x_n) \in S : \\&F(x_1, \dots , x_n) = u \text{ and } (x_1, \dots , x_{k-1} , x_k + v , x_{k+1}, \dots ,x_n) \in S \, \rbrace . \end{aligned}$$

Suppose that there exist some functions \(\Psi _k : E_k(S,F) \longrightarrow {\mathbb {R}}\) (\(k = 1, \dots , n\)) such that the following equations hold:

$$\begin{aligned} F(x_1+t_1, x_2, \dots , x_n)&= \Psi _1(F (x_1, x_2, \dots , x_n), t_1) \quad (1) \\ F(x_1, x_2 + t_2, \dots , x_n)&= \Psi _2(F (x_1, x_2, \dots , x_n), t_2) \quad (2) \\&\vdots \\ F(x_1, x_2, \dots , x_n + t_n)&= \Psi _n(F (x_1, x_2, \dots , x_n), t_n) \quad (n). \end{aligned}$$

Then we say that F is a solution of the system of composite functional equations \((1) - (n)\).
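For intuition about how such \(\Psi _k\) arise, here is a sketch of the construction suggested by the decomposition in Theorem 3.16 (all concrete values below are our own, assuming F has the form \(F(x) = f(\langle a, x \rangle )\) with f strictly monotone): one may take \(\Psi _k(u,t) = f(f^{-1}(u) + a_k t)\), here with \(f = \exp \) and \(f^{-1} = \log \).

```python
import math

# If F(x) = f(<a, x>) with f strictly monotone (here f = exp), then
# Psi_k(u, t) := f(f^{-1}(u) + a_k t) satisfies equation (k).
a = (2.0, 3.0, -1.0)

def F(x):
    return math.exp(sum(ai * xi for ai, xi in zip(a, x)))

def Psi(k, u, t):                     # k is 0-based here
    return math.exp(math.log(u) + a[k] * t)

x, t = (0.3, -0.5, 1.2), 0.8
for k in range(3):
    shifted = tuple(xi + (t if i == k else 0.0) for i, xi in enumerate(x))
    assert math.isclose(F(shifted), Psi(k, F(x), t))
```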

In this section we will describe the continuous solutions of this system of functional equations, using the previously verified results about locally translation invariant functions.

From now on, a vector \(v \in {\mathbb {R}}^n\) will often be represented by its coordinate vector with respect to the standard basis.

Lemma 4.1

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be an open set and \(F : D \longrightarrow {\mathbb {R}}\) be a continuous solution of the system of functional equations \((1) - (n)\). Then F is locally translation invariant.

Proof

Let \(x,y \in D\) be such that \(F(x) = F(y)\), and suppose that for some radius \(0 < r \in {\mathbb {R}}\) the inclusions \(B(x,r) \subset D\) and \(B(y,r) \subset D\) hold. Moreover, suppose that \(\Vert h \Vert < r\) is fulfilled for some vector \(h \in {\mathbb {R}}^n\,\). Due to these assumptions, it is obvious that

$$\begin{aligned}&(x_1 + h_1, \dots , x_k +h_k , x_{k+1}, \dots , x_n) \in D \ \text{ and } \\&\quad (y_1 + h_1, \dots , y_k +h_k , y_{k+1}, \dots , y_n) \in D \end{aligned}$$

hold for all indices \(k = 1, \dots , n\). Therefore we can carry out the following calculation:

$$\begin{aligned} F(x+h)&= F(x_1 + h_1, x_2 + h_2 , \dots , x_n + h_n) \\&= \Psi _1(F(x_1, x_2 + h_2, \dots , x_n + h_n),h_1) \\&= \Psi _1(\Psi _2(F(x_1, x_2, x_3 + h_3, \dots , x_n + h_n),h_2),h_1) = \dots \\&= \Psi _1(\Psi _2(\dots (\Psi _n( F(x_1, x_2, \dots , x_n),h_n) \dots ),h_2),h_1) \\&= \Psi _1(\Psi _2(\dots (\Psi _n( F(y_1, y_2, \dots , y_n),h_n) \dots ),h_2),h_1) = \dots \\&= \Psi _1(\Psi _2(F(y_1, y_2, y_3 + h_3, \dots , y_n + h_n),h_2),h_1) \\&= \Psi _1(F(y_1, y_2 + h_2, \dots , y_n + h_n),h_1) \\&= F(y_1 + h_1, y_2 + h_2 , \dots , y_n + h_n) = F(y+h). \end{aligned}$$

\(\square \)
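The chain of substitutions in the proof above can be illustrated with a minimal numerical sketch. The solution \(F(x,y) = e^{2x+3y}\) on \({\mathbb {R}}^2\) and the outer functions \(\Psi _1(u,t) = u e^{2t}\), \(\Psi _2(u,s) = u e^{3s}\) are hypothetical choices made for this check only; they are not taken from the text.

```python
import math

# Hypothetical illustrative solution on R^2: F(x, y) = exp(2x + 3y).
def F(x, y):
    return math.exp(2 * x + 3 * y)

# Corresponding outer functions: Psi_1(u, t) = u*e^{2t}, Psi_2(u, s) = u*e^{3s}.
def Psi1(u, t):
    return u * math.exp(2 * t)

def Psi2(u, s):
    return u * math.exp(3 * s)

x, h = (0.4, -0.2), (0.05, 0.1)
# Peel off the coordinates of h one at a time, as in the proof of Lemma 4.1:
chained = Psi1(Psi2(F(x[0], x[1]), h[1]), h[0])
assert math.isclose(chained, F(x[0] + h[0], x[1] + h[1]))

# Translation invariance: F(x) = F(y) implies F(x + h) = F(y + h).
y = (0.7, -0.4)   # 2*0.7 + 3*(-0.4) = 0.2 = 2*0.4 + 3*(-0.2)
assert math.isclose(F(*x), F(*y))
assert math.isclose(F(x[0] + h[0], x[1] + h[1]), F(y[0] + h[0], y[1] + h[1]))
```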

Theorem 4.2

Let \(\emptyset \ne D \subset {\mathbb {R}}^n\) be a connected, open set and let \(F : D \longrightarrow {\mathbb {R}}\) be a continuous solution of the system of functional equations \((1) - (n)\). Then either F is constant, or there exist a vector \(0 \ne a = (a_1, \dots , a_n) \in {\mathbb {R}}^n\) and a strictly monotone, continuous function \(f : p_a(D) \longrightarrow {\mathbb {R}}\) such that

$$\begin{aligned} F(x_1, \dots , x_n) = f (a_1 x_1 + \dots + a_n x_n) \end{aligned}$$

holds for all \((x_1, \dots , x_n) \in D \,\).

Proof

This follows immediately from Theorem 3.16 and Lemma 4.1. \(\square \)

5 Application in mathematical economics

In this section we are going to use our previous decomposition theorems in order to characterize the Cobb–Douglas type utility functions with a system of functional equations.

Let us introduce a notation for such vectors of \({\mathbb {R}}^n\) that have only positive coordinates:

$$\begin{aligned} {\mathbb {R}}_+ ^n:= \lbrace (x_1, \dots , x_n) \in {\mathbb {R}}^n\ \vert \ x_1> 0 , \dots , x_n > 0 \rbrace \end{aligned}$$

and especially \({\mathbb {R}}_+:= ] 0, + \infty [ \).

Definition 5.1

Consider a subset \(\emptyset \ne D \subset {\mathbb {R}}_+ ^n\) and let \(A, \alpha _1, \dots , \alpha _n \in {\mathbb {R}}_+\) be given positive constants. Furthermore, let the function \(u : D \longrightarrow {\mathbb {R}}\) be defined with the following formula:

$$\begin{aligned} u(x_1, \dots ,x_n) = A \cdot x_1 ^{\alpha _1} \cdot \, \dots \, \cdot x_n^{\alpha _n} \qquad \left( (x_1, \dots , x_n) \in D \right) . \end{aligned}$$

Then u is called a Cobb–Douglas utility function.

A utility function \(u : D \longrightarrow {\mathbb {R}}\) always generates a preference relation \(\preceq _u\) on its domain of definition:

$$\begin{aligned} x \preceq _u y \Longleftrightarrow u(x) \le u(y) \qquad (x,y \in D). \end{aligned}$$

However, in mathematical economics the preference relation generated by a utility function is more relevant than the function itself [3]. It is easy to see that if the utility function is composed with a strictly increasing real function, then the generated preference relation remains the same. Hence, if u is a Cobb–Douglas utility function and \(\varphi \) is a strictly increasing real function, then the composite function \(\varphi \circ u\) shall also be considered a Cobb–Douglas type utility function.
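The invariance of the generated preference relation under a strictly increasing composition can be checked numerically. The constants \(A = 2\), \(\alpha = (0.3, 0.7)\) and the choice \(\varphi = \ln \) below are hypothetical, picked only for this sketch:

```python
import math
from itertools import product

# A hypothetical Cobb-Douglas utility with A = 2, alpha = (0.3, 0.7).
def u(x1, x2):
    return 2.0 * x1 ** 0.3 * x2 ** 0.7

# Composing with a strictly increasing function (here: log) preserves
# the preference relation: u(x) <= u(y)  iff  log(u(x)) <= log(u(y)).
def u_type(x1, x2):
    return math.log(u(x1, x2))

bundles = [(1.0, 2.0), (3.0, 0.5), (2.0, 2.0)]
for x, y in product(bundles, bundles):
    assert (u(*x) <= u(*y)) == (u_type(*x) <= u_type(*y))
```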

Now let \(\emptyset \ne S \subset {\mathbb {R}}^n\) be a given set and \(F : S \longrightarrow {\mathbb {R}}\,\) be a given function. At the beginning of Section 4 we defined the sets \(E_k(S,F)\). We shall introduce a similar notation which will be useful in the formulation and proof of the following theorem. For every \(k = 1, \dots , n\) let us define \(G_k(S,F) \subset {\mathbb {R}}^2\) in the following way:

$$\begin{aligned} G_k(S,F) =&\lbrace \, (a,b) \in {\mathbb {R}}^2 \ \vert \ \exists (x_1, \dots , x_n) \in S : \\&F(x_1, \dots , x_n) = a \text{ and } (x_1, \dots , x_{k-1} , x_k \cdot b , x_{k+1}, \dots ,x_n) \in S \, \rbrace . \end{aligned}$$

Theorem 5.2

Let \(D \subset {\mathbb {R}}_+ ^n\) be a connected, open set and \(u : D \longrightarrow {\mathbb {R}}\) be a continuous function which is strictly increasing in all of its variables. Then u is a Cobb–Douglas type utility function if, and only if, there exist some functions

$$\begin{aligned} \Phi _k : G_k(D,u) \longrightarrow {\mathbb {R}}\qquad (k = 1, \dots , n) \end{aligned}$$

with the following property: for all \((x_1, \dots , x_n) \in D\), \(t_k \in {\mathbb {R}}_+\) such that \((x_1, \dots ,x_{k-1},x_k \cdot t_k ,x_{k+1}, \dots ,x_n) \in D\) the equations

$$\begin{aligned} u(x_1, \dots ,x_{k-1},x_k \cdot t_k ,x_{k+1}, \dots ,x_n) = \Phi _k (u(x_1, \dots ,x_n), t_k) \end{aligned}$$

are fulfilled for all \(k=1, \dots ,n\).

Proof

Firstly, let u be a Cobb–Douglas type utility function, i.e.

$$\begin{aligned} u(x_1, \dots ,x_n) = {\varphi } \left( x_1 ^{\alpha _1} \cdot \, \dots \, \cdot x_n^{\alpha _n} \right) , \end{aligned}$$

where the exponents \( {\alpha }_j \) \((j=1,\dots ,n)\) are positive and \( {\varphi } \) is strictly increasing (the positive constant A from Definition 5.1 has been absorbed into \( {\varphi } \)). Then we can define appropriate functions \(\Phi _k\) (\(k = 1, \dots , n\)) in the following way

$$\begin{aligned} \Phi _k (a, b) = {\varphi } \left( {\varphi }^{-1} (a) \, b^{\alpha _k} \right) , \qquad \text{ where } (a,b) \in G_k(D,u). \end{aligned}$$

To check that the requirements of the theorem are fulfilled we shall calculate

$$\begin{aligned}&u(x_1, \dots ,x_{k-1},x_k \cdot t_k ,x_{k+1}, \dots ,x_n) \\&\quad ={\varphi } \left( x_1 ^{\alpha _1} \dots x_{k-1}^{\alpha _{k-1}} \, (x_k \cdot t_k)^{\alpha _k} \, x_{k+1}^{\alpha _{k+1}} \dots x_n^{\alpha _n} \right) = {\varphi } \left( x_1 ^{\alpha _1} \dots x_n^{\alpha _n} \cdot t_k^{\alpha _k} \right) \\&\quad ={\varphi } \left( {\varphi }^{-1} \left( u(x_1, \dots , x_n) \right) \cdot t_k^{\alpha _k} \right) = \Phi _k (u(x_1, \dots ,x_n), t_k). \end{aligned}$$
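The defining formula \(\Phi _k (a, b) = {\varphi } ( {\varphi }^{-1} (a) \, b^{\alpha _k} )\) can be verified numerically for a concrete instance. The choices \(n = 2\), \(\alpha = (0.3, 0.7)\) and \(\varphi = \ln \) below are hypothetical, made only for this sketch:

```python
import math

alpha = (0.3, 0.7)
phi, phi_inv = math.log, math.exp   # a hypothetical strictly increasing phi

def u(x):
    # Cobb-Douglas type utility: u(x) = phi(x_1^{alpha_1} * x_2^{alpha_2}).
    return phi(x[0] ** alpha[0] * x[1] ** alpha[1])

def Phi(k, a, b):
    # Phi_k(a, b) = phi(phi^{-1}(a) * b^{alpha_k}), as in the proof.
    return phi(phi_inv(a) * b ** alpha[k])

# Scaling the k-th coordinate by t_k matches Phi_k(u(x), t_k).
x, t = (1.5, 0.8), 2.3
for k in range(2):
    scaled = list(x)
    scaled[k] *= t
    assert math.isclose(u(tuple(scaled)), Phi(k, u(x), t))
```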

Conversely, let us assume that u is continuous, strictly increasing in all of its variables and it is a solution of the system of composite functional equations

$$\begin{aligned} u(x_1, \dots ,x_{k-1},x_k \cdot t_k ,x_{k+1}, \dots ,x_n) = \Phi _k (u(x_1, \dots ,x_n), t_k) \end{aligned}$$

for every \(k = 1, \dots ,n\). Now we may define the set

$$\begin{aligned} S = \lbrace \ (\ln x_1, \dots , \ln x_n) \in {\mathbb {R}}^n~~ \vert ~~ (x_1, \dots , x_n) \in D \ \rbrace . \end{aligned}$$

Clearly, S is connected, since it is the image of the connected set D under a continuous function. In order to see that S is open as well, let us introduce the function \(E : {\mathbb {R}}^n\longrightarrow {\mathbb {R}}^n\) with the formula \(E(y_1, \dots , y_n) = (e^{y_1}, \dots ,e^{y_n} )\). Then E is obviously continuous and, as S is the preimage of D under E, we get that S is open. Let us define the function \(v : S \longrightarrow {\mathbb {R}}\) in the following way:

$$\begin{aligned} v(y_1, \dots , y_n) = u \left( e^{y_1}, \dots , e^{y_n} \right) , \qquad \text{ where } (y_1, \dots , y_n) \in S. \end{aligned}$$

In fact, here \(v = u \circ E\) holds, therefore v is continuous and \(v(S) = u(D)\) also holds. Furthermore, the exponential function is strictly increasing, thus v is strictly increasing in all of its variables. We shall also observe that, if we introduce the functions

$$\begin{aligned} \Psi _k : E_k(S,v) \longrightarrow {\mathbb {R}}\, , \qquad \Psi _k(a,b) = \Phi _k(a, e^b) \end{aligned}$$

for every \(k = 1, \dots , n \,\), then

$$\begin{aligned}&v(y_1, \dots ,y_{k-1},y_k + s_k ,y_{k+1}, \dots ,y_n) \\&\quad =u(e^{y_1}, \dots ,e^{y_{k-1}}, e^{y_k} \cdot e^{s_k} , e^{y_{k+1}}, \dots , e^{y_n}) \\&\quad =\Phi _k \left( u(e^{y_1}, \dots , e^{y_n}), e^{s_k} \right) = \Psi _k \left( v(y_1, \dots , y_n) , s_k \right) \end{aligned}$$

holds for all \((y_1, \dots , y_n) \in S\) and \(s_k \in {\mathbb {R}}\), fulfilling

$$\begin{aligned} (y_1, \dots ,y_{k-1},y_k + s_k ,y_{k+1}, \dots ,y_n) \in S. \end{aligned}$$

Consequently \(v : S \longrightarrow {\mathbb {R}}\) is a continuous solution of the system of functional equations

$$\begin{aligned} v(y_1, \dots ,y_{k-1},y_k + s_k ,y_{k+1}, \dots ,y_n) = \Psi _k (v(y_1, \dots ,y_n), s_k) \end{aligned}$$

for every \(k= 1, \dots , n\). Obviously v is not constant, therefore, due to Theorem 4.2, there exist a vector \(0 \ne a = (a_1, \dots , a_n) \in {\mathbb {R}}^n\) and a continuous, strictly monotone real function f such that

$$\begin{aligned} v(y_1, \dots ,y_n) = f(a_1 y_1 + \dots + a_n y_n) \qquad ((y_1, \dots , y_n) \in S). \end{aligned}$$

Since v is strictly increasing in all of its variables, one can easily see that either all the coordinates of a are positive and f is strictly increasing, or all coordinates are negative and f is strictly decreasing. Without loss of generality we can restrict ourselves to the first case, as one can easily check that a is determined only up to a non-zero multiplicative constant.

From this result we can instantly conclude that

$$\begin{aligned}&u(x_1 ,\dots , x_n) = v(\ln x_1 , \dots , \ln x_n) \\&\quad =f( a_1 \ln x_1 + \dots + a_n \ln x_n) = (f \circ \ln ) (x_1 ^{a_1} \cdot \, \dots \, \cdot x_n^{a_n}) \end{aligned}$$

holds for all \((x_1, \dots , x_n) \in D\). Here f and \(\ln \) are strictly increasing, thus \(f \circ \ln \) is also strictly increasing, which means that u is a Cobb–Douglas type utility function. \(\square \)
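The change of variables \(v = u \circ E\) used in the proof can be illustrated with a minimal numerical sketch. The choices \(\alpha = (0.3, 0.7)\) and \(\varphi = \ln \) below are hypothetical; with them, the multiplicative equations for u become additive equations for v, and the decomposition of Theorem 4.2 appears with \(a = \alpha \) and \(f = \textrm{id}\):

```python
import math

alpha = (0.3, 0.7)

def u(x1, x2):
    # Hypothetical Cobb-Douglas type utility with phi = log.
    return math.log(x1 ** alpha[0] * x2 ** alpha[1])

def v(y1, y2):
    # v = u o E with E(y_1, y_2) = (e^{y_1}, e^{y_2}).
    return u(math.exp(y1), math.exp(y2))

y, s = (0.2, -0.5), 0.4
# Decomposition: v(y) = alpha_1 * y_1 + alpha_2 * y_2 (here f = identity).
assert math.isclose(v(*y), alpha[0] * y[0] + alpha[1] * y[1])
# Additive equation in the first variable: v(y_1 + s, y_2) = v(y) + alpha_1 * s.
assert math.isclose(v(y[0] + s, y[1]), v(*y) + alpha[0] * s)
```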