1 Introduction

Suppose you were given a set A and N lamps, which you are to place such that the darkest point of A is as bright as possible. Stated more formally, this max-min problem is known as the maximal polarization problem, which we now phrase in mathematical language.

Let \(A, D\subset \mathbb {R}^n\) be nonempty sets and let \(K:A\times D\rightarrow \mathbb {R}\cup \{+\infty \}\) be a function bounded from below. An N-point multiset \(C\subseteq D\) will be referred to as a point configuration (of N points), and the set of all N-point configurations supported on D will be denoted by \(\mathcal {C}\). We assign the discrete K-potential associated with C to every point \(p\in A\) as

$$\begin{aligned}U_{K, A}(p, C) = \sum _{c\in C} K(p, c).\end{aligned}$$

To any point configuration we associate its polarization

$$\begin{aligned}P_{K, A}(C) = \inf _{p\in A} U_{K, A}(p, C).\end{aligned}$$

It is then natural to consider the (maximal) polarization problem:

$$\begin{aligned} \mathcal {P}_K(A) = \sup _{C\in \mathcal {C}} P_{K, A}(C). \end{aligned}$$
(1)
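
To make these definitions concrete, here is a minimal numerical sketch (our own toy instance, not from the paper): A = [-1, 1] sampled on a grid, a Gaussian kernel, and a two-point configuration; all names are ours.

```python
import math

# Toy instance (our own choice): A = [-1, 1] sampled on a grid,
# Gaussian kernel f(x) = exp(-a x^2), N = 2 lamps.
a = 1.0
f = lambda x: math.exp(-a * x * x)

A = [-1 + 2 * i / 400 for i in range(401)]  # dense sample of A
C = [-0.5, 0.5]                             # a 2-point configuration

def U(p, C):
    """Discrete f-potential of the configuration C at the point p."""
    return sum(f(abs(p - c)) for c in C)

def P(C, A):
    """Polarization of C over the sample of A: the minimal brightness."""
    return min(U(p, C) for p in A)

print(P(C, A))  # the darkest sampled points are the endpoints -1 and 1
```

On this instance the minimum is attained at the endpoints of the interval, the points farthest from both lamps.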

For a broader context and an overview of this formulation of the polarization problem we refer to the recent monograph [1, Ch. 14]. Problems of this kind have been studied extensively. In particular, the case of \(A = D = S^{n-1}\) being the unit sphere and \(K(x,y) = \Vert x-y\Vert ^{-s}\) a Riesz potential is rich in results on explicit optimal configurations of few points (e.g. [2,3,4,5,6,7,8]), bounds on the maximal polarization (e.g. [4, 7]) and asymptotic results (e.g. [9,10,11,12]). Asymptotic results are also available for more general choices of A, such as rectifiable sets.

Moreover, the polarization problem as stated in (1) is closely related to the well-studied covering problem, i.e. the question whether A can be covered by N balls of radius \(r>0\). Indeed, let \(K(x,y)=\mathbbm {1}_{[0,r]}(\Vert x-y\Vert )\); then a covering with N balls exists if and only if \(1\le \mathcal {P}_K(A)\). General discussions of covering problems can be found, for example, in the seminal book by Conway and Sloane [13]. For covering problems on compact metric spaces we refer to [14] for an overview, whereas constructive methods have been developed, e.g., in [15] and [16].
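
The indicator-kernel equivalence can be checked numerically; a sketch under our own toy setup (A = [0, 1] sampled, r = 0.3; all names are ours):

```python
# Covering via the indicator kernel (our own toy instance): A = [0, 1]
# sampled, K(x, y) = 1_{[0, r]}(|x - y|).  N balls of radius r cover the
# sample iff every sampled point is within r of some center, i.e. iff the
# polarization P_K(A) is at least 1.
r = 0.3
A = [i / 200 for i in range(201)]

def covers(centers):
    polarization = min(sum(1 for c in centers if abs(p - c) <= r) for p in A)
    return polarization >= 1

print(covers([0.25, 0.75]), covers([0.5]))  # True False
```

Two radius-0.3 balls centered at 0.25 and 0.75 cover [0, 1], while a single ball centered at 0.5 misses the endpoints.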

In this paper we consider polarization problems of the following kind. The set \(A\subset \mathbb {R}^n\) will be a compact set and we will impose no restrictions on the point configurations, i.e. \(D = \mathbb {R}^n\). Furthermore, we restrict to functions \(K(x, y) = f(\Vert x-y\Vert )\) for some continuous strictly monotone decreasing function \(f : \mathbb {R}_+ \rightarrow \mathbb {R}_+\) and use the notation \(U_{f, A}(p, C)\), \(P_{f, A}(C), \mathcal {P}_f(A)\). If the subscript parameters are clear from context we omit them.

Under the above assumptions, we therefore consider the optimization problem

$$\begin{aligned} \mathcal {P}_{f}(A) = \sup _{C\in \mathcal {C}} P_{f, A}(C). \end{aligned}$$
(2)

For explicit computations we choose Gaussians \(f(x) = e^{-ax^2}\). These functions appear rather naturally in the context of universal optimality (cf. [17]): Recall that a function \(g:(0,\infty ) \rightarrow \mathbb {R}\) is completely monotonic if it is infinitely differentiable and the derivatives satisfy \((-1)^k g^{(k)} \ge 0\) for all k. The functions \(g(x) = e^{-\alpha x}\) are completely monotonic and we can write \(f(\Vert x-y\Vert ) = g(\Vert x-y\Vert ^2)\). In this context functions \(f(x) = g(x^2)\) are called completely monotonic functions of squared distance.

A theorem of Bernstein (cf. [18, Thm. 9.16]) asserts that every completely monotonic function can be written as a convergent integral against a nonnegative measure \(\mu \),

$$\begin{aligned} g(x) = \int _{[0,\infty )}e^{-\alpha x} d\mu (\alpha ). \end{aligned}$$

From this one obtains that the set of completely monotonic functions of squared distance is the cone spanned by the Gaussians and the constant function \(x \mapsto 1\).

In particular the commonly used Riesz potentials can be written in this way.
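
For instance, for \(s = 2\) the representation is simply \(r^{-2} = \int _0^\infty e^{-\alpha r^2}\, d\alpha \), with Lebesgue measure as \(\mu \). A quick numerical sanity check of this special case (our own illustration):

```python
import math

# Numerical sanity check (our own illustration): for s = 2 the Riesz kernel
# r^{-s} is a mixture of Gaussians with Lebesgue measure as mu, since
#   1 / r^2 = int_0^infty exp(-alpha r^2) d(alpha).
def riesz_via_gaussians(r, alpha_max=100.0, n=100000):
    # midpoint rule on [0, alpha_max]; the tail beyond alpha_max is negligible
    h = alpha_max / n
    return h * sum(math.exp(-(i + 0.5) * h * r * r) for i in range(n))

r = 1.3
print(riesz_via_gaussians(r), 1 / r**2)  # the two values agree closely
```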

We fix some more notation for the case that the infimum in \(P_{f,A}(C)\) is in fact a minimum, i.e. the minimizers of \(U_{f,A}(\cdot , C)\) are points of A. In this case, any such minimizer will be called a darkest point of A. Moreover,

$$\begin{aligned}{{\,\textrm{Dark}\,}}_A(C) = \Big \{p\in A \ : \ \sum _{c\in C} f(\Vert p- c\Vert ) = P_{f, A}(C)\Big \}\end{aligned}$$

will be called the set of darkest points of C. To explain this wording, recall the interpretation of the problem given in the beginning: we center lamps at the points in C, which then illuminate A. The polarization of A is the lowest level of brightness attained at any point of A, and any point realizing this level is a “darkest point”.

Note that requiring A to be compact is rather natural. Indeed, if A were unbounded, then for any configuration the polarization would equal \(N\cdot \inf f\), since the potential tends to this value as p moves away from C. If A were not closed, darkest points need not exist: consider, for example, A the open unit disc and C consisting only of the origin. In this case, the infimum \(P_{f, A}(C)\) is not attained at any point of A.

In Sect. 2 we provide some results connecting a locally optimal configuration to its set of darkest points. Theorem 2.1 states that the points of such a configuration are contained in the convex hull of the darkest points, while on the other hand Theorem 2.5 states that the darkest points are located either on the boundary of A or in the interior of the convex hull of the configuration. These restrictions provide necessary conditions for optimality.

In Sect. 3 we investigate mixed-integer approximations of the polarization problem providing upper and lower bounds. These are collected in Theorem 3.5. We then prove that these bounds indeed converge to \(\mathcal {P}_{f}(A)\) in Theorems 3.8 and 3.9.

In Sect. 4 we illustrate capabilities and limitations of the approach on some benchmark instances.

2 Darkest Points and Necessary Conditions

In this section we investigate structural properties that an optimal configuration needs to satisfy. These properties can be used to falsify the optimality of a given configuration and to reduce the search space for optimal configurations.

In particular, we have the following necessary condition that relates local optimality of a configuration to the set of its darkest points:

Theorem 2.1

If C is a locally optimal solution of (2), then

$$\begin{aligned}C\subset {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C).\end{aligned}$$

Proof

Suppose C is a configuration with a point \(c\in C\) such that \(c\notin {{\,\textrm{conv}\,}}({{\,\textrm{Dark}\,}}_A(C))\). In the following we construct, in an arbitrary neighbourhood of C, a new configuration \(C'\) such that \(P(C') > P(C)\); thus C cannot be locally optimal. Since f is continuous, the level set

$$\begin{aligned}S = \{p\in \mathbb {R}^n \ : \ U(p, C) = P(C)\}\end{aligned}$$

containing the darkest points is closed, and thus \({{\,\textrm{Dark}\,}}_A(C) = A\cap S\) is compact. Therefore \({{\,\textrm{conv}\,}}({{\,\textrm{Dark}\,}}_A(C))\) is a compact convex set and we can find a hyperplane \(H = \{x \ : \ a^\top x = b\}\) strictly separating this set from c such that \(a^\top c < b\). For \(\varepsilon > 0\) small enough, \(c' = c + \varepsilon a\) still satisfies \(a^\top c' < b\). We obtain a new configuration \(C' = C\cup \{c'\}\setminus \{c\}\). Note that for every neighbourhood of C there is a sufficiently small \(\varepsilon \) such that \(C'\) is contained in said neighbourhood. Moreover, for \(\varepsilon \) small enough we have \(\Vert c' -p\Vert < \Vert c - p\Vert \) for all points p in the non-negative halfspace of H. In particular, \(c'\) is closer to all of the darkest points than c, and since f is strictly monotone decreasing,

$$\begin{aligned}U(p, C') > U(p, C) \ge P(C)\end{aligned}$$

for all points p in the non-negative halfspace of H.

It remains to verify the inequality also on the other side of H. Since all the darkest points lie strictly on the positive side of H, every point \(p\in A\cap (H\cup H_{-})\) satisfies

$$\begin{aligned}U(p, C) > P(C).\end{aligned}$$

Since \(A\cap (H\cup H_{-})\) is compact and \(U(\cdot , C)\) is continuous, this yields

$$\begin{aligned}U(p, C) \ge \delta > P(C)\end{aligned}$$

for some constant \(\delta \) and all such p. By continuity of f, for \(\varepsilon \) small enough we can guarantee that

$$\begin{aligned}U(p, C') > P(C)\end{aligned}$$

for all \(p\in A\cap (H\cup H_{-})\). Altogether,

$$\begin{aligned}P(C') = \inf _{p\in A} U(p, C') > P(C).\end{aligned}$$

\(\square \)
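
A small numerical illustration of the theorem (our own toy instance: one lamp on A = [-1, 1] with a Gaussian kernel, with grid search standing in for exact optimization):

```python
import math

# Numerical illustration (our own toy instance): A = [-1, 1], N = 1 lamp,
# Gaussian f.  The best lamp position found by grid search must lie in the
# convex hull of its darkest points, as Theorem 2.1 predicts.
f = lambda x: math.exp(-x * x)
A = [-1 + 2 * i / 1000 for i in range(1001)]

def polarization(c):
    return min(f(abs(p - c)) for p in A)

c_star = max(A, key=polarization)        # grid search over conv(A)
p_star = polarization(c_star)
dark = [p for p in A if f(abs(p - c_star)) <= p_star + 1e-12]

# c_star lies between the darkest points (here: the two endpoints)
print(c_star, dark)
```

By symmetry the best lamp sits at the origin, its darkest points are the endpoints -1 and 1, and indeed the lamp lies in their convex hull.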

The formulated condition is very “unstable” in the following sense:

Proposition 2.2

Let C be a configuration such that \(C\subset {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C)\). Let \(c\in C\), let \(c'\ne c\) be arbitrary and set \(C' = C\cup \{c'\}\setminus \{c\}\). Then

  1. \(P(C') < P(C)\) and

  2. \(C'\nsubseteq {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C')\).

Proof

  1. Consider the hyperplane H with outer normal \(c - c'\) through c, oriented such that \(c'\) lies on the negative side. Since \(c\in {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C)\), there has to be a darkest point \(d\in {{\,\textrm{Dark}\,}}_A(C)\) in the non-negative halfspace of H (it might lie in H). Then \(\Vert c - d\Vert < \Vert c' -d\Vert \) and by monotonicity \(f(\Vert c - d\Vert ) > f(\Vert c' -d\Vert )\). The potentials \(U(d,C')\) and \(U(d,C)\) differ by \(f(\Vert c'-d\Vert )-f(\Vert c-d\Vert )\), therefore the above implies

    $$\begin{aligned} P(C') \le U(d, C') < U(d, C) = P(C). \end{aligned}$$

  2. Suppose \(C'\subseteq {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C')\). Then we can apply 1. to \(C'\) with the roles of c and \(c'\) reversed, which yields \(P(C) < P(C')\). Together with 1. this gives \(P(C')< P(C) < P(C')\), a contradiction.

\(\square \)
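
A numerical illustration of the proposition (our own toy instance, continuing the one-lamp example on A = [-1, 1]):

```python
import math

# Illustration of Proposition 2.2 (our own toy instance): A = [-1, 1] with a
# single Gaussian lamp at 0.  The darkest points are the endpoints -1 and 1,
# so C = {0} lies in conv(Dark_A(C)) = [-1, 1], and moving the single lamp
# anywhere strictly decreases the polarization.
f = lambda x: math.exp(-x * x)
A = [-1 + 2 * i / 1000 for i in range(1001)]
P = lambda C: min(sum(f(abs(p - c)) for c in C) for p in A)

print(P([0.0]), P([0.1]), P([-0.3]))  # any move of the lamp only hurts
```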

Optimization methods that only move single configuration points or single coordinates (like pattern search) may therefore converge to a configuration that is contained in the convex hull of its darkest points but is not locally optimal. It thus seems reasonable to only use optimization methods that are able to move several points at once. Another conclusion is the following corollary, which suggests that the number of optimization variables can be reduced to only \(N-1\) vectors.

Corollary 2.3

For given points \(C'\) with \(|C'| = N-1\) there is at most one point c such that \(\{c\}\cup C'\subset {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(\{c\}\cup C')\).

We can use Theorem 2.1 to study the structure of the darkest points even more. First, we discuss a way to find certificates for \(p\notin {{\,\textrm{Dark}\,}}_A(C)\). To this end, we recall that the conic hull of a set \(S\subseteq \mathbb {R}^n\) is given by \({{\,\textrm{cone}\,}}(S) =\bigcap _{K\supseteq S:\ K\text { is a convex cone}} K\).

Lemma 2.4

Let C be a configuration and \(p\in \mathbb {R}^n\) be an arbitrary point. Let

$$\begin{aligned} N(p,C) = \{p + v \ : \ v\ne 0 \text { and } v^\top w \ge 0 \text { for all } w\in {{\,\textrm{cone}\,}}\{p-c \ : \ c\in C\}\}. \end{aligned}$$
  1. For all \(q\in N(p, C)\) we have \(U(q, C) < U(p, C)\),

  2. if \(N(p, C) \cap A \ne \emptyset \), then \(p\notin {{\,\textrm{Dark}\,}}_A(C)\).

Proof

Write \(q = p + v\in N(p, C)\) with \(v\ne 0\). Then for all \(c\in C\) we have

$$\begin{aligned}\Vert c - (p+v)\Vert ^2 = \Vert c - p\Vert ^2 + 2(p-c)^\top v + \Vert v\Vert ^2 > \Vert c-p\Vert ^2.\end{aligned}$$

Since f is strictly monotone decreasing, we have \(U(q, C) < U(p, C)\). From this, the second claim follows immediately. \(\square \)

Observe that Lemma 2.4 (1) implies that \(N(p,C)\) contains only points whose potential is strictly smaller than the potential at p. If we recall the visualization of the polarization problem as placing light sources C to illuminate A, the set \(N(p,C)\) thus contains only points that are illuminated less than p itself. It is closely related to the idea of a physical shadow: \(p + {{\,\textrm{cone}\,}}\{p-c'\}\) with \(c'\in C\) can be seen as the set of points lying in the shadow cast by an object at p with respect to the light source \(c'\). In this interpretation, \(p + {{\,\textrm{cone}\,}}\{p-c \ :\ c \in C\}\) resembles the shadow with respect to all light sources simultaneously. Note that \(N(p,C)\) is defined by replacing \({{\,\textrm{cone}\,}}\{p-c \ :\ c \in C\}\) by its dual cone. With this we prove the following result, which further restricts the location of the darkest points:
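
The shadow picture can be checked numerically; a sketch on our own toy instance in the plane (all names are ours):

```python
import math

# Numerical check of Lemma 2.4 (our own toy instance in the plane): any
# step v in the dual cone of cone{p - c : c in C} leads to a point p + v
# with strictly smaller potential, i.e. "deeper into the shadow".
f = lambda x: math.exp(-x * x)
dist = lambda u, w: math.hypot(u[0] - w[0], u[1] - w[1])
U = lambda q, C: sum(f(dist(q, c)) for c in C)

C = [(-1.0, 0.0), (-1.0, 1.0)]
p = (0.0, 0.0)
v = (1.0, 0.0)
# v has non-negative inner product with every generator p - c of the cone
assert all(v[0] * (p[0] - c[0]) + v[1] * (p[1] - c[1]) >= 0 for c in C)

q = (p[0] + v[0], p[1] + v[1])
print(U(q, C), "<", U(p, C))
```

Stepping from p in the direction v moves away from both light sources, so the potential strictly decreases, as the lemma asserts.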

Theorem 2.5

Let C be a feasible configuration for (2). Then the points of \({{\,\textrm{Dark}\,}}_A(C)\) are either in the interior of \({{\,\textrm{conv}\,}}(C)\) or in \(\partial A\), i.e. \({{\,\textrm{Dark}\,}}_A(C) \subset {{\,\textrm{int}\,}}{{\,\textrm{conv}\,}}(C) \cup \partial A\). Moreover, if C is locally optimal for (2), then \({{\,\textrm{Dark}\,}}_A(C)\cap \partial A \ne \emptyset \).

Proof

Let \(p \in {{\,\textrm{Dark}\,}}_A(C)\) and assume \(p \notin {{\,\textrm{int}\,}}{{\,\textrm{conv}\,}}(C)\). Furthermore, let \(N(p,C)\) be defined as in Lemma 2.4. We can find a hyperplane \(H = \{x \in \mathbb {R}^n\ :\ a^\top x = \beta \}\) through p separating C from p; in particular, \(a^\top c \le \beta \) for all \(c\in C\). Then for all \(c \in C\)

$$\begin{aligned} a^\top (p-c) = a^\top p - a^\top c = \beta - a^\top c \ge 0, \end{aligned}$$

which shows that \(p + \lambda a \in N(p,C)\) for arbitrary \(\lambda >0\). If \(p \in {{\,\textrm{int}\,}}A\), so is \(p + \lambda a\) for \(\lambda \) sufficiently small. Then \(A \cap N(p,C) \ne \emptyset \), in contradiction to Lemma 2.4 (2). Thus \(p \in \partial A\) as claimed.

In addition, if C is also locally optimal for (2), then by Theorem 2.1 we immediately obtain \(C\subset {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C)\). Now assume \({{\,\textrm{Dark}\,}}_A(C)\cap \partial A = \emptyset \); then, as seen above, \({{\,\textrm{Dark}\,}}_A(C) \subseteq {{\,\textrm{int}\,}}{{\,\textrm{conv}\,}}C\) and we obtain

$$\begin{aligned}C \subseteq {{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C) \subseteq {{\,\textrm{int}\,}}{{\,\textrm{conv}\,}}C,\end{aligned}$$

which is a contradiction, since the extreme points of the polytope \({{\,\textrm{conv}\,}}C\) belong to the finite set C but not to \({{\,\textrm{int}\,}}{{\,\textrm{conv}\,}}C\). \(\square \)

Fig. 1

Illustration of Theorems 2.1 and 2.5. A is depicted in red, \({{\,\textrm{Dark}\,}}_A(C)\) in black and the configuration C in orange. The dashed lines depict the convex hulls \({{\,\textrm{conv}\,}}C\) and \({{\,\textrm{conv}\,}}{{\,\textrm{Dark}\,}}_A(C)\), whereas the black line depicts all points \(p\in \mathbb {R}^2\) such that \(U(p,C)=P(C)\).

To summarize, a locally optimal configuration C of (2) and its corresponding darkest points \({{\,\textrm{Dark}\,}}_A(C)\) satisfy the mutual containment properties illustrated in Fig. 1.

3 An MIP Approach to Polarization

The current section is dedicated to the development of two hierarchies of mixed-integer linear programs (MIPs) that approximate the maximal polarization of a compact set A with respect to a monotone decreasing and continuous function \(f:\ \mathbb {R}_+ \rightarrow \mathbb {R}_+\). The MIP that computes the lower bounds is constructive, i.e. solutions to this MIP are configurations whose polarization is bounded from below by the value of the MIP. The actual polarization of these configurations may very well exceed this lower bound by a significant margin, cf. Figure 3 for some numerical evidence.

First we give an equivalent description of problem (2). For this, we observe that by Theorem 2.1 any locally optimal point configuration is necessarily supported on \({{\,\textrm{conv}\,}}(A)\). Furthermore, we can eliminate the infimum by introducing additional constraints. The resulting optimization problem is then

$$\begin{aligned} \mathcal {P}_{f}(A)= \max _{x,C}\ {}&x \nonumber \\&C\in \left[ \begin{array}{cc} {{\,\textrm{conv}\,}}(A) \\ N \end{array} \right] \nonumber \\&x \le U_{f, A}(p, C) \quad \text { for all } p\in A, \end{aligned}$$
(3)

where \(\left[ \begin{array}{cc} X \\ N \end{array} \right] \) denotes the set of all multisets of size N with elements in X. It is now clear that the \(\sup \) is actually a \(\max \), since the feasible region can easily be made compact by bounding x from below (e.g. \(x \ge 0\)) without changing the value of the program.

3.1 MIP Hierarchies

We observe that problem (3) is an optimization problem with finitely many variables (namely x and C) but infinitely many constraints; it is a semi-infinite program (SIP) and therefore not solvable using standard solvers. In the remainder of this section we introduce two hierarchies of (tractable) MIPs that approximate \(\mathcal {P}(A)\) from above and below (see Theorem 3.5). For this we make use of the following concept of functions that “control” the difference of two values of f.

Definition 3.1

We call a family of functions \(g_{c, p}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) for \(c\in {{\,\textrm{conv}\,}}(A)\), \(p \in A\) a family of control functions (with respect to f, A) if for all \(c \in {{\,\textrm{conv}\,}}(A)\), \(p\in A\):

  1. \(g_{c, p}(0) = 0\),

  2. \(g_{c, p}\) is continuous and non-decreasing,

  3. \(|f(\Vert c - p\Vert ) - f(\Vert c' - p\Vert )| \le g_{c, p}(\Vert c-c'\Vert )\) for all \(c'\in {{\,\textrm{conv}\,}}(A)\),

  4. \(|f(\Vert c- p\Vert ) - f(\Vert c- p'\Vert )| \le g_{c, p}(\Vert p-p'\Vert )\) for all \(p'\in A\),

where \(\Vert \cdot \Vert \) denotes the standard Euclidean norm.

Note that f is related to a function K taking two points c, p as arguments: \(K(c,p) = f(\Vert c-p\Vert )\). A family of control functions allows us to control the way K changes as we vary either c or p.

This control will be an important ingredient of the proof of Theorem 3.5. For continuous functions it is related to bounding the slope of K, as the following example illustrates: suppose the function \(K(c, \cdot ) = f(\Vert c - \cdot \Vert )\) is Lipschitz continuous with Lipschitz constant L, uniformly in c. Then \(g_{c, p}(\varepsilon ) = L\cdot \varepsilon \) is a valid family of control functions for f.
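
For a Gaussian \(f(x) = e^{-ax^2}\) one can take \(L = \sup |f'| = \sqrt{2a/e}\); a quick check of this global Lipschitz control function (our own computation, names are ours):

```python
import math

# Global Lipschitz control function for a Gaussian f(x) = exp(-a x^2) (our
# own computation): |f'(x)| = 2 a x exp(-a x^2) is maximal at x = 1/sqrt(2a),
# which gives the Lipschitz constant L = sqrt(2a / e).
a = 1.0
f = lambda x: math.exp(-a * x * x)
L = math.sqrt(2 * a / math.e)
g = lambda eps: L * eps  # control function g_{c,p}(eps) = L * eps

# spot-check the control property |f(r) - f(r')| <= g(|r - r'|)
for r, rp in [(0.0, 0.5), (0.7, 0.71), (1.0, 2.5), (0.2, 3.0)]:
    assert abs(f(r) - f(rp)) <= g(abs(r - rp)) + 1e-12
print("Lipschitz control function verified")
```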

However, a global Lipschitz constant is not a very precise bound, as it ignores local information around the specific points c, p. Therefore we provide a more suitable family of control functions.

Proposition 3.2

For f monotone decreasing and continuous the following is a family of control functions:

$$\begin{aligned}g_{c, p}(\varepsilon ) = \max ({\hat{g}}_{c, p}(\varepsilon ), {\hat{g}}_{c, p}(-\varepsilon )),\end{aligned}$$

where

$$\begin{aligned}{\hat{g}}_{c, p}(x) = {\left\{ \begin{array}{ll}f(0) - f(\Vert c - p\Vert ) \text { if } x < - \Vert c- p\Vert ,\\ \left| f(\Vert c-p\Vert + x) - f(\Vert c-p\Vert ) \right| \text { otherwise}.\end{array}\right. }\end{aligned}$$

Proof

We fix c, p and write \(g = g_{c, p}\) and \({\hat{g}} = {\hat{g}}_{c, p}\). Clearly \(g(0) = {\hat{g}}(0) = 0\). Since f is continuous, so is g.

For \(x\in (-\infty , -\Vert c-p\Vert )\) the function \({\hat{g}}(x)\) is constant. For \(x\in (-\Vert c-p\Vert , 0)\) we have

$$\begin{aligned}{\hat{g}}(x) = f(\Vert c-p\Vert + x) - f(\Vert c-p\Vert ),\end{aligned}$$

which is decreasing since f is decreasing. For \(x\in (0, \infty )\) we have

$$\begin{aligned}{\hat{g}}(x) = f(\Vert c-p\Vert ) - f(\Vert c-p\Vert + x),\end{aligned}$$

which is increasing since f is decreasing. Overall, \(g(\varepsilon ) = \max ({\hat{g}}(\varepsilon ), {\hat{g}}(-\varepsilon ))\) is a non-decreasing function on \(\mathbb {R}_+\).

By symmetry of the roles of the two arguments, it is sufficient to prove that g provides an upper bound for \(\Delta = |f(\Vert c-p\Vert ) - f(\Vert c-p'\Vert )|\) for all \(p'\in {{\,\textrm{conv}\,}}(A)\). To this end, we use the triangle inequalities

$$\begin{aligned}\Vert c-p\Vert - \Vert p'-p\Vert \le \Vert c-p'\Vert \le \Vert c-p\Vert + \Vert p' - p\Vert \end{aligned}$$

and that f is a decreasing function. Then, on the one hand if \(\Vert c-p\Vert \le \Vert c-p'\Vert \), we have

$$\begin{aligned} \Delta&= f(\Vert c-p\Vert ) - f(\Vert c-p'\Vert ) \\&\le f(\Vert c-p\Vert ) - f(\Vert c-p\Vert + \Vert p'-p\Vert ) = {\hat{g}}(\Vert p'-p\Vert ) \le g(\Vert p'-p\Vert ). \end{aligned}$$

On the other hand, if \(\Vert c-p\Vert \ge \Vert c-p'\Vert \), we obtain

$$\begin{aligned} \Delta&= - f(\Vert c-p\Vert ) + f(\Vert c-p'\Vert ) \\&\le - f(\Vert c-p\Vert ) + f(\Vert c-p\Vert - \Vert p-p'\Vert ) = {\hat{g}}(-\Vert p-p'\Vert ) \le g(\Vert p-p'\Vert ). \end{aligned}$$

\(\square \)
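
The control functions of Proposition 3.2 are easy to implement; a sketch for a Gaussian kernel (our own code, verifying the defining bound on random sample points):

```python
import math, random

# The control functions of Proposition 3.2 for a Gaussian f (our own code);
# we verify the defining properties on random points of the real line.
a = 1.0
f = lambda x: math.exp(-a * x * x)

def g_hat(d, x):
    # piecewise definition from Proposition 3.2, with d = ||c - p||
    if x < -d:
        return f(0) - f(d)
    return abs(f(d + x) - f(d))

def g(d, eps):
    return max(g_hat(d, eps), g_hat(d, -eps))

random.seed(1)
for _ in range(2000):
    c = random.uniform(-2, 2)
    p = random.uniform(-2, 2)
    pp = random.uniform(-2, 2)
    # property 4: varying the evaluation point is controlled by g
    assert abs(f(abs(c - p)) - f(abs(c - pp))) <= g(abs(c - p), abs(p - pp)) + 1e-12
print("control-function properties verified")
```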

For explicit computations we need to discretize two aspects of the problem. Firstly, we discretize the set of possible point configurations. For this we choose a finite sample \(\Lambda \subset {{\,\textrm{conv}\,}}(A)\) and only optimize over

$$\begin{aligned} C \in \left[ \begin{array}{cc} \Lambda \\ N \end{array} \right] . \end{aligned}$$
(4)

Secondly, we replace the infinite number of constraints, parameterized by A, by a finite subcollection. For this we again choose a finite sample \(\Gamma \subset A\), and only consider the inequalities

$$\begin{aligned} x \le U(p,C) \quad \text { for all } p \in \Gamma . \end{aligned}$$
(5)

However, this naively sampled problem is not necessarily connected to the original problem, since we enforce only a subset of the infinitely many constraints and allow only a finite number of configurations. Either one of these changes alone would provide a valid bound, but unfortunately the two work in opposite directions. We now show how to overcome this problem by utilizing the above family of control functions to obtain lower and upper bounds on the original problem.

Let us first consider lower bounds on (3). It is clear that we can restrict the choice of configurations to be supported on a finite sample \(\Lambda \) of \({{\,\textrm{conv}\,}}(A)\) as in (4) and obtain a program that computes a lower bound.

Discretizing the constraints is the harder part, since removing constraints lets the maximum grow. The following lemma shows how a slight variation of the discretized constraints for a finite sample \(\Gamma \) of A implies the validity of all of the infinitely many original constraints.

Lemma 3.3

Let \(g_{c, p}\) be a family of control functions, let \(\varepsilon > 0\), let \(\Lambda \) be an arbitrary finite sample of \({{\,\textrm{conv}\,}}(A)\) and let \(\Gamma \) be an \(\varepsilon \)-net of A. Furthermore, suppose \(x\in \mathbb {R}\) and \(C\in \left[ \begin{array}{cc} \Lambda \\ N \end{array} \right] \) satisfy

$$\begin{aligned}x \le \sum _{c\in \Lambda } \mathbbm {1}_C(c) \cdot \left( f(\Vert c-p\Vert ) - g_{c, p}(\varepsilon )\right) \quad \text { for all } p\in \Gamma .\end{aligned}$$

Then

$$\begin{aligned}x\le \sum _{c\in \Lambda } \mathbbm {1}_C(c)\cdot f(\Vert c-p\Vert ) \quad \text { for all } p\in A.\end{aligned}$$

Proof

Let \(p\in A\) be arbitrary and \(n(p)={{\,\textrm{argmin}\,}}_{\bar{p}\in \Gamma }\{\Vert p-\bar{p}\Vert \}\) denote the closest sample point to \(p\in A\). Note that \(\Vert p-n(p)\Vert < \varepsilon \) since \(\Gamma \) is an \(\varepsilon \)-net. Then

$$\begin{aligned} \sum _{c\in \Lambda } \mathbbm {1}_C(c) f(\Vert c-p\Vert )&= \sum _{c\in \Lambda } \mathbbm {1}_C(c)\cdot \left( f(\Vert c-p\Vert ) - f(\Vert c-n(p)\Vert )\right) \\&\qquad +\sum _{c\in \Lambda } \mathbbm {1}_C(c) \cdot \left( f(\Vert c-n(p)\Vert ) - g_{c, n(p)}(\varepsilon )\right) \\&\qquad +\sum _{c\in \Lambda }\mathbbm {1}_C(c)\cdot g_{c, n(p)}(\varepsilon )\\&\ge - \sum _{c \in \Lambda } \mathbbm {1}_C(c)\cdot g_{c, n(p)}(\Vert p - n(p)\Vert )\\&\qquad + x\\&\qquad + \sum _{c\in \Lambda }\mathbbm {1}_C(c)\cdot g_{c, n(p)}(\varepsilon ), \end{aligned}$$

which is at least x, since \(g_{c,n(p)}\) is non-decreasing and \(\Vert p-n(p)\Vert < \varepsilon \). \(\square \)

Conversely, when deriving upper bounds on (3), we cannot simply restrict to a finite sample \(\Lambda \) of \({{\,\textrm{conv}\,}}(A)\) to approximate the above SIP. Indeed, this would restrict the set of feasible solutions of (3) and thereby lower the maximum instead. Again, the following lemma provides a way around this problem using a variation of the constraints.

Lemma 3.4

Let \(g_{c, p}\) be a family of control functions, let \(\varepsilon > 0\), let \(\Lambda \) be an \(\varepsilon \)-net of \({{\,\textrm{conv}\,}}(A)\) and let \(\Gamma \subseteq A\) be finite. Furthermore, suppose \(C\in \left[ \begin{array}{cc} {{\,\textrm{conv}\,}}(A) \\ N \end{array} \right] \) and x satisfy

$$\begin{aligned}x \le U(p, C) = \sum _{c\in C} f(\Vert c - p\Vert ) \quad \text { for all }p\in \Gamma .\end{aligned}$$

Then, there exists a configuration \(C'\in \left[ \begin{array}{cc} \Lambda \\ N \end{array} \right] \) such that

$$\begin{aligned}x \le \sum _{c\in C'} \left( f(\Vert c - p\Vert ) + g_{c, p}(\varepsilon )\right) \quad \text { for all }p\in \Gamma .\end{aligned}$$

Proof

Let \(C' = \{\{n(c) \ : \ c\in C\}\}\) where \(n(c) = {{\,\textrm{argmin}\,}}_{c'\in \Lambda } \Vert c - c'\Vert \). Then

$$\begin{aligned} \sum _{c\in C'} \left( f(\Vert c - p\Vert ) + g_{c, p}(\varepsilon )\right)&= \sum _{c\in C} \left( f(\Vert n(c) - p\Vert ) - f(\Vert c - p\Vert )\right) \\&\qquad + \sum _{c\in C} f(\Vert c - p\Vert ) + \sum _{c\in C} g_{n(c), p}(\varepsilon )\\&\ge -\sum _{c\in C} g_{n(c), p}(\Vert c - n(c)\Vert ) + x + \sum _{c\in C} g_{n(c), p}(\varepsilon ) \ge x, \end{aligned}$$

where the last inequality holds since \(g_{n(c),p}\) is non-decreasing and \(\Vert c-n(c)\Vert <\varepsilon \) as \(\Lambda \) is an \(\varepsilon \)-net of \({{\,\textrm{conv}\,}}(A)\). \(\square \)

Now we can prove the main result of this section.

Theorem 3.5

Let \(\varepsilon _\Lambda , \varepsilon _\Gamma > 0\) and \(\Lambda \) be an \(\varepsilon _\Lambda \)-net of \({{\,\textrm{conv}\,}}(A)\) and \(\Gamma \) be an \(\varepsilon _\Gamma \)-net of A. Furthermore, let \(g_{c, p}\) be a family of control functions. Then we have the following:

$$\begin{aligned} \max \ {}&x \nonumber \\&y \in \{0,\ldots , N\}^\Lambda \nonumber \\&\mathbbm {1}^\top y = N \nonumber \\&x \le \sum _{c\in \Lambda } y_c\cdot (f(\Vert c - p\Vert ) - g_{c, p}(\varepsilon _\Gamma )) \quad \text { for all } p\in \Gamma \end{aligned}$$
(6a)

$$\begin{aligned} \le \ \max \ {}&x \nonumber \\&y \in \{0,\ldots , N\}^\Lambda \nonumber \\&\mathbbm {1}^\top y = N \nonumber \\&x \le \sum _{c\in \Lambda } y_c\cdot f(\Vert c - p\Vert ) \quad \text { for all } p\in A \end{aligned}$$
(6b)

$$\begin{aligned} \le \ \mathcal {P}(A) \end{aligned}$$
(6c)

$$\begin{aligned} \le \ \max \ {}&x \nonumber \\&C \in \left[ \begin{array}{cc} {{\,\textrm{conv}\,}}(A) \\ N \end{array} \right] \nonumber \\&x \le \sum _{c\in C} f(\Vert c - p\Vert ) \quad \text { for all } p\in \Gamma \end{aligned}$$
(6d)

$$\begin{aligned} \le \ \max \ {}&x \nonumber \\&y\in \{0,\ldots , N\}^\Lambda \nonumber \\&\mathbbm {1}^\top y = N \nonumber \\&x \le \sum _{c\in \Lambda } y_c\cdot (f(\Vert c - p\Vert ) + g_{c, p}(\varepsilon _\Lambda )) \quad \text { for all } p\in \Gamma \end{aligned}$$
(6e)

Proof

We show that feasible solutions of the left-hand sides are also feasible for the right-hand sides with the same objective value, justifying the asserted inequalities. First, observe that Lemma 3.3 implies that a feasible solution x, y of (6a) is also feasible for (6b), and the objective values coincide. Next, we consider a feasible solution x, y of (6b) and observe that y encodes a multiset \(C\in \left[ \begin{array}{cc} \Lambda \\ N \end{array} \right] \subseteq \left[ \begin{array}{cc} {{\,\textrm{conv}\,}}(A) \\ N \end{array} \right] \). Moreover, x, C satisfy the constraints in (3) with the same objective value x. The next inequality follows rather immediately, since (6d) is a relaxation of (3) obtained by dropping the constraints for \(p\in A\setminus \Gamma \). Lastly, if x, C is a feasible solution of (6d), we apply Lemma 3.4 to obtain a multiset \(C'\in \left[ \begin{array}{cc} \Lambda \\ N \end{array} \right] \) satisfying the constraints of (6e). Then, by encoding \(C'\) through \(y\in \{0,\ldots , N\}^\Lambda \) with \(\mathbbm {1}^\top y = N\), we obtain a feasible solution to (6e) with the same objective value x. \(\square \)
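
On very small instances the two bounding programs can be evaluated by brute force instead of an MIP solver; a sketch under our own toy setup (A = [0, 1], N = 2, Gaussian f, control functions as in Proposition 3.2; all names are ours):

```python
import math
from itertools import combinations_with_replacement

# Brute-force evaluation of the bounding programs (6a) and (6e) on a tiny
# instance (our own code; realistic sizes require an actual MIP solver).
# Setup: A = [0, 1], N = 2, Gaussian f, control functions of Proposition 3.2.
a = 1.0
f = lambda x: math.exp(-a * x * x)

def g(d, eps):
    # control function g_{c,p}(eps) of Proposition 3.2, with d = ||c - p||
    lo = f(0) - f(d) if eps > d else abs(f(d - eps) - f(d))
    return max(lo, abs(f(d + eps) - f(d)))

N, eps_L, eps_G = 2, 0.1, 0.05
Lam = [i * eps_L for i in range(11)]   # eps_L-net of conv(A) = [0, 1]
Gam = [i * eps_G for i in range(21)]   # eps_G-net of A

def bound(sign, eps):
    # sign = -1: program (6a), a lower bound; sign = +1: program (6e), upper
    best = -math.inf
    for C in combinations_with_replacement(Lam, N):
        val = min(sum(f(abs(c - p)) + sign * g(abs(c - p), eps) for c in C)
                  for p in Gam)
        best = max(best, val)
    return best

lower, upper = bound(-1, eps_G), bound(+1, eps_L)
print(lower, "<= P(A) <=", upper)
```

Enumerating all multisets over the net replaces the integer variables y; the sandwich property of Theorem 3.5 then appears directly as lower <= upper.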

Let us briefly comment on the computational complexity of the mixed-integer programs (6a) and (6e). It is worth noting that MIP solvers typically handle binary variables significantly faster than general integer variables. In this regard, the integral variables \(y\in \{0,\ldots ,N\}^\Lambda \) in both (6a) and (6e) can be replaced by \(|\Lambda | \cdot \lceil \log _2(N+1)\rceil \) binary variables.
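
The binary encoding behind this substitution can be sketched as follows (our own code, not from the paper):

```python
import math

# Sketch of the binary encoding mentioned above (our own code): an integral
# variable y_c in {0, ..., N} can be represented by ceil(log2(N + 1)) binary
# variables b_j via y_c = sum_j 2^j b_j (adding the linear constraint
# y_c <= N when N + 1 is not a power of two).
N = 5
n_bits = math.ceil(math.log2(N + 1))

def encode(y):
    return [(y >> j) & 1 for j in range(n_bits)]

def decode(bits):
    return sum(b << j for j, b in enumerate(bits))

assert all(decode(encode(y)) == y for y in range(N + 1))
print(n_bits)  # number of binary variables per point of Lambda
```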

Moreover, in the lower bound of Theorem 3.5 the vector y can be chosen as \(y\in \{0,1\}^\Lambda \), which still provides a (potentially worse) lower bound and reduces the number of binary variables significantly. Unfortunately, a similar simplification is not immediately possible for the upper bound. However, we introduce another concept which reduces the computational complexity of the upper bound in a similar fashion.

Definition 3.6

A finite subset \(\Lambda \subset \mathbb {R}^n\) is called an \((\varepsilon , k)\)-net of A if

  1. \(\Lambda \subset A\),

  2. for every \(p\in A\) there are at least k distinct points \(p_1, \dots , p_k\in \Lambda \) such that \(\Vert p_i - p\Vert < \varepsilon \).

Using an \((\varepsilon _\Lambda , N)\)-net we obtain a hierarchy similar to Theorem 3.5 restricting the possible entries of y to \(\{0,1\}\).

Proposition 3.7

Let \(\varepsilon _\Lambda , \varepsilon _\Gamma > 0\) and \(\Lambda \) be an \((\varepsilon _\Lambda , N)\)-net of \({{\,\textrm{conv}\,}}(A)\) and \(\Gamma \) be an \(\varepsilon _\Gamma \)-net of A. Furthermore, let \(g_{c, p}\) be a family of control functions. Then,

$$\begin{aligned} (6d) \le \max \ {}&x \\&y\in \{0,1\}^\Lambda \\&\mathbbm {1}^\top y = N \\&x \le \sum _{c\in \Lambda } y_c\cdot (f(\Vert c - p\Vert ) + g_{c, p}(\varepsilon _\Lambda )) \quad \text { for all } p\in \Gamma \end{aligned}$$

Proof

The proof works similarly to the proof of Theorem 3.5, replacing \(C' = \{\{n(c) \ : \ c\in C\}\}\) in the proof of Lemma 3.4 by a set \(C'\) of N distinct points of \(\Lambda \). This is possible since \(\Lambda \) is an \((\varepsilon _\Lambda , N)\)-net (see Definition 3.6). \(\square \)

A trivial example of an \((\varepsilon , N)\)-net is obtained as a multiset consisting of N copies of an \(\varepsilon \)-net. In practice, however, there are usually solutions that need fewer points, albeit more than a classical \(\varepsilon \)-net.
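
A simple construction with distinct points can be sketched on an interval (our own illustration): a grid of spacing \(\varepsilon /k\) satisfies the \((\varepsilon , k)\)-net property of Definition 3.6.

```python
# An (eps, k)-net on A = [0, 1] built from distinct points (our own
# illustration): a grid with spacing eps / k guarantees that every point of
# A has at least k distinct grid points within distance < eps.
eps, k = 0.1, 3
h = eps / k
grid = [i * h for i in range(int(round(1 / h)) + 1)]

def near_count(p):
    return sum(1 for q in grid if abs(q - p) < eps)

worst = min(near_count(i / 1000) for i in range(1001))
print(worst, ">=", k)  # even boundary points see at least k grid points
```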

3.2 Convergence Results

Having established upper and lower bounds on \(\mathcal {P}(A)\) through the hierarchies presented in Theorem 3.5, we now study the quality of these bounds. To this end, we show in this section that solutions of the bounding problems (6a) and (6e) converge, as \(\varepsilon _\Lambda , \varepsilon _\Gamma \) both tend to 0, to a solution of the original problem (3). Both proofs rely in large parts on the proof of Lemma 6.1 in [19], which establishes similar convergence for more general semi-infinite programs, but include minor necessary modifications. At first, we focus on the lower bounds, i.e. we show that (6a) converges to (6b) as \(\varepsilon _\Gamma \rightarrow 0\):

Theorem 3.8

Let \((\varepsilon _k)\) be a non-negative sequence converging to 0. Furthermore, for every \(k\in \mathbb {N}\) choose an \(\varepsilon _k\)-net \(\Gamma _k\) of A. Then any accumulation point of a sequence \((x_k,y_k)_{k \in \mathbb {N}}\) of optimal solutions of (6a) w.r.t. \(\Gamma _k\) and \(\varepsilon _k\) is an optimal solution of (6b).

Proof

Let \((\bar{x},{\bar{y}})\) be an accumulation point of \((x_k,y_k)\). By passing to a subsequence we can assume that \((x_k,y_k) \rightarrow (\bar{x},\bar{y})\) as \(k\rightarrow \infty \). We now prove that \((\bar{x},\bar{y})\) is feasible and in fact optimal for (6b):

Consider an arbitrary \(p\in A\) and observe that, since \(\Gamma _k\) is an \(\varepsilon _k\)-net of A, there exists a sequence \((p_k)\) with \(p_k\in \Gamma _k\) such that \(p_k\rightarrow p\) as \(k\rightarrow \infty \). We observe further that for all k we have

$$\begin{aligned}x_k \le \sum _{c\in \Lambda } (y_k)_c \cdot (f(\Vert c - p_k\Vert ) - g_{c, p_k}(\varepsilon _k)) \le \sum _{c\in \Lambda } (y_k)_c\cdot f(\Vert c - p_k\Vert )\end{aligned}$$

and by taking limits

$$\begin{aligned}{\bar{x}} \le \sum _{c\in \Lambda }{\bar{y}}_c \cdot f(\Vert c - p\Vert ).\end{aligned}$$

Hence, \((\bar{x},\bar{y})\) is feasible for (6b).

Now let \((x, y)\) be an arbitrary solution of (6b). Since A is compact and \(\varepsilon _k>0\), we know that \(g_c = \max _{p\in A} g_{c, p}\) is a continuous, monotonically non-decreasing function with \(g_c(0) = 0\). We now observe that

$$\begin{aligned}(x- \sum _{c\in \Lambda } y_c\cdot g_c(\varepsilon _k), y)\end{aligned}$$

is feasible for (6a) with respect to \(\Gamma _k\). Since \((x_k, y_k)\) is an optimal solution of (6a), we have \(x_k\ge x - \sum _{c\in \Lambda } y_c\cdot g_c(\varepsilon _k)\). Consequently, as \(g_c(0)=0\), in the limit we obtain \(\bar{x} \ge x\). Since x was chosen arbitrarily, we conclude that \((\bar{x},\bar{y})\) is indeed optimal for (6b). \(\square \)

Note that the convergence of (6b) to (6c) as \(\varepsilon _\Lambda \rightarrow 0\) follows directly, since the utility function and f are continuous. Thus, Theorem 3.8 implies the convergence of (6a) to (6c), i.e., the value of (6a) tends to \(\mathcal {P}(A)\) as \(\varepsilon _\Lambda ,\varepsilon _\Gamma \rightarrow 0\).

Moreover, by the same arguments we conclude the convergence of (6d) to (6c) as \(\varepsilon _\Gamma \rightarrow 0\), and thus only one proof of convergence remains, namely that (6e) converges to (6d) as \(\varepsilon _\Lambda \rightarrow 0\).

One difficulty of the following theorem is that the feasible solutions take different forms as the sample \(\Lambda \) varies. Feasible solutions of (6e) have the form \(y\in \{0, \dots , N\}^\Lambda \) with \(\mathbbm {1}^\top y = N\), while feasible solutions of (6d) are N-point multisets supported on \({{\,\textrm{conv}\,}}(A)\). These objects do not permit an easy discussion of convergence. However, both notions can be translated into an element \(\omega \in ({{\,\textrm{conv}\,}}(A))^N\), which is independent of \(\Lambda \) and allows a discussion of convergence. Note that \(\omega \) can canonically be translated back into a multiset.
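The translation used here can be made concrete with a small sketch (function names are hypothetical): a multiplicity vector y over the sample \(\Lambda\) becomes an N-tuple \(\omega\) by listing each sample point according to its multiplicity, and a tuple becomes a multiset by forgetting the order.

```python
from collections import Counter

def y_to_omega(y, sample):
    """Turn a multiplicity vector y over the sample (with sum(y) == N)
    into the N-tuple omega listing each sample point y_c times."""
    omega = []
    for c, mult in zip(sample, y):
        omega.extend([c] * mult)
    return tuple(omega)

def omega_to_multiset(omega):
    """Forget the order: an N-tuple becomes a multiset (here a Counter
    mapping each point to its multiplicity)."""
    return Counter(omega)

sample = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0)]
omega = y_to_omega([2, 0, 1], sample)
```

Since the tuple representation lives in \(({{\,\textrm{conv}\,}}(A))^N\) regardless of the underlying sample, it supports the limit arguments of the following theorem.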

Theorem 3.9

Let \((\varepsilon _k)\) be a non-negative sequence converging towards 0. Furthermore, for every \(k\in \mathbb {N}\) choose an \(\varepsilon _k\)-net \(\Lambda _k\) of \({{\,\textrm{conv}\,}}(A)\). Let \((x_k,y_k)\) be a sequence of optimal solutions of (6e) w.r.t. \(\Lambda _k\), \(\varepsilon _k\). Identifying each \(y_k\) with \(\omega _k\in ({{\,\textrm{conv}\,}}(A))^N\), any accumulation point \(({\bar{x}},\bar{\omega })\) of this sequence corresponds to an optimal solution of (6d) by identification of \({\bar{\omega }}\) with a multiset.

Proof

The proof is similar to the proof of Theorem 3.8. Note that, since the order of elements is irrelevant for the problems discussed, we can regard elements of \(({{\,\textrm{conv}\,}}(A))^N\) either as tuples or as multisets, depending on the context. Suppose \((x_k, \omega _k)\) has an accumulation point \(({\bar{x}}, {\bar{\omega }})\). By passing to a subsequence we may assume that \((x_k, \omega _k)\rightarrow ({\bar{x}}, {\bar{\omega }})\). Consider the continuous function \(g_p = \max _{c\in {{\,\textrm{conv}\,}}(A)} g_{c, p}\) with \(g_p(0) = 0\). Then we have for all k and \(p\in \Gamma \):

$$\begin{aligned} x_k&\le \sum _{c\in \Lambda _k} (y_k)_c\cdot (f(\Vert c - p\Vert ) + g_{c, p}(\varepsilon _k))\\&\le \sum _{i=1}^N f(\Vert (\omega _k)_i - p\Vert ) + g_p(\varepsilon _k) \end{aligned}$$

By taking limits we obtain

$$\begin{aligned}{\bar{x}} \le \sum _{i=1}^N f(\Vert {\bar{\omega }}_i - p\Vert )\end{aligned}$$

for all \(p\in \Gamma \). Thus \(({\bar{x}}, {\bar{\omega }})\) is feasible for (6d).

Now suppose \((x, \omega )\) is an arbitrary solution of (6d). Then by Lemma 3.4 there exists \(\omega _k'\) such that \((x, \omega _k')\) is a feasible solution of (6e) w.r.t. \(\Lambda _k\). Since \((x_k, \omega _k)\) is an optimal solution, we have \(x_k \ge x\) and, by taking limits, \({\bar{x}} \ge x\). Therefore \(({\bar{x}}, {\bar{\omega }})\) is indeed optimal for (6d). \(\square \)

Note that the proofs of Theorems 3.8 and 3.9 still work if we restrict y to be binary, as was discussed at the end of Sect. 3.1.

Combining Theorems 3.8 and 3.9, we conclude that by choosing suitable sequences \((\varepsilon _\Gamma )_k, (\varepsilon _\Lambda )_k\) we can, in theory, bound the value of \(\mathcal {P}(A)\) as tightly as needed. However, solving the respective mixed-integer linear programs in practice poses a computational challenge.
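To make explicit what the discretized lower-bound problem optimizes, the following sketch (not the paper's MIP, merely a brute-force stand-in under the same discretization) enumerates all N-multisets supported on \(\Lambda\) and keeps the one whose darkest sample point of \(\Gamma\) is brightest:

```python
import math
from itertools import combinations_with_replacement

def polarization(config, Gamma, f):
    """Discrete polarization: the darkest sample point of Gamma under the
    potential generated by the configuration."""
    return min(sum(f(math.dist(p, c)) for c in config) for p in Gamma)

def brute_force_lower_bound(Lambda, Gamma, f, N):
    """Enumerate all N-multisets supported on the sample Lambda and keep
    the one maximizing the discrete polarization.  Only viable for tiny
    samples; the MIP formulation replaces this enumeration."""
    return max(
        (polarization(C, Gamma, f), C)
        for C in combinations_with_replacement(Lambda, N)
    )
```

The enumeration has \(\binom{|\Lambda| + N - 1}{N}\) candidates, which is exactly the combinatorial explosion the branch-and-bound machinery of a MIP solver is meant to tame.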

4 Computational Results

This section presents numerical experiments illustrating the capabilities and limits of the MIP approach presented in this paper. All computations were performed using Gurobi on an HP DL380 Gen9 server with two Intel(R) Xeon(R) E5-2660v CPUs @ 2.00 GHz (each with 14 cores) and 256 GB RAM. We first focus on a simple illustrative example, where A is an equilateral triangle and the configuration size is \(N=3\). In addition, we chose \(f(x) = e^{-5\Vert x\Vert ^2}\) as our potential function and \(\varepsilon _\Gamma =0.014, \varepsilon _\Lambda = \varepsilon _\Gamma /3\) as the respective discretization widths of \(\Gamma \subseteq A\) and \(\Lambda \subseteq {{\,\textrm{conv}\,}}(A)\). Lastly, we restrict both (6a) and (6e) to binary variables \(y\in \{0,1\}^\Lambda \) instead of integral \(y\in \{0,\ldots , N\}^\Lambda \), as discussed below Theorem 3.5. Since we expect the resulting configuration to consist of three separate points, this should not significantly impact the quality of the bounds.

We illustrate the configuration given by (6a) in Fig. 2; it was obtained after approximately 10 hours.

Fig. 2

Optimal configuration for (6a) with \(\varepsilon = 0.014\) and a heatmap of the respective f-potential (from dark blue over green to yellow). The points of the configuration are represented by orange circles

We continue by assessing the numerical evidence for convergence in the above example. To this end, we illustrate the quality of the binary versions of both (6a) and (6e) for decreasing values of \(\varepsilon _\Lambda \) and \(\varepsilon _\Gamma \). Here, the binary variant of (6e) was derived from Proposition 3.7. To be precise, for every \(\varepsilon \in \{0.04, 0.038, \dots , 0.014\}\) we computed the lower bound using \(\varepsilon _\Lambda =\varepsilon /3\), \(\varepsilon _\Gamma = \varepsilon \) and the upper bound using \(\varepsilon _\Gamma = \varepsilon _\Lambda = \varepsilon \). We chose these scalings for better comparability, since the \((\varepsilon _\Lambda , 3)\)-net in the upper-bound case contains more sample points, and therefore yields more variables, than an \((\varepsilon _\Lambda , 1)\)-net. Furthermore, we used scaled versions of the \(A_2\) lattice, complemented with additional sample points on the boundary, to generate the samples \(\Lambda \) and \(\Gamma \). This construction ensures that \(\Lambda \) and \(\Gamma \) are indeed \(\varepsilon _\Lambda \)- and \(\varepsilon _\Gamma \)-nets, respectively. The obtained bounds are visualized in Fig. 3.
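Generating such an \(A_2\)-based sample can be sketched as follows (a hypothetical helper, clipping the lattice to a bounding box; the \(A_2\) lattice with nearest-neighbour distance s has covering radius \(s/\sqrt{3}\), so \(s = \varepsilon\sqrt{3}\) yields an \(\varepsilon\)-net of the interior, while boundary points must be added separately as described above):

```python
import math

def a2_points_in_box(s, xmax, ymax):
    """Points of the A2 (hexagonal) lattice with nearest-neighbour
    distance s that fall into [0, xmax] x [0, ymax].  Lattice basis:
    (s, 0) and (s/2, s*sqrt(3)/2), so consecutive rows are shifted
    by s/2 and separated by s*sqrt(3)/2."""
    pts = []
    rows = int(ymax / (s * math.sqrt(3) / 2)) + 1
    for r in range(rows):
        y = r * s * math.sqrt(3) / 2
        x = (r % 2) * s / 2  # every second row is shifted by s/2
        while x <= xmax:
            pts.append((x, y))
            x += s
    return pts
```

In practice the box would be replaced by the shape \({{\,\textrm{conv}\,}}(A)\) (or A), and the lattice points falling outside would be discarded before the boundary points are appended.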

It is apparent that lower values of \(\varepsilon \) do not always yield better bounds, although there is a clearly visible trend toward closing the gap between the bounds, as expected from the convergence results established in Theorems 3.8 and 3.9. A drawback of this approach is the computational runtime of the respective MIPs, which increases drastically with the sample sizes of \(\Gamma \) and \(\Lambda \): from a few seconds for \(\varepsilon =0.04\) to 10 hours for \(\varepsilon =0.014\).

Fig. 3

Upper and lower bounds computed with decreasing values of \(\varepsilon \) and the respective running optimum (dashed lines) as well as an approximate polarization of the lower bound configuration

As an additional academic example, we apply the same approach for different suitable choices of \(\varepsilon = \varepsilon _\Lambda = \varepsilon _\Gamma \) and for various convex, non-convex, and even disconnected sets A, to showcase the wide applicability of our approach. We illustrate the polarizations derived via the binary approximation of our lower-bound MIP (6a) in Fig. 4.

Fig. 4

Optimal configurations of (6a) for different A in orange with a heatmap of the respective potential (from dark blue over green to yellow). The border of the respective shape A is highlighted in blue (from left to right: ball, triangles, non-convex shape)

Moreover, we briefly summarize the computational results for these additional shapes A in Table 1 below. The respective sample widths were chosen such that the corresponding MIPs could be solved in reasonable time.

Table 1 Computational results of polarizations for exemplary shapes

We note that the shape of A significantly impacts the runtime of our MIP approach. It seems that the large symmetry group of the ball may contribute to a longer runtime, as good solutions may be found throughout the branch-and-bound tree used by solvers such as Gurobi. If so, symmetry reduction techniques may lead to substantial improvements.

5 Outlook

We have seen in Sect. 2 that the location of the darkest points and the location of the points of a locally optimal configuration are intertwined. We suspect that these results can be extended, in particular by utilizing symmetries of A or requiring A to be convex or even a polytope. Furthermore, it would be interesting to extend these results to other choices of D.

However, it is clear that there will be limitations to this approach. Consider, for example, \(A = D = S^{n-1}\), the unit sphere, where the convexity condition of Theorem 2.1 is not applicable. Nevertheless, using different techniques, information on the set of darkest points for certain configurations on \(S^{n-1}\) has been obtained, e.g. for regular simplices [8, Thm. 2.4] and for m-stiff and strongly m-sharp configurations in [20, Thms. 4.3 and 4.5], extending previous results in [2, 21].

In this paper, we have not dealt with explicit computations of locally or globally optimal point configurations, even on simple sets such as n-gons or the unit ball. However, numerical experiments suggest that such configurations exhibit some structure, and we hope that extensions of the results in Sect. 2 can be utilized to obtain proofs of optimality for some configurations. Here, we would like to highlight one result in this direction that we are aware of, namely that for certain Riesz potentials of modest decay and with A chosen as the closed d-dimensional unit ball, the optimal point configuration consists of N copies of the origin (see [1, Thm. 14.2.6]). We observed similar effects in numerical experiments on regular polytopes.

The MIP hierarchies presented in Sect. 3 give provable upper and lower bounds converging to the optimal solution. However, unsurprisingly, computing these bounds for sufficiently fine samples is very time consuming, since mixed-integer programming is NP-hard. A natural question is whether well-known techniques from mathematical programming that speed up the computations, such as convex relaxations, inner approximations, column generation, or local refinement, can be utilized to achieve results for finer samples. However, most of these techniques only provide approximations of the discussed MIP hierarchies, which might limit the gain achieved through the finer samples.

Moreover, it might be helpful to carefully tailor the choice of the samples to the specific instance of the problem. For example, if one has a conjecture for an optimal configuration and/or the correct location of the darkest points, this information can be incorporated into the samples while retaining their \(\varepsilon \)-net property. Furthermore, these ideas might provide a way to use our bounds for analytic proofs of optimality in highly structured situations.