Optimal configurations of lines and a statistical application

Motivated by the construction of confidence intervals in statistics, we study optimal configurations of 2^d − 1 lines in real projective space RP^{d−1}. For small d, we determine line sets that numerically minimize a wide variety of potential functions among all configurations of 2^d − 1 lines through the origin. Numerical experiments verify that our findings enable us to efficiently assess the tightness of a bound arising from the statistical literature.


Introduction
Motivated by a question arising from statistics related to the construction of uniformly valid post-selection confidence intervals [3,5], whose details we present below, we aim to compute 2^d − 1 evenly-spaced lines through the origin in R^d. The terminology "evenly-spaced" is loose and, indeed, there are several mathematical formulations that make sense but often lead to different types of line sets. Usually, one considers a potential energy with a "repulsive" pairwise interaction kernel, and a line set may be called evenly-spaced if it is optimal with respect to this energy. However, note that optimal line sets may differ among different potential energies. For very particular choices of the number of lines with respect to the ambient dimension d, see [8], optimal lines coincide for a large class of monotonic pairwise interaction kernels considered in [7] and are therefore called universally optimal. However, universal optimality is rare in the sense that only very few such configurations can exist, and even fewer are actually proven to be universally optimal. Indeed, in general it is extremely difficult to prove that a given configuration is universally optimal, cf. [7,8,9].
In the present paper, we consider configurations of 2^d − 1 lines in R^d for d = 2, ..., 6 that minimize (not necessarily simultaneously) three types of potential energies, associated to the distance, Riesz-1, and log kernels. Additionally, we compare these minimizers to the corresponding best packings of lines found in [10], which can be seen as limiting cases of potential energy minimizers. In dimensions d = 2, 3, all three minimizing configurations of 3 and 7 lines, respectively, are known to coincide with the corresponding best packings of lines, by virtue of their universal optimality [8]. Unfortunately, for d = 4, 5 our numerical experiments suggest that there are no universally optimal line sets. Surprisingly, for d = 6 there seems to exist a universally optimal configuration of 63 lines, which we identified as the best packing configuration provided in [10]. However, this particular configuration, which is composed of the 36 lines through the vertices and the 27 lines through the centers of the 5-faces of the 1_22 polytope (also called the E_6 polytope), cannot be proven to be universally optimal with the so-called sharpness condition introduced in [7], so we state its universal optimality as a conjecture.
Let us now present the statistical application. In the context of the construction of valid post-model-selection confidence intervals, [3,5] have proposed new confidence intervals that are derived from evaluating a special statistical potential function of at most 2^d − 1 lines in R^d. It is desirable to find the maximum of this function when the lines are subject to certain restrictions [3,5]. Due to the inherent complexity of these restrictions and of the statistical potential itself, direct optimization approaches seem hopeless. However, a standard upper bound is available for this maximum [5], derived by two consecutive inequalities, cf. (3) and (4) in Section 2. It is known that the second inequality is an equality for d = 2 [3, Lemma A.4] and that it is tight as d → ∞ [5]. In the present paper, we complete this picture for other small values of d by simply evaluating the statistical potential at sets of 2^d − 1 evenly-spaced lines. These evaluations are very close to the upper bound, which demonstrates the tightness of the second inequality. Moreover, for universally optimal line sets, the gap is significantly smaller, which underlines that special property.
The outline is as follows. In Section 2, we fix notation and present the statistical potential that motivates our search for 2^d − 1 evenly-spaced lines in R^d. The potential energies in projective space are introduced in Section 3, where we also provide the definitions of the distance-, the Riesz-s-, and the log-energy [17] as well as the notion of universally optimal configurations of lines [7]. For each dimension d = 2, ..., 6, we provide in Section 4 one of the numerically found minimizers of the distance-, the Riesz-1-, or the log-energy. Section 5 is dedicated to the statistical application, in which we compare the performance of the evenly-spaced lines with a naive Monte Carlo optimization of the statistical potential.

The statistical potential

Let S^{d−1} denote the unit sphere and RP^{d−1} the real projective space (the set of lines through the origin) of R^d. Any u ∈ S^{d−1} defines a line ℓ ∈ RP^{d−1} by ℓ = uR, and its antipodal counterpart −u yields the same line. Throughout the entire manuscript, we fix N := N(d) := 2^d − 1. The set of all sets of at most N lines is denoted by D_{≤N} := D_{≤N}(d) := {L ⊂ RP^{d−1} : #L ≤ N}, and the set of all sets of exactly N lines is denoted by D_N.
In order to construct valid confidence intervals in statistical model selection, the authors in [3,5] consider a function f_{d,r,α}, where α ∈ (0, 1) and r ∈ N* are fixed parameters. The value of f_{d,r,α}(L), for L ∈ D_{≤N}, is defined as the unique K > 0 such that (1) holds, where F_{d,r} is the cumulative distribution function of the F-distribution with parameters d and r, and V is a uniformly distributed random vector on S^{d−1}. In the statistical context, for fixed values of α and r, it is desirable to find the maximum of f_{d,r,α} on a certain subset D ⊂ D_{≤N}, cf. [3,5], i.e., one aims to determine

(2) sup_{L∈D} f_{d,r,α}(L),

where each set of lines in D is derived from some statistical data set but where D depends only on d, cf. [5, Equations (5.2) and (5.3) and Section 4.10] and [3, Equation (6)]. Note that [3] and [5] yield two different subsets D, but this difference is of no consequence in the sequel. Considering the supremum (2) is beneficial because this supremum is data-independent and can be tabulated, for fixed r and α, once and for all. Exactly determining (2) is difficult, if not computationally infeasible, since the set D is defined in [3] and [5] in an intricate manner that has so far made it impossible to determine (2) theoretically. Moreover, the values f_{d,r,α}(L) can usually only be approximated by Monte Carlo methods, as in Algorithm 1, which we state only for sets of exactly N lines, for concision. Note that Algorithm 1 is also used in [3].
In [3,5], an upper bound K(d, r, α) has been proposed for (2) that can easily be determined numerically; we refer to Proposition 2.3 and Algorithm 4.3 in [3] for its definition and computation. This bound satisfies the chain of inequalities

(3) sup_{L∈D} f_{d,r,α}(L) ≤ sup_{L∈D_N} f_{d,r,α}(L),

(4) sup_{L∈D_N} f_{d,r,α}(L) ≤ K(d, r, α),

where (3) is due to D ⊂ D_{≤N} and to the fact that K in (1) increases if additional lines are added, so that sup_{L∈D_{≤N}} f_{d,r,α}(L) = sup_{L∈D_N} f_{d,r,α}(L), and (4) is derived from a union bound, cf. Equation (A.17) in [5].
As noted in [5] and [3, Remark 2.10], the inequality (4) is tight for large values of d, i.e.,

(5) sup_{L∈D_N} f_{d,r,α}(L) / K(d, r, α) → 1, as d → ∞,

cf. [5, proof of Theorem 6.3]. In the present paper, we shall assess whether the upper bound K(d, r, α) in (4) for sup_{L∈D_N} f_{d,r,α}(L) is also tight for small values of d.
To evaluate the quality of the inequality (4), we must approximate

(6) sup_{L∈D_N} f_{d,r,α}(L)

to determine its difference to K(d, r, α). However, exactly computing (6) is also difficult, if not numerically infeasible, and we shall rather aim to derive good lower bounds. In fact, we shall verify that evaluating f_{d,r,α} on one or a few candidates of evenly-spaced lines L = {ℓ_1, ..., ℓ_N} ⊂ RP^{d−1} yields better lower bounds on (6) than several Monte Carlo attempts. Our numerical results on this issue are presented in Section 5.
Remark 2.1. The quantity K_2 in [3] corresponds to (2), where the supremum is taken over a certain subset D that depends on quantities other than d (it is data-dependent). In the case of K_2, there exists another upper bound K_3 in [3], which is smaller than K(d, r, α), incomparable with sup_{L∈D_N} f_{d,r,α}(L), and convenient to compute. Nevertheless, the context of this paper, where the subset D is data-independent, is unrelated to K_3, so that K(d, r, α) is the only available upper bound for (2).

Potential functions in real projective space
We shall now consider families of potential energies whose minimization can provide us with rather evenly-spaced lines. Recall that the chordal distance d_c between two lines ℓ_i, ℓ_j ∈ RP^{d−1} is given by

(7) d_c(ℓ_i, ℓ_j) = sqrt(1 − ⟨u_i, u_j⟩²),

where ℓ_i = u_i R and ℓ_j = u_j R with u_i, u_j ∈ S^{d−1}. Let f : (0, 1] → R be a decreasing continuous function. For fixed N and d, we aim to minimize the potential energy

(8) P_f(L) := Σ_{1≤i<j≤N} f(d_c(ℓ_i, ℓ_j)),

i.e., we aim to find L ∈ D_N such that P_f(L) = min_{L'∈D_N} P_f(L'). We shall explicitly consider three types of pairwise interaction kernels:
(A) the distance-energy, with f_1(t) = −t,
(B) the Riesz-s-energy, with f_2(t) = t^{−s}, for s > 0,
(C) the log-energy, with f_3(t) = log(1/t),
and we refer to [8,9,14,15,16,17,18] and references therein for investigations on their minimizers and asymptotic results when N tends to infinity.

We call a function f : (0, 1] → R completely monotonic if it is infinitely often differentiable and (−1)^k f^{(k)} ≥ 0, for k = 1, 2, .... Note that the above f_1, f_2, f_3 satisfy this property. The following definition is borrowed from [7].

Definition 3.1. We call L ∈ D_N universally optimal if it minimizes P_f among D_N, for all completely monotonic functions f.

Additionally, it seems natural to consider configurations solving the packing problem of N lines in RP^{d−1}, i.e., sets of lines {ℓ̂_1, ..., ℓ̂_N} ∈ D_N that maximize the minimal pairwise chordal distance min_{i≠j} d_c(ℓ̂_i, ℓ̂_j), cf. [10]. Note that the packing problem corresponds to the limiting case of the Riesz-s-potential when s tends to infinity. In particular, if the set {ℓ̂_1, ..., ℓ̂_N} ∈ D_N is universally optimal, it also solves the packing problem.
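The definitions above can be made concrete in a few lines of code. The following is a minimal sketch, assuming the standard kernel forms f_1(t) = −t (distance), f_2(t) = t^{−s} with s = 1 (Riesz-1), and f_3(t) = log(1/t) (log); since the paper's own displayed formulas are not reproduced in this text, these forms are an assumption, chosen to be decreasing and completely monotonic on (0, 1] as required.

```python
import numpy as np

def chordal_dist(u, v):
    """Chordal distance between the lines uR and vR, for unit vectors u, v."""
    return np.sqrt(max(1.0 - np.dot(u, v) ** 2, 0.0))

def energy(U, kernel):
    """Potential energy P_f: sum of f(d_c) over all pairs of distinct lines."""
    N = U.shape[0]
    return sum(kernel(chordal_dist(U[i], U[j]))
               for i in range(N) for j in range(i + 1, N))

# Assumed kernel forms (decreasing and completely monotonic on (0, 1]):
f_distance = lambda t: -t               # (A) distance-energy
f_riesz1   = lambda t: 1.0 / t          # (B) Riesz-s-energy with s = 1
f_log      = lambda t: np.log(1.0 / t)  # (C) log-energy

# Three equiangular lines in R^2: pairwise chordal distance sqrt(3)/2.
U = np.array([[np.cos(2 * k * np.pi / 3), np.sin(2 * k * np.pi / 3)]
              for k in (1, 2, 3)])
print(energy(U, f_riesz1))  # 3 pairs at distance sqrt(3)/2: about 3.4641
```

Minimizing `energy` over D_N for each kernel and comparing the resulting minimizers is precisely the experiment carried out in Section 4.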

2^d − 1 evenly-spaced lines in small dimensions
Recall that we are interested in minimizers of the distance-, Riesz-1-, and log-energy for N = 2^d − 1 with d = 2, ..., 6. Note that for each potential energy there exist many local minimizers and, unless all such minimizers have been determined, there is no general way to certify that one has found a global minimizer.
Here, we repeatedly apply a local optimization procedure initialized at randomly chosen starting points (in our case, the nonlinear CG method described in [12,13]), and we are convinced that the line configurations we present are indeed minimizers of the corresponding energies.
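The paper's computations use the nonlinear CG method of [12,13]; as a rough stand-in, the following sketch minimizes the Riesz-1-energy with SciPy's generic BFGS from several random starts. For N = 3 lines in R^2 it recovers the universally optimal equiangular configuration of Section 4.1. All names here (`riesz1_energy`, `minimize_lines`) are illustrative, not the paper's code.

```python
import numpy as np
from scipy.optimize import minimize

def riesz1_energy(x, N, d, eps=1e-12):
    """Riesz-1 energy of N lines in R^d; x flattens N (not necessarily unit)
    direction vectors, which are normalized before evaluation."""
    U = x.reshape(N, d)
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    G = U @ U.T
    E = 0.0
    for i in range(N):
        for j in range(i + 1, N):
            # chordal distance, guarded against round-off for parallel lines
            E += 1.0 / np.sqrt(max(1.0 - G[i, j] ** 2, eps))
    return E

def minimize_lines(N, d, restarts=5, seed=0):
    """Repeated local minimization from random starts; keeps the best result."""
    rng = np.random.default_rng(seed)
    best_x, best_E = None, np.inf
    for _ in range(restarts):
        res = minimize(riesz1_energy, rng.standard_normal(N * d),
                       args=(N, d), method="BFGS")
        if res.fun < best_E:
            best_E, best_x = res.fun, res.x
    U = best_x.reshape(N, d)
    return U / np.linalg.norm(U, axis=1, keepdims=True), best_E

U, E = minimize_lines(N=3, d=2)  # E approaches 2*sqrt(3), the optimal value
```

Normalizing inside the objective lets a generic unconstrained optimizer work directly on RP^{d−1}; random restarts mitigate, but do not eliminate, the local-minimum issue discussed above.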
In general the minimizers of each of the three potential energies are different.However, since the different configurations perform rather equally well for the statistical potential we will provide for each dimension only one selected minimizer explicitly.
4.1. Universally optimal lines in R^2. It is known that three equiangular lines in R^2 are universally optimal, cf. [7]; hence all minimizers coincide with the best packing configuration. For instance, such lines {ℓ_k}_{k=1}^3 are given by the vectors u_k = (cos(2kπ/3), sin(2kπ/3))^T, with ℓ_k = u_k R, for k = 1, 2, 3. Note that this configuration is highly symmetric and generated by a single orbit of its symmetry group D_3, the dihedral group of order |D_3| = 6.

4.2. Universally optimal lines in R^3. It has been proven recently that there exist 7 lines that are universally optimal in RP^2, see [9]; hence all minimizers coincide with the best packing configuration. In such a configuration, 4 lines go through the vertices and 3 through the centers of the faces of a cube centered at the origin, see Figure 1. If the cube's edges are aligned with the coordinate axes, then the lines {ℓ_k = u_k R}_{k=1}^7 are given by the vectors u_1 = (1, 1, 1)^T/√3, u_2 = (1, 1, −1)^T/√3, u_3 = (1, −1, 1)^T/√3, u_4 = (−1, 1, 1)^T/√3, u_5 = (1, 0, 0)^T, u_6 = (0, 1, 0)^T, u_7 = (0, 0, 1)^T.
Note that this configuration is also highly symmetric and is generated by only two orbits of its symmetry group O_h, the full octahedral group of order |O_h| = 48.
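As a quick sanity check, one can compute the packing radius of this configuration numerically, assuming the cube has edges aligned with the coordinate axes so that the four vertex lines are spanned by (±1, ±1, ±1)/√3:

```python
import numpy as np

# Four cube-diagonal lines and three coordinate-axis lines in R^3.
diag = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [-1, 1, 1]]) / np.sqrt(3)
U = np.vstack([diag, np.eye(3)])

# All pairwise chordal distances d_c = sqrt(1 - <u_i, u_j>^2).
G = U @ U.T
D = np.sqrt(np.maximum(1.0 - G ** 2, 0.0))
pairwise = D[np.triu_indices(7, k=1)]
print(pairwise.min())  # sqrt(2/3), about 0.8165: the minimal chordal distance
```

Only two distinct non-maximal distances occur (diagonal–axis pairs at sqrt(2/3) and diagonal–diagonal pairs at sqrt(8/9)), reflecting the two-orbit structure noted above.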

4.3. Optimal lines in R^4. Our numerical computations provide strong evidence that there are no universally optimal configurations of 15 lines in RP^3. We found a very symmetric configuration L = {ℓ_k}_{k=1}^{15}, which seems to minimize the log-energy and the Riesz-1-energy simultaneously. It is more symmetric than our numerical minimizer of the distance-energy in the sense that it is composed of fewer group orbits. Moreover, one can check that it is a stationary point of any Riesz-s-energy, s > 0. However, it does not solve the best packing problem (the limiting case s → ∞), as it has a slightly smaller minimal distance than the configuration found in [10].
The symmetry group G of L, as a subgroup of the orthogonal matrices O(4) ⊂ R^{4×4}, acts naturally by left multiplication on RP^3. The group has order |G| = 144 and is generated by the following matrices. The 15 lines L = L_1 ∪ L_2 are composed of two orbits of cardinalities 6 and 9, respectively. See Figure 2 for a visualization.

4.4. Optimal lines in R^5. We numerically derived different minimizers for each of the three potential energies. They all have a nontrivial symmetry group, and we provide the most symmetric configuration of lines, which is found for the Riesz-1-energy. In that case, the 31 lines L = {ℓ_k}_{k=1}^{31} are composed of eight orbits of the symmetry group. More precisely, the symmetry group G has order |G| = 12 and is generated by the following three matrices. The 31 lines L = ∪_{i=1}^8 L_i are then given by eight orbits of sizes 1, 3, 3, 3, 3, 6, 6, and 6, respectively, where the defining constants a_1, ... can be computed to arbitrary precision by numerical minimization.

4.5. Optimal lines in R^6. Our numerical investigations provide evidence that there exists an arrangement of 63 lines that is universally optimal in RP^5 and thus coincides with the best packing solution presented in [10]. This configuration L = {ℓ_k}_{k=1}^{63} is given by the 36 lines through the vertices and the 27 lines through the centers of the 5-faces of the 1_22 polytope, cf. [11], also known as the E_6 polytope. The symmetry group G of that polytope has order |G| = 103680 and contains the Coxeter group E_6 as an index-2 subgroup. The high order of G reflects the remarkably high symmetry of this configuration of lines. Note that the group G can be generated by two matrices, so that the 63 lines L = L_1 ∪ L_2 are explicitly constructed as the two orbits of cardinality 36 and 27, respectively. Alternatively, the 63 lines can be derived from the minimal vectors of the E_6 lattice and its dual lattice E_6^*; see, for instance, G. Nebe's and N. Sloane's website [1].
Note that L_1 itself satisfies the sharpness condition in the sense of [7] and hence is universally optimal in RP^5. The 63 lines are not sharp, but we conjecture that they are universally optimal in RP^5. Moreover, the corresponding configuration of 126 points on the sphere is stationary for any completely monotonic potential function of the squared distance of 126 points on S^5, cf. [4].

Applications to the statistical potential

5.1. Two approaches. Now that we have some configurations of N = 2^d − 1 evenly-spaced lines in projective space in hand, we can apply them to derive lower bounds on

(9) sup_{L∈D_N} f_{d,r,α}(L),

as was our aim stated in Section 2. We shall compare evaluating f_{d,r,α} at our evenly-spaced lines with a standard Monte Carlo optimization method. Note that f_{d,r,α} cannot be evaluated directly, and we apply Algorithm 1 to do so, which is the same algorithm as [3, Algorithm 4.1] and is also used by the authors of [5]. To summarize, we shall compare the following two methods:
(i) evenly-spaced lines: This method simply consists in applying Algorithm 1 with {ℓ̂_1, ..., ℓ̂_N} being one of the optimal line sets derived in Section 4.
(ii) naive Monte Carlo: One aims to maximize f_{d,r,α} by the Monte Carlo optimization given in Algorithm 4, which first calls the random line generator in Algorithm 2 and then applies Algorithm 3, which itself repeatedly calls Algorithm 1. This type of Monte Carlo optimization has already been used in [3, Section 5] (in a context unrelated to this paper and to the evenly-spaced line method (i)).
Our intention is to check that (i) indeed outperforms (ii). From a computational point of view, (i) has a clear advantage: the set of evenly-spaced lines is obtained numerically once and for all, and can be used for any value of r and α. In contrast, one needs to repeat the naive Monte Carlo optimization for each pair of values of r and α under consideration.
Algorithm 1 Evaluation of f_{d,r,α} at {ℓ_1, ..., ℓ_N}
Input: lines {ℓ_1, ..., ℓ_N} ∈ D_N and some parameter I ∈ N.
Output: K approximating f_{d,r,α}({ℓ_1, ..., ℓ_N}).
1: generate independent uniformly distributed random vectors {V_i}_{i=1}^I ⊂ S^{d−1}.
2: for each i = 1, ..., I, calculate c_i = max_{j=1,...,N} ⟨u_j, V_i⟩², where ℓ_j = u_j R.
3: determine K that solves (10).

The core step of the random line generator (Algorithm 2) reads: for j = 1, ..., N, generate a ν_j-distributed random vector u_j ∈ S^{d−1}.

In order to further investigate approach (i), we shall also consider local searches in proximity to the evenly-spaced lines:
(a) local evenly-spaced lines: We search for the maximum of f_{d,r,α} locally around {ℓ̂_1, ..., ℓ̂_N} given as in (i). Specifically, we call Algorithm 2 using N projected Gaussian distributions with isotropic variance σ² = 0.1² and mean vectors {u*_j}_{j=1}^N, respectively. With the resulting set D_1, we apply Algorithm 3.
(b) very local evenly-spaced lines: As in (a), but we search even more locally by choosing σ² = 0.01².

In the numerical experiments, the parameters for (i), (ii), (a), and (b) are chosen as I = 20 000 000, N_1 = 200 000, N_2 = 20 000, I_1 = 10 000, I_2 = 200 000, and I_3 = 20 000 000. In Figure 3, we report, for each of the configurations of d, r (8 configurations in total), the ratios K/K(d, r, α), where K takes the four values obtained by the methods (i), (ii), (a), and (b), and where K(d, r, α) is the upper bound in (4). For the methods (i), (a), and (b), for d = 3, 6, a unique set of 2^d − 1 lines is under consideration, which minimizes all the potential functions of Section 3. For d = 4, 5, different sets minimize different potentials but give approximately the same values for f_{d,r,α} (the lines corresponding to the packing problem slightly lag behind the others). Hence, in Figure 3 below, we only report the results for the minimizer of the Riesz-1-potential, for concision. The ratios K/K(d, r, α) enable us to compare the two methods (i) and (ii), while those obtained from (a) and (b) address the local optimality of approach (i).
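The steps of Algorithm 1 can be sketched in code. Since the displayed equations (1) and (10) are not reproduced in this text, the sketch below assumes that (10) takes the form (1/I) Σ_{i=1}^I F_{d,r}(K²/(d c_i)) = 1 − α, which is consistent with the roles of F_{d,r} and c_i described above; treat this as an illustration under that assumption, not the paper's exact definition.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import f as f_dist

def evaluate_f(U, d, r, alpha, I=100_000, seed=0):
    """Monte Carlo evaluation of f_{d,r,alpha} at the lines spanned by the
    rows of U (unit vectors), following the steps of Algorithm 1.
    ASSUMPTION: equation (10) is taken to read
        (1/I) * sum_i F_{d,r}(K^2 / (d * c_i)) = 1 - alpha."""
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((I, d))
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # step 1: uniform on S^{d-1}
    c = np.max((V @ U.T) ** 2, axis=1)              # step 2: c_i
    g = lambda K: f_dist.cdf(K ** 2 / (d * c), d, r).mean() - (1.0 - alpha)
    return brentq(g, 1e-9, 1e3)                     # step 3: solve for K
```

The function g increases monotonically in K from −(1 − α) to α, so the bracket always contains exactly one root and the bisection-type solver is safe; the small but positive Monte Carlo variance mentioned below stems from the finite sample {V_i}.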
In Figure 3, for the configurations of d and r under consideration, the method (i) using evenly-spaced lines provides a better maximization of f_{d,r,α} over D_N than (ii). Hence, it is beneficial from both a computational and a performance point of view.
We also observe in Figure 3 that the values of K obtained from (a) are below those obtained from (b). Furthermore, although not perceptible in the figure, we always have that either K is smaller in (b) than in (i), or the two values cannot be distinguished because of the very small but positive variance of Algorithm 1. This is numerical evidence that the sets of evenly-spaced lines are local maximizers of f_{d,r,α} over D_N. At least in the cases d = 3, 6, since we are then dealing with universally optimal lines, it seems plausible that we even obtained the global maximizers.
Note that, in light of Figure 3, K(d, r, α) is a tight upper bound in (4) for d = 3, ..., 6, with a difference of less than 0.5%. This result is complementary to the tightness for large d, see (5), derived in [5, proof of Theorem 6.3].
We can conclude that it would not be very beneficial to aim at improving the union bound (4). Instead, one may want to study the inequality (3) more closely. Given the complexity of the sets D considered in [3] and [5], however, this may require other tools and may turn out to be an extremely challenging task, going beyond the scope of the present paper.
We have a smaller window of possible values for (9) in those dimensions in which universally optimal lines of the prescribed cardinality N = 2^d − 1 exist, or are conjectured to exist; see Figure 3, where d = 3, 6 yield higher ratios for (i) than d = 4, 5. In this sense, our statistical application indicates that the property of universal optimality is indeed beneficial.

Figure 2. 15 lines in R^4 visualized in the unit ball in R^3. Blue points correspond to orbit L_1, yellow to L_2. For visualization, we consider the intersection of the lines in R^4 with the upper hemisphere of the unit sphere; hence, lines become points. We then apply the stereographic projection to map the upper hemisphere in R^4 into the full ball in R^3, which we can plot. In general, a line in R^4 reduces to a single point in the ball, but lines that intersect the equator have two intersection points, and we plot both.

Figure 3. Plot of the ratios K/K(d, r, α), where K is obtained from the methods (i), (ii), (a), and (b). We investigate d = 3, 4, 5, 6 and r = 20, 60, and use the minimizers of the Riesz-1-potential presented in Section 4 as the sets of evenly-spaced lines in methods (i), (a), and (b). The sets of evenly-spaced lines appear to be local maximizers of the statistical potential function, possibly global maximizers for d = 3, 6. They clearly perform better than the naive Monte Carlo maximization (ii).