Positive bases with maximal cosine measure

Positive spanning sets and positive bases are important in the construction of derivative-free optimization algorithms. The convergence properties of such algorithms are often tied to the cosine measure of the positive basis that is used, and a higher cosine measure is in general preferable. In this paper, upper bounds on the cosine measure are found for certain positive bases in $\mathbb{R}^n$. In particular, if the size of the positive basis is $n+1$ (a minimal positive basis), the maximal value of the cosine measure is $1/n$. A straightforward corollary is that the maximal cosine measure for any positive spanning set of size $n+1$ is $1/n$. If the size of the positive basis is $2n$ (a maximal positive basis), the maximal cosine measure is $1/\sqrt{n}$. In all the cases described, the positive bases achieving these upper bounds are characterized.


Introduction
The seminal work on positive bases by Davis [6] established many of their properties. Over the last few decades, positive bases have received renewed attention due to their use in derivative-free optimization. For example, Conn et al. [5], which gives an introduction to derivative-free optimization, introduces positive spanning sets, positive bases, and the cosine measure (see definitions below) as part of the background before developing any optimization routines. The survey paper [10] also introduces these topics early on. Positive bases are likewise introduced as background before constructing a couple of direct search methods in the recent book [3]. The usefulness of positive bases within derivative-free optimization was first pointed out in [11].
The convergence properties of several derivative-free algorithms depend on the cosine measure of a positive spanning set, and it is preferable to design the positive spanning set so that its cosine measure is as large as possible. In a recent paper, the problem of determining positive spanning sets with maximal cosine measure is described as "widely open" [7]. The present paper shows that the maximal cosine measure for a positive spanning set for $\mathbb{R}^n$ of size $n+1$ is $1/n$. The minimal size of a positive spanning set for $\mathbb{R}^n$ is $n+1$, which means that such a positive spanning set is necessarily a positive basis, and it is called a minimal positive basis. (It is trivial to show that the minimal size of a positive basis of $\mathbb{R}^n$ is $n+1$.) Already C. Davis [6] established that the maximal size of a positive basis for $\mathbb{R}^n$ is $2n$, and such a positive basis is called a maximal positive basis. To show that the maximal size is $2n$ is not trivial, but a short proof using ideas from optimization theory was given by Audet [2]. In [7], the cosine measure of a positive spanning set for $\mathbb{R}^n$ of size $2n$ is conjectured to always be less than or equal to $1/\sqrt{n}$. The present paper shows that this is indeed the case for the maximal positive bases (which are a subset of the positive spanning sets of size $2n$). (Correspondence: Geir Naevdal, gena@norceresearch.no, NORCE Norwegian Research Centre, Postboks 22, 5838 Bergen, Norway.)
Both the minimal and maximal positive bases with maximal cosine measure turn out to be given as examples of positive bases in [5]. The minimal positive bases of unit vectors with cosine measure $1/n$ are the positive bases $v_1,\dots,v_{n+1}$ with $v_i^Tv_j=-1/n$ for $i\neq j$. The maximal positive bases for $\mathbb{R}^n$ consisting of the standard basis vectors and their negatives have cosine measure $1/\sqrt{n}$. In [1] the authors show the existence of a positive basis $v_1,\dots,v_{n+1}$ with $v_i^Tv_j=-1/n$ for $i\neq j$ and show how to construct such a positive basis, motivated by an optimization problem occurring in molecular geometry. An application using positive bases and the cosine measure to improve the performance of a direct search algorithm is shown in [4].
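As an illustration (my own sketch, not necessarily the construction used in [1] or [5]), unit vectors with pairwise inner products exactly $-1/n$ can be obtained by projecting the standard basis of $\mathbb{R}^{n+1}$ onto the hyperplane of vectors with zero coordinate sum and normalizing:

```python
import math

n = 4  # dimension of the subspace; the vectors are built inside R^{n+1}
m = n + 1

# w_i = e_i - e/(n+1): n+1 vectors lying in the hyperplane {x : sum(x) = 0}.
ws = [[(1.0 if j == i else 0.0) - 1.0 / m for j in range(m)] for i in range(m)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Normalize; ||w_i||^2 = 1 - 1/(n+1) = n/(n+1).
vs = [[x / math.sqrt(dot(w, w)) for x in w] for w in ws]

# Pairwise inner products of the unit vectors are exactly -1/n ...
for i in range(m):
    for j in range(m):
        target = 1.0 if i == j else -1.0 / n
        assert abs(dot(vs[i], vs[j]) - target) < 1e-12

# ... and the vectors sum to zero, i.e. they are positively dependent.
for j in range(m):
    assert abs(sum(v[j] for v in vs)) < 1e-12
```

The $n+1$ vectors live in an $n$-dimensional subspace, so after a change of coordinates they form a minimal positive basis of $\mathbb{R}^n$ of the type described above.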
To be more formal, let us provide the definitions of positive spanning sets, positive bases, and their cosine measure. A positive spanning set is a set of vectors $v_1,v_2,\dots,v_r\in\mathbb{R}^n$ with the property that any $v\in\mathbb{R}^n$ is in its positive span, that is, $v=\alpha_1v_1+\dots+\alpha_rv_r$ for some $\alpha_1,\dots,\alpha_r\ge0$. The set is a positive basis if none of the vectors $v_1,\dots,v_r$ can be written as a positive sum of the remaining vectors (i.e., the vectors are positively independent). Basic properties of positive spanning sets and positive bases can be found in, e.g., [5,6,12]. The size of the spanning set is the number of vectors it contains, which is $r$ in the definition given above. For $\mathbb{R}^n$, $n\ge3$, it has been shown that there exist positively independent sets of vectors with an arbitrary number of elements (shown independently in [8,12]).
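As a concrete illustration of the definition (my example, not from the paper), the set $\{(1,0),(0,1),(-1,-1)\}$ positively spans $\mathbb{R}^2$: explicit nonnegative coefficients can be written down for any target vector $w$. The helper name below is hypothetical.

```python
# Sketch: the set {(1,0), (0,1), (-1,-1)} positively spans R^2.
# For w = (w1, w2), choose c = max(0, -w1, -w2) >= 0; then
# w = a*(1,0) + b*(0,1) + c*(-1,-1) with a = w1 + c >= 0 and b = w2 + c >= 0.

def positive_coefficients(w1, w2):
    """Nonnegative coefficients (a, b, c) expressing (w1, w2)."""
    c = max(0.0, -w1, -w2)
    return w1 + c, w2 + c, c

for w in [(3.0, 2.0), (-1.5, 0.5), (-2.0, -4.0), (0.0, 0.0)]:
    a, b, c = positive_coefficients(*w)
    assert a >= 0 and b >= 0 and c >= 0
    # reconstruct w from the nonnegative combination
    assert abs(a * 1 + b * 0 + c * (-1) - w[0]) < 1e-12
    assert abs(a * 0 + b * 1 + c * (-1) - w[1]) < 1e-12
```

This set is in fact a minimal positive basis of $\mathbb{R}^2$: dropping any of the three vectors leaves a set whose positive span is only a half-plane or a ray.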
For convenience, only positive spanning sets without the zero vector are considered. Then the cosine measure of a positive spanning set or of a positive basis $D$ is defined by
$$\operatorname{cm}(D)=\min_{\substack{w\in\mathbb{R}^n\\ \|w\|=1}}\;\max_{d\in D}\;\frac{w^Td}{\|d\|},$$
see, e.g., [5]. The paper is organized as follows. Fundamental material on positive bases and positive semidefinite matrices is presented in Sect. 2. Section 3 gives the main results of the paper. A short outlook is provided in Sect. 4.
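For illustration (my own sketch, not part of the paper), the cosine measure can be estimated numerically by sampling unit directions $w$. The example below does this for the minimal positive basis of three unit vectors in $\mathbb{R}^2$ at mutual angles of $120^\circ$ (so $v_i^Tv_j=-1/2$), whose cosine measure is $1/n=1/2$:

```python
import math

# Three unit vectors in R^2 at 120-degree angles: v_i . v_j = -1/2 for i != j.
basis = [(math.cos(t), math.sin(t)) for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

# cm(D) = min over unit w of max over d in D of w.d (the vectors are unit norm),
# estimated by sampling directions on the circle.
cm = min(
    max(math.cos(a) * d[0] + math.sin(a) * d[1] for d in basis)
    for a in (2 * math.pi * k / 3600 for k in range(3600))
)
assert abs(cm - 0.5) < 1e-3  # cm = 1/n = 1/2 for n = 2
```

The minimizing direction is the one opposite to any one of the basis vectors, where the nearest remaining vector sits at a $60^\circ$ angle, giving the value $\cos 60^\circ = 1/2$.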

Some background about positive semidefinite matrices
The ideas used to prove my results come essentially from two sources. Obviously, the properties of positive spanning sets and positive bases are exploited. The other source of auxiliary results is the theory of positive semidefinite matrices, as covered by [9, Chapter 7]. The results presented in this section should be well known in their respective domains, but are included here for easy reference.
The results concerning positive spanning sets and positive bases are introduced as needed. As mentioned in the introduction, references for this topic are e.g., [5,6,12].
Here we will only point to one property, given in all three references above, that will be used in the proof of Theorem 1. This result is one of the characterizations of positive spanning sets and states that for any positive spanning set $v_1,\dots,v_r\in\mathbb{R}^n$ and any nonzero $w\in\mathbb{R}^n$ there exists an index $i$ such that $w^Tv_i>0$. Obviously, the inequality can also be reversed, since the result must also hold for $-w$.
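This characterization is easy to check numerically (my illustration, not from the paper) for the minimal positive basis of three unit vectors at $120^\circ$ in $\mathbb{R}^2$: every sampled direction has a strictly positive and a strictly negative inner product with some basis vector.

```python
import math

# Minimal positive basis of R^2: three unit vectors at 120-degree angles.
basis = [(math.cos(t), math.sin(t)) for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

# Characterization check: every sampled nonzero direction w has some v_i with
# w.v_i > 0, and (applying the statement to -w) some v_j with w.v_j < 0.
for k in range(360):
    a = 2 * math.pi * k / 360
    w = (math.cos(a), math.sin(a))
    dots = [w[0] * v[0] + w[1] * v[1] for v in basis]
    assert max(dots) > 0 and min(dots) < 0
```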
Recall that an $n\times n$ matrix $G$ is positive semidefinite if $x^TGx\ge0$ for all $x\in\mathbb{R}^n$; if the inequality is strict for all $x\neq0$, the matrix is positive definite. In the proof of Theorem 1 we will utilize the fact that any principal submatrix of a positive semidefinite matrix is again positive semidefinite. In particular, this means that if all the diagonal entries of a positive semidefinite matrix are one, then the off-diagonal entries lie in the interval $[-1,1]$. (A principal submatrix of an $n\times n$ matrix $A$ is formed by selecting the rows and columns $i_1<i_2<\dots<i_k$ of $A$. The inheritance property follows easily by restricting the vector $x$ in the definition above to the same index set.)

Concerning positive semidefinite matrices, the properties of the Gram matrix of a set of vectors $v_1,\dots,v_r$ will be useful. The Gram matrix is defined as the $r\times r$ matrix with entries $v_i^Tv_j$, $1\le i,j\le r$. Later on, the form $G(v_1,\dots,v_r)$ is used to denote such a Gram matrix. Gram matrices are necessarily positive semidefinite (they can be factorized using the vectors that define them). If $v_1,\dots,v_r$ are linearly independent, the matrix $G(v_1,\dots,v_r)$ is non-singular (i.e., positive definite). If the vectors are linearly dependent, the rank of $G(v_1,\dots,v_r)$ is equal to the dimension of the (sub)space the vectors span. The proof of Theorem 1 utilizes the fact that if the $(n+1)\times(n+1)$ matrix
$$G=\begin{pmatrix}A&b\\ b^T&1\end{pmatrix},$$
written as a $2\times2$ block matrix, where $A$ is a singular $n\times n$ matrix and $b\in\mathbb{R}^n$, is a positive semidefinite matrix, then $b$ is orthogonal to the kernel of $A$. This can be proven by noting that if $b$ is not orthogonal to the kernel of $A$, then we can select $c\in\mathbb{R}^n$ from the kernel of $A$ (i.e., $Ac=0$) such that $c^Tb=1$. Then it follows for any $t>1/2$ that
$$\begin{pmatrix}tc\\-1\end{pmatrix}^TG\begin{pmatrix}tc\\-1\end{pmatrix}=t^2c^TAc-2t\,c^Tb+1=1-2t<0,$$
contradicting the assumption that $G$ is a positive semidefinite matrix.
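The Gram-matrix facts above can be made concrete with a small example (my own sketch, with a hypothetical helper name): for the three unit vectors at $120^\circ$ in $\mathbb{R}^2$, the Gram matrix has ones on the diagonal, $-1/2$ off the diagonal, satisfies $x^TGx=\|x_1v_1+x_2v_2+x_3v_3\|^2\ge0$, and has rank 2 because $v_1+v_2+v_3=0$.

```python
import math

def gram(vectors):
    """Gram matrix G with G[i][j] = v_i . v_j."""
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in vectors] for vi in vectors]

# Minimal positive basis of R^2: three unit vectors at 120 degrees.
vs = [(math.cos(t), math.sin(t)) for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
G = gram(vs)

# Diagonal entries are 1, off-diagonal entries are -1/2.
assert all(
    abs(G[i][j] - (1.0 if i == j else -0.5)) < 1e-12
    for i in range(3) for j in range(3)
)

# x^T G x = ||x1 v1 + x2 v2 + x3 v3||^2 >= 0, so G is positive semidefinite.
for x in [(1, 2, 3), (-1, 1, 0), (0.5, -2, 1.5)]:
    q = sum(x[i] * G[i][j] * x[j] for i in range(3) for j in range(3))
    assert q >= -1e-12

# The all-ones vector lies in the kernel (v1 + v2 + v3 = 0), so G has rank 2.
for i in range(3):
    assert abs(sum(G[i])) < 1e-12
```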
The following technical lemma will be used in the proof of Theorem 2. (Note that the bound for $\alpha$ can be obtained directly from $G(v_1,\dots,v_n)$.)

Lemma 1
Let $v_1,\dots,v_n\in\mathbb{R}^n$ be linearly independent unit vectors, and let $e\in\mathbb{R}^n$ be the vector having all its entries equal to one. Then there exists a unit vector $u\in\mathbb{R}^n$ such that
$$v_i^Tu=\alpha \;\text{ for }\; 1\le i\le n, \qquad \alpha=\frac{1}{\sqrt{e^TG(v_1,\dots,v_n)^{-1}e}}>0.$$
In particular, if $e^TG(v_1,\dots,v_n)e\le n$, then $\alpha\le1/\sqrt{n}$, with strict inequality if $e^TG(v_1,\dots,v_n)e<n$.

Proof Consider the $(n+1)\times(n+1)$ matrix
$$\begin{pmatrix}G(v_1,\dots,v_n)&\alpha e\\ \alpha e^T&1\end{pmatrix}.\qquad(1)$$
By multiplying the matrix above with
$$\begin{pmatrix}I&-\alpha G(v_1,\dots,v_n)^{-1}e\\ 0&1\end{pmatrix}$$
from the right and its transpose from the left (here $I$ and $0$ denote the identity and zero matrices with appropriate dimensions), it follows that the matrix given in (1) has the same rank as
$$\begin{pmatrix}G(v_1,\dots,v_n)&0\\ 0&1-\alpha^2e^TG(v_1,\dots,v_n)^{-1}e\end{pmatrix}.$$
The matrix above is positive semidefinite with rank $n$ if $\alpha=1/\sqrt{e^TG(v_1,\dots,v_n)^{-1}e}$. In that case the matrix in (1) is a positive semidefinite matrix of rank $n$ with ones on the diagonal, and hence it is the Gram matrix $G(v_1,\dots,v_n,u)$ for some unit vector $u\in\mathbb{R}^n$ with $v_i^Tu=\alpha$ for $1\le i\le n$. Finally, by the Cauchy–Schwarz inequality,
$$n^2=(e^Te)^2\le\left(e^TG(v_1,\dots,v_n)e\right)\left(e^TG(v_1,\dots,v_n)^{-1}e\right),$$
so $e^TG(v_1,\dots,v_n)e\le n$ implies $e^TG(v_1,\dots,v_n)^{-1}e\ge n$, and hence $\alpha\le1/\sqrt{n}$.
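As a sanity check of the lemma (my own illustration, not from the paper), for $n=2$ one can solve $v_1^Tu=v_2^Tu=\alpha$ with $\|u\|=1$ directly and compare with the closed form $\alpha=1/\sqrt{e^TG^{-1}e}$; with $c=v_1^Tv_2$ one gets $e^TG^{-1}e=2/(1+c)$, so $\alpha=\sqrt{(1+c)/2}$.

```python
import math

# Two linearly independent unit vectors in R^2 with v1.v2 = c < 0.
c = -0.3
v1 = (1.0, 0.0)
v2 = (c, math.sqrt(1 - c * c))

# Direct computation: u = (alpha, alpha * (1 - c) / s) with s = sqrt(1 - c^2)
# satisfies v1.u = v2.u = alpha; requiring ||u|| = 1 fixes alpha.
s = v2[1]
alpha_direct = 1.0 / math.sqrt(1 + ((1 - c) / s) ** 2)

# Closed form via the Gram matrix G = [[1, c], [c, 1]]:
# e^T G^{-1} e = 2 / (1 + c), so alpha = 1 / sqrt(e^T G^{-1} e) = sqrt((1 + c) / 2).
alpha_formula = math.sqrt((1 + c) / 2)

assert abs(alpha_direct - alpha_formula) < 1e-12
# Here e^T G e = 2 + 2c <= 2, and indeed alpha <= 1/sqrt(2).
assert alpha_formula <= 1 / math.sqrt(2)
```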

Results
Now it is time to present the main results of the paper.

Theorem 1
Let $v_1,v_2,\dots,v_{n+1}$ be a (minimal) positive basis of unit vectors for $\mathbb{R}^n$. Then the following holds:
1. Its cosine measure is bounded above by $1/n$.
2. The bound is attained if and only if $v_i^Tv_j=-1/n$ for all $i\neq j$.
Proof Since the vectors form a positive basis, they are positively dependent: there exist $\alpha_1,\dots,\alpha_{n+1}>0$ with $\sum_{j=1}^{n+1}\alpha_jv_j=0$. Scale the coefficients and, if necessary, reorder the vectors so that $\alpha_1=1=\min_j\alpha_j$; then $\alpha_j\ge1$ for all $j$. The Gram matrix $G(v_1,\dots,v_{n+1})$ has rank $n$, and its kernel is spanned by $\alpha=(\alpha_1,\dots,\alpha_{n+1})^T$.

Let $u\in\mathbb{R}^n$ be a unit vector and extend the Gram matrix $G(v_1,\dots,v_{n+1})\in\mathbb{R}^{(n+1)\times(n+1)}$ to the Gram matrix $G(v_1,\dots,v_{n+1},u)\in\mathbb{R}^{(n+2)\times(n+2)}$. The matrix $G(v_1,\dots,v_{n+1},u)$ is a positive semidefinite matrix, again with rank $n$.

Since $v_2,\dots,v_{n+1}$ is a basis, there exists a vector $u$ as described above with $u^Tv_2=\dots=u^Tv_{n+1}=\gamma>0$; write $\beta=v_1^Tu$. Since $\gamma>0$, it follows that $\beta=v_1^Tu<0$, since $v_1,\dots,v_{n+1}$ is a positive basis. Since $v_2,\dots,v_{n+1}$ form a basis for $\mathbb{R}^n$, the latter statement is true for any $\gamma>0$, but $u$ will be a unit vector for only one positive value of $\gamma$. We can obtain a bound for $\gamma$ by noting that the matrix $G(v_1,\dots,v_{n+1},u)$ is a positive semidefinite matrix only if the vector $(\beta,\gamma,\dots,\gamma)^T$ is orthogonal to the kernel of $G(v_1,\dots,v_{n+1})$. This means that $\beta\alpha_1+\sum_{j=2}^{n+1}\gamma\alpha_j=0$. Since $\alpha_1=1$, solving for $\gamma$ gives
$$\gamma=\frac{-\beta}{\sum_{j=2}^{n+1}\alpha_j}.$$
Since $0>\beta\ge-1$ and $\alpha_j\ge1$, we obtain the bound $\gamma\le1/n$. As $\max_i u^Tv_i=\gamma$ for this choice of $u$, this shows that the cosine measure of $v_1,\dots,v_{n+1}$ is bounded above by $1/n$. (The fact that $\beta\ge-1$ follows, for instance, by applying the Cauchy–Schwarz inequality to $v_1^Tu$.)

Proof of (2): since $\gamma=-\beta/\sum_{j=2}^{n+1}\alpha_j$ with $-\beta\le1$ and $\sum_{j=2}^{n+1}\alpha_j\ge n$, equality $\gamma=1/n$ holds only if $\alpha_2=\dots=\alpha_{n+1}=1$ and $\beta=-1$. Equality in the Cauchy–Schwarz inequality, $\beta=v_1^Tu=-1$ with $v_1$ and $u$ unit vectors, forces $u=-v_1$, and hence $v_1^Tv_j=-u^Tv_j=-\gamma=-1/n$ for $2\le j\le n+1$. In the first part of the proof, the vectors $v_1,v_2,\dots,v_{n+1}$ were, if necessary, reordered such that $\alpha_1=1$; since now $\alpha_1=\dots=\alpha_{n+1}=1$, it is clear that any of the vectors $v_i$ can be interchanged with $v_1$. Repeating the argument, and using that all principal submatrices of a positive semidefinite matrix are again positive semidefinite, the bound can only be attained if $v_i^Tv_j=-1/n$ for all $i\neq j$. That there indeed exists a positive basis of unit vectors $v_1,v_2,\dots,v_{n+1}\in\mathbb{R}^n$ such that $v_i^Tv_j=-1/n$ for all $i\neq j$ is proven in [1,5]. Since every positive spanning set of $\mathbb{R}^n$ of size $n+1$ must be a positive basis, we can also state:

Corollary 1
The maximal cosine measure for any positive spanning set of size $n+1$ in $\mathbb{R}^n$ is $1/n$.

Now we proceed to show the result for maximal positive bases. A maximal positive basis in $\mathbb{R}^n$ has size $2n$ and consists of a set of $n$ linearly independent vectors and their negatives (each of which may be multiplied by an arbitrary positive constant). The structure of maximal positive bases is conveniently stated in [12, Theorem 6.3]: they can be listed as $v_1,\dots,v_n,-\delta_1v_1,\dots,-\delta_nv_n$, where $\delta_i>0$ for $1\le i\le n$.

Theorem 2

Let $v_1,\dots,v_n,-\delta_1v_1,\dots,-\delta_nv_n$, where $\delta_i>0$ for $1\le i\le n$, be a (maximal) positive basis for $\mathbb{R}^n$. Then the following holds:
1. The cosine measure is bounded above by $1/\sqrt{n}$.
2. The bound is attained if and only if $v_i^Tv_j=0$ for $i\neq j$ and $1\le i,j\le n$.

Proof By the definition of the cosine measure we will need the scaled positive basis vectors $v_i/\|v_i\|$. Therefore we can, without loss of generality, assume that $v_1,\dots,v_n$ are unit vectors and $\delta_i=1$ for $1\le i\le n$. The proof proceeds by showing that, by making a proper selection of $v_i$ or $-v_i$ for $1\le i\le n$, there exists a unit vector $u$ with the property that $u^Tv_1=u^Tv_2=\dots=u^Tv_n=\beta$ with $0<\beta\le1/\sqrt{n}$; since the basis also contains the vectors $-v_i$, the maximum of $u^Td$ over the basis is then $\beta$. That there exists a vector $u$ such that $u^Tv_1=\dots=u^Tv_n=\beta$ for any $\beta$ follows immediately since the vectors $v_1,\dots,v_n$ are linearly independent. If $\beta=0$, the vector $u=0$; otherwise $u\neq0$, and scaling $u$ to be a unit vector of course also scales $\beta$. The choice of $\beta$ that provides a solution $u$ that is a unit vector can be found by considering the Gram matrix $G(v_1,\dots,v_n,u)$. This matrix has the required properties if $u^Tu=1$ and $u^Tv_i=\beta$ for $1\le i\le n$, that is, if it equals
$$\begin{pmatrix}G(v_1,\dots,v_n)&\beta e_n\\ \beta e_n^T&1\end{pmatrix},$$
where $e_n$ is the vector of length $n$ with all its entries equal to one. The required upper bound will follow from Lemma 1 if we can select the vectors $v_1,\dots,v_n$ such that $e_n^TG(v_1,\dots,v_n)e_n\le n$, by replacing some $v_i$'s by $-v_i$ if necessary.
We will prove by induction that for all $k\le n$ it is possible to construct $G(v_1,\dots,v_k)$ such that $e_k^TG(v_1,\dots,v_k)e_k\le k$. For $k=1$ there is nothing to prove, since $G(v_1)=1$. For $k=2$ we have $e_2^TG(v_1,v_2)e_2=2+2v_1^Tv_2$, and the result follows by replacing $v_2$ with $-v_2$ if $v_1^Tv_2>0$. In general, since
$$e_{k+1}^TG(v_1,\dots,v_{k+1})e_{k+1}=e_k^TG(v_1,\dots,v_k)e_k+1+2\sum_{j=1}^{k}v_j^Tv_{k+1},$$
we see that the induction step is completed by replacing $v_{k+1}$ with $-v_{k+1}$ if $\sum_{j=1}^{k}v_j^Tv_{k+1}>0$. To prove the second part of the theorem, one needs to do the minor change in the provided construction that one numbers the vectors such that $v_1^Tv_2\neq0$; if no such pair of vectors exists, the proof is done. Otherwise, select $v_1$ and $v_2$ such that $v_1^Tv_2<0$. This leads to $e_2^TG(v_1,v_2)e_2<2$. Now we can repeat the induction step shown above, but we will have $e_k^TG(v_1,\dots,v_k)e_k<k$, which in the end leads to $e_n^TG(v_1,\dots,v_n)e_n<n$, and therefore we get $\beta<1/\sqrt{n}$ by Lemma 1.
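The sign-selection argument above is constructive and easy to simulate (an illustration of mine, not code from the paper): greedily replace $v_{k+1}$ by $-v_{k+1}$ whenever it has positive inner product with the running sum, which keeps $e_k^TG(v_1,\dots,v_k)e_k=\|v_1+\dots+v_k\|^2\le k$ at every step.

```python
import math

# Deterministic "generic" unit vectors in R^3 (any linearly independent set works).
raw = [(1.0, 0.2, -0.1), (0.3, 1.0, 0.4), (-0.2, 0.5, 1.0)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

vs = [tuple(x / norm(v) for x in v) for v in raw]

# Greedy sign selection: flip v_{k+1} when it points "with" the running sum s,
# so that ||s + w||^2 = ||s||^2 + 1 + 2 s.w <= k + 1 at every step.
s = [0.0, 0.0, 0.0]
for v in vs:
    if sum(si * vi for si, vi in zip(s, v)) > 0:
        v = tuple(-x for x in v)
    s = [si + vi for si, vi in zip(s, v)]

# For unit vectors, e^T G e = ||v_1 + ... + v_n||^2, so the bound reads:
assert sum(x * x for x in s) <= 3 + 1e-12
```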

Discussion
This paper has presented upper bounds for the cosine measures of the minimal and maximal positive bases. Obviously, the results provided here should be complemented. One direction would be to find the maximal cosine measure for positive spanning sets consisting of $2n$ vectors in $\mathbb{R}^n$, as discussed in [7]. One can also look for extensions of the results provided here to positive bases for $\mathbb{R}^n$ of size $r$ for $n+1<r<2n$. That the results here cover the maximal and minimal positive bases is in good alignment with the fact that the structure in these two cases is better described than for positive bases of intermediate size. The structures for the two cases discussed here are for instance provided in [12, Section 6].
Concerning minimal cosine measures, both [5,10] provide examples where the cosine measure becomes arbitrarily close to zero.