Noisy Euclidean distance matrix completion with a single missing node

We present several solution techniques for the noisy single source localization problem, i.e. the Euclidean distance matrix completion problem with a single missing node to locate under noisy data. For the case that the sensor locations are fixed, we show that this problem is implicitly convex, and we provide a purification algorithm along with the SDP relaxation to solve it efficiently and accurately. For the case that the sensor locations are relaxed, we study a model based on facial reduction. We present several approaches to solve this problem efficiently, and we compare their performance with existing techniques in the literature. Our tools are semidefinite programming, Euclidean distance matrices, facial reduction, and the generalized trust region subproblem. We include extensive numerical tests.


Introduction
In this paper we consider the noisy, single source localization problem. The objective is to locate the source of a signal that is detected by a set of sensors with exactly known locations. Distances between sensors and source are given, but contaminated with noise. For instance, in an application to cellular networks, the source of the signal is a cellular phone and the cellular towers are the sensors. Our data is the, possibly noisy, distance measurements from each sensor to the source.
The single source localization problem has applications in e.g. navigation, structural engineering, and emergency response [3,4,8,24,26,38]. In general, it is related to distance geometry problems where the input consists of Euclidean distance measurements and a set of points in Euclidean space. The sensor network localization problem is a generalization of our single source problem, where there are multiple sources and only some of the distance estimates are known. The general Euclidean distance matrix completion problem is yet a further generalization, where sensors do not have specified locations and only partial, possibly noisy, distance information is available, e.g. [2,13,15]. We refer the reader to the books [1,5,9,10] and survey article [27] for background and applications, and to the paper [18] for algorithmic comparisons. For the related nearest Euclidean distance matrix (NEDM) problem, we refer the reader to the papers [30,31], where a semismooth Newton approach and a rank majorization approach are presented, respectively. The more general weighted NEDM problem, however, is much harder. For theory that relates NEDM to semidefinite programming, see e.g. [12,25].
A common approach to solving an instance of the single source localization problem is a modification of the least squares problem, referred to as the squared least squares (SLS) problem. We consider two equivalent formulations of SLS: the generalized trust region subproblem (GTRS) formulation; and the nearest Euclidean distance matrix with fixed sensors (NEDMF) formulation. We show that every extreme point of the semidefinite relaxation of GTRS may be easily transformed into a solution of GTRS and thus a solution of the SLS problem.
We also introduce and analyze several relaxations of the NEDMF formulation. These utilize semidefinite programming, facial reduction, and parametric optimization. We provide theoretical evidence that, generally, the solutions to these relaxations may be easily transformed into solutions of SLS. We also provide empirical evidence that the solutions to these relaxations may give better predictions of the location of the source.

Outline
In Sect. 1.2 we establish our notation and introduce background concepts. In Sect. 2.1 we prove strong duality for the GTRS formulation of SLS and in Sect. 2.2 we derive the semidefinite relaxation (SDR), and prove that it is tight. We also show that the extreme points of the optimal set of SDR correspond exactly to the optimizers of SLS. A purification algorithm for obtaining the extreme points is presented in Sect. 2.2.1. In Sect. 3 we introduce the NEDM formulation as well as several relaxations. We analyze the theoretical properties of the relaxations and present algorithms for solving them. The results of numerical comparisons of the algorithms are presented in Sect. 4.
Unless otherwise specified, the norm of a matrix is the Frobenius norm, and we may drop the subscript F. For a convex set C, a convex subset f ⊆ C is a face of C if, for all x, y ∈ C and z ∈ (x, y) (the open line segment between x and y) with z ∈ f, we have x, y ∈ f.
The cone of positive semidefinite matrices is denoted by S^n_+ and its interior is the cone of positive definite matrices, S^n_{++}. The positive semidefinite cone is pointed, closed and convex. Moreover, the cone S^n_+ induces a partial order on S^n, that is, Y ⪰ X if Y − X ∈ S^n_+ and Y ≻ X if Y − X ∈ S^n_{++}. Every face of S^n_+ is characterized by the range or nullspace of the matrices in its relative interior, equivalently, by the matrices of maximum rank. For S ⊆ S^n_+, we denote by face(S) the minimal face of S, i.e. the smallest face of S^n_+ that contains S. Let X ∈ S^n_+ have rank r with orthogonal spectral decomposition

X = [P Q] Diag(Λ, 0) [P Q]^T,  Λ ∈ S^r_{++}.

Then the range and nullspace characterizations of face(X) are, respectively,

face(X) = P S^r_+ P^T = {Y ∈ S^n_+ : range(Y) ⊆ range(P)} = {Y ∈ S^n_+ : YQ = 0}.

We say that the matrix QQ^T is an exposing vector for face(X). Sometimes it is helpful to vectorize a symmetric matrix. Let svec : S^n → R^{n(n+1)/2} map the upper triangular elements of a symmetric matrix to a vector, and let sMat := svec^{−1}.
The centered subspace of S^n, denoted S_C, is defined as

S_C := {X ∈ S^n : Xe = 0},

where e is the vector of all ones. The hollow subspace of S^n, denoted S_H, is

S_H := {X ∈ S^n : diag(X) = 0},

where diag : S^n → R^n, diag(X) := (X_11, ..., X_nn)^T. A matrix D ∈ S_H is said to be a Euclidean distance matrix, EDM, if there exists an integer r and points x_1, ..., x_n ∈ R^r such that

D_ij = ‖x_i − x_j‖²_2, for all i, j,

where ‖·‖_2 denotes the Euclidean norm. As for the Frobenius norm, we assume the norm of a vector to be the Euclidean norm when the subscript is omitted. The set of all n × n EDMs, denoted E^n, forms a closed, convex cone with E^n ⊂ S_H. The classical result of Schoenberg [32] states that EDMs are characterized by a face of the positive semidefinite cone. We state the result in terms of the Lindenstrauss mapping

K(X) := diag(X)e^T + e diag(X)^T − 2X,

with adjoint and Moore-Penrose pseudoinverse, respectively,

K*(D) = 2(Diag(De) − D),  K†(D) = −(1/2) J_n offDiag(D) J_n.

Here Diag is the adjoint of diag, the matrix J_n := I − (1/n)ee^T is the orthogonal projection onto S_C, and offDiag(D) refers to zeroing out the diagonal of D, i.e. the orthogonal projection onto S_H. The range of K is exactly S_H and the range of K† is the subspace S_C. Moreover, K(S^n_+) = E^n and K is an isomorphism between S_C and S_H.
The Schoenberg characterization states that K is an isomorphism between S^n_+ ∩ S_C and E^n, see [2] for instance. Specifically,

K(S^n_+ ∩ S_C) = E^n and K†(E^n) = S^n_+ ∩ S_C.

Moreover, if D ∈ E^n and K†(D) = PP^T has rank r with full column rank factor P ∈ R^{n×r}, then the rows of P correspond to points in R^r whose pairwise squared distances are the elements of D. For more details, see e.g. [2,11,12,21,22].
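The correspondence above is easy to check numerically. The following sketch is our own illustration (the helper names `K` and `K_dagger` are ours, not from the paper): it builds the Lindenstrauss map and its pseudoinverse from the definitions, verifies that K ∘ K† is the identity on EDMs, and recovers a point configuration from an EDM.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 3
X = rng.standard_normal((n, r))
# squared-distance (EDM) matrix of the rows of X
D = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)

J = np.eye(n) - np.ones((n, n)) / n        # orthogonal projection onto S_C

def K(B):
    """Lindenstrauss map: K(B) = diag(B) e^T + e diag(B)^T - 2B."""
    d = np.diag(B)
    return d[:, None] + d[None, :] - 2.0 * B

def K_dagger(D):
    """K†(D) = -1/2 J offDiag(D) J; here D is already hollow."""
    return -0.5 * J @ D @ J

B = K_dagger(D)                            # centered Gram matrix of the points
assert np.allclose(K(B), D)                # Schoenberg: K(K†(D)) = D for EDMs

# recover a centered configuration from B, unique up to orthogonal transformation
w, V = np.linalg.eigh(B)
P = V[:, w > 1e-9] * np.sqrt(w[w > 1e-9])
D2 = np.sum((P[:, None, :] - P[None, :, :])**2, axis=2)
assert np.allclose(D2, D)                  # same pairwise squared distances
```

The recovered factor P differs from the original points only by centering and an orthogonal transformation, exactly as the characterization predicts.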

SDP Formulation
We begin this section by formulating the SLS problem using the model and notation of [3]. We let n denote the number of sensors, p_1, ..., p_n ∈ R^r denote their locations, and r the embedding dimension.

Assumption 2.1
The following holds throughout: 1. n ≥ r + 1; 2. rank(P_T) = r, where P_T ∈ R^{n×r} is the matrix whose rows are the sensor locations; 3. P_T^T e = 0. The first two items in Assumption 2.1 ensure that a signal can be uniquely recovered if we have accurate distance measurements. If the towers are positioned in a proper affine subspace of R^r, and the signal is not contained within this affine subspace, then there are multiple possible locations for the signal consistent with the given distance measurements. We assume that such poor designs are avoided in our applications. The third assumption is made so that the sensors are centered about the origin. This property leads to a cleaner exposition in the NEDM relaxations of Sect. 3.
We let d = d̄ + ε ∈ R^n denote the vector of noisy distances, where d̄_i is the true distance from the source to the i-th sensor and ε_i is a perturbation, or noise. When the noise ε_1, ..., ε_n is not too large, a satisfactory approximation of the location of the source can be obtained as a nearest distance problem to the sensors. Using the Euclidean norm as a metric, we obtain the least squares problem

(LS) min_{x ∈ R^r} Σ_{i=1}^n (‖x − p_i‖ − d_i)².   (2.1)

This problem has the desirable property that its solution is the maximum likelihood estimator when the noise is assumed to be normal and the covariance matrix a multiple of the identity, e.g. [8]. However, it is a non-convex problem with an objective function that is not differentiable. Motivated by the success in [3], the main problem we consider instead is the optimization problem with squared distances,

(SLS) min_{x ∈ R^r} Σ_{i=1}^n (‖x − p_i‖² − d_i²)².   (2.2)

Though still a non-convex problem, in the subsequent sections we show that a solution of SLS can be obtained by solving at most k ≤ r + 1 convex problems, see Theorem 2.7 below.

GTRS
The GTRS is an optimization problem where the objective is a quadratic and there is a single two-sided quadratic constraint [29,33]. Note that this class of problems also includes equality constraints. If we expand the squared norm term in SLS and substitute using ‖x‖² = α as in [3], we get the equivalent problem

min_{x ∈ R^r, α ∈ R} Σ_{i=1}^n (α − 2p_i^T x + ‖p_i‖² − d_i²)² s.t. ‖x‖² = α.   (2.3)

In this formulation, we have a convex quadratic objective that is minimized over a level curve of a convex quadratic function. It follows that (2.3) is an instance of the generalized trust region subproblem. Strong duality is proved in [29,33]. For the sake of completeness, we include a proof of strong duality for our particular class of GTRS.

Theorem 2.2 Let

A := [−2P_T  e] ∈ R^{n×(r+1)},  b := (d_i² − ‖p_i‖²)_{i=1}^n ∈ R^n,  Ĩ := Diag(1, ..., 1, 0) ∈ S^{r+1},  b̃ := (0, ..., 0, −1/2)^T ∈ R^{r+1}.

Consider SLS in (2.2) and the equivalent form given in (2.3). Then: 1. The problem SLS is equivalent to

(GTRS) min_{y ∈ R^{r+1}} ‖Ay − b‖² s.t. y^T Ĩ y + 2b̃^T y = 0.

2. The rank of A is r + 1 and the optimal value of GTRS is finite and attained. 3. Strong duality holds for GTRS, i.e. GTRS and its Lagrangian dual have a zero duality gap and the dual value is attained:

p*_GTRS = max_{λ ∈ R} min_y ‖Ay − b‖² − λ(y^T Ĩ y + 2b̃^T y).

Proof The first claim that SLS can be rewritten as GTRS follows immediately using the substitution y = (x^T, α)^T. For the second claim, note that by Assumption 2.1, Item 2, rank(P_T) = r. Therefore, P_T^T e = 0 implies that rank(A) = rank(P_T) + 1 = r + 1. Now, since A has full column rank, we conclude that A^T A is positive definite, and therefore the objective of GTRS is strictly convex and coercive. Moreover, the constraint set is closed and thus the optimal value of GTRS is finite and attained, as desired.
That we have a zero duality gap for GTRS follows from [29], since this is a generalized trust region subproblem. We now prove this for our special case. Note that, since P_T^T e = 0,

A^T A = Diag(4P_T^T P_T, n), a block diagonal matrix.

Let γ := λ_min(4P_T^T P_T) be the (positive) smallest eigenvalue of 4P_T^T P_T, so that we have A^T A − γĨ ⪰ 0, but singular. We note that the convex constraint y^T Ĩ y + 2b̃^T y ≤ 0 satisfies the Slater condition, i.e. strict feasibility. Therefore, the following holds, with justification to follow:

p*_SLS = min { ‖Ay − b‖² : y^T Ĩ y + 2b̃^T y = 0 }
       = min { ‖Ay − b‖² − γ(y^T Ĩ y + 2b̃^T y) : y^T Ĩ y + 2b̃^T y = 0 }   (2.7)
       = min { ‖Ay − b‖² − γ(y^T Ĩ y + 2b̃^T y) : y^T Ĩ y + 2b̃^T y ≤ 0 }   (2.8)
       = max_{λ ≥ 0} min_y ‖Ay − b‖² + (λ − γ)(y^T Ĩ y + 2b̃^T y)   (2.9)
       = max_{λ ∈ R} min_y ‖Ay − b‖² − λ(y^T Ĩ y + 2b̃^T y).   (2.10)

The first equality follows from Item 1 and the second equality holds since γ(y^T Ĩ y + 2b̃^T y) is identically 0 for any feasible y.
For the third equality, let the objective and constraint of (2.7), respectively, be denoted by

q_γ(y) := ‖Ay − b‖² − γ(y^T Ĩ y + 2b̃^T y),  g(y) := y^T Ĩ y + 2b̃^T y.

The optimal value of (2.8) is a lower bound for p*_SLS since the feasible set of (2.8) is a superset of the feasible set of (2.7) and the objectives are the same. Now suppose, for the sake of contradiction, that the optimal value of (2.8) is strictly less than p*_SLS. Then there exists ȳ satisfying

g(ȳ) < 0 and q_γ(ȳ) < p*_SLS.

Let h ∈ null(A^T A − γĨ) be nonzero. Then by the structure of A^T A and the construction of γ, we see that h = (h̄^T, 0)^T with h̄ ≠ 0. Moreover, we have that t ↦ q_γ(ȳ + th) is affine, while

g(ȳ + th) = g(ȳ) + 2t h^T Ĩ ȳ + t²‖h̄‖² → +∞ as |t| → ∞.

Now we choose η ∈ {±1} such that q_γ(ȳ + tηh) is non-increasing in t ≥ 0. These observations imply that there exists ᾱ > 0 such that

g(ȳ + ᾱηh) = 0 and q_γ(ȳ + ᾱηh) ≤ q_γ(ȳ) < p*_SLS,

contradicting the definition of p*_SLS. We have confirmed the third equality. Now (2.8) is a convex quadratic optimization problem where the Slater constraint qualification holds. This implies that strong duality holds, i.e. we get (2.9) with attainment for some λ ≥ 0. Now if λ < 0 in (2.9) then the Hessian of the objective is indefinite (by construction of γ) and the optimal value of the inner minimization problem is −∞. Thus, since (2.9) is maximized with respect to λ in the outer optimization problem, we may remove the non-negativity constraint and obtain (2.10). The remaining lines are due to the definition of the Lagrangian dual and weak duality. Strong duality follows immediately.
The above Theorem 2.2 shows that even though SLS is a non-convex problem, it can be formulated as an instance of GTRS and satisfies strong duality. Therefore it can be solved efficiently using, for instance, the algorithm of [29]. Moreover, in the subsequent results we show that SLS is equivalent to its semidefinite programming (SDP) relaxation in (2.15), a convex optimization problem.
We compare our SDP approach with the approach used by Beck et al. [3]. In their approach one has to solve the following system, obtained from the optimality conditions of GTRS:

(A^T A + λ*Ĩ)y = A^T b − λ*b̃,  y^T Ĩ y + 2b̃^T y = 0,  A^T A + λ*Ĩ ⪰ 0.   (2.12)

The so-called hard case results in A^T A + λ*Ĩ being singular for the optimal λ*, and this can cause numerical difficulties. We note that in our SDP relaxation, we need not differentiate between the 'hard case' and the 'easy case'.
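In the easy case, the optimality system above reduces to a one-dimensional root search in the multiplier. The following numpy sketch is our own illustration of this kind of approach on a small synthetic instance (it is not the code of [3]; all names, the noise level, and the bracketing strategy are our assumptions). It uses the fact that the constraint residual of y(λ) is monotone in λ on the interval where A^T A + λĨ is positive definite, cf. [29].

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 8, 2
P = rng.standard_normal((n, r))
P -= P.mean(axis=0)                        # centered sensors (Assumption 2.1)
x_true = np.array([0.3, -0.7])
d = np.linalg.norm(P - x_true, axis=1) + 0.01 * rng.standard_normal(n)

# GTRS data for y = (x, alpha): minimize ||A y - b||^2 subject to
# y^T I~ y + 2 b~^T y = 0, i.e. ||x||^2 = alpha.
A = np.hstack([-2.0 * P, np.ones((n, 1))])
b = d**2 - np.sum(P**2, axis=1)
I_t = np.diag(np.r_[np.ones(r), 0.0])
b_t = np.r_[np.zeros(r), -0.5]

def y_of(lam):
    return np.linalg.solve(A.T @ A + lam * I_t, A.T @ b - lam * b_t)

def g(lam):                                # constraint residual ||x||^2 - alpha
    y = y_of(lam)
    return y[:r] @ y[:r] - y[r]

# g decreases on the interval where A^T A + lam*I~ is positive definite:
# g -> +inf as lam decreases to -lambda_min(4 P^T P), and g -> -inf as lam -> +inf
lam_min = np.linalg.eigvalsh(4.0 * P.T @ P).min()
lo, hi = -0.5 * lam_min, 1.0
while g(lo) < 0:
    lo = -lam_min + 0.5 * (lo + lam_min)   # push lo toward the pole
while g(hi) > 0:
    hi *= 2.0
for _ in range(200):                       # bisection on the monotone residual
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
x_hat = y_of(0.5 * (lo + hi))[:r]          # recovered source location
```

With small noise the recovered x_hat lies close to the true source; the hard case (singular A^T A + λ*Ĩ) would require special handling, which is exactly what the SDP relaxation avoids.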

The semidefinite relaxation, SDR
We now study the convex equivalent of SLS. We analyze the dual and the SDP relaxation of GTRS. By homogenizing the quadratic objective and constraint and using the fact that strong duality holds for the standard trust region subproblem [34], we obtain an equivalent formulation of the Lagrangian dual of GTRS as an SDP. We first define

Ā := [[A^T A, −A^T b],[−b^T A, b^T b]] ∈ S^{r+2},  B := [[Ĩ, b̃],[b̃^T, 0]] ∈ S^{r+2}.

The Lagrangian dual of GTRS may be obtained as follows:

d*_GTRS = max_{λ} min_y ‖Ay − b‖² − λ(y^T Ĩ y + 2b̃^T y)
        = max_{λ} min_y [y^T 1](Ā − λB)[y^T 1]^T
        = max_{λ,s} { s : ỹ^T(Ā − λB)ỹ ≥ s for all ỹ with ỹ_{r+2} = 1 }
        = max_{λ,s} { s : Ā − λB − s e_{r+2}e_{r+2}^T ⪰ 0 }.   (2.14)

Here the first equality follows from the definition of the dual. The second and third equalities are just equivalent reformulations of the first one. For the last equality, let ỹ = (y^T, y_0)^T and M := Ā − λB − s e_{r+2}e_{r+2}^T, and suppose ỹ^T M ỹ < 0 for some ỹ. If y_0 is nonzero, we get a contradiction by scaling ỹ. If y_0 is zero, by the continuity of ỹ ↦ ỹ^T M ỹ, we can perturb ỹ by a small enough amount so that its last element is nonzero. This is a contradiction as in the previous case.
We observe that (2.14) is a dual-form SDP corresponding to the primal SDP problem, e.g. [39],

(SDR) min_X { ⟨Ā, X⟩ : ⟨B, X⟩ = 0, X_{r+2,r+2} = 1, X ⪰ 0 }.   (2.15)

Now let F and Ω, respectively, denote the feasible and optimal sets of SDR. We define the map ρ : R^{r+1} → S^{r+2} as

ρ(y) := (y^T, 1)^T (y^T, 1).

Note that ρ is an isomorphism between R^{r+1} and the rank 1 matrices of S^{r+2}_+ whose (r+2, r+2) element is 1.

Lemma 2.3 The map ρ is an isomorphism between the feasible sets of GTRS and SDR.
Moreover, the objective value is preserved under ρ, i.e. ‖Ay − b‖² = ⟨Ā, ρ(y)⟩.
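The identity in Lemma 2.3 can be verified directly. The sketch below is our own check, assuming Ā is the homogenized objective matrix [[A^T A, −A^T b],[−b^T A, b^T b]] as in the derivation of (2.14); on random data it confirms that the lifted rank-1 matrix ρ(y) reproduces the GTRS objective value.

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 6, 2
A = rng.standard_normal((n, r + 1))
b = rng.standard_normal(n)

# homogenized objective matrix Abar (our construction from the text)
Abar = np.block([[A.T @ A, -(A.T @ b)[:, None]],
                 [-(A.T @ b)[None, :], np.array([[b @ b]])]])

y = rng.standard_normal(r + 1)
Y = np.outer(np.r_[y, 1.0], np.r_[y, 1.0])   # rho(y): rank 1, Y[r+1, r+1] = 1
# objective preserved: <Abar, rho(y)> = ||A y - b||^2
assert np.isclose(np.sum(Abar * Y), np.linalg.norm(A @ y - b)**2)
```

Since every feasible rank-1 X of SDR has this form, minimizing ⟨Ā, X⟩ over such X is exactly GTRS.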

Theorem 2.4
The following holds: 1. The optimal values of GTRS, SDR, and (2.14) are all equal, finite, and attained. 2. The matrix X* is an extreme point of the optimal set of SDR if, and only if, X* = ρ(y*) for some minimizer, y*, of GTRS.
Proof From Theorem 2.2 and weak duality, we have that v_{(2.14)} = p*_GTRS. Moreover, since SDR is a relaxation of GTRS, we get

v_{(2.14)} ≤ v_{SDR} ≤ p*_GTRS = v_{(2.14)},

so the three optimal values coincide. Furthermore, from Theorem 2.2 the above values are all finite and the optimal values of GTRS and (2.14) are attained. To see that the optimal value of SDR is attained, it suffices to show that (2.14) has a Slater point. Indeed, the feasible set of (2.14) consists of all λ, s ∈ R such that

Ā − λB − s e_{r+2}e_{r+2}^T ⪰ 0.

Setting λ = 0 and applying the Schur complement condition, we have

Ā − s e_{r+2}e_{r+2}^T ≻ 0 ⟺ A^T A ≻ 0 and b^T b − s − b^T A(A^T A)^{−1}A^T b > 0.

By Theorem 2.2, A^T A is positive definite and a Slater point may be obtained by choosing s so that b^T b − s is sufficiently large. Now we consider Item 2. By the existence of a Slater point for (2.14) we know that the optimal set of SDR is compact and convex. Now we show that it is actually a face of F. To see this, let θ ∈ (0, 1) and let Z = θX + (1 − θ)Y be optimal for SDR for some X, Y ∈ F. Since Z is optimal for SDR and X and Y are feasible for SDR, we have

⟨Ā, Z⟩ = θ⟨Ā, X⟩ + (1 − θ)⟨Ā, Y⟩ ≥ θ⟨Ā, Z⟩ + (1 − θ)⟨Ā, Z⟩ = ⟨Ā, Z⟩.

Now equality holds throughout and we have ⟨Ā, X⟩ = ⟨Ā, Y⟩ = ⟨Ā, Z⟩. Therefore X and Y are optimal and, by the definition of face, we conclude that the optimal set of SDR is a face of F.
Since the optimal set of SDR is compact and convex, it has an extreme point, say X*. Now X* is also an extreme point of F, as the relation 'face of' is transitive, i.e. a face of a face is a face. Moreover, since there are exactly two equality constraints in SDR, by Theorem 2.1 of [28], we have rank(X*)(1 + rank(X*))/2 ≤ 2. This inequality is satisfied if, and only if, rank(X*) = 1. Equivalently, X* = ρ(y*) for some y* ∈ R^{r+1}. Now, by Lemma 2.3 and the first part of this proof we have that y* is a minimizer of GTRS.
For the converse in Item 2, let y* be a minimizer of GTRS. Then by Lemma 2.3, X* := ρ(y*) is optimal for SDR. To see that X* is an extreme point of the optimal set, let Y and Z be optimal for SDR with

X* = θY + (1 − θ)Z, for some θ ∈ (0, 1).

Since X* has rank 1 and Y, Z ⪰ 0, it follows that Y and Z are non-negative multiples of X*. But by feasibility, X*_{r+2,r+2} = Y_{r+2,r+2} = Z_{r+2,r+2} = 1 and thus Y = Z = X*. So, by definition, X* is an extreme point of the optimal set, as desired.
We have shown that the optimal value of SLS may be obtained by solving the nice convex problem SDR. Moreover, every extreme point of the optimal face of SDR can easily be transformed into an optimal solution of SLS. However, SDR is usually solved using an interior point method that is guaranteed to converge to a relative interior solution of the optimal set. In general, such a solution may not have rank 1. In the following corollary of Theorem 2.4 we address those instances for which the solution of SDR is readily transformed into a solution of SLS. For other instances, we present an algorithmic approach in Sect. 2.2.1.

Corollary 2.5
The following hold.
1. If GTRS has a unique minimizer, say y * , then the optimal set of SDR is the singleton ρ(y * ). 2. If the optimal set of SDR is a singleton, say X * , then rank(X * ) = 1 and ρ −1 (X * ) is the unique minimizer of GTRS.
Proof Let y* be the unique minimizer of GTRS. By Theorem 2.4 we know that ρ(y*) is an extreme point of the optimal set of SDR. Now suppose, for the sake of contradiction, that there exists X ≠ ρ(y*) in the optimal set. Since the optimal set is compact and convex, it is the convex hull of its extreme points. Thus there exists an extreme point of the optimal set, say Y, that is distinct from ρ(y*). By Theorem 2.4, we know that ρ^{−1}(Y) is a minimizer of GTRS and by Lemma 2.3, ρ^{−1}(Y) ≠ y*, contradicting the uniqueness of y*. For the converse, let X* be the unique minimizer of SDR. Then X* is the only extreme point of the optimal set and consequently ρ^{−1}(X*) is the unique minimizer of GTRS, as desired.

A purification algorithm
Suppose the optimal solution of (2.15) is X̄ with optimal value p*_SDR = ⟨Ā, X̄⟩ and rank(X̄) = r̄, where r̄ > 1. Note that we cannot obtain an optimal solution of GTRS from X̄ since the rank is too large. However, in this section we construct an algorithm that returns an extreme point of the optimal set which, by Theorem 2.4, is easily transformed into an optimal solution of GTRS. We note that this does not require the extreme point to be an exposed extreme point.
Let the compact spectral decomposition of X̄ be X̄ := UDU^T with D ∈ S^{r̄}_{++}. We use the substitution X = USU^T and solve the problem (2.20), below, to obtain an optimal solution with lower rank. Note that D ≻ 0 is a strictly feasible solution for (2.20). We choose the objective matrix C ∈ S^{r̄}_{++} to be random and positive definite. To simplify the subsequent exposition, by abuse of notation, we redefine

Ā ← U^T Ā U,  B ← U^T B U,  Ē ← U^T Ē U,

where Ē := e_{r+2}e_{r+2}^T. We define the linear map A_S : S^{r̄} → R^3 and the vector b_S ∈ R^3 as

A_S(S) := (⟨Ā, S⟩, ⟨B, S⟩, ⟨Ē, S⟩)^T,  b_S := (p*_SDR, 0, 1)^T,

respectively. The rank reducing program is

min_S { ⟨C, S⟩ : A_S(S) = b_S, S ⪰ 0 }.   (2.20)

In Algorithm 2.1 we extend the idea of the rank reducing program and in the subsequent results we prove that the output of the algorithm is a rank 1 optimal solution of SDR.

Algorithm 2.1 Purification Algorithm
Redefine A k S and b k S using U k as in (2.18) and ensure that it is full rank. 6: Choose Update k ← k + 1. 9: end while 10: OUTPUT:

Lemma 2.6
Let k ≥ 1 be an integer and suppose that C^k, A_S^k, and b_S^k are as in Algorithm 2.1, with feasible set F^k := {S ⪰ 0 : A_S^k(S) = b_S^k}. Then S^{k+1} ≻ 0 if, and only if, S^{k+1} is the only element of F^k.

Proof By construction, D^k ∈ F^k, so F^k is nonempty. For the forward direction, assume that S^{k+1} ≻ 0 and, for the sake of contradiction, suppose that S^{k+1} is not the only element of F^k. Then S^{k+1} ∈ relint(F^k) and for any T ∈ null(A_S^k) there exists ε > 0 such that

S^{k+1} ± εT ∈ F^k.

By the choice of C^k, there exists T ∈ null(A_S^k) such that ⟨C^k, T⟩ ≠ 0, and we may assume, without loss of generality, that this inner product is in fact negative. Then

⟨C^k, S^{k+1} + εT⟩ < ⟨C^k, S^{k+1}⟩,

contradicting the optimality of S^{k+1}.

Theorem 2.7 Let X̄ ∈ S^{r+2}_+ be an optimal solution to SDR. If X̄ is the input to Algorithm 2.1, then the algorithm terminates with at most rank(X̄) − 1 ≤ r + 1 calls to the while loop and the output, X*, is a rank 1 optimal solution of SDR.
Proof We proceed by considering the trivial case, rank(X̄) = 1. Clearly X* = X̄ in this case, and we have the desired result. Thus we may assume that the while loop is called at least once. We show that for every S^k generated by Algorithm 2.1 with k ≥ 1, the matrix

X^k := U^{k−1} S^k (U^{k−1})^T is optimal for SDR.   (2.21)

To this end, let us consider the constraint ⟨B, S^k⟩ = 0. By the update formula (2.18), we have

⟨B, X^k⟩ = ⟨(U^{k−1})^T B U^{k−1}, S^k⟩ = 0.

Similarly the other two constraints comprising A_S^{k−1} are satisfied by X^k and therefore X^k belongs to the optimal set of SDR. Now we show that the sequence of ranks, r_1, r_2, ..., generated by Algorithm 2.1 is strictly decreasing. It immediately follows that the algorithm terminates in at most rank(X̄) − 1 calls to the while loop and that the output matrix X* has rank 1. Suppose, to the contrary, that there exists an integer k ≥ 2 such that r_k = r_{k−1}. Then by construction, we have that rank(S^k) = r_k = r_{k−1} and S^k is a Slater point of the optimization problem (2.22). Therefore, by Lemma 2.6 we have that S^k is the only feasible solution of (2.22). Now we claim that X^k as defined above is an extreme point of the optimal set. To see this, let Y^k and Z^k be optimal for SDR with

X^k = θY^k + (1 − θ)Z^k, for some θ ∈ (0, 1).

Since Y^k and Z^k are both positive semidefinite, we have that range(Y^k) and range(Z^k) are contained in range(X^k) = range(U^{k−1}), so that

Y^k = U^{k−1} V^k (U^{k−1})^T,  Z^k = U^{k−1} W^k (U^{k−1})^T, with V^k, W^k ⪰ 0,

and it follows that V^k and W^k are feasible for (2.22). By uniqueness of S^k we have that Y^k = Z^k = X^k and X^k is an extreme point of the optimal set. Then by Theorem 2.4, rank(S^k) = 1 and Algorithm 2.1 terminates before generating r_k, a contradiction.
We remark that in many of our numerical tests the rank of X̄ was 2 or 3. Consequently, the purification process did not require many iterations.

EDM Formulation
In this section we use the Lindenstrauss operator, K, and the Schoenberg characterization to formulate SLS as an EDM completion problem. Recall that the exact locations of the sensors (towers) are known, and that the tower-source distances are noisy. The corresponding EDM restricted to the towers is denoted D_T and is defined by

(D_T)_ij := ‖p_i − p_j‖², i, j = 1, ..., n.

Then the approximate EDM for the sensors and the source is

D_Tc := [[D_T, d²],[(d²)^T, 0]] ∈ S^{n+1}, where d² := (d_1², ..., d_n²)^T.

From Assumption 2.1 the towers are centered, i.e. P_T^T e = 0. This property is desirable due to the Schoenberg characterization, which states that K is an isomorphism between S^n_+ ∩ S_C and E^n. Moreover, it allows for easy recovery of the towers in the last step of our algorithm by solving a Procrustes problem. Now let G_T := P_T P_T^T be the Gram matrix restricted to the towers, and note that K(G_T) = D_T. The nearest EDM problem with fixed sensors is

(NEDMP) min_{x ∈ R^r} ‖K(G(x)) − D_Tc‖²,   (3.1)

where for any x ∈ R^r,

G(x) := [[G_T, P_T x],[x^T P_T^T, x^T x]] ∈ S^{n+1}_+.

By simplifying the objective, we see that the NEDMP problem in (3.1) is indeed equivalent to SLS, i.e.

‖K(G(x)) − D_Tc‖² = 2 Σ_{i=1}^n (‖x − p_i‖² − d_i²)².
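The equivalence between the matrix objective and the SLS objective comes down to the observation that, for any candidate x, the EDM of the towers together with x agrees with D_Tc on the tower block, so only the last row and column contribute, each entry twice. A small numerical check of this (our own sketch; variable names and the noise level are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 10, 2
P = rng.standard_normal((n, r)); P -= P.mean(axis=0)   # centered towers
x_src = rng.standard_normal(r)
d = np.linalg.norm(P - x_src, axis=1) + 0.02 * rng.standard_normal(n)

# approximate EDM D_Tc: exact tower block, noisy squared distances to the source
D_T = np.sum((P[:, None, :] - P[None, :, :])**2, axis=2)
D_Tc = np.zeros((n + 1, n + 1))
D_Tc[:n, :n] = D_T
D_Tc[:n, n] = D_Tc[n, :n] = d**2

def Dfull(x):
    """EDM of the towers together with a candidate source location x."""
    pts = np.vstack([P, x])
    return np.sum((pts[:, None, :] - pts[None, :, :])**2, axis=2)

# only the last row/column differ from D_Tc, and each entry appears twice,
# so the matrix objective is exactly twice the SLS objective
x0 = rng.standard_normal(r)
lhs = np.linalg.norm(Dfull(x0) - D_Tc)**2
rhs = 2.0 * np.sum((np.sum((P - x0)**2, axis=1) - d**2)**2)
assert np.isclose(lhs, rhs)
```

The factor 2 is harmless: the two problems have the same minimizers.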
The approach of [13] for the related sensor network localization problem is to replace the structured matrix G(x) with a general positive semidefinite matrix X and then introduce a constraint on the block of X corresponding to the sensors. Taking this approach, we obtain the nearest Euclidean distance matrix with fixed sensors (NEDMF) problem,

min_X { ‖K(X) − D_Tc‖² : K(X)_{1:n,1:n} = D_T, rank(X) ≤ r, X ∈ S^{n+1}_+ }.   (3.2)

The objective of (3.2) is exactly the objective of SLS (acting on the matrix variable) and the affine constraint restricts X to those Gram matrices for which the block corresponding to the sensors has exactly the same distances as P_T P_T^T. That is, if

X = [X_T; x_c^T][X_T; x_c^T]^T

is feasible for (3.2), with X_T ∈ R^{n×r} and x_c ∈ R^r, then X_T differs from P_T only by translation and rotation. Since neither translation nor rotation affects the distances between the rows of X_T and x_c, we translate the points in R^r so that X_T is centered. This corresponds to the assumption that P_T is centered. Then we solve the Procrustes problem to obtain the rotation and thus have a complete description of the transformation from X_T to P_T. Applying the transformation to x_c yields a vector feasible for SLS. Thus every feasible solution of (3.2) corresponds to a feasible solution of SLS. The converse is trivially true and we conclude that (3.2) is equivalent to SLS due to the rank constraint. We show in the subsequent sections that the relaxation where the rank and the linear constraints are dropped may be used to solve the problem accurately in a large number of instances.

Nearest Euclidean distance matrix formulation
One relaxation of (3.2) is obtained by removing the affine constraint and modifying the objective as follows:

(NEDM) min_X { ‖K(X) − D_Tc‖² : rank(X) ≤ r, X ∈ S^{n+1}_+ }.

Due to the semidefinite characterization of E^{n+1}, this problem is the projection of D_Tc onto the set of EDMs with embedding dimension at most r. The motivation behind this relaxation is the assumption that the distance measurements corresponding to the sensors are very accurate. Therefore, any minimizer of NEDM will likely have the first n points very near the sensors. As we show in the subsequent sections, by introducing weights we can obtain a solution arbitrarily close to that of (3.2). The challenge in problem NEDM is the rank constraint. A simpler problem is to first solve the unconstrained least squares problem and then to project the solution onto the set of positive semidefinite matrices with rank at most r. This is equivalent to solving the inverse nearest EDM problem:

(NEDMinv) min_X { ‖X − K†(D_Tc)‖² : rank(X) ≤ r, X ∈ S^{n+1}_+ }.

Note that if the positive semidefinite constraint is removed, this problem is just the projection onto the matrices with rank at most r. By the Eckart-Young theorem, this projection is a rank r matrix obtained by setting the n + 1 − r smallest eigenvalues (in magnitude) of K†(D_Tc) to zero.
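The Eckart-Young projection is a single eigendecomposition. The sketch below is our own illustration on synthetic data (noise level and names are our assumptions): it forms K†(D_Tc), observes that for small noise it has at most one small negative eigenvalue (as the next lemma shows), and keeps the r largest-magnitude eigenvalues, which are then all positive.

```python
import numpy as np

rng = np.random.default_rng(7)
n, r = 12, 2
P = rng.standard_normal((n, r)); P -= P.mean(axis=0)
x_src = rng.standard_normal(r)
pts = np.vstack([P, x_src])
m = n + 1
d = np.linalg.norm(P - x_src, axis=1) + 0.005 * rng.standard_normal(n)

D_Tc = np.sum((pts[:, None, :] - pts[None, :, :])**2, axis=2)
D_Tc[:n, n] = D_Tc[n, :n] = d**2           # perturb only the source row/column

J = np.eye(m) - np.ones((m, m)) / m
B = -0.5 * J @ D_Tc @ J                    # K†(D_Tc)
w, V = np.linalg.eigh(B)
assert np.sum(w < -1e-8) <= 1              # at most one (small) negative eigenvalue

# Eckart-Young: keep the r largest-magnitude eigenvalues, zero the rest
idx = np.argsort(np.abs(w))[-r:]
B_r = (V[:, idx] * w[idx]) @ V[:, idx].T
assert w[idx].min() > 0                    # small noise: the projection is PSD
```

When the kept eigenvalues are positive, the rank r projection already satisfies the semidefinite constraint, so NEDMinv is solved in closed form.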
In the following lemma we show that for sufficiently small noise, the negative eigenvalue is of small magnitude and hence the Eckart-Young rank r projection is positive semidefinite. We denote by D̄ ∈ S^{n+1} the true EDM of the sensors and the source. It is easy to see from the definitions of d̄ and ε that

D_Tc = D̄ + e_{n+1}ξ^T + ξe_{n+1}^T, where ξ := (2d̄_1ε_1 + ε_1², ..., 2d̄_nε_n + ε_n², 0)^T.

In particular, K†(D_Tc) has at most 1 negative eigenvalue with magnitude bounded above by (√2/2)‖J_{n+1}‖²‖ξ‖.

Proof First we note that the norm of e_{n+1}ξ^T + ξe_{n+1}^T is bounded in terms of the magnitude of the noise:

‖e_{n+1}ξ^T + ξe_{n+1}^T‖ = √2‖ξ‖,

since e_{n+1}^T ξ = 0. Next we observe that the matrix e_{n+1}ξ^T + ξe_{n+1}^T has trace 0 and rank 2. Thus e_{n+1}ξ^T + ξe_{n+1}^T has exactly one negative and one positive eigenvalue. By the Moreau decomposition theorem, e.g. [23], e_{n+1}ξ^T + ξe_{n+1}^T may be expressed as the sum of two rank one matrices, say P ⪰ 0 and Q ⪯ 0, that are the projections of e_{n+1}ξ^T + ξe_{n+1}^T onto S^{n+1}_+ and −S^{n+1}_+, respectively. Now we have

K†(D_Tc) = (K†(D̄) − (1/2)J_{n+1}QJ_{n+1}) + (−(1/2)J_{n+1}PJ_{n+1}),

where the first term in the last line is positive semidefinite with at least r and at most r + 1 positive eigenvalues (−J_{n+1}D̄J_{n+1} is a positive semidefinite matrix with rank r and −J_{n+1}QJ_{n+1} is positive semidefinite with rank at most 1); and the second term is negative semidefinite with at most one negative eigenvalue. Using the Cauchy-Schwarz inequality, and the fact that P is a projection of e_{n+1}ξ^T + ξe_{n+1}^T, it can be shown that

‖P‖ ≤ ‖e_{n+1}ξ^T + ξe_{n+1}^T‖ = √2‖ξ‖.

It follows that K†(D_Tc) has rank at most r + 2 and, by the Courant-Fischer-Weyl theorem, e.g. [37], it has at most one negative eigenvalue whose magnitude is bounded above by (√2/2)‖J_{n+1}‖²‖ξ‖, as desired.
The following corollary follows immediately.

Weighted, facially reduced NEDM
While we have discarded the information pertaining to the locations of the sensors in relaxing the problem (3.2) to the problem NEDM , we still make use of the distances between the sensors. Thus, to some extent the locations of the sensors have an implicit effect on the optimal solution of NEDM and the approximation NEDMinv from the previous section.
In this section we take greater advantage of the known distances between the sensors by restricting NEDM to a face of S^{n+1}_+ via facial reduction. The true Gram matrix, K†(D̄), belongs to the set

F_T := {X ∈ S^{n+1}_+ ∩ S_C : K(X)_{1:n,1:n} = D_T}.

Now the constraint X ⪰ 0 in NEDM may actually be refined to say X ∈ face(F_T, S^{n+1}_+), which is given in (3.11) below. Moreover, we may obtain a closed form expression for face(F_T, S^{n+1}_+) in the form of an exposing vector. To see this, consider the spectral decomposition of the sensor Gram matrix,

G_T = [U_T  (1/√n)e  W_T] Diag(Λ_T, 0, 0) [U_T  (1/√n)e  W_T]^T,  Λ_T ∈ S^r_{++},

where U_T ∈ R^{n×r}, W_T ∈ R^{n×(n−r−1)}, and the columns of [U_T (1/√n)e W_T] form an orthogonal matrix; the column e/√n appears among the zero eigenvectors since G_T e = 0. Note that W_T W_T^T is an exposing vector for face(G_T, S^n_{c,+}) since the following two conditions hold: ⟨W_T W_T^T, G_T⟩ = 0, and W_T W_T^T has maximum rank among the centered matrices with this property. We now extend W_T W_T^T to an exposing vector for face(F_T, S^{n+1}_+).

Lemma 3.3 Let W̄_T := [W_T^T 0]^T ∈ R^{(n+1)×(n−r−1)} and let W := W̄_T W̄_T^T + ee^T, with e ∈ R^{n+1}. Then W is an exposing vector for face(F_T, S^{n+1}_+).

Proof This statement is a special case of Theorem 4.13 of [16].
Through W we have a 'nullspace' characterization of face(F_T, S^{n+1}_+). However, the 'range space' characterization is more useful in the context of semidefinite optimization as it leads to dimension reduction, numerical stability, and strong duality. To this end, we consider any (n + 1) × (r + 1) matrix V such that its columns form a basis for null(W). One such choice is

V := J_{n+1} [[P_T, 0],[0, 1]] ∈ R^{(n+1)×(r+1)}.   (3.10)

To verify that the columns of V indeed form a basis for null(W), we first observe that rank(V) = r + 1 and secondly we have

WV = (W̄_T W̄_T^T + ee^T)V = 0.

It follows that

face(F_T, S^{n+1}_+) = V S^{r+1}_+ V^T.   (3.11)

Thus we may replace the variable X in NEDMP by VRV^T for R ∈ S^{r+1}_+. To simplify the notation, we define the composite map K_V := K(V · V^T). Moreover, we introduce a weight matrix to the objective and obtain the weighted facially reduced problem, FNEDM,

(FNEDM) V_α := min_R { ‖H_α ∘ (K_V(R) − D_Tc)‖² : rank(R) ≤ r, R ∈ S^{r+1}_+ }.   (3.12)

Here H_α := αH_T + H_c with α positive, where H_T is the 0-1 matrix supported on the (off-diagonal) tower block, H_c is the 0-1 matrix supported on the last row and column, and ∘ denotes the Hadamard product. Let us make a few comments regarding this problem. When α = 1 the weight matrix has no effect and FNEDM reduces to NEDMP. On the other hand, when α is very large, the solution has to satisfy the distance constraints for the sensors more accurately and in this case FNEDM approximates (3.2). In fact, in Theorem 3.9 we prove that the solution to FNEDM approaches that of (3.2) as α increases. We begin our analysis by proving that V_α is attained.
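The exposing-vector construction can be sanity-checked numerically. The following sketch is our own illustration (all names are ours): it builds W_T as an orthonormal basis of null(G_T) orthogonal to e, assembles W as in Lemma 3.3, and verifies both that the true centered Gram matrix of towers plus source lies in the exposed face and that null(W) has the expected dimension r + 1.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 8, 2
P = rng.standard_normal((n, r)); P -= P.mean(axis=0)   # centered towers
x_src = rng.standard_normal(r)

# W_T: orthonormal basis of null(G_T) ∩ {e}^⊥, computed as null([P^T; e^T])
e = np.ones(n)
_, _, Vh = np.linalg.svd(np.vstack([P.T, e]))
W_T = Vh[r + 1:].T                         # shape (n, n - r - 1)

Wbar = np.vstack([W_T, np.zeros((1, n - r - 1))])
e1 = np.ones(n + 1)
W = Wbar @ Wbar.T + np.outer(e1, e1)       # exposing vector of face(F_T, S^{n+1}_+)

# the true centered Gram matrix K†(D) of towers + source lies in the exposed face
pts = np.vstack([P, x_src])
m = n + 1
J = np.eye(m) - np.ones((m, m)) / m
D = np.sum((pts[:, None, :] - pts[None, :, :])**2, axis=2)
X = -0.5 * J @ D @ J
assert np.allclose(W @ X, 0)

# null(W) has dimension r + 1, matching the face V S^{r+1}_+ V^T
assert np.sum(np.linalg.eigvalsh(W) < 1e-9) == r + 1
```

This is the payoff of facial reduction: the matrix variable shrinks from size n + 1 to size r + 1.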

Lemma 3.4 Let α > 0 and let f(R, α) := ‖H_α ∘ (K_V(R) − D_Tc)‖² denote the objective of FNEDM. Then: 1. the linear map R ↦ H_α ∘ K_V(R) is one-to-one; 2. f(R, α) is strictly convex and coercive; 3. the problem FNEDM admits a minimizer.
Proof For Item 1, under the assumption that α > 0, all off-diagonal entries of H_α are positive. Recall that K is one-to-one between the centered and hollow subspaces and K(0) = 0. By construction, range(V · V^T) is a subset of the centered matrices. Hence the map R ↦ H_α ∘ K_V(R) is one-to-one, and therefore f(·, α) is strictly convex and coercive, proving Item 2. Finally, the feasible set of FNEDM is closed. Combining this observation with the coercivity of the objective, from Item 2, we obtain Item 3.
We conclude this subsection by deriving the optimality conditions for the convex relaxation of FNEDM , which is obtained by dropping the rank constraint.

Lemma 3.5
The matrix R ∈ S^{r+1}_+ is optimal for the relaxation of (3.12), obtained by ignoring the rank constraint, if and only if

∇_R f(R, α) ⪰ 0 and ⟨∇_R f(R, α), R⟩ = 0.

In addition, R is optimal for (3.12) if rank(R) ≤ r.

Analysis of FNEDM
In this section we show that the optimal value of FNEDM is a lower bound for the optimal value of SLS. Moreover, this lower bound becomes exact as α is increased to +∞. In the SLS model, the distances between the towers are fixed, while in the NEDM model (3.4), the distances between towers are free. The facial reduction model allows the distances between the towers to change, but the towers can still be transformed back to their original positions by a square matrix Q ∈ R^{r×r}. Note that Q does not have to be orthogonal, so it is possible that QQ^T ≠ I.

Theorem 3.6 Let P_T be as above, V as in (3.10), and for T ∈ R^{n×r} and c ∈ R^r let P := J_{n+1}[T; c^T] be the centered configuration matrix. Then there exists a matrix Q ∈ R^{r×r} such that P_T Q = J_n T if, and only if, PP^T ∈ V S^{r+1}_+ V^T.

Proof Suppose first that P_T Q = J_n T for some Q ∈ R^{r×r}. Substituting into the definition of P yields an explicit expression for P in terms of P_T Q and c. Now by (3.11), and applying (3.13), we verify that the last statement in the equivalence holds. For the other direction, recall that V = J_{n+1}V_1 with V_1 := [[P_T, 0],[0, 1]], and suppose PP^T belongs to the face V S^{r+1}_+ V^T. Then P = J_{n+1}V_1 M for some M ∈ R^{(r+1)×r}. We show that if Q ∈ R^{r×r} denotes the first r rows of M, then P_T Q = J_n T. To this end, let J̄ := [J_n 0] and observe that J̄P = J_n T. Moreover, since J̄ is centered, J̄J_{n+1} = J̄. Then

J_n T = J̄P = J̄J_{n+1}V_1 M = J̄V_1 M = P_T Q.

Theorem 3.6 indicates that when using the facial reduction model FNEDM we can use a least-squares approach to recover exactly the original positions of the sensors. This approach will be discussed in Sect. 3.2 along with the Procrustes approach.
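The Procrustes step mentioned above has a closed-form solution via one SVD. The sketch below is our own illustration of the recovery step (names are ours): given a tower configuration X_T that agrees with P_T up to an orthogonal transformation, the minimizer of ‖X_T Q − P_T‖_F over orthogonal Q is Q = UV^T, where UΣV^T is the SVD of X_T^T P_T; the same Q then relocates the source estimate.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 7, 3
P = rng.standard_normal((n, r)); P -= P.mean(axis=0)   # centered towers
x_src = rng.standard_normal(r)

# a rotated copy, as would be recovered from factorizing an optimal Gram matrix
Q_true, _ = np.linalg.qr(rng.standard_normal((r, r)))
X_T, x_c = P @ Q_true, x_src @ Q_true

# orthogonal Procrustes: argmin_{Q^T Q = I} ||X_T Q - P_T||_F via one SVD
U, _, Vh = np.linalg.svd(X_T.T @ P)
Q = U @ Vh
assert np.allclose(X_T @ Q, P)             # towers mapped back exactly
x_hat = x_c @ Q                            # the same map relocates the source
assert np.allclose(x_hat, x_src)
```

In the noisy setting X_T Q only approximates P_T, but the same SVD gives the best orthogonal alignment in the Frobenius norm.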
In the following, we show that the optimal value of the problem in (3.12) is not greater than the optimal value of the SLS estimates (2.2) or (3.2). We also prove that the solution to FNEDM approaches that of (3.2) as α increases.

Lemma 3.7
Then V_T is finite and satisfies V_T = V_S.
Proof That V_T is finite follows from arguments analogous to those used in Lemma 3.4. For the equality claim, it is clear that V_S ≤ V_T. To show that V_S ≥ V_T, consider X that is feasible for (3.2). First we show that X may be assumed to be centered. To see this, consider X̄ := J_n X J_n. Note that X̄ is the orthogonal projection of X onto S_c, and it can be verified that X̄ = K†K(X). Now it is clear that X̄ ⪰ 0 and that K(X̄) = K(X). Moreover, since J_n is singular, we have rank(X̄) ≤ rank(X). Therefore, X̄ is also feasible for (3.2) and provides the same objective value as X. Now there exist T ∈ R^{n×r} and c ∈ R^r giving the corresponding factorization. Then, from the tower constraint of (3.2), it follows that there exists an orthogonal Q such that J_n T = P_T Q. By Theorem 3.6 we have X ∈ V S^{r+1}_+ V^T, and it follows that V_S ≥ V_T.
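The centering step in the proof above rests on the fact that X̄ = J_n X J_n is the orthogonal projection of X onto the centered subspace, where J_n := I − (1/n)ee^T is the centering projector. A minimal numerical sketch of these properties (the names are illustrative, not from the paper):

```python
import numpy as np

np.random.seed(0)
n = 5
J = np.eye(n) - np.ones((n, n)) / n    # the centering projector J_n

X = np.random.rand(n, n)
X = X + X.T                            # a symmetric test matrix
Xc = J @ X @ J                         # candidate projection onto the centered subspace

# J_n is an orthogonal projector: symmetric and idempotent
assert np.allclose(J, J.T) and np.allclose(J @ J, J)
# Xc is centered: its rows (and columns) sum to zero
assert np.allclose(Xc @ np.ones(n), 0)
# projecting a second time changes nothing
assert np.allclose(J @ Xc @ J, Xc)
```
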

Lemma 3.8
Moreover, the first inequality is strict if D_{Tc} is not an EDM with embedding dimension r.
Proof For the first inequality, let R ⪰ 0 be such that rank(R) ≤ r. Then the inequality follows, and it is strict if, and only if, the stated condition holds. For the second inequality, we first observe that V_T and V_α are both optimal values of f(R, α) over their respective domains, but the domain for V_T is smaller than that of V_α. Hence, the second inequality holds.

Theorem 3.9
For any α > 0, let R_α denote the minimizer of FNEDM. Let {α_ℓ}_{ℓ∈N} ⊂ R_{++} be an increasing sequence such that R_{α_ℓ} → R̄ for some R̄ ∈ S^{r+1}. Then V_{α_ℓ} ↑ V_T and R̄ is a minimizer of (3.14).
Proof First we note that R_{α_ℓ} is well defined by Lemma 3.4. Now, from Lemma 3.8, we have that V_{α_ℓ} is monotonically increasing and bounded above by V_T. Hence there exists V* such that the limit in (3.15) holds. Next, we show that R̄ is feasible for (3.14). Since S^{r+1}_+ is closed and the rank function is lower semicontinuous, we have rank(R̄) ≤ r and R̄ ⪰ 0. Moreover, the corresponding inequality holds for every ℓ ∈ N. Rearranging and taking the limit, we obtain (3.16); the last equality follows from the continuity of h. Since the limit in (3.16) exists, we get, by continuity, that R̄ is feasible for (3.14), and so h(R̄) ≥ V_T. On the other hand, from (3.16) we have h(R̄) ≤ V*. Combining these observations with (3.15), equality holds throughout (3.18), and the desired results are immediate.

Solving FNEDM
The solution set of the unconstrained version of (3.12) can be stated in terms of the Moore-Penrose generalized inverse of H_α • K_V, denoted (H_α • K_V)†. Indeed, the solution to the least squares problem is given in (3.19). In this subsection we explore the relationship between the optimal solution of FNEDM and the eigenvalues of R_LS. In general the Moore-Penrose inverse may be difficult to obtain; however, the following result implies that R_LS may be computed efficiently and that it is the unique minimizer of f.

Lemma 3.10
Let R_LS be as in (3.19). Then R_LS is the unique minimizer of f.
Proof That R_LS is the unique minimizer of f follows from strict convexity, as in Item 2 of Lemma 3.4. Moreover, by Item 1 of Lemma 3.4, the nullspace of H_α • K_V is trivial, so a left inverse exists. The desired expression for R_LS is obtained by substituting the left inverse into (3.19).
The map (H_α • K_V)*(H_α • K_V) admits a small matrix representation, of order depending only on r. Thus if r is small, as in many applications, its inverse, and consequently R_LS, may be obtained efficiently.
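The operator H_α • K_V is specific to this paper, so the following is only a generic sketch with a stand-in matrix A for its matrix representation (sizes hypothetical). It illustrates the point of Lemma 3.10: the least squares solution via the Moore-Penrose inverse, as in (3.19), coincides with the solution obtained from the small normal-equations system through the left inverse (A^T A)^{-1} A^T when A has full column rank:

```python
import numpy as np

np.random.seed(1)
# Stand-in for the matrix representation A of H_alpha • K_V:
# n(n+1)/2 rows, (r+1)(r+2)/2 columns, with many more rows than columns.
m, k = 15, 6
A = np.random.rand(m, k)
b = np.random.rand(m)

# Least squares solution via the Moore-Penrose inverse, cf. (3.19)
z_pinv = np.linalg.pinv(A) @ b

# With full column rank, the pseudoinverse equals the left inverse
# (A^T A)^{-1} A^T, so only a small k x k system need be solved.
z_left = np.linalg.solve(A.T @ A, A.T @ b)

assert np.allclose(z_pinv, z_left)
```
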
We consider three cases regarding the eigenvalues of R L S , each of which corresponds to a different approach to solving FNEDM .

Case I: R_LS ⪰ 0 and rank(R_LS) ≤ r.
In the best scenario, Case I, R_LS is the unique minimizer of FNEDM. In this case FNEDM reduces to an unconstrained convex optimization problem; moreover, we have a closed-form solution for the minimizer, R_LS. In Case II, the minimizer of FNEDM may also be obtained through a convex relaxation, as is indicated by the following result. Proof Let R′ denote the optimal solution of FNEDM without the rank constraint. Note that R′ exists by arguments analogous to those in Lemma 3.4. If rank(R′) ≤ r, then clearly R′ is a minimizer of FNEDM. Thus we may assume that rank(R′) > r. Since R_LS is the unique minimizer of f, we have f(R_LS) < f(R′). Moreover, by strict convexity of f, every matrix R̄ in the relative interior of the line segment between R_LS and R′ satisfies f(R̄) < f(R′). Then R̄ is feasible for the relaxation of FNEDM where the rank constraint is removed. However, f(R̄) < f(R′), contradicting the optimality of R′.
In Case III we are motivated by the primal-dual approach of [30,31] and the penalty approach of [19,30,31]. Letting h = [1, · · · , α]^T, it is easy to see that (3.12) is equivalent to the problem (3.20). As in [30,31], we define K^{n+1}_+(r) := {Y ∈ S^{n+1} | −J_{n+1} Y J_{n+1} ⪰ 0, rank(J_{n+1} Y J_{n+1}) ≤ r}, and let B(Y) = 0 represent the linear constraints diag(Y) = 0 and ⟨J_{n+1} Y J_{n+1}, W⟩ = 0. Then (3.20) may be written as (3.21). If B consists only of the diagonal constraint and T = I, then (3.21) is exactly the problem considered in [30,31], where a sufficient condition for strong duality was presented. In the subsequent results, we present an analogous dual problem for the general constraint B(Y) = 0.

Lemma 3.12
The Lagrangian dual of (3.21) is given by (3.23). We then write the dual objective function θ : R^{n+2} → R as in (3.26)-(3.27). To pass from (3.26) to (3.27), we need to prove that the triangle equality holds, i.e.
To this end, consider any matrix X ∈ S^{n+1} and let Π(X) be a nearest point in K^{n+1}_T(r) to X. Since K^{n+1}_T(r) is a cone, the ray {tΠ(X) : t ≥ 0} is contained in the set K^{n+1}_T(r).
Moreover, this ray is convex and Π(X) is the nearest point to X from this ray. Now we can use orthogonality: Π(X) − X is orthogonal to Π(X) − 0. Then the triangle equality follows. The Lagrangian dual problem is then defined accordingly, as desired.
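The orthogonality argument above can be checked numerically for the analogous, simpler case of projection onto the PSD cone (also a closed cone): at the nearest point P = Π(X), we have ⟨P − X, P⟩ = 0, which yields the Pythagorean identity ‖X‖² = ‖P − X‖² + ‖P‖². A small sketch, not using the paper's cone K^{n+1}_T(r):

```python
import numpy as np

def proj_psd(X):
    """Nearest PSD matrix to symmetric X in Frobenius norm:
    clip the negative eigenvalues to zero."""
    w, V = np.linalg.eigh(X)
    return V @ np.diag(np.maximum(w, 0)) @ V.T

np.random.seed(2)
n = 6
X = np.random.rand(n, n); X = (X + X.T) / 2
P = proj_psd(X)

# Orthogonality at the projection: <P - X, P> = 0 ...
assert abs(np.sum((P - X) * P)) < 1e-8
# ... which gives ||X||^2 = ||P - X||^2 + ||P||^2
assert abs(np.sum(X * X) - (np.sum((P - X) ** 2) + np.sum(P * P))) < 1e-8
```
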
In [30,31] it is shown that the Lagrangian dual has compact level sets and, therefore, the optimal value is finite and attained. The dual problem (3.28) can be solved by the semismooth Newton approach proposed in [30].
In [30,31], the authors proposed a rank majorization approach where strong duality is guaranteed if the penalty function goes to zero. The approach can be readily modified to replace the diagonal constraint by the linear constraint B and to include the diagonal weight matrix T. The strong duality result and the global optimality condition also carry over to our problem (3.21). The drawback of this approach is its slow convergence when n is large. Therefore, in our facial reduction model we prefer to stay in S^{r+1} rather than S^{n+1}, since the dimension is lower. Hence we develop a rank majorization approach in S^{r+1} as follows. To penalize rank, we consider the concave penalty function p in (3.29). Note that p is nonnegative over the positive semidefinite matrices and vanishes exactly when the rank constraint holds. Hence, p is an appropriate penalty function for the rank constraint of FNEDM. Now we consider the penalized version of FNEDM, PNEDM in (3.30), where γ is a positive constant. The objective is a difference of convex functions and the feasible set is convex. The literature on this type of optimization problem is extensive and the theory well established. In particular, the well-known majorization approach guarantees convergence to a matrix satisfying the first order necessary conditions for PNEDM, i.e. a stationary point; see for instance [35,36]. The majorization approach is outlined below in Algorithm 3.1. Central to the approach is the observation that p, being concave, is majorized by its linear approximation. In the algorithm, ∂p(R) denotes the subdifferential of p at R. Thus at every iterate, we choose U_k ∈ ∂p(R_k) and solve the convex subproblem (3.31) to obtain the next iterate R_{k+1}.
Proof By [35,36], the stationary point R̄ satisfies the stated first order condition. Under the assumption rank(R̄) = r, we have a spectral decomposition of R̄ with the columns of V_1 being the eigenvectors corresponding to λ_1, . . . , λ_r. Due to the convexity of f(R, α), for any R̃ ∈ face(R̄) the optimality inequality holds. Hence our claim is proved.
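The majorization step can be illustrated with one standard concave rank surrogate, p(R) = trace(R) − (sum of the r largest eigenvalues of R); this is a plausible stand-in for (3.29), though the paper's exact form may differ. The sketch below verifies the two properties the algorithm relies on: the linearization at an iterate majorizes p, and low-rank PSD matrices incur no penalty.

```python
import numpy as np

np.random.seed(3)
n, r = 6, 2

def p(R):
    """Concave rank surrogate: trace(R) minus the sum of the r largest
    eigenvalues. Nonnegative on the PSD cone; zero there iff rank(R) <= r."""
    w = np.linalg.eigvalsh(R)           # eigenvalues in ascending order
    return R.trace() - w[-r:].sum()

def supergrad(R):
    """U = I - V_r V_r^T, with V_r the eigenvectors of the r largest
    eigenvalues: a supergradient of the concave function p at R."""
    w, V = np.linalg.eigh(R)
    Vr = V[:, -r:]
    return np.eye(n) - Vr @ Vr.T

A = np.random.rand(n, n); R = A @ A.T   # a random PSD iterate R_k
B = np.random.rand(n, n); S = B @ B.T   # an arbitrary PSD trial point

U = supergrad(R)
# Concavity: the linearization at R majorizes p, as used in (3.31)
assert p(S) <= p(R) + np.sum(U * (S - R)) + 1e-10
# PSD matrices of rank <= r are not penalized
C = np.random.rand(n, r)
assert abs(p(C @ C.T)) < 1e-10
```

Each iteration of the majorization algorithm then minimizes the convex surrogate f(R) + γ⟨U_k, R⟩ over the feasible set in place of the concave-penalized objective.
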

Identifying outliers using ℓ1 minimization and facial reduction
In this section, we address the issue of unequal noise, where a few distance measurements are outliers, i.e., much more inaccurate than the others. We use ℓ1 norm minimization to try and identify the outliers, and we remove them to obtain a more stable problem. We assume that many more towers are available than necessary, so that removal of a few outliers leaves us with towers that still satisfy Assumption 2.1. Problem (3.12) is equivalent to minimizing the residual of an overdetermined linear system over an SDP cone. Let z := svec(R) for R ∈ S^{r+1}. Abusing our previous notation, let b := svec(H_α • D_{Tc}) and let A denote the matrix representation of H_α • K_V. Then z ∈ R^{(r+1)(r+2)/2} and b ∈ R^{n(n+1)/2}. In practice, n is much larger than r + 1, so A has more rows than columns; in other words, we have an overdetermined system. Under this new notation, problem (3.12) is equivalent to

min ‖δ‖  s.t.  Az − b = δ,  sMat(z) ⪰ 0.  (3.33)

To motivate the compressed sensing approach, suppose that only the outlier measurements are noisy and that the remaining measurements are accurate. If z̄ denotes the true solution, then Az̄ − b is sparse, and we consider the popular ℓ1 norm minimization problem (3.34). The problem (3.34) differs from the classical compressed sensing model in the positive semidefinite constraint. However, in our numerical tests, we have found that adding the positive semidefinite constraint greatly increases the success rate in identifying outliers. In compressed sensing, if the matrix N satisfies the so-called restricted isometry property, then the sparse signal can be recovered exactly [7, Theorem 1.1]. However, no practical algorithms are currently available to check whether a given matrix satisfies the restricted isometry property. If δ_0 is the solution to (3.34) and most of the elements of δ_0 are 0, then the nonzero elements indicate the outlier measurements.
Thus far, we have assumed that most of the measurements in b are exact and a few have large error. Now let us revert to the original assumption of this section: that most elements of b are slightly inaccurate and a few elements are very inaccurate. If the positive semidefinite constraint is ignored, then the identification of outliers is guaranteed to be accurate, assuming that N satisfies the restricted isometry property. To be specific, if δ# represents the optimal solution of (3.34) without the positive semidefinite constraint, then ‖δ# − δ_0‖_2 ≤ C_S · ε, where C_S and ε are small constants [6,7]. The specifics of our outlier-detection algorithm are stated in Algorithm 3.2.
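Dropping the positive semidefinite constraint, the ℓ1 subproblem min ‖Az − b‖₁ is a linear program and can be sketched as follows (a toy instance with illustrative sizes, not the paper's A and b; Algorithm 3.2 itself additionally retains the PSD constraint):

```python
import numpy as np
from scipy.optimize import linprog

np.random.seed(4)
m, k = 30, 4                       # overdetermined: many more rows than columns
A = np.random.randn(m, k)
z_true = np.random.randn(k)
b = A @ z_true
outliers = [5, 17]                 # corrupt two measurements with gross errors
b[outliers] += np.array([10.0, -8.0])

# min ||Az - b||_1 as an LP: min 1^T t  s.t.  -t <= Az - b <= t,
# over the stacked variable x = [z; t].
c = np.concatenate([np.zeros(k), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
b_ub = np.concatenate([b, -b])
bounds = [(None, None)] * k + [(0, None)] * m
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)

delta = A @ res.x[:k] - b
# The largest residuals flag the outlier measurements
flagged = set(np.argsort(-np.abs(delta))[:2])
assert flagged == set(outliers)
```
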

Recovering the source position from the Gram matrix
After finding the EDM from our data, we need to rotate the sensors back to their original positions in order to recover the position of the source. This is done by solving a Procrustes problem. That is, suppose that the appropriately partitioned final EDM, the corresponding Gram matrix, and the points are given as above. Assuming P̂_f and the original data P_T are both centered, we now have two approaches. The first approach solves the Procrustes problem (3.36) using [20, Algorithm 12.4.1]. The optimal solution can be found explicitly from the singular value decomposition of P̂_f^T P_T: if P̂_f^T P_T =: U_f Σ_f V_f^T, then the optimal solution to (3.36) is Q* := U_f V_f^T. The recovered position of the source is then p_c^T = p_f^T Q*. The second approach is to solve the least squares problem (3.37). The least squares solution is Q̂ = P̂_f^† P_T, where P̂_f^† is the Moore-Penrose generalized inverse of P̂_f. The recovered position of the source is then p_c^T = p_f^T Q̂.
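Both approaches can be sketched on a synthetic instance: we rotate centered tower positions by a random orthogonal matrix and recover the rotation via the Procrustes SVD formula and via the pseudoinverse (names and sizes illustrative):

```python
import numpy as np

def procrustes(Pf, PT):
    """Orthogonal Q minimizing ||Pf Q - PT||_F, cf. (3.36):
    if Pf^T PT = U S V^T, then Q* = U V^T."""
    U, _, Vt = np.linalg.svd(Pf.T @ PT)
    return U @ Vt

np.random.seed(5)
n, r = 8, 2
PT = np.random.randn(n, r)
PT -= PT.mean(axis=0)              # centered original tower positions

# Rotate/reflect the towers by a random orthogonal matrix
Q_true, _ = np.linalg.qr(np.random.randn(r, r))
Pf = PT @ Q_true.T

Q = procrustes(Pf, PT)
assert np.allclose(Pf @ Q, PT)     # towers mapped back to their positions

# Least squares alternative, cf. (3.37): Q_hat = pinv(Pf) @ PT
Q_ls = np.linalg.pinv(Pf) @ PT
assert np.allclose(Q_ls, Q)        # here Pf is an exact rotation, so both agree
```

In this noiseless sketch the two approaches coincide; under noise, the Procrustes solution stays orthogonal while the least squares solution need not, which is the distinction between the P and L prefixes in the numerical results.
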

Numerical results
To compare the different methods, we used randomly generated data with an error proportional to the distance to each tower; the proportionality is given by η. This gives the noisy distances in (4.1), where D is the generated EDM and ε ∼ U(−η, η). The outliers are obtained by multiplying (4.1) by another factor θ for a small subset of the indices. We let M denote the set of optimization methods to be tested. Then for M ∈ M, the relative error, c^M_re, between the true location of the source, c, and the location obtained using method M, denoted c_M, is computed. The data is then obtained by calculating this error for all the methods while varying the error η in (4.1) and the number of sensors n. For each pair (n, η), one hundred instances are solved. The methods in the tables are labelled according to the models, with some additional prefixes. To be specific, the L and P prefixes represent the different ways used to obtain the position of the source, c: L denotes the least squares approach of (3.37) and P the Procrustes approach of (3.36). We choose α = 1 in FNEDM, and the constant γ for PNEDM in (3.30) is chosen to be 1000. We report some results in the following tables.
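A sketch of the test-data generator and error measure, under one plausible reading of (4.1), namely multiplicative noise d̃ = d(1 + ε) with ε ∼ U(−η, η) (all names here are illustrative):

```python
import numpy as np

np.random.seed(6)
n, r, eta = 10, 2, 0.05

towers = np.random.rand(n, r)      # sensor (tower) positions
source = np.random.rand(r)         # true source position c

# True distances from each tower to the source
d = np.linalg.norm(towers - source, axis=1)

# Noise proportional to distance, a plausible reading of (4.1):
# d_noisy = d * (1 + eps), eps ~ U(-eta, eta)
eps = np.random.uniform(-eta, eta, size=n)
d_noisy = d * (1 + eps)

def rel_err(c_M, c):
    """Relative error between recovered location c_M and true location c."""
    return np.linalg.norm(c_M - c) / np.linalg.norm(c)

# Perturbations are bounded by eta times the true distance
assert np.all(np.abs(d_noisy - d) <= eta * d + 1e-12)
assert rel_err(source, source) == 0.0
```
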
From Tables 1 and 2 we can see that, generally, P-FNEDM has the smallest error, while occasionally L-FNEDM is better. We can also see that as the number of towers n increases, the relative error c^M_re decreases; this is expected, since with more sensors the location of the source should be more accurate.
To compare the overall performance of all the methods, we use the well known performance profiles [14]. The approach is outlined below.
For each pair (n, η) and its one hundred solved instances, we calculate the mean of the relative error c^M_re for method M. We denote this mean by c_{n,η,M}, i.e., the mean over 100 instances with n towers, error factor η, and method M.
We then compute the performance ratio. The performance profiles can be seen in Fig. 1a, b; the P-FNEDM approach has the best performance among all five methods. Moreover, using the Procrustes approach (3.36) is better than using the least squares approach (3.37). Allowing the sensors to move, as in the FNEDM model, is better for recovering the location of the source than fixing the sensors, as in SDR, or leaving the sensors completely free, as in NEDM.
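The performance-profile computation of [14] can be sketched as follows: the ratio for method M on a problem is its error divided by the best error on that problem, and the profile ρ_M(τ) is the fraction of problems on which M is within a factor τ of the best (the error data below is random, purely for illustration):

```python
import numpy as np

np.random.seed(7)
# Illustrative mean relative errors c_{n,eta,M}:
# rows = (n, eta) problem pairs, columns = methods
errs = np.random.rand(20, 5) + 0.01

# Performance ratio: error divided by the best error on each problem
ratios = errs / errs.min(axis=1, keepdims=True)

def profile(ratios, tau):
    """rho_M(tau): fraction of problems on which method M is within a
    factor tau of the best method, cf. [14]."""
    return (ratios <= tau).mean(axis=0)

# At tau = 1 the profiles sum to 1 (generically one winner per problem);
# each profile is nondecreasing in tau and reaches 1 for tau large enough.
assert np.isclose(profile(ratios, 1.0).sum(), 1.0)
assert np.all(profile(ratios, 2.0) >= profile(ratios, 1.0))
assert np.all(profile(ratios, ratios.max()) == 1.0)
```
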
We also generate data with outliers. In FNEDM, the outliers are detected and removed using the ℓ1 norm approach described in Sect. 3.1.5. We report the results with outliers added in Tables 3 and 4 (outlier factor θ ∼ U(3, 6)). From Tables 3 and 4 we can clearly see that when outliers are added, FNEDM outperforms both SDR and NEDM by a wide margin, as the outliers can be removed. This is also consistent with our previous conclusion that the Procrustes approach (3.36) is better than the least squares approach (3.37).

Conclusion
We showed that the SLS formulation of the single source localization problem is inherently convex, by considering the semidefinite relaxation, SDR, of the GTRS formulation. The extreme points of the optimal set of SDR correspond exactly to the optimal solutions of the SLS formulation and these extreme points can be obtained by solving no more than r + 1 convex optimization problems.
We also analyzed several EDM based relaxations of the SLS formulation and introduced the weighted facial reduction model FNEDM. The optimal value of FNEDM was shown to converge to the optimal value of SLS by increasing α. In our numerical tests, we showed that our newly proposed model FNEDM performs the best for recovering the location of the source. Without any outliers present, the performance of each method improves as the number of towers increases. This is expected since more information is available. All the methods tend to perform similarly as the number of towers increases but the facial reduction model, FNEDM, using the Procrustes approach performs the best.
Finally, we used the ℓ1 norm approach in Algorithm 3.2 to remove outlier measurements. In Tables 3 and 4 we demonstrate the effectiveness of this approach.