1 Preliminaries

1.1 Introduction

Sensor network localization (SNL) is the problem of estimating the unknown positions of m sensors from the known positions of n anchors and the measured distances (which may contain measurement errors) between sensor–sensor or sensor–anchor pairs. In this problem, let \(E_{\mathrm {ss}}\) and \(E_{\mathrm {sa}}\) be the sets of sensor–sensor and sensor–anchor pairs, respectively, with known measured distances, and let \(d_{ij}\) and \(d_{ik}\) be the measured distances for each \(ij :=\{i,j\}\in E_{\mathrm {ss}}\) and \(ik:=\{i,k\} \in E_{\mathrm {sa}}\), respectively. Let the anchor coordinates be \({\varvec{a}}_k= (a_{k1},\dots ,a_{kd})^\top \in \mathbb {R}^d \ (k = m + 1,\dots , m + n)\). Then, the SNL problem is formulated as the following system of equations with variables \({\varvec{x}}_i\in \mathbb {R}^d \ (i = 1,\dots , m)\):

$$\begin{aligned} \Vert {\varvec{x}}_i-{\varvec{x}}_j\Vert _2 = d_{ij}\ (\forall ij \in E_{\mathrm {ss}}),\quad \Vert {\varvec{x}}_i-{\varvec{a}}_k\Vert _2 = d_{ik}\ (\forall ik \in E_{\mathrm {sa}}). \end{aligned}$$
(1)

When a matrix variable Z is introduced, finding the \({\varvec{x}}_1,\dots ,{\varvec{x}}_m\) satisfying system (1) is known to be equivalent to finding a solution of the following semidefinite programming (SDP) problem with a rank constraint [4, 21]:

$$\begin{aligned} \vline \quad \begin{aligned}&\text {min}&0\\&\text {s.t.}&A_{ij}\cdot Z = d_{ij}^2\quad (\forall ij\in E_{\mathrm {ss}}),\\&A_{ik}\cdot Z = d_{ik}^2\quad (\forall ik\in E_{\mathrm {sa}}),\\&Z_{(1:d,1:d)} = I_d,\\&{\mathrm{rank}}(Z) \le d,\\&Z \in \mathcal {S}_+^{d+m}. \end{aligned} \end{aligned}$$
(2)

Here, for each \(ij\in E_\mathrm{ss}\) and \(ik \in E_\mathrm{sa}\),

$$\begin{aligned} A_{ij}&:=\begin{pmatrix} {\varvec{0}}_d\\ {\varvec{e}}_i-{\varvec{e}}_j \end{pmatrix}\begin{pmatrix} {\varvec{0}}_d\\ {\varvec{e}}_i-{\varvec{e}}_j \end{pmatrix}^\top ,\ \quad A_{ik} :=\begin{pmatrix} {\varvec{a}}_k\\ -{\varvec{e}}_i \end{pmatrix}\begin{pmatrix} {\varvec{a}}_k\\ -{\varvec{e}}_i \end{pmatrix}^\top , \end{aligned}$$

where \({\varvec{e}}_1,\dots ,{\varvec{e}}_m\) denote the canonical basis of \(\mathbb {R}^m\). See Sect. 1.2 for the definitions of the other symbols. When \(d_{ij}\) and \(d_{ik}\) contain measurement errors, problem (2) generally has no solution. Therefore, to account for errors in \(d_{ij}\) and \(d_{ik}\), researchers have often considered problem (3) defined below [4, 8–10, 16, 19]. In this problem, the distance constraints are incorporated into the objective function as quadratic errors, so that violating those constraints is penalized. In this paper, we also estimate sensor positions in SNL by solving this problem:

$$\begin{aligned} \vline \quad \begin{aligned}&\text {min}&\frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}(A_{ij}\cdot Z-d_{ij}^2)^2 + \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}( A_{ik}\cdot Z-d_{ik}^2)^2\\&\text {s.t.}&Z_{(1:d,1:d)}=I_d,\\&{\mathrm{rank}}(Z) \le d,\\&Z\in \mathcal {S}_+^{d+m}. \end{aligned} \end{aligned}$$
(3)
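To make the lifted formulation concrete, the following small NumPy sketch (ours, for illustration only; the paper's experiments use MATLAB) builds a feasible point Z of problem (2) from true sensor positions and checks that \(A_{ij}\cdot Z\) and \(A_{ik}\cdot Z\) recover the squared distances in system (1).

```python
# Illustrative check (not from the paper) of the lifted formulation: with
# Z built from the true sensor positions X, the linear maps A_ij . Z and
# A_ik . Z recover the squared distances appearing in system (1).
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 2, 4, 3                  # dimension, number of sensors, anchors
X = rng.random((d, m))             # true sensor positions (as columns)
anc = rng.random((d, n))           # anchor positions (as columns)

# Z = [[I_d, X], [X^T, X^T X]] is a feasible rank-d point of problem (2).
Z = np.block([[np.eye(d), X], [X.T, X.T @ X]])

def A_ss(i, j):                    # A_ij for a sensor-sensor pair
    w = np.zeros(d + m)
    w[d + i], w[d + j] = 1.0, -1.0
    return np.outer(w, w)

def A_sa(i, k):                    # A_ik for a sensor-anchor pair
    w = np.concatenate([anc[:, k], np.zeros(m)])
    w[d + i] = -1.0
    return np.outer(w, w)

i, j, k = 0, 1, 2
assert np.isclose(np.sum(A_ss(i, j) * Z), np.sum((X[:, i] - X[:, j])**2))
assert np.isclose(np.sum(A_sa(i, k) * Z), np.sum((X[:, i] - anc[:, k])**2))
```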

The SNL problem is generally known to be NP-hard [1], and the formulations of the SNL problem as optimization problems (2) and (3) are also nonconvex. This nonconvexity is due to the rank constraint that appears in problems (2) and (3). Therefore, many previous SNL studies removed the rank constraint and relaxed the problem into an SDP problem to estimate approximate sensor positions [2–5, 14, 22]. Among these methods, sparse full SDP (SFSDP) as proposed by Kim et al. [13] is especially representative. However, a solution of the SDP relaxation problem is not always a solution of the original problem. So and Ye [18] referred to problem (1) as “uniquely localizable” when it has a unique solution and there is no configuration of the sensors satisfying all of the given distances in a higher-dimensional space. They proved that problem (1) is uniquely localizable if and only if the maximum rank of the solutions of the SDP relaxation of problem (2) is d. Therefore, if the problem is uniquely localizable, then the exact sensor positions can be determined by solving the SDP relaxation problem with the interior-point method, which is a representative method for solving general SDP problems. Otherwise, an optimal solution of the SDP relaxation problem corresponds to a configuration of the sensors in a higher-dimensional space because of the max-rank property of the interior-point method [11], and it thus might give poorly estimated sensor positions in d-dimensional space.

On the other hand, the exact sensor positions in d-dimensional space can be estimated only if problem (1) has a unique solution (Wan et al. [21] referred to a problem satisfying this condition as “locatable”). Therefore, methods have recently emerged for estimating the sensor positions by solving problem (2) or (3) itself. Wan et al. [21] proposed a method that obtains a solution of problem (2) by solving SDP problems multiple times. This approach is based on the fact that a matrix has rank at most d if and only if its \((d+1)\)th and subsequent eigenvalues are all zero. Numerical experiments showed that their method’s estimation accuracy was better than that of an SDP relaxation-based method. However, their method took more time than the latter, because it requires solving an SDP problem via an SDP solver at each iteration. Wan et al. [20] also proposed a method that transforms the rank-constrained SDP problem into an SDP problem with a complementarity constraint and alternately performs minimization with regard to the two semidefinite matrices that appear in the problem. Numerical experiments also confirmed that this method was more accurate than SDP relaxation methods such as SFSDP. However, as with the method of [21], it required too much time to estimate the sensor positions.

Another method for solving general SDP problems is the Burer–Monteiro factorization, in which a semidefinite matrix Z is factorized into the form \(VV^\top \) and a nonconvex optimization problem is solved after the factorization [7]. If the number of columns of V is chosen as r in this factorization, then it introduces the constraint that the rank be less than or equal to r. Therefore, this method is suitable for obtaining low-rank solutions of SDP problems. In a series of studies, Chang and colleagues [8–10] attempted to estimate sensor positions by using the Burer–Monteiro factorization. First, Chang and Xue [9] proposed a method that applies the limited-memory Broyden–Fletcher–Goldfarb–Shanno method to the problem after the Burer–Monteiro factorization. However, they set the number of columns of V used in the factorization equal to that of the semidefinite matrix before the factorization, so their method does not consider the rank constraint. Second, Chang et al. [10] used the same method as in [9] to estimate sensor positions in three-dimensional space, but unlike in [9], they set the number of columns of V to three, so we can say that this method takes the rank constraint into account. They compared it with SFSDP through numerical experiments and reported that the sensor positions could be estimated more quickly than with SFSDP and with the same level of accuracy. However, those experiments involved only small problems with up to 200 sensors. Finally, Chang and Liu [8] proposed a method that they called NLP-FD, which solves the optimization problem obtained from the Burer–Monteiro factorization by the curvilinear search algorithm [23]. Their numerical experiments showed the superiority of NLP-FD over SFSDP when the problem is large in scale and the measured distances include errors.

In this paper, we propose a new method for SNL that accounts for the rank constraint. First, we factorize Z in problem (3) into a product of two matrices through the Burer–Monteiro factorization:

$$\begin{aligned} Z = \begin{pmatrix} I_d\\ U^\top \end{pmatrix}\begin{pmatrix} I_d\\ V^\top \end{pmatrix}^\top . \end{aligned}$$

This factorization is equivalent to the constraints of problem (3) under the additional constraint \(U - V ={\varvec{O}}\) (see Sect. 2.1). Therefore, problem (3) can be transformed into an unconstrained multiconvex optimization problem by adding a penalty term for the difference between the two matrices, weighted by a penalty parameter \(\gamma \), to the objective function. Then, the block coordinate descent method can be applied to the new objective function, and optimization can be performed sequentially for each column of U and V. We formalize this procedure as Algorithm 1.

We also analyze the proposed method theoretically. First, we show that each subproblem in Algorithm 1 is an unconstrained convex quadratic optimization problem and can be solved analytically (Theorem 2). Second, we show that any accumulation point of the sequence generated by Algorithm 1 is a stationary point of the objective function (Theorem 3). Third, we give a range of \(\gamma \) for which the two matrices U and V used in the factorization coincide at any accumulation point (Theorem 4). Finally, we explain the relationship between the objective function in the reformulated problem and the augmented Lagrangian. Numerical experiments confirm that the proposed method does inherit the rank constraint; furthermore, the results demonstrate not only that our method estimates sensor positions faster than SFSDP and NLP-FD without sacrificing estimation accuracy, especially when the measured distances include errors, but also that our method does not run out of memory even for large-scale SNL problems.

The rest of this paper is organized as follows. In Sect. 2, we present the proposed method and analyze it theoretically. In Sect. 3, we compare it with the other methods to confirm its effectiveness. Finally, in Sect. 4, we present our conclusions and suggest possible future work.

1.2 Notation

  • \(\mathbb {N}\) denotes the set of natural numbers without zero. \(\mathbb {R}^{p}\) denotes the set of p-dimensional real vectors, and \({\varvec{0}}_p\) denotes the zero vector of \(\mathbb {R}^p\). When the size is clear from the context, we omit the size subscript. \(\mathbb {R}^{p\times q}\) denotes the set of \(p\times q\) real matrices. Let \(I_p\) be the identity matrix of \(\mathbb {R}^{p\times p}\) and \({\varvec{O}}\) be the zero matrix of an appropriate size. \(\mathcal {S}_+^p\) denotes the set of \(p\times p\) symmetric positive semidefinite matrices.

  • For \({\varvec{x}} \in \mathbb {R}^p\), \(\Vert {\varvec{x}}\Vert _2\) denotes the 2-norm of \({\varvec{x}}\). For \(A,\ B\in \mathbb {R}^{p\times q}\), \(A \cdot B\) denotes the inner product between A and B, defined as \({\mathrm{tr}}(A^\top B)\); \(\Vert A\Vert _F\) denotes the Frobenius norm of A; \(A_{(i:j,k:l)}\) denotes the submatrix of A obtained by choosing the \(\{i,\dots ,j\}\)th rows and the \(\{k,\dots ,l\}\)th columns of A; and \({\mathrm{rank}}(A)\) denotes the rank of A. For any symmetric matrix A, \(\lambda _{\mathrm {max}}(A)\) denotes its maximum eigenvalue.

  • For \( i=1,\dots ,m\), \(E_{\mathrm {ss}}[i]\) and \(E_{\mathrm {sa}}[i]\) denote the sets of sensors and anchors, respectively, that are connected directly to sensor i.

2 Proposed method and analyses

2.1 Proposed method

Problem (3) is an SDP problem with a rank constraint and is difficult to solve directly. In this subsection, we propose a new method that transforms problem (3) into an unconstrained multiconvex optimization problem and solves the latter problem sequentially to estimate sensor positions. First, a matrix Z satisfies the three constraints in problem (3) if and only if it can be factorized into the product of two matrices as follows:

$$\begin{aligned} Z = \begin{pmatrix} I_d & U\\ U^\top & U^\top U \end{pmatrix} = \begin{pmatrix} I_d\\ U^\top \end{pmatrix}\begin{pmatrix} I_d\\ V^\top \end{pmatrix}^\top ,\ U-V={\varvec{O}}. \end{aligned}$$

Thus, problem (3) is equivalent to

$$\begin{aligned} \vline \quad \begin{aligned}&\text {min}&f(U,V) :=\frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}\left( A_{ij}\cdot \begin{pmatrix} I_d\\ U^\top \end{pmatrix}\begin{pmatrix} I_d\\ V^\top \end{pmatrix}^\top -d_{ij}^2\right) ^2 \\&+ \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}\left( A_{ik}\cdot \begin{pmatrix} I_d\\ U^\top \end{pmatrix}\begin{pmatrix} I_d\\ V^\top \end{pmatrix}^\top -d_{ik}^2\right) ^2\\&\text {s.t.}&U - V ={\varvec{O}}.\\ \end{aligned} \end{aligned}$$
(4)

To make problem (4) easier to solve, we remove the constraint \(U - V={\varvec{O}}\) and add a quadratic penalty term \((\gamma /2)\Vert U-V\Vert _F^2\) with a penalty parameter \(\gamma \ (>0)\) to the objective function; as a result, the objective function takes larger values the more strongly the constraint \(U - V={\varvec{O}}\) is violated. In other words, we adopt the following unconstrained optimization problem:

$$\begin{aligned} \vline \quad \begin{aligned}&\text {min}&F(U,V;\gamma ) :=\frac{\gamma }{2}\Vert U-V\Vert _F^2 + f(U,V). \end{aligned} \end{aligned}$$
(5)

In the proposed algorithm, we let \(U = ({\varvec{u}}_1,\dots ,{\varvec{u}}_m)\) and \(V = ({\varvec{v}}_1,\dots ,{\varvec{v}}_m)\) and then perform minimization with regard to \({\varvec{u}}_1,\dots ,{\varvec{u}}_m,{\varvec{v}}_1,\dots ,{\varvec{v}}_m\), i.e., each column of U and V sequentially. Specifically, the procedure is as listed in Algorithm 1.

Algorithm 1 (block coordinate descent for problem (5)) [the pseudocode listing is rendered as a figure in the original]
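Since the listing itself is not reproduced here, the following NumPy sketch (ours; the paper's implementation is in MATLAB) illustrates the procedure that Algorithm 1 formalizes. The per-column updates are the closed forms derived in Theorem 2 below; the stopping rule (a small change in the iterates, with tolerance \(\epsilon \)) and the final averaging of U and V are our stand-ins for details of the original listing.

```python
# A minimal NumPy sketch (ours) of the block coordinate descent that
# Algorithm 1 formalizes.  The per-column updates are the closed forms of
# Theorem 2; the stopping rule and the final averaging are our stand-ins.
import numpy as np

def bcd_localize(anchors, E_ss, E_sa, dist, gamma, U0, eps=1e-5, max_iter=1000):
    """anchors: d x n array.  E_ss[i] / E_sa[i]: lists of sensor / anchor
    indices linked to sensor i.  dist[(i, j)] and dist[(i, k)]: measured
    distances (sensor-sensor keys must exist in both orientations).
    U0: d x m initial point; we set V0 = U0 as in Theorem 4."""
    d, m = U0.shape
    U, V = U0.copy(), U0.copy()
    for _ in range(max_iter):
        U_old, V_old = U.copy(), V.copy()
        for W, P in ((U, V), (V, U)):   # u_1, ..., u_m, then v_1, ..., v_m
            for i in range(m):
                A = gamma * np.eye(d)   # A_{u_i} (resp. A_{v_i}) of Theorem 2
                b = gamma * P[:, i]     # b_{u_i} (resp. b_{v_i})
                for j in E_ss[i]:
                    g = P[:, i] - P[:, j]
                    A += np.outer(g, g)
                    b += (W[:, j] @ P[:, i] - W[:, j] @ P[:, j]
                          + dist[(i, j)]**2) * g
                for k in E_sa[i]:
                    g = P[:, i] - anchors[:, k]
                    A += np.outer(g, g)
                    b += (anchors[:, k] @ P[:, i]
                          - anchors[:, k] @ anchors[:, k]
                          + dist[(i, k)]**2) * g
                W[:, i] = np.linalg.solve(A, b)   # a d x d solve, d <= 3
        if max(np.abs(U - U_old).max(), np.abs(V - V_old).max()) < eps:
            break
    return (U + V) / 2    # our choice of final estimate (U = V in the limit)
```

Each inner update solves only a \(d\times d\) linear system; this is the source of advantage (ii) below.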

The proposed method has the following advantages over other methods:

  1. (i)

    The rank-constrained SDP problem (3) is equivalent to problem (4), from which we obtained problem (5) by incorporating \(U-V={\varvec{O}}\) into the objective function as a quadratic penalty term with a penalty parameter \(\gamma \). As we will see from Theorem 4, if \(\gamma \) is larger than a real-valued threshold, then the \(U^{(p)}\) and \(V^{(p)}\) generated by Algorithm 1 coincide with each other at any accumulation point, thereby satisfying the constraint of problem (4). Therefore, the proposed method inherits the rank constraint in problem (3) and retains the potential capability to estimate sensor positions accurately for problems that are not uniquely localizable. This advantage will be verified in Sect. 3.1.

  2. (ii)

    As we will see from Theorem 2, each subproblem appearing inside a for statement in Algorithm 1 is an unconstrained convex quadratic optimization problem. The solution of each subproblem can be obtained analytically, because the solution process can be reduced to solving a system of linear equations with an invertible coefficient matrix of size d. Because d is at most three in real situations, the system can be solved rapidly and without running out of memory, regardless of the number of sensors m. Moreover, the subproblems only need to be solved 2m times (i.e., a number proportional to m) for each outer loop. Therefore, especially in the case of large-scale SNL problems, we expect faster estimates of the sensor positions as compared with other methods. This advantage will be verified in Sect. 3.2.

2.2 Analyses of the proposed method

In this subsection, we present theoretical analyses of problem (5) and Algorithm 1. First, we impose an assumption about the problem that we are examining.

Assumption 1

All sensors are connected to an anchor either directly or indirectly.

The same assumption was also made in [8, 18, 22] and is very natural when estimating sensor positions: if a sensor is not connected to any anchors, then its absolute position cannot be determined uniquely.

First, we prove that the optimal solution of each subproblem in Algorithm 1 can be obtained uniquely as an analytical solution.

Theorem 2

Fix \(U'=({\varvec{u}}'_1,\dots ,{\varvec{u}}'_m)\) and \(V'=({\varvec{v}}'_1,\dots ,{\varvec{v}}'_m)\) arbitrarily. Then, for each \(i=1,\dots ,m\), the solutions \({\varvec{u}}_i^*,\ {\varvec{v}}_i^*\) of the following two optimization problems

$$\begin{aligned} \min _{{\varvec{u}}_i\in \mathbb {R}^d}F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma ),\\ \min _{{\varvec{v}}_i\in \mathbb {R}^d}F(U',{\varvec{v}}'_1,\dots ,{\varvec{v}}'_{i-1},{\varvec{v}}_i,{\varvec{v}}'_{i+1},\dots ,{\varvec{v}}'_m;\gamma ) \end{aligned}$$

are respectively \({\varvec{u}}_i^* = A_{{\varvec{u}}_i}^{-1}{\varvec{b}}_{{\varvec{u}}_i}\), \({\varvec{v}}_i^* = A_{{\varvec{v}}_i}^{-1}{\varvec{b}}_{{\varvec{v}}_i}\), where

$$\begin{aligned} A_{{\varvec{u}}_i}&:=\gamma I_d + \sum _{j\in E_{\mathrm {ss}}[i]}({\varvec{v}}'_i-{\varvec{v}}'_j)({\varvec{v}}'_i-{\varvec{v}}'_j)^\top + \sum _{k\in E_{\mathrm {sa}}[i]}({\varvec{v}}'_i-{\varvec{a}}_k)({\varvec{v}}'_i-{\varvec{a}}_k)^\top ,\\ {\varvec{b}}_{{\varvec{u}}_i}&:=\gamma {\varvec{v}}'_i + \sum _{j\in E_{\mathrm {ss}}[i]}(({\varvec{u}}'_j)^\top {\varvec{v}}'_i - ({\varvec{u}}'_j)^\top {\varvec{v}}'_j+d_{ij}^2)({\varvec{v}}'_i-{\varvec{v}}'_j) \\&+ \sum _{k\in E_{\mathrm {sa}}[i]}({\varvec{a}}_k^\top {\varvec{v}}'_i-{\varvec{a}}_k^\top {\varvec{a}}_k+d_{ik}^2)({\varvec{v}}'_i-{\varvec{a}}_k),\\ A_{{\varvec{v}}_i}&:=\gamma I_d + \sum _{j\in E_{\mathrm {ss}}[i]}({\varvec{u}}'_i-{\varvec{u}}'_j)({\varvec{u}}'_i-{\varvec{u}}'_j)^\top + \sum _{k\in E_{\mathrm {sa}}[i]}({\varvec{u}}'_i-{\varvec{a}}_k)({\varvec{u}}'_i-{\varvec{a}}_k)^\top ,\\ {\varvec{b}}_{{\varvec{v}}_i}&:=\gamma {\varvec{u}}'_i + \sum _{j\in E_{\mathrm {ss}}[i]}( ({\varvec{v}}'_j)^\top {\varvec{u}}'_i - ({\varvec{v}}'_j)^\top {\varvec{u}}'_j+d_{ij}^2)({\varvec{u}}'_i-{\varvec{u}}'_j) \\&+ \sum _{k\in E_{\mathrm {sa}}[i]}({\varvec{a}}_k^\top {\varvec{u}}'_i-{\varvec{a}}_k^\top {\varvec{a}}_k+d_{ik}^2)({\varvec{u}}'_i-{\varvec{a}}_k).\\ \end{aligned}$$

Proof

If we regard \(F(U, V;\gamma )\) as a function of \({\varvec{u}}_i\) alone, then we can represent \(F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma )\) as

$$\begin{aligned}&F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma ) \nonumber \\&\quad = \frac{1}{2}{\varvec{u}}_i^\top A_{{\varvec{u}}_i}{\varvec{u}}_i -{\varvec{b}}_{{\varvec{u}}_i}^\top {\varvec{u}}_i + [\text {a constant unrelated to } {\varvec{u}}_i ]. \end{aligned}$$
(6)

From Eq. (6), we can see that \(F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma )\) is \(\gamma \)-strongly convex. Thus, the optimal solution of

$$\begin{aligned} \min _{{\varvec{u}}_i\in \mathbb {R}^d}F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma ) \end{aligned}$$

is the stationary point of \(F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma )\). Because

$$\begin{aligned} \nabla _{{\varvec{u}}_i}F({\varvec{u}}'_1,\dots ,{\varvec{u}}'_{i-1},{\varvec{u}}_i,{\varvec{u}}'_{i+1},\dots ,{\varvec{u}}'_m,V';\gamma ) = A_{{\varvec{u}}_i}{\varvec{u}}_i - {\varvec{b}}_{{\varvec{u}}_i} \end{aligned}$$

and \(A_{{\varvec{u}}_i}\) is positive definite (and hence invertible), we can obtain \({\varvec{u}}_i^*\). The same calculation yields \({\varvec{v}}_i^*\). \(\square \)
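As a numerical sanity check of Theorem 2 (ours, not from the paper), the following sketch builds \(A_{{\varvec{u}}_i}\) and \({\varvec{b}}_{{\varvec{u}}_i}\) for one sensor with one sensor neighbor and two anchor neighbors, and verifies that random perturbations of \({\varvec{u}}_i^* = A_{{\varvec{u}}_i}^{-1}{\varvec{b}}_{{\varvec{u}}_i}\) never decrease the block objective, as \(\gamma \)-strong convexity dictates.

```python
# Sanity check of Theorem 2 (ours): perturbing u_i* = A^{-1} b must not
# decrease the u_i-block objective of F.
import numpy as np

rng = np.random.default_rng(1)
d, gamma = 2, 1.0
anc = rng.random((d, 2))                  # the two anchors seen by sensor i
vi, vj, uj = rng.random(d), rng.random(d), rng.random(d)
d_ij, d_ik = rng.random(), rng.random(2)  # measured distances

def F_ui(ui):                             # the u_i-dependent part of F
    val = 0.5 * gamma * np.sum((ui - vi)**2)
    val += 0.5 * ((ui - uj) @ (vi - vj) - d_ij**2)**2
    for k in range(2):
        val += 0.5 * ((ui - anc[:, k]) @ (vi - anc[:, k]) - d_ik[k]**2)**2
    return val

A = gamma * np.eye(d) + np.outer(vi - vj, vi - vj)
b = gamma * vi + (uj @ vi - uj @ vj + d_ij**2) * (vi - vj)
for k in range(2):
    g = vi - anc[:, k]
    A += np.outer(g, g)
    b += (anc[:, k] @ vi - anc[:, k] @ anc[:, k] + d_ik[k]**2) * g
u_star = np.linalg.solve(A, b)

assert all(F_ui(u_star + 1e-3 * rng.standard_normal(d)) >= F_ui(u_star)
           for _ in range(100))
```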

Next, we show that the sequence generated by Algorithm 1 approaches the set of stationary points of the objective function F.

Theorem 3

Fix the penalty parameter \(\gamma \) in problem (5) arbitrarily. Let \(\mathcal {N}\) be the set of stationary points of F. Then, the sequence \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \) generated by Algorithm 1 satisfies

$$\begin{aligned} \lim _{p\rightarrow \infty }\inf _{(U,V)\in \mathcal {N}}\Vert (U^{(p)},V^{(p)})-(U,V)\Vert _F = 0. \end{aligned}$$
(7)

In particular, any accumulation point \((U^*,V^*)\) of the generated sequence is a stationary point of F.

The consequence of Theorem 3 is based on the result of [25]. By Corollary 2.4 in [25], if all three of the following conditions are satisfied, then Eq. (7) holds when the stationary point of F in the definition of \(\mathcal {N}\) is replaced by the Nash equilibrium of F (see Definition 2 below).

  1. Condition (a):

    F is continuous, bounded below, and has a Nash equilibrium.

  2. Condition (b):

    The objective function of each subproblem is strongly convex.

  3. Condition (c):

    The sequence generated by Algorithm 1 is bounded.

In the following, we prove Theorem 3 by showing the equivalence between the Nash equilibrium and the stationary point of F and then verifying that all three conditions are satisfied.

We can see from Eq. (6) that the function \(F(U, V;\gamma )\) is convex on \(\mathbb {R}^d\) as a function of each single column of U and V, with the other columns fixed. A function with this property is called multiconvex [25]. For simplicity, let \(\mathcal {X} :=\mathbb {R}^{n_1}\times \cdots \times \mathbb {R}^{n_s}\) (where \(n_1,\dots ,n_s \in \mathbb {N}\)), and when \({\varvec{x}} \in \mathcal {X}\) is represented as \({\varvec{x}} = ({\varvec{x}}_1,\dots ,{\varvec{x}}_s)\), then \({\varvec{x}}_i \in \mathbb {R}^{n_i}\ (i=1,\dots ,s)\).

Definition 1

A function \(g: \mathcal {X}\rightarrow \mathbb {R}\) is called multiconvex on \(\mathcal {X}\) (with respect to the block division \({\varvec{x}} = ({\varvec{x}}_1,\dots ,{\varvec{x}}_s)\in \mathcal {X}\)) if for all \(i=1,\dots ,s\) and all \({\varvec{x}}_j\in \mathbb {R}^{n_j}\ (j=1,\dots ,i-1,i+1,\dots ,s)\), the function

$$\begin{aligned} g({\varvec{x}}_1,\dots ,{\varvec{x}}_{i-1},\cdot ,{\varvec{x}}_{i+1},\dots ,{\varvec{x}}_{s}):\ \mathbb {R}^{n_i}\rightarrow \mathbb {R} \end{aligned}$$

is convex.

One of the concepts of minimality for a multiconvex function is the Nash equilibrium [25], which appears in Condition (a).

Definition 2

For a function \(g:\mathcal {X}\rightarrow \mathbb {R}\), \(({\varvec{x}}_1^*,\dots ,{\varvec{x}}_s^*)\in \mathcal {X}\) is called a Nash equilibrium of g (with respect to the block division as in Definition 1) if

$$\begin{aligned} g({\varvec{x}}_1^*,\dots ,{\varvec{x}}_{i-1}^*,{\varvec{x}}_i^*,{\varvec{x}}_{i+1}^*,\dots ,{\varvec{x}}_s^*) \le g({\varvec{x}}_1^*,\dots ,{\varvec{x}}_{i-1}^*,{\varvec{x}}_i,{\varvec{x}}_{i+1}^*,\dots ,{\varvec{x}}_s^*) \end{aligned}$$

holds for all \(i=1,\dots ,s\) and all \({\varvec{x}}_i\in \mathbb {R}^{n_i}\).

Gorski et al. [12] proved the equivalence between the stationary point and the Nash equilibrium in the case of \(s = 2\). Herein, we extend the equivalence to the case of arbitrary s.

Lemma 1

Let \(g:\mathcal {X}\rightarrow \mathbb {R}\) be once differentiable and multiconvex. Then, \({\varvec{x}}^* =({\varvec{x}}_1^*,\dots ,{\varvec{x}}_s^*)\in \mathcal {X}\) is a stationary point of g if and only if \({\varvec{x}}^*\) is a Nash equilibrium of g.

Proof

We begin by proving the “only if” part. If we assume that \({\varvec{x}}^*\) is a stationary point of g, then because

$$\begin{aligned} g_i({\varvec{x}}_i) :=g({\varvec{x}}_1^*,\dots ,{\varvec{x}}_{i-1}^*,{\varvec{x}}_i,{\varvec{x}}_{i+1}^*,\dots ,{\varvec{x}}_s^*) \end{aligned}$$

is convex on \(\mathbb {R}^{n_i}\) for all \(i=1,\dots ,s\),

$$\begin{aligned} g_i({\varvec{x}}_i) \ge g_i({\varvec{x}}_i^*) + \nabla _{{\varvec{x}}_i}g_i({\varvec{x}}_i^*)^\top ({\varvec{x}}_i-{\varvec{x}}_i^*) \end{aligned}$$
(8)

holds for all \({\varvec{x}}_i \in \mathbb {R}^{n_i}\). Because \(\nabla _{{\varvec{x}}_i}g_i({\varvec{x}}_i^*) = {\varvec{0}}\) follows from the assumption of \({\varvec{x}}^*\) being a stationary point of g, we can say from inequality (8) that \(g_i({\varvec{x}}_i) \ge g_i({\varvec{x}}_i^*)\) for all \({\varvec{x}}_i\in \mathbb {R}^{n_i}\). Because i is arbitrary, we can conclude that \({\varvec{x}}^*\) is a Nash equilibrium of g.

Next, we prove the “if” part. If we assume that \({\varvec{x}}^*\) is a Nash equilibrium of g, then for each \(i = 1,\dots ,s\), \(g_i({\varvec{x}}_i)\) attains its minimum value at \({\varvec{x}}_i = {\varvec{x}}_i^*\), from which we obtain \(\nabla _{{\varvec{x}}_i}g_i({\varvec{x}}_i^*)={\varvec{0}}\). Thus, \({\varvec{x}}^*\) is a stationary point of g. \(\square \)

Lemma 2

Suppose that Assumption 1 holds. Then, for all \(\alpha \), the level set \(S_F(\alpha ) :=\{(U,V)\mid F(U,V;\gamma ) \le \alpha \}\) is bounded and closed.

Because similar (but not identical) results were already pointed out in [10, 18], we omit the proof of Lemma 2 because of the page limit. Note that if Assumption 1 does not hold, then \(S_F(\alpha )\) is unbounded whenever it is nonempty.

Corollary 1

For any initial point \((U^{(0)},V^{(0)})\), the sequence \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \) generated by Algorithm 1 is bounded.

Proof

Let \(\alpha :=F(U^{(0)},V^{(0)};\gamma )\). Because each minimization subproblem in Algorithm 1 is solved exactly, the objective value never increases, so \((U^{(p)},V^{(p)})\in S_F(\alpha )\) for all \(p\in \mathbb {N}\), i.e., \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \subseteq S_F(\alpha )\), which is bounded from Lemma 2. \(\square \)

We can now prove Theorem 3 on the basis of the above claims.

Proof

The continuity and boundedness below of F are evident from its definition. Combined with Lemma 2, these properties guarantee that a global optimal solution of problem (5) exists, and such a solution is in particular a Nash equilibrium. Therefore, Condition (a) holds. In addition, from Eq. (6), the objective function of each subproblem is \(\gamma \)-strongly convex, and thus Condition (b) holds. Finally, Condition (c) is also satisfied by Corollary 1. Therefore, the conditions of Corollary 2.4 in [25] are all satisfied, from which we can show Eq. (7) by using the equivalence between the stationary point and the Nash equilibrium shown in Lemma 1. Because F is of class \(C^1\), \(\mathcal {N}\) is closed, from which the last part of Theorem 3 follows easily. \(\square \)

Finally, we show that the \(U^{(p)}\) and \(V^{(p)}\) generated by Algorithm 1 coincide with each other at any accumulation point if \(\gamma \) is larger than a real-valued threshold; for the quadratic penalty method with a fixed penalty parameter, exact feasibility of this kind does not hold in general.

Theorem 4

For any initial point \((U^{(0)},V^{(0)})\) such that \(U^{(0)}=V^{(0)}\), if

$$\begin{aligned} \gamma > \frac{1}{2}\sqrt{2f(U^{(0)},V^{(0)})}\max _{1\le i\le m}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}, \end{aligned}$$
(9)

then any accumulation point \((U^*,V^*)\) of the sequence \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \) generated by Algorithm 1 satisfies \(U^*=V^*\).

Proof

For each \(p\in \mathbb {N},\ ij\in E_\mathrm{ss}\), and \(ik\in E_\mathrm{sa}\), let

$$\begin{aligned} \alpha _{ij}^{(p)}&:=A_{ij}\cdot \begin{pmatrix} I_d\\ (U^{(p)})^\top \end{pmatrix}\begin{pmatrix} I_d\\ (V^{(p)})^\top \end{pmatrix}^\top -d_{ij}^2 ,\\ \alpha _{ik}^{(p)}&:=A_{ik}\cdot \begin{pmatrix} I_d\\ (U^{(p)})^\top \end{pmatrix}\begin{pmatrix} I_d\\ (V^{(p)})^\top \end{pmatrix}^\top -d_{ik}^2 . \end{aligned}$$

Then, because the initial point satisfies \(U^{(0)}=V^{(0)}\) and the value of the objective function F decreases monotonically by Algorithm 1, we can conclude that for all \(p\in \mathbb {N}\),

$$\begin{aligned} f(U^{(0)},V^{(0)})&= F(U^{(0)},V^{(0)};\gamma )\ge F(U^{(p)},V^{(p)};\gamma ) \nonumber \\&\ge \frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}(\alpha _{ij}^{(p)})^2 + \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}(\alpha _{ik}^{(p)})^2. \end{aligned}$$
(10)

By taking a subsequence, without loss of generality, we can assume that \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \) itself converges to \((U^*,V^*)\). Then, it follows from inequality (10) that

$$\begin{aligned} \frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}(\alpha _{ij}^*)^2 + \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}(\alpha _{ik}^*)^2 \le f(U^{(0)},V^{(0)}). \end{aligned}$$
(11)

By Theorem 3, \((U^*,V^*)\) is a stationary point of F, from which we have

$$\begin{aligned} \nabla _{(U,V)}F(U^*,V^*;\gamma ) = {\varvec{O}}. \end{aligned}$$

Now, for each \(l=1,\dots ,d\), let \(U_l,\ V_l\in \mathbb {R}^m\) be the lth-column vectors of \(U^\top \) and \(V^\top \), respectively; then, \(\nabla _{U_l}F(U^*,V^*;\gamma ) = \nabla _{V_l}F(U^*,V^*;\gamma ) = {\varvec{0}}\) holds. For each \(ik\in E_{\mathrm {sa}}\) and \(l=1,\dots ,d\), let \({\varvec{b}}_{ik}^l\) be the m-dimensional vector whose ith component is \(-a_{kl}\) and whose other components are all zero. Furthermore, for each \(ij\in E_{\mathrm {ss}}\) and \(ik\in E_{\mathrm {sa}}\), let \(\bar{A}_{ij}\) and \(\bar{A}_{ik}\) respectively be

$$\begin{aligned} \bar{A}_{ij} :=(A_{ij})_{(d+1:d+m,d+1:d+m)},\ \bar{A}_{ik}&:=(A_{ik})_{(d+1:d+m,d+1:d+m)}. \end{aligned}$$

Using these symbols, we can represent \(F(U,V;\gamma )\) as

$$\begin{aligned} F(U,V;\gamma )&= \frac{\gamma }{2}\sum _{l=1}^d\Vert U_l-V_l\Vert _2^2 + \frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}\left( \sum _{l=1}^dU_l^\top \bar{A}_{ij}V_l-d_{ij}^2\right) ^2\\&+ \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}\left( \sum _{l=1}^dU_l^\top \bar{A}_{ik}V_l + \sum _{l=1}^d({\varvec{b}}_{ik}^l)^\top (U_l+V_l) + {\varvec{a}}_k^\top {\varvec{a}}_k -d_{ik}^2\right) ^2. \end{aligned}$$

Because

$$\begin{aligned} \nabla _{U_l}F(U^*,V^*;\gamma )&= \gamma (U_l^*-V_l^*) + \sum _{ij\in E_{\mathrm {ss}}}\alpha _{ij}^*\bar{A}_{ij}V_l^* + \sum _{ik\in E_{\mathrm {sa}}}\alpha _{ik}^*(\bar{A}_{ik}V_l^*+{\varvec{b}}_{ik}^l) \\&= {\varvec{0}},\\ \nabla _{V_l}F(U^*,V^*;\gamma )&= \gamma (V_l^*-U_l^*) + \sum _{ij\in E_{\mathrm {ss}}}\alpha _{ij}^*\bar{A}_{ij}U_l^* + \sum _{ik\in E_{\mathrm {sa}}}\alpha _{ik}^*(\bar{A}_{ik}U_l^*+{\varvec{b}}_{ik}^l) \\&= {\varvec{0}} \end{aligned}$$

for all \(l=1,\dots ,d\), we obtain

$$\begin{aligned}&(U_l^*-V_l^*)^\top \nabla _{U_l}F(U^*,V^*;\gamma ) + (V_l^*-U_l^*)^\top \nabla _{V_l}F(U^*,V^*;\gamma ) \nonumber \\&=(U_l^*-V_l^*)^\top \left\{ 2\gamma I_m-\left( \sum _{ij\in E_{\mathrm {ss}}}\alpha _{ij}^*\bar{A}_{ij}+\sum _{ik\in E_{\mathrm {sa}}}\alpha _{ik}^*\bar{A}_{ik}\right) \right\} (U_l^*-V_l^*) = 0. \end{aligned}$$
(12)

For convenience, let

$$\begin{aligned} \bar{A} :=\sum _{ij\in E_{\mathrm {ss}}}\alpha _{ij}^*\bar{A}_{ij}+\sum _{ik\in E_{\mathrm {sa}}}\alpha _{ik}^*\bar{A}_{ik}. \end{aligned}$$

Next, we seek to prove the following inequality:

$$\begin{aligned} \lambda _{\mathrm {max}}(\bar{A}) \le \sqrt{2f(U^{(0)},V^{(0)})}\max _{1\le i\le m}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}. \end{aligned}$$
(13)

In fact, if inequality (13) can be shown, then because \(\gamma \) satisfies inequality (9), \(2\gamma I_m-\bar{A}\) is a positive definite matrix, and thus, \(U_l^*=V_l^*\) from equality (12). Because of the arbitrariness of l, we can eventually conclude that \(U^* = V^*\). Therefore, we need only prove inequality (13).

It follows from the Gershgorin circle theorem that

$$\begin{aligned} \lambda _{\mathrm {max}}(\bar{A}) \le \max _{1\le i\le m}\left\{ \sum _{j\in E_{\mathrm {ss}}[i]}\alpha _{ij}^* + \sum _{k\in E_{\mathrm {sa}}[i]}\alpha _{ik}^* + \sum _{j\in E_{\mathrm {ss}}[i]}|\alpha _{ij}^*|\right\} . \end{aligned}$$
(14)

For each \(i=1,\dots ,m\), let v(i) be the optimal value of the following optimization problem:

$$\begin{aligned} \vline \quad \begin{aligned}&\text {max}&\sum _{j\in E_{\mathrm {ss}}[i]}\alpha _{ij} + \sum _{k\in E_{\mathrm {sa}}[i]}\alpha _{ik} + \sum _{j\in E_{\mathrm {ss}}[i]}|\alpha _{ij}|\\&\text {s.t.}&\frac{1}{2}\sum _{ij\in E_{\mathrm {ss}}}\alpha _{ij}^2 + \frac{1}{2}\sum _{ik\in E_{\mathrm {sa}}}\alpha _{ik}^2 \le f(U^{(0)},V^{(0)}), \end{aligned} \end{aligned}$$

By inequality (11), the right side of inequality (14) does not exceed \(\max _{1\le i\le m}v(i)\). We can easily check that v(i) is equal to the optimal value of the following optimization problem for each \(i=1,\dots ,m\):

$$\begin{aligned} \vline \quad \begin{aligned}&\text {max}&2\sum _{j\in E_{\mathrm {ss}}[i]}\alpha _{ij} + \sum _{k\in E_{\mathrm {sa}}[i]}\alpha _{ik}\\&\text {s.t.}&\frac{1}{2}\sum _{j\in E_{\mathrm {ss}}[i]}\alpha _{ij}^2 + \frac{1}{2}\sum _{k\in E_{\mathrm {sa}}[i]}\alpha _{ik}^2=f(U^{(0)},V^{(0)}). \end{aligned} \end{aligned}$$
(15)

Using the method of Lagrange multipliers (equivalently, the Cauchy–Schwarz inequality: a linear function \({\varvec{c}}^\top {\varvec{\alpha }}\) over the sphere \(\Vert {\varvec{\alpha }}\Vert _2 = \sqrt{2f(U^{(0)},V^{(0)})}\) attains the maximum \(\Vert {\varvec{c}}\Vert _2\sqrt{2f(U^{(0)},V^{(0)})}\), and here \(\Vert {\varvec{c}}\Vert _2 = \sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}\)), we can see that the optimal value of problem (15) is \(\sqrt{2f(U^{(0)},V^{(0)})}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}\). Therefore,

$$\begin{aligned}&\max _{1\le i\le m}\left\{ \sum _{j\in E_{\mathrm {ss}}[i]}\alpha _{ij}^* + \sum _{k\in E_{\mathrm {sa}}[i]}\alpha _{ik}^* + \sum _{j\in E_{\mathrm {ss}}[i]}|\alpha _{ij}^*|\right\} \\&\quad \le \max _{1\le i\le m}v(i) = \sqrt{2f(U^{(0)},V^{(0)})}\max _{1\le i\le m}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}, \end{aligned}$$

which implies inequality (13). \(\square \)
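The threshold on the right side of inequality (9) is cheap to evaluate in practice; a small helper sketch (ours; the function name is hypothetical) follows, and the same quantity reappears in the experiments of Sect. 3.

```python
# The right-hand side of inequality (9) as a small helper (ours; the
# function name is hypothetical).  deg_ss[i] = |E_ss[i]|, deg_sa[i] = |E_sa[i]|.
import numpy as np

def gamma_threshold(f0, deg_ss, deg_sa):
    """f0 = f(U0, V0) at an initial point with U0 = V0."""
    factor = np.sqrt(4 * np.asarray(deg_ss) + np.asarray(deg_sa)).max()
    return 0.5 * np.sqrt(2.0 * f0) * factor
```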

2.3 Relationship to the augmented Lagrangian

In this paper, we adopt the quadratic-penalty-based method, in which the equality constraint \(U - V = {\varvec{O}}\) is incorporated in the objective function as a quadratic penalty term and the resulting new objective function is minimized. On the other hand, there are also methods such as the augmented Lagrangian method [17] and the alternating direction method of multipliers [6] that minimize the augmented Lagrangian, which contains not only the quadratic penalty term but also the Lagrange multiplier term. It is known that the augmented Lagrangian method is more efficient than the quadratic penalty method. For example, while the quadratic penalty method requires the penalty parameter to diverge to positive infinity, the augmented Lagrangian method does not require it to diverge, and the sequence obtained by the augmented Lagrangian method converges faster than that obtained by the quadratic penalty method [17, Example 17.4]. Hence, we explain that problem (5) can be regarded as a minimization problem of the augmented Lagrangian with an exact Lagrangian multiplier.

We claim that \(\varLambda = {\varvec{O}}\) is the exact Lagrange multiplier of problem (4). In fact, because problem (4) satisfies the linear independence constraint qualification, for every local optimal solution \((U^*,V^*)\) of problem (4) there exists a Lagrange multiplier \(\varLambda ^*\in \mathbb {R}^{d\times m}\) satisfying the Karush–Kuhn–Tucker conditions. In other words, if we let the Lagrangian for problem (4) be

$$\begin{aligned} \mathcal {L}(U,V,\varLambda ) :=f(U,V)-\varLambda \cdot (U-V), \end{aligned}$$

then

$$\begin{aligned}&\nabla _{U}\mathcal {L}(U^*,V^*,\varLambda ^*) = \nabla _{U}f(U^*,V^*) - \varLambda ^* = {\varvec{O}}, \end{aligned}$$
(16a)
$$\begin{aligned}&\nabla _{V}\mathcal {L}(U^*,V^*,\varLambda ^*) = \nabla _{V}f(U^*,V^*) + \varLambda ^* = {\varvec{O}}, \end{aligned}$$
(16b)
$$\begin{aligned}&\nabla _{\varLambda }\mathcal {L}(U^*,V^*,\varLambda ^*) = -(U^* - V^*) = {\varvec{O}} \end{aligned}$$
(16c)

hold. We get \(U^* = V^*\) from Eq. (16c) and denote both of them as \(W^*\). Because \(f(U,V) = f(V,U)\ (\forall U,\ V\in \mathbb {R}^{d\times m})\), \(\nabla _{U}f(W^*,W^*) = \nabla _{V}f(W^*,W^*)\). Using this equation and Eqs. (16a) and (16b), we obtain \(\varLambda ^* = {\varvec{O}}\). Therefore, the augmented Lagrangian with the Lagrange multiplier \(\varLambda = {\varvec{O}}\) and penalty parameter \(\gamma \) is

$$\begin{aligned} f(U,V) - {\varvec{O}}\cdot (U-V) + \frac{\gamma }{2}\Vert U-V\Vert _F^2 = f(U,V) + \frac{\gamma }{2}\Vert U-V\Vert _F^2, \end{aligned}$$

which is the definition of \(F(U,V;\gamma )\) itself. Hence, problem (5) can be regarded as a minimization problem of the augmented Lagrangian with the exact Lagrange multiplier \(\varLambda = {\varvec{O}}\) for problem (4).

3 Numerical experiments

In this section, we use numerical simulation to verify the advantages (i) and (ii) described in Sect. 2.1 for the proposed method. We begin by confirming that our method does inherit the rank constraint; to confirm this, we compare it with an SDP relaxation-based method for a problem that is locatable but not uniquely localizable. Next, to confirm the effectiveness of the proposed method, we compare its estimation time and estimation accuracy with those of other methods by using artificial data under various conditions. All experiments were conducted on a computer with the macOS Catalina operating system, an Intel Core i5-8279U 2.40 GHz CPU, and 16 GB of memory. All the algorithms were implemented using MATLAB (R2020a). The parameter \(\epsilon \) in Algorithm 1 was set to \(10^{-5}\) throughout the experiments.

3.1 Comparison with SFSDP for a problem that is not uniquely localizable

In this subsection, we demonstrate that the proposed method has the capability to estimate sensor positions accurately for a problem that is locatable but not uniquely localizable. Specifically, we examine a problem from [18]:

$$\begin{aligned} \Vert {\varvec{x}}_1-{\varvec{x}}_2\Vert _2 = \sqrt{10}/5,\ \Vert {\varvec{x}}_1-{\varvec{a}}_4\Vert _2 = \sqrt{5}/2,\ \Vert {\varvec{x}}_1-{\varvec{a}}_5\Vert _2 = \sqrt{5}/2,\\ \Vert {\varvec{x}}_2-{\varvec{a}}_3\Vert _2 = \sqrt{85}/10,\ \Vert {\varvec{x}}_2-{\varvec{a}}_5\Vert _2 = \sqrt{65}/10, \end{aligned}$$

where \({\varvec{a}}_3 = (0, 1.4)^\top \), \({\varvec{a}}_4 = (-1, 0)^\top \), and \({\varvec{a}}_5 = (1, 0)^\top \), and the true positions of the two sensors are \({\varvec{x}}_1^\mathrm{true}=(0,0.5)^\top ,\ {\varvec{x}}_2^\mathrm{true}=(0.6,0.7)^\top \). For this problem, Algorithm 1 was executed after fixing the penalty parameter \(\gamma \) as \(\sqrt{2f(U^{(0)},V^{(0)})}\max _{1\le i\le m}\sqrt{4|E_\mathrm{ss}[i]| + |E_\mathrm{sa}[i]|}/2\) in accordance with Theorem 4. We examined the two cases of whether the initial points \({\varvec{u}}_i^{(0)}\ (= {\varvec{v}}_i^{(0)}) \in \mathbb {R}^2\ (i=1,2)\) in Algorithm 1 are in the interior or the exterior of the convex hull of the three anchors. Figure 1 shows the sensor positions estimated by SFSDP (the SDP relaxation-based method described in Sect. 1) and by the proposed method. Note that when we estimated the sensor positions with the proposed method, we repeated the estimation with 10 randomly drawn initial points. The results were similar to those of Fig. 1b in all cases in which the initial points were in the interior of the convex hull of the anchors. On the other hand, the results were similar to those of either Fig. 1c or d in all cases in which the initial points were in the exterior of the convex hull. Accordingly, only these three cases are included in Fig. 1.

Fig. 1 [figure omitted] Estimated sensor positions when the locatable but non-uniquely localizable problem was solved by (a) SFSDP and (b)–(d) the proposed method
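To make this instance easy to reproduce, the sketch below (ours) encodes its data. As a self-contained probe of the initial-point dependence, it minimizes \(F(U,V;\gamma )\) directly with SciPy's L-BFGS instead of re-listing the BCD updates of Sect. 2.1; \(\gamma = 10\) is an arbitrary illustrative value, and which starting points recover the true positions \((0,0.5)^\top \) and \((0.6,0.7)^\top \) can be checked against the behavior reported for Fig. 1.

```python
# Data of the small instance above (coordinates from the text).  We probe
# the initial-point dependence by minimizing F directly with L-BFGS, a
# deliberate substitute here for the BCD updates of Sect. 2.1.
import numpy as np
from scipy.optimize import minimize

anchors = {3: np.array([0.0, 1.4]), 4: np.array([-1.0, 0.0]),
           5: np.array([1.0, 0.0])}
dist = {(1, 2): np.sqrt(10) / 5, (1, 4): np.sqrt(5) / 2, (1, 5): np.sqrt(5) / 2,
        (2, 3): np.sqrt(85) / 10, (2, 5): np.sqrt(65) / 10}

def F(z, gamma=10.0):
    U, V = z[:4].reshape(2, 2), z[4:].reshape(2, 2)   # columns are u_i / v_i
    u, v = {1: U[:, 0], 2: U[:, 1]}, {1: V[:, 0], 2: V[:, 1]}
    val = 0.5 * gamma * np.sum((U - V)**2)
    val += 0.5 * ((u[1] - u[2]) @ (v[1] - v[2]) - dist[(1, 2)]**2)**2
    for i, k in ((1, 4), (1, 5), (2, 3), (2, 5)):
        val += 0.5 * ((u[i] - anchors[k]) @ (v[i] - anchors[k])
                      - dist[(i, k)]**2)**2
    return val

for x0 in (np.array([0.1, -0.2, 0.2, 0.3]),    # u_1, u_2 inside the hull
           np.array([2.0, -2.0, 2.0, 2.0])):   # u_1, u_2 outside the hull
    res = minimize(F, np.concatenate([x0, x0]), method='L-BFGS-B')
    print(res.x[:4].reshape(2, 2))   # compare with (0, 0.5) and (0.6, 0.7)
```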

When we used SFSDP, the sensor positions were not estimated correctly (Fig. 1a). For the proposed method, the estimation accuracy depended on the initial points. When the initial points were in the interior of the convex hull of the anchors, the sensor positions were estimated accurately (Fig. 1b). On the other hand, when the initial points were in the exterior of the convex hull, the sensor positions were estimated accurately in some cases (Fig. 1c) but not in others (Fig. 1d).

Of course, because problem (5) examined in this paper is a nonconvex optimization problem, whether the sensor positions can be estimated accurately depends on the initial points. However, as shown in Fig. 1b and c, the proposed method still has the capability to estimate sensor positions accurately even for problems that are not uniquely localizable, although the example here is quite simple. On the other hand, when we use SDP relaxation-based methods, if a given problem is not uniquely localizable, accurate estimation is not possible because of the max-rank property of the interior-point method, as described in Sect. 1. Therefore, we can say that the proposed method does inherit the rank constraint.

3.2 Comparison of estimation time and accuracy

In this subsection, we quantitatively compare the estimation time and estimation accuracy of the proposed method with those of existing methods for sensors located in two- or three-dimensional space. The compared methods are SFSDP, which was also used in Sect. 3.1, and NLP-FD, which takes the rank constraint into account as our proposed method does. Although the methods proposed by Wan et al. [20, 21], introduced in Sect. 1, also account for the rank constraint, we do not compare with them here because of their extremely poor scalability. In these experiments, \(m = 1000,\ 3000,\ 5000\), and 20,000 sensors and \(n = 0.1m\) anchors were placed randomly in \([0, 1]^d\). \(E_{\mathrm {ss}}\) and \(E_{\mathrm {sa}}\) were defined as

$$\begin{aligned} E_{\mathrm {ss}}&:=\{ij\mid 1\le i<j\le m,\ \Vert {\varvec{x}}_i^{\mathrm {true}}-{\varvec{x}}_j^{\mathrm {true}}\Vert _2< \rho \},\\ E_{\mathrm {sa}}&:=\{ik\mid 1\le i\le m,\ m+1\le k\le m+n,\ \Vert {\varvec{x}}_i^{\mathrm {true}}-{\varvec{a}}_k\Vert _2 < \rho \}, \end{aligned}$$

where the \({\varvec{x}}_i^{\mathrm {true}}\ (i=1,\dots ,m)\) are the sensors’ true positions. In other words, we considered a model in which the distance between two sensors or between a sensor and an anchor is observed if and only if it is less than a radio range \(\rho \ (>0)\). We set \(\rho \) to 0.1 and \(\sqrt{10/m}\) in the case of \(d=2\) and to 0.25 and \(\sqrt[3]{15/m}\) in the case of \(d=3\). The measured distances \(d_{ij}\ (ij\in E_{\mathrm {ss}})\) and \(d_{ik}\ (ik\in E_{\mathrm {sa}})\) were given by

$$\begin{aligned} d_{ij}&= \max \{(1+\sigma \epsilon _{ij}),0.1\}\Vert {\varvec{x}}_i^{\mathrm {true}}-{\varvec{x}}_{j}^{\mathrm {true}}\Vert _2,\\ d_{ik}&= \max \{(1+\sigma \epsilon _{ik}),0.1\}\Vert {\varvec{x}}_i^{\mathrm {true}}-{\varvec{a}}_{k}\Vert _2, \end{aligned}$$

where \(\epsilon _{ij},\ \epsilon _{ik}\) were selected independently from the standard normal distribution, and \(\sigma \) is a noise factor determining the influence of the error. \(\sigma \) was set to \(0,\ 0.1\), and 0.2. As an indicator to measure the estimation accuracy, we used the root-mean-square distance (RMSD), which has been used in many other papers on SNL [8, 13, 20, 22] and is defined as

$$\begin{aligned} \mathrm {RMSD} :=\sqrt{\frac{1}{m}\sum _{i=1}^m\Vert \hat{{\varvec{x}}}_i-{\varvec{x}}_i^{\mathrm {true}}\Vert _2^2}, \end{aligned}$$

where \(\hat{{\varvec{x}}}_i\) is the estimated position of sensor i. For each setting of \((m,n,\rho ,\sigma )\), five problem instances were generated with different random seeds, and the final results are the averages over the five runs of the estimation time (CPU time) and the estimation accuracy (RMSD).
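As a concrete rendering of this setup, the sketch below (ours) generates one \(d=2\) instance and defines the RMSD; the \(\max \{\cdot ,0.1\}\) clipping follows the measurement model above, and the \(O(m^2)\) pair enumeration is kept naive for clarity.

```python
# Sketch (ours) of the instance generation and RMSD computation described
# above, for d = 2.
import numpy as np

rng = np.random.default_rng(0)
d, m, sigma, rho = 2, 1000, 0.1, 0.1
n = m // 10
X_true = rng.random((d, m))           # true sensor positions in [0, 1]^2
anchors = rng.random((d, n))

E_ss = [(i, j) for i in range(m) for j in range(i + 1, m)
        if np.linalg.norm(X_true[:, i] - X_true[:, j]) < rho]
E_sa = [(i, k) for i in range(m) for k in range(n)
        if np.linalg.norm(X_true[:, i] - anchors[:, k]) < rho]

dist_ss = {(i, j): max(1 + sigma * rng.standard_normal(), 0.1)
           * np.linalg.norm(X_true[:, i] - X_true[:, j]) for i, j in E_ss}
dist_sa = {(i, k): max(1 + sigma * rng.standard_normal(), 0.1)
           * np.linalg.norm(X_true[:, i] - anchors[:, k]) for i, k in E_sa}

def rmsd(X_hat):                      # root-mean-square distance to the truth
    return np.sqrt(np.mean(np.sum((X_hat - X_true)**2, axis=0)))
```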

The initial point \((U^{(0)},V^{(0)})\in \mathbb {R}^{d\times m}\times \mathbb {R}^{d\times m}\) in Algorithm 1 was chosen in the same manner as in [8]. That is, for each sensor \(i\ (i=1,\dots ,m)\), if it was connected directly to an anchor, then \({\varvec{u}}_i^{(0)}\) and \({\varvec{v}}_i^{(0)}\) were set to the coordinates of the anchor nearest to sensor i; otherwise, \({\varvec{u}}_i^{(0)}\) and \({\varvec{v}}_i^{(0)}\) were set to

$$\begin{aligned} \frac{1}{2}\left( \begin{pmatrix} \max _{k}a_{k1}\\ \vdots \\ \max _{k}a_{kd} \end{pmatrix} + \begin{pmatrix} \min _{k}a_{k1}\\ \vdots \\ \min _{k}a_{kd} \end{pmatrix}\right) . \end{aligned}$$
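The following sketch (ours; the function name is hypothetical) implements this initialization rule, interpreting “nearest” via the measured distances, since the true distances are unavailable; dist_sa is the dictionary from the generation sketch above.

```python
# Sketch (ours) of the initialization rule above: sensors with a direct
# anchor link start at their nearest such anchor (by measured distance);
# the rest start at the centre of the anchors' bounding box.
import numpy as np

def initial_point(anchors, E_sa, dist_sa):
    """anchors: d x n; E_sa[i]: anchor indices directly linked to sensor i."""
    m = len(E_sa)
    centre = 0.5 * (anchors.max(axis=1) + anchors.min(axis=1))
    U0 = np.tile(centre[:, None], (1, m))    # default: bounding-box centre
    for i in range(m):
        if E_sa[i]:                          # a directly linked anchor exists
            k = min(E_sa[i], key=lambda kk: dist_sa[(i, kk)])
            U0[:, i] = anchors[:, k]
    return U0                                # and V0 = U0.copy()
```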

The penalty parameter \(\gamma \) was updated dynamically according to Theorem 4 by the following procedure.

Step 1.:

Let

$$\begin{aligned} \gamma ^{(0)} = 5 \times 10^{-3} \times \sqrt{2f(U^{(0)},V^{(0)})}\max _{1\le i\le m}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}/2. \end{aligned}$$

By using \(\gamma ^{(0)}\) as the penalty parameter, \((U^{(1)},V^{(1)})\) is calculated by the update rule in the while loop of Algorithm 1. Let \(\gamma ^{(1)} = \gamma ^{(0)}/2\). Then, by using \(\gamma ^{(1)}\) as the penalty parameter, \((U^{(2)},V^{(2)})\) is also calculated by this update rule. Let \(p=2\).

Step 2.:

If

$$\begin{aligned}&\frac{f(U^{(p-1)},V^{(p-1)})-f(U^{(p)},V^{(p)})}{f(U^{(p-1)},V^{(p-1)})}\\&\quad \ge \frac{f(U^{(p-2)},V^{(p-2)})-f(U^{(p-1)},V^{(p-1)})}{f(U^{(p-2)},V^{(p-2)})}, \end{aligned}$$

then \(\gamma ^{(p)} = (\gamma ^{(p-1)}/\gamma ^{(p-2)})\gamma ^{(p-1)}\); otherwise, \(\gamma ^{(p)} = \gamma ^{(p-2)}\). By using \(\gamma ^{(p)}\) as the penalty parameter, \((U^{(p+1)},V^{(p+1)})\) is calculated by the update rule in the while loop of Algorithm 1.

Step 3.:

If \(|f(U^{(p)},V^{(p)})-f(U^{(p+1)},V^{(p+1)})|/f(U^{(p)},V^{(p)}) < 10^{-2}\) or the overall stopping criterion (line 8 of Algorithm 1) is satisfied, then go to Step 4; otherwise, set \(p=p+1\) and go to Step 2.

Step 4.:

Let \(W:=(U^{(p)}+V^{(p)})/2\) and

$$\begin{aligned} \gamma = \sqrt{2f(W,W)}\max _{1\le i\le m}\sqrt{4|E_{\mathrm {ss}}[i]|+|E_{\mathrm {sa}}[i]|}/2. \end{aligned}$$

Restart Algorithm 1 with W as the initial point and \(\gamma \) as the penalty parameter.

The method of updating \(\gamma \) described above consists of two components. First, in Steps 1–3, \(\gamma \) is updated so that the value of f, which represents the squared error of the squared distances, decreases in preference to the penalty term \((\gamma /2)\Vert U-V\Vert _F^2\). However, if we keep reducing the value of f rather than the penalty term, then the penalty term does not decrease much, and U and V may end up taking very different values from each other. Therefore, in Step 4, we try to reduce the difference between U and V by fixing \(\gamma \) according to Theorem 4. A sketch of this schedule is given below.
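The sketch (ours) assumes two helpers that are not given in the paper: bcd_pass(U, V, gamma), one sweep of the while-loop body of Algorithm 1 (cf. the sketch in Sect. 2.1), and f(U, V), the squared-error term; it also assumes f stays positive along the run.

```python
# Sketch (ours) of Steps 1-4 of the dynamic gamma schedule.
import numpy as np

def run_with_schedule(U0, V0, deg_factor, bcd_pass, f, tol=1e-2, max_sweeps=200):
    """deg_factor = max_i sqrt(4 |E_ss[i]| + |E_sa[i]|) (cf. Theorem 4)."""
    # Step 1: two sweeps with a deliberately small penalty parameter
    g_prev = 5e-3 * np.sqrt(2 * f(U0, V0)) * deg_factor / 2   # gamma^(0)
    U1, V1 = bcd_pass(U0, V0, g_prev)
    g_cur = g_prev / 2                                        # gamma^(1)
    U, V = bcd_pass(U1, V1, g_cur)
    f_pp, f_p, f_c = f(U0, V0), f(U1, V1), f(U, V)
    # Steps 2-3: grow gamma while f keeps decreasing fast enough, else reset
    for _ in range(max_sweeps):
        if (f_p - f_c) / f_p >= (f_pp - f_p) / f_pp:
            g_prev, g_cur = g_cur, (g_cur / g_prev) * g_cur
        else:
            g_prev, g_cur = g_cur, g_prev
        U, V = bcd_pass(U, V, g_cur)
        f_pp, f_p, f_c = f_p, f_c, f(U, V)
        if abs(f_p - f_c) / f_p < tol:
            break
    # Step 4: fix gamma by Theorem 4 and restart Algorithm 1 from (W, W)
    W = (U + V) / 2
    gamma = np.sqrt(2 * f(W, W)) * deg_factor / 2
    return W, gamma
```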

Table 1 Results of numerical experiments for sensors and anchors placed randomly in \([0, 1]^2\)

First, the results for \(d = 2\) are given in Table 1, wherein the proposed method is referred to as “BCD.” We can see that when the measured distances included no errors (\(\sigma = 0\)), the estimation time of the proposed method was the lowest in most cases; furthermore, even when the proposed method was not the fastest, its estimation time was almost the same as that of the fastest method. In terms of the estimation accuracy, SFSDP estimated the sensor positions with the best accuracy of all the methods, by an order of magnitude. However, comparing the proposed method and NLP-FD shows that there was no appreciable difference between their estimation accuracies. When the measured distances included errors (\(\sigma = 0.1\) and 0.2), the proposed method estimated the sensor positions the most rapidly of all the methods, by an order of magnitude in all cases, and the estimation accuracy was about the same as those of the other two methods.

Table 2 Results of numerical experiments for sensors and anchors placed randomly in \([0, 1]^3\)

Next, the results for \(d = 3\) are given in Table 2. The results for the three-dimensional scenario were similar to those for the two-dimensional scenario; that is, when the measured distances did not include errors, the estimation time of the proposed method was the lowest in each case. In terms of the estimation accuracy, SFSDP estimated the sensor positions with the highest accuracy of all the methods, by an order of magnitude; however, comparing the proposed method and NLP-FD again shows that there was no appreciable difference between their estimation accuracies. When the measured distances included errors, the estimation time of the proposed method was the lowest in all cases except \((m, n, \rho , \sigma ) = (1000, 100, 0.1, 0.2)\), and even in that case, its estimation time was almost the same as that of the fastest method (SFSDP). The estimation accuracy of the proposed method was also comparable to those of the other two methods. In addition, for m = 20,000, SFSDP could not estimate the sensor positions because of insufficient memory, whereas the proposed method could estimate the positions without running out of memory.

Overall, these results for the two- and three-dimensional cases show that the proposed method has practical advantages over the other methods: it can estimate sensor positions faster than those methods can without sacrificing the estimation accuracy, especially when measurement errors are included, and it does not run out of memory even for large-scale SNL problems.

It is interesting to consider why the proposed method and NLP-FD, both of which account for the rank constraint, could not estimate the sensor positions as accurately as SFSDP when there were no measurement errors. The reason is probably that a formulation accounting for the rank constraint is a nonconvex optimization problem, so the generated sequence might converge to a stationary point that is not a global optimal solution. In contrast, if a problem is uniquely localizable, then SFSDP can estimate the exact sensor positions because convergence to the global optimum of the relaxation problem is guaranteed. On the other hand, when measurement errors are included, even if a global minimum of the objective function f is attained and the optimal value is zero, this does not mean that the true sensor positions have been recovered. In other words, even when the objective function is minimized exactly, a good estimate of the sensor positions is not necessarily obtained; thus, when measurement errors are included, methods that account for the rank constraint can achieve estimation accuracy comparable to that of SFSDP even if the generated sequence falls into a stationary point that is not a global optimal solution.

4 Conclusion

In this paper, we proposed a new method that transforms the formulation of problem (3), which appears in SNL, into an unconstrained multiconvex optimization problem (5), to which the block coordinate descent method is applied. We also presented theoretical analyses of the proposed method. First, we showed that each subproblem that appears in Algorithm 1 can be solved analytically. In addition, we showed that any accumulation point \((U^*,V^*)\) of the sequence \(\{(U^{(p)},V^{(p)})\}_{p=1}^\infty \) generated by the proposed algorithm is a stationary point of the objective function of problem (5), and we gave a range of \(\gamma \) such that \((U^*,V^*)\) satisfies \(U^*=V^*\). We also pointed out the relationship between the objective function of problem (5) and the augmented Lagrangian. Numerical experiments showed that our method does inherit the rank constraint and that it can estimate sensor positions faster than other methods without sacrificing the estimation accuracy, especially when the measured distances contain errors, and without running out of memory.

The present study suggests three directions for future work. First, Algorithm 1 uses a cycle rule in which the 2m subproblems are solved in the order of \({\varvec{u}}_1,\dots ,{\varvec{u}}_m,{\varvec{v}}_1,\dots ,{\varvec{v}}_m\). However, in the general coordinate descent method, there are other update rules such as a random rule and a greedy rule [15, 24]. In SNL, the strategy of updating from variables corresponding to sensors that are connected directly to anchors is also expected to improve the estimation accuracy and time. Therefore, there is still room to consider how the order of solving the 2m subproblems affects the estimation time and accuracy. Second, we performed the minimization sequentially with respect to each column of U and V for computational efficiency, but updating some columns of U and V together is also possible, and the manner of block division in applying the block coordinate descent method should be examined further. Finally, the proposed method could be extended to general quadratic SDP problems with a rank constraint.