
1 Introduction

The Wiener index of a weighted graph \(G=(V,E)\) is the sum, \(\sum _{u,v \in V} \delta _G(u,v)\), of the shortest path lengths in the graph between every pair of vertices, where \(\delta _G(u,v)\) is the weight of the shortest (minimum-weight) path between u and v in G. The Wiener index was introduced by the chemist Harry Wiener in 1947 [30]. The Wiener index and its several variations have found applications in chemistry, e.g., in predicting the antibacterial activity of drugs and modeling crystalline phenomena. It has also been used to give insight into various chemical and physical properties of molecules [28] and to correlate the structure of molecules with their biological activity [20]. The Wiener index has become part of the general scientific culture, and it is still the subject of intensive research [2, 10, 12, 32]. In its applications in chemistry, the Wiener index is most often studied in the context of unweighted graphs. The study of minimizing the sum of interpoint distances also arises naturally in the network design field, where the problem of computing a spanning tree of minimum Wiener index is known as the Optimum Communication Spanning Tree (OCST) problem [15, 18].

Given an undirected graph \(G=(V,E)\) and a (nonnegative) weight function on the edges of G, representing the delay on each edge, the routing cost c(T) of a spanning tree T of G is the sum of the weights (delays) of the paths in T between every pair of vertices: \(c(T)=\sum _{u,v \in V} \delta _T(u,v)\), where \(\delta _T(u,v)\) is the weight of the (unique) path between u and v in T. The OCST problem aims to find a minimum routing cost spanning tree of a given weighted undirected graph G, thereby seeking to minimize the expected cost of a path within the tree between two randomly chosen vertices. The OCST was originally introduced by Hu [18] and is known to be NP-complete in graphs, even if all edge weights are 1 [19]. Wu et al. [31] presented a polynomial time approximation scheme (PTAS) for the OCST problem. Specifically, they showed that the best k-star (a tree with at most k internal vertices) yields a \((\frac{k + 3}{k + 1})\)-approximation for the problem, resulting in a \((1+\varepsilon )\)-approximation algorithm of running time \(O\big (n^{2\lceil \frac{2}{\varepsilon }\rceil -2}\big )\).
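As a short worked step, the value of k behind the stated running time can be read off the approximation ratio. The last remark assumes that the best k-star is found in \(O(n^{2k})\) time, which the stated bound suggests but which is not spelled out above.

$$\begin{aligned} \frac{k+3}{k+1} = 1 + \frac{2}{k+1} \le 1+\varepsilon \quad \Longleftrightarrow \quad k \ge \frac{2}{\varepsilon } - 1 \,. \end{aligned}$$

Hence \(k = \lceil \frac{2}{\varepsilon }\rceil - 1\) suffices, and under the assumption above the exponent is \(2k = 2\lceil \frac{2}{\varepsilon }\rceil - 2\).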

While there is an abundance of research related to the Wiener index, e.g., computing and bounding the Wiener indexes of specific graphs or classes of graphs [16, 17, 24] and explicit formulas for the Wiener index for special classes of graphs [3, 23, 26, 29, 30], to the best of our knowledge, the Wiener index has not received much attention in geometric settings. In this work, we study the Wiener index and the optimum communication spanning tree problem in selected geometric settings, hoping to bring this important and highly applicable index to the attention of computational geometry researchers.

Our Contributions and Overview. Let P be a set of n points in the plane. We study the problem of computing a spanning tree on P that minimizes the Wiener index when the underlying graph is the complete graph over P, with edge weights given by their Euclidean lengths. In Sect. 2, we prove that the optimal tree (that minimizes the Wiener index) has no crossing edges. As our main algorithmic result, in Sect. 3, we give a polynomial-time algorithm to solve the problem when the points P are in convex position; this result strongly utilizes the structural result that the edges of an optimal tree do not cross, which enables us to devise a dynamic programming algorithm to optimize. Then, in Sect. 4, we prove that the “Euclidean Wiener Index Tree Problem”, in which we seek a spanning tree on P whose Wiener index is at most W, while having total (Euclidean) weight at most B, is (weakly) NP-hard. Finally, in Sect. 5, we discuss the problem of finding a minimum Wiener index path spanning P.

Related Work. A problem related to ours is the minimum latency problem, also known as the traveling repairman problem (TRP): Compute a path, starting at point s, that visits all points, while minimizing the sum of the distances (the “latencies”) along the path from s to every other point (versus between all pairs of points, as in the Wiener index). There is a PTAS for TRP (and the k-TRP, with k repairmen) in the Euclidean plane and in weighted planar graphs [27].

Wiener index optimization also arises in the context of computing a noncontracting embedding of one metric space into another (e.g., a line metric or a tree metric) in order to minimize the average distortion of the embedding (defined to be the sum of all pairs distances in the new space, divided by the sum of all pairs distances in the original space). It is NP-hard to minimize average distortion when embedding a tree metric into a line metric; there is a constant-factor approximation (based on the k-TRP) for minimizing the average distortion in embedding a metric onto a line (i.e., finding a spanning path of minimum Wiener index) [11], which, using [27], gives a \((2+\varepsilon )\)-approximation in the Euclidean plane.

A related problem that has recently been examined in a geometric setting is the computation of the Beer index of a polygon P, defined to be the probability that two randomly (uniformly) distributed points in P are visible to each other [1]; the same paper also studies the problem of computing the expected distance between two random points in a polygon, which is, like the Wiener index, based on computing the sum of distances (evaluated as an integral in the continuum) between all pairs of points.

Another area of research that is related to the Wiener index is that of spanners: Given a weighted graph G and a real number \(t > 1\), a t-spanner of G is a spanning sub-graph \(G^*\) of G, such that \(\delta _{G^*}(u,v) \le t \cdot \delta _G(u,v)\), for every two vertices u and v in G. Thus, the shortest path distances in \(G^*\) approximate the shortest path distances in the underlying graph G, and the parameter t represents the approximation ratio. The smallest t for which \(G^*\) is a t-spanner of G is known as the stretch factor. There is a vast literature on spanners, especially in geometry (see, e.g., [4,5,6,7,8, 13, 22, 25]). In a geometric graph, G, the stretch factor between two vertices, u and v, is the ratio between the length of the shortest path from u to v in G and the Euclidean distance between u and v. The average stretch factor of G is the average stretch factor taken over all pairs of vertices in G. For a given weighted connected graph \(G=(V,E)\) with positive edge weights and a positive value W, the average stretch factor spanning tree problem seeks a spanning tree T of G such that the average stretch factor (over \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) \) pairs of vertices) is bounded by W. For points in the Euclidean plane, one can construct in polynomial time a spanning tree with constant average stretch factor [9].

2 Preliminaries

Let P be a set of n points in the plane and let \(G=(P,E)\) be the complete graph over P. For each edge \((p,q) \in E\), let \(w(p,q)=|pq|\) denote the weight of \((p,q)\), given by the Euclidean distance, |pq|, between p and q. Let T be a spanning tree of P. For points \(p,q \in P\), let \(\delta _T(p,q)\) denote the weight of the (unique) path between p and q in T. Let \(W(T) = \sum _{p,q \in P} \delta _T(p,q)\) denote the Wiener index of T, given by the sum of the weights of the paths in T between every pair of points. Finally, for a point \(p \in P\), let \(\delta _p(T) = \sum _{q \in P}\delta _{T}(p,q)\) denote the total weight of the paths in T from p to every point of P.
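The definitions above can be illustrated with a minimal Python sketch. The function names (`tree_distances`, `delta_p`, `wiener_index`) and the representation of a tree as a list of index pairs are assumptions made for this example. The sum in \(W(T)\) is taken over unordered pairs, which is how the later formulas use it.

```python
import math
from collections import defaultdict

def tree_distances(points, edges, source):
    """Path weights delta_T(source, q) for every q, by traversing the tree."""
    adj = defaultdict(list)
    for u, v in edges:
        w = math.dist(points[u], points[v])  # Euclidean edge weight |pq|
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = {source: 0.0}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:  # in a tree, each vertex is reached exactly once
                dist[v] = dist[u] + w
                stack.append(v)
    return dist

def delta_p(points, edges, p):
    """delta_p(T): total weight of the tree paths from p to every point."""
    return sum(tree_distances(points, edges, p).values())

def wiener_index(points, edges):
    """W(T): each unordered pair is seen twice in the ordered sum, hence / 2."""
    return sum(delta_p(points, edges, p) for p in range(len(points))) / 2

# Three collinear points joined by a path: distances 1, 1 and 2.
points = [(0, 0), (1, 0), (2, 0)]
path = [(0, 1), (1, 2)]
print(wiener_index(points, path))  # 4.0
```

This quadratic-time version is meant only to mirror the definitions; it is not tuned for large inputs.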

Theorem 1

Let T be a spanning tree of P that minimizes the Wiener index. Then, T is planar.

Proof

Assume towards a contradiction that there are two edges \((a,c)\) and \((b,d)\) in T that cross each other. Let F be the forest obtained by removing the edges \((a,c)\) and \((b,d)\) from T. Thus, F contains three sub-trees. Assume, w.l.o.g., that a and b are in the same sub-tree \(T_{ab}\), and c and d are in separate sub-trees \(T_c\) and \(T_d\), respectively; see Fig. 1. Let \(n_{ab}\), \(n_c\), and \(n_d\) be the number of points in \(T_{ab}\), \(T_c\), and \(T_d\), respectively. Thus,

$$\begin{aligned} W(T)&= W(T_{ab}) + n_c\cdot \delta _a(T_{ab}) + n_d\cdot \delta _b(T_{ab}) \\&+ W(T_c) + (n_{ab}+n_d)\cdot \delta _c(T_c) + n_c(n_{ab}+n_d)\cdot |ac| \\&+ W(T_d) + (n_{ab}+n_c)\cdot \delta _d(T_d) + n_d(n_{ab}+n_c)\cdot |bd| \\&+ n_c\cdot n_d \cdot \delta _{T}(a,b) \,. \end{aligned}$$
Fig. 1. The trees T, \(T'\), and \(T''\) (from left to right).

Let \(T'\) be the spanning tree of P obtained from T by replacing the edge \((b,d)\) by the edge \((a,d)\). Similarly, let \(T''\) be the spanning tree of P obtained from T by replacing the edge \((a,c)\) by the edge \((b,c)\). Thus,

$$\begin{aligned} W(T')&= W(T_{ab}) + (n_c + n_d)\cdot \delta _a(T_{ab}) \\&+ W(T_c) + (n_{ab}+n_d)\cdot \delta _c(T_c) + n_c(n_{ab}+n_d)\cdot |ac| \\&+ W(T_d) + (n_{ab}+n_c)\cdot \delta _d(T_d) + n_d(n_{ab}+n_c)\cdot |ad| \,, \end{aligned}$$

and

$$\begin{aligned} W(T'')&= W(T_{ab}) + (n_c + n_d)\cdot \delta _b(T_{ab}) \\&+ W(T_c) + (n_{ab}+n_d)\cdot \delta _c(T_c) + n_c(n_{ab}+n_d)\cdot |bc| \\&+ W(T_d) + (n_{ab}+n_c)\cdot \delta _d(T_d) + n_d(n_{ab}+n_c)\cdot |bd| \,. \end{aligned}$$

Therefore,

$$\begin{aligned} W(T) - W(T')&= n_d\big (\delta _b(T_{ab}) - \delta _a(T_{ab})\big ) + n_d(n_{ab}+n_c)\big (|bd|-|ad|\big ) \\&+ n_c\cdot n_d \cdot \delta _{T}(a,b) \,, \end{aligned}$$

and

$$\begin{aligned} W(T) - W(T'')&= n_c\big (\delta _a(T_{ab}) - \delta _b(T_{ab})\big ) + n_c(n_{ab}+n_d)\big (|ac|-|bc|\big ) \\&+ n_c\cdot n_d \cdot \delta _{T}(a,b) \,. \end{aligned}$$

If \(W(T) - W(T') > 0\) or \(W(T) - W(T'') > 0\), then this contradicts the minimality of T, and we are done.

Assume that \(W(T) - W(T') \le 0\) and \(W(T) - W(T'') \le 0\). Since \(n_c > 0\) and \(n_d > 0\), we have

$$\begin{aligned} \delta _b(T_{ab}) - \delta _a(T_{ab}) + (n_{ab}+n_c)\big (|bd|-|ad|\big ) + n_c\cdot \delta _{T}(a,b) \le 0 \,, \end{aligned}$$

and

$$\begin{aligned} \delta _a(T_{ab}) - \delta _b(T_{ab}) + (n_{ab}+n_d)\big (|ac|-|bc|\big ) + n_d \cdot \delta _{T}(a,b) \le 0 \,. \end{aligned}$$

Thus, by summing these inequalities, we have

$$\begin{aligned} (n_{ab}+n_c)\big (|bd|-|ad|\big ) + (n_{ab}+n_d)\big (|ac|-|bc|\big ) + (n_c+n_d)\cdot \delta _{T}(a,b) \le 0 \,. \end{aligned}$$

That is,

$$\begin{aligned} n_{ab}\big (|bd| + |ac| - |ad| - |bc|\big )&+ n_c\big (|bd| + \delta _{T}(a,b) - |ad| \big ) \\&+ n_d\big (|ac| + \delta _{T}(a,b) - |bc| \big ) \le 0 \,. \end{aligned}$$

Since \(n_{ab}, n_c, n_d > 0\), and, by the triangle inequality, \(|bd| + |ac| - |ad| - |bc| > 0\), \(|bd| + \delta _{T}(a,b) - |ad| > 0\), and \(|ac| + \delta _{T}(a,b) - |bc| > 0\), this is a contradiction.    \(\square \)
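For completeness, here is one way to see the three strict inequalities. This is a sketch that assumes the two edges cross at an interior point o and that no three of a, b, c, d are collinear. Splitting each crossing edge at o and applying the triangle inequality to the triangles aod and boc gives the first line. The second line uses the fact that \(\delta _{T}(a,b) \ge |ab|\).

$$\begin{aligned} |ac| + |bd|&= \big (|ao| + |od|\big ) + \big (|bo| + |oc|\big ) > |ad| + |bc| \,, \\ |bd| + \delta _{T}(a,b)&\ge |bd| + |ab| > |ad| \,, \qquad |ac| + \delta _{T}(a,b) \ge |ac| + |ab| > |bc| \,. \end{aligned}$$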

3 An Exact Algorithm for Points in Convex Position

Let \(P=\{p_1,p_2,\ldots ,p_n\}\) be a set of n points in convex position in the plane, ordered clockwise starting from an arbitrary first point \(p_1\); see Fig. 2. For simplicity of presentation, we assume that all indices are taken modulo n. For each \(1 \le i \le j \le n\), let \(P[i,j] \subseteq P\) be the set \(\{p_i,p_{i+1},\dots ,p_j\}\). Let \(T_{i,j}\) be a spanning tree of P[i, j], and let \(W(T_{i,j})\) denote its Wiener index. For an index \(x\in \{i,j\}\), let \(\delta _x(T_{i,j})\) be the total weight of the paths from \(p_x\) to every point of P[i, j] in \(T_{i,j}\). That is, \(\delta _x(T_{i,j}) = \sum _{p \in P[i,j]}\delta _{T_{i,j}}(p_x,p)\).

Fig. 2. The convex polygon that is obtained from P. \(p_1\) is connected to \(p_j\) in T.

Let T be a minimum Wiener index spanning tree of P and let \(W^*\) be its Wiener index, i.e., \(W^*=W(T)\). Notice that, for any \(1\le i<j\le n\), the points in P[i, j] are in convex position, since the points in P are in convex position. Since T is a spanning tree, each point, particularly \(p_1\), is incident to at least one edge in T. Let \(p_j\) be the point with maximum index j that is connected to \(p_1\) in T. Moreover, by Theorem 1, T is planar. Thus, there exists an index \(1 \le i \le j\) such that all the points in P[1, i] are closer to \(p_1\) than to \(p_j\) in T, and all the points in \(P[i+1,j]\) are closer to \(p_j\) than to \(p_1\) in T. Let \(T_{1,i}\), \(T_{i+1,j}\), and \(T_{j,n}\) be the sub-trees of T containing the points in P[1, i], \(P[i+1,j]\), and P[j, n], respectively; see Fig. 2. Hence,

$$\begin{aligned} W^*&= W(T_{1,i}) + (n-i)\cdot \delta _1(T_{1,i}) \\&+ W(T_{i+1,j}) + (n-j+i)\cdot \delta _j(T_{i+1,j}) \\&+ W(T_{j,n}) + (j-1)\cdot \delta _j(T_{j,n}) \\&+ i(n-i)\cdot |p_1p_j|. \end{aligned}$$

For \(1\le i<j\le n\), let \(W_j[i,j]\) be the minimum value of \(W(T_{i,j}) + (n-j+i-1)\cdot \delta _j(T_{i,j})\) over all spanning trees \(T_{i,j}\) of P[i, j], where \(T_{i,j}\) is regarded as rooted at \(p_j\). Similarly, let \(W_i[i,j]\) be the minimum value of \(W(T_{i,j}) + (n-j+i-1)\cdot \delta _i(T_{i,j})\) over all spanning trees \(T_{i,j}\) of P[i, j], where \(T_{i,j}\) is regarded as rooted at \(p_i\). The factor \(n-j+i-1\) is the number of points of P outside P[i, j]. Thus, we can write \(W^*\) as

$$ W^* = W_1[1,n] = W_1[1,i] + W_j[i+1,j] + W_j[j,n] + i(n-i)\cdot |p_1p_j| \,. $$

Therefore, in order to compute \(W^*\), we compute \(W_1[1,i]\), \(W_j[i+1,j]\), \(W_j[j,n]\), and \(i(n-i)\cdot |p_1p_j|\) for each j between 2 and n and for each i between 1 and j, and take the minimum over the sum of these values. In general, for every \(1\le i<j\le n\), we compute \(W_j[i,j]\) and \(W_i[i,j]\) recursively using the following formulas; see also Fig. 3.

$$ W_j[i,j] =\underset{k \le l< j}{\underset{i \le k < j}{\min }} \ \big \{ W_k[i,k] + W_k[k,l] + W_j[l+1,j] + (l-i+1)(n-l+i-1)\cdot |p_kp_j| \big \} \,, $$

and

$$ W_i[i,j] =\underset{i \le l< k}{\underset{i < k \le j}{\min }} \ \big \{ W_i[i,l] + W_k[l+1,k] + W_k[k,j] + (j-l)(n-j+l)\cdot |p_ip_k|\big \} \,. $$
Fig. 3. A sub-problem defined by P[i, j]. (a) Computing \(W_j[i,j]\). (b) Computing \(W_i[i,j]\).

We compute \(W_j[i,j]\) and \(W_i[i,j]\), for each \(1 \le i < j \le n\), using dynamic programming as follows. We maintain two tables \({\mathop {M}\limits ^{\rightarrow }}\) and \({\mathop {M}\limits ^{\leftarrow }}\) each of size \(n \times n\), such that \({\mathop {M}\limits ^{\rightarrow }}[i,j] = W_j[i,j]\) and \({\mathop {M}\limits ^{\leftarrow }}[i,j] = W_i[i,j]\), for each \(1 \le i < j \le n\). We fill in the tables using Algorithm 1.

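Algorithm 1. Filling the tables \({\mathop {M}\limits ^{\rightarrow }}\) and \({\mathop {M}\limits ^{\leftarrow }}\) (pseudocode figure).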

Notice that when we fill the cell \({\mathop {M}\limits ^{\rightarrow }}[i,j]\), all the cells \({\mathop {M}\limits ^{\rightarrow }}[i,k]\), \({\mathop {M}\limits ^{\leftarrow }}[k,l]\), and \({\mathop {M}\limits ^{\rightarrow }}[l+1,j]\), for each \(i \le k < j\) and for each \(k \le l < j\), are already computed, and when we fill the cell \({\mathop {M}\limits ^{\leftarrow }}[i,j]\), all the cells \({\mathop {M}\limits ^{\leftarrow }}[i,l]\), \({\mathop {M}\limits ^{\rightarrow }}[l+1,k]\), and \({\mathop {M}\limits ^{\leftarrow }}[k,j]\), for each \(i < k \le j\) and for each \(i \le l < k\), are already computed. Thus, each cell in the table is computed in \(O(n^2)\) time, and the whole table is computed in \(O(n^4)\) time. Therefore, \(W^*=W_1[1,n]={\mathop {M}\limits ^{\leftarrow }}[1,n]\) can be computed in \(O(n^4)\) time.
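The two recurrences and the fill order can be sketched in Python as follows. This is an illustrative reading of the recurrences above, not a transcription of Algorithm 1. The function name `min_wiener_convex`, the table names `right` and `left` (standing for the two arrow-decorated tables), and the choice of filling by increasing interval length are assumptions of this sketch. The diagonal entries are zero because a single point has no paths.

```python
import math

def min_wiener_convex(points):
    """Minimum Wiener index of a spanning tree of points in convex position.

    `points` lists the points in clockwise order; indices are 1-based
    inside the function to match the recurrences in the text.
    """
    n = len(points)

    def d(a, b):
        return math.dist(points[a - 1], points[b - 1])

    # right[i][j] plays the role of W_j[i,j], left[i][j] of W_i[i,j].
    # Entries with i == j stay 0.0 (a single point).
    right = [[0.0] * (n + 2) for _ in range(n + 2)]
    left = [[0.0] * (n + 2) for _ in range(n + 2)]

    for length in range(1, n):              # length = j - i, short intervals first
        for i in range(1, n - length + 1):
            j = i + length

            # W_j[i,j]: p_j is joined to p_k; P[i,l] hangs off p_k.
            best = math.inf
            for k in range(i, j):
                for l in range(k, j):
                    cand = (right[i][k] + left[k][l] + right[l + 1][j]
                            + (l - i + 1) * (n - l + i - 1) * d(k, j))
                    best = min(best, cand)
            right[i][j] = best

            # W_i[i,j]: p_i is joined to p_k; P[l+1,j] hangs off p_k.
            best = math.inf
            for k in range(i + 1, j + 1):
                for l in range(i, k):
                    cand = (left[i][l] + right[l + 1][k] + left[k][j]
                            + (j - l) * (n - j + l) * d(i, k))
                    best = min(best, cand)
            left[i][j] = best

    return left[1][n]                        # W^* = W_1[1,n]

# Unit square in clockwise order: the path along three sides costs 3 + 4 + 3.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(min_wiener_convex(square))  # 10.0
```

The four nested loops reflect the \(O(n^4)\) bound: \(O(n^2)\) cells, each minimized over \(O(n^2)\) pairs (k, l). The sketch returns only the optimal value; recovering the tree itself would require storing the minimizing (k, l) for each cell.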

The following theorem summarizes the result of this section.

Theorem 2

Let P be a set of n points in convex position. Then, a spanning tree of P of minimum Wiener index can be computed in \(O(n^4)\) time.

4 Hardness Proof

Let P be a set of points in the plane and let T be a spanning tree of P. Recall that \(W(T) = \sum _{p,q \in P} \delta _T(p,q)\) denotes the Wiener index of T, where \(\delta _T(p,q)\) is the length of the path between p and q in T. We define the weight of T as \(wt(T)=\sum _{(p,q) \in T} |pq|\), where |pq| is the Euclidean distance between p and q. For an edge \((p,q)\) of T, let \(N_T(p)\) (resp., \(N_T(q)\)) be the number of points in T that are closer to p than to q (resp., closer to q than to p). It is well known [21] that W(T) can be formulated as:

$$W(T) = \sum _{(p,q) \in T} N_T(p)\cdot N_T(q)\cdot |pq|.$$
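The edge formulation can be evaluated in linear time once subtree sizes are known. The following Python sketch illustrates this; the name `wiener_by_edges`, the choice of vertex 0 as root, and the edge-list representation are assumptions of the example. For a tree edge between a vertex u and its parent, the two counts \(N_T(\cdot )\) are the size of the subtree below u and the number of remaining points.

```python
import math
from collections import defaultdict

def wiener_by_edges(points, edges):
    """W(T) as the sum over edges of N_T(p) * N_T(q) * |pq|."""
    n = len(points)
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Depth-first order from vertex 0; every vertex appears after its parent.
    parent = {0: None}
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)

    # Process vertices in reverse order so that children precede parents.
    size = {u: 1 for u in order}
    total = 0.0
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            size[p] += size[u]
            total += size[u] * (n - size[u]) * math.dist(points[u], points[p])
    return total

# Star with four unit-length edges: each edge contributes 1 * 4 * 1.
star_points = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(wiener_by_edges(star_points, star_edges))  # 16.0
```

The same value follows from the pairwise definition: four center-to-leaf pairs at distance 1 and six leaf-to-leaf pairs at distance 2.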

In this section, we prove that the following problem is NP-hard.

Euclidean Wiener Index Tree Problem: Given a set P of points in the plane, a cost W, and a budget B, decide whether there exists a spanning tree T of P, such that \(W(T) \le W\) and \(wt(T) \le B\).

Theorem 3

The Euclidean Wiener Index Tree Problem is weakly NP-hard.

Proof

Inspired by Carmi and Chaitman-Yerushalmi [8], we reduce from the Partition problem, which is known to be NP-hard [14], to the Euclidean Wiener Index Tree Problem. In the Partition problem, we are given a set \(X=\{x_1,x_2,\ldots ,x_n\}\) of n positive integers with even \(R=\sum _{i=1}^{n}x_i\), and the goal is to decide whether there is a subset \(S\subseteq X\), such that \(\sum _{x_i \in S} x_i = \frac{1}{2}R\).

Given an instance \(X=\{x_1,x_2,\ldots ,x_n\}\) of the Partition problem, where the \(x_i\)’s are integers, we construct a set P of \(m = n^3 + 3n\) points as follows. The set P contains n points \(p_1,p_2,\ldots ,p_n\) equally spaced on a circle of radius nR, and a cluster C of \(n^3\) points located at the center of the circle. Moreover, for each \(1 \le i \le n\), we place two points \(l_i\) and \(r_i\), both at distance \(x_i\) from \(p_i\) and at distance \(\frac{1}{2}x_i\) from each other; see Fig. 4. Finally, we set

$$\begin{aligned} B=&\ \Big ( n^2 + \frac{7}{4}\Big )R \text {, and } \\ W=&\ 3n^2\big ( m-3 \big )R + \Big ( \frac{9}{4}m - \frac{13}{4}\Big )R \\ =&\ 3n^5R + \frac{45}{4}n^3R -9n^2R + \frac{27}{4}nR - \frac{13}{4}R \,. \end{aligned}$$
Fig. 4. The set P produced by the reduction. Connecting the points \(l_j\), \(r_j\), and \(p_j\) for \(x_j \in S\) (blue) and connecting the points \(l_i\), \(r_i\), and \(p_i\) for \(x_i \in X\setminus S\) (red). (Color figure online)

Assume that there exists a set \(S\subseteq X\), such that \(\sum _{x_i \in S} x_i = \frac{1}{2}R\). We construct a spanning tree T for the points in P as follows:

  • Select an arbitrary point \(s \in C\) and connect it to all the points in \(C\cup \{p_1,p_2,\ldots ,p_n\}\) as a star centered at s.

  • For each \(1 \le i \le n\), connect the points \(p_i\) and \(l_i\).

  • For each \(x_i \in S\), connect the points \(p_i\) and \(r_i\).

  • For each \(x_i \in X\setminus S\), connect the points \(r_i\) and \(l_i\); see Fig. 4.

It is easy to see that \(wt(T)= n^2R + R + \frac{3}{4}R = \big ( n^2 + \frac{7}{4}\big )R = B\). Moreover, the Wiener index of T is:

$$\begin{aligned} W(T) =&\sum _{(p,q) \in T} N_T(p)\cdot N_T(q)\cdot |pq| \\ =&\ 3 (n^3 + 3n -3) n^2 R + \sum _{x_i \in S} 2(n^3 + 3n -1) x_i \\&\quad + \sum _{x_i \notin S} \Big ( (n^3 + 3n -1) \frac{1}{2}x_i\Big ) + \sum _{x_i \notin S} \Big ( 2 (n^3 + 3n -2) x_i \Big ) \\ =&\ 3 n^5 R + 9n^3 R -9n^2 R + (n^3 + 3n -1)R \\&\quad + \frac{1}{4}(n^3 + 3n -1) R + (n^3 + 3n -2) R \\ =&\ 3n^5R + \frac{45}{4}n^3R -9n^2R + \frac{27}{4}nR - \frac{13}{4}R = W \,. \end{aligned}$$

Conversely, let \(T'\) be a spanning tree of P with \(wt(T') \le B\) and \(W(T') \le W\).

Claim

The number of edges \((p, q) \in T'\), such that \(p \in C\) and \(q \in P \setminus C\) is n.

Proof

Assume there are k such edges. The weight of each such edge is at least nR; thus, \(wt(T') \ge k nR\). Since \(wt(T') \le B = (n^2 + \frac{7}{4}) R\), we get that \(k \le n\). We have

$$\begin{aligned} W(T') >&\ (3 k nR + 3 (n-k)(nR + 2\pi R)) n^3 \\ =&\ (3 k n + 3 n^2 + 6n \pi - 3kn - 6 k \pi ) n^3 R \\ =&\ ( 3 n^2 + 6 \pi (n - k)) n^3 R \\ =&\ 3 n^5 R + 6 \pi (n - k) n^3 R \,. \end{aligned}$$

Thus, if \( k < n \), then we get that \(W(T')> 3 n^5 R + 6 \pi n^3 R > W\), for sufficiently large n.    \(\square \)

Let \(P_i = \{p_i, l_i, r_i \}\), for every \(1 \le i \le n\). From the proof of Claim 4, it follows that for every \(1 \le i \le n\), there is exactly one edge \((p,q)\) in \(T'\), where \(q \in P_i\) and \(p \in C\). Moreover, it is easy to see that \(q = p_i\). Thus, in every \(P_i\), we have \((p_i, l_i) \in T'\) or \((p_i, r_i) \in T'\). Assume w.l.o.g., that \((p_i, l_i) \in T'\). Therefore, either \((p_i, r_i) \in T'\) or \((l_i, r_i) \in T'\). Let \(S' \subseteq X\), such that \(x_i \in S'\) if and only if \((p_i,r_i) \in T'\), and let \(R' =\sum _{x_i \in S'} x_i\).

Thus, to finish the proof, we show that if \(R' \ne \frac{1}{2}R\), then either \(wt(T') > B\) or \(W(T') > W\).

Case 1: \(R' > \frac{1}{2}R\). In this case, we have

$$\begin{aligned} wt(T') \ge&\ n^{2} R + \sum _{x_i \in S'} 2 x_i + \sum _{x_i \notin S'} \frac{3}{2} x_i \ = \ n^{2} R + 2R' + \frac{3}{2}(R-R') \\ =&\ n^{2} R + \frac{1}{2}R' + \frac{3}{2}R \ > \ n^{2} R + \frac{1}{4}R + \frac{3}{2}R \ = \ \big ( n^2 + \frac{7}{4}\big )R \ = B \,. \end{aligned}$$

Therefore, \(wt(T') > B\).

Case 2: \(R' < \frac{1}{2}R\). In this case, we have

$$\begin{aligned} W(T') =&\sum _{(p,q) \in T'} N_{T'}(p)\cdot N_{T'}(q)\cdot |pq| \\ =&\ 3 (n^3 + 3n -3) n^2 R + \sum _{x_i \in S'} 2(n^3 + 3n -1) x_i \\&\quad + \sum _{x_i \notin S'} \Big ( (n^3 + 3n -1) \frac{1}{2}x_i\Big ) + \sum _{x_i \notin S'} \Big ( 2 (n^3 + 3n -2) x_i \Big ) \\ =&\ 3 n^5 R + 9n^3 R -9n^2 R + 2(n^3 + 3n -1)R' \\&\quad + \frac{1}{2}\Big (n^3 + 3n -1\Big ) (R-R') + 2(n^3 + 3n -2) (R-R') \\ =&\ 3 n^5 R + 9n^3 R -9n^2 R + 2 (n^3 + 3n -2)R \\&\quad - \Big (\frac{1}{2}\Big (n^3 + 3n -1\Big ) -2\Big )R' + \frac{1}{2}\Big (n^3 + 3n -1\Big ) R \\ >&\ 3 n^5 R + 9n^3 R - 9n^2 R + 2 (n^3 + 3n -2)R \\&\quad - \frac{1}{2}\Big (\frac{1}{2}\Big (n^3 + 3n -1\Big ) -2\Big )R + \frac{1}{2}\Big (n^3 + 3n -1\Big ) R \\ =&\ 3n^5R + \frac{45}{4}n^3R -9n^2R + \frac{27}{4}nR - \frac{13}{4}R = W \,. \end{aligned}$$

   \(\square \)
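The arithmetic of the reduction can be checked with exact rationals. The Python sketch below is only an accounting aid under stated assumptions: it takes the tree shape described in the proof for granted (a star from the cluster to the \(p_i\), plus the local edges inside each \(P_i\)) and works with edge lengths and side counts rather than with coordinates. The names `thresholds` and `tree_cost`, and the use of index sets for S, are hypothetical.

```python
from fractions import Fraction

def thresholds(xs):
    """The budget B and the cost bound W defined by the reduction."""
    n, R = len(xs), sum(xs)
    m = n**3 + 3 * n
    B = (n**2 + Fraction(7, 4)) * R
    W = 3 * n**2 * (m - 3) * R + (Fraction(9, 4) * m - Fraction(13, 4)) * R
    return B, W

def tree_cost(xs, S):
    """wt(T) and W(T) for the tree of the proof.

    S is the set of indices i for which the edge (p_i, r_i) is used;
    for the other indices the edge (l_i, r_i) is used instead.
    """
    n, R = len(xs), sum(xs)
    m = n**3 + 3 * n
    wt = Fraction(n * n * R)               # n star edges of length nR
    W = Fraction(3 * (m - 3) * n * n * R)  # each separates 3 points from m - 3
    for i, x in enumerate(xs):
        wt += x                            # edge (p_i, l_i), always present
        if i in S:
            wt += x                        # edge (p_i, r_i)
            W += 2 * (m - 1) * x           # two leaf edges of length x_i
        else:
            wt += Fraction(x, 2)           # edge (l_i, r_i)
            W += (m - 1) * Fraction(x, 2)  # leaf edge (l_i, r_i)
            W += 2 * (m - 2) * x           # (p_i, l_i) separates 2 from m - 2
    return wt, W

xs = [1, 2, 3]                             # R = 6, so a balanced subset sums to 3
B, W = thresholds(xs)
print(tree_cost(xs, {2}) == (B, W))        # balanced: both bounds met with equality
print(tree_cost(xs, {1, 2})[0] > B)        # R' > R/2: the weight exceeds B
print(tree_cost(xs, {0})[1] > W)           # R' < R/2: the Wiener index exceeds W
```

The three printed checks correspond to the forward direction, Case 1, and Case 2 of the proof, respectively, on this one small instance.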

Fig. 5. A set P of \(n=2m+2\) points in convex position.

5 Paths that Optimize Wiener Index

We consider now the case of spanning paths that optimize the Wiener index.

Theorem 4

Let P be a set of n points. The path that minimizes the Wiener index among all Hamiltonian paths of P is not necessarily planar.

Proof

Consider the set P of \(n=2m+2\) points in convex position as shown in Fig. 5. The set P consists of two clusters \(P_l\) and \(P_r\) and two points p and q, where \(|P_l|= |P_r|=m\). The points in cluster \(P_l\) are arbitrarily close to the origin (0, 0), and the points in cluster \(P_r\) are arbitrarily close to the point (6, 0). The points p and q are located at (5, 1) and \((5,-1)\), respectively.

Since the points in \(P_l\) are arbitrarily close to the origin (0, 0), any path connecting these points has a Wiener index arbitrarily close to zero. Thus, any Hamiltonian path \(\varPi \) of P that aims to minimize the Wiener index will connect the points in \(P_l\) by a path. Similarly, any Hamiltonian path \(\varPi \) of P that aims to minimize the Wiener index will connect the points in \(P_r\) by a path. Therefore, it is sufficient to consider the 12 possible Hamiltonian paths defined on points (0, 0), (6, 0), p, and q, while treating each one of the points (0, 0) and (6, 0) as a path (of negligible Wiener index) containing m points starting and ending at this point. We computed the Wiener index of these Hamiltonian paths, and this computation shows that the Hamiltonian path that minimizes the Wiener index is not planar (for sufficiently large n); see Fig. 6.     \(\square \)
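The comparison of the 12 paths can be reproduced with a small Python sketch. It makes simplifying assumptions: each cluster is modeled as m coincident points (labels `L` and `R`), so paths inside a cluster contribute nothing, and m = 100 stands in for "sufficiently large". The helper names `path_wiener`, `orient`, and `crosses` are hypothetical.

```python
import math
from itertools import permutations

def path_wiener(order, pos, mult):
    """Wiener index of a path over weighted locations.

    Each path edge contributes (points before it) * (points after it) * length.
    """
    total_pts = sum(mult.values())
    cost, before = 0.0, 0
    for a, b in zip(order, order[1:]):
        before += mult[a]
        cost += before * (total_pts - before) * math.dist(pos[a], pos[b])
    return cost

def orient(a, b, c):
    """Sign of the turn a -> b -> c (cross product of b - a and c - a)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def crosses(order, pos):
    """In a path on four locations only the first and last edges can cross."""
    a, b, c, d = (pos[v] for v in order)
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)

m = 100
pos = {"L": (0, 0), "R": (6, 0), "p": (5, 1), "q": (5, -1)}
mult = {"L": m, "R": m, "p": 1, "q": 1}

# 4! = 24 orderings; keeping one of each reversed pair leaves 12 paths.
paths = [o for o in permutations(pos) if o[0] < o[-1]]
best = min(paths, key=lambda o: path_wiener(o, pos, mult))
print(best, crosses(best, pos))
```

For this value of m the cheapest path joins the two clusters directly and then visits p and q, so its last edge (between p and q) crosses its first edge (between the clusters).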

Fig. 6. The 12 possible Hamiltonian paths that are defined on points (0, 0), (6, 0), p, and q, and their Wiener index.

Theorem 5

For points in the Euclidean plane, it is NP-hard to compute a Hamiltonian path minimizing the Wiener index.

Proof

We reduce from Hamiltonicity in a grid graph (whose vertices are integer grid points and whose edges join pairs of grid points at distance one; see Fig. 7). It is well known that the Wiener index of a Hamiltonian path of n points, where each edge is of length one, is \({ n+1 \atopwithdelims ()3}\) (see [21]). Thus, it is easy to see that a grid graph \(G=(P,E)\) has a Hamiltonian path if and only if there exists a Hamiltonian path in the complete graph over P of Wiener index \({ n+1 \atopwithdelims ()3}\).    \(\square \)
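The closed form used in the reduction can be checked numerically. In the Python sketch below, `unit_path_wiener` is a hypothetical helper that sums the per-edge contributions of a path with n vertices and unit-length edges: the t-th edge separates t vertices from the remaining n - t.

```python
from math import comb

def unit_path_wiener(n):
    """Wiener index of a path on n vertices whose edges all have length one."""
    return sum(t * (n - t) for t in range(1, n))

for n in (2, 3, 10):
    print(n, unit_path_wiener(n), comb(n + 1, 3))
```

Since any two distinct grid points are at distance at least one, every edge of a Hamiltonian path in the complete graph has length at least one, so the value \({ n+1 \atopwithdelims ()3}\) is attained only when all edges are grid-graph edges.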

Fig. 7. A grid graph G and a Hamiltonian path with Wiener index \({ n+1 \atopwithdelims ()3}\) in G.

Theorem 6

There exists a set P of n points in the plane, such that the Wiener index of any Hamiltonian path is \(\varOmega (\sqrt{n})\) times the Wiener index of the complete Euclidean graph over P.

Proof

Let P be a set of n points located on a \(\sqrt{n}\times \sqrt{n}\) integer grid. The Wiener index of any Hamiltonian path of P is at least \({ n+1 \atopwithdelims ()3}\), which is the Wiener index of a Hamiltonian path all of whose edges are of length one. Thus, the Wiener index of any Hamiltonian path of P is \(\varOmega (n^{3})\). On the other hand, since the distance between every two points in P is at most \(\sqrt{2n}\) and there are \({ n \atopwithdelims ()2}\) pairs of points, the Wiener index of the complete graph over P is \(O(n^{2.5})\).    \(\square \)
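The gap can be observed numerically on small grids. The following Python sketch is an illustration only: `grid_complete_wiener` is a hypothetical helper that sums all pairwise Euclidean distances on an s-by-s integer grid, and the loop prints the ratio between the path lower bound \({ n+1 \atopwithdelims ()3}\) and that sum, normalized by \(\sqrt{n}\).

```python
import math
from itertools import combinations

def grid_complete_wiener(s):
    """Wiener index of the complete Euclidean graph on an s x s integer grid."""
    pts = [(x, y) for x in range(s) for y in range(s)]
    return sum(math.dist(a, b) for a, b in combinations(pts, 2))

for s in (2, 4, 8, 16):
    n = s * s
    ratio = math.comb(n + 1, 3) / grid_complete_wiener(s)
    print(n, ratio, ratio / math.sqrt(n))
```

If the normalized value in the last column stays bounded away from zero as s grows, that is consistent with the \(\sqrt{n}\) gap stated in the theorem; the sketch does not replace the proof.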