On a Traveling Salesman Problem for Points in the Unit Cube

Balogh, József; Clemen, Felix Christian; Dumitrescu, Adrian

doi:10.1007/s00453-024-01257-w


Open access
Published: 18 July 2024

Volume 86, pages 3054–3078, (2024)

Algorithmica


Abstract

Let X be an n-element point set in the k-dimensional unit cube $[0,1]^k$ where $k \ge 2$. According to an old result of Bollobás and Meir (Oper Res Lett 11:19–21, 1992), there exists a cycle (tour) $x_1, x_2, \ldots , x_n$ through the n points, such that $\left( \sum _{i=1}^n |x_i - x_{i+1}|^k \right) ^{1/k} \le c_k$, where $|x-y|$ is the Euclidean distance between x and y, $x_{n+1} \equiv x_1$, and $c_k$ is an absolute constant that depends only on k. From the other direction, for every $k \ge 2$ and $n \ge 2$, there exist n points in $[0,1]^k$, such that their shortest tour satisfies $\left( \sum _{i=1}^n |x_i - x_{i+1}|^k \right) ^{1/k} = 2^{1/k} \cdot \sqrt{k}$. For the plane, the best constant is $c_2=2$ and this is the only exact value known. Bollobás and Meir showed that one can take $c_k = 9 \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$ for every $k \ge 3$ and conjectured that the best constant is $c_k = 2^{1/k} \cdot \sqrt{k}$, for every $k \ge 2$. Here we significantly improve the upper bound and show that one can take $c_k = 3 \sqrt{5} \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$ or $c_k = 2.91 \sqrt{k} \ (1+o_k(1))$. Our bounds are constructive. We also show that $c_3 \ge 2^{7/6}$, which disproves the conjecture for $k=3$. Connections to matching problems, power assignment problems, and related problems, including algorithms, are discussed in this context. A slightly revised version of the Bollobás–Meir conjecture is proposed.


1 Introduction

Given n points in the unit square, Newman [20, Problem 57] proved that there is a closed polygonal Hamiltonian cycle (tour) H through the n points such that the sum of the squares of its edge-lengths is at most 4. The upper bound of 4 cannot be improved: Fig. 1 shows three different point sets whose optimal tours yield exact equality. More importantly, the above upper bound is independent of n.

Meir [19] considered the extension of this problem to higher dimensions. For a point $x \in \mathbb {R}^k$, let |x| denote the Euclidean length of x; namely, if $x=(\xi _1,\xi _2,\ldots ,\xi _k)$, then

$$\begin{aligned} |x|= \left( \sum _{i=1}^k \xi _i^2 \right) ^{1/2}. \end{aligned}$$

For two points $x,y \in \mathbb {R}^k$, let the weight of the edge $e=xy$, be $|e|:=|x-y|$, i.e., the Euclidean distance between x and y.

Let X be an n-element point set in the unit cube $[0,1]^k$. For a graph G on vertex set X, set

$$\begin{aligned} S_k(G)&= \sum _{e\in G} |e|^k \quad \text { and } \quad s_k(G) = \left( \sum _{e\in G} |e|^k\right) ^{1/k}. \end{aligned}$$

(1)

We refer to $S_k(G)$ and $s_k(G)$ as the unscaled and scaled costs, respectively. Denote by $S_k^{\texttt {HC}}(X)$, $S_k^{\texttt {ST}}(X)$ and $S_k^{\texttt {HP}}(X)$ ($s_k^{\texttt {HC}}(X)$, $s_k^{\texttt {ST}}(X)$ and $s_k^{\texttt {HP}}(X)$) the minimum of $S_k(G)$ ($s_k(G)$) where G is a Hamiltonian cycle, a spanning tree, or a Hamiltonian path with vertex set X, respectively. Further, let

$$\begin{aligned} s_k^{\texttt {HC}}(n)= & {} \sup \{ s_k^{\texttt {HC}}(X) :X\subseteq [0,1]^k, |X|=n \},\\ s_k^{\texttt {ST}}(n)= & {} \sup \{ s_k^{\texttt {ST}}(X) :X\subseteq [0,1]^k, |X|=n \}, \\ s_k^{\texttt {HP}}(n)= & {} \sup \{ s_k^{\texttt {HP}}(X) :X\subseteq [0,1]^k, |X|=n \}, \\ s_k^{\texttt {HC}}= & {} \sup _{n \ge 2} s_k^{\texttt {HC}}(n), \quad s_k^{\texttt {ST}} = \sup _{n \ge 2} s_k^{\texttt {ST}}(n) \quad \text {and} \\ s_k^{\texttt {HP}}= & {} \sup _{n \ge 2} s_k^{\texttt {HP}}(n). \end{aligned}$$

It is clear that $s_k^{\texttt {HC}}(n) \ge s_k^{\texttt {HC}}(m)$, whenever $n \ge m$ (by clustering points and taking the limit). In this notation, Newman’s result mentioned earlier reads $s_2^{\texttt {HC}}(n) =2$ for every $n \ge 2$. A more recent reference to this result can be found in [6, Problem 124]. Currently this is the only exact value known. Meir [19] asked whether $s_k(n)$ is bounded from above by a constant $c_k>0$ for every k. Soon after, Bollobás and Meir [7] answered Meir’s question in the positive by proving that $ s_k^{\texttt {HC}}(n) \le 9 \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$ for every $k \ge 3$ and $n \ge 2$ (and recall that $c_2=2$). From the other direction, the 2-point example consisting of two opposite vertices of $\{0,1\}^k$ shows that $s_k^{\texttt {HC}}(n) \ge 2^{1/k} \cdot \sqrt{k}$ for every $k \ge 2$ and $n \ge 2$; see Fig. 1 (center). We record their result below.

Theorem 1.1

(Bollobás and Meir [7]). Let $k\ge 3$ and $n\ge 2$. Then,

$$\begin{aligned} 2^{1/k}\sqrt{k} \le s_k^{\texttt {HC}}(n) \le 3^{2-\frac{1}{k}}2^{1/k}\sqrt{k}. \end{aligned}$$
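To make the cost in (1) concrete, the following Python sketch evaluates the unscaled and scaled costs of a closed tour and applies them to the two-point example behind the lower bound. The function names `tour_costs` and `opposite_corners` are hypothetical helpers introduced only for this illustration.

```python
import math

def tour_costs(points, k):
    """Return (S_k, s_k) for the closed tour that visits `points` in order."""
    n = len(points)
    # Sum |x_i - x_{i+1}|^k around the cycle, with x_{n+1} = x_1.
    unscaled = sum(
        math.dist(points[i], points[(i + 1) % n]) ** k for i in range(n)
    )
    return unscaled, unscaled ** (1.0 / k)

def opposite_corners(k):
    """Two diagonally opposite vertices of the cube {0,1}^k."""
    return [(0,) * k, (1,) * k]

for k in (2, 3, 4, 10):
    S, s = tour_costs(opposite_corners(k), k)
    # The scaled cost should match 2^(1/k) * sqrt(k).
    print(k, S, s, 2 ** (1 / k) * math.sqrt(k))
```

With two points the cycle traverses the cube diagonal twice, which is where the factor $2^{1/k}$ comes from.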

In the conclusion of their paper [7], the authors conjectured that $s_k^{\texttt {HC}}(n) = 2^{1/k} \cdot \sqrt{k}$ for every $k \ge 2$ and $n \ge 2$. Meir [19] also asked for an algorithm that computes a tour whose cost is bounded by a constant depending on k. As we will see in more detail in Sect. 2, Bollobás and Meir’s proof implicitly gives a positive answer to this latter question. Similarly, our new bounds in Theorem 1.3 and Corollary 5.1 are constructive too.

Background and related work. The traveling salesman problem (TSP) is perhaps the most studied problem in the theory of combinatorial optimization. Its approximability depends on the particular version of the problem. Specifically, TSP with Euclidean distances admits a polynomial-time approximation scheme [3, 16]. If the distances form a metric, then the problem is $\textsf {MaxSNP}$-hard [21] and the best approximation ratio known is essentially 3/2 [8, 13].

The length of a shortest tour of n points in the unit square with respect to Euclidean distances was studied as early as the 1940s and 1950s by Fejes Tóth [10], Few [11], and Verblunsky [30]. Few [11] proved that the (Euclidean) length of a shortest cycle (tour) through n points in the unit square $[0,1]^2$ is at most $\sqrt{2n}+7/4$. The same upper bound holds for the minimum spanning tree [11]. Few’s bound was rediscovered in 1983 by Supowit, Reingold, and Plaisted [26]. A slightly better upper bound for the shortest cycle, $1.392 \sqrt{n} + 7/4$, has been derived by Karloff [14], who also emphasized the difficulty of the problem. The current best lower bound for the length of such a cycle is due to Fejes Tóth [10] and Few [11]: it is $\left( \frac{4}{3}\right) ^{1/4} \sqrt{n} - o(\sqrt{n})$, where $(4/3)^{1/4} = 1.075\ldots $. For every dimension $k \ge 3$, Few showed that the maximum length of a shortest tour through n points in the unit cube is $\Theta (n^{1-1/k})$. Moran [18] studied the length of the shortest traveling salesman tour through a set of n points of unit diameter in $\mathbb {R}^k$.

The length of a shortest tour through a random sample $\{ X_1,\ldots ,X_n\}$ of n points in the unit cube $[0,1]^k$ was determined by Beardwood, Halton, and Hammersley. Let this length be denoted by $L(X_1,\ldots ,X_n)$. If $\{X_i\}$ is a sequence of independent random variables with the uniform distribution on $[0,1]^k$, then there is a constant $\beta (k)>0$ such that

$$\begin{aligned} {L(X_1,\ldots ,X_n) / n^{1-1/k}} \rightarrow \beta (k) \end{aligned}$$

with probability one [4]. Later, Rhee [22] proved that ${\beta (k) / \sqrt{k}} \rightarrow {1 / \sqrt{2 \pi e}}$, see also [25]. The relevance of the cube diagonal, $\sqrt{k}$, in the above formulas, can be also observed in our estimates for $s_k(n)$; see Theorem 1.3 (ii) and Conjecture 5.5.

Expressions for the cost of a Hamiltonian cycle of the kind in (1) have been considered in the context of power assignment problems in wireless networks. Let X be an n-element point set in the unit cube $[0,1]^k$ and $\alpha \ge 1$ be a real number. For a Hamiltonian cycle H as above, one is interested in minimizing a cost of the form

$$\begin{aligned} \textrm{cost}(H)= \sum _{i=1}^n |x_i - x_{i+1}|^\alpha . \end{aligned}$$

(2)

Such costs typically reflect the energy costs along the edges that make up the cycle [9, 15] in wireless network transmission. An illustrative example is that of a virtual token floating through the network, where sensor nodes can attach data to or read data from the token before sending it to the next node on the cycle. One can speak about finding a traveling salesman tour (TSP tour) of minimum energy cost [12]. The fact that k is the smallest value of $\alpha $ for which the cost in (2) is bounded from above by a constant (depending on k but independent of n) should be noted [7, 15]; a fine grid section in the cube proves this point, since for a grid of n points with spacing about $n^{-1/k}$ every tour has cost at least about $n \cdot n^{-\alpha /k}$, which is unbounded in n when $\alpha < k$.

As pointed out in several places in the literature [2, 5, 9, 12], simply computing a short (even optimal) tour for the underlying Euclidean instance does not work, i.e., does not provide a good approximation with respect to the power costs in (2). Funke, Laue, Lotker and Naujoks [12] showed that the cost of an optimal tour for the Euclidean instance can be a factor of $\Omega (n)$ larger than that of an optimal tour for the power costs (a simple example can be constructed with equidistant points on a line or on a circle of large radius).
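One possible instance of the line example is sketched below in Python; the specific construction (integer points $0,\ldots ,n-1$, a "sweep" tour and a "zigzag" tour) is an assumption made for illustration and is not taken from [12]. Both tours have the same Euclidean length $2(n-1)$, so both are optimal for the Euclidean instance, yet their costs for $\alpha =2$ differ by a factor that grows linearly in n.

```python
def power_cost(order, alpha):
    """Sum of |x_i - x_{i+1}|^alpha around the closed tour `order` (points on a line)."""
    n = len(order)
    return sum(abs(order[i] - order[(i + 1) % n]) ** alpha for i in range(n))

def sweep_tour(n):
    """Visit 0, 1, ..., n-1 in order and return with one long edge."""
    return list(range(n))

def zigzag_tour(n):
    """Visit the even points going up, then the odd points coming back down."""
    evens = list(range(0, n, 2))
    odds = list(range(1, n, 2))
    return evens + odds[::-1]

n = 100
for name, tour in (("sweep", sweep_tour(n)), ("zigzag", zigzag_tour(n))):
    # Same Euclidean length (alpha = 1), very different squared cost (alpha = 2).
    print(name, power_cost(tour, 1), power_cost(tour, 2))
```

The sweep tour pays $(n-1)^2$ for its single long closing edge, while the zigzag tour uses only edges of length at most 2.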

In [12] a recursive algorithm was also presented that, given n points in $\mathbb {R}^2$, constructs a TSP tour for edge costs $|pq|^\alpha = |e|^\alpha $ whose cost is at most $2 \cdot 3^{\alpha -1}$ times that of a minimum spanning tree (MST) of the point set. Since the cost of an MST does not exceed that of an optimal Euclidean TSP tour, their algorithm is a $2 \cdot 3^{\alpha -1}$-factor approximation for the TSP with power costs as in (2). The authors further show that the approach extends to $\mathbb {R}^k$ with the same ratio:

Theorem 1.2

(Funke, Laue, Lotker, and Naujoks [12]). There exists a $2 \cdot 3^{\alpha -1}$-approximation algorithm for the TSP in $\mathbb {R}^k$ if the edge weights are Euclidean distances to the power $\alpha $.

If for some $\tau >1$ distances of a TSP instance satisfy

$$\begin{aligned} \textrm{dist}(x,z) \le \tau \left( \textrm{dist}(x,y) + \textrm{dist}(y,z) \right) , \end{aligned}$$

for any three vertices x, y, z, we say that they satisfy the relaxed triangle inequality, see [2, 5, 17]. It is important to note that the metric with Euclidean distances to the power $\alpha $ satisfies the relaxed triangle inequality with $\tau = 2^{\alpha -1}$; see [9, 12]. For $\alpha =2$ (i.e., TSP with squared distances), Theorem 1.2 yields a 6-approximation. De Berg, van Nijnatten, Sitters, Woeginger and Wolff [9] obtained a 5-approximation.
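The relaxed triangle inequality with $\tau = 2^{\alpha -1}$ can be checked numerically; the Python sketch below does so on random triples and on a collinear triple where equality holds. The helper names are hypothetical, and a random test is of course only a sanity check, not a proof.

```python
import math
import random

def relaxed_triangle_gap(x, y, z, alpha):
    """Return tau * (d(x,y)^a + d(y,z)^a) - d(x,z)^a with tau = 2^(a-1)."""
    tau = 2 ** (alpha - 1)
    lhs = math.dist(x, z) ** alpha
    rhs = tau * (math.dist(x, y) ** alpha + math.dist(y, z) ** alpha)
    return rhs - lhs

def random_point(k, rng):
    return [rng.random() for _ in range(k)]

rng = random.Random(0)
# The gap should never be negative (up to rounding).
all_ok = all(
    relaxed_triangle_gap(random_point(3, rng), random_point(3, rng),
                         random_point(3, rng), alpha) >= -1e-12
    for alpha in (1, 2, 3, 5)
    for _ in range(1000)
)
# Equality case: y is the midpoint of x and z.
midpoint_gap = relaxed_triangle_gap((0, 0), (1, 0), (2, 0), 3)
print(all_ok, midpoint_gap)
```

The midpoint configuration shows that $\tau = 2^{\alpha -1}$ cannot be lowered for these costs.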

Our results. The upper bound $s_k^{\texttt {HC}}(n) \le 9 \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$, where $k \ge 3$, has stood unchanged for 30 years [7]. Here we obtain several improvements.

Theorem 1.3

The following bounds are in effect:

(i)
There exists a 4-element point set in $[0,1]^3$ such that the cost of the shortest tour is at least $2^{7/6}=2.24\ldots $. Consequently, $s_3^{\texttt {HC}}(n) \ge 2^{7/6}=2.24\ldots $, for every $n \ge 4$.
(ii)
Let X be an n-element point set in the k-dimensional unit cube $[0,1]^k$, $k \ge 3$. Then there exists a tour $H=x_1, x_2, \ldots , x_n$ through the n points, such that $\left( \sum _{i=1}^n |x_i - x_{i+1}|^k \right) ^{1/k} \le 3 \sqrt{5} \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$. Consequently, $s_k^{\texttt {HC}}(n) \le 3 \sqrt{5} \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k} = 6.708\ldots \cdot \left( \frac{2}{3} \right) ^{1/k} \cdot \sqrt{k}$.
(iii)
H can be computed in time proportional to that needed for computing an MST of the points, in particular, in subquadratic time.

Several sharper bounds are obtained for sufficiently large k. We note that the conjectured optimal configuration consisting of a diameter pair of the cube as well as the lower bound construction we will present for $k=3$ in Theorem 1.3 (i) are subsets of $\{0,1\}^k$. This raises the natural question if one can determine the maximum of $s_k^{\texttt {HC}}(X)$ if the point set X is in $\{0,1\}^k$. We answer this question.

Theorem 1.4

There exists an integer $k_0$ such that for all $k\ge k_0$ the following holds. If X is an arbitrary subset of vertices of $\{0,1\}^k$, then there exists a Hamiltonian cycle H through X such that $s_k(H) \le 2^{1/k}\sqrt{k}$.

The “sufficiently large” requirement for Theorem 1.4 is in fact quite modest. The threshold $k_0$ is below 30. Note that the bound in Theorem 1.4 is attained for $|X|=2$.

Theorem 1.5

For the family of minimum spanning trees, we have

$$\begin{aligned} s_k^{\texttt {ST}} \le \sqrt{k} \ (1+o_k(1)). \end{aligned}$$

Apart from the error term, this bound is best possible.

By transforming a minimum spanning tree into a Hamiltonian cycle by using the method of Sekanina [23] and Bollobás and Meir [7], we obtain $s_k^{\texttt {HC}} \le 3\sqrt{k} \ (1+o_k(1))$. A further refinement based on a two-phase algorithm and a new greedy algorithm that maintains a collection of spanning paths allows us to obtain the following sharper bound.

Theorem 1.6

For the family of Hamiltonian cycles, we have

$$\begin{aligned} s_k^{\texttt {HC}} \le 2.91 \sqrt{k} \ (1+o_k(1)). \end{aligned}$$

When the number of points n is bounded by a constant (independent of k), we can obtain a better asymptotic bound, close to the conjectured value $2^{1/k}\sqrt{k}$.

Theorem 1.7

Let $n\ge 2$ be fixed. For the family of Hamiltonian cycles, we have

$$\begin{aligned} s_k^{\texttt {HC}}(n)= 2^{1/k}\sqrt{k} \ (1+o_k(1)). \end{aligned}$$

Note however, that in Theorem 1.7 we require n to be constant; it does not imply $s_k^{\texttt {HC}}= 2^{1/k}\sqrt{k} \ (1+o_k(1))$.

The improved upper bounds in Theorems 1.3 and 1.6 have implications for the existence of Hamiltonian paths and perfect matchings whose costs are bounded from above by constants depending on k. These are discussed in Sect. 5.

2 Hamiltonian Cycles: Exact Upper and Lower Bounds

2.1 An Improved Lower Bound for $k=3$

In this subsection we prove Theorem 1.3(i). Consider the four-element point set

$$\begin{aligned} X=\{(0,0,0),(0,1,1),(1,0,1),(1,1,0)\} \subset [0,1]^3. \end{aligned}$$

X is in fact a binary code of length 3 with minimum Hamming distance 2; see, e.g., [29, Ch. 5]. As such, the corresponding Euclidean pairwise distances are at least $\sqrt{2}$. Consequently, the unscaled cost of any TSP tour H is at least $S_k(H)\ge 4 \cdot (\sqrt{2})^3 = 11.31\ldots $. On the other hand, the conjectured [7] optimal unscaled cost was $2 \cdot (\sqrt{3})^3 = 10.39\ldots $. $\square $

It is possible that the new lower bound gives the right value of $s_3^{\texttt {HC}}(n)$ for $n \ge 4$, see Conjecture 5.5 in Sect. 5.

Remark. Interestingly enough, for $k=4$, there exist (at least) two different point sets, one with $n=2$ and the other with $n=8$, whose shortest tours have the same cost $S_4^{\texttt {HC}}(X)$ as the conjectured value, $S_4^{\texttt {HC}}(n) = 2 \cdot (\sqrt{4})^4 = 32$. The former set consists of a pair of diagonally opposite vertices, say, $\{(0,0,0,0),(1,1,1,1)\}$. This is in fact the point set that is behind the conjectured maximum cost for every k. The latter set is a binary code of length 4 with minimum distance 2; for example, one can take the eight binary vectors with an even number of ones:

$$\begin{aligned} X&= \{(0,0,0,0),(0,0,1,1),(0,1,0,1),(0,1,1,0),\\&(1,0,0,1), (1,0,1,0), (1,1,0,0), (1,1,1,1)\}. \end{aligned}$$

Then $S_4^{\texttt {HC}}(X) \ge 8 (\sqrt{2})^4 = 32$ and this value can be attained; equivalently, $s_4^{\texttt {HC}}(X) \ge 2^{5/4}$. We were not able to find two different sets X with $s_k^{\texttt {HC}}(X) \ge 2^{1/k} \cdot \sqrt{k}$ for any other $k \ge 5$.
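Both small examples of this subsection can be verified by brute force over all tours. The Python sketch below is a minimal check under the assumption that enumerating all permutations is acceptable (it is only feasible for a handful of points); `min_tour_cost` and `even_weight_vectors` are hypothetical helper names.

```python
import itertools
import math

def min_tour_cost(points, k):
    """Minimum unscaled cost S_k over all Hamiltonian cycles (tiny inputs only)."""
    first, rest = points[0], points[1:]
    best = math.inf
    # Fix the first point to avoid counting rotations of the same cycle.
    for perm in itertools.permutations(rest):
        tour = (first,) + perm
        cost = sum(
            math.dist(tour[i], tour[(i + 1) % len(tour)]) ** k
            for i in range(len(tour))
        )
        best = min(best, cost)
    return best

def even_weight_vectors(k):
    """Binary vectors of length k with an even number of ones."""
    return [v for v in itertools.product((0, 1), repeat=k) if sum(v) % 2 == 0]

S3 = min_tour_cost(even_weight_vectors(3), 3)  # the 4-point set for k = 3
S4 = min_tour_cost(even_weight_vectors(4), 4)  # the 8-point set for k = 4
print(S3, S3 ** (1 / 3), 2 ** (7 / 6))
print(S4)
```

For $k=3$ every pair of points is at distance $\sqrt{2}$, so all tours have the same cost; for $k=4$ the check confirms that a tour using only edges of length $\sqrt{2}$ exists.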

2.2 An Improved Upper Bound for Every $k \ge 3$

In this subsection we prove the last two items in Theorem 1.3. Our proof is modeled on that in [7]. It uses a ball packing argument based on the following lemma. (A similar lemma, however, with smaller ball radii, can be found in [15].)

Lemma 2.1

(Bollobás and Meir [7]). Let $T=(V,E)$ be a minimum spanning tree for a finite point set $X \subset \mathbb {R}^k$. For each edge $e=xy \in E$ let $B_{e}$ be the open ball of radius $\frac{1}{4} |x-y|$ centered at $\frac{1}{2} (x+y)$. Then $B_{e} \cap B_{e'} =\emptyset $ whenever e and $e'$ are distinct edges of T. The factor $\frac{1}{4}$ is as large as possible.

In addition, a suitable order of traversing the vertices of a minimum spanning tree first developed by Sekanina [23, 24] is needed. The algorithm can be made to run in linear time. A proof of this traversal result — in slightly different terms — also appears in [7]. A few definitions and notations (from [7]) are as follows. The h’th power $G^h$ of a graph $G=(V,E)$ is the graph with vertex set V and edge set $E(G^h) = \{xy :x,y \in V, 1 \le d(x,y) \le h\}$. Here d(x, y) is the distance between x and y in the graph. Let T be a tree and $xy \in E(T^h)$. An edge $uv \in E(T)$ is said to be used by xy if the edge uv is on the unique path in T (of length at most h) from x to y. If H is a subgraph of $T^h$, then an edge of T is used t times by H if it is used by t edges of H.

Lemma 2.2

(Sekanina [23], Bollobás and Meir [7]). Let x be a vertex of a tree T with at least 3 vertices. Then $T^3$, the cube of T, contains a Hamiltonian cycle H such that every edge of T is used exactly twice by H, and one of the edges of H incident to x is an edge of T.

It implies the following lemma which is not stated explicitly in [7] but is used in the proof of their Theorem 3. For completeness, we include their proof here.

Lemma 2.3

(Bollobás and Meir [7]). Let T be a spanning tree for a finite point set $X \subset \mathbb {R}^k$. Then there exists a Hamiltonian cycle H on X such that

$$\begin{aligned} S_k(H)\le \frac{2}{3}\cdot 3^k \cdot S_k(T). \end{aligned}$$

Proof

Let $e_1,\ldots ,e_n$ be the edges of a Hamiltonian cycle H in $T^3$ guaranteed by Lemma 2.2. Suppose that the edges of T used by $e_i$ have lengths $d_{i_{1}},\ldots , d_{i_{\ell }}$, where $\ell \le 3$. Set $f_i = d_{i_{1}}+\ldots +d_{i_{\ell }}$ and $f=(f_i)_{i\in [n]}\in {\mathbb {R}}^n$. Then $|e_i|\le f_i$ for every i, each $f_i$ is a sum of at most three $d_j$’s and each $d_j$ occurs in the representations of two $f_i$’s.

Now, we can form three vectors $v_1,v_2,v_3\in {\mathbb {R}}^n$ such that $f=v_1+v_2+v_3$, every coordinate of $v_i$ is a $d_j$ or 0, and every $d_j$ occurs exactly twice as a coordinate in the three $v_i$’s. Therefore, $\sum _{i=1}^3 \Vert v_i \Vert _k^k = 2\sum _{j=1}^{n-1}d_j^k$. Hence, by the triangle-inequality and Jensen’s inequality,

$$\begin{aligned} \Vert f \Vert _k&=\Vert v_1+v_2+v_3 \Vert _k \le \sum _{i=1}^3 \Vert v_i \Vert _k\le 3 \left( \frac{1}{3} \sum _{i=1}^3 \Vert v_i \Vert _k^k \right) ^{1/k} \\&= 3 \left( \frac{2}{3} \sum _{j=1}^{n-1}d_j^k \right) ^{1/k}= 3 \left( \frac{2}{3}\right) ^{1/k} \cdot s_k(T), \end{aligned}$$

and thus

$$\begin{aligned} S_k(H)=\sum _{i=1}^n |e_i|^k\le \Vert f \Vert _k^k\le \frac{2}{3}\cdot 3^k \cdot S_k(T). \end{aligned}$$

$\square $

For convenience, here we work with the unit cube $U=[-1/2,1/2]^k$ centered at the origin $o=(0,\ldots ,0)$. Assume that $n \ge 3$, since it is clear otherwise that $s_k(H) \le 2^{1/k} \cdot \sqrt{k}$. It was shown in [7] that $\cup _{e\in T} B_e $ is contained in the ball of radius $0.75 \sqrt{k}$ centered at the origin o. We next show that $\cup _{e\in T} B_e$ is contained in the ball of radius $ \frac{\sqrt{5}}{4} \sqrt{k} =0.559 \ldots \cdot \sqrt{k}$ centered at o. The idea for the improvement is that centers of balls corresponding to long edges of T cannot be too far from the center of the cube. The key step is the following.

Lemma 2.4

Let $U=[-1/2,1/2]^k$ and $u,v \in U$. Then

$$\begin{aligned} \frac{|u+v|}{2} + \frac{|u-v|}{4} \le \frac{\sqrt{5}}{4} \sqrt{k}. \end{aligned}$$

(3)

This inequality is the best possible.

Proof

To start with, note that

$$\begin{aligned} |u|^2 \le \sum _1 ^k \frac{1}{4} = \frac{k}{4}, \quad \quad |v|^2 \le \sum _1 ^k \frac{1}{4} = \frac{k}{4} \quad \quad \text {and} \quad \quad |u -v| \le \sqrt{k}. \end{aligned}$$

The first two relations immediately yield

$$\begin{aligned} |u|^2 + |v|^2 \le \frac{k}{2}. \end{aligned}$$

(4)

Recall the identities

$$\begin{aligned} |u+v|^2 = |u|^2 + |v|^2 + 2 uv, \quad \quad |u-v|^2&= |u|^2 + |v|^2 - 2 uv. \end{aligned}$$

(5)

Here uv is the dot product of u and v. We deduce that

$$\begin{aligned} |u-v|^2 = 2 (|u|^2+|v|^2) - |u+v|^2 \le 2 (|u|^2+|v|^2). \end{aligned}$$

We can thus write $|u-v| = \lambda \sqrt{|u|^2+|v|^2}$, where $0 \le \lambda \le \sqrt{2}$, whence

$$\begin{aligned} |u-v|^2 = \lambda ^2 (|u|^2+|v|^2). \end{aligned}$$

From the two equations in (5) we also obtain

$$\begin{aligned} |u+v| = \sqrt{2(|u|^2 + |v|^2) - |u-v|^2} = \sqrt{(2 - \lambda ^2) (|u|^2 + |v|^2)}. \end{aligned}$$

Substituting the expressions of $|u+v|$ and $|u-v|$ and using (4) yields

$$\begin{aligned} \frac{|u+v|}{2} + \frac{|u-v|}{4}&= \frac{\sqrt{(2 - \lambda ^2) (|u|^2 + |v|^2)}}{2} + \frac{\lambda \sqrt{|u|^2+|v|^2}}{4} \\&= \left( \frac{ \sqrt{2 - \lambda ^2}}{2} + \frac{\lambda }{4} \right) \sqrt{|u|^2+|v|^2} \\&\le \frac{1}{4} \left( \lambda + 2 \sqrt{2 -\lambda ^2} \right) \sqrt{ \frac{k}{2}}. \end{aligned}$$

A standard calculation shows that the function $f(\lambda ) = \lambda + 2 \sqrt{2 -\lambda ^2}$, where $0 \le \lambda \le \sqrt{2}$, attains its maximum, $\sqrt{10}$, at $\lambda = \sqrt{\frac{2}{5}}$. Consequently,

$$\begin{aligned} \frac{|u+v|}{2} + \frac{|u-v|}{4}&\le \frac{1}{4} \sqrt{10} \sqrt{\frac{k}{2}} = \frac{\sqrt{5}}{4} \sqrt{k}. \end{aligned}$$

This concludes the proof of the upper bound.

For a tight example, assume that k is a multiple of 5 and let $u=(u_1,\ldots ,u_k)$ and $v=(v_1,\ldots ,v_k)$, where

$$\begin{aligned} u_i&= \left\{ \begin{array}{ll} +\frac{1}{2}, &{} \text { for } i=1,\ldots ,\frac{4k}{5}, \\ -\frac{1}{2}, &{} \text { for } i=\frac{4k}{5}+1,\ldots ,k. \end{array} \right. \\ v_i&= +\frac{1}{2}, \ \ \ \ \text { for } i=1,\ldots ,k. \end{aligned}$$

It is now easily verified that

$$\begin{aligned} \frac{|u+v|}{2}&= \sqrt{\frac{4k}{5} \cdot \frac{1}{4}}, \quad \quad \frac{|u-v|}{4} = \sqrt{\frac{k}{5} \cdot \frac{1}{16}}, \quad \quad \text {and} \\ \frac{|u+v|}{2} + \frac{|u-v|}{4}&= \frac{5}{4} \cdot \sqrt{\frac{k}{5}} = \frac{\sqrt{5}}{4} \sqrt{k}, \end{aligned}$$

as required. $\square $
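Inequality (3) and its tight example can be checked numerically. The following Python sketch samples random pairs in the centered cube and evaluates the extremal pair from the proof; the function names are hypothetical, and random sampling only illustrates the bound rather than proving it.

```python
import math
import random

def lemma_lhs(u, v):
    """|u+v|/2 + |u-v|/4 for points u, v of the centered cube [-1/2, 1/2]^k."""
    plus = math.sqrt(sum((a + b) ** 2 for a, b in zip(u, v)))
    minus = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return plus / 2 + minus / 4

def lemma_bound(k):
    """Right-hand side of (3)."""
    return math.sqrt(5) / 4 * math.sqrt(k)

def tight_pair(k):
    """The extremal pair from the proof; requires k to be a multiple of 5."""
    assert k % 5 == 0
    split = 4 * k // 5
    u = [0.5] * split + [-0.5] * (k - split)
    v = [0.5] * k
    return u, v

rng = random.Random(1)
k = 10
worst = max(
    lemma_lhs([rng.uniform(-0.5, 0.5) for _ in range(k)],
              [rng.uniform(-0.5, 0.5) for _ in range(k)])
    for _ in range(10000)
)
print(worst, lemma_lhs(*tight_pair(k)), lemma_bound(k))
```

Random pairs stay well below the bound, while the extremal pair meets it up to rounding.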

Final argument in the proof of Theorem 1.3. Let $u,v \in U$ such that $e=uv$ is an edge of the MST T. By the triangle inequality, the distance from the center of the cube to any point in the ball $B_{e}$ is at most $\frac{1}{2} |u+v| + \frac{1}{4} |u-v|$. By Lemma 2.4 this distance is at most $\frac{\sqrt{5}}{4} \sqrt{k}$, thus $\cup _{e\in T} B_e \subset B$, where B is the ball of radius $ \frac{\sqrt{5}}{4} \sqrt{k} =0.559 \ldots \cdot \sqrt{k}$ centered at o.

The ball packing argument in [7] yields $S_k(T) \le (3 \sqrt{k})^k$. Using Lemma 2.4 instead improves this bound to $S_k(T) \le (\sqrt{5k})^k$. By Lemma 2.3 we obtain a Hamiltonian cycle H through X satisfying

$$\begin{aligned} S_k(H) \le \frac{2}{3}\cdot 3^k \cdot S_k(T) \le \frac{2}{3} \cdot 3^k \cdot (5k)^{k/2}. \end{aligned}$$

(6)

Taking the k-th root completes the proof of item (ii). Note that the only change in the calculation is replacing a multiplicative factor of 3 by $\sqrt{5}$ (in Inequality (2) from [7]). The improvement carries on proportionally and is reflected in the final bound.
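The packing step can be spelled out as follows; this is only a sketch, and $\omega _k$ (the volume of the unit ball in $\mathbb {R}^k$) is notation introduced here rather than in the original argument. Since the balls $B_e$ are pairwise disjoint by Lemma 2.1 and all lie in B, comparing volumes gives

$$\begin{aligned} \sum _{e\in T} \omega _k \left( \frac{|e|}{4}\right) ^k = \sum _{e\in T} \textrm{vol}(B_e) \le \textrm{vol}(B) = \omega _k \left( \frac{\sqrt{5}}{4} \sqrt{k}\right) ^k, \quad \text {hence} \quad S_k(T) = \sum _{e\in T} |e|^k \le \left( \sqrt{5k}\right) ^k. \end{aligned}$$

Combining this with Lemma 2.3 and taking k-th roots gives $s_k(H) \le 3 \left( \frac{2}{3}\right) ^{1/k} \sqrt{5k}$, which is the bound in item (ii).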

Recall that the traversal of the MST T using the algorithm of Sekanina [23, 24] takes linear time. As such, the running time for computing the TSP tour is determined by the time to compute T. This proves item (iii) and completes the proof of Theorem 1.3. $\square $

An alternative way to verify the upper bound in (6) is by using Theorem 1.2. The details are left to the reader.

3 Hamiltonian Cycles for Subsets of Cube Vertices

In this section we consider our problem (the study of extremal values for Hamiltonian cycles and paths in $[0,1]^k$) when the input is restricted to subsets of cube vertices. Note that this restriction is quite natural, since all known best constructions are attained or matched by such subsets. We will use some results on binary codes.

3.1 Preparation: Binary Codes

First we prove an optimization result which will be used multiple times throughout this paper.

Lemma 3.1

Let $q_1,q_2,\ldots , q_m\in [0,1]$. Then,

$$\begin{aligned} \sum _{i<j} |q_i-q_j|^2 \le \left\lfloor \frac{m}{2} \right\rfloor \cdot \Big \lceil \frac{m}{2} \Big \rceil . \end{aligned}$$

Proof

We prove this result by induction on m. The statement holds trivially for $m=1$ and $m=2$. Let $q_1,q_2,\ldots , q_m\in [0,1]$ for some $m\ge 3$. We can assume $0=q_1\le q_2\le \ldots \le q_m=1$. By the induction assumption,

$$\begin{aligned} \sum _{1<i<j<m} |q_i-q_j|^2 \le \left\lfloor \frac{m-2}{2} \right\rfloor \cdot \Big \lceil \frac{m-2}{2} \Big \rceil . \end{aligned}$$

Observe that the maximum of the quadratic function $f(x) = x^2 + (1-x)^2$ over the interval [0, 1] is attained at $x=0$ or $x=1$. Thus, $|q_1-q_j|^2+|q_m-q_j|^2=q_j^2+(1-q_j)^2\le 1$ for $j\in \{2,\ldots ,m-1\}$. Therefore,

$$\begin{aligned} \sum _{i<j} |q_i-q_j|^2&= |q_1-q_m|^2+\sum _{1<j<m} (|q_1-q_j|^2+|q_m-q_j|^2) + \sum _{1<i<j<m} |q_i-q_j|^2 \\&\le 1+(m-2)+ \left\lfloor \frac{m-2}{2} \right\rfloor \cdot \Big \lceil \frac{m-2}{2} \Big \rceil = \left\lfloor \frac{m}{2} \right\rfloor \cdot \Big \lceil \frac{m}{2} \Big \rceil , \end{aligned}$$

completing the proof of this lemma. $\square $
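As a sanity check of Lemma 3.1, the Python sketch below compares the bound with the exact maximum over 0/1 inputs (where the extremum is attained by splitting the values as evenly as possible between 0 and 1) and with random inputs. The helper names are hypothetical.

```python
import itertools
import random

def sum_sq_diffs(q):
    """Sum of |q_i - q_j|^2 over all pairs i < j."""
    return sum((a - b) ** 2 for a, b in itertools.combinations(q, 2))

def floor_ceil_bound(m):
    """floor(m/2) * ceil(m/2), the right-hand side of Lemma 3.1."""
    return (m // 2) * ((m + 1) // 2)

def max_over_endpoints(m):
    """Exact maximum of the sum when every q_i is 0 or 1."""
    return max(sum_sq_diffs(q) for q in itertools.product((0, 1), repeat=m))

rng = random.Random(2)
for m in range(1, 9):
    sample = max(sum_sq_diffs([rng.random() for _ in range(m)]) for _ in range(200))
    print(m, floor_ceil_bound(m), max_over_endpoints(m), round(sample, 3))
```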

Lemma 3.2

Let $\delta ,\gamma > 0$, and $k_1,k_2$ be non-negative integers. Let $X\subseteq [0,\delta ]^{k_1} \times [0,\gamma ]^{k_2}$ be a finite set of size $|X|\ge m\ge 2$. Then there exist two distinct points $p,q\in X$ such that

$$\begin{aligned} |p-q|^2\le \frac{\lfloor \frac{m}{2} \rfloor \lceil \frac{m}{2} \rceil }{\left( {\begin{array}{c}m\\ 2\end{array}}\right) } \, (\delta ^2k_1+\gamma ^2k_2). \end{aligned}$$

Proof

Let $p_1,p_2,\ldots ,p_m$ be any m points from X. Given integers i and j, we denote by ${p_i}_j$ the j-th coordinate of $p_i$. By applying Lemma 3.1 and scaling we obtain

$$\begin{aligned} \sum _{i<i'} |{p_i}_j -{p_{i'}}_j|^2&\le \left\lfloor \frac{m}{2} \right\rfloor \cdot \Big \lceil \frac{m}{2} \Big \rceil \cdot \delta ^2 \quad \quad \text {for every}\, j\in [k_1],\, \text {and} \end{aligned}$$

(7)

$$\begin{aligned} \sum _{i<i'} |{p_i}_j -{p_{i'}}_j|^2&\le \left\lfloor \frac{m}{2} \right\rfloor \cdot \Big \lceil \frac{m}{2} \Big \rceil \cdot \gamma ^2 \quad \quad \text {for every}\, j\in [k_1+k_2]\setminus [k_1]. \end{aligned}$$

(8)

By summing up the inequalities (7) and (8), we obtain

$$\begin{aligned} \sum _{i<i'} |p_{{i}} -p_{{i'}}|^2 \le \left\lfloor \frac{m}{2} \right\rfloor \cdot \Big \lceil \frac{m}{2} \Big \rceil \cdot (\delta ^2k_1+\gamma ^2k_2). \end{aligned}$$

Thus, by averaging over all pairs of points, the minimizing pair satisfies the claimed inequality. $\square $

Applying Lemma 3.2 with $\delta =\gamma =1$, $k_1=k$ and $k_2=0$, immediately yields the following symmetric version.

Lemma 3.3

Let $X\subseteq [0,1]^{k}$ be a set of size $|X|\ge m \ge 3$. Then there exist two distinct points $p,q \in X$ such that

$$\begin{aligned} |p-q|^2 \le \frac{\lfloor \frac{m}{2} \rfloor \cdot \lceil \frac{m}{2} \rceil }{{m \atopwithdelims ()2}} \cdot k. \end{aligned}$$
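The bound of Lemma 3.3 can be illustrated with a short Python sketch that compares the smallest squared pairwise distance in a point set with the right-hand side of the lemma. The helper names are hypothetical; the four-point set from Sect. 2.1 is used as an example where the bound with $m=3$ holds with equality.

```python
import itertools
import math
import random

def pair_bound(m, k):
    """floor(m/2) * ceil(m/2) / C(m, 2) * k, the right-hand side of Lemma 3.3."""
    return (m // 2) * ((m + 1) // 2) / math.comb(m, 2) * k

def min_sq_dist(points):
    """Smallest squared Euclidean distance between two distinct points."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in itertools.combinations(points, 2)
    )

code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(min_sq_dist(code), pair_bound(3, 3))  # both equal 2 up to rounding

rng = random.Random(4)
k, m = 6, 5
X = [[rng.random() for _ in range(k)] for _ in range(m)]
print(min_sq_dist(X), pair_bound(m, k))
```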

Let A(k, d) denote the maximum cardinality of a binary code of length k with minimum distance d. We recall the following fact [28]:

Lemma 3.4

(Singleton bound). $A(k,d) \le 2^{k-d+1}$.

We need the following improvement.

Lemma 3.5

If $d<\frac{2}{3}k$, then $A(k,d) \le 2^{k-\frac{3}{2}d+2}$.

Proof

Towards a contradiction, assume that there exists $X\subseteq \{0,1\}^k$ of size $|X|>2\cdot 2^{k-\frac{3}{2}d+1}$ such that $|p-q|^2\ge d$ for all distinct $p,q\in X$. By the pigeonhole principle, there exist three distinct points $p,q,r \in X$ which coincide on the first $\lfloor k-\frac{3}{2}d+1 \rfloor $ coordinates. By Lemma 3.3, applied with $m=3$ to the last $\lceil \frac{3}{2}d\rceil -1$ coordinates, we get that

$$\begin{aligned} \min \{|p-q|^2,|p-r|^2,|r-q|^2\} \le \frac{2}{3} \left( \Big \lceil \frac{3}{2}d\Big \rceil -1\right) < d, \end{aligned}$$

a contradiction. $\square $

3.2 Building a Path Greedily

In the proofs of some of our results we will analyze a greedy algorithm which takes a discrete point set $X\subseteq [0,1]^k$ of size $|X|=n$ as an input and creates a Hamiltonian path F through X. It processes the point pairs in nondecreasing order of distance and maintains a collection of paths.

Algorithm 1: Initially, set $F_0$ to be the empty graph on X. For $i\in [n-1]$, let $e_i$ be an edge of smallest weight among all edges $e\not \in F_{i-1}$ which satisfy that $F_{i-1}+e$ is a vertex-disjoint union of paths. Set $F_{i}:=F_{i-1}+e_i$. Then, $F:=F_{n-1}$ is a Hamiltonian path.
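A minimal Python sketch of Algorithm 1 follows. It assumes that scanning all pairs once in nondecreasing order of distance is an acceptable way to realize the greedy rule (a pair that cannot be added at some step can never be added later), and that ties are broken by the sort order; the function name `greedy_path` and the union-find bookkeeping are choices made for this illustration, not part of the original description.

```python
import itertools
import math

def greedy_path(points):
    """Return the n-1 edges (i, j), in order of insertion, chosen greedily."""
    n = len(points)
    parent = list(range(n))  # union-find over the current path components
    degree = [0] * n

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = sorted(
        itertools.combinations(range(n), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )
    edges = []
    for i, j in pairs:
        # Adding ij keeps a vertex-disjoint union of paths exactly when both
        # ends have degree below 2 and lie in different components.
        if degree[i] < 2 and degree[j] < 2 and find(i) != find(j):
            parent[find(i)] = find(j)
            degree[i] += 1
            degree[j] += 1
            edges.append((i, j))
            if len(edges) == n - 1:
                break
    return edges

cube = list(itertools.product((0, 1), repeat=3))
print(greedy_path(cube))
```

This quadratic-size scan is the simplest realization; it is not meant to be the fastest one.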

Lemma 3.6

Let $j\in [k]$. The number of edges $e\in F$ satisfying $|e|^2\ge j$ is less than A(k, j).

Proof

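We may assume that F contains an edge e with $|e|^2\ge j$, since otherwise the claim is trivial. Let $\ell $ be the smallest integer such that $|e_\ell |^2\ge j$. Since the weights $|e_i|$ are nondecreasing in i, the edges $e\in F$ satisfying $|e|^2\ge j$ are exactly $e_\ell ,\ldots ,e_{n-1}$, so their number is $n-\ell $, which is less than the number $n-\ell +1$ of components in $F_{\ell -1}$. Let $P\subseteq X$ be a set containing one endpoint of each path in $F_{\ell -1}$. Any two points of P could have been joined at step $\ell $, so by the choice of $e_\ell $ their squared distance is at least $|e_\ell |^2\ge j$. Hence P is a binary code of length k with minimum distance at least j, and so $n-\ell +1\le A(k,j)$. Thus, the number of edges $e\in F$ satisfying $|e|^2\ge j$ is less than A(k, j). $\square $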

Proof of Theorem 1.4

If $|X|=2$, the statement holds trivially. Assume $n:=|X|\ge 3$. Let F be the Hamiltonian path created by Algorithm 1. We partition the edges $e\in F$ into four classes.

1.
short edges: $|e|^2 \le \frac{k}{5}$.
2.
medium edges: $\frac{k}{5} < |e|^2 \le \frac{3k}{5}$.
3.
long edges: $\frac{3k}{5} < |e|^2 \le \frac{2k}{3} $.
4.
very long edges: $\frac{2k}{3} <|e|^2 $.

Denote by $F^s,F^m,F^l,F^{vl}$ the subgraphs of F containing all short, medium, long and very long edges, respectively. They partition F and thus $S_k(F)= S_k(F^s)+S_k(F^m)+S_k(F^l)+S_k(F^{vl})$. We will provide upper bounds for the four contributions separately.

Since $n\le 2^k$, the number of short edges is trivially at most $2^k$. Thus,

$$\begin{aligned} S_k(F^s)\le 2^k \left( \sqrt{\frac{k}{5}}\right) ^k. \end{aligned}$$

Now, we estimate $S_k(F^m)$. Let j be an integer satisfying $\frac{k}{5} < j \le \frac{3k}{5}$. The number of edges $e\in F$ satisfying $|e|^2\ge j$ is less than $ A(k,j)\le 2^{k-\frac{3}{2}j+2}$ by Lemmas 3.5 and 3.6. Therefore,

$$\begin{aligned} S_k(F^m)\le & {} \sum _{j=\Big \lceil \frac{k}{5}\Big \rceil }^{\left\lfloor \frac{3k}{5}\right\rfloor } 2^{k-\frac{3}{2}j+2} \left( \sqrt{j}\right) ^k \\\le & {} 4\cdot \left( \left\lfloor \frac{3k}{5}\right\rfloor -\Big \lceil \frac{k}{5}\Big \rceil \right) \cdot \left( 0.842\sqrt{k}\right) ^k \\\le & {} \frac{8k}{5} \cdot \left( 0.842\sqrt{k}\right) ^k. \end{aligned}$$

Here we used that the function $f(x)=2^{1-3x/2} \sqrt{x}$, where $x\ge 0$, is maximized at $x=\frac{1}{\log (8)}$ (with $\log $ denoting the natural logarithm), and thus $2^{1-3x/2} \sqrt{x}\le 0.842$.
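The constant 0.842 can be reproduced with a few lines of Python. This is only a numerical sketch: it evaluates f at the stated maximizer and compares against a grid search on $[0,2]$, a range chosen arbitrarily for this illustration.

```python
import math

def f(x):
    """f(x) = 2^(1 - 3x/2) * sqrt(x)."""
    return 2 ** (1 - 1.5 * x) * math.sqrt(x)

x_star = 1 / math.log(8)  # natural logarithm
grid_max = max(f(i / 10000) for i in range(0, 20001))
print(x_star, f(x_star), grid_max)
```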

Next, we estimate $S_k(F^l)$. The number of edges $e\in F$ satisfying $|e|^2>\frac{3k}{5}$ is less than $A(k,\lfloor \frac{3k}{5}\rfloor +1)\le 4$ by Lemma 3.3, applied with $m=5$ and by Lemma 3.6. Therefore,

$$\begin{aligned} S_k(F^l)\le 3 \cdot \left( \sqrt{\frac{2}{3}k}\right) ^k. \end{aligned}$$

Last, we estimate $S_k(F^{vl})$. The number of edges $e\in F$ satisfying $|e|^2>\frac{2k}{3}$ is less than $A(k,\lfloor \frac{2k}{3}\rfloor +1)\le 2$ by Lemma 3.3, applied with $m=3$ and Lemma 3.6. Thus, there is at most one very long edge e in F. This very long edge has length at most $|e| \le \sqrt{k-1}$ by the following argument. Consider the last step of the greedy algorithm, when the last two paths, call them $P_1$ and $P_2$, are being joined. Since $|X|\ge 3$, one of them, say $P_1$, contains at least two vertices. An endpoint of the path $P_2$ has distance at most $\sqrt{k-1}$ to one of the endpoints of $P_1$, since not both endpoints can be opposite on the cube. Thus, $|e| \le \sqrt{k-1}$. We get $S_k(F^{vl})\le \sqrt{k-1}^k$.

Adding up the four contributions to $S_k(F)$ yields

$$\begin{aligned} S_k(F)&= S_k(F^s)+S_k(F^m)+S_k(F^l)+S_k(F^{vl})\nonumber \\&\le 2^k \left( \sqrt{\frac{k}{5}} \right) ^k + \frac{8k}{5} \cdot \left( 0.842\sqrt{k}\right) ^k +3 \cdot \left( \sqrt{\frac{2}{3}k}\right) ^k+\left( \sqrt{k-1}\right) ^k\nonumber \\&= \left( \sqrt{k}\right) ^k\left( \left( \frac{2}{\sqrt{5}}\right) ^k+\frac{8k}{5}\cdot 0.842^k +3 \cdot \left( \sqrt{\frac{2}{3}}\right) ^k + \left( \sqrt{\frac{k-1}{k}}\right) ^k \right) \nonumber \\&< \left( \sqrt{k}\right) ^k, \end{aligned}$$

(9)

where the last inequality holds for k sufficiently large. We used the fact that $\left( \sqrt{\frac{k-1}{k}}\right) ^k$ converges to $e^{-1/2}$. Let H be the Hamiltonian cycle obtained from F by connecting the two endpoints. Then

$$\begin{aligned} S_k(H) \le S_k(F)+ \left( \sqrt{k}\right) ^k \le 2\left( \sqrt{k}\right) ^k. \end{aligned}$$

$\square $

We remark that the proof of Theorem 1.4 works for $k_0=29$. The last inequality in (9) is strict. Thus, Theorem 1.4 is tight only for $|X|=2$.
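The threshold in this remark can be illustrated by evaluating the four-term factor from the third line of (9) directly. The sketch below checks only the final numerical inequality; it does not re-verify the hypotheses of the lemmas used earlier in the proof, and the function name `bracket` is introduced here for illustration.

```python
import math

def bracket(k: int) -> float:
    """The factor multiplying (sqrt k)^k in the third line of (9)."""
    short = (2.0 / math.sqrt(5.0)) ** k
    medium = (8.0 * k / 5.0) * 0.842 ** k
    long_ = 3.0 * math.sqrt(2.0 / 3.0) ** k
    very_long = math.sqrt((k - 1) / k) ** k
    return short + medium + long_ + very_long

# The last inequality in (9) holds exactly when bracket(k) < 1.
for k in (28, 29, 50, 200):
    print(k, round(bracket(k), 4), bracket(k) < 1.0)
```

The first three terms decay geometrically in k, while the last one increases towards $e^{-1/2}$, so once the sum drops below 1 it stays there.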

4 Hamiltonian Cycles: Asymptotic Upper Bounds

In this section we prove Theorems 1.5, 1.6 and 1.7.

4.1 Preparation

Lemma 4.1

Let $0<\alpha < 1$ and $Y\subseteq [0,1]^k$ such that $|u-v|> \alpha \sqrt{k}$ for every two distinct points $u,v\in Y$. Let $m\in \mathbb {N}$. Then,

$$\begin{aligned} |Y|\le 2m \cdot \Big \lceil \sqrt{\frac{1}{2}\left( 1+\frac{1}{2m-1}\right) }\alpha ^{-1}\Big \rceil ^k. \end{aligned}$$

Proof

Let $\beta =\Big \lceil \sqrt{\frac{1}{2}\left( 1+\frac{1}{2\,m-1}\right) }\alpha ^{-1}\Big \rceil $. Assume that $|Y|>2m\cdot \beta ^k$. Partition the unit box $[0,1]^k$ into $\beta ^ k$ boxes $B_1,B_2,\ldots ,B_{\beta ^k}$ as follows: We split up [0, 1] into $ \beta $ disjoint consecutive intervals of length $\beta ^{-1}$ each. This gives $\beta ^k$ boxes in total.

Since $|Y|> 2m\cdot \beta ^k$, there exists a box $B_j$ such that at least 2m points from Y are contained in it. By Lemma 3.2, applied with $\gamma =\delta =\beta ^{-1}, k_1=k$ and $k_2=0$, there exist $p,q\in B_j\cap Y$ such that $|p-q|^2\le \frac{1}{2}\left( 1+\frac{1}{2m-1}\right) \beta ^{-2}k$. We conclude

$$\begin{aligned} \alpha ^2 k< |p-q|^2\le \frac{1}{2}\left( 1+\frac{1}{2m-1}\right) \beta ^{-2}k, \quad \text {implying} \quad \alpha < \sqrt{\frac{1}{2}\left( 1+\frac{1}{2m-1}\right) }\beta ^{-1}. \end{aligned}$$

However, by the choice of $\beta $, we have $\alpha < \sqrt{\frac{1}{2}\left( 1+\frac{1}{2\,m-1}\right) }\beta ^{-1}\le \alpha $, a contradiction. $\square $

The following lemma is a version of Lemma 4.1 which improves the bound in a certain range of $\alpha $.

Lemma 4.2

Let $\sqrt{\frac{100}{1791}}<\alpha < \sqrt{\frac{100}{199}}$ and $Y\subseteq [0,1]^k$ such that $|u-v|> \alpha \sqrt{k}$ for every two distinct points $u,v\in Y$. Then,

$$\begin{aligned} |Y|\le 600 \cdot 3^{\frac{9}{8}\left( 1-\frac{199}{100}\alpha ^2\right) k}. \end{aligned}$$

Proof

Let $a=\frac{9}{8}(1-\frac{199}{100}\alpha ^2)$. Note that $0<a <1$. Partition the unit box $[0,1]^k$ into $3^{\lceil ak\rceil }$ boxes $B_1,B_2,\ldots ,B_{3^{\lceil ak \rceil }}$ as follows: Let $I=\{1,2,\ldots , \lceil ak \rceil \}\subseteq [k]$. For each coordinate in I, we split [0, 1] into 3 disjoint consecutive intervals of length $\frac{1}{3}$ each, namely $[0,1]=[0,\frac{1}{3})\cup [\frac{1}{3},\frac{2}{3}) \cup [\frac{2}{3},1]$; the remaining coordinates are not split. If $|Y|> 200 \cdot 3^{\lceil ak \rceil }$, then there exists a box $B_j$ such that at least 200 points from Y are contained in it. By Lemma 3.2, applied with $m=200$, $\delta =\frac{1}{3}$, $\gamma =1$, $k_1=\lceil ak \rceil $ and $k_2=k-k_1$, there exist $p,q\in B_j\cap Y$ such that

$$\begin{aligned} \frac{|p-q|^2}{k}\le \frac{100}{199}\left( \frac{1}{3}\right) ^2 \frac{\lceil ak \rceil }{k}+ \frac{100}{199} \frac{k-\lceil ak \rceil }{k} \le \frac{100}{199}-\frac{8}{9}\frac{100}{199}a=\alpha ^2, \end{aligned}$$

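contradicting $\alpha ^2 k< |p-q|^2$. We conclude that

$$\begin{aligned} |Y|\le 200 \cdot 3^{\lceil ak \rceil }\le 200 \cdot 3^{ak+1} = 600 \cdot 3^{\frac{9}{8}\left( 1-\frac{199}{100}\alpha ^2\right) k}. \end{aligned}$$

$\square $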

Lemma 4.3

There exists $k_0$ such that for all integers $k\ge k_0$ the following holds. Let $0< \alpha < 0.99$ and let $Y\subseteq [0,1]^k$ such that $|u-v|> \alpha \sqrt{k}$ for every two distinct points $u,v\in Y$. Then $|Y|\alpha ^k\le 0.999^k$.

Proof

Let $k_0$ be sufficiently large for the following proof to hold. First, assume $\sqrt{\frac{100}{199}}<\alpha <0.99$. Then $|Y|\le 200$ by Lemma 3.3, applied with $m=200$. Thus,

$$\begin{aligned} |Y|\alpha ^k \le 200 \alpha ^k \le 0.999^k. \end{aligned}$$

Next, assume $0.29\le \alpha \le \sqrt{\frac{100}{199}}$. Then by Lemma 4.2,

$$\begin{aligned} |Y|\alpha ^k \le 600 \cdot \left( 3^{\frac{9}{8}\left( 1-\frac{199}{100}\alpha ^2\right) } \alpha \right) ^k \le 0.999^k. \end{aligned}$$

Finally, assume $0<\alpha \le 0.29$. Then by Lemma 4.1, applied with $m=100$,

$$\begin{aligned} |Y|\alpha ^k \le 200 \left( \Big \lceil \frac{\sqrt{\frac{1}{2}\left( 1+\frac{1}{199}\right) }}{\alpha }\Big \rceil \alpha \right) ^k \le 200 \left( \sqrt{\frac{100}{199}}+\alpha \right) ^k\le 0.999^k. \end{aligned}$$

$\square $
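The bases appearing in the second and third cases can be sanity-checked numerically. In the third case one uses $\lceil c/\alpha \rceil \alpha \le c+\alpha $ with $c=\sqrt{100/199}$. The sketch below samples the middle range on a grid; the grid size is an arbitrary choice made here, and the check is an illustration rather than a substitute for the proof.

```python
import math

C = math.sqrt(100.0 / 199.0)  # upper end of the middle range of alpha

def middle_base(alpha: float) -> float:
    """Base 3^{(9/8)(1 - (199/100) alpha^2)} * alpha from the Lemma 4.2 case."""
    return 3.0 ** (1.125 * (1.0 - 1.99 * alpha * alpha)) * alpha

# Sample the middle range 0.29 <= alpha <= sqrt(100/199) on a fine grid.
steps = 100000
middle_max = max(middle_base(0.29 + (C - 0.29) * i / steps) for i in range(steps + 1))

# Small-alpha case: the base is at most sqrt(100/199) + alpha <= sqrt(100/199) + 0.29.
small_max = C + 0.29

print(f"middle-range max ~ {middle_max:.4f}, small-range bound = {small_max:.6f}")
```

Both bases stay strictly below 0.999, so the constant factors 600 and 200 are absorbed once k is large enough; the small-alpha case is the one with the least slack.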

4.2 Proofs of Theorems 1.5, 1.6, and 1.7

First, we quickly demonstrate how Lemma 3.3 implies Theorem 1.7.

Proof of Theorem 1.7

Let $X\subseteq [0,1]^k$ be a point set of size n. We run Algorithm 1 from Sect. 3.2. Let $F_i$ be the collection of paths at the i-th step, let $e_i$ be the edge added in the i-th step, and let $F=F_{n-1}$ be the final Hamiltonian path.

We claim that $|e_i| \le \sqrt{\frac{2}{3}k}$ for $i\le n-2$. Let $e_i=xy$. The vertices x and y are endpoints of two different paths in $F_{i-1}$. Since $F_{i-1}$ has at least $n-(i-1)\ge n-(n-2-1)=3$ components, there exists a component containing neither x, nor y. Let $z\in X$ be an endpoint of the path forming this component. Since $e_i=xy$ was chosen in step i, but xz and yz were not, we have $|xy| \le |xz|$ and $|xy| \le |yz|$. By applying Lemma 3.3 to the set $\{x,y,z\}$, we get that $|e_i|=|xy| \le \sqrt{\frac{2}{3}k}$. Note that $|e_{n-1}| \le \sqrt{k}$ trivially.

Now, let $f=ab$ be the edge where a and b are the two endpoints of the final path F. Set $H=F+f$ to be the Hamiltonian cycle when f is added to F. Since $|f| \le \sqrt{k}$ trivially, we get

$$\begin{aligned} S_k(H)&=\sum _{e\in H}|e|^k= |f|^k+|e_{n-1}|^k+ \sum _{i=1}^{n-2} |e_i|^k \le 2\left( \sqrt{k}\right) ^k+ (n-2)\left( \sqrt{\frac{2}{3}k}\right) ^k. \end{aligned}$$

Consequently,

$$\begin{aligned} s_k^{\texttt {HC}}(n)\le s_k(H) \le 2^{1/k}\sqrt{k} \ (1+o_k(1)). \end{aligned}$$

$\square $
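The greedy construction used in this proof can be sketched in a few lines of Python. Algorithm 1 itself is stated in Sect. 3.2, outside this passage, so the sketch assumes it is the rule recalled later in the text: repeatedly add an edge of minimum length such that the chosen edges always form a collection of paths. The function name, the union-find bookkeeping and the random test instance are choices made here for illustration.

```python
import itertools
import math
import random

def greedy_path_edges(points):
    """Scan pairs by increasing distance; keep a pair if both ends have degree < 2
    and lie in different components, so the kept edges always form disjoint paths."""
    n = len(points)
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    degree = [0] * n
    edges = []
    pairs = sorted(itertools.combinations(range(n), 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
    for u, v in pairs:
        if degree[u] < 2 and degree[v] < 2 and find(u) != find(v):
            parent[find(u)] = find(v)
            degree[u] += 1
            degree[v] += 1
            edges.append((u, v))
            if len(edges) == n - 1:
                break
    return edges

k, n = 4, 25
rng = random.Random(0)  # fixed seed, arbitrary
pts = [tuple(rng.random() for _ in range(k)) for _ in range(n)]
path = greedy_path_edges(pts)
lengths = [math.dist(pts[u], pts[v]) for u, v in path]
print(max(lengths[:-1]), math.sqrt(2 * k / 3))
```

Sorting all pairs once is equivalent to choosing a shortest admissible edge at every step, because an edge that is inadmissible at some step (an endpoint of degree 2, or both endpoints in the same path) stays inadmissible later. On this instance one can compare every edge except the last with the bound $\sqrt{\frac{2}{3}k}$ from the claim above.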

Proof of Theorem 1.5

Let k be sufficiently large and let $X\subseteq [0,1]^k$ be a finite point set. Set

$$\begin{aligned} \ell :=\Big \lceil \log _{1+\frac{1}{k}}\left( 0.9 k^{\frac{3}{4}}\right) \Big \rceil = O(k \log k) \quad \quad \text {and} \quad \quad a_i:=\frac{(1+\frac{1}{k})^i}{k^{\frac{3}{4}}} \end{aligned}$$

for integers i, $0\le i\le \ell $. Note that

$$\begin{aligned}&\frac{a_{i+1}}{a_i}=1+\frac{1}{k}\ \text {for} \ i\in \{0,1,\dots ,\ell -1\}, \quad \text {and}\\&\qquad \qquad \qquad \quad a_0< a_1< a_2< \cdots < a_{\ell -1}\le 0.9 \le a_\ell . \end{aligned}$$

Construct a minimum spanning tree T on vertex set X by successively joining points from X at minimal distance from each other, given the new edge does not create a cycle. For $0\le i\le \ell $, let $F_i$ be the forest with vertex set X and edges $e\in T$ such that $|e|\le a_i\sqrt{k}$. Then, $F_0\subseteq F_1 \subseteq \dots \subseteq F_{\ell } \subseteq T$ since the sequence $(a_i)$ is increasing. If $x,y\in X$ are in different components of $F_i$, then $|x-y|> a_i \sqrt{k}$.

We have $a_0= k^{-3/4}$. For an edge $e=xy\in F_0$, let $B_e$ be the open ball of radius $|e|/4$ and center $\frac{1}{2}(x+y)$. Since $F_0\subseteq T$, by Lemma 2.1, the balls $B_e$, $e\in F_0$ are disjoint. Also, $|e|\le a_0 \sqrt{k}=k^{-1/4}$. Denote by $V_k$ the volume of the k-dimensional unit ball. It is well-known that

$$\begin{aligned} V_k= {\left\{ \begin{array}{ll} \dfrac{\pi ^{k/2}}{(k/2)!} &{} \mathrm{if \ } k \ \mathrm{is \ even}, \\ \dfrac{2^k \cdot \pi ^{(k-1)/2} \, ((k-1)/2)!}{k!} &{} \mathrm{if \ } k \ \mathrm{is \ odd}. \end{array}\right. } \end{aligned}$$

By Stirling’s approximation, $V_k \sim \frac{1}{\sqrt{k\pi }} (\frac{2\pi e}{k})^{k/2}$. Since $\bigcup _{e\in F_0} B_e\subseteq [-k^{-1/4},1+k^{-1/4}]^k$, we have

$$\begin{aligned}&\sum _{e\in F_0}\left( \frac{|e|}{4}\right) ^k V_k \le \left( 1+2k^{-1/4}\right) ^k, \quad \text {and thus} \\&\sum _{e\in F_0}|e|^k\le \frac{4^k(1+2k^{-1/4})^k}{V_k} \le (0.97\sqrt{k})^k, \end{aligned}$$

for k sufficiently large. Now, let $i\in \{0,1,\ldots ,\ell -1\}$. Let $Y \subseteq X$ be a set of vertices containing exactly one vertex from every component of $F_i$. Then $|y-y'|> a_i \sqrt{k}$ for every pair $y\ne y'\in Y$, and $|F_{i+1}{\setminus } F_i|\le |Y|-1$. By Lemma 4.3 we have $a_i^k|Y| \le 0.999^k$ for $i\le \ell $. Thus,

$$\begin{aligned} \sum _{e\in F_{i+1}\setminus F_i} |e|^k\le (a_{i+1}\sqrt{k})^k|Y|= (a_{i}\sqrt{k})^k|Y|\left( 1+\frac{1}{k}\right) ^k \le 3 \cdot (0.999 \sqrt{k})^k, \end{aligned}$$

for $i\le \ell $. Therefore,

$$\begin{aligned} \sum _{e\in F_\ell } |e|^k&= \sum _{e\in F_0} |e|^k +\sum _{i=0}^{\ell -1} \sum _{e\in F_{i+1}\setminus F_i} |e|^k \le \left( 0.97\sqrt{k}\right) ^k+ 3 \ell \cdot (0.999 \sqrt{k})^k, \end{aligned}$$

for k sufficiently large. If the forest $F_\ell $ consists of at least three components, then three points $p,q,r\in X$, from different components each, have pairwise distance at least $0.9\sqrt{k}\ge \sqrt{\frac{2}{3}k}$. This contradicts Lemma 3.3. Therefore, $F_\ell $ has at most 2 components and thus there is at most one edge f in T which is not in $F_\ell $. We conclude

$$\begin{aligned} \sum _{e\in T} |e|^k=\sum _{e\in F_\ell } |e|^k+ |f|^k\le \left( \sqrt{k}\right) ^k(1+o_k(k^{-1})), \end{aligned}$$

which implies that for the family of minimum spanning trees, we have $s_k^{\texttt {ST}} \le \sqrt{k} \ (1+o_k(1))$, completing the proof of Theorem 1.5. $\square $
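To get a feeling for how large k must be in the packing estimate $\frac{4^k(1+2k^{-1/4})^k}{V_k} \le (0.97\sqrt{k})^k$ used in the proof above, one can evaluate the base $c$ defined by $\frac{4^k(1+2k^{-1/4})^k}{V_k}=(c\sqrt{k})^k$. The sketch below uses the standard closed form $V_k=\pi ^{k/2}/\Gamma (k/2+1)$, which agrees with the two-case formula in the proof, and works with logarithms to avoid overflow. The function names and the sample values of k are choices made here.

```python
import math

def log_unit_ball_volume(k: float) -> float:
    """log V_k, using V_k = pi^(k/2) / Gamma(k/2 + 1)."""
    return 0.5 * k * math.log(math.pi) - math.lgamma(0.5 * k + 1.0)

def packing_base(k: float) -> float:
    """Return c with 4^k (1 + 2 k^(-1/4))^k / V_k = (c * sqrt(k))^k."""
    log_c = (math.log(4.0 * (1.0 + 2.0 * k ** -0.25))
             - log_unit_ball_volume(k) / k
             - 0.5 * math.log(k))
    return math.exp(log_c)

# Limiting value of the base as k grows, from Stirling's approximation.
limit = 4.0 / math.sqrt(2.0 * math.pi * math.e)

for k in (1e2, 1e6, 1e10, 1e13):
    print(f"k = {k:.0e}: base = {packing_base(k):.5f}")
print(f"limit = {limit:.5f}")
```

The limiting base is only slightly below 0.97, and the factor $1+2k^{-1/4}$ approaches 1 slowly, so under this particular estimate "k sufficiently large" means very large values of k.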

We remark that by applying Lemma 2.3 to T, there exists a Hamiltonian cycle H on vertex set X satisfying

$$\begin{aligned} \sum _{e\in H} |e|^k\le \frac{2}{3}\cdot 3^k \sum _{e\in T} |e|^k \le \left( 3\sqrt{k}\right) ^k(1+o_k(k^{-1})), \end{aligned}$$

implying that for the family of Hamiltonian cycles, we have $s_k^{\texttt {HC}} \le 3\sqrt{k} \ (1+o_k(1))$.

Proof of Theorem 1.6

Create a forest F by successively joining points from X at minimal distance from each other, given the new edge e does not create a cycle and satisfies $|e|\le k^{-1/4}$. This process stops when there is no such edge left. Let the trees $T_1,\ldots , T_N$ be the components of F. Every two vertices from different $T_i$’s have pairwise distance at least $k^{-1/4}$.

For an edge $e=xy\in F$, let $B_e$ be the open ball of radius $|e|/4$ and center $\frac{1}{2}(x+y)$. By Lemma 2.1, the balls $B_e$, $e\in F$ are disjoint. Also, $|e|\le k^{-\frac{1}{4}}$. We have $\bigcup _{e\in F} B_e\subseteq [-k^{-1/4},1+k^{-1/4}]^k$. Writing $V_k$ for the volume of the k-dimensional unit ball, we have

$$\begin{aligned}&\sum _{e\in F}\left( \frac{|e|}{4}\right) ^k V_k \le \left( 1+2k^{-1/4}\right) ^k \quad \text {and thus} \\&\sum _{e\in F}|e|^k\le \frac{4^k(1+2k^{-1/4})^k}{V_k} \le (0.97\sqrt{k})^k, \end{aligned}$$

for k sufficiently large. Since the trees $T_1,\ldots , T_N$ decompose the edge set of the forest F, we have

$$\begin{aligned} \sum _{i=1}^N \sum _{e\in T_i} |e|^k= \sum _{e\in F}|e|^k\le (0.97\sqrt{k})^k. \end{aligned}$$

(10)

By Lemma 2.3, for each $i\in [N]$, there exists a Hamiltonian cycle $H_i$ on $V(T_i)$ such that

$$\begin{aligned} \sum _{e\in H_i} |e|^k\le 3^k \sum _{e\in T_i}|e|^k. \end{aligned}$$

(11)

Let $F_0$ be the collection of paths obtained by taking the union of all $H_i$, and removing an edge from each cycle. Then, by using (10) and (11), we obtain

$$\begin{aligned} \sum _{e\in F_0} |e|^k \le \sum _{i=1}^N\sum _{e\in H_i} |e|^k\le 3^k \sum _{i=1}^N\sum _{e\in T_i}|e|^k \le (2.91\sqrt{k})^k. \end{aligned}$$

(12)

Now, run Algorithm 1 from Sect. 3.2 initialized with $F_0$ (instead of the empty graph). Recall that this algorithm adds edges of minimum weight such that in each step we maintain a collection of paths. Denote by Q the final path which is created by this algorithm. Set

$$\begin{aligned} \ell :=\Big \lceil \log _{1+\frac{1}{k}}\left( 0.9 k^{\frac{3}{4}}\right) \Big \rceil = O(k \log k) \quad \quad \text {and} \quad \quad a_i:=\frac{(1+\frac{1}{k})^i}{k^{\frac{3}{4}}} \end{aligned}$$

for integers i, $0\le i\le \ell $. For $0\le i\le \ell $, let $F_i$ be the collection of paths with vertex set X and edges $e\in Q$ such that $|e|\le a_i\sqrt{k}$. Then, $F_0\subseteq F_1 \subseteq \dots \subseteq F_{\ell } \subseteq Q$ since the sequence $(a_i)$ is increasing. If $x,y\in X$ are in different components of $F_i$, then $|x-y|> a_i \sqrt{k}$. Now, let $i\in \{0,1,\ldots ,\ell -1\}$. Let $Y \subseteq X$ be a set of vertices containing exactly one endpoint of each path of $F_i$. Then $|y-y'|> a_i \sqrt{k}$ for every pair $y\ne y'\in Y$, and $|F_{i+1}\setminus F_i|\le |Y|-1$. By Lemma 4.3 we have $a_i^k|Y| \le 0.999^k\le 1$ for $i\le \ell $. Thus,

$$\begin{aligned} \sum _{e\in F_{i+1}\setminus F_i} |e|^k\le (a_{i+1}\sqrt{k})^k|Y|= (a_{i}\sqrt{k})^k|Y|\left( 1+\frac{1}{k}\right) ^k \le 3 \cdot \sqrt{k}^k, \end{aligned}$$

(13)

for $i\le \ell $. Therefore, by combining (12) with (13), we obtain

$$\begin{aligned} \sum _{e\in F_\ell } |e|^k&= \sum _{e\in F_0} |e|^k +\sum _{i=0}^{\ell -1} \sum _{e\in F_{i+1}\setminus F_i} |e|^k \le \left( 2.91\sqrt{k}\right) ^k+ 3 \ell \cdot \sqrt{k}^k, \end{aligned}$$

(14)

for k sufficiently large. Similarly, as in the proof of Theorem 1.5, $F_\ell $ has at most 2 components. Thus, using (14), the path Q satisfies

$$\begin{aligned} \sum _{e\in Q}|e|^k \le \sum _{e\in F_\ell }|e|^k +\sqrt{k}^k \le (2.91\sqrt{k})^k (1+o_k(1)). \end{aligned}$$

Adding one final edge f of weight at most $|f| \le \sqrt{k}$ to Q we obtain a Hamiltonian cycle with the desired properties. $\square $

5 Concluding Remarks

The upper bounds we obtained on the lengths of Hamiltonian cycles imply the existence of perfect matchings whose cost is bounded from above by a constant (depending on k). For example, Theorems 1.3 and 1.4 yield Corollaries 5.1 and 5.2 below; their proofs are analogous to the proof of Corollary 5.4 below.

Corollary 5.1

Given n points in $[0,1]^k$, where $k \ge 3$, and n is even, there exists a perfect matching M of the n points such that $\left( \sum _{e \in M} |e|^k \right) ^{1/k} \le 3 \sqrt{5} \left( \frac{1}{3} \right) ^{1/k} \cdot \sqrt{k}$. The matching M can be computed in time proportional to that needed for computing a MST of the points, in particular, in subquadratic time.

Corollary 5.2

There exists an integer $k_0$ such that for all $k\ge k_0$ the following holds. If X is any even-size subset of vertices of $\{0,1\}^k$, then there exists a perfect matching M of X such that $s_k(M) \le \sqrt{k}$. This bound is best possible.

Recall that a MST of n points in $\mathbb {R}^k$ (with respect to Euclidean distances) can be computed in $O\left( n^{2 - \frac{2}{\lceil k/2 \rceil +1} + \varepsilon } \right) $ time, for any $\varepsilon >0$ [1]. We also deduce the following related results (formulated here for the planar case, $k=2$).

Corollary 5.3

Let $x_1,\ldots ,x_n$ be $n \ge 2$ points in the unit square. Let $d_i$ be the distance between $x_i$ and its nearest point (other than $x_i$). Then the following inequality holds: $\sum _{i=1}^n d_i^2 \le 4$.

Proof

Consider a Hamiltonian cycle, say $x_1,\ldots ,x_n$, whose cost $S_2(H)$ is at most 4. The distance from $x_i$ to its nearest point is at most $|x_i - x_{i+1}|$, for $i=1,\ldots ,n$. By squaring the n inequalities and adding them up, the claimed inequality follows. $\square $
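The inequality of Corollary 5.3 is easy to test numerically. The following Python sketch computes $\sum d_i^2$ by brute force for the four corners of the unit square, where every $d_i$ equals 1, and for a random sample; the sample size and the seed are arbitrary choices made here.

```python
import math
import random

def nearest_neighbor_square_sum(points):
    """Sum over i of d_i^2, where d_i is the distance from point i to its nearest other point."""
    total = 0.0
    for i, p in enumerate(points):
        d = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += d * d
    return total

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
rng = random.Random(1)  # fixed seed, arbitrary
sample = [(rng.random(), rng.random()) for _ in range(60)]

corner_sum = nearest_neighbor_square_sum(corners)  # every d_i equals 1 here
sample_sum = nearest_neighbor_square_sum(sample)
print(corner_sum, sample_sum)
```

The four corners attain the bound 4, so the constant in the corollary cannot be lowered.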

An alternative proof of Corollary 5.3 can be found in [27, Problem G.27].

Corollary 5.4

Let $x_1,\ldots ,x_n$ be $n \ge 2$ points in the unit square, where n is even. Then there exists a perfect matching M such that $\sum _{e \in M} |e|^2 \le 2$. This bound is the best possible.

Proof

Consider a Hamiltonian cycle, say $H=x_1,\ldots ,x_n$, whose cost $S_2(H)$ is at most 4. H can be decomposed into two perfect matchings, one of which has a cost at most 2, as required.

The lower bounds for $n=2$ and $n=4$ are immediate (see Fig. 1). For every even $n \ge 6$ and $\varepsilon >0$, there are n points (in the neighborhoods of the four corners of the square) such that $\sum _{e \in M} |e|^2 \ge 2-\varepsilon $. $\square $
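For small even n, the bound of Corollary 5.4 can be checked by exhaustive search over all perfect matchings. The sketch below is a brute-force illustration only (its running time grows like $(n-1)(n-3)\cdots 1$); the function name and the test instances are choices made here.

```python
import math
import random

def min_matching_cost(points):
    """Minimum of sum |e|^2 over perfect matchings, by exhaustive recursion (small even n only)."""
    def solve(remaining):
        if not remaining:
            return 0.0
        first, rest = remaining[0], remaining[1:]
        best = math.inf
        # Match the first unmatched point with every other candidate in turn.
        for idx, other in enumerate(rest):
            cost = math.dist(points[first], points[other]) ** 2
            best = min(best, cost + solve(rest[:idx] + rest[idx + 1:]))
        return best
    return solve(tuple(range(len(points))))

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
rng = random.Random(2)  # fixed seed, arbitrary
sample = [(rng.random(), rng.random()) for _ in range(8)]

corner_cost = min_matching_cost(corners)
sample_cost = min_matching_cost(sample)
print(corner_cost, sample_cost)
```

For the four corners, the best matching uses two opposite sides and has cost 2, which is the tight case for $n=4$.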

We have improved the upper bound of Bollobás and Meir [7] by more than 25 percent in the exact formulation and by more than 67 percent in the asymptotic formulation. Apart from some doubt concerning the values of $s_3^{\texttt {HC}}(n)$ and $s_4^{\texttt {HC}}(n)$, we think that their lower bound gives the right answer for every higher dimension. In view of Theorem 1.3 (i) we adjust their conjecture as follows:

Conjecture 5.5

For Hamiltonian cycles, the following equalities hold:

$$\begin{aligned} s_k^{\texttt {HC}}= {\left\{ \begin{array}{ll} 2^{7/6}, &{} \text { for } k =3, \\ 2^{1/k} \cdot \sqrt{k}, &{} \text { for } k \ge 4. \end{array}\right. } \end{aligned}$$

Hamiltonian path. If one were looking for a Hamiltonian path, instead of a Hamiltonian cycle, then the 2-point extremal lower bound example (given by a cube diagonal) loses a factor of 2 (or with scaling $2^{1/k}$); and so the question arises: is it still the best example, or maybe only for large k? Analogous to the situation for Hamiltonian cycles, we think that there is a threshold value for k after which the extremal example stabilizes at the 2-point example. The threshold values for cycles and paths seem to differ, see Conjecture 5.6 below.

The current upper bound proofs essentially remain the same as for Hamiltonian cycles, with the change that the last edge is not needed. Some upper bounds remain unchanged, and others do improve. In particular, $s_2^{\texttt {HP}} \le s_2^{\texttt {HC}} = 2$ remains unchanged, whereas $s_2^{\texttt {HP}} \ge \sqrt{3}$ is implied by the two extremal examples in Fig. 1 (left and right).

From the other direction, for small values of k consider once again a binary code of length k with minimum distance 2 given by the set of all $x \in \{0,1\}^k$ with an even number of 1’s. It yields the values specified below.

Conjecture 5.6

For Hamiltonian paths, the following equalities hold:

$$\begin{aligned} s_k^{\texttt {HP}}= {\left\{ \begin{array}{ll} \sqrt{3}, &{} \text { for } k =2, \\ \left( 2^{k-1} -1 \right) ^{1/k} \cdot \sqrt{2}, &{} \text { for } k = 3,4,5,6, \\ \sqrt{k}, &{} \text { for } k \ge 7. \end{array}\right. } \end{aligned}$$
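The thresholds in Conjecture 5.6 can be traced to a comparison of two lower-bound examples. The even-weight code has $2^{k-1}$ points at pairwise distance at least $\sqrt{2}$, so every Hamiltonian path on it has $2^{k-1}-1$ edges of length at least $\sqrt{2}$, while the 2-point example gives a single edge of length $\sqrt{k}$. The Python sketch below tabulates both values; the function names are introduced here for illustration, and the case $k=2$ (value $\sqrt{3}$, from Fig. 1) is not covered by this comparison.

```python
import math

def even_weight_code_value(k: int) -> float:
    """(2^(k-1) - 1)^(1/k) * sqrt(2): lower bound from the even-weight code."""
    return (2 ** (k - 1) - 1) ** (1.0 / k) * math.sqrt(2.0)

def two_point_value(k: int) -> float:
    """sqrt(k): a single edge along a cube diagonal."""
    return math.sqrt(k)

for k in range(3, 11):
    code, diag = even_weight_code_value(k), two_point_value(k)
    winner = "code" if code > diag else "two points"
    print(f"k = {k}: code = {code:.4f}, diagonal = {diag:.4f}, larger: {winner}")
```

The code example gives the larger value for $k=3,4,5,6$ and the 2-point example takes over from $k=7$ on, matching the case distinction in the conjecture.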

Further improvement. One might wonder where the next possible improvement is. We feel that it is in Lemma 2.2: It states that there is a Hamiltonian cycle such that each edge of the cycle is using at most 3 tree edges, yet the average usage is slightly less than 2. If it were true that every tree edge is used at most twice, then we would get a 2/3 factor improvement in the upper bound. However, the example of a tree with edges ab, bc, cd, de, cf, fg shows that this is not the case. Still, it is likely that there is a way to gain more in a tree to cycle or path conversion.

A different version. We conclude with yet another version of the problem. Instead of the unit cube $[0,1]^k \subset \mathbb {R}^k$, let the diameter of the point set be at most 1: That is, $\textrm{diam}(X) \le 1$, where $X \subset \mathbb {R}^k$ and $|X|=n$. What are the extremal values of the (say, unscaled) costs of a shortest Hamiltonian cycle (and path) for n points in $\mathbb {R}^k$ under this constraint? Are they given by the vertices of a unit simplex in $\mathbb {R}^k$ ($k+1$ and k, respectively)?

Data availability

No datasets were generated or analysed during the current study.

References

[1] Agarwal, P.K., Edelsbrunner, H., Schwarzkopf, O.: Euclidean minimum spanning trees and bichromatic closest pairs. Discrete Comput. Geom. 6, 407–422 (1991)
[2] Andreae, T.: On the traveling salesman problem restricted to inputs satisfying a relaxed triangle inequality. Networks 38(2), 59–67 (2001)
[3] Arora, S.: Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. J. ACM 45(5), 753–782 (1998)
[4] Beardwood, J., Halton, J.H., Hammersley, J.M.: The shortest path through many points. Math. Proc. Camb. Philos. Soc. 55(4), 299–327 (1959)
[5] Bender, M., Chekuri, C.: Performance guarantees for the TSP with a parameterized triangle inequality. Inf. Process. Lett. 73(1–2), 17–21 (2000)
[6] Bollobás, B.: The Art of Mathematics-Coffee Time in Memphis. Cambridge University Press, Cambridge (2006)
[7] Bollobás, B., Meir, A.: A travelling salesman problem in the $k$-dimensional unit cube. Oper. Res. Lett. 11(1), 19–21 (1992)
[8] Christofides, N.: Worst-case analysis of a new heuristic for the Traveling Salesman Problem, Technical Report 388, Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh (1976)
[9] de Berg, M., van Nijnatten, F., Sitters, R., Woeginger, G.J., Wolff, A.: The Traveling Salesman Problem under squared Euclidean distances. In: Proc. of the 27th International Symposium on Theoretical Aspects of Computer Science, STACS 2010, LIPIcs series, vol. 5, pp. 239–250
[10] Fejes Tóth, L.: Über einen geometrischen Satz. Math. Z. 46(1), 83–85 (1940)
[11] Few, L.: The shortest path and shortest road through $n$ points. Mathematika 2, 141–144 (1955)
[12] Funke, S., Laue, S., Lotker, Z., Naujoks, R.: Power assignment problems in wireless communication: Covering points by disks, reaching few receivers quickly, and energy-efficient traveling salesman tours. Ad Hoc Netw. 9(6), 1028–1035 (2011)
[13] Karlin, A.R., Klein, N., Gharan, S.O.: A (slightly) improved approximation algorithm for metric TSP. In: Proc. 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC 2021), Virtual Event, Italy, pp. 32–45 (2021)
[14] Karloff, H.J.: How long can a Euclidean traveling salesman tour be? SIAM J. Discrete Math. 2, 91–99 (1989)
[15] Kozma, G., Lotker, Z., Stupp, G.: The minimal spanning tree and the upper box dimension. Proc. Am. Math. Soc. 134(4), 1183–1187 (2006)
[16] Mitchell, J.S.B.: Guillotine subdivisions approximate polygonal subdivisions: a simple polynomial-time approximation scheme for geometric TSP, $k$-MST, and related problems. SIAM J. Comput. 28(4), 1298–1309 (1999)
[17] Mömke, T.: An improved approximation algorithm for the traveling salesman problem with relaxed triangle inequality. Inf. Process. Lett. 115(11), 866–871 (2015)
[18] Moran, S.: On the length of optimal TSP circuits in sets of bounded diameter. J. Comb. Theory Ser. B 37(2), 113–141 (1984)
[19] Moser, W.O.J.: Problems, problems, problems. Discret. Appl. Math. 31(2), 201–225 (1991)
[20] Newman, D.J.: A Problem Seminar. Springer, New York (1982)
[21] Papadimitriou, C.H., Yannakakis, M.: The traveling salesman problem with distances one and two. Math. Oper. Res. 18(1), 1–11 (1993)
[22] Rhee, W.T.: On the travelling salesperson problem in many dimensions. Random Struct. Algorithms 3(3), 227–233 (1992)
[23] Sekanina, M.: On an ordering of the set of vertices of a connected graph. Publ. Faculty Sci. Univ. Brno 412, 137–142 (1960)
[24] Sekanina, M.: On an algorithm for ordering of graphs. Can. Math. Bull. 14(2), 221–224 (1971)
[25] Steele, J.M.: Probability Theory and Combinatorial Optimization. SIAM, Philadelphia (1997)
[26] Supowit, K.J., Reingold, E.M., Plaisted, D.A.: The traveling salesman problem and minimum matching in the unit square. SIAM J. Comput. 12(1), 144–156 (1983)
[27] Székely, G.J.: Contests in Higher Mathematics. Springer, New York (1996)
[28] van Lint, J.H.: Codes, Ch. 16 in Handbook of Combinatorics, Ron Graham, Martin Grötschel, and László Lovász (editors), pp. 773–808. Elsevier, Amsterdam (1995)
[29] van Lint, J.H.: Introduction to Coding Theory, 3rd edn. Springer, New York (1999)
[30] Verblunsky, S.: On the shortest path through a number of points. Proc. Am. Math. Soc. 2(6), 904–913 (1951)

Acknowledgements

We thank anonymous referees for carefully reading the manuscript.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
József Balogh
Department of Mathematics, Karlsruhe Institute of Technology, 76131, Karlsruhe, Germany
Felix Christian Clemen
Algoresearch L.L.C., Milwaukee, WI, 53217, USA
Adrian Dumitrescu

Contributions

All authors contributed to this work.

Corresponding author

Correspondence to Felix Christian Clemen.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

J. Balogh: Research is partially supported by NSF grant DMS-1764123.

Appendix

1.1 Container Shapes with a Tight Bound When $k=2$

A key fact in deriving the tight bound, when $k=2$, for the cycle of n points in the unit square is the following tight bound in a right triangle [20]; see also [6].

Lemma 6.1

[20] Let X be a set of $n \ge 2$ points in a right triangle $\Delta $ whose sides are $a \le b \le c$. Then there is an extended path connecting the endpoints of c that visits all points in X and for which $\sum |e|^2 \le c^2$. In particular, X admits a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le c^2$. This bound is the best possible.

This result relies on a repeated application of the following simple corollary of the Cosine Law. It allows one to make shortcuts in a path or cycle at vertices where the two adjacent edges make an acute angle.

Lemma 6.2

[20] Let $\Delta $ be a triangle whose sides are $a \le b \le c$, and let $\gamma $ be the interior angle opposite to c. If $\gamma \le 90^\circ $, then $c^2 \le a^2 + b^2$.

We now exhibit two other container shapes for which we can deduce a tight bound. Lemma 6.3 below is an extension of Lemma 6.1.

Lemma 6.3

Let X be a set of $n \ge 2$ points in a non-obtuse triangle $\Delta $ whose sides are $a \le b \le c$. Then there is an extended path connecting the endpoints of c that visits all points in X and for which $\sum |e|^2 \le a^2 + b^2$. In particular, X admits a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le a^2 + b^2$. This bound is the best possible.

Proof

Let the altitude corresponding to c divide $\Delta $ into two right triangles. Consider the path obtained by concatenating the extended paths for the two right triangles. Further shortcut the path at the concatenation vertex by using Lemma 6.2 to obtain a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le a^2 + b^2$, see Fig. 2 (left). The three vertices of $\Delta $ provide a tight example. $\square $

Lemma 6.4

Let X be a set of $n \ge 2$ points in a non-obtuse triangle $\Delta $ whose sides are $a \le b \le c$. Then X admits a Hamiltonian cycle H for which $\sum _{e \in H} |e|^2 \le a^2 + b^2+ c^2$. This bound is the best possible.

Proof

By Lemma 6.3, X admits a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le a^2 + b^2$. Connecting the endpoints of this path (via an edge of length at most c) yields a Hamiltonian cycle H for which $\sum _{e \in H} |e|^2 \le a^2 + b^2+ c^2$. $\square $

By Lemmas 6.3 and 6.4, we obtain the following corollary.

Corollary 6.5

Let X be a finite point set in a non-obtuse triangle $\Delta $ whose sides are $a \le b \le c \le 1$. Then

1.
X admits a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le 2$.
2.
X admits a Hamiltonian cycle H for which $\sum _{e \in H} |e|^2 \le 3$.

Lemma 6.6

Let U be a unit square centered at o and let ab be one of its four sides. Let X be a set of $n \ge 2$ points in $V:=U {\setminus } \Delta {oab}$ (V as a closed set). Then X admits a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le 3$. This bound is the best possible.

Proof

Subdivide V into two right triangles as shown in Fig. 2 (right). Consider the path obtained by concatenating the extended paths for the two right triangles. Further shortcut the path by using Lemma 6.2 to obtain a Hamiltonian path P for which $\sum _{e \in P} |e|^2 \le 1^2 + (\sqrt{2})^2 =3$. The 4- and 5-point examples in Fig. 1 show that this bound is tight. $\square $

1.2 A Different Version Of Theorem 1.7

We remark that the proof of Theorem 1.7 can be extended for point sets of size n, when n is slowly growing in k, to obtain an upper bound sharper than that in Theorem 1.6.

Theorem 6.7

The following bounds are in effect:

(i)
If $n \le 2^k +2$, then there exists a Hamiltonian cycle H such that
$$\begin{aligned} S_k(H) \le \left( 2 + \left( \frac{8}{3}\right) ^{k/2} \right) k^{k/2}. \end{aligned}$$
Consequently, $s_k^{\texttt {HC}}(n) \le 1.64 \sqrt{k}$, for k sufficiently large.
(ii)
If $n \le 2^k$, then there exists a Hamiltonian cycle H such that
$$\begin{aligned} S_k(H) \le \left( 200 + 2.01^{k/2} \right) k^{k/2}. \end{aligned}$$
Consequently, $s_k^{\texttt {HC}}(n) \le 1.42 \sqrt{k}$, for k sufficiently large.

Proof

Let $X\subseteq [0,1]^k$ be a point set of size n. We run Algorithm 1 from Sect. 3.2. Let $F_i$ be the collection of paths at the i-th step, let $e_i$ be the edge added in the i-th step, and let $F=F_{n-1}$ be the final Hamiltonian path.

(i) We know that $|e_i| \le \sqrt{\frac{2}{3}k}$ for $i\le n-2$. Note that $|e_{n-1}| \le \sqrt{k}$ trivially. Now, let $f=ab$ be the edge where a and b are the two endpoints of the final path F. Set $H=F+f$ to be the Hamiltonian cycle when f is added to F. Since $|f| \le \sqrt{k}$ trivially, we get

$$\begin{aligned} S_k(H)&=\sum _{e\in H}|e|^k= |f|^k+|e_{n-1}|^k+ \sum _{i=1}^{n-2} |e_i|^k \le 2\left( \sqrt{k}\right) ^k+ (n-2)\left( \sqrt{\frac{2}{3}k}\right) ^k \\&\le \left( 2 + 2^k \cdot \left( \frac{2}{3}\right) ^{k/2} \right) \cdot k^{k/2} = \left( 2 + \left( \frac{8}{3}\right) ^{k/2} \right) k^{k/2}. \end{aligned}$$

Consequently, $s_k^{\texttt {HC}}(n) \le s_k(H) \le 1.64 \sqrt{k}$, for k sufficiently large.
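The last step takes the k-th root of the displayed bound. As a rough numerical check (a sketch only; the function name `cycle_bound_i` is hypothetical), the factor $\left( 2 + (8/3)^{k/2} \right) ^{1/k}$ multiplying $\sqrt{k}$ can be tabulated. It decreases toward $\sqrt{8/3} \approx 1.633$, which is why the constant 1.64 is reached only once k is large enough.

```python
import math


def cycle_bound_i(k):
    """Return (2 + (8/3)^(k/2))^(1/k), the factor of sqrt(k) in part (i)."""
    return (2 + (8 / 3) ** (k / 2)) ** (1 / k)


limit_i = math.sqrt(8 / 3)  # limiting value as k grows
for k in (3, 5, 10, 20, 50):
    print(k, round(cycle_bound_i(k), 4), "limit:", round(limit_i, 4))
```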

(ii) We classify the edges $e\in F$ into two types.

1.
short edges: $|e|^2 \le \frac{100k}{199}$.
2.
long edges: $\frac{100k}{199} < |e|^2 $.

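The number of short edges $e\in F$ is at most $n \le 2^k$, trivially. The number of long edges $e\in F$ is at most 199 by Lemma 3.3 applied with $m=200$. Now let $f=ab$ be the edge joining the two endpoints a and b of the final path F, and let $H=F+f$ be the Hamiltonian cycle obtained by adding f to F. Each long edge has length at most $\sqrt{k}$, and $|f| \le \sqrt{k}$ trivially, so these at most 200 edges contribute at most $200\, k^{k/2}$ in total. Bounding the short edges by their defining threshold, we get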

$$\begin{aligned} S_k(H)&\le \left( 200 + 2^k \cdot \left( \frac{100}{199}\right) ^{k/2} \right) \cdot k^{k/2} \le \left( 200 + 2.01^{k/2} \right) k^{k/2}. \end{aligned}$$

Consequently, $s_k^{\texttt {HC}}(n) \le s_k(H) \le 1.42 \sqrt{k}$, for k sufficiently large. $\square $
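The same k-th root step can be checked numerically for part (ii). This is a sketch with a hypothetical function name `cycle_bound_ii`. The product $2^k \cdot (100/199)^{k/2}$ equals $(400/199)^{k/2}$, and $400/199 \approx 2.01005$, so the base 2.01 in the display is best read as a rounded value; the limiting factor $\sqrt{400/199} \approx 1.4178$ stays below 1.42 either way. The additive constant 200 dominates for moderate k, so the bound 1.42 only takes effect for fairly large k.

```python
import math


def cycle_bound_ii(k):
    """Return (200 + (400/199)^(k/2))^(1/k), the factor of sqrt(k) in part (ii)."""
    return (200 + (400 / 199) ** (k / 2)) ** (1 / k)


limit_ii = math.sqrt(400 / 199)  # limiting value as k grows
for k in (10, 20, 30, 40, 80):
    print(k, round(cycle_bound_ii(k), 4), "limit:", round(limit_ii, 4))
```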

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Balogh, J., Clemen, F.C. & Dumitrescu, A. On a Traveling Salesman Problem for Points in the Unit Cube. Algorithmica 86, 3054–3078 (2024). https://doi.org/10.1007/s00453-024-01257-w

Received: 11 February 2024
Accepted: 12 July 2024
Published: 18 July 2024
Issue Date: September 2024
DOI: https://doi.org/10.1007/s00453-024-01257-w
