1 Introduction

Inspired by several graph theoretical notions studied in mathematical chemistry, and, especially, by the notions of distance-balanced graphs [3, 4] and the Mostar index of a graph [2], Miklavič and Šparl [6] introduced the distance-unbalancedness of a graph G. Here, we confirm one of their conjectures from [6].

Before we can explain the distance-unbalancedness as well as our contribution, we need to introduce some notation. We consider only finite, simple, and undirected graphs. For a graph G, and two vertices u and v of G, let \(\mathrm{dist}_{G}(u,v)\) denote the distance in G between u and v, and let \(n_{G}(u,v)\) be the number of vertices w of G that are closer to u than to v, that is, that satisfy \(\mathrm{dist}_{G}(u,w)<\mathrm{dist}_{G}(v,w)\). The Mostar index [2] of G is

$$\begin{aligned} \mathrm{Mo}(G)=\sum \limits _{uv\in E(G)}|n_{G}(u,v)-n_{G}(v,u)|. \end{aligned}$$

A graph G is distance-balanced [3, 4] if \(n_{G}(u,v)=n_{G}(v,u)\) for every edge uv of G; or, equivalently, if \(\mathrm{Mo}(G)=0\). The distance-unbalancedness [6] of G is

$$\begin{aligned} \mathrm{uB}(G)=\sum \limits _{\{ u,v\}\in {V(G)\atopwithdelims ()2}}|n_{G}(u,v)-n_{G}(v,u)|, \end{aligned}$$

where \({V(G)\atopwithdelims ()2}\) denotes the set of all 2-element subsets of the vertex set V(G) of G, that is, the edge set of the complete graph with vertex set V(G). A graph G is highly distance-balanced [5] if \(n_{G}(u,v)=n_{G}(v,u)\) for every two distinct vertices u and v of G; or, equivalently, if \(\mathrm{uB}(G)=0\).
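The definitions above translate directly into a short computation. The following Python sketch is only an illustration under stated assumptions: a connected graph is given as a dictionary mapping each vertex to its set of neighbors, and all function names (`distances`, `imbalance`, `mostar`, `unbalancedness`, `star`, `path`) are hypothetical and not taken from the cited references.

```python
from collections import deque
from itertools import combinations


def distances(adj):
    """All-pairs distances of a connected graph, by BFS from every vertex."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    queue.append(y)
        dist[source] = d
    return dist


def imbalance(dist, u, v):
    """|n_G(u,v) - n_G(v,u)| computed from a distance table."""
    closer_u = sum(1 for w in dist if dist[u][w] < dist[v][w])
    closer_v = sum(1 for w in dist if dist[v][w] < dist[u][w])
    return abs(closer_u - closer_v)


def mostar(adj):
    """Mo(G): sum of the imbalances over the edges of G."""
    dist = distances(adj)
    return sum(imbalance(dist, u, v) for u in adj for v in adj[u] if u < v)


def unbalancedness(adj):
    """uB(G): sum of the imbalances over all unordered pairs of distinct vertices."""
    dist = distances(adj)
    return sum(imbalance(dist, u, v) for u, v in combinations(adj, 2))


def star(n):
    """The star K_{1,n-1} with center 0."""
    adj = {0: set(range(1, n))}
    adj.update({i: {0} for i in range(1, n)})
    return adj


def path(n):
    """The path P_n on the vertices 0, ..., n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


print(mostar(star(5)), unbalancedness(star(5)))  # 12 12
print(mostar(path(4)), unbalancedness(path(4)))  # 4 6
```

For the star, only the edges contribute, so both parameters coincide, whereas for \(P_4\) the two pairs at distance two add to the sum.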

For a detailed discussion about the role of the above notions in mathematical chemistry, we refer to the cited references. In [6], Miklavič and Šparl collect numerous observations concerning the distance-unbalancedness and pose several conjectures. Confirming Conjecture 4.2 from [6], we prove the following.

Theorem 1

If T is a tree of order n, then

$$\begin{aligned} \mathrm{uB}(T)\ge \mathrm{uB}(K_{1,n-1})=(n-1)(n-2) \end{aligned}$$

with equality if and only if T is either a star \(K_{1,n-1}\) or \(n=4\) and T is the path \(P_4\).
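For small orders, the statement can be checked exhaustively. The sketch below is not part of the proof; it assumes that enumerating all labeled trees through Prüfer sequences is acceptable for small n, and the names `tree_from_pruefer`, `bfs`, `unbalancedness`, and `theorem_holds` are hypothetical.

```python
from collections import deque
from itertools import combinations, product


def tree_from_pruefer(seq):
    """Decode a Pruefer sequence over {0,...,n-1}, n = len(seq) + 2, into an adjacency dict."""
    n = len(seq) + 2
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    adj = {i: set() for i in range(n)}
    for x in seq:
        leaf = min(i for i in range(n) if degree[i] == 1)
        adj[leaf].add(x)
        adj[x].add(leaf)
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = (i for i in range(n) if degree[i] == 1)  # exactly two vertices remain
    adj[u].add(v)
    adj[v].add(u)
    return adj


def bfs(adj, source):
    """Distances from source to all vertices."""
    d = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                queue.append(y)
    return d


def unbalancedness(adj):
    """uB(G) as defined in the introduction."""
    dist = {s: bfs(adj, s) for s in adj}
    total = 0
    for u, v in combinations(adj, 2):
        closer_u = sum(dist[u][w] < dist[v][w] for w in adj)
        closer_v = sum(dist[v][w] < dist[u][w] for w in adj)
        total += abs(closer_u - closer_v)
    return total


def theorem_holds(n):
    """Check the bound and the equality cases over all labeled trees of order n."""
    bound = (n - 1) * (n - 2)
    for seq in product(range(n), repeat=n - 2):
        adj = tree_from_pruefer(seq)
        value = unbalancedness(adj)
        max_degree = max(len(neighbors) for neighbors in adj.values())
        # a tree is a star iff some vertex has degree n-1; for n = 4 the only other tree is P_4
        extremal = max_degree == n - 1 or (n == 4 and max_degree == 2)
        if value < bound or (value == bound) != extremal:
            return False
    return True


print(all(theorem_holds(n) for n in range(2, 7)))
```

The enumeration visits \(n^{n-2}\) labeled trees, so it is only practical for very small n; it serves as a sanity check and proves nothing beyond the orders tested.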

As the definition of distance-unbalancedness involves a summation over all unordered pairs of distinct vertices, this parameter is much harder to approach than many other comparable parameters. In particular, it is much more difficult to analyze the effect of the kind of local modifications that are usual proof techniques in this area. On the one hand, the non-local nature of the definition of the distance-unbalancedness causes complications for a mathematical approach, but, on the other hand, it points to further chemical motivation as it illustrates the interplay of short-range interactions (considering just adjacent pairs of vertices) and long-range interactions (considering all pairs of vertices). See [1] and the references therein for an example where the duality of through-space and through-bond interactions is brought into focus. Our proof of Theorem 1 relies on the insight, implicit in Lemma 2 below, that considering all unordered pairs of vertices at distance one or two is sufficient.

The rest of this paper is devoted to the proof of Theorem 1.

2 Proof of Theorem 1

For a graph G, the square \(G^{2}\) of G has the same vertex set as G, and two distinct vertices of G are adjacent in \(G^{2}\) if and only if their distance in G is at most two. For the proof of Theorem 1, we consider the following auxiliary parameter

$$\begin{aligned} \mathrm{uB}_{2}(G)=\sum \limits _{uv\in E(G^{2})}|n_{G}(u,v)-n_{G}(v,u)|, \end{aligned}$$

and we establish the following.

Lemma 2

If T is a tree of order n, then \(\mathrm{uB}_{2}(T)\ge (n-1)(n-2)\).
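To make the auxiliary parameter concrete, the following Python sketch computes \(\mathrm{uB}(G)\) and \(\mathrm{uB}_{2}(G)\) in one pass, restricting the second sum to pairs at distance at most two. It is a minimal illustration; the adjacency-dictionary representation and the names `bfs`, `sums`, `path`, and `star` are assumptions made here.

```python
from collections import deque
from itertools import combinations


def bfs(adj, source):
    """Distances from source to all vertices of a connected graph."""
    d = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                queue.append(y)
    return d


def sums(adj):
    """Return the pair (uB(G), uB_2(G))."""
    dist = {s: bfs(adj, s) for s in adj}
    ub = ub2 = 0
    for u, v in combinations(adj, 2):
        closer_u = sum(dist[u][w] < dist[v][w] for w in adj)
        closer_v = sum(dist[v][w] < dist[u][w] for w in adj)
        term = abs(closer_u - closer_v)
        ub += term
        if dist[u][v] <= 2:  # uv is an edge of the square G^2
            ub2 += term
    return ub, ub2


def path(n):
    """The path P_n on the vertices 0, ..., n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def star(n):
    """The star K_{1,n-1} with center 0."""
    adj = {0: set(range(1, n))}
    adj.update({i: {0} for i in range(1, n)})
    return adj


for n in range(3, 9):
    print(n, (n - 1) * (n - 2), sums(path(n)), sums(star(n)))
```

Since \(\mathrm{uB}_{2}\) sums a subset of the non-negative terms of \(\mathrm{uB}\), the first entry of each returned pair is never smaller than the second.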

Before proving this lemma, we show that Theorem 1 is an immediate consequence.

Proof of Theorem 1

By definition and Lemma 2, \(\mathrm{uB}(T)\ge \mathrm{uB}_{2}(T)\ge (n-1)(n-2)\) for every tree T of order n. It is an easy calculation that stars and \(P_4\) satisfy \(\mathrm{uB}(T)=(n-1)(n-2)\). Now, in order to complete the proof, we suppose, for a contradiction, that T is a tree of order n with \(\mathrm{uB}(T)=(n-1)(n-2)\) that is neither a star nor \(P_4\). Clearly, this implies that \(n\ge 5\), and that T has diameter at least three. Since \(\mathrm{uB}(T)=(n-1)(n-2)\) implies \(\mathrm{uB}(T)=\mathrm{uB}_{2}(T)\), we have \(n_T(u,v)=n_T(v,u)\) for every two vertices u and v at distance three.

Let u and v be two vertices at distance three, and let P be the path in T between u and v. If u has a neighbor \(u'\) that does not lie on P, and \(v'\) is the neighbor of v on P, then \(u'\) and \(v'\) have distance three but \(n_T(u',v')<n_T(u,v)=n_T(v,u)<n_T(v',u')\), which is a contradiction. Hence, by symmetry, every two vertices at distance three are both endvertices of T, which implies that T has diameter exactly three. Using \(n_T(u,v)=n_T(v,u)\), this easily implies that T arises from the disjoint union of two stars of order \(\frac{n}{2}\) by adding an edge between the two center vertices.

Now,

$$\begin{aligned} \mathrm{uB}(T)&\ge {} \mathrm{uB}_{2}(T)\\&= {} (n-2)^{2}+2\left( \frac{n}{2}-1\right) ^{2}\\&= {} (n-1)(n-2)+\frac{1}{2}(n-2)(n-4)\\&> {} (n-1)(n-2), \end{aligned}$$

which is a contradiction, and completes the proof. \(\square \)
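The value of \(\mathrm{uB}_{2}(T)\) used in the second line of the last display can be obtained by the following count, given here as a sketch; the names c and \(c'\) for the two center vertices are introduced only for this remark. Each of the \(n-2\) edges between an endvertex and its adjacent center contributes \((n-1)-1=n-2\). Each of the \(n-2\) pairs formed by an endvertex of one star and the center of the other star contributes \(\frac{n}{2}-1\). The edge \(cc'\) and every pair of endvertices of the same star contribute 0, and all remaining pairs have distance three. Hence,

$$\begin{aligned} \mathrm{uB}_{2}(T)=(n-2)(n-2)+(n-2)\left( \frac{n}{2}-1\right) =(n-2)^{2}+2\left( \frac{n}{2}-1\right) ^{2}. \end{aligned}$$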

We proceed to the proof of the lemma.

Proof of Lemma 2

Let T be a tree of order n such that \(\mathrm{uB}_{2}(T)\) is as small as possible. If T is a path, then a simple calculation yields \(\mathrm{uB}_{2}(T)=(n-1)(n-2)\), and the desired result follows. Hence, we may assume that T has at least one vertex of degree at least three.
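The simple calculation for paths mentioned above can be spelled out as follows; the labeling \(v_{1},\ldots ,v_{n}\) of the vertices along the path is an assumption made only for this sketch. The pair \(\{ v_{i},v_{i+1}\}\) contributes \(|n-2i|\), and the pair \(\{ v_{i},v_{i+2}\}\) contributes \(|n-2i-1|\), so that

$$\begin{aligned} \mathrm{uB}_{2}(P_{n})&= {} \sum \limits _{i=1}^{n-1}|n-2i|+\sum \limits _{i=1}^{n-2}|n-2i-1|\\&= {} \sum \limits _{j=2}^{2n-2}|n-j|\\&= {} 2\sum \limits _{m=1}^{n-2}m\\&= {} (n-1)(n-2). \end{aligned}$$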

We consider different cases.

Case 1 T has exactly one vertex c of degree k at least three.

Let the k components of \(T-c\) have orders \(n_{1},\ldots ,n_{k}\) with \(n_{1}\ge \ldots \ge n_{k}\ge 1\). Note that all these components are paths, and that \(n_{1}+\cdots +n_{k}=n-1\).

Case 1.1 \(n_{1}\le \frac{n}{2}\).

We have

$$\begin{aligned} \mathrm{uB}_{2}(T)= & {} \sum \limits _{i=1}^k\Big ((n-2)+(n-3)+\cdots +(n-2n_{i})\Big )+\sum \limits _{i=1}^{k-1}\sum \limits _{j=i+1}^{k}(n_{i}-n_j)\\\ge & {} \sum \limits _{i=1}^k\Big ((n-2)+(n-3)+\cdots +(n-2n_{i})\Big )+(n_{1}-n_{2})\\= & {} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1\Big )+(n_{1}-n_{2})\\= & {} f_{1}(n,k)-\sum \limits _{i=1}^k2n_{i}^{2}+(n_{1}-n_{2}), \end{aligned}$$

where \(f_{1}(n,k)\) is a suitable function of n and k.
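For concreteness, one admissible choice of \(f_{1}(n,k)\) is obtained by expanding the sum and using \(n_{1}+\cdots +n_{k}=n-1\); the explicit form is not needed in what follows and is given only as a worked check:

$$\begin{aligned} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1\Big )&= {} 2n(n-1)-kn-(n-1)+k-\sum \limits _{i=1}^k2n_{i}^{2}\\&= {} (n-1)(2n-1-k)-\sum \limits _{i=1}^k2n_{i}^{2}, \end{aligned}$$

that is, \(f_{1}(n,k)=(n-1)(2n-1-k)\).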

We consider the following optimization problem:

$$\begin{aligned} \begin{array}{rrcl} \min &{} f_{1}(n,k)-\sum \limits _{i=1}^k2n_{i}^{2}+(n_{1}-n_{2}) &{} &{} \\ s.th. &{} \frac{n}{2}\ge n_{1}\ge \ldots \ge n_{k}&{}\ge &{}1 \\ &{} n_{1}+\cdots +n_{k}&{}=&{}n-1 \\ &{} n_{1},\ldots ,n_{k} &{} \in &{} \frac{{\mathbb {N}}}{2}. \end{array} \end{aligned}$$
(1)

Note that in (1), the originally integral values of the \(n_{i}\) have been relaxed to being half-integral.

Let \((n_{1},\ldots ,n_{k})\) be a lexicographically maximal optimal solution of (1).

If \(n_{1}<\frac{n}{2}\) and \(n_{2}>n_{3}\), then

$$\begin{aligned}&\Bigg (-2\left( n_{1}+\frac{1}{2}\right) ^{2}-2\left( n_{2}-\frac{1}{2}\right) ^{2}+\left( n_{1} +\frac{1}{2}\right) -\left( n_{2}-\frac{1}{2}\right) \Bigg ) -\Bigg (-2n_{1}^{2}-2n_{2}^{2}+n_{1}-n_{2}\Bigg )\\&= {} -2(n_{1}-n_{2})\\&\le {} 0 \end{aligned}$$

implies that \(\left( n_{1}+\frac{1}{2},n_{2}-\frac{1}{2},\ldots ,n_{k}\right) \) is a lexicographically larger optimal solution of (1), which is a contradiction. If \(n_{1}<\frac{n}{2}\), \(n_{i}>1\) for some \(i\in \{ 3,\ldots ,k\}\), and i is chosen largest with this property, then

$$\begin{aligned}&\Bigg (-2\left( n_{1}+\frac{1}{2}\right) ^{2}-2\left( n_{i}-\frac{1}{2}\right) ^{2} +\left( n_{1}+\frac{1}{2}\right) \Bigg ) -\Bigg (-2n_{1}^{2}-2n_{i}^{2}+n_{1}\Bigg )\\= & {} -2(n_{1}-n_{i})-\frac{1}{2}\\< & {} 0 \end{aligned}$$

implies that \(\left( n_{1}+\frac{1}{2},\ldots ,n_{i}-\frac{1}{2},\ldots ,n_{k}\right) \) is a better solution of (1), which is a contradiction.

Finally, if \(n_{1}=\frac{n}{2}\) and \(n_{2}<\frac{n}{2}-k+1\), then \(n_{i}>1\) for some \(i\in \{ 3,\ldots ,k\}\). If i is largest with this property, then

$$\begin{aligned}&\Bigg (-2\left( n_{2}+\frac{1}{2}\right) ^{2}-2\left( n_{i} -\frac{1}{2}\right) ^{2} -\left( n_{2}+\frac{1}{2}\right) \Bigg ) -\Bigg (-2n_{2}^{2}-2n_{i}^{2}-n_{2}\Bigg )\\= & {} -2(n_{2}-n_{i})-\frac{3}{2}\\< & {} 0 \end{aligned}$$

implies that \(\left( n_{1},n_{2}+\frac{1}{2},\ldots ,n_{i}-\frac{1}{2},\ldots ,n_{k}\right) \) is a better solution of (1), which is a contradiction.

These observations imply that

  (a) either \(n\ge 2k\), \(n_{1}=\frac{n}{2}\), \(n_{2}=\frac{n}{2}-k+1\), and \(n_{3}=\ldots =n_{k}=1\),

  (b) or \(n<2k\), \(n_{1}=n-k\), and \(n_{2}=\ldots =n_{k}=1\).

In the first case,

$$\begin{aligned} \mathrm{uB}_{2}(T)\ge & {} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1\Big )+n_{1}-n_{2}\\{\mathop {=}\limits ^{(a)}} & {} (n-1)(n-2)+(n-2k)(k-2)\\\ge & {} (n-1)(n-2), \end{aligned}$$

and, in the second case,

$$\begin{aligned} \mathrm{uB}_{2}(T)\ge & {} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1\Big )+n_{1}-n_{2}\\{\mathop {=}\limits ^{(b)}} & {} (n-1)(n-2)+(2k-n)(n-k-1)\\\ge & {} (n-1)(n-2). \end{aligned}$$

Altogether, we obtain \(\mathrm{uB}_{2}(T)\ge (n-1)(n-2)\) as required in both cases.

Case 1.2 \(n_{1}>\frac{n}{2}\).

We have

$$\begin{aligned} \mathrm{uB}_{2}(T)= & {} \Big ((n-2)+\cdots +1+0+1+\cdots +(2n_{1}-n)\Big ) +\sum \limits _{i=2}^k\Big ((n-2)+\cdots +(n-2n_{i})\Big )\\&\quad +\sum \limits _{i=1}^{k-1}\sum \limits _{j=i+1}^{k}(n_{i}-n_j)\\\ge & {} \Big ((n-2)+\cdots +1+0+1+\cdots +(2n_{1}-n)\Big ) +\sum \limits _{i=2}^k\Big ((n-2)+\cdots +(n-2n_{i})\Big )\\&\quad +\sum \limits _{i=2}^{k}(n_{1}-n_{i})\\= & {} \frac{1}{2}(n-1)(n-2)+\frac{1}{2}(2n_{1}-n)(2n_{1}-n+1)+(k-1)n_{1}\\&\quad +\sum \limits _{i=2}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1-n_{i}\Big )\\= & {} f_{2}(n,k) +2n_{1}^{2}-n_{1}(4n-k)-\sum \limits _{i=2}^k(2n_{i}^{2}+2n_{i}), \end{aligned}$$

where we used \(\sum \limits _{i=2}^k 2n_in=(2(n-1)-2n_{1})n\), and \(f_{2}(n,k)\) is a suitable function of n and k.
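As a worked check, one admissible choice of \(f_{2}(n,k)\) is obtained by collecting the terms of the third expression that do not depend on \(n_{1},\ldots ,n_{k}\); the explicit form is not needed below. Using \(\frac{1}{2}(2n_{1}-n)(2n_{1}-n+1)=2n_{1}^{2}-2nn_{1}+n_{1}+\frac{1}{2}n(n-1)\) and the identity stated above, these terms are

$$\begin{aligned} f_{2}(n,k)&= {} \frac{1}{2}(n-1)(n-2)+\frac{1}{2}n(n-1)+2n(n-1)-(k-1)(n-1)\\&= {} (n-1)^{2}+(n-1)(2n-k+1)\\&= {} (n-1)(3n-k), \end{aligned}$$

while the coefficient of \(n_{1}\) is \(-2n+1+(k-1)-2n=-(4n-k)\).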

Note that, for \(i\in \{ 2,\ldots ,k\}\), we have \(n_{1}+n_{i}\le n_{1}+n_{2}\le n-k+1\), and, hence,

$$\begin{aligned} 4(n_{1}+n_{i})-4n+k+2\le -3k+6<0. \end{aligned}$$

If \(n_{i}>1\) for some \(i\in \{ 2,\ldots ,k\}\), and i is largest with this property, then

$$\begin{aligned}&\Bigg (2(n_{1}+1)^{2}-(n_{1}+1)(4n-k)-2(n_{i}-1)^{2}-2(n_{i}-1)\Bigg ) -\Bigg (2n_{1}^{2}-n_{1}(4n-k)-2n_{i}^{2}-2n_{i}\Bigg )\\= & {} 4(n_{1}+n_{i})-4n+k+2\\< & {} 0. \end{aligned}$$

This observation implies that

$$\begin{aligned} \begin{array}{rrcl} \min &{} f_{2}(n,k)+2n_{1}^{2}-n_{1}(4n-k)-\sum \limits _{i=2}^k(2n_{i}^{2}+2n_{i}) &{} &{} \\ s.th. &{} n_{1}&{}>&{}\frac{n}{2}\\ &{}n_{1}\ge \ldots \ge n_{k}&{}\ge &{}1 \\ &{} n_{1}+\cdots +n_{k}&{}=&{}n-1 \\ &{} n_{1},\ldots ,n_{k} &{} \in &{} {\mathbb {N}} \end{array} \end{aligned}$$

attains its optimum

  (c) for \(n_{1}=n-k\) and \(n_{2}=\ldots =n_{k}=1\).

This implies

$$\begin{aligned} \mathrm{uB}_{2}(T)&\ge {} \frac{1}{2}(n-1)(n-2)+\frac{1}{2}(2n_{1}-n)(2n_{1}-n+1)+(k-1)n_{1}\\&\quad +\sum \limits _{i=2}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1-n_{i}\Big )\\&{\mathop {\ge }\limits ^{(c)}} {} (n-1)(n-2)+(k-1)(k-2)\\&\ge {} (n-1)(n-2), \end{aligned}$$

and, hence, also \(\mathrm{uB}_{2}(T)\ge (n-1)(n-2)\) as required in this case.
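The exact expressions for \(\mathrm{uB}_{2}(T)\) in the first lines of the displays opening Cases 1.1 and 1.2 can be sanity-checked numerically. The following Python sketch builds a tree with a single vertex of degree at least three from the orders \(n_{1}\ge \ldots \ge n_{k}\) of its pendant paths and compares a direct BFS computation with these expressions. The names `spider`, `ub2`, and `ub2_formula` are hypothetical, and the check covers only the sample inputs listed.

```python
from collections import deque
from itertools import combinations


def spider(legs):
    """Tree with center 0 and one pendant path of order m for each m in legs."""
    adj = {0: set()}
    for m in legs:
        previous = 0
        for _ in range(m):
            vertex = len(adj)  # next unused label
            adj[vertex] = {previous}
            adj[previous].add(vertex)
            previous = vertex
    return adj


def ub2(adj):
    """uB_2: sum of |n(u,v) - n(v,u)| over all pairs at distance one or two."""
    dist = {}
    for s in adj:
        d = {s: 0}
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    queue.append(y)
        dist[s] = d
    total = 0
    for u, v in combinations(adj, 2):
        if dist[u][v] <= 2:
            closer_u = sum(dist[u][w] < dist[v][w] for w in adj)
            closer_v = sum(dist[v][w] < dist[u][w] for w in adj)
            total += abs(closer_u - closer_v)
    return total


def ub2_formula(legs):
    """Expressions from Cases 1.1 and 1.2; legs must be sorted non-increasingly."""
    n = 1 + sum(legs)
    k = len(legs)

    def leg_sum(m):
        # (n-2) + (n-3) + ... + (n-2m), used only when 2m <= n
        return sum(n - j for j in range(2, 2 * m + 1))

    cross = sum(legs[i] - legs[j] for i in range(k) for j in range(i + 1, k))
    if 2 * legs[0] <= n:  # Case 1.1
        return sum(leg_sum(m) for m in legs) + cross
    # Case 1.2: (n-2) + ... + 1 + 0 + 1 + ... + (2 n_1 - n) for the longest leg
    first = sum(abs(n - j) for j in range(2, 2 * legs[0] + 1))
    return first + sum(leg_sum(m) for m in legs[1:]) + cross


for legs in [(1, 1, 1), (2, 1, 1), (3, 2, 1), (5, 1, 1), (4, 2, 2, 1), (3, 3, 3)]:
    n = 1 + sum(legs)
    print(legs, ub2(spider(legs)), ub2_formula(legs), (n - 1) * (n - 2))
```

The last printed column is the bound \((n-1)(n-2)\) from Lemma 2, which allows a direct comparison for each sample tree.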

Case 2 T has at least two vertices of degree at least three.

Considering two vertices of degree at least three at maximum distance, it follows that T has a vertex c of degree \(k+1\) at least three such that \(T-c\) has

  • k components that are paths of orders \(n_{1},\ldots ,n_{k}\) with \(n_{1}\ge \ldots \ge n_{k}\ge 1\) and

    $$\begin{aligned} n':=1+n_{1}+\ldots +n_{k}\le \frac{n}{2}, \end{aligned}$$

    as well as

  • one component K of order \(n-n'\).

Let d be the neighbor of c in V(K). Let the tree \(T'\) arise from the disjoint union of K and a path P of order \(n'\) by adding one edge between d and an endvertex of P. Our goal is to show that \(\mathrm{uB}_{2}(T)>\mathrm{uB}_{2}(T')\), which would contradict the choice of T, and complete the proof.

We have

$$\begin{aligned} \mathrm{uB}_{2}(T)-\mathrm{uB}_{2}(T')= & {} \sum \limits _{i=1}^k\Big ((n-2)+\cdots +(n-2n_{i})+(n-n')-n_{i}\Big ) +\sum \limits _{i=1}^{k-1}\sum \limits _{j=i+1}^{k}(n_{i}-n_j)\\&\quad -\Big ((n-2)+\cdots +(n-(2n'-1))\Big )\\\ge & {} \sum \limits _{i=1}^k\Big ((n-2)+\cdots +(n-2n_{i})+(n-n')-n_{i}\Big )\\&\quad -\Big ((n-2)+\cdots +(n-(2n'-1))\Big )\\= & {} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1+(n-n')-n_{i}\Big )\\&\quad -\Big ((2n'-2)n-n'(2n'-1)+1\Big )\\= & {} f_{3}(n,n',k)-\sum \limits _{i=1}^k2n_{i}^{2}, \end{aligned}$$

where \(f_{3}(n,n',k)\) is a suitable function of n, \(n'\), and k.
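As a worked check, one admissible choice of \(f_{3}(n,n',k)\) follows by expanding the third expression with \(n_{1}+\cdots +n_{k}=n'-1\); the explicit form is not needed below. The terms involving n cancel, and

$$\begin{aligned} f_{3}(n,n',k)&= {} \Big (2n(n'-1)+k-kn'-2(n'-1)\Big )-\Big ((2n'-2)n-n'(2n'-1)+1\Big )\\&= {} 2n'^{2}-3n'+1-k(n'-1)\\&= {} (n'-1)(2n'-1-k). \end{aligned}$$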

By the convexity of \(x\mapsto x^{2}\),

$$\begin{aligned} \begin{array}{rrcl} \min &{} f_{3}(n,n',k)-\sum \limits _{i=1}^k2n_{i}^{2} &{} &{} \\ s.th. &{} n_{1}\ge \ldots \ge n_{k}&{}\ge &{}1 \\ &{} n_{1}+\cdots +n_{k}&{}=&{}n'-1 \\ &{} n_{1},\ldots ,n_{k} &{} \in &{} {\mathbb {N}} \end{array} \end{aligned}$$

attains its optimum

  (d) for \(n_{1}=n'-k\) and \(n_{2}=\ldots =n_{k}=1\).

Note that \(n'\ge k+1\) and \(n'\ge 3\), and, hence, \(3n'=2n'+n'\ge 2(k+1)+3=2k+5\).

Now, we obtain

$$\begin{aligned} \mathrm{uB}_{2}(T)-\mathrm{uB}_{2}(T')&\ge {} \sum \limits _{i=1}^k\Big ((2n_{i}-1)n-n_{i}(2n_{i}+1)+1+(n-n')-n_{i}\Big )\\&\quad -\Big ((2n'-2)n-n'(2n'-1)+1\Big )\\&{\mathop {\ge }\limits ^{(d)}} {} (3n'-2k-3)(k-1)\\&{\mathop {>}\limits ^{k\ge 2}} {} 0, \end{aligned}$$

which is the desired contradiction, completing the proof. \(\square \)