1 Introduction

In the Minimum Linear Ordering Problem (MLOP), given a finite set of elements E and a set function \(f: 2^E \rightarrow \mathbb {R}\), one seeks an ordering of the elements, i.e., a bijection \(\sigma : E \rightarrow \{1, \ldots , |E|\}\), that minimizes the aggregated cost over prefixes (or equivalently suffixes) of the ordering. In other words, MLOP is of the form \(\min _{\sigma \in \mathcal {S}_{E}} \sum _{i = 0}^{|E|} f( E_{i,\sigma } )\), where \(E_{i,\sigma } = \{e \in E: \sigma (e) \le i \}\), and \(\mathcal {S}_{E}\) is the set of permutations of E. This is in contrast to the classical setting of minimizing a cost function over a combinatorial family of subsets of the elements, for example, as in the set cover problem or the minimum spanning tree problem.
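To make the objective concrete, the following brute-force sketch (ours, purely illustrative and exponential in |E|; the toy cost function is an assumption, not an instance from this paper) evaluates every ordering of a small monotone submodular instance:

```python
from itertools import permutations

def mlop_brute_force(elements, f):
    """Exhaustively search all orderings; returns (best value, best ordering).

    f maps a frozenset of elements to a real number; only practical for
    tiny ground sets, shown purely to make the MLOP objective concrete.
    """
    best_value, best_order = float("inf"), None
    for order in permutations(elements):
        # Aggregate f over all prefixes E_{i,sigma}, i = 0 .. |E|.
        value = sum(f(frozenset(order[:i])) for i in range(len(order) + 1))
        if value < best_value:
            best_value, best_order = value, order
    return best_value, best_order

# Toy monotone submodular cost: cardinality capped at 2.
f = lambda S: min(len(S), 2)
value, order = mlop_brute_force(["a", "b", "c"], f)  # every ordering attains 5
```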

It is known that the MLOP is NP-hard even with additional assumptions, for example, when the set function \(f(\cdot )\) is monotone and submodular, or symmetric and submodular, or supermodular (see Table 1). Despite a rich literature on hardness of MLOP variants, it is unclear whether the problem remains NP-hard for many structured cases, for instance when \(f(\cdot )\) is the rank function of a matroid (i.e., submodular, monotone, bounded by set size, and integral). Furthermore, much is still unknown about the related approximation guarantees. In this work, we push the envelope of hardness and approximability for variants of submodular MLOP. In particular, we show the following:

  1.

    Matroid MLOP, graphic matroid MLOP, co-graphic matroid MLOP, and minimum latency vertex cover (MLVC) are NP-hard.

  2.

    Graphic matroid MLOP is polynomial-time solvable for some classes of graphs.

  3.

    We improve the approximation factor for matroid MLOP to \(2 - \frac{1+r(E)}{1+|E|}\) and that for minimum latency set cover (MLSC) to \(2-\theta \), where \(\theta \ge \frac{2}{|E|+1}\) is a quantity depending on the instance, by exploiting the theory of principal partitions. These results refine the previously best-known factors for these problems [1].

  4.

    We also show that MLVC can be approximated within factor \(\frac{4}{3}\), improving upon the previously best-known factor of 2 [1]. We analyze the fractional dimension of a related poset to achieve this bound. We further lower bound the integrality gap of the natural LP relaxation for MLVC.

Here, the matroid variants of MLOP are those where \(f(\cdot )\) is the rank function of the corresponding matroid, and the minimum latency set (vertex) cover problem is defined on a hypergraph (graph) whose vertices must be ordered so as to minimize the sum, over all hyperedges (edges), of the maximum index at which each hyperedge (edge) is covered. We include precise definitions of these problems in Sect. 4.

We summarize our hardness and approximation results on MLOP in Tables 1 and 2. The paper is structured as follows: We give an overview of our results and techniques in Sect. 2, discuss related work on MLOP variants in Sect. 3, and present preliminaries in Sect. 4. We discuss detailed proofs of our results in Sects. 5–8. We finally conclude the paper with open problems in Sect. 9.

2 Overview of results and techniques

We next present an overview of our results and techniques.

1. Hardness of matroid MLOP:

We first show in Sect. 5 the NP-hardness of matroid MLOP, by observing that a uniform matroid of rank k on a ground set \(E ~(|E| = n)\) is, up to isomorphism, the unique matroid of size n and rank at most k with \({n \atopwithdelims ()k}\) independent sets of size k. We will show that any optimal matroid MLOP solution can detect this, thereby reducing the “uniform matroid isomorphism” problem (known to be NP-hard [2]) to matroid MLOP.

Theorem 1

Matroid MLOP is NP-hard.

Furthermore, we show that matroid MLOP in decision form on a family of matroids has the same complexity as matroid MLOP in decision form on the dual family of matroids. This observation will be useful for upcoming results.

2. Hardness of graphic matroid MLOP:

Next, in Sect. 6, we further restrict \(f(\cdot )\) to the special case of the rank function of a graphic matroid. Tutte [3] gave a complete minor-free characterization of graphic matroids. In particular, graphic matroids are regular, i.e., representable by a totally unimodular matrix, and hence do not contain the rank-2 uniform matroid on four elements as a minor (e.g., see [4]). Therefore, for graphic matroids, the reduction from uniform matroid isomorphism does not suffice. We show NP-hardness using a series of reductions beginning with minimum sum vertex cover (MSVC) on a simple graph G, which we show reduces to minimum latency vertex cover (MLVC) on the complement graph \(\bar{G}\), which we show finally reduces to graphic matroid MLOP.

Theorem 2

Graphic matroid MLOP is NP-hard.

To reduce MLVC to graphic matroid MLOP on a graph \(G = (V,E)\), we first create an auxiliary graph H by adding a new vertex z to V, together with a weighted star graph T centered at z and connected to each vertex in V. We choose the edge weights of T in such a way that they each induce a distinct flat in any optimal ordering for weighted graphic matroid MLOP. This implies that solving (weighted) graphic matroid MLOP for H is equivalent to solving MLVC on G. As we can keep the magnitude of the weights controlled, this allows us to reduce MLVC to graphic matroid MLOP, thereby showing hardness of the latter. As a by-product, we also show that MLVC and co-graphic matroid MLOP are NP-hard, which was not known before our work.

3. Improved approximation of MLSC:

We present two different approximation algorithms for the minimum latency set cover problem, both of which refine the best-known constant 2-approximation, with improvements in different settings and using different techniques. MLSC can be modeled as a covering problem on a hypergraph (see Sect. 3 for details), and in Sect. 7, we present our first randomized approximation algorithm based on scheduling theory, whose approximation factor depends on the rank of the hypergraph, i.e., the maximum cardinality of its hyperedges (in other words, the maximum cardinality of the candidate sets) (Theorem 3). Our second approach simply applies the approximation for monotone submodular MLOP to MLSC, as the latter is a special case of the former (Corollary 2). Both resulting approximation factors depend on the properties of the instance, and neither dominates the other on all instances. They both improve on the previous best-known approximation bound of 2 for MLSC, which uses a reduction to the single machine scheduling problem with precedence constraints [5,6,7].

Theorem 3

There is a randomized polynomial time algorithm that approximates MLSC within factor \(2-\frac{2}{1+\ell }\), where \(\ell \) is the maximum cardinality among all hyperedges of H.

The idea for achieving our improved approximation bound for MLSC is to exploit the structural complexity of the precedence constraints (corresponding to a poset) for the subsequent scheduling instance. Bounding the fractional dimension of this poset by \(1+\ell \) allows us to utilize a state-of-the-art scheduling algorithm by Ambühl et al. [8] to approximate the objective by a factor of \(2-\frac{2}{1+\ell }\). For the special case where the input is a graph, this algorithm gives a factor \(\frac{4}{3}\) approximation for MLVC.

Corollary 1

There exists a randomized polynomial time factor \(\frac{4}{3}\)-approximation algorithm for MLVC.

To the best of our knowledge, this is the current best approximation factor for MLVC. For \(\ell \)-uniform regular hypergraphs, i.e., where each hyperedge has size \(\ell \) and each vertex is contained in the same number of hyperedges, we show that a simple LP relaxation also achieves the \(2-\frac{2}{1+\ell }\) approximation factor. In particular, the LP relaxation gives a factor \(\frac{4}{3}\)-approximation algorithm for MLVC on regular graphs. From this result, we raise the question of whether the LP relaxation for MLSC on \(\ell \)-uniform hypergraphs has the same \(2-\frac{2}{1+\ell }\) approximation factor. Indeed, a better approximation factor via this LP seems unlikely, as we observe a lower bound of \(2-\frac{2}{1+\ell }\) on the integrality gap of the LP relaxation for MLSC on \(\ell \)-uniform hypergraphs, matching our current approximation result.

In Sect. 8.3, we discuss the use of principal partitions to obtain an approximation for MLSC as a special case.

Corollary 2

There is a deterministic factor \((2-\frac{\varDelta +|E|}{\varDelta (1+|V|)})\)-approximation algorithm for MLSC, where \(\varDelta \) is the maximum degree of hypergraph \(H = (V,E)\).

Note that \(\varDelta =\max _{v\in V}|\{e\in E:v\in e\}|\). Together, Theorem 3 and Corollary 2 imply that MLSC can be approximated within factor \(2-\theta \), where \(\theta =\max \{\frac{2}{1+\ell },\frac{\varDelta +|E|}{\varDelta (1+|V|)}\}\). Note \(\theta \) can be very small, for example, for \(\ell \)-uniform hypergraphs where \(\ell \) is large. However, since \(\theta \ge \frac{2}{1+|V|}\), we always get a slight improvement over 2.

4. Polynomially solvable instances of matroid MLOP:

In Sect. 6.4, we propose a novel characterization of matroid MLOP, wherein one can search through bases and permutations of bases, rather than permutations of the ground set. In particular, whenever the number of bases of a matroid is small (polynomial in |E|) and the rank of the matroid is also small (constant), we show that matroid MLOP becomes polynomial time solvable.

Theorem 4

Let \(\mathcal {X}\) be a family of matroids such that for all \(M = (E,r_M) \in \mathcal {X}\) with \(|E| = m\), the number of bases of M is \(|\mathcal {B}(M)| \in O(g(m))\), and the rank of M is \(r_M(E) \in O(h(m))\), for some \(g, h: \mathbb {Z}_+ \rightarrow \mathbb {Z}_+\). Then, every matroid MLOP instance in \(\mathcal {X}\) can be solved in time \(O(g(m) \cdot poly(m,g(m)) \cdot (h(m))!)\). In particular, if g is polynomial in m and h is bounded by a constant, then matroid MLOP for \(\mathcal {X}\) is in P.

For the special case of graphic matroid MLOP on cactus graphs, we show that an optimal MLOP ordering can be found by fixing any spanning tree of the cactus graph. To find an ordering of the edges of the spanning tree, we show that a greedy ordering on the cycles of the cactus graph suffices (even though the size of the basis may not be logarithmic in the size of the ground set).

Theorem 5

Given a simple cactus graph G, there is a polynomial time algorithm that solves graphic matroid MLOP on G.

Furthermore, in Sect. 7.2, we show that if a graph is regular, then the optimal objective values for MLA, MSVC, and MLVC are all related by linear shifts parameterized by the number of vertices and their degree. As polynomial time algorithms are known for many classes of regular graphs (e.g., see [9,10,11,12]), this leads to new polynomial time algorithms for MSVC and MLVC on those classes.

5. Improved approximation for monotone submodular MLOP:

For monotone submodular MLOP, Iwata, Tetali, and Tripathi [1] provided a \((2-\frac{2}{|E|+1})\)-approximation algorithm based on the Lovász extension in 2012. Another natural approach is to use the theory of principal partitions induced by a given submodular function [13, 14]. The principal partition of a monotone submodular function f on a ground set E is a chain of subsets \(\mathcal {C} = \emptyset \subseteq S_1 \subseteq \ldots \subseteq S_k = E\), such that each \(S_i\) is the unique maximal minimizer of \(f(S) - \lambda _i |S|\) for some \(\lambda _i \in \mathbb {R}_+\). As early as 1992, Pisaruk considered completing the chain \(\mathcal {C}\) randomly to add subsets of the missing cardinalities ([15], cf. [16]). Later in 2019, Fokkink et al. [16] considered the same algorithm for the submodular search problem, which includes monotone submodular MLOP as a special case. They showed that this algorithm has an approximation ratio based on the total curvature of the submodular function, and is always at most 2.

It was not known how these two results compare, as they use very different techniques. We show that the algorithm based on principal partitions always has a better approximation guarantee than the \((2-\frac{2}{|E|+1})\) bound of the Lovász extension relaxation proven in [1].

Theorem 6

Let \(f:2^E \rightarrow \mathbb {R}\) be a non-trivial, normalized and monotone submodular function. There exists a factor \((2-\frac{1+\ell _{f}}{1+|E|})\)-approximation algorithm to MLOP with \(f(\cdot )\) in polynomial time, where \(\ell _{f}=\frac{f(E)}{\max _{x\in E}f(\{x\})}\).

As \(\ell _f\) is bounded below by 1, the above result is a refinement of the previous \((2 - \frac{2}{|E| + 1})\)-approximation [1]. Our result is also independent of the analysis in Fokkink et al. [16] using total curvature, and leads to nice approximation bounds for some classes of matroids where \(\ell _f\) is large. For example, for graphic matroid MLOP on connected graphs of bounded maximum degree \(\varDelta \) with \(\varDelta > 1\), we obtain a \((2 - \frac{2}{\varDelta })\)-approximation asymptotically. This constant factor improvement over 2 cannot be obtained using either the Lovász extension bound in [1] or the total curvature bound in [16].

Our results have led to multiple open questions which may be of independent interest, and are discussed in Sect. 9.

3 Related work

MLOP was formally introduced by Iwata et al. [1], generalizing many well-known combinatorial optimization problems. In this section, we describe related work in combinatorial optimization that can be viewed as different instances of MLOP. Some of these MLOP variants (e.g., minimum latency set cover) will be utilized in our proof that graphic matroid MLOP is NP-hard, as depicted in Fig. 1.

Fig. 1

Overview of related problems. A solid arrow from problem A to B indicates that B generalizes A. A dashed arrow from problem A to B denotes that computation of A can be polynomially reduced to computation of B, using our gadgets

Minimum linear arrangement (MLA)

Motivated by applications in coding theory, Harper [17] introduced minimum linear arrangement (MLA) in 1964, which seeks an arrangement of the vertices of a given graph \(G = (V,E)\) minimizing the total “stretch” summed over all edges, i.e., MLA on a graph \(G = (V,E)\) is the following,

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V}}\sum _{(u,v) \in E} |\pi (u) - \pi (v)|. \end{aligned}$$

Note that any permutation \(\pi \in \mathcal {S}_{V}\) naturally induces a chain on V with prefix sets \(V_{i,\pi } = \{v \in V: \pi (v) \le i\}\). Let \(\phi \) be the cut function of the graph, i.e., for all \(S \subseteq V\), \(\phi (S)\) is the number of edges with exactly one end in S. Note then for any permutation \(\pi \in \mathcal {S}_{V}\) if an edge \((u,v) \in E\) is stretched to a value \(k = |\pi (u) - \pi (v)|\), it must cross the cut of exactly k prefix sets in the chain \(V_{0,\pi } \subsetneq V_{1,\pi } \subsetneq \cdots \subsetneq V_{n-1,\pi } \subsetneq V_{n,\pi }\) where \(|V(G)| = n\). Thus, MLA on a graph \(G = (V,E)\) is equivalent to

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V}}\sum _{i = 0}^{n} \phi (V_{i,\pi }), \end{aligned}$$

which is an instance of MLOP with \(\phi \) being a symmetric submodular function. Solving MLA for specific instances of graphs has received considerable attention due to its many applications, see surveys [9,10,11,12]. While MLA is polynomial time solvable for some classes of graphs, for example trees [18, 19], its decision form has been known to be NP-complete since 1974 [20]. The best known approximation bound for MLA is \(O(\sqrt{\log n} \log \log n)\) [21, 22]. Under the exponential time hypothesis [23] that there does not exist a randomized algorithm to solve SAT in time \(2^{n^{\varepsilon }}\) where n is the instance size and \(\varepsilon > 0\) is arbitrarily small, it is also known that MLA is inapproximable to some constant [24].
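The equivalence between the stretch objective and the prefix-cut objective is easy to verify computationally. The sketch below is ours and purely illustrative (the small graph is an arbitrary assumption); it checks that both formulations agree on every ordering:

```python
from itertools import permutations

def mla_stretch(perm, edges):
    # Total stretch: sum of |pi(u) - pi(v)| over all edges.
    pos = {v: i + 1 for i, v in enumerate(perm)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def mla_cut_prefix(perm, edges):
    # Sum of cut values phi(V_{i,pi}) over all prefix sets of the ordering.
    total = 0
    for i in range(len(perm) + 1):
        prefix = set(perm[:i])
        total += sum(1 for u, v in edges if (u in prefix) != (v in prefix))
    return total

# An edge stretched to length k crosses exactly k prefix cuts, so the
# two objectives coincide on every permutation.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
for perm in permutations(["a", "b", "c", "d"]):
    assert mla_stretch(perm, edges) == mla_cut_prefix(perm, edges)
```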

Table 1 Previously known results and our results on NP-hardness of different MLOP variants

Minimum latency set cover (MLSC)

MLSC was introduced by Hassin and Levin [27] with motivations from problems in job scheduling, and they provided an e-approximation. The best known approximation constant for MLSC is 2 [25, 27]. Later in our work, we show that MLSC can be viewed as an instance of monotone submodular MLOP, for which Iwata, Tetali and Tripathi [1] gave a factor \((2-\frac{2}{|E|+1})\) approximation algorithm using the Lovász extension. We give a more refined approximation algorithm for monotone submodular MLOP using principal partitions, which applies to MLSC as well.

Table 2 Summary of approximation factors known for MLOP variants

Minimum sum set cover

Minimum sum set cover (MSSC) was introduced by Feige, Lovász, and Tetali [26], who also presented a greedy algorithm that provides a 4-approximate solution to MSSC, and showed it is NP-hard to do better. Later, Iwata et al. [1] showed that MSSC is an instance of supermodular MLOP, and the greedy algorithm for MSSC can be generalized to approximate supermodular MLOP within factor 4.

MSSC can be formulated as follows, using the notation of hypergraphs: given a hypergraph \(H=(V(H),E(H))\), MSSC seeks to find a permutation of vertices that minimizes the total costs of all hyperedges, where the cost of each hyperedge is the minimum of its vertex labels, i.e.,

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V(H)}} \sum _{e \in E(H)} \min _{v\in e}\pi (v). \end{aligned}$$

The special case when H is a graph is the well-known minimum sum vertex cover (MSVC). Independently of MSSC, MSVC was introduced earlier by Burer and Monteiro [33] as a heuristic for solving the semidefinite relaxation of the Max-Cut problem. Feige, Lovász, and Tetali [26] later gave a 2-approximation for MSVC based on linear programming rounding, and also showed that it is NP-hard to approximate within some constant factor \(\epsilon \), where \(1<\epsilon <2\). Later, Barenholz, Feige and Peleg [34] improved this to a 1.9946-approximation, and recently, Bansal et al. [31] gave a \(\frac{16}{9}\)-approximation for MSVC. The best possible approximation constant for MSVC is still unknown. For the special case of regular graphs, Feige, Lovász, and Tetali [26] gave a \(\frac{4}{3}\)-approximation. This approximation guarantee for regular graphs was later improved by Stanković [35] to 1.225.
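As an illustration of the MSSC objective (a brute-force sketch of ours, not an algorithm from the literature; the path instance is an arbitrary assumption), one can evaluate it directly on a tiny MSVC instance:

```python
from itertools import permutations

def mssc_value(perm, hyperedges):
    # Cost of a hyperedge is the smallest label among its vertices.
    label = {v: i + 1 for i, v in enumerate(perm)}
    return sum(min(label[v] for v in e) for e in hyperedges)

# MSVC instance: the path a-b-c, viewed as hyperedges of size 2.
hyperedges = [{"a", "b"}, {"b", "c"}]
# Placing the middle vertex b first covers both edges at label 1.
best = min(mssc_value(p, hyperedges) for p in permutations("abc"))  # 2
```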

These problems concern supermodular functions, but we only consider submodular functions in this work.

Other variants of MLOP

Another variant of MLOP is called the multiple intents ranking (MIR), and has been studied in [25, 31, 36,37,38]. Azar, Gamzu, and Yin [25] gave a 2-approximation for MIR, for the case when the weight vector for each hyperedge is monotonically non-decreasing. This variant of MIR includes MLSC as a special case. These problems have found a broad spectrum of applications in query results diversification [39], motion planning for robots [40], cost-minimizing search [41], and optimal scheduling [42], among others.

Another example of an instance of submodular MLOP is the sum cut problem (SUMCUT). The problem was independently introduced by Díaz et al. [28] and by Yixun and Jinjiang [29] to study circuit layouts. SUMCUT is NP-complete [28, 29], and Rao and Richa [32] gave an \(O(\log n)\)-approximation algorithm for SUMCUT using a divide-and-conquer approach.

Recently, Happach et al. [43] viewed MLOP under the umbrella of minimum sum ordering/permutation problem, and generalized results of Feige et al. [26].

4 Preliminaries

We now present notation and background useful for parsing this work. We refer the interested reader to [44] for further reading.

1. Submodular set functions

For a set of elements S and elements \(x\notin S,y\in S\), we use \(S+x,S-y\) to denote \(S\cup \{x\},S\setminus \{y\}\) respectively. Let \(f:2^E\rightarrow \mathbb {R}\) be a set function. We say f is submodular if for all \(S,T\subseteq E\), \(f(S)+f(T)\ge f(S\cup T)+f(S\cap T)\). An equivalent definition is \(f(S+e)-f(S)\ge f(T+e)-f(T)\) for all \(S\subseteq T\) and \(e\notin T\). This property is sometimes called the diminishing returns property. A set function f is supermodular if \(-f\) is submodular, symmetric if \(f(S)=f(E\setminus S)\) for all \(S\subseteq E\), and monotone if \(f(S)\le f(T)\) for all \(S\subseteq T\subseteq E\). We say f is normalized if \(f(\emptyset )=0\). A normalized monotone submodular function f is non-trivial if \(f(E)\ne 0\), i.e., f is not identically zero. For a normalized non-trivial monotone submodular function f, we define the steepness of f as \(\kappa _{f}=\max _{x\in E}f(\{x\})\), which is the maximum function value of any singleton. We further define the linearity of f as \(\ell _{f}=\frac{f(E)}{\kappa _{f}}\). For a normalized symmetric submodular function f, and \(s,t \in E\), an \(s - t\) cut is a subset \(X \subseteq E\) such that either \(s \in X\), \(t \not \in X\) or \(s \not \in X\), \(t \in X\). The cut is said to have value \(f(X) = f(E {\setminus } X)\).
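These definitions can be sanity-checked by brute force on a tiny ground set. The sketch below is ours and purely illustrative; it verifies submodularity and symmetry of the cut function of a triangle graph (an arbitrary assumption):

```python
from itertools import combinations

def powerset(E):
    return [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

def is_submodular(f, E):
    # Check f(S) + f(T) >= f(S | T) + f(S & T) for all pairs of subsets.
    subsets = powerset(E)
    return all(f(S) + f(T) >= f(S | T) + f(S & T) for S in subsets for T in subsets)

# Cut function of a triangle graph on {0, 1, 2}: phi(S) counts edges
# with exactly one end in S. Cut functions are symmetric and submodular.
edges = [(0, 1), (1, 2), (0, 2)]
phi = lambda S: sum(1 for u, v in edges if (u in S) != (v in S))
assert is_submodular(phi, {0, 1, 2})
assert all(phi(S) == phi(frozenset({0, 1, 2}) - S) for S in powerset({0, 1, 2}))
```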

For a finite set E with size \(m>0\), we define \(\mathcal {S}_{E}\) to be the set of all bijective functions \(\sigma : E \rightarrow \{1, \ldots , |E|\}\). For every \(\sigma \in \mathcal {S}_{E}\), we define the prefix sets \(E_{i,\sigma } = \{e \in E: \sigma (e) \le i \}\). Note \(|E_{i,\sigma }|=i\) for all \(1\le i\le m\) and \(\emptyset \subsetneq E_{1,\sigma } \subsetneq E_{2,\sigma } \subsetneq \cdots \subsetneq E_{m,\sigma } = E\).

2. Matroids

A rank function (for a matroid) is an integer-valued nonnegative monotone submodular function \(r: 2^{E} \rightarrow \mathbb {Z}_{\ge 0}\), such that \(r(A) \le |A|\) for all \(A\subseteq E\). A pair \(M = (E,r)\) where r is a rank function on E is a matroid. There are multiple equivalent definitions for matroids and we refer to [4] for other equivalent definitions and basic theory. Note if r is the rank function of a non-trivial matroid (where \(r(E)>0\)) then \(\kappa _{r}=1\) and \(\ell _{r}= r(E)\). The set E is the ground set of the matroid M, which we also denote E(M). A set \(I \subseteq E\) is independent if \(r(I) = |I|\), and is dependent otherwise. A maximal independent set is a basis, and the set of all bases of M is denoted as \(\mathcal {B}(M)\). A circuit of M is a minimally dependent set. Let \(e,e' \in E(M)\) for some matroid M; then e is a loop if \(\{e\}\) is a circuit, and e and \(e'\) are parallel if \(\{e,e'\}\) is a circuit. Let B be a basis for M and note that for all \(e \in E {\setminus } B\), \(B + e\) is a dependent set. It is well known that \(B + e\) contains a unique circuit, called the fundamental circuit of e with respect to B, which we will denote C(Be). A flat of a matroid is a subset \(X \subseteq E\) such that \(r(X \cup \{x\}) > r(X)\) for all \(x \in E \setminus X\), i.e., X is maximal among sets of the same rank. The closure of a set \(S \subseteq E\) is \(\text {cl}(S) = \{x \in E: r(S \cup \{x\}) = r(S)\}\).

A matroid is uniform of rank k if its bases consist of all subsets of size k. We denote a uniform matroid of size m and rank k as \(U_k^m\). If the independent sets of a matroid M are the edge sets of acyclic subgraphs of a graph G, then M is a graphic matroid, which we denote \(M = M[G]\). If M is a matroid with rank function r, then its corank function is \(r^*(X) = |X| - r(M) + r(E \setminus X)\). It is well known, see [4], that \(r^*\) is also a rank function for a matroid \(M^*\) on E(M), and we let \(M^*\) denote the dual matroid of M. An element is a coloop if it is a loop in the dual matroid. A cographic matroid is a matroid whose dual matroid is graphic.
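For concreteness, the rank of an edge set in a graphic matroid equals the size of a spanning forest of that edge set, which a standard union-find pass computes. The following sketch is ours and purely illustrative:

```python
def graphic_rank(vertices, edge_subset):
    # Rank of an edge set X in M[G]: the number of edges in a spanning
    # forest of (V, X), computed by counting union-find merges.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    rank = 0
    for u, v in edge_subset:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            rank += 1  # each merge adds one edge to the spanning forest
    return rank

# Triangle on {1, 2, 3}: any two edges are independent; all three have rank 2.
V = {1, 2, 3}
assert graphic_rank(V, [(1, 2), (2, 3)]) == 2
assert graphic_rank(V, [(1, 2), (2, 3), (1, 3)]) == 2
```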

For a positive integer m, let \([m] = \{1,2, \ldots , m\}\). Given a \(k\times m\) matrix A with integer entries, the vector matroid of A, denoted by M[A], is defined as follows: the ground set is [m], and the rank of a subset \(J\subseteq [m]\) is the (matrix) rank of the \(k\times |J|\) submatrix \(A_J\), obtained from A by deleting the columns whose indices are not in J.

Given a matroid \(M = (E, r)\), matroid MLOP solves \(\min _{\sigma \in \mathcal {S}_{E}} \sum _{i = 0}^{|E|} r( E_{i,\sigma } )\), where \(E_{i,\sigma } = \{e \in E: \sigma (e) \le i \}\), and \(\mathcal {S}_{E}\) is the set of permutations of E.

3. Graphs, hypergraphs and partial orders

A graph G over a set of vertices V(G) is defined by a multiset of edges \(E(G) \subseteq V \times V\). We allow graphs to have multiedges and loops, and a graph is simple if it has neither. A graph is a clique if every pair of distinct vertices is joined by a single edge. The clique or complete graph on n vertices is denoted \(K_n\). The complement of a simple graph G, denoted \(\overline{G}\), is the graph where \(V(\overline{G}) = V(G)\) and for all distinct \(u,v \in V(G)\), \((u,v) \in E(\overline{G})\) if and only if \((u,v) \not \in E(G)\). A block of a graph G is a maximal connected subgraph without a cut vertex. Note that each block of G is either an edge or a 2-connected subgraph, and it is well known that every pair of distinct blocks is either disjoint or intersects at a cut vertex of G. A cactus graph is a graph G in which every block of G is an edge or a cycle. A hypergraph \(H = (V, E)\) is a generalization of a graph that allows each edge \(e \in E\) to be a subset of the vertices V; each such subset is referred to as a hyperedge. A graph is the special case of a hypergraph where all hyperedges have size 2.

We now define the minimum latency set cover (MLSC). Given a hypergraph H, MLSC asks to find a permutation of its vertices that minimizes the aggregated cost of the hyperedges, where the cost of a hyperedge is the maximum label of its vertices:

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V(H)}} \sum _{e \in E(H)} \max _{v\in e} \pi (v). \end{aligned}$$

The minimum latency vertex cover (MLVC) is an instance of MLSC where the input is restricted to being a graph.
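The MLSC objective differs from MSSC only in replacing the minimum label by the maximum. A brute-force sketch (ours, purely illustrative; the path instance is an arbitrary assumption):

```python
from itertools import permutations

def mlsc_value(perm, hyperedges):
    # Cost of a hyperedge is the largest label among its vertices (latency).
    label = {v: i + 1 for i, v in enumerate(perm)}
    return sum(max(label[v] for v in e) for e in hyperedges)

# MLVC instance: the path a-b-c, viewed as hyperedges of size 2.
hyperedges = [{"a", "b"}, {"b", "c"}]
best = min(mlsc_value(p, hyperedges) for p in permutations("abc"))  # 5
```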

A partially ordered set or poset is a pair \((P,<_P)\) where P is a set and \(<_P\) is an asymmetric and transitive relation on P, i.e., such that for all distinct \(x,y,z\in P\), we have that

  1. 1.

    if \(x <_P y\) and \(y <_P z\) then \(x <_P z\), and

  2. 2.

    if \(x <_P y\) then \(y \not <_P x\).

For any \(x,y \in P\) we say x and y are comparable if \(x <_P y\), \(y <_P x\) or \(x = y\). A chain is a subset \(S \subseteq P\) of pairwise comparable elements of \((P,<_P)\). A partial order \((P,<_P)\) is a total order if P is a chain. A poset \((P,\prec _P)\) is an extension of a poset \((P,<_P)\) if for all \(x,y \in P\), \(x <_P y\) implies \(x \prec _P y\). An extension \((P,\prec _P)\) is linear if \((P,\prec _P)\) is a total order.
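Linear extensions of a small poset can be enumerated directly. The sketch below is ours and purely illustrative (exponential in |P|); it filters total orders consistent with a given strict partial order:

```python
from itertools import permutations

def linear_extensions(P, less_than):
    # Enumerate total orders of P extending the strict partial order
    # less_than, given as a set of pairs (x, y) meaning x <_P y.
    exts = []
    for perm in permutations(P):
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[x] < pos[y] for x, y in less_than):
            exts.append(perm)
    return exts

# Poset on {a, b, c} with a <_P b and a <_P c: 'a' must come first,
# so exactly two of the six total orders are linear extensions.
exts = linear_extensions(["a", "b", "c"], {("a", "b"), ("a", "c")})
```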

4. Principal partitions

We refer the readers to [14, 45] for the general theory on principal partitions. Here we state some properties of principal partitions on monotone submodular functions.

Theorem 7

[45] Let f be a monotone submodular function such that \(f(A)=0\) if and only if \(A=\emptyset \). Then there exist positive integer \(s\ge 1\) and nested sets \(\emptyset =\varPi _0\subsetneq \cdots \subsetneq \varPi _s=E\), called principal partitions of f, as well as real numbers \(\lambda _0<\lambda _1< \cdots <\lambda _{s+1}\), called critical values, such that for all \(0\le i\le s\), \(\varPi _i\) is the unique maximal optimal solution to \(\min _{X\subseteq E} f(X)-\lambda |X|\), for all \(\lambda \in (\lambda _i,\lambda _{i+1})\). Furthermore, \(\{\varPi _i\}_{0\le i\le s}\) as well as \(\{\lambda _i\}_{1\le i\le s}\) can be computed in polynomial time.

Some authors refer to \(\{\varPi _i\}_{0\le i\le s}\) as the principal sequence of partitions, and/or use the minimal (which is also unique) instead of the maximal optimal solution. Note that the principal partitions minimize the function value among subsets of the same size.
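For intuition, the principal partition of a tiny monotone submodular function can be computed by exhaustive search. The sketch below is ours and purely illustrative (exponential in |E|; the polynomial-time algorithms referenced above instead rely on submodular function minimization):

```python
from itertools import combinations

def principal_partition(E, f, lambdas):
    # Brute force over all subsets: for each lambda, find the maximal
    # minimizer of f(X) - lambda*|X|; distinct minimizers form a chain.
    subsets = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]
    chain = []
    for lam in lambdas:
        best = min(f(X) - lam * len(X) for X in subsets)
        # For monotone submodular f, the maximal minimizer is unique.
        maximal = max((X for X in subsets if f(X) - lam * len(X) == best), key=len)
        if maximal not in chain:
            chain.append(maximal)
    return chain

# Toy monotone submodular f with f(A) = 0 iff A is empty: f(X) = min(|X|, 2).
E = {"a", "b", "c"}
f = lambda X: min(len(X), 2)
# One lambda on each side of the critical value yields the nested chain.
chain = principal_partition(E, f, [0.5, 2.0])
```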

5 Matroid MLOP is as hard as uniform matroid isomorphism

Before we show NP-hardness of graphic matroid MLOP, we will first show in this section that the more general matroid MLOP is NP-hard, using a reduction from the uniform matroid isomorphism problem. Although the uniform matroid is one of the simplest matroids, it turns out that determining whether a given matroid is uniform is NP-hard [2]. Formally, the uniform matroid isomorphism problem is the following:

$$\begin{aligned} \text {Given a } k \times m \text { matrix } A \text { with integer entries, is } M[A] \text { isomorphic to } U^k_m\text {?} \end{aligned}$$

where M[A] denotes the vector matroid of A.

In the following lemma, we argue that a matroid of size m and rank at most k attains the optimal matroid MLOP value of \(U_k^m\) if and only if it is isomorphic to \(U_k^m\). This provides a reduction from the uniform matroid isomorphism problem to matroid MLOP.

Lemma 1

Let \(M =(E, r)\) be a matroid of size \(|E| = m\) and rank at most \(k \ge 1\). We have

$$\min _{\sigma \in \mathcal {S}_{E}}\sum _{i = 1}^m r(E_{i,\sigma }) = {k + 1 \atopwithdelims ()2} + k(m-k),$$

if and only if M is isomorphic to the uniform matroid \(U_k^m\).

Proof

For any uniform matroid \(U_{k}^m\) on ground set E, and for any ordering \(\sigma \) of elements in E, we have \(\sum _{i = 1}^{m} r(E_{i,\sigma }) = {k + 1 \atopwithdelims ()2} + k(m - k)\), since each prefix set \(E_{i,\sigma }\) has rank \(\min \{i,k\}\). If M is not isomorphic to the rank-k uniform matroid \(U_{k}^m\), then it must have some subset \(S \subseteq E\) of k elements with rank less than k. As M has rank at most k, ordering the elements of S first, followed by the elements of \(E\setminus S\) in arbitrary order, constructs a solution whose matroid MLOP value is strictly less than the optimal value for \(U_k^m\). The claim follows. \(\square \)
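The value in Lemma 1 can be checked numerically on small instances. In the sketch below (ours, purely illustrative), every ordering of \(U_2^4\) attains the claimed value, while a rank-2 matroid with a loop does strictly better:

```python
from itertools import permutations
from math import comb

def mlop_value(order, rank):
    # Sum of ranks over the nonempty prefixes of the ordering.
    return sum(rank(frozenset(order[:i])) for i in range(1, len(order) + 1))

m, k = 4, 2
uniform_rank = lambda S: min(len(S), k)  # rank function of U_2^4
values = {mlop_value(p, uniform_rank) for p in permutations(range(m))}
# Every ordering of U_k^m attains C(k+1, 2) + k(m - k).
assert values == {comb(k + 1, 2) + k * (m - k)}

# A rank-2 matroid on 4 elements in which element 0 is a loop: placing
# the loop first gives a strictly smaller MLOP value.
loopy_rank = lambda S: min(len(S - {0}), k)
best = min(mlop_value(p, loopy_rank) for p in permutations(range(m)))
assert best < comb(k + 1, 2) + k * (m - k)
```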

Note if A is a \(k \times m\) matrix, then M[A] is a matroid of size m and rank at most k. By Lemma 1, if we solve matroid MLOP for M[A], we can determine whether M[A] is isomorphic to \(U_k^m\). By the NP-hardness of uniform matroid isomorphism [2], we have the following theorem.

Theorem 1

Matroid MLOP is NP-hard.

The next lemma shows that solving matroid MLOP for any matroid \(M = (E,r)\) is as hard as solving matroid MLOP for the dual matroid \(M^* = (E, r^*)\). This will be useful to show the hardness of matroid MLOP for cographic matroids.

Lemma 2

Let \(M = (E,r)\) be a matroid with \(|E| = m\) and consider an ordering \(\sigma \in \mathcal {S}_{E}\), then

$$\begin{aligned} \sum _{i = 1}^m r^*(E_{i,\sigma }) = {m + 1 \atopwithdelims ()2} - (1 + |E(M)|)r(M) + \sum _{i = 1}^m r(E_{i,\sigma ^*}), \end{aligned}$$

where \(\sigma ^*\) is the reverse permutation, i.e., \(\sigma ^* = |E| + 1 - \sigma \in \mathcal {S}_{E}\).

Proof

One can easily verify that \(\sigma ^* \in \mathcal {S}_{E}\), and that \(E \setminus E_{i,\sigma } = E_{m-i,\sigma ^*}\) for all \(0 \le i \le m\). As \(r^*(X) = |X| - r(M) + r(E \setminus X)\) it follows,

$$\begin{aligned} \sum _{i = 1}^m r^*(E_{i,\sigma })&= \sum _{i = 1}^m \big ( |E_{i,\sigma }| - r(M) + r(E \setminus E_{i,\sigma }) \big )\\&= {m + 1 \atopwithdelims ()2} - r(M)|E(M)| + \sum _{i = 0}^{m-1} r(E_{i,\sigma ^*})\\&= {m + 1 \atopwithdelims ()2} - (1 + |E(M)|)r(M) + \sum _{i = 1}^m r(E_{i,\sigma ^*}), \end{aligned}$$

where the last equality uses \(r(E_{0,\sigma ^*}) = 0\) and \(r(E_{m,\sigma ^*}) = r(M)\).

\(\square \)

As the two sums differ by a constant independent of the ordering, any optimal ordering of E for matroid MLOP on a given matroid \(M =(E,r)\), when reversed, gives an optimal ordering for matroid MLOP on the dual matroid \(M^* = (E, r^*)\).
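This constant shift between the primal and dual objectives can be observed numerically. The sketch below (ours, purely illustrative, on the self-dual matroid \(U_2^4\)) checks that the difference between the two sums is the same for every ordering:

```python
from itertools import permutations

E = list(range(4))
k = 2
rank = lambda S: min(len(S), k)  # rank function of U_2^4
corank = lambda S: len(S) - rank(frozenset(E)) + rank(frozenset(E) - S)

def chain_sum(order, f):
    # Sum f over the nonempty prefixes of the ordering.
    return sum(f(frozenset(order[:i])) for i in range(1, len(order) + 1))

diffs = set()
for order in permutations(E):
    reverse = order[::-1]  # sigma* = |E| + 1 - sigma reverses the ordering
    diffs.add(chain_sum(order, corank) - chain_sum(reverse, rank))

# The difference is a single constant, so optimal orderings for M and M*
# correspond to each other via reversal.
assert len(diffs) == 1
```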

Corollary 3

Matroid MLOP is NP-hard on a family of matroids \(\mathcal {X}\) if and only if matroid MLOP is NP-hard on the dual family \(\mathcal {X}^* = \{X^*: X \in \mathcal {X}\}\).

6 Graphic matroid MLOP is NP-hard

We next consider the complexity of graphic matroid MLOP. This turns out to be non-trivial, involving a series of reductions from minimum sum vertex cover, to minimum latency vertex cover, to weighted graphic matroid MLOP, and finally to graphic matroid MLOP.

To show these reductions, we first argue in Lemma 3 that an optimal chain for matroid MLOP has a useful structure of flats of the matroid. Next, in Lemma 4, we reduce weighted matroid MLOP to matroid MLOP. In Sect. 6.2, we provide a reduction from minimum latency vertex cover (MLVC) to weighted graphic matroid MLOP. Finally, in Sect. 6.3, we argue that MLVC and minimum sum vertex cover (MSVC) are equivalent in decision form, thus completing the proof that graphic matroid MLOP is NP-hard. In Sect. 6.4, we give an alternative characterization of matroid MLOP: in matroid MLOP we optimize over permutations of the ground set, while in this new formulation, we optimize over bases and then over permutations of those bases. Using this characterization, we argue that graphic matroid MLOP for cactus graphs has a polynomial time algorithm.

6.1 Weighted graphic matroid MLOP

In this section, we first argue that any optimal matroid MLOP solution on a ground set E of size m has a nice “flat-like” structure, i.e., for any optimal permutation \(\sigma \in \mathcal {S}_{E}\), the set \(\bigcup \{ E_{j,\sigma }: r(E_{j,\sigma }) \le i \}\) is a flat for all \(i \in [m]\). This is a useful structural result for optimal solutions and is a necessary step towards showing the hardness of graphic matroid MLOP.

Lemma 3

Let \(M = (E,r)\) be a matroid of size m and rank k and let \(\sigma \in \mathcal {S}_{E}\) be a permutation that minimizes matroid MLOP. Then, there exists a basis \(B = \{b_1, \ldots , b_k\} \in \mathcal {B}(M)\) and a partition \(\{X_0, X_1, \ldots , X_k\}\) of E such that (i) \(b_i \in X_i\), and (ii) \(\bigcup _{i = 0}^j X_i\) is a flat for all \(0 \le j \le k\), and (iii) \(\sigma (e)<\sigma (e^\prime )\) for \(e \in X_i, e^\prime \in X_l\), and \(i<l\).

Proof

We may suppose \(k \ge 1\), as otherwise the statement is trivial. Let \(\sigma \in \mathcal {S}_{E}\) be a permutation that minimizes matroid MLOP, write \(e_j:= \sigma ^{-1}(j)\), and let \(X_i:= \{ e_j: r(E_{j,\sigma }) = i \}\) for \(i \ge 0\); note that \(\{X_0, X_1, \ldots , X_k\}\) partitions the ground set E. Furthermore, \(X_i \ne \emptyset \) for all \(1 \le i \le k\), since for all \(e \in E\) and \(X \subseteq E\) we have \(r(X + e) \le r(X) + 1\), so the prefix ranks increase by at most one at each step. For each \(1 \le i \le k\), let \(b_i \in X_i\) be the element e in \(X_i\) with the lowest index \(\sigma (e)\).

For each \(1 \le i \le k\), we claim \(\{b_1,\ldots ,b_i\}\) is an independent set. For \(i = 1\), this is clear. Suppose the claim holds for all positive integers less than j, and suppose for contradiction that \(r(\{b_1,\ldots , b_j\}) = j-1\). Then \(b_j \in \text {cl}(\{b_1, \ldots , b_{j-1}\}) = \text {cl}(\bigcup _{i = 0}^{j-1} X_i)\). As \(r(\text {cl}(\{b_1, \ldots , b_{j-1}\})) = j-1\), this contradicts the fact that \(r(\bigcup _{i = 0}^{j-1} X_i \cup \{b_j\}) = j\). In particular, \(\{b_1,\ldots , b_k\}\) is a basis of M and (i) holds.

We now show that \(\bigcup _{i=0}^j X_i\) is a flat for each \(j < k\). Suppose for contradiction that there exists \(e' \in X_{j'}\) with \(j' > j\) such that \(r(\bigcup _{i = 0}^j X_i \cup \{e'\}) = j\). Let \(\sigma ' \in \mathcal {S}_{E}\) be the permutation obtained from \(\sigma \) by placing \(e'\) immediately before \(b_{j+1}\). That is,

$$\begin{aligned} \sigma '(e) = {\left\{ \begin{array}{ll} \sigma (e) &{}\text { if } \sigma (e)< \sigma (b_{j+1}),\\ \sigma (b_{j+1}) &{}\text { if } e = e',\\ \sigma (e) + 1 &{}\text { if } \sigma (e) \ge \sigma (b_{j+1}) \text{ and } \sigma (e) < \sigma (e^{\prime }),\\ \sigma (e) &{}\text { if } \sigma (e) \ge \sigma (e^{\prime }). \end{array}\right. }. \end{aligned}$$

Note \(\sum _{i = 1}^m r(E_{i,\sigma '}) < \sum _{i = 1}^m r(E_{i,\sigma })\): for \(\sigma (b_{j+1}) \le i \le \sigma (e')\) we have \(E_{i,\sigma '} = E_{i-1,\sigma } \cup \{e'\}\), so \(r(E_{i,\sigma '}) = r(E_{i-1,\sigma }) \le r(E_{i,\sigma })\) with strict inequality at \(i = \sigma (b_{j+1})\), while all other prefixes are unchanged. This contradicts the optimality of \(\sigma \), thus no such \(e'\) can exist. It follows that \(\bigcup _{i = 0}^j X_i\) is a flat for each j, and hence (ii) holds. Finally, for all \(e \in X_i\) and \(e' \in X_l\) with \(i < l\), we have \(\sigma (e) < \sigma (b_{l}) \le \sigma (e')\), so (iii) holds as well. \(\square \)

We next introduce weighted matroid MLOP which, given positive integer costs \(c:E\rightarrow \mathbb {Z}_+\) on the elements E of a matroid \(M = (E, r)\), asks whether there exists a permutation \(\sigma \in S_E\) with weighted MLOP cost at most K, i.e.,

$$\begin{aligned} \min _{\sigma \in \mathcal {S}_{E}} \sum _i r(E_{i,\sigma })c(\sigma ^{-1}(i)) \le K. \end{aligned}$$

We now argue that weighted matroid MLOP for a matroid \(M = (E,r)\) reduces to matroid MLOP as long as the total integer cost is bounded by a polynomial in |E|. The reduction simply duplicates each element \(e\in E\) \(c(e)\) times and solves unweighted matroid MLOP on the modified instance: for any permutation \(\sigma \in \mathcal {S}_{E}\), grouping the duplicates of e together causes the rank \(r(E_{i,\sigma })\) to be counted once per copy, i.e., c(e) times in total.
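A brute-force sketch of this duplication argument (illustrative; the uniform matroid \(U_{2,3}\) and the costs are our own toy choices):

```python
from itertools import permutations

# Weighted MLOP on U(2, 3) with costs c equals unweighted MLOP on the
# matroid obtained by replacing each element e with c(e) parallel copies.
E = ('a', 'b', 'c')
c = {'a': 2, 'b': 1, 'c': 3}
rank = lambda X: min(len(set(X)), 2)  # parallel copies do not raise the rank

def weighted_mlop(order):
    return sum(rank(order[:i + 1]) * c[order[i]] for i in range(len(order)))

def mlop(order):
    return sum(rank(order[:i + 1]) for i in range(len(order)))

E_dup = tuple(e for e in E for _ in range(c[e]))  # c(e) copies of each e

best_weighted = min(weighted_mlop(s) for s in permutations(E))
best_unweighted = min(mlop(s) for s in set(permutations(E_dup)))
assert best_weighted == best_unweighted
print(best_weighted)
```

Here both optima place the costliest (most duplicated) element first, where the prefix rank is still small.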

Lemma 4

Weighted matroid MLOP with cost function c can be reduced to matroid MLOP in time poly(|E|, c(E)), where \(c(E):= \sum _{e \in E}c(e)\).

Proof

Given a matroid \(M = (E,r)\) with cost function c, let \(N = (E',r')\) be the corresponding matroid where for each \(e \in E(M)\) we add \(c(e) - 1\) parallel elements to get \(E'\). Let \(m' = |E'| = c(E)\) and let \(\sigma ' \in \mathcal {S}_{E'}\) be an optimal ordering for the matroid MLOP on N. Since we only add parallel elements, the rank function of M induces a natural rank function on N.

Let \(r'(N) = k\). By Lemma 3, there exists a partition \(\{X_0, X_1, \ldots , X_k\}\) of \(E'\) such that \(\bigcup _{i = 0}^j X_i\) is a flat for all \(0 \le j \le k\) and if \(e \in X_i\) and \(e' \in X_{\ell }\) for \(i < \ell \), then \(\sigma '(e) < \sigma '(e')\). Suppose \(e'\) is parallel to e. As \(\{e',e\}\) is a dependent set, we have that \(e \in X_i\) if and only if \(e' \in X_i\) for all \(0 \le i \le k\). We now define a new ordering \(\sigma ''\) by rearranging the elements in \(\sigma '\) such that parallel elements are grouped together by consecutive indices. As parallel elements appear in the same \(X_i\), we have \(\sum _{i = 1}^{m'} r'(E'_{i,\sigma '}) = \sum _{i = 1}^{m'} r'(E'_{i,\sigma ''})\).

Note that \(\sigma ''\) induces a permutation \(\sigma \in \mathcal {S}_{E}\) on the original weighted matroid M such that for any distinct \(e,e' \in E\) we have \(\sigma (e) < \sigma (e')\) if and only if \(\sigma '(e) < \sigma '(e')\). Note then as parallel elements appear in the same partition set \(X_i\) we have,

$$\begin{aligned} \sum _{i = 1}^{m'} r'(E'_{i,\sigma '}) = \sum _{i = 1}^{m'} r'(E'_{i,\sigma ''}) = \sum _{i = 1}^m r(E_{i,\sigma })c(\sigma ^{-1}(i)). \end{aligned}$$

Thus, the optimal weighted matroid MLOP value for M with cost function c is at most the optimal matroid MLOP value on N. Furthermore, one can easily verify that if \(\sigma \in \mathcal {S}_{E}\) obtains the optimal weighted matroid MLOP value for the matroid M with cost function c, there is a corresponding permutation \(\sigma ' \in \mathcal {S}_{E'}\) that obtains the same matroid MLOP value for N. Thus, the optimal values for both problems are equal. As we only added \(c(E) - |E|\) additional elements to N, this is a poly(|E|, c(E)) time reduction. \(\square \)

6.2 Reducing MLVC to graphic matroid MLOP

We now show that graphic matroid MLOP is as hard as minimum latency vertex cover (MLVC). In MLVC we are given a graph \(G=(V(G),E(G))\), and seek to find a permutation of vertices that minimizes the total edge cost, where the cost of each edge is the maximum label of its vertices, i.e.,

$$\min _{\pi \in \mathcal {S}_{V(G)}} \sum _{(x,y)\in E(G)}\max \{\pi (x),\pi (y)\}.$$
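For concreteness, this objective can be evaluated by brute force on a tiny instance (the 3-vertex path below is our own example):

```python
from itertools import permutations

# Brute-force MLVC on a path with 3 vertices: each edge pays the larger
# of its two vertex labels.
V = (0, 1, 2)
E = ((0, 1), (1, 2))

def mlvc(pi):
    return sum(max(pi[x], pi[y]) for (x, y) in E)

best = min(
    mlvc({v: i + 1 for i, v in enumerate(order)}) for order in permutations(V)
)
print(best)
```

Labeling the middle vertex 1 and the endpoints 2 and 3 attains the minimum of 5, since each edge then pays only its endpoint's label.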

Theorem 8

The minimum latency vertex cover (MLVC) problem can be reduced in polynomial time to graphic matroid MLOP.

Proof

We will consider an instance of MLVC for a graph G, and construct an auxiliary graph H from G. We will then show that solving MLVC on G is equivalent to solving weighted graphic matroid MLOP on H with a specific cost function c. Bounding the total edge cost \(c(E(H)):= \sum _{e \in E(H)} c(e)\) by a polynomial in |E(G)| and applying Lemma 4 then completes the reduction.

(a) Construction of the graphic matroid MLOP instance: Let G be the given graph with n vertices and m edges. We may assume without loss of generality that G has no isolated vertices: an optimal MLVC solution labels isolated vertices last, where they contribute nothing to the MLVC cost. Therefore \(n\le 2m\), as every vertex is an endpoint of at least one edge.

Let H be a copy of G with an additional vertex z connected to each vertex of G, i.e., \(V(H) = V(G) \cup \{z\}\) and \(E(H) = E(G) \cup \{(z,v): v \in V(G)\}\). Let T be the spanning tree of H with \(E(T) = \{(z,v): v \in V(G)\}\). Therefore, H has \(n+1\) vertices and \(m+n\) edges. Let \(\eta := 9m^2 + 2\) and define the cost function \(c(\cdot )\) on E(H) by \(c(e) = \eta \) if \(e \in E(T)\) and \(c(e) = 1\) otherwise. Therefore, the total cost of the edges of H is polynomially bounded in the size of the input graph G:

$$\begin{aligned} c(E(H))&= \sum _{e \in E(H)} c(e) = \sum _{e \in E(G)} 1 + \sum _{e \in E(T)} \eta \\&= m + (9m^2 + 2)n \le m + (9m^2 + 2)2m. \end{aligned}$$
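The construction can be sketched as follows (the helper name `build_instance` is ours; the apex vertex, star tree, and costs follow the proof):

```python
# Build the auxiliary weighted instance: add an apex vertex z adjacent to
# all of V(G), take the star at z as the spanning tree T, and charge
# eta = 9 m^2 + 2 on tree edges and 1 elsewhere.
def build_instance(V, E):
    m = len(E)
    eta = 9 * m * m + 2
    z = 'z'  # assumed-fresh apex vertex name
    tree = [(z, v) for v in V]
    cost = {e: 1 for e in E}
    cost.update({e: eta for e in tree})
    return list(E) + tree, tree, cost, eta

V = (0, 1, 2)
E = ((0, 1), (1, 2), (0, 2))  # our toy input graph: a triangle
EH, T, cost, eta = build_instance(V, E)
m, n = len(E), len(V)
assert len(EH) == m + n and sum(cost.values()) == m + eta * n
assert sum(cost.values()) <= m + (9 * m * m + 2) * 2 * m  # since n <= 2m
print(sum(cost.values()))
```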

Now, for the sake of brevity, let \(E:= E(H)\), and \(m^\prime = |E(H)|\). Let \(\sigma \in \mathcal {S}_{E}\) be an optimal ordering for weighted graphic matroid MLOP over H with costs \(c(\cdot )\). Note,

$$\begin{aligned} \text {MLOP}(H,c,\sigma )&:= \sum _{i = 1}^{m'}r(E_{i,\sigma })c(\sigma ^{-1}(i))\\&= \sum _{i=1}^{m'} r(E_{i,\sigma }) + \sum _{e \in E(T)} r(E_{\sigma (e),\sigma })(\eta -1). \end{aligned}$$

(b) Optimal solutions of graphic matroid MLOP are “good”: We now argue that optimal solutions to graphic matroid MLOP on H have a particular structure. We will argue that if the weights of the edges of T are large enough, then, analogous to Lemma 3, each edge of T must belong to a different flat induced by \(\sigma \). This will be useful for relating the solutions of graphic matroid MLOP on H to MLVC on G.

Call a permutation \(\pi \in \mathcal {S}_{E}\) good if no two edges of T induce the same prefix rank, i.e., \(r(E_{i,\pi })\ne r(E_{j,\pi })\) for all distinct edges \(\pi ^{-1}(i),\pi ^{-1}(j) \in E(T)\). We now argue that \(\sigma \) is an optimal permutation for weighted graphic matroid MLOP over H only if \(\sigma \) is also a good permutation. To show this, we will argue a stronger claim: every good permutation achieves a strictly lower MLOP objective on H than every permutation that is not good.

Let \(\sigma ' \in \mathcal {S}_{E}\) be an arbitrary good permutation, and for the sake of contradiction, assume that there exists an optimal permutation \(\sigma \) which is not good. We will argue that \(\sigma '\) must have a lower MLOP value than \(\sigma \), giving a contradiction.

Note that \(\text {MLOP}(H,c, \sigma ') \ge \text {MLOP}(H,c,\sigma )\) by the optimality of \(\sigma \). Furthermore, we have that

$$\begin{aligned} \sum _{i = 1}^{m'} r(E_{i,\sigma '}) \le \sum _{i = 1}^{m'} i \le (m')^2 = (m + n)^2 \le 9m^2, \end{aligned}$$

as \(m + n \le 3m\). The difference in MLOP objective values of \(\sigma \) and \(\sigma '\) is as follows,

$$\begin{aligned} 0&\ge \text {MLOP}(H,c, \sigma )- \text {MLOP}(H,c,\sigma ^\prime ) \\&=\sum _{i=1}^{m'} r(E_{i,\sigma }) - \sum _{i=1}^{m'} r(E_{i,\sigma '}) + ({\eta } -1)\Bigg ( \sum _{e \in E(T)} r(E_{\sigma (e),\sigma }) - r(E_{\sigma '(e),\sigma '})\Bigg ) \\&\ge -\sum _{i=1}^{m'} r(E_{i,\sigma '}) + ({\eta } -1)\Bigg ( \sum _{e \in E(T)} r(E_{\sigma (e),\sigma }) - r(E_{\sigma '(e),\sigma '})\Bigg ) \\&\ge -9m^2 + ({\eta } -1)\Bigg (\sum _{e \in E(T)} r(E_{\sigma (e),\sigma }) - r(E_{\sigma '(e),\sigma '})\Bigg ). \end{aligned}$$

As \({\eta } -1 > 9m^2\), we must have \(\sum _{e \in E(T)} r(E_{\sigma (e),\sigma }) - r(E_{\sigma '(e),\sigma '}) \le 0\); otherwise the right hand side would be strictly positive, contradicting the optimality of \(\sigma \), and we would be done.

For \(1 \le i \le n\), let \(Y_i\) be the collection of prefix sets of \(\sigma \) that end at an edge of T and have rank exactly i, i.e., \(Y_i:= \{ E_{\sigma (e),\sigma }: e \in E(T) \text { and } r(E_{\sigma (e),\sigma }) = i\}\). Note that more than one \(E_{\sigma (e),\sigma }\) might belong to \(Y_i\). Furthermore, \(\{Y_i: 1 \le i \le n\}\) partitions \(\{ E_{\sigma (e),\sigma }: e \in E(T) \}\). Similarly, let \(Y_i':= \{ E_{\sigma '(e),\sigma '}: e \in E(T) \text { and } r(E_{\sigma '(e),\sigma '}) = i\}\). As \(\sigma '\) is a good permutation, we have that \(|Y_i'| = 1\) for all i. Note,

$$\begin{aligned} 0 \ge \sum _{e \in E(T)} r(E_{\sigma (e),\sigma }) - r(E_{\sigma '(e),\sigma '}) = \sum _{i = 1}^n i(|Y_i| - |Y_i'|) = \sum _{i = 1}^n i(|Y_i| - 1). \end{aligned}$$

We first argue that \(\sum _{j = 1}^i |Y_j| \le i\) for all \(1 \le i \le n\). Note the elements of \(\bigcup _{j = 1}^i Y_j\) form a chain of subsets. Define \(U_i\) to be the maximum element of \(\bigcup _{j = 1}^i Y_j\). In particular, as \(U_i \in Y_j\) for some \(j \le i\), we have \(r(U_i) \le i\). Furthermore, as E(T) is an independent set,

$$\begin{aligned} r(U_i) \ge | \{ E_{\sigma (e),\sigma } : e \in E(T) \text { and } E_{\sigma (e),\sigma } \subseteq U_i\}| = \sum _{j = 1}^i |Y_j|. \end{aligned}$$

Thus we have \(\sum _{j = 1}^i |Y_j| \le i\). Note as \(\{Y_i: 1 \le i \le n\}\) partitions \(\{ E_{\sigma (e),\sigma }: e \in E(T) \}\), we also have \(\sum _{j = 1}^{n} |Y_j| = |E(T)| = n\).

We now argue that the sum \(\sum _{i = 1}^n i(|Y_i| - 1)\) is minimized, i.e., \(\sum _{i = 1}^ni(|Y_i| - 1) = 0\), if and only if \(|Y_i| = 1\) for all \(1 \le i \le n\). Suppose \(|Y_j| \ne 1\) for some j, and let k be the first index such that \(|Y_k| \ne 1\). As \(k \ge \sum _{i = 1}^k |Y_i| = |Y_k| + (k - 1)\), we have that \(|Y_k| = 0\). As \(\sum _{i = 1}^{n} |Y_i| = n\), there is some \(l > k\) with \(|Y_l| > 1\). Moving an element from \(Y_l\) to \(Y_k\) would strictly decrease the sum \(\sum _{i = 1}^n i(|Y_i| - 1)\) while preserving the constraints above. Thus \(\sum _{i = 1}^n i(|Y_i| - 1)\) is minimized, i.e., \(\sum _{i = 1}^n i(|Y_i| - 1) = 0\), only if \(|Y_i| = 1\) for all i. As we assumed \(\sigma \) to be not good, some \(|Y_i| \ne 1\), so \(\sum _{i = 1}^n i(|Y_i| -1) > 0\), contradicting the optimality of \(\sigma \). Thus we may conclude any optimal permutation for graphic matroid MLOP on H must also be a good permutation.

(c) Translation of optimal solutions for graphic matroid MLOP to MLVC: We have now argued that all optimal solutions of graphic matroid MLOP on H are good, i.e., the prefix ranks \(r(E_{\sigma (e),\sigma })\) for \(e \in E(T)\) are pairwise distinct. As every vertex of V(G) is incident with exactly one edge of E(T), this ordering of E(T) naturally induces a permutation on V(G). We claim such an ordering will be an optimal MLVC ordering on V(G).

Note as T is a star, all other edges of \(E(G) = E {\setminus } E(T) = E(H) {\setminus } E(T)\) each form a unique triangle with the edges of T. Let \(\pi : V(G) \rightarrow [n]\) such that \(\pi (v) = r(E_{(v,z),\sigma })\) where \((v,z) \in E(T)\). As \(\sigma \) is a good permutation, we have that \(\pi \in \mathcal {S}_{V(G)}\).

We claim that for all \(e = (u,v) \in E(G)\), we have \(r(E_{\sigma (e),\sigma }) = \max \{\pi (v), \pi (u)\}\) for any optimal permutation \(\sigma \in \mathcal {S}_{E}\). If \(r(E_{\sigma (e),\sigma }) > \max \{\pi (u), \pi (v)\}\), consider the ordering \(\sigma ^{\prime }\) in which e appears before the edge \(f \in E(T)\) where \(r(E_{\sigma (f),\sigma }) = \max \{\pi (u),\pi (v)\}\). This would strictly decrease the MLOP objective value, a contradiction. Now suppose \(r(E_{\sigma (e),\sigma }) < \max \{\pi (v),\pi (u)\}\), then the set

$$\begin{aligned} A = \{e\} \cup \{f^{\prime } : f^{\prime } \in E(T) \text { and } r(E_{\sigma (f^{\prime }),\sigma }) \le r(E_{\sigma (e),\sigma })\} \end{aligned}$$

is an independent set of size \(r(E_{\sigma (e),\sigma }) + 1\). Now let \(f \in A \setminus \{e\}\) be such that \(r(E_{\sigma (f),\sigma }) = r(E_{\sigma (e),\sigma })\). As either \(E_{\sigma (f),\sigma }\) or \(E_{\sigma (e),\sigma }\) contains A, we have a contradiction.

Thus we may conclude

$$\begin{aligned} \text {MLOP}(H,c,\sigma ) = \sum _{(u,v) \in E(G)} \max \{\pi (u), \pi (v)\} + \sum _{i = 1}^n i \cdot (9m^2 + 2). \end{aligned}$$

As c(E(H)) is bounded by a polynomial in m, by Lemma 4, the MLVC problem can be reduced in polynomial time to the graphic matroid MLOP. \(\square \)

6.3 Equivalence of MLVC and MSVC in decision form

The following theorem shows that solving an MLVC instance on a simple graph \(G=(V,E)\) where \(|V|=n\) is equivalent to solving an MSVC instance on its complement \(\overline{G}= (V, E(K_n){\setminus } E)\). Since MSVC is known to be NP-hard, by Theorem 8, this will imply graphic matroid MLOP is NP-hard.

Theorem 9

Let G be a simple graph on n vertices. For any labeling \(\pi \in \mathcal {S}_{V(G)}\), the MLVC objective on G corresponds to the MSVC objective on its complement graph \(\overline{G}\) with a linear shift, i.e.,

$$\begin{aligned} \sum _{(x,y) \in E(G)} \max \{\pi (x),\pi (y)\}&= (n^3 - n)/3 - (n+1)|E(\overline{G})| \\&\quad + \sum _{(x,y) \in E(\overline{G})} \min \{\pi '(x), \pi '(y)\}, \end{aligned}$$

where \(\pi ':= n+1 - \pi \in \mathcal {S}_{V(\overline{G})}\).

Proof

Note that every labeling of the vertices of the complete graph \(K_n\) gives the same MLVC objective value \(\sum _{i = 1}^n (i - 1)i = (n^3 - n)/3\), since the vertex labeled i is the larger endpoint of exactly \(i-1\) edges. It follows for all \(\pi \in \mathcal {S}_{V(G)}\),

$$\begin{aligned} \sum _{(x,y) \in E(G)} \max \{\pi (x),\pi (y)\} + \sum _{(x,y) \in E({\overline{G}})} \max \{\pi (x),\pi (y)\} = (n^3 - n)/3. \end{aligned}$$

This key observation in turn gives the equivalence between MLVC and MSVC as follows:

$$\begin{aligned} \sum _{(x,y)\in E(G)} \max \{ \pi (x), \pi (y)\}&= (n^3 - n)/3 - \sum _{(x,y) \in E(\overline{G})} \max \{ \pi (x), \pi (y)\} \\&= (n^3 - n)/3 + \sum _{(x,y) \in E(\overline{G})} \min \{ -\pi (x), -\pi (y)\}\\&= (n^3 - n)/3 -(n+1)|E(\overline{G})|\\&\quad + \sum _{(x,y) \in E(\overline{G})}\big ( (n + 1) + \min \{ -\pi (x), -\pi (y)\}\big )\\&= (n^3 - n)/3 -(n+1)|E(\overline{G})|\\&\quad + \sum _{(x,y) \in E(\overline{G})} \min \{ n + 1 -\pi (x),n + 1 -\pi (y)\} . \end{aligned}$$

As \(\pi ' = n + 1 - \pi \in {\mathcal {S}_{V(\overline{G})}}\), this completes the proof. \(\square \)
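The identity can be checked exhaustively on a small graph (the 4-vertex path below is our own example):

```python
from itertools import combinations, permutations

# Check Theorem 9's identity over all labelings of a 4-vertex path:
# MLVC(G, pi) = (n^3 - n)/3 - (n+1)|E(Gbar)| + MSVC(Gbar, pi'),
# where pi' = n + 1 - pi.
n = 4
V = tuple(range(n))
E = {(0, 1), (1, 2), (2, 3)}
E_bar = {e for e in combinations(V, 2) if e not in E}

checked = 0
for order in permutations(V):
    pi = {v: i + 1 for i, v in enumerate(order)}
    pi_p = {v: n + 1 - pi[v] for v in V}
    mlvc = sum(max(pi[x], pi[y]) for (x, y) in E)
    msvc = sum(min(pi_p[x], pi_p[y]) for (x, y) in E_bar)
    assert mlvc == (n**3 - n) // 3 - (n + 1) * len(E_bar) + msvc
    checked += 1
print(checked)
```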

As MSVC is NP-hard [26], we have that

Corollary 4

MLVC is NP-hard.

In Theorem 8, we have reduced any instance of MLVC to graphic matroid MLOP. Combining this with Corollary 4, we obtain the promised hardness result.

Theorem 2

Graphic matroid MLOP is NP-hard.

In Corollary 3, we showed that if matroid MLOP is NP-hard for a family of matroids, then it is NP-hard for the corresponding dual family as well. It follows,

Corollary 5

Cographic matroid MLOP is NP-hard.

6.4 Graphic matroid MLOP for cactus graphs is in P

We are now interested in further pushing the known boundaries of NP-hardness of graphic matroid MLOP; in particular, we show that graphic matroid MLOP can be solved in polynomial time on cactus graphs. To achieve this, we first introduce a new formulation of matroid MLOP which we believe to be of independent interest. In matroid MLOP we optimize over permutations of the ground set; in this new formulation, we first optimize over the bases of the matroid, and then over all permutations of the selected basis. Concretely, given a basis B of a matroid and a permutation \(\pi \in \mathcal {S}_{B}\), we construct an ordering \(\sigma \) by the following rule: for each \(e \not \in B\), find the minimal prefix set X of B (with respect to \(\pi \)) such that \(X \cup \{e\}\) is dependent, and place e anywhere after X but before the next element of B in \(\sigma \). If B and \(\pi \) are chosen appropriately, this always results in an optimal MLOP permutation. We now present this argument in detail.

Let \(M = (E,r)\) be a loopless matroid, let \(r(M) = k\) and let \(\sigma \in \mathcal {S}_{E}\) have optimal matroid MLOP value. By Lemma 3, there exists a partition of E, say \(X = \{X_i: 1 \le i \le k\}\) such that \(\bigcup _{i = 1}^j X_i\) is a flat for all \(1 \le j \le k\) and there exists a basis \(B = \{b_1, \ldots , b_k\}\) such that \(b_i \in X_i\). Furthermore, we have that if \(e \in X_i\) and \(e' \in X_{\ell }\) for \(i < \ell \), then \(\sigma (e) < \sigma (e')\).

We now observe how this partition \(\{X_1, \ldots , X_k\}\) interacts with the values of \(r(E_{\sigma (e),\sigma })\) for optimal \(\sigma \). For all \(e \in E {\setminus } B\), \(B + e\) has a unique circuit, \(C(B,e)\). As \(C(B,e) - e\) is an independent set, we have \(| \{r(E_{\sigma (e'),\sigma }): e' \in C(B,e) - e\}| = |C(B,e) - e|\). As \(C(B,e)\) is a dependent set and \(\bigcup _{i = 1}^j X_i\) is a flat for all \(1 \le j \le k\), we have that

$$\begin{aligned} r(E_{\sigma (e),\sigma }) = \max \{r(E_{\sigma (e'),\sigma }) : e' \in C(B,e) - e\}. \end{aligned}$$

Furthermore, as \(B = \{b_1, \ldots , b_k\}\) with \(b_i \in X_i\) for all \(1 \le i \le k\), there is a one-to-one correspondence between \(\{b_1, \ldots , b_k\}\) and \(\{r(E_{\sigma (e'),\sigma }): e' \in B\} = \{1, \ldots , r(M)\}\). With this in mind, we define for all \(\pi \in \mathcal {S}_{B}\) and fundamental circuits \(C(B,e)\), the set \(C(B,e)_{\pi }:= \{\pi (e'): e' \in C(B,e) - e\}\). Note that \(C(B,e)_{\pi }\) is the set of positions in the ordering \(\pi \) of the elements of \(C(B,e) - e\).

We now build a permutation \(\sigma \) of E as follows. First select a basis B of the matroid, and permutation \(\pi \in \mathcal {S}_{B}\). Given this ordering of basis elements, we create a linear extension \(\sigma \) of this order by ensuring that:

  • For all distinct \(b,b' \in B\), \(\sigma (b) < \sigma (b')\) if and only if \(\pi (b) < \pi (b')\);

  • For all \(e \in E \setminus B\), if \(\max C(B,e)_{\pi } = i\), then \(\sigma (\pi ^{-1}(i))< \sigma (e) < \sigma (\pi ^{-1}(i+1))\) (when \(i = k\), we only require \(\sigma (\pi ^{-1}(k)) < \sigma (e)\)).

This process always constructs a permutation \(\sigma \in \mathcal {S}_{E}\), and if the correct basis and \(\pi \) are chosen, will find the optimal matroid MLOP permutation. In particular,

Proposition 1

Matroid MLOP is equivalent to the following problem,

$$\begin{aligned} \min _{B \in \mathcal {B}(M)} \min _{\pi \in \mathcal {S}_{B}} \sum _{e \in E(M) \setminus B} \max C(B,e)_{\pi }. \end{aligned}$$

The characterization in Proposition 1 leads to a new class of matroids for which matroid MLOP is in P.
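Since the basis elements themselves contribute prefix ranks \(1, \ldots , k\), the two objectives differ by the additive constant \(\sum _{i=1}^k i\). The following brute-force sketch (our own toy graphic matroid, a triangle with a pendant edge) checks this:

```python
from itertools import combinations, permutations

E = ((0, 1), (1, 2), (0, 2), (2, 3))  # triangle plus pendant edge
k = 3                                  # rank of this graphic matroid

def rank(edges):
    # graphic matroid rank via union-find: count component-merging edges
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    r = 0
    for (u, v) in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru], r = rv, r + 1
    return r

def circuit_positions(B, e, pi):
    # positions (under pi) of basis elements on the fundamental circuit
    # C(B, e): the smallest D within B whose closure contains e
    for size in range(1, len(B) + 1):
        for D in combinations(B, size):
            if rank(D + (e,)) == len(D):
                return [pi[f] for f in D]

mlop_opt = min(
    sum(rank(s[:i + 1]) for i in range(len(s))) for s in permutations(E)
)
basis_opt = min(
    sum(max(circuit_positions(B, e, {f: i + 1 for i, f in enumerate(order)}))
        for e in E if e not in B)
    for B in combinations(E, k) if rank(B) == k
    for order in permutations(B)
)
assert mlop_opt == basis_opt + k * (k + 1) // 2
print(mlop_opt, basis_opt)
```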

Theorem 10

Let \(\mathcal {X}\) be a family of matroids such that for all \(M = (E,r_M) \in \mathcal {X}\) with \(|E| = m\), the number of bases of M is \(|\mathcal {B}(M)| \in O(g(m))\), and the rank of M is \(r_M(E) \in O(h(m))\), for some \(g, h: \mathbb {Z}_+ \rightarrow \mathbb {Z}_+\). Then, every matroid MLOP instance in \(\mathcal {X}\) can be solved in time \(O(g(m) \cdot poly(m,g(m)) \cdot (h(m))!)\). In particular, if g is polynomial in m and h is bounded by a constant, then matroid MLOP for \(\mathcal {X}\) is in P.

Proof

By Proposition 1, matroid MLOP for \(\mathcal {X}\) has the following formulation,

$$\begin{aligned} \min _{B \in \mathcal {B}(M)} \min _{\pi \in \mathcal {S}_{B}} \sum _{e \in E(M) \setminus B} \max C(B,e)_{\pi }. \end{aligned}$$

By [46], iterating over every basis requires \(poly(m,|\mathcal {B}(M)|)\) time. As \(|\mathcal {B}(M)| \le g(m)\) and \(|\mathcal {S}_{B}| \le (h(m))!\), simply iterating over every basis B and its corresponding permutations will solve matroid MLOP for \(\mathcal {X}\) in time \(O(g(m) \cdot poly(m,g(m)) \cdot (h(m))!)\). \(\square \)

We will now use Proposition 1 to solve graphic matroid MLOP for cactus graphs. We first argue that the choice of spanning tree is immaterial to finding an optimal solution on cactus graphs. We then order the edges greedily with respect to the lengths of the cycles of the graph to obtain an optimal solution.

Theorem 5

Given a simple cactus graph G, there is a polynomial time algorithm that solves graphic matroid MLOP on G.

Proof

Let G be a cactus graph. We may assume G is connected, as every graphic matroid M has a connected graph H such that \(M = M[H]\). Note as G is a cactus graph, each edge of G belongs to at most one cycle. Our algorithm is as follows:

  1. 1.

    Order the cycles by length in nondecreasing order, temporarily regarding a bridge as a cycle of infinite length.

  2. 2.

    Output any linear extension that respects this prior ordering. That is, first output all edges in the shortest cycle (in any order), followed by all edges in the next shortest cycle (in any order), and so on.

We now show its correctness. As bridges are coloops, a straightforward consequence of Lemma 2 implies bridges must come last in an optimal order. Thus, without loss of generality we may assume G is bridgeless as well. By Proposition 1, graphic matroid MLOP can be formulated as follows,

$$\begin{aligned} \min _{T \in \mathcal {B}(M)} \min _{\pi \in \mathcal {S}_{T}} \sum _{e \in E(M) \setminus T} \max C(T,e)_{\pi }. \end{aligned}$$

where T is a spanning tree of G, which, again for convenience, we regard as a set of edges. Note that, as G is a cactus graph, the set of fundamental circuits corresponds to the set of cycles of G, i.e., it does not depend on the choice of T. The algorithm to solve MLOP for G is now clear: first select an arbitrary spanning tree T, then order the cycles of G nondecreasingly with respect to their lengths, and finally choose a \(\pi \in \mathcal {S}_{T}\) that respects this ordering of the circuits. It is straightforward to verify that this ordering minimizes MLOP. \(\square \)
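A sketch of the algorithm above on a small cactus (our own example: a triangle and a 4-cycle sharing a vertex; we omit bridges for brevity), compared against brute-force graphic matroid MLOP:

```python
from itertools import permutations

cycles = [
    [(0, 1), (1, 2), (0, 2)],          # length-3 cycle
    [(2, 3), (3, 4), (4, 5), (2, 5)],  # length-4 cycle, sharing vertex 2
]
E = tuple(e for cyc in cycles for e in cyc)

def rank(edges):
    # graphic matroid rank via union-find
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    r = 0
    for (u, v) in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru], r = rv, r + 1
    return r

def mlop(order):
    return sum(rank(order[:i + 1]) for i in range(len(order)))

# Greedy: output edges cycle by cycle, shortest cycles first.
greedy = tuple(e for cyc in sorted(cycles, key=len) for e in cyc)
brute = min(mlop(s) for s in permutations(E))
assert mlop(greedy) == brute
print(mlop(greedy))
```

Putting the short cycle first makes its redundant edge appear as early as possible, which is exactly what the objective rewards.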

7 Approximations for minimum latency set cover (MLSC)

In Sect. 6, we introduced the MLVC problem in a series of reductions to show that graphic matroid MLOP is NP-hard. Here we study its more general version MLSC, introduced by Hassin and Levin in 2005 [27]. In Sect. 7.1 we present a randomized factor \((2-\frac{2}{1+\ell })\)-approximation algorithm for MLSC, based on techniques from scheduling theory, where \(\ell \) is the size of the largest input subset. This improves on the previously best-known factor of 2 for generic instances [25]. In particular, our result implies a randomized factor \(\frac{4}{3}\)-approximation algorithm for MLVC, improving upon Azar et al.'s result [25].

We also show that for \(\ell \)-uniform hypergraphs, the natural linear programming (LP) relaxation (see eq. (MLSC-LP)) has an integrality gap of at least \(2-\frac{2}{1+\ell }\). As a special case, we show that the integrality gap for MLVC is \(\frac{4}{3}\). This implies that any approximation algorithm for MLVC based on the rounding of the LP relaxation (without additional inequalities) cannot improve upon our result.

In Sect. 7.2, we explore families of instances where MLVC admits polynomial time algorithms. We show an equivalence, in decision form, between the minimum linear arrangement problem (MLA) and MLVC for regular graphs. As many classes of regular graphs have previously been studied, this yields exact polynomial time algorithms for MLVC on these families of instances, and by Theorem 9, for the MSVC problem on the graph complements of these families as well.

7.1 A randomized approximation algorithm for MLSC based on scheduling

Recall that minimum latency vertex cover is a special case of minimum latency set cover (MLSC). MLSC can be defined similarly to our notation for MLVC: instead of a graph, we are given a hypergraph \(H = (V,E)\), with the objective

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V(H)}} \sum _{e \in E(H)} \max _{v \in e} \pi (v). \end{aligned}$$

The state-of-the-art approximation for MLSC is a factor of 2, obtained via a reduction to a well studied problem in scheduling theory known as \(1|\text {prec}| \sum w_j C_j\), or (single machine) minimum sum scheduling with precedence constraints, which is defined as follows. The input consists of a set of jobs J, with corresponding processing times \(\{p_j\}_{j \in J}\) and weights \(\{w_j\}_{j \in J}\), along with a partially ordered set (poset) P over the jobs. We have a single machine that takes \(p_j\) amount of time to process job j. A feasible schedule is one that processes job j earlier than job \(j'\) whenever \(j <_P j'\) in the poset. The objective is to minimize the weighted sum of completion times, \(\sum _j w_j C_j\), where each \(C_j\) is the completion time of job j, and is uniquely determined by the schedule and the processing times.

MLSC has been known to be reducible to single machine minimum sum scheduling with precedence constraints since 2005 [27], using a simple construction as follows. For every vertex \(v \in V\), consider a job \(v \in J\) with processing time \(p_v = 1\) and weight \(w_v = 0\). For every hyperedge \(e \in E\), consider a job \(e \in J\) with processing time \(p_e = 0\) and weight \(w_e = 1\). The poset P over the set of jobs \(J = V \cup E\) is defined by all pairs \(v <_P e\) such that \(v \in V, e \in E\), and \(v \in e\). For convenience, we also let \(e <_P e'\) for any distinct hyperedges \(e,e' \in E\) with \(e \subsetneq e'\). Furthermore, parallel copies of the same hyperedge in E are ordered as a chain in P in some arbitrary manner. It is easy to verify that the objective of this scheduling problem is equal to that of the original MLSC. Moreover, the reduction is approximation preserving, i.e., an \(\alpha \)-approximate solution to the scheduling instance gives an \(\alpha \)-approximate solution to MLSC [27].
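The construction can be sketched and sanity-checked as follows (the toy hypergraph is our own; hyperedge jobs are inserted greedily as soon as their vertices complete, which is optimal here since they have zero processing time):

```python
from itertools import permutations

# Vertices become unit-time weight-0 jobs; hyperedges become zero-time
# weight-1 jobs preceded by their vertices. The weighted completion-time
# sum of the induced schedule equals the MLSC cost of the labeling.
V = (0, 1, 2, 3)
E = (frozenset({0, 1}), frozenset({1, 2, 3}))

def mlsc(pi):
    return sum(max(pi[v] for v in e) for e in E)

def schedule(order):
    # process vertex jobs in the given order; insert each hyperedge job
    # (p = 0, w = 1) as soon as all of its vertex jobs have completed
    t, total, done, pending = 0, 0, set(), list(E)
    for v in order:
        t += 1                 # vertex job v: p = 1, w = 0
        done.add(v)
        for e in [e for e in pending if e <= done]:
            total += t         # job e completes at time t, weight 1
            pending.remove(e)
    return total

for order in permutations(V):
    pi = {v: i + 1 for i, v in enumerate(order)}
    assert mlsc(pi) == schedule(order)
print("objectives agree on all orders")
```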

Note that the 2-approximability of MLSC is immediate, using various 2-approximations for scheduling [5,6,7]. Furthermore, by Proposition 6, MLSC is an instance of monotone submodular MLOP. Thus the \((2 - 2/(|E| + 1))\) approximation of [1] is applicable in this case as well. A constant factor better than 2 for all instances seems unlikely, considering hardness results for the scheduling problem [47], or the vertex cover problem that it reduces to [48,49,50]. We instead show an instance-dependent improvement parameterized by the maximum size of the subsets. We achieve this result by studying the dimension of the poset and its fractional dimension (e.g., studied by [8, 51] in the context of scheduling). In the rest of this section, we prove Theorem 3 using the scheduling algorithm of [8].

Theorem 3

There is a randomized polynomial time algorithm that approximates MLSC within factor \(2-\frac{2}{1+\ell }\), where \(\ell \) is the maximum cardinality among all hyperedges of H.

We now define the fractional dimension of a poset, which was introduced by [52]. A poset \(P' (<_{P^\prime })\) is an extension of a poset \(P (<_P)\) if \(x <_P y\) implies \(x <_{P'} y\); an extension \(P'\) is linear if every pair of distinct elements is comparable, i.e., \(x \ne y\) implies \(x <_{P'} y\) or \(y <_{P'} x\). It is easy to see that the set of feasible solutions for the single machine scheduling problem is exactly the set of linear extensions of the corresponding poset. Let \(\mathcal {F} = \{\mathcal {L}_1, \cdots , \mathcal {L}_t\}\) be a multiset of linear extensions of P. \(\mathcal {F}\) is a k-fold realizer of P if for every incomparable pair (x, y) of P, there are at least k linear extensions in \(\mathcal {F}\) in which \(y < x\). The fractional dimension of P is defined as \(\lim _{k\rightarrow \infty }\frac{t}{k}\), where t is the size of a minimum k-fold realizer (note that the fractional dimension of a poset is at least 2 unless it is a linear order). Ambühl et al. [8] showed that \(1|\text {prec}| \sum w_j C_j\) can be \((2-\frac{2}{f})\)-approximated, where f upper bounds the fractional dimension of the corresponding poset. Specifically, they proved the following.

Theorem 11

[8] Given an efficient sampling algorithm for a k-fold realizer of P, of size t (that is, to output each of the \(\mathcal {L}_i\)’s with probability at least 1/t), the problem \(1|\text {prec}| \sum w_j C_j\) has a randomized approximation algorithm of factor \(2-\frac{2}{t/k}\).

Given an oracle that outputs a random linear extension \(P^\prime \) of P such that \( \text{ Pr}_{}\left[ j <_{P^\prime } i \right] \ge b \) for every pair of incomparable jobs (i, j) in P, Theorem 11 gives a \((2-2b)\)-approximate solution to the corresponding \(1|\text {prec}| \sum w_j C_j\). Let us call the sampling algorithm provided to the above theorem a \(\frac{k}{t}\)-balanced linear ordering oracle for P. We show that it is easy to construct a \(\frac{1}{1+\ell }\)-balanced linear ordering oracle for posets arising from the reformulation of MLSC as scheduling. This results in a \((2-\frac{2}{1+\ell })\)-approximation algorithm for MLSC, using the result of Ambühl et al. (Theorem 11).

Lemma 5

Consider an arbitrary MLSC problem defined over a hypergraph \(H = (V, E)\). Let P be the poset obtained from the reformulation of the MLSC instance as a scheduling problem. Then, P admits a \(\frac{1}{1+\ell }\)-balanced linear ordering oracle, where \(\ell \) is the maximum size of any hyperedge in MLSC.

Proof

Consider the following randomly constructed linear extension of the poset P: pick a uniformly random ordering \(\{v_{l_1}, v_{l_2}, \ldots , v_{l_n}\}\) of the vertices V and let them appear in the schedule in this order. Schedule each hyperedge \(e\in E\) as soon as all of its incident vertices have been scheduled, breaking ties between concurrently schedulable hyperedges at random. It is easy to see that this random scheduling order leads to a valid linear extension, satisfying all precedence constraints of P. Call this linear extension \(P^\prime \).

Now, we claim that any random order obtained above satisfies that, for two incomparable jobs (i, j) of P, the probability of \(j <_{P^\prime } i\) is at least \(\frac{1}{1+\ell }\). For a pair of vertices this holds trivially, as \(\text{ Pr}_{}\left[ u <_{P^\prime } v \right] = 0.5 \ge \frac{1}{1+\ell }\) for all distinct vertices u and v. For an incomparable pair consisting of a vertex and a hyperedge, we overload notation and treat the vertex as a hyperedge of size 1. It therefore suffices to show the inequality for any two distinct incomparable hyperedges \(e, e'\).

Let \(a = |e {\setminus } e'|\), let \(b = |e' {\setminus } e|\), and let \(c = |e \cap e'|\). Note that \(a, b > 0\), otherwise one edge is a subset of another, i.e., they are not incomparable. We compute \(\text{ Pr}_{}\left[ e <_{P'} e' \right] \) conditioning on the last vertex of \(e \cup e'\) with respect to the random permutation. Call this last vertex \(v_{e,e'}\).

$$\begin{aligned} \text{ Pr}_{}\left[ e<_{P^\prime } e' \right]&= \text{ Pr}_{}\left[ e<_{P^\prime } e' | v_{e,e'} \in e \setminus e' \right] \cdot \text{ Pr}_{}\left[ v_{e,e'} \in e \setminus e' \right] \\&\quad + \text{ Pr}_{}\left[ e<_{P^\prime } e' | v_{e,e'} \in e \cap e' \right] \cdot \text{ Pr}_{}\left[ v_{e,e'} \in e \cap e' \right] \\&\quad + \text{ Pr}_{}\left[ e <_{P^\prime } e' | v_{e,e'} \in e' \setminus e \right] \cdot \text{ Pr}_{}\left[ v_{e,e'} \in e' \setminus e \right] \\&= 0 \cdot \frac{a}{a+b+c} + \frac{1}{2} \cdot \frac{c}{a+b+c} + 1 \cdot \frac{b}{a+b+c} \\&= \frac{b + c/2}{a+b+c}. \end{aligned}$$

We will now use the following well-known inequality: for positive numbers \(\alpha , \beta , \gamma , \delta \) such that \(\alpha /\beta < \gamma /\delta \), we have \(\frac{\alpha }{\beta }< \frac{\alpha +\gamma }{\beta +\delta } < \frac{\gamma }{\delta }\). If \(c = 0\), we have \(\text{ Pr}_{}\left[ e <_{P^\prime } e' \right] = \frac{b}{a+b} \ge \frac{1}{1+\ell }\). Suppose \(c > 0\), then we can write \(\text{ Pr}_{}\left[ e <_{P^\prime } e' \right] = \frac{b + c/2}{a+b+c} \ge \min \{ \frac{b}{a+b}, \frac{c/2}{c} \}\). Considering that \(\frac{b}{a+b}\) is minimized at \(\frac{1}{1 + \ell }\) subject to the constraints \(1 \le a,b \le \ell \), we have the desired lower bound on \(\text{ Pr}_{}\left[ e <_{P^\prime } e' \right] \) in both cases. \(\square \)

Therefore, we get a \(\frac{1}{1+\ell }\)-balanced linear ordering oracle for the MLSC's scheduling reformulation, which ultimately gives us a \((2-\frac{2}{1+\ell })\)-approximation algorithm for MLSC.
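The closed form \(\text{ Pr}_{}\left[ e<_{P^\prime } e' \right] = \frac{b + c/2}{a+b+c}\) derived in the proof of Lemma 5, together with the \(\frac{1}{1+\ell }\) bound, can be checked mechanically. The following Python sketch (an illustration of ours, not part of the algorithm; the helpers `order_prob` and `check_balance` are hypothetical names) enumerates all admissible values of \(a = |e\setminus e'|\), \(b = |e'\setminus e|\), and \(c = |e\cap e'|\) for hyperedges of size at most \(\ell \), using exact rational arithmetic.

```python
from fractions import Fraction

def order_prob(a, b, c):
    """Pr[e <_{P'} e'] from the proof of Lemma 5, where a = |e \\ e'|,
    b = |e' \\ e|, c = |e ∩ e'|: equals (b + c/2) / (a + b + c)."""
    return Fraction(2 * b + c, 2 * (a + b + c))

def check_balance(ell):
    """Verify Pr[e <_{P'} e'] >= 1/(1+ell) over all incomparable pairs
    of hyperedges of size at most ell (incomparability forces a, b >= 1)."""
    bound = Fraction(1, 1 + ell)
    for a in range(1, ell + 1):
        for b in range(1, ell + 1):
            # |e| = a + c <= ell and |e'| = b + c <= ell.
            for c in range(0, ell + 1 - max(a, b)):
                if order_prob(a, b, c) < bound:
                    return False
    return True

assert all(check_balance(ell) for ell in range(1, 9))
```

The bound is tight at \(a=\ell \), \(b=1\), \(c=0\), i.e., when e has size \(\ell \) and \(e'\) is a disjoint hyperedge of size 1.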

Integrality Gap for \(\ell \)-uniform MLSC:

Next, we consider the relaxed linear program for MLSC on \(\ell \)-uniform hypergraphs on n vertices, i.e., hypergraphs in which every hyperedge has size \(\ell \). When constrained to be integral, the variable \(u_{e,t}\) indicates whether hyperedge e is still uncovered (from the MLSC perspective) at time t, and \(x_{v,t}\) indicates whether vertex v is scheduled at time step t.

$$\begin{aligned} \textsc {(MLSC-LP)}~~~ \text {minimize} \quad \sum \limits _{e,t} u_{e,t} \nonumber \\ \text {subject to} \qquad \sum \limits _{v} x_{v,t}&\le 1, \qquad \forall ~ t \in \{1, \ldots , n\}, \end{aligned}$$
(1)
$$\begin{aligned} u_{e,t} + \sum _{t'<t} x_{v,t^{\prime }}&\ge 1, \qquad \forall ~ v,e,t \text { s.t. } v \in e, \end{aligned}$$
(2)
$$\begin{aligned} u_{e,t},\ x_{v,t}&\ge 0, \qquad \forall ~ e,v,t. \end{aligned}$$
(3)

The constraints (1) and (2), respectively, ensure that at most one vertex is scheduled during each time step, and that every hyperedge remains uncovered until all of its vertices are scheduled, i.e., \(u_{e,t}\) is 0 only if every \(v \in e\) is scheduled strictly before time t.

First we show a lower bound of \(2-\frac{2}{1+\ell }\) on the integrality gap, matching the approximation factor of Theorem 3.

Proposition 2

The integrality gap of the LP relaxation for MLSC on \(\ell \)-uniform hypergraphs is at least \(2-\frac{2}{1+\ell }\).

Proof

Consider the complete \(\ell \)-uniform hypergraph on n vertices. By a well-known binomial coefficient identity, every ordering of the vertices attains the optimal objective of the combinatorial problem, which can be shown to be

$$\begin{aligned} \sum _{k = \ell }^{n} k\binom{k-1}{\ell -1} = \sum _{k = \ell }^{n} \ell \binom{k}{\ell } =\ell \binom{n+1}{\ell +1}. \end{aligned}$$

For \(\ell \)-uniform instances, the MLSC-LP objective can be upper bounded by the uniform fractional solution \(x_{v,t} = \frac{1}{n}\) and \(u_{e,t} = 1 - \frac{t-1}{n}\) for all v, e, and t. It follows that

$$\begin{aligned} \sum _{e,t} u_{e,t} = |E| \cdot \Bigg (\sum _{t = 1}^{n} \Big (1 - \frac{t-1}{n}\Big ) \Bigg ) = \binom{n}{\ell } \cdot \frac{n+1}{2}=\frac{\ell +1}{2}\binom{n+1}{\ell +1}. \end{aligned}$$

Thus, this family of examples provides a lower bound of \(\frac{2\ell }{\ell +1}=2-\frac{2}{1+\ell }\) for the integrality gap. \(\square \)
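Both the binomial identity and the resulting gap ratio can be verified numerically. The following Python sketch is a sanity check of ours; the parameters \(n=12\), \(\ell =4\) are arbitrary small choices.

```python
from math import comb

n, ell = 12, 4   # arbitrary small parameters

# Combinatorial optimum on the complete ell-uniform hypergraph:
# sum over hyperedges of the largest label among their vertices.
opt = sum(k * comb(k - 1, ell - 1) for k in range(ell, n + 1))
assert opt == ell * comb(n + 1, ell + 1)            # the binomial identity

# Value of the uniform fractional solution u_{e,t} = 1 - (t-1)/n.
lp = comb(n, ell) * sum(1 - (t - 1) / n for t in range(1, n + 1))
assert abs(lp - (ell + 1) / 2 * comb(n + 1, ell + 1)) < 1e-6

# The ratio matches the claimed integrality-gap lower bound.
assert abs(opt / lp - (2 - 2 / (1 + ell))) < 1e-9
```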

The integrality gap of MLSC-LP is therefore at least \(2 - \frac{2}{1+\ell }\), but it can be larger for certain families of hypergraphs. We end this section by showing that the integrality gap of the MLSC-LP is exactly \(2-\frac{2}{1+\ell }\) for \(\ell \)-uniform hypergraphs in which every vertex has degree exactly d; we call these d-regular \(\ell \)-uniform hypergraphs. We do not know if the integrality gap for non-regular uniform hypergraphs is strictly larger than \(2-\frac{2}{1+\ell }\).

Proposition 3

Let H be any d-regular \(\ell \)-uniform hypergraph with n vertices. Then the integrality gap for H is at most \(2-\frac{2}{1+\ell }\).

Proof

We first show that the MLSC-LP has optimal objective value \(\frac{dn(n+1)}{2\ell }\) for any d-regular \(\ell \)-uniform hypergraph with n vertices. For each fixed \(1\le t\le n\), summing constraint (2) over all \(e \in E\) and all \(v\in e\), we have:

$$\begin{aligned} {\ell }\sum _{e}u_{e,t}&=\sum _e \sum _{v\in e} u_{e,t} \end{aligned}$$
(4)
$$\begin{aligned}&\overset{(2)}{\ge }\ \sum _e \sum _{v\in e} \left( 1-\sum _{t'<t} x_{v,t'} \right) \end{aligned}$$
(5)
$$\begin{aligned}&=dn-d\sum _{t'<t}\sum _{v}x_{v,t'}\end{aligned}$$
(6)
$$\begin{aligned}&\overset{(1)}{\ge }\ dn-d(t-1), \, {\text { for all }1\le t\le n.} \end{aligned}$$
(7)

Now, summing (7) over t from 1 to n, we have:

$$\begin{aligned} \sum _{e,t}u_{e,t}\ge {\frac{1}{\ell }}\sum _{t=1}^n (dn-d(t-1))={\frac{dn(n+1)}{2\ell }}. \end{aligned}$$

It is easy to see that this objective value is achieved by letting \(x_{v,t}=\frac{1}{n}\) and \(u_{e,t} = 1 - \frac{t-1}{n}\) for all evt, as this makes all inequalities satisfied with equality.

Now consider the MLSC problem. Using randomized rounding (e.g., [53]), we show there exists a permutation with objective value at most \(2-\frac{2}{1+\ell }\) times the LP optimal value. Let \(\pi \) be a uniformly random permutation of the vertices, i.e., \(\pi (v)=k\) with probability 1/n for all \(1\le k\le n\). Then, for any hyperedge e we have

$$\begin{aligned} {\mathbb {E}}_{}\left[ \max \{\pi (v),v\in e\} \right] =\frac{1}{\binom{n}{\ell }} \sum _{k=\ell }^n k\binom{k-1}{\ell -1}=\frac{\ell \binom{n+1}{\ell +1}}{\binom{n}{\ell }}=\frac{\ell (n+1)}{\ell +1}. \end{aligned}$$

Thus, by linearity of expectation, the expectation of the objective value for MLSC is

$$\begin{aligned} \frac{dn}{\ell }\,{\mathbb {E}}_{}\left[ \max \{\pi (v),v\in e\} \right] =\frac{dn(n+1)}{\ell +1}. \end{aligned}$$

Therefore, there exists a permutation with objective value at most \(\frac{dn(n+1)}{\ell +1}\), which is \(2-\frac{2}{1+\ell }\) times the LP optimal value. \(\square \)
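The closed form for \({\mathbb {E}}\left[ \max \{\pi (v),v\in e\} \right] \) used above can be confirmed by exhaustive enumeration for small parameters. The following Python sketch (an illustrative check of ours; `expected_max_label` is a hypothetical helper) averages the maximum label of a fixed size-\(\ell \) hyperedge over all permutations of n vertices.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def expected_max_label(n, ell):
    """Exact E[max_{v in e} pi(v)] for a fixed hyperedge e of size ell,
    under a uniformly random permutation pi of n vertices."""
    total = sum(max(p[:ell]) for p in permutations(range(1, n + 1)))
    return Fraction(total, factorial(n))

# Matches the closed form ell*(n+1)/(ell+1) for all small cases.
for n in range(2, 7):
    for ell in range(1, n + 1):
        assert expected_max_label(n, ell) == Fraction(ell * (n + 1), ell + 1)
```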

7.2 Polynomially solvable instances of MLVC and MSVC

We next discuss classes of instances of MLVC and MSVC that can be solved in polynomial time. The following theorem relates the objective value of MLA with MLVC for the family of regular graphs.

Theorem 12

Let G be a d-regular graph on n vertices. For any labeling \(\pi \in \mathcal {S}_{n}\), we have

$$\begin{aligned} 2\cdot \sum _{(x,y) \in E(G)} \max \{\pi (x),\pi (y)\} = d\binom{n + 1}{2} + \sum _{(x,y) \in E(G)} |\pi (x) - \pi (y)|. \end{aligned}$$

Proof

We have that,

$$\begin{aligned} \sum _{(x,y) \in E(G)} |\pi (x) - \pi (y)|&= \sum _{(x,y) \in E(G)} \left[ 2 \cdot \max \{\pi (x) , \pi (y)\} - \pi (x) - \pi (y) \right] \\&= -d\sum _{v \in V(G)} \pi (v) + 2 \cdot \sum _{(x,y) \in E(G)} \max \{\pi (x), \pi (y)\}\\&= -d\sum _{i = 1}^n i + 2 \cdot \sum _{(x,y) \in E(G)} \max \{\pi (x), \pi (y)\}\\&= -d\binom{n + 1}{2}+ 2 \cdot \sum _{(x,y) \in E(G)} \max \{\pi (x), \pi (y)\}. \end{aligned}$$

\(\square \)
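Theorem 12 is easy to confirm computationally on a small d-regular instance. The following Python sketch (our own illustration; the 5-cycle \(C_5\) is an arbitrary 2-regular example) checks the identity for every labeling.

```python
from itertools import permutations
from math import comb

# A small d-regular example: the 5-cycle C_5 (d = 2, n = 5).
n, d = 5, 2
edges = [(i, (i + 1) % n) for i in range(n)]

# Check the identity of Theorem 12 for every labeling pi.
for perm in permutations(range(1, n + 1)):
    pi = dict(zip(range(n), perm))
    lhs = 2 * sum(max(pi[x], pi[y]) for x, y in edges)
    rhs = d * comb(n + 1, 2) + sum(abs(pi[x] - pi[y]) for x, y in edges)
    assert lhs == rhs
```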

By Theorem 12, MLA and MLVC for regular graphs are equivalent in decision form. As the family of regular graphs is closed under graph complements, Theorem 9 implies that MSVC and MLVC are also equivalent in decision form for this family. Thus, we have the following.

Corollary 6

For the family of regular graphs, MLA, MLVC, and MSVC are equivalent in decision form.

As an illustration of the utility of Theorem 12, we introduce the Hamming graphs H(d, c), obtained as the Cartesian product of d copies of the complete graph \(K_c\). Motivated by the design of error-correcting codes, Harper [17] solved the MLA problem for hypercubes, i.e., H(d, 2) for any positive integer d. Later, Nakano [54] generalized this result to all Hamming graphs H(d, c) for positive integers d and c. As Hamming graphs are regular, we have the following corollary of Theorem 12.

Corollary 7

MLVC is polynomial time solvable for Hamming graphs.

The literature on the MLA problem is vast, and many other instances of regular graphs have been solved. Thus Theorem 12, while simple, provides a powerful tool for obtaining polynomial time algorithms for many families of regular graphs, including toroidal grids [55], complete p-partite graphs [56], and de Bruijn graphs of order 4 [57]. This list is by no means exhaustive, and we refer the reader to the surveys [9,10,11,12] for further reading. Furthermore, by Theorem 9, the complements of these families admit polynomial time algorithms for the MSVC problem.

8 Improved approximation for monotone submodular MLOP

Monotone submodular MLOP was introduced by Iwata et al. [1], who provided a factor \((2-\frac{2}{1+|E|})\)-approximation algorithm using the Lovász extension of submodular functions. Fokkink et al. [16] studied the submodular search problem, which generalizes monotone submodular MLOP, and gave an approximation factor based on the total curvature of the submodular function; they considered the greedy contraction of the principal partition induced by the submodular function, an idea used as early as 1992 by Pisaruk [15]. It was not known whether a tighter approximation was possible. In this section, we give a different analysis of the same algorithm and improve the approximation factor to

$$\begin{aligned} 2-\frac{1+\ell _f}{1+|E|} \text { where } {\ell _f=\frac{f(E)}{\max _{x\in E} f(\{x\})}}. \end{aligned}$$

Our result can be applied to special cases including matroid and graphic matroid MLOP. For general matroid MLOP, our approximation factor is \(2-\frac{1+r(E)}{1+|E|}\), which is strictly smaller than 2 when \(r(E)=\varOmega (|E|)\) (e.g., graphic matroid on sparse graphs). Note that both approximation factors given by [16] based on total curvature and [1] based on Lovász extension are asymptotically 2 for all non-trivial instances of matroid MLOP.

Throughout this section, let E be a nonempty set of size m and \(f:2^E\rightarrow \mathbb {R}\) be a normalized (\(f(\emptyset )=0\)) monotone submodular set function. Without loss of generality, we can also assume that the maximal minimizer of the submodular function is the empty set, i.e., \(f(S)>0\) for all \(S\ne \emptyset \). Recall from Sect. 4 that the steepness of a set function f is defined as \(\kappa _{f}=\max _{x\in E}f(\{x\})\), and the linearity of f is \(\ell _{f}=\frac{f(E)}{\kappa _{f}}\). By submodularity and monotonicity of f, for all \(S\subseteq T\) we have \(f(T)\le f(S)+\kappa _{f} |T{\setminus } S|\).

Note for any non-trivial (i.e., \(f(E)>0\)) normalized monotone submodular function \(f:2^E\rightarrow \mathbb {R}\), we have \(1\le \ell _{f}\le |E|\). Both bounds are tight: the lower bound \(\ell _{f}=1\) is attained when f is the rank function of the graphic matroid on 2 vertices with |E| parallel edges between them, while the upper bound \(\ell _{f}=|E|\) is attained when \(f(S)=|S|\) for all \(S\subseteq E\). Thus, the linearity \(\ell _{f}\) measures how uniform and linear a submodular function is. The function has high linearity if each singleton has approximately the same function value and the function is approximately linear, i.e., all submodular relations \(f(S)+f(T)\ge f(S\cap T)+f(S\cup T)\) are close to being tight. In the special case where f is the rank function of some matroid, we have \(\kappa _{f}=1\) and \(\ell _{f}=f(E)\) (the rank of the matroid).
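Both extreme cases of the linearity \(\ell _{f}\) can be checked directly. The following Python sketch is an illustration of ours; the helper `linearity` is a hypothetical name mirroring the definitions of Sect. 4.

```python
def linearity(f, ground):
    """ell_f = f(E) / max_x f({x}) for a normalized monotone submodular f
    (helper of ours, mirroring the definitions in Sect. 4)."""
    kappa = max(f(frozenset([x])) for x in ground)   # steepness kappa_f
    return f(frozenset(ground)) / kappa

E = frozenset(range(6))

# Rank of the graphic matroid on two vertices with |E| parallel edges:
# every nonempty edge set has rank 1, so ell_f attains its lower bound.
rank_parallel = lambda S: min(len(S), 1)
assert linearity(rank_parallel, E) == 1

# The cardinality function f(S) = |S| attains the upper bound ell_f = |E|.
assert linearity(len, E) == len(E)
```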

Fig. 2

Diagram of our lower and upper bounds in grey, as well as the optimal solution in red. The black circles represent the principal partition

In this section, we show a \((2-\frac{1+\ell _{f}}{1+|E|})\)-approximation factor to monotone submodular MLOP using any linear extension of the principal partition with respect to the submodular function. Recall that a principal partition is a set of nested sets \(\emptyset =\varPi _0\subsetneq \ldots \subsetneq \varPi _s=E\) (\(s\ge 1\)) and a set of critical values \(\lambda _0<\lambda _1< \ldots <\lambda _{s+1}\), such that for all \(0\le i\le s\), \(\varPi _i\) is the unique maximal optimal solution to \(\min _{X\subseteq E} f(X)-\lambda |X|\), for all \(\lambda \in (\lambda _i,\lambda _{i+1})\) (Sect. 4).

Theorem 13

Let \(\{\varPi _i\}_{0\le i\le s}\) be the principal partition of a non-trivial monotone submodular function \(f:2^E \rightarrow \mathbb {R}\) satisfying \(f(\emptyset )=0\). Let \(\kappa _{f}=\max _{x\in E}f(\{x\})\) and \(\ell _{f}=\frac{f(E)}{\kappa _{f}}\). Let \(\sigma \in \mathcal {S}_{E}\) be any linear extension of the principal partition, i.e., \(E_{|\varPi _i|,\sigma } = \varPi _i\) for all \(1\le i\le s\). Then, the MLOP objective value of \(\sigma \) is at most factor \(2-\frac{1+\ell _{f}}{1+|E|}\) of the optimal solution.

Since \(1\le \ell _{f}\le |E|\), our result is a refinement of the \(2-\frac{2}{1+|E|}\) factor approximation of monotone submodular MLOP in [1]. For the lower bound, our key lemma (Lemma 7) is a more general version of the well-known fact (see [14, 16, 58]) that any member of the principal partition \(\varPi _i\) is the "sparsest" subset in the \(\varPi _{i-1}\)-contracted submodular function \(f_{|\varPi _{i-1}}\), i.e., \(\frac{f(S)-f(\varPi _{i-1})}{|S|-|\varPi _{i-1}|}\ge \frac{f(\varPi _i)-f(\varPi _{i-1})}{|\varPi _i|-|\varPi _{i-1}|}\) for all \(S\supsetneq \varPi _{i-1}\). We show, in Lemma 7, that this bound extends, in product form, to all subsets S, allowing us to lower bound the MLOP value of an arbitrary chain. For the upper bound, we consider any MLOP solution that is a linear extension of the principal partition. The increase of the function value can be upper bounded using the steepness \(\kappa _f\) of the submodular function f, as well as the function values at the principal partitions.

See Fig. 2 for an illustration. The horizontal axis denotes the sizes of subsets appearing in an MLOP solution, and the vertical axis denotes the cost that these subsets incur in the MLOP objective. The coordinates of the black circles are the sizes and costs of the principal partitions. Between two adjacent black circles in the figure, the lower bound is the linear segment joining them, and the upper bound is formed using two linear segments, the first with positive slope \(\kappa _f\) and the second with slope 0. The red points represent subsets in an optimal MLOP solution, and we show that they always lie inside the triangular shaded regions formed by the lower and upper bounds. In particular, the principal partitions must appear in any optimal MLOP solution, which is also a consequence of Theorem 1 in [16].

The proofs of the lower and upper bounds are highly algebraic, and many calculations are deferred to the appendix. One challenge for the upper bound is that the difference between the function values of two adjacent subsets in the principal partition may not be an integer multiple of \(\kappa _f\); thus additional steps are needed to deal with rounding as the upper bound approaches each horizontal segment.

8.1 Lower and upper bound on MLOP objective value

Consider a monotone submodular function \(f:2^E\rightarrow \mathbb {R}\) satisfying \(f(S)=0\) if and only if \(S=\emptyset \), its principal partition \(\{\varPi _i\}_{0\le i\le s}\) and the corresponding critical values \(\{\lambda _i\}_{1\le i\le s}\) (Sect. 4). The following lemma gives the relationship between the critical values and the principal partition [14].

Lemma 6

The principal partition \(\{\varPi _i\}_{0\le i\le s}\) and corresponding critical values \(\{\lambda _i\}_{1\le i\le s}\) satisfy the following relation:

$$\begin{aligned} \lambda _i=\frac{f(\varPi _i)-f(\varPi _{i-1})}{|\varPi _i|-|\varPi _{i-1}|}, \text { for all } 1\le i\le s. \end{aligned}$$

Furthermore, \(\varPi _{i-1}\) and \(\varPi _i\) are the unique minimal and maximal minimizers of \(\min _{X\subseteq E} f(X) - \lambda _i |X|\).

We include a proof of Lemma 6 in Appendix A.2 for completeness. It simply uses the definition of the principal partition and submodularity of the set function.

The following lemma gives a lower bound on the function value of any subset. As mentioned before, this lemma is more general than the statement that \(\varPi _i\setminus \varPi _{i-1}\) is the unique maximal sparsest subset with respect to \(f_{|\varPi _{i-1}}\), the \(\varPi _{i-1}\)-contracted submodular function.

Lemma 7

Let \(f:2^E \rightarrow \mathbb {R}\) be a normalized monotone submodular function with \(f(S)>0\) if \(S\ne \emptyset \), and principal partition \(\{\varPi _i\}_{0\le i\le s}\). Let \(S\subseteq E\), then

$$\begin{aligned} {f(S)-f(\varPi _{i-1})\ge \frac{f(\varPi _i)-f(\varPi _{i-1})}{|\varPi _i|-|\varPi _{i-1}|}\,\big (|S|-|\varPi _{i-1}|\big ),} \end{aligned}$$

for all \(1\le i\le s\). In particular, \(\frac{f(S)-f(\varPi _{i-1})}{|S|-|\varPi _{i-1}|}\ge \frac{f(\varPi _i)-f(\varPi _{i-1})}{|\varPi _i|-|\varPi _{i-1}|}\) whenever \(|S| > |\varPi _{i-1}|\).

Proof

One can simply fix an arbitrary critical value \(\lambda _i\) and use the fact that \(f(\varPi _{i-1})-\lambda _i |\varPi _{i-1}|\le f(S)-\lambda _i |S|\) for any \(S \subseteq E\), since \(\varPi _{i-1}\) is a minimizer of \(f(X)-\lambda _i|X|\) (Lemma 6). Rearranging terms, we get \(f(S)-f(\varPi _{i-1})\ge \lambda _i\big (|S|-|\varPi _{i-1}|\big )\). Substituting the value \(\lambda _i =\frac{f(\varPi _i)-f(\varPi _{i-1})}{|\varPi _i|-|\varPi _{i-1}|}\) (Lemma 6) gives the desired result. \(\square \)
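For small ground sets, the principal partition and the lower bound of Lemma 7 (in product form, which avoids sign issues when \(|S|<|\varPi _{i-1}|\)) can be verified by brute force. The following Python sketch uses a toy instance of our own choosing: a coverage function with three parallel hyperedges, which makes the chain non-trivial. It builds the chain greedily via maximal sparsest supersets, checks that the critical values increase, and checks the bound for every subset.

```python
from fractions import Fraction
from itertools import combinations

# Toy instance (ours): coverage function of a multiset of hyperedges
# on V = {0, 1, 2, 3}; three parallel copies of {0, 1} make the chain
# of the principal partition non-trivial.
V = frozenset(range(4))
hyperedges = [{0, 1}, {0, 1}, {0, 1}, {2, 3}, {0, 2}]

def f(S):
    """Number of hyperedges intersecting S (monotone submodular, f(empty)=0)."""
    return sum(1 for e in hyperedges if e & S)

subsets = [frozenset(c) for r in range(len(V) + 1) for c in combinations(V, r)]

# Greedy construction of the principal partition: Pi_i is the maximal
# "sparsest" strict superset of Pi_{i-1} (brute force over all subsets).
chain, lambdas = [frozenset()], []
while chain[-1] != V:
    prev = chain[-1]
    cands = [S for S in subsets if S > prev]
    lam = min(Fraction(f(S) - f(prev), len(S) - len(prev)) for S in cands)
    nxt = max((S for S in cands
               if Fraction(f(S) - f(prev), len(S) - len(prev)) == lam), key=len)
    chain.append(nxt)
    lambdas.append(lam)

assert lambdas == sorted(lambdas)        # critical values are increasing

# Lower bound of Lemma 7 in product form, checked for every subset S:
# f(S) - f(Pi_{i-1}) >= lambda_i * (|S| - |Pi_{i-1}|).
for prev, lam in zip(chain, lambdas):
    for S in subsets:
        assert f(S) - f(prev) >= lam * (len(S) - len(prev))
```

On this instance the chain is \(\emptyset \subsetneq \{2,3\} \subsetneq V\), with critical values 1 and 3/2.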

Using the above lemma, we can sum the appropriate bound for each subset \(E_{k,\sigma }\) of any ordering \(\sigma \), and obtain the following lower bound for monotone submodular MLOP. The proof after summation is purely algebraic manipulation and is deferred to the appendix.

Proposition 4

Let \(f:2^E \rightarrow \mathbb {R}\) be a normalized monotone submodular function with \(f(S)>0\) if \(S\ne \emptyset \), and principal partition \(\{\varPi _i\}_{0\le i\le s}\). Let \(\sigma \in \mathcal {S}_{E}\), then

$$\begin{aligned} \sum _{k = 1}^m f(E_{k,\sigma }) \ge \frac{1}{2}(|E|+1)f(E)-\frac{1}{2}\sum _{i=1}^s \big (f(\varPi _i)|\varPi _{i-1}|-f(\varPi _{i-1})|\varPi _i|\big ){> 0}. \end{aligned}$$

Proof

The proof is deferred to Appendix A.3. \(\square \)

For the upper bound, we require that the chain contain all sets in the principal partition, i.e., \(E_{|\varPi _i|,\sigma } = \varPi _i\) for all i. We use the fact that each element added to a subset can increase the function value by at most \(\kappa _f\) to upper bound the function values of the remaining sets. Pictorially, starting from \(\varPi _{i-1}\), the upper bound begins at \(f(\varPi _{i-1})\) and has slope \(\kappa _f\) until it reaches \(f(\varPi _i)\), after which it remains flat until \(\varPi _i\) (refer to Fig. 2). Also note that the increase of the function value need not be an integer multiple of \(\kappa _f\), so the rounding as the function value approaches \(f(\varPi _i)\) has to be taken care of.

Proposition 5

Let \(f:2^E \rightarrow \mathbb {R}\) be a normalized monotone submodular function with \(f(S)>0\) if \(S\ne \emptyset \), and principal partition \(\{\varPi _i\}_{0\le i\le s}\). Let \(\sigma \in \mathcal {S}_{E}\) be such that \(E_{|\varPi _i|,\sigma } = \varPi _i\) for all \(1\le i\le s\). Then the MLOP objective value for f with permutation \(\sigma \) is at most

$$\begin{aligned}&f(E)|E|-\frac{f(E)^2}{2\kappa _{f}}+\frac{f(E)}{2}\\&\quad -\sum _{i=1}^s (f(E)-f(\varPi _i))(|\varPi _i|-|\varPi _{i-1}|)+\sum _{i=1}^s\frac{f(\varPi _{i-1})(f(\varPi _i)-f(\varPi _{i-1}))}{\kappa _{f}}. \end{aligned}$$

Proof

The proof is deferred to Appendix A.4. \(\square \)

8.2 Proof of Improved Approximation for MLOP in Theorem 13

Propositions 4 and 5 together allow us to prove Theorem 13. Our goal is to show that the upper bound obtained from Proposition 5 is at most \(2 - \frac{1+\ell _{f}}{1+ |E|}\) times the lower bound obtained from Proposition 4; this shows that any \(\sigma \in \mathcal {S}_{E}\) with \(E_{|\varPi _i|,\sigma } = \varPi _i\) for all \(1 \le i \le s\) achieves our desired approximation factor for monotone submodular MLOP. First, comparing the non-summation terms in Propositions 4 and 5, we have

$$\begin{aligned} \frac{f(E)|E|-\frac{f(E)^2}{2\kappa _{f}}+\frac{f(E)}{2}}{\frac{1}{2}(|E|+1)f(E)}=\frac{2|E|-\frac{f(E)}{\kappa _{f}} + 1}{|E| + 1} = 2 - \frac{1+\ell _{f}}{1+ |E|}. \end{aligned}$$

To deal with the remaining summation terms, it suffices to prove that

$$\begin{aligned} \sum _{i=1}^s \big (f(\varPi _i)|\varPi _{i-1}|-f(\varPi _{i-1})|\varPi _i|\big )&\le \sum _{i=1}^s (f(E)-f(\varPi _i))(|\varPi _i|-|\varPi _{i-1}|)\\&\quad -\sum _{i=1}^s\frac{f(\varPi _{i-1})(f(\varPi _i)-f(\varPi _{i-1}))}{\kappa _{f}}, \end{aligned}$$

i.e., the decrease of the upper bound due to its summation terms is at least twice the decrease of the lower bound due to its summation term. To make the computation easier, we rewrite the terms in differential notation. For all \(1\le i\le s\), let \(\delta _i=f(\varPi _i)-f(\varPi _{i-1})\) and \(\varDelta _i=|\varPi _i|-|\varPi _{i-1}|\). By definition of \(\kappa _{f}\), we have \(0 \le \delta _i\le \kappa _{f} \varDelta _i\). Note that \(f(\varPi _i)=\sum _{j=1}^i \delta _j\) and \(|\varPi _i|=\sum _{j=1}^i \varDelta _j \). Furthermore, we have \(f(\varPi _i)|\varPi _{i-1}|-f(\varPi _{i-1})|\varPi _i|=|\varPi _i|\delta _i-f(\varPi _i)\varDelta _i\). Thus, the statement to be proved can be rewritten as

$$\begin{aligned} \sum _{i=1}^s \Bigg (\delta _i\sum _{j=1}^i \varDelta _j-\varDelta _i\sum _{j=1}^i \delta _j\Bigg )\le \sum _{i=1}^s \Bigg (\varDelta _i\sum _{j=i+1}^s \delta _j -\frac{\delta _i}{\kappa _{f}}\sum _{j=1}^{i-1} \delta _j \Bigg ). \end{aligned}$$

Suppose \(s=1\), then both sides of this inequality are equal to zero.

Thus, we may assume \(s\ge 2\). Rearranging terms, for the left hand side we have

$$\begin{aligned} \sum _{i=1}^s \Bigg (\delta _i\sum _{j=1}^i \varDelta _j-\varDelta _i\sum _{j=1}^i \delta _j\Bigg ) = \sum _{i=1}^{s-1} \sum _{j>i}\big (\delta _j\varDelta _i-\delta _i\varDelta _j\big ), \end{aligned}$$

and the second part of right hand side can be rewritten as

$$\begin{aligned} \sum _{i=1}^s \frac{\delta _i}{\kappa _f}\sum _{j=1}^{i-1} \delta _j = \frac{1}{\kappa _f}\sum _{j=1}^{s-1} \sum _{i>j}\delta _i\delta _j = \frac{1}{\kappa _f}\sum _{i=1}^{s-1} \sum _{j>i}\delta _i\delta _j, \end{aligned}$$

after exchanging the order of summation and renaming variables. Since \(\delta _j\le \kappa _{f}\varDelta _j\), we have \(\sum _{i=1}^{s-1} \sum _{j>i}\big (\delta _j\varDelta _i-\delta _i\varDelta _j\big )\le \sum _{i=1}^{s-1} \sum _{j>i}\big (\delta _j\varDelta _i-\delta _i\frac{\delta _j}{\kappa _{f}}\big )\), so the desired inequality holds and the proof is finished.
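The final inequality in differential notation can also be confirmed by exhaustive enumeration over small integer instances. The following Python sketch is a sanity check of ours; it enumerates all \(\delta ,\varDelta \) with \(0\le \delta _i\le \kappa _f\varDelta _i\) for \(s=3\) and \(\kappa _f=2\), using exact rational arithmetic for the \(\frac{1}{\kappa _f}\) term.

```python
from fractions import Fraction
from itertools import product

def lhs(deltas, Deltas):
    """Left-hand side: sum over i < j of (delta_j Delta_i - delta_i Delta_j)."""
    s = len(deltas)
    return sum(deltas[j] * Deltas[i] - deltas[i] * Deltas[j]
               for i in range(s) for j in range(i + 1, s))

def rhs(deltas, Deltas, kappa):
    """Right-hand side: sum_i Delta_i sum_{j>i} delta_j
       minus (1/kappa) sum over i < j of delta_i delta_j."""
    s = len(deltas)
    return (sum(Deltas[i] * sum(deltas[i + 1:]) for i in range(s))
            - Fraction(1, kappa) * sum(deltas[i] * deltas[j]
                                       for i in range(s) for j in range(i + 1, s)))

kappa = 2
# All small integer instances with 0 <= delta_i <= kappa * Delta_i, s = 3.
for Deltas in product(range(1, 3), repeat=3):
    for deltas in product(*[range(0, kappa * D + 1) for D in Deltas]):
        assert lhs(deltas, Deltas) <= rhs(deltas, Deltas, kappa)
```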

Recall from Theorem 7 that the principal partition \(\{\varPi _i\}_{0\le i\le s}\) can be found in polynomial time. Thus, we have the following:

Theorem 6

Let \(f:2^E \rightarrow \mathbb {R}\) be a non-trivial, normalized and monotone submodular function. There exists a factor \((2-\frac{1+\ell _{f}}{1+|E|})\)-approximation algorithm to MLOP with \(f(\cdot )\) in polynomial time, where \(\ell _{f}=\frac{f(E)}{\max _{x\in E}f(\{x\})}\).

Note that our analysis works for any linear extension of the partial order on subsets induced by the principal partition. It is unclear how this analysis can be extended to more structured linear extensions. We now discuss a special case of Theorem 13, when f is the rank function of a matroid M. Since in this case \(\ell _{f}=f(E)\), we get:

Corollary 8

Let \(M=(E,r)\) be a matroid on ground set E with rank function r. There exists a factor \((2-\frac{1+r(E)}{1+|E|})\)-approximation algorithm to matroid MLOP on M in polynomial time.

For graphic matroids, this improves upon the 2-factor approximation when the graph is connected and has a linear number of edges. For instance, for connected d-regular graphs with vertex set V, the approximation factor is \(2-\frac{2|V|}{2+d|V|}\), which is asymptotically \(2-\frac{2}{d}\).

8.3 Application to minimum latency set cover (MLSC)

Recall that in Sect. 7.1 we presented a randomized factor \((2-\frac{2}{1+\ell })\)-approximation algorithm for MLSC, where \(\ell \) is the size of the largest hyperedge. For the special case of MLVC the factor is \(\frac{4}{3}\). In this section we observe that MLSC is an instance of monotone submodular MLOP, and use Theorem 13 to show that there exists a deterministic factor \((2-\frac{\varDelta +|E|}{\varDelta (1+|V|)})\)-approximation algorithm for MLSC, where \(\varDelta \) is the maximum degree of the hypergraph \(H = (V,E)\). Note that for \(\ell \)-uniform hypergraphs this bound is never better than the one obtained in Sect. 7.1.

Recall that in MLSC, we are given a hypergraph \(H=(V,E)\) with the objective

$$\begin{aligned} \min _{\pi \in \mathcal {S}_{V(H)}} \sum _{e \in E} \max _{v \in e} \pi (v). \end{aligned}$$

In other words, we minimize over all permutations of the vertices, where the cost of each hyperedge is the maximum label of all vertices in it. Throughout this section we let \(n=|V|\) denote the number of vertices.

For a fixed \(\pi \in \mathcal {S}_{V}\), its reverse permutation is defined by \(\pi '(v)=n+1-\pi (v)\) for all \(v\in V\). We now prove that the MLSC value of \(\pi \) is the same as the MLOP value of \(\pi '\) on a particular monotone submodular function, which shows that MLSC is an instance of monotone submodular MLOP.

Proposition 6

For a fixed hypergraph \(H=(V,E)\), let f be the set function on V such that for all \(S \subseteq V\), \(f(S)=|\{e\in E:S\cap e\ne \emptyset \}|\). Then f is a monotone submodular function satisfying \(f(\emptyset )=0\). Furthermore, for all \(\pi \in \mathcal {S}_{V}\) we have

$$\begin{aligned} \sum _{e \in E} \max _{v \in e} \pi (v)=\sum _{i=0}^n f(V_{i,\pi '}), \end{aligned}$$

where \(\pi ' \in \mathcal {S}_{V}\) is given by \(\pi '(v)=n+1-\pi (v)\).

Proof

It is straightforward to verify that f is monotone and \(f(\emptyset )=0\). For submodularity, for all \(S,T\subseteq V\), observe that

$$\begin{aligned}&f(S)+f(T)-f(S\cup T)-f(S\cap T)\\&\quad =|\{e\in E:e\cap S\ne \emptyset ,e\cap T\ne \emptyset ,e\cap S\cap T=\emptyset \}|\ge 0. \end{aligned}$$

Now for all \(0\le k\le n\) let \(T_k=\{e\in E: \exists v\in e,\pi '(v)\le k \}\). Then it is straightforward to verify that \(f(V_{k,\pi '})=|T_k|\) for all \(0\le k\le n\), and furthermore that \(|\{k:e\in T_k\}|=\max _{v \in e} \pi (v)\) for all \(e\in E\). Therefore we have

$$\begin{aligned} \sum _{e \in E} \max _{v \in e} \pi (v)=\sum _{i=0}^n |T_i|=\sum _{i=0}^n f(V_{i,\pi '}). \end{aligned}$$

\(\square \)
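Proposition 6 can be verified exhaustively on a small hypergraph. The following Python sketch (our own toy instance) compares, for every permutation \(\pi \), the MLSC objective of \(\pi \) with the MLOP objective of the coverage function f under the reverse permutation \(\pi '\).

```python
from itertools import permutations

# Toy hypergraph (ours) on V = {0, 1, 2, 3}.
V = [0, 1, 2, 3]
hyperedges = [{0, 1}, {1, 2, 3}, {0, 3}]
n = len(V)

def f(S):
    """Coverage function of Proposition 6: hyperedges intersecting S."""
    return sum(1 for e in hyperedges if e & set(S))

for perm in permutations(range(1, n + 1)):
    pi = dict(zip(V, perm))
    pi_rev = {v: n + 1 - pi[v] for v in V}            # reverse permutation pi'
    mlsc = sum(max(pi[v] for v in e) for e in hyperedges)
    # MLOP value of pi': sum of f over the prefixes V_{i, pi'}.
    mlop = sum(f([v for v in V if pi_rev[v] <= i]) for i in range(n + 1))
    assert mlsc == mlop
```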

Using Theorem 13, we obtain the following approximation algorithm for MLSC, where the factor is based on the maximum degree of the hypergraph. Note that in this case \(\kappa _{f}=\varDelta (H)=\max _{v\in V}|\{e\in E: v\in e\}|\) is the maximum degree of the hypergraph H.

Corollary 2

There is a deterministic factor \((2-\frac{\varDelta +|E|}{\varDelta (1+|V|)})\)-approximation algorithm for MLSC, where \(\varDelta \) is the maximum degree of hypergraph \(H = (V,E)\).

For comparison, in Sect. 7 we presented a randomized scheduling-based approximation algorithm for MLSC within factor \(2-\frac{2}{1+\ell }\), where \(\ell =\max _{e\in E}|e|\) is the size of the largest hyperedge. The algorithm presented in this section is deterministic, but for uniform hypergraphs its bound is never better than that of the randomized algorithm based on scheduling.

9 Future directions

We conclude this work by presenting a list of open questions that stem from this work.

In Sects. 5 and 6 we investigated the hardness of restrictions of MLOP. In particular, we showed that graphic matroid MLOP is NP-hard. In Sect. 6.4, we saw how matroid MLOP can be viewed as an optimization problem over the bases of the matroid. However, even when a basis is fixed, the corresponding ordering problem on the ground set of elements can be non-trivial. In particular, in the context of graphic matroid MLOP on a connected graph G, consider the following optimization problem,

$$\begin{aligned} \min _{\sigma \in S_{T}} \sum _{e \in E(G) \setminus T} \max \{\sigma (e') : e' \in C(T,e) - e\} , \end{aligned}$$

where T is a given (fixed) edge set of a spanning tree of G, and C(T, e) denotes the fundamental circuit with respect to T and e. We saw implicitly, in the reduction of MLVC to graphic matroid MLOP (Theorem 8), that MLVC reduces to this problem when T is a star graph (thus, this problem is NP-hard). We prove in Proposition 1 that if we allow the choice of the spanning tree T to vary over all spanning trees of G, then the above problem is equivalent to graphic matroid MLOP. Thus, when T is fixed, this problem can be viewed as a "fixed-basis" restriction of graphic matroid MLOP.

Open question 1. Given a graph G, a spanning tree T and integer k, consider the problem of whether there exists a permutation \(\sigma \) of E(T) such that

$$\sum _{e \in E(G) {\setminus } E(T)} \max \{\sigma (e'): e' \in C(T,e) - e\} \le k.$$

For what families of trees is this problem NP-hard?

This problem is known to be NP-hard only when T is a star graph, and it remains open even for other simple families of trees, such as the case where T is a path.

In Sect. 7, we showed that MLSC can be \((2-\frac{2}{1+\ell })\)-approximated using randomized scheduling techniques. Furthermore, we showed for \(\ell \)-uniform regular hypergraphs, MLSC can be \((2-\frac{2}{1+\ell })\)-approximated using an LP relaxation. This question for general \(\ell \)-uniform hypergraphs remains open.

Open question 2. Does solving the LP relaxation provide an approximation guarantee for MLSC on \(\ell \)-uniform hypergraphs by a factor of \(2-\frac{2}{1+\ell }\)?

In Sect. 7.2, we showed that MLVC, MSVC, and MLA are all equivalent in decision form for regular graphs. Using techniques similar to Theorem 9, one can show that the optimal values of all three problems are related by linear shifts. It is known that MSVC on regular graphs can be 4/3-approximated (see [26]), but we have not found a formal proof that this problem is NP-hard. Thus, the following question remains open, to the best of our knowledge.

Open question 3. Are MLA, MLVC, and MSVC NP-hard for the family of simple regular graphs?

In Sect. 8, we showed that monotone submodular MLOP can be approximated within factor \(2-\frac{1+\ell _f}{1+|E|}\) using principal partitions. A related open question is to develop algorithms for the case where the principal partition is trivial, i.e., \(f(S)|E|\ge f(E)|S|\) for all \(S\subseteq E\). In this case, the principal partition-based algorithm studied by Fokkink et al. [16] (and by us) simply outputs an arbitrary solution.

Open question 4. Do there exist better polynomial time approximation algorithms for monotone submodular MLOP in the case where the function f satisfies \(f(\emptyset )=0\) and \(f(S)|E|\ge f(E)|S|\) for all \(S\subseteq E\)?

In the scope of symmetric submodular MLOP, the current best known approximation factor for the special case MLA is polylogarithmic in the size of the graph, i.e., \(O(\sqrt{\log n} \log \log n)\), given by Feige and Lee [21]; see also Charikar et al. [22]. For the more general problem of symmetric submodular MLOP, there is currently no known efficient approximation algorithm better than O(|E|).

Open question 5. Can symmetric submodular MLOP over a ground set E be approximated to a factor better than O(|E|)?