## 1 Introduction

Submodularity is a property satisfied by many fundamental set functions, including coverage functions, matroid rank functions, and directed cut functions. Optimization of submodular set functions has found a wealth of applications in machine learning, including the spread of influence in social networks [1], sensor placement [2], information gathering [3], document summarization [4,5,6], image segmentation [7], and multi-object tracking [8], among others (see Krause and Golovin [9] for a survey).

Problems involving the maximization of non-negative submodular functions can be classified as either unconstrained or constrained. In the unconstrained case, the objective is to return a set in the domain of the function approximately maximizing the function; we refer to this problem as USM. In the most commonly studied form of constrained submodular maximization, the returned set is subject to a “matroid constraint,” which means that the returned set is constrained to be independent in a given matroid. We refer to this form of constrained submodular maximization as CSM. The simplest nontrivial example of a matroid constraint is a cardinality constraint, which means that an upper bound is given on the allowed size of the returned set. Additionally, we refer to the special case where the function f we are maximizing is monotone as “monotone CSM.”

Approximation algorithms for both USM and CSM have been studied extensively. Say that an algorithm running in polynomial time with respect to the size of the ground set provides an $$\alpha$$-approximation if it returns a set with expected value at least $$\alpha$$ times that of the optimum. For USM, a 0.5-approximation algorithm was provided by Buchbinder et al. [10]. For monotone CSM, a $$(1-e^{-1})$$-approximation was achieved by Nemhauser et al. [11] using a greedy algorithm for the special case of cardinality constraints and later generalized by Calinescu et al. [12] to matroid constraints using the continuous greedy algorithm. For general CSM, the measured continuous greedy algorithm of Feldman et al. [13] achieves an $$e^{-1}$$-approximation (where $$e^{-1}>0.367$$), and a subsequent algorithm due to Buchbinder and Feldman [14] achieves a 0.385-approximation.

To bound how far the algorithms from the previous paragraph are from optimal, corresponding inapproximability results have been shown. Say that a problem is $$\alpha$$-inapproximable if no algorithm running in sub-exponential time with respect to the size of the ground set can provide an $$\alpha$$-approximation. The first two approximation factors from the previous paragraph are in fact the best achievable; ($$0.5+\epsilon$$)-inapproximability and ($$1-e^{-1}+\epsilon$$)-inapproximability for any $$\epsilon >0$$ were shown by Feige et al. [15] and Nemhauser and Wolsey [16], respectively, using ad hoc methods. On the other hand, the best achievable approximability for general CSM remains open; the best-known inapproximability factor is 0.478 due to Oveis Gharan and Vondrak [17] using the symmetry gap technique of Vondrak [18]. This technique has the advantage of being able to succinctly re-prove the inapproximability results of [15, 16] and many others.

In this work, we study approximation algorithms for maximizing the sum of a non-negative submodular function f and a linear function $$\ell$$. Sviridenko et al. [19] were the first to study algorithms for the sum $$f+\ell$$ in the case of f monotone, in order to provide improved approximation algorithms for monotone CSM with bounded curvature. Here, the curvature $$c\in [0,1]$$ of a non-negative monotone submodular function g is roughly a measure of how far g is from being linear. They provided a $$(1-c/e-\epsilon )$$-approximation algorithm and a complementary $$(1-c/e+\epsilon )$$-inapproximability result. The idea of the algorithm is to decompose g into $$f+\ell$$ and show that an approximation factor of $$1-e^{-1}$$ can be achieved with respect to f and an approximation factor of 1 can be achieved with respect to $$\ell$$ simultaneously. Formally, if $${\mathcal {I}}$$ is the independent set family of a matroid, the algorithm computes a set $$T\in {\mathcal {I}}$$ that satisfies

\begin{aligned} {\mathbb {E}}[g(T)]={\mathbb {E}}[f(T)+\ell (T)]\ge \max _{S\in {\mathcal {I}}}[(1-e^{-1}-\epsilon )f(S)+ (1-\epsilon )\ell (S)] \end{aligned}
(1.1)

by first “guessing” the value of $$\ell (S)$$, and then running the continuous greedy algorithm. Subsequently, Feldman eliminated both the guessing step and the $$\epsilon \ell (S)$$ loss by introducing a distorted objective [20]. Many faster algorithms and practical applications for the case of f monotone have since been introduced [21,22,23]. Note that $$\ell$$ has several potential interpretations; while setting $$\ell$$ to be non-negative provides improved approximations for monotone submodular functions with low curvature, setting $$\ell$$ to be non-positive allows it to serve as a regularizer or soft constraint that favors returning smaller sets, as suggested by Harshaw et al. [21].

On the other hand, we know of only two prior works that study RegularizedUSM where f is not constrained to be monotone. Bodek and Feldman [24] were the first to consider the case where f is non-monotone and the sign of $$\ell$$ is unconstrained. They defined and studied the problem of Regularized Unconstrained Submodular Maximization (RegularizedUSM):

### Definition 1.1

(RegularizedUSM [24]) Given query access to a (not necessarily monotone) non-negative submodular function $$f:2^{{\mathcal {N}}}\rightarrow {\mathbb {R}}_{\ge 0}$$ and a linear function $$\ell :2^{{\mathcal {N}}}\rightarrow {\mathbb {R}}$$ over the same ground set $${\mathcal {N}}$$, an algorithm is said to provide an $$(\alpha ,\beta )$$-approximation for RegularizedUSM if it outputs a set $$T\subseteq {\mathcal {N}}$$ such that $${\mathbb {E}}[f(T)+\ell (T)]\ge \max _{S\subseteq {\mathcal {N}}}[\alpha \cdot f(S)+\beta \cdot \ell (S)]$$.
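Definition 1.1 can be checked by brute force on tiny instances. The sketch below is entirely our own toy example (f is the cut function of a single edge; the checker applies only to deterministic outputs, where the expectation is trivial):

```python
from itertools import combinations

def is_alpha_beta_approx(f, ell, ground, T, alpha, beta):
    # Brute-force check of Definition 1.1 for a deterministic output set T:
    # f(T) + ell(T) >= max_S [alpha * f(S) + beta * ell(S)].
    def L(S):
        return sum(ell[u] for u in S)
    best = max(alpha * f(set(S)) + beta * L(S)
               for r in range(len(ground) + 1)
               for S in combinations(ground, r))
    return f(T) + L(T) >= best - 1e-9

# Toy instance (ours): f is the cut function of the single undirected
# edge {0, 1}, and the linear term ell mixes signs.
def f(S):
    return 1.0 if len(S & {0, 1}) == 1 else 0.0

ell = {0: -0.5, 1: 0.2}
```

Here the set $$\{1\}$$ maximizes $$f+\ell$$ (value 1.2), so it passes the check with $$\alpha =\beta =1$$, while the empty set does not.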

The main approximation result of [24] is the first non-trivial approximation algorithm for RegularizedUSM with f non-monotone and the sign of $$\ell$$ unconstrained. Specifically, they used non-oblivious local search to provide $$\left( \alpha (\beta )-\epsilon ,\beta -\epsilon \right)$$-approximations for RegularizedUSM for all $$\beta \in (0,1]$$, where $$\alpha (\beta )\triangleq \beta (1-\beta )/(1+\beta )$$ [24, Theorem 1.2]. They also proved inapproximability results for the cases of $$\ell$$ non-negative and $$\ell$$ non-positive using the symmetry gap technique of Vondrak [18]. In particular, they showed $$(1-e^{-\beta }+\epsilon ,\beta )$$-inapproximability for monotone f and non-positive $$\ell$$ for all $$\beta \ge 0$$ [24, Theorem 1.1], essentially matching the $$(1-e^{-\beta }-\epsilon ,\beta )$$-approximability provided by Lu et al.’s distorted measured continuous greedy algorithm [25] (note that $$(\alpha ,\beta )$$-inapproximability is defined in the same way as $$\alpha$$-inapproximability).

In this work, we present improved approximability and inapproximability results for RegularizedUSM as well as the setting where S and T are subject to a matroid constraint, which we define analogously as Regularized Constrained Submodular Maximization (RegularizedCSM):

### Definition 1.2

(RegularizedCSM) Given query access to a (not necessarily monotone) non-negative submodular function $$f:2^{{\mathcal {N}}}\rightarrow {\mathbb {R}}_{\ge 0}$$ and a linear function $$\ell :2^{{\mathcal {N}}}\rightarrow {\mathbb {R}}$$ over the same ground set $${\mathcal {N}}$$, as well as a matroid with family of independent sets denoted by $${\mathcal {I}}$$ also over the same ground set, an algorithm is said to provide an $$(\alpha ,\beta )$$-approximation for RegularizedCSM if it outputs a set $$T\in {\mathcal {I}}$$ such that $${\mathbb {E}}[f(T)+\ell (T)]\ge \max _{S\in {\mathcal {I}}}[\alpha \cdot f(S)+\beta \cdot \ell (S)]$$.

The only prior work considering RegularizedCSM for non-monotone f that we are aware of is that of Lu et al. [25], which as noted by [24] achieves $$(\beta e^{-\beta }-\epsilon ,\beta )$$-approximations for RegularizedCSM for all $$\beta \in [0,1]$$, but only when $$\ell$$ is constrained to be non-positive.

Organization of the Paper. We present the definitions and notation used throughout this work in Sect. 2 and summarize our results in Sect. 3. Sections 4, 5, 6, 7, and 8 prove the results introduced in Sect. 3. We conclude with a discussion of open problems in Sect. 9.

## 2 Preliminaries

Set Functions. Let $${\mathcal {N}}\triangleq \{u_1,u_2,\ldots ,u_n\}$$ denote the ground set. A set function $$f:2^{{\mathcal {N}}} \rightarrow {\mathbb {R}}$$ is said to be submodular if for every two sets $$S, T \subseteq {\mathcal {N}}$$, $$f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$$. Equivalently, f is said to be submodular if it satisfies the property of “diminishing returns.” That is, for every two sets $$S \subseteq T \subseteq {\mathcal {N}}$$ and an element $$u\in {\mathcal {N}}\backslash T$$, $$f(u|S)\ge f(u|T)$$, where $$f(u|S)\triangleq f(S\cup \{u\})-f(S)$$ is the marginal value of u with respect to S. We use f(u) as shorthand for $$f(\{u\})$$. All submodular functions are implicitly assumed to be non-negative unless otherwise stated.
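For intuition, the diminishing-returns characterization can be verified by brute force on a small instance. The following sketch uses our own toy coverage function (not from the text) and enumerates all pairs $$S\subseteq T$$:

```python
from itertools import combinations

# A toy coverage function: f(S) is the size of the union of the subsets
# covered by the elements of S. Coverage functions are canonical monotone
# submodular functions; this instance is ours, for illustration only.
COVERS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
GROUND = list(COVERS)

def f(S):
    covered = set()
    for u in S:
        covered |= COVERS[u]
    return len(covered)

def marginal(f, u, S):
    # f(u | S) = f(S ∪ {u}) - f(S), the marginal value of u w.r.t. S
    return f(set(S) | {u}) - f(set(S))

def is_submodular(f, ground):
    # Diminishing returns: f(u | S) >= f(u | T) for all S ⊆ T and u ∉ T.
    n = len(ground)
    subsets = [frozenset(c) for r in range(n + 1)
               for c in combinations(ground, r)]
    return all(marginal(f, u, S) >= marginal(f, u, T)
               for S in subsets for T in subsets if S <= T
               for u in ground if u not in T)
```

For example, element 2 gains 2 on top of the empty set but only 1 on top of $$\{1\}$$, since "b" is already covered.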

A set function f is said to be monotone if for every two sets $$S \subseteq T \subseteq {\mathcal {N}}$$, $$f(S) \le f(T)$$. A set function $$\ell$$ is said to be linear if there exist values $$\{\ell _u \in {\mathbb {R}}\mid u \in {\mathcal {N}}\}$$ such that for every set $$S \subseteq {\mathcal {N}}$$, $$\ell (S) = \sum _{u\in S}\ell _u$$. When considering the sum of a non-negative submodular function f and a linear function $$\ell$$ whose sign is unconstrained, define $$\ell _+(S)\triangleq \sum _{u\in S}\max (\ell _u,0)$$ and $$\ell _-(S)\triangleq \sum _{u\in S}\min (\ell _u,0)$$ to be the components of $$\ell$$ with positive and negative sign, respectively.

Value Oracles. We make the standard assumption that an algorithm for the sum $$f+\ell$$ does not have direct access to the representation of f; instead, it may obtain information about f only through a value oracle. Given any query set $$S\subseteq {\mathcal {N}}$$, a value oracle for f returns f(S) in polynomial time. The coefficients of $$\ell$$ are directly provided to the algorithm.

Multilinear Extensions. All vectors of reals are in bold (e.g., $${\textbf{x}}$$). Given two vectors $${\textbf{x}}, {\textbf{y}}\in [0, 1]^{{\mathcal {N}}}$$, we define $${\textbf{x}}\vee {\textbf{y}}$$, $${\textbf{x}}\wedge {\textbf{y}}$$, and $${\textbf{x}}\circ {\textbf{y}}$$ to be the coordinate-wise maximum, minimum, and multiplication, respectively, of $${\textbf{x}}$$ and $${\textbf{y}}$$. We also define $${\textbf{x}}\backslash {\textbf{y}}\triangleq {\textbf{x}}- {\textbf{x}}\wedge {\textbf{y}}$$. Given a set function $$f:2^{{\mathcal {N}}} \rightarrow {\mathbb {R}}$$, its multilinear extension is the function $$F:[0, 1]^{{\mathcal {N}}} \rightarrow {\mathbb {R}}$$ defined by $$F({\textbf{x}}) = {\mathbb {E}}[f(\texttt {R}({\textbf{x}}))]$$, where $$\texttt {R}({\textbf{x}})$$ is a random subset of $${\mathcal {N}}$$ including every element $$u\in {\mathcal {N}}$$ with probability $${\textbf{x}}_u$$, independently. One can verify that F is a multilinear function of its arguments as well as an extension of f in the sense that $$F({\textbf{1}}_S)=f(S)$$ for every set $$S\subseteq {\mathcal {N}}$$. Here, $${\textbf{1}}_S$$ is the characteristic vector of the set S; that is, the vector with value 1 at each $$u\in S$$ and 0 at each $$u\in {\mathcal {N}}\backslash S$$.
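A minimal sketch of the multilinear extension (our own toy instance): exact evaluation by summing over all $$2^n$$ subsets, and the sampling estimator $$F({\textbf{x}})={\mathbb {E}}[f(\texttt {R}({\textbf{x}}))]$$ suggested by the definition:

```python
import random
from itertools import combinations

def multilinear_exact(f, ground, x):
    # F(x) = sum_S f(S) * prod_{u in S} x_u * prod_{u not in S} (1 - x_u);
    # exponential time, fine only for tiny ground sets.
    total = 0.0
    for r in range(len(ground) + 1):
        for S in combinations(ground, r):
            S = set(S)
            p = 1.0
            for u in ground:
                p *= x[u] if u in S else 1.0 - x[u]
            total += p * f(S)
    return total

def multilinear_sampled(f, ground, x, samples=50000, seed=0):
    # F(x) = E[f(R(x))], where R(x) keeps each u independently w.p. x_u.
    rng = random.Random(seed)
    return sum(f({u for u in ground if rng.random() < x[u]})
               for _ in range(samples)) / samples
```

In particular, evaluating `multilinear_exact` at a 0/1 vector $${\textbf{1}}_S$$ recovers f(S), illustrating that F extends f.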

Matroid Polytopes. A matroid $${\mathcal {M}}$$ may be specified by a pair of a ground set $${\mathcal {N}}$$ and a family $${\mathcal {I}}$$ of independent sets. The matroid polytope $${\mathcal {P}}$$ corresponding to $${\mathcal {M}}$$ is defined to be $$conv(\{{\textbf{1}}_S \mid S\in {\mathcal {I}}\})$$, where conv denotes the convex hull. Due to the matroid axioms, $${\mathcal {P}}$$ is guaranteed to be down-closed; that is, $$0\le {\textbf{x}}\le {\textbf{y}}$$ and $${\textbf{y}}\in {\mathcal {P}}$$ imply $${\textbf{x}}\in {\mathcal {P}}$$. It is also well-known that $${\mathcal {P}}$$ is solvable; that is, linear functions can be maximized over $${\mathcal {P}}$$ in polynomial time [12, Section 2.3]. For CSM and RegularizedCSM, we let OPT denote any set such that $$OPT\in {\mathcal {I}}$$ (equivalently, $${\textbf{1}}_{OPT}\in {\mathcal {P}}$$), while for USM and RegularizedUSM, we let OPT denote any subset of $${\mathcal {N}}$$. For example, in the context of CSM, $${\mathbb {E}}[f(T)]\ge \alpha f(OPT)$$ is equivalent to $$\forall S\in {\mathcal {I}}, {\mathbb {E}}[f(T)]\ge \alpha f(S)$$.
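Solvability can be illustrated with the classical matroid greedy algorithm, which maximizes any linear function over a matroid. In the sketch below, `is_independent` stands in for a hypothetical independence oracle (our own naming), and the example uses a uniform (cardinality) matroid:

```python
def maximize_linear_over_matroid(weights, is_independent):
    # Greedy: scan elements in decreasing weight order, keeping an element
    # whenever it preserves independence. For linear objectives this is
    # optimal over any matroid; since the matroid polytope is down-closed,
    # non-positive weights may simply be dropped.
    T = set()
    for u in sorted(weights, key=weights.get, reverse=True):
        if weights[u] <= 0:
            break
        if is_independent(T | {u}):
            T.add(u)
    return T

# Example: the uniform matroid |S| <= 2 (a cardinality constraint).
weights = {"a": 3.0, "b": 2.0, "c": 1.0, "d": -1.0}
T = maximize_linear_over_matroid(weights, lambda S: len(S) <= 2)
```

On this instance the greedy keeps "a" and "b" (total weight 5) and skips "c" once the cardinality bound is reached.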

Miscellaneous. We let $$\epsilon$$ denote any positive real. Many of our algorithms are “almost” $$(\alpha ,\beta )$$ approximations in the sense that they provide an $$(\alpha -\epsilon ,\beta )$$-approximation in $$\text {poly}\left( n,\frac{1}{\epsilon } \right)$$ time for any $$\epsilon >0$$. Similarly, some of our results show $$(\alpha +\epsilon ,\beta )$$-inapproximability for any $$\epsilon >0$$.

### 2.1 Technical Tools

We obtain our results by carefully combining the following known techniques.

To show approximability, the main techniques we use are the measured continuous greedy introduced by Feldman et al. [13] and used by [14, 25], as well as the distorted objective introduced by Feldman [20] and used by [25]. For some of our approximability results, we additionally require the analysis of the 0.385-approximation algorithm for CSM due to Buchbinder et al. [14] and the “guessing step” used by Sviridenko et al. [19].

To show inapproximability, the main technique we use is the symmetry gap of Vondrak [18], and most of our symmetry gap constructions are based on those of Oveis Gharan and Vondrak [17].

Additional notes on all relevant prior work can be found in Sect. A.1.

## 3 Our Contributions

Our results are organized into five sections. Section 4 contains our inapproximability result for CSM. The remaining sections contain our results for RegularizedUSM and RegularizedCSM divided by the assumptions they make about $$\ell$$. Specifically, Sect. 5 covers non-positive $$\ell$$, Sects. 6 and 7 cover non-negative $$\ell$$, and Sect. 8 covers arbitrary $$\ell$$.

### 3.1 Section 4: Inapproximability of Maximization with Cardinality Constraint

We start with the inapproximability of CSM. Oveis Gharan and Vondrak [17] used a symmetry gap construction [18] to prove 0.491-inapproximability of CSM in the special case where the matroid constraint is a cardinality constraint. Our first result improves the inapproximability factor for a cardinality constraint to 0.478 using a modified construction, matching the factor of the current best inapproximability result for CSM in the general case (also due to [17]).

Theorem 4.1  There exist instances of the problem $$\max \{f(S):S\subseteq {\mathcal {N}}\text { and } |S|\le w\}$$ where f is non-negative submodular such that a 0.478-approximation would require exponentially many value queries.

### 3.2 Section 5: Non-positive $$\ell$$

The results of this section are summarized in Fig. 1. In Sect. 5.1, we present improved $$(\alpha (\beta ),\beta )$$-approximations for RegularizedUSM for all $$\beta \ge 0$$ and RegularizedCSM for all $$\beta \in [0,1]$$. Previously, the best known result for both RegularizedUSM and RegularizedCSM was $$\alpha (\beta )=\beta e^{-\beta }-\epsilon$$ due to Lu et al. [25]. This function achieves its maximum value at $$\alpha (1)=e^{-1}-\epsilon >0.367$$. We improve the approximation factor for RegularizedCSM to $$\alpha (1)>0.385$$, matching the best-known approximation factor for CSM without a regularizer due to Buchbinder and Feldman [14]. Additionally, we show that larger values of $$\alpha (\beta )$$ are approximable for RegularizedUSM when $$\beta >1$$. The idea is to combine the “guessing step” of Sviridenko et al. [19] with a generalization of the aided measured continuous greedy algorithm of Buchbinder and Feldman [14].

Theorem 5.1 For  RegularizedUSM  with non-positive $$\ell$$, an $$(\alpha (\beta ),\beta )$$-approximation algorithm exists for any $$(\alpha (\beta ),\beta )$$ in Table 1. In particular, $$\alpha (1)>0.385$$ and $$\alpha (1.3)>0.398$$. When $$\beta \le 1$$, there is an algorithm for RegularizedCSM  that achieves the same approximation factor.

A natural follow-up question is whether there is a $$(0.5,\beta )$$-approximation algorithm for RegularizedUSM with non-positive $$\ell$$ for some $$\beta$$. Although it is unclear whether this is the case for general f, we use linear programming to show this result when f is an undirected or directed cut function (Propositions 5.4 and 5.5).

In Sect. 5.2, we use the symmetry gap technique to demonstrate improved inapproximability for RegularizedUSM with non-positive $$\ell$$. The previous best inapproximability results were [24, Theorem 1.1] near $$\beta =0$$ and [24, Theorem 1.3] near $$\beta =1$$. Our result, which generalizes the construction from Sect. 4, beats or matches both of these theorems for all $$\beta$$.

Theorem 5.6 There are instances of RegularizedUSM with non-positive $$\ell$$ such that $$(\alpha (\beta ),\beta )$$ is inapproximable for any $$(\alpha (\beta ),\beta )$$ in Table 2. In particular, $$\alpha (0)\approx 0$$, matching the result of [24, Theorem 1.1], and $$\alpha (1) < 0.478$$, matching the result of [24, Theorem 1.3].

We conclude this section by showing that taking the limit of Theorem 5.6 as $$\alpha (\beta )\rightarrow 0.5$$ implies $$(0.5,2\ln 2-\epsilon )$$-inapproximability, where $$2\ln 2\approx 1.386$$ (Proposition 5.9).

### 3.3 Section 6: Non-negative $$\ell$$, RegularizedUSM

The results of this section are summarized in Figs. 2 and 3.

It is easy to check that Theorem 5.1 may be modified to obtain guarantees for RegularizedUSM with non-negative $$\ell$$, which we do in Sect. 6.2. But before that, in Sect. 6.1, we take a slight detour and reanalyze the guarantee for this task provided by the randomized double greedy algorithm of [10] (RandomizedDG), which achieves the best-known $$(\alpha (\beta ),\beta )$$-approximations near $$\beta =3/4$$. We also reanalyze the guarantee of the deterministic variant of double greedy from the same paper (DeterministicDG).

Recall that DeterministicDG achieves a 1/3-approximation for USM, while RandomizedDG achieves a 1/2-approximation for USM in expectation. Bodek and Feldman [24] extended these guarantees to RegularizedUSM with non-negative $$\ell$$, showing that DeterministicDG simultaneously achieves $$(\alpha ,1-\alpha )$$-approximations for all $$\alpha \in [0,1/3]$$, and that RandomizedDG simultaneously achieves $$(\alpha ,1-\alpha /2)$$-approximations for all $$\alpha \in [0,1/2]$$. In Sect. 6.1, we show improved approximation factors for a variant of DeterministicDG and the original RandomizedDG:

• Improved analysis of a variant of  DeterministicDG (Theorem 6.1). For any $$r\ge 1$$, we describe a variant of DeterministicDG that simultaneously achieves (0, 1) and $$\left( \frac{1}{r+1+r^{-1}},\frac{r+1}{r+1+r^{-1}} \right)$$-approximations. For $$r=1$$, the variant is actually just the original DeterministicDG.

• Improved analysis of RandomizedDG (Theorem 6.2). We then show that RandomizedDG simultaneously achieves $$\left( \frac{2}{r+2+r^{-1}},\frac{r+2}{r+2+r^{-1}} \right)$$-approximations for all $$r\ge 1$$.

Observe that for both DeterministicDG and RandomizedDG, increasing r improves the dependence of the approximation on $$\ell$$ but decreases the dependence on f. Setting $$r=1$$ recovers the guarantees of [24]. We also provide examples showing that neither DeterministicDG nor RandomizedDG achieves $$(\alpha ,\beta )$$-approximations better than Theorems 6.1 and 6.2 in Propositions 6.3 and 6.4, respectively.
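For reference, the double greedy template of [10] can be sketched as follows. This is a minimal, unoptimized transcription for plain USM; the regularized variants analyzed above modify the acceptance rule, and the toy cut-function instance is our own:

```python
import random

def double_greedy(f, ground, randomized=True, rng=None):
    # Double greedy of Buchbinder et al. [10] for USM. The deterministic
    # variant guarantees f(X) >= OPT/3; the randomized variant guarantees
    # E[f(X)] >= OPT/2.
    rng = rng or random.Random(0)
    X, Y = set(), set(ground)
    for u in ground:
        a = f(X | {u}) - f(X)      # gain of adding u to X
        b = f(Y - {u}) - f(Y)      # gain of removing u from Y
        ap, bp = max(a, 0.0), max(b, 0.0)
        if randomized and ap + bp > 0:
            take = rng.random() < ap / (ap + bp)
        else:
            take = a >= b          # deterministic rule
        if take:
            X.add(u)
        else:
            Y.discard(u)
    return X                       # X == Y after the final iteration

# Toy instance (ours): the cut function of a small weighted graph.
EDGES = [(1, 2, 1.0), (2, 3, 2.0), (3, 4, 1.5), (1, 4, 0.5)]
def cut(S):
    return sum(w for u, v, w in EDGES if (u in S) != (v in S))

X = double_greedy(cut, [1, 2, 3, 4], randomized=False)
```

On this graph the maximum cut has value 5 (the bipartition $$\{1,3\}$$ versus $$\{2,4\}$$ cuts every edge), so the deterministic run is guaranteed at least 5/3.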

In Sect. 6.2 we provide improved approximation algorithms for non-negative $$\ell$$ near $$\beta =1$$ by combining the results of Sects. 5.1 and 6.1:

Theorem 6.5 An $$(\alpha (\beta ),\beta )$$-approximation algorithm for RegularizedUSM  with non-negative $$\ell$$ exists for any $$(\alpha (\beta ),\beta )$$ in Table 3. In particular, the $$\alpha (\beta )$$ obtained for $$\beta \ge 0.85$$ is superior to that of Theorem 6.2, and $$\alpha (1)>0.385$$, matching the approximation factor of Theorem 5.1.

In Sect. 6.3, we use the symmetry gap technique to prove both $$(0.478,1-\epsilon )$$- and $$(0.5,2\sqrt{2}/3+\epsilon )$$-inapproximability, where $$2\sqrt{2}/3\approx 0.943$$ (Theorems 6.8 and 6.9). These results are much stronger than [24, Theorem 1.6], which only proved $$(0.4998+\epsilon ,1)$$-inapproximability. Again, our constructions are variants of that used in Sect. 4.

### 3.4 Section 7: Non-negative $$\ell$$, RegularizedCSM

The results of this section are summarized in Fig. 4. In Sect. 7.1 we combine the distorted measured continuous greedy of [25] with the aided measured continuous greedy of [14] to show the following.

Theorem 7.1 For RegularizedCSM  with non-negative $$\ell$$, there is a $$\left( \alpha (\beta )-\epsilon ,\beta -\epsilon \right)$$ approximation algorithm for all $$\beta \in [0,1]$$ where $$\alpha$$ is a decreasing concave function satisfying $$\alpha (0.385)>0.385$$, $$\alpha (0.6)>0.384, \alpha \left( 1-e^{-1} \right) =e^{-1}$$, and $$\alpha (1)=0$$.

Note that $$\alpha (0.385)>0.385$$ matches the (trivial) result of directly applying the algorithm of [14] to $$f+\ell$$. In Sect. 7.2, we prove a complementary inapproximability result showing that our algorithm is tight for $$\beta \ge 1-e^{-1}$$.

Theorem 7.6 (Inapproximability of RegularizedCSM Near $$\beta =1$$) For any $$0\le \beta \le 1$$, there exist instances of RegularizedCSM with non-negative $$\ell$$ such that a $$(1-\beta +\epsilon ,\beta )$$-approximation would require exponentially many value queries.

### 3.5 Section 8: Unconstrained $$\ell$$

The results of this section are summarized in Figs. 5 and 6.

In Sect. 8.1, we present the first nontrivial $$(\alpha ,\beta )$$-approximation algorithm for RegularizedCSM where the sign of $$\ell$$ is unconstrained. Furthermore, the $$\alpha$$ we obtain for RegularizedUSM improves over that of [24] for all $$\beta \in (0,1)$$.

Theorem 8.1 For all $$t\ge 0$$, there is a $$\left( \frac{t e^{-t}}{t+e^{-t}}-\epsilon ,\frac{t}{t+e^{-t}} \right)$$-approximation algorithm for RegularizedUSM. This algorithm achieves the same approximation guarantee for RegularizedCSM  when $$t\le 1$$.

For certain values of $$\beta$$, we can achieve greater $$\alpha$$ for RegularizedCSM than that guaranteed by Theorem 8.1. Because the improvement is marginal and we do not have a closed form, our following result addresses only the specific case of $$\beta =0.7$$. Note that Theorem 8.1 guarantees a (0.277, 0.7)-approximation (by setting $$t\approx 0.925$$).

Theorem 8.3 There is a (0.280, 0.7)-approximation algorithm for RegularizedCSM.

In Sect. 8.2, we prove two theorems pertaining to improved inapproximability for RegularizedUSM. The former generalizes Theorem 1.3 of [24] in order to show improved inapproximability for a range of $$\beta$$ when $$\ell$$ is not necessarily constrained to be non-positive.

Theorem 8.4 (Inapproximability of RegularizedUSM) There are instances of RegularizedUSM where $$(\alpha (\beta ),\beta )$$ is inapproximable for any $$(\alpha (\beta ),\beta )$$ in Table 4. In particular, (0.440, 1) is inapproximable.

The latter shows stronger inapproximability specifically near $$\beta =1$$.

Theorem 8.5 (Inapproximability of RegularizedUSM, $$\beta =1$$) There are instances of RegularizedUSM where (0.408, 1) is inapproximable.

The best prior $$(\alpha ,1)$$-inapproximability result for RegularizedUSM is (0.478, 1) due to Theorem 1.3 of [24], matching the 0.478-inapproximability result for CSM due to Oveis Gharan and Vondrak [17]. We note that as Bodek and Feldman [24] show inapproximability specifically for the case of non-positive $$\ell$$, it is not too surprising that we can show improved inapproximability for general $$\ell$$. Notably, the gap between the best approximability and inapproximability results for RegularizedUSM remains quite large; in fact, it remains unclear whether an $$(\epsilon ,1)$$-approximation algorithm exists for any $$\epsilon >0$$.

## 4 Inapproximability of Maximization with Cardinality Constraint

In this section, we prove Theorem 4.1:

### Theorem 4.1

There exist instances of the problem $$\max \{f(S):S\subseteq {\mathcal {N}}\text { and } |S|\le w\}$$ where f is non-negative submodular such that a 0.478-approximation would require exponentially many value queries.

First, we provide the relevant definitions for proving inapproximability using the symmetry gap technique from Vondrak [18].

### Definition 4.2

(Symmetrization [18]) Let $${\mathcal {G}}$$ be a group of permutations over $${\mathcal {N}}$$. For $${\textbf{x}}\in [0, 1]^{{\mathcal {N}}}$$, define the “symmetrization of $${\textbf{x}}$$” as $${\overline{{\textbf{x}}}}={\mathbb {E}}_{\sigma \in {\mathcal {G}}}[\sigma ({\textbf{x}})],$$ where $$\sigma \in {\mathcal {G}}$$ is uniformly random and $$\sigma ({\textbf{x}})$$ denotes $${\textbf{x}}$$ with coordinates permuted by $$\sigma$$.

### Definition 4.3

(Symmetry Gap [18]) Let $$\max \{f(S):S\in {\mathcal {F}}\subseteq 2^{{\mathcal {N}}}\}$$ be strongly symmetric with respect to a group $${\mathcal {G}}$$ of permutations over $${\mathcal {N}}$$, meaning that for all $$\sigma \in {\mathcal {G}}$$ and $$S\subseteq {\mathcal {N}}$$, $$f(S)=f(\sigma (S))$$ and $$S\in {\mathcal {F}}\Leftrightarrow S'\in {\mathcal {F}}$$ whenever $$\overline{{\textbf{1}}_S}=\overline{{\textbf{1}}_{S'}}$$. Here, $$\sigma (S)=\{\sigma (i):i\in S\}$$. Define $$P({\mathcal {F}})=conv(\{{\textbf{1}}_I :I\in {\mathcal {F}}\})$$ to be the polytope associated with $${\mathcal {F}}$$. Let $$\textbf{OPT}\triangleq \max _{{\textbf{x}}\in P({\mathcal {F}})}F({\textbf{x}})$$ and $${\overline{\textbf{OPT}}}\triangleq \max _{{\textbf{x}}\in P({\mathcal {F}})}F({\overline{{\textbf{x}}}})$$. Then the symmetry gap of $$\max \{f(S):S\in {\mathcal {F}}\}$$ is defined as $$\gamma \triangleq \frac{{\overline{\textbf{OPT}}}}{\textbf{OPT}}.$$

### Lemma 4.4

(Inapproximability due to Symmetry Gap [18]) Let $$\max \{f(S) :S\in {\mathcal {F}}\}$$ be an instance of non-negative submodular maximization, strongly symmetric with respect to $${\mathcal {G}}$$, with symmetry gap $$\gamma$$. Let $${\mathcal {C}}$$ be the class of instances $$\max \{{{\tilde{f}}}(S) :S \in {{\tilde{{\mathcal {F}}}}}\}$$ where $${{\tilde{f}}}$$ is non-negative submodular and $${{\tilde{{\mathcal {F}}}}}$$ is a refinement of $${\mathcal {F}}$$. Then for every $$\epsilon > 0$$, any (even randomized) $$(1+\epsilon )\gamma$$-approximation algorithm for the class $${\mathcal {C}}$$ would require exponentially many queries to the value oracle for $$\tilde{f}(S)$$.

The formal definition of refinement can be found in [18]. The key point is that $${\tilde{{\mathcal {F}}}}$$ satisfies the same structural properties as $${\mathcal {F}}$$; in particular, refinement preserves cardinality and matroid independence constraints. Before proving Theorem 4.1, we start with a related lemma.

### Lemma 4.5

(Inapproximability of Cardinality Constraint on Subset of Domain) Let T be some subset of the ground set $${\mathcal {N}}$$. There exist instances of the problem $$\max \{f(S) :S\subseteq {\mathcal {N}}\wedge |S\cap T|\le w\}$$ such that a 0.478-approximation would require exponentially many value queries.

### Proof

It suffices to provide $${\mathcal {F}}$$, f, and $${\mathcal {G}}$$ satisfying the definitions of Lemma 4.4 with symmetry gap $$\gamma <0.478$$. The construction is identical to that of [17, Theorem 5.4], except we omit $$|S\cap \{a,b\}|\le 1$$ from the definition of $${\mathcal {F}}$$. Specifically, letting $$a_{1\cdots k}$$ be shorthand for $$a_1,a_2,\ldots ,a_k$$, we define $${\mathcal {N}}\triangleq \{a,b,a_{1\cdots k},b_{1\cdots k}\}$$ and

\begin{aligned} {\mathcal {F}}\triangleq \{S\mid S\subseteq {\mathcal {N}}\wedge |S \cap \{a_{1\cdots k},b_{1\cdots k}\}|\le 1\} \end{aligned}
(4.1)

\begin{aligned} {\mathcal {F}}_{orig}\triangleq \{S\mid S\subseteq {\mathcal {N}}\wedge |S\cap \{a,b\}|\le 1 \wedge |S \cap \{a_{1\cdots k},b_{1\cdots k}\}|\le 1\}. \end{aligned}
(4.2)

Recall that Theorem 5.4 of [17] defines the submodular function f as the sum of the weighted cut functions of two directed hyperedges $$(\{a_1,a_2,\ldots ,a_k\},a), (\{b_1,b_2,\ldots ,b_k\},b)$$ and the undirected edge (ab) (see Fig. 4 of [17] for an illustration). Specifically, the weighted cut function on the directed hyperedge $$(\{a_1,a_2,\ldots ,a_k\},a)$$ contributes $$\kappa \triangleq 0.3513$$ to the value of f(S) if $$S\cap \{a_1,\ldots ,a_k\} \ne \emptyset$$ and $$a\not \in S$$, and 0 otherwise. The weighted cut function on the directed hyperedge $$(\{b_1,b_2,\ldots ,b_k\},b)$$ is defined in the same way. Finally, the weighted cut function on the undirected edge (ab) contributes $$1-\kappa$$ if $$|S\cap \{a,b\}|=1$$ and 0 otherwise. Thus, the multilinear extension of f is as follows:

\begin{aligned}&F({\textbf{x}}_a,{\textbf{x}}_b,{\textbf{x}}_{a_{1\cdots k}},{\textbf{x}}_{b_{1\cdots k}})\nonumber \\&\quad \triangleq (1-\kappa )({\textbf{x}}_a(1-{\textbf{x}}_b)+{\textbf{x}}_b(1-{\textbf{x}}_a))\nonumber \\&\qquad +\kappa \left[ \left( 1-\prod _{i=1}^k(1-{\textbf{x}}_{a_i})\right) (1-{\textbf{x}}_a)+\left( 1-\prod _{i=1}^k(1-{\textbf{x}}_{b_i})\right) (1-{\textbf{x}}_b)\right] . \end{aligned}
(4.3)
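The construction above is straightforward to transcribe into code. The sketch below (our own transcription, with $$\kappa =0.3513$$ as in the text) evaluates f on arbitrary sets, e.g. confirming $$f(\{a,b_1\})=1$$ as used later in (4.7):

```python
KAPPA = 0.3513

def make_instance(k):
    # Two weighted directed hyperedges ({a_1..a_k}, a) and ({b_1..b_k}, b)
    # of weight KAPPA, plus the undirected edge (a, b) of weight 1 - KAPPA.
    A = [f"a{i}" for i in range(1, k + 1)]
    B = [f"b{i}" for i in range(1, k + 1)]
    ground = ["a", "b"] + A + B

    def f(S):
        val = 0.0
        if any(u in S for u in A) and "a" not in S:
            val += KAPPA          # hyperedge ({a_1..a_k}, a) is cut
        if any(u in S for u in B) and "b" not in S:
            val += KAPPA          # hyperedge ({b_1..b_k}, b) is cut
        if ("a" in S) != ("b" in S):
            val += 1 - KAPPA      # undirected edge (a, b) is cut
        return val

    return f, ground
```

Since f is a non-negative weighted sum of (directed hyperedge and undirected edge) cut functions, it is non-negative submodular, which can also be confirmed by brute force for small k.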

As in [17, Lemma 5.1], we let $${\mathcal {G}}$$ be the group of permutations generated by $$\{\sigma _1,\sigma _2\}$$, where

\begin{aligned} \sigma _1(a)=b, \sigma _1(b)=a, \sigma _1(a_i)=b_i, \sigma _1(b_i)=a_i \end{aligned}
(4.4)

swaps the two hyperedges, and

\begin{aligned} \sigma _2(a)=a, \sigma _2(b)=b, \sigma _2(a_i)=a_{i\pmod {k}+1}, \sigma _2(b_i)=b_i \end{aligned}
(4.5)

rotates the tail vertices of the first hyperedge. It is easy to check that $$(f,{\mathcal {F}})$$ are strongly symmetric with respect to both $$\sigma _1$$ and $$\sigma _2$$, and that the symmetrization of $${\textbf{x}}$$ is as follows:

\begin{aligned} {\overline{{\textbf{x}}}}={\mathbb {E}}_{\sigma \in {\mathcal {G}}}\left[ \sigma ({\textbf{x}}) \right] = {\left\{ \begin{array}{ll} {\overline{{\textbf{x}}}}_a={\overline{{\textbf{x}}}}_b=\frac{{\textbf{x}}_a+{\textbf{x}}_b}{2}\\ {\overline{{\textbf{x}}}}_{a_1}=\cdots ={\overline{{\textbf{x}}}}_{a_k}={\overline{{\textbf{x}}}}_{b_1}=\cdots ={\overline{{\textbf{x}}}}_{b_k}=\frac{\sum _{i=1}^{k}\left( {\textbf{x}}_{a_i}+{\textbf{x}}_{b_i} \right) }{2k}. \end{array}\right. } \end{aligned}
(4.6)
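The symmetrization formula (4.6) can be confirmed computationally for small k by generating the group $${\mathcal {G}}$$ from $$\sigma _1$$ and $$\sigma _2$$ explicitly and averaging. The helper functions below are our own sketch:

```python
def generate_group(generators):
    # BFS closure of a generating set; permutations are point -> image dicts.
    points = list(generators[0])
    ident = {p: p for p in points}
    seen = {tuple(ident[p] for p in points): ident}
    frontier = [ident]
    while frontier:
        g = frontier.pop()
        for h in generators:
            comp = {p: h[g[p]] for p in points}  # comp = h after g
            key = tuple(comp[p] for p in points)
            if key not in seen:
                seen[key] = comp
                frontier.append(comp)
    return list(seen.values())

def symmetrize(x, group):
    # xbar_u = average of x over the orbit of u; averaging x_{sigma(u)} over
    # the group equals averaging sigma(x)_u, since inverses range over G.
    return {u: sum(x[g[u]] for g in group) / len(group) for u in x}

k = 3
A = [("a", i) for i in range(k)]
B = [("b", i) for i in range(k)]
points = ["a", "b"] + A + B

# sigma_1 swaps the two hyperedges; sigma_2 rotates the tails of the first.
sigma1 = {"a": "b", "b": "a"}
sigma1.update({("a", i): ("b", i) for i in range(k)})
sigma1.update({("b", i): ("a", i) for i in range(k)})
sigma2 = {p: p for p in points}
sigma2.update({("a", i): ("a", (i + 1) % k) for i in range(k)})

group = generate_group([sigma1, sigma2])
```

For $$k=3$$ the generated group has $$2k^2=18$$ elements, and the symmetrization of any x matches (4.6) coordinate-by-coordinate.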

Observe that

\begin{aligned} \textbf{OPT}\ge \max _{S\in {\mathcal {F}}}f(S)\ge f(\{a,b_1\})=(1-\kappa )+\kappa =1. \end{aligned}
(4.7)

Defining $$q\triangleq \frac{{\textbf{x}}_a+{\textbf{x}}_b}{2}$$ and $$p\triangleq \frac{\sum _{i=1}^k({\textbf{x}}_{a_i}+{\textbf{x}}_{b_i})}{2}$$, the maximum of F over all symmetric $${\textbf{x}}$$ is thus:

\begin{aligned} {\overline{\textbf{OPT}}}&=\max _{{\textbf{x}}\in P({\mathcal {F}})}F({\overline{{\textbf{x}}}})=\max _{{\textbf{x}}\in P({\mathcal {F}})}F(q,q,\overbrace{p/k,p/k,\ldots ,p/k}^{2k\text { times}})\triangleq \max _{{\textbf{x}}\in P({\mathcal {F}})}{{\hat{F}}}(q,p) \nonumber \\&=(1-\kappa )2q(1-q)+\kappa 2(1-q)(1-(1-p/k)^k) \nonumber \\&\approx (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-e^{-p}) \end{aligned}
(4.8)

where the approximate equality holds as $$k\rightarrow \infty$$. Now,

\begin{aligned} {\overline{\textbf{OPT}}}&=\max _{{\textbf{x}}\in P({\mathcal {F}})}F({{\overline{{\textbf{x}}}}})=\max _{p\le 1/2}{\hat{F}}(q,p)=\max _{p,q\le 1/2}{\hat{F}}(q,p)\nonumber \\&=\max _{{\textbf{x}}\in P({\mathcal {F}}_{orig})}F({{\overline{{\textbf{x}}}}})<0.478. \end{aligned}
(4.9)

The third equality holds (i.e., adding the constraint $$q\le 1/2$$ has no effect) since $${\hat{F}}(q,p)\le {\hat{F}}(1-q,p)$$ for $$q\in (1/2,1]$$, while the inequality holds due to the proof of [17, Theorem 5.4] and may be verified using a numerical optimizer. So the symmetry gap $$\gamma =\frac{{\overline{\textbf{OPT}}}}{\textbf{OPT}}$$ is less than 0.478, as desired. $$\square$$
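The final bound can be spot-checked by grid search. The value $$\kappa =2-\sqrt{e}\approx 0.351$$ below is our own illustrative choice (the construction's actual $$\kappa$$ is fixed in [17]); with it, the maximum of $${\hat{F}}$$ over $$q,p\le 1/2$$ lands just below 0.478:

```python
import math

kappa = 2 - math.sqrt(math.e)   # illustrative choice (ours), ~0.3513

def F_hat(q, p):
    # \hat{F}(q, p) from Eq. (4.8), in the k -> infinity limit
    return (1 - kappa) * 2 * q * (1 - q) + kappa * 2 * (1 - q) * (1 - math.exp(-p))

N = 400                         # grid over q, p in [0, 1/2]
best = max(F_hat(i / (2 * N), j / (2 * N))
           for i in range(N + 1) for j in range(N + 1))
print(f"max of F_hat over q, p <= 1/2: {best:.4f}")
```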

Now, to show Theorem 4.1, all that remains is to convert the cardinality constraint on $$S\cap T$$ in Lemma 4.5 into a cardinality constraint on all of S.

### Proof of Theorem 4.1

Again, it suffices to provide $${\mathcal {F}}$$, f, and $${\mathcal {G}}$$ satisfying the definitions of Lemma 4.4 with symmetry gap $$\gamma <0.478$$. We start with the construction from Lemma 4.5, replace each element $$a_i$$ and $$b_i$$ with t copies $$a_{i,1},\ldots , a_{i,t}$$, and $$b_{i,1},\ldots ,b_{i,t}$$ and set $$w\triangleq t+1$$. Letting $$a_{1\cdots k,1\cdots t}$$ be shorthand for $$\{a_{ij}:i\in [1,k]\wedge j\in [1,t]\}$$, we redefine f such that F is as follows:

\begin{aligned}&F({\textbf{x}}_a,{\textbf{x}}_b,{\textbf{x}}_{a_{1\cdots k,1\cdots t}},{\textbf{x}}_{b_{1\cdots k,1\cdots t}})\nonumber \\&\quad \triangleq (1-\kappa )({\textbf{x}}_a(1-{\textbf{x}}_b)+{\textbf{x}}_b(1-{\textbf{x}}_a))\nonumber \\&\quad \quad +\kappa \left[ \left( 1-\prod _{i=1}^k\left( 1-\frac{\sum _{j=1}^t{\textbf{x}}_{a_{i,j}}}{t}\right) \right) (1-{\textbf{x}}_a) \nonumber \right. \\&\qquad \left. +\left( 1-\prod _{i=1}^k\left( 1-\frac{\sum _{j=1}^t{\textbf{x}}_{b_{i,j}}}{t}\right) \right) (1-{\textbf{x}}_b)\right] . \end{aligned}
(4.10)

Importantly, f remains non-negative submodular and symmetric, with the new symmetrization being as follows for an appropriate choice of $${\mathcal {G}}$$:

\begin{aligned} {\overline{{\textbf{x}}}}={\mathbb {E}}_{\sigma \in {\mathcal {G}}}\left[ \sigma ({\textbf{x}}) \right] ={\left\{ \begin{array}{ll} {\overline{{\textbf{x}}}}_a={\overline{{\textbf{x}}}}_b=\frac{{\textbf{x}}_a+{\textbf{x}}_b}{2}\\ {\overline{{\textbf{x}}}}_{a_{1,1}}=\cdots ={\overline{{\textbf{x}}}}_{a_{k,t}}={\overline{{\textbf{x}}}}_{b_{1,1}}=\cdots ={\overline{{\textbf{x}}}}_{b_{k,t}}\\ \quad =\frac{\sum _{i=1}^{k}\sum _{j=1}^t\left( {\textbf{x}}_{a_{i,j}}+{\textbf{x}}_{b_{i,j}} \right) }{2kt}. \end{array}\right. } \end{aligned}
(4.11)

For example, we may define $${\mathcal {G}}$$ to be the group generated by $$\{\sigma _1,\sigma _2,\sigma _3\}$$ where

• $$\sigma _1$$ swaps a with b and $$a_{i,j}$$ with $$b_{i,j}$$;

• $$\sigma _2$$ takes $$a_{i,j}$$ to $$a_{i\pmod {k}+1,j}$$ and leaves all other vertices unchanged;

• $$\sigma _3$$ takes $$a_{1,j}$$ to $$a_{1,j\pmod {t}+1}$$ and leaves all other vertices unchanged.

It can be verified that $$F({\overline{{\textbf{x}}}})$$ may be written in terms of the same function of two variables $${\hat{F}}(q,p)$$ from Lemma 4.5. Let q be as defined above and redefine $$p\triangleq \frac{\sum _{i=1}^k\sum _{j=1}^t({\textbf{x}}_{a_{ij}}+{\textbf{x}}_{b_{ij}})}{2t},$$ so that:

\begin{aligned} F({{\overline{{\textbf{x}}}}})&\triangleq F(q,q,\overbrace{p/k,p/k,\ldots ,p/k}^{2kt\text { times}})\triangleq {{\hat{F}}}(q,p) \nonumber \\&=(1-\kappa )2q(1-q)+\kappa 2(1-q)(1-(1-p/k)^k) \nonumber \\&\approx (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-e^{-p}) \end{aligned}
(4.12)

where the approximate equality holds as $$k\rightarrow \infty$$, same as before.

To finish, we must show that the symmetry gap of f with respect to $${\mathcal {F}}$$ remains less than 0.478 as $$t\rightarrow \infty$$. As in the proof of Lemma 4.5,

\begin{aligned} \textbf{OPT}&\ge \max _{S:|S|\le t+1}f(S)\ge f(\{a,b_{1,1},\ldots ,b_{1,t}\})=1, \end{aligned}
(4.13)
\begin{aligned} {\overline{\textbf{OPT}}}&=\max _{\sum {\textbf{x}}_i\le t+1}F({\overline{{\textbf{x}}}})=\max _{\sum {\textbf{x}}_i\le t+1}{\hat{F}}(q,p)\le \max _{p\le \frac{t+1}{2t}}{\hat{F}}(q,p)\approx \max _{p\le 1/2}{\hat{F}}(q,p)<0.478, \end{aligned}
(4.14)

where the gap between the two sides of the approximate equality goes to 0 as $$t\rightarrow \infty$$ because $${\hat{F}}$$ is Lipschitz continuous and its domain is bounded. So the symmetry gap $$\gamma =\frac{{\overline{\textbf{OPT}}}}{\textbf{OPT}}$$ is again less than 0.478, as desired. $$\square$$
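The Lipschitz argument can also be checked numerically: since $${\hat{F}}$$ is non-decreasing in p, the two maxima differ by $${\mathcal {O}}(1/t)$$. A sketch with an illustrative $$\kappa =2-\sqrt{e}$$ (our own choice; the construction fixes its own $$\kappa$$):

```python
import math

kappa = 2 - math.sqrt(math.e)   # illustrative kappa (ours)

def F_hat(q, p):
    return (1 - kappa) * 2 * q * (1 - q) + kappa * 2 * (1 - q) * (1 - math.exp(-p))

def max_F(p_cap, steps=2000):
    # F_hat is non-decreasing in p, so the max over p <= p_cap sits at p = p_cap
    return max(F_hat(i / steps, p_cap) for i in range(steps + 1))

t = 100
gap = abs(max_F((t + 1) / (2 * t)) - max_F(0.5))  # shrinks like O(1/t)
```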

## 5 Non-positive $$\ell$$

The results of this section are summarized in Fig. 1.

### 5.1 Approximation Algorithms

In this subsection, we provide improved approximations for general f (Theorem 5.1) as well as for f a cut function (Propositions 5.4 and 5.5).

### Theorem 5.1

For RegularizedUSM with non-positive $$\ell$$, an $$(\alpha (\beta ),\beta )$$-approximation algorithm exists for any $$(\alpha (\beta ),\beta )$$ in Table 1. In particular, $$\alpha (1)>0.385$$ and $$\alpha (1.3)>0.398$$. When $$\beta \le 1$$, there is an algorithm for RegularizedCSM that achieves the same approximation factor.

### Lemma 5.2

There is a (0.385, 1)-approximation algorithm for RegularizedCSM when $$\ell$$ is non-positive.

### Proof

The idea is to combine the “guessing step” of Sviridenko et al. [19] with the 0.385-approximation for CSM due to Buchbinder and Feldman [14] (which actually provides a $$(0.385+\epsilon )$$-approximation for any $$\epsilon \le 0.0006$$). Recall that [19] achieves a $$\left( 1-\frac{1}{e}-\epsilon ,1 \right)$$-approximation for monotone f and non-positive $$\ell$$. If we knew the value of $$\ell (OPT)$$, we could run the algorithm of [14] on the intersection $${\mathcal {P}}\cap \{{\textbf{x}}: L({\textbf{x}})\ge \ell (OPT)\}$$, which is down-closed and solvable because both $${\mathcal {P}}$$ and $$\{{\textbf{x}}: L({\textbf{x}})\ge \ell (OPT)\}$$ are down-closed and solvable. This would guarantee finding $${\textbf{x}}\in {\mathcal {P}}$$ such that $${\mathbb {E}}[F({\textbf{x}})]\ge 0.385f(OPT)$$ and $${\mathbb {E}}[L({\textbf{x}})]\ge \ell (OPT)$$. Afterward, we can use pipage rounding to round $${\textbf{x}}$$ to an integral solution $$T\in {\mathcal {I}}$$ [18]. Specifically, given $${\textbf{x}}\in {\mathcal {P}}$$, pipage rounding generates $$T\in {\mathcal {I}}$$ such that $${\mathbb {E}}[{\textbf{1}}_T]={\textbf{x}}$$ and $${\mathbb {E}}[F({\textbf{1}}_T)+L({\textbf{1}}_T)]\ge F({\textbf{x}})+L({\textbf{x}})$$.

Of course, we do not actually know what the value of $$\ell (OPT)$$ is. To guarantee that we run [14] on the intersection $${\mathcal {P}}\cap \{{\textbf{x}}: L({\textbf{x}})\ge w\}$$ for some w satisfying $$\ell (OPT)\ge w\ge \ell (OPT)(1+\epsilon )$$, it suffices to try setting w equal to each of the $${\mathcal {O}}\left( \frac{n^2}{\epsilon }\right)$$ values in the following set:

(5.1)

For at least one of these values of w (“guesses”), we will have $${\mathbb {E}}[F({\textbf{x}})]\ge (0.385+\epsilon )f(OPT)$$ (if $$\epsilon \le 0.0006$$) and $${\mathbb {E}}[L({\textbf{x}})]\ge (1+\epsilon )\ell (OPT)$$. Combining these guarantees shows that rounding $${\textbf{x}}$$ provides a $$\left( 0.385+\epsilon ,1+\epsilon \right)$$ approximation, which in turn implies a $$\left( 0.385,1 \right)$$ approximation because

\begin{aligned}&\max (0,(0.385+\epsilon )f(OPT)+(1+\epsilon )\ell (OPT))\nonumber \\&\quad \ge \frac{(0.385+\epsilon )f(OPT)+(1+\epsilon )\ell (OPT)}{1+\epsilon }\nonumber \\&\quad \ge 0.385 f(OPT)+\ell (OPT). \end{aligned}
(5.2)

$$\square$$
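The chain of inequalities in Eq. (5.2) uses only $$f(OPT)\ge 0$$, $$\ell (OPT)\le 0$$, and $$\epsilon \ge 0$$; a quick randomized spot-check:

```python
import random

random.seed(1)
for _ in range(10_000):
    f_opt = random.uniform(0, 10)       # f(OPT) >= 0
    l_opt = -random.uniform(0, 10)      # ell(OPT) <= 0
    eps = random.uniform(0, 0.0006)
    v = (0.385 + eps) * f_opt + (1 + eps) * l_opt
    # max(0, v) >= v / (1 + eps) >= 0.385 f(OPT) + ell(OPT)
    assert max(0.0, v) >= v / (1 + eps) - 1e-12
    assert v / (1 + eps) >= 0.385 * f_opt + l_opt - 1e-12
checked = True
```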

Before proving Theorem 5.1, we start by briefly reviewing the main algorithm from [14] when executed on a solvable down-closed polytope $${\mathcal {P}}$$. First, it uses a local search to generate $${\textbf{z}}\in {\mathcal {P}}$$ such that both of the following inequalities hold with high probability:

\begin{aligned} F({\textbf{z}})&\ge \frac{1}{2}F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+\frac{1}{2}F({\textbf{z}}\vee {\textbf{1}}_{OPT})-o(1)\cdot f(OPT), \end{aligned}
(5.3)
\begin{aligned} F({\textbf{z}})&\ge F({\textbf{z}}\wedge {\textbf{1}}_{OPT})-o(1)\cdot f(OPT). \end{aligned}
(5.4)

Then it executes [14, Algorithm 2], Aided Measured Continuous Greedy, to generate $${\textbf{y}}\in {\mathcal {P}}$$ such that

\begin{aligned} {\mathbb {E}}[F({\textbf{y}})]&\ge e^{t_s-1}\cdot [(2-t_s-e^{-t_s}-o(1))\cdot f(OPT)-(1-e^{-t_s})\cdot F({\textbf{z}}\wedge {\textbf{1}}_{OPT})\nonumber \\&\quad -(2-t_s-2e^{-t_s})\cdot F({\textbf{z}}\vee {\textbf{1}}_{OPT})]. \end{aligned}
(5.5)

Finally, assuming $${\mathcal {P}}$$ is the matroid polytope corresponding to a family of independent sets $${\mathcal {I}}$$, pipage rounding may be used to convert both $${\textbf{y}}$$ and $${\textbf{z}}$$ to integral solutions $$y\in {\mathcal {I}}$$ and $$z\in {\mathcal {I}}$$ such that $${\mathbb {E}}[f(y)]\ge F({\textbf{y}})$$ and $${\mathbb {E}}[f(z)]\ge F({\textbf{z}})$$, and the solution from y and z with the larger value of f will be returned.

To obtain improved approximation bounds, we need the following generalization of Eq. (5.5):

### Lemma 5.3

(Generalization of Aided Measured Continuous Greedy) If we run Aided Measured Continuous Greedy given a fractional solution $${\textbf{z}}$$ and a polytope $${\mathcal {P}}$$ for a total of $$t_f$$ time, where $$t_f\ge t_s$$, it will generate $${\textbf{y}}\in t_f{\mathcal {P}}\cap [0,1]^{{\mathcal {N}}}$$ such that

\begin{aligned} {\mathbb {E}}[F({\textbf{y}})]&\ge e^{-t_f}[(e^{t_s}+t_fe^{t_s}-t_se^{t_s}-1-o(1))f(OPT) \nonumber \\&\quad +(-e^{t_s}+1)F({\textbf{z}}\wedge {\textbf{1}}_{OPT})\nonumber \\&\quad +(-e^{t_s}-t_fe^{t_s}+t_se^{t_s}+1+t_f)F({\textbf{z}}\vee {\textbf{1}}_{OPT})] \end{aligned}
(5.6)

Note that this matches term by term with Eq. (5.5) when $$t_f=1$$.
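This term-by-term agreement is easy to confirm numerically by comparing the coefficients of $$f(OPT)$$, $$F({\textbf{z}}\wedge {\textbf{1}}_{OPT})$$, and $$F({\textbf{z}}\vee {\textbf{1}}_{OPT})$$ in Eqs. (5.5) and (5.6) (o(1) terms dropped):

```python
import math

def coeffs_general(ts, tf):
    # f(OPT), F(z ^ 1_OPT), F(z v 1_OPT) coefficients of Eq. (5.6), o(1) dropped
    e = math.exp
    return (e(-tf) * (e(ts) + tf * e(ts) - ts * e(ts) - 1),
            e(-tf) * (1 - e(ts)),
            e(-tf) * (-e(ts) - tf * e(ts) + ts * e(ts) + 1 + tf))

def coeffs_original(ts):
    # the corresponding coefficients of Eq. (5.5)
    e = math.exp
    return (e(ts - 1) * (2 - ts - e(-ts)),
            -e(ts - 1) * (1 - e(-ts)),
            -e(ts - 1) * (2 - ts - 2 * e(-ts)))

match = all(abs(a - b) < 1e-12
            for ts in (0.0, 0.1, 0.25, 0.5, 0.9)
            for a, b in zip(coeffs_general(ts, 1.0), coeffs_original(ts)))
```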

### Proof Sketch

By [14], proving the conclusion for integral sets Z implies the conclusion for fractional $${\textbf{z}}$$. So it suffices to prove the following.

\begin{aligned} {\mathbb {E}}[F({\textbf{y}}(t_f))]&\ge e^{-t_f}[(e^{t_s}+t_fe^{t_s}-t_se^{t_s}-1-o(1))f(OPT) \nonumber \\&\quad +(-e^{t_s}+1)f(OPT\cap Z) \nonumber \\&\quad +(-e^{t_s}-t_fe^{t_s}+t_se^{t_s}+1+t_f)f(OPT\cup Z)]. \end{aligned}
(5.7)

The idea of the original aided measured continuous greedy is to run measured continuous greedy for $$t_s$$ time only on the elements of $${\mathcal {N}}\backslash Z$$, and then for $$1-t_s$$ additional time on all elements of $${\mathcal {N}}$$. Working out what happens when we run it for a total of $$t_f$$ time instead of 1 is just a matter of going through the equations from [14, Section 4] and making minor changes. The remainder of the proof is deferred to Sect. A.3.

### Proof of Theorem 5.1

Our algorithm for RegularizedUSM is as follows:

1. As in Lemma 5.2, first guess the value of $$\ell (OPT)$$ to within a factor of $$1+\epsilon$$, and then replace $${\mathcal {P}}$$ with $${\mathcal {P}}\cap \{{\textbf{x}}: L({\textbf{x}})\ge (1+\epsilon )\ell (OPT)\}$$.

2. Generate $${\textbf{z}}$$ using the local search procedure on $$(f,{\mathcal {P}})$$ described by [14].

3. Run Aided Measured Continuous Greedy given $${\textbf{z}}$$ for all pairs

\begin{aligned} (t_s,t_f)\in {\mathcal {T}}\triangleq \left\{ \left( \frac{x}{20}, \frac{y}{20} \right) \big | (x,y)\in {\mathbb {Z}}^2\text { and }0\le x\le y\le 40\right\} . \end{aligned}
(5.8)
4. Round $${\textbf{z}}$$ from step 2 and all fractional solutions found in step 3 to valid integral solutions. Note that replacing $${\textbf{z}}$$ with $$\texttt {R}({\textbf{z}})$$ preserves the value of $$F+L$$ in expectation.

5. Return the solution from step 4 with the maximum value, or the empty set if none of these solutions has positive value. Let $$\textbf{OPT}'$$ be the expected value of this solution.

For a fixed $$\beta \ge 0$$, we can compute the maximum $$\alpha (\beta )$$ such that an $$(\alpha (\beta )-{\mathcal {O}}\left( \epsilon \right) ,\beta )$$-approximation is guaranteed by solving the following linear program:

(5.9)

Any point $$(x_1,x_2,x_3,x_4)$$ within the convex hull satisfies:

\begin{aligned} \textbf{OPT}'\ge x_1 f(OPT)+x_2F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+x_3F({\textbf{z}}\vee {\textbf{1}}_{OPT})+x_4 L(OPT) \end{aligned}
(5.10)

if we ignore the o(1) terms contributed by Lemma 5.3 and Eqs. (5.3) and (5.4) and take the limit as $$\epsilon \rightarrow 0$$. The points determining the hull are as follows:

• (0, 0, 0, 0) corresponds to returning the empty set.

• (0, 0.5, 0.5, 1) corresponds to $${\textbf{z}}$$ satisfying Eq. (5.3).

• (0, 1, 0, 1) corresponds to $${\textbf{z}}$$ satisfying Eq. (5.4).

• The remaining vertices of the hull correspond to running Lemma 5.3 on $${\mathcal {P}}$$ given $${\textbf{z}}$$ for all $$(t_s,t_f)\in {\mathcal {T}}$$.

Adding the constraints $$x_2,x_3\ge 0$$ and $$x_4\le \beta$$ ensures that $$\textbf{OPT}'\ge x_1f(OPT)+\beta \ell (OPT)$$. The results of solving this program with CVXPY [26] for $$\beta \in [0,1.5]$$ are displayed in Fig. 1. In particular, $$\alpha (1)\ge 0.385$$, and the maximum value of $$\alpha$$ is attained around $$\beta =1.3$$, with $$\alpha (1.3)\ge 0.398$$.

For the case of RegularizedCSM, the reasoning is almost the same, but to ensure that all points returned by Lemma 5.3 lie within $${\mathcal {P}}$$, we only include pairs in $${\mathcal {T}}$$ with $$t_f\le 1$$ in step 3, and pipage rounding with respect to the original $${\mathcal {P}}$$ (not $${\mathcal {P}}\cap \{{\textbf{x}}: L({\textbf{x}})\ge (1+\epsilon )\ell (OPT)\}$$, which is not necessarily a matroid polytope) must be used for step 4. The results turn out to be identical to those displayed in Fig. 1 for $$\beta \le 1$$. $$\square$$

Next, we state better approximation results for f an undirected and directed cut function, respectively. The proofs, which use linear programming, are deferred to Sect. A.3. We note that linear programming was previously used to provide a 0.5-approximation for MAX-DICUT by Trevisan [27] and later by Halperin and Zwick [28].

### Proposition 5.4

There is a (0.5, 1)-approximation algorithm for RegularizedCSM when $$\ell$$ has arbitrary sign and f is the cut function of a weighted undirected graph $$(V,E,w)$$; that is, for all $$S\subseteq V$$,

\begin{aligned} f(S)\triangleq \sum _{ab\in E}w_{ab}\cdot [|S\cap \{a,b\}|=1], \end{aligned}
(5.11)

where each edge weight $$w_{ab}$$ is non-negative.

Note that while our above result for undirected cut functions applies to RegularizedCSM, our subsequent result for directed cut functions only applies to RegularizedUSM.

### Proposition 5.5

There is a (0.5, 1)-approximation algorithm for RegularizedUSM when $$\ell$$ has arbitrary sign and f is the cut function of a weighted directed graph $$(V,E,w)$$; that is, for all $$S\subseteq V$$,

\begin{aligned} f(S)=\sum _{ab\in E}w_{ab}\cdot [a\in S\text { and }b\not \in S], \end{aligned}
(5.12)

where each edge weight $$w_{ab}$$ is non-negative.

### 5.2 Inapproximability

In this subsection, we prove Theorem 5.6. Recall from Fig. 1 that it unifies the guarantees of [24, Theorem 1.1] and [24, Theorem 1.3].

### Theorem 5.6

There are instances of RegularizedUSM with non-positive $$\ell$$ such that $$(\alpha (\beta ),\beta )$$ is inapproximable for any $$(\alpha (\beta ),\beta )$$ in Table 2. In particular, $$\alpha (0)\approx 0$$, matching the result of [24, Theorem 1.1], and $$\alpha (1) < 0.478$$, matching the result of [24, Theorem 1.3].

The idea behind the proof of Theorem 5.6 is to extend the symmetry gap construction of [24, Theorem 1.3], which in turn is a modification of the 0.478-inapproximability result of [17] used in Sect. 4.

Before proving Theorem 5.6, we state a generalization of the symmetry gap technique to $$f+\ell$$ sums that we use for Theorem 5.6 and the rest of our inapproximability results.

### Definition 5.7

[18] We say that $$\max _{S\in {\mathcal {F}}}\left[ f(S)+\ell (S) \right]$$ is strongly symmetric with respect to a group of permutations $${\mathcal {G}}$$ if $$\ell (S)=\ell (\sigma (S))$$ for all $$\sigma \in {\mathcal {G}}$$ and $$(f,{\mathcal {F}})$$ are strongly symmetric with respect to $${\mathcal {G}}$$ as defined in Definition 4.3.

### Lemma 5.8

(Inapproximability of $$(\alpha ,\beta )$$ Approximations) Let $$\max _{S\in {\mathcal {F}}}\left[ f(S)+\ell (S) \right]$$ be an instance of submodular maximization with f non-negative submodular and $$\ell$$ linear, strongly symmetric with respect to a group of permutations $${\mathcal {G}}$$. For any two constants $$\alpha ,\beta \ge 0$$, if

\begin{aligned} \max _{{\textbf{x}}\in P({\mathcal {F}})}\left[ F({{\overline{{\textbf{x}}}}})+L({{\overline{{\textbf{x}}}}}) \right] <\max _{S\in {\mathcal {F}}}[\alpha f(S)+\beta \ell (S)], \end{aligned}
(5.13)

then no polynomial-time algorithm for RegularizedCSM can guarantee an $$(\alpha ,\beta )$$-approximation. The same inapproximability holds for RegularizedUSM by setting $${\mathcal {F}}=2^{{\mathcal {N}}}$$.

### Proof

Theorem 3.1 of [24] proves this lemma only for the special case of $${\mathcal {F}}=2^{{\mathcal {N}}}$$ because the proof of [24, Lemma A.3] cites a special case of [18, Lemma 3.3] that only applies for $${\mathcal {F}}=2^{{\mathcal {N}}}$$. It suffices to modify the proof to cite the full version of [18, Lemma 3.3] instead. $$\square$$

### Proof of Theorem 5.6

Set f to be the same as defined in Lemma 4.5. Now apply Lemma 5.8 with $$S=\{a,b_1\}$$. For a fixed $$\beta$$, we can show $$(\alpha ,\beta )$$-inapproximability using this method if it is possible to choose $$\ell$$ and $$\kappa$$ such that the following inequality is true:

\begin{aligned} \max _{{\textbf{x}}\in P({\mathcal {F}})}\left[ F({{\overline{{\textbf{x}}}}})+L(\overline{{\textbf{x}}}) \right]&<\alpha f(\{a,b_1\})+\beta \ell (\{a,b_1\})\nonumber \\&=\alpha + \beta \ell (\{a,b_1\}) \end{aligned}
(5.14)
\begin{aligned} \implies \max _{{\textbf{x}}\in P({\mathcal {F}})}\left[ F({{\overline{{\textbf{x}}}}})+L({{\overline{{\textbf{x}}}}}) \right] -\beta \ell (\{a,b_1\})<\alpha . \end{aligned}
(5.15)

Our goal is now to minimize the LHS of the above inequality. Theorem 1.3 of [24] sets $$\ell _a=\ell _b=0$$, and then chooses $$\kappa$$ and $$\ell _{a_{1\cdots k}}=\ell _{b_{1\cdots k}}\triangleq \ell _p\le 0$$ in order to minimize the quantity

\begin{aligned}&\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L(\overline{{\textbf{x}}}) \right] -\beta \ell (\{a,b_1\})\nonumber \\&\quad =\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L({{\overline{{\textbf{x}}}}}) \right] -\beta \ell _p\nonumber \\&\quad =\max _{0\le q\le 1, 0\le p}\left[ (1-\kappa )2q(1-q)+\kappa 2 (1-q)(1-e^{-p})+2p\ell _p \right] -\beta \ell _p. \end{aligned}
(5.16)

However, choosing $$\ell _a=\ell _b\triangleq \ell _q$$ to be negative rather than zero gives superior bounds for small $$\beta$$. That is, our goal is to compute

\begin{aligned} \min _{0\le \kappa \le 1, \ell _q\le 0, \ell _p\le 0}\Big [&\max _{0\le q\le 1, 0\le p}\big [(1-\kappa )2q(1-q)+\kappa 2(1-q)(1-e^{-p})\nonumber \\&+2p\ell _p+2q\ell _q\big ]-\beta (\ell _p+\ell _q)\Big ]. \end{aligned}
(5.17)

We can approximate the optimal value by brute forcing over a range of $$(\kappa ,\ell _q,\ell _p)$$. For $$\beta \in \{0.8,0.9,1.0\}$$, it is optimal to set $$\ell _q=0$$, and our guarantee is the same as that of [24, Theorem 1.3]. Our results for $$\beta \in \{0.6,0.7\}$$ are stronger than those of [24, Theorem 1.3] even though they also set $$\ell _q=0$$, because that theorem actually only considers $$\ell _p\ge -0.5$$ and $$\kappa \le 0.5$$. $$\square$$
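A minimal version of this brute-force search for $$\beta =1$$ (with $$\ell _q=0$$, which the text notes is optimal there): the grid ranges and the closed-form maximization over q, obtained by completing the square as in Eq. (5.19), are our own choices.

```python
import math

def inner_max(kappa, lp, p_max=4.0, steps=800):
    # maximize over q in closed form (complete the square, as in Eq. (5.19)),
    # then grid over p in [0, p_max]; lp <= 0 keeps the max at moderate p
    best = -float("inf")
    for i in range(steps + 1):
        p = p_max * i / steps
        E = math.exp(-p)
        val = ((1 - 2 * kappa + kappa * E) ** 2 / (2 * (1 - kappa))
               + 2 * kappa * (1 - E) + 2 * p * lp)
        best = max(best, val)
    return best

beta = 1.0
# minimize Eq. (5.17) over a (kappa, l_p) grid, with l_q = 0
best = min(inner_max(0.25 + 0.01 * a, -0.01 * b) + beta * 0.01 * b
           for a in range(21) for b in range(31))
print(f"approximate value of (5.17) at beta = 1: {best:.4f}")
```

The minimum lands close to the 0.478 bound of [24, Theorem 1.3], as expected.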

Next, we consider the limit of Theorem 5.6 as $$\alpha (\beta )\rightarrow 0.5$$. Note that this is not a new result in the sense that [24, Theorem 1.3] can already prove it when the parameters $$\ell _p$$ and $$\kappa$$ are chosen appropriately, but we nevertheless believe it is worth explicitly stating.

### Proposition 5.9

For any $$\epsilon >0$$, there are instances of RegularizedUSM with non-positive $$\ell$$ such that $$(0.5,2\ln 2-\epsilon )$$ is inapproximable, where $$2\ln 2\approx 1.386$$.

### Proof

To find the maximum $$\beta$$ such that we can show $$(0.5,\beta )$$-inapproximability using the construction of Theorem 5.6, our goal is to choose $$\kappa \in (0,0.5)$$ and $$\ell _p<0$$ such that $$\beta$$ is maximized in

\begin{aligned}&\max _{0\le q\le 1, 0\le p}\left[ (1-\kappa )2q(1-q)+\kappa 2 (1-q)(1-e^{-p})+2p\ell _p \right] -\beta \ell _p<0.5\nonumber \\&\quad \Longleftrightarrow \beta <\frac{0.5-\max _{0\le q\le 1, 0\le p} \left[ (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-e^{-p})+2p\ell _p \right] }{-\ell _p}. \end{aligned}
(5.18)

We can rewrite half the expression within the $$\max$$ as

\begin{aligned}&\max _{q, 0\le p}\left[ (1-\kappa )q(1-q)+\kappa (1-q)(1-e^{-p})+p\ell _p \right] \nonumber \\&\quad = \max _{q,0\le p}\left[ -q^2(1-\kappa )+q(1-2\kappa +\kappa e^{-p})+\kappa (1-e^{-p})+p\ell _p \right] \nonumber \\&\quad =\max _{0\le p}\left[ \frac{(1-2\kappa +\kappa e^{-p})^2}{4(1-\kappa )}+\kappa (1-e^{-p})+p\ell _p \right] , \end{aligned}
(5.19)

so the RHS of Eq. (5.18) becomes:

\begin{aligned}&\frac{2\cdot \min _{0\le p}\left[ \frac{1-\kappa -(1-2\kappa +\kappa e^{-p})^2}{4(1-\kappa )}-\kappa (1-e^{-p})-p\ell _p \right] }{-\ell _p}\nonumber \\&\quad =\frac{2\cdot \min _{0\le p}\left[ \frac{\kappa (2e^{-p}-1) - \kappa ^2e^{-2p}}{4(1-\kappa )}-p\ell _p \right] }{-\ell _p}. \end{aligned}
(5.20)

Next, we claim that for any $$p^*>0$$, it is possible to choose $$\ell _p<0$$ such that the numerator of Eq. (5.20) reaches its minimum at $$p=p^*$$. Define the function $$h(p)\triangleq \frac{\kappa (2e^{-p}-1) - \kappa ^2e^{-2p}}{4(1-\kappa )}$$. It suffices to check that h is decreasing at $$p=0$$ and concave up for $$p\ge 0$$; that is, $$\frac{d}{dp}h(p)\Bigr |_{p=0}<0$$ and $$\frac{d^2}{dp^2}\left[ \frac{\kappa (2e^{-p}-1) - \kappa ^2e^{-2p}}{4(1-\kappa )} \right] >0$$ for all $$p\ge 0$$. Both of these inequalities follow from the assumption $$\kappa \in (0,0.5)$$.

Finally, when $$p^*<\ln 2$$, $$2e^{-p^*}>1$$, implying that $$h(p^*)>0$$ when $$\kappa$$ is sufficiently close to 0. For such $$p^*$$, the RHS of Eq. (5.20) becomes

\begin{aligned} \frac{2(h(p^*)-p^*\ell _p)}{-\ell _p}\ge \frac{-2p^*\ell _p}{-\ell _p}=2p^*, \end{aligned}
(5.21)

so the RHS of Eq. (5.18) can be made arbitrarily close to $$2\ln 2$$. $$\square$$
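The limiting argument can be reproduced numerically: with $$\kappa$$ close to 0, $$p^*$$ slightly below $$\ln 2$$, and $$\ell _p=h'(p^*)$$ (so that the convex function $$h(p)-p\ell _p$$ is minimized at $$p^*$$), the RHS of Eq. (5.20) approaches $$2\ln 2\approx 1.386$$. The specific values below are our own:

```python
import math

kappa = 1e-3                 # kappa close to 0 (our choice)
p_star = 0.69                # p* slightly below ln 2 ~ 0.6931

def h(p):
    return (kappa * (2 * math.exp(-p) - 1)
            - kappa ** 2 * math.exp(-2 * p)) / (4 * (1 - kappa))

# l_p = h'(p*) < 0, so h(p) - p * l_p (convex for kappa < 1/2) is minimized at p*
lp = (-2 * kappa * math.exp(-p_star)
      + 2 * kappa ** 2 * math.exp(-2 * p_star)) / (4 * (1 - kappa))

grid = [3.0 * i / 3000 for i in range(3001)]
bound = 2 * min(h(p) - p * lp for p in grid) / (-lp)   # RHS of Eq. (5.20)
```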

## 6 Non-negative $$\ell$$: RegularizedUSM

### 6.1 Approximations with Double Greedy

In this subsection, we prove improved approximation guarantees for DeterministicDG and RandomizedDG in Theorems 6.1 and 6.2, and then show that both results are tight in Propositions 6.3 and 6.4. The results of this subsection are summarized in Fig. 2.

First, we briefly review the behavior of the original DeterministicDG and RandomizedDG of [10] when executed on a non-negative submodular function g, as well as their approximation factors.

The Algorithm:

The algorithm will construct sequences of sets $$X_i,Y_i$$ for $$i\in [0,n]$$. First, $$X_0\triangleq \emptyset$$ and $$Y_0\triangleq {\mathcal {N}}$$. Then for each i from 1 to n, execute the following two steps:

1. Compute the marginal gains $$g(u_i|X_{i-1})=g(X_{i-1}\cup \{u_i\})-g(X_{i-1})$$ and $$g(u_i|Y_{i-1}\backslash \{u_i\})= g(Y_{i-1})-g(Y_{i-1}\backslash \{u_i\})$$. As in the original analysis of double greedy,

\begin{aligned} g(u_i|X_{i-1})-g(u_i | Y_{i-1}\backslash \{u_i\})\ge 0 \end{aligned}
(6.1)

holds by submodularity.

2. Based on the marginal gains, either set $$(X_i,Y_i)=(X_{i-1}\cup \{u_i\},Y_{i-1})$$ or $$(X_i,Y_i)=(X_{i-1},Y_{i-1}\backslash \{u_i\})$$.

• In DeterministicDG, the first event occurs if $$g(u_i|X_{i-1})\ge -g(u_i | Y_{i-1}\backslash \{u_i\})$$.

• In RandomizedDG, the first event occurs with probability proportional to $$a_i\triangleq \max (g(u_i|X_{i-1}),0)$$, while the second event occurs with probability proportional to $$b_i\triangleq \max (-g(u_i | Y_{i-1}\backslash \{u_i\}),0)$$. In the edge case where $$a_i=b_i=0$$, it does not matter which event occurs.

Finally, the algorithm returns $$X_n=Y_n$$.

The Approximation Factors:

Let $$OPT_i\triangleq (OPT\cup X_i)\cap Y_i$$, so that $$OPT_0=OPT$$ while $$OPT_n=X_n=Y_n$$. For DeterministicDG, it can be shown via exhaustive casework that:

\begin{aligned} g(OPT_{i-1})-g(OPT_i)\le (g(X_i)-g(X_{i-1}))+(g(Y_i)-g(Y_{i-1})), \end{aligned}
(6.2)

while for RandomizedDG, it can similarly be shown that:

\begin{aligned} {\mathbb {E}}[g(OPT_{i-1})-g(OPT_i)]\le \frac{1}{2}{\mathbb {E}}\left[ (g(X_i)-g(X_{i-1}))+(g(Y_i)-g(Y_{i-1})) \right] . \end{aligned}
(6.3)

Summing Eq. (6.2) from $$i=1$$ to $$i=n$$ gives

\begin{aligned} g(OPT)-g(X_n)&=g(OPT_0)-g(OPT_n)\nonumber \\&\le g(X_n)-g(X_0)+g(Y_n)-g(Y_0)\nonumber \\&\le 2g(X_n)-g({\mathcal {N}})\nonumber \\ \implies g(X_n)&\ge \frac{g(OPT)+g({\mathcal {N}})}{3}, \end{aligned}
(6.4)

whereas summing Eq. (6.3) from $$i=1$$ to $$i=n$$ gives

\begin{aligned} {\mathbb {E}}\left[ g(OPT)-g(X_n) \right]&={\mathbb {E}}\left[ g(OPT_0)-g(OPT_n) \right] \nonumber \\&\le {\mathbb {E}}\left[ \frac{1}{2}\left( 2g(X_n)-g({\mathcal {N}}) \right) \right] \nonumber \\ \implies {\mathbb {E}}\left[ g(X_n) \right]&\ge \frac{2g(OPT)+g({\mathcal {N}})}{4}. \end{aligned}
(6.5)

Equations (6.4) and (6.5) imply that if we substitute $$f+\ell$$ in place of g, DeterministicDG and RandomizedDG provide (1/3, 2/3)- and (1/2, 3/4)-approximations for RegularizedUSM, respectively, because $$\ell (OPT)\le \ell ({\mathcal {N}})$$. Showing improved $$(\alpha ,\beta )$$-approximations for DeterministicDG (RandomizedDG) when $$\alpha <\frac{1}{3}$$ ($$\alpha <\frac{1}{2}$$) is just a matter of modifying Eq. (6.2) (Eq. (6.3)).
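To make the guarantee of Eq. (6.4) concrete, here is a minimal DeterministicDG on a toy RegularizedUSM instance (the graph, weights, and $$\ell$$ are our own; OPT is found by brute force):

```python
from itertools import combinations

# toy instance (ours): f = weighted undirected cut, ell >= 0 modular
V = [0, 1, 2, 3]
w = {(0, 1): 2.0, (0, 2): 1.5, (1, 3): 1.0, (2, 3): 2.5}
ell = [0.6, 0.1, 0.4, 0.2]

def f(S):
    return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))

def g(S):  # g = f + ell is non-negative submodular
    return f(S) + sum(ell[u] for u in S)

def deterministic_dg(g, ground):
    X, Y = set(), set(ground)
    for u in ground:
        a = g(X | {u}) - g(X)        # g(u | X_{i-1})
        b = g(Y - {u}) - g(Y)        # equals -g(u | Y_{i-1} \ {u})
        if a >= b:                   # first event of DeterministicDG
            X = X | {u}
        else:
            Y = Y - {u}
    return X

S_alg = deterministic_dg(g, V)
opt = max(g(set(S)) for k in range(len(V) + 1) for S in combinations(V, k))
lower = (opt + g(set(V))) / 3        # guarantee from Eq. (6.4)
```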

### Theorem 6.1

For RegularizedUSM with non-negative $$\ell$$ and any $$r\ge 1$$, there exists a variant of DeterministicDG that simultaneously achieves (0, 1) and $$\left( \frac{1}{r+1+r^{-1}},\frac{r+1}{r+1+r^{-1}} \right)$$-approximations (and consequently, $$(\alpha ,\beta )$$-approximations for all $$(\alpha ,\beta )$$ on the segment connecting these two points as well). For $$r=1$$, the variant is actually just the original DeterministicDG.

### Proof

Modify step 2 of DeterministicDG so that the first event occurs if $$g(u_i|X_{i-1})\ge -rg(u_i|Y_{i-1}\backslash \{u_i\})$$. We claim that the following modified version of Eq. (6.2) now holds:

\begin{aligned} g(OPT_{i-1})-g(OPT_i)\le r^{-1}(g(X_i)-g(X_{i-1}))+r(g(Y_i)-g(Y_{i-1})). \end{aligned}
(6.6)

First we show that Eq. (6.6) implies an $$\left( \frac{1}{r+1+r^{-1}},\frac{r+1}{r+1+r^{-1}} \right)$$-approximation. Summing it from $$i=1$$ to $$i=n$$ gives:

\begin{aligned} g(OPT)-g(X_n)&= g(OPT_0)-g(OPT_n) \nonumber \\&\le r^{-1}(g(X_n)-g(X_0))+r(g(Y_n)-g(Y_0))\nonumber \\&\le \left( r^{-1}+r \right) g(X_n)-rg({\mathcal {N}}) \end{aligned}
(6.7)
\begin{aligned} \implies g(X_n)&\ge \frac{g(OPT)}{r+1+r^{-1}}+\frac{r}{r+1+r^{-1}}g({\mathcal {N}})\nonumber \\&\ge \frac{f(OPT)}{r+1+r^{-1}}+\frac{r+1}{r+1+r^{-1}}\cdot \ell (OPT), \end{aligned}
(6.8)

as desired. Now we show Eq. (6.6). First, we consider the case $$g(u_i|X_{i-1})\ge -rg(u_i|Y_{i-1}\backslash \{u_i\})$$. This assumption implies that $$Y_i=Y_{i-1}$$, so the last part of Eq. (6.6) drops out.

1. If $$u_i\in OPT_{i-1}$$, then $$OPT_i=OPT_{i-1}$$, and Eq. (6.6) reduces to $$0 \le g(u_i|X_{i-1})$$, which holds by combining Eq. (6.1) with the assumption.

2. If $$u_i \not \in OPT_{i-1}$$, then $$OPT_{i}=OPT_{i-1}\cup \{u_i\}$$, and Eq. (6.6) reduces to

\begin{aligned} -g(u_i|OPT_{i-1})\le r^{-1}g(u_i|X_{i-1}). \end{aligned}
(6.9)

Since $$OPT_{i-1}\subseteq Y_{i-1}\backslash \{u_i\}$$, the LHS of this inequality is at most $$-g(u_i|Y_{i-1}\backslash \{u_i\})$$ by submodularity. On the other hand, the RHS of this inequality is at least $$-g(u_i|Y_{i-1}\backslash \{u_i\})$$ by assumption.

On the other hand, if $$g(u_i|X_{i-1})< -rg(u_i|Y_{i-1}\backslash \{u_i\})$$, then $$X_i=X_{i-1}$$, and the first part of the RHS of Eq. (6.6) drops out.

1. If $$u_i\not \in OPT_{i-1}$$, then $$OPT_i=OPT_{i-1}$$, and Eq. (6.6) reduces to $$0 \le -g(u_i|Y_{i-1}\backslash \{u_i\})$$, which holds by combining Eq. (6.1) with the assumption.

2. If $$u_i\in OPT_{i-1}$$, then $$OPT_{i}=OPT_{i-1}\backslash \{u_i\}$$, and Eq. (6.6) reduces to

\begin{aligned} g(u_i|OPT_{i})\le -rg(u_i|Y_{i-1}\backslash \{u_i\}). \end{aligned}
(6.10)

Since $$X_i\subseteq OPT_{i}$$, the LHS of this inequality is at most $$g(u_i|X_i)$$ by submodularity. On the other hand, the RHS of this inequality is greater than $$g(u_i|X_{i-1})$$ by assumption.

It remains to show that this algorithm simultaneously achieves a (0, 1)-approximation. Because $$g(u_i|X_{i-1})-g(u_i | Y_{i-1}\backslash \{u_i\})\ge 0$$, we have $$\max \left( g(u_i|X_{i-1}), -rg(u_i | Y_{i-1}\backslash \{u_i\}) \right) \ge 0$$. Next,

• If $$g(u_i|X_{i-1})\ge -rg(u_i | Y_{i-1}\backslash \{u_i\})$$, then $$g(u_i|X_{i-1})\ge 0$$, $$Y_i=Y_{i-1}$$, and

\begin{aligned} g(X_i)=g(u_i|X_{i-1})+g(X_{i-1})\ge g(X_{i-1}). \end{aligned}
(6.11)
• Otherwise, if $$g(u_i|X_{i-1})< -rg(u_i | Y_{i-1}\backslash \{u_i\})$$, then $$g(u_i | Y_{i-1}\backslash \{u_i\})\le 0$$, $$X_i=X_{i-1}$$, and

\begin{aligned} g(Y_i)=g(Y_{i-1})-g(u_i | Y_{i-1}\backslash \{u_i\})\ge g(Y_{i-1}). \end{aligned}
(6.12)

Thus, the values of both $$g(X_i)$$ and $$g(Y_i)$$ are non-decreasing over the course of the algorithm, so:

\begin{aligned} g(X_n)=g(Y_n)\ge g(Y_{n-1})\ge \cdots \ge g(Y_0)\ge \ell ({\mathcal {N}})\ge \ell (OPT). \end{aligned}
(6.13)

$$\square$$
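A sketch of this variant on a toy instance of our own (here with $$r=2$$), checking both the Eq. (6.8) guarantee and the (0, 1) guarantee against a brute-forced optimum:

```python
from itertools import combinations

# toy instance (ours): f = weighted undirected cut, ell >= 0 modular
V = [0, 1, 2, 3]
w = {(0, 1): 1.0, (0, 3): 2.0, (1, 2): 1.5, (2, 3): 0.5}
lvals = [0.3, 0.7, 0.2, 0.4]

def f(S):
    return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))

def ell(S):
    return sum(lvals[u] for u in S)

def g(S):
    return f(S) + ell(S)

def variant_dg(g, ground, r):
    X, Y = set(), set(ground)
    for u in ground:
        a = g(X | {u}) - g(X)            # g(u | X_{i-1})
        b = g(Y - {u}) - g(Y)            # -g(u | Y_{i-1} \ {u})
        if a >= r * b:                   # modified first-event condition
            X = X | {u}
        else:
            Y = Y - {u}
    return X

r = 2.0
S_alg = variant_dg(g, V, r)
S_opt = max((set(S) for k in range(len(V) + 1)
             for S in combinations(V, k)), key=g)
c = r + 1 + 1 / r
bound_main = f(S_opt) / c + (r + 1) / c * ell(S_opt)   # Eq. (6.8)
bound_01 = ell(S_opt)                                  # the (0, 1) guarantee
```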

The reasoning for RandomizedDG, which we show next, is very similar.

### Theorem 6.2

Running RandomizedDG on $$f+\ell$$ simultaneously achieves an $$\left( \frac{2}{r+2+r^{-1}},\frac{r+2}{r+2+r^{-1}} \right)$$-approximation for all $$r\ge 1$$ for RegularizedUSM with non-negative $$\ell$$.

### Proof

We claim that the following modified version of Eq. (6.3) holds for any $$r>0$$:

\begin{aligned} {\mathbb {E}}[g(OPT_{i-1})-g(OPT_i)]\le \frac{1}{2}{\mathbb {E}}\left[ r^{-1}(g(X_i)-g(X_{i-1}))+r(g(Y_i)-g(Y_{i-1})) \right] . \end{aligned}
(6.14)

As in the proof of Theorem 6.1, it is easy to check that Eq. (6.14) implies the conclusion:

\begin{aligned} {\mathbb {E}}\left[ g(OPT)-g(X_n) \right]&={\mathbb {E}}\left[ g(OPT_0)-g(OPT_n) \right] \nonumber \\&\le {\mathbb {E}}\left[ \frac{1}{2}\left( (r^{-1}+r)g(X_n)-rg({\mathcal {N}}) \right) \right] \nonumber \\ \implies {\mathbb {E}}\left[ g(X_n) \right]&\ge \frac{2}{r+2+r^{-1}}\cdot g(OPT)+\frac{r}{r+2+r^{-1}}\cdot g({\mathcal {N}}) \end{aligned}
(6.15)
\begin{aligned}&\ge \frac{2}{r+2+r^{-1}}\cdot f(OPT)+\frac{r+2}{r+2+r^{-1}}\cdot \ell (OPT). \end{aligned}
(6.16)

It remains to show Eq. (6.14). We note that in the edge case $$a_i=b_i=0$$, Eq. (6.1) implies that $$g(u_i|X_{i-1})=g(u_i|Y_{i-1}\backslash \{u_i\})=0$$, so the inequality reduces to $$0\le 0$$. Otherwise, recall that the original proof of double greedy upper bounded the LHS of Eq. (6.14) by

\begin{aligned} {\mathbb {E}}[g(OPT_{i-1})-g(OPT_i)]\le \frac{a_ib_i}{a_i+b_i}. \end{aligned}
(6.17)

On the other hand, we can lower bound twice the RHS by

\begin{aligned}&{\mathbb {E}}[r^{-1}(g(X_i)-g(X_{i-1}))+r(g(Y_i)-g(Y_{i-1}))]\nonumber \\&\quad = r^{-1}\cdot \frac{a_i}{a_i+b_i}(g(X_{i-1}\cup \{u_i\})-g(X_{i-1}))\nonumber \\&\qquad +r\cdot \frac{b_i}{a_i+b_i}(g(Y_{i-1}\backslash \{u_i\})-g(Y_{i-1}))\nonumber \\&\quad =\frac{r^{-1}a_i^2}{a_i+b_i}+\frac{rb_i^2}{a_i+b_i} \ge \frac{2a_ib_i}{a_i+b_i}, \end{aligned}
(6.18)

where the last step follows from the AM-GM inequality as in the original proof. Equation (6.14) follows. $$\square$$
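The final AM-GM step, $$r^{-1}a_i^2+rb_i^2\ge 2a_ib_i$$, can be spot-checked directly:

```python
import random

random.seed(2)
checked = True
for _ in range(10_000):
    a, b = random.uniform(0, 5), random.uniform(0, 5)
    r = random.uniform(1, 10)
    # r^{-1} a^2 + r b^2 - 2ab = (a / sqrt(r) - sqrt(r) * b)^2 >= 0
    checked = checked and (a * a / r + r * b * b >= 2 * a * b - 1e-9)
```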

Next, we prove that DeterministicDG and RandomizedDG do no better than the bounds we just showed. Recall that [24, Theorem 1.4] proved that the original DeterministicDG is an $$(\alpha ,\beta )$$-approximation algorithm whenever $$\alpha \le \frac{1}{3}$$ and $$\alpha +\beta \le 1$$. To show that this analysis is tight, it suffices to check that whenever $$\alpha >\frac{1}{3}$$ or $$\alpha +\beta >1$$, there are instances where DeterministicDG does not achieve the desired approximation factor. The inequality $$\alpha >\frac{1}{3}$$ holds by [10, Theorem II.3], while $$\alpha +\beta >1$$ holds by applying the following proposition with $$r=1$$:

### Proposition 6.3

For any $$r\ge 1$$ and $$\epsilon >0$$, there are instances of RegularizedUSM with non-negative $$\ell$$ where the variant of DeterministicDG described in the proof of Theorem 6.1 does not achieve an $$(\alpha ,\beta )$$-approximation for any $$(\alpha ,\beta )$$ above the line connecting (0, 1) and $$\left( \frac{1}{r+1+r^{-1}}, \frac{r+1}{r+1+r^{-1}} \right)$$.

### Proof

The points $$(\alpha ,\beta )$$ lying above the line connecting (0, 1) and $$\left( \frac{1}{r+1+r^{-1}}, \frac{r+1}{r+1+r^{-1}} \right)$$ are precisely those that satisfy $$\alpha +\beta r=r+\epsilon$$ for some $$\epsilon >0$$. Define f(S) to be the sum of two weighted cut functions:

\begin{aligned}{} & {} {\mathcal {N}}\triangleq \{u_1,u_2\} \end{aligned}
(6.19)
\begin{aligned}{} & {} f(S)\triangleq (r+\epsilon /2) \cdot [u_1\in S\text { and }u_2\not \in S]+1\cdot [u_2\in S\text { and }u_1\not \in S] \end{aligned}
(6.20)
\begin{aligned}{} & {} \ell (u_1)=0, \ell (u_2)=r \end{aligned}
(6.21)

The weights of the directed edges are chosen such that if the variant of DeterministicDG considers $$u_1$$ before $$u_2$$, it will compute

\begin{aligned} g(u_1|X_0)=r+\epsilon /2>-rg(u_1|Y_0\backslash \{u_1\})=r, \end{aligned}
(6.22)

so it will return a set T satisfying $$u_1\in T$$, implying that $$f(T)+\ell (T)\le r+\epsilon /2$$ regardless of whether $$u_2\in T$$ or not. If we define $$OPT\triangleq \{u_2\}$$, then $$f(OPT)=1$$ and $$\ell (OPT)=r$$, so we get

\begin{aligned} f(T)+\ell (T)\le r+\epsilon /2 <r+\epsilon = \alpha +\beta r= \alpha f(OPT) + \beta \ell (OPT), \end{aligned}
(6.23)

implying that an $$(\alpha ,\beta )$$-approximation is not achieved. $$\square$$
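To make the construction concrete, here is a small sketch that runs the biased deterministic double greedy on this two-element instance. The selection rule (keep $$u$$ when $$g(u|X)>-r\cdot g(u|Y\backslash \{u\})$$) is an assumption inferred from Eq. (6.22), since the variant itself is defined in the proof of Theorem 6.1.

```python
def variant_dg(order, g, r):
    """Deterministic double greedy biased by r; selection rule assumed from
    Eq. (6.22): keep u iff g(u|X) > -r * g(u|Y \ {u})."""
    X, Y = set(), set(order)
    for u in order:
        a = g(X | {u}) - g(X)      # a_i = g(u_i | X_{i-1})
        b = g(Y - {u}) - g(Y)      # b_i = -g(u_i | Y_{i-1} \ {u_i})
        if a > r * b:
            X = X | {u}            # include u_i in both sets
        else:
            Y = Y - {u}            # exclude u_i from both sets
    return X                       # X == Y after the last iteration

r, eps = 2.0, 0.1
f = lambda S: (r + eps / 2) * ('u1' in S and 'u2' not in S) \
    + 1.0 * ('u2' in S and 'u1' not in S)
ell = {'u1': 0.0, 'u2': r}
g = lambda S: f(S) + sum(ell[u] for u in S)

T = variant_dg(['u1', 'u2'], g, r)
value = f(T) + sum(ell[u] for u in T)
# OPT = {u2}: f(OPT) = 1, ell(OPT) = r, and alpha + beta*r = r + eps on the line
assert 'u1' in T and value < r + eps
```

As predicted, the algorithm commits to $$u_1$$ and ends with value $$r+\epsilon /2$$, strictly below the target $$r+\epsilon$$.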

Next, we generalize the construction of Proposition 6.3 to show that Theorem 6.2 is tight for RandomizedDG.

### Proposition 6.4

For any $$r\ge 1$$ and $$\epsilon >0$$, there are instances of RegularizedUSM with non-negative $$\ell$$ where RandomizedDG does not provide an $$\left( \alpha ,\beta \right) =\left( \frac{2}{r+2+r^{-1}}+\epsilon ,\frac{r+2}{r+2+r^{-1}} \right)$$-approximation.

### Proof

Define f(S) to be the sum of $$2(n-1)$$ weighted directed cut functions:

\begin{aligned} f(S)=\frac{1}{n-1}\left[ \sum _{i=1}^{n-1}\left( r[u_i\in S\wedge u_n\not \in S]+[u_n\in S\wedge u_i\not \in S] \right) \right] \end{aligned}
(6.24)

and $$\ell (u_1)=\ell (u_2)=\cdots =\ell (u_{n-1})=0, \ell (u_n)=r-1$$. For each $$i\in [1,n-1]$$, RandomizedDG will compute $$a_i=\frac{r}{n-1}$$ and $$b_i=\frac{1}{n-1}$$, so it will include each of $$u_1,\ldots ,u_{n-1}$$ in its returned set $$X_n$$ independently with probability $$\frac{r}{r+1}$$. Thus, for any $$\epsilon >0$$, the following inequality holds by a Chernoff bound for sufficiently large n:

\begin{aligned} \Pr \left[ \left| \frac{|X_n\cap \{u_1,\ldots ,u_{n-1}\}|}{n-1}-\frac{r}{r+1} \right| \ge \frac{\epsilon }{2r} \right] =o(1). \end{aligned}
(6.25)

Assuming $$\left| \frac{|X_n\cap \{u_1,\ldots ,u_{n-1}\}|}{n-1}-\frac{r}{r+1} \right| < \frac{\epsilon }{2r}$$ holds, it follows that

\begin{aligned} f(X_n)+\ell (X_n)<r\cdot \left( \frac{r}{r+1}+\frac{\epsilon }{2r} \right) =\frac{r^2}{r+1}+\frac{\epsilon }{2} \end{aligned}
(6.26)

regardless of whether $$u_n$$ is included in $$X_n$$ or not. On the other hand, if we define $$OPT\triangleq \{u_n\}$$, then

\begin{aligned} \alpha f(OPT)+\beta \ell (OPT)&= \alpha +(r-1)\beta \nonumber \\&=\frac{2}{r+2+r^{-1}}+\epsilon +\frac{(r-1)(r+2)}{r+2+r^{-1}}\nonumber \\&=\frac{r^2+r}{r+2+r^{-1}}+\epsilon \nonumber \\&=\frac{r^2}{r+1}+\epsilon . \end{aligned}
(6.27)

As $$f(X_n)+\ell (X_n)<\alpha f(OPT)+\beta \ell (OPT)-\frac{\epsilon }{2}$$ with high probability and $$f(X_n)$$ is bounded above by a constant independent of n, $${\mathbb {E}}[f(X_n) + \ell (X_n)]<\alpha f(OPT)+\beta \ell (OPT)$$ for sufficiently large n, implying that RandomizedDG does not provide an $$(\alpha ,\beta )$$-approximation for this instance. $$\square$$
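As a side computation (not part of the proof, which needs the high-probability statement), the conditional expectation of $$f(X_n)+\ell (X_n)$$ can be evaluated exactly: conditioned on either status of $$u_n$$, it equals $$\frac{r^2}{r+1}$$, since f is linear in the $$u_i$$ indicators once the status of $$u_n$$ is fixed. A quick exact check for integer $$r$$:

```python
from fractions import Fraction

def conditional_value(r, include_un):
    """E[f(X_n) + ell(X_n)] conditioned on whether u_n is in X_n, using that
    each u_i (i < n) is kept independently with probability r/(r+1)."""
    p = Fraction(r, r + 1)             # inclusion probability of each u_i
    if include_un:
        # f counts (in expectation) the fraction of u_i excluded; ell(u_n)=r-1
        return (1 - p) * 1 + (r - 1)
    # f counts r times the expected fraction of u_i included; ell contributes 0
    return p * r

for r in [1, 2, 3, 5]:
    assert conditional_value(r, True) == conditional_value(r, False) \
        == Fraction(r * r, r + 1)
```

Both cases agree with the threshold $$\frac{r^2}{r+1}$$ appearing in Eqs. (6.26) and (6.27).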

Unfortunately, neither version of double greedy achieves any $$(\alpha ,\beta )$$-approximation when $$\ell$$ is non-positive rather than non-negative. We defer further discussion to Sect. A.2.

In this subsection, we prove Theorem 6.5, which improves upon Theorem 6.2 for $$\beta$$ close to one. The results of this subsection and the next are summarized in Fig. 3.

### Theorem 6.5

An $$(\alpha (\beta ),\beta )$$-approximation algorithm for RegularizedUSM with non-negative $$\ell$$ exists for any $$(\alpha (\beta ),\beta )$$ in Table 3. In particular, the $$\alpha (\beta )$$ obtained for $$\beta \ge 0.85$$ is superior to that of Theorem 6.2, and $$\alpha (1)>0.385$$, matching the approximation factor of Theorem 5.1.

First, we show that the result for $$\beta =1$$ easily follows from Lemma 5.2.

### Lemma 6.6

For RegularizedUSM with non-negative $$\ell$$, there is a (0.385, 1)-approximation algorithm.

### Proof

Define $$g(S)\triangleq f({\mathcal {N}}\backslash S)$$, which is also non-negative submodular. Then apply Lemma 5.2 on $$(g,-\ell )$$ to find $$T\subseteq {\mathcal {N}}$$ such that

\begin{aligned} {\mathbb {E}}[g(T)-\ell (T)]&\ge \max _S[0.385 g(S)-\ell (S)]\nonumber \\&=\max _S[0.385 f(S)-\ell ({\mathcal {N}}\backslash S)]\nonumber \\&=\max _S[0.385 f(S)+\ell (S)]-\ell ({\mathcal {N}}). \end{aligned}
(6.28)

Setting $$T'={\mathcal {N}}\backslash T$$, we have

\begin{aligned} {\mathbb {E}}[f(T')+\ell (T')-\ell ({\mathcal {N}})]&={\mathbb {E}}[f(T')-\ell ({\mathcal {N}}\backslash T')]\nonumber \\&={\mathbb {E}}[g(T)-\ell (T)]\nonumber \\&\ge \max _S[0.385 f(S)+\ell (S)]-\ell ({\mathcal {N}}). \end{aligned}
(6.29)

Adding $$\ell ({\mathcal {N}})$$ to both sides, we conclude that

\begin{aligned} {\mathbb {E}}[f(T')+\ell (T')]\ge \max _S[0.385 f(S)+\ell (S)]. \end{aligned}
(6.30)

So an algorithm returning $$T'$$ would achieve a (0.385, 1)-approximation as desired. $$\square$$
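The complement identity driving this proof can be verified mechanically. The following sketch (with an arbitrary weighted directed cut function standing in for f and arbitrary non-negative weights for $$\ell$$; all values are made up for illustration) checks $$f(T')+\ell (T')-\ell ({\mathcal {N}})=g(T)-\ell (T)$$ for every $$T$$:

```python
import itertools

N = ['a', 'b', 'c', 'd']
arcs = [('a', 'b', 1.5), ('c', 'd', 2.0), ('b', 'c', 0.5)]   # arbitrary example
f = lambda S: sum(w for u, v, w in arcs if u in S and v not in S)
ell = {'a': 0.3, 'b': 0.7, 'c': 0.2, 'd': 0.9}               # non-negative ell
L = lambda S: sum(ell[u] for u in S)

g = lambda S: f({u for u in N if u not in S})                # g(S) = f(N \ S)

for k in range(len(N) + 1):
    for T in map(set, itertools.combinations(N, k)):
        Tp = {u for u in N if u not in T}                    # T' = N \ T
        # the identity behind Eqs. (6.28)-(6.30)
        assert abs((f(Tp) + L(Tp) - L(N)) - (g(T) - L(T))) < 1e-9
```

Note that g, as the reflection of a directed cut function, is again non-negative submodular, which is what allows Lemma 5.2 to be applied to it.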

For $$\beta$$ close to one, we can obtain better $$(\alpha ,\beta )$$-approximations than what Theorem 6.2 alone provides by combining double greedy with the following corollary of Lemma 6.6:

### Corollary 6.7

An $$(\alpha ,\beta )$$-approximation algorithm for RegularizedUSM for the case of $$\ell$$ non-positive may be used to return a set $$T\subseteq {\mathcal {N}}$$ such that

\begin{aligned} {\mathbb {E}}[f(T)+\ell (T)]\ge \alpha f(OPT)+\beta \ell (OPT)+(1-\beta )\ell ({\mathcal {N}}) \end{aligned}
(6.31)

for the case of $$\ell$$ non-negative.

### Proof

The proof is very similar to the above; the RHSes of Eqs. (6.28) and (6.29) become $$\alpha f(OPT)+\beta \ell (OPT)-\beta \ell ({\mathcal {N}})$$. $$\square$$

Now we can prove Theorem 6.5 by combining Corollary 6.7 with Theorem 6.2.

### Proof of Theorem 6.5

Our algorithm returns the best of the solutions returned by the following two algorithms:

1. Randomized double greedy on $$f+\ell$$, whose guarantee is given by Eq. (6.15).

2. Corollary 6.7 using Theorem 5.1 for $$(\alpha (\beta ),\beta )\in {\mathcal {T}}\triangleq \{(\alpha (1+0.01x),1+0.01x)\mid x\in \{0,1,2,\ldots ,30\}\}$$.

As with Theorem 5.1, for a fixed $$\beta$$ we can lower bound the $$\alpha (\beta )$$ guaranteed by the algorithm above by the solution to the following linear program after choosing the set $${\mathcal {R}}$$ appropriately:

(6.32)

Let $$\textbf{OPT}'$$ denote the expected value of the returned solution. Any point $$(x_1,x_2,x_3)$$ within the convex hull satisfies the following inequality:

\begin{aligned} \textbf{OPT}'\ge x_1 f(OPT) + x_2 \ell (OPT)+x_3\ell ({\mathcal {N}}). \end{aligned}
(6.33)

The conditions $$x_2+x_3\ge \beta , x_3\ge 0$$ ensure that $$\textbf{OPT}'\ge x_1 f(OPT)+\beta \ell (OPT)$$. $$\square$$
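Since the linear program of Eq. (6.32) is not reproduced above, the following is only an illustrative sketch of the kind of computation it performs: sweep convex combinations of guarantee vectors in $$(f(OPT),\ell (OPT),\ell ({\mathcal {N}}))$$ coordinates subject to $$x_2+x_3\ge \beta$$ and $$x_3\ge 0$$, maximizing $$x_1$$. The two vectors used here are assumptions made for illustration: reading Eq. (6.15) with $$g=f+\ell$$, $$\ell \ge 0$$, and $$f({\mathcal {N}})\ge 0$$ gives the double greedy vector $$(2/D,2/D,r/D)$$ with $$D=r+2+r^{-1}$$, and Lemma 6.6 gives $$(0.385,1,0)$$; the grids are coarse.

```python
def best_alpha(beta, steps=400):
    """Maximize x1 over convex combinations of two guarantee vectors subject to
    x2 + x3 >= beta and x3 >= 0 (a coarse stand-in for the LP of Eq. (6.32))."""
    lemma66 = (0.385, 1.0, 0.0)            # Lemma 6.6
    best = 0.0
    for ri in range(60):                   # sweep the double greedy bias r >= 1
        r = 1 + 0.1 * ri
        D = r + 2 + 1 / r
        dg = (2 / D, 2 / D, r / D)         # read off from Eq. (6.15)
        for i in range(steps + 1):
            lam = i / steps
            x = [lam * a + (1 - lam) * b for a, b in zip(dg, lemma66)]
            if x[1] + x[2] >= beta - 1e-12 and x[2] >= -1e-12:
                best = max(best, x[0])
    return best

# At beta = 1 only the Lemma 6.6 vector is feasible; below 1, double greedy helps.
assert abs(best_alpha(1.0) - 0.385) < 1e-9
assert best_alpha(0.9) > 0.42
```

Even this toy version already exceeds 0.385 for $$\beta =0.9$$, consistent with Theorem 6.5's claim that the combination is superior near $$\beta =1$$.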

### 6.3 Inapproximability

In this subsection, we prove Theorems 6.8 and 6.9.

### Theorem 6.8

For some $$\epsilon >0$$, there are instances of RegularizedUSM with non-negative $$\ell$$ such that $$(0.478,1-\epsilon )$$ is inapproximable.

Note that this is much stronger than the $$(0.4998+\epsilon ,1)$$-inapproximability provided by [24, Lemma 6.3].

### Proof

We start by showing (0.478, 1)-inapproximability, which is easier. First, we claim that any $$(\alpha ,1)$$-approximation algorithm for the RegularizedUSM instance $$(f({\mathcal {N}}\backslash S),-\ell (S))$$ immediately implies an $$(\alpha ,1)$$-approximation algorithm for $$(f(S),\ell (S))$$. Letting $${\mathcal {N}}\backslash T$$ be the set returned by the former approximation algorithm, we find

\begin{aligned}{} & {} {\mathbb {E}}[f({\mathcal {N}}\backslash ({\mathcal {N}}\backslash T))-\ell ({\mathcal {N}}\backslash T)]&\ge \alpha f({\mathcal {N}}\backslash OPT')-\ell (OPT')\nonumber \\{} & {} \quad \implies {\mathbb {E}}\left[ f(T)-\ell ({\mathcal {N}})+\ell (T) \right]&\ge \alpha f({\mathcal {N}}\backslash OPT')-\ell (OPT') \end{aligned}
(6.34)

Substituting $$OPT'={\mathcal {N}}\backslash OPT$$ gives

\begin{aligned} {\mathbb {E}}\left[ f(T)-\ell ({\mathcal {N}})+\ell (T) \right]&\ge \alpha f(OPT)-\ell ({\mathcal {N}})+\ell (OPT)\nonumber \\ \implies {\mathbb {E}}[f(T)+\ell (T)]&\ge \alpha f(OPT)+\ell (OPT). \end{aligned}
(6.35)

Note that when $$\ell$$ is set to be non-negative, this means that any $$(\alpha ,1)$$-approximation algorithm for $$\ell$$ non-positive implies an $$(\alpha ,1)$$-approximation algorithm for $$\ell$$ non-negative. Similarly, by setting $$\ell$$ to be non-positive, we get the implication in the opposite direction. This means that $$(\alpha ,1)$$-inapproximability results for one sign of $$\ell$$ can be converted to corresponding inapproximability results for the other sign of $$\ell$$. Thus, the (0.478, 1)-inapproximability result for non-positive $$\ell$$ implies the same inapproximability result for non-negative $$\ell$$.

The slightly stronger result of $$(0.478,1-\epsilon )$$ inapproximability for some $$\epsilon >0$$ follows from modifying the symmetry gap construction of Theorem 5.6. Let $$(f_-,\ell _-)$$ be the f and $$\ell$$ defined in the proof of Theorem 5.6 for $$\beta =1$$. Then let

\begin{aligned} f(S)\triangleq f_{-}({\mathcal {N}}\backslash S), \ell (S)\triangleq -\ell _-(S). \end{aligned}
(6.36)

For k sufficiently large, this instance shows $$(\alpha ,1)$$-inapproximability for some $$\alpha <0.478$$. Furthermore, if we fix k to be constant, then the desired result follows; specifically, we can choose some $$\epsilon >0$$ such that

\begin{aligned} \alpha f(OPT) + \ell (OPT) = 0.478 f(OPT)+(1-\epsilon )\ell (OPT), \end{aligned}
(6.37)

showing $$(0.478,1-\epsilon )$$-inapproximability as desired. $$\square$$

Next, we provide an inapproximability result for $$\alpha =0.5$$ by fixing $$k=2$$ in the construction for Theorem 6.8.

### Theorem 6.9

For any $$\epsilon >0$$, there are instances of RegularizedUSM with non-negative $$\ell$$ such that $$\left( 0.5,2\sqrt{2}/3+\epsilon \right)$$ is inapproximable, where $$2\sqrt{2}/3\approx 0.943$$.

### Proof

Again, let $$(f_-,\ell _-)$$ be the f and $$\ell$$ defined in the proof of Theorem 5.6 with $$\ell _q=0$$. Define

\begin{aligned} f(S)&\triangleq f_-({\mathcal {N}}\backslash S), \end{aligned}
(6.38)
\begin{aligned} p&\triangleq k-\frac{\sum _{i=1}^k({\textbf{x}}_{a_i}+{\textbf{x}}_{b_i})}{2}\in [0,k], \end{aligned}
(6.39)
\begin{aligned} q&\triangleq 1-\frac{{\textbf{x}}_a+{\textbf{x}}_b}{2}\in [0,1], \end{aligned}
(6.40)
\begin{aligned} \ell (S)&\triangleq -\ell _-(S)=\ell _p(2k-2p) \end{aligned}
(6.41)

where we may choose any real number $$\ell _p>0$$. Applying Lemma 5.8, we find that the LHS is given by

\begin{aligned}&\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}[(F+L)({\overline{{\textbf{x}}}})]\nonumber \\&\quad =\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}[F_-({\textbf{1}}-{\overline{{\textbf{x}}}})+L({\overline{{\textbf{x}}}})]\nonumber \\&\quad =\max _{0\le p\le k, 0\le q\le 1}\left[ \left( (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-\left( 1-p/k \right) ^k) \right) \right. \nonumber \\&\qquad \left. +\left( -2p\ell _p+2k\ell _p \right) \right] \nonumber \\&\quad =\max _{0\le p\le k, 0\le q\le 1}\left[ (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-\left( 1-p/k \right) ^k)-2p\ell _p \right] \nonumber \\&\qquad +2k\ell _p, \end{aligned}
(6.42)

while the RHS is bounded below by

\begin{aligned} (\alpha f+\beta \ell )({\mathcal {N}}\backslash \{a,b_1\})=\alpha +\beta \left[ (2k-1)\ell _p \right] . \end{aligned}
(6.43)

Now fix $$k=2$$ and $$\alpha =0.5$$, and define

\begin{aligned} g(p)\triangleq \max _{0\le q\le 1}\left[ (1-\kappa )2q(1-q)+\kappa 2(1-q)(1-\left( 1-p/k \right) ^k) \right] . \end{aligned}
(6.44)

Then the minimum $$\beta ^*$$ such that we can show $$(\alpha ,\beta ^*+\epsilon )$$-inapproximability using this technique is given by

\begin{aligned}{} & {} \max _{0\le p\le k}\left[ g(p)-2p\ell _p \right] +2k\ell _p=0.5+\beta ^*[(2k-1)\ell _p] \end{aligned}
(6.45)
\begin{aligned}{} & {} \implies \beta ^*=\min _{0\le \kappa \le 1, 0<\ell _p}\left[ \frac{\max _{0\le p\le k}\left[ g(p)-2p\ell _p \right] +2k\ell _p-0.5}{(2k-1)\ell _p} \right] . \end{aligned}
(6.46)

Choose any $$p^*\in (0,2-\sqrt{2})$$, which automatically guarantees $$1-(1-p^*/k)^k<\frac{1}{2}$$. For any such $$p^*$$, we claim that there exist $$\kappa$$ and $$\ell _p$$ such that

\begin{aligned} \max _{0\le p\le k}\left[ g(p)-2p\ell _p \right] <0.5-2p^*\ell _p. \end{aligned}
(6.47)

The reason why Eq. (6.47) holds is that, for sufficiently small $$\kappa >0$$, $$g(p^*)<0.5$$, g(p) is increasing with respect to p, and g(p) is concave down with respect to p. Thus, we can always choose $$\ell _p>0$$ so that $$\text {argmax}_{0\le p\le k}[g(p)-2p\ell _p]=p^*$$. From Eq. (6.47) we can finish as follows:

\begin{aligned} \beta ^*\le \frac{2k\ell _p-2p^*\ell _p}{(2k-1)\ell _p}=\frac{4-2p^*}{3}. \end{aligned}
(6.48)

Taking the limit as $$p^*\rightarrow (2-\sqrt{2})^-$$ shows the inapproximability of $$\beta ^*=\frac{4-2(2-\sqrt{2})}{3}+\epsilon =\frac{2\sqrt{2}}{3}+\epsilon$$, as desired. $$\square$$
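The choices in this proof can be sanity-checked numerically. Below, the inner maximum over q in Eq. (6.44) is evaluated in closed form (the maximand is a concave quadratic in q); the specific values $$p^*=0.5$$ and $$\kappa =0.05$$ are arbitrary choices satisfying the constraints of the proof, not values taken from the paper.

```python
import math

def g_of_p(p, kappa, k=2):
    """g(p) from Eq. (6.44), maximizing the concave quadratic in q exactly."""
    h = 1 - (1 - p / k) ** k                     # h(p) = 1 - (1 - p/k)^k
    q = 0.5 - kappa * h / (2 * (1 - kappa))      # vertex of the quadratic in q
    return 2 * (1 - q) * ((1 - kappa) * q + kappa * h)

p_star, kappa = 0.5, 0.05
assert 1 - (1 - p_star / 2) ** 2 < 0.5           # p* < 2 - sqrt(2)
assert g_of_p(p_star, kappa) < 0.5               # makes Eq. (6.47) satisfiable
vals = [g_of_p(0.01 * i, kappa) for i in range(201)]
assert all(x <= y + 1e-12 for x, y in zip(vals, vals[1:]))   # g increasing in p
# the bound of Eq. (6.48) tends to 2*sqrt(2)/3 as p* -> (2 - sqrt(2))^-
assert abs((4 - 2 * (2 - math.sqrt(2))) / 3 - 2 * math.sqrt(2) / 3) < 1e-12
```

For larger $$\kappa$$ the check $$g(p^*)<0.5$$ fails, which is why the proof insists on $$\kappa$$ sufficiently small.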

## 7 Non-negative $$\ell$$: RegularizedCSM

The results of this section are summarized in Fig. 4.

### 7.1 Approximation Algorithms

In this subsection, we prove Theorem 7.1.

### Theorem 7.1

For RegularizedCSM with non-negative $$\ell$$, there is an $$\left( \alpha (\beta )-\epsilon ,\beta -\epsilon \right)$$-approximation algorithm for all $$\beta \in [0,1]$$, where $$\alpha$$ is a decreasing concave function satisfying $$\alpha (0.385)>0.385$$, $$\alpha (0.6)>0.384, \alpha \left( 1-e^{-1} \right) =e^{-1}$$, and $$\alpha (1)=0$$.

Recall from the introduction that [25] introduced distorted measured continuous greedy and analyzed its guarantee for the case of non-positive $$\ell$$. Our improved results are based on generalizing the analysis to the case where $$\ell$$ contains both positive and negative components.

### Lemma 7.2

For unconstrained $$\ell$$ and any $$t_f\in [0,1]$$, there is a polynomial-time algorithm for RegularizedCSM that returns $$T\in {\mathcal {I}}$$ such that

\begin{aligned} {\mathbb {E}}[f(T)+\ell (T)]\ge & {} (t_f e^{-t_f}-o(1))f(OPT)+(1-e^{-t_f}-o(1)) \ell _+(OPT)\nonumber \\{} & {} +t_f\ell _-(OPT). \end{aligned}
(7.1)

When $$t_f>1$$, the algorithm provides the same approximation guarantee but is allowed to return any $$T\subseteq {\mathcal {N}}$$.

### Proof

It suffices to show that for any $$\epsilon >0$$, with high probability, Algorithm 1 from [25] generates $${\textbf{y}}(t_f)\in [0,1]^{{\mathcal {N}}}$$ such that $${\textbf{y}}(t_f) \in t_f\cdot {\mathcal {P}}$$ and

\begin{aligned} F({\textbf{y}}(t_f))+L({\textbf{y}}(t_f))&\ge t_fe^{-t_f}f(OPT)+(1-e^{-t_f})\ell _+(OPT)\nonumber \\&\quad +t_f\ell _-(OPT)-{\mathcal {O}}\left( \epsilon M\right) \end{aligned}
(7.2)

in $$\text {poly}(n,1/\epsilon )$$ time, where $$M\triangleq \max \{\max _{u\in {\mathcal {N}}}f(u|\emptyset ),-\min _{u\in {\mathcal {N}}}f(u|{\mathcal {N}}-u)\}>0$$. How to use $${\textbf{y}}(t_f)$$ to generate a set T satisfying the conditions in the statement of this lemma is standard and is deferred to the appendix.

First, we briefly review the measured continuous greedy algorithm introduced by Feldman et al. [13]. The idea is to continuously evolve a solution $${\textbf{y}}(t)$$ from time $$t=0$$ to time $$t=t_f$$ such that $${\textbf{y}}(t)\in \left( t\cdot {\mathcal {P}} \right) \cap \left( (1-e^{-t})\cdot [0,1]^{{\mathcal {N}}} \right)$$. At all times, $${\textbf{y}}'(t)={\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t))$$, where $${\textbf{z}}(t)\in {\mathcal {P}}$$. To transform this continuous process into an algorithm running in finite time, it is necessary to discretize time into timesteps of size $$\delta$$, where $$\delta$$ evenly divides $$t_f$$. Then $${\textbf{y}}(t+\delta )\triangleq {\textbf{y}}(t)+\delta {\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t))$$. How small $$\delta$$ needs to be to achieve the desired approximation factor is given by a polynomial in terms of n and $$\epsilon$$.
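The discretized evolution above can be sketched in a few lines. The following is a minimal toy, not Algorithm 1 of [25]: it assumes a tiny ground set, an exact multilinear extension computed by subset enumeration, a cardinality-one matroid polytope $$\{x:\sum x\le 1\}$$, and a simple greedy choice of $${\textbf{z}}(t)$$; all of these are choices made here for illustration.

```python
import itertools

N = [0, 1, 2]
arcs = [(0, 1, 1.0), (2, 1, 0.8)]             # a small directed cut function
f = lambda S: sum(w for u, v, w in arcs if u in S and v not in S)

def F(y):
    """Exact multilinear extension via subset enumeration (fine for tiny n)."""
    total = 0.0
    for k in range(len(N) + 1):
        for S in itertools.combinations(N, k):
            pr = 1.0
            for u in N:
                pr *= y[u] if u in S else 1.0 - y[u]
            total += pr * f(set(S))
    return total

def measured_continuous_greedy(t_f=1.0, delta=0.01):
    """y(t + delta) = y(t) + delta * z(t) o (1 - y(t)), with z(t) a vertex of
    the polytope {x : sum(x) <= 1} chosen greedily by residual gain."""
    y = [0.0] * len(N)
    for _ in range(round(t_f / delta)):
        grads = []                             # exact partial derivatives of F
        for u in N:
            hi, lo = y[:], y[:]
            hi[u], lo[u] = 1.0, 0.0
            grads.append(F(hi) - F(lo))
        gains = [(1.0 - y[u]) * grads[u] for u in N]
        u = max(N, key=lambda v: gains[v])
        if gains[u] > 0:                       # z(t) = e_u (else z(t) = 0)
            y[u] += delta * (1.0 - y[u])
    return y

y = measured_continuous_greedy()
assert sum(y) <= 1.0 + 1e-9                    # y(t_f) lies in t_f * P
assert all(yu <= 1 - (1 - 0.01) ** 100 + 1e-9 for yu in y)   # measured property
```

The two assertions mirror the feasibility claims in the text: $${\textbf{y}}(t_f)\in t_f\cdot {\mathcal {P}}$$, and each coordinate obeys $${\textbf{y}}_u(t)\le 1-(1-\delta )^{t/\delta }$$ as in [25, Lemma 3.1].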

Algorithm 1 of [25] combines measured continuous greedy with Feldman’s distorted objective [20]. Specifically, Algorithm 1 of [25] defines the objective at time t to be

\begin{aligned} \Phi (t)=(1-\delta )^{(t_f-t)/\delta }F({\textbf{y}}(t))+L({\textbf{y}}(t))\approx e^{t-t_f}F({\textbf{y}}(t))+L({\textbf{y}}(t)) \end{aligned}
(7.3)

and chooses $${\textbf{z}}(t)$$ so that with high probability,

\begin{aligned} \Phi (t+\delta )-\Phi (t)\ge \delta \left[ e^{-t_f}f(OPT)+\ell (OPT) \right] -\frac{\delta }{t_f} \cdot {\mathcal {O}}\left( \epsilon M\right) , \end{aligned}
(7.4)

assuming that $$\ell$$ is non-positive [25, Lemma 3.8]. Summing this inequality over all $$\frac{t_f}{\delta }$$ timesteps yields the desired result for non-positive $$\ell$$.

We claim that when the sign of $$\ell$$ is unconstrained, the following generalization of [25, Lemma 3.8] holds:

\begin{aligned} \Phi (t+\delta )-\Phi (t)\ge & {} \delta \left[ e^{-t_f}f(OPT)+(1-\delta )^{t/\delta }\ell _+(OPT)+\ell _-(OPT) \right] \nonumber \\{} & {} -\frac{\delta }{t_f} \cdot {\mathcal {O}}\left( \epsilon M\right) . \end{aligned}
(7.5)

To show this, the only part of the proof of [25, Lemma 3.8] that needs to change is the part where [25, Lemma 3.7] is invoked. Lemma 3.7 of [25] states that for non-positive $$\ell$$,

\begin{aligned} L({\textbf{y}}(t+\delta ))-L({\textbf{y}}(t))=\delta L({\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t)))\ge \delta \langle {\ell },{\textbf{z}}(t)\rangle . \end{aligned}
(7.6)

For unconstrained $$\ell$$, we obtain the following inequality instead:

\begin{aligned} L({\textbf{y}}(t+\delta ))-L({\textbf{y}}(t))&=\delta L({\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t))) \nonumber \\&=\delta \left( \langle \ell _+,{\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t))\rangle + \langle \ell _-,{\textbf{z}}(t)\circ ({\textbf{1}}_{{\mathcal {N}}}-{\textbf{y}}(t))\rangle \right) \nonumber \\&\ge \delta \left( \langle {\ell }_+,{\textbf{z}}(t)\rangle \cdot (1-\delta )^{t/\delta }+\langle {\ell }_-,{\textbf{z}}(t)\rangle \right) , \end{aligned}
(7.7)

where the last inequality follows from [25, Lemma 3.1], which states that $${\textbf{y}}_u(t)\le 1-(1-\delta )^{\frac{t}{\delta }}$$ for all $$u\in {\mathcal {N}}$$. It is easy to verify that Eq. (7.5) follows after substituting Eq. (7.7) in place of [25, Lemma 3.7] in the proof of [25, Lemma 3.8].

To go from Eq. (7.5) to the conclusion, we just need to check that $$\sum _{i=0}^{t_f/\delta -1}\delta (1-\delta )^i\ge 1-(1-\delta )^{t_f/\delta }\ge 1-e^{-t_f}.$$ Thus, Eq. (7.2) has been proven. $$\square$$

### Corollary 7.3

When $$\ell \ge 0$$, there is a $$\left( e^{-1}-\epsilon ,1-e^{-1} \right)$$-approximation algorithm for RegularizedCSM.

### Proof

The result follows immediately from substituting $$t_f=1$$ into Lemma 7.2. $$\square$$

Before proving Theorem 7.1, we will need two more lemmas. The first is very simple.

### Lemma 7.4

(Trivial Approximation for RegularizedCSM) When $$\ell$$ is unconstrained, there exists a (0, 1)-approximation algorithm for RegularizedCSM.

### Proof

Ignore f and maximize $$\ell$$, which can be done in polynomial time as noted in the preliminaries. $$\square$$

The next lemma combines Lemma 7.2 with the aided measured continuous greedy used by [14].

### Lemma 7.5

(Guarantee of Distorted Aided Measured Continuous Greedy) Let $$\ell$$ be unconstrained. If we run Distorted Aided Measured Continuous Greedy given a fractional solution $${\textbf{z}}$$ and a polytope $${\mathcal {P}}$$ for a total of $$t_f$$ time, where $$t_f\ge t_s$$, it will generate $${\textbf{y}}\in t_f{\mathcal {P}}\cap \left( (1-e^{-t_f})\cdot [0,1]^{{\mathcal {N}}} \right)$$ such that

\begin{aligned} {\mathbb {E}}\left[ F({\textbf{y}})+L({\textbf{y}}) \right]&\ge e^{-t_f}[(e^{t_s}+t_fe^{t_s}-t_se^{t_s}-1-o(1))f(OPT) \nonumber \\&\quad +(-e^{t_s}+1)F({\textbf{z}}\wedge {\textbf{1}}_{OPT}) \nonumber \\&\quad +(-e^{t_s}-t_fe^{t_s}+t_se^{t_s}+1+t_f)F({\textbf{z}}\vee {\textbf{1}}_{OPT})] \nonumber \\&\quad +(1-e^{-t_f})L_+({\textbf{1}}_{OPT}\backslash {\textbf{z}})+(1-e^{t_s-t_f})L_+({\textbf{1}}_{OPT}\wedge {\textbf{z}}) \nonumber \\&\quad +t_f L_-({\textbf{1}}_{OPT}\backslash {\textbf{z}}) + (t_f-t_s)L_-({\textbf{1}}_{OPT}\wedge {\textbf{z}}). \end{aligned}
(7.8)

Note that the terms depending on f are precisely the same as those in Lemma 5.3.

The proof is deferred to Sect. A.3.

### Proof of Theorem 7.1

The algorithm is similar to that of Theorem 5.1.

1. Run the trivial approximation algorithm (Lemma 7.4).

2. Generate $${\textbf{z}}$$ using the local search procedure described by [14, Lemma 3.1] on $$(f+\ell ,{\mathcal {P}})$$. This finds $${\textbf{z}}\in {\mathcal {P}}$$ such that

\begin{aligned} F({\textbf{z}})+L({\textbf{z}})&\ge \frac{(F+L)({\textbf{z}}\vee {\textbf{1}}_{OPT})+(F+L)({\textbf{z}}\wedge {\textbf{1}}_{OPT})}{2} \nonumber \\&\quad -o(1)\cdot (f+\ell )(OPT)\nonumber \\&\ge \frac{1}{2}F({\textbf{z}}\vee {\textbf{1}}_{OPT})+\frac{1}{2}F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+\frac{1}{2}\ell (OPT)\nonumber \\&\quad +\frac{1}{2}L({\textbf{z}}\wedge {\textbf{1}}_{OPT})-o(1)\cdot (f+\ell )(OPT), \end{aligned}
(7.9)

and

\begin{aligned} F({\textbf{z}})+L({\textbf{z}})\ge F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+L({\textbf{z}}\wedge {\textbf{1}}_{OPT})-o(1)\cdot (f+\ell )(OPT). \end{aligned}
(7.10)

Note that unlike Theorem 5.1, there is no guessing step.

3. Run distorted aided measured continuous greedy given $${\textbf{z}}$$ (Lemma 7.5), for all pairs

\begin{aligned} (t_s,t_f)\in {\mathcal {T}}\triangleq \{(0.1x,1)\mid 0\le x\le 10\}. \end{aligned}
(7.11)
4. Round $${\textbf{z}}$$ from step 2 and all fractional solutions found in step 3 to valid integral solutions using pipage rounding, which preserves the value of $$F+L$$ in expectation.

5. Return the solution from step 4 with the maximum value. Let $$\textbf{OPT}'$$ be the expected value of this solution.

As in the proof of Theorem 5.1, we claim that for a fixed $$\beta$$, to find a lower bound on $$\alpha$$ such that the following inequality holds:

\begin{aligned} \textbf{OPT}'&\triangleq \max \left( {\mathbb {E}}[F({\textbf{z}})+L({\textbf{z}})],\max _{(t_s,t_f)\in {\mathcal {T}}}\left( {\mathbb {E}}\left[ F({\textbf{y}}_{t_s,t_f})+L({\textbf{y}}_{t_s,t_f}) \right] \right) \right) \nonumber \\&\ge \alpha F(OPT) +\beta \ell (OPT), \end{aligned}
(7.12)

it suffices to solve the following linear program:

(7.13)

Any point $$(x_1,x_2,x_3,x_4,x_5)$$ within the convex hull satisfies:

\begin{aligned} \textbf{OPT}'\ge & {} x_1f(OPT)+x_2 F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+x_3 F({\textbf{z}}\vee {\textbf{1}}_{OPT})+x_4 L({\textbf{1}}_{OPT}\backslash {\textbf{z}})\nonumber \\{} & {} +x_5 L({\textbf{z}}\wedge {\textbf{1}}_{OPT}) \end{aligned}
(7.14)

up to o(1) terms. The points determining the hull are as follows:

• (0, 0, 0, 1, 1) corresponds to Lemma 7.4.

• (0, 0.5, 0.5, 0.5, 1) corresponds to Eq. (7.9).

• (0, 1, 0, 0, 1) corresponds to Eq. (7.10).

• The remaining points correspond to Lemma 7.5 for all $$(t_s,t_f)\in {\mathcal {T}}$$.

The constraints $$x_2,x_3\ge 0$$ ensure that

\begin{aligned} \textbf{OPT}'\ge x_1f(OPT)+x_4 L({\textbf{1}}_{OPT}\backslash {\textbf{z}})+x_5 L({\textbf{z}}\wedge {\textbf{1}}_{OPT}). \end{aligned}
(7.15)

The constraints $$\min (x_4,x_5)\ge \beta$$ ensure that

\begin{aligned} \textbf{OPT}'\ge x_1 f(OPT)+\beta \ell (OPT). \end{aligned}
(7.16)

$$\square$$

### 7.2 Inapproximability

In this subsection, we prove Theorem 7.6, which can be used to show that Theorem 7.1 is tight for $$\beta \ge (e-1)/e$$. We then discuss whether the construction used in Theorem 7.6 could potentially be extended to RegularizedUSM.

### Theorem 7.6

(Inapproximability of RegularizedCSM Near $$\beta =1$$) For any $$0\le \beta \le 1$$, there exist instances of RegularizedCSM with non-negative $$\ell$$ such that a $$(1-\beta +\epsilon ,\beta )$$-approximation would require exponentially many value queries.

### Proof

Define $$\alpha \triangleq 1-\beta +\epsilon$$. By Lemma 5.8, it suffices to construct a submodular function f satisfying

\begin{aligned} \max _{{\textbf{x}}\in {\mathcal {P}}}[F({\overline{{\textbf{x}}}})+L({\overline{{\textbf{x}}}})]< \max _{S\in {\mathcal {I}}}[\alpha \cdot f(S)+\beta \cdot \ell (S)]. \end{aligned}
(7.17)

We use the same f that Vondrak [18] uses for proving the inapproximability of maximization over matroid bases. Specifically, define $${\mathcal {N}}=\{a_1,\ldots ,a_k,b_1,\ldots ,b_k\}$$ and let f correspond to the sum of directed cut functions of k disjoint arcs; that is, $$f(S)\triangleq \sum _{i=1}^k[a_i\in S\text { and }b_i\not \in S]$$. Its multilinear extension is $$F({\textbf{x}}_{a_1\cdots a_k}, {\textbf{x}}_{b_1 \cdots b_k})=\sum _{i=1}^k{\textbf{x}}_{a_i}(1-{\textbf{x}}_{b_i}).$$ We define $${\mathcal {I}}$$ to consist of precisely the subsets of $${\mathcal {N}}$$ that contain at most one element from $$a_1,\ldots ,a_k$$ and at most $$k-1$$ elements from $$b_1,\ldots , b_k$$, resulting in the following matroid independence polytope:

\begin{aligned} {\mathcal {P}}=\left\{ ({\textbf{x}}_{a_i},{\textbf{x}}_{b_i})\big | \sum _{i=1}^k{\textbf{x}}_{a_i}\le 1\text { and }\sum _{i=1}^k{\textbf{x}}_{b_i}\le k-1\right\} . \end{aligned}
(7.18)

Finally, we define $$\ell$$ as $$\ell (a_i)=0, \ell (b_i)=\frac{1}{k}.$$ Then the RHS of Eq. (7.17) is at least:

\begin{aligned} \max _{S\in {\mathcal {I}}}[\alpha f(S)+\beta \ell (S)]\ge (\alpha f+\beta \ell )(\{a_1,b_2,b_3,\ldots ,b_k\})=\alpha + \beta \cdot \frac{k-1}{k}, \end{aligned}
(7.19)

while the LHS of Eq. (7.17) is:

\begin{aligned} \max _{{\textbf{x}}\in {\mathcal {P}}}[F({\overline{{\textbf{x}}}})+L({\overline{{\textbf{x}}}})]&=\max _{0\le p\le 1/k, 0\le q\le (k-1)/k}\left[ kp(1-q)+q \right] \nonumber \\&=\max _{0\le p\le 1/k, 0\le q\le (k-1)/k}\left[ q(1-kp)+kp \right] \nonumber \\&=\max _{0\le p\le 1/k}\left[ \frac{k-1}{k}(1-kp)+kp \right] =1, \end{aligned}
(7.20)

where the third equality follows because the expression is always maximized by setting $$q=\frac{k-1}{k}$$. For sufficiently large k we have

\begin{aligned} \alpha +\beta \cdot \frac{k-1}{k}\ge (\alpha +\beta )\frac{k-1}{k}=(1+\epsilon )\cdot \frac{k-1}{k}>1. \end{aligned}
(7.21)

Equation (7.17) follows. $$\square$$
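A grid check of this construction (illustrative only; the grid resolution and the parameter values $$k=100$$, $$\beta =0.5$$, $$\epsilon =0.1$$ are arbitrary choices):

```python
def check_construction(k, beta, eps, steps=200):
    """LHS of Eq. (7.17) via a grid over (p, q) in [0, 1/k] x [0, (k-1)/k],
    against the RHS lower bound alpha + beta*(k-1)/k from Eq. (7.19)."""
    alpha = 1 - beta + eps
    lhs = 0.0
    for i in range(steps + 1):
        p = (i / steps) / k
        for j in range(steps + 1):
            q = (j / steps) * (k - 1) / k
            lhs = max(lhs, k * p * (1 - q) + q)
    rhs = alpha + beta * (k - 1) / k
    return lhs, rhs

lhs, rhs = check_construction(k=100, beta=0.5, eps=0.1)
assert abs(lhs - 1.0) < 1e-9 and rhs > 1.0     # Eq. (7.17) holds for large k
```

For small k the gap disappears (e.g. $$k=2$$ gives a right-hand side below 1), matching the proof's need for sufficiently large k in Eq. (7.21).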

In fact, the bound of Theorem 7.6 is (nearly) tight for $$\beta$$ close to one.

### Corollary 7.7

(Tight RegularizedCSM Near $$\beta =1$$ for $$\ell \ge 0$$) For all $$\frac{e-1}{e}\le \beta < 1$$, there is a $$(1-\beta -\epsilon ,\beta )$$-approximation algorithm for RegularizedCSM with non-negative $$\ell$$, nearly matching the bound of Theorem 7.6.

### Proof

The better of Corollary 7.3 and Lemma 7.4 will be an $$(\alpha ,\beta )$$-approximation for all $$(\alpha ,\beta )$$ lying above the segment connecting $$\left( \frac{1}{e}-\epsilon ,\frac{e-1}{e}-\epsilon \right)$$ and (0, 1). $$\square$$

As the f used by Lemma 5.8 to prove Theorem 7.6 is just a directed cut function, it is natural to ask whether directed cut functions can be used by Lemma 5.8 to show improved inapproximability for RegularizedUSM. We build on Proposition 5.5 to show that doing so is impossible.

### Proposition 7.8

When $$\ell$$ is unconstrained, setting f to be a directed cut function in Lemma 5.8 cannot be used to show (0.5, 1)-inapproximability for RegularizedUSM.

The proof is deferred to Sect. A.3.

## 8 Unconstrained $$\ell$$

The results of this section are summarized in Figs. 5 and 6.

### 8.1 Approximability

In this section, we prove Theorems 8.1 and 8.3.

### Theorem 8.1

For all $$t\ge 0$$, there is a $$\left( \frac{t e^{-t}}{t+e^{-t}}-\epsilon ,\frac{t}{t+e^{-t}} \right)$$-approximation algorithm for RegularizedUSM. This algorithm achieves the same approximation guarantee for RegularizedCSM when $$t\le 1$$.

### Proof

It suffices to show that Algorithm 1 achieves the desired approximation factor. Disregard the factors of o(1) in Lemma 7.2; they can be taken into account later at the cost of introducing the factor of $$\epsilon$$. The set returned from Lemma 7.4 actually satisfies the stronger guarantee

\begin{aligned} \ell (T')\ge \ell _+(OPT), \end{aligned}
(8.1)

because if we define $$OPT^*\triangleq OPT\cap \{u\in {\mathcal {N}}:\ell (u)>0\}$$, then $$OPT^*\in {\mathcal {I}}$$ because $${\mathcal {I}}$$ is downward-closed, and thus $$\ell (T')\ge \ell (OPT^*)=\ell _+(OPT)$$.

Next, add $$t+e^{-t}-1$$ times Eq. (8.1) to the inequality of Lemma 7.2 with $$t_f=t$$ (this is valid since $$t+e^{-t}\ge 1$$ for all $$t\ge 0$$):

\begin{aligned} (t+e^{-t}){\mathbb {E}}[\max \left( f(T)+\ell (T),\ell (T') \right) ]&\ge {\mathbb {E}}[f(T)+\ell (T)] + (t+e^{-t}-1) {\mathbb {E}}[\ell (T')]\nonumber \\&\ge t e^{-t}f(OPT)+t (\ell _+(OPT)+\ell _-(OPT))\nonumber \\&=t e^{-t}f(OPT)+t \ell (OPT). \end{aligned}
(8.2)

Dividing both sides by $$t+e^{-t}$$ gives the desired result after accounting for the factors of o(1):

\begin{aligned} {\mathbb {E}}[\max (f(T)+\ell (T),\ell (T'))]\ge \left( \frac{t e^{-t}}{t+e^{-t}}-\epsilon \right) f(OPT)+\frac{t}{t+e^{-t}} \ell (OPT). \end{aligned}
(8.3)

$$\square$$
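The tradeoff curve of Theorem 8.1 is easy to tabulate directly. A quick check of the endpoint used by Corollary 8.2 below, together with the identity $$\alpha =t(1-\beta )$$ implied by the formulas:

```python
import math

def tradeoff(t):
    """(alpha, beta) guarantee of Theorem 8.1 as a function of t >= 0."""
    et = math.exp(-t)
    return t * et / (t + et), t / (t + et)

# t = 1 recovers the endpoint (1/(e+1), e/(e+1)) used by Corollary 8.2
a1, b1 = tradeoff(1.0)
assert abs(a1 - 1 / (math.e + 1)) < 1e-9
assert abs(b1 - math.e / (math.e + 1)) < 1e-9
# beta increases toward 1 as t grows, while alpha = t * (1 - beta)
for t in [0.5, 1.0, 2.0, 4.0]:
    a, b = tradeoff(t)
    assert abs(a - t * (1 - b)) < 1e-9
```

The relation $$\alpha =t(1-\beta )$$ with $$t>1$$ is exactly why Theorem 7.6 cannot extend to RegularizedUSM for $$\beta$$ close to one.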

Note that an analog of Corollary 7.7 (Tight RegularizedCSM Near $$\beta =1$$ for $$\ell \ge 0$$) holds for unconstrained $$\ell$$, though for a smaller range of $$\beta$$:

### Corollary 8.2

(Tight RegularizedCSM Near $$\beta =1$$) There is a $$(1-\beta -\epsilon ,\beta )$$-approximation algorithm for RegularizedCSM for any $$\frac{e}{e+1}\le \beta < 1$$, almost matching the bound of Theorem 7.6.

### Proof

Setting $$t=1$$, the output of Theorem 8.1 is both a $$\left( \frac{1}{e+1}-\epsilon ,\frac{e}{e+1} \right)$$-approximation and a (0, 1)-approximation for RegularizedCSM. Therefore it is also an $$(\alpha ,\beta )$$-approximation for all $$(\alpha ,\beta )$$ lying above the segment connecting $$\left( \frac{1}{e+1}-\epsilon ,\frac{e}{e+1} \right)$$ and (0, 1). $$\square$$

However, our result is not tight for $$\beta <e/(e+1)$$; it turns out that it is possible to do a little better than Theorem 8.1 for $$\beta$$ near 0.7 by making use of Lemma 7.5.

### Theorem 8.3

There is a (0.280, 0.7)-approximation algorithm for RegularizedCSM.

### Proof

The algorithm is Theorem 7.1 augmented to use the guessing step from Theorem 5.1. That is, we start by guessing the value of $$\ell _-(OPT)$$ to within a factor of $$1+\epsilon$$ and replacing $${\mathcal {P}}$$ with $${\mathcal {P}}\cap \{{\textbf{x}}: L_-({\textbf{x}})\ge (1+\epsilon )\ell _-(OPT)\}$$ as in Theorem 5.1, and then run Theorem 7.1.

To analyze the guarantee of this algorithm, we set up a linear program similar to that of Theorem 7.1 with two additional variables $$x_6$$ and $$x_7$$ corresponding to $$L_-({\textbf{1}}_{OPT}\backslash {\textbf{z}})$$ and $$L_-({\textbf{z}}\wedge {\textbf{1}}_{OPT})$$, respectively. Again, we ignore terms that are o(1) and those depending on $$\epsilon$$.

(8.4)

Any point $$(x_1,x_2,x_3,x_4,x_5,x_6,x_7)$$ within the convex hull satisfies:

\begin{aligned} \textbf{OPT}'&\ge x_1f(OPT)+x_2 F({\textbf{z}}\wedge {\textbf{1}}_{OPT})+x_3 F({\textbf{z}}\vee {\textbf{1}}_{OPT})\nonumber \\&\quad +x_4 L({\textbf{1}}_{OPT}\backslash {\textbf{z}})+x_5 L({\textbf{z}}\wedge {\textbf{1}}_{OPT})+x_6L_-({\textbf{1}}_{OPT}\backslash {\textbf{z}})\nonumber \\&\quad +x_7L_-({\textbf{z}}\wedge {\textbf{1}}_{OPT}) \end{aligned}
(8.5)

up to o(1) terms. The points determining the hull are as follows:

• (0, 0, 0, 1, 1, 0, 0) corresponds to Lemma 7.4.

• (0, 0.5, 0.5, 0.5, 1, 1, 1) corresponds to Eq. (7.9). Note that this inequality holds only because of the guessing step.

• (0, 1, 0, 0, 1, 0, 1) corresponds to Eq. (7.10).

• The remaining vertices correspond to Lemma 7.5.

Choosing $${\mathcal {T}}=\{(0.205, 0.955)\}$$ and solving the linear program gives $$x_1\ge 0.280$$ as desired. $$\square$$

### 8.2 Inapproximability

In this subsection, we prove Theorems 8.4 and 8.5. Note that Theorem 7.6 cannot possibly apply to RegularizedUSM, since Theorem 8.1 achieves $$(1-\beta +\epsilon ,\beta )$$-approximations for $$\beta$$ close to one. Although we are unable to prove $$(1,\epsilon )$$-inapproximability of RegularizedUSM, we modify Theorem 5.6 to show stronger inapproximability when $$\ell$$ is unconstrained than when $$\ell$$ is restricted to be non-negative or non-positive.

### Theorem 8.4

(Inapproximability of RegularizedUSM) For each pair $$(\alpha (\beta ),\beta )$$ in Table 4, there are instances of RegularizedUSM for which an $$(\alpha (\beta ),\beta )$$-approximation is impossible. In particular, (0.440, 1) is inapproximable.

### Proof

Set f to be the same function defined in Lemma 4.5, and define $$S\triangleq \{a,b_1\}$$. For fixed $$\beta$$, we can show $$(\alpha ,\beta )$$-inapproximability using Lemma 5.8 if it is possible to choose $$\ell$$ and $$\kappa$$ such that:

\begin{aligned} \max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L(\overline{{\textbf{x}}}) \right] <\alpha f(\{a,b_1\})+\beta \ell (\{a,b_1\})=\alpha + \beta \ell (\{a,b_1\}), \end{aligned}
(8.6)

which is equivalent to

\begin{aligned} \max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L({{\overline{{\textbf{x}}}}}) \right] -\beta \ell (\{a,b_1\})<\alpha . \end{aligned}
(8.7)

For a fixed $$\beta$$, our goal is to choose $$\ell$$ and $$\kappa$$ to minimize the LHS of the above inequality. Theorem 1.3 of [24] sets $$\ell _a=\ell _b=0$$, and then chooses $$\kappa$$ and $$\ell _{a_{1\cdots k}}=\ell _{b_{1\cdots k}}\triangleq \ell _p$$ in order to minimize the quantity

\begin{aligned}&\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L(\overline{{\textbf{x}}}) \right] -\beta \ell (\{a,b_1\})=\max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}\left[ F({{\overline{{\textbf{x}}}}})+L({{\overline{{\textbf{x}}}}}) \right] -\beta \ell _p\nonumber \\&\quad \approx \max _{0\le q\le 1, 0\le p}\left[ (1-\kappa )2q(1-q)+\kappa 2 (1-q)(1-e^{-p})+2p\ell _p \right] -\beta \ell _p \end{aligned}
(8.8)

However, allowing $$\ell _a=\ell _b\triangleq \ell _q$$ to be nonzero gives better bounds for all $$\beta$$. That is, our goal is to compute

\begin{aligned}&\min _{0\le \kappa \le 1, \ell _q , \ell _p }\Big [\max _{0\le q\le 1, 0\le p}\big [(1-\kappa )2q(1-q)\nonumber \\&\quad +\kappa 2(1-q)(1-e^{-p})+2p\ell _p+2q\ell _q\big ]-\beta (\ell _p+\ell _q)\Big ]. \end{aligned}
(8.9)

We approximate the optimal value by a brute-force search over a range of triples $$(\kappa ,\ell _q,\ell _p)$$. The best triples we found are displayed in Table 4. Allowing $$\ell _q$$ to be negative gives superior bounds for $$\beta$$ near zero, while allowing $$\ell _q$$ to be positive gives superior bounds for $$\beta$$ near one. In particular, for $$\beta =1$$, taking $$\kappa =0.6022$$, $$\ell _p=-0.1819,$$ and $$\ell _q=0.2152$$ gives $$\alpha (\beta )\approx 0.4390<0.440$$. $$\square$$
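A sketch of such a brute-force check for the triple reported at $$\beta =1$$ (the grid resolution and the search bound on p below are our assumptions; the paper's repository contains the authoritative verification code):

```python
import math

def alpha_bound(kappa, lp, lq, beta, steps=400, p_max=3.0):
    """Approximate Eq. (8.9) for a fixed (kappa, lq, lp): maximize the
    inner expression over q in [0, 1] and p in [0, p_max] by grid
    search, then subtract beta * (lp + lq)."""
    best = -math.inf
    for i in range(steps + 1):
        q = i / steps
        for j in range(steps + 1):
            p = p_max * j / steps
            val = ((1 - kappa) * 2 * q * (1 - q)
                   + kappa * 2 * (1 - q) * (1 - math.exp(-p))
                   + 2 * p * lp + 2 * q * lq)
            best = max(best, val)
    return best - beta * (lp + lq)

# Triple from the proof for beta = 1; the result is roughly 0.439 < 0.440.
a = alpha_bound(kappa=0.6022, lp=-0.1819, lq=0.2152, beta=1.0)
assert 0.437 < a < 0.440
```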

We can do slightly better than Theorem 8.4 for $$\beta$$ very close to one with a construction inspired by [24, Theorem 1.6].

### Theorem 8.5

(Inapproximability of RegularizedUSM, $$\beta =1$$) There are instances of RegularizedUSM where (0.408, 1) is inapproximable.

### Proof

Again, we use Lemma 5.8. Let $${\mathcal {N}}\triangleq \{a_{1\cdots k}, b_{1\cdots k}\}$$, and define f as the directed cut function of a generalized hyperedge $$(a_{1\cdots k};b_{1\cdots k})$$; that is, the generalized hyperedge is said to be cut by S if S contains at least one of the tails of the hyperedge ($$a_{1\cdots k}$$) but not all of the heads of the hyperedge ($$b_{1\cdots k}$$):

\begin{aligned} f(S)\triangleq [S\cap \{a_{1\cdots k}\} \ne \emptyset ]\cdot [\{b_{1\cdots k}\}\not \subset S]. \end{aligned}
(8.10)

Also define $$\ell (a_i)=-0.2037, \ell (b_i)=0.2037$$, $$p\triangleq \sum _{i=1}^k{\textbf{x}}_{a_i}$$ and $$q\triangleq k-\sum _{i=1}^k{\textbf{x}}_{b_i}$$, and $${\mathcal {G}}$$ such that $$a_1,\ldots ,a_k$$ and $$b_1,\ldots ,b_k$$ are symmetric. Then as $$k\rightarrow \infty$$,

\begin{aligned} F({\overline{{\textbf{x}}}})=\left( 1-\left( 1-p/k \right) ^k \right) \left( 1-\left( 1-q/k \right) ^k \right) \approx (1-e^{-p})(1-e^{-q}). \end{aligned}
(8.11)
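The approximation in Eq. (8.11) is the standard limit $$(1-p/k)^k\rightarrow e^{-p}$$; a quick check at a large but finite k (the particular values of p, q, and k below are arbitrary):

```python
import math

def cut_prob(p, q, k):
    # Exact value of F(x-bar) from Eq. (8.11) at finite k.
    return (1 - (1 - p / k) ** k) * (1 - (1 - q / k) ** k)

p, q, k = 0.7, 1.3, 10**6
limit = (1 - math.exp(-p)) * (1 - math.exp(-q))
assert abs(cut_prob(p, q, k) - limit) < 1e-5
```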

Now,

\begin{aligned} \max _{{\textbf{x}}\in [0,1]^{{\mathcal {N}}}}[F({{\overline{{\textbf{x}}}}})+L(\overline{{\textbf{x}}})]&=\max _{p,q\ge 0}[(1-e^{-p})(1-e^{-q})-0.2037(p+q)+0.2037k]\nonumber \\&=0.2037k, \end{aligned}
(8.12)

where the last equality follows since the maximum is attained at $$p=q=0$$, which may be verified using a numerical optimizer. On the other hand,

\begin{aligned} \max _S[\alpha f(S)+\ell (S)]\ge (f+\ell )(\{a_1,b_{1\cdots k-1}\})=\alpha +0.2037(k-2). \end{aligned}
(8.13)

It follows from Lemma 5.8 that we have shown $$(\alpha ,1)$$-inapproximability for any $$\alpha$$ satisfying

\begin{aligned} 0.2037k<\alpha +0.2037(k-2)\implies \alpha > 0.4074. \end{aligned}
(8.14)

$$\square$$
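The numerical claim behind Eq. (8.12), that the maximum is attained at $$p=q=0$$, can be spot-checked with a coarse grid search in place of a numerical optimizer (the search bound and resolution below are our assumptions):

```python
import math

def h(p, q, c=0.2037):
    # Eq. (8.12) without the constant 0.2037 * k term.
    return (1 - math.exp(-p)) * (1 - math.exp(-q)) - c * (p + q)

steps, bound = 500, 5.0
best, arg = -math.inf, None
for i in range(steps + 1):
    for j in range(steps + 1):
        p, q = bound * i / steps, bound * j / steps
        v = h(p, q)
        if v > best:
            best, arg = v, (p, q)

# The maximum over the grid is 0, attained at the origin, so the
# maximum in Eq. (8.12) is 0.2037 * k as claimed; Eq. (8.14) then
# gives the threshold alpha = 2 * 0.2037 = 0.4074.
assert best == 0.0 and arg == (0.0, 0.0)
```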

## 9 Open Problems

For all of the settings we consider, there is still a range of $$\beta$$ for which there is a gap between the highest $$\alpha (\beta )$$ known to be approximable and the lowest $$\alpha (\beta )$$ known to be inapproximable. The open problems discussed below focus more on RegularizedUSM than RegularizedCSM due to this gap being larger for RegularizedUSM.

### 9.1 Approximability

Section 5: Non-positive $$\ell$$.

Theorem 5.1 attains bounds for RegularizedUSM with $$\alpha \ge 0.398$$. What is the maximum $$\alpha$$ such that an $$(\alpha ,\beta )$$ approximation exists for some $$\beta$$? In particular, is $$\alpha =0.5$$ achievable?

Section 8: Unconstrained $$\ell$$.

Is there an algorithm that achieves an $$(\epsilon ,1)$$ approximation for RegularizedUSM? Recall that for RegularizedCSM this was achievable when $$\ell$$ was restricted to be non-positive or non-negative (Lemmas 5.2 and 6.6, respectively), but not in the case where $$\ell$$ can have arbitrary sign (Theorem 7.6).

Section A.2: Online RegularizedUSM.

Can the $$\alpha$$ in Proposition A.4 be improved? Is there an online algorithm that works for general non-monotone f and non-positive $$\ell$$? We note that the semi-streaming algorithms studied by Kazemi et al. [22] and Nikolakaki et al. [23] provide a (0.5, 1)-approximation algorithm for RegularizedUSM when f is monotone. For non-monotone USM, simply selecting each element of the ground set independently with probability 0.5 achieves a 0.25-approximation [15]. For non-monotone CSM under a cardinality constraint, Buchbinder et al. [29] provide an online algorithm achieving a competitive ratio of $$\frac{56}{627}>0.0893$$ when preemption is allowed.

### 9.2 Inapproximability

Table 5 summarizes some of the best-known inapproximability results and their corresponding approximation guarantees. The gaps between approximability and inapproximability are particularly large in the second and fourth rows, corresponding to RegularizedUSM for $$\ell \le 0$$ and unconstrained $$\ell$$, respectively. Most of our inapproximability results are applications of the symmetry gap technique to modified versions of [17, Theorem 5.4]. Perhaps further reducing these gaps is simply a matter of finding a different construction to apply the symmetry gap technique to.

## 10 Supplementary information

Python code for all theorems requiring computational verification can be found at https://github.com/bqi343/maximizing-sums.