
1 Introduction

Given a Boolean formula F, the problem of model counting is to compute the number of models of F. Model counting is a fundamental problem in computer science with a wide range of applications, such as control improvisation [13], network reliability [9, 28], neural network verification [2], probabilistic reasoning [5, 11, 20, 21], and the like. Beyond its myriad applications, model counting is also a central problem in theoretical computer science. In his seminal paper, Valiant showed that \(\#\textsf{SAT}\) is \(\#\textsf{P}\)-complete, where \(\#\textsf{P}\) is the set of counting problems whose decision versions lie in \(\textsf{NP}\) [28]. Subsequently, Toda demonstrated the theoretical hardness of the problem by showing that every problem in the polynomial hierarchy can be solved by just one call to a \(\#\textsf{P}\) oracle; more formally, \(\text{PH} \subseteq \text{P}^{\#\textsf{P}}\) [27].

Given the computational intractability of \(\#\textsf{SAT}\), there has been sustained interest in the development of approximate techniques from theoreticians and practitioners alike. Stockmeyer introduced a randomized hashing-based technique that provides \((\varepsilon , \delta )\)-guarantees (formally defined in Sect. 2) given access to an \(\textsf{NP}\) oracle [25]. Given the lack of practical solvers that could handle problems in \(\textsf{NP}\) satisfactorily, there were no practical implementations of Stockmeyer’s hashing-based technique until the 2000s [14]. Building on the unprecedented advancements in the development of SAT solvers, Chakraborty, Meel, and Vardi extended Stockmeyer’s framework to a scalable \((\varepsilon , \delta )\)-counting algorithm, \(\textsf{ApproxMC}\) [7]. The subsequent years have witnessed a sustained interest in further optimizations of the hashing-based techniques for approximate counting [5, 6, 10, 11, 17,18,19, 23, 29, 30]. The current state-of-the-art technique for approximate counting is a hashing-based framework called \(\textsf{ApproxMC}\), which is in its fourth version, called \(\textsf{ApproxMC4}\) [22, 24].

The core theoretical idea behind the hashing-based framework is to use 2-universal hash functions to partition the solution space, denoted by \(\mathsf {sol({F})}\) for a formula F, into roughly equal, small cells, wherein a cell is considered small if it contains at most \(\textsf{thresh}\) solutions, where \(\textsf{thresh}\) is a pre-computed threshold. An \(\textsf{NP}\) oracle (in practice, a SAT solver) is employed to check if a cell is small by enumerating solutions one-by-one until either there are no more solutions or we have already enumerated \(\textsf{thresh}+ 1\) solutions. Then, we randomly pick a cell, enumerate solutions within the cell (if the cell is small), and scale the obtained count by the number of cells to obtain an estimate for \(|\mathsf {sol({F})}|\). To amplify the confidence, we rely on the standard median technique: repeat the above process, called \(\textsf{ApproxMCCore}\), multiple times and return the median. Computing the median amplifies the confidence because, for the median of t repetitions to be outside the desired range (i.e., \(\left[ \frac{|\mathsf {sol({F})}|}{1+\varepsilon }, (1+\varepsilon ) |\mathsf {sol({F})}| \right] \)), at least half of the repetitions of \(\textsf{ApproxMCCore}\) must return a wrong estimate.
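To make the partition-and-scale idea concrete, the following toy sketch replaces the \(\textsf{NP}\) oracle with an explicit solution set and uses random XOR (parity) constraints as the hash functions. The function names, the fixed all-zero cell (the random offset bit plays the role of a random \(\alpha \)), and the simplified linear search for m are illustrative assumptions, not the actual \(\textsf{ApproxMC}\) implementation.

```python
import random
import statistics

def toy_core(solutions, n, thresh, rng):
    """One ApproxMCCore-style estimate on an explicit solution set.

    Adds random parity constraints one at a time until the chosen cell
    has at most thresh elements, then scales by the number of cells 2**m.
    """
    # Row i defines hash bit h(y)[i] = <a_i, y> xor b_i over GF(2).
    rows = [(rng.getrandbits(n), rng.getrandbits(1)) for _ in range(n)]
    cell, m = set(solutions), 0
    while m < n and len(cell) > thresh:  # cell not yet "small"
        a, b = rows[m]
        cell = {y for y in cell if (bin(a & y).count("1") + b) % 2 == 0}
        m += 1
    return len(cell) * 2 ** m

def toy_count(solutions, n, thresh, t, seed=1):
    """Median of t core estimates: the confidence-amplification step."""
    rng = random.Random(seed)
    return statistics.median(toy_core(solutions, n, thresh, rng)
                             for _ in range(t))
```

When \(\textsf{thresh}\) exceeds the true count, no constraint is ever added and the count is exact; with a realistic threshold, the median of the repetitions concentrates around \(|\mathsf {sol({F})}|\).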

In practice, every repetition of \(\textsf{ApproxMCCore}\) takes a similar amount of time, and the overall runtime increases linearly with the number of invocations. The number of repetitions depends logarithmically on \(\delta ^{-1}\). As a particular example, for \(\varepsilon =0.8\), the number of repetitions of \(\textsf{ApproxMCCore}\) to attain \(\delta =0.1\) is 21, which increases to 117 for \(\delta =0.001\): a significant increase in the number of repetitions (and, accordingly, the time taken). It is therefore no surprise that empirical analyses of tools such as \(\textsf{ApproxMC}\) have been presented with a high value of \(\delta \) (such as \(\delta =0.1\)). On the other hand, for several applications, such as network reliability and quantitative verification, the end users desire estimates with high confidence. Therefore, the design of efficient counting techniques for small \(\delta \) is a major challenge that one needs to address to enable the adoption of approximate counting techniques in practice.
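The repetition counts quoted above can be reproduced from the median analysis: one picks the smallest odd t for which the binomial tail \(\eta (t, \lceil t/2 \rceil , p)\) drops below \(\delta \), assuming the bound \(p \le 0.36\) on the per-invocation error probability used in \(\textsf{ApproxMC}\)'s analysis (see Sect. 4). A minimal sketch (function names are ours):

```python
import math

def eta(t, m, p):
    """Pr[Binomial(t, p) >= m]: at least m of t repetitions err."""
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(m, t + 1))

def repetitions(delta, p_err=0.36):
    """Smallest odd t with eta(t, ceil(t/2), p_err) <= delta."""
    t = 1
    while eta(t, (t + 1) // 2, p_err) > delta:
        t += 2
    return t
```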

The primary contribution of our work is to address the above challenge. We introduce a new technique called rounding that enables dramatic reductions in the number of repetitions required to attain a desired value of confidence. The core technical idea behind the design of the rounding technique is based on the following observation: Let L (resp. U) refer to the event that a given invocation of \(\textsf{ApproxMCCore}\) under (resp. over)-estimates \(|\mathsf {sol({F})}|\). For a median estimate to be wrong, either the event L happens in half of the invocations of \(\textsf{ApproxMCCore}\) or the event U happens in half of the invocations of \(\textsf{ApproxMCCore}\). The number of repetitions depends on \(\max (\Pr [L], \Pr [U])\). The current algorithmic design (and ensuing analysis) of \(\textsf{ApproxMCCore}\) provides a weak upper bound on \(\max \{\Pr [L], \Pr [U]\}\): in particular, the bounds on \(\max \{\Pr [L], \Pr [U]\}\) and \(\Pr [L \cup U]\) are almost identical. Our key technical contribution is to design a new procedure, \(\textsf{ApproxMC6Core}\), based on the rounding technique that allows us to obtain significantly better bounds on \(\max \{\Pr [L], \Pr [U]\}\).

The resulting algorithm, called \(\textsf{ApproxMC6}\), follows a similar structure to that of \(\textsf{ApproxMC}\): it repeatedly invokes the underlying core procedure \(\textsf{ApproxMC6Core}\) and returns the median of the estimates. Since a single invocation of \(\textsf{ApproxMC6Core}\) takes as much time as \(\textsf{ApproxMCCore}\), the reduction in the number of repetitions is primarily responsible for the ensuing speedup. As an example, for \(\varepsilon =0.8\), the number of repetitions of \(\textsf{ApproxMC6Core}\) to attain \(\delta =0.1\) and \(\delta =0.001\) is just 5 and 19, respectively; the corresponding numbers for \(\textsf{ApproxMC}\) were 21 and 117. An extensive experimental evaluation on 1890 benchmarks shows that the rounding technique provides a \(4\times \) speedup over the state-of-the-art approximate model counter, \(\textsf{ApproxMC}\). Furthermore, for a given timeout of 5000 s, \(\textsf{ApproxMC6}\) solves 204 more instances than \(\textsf{ApproxMC}\) and achieves a reduction of 1063 s in the PAR-2 score.

The rest of the paper is organized as follows. We introduce notation and preliminaries in Sect. 2. To place our contribution in context, we review related works in Sect. 3. We identify the weakness of the current technique in Sect. 4 and present the rounding technique in Sect. 5 to address this issue. Then, we present our experimental evaluation in Sect. 6. Finally, we conclude in Sect. 7.

2 Notation and Preliminaries

Let F be a Boolean formula in conjunctive normal form (\(\textsf{CNF}\)), and let \(\textsf{Vars }(F)\) be the set of variables appearing in F. The set \(\textsf{Vars }(F)\) is also called the support of F. An assignment \(\sigma \) of truth values to the variables in \(\textsf{Vars }(F)\) is called a satisfying assignment or witness of F if it makes F evaluate to true. We denote the set of all witnesses of F by \(\mathsf {sol({F})}\). Throughout the paper, we will use \({n}\) to denote \(|\textsf{Vars }(F)|\).

The propositional model counting problem is to compute \(|\mathsf {sol({F})}|\) for a given \(\textsf{CNF}\) formula F. A probably approximately correct (or \(\textsf{PAC}\)) counter is a probabilistic algorithm \(\textsf{ApproxCount}(\cdot , \cdot , \cdot )\) that takes as inputs a formula F, a tolerance parameter \(\varepsilon > 0\), and a confidence parameter \(\delta \in (0,1]\), and returns an \((\varepsilon , \delta )\)-estimate c, i.e., \(\textsf{Pr}\left[ {\frac{|\mathsf {sol({F})}|}{1+\varepsilon } \le c \le (1 + \varepsilon )|\mathsf {sol({F})}|}\right] \ge 1 - \delta \). \(\textsf{PAC}\) guarantees are also sometimes referred to as \((\varepsilon , \delta )\)-guarantees.

A closely related notion is projected model counting, where we are interested in computing the cardinality of \(\mathsf {sol({F})}\) projected on a subset of variables \(\mathcal {P}\subseteq \textsf{Vars }(F)\). While for clarity of exposition, we describe our algorithm in the context of model counting, the techniques developed in this paper are applicable to projected model counting as well. Our empirical evaluation indeed considers such benchmarks.

2.1 Universal Hash Functions

Let \(n, m \in \mathbb {N}\) and \(\mathcal {H}(n, m) \overset{\triangle }{=}\ \{h : \{0, 1\}^n \rightarrow \{0, 1\}^m\}\) be a family of hash functions mapping \(\{0, 1\}^n\) to \(\{0, 1\}^m\). We use \(h \overset{R}{\leftarrow }\mathcal {H}(n, m)\) to denote the probability space obtained by choosing a function h uniformly at random from \(\mathcal {H}(n, m)\). To measure the quality of a hash function, we are interested in the set of elements of \(\mathsf {sol({F})}\) mapped to \(\alpha \) by h, denoted \(\textsf{Cell}_{\langle F, h, \alpha \rangle }\), and its cardinality, i.e., \(|\textsf{Cell}_{\langle F, h, \alpha \rangle }|\). We write \(\Pr [Z:\varOmega ]\) to denote the probability of outcome Z when sampling from a probability space \(\varOmega \). For brevity, we omit \(\varOmega \) when it is clear from the context. The expected value of Z is denoted \(\textsf{E}\left[ {Z}\right] \) and its variance is denoted \(\sigma ^2[Z]\).

Definition 1

A family of hash functions \(\mathcal {H}(n, m)\) is strongly 2-universal if for all distinct \(x, y \in \{0, 1\}^n\), all \(\alpha \in \{0, 1\}^m\), and \(h \overset{R}{\leftarrow }\ \mathcal {H}(n, m)\),

$$\begin{aligned} \textsf{Pr}\left[ {h(x) = \alpha }\right] = \frac{1}{2^m} = \textsf{Pr}\left[ {h(x) = h(y)}\right] \end{aligned}$$
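A standard instance of such a family is \(h(x) = Ax \oplus b\) over GF(2), with A a uniformly random \(m \times n\) binary matrix and b a uniformly random m-bit vector. The snippet below, a self-contained illustration rather than part of any algorithm in the paper, verifies both equalities of Definition 1 exhaustively for the tiny parameters n = 3 and m = 2:

```python
from itertools import product

n, m = 3, 2  # small enough to enumerate the whole family

def apply_hash(A, b, x):
    # h(x)[i] = <A[i], x> xor b[i] over GF(2); rows are bit-masks (ints)
    return tuple((bin(A[i] & x).count("1") + b[i]) % 2 for i in range(m))

# The family: every choice of m x n binary matrix A and m-bit vector b.
family = [(A, b) for A in product(range(2 ** n), repeat=m)
                 for b in product((0, 1), repeat=m)]

def prob(event):
    """Probability of `event` when h is drawn uniformly from the family."""
    return sum(1 for A, b in family if event(A, b)) / len(family)

def uniform_ok():
    # Pr[h(x) = alpha] = 1 / 2^m for every x and alpha
    return all(prob(lambda A, b: apply_hash(A, b, x) == alpha) == 1 / 2 ** m
               for x in range(2 ** n)
               for alpha in product((0, 1), repeat=m))

def collision_ok():
    # Pr[h(x) = h(y)] = 1 / 2^m for every pair of distinct x, y
    return all(prob(lambda A, b: apply_hash(A, b, x) == apply_hash(A, b, y))
               == 1 / 2 ** m
               for x in range(2 ** n) for y in range(2 ** n) if x != y)
```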

For \(h \overset{R}{\leftarrow }\ \mathcal {H}(n, n)\) and \(\forall m \in \{1, ..., n\}\), the \(m^{th}\) prefix-slice of h, denoted \(h^{(m)}\), is a map from \(\{0, 1\}^n\) to \(\{0, 1\}^m\), such that \(h^{(m)}(y)[i] = h(y)[i]\), for all \(y \in \{0, 1\}^n\) and for all \(i \in \{1, ..., m\}\). Similarly, the \(m^{th}\) prefix-slice of \(\alpha \in \{0, 1\}^n\), denoted \(\alpha ^{(m)}\), is an element of \(\{0, 1\}^m\) such that \(\alpha ^{(m)}[i] = \alpha [i]\) for all \(i \in \{1, ..., m\}\). To avoid cumbersome terminology, we abuse notation and write \(\textsf{Cell}_{\langle F, m \rangle }\)(resp. \(\textsf{Cnt}_{\langle F, m \rangle }\)) as a short-hand for \(\textsf{Cell}_{\langle F, h^{(m)}, \alpha ^{(m)} \rangle }\) (resp. \(|\textsf{Cell}_{\langle F, h^{(m)}, \alpha ^{(m)} \rangle }|\)). The following proposition presents two results that are frequently used throughout this paper. The proof is deferred to Appendix A.

Proposition 1

For every \(1 \le m \le n\), the following holds:

$$\begin{aligned} \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] = \frac{|\mathsf {sol({F})}|}{2^m} \end{aligned}$$
(1)
$$\begin{aligned} \sigma ^2\left[ \textsf{Cnt}_{\langle F, m \rangle }\right] \le \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] \end{aligned}$$
(2)

The usage of prefix-slices of h ensures the monotonicity of the random variable \(\textsf{Cnt}_{\langle F, m \rangle }\): from the definition of prefix-slice, for every \(1 \le m < n\), \(h^{(m+1)}(y) = \alpha ^{(m+1)} \Rightarrow h^{(m)}(y) = \alpha ^{(m)}\). Formally,

Proposition 2

For every \(1 \le m < n\), \(\textsf{Cell}_{\langle F, m+1 \rangle } \subseteq \textsf{Cell}_{\langle F, m \rangle }\)
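Both propositions can be checked exhaustively on a toy instance. The sketch below draws h from the random-matrix XOR family (an illustrative choice; any strongly 2-universal family works), fixes \(\alpha \) to the all-zero string, and verifies Eq. (1), Eq. (2), and the nesting of cells; the solution set S is an arbitrary example:

```python
from itertools import product

n = 3
S = {1, 3, 4, 6, 7}  # toy sol(F): assignments over 3 variables, as ints

def hbits(A, b, y):
    # full hash h(y) in H(n, n); h(y)[i] = <A[i], y> xor b[i] over GF(2)
    return tuple((bin(A[i] & y).count("1") + b[i]) % 2 for i in range(n))

def cell(A, b, alpha, m):
    # Cell_<F, m>: solutions whose m-th prefix-slice hash equals alpha^(m)
    return {y for y in S if hbits(A, b, y)[:m] == alpha[:m]}

family = [(A, b) for A in product(range(2 ** n), repeat=n)
                 for b in product((0, 1), repeat=n)]
alpha = (0, 0, 0)

def proposition1_ok():
    for m in (1, 2, 3):
        cnts = [len(cell(A, b, alpha, m)) for A, b in family]
        mean = sum(cnts) / len(cnts)
        var = sum((c - mean) ** 2 for c in cnts) / len(cnts)
        if mean != len(S) / 2 ** m or var > mean:  # Eq. (1) and Eq. (2)
            return False
    return True

def proposition2_ok():
    # Proposition 2: cells shrink as the prefix grows
    return all(cell(A, b, alpha, 3) <= cell(A, b, alpha, 2)
               <= cell(A, b, alpha, 1)
               for A, b in family)
```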

2.2 Helpful Combinatorial Inequality

Lemma 1

Let \(\eta (t, m, p) = \sum ^t_{k=m} {t \atopwithdelims ()k}p^k(1-p)^{t-k}\). For \(p < 0.5\),

$$\begin{aligned} \eta (t, \lceil t/2 \rceil , p) \in \varTheta \left( t^{-\frac{1}{2}} \left( 2\sqrt{p(1-p)}\right) ^t\right) \end{aligned}$$

Proof

We will derive both an upper and a matching lower bound for \(\eta (t, \lceil t/2 \rceil , p)\). We begin by deriving an upper bound: \(\eta (t, \lceil t/2 \rceil , p) = \sum ^t_{k=\lceil \frac{t}{2} \rceil } {t \atopwithdelims ()k}p^k(1-p)^{t-k}\) \(\le {t \atopwithdelims ()\lceil t/2 \rceil }\sum ^t_{k=\lceil \frac{t}{2} \rceil }p^k(1-p)^{t-k}\) \(\le {t \atopwithdelims ()\lceil t/2 \rceil } \cdot (p(1-p))^{\lceil \frac{t}{2} \rceil } \cdot \frac{1}{1-2p}\) \(\le \frac{1}{\sqrt{2\pi }} \cdot \frac{t}{\sqrt{\left( \frac{t}{2}-0.5\right) \left( \frac{t}{2}+0.5\right) }} \cdot \left( \frac{t}{t-1}\right) ^t \cdot e^{\frac{1}{12t}-\frac{1}{6t+6}-\frac{1}{6t-6}} \cdot t^{-\frac{1}{2}} 2^t \cdot (p(1-p))^{\frac{t}{2}} \cdot (p(1-p))^{\frac{1}{2}} \cdot \frac{1}{1-2p}\). The last inequality follows from Stirling’s approximation. As a result, \(\eta (t, \lceil t/2 \rceil , p) \in \mathcal {O}\left( {t^{-\frac{1}{2}} \left( 2\sqrt{p(1-p)}\right) ^t}\right) \). Next, we derive a matching lower bound: \(\eta (t, \lceil t/2 \rceil , p) = \sum ^t_{k=\lceil \frac{t}{2} \rceil } {t \atopwithdelims ()k}p^k(1-p)^{t-k}\) \(\ge {t \atopwithdelims ()\lceil t/2 \rceil }p^{\lceil \frac{t}{2} \rceil }(1-p)^{t-\lceil \frac{t}{2} \rceil }\) \(\ge \frac{1}{\sqrt{2\pi }} \cdot \frac{t}{\sqrt{\left( \frac{t}{2}-0.5\right) \left( \frac{t}{2}+0.5\right) }} \cdot \left( \frac{t}{t+1}\right) ^t \cdot e^{\frac{1}{12t}-\frac{1}{6t+6}-\frac{1}{6t-6}} \cdot t^{-\frac{1}{2}} 2^t \cdot (p(1-p))^{\frac{t}{2}} \cdot p^{\frac{1}{2}} (1-p)^{-\frac{1}{2}} \cdot \frac{1}{1-2p}\). The last inequality again follows from Stirling’s approximation. Hence, \(\eta (t, \lceil t/2 \rceil , p) \in \varOmega \left( t^{-\frac{1}{2}} \left( 2\sqrt{p(1-p)}\right) ^t\right) \). Combining these two bounds, we conclude that \(\eta (t, \lceil t/2 \rceil , p) \in \varTheta \left( t^{-\frac{1}{2}} \left( 2\sqrt{p(1-p)}\right) ^t\right) \).    \(\square \)
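As a numerical sanity check of the \(\varTheta (\cdot )\) rate (not part of the proof), the ratio of \(\eta (t, \lceil t/2 \rceil , p)\) to \(t^{-\frac{1}{2}} \left( 2\sqrt{p(1-p)}\right) ^t\) should remain within constant factors as t grows; the choice p = 0.3 and the tolerance constants in the assertions are arbitrary:

```python
import math

def eta(t, m, p):
    # Pr[Binomial(t, p) >= m]
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(m, t + 1))

def theta_rate(t, p):
    # the claimed growth rate t^(-1/2) * (2 * sqrt(p(1-p)))^t
    return t ** -0.5 * (2 * math.sqrt(p * (1 - p))) ** t

p = 0.3
ratios = [eta(t, (t + 1) // 2, p) / theta_rate(t, p)
          for t in range(11, 202, 10)]  # odd t from 11 to 201
```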

3 Related Work

The seminal work of Valiant established that \(\#\textsf{SAT}\) is \(\#\textsf{P}\)-complete [28]. Toda later showed that every problem in the polynomial hierarchy could be solved by just a polynomial number of calls to a \(\#\textsf{P}\) oracle [27]. Based on Carter and Wegman’s seminal work on universal hash functions [4], Stockmeyer proposed a probabilistic polynomial time procedure, with access to an \(\textsf{NP}\) oracle, to obtain an \((\varepsilon , \delta )\)-approximation of F [25].

Building on Stockmeyer’s work, the hashing-based approximate counting framework, as presented in Algorithm 1 (\(\textsf{ApproxMC}\) [7]), uses 2-universal hash functions to partition the solution space (denoted by \(\mathsf {sol({F})}\) for a given formula F) into small cells of roughly equal size. A cell is considered small if the number of solutions it contains is less than or equal to a pre-determined threshold, \(\textsf{thresh}\). An \(\textsf{NP}\) oracle is used to determine if a cell is small by iteratively enumerating its solutions until either there are no more solutions or \(\textsf{thresh}+1\) solutions have been found. In practice, a SAT solver is used to implement the \(\textsf{NP}\) oracle. To ensure a polynomial number of calls to the oracle, the threshold, \(\textsf{thresh}\), is set to be polynomial in the input parameter \(\varepsilon \) at Line 1. The subroutine \(\textsf{ApproxMCCore}\) takes the formula F and \(\textsf{thresh}\) as inputs and estimates the number of solutions at Line 7. To determine the appropriate number of cells, i.e., the value of m for \(\mathcal {H}(n, m)\), \(\textsf{ApproxMCCore}\) uses a search procedure at Line 3 of Algorithm 2. The estimate is calculated as the number of solutions in a randomly chosen cell, scaled by the number of cells, i.e., \(2^m\), at Line 5. To improve confidence in the estimate, \(\textsf{ApproxMC}\) performs multiple runs of the \(\textsf{ApproxMCCore}\) subroutine at Lines 5–9 of Algorithm 1. The final count is computed as the median of the estimates obtained at Line 10.

In the second version of \(\textsf{ApproxMC}\) [8], two key algorithmic improvements are proposed to improve the practical performance by reducing the number of calls to the SAT solver. The first is using galloping search to more efficiently find the correct number of cells, i.e., \(\textsf{LogSATSearch}\) at Line 3 of Algorithm 2. The second is using linear search over a small interval around the previous value of m before resorting to the galloping search. Additionally, the third and fourth versions [22, 23] enhance the algorithm’s performance by effectively dealing with CNF formulas conjoined with XOR constraints, which are commonly used in the hashing-based counting framework. Moreover, an effective preprocessor named \(\textsf{Arjun}\) [24] was proposed to enhance \(\textsf{ApproxMC}\)’s performance by constructing shorter XOR constraints. The combination of \(\textsf{Arjun}\) and \(\textsf{ApproxMC4}\) solved almost all existing benchmarks [24], making it the current state of the art in this field.

In this work, we aim to address the main limitation of \(\textsf{ApproxMC}\) by focusing on an aspect that previous developments have left untouched: the core algorithm of \(\textsf{ApproxMC}\), which has remained unchanged.

4 Weakness of \(\textsf{ApproxMC}\)

[Algorithm 1: \(\textsf{ApproxMC}\) (pseudocode)]
[Algorithm 2: \(\textsf{ApproxMCCore}\) (pseudocode)]

As noted above, the core algorithm of \(\textsf{ApproxMC}\) has not changed since 2016, and in this work, we aim to address the core limitation of \(\textsf{ApproxMC}\). To put our contribution in context, we first review \(\textsf{ApproxMC}\) and its core algorithm, called \(\textsf{ApproxMCCore}\). We present the pseudocode of \(\textsf{ApproxMC}\) and \(\textsf{ApproxMCCore}\) in Algorithms 1 and 2, respectively. \(\textsf{ApproxMCCore}\) may return an estimate that falls outside the \(\textsf{PAC}\) range \(\left[ \frac{|\mathsf {sol({F})}|}{1+\varepsilon }, (1 + \varepsilon )|\mathsf {sol({F})}| \right] \) with a certain probability of error. Therefore, \(\textsf{ApproxMC}\) repeatedly invokes \(\textsf{ApproxMCCore}\) (Lines 5–9) and returns the median of the estimates returned by \(\textsf{ApproxMCCore}\) (Line 10), which reduces the error probability to the user-provided parameter \(\delta \).

Let \(\textsf{Error}_t\) denote the event that the median of t estimates falls outside \(\left[ \frac{|\mathsf {sol({F})}|}{1+\varepsilon }, (1 + \varepsilon )|\mathsf {sol({F})}| \right] \). Let L denote the event that an invocation of \(\textsf{ApproxMCCore}\) returns an estimate less than \(\frac{|\mathsf {sol({F})}|}{1+\varepsilon }\). Similarly, let U denote the event that an individual estimate of \(|\mathsf {sol({F})}|\) is greater than \((1 + \varepsilon )|\mathsf {sol({F})}|\). For simplicity of exposition, we assume t is odd; the current implementation indeed ensures that t is odd by choosing the smallest odd t for which \(\Pr [\textsf{Error}_t] \le \delta \).

In the remainder of the section, we will demonstrate that reducing \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \) can effectively reduce the number of repetitions t, making small-\(\delta \) scenarios practical. To this end, we first demonstrate that the existing analysis of \(\textsf{ApproxMC}\) leads to a loose bound on \(\Pr [\textsf{Error}_t]\). We then present a new analysis that leads to a tighter bound on \(\Pr [\textsf{Error}_t]\).

The existing combinatorial analysis in [7] derives the following proposition:

Proposition 3

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] \le \eta (t, \lceil t/2 \rceil , \text {Pr}\left[ L \cup U\right] ) \end{aligned}$$

where \(\eta (t, m, p) = \sum ^t_{k=m} {t \atopwithdelims ()k}p^k(1-p)^{t-k}\).

Proposition 3 follows from the observation that if the median falls outside the \(\textsf{PAC}\) range, at least \(\left\lceil t/2 \right\rceil \) of the results must also be outside the range. Requiring \(\eta (t, \lceil t/2 \rceil , \text {Pr}\left[ L \cup U\right] ) \le \delta \), we can compute a valid t at Line 4 of \(\textsf{ApproxMC}\).

Proposition 3 raises a question: can we derive a tight upper bound for \(\text {Pr}\left[ \textsf{Error}_t\right] \)? The following lemma provides an affirmative answer to this question.

Lemma 2

Assuming t is odd, we have:

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] = \eta (t, \lceil t/2 \rceil , \text {Pr}\left[ L\right] ) + \eta (t, \lceil t/2 \rceil , \text {Pr}\left[ U\right] ) \end{aligned}$$

Proof

Let \(I^L_i\) be an indicator variable that is 1 when \(\textsf{ApproxMCCore}\) returns an estimate \(\textsf{nSols}\) less than \( \frac{|\mathsf {sol({F})}|}{1\,+\,\varepsilon }\), indicating the occurrence of event L in the i-th repetition. Let \(I^U_i\) be an indicator variable that is 1 when \(\textsf{ApproxMCCore}\) returns an estimate \(\textsf{nSols}\) greater than \((1+\varepsilon )|\mathsf {sol({F})}|\), indicating the occurrence of event U in the i-th repetition. We first prove that \(\textsf{Error}_t \Leftrightarrow \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil \frac{t}{2} \right\rceil \right) \vee \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil \frac{t}{2} \right\rceil \right) \). We begin with the forward (\(\Rightarrow \)) implication. If the median of t estimates violates the \(\textsf{PAC}\) guarantee, the median is either less than \(\frac{|\mathsf {sol({F})}|}{1+\varepsilon }\) or greater than \((1+\varepsilon )|\mathsf {sol({F})}|\). In the first case, since at least \(\left\lceil \frac{t}{2} \right\rceil \) of the estimates are less than or equal to the median, at least \(\left\lceil \frac{t}{2} \right\rceil \) estimates are less than \(\frac{|\mathsf {sol({F})}|}{1+\varepsilon }\). Formally, this implies \(\sum _{i=1}^{t} I_i^L \ge \left\lceil \frac{t}{2} \right\rceil \). Similarly, in the case that the median is greater than \((1+\varepsilon )|\mathsf {sol({F})}|\), since at least \(\left\lceil \frac{t}{2} \right\rceil \) of the estimates are greater than or equal to the median, at least \(\left\lceil \frac{t}{2} \right\rceil \) estimates are greater than \((1+\varepsilon )|\mathsf {sol({F})}|\), formally implying \(\sum _{i=1}^{t} I_i^U \ge \left\lceil \frac{t}{2} \right\rceil \). Conversely, for the \((\Leftarrow )\) implication: given \(\sum _{i=1}^{t} I_i^L \ge \left\lceil \frac{t}{2} \right\rceil \), more than half of the estimates are less than \(\frac{|\mathsf {sol({F})}|}{1+\varepsilon }\), and therefore the median is less than \(\frac{|\mathsf {sol({F})}|}{1+\varepsilon }\), violating the \(\textsf{PAC}\) guarantee. Similarly, given \(\sum _{i=1}^{t} I_i^U \ge \left\lceil \frac{t}{2} \right\rceil \), more than half of the estimates are greater than \((1+\varepsilon )|\mathsf {sol({F})}|\), and therefore the median is greater than \((1+\varepsilon )|\mathsf {sol({F})}|\), violating the \(\textsf{PAC}\) guarantee. This concludes the proof of \(\textsf{Error}_t \Leftrightarrow \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil \frac{t}{2} \right\rceil \right) \vee \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil \frac{t}{2} \right\rceil \right) \). Then we obtain:

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right]&= \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \vee \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \right] \\&= \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \right] + \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \right] \\&- \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \wedge \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \right] \end{aligned}$$

Since \(I^L_i+I^U_i \le 1\) for \(i=1, 2, ..., t\), we have \(\sum _{i=1}^{t} (I^L_i + I^U_i) \le t\). If \(\left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \wedge \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \) held, we would obtain \(\sum _{i=1}^{t} (I^L_i + I^U_i) \ge 2\left\lceil t/2 \right\rceil = t+1\) (recall that t is odd), contradicting \(\sum _{i=1}^{t} (I^L_i + I^U_i) \le t\). Hence, \(\text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \wedge \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \right] =0\). From this, we deduce:

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right]&= \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^L \ge \left\lceil t/2 \right\rceil \right) \right] + \text {Pr}\left[ \left( \sum _{i=1}^{t} I_i^U \ge \left\lceil t/2 \right\rceil \right) \right] \\&= \eta (t, \lceil t/2 \rceil , \text {Pr}\left[ L\right] ) + \eta (t, \lceil t/2 \rceil , \text {Pr}\left[ U\right] ) \end{aligned}$$

   \(\square \)
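Because L and U cannot both occur in a single invocation, the identity of Lemma 2 can also be verified exactly for small t by enumerating all trinomial outcomes (l under-estimates, u over-estimates, the rest in range); the probabilities pL and pU below are arbitrary illustrative values:

```python
import math

def eta(t, m, p):
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(m, t + 1))

def exact_error(t, pL, pU):
    """Pr[median of t estimates is out of range], by enumerating the
    number of under- (l) and over-estimates (u) among t repetitions."""
    half, total = (t + 1) // 2, 0.0
    for l in range(t + 1):
        for u in range(t + 1 - l):
            if l >= half or u >= half:
                total += (math.comb(t, l) * math.comb(t - l, u)
                          * pL ** l * pU ** u
                          * (1 - pL - pU) ** (t - l - u))
    return total

t, pL, pU = 7, 0.25, 0.15
lhs = exact_error(t, pL, pU)
rhs = eta(t, (t + 1) // 2, pL) + eta(t, (t + 1) // 2, pU)
```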

Though Lemma 2 shows that reducing \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \) decreases the error probability, it is still unclear to what extent \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \) affect the error probability. The following lemma characterizes how the error probability decays with t as a function of \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \).

Lemma 3

Let \(p_{max} = \max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \) and assume \(p_{max} < 0.5\). Then

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] \in \varTheta \left( t^{-\frac{1}{2}} \left( 2\sqrt{p_{max}(1-p_{max})}\right) ^t\right) \end{aligned}$$

Proof

Applying Lemmas 1 and 2, we have

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right]&\in \varTheta \left( t^{-\frac{1}{2}} \left( \left( 2\sqrt{\text {Pr}\left[ L\right] (1-\text {Pr}\left[ L\right] )}\right) ^t + \left( 2\sqrt{\text {Pr}\left[ U\right] (1-\text {Pr}\left[ U\right] )}\right) ^t\right) \right) \\&= \varTheta \left( t^{-\frac{1}{2}} \left( 2\sqrt{p_{max}(1-p_{max})}\right) ^t\right) \end{aligned}$$

   \(\square \)

In summary, Lemma 3 provides a way to tighten the bound on \(\Pr [\textsf{Error}_t]\) by designing an algorithm such that we can obtain a tighter bound on \(p_{max}\) in contrast to previous approaches that relied on obtaining a tighter bound on \(\Pr [L \cup U]\).

5 Rounding Model Counting

In this section, we present a rounding-based technique that allows us to obtain a tighter bound on \(p_{max}\). At a high level, instead of returning the estimate from one iteration of the underlying core algorithm as the number of solutions in a randomly chosen cell multiplied by the number of cells, we round each estimate of the model count to a value that is more likely to be within the \((1+\varepsilon )\)-bound. While counter-intuitive at first glance, we show that rounding the estimate reduces \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \), thereby resulting in a smaller number of repetitions of the underlying algorithm.

We present \(\textsf{ApproxMC6}\), a rounding-based approximate model counting algorithm, in Sect. 5.1. Section 5.2 demonstrates how \(\textsf{ApproxMC6}\) decreases \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \) and, consequently, the number of estimates required. Lastly, in Sect. 5.3, we provide a proof of the correctness of the algorithm.

5.1 Algorithm

Algorithm 3 presents the procedure of \(\textsf{ApproxMC6}\). \(\textsf{ApproxMC6}\) takes as input a formula F, a tolerance parameter \(\varepsilon \), and a confidence parameter \(\delta \). \(\textsf{ApproxMC6}\) returns an \((\varepsilon , \delta )\)-estimate c of \(|\mathsf {sol({F})}|\) such that \(\textsf{Pr}\left[ {\frac{|\mathsf {sol({F})}|}{1+\varepsilon } \le c \le (1 + \varepsilon )|\mathsf {sol({F})}|}\right] \ge 1 - \delta \). \(\textsf{ApproxMC6}\) is identical to \(\textsf{ApproxMC}\) in its initialization of data structures and handling of base cases (Lines 1–4).

In Line 5, we pre-compute the rounding type and rounding value to be used in \(\textsf{ApproxMC6Core}\). \(\textsf{configRound}\) is implemented in Algorithm 5; the precise choices arise from the technical analysis presented in Sect. 5.2. Note that, in \(\textsf{configRound}\), \(\textsf{Cnt}_{\langle F, m \rangle }\) is rounded up to \(\textsf{roundValue}\) for \(\varepsilon <3\) (\(\textsf{roundUp}=1\)) but rounded to \(\textsf{roundValue}\) for \(\varepsilon \ge 3\) (\(\textsf{roundUp}=0\)). Rounding up means we assign \(\textsf{roundValue}\) to \(\textsf{Cnt}_{\langle F, m \rangle }\) if \(\textsf{Cnt}_{\langle F, m \rangle }\) is less than \(\textsf{roundValue}\) and, otherwise, keep \(\textsf{Cnt}_{\langle F, m \rangle }\) unchanged. Rounding means that we assign \(\textsf{roundValue}\) to \(\textsf{Cnt}_{\langle F, m \rangle }\) in all cases. \(\textsf{ApproxMC6}\) computes the number of repetitions necessary to lower the error probability to \(\delta \) at Line 6. The implementation of \(\textsf{computeIter}\) is presented in Algorithm 6, following Lemma 2: the repetition count t keeps increasing until the tight error bound is at most \(\delta \). As we will show in Sect. 5.2, \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \) depend on \(\varepsilon \). In the loop of Lines 7–11, \(\textsf{ApproxMC6Core}\) repeatedly estimates \(|\mathsf {sol({F})}|\). Each estimate \(\textsf{nSols}\) is stored in the list C, and the median of C serves as the final estimate satisfying the \((\varepsilon ,\delta )\)-guarantee.
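Following Lemma 2, the computation of the repetition count can be sketched as below; the bounds \(\text {Pr}\left[ L\right] \le 0.157\) and \(\text {Pr}\left[ U\right] \le 0.169\) assumed here are the Lemma 4 values for the bracket \(\sqrt{2}-1\le \varepsilon <1\) (which contains \(\varepsilon = 0.8\)), and the actual algorithm selects the bounds appropriate for the given \(\varepsilon \). The function name compute_iter is ours:

```python
import math

def eta(t, m, p):
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(m, t + 1))

def compute_iter(delta, pL=0.157, pU=0.169):
    """Smallest odd t whose tight error bound (Lemma 2) is at most delta."""
    t = 1
    while eta(t, (t + 1) // 2, pL) + eta(t, (t + 1) // 2, pU) > delta:
        t += 2
    return t
```

These bounds reproduce the repetition counts quoted in the introduction for \(\varepsilon = 0.8\): 5 repetitions for \(\delta = 0.1\) and 19 for \(\delta = 0.001\).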

Algorithm 4 shows the pseudo-code of \(\textsf{ApproxMC6Core}\). A random hash function is chosen at Line 1 to partition \(\mathsf {sol({F})}\) into roughly equal cells. A random hash value is chosen at Line 2 to pick a cell for estimation. In Line 3, we search for a value m such that the cell picked from the \(2^m\) available cells is small enough to enumerate solutions one by one while still providing a good estimate of \(|\mathsf {sol({F})}|\). In Line 4, a bounded model counter is invoked to compute the size of the picked cell, i.e., \(\textsf{Cnt}_{\langle F, m \rangle }\). Finally, if \(\textsf{roundUp}\) equals 1, \(\textsf{Cnt}_{\langle F, m \rangle }\) is rounded up to \(\textsf{roundValue}\) at Line 6. Otherwise (\(\textsf{roundUp}\) equals 0), \(\textsf{Cnt}_{\langle F, m \rangle }\) is rounded to \(\textsf{roundValue}\) at Line 8. Note that rounding up returns \(\textsf{roundValue}\) only if \(\textsf{Cnt}_{\langle F, m \rangle }\) is less than \(\textsf{roundValue}\). In the case of rounding, however, \(\textsf{roundValue}\) is always returned, no matter what value \(\textsf{Cnt}_{\langle F, m \rangle }\) takes.

For large \(\varepsilon \) \((\varepsilon \ge 3)\), \(\textsf{ApproxMC6Core}\) returns a value that is independent of the value returned by \(\textsf{BoundedSAT}\) in Line 4 of Algorithm 4. However, observe that the returned value depends on the m returned by \(\textsf{LogSATSearch}\) [8], which in turn uses \(\textsf{BoundedSAT}\) to find the value of m; therefore, the algorithm’s run is not independent of all the calls to \(\textsf{BoundedSAT}\). The technical reason for correctness stems from the observation that for large values of \(\varepsilon \), we can always find a value of m such that \(2^m \times c\) (where c is a constant) is a \((1+\varepsilon )\)-approximation of \(|\mathsf {sol({F})}|\). As an example, consider \(n=7\) and \(c=1\): every number between 1 and 128 has a \((1+3)\)-approximation in \(\{1, 2, 4, 8, 16, 32, 64, 128\}\); therefore, returning an answer of the form \(c \times 2^m\) suffices as long as we are able to search for the right value of m, which is accomplished by \(\textsf{LogSATSearch}\). We could skip the final call to \(\textsf{BoundedSAT}\) in Line 4 of \(\textsf{ApproxMC6Core}\) for large values of \(\varepsilon \), but in practice that computation is already performed as part of \(\textsf{LogSATSearch}\).
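The observation above can be checked mechanically: with \(\varepsilon = 3\) and \(c = 1\), every count in \([1, 128]\) admits some \(2^m\) within its \((1+\varepsilon )\) range. A tiny check (names are ours):

```python
# For every x in [1, 128], some power of two 2**m satisfies
# x / (1 + eps) <= 2**m <= (1 + eps) * x with eps = 3.
eps = 3

def has_pow2_approx(x):
    return any(x / (1 + eps) <= 2 ** m <= (1 + eps) * x for m in range(8))

all_covered = all(has_pow2_approx(x) for x in range(1, 129))
```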


5.2 Repetition Reduction

We now show that \(\textsf{ApproxMC6Core}\) allows us to obtain a smaller \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \). Furthermore, we show, both analytically and visually, that there is a large gap between the error probability of \(\textsf{ApproxMC6}\) and that of \(\textsf{ApproxMC}\).

The following lemma presents upper bounds on \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \) for \(\textsf{ApproxMC6Core}\). For simplicity, let \(\textsf{pivot}=9.84\left( 1+\frac{1}{\varepsilon }\right) ^2\).

Lemma 4

The following bounds hold for \(\textsf{ApproxMC6}\):

$$\begin{aligned} \text {Pr}\left[ L\right] \le {\left\{ \begin{array}{ll} 0.262 &{} \text {if } \varepsilon<\sqrt{2}-1 \\ 0.157 &{} \text {if } \sqrt{2}-1\le \varepsilon<1 \\ 0.085 &{} \text {if } 1\le \varepsilon<3 \\ 0.055 &{} \text {if } 3\le \varepsilon <4\sqrt{2}-1 \\ 0.023 &{} \text {if } \varepsilon \ge 4\sqrt{2}-1 \\ \end{array}\right. } \end{aligned}$$
$$\begin{aligned} \text {Pr}\left[ U\right] \le {\left\{ \begin{array}{ll} 0.169 &{} \text {if } \varepsilon <3 \\ 0.044 &{} \text {if } \varepsilon \ge 3 \\ \end{array}\right. } \end{aligned}$$

The proof of Lemma 4 is deferred to Sect. 5.3. Observe that Lemma 4 influences the choices in the design of \(\textsf{configRound}\) (Algorithm 5). Recall that \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \le 0.36\) for \(\textsf{ApproxMC}\) (Appendix C), but Lemma 4 ensures \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \le 0.262\) for \(\textsf{ApproxMC6}\). For \(\varepsilon \ge 4\sqrt{2}-1\), Lemma 4 even delivers \(\max \left\{ \text {Pr}\left[ L\right] , \text {Pr}\left[ U\right] \right\} \le 0.044\).

The following theorem analytically captures the gap between the error probability of \(\textsf{ApproxMC6}\) and that of \(\textsf{ApproxMC}\).

Theorem 1

For \(\sqrt{2}-1\le \varepsilon <1\),

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] \in {\left\{ \begin{array}{ll} \mathcal {O}\left( {t^{-\frac{1}{2}}0.75^t}\right) &{} \text {for }{\textsf{ApproxMC6}} \\ \mathcal {O}\left( {t^{-\frac{1}{2}}0.96^t}\right) &{} \text {for }{\textsf{ApproxMC}}\\ \end{array}\right. } \end{aligned}$$

Proof

From Lemma 4, we obtain \(p_{max} \le 0.169\) for \(\textsf{ApproxMC6}\). Applying Lemma 3, we have

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] \in \mathcal {O}\left( {t^{-\frac{1}{2}} \left( 2\sqrt{0.169(1-0.169)}\right) ^t}\right) \subseteq \mathcal {O}\left( {t^{-\frac{1}{2}}0.75^t}\right) \end{aligned}$$

For \(\textsf{ApproxMC}\), combining \(p_{max} \le 0.36\) (Appendix C) and Lemma 3, we obtain

$$\begin{aligned} \text {Pr}\left[ \textsf{Error}_t\right] \in \mathcal {O}\left( {t^{-\frac{1}{2}} \left( 2\sqrt{0.36(1-0.36)}\right) ^t}\right) = \mathcal {O}\left( {t^{-\frac{1}{2}}0.96^t}\right) \end{aligned}$$

   \(\square \)
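The decay bases 0.75 and 0.96 in the proof above are easy to verify numerically; the following check (our own sanity check, not part of the paper) evaluates \(2\sqrt{p_{max}(1-p_{max})}\) for both algorithms:

```python
import math

def decay_base(p_max: float) -> float:
    # Base of the exponential decay t^(-1/2) * base^t from Lemma 3.
    return 2 * math.sqrt(p_max * (1 - p_max))

assert decay_base(0.169) <= 0.75             # ApproxMC6, sqrt(2)-1 <= eps < 1
assert abs(decay_base(0.36) - 0.96) < 1e-12  # ApproxMC
```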

Figure 1 visualizes the large gap between the error probability of \(\textsf{ApproxMC6}\) and that of \(\textsf{ApproxMC}\). The x-axis represents the number of repetitions (t) in \(\textsf{ApproxMC6}\) or \(\textsf{ApproxMC}\), and the y-axis represents the upper bound on the error probability, in log scale. For example, at \(t=117\), \(\textsf{ApproxMC}\) guarantees that with probability at most \(10^{-3}\), the median over 117 estimates violates the \(\textsf{PAC}\) guarantee. However, \(\textsf{ApproxMC6}\) allows a much smaller error probability of at most \(10^{-15}\) for \(\sqrt{2}-1\le \varepsilon <1\). The smaller error probability enables \(\textsf{ApproxMC6}\) to perform fewer repetitions while providing the same level of theoretical guarantee. For example, given \(\delta =0.001\), i.e., \(y=0.001\) in Fig. 1, \(\textsf{ApproxMC}\) requires 117 repetitions to obtain the given error probability. However, for \(\textsf{ApproxMC6}\), 37 repetitions for \(\varepsilon <\sqrt{2}-1\), 19 repetitions for \(\sqrt{2}-1\le \varepsilon <1\), 17 repetitions for \(1\le \varepsilon <3\), 7 repetitions for \(3\le \varepsilon <4\sqrt{2}-1\), and 5 repetitions for \(\varepsilon \ge 4\sqrt{2}-1\) suffice to obtain the same error probability. Consequently, \(\textsf{ApproxMC6}\) can obtain \(3\times \), \(6\times \), \(7\times \), \(17\times \), and \(23\times \) speedups, respectively, over \(\textsf{ApproxMC}\).
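The connection between the per-repetition failure bound and the number of repetitions can be illustrated with a binomial-tail computation: the median over t estimates is wrong only if at least \(\lceil t/2 \rceil \) individual estimates are wrong. The sketch below is a simplified illustration of this relationship (the exact repetition counts above come from the paper's analysis, not from this computation); it confirms that a smaller \(p_{max}\) yields fewer required repetitions:

```python
import math

def median_fail_prob(t: int, p: float) -> float:
    # Pr[Binomial(t, p) >= ceil(t/2)]: the median of t independent estimates
    # fails only if at least half of the estimates fail.
    k0 = math.ceil(t / 2)
    return sum(math.comb(t, k) * p**k * (1 - p)**(t - k) for k in range(k0, t + 1))

def repetitions_needed(p: float, delta: float) -> int:
    t = 1
    while median_fail_prob(t, p) > delta:
        t += 2  # odd t keeps the median well defined
    return t

t6 = repetitions_needed(0.169, 0.001)  # ApproxMC6's p_max for sqrt(2)-1 <= eps < 1
t4 = repetitions_needed(0.36, 0.001)   # ApproxMC's p_max
assert t6 < t4
```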

Fig. 1. Comparison of error bounds for \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC}\).

5.3 Proof of Lemma 4 for Case \(\sqrt{2}-1\le \varepsilon <1\)

We provide the full proof of Lemma 4 for the case \(\sqrt{2}-1\le \varepsilon <1\) and defer the proofs of the other cases to Appendix D.

Let \(T_m\) denote the event \(\left( \textsf{Cnt}_{\langle F, m \rangle } < \textsf{thresh}\right) \), and let \(L_m\) and \(U_m\) denote the events \(\left( \textsf{Cnt}_{\langle F, m \rangle } < \frac{\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] }{1+\varepsilon } \right) \) and \(\left( \textsf{Cnt}_{\langle F, m \rangle } > \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] (1+\varepsilon ) \right) \), respectively. To ease the proof, let \(U'_m\) denote \(\left( \textsf{Cnt}_{\langle F, m \rangle } > \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] (1+\frac{\varepsilon }{1+\varepsilon }) \right) \), and thereby \(U_m \subseteq U'_m\). Let \(m^*= \left\lfloor \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) + 1 \right\rfloor \) such that \(m^*\) is the smallest m satisfying \(\frac{|\mathsf {sol({F})}|}{2^m}(1+\frac{\varepsilon }{1+\varepsilon }) \le \textsf{thresh}-1\).
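As a quick sanity check on the closed form for \(m^*\) (our own check; the value of \(\varepsilon \) is an arbitrary example): assuming \(\textsf{thresh}-1 = (1+\frac{\varepsilon }{1+\varepsilon })\,\textsf{pivot}\), as in the \(\textsf{ApproxMC}\) line of work, the defining condition reduces to \(\frac{|\mathsf {sol({F})}|}{2^m} \le \textsf{pivot}\), and \(\left\lfloor \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) + 1 \right\rfloor \) is indeed the smallest such m whenever \(|\mathsf {sol({F})}|/\textsf{pivot}\) is not an exact power of two:

```python
import math

eps = 0.8                              # example tolerance
pivot = 9.84 * (1 + 1 / eps) ** 2      # pivot = 9.84 * (1 + 1/eps)^2 ~ 49.8

def m_star(num_solutions: float) -> int:
    # Closed form from the text.
    return math.floor(math.log2(num_solutions) - math.log2(pivot) + 1)

def m_star_bruteforce(num_solutions: float) -> int:
    # Smallest m with num_solutions / 2^m <= pivot.
    m = 0
    while num_solutions / 2**m > pivot:
        m += 1
    return m

for s in (10**3, 10**6, 2**20, 12345):
    assert m_star(s) == m_star_bruteforce(s)
```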

Let us first prove the lemmas used in the proof of Lemma 4.

Lemma 5

For every \(0<\beta <1\), \(\gamma >1\), and \(1 \le m \le n\), the following holds:

  1. \(\text {Pr}\left[ \textsf{Cnt}_{\langle F, m \rangle } \le \beta \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] \right] \le \frac{1}{1+(1-\beta )^2\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] }\)

  2. \(\text {Pr}\left[ \textsf{Cnt}_{\langle F, m \rangle } \ge \gamma \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] \right] \le \frac{1}{1+(\gamma -1)^2\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] }\)

Proof

Statement 1 can be proved following the proof of Lemma 1 in [8]. For statement 2, we rewrite the left-hand side and apply Cantelli’s inequality:

\(\text {Pr}\left[ \textsf{Cnt}_{\langle F, m \rangle } - \textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] \ge (\gamma -1)\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] \right] \le \frac{\sigma ^2\left[ \textsf{Cnt}_{\langle F, m \rangle } \right] }{\sigma ^2\left[ \textsf{Cnt}_{\langle F, m \rangle } \right] + ((\gamma -1)\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m \rangle }}\right] )^2}\). Finally, applying Eq. 2 completes the proof.    \(\square \)
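Cantelli's inequality can be sanity-checked on any small distribution; below, a uniform variable on \(\{0,\dots ,9\}\) serves as our own toy example (not from the paper):

```python
import statistics

values = list(range(10))            # X uniform on {0, ..., 9}
mu = statistics.mean(values)        # 4.5
var = statistics.pvariance(values)  # population variance: 8.25
a = 3.5                             # deviation threshold

# Exact one-sided tail Pr[X - mu >= a] versus Cantelli's bound.
tail = sum(1 for x in values if x - mu >= a) / len(values)
bound = var / (var + a**2)
assert tail <= bound  # 0.2 <= 8.25 / 20.5
```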

Lemma 6

Given \(\sqrt{2}-1\le \varepsilon <1\), the following bounds hold:

  1. \(\text {Pr}\left[ T_{m^*-3}\right] \le \frac{1}{62.5}\)

  2. \(\text {Pr}\left[ L_{m^*-2}\right] \le \frac{1}{20.68}\)

  3. \(\text {Pr}\left[ L_{m^*-1}\right] \le \frac{1}{10.84}\)

  4. \(\text {Pr}\left[ U'_{m^*}\right] \le \frac{1}{5.92}\)

Proof

Following the proof of Lemma 2 in [8], we can prove statements 1, 2, and 3. To prove statement 4, replacing \(\gamma \) with \((1+\frac{\varepsilon }{1+\varepsilon })\) in Lemma 5 and employing \(\textsf{E}\left[ {\textsf{Cnt}_{\langle F, m^* \rangle }}\right] \ge \textsf{pivot}/2\), we obtain \(\text {Pr}\left[ U'_{m^*}\right] \le \frac{1}{1+\left( \frac{\varepsilon }{1+\varepsilon }\right) ^2\textsf{pivot}/2} \le \frac{1}{5.92}\).    \(\square \)
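Notably, the quantity \(\left( \frac{\varepsilon }{1+\varepsilon }\right) ^2\textsf{pivot}/2\) in the bound above is constant in \(\varepsilon \): the \((1+\frac{1}{\varepsilon })^2\) factor in \(\textsf{pivot}\) cancels \(\left( \frac{\varepsilon }{1+\varepsilon }\right) ^2\) exactly, leaving \(9.84/2 = 4.92\) and hence the bound \(\frac{1}{5.92}\). A quick numerical check (ours):

```python
def denominator_term(eps: float) -> float:
    # (eps/(1+eps))^2 * pivot / 2 with pivot = 9.84 * (1 + 1/eps)^2.
    pivot = 9.84 * (1 + 1 / eps) ** 2
    return (eps / (1 + eps)) ** 2 * pivot / 2

# (eps/(1+eps))^2 * ((1+eps)/eps)^2 == 1, so the term is always 9.84/2 = 4.92.
for eps in (0.5, 0.8, 1.0, 2.0, 10.0):
    assert abs(denominator_term(eps) - 4.92) < 1e-9
```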

Now we prove the upper bounds of \(\text {Pr}\left[ L\right] \) and \(\text {Pr}\left[ U\right] \) in Lemma 4 for \(\sqrt{2}-1\le \varepsilon <1\). The proof for other \(\varepsilon \) is deferred to Appendix D due to the page limit.

Lemma 4. The following bounds hold for \(\textsf{ApproxMC6}\):

$$\begin{aligned} \text {Pr}\left[ L\right] \le {\left\{ \begin{array}{ll} 0.262 &{} \text {if } \varepsilon<\sqrt{2}-1 \\ 0.157 &{} \text {if } \sqrt{2}-1\le \varepsilon<1 \\ 0.085 &{} \text {if } 1\le \varepsilon<3 \\ 0.055 &{} \text {if } 3\le \varepsilon <4\sqrt{2}-1 \\ 0.023 &{} \text {if } \varepsilon \ge 4\sqrt{2}-1 \\ \end{array}\right. } \end{aligned}$$
$$\begin{aligned} \text {Pr}\left[ U\right] \le {\left\{ \begin{array}{ll} 0.169 &{} \text {if } \varepsilon <3 \\ 0.044 &{} \text {if } \varepsilon \ge 3 \\ \end{array}\right. } \end{aligned}$$

Proof

We prove the case of \(\sqrt{2}-1\le \varepsilon <1\). The proof for other \(\varepsilon \) is deferred to Appendix D. Let us first bound \(\text {Pr}\left[ L\right] \). Following \(\textsf{LogSATSearch}\) in [8], we have

$$\begin{aligned} \text {Pr}\left[ L\right] = \text {Pr}\left[ \bigcup _{i\in \{1,...,n\}} \left( \overline{T_{i-1}} \cap T_i \cap L_i \right) \right] \end{aligned}$$
(3)

Equation 3 can be simplified by three observations labeled O1, O2 and O3 below.

  • O1 :  \(\forall i \le m^*-3, T_i \subseteq T_{i+1}\). Therefore,

    $$\begin{aligned} \bigcup _{i\in \{1,...,m^*-3\}} (\overline{T_{i-1}} \cap T_i \cap L_i) \subseteq \bigcup _{i\in \{1,...,m^*-3\}} T_i \subseteq T_{m^*-3} \end{aligned}$$
  • O2 :  For \(i \in \{m^*-2, m^*-1\}\), we have

    $$\begin{aligned} \bigcup _{i\in \{m^*-2, m^*-1\}} (\overline{T_{i-1}} \cap T_i \cap L_i) \subseteq L_{m^*-2} \cup L_{m^*-1} \end{aligned}$$
  • O3 :  \(\forall i \ge m^*\), since \(\textsf{Cnt}_{\langle F, i \rangle }\) is rounded up to \(\frac{\textsf{pivot}}{\sqrt{2}}\) and \(m^*\ge \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) \), we have \(2^i \times \textsf{Cnt}_{\langle F, i \rangle } \ge 2^{m^*} \times \frac{\textsf{pivot}}{\sqrt{2}} \ge \frac{|\mathsf {sol({F})}|}{\sqrt{2}} \ge \frac{|\mathsf {sol({F})}|}{1+\varepsilon }\), where the last inequality follows from \(\varepsilon \ge \sqrt{2}-1\). Then we have \(\textsf{Cnt}_{\langle F, i \rangle } \ge \frac{\textsf{E}\left[ {\textsf{Cnt}_{\langle F, i \rangle }}\right] }{1+\varepsilon }\). Therefore, \(L_i=\emptyset \) for \(i \ge m^*\) and we have

    $$\begin{aligned} \bigcup _{i\in \{m^*, ..., n\}} (\overline{T_{i-1}} \cap T_i \cap L_i) = \emptyset \end{aligned}$$

Following the observations O1, O2, and O3, we simplify Eq. 3 and obtain

$$\begin{aligned} \text {Pr}\left[ L\right] \le \text {Pr}\left[ T_{m^*-3}\right] + \text {Pr}\left[ L_{m^*-2}\right] + \text {Pr}\left[ L_{m^*-1}\right] \end{aligned}$$

Employing Lemma 6 gives \(\text {Pr}\left[ L\right] \le 0.157\).

Now let us bound \(\text {Pr}\left[ U\right] \). Similarly, following \(\textsf{LogSATSearch}\) in [8], we have

$$\begin{aligned} \text {Pr}\left[ U\right] = \text {Pr}\left[ \bigcup _{i\in \{1,...,n\}} \left( \overline{T_{i-1}} \cap T_i \cap U_i \right) \right] \end{aligned}$$
(4)

We derive the following observations O4 and O5.

  • O4 :  \(\forall i \le m^*-1\), since \(m^*\le \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) + 1\), we have \(2^i \times \textsf{Cnt}_{\langle F, i \rangle } \le 2^{m^*-1} \times \textsf{thresh}\le |\mathsf {sol({F})}|\left( 1+\frac{\varepsilon }{1+\varepsilon } \right) \). Then we obtain \(\textsf{Cnt}_{\langle F, i \rangle } \le \textsf{E}\left[ {\textsf{Cnt}_{\langle F, i \rangle }}\right] \left( 1+\frac{\varepsilon }{1+\varepsilon } \right) \). Therefore, \(T_i \cap U'_i = \emptyset \) for \(i \le m^*-1\) and we have

    $$\begin{aligned} \bigcup _{i\in \{1,...,m^*-1\}} \left( \overline{T_{i-1}} \cap T_i \cap U_i \right) \subseteq \bigcup _{i\in \{1,...,m^*-1\}} \left( \overline{T_{i-1}} \cap T_i \cap U'_i \right) = \emptyset \end{aligned}$$
  • O5 :  \(\forall i \ge m^*\), \(\overline{T_i}\) implies \(\textsf{Cnt}_{\langle F, i \rangle } > \textsf{thresh}\), and then we have \(2^i \times \textsf{Cnt}_{\langle F, i \rangle } > 2^{m^*} \times \textsf{thresh}\ge |\mathsf {sol({F})}|\left( 1+\frac{\varepsilon }{1+\varepsilon } \right) \). The second inequality follows from \(m^*\ge \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) \). Then we obtain \(\textsf{Cnt}_{\langle F, i \rangle } > \textsf{E}\left[ {\textsf{Cnt}_{\langle F, i \rangle }}\right] \left( 1+\frac{\varepsilon }{1+\varepsilon } \right) \). Therefore, \(\overline{T_i} \subseteq U'_i\) for \(i \ge m^*\). Since \(\forall i, \overline{T_i} \subseteq \overline{T_{i-1}}\), we have

    $$\begin{aligned} \bigcup _{i\in \{m^*,...,n\}} \left( \overline{T_{i-1}} \cap T_i \cap U_i \right)&\subseteq \bigcup _{i\in \{m^*+1,...,n\}} \overline{T_{i-1}} \cup ( \overline{T_{m^*-1}} \cap T_{m^*} \cap U_{m^*} ) \nonumber \\&\subseteq \overline{T_{m^*}} \cup ( \overline{T_{m^*-1}} \cap T_{m^*} \cap U_{m^*} ) \nonumber \\&\subseteq \overline{T_{m^*}} \cup U_{m^*} \nonumber \\&\subseteq U'_{m^*} \end{aligned}$$
    (5)

    Remark that for \(\sqrt{2}-1\le \varepsilon <1\), we round \(\textsf{Cnt}_{\langle F, m^* \rangle }\) up to \(\frac{\textsf{pivot}}{\sqrt{2}}\), and we have \(2^{m^*}\times \frac{\textsf{pivot}}{\sqrt{2}} \le |\mathsf {sol({F})}|(1+\varepsilon )\), which means rounding does not affect the event \(U_{m^*}\); therefore, Inequality 5 still holds.

Following the observations O4 and O5, we simplify Eq. 4 and obtain

$$\begin{aligned} \text {Pr}\left[ U\right] \le \text {Pr}\left[ U'_{m^*}\right] \end{aligned}$$

Employing Lemma 6 gives \(\text {Pr}\left[ U\right] \le 0.169\).    \(\square \)
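The numerical constants in Lemma 4 for this case follow directly from Lemma 6 via the union bound; the arithmetic can be checked as follows (our own check):

```python
# Pr[L] <= Pr[T_{m*-3}] + Pr[L_{m*-2}] + Pr[L_{m*-1}] (union bound, O1-O3)
pr_l = 1 / 62.5 + 1 / 20.68 + 1 / 10.84
assert pr_l <= 0.157

# Pr[U] <= Pr[U'_{m*}] (O4-O5)
pr_u = 1 / 5.92
assert pr_u <= 0.169
```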

The breakpoints in \(\varepsilon \) in Lemma 4 arise from how rounding is used to lower the error probability for the events L and U. Rounding counts up can lower \(\text {Pr}\left[ L\right] \) but may increase \(\text {Pr}\left[ U\right] \). Therefore, we want to round counts up to a value that does not affect the event U. Take \(\sqrt{2}-1 \le \varepsilon < 1\) as an example; we round the count up to a value such that \(L_{m^*}\) becomes an empty event with zero probability while \(U_{m^*}\) remains unchanged. To make \(L_{m^*}\) empty, we require

$$\begin{aligned} 2^{m^*} \times \textsf{roundValue}\ge 2^{m^*} \times \frac{1}{1+\varepsilon } \textsf{pivot}\ge \frac{1}{1+\varepsilon } |\mathsf {sol({F})}| \end{aligned}$$
(6)

where the last inequality follows from \(m^*\ge \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) \). To keep \(U_{m^*}\) unchanged, we require

$$\begin{aligned} 2^{m^*} \times \textsf{roundValue}\le 2^{m^*} \times \frac{1+\varepsilon }{2} \textsf{pivot}\le (1+\varepsilon ) |\mathsf {sol({F})}| \end{aligned}$$
(7)

where the last inequality follows from \(m^*\le \log _2|\mathsf {sol({F})}| - \log _2\left( \textsf{pivot}\right) +1\). Combining Eqs. 6 and 7 together, we obtain

$$\begin{aligned} 2^{m^*} \times \frac{1}{1+\varepsilon } \textsf{pivot}\le 2^{m^*} \times \frac{1+\varepsilon }{2} \textsf{pivot}\end{aligned}$$

which gives us \(\varepsilon \ge \sqrt{2}-1\). Similarly, we can derive other breakpoints.
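The breakpoint can be verified directly: dividing both sides of the combined inequality by \(2^{m^*}\textsf{pivot}\) leaves \(\frac{1}{1+\varepsilon } \le \frac{1+\varepsilon }{2}\), i.e., \((1+\varepsilon )^2 \ge 2\), with equality exactly at \(\varepsilon = \sqrt{2}-1\). A small check (ours):

```python
import math

def window_slack(eps: float) -> float:
    # Eq. 6 needs roundValue >= pivot/(1+eps); Eq. 7 needs
    # roundValue <= (1+eps)/2 * pivot. A valid roundValue exists iff
    # 1/(1+eps) <= (1+eps)/2, i.e. (1+eps)^2 >= 2.
    return (1 + eps) / 2 - 1 / (1 + eps)

assert abs(window_slack(math.sqrt(2) - 1)) < 1e-12  # equality at the breakpoint
assert window_slack(0.8) > 0                        # inside the case: a valid roundValue exists
assert window_slack(0.41) < 0                       # just below sqrt(2)-1: none exists
```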

6 Experimental Evaluation

It is perhaps worth highlighting that both \(\textsf{ApproxMCCore}\) and \(\textsf{ApproxMC6Core}\) invoke the underlying SAT solver on identical queries; the only difference between \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC}\) lies in which estimate is returned and how often \(\textsf{ApproxMCCore}\) and \(\textsf{ApproxMC6Core}\) are invoked. From this viewpoint, one would expect the theoretical improvements to also lead to improved runtime performance. To provide further evidence, we performed an extensive empirical evaluation and compared \(\textsf{ApproxMC6}\)'s performance against the current state-of-the-art model counter, \(\textsf{ApproxMC}\) [22], using \(\textsf{Arjun}\) as a pre-processing tool. We used the latest version of \(\textsf{ApproxMC}\), called \(\textsf{ApproxMC4}\); an entry based on \(\textsf{ApproxMC4}\) won the Model Counting Competition 2022.

Previous comparisons of \(\textsf{ApproxMC}\) have been performed on a set of 1896 instances, but the latest version of \(\textsf{ApproxMC}\) is able to solve almost all the instances when these instances are pre-processed by \(\textsf{Arjun}\). Therefore, we sought to construct a new comprehensive set of 1890 instances derived from various sources, including Model Counting Competitions 2020–2022 [12, 15, 16], program synthesis [1], quantitative control improvisation [13], quantification of software properties [26], and adaptive chosen ciphertext attacks [3]. As noted earlier, our technique extends to projected model counting, and our benchmark suite indeed comprises 772 projected model counting instances.

Experiments were conducted on a high-performance computer cluster, with each node consisting of 2xE5-2690v3 CPUs featuring 2 \(\times \) 12 real cores and 96GB of RAM. For each instance, a counter was run on a single core, with a time limit of 5000 s and a memory limit of 4 GB. To compare runtime performance, we use the PAR-2 score, a standard metric in the SAT community. Each instance is assigned a score equal to the number of seconds it takes the corresponding tool to complete execution successfully. In the event of a timeout or memory out, the score is twice the time limit in seconds. The PAR-2 score is then calculated as the average of all the instance scores. We also report the speedup of \(\textsf{ApproxMC6}\) over \(\textsf{ApproxMC4}\), calculated as the ratio of the runtime of \(\textsf{ApproxMC4}\) to that of \(\textsf{ApproxMC6}\) on instances solved by both counters. We set \(\delta \) to 0.001 and \(\varepsilon \) to 0.8.
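The scoring scheme above is straightforward to compute; the sketch below (illustrative helper names and synthetic runtimes, not the actual evaluation scripts) shows PAR-2 scoring with `None` marking a timeout or memory out, and the geometric-mean speedup aggregation over commonly solved instances:

```python
import math

TIME_LIMIT = 5000  # seconds

def par2_score(runtimes):
    # Solved instances score their runtime; timeouts/memouts score 2x the limit.
    scores = [t if t is not None else 2 * TIME_LIMIT for t in runtimes]
    return sum(scores) / len(scores)

def geomean_speedup(pairs):
    # Geometric mean of runtime ratios over instances solved by both counters.
    ratios = [old / new for old, new in pairs if old is not None and new is not None]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

assert par2_score([100, None]) == 5050.0
assert abs(geomean_speedup([(2, 1), (8, 2)]) - math.sqrt(8)) < 1e-9
```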

Specifically, we aim to address the following research questions:

  • RQ 1. How does the runtime performance of \(\textsf{ApproxMC6}\) compare to that of \(\textsf{ApproxMC4}\)?

  • RQ 2. How does the accuracy of the counts computed by \(\textsf{ApproxMC6}\) compare to that of the exact count?

Summary. In summary, \(\textsf{ApproxMC6}\) consistently outperforms \(\textsf{ApproxMC4}\). Specifically, it solved 204 additional instances and reduced the PAR-2 score by 1063 s in comparison to \(\textsf{ApproxMC4}\). The average speedup of \(\textsf{ApproxMC6}\) over \(\textsf{ApproxMC4}\) was 4.68. In addition, \(\textsf{ApproxMC6}\) provided a high-quality approximation with an average observed error of 0.1, much smaller than the theoretical error tolerance of 0.8.

6.1 RQ1. Overall Performance

Figure 2 compares the counting time of \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC4}\). The x-axis represents the index of the instances, sorted in ascending order of runtime, and the y-axis represents the runtime for each instance. A point (x, y) indicates that a counter can solve x instances within y seconds. Thus, for a given time limit y, a counter whose curve is on the right has solved more instances than a counter on the left. It can be seen in the figure that \(\textsf{ApproxMC6}\) consistently outperforms \(\textsf{ApproxMC4}\). In total, \(\textsf{ApproxMC6}\) solved 204 more instances than \(\textsf{ApproxMC4}\).

Table 1 provides a detailed comparison between \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC4}\). The first column lists three measures of interest: the number of solved instances, the PAR-2 score, and the speedup of \(\textsf{ApproxMC6}\) over \(\textsf{ApproxMC4}\). The second and third columns show the results for \(\textsf{ApproxMC4}\) and \(\textsf{ApproxMC6}\), respectively. \(\textsf{ApproxMC4}\) solved 998 of the 1890 instances and achieved a PAR-2 score of 4934, while \(\textsf{ApproxMC6}\) solved 1202 instances and achieved a PAR-2 score of 3871. That is, \(\textsf{ApproxMC6}\) solved 204 more instances and reduced the PAR-2 score by 1063 s. The geometric mean of the speedup of \(\textsf{ApproxMC6}\) over \(\textsf{ApproxMC4}\) is 4.68, calculated only over instances solved by both counters.

Table 1. The number of solved instances and PAR-2 score for \(\textsf{ApproxMC6}\) versus \(\textsf{ApproxMC4}\) on 1890 instances. The geometric mean of the speedup of \(\textsf{ApproxMC6}\) over \(\textsf{ApproxMC4}\) is also reported.
Fig. 2. Comparison of counting times for \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC4}\).

Fig. 3. Comparison of approximate counts from \(\textsf{ApproxMC6}\) to exact counts from \(\textsf{Ganak}\).

6.2 RQ2. Approximation Quality

We used the state-of-the-art probabilistic exact model counter \(\textsf{Ganak}\) to compute the exact model count and compared it to the results of \(\textsf{ApproxMC6}\). We collected statistics on instances solved by both \(\textsf{Ganak}\) and \(\textsf{ApproxMC6}\). Figure 3 presents the results for a subset of instances. The x-axis represents the index of instances sorted in ascending order by the number of solutions, and the y-axis represents the number of solutions on a log scale. Theoretically, the approximate count from \(\textsf{ApproxMC6}\) should lie between \(|\mathsf {sol({F})}|/1.8\) and \(|\mathsf {sol({F})}| \cdot 1.8\) with probability 0.999, where \(|\mathsf {sol({F})}|\) denotes the exact count returned by \(\textsf{Ganak}\). The range is indicated by the upper and lower bounds, represented by the curves \(y=|\mathsf {sol({F})}| \cdot 1.8\) and \(y=|\mathsf {sol({F})}|/1.8\), respectively. Figure 3 shows that the approximate counts from \(\textsf{ApproxMC6}\) fall within the expected range \(\left[ |\mathsf {sol({F})}|/1.8, |\mathsf {sol({F})}|\cdot 1.8\right] \) for all instances except for four points slightly above the upper bound. These four outliers are due to a bug in the preprocessor \(\textsf{Arjun}\), which appears to depend on the version of the C++ compiler and will be fixed in the future. We also calculated the observed error, i.e., the mean relative difference between the approximate and exact counts in our experiments, where the relative difference is \(\max \{\textsf{finalEstimate}/|\mathsf {sol({F})}|-1, |\mathsf {sol({F})}|/\textsf{finalEstimate}-1\}\). The overall observed error was 0.1, significantly smaller than the theoretical error tolerance of 0.8.
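The observed-error metric treats over- and under-estimation by the same factor symmetrically; a minimal version (our own helper name) is:

```python
def observed_error(estimate: float, exact: float) -> float:
    # Relative deviation, symmetric: overestimating or underestimating by the
    # same factor yields the same error.
    return max(estimate / exact - 1, exact / estimate - 1)

assert abs(observed_error(110, 100) - 0.1) < 1e-12  # 10% overestimate
assert abs(observed_error(100, 110) - 0.1) < 1e-12  # same factor, underestimate
```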

7 Conclusion

In this paper, we addressed the scalability challenges faced by \(\textsf{ApproxMC}\) in the smaller \(\delta \) range. To this end, we proposed a rounding-based algorithm, \(\textsf{ApproxMC6}\), which reduces the number of estimations required by 84% while providing the same \((\varepsilon ,\delta )\)-guarantees. Our empirical evaluation on 1890 instances shows that \(\textsf{ApproxMC6}\) solved 204 more instances and achieved a reduction in PAR-2 score of 1063 s. Furthermore, \(\textsf{ApproxMC6}\) achieved a \(4\times \) speedup over \(\textsf{ApproxMC}\) on the instances that both \(\textsf{ApproxMC6}\) and \(\textsf{ApproxMC}\) could solve.