The Power of Self-Reducibility: Selectivity, Information, and Approximation

Chapter in Complexity and Approximation, part of the book series Lecture Notes in Computer Science (LNTCS, volume 12000).

Abstract

This chapter provides a hands-on tutorial on the important technique known as self-reducibility. Through a series of “Challenge Problems,” theorems that the reader will—after being given definitions and tools—try to prove, the tutorial asks the reader not to read proofs that use self-reducibility, but rather to discover such proofs. In particular, the chapter seeks to guide the reader to the discovery of proofs of four interesting theorems from the literature, with focus areas ranging from selectivity to information to approximation, each of whose proofs draws on self-reducibility.

The chapter’s goal is to allow interested readers to add self-reducibility to their collection of proof tools. The chapter simultaneously has a related but different goal, namely, to provide a “lesson plan” (and a coordinated set of slides is available online to support this use [13]) for a lecture or a two-lecture series that can be given to undergraduate students—even those with no background other than basic discrete mathematics and an understanding of what polynomial-time computation is—to immerse them in hands-on proving, and by doing that, to serve as an invitation to them to take courses on Models of Computation or Complexity Theory.

In memory of Ker-I Ko, whose indelible contributions to computational complexity included important work (e.g., [24,25,26,27]) on each of this chapter’s topics: self-reducibility, selectivity, information, and approximation.

This chapter was written in part while on sabbatical at Heinrich Heine University Düsseldorf, supported in part by a Renewed Research Stay grant from the Alexander von Humboldt Foundation.

Notes

  1.

    To cover all four problems would take two class sessions. Covering just the first two or perhaps three of the problems could be done in a single 75-minute class session.

  2.

    In this chapter, since student readers will be working as individuals, I suggest longer amounts of time for most of the problems. But in a classroom setting where students are working in groups, 10–25 minutes may be an appropriate amount of time: perhaps 10 minutes for the first challenge problem, 15 for the second, 15 for the third, and 25 for the fourth. You’ll need to judge for yourself the time amounts that are best, based on your knowledge of your students. For many classes, the just-mentioned times will not be enough. Myself, I try to keep track of whether the groups seem to have found an answer, and I will sometimes stretch the time window if many groups seem to be still working intensely and with interest. Also, if TAs happen to be available who don’t already know the answers, I may assign them to groups so that the class’s groups will have more experienced members, though the TAs do know to guide rather than dominate a group’s discussions.

References

  1. Arvind, V., Han, Y., Hemachandra, L., Köbler, J., Lozano, A., Mundhenk, M., Ogiwara, M., Schöning, U., Silvestri, R., Thierauf, T.: Reductions to sets of low information content. In: Ambos-Spies, K., Homer, S., Schöning, U. (eds.) Complexity Theory, pp. 1–45. Cambridge University Press, Cambridge (1993)

  2. Berman, P.: Relationship between density and deterministic complexity of NP-complete languages. In: Ausiello, G., Böhm, C. (eds.) ICALP 1978. LNCS, vol. 62, pp. 63–71. Springer, Heidelberg (1978). https://doi.org/10.1007/3-540-08860-1_6

  3. Cai, J.-Y., Hemachandra, L.: Enumerative counting is hard. Inf. Comput. 82(1), 34–44 (1989)

  4. Cai, J.-Y., Hemachandra, L.: A note on enumerative counting. Inf. Process. Lett. 38(4), 215–219 (1991)

  5. Clay Mathematics Institute: Millennium problems (web page) (2019). https://www.claymath.org/millennium-problems. Accessed 10 July 2019

  6. Cook, S.: The complexity of theorem-proving procedures. In: Proceedings of the 3rd ACM Symposium on Theory of Computing, pp. 151–158. ACM Press, May 1971

  7. Fortune, S.: A note on sparse complete sets. SIAM J. Comput. 8(3), 431–433 (1979)

  8. Gasarch, W.: The third P =? NP poll. SIGACT News 50(1), 38–59 (2019)

  9. Glaßer, C.: Consequences of the existence of sparse sets hard for NP under a subclass of truth-table reductions. Technical report, TR 245, Institut für Informatik, Universität Würzburg, Würzburg, Germany, January 2000

  10. Glaßer, C., Hemaspaandra, L.: A moment of perfect clarity II: consequences of sparse sets hard for NP with respect to weak reductions. SIGACT News 31(4), 39–51 (2000)

  11. Hemachandra, L., Ogiwara, M., Watanabe, O.: How hard are sparse sets? In: Proceedings of the 7th Structure in Complexity Theory Conference, pp. 222–238. IEEE Computer Society Press, June 1992

  12. Hemaspaandra, E., Hemaspaandra, L., Menton, C.: Search versus decision for election manipulation problems. In: Proceedings of the 30th Annual Symposium on Theoretical Aspects of Computer Science, vol. 20, pp. 377–388. Leibniz International Proceedings in Informatics (LIPIcs), February/March 2013

  13. Hemaspaandra, L.: The power of self-reducibility: selectivity, information, and approximation (2019). File set providing slides and their source code. http://www.cs.rochester.edu/u/lane/=self-reducibility/. Accessed 10 July 2019

  14. Hemaspaandra, L., Hempel, H.: P-immune sets with holes lack self-reducibility properties. Theoret. Comput. Sci. 302(1–3), 457–466 (2003)

  15. Hemaspaandra, L., Jiang, Z.: Logspace reducibility: models and equivalences. Int. J. Found. Comput. Sci. 8(1), 95–108 (1997)

  16. Hemaspaandra, L., Narváez, D.: The opacity of backbones. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence, pp. 3900–3906. AAAI Press, February 2017

  17. Hemaspaandra, L.A., Narváez, D.E.: Existence versus exploitation: the opacity of backdoors and backbones under a weak assumption. In: Catania, B., Královič, R., Nawrocki, J., Pighizzini, G. (eds.) SOFSEM 2019. LNCS, vol. 11376, pp. 247–259. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-10801-4_20

  18. Hemaspaandra, L., Ogihara, M.: The Complexity Theory Companion. Springer, Heidelberg (2002). https://doi.org/10.1007/978-3-662-04880-1

  19. Hemaspaandra, L., Ogihara, M., Toda, S.: Space-efficient recognition of sparse self-reducible languages. Comput. Complex. 4(3), 262–296 (1994)

  20. Hemaspaandra, L., Silvestri, R.: Easily checked generalized self-reducibility. SIAM J. Comput. 24(4), 840–858 (1995)

  21. Hemaspaandra, L., Torenvliet, L.: Theory of Semi-feasible Algorithms. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-662-05080-4

  22. Hemaspaandra, L., Zimand, M.: Strong self-reducibility precludes strong immunity. Math. Syst. Theory 29(5), 535–548 (1996)

  23. Karp, R.: Reducibilities among combinatorial problems. In: Miller, R., Thatcher, J. (eds.) Complexity of Computer Computations, pp. 85–103. Springer, Boston (1972). https://doi.org/10.1007/978-1-4684-2001-2_9

  24. Ko, K.: The maximum value problem and NP real numbers. J. Comput. Syst. Sci. 24(1), 15–35 (1982)

  25. Ko, K.: On self-reducibility and weak P-selectivity. J. Comput. Syst. Sci. 26(2), 209–221 (1983)

  26. Ko, K.: On helping by robust oracle machines. Theoret. Comput. Sci. 52(1–2), 15–36 (1987)

  27. Ko, K., Moore, D.: Completeness, approximation, and density. SIAM J. Comput. 10(4), 787–796 (1981)

  28. Krentel, M.: The complexity of optimization problems. J. Comput. Syst. Sci. 36(3), 490–509 (1988)

  29. Levin, L.: Universal sequential search problems. Probl. Inf. Transm. 9(3), 265–266 (1975)

  30. Mahaney, S.: Sparse complete sets for NP: solution of a conjecture of Berman and Hartmanis. J. Comput. Syst. Sci. 25(2), 130–143 (1982)

  31. Mahaney, S.: Sparse sets and reducibilities. In: Book, R. (ed.) Studies in Complexity Theory, pp. 63–118. Wiley, Hoboken (1986)

  32. Mahaney, S.: The Isomorphism Conjecture and sparse sets. In: Hartmanis, J. (ed.) Computational Complexity Theory, pp. 18–46. American Mathematical Society (1989). Proceedings of Symposia in Applied Mathematics #38

  33. Meyer, A., Paterson, M.: With what frequency are apparently intractable problems difficult? Technical report, MIT/LCS/TM-126, Laboratory for Computer Science, MIT, Cambridge, MA (1979)

  34. Schnorr, C.: Optimal algorithms for self-reducible problems. In: Proceedings of the 3rd International Colloquium on Automata, Languages, and Programming, pp. 322–337. Edinburgh University Press, July 1976

  35. Selman, A.: P-selective sets, tally languages, and the behavior of polynomial time reducibilities on NP. Math. Syst. Theory 13(1), 55–65 (1979)

  36. Selman, A.: Some observations on NP real numbers and P-selective sets. J. Comput. Syst. Sci. 23(3), 326–332 (1981)

  37. Selman, A.: Analogues of semirecursive sets and effective reducibilities to the study of NP complexity. Inf. Control 52(1), 36–51 (1982)

  38. Selman, A.: Reductions on NP and P-selective sets. Theoret. Comput. Sci. 19(3), 287–304 (1982). https://doi.org/10.1016/0304-3975(82)90039-1

  39. Valiant, L.: The complexity of computing the permanent. Theoret. Comput. Sci. 8(2), 189–201 (1979)

  40. Valiant, L.: The complexity of enumeration and reliability problems. SIAM J. Comput. 8(3), 410–421 (1979)

  41. Young, P.: How reductions to sparse sets collapse the polynomial-time hierarchy: a primer. SIGACT News 23 (1992). Part I (#3, pp. 107–117), Part II (#4, pp. 83–94), and Corrigendum to Part I (#4, p. 94)

Acknowledgments

I am grateful to the students and faculty at the computer science departments of RWTH Aachen University, Heinrich Heine University Düsseldorf, and the University of Rochester. I “test drove” this chapter at each of those schools in the form of a lecture or lecture series. Particular thanks go to Peter Rossmanith, Jörg Rothe, and Muthu Venkitasubramaniam, who invited me to speak, and to Gerhard Woeginger regarding the counterexample in Appendix D. My warm appreciation to Ding-Zhu Du, Bin Liu, and Jie Wang for inviting me to contribute to this project that they have organized in memory of the wonderful Ker-I Ko, whose work contributed so richly to the beautiful, ever-growing tapestry that is complexity theory.

Appendices

A Solution to Challenge Problem 1

Before we start on the proof, let us put up a figure that shows the flavor of a structure that we will use to help us understand and exploit \(\mathrm{SAT}\)’s self-reducibility. The structure is known as the self-reducibility tree of a formula. At the root of this tree sits the formula. At the next level as the root’s children, we have the formula with its first variable assigned to \(\mathrm {True}\) and to \(\mathrm {False}\). At the level below that, we have the two formulas from the second level, except with each of their first variables (i.e., the second variable of the original formula) assigned to both \(\mathrm {True}\) and \(\mathrm {False}\). Figure 1 shows the self-reducibility tree of a two-variable formula.

Fig. 1. The self-reducibility tree (completely unpruned) of a two-variable formula, represented generically.

Self-reducibility tells us that, for each node N in such a self-reducibility tree (except the leaves, since they have no children), N is satisfiable if and only if at least one of its two children is satisfiable. Inductively, the formula at the root of the tree is satisfiable if and only if at least one leaf of the tree is a satisfiable (variable-free) formula. Equivalently, the formula at the root of the tree is satisfiable if and only if every level of the self-reducibility tree has at least one satisfiable node.
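This node–child equivalence is easy to check concretely. Below is a small Python sketch (the formula encoding and all function names are ours, purely for illustration) that brute-forces every internal node of the unpruned self-reducibility tree of a two-variable formula and verifies that each node is satisfiable if and only if at least one of its two children is.

```python
from itertools import product

def node_satisfiable(formula, assigned, n):
    """Is the node 'formula with the prefix `assigned` fixed' satisfiable?
    (Brute force over the remaining n - len(assigned) variables.)"""
    rest = n - len(assigned)
    return any(formula(list(assigned) + list(bits))
               for bits in product([True, False], repeat=rest))

# A two-variable example formula: (x1 OR x2) AND NOT (x1 AND x2).
n = 2
formula = lambda v: (v[0] or v[1]) and not (v[0] and v[1])

# Self-reducibility: each internal node is satisfiable iff a child is.
for depth in range(n):
    for assigned in product([True, False], repeat=depth):
        node = node_satisfiable(formula, assigned, n)
        child = (node_satisfiable(formula, assigned + (True,), n) or
                 node_satisfiable(formula, assigned + (False,), n))
        assert node == child
print("self-reducibility relation verified")
```

Of course, this check is exponential-time; the point of the chapter's proofs is precisely to avoid exploring the whole tree.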

How helpful is this tree? Well, we certainly don’t want to solve \(\mathrm{SAT}\) by checking every leaf of the self-reducibility tree. On formulas with k variables, that would take time at least \(2^k\)—basically a brute-force exponential-time algorithm. Yuck! That isn’t surprising though. After all, the tree is really just listing all assignments to the formula.

But the magic here, which we will exploit, is that the “self-reducibility” relationship between nodes and their children as to satisfiability will, at least with certain extra assumptions such as P-selectivity, allow us to not explore the whole tree. Rather, we’ll be able to prune away, quickly, all but a polynomially large subtree. In fact, though on its surface this chapter is about four questions from complexity theory, it really is about tree-pruning—a topic more commonly associated with algorithms than with complexity. To us, though, that is not a problem but an advantage. As we mentioned earlier, complexity is largely about building algorithms, and that helps make complexity far more inviting and intuitive than most people realize.

That being said, let us move on to giving a proof of the first challenge problem. Namely, in this section we sketch a proof of the result:

If \(\mathrm{SAT}\) is \(\mathrm{P}\)-selective, then \(\mathrm{SAT}\in \mathrm{P}\).

So assume that \(\mathrm{SAT}\) is P-selective, via (in the sense of Definition 2) polynomial-time computable function f. Let us give a polynomial-time algorithm for \(\mathrm{SAT}\). Suppose the input to our algorithm is the formula \(F(x_1,x_2,\dots ,x_k)\). (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables, e.g., \(\mathrm {True}\wedge \mathrm {True}\wedge \mathrm {False}\), we simply evaluate it and accept if and only if it evaluates to \(\mathrm {True}\).) Let us focus on F and F’s two children in the self-reducibility tree, as shown in Fig. 2.

Fig. 2. F and F’s two children.

Now, run f on F’s two children. That is, compute, in polynomial time, \(f( F(\mathrm {True},x_2,\dots ,x_k),\, F(\mathrm {False},x_2,\dots ,x_k))\). Due to the properties of P-selectivity and self-reducibility, note that the output of that application of f is a formula/node that has the property that the original formula is satisfiable if and only if that child-node is satisfiable.

In particular, if \(f( F(\mathrm {True},x_2,\dots ,x_k),\, F(\mathrm {False},x_2,\dots ,x_k)) = F(\mathrm {True},x_2,\dots ,x_k)\) then we know that \(F(x_1,x_2,\dots ,x_k)\) is satisfiable if and only if \(F(\mathrm {True},x_2,\dots ,x_k)\) is satisfiable. And if \(f( F(\mathrm {True},x_2,\dots ,x_k),\, F(\mathrm {False},x_2,\dots ,x_k)) \ne F(\mathrm {True},x_2,\dots ,x_k)\) then we know that \(F(x_1,x_2,\dots ,x_k)\) is satisfiable if and only if \(F(\mathrm {False},x_2,\dots ,x_k)\) is satisfiable.

Either way, we have in time polynomial in the input’s size eliminated the need to pay attention to one of the two child nodes, and now may focus just on the other one.

Repeat the above process on the child that, as per the above, was selected by the selector function. Now, “split” that formula by assigning \(x_2\) both possible ways. That will create two children, and then analogously to what was done above, use the selector function to decide which of those two children is the more promising branch to follow.

Repeat this until we have assigned all variables. We now have a fully assigned formula, but due to how we got to it, we know that it evaluates to \(\mathrm {True}\) if and only if the original formula is satisfiable. So if that fully assigned formula evaluates to \(\mathrm {True}\), then we state that the original formula is satisfiable (and indeed, our path down the self-reducibility tree has outright put into our hands a satisfying assignment). And, more interestingly, if the fully assigned formula evaluates to \(\mathrm {False}\), then we state that the original formula is not satisfiable. We are correct in stating that, because at each iterative stage we know that if the formula we start that stage focused on is satisfiable, then the child the selector function chooses for us will also be satisfiable.

The process above is an at most polynomial number of at most polynomial-time “descend one level having made a linkage” stages, and so overall itself runs in polynomial time. Thus we have given a polynomial-time algorithm for \(\mathrm{SAT}\), under the hypothesis that \(\mathrm{SAT}\) is P-selective. This completes the proof sketch.
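The descent just described can be sketched in Python. This is an illustration only: no polynomial-time P-selector for SAT is known (indeed, the theorem shows that one would put SAT in P), so `toy_selector` below simulates a selector by exponential-time brute force; formulas are encoded, by our own convention, as Python predicates over a list of truth values.

```python
from itertools import product

def brute_force_sat(formula, n):
    """Exponential-time SAT test -- used ONLY to simulate a selector below."""
    return any(formula(list(bits)) for bits in product([True, False], repeat=n))

def toy_selector(a, b):
    """Stand-in for the hypothesized P-selector f: given two (formula, #vars)
    pairs, output one of them that is in SAT if at least one of them is."""
    return a if brute_force_sat(*a) else b

def sat_via_selector(formula, n, selector):
    """Walk a single root-to-leaf path of the self-reducibility tree,
    at each level keeping the child the selector prefers."""
    assignment = []
    for _ in range(n):
        remaining = n - len(assignment) - 1
        true_child = (lambda rest, pre=assignment + [True]: formula(pre + rest),
                      remaining)
        false_child = (lambda rest, pre=assignment + [False]: formula(pre + rest),
                       remaining)
        chosen = selector(true_child, false_child)
        assignment.append(chosen is true_child)
    # The leaf we reached evaluates to True iff the original F is satisfiable.
    return formula(assignment), assignment
```

For example, `sat_via_selector(lambda v: (v[0] or v[1]) and not v[0], 2, toy_selector)` returns `(True, [False, True])`: the descent visits only one node per level, exactly the single-path walk of the proof.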

Our algorithm was mostly focused on tree pruning. Though F induces a giant binary tree of variable assignments, built one variable at a time in all possible ways, thanks to the guidance of the selector function we walked just a single path through that tree.

Keeping this flavor of approach in mind might be helpful on Challenge Problem 2, although that is a different problem and so perhaps you’ll have to bring some new twist, or greater flexibility, to what you do to tackle that.

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 2!

B Solution to Challenge Problem 2

In this section we sketch a proof of the result:

If there exists a tally set T such that \(\mathrm{SAT}\,\le _{m}^{{{p}}}\,T\), then \(\mathrm{SAT}\in \mathrm{P}\).

So assume that there exists a tally set T such that \(\mathrm{SAT}\,\le _{m}^{{{p}}}\,T\). Let g be the polynomial-time computable function performing that reduction, in the sense of Definition 4. (Keep in mind that we may not assume that \(T \in \mathrm{P}\). We have no argument line in hand that would tell us that that happens to be true.) Let us give a polynomial-time algorithm for \(\mathrm{SAT}\).

Suppose the input to our algorithm is the formula \(F(x_1,x_2,\dots ,x_k)\). (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables we simply evaluate it and accept if and only if it evaluates to \(\mathrm {True}\).)

Let us focus first on F. Compute, in polynomial time, \(g(F(x_1,x_2,\dots ,x_k))\). If \(g(F(x_1,x_2,\dots ,x_k)) \not \in \{\epsilon ,0,00,\dots \}\), then clearly \(F(x_1,x_2,\dots ,x_k) \not \in \mathrm{SAT}\), since we know that (a) \(T \subseteq \{\epsilon ,0,00,\dots \}\) and (b) \(F(x_1,x_2,\dots ,x_k) \in \mathrm{SAT}\iff g(F(x_1,x_2,\dots ,x_k)) \in T\). So in that case, we output that \(F(x_1,x_2,\dots ,x_k) \not \in \mathrm{SAT}\). Otherwise, we descend to the next level of the “self-reducibility tree” as follows.

We consider the nodes (i.e., in this case, formulas) \(F(\mathrm {True},x_2,\dots ,x_k)\) and \(F(\mathrm {False},x_2,\dots ,x_k)\). Compute \(g(F(\mathrm {True},x_2,\dots ,x_k))\) and \(g(F(\mathrm {False},x_2,\dots ,x_k))\). If either of our two nodes in question does not, under the action just computed of g, map to a string in \(\{\epsilon ,0,00,\dots \}\), then that node certainly is not a satisfiable formula, and we can henceforward mentally ignore it and the entire tree (created by assigning more of its variables) rooted at it. This is one key type of pruning that we will use: eliminating from consideration nodes that map to “nontally” strings.

But there is a second type of pruning that we will use: If it happens to be the case that \(g(F(\mathrm {True},x_2,\dots ,x_k)) \in \{\epsilon ,0,00,\dots \}\) and \(g(F(\mathrm {True},x_2,\dots ,x_k)) = g(F(\mathrm {False},x_2,\dots ,x_k))\), then at this point it may not be clear to us whether \(F(\mathrm {True},x_2,\dots ,x_k)\) is or is not satisfiable. However, what is clear is that

$$F(\mathrm {True},x_2,\dots ,x_k) \in \mathrm{SAT}\iff F(\mathrm {False},x_2,\dots ,x_k) \in \mathrm{SAT}.$$

How do we know this? Since g reduces \(\mathrm{SAT}\) to T, we know that

$$g(F(\mathrm {True},x_2,\dots ,x_k)) \in T \iff F(\mathrm {True},x_2,\dots ,x_k) \in \mathrm{SAT}$$

and

$$g(F(\mathrm {False},x_2,\dots ,x_k)) \in T \iff F(\mathrm {False},x_2,\dots ,x_k) \in \mathrm{SAT}.$$

By those observations, the fact that \(g(F(\mathrm {True},x_2,\dots ,x_k)) = g(F(\mathrm {False},x_2,\dots ,x_k))\), and the transitivity of “\({\iff }\)”, we indeed have that \(F(\mathrm {True},x_2,\dots ,x_k) \in \mathrm{SAT}\iff F(\mathrm {False},x_2,\dots ,x_k) \in \mathrm{SAT}\). But since that says that either both or neither of these nodes is a formula belonging to \(\mathrm{SAT}\), there is no need at all for us to further explore more than one of them, since they stand or fall together as to membership in \(\mathrm{SAT}\). So if we have \(g(F(\mathrm {True},x_2,\dots ,x_k)) = g(F(\mathrm {False},x_2,\dots ,x_k))\), we can mentally dismiss \(F(\mathrm {False},x_2,\dots ,x_k)\)—and of course also the entire subtree rooted at it—from all further consideration.

After doing the two types of pruning just mentioned, we will have either one or two nodes left at the level of the tree—the level one down from the root—that we are considering. (If we have zero nodes left, we have pruned away all possible paths and can safely reject.) Also, if \(k = 1\), then we can simply check whether at least one node that has not been pruned away evaluates to \(\mathrm {True}\), and if so we accept and if not we reject.

But what we have outlined can be carried out iteratively in a way that drives us right down through the tree, one level at a time. At each level, we take all nodes (i.e., formulas; we will speak interchangeably of a node and the formula it represents) that have not yet been eliminated from consideration, and for each, take the next unassigned variable and make two child formulas, one with that variable assigned \(\mathrm {True}\) and one with it assigned \(\mathrm {False}\). So if at a given level after pruning we end up with j formulas, we start the next level with 2j formulas, each with one fewer variable. Then for those 2j formulas we do the following. For each of them, if g applied to that formula outputs a string that is not a member of \(\{\epsilon ,0,00,\dots \}\), then eliminate that node from all further consideration; after all, the node clearly is not a satisfiable formula. Also, for the nodes among the 2j whose g-image z belongs to \(\{\epsilon ,0,00,\dots \}\) and is shared with at least one other of the 2j nodes, for each such cluster of nodes mapping to the same string z eliminate all but one of the nodes from consideration. After all, by the argument given above, either all of the cluster's nodes are satisfiable or none of them are, so eliminating all but one still leaves a satisfiable node, if in fact the cluster's nodes are satisfiable.

Continue this process until (it internally terminates with a decision, or) we reach a level where all variables are assigned. If there were j nodes at the level above that after pruning, then at this no-variables-left-to-assign level we have at most 2j formulas. The construction is such that \(F(x_1,x_2,\dots ,x_k) \in \mathrm{SAT}\) if and only if at least one of these at most 2j variable-free formulas belongs to \(\mathrm{SAT}\), i.e., evaluates to \(\mathrm {True}\). But we can easily check that in time polynomial in \(2j\times |F(x_1,x_2,\dots ,x_k)|\).

Is the proof done? Not yet. If j can be huge, we’re dead, as we might have just sketched an exponential-time algorithm. But fortunately, and this was the key insight in Piotr Berman’s paper that proved this result, as we go down the tree, level by level, the tree never grows too wide. In particular, it is at most polynomially wide!

How can we know this? The insight that Berman (and with luck, also you!) had is that there are not many “tally” strings that can be reached by the reduction g on the inputs it will be run on in our construction on a given input. And that fact ensures that after we do our two kinds of pruning, we have at most polynomially many nodes left at the now-pruned level.

Let us be more concrete about this, since it is not just the heart of this problem's solution, but also might well (hint, hint!) be useful when tackling the third challenge problem.

In particular, we know that g is polynomial-time computable. So there certainly is some natural number c (we use c rather than k, since k already denotes the number of variables of our input formula) such that, for each natural number n, the function g runs in time at most \(n^c +c\) on all inputs of length n. Let \(m = |F(x_1,x_2,\dots ,x_k)|\). Note that, at least if the encoding scheme is reasonable and if needed we do reasonable, obvious simplifications (e.g., \(\mathrm {True}\wedge y \equiv y\), \(\mathrm {True}\vee y \equiv \mathrm {True}\), \(\lnot \mathrm {True}\equiv \mathrm {False}\), and \(\lnot \mathrm {False}\equiv \mathrm {True}\)), each formula in the tree is of length less than or equal to m. Crucially, g applied to strings of length less than or equal to m can never output any string of length greater than \(m^c+c\). And so there are at most \(m^c +c +1\) strings in \(\{\epsilon ,0,00,\dots \}\) (the “\(+\,1\)” is because the empty string is one of the strings that can be reached) that can be mapped to by any of the nodes that are part of our proof's self-reducibility tree when the input is \(F(x_1,x_2,\dots ,x_k)\). So at each level of our tree-pruning, we eliminate all nodes that map to strings not belonging to \(\{\epsilon ,0,00,\dots \}\); and since we leave at most one node mapping to each string in \(\{\epsilon ,0,00,\dots \}\) that is mapped to, and as just argued there are at most \(m^c+c+1\) of those, at the end of pruning a given level, at most \(m^c+c+1\) nodes are still under consideration. But m is the length of our problem's input, so each level, after pruning, finishes with at most \(m^c+c+1\) nodes, and so the level after it, after we split each of the current level's nodes, will begin with at most \(2(m^c+c+1)\) nodes. And after pruning that level, it too ends up with at most \(m^c+c+1\) nodes still in play. The tree indeed remains at most polynomially wide.

Thus when we reach the “no variables left unassigned” level, we come into it with a polynomial-sized set of possible satisfying assignments (namely, a set of at most \(m^c+c+1\) assignments), and we know that the original formula is satisfiable if and only if at least one of these assignments satisfies F.

Thus the entire algorithm is a polynomial number of rounds (one per variable eliminated), each taking polynomial time. So overall it is a polynomial-time algorithm that correctly decides \(\mathrm{SAT}\). This completes the proof sketch.
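The level-by-level descent with the two pruning rules can be sketched as follows. The hypothesized reduction g is a parameter; since no polynomial-time reduction from \(\mathrm{SAT}\) to a tally set is known to exist (that is exactly the hypothesis), the demonstration reduction `toy_g` below cheats by brute force, and all names and the formula encoding are ours.

```python
from itertools import product

def is_tally(s):
    """Is s in {epsilon, 0, 00, ...}?"""
    return all(ch == "0" for ch in s)

def berman_sat(formula, n, g):
    """Decide SAT given a many-one reduction g from SAT to a tally set T.
    A node is the tuple of truth values assigned to the first variables."""
    level = [()]
    for _ in range(n):
        children = [asg + (b,) for asg in level for b in (True, False)]
        survivors = {}
        for asg in children:
            image = g(formula, asg, n)
            if not is_tally(image):      # rule 1: non-tally image
                continue                 #   => node surely unsatisfiable
            if image not in survivors:   # rule 2: equal images stand or fall
                survivors[image] = asg   #   together; keep one per image
        level = list(survivors.values())
        if not level:
            return False                 # every path pruned away
    # All variables assigned: F is satisfiable iff a surviving leaf is True.
    return any(formula(list(asg)) for asg in level)

def toy_g(formula, asg, n):
    """Stand-in reduction (brute force!): maps satisfiable nodes into the
    tally set T = {"00"} and unsatisfiable nodes to a non-tally string."""
    rest = n - len(asg)
    sat = any(formula(list(asg) + list(bits))
              for bits in product([True, False], repeat=rest))
    return "00" if sat else "1"
```

With `toy_g`, each pruned level retains at most one node per reachable tally string, mirroring the \(m^c+c+1\) width bound of the proof.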

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 3! While doing so, please keep this proof in mind, since doing so will be useful on Challenge Problem 3... though you also will need to discover a quite cool additional insight—the same one Steve Fortune discovered when he originally proved the theorem that is our Challenge Problem 3.

C Solution to Challenge Problem 3

In this section we sketch a proof of the result:

If there exists a sparse set S such that \(\overline{\mathrm{SAT}} \,\le _{m}^{{{p}}}\,S\), then \(\mathrm{SAT}\in \mathrm{P}\).

So assume that there exists a sparse set S such that \(\overline{\mathrm{SAT}} \,\le _{m}^{{{p}}}\,S\). Let g be the polynomial-time computable function performing that reduction, in the sense of Definition 4. (Keep in mind that we may not assume that \(S \in \mathrm{P}\). We have no argument line in hand that would tell us that that happens to be true.) Let us give a polynomial-time algorithm for \(\mathrm{SAT}\).

Suppose the input to our algorithm is the formula \(F(x_1,x_2,\dots ,x_k)\). (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables we simply evaluate it and accept if and only if it evaluates to \(\mathrm {True}\).)

What we are going to do here is that we are going to mimic the proof that solved Challenge Problem 2. We are going to go level by level down the self-reducibility tree, pruning at each level, and arguing that the tree never gets too wide—at least if we are careful and employ a rather jolting insight that Steve Fortune (and with luck, also you!) had.

Note that of the two types of pruning we used in the Challenge Problem 2 proof, one applies perfectly well here. If two or more nodes on a given level of the tree map under g to the same string, we can eliminate from consideration all but one of them, since either all of them or none of them are satisfiable.

However, the other type of pruning—eliminating all nodes not mapping to a string in \(\{\epsilon ,0,00,\dots \}\)—completely disappears here. Sparse sets don’t have too many strings per level, but the strings are not trapped to always being of a specific, well-known form.

Is the one type of pruning that is left to us enough to keep the tree from growing exponentially bushy as we go down it? At first glance, it seems that exponential width growth is very much possible; e.g., imagine that every node of the tree maps to a different string than every other node at its level. Then with each level our tree would be doubling in size, and at its base, if we started with k variables, we'd have \(2^k\) nodes—clearly an exponentially bushy tree.

But Fortune stepped back and realized something lovely. He realized that if the tree ever became too bushy, then that itself would be an implicit proof that F is satisfiable! Wow; mind-blowing!

In particular, Fortune used the following beautiful reasoning.

We know g runs in polynomial time. So let the polynomial r(n) bound g’s running time on inputs of length n, and without loss of generality, assume that r is nondecreasing. We know that S is sparse, so let the polynomial q(n) bound the number of strings in S up to and including length n, and without loss of generality, assume that q is nondecreasing.

Let \(m = |F(x_1,x_2,\dots ,x_k)|\), and as before, note that all the nodes in our proof are of length less than or equal to m.

How many distinct strings in S can be reached by applying g to strings of length at most m? On inputs of length at most m, clearly g maps to strings of length at most r(m). But note that the number of strings in S of length at most r(m) is at most q(r(m)).

Now, there are two cases. First, suppose that at each level of our tree we have, after pruning, at most q(r(m)) nodes left active. Since q(r(m)) is itself polynomial in the input size m, our tree remains at most polynomially bushy (levels of our tree are never, even right after splitting a level's nodes to create the next level, wider than 2q(r(m))). Analogously to the argument of Challenge Problem 2's proof, when we reach the “all variables assigned” level, we enter it with a set of at most 2q(r(m)) no-variables-left formulas such that F is satisfiable if and only if at least one of those formulas evaluates to \(\mathrm {True}\). So in that case we easily compute, in polynomial time, whether the given input is satisfiable, analogously to the previous proof.

On the other hand, suppose that on some level, after pruning, we have at least \(1+ q(r(m))\) nodes. This means that at that level, we had at least \(1+ q(r(m))\) distinct labels. But there are only q(r(m)) distinct strings that g can possibly reach, on our inputs, that belong to S. So at least one of the \(1+ q(r(m))\) formulas in our surviving nodes maps to a string that does not belong to S. But g was a reduction from \(\overline{\mathrm{SAT}}\) to S, so that node that mapped to a string that does not belong to S must itself be a satisfiable formula. Ka-zam! That node is satisfiable, and yet that node is simply F with some of its variables fixed. And so F itself certainly is satisfiable. We are done, and so the moment our algorithm finds a level that has \(1+ q(r(m))\) distinct labels, our algorithm halts and declares that \(F(x_1,x_2,\dots ,x_k)\) is satisfiable.

Note how subtle the action here is. The algorithm is correct in reasoning that, when we have at least \(1+q(r(m))\) distinct labels at a level, at least one of the still-live nodes at that level must be satisfiable, and thus \(F(x_1,x_2,\dots ,x_k)\) is satisfiable. However, the algorithm cannot point to a particular one of those at least \(1+q(r(m))\) nodes as being satisfiable. It merely knows that at least one of them is, and that is enough for it to act correctly. (One can, if one wants, extend the above approach to drive onward to the base of the tree: at each level, the moment one reaches \(1+q(r(m))\) distinct labels, one stops handling that level and goes immediately on to the next, splitting each of those \(1+q(r(m))\) nodes into two. This works since we know that at least one of those nodes is satisfiable, and so at least one node at the next level will be satisfiable.) This completes the proof sketch.
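To make the level-by-level procedure concrete, here is a toy Python sketch. The reduction g, its sparse target set S, and the bound q(r(m)) are only assumed to exist by the theorem, so `g_label` below is merely a stand-in (a canonical encoding of the formula), and `bound` plays the role of q(r(m)); with a real reduction in their place, the bound check is exactly Fortune's satisfiability test.

```python
# Toy sketch of the pruning algorithm. CNF formulas are lists of clauses;
# each clause is a list of (variable, polarity) literals. `g_label` is a
# STAND-IN for the assumed polynomial-time reduction g from the complement
# of SAT to the sparse set S; `bound` plays the role of q(r(m)).

def assign(cnf, var, val):
    """Fix `var` to `val`: drop satisfied clauses, shrink the rest."""
    out = []
    for clause in cnf:
        if (var, val) in clause:
            continue                      # clause satisfied: drop it
        out.append([lit for lit in clause if lit[0] != var])
    return out

def g_label(cnf):
    # stand-in for g; here just a canonical string for the formula
    return repr(sorted(sorted(c) for c in cnf))

def fortune_sat(cnf, variables, bound):
    level = [cnf]
    for v in variables:
        children = [assign(f, v, b) for f in level for b in (True, False)]
        seen, pruned = set(), []
        for f in children:                # keep one node per distinct label
            lbl = g_label(f)
            if lbl not in seen:
                seen.add(lbl)
                pruned.append(f)
        if len(pruned) > bound:
            # more distinct labels than S has strings of length <= r(m):
            # some surviving node maps outside S, hence is satisfiable,
            # hence the root formula F is satisfiable
            return True
        level = pruned
    # "all variables assigned" level: a leaf is True iff no clause survived
    return any(len(f) == 0 for f in level)
```

With the toy `g_label`, pruning merges only syntactically identical nodes, so the bound check never fires on small inputs; the theorem's polynomial running time comes from the real g together with the sparseness of S.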

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 4! There, you’ll be working within a related but changed and rather challenging setting: you’ll be working in the realms of functions and counting. Buckle up!

D Solution to Challenge Problem 4

1.1 D.1 Why One Natural Approach Is Hopeless

One natural approach would be to run the hypothetical 2-enumerator h on the input formula F and on both of F's \(x_1\)-assigned subformulas, and to argue purely from the two options that h gives for each of those three formulas, i.e., viewing the formulas for the moment as black boxes. (Note: without loss of generality, we may assume that each of the three applications of the 2-enumerator has two distinct outputs; the other cases are even easier.) The hope would be that we could either output \(\Vert F\Vert \) directly, or identify at least one of the subformulas for which we can show a particular 1-to-1 linkage between which of its two predicted numbers of solutions is correct and which of F's two predicted numbers of solutions is correct. And then we would iteratively walk down the tree, doing that.

But the following example, based on one suggested by Gerhard Woeginger, shows that that is impossible. Suppose h predicts outputs \(\{0,1\}\) for F, and h predicts outputs \(\{0,1\}\) for the left subformula, and h predicts outputs \(\{0,1\}\) for the right subformula. That is, for each, it says “this formula either has zero satisfying assignments or has exactly one satisfying assignment.” In this case, note that the values of the root can’t be, based solely on the numbers the enumerator output, linked 1-to-1 to those of the left subformula, since 0 solutions for the left subformula can correspond to a root value of 0 (\(0+0=0\)) or to a root value of 1 (\(0+1=1\)). The same clearly also holds for the right subformula.

The three separate number-pairs just don’t have enough information to make the desired link! But don’t despair: we can make h help us far more powerfully than was done above!
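A few lines of Python make the ambiguity explicit; the triples enumerated below are the (left, right, root) solution counts consistent with the constraint that the root's count is the sum of the subformulas' counts:

```python
# Enumerate all (left, right, root) count-triples consistent with
# root = left + right when the enumerator predicts {0, 1} for each of
# the three formulas in Woeginger's example.
consistent = [(left, right, left + right)
              for left in (0, 1) for right in (0, 1)
              if left + right in (0, 1)]

# left = 0 is compatible with BOTH possible root values, so the three
# separate number-pairs carry too little information for a 1-to-1 linkage.
roots_for_left_0 = {root for (left, right, root) in consistent if left == 0}
```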

1.2 D.2 XYZ Idea/Statement

To get around the obstacle just mentioned, we can try to trick the enumerator into giving us linked/coordinated guesses! Let us see how to do that.

What I was thinking of, when I mentioned XYZ in the food-for-thought hint (Sect. 6.2), is the fact that we can efficiently combine two Boolean formulas into a new one such that from the number of satisfying assignments of the new formula we can easily “read off” the number of satisfying assignments of both the original formulas. In fact, it turns out that we can do the combining in such a way that if we concatenate the (appropriately padded as needed) bitstrings capturing the numbers of solutions of the two formulas, we get the (appropriately padded as needed) bitstring capturing the number of solutions of the new “combined” formula. We will, when F is a Boolean formula, use \(\Vert F\Vert \) to denote the number of satisfying assignments of F.

Lemma 1

There are polynomial-time computable functions \(\mathrm {combiner}\) and \(\mathrm {decoder}\) such that for any Boolean formulas F and G, \(\mathrm {combiner}(F,\, G)\) is a Boolean formula and \(\mathrm {decoder}(F,G, \Vert \mathrm {combiner}(F,G)\Vert )\) prints \(\Vert F\Vert ,\Vert G\Vert \).

Proof Sketch. Let \(F=F(x_1,\ldots ,\,x_n)\) and \(G=G(y_1,\ldots ,\,y_m)\), where \(x_1,\ldots ,\,x_n,y_1,\ldots ,\,y_m\) are all distinct. Let z and \(z^\prime \) be two new Boolean variables. Then

$$ H=(F\wedge z)\vee (\bar{z}\wedge x_1 \wedge \cdots \wedge x_n \wedge G \wedge z^\prime ) $$

gives the desired combination, since \(\Vert H\Vert =\Vert F\Vert 2^{m+1}+\Vert G\Vert \) and \(\Vert G\Vert \le 2^m\), so both counts can be read off from \(\Vert H\Vert \).    \(\square \)

We can easily extend this technique to combine three, four, or even polynomially many formulas.
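Numerically, the decoding in Lemma 1 is just division with remainder. Here is a minimal sketch of that arithmetic (the formulas themselves are elided; the function name `decode` and the use of plain integers are illustrative, not the chapter's notation):

```python
def decode(m, count_H):
    """Recover (||F||, ||G||) from ||H|| = ||F|| * 2**(m+1) + ||G||,
    where m is the number of variables of G. This is unambiguous
    because ||G|| <= 2**m < 2**(m+1)."""
    return divmod(count_H, 2 ** (m + 1))

# e.g., ||F|| = 5 and G on m = 3 variables with ||G|| = 6 give
# ||H|| = 5 * 2**4 + 6 = 86, from which both counts are recovered
count_H = 5 * 2 ** (3 + 1) + 6
```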

1.3 D.3 Invitation to a Second Bite at the Apple

Now that you have in hand the extra tool that is Lemma 1, this would be a great time, unless you already found a solution to the fourth challenge problem, to try again to solve the problem. My guess is that if you did not already solve the fourth challenge problem, then the ideas you had while trying to solve it will stand you in good stead when you revisit the problem with the combining lemma in hand.

My suggestion to you would be to work again on proving Challenge Problem 4 until either you find a proof, or you’ve put in at least 15 more minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Sect. D.4 to read a proof of the theorem.

1.4 D.4 Proof Sketch of the Theorem

Recall that we are trying to prove:

If \(\mathrm{\#SAT}\) has a polynomial-time 2-enumerator, then there is a polynomial-time algorithm for \(\mathrm{\#SAT}\).

Here is a quick proof sketch. Start with our input formula, F, whose number of solutions we wish to compute in polynomial time.

If F has no variables, we can simply directly output the right number of solutions, namely, 1 (if F evaluates to \(\mathrm {True}\)) or 0 (otherwise). Otherwise, self-reduce formula F on its first variable. Using the XYZ trick, twice, combine the original formula and the two subformulas into a single formula, H, whose number of solutions gives the number of solutions of all three. For example, if our three formulas are \(F = F(x_1, x_2, x_3, \dots )\), \(F_{left} = F(\mathrm {True}, x_2, x_3, \dots )\), and \(F_{right} = F(\mathrm {False}, x_2, x_3, \dots )\), our combined formula can be

$$\begin{aligned} H = \mathrm {combiner}(F,\mathrm {combiner}(F_{left},F_{right})), \end{aligned}$$

and the decoding process is clear from this and Lemma 1 (and its proof). Run the 2-enumerator on H, and decode each of its two outputs into a guess \((a,b,c)\) for \((\Vert F\Vert , \Vert F_{left}\Vert , \Vert F_{right}\Vert )\). If either decoded guess is internally inconsistent (\(a \ne b + c\)), then discard it; the other guess is the truth, and we are done. If both are consistent and agree on \(\Vert F\Vert \), then we're also done. Otherwise, the two guesses are each internally consistent yet disagree on \(\Vert F\Vert \), and so it follows that they differ in their claims about at least one of \(\Vert F_{left}\Vert \) and \(\Vert F_{right}\Vert \). Thus if we know the number of solutions of that one, shorter formula, we know \(\Vert F\Vert \).

Repeat the above on that formula, and so on, right on down the tree; then (unless the process resolves internally or ripples back up earlier) at the end we have reached a zero-variable formula, and by inspection we know how many solutions it has (either 1 or 0). Using that, we can ripple our way all the way back up through the tree, using our linkages between each level and the next, and thus we have computed \(\Vert F\Vert \). The entire process is a polynomial number of polynomial-time actions, and so runs in polynomial time overall.
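The case analysis in the key step can be sketched as follows. Each guess is a decoded triple \((a,b,c)\) claiming \((\Vert F\Vert , \Vert F_{left}\Vert , \Vert F_{right}\Vert )\); the function name and its return convention are illustrative, not the chapter's:

```python
def link(guess1, guess2):
    """Resolve the enumerator's two decoded guesses, each a triple
    (count_F, count_left, count_right). Returns the settled value of
    ||F||, or which subformula the recursion must descend into."""
    a1, b1, c1 = guess1
    a2, b2, c2 = guess2
    if a1 != b1 + c1:                 # guess 1 internally inconsistent,
        return ("resolved", a2)       # so guess 2 must be the truth
    if a2 != b2 + c2:                 # symmetric case
        return ("resolved", a1)
    if a1 == a2:                      # both consistent and agree on ||F||
        return ("resolved", a1)
    # both internally consistent but disagreeing on ||F||: they must also
    # disagree on a subformula; knowing that subformula's count settles ||F||
    return ("recurse", "left" if b1 != b2 else "right")
```

For instance, two consistent guesses that disagree only in their right-subformula claims send the recursion right; a guess that fails its own sum check is simply thrown away.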

That ends the proof sketch, but let us give an example regarding the key step from the proof sketch, as that will help make clear what is going on.

(Figure omitted: a concrete instance of the key step, in which the enumerator's two decoded guesses for \((\Vert F\Vert , \Vert F_{left}\Vert , \Vert F_{right}\Vert )\) are each internally consistent but disagree on \(\Vert F\Vert \): one guess has \(\Vert F\Vert = 100\) with \(\Vert F_{right}\Vert = 17\), the other has \(\Vert F\Vert = 101\) with \(\Vert F_{right}\Vert = 16\).)

In this example, note that we can conclude that \(\Vert F\Vert = 100\) if \(\Vert F(\mathrm {False}, x_2, x_3, \dots )\Vert = 17\), and \(\Vert F\Vert = 101\) if \( \Vert F(\mathrm {False}, x_2, x_3, \dots )\Vert = 16\); and we know that \( \Vert F(\mathrm {False}, x_2, x_3, \dots )\Vert \in \{16,17\}\).

So we have in polynomial time completely linked \( \Vert F(x_1, x_2, x_3, \dots )\Vert \) to the issue of the number of satisfying assignments of the (after simplifying) shorter formula \(F(\mathrm {False}, x_2, x_3, \dots )\). This completes our example of the key linking step.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this chapter

Hemaspaandra, L.A. (2020). The Power of Self-Reducibility: Selectivity, Information, and Approximation. In: Du, DZ., Wang, J. (eds) Complexity and Approximation. Lecture Notes in Computer Science(), vol 12000. Springer, Cham. https://doi.org/10.1007/978-3-030-41672-0_3

  • Print ISBN: 978-3-030-41671-3

  • Online ISBN: 978-3-030-41672-0