
On the Hierarchical Community Structure of Practical Boolean Formulas

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12831)


Modern CDCL SAT solvers easily solve industrial instances containing tens of millions of variables and clauses, despite the theoretical intractability of the SAT problem. This gap between practice and theory is a central problem in solver research. It is believed that SAT solvers exploit structure inherent in industrial instances, and hence there have been numerous attempts over the last 25 years at characterizing this structure via parameters. These can be classified as rigorous, i.e., they serve as a basis for complexity-theoretic upper bounds (e.g., backdoors), or correlative, i.e., they correlate well with solver run time and are observed in industrial instances (e.g., community structure). Unfortunately, no parameter proposed to date has been shown to be both strongly correlative and rigorous over a large fraction of industrial instances.

Given the sheer difficulty of the problem, we aim for an intermediate goal of proposing a set of parameters that is strongly correlative and has good theoretical properties. Specifically, we propose parameters based on a graph partitioning called Hierarchical Community Structure (HCS), which captures the recursive community structure of a graph of a Boolean formula. We show that HCS parameters are strongly correlative with solver run time using an Empirical Hardness Model, and further build a classifier based on HCS parameters that distinguishes between easy industrial and hard random/crafted instances with very high accuracy. We further strengthen our hypotheses via scaling studies. On the theoretical side, we show that counterexamples which plagued flat community structure do not apply to HCS, and that there is a subset of HCS parameters such that restricting them limits the size of embeddable expanders.
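The abstract does not say how the "graph of a Boolean formula" is built or how community quality is scored. As a minimal sketch, assume the variable incidence graph (VIG) commonly used in the community structure literature, where each clause spreads a total weight of 1 evenly over the pairs of its variables, and assume Newman-Girvan modularity as the quality score. The function names `build_vig` and `modularity` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_vig(clauses):
    """Build a weighted variable incidence graph from CNF clauses.

    `clauses` is a list of clauses, each a list of non-zero integer
    literals (DIMACS style). Assumption: a clause over k distinct
    variables adds weight 1 / C(k, 2) to every pair of its variables.
    Returns a dict mapping (u, v) with u < v to the edge weight.
    """
    weights = defaultdict(float)
    for clause in clauses:
        variables = sorted({abs(lit) for lit in clause})
        k = len(variables)
        if k < 2:
            continue  # unit clauses contribute no edges
        share = 1.0 / (k * (k - 1) / 2)
        for u, v in combinations(variables, 2):
            weights[(u, v)] += share
    return dict(weights)


def modularity(weights, partition):
    """Newman-Girvan modularity of `partition` on a weighted graph.

    `partition` is a list of disjoint sets that together contain every
    variable appearing in `weights`.
    """
    community_of = {v: i for i, block in enumerate(partition) for v in block}
    total = sum(weights.values())
    internal = defaultdict(float)  # edge weight inside each community
    degree = defaultdict(float)    # summed weighted degree per community
    for (u, v), w in weights.items():
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    return sum(
        internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree
    )


# Two triangles of binary clauses joined by the single clause (3 v 4).
example = [[1, 2], [2, 3], [1, 3], [4, 5], [5, 6], [4, 6], [3, 4]]
q_split = modularity(build_vig(example), [{1, 2, 3}, {4, 5, 6}])
q_flat = modularity(build_vig(example), [{1, 2, 3, 4, 5, 6}])
```

On this toy formula the two-community split scores 5/14 while the single-community partition scores 0. A hierarchical decomposition in the spirit of HCS would then repeat community detection inside each block; that recursive step and the detection algorithm itself are omitted from this sketch.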

J. Li and J. Chung—Joint first author

J. Li—Work done in part while the authors were at the 2021 Satisfiability: Theory, Practice, and Beyond program at the Simons Institute, Berkeley, CA, USA.

  • DOI: 10.1007/978-3-030-80223-3_25


  1. The term industrial is loosely defined to encompass instances obtained from hardware and software testing, analysis, and verification applications.

  2. Using terminology by Stefan Szeider [43].

  3. Instance generator and data can be found at Also, for the full-length paper and appendices (with proofs of theorems in Sect. 6), please refer to the arXiv version of the paper [26].

  4. For a complete list, see:

  5. This value is the time limit used by the SAT competition.

  6. See for details on clusters.


  1. Alekhnovich, M., Razborov, A.: Satisfiability, branch-width and Tseitin tautologies. Comput. Complex. 20(4), 649–678 (2011).

  2. Ansótegui, C., Bonet, M.L., Giráldez-Cru, J., Levy, J.: The fractal dimension of SAT formulas. In: Proceedings of the 7th International Joint Conference on Automated Reasoning - IJCAR 2014, pp. 107–121 (2014).

  3. Ansótegui, C., Bonet, M.L., Levy, J.: Towards industrial-like random SAT instances. In: IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, pp. 387–392 (2009)

  4. Ansótegui, C., Giráldez-Cru, J., Levy, J.: The community structure of SAT formulas. In: Proceedings of the 15th International Conference on Theory and Applications of Satisfiability Testing - SAT 2012, pp. 410–423 (2012).

  5. Ben-Sasson, E., Wigderson, A.: Short proofs are narrow—resolution made simple. J. ACM (JACM) 48(2), 149–169 (2001)

  6. Bläsius, T., Friedrich, T., Göbel, A., Levy, J., Rothenberger, R.: The impact of heterogeneity and geometry on the proof complexity of random satisfiability. In: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, pp. 42–53 (2021).

  7. Blondel, V., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).

  8. Blum, A.L., Furst, M.L.: Fast planning through planning graph analysis. Artif. Intell. 90(1–2), 281–300 (1997)

  9. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001).

  10. Cadar, C., Ganesh, V., Pawlowski, P.M., Dill, D.L., Engler, D.R.: EXE: automatically generating inputs of death. ACM Trans. Inf. Syst. Secur. (TISSEC) 12(2), 1–38 (2008)

  11. Cheeseman, P., Kanefsky, B., Taylor, W.M.: Where the really hard problems are. In: Proceedings of the 12th International Joint Conference on Artificial Intelligence, IJCAI 1991, pp. 331–337 (1991)

  12. Clarke Jr, E.M., Grumberg, O., Kroening, D., Peled, D., Veith, H.: Model Checking. MIT Press (2018)

  13. Clauset, A., Moore, C., Newman, M.E.J.: Hierarchical structure and the prediction of missing links in networks. Nature 453(7191), 98–101 (2008).

  14. Coarfa, C., Demopoulos, D.D., San Miguel Aguirre, A., Subramanian, D., Vardi, M.Y.: Random 3-SAT: the plot thickens. Constraints 8(3), 243–261 (2003).

  15. Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, pp. 151–158 (1971).

  16. Dolby, J., Vaziri, M., Tip, F.: Finding bugs efficiently with a SAT solver. In: Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 195–204 (2007).

  17. Eén, N., Biere, A.: Effective preprocessing in SAT through variable and clause elimination. In: Bacchus, F., Walsh, T. (eds.) SAT 2005. LNCS, vol. 3569, pp. 61–75. Springer, Heidelberg (2005).

  18. Fortunato, S., Barthélemy, M.: Resolution limit in community detection. Proc. Natl. Acad. Sci. 104(1), 36–41 (2007).

  19. Friedrich, T., Krohmer, A., Rothenberger, R., Sutton, A.M.: Phase transitions for scale-free SAT formulas. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI 2017, pp. 3893–3899. AAAI Press (2017)

  20. Giráldez-Cru, J.: Beyond the structure of SAT formulas. Ph.D. thesis, Universitat Autònoma de Barcelona (2016)

  21. Giráldez-Cru, J., Levy, J.: A modularity-based random SAT instances generator. In: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, pp. 1952–1958 (2015).

  22. Granell, C., Gomez, S., Arenas, A.: Hierarchical multiresolution method to overcome the resolution limit in complex networks. Int. J. Bifurcat. Chaos 22(07), 1250171 (2012)

  23. Hoory, S., Linial, N., Wigderson, A.: Expander graphs and their applications. Bull. Am. Math. Soc. 43(4), 439–561 (2006)

  24. Kilby, P., Slaney, J., Thiebaux, S., Walsh, T.: Backbones and backdoors in satisfiability. Proc. Natl. Conf. Artif. Intell. 3, 1368–1373 (2005)

  25. Lauria, M., Elffers, J., Nordström, J., Vinyals, M.: CNFgen: a generator of crafted benchmarks. In: Proceedings of the 20th International Conference on Theory and Applications of Satisfiability Testing (SAT 2017), pp. 464–473 (2017).

  26. Li, C., et al.: On the hierarchical community structure of practical SAT formulas. arXiv preprint arXiv:2103.14992 (2021)

  27. Liang, J.H., Ganesh, V., Poupart, P., Czarnecki, K.: Learning rate based branching heuristic for SAT solvers. In: Proceedings of the 19th International Conference on Theory and Applications of Satisfiability Testing - SAT 2016, pp. 123–140 (2016).

  28. Mateescu, R.: Treewidth in industrial SAT benchmarks. Tech. Rep. MSR-TR-2011-22, Microsoft (2011).

  29. Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B., Troyansky, L.: Determining computational complexity from characteristic ‘phase transitions’. Nature 400(6740), 133–137 (1999)

  30. Mull, N., Fremont, D.J., Seshia, S.A.: On the hardness of SAT with community structure. In: Proceedings of the 19th International Conference on Theory and Applications of Satisfiability Testing (SAT), pp. 141–159 (2016).

  31. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004).

  32. Newsham, Z., Ganesh, V., Fischmeister, S., Audemard, G., Simon, L.: Impact of community structure on SAT solver performance. In: Theory and Applications of Satisfiability Testing - SAT 2014 - 17th International Conference, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, 14–17 July, 2014. Proceedings, pp. 252–268 (2014).

  33. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  34. Platt, J.C.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers, pp. 61–74. MIT Press (1999)

  35. Ravasz, E., Somera, A.L., Mongru, D.A., Oltvai, Z.N., Barabási, A.L.: Hierarchical organization of modularity in metabolic networks. Science 297(5586), 1551–1555 (2002)

  36. Samer, M., Szeider, S.: Backdoor trees. In: Automated Reasoning, vol. 1, pp. 363–368. Springer (2008)

  37. Samer, M., Szeider, S.: Fixed-parameter tractability. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of Satisfiability, Frontiers in Artificial Intelligence and Applications, 2nd edn., vol. 336. IOS Press (2021)

  38. SAT: The International SAT Competition. Accessed 06 Mar 2021

  39. Selman, B., Mitchell, D.G., Levesque, H.J.: Generating hard satisfiability problems. Artif. Intell. 81(1–2), 17–29 (1996)

  40. SHARCNET: SHARCNET: Graham Cluster. Accessed 06 Mar 2021

  41. Simon, H.A.: The architecture of complexity. Proc. Am. Philos. Soc. 106(6), 467–482 (1962).

  42. Steel, R.G.D., Torrie, J.H.: Principles and Procedures of Statistics. McGraw-Hill (1960)

  43. Szeider, S.: Algorithmic utilization of structure in SAT instances. Theoretical Foundations of SAT/SMT Solving Workshop at the Simons Institute for the Theory of Computing (2021)

  44. Williams, R., Gomes, C.P., Selman, B.: Backdoors to typical case complexity. In: IJCAI-2003, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pp. 1173–1178 (2003).

  45. Xie, Y., Aiken, A.: Saturn: a SAT-based tool for bug detection. In: Proceedings of the 17th International Conference on Computer Aided Verification, CAV 2005, pp. 139–143 (2005).

  46. Xu, L., Hutter, F., Hoos, H., Leyton-Brown, K.: Features for SAT (2012). Accessed Feb 2021

  47. Zulkoski, E., Martins, R., Wintersteiger, C.M., Liang, J.H., Czarnecki, K., Ganesh, V.: The effect of structural measures and merges on SAT solver performance. In: Proceedings of the 24th International Conference on Principles and Practice of Constraint Programming, pp. 436–452 (2018).

  48. Zulkoski, E., et al.: Learning-sensitive backdoors with restarts. In: Proceedings of the 24th International Conference on Principles and Practice of Constraint Programming, pp. 453–469 (2018).



Corresponding authors

Correspondence to Jonathan Chung, Soham Mukherjee, Marc Vinyals, Noah Fleming, Alice Mu or Vijay Ganesh.


Copyright information

© 2021 Springer Nature Switzerland AG


Cite this paper

Li, C. et al. (2021). On the Hierarchical Community Structure of Practical Boolean Formulas. In: Li, CM., Manyà, F. (eds) Theory and Applications of Satisfiability Testing – SAT 2021. SAT 2021. Lecture Notes in Computer Science, vol 12831. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-80222-6

  • Online ISBN: 978-3-030-80223-3

  • eBook Packages: Computer Science, Computer Science (R0)