Abstract
Evaluating improvements to modern SAT solvers and comparing two arbitrary solvers are challenging and important tasks. The relative performance of two solvers is usually assessed by running them on a set of SAT instances and directly comparing the number of solved instances and the running times. In this paper we point out shortcomings of this approach and advocate more reliable, statistically founded methodologies that can better discriminate between good and bad ideas. We present one such methodology and illustrate its application.
This work was partially supported by Serbian Ministry of Science grant 144030.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nikolić, M. (2010). Statistical Methodology for Comparison of SAT Solvers. In: Strichman, O., Szeider, S. (eds) Theory and Applications of Satisfiability Testing – SAT 2010. SAT 2010. Lecture Notes in Computer Science, vol 6175. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14186-7_18
DOI: https://doi.org/10.1007/978-3-642-14186-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14185-0
Online ISBN: 978-3-642-14186-7
eBook Packages: Computer Science, Computer Science (R0)