A Monte Carlo Study of Randomised Restarted Search in ILP
Recent statistical performance surveys of search algorithms on difficult combinatorial problems have demonstrated the benefits of randomising and restarting the search procedure. Specifically, it has been found that if the search cost distribution (SCD) of the non-restarted randomised search exhibits a slower-than-exponential decay (that is, a “heavy tail”), restarts can reduce the expected search cost. Recently, this heavy-tail phenomenon was observed in the SCDs of benchmark ILP problems. Following on from this work, we report on an empirical study of randomised restarted search in ILP. Our experiments, conducted over a cluster of a few hundred computers, provide an extensive statistical performance sample of five search algorithms operating on two principally different ILP problems (artificially generated graph data and the well-known “mutagenesis” problem). The sample allows us to (1) estimate the conditional expected value of the search cost (measured by the total number of clauses explored) given the minimum clause score required and a “cutoff” value (the number of clauses examined before the search is restarted); and (2) compare the performance of randomised restarted search strategies with that of a deterministic non-restarted search. Our findings indicate that the cutoff value is significantly more important than the choice of (a) the specific refinement strategy; (b) the starting element of the search; and (c) the specific data domain. We find that the optimal value for the cutoff parameter remains roughly stable across variations of these three factors, and that the mean search cost using this value in a randomised restarted search is up to three orders of magnitude (i.e., 1000 times) lower than that obtained with a deterministic non-restarted search.
Keywords: Search Cost; Monte Carlo Study; Performance Vector; Bottom Clause; Inductive Logic Program
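To make the role of the cutoff concrete, the following minimal sketch estimates by Monte Carlo the mean total cost of a restarted search whose single-trial cost is heavy-tailed. It is an illustration only and not the experimental setup of the study: the Pareto cost model, the parameters `alpha` and `x_min`, and the function names are all assumptions introduced here. A trial that would cost more than `cutoff` is abandoned after `cutoff` units and the search restarts with a fresh random draw.

```python
import random


def sample_cost(rng, alpha=0.8, x_min=1.0):
    """Draw one non-restarted search cost from an assumed Pareto law.

    Uses inverse-CDF sampling with P(X > x) = (x_min / x) ** alpha, x >= x_min.
    For alpha <= 1 the mean of this distribution is infinite (a heavy tail).
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], so the division is safe
    return x_min / (u ** (1.0 / alpha))


def restarted_cost(rng, cutoff, alpha=0.8, x_min=1.0):
    """Total cost of one restarted search with a fixed cutoff."""
    if cutoff <= x_min:
        # No trial could ever finish within the cutoff; the loop would not end.
        raise ValueError("cutoff must exceed x_min")
    total = 0.0
    while True:
        cost = sample_cost(rng, alpha, x_min)
        if cost <= cutoff:
            return total + cost  # this trial succeeds within the cutoff
        total += cutoff          # trial abandoned: pay the cutoff and restart


def mean_restarted_cost(cutoff, runs=20000, seed=0, alpha=0.8, x_min=1.0):
    """Monte Carlo estimate of the expected total cost for a given cutoff."""
    rng = random.Random(seed)
    total = sum(restarted_cost(rng, cutoff, alpha, x_min) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    for cutoff in (2.0, 10.0, 100.0, 1e4, 1e6):
        print(cutoff, mean_restarted_cost(cutoff))
```

Under this assumed model, sweeping `cutoff` over a grid shows the qualitative effect described in the abstract: a very large cutoff behaves like a non-restarted search and inherits the heavy tail, while a moderate cutoff keeps the mean cost small.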