Abstract
A simple greedy algorithm is known as an approximation algorithm for inferring a Boolean function from positive and negative examples, which is a fundamental problem in discovery science. Results of computational experiments led to the conjecture that the greedy algorithm finds an exact (that is, optimal) solution with high probability if the input data for each function are generated uniformly at random. This conjecture had been proved only for AND/OR of literals. This paper proves the conjecture for a more general class of Boolean functions, which we call unbalanced functions. We also prove that unbalanced functions account for more than half of all Boolean functions, and that the ratio of d-input unbalanced functions to all d-input Boolean functions converges to 1 as d grows. This means that the greedy algorithm finds the exact solution with high probability for most Boolean functions if the input data are generated uniformly at random. To improve the performance for small d, we develop a variant of the greedy algorithm. The theoretical results on the greedy algorithm and the effectiveness of the variant are confirmed through computational experiments.
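The abstract does not spell out the greedy algorithm, so the following is only a minimal sketch under an assumption: that the inference problem is treated in set-cover style, where every pair of examples with different outputs must be separated by at least one selected input variable, and the greedy rule repeatedly picks the variable that separates the most still-unseparated pairs. The function name `greedy_select` and the data layout are hypothetical and chosen for illustration; the variant developed in the paper is not shown.

```python
from itertools import combinations, product


def greedy_select(examples):
    """Pick input variables greedily (illustrative sketch, not the paper's code).

    examples: list of (inputs, output) with inputs a tuple of 0/1 values
    and output a single 0/1 value. Returns the sorted list of chosen
    variable indices.
    """
    n = len(examples[0][0]) if examples else 0
    # Every pair of examples with different outputs must be told apart
    # by at least one chosen variable.
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(examples)), 2)
        if examples[i][1] != examples[j][1]
    }
    chosen = []
    while uncovered:
        # Count, for each variable, how many remaining pairs it separates.
        gains = [
            sum(1 for i, j in uncovered if examples[i][0][a] != examples[j][0][a])
            for a in range(n)
        ]
        best = max(range(n), key=gains.__getitem__) if n else None
        if best is None or gains[best] == 0:
            # Two examples share all inputs but disagree on the output.
            raise ValueError("examples are inconsistent with any Boolean function")
        chosen.append(best)
        uncovered = {
            (i, j) for i, j in uncovered
            if examples[i][0][best] == examples[j][0][best]
        }
    return sorted(chosen)


# Usage: the full truth table of x0 AND x2 over five input variables.
examples = [(x, x[0] & x[2]) for x in product((0, 1), repeat=5)]
selected = greedy_select(examples)
print(selected)
```

In this usage example the data cover the whole truth table rather than a random sample, which keeps the run deterministic; the setting analysed in the paper instead draws the input data uniformly at random.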
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fukagawa, D., Akutsu, T. (2003). Performance Analysis of a Greedy Algorithm for Inferring Boolean Functions. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_11
DOI: https://doi.org/10.1007/978-3-540-39644-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4
eBook Packages: Springer Book Archive