Learning probabilistic read-once formulas on product distributions
Abstract
This paper presents a polynomial-time algorithm for inferring a probabilistic generalization of the class of read-once Boolean formulas over the usual basis {AND, OR, NOT}. The algorithm effectively infers a good approximation of the target formula when provided with random examples chosen according to any product distribution, i.e., any distribution in which the setting of each input bit is chosen independently of the settings of the other bits. Since the class of formulas considered includes ordinary read-once Boolean formulas, our result shows that such formulas are PAC learnable (in the sense of Valiant) against any product distribution (for instance, against the uniform distribution). Further, this class of probabilistic formulas includes read-once formulas whose behavior has been corrupted by large amounts of random noise. Such noise may affect the formula's output (“misclassification noise”), the input bits (“attribute noise”), or the behavior of individual gates of the formula. Thus, in this setting, we show that read-once formulas can be inferred (approximately) despite large amounts of noise affecting the formula's behavior.
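As an informal illustration of the learning setting just described (not of the paper's inference algorithm), the sketch below draws examples from a product distribution, labels them with a small read-once formula over {AND, OR, NOT}, and corrupts both the attributes and the label with random noise. The particular formula, bit probabilities, and noise rates are arbitrary choices made for the example.

```python
import random

# Hypothetical target: a read-once formula in which each variable
# x0..x3 appears exactly once:  f(x) = (x0 AND NOT x1) OR (x2 AND x3)
def target_formula(x):
    return bool((x[0] and not x[1]) or (x[2] and x[3]))

# A product distribution: each input bit is set to 1 independently,
# with its own probability p_i (values chosen arbitrarily here).
P = [0.5, 0.3, 0.8, 0.6]

def draw_example(p=P):
    return [int(random.random() < pi) for pi in p]

# Two of the noise models mentioned in the abstract, with illustrative
# rates: attribute noise flips each observed input bit independently,
# and misclassification noise flips the observed label.
ATTRIBUTE_NOISE = 0.1
LABEL_NOISE = 0.2

def noisy_labeled_example():
    x = draw_example()
    y = target_formula(x)
    x_obs = [b ^ (random.random() < ATTRIBUTE_NOISE) for b in x]
    y_obs = y ^ (random.random() < LABEL_NOISE)
    return x_obs, int(y_obs)

if __name__ == "__main__":
    # A learner in this setting sees only such noisy (example, label) pairs.
    for _ in range(5):
        print(noisy_labeled_example())
```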
Keywords
computational learning theory · PAC-learning · learning with noise · read-once formulas · product distributions
References
- Angluin, D., Hellerstein, L., and Karpinski, M. (1993). Learning read-once formulas with queries. Journal of the Association for Computing Machinery, 40(1):185–210.
- Angluin, D. and Laird, P. (1988). Learning from noisy examples. Machine Learning, 2(4):343–370.
- Blumer, A., Ehrenfeucht, A., Haussler, D., and Warmuth, M. K. (1987). Occam's razor. Information Processing Letters, 24(6):377–380.
- Bshouty, N. H., Hancock, T. R., and Hellerstein, L. (1992). Learning arithmetic read-once formulas. In Proceedings of the Twenty-Fourth Annual ACM Symposium on the Theory of Computing, pages 370–381.
- Furst, M. L., Jackson, J. C., and Smith, S. W. (1991). Improved learning of AC0 functions. In Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 317–325.
- Goldman, S. A., Kearns, M. J., and Schapire, R. E. (1990). Exact identification of circuits using fixed points of amplification functions. In 31st Annual Symposium on Foundations of Computer Science, pages 193–202. To appear, SIAM Journal on Computing.
- Hancock, T. and Hellerstein, L. (1991). Learning read-once formulas over fields and extended bases. In Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 326–336.
- Hancock, T. and Mansour, Y. (1991). Learning monotone kμ DNF formulas on product distributions. In Proceedings of the Fourth Annual Workshop on Computational Learning Theory, pages 179–183.
- Hancock, T. R. (1990). Identifying μ-formula decision trees with queries. In Proceedings of the Third Annual Workshop on Computational Learning Theory, pages 23–37.
- Hellerstein, L. and Karpinski, M. (1990). Read-once formulas over different bases. Technical Report 8556-CS, University of Bonn.
- Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30.
- Kearns, M., Li, M., Pitt, L., and Valiant, L. (1987). On the learnability of Boolean formulae. In Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, pages 285–295.
- Kearns, M. and Valiant, L. G. (1989). Cryptographic limitations on learning Boolean formulae and finite automata. In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing, pages 433–444. To appear, Journal of the Association for Computing Machinery.
- Kearns, M. J. and Schapire, R. E. (1990). Efficient distribution-free learning of probabilistic concepts. In 31st Annual Symposium on Foundations of Computer Science, pages 382–391. To appear, Journal of Computer and System Sciences.
- Linial, N., Mansour, Y., and Nisan, N. (1989). Constant depth circuits, Fourier transform, and learnability. In 30th Annual Symposium on Foundations of Computer Science, pages 574–579.
- Pagallo, G. and Haussler, D. (1989). A greedy method for learning μDNF functions under the uniform distribution. Technical Report UCSC-CRL-89-12, University of California Santa Cruz, Computer Research Laboratory.
- Sloan, R. H. (1988). Types of noise in data for concept learning. In Proceedings of the 1988 Workshop on Computational Learning Theory, pages 91–96.
- Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11):1134–1142.
- Verbeurgt, K. (1990). Learning DNF under the uniform distribution in quasi-polynomial time. In Proceedings of the Third Annual Workshop on Computational Learning Theory, pages 314–326.
- Yamanishi, K. (1992). A learning criterion for stochastic rules. Machine Learning, 9(2/3):165–203.