
Estimating a Boolean Perceptron from Its Average Satisfying Assignment: A Bound on the Precision Required

  • Paul W. Goldberg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2111)

Abstract

A boolean perceptron is a linear threshold function over the discrete boolean domain {0, 1}^n. That is, it maps any binary vector to 0 or 1 depending on whether the vector’s components satisfy some linear inequality. In 1961, Chow [9] showed that any boolean perceptron is determined by the average or “center of gravity” of its “true” vectors (those that are mapped to 1). Moreover, this average distinguishes the function from any other boolean function, not just other boolean perceptrons. We address an associated statistical question of whether an empirical estimate of this average is likely to provide a good approximation to the perceptron. In this paper we show that an estimate that is accurate to within additive error (ε/n)^O(log(1/ε)) determines a boolean perceptron that is accurate to within error ε (the fraction of misclassified vectors). This provides a mildly super-polynomial bound on the sample complexity of learning boolean perceptrons in the “restricted focus of attention” setting. In the process we also find some interesting geometrical properties of the vertices of the unit hypercube.
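To make the statistical question concrete, the following Python sketch (not from the paper; the weight vector w, threshold theta, and sample size are arbitrary illustrative choices) computes the exact average of a small perceptron’s true vectors by exhaustive enumeration, alongside an empirical estimate of that average from uniform random samples. The paper’s result bounds how small the additive error of such an estimate must be before it determines the perceptron to within classification error ε.

    import itertools

    import numpy as np

    # Hypothetical weights and threshold; any linear threshold function
    # over {0, 1}^n would do for this illustration.
    n = 10
    w = np.array([3, -1, 2, 1, -2, 1, 4, -3, 1, 2])
    theta = 2

    def perceptron(x):
        # Maps x to 1 iff its components satisfy the inequality w.x >= theta.
        return int(np.dot(w, x) >= theta)

    # Exact average ("center of gravity") of the true vectors, enumerating
    # all 2^n hypercube vertices -- feasible only for small n.
    true_vecs = np.array([x for x in itertools.product([0, 1], repeat=n)
                          if perceptron(x)])
    exact_avg = true_vecs.mean(axis=0)

    # Empirical estimate: draw uniform random binary vectors and average
    # those the perceptron classifies as true.
    rng = np.random.default_rng(0)
    samples = rng.integers(0, 2, size=(100_000, n))
    emp_avg = samples[samples @ w >= theta].mean(axis=0)

    print("exact average:     ", np.round(exact_avg, 3))
    print("empirical estimate:", np.round(emp_avg, 3))
    print("max additive error:", np.abs(exact_avg - emp_avg).max())

Exhaustive enumeration is only feasible for small n; for large n one has access only to the empirical estimate, which is why the precision required of that estimate translates into a sample-complexity bound.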

References

  1. M. Anthony and P.L. Bartlett (1999). Neural Network Learning: Theoretical Foundations. Cambridge University Press, Cambridge.
  2. M. Anthony, G. Brightwell and J. Shawe-Taylor (1995). On Specifying Boolean Functions by Labelled Examples. Discrete Applied Mathematics 61, pp. 1–25.
  3. S. Ben-David and E. Dichterman (1998). Learning with Restricted Focus of Attention. J. of Computer and System Sciences 56(3), pp. 277–298. (Earlier version in COLT ’93.)
  4. S. Ben-David and E. Dichterman (1994). Learnability with Restricted Focus of Attention Guarantees Noise-Tolerance. 5th International Workshop on Algorithmic Learning Theory, pp. 248–259.
  5. A. Birkendorf, E. Dichterman, J. Jackson, N. Klasner and H.U. Simon (1998). On Restricted-Focus-of-Attention Learnability of Boolean Functions. Machine Learning 30, pp. 89–123. (Earlier version in COLT ’96.)
  6. A. Blum, A. Frieze, R. Kannan and S. Vempala (1998). A Polynomial-time Algorithm for Learning Noisy Linear Threshold Functions. Algorithmica 22, pp. 35–52.
  7. A. Blumer, A. Ehrenfeucht, D. Haussler and M.K. Warmuth (1989). Learnability and the Vapnik-Chervonenkis Dimension. J. ACM 36, pp. 929–965.
  8. J. Bruck (1990). Harmonic Analysis of Polynomial Threshold Functions. SIAM Journal on Discrete Mathematics 3(2), pp. 168–177.
  9. C.K. Chow (1961). On the Characterization of Threshold Functions. Proc. Symp. on Switching Circuit Theory and Logical Design, pp. 34–38.
  10. E. Dichterman (1998). Learning with Limited Visibility. CDAM Research Reports Series LSE-CDAM-98-01, 44 pp.
  11. M.E. Dyer, A.M. Frieze, R. Kannan, A. Kapoor, L. Perkovic and U. Vazirani (1993). A Mildly Exponential Time Algorithm for Approximating the Number of Solutions to a Multidimensional Knapsack Problem. Combinatorics, Probability and Computing 2, pp. 271–284.
  12. T. Eiter, T. Ibaraki and K. Makino (1998). Decision Lists and Related Boolean Functions. Institut für Informatik, JLU Giessen (IFIG) Research Reports 9804.
  13. P.W. Goldberg (1999). Learning Fixed-dimension Linear Thresholds from Fragmented Data. Warwick CS dept. tech. report RR362, Sept. 1999; accepted for publication in Information and Computation as of Dec. 2000. A preliminary version appears in Proceedings of the 1999 Conference on Computational Learning Theory, pp. 88–99, July 1999.
  14. P. Kaszerman (1963). A Geometric Test-Synthesis Procedure for a Threshold Device. Information and Control 6(4), pp. 381–398.
  15. N. Littlestone (1988). Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm. Machine Learning 2, pp. 285–318.
  16. R.L. Rivest (1987). Learning Decision Lists. Machine Learning 2, pp. 229–246.
  17. F. Rosenblatt (1962). Principles of Neurodynamics. Spartan Books, New York.
  18. R.O. Winder (1969). Threshold Gate Approximations Based on Chow Parameters. IEEE Transactions on Computers 18, pp. 372–375.
  19. R.O. Winder (1971). Chow Parameters in Threshold Logic. Journal of the ACM 18(2), pp. 265–289.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Paul W. Goldberg
    1. Dept. of Computer Science, University of Warwick, Coventry, UK
