Improved Approximation of Linear Threshold Functions

computational complexity

Abstract

We prove two main results on how arbitrary linear threshold functions \({f(x) = {\rm sign}(w \cdot x - \theta)}\) over the n-dimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every n-variable threshold function f is \({\epsilon}\)-close to a threshold function depending only on \({{\rm Inf}(f)^2 \cdot {\rm poly}(1/\epsilon)}\) many variables, where \({{\rm Inf}(f)}\) denotes the total influence or average sensitivity of f. For the case of threshold functions, this is an exponential sharpening of Friedgut’s well-known theorem (Friedgut in Combinatorica 18(1):474–483, 1998), which states that every Boolean function f is \({\epsilon}\)-close to a function depending only on \({2^{O({\rm Inf}(f)/\epsilon)}}\) many variables. We complement this upper bound by showing that \({\Omega({\rm Inf}(f)^2 + 1/\epsilon^2)}\) many variables are required for \({\epsilon}\)-approximating threshold functions. Our second result is a proof that every n-variable threshold function is \({\epsilon}\)-close to a threshold function with integer weights at most \({{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}}\). This is an improvement, in the dependence on the error parameter \({\epsilon}\), on an earlier result of Servedio (Comput Complex 16(2):180–209, 2007) which gave a \({{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2})}}\) bound. Our improvement is obtained via a new proof technique that uses strong anti-concentration bounds from probability theory. The new technique also gives a simple and modular proof of the original result of Servedio (Comput Complex 16(2):180–209, 2007) and extends to give low-weight approximators for threshold functions under a range of probability distributions other than the uniform distribution.
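The quantities named in the abstract can be made concrete with a small brute-force sketch. It is not taken from the paper: the helper names `ltf`, `total_influence` and `distance` are hypothetical, the hypercube is taken to be \({\{-1,1\}^n}\) under the uniform distribution, and \({{\rm sign}(0)}\) is assumed to be +1. Enumeration costs \({2^n}\) evaluations, so it is only meant for tiny n.

```python
from itertools import product


def ltf(w, theta):
    """Return f(x) = sign(w . x - theta) on {-1, 1}^n.

    Assumption for this sketch: sign(0) is taken to be +1.
    """
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else -1
    return f


def total_influence(f, n):
    """Inf(f): sum over i of Pr_x[f(x) != f(x with coordinate i flipped)]."""
    flips = 0
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]
            flips += fx != f(y)  # True counts as 1
    return flips / 2 ** n


def distance(f, g, n):
    """Fraction of points of {-1, 1}^n on which f and g disagree."""
    return sum(f(x) != g(x) for x in product((-1, 1), repeat=n)) / 2 ** n


# Majority on three variables has total influence 3 * 1/2 = 1.5.
print(total_influence(ltf((1, 1, 1), 0), 3))

# sign(2*x1 + x2 + x3) disagrees with the one-variable threshold
# function sign(x1) only at x = (-1, 1, 1), i.e. on 1/8 of the cube.
f = ltf((2, 1, 1), 0)
g = ltf((1, 0, 0), 0)
print(distance(f, g, 3))
```

In this toy case f is \({\epsilon}\)-close to a threshold function on a single variable for any \({\epsilon \geq 1/8}\); the first result of the paper bounds how many variables such an approximator needs in general.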

References

  • M. Aizenman, F. Germinet, A. Klein & S. Warzel (2009). On Bernoulli decompositions for random variables, concentration bounds, and spectral localization. Probability Theory and Related Fields 143(1–2), 219–238.

  • N. Alon & V. H. Vu (1997). Anti-Hadamard Matrices, Coin Weighing, Threshold Gates, and Indecomposable Hypergraphs. Journal of Combinatorial Theory, Series A 79(1), 133–160.

  • M. Anthony, G. Brightwell & J. Shawe-Taylor (1995). On specifying Boolean functions using labelled examples. Discrete Applied Mathematics 61, 1–25.

  • W. Beckner (1975). Inequalities in Fourier analysis. Annals of Mathematics 102, 159–182.

  • R. Beigel (1994). Perceptrons, PP, and the Polynomial Hierarchy. Computational Complexity 4, 339–349.

  • A. Beimel & E. Weinreb (2006). Monotone circuits for monotone weighted threshold functions. Information Processing Letters 97(1), 12–18.

  • M. Bellare & J. Rompel (1994). Randomness-Efficient Oblivious Sampling. In 35th Annual Symposium on Foundations of Computer Science (FOCS), 276–287.

  • A. Bonami (1970). Étude des coefficients de Fourier des fonctions de \({L^p(G)}\). Ann. Inst. Fourier (Grenoble) 20(2), 335–402.

  • J. Bourgain (2002). On the distributions of the Fourier spectrum of Boolean functions. Israel J. Math. 131, 269–276.

  • J. Bruck & R. Smolensky (1992). Polynomial threshold functions, \({AC^0}\) functions and spectral norms. SIAM Journal on Computing 21(1), 33–42.

  • N. Bshouty & C. Tamon (1996). On the Fourier spectrum of monotone functions. Journal of the ACM 43(4), 747–770.

  • S. Chawla, R. Krauthgamer, R. Kumar, Y. Rabani & D. Sivakumar (2006). On the Hardness of Approximating Multicut and Sparsest-Cut. Computational Complexity 15(2), 94–114.

  • M. Dertouzos (1965). Threshold Logic: A Synthesis Approach. MIT Press, Cambridge, MA.

  • I. Diakonikolas, P. Gopalan, R. Jaiswal, R. Servedio & E. Viola (2010). Bounded independence fools halfspaces. SIAM J. on Comput. 39(8), 3441–3462.

  • I. Dinur & S. Safra (2005). On the Hardness of Approximating Minimum Vertex-Cover. Annals of Mathematics 162(1), 439–485.

  • W. Doeblin & P. Lévy (1936). Calcul des probabilités. Sur les sommes de variables aléatoires indépendantes à dispersions bornées inférieurement. C.R. Acad. Sci. 202, 2027–2029.

  • D. Dubhashi & A. Panconesi (2009). Concentration of measure for the analysis of randomized algorithms. Cambridge University Press, Cambridge.

  • P. Erdős (1945). On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc. 51, 898–902.

  • P. Erdős (1965). Extremal Problems in Number Theory. Proc. Sympos. Pure Math. 8, 181–189.

  • C.G. Esséen (1968). On the concentration function of a sum of independent random variables. Z. Wahrscheinlichkeitstheorie verw. Geb. 9, 290–308.

  • V. Feldman, P. Gopalan, S. Khot & A. Ponnuswami (2009). On Agnostic Learning of Parities, Monomials, and Halfspaces. SIAM J. Comput. 39(2), 606–645.

  • W. Feller (1968). An introduction to probability theory and its applications. John Wiley & Sons.

  • A. Fiat & D. Pechyony (2004). Decision Trees: More Theoretical Justification for Practical Algorithms. In Algorithmic Learning Theory, 15th International Conference (ALT 2004), 156–170.

  • E. Friedgut (1998). Boolean functions with low average sensitivity depend on few coordinates. Combinatorica 18(1), 474–483.

  • E. Friedgut & G. Kalai (1996). Every Monotone Graph Property has a Sharp Threshold. Proc. Amer. Math. Soc. 124, 2993–3002.

  • D. Glasner & R. A. Servedio (2009). Distribution-Free Testing Lower Bound for Basic Boolean Functions. Theory of Computing 5(1), 191–216.

  • P. Goldberg (2006). A Bound on the Precision Required to Estimate a Boolean Perceptron from its Average Satisfying Assignment. SIAM Journal on Discrete Mathematics 20, 328–343.

  • L. Gross (1975). Logarithmic Sobolev inequalities. Amer. J. Math. 97(4), 1061–1083.

  • G. Halász (1977). Estimates for the concentration function of combinatorial number theory and probability. Period. Math. Hungar. 8(3), 197–211.

  • S. Hampson & D. Volper (1986). Linear function neurons: structure and training. Biological Cybernetics 53, 203–217.

  • J. Håstad (1994). On the size of weights for threshold gates. SIAM Journal on Discrete Mathematics 7(3), 484–492.

  • J. Håstad (2005). Personal communication.

  • J. Hong (1987). On connectionist models. Technical Report 87-012, Dept. of Computer Science, University of Chicago.

  • J. Kahn, G. Kalai & N. Linial (1988). The influence of variables on boolean functions. In Proc. 29th Annual Symposium on Foundations of Computer Science (FOCS), 68–80.

  • A. Kalai (2007). Learning Nested Halfspaces and Uphill Decision Trees. In Proc. 20th Annual Conference on Learning Theory (COLT), 378–392.

  • A. Kalai, A. Klivans, Y. Mansour & R. Servedio (2008). Agnostically Learning Halfspaces. SIAM Journal on Computing 37(6), 1777–1805.

  • S. Khot & O. Regev (2008). Vertex cover might be hard to approximate to within \({2 - \epsilon}\). Journal of Computer & System Sciences 74(3), 335–349.

  • S. Khot & R. Saket (2011). On the hardness of learning intersections of two halfspaces. J. Comput. Syst. Sci. 77(1), 129–141.

  • A.N. Kolmogorov (1958-60). Sur les propriétés des fonctions de concentration de M. P. Lévy. Ann. Inst. H. Poincaré 16, 27–34.

  • R. Krauthgamer & Y. Rabani (2009). Improved Lower Bounds for Embeddings into \({L_1}\). SIAM J. Comput. 38(6), 2487–2498.

  • J. E. Littlewood & A. C. Offord (1943). On the number of real roots of a random algebraic equation. III. Rec. Math. [Mat. Sbornik] N.S. 12, 277–286.

  • K. Matulef, R. O’Donnell, R. Rubinfeld & R. Servedio (2010). Testing Halfspaces. SIAM J. on Comput. 39(5), 2004–2047.

  • S. Muroga (1971). Threshold logic and its applications. Wiley-Interscience, New York.

  • S. Muroga, I. Toda & S. Takasu (1961). Theory of majority switching elements. J. Franklin Institute 271, 376–418.

  • N. Nisan & M. Szegedy (1994). On the degree of Boolean functions as real polynomials. Comput. Complexity 4, 301–313.

  • R. O’Donnell & R. Servedio (2007). Learning monotone decision trees in polynomial time. SIAM J. on Comput. 37(3), 827–844.

  • R. O’Donnell & R. Servedio (2011). The Chow Parameters Problem. SIAM J. on Comput. 40(1), 165–199.

  • Y. Rabani & A. Shpilka (2010). Explicit construction of a small epsilon-net for linear threshold functions. SIAM J. on Comput. 39(8), 3501–3520.

  • P. Raghavan (1988). Learning in threshold networks. In First Workshop on Computational Learning Theory, 19–27.

  • B.A. Rogozin (1973). An integral-type estimate for concentration functions of sums of independent random variables. Dokl. Akad. Nauk SSSR 211, 1067–1070.

  • M. Rudelson & R. Vershynin (2008). The Littlewood-Offord Problem and invertibility of random matrices. Advances in Mathematics 218(2), 600–633.

  • A. Sárközy & E. Szemerédi (1965). Über ein Problem von Erdös und Moser. Acta Arithmetica 11, 205–208.

  • R. Servedio (2007). Every linear threshold function has a low-weight approximator. Comput. Complexity 16(2), 180–209.

  • A.A. Sherstov (2008). Halfspace Matrices. Comput. Complexity 17(2), 149–178.

  • I.S. Shiganov (1986). Refinement of the upper bound of the constant in the central limit theorem. Journal of Soviet Mathematics 2545–2550.

  • K.-Y. Siu, V.P. Roychowdhury & T. Kailath (1995). Discrete Neural Computation: A Theoretical Foundation. Prentice-Hall, Englewood Cliffs, NJ.

  • T. Tao & V. Vu (2006). Additive Combinatorics. Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.

  • T. Tao & V. Vu (2009). From the Littlewood-Offord problem to the Circular Law: universality of the spectral distribution of random matrices. Bull. Amer. Math. Soc. 46, 377–396.

  • V. Vu (2008). A Structural Approach to Subset-Sum Problems. In Building Bridges, volume 19 of Bolyai Society Mathematical Studies, 525–545. Springer, Berlin, Heidelberg.

  • A. Wigderson (1994). The amazing power of pairwise independence. In Proceedings of the 26th ACM Symposium on Theory of Computing, 645–647.

Author information

Corresponding author

Correspondence to Rocco A. Servedio.

About this article

Cite this article

Diakonikolas, I., Servedio, R.A. Improved Approximation of Linear Threshold Functions. comput. complex. 22, 623–677 (2013). https://doi.org/10.1007/s00037-012-0045-5

  • DOI: https://doi.org/10.1007/s00037-012-0045-5
