# Improved Approximation of Linear Threshold Functions

## Abstract

We prove two main results on how arbitrary linear threshold functions $${f(x) = {\rm sign}(w \cdot x - \theta)}$$ over the n-dimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every n-variable threshold function f is $${\epsilon}$$-close to a threshold function depending only on $${{\rm Inf}(f)^2 \cdot {\rm poly}(1/\epsilon)}$$ many variables, where $${{\rm Inf}(f)}$$ denotes the total influence (also called the average sensitivity) of f. For the case of threshold functions, this is an exponential sharpening of Friedgut’s well-known theorem (Friedgut in Combinatorica 18(1):474–483, 1998), which states that every Boolean function f is $${\epsilon}$$-close to a function depending only on $${2^{O({\rm Inf}(f)/\epsilon)}}$$ many variables. We complement this upper bound by showing that $${\Omega({\rm Inf}(f)^2 + 1/\epsilon^2)}$$ many variables are required for $${\epsilon}$$-approximating threshold functions. Our second result is a proof that every n-variable threshold function is $${\epsilon}$$-close to a threshold function with integer weights at most $${{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}}$$. This improves, in the dependence on the error parameter $${\epsilon}$$, on an earlier result of Servedio (Comput Complex 16(2):180–209, 2007), which gave a $${{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2})}}$$ bound. Our improvement is obtained via a new proof technique that uses strong anti-concentration bounds from probability theory. The new technique also gives a simple and modular proof of the original result of Servedio (Comput Complex 16(2):180–209, 2007) and extends to give low-weight approximators for threshold functions under a range of probability distributions other than the uniform distribution.
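
To make the quantities in the abstract concrete, the following is a minimal brute-force sketch (not taken from the paper) that evaluates a threshold function on $${\{-1,+1\}^n}$$, computes its total influence, and measures the distance between two functions under the uniform distribution. The helper names `ltf`, `total_influence`, and `distance`, the ±1 encoding of the hypercube, and the tie rule sign(0) = +1 are assumptions made for illustration. Full enumeration is only feasible for small n.

```python
from itertools import product


def ltf(w, theta):
    """Return f(x) = sign(w . x - theta) on {-1, +1}^n.

    Assumption for this sketch: sign(0) is taken to be +1.
    """
    def f(x):
        s = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if s - theta >= 0 else -1
    return f


def total_influence(f, n):
    """Inf(f): sum over coordinates i of Pr_x[f(x) != f(x with x_i flipped)],
    with x uniform over {-1, +1}^n (computed by full enumeration)."""
    flips = 0
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]  # flip coordinate i
            if f(y) != fx:
                flips += 1
    return flips / 2 ** n


def distance(f, g, n):
    """Fraction of inputs on which f and g disagree.

    f is eps-close to g when this value is at most eps.
    """
    disagreements = sum(f(x) != g(x) for x in product((-1, 1), repeat=n))
    return disagreements / 2 ** n


if __name__ == "__main__":
    majority3 = ltf((1, 1, 1), 0)
    print(total_influence(majority3, 3))

    f = ltf((2, 1, 1), 0)        # threshold function on 3 variables
    dictator = ltf((1, 0, 0), 0)  # depends on the first variable only
    print(distance(f, dictator, 3))
```

In this toy case, 3-bit majority has total influence 1.5, and the function with weights (2, 1, 1) is 0.125-close to a threshold function that depends on a single variable.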

## References

1. M. Aizenman, F. Germinet, A. Klein, S. Warzel (2009) On Bernoulli decompositions for random variables, concentration bounds, and spectral localization. Probability Theory and Related Fields 143(1–2): 219–238

2. N. Alon, V. H. Vu (1997) Anti-Hadamard Matrices, Coin Weighing, Threshold Gates, and Indecomposable Hypergraphs. Journal of Combinatorial Theory, Series A 79(1): 133–160

3. M. Anthony, G. Brightwell, J. Shawe-Taylor (1995) On specifying Boolean functions using labelled examples. Discrete Applied Mathematics 61: 1–25

4. W. Beckner (1975) Inequalities in Fourier analysis. Annals of Mathematics 102: 159–182

5. R. Beigel (1994). Perceptrons, PP, and the Polynomial Hierarchy. Computational Complexity 4, 339–349.

6. A. Beimel, E. Weinreb (2006) Monotone circuits for monotone weighted threshold functions. Information Processing Letters 97(1): 12–18

7. M. Bellare & J. Rompel (1994). Randomness-Efficient Oblivious Sampling. In 35th Annual Symposium on Foundations of Computer Science (FOCS), 276–287.

8. A. Bonami (1970) Étude des coefficients de Fourier des fonctions de $${L^p(G)}$$. Ann. Inst. Fourier (Grenoble) 20(2): 335–402

9. J. Bourgain (2002) On the distributions of the Fourier spectrum of Boolean functions. Israel J. Math. 131: 269–276

10. J. Bruck, R. Smolensky (1992) Polynomial threshold functions, $${AC^0}$$ functions and spectral norms. SIAM Journal on Computing 21(1): 33–42

11. N. Bshouty, C. Tamon (1996) On the Fourier spectrum of monotone functions. Journal of the ACM 43(4): 747–770

12. S. Chawla, R. Krauthgamer, R. Kumar, Y. Rabani, D. Sivakumar (2006) On the Hardness of Approximating Multicut and Sparsest-Cut. Computational Complexity 15(2): 94–114

13. M. Dertouzos (1965). Threshold Logic: A Synthesis Approach. MIT Press, Cambridge, MA.

14. I. Diakonikolas, P. Gopalan, R. Jaiswal, R. Servedio, E. Viola (2010) Bounded independence fools halfspaces. SIAM J. on Comput. 39(8): 3441–3462

15. I. Dinur, S. Safra (2005) On the Hardness of Approximating Minimum Vertex-Cover. Annals of Mathematics 162(1): 439–485

16. W. Doeblin, P. Lévy (1936) Calcul des probabilités. Sur les sommes de variables aléatoires indépendantes à dispersions bornées inférieurement. C.R. Acad. Sci. 202: 2027–2029

17. D. Dubhashi, A. Panconesi (2009) Concentration of measure for the analysis of randomized algorithms. Cambridge University Press, Cambridge

18. P. Erdős (1945) On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc. 51: 898–902

19. P. Erdős (1965) Extremal Problems in Number Theory. Proc. Sympos. Pure Math. 8: 181–189

20. C.G. Esséen (1968) On the concentration function of a sum of independent random variables. Z. Wahrscheinlichkeitstheorie verw. Geb. 9: 290–308

21. V. Feldman, P. Gopalan, S. Khot, A. Ponnuswami (2009) On Agnostic Learning of Parities, Monomials, and Halfspaces. SIAM J. Comput. 39(2): 606–645

22. W. Feller (1968). An introduction to probability theory and its applications. John Wiley & Sons.

23. A. Fiat & D. Pechyony (2004). Decision Trees: More Theoretical Justification for Practical Algorithms. In Algorithmic Learning Theory, 15th International Conference (ALT 2004), 156–170.

24. E. Friedgut (1998) Boolean functions with low average sensitivity depend on few coordinates. Combinatorica 18(1): 474–483

25. E. Friedgut, G. Kalai (1996) Every Monotone Graph Property has a Sharp Threshold. Proc. Amer. Math. Soc. 124: 2993–3002

26. D. Glasner, R. A. Servedio (2009). Distribution-Free Testing Lower Bound for Basic Boolean Functions. Theory of Computing 5(1), 191–216.

27. P. Goldberg (2006) A Bound on the Precision Required to Estimate a Boolean Perceptron from its Average Satisfying Assignment. SIAM Journal on Discrete Mathematics 20: 328–343

28. L. Gross (1975) Logarithmic Sobolev inequalities. Amer. J. Math. 97(4): 1061–1083

29. G. Halász (1977) Estimates for the concentration function of combinatorial number theory and probability. Period. Math. Hungar. 8(3): 197–211

30. S. Hampson, D. Volper (1986) Linear function neurons: structure and training. Biological Cybernetics 53: 203–217

31. J. Håstad (1994) On the size of weights for threshold gates. SIAM Journal on Discrete Mathematics 7(3): 484–492

32. J. Håstad (2005). Personal communication.

33. J. Hong (1987). On connectionist models. Technical Report 87-012, Dept. of Computer Science, University of Chicago.

34. J. Kahn, G. Kalai & N. Linial (1988). The influence of variables on boolean functions. In Proc. 29th Annual Symposium on Foundations of Computer Science (FOCS), 68–80.

35. A. Kalai (2007). Learning Nested Halfspaces and Uphill Decision Trees. In Proc. 20th Annual Conference on Learning Theory (COLT), 378–392.

36. A. Kalai, A. Klivans, Y. Mansour, R. Servedio (2008) Agnostically Learning Halfspaces. SIAM Journal on Computing 37(6): 1777–1805

37. S. Khot, O. Regev (2008) Vertex cover might be hard to approximate to within $${2 - \epsilon}$$. Journal of Computer & System Sciences 74(3): 335–349

38. S. Khot, R. Saket (2011) On the hardness of learning intersections of two halfspaces. J. Comput. Syst. Sci. 77(1): 129–141

39. A.N. Kolmogorov (1958-60). Sur les propriétés des fonctions de concentration de M. P. Lévy. Ann. Inst. H. Poincaré 16, 27–34.

40. R. Krauthgamer, Y. Rabani (2009) Improved Lower Bounds for Embeddings into $${L_1}$$. SIAM J. Comput. 38(6): 2487–2498

41. J. E. Littlewood & A. C. Offord (1943). On the number of real roots of a random algebraic equation. III. Rec. Math. [Mat. Sbornik] N.S. 12, 277–286.

42. K. Matulef, R. O’Donnell, R. Rubinfeld, R. Servedio (2010) Testing Halfspaces. SIAM J. on Comput. 39(5): 2004–2047

43. S. Muroga (1971) Threshold logic and its applications. Wiley-Interscience, New York

44. S. Muroga, I. Toda, S. Takasu (1961) Theory of majority switching elements. J. Franklin Institute 271: 376–418

45. N. Nisan, M. Szegedy (1994) On the degree of Boolean functions as real polynomials. Comput. Complexity 4: 301–313

46. R. O’Donnell, R. Servedio (2007) Learning monotone decision trees in polynomial time. SIAM J. on Comput. 37(3): 827–844

47. R. O’Donnell, R. Servedio (2011) The Chow Parameters Problem. SIAM J. on Comput. 40(1): 165–199

48. Y. Rabani, A. Shpilka (2010) Explicit construction of a small epsilon-net for linear threshold functions. SIAM J. on Comput. 39(8): 3501–3520

49. P. Raghavan (1988). Learning in threshold networks. In First Workshop on Computational Learning Theory, 19–27.

50. B.A. Rogozin (1973). An integral-type estimate for concentration functions of sums of independent random variables. Dokl. Akad. Nauk SSSR 211, 1067–1070.

51. M. Rudelson, R. Vershynin (2008) The Littlewood-Offord Problem and invertibility of random matrices. Advances in Mathematics 218(2): 600–633

52. A. Sárközy & E. Szemerédi (1965). Über ein Problem von Erdös und Moser. Acta Arithmetica 11, 205–208.

53. R. Servedio (2007) Every linear threshold function has a low-weight approximator. Comput. Complexity 16(2): 180–209

54. A.A. Sherstov (2008) Halfspace Matrices. Comput. Complexity 17(2): 149–178

55. I.S. Shiganov (1986). Refinement of the upper bound of the constant in the central limit theorem. Journal of Soviet Mathematics 2545–2550.

56. K.-Y. Siu, V.P. Roychowdhury & T. Kailath (1995). Discrete Neural Computation: A Theoretical Foundation. Prentice-Hall, Englewood Cliffs, NJ.

57. T. Tao & V. Vu (2006). Additive Combinatorics. Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.

58. T. Tao, V. Vu (2009) From the Littlewood-Offord problem to the Circular Law: universality of the spectral distribution of random matrices. Bull. Amer. Math. Soc. 46: 377–396

59. V. Vu (2008). A Structural Approach to Subset-Sum Problems. In Building Bridges, volume 19 of Bolyai Society Mathematical Studies, 525–545. Springer, Berlin, Heidelberg.

60. A. Wigderson (1994). The amazing power of pairwise independence. In Proceedings of the 26th ACM Symposium on Theory of Computing, 645–647.

## Author information

### Corresponding author

Correspondence to Rocco A. Servedio.

## Cite this article

Diakonikolas, I., Servedio, R.A. Improved Approximation of Linear Threshold Functions. Comput. Complex. 22, 623–677 (2013). https://doi.org/10.1007/s00037-012-0045-5

### Keywords

• Threshold functions
• total influence
• average sensitivity
• integer-weight approximation
• juntas

### Subject classification

• 68R99