Abstract
We prove two main results on how arbitrary linear threshold functions \({f(x) = {\rm sign}(w \cdot x - \theta)}\) over the n-dimensional Boolean hypercube can be approximated by simple threshold functions. Our first result shows that every n-variable threshold function f is \({\epsilon}\)-close to a threshold function depending only on \({{\rm Inf}(f)^2 \cdot {\rm poly}(1/\epsilon)}\) many variables, where \({{\rm Inf}(f)}\) denotes the total influence (also called the average sensitivity) of f. For the case of threshold functions, this is an exponential sharpening of Friedgut’s well-known theorem (Friedgut in Combinatorica 18(1):474–483, 1998), which states that every Boolean function f is \({\epsilon}\)-close to a function depending only on \({2^{O({\rm Inf}(f)/\epsilon)}}\) many variables. We complement this upper bound by showing that \({\Omega({\rm Inf}(f)^2 + 1/\epsilon^2)}\) many variables are required for \({\epsilon}\)-approximating threshold functions. Our second result is a proof that every n-variable threshold function is \({\epsilon}\)-close to a threshold function with integer weights at most \({{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2/3})}}\). This improves, in the dependence on the error parameter \({\epsilon}\), on an earlier result of Servedio (Comput Complex 16(2):180–209, 2007), which gave a \({{\rm poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^{2})}}\) bound. Our improvement is obtained via a new proof technique that uses strong anti-concentration bounds from probability theory. The new technique also gives a simple and modular proof of the original result of Servedio (Comput Complex 16(2):180–209, 2007) and extends to give low-weight approximators for threshold functions under a range of probability distributions other than the uniform distribution.
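The two quantities the abstract relies on, total influence and \({\epsilon}\)-closeness, can be made concrete for small n by exhaustive enumeration. The sketch below is only an illustration and not the paper's method: it assumes the hypercube is \({\{-1,+1\}^n}\) under the uniform distribution, takes sign(0) to be +1, and uses hypothetical helper names (`ltf`, `total_influence`, `distance`) and example weights that do not appear in the paper.

```python
from itertools import product


def ltf(weights, theta):
    """Return f(x) = sign(w . x - theta) on {-1, +1}^n.

    Assumption: sign(0) is taken to be +1.
    """
    def f(x):
        return 1 if sum(w * xi for w, xi in zip(weights, x)) - theta >= 0 else -1
    return f


def total_influence(f, n):
    """Inf(f): sum over i of Pr_x[f(x) != f(x with coordinate i flipped)]."""
    flips = 0
    for x in product((-1, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = x[:i] + (-x[i],) + x[i + 1:]  # flip coordinate i
            if f(y) != fx:
                flips += 1
    return flips / 2 ** n


def distance(f, g, n):
    """Pr_x[f(x) != g(x)] for x uniform on {-1, +1}^n; f is eps-close to g if this is <= eps."""
    disagreements = sum(f(x) != g(x) for x in product((-1, 1), repeat=n))
    return disagreements / 2 ** n


# Hypothetical example: a 5-variable threshold function and an
# approximator that depends only on the first three variables.
f = ltf([5, 3, 2, 1, 1], 0)
g = ltf([5, 3, 2, 0, 0], 0)
print(total_influence(f, 5), distance(f, g, 5))
```

Enumeration costs on the order of \({n \cdot 2^n}\) evaluations, so this is useful only as a sanity check on toy instances; the point is that `g` ignores two variables yet disagrees with `f` on only a small fraction of inputs.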
References
M. Aizenman, F. Germinet, A. Klein, S. Warzel (2009) On Bernoulli decompositions for random variables, concentration bounds, and spectral localization. Probability Theory and Related Fields 143(1–2): 219–238
N. Alon, V. H. Vu (1997) Anti-Hadamard Matrices, Coin Weighing, Threshold Gates, and Indecomposable Hypergraphs. Journal of Combinatorial Theory, Series A 79(1): 133–160
M. Anthony, G. Brightwell, J. Shawe-Taylor (1995) On specifying Boolean functions using labelled examples. Discrete Applied Mathematics 61: 1–25
W. Beckner (1975) Inequalities in Fourier analysis. Annals of Mathematics 102: 159–182
R. Beigel (1994). Perceptrons, PP, and the Polynomial Hierarchy. Computational Complexity 4, 339–349.
A. Beimel, E. Weinreb (2006) Monotone circuits for monotone weighted threshold functions. Information Processing Letters 97(1): 12–18
M. Bellare & J. Rompel (1994). Randomness-Efficient Oblivious Sampling. In 35th Annual Symposium on Foundations of Computer Science (FOCS), 276–287.
A. Bonami (1970) Étude des coefficients de Fourier des fonctions de \({L^p(G)}\). Ann. Inst. Fourier (Grenoble) 20(2): 335–402
J. Bourgain (2002) On the distributions of the Fourier spectrum of Boolean functions. Israel J. Math. 131: 269–276
J. Bruck, R. Smolensky (1992) Polynomial threshold functions, \({AC^0}\) functions and spectral norms. SIAM Journal on Computing 21(1): 33–42
N. Bshouty, C. Tamon (1996) On the Fourier spectrum of monotone functions. Journal of the ACM 43(4): 747–770
S. Chawla, R. Krauthgamer, R. Kumar, Y. Rabani, D. Sivakumar (2006) On the Hardness of Approximating Multicut and Sparsest-Cut. Computational Complexity 15(2): 94–114
M. Dertouzos (1965). Threshold Logic: A Synthesis Approach. MIT Press, Cambridge, MA.
I. Diakonikolas, P. Gopalan, R. Jaiswal, R. Servedio, E. Viola (2010) Bounded independence fools halfspaces. SIAM J. on Comput. 39(8): 3441–3462
I. Dinur, S. Safra (2005) On the Hardness of Approximating Minimum Vertex-Cover. Annals of Mathematics 162(1): 439–485
W. Doeblin, P. Lévy (1936) Calcul des probabilités. Sur les sommes de variables aléatoires indépendantes à dispersions bornées inférieurement. C.R. Acad. Sci. 202: 2027–2029
D. Dubhashi, A. Panconesi (2009) Concentration of measure for the analysis of randomized algorithms. Cambridge University Press, Cambridge
P. Erdös (1945) On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc. 51: 898–902
P. Erdös (1965) Extremal Problems in Number Theory. Proc. Sympos. Pure Math. 8: 181–189
C.G. Esséen (1968) On the concentration function of a sum of independent random variables. Z. Wahrscheinlichkeitstheorie verw. Geb. 9: 290–308
V. Feldman, P. Gopalan, S. Khot, A. Ponnuswami (2009) On Agnostic Learning of Parities, Monomials, and Halfspaces. SIAM J. Comput. 39(2): 606–645
W. Feller (1968). An introduction to probability theory and its applications. John Wiley & Sons.
A. Fiat & D. Pechyony (2004). Decision Trees: More Theoretical Justification for Practical Algorithms. In Algorithmic Learning Theory, 15th International Conference (ALT 2004), 156–170.
E. Friedgut (1998) Boolean functions with low average sensitivity depend on few coordinates. Combinatorica 18(1): 474–483
E. Friedgut, G. Kalai (1996) Every Monotone Graph Property has a Sharp Threshold. Proc. Amer. Math. Soc. 124: 2993–3002
D. Glasner, R. A. Servedio (2009). Distribution-Free Testing Lower Bound for Basic Boolean Functions. Theory of Computing 5(1), 191–216.
P. Goldberg (2006) A Bound on the Precision Required to Estimate a Boolean Perceptron from its Average Satisfying Assignment. SIAM Journal on Discrete Mathematics 20: 328–343
L. Gross (1975) Logarithmic Sobolev inequalities. Amer. J. Math. 97(4): 1061–1083
G. Halász (1977) Estimates for the concentration function of combinatorial number theory and probability. Period. Math. Hungar. 8(3): 197–211
S. Hampson, D. Volper (1986) Linear function neurons: structure and training. Biological Cybernetics 53: 203–217
J. Håstad (1994) On the size of weights for threshold gates. SIAM Journal on Discrete Mathematics 7(3): 484–492
J. Håstad (2005). Personal communication.
J. Hong (1987). On connectionist models. Technical Report 87-012, Dept. of Computer Science, University of Chicago.
J. Kahn, G. Kalai & N. Linial (1988). The influence of variables on boolean functions. In Proc. 29th Annual Symposium on Foundations of Computer Science (FOCS), 68–80.
A. Kalai (2007). Learning Nested Halfspaces and Uphill Decision Trees. In Proc. 20th Annual Conference on Learning Theory (COLT), 378–392.
A. Kalai, A. Klivans, Y. Mansour, R. Servedio (2008) Agnostically Learning Halfspaces. SIAM Journal on Computing 37(6): 1777–1805
S. Khot, O. Regev (2008) Vertex cover might be hard to approximate to within \({2 - \epsilon}\). Journal of Computer & System Sciences 74(3): 335–349
S. Khot, R. Saket (2011) On the hardness of learning intersections of two halfspaces. J. Comput. Syst. Sci. 77(1): 129–141
A.N. Kolmogorov (1958-60). Sur les propriétés des fonctions de concentration de M. P. Lévy. Ann. Inst. H. Poincaré 16, 27–34.
R. Krauthgamer, Y. Rabani (2009) Improved Lower Bounds for Embeddings into \({L_1}\). SIAM J. Comput. 38(6): 2487–2498
J. E. Littlewood & A. C. Offord (1943). On the number of real roots of a random algebraic equation. III. Rec. Math. [Mat. Sbornik] N.S. 12, 277–286.
K. Matulef, R. O’Donnell, R. Rubinfeld, R. Servedio (2010) Testing Halfspaces. SIAM J. on Comput. 39(5): 2004–2047
S. Muroga (1971) Threshold logic and its applications. Wiley-Interscience, New York.
S. Muroga, I. Toda, S. Takasu (1961) Theory of majority switching elements. J. Franklin Institute 271: 376–418
N. Nisan, M. Szegedy (1994) On the degree of Boolean functions as real polynomials. Comput. Complexity 4: 301–313
R. O’Donnell, R. Servedio (2007) Learning monotone decision trees in polynomial time. SIAM J. on Comput. 37(3): 827–844
R. O’Donnell, R. Servedio (2011) The Chow Parameters Problem. SIAM J. on Comput. 40(1): 165–199
Y. Rabani, A. Shpilka (2010) Explicit construction of a small epsilon-net for linear threshold functions. SIAM J. on Comput. 39(8): 3501–3520
P. Raghavan (1988). Learning in threshold networks. In First Workshop on Computational Learning Theory, 19–27.
B.A. Rogozin (1973). An integral-type estimate for concentration functions of sums of independent random variables. Dokl. Akad. Nauk SSSR 211, 1067–1070.
M. Rudelson, R. Vershynin (2008) The Littlewood-Offord Problem and invertibility of random matrices. Advances in Mathematics 218(2): 600–633
A. Sárközy & E. Szemerédi (1965). Über ein Problem von Erdös und Moser. Acta Arithmetica 11, 205–208.
R. Servedio (2007) Every linear threshold function has a low-weight approximator. Comput. Complexity 16(2): 180–209
A.A. Sherstov (2008) Halfspace Matrices. Comput. Complexity 17(2): 149–178
I.S. Shiganov (1986). Refinement of the upper bound of the constant in the central limit theorem. Journal of Soviet Mathematics 2545–2550.
K.-Y. Siu, V.P. Roychowdhury & T. Kailath (1995). Discrete Neural Computation: A Theoretical Foundation. Prentice-Hall, Englewood Cliffs, NJ.
T. Tao & V. Vu (2006). Additive Combinatorics. Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge.
T. Tao, V. Vu (2009) From the Littlewood-Offord problem to the Circular Law: universality of the spectral distribution of random matrices. Bull. Amer. Math. Soc. 46: 377–396
V. Vu (2008). A Structural Approach to Subset-Sum Problems. In Building Bridges, volume 19 of Bolyai Society Mathematical Studies, 525–545. Springer, Berlin, Heidelberg.
A. Wigderson (1994). The amazing power of pairwise independence. In Proceedings of the 26th ACM Symposium on Theory of Computing, 645–647.
Cite this article
Diakonikolas, I., Servedio, R.A. Improved Approximation of Linear Threshold Functions. comput. complex. 22, 623–677 (2013). https://doi.org/10.1007/s00037-012-0045-5
DOI: https://doi.org/10.1007/s00037-012-0045-5