Efficient Learning Algorithms Yield Circuit Lower Bounds

  • Lance Fortnow
  • Adam R. Klivans
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4005)


We describe a new approach for understanding the difficulty of designing efficient learning algorithms. We prove that the existence of an efficient learning algorithm for a circuit class C, in Angluin’s model of exact learning from membership and equivalence queries or in Valiant’s PAC model, yields a lower bound against C. More specifically, we prove that any subexponential-time, deterministic exact learning algorithm for C (from membership and equivalence queries) implies the existence of a function f in \(\mathrm{EXP}^{\mathrm{NP}}\) such that \(f \not\in C\). If C is PAC learnable with membership queries under the uniform distribution, or exactly learnable in randomized polynomial time, we prove that there exists a function \(f \in \mathrm{BPEXP}\) (the exponential-time analog of BPP) such that \(f \not\in C\).
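To make the query model concrete, the following toy sketch shows an exact learner that interacts with a teacher only through membership and equivalence queries. It is an illustration of the model, not a construction from the paper: the concept class (monotone conjunctions), the `Oracle` class, and the function names are all hypothetical choices, and the equivalence oracle is simulated by brute force over all inputs, which is feasible only for very small n.

```python
from itertools import product
from typing import Callable, Optional, Set, Tuple

Assignment = Tuple[int, ...]
BoolFn = Callable[[Assignment], int]


class Oracle:
    """Hypothetical teacher for a hidden target f: {0,1}^n -> {0,1}."""

    def __init__(self, n: int, target: BoolFn) -> None:
        self.n = n
        self._target = target
        self.membership_calls = 0
        self.equivalence_calls = 0

    def membership(self, x: Assignment) -> int:
        """Membership query: return the target's value on x."""
        self.membership_calls += 1
        return self._target(x)

    def equivalence(self, hypothesis: BoolFn) -> Optional[Assignment]:
        """Equivalence query: return None if the hypothesis agrees with the
        target everywhere, otherwise a counterexample (brute force, toy only)."""
        self.equivalence_calls += 1
        for x in product((0, 1), repeat=self.n):
            if hypothesis(x) != self._target(x):
                return x
        return None


def conjunction(relevant: Set[int]) -> BoolFn:
    """AND of the variables whose indices are in `relevant`."""
    idx = sorted(relevant)
    return lambda x: int(all(x[i] for i in idx))


def learn_monotone_conjunction(oracle: Oracle) -> Set[int]:
    """Exactly learn a monotone conjunction with n membership queries
    followed by one confirming equivalence query."""
    n = oracle.n
    relevant: Set[int] = set()
    for i in range(n):
        # All-ones input with bit i cleared: the target drops to 0
        # exactly when variable i appears in the conjunction.
        x = tuple(0 if j == i else 1 for j in range(n))
        if oracle.membership(x) == 0:
            relevant.add(i)
    if oracle.equivalence(conjunction(relevant)) is not None:
        raise ValueError("target is not a monotone conjunction")
    return relevant


oracle = Oracle(5, conjunction({0, 2, 3}))
learned = learn_monotone_conjunction(oracle)
print(sorted(learned), oracle.membership_calls, oracle.equivalence_calls)
```

In this toy setting the learner identifies the target exactly after a number of queries linear in n. The results above concern far richer classes C, where the existence of any comparably efficient learner would already imply a circuit lower bound.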

For C equal to polynomial-size, depth-two threshold circuits (i.e., neural networks with a polynomial number of hidden nodes), our result shows that efficient learning algorithms for this class would solve one of the most challenging open problems in computational complexity theory: proving the existence of a function in \(\mathrm{EXP}^{\mathrm{NP}}\) or \(\mathrm{BPEXP}\) that cannot be computed by circuits from C. We are not aware of any representation-independent hardness results for learning polynomial-size, depth-two neural networks.

Our approach uses the framework of the breakthrough result due to Kabanets and Impagliazzo showing that derandomizing BPP yields non-trivial circuit lower bounds.


Keywords (added by machine, not by the authors): Arithmetic Circuit, Membership Query, Arithmetic Formula, Equivalence Query, Circuit Class




  1. Pitt, L., Valiant, L.: Computational limitations on learning from examples. Journal of the ACM 35, 965–984 (1988)
  2. Gold, E.M.: Complexity of automaton identification from given data. Information and Control 37, 302–320 (1978)
  3. Alekhnovich, Braverman, Feldman, Klivans, Pitassi: Learnability and automatizability. In: IEEE Symposium on Foundations of Computer Science (FOCS) (2004)
  4. Kearns, M., Valiant, L.: Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM 41, 67–95 (1994)
  5. Kharitonov, M.: Cryptographic hardness of distribution-specific learning. In: Proceedings of the Twenty-Fifth Annual Symposium on Theory of Computing, pp. 372–381 (1993)
  6. Jackson, J., Klivans, A., Servedio, R.: Learnability beyond \(AC^0\). In: Proceedings of the 34th ACM Symposium on Theory of Computing (2002)
  7. Kabanets, V., Impagliazzo, R.: Derandomizing polynomial identity tests means proving circuit lower bounds. In: Proceedings of the 35th ACM Symposium on the Theory of Computing, pp. 355–364. ACM, New York (2003)
  8. Impagliazzo, R., Wigderson, A.: Randomness vs. time: Derandomization under a uniform assumption. Journal of Computer and System Sciences 63, 672–688 (2001)
  9. Valiant, L.: A theory of the learnable. Communications of the ACM 27, 1134–1142 (1984)
  10. Angluin, D.: Queries and concept learning. Machine Learning 2, 319–342 (1988)
  11. Buhrman, H., Fortnow, L., Thierauf, T.: Nonrelativizing separations. In: Proceedings of the 13th IEEE Conference on Computational Complexity, pp. 8–12. IEEE, New York (1998)
  12. Miltersen, P.B., Vinodchandran, N.V., Watanabe, O.: Super-polynomial versus half-exponential circuit size in the exponential hierarchy. In: Asano, T., et al. (eds.) COCOON 1999. LNCS, vol. 1627, p. 210. Springer, Heidelberg (1999)
  13. Hartmanis, J., Stearns, R.: On the computational complexity of algorithms. Transactions of the American Mathematical Society 117, 285–306 (1965)
  14. Kannan, R.: Circuit-size lower bounds and non-reducibility to sparse sets. Information and Control 55, 40–56 (1982)
  15. Valiant, L.: The complexity of computing the permanent. Theoretical Computer Science 8, 189–201 (1979)
  16. Toda, S.: PP is as hard as the polynomial-time hierarchy. SIAM Journal on Computing 20, 865–877 (1991)
  17. Lipton, R.: New directions in testing. In: Feigenbaum, J., Merritt, M. (eds.) Distributed Computing and Cryptography. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 2, pp. 191–202. American Mathematical Society, Providence (1991)
  18. Beaver, D., Feigenbaum, J.: Hiding instances in multioracle queries. In: Choffrut, C., Lengauer, T. (eds.) STACS 1990. LNCS, vol. 415, pp. 37–48. Springer, Heidelberg (1990)
  19. Buhrman, H., Homer, S.: Superpolynomial circuits, almost sparse oracles and the exponential hierarchy. In: Shyamasundar, R.K. (ed.) FSTTCS 1992. LNCS, vol. 652, pp. 116–127. Springer, Heidelberg (1992)
  20. Babai, L., Fortnow, L., Nisan, N., Wigderson, A.: BPP has subexponential time simulations unless EXPTIME has publishable proofs. Computational Complexity 3, 307–318 (1993)
  21. Beimel, A., Bergadano, F., Bshouty, N., Kushilevitz, E., Varricchio, S.: On the applications of multiplicity automata in learning. In: Proceedings of the Thirty-Seventh Annual Symposium on Foundations of Computer Science, pp. 349–358 (1996)
  22. Klivans, Shpilka: Learning arithmetic circuits via partial derivatives. In: Proceedings of the Workshop on Computational Learning Theory (COLT). Morgan Kaufmann Publishers, San Francisco (2003)
  23. Bshouty, Hancock, Hellerstein: Learning arithmetic read-once formulas. SIAM Journal on Computing 24 (1995)
  24. Bshouty: On interpolating arithmetic read-once formulas with exponentiation. Journal of Computer and System Sciences 56 (1998)
  25. Linial, N., Mansour, Y., Nisan, N.: Constant depth circuits, Fourier transform, and learnability. Journal of the ACM 40, 607–620 (1993)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lance Fortnow (1)
  • Adam R. Klivans (2)
  1. U. Chicago Comp. Sci., Chicago
  2. UT-Austin Comp. Sci., Austin
