Designs, Codes and Cryptography, Volume 65, Issue 3, pp 383–403

Boolean autoencoders and hypercube clustering complexity

Abstract

We introduce and study the properties of Boolean autoencoder circuits. In particular, we show that the Boolean autoencoder circuit problem is equivalent to a clustering problem on the hypercube. We show that clustering m binary vectors on the n-dimensional hypercube into k clusters is NP-hard as soon as the number of clusters scales like \(m^\epsilon\) (\(\epsilon > 0\)), and thus the general Boolean autoencoder problem is also NP-hard. We prove that the linear Boolean autoencoder circuit problem is also NP-hard, and so are several related problems, such as subspace identification over finite fields, linear regression over finite fields, even/odd set intersections, and parity circuits. The emerging picture is that autoencoder optimization is NP-hard in the general case, with a few notable exceptions, including the linear case over infinite fields and the Boolean case with a fixed-size hidden layer. However, learning can be tackled by approximate algorithms, including alternating optimization, suggesting a new class of learning algorithms for deep networks, including deep networks of threshold gates or artificial neurons.
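To make the clustering formulation concrete, the following is a minimal sketch (not code from the paper) of the alternating-optimization heuristic the abstract mentions, applied to clustering on the hypercube: each binary vector is assigned to its nearest centroid in Hamming distance, and each centroid is then re-estimated as the bitwise majority vote of its cluster. The function names, the initialization from data points, and the tie-breaking rule are all illustrative assumptions.

```python
import random

def hamming(u, v):
    """Hamming distance between two equal-length binary tuples."""
    return sum(a != b for a, b in zip(u, v))

def majority_centroid(cluster, n):
    """Bitwise majority vote over a non-empty cluster; ties break to 0."""
    return tuple(int(2 * sum(v[i] for v in cluster) > len(cluster)) for i in range(n))

def hypercube_cluster(data, k, iters=100, seed=0):
    """Alternating optimization for clustering m binary vectors on the
    n-dimensional hypercube into k clusters: assign each vector to its
    nearest centroid (Hamming distance), then recompute each centroid by
    bitwise majority. Converges to a local optimum only."""
    rng = random.Random(seed)
    n = len(data[0])
    centroids = rng.sample(data, k)  # initialize centroids from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:  # assignment step
            j = min(range(k), key=lambda c: hamming(v, centroids[c]))
            clusters[j].append(v)
        # update step: an empty cluster keeps its old centroid
        new = [majority_centroid(cl, n) if cl else centroids[j]
               for j, cl in enumerate(clusters)]
        if new == centroids:  # fixed point reached
            break
        centroids = new
    return centroids, clusters

data = [(0, 0, 0, 1), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 0)]
centroids, clusters = hypercube_cluster(data, k=2)
print(centroids)  # one majority centroid per Hamming-close group
```

In the clustering–autoencoder correspondence the abstract describes, the index of the centroid a vector maps to plays the role of the hidden code, and the centroid itself is the reconstruction. The sketch is only a local heuristic; per the abstract, exact optimization is NP-hard once k grows like \(m^\epsilon\).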

Keywords

Autoencoders · Clustering · Boolean circuits · Computational complexity

Mathematics Subject Classification (2010)

68T05 

Acknowledgments

Work supported in part by grants NSF IIS-0513376, NIH LM010235, and NIH NLM T15 LM07443 to PB.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Copyright information

© The Author(s) 2012

Authors and Affiliations

Department of Computer Science, University of California, Irvine, USA
