Abstract
We introduce and study the properties of Boolean autoencoder circuits. In particular, we show that the Boolean autoencoder circuit problem is equivalent to a clustering problem on the hypercube. We show that clustering m binary vectors on the n-dimensional hypercube into k clusters is NP-hard as soon as the number of clusters scales like \(m^\epsilon\) (for any \(\epsilon > 0\)), and thus the general Boolean autoencoder problem is also NP-hard. We prove that the linear Boolean autoencoder circuit problem is also NP-hard, as are several related problems, including subspace identification over finite fields, linear regression over finite fields, even/odd set intersections, and parity circuits. The emerging picture is that autoencoder optimization is NP-hard in the general case, with a few notable exceptions, including the linear case over infinite fields and the Boolean case with a hidden layer of fixed size. However, learning can be tackled by approximate algorithms, including alternating optimization, suggesting a new class of learning algorithms for deep networks, including deep networks of threshold gates or artificial neurons.
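The clustering equivalence suggests how the alternating-optimization approach mentioned above can work in practice. The following is a minimal sketch, not the paper's exact procedure: the function name hamming_kmedians and its parameters are illustrative. It alternates between assigning each binary vector to its nearest center under Hamming distance and updating each center by coordinate-wise majority vote, which minimizes the total within-cluster Hamming distance coordinate by coordinate.

```python
import numpy as np

def hamming_kmedians(X, k, n_iter=50, seed=0):
    """Cluster m binary vectors on the n-hypercube into k clusters.

    X: (m, n) array of 0/1 entries. Returns (centers, labels).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Initialize centers from k distinct data points.
    centers = X[rng.choice(m, size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assignment step: nearest center in Hamming distance.
        dists = (X[:, None, :] != centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: coordinate-wise majority vote per cluster.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centers[j] = (members.mean(axis=0) >= 0.5).astype(X.dtype)
        if np.array_equal(new_centers, centers):
            break  # local optimum reached; no assignment can improve
        centers = new_centers
    return centers, labels
```

In the autoencoder reading of this sketch, a hidden layer of p Boolean units indexes up to k = 2^p centers: the encoder maps each input to the code of its cluster, the decoder maps each code back to its center, and the reconstruction error is the within-cluster Hamming distortion that the alternation drives down.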
Acknowledgments
Work supported in part by grants NSF IIS-0513376, NIH LM010235, and NIH NLM T15 LM07443 to PB.
Additional information
This is one of several papers published together in Designs, Codes and Cryptography on the special topic: “Combinatorics – A Special Issue Dedicated to the 65th Birthday of Richard Wilson”.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Baldi, P. Boolean autoencoders and hypercube clustering complexity. Des. Codes Cryptogr. 65, 383–403 (2012). https://doi.org/10.1007/s10623-012-9719-x