Designs, Codes and Cryptography, Volume 65, Issue 3, pp 383-403
Date: 24 Jul 2012

Boolean autoencoders and hypercube clustering complexity


We introduce and study the properties of Boolean autoencoder circuits. In particular, we show that the Boolean autoencoder circuit problem is equivalent to a clustering problem on the hypercube. We show that clustering m binary vectors on the n-dimensional hypercube into k clusters is NP-hard as soon as the number of clusters scales like \(m^\epsilon\) (\(\epsilon > 0\)), and thus the general Boolean autoencoder problem is also NP-hard. We prove that the linear Boolean autoencoder circuit problem is also NP-hard, and so are several related problems, such as subspace identification over finite fields, linear regression over finite fields, even/odd set intersections, and parity circuits. The emerging picture is that autoencoder optimization is NP-hard in the general case, with a few notable exceptions, including the linear cases over infinite fields and the Boolean case with a fixed-size hidden layer. However, learning can be tackled by approximate algorithms, including alternate optimization, suggesting a new class of learning algorithms for deep networks, including deep networks of threshold gates or artificial neurons.
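To make the clustering view and the idea of alternate optimization concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the algorithm from the paper: distortion is assumed to be Hamming distance, the "encoder" is taken to be nearest-center assignment, the "decoder" is taken to be the coordinate-wise majority vector of each cluster, and the names `hamming`, `majority`, and `cluster_hypercube` are hypothetical. Like any such heuristic, it only finds a local optimum, which is consistent with the NP-hardness of the exact problem.

```python
import random


def hamming(a, b):
    """Number of coordinates in which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def majority(vectors, n):
    """Coordinate-wise majority vote; ties are broken toward 0 (an arbitrary choice)."""
    return tuple(
        1 if 2 * sum(v[i] for v in vectors) > len(vectors) else 0
        for i in range(n)
    )


def assign(points, centers):
    """Encoder step: map each point to the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: hamming(p, centers[j]))
        for p in points
    ]


def cluster_hypercube(points, k, iters=50, seed=0):
    """Alternate between assignments and centers until the centers stop changing.

    Heuristic sketch only: it returns a local optimum of the total
    Hamming distortion, not a guaranteed global one.
    """
    points = [tuple(p) for p in points]
    n = len(points[0])
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed initialization: k data points
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Decoder step: the majority vector minimizes Hamming cost within a cluster.
            new_centers.append(majority(members, n) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    cost = sum(hamming(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, cost


if __name__ == "__main__":
    data = [
        (0, 0, 0, 0, 0, 0), (1, 0, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0),
        (1, 1, 1, 1, 1, 1), (1, 1, 1, 1, 1, 0), (1, 1, 1, 1, 0, 1),
    ]
    centers, labels, cost = cluster_hypercube(data, k=2)
    print(centers, labels, cost)
```

In this reading, the cluster index plays the role of the hidden-layer code and the center plays the role of the reconstructed output; each of the two steps can only lower or preserve the total distortion, which is why the alternation terminates.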

This is one of several papers published together in Designs, Codes and Cryptography on the special topic: “Combinatorics – A Special Issue Dedicated to the 65th Birthday of Richard Wilson”.