T.W. Anderson, An Introduction to Multivariate Statistical Analysis, John Wiley, New York, 1958.
H.C. Andrews, Introduction to Mathematical Techniques in Pattern Recognition, Wiley-Interscience, New York, 1972.
Z. Roth and Y. Baram, “Multi-dimensional density shaping by sigmoids,” IEEE Trans. on Neural Networks, Vol. 7, No. 5, pp. 1291-1298, September 1996.
R. Battiti, “Using mutual information for selecting features in supervised neural net learning,” IEEE Trans. on Neural Networks, Vol. 5, No. 4, pp. 537-550, July 1994.
A.J. Bell and T.J. Sejnowski, “An information maximization approach to blind separation and blind deconvolution,” Neural Computation, Vol. 7, No. 6, pp. 1129-1159, 1995.
M. Bichsel and P. Seitz, “Minimum class entropy: a maximum information approach to layered networks,” Neural Networks, Vol. 2, pp. 133-141, 1989.
C.M. Bishop, Neural Networks for Pattern Recognition, Clarendon Press, Oxford, 1995.
K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, San Diego, 1990.
P.G. Hoel, Introduction to Mathematical Statistics, Wiley, New York, 1984.
R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
R. Linsker, “How to generate ordered maps by maximizing the mutual information between input and output signals,” Neural Computation, Vol. 1, No. 3, pp. 402-411, 1989.
G.H. Golub and C.F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, 1983.
A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1984.