Learning with tensors: a framework based on convex optimization and spectral regularization
We present a framework based on convex optimization and spectral regularization for learning when the feature observations are multidimensional arrays (tensors). We give a mathematical characterization of spectral penalties for tensors and analyze a unifying class of convex optimization problems, for which we present a provably convergent and scalable template algorithm. We then specialize this class of problems to learning in both the transductive and the inductive setting. In the transductive case one has an input data tensor with missing features and, possibly, a partially observed matrix of labels; the goal is both to infer the missing input features and to predict the missing labels. In the inductive case the goal is to determine a model for each learning task, to be used for out-of-sample prediction; each training pair consists of a multidimensional array and a set of labels, each of which corresponds to one of several related but distinct tasks. In either case the proposed technique exploits precise low multilinear rank assumptions on the unknown multidimensional arrays; regularization is based on composite spectral penalties and connects to the concept of Multilinear Singular Value Decomposition. As a by-product of the tensor-based formalism, our approach handles the multi-task case in a natural way. Empirical studies demonstrate the merits of the proposed methods.
Keywords: Spectral regularization; Matrix and tensor completion; Tucker decomposition; Multilinear rank; Transductive and inductive learning; Multi-task learning
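The notions of multilinear rank and spectral penalty used in the abstract can be illustrated with a short NumPy sketch. The abstract does not spell out the exact form of the composite spectral penalties, so this sketch assumes a commonly used convex surrogate: the sum of the nuclear norms of the mode-n unfoldings of the tensor. The function names (`unfold`, `multilinear_rank`, `sum_of_nuclear_norms`) and the example tensor are hypothetical and are not taken from the paper.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the remaining axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def multilinear_rank(tensor, tol=1e-10):
    """Tuple of matrix ranks of the mode-n unfoldings (one entry per mode)."""
    return tuple(
        int(np.linalg.matrix_rank(unfold(tensor, n), tol=tol))
        for n in range(tensor.ndim)
    )


def sum_of_nuclear_norms(tensor):
    """Assumed penalty: sum over modes of the nuclear norm of each unfolding."""
    return float(
        sum(np.linalg.norm(unfold(tensor, n), ord="nuc") for n in range(tensor.ndim))
    )


# Illustrative rank-one tensor a o b o c with ||a|| = 5, ||b|| = 1, ||c|| = 2.
a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 2.0])
X = np.einsum("i,j,k->ijk", a, b, c)

rank = multilinear_rank(X)          # every unfolding of a rank-one tensor has rank 1
penalty = sum_of_nuclear_norms(X)   # each unfolding has one singular value, 5 * 1 * 2
```

For this rank-one example each unfolding has a single nonzero singular value equal to the product of the three vector norms, so the penalty is three times that product. A low multilinear rank means that every unfolding is a low-rank matrix, which is what a nuclear-norm-based penalty of this kind encourages.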