An Analysis of Tensor Models for Learning on Structured Data

  • Maximilian Nickel
  • Volker Tresp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8189)


While tensor factorizations have become increasingly popular for learning on various forms of structured data, few theoretical results exist on the generalization abilities of these methods. Here, we discuss the tensor product as a principled way to represent structured data in vector spaces for machine learning tasks. By extending known bounds for matrix factorizations, we derive generalization error bounds for the tensor case. Furthermore, we analyze analytically and experimentally how tensor factorization behaves when applied to over- and understructured representations, for instance, when two-way tensor factorization, i.e., matrix factorization, is applied to three-way tensor data.
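The "understructured" case mentioned above can be made concrete with a small NumPy sketch: a three-way tensor is flattened into a matrix and then approximated with a two-way (matrix) factorization. This is an illustrative example only, not the paper's experimental setup; the tensor shapes, the choice of unfolding (a C-order mode-1 flattening), and the use of a truncated SVD as the matrix factorization are all assumptions made for the illustration.

```python
import numpy as np

# Illustrative only: a small random three-way tensor X, e.g. entity x entity x
# relation data as in multi-relational learning. Shapes are arbitrary.
rng = np.random.default_rng(0)
n, m, k = 4, 5, 3
X = rng.standard_normal((n, m, k))

# "Understructured" representation: flatten the three-way tensor into a matrix
# (one possible mode-1 unfolding, here via a C-order reshape), then apply a
# two-way factorization, i.e. a rank-r matrix approximation via truncated SVD.
X1 = X.reshape(n, m * k)          # unfolded tensor, shape (n, m*k)
r = 2                             # target rank of the matrix factorization
U, s, Vt = np.linalg.svd(X1, full_matrices=False)
X1_hat = (U[:, :r] * s[:r]) @ Vt[:r, :]

# Frobenius-norm reconstruction error of the rank-r approximation; the three-way
# interaction structure of X is not modeled explicitly by this two-way model.
err = np.linalg.norm(X1 - X1_hat)
print(X1_hat.shape, float(err))
```

By the Eckart-Young theorem, the truncated SVD gives the best rank-r approximation of the unfolded matrix; the point of the paper's analysis is how such a two-way model generalizes when the data itself is genuinely three-way.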


Keywords: Tensor Factorization · Structured Data · Generalization Error Bounds



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Maximilian Nickel (1)
  • Volker Tresp (2)
  1. Ludwig Maximilian University, Munich, Germany
  2. Siemens AG, Corporate Technology, Munich, Germany
