Knowledge and Information Systems, Volume 34, Issue 2, pp 243–265

Non-negative Tri-factor tensor decomposition with applications

  • Zhong-Yuan Zhang
  • Tao Li
  • Chris Ding
Regular Paper


Non-negative matrix factorization (NMF) mainly focuses on discovering the hidden patterns behind a series of vectors in two-way data. Here, we propose a tensor decomposition model, Tri-ONTD, to analyze three-way data. The model aims to discover the common characteristics of a series of matrices and, at the same time, to identify the peculiarity of each matrix, thus enabling the discovery of the cluster structure in the data. In particular, the Tri-ONTD model performs adaptive dimension reduction for tensors, as it integrates subspace identification (i.e., finding a low-dimensional representation with a common basis for a set of matrices) and clustering into a single process. The Tri-ONTD model can also be regarded as an extension of the Tri-factor NMF model. We present the detailed optimization algorithm and provide a convergence proof. Experimental results on real-world datasets demonstrate the effectiveness of the proposed method in author clustering, image clustering, and image reconstruction. In addition, the results produced by the model have sparse and localized structures.
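
The idea of a common basis shared by a set of matrices, plus a per-matrix part, can be illustrated with a small sketch. The code below is not the Tri-ONTD algorithm from the paper, whose exact objective and update rules are not stated in this abstract. It is a minimal sketch under stated assumptions: each slice X_k of a non-negative three-way array is approximated as F S_k G^T, with shared non-negative factors F and G and a slice-specific non-negative core S_k, fitted with standard Lee and Seung style multiplicative updates for the squared error. The orthogonality constraints suggested by the model's name are omitted, and the function name `tri_factor_slices` and all of its parameters are hypothetical.

```python
import numpy as np

def tri_factor_slices(X, r1, r2, n_iter=200, eps=1e-9, seed=0):
    """Fit X[k] ~= F @ S[k] @ G.T with F, S, G >= 0 (illustrative sketch only).

    X is a non-negative array of shape (K, n, m). F (n, r1) and G (m, r2)
    are shared by all slices; S (K, r1, r2) holds one core per slice.
    """
    rng = np.random.default_rng(seed)
    K, n, m = X.shape
    F = rng.random((n, r1)) + eps
    G = rng.random((m, r2)) + eps
    S = rng.random((K, r1, r2)) + eps

    def sq_error():
        # Sum over slices of the squared Frobenius reconstruction error.
        return float(np.sum((X - F @ S @ G.T) ** 2))

    errors = [sq_error()]
    for _ in range(n_iter):
        St = S.transpose(0, 2, 1)  # per-slice transpose, shape (K, r2, r1)

        # Shared row factor F: numerator sum_k X_k G S_k^T.
        num = (X @ (G @ St)).sum(axis=0)
        den = F @ (S @ (G.T @ G) @ St).sum(axis=0)
        F *= num / (den + eps)

        # Shared column factor G: numerator sum_k X_k^T F S_k.
        num = ((X.transpose(0, 2, 1) @ F) @ S).sum(axis=0)
        den = G @ (St @ (F.T @ F) @ S).sum(axis=0)
        G *= num / (den + eps)

        # Slice-specific cores S_k: numerator F^T X_k G.
        num = F.T @ X @ G
        den = (F.T @ F) @ S @ (G.T @ G)
        S *= num / (den + eps)

        errors.append(sq_error())
    return F, S, G, errors

# Usage on synthetic data built from known non-negative factors.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.random((5, 20, 3)) @ demo_rng.random((5, 3, 2)) @ demo_rng.random((2, 15))
F_demo, S_demo, G_demo, err_demo = tri_factor_slices(X_demo, r1=3, r2=2)
print("initial error:", err_demo[0], "final error:", err_demo[-1])
```

In this sketch the rows of F and G play the role of shared (soft) cluster indicators for the two modes of each matrix, while each S_k captures what is specific to slice k. Multiplicative updates are used only because they keep all factors non-negative without a projection step.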


Keywords: Non-negative tensor decomposition; Non-negative matrix factorization




Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  1. School of Statistics, Central University of Finance and Economics, Beijing, People’s Republic of China
  2. School of Computing and Information Sciences, Florida International University, Miami, USA
  3. Computer Science and Engineering Department, University of Texas at Arlington, Arlington, USA
