Regularized nonnegative shared subspace learning

Data Mining and Knowledge Discovery

Abstract

Joint modeling of related data sources can improve data mining tasks such as transfer learning, multitask clustering, and information retrieval. However, the diversity among data sources may outweigh the advantages of joint modeling and degrade performance. To address this, we propose a regularized shared subspace learning framework that exploits the mutual strengths of related data sources while remaining immune to the variability of each source. This is achieved by imposing a mutual orthogonality constraint on the constituent subspaces, which segregates the common patterns from the source-specific patterns and thus avoids performance degradation. Our approach is rooted in nonnegative matrix factorization and extends it to enable joint analysis of related data sources. Experiments on three real-world data sets, for both retrieval and clustering applications, demonstrate the benefits of regularization and validate the effectiveness of the model. The proposed solution provides a formal framework for jointly analyzing related data sources and is therefore applicable to a wider context in data mining.
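The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the objective or the update rules, so everything below is an assumption made for illustration. Two nonnegative data matrices `X` and `Y` (features in rows) are factorized as `X ≈ [W, U] Hx` and `Y ≈ [W, V] Hy`, where `W` is a shared basis and `U`, `V` are source-specific bases. A penalty `lam * (||WᵀU||² + ||WᵀV||² + ||UᵀV||²)` stands in for the mutual orthogonality constraint. The function name `shared_subspace_nmf`, its parameters, and the Lee-Seung-style multiplicative updates are all hypothetical choices.

```python
import numpy as np


def shared_subspace_nmf(X, Y, k_shared, k_x, k_y, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Illustrative sketch (assumed objective, not the paper's exact method).

    Minimizes ||X - [W,U]Hx||_F^2 + ||Y - [W,V]Hy||_F^2
              + lam * (||W^T U||_F^2 + ||W^T V||_F^2 + ||U^T V||_F^2)
    over nonnegative W, U, V, Hx, Hy using multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    W = rng.random((m, k_shared))   # basis shared by both sources
    U = rng.random((m, k_x))        # basis specific to X
    V = rng.random((m, k_y))        # basis specific to Y
    Hx = rng.random((k_shared + k_x, X.shape[1]))
    Hy = rng.random((k_shared + k_y, Y.shape[1]))

    def objective():
        fit = (np.linalg.norm(X - np.hstack([W, U]) @ Hx) ** 2
               + np.linalg.norm(Y - np.hstack([W, V]) @ Hy) ** 2)
        penalty = (np.linalg.norm(W.T @ U) ** 2
                   + np.linalg.norm(W.T @ V) ** 2
                   + np.linalg.norm(U.T @ V) ** 2)
        return fit + lam * penalty

    history = [objective()]
    for _ in range(n_iter):
        # Update the encodings with the bases held fixed (standard NMF step).
        Ax, Ay = np.hstack([W, U]), np.hstack([W, V])
        Hx *= (Ax.T @ X) / (Ax.T @ Ax @ Hx + eps)
        Hy *= (Ay.T @ Y) / (Ay.T @ Ay @ Hy + eps)
        Hxw, Hxu = Hx[:k_shared], Hx[k_shared:]
        Hyw, Hyv = Hy[:k_shared], Hy[k_shared:]

        # Shared basis: fit terms from both sources plus the penalty gradient.
        W *= (X @ Hxw.T + Y @ Hyw.T) / (
            Ax @ Hx @ Hxw.T + Ay @ Hy @ Hyw.T
            + lam * (U @ (U.T @ W) + V @ (V.T @ W)) + eps)

        # Source-specific bases: each is penalized for overlapping the others.
        Ax = np.hstack([W, U])
        U *= (X @ Hxu.T) / (
            Ax @ Hx @ Hxu.T + lam * (W @ (W.T @ U) + V @ (V.T @ U)) + eps)
        Ay = np.hstack([W, V])
        V *= (Y @ Hyv.T) / (
            Ay @ Hy @ Hyv.T + lam * (W @ (W.T @ V) + U @ (U.T @ V)) + eps)

        history.append(objective())
    return W, U, V, Hx, Hy, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X, Y = rng.random((30, 40)), rng.random((30, 50))
    W, U, V, Hx, Hy, history = shared_subspace_nmf(X, Y, k_shared=3, k_x=2, k_y=2)
    print(f"objective: {history[0]:.2f} -> {history[-1]:.2f}")
```

Setting `lam=0` in this sketch reduces it to an unregularized joint factorization, which makes it easy to compare how much the shared and source-specific bases overlap with and without the penalty. The sketch omits column normalization of the bases, which a practical implementation would likely need so that the penalty cannot be reduced by rescaling alone.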

Author information

Correspondence to Sunil Kumar Gupta.

Additional information

Responsible editor: Kristian Kersting.

About this article

Cite this article

Gupta, S.K., Phung, D., Adams, B. et al. Regularized nonnegative shared subspace learning. Data Min Knowl Disc 26, 57–97 (2013). https://doi.org/10.1007/s10618-011-0244-8

  • DOI: https://doi.org/10.1007/s10618-011-0244-8
