
Heterogeneous representation learning with separable structured sparsity regularization

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Motivated by real applications, heterogeneous learning has emerged as an important research area that aims to model the coexistence of multiple types of heterogeneity. In this paper, we propose a heterogeneous representation learning model with structured sparsity regularization (HERES) to learn from multiple types of heterogeneity. It leverages the rich correlations (e.g., task relatedness, view consistency, and label correlation) and the prior knowledge (e.g., the soft clustering of tasks) of heterogeneous data to improve learning performance. To this end, HERES integrates multi-task, multi-view, and multi-label learning into a principled representation-learning framework to model the complex correlations, and employs structured sparsity to encode the prior knowledge of the data. The objective is to simultaneously minimize the reconstruction loss incurred when the factor matrices are used to recover the heterogeneous data and the structured sparsity penalty imposed on the model. The resulting optimization problem is challenging due to the non-smoothness and non-separability of the structured sparsity penalty. We reformulate the problem using an auxiliary function and prove that the reformulation is separable, which leads to an efficient family of algorithms for solving structured-sparsity-penalized problems. Furthermore, we propose several HERES models based on different loss functions and subsume them into the weighted HERES, which is able to handle missing data. Experimental results in comparison with state-of-the-art methods demonstrate the effectiveness of the proposed approach.
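For orientation, the objective sketched in the abstract can be written schematically. The displays below are a hedged sketch, not the paper's exact formulation: they assume a factorization X ≈ UV per data source, a weight matrix W that zeroes out missing entries (as in the weighted HERES), and a generic structured-sparsity penalty Ω such as an ℓ2,1 / group-Lasso norm over groups G. The second display shows the kind of auxiliary quadratic upper bound that renders such a penalty separable; the precise auxiliary function used in the paper may differ.

% Schematic objective (illustrative only): weighted reconstruction loss
% plus a structured-sparsity penalty on a factor matrix.
\[
  \min_{U,\,V}\;\; \bigl\| W \odot \bigl( X - U V \bigr) \bigr\|_F^2
  \;+\; \lambda\, \Omega(U),
  \qquad
  \Omega(U) = \sum_{g \in \mathcal{G}} \| U_g \|_2 .
\]
% A standard auxiliary (majorization) bound that makes a group-norm penalty
% quadratic and hence separable, with equality at U_g = \tilde{U}_g:
\[
  \| U_g \|_2 \;\le\; \frac{\| U_g \|_2^2}{2\,\| \tilde{U}_g \|_2}
  \;+\; \frac{\| \tilde{U}_g \|_2}{2}.
\]

Alternating between minimizing the resulting quadratic surrogate and updating the auxiliary variables is the usual way such bounds yield efficient algorithms for non-smooth structured penalties.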



Notes

  1. http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm.

  2. http://lear.inrialpes.fr/people/verbeek/code.

  3. http://mulan.sourceforge.net/datasets-mlc.html.


Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61473123, the National Science Foundation under Grant No. IIS-1552654, ONR under Grant No. N00014-15-1-2821, NASA under Grant No. NNX17AJ86A, and an IBM Faculty Award. The views and conclusions are those of the authors and should not be interpreted as representing the official policies of the funding agencies or the government.

Author information


Corresponding author

Correspondence to Pei Yang.


About this article


Cite this article

Yang, P., Tan, Q., Zhu, Y. et al. Heterogeneous representation learning with separable structured sparsity regularization. Knowl Inf Syst 55, 671–694 (2018). https://doi.org/10.1007/s10115-017-1094-5

