# Low-Rank Transfer Learning

## Abstract

Labeling real-world visual data for supervised learning is expensive. Leveraging auxiliary databases with well-labeled data can save considerable labeling effort on a new task. However, data in auxiliary databases are often collected under conditions that differ from those of the new task. Transfer learning provides techniques for transferring learned knowledge from a *source* domain to a *target* domain by mitigating the divergence between them. In this chapter, we discuss transfer learning in a generalized subspace where each target sample can be represented by some combination of source samples under a low-rank constraint. Under this constraint, the underlying structures of both the source and target domains are considered in the knowledge transfer, which brings three benefits. First, good alignment between domains is ensured, in that only relevant data in some subspace of the source domain are used to reconstruct the data in the target domain. Second, the discriminative power of the source domain is naturally passed on to the target domain. Third, noisy information is filtered out during the knowledge transfer. Extensive experiments on synthetic data and on important computer vision problems, such as face recognition and visual domain adaptation for object recognition, demonstrate the superiority of the proposed approach over existing, well-established methods.
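To make the idea of reconstructing target samples from source samples under a low-rank constraint concrete, here is a minimal sketch. It is not the chapter's algorithm: it assumes the rank constraint is relaxed to the nuclear norm, omits any subspace projection or noise term, and solves the resulting problem, minimizing 0.5 * ||Xt - Xs Z||_F^2 + lam * ||Z||_*, with plain proximal gradient descent. The names `svt`, `low_rank_reconstruction`, `lam`, and `n_iter`, as well as the synthetic data, are hypothetical and chosen for illustration only.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Shrink every singular value by tau and clip at zero.
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def low_rank_reconstruction(Xs, Xt, lam=0.1, n_iter=500):
    """Approximately minimize 0.5*||Xt - Xs Z||_F^2 + lam*||Z||_* over Z.

    Columns of Xs are source samples, columns of Xt are target samples.
    Illustrative assumption: nuclear-norm relaxation of the rank constraint,
    solved by proximal gradient descent with a fixed step size.
    """
    # Step size 1/L, where L = sigma_max(Xs)^2 is the gradient's Lipschitz constant.
    step = 1.0 / (np.linalg.norm(Xs, 2) ** 2)
    Z = np.zeros((Xs.shape[1], Xt.shape[1]))
    for _ in range(n_iter):
        grad = Xs.T @ (Xs @ Z - Xt)            # gradient of the least-squares term
        Z = svt(Z - step * grad, step * lam)   # proximal step on the nuclear norm
    return Z


# Synthetic example: both domains lie in the same 3-dimensional subspace of R^20.
rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 3))
Xs = basis @ rng.standard_normal((3, 40))   # 40 source samples
Xt = basis @ rng.standard_normal((3, 15))   # 15 target samples

Z = low_rank_reconstruction(Xs, Xt)
rel_err = np.linalg.norm(Xt - Xs @ Z) / np.linalg.norm(Xt)
rank_Z = np.linalg.matrix_rank(Z, tol=1e-6)
print("relative reconstruction error:", rel_err)
print("numerical rank of Z:", rank_Z)
```

In this toy setting the coefficient matrix `Z` has as many rows as there are source samples, yet its rank cannot exceed the dimension of the shared subspace, which is the structure the low-rank constraint is meant to expose.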

## Keywords

Transfer learning, Low-rank constraint, Subspace learning, Domain adaptation

## Notes

### Acknowledgments

This research is supported in part by the NSF CNS award 1314484, Office of Naval Research award N00014-12-1-1028, Air Force Office of Scientific Research award FA9550-12-1-0201, U.S. Army Research Office grant W911NF-13-1-0160, and IC Postdoc Program Grant 2011-11071400006.
