A Unified Framework for Domain Adaptation Using Metric Learning on Manifolds

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11052)


We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and an unlabeled target domain can be reconciled using a unified mathematical framework that exploits the curved Riemannian geometry of statistical manifolds. We exploit a simple but important observation: because the space of covariance matrices is both a Riemannian manifold and a homogeneous space, the shortest-path geodesic between two covariances on the manifold can be computed analytically. Statistics on the SPD matrix manifold, such as the geometric mean of two SPD matrices, can be reduced to solving the well-known Riccati equation. We show how the Riccati-based solution can be constrained to reduce not only the statistical differences between the source and target domains, such as aligning second-order covariances and minimizing the maximum mean discrepancy, but also the geometric differences, using diffusions on the underlying source and target manifolds. Our solution also emerges as a consequence of optimal transport theory, which shows that when the source and target distributions are multivariate Gaussians, the optimal transport mapping between them is a function of the geometric mean of the source and target covariances, a quantity that also minimizes the Wasserstein distance. A key strength of our proposed approach is that it enables integrating multiple sources of variation between source and target in a unified way, by reducing the combined objective function to a nested set of Riccati equations whose solution can be represented by a cascaded series of geometric mean computations. In addition to showing the theoretical optimality of our solution, we present detailed experiments on standard transfer learning testbeds from computer vision, comparing our proposed algorithms to past work in domain adaptation and showing improved results over a large variety of previous methods.
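The two central quantities in the abstract can be sketched concretely. The geometric mean of two SPD matrices A and B is the unique SPD solution X of the Riccati equation X A⁻¹ X = B, and for zero-mean Gaussians the optimal transport (Monge) map from N(0, Σ_s) to N(0, Σ_t) is linear, x ↦ Tx, where T satisfies T Σ_s T = Σ_t, i.e. T is the geometric mean of Σ_s⁻¹ and Σ_t. The sketch below (illustrative only, not the authors' released code; function names are our own) implements both with plain NumPy:

```python
import numpy as np


def spd_sqrt(M):
    """Principal square root of a symmetric positive-definite matrix
    via eigendecomposition: M^{1/2} = V diag(sqrt(w)) V^T."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T


def spd_geometric_mean(A, B):
    """Geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2},
    the unique SPD solution X of the Riccati equation X A^{-1} X = B."""
    A_half = spd_sqrt(A)
    A_half_inv = np.linalg.inv(A_half)
    inner = spd_sqrt(A_half_inv @ B @ A_half_inv)
    return A_half @ inner @ A_half


def gaussian_ot_map(Sigma_s, Sigma_t):
    """Linear optimal transport map T between zero-mean Gaussians,
    T = Sigma_s^{-1/2} (Sigma_s^{1/2} Sigma_t Sigma_s^{1/2})^{1/2} Sigma_s^{-1/2},
    which satisfies T Sigma_s T = Sigma_t."""
    S_half = spd_sqrt(Sigma_s)
    S_half_inv = np.linalg.inv(S_half)
    return S_half_inv @ spd_sqrt(S_half @ Sigma_t @ S_half) @ S_half_inv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M1 = rng.standard_normal((4, 4))
    M2 = rng.standard_normal((4, 4))
    A = M1 @ M1.T + 4 * np.eye(4)  # random SPD "source" covariance
    B = M2 @ M2.T + 4 * np.eye(4)  # random SPD "target" covariance

    G = spd_geometric_mean(A, B)
    # G solves the Riccati equation G A^{-1} G = B
    print(np.allclose(G @ np.linalg.inv(A) @ G, B))

    # The Gaussian OT map is the geometric mean of A^{-1} and B
    T = gaussian_ot_map(A, B)
    print(np.allclose(T, spd_geometric_mean(np.linalg.inv(A), B)))
```

The final check makes the abstract's optimal-transport observation explicit: transporting a Gaussian source to a Gaussian target is exactly a geometric-mean computation on the SPD manifold.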
Code related to this paper is available at:



Portions of this research were completed when the first and third authors were at SRI International, Menlo Park, CA, and when the second author was in Bangalore, India.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Massachusetts, Amherst, USA
  2. Microsoft, Hyderabad, India
  3. Samsung Research America, Mountain View, USA
