Domain Adaptation with Regularized Optimal Transport

  • Nicolas Courty
  • Rémi Flamary
  • Devis Tuia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8724)


We present a new method to solve the domain adaptation problem using optimal transport. By searching for the best transportation plan between the probability distribution functions of a source and a target domain, a non-linear and invertible transformation of the learning samples can be estimated. Any standard machine learning method can then be applied to the transformed set, which makes our method very generic. We propose a new optimal transport algorithm that incorporates label information in the optimization: this is achieved by combining an efficient matrix scaling technique with a majorization of a non-convex regularization term. By using the proposed optimal transport with label regularization, we obtain a significant increase in performance compared to the original transport solution. The proposed algorithm is computationally efficient and effective, as illustrated by its evaluation on a toy example and a challenging real-life vision dataset, on which it achieves competitive results with respect to state-of-the-art methods.
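To illustrate the matrix-scaling step mentioned above, the following is a minimal NumPy sketch (not the authors' implementation; the label-regularization term is omitted, and all sample weights, sizes, and the regularization strength are illustrative assumptions): entropic-regularized optimal transport is solved with Sinkhorn-Knopp iterations, and the learned coupling then maps source samples onto the target domain via the barycentric projection.

```python
import numpy as np

def sinkhorn(a, b, M, reg, n_iter=1000):
    """Entropic-regularized optimal transport between weights a and b.

    Sinkhorn-Knopp alternately rescales the rows and columns of the
    kernel K = exp(-M / reg) until the coupling's marginals match a and b.
    """
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)  # match column marginal b
        u = a / (K @ v)    # match row marginal a
    return u[:, None] * K * v[None, :]  # transport plan (coupling)

# Toy adaptation problem: 2-D source samples vs. a shifted target domain.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, (20, 2))            # source samples
Xt = rng.normal(3.0, 1.0, (25, 2))            # target samples (shifted)
a = np.full(20, 1 / 20)                       # uniform source weights
b = np.full(25, 1 / 25)                       # uniform target weights
M = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)  # squared-Euclidean cost

G = sinkhorn(a, b, M, reg=M.mean())           # reg heuristically set to mean cost
Xs_mapped = (G @ Xt) / G.sum(axis=1, keepdims=True)   # barycentric mapping
```

In the paper itself, label information enters through an additional non-convex regularizer that is majorized at each outer iteration and folded into the cost matrix before re-running the scaling step; the sketch above shows only the unregularized inner solver.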


Keywords (machine-generated): Regularization Term · Target Domain · Domain Adaptation · Source Distribution · Target Distribution





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Nicolas Courty¹
  • Rémi Flamary²
  • Devis Tuia³

  1. Université de Bretagne Sud, IRISA, Vannes, France
  2. Université de Nice, Lab. Lagrange UMR CNRS 7293, OCA, Nice, France
  3. EPFL, LASIG, Lausanne, Switzerland
