Conditional density estimation and simulation through optimal transport


A methodology is proposed for estimating, from samples, the probability density of a random variable \(x\) conditional on the values of a set of covariates \(\{z_{l}\}\). The methodology relies on a data-driven formulation of the Wasserstein barycenter, posed as a minimax problem in terms of the conditional map carrying each sample point to the barycenter and a potential characterizing the inverse of this map. This minimax problem is solved by alternating between a flow that develops the map in time and the maximization of the potential through an alternate projection procedure. The dependence on the covariates \(\{z_{l}\}\) is formulated in terms of convex combinations, so that the methodology can be applied to variables of nearly any type, including real, categorical and distributional. The methodology is illustrated through numerical examples on synthetic and real data. The real-world example is meteorological: forecasting the temperature distribution at a given location as a function of time, and estimating the joint distribution at a location of the highest and lowest daily temperatures as a function of the date.
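The central idea of the abstract, mapping each conditional distribution to a common barycenter and then inverting that map to simulate \(x\) given a covariate value, can be illustrated in a much simpler setting than the one the article treats. The sketch below is not the minimax flow algorithm of the article. It assumes a one-dimensional \(x\), a single categorical covariate, and the squared-distance cost, in which case the barycenter's quantile function is the weighted average of the conditional quantile functions and the maps are monotone. All function names, the synthetic Gaussian data, and the number of quantile levels are hypothetical choices made for illustration.

```python
import numpy as np

def barycenter_quantiles(samples_by_group, weights, n_levels=200):
    """Quantiles of each group and of their 1D Wasserstein-2 barycenter.

    Assumption: x is scalar, so the barycenter quantile function is the
    weighted average of the group quantile functions.
    """
    levels = (np.arange(n_levels) + 0.5) / n_levels  # midpoint probability levels
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    q = {g: np.quantile(s, levels) for g, s in samples_by_group.items()}
    bary = sum(wi * q[g] for wi, g in zip(w, samples_by_group))
    return levels, q, bary

def to_barycenter(x, group_q, bary_q):
    """Monotone map carrying samples of one group to the barycenter."""
    return np.interp(x, group_q, bary_q)

def from_barycenter(y, group_q, bary_q):
    """Inverse map, carrying barycenter points back to one group."""
    return np.interp(y, bary_q, group_q)

# Synthetic data (hypothetical): x | z="a" ~ N(0, 1), x | z="b" ~ N(4, 2^2).
rng = np.random.default_rng(0)
samples = {"a": rng.normal(0.0, 1.0, 5000), "b": rng.normal(4.0, 2.0, 5000)}
levels, q, bary = barycenter_quantiles(samples, weights=[0.5, 0.5])

# Simulate x | z="b": push every sample to the barycenter, then pull the
# pooled points back through the inverse map associated with z="b".
pooled = np.concatenate([to_barycenter(samples[g], q[g], bary) for g in samples])
simulated_b = from_barycenter(pooled, q["b"], bary)

print("barycenter mean/std:", bary.mean(), bary.std())
print("simulated x|z=b mean/std:", simulated_b.mean(), simulated_b.std())
```

Pooling the mapped samples is what lets every observation contribute to the simulation of each conditional distribution, which is the benefit of removing the covariate's effect first. The closed form used here does not extend to multivariate \(x\) or to continuous covariates, which is the setting the article's minimax formulation addresses.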


The work of E. G. Tabak and W. Zhao was partially supported by NSF Grant DMS-1715753 and ONR Grant N00014-15-1-2355.

Author information

Correspondence to Giulio Trigila.

Editor: Pradeep Ravikumar.

Cite this article

Tabak, E.G., Trigila, G. & Zhao, W. Conditional density estimation and simulation through optimal transport. Mach Learn (2020). https://doi.org/10.1007/s10994-019-05866-3

Keywords

  • Conditional density estimation
  • Optimal transport
  • Wasserstein barycenter
  • Explanation of variability
  • Confounding factors
  • Sampling
  • Uncertainty quantification