
Vector quantile regression and optimal transport, from theory to numerics

Empirical Economics

A Correction to this article was published on 11 September 2020

This article has been updated

Abstract

In this paper, we first revisit the Koenker and Bassett variational approach to (univariate) quantile regression, emphasizing its link with latent factor representations and correlation maximization problems. We then review the multivariate extension due to Carlier et al. (Ann Statist 44(3):1165–92, 2016; J Multivariate Anal 161:96–102, 2017), which relates vector quantile regression to an optimal transport problem with mean independence constraints. We introduce an entropic regularization of this problem, implement a gradient descent numerical method, and illustrate its feasibility on univariate and bivariate examples.





Notes

  1. Whenever we write a variable in brackets after a constraint, as \([V_t]\) in (1.1), we mean that this variable plays the role of a multiplier.

  2. It may seem awkward to start with the “dual” formulation before giving the “primal” one. Since the “primal” is itself the dual of the “dual,” this labeling is largely arbitrary; our choice is motivated by consistency with optimal transport theory, introduced below.

  3. One way to define the nonatomicity of \((\Omega , {\mathcal {F}}, \mathbb {P})\) is by the existence of a uniformly distributed random variable on this space. This ensures that the space is rich enough for random variables with any prescribed law to exist. If, on the contrary, the space is finite, for instance, then only finitely supported probability measures can be realized as the law of such random variables.

  4. In fact, for (2.3) to make sense one needs some integrability of Y, i.e., \({\mathbb {E}}(\vert Y\vert ) <+\infty \).

  5. If \({\mathbb {E}}(\Vert X\Vert ^2)<+\infty \), then (3.11) amounts to the standard requirement that \({\mathbb {E}}(X X^{\top })\) is nonsingular.

  6. Uniqueness will be discussed later on.

  7. If quantile regression is specified and the pair of functions \((\alpha , \beta )\) is as in Definition 3.1, then for every t, \((\alpha (t), \beta (t))\) solves the conditions (3.13). This shows that specification implies quasi-specification.

  8. With a slight abuse of notation, when a reference number (A) refers to a maximization (minimization) problem, we will simply write \(\sup (A)\) (\(\inf (A) \)) to denote the value of this optimization problem.

  9. Note the analogy with the fact that, in the univariate case, the cdf and the quantile function of Y are generalized inverses of each other.

  10. In the case where \({\mathbb {E}}(\Vert Y\Vert ^2)<+\infty \), (5.4) is equivalent to minimizing \({\mathbb {E}}(\Vert V- Y\Vert ^2)\) among uniformly distributed V’s.

  11. A deep regularity theory initiated by Caffarelli (1992) in the 1990s gives conditions on \(\nu (.\vert x)\) under which the optimal transport map is indeed smooth and/or invertible. We refer the interested reader to the textbook of Figalli (2017) for a detailed and recent account of this regularity theory.

  12. Here we assume that both X and Y are integrable.

  13. Recall that the softmax with regularization parameter \(\varepsilon >0\) of \((\alpha _{1},\ldots ,\alpha _{J})\) is given by \({\mathrm {Softmax}}_{\varepsilon }(\alpha _{1},\ldots ,\alpha _{J}):=\varepsilon \log (\sum _{j=1}^{J}e^{\frac{\alpha _{j}}{\varepsilon } })\).

  14. This can be proved either by using the Fenchel-Rockafellar duality theorem or by hand. Indeed, in the primal, there are only finitely many linear constraints, and the nonnegativity constraints are not binding because of the entropy. The existence of Lagrange multipliers for the equality constraints is then straightforward.

  15. It is even strictly convex once we have chosen normalizations which take into account the two invariances of J explained above.

  16. https://www.openlab.psu.edu/ansur2/
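
The regularized softmax recalled in note 13 can be evaluated as in the following minimal sketch. The function name `softmax_eps` and the sample values are illustrative assumptions, not taken from the article. The maximum is subtracted before exponentiating (the usual log-sum-exp shift), which leaves the value unchanged but avoids overflow when \(\varepsilon \) is small.

```python
import math

def softmax_eps(alpha, eps):
    """Return eps * log(sum_j exp(alpha_j / eps)), computed stably.

    `alpha` is a non-empty sequence of floats and `eps` > 0 is the
    regularization parameter. The name is hypothetical.
    """
    m = max(alpha)
    # Shifting by the maximum keeps every exponent <= 0, so exp() cannot overflow.
    return m + eps * math.log(sum(math.exp((a - m) / eps) for a in alpha))

# Illustrative values (an assumption, not data from the article).
alpha = [1.0, 2.0, 3.0]
values = {eps: softmax_eps(alpha, eps) for eps in (1.0, 0.1, 0.01)}
```

Since the sum contains at least the term \(e^{\max _j \alpha _j/\varepsilon }\) and at most J copies of it, the value always lies between \(\max _j \alpha _j\) and \(\max _j \alpha _j+\varepsilon \log J\), so it approaches the maximum as \(\varepsilon \rightarrow 0\).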

References

  • Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009;2(1):183–202.

  • Brenier Y. Polar factorization and monotone rearrangement of vector-valued functions. Commun. Pure Appl. Math. 1991;44(4):375–417.

  • Caffarelli L. The regularity of mappings with a convex potential. J. Am. Math. Soc. 1992;5(1):99–104.

  • Carlier G, Chernozhukov V, Galichon A. Vector quantile regression: an optimal transport approach. Ann. Statist. 2016;44(3):1165–92.

  • Carlier G, Chernozhukov V, Galichon A. Vector quantile regression beyond the specified case. J. Multivariate Anal. 2017;161:96–102.

  • Cuturi M, Peyré G. A smoothed dual approach for variational Wasserstein problems. SIAM J. Imaging Sci. 2016;9(1):320–43.

  • Figalli A. The Monge-Ampère equation and its applications. Zurich Lectures in Advanced Mathematics. European Mathematical Society (EMS), Zurich. 2017.

  • Genevay A, Cuturi M, Peyré G, Bach F. Stochastic optimization for large-scale optimal transport. Advances in neural information processing systems. 2016;3440–8.

  • Koenker R, Bassett G Jr. Regression quantiles. Econometrica. 1978;46(1):33–50.

  • McCann R. Existence and uniqueness of monotone measure preserving maps. Duke Math. J. 1995;80(2):309–23.

  • Nesterov Y. A method for solving the convex programming problem with convergence rate O(1/k^2). Dokl Akad Nauk SSSR. 1983;269(3):543–7.

  • Ryff J. Measure preserving transformations and rearrangements. J Math Anal Appl. 1970;31:449–58.

Funding

Funding from Région Ile-de-France grant is acknowledged. Funding from NSF Grant DMS-1716489 is acknowledged.

Author information


Correspondence to Alfred Galichon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Lemma 4.3

Since \(\mathbf {1}_{[0,t]}\in {\mathcal {C}}\), one obviously first has

$$\begin{aligned} \sup _{v\in {\mathcal {C}}}\int _{0}^{1}v(s)q(s)ds\ge \max _{t\in [0,1]}\int _{0}^{t}q(s)ds=\max _{t\in [0,1]}Q(t). \end{aligned}$$

Let us now prove the converse inequality, taking an arbitrary \(v\in { \mathcal {C}}\). We first observe that Q is absolutely continuous and that v is of bounded variation (its derivative in the sense of distributions being a bounded nonpositive measure, which we denote by \(\eta \)). Integrating by parts and using the definition of \({\mathcal {C}}\) then gives:

$$\begin{aligned} \int _{0}^{1}v(s)q(s)ds&=-\int _{0}^{1}Q\eta +v(1^{-})Q(1) \\&\le (\max _{[0,1]}Q)\times (-\eta ([0,1]))+v(1^{-})Q(1) \\&=(\max _{[0,1]}Q)(v(0^{+})-v(1^{-}))+v(1^{-})Q(1) \\&=(\max _{[0,1]}Q)v(0^{+})+(Q(1)-\max _{[0,1]}Q)v(1^{-}) \\&\le \max _{[0,1]}Q. \end{aligned}$$
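
As a sanity check, the identity proved above can be tested numerically on a grid. The sketch below rests on assumptions not stated in this excerpt: \({\mathcal {C}}\) is taken to be the set of nonincreasing functions on [0,1] with values in [0,1] (which is what the proof uses), \(Q(t)=\int _{0}^{t}q(s)ds\), and the integrand \(q(s)=\cos (7s)\) is an arbitrary illustrative choice. It compares the maximum of Q on the grid with the largest value of \(\int vq\) found over randomly sampled nonincreasing step functions v.

```python
import math
import random

def check_lemma(n=200, trials=2000, seed=0):
    """Discrete check that sup_v int v q = max_t Q(t) (hypothetical helper)."""
    rng = random.Random(seed)
    h = 1.0 / n
    # Illustrative sign-changing integrand q, sampled at grid midpoints.
    q = [math.cos(7.0 * (i + 0.5) * h) for i in range(n)]
    # Q[k] approximates the integral of q over [0, k*h]; Q[0] = 0.
    Q = [0.0]
    for qi in q:
        Q.append(Q[-1] + qi * h)
    max_Q = max(Q)
    # Indicators of [0, k*h] belong to the assumed class and give the value Q[k].
    attained = max(sum(q[:k]) * h for k in range(n + 1))
    best = -math.inf
    for _ in range(trials):
        # Random nonincreasing step function with values in [0, 1].
        v = sorted((rng.random() for _ in range(n)), reverse=True)
        best = max(best, sum(vi * qi for vi, qi in zip(v, q)) * h)
    return max_Q, attained, best

max_Q, attained, best = check_lemma()
```

On the grid, indicator functions reach `max_Q` while no sampled `v` exceeds it, mirroring the two inequalities of the proof; the discrete bound follows from the same summation by parts.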


Cite this article

Carlier, G., Chernozhukov, V., De Bie, G. et al. Vector quantile regression and optimal transport, from theory to numerics. Empir Econ 62, 35–62 (2022). https://doi.org/10.1007/s00181-020-01919-y


  • DOI: https://doi.org/10.1007/s00181-020-01919-y
