Abstract
We propose a unified framework for sufficient dimension reduction through independence and conditional mean independence measures. When the object of interest is the conditional distribution of Y given X, the α-distance covariance is used to recover the central space. When the focus is the conditional mean of Y given X, the central mean space can be estimated through the α-martingale difference divergence. Compared with existing estimators based on the distance covariance, which recover the central space, the new estimators are more accurate when the target is the central mean space. By choosing α smaller than one, the new estimators also outperform existing estimators when the predictor distribution is heavy-tailed or the data are contaminated.
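As a concrete illustration of the building block shared by both estimators, here is a minimal sketch of the usual V-statistic estimate of the squared α-distance covariance; the double-centering recipe is that of Székely et al. (2007) for α = 1, its use for general α ∈ (0, 2) mirrors the chapter's setup, and the function name adcov2 is illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def adcov2(x, y, alpha=1.0):
    """V-statistic estimate of the squared alpha-distance covariance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    if y.ndim == 1:
        y = y[:, None]
    # Pairwise Euclidean distances raised to the power alpha, 0 < alpha < 2.
    a = squareform(pdist(x)) ** alpha
    b = squareform(pdist(y)) ** alpha
    # Double centering: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

rng = np.random.default_rng(1)
x = rng.standard_normal(500)
print(adcov2(x, rng.standard_normal(500), alpha=0.5))  # near zero: independence
print(adcov2(x, x**2, alpha=0.5))                      # bounded away from zero
```

Taking α = 0.5 in the calls above reflects the abstract's point that exponents below one temper the influence of heavy tails and contaminated observations.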
References
X. Chen, R.D. Cook, C. Zou, Diagnostic studies in sufficient dimension reduction. Biometrika 102, 545–558 (2015)
X. Chen, W. Sheng, X. Yin, Efficient sparse estimate of sufficient dimension reduction in high dimension. Technometrics 60, 161–168 (2018)
R.D. Cook, Regression Graphics: Ideas for Studying Regressions Through Graphics (Wiley, New York, 1998)
R.D. Cook, L. Forzani, Likelihood-based sufficient dimension reduction. J. Am. Stat. Assoc. 104, 197–208 (2009)
R.D. Cook, B. Li, Dimension reduction for the conditional mean. Ann. Stat. 30, 455–474 (2002)
R.D. Cook, S. Weisberg, Discussion of sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86, 28–33 (1991)
Y. Dong, A note on moment-based sufficient dimension reduction estimators. Stat. Interface 9, 141–145 (2016)
Y. Dong, A brief review of linear sufficient dimension reduction through optimization. J. Stat. Plann. Inference 211, 154–161 (2021)
Y. Dong, B. Li, Dimension reduction for non-elliptically distributed predictors: second order methods. Biometrika 97, 279–294 (2010)
Y. Dong, Q. Xia, C. Tang, Z. Li, On sufficient dimension reduction with missing responses through estimating equations. Comput. Stat. Data Anal. 126, 67–77 (2018)
B. Li, Sufficient Dimension Reduction: Methods and Applications with R (CRC Press, Boca Raton, 2018)
B. Li, Y. Dong, Dimension reduction for non-elliptically distributed predictors. Ann. Stat. 37, 1272–1298 (2009)
B. Li, S. Wang, On directional regression for dimension reduction. J. Am. Stat. Assoc. 102, 997–1008 (2007)
K.C. Li, Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86, 316–327 (1991)
K.C. Li, On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma. J. Am. Stat. Assoc. 87, 1025–1039 (1992)
K.C. Li, N. Duan, Regression analysis under link violation. Ann. Stat. 17, 1009–1052 (1989)
Y. Ma, L.P. Zhu, A semiparametric approach to dimension reduction. J. Am. Stat. Assoc. 107, 168–179 (2012)
Y. Ma, L. Zhu, A review on dimension reduction. Int. Stat. Rev. 81, 134–150 (2013)
Y. Ma, L.P. Zhu, On estimation efficiency of the central mean subspace. J. R. Stat. Soc. Ser. B 76, 885–901 (2014)
X. Shao, J. Zhang, Martingale difference correlation and its use in high dimensional variable screening. J. Am. Stat. Assoc. 109, 1302–1318 (2014)
W. Sheng, X. Yin, Direction estimation in single-index models via distance covariance. J. Multivar. Anal. 122, 148–161 (2013)
W. Sheng, X. Yin, Sufficient dimension reduction via distance covariance. J. Comput. Graph. Stat. 25, 91–104 (2016)
G.J. Székely, M.L. Rizzo, Brownian distance covariance. Ann. Appl. Stat. 3, 1236–1265 (2009)
G.J. Székely, M.L. Rizzo, Energy statistics: a class of statistics based on distances. J. Stat. Plann. Inference 143, 1249–1272 (2013)
G.J. Székely, M.L. Rizzo, N.K. Bakirov, Measuring and testing dependence by correlation of distances. Ann. Stat. 35, 2769–2794 (2007)
Y. Xia, H. Tong, W. Li, L. Zhu, An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B 64, 363–410 (2002)
Y. Zhang, J. Liu, Y. Wu, X. Fang, A martingale-difference-divergence-based estimation of central mean subspace. Stat. Interface 12, 489–500 (2019)
Y. Zhu, P. Zeng, Fourier methods for estimating the central subspace and the central mean subspace in regression. J. Am. Stat. Assoc. 101, 1638–1651 (2006)
L.P. Zhu, L.X. Zhu, Z.H. Feng, Dimension reduction in regressions through cumulative slicing estimation. J. Am. Stat. Assoc. 105, 1455–1466 (2010)
Acknowledgements
The author sincerely thanks the editor and two anonymous referees for useful comments that led to a much improved presentation of the paper.
Appendix
Proof of Proposition 1
For part (i), since the weight function \(w_{q,p}(t,s)\) is positive, \(\Phi_\alpha(V,U)=0\) if and only if \(f_{V,U}(t,s)=f_V(t)f_U(s)\) for almost all \(s\) and \(t\). Thus, as long as it is well defined, \(\Phi_\alpha(V,U)\) is zero if and only if \(V\) and \(U\) are independent. The proof of part (ii) follows directly from the proof of Theorem 7 in Székely and Rizzo (2009) and is thus omitted. □
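Written out with the standard weight of Székely et al. (2007) extended to exponent \(\alpha\), and with \(c_{q,\alpha}\) and \(c_{p,\alpha}\) denoting the usual normalizing constants (whose exact values are not needed for the argument), the measure in this proof takes the weighted-\(L_2\) form
\[
\Phi_\alpha^2(V,U)=\frac{1}{c_{q,\alpha}c_{p,\alpha}}\int_{{\mathbb R}^q}\int_{{\mathbb R}^p}\frac{\bigl|f_{V,U}(t,s)-f_V(t)f_U(s)\bigr|^2}{|t|_q^{q+\alpha}\,|s|_p^{p+\alpha}}\,ds\,dt,
\]
so that \(w_{q,p}(t,s)=\{c_{q,\alpha}c_{p,\alpha}|t|_q^{q+\alpha}|s|_p^{p+\alpha}\}^{-1}\) is indeed positive for all \(t\) and \(s\).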
The following lemma is needed before we prove Proposition 2. Its proof follows directly from the proof of Theorem 3 in Székely and Rizzo (2009) and is thus omitted.
Lemma 1
For random vectors \(V_1,V_2\in{\mathbb R}^q\) and \(U_1,U_2\in{\mathbb R}^p\), assume \(E(|U_1|_p^\alpha)<\infty\), \(E(|U_2|_p^\alpha)<\infty\), \(E(|V_1|_q^\alpha)<\infty\), and \(E(|V_2|_q^\alpha)<\infty\). Denote \(\Phi_\alpha\) as the square root of \(\Phi_\alpha^2\). If \([V_1^T,U_1^T]^T\) is independent of \([V_2^T,U_2^T]^T\), then
\[
\Phi_\alpha(V_1+V_2,\,U_1+U_2)\le\Phi_\alpha(V_1,U_1)+\Phi_\alpha(V_2,U_2).
\]
Equality holds if and only if \(U_1\) and \(V_1\) are both constants, or \(U_2\) and \(V_2\) are both constants, or \(U_1\), \(U_2\), \(V_1\), \(V_2\) are mutually independent.
Proof of Proposition 2
We follow the proof of Proposition 2 in Sheng and Yin (2016). For any \(\beta\in{\mathbb R}^{p\times d}\), there exists a rotation matrix \(M\in{\mathbb R}^{d\times d}\) such that \(\beta M=[\beta_a,\beta_b]\), where \(\mathrm{Span}(\beta_a)\subseteq\mathrm{Span}(\beta_0)\) and \(\mathrm{Span}(\beta_b)\subseteq\mathrm{Span}(\beta_0)^\perp\), with \(\mathrm{Span}(\beta_0)^\perp\) denoting the orthogonal complement of \(\mathrm{Span}(\beta_0)\). From (1), we have that \(Y\) is independent of \(\beta_b^TX\) given \(\beta_0^TX\). Together with the independence between \(\beta_b^TX\) and \(\beta_0^TX\), we have that \(\beta_b^TX\) is independent of \([Y,X^T\beta_0]^T\). It follows that \(\beta_b^TX\) is independent of \([Y,X^T\beta_a]^T\). Let \(U_1=[X^T\beta_a,0]^T\), \(U_2=[0,X^T\beta_b]^T\), \(V_1=Y\), and \(V_2=0\). Then \([V_1,U_1^T]^T\) is independent of \([V_2,U_2^T]^T\). According to Lemma 1,
\[
\Phi_\alpha(Y,M^T\beta^TX)=\Phi_\alpha(V_1+V_2,\,U_1+U_2)\le\Phi_\alpha(V_1,U_1)+\Phi_\alpha(V_2,U_2)=\Phi_\alpha(Y,\beta_a^TX), \qquad (11)
\]
where the last equality uses the facts that \(V_2=0\) is constant, so that \(\Phi_\alpha(V_2,U_2)=0\), and that padding \(X^T\beta_a\) with zeros does not change pairwise distances.
On the other hand, \(M\) being a rotation matrix implies that \(MM^T=M^TM=I_d\) and \(|M^T\beta^T(X-X')|_d=|\beta^T(X-X')|_d\). It follows from Proposition 1 that
\[
\Phi_\alpha(Y,M^T\beta^TX)=\Phi_\alpha(Y,\beta^TX). \qquad (12)
\]
Similarly, \(\mathrm{Span}(\beta_a)\subseteq\mathrm{Span}(\beta_0)\) implies \(|\beta_a^T(X-X')|_{d_a}\le|\beta_0^T(X-X')|_d\), where \(d_a\) is the number of columns of \(\beta_a\). Applying Proposition 1, we have
\[
\Phi_\alpha(Y,\beta_a^TX)\le\Phi_\alpha(Y,\beta_0^TX). \qquad (13)
\]
(11), (12), and (13) together lead to \(\Phi_\alpha(Y,\beta^TX)\le\Phi_\alpha(Y,\beta_0^TX)\). Equality holds if and only if \(\mathrm{Span}(\beta_a)=\mathrm{Span}(\beta_0)\), in which case \(\beta_b\) vanishes. Since \(\beta^*\) maximizes \(\Phi_\alpha^2(Y,\beta^TX)\) over \(\beta\in{\mathbb R}^{p\times d}\), we must have \(\mathrm{Span}(\beta^*)=\mathrm{Span}(\beta_0)={\mathcal S}_{Y|X}\). □
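To see Proposition 2 at work numerically, the sketch below grid-searches unit vectors in \({\mathbb R}^2\) and checks that the sample criterion peaks near the true direction of a single-index model; the model and sample size are illustrative choices, not taken from the chapter's simulations.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def adcov2(x, y, alpha=1.0):
    # Squared alpha-distance covariance (V-statistic), as sketched earlier.
    x = np.asarray(x, dtype=float); y = np.asarray(y, dtype=float)
    a = squareform(pdist(x.reshape(len(x), -1))) ** alpha
    b = squareform(pdist(y.reshape(len(y), -1))) ** alpha
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return (A * B).mean()

rng = np.random.default_rng(0)
n = 400
X = rng.standard_normal((n, 2))
beta0 = np.array([0.6, 0.8])                    # true direction, unit length
Y = np.sin(X @ beta0) + 0.1 * rng.standard_normal(n)

# Unit vectors are identified up to sign, so a grid over [0, pi) suffices.
thetas = np.linspace(0.0, np.pi, 181)
vals = [adcov2(X @ np.array([np.cos(t), np.sin(t)]), Y, alpha=0.5)
        for t in thetas]
best = thetas[int(np.argmax(vals))]
print(np.array([np.cos(best), np.sin(best)]))   # approximately +/- beta0
```

In practice the maximization is carried out over \(\beta\in{\mathbb R}^{p\times d}\) with a numerical optimizer rather than a grid; the grid is used here only because \(d=1\) and \(p=2\) make the criterion surface one-dimensional.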
Proof of Proposition 3
The proof follows directly from the proof of Proposition 3 in Sheng and Yin (2016) and is thus omitted. □
Proof of Proposition 4
For part (i), note that \(\Xi_\alpha(V\mid U)=0\) if and only if \(g_{V,U}(s)=g_Vf_U(s)\) for almost all \(s\). Thus \(\Xi_\alpha(V\mid U)=0\) if and only if \(E(V)=E(V\mid U)\) almost surely. For part (ii), the proof of Theorem 1 in Shao and Zhang (2014) can be followed directly. □
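The notation parallels the martingale difference divergence of Shao and Zhang (2014): with \(g_{V,U}(s)=E\{Ve^{i\langle s,U\rangle}\}\), \(g_V=E(V)\), and \(f_U(s)=E\{e^{i\langle s,U\rangle}\}\) for \(U\in{\mathbb R}^p\), the squared α-martingale difference divergence used in this proof takes the weighted-\(L_2\) form
\[
\Xi_\alpha^2(V\mid U)=\frac{1}{c_{p,\alpha}}\int_{{\mathbb R}^p}\frac{\bigl|g_{V,U}(s)-g_Vf_U(s)\bigr|^2}{|s|_p^{p+\alpha}}\,ds,
\]
where \(c_{p,\alpha}\) is the same normalizing constant as in the distance covariance weight; Shao and Zhang's measure is the special case \(\alpha=1\).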
Proof of Proposition 5
Denote \(\eta_{0\perp}\in{\mathbb R}^{p\times(p-r)}\) as a basis for the orthogonal complement of \(\mathrm{Span}(\eta_0)\). We choose \(\eta_0\) and \(\eta_{0\perp}\) such that \(\eta_0^T\Sigma\eta_0=I_r\), \(\eta_{0\perp}^T\Sigma\eta_0=0\), and \(\eta_{0\perp}^T\Sigma\eta_{0\perp}=I_{p-r}\). For any \(\eta\in{\mathbb R}^{p\times r}\) satisfying \(\eta^T\Sigma\eta=I_r\), there exist \(A\in{\mathbb R}^{r\times r}\) and \(C\in{\mathbb R}^{(p-r)\times r}\) such that \(\eta=\eta_0A+\eta_{0\perp}C\). Then
\[
I_r=\eta^T\Sigma\eta=A^TA+C^TC. \qquad (14)
\]
For \(s\in{\mathbb R}^r\), because \(s^T\eta^TX=(As)^T\eta_0^TX+(Cs)^T\eta_{0\perp}^TX\) and \(\eta_0^TX\) is independent of \(\eta_{0\perp}^TX\), we have
\[
f_{\eta^TX}(s)=f_{\eta_0^TX}(As)\,f_{\eta_{0\perp}^TX}(Cs). \qquad (15)
\]
Note that (2) implies \(E(Y\mid X)=E(Y\mid\eta_0^TX)\), and we have
\[
g_{Y,\eta^TX}(s)=g_{Y,\eta_0^TX}(As)\,f_{\eta_{0\perp}^TX}(Cs). \qquad (16)
\]
(15), (16), and the definition of \(\Xi_\alpha^2\) in (7) together lead to
\[
\Xi_\alpha^2(Y\mid\eta^TX)=\frac{1}{c_{r,\alpha}}\int_{{\mathbb R}^r}\frac{\bigl|g_{Y,\eta_0^TX}(As)-E(Y)f_{\eta_0^TX}(As)\bigr|^2\,\bigl|f_{\eta_{0\perp}^TX}(Cs)\bigr|^2}{|s|_r^{r+\alpha}}\,ds. \qquad (17)
\]
On the other hand, it follows from (14) that \(A^TA=I_r-C^TC\), and thus
\[
|As|_r\le|s|_r. \qquad (18)
\]
(17), (18), and equation (8) from Proposition 4 together lead to
\[
\Xi_\alpha^2(Y\mid\eta^TX)\le\Xi_\alpha^2(Y\mid\eta_0^TX). \qquad (19)
\]
We get equality if and only if \(A^TA=I_r\), in which case \(C\) vanishes and \(\eta=\eta_0A\). Since \(\eta^*\) maximizes \(\Xi_\alpha^2(Y\mid\eta^TX)\) over \(\eta\in{\mathbb R}^{p\times r}\), we must have \(\mathrm{Span}(\eta^*)=\mathrm{Span}(\eta_0)={\mathcal S}_{E(Y|X)}\). □
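A sample counterpart of \(\Xi_\alpha^2\) can be sketched along the lines of Shao and Zhang's (2014) MDD estimator, with the predictor distances raised to the power α; the generalization beyond α = 1 and the function name amdd2 are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def amdd2(u, v, alpha=1.0):
    """V-statistic estimate of the squared alpha-martingale difference
    divergence of a scalar v given u, after Shao and Zhang (2014)."""
    u = np.asarray(u, dtype=float)
    if u.ndim == 1:
        u = u[:, None]
    v = np.asarray(v, dtype=float).ravel()
    a = squareform(pdist(u)) ** alpha             # |U_k - U_l|^alpha
    b = 0.5 * (v[:, None] - v[None, :]) ** 2      # |V_k - V_l|^2 / 2
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return (A * B).mean()

rng = np.random.default_rng(2)
u = rng.standard_normal(500)
print(amdd2(u, rng.standard_normal(500)))  # near zero: mean independence
print(amdd2(u, u + u**2, alpha=0.5))       # bounded away from zero
```

Estimating the central mean space then amounts to maximizing this criterion over \(\eta\in{\mathbb R}^{p\times r}\) subject to \(\eta^T\Sigma\eta=I_r\), exactly as in Proposition 5.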