Sufficient Dimension Reduction Through Independence and Conditional Mean Independence Measures

Chapter in: Festschrift in Honor of R. Dennis Cook

Abstract

We propose a unified framework for sufficient dimension reduction through independence and conditional mean independence measures. When the interest is in the conditional distribution of Y given X, the α-distance covariance is used to recover the central space. When the focus is the conditional mean of Y given X, the central mean space can be estimated through the α-martingale difference divergence. Compared with existing estimators based on the distance covariance, which recover the central space, the new estimators are more accurate when the target is the central mean space. By choosing α smaller than one, the new estimators outperform existing estimators when the predictor distribution is heavy-tailed and when the data are contaminated.
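At the population level these measures are defined through characteristic functions, but their sample versions are simple functions of pairwise distances. The following is a minimal sketch (our illustration, not the chapter's code) of the squared sample α-distance covariance in the double-centered V-statistic form of Székely, Rizzo, and Bakirov (2007), with Euclidean distances raised to the power α ∈ (0, 2); the function name is ours.

```python
import numpy as np

def alpha_dcov2(X, Y, alpha=1.0):
    """Squared sample alpha-distance covariance between X and Y
    (double-centered V-statistic with distances raised to alpha)."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    # pairwise alpha-distances
    a = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2) ** alpha
    b = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2) ** alpha
    # double centering: remove row and column means, add back the grand mean
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    return (A * B).mean()
```

Choosing α < 1 dampens the contribution of large pairwise distances, consistent with the robustness to heavy tails and contamination claimed above.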


References

  • X. Chen, R.D. Cook, C. Zou, Diagnostic studies in sufficient dimension reduction. Biometrika 102, 545–558 (2015)

  • X. Chen, W. Sheng, X. Yin, Efficient sparse estimate of sufficient dimension reduction in high dimension. Technometrics 60, 161–168 (2018)

  • R.D. Cook, Regression Graphics: Ideas for Studying Regressions Through Graphics (Wiley, New York, 1998)

  • R.D. Cook, L. Forzani, Likelihood-based sufficient dimension reduction. J. Am. Stat. Assoc. 104, 197–208 (2009)

  • R.D. Cook, B. Li, Dimension reduction for conditional mean in regression. Ann. Stat. 30, 455–474 (2002)

  • R.D. Cook, S. Weisberg, Discussion of "Sliced inverse regression for dimension reduction". J. Am. Stat. Assoc. 86, 328–332 (1991)

  • Y. Dong, A note on moment-based sufficient dimension reduction estimators. Stat. Interface 9, 141–145 (2016)

  • Y. Dong, A brief review of linear sufficient dimension reduction through optimization. J. Stat. Plann. Inference 211, 154–161 (2021)

  • Y. Dong, B. Li, Dimension reduction for non-elliptically distributed predictors: second-order methods. Biometrika 97, 279–294 (2010)

  • Y. Dong, Q. Xia, C. Tang, Z. Li, On sufficient dimension reduction with missing responses through estimating equations. Comput. Stat. Data Anal. 126, 67–77 (2018)

  • B. Li, Sufficient Dimension Reduction: Methods and Applications with R (CRC Press, Boca Raton, 2018)

  • B. Li, Y. Dong, Dimension reduction for non-elliptically distributed predictors. Ann. Stat. 37, 1272–1298 (2009)

  • B. Li, S. Wang, On directional regression for dimension reduction. J. Am. Stat. Assoc. 102, 997–1008 (2007)

  • K.C. Li, Sliced inverse regression for dimension reduction. J. Am. Stat. Assoc. 86, 316–327 (1991)

  • K.C. Li, On principal Hessian directions for data visualization and dimension reduction: another application of Stein's lemma. J. Am. Stat. Assoc. 87, 1025–1039 (1992)

  • K.C. Li, N. Duan, Regression analysis under link violation. Ann. Stat. 17, 1009–1052 (1989)

  • Y. Ma, L.P. Zhu, A semiparametric approach to dimension reduction. J. Am. Stat. Assoc. 107, 168–179 (2012)

  • Y. Ma, L.P. Zhu, A review on dimension reduction. Int. Stat. Rev. 81, 134–150 (2013)

  • Y. Ma, L.P. Zhu, On estimation efficiency of the central mean subspace. J. R. Stat. Soc. Ser. B 76, 885–901 (2014)

  • X. Shao, J. Zhang, Martingale difference correlation and its use in high-dimensional variable screening. J. Am. Stat. Assoc. 109, 1302–1318 (2014)

  • W. Sheng, X. Yin, Direction estimation in single-index models via distance covariance. J. Multivar. Anal. 122, 148–161 (2013)

  • W. Sheng, X. Yin, Sufficient dimension reduction via distance covariance. J. Comput. Graph. Stat. 25, 91–104 (2016)

  • G.J. Székely, M.L. Rizzo, Brownian distance covariance. Ann. Appl. Stat. 3, 1236–1265 (2009)

  • G.J. Székely, M.L. Rizzo, Energy statistics: a class of statistics based on distances. J. Stat. Plann. Inference 143, 1249–1272 (2013)

  • G.J. Székely, M.L. Rizzo, N.K. Bakirov, Measuring and testing dependence by correlation of distances. Ann. Stat. 35, 2769–2794 (2007)

  • Y. Xia, H. Tong, W. Li, L. Zhu, An adaptive estimation of dimension reduction space. J. R. Stat. Soc. Ser. B 64, 363–410 (2002)

  • Y. Zhang, J. Liu, Y. Wu, X. Fang, A martingale-difference-divergence-based estimation of central mean subspace. Stat. Interface 12, 489–500 (2019)

  • Y. Zhu, P. Zeng, Fourier methods for estimating the central subspace and the central mean subspace in regression. J. Am. Stat. Assoc. 101, 1638–1651 (2006)

  • L.P. Zhu, L.X. Zhu, Z.H. Feng, Dimension reduction in regressions through cumulative slicing estimation. J. Am. Stat. Assoc. 105, 1455–1466 (2010)


Acknowledgements

The author sincerely thanks the editor and two anonymous referees for useful comments that led to a much improved presentation of the paper.

Author information

Corresponding author: Yuexiao Dong.


Appendix

Proof of Proposition 1

For part (i), since the weight function \(w_{q,p}(t,s)\) is positive, \(\Phi_\alpha(V,U)=0\) if and only if \(f_{V,U}(t,s)=f_V(t)f_U(s)\) for almost all \(s\) and \(t\). Thus, as long as it is well defined, \(\Phi_\alpha(V,U)\) is zero if and only if \(V\) and \(U\) are independent. The proof of part (ii) follows directly from the proof of Theorem 7 in Székely and Rizzo (2009) and is thus omitted. □
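For readers skimming the appendix without the main text at hand, the proof of part (i) presumes the weighted-\(L^2\) characteristic-function form of the measure. As a point of reference (a reconstruction following Székely, Rizzo, and Bakirov (2007); the constants and notation in the chapter's own definition may differ), one may take

$$\displaystyle \begin{aligned} \Phi_\alpha^2(V,U)=\int_{{\mathbb R}^{q+p}}|f_{V,U}(t,s)-f_V(t)f_U(s)|^2\, w_{q,p}(t,s)\, dt\, ds,\qquad w_{q,p}(t,s)=\frac{1}{c_q c_p\,|t|_q^{\alpha+q}\,|s|_p^{\alpha+p}}, \end{aligned} $$

where \(f\) denotes a characteristic function and \(c_q\), \(c_p\) are normalizing constants. Positivity of \(w_{q,p}\) is precisely what forces \(f_{V,U}=f_Vf_U\) almost everywhere when \(\Phi_\alpha(V,U)=0\).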

The following lemma is needed before we prove Proposition 2. Its proof follows directly from the proof of Theorem 3 in Székely and Rizzo (2009) and is thus omitted.

Lemma 1

For random vectors \(V_1, V_2\in {\mathbb R}^q\) and \(U_1,U_2\in {\mathbb R}^p\), assume \(E(|U_1|_p^\alpha)<\infty\), \(E(|U_2|_p^\alpha)<\infty\), \(E(|V_1|_q^\alpha)<\infty\), and \(E(|V_2|_q^\alpha)<\infty\). Denote \(\Phi_\alpha\) as the square root of \(\Phi_\alpha^2\). If \([V_1^T, U_1^T]^T\) is independent of \([V_2^T, U_2^T]^T\), then

$$\displaystyle \begin{aligned}\Phi_\alpha(V_1+V_2,U_1+U_2)\leq \Phi_\alpha(V_1,U_1)+\Phi_\alpha(V_2,U_2).\end{aligned} $$

Equality holds if and only if \(U_1\) and \(V_1\) are both constants, or \(U_2\) and \(V_2\) are both constants, or \(U_1\), \(U_2\), \(V_1\), \(V_2\) are mutually independent.

Proof of Proposition 2

We follow the proof of Proposition 2 in Sheng and Yin (2016). For any \(\beta \in {\mathbb R}^{p\times d}\), there exists a rotation matrix \(M\in {\mathbb R}^{d\times d}\) such that \(\beta M=[\beta_a, \beta_b]\), where \(\mathrm{Span}(\beta_a)\subseteq \mathrm{Span}(\beta_0)\) and \(\mathrm{Span}(\beta_b)\subseteq \mathrm{Span}(\beta_0)^\perp\), where \(\mathrm{Span}(\beta_0)^\perp\) denotes the orthogonal complement of \(\mathrm{Span}(\beta_0)\). From (1), we have \(Y\perp\!\!\!\perp \beta_b^T X\mid \beta_0^T X\). Together with \(\beta_b^T X\perp\!\!\!\perp \beta_0^T X\), we have \(\beta_b^T X\perp\!\!\!\perp [Y, X^T\beta_0]^T\). It follows that \(\beta_b^T X\perp\!\!\!\perp [Y, X^T\beta_a]^T\), as \(\beta_a^T X\) is a function of \(\beta_0^T X\). Let \(U_1=[X^T\beta_a, 0]^T\), \(U_2=[0, X^T\beta_b]^T\), \(V_1=Y\), and \(V_2=0\). Then \([V_1, U_1^T]^T\) is independent of \([V_2, U_2^T]^T\). According to Lemma 1,

$$\displaystyle \begin{aligned} \Phi_\alpha(Y,M^T\beta^T X)=\Phi_\alpha(V_1+V_2,U_1+U_2)\leq \Phi_\alpha(V_1,U_1)+\Phi_\alpha(V_2,U_2)=\Phi_\alpha(Y,\beta_a^T X), \end{aligned} $$
(11)

where the last equality holds because \(V_2=0\) is constant, so \(\Phi_\alpha(V_2,U_2)=0\), and padding \(U_1\) with zeros does not change pairwise distances.

On the other hand, \(M\) being a rotation matrix implies that \(MM^T=M^TM=I_d\) and \(|M^T\beta^T(X-X')|_d=|\beta^T(X-X')|_d\). It follows from Proposition 1 that

$$\displaystyle \begin{aligned} \Phi_\alpha(Y,M^T \beta^T X)=\Phi_\alpha(Y,\beta^T X). \end{aligned} $$
(12)

Similarly, \(\mathrm{Span}(\beta_a)\subseteq \mathrm{Span}(\beta_0)\) implies \(|\beta_a^T (X-X')|_{d_a}\leq |\beta_0^T (X-X')|_d\), where \(d_a\) is the number of columns of \(\beta_a\). Applying Proposition 1, we have

$$\displaystyle \begin{aligned} \Phi_\alpha(Y,\beta_a^T X)\leq \Phi_\alpha(Y,\beta_0^T X). \end{aligned} $$
(13)

(11), (12), and (13) together lead to \(\Phi _\alpha (Y,\beta ^T X)\leq \Phi _\alpha (Y,\beta _0^T X)\). We get equality if and only if \(\mathrm{Span}(\beta_a)=\mathrm{Span}(\beta_0)\), in which case \(\beta_b\) vanishes. Since \(\beta^*\) maximizes \(\Phi _\alpha ^2(Y,\beta ^T X)\) over \(\beta \in {\mathbb R}^{p\times d}\), we must have \(\mathrm {Span}(\beta ^*)=\mathrm {Span}(\beta _0)={\mathcal S}_{Y|X}\). □
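Proposition 2 reduces estimation of the central space to an optimization problem: maximize the sample version of \(\Phi_\alpha^2(Y,\beta^T X)\) over \(\beta\in{\mathbb R}^{p\times d}\). The sketch below illustrates this with a generic derivative-free optimizer and a final QR step, reusing alpha_dcov2 from the sketch after the abstract; it is our illustration only, and the chapter may use a different algorithm (e.g., the sequential quadratic programming approach of Sheng and Yin 2016).

```python
import numpy as np
from scipy.optimize import minimize

def estimate_central_space(X, Y, d, alpha=1.0, seed=0):
    """Estimate a basis of the central space S_{Y|X} by maximizing
    the sample alpha-distance covariance between Y and beta^T X."""
    n, p = X.shape
    rng = np.random.default_rng(seed)

    def objective(b):
        # negative sample alpha-dcov^2 of (beta^T X, Y), to be minimized
        return -alpha_dcov2(X @ b.reshape(p, d), Y, alpha)

    res = minimize(objective, rng.standard_normal(p * d), method="Nelder-Mead")
    # only Span(beta) is identified, so report an orthonormal basis
    beta_hat, _ = np.linalg.qr(res.x.reshape(p, d))
    return beta_hat
```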

Proof of Proposition 3

The proof follows directly from the proof of Proposition 3 in Sheng and Yin (2016) and is thus omitted. □

Proof of Proposition 4

For part (i), note that \(\Xi_\alpha(V\mid U)=0\) if and only if \(g_{V,U}(s)=g_V f_U(s)\) for almost all \(s\). Since \(g_{V,U}(s)-g_V f_U(s)=E[\{E(V\mid U)-E(V)\}e^{i\langle s,U\rangle}]\), this happens if and only if \(E(V)=E(V\mid U)\) almost surely. For part (ii), the proof of Theorem 1 in Shao and Zhang (2014) can be followed directly. □
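At the sample level, the martingale difference divergence admits a simple distance-based representation (Shao and Zhang 2014). The sketch below is our illustration of the natural \(\alpha\)-version, \(-E\{(V-EV)(V'-EV)\,|U-U'|_p^\alpha\}\) for an i.i.d. copy \((V',U')\), estimated by its V-statistic; we assume, without access to the chapter's display, that its sample estimator takes this form.

```python
import numpy as np

def alpha_mdd2(Y, U, alpha=1.0):
    """Squared sample alpha-martingale difference divergence of Y given U,
    the V-statistic version of -E[(Y - EY)(Y' - EY)|U - U'|^alpha]."""
    Y = np.asarray(Y, dtype=float).ravel()
    U = np.asarray(U, dtype=float).reshape(len(U), -1)
    # pairwise alpha-distances on U
    a = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=2) ** alpha
    yc = Y - Y.mean()
    return -np.mean(np.outer(yc, yc) * a)
```

As with the distance covariance, taking \(\alpha<1\) downweights extreme predictor values.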

Proof of Proposition 5

Denote \(\eta_{0\perp}\in {\mathbb R}^{p\times (p-r)}\) as a basis for the orthogonal space of \(\mathrm{Span}(\eta_0)\). We choose \(\eta_0\) and \(\eta_{0\perp}\) such that \(\eta_0^T\Sigma \eta_0=I_r\), \(\eta_{0\perp}^T\Sigma \eta_0=0\), and \(\eta_{0\perp}^T\Sigma \eta_{0\perp}=I_{p-r}\). For any \(\eta \in {\mathbb R}^{p\times r}\) satisfying \(\eta^T \Sigma \eta=I_r\), there exist \(A\in {\mathbb R}^{r\times r}\) and \(C\in {\mathbb R}^{(p-r)\times r}\) such that \(\eta=\eta_0 A+\eta_{0\perp} C\). Then

$$\displaystyle \begin{aligned} I_r=\eta^T\Sigma\eta=(\eta_0 A+\eta_{0\perp} C)^T\Sigma (\eta_0 A+\eta_{0\perp} C)=A^T A+C^T C. \end{aligned} $$
(14)

For \(s\in {\mathbb R}^r\), because \(X^T\eta_0\) and \(X^T\eta_{0\perp}\) are independent, we have

$$\displaystyle \begin{aligned} E(Y)E(e^{i\langle s,\eta^T X \rangle})&=E(Y)E(e^{i\langle s, X^T \eta_0 A \rangle}e^{i\langle s, X^T \eta_{0\perp} C \rangle})\\ &=E(Y)E(e^{i\langle s, X^T \eta_0 A \rangle})E(e^{i\langle s, X^T \eta_{0\perp} C \rangle}). \end{aligned} $$
(15)

Note that (2) implies \(E(Y\mid X)=E(Y\mid \eta _0^T X)\), and we have

$$\displaystyle \begin{aligned} E(Ye^{i\langle s,\eta^T X \rangle})&=E\{E(Y\mid X)e^{i\langle s,\eta^T X \rangle}\}\\ &=E\{ E(Y\mid \eta_0^T X)e^{i\langle s, X^T \eta_0 A \rangle}e^{i\langle s, X^T \eta_{0\perp} C \rangle}\}\\ &=E\{ E(Y\mid \eta_0^T X)e^{i\langle s, X^T \eta_0 A \rangle}\}E(e^{i\langle s, X^T \eta_{0\perp} C \rangle})\\ &=E(Y e^{i\langle s, X^T \eta_0 A \rangle})E(e^{i\langle s, X^T \eta_{0\perp} C \rangle}). \end{aligned} $$
(16)

(15), (16), and the definition of \(\Xi _\alpha ^2\) in (7) together lead to

$$\displaystyle \begin{aligned} \Xi_\alpha(Y\mid \eta^T X)\leq \Xi_\alpha(Y\mid A^T\eta_0^T X), \end{aligned} $$
(17)

since \(|E(e^{i\langle s, X^T \eta_{0\perp} C \rangle})|\leq 1\) for all \(s\).

On the other hand, it follows from (14) that \(A^TA=I_r-C^TC\preceq I_r\), and thus

$$\displaystyle \begin{aligned} |A^T\eta_0^T(X-X')|_r\leq |\eta_0^T(X-X')|_r. \end{aligned} $$
(18)

(17), (18), and equation (8) from Proposition 4 together lead to

$$\displaystyle \begin{aligned} \Xi_\alpha(Y\mid \eta^T X)\leq \Xi_\alpha(Y\mid \eta_0^T X). \end{aligned} $$

We get equality if and only if \(A^T A=I_r\), in which case \(C\) vanishes and \(\eta=\eta_0 A\). Since \(\eta^*\) maximizes \(\Xi _\alpha ^2(Y\mid \eta ^T X)\) over \(\eta \in {\mathbb R}^{p\times r}\), we must have \(\mathrm {Span}(\eta ^*)=\mathrm {Span}(\eta _0)={\mathcal S}_{E(Y|X)}\). □
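Analogously to Proposition 2, Proposition 5 suggests estimating the central mean space by maximizing the sample \(\Xi_\alpha^2(Y\mid \eta^T X)\) subject to \(\eta^T\Sigma\eta=I_r\). A minimal sketch of this idea (our construction, reusing alpha_mdd2 above): standardize \(X\) so the constraint becomes \(\eta^T\eta=I_r\), optimize, and map back to the original scale.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_central_mean_space(X, Y, r, alpha=1.0, seed=0):
    """Estimate a basis of S_{E(Y|X)} by maximizing the sample
    alpha-MDD of Y given eta^T X, with eta^T Sigma eta = I_r."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    # standardize X: cov(Z) = I_p, so the constraint becomes eta^T eta = I_r
    Sigma = np.cov(X, rowvar=False)
    L = np.linalg.cholesky(np.linalg.inv(Sigma))
    Z = (X - X.mean(axis=0)) @ L

    def objective(e):
        eta, _ = np.linalg.qr(e.reshape(p, r))  # enforce eta^T eta = I_r
        return -alpha_mdd2(Y, Z @ eta, alpha)

    res = minimize(objective, rng.standard_normal(p * r), method="Nelder-Mead")
    eta_z, _ = np.linalg.qr(res.x.reshape(p, r))
    return L @ eta_z  # basis on the original X scale, eta^T Sigma eta = I_r
```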


Copyright information

© 2021 Springer Nature Switzerland AG

Cite this chapter

Dong, Y. (2021). Sufficient Dimension Reduction Through Independence and Conditional Mean Independence Measures. In: Bura, E., Li, B. (eds) Festschrift in Honor of R. Dennis Cook. Springer, Cham. https://doi.org/10.1007/978-3-030-69009-0_8
