
Groupwise sufficient dimension reduction via conditional distance clustering

Metrika

Abstract

It has become increasingly common to incorporate predictors’ grouping knowledge into dimension reduction techniques. In this article, we establish a complete framework, named groupwise sufficient dimension reduction via conditional distance clustering, for the case where the grouping information is unknown. We introduce a simple conditional dependence measure and a corresponding conditional independence test. A clustering procedure based on this measure and test is constructed to detect a suitable group structure. Finally, we conduct sufficient dimension reduction under the obtained structure. Both simulations and a real data analysis demonstrate that the clustering strategy is effective and that the groupwise sufficient dimension reduction method is generally superior to the classical sufficient dimension reduction method.


References

  • Adragni KP, Al-Najjar E, Martin S, Popuri SK, Raim AM (2016) Group-wise sufficient dimension reduction with principal fitted components. Comput Stat 31:923–941

  • Cook RD, Weisberg S (1991) Discussion of “Sliced inverse regression for dimension reduction,” by K. C. Li. J Am Stat Assoc 86:328–332

  • Cook RD (1996) Graphics for regressions with a binary response. J Am Stat Assoc 91:983–992

  • Cook RD (2007) Fisher lecture: dimension reduction in regression (with discussion). Stat Sci 22:1–26

  • Cook RD, Forzani L (2009) Likelihood-based sufficient dimension reduction. J Am Stat Assoc 104:197–208

  • Enz R (1991) Prices and earnings around the globe: a comparison of purchasing power in 48 cities. Union Bank of Switzerland, Economic Research Dept., Zurich

  • Ferré L (1998) Determining the dimension in sliced inverse regression and related methods. J Am Stat Assoc 93:132–140

  • Guo Z, Li L, Lu W, Li B (2015) Groupwise dimension reduction via envelope method. J Am Stat Assoc 110:1515–1527

  • Hastie TJ, Tibshirani RJ (1990) Generalized additive models. Chapman & Hall, London

  • Jaccard P (1912) The distribution of the flora in the alpine zone. New Phytol 11:37–50

  • Li KC (1991) Sliced inverse regression for dimension reduction. J Am Stat Assoc 86:316–342

  • Li B, Zha H, Chiaromonte F (2005) Contour regression: a general approach to dimension reduction. Ann Stat 33:1580–1616

  • Li Q, Racine JS (2007) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton

  • Li B, Wang SL (2007) On directional regression for dimension reduction. J Am Stat Assoc 102:997–1008

  • Li L (2009) Exploiting predictor domain information in sufficient dimension reduction. Comput Stat Data Anal 53:2665–2672

  • Li L, Li B, Zhu LX (2010) Groupwise dimension reduction. J Am Stat Assoc 105:1188–1201

  • Székely GJ, Rizzo ML, Bakirov NK (2007) Measuring and testing dependence by correlation of distances. Ann Stat 35:2769–2794

  • Wand M, Jones M (1994) Multivariate plug-in bandwidth selection. Comput Stat 9:97–116

  • Wang X, Pan W, Hu W, Tian Y, Zhang H (2015) Conditional distance correlation. J Am Stat Assoc 110:1726–1734

  • Weisberg S (2015) R package “dr”

  • Xia Y, Tong H, Li WK, Zhu LX (2002) An adaptive estimation of dimension reduction space. J R Stat Soc Ser B 64:363–410

  • Yin X, Li B, Cook RD (2008) Successive direction extraction for estimating the central subspace in a multiple-index regression. J Multivar Anal 99:1733–1757


Acknowledgements

Funding was provided by the MOE Project of Key Research Institute of Humanities and Social Sciences at Universities (Grant No. 16JJD910002) and the Outstanding Innovative Talents Cultivation Funded Programs 2018 of Renmin University of China.

Author information


Corresponding author

Correspondence to Jingxiao Zhang.

Ethics declarations

Conflict of interest statement

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix


Here we present proofs of theorems in this paper. In the proof of Theorem 1, the following lemma from Székely et al. (2007) is essential.

Lemma 1

If \(0<\alpha <2\), then for all x in \({\mathbb {R}}^d\)

$$\begin{aligned} \int _{{\mathbb {R}}^d}\frac{1-\cos \langle t,x\rangle }{|t|_d^{d+\alpha }}dt=C(d,\alpha )|x|^{\alpha }, \end{aligned}$$

where

$$\begin{aligned} C(d,\alpha )=\frac{2\pi ^{d/2}\varGamma (1-\alpha /2)}{\alpha 2^{\alpha }\varGamma ((d+\alpha )/2)}, \end{aligned}$$

and \(\varGamma (\cdot )\) is the complete gamma function. The integrals at 0 and \(\infty \) are meant in the principal value sense: \(\lim _{\varepsilon \rightarrow 0}\int _{{\mathbb {R}}^d\backslash \{\varepsilon B+\varepsilon ^{-1}B^c\}}\), where B is the unit ball (centered at 0) in \({\mathbb {R}}^d\) and \(B^c\) is the complement of B.

In this article, we take \(\alpha =1\), which is the simplest case, and the constant in Lemma 1 is

$$\begin{aligned} c_d=C(d,1)=\frac{\pi ^{(1+d)/2}}{\varGamma ((1+d)/2)}. \end{aligned}$$
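For \(d=1\), Lemma 1 reduces to \(\int _{{\mathbb {R}}}(1-\cos (tx))/t^2\,dt=\pi |x|\) since \(c_1=\pi /\varGamma (1)=\pi \). The following numerical sketch checks this constant; the truncation point `T`, grid size `m`, and the simple tail correction are our own choices, not part of the paper.

```python
import numpy as np

def weighted_integral(x, T=400.0, m=400_000):
    """Midpoint-rule approximation of int_R (1 - cos(t*x)) / t^2 dt.

    The integrand is even, so we integrate over (0, T), double the result,
    and add 2/T for the tail; the oscillatory part of the tail is O(1/T^2)
    and is ignored.
    """
    dt = T / m
    t = (np.arange(m) + 0.5) * dt          # midpoints avoid t = 0
    body = np.sum((1.0 - np.cos(t * x)) / t**2) * dt
    return 2.0 * (body + 1.0 / T)

# Lemma 1 with d = 1, alpha = 1 gives C(1, 1) = pi, so the integral is pi*|x|.
for x in (0.5, 1.0, 2.5):
    print(f"x = {x}: numeric = {weighted_integral(x):.5f}, pi|x| = {np.pi * abs(x):.5f}")
```

The agreement to several decimals is what makes the plug-in computation in the proof of Theorem 1 below work term by term.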

Proof of Theorem 1

Lemma 1 implies that there exist constants \(c_p\) and \(c_q\) such that for all \({\mathbf {X}}\in {\mathbb {R}}^p\), \({\mathbf {Y}}\in {\mathbb {R}}^q\),

$$\begin{aligned}&\int _{{\mathbb {R}}^p}\frac{1-\exp \{i\langle \mathbf {t},{\mathbf {X}}\rangle \}}{|\mathbf {t}|_p^{1+p}}d\mathbf {t}=c_p|{\mathbf {X}}|_p,\\&\int _{{\mathbb {R}}^q}\frac{1-\exp \{i\langle \mathbf {s},{\mathbf {Y}}\rangle \}}{|\mathbf {s}|_q^{1+q}}d\mathbf {s}=c_q|{\mathbf {Y}}|_q,\\&\int _{{\mathbb {R}}^p}\int _{{\mathbb {R}}^q}\frac{1-\exp \{i\langle \mathbf {t},{\mathbf {X}}\rangle +i\langle \mathbf {s},{\mathbf {Y}}\rangle \}}{|\mathbf {t}|_p^{1+p}|\mathbf {s}|_q^{1+q}}d\mathbf {t}d\mathbf {s}=c_pc_q|{\mathbf {X}}|_p|{\mathbf {Y}}|_q. \end{aligned}$$

For simplicity, consider the case \(p=q=1\). Plugging the empirical conditional characteristic functions into the integral, with weight \((c_pc_q|\mathbf {t}|_p^{p+1}|\mathbf {s}|_q^{1+q})^{-1}=\pi ^{-2}t^{-2}s^{-2}\), the resulting integrand involves the terms \(|\phi _{X,Y|Z}^n(t,s)|^2\), \(|\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)|^2\) and \(\overline{\phi _{X,Y|Z}^n(t,s)}\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)\).

For the first term, we have

$$\begin{aligned}&\phi _{X,Y|Z}^n(t,s)\overline{\phi _{X,Y|Z}^n(t,s)}\nonumber \\&=\frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}\sum \nolimits _{l=1}^nI_{Z_l}} \sum _{k,l=1}^n\cos (X_k-X_l)t\cos (Y_k-Y_l)sI_{Z_k}I_{Z_l}+V_1, \end{aligned}$$

where \(V_1=-\frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}\sum \nolimits _{l=1}^nI_{Z_l}}\sum \nolimits _{k,l=1}^n\sin (X_k-X_l)t\sin (Y_k-Y_l)sI_{Z_k}I_{Z_l}\) and the integral of \(V_1\) equals zero, since the integrand is odd in \(t\).

Similarly, we have

$$\begin{aligned}&\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)\overline{\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)}\\&=\,\frac{1}{(\sum \nolimits _{k=1}^nI_{Z_k})^2} \sum _{k,l=1}^n\cos (X_k-X_l)tI_{Z_k}I_{Z_l}\frac{1}{(\sum \nolimits _{k=1}^nI_{Z_k})^2} \sum _{k,l=1}^n\cos (Y_k-Y_l)sI_{Z_k}I_{Z_l}+V_2, \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\overline{\phi _{X,Y|Z}^n(t,s)}\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)=\overline{\phi _{X|Z}^n(t)\phi _{Y|Z}^n(s)}\phi _{X,Y|Z}^n(t,s)\\ =\,&\frac{1}{(\sum \nolimits _{k=1}^nI_{Z_k})^3} \sum _{k,l,m=1}^n\cos (X_k-X_l)t\cos (Y_k-Y_m)sI_{Z_k}I_{Z_l}I_{Z_m}+V_3, \end{aligned} \end{aligned}$$

where the integrals of \(V_2\) and \(V_3\) also equal zero.

Using the identity

$$\begin{aligned} \cos u\cos v=1-(1-\cos u)-(1-\cos v)+(1-\cos u)(1-\cos v), \end{aligned}$$

we note that terms of the form

$$\begin{aligned} \frac{1}{(\sum \nolimits _{k=1}^nI_{Z_k})^2}\sum _{k,l=1}^n \int _{{\mathbb {R}}^2}\frac{1-(1-\cos (X_k-X_l)t)-(1-\cos (Y_k-Y_l)s)}{t^2s^2}I_{Z_k}I_{Z_l}dtds \end{aligned}$$

cancel in the final integration. Hence we only need to compute integrals of the following type

$$\begin{aligned} \begin{aligned}&\int _{{\mathbb {R}}^2}\frac{(1-\cos (X_k-X_l)t)(1-\cos (Y_k-Y_l)s)}{t^2s^2}I_{Z_k}I_{Z_l}dtds\\&\quad =I_{Z_k}I_{Z_l}\int _{{\mathbb {R}}}\frac{1-\cos (X_k-X_l)t}{t^2}dt\int _{{\mathbb {R}}}\frac{1-\cos (Y_k-Y_l)s}{s^2}ds\\&\quad = c_1^2|X_k-X_l||Y_k-Y_l|I_{Z_k}I_{Z_l}\\&\quad = \pi ^2|X_k-X_l||Y_k-Y_l|I_{Z_k}I_{Z_l}. \end{aligned} \end{aligned}$$

For random vectors \({\mathbf {X}}\in {\mathbb {R}}^p\) and \({\mathbf {Y}}\in {\mathbb {R}}^q\), the same steps are applied. Thus Theorem 1 holds. \(\square \)
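The computation above reduces the weighted integral of empirical characteristic functions to sums of pairwise distances. As a sketch only (the function name, the vectorized form, and the crude indicator bandwidth are our own choices, not the paper's notation), the resulting distance-based statistic can be assembled as:

```python
import numpy as np

def cond_dcov2(X, Y, w):
    """Weighted sample distance covariance (squared) in the pairwise-distance
    form derived in the proof of Theorem 1.

    X, Y : 1-D samples; w : nonnegative weights (in the paper, the
    indicators I_{Z_k} attached to a conditioning point z).
    """
    a = np.abs(X[:, None] - X[None, :])        # |X_k - X_l|
    b = np.abs(Y[:, None] - Y[None, :])        # |Y_k - Y_l|
    W = w.sum()
    ww = np.outer(w, w)
    S1 = (a * b * ww).sum() / W**2             # double sum, both distances
    S2 = (a * ww).sum() / W**2 * (b * ww).sum() / W**2
    # triple sum over (k, l, m): sum_k w_k (sum_l a_kl w_l)(sum_m b_km w_m)
    S3 = (w * (a @ w) * (b @ w)).sum() / W**3
    return S1 + S2 - 2.0 * S3

rng = np.random.default_rng(1)
n = 200
Z = rng.normal(size=n)
w = (np.abs(Z) < 1.0).astype(float)            # crude indicator of a neighbourhood of z = 0
X = rng.normal(size=n)
Y_indep = rng.normal(size=n)

print(cond_dcov2(X, Y_indep, w))               # small: X and Y independent
print(cond_dcov2(X, X, w))                     # clearly larger: perfect dependence
```

Because the statistic equals a nonnegative weighted integral of \(|\phi ^n_{X,Y|Z}-\phi ^n_{X|Z}\phi ^n_{Y|Z}|^2\), the returned value is nonnegative up to floating-point error.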

Proof of Theorem 2

Wang et al. (2015) demonstrated that \({\mathcal {D}}_n({\mathbf {X}},{\mathbf {Y}}|{\mathbf {Z}}){\mathop {\rightarrow }\limits ^{a.s.}}{\mathcal {D}}({\mathbf {X}},{\mathbf {Y}}|{\mathbf {Z}})\) in the proof of their Theorem 4. Combined with Theorem 1, Theorem 2 follows. \(\square \)

Proof of Theorem 3

(1) Let \(U_k=\exp (i\langle \mathbf {t},{\mathbf {X}}_k\rangle )-\phi ^n_{{\mathbf {X}}|{\mathbf {Z}}}(\mathbf {t})\) and \(V_k=\exp (i\langle \mathbf {s},{\mathbf {Y}}_k\rangle )-\phi ^n_{{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {s})\). Then by the Cauchy–Schwarz inequality,

$$\begin{aligned} \begin{aligned} |\phi ^n_{{\mathbf {X}},{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {t},\mathbf {s})-\phi ^n_{{\mathbf {X}}|{\mathbf {Z}}}(\mathbf {t})\phi ^n_{{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {s})|^2 =\,&\left| \frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}}\sum \limits _{k=1}^n[U_kV_k]I_{Z_k}\right| ^2\\ \le \,&\left( \frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}}\sum \limits _{k=1}^n[|U_k||V_k|]I_{Z_k}\right) ^2\\ \le \,&\frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}}\sum \limits _{k=1}^n[|U_k|^2]I_{Z_k} \frac{1}{\sum \nolimits _{k=1}^nI_{Z_k}}\sum \limits _{k=1}^n[|V_k|^2]I_{Z_k}\\ =\,&(1-|\phi ^n_{{\mathbf {X}}|{\mathbf {Z}}}(\mathbf {t})|^2)(1-|\phi ^n_{{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {s})|^2). \end{aligned} \end{aligned}$$

Thus

$$\begin{aligned} \begin{aligned} 0\le&\frac{1}{c_pc_q}\int _{{\mathbb {R}}^{p+q}}\frac{|\phi ^n_{{\mathbf {X}},{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {t},\mathbf {s})-\phi ^n_{{\mathbf {X}}|{\mathbf {Z}}}(\mathbf {t})\phi ^n_{{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {s})|^2}{|\mathbf {t}|_p^{p+1}|\mathbf {s}|_q^{q+1}}d\mathbf {t}d\mathbf {s}\\ \le&\frac{1}{c_pc_q}\int _{{\mathbb {R}}^{p+q}}\frac{(1-|\phi ^n_{{\mathbf {X}}|{\mathbf {Z}}}(\mathbf {t})|^2)(1-|\phi ^n_{{\mathbf {Y}}|{\mathbf {Z}}}(\mathbf {s})|^2)}{|\mathbf {t}|_p^{p+1}|\mathbf {s}|_q^{q+1}}d\mathbf {t}d\mathbf {s}. \end{aligned} \end{aligned}$$

This implies that \(0\le {\mathcal {R}}_n({\mathbf {X}},{\mathbf {Y}}|Z)\le 1\).

(2) Wang et al. (2015) have demonstrated that, for \(\mathbf {a}_1\in {\mathbb {R}}^p\) and \(\mathbf {a}_2\in {\mathbb {R}}^q\), scalars \(b_1\ne 0\) and \(b_2\ne 0\), and orthonormal matrices \(\mathbf {C}_1\in {\mathbb {R}}^{p\times p}\) and \(\mathbf {C}_2\in {\mathbb {R}}^{q\times q}\), \({\mathcal {D}}_n^2(\mathbf {a}_1+b_1\mathbf {C}_1{\mathbf {X}},\mathbf {a}_2+b_2\mathbf {C}_2{\mathbf {Y}}|Z)=|b_1b_2|{\mathcal {D}}_n^2({\mathbf {X}},{\mathbf {Y}}|Z)\). Following the definition of \({\mathcal {R}}_n({\mathbf {X}},{\mathbf {Y}}|Z)\) and Theorem 1, Theorem 3(2) naturally holds. \(\square \)

Proof of Theorem 4

Let \({\bar{\rho }}_n=\frac{1}{n}\sum _{i=1}^n\rho ({\mathbf {X}},{\mathbf {Y}}|Z=Z_i)\). Since \(E_Z[\rho ({\mathbf {X}},{\mathbf {Y}}|Z=Z_i)]={\mathcal {T}}\) and \(Var_Z[\rho ({\mathbf {X}},{\mathbf {Y}}|Z=Z_i)]<\infty \) (because \(\rho ({\mathbf {X}},{\mathbf {Y}}|Z)\in [-1,1]\)), the law of large numbers yields

$$\begin{aligned} \forall \varepsilon >0,~~\lim _{n\rightarrow \infty }P(|{\bar{\rho }}_n-{\mathcal {T}}|<\varepsilon )=1. \end{aligned}$$
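This law-of-large-numbers step only uses boundedness of \(\rho \). A quick simulation illustrates it, with a hypothetical bounded function \(\tanh (z)\) standing in for \(\rho ({\mathbf {X}},{\mathbf {Y}}|Z=z)\) (our choice, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-in: rho(X, Y | Z = z) is some fixed function of z bounded
# in [-1, 1]; here tanh(z), whose mean under Z ~ N(0, 1) is 0 (tanh is odd).
rho = np.tanh
T_true = 0.0

for n in (100, 10_000):
    Z = rng.normal(size=n)
    rho_bar = rho(Z).mean()
    print(f"n = {n:6d}: |rho_bar - T| = {abs(rho_bar - T_true):.4f}")
```

The average concentrates around its expectation as \(n\) grows, exactly the convergence of \({\bar{\rho }}_n\) to \({\mathcal {T}}\) used above.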

Next we use Corollary 1 to prove that

$$\begin{aligned} \forall \varepsilon >0,~~\lim _{n\rightarrow \infty }P(|{\mathcal {T}}_n-{\bar{\rho }}_n|<\varepsilon )=1. \end{aligned}$$

Let \(R_{ni}\) denote \(R_n({\mathbf {X}},{\mathbf {Y}}|Z=Z_i)\), and \(\rho _i\) denote \(\rho ({\mathbf {X}},{\mathbf {Y}}|Z=Z_i)\). By Corollary 1 and the fact that almost sure convergence implies convergence in probability, we have that

$$\begin{aligned} \forall \varepsilon >0,~~\lim _{n\rightarrow \infty }P(|R_{ni}-\rho _i|\ge \varepsilon )=0, \end{aligned}$$

and obviously

$$\begin{aligned} \forall \varepsilon >0,~~\lim _{n\rightarrow \infty }P\left( \frac{1}{n}\sum _{i=1}^n|R_{ni}-\rho _i|\ge \varepsilon \right) =0. \end{aligned}$$

It follows that

$$\begin{aligned} P\left( |\frac{1}{n}\sum _{i=1}^nR_{ni}-\frac{1}{n}\sum _{i=1}^n\rho _i|\ge \varepsilon \right) \le P\left( \frac{1}{n}\sum _{i=1}^n|R_{ni}-\rho _i|\ge \varepsilon \right) , \end{aligned}$$

and thus \(P(|{\mathcal {T}}_n-{\bar{\rho }}_n|\ge \varepsilon )\xrightarrow {n\rightarrow \infty }0\).

Since

$$\begin{aligned} \begin{aligned} P(|{\mathcal {T}}_n-{\mathcal {T}}|\ge \varepsilon )&\le P(|{\mathcal {T}}_n-{\bar{\rho }}_n|+|{\bar{\rho }}_n-{\mathcal {T}}|\ge \varepsilon )\\&\le P\left( \left\{ |{\mathcal {T}}_n-{\bar{\rho }}_n|\ge \frac{\varepsilon }{2}\right\} \bigcup \left\{ |{\bar{\rho }}_n-{\mathcal {T}}|\ge \frac{\varepsilon }{2}\right\} \right) \\&\le P\left( |{\mathcal {T}}_n-{\bar{\rho }}_n|\ge \frac{\varepsilon }{2}\right) +P\left( |{\bar{\rho }}_n-{\mathcal {T}}|\ge \frac{\varepsilon }{2}\right) \xrightarrow {n\rightarrow \infty }0, \end{aligned} \end{aligned}$$

we obtain

$$\begin{aligned} \forall \varepsilon >0,~~\lim _{n\rightarrow \infty }P(|{\mathcal {T}}_n-{\mathcal {T}}|\ge \varepsilon )=0, \end{aligned}$$

and Theorem 4 holds. \(\square \)


Cite this article

Xu, X., Zhang, J. Groupwise sufficient dimension reduction via conditional distance clustering. Metrika 83, 217–242 (2020). https://doi.org/10.1007/s00184-019-00732-7
