
Adjusted Pearson Chi-Square feature screening for multi-classification with ultrahigh dimensional data

Published in: Metrika

Abstract

Huang et al. (J Bus Econ Stat 32:237–244, 2014) first proposed a Pearson Chi-Square based feature screening procedure tailored to multi-classification problems with ultrahigh dimensional categorical covariates, a setting that is common in practice but has seldom been discussed in the literature. However, their work establishes the sure screening property only in a limited setting. Moreover, the p value based adjustment used when covariates involve different numbers of categories does not work well in several practical situations. In this paper, we propose an adjusted Pearson Chi-Square feature screening procedure and a modified method for tuning parameter selection. Theoretically, we establish the sure screening property of the proposed method in general settings. Empirically, the proposed method handles unequal numbers of covariate categories in finite samples more successfully than Pearson Chi-Square feature screening. Results of three simulation studies and one real data analysis are presented. Our work, together with Huang et al. (J Bus Econ Stat 32:237–244, 2014), establishes a solid theoretical foundation and empirical evidence for the family of Pearson Chi-Square based feature screening methods.
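To make the screening statistic concrete, the following is a minimal numerical sketch for one categorical covariate. The function name and array conventions are illustrative, not the authors' code; the formula follows the Appendix, where \(\hat{\varDelta }_k=\sum _{r,j}({\hat{p}}_r{\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)})^2/({\hat{p}}_r{\hat{w}}_j^{(k)})\) and the adjusted statistic is \(\hat{\varDelta }_k^*=\hat{\varDelta }_k/\log J_k\).

```python
import numpy as np

def adjusted_pearson_chisq(y, x, n_classes, n_cats):
    """Sketch of the adjusted Pearson Chi-Square screening statistic
    Delta*_k = Delta_k / log(J_k) for one categorical covariate.
    (Illustrative implementation, not the authors' code.)"""
    n = len(y)
    p = np.bincount(y, minlength=n_classes) / n      # class probabilities p_r
    w = np.bincount(x, minlength=n_cats) / n         # category probabilities w_j
    pi = np.zeros((n_classes, n_cats))               # joint cell probabilities pi_{r,j}
    for r, j in zip(y, x):
        pi[r, j] += 1.0 / n
    expected = np.outer(p, w)                        # p_r * w_j under independence
    delta = np.sum((expected - pi) ** 2 / expected)  # Delta_k
    return delta / np.log(n_cats)                    # adjustment: divide by log J_k
```

Larger values indicate stronger dependence between \(Y\) and \(X_k\); the division by \(\log J_k\) puts covariates with different numbers of categories on a comparable scale.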

References

  • Cui HJ, Li RZ, Zhong W (2015) Model-free feature screening for ultrahigh dimensional discriminant analysis. J Am Stat Assoc 110:630–641
  • Fan JQ, Fan YY (2008) High dimensional classification using features annealed independence rules. Ann Stat 36:2605–2637
  • Fan JQ, Lv JC (2008) Sure independence screening for ultra-high dimensional feature space (with discussion). J R Stat Soc Ser B 70:849–911
  • Fan JQ, Song R (2010) Sure independent screening in generalized linear models with NP-dimensionality. Ann Stat 38:3567–3604
  • Fan JQ, Ma YB, Dai W (2014) Nonparametric independence screening in sparse ultra-high dimensional varying coefficient models. J Am Stat Assoc 109:1270–1284
  • He XM, Wang L, Hong HG (2013) Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann Stat 41:342–369
  • Huang DY, Li RZ, Wang HS (2014) Feature screening for ultrahigh dimensional categorical data with applications. J Bus Econ Stat 32:237–244
  • Li RZ, Zhong W, Zhu LP (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1139
  • Mai Q, Zou H (2013) The Kolmogorov filter for variable screening in high-dimensional binary classification. Biometrika 100:229–234
  • Mai Q, Zou H (2015) The fused Kolmogorov filter: a nonparametric model-free screening method. Ann Stat 43:1471–1497
  • Ni L, Fang F (2016) Entropy-based model-free feature screening for ultrahigh-dimensional multiclass classification. J Nonparametr Stat 28:515–530
  • Pan R, Wang HS, Li RZ (2016) Ultrahigh dimensional multi-class linear discriminant analysis by pairwise sure independence screening. J Am Stat Assoc 111:169–179
  • Quinlan JR (1992) C4.5: programs for machine learning, 1st edn. Morgan Kaufmann, Burlington
  • Wang HS (2009) Forward regression for ultra-high dimensional variable screening. J Am Stat Assoc 104:1512–1524
  • Zhu LP, Li LX, Li RZ, Zhu LX (2011) Model-free feature screening for ultrahigh dimensional data. J Am Stat Assoc 106:1464–1475

Acknowledgements

The authors would like to thank an anonymous reviewer and the Associate Editor for their helpful comments and suggestions. Lyu Ni’s research was partially supported by the ECNU Fund for Short Term Overseas Academic Visit (40600-511232-16204/010/001). Fang Fang’s research was partially supported by the Shanghai Natural Science Foundation (15ZR1410300), the Shanghai Rising-Star Program (16QA1401700), the National Natural Science Foundation of China (11601156), and the 111 Project (B14019).

Author information

Corresponding author

Correspondence to Fang Fang.

Appendix

Lemma 1

For categorical \(X_k\), under Conditions (C1) and (C3), we have

$$\begin{aligned} P\left( \left| {{\hat{\varDelta }}}_k^*-\varDelta _k^*\right| >2\varepsilon \right) \le O(RJ)\exp \left\{ -c_{5}\frac{n\varepsilon ^2}{R^6J^6}\right\} \end{aligned}$$

for any \(0<\varepsilon <1\), where \(c_{5}\) is a positive constant.

Proof

Note that

$$\begin{aligned} \log J_k\cdot ({{\hat{\varDelta }}}_k^*-\varDelta _k^*)= & {} {{\hat{\varDelta }}}_k-\varDelta _k\nonumber \\= & {} \sum _{r=1}^R\sum _{j=1}^{J_k}\left\{ \frac{\left( {\hat{p}}_r {\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) ^{2}}{{\hat{p}}_r {\hat{w}}_j^{(k)}}-\frac{\left( p_r w_j^{(k)}- \pi _{r,j}^{(k)}\right) ^{2}}{ p_r w_j^{(k)}}\right\} \nonumber \\= & {} \sum _{r=1}^R\sum _{j=1}^{J_k}\frac{1}{p_r w_j^{(k)}}\left\{ \left( {\hat{p}}_r {\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) ^{2}-\left( p_r w_j^{(k)}- \pi _{r,j}^{(k)}\right) ^{2}\right\} \nonumber \\&+ \sum _{r=1}^R\sum _{j=1}^{J_k}\left( {\hat{p}}_r {\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) ^{2}\left\{ \frac{1}{{\hat{p}}_r {\hat{w}}_j^{(k)}}-\frac{1}{p_r w_j^{(k)}}\right\} \nonumber \\=: & {} I_1+I_2. \end{aligned}$$
(3)

Since \(\log J_k \ge \log 2>0.5\), we have

$$\begin{aligned} P\left( \left| {{\hat{\varDelta }}}_k^*-\varDelta _k^*\right|>2\varepsilon \right)= & {} P\left( \log J_k \cdot \left| {{\hat{\varDelta }}}_k^*-\varDelta _k^*\right|>\log J_k \cdot 2\varepsilon \right) \nonumber \\\le & {} P\left( \left| {{\hat{\varDelta }}}_k -\varDelta _k\right|> \varepsilon \right) \nonumber \\\le & {} P\left( \left| I_1\right|> \frac{\varepsilon }{2}\right) + P\left( \left| I_2\right| > \frac{\varepsilon }{2}\right) . \end{aligned}$$
(4)

Further,

$$\begin{aligned} |I_1|\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k}\left| \frac{1}{p_r w_j^{(k)}}\left\{ \left( {\hat{p}}_r{\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) ^2-\left( p_r w_j^{(k)}- \pi _{r,j}^{(k)}\right) ^2\right\} \right| \nonumber \\\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k}\frac{4}{p_r w_j^{(k)}} \left| \left( {\hat{p}}_r {\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) -\left( p_r w_j^{(k)}- \pi _{r,j}^{(k)}\right) \right| \end{aligned}$$
(5)
$$\begin{aligned}\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k}\frac{4R J_k}{c_1^2} \left| \left( {\hat{p}}_r {\hat{w}}_j^{(k)}-p_r w_j^{(k)}\right) -\left( {\hat{\pi }}_{r,j}^{(k)}- \pi _{r,j}^{(k)}\right) \right| \end{aligned}$$
(6)
$$\begin{aligned}\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k} \frac{4R J_k}{c_1^2} \left| {\hat{p}}_r -p_r \right| + \sum _{r=1}^R\sum _{j=1}^{J_k} \frac{4R J_k}{c_1^2} \left| {\hat{w}}_j^{(k)}- w_j^{(k)}\right| \nonumber \\&+ \sum _{r=1}^R\sum _{j=1}^{J_k}\frac{4R J_k}{c_1^2}\left| {\hat{\pi }}_{r,j}^{(k)}- \pi _{r,j}^{(k)}\right| \end{aligned}$$
(7)
$$\begin{aligned}=: & {} I_{1.1}+I_{1.2}+I_{1.3}, \end{aligned}$$
(8)

where inequality (5) holds because \(\left| {\hat{p}}_r {\hat{w}}_j^{(k)}\right| \), \(\left| {\hat{\pi }}_{r,j}^{(k)}\right| \), \(\left| p_r w_j^{(k)}\right| \) and \(\left| \pi _{r,j}^{(k)}\right| \) all have upper bounds, inequality (6) holds because \(p_r\ge c_1/R\) and \(w_j^{(k)}\ge c_1/J_k\), and inequality (7) holds because \( \left| {\hat{p}}_r {\hat{w}}_j^{(k)}-p_r w_j^{(k)}\right| \le \left| {\hat{w}}_j^{(k)}- w_j^{(k)}\right| +\left| {\hat{p}}_r -p_r\right| \).

Similarly, since \(\left| \left( {\hat{p}}_r{\hat{w}}_j^{(k)}-{\hat{\pi }}_{r,j}^{(k)}\right) ^2\right| \le 2\left( {\hat{p}}_r^2 \left( {\hat{w}}_j^{(k)}\right) ^2+\left( {\hat{\pi }}_{r,j}^{(k)}\right) ^2\right) \le 4 \),

$$\begin{aligned} |I_2|\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k} \frac{4}{{\hat{p}}_r}\left| \frac{1}{{\hat{w}}_j^{(k)}}-\frac{1}{w_j^{(k)}}\right| +\sum _{r=1}^R\sum _{j=1}^{J_k} \frac{4}{w_j^{(k)}}\left| \frac{1}{{\hat{p}}_r}-\frac{1}{p_r}\right| \nonumber \\=: & {} I_{2.1}+I_{2.2}. \end{aligned}$$
(9)

Therefore,

$$\begin{aligned} P\left( |I_1|>\frac{\varepsilon }{2}\right)\le & {} P\left( I_{1.1}>\frac{\varepsilon }{6}\right) +P\left( I_{1.2}>\frac{\varepsilon }{6}\right) +P\left( I_{1.3}>\frac{\varepsilon }{6}\right) \text{ and } \end{aligned}$$
(10)
$$\begin{aligned} P\left( |I_2|>\frac{\varepsilon }{2}\right)\le & {} P\left( I_{2.1}>\frac{\varepsilon }{4}\right) +P\left( I_{2.2}>\frac{\varepsilon }{4}\right) . \end{aligned}$$
(11)

We deal with \(I_{1.3}\) first.

$$\begin{aligned} P\left( I_{1.3}>\frac{\varepsilon }{6}\right)= & {} P\left( \sum _{r=1}^R\sum _{j=1}^{J_k}\frac{4RJ_k}{c_1^2}\left| {\hat{\pi }}_{r,j}^{(k)}-\pi _{r,j}^{(k)}\right|>\frac{\varepsilon }{6}\right) \nonumber \\= & {} P\left( \sum _{r=1}^R\sum _{j=1}^{J_k} \left| {\hat{\pi }}_{r,j}^{(k)}-\pi _{r,j}^{(k)}\right|> \frac{\varepsilon c_1^2}{24RJ_k}\right) \nonumber \\\le & {} P\left( \max _{r,j}\left| {\hat{\pi }}_{r,j}^{(k)}-\pi _{r,j}^{(k)}\right|> \frac{\varepsilon c_1^2}{24R^2J_k^2}\right) \nonumber \\\le & {} \sum _{r=1}^R\sum _{j=1}^{J_k} P\left( \left| {\hat{\pi }}_{r,j}^{(k)}-\pi _{r,j}^{(k)}\right| > \frac{\varepsilon c_1^2}{24R^2J_k^2}\right) \end{aligned}$$
(12)
$$\begin{aligned}\le & {} 2RJ_k\exp \left\{ -\frac{\left( \frac{n\varepsilon c_1^2}{24R^2J_k^2}\right) ^2}{\frac{n}{2}+\frac{n\varepsilon c_1^2}{36R^2J_k^2}}\right\} , \end{aligned}$$
(13)

where inequality (13) holds due to Bernstein’s inequality. Similarly,

$$\begin{aligned} P\left( I_{1.1}>\frac{\varepsilon }{6}\right)\le & {} \sum _{r=1}^R P\left( \left| {\hat{p}}_r-p_r\right| >\frac{\varepsilon c_1^2}{24R^2 J_k^2}\right) \nonumber \\\le & {} 2R\exp \left\{ -\frac{\left( \frac{n\varepsilon c_1^2}{24R^2 J_k^2}\right) ^2}{\frac{n}{2}+\frac{n\varepsilon c_1^2}{36R^2J_k^2}}\right\} , \end{aligned}$$
(14)
$$\begin{aligned} P\left( I_{1.2}>\frac{\varepsilon }{6}\right)\le & {} \sum _{j=1}^{J_k} P\left( \left| {\hat{w}}_j^{(k)}-w_j^{(k)}\right| >\frac{\varepsilon c_1^2}{24R^2J_k^2}\right) \end{aligned}$$
(15)
$$\begin{aligned}\le & {} 2J_k\exp \left\{ -\frac{\left( \frac{n\varepsilon c_1^2}{24R^2J_k^2}\right) ^2}{\frac{n}{2}+\frac{n\varepsilon c_1^2}{36R^2J_k^2}}\right\} , \end{aligned}$$
(16)
$$\begin{aligned} P\left( I_{2.1}>\frac{\varepsilon }{4}\right)\le & {} \sum _{j=1}^{J_k} P\left( \left| {\hat{w}}_j^{(k)}-w_j^{(k)}\right|>\frac{\varepsilon c_1^3}{192R^2J_k^3}\right) \nonumber \\&+\, \sum _{j=1}^{J_k}P\left( \left| {\hat{w}}_j^{(k)}-w_j^{(k)}\right|>\frac{c_1}{2J_k}\right) +\sum _{r=1}^R P\left( \left| {\hat{p}}_r-p_r\right| > \frac{c_1}{2R}\right) \end{aligned}$$
(17)
$$\begin{aligned}\le & {} 2J_k \exp \left\{ -\frac{\left( \frac{n\varepsilon c_1^3}{192R^2J_k^3}\right) ^2}{\frac{n}{2}+\frac{n\varepsilon c_1^3}{288R^2J_k^3}}\right\} + 2J_k\exp \left\{ -\frac{\left( \frac{nc_1}{2J_k}\right) ^2}{\frac{n}{2}+\frac{nc_1}{3J_k}}\right\} \nonumber \\&+\, 2R \exp \left\{ -\frac{\left( \frac{nc_1}{2R}\right) ^2}{\frac{n}{2}+\frac{nc_1}{3R}}\right\} , \end{aligned}$$
(18)

and

$$\begin{aligned} P\left( I_{2.2}>\frac{\varepsilon }{4}\right)\le & {} \sum _{r=1}^R P\left( \left| {\hat{p}}_r-p_r\right|>\frac{\varepsilon c_1^3}{64R^3J_k^2}\right) +\sum _{r=1}^RP\left( \left| {\hat{p}}_r-p_r\right| > \frac{c_1}{2R}\right) \nonumber \\\le & {} 2R \exp \left\{ -\frac{\left( \frac{n\varepsilon c_1^3}{64R^3J_k^2}\right) ^2}{\frac{n}{2}+\frac{n\varepsilon c_1^3}{96R^3J_k^2}}\right\} + 2R \exp \left\{ -\frac{\left( \frac{nc_1}{2R}\right) ^2}{\frac{n}{2}+\frac{nc_1}{3R}}\right\} , \end{aligned}$$
(19)

where inequalities (14), (16), (18) and (19) hold due to Bernstein’s inequality.

Finally, by the inequalities (4), (8), (9), (10), (11), (13), (14), (16), (18) and (19), we have

$$\begin{aligned} P\left( \left| {{\hat{\varDelta }}}_k^*-\varDelta _k^*\right| >2\varepsilon \right) \le O(RJ_k)\exp \left\{ -c_5\frac{n\varepsilon ^2}{R^6J_k^6}\right\} \le O(RJ)\exp \left\{ -c_5\frac{n\varepsilon ^2}{R^6J^6}\right\} , \end{aligned}$$

for all \(k=1,\ldots , p\). \(\square \)

Proof of Theorem 1

By Lemma 1 and Conditions (C2) and (C3), we have

$$\begin{aligned} P(\mathcal {S}\subseteq {\hat{\mathcal {S}}}_{\text {APC}})\ge & {} P(|{{\hat{\varDelta }}}_k^*-\varDelta _k^*|\le cn^{-\tau },\forall k\in \mathcal {S})\\\ge & {} P\left( \max _{1\le k\le p}|{{\hat{\varDelta }}}_k^*-\varDelta _k^*|\le cn^{-\tau }\right) \\\ge & {} 1-\sum _{k=1}^p P(|{{\hat{\varDelta }}}_k^*-\varDelta _k^*|> cn^{-\tau })\\\ge & {} 1-O(RJ)p\exp \left\{ -c_5\frac{c^2n^{1-2\tau }}{4R^6J^6}\right\} \\\ge & {} 1-O(pn^{\xi +\kappa })\exp \{-bn^{1-2\tau -6\xi -6\kappa }\}\\\ge & {} 1-O(p\exp \{-bn^{1-2\tau -6\xi -6\kappa }+(\xi +\kappa )\log n\}), \end{aligned}$$

where b is a positive constant. \(\square \)
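In practice, the screened set \({\hat{\mathcal {S}}}_{\text {APC}}\) in Theorem 1 is formed by keeping every covariate whose adjusted statistic exceeds a threshold. A minimal sketch, where the function names are illustrative and the `threshold` argument stands in for the paper's \(cn^{-\tau }\):

```python
import numpy as np

def pearson_delta(y, x, n_classes, n_cats):
    """Delta_k from the Appendix:
    sum over (r, j) of (p_r w_j - pi_{r,j})^2 / (p_r w_j)."""
    n = len(y)
    p = np.bincount(y, minlength=n_classes) / n
    w = np.bincount(x, minlength=n_cats) / n
    pi = np.zeros((n_classes, n_cats))
    for r, j in zip(y, x):
        pi[r, j] += 1.0 / n
    e = np.outer(p, w)
    return np.sum((e - pi) ** 2 / e)

def screened_set(y, X, n_classes, threshold):
    """Keep covariate k when Delta*_k = Delta_k / log(J_k) exceeds the
    threshold (a stand-in for c * n^{-tau} in Theorem 1).
    X is a list of columns, each holding category codes 0..J_k - 1."""
    kept = []
    for k, x in enumerate(X):
        n_cats = max(x) + 1
        stat = pearson_delta(y, x, n_classes, n_cats) / np.log(n_cats)
        if stat >= threshold:
            kept.append(k)
    return kept
```

The sure screening property guarantees that, with probability tending to one, every truly relevant covariate survives this thresholding step.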

Lemma 2

For any continuous covariate \(X_k\) satisfying conditions (C4) and (C5), let \(F_k(y,x)\) be the cumulative distribution function of \((Y,X_k)\) and \({\hat{F}}_{k}(y,x)\) be the empirical cumulative distribution function. We have

$$\begin{aligned} P(|{\hat{F}}_k(r,{\hat{q}}_{k,(j)})-F_k(r,q_{k,(j)})|>\varepsilon )\le c_6 \exp \{-c_7 n^{1-2\rho }\varepsilon ^2\} \end{aligned}$$

for any \(\varepsilon >0\), \(1\le r\le R\) and \(1\le j\le J_k\), where \({\hat{q}}_{k,(j)}\) and \(q_{k,(j)}\) are the sample and population \(j/J_k\)-percentiles of \(X_k\), and \(c_6\) and \(c_7\) are two positive constants.

Proof

Details can be found in Ni and Fang (2016). \(\square \)

Lemma 3

For any continuous \(X_k\), under Conditions (C1), (C4) and (C5), we have

$$\begin{aligned} P\left( \left| {{\hat{\varDelta }}}_k^*-\varDelta _k^*\right| >2\varepsilon \right) \le O(RJ)\exp \left\{ -c_{9}\frac{n^{1-2\rho }\varepsilon ^2}{R^6J^6}\right\} \end{aligned}$$

for any \(0<\varepsilon <1\), where \(c_{9}\) is a positive constant.

Proof

Since

$$\begin{aligned} {\hat{w}}_{j}^{(k)}-w_{j}^{(k)}= & {} \frac{1}{n}\sum _{i=1}^{n}I\left\{ X_{i,k}\in ({\hat{q}}_{k,(j-1)},{\hat{q}}_{k,(j)}]\right\} -P\left( X_k \in (q_{k,(j-1)},q_{k,(j)}]\right) \\= & {} \sum _{r=1}^R \left\{ ({\hat{F}}(r,{\hat{q}}_{k,(j)})-F(r,q_{k,(j)}))-({\hat{F}}(r-1,{\hat{q}}_{k,(j)})-F(r-1,q_{k,(j)}))\right. \\&-\, ({\hat{F}}(r,{\hat{q}}_{k,(j-1)})-F(r,q_{k,(j-1)}))+({\hat{F}}(r-1,{\hat{q}}_{k,(j-1)})\\&\left. -\,F(r-1,q_{k,(j-1)}))\right\} , \end{aligned}$$

we have

$$\begin{aligned} \left| {\hat{w}}_{j}^{(k)}-w_{j}^{(k)}\right|\le & {} \sum _{r=1}^R \left| {\hat{F}}(r,{\hat{q}}_{k,(j)})-F(r,q_{k,(j)})\right| +\sum _{r=1}^R \left| {\hat{F}}(r-1,{\hat{q}}_{k,(j)})-F(r-1,q_{k,(j)})\right| \\&+\,\sum _{r=1}^R\left| {\hat{F}}(r,{\hat{q}}_{k,(j-1)})-F(r,q_{k,(j-1)})\right| \nonumber \\&+\,\sum _{r=1}^R\left| {\hat{F}}(r-1,{\hat{q}}_{k,(j-1)})-F(r-1,q_{k,(j-1)})\right| \\=: & {} I_{3.1}+I_{3.2}+I_{3.3}+I_{3.4}. \end{aligned}$$

So

$$\begin{aligned} P\left( \left| {\hat{w}}_{j}^{(k)}-w_{j}^{(k)}\right|>\varepsilon \right)\le & {} P\left( I_{3.1}>\frac{\varepsilon }{4}\right) +P\left( I_{3.2}>\frac{\varepsilon }{4}\right) \nonumber \\&+ P\left( I_{3.3}>\frac{\varepsilon }{4}\right) +P\left( I_{3.4}>\frac{\varepsilon }{4}\right) . \end{aligned}$$

Further,

$$\begin{aligned} P\left( I_{3.1}>\frac{\varepsilon }{4}\right)\le & {} P\left( \max _{r}\left| {\hat{F}}(r,{\hat{q}}_{k,(j)})-F(r,q_{k,(j)})\right|>\frac{\varepsilon }{4R}\right) \nonumber \\\le & {} \sum _{r=1}^R P\left( \left| {\hat{F}}(r,{\hat{q}}_{k,(j)})-F(r,q_{k,(j)})\right| >\frac{\varepsilon }{4R}\right) \nonumber \\\le & {} c_6 R\exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2}{16R^2}\right\} , \end{aligned}$$
(20)

where inequality (20) holds because of Lemma 2.

Similarly, \(P\left( I_{3.2}>\varepsilon /4\right) \), \(P\left( I_{3.3}>\varepsilon /4\right) \) and \(P\left( I_{3.4}>\varepsilon /4\right) \) are all bounded above by \(c_6R\cdot \exp \left\{ -c_7n^{1-2\rho }\varepsilon ^2/{(16R^2)}\right\} \). Then

$$\begin{aligned} P\left( \left| {\hat{w}}_{j}^{(k)}-w_{j}^{(k)}\right| >\varepsilon \right) \le 4c_6R \exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2}{16R^2}\right\} . \end{aligned}$$
(21)

Similarly,

$$\begin{aligned} P\left( \left| {\hat{\pi }}_{r,j}^{(k)}-\pi _{r,j}^{(k)}\right| >\varepsilon \right) \le 4c_6\exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2}{16}\right\} . \end{aligned}$$
(22)

By inequalities (12), (15), (17), combined with inequalities (21) and (22), we have

$$\begin{aligned} P\left( I_{1.2}>\frac{\varepsilon }{6}\right)\le & {} 4c_6RJ_k\exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2c_1^4}{96^2R^6J_k^4}\right\} ,\\ P\left( I_{2.1}>\frac{\varepsilon }{4}\right)\le & {} 4c_6RJ_k\exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2c_1^6}{768^2R^6J_k^6}\right\} \nonumber \\&+\, 4c_6RJ_k\exp \left\{ -c_7n^{1-2\rho }\frac{c_1^2}{64R^2J_k^2}\right\} + 2R\exp \left\{ -\frac{\left( \frac{nc_1}{2R}\right) ^2}{\frac{n}{2}+\frac{nc_1}{3R}}\right\} , \text{ and } \\ P\left( I_{1.3}>\frac{\varepsilon }{6}\right)\le & {} 4c_6RJ_k\exp \left\{ -c_7n^{1-2\rho }\frac{\varepsilon ^2c_1^4}{96^2R^4J_k^4}\right\} , \end{aligned}$$

and the arguments (3), (4), (8), (9), (10), (11), (14) and (19) still hold. Therefore,

$$\begin{aligned} P\left( \left| {{\hat{\varDelta }}}_k^{*}-\varDelta _k^{*}\right| >2\varepsilon \right) \le O(RJ_k)\exp \left\{ -c_9\frac{n^{1-2\rho }\varepsilon ^2}{R^6J_k^6}\right\} . \end{aligned}$$

\(\square \)
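For continuous covariates, Lemmas 2 and 3 rest on slicing \(X_k\) at its sample \(j/J_k\)-percentiles \({\hat{q}}_{k,(j)}\) into right-closed intervals \(({\hat{q}}_{k,(j-1)},{\hat{q}}_{k,(j)}]\); the sliced covariate then enters the same screening statistic as a categorical one. A minimal sketch of the slicing step (the function name is illustrative):

```python
import numpy as np

def discretize_by_quantiles(x, n_slices):
    """Slice a continuous covariate at its sample quantiles
    q_hat_{k,(1)}, ..., q_hat_{k,(J_k - 1)} into J_k = n_slices
    right-closed intervals (q_{(j-1)}, q_{(j)}], as in Lemmas 2 and 3.
    Returns category codes 0..n_slices - 1."""
    # interior sample quantiles at levels j / J_k, j = 1, ..., J_k - 1
    cuts = np.quantile(x, np.arange(1, n_slices) / n_slices)
    # with side='left', each value is assigned the number of cut points
    # strictly below it, which realizes the right-closed intervals
    return np.searchsorted(cuts, x, side='left')
```

The output can be passed directly to a categorical screening statistic; the choice of \(J_k\) plays the role of the slicing parameter in Conditions (C4) and (C5).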

Proof of Theorem 2

With Lemma 3 and Condition (C2), the proof of Theorem 2 is the same as that of Theorem 1 and hence is omitted. \(\square \)

Cite this article

Ni, L., Fang, F. & Wan, F. Adjusted Pearson Chi-Square feature screening for multi-classification with ultrahigh dimensional data. Metrika 80, 805–828 (2017). https://doi.org/10.1007/s00184-017-0629-9
