
Semiparametric mixtures of regressions with single-index for model based clustering

  • Regular Article
  • Published in: Advances in Data Analysis and Classification

Abstract

In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models are indeed special cases of the new models. Backfitting estimates and the corresponding modified EM algorithms are proposed to achieve optimal convergence rates for both parametric and nonparametric parts. We establish the identifiability results of the proposed two models and investigate the asymptotic properties of the proposed estimation procedures. Simulation studies are conducted to demonstrate the finite sample performance of the proposed models. Two real data applications using the new models reveal some interesting findings.
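As a rough illustration of the model class described above, the following sketch draws data from a two-component single-index mixture of regressions, in which the mixing proportion and the component mean functions depend on the covariates only through the index \({\varvec{\alpha }}^\top {{\varvec{x}}}\). The component functions, dimensions, and noise levels are hypothetical choices for illustration only, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-component single-index mixture of regressions:
# given x, let z = alpha^T x; component j is selected with probability
# pi_j(z), and then y = m_j(z) + sigma_j * eps.
n, p = 500, 5
alpha = np.ones(p) / np.sqrt(p)           # index direction, ||alpha|| = 1
x = rng.normal(size=(n, p))
z = x @ alpha                             # single index reduces x to a scalar

pi1 = 1.0 / (1.0 + np.exp(-z))            # smooth mixing proportion pi_1(z)
comp = rng.uniform(size=n) < pi1          # latent component labels
m1, m2 = np.sin(z), 0.5 * z ** 2          # illustrative regression functions
y = np.where(comp, m1 + 0.3 * rng.normal(size=n),
                   m2 + 0.5 * rng.normal(size=n))

print(y.shape, z.shape)
```

Even though \({{\varvec{x}}}\) is \(p\)-dimensional, the nonparametric components are functions of the scalar index \(z\), which is what lets the models accommodate high dimensional predictors.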



Acknowledgements

The authors are grateful to the editor, the guest editor, and two referees for numerous helpful comments during the preparation of the article. Funding was provided by National Natural Science Foundation of China (Grant No. 11601477), Natural Science Foundation (USA) (Grant No. DMS-1461677), Department of Energy (Grant No. 10006272), the First Class Discipline of Zhejiang - A (Zhejiang University of Finance and Economics-Statistics), China (Grant No. NA) and Natural Science Foundation of Zhejiang Province (Grant No. LY19A010006).

Author information

Correspondence to Sijia Xiang.


Appendix A


Technical conditions

  1. (C1)

    The sample \(\{({{\varvec{x}}}_i,Y_i),i=1,\ldots ,n\}\) is independent and identically distributed from its population \(({{\varvec{x}}},Y)\). The support for \({{\varvec{x}}}\), denoted by \(\mathscr {X}\), is a compact subset of \(\mathbb {R}^p\).

  2. (C2)

    The marginal density of \({\varvec{\alpha }}^\top {{\varvec{x}}}\), denoted by \(f(\cdot )\), is twice continuously differentiable and positive at the point z.

  3. (C3)

    The kernel function \(K(\cdot )\) has a bounded support, and satisfies that

    $$\begin{aligned}&\int K(t)dt=1,\qquad \int tK(t)dt=0,\qquad \int t^2K(t)dt<\infty ,\nonumber \\&\int K^2(t)dt<\infty ,\qquad \int |K^3(t)|dt<\infty . \end{aligned}$$
  4. (C4)

    \(h\rightarrow 0\), \(nh\rightarrow \infty \), and \(nh^5=O(1)\) as \(n\rightarrow \infty \).

  5. (C5)

    The third derivative \(|\partial ^3\ell ({\varvec{\theta }},y)/\partial \theta _i\partial \theta _j\partial \theta _k|\le M(y)\) for all y and all \({\varvec{\theta }}\) in a neighborhood of \({\varvec{\theta }}(z)\), and \(E[M(y)]<\infty \).

  6. (C6)

    The unknown functions \({\varvec{\theta }}(z)\) have continuous second derivatives. For \(j=1,\ldots ,k\), \(\sigma _j^2(z)>0\), and \(\pi _j(z)>0\) for all \({{\varvec{x}}}\in \mathscr {X}\).

  7. (C7)

    For all i and j, the following conditions hold:

    $$\begin{aligned} E\left[ \left| \frac{\partial \ell ({\varvec{\theta }}(z),Y)}{\partial \theta _i}\right| ^3\right]<\infty \qquad E\left[ \left( \frac{\partial ^2\ell ({\varvec{\theta }}(z),Y)}{\partial \theta _i\partial \theta _j}\right) ^2\right] <\infty \end{aligned}$$
  8. (C8)

    \({\varvec{\theta }}_0''(\cdot )\) is continuous at the point z.

  9. (C9)

    The third derivative \(|\partial ^3\ell ({\varvec{\pi }},y)/\partial \pi _i\partial \pi _j\partial \pi _k|\le M(y)\) for all y and all \({\varvec{\pi }}\) in a neighborhood of \({\varvec{\pi }}(z)\), and \(E[M(y)]<\infty \).

  10. (C10)

    The unknown functions \({\varvec{\pi }}(z)\) have continuous second derivatives. For \(j=1,\ldots ,k\), \(\pi _j(z)>0\) for all \({{\varvec{x}}}\in \mathscr {X}\).

  11. (C11)

    For all i and j, the following conditions hold:

    $$\begin{aligned} E\left[ \left| \frac{\partial \ell ({\varvec{\pi }}(z),Y)}{\partial \pi _i}\right| ^3\right]<\infty \qquad E\left[ \left( \frac{\partial ^2\ell ({\varvec{\pi }}(z),Y)}{\partial \pi _i\partial \pi _j}\right) ^2\right] <\infty \end{aligned}$$
  12. (C12)

    \({\varvec{\pi }}''(\cdot )\) is continuous at the point z.
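Conditions (C3) and (C4) are satisfied by standard choices. As a quick numerical illustration (not part of the original appendix), the check below verifies the moment conditions for the Epanechnikov kernel \(K(t)=\frac{3}{4}(1-t^2)\) on \([-1,1]\), whose constants \(\kappa _2=\int t^2K(t)dt\) and \(\nu _0=\int K^2(t)dt\) reappear in the bias and variance expressions of the proofs, and confirms the bandwidth rates for the standard choice \(h=n^{-1/5}\).

```python
import numpy as np

# (C3): moment conditions for the Epanechnikov kernel, via a Riemann sum.
t = np.linspace(-1.0, 1.0, 200001)
K = 0.75 * (1.0 - t ** 2)
dt = t[1] - t[0]

kappa0 = np.sum(K) * dt           # int K(t) dt     = 1
kappa1 = np.sum(t * K) * dt       # int t K(t) dt   = 0 by symmetry
kappa2 = np.sum(t ** 2 * K) * dt  # int t^2 K(t) dt = 1/5 (finite)
nu0 = np.sum(K ** 2) * dt         # int K(t)^2 dt   = 3/5 (finite)

# (C4): h -> 0 and n*h -> infinity, while n*h^5 = 1 stays bounded.
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    h = n ** (-1 / 5)
    print(n, round(h, 4), round(n * h, 1), round(n * h ** 5, 6))
```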

Proof of Theorem 1

Ichimura (1993) has shown that under conditions (i)–(iv), \({\varvec{\alpha }}\) is identifiable. Further, Huang et al. (2013) showed that under condition (v), the nonparametric functions are identifiable. This completes the proof. \(\square \)

Proof of Theorem 2

Let

$$\begin{aligned}&\hat{\pi }_j^*=\sqrt{nh}\{\hat{\pi }_j-\pi _{j}(z)\},\quad j=1,\ldots ,k-1,\\&\hat{m}_j^*=\sqrt{nh}\{\hat{m}_j-m_{j}(z)\},\quad j=1,\ldots ,k,\\&\hat{\sigma }_j^{2*}=\sqrt{nh}\{\hat{\sigma }_j^2-\sigma _{j}^2(z)\},\quad j=1,\ldots ,k. \end{aligned}$$

Define \(\hat{{\varvec{\pi }}}^*=(\hat{\pi }_1^*,\ldots ,\hat{\pi }_{k-1}^*)^\top \), \(\hat{{{\varvec{m}}}}^*=(\hat{m}_1^*,\ldots ,\hat{m}_k^*)^\top \), \(\hat{{{\varvec{\sigma }}}}^{2*}=(\hat{\sigma }_1^{2*},\ldots ,\hat{\sigma }_k^{2*})^\top \) and denote \(\hat{{\varvec{\theta }}}^*=(\hat{{\varvec{\pi }}}^{*\top },\hat{{{\varvec{m}}}}^{*\top },(\hat{{{\varvec{\sigma }}}}^{2*})^\top )^\top \). Let \(a_n=(nh)^{-1/2}\) and

$$\begin{aligned} \ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)=\log \left\{ \sum _{j=1}^k\pi _j(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i)\phi (Y_i|m_j(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i),\sigma _j^2(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i))\right\} K_h(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i-z). \end{aligned}$$
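To make the kernel-weighted mixture log-likelihood above concrete, here is a minimal numerical sketch (the component functions, bandwidth, and data are hypothetical, and \(\phi \) is the normal density) of \(\sum _i\log \{\sum _j\pi _j\phi (Y_i|m_j,\sigma _j^2)\}K_h(z_i-z)\) evaluated at a point \(z\):

```python
import numpy as np

def phi(y, m, s2):
    """Normal density with mean m and variance s2."""
    return np.exp(-0.5 * (y - m) ** 2 / s2) / np.sqrt(2 * np.pi * s2)

def local_loglik(z, zi, yi, pi_z, m_z, s2_z, h):
    """Sum_i log{ sum_j pi_j * phi(y_i | m_j, s2_j) } * K_h(z_i - z)."""
    u = (zi - z) / h
    Kh = np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0) / h  # Epanechnikov
    mix = sum(p * phi(yi, m, s2) for p, m, s2 in zip(pi_z, m_z, s2_z))
    return np.sum(np.log(mix) * Kh)

# Hypothetical index values and responses near z = 0.
rng = np.random.default_rng(1)
zi = rng.uniform(-1, 1, 300)
yi = np.sin(zi) + 0.3 * rng.normal(size=300)
val = local_loglik(0.0, zi, yi, pi_z=[0.5, 0.5],
                   m_z=[np.sin(0.0), 0.0], s2_z=[0.09, 0.25], h=0.3)
print(np.isfinite(val))
```

The kernel weight \(K_h(z_i-z)\) localizes the fit, so only observations whose index falls within a bandwidth of \(z\) contribute to the local parameter estimates.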

If \((\hat{{\varvec{\pi }}},\hat{{{\varvec{m}}}},\hat{{{\varvec{\sigma }}}}^2)^\top \) maximizes (4), then \(\hat{{\varvec{\theta }}}^*\) maximizes

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)=h\sum _{i=1}^n[\ell ({\varvec{\theta }}(z)+a_n{\varvec{\theta }}^*,\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)-\ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)]K_h(\hat{Z}_i-z) \end{aligned}$$
(22)

with respect to \({\varvec{\theta }}^*\). By a Taylor expansion,

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)={{\varvec{W}}}_{1n}^\top {\varvec{\theta }}^*+\frac{1}{2}{\varvec{\theta }}^{*T}{{\varvec{A}}}_{1n}{\varvec{\theta }}^*+o_p(1), \end{aligned}$$
(23)

where

$$\begin{aligned} {{\varvec{W}}}_{1n}=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(\hat{Z}_i-z), \end{aligned}$$

and

$$\begin{aligned} {{\varvec{A}}}_{1n}=\frac{1}{n}\sum _{i=1}^n\frac{\partial ^2\ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }K_h(\hat{Z}_i-z). \end{aligned}$$

By WLLN, it can be shown that \({{\varvec{A}}}_{1n}=-f(z)\mathscr {I}^{(1)}_\theta (z)+o_p(1)\). Therefore,

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)={{\varvec{W}}}_{1n}^\top {\varvec{\theta }}^*-\frac{1}{2}f(z){\varvec{\theta }}^{*T}\mathscr {I}^{(1)}_\theta (z){\varvec{\theta }}^*+o_p(1). \end{aligned}$$
(24)

Using the quadratic approximation lemma (see, for example, Fan and Gijbels 1996), we have that

$$\begin{aligned} \hat{{\varvec{\theta }}}^*=f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}{{\varvec{W}}}_{1n}+o_p(1). \end{aligned}$$
(25)

Note that

$$\begin{aligned} {{\varvec{W}}}_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)+D_{1n}+O_p\left( \sqrt{\frac{h}{n}}\Vert \tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}\Vert ^2\right) \end{aligned}$$

where

$$\begin{aligned} D_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\left\{ \frac{\partial ^2\ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }[{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)]^\top K_h(Z_i-z)\right\} (\tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}). \end{aligned}$$

Since \(\sqrt{n}(\tilde{{\varvec{\alpha }}}-{\varvec{\alpha }})=O_p(1)\), it can be shown that

$$\begin{aligned} D_{1n}=-\sqrt{h}f(z)E\left[ \frac{\partial ^2\ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}},Y)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top \right] =o_p(1), \end{aligned}$$

and

$$\begin{aligned} O_p\left( \sqrt{\frac{h}{n}}\Vert \tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}\Vert ^2\right) =o_p(1). \end{aligned}$$

Therefore,

$$\begin{aligned} {{\varvec{W}}}_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)+o_p(1). \end{aligned}$$

To complete the proof, we now calculate the mean and variance of \({{\varvec{W}}}_{1n}\). Note that

$$\begin{aligned} E({{\varvec{W}}}_{1n})&=\sqrt{nh}E\left[ E\left[ \frac{\partial \ell ({\varvec{\theta }},{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)|Z=z_0\right] \right] \nonumber \\&=\sqrt{nh}\left[ \frac{1}{2}f(z)\varLambda _1^{''}(z|z)+f'(z)\varLambda _1^{'}(z|z)\right] \kappa _2h^2. \end{aligned}$$
(26)

Similarly, we can show that

$$\begin{aligned} \text {Cov}({{\varvec{W}}}_{1n})=f(z)\mathscr {I}^{(1)}_\theta (z)\nu _0+o_p(1), \end{aligned}$$

where \(\kappa _l=\int t^lK(t)dt\) and \(\nu _l=\int t^lK^2(t)dt\). The rest of the proof follows a standard argument. \(\square \)

Proof of Theorem 3

Denote \(Z={\varvec{\alpha }}^\top {{\varvec{x}}}\) and \(\hat{Z}=\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}\). Let \(\ell ({\varvec{\theta }}(z),X,Y)=\log \sum _{j=1}^k\pi _j(z)\phi (Y|m_j(z),\sigma _j^2(z))\). If \(\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})\) maximizes (4), then it solves

$$\begin{aligned} {{\varvec{0}}}=n^{-1}\sum _{i=1}^n\frac{\partial \ell (\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}}),X_i,Y_i)}{\partial {\varvec{\theta }}}K_h(\hat{Z}_i-z_0). \end{aligned}$$

Applying a Taylor expansion and using the conditions on h, we obtain

$$\begin{aligned} {{\varvec{0}}}&=n^{-1}\sum _{i=1}^nq_{1i}(Z_i)K_h(Z_i-z_0)\\&\quad + n^{-1}\sum _{i=1}^n\left[ q_{2i}(Z_i)K_h(Z_i-z_0)\right] (\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}(z_0))\\&\quad +n^{-1}\sum _{i=1}^nq_{2i}(Z_i)[{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)]^\top K_h(Z_i-z_0)(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+o_p(n^{-1/2})+O_p(h^2). \end{aligned}$$

By an argument similar to that in the previous proof,

$$\begin{aligned} \hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}(z_0)&=n^{-1}f^{-1}(z_0)\mathscr {I}^{(1)-1}_\theta (z_0)\sum _{i=1}^nq_{1i}(Z_i)K_h(Z_i-z_0)\nonumber \\&\quad -\mathscr {I}^{(1)-1}_\theta (z_0)E\{q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top |Z=z_0\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+o_p(n^{-1/2}). \end{aligned}$$
(27)

Note that

$$\begin{aligned} \hat{{\varvec{\theta }}}&(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)=\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}) -\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)\nonumber \\&=(\hat{{\varvec{\theta }}}'({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}))^\top (\hat{{\varvec{\alpha }}}^\top -{\varvec{\alpha }}^\top ){{\varvec{x}}}_i+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)+o_p(n^{-1/2})\nonumber \\&=({\varvec{\theta }}'({\varvec{\alpha }}^\top {{\varvec{x}}}_i))^\top (\hat{{\varvec{\alpha }}}^\top -{\varvec{\alpha }}^\top ){{\varvec{x}}}_i+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)+o_p(n^{-1/2}), \end{aligned}$$
(28)

where the second part is handled by (27).

Since \(\hat{{\varvec{\alpha }}}\) maximizes (9), it is the solution to

$$\begin{aligned} {{\varvec{0}}}=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i\hat{{\varvec{\theta }}}'(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}) \frac{\partial \ell (\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}),X_i,Y_i)}{\partial {\varvec{\theta }}}, \end{aligned}$$

where \(\lambda \) is the Lagrange multiplier. By the Taylor expansion and using (28), we have that

$$\begin{aligned} {{\varvec{0}}}&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i)\\&\quad +n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{2i}(Z_i) [\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)]+o_p(1)\\&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i)\\&\quad +n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{2i}(Z_i)({{\varvec{x}}}_i{\varvec{\theta }}'(Z_i))^\top (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) [\hat{{\varvec{\theta }}}(Z_i)-{\varvec{\theta }}(Z_i)]+o_p(1). \end{aligned}$$

Define

$$\begin{aligned} A_\alpha =E\{[{{\varvec{x}}}{\varvec{\theta }}'(Z)]q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top \}, \end{aligned}$$

and apply (27),

$$\begin{aligned} {{\varvec{0}}}&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{1i}(Z_i)+n^{1/2}A_\alpha (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad -n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)E\{q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top |Z=Z_i\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) n^{-1}f^{-1}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)\nonumber \\&\quad \times \sum _{t=1}^nq_{1t}(Z_t)K_h(Z_t-Z_i)+o_p(1)\nonumber \\&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i) +{{\varvec{Q}}}_1 n^{1/2}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) n^{-1}f^{-1}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)\nonumber \\&\quad \times \sum _{t=1}^nq_{1t}(Z_t)K_h(Z_t-Z_i)+o_p(1). \end{aligned}$$
(29)

Interchanging the summations in the last term, we get

$$\begin{aligned}&n^{-1/2}\sum _{i=1}^n\left[ n^{-1}\sum _{t=1}^n{{\varvec{x}}}_t{\varvec{\theta }}'(Z_t)q_{2t}(Z_t) K_h(Z_t-Z_i)f^{-1}(Z_t)\mathscr {I}^{(1)-1}_\theta (Z_t)q_{1i}(Z_i)\right] \nonumber \\&\quad =n^{-1/2}\sum _{i=1}^nE[{{\varvec{x}}}{\varvec{\theta }}'(Z)q_2(Z)|Z_i]\mathscr {I}^{(1)-1}_\theta (Z_i)q_{1i}(Z_i)+o_p(1). \end{aligned}$$
(30)

Let \(\varGamma _\alpha =I-{\varvec{\alpha }}{\varvec{\alpha }}^\top +o_p(1)\). Combining (29) and (30), and multiplying by \(\varGamma _\alpha \), we have

$$\begin{aligned}&\varGamma _\alpha {{\varvec{Q}}}_1 n^{1/2}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})=n^{-1/2}\sum _{i=1}^n\varGamma _\alpha \{{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)\nonumber \\&\quad +E[{{\varvec{x}}}{\varvec{\theta }}'(Z)q_2(Z)|Z_i]\mathscr {I}^{(1)-1}_\theta (Z_i)\}q_{1i}(Z_i)+o_p(1). \end{aligned}$$
(31)

It can be shown that the right-hand side of (31) has covariance matrix \(\varGamma _\alpha {{\varvec{Q}}}_1\varGamma _\alpha \), which completes the proof. \(\square \)
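The matrix \(\varGamma _\alpha =I-{\varvec{\alpha }}{\varvec{\alpha }}^\top \) used in the final step of this proof is the projection onto the orthogonal complement of \({\varvec{\alpha }}\): it removes the direction that carries no information because of the constraint \(\Vert {\varvec{\alpha }}\Vert =1\). A short numerical check (illustrative only, with an arbitrary unit vector):

```python
import numpy as np

p = 4
alpha = np.array([0.5, 0.5, 0.5, 0.5])    # an arbitrary unit vector
Gamma = np.eye(p) - np.outer(alpha, alpha)

print(np.allclose(Gamma @ alpha, 0))      # annihilates the alpha direction
print(np.allclose(Gamma @ Gamma, Gamma))  # idempotent, hence a projection
```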

Proof of Theorem 4

Ichimura (1993) has shown that under conditions (i)–(iv), \({\varvec{\alpha }}\) is identifiable. Furthermore, Huang and Yao (2012) showed that under condition (v), \(({\varvec{\pi }}(\cdot ),{\varvec{\beta }},{{\varvec{\sigma }}}^2)\) are identifiable. This completes the proof. \(\square \)

Proof of Theorem 5

This proof is similar to the proof of Theorem 2.

Let \(\hat{\pi }_j^*=\sqrt{nh}\{\hat{\pi }_j-\pi _j(z)\}\), \(j=1,\ldots ,k-1\), and \(\hat{{\varvec{\pi }}}^*=(\hat{\pi }_1^*,\ldots ,\hat{\pi }_{k-1}^*)^\top \). It can be shown that

$$\begin{aligned} \hat{{\varvec{\pi }}}^*=f(z)^{-1}\mathscr {I}_\pi ^{(2)-1}(z){{\varvec{W}}}_{2n}+o_p(1), \end{aligned}$$

where

$$\begin{aligned} {{\varvec{W}}}_{2n}=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\pi }}(z),\hat{{{\varvec{\lambda }}}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\pi }}}K_h(\hat{Z}_i-z). \end{aligned}$$

To complete the proof, notice that

$$\begin{aligned} E({{\varvec{W}}}_{2n})=\,&\sqrt{nh}E\left\{ E[\frac{\partial \ell ({\varvec{\pi }},{{\varvec{\lambda }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\pi }}}K_h(Z_i-z)|Z=z_0]\right\} \\ =\,&\sqrt{nh}[\frac{1}{2}f(z)\varLambda _2''(z|z)+f'(z)\varLambda _2'(z|z)]\kappa _2h^2, \end{aligned}$$

and Cov\(({{\varvec{W}}}_{2n})=f(z)\mathscr {I}^{(2)}_\pi (z)\nu _0+o_p(1)\). The rest of the proof follows a standard argument. \(\square \)

Proof of Theorem 6

The proof is similar to the proof of Theorem 3. It can be shown that

$$\begin{aligned} \hat{{\varvec{\pi }}}(z_0;\hat{{{\varvec{\lambda }}}})-{\varvec{\pi }}(z_0)&=n^{-1}f^{-1}(z_0)\mathscr {I}^{(2)-1}_\pi (z_0)\sum _{i=1}^nq_{\pi i}(Z_i)K_h(Z_i-z_0)\\&\quad -\mathscr {I}^{(2)-1}_\pi (z_0) E\{q_{\pi \pi }(Z)[{{\varvec{x}}}{\varvec{\pi }}'(Z)]^\top |Z=z_0\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\\&\quad -\mathscr {I}^{(2)-1}_\pi (z_0) E\{q_{\pi \eta }(Z)|Z=z_0\}(\hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}})+o_p(n^{-1/2}), \end{aligned}$$

and therefore,

$$\begin{aligned}&\hat{{\varvec{\pi }}}(\hat{Z}_i;\hat{{{\varvec{\lambda }}}})-{\varvec{\pi }}(Z_i)=\{{{\varvec{x}}}_i{\varvec{\pi }}'(Z_i)\}^\top (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+\hat{{\varvec{\pi }}}(Z_i;\hat{{{\varvec{\lambda }}}})\nonumber \\&\quad -{\varvec{\pi }}(Z_i)+o_p(n^{-\frac{1}{2}}). \end{aligned}$$
(32)

Since \(\hat{{{\varvec{\lambda }}}}\) maximizes (14), it is the solution to

$$\begin{aligned} {{\varvec{0}}}=\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n\begin{pmatrix}{{\varvec{x}}}_i\hat{{\varvec{\pi }}}'(\hat{Z}_i;\hat{{{\varvec{\lambda }}}})\\ \mathbf{I} \end{pmatrix}q_\pi (\hat{{\varvec{\pi }}}(\hat{Z}_i;\hat{{{\varvec{\lambda }}}}),\hat{{{\varvec{\lambda }}}}), \end{aligned}$$

where \(\gamma \) is the Lagrange multiplier. By a Taylor expansion and (32),

$$\begin{aligned} {{\varvec{0}}}=\,&\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi i}(Z_i)+n^{\frac{1}{2}}{{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}\nonumber \\&+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi \pi i}(Z_i)n^{-1}f^{-1}(Z_i)\mathscr {I}^{(2)-1}_\pi (Z_i)\nonumber \\&\times \sum _{j=1}^nq_{\pi j}(Z_j)K_h(Z_j-Z_i)+o_p(1)\nonumber \\ =\,&\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi i}(Z_i)+n^{\frac{1}{2}}{{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}\nonumber \\&+n^{-\frac{1}{2}}\sum _{i=1}^nE[{{\varvec{\varLambda }}}_{1i}q_{\pi \pi }(Z_i)]\mathscr {I}^{(2)-1}_\pi (Z_i)q_{\pi i}(Z_i)+o_p(1), \end{aligned}$$
(33)

where \({{\varvec{\varLambda }}}_{1i}=\begin{pmatrix}{{\varvec{x}}}_i{\varvec{\pi }}'(Z_i)\\ \mathbf{I} \end{pmatrix}\), and the last equality follows from interchanging the summations. Let \(\varGamma _\alpha =\begin{pmatrix}{{\varvec{I}}}-{\varvec{\alpha }}{\varvec{\alpha }}^\top &{}\mathbf 0 \\ \mathbf 0 &{}{{\varvec{I}}}\end{pmatrix}+o_p(1)\). Multiplying (33) by \(\varGamma _\alpha \), we have

$$\begin{aligned}&n^{\frac{1}{2}}\varGamma _\alpha {{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}=n^{-\frac{1}{2}}\sum _{i=1}^n\varGamma _\alpha \left\{ {{\varvec{\varLambda }}}_{1i}-\mathscr {I}_\pi ^{(2)-1}(Z_i)E[{{\varvec{\varLambda }}}_{1i}(Z_i)q_{\pi \pi }(Z_i)|Z_i]\right\} \nonumber \\&\quad \times \, q_{\pi i}(Z_i)+o_p(1). \end{aligned}$$
(34)

It can be shown that the right-hand side of (34) has covariance matrix \(\varGamma _\alpha {{\varvec{Q}}}_2\varGamma _\alpha \), which completes the proof. \(\square \)


Cite this article

Xiang, S., Yao, W. Semiparametric mixtures of regressions with single-index for model based clustering. Adv Data Anal Classif 14, 261–292 (2020). https://doi.org/10.1007/s11634-020-00392-w
