Abstract
Most research on panel data focuses on mean or quantile regression, whereas regression methods based on the mode have received little attention. In this paper, we propose a new model, fixed effects modal regression for panel data, in which we model how the conditional mode of the response variable depends on the covariates and employ a kernel-based objective function to simplify the computation. The proposed modal regression can complement mean and quantile regressions and provide a better measure of central tendency and better prediction performance when the data are skewed. We present a linear dummy modal regression method and a pseudo-demodeing two-step method to estimate the proposed modal regression. The computations can be easily implemented using a modified modal–expectation–maximization algorithm. We investigate the asymptotic properties of the modal estimators under mild regularity conditions when the number of individuals, N, and the number of time periods, T, go to infinity. The optimal bandwidths, of order \((NT)^{-1/7}\), are obtained by minimizing the asymptotic weighted mean squared errors. Monte Carlo simulations and two real data analyses, a public capital productivity study and a carbon dioxide emissions study, are presented to demonstrate the finite sample performance of the proposed modal regression.
Notes
Modal regression can complement mean and quantile regressions and provide useful information about features of the conditional distribution that the existing regression models might miss, especially for skewed data. For example, assume Y and X satisfy \(Y=X^{T} \beta _m+\sigma (X) \xi ,\) where \(\xi \) has a density with mean 0 and mode 1, \(\beta _m\) is a vector of coefficients, \(\sigma (X)=m(X)-X^{T} \beta _m\) in which m(X) is a nonlinear function, and \(X^{T}\) denotes the transpose of X. Then, \({\mathbb {E}}(Y \mid X)=X^{T} \beta _m,\) while \({Mode}(Y \mid X)=m(X)\). The mean regression is linear, but the modal regression is nonlinear. Similarly, it is also possible that the mean regression is nonlinear while the modal regression is linear.
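The step behind this example can be written out as follows. This is a short derivation under two assumptions the note leaves implicit: \(\xi \) is independent of X, and \(\sigma (X)>0\) so that scaling by \(\sigma (X)\) preserves the location of the mode.

\[
{\mathbb {E}}(Y \mid X)=X^{T} \beta _m+\sigma (X)\,{\mathbb {E}}(\xi )=X^{T} \beta _m, \qquad
{Mode}(Y \mid X)=X^{T} \beta _m+\sigma (X)\,{Mode}(\xi )=X^{T} \beta _m+\left( m(X)-X^{T} \beta _m\right) =m(X).
\]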
For example, if we consider \(Y_{it}=X^T_{it}\beta +\mu _i+v_{it}\) with \(Mode(v_{it}\mid X_{it},\mu _i)=0\), applying the first-difference transformation to this equation yields \(Y_{it}-Y_{it-1}=(X^T_{it}-X^T_{it-1})\beta +v_{it}-v_{it-1}\), in which we cannot guarantee \(Mode(v_{it}-v_{it-1}\mid X_{it})=0\). The same problem arises if we apply the mean-difference transformation.
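The mean-difference case can be illustrated numerically. The sketch below is an illustration under assumptions that are not taken from the paper: the errors are i.i.d. shifted Gamma(2, 1) draws (mode 0, mean 1), the panel dimensions are arbitrary, and the mode is estimated crudely as the midpoint of the fullest histogram bin. Under these assumptions the within-transformed errors have a mode near -1 rather than 0.

```python
import numpy as np


def histogram_mode(sample, bin_width=0.25):
    """Crude mode estimate: midpoint of the fullest histogram bin."""
    edges = np.arange(sample.min(), sample.max() + bin_width, bin_width)
    counts, edges = np.histogram(sample, bins=edges)
    k = np.argmax(counts)
    return 0.5 * (edges[k] + edges[k + 1])


rng = np.random.default_rng(0)
N, T = 2000, 50  # hypothetical panel dimensions, chosen only for illustration

# Gamma(shape=2, scale=1) has mode 1 and mean 2, so v has mode 0 and mean 1.
v = rng.gamma(shape=2.0, scale=1.0, size=(N, T)) - 1.0

# Within (mean-difference) transformation applied to the errors of each individual.
v_within = v - v.mean(axis=1, keepdims=True)

mode_raw = histogram_mode(v.ravel())
mode_within = histogram_mode(v_within.ravel())
print(f"estimated mode of v_it:             {mode_raw:.2f}")
print(f"estimated mode of v_it - mean_i(v): {mode_within:.2f}")
```

Subtracting the individual mean removes roughly the mean of the error, not its mode, which is why the transformed error is no longer centered at a zero mode when the error distribution is skewed.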
If \(X \sim Ga(\alpha , \theta )\) and \(Y \sim Ga(\beta , \theta )\) are independently distributed with the same scale parameter, then \(X+Y\) follows \(Ga(\alpha +\beta , \theta )\) with variance \((\alpha +\beta )\theta ^2\).
*: \(p<0.1\); **: \(p <0.05\); ***: \(p<0.01\).
Ethics declarations
Funding
This study was not supported by any funding.
Conflict of interest
Each author declares he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
We are grateful to the guest editor and three anonymous referees for their helpful suggestions and comments which have greatly improved the paper. We also thank seminar participants at the 2020 Econometric Society World Congress.
Appendices
A Appendix-Tables
B Appendix-Proofs
1.1 Proof of Theorem 2.1
Recall that \(Y_{it}={X}_{it}^T {\beta }+Z^T_{\mu ,i}\mu +{v}_{it}\). We define \(X^{*}_{it}=({X}_{it}^T, Z^T_{\mu ,i})^T\), \(\theta =(\beta ^T,\mu ^T)^T\), and \(\theta _0=(\beta _0^T,\mu _0^T)^T\), where \(\theta _{0}\) is the true parameter value. Let \(\delta _{NT}=h_0^{2}+\sqrt{\left( NTh_0^{3}\right) ^{-1}}\). It is then sufficient to show that for any given \(\eta >0,\) there exists a large constant a such that
\[P\left\{ \sup _{\Vert c\Vert =a} Q_{N T}\left( \theta _{0}+\delta _{NT} c\right) <Q_{N T}\left( \theta _{0}\right) \right\} \ge 1-\eta ,\]
where \(\Vert \cdot \Vert \) represents the Euclidean distance.
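Applying the Taylor expansion, it follows
\[Q_{N T}\left( \theta _{0}+\delta _{NT} c\right) -Q_{N T}\left( \theta _{0}\right) =\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1}\left[ -\phi ^{(1)} \left( \frac{v_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) +\frac{1}{2}\phi ^{(2)} \left( \frac{v_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) ^{2}-\frac{1}{6}\phi ^{(3)} \left( \frac{v^*_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) ^{3} \right] =I_1+I_2+I_3,\]
where \(v^*_{it}\) is between \(v_{it}\) and \(v_{it}-\delta _{NT} c^{T} X^*_{i t}\). Based on the result \(T_{N T}={\mathbb {E}}\left( T_{N T}\right) +O_{p}(\sqrt{{\text {Var}}\left( T_{N T}\right) }),\) we consider each part of the above Taylor expansion in turn.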
For the first part, which is \(I_1=\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1}\left( -\phi ^{(1)} \left( \frac{v_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) \right) \), we can calculate it directly to achieve
Meanwhile, we know that
Combining (A.3) and (A.4) yields
For the second part, which is \(I_2=\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1}\left( \frac{1}{2}\phi ^{(2)} \left( \frac{v_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) ^{2} \right) \), we can prove
which indicates that the second part dominates the first part when the constant a is chosen large enough.
For the third part, which is \(I_3=\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1}\left( -\frac{1}{6}\phi ^{(3)} \left( \frac{v^*_{it}}{h_0} \right) \left( \frac{\delta _{NT} c^{T} X^*_{i t}}{h_0} \right) ^{3} \right) \), as \(v^*_{it}\) is between \(v_{it}\) and \(v_{it}-\delta _{NT} c^{T} X^*_{i t}\), after some direct calculations we can obtain
which indicates that the second part dominates the third part with the assumption \(NTh^5_0 \rightarrow \infty \).
Based on these results, we can choose the constant a large enough such that the second term dominates the other two terms with probability at least \(1-\eta \). Because the second term is negative, \(P\left\{ \sup _{\Vert c\Vert =a} Q_{N T}\left( \theta _{0}+\delta _{NT} c\right) <Q_{N T}\left( \theta _{0}\right) \right\} \ge 1-\eta \) holds. Hence, with probability approaching 1, there exists a local maximizer \(\hat{\theta }\) such that \(\Vert \hat{\theta }-\theta _0 \Vert \le a\delta _{NT}\), that is, \(\Vert \hat{\theta }-\theta _0 \Vert =O_p(\delta _{NT})\).
\(\square \)
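As a quick check on the rate in this proof, the two components of \(\delta _{NT}\) can be balanced against each other. This is only a heuristic calculation, not the paper's weighted mean squared error derivation, but it recovers the bandwidth order stated in the abstract:

\[
h_0^{2}\asymp \left( NTh_0^{3}\right) ^{-1/2}
\;\Longleftrightarrow \; h_0^{7}\asymp (NT)^{-1}
\;\Longleftrightarrow \; h_0\asymp (NT)^{-1/7},
\qquad \text {so that}\quad \delta _{NT}\asymp (NT)^{-2/7}.
\]

With this choice, \(NTh_0^{5}\asymp (NT)^{2/7}\rightarrow \infty \), so the condition used to control the third term of the expansion is satisfied.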
1.2 Proof of Theorem 2.2
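Recall that
\[Q_{NT}(\beta ,\mu )=\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1}\phi \left( \frac{Y_{it}-{X}_{it}^T \beta -Z^T_{\mu ,i}\mu }{h_0} \right) .\]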
Because \((\hat{\beta }, \hat{\mu })\) maximizes \(Q_{NT}({\beta }, {\mu })\), we can take the derivative of \(Q_{NT}({\beta }, {\mu })\) with respect to \(\beta \) and \(\mu \) to obtain
Applying a Taylor expansion to (A.9) and (A.10), we obtain
where \(v^*_{it}\) is between \(v_{it}\) and \({v}_{it}-{X}_{it}^T (\hat{\beta }-\beta _0)-Z^T_{\mu ,i}(\hat{\mu }-\mu _0)\).
We first focus on (A.12). Considering \(-\frac{1}{NTh^2_0}\sum ^N_{i=1}\sum ^T_{t=1} \left( \phi ^{(1)}\left( \frac{v_{it}}{h_0} \right) Z_{\mu ,i} \right) \), we get
Considering \(\frac{1}{NTh^3_0}\sum ^N_{i=1}\sum ^T_{t=1} \left( \phi ^{(2)}\left( \frac{v_{it}}{h_0} \right) \left( Z_{\mu ,i} X^{T}_{it} \right) \right) \), we obtain
Considering \(\frac{1}{NTh_0}\sum ^N_{i=1}\sum ^T_{t=1} \left( \phi ^{(2)}\left( \frac{v_{it}}{h_0} \right) \left( \frac{Z_{\mu ,i}}{h_0}\frac{Z^{T}_{\mu ,i}}{h_0} \right) \right) \), we obtain
Then, it follows that
where \(\Phi =\lim _{N \rightarrow \infty } (1/N) \sum ^N_{i=1} {\mathbb {E}} \left( Z_{\mu ,i}Z^{T}_{\mu ,i} f^{(2)}_{v}(0 \mid X_{it}, Z_{\mu ,i}) \right) \) and \(\Psi =\lim _{N,T \rightarrow \infty } (1/(NT))\sum ^N_{i=1}\) \(\sum ^T_{t=1} {\mathbb {E}}\left( Z_{\mu ,i}X^{T}_{it} f^{(2)}_{v}(0 \mid X_{it}, Z_{\mu ,i}) \right) \).
Substituting (A.16) into (A.11), we have
Define
we then get
With some calculations, we can obtain
Meanwhile, based on the above calculations, we have
where \(v_2=\int \phi ^{2} \left( \tau \right) \tau ^2 d \tau \) and \({L}=\lim _{N,T \rightarrow \infty } (1/(NT))\sum ^N_{i=1}\) \(\sum ^T_{t=1} {\mathbb {E}} \Big ( X_{it}X^{T}_{it} f_{v}(0 \mid X_{it}, Z_{\mu ,i}) \Big )\).
To show Theorem 2.2, it is sufficient to establish the asymptotic normality of \(M^*_{NT}=\sqrt{NTh^3_0}M_{NT} \), for which we prove that for any unit vector \(d \in {\mathbb {R}}^q\),
Then, we check Lyapunov’s condition. Let \(\xi _i=-1/\sqrt{NTh_0}\phi ^{(1)}\left( \frac{v_{it}}{h_0} \right) d^T X_{it}\). We need to prove \(NT {\mathbb {E}} |\xi _1 |^3 \rightarrow 0\). As \(\left( d^T X_{it} \right) ^2 \le \Vert d\Vert ^2 \Vert X_{it} \Vert ^2\) and \(\phi ^{(1)}(\cdot )\) is bounded, we have
Thus, the asymptotic normality for \(M^*_{NT}\) holds with
According to Slutsky’s Theorem, we obtain Theorem 2.2. \(\square \)
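For readers who want to see how a maximizer of the kernel objective \(Q_{NT}(\beta ,\mu )\) can be computed, the following is a minimal sketch of a standard modal expectation-maximization (MEM) iteration for the linear dummy specification. It is an illustration under stated assumptions rather than the paper's modified algorithm: the kernel is Gaussian, the bandwidth `h` is fixed by hand instead of being selected optimally, the iteration starts from least squares, and the data-generating process (shifted Gamma errors with mode 0 and mean 1) and all names such as `mem_modal_regression` are hypothetical.

```python
import numpy as np


def mem_modal_regression(Z, y, h, theta_init, max_iter=500, tol=1e-8):
    """Maximize (1/(n h)) * sum_j phi((y_j - Z_j' theta) / h) by MEM iterations."""
    theta = np.asarray(theta_init, dtype=float).copy()
    for _ in range(max_iter):
        resid = y - Z @ theta
        # E-step: Gaussian kernel weights, normalized to sum to one.
        w = np.exp(-0.5 * (resid / h) ** 2)
        w /= w.sum()
        # M-step: weighted least squares using the current weights.
        Zw = Z * w[:, None]
        theta_new = np.linalg.solve(Zw.T @ Z, Zw.T @ y)
        if np.max(np.abs(theta_new - theta)) < tol:
            return theta_new
        theta = theta_new
    return theta


# Simulated panel (hypothetical design, chosen only for illustration).
rng = np.random.default_rng(1)
N, T, beta_true = 30, 100, 1.5
mu_true = rng.normal(size=N)
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T) + 0.5 * mu_true[ids]   # covariate correlated with the fixed effect
v = rng.gamma(shape=2.0, scale=1.0, size=N * T) - 1.0  # skewed error: mode 0, mean 1
y = beta_true * x + mu_true[ids] + v

# Design matrix: covariate plus one dummy column per individual.
D = (ids[:, None] == np.arange(N)[None, :]).astype(float)
Z = np.column_stack([x, D])

theta_ols = np.linalg.lstsq(Z, y, rcond=None)[0]
theta_modal = mem_modal_regression(Z, y, h=0.5, theta_init=theta_ols)

print("slope (OLS, modal):", theta_ols[0], theta_modal[0])
print("average fixed-effect error, OLS:  ", np.mean(theta_ols[1:] - mu_true))
print("average fixed-effect error, modal:", np.mean(theta_modal[1:] - mu_true))
```

In this design both estimators target the same slope, but the least-squares dummies absorb the nonzero mean of the error, whereas the modal dummies track the mode up to a smoothing bias that depends on `h`.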
1.3 Proof of Theorem 3.1
The main steps are similar to those in the proof of Theorem 2.1, so we only outline the argument. Recall that
where \({\tilde{X}}^{T}_{it}=(1,{X}_{it}^T)\) and \(\theta =(\gamma _1, \beta ^T)^T\). Define \(\delta _{NT}=h_1^{2}+\sqrt{\left( NTh_1^{3}\right) ^{-1}}\). It is sufficient to show that for any given \(\eta ,\) there exists a large constant a such that \(P\left\{ \sup _{\Vert c\Vert =a} Q_{N T}\left( \theta _{0}+\delta _{NT} c\right) <Q_{N T}\left( \theta _{0}\right) \right\} \ge 1-\eta \), where \(\Vert \cdot \Vert \) represents the Euclidean distance and \(\theta _{0}\) is the true parameter value. Applying the Taylor expansion, it follows
where \(v^*_{it}\) is between \(v_{it}+\alpha _{i0}-\hat{\alpha }_i\) and \(v_{it}+\alpha _{i0}-\hat{\alpha }_i-\delta _{NT} c^{T} {\tilde{X}}_{it}\). Following the same steps as in the proof of Theorem 2.1, under the assumption that \(\sqrt{T}h^2_1 \rightarrow \infty \) (i.e., \(N^a/T \rightarrow 0\) for some \(a>4/3\)), one can obtain \(\Vert \hat{\theta }-\theta _0 \Vert \le \delta _{NT}\).
\(\square \)
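The link between the bandwidth condition and the relative growth of N and T in this proof can be made explicit. The calculation below assumes that \(h_1\) is taken at the order \((NT)^{-1/7}\) given in the abstract; the proof itself does not restate this choice.

\[
\sqrt{T}\,h_1^{2}\asymp T^{1/2}(NT)^{-2/7}=T^{3/14}N^{-4/14}=\left( \frac{T}{N^{4/3}}\right) ^{3/14},
\]

so \(\sqrt{T}h_1^{2}\rightarrow \infty \) holds exactly when \(N^{4/3}/T\rightarrow 0\), which is implied by \(N^a/T \rightarrow 0\) for some \(a>4/3\). In words, T must grow faster than \(N^{4/3}\).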
1.4 Proof of Theorem 3.2
Recall that \({\tilde{X}}^T_{it}=(1, {X}_{it}^T)\), \(\theta _0 =(\gamma _{10}, \beta _0^T)^T\), and \(\hat{\theta }=(\hat{\gamma }_1, \hat{\beta }^T)^T\). If \(\hat{\theta }\) maximizes (11), it will satisfy the following equation
Then, we can achieve
where \(v^*_{it}\) is between \(v_{it}\) and \(v_{it}+\alpha _{i0}-\hat{\alpha }_i-{\tilde{X}}^T_{it} (\hat{\theta }-\theta _0 )\). It can be shown that the third term on the left-hand side of (A.28) is dominated by the second term. Under the assumption that \(\sqrt{T}h^2_1 \rightarrow \infty \) (i.e., \(N^a/T \rightarrow 0\) for some \(a>4/3\)), we can then follow the same steps as in the proof of Theorem 2.2 to obtain the results.
\(\square \)
Cite this article
Ullah, A., Wang, T. & Yao, W. Modal regression for fixed effects panel data. Empir Econ 60, 261–308 (2021). https://doi.org/10.1007/s00181-020-01999-w
DOI: https://doi.org/10.1007/s00181-020-01999-w