Robust density power divergence estimates for panel data models

Annals of the Institute of Statistical Mathematics

Abstract

Panel data regression models have become one of the most widely applied statistical approaches in many fields of research, including the social, behavioral, and environmental sciences and econometrics. However, the traditional least-squares-based techniques frequently used for panel data models are vulnerable to data contamination and outlying observations, which may produce biased and inefficient estimates and misleading statistical inference. In this study, we propose a minimum density power divergence estimation procedure for panel data regression models with random effects that achieves robustness against outliers. The robustness and the asymptotic properties of the proposed estimator are rigorously established. The finite-sample properties of the proposed method are investigated through an extensive simulation study and an application to climate data in Oman. Our results demonstrate that the proposed estimator outperforms some traditional and robust methods in the presence of data contamination.

References

  • Aquaro, M., Cizek, P. (2013). One-step robust estimation of fixed-effects panel data models. Computational Statistics and Data Analysis, 57(1), 536–548.

  • Athey, S., Bayati, M., Doudchenko, N., et al. (2021). Matrix completion methods for causal panel data models. Journal of the American Statistical Association, 1–15.

  • Bakar, N. M. A., Midi, H. (2015). Robust centering in the fixed effect panel data model. Pakistan Journal of Statistics, 31(1), 33–48.

  • Balestra, P., Nerlove, M. (1966). Pooling cross-section and time series data in the estimation of a dynamic model: The demand for natural gas. Econometrica, 34(3), 585–612.

  • Baltagi, B. H. (2005). Econometric analysis of panel data. Chichester: John Wiley and Sons.

  • Basak, S., Basu, A., Jones, M. (2021). On the ‘optimal’ density power divergence tuning parameter. Journal of Applied Statistics, 48(3), 536–556.

  • Basu, A., Ghosh, A., Mandal, A., et al. (2017). A Wald-type test statistic for testing linear hypothesis in logistic regression models based on minimum density power divergence estimator. Electronic Journal of Statistics, 11(2), 2741–2772.

  • Basu, A., Harris, I. R., Hjort, N. L., et al. (1998). Robust and efficient estimation by minimising a density power divergence. Biometrika, 85(3), 549–559.

  • Basu, A., Mandal, A., Martin, N., et al. (2013). Testing statistical hypotheses based on the density power divergence. Annals of the Institute of Statistical Mathematics, 65(2), 319–348.

  • Basu, A., Mandal, A., Martin, N., et al. (2018). Testing composite hypothesis based on the density power divergence. Sankhya, Ser. B, 80(2), 222–262.

  • Beyaztas, B. H., Bandyopadhyay, S. (2020). Robust estimation for linear panel data models. Statistics in Medicine, 39(29), 4421–4438.

  • Bramati, M. C., Croux, C. (2007). Robust estimators for the fixed effects panel data model. Econometrics Journal, 10(3), 521–540.

  • Cameron, A. C., Trivedi, P. K. (2005). Microeconometrics: Methods and applications. New York: Cambridge University Press.

  • Cizek, P. (2010). Reweighted least trimmed squares: An alternative to one-step estimators. CentER Discussion Paper Series 91/2010.

  • Cox, D. R., Hall, P. (2002). Estimation in a simple random effects model with nonnormal distributions. Biometrika, 89(4), 831–840.

  • Diggle, P. J., Heagerty, P., Liang, K.-Y., et al. (2002). Analysis of longitudinal data. Oxford: Oxford University Press.

  • Ferguson, T. S. (1996). A course in large sample theory. Texts in Statistical Science Series. London: Chapman & Hall.

  • Fitzmaurice, G. M., Laird, N. M., Ware, J. H. (2004). Applied longitudinal analysis. New York: John Wiley and Sons.

  • Fujisawa, H. (2013). Normalized estimating equation for robust parameter estimation. Electronic Journal of Statistics, 7, 1587–1606.

  • Fujisawa, H., Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination. Journal of Multivariate Analysis, 99(9), 2053–2081.

  • Gardiner, J. C., Luo, Z., Roman, L. A. (2009). Fixed effects, random effects and GEE: What are the differences? Statistics in Medicine, 28(2), 221–239.

  • Gervini, D., Yohai, V. J. (2002). A class of robust and fully efficient regression estimators. The Annals of Statistics, 30(2), 583–616.

  • Ghosh, A., Basu, A. (2013). Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electronic Journal of Statistics, 7, 2420–2456.

  • Ghosh, A., Mandal, A., Martin, N., et al. (2016). Influence analysis of robust Wald-type tests. Journal of Multivariate Analysis, 147, 102–126.

  • Greene, W. H. (2017). Econometric analysis. New York: Prentice Hall.

  • Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., et al. (1986). Robust statistics: The approach based on influence functions. New York: John Wiley & Sons.

  • Hsiao, C. (1985). Benefits and limitations of panel data. Econometric Reviews, 4(1), 121–174.

  • Hsiao, C. (2007). Panel data analysis - advantages and challenges. Test, 16(1), 1–22.

  • Huber, P. J. (1981). Robust statistics. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons.

  • Jana, S., Basu, A. (2019). A characterization of all single-integral, non-kernel divergence estimators. IEEE Transactions on Information Theory, 65(12), 7976–7984.

  • Jirata, M. T., Chelule, J. C., Odhiambo, R. O. (2014). Deriving some estimators of panel data regression models with individual effects. International Journal of Science and Research, 3(5), 53–59.

  • Kennedy, P. (2003). A guide to econometrics. Cambridge: The MIT Press.

  • Kuchibhotla, A. K., Mukherjee, S., Basu, A. (2019). Statistical inference based on bridge divergences. Annals of the Institute of Statistical Mathematics, 71(3), 627–656.

  • Kutner, M. H., Nachtsheim, C. J., Neter, J. (2004). Applied linear regression models. New York: McGraw-Hill Education.

  • Laird, N. M., Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 38(4), 963–974.

  • Lamarche, C. (2010). Robust penalized quantile regression estimation for panel data. Journal of Econometrics, 157(2), 396–408.

  • Lehmann, E. L. (1999). Elements of large-sample theory. Springer Texts in Statistics. New York: Springer-Verlag.

  • Maciak, M. (2021). Quantile LASSO with changepoints in panel data models applied to option pricing. Econometrics and Statistics, 20, 166–175.

  • Maddala, G. S., Mount, T. D. (1973). A comparative study of alternative estimators for variance components models used in econometric applications. Journal of the American Statistical Association, 68(342), 324–328.

  • Mandal, A., Ghosh, S. (2019). Robust variable selection criteria for the penalized regression. arXiv preprint arXiv:1912.12550.

  • Maronna, R. A., Martin, R. D., Yohai, V. J. (2006). Robust statistics: Theory and methods. New York: John Wiley and Sons.

  • Maronna, R. A., Yohai, V. J. (2000). Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference, 89(1–2), 197–214.

  • Midi, H., Muhammad, S. (2018). Robust estimation for fixed and random effects panel data models with different centering methods. Journal of Engineering and Applied Sciences, 13(17), 7156–7161.

  • Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69–85.

  • Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79(388), 871–880.

  • Rousseeuw, P. J., Leroy, A. M. (2003). Robust regression and outlier detection. New York: John Wiley and Sons.

  • Rousseeuw, P. J., van Zomeren, B. C. (1990). Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85(411), 633–639.

  • Sherman, J., Morrison, W. J. (1950). Adjustment of an inverse matrix corresponding to a change in one element of a given matrix. The Annals of Mathematical Statistics, 21(1), 124–127.

  • Sugasawa, S., Yonekura, S. (2021). On selection criteria for the tuning parameter in robust divergence. Entropy, 23(9), 1147.

  • Visek, J. A. (2015). Estimating the model with fixed and random effects by a robust method. Methodology and Computing in Applied Probability, 17(4), 999–1014.

  • Wallace, T. D., Hussain, A. (1969). The use of error components models in combining cross section and time-series data. Econometrica, 37(1), 55–72.

  • Warwick, J., Jones, M. (2005). Choosing a robustness tuning parameter. Journal of Statistical Computation and Simulation, 75(7), 581–588.

Acknowledgements

The authors gratefully acknowledge the comments of two anonymous referees, which led to an improved version of the manuscript.

Author information

Correspondence to Abhijit Mandal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 240 KB)

Appendices

A The Estimating Equations

Using Equation (17) of the Supplementary Material, the DPD measure in Eq. (8) can be simplified as

$$\begin{aligned} \begin{aligned} {\widehat{d}}_\gamma (f_\theta , g)&= (2\pi )^{-\frac{T \gamma }{2}} |{{\Omega }}|^{-\frac{\gamma }{2}} (1+\gamma )^{-\frac{1}{2}} - \frac{1+\gamma }{N \gamma } \sum _{i=1}^{N} f_\theta ^{\gamma }(y_i | x_i) + c(\gamma )\\&= (2\pi )^{-\frac{T \gamma }{2}} |{{\Omega }}|^{-\frac{\gamma }{2}} (1+\gamma )^{-\frac{1}{2}} \left[ 1 - \frac{(1+\gamma )^{3/2} }{N\gamma } \sum _{i=1}^{N}\exp \left[ -\frac{\gamma }{2} B_i\right] \right] + c(\gamma ), \end{aligned} \end{aligned}$$
(18)

where \(B_i = (y_i - x_i \beta )^T {{\Omega }}^{-1} (y_i - x_i \beta )\). Using the Sherman–Morrison formula (Sherman and Morrison 1950), we get

$$\begin{aligned} {{\Omega }}^{-1} = \frac{1}{\sigma _\epsilon ^2} {{I}}_T - \frac{\sigma _\alpha ^2 e_T e_T^T}{\sigma _\epsilon ^2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )},\ \ \ |{{\Omega }}| = \sigma _\epsilon ^{2(T-1)} ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 ). \end{aligned}$$
(19)

This further simplifies \(B_i\) as follows:

$$\begin{aligned} B_i = \frac{1}{\sigma _\epsilon ^2}\sum _{t=1}^{T} (y_{it} - x_{it} \beta )^2 - \frac{\sigma _\alpha ^2 }{\sigma _\epsilon ^2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} \left\{ \sum _{t=1}^{T} (y_{it} - x_{it} \beta ) \right\} ^2 . \end{aligned}$$
(20)
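As a quick sanity check, the closed forms in Eqs. (19) and (20) can be verified numerically against a direct matrix inverse and determinant; the following sketch (the parameter values are arbitrary illustrations, not from the paper) does this in NumPy:

```python
import numpy as np

# Sanity check of Eqs. (19)-(20): the Sherman-Morrison inverse and
# determinant of Omega = sigma_eps^2 I_T + sigma_alpha^2 e_T e_T^T,
# and the closed form for B_i. Parameter values are illustrative.
T = 4
se2, sa2 = 1.3, 0.7                      # sigma_epsilon^2, sigma_alpha^2
e = np.ones((T, 1))
Omega = se2 * np.eye(T) + sa2 * (e @ e.T)

# Eq. (19): closed-form inverse and determinant
Omega_inv = np.eye(T) / se2 - sa2 * (e @ e.T) / (se2 * (se2 + T * sa2))
det_Omega = se2**(T - 1) * (se2 + T * sa2)
assert np.allclose(Omega_inv, np.linalg.inv(Omega))
assert np.isclose(det_Omega, np.linalg.det(Omega))

# Eq. (20): B_i = r^T Omega^{-1} r for a residual vector r = y_i - x_i beta
r = np.random.default_rng(0).normal(size=T)
B_closed = (r**2).sum() / se2 - sa2 * r.sum()**2 / (se2 * (se2 + T * sa2))
assert np.isclose(B_closed, r @ Omega_inv @ r)
```

The closed forms avoid any \(T \times T\) inversion, which is what makes the estimating equations below cheap to evaluate.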

The estimating equations for \(\theta \) are obtained from \(\frac{\partial }{\partial \theta } {\widehat{d}}_\gamma (f_\theta , g) =0\), and the equations corresponding to \(\beta \), \(\sigma _\alpha ^2\) and \(\sigma _\epsilon ^2\) simplify to

$$\begin{aligned} \begin{aligned}&\sum _{i=1}^{N} \sum _{t=1}^{T} x_{it} (y_{it} - x_{it} \beta ) \exp \left[ -\frac{\gamma }{2} B_i\right] = \frac{T \sigma _\alpha ^2 }{( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} \sum _{i=1}^{N} \sum _{t=1}^{T} {\bar{x}}_i (y_{it} - x_{it} \beta ) \exp \left[ -\frac{\gamma }{2} B_i\right] ,\\&\gamma T |{{\Omega }}|^{-1} \sigma _\epsilon ^{2(T-1)} \left\{ (1+\gamma )^{-\frac{1}{2}} - \frac{1+\gamma }{N \gamma } \sum _{i=1}^{N}\exp \left[ -\frac{\gamma }{2} B_i\right] \right\} \\&\quad = - \frac{(1+\gamma )}{N ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \sum _{i=1}^{N}\exp \left[ -\frac{\gamma }{2} B_i\right] \left\{ \sum _{t=1}^{T} (y_{it} - x_{it} \beta ) \right\} ^2 ,\\&\gamma T |{{\Omega }}|^{-1} \sigma _\epsilon ^{2(T-2)} \left\{ \sigma _\epsilon ^2 + (T-1) \sigma _\alpha ^2 \right\} \left\{ (1+\gamma )^{-\frac{1}{2}} - \frac{1+\gamma }{N \gamma } \sum _{i=1}^{N}\exp \left[ -\frac{\gamma }{2} B_i\right] \right\} \\&\ \ = \frac{(1+\gamma ) }{N} \sum _{i=1}^{N}\exp \left[ -\frac{\gamma }{2} B_i\right] \left[ - \frac{1}{\sigma _\epsilon ^4}\sum _{t=1}^{T} (y_{it} - x_{it} \beta )^2 + \frac{\sigma _\alpha ^2 (2\sigma _\epsilon ^2 + T \sigma _\alpha ^2 )}{\sigma _\epsilon ^4 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \left\{ \sum _{t=1}^{T} (y_{it} - x_{it} \beta ) \right\} ^2 \right] , \end{aligned} \end{aligned}$$
(21)

where \( {\bar{x}}_i = \frac{1}{T}\sum _{t=1}^T x_{it}\). The MDPDE of \(\theta \) is obtained by solving the above system of equations. One may use an iterative algorithm for this purpose or directly minimize the DPD measure in Eq. (8) with respect to \(\theta \in \Theta _0\).
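As an illustration of the direct-minimization route, the sketch below minimizes the empirical DPD objective of Eq. (18), dropping the \(\theta\)-free term \(c(\gamma)\) and parametrizing the variance components on the log scale. The simulated data, starting values, and optimizer choice are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch: minimize the empirical DPD objective of Eq. (18) over
# theta = (beta, log sigma_alpha^2, log sigma_eps^2). Data, starting
# values, and optimizer settings are illustrative choices only.
rng = np.random.default_rng(1)
N, T, gamma = 200, 4, 0.5
beta_true = np.array([1.0, -2.0])
x = rng.normal(size=(N, T, 2))
alpha = rng.normal(scale=np.sqrt(0.5), size=(N, 1))      # random effects
y = x @ beta_true + alpha + rng.normal(scale=1.0, size=(N, T))

def dpd_objective(theta):
    beta, sa2, se2 = theta[:2], np.exp(theta[2]), np.exp(theta[3])
    r = y - x @ beta                                     # N x T residuals
    # B_i from Eq. (20)
    B = (r**2).sum(1) / se2 - sa2 * r.sum(1)**2 / (se2 * (se2 + T * sa2))
    det_Omega = se2**(T - 1) * (se2 + T * sa2)           # |Omega|, Eq. (19)
    const = (2 * np.pi)**(-T * gamma / 2) * det_Omega**(-gamma / 2)
    # Eq. (18) without the theta-free term c(gamma)
    return const * ((1 + gamma)**-0.5
                    - (1 + gamma) / (N * gamma) * np.exp(-gamma * B / 2).sum())

theta0 = np.full(4, 0.1)                 # beta near 0, variances near 1
fit = minimize(dpd_objective, theta0, method="BFGS")
beta_hat = fit.x[:2]
```

The weights \(\exp(-\gamma B_i/2)\) in the objective are what downweight units with large Mahalanobis-type residuals \(B_i\), which is the source of the robustness against contaminated panel units.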

B Regularity Conditions

For the asymptotic distribution of the MDPDE, we need the following assumptions:

  1. (A1)

    The true density g(y|x) is supported over the entire real line \(\mathbb {R}\).

  2. (A2)

    There is an open subset \(\omega \subset \Theta _0\) containing the best-fitting parameter \(\theta \) such that \({{J}}\) is positive definite for all \(\theta \in \omega \).

  3. (A3)

    There exist functions \(M_{jkl}(x, y)\) such that \(|\partial ^3 \exp [(y - x \beta )^T {{\Omega }}^{-1} (y - x \beta )] /\partial \theta _j \partial \theta _k \partial \theta _l | \le M_{jkl}(x, y)\) for all \(\theta \in \omega \), where \(\int _x \int _y |M_{jkl}(x, y)| g(y|x) h(x) dy dx < \infty \) for all \(j\), \(k\) and \(l\).

Note that these regularity conditions hold for the contaminated model, defined in Lemma 1, when \(\eta (\gamma )\) is sufficiently small.

C Proof of Theorem 1

Proof

The proof of the first part closely follows the consistency proof for the maximum likelihood estimator, with modifications along the lines of Theorem 3.1 of Ghosh and Basu (2013). For brevity, we present a detailed proof of the second part only.

Let \(\widehat{\theta }\) be the MDPDE of \(\theta \). Then

$$\begin{aligned} \frac{\partial }{\partial \theta } {\widehat{d}}_\gamma (f_\theta , g) = \frac{\partial }{\partial \theta } \left[ \frac{1}{N} \sum _{i=1}^{N} \int _y f^{1+\gamma }_\theta (y|x_i) dy - \frac{1+\gamma }{N \gamma } \sum _{i=1}^{N} f_\theta ^{\gamma }(y_i | x_i)\right] =0. \end{aligned}$$
(22)

Thus, it can be written as the estimating equation of an M-estimator as follows

$$\begin{aligned} \sum _{i=1}^N \Psi _{\widehat{\theta }}(y_i|x_i) = 0, \end{aligned}$$
(23)

where

$$\begin{aligned} \Psi _{\theta }(y_i|x_i) = u_\theta (y_i|x_i) f_\theta ^\gamma (y_i|x_i) - \int _y u_\theta (y|x_i) f_\theta ^{1+\gamma }(y|x_i) dy. \end{aligned}$$
(24)

Let \(\theta _g\) be the true value of \(\theta \). Then \( E\left( \sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i)\right) = 0\) gives

$$\begin{aligned} \sum _{i=1}^N \Bigg [\int _y u_{\theta _g}(y|x_i) f_{\theta _g}^\gamma (y|x_i) g(y|x_i) dy - \int _y u_{\theta _g}(y|x_i) f_{\theta _g}^{1+\gamma }(y|x_i) dy \Bigg ] = 0 . \end{aligned}$$
(25)

Taking a Taylor series expansion of Eq. (23), we get

$$\begin{aligned} \begin{aligned} \frac{1}{N}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i)&+ \frac{1}{N}\sum _{i=1}^N \frac{\partial }{\partial \theta }\Psi _\theta (y_i|x_i)\Big |_{\theta = \theta _g} (\widehat{\theta } -\theta _g) + R_N = 0,\\ \text{ or } \sqrt{N}(\widehat{\theta } -\theta _g)&= - \left[ \frac{1}{N}\sum _{i=1}^N \frac{\partial }{\partial \theta }\Psi _\theta (y_i|x_i)\Big |_{\theta = \theta _g} \right] ^{-1} \left[ \frac{1}{\sqrt{N}}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) + \sqrt{N}R_N\right] ,\\ \end{aligned} \end{aligned}$$
(26)

where \(R_N\) is the remainder term. Using the weak law of large numbers (WLLN), we have

$$\begin{aligned} \begin{aligned}&\frac{1}{N}\sum _{i=1}^N \frac{\partial }{\partial \theta }\Psi _\theta (y_i|x_i) \\&\overset{p}{\rightarrow }\ \lim _{N \rightarrow \infty } E\left[ \frac{1}{N}\sum _{i=1}^N \frac{\partial }{\partial \theta }\Psi _\theta (y_i|x_i) \right] \\&= \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{i=1}^N E\left[ \frac{\partial }{\partial \theta } \left( u_\theta f_\theta ^\gamma - \int u_\theta f_\theta ^{1+\gamma } \right) \right] \\&= \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{i=1}^N E\left[ -I_\theta f_\theta ^\gamma + \gamma u_\theta u_\theta ^T f_\theta ^\gamma - \int \left\{ - I_\theta f_\theta ^{1+\gamma } + (1+\gamma ) u_\theta u_\theta ^T f_\theta ^{1+\gamma } \right\} \right] \\&= \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{i=1}^N \left[ - \int I_\theta f_\theta ^\gamma g + \gamma \int u_\theta u_\theta ^T f_\theta ^\gamma g + \int I_\theta f_\theta ^{1+\gamma } - (1+\gamma ) \int u_\theta u_\theta ^T f_\theta ^{1+\gamma } \right] \\&= - \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{i=1}^N \left[ \int u_\theta u_\theta ^T f_\theta ^{1+\gamma } + \int \left( I_\theta -\gamma u_\theta u_\theta ^T \right) (g - f_\theta ) f_\theta ^\gamma \right] . \end{aligned} \end{aligned}$$
(27)

So

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^N \frac{\partial }{\partial \theta }\Psi _\theta (y_i|x_i)\Big |_{\theta = \theta _g} \overset{p}{\rightarrow }\ - \lim _{N \rightarrow \infty } \frac{1}{N}\sum _{i=1}^N {{J}}^{(i)} = - {{J}}. \end{aligned}$$
(28)

From Eq. (25), we get

$$\begin{aligned} \begin{aligned}&E\left[ \frac{1}{\sqrt{N}}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) \right] \\&\quad = \frac{1}{\sqrt{N}} \sum _{i=1}^N \Bigg [\int _y u_{\theta _g}(y|x_i) f_{\theta _g}^\gamma (y|x_i) g(y|x_i) dy - \int _y u_{\theta _g}(y|x_i) f_{\theta _g}^{1+\gamma }(y|x_i) dy \Bigg ]\\&\quad = 0. \end{aligned} \end{aligned}$$
(29)

Now,

$$\begin{aligned} \begin{aligned} V\left[ \frac{1}{\sqrt{N}}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) \right]&= \frac{1}{N} \sum _{i=1}^N V\left[ \Psi _{\theta _g}(y_i|x_i) \right] \\&= \frac{1}{N} \sum _{i=1}^N \Bigg [\int _y u_{\theta _g}(y|x_i) u_{\theta _g}^T(y|x_i) f_{\theta _g}^{2\gamma }(y|x_i) g(y|x_i) dy - \xi ^{(i)} \xi ^{(i)T}\Bigg ]\\&= \frac{1}{N} \sum _{i=1}^N {{K}}^{(i)}. \end{aligned} \end{aligned}$$
(30)

Following Section 5 of Ferguson (1996) or Section 2.7 of Lehmann (1999) and using Eqs. (29) and (30), the central limit theorem (CLT) for independent but not identically distributed random variables gives

$$\begin{aligned} \frac{1}{\sqrt{N}}\sum _{i=1}^N \Psi _{\theta _g}(y_i|x_i) \overset{a}{\sim }\ N\left( 0, {{K}}\right) . \end{aligned}$$
(31)

Under regularity condition (A3), it can be shown that the remainder term satisfies \(\sqrt{N}R_N = o_p(1).\) Therefore, combining Eqs. (28) and (31), we get from Eq. (26)

$$\begin{aligned} \sqrt{N}(\widehat{\theta } -\theta _g) \overset{a}{\sim }\ N\left( 0, {{J}}^{-1}{{K}}{{J}}^{-1}\right) . \end{aligned}$$
(32)

This completes the proof. \(\square \)

D \({{J}}\) and \({{K}}\) Matrices at the Model

Let us write the score function as

$$\begin{aligned} u_\theta (y_i | x_i) = (u^T_\beta (y_i | x_i), u^T_{\sigma _\alpha ^2}(y_i | x_i), u^T_{\sigma _\epsilon ^2}(y_i | x_i))^T. \end{aligned}$$
(33)

Recall that \( {\bar{x}}_i = \frac{1}{T}\sum _{t=1}^T x_{it}\). Then, it can be shown that

$$\begin{aligned} \begin{aligned} u_\beta (y_i | x_i)&= \frac{\partial }{\partial \beta } \log f_\theta (y_i | x_i) = \frac{1}{\sigma _\epsilon ^2}\sum _{t=1}^{T} x_{it} (y_{it} - x_{it} \beta ) - \frac{ T {\bar{x}}_i \sigma _\alpha ^2 }{\sigma _\epsilon ^2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} \sum _{t=1}^{T} (y_{it} - x_{it} \beta ),\\ u_{\sigma _\alpha ^2}(y_i | x_i)&= \frac{\partial }{\partial \sigma _\alpha ^2} \log f_\theta (y_i | x_i) = -\frac{T}{2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} + \frac{ 1}{2( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \left\{ \sum _{t=1}^{T} (y_{it} - x_{it} \beta ) \right\} ^2,\\ u_{\sigma _\epsilon ^2}(y_i | x_i)&= \frac{\partial }{\partial \sigma _\epsilon ^2} \log f_\theta (y_i | x_i) = -\frac{1}{2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} \left[ (T-1) \sigma _\epsilon ^{-2} ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 ) + 1\right] \\&\quad + \frac{1}{2\sigma _\epsilon ^4}\sum _{t=1}^{T} (y_{it} - x_{it} \beta )^2 - \frac{\sigma _\alpha ^2 (2\sigma _\epsilon ^2 + T \sigma _\alpha ^2 )}{2\sigma _\epsilon ^4 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \left\{ \sum _{t=1}^{T} (y_{it} - x_{it} \beta ) \right\} ^2. \end{aligned} \end{aligned}$$
(34)
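These score components can be checked against central finite differences of the Gaussian log-density. The sketch below does so for a single panel unit (data and parameter values are arbitrary illustrations), writing the leading term of \(u_{\sigma_\epsilon^2}\) in the algebraically equivalent form \(-T\left[\sigma_\epsilon^2 + (T-1)\sigma_\alpha^2\right]/\left(2\sigma_\epsilon^2(\sigma_\epsilon^2 + T\sigma_\alpha^2)\right)\):

```python
import numpy as np

# Finite-difference check of the score components in Eq. (34) for a single
# panel unit. Data and parameter values are arbitrary illustrations.
T = 5
rng = np.random.default_rng(2)
x_i = rng.normal(size=(T, 2))
y_i = rng.normal(size=T)

def logf(beta, sa2, se2):
    """Log-density of y_i | x_i under the random-effects model."""
    e = np.ones((T, 1))
    Omega = se2 * np.eye(T) + sa2 * (e @ e.T)
    r = y_i - x_i @ beta
    _, logdet = np.linalg.slogdet(Omega)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(Omega, r))

beta, sa2, se2 = np.array([0.4, -0.3]), 0.6, 1.1
r = y_i - x_i @ beta
S, S2, d = r.sum(), (r**2).sum(), se2 * (se2 + T * sa2)

# Closed forms from Eq. (34)
u_beta = x_i.T @ r / se2 - (T * sa2 / d) * x_i.mean(0) * S
u_sa2 = -T / (2 * (se2 + T * sa2)) + S**2 / (2 * (se2 + T * sa2)**2)
u_se2 = (-T * (se2 + (T - 1) * sa2) / (2 * d) + S2 / (2 * se2**2)
         - sa2 * (2 * se2 + T * sa2) * S**2 / (2 * se2**2 * (se2 + T * sa2)**2))

# Central finite differences of logf
eps = 1e-6
g_beta = np.array([
    (logf(beta + eps * np.eye(2)[j], sa2, se2)
     - logf(beta - eps * np.eye(2)[j], sa2, se2)) / (2 * eps) for j in range(2)])
g_sa2 = (logf(beta, sa2 + eps, se2) - logf(beta, sa2 - eps, se2)) / (2 * eps)
g_se2 = (logf(beta, sa2, se2 + eps) - logf(beta, sa2, se2 - eps)) / (2 * eps)

assert np.allclose(g_beta, u_beta, atol=1e-5)
assert np.isclose(g_sa2, u_sa2, atol=1e-5)
assert np.isclose(g_se2, u_se2, atol=1e-5)
```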

Note that if the true distribution g(y|x) is a member of the model family \(f_\theta (y|x)\) for some \(\theta \in \Theta _0\), then

$$\begin{aligned} \begin{aligned}&{{J}}^{(i)} = \int _y u_\theta (y | x_i) u^T_\theta (y | x_i) f_\theta ^{1+\gamma }(y | x_i) dy, \\&{{K}}^{(i)} = \int _y u_\theta (y | x_i) u^T_\theta (y | x_i) f_\theta ^{2\gamma +1}(y | x_i) dy - \xi ^{(i)} \xi ^{(i)T} , \\&\xi ^{(i)} = \int _y u_\theta (y | x_i) f_\theta ^{\gamma +1}(y | x_i) dy. \end{aligned} \end{aligned}$$
(35)

In this case, the symmetric matrix \({{J}}^{(i)}\) can be partitioned as

$$\begin{aligned} {{J}}^{(i)} = \begin{bmatrix} {{J}}_\beta ^{(i)} &{} {{J}}_{\beta , \ \sigma _\alpha ^2}^{(i)} &{} {{J}}_{\beta , \ \sigma _\epsilon ^2}^{(i)}\\ . &{} {{J}}_{\sigma _\alpha ^2}^{(i)} &{} {{J}}_{\sigma _\alpha ^2, \ \sigma _\epsilon ^2}^{(i)}\\ . &{} . &{} {{J}}_{\sigma _\epsilon ^2}^{(i)} \end{bmatrix}, \end{aligned}$$
(36)

where

$$\begin{aligned} \begin{aligned} {{J}}_\beta ^{(i)}&= M \sigma _\epsilon ^{-4} \Bigg [ \sigma _\epsilon ^2 \sum _{t=1}^{T} x_{it} x_{it}^T + T^2 \sigma _\alpha ^2 \left( \frac{ T \sigma _\alpha ^2 }{( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )} -1 \right) {\bar{x}}_i {\bar{x}}_i^T \Bigg ] ,\\ {{J}}_{\sigma _\alpha ^2}^{(i)}&= \frac{M T^2 ( \gamma ^2 + 2 )}{4 (1+\gamma ) ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} ,\\ {{J}}_{\sigma _\epsilon ^2}^{(i)}&= \frac{ M T^2 (\gamma - 1) \left[ \sigma _\epsilon ^2 + (T-1) \sigma _\alpha ^2 \right] ^2}{4\sigma _\epsilon ^4 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \\&\quad + \frac{MT}{4\sigma _\epsilon ^8} \left[ (T+2) \sigma _\epsilon ^4 + 2(T+2) \sigma _\epsilon ^2\sigma _\alpha ^2 + 3T \sigma _\alpha ^4 \right] + \frac{3MT^2\sigma _\alpha ^4 (2\sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2}{4\sigma _\epsilon ^8 (1+\gamma ) ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \\&\quad - \frac{ T M (1+\gamma )\sigma _\alpha ^2 (2\sigma _\epsilon ^2 + T \sigma _\alpha ^2 )}{2\sigma _\epsilon ^8 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \Big [ (T+2) \sigma _\epsilon ^4 + (T^2 + 2T +3) \sigma _\epsilon ^2 \sigma _\alpha ^2 + 3(T^2 -T + 1 ) \sigma _\alpha ^4 \Big ] ,\\ {{J}}_{\beta , \ \sigma _\alpha ^2}^{(i)}&= 0,\ \ {{J}}_{\beta , \ \sigma _\epsilon ^2}^{(i)} = 0,\\ {{J}}_{\sigma _\alpha ^2, \ \sigma _\epsilon ^2}^{(i)}&= \frac{T M (1+\gamma ) }{4 \sigma _\epsilon ^4 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} \Big [ 2(T+1) \sigma _\epsilon ^4 + (2T^2 + T +3) \sigma _\epsilon ^2 \sigma _\alpha ^2 + 3(T^2 -T +1) \sigma _\alpha ^4 \Big ] \\&\quad - \frac{3MT^2\sigma _\alpha ^2 (2\sigma _\epsilon ^2 + T \sigma _\alpha ^2 )}{4\sigma _\epsilon ^4 (1+\gamma ) ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} - \frac{M T^2 \left[ (T-1) \sigma _\alpha ^2 + \sigma _\epsilon ^2 \right] }{2\sigma _\epsilon ^2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^2} , \end{aligned} \end{aligned}$$
(37)

and

$$\begin{aligned} M = (2\pi )^{-\frac{T \gamma }{2}} (1+\gamma )^{-\frac{T+2}{2}} \sigma _\epsilon ^{-\gamma (T-1)} ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )^{-\frac{\gamma }{2}} . \end{aligned}$$
(38)

Similarly, \(\xi ^{(i)}\) can be partitioned as \(\xi ^{(i)} = \left( \xi _\beta ^{(i)T}, \xi _{\sigma _\alpha ^2}^{(i)} , \xi _{\sigma _\epsilon ^2}^{(i)} \right) ^T\), and it can be shown that

$$\begin{aligned} \xi _\beta ^{(i)} = 0,\ \ \ \xi _{\sigma _\alpha ^2}^{(i)} = -\frac{MT \gamma }{2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 )}, \text{ and } \xi _{\sigma _\epsilon ^2}^{(i)} = -\frac{M T \gamma \left[ \sigma _\epsilon ^2 + (T-1) \sigma _\alpha ^2 \right] }{2 \sigma _\epsilon ^2 ( \sigma _\epsilon ^2 + T \sigma _\alpha ^2 ) }. \end{aligned}$$
(39)

Note that if we write the matrix \({{J}}^{(i)}\) as a function of \(\gamma \), i.e., \({{J}}^{(i)} \equiv {{J}}^{(i)}(\gamma )\), then we have

$$\begin{aligned} {{K}}^{(i)} = {{J}}^{(i)}(2\gamma ) - \xi ^{(i)} \xi ^{(i)T}. \end{aligned}$$
(40)

Moreover, \(\xi ^{(i)}\) in Eq. (39) is free of \(x_i\), so it is the same for all \(i=1, 2, \cdots , N\). Therefore, \({{K}}\) can be written as

$$\begin{aligned} {{K}}= \frac{1}{N} \sum _{i=1}^N \left[ {{J}}^{(i)}(2\gamma ) - \xi ^{(i)} \xi ^{(i)T} \right] . \end{aligned}$$
(41)
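The identity in Eq. (40) only uses the fact that the exponent in the \({{K}}\) integral at the model is \(2\gamma + 1\), so it can be illustrated in any model family. A minimal one-parameter sketch for the \(N(\theta, 1)\) location family, with quadrature in place of the closed forms above:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# One-parameter illustration of K = J(2*gamma) - xi xi^T (Eq. (40)) at the
# model, using the N(theta, 1) location family and quadrature for the
# integrals in Eq. (35). Values are illustrative.
theta, gamma = 0.0, 0.3

f = lambda y: norm.pdf(y, loc=theta)          # model density
u = lambda y: y - theta                       # score of the location parameter

def J(g):                                     # J(g) = int u^2 f^{1+g} dy
    return quad(lambda y: u(y)**2 * f(y)**(1 + g), -10, 10)[0]

xi = quad(lambda y: u(y) * f(y)**(1 + gamma), -10, 10)[0]   # = 0 by symmetry

# K from its definition (with g = f at the model) ...
K_direct = quad(lambda y: u(y)**2 * f(y)**(2 * gamma) * f(y), -10, 10)[0] - xi**2
# ... and via the identity of Eq. (40)
K_identity = J(2 * gamma) - xi**2
assert np.isclose(K_direct, K_identity)
```

In the panel model the same structure holds componentwise, which is why \({{K}}^{(i)}\) never needs a separate closed-form derivation once \({{J}}^{(i)}(\gamma)\) and \(\xi^{(i)}\) are available.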

About this article

Cite this article

Mandal, A., Beyaztas, B.H. & Bandyopadhyay, S. Robust density power divergence estimates for panel data models. Ann Inst Stat Math 75, 773–798 (2023). https://doi.org/10.1007/s10463-022-00862-2
