Measurement Error Models for Replicated Data Under Asymmetric Heavy-Tailed Distributions

Computational Economics

Abstract

Replicated data with measurement errors arise frequently in economic, environmental, chemical, medical, and other fields. In this paper, we discuss a replicated measurement error model under the class of scale mixtures of skew-normal distributions, which extends symmetric heavy- and light-tailed distributions to asymmetric cases. We also include an equation error in the model to reflect how closely the true covariate and the true response match. Explicit iterative expressions for the maximum likelihood estimates are obtained via an expectation–maximization type algorithm. Empirical Bayes estimates are used to predict the true covariate and response. We study the effectiveness and robustness of the maximum likelihood estimates through two simulation studies. The method is applied to dietary-habit data from a continuing survey of food intakes by individuals.

Acknowledgements

This research was supported by the National Science Foundation of China (Grant No. 11301278), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK2012459), the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Grant No. 13YJC910001), and Academic Degree Postgraduate innovation projects of Jiangsu province Ordinary University (Grant No. KYLX15-0883).

Author information

Corresponding author

Correspondence to Chunzheng Cao.

Appendix

1.1 The PDF of Some SMSN Distributions and the Conditional Moments

The pdfs of some important SMSN distributions and the corresponding conditional moments are as follows:

(1) The multivariate skew-t distribution \(ST _m(\varvec{\mu }, {\varvec{\Sigma }}, \varvec{\lambda };\nu )\):

\(\kappa (u)=1/u\), \(U\sim Gamma (\nu /2,\nu /2)\) with \(\nu > 0\). The pdf of \(\varvec{Y}\) is given by

$$\begin{aligned} f(\varvec{y})=2t_m(\varvec{y}|\varvec{\mu },{\varvec{\Sigma }};\nu )T\left( \sqrt{\frac{m+\nu }{d+\nu }}A;\nu +m\right) , \end{aligned}$$

where \(d=(\varvec{y}-\varvec{\mu })^{\top }{\varvec{\Sigma }}^{-1}(\varvec{y}-\varvec{\mu })\), \(t_m(\cdot |\varvec{\mu }, {\varvec{\Sigma }};\nu )\) and \(T(\cdot ;\nu )\) denote the pdf of the m-dimensional Student-t distribution and the cdf of the standard univariate t distribution, respectively. The skew-normal distribution is the limiting case as \(\nu \rightarrow +\infty \).

The conditional moments take the forms

$$\begin{aligned} u_r=&\,\frac{f_0(\varvec{y})}{f(\varvec{y})}\frac{2^{r+1}\Gamma ((\nu +m+2r)/2)(\nu +d)^{-r}}{\Gamma ((\nu +m)/2)}T\left( \sqrt{\frac{m+\nu +2r}{d+\nu }}A;\nu +m+2r\right) ,\\ \eta _r=&\,\frac{f_0(\varvec{y})}{f(\varvec{y})}\frac{2^{(r+1)/2}\Gamma ((\nu +m+r)/2)}{\pi ^{1/2}\Gamma ((\nu +m)/2)}\frac{(\nu +d)^{(\nu +m)/2}}{(\nu +d+A^2)^{(\nu +m+r)/2}}, \end{aligned}$$

where \(f_0(\varvec{y})=\int \nolimits _0^{\infty }\phi _m(\varvec{y}|\varvec{\mu },\kappa (u){\varvec{\Sigma }})d H (u)\), i.e., the pdf of the corresponding SMN distribution, obtained when \(\varvec{\lambda }=\varvec{0}\).
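As a quick consistency check on the skew-t expression for \(u_r\), one can set \(r=0\). The sketch below uses only the formulas above together with the standard fact that, under \(Gamma (\nu /2,\nu /2)\) mixing, \(f_0\) is the Student-t density \(t_m(\varvec{y}|\varvec{\mu },{\varvec{\Sigma }};\nu )\):

$$\begin{aligned} u_0=\frac{f_0(\varvec{y})}{f(\varvec{y})}\,2\,T\left( \sqrt{\frac{m+\nu }{d+\nu }}A;\nu +m\right) =\frac{2t_m(\varvec{y}|\varvec{\mu },{\varvec{\Sigma }};\nu )T\left( \sqrt{\frac{m+\nu }{d+\nu }}A;\nu +m\right) }{f(\varvec{y})}=1, \end{aligned}$$

which is the value a zeroth-order conditional moment should take.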

(2) The multivariate skew-slash distribution \(SS _m(\varvec{\mu }, {\varvec{\Sigma }}, \varvec{\lambda };\nu )\):

\(\kappa (u)=1/u\), \(U\sim Beta (\nu , 1)\) with \(0<u<1\) and \(\nu > 0\). The pdf of \(\varvec{Y}\) is given by

$$\begin{aligned} f(\varvec{y})=2\nu \int \nolimits _0^{1}u^{\nu -1}\phi _m(\varvec{y}|\varvec{\mu },u^{-1}{\varvec{\Sigma }})\Phi (u^{1/2}A)d u. \end{aligned}$$

When \(\nu \rightarrow +\infty \), the skew-slash distribution reduces to the skew-normal one.
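The skew-slash pdf is defined through a one-dimensional integral over \(u\), which is typically evaluated numerically. The following Python sketch approximates it with a midpoint rule in the univariate case. It assumes the one-dimensional form \(A=\lambda (y-\mu )/\sigma \), which this appendix does not restate; the function name `skew_slash_pdf` and the grid size are illustrative choices, not part of the paper.

```python
import math


def skew_slash_pdf(y, mu, sigma2, lam, nu, n_grid=200):
    """Midpoint-rule approximation of the univariate skew-slash pdf
    2*nu * int_0^1 u^(nu-1) * phi(y | mu, sigma2/u) * Phi(sqrt(u)*A) du.

    Assumption: A = lam * (y - mu) / sqrt(sigma2) in one dimension.
    """
    A = lam * (y - mu) / math.sqrt(sigma2)
    h = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) * h                      # midpoint of the k-th cell
        var = sigma2 / u                       # kappa(u) * sigma2 with kappa(u) = 1/u
        dens = math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        cdf = 0.5 * (1.0 + math.erf(math.sqrt(u) * A / math.sqrt(2.0)))
        total += u ** (nu - 1.0) * dens * cdf
    return 2.0 * nu * h * total


if __name__ == "__main__":
    # Density at a few points for an illustrative parameter choice.
    for y in (-1.0, 0.0, 1.0, 3.0):
        print(y, skew_slash_pdf(y, mu=0.0, sigma2=1.0, lam=3.0, nu=2.0))
```

With \(\lambda =0\) the factor \(\Phi (\cdot )\) equals 1/2 and the sketch returns the symmetric slash density, which gives a simple way to sanity-check an implementation.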

The conditional moments take the forms

$$\begin{aligned} u_r&=\frac{f_0(\varvec{y})}{f(\varvec{y})}\frac{2\Gamma ((2\nu +m+2r)/2)}{\Gamma ((2\nu +m)/2)}\Big (\frac{2}{d}\Big )^{r}\frac{P_1((2\nu +m+2r)/2,d/2)}{P_1((2\nu +m)/2,d/2)}\\&\quad \times E [\Phi (S^{1/2}A)],\\ \eta _r&=\frac{f_0(\varvec{y})}{f(\varvec{y})}\frac{2^{{(r+1)}/2}\Gamma ((2\nu +m+r)/2)}{\pi ^{1/2}\Gamma ((2\nu +m)/2)}\frac{d^{(2\nu +m)/2}}{(d+A^2)^{(2\nu +m+r)/2}}\\&\quad \times \frac{P_1((2\nu +m+r)/2,(d+A^2)/2)}{P_1((2\nu +m)/2,d/2)}, \end{aligned}$$

where \(S\sim Gamma ((2\nu +m+2r)/2,d/2)I _{(0,1)}\) and \(P_x(a,b)\) denotes the cdf of the \(Gamma (a,b)\) distribution evaluated at x.

(3) The multivariate skew-contaminated normal distribution \(SCN _m(\varvec{\mu }, {\varvec{\Sigma }}, \varvec{\lambda };\nu ,\gamma )\):

\(\kappa (u)=1/u\), and U is a discrete random variable with probability function \(h(u;\nu ,\gamma )=\nu I _{(u=\gamma )}+(1-\nu ) I _{(u=1)}\), where \(\varvec{\nu }=(\nu ,\gamma )^{\top }\) is a given parameter vector with \(0<\nu<1\) and \(0<\gamma \leqslant 1\). The pdf of \(\varvec{Y}\) is given by

$$\begin{aligned} f(\varvec{y})=2\left\{ \nu \phi _m\left( \varvec{y}|\varvec{\mu },\gamma ^{-1}{\varvec{\Sigma }}\right) \Phi \left( \gamma ^{1/2}A\right) +(1-\nu )\phi _m\left( \varvec{y}|\varvec{\mu },{\varvec{\Sigma }}\right) \Phi (A)\right\} . \end{aligned}$$

The SN distribution is the special case \(\gamma =1\).

The conditional moments take the forms

$$\begin{aligned}&u_r=\frac{2}{f(\varvec{y})}\left\{ \nu \gamma ^{r}\phi _m\left( \varvec{y}|\varvec{\mu },\gamma ^{-1}{\varvec{\Sigma }}\right) \Phi \left( \gamma ^{1/2}A\right) +(1-\nu )\phi _m\left( \varvec{y}|\varvec{\mu },{\varvec{\Sigma }}\right) \Phi (A)\right\} ,\\&\eta _r=\frac{2}{f(\varvec{y})}\left\{ \nu \gamma ^{r/2}\phi _m\left( \varvec{y}|\varvec{\mu },\gamma ^{-1}{\varvec{\Sigma }}\right) \phi \left( \gamma ^{1/2}A\right) +(1-\nu )\phi _m\left( \varvec{y}|\varvec{\mu },{\varvec{\Sigma }}\right) \phi (A)\right\} . \end{aligned}$$
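Because the skew-contaminated normal case involves only normal pdfs and cdfs, its density and conditional moments can be evaluated directly. The Python sketch below does this for the univariate case, again assuming \(A=\lambda (y-\mu )/\sigma \) (not restated in this appendix); all function names are hypothetical.

```python
import math


def norm_pdf(y, mu, var):
    """Univariate normal density with mean mu and variance var."""
    return math.exp(-0.5 * (y - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def std_norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scn_pdf(y, mu, sigma2, lam, nu, gamma):
    """Univariate skew-contaminated normal pdf (two-component form above)."""
    A = lam * (y - mu) / math.sqrt(sigma2)  # assumed univariate form of A
    contaminated = nu * norm_pdf(y, mu, sigma2 / gamma) * std_norm_cdf(math.sqrt(gamma) * A)
    regular = (1.0 - nu) * norm_pdf(y, mu, sigma2) * std_norm_cdf(A)
    return 2.0 * (contaminated + regular)


def scn_conditional_moments(y, r, mu, sigma2, lam, nu, gamma):
    """Return (u_r, eta_r) from the closed-form expressions above."""
    A = lam * (y - mu) / math.sqrt(sigma2)
    f = scn_pdf(y, mu, sigma2, lam, nu, gamma)
    dens_c = norm_pdf(y, mu, sigma2 / gamma)   # contaminated component
    dens_r = norm_pdf(y, mu, sigma2)           # regular component
    u_r = (2.0 / f) * (nu * gamma ** r * dens_c * std_norm_cdf(math.sqrt(gamma) * A)
                       + (1.0 - nu) * dens_r * std_norm_cdf(A))
    eta_r = (2.0 / f) * (nu * gamma ** (r / 2.0) * dens_c * std_norm_pdf(math.sqrt(gamma) * A)
                         + (1.0 - nu) * dens_r * std_norm_pdf(A))
    return u_r, eta_r


if __name__ == "__main__":
    print(scn_conditional_moments(0.7, 1, mu=0.0, sigma2=1.0, lam=2.0, nu=0.3, gamma=0.2))
```

Two properties follow from the formulas and are convenient checks: \(u_0=1\) for every \(y\), and \(u_1\) lies between \(\gamma \) and 1 because it is a weighted average of the two values that U can take.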

1.2 The First Derivatives of \(d_t\), \(A_t\) and \(\log |{\varvec{\Sigma }}|\) with Respect to \(\varvec{\theta }\)

Direct calculation gives the first derivatives of \(d_t\), \(A_t\) and \(\log |{\varvec{\Sigma }}|\) as follows:

for \(d_t\):

$$\begin{aligned} \frac{\partial d_t}{\partial \theta _i}= -2{(\varvec{Z}_t-\varvec{\mu })}^{\top }{\varvec{\Sigma }}^{-1}\frac{\partial \varvec{\mu }}{\partial \theta _i} -{(\varvec{Z}_t-\varvec{\mu })}^\top {\varvec{\Sigma }}^{-1}\frac{\partial {\varvec{\Sigma }}}{\partial \theta _i}{\varvec{\Sigma }}^{-1}(\varvec{Z}_t-\varvec{\mu }), \end{aligned}$$
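This expression follows from the product rule and the standard identity for the derivative of a matrix inverse. A brief sketch, assuming \(d_t={(\varvec{Z}_t-\varvec{\mu })}^{\top }{\varvec{\Sigma }}^{-1}(\varvec{Z}_t-\varvec{\mu })\) (the analogue of \(d\) in Sect. 1.1 with \(\varvec{y}\) replaced by \(\varvec{Z}_t\)) and symmetric \({\varvec{\Sigma }}\):

$$\begin{aligned} \frac{\partial d_t}{\partial \theta _i}=&\,-2\frac{\partial \varvec{\mu }^{\top }}{\partial \theta _i}{\varvec{\Sigma }}^{-1}(\varvec{Z}_t-\varvec{\mu })+{(\varvec{Z}_t-\varvec{\mu })}^{\top }\frac{\partial {\varvec{\Sigma }}^{-1}}{\partial \theta _i}(\varvec{Z}_t-\varvec{\mu }),\\ \frac{\partial {\varvec{\Sigma }}^{-1}}{\partial \theta _i}=&\,-{\varvec{\Sigma }}^{-1}\frac{\partial {\varvec{\Sigma }}}{\partial \theta _i}{\varvec{\Sigma }}^{-1}. \end{aligned}$$

Substituting the second line into the first, and transposing the scalar first term, gives the stated result.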

for \(A_t\):

$$\begin{aligned} \frac{\partial A_t}{\partial \theta _i}=\bigg (\frac{\partial \psi }{\partial \theta _i}\varvec{b}^{\top } +\psi \frac{\partial \varvec{b}^{\top }}{\partial \theta _i}-\psi \varvec{b}^{\top }{\varvec{\Sigma }}^{-1} \frac{\partial {\varvec{\Sigma }}}{\partial \theta _i}\bigg ){\varvec{\Sigma }}^{-1}(\varvec{Z}_t-\varvec{\mu }) -\psi \varvec{b}^{\top }{\varvec{\Sigma }}^{-1}\frac{\partial \varvec{\mu }}{\partial \theta _i}, \end{aligned}$$

where \(\psi =\frac{\lambda _x\phi _x}{\sqrt{\phi _x+\lambda _x^2\Lambda _x}}\).

for \(\log |{\varvec{\Sigma }}|\):

$$\begin{aligned} \frac{\partial \log |{\varvec{\Sigma }}|}{\partial \beta }=&2q\beta \phi _{\delta }\phi _x/\tau ,\\ \frac{\partial \log |{\varvec{\Sigma }}|}{\partial \phi _x}=&\left[ p(\phi _{\varepsilon }+q\phi _e)+q\beta ^2\phi _{\delta }\right] /\tau ,\\ \frac{\partial \log |{\varvec{\Sigma }}|}{\partial \phi _{\delta }}=&(p-1)/\phi _{\delta }+(\phi _{\varepsilon }+q\phi _e+q\beta ^2\phi _x)/\tau ,\\ \frac{\partial \log |{\varvec{\Sigma }}|}{\partial \phi _{\varepsilon }}=&(q-1)/\phi _{\varepsilon }+(\phi _{\delta }+p\phi _x)/\tau ,\\ \frac{\partial \log |{\varvec{\Sigma }}|}{\partial \phi _e}=&q(\phi _{\delta }+p\phi _x)/\tau , \end{aligned}$$

where \(|{\varvec{\Sigma }}|=\phi _{\delta }^{p-1}\phi _{\varepsilon }^{q-1}\tau \), \(\tau =(\phi _{\delta }+p\phi _x)(\phi _{\varepsilon }+q\phi _e)+q\beta ^2\phi _{\delta }\phi _x\).
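The determinant identity and the five derivatives of \(\log |{\varvec{\Sigma }}|\) can be checked numerically. The Python sketch below assumes the covariance structure \({\varvec{\Sigma }}=\phi _x\varvec{b}\varvec{b}^{\top }+\phi _e\varvec{c}\varvec{c}^{\top }+\varvec{D}(\phi _{\delta }\varvec{1}_p^{\top },\phi _{\varepsilon }\varvec{1}_q^{\top })\) with \(\varvec{b}=(\varvec{1}_p^{\top },\beta \varvec{1}_q^{\top })^{\top }\) and \(\varvec{c}=(\varvec{0}_p^{\top },\varvec{1}_q^{\top })^{\top }\). This structure is inferred from the derivatives of \({\varvec{\Sigma }}\) listed in this appendix rather than quoted from the model definition, and all function names are hypothetical.

```python
import math


def build_sigma(p, q, beta, phi_x, phi_delta, phi_eps, phi_e):
    """Assumed structure: phi_x*b*b' + phi_e*c*c' + diag(phi_delta*1_p, phi_eps*1_q)."""
    b = [1.0] * p + [beta] * q
    c = [0.0] * p + [1.0] * q
    diag = [phi_delta] * p + [phi_eps] * q
    n = p + q
    return [[phi_x * b[i] * b[j] + phi_e * c[i] * c[j] + (diag[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def tau(p, q, beta, phi_x, phi_delta, phi_eps, phi_e):
    return (phi_delta + p * phi_x) * (phi_eps + q * phi_e) + q * beta ** 2 * phi_delta * phi_x


def closed_form_gradient(p, q, beta, phi_x, phi_delta, phi_eps, phi_e):
    """The five derivatives of log|Sigma| exactly as stated above."""
    t = tau(p, q, beta, phi_x, phi_delta, phi_eps, phi_e)
    return {
        "beta": 2.0 * q * beta * phi_delta * phi_x / t,
        "phi_x": (p * (phi_eps + q * phi_e) + q * beta ** 2 * phi_delta) / t,
        "phi_delta": (p - 1) / phi_delta + (phi_eps + q * phi_e + q * beta ** 2 * phi_x) / t,
        "phi_eps": (q - 1) / phi_eps + (phi_delta + p * phi_x) / t,
        "phi_e": q * (phi_delta + p * phi_x) / t,
    }


def numeric_gradient(p, q, params, step=1e-6):
    """Central finite differences of log|Sigma| in each parameter."""
    grad = {}
    for name in params:
        up, down = dict(params), dict(params)
        up[name] += step
        down[name] -= step
        grad[name] = (log_det(build_sigma(p, q, **up))
                      - log_det(build_sigma(p, q, **down))) / (2.0 * step)
    return grad


if __name__ == "__main__":
    params = dict(beta=0.8, phi_x=1.5, phi_delta=0.4, phi_eps=0.6, phi_e=0.3)
    print(closed_form_gradient(3, 2, **params))
    print(numeric_gradient(3, 2, params))
```

The closed form is useful in practice because it avoids factorizing a \((p+q)\times (p+q)\) matrix at every iteration; the brute-force comparison is only meant as a check.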

In addition, we need the following derivatives:

for \(\varvec{\mu }\):

$$\begin{aligned} \frac{\partial \varvec{\mu }}{\partial \mu _x}=\varvec{b}, \frac{\partial \varvec{\mu }}{\partial \alpha }=\varvec{c}, \frac{\partial \varvec{\mu }}{\partial \beta }=\mu _x\varvec{c}. \end{aligned}$$

for \({\varvec{\Sigma }}\):

$$\begin{aligned} \frac{\partial {\varvec{\Sigma }}}{\partial \beta }&=\phi _x\left( \varvec{c}\varvec{b}^{\top }+\varvec{b}\varvec{c}^{\top }\right) , \quad \frac{\partial {\varvec{\Sigma }}}{\partial \phi _x}=\varvec{b}\varvec{b}^{\top }, \quad \frac{\partial {\varvec{\Sigma }}}{\partial \phi _{\delta }}=\varvec{D}\left\{ \left( \varvec{1}_p^{\top },\varvec{0}_q^{\top }\right) \right\} ,\\ \frac{\partial {\varvec{\Sigma }}}{\partial \phi _{\varepsilon }}&=\varvec{D}(\varvec{c}), \quad \frac{\partial {\varvec{\Sigma }}}{\partial \phi _e}=\varvec{c}\varvec{c}^{\top }. \end{aligned}$$

for \(\varvec{b}\):

$$\begin{aligned} \frac{\partial \varvec{b}}{\partial \beta }=\varvec{c}. \end{aligned}$$

for \(\psi \):

$$\begin{aligned}&\frac{\partial \psi }{\partial \beta }=q\beta c^{-2}\psi ^3/(\phi _{\varepsilon }+q\phi _e), \quad \frac{\partial \psi }{\partial \lambda _x}=\phi _x^2(\phi _x+\lambda _x^2\Lambda _x)^{-3/2},\\&\frac{\partial \psi }{\partial \phi _x}=\frac{1}{2}\psi \phi _x^{-1}+\frac{1}{2}\psi ^3\phi _x^{-1}c^{-2}\varvec{b}^{\top }{\varvec{\Sigma }}_1^{-1}\varvec{b}, \quad \frac{\partial \psi }{\partial \phi _{\delta }}=-\frac{1}{2}p\psi ^3c^{-2}\phi _{\delta }^{-2},\\&\frac{\partial \psi }{\partial \phi _{\varepsilon }}=-\frac{1}{2}\psi ^3c^{-2}q\beta ^2{(\phi _{\varepsilon }+q\phi _e)}^{-2},\quad \frac{\partial \psi }{\partial \phi _e}=-\frac{1}{2}q^2\psi ^3c^{-2}\beta ^2{(\phi _{\varepsilon }+q\phi _e)}^{-2}. \end{aligned}$$
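Of these derivatives, the one with respect to \(\lambda _x\) depends only on the definition of \(\psi \) given above, so it can be verified without further model details. The Python sketch below compares it with a central finite difference, under the assumption that \(\Lambda _x\) does not depend on \(\lambda _x\) (its definition is not restated in this appendix); the function names are illustrative.

```python
import math


def psi(lam_x, phi_x, Lambda_x):
    """psi = lam_x * phi_x / sqrt(phi_x + lam_x^2 * Lambda_x)."""
    return lam_x * phi_x / math.sqrt(phi_x + lam_x ** 2 * Lambda_x)


def dpsi_dlam(lam_x, phi_x, Lambda_x):
    """Closed form stated above: phi_x^2 * (phi_x + lam_x^2 * Lambda_x)^(-3/2).

    Assumption: Lambda_x is held fixed when differentiating in lam_x.
    """
    return phi_x ** 2 * (phi_x + lam_x ** 2 * Lambda_x) ** (-1.5)


def finite_difference(lam_x, phi_x, Lambda_x, step=1e-6):
    """Central finite-difference approximation of d psi / d lam_x."""
    return (psi(lam_x + step, phi_x, Lambda_x)
            - psi(lam_x - step, phi_x, Lambda_x)) / (2.0 * step)


if __name__ == "__main__":
    args = (1.2, 0.9, 0.35)  # illustrative values of lam_x, phi_x, Lambda_x
    print(dpsi_dlam(*args), finite_difference(*args))
```

The remaining derivatives of \(\psi \) involve the scalar \(c\) and the matrix \({\varvec{\Sigma }}_1\), so a similar check for them would need the definitions from the main text.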

Cite this article

Cao, C., Wang, Y., Shi, J.Q. et al. Measurement Error Models for Replicated Data Under Asymmetric Heavy-Tailed Distributions. Comput Econ 52, 531–553 (2018). https://doi.org/10.1007/s10614-017-9702-8

  • DOI: https://doi.org/10.1007/s10614-017-9702-8
