Abstract
Statistical techniques are used in all branches of science to determine the feasibility of quantitative hypotheses. One of the most basic applications of statistical techniques in comparative analysis is the test of equality of two population means, generally performed under the assumption of normality. In medical studies, for example, we often need to compare the effects of two different drugs, treatments or preconditions on the resulting outcome. The most commonly used test in this connection is the two sample \(t\) test for the equality of means, performed under the assumption of equality of variances. It is a very useful tool, which is widely used by practitioners of all disciplines and has many optimality properties under the model. However, the test has one major drawback: it is highly sensitive to deviations from the ideal conditions, and may perform miserably under model misspecification and in the presence of outliers. In this paper we present a robust test for the two sample hypothesis based on the density power divergence measure (Basu et al. in Biometrika 85(3):549–559, 1998), and show that it can be a great alternative to the ordinary two sample \(t\) test. The asymptotic properties of the proposed tests are rigorously established in the paper, and their performance is explored through simulations and real data analysis.
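For readers who wish to experiment with the approach, the following is a minimal sketch in Python (not the authors' code) of the two ingredients described above: the minimum density power divergence estimator of Basu et al. (1998) for a normal sample, and the density power divergence between two fitted normal densities with a common scale, from which a statistic of the form \(2n_1n_2/(n_1+n_2)\) times the divergence can be assembled. The tuning parameters \(\beta\) (estimation) and \(\gamma\) (divergence in the statistic), the helper names, the pooled-scale convention and the numerical integration are illustrative assumptions; the paper gives the exact estimators, closed-form expressions and test statistic.

# Illustrative sketch only: minimum DPD estimation for a normal sample and the
# DPD between two fitted normal densities with a common scale.
import numpy as np
from scipy import integrate, optimize, stats

def dpd_objective(theta, x, beta):
    """Empirical DPD objective for N(mu, sigma^2); minimising it gives the MDPDE."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)                      # keep sigma positive
    # closed form of the integral of f^(1+beta) for the normal density
    int_term = (2.0 * np.pi * sigma ** 2) ** (-beta / 2.0) / np.sqrt(1.0 + beta)
    data_term = (1.0 + 1.0 / beta) * np.mean(stats.norm.pdf(x, mu, sigma) ** beta)
    return int_term - data_term

def mdpde_normal(x, beta):
    """Minimum density power divergence estimate of (mu, sigma) from one sample."""
    start = np.array([np.median(x), np.log(stats.iqr(x) / 1.349)])  # robust start
    res = optimize.minimize(dpd_objective, start, args=(x, beta), method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])

def dpd_normals(mu1, mu2, sigma, gamma):
    """d_gamma between N(mu1, sigma^2) and N(mu2, sigma^2), by numerical integration."""
    f = lambda t: stats.norm.pdf(t, mu1, sigma)
    g = lambda t: stats.norm.pdf(t, mu2, sigma)
    integrand = lambda t: (f(t) ** (1 + gamma)
                           - (1 + 1 / gamma) * f(t) ** gamma * g(t)
                           + (1 / gamma) * g(t) ** (1 + gamma))
    lo, hi = min(mu1, mu2) - 10 * sigma, max(mu1, mu2) + 10 * sigma
    value, _ = integrate.quad(integrand, lo, hi)
    return value

# Toy usage with illustrative tuning parameters; the pooled-scale step below is a
# convention of this sketch, not necessarily the estimator used in the paper.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(0.0, 1.0, 50), rng.normal(0.3, 1.0, 60)
mu1_hat, s1 = mdpde_normal(x1, beta=0.5)
mu2_hat, s2 = mdpde_normal(x2, beta=0.5)
n1, n2 = len(x1), len(x2)
sigma_hat = np.sqrt((n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2))
T = 2 * n1 * n2 / (n1 + n2) * dpd_normals(mu1_hat, mu2_hat, sigma_hat, gamma=0.5)
print(T)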
References
Basu A, Harris IR, Hjort NL, Jones MC (1998) Robust and efficient estimation by minimising a density power divergence. Biometrika 85(3):549–559
Basu A, Mandal A, Martin N, Pardo L (2013) Testing statistical hypotheses based on the density power divergence. Ann Inst Stat Math 65(2):319–348
Basu A, Mandal A, Martin N, Pardo L (2014) Density power divergence tests for composite null hypotheses. arXiv:1403.0330
Dik JJ, de Gunst MCM (1985) The distribution of general quadratic forms in normal variables. Stat Neerl 39(1):14–26
Doksum KA, Sievers GL (1976) Plotting with confidence: graphical comparisons of two populations. Biometrika 63(3):421–434
Fraser DAS (1957) Most powerful rank-type tests. Ann Math Stat 28:1040–1043
Fujisawa H, Eguchi S (2006) Robust estimation in the normal mixture model. J Stat Plan Inference 136(11):3989–4011
Fujisawa H, Eguchi S (2008) Robust parameter estimation with a small bias against heavy contamination. J Multivar Anal 99(9):2053–2081
Ghosh A, Basu A (2013) Robust estimation for independent non-homogeneous observations using density power divergence with applications to linear regression. Electron J Stat 7:2420–2456
Jones MC, Hjort NL, Harris IR, Basu A (2001) A comparison of related density-based minimum divergence estimators. Biometrika 88(3):865–873
Koopmans LH (1987) Introduction to contemporary statistical methods. Duxbury Press, Boston
Stigler SM (1977) Do robust estimators work with real data? Ann Stat 5(6):1055–1098
Tiku ML, Tan WY, Balakrishnan N (1986) Robust inference. Statistics: textbooks and monographs, vol 71. Marcel Dekker Inc., New York
Voinov V, Balakrishnan N, Nikulin MS (2013) Chi-squared goodness of fit tests with applications. Academic Press, Waltham
Yuen KK, Dixon WJ (1973) The approximate behaviour and performance of the two-sample trimmed t. Biometrika 60(2):369–374
Acknowledgments
This work was partially supported by Grants MTM-2012-33740 and ECO-2011-25706. The authors gratefully acknowledge the suggestions of two anonymous referees, which led to an improved version of the paper.
Appendix
Proof of Theorem 1
As \(\widehat{\mu }_{i\beta }\) is the solution of the estimating equation \(_{1}h_{n_{i},\beta }^{\prime }\left( \mu _{i},\sigma \right) =0\), we get from Eq. (7)
Hence, using (9) we get
where
It is clear that \(\widehat{\mu }_{1\beta }\) and \(\widehat{\mu }_{2\beta }\) are based on two independent sets of observations; hence, \(Cov(\widehat{\mu }_{1\beta },\widehat{\mu }_{2\beta })=0\). As \(_{2}h_{n_1,n_2,\beta }^{\prime }(\widehat{\varvec{\eta }}_\beta )=0\), taking a Taylor series expansion around \(\varvec{\eta }_0\) we get
Notice that
Similarly we get
Moreover,
Therefore, using Eqs. (30), (31) and (32) we get from Eq. (29)
Similarly we also have
Hence,
Now, from Eq. (33) we get
where
As \(\varvec{J}_{12,\beta }(\sigma _0)=\varvec{J}_{21,\beta }(\sigma _0)=0\), it is clear that
Therefore, \(Cov(\widehat{\mu }_{1\beta },\widehat{\sigma }_{\beta })=Cov(\widehat{\mu }_{2\beta },\widehat{\sigma }_{\beta })=0\). Moreover, \(Cov(\widehat{\mu }_{1\beta },\widehat{\mu }_{2\beta })=0\). Combining the results in (27) and (34) we get the variance-covariance matrix of \(\sqrt{\frac{n_1n_2}{n_1+n_2}}\widehat{\varvec{\eta }}_{\beta }\) as follows
where the values of the diagonal elements are given in (28) and (35). Hence, the theorem is proved.\(\square \)
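For orientation, the result just established can be stated schematically as follows, where \(v_{1,\beta }(\sigma _0)\), \(v_{2,\beta }(\sigma _0)\) and \(v_{\sigma ,\beta }(\sigma _0)\) are placeholders (not the paper's notation) for the diagonal entries given in (28) and (35):
\[
\sqrt{\tfrac{n_{1}n_{2}}{n_{1}+n_{2}}}\left( \widehat{\varvec{\eta }}_{\beta }-\varvec{\eta }_{0}\right) \overset{\mathcal {L}}{\longrightarrow } N\left( \varvec{0},\ \mathrm {diag}\left\{ v_{1,\beta }(\sigma _0),\,v_{2,\beta }(\sigma _0),\,v_{\sigma ,\beta }(\sigma _0)\right\} \right) .
\]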
Proof of Theorem 2
A Taylor expansion of \( d_{\gamma }(f_{\widehat{\mu }_{1\beta },\widehat{\sigma }_\beta },f_{ \widehat{\mu }_{2\beta },\widehat{\sigma }_\beta })\) around \(\varvec{\eta }_{0}\) gives
where \(\varvec{t}_{\gamma }\left( \varvec{\eta }_{0}\right) =\frac{ \partial }{\partial \varvec{\eta }}\left. d_{\gamma }(f_{\mu _1,\sigma },f_{\mu _2,\sigma })\right| _{\varvec{\eta }=\varvec{\eta }_{0}}\); the expressions of the components \(t_{\gamma , i}\left( \varvec{\eta }_{0}\right) \), \(i=1,2,3\), are given in (18)–(20). Hence, the result directly follows from Theorem 1. \(\square \)
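The last step above is an application of the standard delta method. Under the paper's conditions, and in particular when \(\mu _{10}\ne \mu _{20}\) so that \(\varvec{t}_{\gamma }(\varvec{\eta }_{0})\ne 0\), it takes the generic form below, with \(\varvec{\Sigma }_{\beta }(\sigma _0)\) used here only as a placeholder for the asymptotic covariance matrix obtained in Theorem 1:
\[
\sqrt{\tfrac{n_{1}n_{2}}{n_{1}+n_{2}}}\left( d_{\gamma }(f_{\widehat{\mu }_{1\beta },\widehat{\sigma }_\beta },f_{\widehat{\mu }_{2\beta },\widehat{\sigma }_\beta })-d_{\gamma }(f_{\mu _{10},\sigma _0},f_{\mu _{20},\sigma _0})\right) \overset{\mathcal {L}}{\longrightarrow } N\left( 0,\ \varvec{t}_{\gamma }^{\top }(\varvec{\eta }_{0})\,\varvec{\Sigma }_{\beta }(\sigma _0)\,\varvec{t}_{\gamma }(\varvec{\eta }_{0})\right) .
\]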
Proof of Theorem 3
If \(\mu _{10}=\mu _{20}\), it is obvious that \(d_{\gamma }(f_{\mu _{10},\sigma _0},f_{\mu _{20},\sigma _0})=0\), and \(\varvec{t}_{\gamma } ( \varvec{\eta }_{0})=0\). Hence, a second order Taylor expansion of \(d_{\gamma }(f_{\widehat{\mu }_{1\beta },\widehat{\sigma }_\beta },f_{ \widehat{\mu }_{2\beta },\widehat{\sigma }_\beta })\) around \(\varvec{\eta }_{0}\) gives
where \(\varvec{A}_{\gamma }(\sigma _0)\) is the matrix containing the second derivatives of \(d_{\gamma }(f_{\mu _1,\sigma },f_{\mu _2,\sigma })\) evaluated at \(\mu _{10}=\mu _{20}\). It can be shown that
where
Therefore, Eq. (36) simplifies to
where
From Theorem 1 we know that
where
Therefore, \(\frac{2 n_1n_2}{n_1+n_2} d_{\gamma }(f_{\widehat{\mu }_{1\beta },\widehat{\sigma }_{\beta }},f_{\widehat{\mu }_{2\beta },\widehat{\sigma }_\beta })\) has the same asymptotic distribution (see Dik and de Gunst 1985) as the random variable
where \(Z_{1}\) and \(Z_{2}\) are independent standard normal variables, and
are the eigenvalues of the matrix \(\varvec{\Sigma }_{w,\beta }^{*}(\sigma _0)\varvec{A}_{\gamma }^{*}\left( \sigma _0\right) \). Hence,
Finally, since \(\widehat{\sigma }_\beta \) is a consistent estimator of \( \sigma \), replacing \(\lambda _{\beta ,\gamma }(\sigma _0)\) by \(\lambda _{\beta ,\gamma }(\widehat{\sigma }_\beta )\) and applying Slutsky's theorem, we obtain the desired result. \(\square \)
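As a computational aside, once the coefficients of the limiting law have been evaluated at \(\widehat{\sigma }_\beta \), tail probabilities of a weighted combination of independent squared standard normals can be approximated by simulation. The sketch below (Python; the function name, the two-coefficient setup and the numerical values are illustrative assumptions, not the paper's code) shows the idea.

# Monte Carlo approximation of P( sum_j lambda_j * Z_j^2 > t_obs ),
# the type of quadratic-form limit appearing in the proof of Theorem 3.
import numpy as np

def weighted_chisq_pvalue(t_obs, lambdas, n_sim=200_000, seed=0):
    """Tail probability of sum_j lambdas[j] * Z_j^2, with Z_j i.i.d. standard normal."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, len(lambdas)))
    draws = (z ** 2) @ np.asarray(lambdas)
    return float(np.mean(draws > t_obs))

# illustrative call with placeholder eigenvalues and observed statistic
print(weighted_chisq_pvalue(t_obs=4.2, lambdas=[0.8, 0.3]))

Exact or series-type evaluations of such quadratic-form distributions, along the lines of Dik and de Gunst (1985), can replace the simulation when higher precision is required.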