Background

Incomplete data often arise in research settings such as crossover trials, equivalence trials, and pre- and post-test comparative studies. For instance, ([1] pp. 212) designed a crossover clinical trial to measure the onset of action of two doses of formoterol solution aerosol: 12 ug and 24 ug. In this study, twenty-four patients were randomly allocated in equal numbers to one of the six possible sequences of two treatments at a time. Each patient received two aerosols at each of visits 2 and 4. After four weeks, the researchers measured the forced expiratory volume in one second (FEV1) for the twenty-four patients. Because the researchers did not consider all possible combinations of the three treatments (i.e., placebo, 12 ug and 24 ug aerosols), the missing data mechanism can be regarded as missing completely at random (MCAR), and FEV1 was observed for 7 patients under both treatments (i.e., 12 ug and 24 ug aerosols), for 9 patients only under the 12 ug aerosol, and for 8 patients only under the 24 ug aerosol. The resultant data, shown in Table 1, consist of two parts: the complete observations and the incomplete observations.

Table 1 FEV1 indicators of patients for 12 ug and 24 ug formoterol solution aerosol

For the above crossover clinical trial, our main interest is to test the equivalence between the 12 ug and 24 ug formoterol solution aerosols with respect to the FEV1 value. To this end, we can construct a (1−α)100 % confidence interval (CI) for the difference of the two mean FEV1 values. If the resultant CI lies entirely in the interval (−δ 0, δ 0), with δ 0 (>0) being some pre-specified clinically acceptable threshold, we can conclude that the two doses of formoterol solution aerosol are equivalent at the α significance level. Reliable CIs for the difference in the presence of incomplete data are therefore necessary.
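The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `is_equivalent` and its arguments are hypothetical, and the CI limits are assumed to come from any of the methods described in Section “Methods”.

```python
def is_equivalent(ci_lower, ci_upper, delta0):
    """Return True when the CI (ci_lower, ci_upper) lies entirely in (-delta0, delta0).

    delta0 is the pre-specified clinically acceptable threshold (must be > 0).
    """
    if delta0 <= 0:
        raise ValueError("delta0 must be positive")
    return -delta0 < ci_lower and ci_upper < delta0


# Illustrative numbers only (not taken from the trial data).
print(is_equivalent(-0.10, 0.20, 0.5))  # CI inside the equivalence margin
print(is_equivalent(-0.60, 0.20, 0.5))  # lower limit falls outside the margin
```

Note that a CI that merely contains zero does not by itself establish equivalence under this rule; the whole interval has to fall inside the margin.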

The problem of testing the equality and constructing CI for the difference of two correlated proportions in the presence of incomplete paired binary data has received considerable attention in past years. For example, one can refer to [2–6] for the large sample method, and [7] for the corrected profile likelihood method. When the sample size is small, [8] proposed an exact unconditional test procedure for testing equality of two correlated proportions with incomplete correlated data. Tang, Ling and Tian [9] developed exact unconditional and approximate unconditional CIs for the proportion difference in the presence of incomplete paired binary data. Lin et al. [10] presented a Bayesian method to test equality of two correlated proportions with incomplete correlated data. Li et al. [11] discussed confidence interval construction for the rate ratio in matched-pair studies with incomplete data. However, all the aforementioned methods were developed for incomplete paired binary data.

Statistical inference on the difference of two means with incomplete correlated data has received limited attention. For example, [12] discussed the problem of testing the equality of two means with missing data on one response and recommended the statistic of [13] when the variances were not too different. Lin and Stivers [14] also gave a similar comparison. Lin and Stivers [15] and [12] suggested some test statistics for testing the equality of two means with incomplete data on both responses. However, to our knowledge, little work has been done on CI construction for the difference of two means with incomplete correlated data under the MCAR assumption.

Inspired by [16–19], we develop several CIs for the difference of two means with incomplete correlated data under the MCAR assumption based on the large sample method, the hybrid method and the Bootstrap-resampling method. The presented Bootstrap-resampling CIs have not been considered in the literature related to missing observations.

The rest of this article is organized as follows. Section “Methods” presents several methods for constructing CIs for the difference of two means with incomplete correlated data. Section “Results” reports simulation studies and an example that evaluate the finite-sample performance of the proposed CIs in terms of coverage probability, expected interval width, and mesial and distal non-coverage probabilities. A brief discussion is given in Section “Discussion”, and some concluding remarks are given in Section “Conclusion”.

Methods

Suppose that x=(x 1,x 2) is a 2×1 vector of random variables, and follows a distribution with mean μ and covariance matrix Σ given by

$$\boldsymbol{\mu}=\left(\begin{array}{l} \mu_{1} \\ \mu_{2} \end{array} \right) \,\text{and}\,\, \boldsymbol{\Sigma}=\left(\begin{array}{ll} {\sigma_{1}^{2}}&\, \rho\sigma_{1}\sigma_{2}\\ \rho\sigma_{1}\sigma_{2}& \,{\sigma_{2}^{2}} \end{array} \right), $$

respectively. Let {(x 1m ,x 2m ):m=1,⋯,n} be n paired observations on x 1 and x 2, \(\left \{x_{1,n+1}, \cdots, x_{1,n+n_{1}}\right \}\) be n 1 additional observations on x 1, \(\left \{x_{2,n+1}, \cdots, x_{2,n+n_{2}}\right \}\) be n 2 additional observations on x 2. Thus, there are n 1 missing observations on x 2, and n 2 missing observations on x 1. Without loss of generality, the data may be presented as follows:

$$ \begin{array}{lll} x_{11}, \cdots, x_{1n}, & x_{1,n+1}, \cdots, x_{1,n+n_{1}},& \\ x_{21}, \cdots, x_{2n}, & &x_{2,n+1}, \cdots, x_{2,n+n_{2}}, \end{array} $$

where (x 1m ,x 2m ) is referred to as a paired observation, while x 1,n+j and x 2,n+k are referred to as incomplete or unpaired observations. Similar to [20, 21], throughout this article, it is assumed that the missing data mechanism is MCAR (i.e., independent of treatment and outcome). Based on these observations, we want to construct reliable explicit CIs for the difference of two means δ=μ 1−μ 2 under the MCAR assumption.

Confidence interval based on the large sample method

To make a comparison with the following proposed methods, we assume that x follows a bivariate normal distribution in this subsection. In this case, if only variable x 1 or x 2 is subject to missingness (i.e., n 1=0 or n 2=0), one can obtain the closed forms of the maximum likelihood estimates (MLEs) of μ and Σ [22]. However, there are no closed forms of the MLEs for μ and Σ when variables x 1 and x 2 are simultaneously subject to missingness (i.e., n 1≠0 and n 2≠0), though one can find the MLEs of μ and Σ using an iterative algorithm [23]. To get the closed forms of MLEs for μ and Σ, [15] proposed the modified MLEs using a non-iterative procedure and provided several test statistics based on the obtained estimators of μ and Σ.

(i) Confidence interval based on Lin and Stivers’s test statistics

Let \(\hat {\delta }=\hat {\mu }_{1}-\hat {\mu }_{2}\) be the MLE of δ under the bivariate normal assumption of x. When Σ is known, it follows from [15] that the MLE of δ is

$$ \hat{\delta}=a\overline{x}_{1}^{(n)}+\left(1-a\right)\overline{x}_{1}^{(n_{1})}-b\overline{x}_{2}^{(n)}-(1-b)\overline{x}_{2}^{(n_{2})}, $$

and the asymptotic variance of \(\hat {\delta }\) can be expressed as

$$ \begin{aligned}\text{Var}(\hat{\delta})=~ &h\left\{\left[n+n_{2}\left(1-\rho^{2}\right)\right]{\sigma_{1}^{2}}-2n\rho\sigma_{1}\sigma_{2}\right.\\ &\left.+\left[n+n_{1}\left(1-\rho^{2}\right)\right]{\sigma_{2}^{2}}\right\},\end{aligned} $$

respectively, where \(\overline {x}_{1}^{(n)}=\frac {1}{n}\sum _{j=1}^{n}x_{1j}\), \(\overline {x}_{2}^{(n)}=\frac {1}{n}\sum _{j=1}^{n}x_{2j}\), \(\overline {x}_{1}^{(n_{1})}=\frac {1}{n_{1}}\sum _{j=1}^{n_{1}}x_{1,n+j}\), \(\overline {x}_{2}^{(n_{2})}=\frac {1}{n_{2}}\sum _{k=1}^{n_{2}}x_{2,n+k}\), \(a=nh(n+n_{2}+n_{1}\beta _{21})\), \(b=nh(n+n_{1}+n_{2}\beta _{12})\), \(\beta _{21}=\rho \sigma _{2}/\sigma _{1}\), \(\beta _{12}=\rho \sigma _{1}/\sigma _{2}\), and \(h=1/\left \{(n+n_{1})(n+n_{2})-n_{1}n_{2}\rho ^{2}\right \}\). An approximate 100(1−α) % CI of δ is given by \(\left (\hat {\delta }-z_{\alpha /2}\sqrt {\text {Var}(\hat {\delta })}, \hat {\delta }+z_{\alpha /2}\sqrt {\text {Var}(\hat {\delta })}\right)\), which is denoted as T w1-CI.
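As a rough illustration of the T w1-CI, the following Python sketch evaluates the estimator and its asymptotic variance for known σ 1, σ 2 and ρ. The function name `tw1_ci` and its argument layout are hypothetical. The variance is coded in the form that reduces to σ 1²/(n+n 1)+σ 2²/(n+n 2) when ρ=0, that is, with the factor (1−ρ²) in both bracketed terms; the sketch also assumes n 1≥1 and n 2≥1.

```python
from math import sqrt
from statistics import NormalDist, fmean


def tw1_ci(x1_paired, x2_paired, x1_extra, x2_extra, sigma1, sigma2, rho, alpha=0.05):
    """Large-sample CI for delta = mu1 - mu2 when Sigma is known (sketch)."""
    n, n1, n2 = len(x1_paired), len(x1_extra), len(x2_extra)
    h = 1.0 / ((n + n1) * (n + n2) - n1 * n2 * rho ** 2)
    beta21 = rho * sigma2 / sigma1
    beta12 = rho * sigma1 / sigma2
    a = n * h * (n + n2 + n1 * beta21)
    b = n * h * (n + n1 + n2 * beta12)
    # Weighted combination of the paired and unpaired sample means.
    delta_hat = (a * fmean(x1_paired) + (1 - a) * fmean(x1_extra)
                 - b * fmean(x2_paired) - (1 - b) * fmean(x2_extra))
    var = h * ((n + n2 * (1 - rho ** 2)) * sigma1 ** 2
               - 2 * n * rho * sigma1 * sigma2
               + (n + n1 * (1 - rho ** 2)) * sigma2 ** 2)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half_width = z * sqrt(var)
    return delta_hat - half_width, delta_hat + half_width


# Illustrative data only.
print(tw1_ci([1, 2, 3], [1, 2, 3], [2, 2], [2, 2], sigma1=1.0, sigma2=1.0, rho=0.0))
```

With ρ=0 the weights reduce to a=n/(n+n 1) and b=n/(n+n 2), so the estimator coincides with the difference of the two overall sample means.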

Following [15], when Σ is unknown, the statistic for testing H 0:δ=δ 0 versus H 1:δ≠δ 0 is given by

$$ {}T_{1}\,=\,\frac{A\!\left(\overline{x}_{1}^{(n)}\!-\overline{x}_{1}^{(n_{1})}\right)\,-\,B\left(\overline{x}_{2}^{(n)}\!-\overline{x}_{2}^{(n_{2})}\right)\!+\overline{x}_{1}^{(n_{1})}\!-\overline{x}_{2}^{(n_{2})}\!-\delta_{0}}{\sqrt{V_{1}}}, $$

which is asymptotically distributed as a t-distribution with n degrees of freedom under H 0, where \(V_{1}=\left [\left \{A^{2}/n+(1-A)^{2}/n_{1}\right \}m_{1}+\left \{B^{2}/n+(1-B)^{2}/n_{2}\right \}m_{2}-2ABm_{12}/n\right ]/(n-1)\), \(A=n\left (n+n_{2}+n_{1}m_{12}/m_{1}\right)/\left \{(n+n_{1})(n+n_{2})-n_{1}n_{2}r^{2}\right \}\), \(B=n\left (n+n_{1}+n_{2}m_{12}/m_{2}\right)/\left \{(n+n_{1})(n+n_{2})-n_{1}n_{2}r^{2}\right \}\), \(m_{1}=\sum _{j=1}^{n} \left (x_{1j}-\overline {x}_{1}^{(n)}\right)^{2}\), \(m_{2}=\sum _{j=1}^{n}\left (x_{2j}-\overline {x}_{2}^{(n)}\right)^{2}\), \(m_{12}=\sum _{j=1}^{n}\left (x_{1j}-\overline {x}_{1}^{(n)}\right)\left (x_{2j}-\overline {x}_{2}^{(n)}\right)\), and \(r=m_{12}/\sqrt {m_{1}m_{2}}\). Therefore, the approximate 100(1−α) % CI on the basis of T 1 is given by (L, U), where \(L=A\left (\overline {x}_{1}^{(n)}-\overline {x}_{1}^{(n_{1})}\right)-B\left (\overline {x}_{2}^{(n)}-\overline {x}_{2}^{(n_{2})}\right)+\overline {x}_{1}^{(n_{1})}-\overline {x}_{2}^{(n_{2})}-t_{\alpha /2}(n)\sqrt {V_{1}}\), and \(U=A\left (\overline {x}_{1}^{(n)}-\overline {x}_{1}^{(n_{1})}\right)-B\left (\overline {x}_{2}^{(n)}-\overline {x}_{2}^{(n_{2})}\right)+\overline {x}_{1}^{(n_{1})}-\overline {x}_{2}^{(n_{2})}+t_{\alpha /2}(n)\sqrt {V_{1}}\), which is denoted as T 1-CI.

Another test statistic defined by [15] for testing H 0:δ=δ 0 versus H 1:δ≠δ 0, which is a generalization of the test statistic of [24] for two independent samples, is given by

$$ T_{2}=\frac{\bar{x}_{1}^{\left(n+n_{1}\right)}-\bar{x}_{2}^{\left(n+n_{2}\right)}-\delta_{0}}{\sqrt{h_{1}+h_{2}+h_{3}}}, $$

which is asymptotically distributed as a t-distribution with ν degrees of freedom, where \(\bar {x}_{1}^{(n+n_{1})}=(n+n_{1})^{-1}\sum _{j=1}^{n+n_{1}}x_{1j}\), \(\bar {x}_{2}^{(n+n_{2})}=(n+n_{2})^{-1}\sum _{j=1}^{n+n_{2}}x_{2j}\), \(h_{1}=n\left \{(n+n_{2})m_{1}/(n+n_{1})+(n+n_{1})m_{2}/(n+n_{2})-2m_{12}\right \}/\left \{(n-1)(n+n_{1})(n+n_{2})\right \}\), \(h_{2}=n_{1}b_{1}/\left \{(n_{1}-1)(n+n_{1})^{2}\right \}\), \(h_{3}=n_{2}b_{2}/\left \{(n_{2}-1)(n+n_{2})^{2}\right \}\), \(b_{1}=\sum _{j=n+1}^{n+n_{1}}\left (x_{1j}-\overline {x}_{1}^{(n_{1})}\right)^{2}\), \(b_{2}=\sum _{j=n+1}^{n+n_{2}}\left (x_{2j}-\overline {x}_{2}^{(n_{2})}\right)^{2}\), and \(\nu =\left (h_{1}+h_{2}+h_{3}\right)^{2}/\{{h_{1}^{2}}/(n-1)+{h_{2}^{2}}/(n_{1}-1)+{h_{3}^{2}}/(n_{2}-1)\}\). Therefore, the approximate 100(1−α) % CI of δ based on T 2, namely \(\bar {x}_{1}^{(n+n_{1})}-\bar {x}_{2}^{(n+n_{2})}\pm t_{\alpha /2}(\nu)\sqrt {h_{1}+h_{2}+h_{3}}\), is denoted as T 2-CI.
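The quantities entering T 2 can be computed as in the sketch below, which follows the definitions of h 1, h 2, h 3 and ν given above (with b 2 centred at the mean of the n 2 unpaired x 2 values). The function names are hypothetical, the sketch assumes n, n 1 and n 2 are all at least 2, and the t critical value is assumed to be supplied by the caller because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import fmean


def t2_components(x1_paired, x2_paired, x1_extra, x2_extra):
    """Return (delta_hat, standard error, nu) for the T2 statistic (sketch)."""
    n, n1, n2 = len(x1_paired), len(x1_extra), len(x2_extra)
    N1, N2 = n + n1, n + n2
    xb1, xb2 = fmean(x1_paired), fmean(x2_paired)
    m1 = sum((u - xb1) ** 2 for u in x1_paired)
    m2 = sum((v - xb2) ** 2 for v in x2_paired)
    m12 = sum((u - xb1) * (v - xb2) for u, v in zip(x1_paired, x2_paired))
    e1, e2 = fmean(x1_extra), fmean(x2_extra)
    b1 = sum((u - e1) ** 2 for u in x1_extra)
    b2 = sum((v - e2) ** 2 for v in x2_extra)
    h1 = n * (N2 * m1 / N1 + N1 * m2 / N2 - 2 * m12) / ((n - 1) * N1 * N2)
    h2 = n1 * b1 / ((n1 - 1) * N1 ** 2)
    h3 = n2 * b2 / ((n2 - 1) * N2 ** 2)
    delta_hat = (fmean(list(x1_paired) + list(x1_extra))
                 - fmean(list(x2_paired) + list(x2_extra)))
    total = h1 + h2 + h3
    nu = total ** 2 / (h1 ** 2 / (n - 1) + h2 ** 2 / (n1 - 1) + h3 ** 2 / (n2 - 1))
    return delta_hat, sqrt(total), nu


def t2_ci(delta_hat, se, t_crit):
    """CI given a caller-supplied critical value t_{alpha/2}(nu)."""
    return delta_hat - t_crit * se, delta_hat + t_crit * se


# Illustrative data only.
d, se, nu = t2_components([1, 2, 3], [2, 4, 6], [0, 2], [1, 3])
print(d, se, nu)
```

The degrees of freedom ν are generally not an integer, so the critical value would typically be taken from statistical software or from a table at the nearest integer.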

When σ 1=σ 2, it follows from [15] that the statistic for testing H 0:δ=δ 0 versus H 1:δ≠δ 0 can be expressed as

$$ \begin{aligned} T_{3}=\left\{\bar{x}_{1}^{(n+n_{1})}-\bar{x}_{2}^{(n+n_{2})}-\delta_{0}\right\} \sqrt{\frac{(n+n_{1}+n_{2}-2)(n+n_{1})(n+n_{2})}{(b_{1}+c_{2})(2n-2nr+n_{1}+n_{2})}}, \end{aligned} $$

which is asymptotically distributed as a t-distribution with n+n 1+n 2−4 degrees of freedom, where \(c_{1}=\sum _{j=1}^{n+n_{1}}\left (x_{1j}-\frac {1}{n+n_{1}}\sum _{j=1}^{n+n_{1}}x_{1j}\right)^{2}\), and \(c_{2}=\sum _{j=1}^{n+n_{2}}\left (x_{2j}-\frac {1}{n+n_{2}}\sum _{j=1}^{n+n_{2}}x_{2j}\right)^{2}\). Note that when n 2>n 1, b 1+c 2 should be replaced by b 2+c 1. Thus, the approximate 100(1−α) % CI of δ for T 3 is denoted as T 3-CI.

Also, [12] presented similar but simpler test statistics for testing the mean difference δ=μ 1−μ 2, which are adopted to construct CIs of δ as follows.

(ii) Confidence interval based on Ekbohm’s test statistics

Following [12], an unbiased estimator of δ is given by \(\hat {\delta }=\bar {x}_{1}^{(n+n_{1})}-\bar {x}_{2}^{(n+n_{2})}\), and its variance is given by \(\text {Var}(\hat {\delta })=\left \{(n+n_{2}){\sigma _{1}^{2}}+(n+n_{1}){\sigma _{2}^{2}}-2n\rho \sigma _{1}\sigma _{2}\right \}/\left \{(n+n_{1})(n+n_{2})\right \}\). An approximate 100(1−α) % CI of δ can be obtained by \(\left (\hat {\delta }-z_{\alpha /2}\sqrt {\text {Var}(\hat {\delta })}, \hat {\delta }+z_{\alpha /2}\sqrt {\text {Var}(\hat {\delta })}\right)\), which is denoted as T w2-CI.
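A minimal sketch of the T w2-CI for known σ 1, σ 2 and ρ is given below. The function name `tw2_ci` and its arguments are hypothetical. The variance is coded as σ 1²/(n+n 1)+σ 2²/(n+n 2)−2nρσ 1 σ 2/{(n+n 1)(n+n 2)}, which is the variance of the difference of two sample means that share n paired observations.

```python
from math import sqrt
from statistics import NormalDist, fmean


def tw2_ci(x1_all, x2_all, n, sigma1, sigma2, rho, alpha=0.05):
    """CI based on the difference of overall sample means (sketch).

    x1_all holds all n + n1 observations on x1, x2_all all n + n2
    observations on x2, and n is the number of paired observations.
    """
    N1, N2 = len(x1_all), len(x2_all)
    delta_hat = fmean(x1_all) - fmean(x2_all)
    var = (N2 * sigma1 ** 2 + N1 * sigma2 ** 2
           - 2 * n * rho * sigma1 * sigma2) / (N1 * N2)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half_width = z * sqrt(var)
    return delta_hat - half_width, delta_hat + half_width


# Illustrative data only: four complete pairs, no unpaired observations.
print(tw2_ci([1, 2, 3, 4], [0, 1, 2, 3], n=4, sigma1=1.0, sigma2=1.0, rho=0.5))
```

In the fully paired case (n 1=n 2=0) the variance reduces to (σ 1²+σ 2²−2ρσ 1 σ 2)/n, the usual variance of a mean of paired differences.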

When σ 1=σ 2, Ekbohm (1976) proposed the following statistic for testing H 0: \(T_{4}=(\tilde {\delta }-\delta _{0})\sqrt {(n+n_{1})(n+n_{2})-n_{1}n_{2}\lambda ^{2}}/\left \{\hat \sigma \sqrt {2n(1-\lambda)+(n_{1}+n_{2})(1-\lambda ^{2})}\right \}\), where \(\tilde {\delta }=\left [n\left (n+n_{2}+n_{1}\lambda \right)\overline {x}_{1}^{(n)}-n\left (n+n_{1}+n_{2}\lambda \right)\overline {x}_{2}^{(n)}+n_{1}\left \{n+n_{2}\left (1-\lambda ^{2}\right)-n\lambda \right \}\overline {x}_{1}^{(n_{1})}-n_{2}\left \{n+n_{1}\left (1-\lambda ^{2}\right)-n\lambda \right \}\overline {x}_{2}^{(n_{2})}\right ]\big /\left \{(n+n_{1})(n+n_{2})-n_{1}n_{2}\lambda ^{2}\right \}\), \(\hat {\sigma }^{2}=\left \{m_{1}+m_{2}+(1+\lambda ^{2})(b_{1}+b_{2})\right \}/\left \{2(n-1)+(1+\lambda ^{2})(n_{1}+n_{2}-2)\right \}\), and \(\lambda =2m_{12}/(m_{1}+m_{2})\). Under H 0, T 4 is asymptotically distributed as a t-distribution with n degrees of freedom. Therefore, the approximate 100(1−α) % CI is denoted as T 4-CI.

Following [12], when σ 1=σ 2, another statistic for testing H 0 can be expressed as \(T_{5}= \left (\bar {x}_{1}^{(n+n_{1})}-\bar {x}_{2}^{(n+n_{2})}-\delta _{0}\right)\sqrt {(n+n_{1})(n+n_{2})/(R_{1}+R_{2})}\), which is asymptotically distributed as a t-distribution with \(\nu _{\sigma }\) degrees of freedom under H 0, where \(R_{1}=n(m_{1}+m_{2}-2m_{12})/(n-1)\), \(R_{2}=(n_{1}+n_{2})(b_{1}+b_{2})/(n_{1}+n_{2}-2)\), and \(\nu _{\sigma }=\left (R_{1}+R_{2}\right)^{2}/\left \{{R_{1}^{2}}/(n+1)+{R_{2}^{2}}/(n_{1}+n_{2})\right \}-2\). Thus, an approximate 100(1−α) % CI of δ for T 5 is denoted as T 5-CI.

Confidence interval based on the generalized estimating equations (GEEs)

To relax the bivariate normality assumption on x, the method of generalized estimating equations (GEEs) with an exchangeable working correlation structure (e.g., [25]) can be adopted to make statistical inference on δ with incomplete correlated data, because the GEE approach has become one of the most widely used methods for dealing with correlated response data [26, 27]. Following [28], the GEEs with exchangeable working correlation structure can be used to estimate the parameter vector μ; the so-called sandwich variance estimator can be used to consistently estimate the covariance matrix of the estimator of μ; and the ML method under a bivariate normal assumption via the available paired observations is used to estimate the correlation parameter. Thus, an approximate 100(1−α) % CI of δ based on the GEE method is denoted as T g -CI.

Confidence interval based on the hybrid method

When the distribution function of x is unknown, a hybrid method is developed to construct a CI of δ in this subsection. We first introduce the general concept of the hybrid method. Let θ 1 and θ 2 be two parameters of interest. Our main interest is to construct a 100(1−α) % two-sided CI (L,U) of θ 1−θ 2 via the hybrid method. Let \(\hat {\theta }_{1}\) and \(\hat {\theta }_{2}\) be two estimates of θ 1 and θ 2, respectively; and let (l 1,u 1) and (l 2,u 2) denote two approximate 100(1−α) % CIs for θ 1 and θ 2, respectively. Under the dependence assumption on \(\hat \theta _{1}\) and \(\hat \theta _{2}\), it follows from the central limit theorem that the approximate two-sided 100(1−α) % CI of θ 1−θ 2 is given by (L,U), where

$$L=\hat{\theta}_{1}-\hat{\theta}_{2}-z_{\alpha/2}\sqrt{\text{Var}(\hat{\theta}_{1})+\text{Var}(\hat{\theta}_{2})-2\text{Cov}(\hat\theta_{1}, \hat\theta_{2})}, $$
$$U=\hat{\theta}_{1}-\hat{\theta}_{2}+z_{\alpha/2}\sqrt{\text{Var}(\hat{\theta}_{1})+\text{Var}(\hat{\theta}_{2})-2\text{Cov}(\hat\theta_{1}, \hat\theta_{2})}. $$

Because \(\text {Cov}(\hat \theta _{1}, \hat \theta _{2})=\text {corr}(\hat \theta _{1}, \hat \theta _{2})\left \{\text {Var}(\hat {\theta }_{1})\text {Var}(\hat {\theta }_{2})\right \}^{1/2}\), the lower limit L and the upper limit U can be rewritten as

$$\begin{aligned} &L=\hat{\theta}_{1}-\hat{\theta}_{2}-z_{\alpha/2}\sqrt{\text{Var}(\hat{\theta}_{1})+\text{Var}(\hat{\theta}_{2})-2\text{corr}(\hat\theta_{1}, \hat\theta_{2})\left\{\text{Var}(\hat{\theta}_{1})\text{Var}(\hat{\theta}_{2})\right\}^{1/2}}\\ &U=\hat{\theta}_{1}-\hat{\theta}_{2}+z_{\alpha/2}\sqrt{\text{Var}(\hat{\theta}_{1})+\text{Var}(\hat{\theta}_{2})-2\text{corr}(\hat\theta_{1}, \hat\theta_{2})\left\{\text{Var}(\hat{\theta}_{1})\text{Var}(\hat{\theta}_{2})\right\}^{1/2}}, \end{aligned} $$

respectively. Note that (l 1,u 1) contains the plausible parameter values of θ 1, and (l 2,u 2) contains the plausible parameter values of θ 2. Among these plausible values for θ 1 and θ 2, the values closest to the minimum L and maximum U are respectively l 1−u 2 and u 1−l 2 in the spirit of the score-type CI [29]. From the central limit theorem, the variance estimates can now be recovered from θ 1=l 1 as \(\widehat {\text {Var}(\hat \theta _{1})}=(\hat \theta _{1}-l_{1})^{2}/z_{\alpha /2}^{2}\) and from θ 2=u 2 as \(\widehat {\text {Var}(\hat \theta _{2})}=\left (u_{2}-\hat \theta _{2}\right)^{2}\big /z_{\alpha /2}^{2}\) for setting L. As a result, the lower limit L for θ 1−θ 2 is

$$ L=\hat{\theta}_{1}-\hat{\theta}_{2}-\sqrt{\left(\hat{\theta}_{1}-l_{1}\right)^{2}+\left(u_{2}-\hat{\theta}_{2}\right)^{2}-2\widehat{\text{corr}}\left(\hat{\theta}_{1}, \hat{\theta}_{2}\right)\left(\hat{\theta}_{1}-l_{1}\right)\left(u_{2}-\hat{\theta}_{2}\right)}. \qquad (1) $$

Similarly, we can obtain

$$ U=\hat{\theta}_{1}-\hat{\theta}_{2}+\sqrt{\left(u_{1}-\hat{\theta}_{1}\right)^{2}+\left(\hat{\theta}_{2}-l_{2}\right)^{2}-2\widehat{\text{corr}}\left(\hat{\theta}_{1}, \hat{\theta}_{2}\right)\left(u_{1}-\hat{\theta}_{1}\right)\left(\hat{\theta}_{2}-l_{2}\right)}. \qquad (2) $$
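The two limits in Eqs. (1) and (2) translate directly into code. The sketch below is a generic illustration: the function name `hybrid_ci` and its arguments are hypothetical, and the single-parameter limits (l 1,u 1), (l 2,u 2) and the estimated correlation are assumed to be computed elsewhere.

```python
from math import sqrt


def hybrid_ci(theta1_hat, theta2_hat, ci1, ci2, corr_hat):
    """Hybrid CI for theta1 - theta2 from two single-parameter CIs (sketch).

    ci1 = (l1, u1) and ci2 = (l2, u2) are CIs for theta1 and theta2, and
    corr_hat is the estimated correlation between the two estimators.
    """
    l1, u1 = ci1
    l2, u2 = ci2
    diff = theta1_hat - theta2_hat
    # Eq. (1): the lower limit uses the distances to l1 and u2.
    lower = diff - sqrt((theta1_hat - l1) ** 2 + (u2 - theta2_hat) ** 2
                        - 2 * corr_hat * (theta1_hat - l1) * (u2 - theta2_hat))
    # Eq. (2): the upper limit uses the distances to u1 and l2.
    upper = diff + sqrt((u1 - theta1_hat) ** 2 + (theta2_hat - l2) ** 2
                        - 2 * corr_hat * (u1 - theta1_hat) * (theta2_hat - l2))
    return lower, upper


# Illustrative numbers only.
print(hybrid_ci(1.0, 0.0, (0.0, 2.0), (-1.0, 1.0), corr_hat=0.0))
print(hybrid_ci(1.0, 0.0, (0.0, 2.0), (-1.0, 1.0), corr_hat=0.5))
```

A positive estimated correlation shrinks both radicands, so the interval for the difference becomes narrower than in the uncorrelated case, as expected for positively correlated paired data.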

To obtain the above approximate 100(1−α) % hybrid CI for μ 1−μ 2, one needs to evaluate the (1−α)100 % CIs of θ 1=μ 1 (denoted as (l 1, u 1)) and θ 2=μ 2 (denoted as (l 2, u 2)), and to estimate the correlation coefficient \(\widehat {\text {corr}}(\hat {\theta }_{1}, \hat \theta _{2})\). For the former, following [19], we consider the following two methods for getting the confidence limits (l 1, u 1) and (l 2, u 2) of θ 1 and θ 2.

(i) The Wilson score method

$$ l_{i}=\tilde{\theta}_{i}-\frac{z_{\alpha/2}}{N_{i}+z_{\alpha/2}^{2}}\sqrt{\frac{n}{n-1}{\sum}_{j=1}^{n}\left(x_{ij}-\hat{\theta}_{i}\right)^{2}+\frac{z_{\alpha/2}^{2}}{4}}, $$
$$ u_{i}=\tilde{\theta}_{i}+\frac{z_{\alpha/2}}{N_{i}+z_{\alpha/2}^{2}}\sqrt{\frac{n}{n-1}{\sum}_{j=1}^{n}\left(x_{ij}-\hat{\theta}_{i}\right)^{2}+\frac{z_{\alpha/2}^{2}}{4}}, $$

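where N i =n+n i , \(\hat {\theta }_{i}=\frac {1}{N_{i}}\sum _{j=1}^{N_{i}}x_{ij}\) and \(\tilde {\theta }_{i}=\left (\sum _{j=1}^{N_{i}}x_{ij}+0.5z_{\alpha /2}^{2}\right)/\left (N_{i}+z_{\alpha /2}^{2}\right)\) for i=1,2.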

(ii) The Agresti-Coull method

$$ l_{i}=\tilde{\theta}_{i}-z_{\alpha/2}\sqrt{\frac{\sum_{j=1}^{n}(x_{ij}-\hat{\theta}_{i})^{2}}{\left(N_{i}+z_{\alpha/2}^{2}\right)(n-1)}}, $$
$$ u_{i}=\tilde{\theta}_{i}+z_{\alpha/2}\sqrt{\frac{\sum_{j=1}^{n}(x_{ij}-\hat{\theta}_{i})^{2}}{\left(N_{i}+z_{\alpha/2}^{2}\right)(n-1)}}, $$

where N i =n+n i and \(\tilde {\theta }_{i}=\left (\sum _{j=1}^{N_{i}}x_{ij}+0.5z_{\alpha /2}^{2}\right)/\left (N_{i}+z_{\alpha /2}^{2}\right)\) for i=1,2.

To construct a CI for δ=μ 1−μ 2 via the above described hybrid method, we can simply set θ 1=μ 1 and θ 2=μ 2. If Σ is known, the estimated correlation coefficient \(\widehat {\text {corr}}(\hat {\mu }_{1}, \hat {\mu }_{2})\) of \(\hat {\mu }_{1}\) and \(\hat {\mu }_{2}\) is given by \(\widehat {\text {corr}}(\hat {\mu }_{1}, \hat {\mu }_{2})=2n\rho /\sqrt {(n+n_{1})(n+n_{2})}\). If Σ is unknown, \(\widehat {\text {corr}}(\hat {\mu }_{1}, \hat {\mu }_{2})\) is given by \(\widehat {\text {corr}}(\hat {\mu }_{1}, \hat {\mu }_{2})=nr/\left \{(n+n_{1})(n+n_{2})-n_{1}n_{2}r^{2}\right \}\), where \(r=m_{12}/\sqrt {m_{1}m_{2}}\), \(m_{1}=\sum _{j=1}^{n}\left (x_{1j}-\overline {x}_{1}^{(n)}\right)^{2}\) and \(m_{2}=\sum _{j=1}^{n}\left (x_{2j}-\overline {x}_{2}^{(n)}\right)^{2}.\) Thus, using Eqs. (1) and (2) yields CIs of δ=μ 1−μ 2. When l i and u i are estimated by the Wilson score method, we denote the corresponding CI as W s -CI; when l i and u i are estimated by the Agresti-Coull method, the corresponding CI is denoted as W a -CI.

Bootstrap-resampling-based confidence intervals

When the distribution of x is known, one can obtain the approximate CIs of δ based on the asymptotic distributions of the constructed test statistics under the null hypothesis H 0:δ=δ 0. However, when the distribution of x is unknown, the asymptotic distributions of the constructed test statistics may not be reliable, especially with small sample sizes. On the other hand, even if the approximate distribution is reliable, estimators of some nuisance parameters do not have closed-form solutions and must be obtained by iterative algorithms, which are computationally intensive. In this case, the Bootstrap method is often adopted to construct CIs of the parameter of interest. The Bootstrap CIs can be constructed via the following steps.

Step 1. Given the paired observations and incomplete observations

$${} D=\left(\!\! \begin{array}{lll} x_{11}, \cdots, x_{1n}, & x_{1,n+\!1}, \cdots, x_{1,n+n_{1}},& \\ x_{21}, \cdots, x_{2n}, & &\!x_{2,n+\!1}, \cdots, x_{2,n+n_{2}} \end{array} \!\!\right)$$

we draw n paired observations \(\left \{(x_{1m}^{*},x_{2m}^{*}): m=1, \cdots,n\right \}\) with replacement from the n paired observations {(x 11,x 21),⋯,(x 1n ,x 2n )}, draw n 1 observations \(\{x_{1,n+j}^{*}: j=1, \cdots, n_{1}\}\) with replacement from \(\left \{x_{1,n+1}, \cdots, x_{1,n+n_{1}}\right \}\), and draw n 2 observations \(\left \{x_{2,n+k}^{*}: k=1, \cdots, n_{2}\right \}\) with replacement from \(\left \{x_{2,n+1}, \cdots, x_{2,n+n_{2}}\right \}\). Thus, we obtain the following Bootstrap sample

$$D_{b}^{*}=\left( \begin{array}{lll} x_{11}^{*}, \cdots, x_{1n}^{*}, & x_{1,n+1}^{*}, \cdots, x_{1,n+n_{1}}^{*},& \\ x_{21}^{*}, \cdots, x_{2n}^{*}, & &x_{2,n+1}^{*}, \cdots, x_{2,n+n_{2}}^{*} \end{array} \right). $$

Step 2. For the above generated Bootstrap resampling sample \(D_{b}^{*}\), we first compute \(\hat {\mu }_{1}^{*}=(n+n_{1})^{-1}\sum _{j=1}^{n+n_{1}}x_{1j}^{*}\) and \(\hat {\mu }_{2}^{*}=(n+n_{2})^{-1}\sum _{j=1}^{n+n_{2}}x_{2j}^{*}\), and then calculate the estimated value \(\hat {\delta }^{*}\) of δ via \(\hat {\delta }^{*}=\hat {\mu }_{1}^{*}-\hat {\mu }_{2}^{*}\).

Step 3. Repeating the above steps 1 and 2 a total of G times yields G Bootstrap estimates \(\left \{\hat {\delta }_{g}^{*}: g=1,2,\cdots,G\right \}\) of δ. Let \(\hat \delta _{(1)}^{*}\leq \hat \delta _{(2)}^{*}\leq \cdots \leq \hat \delta _{(G)}^{*}\) be the ordered values of \(\left \{\hat \delta _{g}^{*}: g=1,2,\cdots,G\right \}\).

Step 4. Based on the Bootstrap estimates \(\left \{\hat {\delta }_{g}^{*}: g=1,2,\cdots,G\right \}\), Bootstrap-resampling-based CIs for δ can be constructed as follows.

Generally, the standard error se\((\hat {\delta })\) of \(\hat {\delta }\) can be estimated by the sample standard deviation of the G replications, i.e., \(\hat {\text {se}}(\hat {\delta })=\sqrt {(G-1)^{-1}\sum _{g=1}^{G}\left (\hat {\delta }_{g}^{*}-\bar {\delta }_{B}^{*}\right)^{2}}\), where \(\bar \delta _{B}^{*}=\left (\hat {\delta }_{1}^{*}+\cdots +\hat {\delta }_{G}^{*}\right)/G\). If \(\left \{\hat {\delta }_{g}^{*}: g=1,\cdots, G\right \}\) is approximately normally distributed, an approximate 100(1−α) % Bootstrap CI for δ is given by \(\left (\hat {\delta }-z_{\alpha /2}\hat {\text {se}}(\hat {\delta }), \hat {\delta }+z_{\alpha /2}\hat {\text {se}}(\hat \delta)\right)\), where z α/2 is the upper α/2-percentile of the standard normal distribution; this interval is referred to as the simple Bootstrap CI. When \(\hat {\delta }=a\overline {x}_{1}^{(n)}+(1-a)\overline {x}_{1}^{(n_{1})}-b\overline {x}_{2}^{(n)}-(1-b)\overline {x}_{2}^{(n_{2})}\), the corresponding simple Bootstrap CI is denoted as B 1. When \(\hat {\delta }=\bar {x}_{1}^{(n+n_{1})}-\bar {x}_{2}^{(n+n_{2})}\), the corresponding simple Bootstrap CI is denoted as B 2.

Alternatively, if \(\left \{\hat {\delta }_{g}^{*}: g=1,\cdots, G\right \}\) is not normally distributed, it follows from ([16] p.132) that the approximate 100(1−α) % Bootstrap-resampling-based percentile CI for δ is \(\left (\hat \delta _{\left ([G\alpha /2]\right)}^{*},\hat {\delta }_{([G(1-\alpha /2)])}^{*}\right)\), where [c] represents the integer part of c; this interval is referred to as the percentile Bootstrap CI. When \(\hat \delta =a\overline {x}_{1}^{(n)}+(1-a)\overline {x}_{1}^{(n_{1})}-b\overline {x}_{2}^{(n)}-\left (1-b\right)\overline {x}_{2}^{(n_{2})}\), the corresponding percentile Bootstrap CI is denoted as B 3. When \(\hat {\delta }=\bar {x}_{1}^{(n+n_{1})}-\bar {x}_{2}^{(n+n_{2})}\), the corresponding percentile Bootstrap CI is denoted as B 4.
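Steps 1 to 4 can be sketched as follows for the estimator based on the overall sample means, which corresponds to the intervals denoted B 2 (simple) and B 4 (percentile). The function name `bootstrap_cis`, its arguments and the fixed seed are hypothetical choices for illustration; the sketch assumes n 1≥1, n 2≥1 and G≥2, and the percentile indices use the integer part as in the text (so they can be affected by floating-point rounding).

```python
import random
from statistics import NormalDist, fmean, stdev


def bootstrap_cis(pairs, x1_extra, x2_extra, G=2000, alpha=0.05, seed=0):
    """Return (delta_hat, simple CI, percentile CI) for delta (sketch)."""
    rng = random.Random(seed)
    n, n1, n2 = len(pairs), len(x1_extra), len(x2_extra)

    def estimate(p, e1, e2):
        # Difference of the overall sample means of x1 and x2.
        x1 = [u for u, _ in p] + list(e1)
        x2 = [v for _, v in p] + list(e2)
        return fmean(x1) - fmean(x2)

    delta_hat = estimate(pairs, x1_extra, x2_extra)
    reps = []
    for _ in range(G):
        p = rng.choices(pairs, k=n)        # Step 1: pairs are resampled jointly
        e1 = rng.choices(x1_extra, k=n1)   # unpaired x1 values
        e2 = rng.choices(x2_extra, k=n2)   # unpaired x2 values
        reps.append(estimate(p, e1, e2))   # Step 2
    reps.sort()                            # Step 3: ordered Bootstrap estimates

    # Step 4a: simple Bootstrap CI from the Bootstrap standard error.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = stdev(reps)                       # uses the (G - 1) denominator
    simple = (delta_hat - z * se, delta_hat + z * se)

    # Step 4b: percentile Bootstrap CI from the ordered estimates.
    lo_idx = max(int(G * alpha / 2), 1)
    hi_idx = int(G * (1 - alpha / 2))
    percentile = (reps[lo_idx - 1], reps[hi_idx - 1])
    return delta_hat, simple, percentile


# Illustrative data only.
pairs = [(1.2, 0.9), (0.8, 1.1), (1.5, 1.0), (0.7, 0.4), (1.1, 1.3)]
print(bootstrap_cis(pairs, [1.0, 1.4], [0.6, 0.9], G=500, seed=42))
```

Resampling the paired observations as pairs, rather than resampling x 1 and x 2 separately, is what preserves the within-pair correlation in each Bootstrap sample.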

Results

Simulation studies

In this subsection, we investigate the finite-sample performance of various CIs in terms of empirical coverage probability (ECP), empirical confidence width (ECW), and distal and mesial non-coverage probabilities (DNP and MNP) in various parameter settings via Monte Carlo simulation studies. A summary of abbreviations for the various confidence intervals is presented in Table 2.

Table 2 Summary of various abbreviations

In the first simulation study, (n,n 1,n 2) is set to be (5,2,2); μ 1=0,1,2; μ 2=0.25,1,1.5; ρ=−0.9,−0.5,−0.1,0,0.1,0.5,0.9; δ=μ 1−μ 2=−0.25,0,0.5; \({\sigma _{2}^{2}}=4\); \({\sigma _{1}^{2}}=1,8\) and α=0.05. For a given combination (n,n 1,n 2,μ 1,μ 2,ρ,σ 1,σ 2), we generate n+n 1+n 2 random samples of (x 1,x 2) from a bivariate normal distribution with μ=(μ 1,μ 2) and

$$\Sigma=\left(\begin{array}{ll} {\sigma_{1}^{2}}& \rho\sigma_{1}\sigma_{2}\\ \rho\sigma_{1}\sigma_{2}& {\sigma_{2}^{2}} \end{array} \right). $$

Then, for the generated n+n 1+n 2 random samples, the n 1 observations on x 2 are deleted randomly. For the remaining n+n 2 paired random samples, the n 2 observations on x 1 are deleted randomly. Thus, (x 1m ,x 2m ) (m=1,⋯,n) are n paired observations on (x 1,x 2); x 1,n+j (j=1,⋯,n 1) are n 1 additional observations on x 1; x 2,n+k (k=1,⋯,n 2) are n 2 additional observations on x 2. Based on the observations {(x 1m ,x 2m ):m=1,⋯,n}, {x 1,n+j :j=1,⋯,n 1}, {x 2,n+k :k=1,⋯,n 2}, we draw 5000 Bootstrap samples. Independently repeating the above process M=10000 times, we can compute the corresponding ECP, ECW, MNP and DNP values. The ECP, ECW, MNP and DNP are defined by

$$ \begin{aligned} \text{ECP}&=\frac{1}{M}\sum\limits_{m=1}^{M}I\left\{\delta\in \left[L\left(\boldsymbol{x}^{(m)}\right),U\left(\boldsymbol{x}^{(m)}\right)\right]\right\},\\ \text{ECW}&=\frac{1}{M}\sum\limits_{m=1}^{M}\left[U\left(\boldsymbol{x}^{(m)}\right)-L\left(\boldsymbol{x}^{(m)}\right)\right], \end{aligned} \qquad (3) $$
$$ \begin{aligned} \text{MNP}&=\frac{1}{M}\sum\limits_{m=1}^{M}I\left\{\delta\in \left(-\infty, L\left(\boldsymbol{x}^{(m)}\right)\right)\right\},\\ \text{DNP}&=\frac{1}{M}\sum\limits_{m=1}^{M}I\left\{\delta\in \left(U\left(\boldsymbol{x}^{(m)}\right), +\infty\right)\right\}, \end{aligned} \qquad (4) $$

respectively, where \(I\{\delta \in \mathcal {A}\}\) is an indicator function, which is 1 if \(\delta \in \mathcal {A}\) and 0 otherwise. The ratio of the MNP to the non-coverage probability (NCP) is defined as

$$ \text{RNCP}=\frac{\text{MNP}}{\text{NCP}}=\frac{\text{MNP}}{1.0-\text{ECP}}. \qquad (5) $$
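The simulation design and the summary measures in Eqs. (3) to (5) can be sketched as below. Both function names are hypothetical. The data generator draws bivariate normal pairs through the standard construction x 2=μ 2+σ 2(ρz 1+√(1−ρ²)z 2) and then keeps only one component for randomly chosen units, which is one simple way to mimic the MCAR deletion described above. The summary function treats MNP as the proportion of intervals lying entirely above δ and DNP as the proportion lying entirely below δ.

```python
import random
from math import sqrt


def simulate_incomplete(n, n1, n2, mu1, mu2, sigma1, sigma2, rho, seed=None):
    """Generate n pairs, n1 unpaired x1 values and n2 unpaired x2 values (sketch)."""
    rng = random.Random(seed)
    full = []
    for _ in range(n + n1 + n2):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = mu1 + sigma1 * z1
        x2 = mu2 + sigma2 * (rho * z1 + sqrt(1 - rho ** 2) * z2)
        full.append((x1, x2))
    rng.shuffle(full)  # random choice of the units that lose a component
    pairs = full[:n]
    x1_extra = [u for u, _ in full[n:n + n1]]   # x2 deleted for these units
    x2_extra = [v for _, v in full[n + n1:]]    # x1 deleted for these units
    return pairs, x1_extra, x2_extra


def coverage_summary(intervals, delta):
    """Compute ECP, ECW, MNP, DNP and RNCP from M simulated intervals."""
    M = len(intervals)
    ecp = sum(1 for lo, hi in intervals if lo <= delta <= hi) / M
    ecw = sum(hi - lo for lo, hi in intervals) / M
    mnp = sum(1 for lo, hi in intervals if delta < lo) / M
    dnp = sum(1 for lo, hi in intervals if delta > hi) / M
    # RNCP is undefined when every interval covers delta.
    rncp = mnp / (1.0 - ecp) if ecp < 1.0 else float("nan")
    return {"ECP": ecp, "ECW": ecw, "MNP": mnp, "DNP": dnp, "RNCP": rncp}


# Illustrative intervals only.
print(coverage_summary([(-1, 1), (0.5, 2), (-2, -0.5), (-1, 1)], delta=0.0))
```

In a full study, each of the M replicates would call the generator once, build every CI of interest from the generated data, and pass the collected intervals to the summary function.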

Results are presented in Tables 3, 4 and 5. Also, to investigate the performance of the proposed CIs under the assumption \({\sigma _{1}^{2}}={\sigma _{2}^{2}}=\sigma ^{2}\), we calculate the corresponding results for T 3, T 4, T 5, the hybrid CIs and the Bootstrap-resampling-based CIs when \(\sigma ^{2}=4\) and (n,n 1,n 2)=(5,5,2), which are given in Tables 9, 10 and 11.

Table 3 ECPs of various confidence intervals under bivariate normal distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,2,2) and \({\sigma _{2}^{2}}=4\)
Table 4 ECW of various confidence intervals under bivariate normal distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,2,2) and \({\sigma _{2}^{2}}=4\)
Table 5 RNCP of various confidence intervals under bivariate normal distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,2,2) and \({\sigma _{2}^{2}}=4\)

Following [17, 30], an interval can be regarded as satisfactory if (i) its ECP is close to the pre-specified 95 % confidence level, (ii) it possesses a shorter interval width, and (iii) its RNCP lies in the interval [0.4,0.6]; it is too mesially located if its RNCP is less than 0.4, and too distally located if its RNCP is greater than 0.6.

In the second Monte Carlo simulation study, we assume that the random samples of bivariate variables x 1 and x 2 are generated from a bivariate t-distribution with five degrees of freedom, and mean μ and scale parameter Σ specified in the first simulation study. The corresponding results with (n,n 1,n 2)=(5,5,5) are given in Tables 6, 7 and 8. Similarly, we calculate the corresponding results for T 3, T 4, T 5, the hybrid CIs and the Bootstrap-resampling-based CIs when \(\sigma ^{2}=4\) and (n,n 1,n 2)=(5,5,2), which are given in Tables 9, 10 and 11.

Table 6 ECPs of various confidence intervals under bivariate t-distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,5,5) and \({\sigma _{2}^{2}}=4\)
Table 7 ECW of various confidence intervals under bivariate t-distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,5,5) and \({\sigma _{2}^{2}}=4\)
Table 8 RNCP of various confidence intervals under bivariate t-distribution with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,5,5) and \({\sigma _{2}^{2}}=4\)

To investigate the power of the proposed CIs, we calculated the power in both the first and second simulation studies. The results are shown in Tables 12 and 13. In both simulation studies, there is very little power to exclude a difference of zero.

Results of simulation studies

From Tables 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13, we have the following findings. First, when Σ is unknown, the CIs based on the Bootstrap-resampling-based methods, except for B 3, behave satisfactorily in the sense that their ECPs are close to the pre-specified confidence level of 95 % (e.g., see Tables 3 and 6); the CI based on the Bootstrap-resampling-based method B 1 generally yields shorter ECWs than the others (e.g., see Tables 4 and 7); the CIs corresponding to the bivariate t-distribution are generally wider than those corresponding to the bivariate normal distribution; and the ECWs decrease as the correlation coefficient ρ increases. Second, the RNCPs of all the considered CIs lie in the interval [0.4,0.6] (e.g., see Tables 5 and 8), which shows that our derived CIs generally demonstrate symmetry. Third, when \({\sigma _{1}^{2}}={\sigma _{2}^{2}}\), the CIs based on statistics T 3, T 4 and T 5 behave unsatisfactorily (e.g., see Tables 9 and 10) because their corresponding ECPs are almost always less than the pre-specified confidence level of 95 %. Fourth, the powers corresponding to W a and B 1 are larger than the others (e.g., see Tables 12 and 13). From the above findings, we recommend the Bootstrap-resampling-based CI B 1 because its coverage probability is generally close to the pre-chosen confidence level, it consistently yields the shortest interval width even when the sample size is small, its ratio of the MNP to the non-coverage probability usually lies in [0.4, 0.6], and its power is usually larger than that of the others.

Table 9 ECPs of various confidence intervals with different ρ and δ, μ 1, μ 2, (n,n 1,n 2)=(5,5,2), when \({\sigma _{1}^{2}}={\sigma _{2}^{2}}=4\)
Table 10 ECW of various confidence intervals with different ρ and δ, μ 1, μ 2, (n,n 1,n 2)=(5,5,2), when \({\sigma _{1}^{2}}={\sigma _{2}^{2}}=4\)
Table 11 RNCP of various confidence intervals with different ρ and δ, μ 1, μ 2, (n,n 1,n 2)=(5,5,2), when \({\sigma _{1}^{2}}={\sigma _{2}^{2}}=4\)
Table 12 Power of various confidence intervals with different ρ and δ, μ 1, μ 2, \({\sigma _{1}^{2}}\) and (n,n 1,n 2)=(5,2,2) and \({\sigma _{2}^{2}}=4\)
Table 13 Power of various confidence intervals with different ρ and δ, μ 1, μ 2, (n,n 1,n 2)=(5,5,2), when \({\sigma _{1}^{2}}={\sigma _{2}^{2}}=4\)

A worked example

In this subsection, the data introduced in Section “Background” on the action of two doses of formoterol solution aerosol are used to illustrate the proposed methodologies. In this example, we are interested in CI construction for the difference of the two mean FEV1 values for the two doses of formoterol solution aerosol. Under the previously given notation, we have n=7, n 1=9, n 2=8, \(\hat \delta =a\overline {x}_{1}^{(n)}+\left (1-a\right)\overline {x}_{1}^{(n_{1})}-b\overline {x}_{2}^{(n)}-(1-b)\overline {x}_{2}^{(n_{2})}=-0.0840\) (or \(\hat {\delta }=\sum _{j=1}^{n+n_{1}}x_{1j}/(n+n_{1})-\sum _{j=1}^{n+n_{2}}x_{2j}/(n+n_{2})=0.0228\)). Various 95 % CIs for δ under the assumption that Σ is unknown are presented in Table 14. Examination of Table 14 shows that all the derived CIs include zero, so there is no evidence at the 5 % level of a difference between the actions of the two doses of formoterol solution aerosol.

Table 14 Various 95 % confidence intervals for δ=μ 1−μ 2 based on formoterol solution aerosol

Discussion

Although testing equivalence of two correlated means with incomplete data has been studied, little work has been done on the corresponding interval estimators. To address this issue, this paper proposes various interval estimators of the difference of two correlated means for the cases of known and unknown Σ based on the large sample method, the hybrid method and the Bootstrap-resampling method. Extensive simulation studies are conducted to evaluate the finite-sample performance of the proposed CIs in terms of the empirical coverage probability, the empirical interval width and the ratio of the mesial non-coverage probability to the non-coverage probability (RNCP). Empirical results show that the Bootstrap-resampling-based CIs B 1, B 2 and B 4 behave satisfactorily for small to moderate sample sizes in the sense that their coverage probabilities are well controlled around the pre-specified nominal confidence level and their RNCPs almost always lie in the interval [0.4, 0.6]. However, confidence intervals based on the large sample method and the hybrid method behave unsatisfactorily for small sample sizes because the distributions of statistics T 1,⋯,T 5 are asymptotic, and these asymptotic distributions are appropriate only when N i →∞. When Σ is unknown, using the GEE method to estimate the variance is less efficient.

It would be interesting to investigate confidence interval construction for the difference of two means with incomplete correlated data under missing at random and non-ignorable missing data mechanism assumptions. We are working on these topics.

Conclusion

According to the aforementioned findings, we can draw the following conclusions. The Bootstrap-resampling-based CI B 1 is a desirable interval estimator for the difference of two means with incomplete correlated data.