
The Efficient Covariate-Adaptive Design for high-order balancing of quantitative and qualitative covariates


Abstract

In the context of sequential treatment comparisons, the acquisition of covariate information about the statistical units is crucial for the validity of the trial. Furthermore, balancing the treatment assignments with respect to the covariates is of primary importance, since potential imbalance of the covariate distributions between the groups can severely undermine the statistical analysis. For this reason, several covariate-adaptive randomization procedures have been suggested in the literature, but most of them apply only to categorical factors. In this paper we propose a new class of rules, called the Efficient Covariate-Adaptive Design, which is high-order balanced regardless of the number of factors and their nature (qualitative and/or quantitative), also accounting for covariate effects and interactions of every order. The suggested procedure performs very well and is flexible and simple to implement. The advantages of our proposal are also analyzed via simulations, and its finite-sample properties are compared with those of other well-known rules, including through the redesign of a real clinical trial.




References

  • Atkinson AC (1982) Optimum biased coin designs for sequential clinical trials with prognostic factors. Biometrika 69:61–67

  • Atkinson AC (2002) The comparison of designs for sequential clinical trials with covariate information. J R Stat Soc Ser A 165:349–373

  • Baldi Antognini A, Giovagnoli A (2015) Adaptive designs for sequential treatment allocation. Chapman & Hall/CRC, Hong Kong

  • Baldi Antognini A, Zagoraiou M (2011) The covariate-adaptive biased coin design for balancing clinical trials in the presence of prognostic factors. Biometrika 98:519–535

  • Baldi Antognini A, Zagoraiou M (2012) Multi-objective optimal designs in comparative clinical trials with covariates: the reinforced doubly-adaptive biased coin design. Ann Stat 40:1315–1345

  • Baldi Antognini A, Zagoraiou M (2017) Estimation accuracy under covariate-adaptive randomization procedures. Electron J Stat 11:1180–1206

  • Begg CB, Iglewicz B (1980) A treatment allocation procedure for sequential clinical trials. Biometrics 36:81–90

  • Ciolino J, Zhao W, Martin R et al (2011) Quantifying the cost in power of ignoring continuous covariate imbalances in clinical trial randomization. Contemp Clin Trials 32(2):250–259

  • Donny EC, Denlinger RL, Tidey JW et al (2015) Randomized trial of reduced-nicotine standards for cigarettes. N Engl J Med 373(14):1340–1349

  • Efron B (1971) Forcing a sequential experiment to be balanced. Biometrika 58:403–417

  • Heritier S, Gebski V, Pillai A (2005) Dynamic balancing randomization in controlled clinical trials. Stat Med 24:3729–3741

  • Hu Y, Hu F (2012) Asymptotic properties of covariate-adaptive randomization. Ann Stat 40:1794–1815

  • Lauzon SD, Ramakrishnan V, Nietert PJ et al (2020) Statistical properties of minimal sufficient balance and minimization as methods for controlling baseline covariate imbalance at the design stage of sequential clinical trials. Stat Med 39:2506–2517

  • Ma W, Hu F, Zhang L (2015) Testing hypotheses of covariate-adaptive randomized clinical trials. J Am Stat Assoc 110(510):669–680

  • Ma Z, Hu F (2013) Balancing continuous covariates based on kernel densities. Contemp Clin Trials 34(2):262–269

  • Meyn SP, Tweedie RL (1992) Stability of Markovian processes I: criteria for discrete-time chains. Adv Appl Probab 24(3):542–574

  • Morgan KL, Rubin DB (2012) Rerandomization to improve covariate balance in experiments. Ann Stat 40(2):1263–1282

  • Pocock SJ, Simon R (1975) Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31:103–115

  • Rosenberger WF, Lachin JM (2002) Randomization in clinical trials: theory and practice. Wiley, New York

  • Shao J, Yu X, Zhong B (2010) A theory for testing hypotheses under covariate-adaptive randomization. Biometrika 97:347–360

  • Smith RL (1984) Properties of biased coin designs in sequential clinical trials. Ann Stat 12:1018–1034

  • Smith RL (1984) Sequential treatment allocation using biased coin designs. J R Stat Soc Ser B 46:519–543

  • Taves DR (1974) Minimization: a new method of assigning patients to treatment and control groups. Clin Pharmacol Ther 15:443–453

  • Wei LJ (1978) The adaptive biased coin design for sequential experiments. Ann Stat 6:92–100

  • Weir CJ, Lees KR (2003) Comparison of stratification and adaptive methods for treatment allocation in an acute stroke clinical trial. Stat Med 22(5):705–726

  • Zhou Q, Ernst PA, Morgan KL et al (2018) Sequential rerandomization. Biometrika 105(3):745–752


Author information

Corresponding author

Correspondence to Alessandro Baldi Antognini.


Appendix A Proofs

A.1 Optimality of balance for estimating/testing the treatment difference \(\mu _A-\mu _B\)

For simplicity of notation, in this Appendix we omit the subscript n. After n allocations, let \(\hat{\varvec{\gamma }}\) be the LSE of \(\varvec{\gamma }=(\mu _A,\mu _B,\varvec{\beta })^{t}\) and \(a^t=(1;-1;{\varvec{0}}_q^t)\); then \(\text {var}(a^t\hat{\varvec{\gamma }})=\sigma ^2a^t(n{\mathbb {M}})^{-1}a\), where

$$\begin{aligned} {\mathbb {M}}=\frac{1}{n}\left( \begin{array}{ccc} n_A & 0 & \varvec{\delta }^{t}{\mathbb {F}} \\ 0 & n_B & ({\varvec{1}}-\varvec{\delta })^{t}{\mathbb {F}} \\ {\mathbb {F}}^{t}\varvec{\delta } & {\mathbb {F}}^{t}({\varvec{1}}-\varvec{\delta }) & {\mathbb {F}}^{t}{\mathbb {F}} \end{array} \right) . \end{aligned}$$

Let \({\textbf{x}}^t={\textbf{1}}^t{\mathbb {F}}\) and \({\textbf{y}}^t=(2\varvec{\delta }-{\varvec{1}})^{t}{\mathbb {F}}\); then \(n{\mathbb {M}}= \left( \begin{array}{c|c} \text {diag}\left( n_A,n_B\right) & {\mathbb {D}}^t\\ \hline {\mathbb {D}} & n{\mathbb {A}} \end{array}\right) \), with \(\mathbb {D}=2^{-1}\left( {\textbf{x}}+{\textbf{y}}\mid {\textbf{x}}-{\textbf{y}}\right) \). Letting \(\mathbb {T}=\text {diag}\left( n_A,n_B\right) -{\mathbb {D}}^t (n{\mathbb {A}})^{-1} {\mathbb {D}}\) (the Schur complement of \(n{\mathbb {A}}\) in \(n{\mathbb {M}}\)), we have \(a^t(n{\mathbb {M}})^{-1}a=(1,-1)\mathbb {T}^{-1}\left( \begin{array}{c} 1\\ -1 \end{array}\right) \), where

$$\begin{aligned} \mathbb {T}=\left( \begin{array}{c|c} n_A-(4n)^{-1}({\textbf{x}}+{\textbf{y}})^t{\mathbb {A}}^{-1}({\textbf{x}}+{\textbf{y}}) & -(4n)^{-1}({\textbf{x}}+{\textbf{y}})^t{\mathbb {A}}^{-1}({\textbf{x}}-{\textbf{y}})\\ \hline -(4n)^{-1}({\textbf{x}}-{\textbf{y}})^t{\mathbb {A}}^{-1}({\textbf{x}}+{\textbf{y}}) & n_B-(4n)^{-1}({\textbf{x}}-{\textbf{y}})^t{\mathbb {A}}^{-1}({\textbf{x}}-{\textbf{y}}) \end{array}\right) . \end{aligned}$$

After some algebra, \(\det \mathbb {T}=(n-{\textbf{x}}^t(n{\mathbb {A}})^{-1}{\textbf{x}})(n-\ell )/4\), so that

$$\begin{aligned} a^t(n{\mathbb {M}})^{-1}a=\frac{n-{\textbf{x}}^t(n{\mathbb {A}})^{-1}{\textbf{x}}}{(n-{\textbf{x}}^t(n{\mathbb {A}})^{-1}{\textbf{x}})(n-\ell )/4}= \frac{4}{n-\ell }= \frac{4}{n} \left( 1 - \frac{\ell }{n}\right) ^{-1}. \end{aligned}$$
(A.1)

Turning to hypothesis testing, under well-known regularity conditions \(\sqrt{n}(\hat{\varvec{\gamma }}-\varvec{\gamma })\overset{d}{\longrightarrow } \text {N}({\varvec{0}}_{q+2}, \sigma ^{2} {\mathbb {M}}^{-1})\), so that \(\sqrt{n}a^t(\hat{\varvec{\gamma }}-\varvec{\gamma })\overset{d}{\longrightarrow } \text {N}(0, \sigma ^{2}\Vert a\Vert _{{\mathbb {M}}^{-1}}^2)\). Assuming \(\sigma ^2\) known, the classical Wald statistic is \(W=n\hat{\varvec{\gamma }}^t a[\sigma ^2\Vert a\Vert _{{\mathbb {M}}^{-1}}^2]^{-1}a^t\hat{\varvec{\gamma }}=n({\hat{\mu }} _{A}-{\hat{\mu }} _{B})^2[\sigma \Vert a\Vert _{{\mathbb {M}}^{-1}}]^{-2}\). Under \(H_0\), \(W \overset{d}{\longrightarrow }\ \chi _{1}^2\), namely it converges to a (central) \(\chi ^2\) with 1 degree of freedom (dof), whereas under the alternative W converges to a non-central \(\chi ^2_1\) with non-centrality parameter \(n(\mu _{A}-\mu _{B})^2[\sigma \Vert a\Vert _{{\mathbb {M}}^{-1}}]^{-2}\). From (A.1), \(a^t{\mathbb {M}}^{-1}a=4 \left( 1- \ell /n\right) ^{-1}\), so the non-centrality parameter equals \((2\sigma )^{-2} n(\mu _{A}-\mu _{B})^2(1 - \ell /n)\). For fixed dof, the non-central \(\chi ^2\) is stochastically increasing in its non-centrality parameter; thus, for every sample size, the power is an increasing function of the non-centrality parameter and is maximized when \(\ell =0\).
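
To make the effect of the loss on inference concrete, the following minimal numerical sketch (not taken from the paper; the sample size, effect size, \(\sigma \) and significance level are arbitrary illustrative choices) evaluates the power of the Wald test from the non-centrality parameter \((2\sigma )^{-2} n(\mu _{A}-\mu _{B})^2(1 - \ell /n)\) for increasing values of \(\ell \).

```python
# Illustrative sketch (not from the paper): power of the Wald test for
# H0: mu_A = mu_B as a function of the loss ell, using (A.1).
# n, sigma, the treatment difference and alpha are hypothetical values.
from scipy.stats import chi2, ncx2

n, sigma, diff, alpha = 200, 1.0, 0.3, 0.05
crit = chi2.ppf(1 - alpha, df=1)            # critical value of the chi^2_1 test

for ell in [0, 5, 10, 20, 40]:
    nc = n * diff**2 * (1 - ell / n) / (4 * sigma**2)   # non-centrality parameter
    power = ncx2.sf(crit, df=1, nc=nc)                  # P(reject H0 | alternative)
    print(f"ell = {ell:2d}   non-centrality = {nc:5.2f}   power = {power:.3f}")
```

As expected, the power decreases monotonically in \(\ell \) and is maximal under perfect balance (\(\ell =0\)).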

A.2 Proof of Theorem 1

Following Theorem 4.5 of Meyn and Tweedie (1992), a Markov chain \(\{{\textbf{X}}_n\}_{n\in \mathbb {N}}\) on a general state-space \(\mathbb {X}\) is bounded in probability if i) \({\textbf{X}}_n\) is a T-chain and ii) \({\textbf{X}}_n\) satisfies a positive drift condition, namely there exists a norm-like function \(V:\mathbb {X} \longrightarrow {\mathbb {R}}^+\) such that, for some \(\varepsilon >0\) and a compact set \({\mathcal {C}}\in {\mathcal {B}}(\mathbb {X})\) (where \({\mathcal {B}}(\mathbb {X})\) is the Borel sigma algebra), we have

$$\begin{aligned} \begin{aligned} (D)&\quad \Delta V({\textbf{X}}_n):={\mathbb {E}} [V({\textbf{X}}_{n+1}) \mid {\textbf{X}}_n]-V({\textbf{X}}_n)\le -\varepsilon , \qquad {\textbf{X}}_n \in {\mathcal {C}}^c.\\ \end{aligned} \end{aligned}$$

Since a Markov chain on a countable state space is always a T-chain (Meyn and Tweedie 1992, p. 548), in what follows we just need to show that condition (D) is satisfied by \({\textbf{D}}_n\) and \({\textbf{b}}_n\). Let \({\mathbb {C}}={\mathbb {H}}^t{\mathbb {W}}{\mathbb {H}}\); then \({\mathbb {C}}=(c_{ij})_{i,j=1,\ldots ,s}\) is symmetric and positive-definite, since \({\mathbb {H}}\) is a full (row) rank matrix. Moreover, letting \(\tilde{{\textbf{D}}}_n={\mathbb {H}}^t{\mathbb {W}}{\mathbb {H}} {\textbf{D}}_n\), the linear transformation \(\tilde{{\textbf{D}}}_{n}={\mathbb {C}}{\textbf{D}}_n\) is an isomorphism; therefore the behavior of the Markov chain \(\{{\textbf{D}}_n\}_{n \in \mathbb {N}}\) is equivalent to that of \(\{\tilde{{\textbf{D}}}_n\}_{n \in \mathbb {N}}\) (although defined on a suitably transformed space), and their roles in the proof can be naturally exchanged. We show that condition (D) holds by setting \(V({\textbf{D}}_n)={\textbf{D}}_n^t{\mathbb {C}}{\textbf{D}}_n\) and \({\mathcal {C}}=\{{\textbf{D}}_n: \Vert {\mathbb {C}} {\textbf{D}}_n\Vert _{\infty } \le \kappa \}\), where \(\kappa >0\). Note that \(\Vert {\mathbb {C}} {\textbf{D}}_n\Vert _{\infty }\) is still a norm of the vector \({\textbf{D}}_n\), since \({\mathbb {C}}\) is invertible (namely, the corresponding linear transformation is injective). Due to the isomorphism, the compact set can be equivalently expressed as \({\mathcal {C}}=\{\tilde{{\textbf{D}}}_n: \Vert \tilde{{\textbf{D}}}_n\Vert _{\infty }\le \kappa \}\), so \({\mathcal {C}}^c=\{\tilde{{\textbf{D}}}_n: \max _{j=1,\ldots ,s} \vert \tilde{{D}}_{nj}\vert > \kappa \}\). From (6),

$$\begin{aligned} \Delta V({\textbf{D}}_n)=\sum _{j=1}^s {\mathbb {E}} [V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n) \mid {\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j] \Pr ({\textbf{x}}_{n+1}={\textbf{e}}_j), \end{aligned}$$
(A.2)

where

$$\begin{aligned} \begin{aligned}&{\mathbb {E}} [V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n)\mid {\textbf{D}}_n , {\textbf{x}}_{n+1}={\textbf{e}}_j]=\\&\Pr (\delta _{n+1}=1\mid {\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j)\, {\mathbb {E}}[V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n)\mid {\textbf{D}}_n,{\textbf{x}}_{n+1}={\textbf{e}}_j, \delta _{n+1}=1]+\\&\Pr (\delta _{n+1}=0\mid {\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j) \,{\mathbb {E}}[V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n)\mid {\textbf{D}}_n,{\textbf{x}}_{n+1}={\textbf{e}}_j, \delta _{n+1}=0].\\ \end{aligned}\nonumber \\ \end{aligned}$$
(A.3)

Moreover,

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n)\mid&{\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j, \delta _{n+1}=1]=\\ =&({\textbf{D}}_n+{\textbf{e}}_j)^t {\mathbb {C}}({\textbf{D}}_n+{\textbf{e}}_j)-{\textbf{D}}_n^t {\mathbb {C}} {\textbf{D}}_n=c_{jj}+2{\textbf{e}}_j^t{\mathbb {C}}{\textbf{D}}_n \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[V({\textbf{D}}_{n+1}) -V({\textbf{D}}_n)\mid&{\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j, \delta _{n+1}=0]=\\ =&({\textbf{D}}_n-{\textbf{e}}_j)^t {\mathbb {C}}({\textbf{D}}_n-{\textbf{e}}_j)-{\textbf{D}}_n^t {\mathbb {C}} {\textbf{D}}_n=c_{jj}-2{\textbf{e}}_j^t{\mathbb {C}}{\textbf{D}}_n, \end{aligned} \end{aligned}$$

where \(c_{jj}={\textbf{e}}_j^t{\mathbb {C}}{\textbf{e}}_j>0\). Thus, recalling that \(\tilde{{\textbf{D}}}_n={\mathbb {C}}{\textbf{D}}_n\), Eq. (A.3) becomes

$$\begin{aligned} \begin{aligned} {\mathbb {E}} [V({\textbf{D}}_{n+1})&-V({\textbf{D}}_n)\mid {\textbf{D}}_n, {\textbf{x}}_{n+1}={\textbf{e}}_j]= h({\textbf{e}}_j^t\tilde{{\textbf{D}}}_n)[c_{jj}+2{\textbf{e}}_j^t\tilde{{\textbf{D}}}_n]+\\ +&[1-h({\textbf{e}}_j^t\tilde{{\textbf{D}}}_n)] [c_{jj}-2{\textbf{e}}_j^t\tilde{{\textbf{D}}}_n]=4\tilde{{D}}_{nj}\left[ h(\tilde{{D}}_{nj})-\frac{1}{2}\right] +c_{jj}. \end{aligned} \end{aligned}$$
(A.4)

Therefore, from (A.2) and (A.4), \(\Delta V({\textbf{D}}_n)=4\sum _{j=1}^s\tilde{{D}}_{nj}[h(\tilde{{D}}_{nj})-1/2]p_j+\sum _{j=1}^s c_{jj} p_j\), where the first term is always non-positive, since \(h(\tilde{{D}}_{nj})\le 1/2\) when \(\tilde{{D}}_{nj}\ge 0\) and \(h(\tilde{{D}}_{nj})>1/2\) when \(\tilde{{D}}_{nj}<0\). Condition (D) is equivalent to

$$\begin{aligned} \sum _{j=1}^s \vert \tilde{{D}}_{nj} \vert [ h(-\vert \tilde{{D}}_{nj} \vert )- 1/2] p_j\ge (\varepsilon + \sum _{j=1}^s c_ {jj} p_j)/4. \end{aligned}$$
(A.5)

By the definition of \({\mathcal {C}}^c\), there exists at least one stratum \({\tilde{j}}\) such that \(\vert \tilde{{D}}_{n{\tilde{j}}}\vert > \kappa \) and therefore \(\sum _{j=1}^s \vert \tilde{{D}}_{nj} \vert [ h(-\vert \tilde{{D}}_{nj} \vert )- 1/2] p_j\ge \kappa [h(-\vert \kappa \vert )-1/2]p_{{\tilde{j}}}\), which is an increasing function of \(\kappa \) since \(h(-\vert \kappa \vert )>1/2\) and \(p_{{\tilde{j}}}>0\). Thus, for every \({\textbf{D}}_n \in {\mathcal {C}}^c\) condition (D) is verified, since the RHS of (A.5) is bounded.

In addition, since \({\textbf{b}}_n={\mathbb {H}}{\textbf{D}}_n\), we have \({\textbf{b}}_n=O_p(1)\) as a bounded linear combination of \({\textbf{D}}_n\); thus, from (2) and (4), \(\ell _n=o_p(1)\), since \(\Vert {\textbf{b}}_n \Vert _{{\mathbb {P}}_n^{-1}}^2\) is asymptotically equivalent to \(\Vert {\textbf{b}}_n \Vert _{{\mathbb {P}}^{-1}}^2=O_p(1)\) (recalling that \({\mathbb {P}}_n-{\mathbb {P}}=o_{a.s.}(1)\)). At the same time, since \({\textbf{b}}_n=O_p(1)\), we get \(n_A\bar{{\textbf{f}}}^A_{n} -n_B\bar{{\textbf{f}}}^B_{n}=O_p(1)\) and \(n^{-1}D_n=o_p(1)\); letting \(\pi _n=n_A/n\), it follows that \(\pi _n-1/2=o_p(1)\) and \(\Vert n_A\bar{{\textbf{f}}}^A_{n} -n_B\bar{{\textbf{f}}}^B_{n}\Vert ^2_{{\mathbb {A}}_n^{-1}}=O_p(1)\), so that \(n^{-1}\Vert n_A\bar{{\textbf{f}}}^A_{n} -n_B\bar{{\textbf{f}}}^B_{n}\Vert ^2_{{\mathbb {A}}_n^{-1}}=n \Vert \pi _n\bar{{\textbf{f}}}^A_{n} -(1-\pi _n)\bar{{\textbf{f}}}^B_{n}\Vert ^2_{{\mathbb {A}}_n^{-1}}=o_p(1)\). Since \(\pi _n - 1/2=o_p(1)\) and \(\varvec{{\mathbb {A}}}_n-\varvec{{\mathbb {A}}}=o_{a.s.}(1)\), \(n \Vert \pi _n\bar{{\textbf{f}}}^A_{n} -(1-\pi _n)\bar{{\textbf{f}}}^B_{n}\Vert ^2_{{\mathbb {A}}_n^{-1}}\) is asymptotically equivalent to \(4^{-1}n \Vert \bar{{\textbf{f}}}^A_{n} -\bar{{\textbf{f}}}^B_{n}\Vert ^2_{{\mathbb {A}}^{-1}}=o_p(1)\), namely \({\mathcal {M}}_n=o_p(1)\). Thus, under the ECADE, \(\ell _n\) is asymptotically equivalent to \({\mathcal {M}}_n\). By the same arguments, \(\ell _n\) and \({\mathcal {M}}_n\) are asymptotically equivalent for every CA rule under which \(\pi _n-1/2=o_p(1)\).
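
The boundedness in probability established above can also be visualized by simulation. The sketch below is purely illustrative and does not reproduce the ECADE's actual specification: it assumes \(s\) equiprobable strata, takes \({\mathbb {C}}\) equal to the identity (so that \(\tilde{{\textbf{D}}}_n={\textbf{D}}_n\)) and uses a simple Efron-type allocation function \(h\), bounded away from 0 and 1 and satisfying \(h(x)+h(-x)=1\).

```python
# Illustrative simulation (assumptions: equiprobable strata, C = identity,
# Efron-type h; these are not the paper's recommended choices) of the
# within-stratum imbalance chain D_n.
import numpy as np

rng = np.random.default_rng(0)

def h(x, p=2/3):
    """Efron-type allocation function: favour the under-represented treatment."""
    return p if x < 0 else (1 - p if x > 0 else 0.5)

s, n = 8, 5000                       # number of strata and of subjects
D = np.zeros(s)                      # D_nj = (# allocations to A) - (# to B) in stratum j
trace = []

for _ in range(n):
    j = rng.integers(s)              # stratum of the incoming subject
    delta = rng.random() < h(D[j])   # delta = 1 (treatment A) with probability h(D_nj)
    D[j] += 1 if delta else -1
    trace.append(np.abs(D).max())

print("max_j |D_nj| after 1000, 2500, 5000 subjects:",
      trace[999], trace[2499], trace[4999])
# The maximum imbalance stays of the same (small) order as n grows, in line
# with D_n = O_p(1); under complete randomization it would grow like sqrt(n).
```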

A.3 Proof of Theorem 2

Let \({\mathbb {C}}_n={\mathbb {H}}^t{\mathbb {W}}_n {\mathbb {H}}\); since \({\mathbb {W}}_n\longrightarrow {\mathbb {W}}\) a.s., \({\mathbb {C}}_n \longrightarrow {\mathbb {C}}={\mathbb {H}}^t{\mathbb {W}}{\mathbb {H}}\) a.s. by Slutsky’s theorem. Consider the function \(W({\textbf{D}}_n)={\textbf{D}}_n^t {\mathbb {C}}_{n} {\textbf{D}}_n=U({\textbf{D}}_n)+V({\textbf{D}}_n)\), where \(U({\textbf{D}}_n)={\textbf{D}}_n^t ({\mathbb {C}}_{n}-{\mathbb {C}}) {\textbf{D}}_n\), while \(V(\cdot )\) and the compact set \({\mathcal {C}}\) are the same as in the proof of Theorem 1 in Appendix A.2. Thus we show that condition (D) is satisfied by \(\Delta W({\textbf{D}}_n)= {\mathbb {E}} [W({\textbf{D}}_{n+1}) -W({\textbf{D}}_{n})\mid {\textbf{D}}_n]=\Delta U({\textbf{D}}_n) +\Delta V({\textbf{D}}_n)\). Since \({\mathbb {C}}_{n}-{\mathbb {C}}=o_{a.s.}(1)\), \(\Delta U({\textbf{D}}_n)={\mathbb {E}}[{\textbf{D}}_{n+1}^t ({\mathbb {C}}_{n+1}-{\mathbb {C}}) {\textbf{D}}_{n+1}-{\textbf{D}}_{n}^t ({\mathbb {C}}_{n}-{\mathbb {C}}) {\textbf{D}}_{n}\mid {\textbf{D}}_{n}]\) becomes negligible for sufficiently large n. Moreover, from Appendix A.2, \(\Delta V({\textbf{D}}_n)\) satisfies condition (D); thus the Markov chain induced by \({\mathbb {W}}_n\) is asymptotically equivalent to the one corresponding to \({\mathbb {W}}\), and therefore \({\textbf{D}}_{n}=O_p(1)\).

A.4 Proof of Theorem 3

Under the ECADE with quantitative factors, \({\textbf{b}}_{n+1}={\textbf{b}}_n+(2\delta _{n+1}-1) (1;{\textbf{f}}({\varvec{Z}}_{n+1})^t)^t\) a.s. for every n (with \({\textbf{b}}_0={\textbf{0}}_{q+1}\)), and therefore \(\{{\textbf{b}}_n\}_{n \in {\mathbb {N}}}\) is a Markov chain on \(\mathbb {X}=\mathbb {Z}\times \mathbb {R}^q\) with one-step transition kernel

$$\begin{aligned} \begin{aligned} P({\textbf{x}}, A)&=\Pr ({\textbf{b}}_{n+1}\in A \mid {\textbf{b}}_n={\textbf{x}})\\&=\int \Pr ({\textbf{b}}_{n+1}\in A \mid {\textbf{b}}_n={\textbf{x}},{\textbf{Z}}_{n+1}={\varvec{z}} ) {\mathcal {L}}({\varvec{z}}) d{\varvec{z}}\\&= \int \Big \{ h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{x}}\right) I\{{\textbf{x}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in A \} + \\&\quad \quad + \left[ 1- h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{x}}\right) \right] I\{{\textbf{x}}-(1;{\textbf{f}}({\varvec{z}})^t)^t \in A\}\Big \} {\mathcal {L}}({\varvec{z}}) d{\varvec{z}}. \end{aligned} \end{aligned}$$
(A.6)

Following Theorem 4.5 of Meyn and Tweedie (1992), we first show that \(\{{\textbf{b}}_{n}\}_{n \in \mathbb {N}}\) is a T-chain and then we prove that condition (D) is satisfied. As regards the T-chain property, we need to show that there exist a sampling distribution \(\alpha \) and a substochastic transition kernel \(T({\textbf{x}},\cdot )\) such that for any \(A \in {\mathcal {B}}(\mathbb {X})\), \(K_\alpha ({\textbf{x}},A)= \sum _{i=1}^\infty P^i({\textbf{x}},A)\alpha (i) \ge T({\textbf{x}},A)\), where \(T(\cdot , A)\) is a lower semicontinuous (LSC) function with \(T({\textbf{x}}, \mathbb {X})>0\) for all \({\textbf{x}} \in \mathbb {X}\). Taking \(\alpha (1)=1\) and 0 otherwise, \(K_\alpha ({\textbf{x}},A)=P({\textbf{x}}, A)\); if we set \(T({\textbf{x}},A)=e\int I\{{\textbf{x}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in A \} {\mathcal {L}}({\varvec{z}}) d{\varvec{z}}\), then \(P({\textbf{x}}, A) \ge T({\textbf{x}},A)\), recalling that \(h(x)\ge e\) for any \(x \in \mathbb {R}\). Since the indicator function of any open set is LSC, \(T({\textbf{x}}, A)\) is always LSC. Indeed, if A is an open subset then

$$\begin{aligned} \begin{aligned} \lim _{{\textbf{y}} \rightarrow {\textbf{x}}} \inf T({\textbf{y}}, A)&=e \lim _{{\textbf{y}} \rightarrow {\textbf{x}}} \inf \int I\{{\textbf{y}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in A \}{\mathcal {L}}({\varvec{z}}) d{\varvec{z}} \\&\ge e \int \lim _{{\textbf{y}} \rightarrow {\textbf{x}}} \inf I\{{\textbf{y}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in A \} {\mathcal {L}}({\varvec{z}}) d{\varvec{z}} \\&\ge e \int I\{{\textbf{x}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in A \} {\mathcal {L}}({\varvec{z}}) d{\varvec{z}}=T({\textbf{x}}, A). \end{aligned} \end{aligned}$$
(A.7)

Moreover, (A.7) holds even if A is not open; indeed, \(T({\textbf{x}},A)\) does not change since the boundary of A has zero Lebesgue measure. Finally, notice that \(T({\textbf{x}},\mathbb {X})>0\) since \({\textbf{x}}+(1;{\textbf{f}}({\varvec{z}})^t)^t \in \mathbb {Z}\times \mathbb {R}^q\) a.s. Thus, \(\{{\textbf{b}}_{n}\}_{n \in \mathbb {N}}\) is a T-chain.

We now show that condition (D) is satisfied by choosing \(V({\textbf{b}}_n)={\textbf{b}}_n^t{\mathbb {W}}{\textbf{b}}_n\). The one-step drift is

$$\begin{aligned} \begin{aligned} \Delta V({\textbf{b}}_n)&={\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n]\\&={\mathbb {E}} \left[ {\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}})\mid {\textbf{b}}_n={\textbf{b}}, {\textbf{Z}}_{n+1}={\varvec{z}} ]\right] \\&=\int {\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}}]{\mathcal {L}}({\varvec{z}}) d{\varvec{z}}, \end{aligned} \end{aligned}$$

where the inner expectation is

$$\begin{aligned} \begin{aligned}&{\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}}]\\&\quad ={\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}},\delta _{n+1}=1] h((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n)\\&\qquad +{\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}},\delta _{n+1}=0] [1-h((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n)]. \end{aligned} \end{aligned}$$

Since \({\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n, {\textbf{Z}}_{n+1}= {\varvec{z}},\delta _{n+1}=1]=2 (1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}{\textbf{b}}_n+(1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}(1;{\textbf{f}}({\varvec{z}} )^t)^t\) and \({\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}},\delta _{n+1}=0]=-2 (1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}{\textbf{b}}_n+(1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}(1;{\textbf{f}}({\varvec{z}})^t)^t\), it follows that \({\mathbb {E}} [V({\textbf{b}}_{n+1}) -V({\textbf{b}}_n)\mid {\textbf{b}}_n,{\textbf{Z}}_{n+1}={\varvec{z}}]= 4(1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\left[ h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\right) -\frac{1}{2}\right] +(1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}(1;{\textbf{f}}({\varvec{z}} )^t)^t\), so that

$$\begin{aligned} \begin{aligned} \Delta V({\textbf{b}}_n)&=4\int (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n \left[ h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\right) -\frac{1}{2}\right] {\mathcal {L}}({\varvec{z}})d{\varvec{z}}+\\&+\int (1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}(1;{\textbf{f}}({\varvec{z}})^t)^t {\mathcal {L}}({\varvec{z}})d{\varvec{z}}. \end{aligned} \end{aligned}$$

Notice that \((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n \left[ h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\right) -1/2\right] \le 0\), since \((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n \left[ h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\right) -1/2\right] =0\) if and only if \((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n=0\) and it is negative otherwise. To verify condition (D), we need to show that, for a compact set \({\mathcal {C}}\), \(\Delta V({\textbf{b}}_n)\le -\varepsilon \), namely

$$\begin{aligned} \begin{aligned} \int \vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert&\left[ h\left( -\vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert \right) -\frac{1}{2}\right] {\mathcal {L}}({\varvec{z}})d{\varvec{z}} \ge \\&\frac{\varepsilon + \int (1;{\textbf{f}}({\varvec{z}} )^t){\mathbb {W}}(1;{\textbf{f}}({\varvec{z}})^t)^t {\mathcal {L}}({\varvec{z}})d{\varvec{z}}}{4}, \quad \text {on} \,\,\, {\mathcal {C}}^c. \end{aligned} \end{aligned}$$
(A.8)

Let \({\mathcal {Z}}^*=\{ {\varvec{z}} : \vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert >0\}\subset \mathbb {R}^p\); then \(\Pr \left( {\varvec{z}}\in {\mathcal {Z}}^*\right) >0\) and the LHS of (A.8) becomes

$$\begin{aligned} \int _{{\mathcal {Z}}^*} \vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert \left[ h\left( -\vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert \right) -\frac{1}{2}\right] {\mathcal {L}}({\varvec{z}})d{\varvec{z}}. \end{aligned}$$

Let \({\mathcal {C}}=\{{\textbf{b}}_n : \max _{{\varvec{z}}\in {\mathcal {Z}}^*}\vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n \vert \le \kappa \}\) be the compact set (in \({\mathcal {Z}}^*\) the linear transformation \((1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\) is injective and corresponds to an induced norm of \({\textbf{b}}_n\)); then, for every \({\textbf{b}}_n\in {\mathcal {C}}^c\),

$$\begin{aligned} \int _{{\mathcal {Z}}^*} \vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert \left[ h\left( -\vert (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\vert \right) -\frac{1}{2}\right] {\mathcal {L}}({\varvec{z}})d{\varvec{z}} >\kappa \left[ h(-\vert \kappa \vert )-\frac{1}{2} \right] \Pr \left( {\varvec{z}}\in {\mathcal {Z}}^*\right) ; \end{aligned}$$

so condition (D) is verified since \(\kappa \left[ h(-\vert \kappa \vert )-\frac{1}{2} \right] \Pr {\left( {\varvec{z}}\in {\mathcal {Z}}^*\right) }\) increases in \(\kappa \), while the RHS of (A.8) is bounded. Finally, the last statement follows from Theorem 2.

Under the mixed scenario, through the usual factorization of the joint distribution of mixed random variables, \({\mathcal {L}}({\varvec{z}})\) in (A.6) is replaced by the product of the joint probability distribution of the \(p_1\) qualitative covariates and the conditional density function of the remaining \(p_2\) factors. Thus, the T-chain property is preserved and, under the same choice of the function V and the compact set \({\mathcal {C}}\), the positive drift condition is also satisfied, so \(\{{\textbf{b}}_n\}_{n \in \mathbb {N}}\) is bounded in probability.
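
For intuition, the following sketch simulates the recursion \({\textbf{b}}_{n+1}={\textbf{b}}_n+(2\delta _{n+1}-1)(1;{\textbf{f}}({\varvec{Z}}_{n+1})^t)^t\) used above, assigning treatment A with probability \(h\left( (1;{\textbf{f}}({\varvec{z}})^t){\mathbb {W}}{\textbf{b}}_n\right) \) as in (A.6). The model functions \({\textbf{f}}\), the weight matrix \({\mathbb {W}}\), the allocation function \(h\) and the covariate distribution are all illustrative assumptions, not the paper's specific recommendations.

```python
# Illustrative sketch of the allocation dynamics in the proof of Theorem 3.
# Assumptions (not from the paper): a single standard normal covariate,
# f(z) = (z, z^2), W = identity, and a logistic-type h bounded in [e, 1-e].
import numpy as np

rng = np.random.default_rng(1)

def f(z):
    return np.array([z, z**2])          # hypothetical first- and second-order terms

def h(x, e=0.1):
    # decreasing, with h(x) + h(-x) = 1 and h(x) >= e for every x
    return e + (1 - 2 * e) / (1 + np.exp(x))

q = 2
W = np.eye(q + 1)                       # illustrative weight matrix
b = np.zeros(q + 1)                     # b_0 = 0_{q+1}

for _ in range(2000):
    z = rng.normal()                    # covariate of the incoming subject
    v = np.concatenate(([1.0], f(z)))   # (1; f(z)^t)^t
    delta = rng.random() < h(v @ W @ b) # assign A with probability h((1; f(z)^t) W b_n)
    b += (2 * delta - 1) * v            # update b_{n+1}

print("b_n after 2000 allocations:", np.round(b, 2))
# The first entry is the overall imbalance n_A - n_B; the remaining entries
# measure covariate imbalance in f(z). All remain bounded in probability.
```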


About this article


Cite this article

Baldi Antognini, A., Frieri, R., Zagoraiou, M. et al. The Efficient Covariate-Adaptive Design for high-order balancing of quantitative and qualitative covariates. Stat Papers 65, 19–44 (2024). https://doi.org/10.1007/s00362-022-01381-1
