On MSE-Optimal Circular Crossover Designs

In crossover designs, each subject receives a series of treatments, one after the other in p consecutive periods. There is concern that the measurement of a subject at a given period might be influenced not only by the direct effect of the current treatment but also by a carryover effect of the treatment applied in the preceding period. Sometimes, the periods of a crossover design are arranged in a circular structure. Before the first period of the experiment itself, there is a run-in period, in which each subject receives the treatment it will receive again in the last period. No measurements are taken during the run-in period. We consider the estimate for direct effects of treatments which is not corrected for carryover effects. If there are carryover effects, this uncorrected estimate will be biased. In that situation, the quality of the estimate can be measured by the mean square error, the sum of the squared bias and the variance. We determine MSE-optimal designs, that is, designs for which the mean square error is as small as possible. Since the optimal design will in general depend on the size of the carryover effects, we also determine the efficiency of some designs compared to the locally optimal design. It turns out that circular neighbour-balanced designs are highly efficient.


Introduction
In a crossover design, the effects of t treatments are compared by giving each of n subjects a series of treatments in p consecutive periods. The response of each unit in each period is measured and used to estimate the so-called direct effects of the treatments. Usually, for experiments of this kind, there is concern that a treatment given in period i, 1 ≤ i ≤ p, in addition to its direct effect will also have a carryover effect that influences the measurement in period i + 1. An important application of crossover designs is in sensory experiments, where each assessor evaluates a number of products, one after the other. Here, a carryover effect may for instance be caused by a lingering taste of a product.
There are two options to deal with carryover effects. One possibility is to explicitly include carryover effects in the model and use an estimate that corrects for carryover effects. There is a vast literature on designs for the corrected estimate, see, e.g. [3].
Another approach is to neglect the carryover effect and use an uncorrected estimate which does not correct for carryover effects. In sensory experiments, the analysis is usually done with the uncorrected estimate. In most cases, the experimenter then uses technical measures like, e.g. washout with a neutral taste to reduce carryover.
To deal with the, hopefully small, carryover effects which may be present in spite of washout, some authors recommend using neighbour-balanced designs as a precaution to minimize the effects of carryover, see, e.g. [11, p. 189]. Neumann and Kunert [12] have formally shown that the use of neighbour-balanced designs can minimize the mean square error of the uncorrected estimate.
In a circular crossover design, there is a run-in period where each unit receives a treatment but no measurement is made. During the run-in period of a circular design, each unit receives the treatment which it will receive again in the last period. Hence, there is a carryover effect in each period. Circular designs were introduced to the area of crossover designs by Magda [10]. There are some results on optimal designs for estimators taking account of the carryover effects, e.g. by Kunert [7], Druilhet [5] and Bailey and Cameron [2]. Azaïs [1] has shown that for circular designs and the uncorrected estimate, there is a randomization which preserves the neighbour structure but validates an analysis with the simple block model. This makes circular designs appealing for the uncorrected estimate.
The present paper extends the results of [12] to the class of circular designs. Since the mean square error depends on the size of the carryover effects, we determine locally optimal designs. If the number of periods is larger than the number of treatments, however, it turns out that there are designs minimizing the mean square error regardless of the size of the carryover effects. If the number of periods is less or equal to the number of treatments, the optimal design varies with the size of the carryover effects. As long as the carryover effects are small, no experimental unit in the optimal design receives the same treatment twice. When the carryover effects get larger, some units will receive the same treatment twice in consecutive periods. The proportion of units receiving the same treatment twice increases with increasing carryover effects. These results remain correct whether there is a period effect in the model or not.

Calculating the MSE
Define t,n, p as the set of all circular crossover designs with t treatments, n units (subjects) and p periods. We assume that the j-th observation on unit i can be written as Here, α i , 1 ≤ i ≤ n, is the effect of the i-th unit, τ d(i, j) is the effect of the treatment given to the i-th unit in the j-th period by the design d ∈ t,n, p , ρ d(i, j−1) is the carryover effect of the treatment given to unit i in period ( j − 1) with ρ d(i,0) = ρ d(i, p) , and ε i j is the error. The errors are independent, identically distributed with expectation 0 and variance σ 2 .
To continue, we need some notation. Denote by A T the transpose of the matrix A. If I t is the t × t-identity matrix and 1 t the t-dimensional vector of ones, define where B + denotes the Moore-Penrose generalized inverse of the matrix B.
In vector notation, our model then can be written as Here, y = [y 11 , . . . , y 1 p , y 21 , . . . , y np ] T and ε is the vector of the errors. Further, α, τ and ρ are the vectors of the unit, direct and carryover effects, respectively. The matrices U , T d and F d are the corresponding design matrices. We assume that the analysis of the data is done with a model neglecting the carryover effects, i.e.
If we define M d11 = T T d ω ⊥ (U )T d , then M d11 is the information matrix for the estimation of direct effects in model (2). Note that M d11 and, therefore, M + d11 have rowand column-sums 0. Consequently, If there are carryover effects, then the uncorrected estimateτ is biased. More precisely, As in [12], we try to determine a design that minimizes the mean square error (MSE) as a performance measure combining bias and variance. The joint information matrix of direct and carryover effects in model (1) can be written as where M d11 is as before and see [3, p. 15]. Analogous to [12], we calculate the MSE in the following way: We restrict attention to designs which allow estimation of all contrasts of direct effects in the model without carryover effects. Because M d11 is the information matrix for direct effects in the model without carryover effects, this is the set of all designs for which rank(M d11 ) = t − 1. In the model with carryover effects, the MSE of the uncorrected estimate τ i − τ j for any pair (i, j), i = j then is where i j is a t-dimensional vector with +1 in position i, −1 in position j and all other entries 0. With the calculations done in [12], we get that the average MSE over all pairs equals where tr(M) denotes the trace of a matrix M. As in [12], we consider the worst case for given ρ T H t ρ = (ρ i −ρ) 2 = δ, say. That is, we consider where λ i (M) is the i-th largest eigenvalue of the symmetric matrix M. We multiply this equation by t−1 2 and get the following definition for our optimalitycriterion. For any d ∈ t,n, p , we define The advantage of this criterion is that the multivariate purpose of minimizing the bias and maximizing the precision of the estimators can be measured as a single real number. In what follows, we assume without loss of generality that σ 2 = 1. Our aim is to find a design that minimizes MSE(d).
We call a square matrix M completely symmetric, if there are numbers a and b such that all diagonal elements are equal to a and all off-diagonal elements are equal to b. We get the following bound for MSE(d).

Theorem 1 For any design d ∈ t,n, p , there is a lower bound for MSE(d), namely
Equality holds if both M d11 and M d12 are completely symmetric.
Proof The proof given for Proposition 1 of [12] remains valid in the circular case.
Note that for every M d11 and M d12 there are symmetrized versionsM d11 andM d12 withM Here S t is the set of all (t × t)−permutation matrices . We see that allM di j are completely symmetric and that tr( Then, the lower bound in Theorem 1 can be written as Each subject in the design d receives a sequence s of treatments. Denote by T (s) and F(s) the part of T d and F d that corresponds to s and define We call two sequences s 1 and s 2 equivalent if s 1 can be transformed to s 2 by relabelling the treatments. Two equivalent sequences then have the same q i j (s). If for given t and p there are K , say, equivalence classes of sequences, we choose a representative sequence s k , 1 ≤ k ≤ K for each class. As pointed out by Kushner [9], the q di j then are weighted means of the q i j (s k ). More precisely, we get Similarly to [12], we consider approximate designs. For approximate designs, we remove the restriction that the number of experimental units to receive a given sequence s must be an integer. For an approximate design, the π d (k), 1 ≤ k ≤ K can be any set of nonnegative real numbers, subject to the condition that K k=1 π d (k) = 1. An exact design d ∈ t,n, p then is a special instance of an approximate design, where each sequence is assigned to an integral number of units. We denote the set of all Table 1 Size of q 11 (s) and q 12 (s) for two classes of sequences in the case p > t

Class
Sequence approximate designs for given t, n and p by t,n, p . Note that the number n of units is not important for an approximate design d ∈ t,n, p . It plays a role in the calculation of MSE(d), however. A design d ∈ t,n, p is called symmetric, if each sequence s from class k, 1 ≤ k ≤ K appears equally often (i.e. π d (k)/m k times, where m k is the number of sequences in class k). If the design d is symmetric, then all M di j are completely symmetric, 1 ≤ i ≤ j ≤ 2. We see from the definition of symmetrized designs that for every combination of tr(M d11 ) and tr(M d12 ) there is a symmetric design with the same traces. As these designs achieve the lower bound in Theorem 1, we can restrict attention to symmetric designs.

Optimal Designs
For any sequence s, we observe that where f s,m is the frequency of treatment m in sequence s and B s the number of periods where the treatment of the preceding period is repeated. These formulas show that the q i j (s) depend on the sequence only through the treatment frequencies and the number of periods where the preceding treatment is repeated. The position of the treatments within the sequence does not matter. Hence, we will choose the representative sequences such that the periods with repeated treatments, if there are, come at the end. As a first step, let p > t. In this case, we can write p = gt + s with g and s being integers and 1 ≤ s ≤ t. In this situation, there are two classes of particular interest, namely A with representative sequence s 1 and B with representative sequence s 2 , where [1, . . . , 1, 2, . . . , 2, . . . , t − 1, . . . , t − 1, t, . . . , t] s 2 = [1, 2, . . . , t, 1, 2, . . . , t, . . . , 1, 2, . . . , t, 1, 2, . . . , r ].
In these sequences, each treatment appears either g or g + 1 times. Note that it makes no difference for q 11 (s 1 ) and q 12 (s 1 ) whether the treatments appearing g + 1 times come early or late in s 1 . The values of q 11 (s) and q 12 (s) for the two classes A and B are displayed in Table 1. It can be seen that t m=1 f 2 s,m is minimal, if the f s,m , 1 ≤ m ≤ t, are as nearly equal as possible. Consequently, s 1 and s 2 maximize q 11 (s). With that observation, the MSEoptimal design can be determined as follows.
Theorem 2 Assume p > t and consider a symmetric approximate design d * ∈ t,n, p consisting of sequences from class A with proportion and of sequences from class B with proportion π 2 = 1 − π 1 . We then get Proof In each sequence used by d * , each treatment in each unit appears either g or g + 1 times. Therefore, we have that Observe that the choice of π 1 implies that q d * 12 = 0. Since d * is symmetric, we get which completes the proof.
If p ≤ t, the determination of an MSE-optimal design is more complicated. We examine the special cases p = 2 ≤ t and p = 3 ≤ t before we find general optimal designs for 4 ≤ p ≤ t.
Assume p = 2. Note that there are only two possible classes of sequences A and B, displayed in Table 2.
Proof Assume the design has proportion π 1 of its sequences from class A and π 2 = 1 − π 1 from class B. We get from Table 2 that q d11 = π d (s)q 11 (s) = π 2 and q d12 = −π 2 . Hence, This completes the proof.
For the case p = 3, there are only the five possible classes of sequences listed in Table 3. In this case, the optimal design consists of sequences [1, 2, 3] only. Theorem 4 Assume 3 = p ≤ t and consider a symmetric approximate design d * ∈ t,n,3 that consists only of sequences from class C with representative sequence [1,2,3]. Then, ∀d ∈ t,n, 3 : Proof In the circular model, sequences s from any of the three classes B 1 , B 2 and B 3 all have the same q 11 (s) and q 12 (s), respectively. Assume the design d has proportion π 2 of its sequences from one of the classes B j and π 3 from class C. Then, the proportion of sequences from class A is π 1 = 1 − π 2 − π 3 . We get from Table 3 that q d11 = π d (s)q 11 (s) = 2( 2 3 π 2 + π 3 ) and q d12 = −( 2 3 π 2 + π 3 ). Hence, and d * is MSE-optimal.
also in the circular case. This inequality is used in the next proposition.

Proposition 1 Define B(d) of a design d ∈ t,n, p as the weighted mean of the B s of its sequences, that is B(d) = s B s π d (s). Then, for any d ∈ t,n, p we have
Proof From the inequality t m=1 f 2 s,m ≥ p + 2B s , we get for any sequence s that Since q di j = π d (s)q i j (s), the desired inequalities follow.  . . . , p], i.e. no treatment is given twice and B s = 0. Class B has been shown to produce (universally) optimal designs for various cases [2,5,8], since balanced designs use sequences from this class. Class A was shown by Zheng et al. [13] to be important for non-trivial covariance matrices.
The respective values for q 11 (s) and q 12 (s) can be seen in Table 4. We now show that a design minimizing the MSE can be found in the set of all designs consisting of sequences from classes A and B only.

Proposition 2 Assume 4 ≤ p ≤ t. For any design d ∈ t,n, p definē
We construct a symmetric designd t,n, p ∈ t,n, p which consists of a proportionπ of sequences from class A and a proportion 1 −π of sequences from class B. Then, Proof For the designd, we conclude from the q i j (s) in Table 4, that qd 11 = p−1− 4 pπ and qd 12 = −(1 −π) +π p−4 p = −1 + 2π p−2 p . Making use of the fact thatd is symmetric, we then have Now distinguish between two cases. Case 1: B(d) ≥ p p−2 . In this case,π = p 2( p−2) and therefore qd 11 = p − 1 − 2 p B(d) while qd 12 = 0. It follows from Proposition 1 that q d11 ≤ p − 1 − 2 p B(d) ≤ qd 11 . Hence, Case 2: Assume 0 ≤ B(d) < p p−2 . In that case, it follows from Proposition 1 that q d12 ≤ −1 + p−2 p B(d) < 0. Therefore, a lower bound for MSE(d) is achieved by making both q d12 and q d11 as large as possible. We hence get from Proposition 1 that It follows from Proposition 2 that the MSE-optimal design for a given δ can be found among the set of all designsd defined in Proposition 2. That is, we have to find the bestπ .
where l di,k is the number of appearances of treatment i in period k (with l di0 = l dip ) and r di the number of appearances of treatment i in the design. The numbers q d11 and q d12 are as in Sect. 3.
For symmetric designs, we get tr(M d11 ) = n q d11 and tr(M d12 ) = n q d12 , i.e. q di j = q di j . In the case p > t, the best symmetric designs have q d12 = 0 and the results of Theorem 2 extend to the model with period effects.
There are examples-at least in non-circular designs-of designs that are nonsymmetric and where the uncorrected estimate from the model with period effects has a smaller bias than the estimate from the model without period effects. In the case p ≤ t, the best symmetric designs have nonzero bias. Hence, it is not obvious that the results of Theorems 3 and 4 and of Proposition 2 still hold in the model with period effects.
To show that they do, we start by showing that for non-symmetric designs the loss in tr(M d11 ) is higher than the possible gain in |tr(M d12 )|.

Proposition 3 For any d ∈ t,n, p , it holds that
Because of the circular structure of the carryover effects, we have [8] or [6]) and, hence, that T T d QT d = F T d Q F d . Therefore, the Cauchy-Schwarz inequality implies that and the proposition follows.
If p ≥ 4, we are now able to show that for every design d there is a symmetric design d s consisting of sequences from classes A and B described in Table 4 only and having smaller or equal MSE. Hence, Therefore, ρ = 1 2 , regardless of our choice of the l di,k . It follows that

Efficiency in Terms of MSE and Example
We have seen in Sects. 3 and 4 that MSE-optimality, for 4 ≤ p ≤ t, is a local criterion, the MSE-optimum design depends on the true δ. Since in practice, we will in general not know δ, we try to find a design which is efficient for a broad range of δ. The efficiency of a design d at a point δ is defined as where MSE δ,opt is the minimum MSE for this δ. Thus, a high efficiency is desirable and the efficiency is a number between 0 and 1.
In what follows, we consider the case 4 ≤ p ≤ t. In the case of circular designs, often designs are considered that only consist of sequences from class B. Special cases of these designs are so-called circular balanced designs or orthogonal arrays of type I. These have been shown to be optimal in a broad class of designs, see, e.g. [2]. However, Zheng et al. [13] showed that designs with sequences from class A can be efficient in some situations. Thus, we will consider the designs d A and d B that only consist of sequences from classes A or B, respectively, and calculate their efficiency. First, we examine the case that δ < ( p−1)(t−1) 2 np( p− 3) . In this case, d B is MSE-optimal and thus Eff δ (d B ) = 1.
For d A , we get in this case . and We see that the efficiency of d B decreases with δ while the efficiency of d A increases with δ.
If δ increases, we see that the proportion 1 − π(δ) of sequences from class B in the MSE-optimal design decreases ( Fig. 1). This is in agreement with the decreasing efficiency of the design d B and the increasing efficiency of the design that only consists of sequences from class A (Fig. 2). However, as long as δ remains reasonably small, the efficiency of d B is still high (and larger than the efficiency of d A ). Therefore, it makes sense to use the design d B .
Of particular interest for practice is the case when the number of treatments t equals the number of periods p. In this case, we can construct a circular balanced uniform design d B with n = t(t − 1) units.
For a design of this size, the limit δ 1 = ( p−1)(t−1) 2 np( p−3) becomes 0.56 for p = 4, 0.32 for p = 5, 0.23 for p = 6 and goes down to 0.01 for p = 100. The efficiency Eff d B (δ) is 1 as long as δ is less than δ 1 , and decreases when δ increases. For δ = 3, we observe that Eff

Conflict of interest
The authors declare that there are no conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted