1 Introduction

The aim of this work is the computation of highly efficient experimental designs in multiple-group random coefficient regression models. Analytical approaches for determining optimal approximate designs for models of this type have been discussed, e.g., in Fedorov and Jones (2005), Schmelter (2007) and Prus (2022). In Fedorov and Jones (2005), optimal designs were obtained for specific regression functions. Schmelter (2007) proposed optimality conditions for commonly used linear and determinant criteria in the particular case of group-wise identical designs. In Prus (2022), equivalence theorems were formulated for the general form of multiple-group models.

For computing optimal designs in mixed-effect models, several solutions are available. However, most of them focus on computing approximate designs for several traditional criteria, and we are not aware of any work that considers additional constraints besides the size of the design.

Namely, Dumont et al. (2018) created an R package for computing designs in mixed-effects models that is predominantly focused on the nonlinear models used in drug development. However, they only consider the D-optimality criterion, without any additional constraints that may arise in the experiment (such as budget, material or other types of resources). The software package of Aliev et al. (2012) is aimed at computing approximate designs, mainly for mixed-effects models arising in pharmacokinetic applications. Finally, the software solution of Nyberg et al. (2012) seems to be the most versatile, as it admits user-defined criteria, including their Bayesian versions, but the focus is on approximate designs in nonlinear mixed-effects models and no additional constraints can be included.

In our paper, we propose to use the algorithm of Harman et al. (2016), originally developed for computing D-efficient exact designs in the linear regression model with possibly multiple resource constraints, for the general form of multiple-group models. We also propose analytical solutions based on equi- and invariance properties of optimal designs for several particular models.

The paper has the following structure: In Sect. 2, we briefly introduce the multiple-group model, the design problem and the optimality conditions that are subsequently used in Sect. 3 to show the equivariance and invariance properties of D- and L-optimal designs. In Sect. 4, we show that the problem of computing optimal designs in the multiple-group mixed model can be reformulated as a problem of computing optimal designs with respect to a monotone criterion function under resource constraints on the weights; hence, a modification of a recent algorithm for computing resource-constrained designs can be used to obtain efficient exact designs in our model. In Sects. 5 and 6, we compute D- and IMSE-optimal designs in bilinear and quadratic models and show that we can easily solve problems with additional constraints that cannot be solved analytically.

2 Multiple-group RCR model

2.1 Model specification

In the multiple-group random coefficient regression (RCR) model, the h-th observation of the j-th observational unit in the i-th group is given by

$$\begin{aligned} \textbf{Y}_{ijh}= & {} \textbf{F}_{(i)}(x_{ih}){\varvec{\beta }}_{ij} + {\varvec{\varepsilon }}_{ijh},\quad x_{ih} \in \mathcal {X}_i,\quad i=1,\dots , s,\quad j=1,\dots , n_i, \nonumber \\ h= & {} 1,\dots , m_i, \end{aligned}$$
(1)

where \(n_i\) is the number of observational units in group i, \(m_i\) is the number of observations per unit in group i, and the observational settings \(x_{ih}\) come from an experimental region \(\mathcal {X}_i\). In this work we allow for a multivariate (l-variate) response, and \(\textbf{F}_{(i)}\) denotes a group-specific (\(l\times p\)) matrix of known regression functions in group i. In the particular case of a univariate response we deal with “classical” regression functions: \(\textbf{F}_{(i)}=\textbf{f}_{(i)}^\top \) and \(l=1\). The unit-specific random parameters \({\varvec{\beta }}_{ij}=(\beta _{ij1}, \dots , \beta _{ijp})^\top \) have unknown mean \({\varvec{\beta }}_0\) and given (\(p\times p\)) covariance matrix \(\textbf{D}_i\); the \({\varvec{\varepsilon }}_{ijh}\) denote observational errors with zero mean and non-singular (\(l\times l\)) covariance matrix \({\Sigma }_i\). All observational errors and all random parameters are assumed to be uncorrelated.

The covariance matrix of the best linear unbiased estimator for \({\varvec{\beta }}_0\) is given by

$$\begin{aligned} \text {Cov}\left( \hat{{\varvec{\beta }}}_0\right) = \left[ \sum _{i=1}^sn_i((\tilde{\textbf{F}}_i^\top \tilde{\textbf{F}}_i)^{-1}+\textbf{D}_i)^{-1}\right] ^{-1}, \end{aligned}$$
(2)

where \(\tilde{\textbf{F}}_i=(\tilde{\textbf{F}}^\top _{(i)}(x_{i1}), \ldots , \tilde{\textbf{F}}^\top _{(i)}(x_{im_i}))^\top \) for \(\tilde{\textbf{F}}_{(i)}(x_{ih})={\Sigma }_i^{-1/2}\textbf{F}_{(i)}(x_{ih})\), \(h=1, \dots , m_i\), and \({\Sigma }_i^{1/2}\) is the symmetric positive definite matrix with the property \({\Sigma }_i={\Sigma }_i^{1/2}{\Sigma }_i^{1/2}\).
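For concreteness, formula (2) can be evaluated numerically once the stacked matrices \(\tilde{\textbf{F}}_i\) are available. The following sketch (Python with NumPy; the function name and the illustrative inputs are ours, not from the paper) computes the covariance matrix of the BLUE of \({\varvec{\beta }}_0\):

```python
import numpy as np

def cov_beta0(F_tilde, D, n):
    """Covariance matrix (2) of the best linear unbiased estimator of beta_0.

    F_tilde : list of stacked (m_i * l x p) matrices F~_i, one per group
    D       : list of (p x p) random-effects covariance matrices D_i
    n       : list of group sizes n_i
    """
    p = F_tilde[0].shape[1]
    total = np.zeros((p, p))
    for Fi, Di, ni in zip(F_tilde, D, n):
        Mi = Fi.T @ Fi  # per-unit information F~_i^T F~_i
        total += ni * np.linalg.inv(np.linalg.inv(Mi) + Di)
    return np.linalg.inv(total)

# illustrative two-group example: f(x) = (1, x)^T observed at x = 0 and x = 1
F1 = np.array([[1.0, 0.0], [1.0, 1.0]])
cov = cov_beta0([F1, F1], [0.1 * np.eye(2), 0.2 * np.eye(2)], [3, 2])
```

The returned matrix is symmetric positive definite whenever every per-unit information matrix \(\tilde{\textbf{F}}_i^\top \tilde{\textbf{F}}_i\) is non-singular.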

2.2 Design criteria

The experimental settings \(x_{i1}, \dots , x_{im_i}\) in formula (1) are not necessarily all distinct. We define an exact design in group i as

$$\begin{aligned} \xi _i= \left( \begin{array}{c} x_{i1},\dots , x_{ik_i} \\ m_{i1},\dots , m_{ik_i} \end{array} \right) , \end{aligned}$$
(3)

where \(x_{i1},\dots ,x_{ik_i}\) are the distinct support points in \(\mathcal {X}_i\) with the corresponding numbers of observations \(m_{i1},\dots , m_{ik_i}\in \mathbb {N}\), \(\sum _{k=1}^{k_i}{m_{ik}}=m_i\).

For analytical purposes we also introduce approximate designs:

$$\begin{aligned} \xi _i= \left( \begin{array}{c} x_{i1}, \ldots , x_{ik_i} \\ w_{i1}, \ldots , w_{ik_i} \end{array} \right) , \end{aligned}$$

where \(w_{ik}\ge 0\) denotes the weight of observations at \(x_{ik}\), \(k=1, \dots , k_i\), and \(\sum _{k=1}^{k_i}{w_{ik}}=1\).

We will use the following notation for the moment (or information) matrix in group i:

$$\begin{aligned} \textbf{M}_i(\xi _i)=m_i\sum _{k=1}^{k_i}w_{ik}\,\tilde{\textbf{F}}_{(i)}(x_{ik})^\top \tilde{\textbf{F}}_{(i)}(x_{ik}). \end{aligned}$$
(4)

For exact designs we have \(w_{ik}=m_{ik}/m_i\) and

$$\begin{aligned} \textbf{M}_i(\xi _i)=\tilde{\textbf{F}}_i^\top \tilde{\textbf{F}}_i, \end{aligned}$$

which follows from formula (4) and the definition of \(\tilde{\textbf{F}}_i\) below formula (2).

We will also use the notation \({\varvec{\xi }}\) for the tuple of all group-designs \(\xi _i\): \({\varvec{\xi }}=(\xi _1, \dots , \xi _s)\).

Further we extend the definition of the variance-covariance matrix (2) to approximate designs:

$$\begin{aligned} \text {Cov}_{\xi } = \left[ \sum _{i=1}^sn_i\left( \textbf{M}_i(\xi _i)^{-1}+\textbf{D}_i\right) ^{-1}\right] ^{-1}. \end{aligned}$$
(5)

We generally search for designs which minimize the variance-covariance matrix. Since minimization of the matrix itself is in general not possible, we minimize suitable functions of this matrix, which we call optimality criteria. We focus on the commonly used linear (L-) and determinant (D-) criteria for the estimation of the population parameters \({\varvec{\beta }}_0\), which are given by

$$\begin{aligned} \phi _{L}({\varvec{\xi }})=\textrm{tr}\left( \left[ \sum _{i=1}^sn_i\left( \textbf{M}_i(\xi _i)^{-1}+\textbf{D}_i\right) ^{-1}\right] ^{-1}\textbf{V}\right) , \end{aligned}$$
(6)

where \(\textbf{V}\) is some non-negative definite (\(p\times p\)) matrix, and

$$\begin{aligned} \phi _{D}({\varvec{\xi }})=-\textrm{ln}\,\textrm{det}\left( \sum _{i=1}^sn_i\left( \textbf{M}_i(\xi _i)^{-1}+\textbf{D}_i\right) ^{-1}\right) , \end{aligned}$$
(7)

respectively (see Prus 2022). (Note that the matrix \(\textbf{M}_i(\xi _i)\) here differs from that in Prus (2022) by the constant factor \(m_i\).)
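Criteria (6) and (7) translate directly into code. The following sketch (a NumPy helper of our own, assuming the moment matrices \(\textbf{M}_i(\xi _i)\) have been computed) evaluates both criteria for given group designs:

```python
import numpy as np

def info_sum(M_list, D_list, n_list):
    # the matrix sum_i n_i (M_i(xi_i)^{-1} + D_i)^{-1} appearing in (6) and (7)
    return sum(n * np.linalg.inv(np.linalg.inv(M) + D)
               for M, D, n in zip(M_list, D_list, n_list))

def phi_L(M_list, D_list, n_list, V):
    # linear criterion (6): trace of the covariance matrix weighted by V
    return np.trace(np.linalg.inv(info_sum(M_list, D_list, n_list)) @ V)

def phi_D(M_list, D_list, n_list):
    # determinant criterion (7): negative log-determinant of the information sum
    return -np.log(np.linalg.det(info_sum(M_list, D_list, n_list)))
```

With \(\textbf{V}=\mathbb {I}_p\), `phi_L` is the A-criterion; both criteria decrease when the group sizes \(n_i\) grow, reflecting the gain in information.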

Frequently used particular cases of the L-criterion are the c- and A-criteria, which are of the form (6) with \(\textbf{V}=\textbf{c}\textbf{c}^\top \), \(\textbf{c}\in \mathbb {R}^p\), and \(\textbf{V}=\mathbb {I}_p\), where \(\mathbb {I}_p\) is the \(p\times p\) identity matrix, respectively. Another frequently used linear criterion is the IMSE-criterion. For the estimation of the mean parameters \({\varvec{\beta }}_0\) in the multiple-group model (1) we define this criterion as follows:

$$\begin{aligned}{} & {} \phi _{IMSE}({\varvec{\xi }}) \nonumber \\{} & {} \quad =\sum _{i=1}^s a_i\,\textrm{tr}\left( \int _{\mathcal {X}_i}\textrm{E}\left[ \left( \textbf{F}_{(i)}(x) \hat{{\varvec{\beta }}}_0-\textbf{F}_{(i)}(x) {\varvec{\beta }}_0\right) \left( \textbf{F}_{(i)}(x) \hat{{\varvec{\beta }}}_0-\textbf{F}_{(i)}(x) {\varvec{\beta }}_0\right) ^\top \right] \nu _i(\text {d}x)\right) , \nonumber \\ \end{aligned}$$
(8)

where \(\nu _i\) is some suitable measure on the experimental region \(\mathcal {X}_i\) (typically uniform on \(\mathcal {X}_i\)) with \(\nu _i(\mathcal {X}_i)=1\), and \(a_i\) is a coefficient related to group i, \(\sum _{i=1}^sa_i=1\). The coefficients \(a_1, \dots , a_s\) may depend on the group sizes or, alternatively, equal weight may be given to each group. The IMSE-criterion (8) may be rewritten in the form

$$\begin{aligned} \phi _{IMSE}({\varvec{\xi }}) =\textrm{tr}\left( \text {Cov}\left( \hat{{\varvec{\beta }}}_0\right) \sum _{i=1}^s a_i \int _{\mathcal {X}_i}\textbf{F}_{(i)}(x)^\top \textbf{F}_{(i)}(x)\,\nu _i(\text {d}x)\right) . \end{aligned}$$
(9)

Then we extend it to approximate designs by using the extended variance-covariance matrix (5), and we obtain the particular linear criterion with \(\textbf{V} = \sum _{i=1}^s a_i \int _{\mathcal {X}_i}\textbf{F}_{(i)}(x)^\top \textbf{F}_{(i)}(x)\,\nu _i(\text {d}x)\). This simplifies to \(\textbf{V} = \int _{\mathcal {X}_1}\textbf{F}_{(1)}(x)^\top \textbf{F}_{(1)}(x)\,\nu _1(\text {d}x)\) if the regression matrices \(\textbf{F}_{(i)}\), the experimental regions \(\mathcal {X}_i\) and the weighting measures \(\nu _i\) coincide across all groups: \(\textbf{F}_{(i)}=\textbf{F}_{(1)}\), \(\mathcal {X}_i=\mathcal {X}_1\) and \(\nu _i=\nu _1\) for \(i=1, \dots , s\).
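When no closed form is at hand, the matrix \(\textbf{V}\) of the IMSE-criterion can be approximated by numerical integration. A minimal sketch for a uniform \(\nu \) on an interval (the function name and grid size are our own choices); for \(\textbf{F}(x)=(1,x)\) on \([0,1]\) the exact value is \(\textbf{V}=\begin{pmatrix}1&1/2\\1/2&1/3\end{pmatrix}\):

```python
import numpy as np

def imse_V(f, lo, hi, num=10_001):
    # grid approximation of V = integral of f(x) f(x)^T under the uniform
    # measure on [lo, hi]; the mean over an equispaced grid approximates
    # the integral since the uniform measure has total mass 1
    xs = np.linspace(lo, hi, num)
    vals = np.array([np.outer(f(x), f(x)) for x in xs])
    return vals.mean(axis=0)

V = imse_V(lambda x: np.array([1.0, x]), 0.0, 1.0)
```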

2.3 Optimality conditions

The optimality conditions for the L- and D-criteria are provided by the following theorems (see Prus 2022):

Theorem 1

Approximate designs \({\varvec{\xi }}^*=(\xi _1^*,\dots , \xi _s^*)\) are L-optimal for estimation of the mean parameters \({\varvec{\beta }}_0\) iff

$$\begin{aligned}{} & {} m_i\,\textrm{tr}\left\{ \tilde{\textbf{F}}_{(i)}(x_i)\left[ \textbf{M}_i(\xi _i^*)^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\textbf{V}\right. \right. \nonumber \\{} & {} \left. \left. \cdot \left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\textbf{M}_i(\xi _i^*)^{-1}\right] \tilde{\textbf{F}}_{(i)}(x_i)^\top \right\} \nonumber \\{} & {} \le \textrm{tr}\left\{ \textbf{M}_i(\xi _i^*)^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\textbf{V}\right. \nonumber \\{} & {} \left. \cdot \left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\right\} \end{aligned}$$
(10)

for \(x_i \in \mathcal {X}_i\), \(i=1, \dots , s\).

For support points of \(\xi _i^*\) equality holds in (10).

Theorem 2

Approximate designs \({\varvec{\xi }}^*=(\xi _1^*,\dots , \xi _s^*)\) are D-optimal for estimation of the mean parameters \({\varvec{\beta }}_0\) iff

$$\begin{aligned}{} & {} m_i\,\textrm{tr}\left\{ \tilde{\textbf{F}}_{(i)}(x_i)\left[ \textbf{M}_i(\xi _i^*)^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\right. \right. \nonumber \\{} & {} \left. \left. \cdot \left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\textbf{M}_i(\xi _i^*)^{-1}\right] \tilde{\textbf{F}}_{(i)}(x_i)^\top \right\} \nonumber \\{} & {} \le \textrm{tr}\left\{ \textbf{M}_i(\xi _i^*)^{-1}\left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\left[ \sum _{r=1}^sn_r\left( \textbf{M}_r(\xi _r^*)^{-1}+\textbf{D}_r\right) ^{-1}\right] ^{-1}\right. \nonumber \\{} & {} \left. \cdot \left( \textbf{M}_i(\xi _i^*)^{-1}+\textbf{D}_i\right) ^{-1}\right\} \end{aligned}$$
(11)

for \(x_i \in \mathcal {X}_i\), \(i=1, \dots , s\).

For support points of \(\xi _i^*\) equality holds in (11).

Example 1

We consider the two-group model of the general form (1) with the regression functions \(\textbf{F}_{(i)}(x)=(1,x)\), \(x\in \mathcal {X}_i\):

$$\begin{aligned} Y_{ijh}= {\varvec{\beta }}_{ij1} + {\varvec{\beta }}_{ij2}x_{ih}+{\varvec{\varepsilon }}_{ijh},\quad j=1,\dots , n_i,\quad h=1,\dots , m_i, \quad i=1,2, \end{aligned}$$
(12)

on the design regions \(\mathcal {X}_i=[0,1]\). The covariance structures of the random effects and observational errors are given by \(\textbf{D}_i=\textrm{diag}(d_{i1},d_{i2})\) and \(\Sigma _i=1\) for both groups. For this model, the left-hand sides of the optimality conditions (10) and (11) are parabolas in \(x_i\) with positive leading coefficients. Therefore, D- and L-optimal approximate group-designs have the form

$$\begin{aligned} \xi _i=\left( \begin{array}{cc}0 &{} 1 \\ 1-w_{i1} &{} w_{i1}\end{array}\right) , \end{aligned}$$
(13)

where \(w_{i1}\) denotes the weight of observations at the point \(x=1\) in the i-th group and may depend on the choice of the design criterion as well as on the model parameters. The moment matrices are given by

$$\begin{aligned} \textbf{M}_i(\xi _i)=\left( \begin{array}{cc} m_i &{} m_{i1} \\ m_{i1} &{} m_{i1}\end{array}\right) , \end{aligned}$$
(14)

where \(m_{i1}=w_{i1}m_i\). Optimal designs for the random-intercept and random-slope cases have been considered in more detail in Prus (2022).
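For Example 1 the optimal weights can also be found numerically. The sketch below (illustrative parameter values and a crude grid search, both our own) minimizes the D-criterion (7) over the weights \(w_{11}, w_{21}\) of designs (13), using the moment matrices (14):

```python
import numpy as np

def phi_D_two_group(w, m=(10, 10), n=(5, 5), d=((0.1, 0.2), (0.3, 0.1))):
    # D-criterion (7) for model (12) with designs (13) and moment matrices (14);
    # w = (w_11, w_21) are the weights at x = 1 in the two groups
    total = np.zeros((2, 2))
    for wi, mi, ni, di in zip(w, m, n, d):
        Mi = mi * np.array([[1.0, wi], [wi, wi]])
        total += ni * np.linalg.inv(np.linalg.inv(Mi) + np.diag(di))
    return -np.log(np.linalg.det(total))

# crude grid search over both weights
grid = np.linspace(0.01, 0.99, 99)
crit, w1, w2 = min(((phi_D_two_group((a, b)), a, b)
                    for a in grid for b in grid), key=lambda t: t[0])
```

In the fixed-effects limit (\(\textbf{D}_i=0\)) with a single group this recovers the classical D-optimal weight \(w^*=1/2\); nonzero variance parameters shift the optimum.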

3 Equi- and invariance considerations for construction of optimal designs

Equi- and invariance of design criteria play an important role in determining optimal designs in fixed effects models (see e.g. Heiligers 1992 or Schwabe 1996, ch. 3). Prus and Schwabe (2016) investigated the related properties of designs which are optimal for the prediction of individual random parameters in single-group mixed effects models. Here we extend those results to multiple-group models.

We consider a one-to-one transformation g of the experimental regions \(\mathcal {X}_i\) for all \(i=1, \dots , s\) simultaneously, with \(g(\mathcal {X}_i)=\mathcal {X}_i^g\). We assume the regression matrices \(\textbf{F}_{(i)}\) to be defined on both \(\mathcal {X}_i\) and \(\mathcal {X}_i^g\). We also assume the existence of a non-singular (\(p\times p\)) matrix \(\textbf{Q}_g\) such that

$$\begin{aligned} \tilde{\textbf{F}}_{(i)}(g (x))=\textbf{Q}_g\,\tilde{\textbf{F}}_{(i)}(x), \quad \forall x \in \mathcal {X}_i, \quad i=1, \dots , s, \end{aligned}$$
(15)

i.e. all \(\tilde{\textbf{F}}_{(i)}\) are linearly equivariant with respect to the transformation g (see e.g. Schwabe 1996, ch. 3). We denote by \(\xi _i^g\) the following transformation of an approximate design \(\xi _i\):

$$\begin{aligned} \xi _i^g= \left( \begin{array}{c} g(x_{i1}), \ldots , g(x_{ik_i}) \\ w_{i1},\, \ldots ,\ w_{ik_i} \end{array} \right) , \end{aligned}$$
(16)

where the weights \(w_{ik}\) are the same for both \(\xi _i\) and \(\xi _i^g\) and only the design points \(x_{ik}\) are transformed. The moment matrices then satisfy

$$\begin{aligned} \textbf{M}_i(\xi _i^g)=\textbf{Q}_g\,\textbf{M}_i(\xi _i)\,\textbf{Q}_g^\top , \quad i=1, \dots , s. \end{aligned}$$
(17)

Further we use the notations \(\mathcal {D}=(\textbf{D}_1, \dots , \textbf{D}_s)\) and \(\mathcal {\mathbb {X}}=\times _{i=1}^s\mathcal {X}_i\) for the tuple of covariance matrices and the Cartesian product of the experimental regions, respectively, in all groups. For the covariance matrix (5) the following relation can be easily verified:

$$\begin{aligned} \text {Cov}_{\xi ^g}(\mathcal {D}^g)= \textbf{Q}_g^{-\top }\text {Cov}_{\xi }(\mathcal {D})\textbf{Q}_g^{-1}, \end{aligned}$$
(18)

where \({\varvec{\xi }}^g=(\xi _1^g, \dots , \xi _s^g)\), \(\mathcal {D}^g=(\textbf{D}_1^g, \dots , \textbf{D}_s^g)\), \(\textbf{D}_i^g=\textbf{Q}_g^{-\top } \textbf{D}_i\textbf{Q}_g^{-1}\) and \(\textbf{Q}_g^{-\top }=(\textbf{Q}_g^{\top })^{-1}\). We use the notation \(\text {Cov}_{\xi }(\mathcal {D})\) [instead of \(\text {Cov}_{\xi }\) as in formula (5)] to emphasize the dependence on the covariance matrices \(\mathcal {D}\) of random effects.
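Relation (18) is easy to confirm numerically. The quick sketch below uses randomly generated (positive definite) matrices of our own choosing and checks both sides of (18) after transforming the moment matrices as in (17) and the dispersion matrices as \(\textbf{D}_i^g=\textbf{Q}_g^{-\top }\textbf{D}_i\textbf{Q}_g^{-1}\):

```python
import numpy as np

rng = np.random.default_rng(0)
p, s = 3, 2
Q = rng.normal(size=(p, p)) + 3 * np.eye(p)  # a non-singular transformation matrix
n = [4, 6]
M = [A @ A.T + p * np.eye(p) for A in (rng.normal(size=(p, p)) for _ in range(s))]
D = [B @ B.T + np.eye(p) for B in (rng.normal(size=(p, p)) for _ in range(s))]

def cov(Ms, Ds):
    # covariance matrix (5) for given moment and dispersion matrices
    return np.linalg.inv(sum(ni * np.linalg.inv(np.linalg.inv(Mi) + Di)
                             for Mi, Di, ni in zip(Ms, Ds, n)))

Qinv = np.linalg.inv(Q)
M_g = [Q @ Mi @ Q.T for Mi in M]        # transformed moment matrices, cf. (17)
D_g = [Qinv.T @ Di @ Qinv for Di in D]  # induced dispersion matrices
lhs = cov(M_g, D_g)
rhs = Qinv.T @ cov(M, D) @ Qinv         # right-hand side of (18)
```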

Then the equivariance of the D- and L-criteria with respect to a transformation g can be established.

Theorem 3

If the approximate designs \({\varvec{\xi }}^*\) are D-optimal for the estimation of \({\varvec{\beta }}_0\) on the experimental regions \(\mathbb {X}\) under the dispersion matrices \(\mathcal {D}\), then the induced approximate designs \({{\varvec{\xi }}^*}^g\) are D-optimal for the estimation of \({\varvec{\beta }}_0\) on the experimental regions \(\mathbb {X}^g=\times _{i=1}^s\mathcal {X}_i^g\) under the induced dispersion matrices \(\mathcal {D}^g\).

Proof

From the definition of the D-criterion for the estimation of \({\varvec{\beta }}_0\) and formula (18) we obtain

$$\begin{aligned} \phi _{D}({\varvec{\xi }}^g, \mathcal {D}^g)= - 2\ln |\det (\textbf{Q}_g)|+\phi _{D}({\varvec{\xi }}, \mathcal {D}), \end{aligned}$$

which proves the optimality of \({\varvec{\xi }}^g\) on \(\mathbb {X}^g\) for \({\varvec{\xi }}\) optimal on \(\mathbb {X}\). \(\square \)

Theorem 4

If the approximate designs \({\varvec{\xi }}^*\) are L-optimal for the estimation of \({\varvec{\beta }}_0\) on the experimental regions \(\mathbb {X}\) under the dispersion matrices \(\mathcal {D}\) with respect to the transformation matrix \(\textbf{V}\), then the induced approximate designs \({{\varvec{\xi }}^*}^g\) are L-optimal for the estimation of \({\varvec{\beta }}_0\) on the experimental regions \(\mathbb {X}^g\) under the induced dispersion matrices \(\mathcal {D}^g\) with respect to the induced transformation matrix \(\textbf{V}_g=\textbf{Q}_g\textbf{V}\textbf{Q}_g^\top \).

Proof

Using formulas (6) and (18) it can be easily verified that

$$\begin{aligned} \phi _{L}({\varvec{\xi }}^g, \mathcal {D}^g, \textbf{V}_g)= \phi _{L}({\varvec{\xi }}, \mathcal {D}, \textbf{V}), \end{aligned}$$

which proves the optimality of \({\varvec{\xi }}^g\) on \(\mathbb {X}^g\). \(\square \)

Corollary 1

The A-criterion for the estimation of \({\varvec{\beta }}_0\) is equivariant with respect to a transformation g if \(\textbf{Q}_g\) is orthogonal, i.e.:

$$\begin{aligned} \textbf{Q}_g\textbf{Q}_g^\top =\textbf{Q}_g^\top \textbf{Q}_g=\mathbb {I}_p. \end{aligned}$$
(19)

To verify the equivariance of the IMSE-criterion we assume, besides the transformed regression matrices \(\tilde{\textbf{F}}_{(i)}\), the original regression matrices \(\textbf{F}_{(i)}\) to be linearly equivariant with respect to the transformation g:

$$\begin{aligned} \textbf{F}_{(i)}(g (x))=\textbf{Q}_g\,\textbf{F}_{(i)}(x), \quad \forall x \in \mathcal {X}_i, \quad i=1, \dots , s. \end{aligned}$$
(20)

Then if the measure \(\nu _i\) is transformed to its image \(\nu _i^g\), we obtain

$$\begin{aligned} \textbf{V}_g=\sum _{i=1}^s a_i \int _{\mathcal {X}_i^g}\textbf{F}_{(i)}(x)^\top \textbf{F}_{(i)}(x)\,\nu _i^g(\text {d}x)=\textbf{Q}_g\textbf{V}\textbf{Q}_g^\top . \end{aligned}$$

Corollary 2

The IMSE-criterion for the estimation of \({\varvec{\beta }}_0\) is equivariant with respect to a transformation g if condition (20) is satisfied.

Example 1 (continued). We consider again the two-group linear regression model (12) on \(\mathcal {X}_i=[0,1]\) with a diagonal covariance structure of the random effects. For the IMSE-criterion we choose the uniform weighting \(\nu _i=\lambda _{[0,1]}\), \(i=1,2\), where \(\lambda _{[c_1,c_2]}\) denotes the Lebesgue measure on \([c_1,c_2]\). Let \(\xi _i^*\) be D-, A- or IMSE-optimal group-designs of the form (13) with the optimal weight of observations \(w_{i1}^*\) (which generally depends on the choice of the design criterion).

Now we consider the linear transformation \(g(x)=ax\), \(a>0\), for which we obtain \(\textbf{Q}_g=\textrm{diag}(1,a)\). Then the D-, A- or IMSE-optimal group-designs in model (12) on \(\mathcal {X}_i^g=[0,a]\) for \(\textbf{D}_i^g=\textrm{diag}(d_{i1},d_{i2}/a^2)\) and \(\nu _i^g=\frac{1}{a}\lambda _{[0,a]}\) are given by

$$\begin{aligned} {\xi _i^*}^g=\left( \begin{array}{cc}0 &{} a \\ 1-w_{i1}^* &{} w_{i1}^*\end{array}\right) . \end{aligned}$$
(21)

The same behavior of optimal designs has been established for the prediction of random effects in the single-group model in Prus and Schwabe (2016).

Further we consider a finite group G of transformations \(g:\mathcal {X}_i\rightarrow \mathcal {X}_i\) of the experimental regions \(\mathcal {X}_i\) onto themselves for all \(i=1, \dots , s\) simultaneously. We assume the equivariance condition (15) to be satisfied and the dispersion matrices to be invariant: \(\textbf{D}_i^g=\textbf{D}_i\) for all \(g \in G\), \(i=1, \dots , s\). For the linear criteria we additionally assume the invariance of the transformation matrices: \(\textbf{V}_g=\textbf{V}\). Then the D- and L-criteria are invariant with respect to all \(g \in G\), and the following statement can be formulated:

Theorem 5

If the approximate designs \({\varvec{\xi }}^*\) are D- or L-optimal for the estimation of \({\varvec{\beta }}_0\), then the symmetrized designs \(\bar{{\varvec{\xi }}}^*=(\bar{\xi }_1^*, \dots , \bar{\xi }_s^*)\) for \(\bar{\xi }_i^*=\frac{1}{\#G}\sum _{g\in G}{\xi _i^*}^g\) are also D- or L-optimal for the estimation of \({\varvec{\beta }}_0\).

Proof

Let the designs \({\varvec{\xi }}^*\) be D-optimal for the estimation of \({\varvec{\beta }}_0\). Then it follows from Theorem 3 and the invariance of the dispersion matrices that the induced designs \({{\varvec{\xi }}^*}^g\) are also D-optimal, i.e.

$$\begin{aligned} \phi _{D}({{\varvec{\xi }}^*}^g, \mathcal {D})=\phi _{D}({\varvec{\xi }}^*, \mathcal {D}), \quad \forall g \in G. \end{aligned}$$

From the convexity of the criterion we obtain

$$\begin{aligned} \phi _{D}(\bar{{\varvec{\xi }}}^*, \mathcal {D}) \le \phi _{D}({\varvec{\xi }}^*, \mathcal {D}), \end{aligned}$$

which implies the D-optimality of the designs \(\bar{{\varvec{\xi }}}^*\).

For the linear criterion the proof is similar. \(\square \)
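The symmetrization in Theorem 5 is a simple averaging of designs over the group G. A sketch for the reflection group \(G=\{x \mapsto x,\ x \mapsto -x\}\) used in Example 2 below (the dictionary representation of a design is our own convention):

```python
def symmetrize(design):
    """Average a one-dimensional design over G = {x -> x, x -> -x}.

    design: dict mapping support point -> weight, weights summing to 1.
    Each image design xi^g gets weight 1/#G = 1/2, as in Theorem 5.
    """
    bar = {}
    for x, w in design.items():
        for y in (x, -x):  # the orbit of x under G
            bar[y] = bar.get(y, 0.0) + 0.5 * w
    return bar
```

For instance, symmetrizing a design with weights 0.5, 0.2, 0.3 at \(-1, 0, 1\) yields equal weights 0.4 at \(\pm 1\) and weight 0.2 at 0, which already has the symmetric form (25).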

The invariance of the A-criterion is straightforward if condition (19) is satisfied for all \(g \in G\).

Corollary 3

If the approximate designs \({\varvec{\xi }}^*\) are A-optimal for the estimation of \({\varvec{\beta }}_0\) and condition (19) is satisfied for all \(g \in G\), then the symmetrized designs \(\bar{{\varvec{\xi }}}^*\) are also A-optimal for the estimation of \({\varvec{\beta }}_0\).

For the IMSE-criterion we require the invariance of the weighting measures: \(\nu _i^g=\nu _i\), which leads to \(\textbf{V}_g=\textbf{V}\).

Corollary 4

If the approximate designs \({\varvec{\xi }}^*\) are IMSE-optimal for the estimation of \({\varvec{\beta }}_0\) and condition (20) is satisfied for all \(g \in G\), then the symmetrized designs \(\bar{{\varvec{\xi }}}^*\) are also IMSE-optimal for the estimation of \({\varvec{\beta }}_0\).

Example 2

We consider the multiple-group model of the form (1) with the regression functions \(\textbf{F}_{(i)}(x)=(1,x, x^2)\) on a symmetric design region \(\mathcal {X}_i=[-a,a]\), \(a>0\), \(i=1, \dots , s\):

$$\begin{aligned} Y_{ijh}= {\varvec{\beta }}_{ij1} + {\varvec{\beta }}_{ij2}x_{ih}+ {\varvec{\beta }}_{ij3}x^2_{ih}+ {\varvec{\varepsilon }}_{ijh},\quad j=1,\dots , n_i,\quad h=1,\dots , m_i. \nonumber \\ \end{aligned}$$
(22)

For this model the left-hand sides of the optimality conditions (10) and (11) for the L- and D-criteria, respectively, are polynomials of degree four in \(x_i\). Consequently, the corresponding optimal group-designs \(\xi _i^*\) are supported on at most three design points, including the two endpoints of the experimental region:

$$\begin{aligned} \xi _i^*= \left( \begin{array}{ccc} -a &{} o_i &{} a \\ w_{i1}^* &{} 1-w_{i1}^*-w_{i2}^* &{} w_{i2}^* \end{array} \right) , \end{aligned}$$
(23)

where \(o_i \in (-a,a)\) may differ for different design criteria or for different groups. We assume covariance structures of random effects and observational errors to be given by

$$\begin{aligned} \textbf{D}_i=\left( \begin{array}{ccc} d_{i11} &{} 0 &{} d_{i13} \\ 0 &{} d_{i22} &{} 0 \\ d_{i13} &{} 0 &{} d_{i33} \end{array}\right) \end{aligned}$$
(24)

and \(\Sigma _i=1\) for all groups. For the IMSE-criterion we choose the uniform weighting measure \(\nu _i=\frac{1}{2a}\lambda _{[-a,a]}\) for all \(i=1, \dots , s\).

Further we consider the group of transformations \(G=\{ g_1, g_2 \}\) with \(g_1(x)=-x\) and \(g_2(x)=x\). Then we obtain \(\textbf{Q}_{g_1}=\textrm{diag}(1, -1, 1)\), and \(\textbf{Q}_{g_2}\) is equal to the identity matrix. Hence, the dispersion matrices \(\textbf{D}_i\) and the measures \(\nu _i\) are invariant, and conditions (19) and (20) are satisfied for both \(g_1\) and \(g_2\). Then by Theorem 5 and Corollaries 3 and 4, group-designs of the general form

$$\begin{aligned} \bar{\xi _i^*}= \left( \begin{array}{ccc} -a &{} 0 &{} a \\ w_{i1}^* &{} 1-2w_{i1}^* &{} w_{i1}^* \end{array} \right) \end{aligned}$$
(25)

are D-, A- and IMSE-optimal for the estimation of the mean parameters \({\varvec{\beta }}_0\). The optimal weights of observations \(w_{i1}^*\) at the points \(x=a\) and \(x=-a\) generally depend on the design criterion, the variance parameters, the group sizes, the numbers of observations and the length of the interval (see Sect. 6 for examples of such designs).
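The optimal weight \(w_{i1}^*\) in (25) can be found by a one-dimensional search. The sketch below (illustrative single-group parameter values of our own) evaluates the D-criterion (7) for the symmetric design (25); with \(\textbf{D}_i=0\) it recovers the classical fixed-effects D-optimal weight \(1/3\) for quadratic regression on \([-a,a]\):

```python
import numpy as np

def phi_D_quadratic(w, a=1.0, m=10, n=5, D=np.zeros((3, 3))):
    # D-criterion (7) for one group of model (22) with symmetric design (25):
    # weights (w, 1 - 2w, w) at the points (-a, 0, a)
    pts, wts = [-a, 0.0, a], [w, 1 - 2 * w, w]
    f = lambda x: np.array([1.0, x, x * x])
    M = m * sum(wk * np.outer(f(x), f(x)) for x, wk in zip(pts, wts))
    return -np.log(np.linalg.det(n * np.linalg.inv(np.linalg.inv(M) + D)))

# one-dimensional grid search over the boundary weight w
grid = np.linspace(0.001, 0.499, 499)
w_star = grid[np.argmin([phi_D_quadratic(w) for w in grid])]
```

Nonzero dispersion matrices \(\textbf{D}_i\) shift \(w_{i1}^*\) away from \(1/3\), in line with the parameter dependence noted above.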

Further we consider some examples of multiple polynomial regression. For models without random effects optimal designs for multiple polynomial regression have been discussed, e.g., in Galil and Kiefer (1977) and Heiligers (1992).

Example 3

We consider the multiple-group bi-linear model with the regression functions \(\textbf{F}_{(i)}(x)=(1, x_1, x_2)\) on a design region \(\mathcal {X}_i=[-a,a]^2\), \(a>0\), \(i=1, \dots , s\):

$$\begin{aligned} Y_{ijh}= {\varvec{\beta }}_{ij1} + {\varvec{\beta }}_{ij2}x_{ih1}+ {\varvec{\beta }}_{ij3}x_{ih2}+ {\varvec{\varepsilon }}_{ijh},\quad j=1,\dots , n_i,\quad h=1,\dots , m_i. \end{aligned}$$
(26)

For this model the left-hand sides of the optimality conditions (10) and (11) for the L- and D-criteria, respectively, are convex paraboloids. Therefore, the only admissible support points for optimal designs are \(x_{i1}=(a,a)\), \(x_{i2}=(a,-a)\), \(x_{i3}=(-a,a)\), \(x_{i4}=(-a,-a)\):

$$\begin{aligned} \xi _i^*= \left( \begin{array}{cccc} x_{i1} &{} x_{i2} &{} x_{i3} &{} x_{i4} \\ w_{i1}^* &{} w_{i2}^* &{} w_{i3}^* &{} w_{i4}^*\end{array} \right) , \end{aligned}$$
(27)

where \(\sum _{k=1}^4{w_{ik}}=1\). For the IMSE-criterion we use the product measure \(\nu _i=\frac{1}{2a}\lambda _{[-a,a]}\times \frac{1}{2a}\lambda _{[-a,a]}\) for all \(i=1, \dots , s\).

Further we assume the same covariance structures of the random effects and observational errors as in the quadratic regression of Example 2, and we consider the group of transformations \(G=\{ g_1, g_2\}\) with \(g_1(x)=(-x_1,x_2)^\top \) and \(g_2(x)=(x_1,x_2)^\top \). We obtain \(\textbf{Q}_{g_1}=\textrm{diag}(1, -1, 1)\), and \(\textbf{Q}_{g_2}\) is equal to the \(3\times 3\) identity matrix. Then the dispersion matrices \(\textbf{D}_i\) and the weighting measures \(\nu _i\) are invariant and conditions (19) and (20) are satisfied for both \(g_1\) and \(g_2\); consequently, group-designs of the general form

$$\begin{aligned} \bar{\xi _i^*}= \left( \begin{array}{cccc} x_{i1} &{} x_{i2} &{} x_{i3} &{} x_{i4} \\ w_{i1}^* &{} w_{i2}^* &{} w_{i1}^* &{} w_{i2}^*\end{array} \right) \end{aligned}$$
(28)

with \(w_{i2}^*=\frac{1}{2}(1-2w_{i1}^*)\) are D-, A- and IMSE-optimal for the estimation of the mean parameters \({\varvec{\beta }}_0\). Only the optimal weights of observations \(w_{i1}^*\) have to be determined. Note that besides the choice of the design criterion these numbers may also depend on the model parameters (see Sect. 5 for illustrative examples).

For the particular case with a diagonal covariance structure of the random effects (\(d_{i13}=0\)), we consider the group of transformations \(G=\{ g_1, g_2, g_3, g_4 \}\) with \(g_3(x)=(x_1,-x_2)^\top \) and \(g_4(x)=(-x_1,-x_2)^\top \), for which we obtain \(\textbf{Q}_{g_3}=\textrm{diag}(1, 1, -1)\) and \(\textbf{Q}_{g_4}=\textrm{diag}(1, -1, -1)\). The dispersion matrices \(\textbf{D}_i\) and the measures \(\nu _i\) are invariant, and conditions (19) and (20) are satisfied for all transformations in G. Then the balanced group-designs

$$\begin{aligned} \bar{\xi _i}^*= \left( \begin{array}{cccc} x_{i1} &{} x_{i2} &{} x_{i3} &{} x_{i4} \\ 1/4 &{} 1/4 &{} 1/4 &{} 1/4\end{array} \right) \end{aligned}$$
(29)

are D-, A- and IMSE-optimal.

Example 4

We consider the multiple-group bi-quadratic model with the regression functions \(\textbf{F}_{(i)}(x)=(1, x_1, x_2, x_1x_2, x_1^2, x_2^2)\) on a design region \(\mathcal {X}_i=[-a,a]^2\), \(a>0\):

$$\begin{aligned} Y_{ijh}= {\varvec{\beta }}_{ij1} + {\varvec{\beta }}_{ij2}x_{ih1}+ {\varvec{\beta }}_{ij3}x_{ih2}+{\varvec{\beta }}_{ij4}x_{ih1}x_{ih2} + {\varvec{\beta }}_{ij5}x_{ih1}^2+ {\varvec{\beta }}_{ij6}x_{ih2}^2 + {\varvec{\varepsilon }}_{ijh} \end{aligned}$$
(30)

for \(j=1,\dots , n_i\), \(h=1,\dots , m_i\) and \(i=1, \dots , s\). For this model the left-hand sides of the optimality conditions (10) and (11) are quadric surfaces in \((x_1,x_2)\), whose projections onto \(x_1=0\) and \(x_2=0\) are polynomials of degree four. Then the only admissible support points for L- and D-optimal designs are \(x_{i1}=(a,a)\), \(x_{i2}=(a,-a)\), \(x_{i3}=(-a,a)\), \(x_{i4}=(-a,-a)\), \(x_{i5}=(o_{i1},a)\), \(x_{i6}=(o_{i2},-a)\), \(x_{i7}=(a,o_{i3})\), \(x_{i8}=(-a,o_{i4})\) and \(x_{i9}=(o_{i5},o_{i6})\), where \(o_{il}\in (-a,a)\), \(l=1, \dots , 6\). For the IMSE-criterion we use the same weighting measures as in Example 3.

Further we assume the following simple covariance structure of the random effects and the observational errors: \(\textbf{D}_i=\textrm{diag}(d_{i1}, \dots , d_{i6})\) and \(\Sigma _i=1\), \(i=1, \dots , s\). Then we consider the same group of transformations G as in the previous example and obtain \(\textbf{Q}_{g_1}=\textrm{diag}(1, -1, 1, -1, 1, 1)\), \(\textbf{Q}_{g_2}\) equal to the \(6\times 6\) identity matrix, \(\textbf{Q}_{g_3}=\textrm{diag}(1, 1, -1, -1, 1, 1)\) and \(\textbf{Q}_{g_4}=\textrm{diag}(1, -1, -1, 1, 1, 1)\). Then conditions (19) and (20) are satisfied, and the dispersion matrices \(\textbf{D}_i\) and the measures \(\nu _i\) are invariant for all \(g \in G\). Therefore, D-, A- and IMSE-optimal designs have the general form

$$\begin{aligned} \bar{\xi _i^*}= \left( \begin{array}{ccccccccc} x_{i1} &{} x_{i2} &{} x_{i3} &{} x_{i4} &{} x_{i5} &{} x_{i6} &{} x_{i7} &{} x_{i8} &{} x_{i9} \\ w_{i1}^* &{} w_{i1}^* &{} w_{i1}^* &{} w_{i1}^* &{} w_{i2}^* &{} w_{i2}^* &{} w_{i3}^* &{} w_{i3}^* &{} w_{i4}^*\end{array} \right) , \end{aligned}$$
(31)

where \(w_{i4}^*=1-4w_{i1}^*-2w_{i2}^*-2w_{i3}^*\) and all \(o_{il}=0\), i.e. \(x_{i5}=(0,a)\), \(x_{i6}=(0,-a)\), \(x_{i7}=(a,0)\), \(x_{i8}=(-a,0)\) and \(x_{i9}=(0,0)\). The weights of observations \(w_{i1}^*\), \(w_{i2}^*\) and \(w_{i3}^*\) depend on the choice of the design criterion and on the model parameters and have to be optimized.

Then we additionally assume the conditions \(d_{i2}=d_{i3}\) and \(d_{i5}=d_{i6}\) to be satisfied and consider the extended group of transformations \(G_1=G\cup \{ g_5 \}\) with \(g_5(x)=(x_2,x_1)^\top \), for which we obtain \(\textbf{Q}_{g_5}=\text {block-diag}(1, \textbf{P}, 1, \textbf{P})\), where \(\textbf{P}\) is the (\(2\times 2\)) permutation matrix:

$$\begin{aligned} \textbf{P}=\left( \begin{array}{cc} 0 &{} 1 \\ 1 &{} 0 \end{array}\right) . \end{aligned}$$

The dispersion matrices \(\textbf{D}_i\) and the measures \(\nu _i\) are also invariant with respect to \(g_5\) and conditions (19) and (20) are satisfied. Then the general form (31) of optimal designs simplifies to

$$\begin{aligned} \xi _i^*= \left( \begin{array}{ccccccccc} x_{i1} &{} x_{i2} &{} x_{i3} &{} x_{i4} &{} x_{i5} &{} x_{i6} &{} x_{i7} &{} x_{i8} &{} x_{i9} \\ w_{i1}^* &{} w_{i1}^* &{} w_{i1}^* &{} w_{i1}^* &{} w_{i2}^* &{} w_{i2}^* &{} w_{i2}^* &{} w_{i2}^* &{} 1-4(w_{i1}^*+w_{i2}^*)\end{array} \right) . \end{aligned}$$
(32)

Note that similar behavior has been established for optimal designs for Kiefer’s \(\Phi _p\)-criteria in fixed-effects models (see Galil and Kiefer 1977). However, the designs obtained in that work depend only on the choice of the design criterion. In the model under investigation, optimal designs may also depend on the variance parameters, the group sizes and the numbers of observations per observational unit.

4 Computing designs for multiple-group mixed models

In this Section, we will show how to compute efficient exact designs for model (1). To this end, let us discretize each (possibly continuous) experimental region \(\mathcal {X}_i\), \(i=1,\ldots ,s\), into \(k_i\) points \(x_{i1},\ldots ,x_{ik_i}\) and denote the corresponding numbers of measurements at these points by \(m_{i1},\ldots ,m_{ik_i}\in \mathbb {N}_0\), as is customary in optimal design algorithms. Similarly to the notation adopted in Sect. 2, we define the \(k_i\)-dimensional vectors \(m_i=(m_{i1},\ldots ,m_{ik_i})\) and the u-dimensional vector \(\textbf{m}=(m_1,\ldots ,m_s)\), where \(u=\sum _{i=1}^sk_i\).

Now, consider the optimization problem presented in Harman et al. (2016):

$$\begin{aligned} \left. \begin{array}{rl} \min _{\textbf{m}} &{} \Phi (\textbf{m}) \\ \mathrm{subject ~to} &{} \textbf{A}\textbf{m}\le \textbf{b}. \end{array}\right. \end{aligned}$$

Here, we minimize the function \(\Phi \) on the set of permissible designs determined by the linear inequalities \(\textbf{A}\textbf{m}\le \textbf{b}\), where \(\textbf{A}\in \mathbb {R}^{k\times u}\) and \(\textbf{b}\in \mathbb {R}^{k}\) are such that the elements of \(\textbf{A}\) are nonnegative and the elements of \(\textbf{b}\) are positive. Such constraints are called resource constraints: each measurement can be viewed as consuming some amount of each of the k resources, the limits on which are given by the vector \(\textbf{b}\).

The method described in Harman et al. (2016) is related to the Detmax procedure and employs a tabu search principle. The algorithm is based on excursions in the set of all feasible designs: from a design \(\xi \) we can either make a forward step to one of its upper neighbours or a backward step to one of its lower neighbours. These excursions are directed by the attribute of each design (which can be, e.g., its criterion value), a tabu list of the attributes of already visited designs and a local heuristic evaluation that roughly estimates how promising a design is as part of an excursion leading to an efficient design.
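The excursion principle can be illustrated on a toy problem. The sketch below is a heavily simplified, forward-only caricature and not the actual implementation of Harman et al. (2016): it repeatedly moves to the best feasible upper neighbour (one extra measurement at a candidate point) that is not yet on the tabu list, using the D-criterion of a small hypothetical straight-line model as the attribute. The real algorithm also takes backward steps and uses a more refined heuristic evaluation.

```python
import numpy as np

# Toy setting: straight-line model f(x) = (1, x)^T on the candidate
# points {-1, 0, 1}, with one resource constraint 1^T m <= 6 (at most
# six measurements in total). All values are illustrative only.
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]])  # candidate regressors
A = np.ones((1, 3))
b = np.array([6.0])

def attribute(m):
    """D-criterion value -log det M(m); +inf for singular designs."""
    M = X.T @ (X * m[:, None])
    sign, logdet = np.linalg.slogdet(M)
    return np.inf if sign <= 0 else -logdet

m = np.zeros(3)
tabu, best, best_val = set(), m.copy(), np.inf
while True:
    # upper neighbours of m: one extra measurement at each candidate point
    neighbours = [m + np.eye(3)[i] for i in range(3)]
    feasible = [c for c in neighbours
                if np.all(A @ c <= b) and tuple(c) not in tabu]
    if not feasible:
        break
    m = min(feasible, key=attribute)   # greedy forward step
    tabu.add(tuple(m))
    if attribute(m) < best_val:
        best, best_val = m.copy(), attribute(m)
```

In this toy example the excursion terminates in the exact D-optimal allocation, which places three measurements at each endpoint of the interval.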

Note that although the algorithm is primarily developed for D-optimality in the standard linear regression model, it can be easily adapted to other criteria that are monotonic on the set of all approximate designs, which enables us to compute D- and L-efficient exact designs in model (1).

To show this, we rewrite the covariance matrix (5) in the following form:

$$\begin{aligned} \text {Cov}_{\xi } = \left[ \left( \mathbbm {1}_{s}^{\top }\otimes \mathbb {I}_p\right) \left( \textbf{M}_{\xi }^{-1}+\textbf{D}\right) ^{-1}\left( \mathbbm {1}_s\otimes \mathbb {I}_p\right) \right] ^{-1}, \end{aligned}$$
(33)

where \(\textbf{M}_{\xi }=\text {diag}\left( \tilde{\textbf{M}}_1(\xi _1), \dots , \tilde{\textbf{M}}_s(\xi _s)\right) \) is the block-diagonal matrix with the blocks \(\tilde{\textbf{M}}_i(\xi _i)=n_i\,\textbf{M}_i(\xi _i)\), \(\textbf{D}=\text {diag}(\tilde{\textbf{D}}_1, \dots , \tilde{\textbf{D}}_s)\) is the block-diagonal matrix with the blocks \(\tilde{\textbf{D}}_i=\frac{1}{n_i}\textbf{D}_i\), \(\mathbbm {1}_s\) is the vector of length s with all entries equal to 1 and \(\otimes \) denotes the Kronecker product.
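For illustration, formula (33) can be evaluated numerically as follows; the sketch assumes \(s=2\) groups and \(p=2\) parameters, and the matrices \(\tilde{\textbf{M}}_i\) and \(\tilde{\textbf{D}}_i\) below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

# Numerical sketch of formula (33) with s = 2 groups and p = 2 parameters;
# the per-group blocks are hypothetical, positive definite placeholders.
s, p = 2, 2
M_tilde = [np.array([[2.0, 0.3], [0.3, 1.5]]),    # n_i * M_i(xi_i)
           np.array([[1.8, 0.1], [0.1, 2.2]])]
D_tilde = [0.5 * np.eye(p), 0.25 * np.eye(p)]     # D_i / n_i

# assemble the block-diagonal matrices M_xi and D
M_xi = np.zeros((s * p, s * p))
D = np.zeros((s * p, s * p))
for i in range(s):
    M_xi[i * p:(i + 1) * p, i * p:(i + 1) * p] = M_tilde[i]
    D[i * p:(i + 1) * p, i * p:(i + 1) * p] = D_tilde[i]

J = np.kron(np.ones((s, 1)), np.eye(p))           # (1_s ⊗ I_p), shape (s*p, p)
inner = np.linalg.inv(np.linalg.inv(M_xi) + D)    # (M_xi^{-1} + D)^{-1}
cov = np.linalg.inv(J.T @ inner @ J)              # Cov_xi from (33)
```

The resulting matrix is the \(p\times p\) covariance matrix of the estimator, symmetric and positive definite.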

Then the L- and D-criteria defined by (6) and (7) can be written as functions of the design \({\varvec{\xi }}\) in the following way:

$$\begin{aligned} \phi _{L}({\varvec{\xi }})=\textrm{tr}\left( \left[ \left( \mathbbm {1}_s^\top \otimes \mathbb {I}_p\right) \left( \textbf{M}_{\xi }^{-1}+\textbf{D}\right) ^{-1}\left( \mathbbm {1}_s\otimes \mathbb {I}_p\right) \right] ^{-1}\textbf{V}\right) \end{aligned}$$
(34)

and

$$\begin{aligned} \phi _{D}({\varvec{\xi }})=-\textrm{ln}\,\textrm{det}\left[ \left( \mathbbm {1}_s^\top \otimes \mathbb {I}_p\right) \left( \textbf{M}_{\xi }^{-1}+\textbf{D}\right) ^{-1}\left( \mathbbm {1}_s\otimes \mathbb {I}_p\right) \right] . \end{aligned}$$
(35)

Note that both criteria (34) and (35) are monotonically decreasing with respect to \(\textbf{M}_{\xi }\).
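A minimal numerical sketch of criteria (34) and (35), together with a check of this monotonicity; the matrices \(\textbf{M}_{\xi }\), \(\textbf{D}\) and \(\textbf{V}\) below are hypothetical illustrations.

```python
import numpy as np

def phi_L(M_xi, D, s, p, V):
    """L-criterion (34): tr(Cov_xi V)."""
    J = np.kron(np.ones((s, 1)), np.eye(p))
    inner = np.linalg.inv(np.linalg.inv(M_xi) + D)
    return np.trace(np.linalg.inv(J.T @ inner @ J) @ V)

def phi_D(M_xi, D, s, p):
    """D-criterion (35): -ln det of the inverse covariance matrix."""
    J = np.kron(np.ones((s, 1)), np.eye(p))
    inner = np.linalg.inv(np.linalg.inv(M_xi) + D)
    return -np.log(np.linalg.det(J.T @ inner @ J))

s, p = 2, 2
M = np.kron(np.eye(s), np.diag([2.0, 1.5]))   # block-diagonal M_xi (hypothetical)
D = 0.4 * np.eye(s * p)
V = np.eye(p)

# enlarging M_xi (more information) decreases both criterion values
assert phi_D(2 * M, D, s, p) < phi_D(M, D, s, p)
assert phi_L(2 * M, D, s, p, V) < phi_L(M, D, s, p, V)
```

It is this decrease under a Loewner-ordering increase of \(\textbf{M}_{\xi }\) that makes the algorithm applicable to both criteria.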

Further, model (1) can be viewed as a one-group model on \(\mathcal {X}=\times _{i=1}^s\mathcal {X}_i\) with marginal constraints (see, e.g., Cook and Thibodeau 1980) that restrict the number of observations in each group to \(\sum _{h=1}^{k_i}{m_{ih}}=m_i\). This can be formulated in the form of resource constraints by putting \(\textbf{A}=\textrm{diag}(\mathbbm {1}_{k_1}^\top , \dots , \mathbbm {1}_{k_s}^\top )\) and \(\textbf{b}=(m_1,\ldots ,m_s)^\top \).
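These marginal constraints can be assembled programmatically; in the following sketch, the grid sizes \(k=(4,5,3)\) and the group totals \(b=(10,20,40)\) are hypothetical values chosen only for illustration.

```python
import numpy as np

# Building the marginal constraints A m <= b of this section; grid sizes
# and group totals are hypothetical illustrations.
k = [4, 5, 3]                      # numbers of candidate points per group
b = np.array([10.0, 20.0, 40.0])   # maximal numbers of observations per group

u = sum(k)                         # total number of candidate points
A = np.zeros((len(k), u))
start = 0
for i, k_i in enumerate(k):
    A[i, start:start + k_i] = 1.0  # row i sums the measurement counts of group i
    start += k_i
```

Each row of \(\textbf{A}\) contains the indicator vector \(\mathbbm {1}_{k_i}^\top \) of one group's candidate points, so \(\textbf{A}\textbf{m}\) collects the group totals \(\sum _h m_{ih}\).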

Hence, the optimization problem to solve is

$$\begin{aligned} \left. \begin{array}{rl} \min _{\textbf{m}} &{} \phi (\textbf{m}) \\ \hbox {subject to} &{} \textrm{diag}(\mathbbm {1}_{k_1}^\top , \dots , \mathbbm {1}_{k_s}^\top )\textbf{m} \le \textbf{b}, \end{array}\right. \end{aligned}$$
(36)

where by \(\phi \) we denote either of the optimality criteria in (34) or (35).

Note that the algorithm used here is heuristic, i.e., it does not guarantee that the resulting design is optimal, although it is demonstrated in Harman et al. (2016) that the produced designs are usually highly efficient. Therefore, in the following sections, we refer to the designs obtained by the algorithm as efficient exact designs.

Further, we will demonstrate a feature of great practical use: the matrix \(\textbf{A}\) and the vector \(\textbf{b}\) can be modified to incorporate additional linear resource constraints on the weights, such as limits on the number of measurements at particular points or cost constraints (see Sect. 6 for an example of such constraints), simply by adding suitable rows to the matrix \(\textbf{A}\) and elements to the vector \(\textbf{b}\).

5 Bi-linear regression

Let us consider the bi-linear regression model (26) with three groups, \(\mathcal {X}_i=[-1,1]^2\), \(\Sigma _i=1\), \(n=(1,1,1)\) and

$$\begin{aligned} D_i=\begin{pmatrix} 1 &{} 0 &{} d\\ 0 &{} 1 &{} 0\\ d &{} 0 &{} 1 \end{pmatrix},\ i=1,2,3. \end{aligned}$$

As the analytical results in Example 3 show, the approximate optimal designs are supported on the four vertices of the square \([-1,1]^2\), with identical numbers of observations in the points (1, 1), \((-1,1)\) and in the points \((-1,-1)\), \((1,-1)\). This behavior was also confirmed by our algorithm for the exact D-efficient designs.

In this example, we numerically illustrate the dependence of the efficient designs on the parameter d in the matrices \(D_i\), \(i=1,2,3\). To this end, let us assume that the parameter d is the same in all three groups. Figure 1 shows how the numbers of observations in the point (1, 1) change as d varies from \(-1\) to 1 for two different settings: \(m=(10,20,40)\) (left) and \(m=(20,20,20)\) (right). We can see that in both cases, the number of observations in (1, 1) decreases with increasing d.

Fig. 1

The dependence of the number of observations in the point (1, 1) on the parameter d in the exact D-efficient designs in bilinear model (26) on \([-1,1]^2\) with total numbers of observations in the groups given by \(m=(10,20,40)\) (left) and \(m=(20,20,20)\) (right). The three lines denote the number of observations in the point (1, 1) normalized by \(m_i\) for the first (full line), second (dashed line) and third (dotted line) group

Now, suppose that

$$\begin{aligned} D_i=\begin{pmatrix} 1 &{} 0 &{} d_i\\ 0 &{} 1 &{} 0\\ d_i &{} 0 &{} 1 \end{pmatrix},\ i=1,2,3, \end{aligned}$$
(37)

where \(d_i\in \{-0.5, 0, 0.5\}\) are not necessarily the same across groups. In Table 1 we show the behavior of the numbers of observations for several selected combinations of \(d_1,d_2,d_3\) in the case \(m=(20,20,20)\).

Table 1 Exact D-efficient designs in bilinear model (26) on \([-1,1]^2\) with total numbers of observations in the groups given by \(m=(20,20,20)\) with \(d_i\in \{-0.5, 0, 0.5\}\)

6 Quadratic regression on a symmetric interval

Consider the two-group model of the form (1) with the regression functions \(\textbf{F}_{(i)}(x)=(1,x, x^2)^\top \), \(x\in \mathcal {X}_i\), and the design region \(\mathcal {X}_i=[-1,1]\), \(i=1,2\):

$$\begin{aligned} Y_{ijh}= {\varvec{\beta }}_{ij1} + {\varvec{\beta }}_{ij2}x_{ih}+ {\varvec{\beta }}_{ij3}x^2_{ih}+ {\varvec{\varepsilon }}_{ijh},\quad j=1,\dots , n_i,\quad h=1,\dots , m_i. \end{aligned}$$
(38)

The covariance structures of random effects and observational errors are given by \(\textbf{D}_i=\textrm{diag}(d_{i1},d_{i2},d_{i3})\) and \(\Sigma _i=1\) for both groups.

The corresponding approximate optimal group designs \(\xi _i\) are supported on the three design points \(-1,0,1\) (see Sect. 3):

$$\begin{aligned} \xi _i^*= \left( \begin{array}{ccc} -1 &{} 0 &{} 1 \\ w^*_{i1} &{} 1-2w^*_{i1} &{} w^*_{i1} \end{array} \right) . \end{aligned}$$
(39)

This result was heuristically confirmed to hold also for the exact designs: we discretized the design region into q points \(-1=x_1< x_2<\cdots <x_q=1\) and confirmed that for all cases considered below, the support points are indeed \(-1, 0\) and 1.

The exact D- and IMSE-efficient designs for this case and several particular \(m=(m_1,m_2)\) are given in Table 2 in the Appendix.

We can see that for the criterion of D-optimality and the values of the diagonal of the matrix \(D_i\) equal to either (1, 1, 1) or (1, 1, 0), half of the measurements is taken at the point 0 and the remaining half is split equally between the points \(-1\) and 1. For \((d_1,d_2,d_3)\) equal to either (0, 1, 1) or (1, 0, 0), the measurements are heavily concentrated in the support point 0, but the designs are still nonsingular. The case (1, 0, 1) shows the opposite phenomenon, with the measurements concentrated in the points \(-1\) and 1. For the remaining cases, the pattern is less clear and the weights depend more on m, sometimes even resulting in singular designs.

For the IMSE criterion, we obtain results identical to D-optimality if the diagonal of \(D_i\) is (1, 1, 1). For the remaining cases, the situation is more varied and we refer the reader to Table 2 for details.

From a practical point of view, it may not be desirable to have only three support points for each group. Therefore, we suggest additional constraints on the design, prescribing that, for each group, at most one half of the measurements can be taken at the points \(-1\), 0 and 1 in total. Formally, these constraints can be written in the form \(\textbf{A}^{(1)}w\le b^{(1)}\) (see Sect. 4 for details), where

$$\begin{aligned} \textbf{A}^{(1)}=\begin{pmatrix} c_q &{} 0^\top _q\\ 0^\top _q &{} c_q \end{pmatrix}\in \mathbb {R}^{2\times 2q},\ b^{(1)}=\begin{pmatrix} m_1/2\\ m_2/2\end{pmatrix}, \end{aligned}$$
(40)

where \(c_q=(1,0,\ldots ,0,1,0,\ldots ,0,1)\in \mathbb {R}^q\) has ones at the positions corresponding to the points \(-1,0,1\). The D- and IMSE-efficient designs for the discretization \(\mathcal {X}_i=\{-1, -0.8,\ldots , 0.8,1\}\) of the interval \([-1,1]\) with the step 0.2 (i.e., \(q=11\)) are given in Tables 3 and 4. Note that for both criteria, the tendency is to distribute the measurements as close as possible to the original support points \(-1,0\) and 1.
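A sketch of how the constraint matrix (40) can be assembled for this discretization; the group sizes \(m_1=20\), \(m_2=40\) are hypothetical values for illustration.

```python
import numpy as np

# Constraint (40) on the grid {-1, -0.8, ..., 0.8, 1} (q = 11): at most
# half of each group's measurements may fall on the points -1, 0 and 1.
q = 11
grid = np.round(np.linspace(-1.0, 1.0, q), 1)
c_q = np.isin(grid, [-1.0, 0.0, 1.0]).astype(float)   # ones at -1, 0, 1

m1, m2 = 20, 40                                        # hypothetical group sizes
A1 = np.vstack([np.concatenate([c_q, np.zeros(q)]),
                np.concatenate([np.zeros(q), c_q])])   # shape (2, 2q)
b1 = np.array([m1 / 2, m2 / 2])
```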

Another type of constraint often encountered in practice is the cost constraint: this is natural, for example, in clinical trials, where taking a measurement at a point x consumes a certain amount of time, personnel or material resources, and the total cost of the experiment is limited. In our case, let the measurement at the point x cost \(|x|+0.1\) units, and, for group j, let the maximum admissible cost be \(m_j/4\). This leads to adding the constraints \(\textbf{A}^{(2)}w\le b^{(2)}\) with the following \(\textbf{A}^{(2)}\), \(b^{(2)}\) to the problem (36):

$$\begin{aligned} \textbf{A}^{(2)}=\begin{pmatrix} u_q+0.1 \mathbbm {1}_q^\top &{} 0\\ 0 &{} u_q+0.1 \mathbbm {1}_q^\top \end{pmatrix}\in \mathbb {R}^{2\times 2q},\ b^{(2)}=\frac{1}{4}\begin{pmatrix}m_1\\ m_2\end{pmatrix}, \end{aligned}$$
(41)

where \(u_q=(1,0.8,\ldots ,0,\ldots ,0.8,1)\) is the vector of the absolute values \(|x|\) of the grid points.
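The cost constraint (41) can be assembled analogously; the group sizes \(m_1=20\), \(m_2=40\) are again hypothetical.

```python
import numpy as np

# Cost constraint (41): a measurement at x costs |x| + 0.1 and each
# group may spend at most m_j / 4 units.
q = 11
grid = np.round(np.linspace(-1.0, 1.0, q), 1)
cost = np.abs(grid) + 0.1                              # u_q + 0.1 * 1_q per point

m1, m2 = 20, 40                                        # hypothetical group sizes
A2 = np.vstack([np.concatenate([cost, np.zeros(q)]),
                np.concatenate([np.zeros(q), cost])])  # shape (2, 2q)
b2 = np.array([m1, m2]) / 4.0
```

Measurements at the endpoints are the most expensive (cost 1.1), while the point 0 is the cheapest (cost 0.1), which explains the shift of measurements towards 0 observed below.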

Again, we computed D- and IMSE-efficient designs with respect to this constraint for the discretization \(\mathcal {X}_i=\{-1, -0.8,\ldots , 0.8,1\}\). The designs are still supported on \(-1,0\) and 1, but, compared to the unconstrained designs, many more measurements are taken at the point 0, which is 'cheap'; the results are summarized in Table 5.

Finally, it is also possible to consider both types of constraints simultaneously, resulting in \(\textbf{A}^{(3)}w\le b^{(3)}\) with

$$\begin{aligned} \textbf{A}^{(3)}=\begin{pmatrix} \textbf{A}^{(1)}\\ \textbf{A}^{(2)} \end{pmatrix}\in \mathbb {R}^{4\times 2q},\ b^{(3)}=\begin{pmatrix} b^{(1)} \\ b^{(2)} \end{pmatrix}. \end{aligned}$$
(42)

The resulting D- and IMSE-efficient designs for this constraint are given in Tables 6 and 7.

Note that in some cases, the additional constraints on the designs became binding at a number of measurements lower than the maximum attainable number given by \((m_1,m_2)^\top \). This is demonstrated in more detail in Fig. 2, where we again consider the D-efficient design with \((m_1,m_2)=(20,40)\) and the cost constraints (41), but now the cost limit \(b^{(2)}\) can vary between 0 and \(m_i\) for the i-th group. All the designs are supported at the points \(-1, 0\) and 1, and the figure shows that when the maximum allowed cost is too low, the total number of measurements is (sometimes significantly) lower than the corresponding \(m_i\).

Fig. 2

The numbers of measurements in the point 0 (full line), \(-1\) (dashed line) and 1 (dot-dashed line) for the second group in the D-efficient design in model (38) with \((m_1,m_2)=(20,40)\) and constraints of the type (41) with the maximum cost \(b^{(2)}\) in the second group varying between 0 and 40

7 Discussion

In this paper, we have considered equi- and invariance properties of approximate optimal designs in multiple-group mixed models. We have used these properties to fix the support points and, consequently, to reduce the number of unknown variables in first- and second-order models on a symmetric square. As there is currently no universal computational tool for approximate designs in such models, these results can be used to determine optimal designs analytically in a few isolated, simple cases, as shown in the examples in Sect. 3.

However, from a practical point of view, it is more important to be able to compute efficient exact designs, possibly with additional constraints given by the experimental conditions. We have shown that a modified version of the algorithm of Harman et al. (2016) is a useful tool for such computations, even in cases with several nontrivial constraints on the design.

In the models considered here, the covariance matrix of the random effects is assumed to be known. A natural question is how to proceed when no prior knowledge about the variances and covariances is available. In this case, an estimate can be used; however, the quality of the obtained designs then depends on the accuracy of the estimation. For some particular structures of the covariance matrix, the optimal designs may turn out to be independent of the variance parameters (consider, for example, the compound symmetry structure in Prus and Piepho 2021).