Abstract
This paper is devoted to the construction of R-optimal designs in multiresponse linear models. The R-optimality criterion, introduced by Dette (J R Stat Soc Ser B 59:97–110, 1997), minimizes the volume of the Bonferroni rectangular confidence region for the parameter estimates. A generalization of Elfving’s theorem is proved for R-optimal designs, which yields a geometric characterization of such designs. The geometric characterization is illustrated by four examples.
References
Chernoff H (1999) Gustav Elfving’s impact on experimental design. Stat Sci 14:201–205
Dette H (1993) Elfving’s theorem for D-optimality. Ann Stat 21:753–766
Dette H (1996) A note on Bayesian c- and D-optimal designs in nonlinear regression models. Ann Stat 24:1225–1234
Dette H (1997) Designing experiments with respect to “standardized” optimality criteria. J R Stat Soc Ser B 59:97–110
Dette H, Holland-Letz T (2009) A geometric characterization of c-optimal designs for heteroscedastic regression. Ann Stat 37:4088–4103
Dette H, Studden WJ (1994) A geometric solution of the Bayesian E-optimal design problem. In: Gupta SS, Berger JO (eds) Statistical decision theory and related topics V. Springer, New York, NY
Dette H, Heiligers B, Studden WJ (1995) Minimax designs in linear regression models. Ann Stat 23:30–40
Elfving G (1952) Optimum allocation in linear regression theory. Ann Math Stat 23:255–262
Haines LM (1995) A geometric approach to optimal design for one-parameter non-linear models. J R Stat Soc Ser B 57:575–598
He L, Yue R-X (2017) R-optimal designs for multi-factor models with heteroscedastic errors. Metrika 80:717–732
Holland-Letz T, Dette H, Pepelyshev A (2011) A geometric characterization of optimal designs for regression models with correlated observations. J R Stat Soc Ser B 73:239–252
Huang M-NL, Chen RB, Lin CS, Wong WK (2006) Optimal designs for parallel models with correlated responses. Stat Sin 16:121–133
Kiefer J (1974) General equivalence theory for optimum designs (approximate theory). Ann Stat 2:849–879
Liu X, Yue R-X (2013) A note on R-optimal designs for multiresponse models. Metrika 76:483–493
Liu X, Yue R-X, Lin DKJ (2013) Optimal design for prediction in multiresponse linear models based on rectangular confidence region. J Stat Plan Inference 143:1954–1967
Liu X, Yue R-X, Chatterjee K (2014a) A note on R-optimal designs for multi-factor models. J Stat Plan Inference 146:139–144
Liu X, Yue R-X, Chatterjee K (2014b) R-optimal designs in random coefficient regression models. Stat Probab Lett 88:127–132
Liu X, Yue R-X, Chatterjee K (2016) Algorithmic construction of R-optimal designs for second-order response surface models. J Stat Plan Inference 178:61–69
Pukelsheim F (1993) Optimal design of experiments. Wiley, New York
Studden WJ (1971) Elfving’s theorem and optimal designs for quadratic loss. Ann Math Stat 42:1613–1621
Studden WJ (2005) Elfving’s theorem revisited. J Stat Plan Inference 130:85–94
This work was supported by NSFC Grant 11871143.
Appendix
1.1 Proof of Theorem 2
We prove the theorem using arguments similar to those in Dette (1993) and invoking Theorem 1.
Let \(\xi ^*=\bigg \{\begin{array}{c} {\varvec{x}}_v \\ w_v \end{array} \bigg \}_{v=1}^s\) denote an R-optimal design for the model (1). By Theorem 1 we have
for all \({\varvec{x}}\in \mathcal {X}\), and
for all \(v=1,\ldots ,s\). Letting \(\gamma _{i}^{-2}=p{\varvec{e}}_{i}^T{\varvec{M}}^{-1}(\xi ^*){\varvec{e}}_{i}, i=1, \ldots , p\) and \({\varvec{D}}={\varvec{M}}^{-1}(\xi ^*){\varvec{\Gamma }}\), it follows that
where \({\varvec{K}}_{v}={\varvec{\Sigma }}^{-1/2}{\varvec{F}}({\varvec{x}}_v){\varvec{D}}\), \(v=1,\ldots ,s\). This proves the representation given in (a). Moreover, from these definitions and the fact in (19) we have
From the inequality (18) and the Cauchy–Schwarz inequality we get
for all \({\varvec{x}}\in \mathcal {X}\), whenever \({\varvec{K}}\) satisfies \(\parallel {\varvec{K}}\parallel =1\). Observing (20), it is now easy to see that \({\varvec{\Gamma }}\) is a boundary point of the set \(\mathcal {R}_{p,r}\) with supporting hyperplane defined by \({\varvec{D}}\), which proves (b). Finally, condition (c) follows readily from the definitions of \({\varvec{\Gamma }}\) and \({\varvec{D}}\).
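For clarity, the Cauchy–Schwarz step above is the standard inequality for the trace (Frobenius) inner product; a sketch of the bound in the proof’s notation, assuming \(\Vert {\varvec{K}}\Vert =\sqrt{\text{ tr }\{{\varvec{K}}^T{\varvec{K}}\}}\) as used throughout, reads:

```latex
% Cauchy--Schwarz for the trace inner product <A,B> = tr(A^T B):
%   tr(A^T B) <= ||A|| ||B||,  with  ||A|| = (tr(A^T A))^{1/2}.
% Applied with A = \Sigma^{-1/2} F(x) D and B = K, where ||K|| = 1:
\mathrm{tr}\bigl\{{\varvec{D}}^T{\varvec{F}}^T({\varvec{x}})
  {\varvec{\Sigma}}^{-1/2}{\varvec{K}}\bigr\}
\;\le\;
\sqrt{\mathrm{tr}\bigl\{{\varvec{D}}^T{\varvec{F}}^T({\varvec{x}})
  {\varvec{\Sigma}}^{-1}{\varvec{F}}({\varvec{x}}){\varvec{D}}\bigr\}}
\;\Vert {\varvec{K}}\Vert .
```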
To prove sufficiency, we let \({\varvec{D}}\in \mathbb {R}^{p\times p}\) denote a supporting hyperplane to the set \(\mathcal {R}_{p,r}\) at the boundary point \({\varvec{\Gamma }}\). Thus we have for all \({\varvec{x}}\in \mathcal {X}\) and \({\varvec{K}}\) satisfying \(\parallel {\varvec{K}}\parallel =1,\)
Defining \({\varvec{K}}({\varvec{x}})={\varvec{\Sigma }}^{-1/2}{\varvec{F}}({\varvec{x}}){\varvec{D}}/\sqrt{\text{ tr }\{{\varvec{D}}^T{\varvec{F}}^T({\varvec{x}}){\varvec{\Sigma }}^{-1}{\varvec{F}}({\varvec{x}}){\varvec{D}}\}}\), we observe from (21) that
Because \({\varvec{D}}\) is a supporting hyperplane to \(\mathcal {R}_{p,r}\) at the boundary point \({\varvec{\Gamma }}\) we obtain from (21) (used at \({\varvec{x}}={\varvec{x}}_v\)) and the representation (a)
and this implies \(\text{ tr }\left\{ {\varvec{D}}^T{\varvec{F}}^T({\varvec{x}}_v){\varvec{\Sigma }}^{-1/2}{\varvec{K}}_{v}\right\} =1, v=1,\ldots ,s\). By an application of the Cauchy–Schwarz inequality we now get for \(v=1,\ldots ,s\)
where the last inequality results from (22). Therefore, we have \({\varvec{K}}_{v}=\lambda _v{\varvec{\Sigma }}^{-1/2}{\varvec{F}}({\varvec{x}}_v){\varvec{D}}\) for some \(\lambda _v \in \mathbb {R}, v=1,\ldots ,s.\) From the normalizing conditions on the \({\varvec{K}}_{v}\) we thus obtain
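The proportionality conclusion drawn here rests on the equality case of the same inequality, a standard fact stated in the proof’s notation:

```latex
% Equality case of Cauchy--Schwarz for the trace inner product:
% for A \ne 0,
\mathrm{tr}\{{\varvec{A}}^T{\varvec{B}}\}=\Vert {\varvec{A}}\Vert\,\Vert {\varvec{B}}\Vert
\;\Longleftrightarrow\;
{\varvec{B}}=\lambda\,{\varvec{A}}\ \text{for some scalar }\lambda\ge 0;
% here A = \Sigma^{-1/2} F(x_v) D and B = K_v, which forces
% K_v = \lambda_v \Sigma^{-1/2} F(x_v) D.
```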
On the other hand, we have from the property that \({\varvec{\Gamma }}\) is a boundary point of \(\mathcal {R}_{p,r}\) with supporting hyperplane \({\varvec{D}}\)
From (24), and noting that \(w_v> 0\) and \(\sum _{v=1}^{s}w_v=1\), it follows that \(\lambda _v=1\), which implies \({\varvec{K}}_{v}={\varvec{\Sigma }}^{-1/2}{\varvec{F}}({\varvec{x}}_v){\varvec{D}}\), \(v=1,\ldots ,s.\) From this representation we finally obtain
Observing condition (c), it follows that
and the inequality (22) yields
for all \({\varvec{x}}\in \mathcal {X}\). By Theorem 1 it then immediately follows that the design \(\xi ^*\) is R-optimal for the model (1), which completes the proof of Theorem 2.
1.2 Proof of the result in (15)
Let \(\mathcal {S}\) denote the set on the right-hand side of (15), i.e.,
We first show that \(\mathcal {S}\) is convex and then show that \(\mathcal {S}=\text {conv}(\mathcal {H})\) where \(\mathcal {H}\) is defined by (14).
For any \({\varvec{U}}=\left( \begin{array}{cc} u_1 &\quad u_2\\ u_3 &\quad u_4\end{array} \right) \in \mathcal {S}\) and \({\varvec{V}}=\left( \begin{array}{cc} v_1 &\quad v_2\\ v_3 &\quad v_4\end{array} \right) \in \mathcal {S}\), we have
and
Thus, for any \(t\in [0,1]\), \({\varvec{W}}=\left( \begin{array}{cc} w_1 &\quad w_2\\ w_3 &\quad w_4\end{array} \right) =(1-t){\varvec{U}}+t{\varvec{V}}\in \mathcal {S}\), which follows from
Therefore, we conclude that \(\mathcal {S}\) is a convex set. Next, we prove that \(\text {conv}(\mathcal {H})=\mathcal {S}\).
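The entrywise mechanism behind this convexity check is simply the triangle inequality; assuming, as the displays above suggest, that the defining constraints of \(\mathcal {S}\) in (15) are monotone convex functions of the absolute values of the entries, the key estimate is:

```latex
|w_j| = |(1-t)\,u_j + t\,v_j| \;\le\; (1-t)\,|u_j| + t\,|v_j|,
\qquad j=1,2,3,4,\ t\in[0,1],
% so each constraint satisfied by both U and V is inherited by
% W = (1-t)U + tV.
```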
Write \({\varvec{\epsilon }}=(\epsilon _{1}, \epsilon _{2})^T.\) Then
Define
and
Note that \(\mathcal {H}_1\) and \(\mathcal {H}_2\) are the subsets of \(\mathcal {H}\) corresponding to \(x=0\) and \(x=1\), respectively. Moreover, it is easy to check that
and
We now prove the two inclusions \(\text {conv}(\mathcal {H})\subset \mathcal {S}\) and \(\mathcal {S}\subset \text {conv}(\mathcal {H})\). First, for any \({\varvec{H}}=\left( \begin{array}{cc} h_1 &\quad h_2\\ h_3 &\quad h_4\end{array}\right) \in \mathcal {H}\), we have
which implies \({\varvec{H}}\in \mathcal {S}\). Hence \(\mathcal {H}\subset \mathcal {S}\) and therefore \(\text {conv}(\mathcal {H})\subset \text {conv}(\mathcal {S})=\mathcal {S}\). Secondly, we will prove that
- (i)
\(\mathcal {S}\subset \text {conv}\left( \text {conv}(\mathcal {H}_1)\cup \text {conv}(\mathcal {H}_2)\right) \), and
- (ii)
\(\text {conv}\left( \text {conv}(\mathcal {H}_1)\cup \text {conv}(\mathcal {H}_2)\right) \subset \text {conv}(\mathcal {H})\).
Let \({\varvec{H}}=\left( \begin{array}{cc} h_1 &\quad h_2\\ h_3 &\quad h_4\end{array}\right) \) be any element of \(\mathcal {S}\). Obviously, if \({\varvec{H}}\) is the null matrix then \({\varvec{H}}\in \mathcal {H}\). For non-null \({\varvec{H}}\), we let
and then we can rewrite \({\varvec{H}}\) as
Note that
and
Thus \({\varvec{U}}\in \text {conv}(\mathcal {H}_1)\) and \({\varvec{V}}\in \text {conv}(\mathcal {H}_2)\). This means that \({\varvec{H}}\in \text {conv}\left( \text {conv}(\mathcal {H}_1)\cup \text {conv}(\mathcal {H}_2)\right) \) and hence (i) holds.
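The decomposition used for (i) is the standard representation of the convex hull of a union of two convex sets:

```latex
% For convex sets C_1 and C_2,
\mathrm{conv}(C_1\cup C_2)
=\bigl\{(1-t)\,{\varvec{U}}+t\,{\varvec{V}}:\;
  {\varvec{U}}\in C_1,\ {\varvec{V}}\in C_2,\ t\in[0,1]\bigr\},
% so writing H = (1-t)U + tV with U in conv(H_1) and V in conv(H_2)
% exhibits H as an element of conv(conv(H_1) \cup conv(H_2)).
```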
As for (ii), noting that \(\mathcal {H}_1\subset \mathcal {H}\) and \(\mathcal {H}_2\subset \mathcal {H}\), it immediately follows that \(\text {conv}(\mathcal {H}_1)\subset \text {conv}(\mathcal {H})\) and \(\text {conv}(\mathcal {H}_2)\subset \text {conv}(\mathcal {H}).\) We then have \((\text {conv}(\mathcal {H}_1)\cup \text {conv}(\mathcal {H}_2))\subset \text {conv}(\mathcal {H})\), which implies that \(\text {conv}\left( \text {conv}(\mathcal {H}_1)\cup \text {conv}(\mathcal {H}_2)\right) \subset \text {conv}(\text {conv}(\mathcal {H}))=\text {conv}(\mathcal {H})\). This completes the proof of the result in (15).
Liu, X., Yue, RX. Elfving’s theorem for R-optimality of experimental designs. Metrika 83, 485–498 (2020). https://doi.org/10.1007/s00184-019-00728-3