An equivalence theorem for design optimality with respect to a multi-objective criterion

Maxi-min efficiency criteria are a kind of multi-objective criteria, since they enable us to take into consideration several tasks expressed by different component-wise criteria. However, they are difficult to manage because of their lack of differentiability. As a consequence, maxi-min efficiency designs are frequently built through heuristic and ad hoc algorithms, without the possibility of checking for their optimality. The main contribution of this study is to prove that the maxi-min efficiency optimality is equivalent to a Bayesian criterion, which is differentiable. In addition, we provide an analytic method to find the prior probability associated with a maxi-min efficient design, making feasible the application of the equivalence theorem. Two illustrative examples show how the proposed theory works.


Introduction
In this study, we aim at solving a multi-objective optimization problem that consists in maximizing a minimum design efficiency. In the optimal design literature, several approaches can be classified as maxi-min efficiency criteria. The standardized max-min criterion, introduced to tackle the problem of parameter uncertainty, is the most common. This issue, however, is not considered in this work because it has already been studied extensively (see, for instance, Chen et al. 2015; Dette and Biedermann 2003; Nyquist 2013; Fackle-Fornius et al. 2015; Dette et al. 2007, among others); furthermore, parameter uncertainty is not easily interpretable as a multi-task problem. In contrast, examples of maxi-min efficiency criteria that can be interpreted as multi-objective problems are: the SMV-criterion (proposed by Dette (1997)), which aims at an accurate estimation of each of the model parameters while taking into account their different scales (see also López-Fidalgo and Tommasi 2004 and the references therein), and the extensions of the T- and KL-criteria (proposed by Atkinson and Fedorov (1975) and Tommasi et al. (2016), respectively) to handle the problem of model uncertainty. Another interesting application might be the identification of a single optimal design for model identification, precise parameter estimation and accurate prediction. This multiple objective could be achieved by maximizing the minimum efficiency of three criteria reflecting these three distinct goals.
The maxi-min approach arises naturally when we wish to protect against the worst-case scenario; however, the corresponding optimal design (the maxi-min efficiency design) is difficult to compute because this criterion is not differentiable. Consequently, a standard directional derivative argument cannot be applied to check whether a given design is optimal, because the directional derivative involves an unknown measure; see, for instance, Wong (1992) and Atkinson and Fedorov (1975).
In addition, the construction of the maxi-min efficiency design is far from straightforward. Frequently, it is found numerically by some algorithm, but there is then no way to prove that the returned design really is the optimum.
The main contribution of this study is to prove the equivalence between the maxi-min efficiency approach and the Bayesian criterion for a specific prior; the latter criterion is differentiable. Hence, the directional derivative of the Bayesian criterion can be used to check for minimum-efficiency optimality. Note that the Bayesian criterion is another kind of multi-objective optimality function, being a convex combination of the different component-wise criteria. The connection between maxi-min efficiency and Bayesian optimum designs has already been explored by other authors; see, for instance, Schervish (1995), Müller and Pazman (1998) and Dette et al. (2007). Other versions of the equivalence theorem can be found, but they are specialized to specific problems; for instance, Dette and Biedermann (2003) and Berger et al. (2000) consider parameter uncertainty in a non-linear model and the D-criterion.
In this study, we prove a more general version of the equivalence theorem, which covers any multi-objective problem that can be expressed as a minimum design efficiency (for any component-wise criteria). Furthermore, following ideas similar to those in Chen et al. (2017), we provide a method to determine the prior probability that matches the maxi-min efficiency criterion with the Bayesian optimality; this makes the application of the equivalence theorem feasible.
The paper is organized as follows. In Sect. 2, we recall some background and the notation used. In Sect. 3, we state the equivalence theorem and the rule to determine the prior probability that makes the minimum efficiency and the Bayesian criteria equivalent. Section 4 concerns a pair of illustrative examples. Section 5 provides some conclusions, and Appendix A contains the proofs of the theoretical results.

Background and notation
In this section, we introduce the main ideas of optimal experimental design and the notation used in what follows.
Let us assume that f(y, x, θ) is a statistical model that describes the response Y at the experimental condition x, which may be chosen in a compact set X; here θ ∈ Θ ⊆ IR^p denotes a p × 1 parameter vector.
An approximate design is a probability measure on the design space X with finite support, i.e.

ξ = { x_1, …, x_r ; ξ(x_1), …, ξ(x_r) },   Σ_{i=1}^r ξ(x_i) = 1,

where ξ(x_i) ≈ n_i/n, and n_i is the number of observations to be taken at the experimental condition x_i, i = 1, …, r. The aim is to find a design ξ*_θ maximizing (minimizing) a concave (convex) optimality criterion function Φ(ξ; θ), defined from the space Ξ of all designs to the real line. This means that an optimal design ξ*_θ may be found according to several criteria reflecting different inferential goals: parameter estimation, prediction or model discrimination. Many optimality criteria for the precise estimation of θ are concave (or convex) functions of the information matrix of a design ξ ∈ Ξ, i.e.

M(ξ; θ) = Σ_{i=1}^r ξ(x_i) g(x_i, θ) g(x_i, θ)^T,   (1)

where g(x, θ) denotes the gradient, with respect to θ, of the mean response at x.
If Φ(ξ; θ) is a non-negative concave function, then a measure of the goodness of a design ξ with respect to the optimal design ξ*_θ is the following efficiency function:

eff(ξ; θ) = Φ(ξ; θ) / Φ(ξ*_θ; θ) ∈ [0, 1].   (2)

If Φ(ξ; θ) is convex, then the ratio on the right-hand side of Eq. (2) is inverted, i.e. eff(ξ; θ) = Φ(ξ*_θ; θ) / Φ(ξ; θ).
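The efficiency of a design can be illustrated with a small numerical sketch. The example below is hypothetical and not taken from the paper: it uses a simple linear model E[y] = θ_1 + θ_2 x on X = [-1, 1] with the (concave) log-det D-criterion, for which it is well known that the optimal design puts weight 1/2 at each endpoint; the helper names are ours.

```python
import numpy as np

# Toy information matrix for E[y] = theta_1 + theta_2 * x, whose
# "gradient" f(x) = (1, x)^T does not depend on theta.
def info_matrix(points, weights):
    M = np.zeros((2, 2))
    for x, w in zip(points, weights):
        f = np.array([1.0, x])
        M += w * np.outer(f, f)   # weighted sum of rank-one terms
    return M

def d_criterion(points, weights):
    # log det M: a concave criterion for precise estimation of theta
    return np.log(np.linalg.det(info_matrix(points, weights)))

# D-optimal design on X = [-1, 1]: mass 1/2 at each endpoint.
opt = d_criterion([-1.0, 1.0], [0.5, 0.5])

# D-efficiency of a three-point competitor, scale-adjusted by 1/p with p = 2:
# (det M(xi) / det M(xi*))^(1/p), a number in [0, 1].
comp = info_matrix([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3])
eff = (np.linalg.det(comp)
       / np.linalg.det(info_matrix([-1.0, 1.0], [0.5, 0.5]))) ** 0.5
```

Here the competitor spreads mass over three points and loses efficiency relative to the two-point optimum.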

Minimum efficiency and pseudo-Bayesian criteria
Let Φ_i(ξ; θ_i), i = 1, …, k, be k different concave optimality criteria that reflect distinct goals and possibly depend on some unknown parameter vector θ_i. Let θ_{0i} be a guessed value for θ_i; thus, ξ*_i = ξ*_{i;θ_{0i}} = arg max_{ξ∈Ξ} Φ_i(ξ; θ_{0i}) are local optimum designs. When we are interested in a compromise design that is 'good' for all the different criteria, we need to combine Φ_i(ξ; θ_i), i = 1, …, k, into a multi-objective criterion. To this aim, as suggested by Dette (1997), we should first standardize the criteria Φ_i(ξ) = Φ_i(ξ; θ_{0i}), obtaining their efficiency functions

eff_i(ξ) = Φ_i(ξ) / Φ_i(ξ*_i),   i = 1, …, k.

An easy way of combining the standardized criteria is through a linear combination. If we have some prior knowledge about the criteria Φ_i(ξ), i = 1, …, k, we might compute a Bayesian optimum design maximizing the following criterion:

Φ_B(ξ; π) = Σ_{i=1}^k π_i eff_i(ξ),   (3)

where π^T = (π_1, …, π_k) is a prior probability on the set {1, …, k}. For an application of this criterion, see for instance Tommasi and López-Fidalgo (2010).
It is easy to prove that ξ*_π is a Bayesian optimal design if and only if it satisfies the following inequality:

Σ_{i=1}^k π_i ψ_i(x, ξ*_π) ≤ 0   for all x ∈ X,   (4)

where ψ_i(x, ξ) denotes the directional derivative of eff_i(ξ) in the direction of the one-point design ξ_x, with equality in (4) at the support points of ξ*_π.
When we are unable to provide a prior distribution π, another possibility to take into consideration all the objectives represented by the k different criteria is the following minimum efficiency criterion:

Φ(ξ) = min_{1≤i≤k} eff_i(ξ).   (5)

This multi-objective optimality function, differently from the previous one, is not differentiable, and thus the computation of Φ-optimal designs is not straightforward at all.
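The contrast between the two multi-objective functions can be made concrete with a minimal numerical sketch; the efficiency values and the prior below are made up for illustration and are not those of the paper.

```python
import numpy as np

# Smooth pseudo-Bayesian combination: a convex combination of efficiencies.
def bayesian_criterion(effs, pi):
    return float(np.dot(pi, effs))

# Non-differentiable minimum-efficiency criterion: the worst efficiency.
def min_efficiency(effs):
    return float(np.min(effs))

effs = np.array([0.57, 0.48, 0.59, 0.48])   # hypothetical eff_i(xi)
pi = np.array([0.0, 0.5, 0.05, 0.45])       # hypothetical prior on the criteria
phi_b = bayesian_criterion(effs, pi)
phi_min = min_efficiency(effs)
```

Since the Bayesian criterion averages the efficiencies, it always dominates the minimum-efficiency value for the same design, which is part of the intuition behind the equivalence result of Sect. 3.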
A design ξ* is a maxi-min efficiency design if and only if

min_{1≤i≤k} eff_i(ξ*) = max_{ξ∈Ξ} min_{1≤i≤k} eff_i(ξ).
From the last equation, ξ* is also the design that minimizes the maximum inefficiency optimality criterion:

Φ^{-1}(ξ) = max_{1≤i≤k} eff_i(ξ)^{-1}.   (6)

We find maxi-min efficiency designs by minimizing Φ^{-1}(ξ), for which we can state the following propositions; the proof of the first is straightforward,

where e_i denotes the i-th canonical vector of the Euclidean space IR^k. The proof of the second proposition is deferred to Appendix A.

Equivalence theorem
Bayesian optimum designs are usually found by applying standard algorithms, because the equivalence inequality (4) is completely known. Maxi-min efficiency designs are difficult to determine because the criterion is not differentiable (the corresponding equivalence inequality depends on an unknown measure); see, for instance, Wong (1992). See also Chen et al. (2017) and Dette and Biedermann (2003) for an equivalence theorem for the standardized max-min D-optimal design criterion. In this section, we provide a new formulation of the equivalence theorem, which establishes a connection between Φ_B(ξ; π) and Φ(ξ).

Theorem 3 (Equivalence Theorem) A design ξ* is a maxi-min efficiency design if and only if there exists a probability distribution π* on the index set

I(ξ*) = { i : eff_i(ξ*) = min_{1≤j≤k} eff_j(ξ*) }   (7)

such that ξ* is a Bayesian optimum design for the prior distribution π*, that is, if and only if ξ* fulfils the following inequality:

Σ_{i∈I(ξ*)} π*_i ψ_i(x, ξ*) ≤ 0   for all x ∈ X,   (8)

where ψ_i(x, ξ) is the directional derivative of eff_i(ξ) in the direction of the one-point design ξ_x. The detailed proof of the equivalence theorem is deferred to Appendix A. In addition, from (5), we can state the following corollary: the equivalence between the minimum efficiency and the Bayesian optimality criteria can be used to check whether a design is optimal with respect to criterion (6). Recently, several algorithms have been applied to construct optimal designs numerically; see, for instance, applications of the Nelder-Mead algorithm, of particle swarm optimization (Chen et al. 2015, 2020), or of a semi-infinite programming based algorithm (Belmiro et al. 2015). These algorithms provide a solution based on a suitable stopping rule; however, it is necessary to check the equivalence inequality to prove that an 'optimum' has been reached. We follow the same idea as in Chen et al. (2017, page 87). Given a solution ξ*_s of a numerical procedure, from the equivalence inequality (8) with ξ*_s in place of ξ*, we can compute the prior distribution π* by solving the minimization problem

min_π Σ_{x∈S_{ξ*_s}} ( Σ_{i∈I(ξ*_s)} π_i ψ_i(x, ξ*_s) )^2,   (9)

where S_{ξ*_s} denotes the support of ξ*_s and I(ξ*_s) is the set defined in (7) with ξ* replaced by ξ*_s. Equation (9) comes from the equivalence theorem, by which the weighted sum of the component-wise criteria's derivatives must be zero at each support point of the optimal design; thus, the weights can be chosen by minimizing the sum of squares of these expressions over all the support points.
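The minimization problem (9) is a constrained least-squares fit over the probability simplex and can be sketched in a few lines. The code below is a generic illustration, not the authors' implementation: the matrix D of directional derivatives at the support points is made up, and the helper name fit_prior is ours.

```python
import numpy as np
from scipy.optimize import minimize

# Given D[j, i] = psi_i(x_j, xi_s*), the directional derivative of the i-th
# efficiency at the j-th support point of a candidate design, find the prior
# pi on the active index set that makes the weighted sums closest to zero
# in the least-squares sense, subject to pi >= 0 and sum(pi) = 1.
def fit_prior(D):
    k = D.shape[1]
    obj = lambda pi: float(np.sum((D @ pi) ** 2))          # sum over support points
    cons = ({'type': 'eq', 'fun': lambda pi: np.sum(pi) - 1.0},)
    bounds = [(0.0, 1.0)] * k
    res = minimize(obj, np.full(k, 1.0 / k), bounds=bounds, constraints=cons)
    return res.x, res.fun

# Hypothetical derivatives at 3 support points for 2 active criteria; the
# rows cancel exactly for pi = (0.5, 0.5), so the attained minimum is ~0.
D = np.array([[ 1.0, -1.0],
              [ 2.0, -2.0],
              [-0.5,  0.5]])
pi_star, sse = fit_prior(D)
```

When the attained minimum is (numerically) zero, as here, the candidate design passes the necessary condition and the fitted weights can then be plugged into inequality (8); when it stays far from zero, as in the first candidate of Sect. 4.1, the design cannot be Bayesian optimal.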
Given a design ξ*_s, using the solution of (9) we can check whether ξ*_s really is an optimal design by evaluating the equivalence inequality (8).
Remark 1 At the optimal design, the value of (9) should be zero (up to rounding approximations).
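In practice, the check of inequality (8) is carried out on a grid over X. The sketch below is again a hypothetical stand-in: the function weighted_derivative plays the role of the weighted sum of directional derivatives for some fitted π*, and is chosen so that the check passes with support points at -1 and 1.

```python
import numpy as np

# Made-up stand-in for sum_i pi_i* psi_i(x, xi*) on X = [-1, 1]; it is
# nonpositive everywhere and vanishes exactly at the assumed support points.
def weighted_derivative(x):
    return -(x ** 2 - 1.0) ** 2

grid = np.linspace(-1.0, 1.0, 2001)
values = weighted_derivative(grid)

# Inequality (8): the weighted derivative must be <= 0 on the whole grid ...
is_optimal = bool(np.max(values) <= 1e-9)

# ... with equality at the support points of the candidate design.
support = [-1.0, 1.0]
at_support = [abs(weighted_derivative(x)) < 1e-9 for x in support]
```

A design returned by an algorithm is accepted only when both conditions hold within numerical tolerance; otherwise the grid point where the inequality fails indicates where mass should be moved.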

Illustrative examples
The first example of this section underlines the difficulty of finding a maxi-min efficiency design when the search is done step-by-step by comparing the k efficiencies. This leads to the conclusion that suitable optimization algorithms should be applied, and their numerical solutions should then be checked for optimality through the equivalence inequality (8). This procedure is followed in Example 4.2.

SMV-optimum designs in biology immunoassays
In biology, immunoassays are usually performed to quantify the concentration of an analyte. In this example, the SMV-optimality criterion is applied to the four-parameter logistic model, which is the most frequently used model for symmetric immunoassay data,

y = η(x, θ) + ε,   (10)

where y is the response at the concentration x, η(x, θ) is the four-parameter logistic mean function, ε ∼ N(0, σ^2) is a random error, and θ_1 > 0, θ_2 > 0, θ_3 ∈ IR and θ_4 > 0 are unknown parameters. The SMV-optimality criterion, proposed by Dette (1997), is an example of the maximum inefficiency criterion (6), where k = 4 is the dimension of θ = (θ_1, θ_2, θ_3, θ_4); θ_0 is a guessed value for θ, and Φ_i(ξ) is given by

Φ_i(ξ) = ( e_i^T M(ξ, θ_0)^{-1} e_i )^{-1},

where M(ξ, θ) is the information matrix (1) for model (10), and e_i, i = 1, …, 4, are the canonical basis vectors of IR^4. In this example, X = [0, 5], θ_0 = (1, 2, 1, 1), and the gradient in (1) is the vector of partial derivatives of η(x, θ) with respect to θ, where the third component has been slightly modified for computational reasons.
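The SMV components can be computed directly as reciprocals of the diagonal entries of the inverse information matrix. The following toy sketch assumes that reading of the criterion (each Φ_i is the reciprocal variance of the i-th parameter estimate) and, to stay self-contained, uses a two-parameter linear model rather than the four-parameter logistic of this example, whose gradient is not reproduced here.

```python
import numpy as np

# Toy information matrix for a two-parameter model with gradient f(x) = (1, x)^T.
def info_matrix(points, weights):
    M = np.zeros((2, 2))
    for x, w in zip(points, weights):
        f = np.array([1.0, x])
        M += w * np.outer(f, f)
    return M

# SMV component criteria: Phi_i(xi) = 1 / (e_i^T M^{-1} e_i), i.e. the
# reciprocal of the i-th diagonal entry of the inverse information matrix.
def smv_components(points, weights):
    Minv = np.linalg.inv(info_matrix(points, weights))
    return np.array([1.0 / Minv[i, i] for i in range(Minv.shape[0])])

# Two-point design on X = [0, 5] with equal weights (illustrative values).
phi = smv_components([0.0, 5.0], [0.5, 0.5])
```

Dividing each component by its value at the corresponding locally optimal design gives the standardized efficiencies that enter criterion (6); standardization is what puts parameters on different scales on an equal footing.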
The procedure followed to find the optimal design is quite cumbersome, but the prior probabilities which solve (9) enable us to identify the right maxi-min efficiency design. First, we search for designs that have the same efficiencies for any pair of the indices. Let I = {i_1, …, i_l}, with l = 2, …, k, be an index set. For instance, for I = {2, 4} we obtain a design ξ^{(2,4)}_3 giving the same efficiency, 0.4778, for indices 2 and 4, which is smaller than the other efficiencies, 0.5734 and 0.5879 (for indices 1 and 3, respectively); therefore, ξ^{(2,4)}_3 is a candidate design for the Bayesian optimality. To prove that, it is necessary to identify a prior distribution on I, π = {π_2, π_4}, such that ξ^{(2,4)}_3 is Bayesian optimal for π. To find such a distribution, we employ condition (9). The weights minimizing this expression are π_2 = 0.608 and π_4 = 1 − π_2, but the minimum value attained at these weights is 3.547, which is far from zero. Thus, this design cannot be Bayesian optimal (nor maxi-min efficiency optimal).
We obtain similar results with every pair (i_1, i_2) of indices. Thus, the search proceeds among designs producing equal efficiencies for three of the component-wise criteria. First, we look for designs that produce a common efficiency for the indices in I = {1, 2, 4}; Table 1 lists some designs verifying this condition; however, none of them has the remaining efficiency larger than this common value.
The same happens with the triplets of indices {1, 3, 4} and {1, 2, 3}. In contrast, for the set I = {2, 3, 4}, we find some designs with the same common efficiency, which is smaller than that for i = 1; see Table 2.
However, none of them can be Bayesian optimal, because no distribution of weights on I attains a minimum value of zero in (9). Finally, for the same index set, we obtain a design ξ* for which, applying (9), we get the solution π* = {0, 0.493, 0.054, 0.453} with a minimum value of 6.644 × 10^{-4}; hence, ξ* turns out to be Bayesian optimal for π*.

Conclusions and discussion
In practice, obtaining an optimal design that accounts for several goals or experimenter's interests is a difficult task. There is much literature on different approaches, usually considering specific situations. In this study, we consider a quite general setting: we aim at finding a maxi-min efficiency design, which maximizes the minimum of the efficiencies of several component-wise criteria (reflecting different tasks). This multi-objective criterion depends on some nominal values of the parameters; therefore, a sensitivity analysis to assess this dependence is advisable.
We provide theoretical results, including an equivalence theorem which states that the maxi-min efficiency design is Bayesian optimal for a specific prior distribution on the set of the component-wise criteria. Furthermore, a method to identify this prior distribution is given. This is important for two reasons.
(i) It enables the application of the equivalence theorem: the optimality of a particular design, e.g. one found by an algorithm, can be checked since the prior probability can be determined. (ii) The prior distribution tells the practitioner the weight the optimal design assigns to each component-wise criterion. Notice that if a criterion does not receive any weight, this does not mean that the optimal design is going to be bad for that criterion. Quite the opposite: the efficiency of the optimal design with respect to that specific component-wise criterion will be higher than those with positive weights.
Appendix A: Proofs

where C is the convex hull of the canonical vectors {e_1, …, e_k, −e_1, …, −e_k}.

Proof The function Φ(ξ; c) is convex with respect to c; thus, it reaches its maximum at one (or more) of the vertices of the set C, which proves the lemma.
Proof of Proposition 2 The set C = {c : c ∈ IR^k, |c|_1 ≤ 1}, where |c|_1 = |c_1| + · · · + |c_k|, is compact. Therefore, from Eq. (2.6.15) of Fedorov and Hackl (1997), the directional derivative of Φ^{-1}(ξ) evaluated at ξ in the direction of ξ̄ − ξ can be expressed in terms of ψ(x, c, ξ), the directional derivative of Φ(ξ; c) in the direction of ξ_x − ξ. The last expression for C(ξ) holds because Φ(ξ; c) always reaches its maximum at one or more points defined by the canonical vectors. The following two lemmas are necessary to prove the Equivalence Theorem stated in Sect. 3.
Lemma 5 Let ξ and ξ̄ be two designs and let η denote a discrete distribution on C(ξ); then

max_{e_i ∈ C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) = max_η ∫_{C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) η(de_i).

Proof The following inequality holds for any η; thus, it is valid also for the measure η_i which puts the whole mass at the vector e_i:

∫_X ψ(x, e_i, ξ) ξ̄(dx) ≤ max_η ∫_{C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) η(de_i).   (14)

The last inequality is satisfied for any e_i ∈ C(ξ), and this means that

max_{e_i ∈ C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) ≤ max_η ∫_{C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) η(de_i).

On the other hand,

∫_{C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) η(de_i) ≤ max_{e_i ∈ C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx).   (15)

This inequality is obtained by replacing each term ∫_X ψ(x, e_i, ξ) ξ̄(dx) by max_{e_i ∈ C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx), and it is satisfied for any measure η; thus,

max_η ∫_{C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx) η(de_i) ≤ max_{e_i ∈ C(ξ)} ∫_X ψ(x, e_i, ξ) ξ̄(dx).

The lemma follows from inequalities (14) and (15).
(17) On the other hand, the inequality holds for any ξ̄; thus, it is also valid for any measure ξ̄ = ξ_x, that is, (18). Since inequality (18) holds for any x ∈ X, it is also valid for the value of x which minimizes the quantity ∫_{C(ξ)} ψ(x, e_i, ξ) η(de_i), that is, min_{x∈X} ∫_{C(ξ)} ψ(x, e_i, ξ) η(de_i).
From Lemma 5, the above is equivalent to inequality (20). On the other hand, from Lemma 6, inequality (20) is equivalent to

max_η min_{x∈X} ∫_{C(ξ*)} ψ(x, e_i, ξ*) η(de_i) ≥ 0.