Using SeDuMi to find various optimal designs for regression models

  • Regular Article
  • Statistical Papers

Abstract

We introduce a powerful yet seldom-used numerical approach in statistics for solving a broad class of optimization problems in which the search space is discretized. The tool, self-dual minimization (SeDuMi), is widely used in engineering for solving semidefinite programming (SDP) problems. We focus on optimal design problems, demonstrate how to formulate A-, A\(_s\)-, c-, I-, and L-optimal design problems as SDP problems, and show how they can be solved effectively by SeDuMi in MATLAB. We also show that the approach is flexible by applying it to find optimal designs based on the weighted least squares estimator, or when there are constraints on the weight distribution of the sought optimal design. For approximate designs, the optimality of the SDP-generated designs can be verified using the Kiefer–Wolfowitz equivalence theorem. The SDP approach also finds optimal designs for nonlinear regression models commonly used in social and biomedical research. Several examples are presented for linear and nonlinear models.

Acknowledgements

The authors thank the referees for their helpful comments, which improved the presentation of this article. All authors were partially supported by Discovery Grants from the Natural Sciences and Engineering Research Council of Canada. The research of Wong reported in this paper was also partially supported by the National Institute of General Medical Sciences of the National Institutes of Health under Grant Award Number R01GM107639. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.

Author information

Corresponding author

Correspondence to Weng Kee Wong.

Appendix: Proofs and MATLAB program

Proof of Theorem 1:

For the design problem in (6), we first note that there exists a \((q-r) \times q\) matrix \(\mathbf{U}\) such that the matrix

$$\begin{aligned} \mathbf{D}=\left( \begin{array}{c} \mathbf{T} \\ \mathbf{U} \end{array} \right) _{q \times q} \end{aligned}$$
(18)

has rank q. This implies that \(\mathbf{T}_r =(\mathbf{I}_r, \mathbf{0})\) and it is clear that \(\mathbf{T}_r \mathbf{D}=\mathbf{T}\). Thus, problem (8) is the same as problem (6).

Next we claim that problem (9) is an SDP problem. By (5), all the elements of \(\mathbf{A}(\mathbf{w})\) are linear functions of the weights \(w_1, \ldots , w_{N-1}\), and so are the elements of \(\mathbf{B}(\mathbf{w})\). From (10) and the definition of \(\mathbf{W}_N\), all the elements of \(\mathbf{B}_1, \ldots , \mathbf{B}_r\) and \(\mathbf{W}_N\) are linear functions of \(\mathbf{v}=(w_1, \ldots , w_{N-1}, v_N, \ldots , v_{N+r-1})^\top \), so the constraint in (9) is a linear matrix constraint. The objective function in (9) is also a linear function of \(\mathbf{v}\), so our claim holds.

Now we show that a solution to problem (9) provides a solution to problem (8). Since \(\mathbf{B}(\mathbf{w})=\mathbf{D}^{-\top } \mathbf{A}(\mathbf{w}) \mathbf{D}^{-1}\), it is easy to verify that \( \mathbf{T}_r \mathbf{D} (\mathbf{A}(\mathbf{w}))^{-1} \mathbf{D}^\top \mathbf{T}_r^\top = \mathbf{T}_r (\mathbf{B}(\mathbf{w}))^{-1} \mathbf{T}_r^\top \). Let \(b_{ii}\) (\(i=1, \ldots , q\)) be the diagonal elements of \((\mathbf{B}(\mathbf{w}))^{-1}\). Then we have

$$\begin{aligned} \text{ trace } \left( \mathbf{T}_r \mathbf{D} (\mathbf{A}(\mathbf{w}))^{-1} \mathbf{D}^\top \mathbf{T}_r^\top \right) = \text{ trace } \left( \mathbf{T}_r (\mathbf{B}(\mathbf{w}))^{-1} \mathbf{T}_r^\top \right) = \sum _{i=1}^r b_{ii}. \end{aligned}$$
(19)

The constraints in (8) are equivalent to \(\mathbf{W}_N \succeq 0\). Thus, problem (8) is to minimize \(\sum _{i=1}^r b_{ii}\) over the design weights subject to \(\mathbf{W}_N \succeq 0\). By (10) and \(\mathbf{B}_i \succeq 0\), we have

$$\begin{aligned} v_{N+i-1} \ge \mathbf{e}_i^\top (\mathbf{B}(\mathbf{w}))^{-1} \mathbf{e}_i=b_{ii}, ~~i=1, \ldots , r. \end{aligned}$$
(20)

Since we minimize \(\sum _{i=1}^r v_{N+i-1}\) in (9), a solution to (9) must attain equality in (20), that is, \(v_{N+i-1}^* = b_{ii}\), so \(\sum _{i=1}^r b_{ii}\) is minimized. By (9), the solution satisfies \(\mathbf{W}_N \succeq 0\). It follows that if \(\mathbf{v}^*=(w_1^*, \ldots , w_{N-1}^*, v_N^*, \ldots , v_{N+r-1}^*)^\top \) is a solution to problem (9), then \(\mathbf{w}^*=(w_1^*, \ldots , w_{N-1}^*, 1-\sum _{i=1}^{N-1} w_i^*)^\top \) is a solution to problem (8). \(\square \)
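As a sanity check of identity (19) and bound (20), the following minimal sketch works through a small case with \(q=2\) and \(r=1\) in exact rational arithmetic. The matrices `A`, `T` and `U` are made-up illustrative values, and the bordered form \(\left( \begin{array}{cc} \mathbf{B} & \mathbf{e}_1 \\ \mathbf{e}_1^\top & v \end{array} \right) \) used for \(\mathbf{B}_1\) is an assumption about (10), which is not reproduced in this appendix. All helper names are hypothetical.

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def inv2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Illustrative values (assumptions, not taken from the article).
A = [[F(1), F(1, 5)], [F(1, 5), F(3, 5)]]  # a positive definite information matrix
T = [[F(1), F(2)]]                         # r = 1: one linear combination of parameters
U = [[F(0), F(1)]]                         # completes T to a nonsingular D
D = T + U                                  # D = (T; U), rank 2

D_inv = inv2(D)
B = matmul(matmul(transpose(D_inv), A), D_inv)  # B = D^{-T} A D^{-1}
B_inv = inv2(B)

lhs = matmul(matmul(T, inv2(A)), transpose(T))[0][0]  # trace(T A^{-1} T^T)
b11 = B_inv[0][0]                                     # first diagonal entry of B^{-1}

def bordered_det(v):
    """Determinant of the assumed block matrix [[B, e_1], [e_1^T, v]]."""
    return det3([B[0] + [F(1)], B[1] + [F(0)], [F(1), F(0), v]])

print(lhs == b11)                  # identity (19) with r = 1
print(bordered_det(b11) == 0)      # bound (20) is tight at v = b_11
print(bordered_det(b11 + 1) > 0, bordered_det(b11 - 1) < 0)
```

Because \(\mathbf{B}\) is positive definite here, the bordered matrix is positive semidefinite exactly when its determinant is nonnegative, which is why the sign of `bordered_det` flips at `v = b11`.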

Proof of Lemma 1:

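Let \(\mathbf{w}_0\) and \(\mathbf{w}_1\) be two weight vectors and \(\alpha \in [0, 1]\), and define \(\mathbf{w}_\alpha =(1-\alpha ) \mathbf{w}_0 + \alpha \mathbf{w}_1\). Assume \(\mathbf{A}(\mathbf{w}_0)\) and \(\mathbf{A}(\mathbf{w}_1)\) are nonsingular. We need to show that \(\phi (\mathbf{w}_\alpha )\) is a convex function of \(\alpha \). Since \(\mathbf{A}(\mathbf{w}_\alpha ) = (1-\alpha ) \mathbf{A}(\mathbf{w}_0) + \alpha \mathbf{A}(\mathbf{w}_1)\) is linear in \(\alpha \), differentiating \(\phi (\mathbf{w}_\alpha ) = \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\mathbf{w}_\alpha ) \mathbf{T}^\top \right) \) gives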

$$\begin{aligned} \frac{\partial \phi (\mathbf{w}_\alpha )}{\partial \alpha }= & {} - \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\mathbf{w}_\alpha ) ( \mathbf{A}(\mathbf{w}_1) - \mathbf{A}(\mathbf{w}_0) )\mathbf{A}^{-1}(\mathbf{w}_\alpha ) \mathbf{T}^\top \right) , \\ \frac{\partial ^2 \phi (\mathbf{w}_\alpha )}{\partial \alpha ^2}= & {} 2 ~ \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\mathbf{w}_\alpha ) ( \mathbf{A}(\mathbf{w}_1) - \mathbf{A}(\mathbf{w}_0) ) \mathbf{A}^{-1}(\mathbf{w}_\alpha ) ( \mathbf{A}(\mathbf{w}_1)\right. \nonumber \\&\left. - \mathbf{A}(\mathbf{w}_0) ) \mathbf{A}^{-1}(\mathbf{w}_\alpha ) \mathbf{T}^\top \right) . \end{aligned}$$

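Since the information matrices \(\mathbf{A}(\mathbf{w}_0)\) and \(\mathbf{A}(\mathbf{w}_1)\) are positive definite, \(\mathbf{A}(\mathbf{w}_\alpha )\) is also positive definite. Writing \(\mathbf{M} = \mathbf{T} \mathbf{A}^{-1}(\mathbf{w}_\alpha ) ( \mathbf{A}(\mathbf{w}_1) - \mathbf{A}(\mathbf{w}_0) )\), the second derivative equals \(2 ~ \text{ trace } \left( \mathbf{M} \mathbf{A}^{-1}(\mathbf{w}_\alpha ) \mathbf{M}^\top \right) \), which is nonnegative because \(\mathbf{A}^{-1}(\mathbf{w}_\alpha )\) is positive definite. Hence \(\frac{\partial ^2 \phi (\mathbf{w}_\alpha )}{\partial \alpha ^2} \ge 0\), which implies that \(\phi (\mathbf{w}_\alpha )\) is a convex function of \(\alpha \). \(\square \)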
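The convexity in Lemma 1 can be checked numerically on a small case. The sketch below assumes a simple linear model with \(\mathbf{f}(u)=(1,u)^\top \), three hypothetical design points, and \(\mathbf{T}=\mathbf{I}_2\) (so \(\phi \) is the trace of \(\mathbf{A}^{-1}\)); none of these choices comes from the article. It evaluates \(\phi (\mathbf{w}_\alpha )\) on a grid of \(\alpha \) values and verifies that all second differences are nonnegative, as convexity requires.

```python
from fractions import Fraction as F

POINTS = [F(-1), F(0), F(1)]  # hypothetical design points u_1, u_2, u_3

def info_matrix(weights):
    """A(w) = sum_i w_i f(u_i) f(u_i)^T with f(u) = (1, u)^T (assumed model)."""
    m0 = sum(weights)
    m1 = sum(w * u for w, u in zip(weights, POINTS))
    m2 = sum(w * u * u for w, u in zip(weights, POINTS))
    return [[m0, m1], [m1, m2]]

def phi(weights):
    """phi(w) = trace(T A^{-1}(w) T^T) with T = I_2, i.e. trace(A^{-1}(w))."""
    (a, b), (_, d) = info_matrix(weights)
    det = a * d - b * b
    return (a + d) / det  # trace of the inverse of a 2x2 matrix

def mix(w0, w1, alpha):
    """w_alpha = (1 - alpha) w0 + alpha w1."""
    return [(1 - alpha) * x + alpha * y for x, y in zip(w0, w1)]

w0 = [F(1, 2), F(0), F(1, 2)]
w1 = [F(1, 5), F(1, 5), F(3, 5)]

grid = [F(k, 20) for k in range(21)]
values = [phi(mix(w0, w1, a)) for a in grid]

# A convex function has nonnegative second differences on an equally spaced grid.
second_diffs = [values[k - 1] - 2 * values[k] + values[k + 1] for k in range(1, 20)]
print(all(d >= 0 for d in second_diffs))
```

Exact rational arithmetic is used so that the sign checks are not affected by rounding.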

Proof of Lemma 2:

For any \(\mathbf{w}\), define \(\mathbf{w}_\alpha =(1-\alpha ) \hat{\mathbf{w}} + \alpha \mathbf{w}\). If \(\hat{\mathbf{w}}\) is an optimal design, then we must have \(\frac{\partial \phi (\mathbf{w}_\alpha )}{\partial \alpha } |_{\alpha =0} \ge 0\) for any \(\mathbf{w}\). Similar to the proof of Lemma 1, we have

$$\begin{aligned} \frac{\partial \phi (\mathbf{w}_\alpha )}{\partial \alpha } \mid _{\alpha =0}= & {} - \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\hat{\mathbf{w}}) ( \mathbf{A}(\mathbf{w}) - \mathbf{A}(\hat{\mathbf{w}}) )\mathbf{A}^{-1}(\hat{\mathbf{w}}) \mathbf{T}^\top \right) , \\= & {} - \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\hat{\mathbf{w}}) \mathbf{A}(\mathbf{w})\mathbf{A}^{-1}(\hat{\mathbf{w}}) \mathbf{T}^\top \right) + \phi (\hat{\mathbf{w}}) \\= & {} - \text{ trace } \left( \mathbf{T} \mathbf{A}^{-1}(\hat{\mathbf{w}}) \sum _{i=1}^N w_i \mathbf{f}(\mathbf{u}_i) (\mathbf{f}(\mathbf{u}_i))^\top \mathbf{A}^{-1}(\hat{\mathbf{w}}) \mathbf{T}^\top \right) + \sum _{i=1}^N w_i \phi (\hat{\mathbf{w}}), \\& ~\text{ by } \text{(5) } \text{ and } \text{(11), } \\= & {} - \sum _{i=1}^N w_i \left( \phi _{Ai}(\hat{\mathbf{w}})- \phi (\hat{\mathbf{w}}) \right) , ~~~~~\text{ by } \text{(14) }, \\\ge & {} 0, ~~\text{ for } \text{ any } ~~\mathbf{w}, \end{aligned}$$

Choosing \(\mathbf{w}\) to be the weight vector with \(w_i=1\) and all other weights equal to zero leads to \(\phi _{Ai}(\hat{\mathbf{w}})- \phi (\hat{\mathbf{w}}) \le 0\), for all \(i=1, \ldots , N\). \(\square \)
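The necessary condition in Lemma 2 is easy to evaluate for a given design. The sketch below again assumes \(\mathbf{f}(u)=(1,u)^\top \), the hypothetical grid \(\{-1, 0, 1\}\), and \(\mathbf{T}=\mathbf{I}_2\), so that \(\phi _{Ai}(\mathbf{w}) = \text{ trace } \left( \mathbf{A}^{-1}(\mathbf{w}) \mathbf{f}(\mathbf{u}_i) (\mathbf{f}(\mathbf{u}_i))^\top \mathbf{A}^{-1}(\mathbf{w}) \right) \) is the squared norm of \(\mathbf{A}^{-1}(\mathbf{w}) \mathbf{f}(\mathbf{u}_i)\). It reports \(\phi _{Ai}(\mathbf{w})- \phi (\mathbf{w})\) at each grid point for two candidate designs; the function names are illustrative only.

```python
from fractions import Fraction as F

POINTS = [F(-1), F(0), F(1)]  # hypothetical design points

def inverse_info(weights):
    """A^{-1}(w) for f(u) = (1, u)^T, using the 2x2 inverse formula."""
    m0 = sum(weights)
    m1 = sum(w * u for w, u in zip(weights, POINTS))
    m2 = sum(w * u * u for w, u in zip(weights, POINTS))
    det = m0 * m2 - m1 * m1
    return [[m2 / det, -m1 / det], [-m1 / det, m0 / det]]

def directional_gaps(weights):
    """Return phi_{Ai}(w) - phi(w) for every design point, with T = I_2."""
    inv = inverse_info(weights)
    phi = inv[0][0] + inv[1][1]
    gaps = []
    for u in POINTS:
        # g = A^{-1} f(u); phi_{Ai} = trace(A^{-1} f f^T A^{-1}) = g^T g
        g0 = inv[0][0] + inv[0][1] * u
        g1 = inv[1][0] + inv[1][1] * u
        gaps.append(g0 * g0 + g1 * g1 - phi)
    return gaps

candidate = [F(1, 2), F(0), F(1, 2)]  # equal mass at -1 and 1
uniform = [F(1, 3)] * 3               # equal mass at all three points

print(directional_gaps(candidate))
print(directional_gaps(uniform))
```

Under these assumptions the first design has no positive gap, so it satisfies the condition, while the uniform design has positive gaps at \(u=\pm 1\) and therefore cannot be optimal on this grid.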

[Figure a: MATLAB program (listing not reproduced)]

Cite this article

Wong, W.K., Yin, Y. & Zhou, J. Using SeDuMi to find various optimal designs for regression models. Stat Papers 60, 1583–1603 (2019). https://doi.org/10.1007/s00362-017-0887-7

  • DOI: https://doi.org/10.1007/s00362-017-0887-7
