Ascent with quadratic assistance for the construction of exact experimental designs

Filová, Lenka; Harman, Radoslav

doi:10.1007/s00180-020-00961-9

Original paper
Published: 04 February 2020

Volume 35, pages 775–801, (2020)
Computational Statistics

Abstract

In the area of statistical planning, there is a large body of theoretical knowledge and computational experience concerning so-called optimal approximate designs of experiments. However, for an approximate design to be realizable, it must be converted into an exact, i.e., integer, design, which is usually done via rounding procedures. Although rapid, rounding procedures often yield worse exact designs than heuristics that do not require approximate designs at all. In this paper, we build on an alternative principle of utilizing optimal approximate designs for the computation of optimal, or nearly-optimal, exact designs. The principle, which we call ascent with quadratic assistance (AQuA), is an integer programming method based on the quadratic approximation of the design criterion in the neighborhood of the optimal approximate information matrix. To this end, we present quadratic approximations of all Kiefer’s criteria with an integer parameter, including D- and A-optimality and, by a model transformation, I-optimality. Importantly, we prove a low-rank property of the associated quadratic forms, which enables us to use AQuA efficiently and apply it to large design spaces. We numerically demonstrate the robustness and superior performance of the proposed method for selected statistical models under various types of experimental constraints. We also show how iterative application of AQuA can be used for a stratified information-based subsampling of large datasets under a lower bound on the quality and an upper bound on the cost of the subsample.
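The idea of approximating a criterion quadratically around the optimal approximate information matrix can be illustrated with a small sketch. The code below does not reproduce the quadratic forms derived in the paper; it assumes the generic second-order Taylor expansion of \(\log \det \) around \(\mathbf {M}_*\), namely \(\log \det (\mathbf {M}) \approx \log \det (\mathbf {M}_*) + \hbox {tr}(\mathbf {X}) - \frac{1}{2}\hbox {tr}(\mathbf {X}^2)\) with \(\mathbf {X} = \mathbf {M}_*^{-1}(\mathbf {M}-\mathbf {M}_*)\). The quadratic regression model, the five-point grid, and all function names are illustrative assumptions; the approximate design used (equal weights at -1, 0 and 1) is the well-known D-optimal design for quadratic regression on [-1, 1].

```python
import numpy as np

def info_matrix(F, xi):
    """M(xi) = sum_i xi_i f_i f_i^T, where row i of F is f_i^T."""
    return F.T @ (xi[:, None] * F)

def logdet(M):
    """log det(M) for a positive definite M, and -inf otherwise."""
    sign, value = np.linalg.slogdet(M)
    return value if sign > 0 else -np.inf

def quad_logdet(F, xi, xi_star):
    """Generic second-order Taylor surrogate of log det around M(xi_star)."""
    M_star = info_matrix(F, xi_star)
    delta = info_matrix(F, xi - xi_star)   # M(xi) - M(xi_star), by linearity
    X = np.linalg.solve(M_star, delta)     # M_star^{-1} (M - M_star)
    return logdet(M_star) + np.trace(X) - 0.5 * np.trace(X @ X)

# Hypothetical setting: quadratic regression on a grid of 5 design points.
x = np.linspace(-1.0, 1.0, 5)
F = np.column_stack([np.ones_like(x), x, x**2])

N = 10
# Approximate D-optimal design of size N: equal weight at -1, 0 and 1.
xi_star = np.array([N / 3, 0.0, N / 3, 0.0, N / 3])
# A candidate exact (integer) design of the same size.
xi_exact = np.array([3.0, 0.0, 4.0, 0.0, 3.0])

exact_value = logdet(info_matrix(F, xi_exact))
approx_value = quad_logdet(F, xi_exact, xi_star)
print("log det of the exact design:", exact_value)
print("quadratic surrogate        :", approx_value)
```

Because the surrogate is a quadratic function of the design vector, maximizing it over integer designs is an integer quadratic program, which is the kind of problem the abstract refers to. In this toy case the first-order term vanishes, so only the quadratic term separates the exact design from the approximate optimum.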

Notes

Note that we implicitly assume that reordering of the trials does not influence the statistical quality of the experimental design.
In some experimental situations, the set of available design points can be modeled as a continuous domain. However, in many applications, the design space is finite. This is the case if each factor has - in principle or effectively - only a finite number of levels that the experimenter can select, or if the optimal design problem corresponds to data sub-selection (see the examples in Sect. 6). Moreover, the method proposed in this paper can also be useful for solving the problems with continuous design spaces, because an optimal experimental design on a finite design space can be a very efficient initial solution for optimization on continuous design spaces; cf. Sect. 5.1.
The symbols \(\mathbb {R}\), \(\mathbb {R}_+\), \(\mathbb {N}\), \(\mathbb {N}_0\), and \(\mathbb {R}^{k \times n}\) denote the sets of real, non-negative real, natural, non-negative integer numbers, and the set of all \(k \times n\) real matrices, respectively.
Therefore, we do not represent designs by normalized (probability) measures, as is frequently done in optimal design, but by non-normalized vectors of numbers of trials.
Approximate designs are sometimes also called “continuous” designs, which refers to the continuity of the space of designs, not the design space.
The symbols \(\mathbf {1}_n\), \(\mathbf {0}_n\), \(\mathbf {I}_n\) and \(\mathbf {J}_n\) denote the n-dimensional vector of ones, n-dimensional vector of zeros, the \(n \times n\) unit matrix and the \(n \times n\) matrix of ones, respectively.
In actual computation using integer programming solvers, this “replication-free” constraint can be forced by setting the type of variables to binary.
Alternatively, it is possible to select a convex criterion \(\Phi \) such that \(\Phi (\mathbf {M}(\xi ))\) can be interpreted as a loss from the experiment that depends on the design \(\xi \). In this case, the optimal design would minimize \(\Phi (\mathbf {M}(\cdot ))\) over \(\Xi ^E_{\mathbf {A},\mathbf {b}}\). Note also that some useful criteria do not depend on the design via its information matrix; we will not discuss them in this paper.
For brevity, we will henceforth use \(\mathcal {S}^m\), \(\mathcal {S}^m_+\), and \(\mathcal {S}^m_{++}\) to denote the sets of all symmetric, non-negative definite and positive definite \(m \times m\) matrices, respectively.
By two versions of a criterion, we mean two criteria that induce the same ordering on the set of information matrices.
Note that the optimal approximate information matrix \(\mathbf {M}_*\) with respect to \(\Phi _p^+\) and \(\Phi _p^-\) is non-singular for any \(p \in \mathbb {N}_0\).
Note that the matrix \(\mathbf {Q}_p^+\) is symmetric, as is the matrix \(\mathbf {Q}_p^-\) defined below, because \(\hbox {tr}(\mathbf {M}_1\mathbf {H}_1\mathbf {M}_2\mathbf {H}_2)=\hbox {tr}(\mathbf {M}_1\mathbf {H}_2\mathbf {M}_2\mathbf {H}_1)\) for the symmetric non-negative definite matrices \(\mathbf {M}_1\), \(\mathbf {M}_2\), \(\mathbf {H}_1\), and \(\mathbf {H}_2\).
The symbols \(\hbox {vec}\) and \(\hbox {vech}\) denote the vectorization and half-vectorization of a matrix, respectively.
Note that \(\tilde{\mathbf {Q}}\) can be a singular non-negative definite matrix; therefore, t can be even smaller than s.
Of course, the same is true for a multitude of other popular design algorithms which work only on finite spaces.
This criterion is sometimes called IV- or V-optimality [see Sect. 10.6 in Atkinson et al. (2007)].
The reduction of the size of \(\mathfrak {X}\) means the reduction of the dimensionality of the associated convex optimization problem.
If this is not the last iteration of the algorithm, we can use AQuA without the integer constraints on the design. Indeed this iterative approach can also be used for computing optimal approximate designs, but we do not explore this possibility here.
Note that once we have a candidate exact design for a specific problem, we can compute a lower bound on its efficiency relative to the optimal approximate design. This often leads to a guarantee which is fully satisfactory for practical purposes. Moreover, many if not most optimization heuristics which are eminently useful across sciences also lack theoretical bounds on the efficiency of the results that they generate; their usefulness is evidenced by the empirical fact that they often yield a better concrete result than any other competitor.
See Harman and Filová (2014) for an example of a strongly suboptimal result of AQuA for \(N=m\).
We stress that it is not completely trivial to find these small-support D- and A-optimal ADs in the class of all optimal ADs; in fact, we have found them using the integer programming capabilities of AQuA. That is, AQuA can be very useful not only for computing efficient designs, but also for constructing an exact design in the possibly infinite set of optimal approximate designs.
We did not alter the default stopping rules and other options of the gurobi solver.
We would like to stress that here we do not focus on the constraints on the design region, which are trivial to incorporate (at least in the case of finite design spaces); we work with constraints on the n-dimensional design vector itself.
The efficiency of the I-optimal design produced using the standard IQP approach is somewhat lower, but this approach will improve the design if given more computational time.
For the application of the conic improvement, the quadratic forms must have low ranks.
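The efficiency bound mentioned in the notes above can be sketched for D-optimality as follows. This is a minimal illustration, not the authors' implementation: it assumes the D-criterion in the form \(\det (\mathbf {M})^{1/m}\), a quadratic regression model on a five-point grid, and a hypothetical function name. Since every exact design of size N is also an approximate design of size N, the efficiency relative to the optimal approximate design cannot exceed the efficiency relative to the (unknown) optimal exact design.

```python
import numpy as np

def info_matrix(F, xi):
    """M(xi) = sum_i xi_i f_i f_i^T, where row i of F is f_i^T."""
    return F.T @ (xi[:, None] * F)

def d_efficiency_lower_bound(F, xi_exact, xi_approx):
    """D-efficiency of xi_exact relative to an optimal approximate design.

    Both designs are non-normalized vectors with the same total number of
    trials; the returned ratio (det M(xi_exact) / det M(xi_approx))^(1/m)
    is a lower bound on the efficiency relative to the best exact design.
    """
    m = F.shape[1]
    sign_e, logdet_e = np.linalg.slogdet(info_matrix(F, xi_exact))
    sign_a, logdet_a = np.linalg.slogdet(info_matrix(F, xi_approx))
    if sign_a <= 0:
        raise ValueError("the approximate design must be non-singular")
    if sign_e <= 0:
        return 0.0  # a singular exact design has zero D-efficiency
    return float(np.exp((logdet_e - logdet_a) / m))

# Hypothetical setting: quadratic regression on a grid of 5 design points.
x = np.linspace(-1.0, 1.0, 5)
F = np.column_stack([np.ones_like(x), x, x**2])

N = 4
# Approximate D-optimal design of size N: equal weight at -1, 0 and 1.
xi_approx = np.array([N / 3, 0.0, N / 3, 0.0, N / 3])
# Candidate exact design: one trial at each endpoint, two at the center.
xi_exact = np.array([1.0, 0.0, 2.0, 0.0, 1.0])

bound = d_efficiency_lower_bound(F, xi_exact, xi_approx)
print("lower bound on D-efficiency:", bound)
```

In this toy example the determinants are 8 and 256/27, so the bound equals \((27/32)^{1/3}\), roughly 0.945; the candidate design therefore loses at most about 5.5% of the information relative to any exact design of size 4.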

References

Anderson-Cook CM, Borror CM, Montgomery DC (2009) Response surface design evaluation and comparison. J Stat Plan Inference 139(2):629–641
Atkinson AC, Donev AN, Tobias RD (2007) Optimum experimental designs, with SAS. Oxford University Press, Oxford
Bouhtou M, Gaubert S, Sagnol G (2010) Submodularity and randomized rounding techniques for optimal experimental design. Electron Notes Discrete Math 36:679–686
Cheng C (1987) An application of the Kiefer–Wolfowitz equivalence theorem to a problem in Hadamard transform optics. Ann Stat 15(4):1593–1603
Cornell JA (2002) Experiments with mixtures: designs, models, and the analysis of mixture data, 3rd edn. Wiley, New York
Dattorro J (2008) Convex optimization & Euclidean distance geometry. Meboo Publishing, Palo Alto
Draguljic D, Dean AM, Santner TJ (2012) Non-collapsing space-filling designs for bounded nonrectangular regions. Technometrics 54(2):169–178
Dykstra O (1971) The augmentation of experimental data to maximize \(|X^{\prime }X|\). Technometrics 13:682–688
Filová L, Harman R (2018) Ascent with quadratic assistance for the construction of exact experimental designs. arXiv preprint arXiv:1801.09124v2
Goos P, Jones B (2011) Optimal design of experiments: a case study approach. Wiley, Chichester
Goos P, Jones B, Syafitri U (2016) I-optimal design of mixture experiments. J Am Stat Assoc 111:899–911
Gurobi Optimization, LLC (2018) Gurobi optimizer reference manual. http://www.gurobi.com. Retrieved 31 Dec 2019
Haines LM (1987) The application of the annealing algorithm to the construction of exact optimal designs for linear-regression models. Technometrics 29:439–447
Harman R, Filová L (2014) Computing efficient exact designs of experiments using integer quadratic programming. Comput Stat Data Anal 71:1159–1167
Harman R, Filová L (2019) Package ’OptimalDesign’. https://CRAN.R-project.org/package=OptimalDesign
Harman R, Bachratá A, Filová L (2016) Construction of efficient experimental designs under multiple resource constraints. Appl Stoch Models Bus Ind 32(1):3–17
Harman R, Trnovská M (2009) Approximate D-optimal designs of experiments on the convex hull of a finite set of information matrices. Math Slovaca 59(6):693–704
Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York
Jones B, Silvestrini RT, Montgomery DC, Steinberg DM (2015) Bridge designs for modeling systems with low noise. Technometrics 57(2):155–163
Kiefer J (1971) The role of symmetry and approximation in exact design optimality. In: Gupta SS, Yackel J (eds) Statistical decision theory and related topics. Academic Press, Cambridge, pp 109–118
Liu S, Neudecker H (1995) A V-optimal design for Scheffé’s polynomial model. Stat Probab Lett 23(3):253–258
Mandal A, Wong WK, Yu Y (2015) Algorithmic searches for optimal designs. In: Dean A, Morris M, Stufken J, Bingham D (eds) Handbook of design and analysis of experiments. Chapman and Hall, London, pp 211–218
McKay MD, Beckman RJ, Conover WJ (1979) A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239–245
Montgomery DC (2004) Design and analysis of experiments, 6th edn. Wiley, Hoboken
Mosek Modeling Cookbook, Release 3.1. (2018). https://docs.mosek.com/MOSEKModelingCookbook-letter.pdf
Neubauer MG, Watkins W, Zeitlin J (2000) D-optimal weighing designs for six objects. Metrika 52(3):185–211
Novomestky F (2012) Matrixcalc: collection of functions for matrix calculations. https://cran.r-project.org/package=matrixcalc. Retrieved 31 Dec 2019
Pázman A (1986) Foundations of optimum experimental design. Reidel, Dordrecht
Pronzato L, Pázman A (2013) Design of experiments in nonlinear models. Lecture notes in statistics, 212. Springer, New York
Pronzato L, Zhigljavsky AA (2014) Algorithmic construction of optimal designs on compact sets for concave and differentiable criteria. J Stat Plan Inference 154:141–155
Pukelsheim F (2006) Optimal design of experiments (classics in applied mathematics). SIAM, Philadelphia
Pukelsheim F, Rieder S (1992) Efficient rounding of approximate designs. Biometrika 79:763–770
R Development Core Team (2011) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Sagnol G (2013) Approximation of a maximum-submodular-coverage problem involving spectral functions, with application to experimental designs. Discrete Appl Math 161(1):258–276
Sagnol G, Harman R (2015) Computing exact D-optimal designs by mixed integer second-order cone programming. Ann Stat 43(5):2198–2224
Stufken J, Yang M (2012) On locally optimal designs for generalized linear models with group effects. Stat Sin 22(4):1765–1786
The MOSEK Optimization Toolbox for MATLAB Manual. Version 9.1. (2020). http://docs.mosek.com/8.1/toolbox/index.html
Thoutt Z (2017) Wine reviews data. https://github.com/zackthoutt/wine-deep-learning. Retrieved 31 Dec 2019
Yang M, Biedermann S, Tang E (2013) On optimal designs for nonlinear models: a general and efficient algorithm. J Am Stat Assoc 108:1411–1420
Wang H, Yang M, Stufken J (2019) Information-based optimal subdata selection for big data linear regression. J Am Stat Assoc 114:393–405

Acknowledgements

We are grateful to anonymous referees for insightful comments on the preliminary versions of this article. The work was supported by Grant No. 1/0341/19 from the Slovak Scientific Grant Agency (VEGA).

Author information

Authors and Affiliations

Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Bratislava, Slovakia
Lenka Filová & Radoslav Harman
Johannes Kepler University Linz, Linz, Austria
Radoslav Harman

Corresponding author

Correspondence to Radoslav Harman.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Filová, L., Harman, R. Ascent with quadratic assistance for the construction of exact experimental designs. Comput Stat 35, 775–801 (2020). https://doi.org/10.1007/s00180-020-00961-9

Received: 04 May 2019
Accepted: 27 January 2020
Published: 04 February 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s00180-020-00961-9
