Constructing two-level $Q_B$-optimal screening designs using mixed-integer programming and heuristic algorithms

Vazquez, Alan R.; Wong, Weng Kee; Goos, Peter

doi:10.1007/s11222-022-10168-1

Original Paper
Published: 25 November 2022

Statistics and Computing, Volume 33, article number 7 (2023)

Abstract

Two-level screening designs are widely applied in the manufacturing industry to identify influential factors of a system. These designs have each factor at two levels and are traditionally constructed using standard algorithms, which rely on a pre-specified linear model. Since the assumed model may depart from the truth, two-level $Q_B$-optimal designs have been developed to provide efficient parameter estimates for several potential models. These designs also have the overarching goal that models that are more likely to be the best for explaining the data are estimated more efficiently than the rest. However, there is no effective algorithm for constructing them. This article proposes two methods: a mixed-integer programming algorithm that guarantees convergence to the two-level $Q_B$-optimal designs, and a heuristic algorithm that employs a novel formula to find good designs in short computing times. Using numerical experiments, we show that our mixed-integer programming algorithm is attractive for finding small optimal designs, and that our heuristic algorithm is the most computationally effective approach for constructing both small and large designs when compared to benchmark heuristic algorithms.

References

Atkinson, A., Donev, A., Tobias, R.: Optimum Experimental Designs, With SAS. OUP Oxford (2007)
Bertsimas, D., Shioda, R.: Algorithm for cardinality-constrained quadratic optimization. Comput. Optim. Appl. 43, 1–22 (2009)
Bertsimas, D., Weismantel, R.: Optimization Over Integers. Dynamic Ideas Press (2005)
Bertsimas, D., King, A., Mazumder, R.: Best subset selection via modern optimization lens. Ann. Stat. 44, 813–852 (2016)
Bienstock, D.: Computational study of a family of mixed-integer quadratic programming problems. Math. Program. 74, 121–140 (1996)
Bixby, R.: A brief history of linear and mixed-integer programming computation. In: Documenta Mathematica. Extra Volume: Optimization Stories, pp. 107–121 (2012)
Booth, K.H.V., Cox, D.R.: Some systematic supersaturated designs. Technometrics 4, 489–495 (1962)
Bulutoglu, D.A., Margot, F.: Classification of orthogonal arrays by integer programming. J. Stat. Plan. Inference 138, 654–666 (2008)
Butler, N.A.: Minimum aberration construction results for nonregular two-level fractional factorial designs. Biometrika 90, 891–898 (2003)
Butler, N.A.: Some theory for constructing minimum aberration fractional factorial designs. Biometrika 90, 233–238 (2003)
Chipman, H.: Bayesian variable selection with related predictors. Can. J. Stat. 24, 17–36 (1996)
Chipman, H., Hamada, M., Wu, C.F.J.: A Bayesian variable-selection approach for analyzing designed experiments with complex aliasing. Technometrics 39, 372–381 (1997)
Cook, R.D., Nachtsheim, C.J.: A comparison of algorithms for constructing exact D-optimal designs. Technometrics 22, 315–324 (1980)
De Campos, L.M., Fernández-Luna, J.M., Puerta, J.M.: An iterated local search algorithm for learning Bayesian networks with restarts based on conditional independence tests. Int. J. Intell. Syst. 18(2), 221–235 (2003)
De Corte, A., Sörensen, K.: An iterated local search algorithm for water distribution network design optimization. Networks 67(3), 187–198 (2016)
DuMouchel, W., Jones, B.: A simple Bayesian modification of D-optimal designs to reduce dependence on an assumed model. Technometrics 36(1), 37–47 (1994)
Eendebak, P., Schoen, E.D.: Two-level designs to estimate all main effects and two-factor interactions. Technometrics 59, 69–79 (2017)
Elster, C., Neumaier, A.: Screening by conference designs. Biometrika 82, 589–602 (1995)
Goos, P., Jones, B.: Design of Experiments: A Case Study Approach. Wiley, New York (2011)
Goos, P., Kobilinsky, A., O’Brien, T.E., Vandebroek, M.: Model-robust and model-sensitive designs. Comput. Stat. Data Anal. 49, 201–216 (2005)
Grömping, U., Fontana, R.: An algorithm for generating good mixed level factorial designs. Comput. Stat. Data Anal. 137, 101–114 (2019)
Grosso, A., Jamali, A., Locatelli, M.: Finding maximin Latin hypercube designs by iterated local search heuristics. Eur. J. Oper. Res. 197(2), 541–547 (2009)
Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual. Available at http://www.gurobi.com (2020)
Harman, R., Filová, L.: Computing efficient exact designs of experiments using integer quadratic programming. Comput. Stat. Data Anal. (2014)
Harville, D.A.: Matrix Algebra from a Statistician’s Perspective. Springer, New York (2011)
Heredia-Langner, A., Montgomery, D.C., Carlyle, W.M., Borror, C.M.: Model-robust optimal designs: a genetic algorithm approach. J. Qual. Technol. 36, 263–279 (2004)
Jones, B., Lin, D.K.J., Nachtsheim, C.J.: Bayesian D-optimal supersaturated designs. J. Stat. Plan. Inference 138, 86–92 (2007)
Jones, B.A., Li, W., Nachtsheim, C.J., Ye, K.Q.: Model-robust supersaturated and partially supersaturated designs. J. Stat. Plan. Inference 139, 45–53 (2009)
Jünger, M., Liebling, T.M., Naddef, D., Nemhauser, G.L., Pulleyblank, W.R., Reinelt, G., Rinaldi, G., Wolsey, L.A. (eds.): 50 Years of Integer Programming 1958-2008: From the Early Years to the State-of-the-Art. Springer (2010)
Li, W.: Screening designs for model selection. In: Dean, A., Lewis, S. (eds.) Screening, pp. 207–234. Springer, New York (2006)
Li, W., Nachtsheim, C.J.: Model-robust factorial designs. Technometrics 42, 345–352 (2000)
Li, X., Sudarsanam, N., Frey, D.D.: Regularities in data from factorial experiments. Complexity 11, 32–45 (2006)
Lin, D.K.J.: Another look at first-order saturated design: the p-efficient designs. Technometrics 35, 284–292 (1993)
Loeppky, J.L., Sitter, R.R., Tang, B.: Nonregular designs with desirable projection properties. Technometrics 49, 454–467 (2007)
Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search: framework and applications. In: Gendreau, M., Potvin, J.-Y. (eds.) Handbook of Metaheuristics, pp. 129–168. Springer (2019)
McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapman and Hall (1989)
Mee, R.: A Comprehensive Guide to Factorial Two-Level Experimentation. Mathematics and Statistics, Springer (2009)
Mee, R.W., Schoen, E.D., Edwards, D.E.: Selecting an orthogonal or non-orthogonal two-level design for screening. Technometrics 59, 305–318 (2017)
Merz, P., Huhse, J.: An iterated local search approach for finding provably good solutions for very large TSP instances. In: Rudolph, G., Jansen, T., Beume, N., Lucas, S., Poloni, C. (eds.) Parallel Problem Solving from Nature–PPSN X, pp. 929–939. Springer, Berlin (2008)
Meyer, R.K., Nachtsheim, C.J.: The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics 37, 60–69 (1995)
Michalewicz, Z., Fogel, D.: How to Solve It: Modern Heuristics. Springer (2004)
Núñez-Ares, J., Goos, P.: Enumeration and multicriteria selection of orthogonal minimally aliased response surface designs. Technometrics 62, 21–36 (2020)
Palhazi-Cuervo, D., Goos, P., Sörensen, K.: Optimal design of large-scale screening experiments: a critical look at the coordinate-exchange algorithm. Stat. Comput. 26, 15–28 (2016)
Palhazi-Cuervo, D., Goos, P., Sörensen, K., Arráiz, E.: An iterated local search algorithm for the vehicle routing problem with backhauls. Eur. J. Oper. Res. 237(2), 454–464 (2014)
Sartono, B., Goos, P., Schoen, E.D.: Constructing general orthogonal fractional factorial split-plot designs. Technometrics 57, 488–502 (2015a)
Sartono, B., Schoen, E.D., Goos, P.: Blocking orthogonal designs with mixed integer linear programming. Technometrics 57, 428–439 (2015b)
Schoen, E.D., Eendebak, P.T., Nguyen, M.V.M.: Complete enumeration of pure-level and mixed-level orthogonal arrays. J. Comb. Des. 18, 123–140 (2010)
Smucker, B., Drew, N.M.: Approximate model spaces for model-robust experiment design. Technometrics 57, 54–63 (2015)
Smucker, B.J., del Castillo, E., Rosenberg, J.L.: Exchange algorithms for constructing model-robust experimental designs. J. Qual. Technol. 43, 28–42 (2011)
Smucker, B.J., del Castillo, E., Rosenberg, J.L.: Model-robust two-level designs using coordinate exchange algorithms and a maximin criterion. Technometrics 54, 367–375 (2012)
Sun, D.X.: Estimation Capacity and Related Topics in Experimental Design. PhD thesis, University of Waterloo, Department of Statistics and Actuarial Science, Waterloo (1993)
Sun, D.X., Li, W., Ye, K.Q.: Algorithmic construction of catalogs of non-isomorphic two-level orthogonal designs for economic run sizes. Stat. Appl. 6, 141–155 (2008)
Tang, B.: Theory of J-characteristics for fractional factorial designs and projection justification of minimum $G_2$-aberration. Biometrika 88, 401–407 (2001)
Tang, B., Deng, L.Y.: Minimum $G_2$-aberration for non-regular fractional factorial designs. Ann. Stat. 27, 1914–1926 (1999)
Tsai, P.-W., Gilmour, S.G.: A general criterion for factorial designs under model uncertainty. Technometrics 52, 231–242 (2010)
Tsai, P.-W., Gilmour, S.G.: New families of $Q_B$-optimal saturated two-level main effects screening designs. Stat. Sin. 26, 605–617 (2016)
Tsai, P.-W., Gilmour, S.G., Mead, R.: Projective three-level main effects designs robust to model uncertainty. Biometrika 87, 467–475 (2000)
Tsai, P.-W., Gilmour, S.G., Mead, R.: Three-level main-effects designs exploiting prior information about model uncertainty. J. Stat. Plan. Inference 137, 619–627 (2007)
Vo-Thanh, N., Jans, R., Schoen, E.D., Goos, P.: Symmetry breaking in mixed integer linear programming formulations for blocking two-level orthogonal experimental designs. Comput. Oper. Res. 97, 96–110 (2018)
Wolsey, L.: Integer Programming, 2nd edn. Wiley (2020)
Wu, C.F.J., Hamada, M.S.: Experiments: Planning, Analysis and Optimization, 2nd edn. Wiley (2009)

Acknowledgements

The research of the first author was financially supported by the Flemish Fund for Scientific Research (FWO) through the Junior Postdoctoral Fellowship 1243320N. The first author thanks José Núñez Ares for the discussions that shifted the initial focus of this research towards mixed-integer programming.

Author information

Authors and Affiliations

Department of Industrial Engineering, University of Arkansas, Fayetteville, USA
Alan R. Vazquez
Department of Biostatistics, University of California, Los Angeles, USA
Weng Kee Wong
Department of Biosystems, KU Leuven, Leuven, Belgium
Peter Goos
Department of Engineering Management, University of Antwerp, Antwerp, Belgium
Peter Goos

Corresponding author

Correspondence to Alan R. Vazquez.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (zip 21 KB)

Appendices

Appendix A: Proofs

Proof of Lemma 1

Let $\textbf{D} = (D_{ij})$ where $D_{ij} = \pm 1$. We define

$$\begin{aligned} s_{i_1 i_2 \ldots i_k} = \sum _{l = 1}^{n} D_{l i_1} D_{l i_2} \cdots D_{l i_k}, \end{aligned}$$

where $i_r \in \{1, \ldots , m\}$. Note that when all elements in $(i_1, i_2, \ldots , i_k)$ are different, $s_{i_1 i_2 \ldots i_k}$ is a $J_k$-characteristic of $\textbf{D}$ (Tang 2001). The generalized word count of length k of $\textbf{D}$ can be calculated as follows:

$$\begin{aligned} B_k = \frac{1}{n^2 } \sum _{i_1< \cdots < i_k} s^{2}_{i_1 \ldots i_k}. \end{aligned}$$

From Lemma 1 of Butler (2003b), we have that

$$\begin{aligned} E_k = \frac{1}{n^2 } \sum ^{m}_{i_1 = 1} \sum ^{m}_{i_2 = 1} \cdots \sum ^{m}_{i_k = 1} s^{2}_{i_1 i_2 \ldots i_k}. \end{aligned}$$

(8)

We will show that the power moments $E_1$, $E_2$, $E_3$ and $E_4$ can be expressed as linear combinations of the generalized word counts $B_1$, $B_2$, $B_3$ and $B_4$. The first power moment is immediate, since $E_1 = B_1$. For the second power moment, we have that

$$\begin{aligned} E_2 = \frac{1}{n^2} \sum _{i_1 = 1}^{m} s^2_{i_1 i_1} + \frac{2}{n^2} \sum _{i_1 = 1}^{m-1} \sum _{i_2 = i_1 + 1}^{m} s^{2}_{i_1 i_2} = m + 2 B_2. \end{aligned}$$

The third power moment $E_3$ in Equation (8) can be expressed as a sum of $n^{-2} s^2_{i_1 i_2 i_3}$ over three types of triplets $(i_1, i_2, i_3)$. The first type involves triplets in which all elements are equal. Since $D_{li}^3 = D_{li}$, such a triplet gives $s_{iii} = s_i$, so the sum of all $n^{-2} s^2_{i_1 i_2 i_3}$ over this type equals $B_1$. The second type involves triplets in which all elements are distinct. The sum of all $n^{-2} s^2_{i_1 i_2 i_3}$ over the second type equals $6 B_3$, where the factor 6 is due to the fact that there are 6 permutations of the elements in $(i_1, i_2, i_3)$. The third type of triplets is such that exactly two elements are equal. Because $D_{li}^2 = 1$, such a triplet reduces to $s_j$, where $j$ is the unpaired index; there are $m-1$ choices for the repeated index and 3 positions for $j$, so the sum of all $n^{-2} s^2_{i_1 i_2 i_3}$ over this type is $3(m-1)B_1$. Therefore, we have that $E_3 = B_1 + 3(m-1)B_1 + 6B_3 = (3m - 2)B_1 + 6B_3$.

The fourth power moment $E_4$ in Equation (8) can be expressed as a sum of $n^{-2} s^2_{i_1 i_2 i_3 i_4}$ over five types of quadruplets $(i_1, i_2, i_3, i_4)$. These types are summarized as follows: (1) all elements are equal; (2) all elements are distinct; (3) exactly two elements are equal and the other two differ from them and from each other; (4) exactly three elements are equal; and (5) each element is equal to exactly one other element. Using $D_{li}^2 = 1$ and a counting argument, we can show that the sum of the $n^{-2} s^2_{i_1 i_2 i_3 i_4}$ over the first, second, third, fourth and fifth type of quadruplets equals $m$, $24 B_4$, $12(m-2)B_2$, $8 B_2$ and $3m(m-1)$, respectively. The sum of these totals shows that $E_4 = 24B_4 + 4(3m - 4) B_2 + m(3m-2)$. $\square $
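The relations in Lemma 1 can be checked numerically by brute force. The Python sketch below is an illustration only and is not taken from the article: it evaluates $n^2 B_k$ and $n^2 E_k$ directly from their definitions (working with the $n^2$ multiples keeps everything in exact integer arithmetic) and tests the four relations derived above. The function names `s`, `n2_B`, `n2_E` and `lemma1_holds`, and the random test design, are assumptions made for this example.

```python
import itertools
import random


def s(D, idx):
    """s_{i_1...i_k}: sum over rows of the product of the listed columns."""
    total = 0
    for row in D:
        prod = 1
        for i in idx:
            prod *= row[i]
        total += prod
    return total


def n2_B(D, k):
    """n^2 * B_k: squared sums over strictly increasing index tuples."""
    m = len(D[0])
    return sum(s(D, idx) ** 2 for idx in itertools.combinations(range(m), k))


def n2_E(D, k):
    """n^2 * E_k as in Equation (8): all m^k index tuples, repeats allowed."""
    m = len(D[0])
    return sum(s(D, idx) ** 2 for idx in itertools.product(range(m), repeat=k))


def lemma1_holds(D):
    """Check the four relations between E_1..E_4 and B_1..B_4 (times n^2)."""
    n, m = len(D), len(D[0])
    B = {k: n2_B(D, k) for k in range(1, 5)}
    E = {k: n2_E(D, k) for k in range(1, 5)}
    return (
        E[1] == B[1]
        and E[2] == n * n * m + 2 * B[2]
        and E[3] == (3 * m - 2) * B[1] + 6 * B[3]
        and E[4] == 24 * B[4] + 4 * (3 * m - 4) * B[2] + n * n * m * (3 * m - 2)
    )


# Hypothetical test case: a random 6-run, 5-factor design with entries +/-1.
rng = random.Random(1)
D = [[rng.choice((-1, 1)) for _ in range(5)] for _ in range(6)]
assert lemma1_holds(D)
```

The enumeration over all $m^k$ index tuples is only practical for small $m$; it is meant as a sanity check of the counting argument, not as a way to compute the power moments in practice.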

Table 5 Average computing times in seconds and their standard deviations for 10 optimizations performed by the heuristic algorithms

Proof of Corollary 1

Let $\textbf{d}_j$ denote the $m \times 1$ vector containing the $j$-th row of $\textbf{D}$. Without loss of generality, $\textbf{D} = [\textbf{d}_j, \textbf{D}^{T}_{-j}]^{T}$ where $\textbf{D}_{-j}$ is the $(n-1) \times m$ design matrix excluding the $j$-th row. With $\textbf{T} = \textbf{D}\textbf{D}^{T}$, we then have that

$$\begin{aligned} \textbf{T} = \left( \begin{array}{cc} m &{} \textbf{d}_{j}^{T}\textbf{D}^{T}_{-j} \\ \textbf{D}_{-j} \textbf{d}_j &{} \textbf{D}_{-j} \textbf{D}_{-j}^{T} \end{array}\right) . \end{aligned}$$

Let $\textbf{r} = \textbf{D}_{-j} \textbf{d}_j$ and $\textbf{S} = \textbf{D}_{-j} \textbf{D}_{-j}^{T}$. The k-th power moment of $\textbf{T}$ is

$$\begin{aligned} n^2 E_k = m^k + 2 \sum _{u=1}^{n-1} r^{k}_{u} + \sum _{u=1}^{n-1} \sum _{v=1}^{n-1} S^{k}_{uv}, \end{aligned}$$

(9)

where $r_u$ is the $u$-th element of $\textbf{r}$ and $S_{uv}$ is the element in the $u$-th row and the $v$-th column of $\textbf{S}$. For Case II, substituting the first four power moments given by Equation (9) in the $Q_B$ criterion in Theorem 1 gives

$$\begin{aligned} \frac{1}{n}\left\{ w_0 + \sum _{k=1}^{4} \frac{w_k}{n^2} \left( m^k + 2 \sum _{u=1}^{n-1} r^{k}_{u} \right) + \sum _{k=1}^{4} \frac{w_k}{n^2} \left( \sum _{u=1}^{n-1} \sum _{v=1}^{n-1} S^{k}_{uv}\right) \right\} . \end{aligned}$$

The contribution of the j-th row of $\textbf{D}$ to its $Q_B$ criterion value is the second term in the expression above, where we have that $\sum _{u=1}^{n-1} r^{k}_{u} = \sum _{i\ne j} T^{k}_{ij}$. The expression for Case I is found similarly. $\square $
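The row decomposition in Equation (9) can also be verified numerically. The following Python sketch is an illustration under stated assumptions, not the authors' code: it compares the sum of the $k$-th powers of all entries of $\textbf{T} = \textbf{D}\textbf{D}^{T}$ with the three-term split $m^k + 2\sum_u r_u^k + \sum_u \sum_v S_{uv}^k$ for every choice of the row $j$. The helper names (`gram`, `n2_E_direct`, `n2_E_split`, `equation_9_holds`) are hypothetical.

```python
import random


def gram(D):
    """Matrix of inner products between the rows of D, i.e. D D^T."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in D] for r1 in D]


def n2_E_direct(D, k):
    """n^2 * E_k as the sum of k-th powers of all entries of T = D D^T."""
    return sum(t ** k for row in gram(D) for t in row)


def n2_E_split(D, j, k):
    """Right-hand side of Equation (9) with row j singled out."""
    m = len(D[0])
    d_j = D[j]
    D_rest = D[:j] + D[j + 1:]                      # D_{-j}
    r = [sum(a * b for a, b in zip(row, d_j)) for row in D_rest]
    S = gram(D_rest)                                # D_{-j} D_{-j}^T
    return (
        m ** k
        + 2 * sum(x ** k for x in r)
        + sum(x ** k for row in S for x in row)
    )


def equation_9_holds(D):
    """Check the split for every row j and for k = 1, ..., 4."""
    return all(
        n2_E_direct(D, k) == n2_E_split(D, j, k)
        for j in range(len(D))
        for k in range(1, 5)
    )


# Hypothetical test case: a random 7-run, 4-factor design with entries +/-1.
rng = random.Random(2)
D = [[rng.choice((-1, 1)) for _ in range(4)] for _ in range(7)]
assert equation_9_holds(D)
```

Only the first two terms of the split depend on the $j$-th row, which is why a change to that row can be evaluated without recomputing the $\textbf{S}$ term.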

Appendix B: Computing times

Our goal here is to compare the computing times required for a completed optimization by the heuristic algorithms. An optimization of the PBCE algorithm involves its standard settings, that is, a perturbation size ($\alpha $) of 0.1, a maximum number of perturbations without improvement ($M$) equal to 100, and 5 restarts of the whole procedure. An optimization of the CE, RCP and PE algorithms involves 1000 iterations. In this way, we mimic executions of the algorithms conducted by a standard user.
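To make the role of the three settings concrete, the Python sketch below shows a generic perturbation-based local search with coordinate exchange. It is not the authors' PBCE algorithm, whose details are given in the main text of the article. Two things are assumptions made purely for illustration: the perturbation size is read as the fraction of design entries whose sign is flipped, and the objective `surrogate` is a simple stand-in ($n^2(B_1 + B_2)$) rather than the $Q_B$ criterion. All function names are hypothetical.

```python
import random


def surrogate(D):
    """Stand-in objective (NOT the Q_B criterion): n^2 * (B_1 + B_2)."""
    n, m = len(D), len(D[0])
    cols = [[D[l][i] for l in range(n)] for i in range(m)]
    total = sum(sum(c) ** 2 for c in cols)
    for i in range(m):
        for j in range(i + 1, m):
            total += sum(a * b for a, b in zip(cols[i], cols[j])) ** 2
    return total


def coordinate_exchange(D):
    """Flip single entries in place while any flip lowers the objective."""
    best = surrogate(D)
    improved = True
    while improved:
        improved = False
        for l in range(len(D)):
            for i in range(len(D[0])):
                D[l][i] = -D[l][i]
                value = surrogate(D)
                if value < best:
                    best, improved = value, True
                else:
                    D[l][i] = -D[l][i]  # undo a flip that did not help
    return best


def perturb(D, alpha, rng):
    """Flip the sign of a fraction alpha of the entries (at least one)."""
    n, m = len(D), len(D[0])
    count = max(1, round(alpha * n * m))
    for idx in rng.sample(range(n * m), count):
        D[idx // m][idx % m] *= -1


def perturbation_search(n, m, alpha=0.1, max_no_improve=100, restarts=5, seed=0):
    """Restarted local search; defaults mirror the settings quoted above."""
    rng = random.Random(seed)
    best_D, best_val = None, float("inf")
    for _ in range(restarts):
        D = [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(n)]
        val = coordinate_exchange(D)
        stall = 0
        while stall < max_no_improve:
            cand = [row[:] for row in D]
            perturb(cand, alpha, rng)
            cand_val = coordinate_exchange(cand)
            if cand_val < val:
                D, val, stall = cand, cand_val, 0
            else:
                stall += 1
        if val < best_val:
            best_D, best_val = D, val
    return best_D, best_val


# Small demonstration with reduced settings so that it runs quickly.
design, value = perturbation_search(n=8, m=4, max_no_improve=10, restarts=2)
print(value)
```

The stall counter resets whenever a perturbation followed by local search improves the incumbent, so `max_no_improve` bounds the number of consecutive unsuccessful perturbations rather than the total number of perturbations.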

We consider three sets of design problems. The first set has all combinations of numbers of runs and numbers of factors in Table 1 with $\pi_1 = 0.625$, while the second set has the 7-factor designs in Table 4. The third set of problems involves the 11-factor designs with 20 and 24 runs in Table 4.

Table 5 shows the computing times required for a completed optimization by the heuristic algorithms. For each design problem, the table gives the averages and standard deviations of 10 optimizations. Preliminary experiments revealed that a single optimization of the PE algorithm with 1000 iterations is computationally more demanding than the other algorithms, especially for designs with more than nine factors. For this reason, Table 5 gives the computing times required for optimizations of the PE algorithm with 10 iterations only. Using these computing times, we estimate the average time required by an optimization of the PE algorithm with 1000 iterations.

Clearly, the PBCE algorithm outperforms the CE and RCP algorithms in terms of the computing time. For the 9-, 11-, 13- and 17-factor designs in Table 5, the PBCE algorithm even takes less computing time than the PE algorithm with 10 iterations. For the 17-factor 18-run design, the PE algorithm was computationally infeasible because its 10 iterations took longer than an hour. In contrast, the PBCE algorithm required 15 s to solve the optimization problem in this case.

For the 6- and 7-factor designs, the average computing times of the PE algorithm in Table 5 should be multiplied by 100, so as to obtain the times required by optimizations with 1000 iterations. Therefore, the computing times for the PE algorithm are around 9.3 s for six factors, and between 227.9 and 532.9 s for seven factors. Clearly, these computing times are much larger than those of the PBCE algorithm in the table.
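The extrapolation used here is a simple linear scaling in the number of iterations. The short Python sketch below spells out that arithmetic. Since Table 5 is not reproduced in this text, the 10-iteration inputs are back-computed from the totals quoted above (9.3 s, 227.9 s and 532.9 s) and are not read from the table; the function name `extrapolate_time` is hypothetical.

```python
def extrapolate_time(measured_seconds, measured_iterations=10, target_iterations=1000):
    """Scale a measured time linearly to a larger number of iterations."""
    return measured_seconds * target_iterations / measured_iterations


# Inputs back-computed from the totals in the text (assumption, see above).
cases = {
    "6 factors": 0.093,
    "7 factors (lower end)": 2.279,
    "7 factors (upper end)": 5.329,
}
for label, seconds_for_10 in cases.items():
    estimate = extrapolate_time(seconds_for_10)
    print(f"{label}: about {estimate:.1f} s for 1000 iterations")
```

The linear scaling assumes that each PE iteration takes roughly the same time, which is the assumption implicit in multiplying the 10-iteration averages by 100.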

In conclusion, a completed optimization of the PBCE algorithm is computationally less expensive than a completed optimization of any of the benchmark heuristic algorithms.

About this article

Cite this article

Vazquez, A.R., Wong, W.K. & Goos, P. Constructing two-level $Q_B$-optimal screening designs using mixed-integer programming and heuristic algorithms. Stat Comput 33, 7 (2023). https://doi.org/10.1007/s11222-022-10168-1

Received: 03 April 2022
Accepted: 07 October 2022
Published: 25 November 2022
DOI: https://doi.org/10.1007/s11222-022-10168-1
