Constructing two-level \(Q_B\)-optimal screening designs using mixed-integer programming and heuristic algorithms

  • Original Paper
  • Statistics and Computing

Abstract

Two-level screening designs are widely applied in the manufacturing industry to identify the influential factors of a system. These designs have each factor at two levels and are traditionally constructed using standard algorithms, which rely on a pre-specified linear model. Since the assumed model may depart from the truth, two-level \(Q_B\)-optimal designs have been developed to provide efficient parameter estimates for several potential models. These designs also have the overarching goal that models more likely to be the best for explaining the data are estimated more efficiently than the rest. However, there is no effective algorithm for constructing them. This article proposes two methods: a mixed-integer programming algorithm that guarantees convergence to the two-level \(Q_B\)-optimal designs, and a heuristic algorithm that employs a novel formula to find good designs in short computing times. Using numerical experiments, we show that our mixed-integer programming algorithm is attractive for finding small optimal designs, and that our heuristic algorithm is the most computationally effective approach for constructing both small and large designs when compared to benchmark heuristic algorithms.

References

  • Atkinson, A., Donev, A., Tobias, R.: Optimum Experimental Designs, With SAS. OUP Oxford (2007)

  • Bertsimas, D., Shioda, R.: Algorithm for cardinality-constrained quadratic optimization. Comput. Optim. Appl. 43, 1–22 (2009)

  • Bertsimas, D., Weismantel, R.: Optimization Over Integers. Dynamic Ideas Press (2005)

  • Bertsimas, D., King, A., Mazumder, R.: Best subset selection via modern optimization lens. Ann. Stat. 44, 813–852 (2016)

  • Bienstock, D.: Computational study of a family of mixed-integer quadratic programming problems. Math. Program. 74, 121–140 (1996)

  • Bixby, R.: A brief history of linear and mixed-integer programming computation. In: Documenta Mathematica. Extra Volume: Optimization Stories, pp. 107–121 (2012)

  • Booth, K.H.V., Cox, D.R.: Some systematic supersaturated designs. Technometrics 4, 489–495 (1962)

  • Bulutoglu, D.A., Margot, F.: Classification of orthogonal arrays by integer programming. J. Stat. Plan. Inference 138, 654–666 (2008)

  • Butler, N.A.: Minimum aberration construction results for nonregular two-level fractional factorial designs. Biometrika 90, 891–898 (2003a)

  • Butler, N.A.: Some theory for constructing minimum aberration fractional factorial designs. Biometrika 90, 233–238 (2003b)

  • Chipman, H.: Bayesian variable selection with related predictors. Can. J. Stat. 24, 17–36 (1996)

  • Chipman, H., Hamada, M., Wu, C.F.J.: A Bayesian variable-selection approach for analyzing designed experiments with complex aliasing. Technometrics 39, 372–381 (1997)

  • Cook, R.D., Nachtsheim, C.J.: A comparison of algorithms for constructing exact D-optimal designs. Technometrics 22, 315–324 (1980)

  • De Campos, L.M., Fernández-Luna, J.M., Puerta, J.M.: An iterated local search algorithm for learning Bayesian networks with restarts based on conditional independence tests. Int. J. Intell. Syst. 18(2), 221–235 (2003)

  • De Corte, A., Sörensen, K.: An iterated local search algorithm for water distribution network design optimization. Networks 67(3), 187–198 (2016)

  • DuMouchel, W., Jones, B.: A simple Bayesian modification of D-optimal designs to reduce dependence on an assumed model. Technometrics 36(1), 37–47 (1994)

  • Eendebak, P., Schoen, E.D.: Two-level designs to estimate all main effects and two-factor interactions. Technometrics 59, 69–79 (2017)

  • Elster, C., Neumaier, A.: Screening by conference designs. Biometrika 82, 589–602 (1995)

  • Goos, P., Jones, B.: Design of Experiments: A Case Study Approach. Wiley, New York (2011)

  • Goos, P., Kobilinsky, A., O’Brien, T.E., Vandebroek, M.: Model-robust and model-sensitive designs. Comput. Stat. Data Anal. 49, 201–216 (2005)

  • Grömping, U., Fontana, R.: An algorithm for generating good mixed level factorial designs. Comput. Stat. Data Anal. 137, 101–114 (2019)

  • Grosso, A., Jamali, A., Locatelli, M.: Finding maximin Latin hypercube designs by iterated local search heuristics. Eur. J. Oper. Res. 197(2), 541–547 (2009)

  • Gurobi Optimization, LLC (2020). Gurobi Optimizer Reference Manual. Available at http://www.gurobi.com

  • Harman, R., Filová, L.: Computing efficient exact designs of experiments using integer quadratic programming. Comput. Stat. Data Anal. (2014)

  • Harville, D.A.: Matrix Algebra from a Statistician’s Perspective. Springer, New York (2011)

  • Heredia-Langner, A., Montgomery, D.C., Carlyle, W.M., Borror, C.M.: Model-robust optimal designs: a genetic algorithm approach. J. Qual. Technol. 36, 263–279 (2004)

  • Jones, B., Lin, D.K.J., Nachtsheim, C.J.: Bayesian D-optimal supersaturated designs. J. Stat. Plan. Inference 138, 86–92 (2007)

  • Jones, B.A., Li, W., Nachtsheim, C.J., Ye, K.Q.: Model-robust supersaturated and partially supersaturated designs. J. Stat. Plan. Inference 139, 45–53 (2009)

  • Jünger, M., Liebling, T.M., Naddef, D., Nemhauser, G.L., Pulleyblank, W.R., Reinelt, G., Rinaldi, G., and Wolsey, L.A. (eds.): 50 Years of Integer Programming 1958-2008—From the Early Years to the State-of-the-Art. Springer (2010)

  • Li, W.: Screening designs for model selection. In: Dean, A., Lewis, S. (eds.) Screening, pp. 207–234. Springer, New York (2006)

  • Li, W., Nachtsheim, C.J.: Model-robust factorial designs. Technometrics 42, 345–352 (2000)

  • Li, X., Sudarsanam, N., Frey, D.D.: Regularities in data from factorial experiments. Complexity 11, 32–45 (2006)

  • Lin, D.K.J.: Another look at first-order saturated design: the p-efficient designs. Technometrics 35, 284–292 (1993)

  • Loeppky, J.L., Sitter, R.R., Tang, B.: Nonregular designs with desirable projection properties. Technometrics 49, 454–467 (2007)

  • Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search: framework and applications. In: Gendreau, M., Potvin, J.-Y. (eds.) Handbook of Metaheuristics, pp. 129–168. Springer (2019)

  • McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapman and Hall (1989)

  • Mee, R.: A Comprehensive Guide to Factorial Two-Level Experimentation. Mathematics and Statistics, Springer (2009)

  • Mee, R.W., Schoen, E.D., Edwards, D.E.: Selecting an orthogonal or non-orthogonal two-level design for screening. Technometrics 59, 305–318 (2017)

  • Merz, P., Huhse, J.: An iterated local search approach for finding provably good solutions for very large tsp instances. In: Rudolph, G., Jansen, T., Beume, N., Lucas, S., Poloni, C. (eds.) Parallel Problem Solving from Nature–PPSN X, pp. 929–939. Springer, Berlin (2008)

  • Meyer, R.K., Nachtsheim, C.J.: The coordinate-exchange algorithm for constructing exact optimal experimental designs. Technometrics 37, 60–69 (1995)

  • Michalewicz, Z., Fogel, D.: How to Solve It: Modern Heuristics. Springer (2004)

  • Núñez-Ares, J., Goos, P.: Enumeration and multicriteria selection of orthogonal minimally aliased response surface designs. Technometrics 62, 21–36 (2020)

  • Palhazi-Cuervo, D., Goos, P., Sörensen, K.: Optimal design of large-scale screening experiments: a critical look at the coordinate-exchange algorithm. Stat. Comput. 26, 15–28 (2016)

  • Palhazi-Cuervo, D., Goos, P., Sörensen, K., Arráiz, E.: An iterated local search algorithm for the vehicle routing problem with backhauls. Eur. J. Oper. Res. 237(2), 454–464 (2014)

  • Sartono, B., Goos, P., Schoen, E.D.: Constructing general orthogonal fractional factorial split-plot designs. Technometrics 57, 488–502 (2015a)

  • Sartono, B., Schoen, E.D., Goos, P.: Blocking orthogonal designs with mixed integer linear programming. Technometrics 57, 428–439 (2015b)

  • Schoen, E.D., Eendebak, P.T., Nguyen, M.V.M.: Complete enumeration of pure-level and mixed-level orthogonal arrays. J. Comb. Des. 18, 123–140 (2010)

  • Smucker, B., Drew, N.M.: Approximate model spaces for model-robust experiment design. Technometrics 57, 54–63 (2015)

  • Smucker, B.J., del Castillo, E., Rosenberg, J.L.: Exchange algorithms for constructing model-robust experimental designs. J. Qual. Technol. 43, 28–42 (2011)

  • Smucker, B.J., del Castillo, E., Rosenberg, J.L.: Model-robust two-level designs using coordinate exchange algorithms and a maximin criterion. Technometrics 54, 367–375 (2012)

  • Sun, D. X.: Estimation Capacity and Related Topics in Experimental Design. PhD thesis, University of Waterloo, Department of Statistics and Actuarial Science, Waterloo (1993)

  • Sun, D.X., Li, W., Ye, K.Q.: Algorithmic construction of catalogs of non-isomorphic two-level orthogonal designs for economic run sizes. Stat. Appl. 6, 141–155 (2008)

  • Tang, B.: Theory of J-characteristics for fractional factorial designs and projection justification of minimum \(G_2\)-aberration. Biometrika 88, 401–407 (2001)

  • Tang, B., Deng, L.Y.: Minimum \(G_2\)-aberration for non-regular fractional factorial designs. Ann. Stat. 27, 1914–1926 (1999)

  • Tsai, P.-W., Gilmour, S.G.: A general criterion for factorial designs under model uncertainty. Technometrics 52, 231–242 (2010)

  • Tsai, P.-W., Gilmour, S.G.: New families of \(Q_B\)-optimal saturated two-level main effects screening designs. Stat. Sin. 26, 605–617 (2016)

  • Tsai, P.-W., Gilmour, S.G., Mead, R.: Projective three-level main effects designs robust to model uncertainty. Biometrika 87, 467–475 (2000)

  • Tsai, P.-W., Gilmour, S.G., Mead, R.: Three-level main-effects designs exploiting prior information about model uncertainty. J. Stat. Plan. Inference 137, 619–627 (2007)

  • Vo-Thanh, N., Jans, R., Schoen, E.D., Goos, P.: Symmetry breaking in mixed integer linear programming formulations for blocking two-level orthogonal experimental designs. Comput. Oper. Res. 97, 96–110 (2018)

  • Wolsey, L.: Integer Programming, 2nd edn. Wiley (2020)

  • Wu, C.F.J., Hamada, M.S.: Experiments: Planning, Analysis and Optimization, 2nd edn. Wiley (2009)

Acknowledgements

The research of the first author was financially supported by the Flemish Fund for Scientific Research (FWO) through the Junior Postdoctoral Fellowship 1243320N. The first author thanks José Núñez Ares for the discussions that shifted the initial focus of this research towards mixed-integer programming.

Author information

Corresponding author

Correspondence to Alan R. Vazquez.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (zip 21 KB)

Appendices

Appendix A: Proofs

Proof of Lemma 1

Let \(\textbf{D} = (D_{ij})\) where \(D_{ij} = \pm 1\). We define

$$\begin{aligned} s_{i_1 i_2 \ldots i_k} = \sum _{l = 1}^{n} D_{l i_1} D_{l i_2} \cdots D_{l i_k}, \end{aligned}$$

where \(i_r \in \{1, \ldots , m\}\). Note that when all elements in \((i_1, i_2, \ldots , i_k)\) are different, \(s_{i_1 i_2 \ldots i_k}\) is a \(J_k\)-characteristic of \(\textbf{D}\) (Tang 2001). The generalized word count of length k of \(\textbf{D}\) can be calculated as follows:

$$\begin{aligned} B_k = \frac{1}{n^2 } \sum _{i_1< \cdots < i_k} s^{2}_{i_1 \ldots i_k}. \end{aligned}$$

From Lemma 1 of Butler (2003b), we have that

$$\begin{aligned} E_k = \frac{1}{n^2 } \sum ^{m}_{i_1 = 1} \sum ^{m}_{i_2 = 1} \cdots \sum ^{m}_{i_k = 1} s^{2}_{i_1 i_2 \ldots i_k}. \end{aligned} \tag{8}$$

We will show that the power moments \(E_1\), \(E_2\), \(E_3\) and \(E_4\) can be expressed as linear combinations of the generalized word counts \(B_1\), \(B_2\), \(B_3\) and \(B_4\). The result for the first power moment is immediate, since \(E_1 = B_1\). For the second power moment, \(s_{i_1 i_1} = n\) because \(D_{l i_1}^2 = 1\), so we have that

$$\begin{aligned} E_2 = \frac{1}{n^2} \sum _{i_1 = 1}^{m} s^2_{i_1 i_1} + \frac{2}{n^2} \sum _{i_1 = 1}^{m-1} \sum _{i_2 = i_1 + 1}^{m} s^{2}_{i_1 i_2} = m + 2 B_2. \end{aligned}$$

The third power moment \(E_3\) in Equation (8) can be expressed as a sum of \(n^{-2} s^2_{i_1 i_2 i_3}\) over three types of triplets \((i_1, i_2, i_3)\). The first type involves triplets in which all elements are equal. Since \(D_{l i}^3 = D_{l i}\), the sum of all \(n^{-2} s^2_{i_1 i_2 i_3}\) over this type equals \(B_1\). The second type involves triplets in which all elements are distinct. The sum of all \(n^{-2} s^2_{i_1 i_2 i_3}\) over the second type equals \(6 B_3\), where the factor 6 counts the permutations of the elements in \((i_1, i_2, i_3)\). The third type of triplets is such that exactly two elements are equal. Here the repeated index drops out because \(D_{l i}^2 = 1\), and each remaining index can be combined with \(m-1\) repeated indices in 3 positions, so the sum of all \(n^{-2} s^2_{i_1 i_2 i_3}\) over this type is \(3(m-1)B_1\). Therefore, we have that \(E_3 = B_1 + 3(m-1)B_1 + 6B_3 = (3m - 2)B_1 + 6B_3\).

The fourth power moment \(E_4\) in Equation (8) can be expressed as a sum of \(n^{-2} s^2_{i_1 i_2 i_3 i_4}\) over five types of quadruplets \((i_1, i_2, i_3, i_4)\). These types are as follows: (1) all elements are equal; (2) all elements are distinct; (3) exactly two elements are equal and the other two are distinct from them and from each other; (4) exactly three elements are equal; and (5) the elements form two different pairs of equal elements. By a counting argument, we can show that the sums of the \(n^{-2} s^2_{i_1 i_2 i_3 i_4}\) over the first, second, third, fourth and fifth types of quadruplets equal \(m\), \(24 B_4\), \(12(m-2)B_2\), \(8 B_2\) and \(3m(m-1)\), respectively. Adding these totals shows that \(E_4 = 24B_4 + 4(3m - 4) B_2 + m(3m-2)\). \(\square \)
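The four identities of Lemma 1 can be checked numerically on a small design. The sketch below is an illustration only, not the authors' code: the function names are hypothetical, the design is a random \(\pm 1\) matrix, and \(E_k\) is computed as the sum of the k-th powers of the entries of \(\textbf{D}\textbf{D}^{T}\) divided by \(n^2\), which is the form used in Equation (9).

```python
import itertools
import numpy as np

def generalized_word_count(D, k):
    """B_k: squared J_k-characteristics summed over k distinct columns, divided by n^2."""
    n, m = D.shape
    total = 0
    for cols in itertools.combinations(range(m), k):
        s = int(D[:, list(cols)].prod(axis=1).sum())
        total += s * s
    return total / n**2

def power_moment(D, k):
    """E_k: sum of the k-th powers of the entries of T = D D^T, divided by n^2."""
    n = D.shape[0]
    T = D @ D.T
    return int((T ** k).sum()) / n**2

def lemma1_right_hand_sides(D):
    """Right-hand sides of the expressions for E_1, ..., E_4 in terms of B_1, ..., B_4."""
    m = D.shape[1]
    B1, B2, B3, B4 = (generalized_word_count(D, k) for k in (1, 2, 3, 4))
    return [
        B1,
        m + 2 * B2,
        (3 * m - 2) * B1 + 6 * B3,
        24 * B4 + 4 * (3 * m - 4) * B2 + m * (3 * m - 2),
    ]

# Arbitrary 8-run, 5-factor design with entries -1 and +1 (illustrative only).
rng = np.random.default_rng(0)
D = rng.choice([-1, 1], size=(8, 5))
lhs = [power_moment(D, k) for k in (1, 2, 3, 4)]
rhs = lemma1_right_hand_sides(D)
print(np.allclose(lhs, rhs))
```

Enumerating all column subsets is only practical for small m and k; the check is meant as a sanity test of the counting argument rather than an efficient way to evaluate the word counts.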

Table 5 Average computing times in seconds and their standard deviations for 10 optimizations performed by the heuristic algorithms

Proof of Corollary 1

Let \(\textbf{d}_j\) denote the \(m \times 1\) vector containing the j-th row of \(\textbf{D}\). Without loss of generality, we write \(\textbf{D} = [\textbf{d}_j, \textbf{D}^{T}_{-j}]^{T}\), where \(\textbf{D}_{-j}\) is the \((n-1) \times m\) design matrix excluding the j-th row. Then \(\textbf{T} = \textbf{D}\textbf{D}^{T}\) has the block form

$$\begin{aligned} \textbf{T} = \left( \begin{array}{cc} m &{} \textbf{d}_{j}^{T}\textbf{D}^{T}_{-j} \\ \textbf{D}_{-j} \textbf{d}_j &{} \textbf{D}_{-j} \textbf{D}_{-j}^{T} \end{array}\right) . \end{aligned}$$

Let \(\textbf{r} = \textbf{D}_{-j} \textbf{d}_j\) and \(\textbf{S} = \textbf{D}_{-j} \textbf{D}_{-j}^{T}\). The k-th power moment of \(\textbf{T}\) is

$$\begin{aligned} n^2 E_k = m^k + 2 \sum _{u=1}^{n-1} r^{k}_{u} + \sum _{u=1}^{n-1} \sum _{v=1}^{n-1} S^{k}_{uv}, \end{aligned} \tag{9}$$

where \(r_u\) is the u-th element of \(\textbf{r}\) and \(S_{uv}\) is the element in the u-th row and the v-th column of \(\textbf{S}\). For Case II, substituting the first four power moments given by Equation (9) into the \(Q_B\) criterion in Theorem 1 gives

$$\begin{aligned} \frac{1}{n}\left\{ w_0 + \sum _{k=1}^{4} \frac{w_k}{n^2} \left( m^k + 2 \sum _{u=1}^{n-1} r^{k}_{u} \right) + \sum _{k=1}^{4} \frac{w_k}{n^2} \left( \sum _{u=1}^{n-1} \sum _{v=1}^{n-1} S^{k}_{uv}\right) \right\} . \end{aligned}$$

The contribution of the j-th row of \(\textbf{D}\) to its \(Q_B\) criterion value is the second term in the expression above, where we have that \(\sum _{u=1}^{n-1} r^{k}_{u} = \sum _{i\ne j} T^{k}_{ij}\). The expression for Case I is found similarly. \(\square \)
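The decomposition in Equation (9) can be illustrated with a short numerical sketch. The function names are hypothetical, the design is random, and the `weights` argument stands in for the weights \(w_1, \ldots , w_4\) of Theorem 1, whose values are not reproduced here; any numbers passed to it are placeholders.

```python
import numpy as np

def row_term(D, j, k):
    """m^k + 2 * sum_{i != j} T_ij^k: the part of n^2 E_k that involves row j."""
    m = D.shape[1]
    T = D @ D.T
    off_diagonal = np.delete(T[j], j)  # the vector r in the proof
    return m**k + 2 * int((off_diagonal ** k).sum())

def remainder_term(D, j, k):
    """sum_{u,v} S_uv^k with S = D_{-j} D_{-j}^T, which does not involve row j."""
    D_minus = np.delete(D, j, axis=0)
    S = D_minus @ D_minus.T
    return int((S ** k).sum())

def row_contribution(D, j, weights):
    """Second term of the Case II expression; weights maps k to a placeholder w_k."""
    n = D.shape[0]
    return sum(w * row_term(D, j, k) for k, w in weights.items()) / n**3

# Arbitrary 6-run, 4-factor design with entries -1 and +1 (illustrative only).
rng = np.random.default_rng(1)
D = rng.choice([-1, 1], size=(6, 4))
n = D.shape[0]
for k in (1, 2, 3, 4):
    total = int(((D @ D.T) ** k).sum())  # n^2 E_k
    assert all(row_term(D, j, k) + remainder_term(D, j, k) == total for j in range(n))
```

Because `remainder_term` does not depend on the j-th row, a row-wise update only needs to recompute `row_term`, which is the practical point of the corollary.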

Appendix B: Computing times

Our goal here is to compare the computing times that the heuristic algorithms require for a complete optimization. An optimization of the PBCE algorithm uses its standard settings, that is, a perturbation size (\(\alpha \)) of 0.1, a maximum number of perturbations without improvement (M) equal to 100, and 5 restarts of the whole procedure. An optimization of the CE, RCP and PE algorithms involves 1000 iterations. In this way, we mimic executions of the algorithms conducted by a standard user.

We consider three sets of design problems. The first set has all combinations of numbers of runs and numbers of factors in Table 1 with \(\pi _1\) = 0.625, while the second set has the 7-factor designs in Table 4. The third set of problems involves the 11-factor designs with 20 and 24 runs in Table 4.

Table 5 shows the computing times that the heuristic algorithms require for a complete optimization. For each design problem, the table gives the averages and standard deviations over 10 optimizations. Preliminary experiments revealed that a single optimization of the PE algorithm with 1000 iterations is computationally more demanding than an optimization of the other algorithms, especially for designs with more than nine factors. For this reason, Table 5 gives the computing times required for optimizations of the PE algorithm with only 10 iterations. Using these computing times, we estimate the average time required by an optimization of the PE algorithm with 1000 iterations.

Clearly, the PBCE algorithm outperforms the CE and RCP algorithms in terms of the computing time. For the 9-, 11-, 13- and 17-factor designs in Table 5, the PBCE algorithm even takes less computing time than the PE algorithm with 10 iterations. For the 17-factor 18-run design, the PE algorithm was computationally infeasible because its 10 iterations took longer than an hour. In contrast, the PBCE algorithm required 15 s to solve the optimization problem in this case.

For the 6- and 7-factor designs, the average computing times of the PE algorithm in Table 5 should be multiplied by 100, so as to obtain the times required by optimizations with 1000 iterations. Therefore, the computing times for the PE algorithm are around 9.3 s for six factors, and between 227.9 and 532.9 s for seven factors. Clearly, these computing times are much larger than those of the PBCE algorithm in the table.
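The scaling described above amounts to a linear extrapolation in the number of iterations. The sketch below makes that assumption explicit; the function name is hypothetical, and the input of 0.093 s is back-calculated from the 9.3 s quoted for six factors rather than read from Table 5.

```python
def extrapolate_seconds(avg_seconds, measured_iterations=10, target_iterations=1000):
    """Scale an average computing time linearly with the number of iterations."""
    return avg_seconds * target_iterations / measured_iterations

# Assumed 10-iteration average for the 6-factor case (back-calculated, see above).
print(round(extrapolate_seconds(0.093), 1))
```

The estimate is only as good as the assumption that each PE iteration takes roughly the same time.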

In conclusion, a complete optimization of the PBCE algorithm is computationally less expensive than a complete optimization of any of the benchmark heuristic algorithms.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Vazquez, A.R., Wong, W.K. & Goos, P. Constructing two-level \(Q_B\)-optimal screening designs using mixed-integer programming and heuristic algorithms. Stat Comput 33, 7 (2023). https://doi.org/10.1007/s11222-022-10168-1

  • DOI: https://doi.org/10.1007/s11222-022-10168-1
