Bayesian A-Optimal Design of Experiment with Quantitative and Qualitative Responses

  • Original Article
  • Journal of Statistical Theory and Practice

Abstract

We consider the problem of A-optimal design of experiments under a Bayesian probabilistic model with both categorical and continuous response variables. The utility function of the local design problem is derived by applying the Bayesian experimental design framework. We also develop an efficient optimization algorithm that obtains the local optimal design by combining particle swarm optimization with blocked coordinate descent. In addition, we discuss two different ways of constructing the global optimal design based on the algorithm for the local optimal design. Simulation studies are presented to illustrate the efficiency of our approach.

Notes

  1. The linear-optimal (or L-optimal) criterion is defined by a linear function of the inverse of the information matrix. For a linear regression model with model matrix \({\varvec{F}}\), the inverse of the information matrix is \({\varvec{M}}=({\varvec{F}}'{\varvec{F}})^{-1}\), and the L-optimal criterion is \(L({\varvec{M}})\), where \(L(\cdot )\) is a linear function. A-optimality is a special case of L-optimality because \(L({\varvec{M}})=\text{ tr }({\varvec{A}}\times {\varvec{M}})\) is a linear function of \({\varvec{M}}\). More details can be found in [22].

  2. We have used the following formulas of matrix-to-scalar derivative in our calculations: \(\frac{\partial \text{ tr }(U)}{\partial x}=\text{ tr }(\frac{\partial U}{\partial x})\),\(\frac{\partial UV}{\partial x}=U\frac{\partial V}{\partial x}+\frac{\partial U}{\partial x}V\), \(\frac{\partial U^{-1}}{\partial x}=-U^{-1}\frac{\partial U}{\partial x}U^{-1}\).

References

  1. Deng X, Jin R (2015) QQ models: joint modeling for quantitative and qualitative quality responses in manufacturing systems. Technometrics 57(3):320–331

  2. Kang L, Kang X, Deng X, Jin R (2018) A Bayesian hierarchical model for quantitative and qualitative responses. J Qual Technol 50(3):290–308

  3. Kang L (2016) Bayesian D-optimal design of experiments with continuous and binary responses. In: International conference on design of experiments (ICODOE-2016)

  4. Eberhart RC, Kennedy J et al (1995) A new optimizer using particle swarm theory. In: Proceedings of the sixth international symposium on micro machine and human science, vol 1, pp 39–43. New York, NY

  5. Lukemire J, Mandal A, Wong WK (2016) Using particle swarm optimization to search for locally \(D\)-optimal designs for mixed factor experiments with binary response. arXiv preprint arXiv:1602.02187

  6. Chen R-B, Chang S-P, Wang W, Tung H-C, Wong WK (2015) Minimax optimal designs via particle swarm optimization methods. Stat Comput 25(5):975–988

  7. Wong WK, Chen R-B, Huang C-C, Wang W (2015) A modified particle swarm optimization technique for finding optimal designs for mixture models. PLoS ONE 10(6):e0124720

  8. Chen R-B, Hsu Y-W, Hung Y, Wang W (2014) Discrete particle swarm optimization for constructing uniform design on irregular regions. Comput Stat Data Anal 72:282–297

  9. Leatherman E, Dean A, Santner T (2014) Computer experiment designs via particle swarm optimization. In: Topics in statistical simulation, pp 309–317. Springer

  10. Kang L, Joseph VR (2012) Bayesian optimal single arrays for robust parameter design. Technometrics

  11. Wu CFJ, Hamada MS (2011) Experiments: planning, analysis, and optimization, vol 552. Wiley, Hoboken

  12. Ai M, Kang L, Joseph VR (2009) Bayesian optimal blocking of factorial designs. J Stat Plan Inference 139(9):3319–3328

  13. Joseph VR (2006) A Bayesian approach to the design and analysis of fractionated experiments. Technometrics 48(2):219–229

  14. Lindley DV (1972) Bayesian statistics: a review, vol. 2. SIAM, Philadelphia

  15. Chaloner K, Verdinelli I (1995) Bayesian experimental design: a review. Stat Sci 10(3):273–304

  16. Ryan EG, Drovandi CC, McGree JM, Pettitt AN (2016) A review of modern computational algorithms for Bayesian optimal design. Int Stat Rev 84(1):128–154

  17. Overstall AM, Woods DC (2017) Bayesian design of experiments using approximate coordinate exchange. Technometrics 59(4):458–470

  18. Drovandi CC, Tran M-N et al (2018) Improving the efficiency of fully Bayesian optimal design of experiments using randomised quasi-Monte Carlo. Bayesian Anal 13(1):139–162

  19. Woods DC, Overstall AM, Adamou M, Waite TW (2017) Bayesian design of experiments for generalized linear models and dimensional analysis with industrial and scientific application. Qual Eng 29(1):91–103

  20. Alexanderian A, Gloor PJ, Ghattas O et al (2016) On Bayesian A- and D-optimal experimental designs in infinite dimensions. Bayesian Anal 11(3):671–695

  21. Huan X, Marzouk YM (2013) Simulation-based optimal Bayesian experimental design for nonlinear systems. J Comput Phys 232(1):288–317

  22. Fedorov VV (1972) Theory of optimal experiments (translated by W. J. Studden and E. M. Klimko, eds.). Academic Press, New York

  23. Kiefer J (1985) Collected papers, vol 3. Surendra Kumar

  24. Yu Y (2011) D-optimal designs via a cocktail algorithm. Stat Comput 21(4):475–481

  25. Yang M, Biedermann S, Tang E (2013) On optimal designs for nonlinear models: a general and efficient algorithm. J Am Stat Assoc 108(504):1411–1420

Acknowledgements

This research was supported by U.S. National Science Foundation Grant CMMI-1435902.

Author information

Corresponding author

Correspondence to Lulu Kang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Part of special issue guest edited by Pritam Ranjan and Min Yang—Algorithms, Analysis and Advanced Methodologies in the Design of Experiments.

Appendices

A Proof of Theorem 1

For the A-optimal design, the design criterion is

$$\begin{aligned} \Phi ({\varvec{X}})&=\int \phi ({\varvec{\beta }}^{(1)},{\varvec{\beta }}^{(2)},{\varvec{\eta }},{\varvec{X}},{\varvec{y}},{\varvec{z}})p({\varvec{y}},{\varvec{z}}|{\varvec{X}},{\varvec{\beta }}^{(1)},{\varvec{\beta }}^{(2)},{\varvec{\eta }})\\&\quad \times \, p({\varvec{\beta }}^{(1)}, {\varvec{\beta }}^{(2)},{\varvec{\eta }})\text {d}{\varvec{\beta }}^{(1)}\text {d}{\varvec{\beta }}^{(2)}\text {d}{\varvec{\eta }} \text {d}{\varvec{y}} \text {d}{\varvec{z}}\\&=\int ({\varvec{\eta }}-\hat{{\varvec{\eta }}})'{\varvec{A}}_0({\varvec{\eta }}-\hat{{\varvec{\eta }}}) p({\varvec{z}}, {\varvec{\eta }}|{\varvec{X}})\text {d}{\varvec{z}} \text {d}{\varvec{\eta }}\\&\quad +\sum _{i=1}^2 \int ({\varvec{\beta }}^{(i)}-\hat{{\varvec{\beta }}}^{(i)})'{\varvec{A}}_i({\varvec{\beta }}^{(i)}-\hat{{\varvec{\beta }}}^{(i)}) p({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}})\\&\quad \times \, p({\varvec{y}}|{\varvec{X}},{\varvec{z}}) p({\varvec{z}}, {\varvec{\eta }}|{\varvec{X}})\text {d}{\varvec{\beta }}^{(i)}\text {d}{\varvec{y}} \text {d}{\varvec{z}} \text {d}{\varvec{\eta }} \\&=\int \text{ tr }[{\varvec{A}}_0\cdot \text{ var }({\varvec{\eta }}|{\varvec{X}},{\varvec{z}})] p({\varvec{z}}|{\varvec{X}})\text {d}{\varvec{z}} \\&\quad + \sum _{i=1}^2\int \text{ tr }[{\varvec{A}}_i\cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}})]p({\varvec{y}}|{\varvec{X}},{\varvec{z}}) p({\varvec{\eta }}|{\varvec{X}},{\varvec{z}}) p({\varvec{z}}|{\varvec{X}}) \text {d}{\varvec{\eta }} \text {d}{\varvec{y}} \text {d}{\varvec{z}}. \end{aligned}$$

It can be easily seen that

$$\begin{aligned} \text{ var }({\varvec{\eta }}|{\varvec{X}},{\varvec{z}})&=\left\{ \sum _{i=1}^n \pi ({\varvec{x}}_i, {\varvec{\eta }})(1-\pi ({\varvec{x}}_i, {\varvec{\eta }})){\varvec{f}}({\varvec{x}}_i){\varvec{f}}({\varvec{x}}_i)'+\tau ^{-2}{\varvec{R}}_0^{-1}\right\} ^{-1}\\&=\left\{ {\varvec{F}}'{\varvec{W}}_1{\varvec{W}}_2{\varvec{F}}+\tau ^{-2}{\varvec{R}}_0^{-1}\right\} ^{-1}=\sigma ^2\left\{ {\varvec{F}}'{\varvec{W}}_0{\varvec{F}}+\rho {\varvec{R}}_0^{-1}\right\} ^{-1}, \end{aligned}$$

and therefore

$$\begin{aligned} \int \text{ tr }[{\varvec{A}}_0\cdot \text{ var }({\varvec{\eta }}|{\varvec{X}},{\varvec{z}})]p({\varvec{z}}|{\varvec{X}})\text {d}{\varvec{z}}&=\int \text{ tr }[{\varvec{A}}_0\cdot \sigma ^2\left\{ {\varvec{F}}'{\varvec{W}}_0{\varvec{F}}+\rho {\varvec{R}}_0^{-1}\right\} ^{-1}] p({\varvec{z}}|{\varvec{X}}, {\varvec{\eta }})p({\varvec{\eta }}) \text {d}{\varvec{z}} \text {d}{\varvec{\eta }}\\&=\sigma ^2\int \text{ tr }[{\varvec{A}}_0\cdot \left\{ {\varvec{F}}'{\varvec{W}}_0{\varvec{F}}+\rho {\varvec{R}}_0^{-1}\right\} ^{-1}] p({\varvec{\eta }}) \text {d}{\varvec{\eta }}=Q_0. \end{aligned}$$

Also, in the last equation of \(\Phi ({\varvec{X}})\), the integration with respect to \({\varvec{\eta }}\) in the last two terms cannot be computed explicitly. So for the local optimal design, we omit the integration with respect to \({\varvec{\eta }}\) and instead let the objective function depend on \({\varvec{\eta }}\).
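As a rough illustration of the local version of \(Q_0\), the sketch below evaluates \(\text{ tr }[{\varvec{A}}_0\cdot \text{ var }({\varvec{\eta }}|{\varvec{X}},{\varvec{z}})]\) at one fixed \({\varvec{\eta }}\), using the first expression for \(\text{ var }({\varvec{\eta }}|{\varvec{X}},{\varvec{z}})\) given above. It assumes a logistic form for \(\pi ({\varvec{x}}_i,{\varvec{\eta }})\); the function name `local_q0` and all numerical inputs are hypothetical and are not taken from the paper.

```python
import numpy as np


def local_q0(F, eta, A0, R0_inv, tau2):
    """Trace term tr[A0 * var(eta | X, z)] at a fixed eta.

    F is the n x p model matrix with rows f(x_i)'. The logistic link
    used for pi(x_i, eta) is an assumption of this sketch.
    """
    pi = 1.0 / (1.0 + np.exp(-F @ eta))           # pi(x_i, eta), assumed logistic
    w = pi * (1.0 - pi)                           # Bernoulli variance weights
    info = F.T @ (w[:, None] * F) + R0_inv / tau2  # sum_i w_i f_i f_i' + tau^{-2} R0^{-1}
    return float(np.trace(A0 @ np.linalg.inv(info)))


# Hypothetical toy input: F = I and eta = 0 give pi_i = 0.5 for every run.
q0_example = local_q0(np.eye(2), np.zeros(2), np.eye(2), np.eye(2), 1.0)
print(q0_example)
```

In this toy case each weight is 0.25, so the matrix being inverted is \(1.25\,{\varvec{I}}_2\) and the trace is \(2/1.25=1.6\), which gives a quick sanity check of the formula.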

Thus the objective function for the local optimal design is

$$\begin{aligned} \Phi ({\varvec{X}}|{\varvec{\eta }}) = Q_0+\sum _{i=1}^2E_{{\varvec{z}}}\{\text{tr }[{\varvec{A}}_i \cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}})]\}. \end{aligned}$$
(20)

According to the conditional posterior distribution of \({\varvec{\beta }}^{(i)}\),

$$\begin{aligned} \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}})=\sigma ^2({\varvec{F}}'{\varvec{V}}_i{\varvec{F}}+\rho {\varvec{R}}_i^{-1})^{-1}. \end{aligned}$$

In Lemma 1, we prove the following

$$\begin{aligned}&\sigma ^2 E_{{\varvec{z}}}\{\text{tr }[{\varvec{A}}_i({\varvec{F}}'{\varvec{V}}_i{\varvec{F}}+\rho {\varvec{R}}_i^{-1})^{-1}] \} \nonumber \\&\quad =\sigma ^2 \text{ tr }[{\varvec{A}}_i({\varvec{F}}'{\varvec{W}}_i{\varvec{F}}+\rho {\varvec{R}}_i^{-1})^{-1}]+\Delta _i^2+E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]. \end{aligned}$$
(21)

Here \({\varvec{V}}_1=\text{ diag }\{{\varvec{z}}\}\), \({\varvec{V}}_2={\varvec{I}}_n-{\varvec{V}}_1\), \({\varvec{W}}_1=\text{ diag }\{{\varvec{\pi }}\}\), \({\varvec{W}}_2={\varvec{I}}_n-{\varvec{W}}_1\), and

$$\begin{aligned} \Delta _i^2\triangleq \sigma ^2\sum _{k=1}^n\pi _k(1-\pi _k)\text{ tr }[{\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{\pi }}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{\pi }}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{\pi }})]. \end{aligned}$$

With this result, we can reach the result in Theorem 1.

Lemma 1

Define the function \(Q_i({\varvec{b}}) = \sigma ^2\text{ tr }[{\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{b}})]\). Then (21) can be rewritten as

$$\begin{aligned} E_{{\varvec{z}}}Q_i({\varvec{z}})=Q_i({\varvec{\pi }})+\Delta _i^2+E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)] \end{aligned}$$
(22)

Proof

Performing a second-order Taylor expansion of \(Q_i({\varvec{z}})\) at \({\varvec{z}}={\varvec{\pi }}\), we have

$$\begin{aligned} Q_i({\varvec{z}})&=Q_i({\varvec{\pi }})+\sum _{k=1}^n\left. \frac{\partial Q_i({\varvec{z}})}{\partial z_k}\right| _{{\varvec{z}}={\varvec{\pi }}} (z_k-\pi _k) \nonumber \\&\quad +\frac{1}{2}\sum _{j=1}^n\sum _{k=1}^n \left. \frac{\partial ^2 Q_i({\varvec{z}})}{\partial z_j\partial z_k}\right| _{{\varvec{z}}={\varvec{\pi }}}(z_j-\pi _j)(z_k-\pi _k)+o(||{\varvec{z}}-{\varvec{\pi }}||^2). \end{aligned}$$
(23)

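Taking the expectation of (23), and using that the \(z_k\) are independent Bernoulli variables with \(E(z_k)=\pi _k\) (so the first-order terms and the cross terms with \(j\ne k\) vanish, while \(E[(z_k-\pi _k)^2]=\pi _k(1-\pi _k)\)), yields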

$$\begin{aligned} E_{{\varvec{z}}}Q_i({\varvec{z}})=Q_i({\varvec{\pi }})+\frac{1}{2}\sum _{k=1}^n \pi _k(1-\pi _k) \left. \frac{\partial ^2 Q_i({\varvec{z}})}{\partial z_k^2}\right| _{{\varvec{z}}={\varvec{\pi }}} +E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]. \end{aligned}$$
(24)

It remains to compute \(\frac{\partial ^2 Q_i({\varvec{z}})}{\partial z_k^2}\). We have (see Note 2)

$$\begin{aligned} \frac{\partial Q_i({\varvec{z}})}{\partial z_k}=\,&\sigma ^2\text{ tr }\left[ {\varvec{A}}_i\frac{\partial {\varvec{C}}_i^{-1}({\varvec{z}})}{\partial z_k}\right] \\ =&-\sigma ^2\text{ tr }\left[ {\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{z}})\frac{\partial {\varvec{C}}_i({\varvec{z}})}{\partial z_k}{\varvec{C}}_i^{-1}({\varvec{z}})\right] \\ =&-\sigma ^2\text{ tr }[{\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{z}}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{z}})], \end{aligned}$$

and

$$\begin{aligned} \frac{\partial ^2 Q_i({\varvec{z}})}{\partial z_k^2}=&-\sigma ^2\text{ tr }\left[ {\varvec{A}}_i\frac{\partial ({\varvec{C}}_i^{-1}({\varvec{z}}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{z}}))}{\partial z_k}\right] \\ =&-\sigma ^2\text{ tr }\left[ {\varvec{A}}_i\frac{\partial {\varvec{C}}_i^{-1}({\varvec{z}})}{\partial z_k}{\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{z}})+{\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{z}}){\varvec{S}}_k\frac{\partial {\varvec{C}}_i^{-1}({\varvec{z}})}{\partial z_k}\right] \\ =\,&2\sigma ^2\text{ tr }[{\varvec{A}}_i{\varvec{C}}_i^{-1}({\varvec{z}}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{z}}){\varvec{S}}_k{\varvec{C}}_i^{-1}({\varvec{z}})]. \end{aligned}$$

Evaluating this at \({\varvec{z}}={\varvec{\pi }}\) and substituting into (24) yields (22) and hence (21). This completes the proof. \(\square\)
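The closed-form second derivative can be checked numerically. The following is a minimal sketch, not part of the original proof: it compares the trace formula with a central finite difference of \(Q_i\) for the case \(i=1\), assuming \({\varvec{C}}_1({\varvec{b}})={\varvec{F}}'\text{ diag }\{{\varvec{b}}\}{\varvec{F}}+\rho {\varvec{R}}_1^{-1}\) and \({\varvec{S}}_k={\varvec{f}}({\varvec{x}}_k){\varvec{f}}({\varvec{x}}_k)'\), as suggested by (21). The model matrix, weight matrix, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs chosen only for illustration.
n, p = 8, 3
F = rng.normal(size=(n, p))   # rows play the role of f(x_k)'
A = np.eye(p)                 # stand-in for A_i
R_inv = np.eye(p)             # stand-in for R_i^{-1}
rho, sigma2 = 2.0, 1.0


def C(b):
    # Assumed form of C_1(b): F' diag(b) F + rho * R^{-1}
    return F.T @ (b[:, None] * F) + rho * R_inv


def Q(b):
    # Q_i(b) = sigma^2 tr[A_i C_i^{-1}(b)]
    return sigma2 * np.trace(A @ np.linalg.inv(C(b)))


def d2Q_closed(b, k):
    # Closed form: 2 sigma^2 tr[A C^{-1} S_k C^{-1} S_k C^{-1}]
    C_inv = np.linalg.inv(C(b))
    S_k = np.outer(F[k], F[k])
    return 2.0 * sigma2 * np.trace(A @ C_inv @ S_k @ C_inv @ S_k @ C_inv)


def d2Q_fd(b, k, h=1e-4):
    # Central second difference in coordinate k
    e = np.zeros(n)
    e[k] = h
    return (Q(b + e) - 2.0 * Q(b) + Q(b - e)) / h**2


pi = rng.uniform(0.2, 0.8, size=n)
max_gap = max(abs(d2Q_closed(pi, k) - d2Q_fd(pi, k)) for k in range(n))
print(max_gap)
```

For \(i=2\) the sign of \({\varvec{S}}_k\) flips under the same assumption, but \({\varvec{S}}_k\) enters the second derivative twice, so the closed form is unchanged.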

B Practical Justification of the Negligibility of \(E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]\)

In this section, we show that \(E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]\) is negligible in typical computational settings, and thus the objective function in (12), which ignores \(E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]\), remains a good approximation of the exact objective function.

First, we generate \(M=1000\) random samples \({\varvec{z}}_m\) from the distribution (4) for a given \({\varvec{\eta }}\) value and compute \(\text{ tr }[{\varvec{A}}_i\cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}}_m)]\) for each sample. Then, we replace \(E_{{\varvec{z}}}\{\text{tr }[{\varvec{A}}_i\cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}})]\}\) by its sample mean \(\frac{1}{M}\sum _{m=1}^M\text{ tr }[{\varvec{A}}_i\cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}}_m)]\) in (20), and thus estimate the exact objective function by

$$\begin{aligned} Q_{MC}=Q_0+\sum _{i=1}^2 \frac{1}{M}\sum _{m=1}^M\text{ tr }[{\varvec{A}}_i\cdot \text{ var }({\varvec{\beta }}^{(i)}|{\varvec{X}},{\varvec{y}},{\varvec{z}}_m)], \end{aligned}$$

where \(Q_0\) depends only on \({\varvec{\eta }}\) and not on \({\varvec{z}}\). We denote by Q the approximate objective value obtained by dropping \(E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]\) from (20), i.e.,

$$\begin{aligned} Q=Q_0+Q_1({\varvec{\pi }})+Q_2({\varvec{\pi }})+\Delta _1^2+\Delta _2^2, \end{aligned}$$

and none of the terms in Q depends on \({\varvec{z}}\).

To set up a simulation study comparing Q and \(Q_{MC}\), we take the same candidate points as in the simulation in Sect. 4.1, but use different parameter settings. For each run of the simulation, we sample \(r_0,r_1,r_2\) uniformly from (0, 1), \(\rho\) uniformly from (0, 10), and set \({\varvec{\eta }}=\eta \cdot {\mathbf {1}}_7\), where \(\eta\) is sampled uniformly from (0, 20). We fix \(\sigma ^2\) at 1 and keep the matrix \({\varvec{A}}\) as in Sect. 4.1, since they have only a scaling effect on the objective function. We repeat 1000 such runs, and for each run, we compute the relative error \(100\%(\frac{Q-Q_{MC}}{Q_{MC}})\). Figure 6 shows the histogram of the 1000 values of this relative error in percentage. The values are centered around 0, and more than \(93\%\) of them are within the range \((-1\%,+1\%)\). Therefore, we conclude that Q provides a good approximation of \(Q_{MC}\). Since their difference is, up to Monte Carlo error, an estimate of \(E[o(||{\varvec{z}}-{\varvec{\pi }}||^2)]\), this indicates that \(o(||{\varvec{z}}-{\varvec{\pi }}||^2)\) is practically negligible on average.
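The comparison described above can be sketched for a single term \(E_{{\varvec{z}}}Q_i({\varvec{z}})\). The code below is an illustration under stated assumptions, not the simulation reported in this appendix: it uses a hypothetical random model matrix rather than the candidate points of Sect. 4.1, assumes a logistic form for \({\varvec{\pi }}\), takes \({\varvec{C}}_1({\varvec{b}})={\varvec{F}}'\text{ diag }\{{\varvec{b}}\}{\varvec{F}}+\rho {\varvec{R}}_1^{-1}\), and draws the \(z_k\) as independent Bernoulli variables with means \(\pi _k\).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: none of these values come from the paper.
n, p, M = 40, 3, 1000
F = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, size=(n, p - 1))])
A = np.eye(p)        # stand-in for A_i
R_inv = np.eye(p)    # stand-in for R_i^{-1}
rho, sigma2 = 5.0, 1.0
eta = np.array([0.5, 1.0, -1.0])
pi = 1.0 / (1.0 + np.exp(-F @ eta))  # logistic link, assumed


def C(b):
    # Assumed form of C_1(b): F' diag(b) F + rho * R^{-1}
    return F.T @ (b[:, None] * F) + rho * R_inv


def Q(b):
    # Q_i(b) = sigma^2 tr[A_i C_i^{-1}(b)]
    return sigma2 * np.trace(A @ np.linalg.inv(C(b)))


def delta2(b):
    # Delta_i^2 = sigma^2 sum_k b_k (1 - b_k) tr[A C^{-1} S_k C^{-1} S_k C^{-1}]
    C_inv = np.linalg.inv(C(b))
    total = 0.0
    for k in range(n):
        S_k = np.outer(F[k], F[k])
        total += b[k] * (1.0 - b[k]) * np.trace(A @ C_inv @ S_k @ C_inv @ S_k @ C_inv)
    return sigma2 * total


# Monte Carlo estimate of E_z Q(z) from M Bernoulli(pi) draws.
z = (rng.random((M, n)) < pi).astype(float)
q_mc = float(np.mean([Q(z_m) for z_m in z]))

# Second-order approximation with the remainder term dropped.
q_approx = Q(pi) + delta2(pi)
rel_err = 100.0 * (q_approx - q_mc) / q_mc
print(f"Q_MC = {q_mc:.5f}, approximation = {q_approx:.5f}, relative error = {rel_err:.3f}%")
```

The relative error printed here mixes the dropped remainder with Monte Carlo noise, so increasing `M` is one way to separate the two.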

Fig. 6 Histogram of relative error (in percentage) between Q and \(Q_{MC}\)

To get a better understanding of how close Q and \(Q_{MC}\) are, we fix \(r_0=0.3\), \(r_1=0.4\), \(r_2=0.5\) and \(\rho =5\). The values of \(\sigma ^2\) and \({\varvec{A}}\) are the same as above. We vary only \(\eta\), as in \({\varvec{\eta }}=\eta \cdot {\mathbf {1}}_7\), from very small to relatively large, since \(\eta\) controls how close \({\varvec{V}}_i\) and \({\varvec{W}}_i\) are. The values of Q and \(Q_{MC}\) computed for different \(\eta\) values are listed in Table 1, from which we can see that Q approximates \(Q_{MC}\) well regardless of \(\eta\).

Table 1 A few examples of Q and \(Q_{MC}\) for different \(\eta\) values

Cite this article

Kang, L., Huang, X. Bayesian A-Optimal Design of Experiment with Quantitative and Qualitative Responses. J Stat Theory Pract 13, 64 (2019). https://doi.org/10.1007/s42519-019-0063-6

  • DOI: https://doi.org/10.1007/s42519-019-0063-6
