1 Introduction

In practice, a topology optimization problem’s data, e.g. the applied loads or material properties, are typically uncertain, and the optimal solution can be sensitive to the specific values of such data: a small change in the data can cause a significant change in the objective value or render the obtained solution infeasible. There are multiple ways to model such data uncertainty. Robust optimization (RO), stochastic optimization (SO), risk-averse optimization (RAO) and reliability-based design optimization (RBDO) are some of the terms used in the optimization literature to describe a plethora of techniques for handling uncertainty in the data of an optimization problem under different uncertainty models. In this paper, the focus is on SO and RAO. For more on RO, the readers are referred to Bertsimas et al. (2011) and Ben-Tal et al. (2009); for more on RBDO, to Choi et al. (2007) and Youn and Choi (2004).

In SO and RAO, the data is assumed to follow a known probability distribution (Shapiro et al. 2009; Choi et al. 2007). Let \(\varvec{f}\) be a random load and \(\varvec{x}\) be the topology design variables. A probabilistic constraint can be defined as \(P(g(\varvec{x}; \varvec{f}) \le 0) \ge \eta\), where \(\varvec{f}\) follows a known probability distribution. This constraint is often called a chance constraint, or a reliability constraint in RBDO. The objective of an SO problem is typically either deterministic or some probabilistic function, such as the mean of a function of the random variable, its variance or standard deviation, or a weighted sum of such terms. RAO can be considered a sub-field of SO, borrowing concepts from risk analysis in mathematical economics to define various risk measures and tractable approximations to be used in the objectives and/or constraints of SO problems. One such risk measure is the conditional value-at-risk (CVaR) (Shapiro et al. 2009). Other more traditional risk measures include the weighted sum of the mean and variance of a function, or the weighted sum of the mean and standard deviation. For more on SO and RAO, the reader is referred to Shapiro et al. (2009).

In the topology optimization literature, the term “robust topology optimization” is often used to refer to minimizing the weighted sum of the mean and variance, or the mean and standard deviation, of a function subject to probabilistic uncertainty (Dunning and Kim 2013; Zhao and Wang 2014b; Cuellar et al. 2018). However, this use of the term “robust optimization” is not consistent with the standard definition of RO in the optimization theory literature, e.g. Ben-Tal et al. (2009). The more consistent terms used in this paper are stochastic topology optimization and risk-averse topology optimization.

Many works in the literature have tackled the problem of load uncertainty in mean compliance minimization (Guest and Igusa 2008; Dunning et al. 2011; Zhao and Wang 2014b; Zhang et al. 2017; Hutchinson 1990; Liu and Wen 2018; Tarek and Ray 2021). For a more detailed account of this literature, the readers are referred to Tarek and Ray (2021). Of all the works reviewed, only two (Zhang et al. 2017; Tarek and Ray 2021) dealt with data-driven design with a finite number of loading scenarios. The loading scenarios can be collected data or samples drawn from assumed distributions. The main limitation of the work by Zhang et al. (2017) is that it can only be used to minimize the mean compliance, which is not risk-averse: at the optimal solution, the compliance can still be very high for some probable load scenarios even if the mean compliance is minimized. In the work by Tarek and Ray (2021), a few exact methods for handling a large number of loading scenarios in compliance-based problems were proposed based on the singular value decomposition (SVD), applicable when the loading matrix \(\varvec{F}\) has a low rank and/or only a few degrees of freedom are loaded. When these conditions are not satisfied, however, the SVD-based approach may not be efficient enough. In particular, the SVD-based approaches have two limitations:

  1. The computational time complexity of computing the SVD of \(\varvec{F}\) is \(O({\text{min}}(L, n_{\text{dofs}})^2 {\text{max}}(L, n_{\text{dofs}}))\) if the loads are dense, where L is the number of loading scenarios and \(n_{\text{dofs}}\) is the number of degrees of freedom. This can be computationally prohibitive for large problems.

  2. The load matrix may not be low rank.

Some authors also studied risk-averse compliance minimization by considering the weighted sum of the mean and variance, the weighted sum of the mean and standard deviation, as well as other risk measures (Dunning and Kim 2013; Zhao and Wang 2014a; Chen et al. 2010; Martínez-Frutos and Herrero-Pérez 2018; Cuellar et al. 2018; Martínez-Frutos et al. 2018; Garcia-Lopez et al. 2013; Kriegesmann and Lüdeker 2019). For a more detailed account of the literature on this, the readers are referred to Tarek and Ray (2021).

In this paper, computationally efficient, SVD-free approximation schemes are proposed to estimate the value and gradient of:

  1. The mean compliance, and

  2. A class of scalar-valued functions of the load compliances satisfying a few conditions, e.g. the standard deviation of the compliance,

subject to a finite number of possible loading scenarios. The approximation schemes proposed here differ from the algorithms proposed in Tarek and Ray (2021) in that they do not require an SVD, which can be computationally expensive when there are many loading scenarios. Another difference is that the methods proposed in Tarek and Ray (2021) were exact, while the ones proposed here are approximation schemes.

These approaches can be used in risk-averse compliance minimization. The rest of this paper is organized as follows. The proposed approaches for handling load uncertainty in continuum compliance problems in the form of a large, finite number of loading scenarios are detailed in Sects. 2.2 and 2.3. The experiments and implementations used are described in Sect. 3. The results are presented and discussed in Sects. 4, 5 and 6, and conclusions are drawn in Sect. 7.

2 Proposed algorithms

2.1 Solid isotropic material with penalization

In this paper, the solid isotropic material with penalization (SIMP) method (Bendsoe 1989; Sigmund 2001; Rojas-Labanda and Stolpe 2015) is used to solve the topology optimization problems. Let \(0 \le x_e \le 1\) be the decision variable associated with element e in the ground mesh and \(\varvec{x}\) be the vector of such decision variables. Let \(\rho _e\) be the pseudo-density of element e, and \(\varvec{\rho }(\varvec{x})\) be the vector of such variables after sequentially applying to \(\varvec{x}\) the following steps (a sketch follows the list):

  1. A checkerboard density filter typically of the form \(f_1(\varvec{x}) = \varvec{A} \varvec{x}\) for some constant matrix \(\varvec{A}\) (Bruns and Tortorelli 2001; Bourdin 2001),

  2. An interpolation of the form \(f_2(y) = (1 - x_{\text {min}})y + x_{\text {min}}\) applied element-wise for some small \(x_{\text {min}} > 0\) such as 0.001,

  3. A penalty such as the power penalty \(f_3(z) = z^p\) applied element-wise for some penalty value p, and

  4. A projection method such as the regularized Heaviside projection (Guest et al. 2004) applied element-wise.
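The following is a minimal Julia sketch of the four steps applied in the order listed above. The filter matrix `A`, the element count and all parameter values are toy stand-ins rather than the settings used in the experiments; the Heaviside form is the regularized projection of Guest et al. (2004).

```julia
using LinearAlgebra

# Regularized Heaviside projection of Guest et al. (2004); reduces to the
# identity at beta = 0.
heaviside(y, beta) = 1 - exp(-beta * y) + y * exp(-beta)

# Apply the four steps in the order listed above. `A`, `xmin`, `p` and `beta`
# are toy stand-ins, not the experimental settings.
function density_chain(x, A; xmin = 0.001, p = 3.0, beta = 8.0)
    y = A * x                          # 1. density filter f1
    y = (1 - xmin) .* y .+ xmin        # 2. interpolation f2
    y = y .^ p                         # 3. power penalty f3
    return heaviside.(y, beta)         # 4. Heaviside projection f4
end

x = rand(6)
A = fill(1 / 6, 6, 6)                  # toy uniform filter matrix
println(density_chain(x, A))
```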

The compliance of the discretized design is defined as: \(C = \varvec{u}^T\varvec{K}\varvec{u} = \varvec{f}^T\varvec{K}^{-1}\varvec{f}\) where \(\varvec{K}\) is the stiffness matrix, \(\varvec{f}\) is the load vector, and \(\varvec{u} = \varvec{K}^{-1}\varvec{f}\) is the displacement vector. The relationship between the global and element stiffness matrices is given by \(\varvec{K} = \sum \limits _e \rho _e \varvec{K}_e\) where \(\varvec{K}_e\) is the hyper-sparse element stiffness matrix of element e with the same size as \(\varvec{K}\).

2.2 Approximating the compliance sample mean and its gradient

The mean compliance can be formulated as a trace function: \(\mu _{\text {C}} = \frac{1}{L} tr(\varvec{F}^T \varvec{K}^{-1} \varvec{F})\), where the columns of \(\varvec{F}\) are the load vectors of the individual scenarios. Zhang et al. (2017) showed that Hutchinson’s trace estimator (Hutchinson 1990) can be used to accurately estimate the mean compliance for a large number of load scenarios using a relatively small number of linear system solves. Hutchinson’s trace estimator is given by:

$$\begin{aligned}&tr(\varvec{A}) = E(\varvec{v}^T \varvec{A} \varvec{v}) \approx \frac{1}{N} \sum _{i=1}^{N} \varvec{v}_i^T \varvec{A} \varvec{v}_i \end{aligned}$$
(1)

where \(\varvec{v}\) is a random vector with each element independently distributed with 0 mean and unit variance, \(\varvec{v}_i\) are samples of the random vector \(\varvec{v}\), also known as probing vectors, and N is the number of such probing vectors. One common distribution for the elements of \(\varvec{v}\) is the Rademacher distribution, a discrete distribution with support \(\{-1, 1\}\) where each value has a probability of 0.5. Hutchinson proved that the estimator with the Rademacher distribution for \(\varvec{v}\) has the least variance among such estimators. Let \(\varvec{A} = \varvec{F}^T \varvec{K}^{-1} \varvec{F}\). The number of linear system solves required to compute the mean compliance \(\frac{1}{L} tr(\varvec{A})\) using the naive approach is L. When using Hutchinson’s estimator, that number becomes the number of probing vectors N, and good accuracy can generally be obtained for \(N \ll L\). Other than the linear system solves, the time complexity of the remaining work using the trace estimation method is \(O(N \times n_{\text{dofs}} \times L)\), mostly spent on computing \(\varvec{F} \varvec{v}_i\) for all i. If only a small number of degrees of freedom \(n_{\text{loaded}}\) are loaded, the complexity of the remaining work reduces to \(O(N \times n_{\text{loaded}} \times L)\).
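The following is a minimal Julia sketch of the estimator in Eq. (1) applied to the mean compliance. The stiffness matrix `K` and load matrix `F` are small random stand-ins rather than finite element data; only the estimator logic reflects the method described above.

```julia
using LinearAlgebra, Random

# Hutchinson's trace estimator (Eq. 1) applied to the mean compliance
# mu_C = tr(F' K^{-1} F) / L. K and F are small random stand-ins.
function mean_compliance_estimate(K, F, N)
    L = size(F, 2)
    est = 0.0
    for _ in 1:N
        v = rand([-1.0, 1.0], L)   # Rademacher probing vector
        w = F * v                  # O(n_dofs * L) accumulation
        z = K \ w                  # one linear solve per probing vector
        est += dot(w, z)           # v' F' K^{-1} F v
    end
    return est / (L * N)
end

Random.seed!(1)
n_dofs, L = 50, 200
K = Symmetric(randn(n_dofs, n_dofs)) + 2n_dofs * I   # SPD stand-in stiffness
F = randn(n_dofs, L)                                  # columns are load scenarios
exact = tr(F' * (K \ F)) / L                          # naive: L linear solves
println((exact = exact, approx = mean_compliance_estimate(K, F, 20)))
```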

Let \(\varvec{z}_i = \varvec{K}^{-1} \varvec{F} \varvec{v}_i\) be cached from the trace computation. The trace estimate and the elements of its gradient with respect to \(\varvec{\rho }\) are given by:

$$\begin{aligned} \mu _{\text {C}}(\varvec{\rho })&= \frac{1}{L \times N} \sum _i^N \varvec{z}_i^T \varvec{K} \varvec{z}_i \end{aligned}$$
(2)
$$\begin{aligned} \frac{\partial \mu _{\text {C}}}{\partial \rho _e}&= \frac{1}{L \times N} \sum _{i=1}^N -\varvec{z}_i^T \varvec{K}_e \varvec{z}_i \end{aligned}$$
(3)

The additional time complexity of computing the gradient of the trace estimate after computing the trace is therefore \(O(N \times n_{\text{E}})\) (Table 1). For a detailed derivation of the partial derivative above, see the appendix.
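A sketch of Eqs. (2) and (3) in the same spirit: the cached vectors \(\varvec{z}_i\) are reused, and since each \(\varvec{K}_e\) is hyper-sparse, the quadratic form is evaluated on the element's local degrees of freedom only. The connectivity `element_dofs` and the shared dense element matrix `Ke` below are hypothetical stand-ins.

```julia
using LinearAlgebra

# Sketch of Eqs. (2) and (3): reuse the cached z_i = K^{-1} F v_i and
# accumulate element-level quadratic forms. Since K_e is hyper-sparse, only
# the element's local dofs are touched.
function mean_compliance_grad(zs, element_dofs, Ke, L)
    N = length(zs)
    grad = zeros(length(element_dofs))
    for (e, dofs) in enumerate(element_dofs), z in zs
        ze = view(z, dofs)                      # local slice of z_i
        grad[e] -= dot(ze, Ke * ze) / (L * N)   # -z_i' K_e z_i contribution
    end
    return grad                                 # O(N * n_E) beyond the solves
end

# Toy usage: 3 "elements" of 4 dofs each on a 6-dof model, N = 5, L = 100.
element_dofs = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
Ke = Symmetric(Matrix(1.0I, 4, 4))
zs = [randn(6) for _ in 1:5]
println(mean_compliance_grad(zs, element_dofs, Ke, 100))
```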

Table 1 Summary of the computational cost of the algorithms discussed to calculate the mean compliance and its gradient

2.3 Approximating scalar-valued functions of load compliances and their gradients

The above scheme for approximating the sample mean compliance can be generalized to handle the sample variance and standard deviation. The sample variance of the compliance C is given by \(\sigma _{\text {C}}^2 = \frac{1}{L-1} \sum _{i=1}^L (C_i - \mu _{\text {C}})^2\), and the sample standard deviation \(\sigma _{\text {C}}\) is its square root. Let \(\varvec{C}\) be the vector of compliances \(C_i\), one for each load scenario. In vector form, \(\sigma _{\text {C}}^2 = \frac{1}{L-1} (\varvec{C} - \mu _{\text {C}} \varvec{1})^T (\varvec{C} - \mu _{\text {C}} \varvec{1})\), where \(\varvec{C} = diag(\varvec{A})\) is the diagonal of the matrix \(\varvec{A} = \varvec{F}^T \varvec{K}^{-1} \varvec{F}\).

One can view the load compliances \(\varvec{C}\) as the diagonal of the matrix \(\varvec{F}^T \varvec{K}^{-1} \varvec{F}\), so one way to estimate them is to use a diagonal estimation method. One diagonal estimator directly related to Hutchinson’s trace estimator was proposed by Bekas et al. (2007). It can be written as follows:

$$\begin{aligned}&diag(\varvec{A}) = E(\varvec{D}_{\varvec{v}} \varvec{A} \varvec{v}) \approx \frac{1}{N} \sum _{i=1}^N \varvec{D}_{\varvec{v}_i} \varvec{A} \varvec{v}_i \end{aligned}$$
(4)

where \(diag(\varvec{A})\) is the diagonal of \(\varvec{A}\) as a vector, \(\varvec{D}_{\varvec{v}}\) is the diagonal matrix with a diagonal \(\varvec{v}\), \(\varvec{v}\) is a random vector distributed much like in Hutchinson’s estimator, \(\varvec{v}_i\) are the probing vector instances of \(\varvec{v}\) and N is the number of probing vectors. The sum of the elements of the diagonal estimator above gives us Hutchinson’s trace estimator. Let \(\varvec{A} = \varvec{F}^T \varvec{K}^{-1} \varvec{F}\):

$$\begin{aligned}&\varvec{C} = diag(\varvec{F}^T \varvec{K}^{-1} \varvec{F}) \approx \frac{1}{N} \sum _{i=1}^N \varvec{D}_{\varvec{v}_i} \varvec{F}^T \varvec{K}^{-1} \varvec{F} \varvec{v}_i \end{aligned}$$
(5)

Bekas et al. (2007) showed that using the columns of a Hadamard matrix as deterministic probing vectors \(\varvec{v}_i\), rather than random vectors, increases the accuracy of the diagonal estimator. In this paper, we do the same and use columns of a Hadamard matrix as probing vectors for the diagonal estimator. Assuming \(N \ll L\), one can estimate \(\varvec{C}\) using N linear system solves, and the estimate can then be used to compute the sample variance and standard deviation. Other than the linear system solves, the additional work required above has a time complexity of \(O(N \times n_{\text{dofs}} \times L)\); if only a few \(n_{\text{loaded}}\) degrees of freedom are loaded, the time complexity of the remaining work goes down to \(O(N \times n_{\text{loaded}} \times L)\).
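The following Julia sketch implements the diagonal estimator of Eq. (5) with Hadamard probing vectors. The Sylvester construction of the Hadamard matrix and the random `K` and `F` are stand-ins; the implementation used in the experiments may differ.

```julia
using LinearAlgebra

# Diagonal estimator of Eq. (5) with Hadamard probing vectors. The Sylvester
# construction below builds a {-1, +1} Hadamard matrix of power-of-two order;
# its columns serve as deterministic probing vectors.
function sylvester_hadamard(order)          # order must be a power of 2
    H = ones(1, 1)
    while size(H, 1) < order
        H = [H H; H -H]
    end
    return H
end

function compliance_diag_estimate(K, F, N)
    L = size(F, 2)
    H = sylvester_hadamard(nextpow(2, L))
    C = zeros(L)
    for i in 1:N
        v = H[1:L, i]                       # i-th Hadamard column (truncated to L)
        t = K \ (F * v)                     # one linear solve per probing vector
        C .+= v .* (F' * t)                 # D_{v_i} F' K^{-1} F v_i
    end
    return C ./ N
end

n_dofs, L = 60, 32
K = Symmetric(randn(n_dofs, n_dofs)) + 2n_dofs * I
F = randn(n_dofs, L)
C_est = compliance_diag_estimate(K, F, 8)
C_exact = diag(F' * (K \ F))
println(norm(C_est - C_exact) / norm(C_exact))   # relative error of the estimate
```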

The Jacobian of the compliance vector \(\varvec{C}\) with respect to \(\varvec{\rho }\), \(\nabla _{\varvec{\rho }} \varvec{C}\), is simply the stacking of the transposed gradients \(\nabla _{\varvec{\rho }} C_i\) for all i to form a matrix with L rows and \(n_{\text{E}}\) columns. The Jacobian of the estimate of \(\varvec{C}\) is given by:

$$\begin{aligned}&\nabla _{\varvec{\rho }} \varvec{C} \approx \nabla _{\varvec{\rho }} \Biggl ( \frac{1}{N} \sum _{i=1}^N \varvec{D}_{\varvec{v}_i} \varvec{F}^T \varvec{K}^{-1} \varvec{F} \varvec{v}_i \Biggr ) = \frac{1}{N} \sum _{i=1}^N \varvec{D}_{\varvec{v}_i} \varvec{F}^T \nabla _{\varvec{\rho }} \varvec{t}_i \end{aligned}$$
(6)

where \(\varvec{t}_i = \varvec{K}^{-1} \varvec{F} \varvec{v}_i\). The derivative \(\frac{\partial \varvec{t}_i}{\partial \rho _e} = -\varvec{K}^{-1} \varvec{K}_e \varvec{t}_i\). Therefore:

$$\begin{aligned}&\frac{\partial \varvec{C}}{\partial \rho _e} \approx -\frac{1}{N} \sum _{i=1}^N \varvec{D}_{\varvec{v}_i} \varvec{F}^T \varvec{K}^{-1} \varvec{K}_e \varvec{K}^{-1} \varvec{F} \varvec{v}_i \end{aligned}$$
(7)

Note that finding the Jacobian in this way requires L linear system solves to find \(\varvec{K}^{-1} \varvec{F}\). With this many linear system solves, one could use the exact method instead, so there is no merit to the diagonal estimation approach if the full Jacobian is required.

However, if one is only interested in \(\nabla _{\varvec{\rho }} \varvec{C}(\varvec{\rho })^T \varvec{w}\) for some vector \(\varvec{w}\), a more efficient approach can be used:

$$\begin{aligned} \nabla _{\varvec{\rho }} \varvec{C}(\varvec{\rho })^T \varvec{w}&= \nabla _{\varvec{\rho }} (\varvec{C}(\varvec{\rho })^T \varvec{w}) = \nabla _{\varvec{\rho }} tr(\varvec{D}_{\varvec{w}} \varvec{F}^T \varvec{K}^{-1} \varvec{F}) \end{aligned}$$
(8)

Let \(\varvec{r}_i = \varvec{K}^{-1} \varvec{F} \varvec{v}_i\), cached from the function value calculation, and let \(\varvec{t}_i = \varvec{K}^{-1} \varvec{F} \varvec{D}_{\varvec{w}} \varvec{v}_i\). Then:

$$\begin{aligned} \frac{\partial \varvec{C}(\varvec{\rho })^T \varvec{w}}{\partial \rho _e}&= -tr(\varvec{D}_{\varvec{w}} \varvec{F}^T \varvec{K}^{-1} \varvec{K}_e \varvec{K}^{-1} \varvec{F}) \end{aligned}$$
(9)
$$\begin{aligned}&\approx -\frac{1}{N} \sum _{i=1}^N \varvec{v}_i^T \varvec{D}_{\varvec{w}} \varvec{F}^T \varvec{K}^{-1} \varvec{K}_e \varvec{K}^{-1} \varvec{F} \varvec{v}_i \end{aligned}$$
(10)
$$\begin{aligned}&= -\frac{1}{N} \sum _{i=1}^N \varvec{t}_i^T \varvec{K}_e \varvec{r}_i \end{aligned}$$
(11)

This means that at the cost of an additional N linear system solves, one can compute the vectors \(\varvec{t}_i\) and then find the gradient of \(\varvec{C}^T \varvec{w}\). Other than the linear system solves, the remaining work has a time complexity of \(O(N \times (n_{\text{dofs}} \times L + n_{\text{E}}))\): \(O(N \times n_{\text{dofs}} \times L)\) from the accumulation of \(\varvec{F} \varvec{D}_{\varvec{w}} \varvec{v}_i\) and \(O(N \times n_{\text{E}})\) to evaluate the gradient given \(\varvec{t}_i\) and \(\varvec{r}_i\) for all i. If only a few degrees of freedom \(n_{\text{loaded}}\) are loaded, the complexity goes down to \(O(N \times (n_{\text{loaded}} \times L + n_{\text{E}}))\) (Table 2).
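A sketch of Eqs. (9)-(11): given the cached \(\varvec{r}_i\), the gradient of \(\varvec{C}^T \varvec{w}\) costs N extra solves for the \(\varvec{t}_i\) vectors plus element-level bilinear forms. As before, the mesh data below are hypothetical stand-ins.

```julia
using LinearAlgebra

# Sketch of Eqs. (9)-(11): the gradient of C' w from the cached r_i plus N
# additional solves t_i = K^{-1} F D_w v_i.
function grad_Ctw(K, F, vs, rs, w, element_dofs, Ke)
    N = length(vs)
    ts = [K \ (F * (w .* v)) for v in vs]   # N additional linear solves
    grad = zeros(length(element_dofs))
    for (e, dofs) in enumerate(element_dofs), i in 1:N
        # -t_i' K_e r_i contribution, restricted to the element's local dofs
        grad[e] -= dot(view(ts[i], dofs), Ke * view(rs[i], dofs)) / N
    end
    return grad
end

n_dofs, L, N = 12, 16, 4
K = Symmetric(randn(n_dofs, n_dofs)) + 2n_dofs * I
F = randn(n_dofs, L)
vs = [rand([-1.0, 1.0], L) for _ in 1:N]
rs = [K \ (F * v) for v in vs]              # cached from the value evaluation
w = ones(L) / L                             # e.g. the weights for the mean
element_dofs = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
Ke = Symmetric(Matrix(1.0I, 4, 4))
println(grad_Ctw(K, F, vs, rs, w, element_dofs, Ke))
```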

Table 2 Summary of the computational cost of the algorithms discussed to calculate the load compliances \(\varvec{C}\) as well as \(\nabla _{\varvec{\rho }} \varvec{C}^T \varvec{w}\) for any vector \(\varvec{w}\)


3 Setup and implementation

In this section, the most important implementation details and algorithm settings used in the experiments are presented.

3.1 Test problems

Fig. 1 Cantilever beam problem. \(\varvec{F}_2\) and \(\varvec{F}_3\) are at 45 degree angles

The 2D cantilever beam problem shown in Fig. 1 was used to run the experiments. A ground mesh of plane stress quadrilateral elements was used, where each element is a square of side length \(1 \text { mm}\), with a sheet thickness of \(1 \text { mm}\). Linear iso-parametric interpolation functions were used for the field and geometric basis functions. A Young’s modulus of 1 MPa and a Poisson’s ratio of 0.3 were used. Finally, a checkerboard density filter for unstructured meshes with a radius of 2 mm was used (Huang and Xie 2010). A 3D version of this problem was also solved.

Three variants of the cantilever beam problem were solved:

  1. Minimization of the mean compliance \(\mu _{\text {C}}\) subject to a volume constraint with a volume fraction of 0.4,

  2. Minimization of a weighted sum of the mean and standard deviation (mean-std) of the compliance, \(\mu _{\text {C}} + 2.0 \sigma _{\text {C}}\), subject to a volume constraint with a volume fraction of 0.4, and

  3. Volume minimization subject to a maximum compliance constraint with a compliance threshold of \(70{,}000 \text { Nmm}\).

A total of 1000 load scenarios were sampled from:

$$\begin{aligned} \varvec{f}_i = s_1 \varvec{F}_1 + s_2 \varvec{F}_2 + s_3 \varvec{F}_3 + \frac{1}{R - 3} \sum _{j=4}^{R} s_j \varvec{F}_j \end{aligned}$$
(12)

where \(\varvec{F}_1\), \(\varvec{F}_2\) and \(\varvec{F}_3\) are unit vectors with the directions shown in Fig. 1 (\(\varvec{F}_2\) and \(\varvec{F}_3\) are at 45 degrees) and R is an integer greater than or equal to 4. \(s_1\), \(s_2\) and \(s_3\) are independently and identically distributed uniform random variables between -2 and 2. \(\varvec{F}_j\) for j in \(4 \dots R\) are vectors, sampled once and shared across scenarios, with non-zeros at all the surface degrees of freedom without a Dirichlet boundary condition; the non-zero values are independently and identically distributed normal random variables with mean 0 and standard deviation 1. The multipliers \(s_j\) for j in \(4 \dots R\) are also independently and identically distributed normal random variables with mean 0 and standard deviation 1. The same loading scenarios were used for the 3 test problems. Let \(\varvec{F}\) be the matrix whose columns are the sampled \(\varvec{f}_i\) vectors. Given the way the loading scenarios are defined, the rank of \(\varvec{F}\) is almost surely R.
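A sketch of the sampling scheme in Eq. (12) is shown below. The placeholder unit vectors `F1`, `F2`, `F3` and the `surface_dofs` index set stand in for the actual mesh quantities; the final `rank` call illustrates the claim that the rank of \(\varvec{F}\) is close to R.

```julia
using LinearAlgebra

# Sampling scheme of Eq. (12). `F1`, `F2`, `F3` and `surface_dofs` are
# placeholders for the unit loads of Fig. 1 and the loaded surface dofs.
function sample_loads(F1, F2, F3, surface_dofs, R, L)
    n_dofs = length(F1)
    # F_4..F_R: random surface load vectors, sampled once and shared
    Fjs = [begin
               f = zeros(n_dofs)
               f[surface_dofs] .= randn(length(surface_dofs))
               f
           end for _ in 4:R]
    F = zeros(n_dofs, L)
    for i in 1:L
        s1, s2, s3 = 4 .* rand(3) .- 2              # s_1..s_3 ~ U(-2, 2)
        F[:, i] .= s1 .* F1 .+ s2 .* F2 .+ s3 .* F3
        for Fj in Fjs
            F[:, i] .+= randn() .* Fj ./ (R - 3)    # s_j ~ N(0, 1)
        end
    end
    return F
end

n_dofs, R, L = 200, 10, 1000
surface_dofs = 1:20
F1 = zeros(n_dofs); F1[1] = 1.0                     # placeholder unit loads
F2 = zeros(n_dofs); F2[2] = 1.0
F3 = zeros(n_dofs); F3[3] = 1.0
F = sample_loads(F1, F2, F3, surface_dofs, R, L)
println(rank(F))                                     # ≈ R, as noted above
```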

3.2 Software

All the topology optimization algorithms described in this paper were implemented in TopOpt.jl using the Julia programming language (Bezanson et al. 2014); the implementation handles generic unstructured, iso-parametric meshes.

3.3 Settings

The value of \(x_{\text {min}}\) used was 0.001 for all problems and algorithms. Penalization was done prior to interpolation to calculate \(\varvec{\rho }\) from \(\varvec{x}\). A power penalty function and a regularized Heaviside projection were used. All of the problems were solved using two continuation SIMP routines. The first incremented the penalty value from \(p = 1\) to \(p = 6\) in increments of 0.5. Then the Heaviside projection parameter \(\beta\) was incremented from \(\beta = 0\) to \(\beta = 20\) in increments of 4, keeping the penalty value fixed at 6. An exponentially decreasing tolerance from \(10^{-3}\) to \(10^{-4}\) was used for both continuations.
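Schematically, the two continuation routines can be written as the following Julia sketch, where `solve_subproblem` is a placeholder for one MMA solve of the SIMP subproblem at a fixed penalty p, projection parameter \(\beta\) and tolerance:

```julia
# Schematic continuation SIMP driver; `solve_subproblem` is a placeholder.
tol_schedule(n) = exp.(LinRange(log(1e-3), log(1e-4), n))   # exponential decay

function continuation_simp(solve_subproblem, x0)
    x = x0
    ps = 1.0:0.5:6.0                       # first continuation: penalty
    for (k, p) in enumerate(ps)
        x = solve_subproblem(x; p = p, beta = 0.0, tol = tol_schedule(length(ps))[k])
    end
    betas = 0.0:4.0:20.0                   # second: Heaviside projection, p fixed at 6
    for (k, beta) in enumerate(betas)
        x = solve_subproblem(x; p = 6.0, beta = beta, tol = tol_schedule(length(betas))[k])
    end
    return x
end

# Toy usage with a do-nothing subproblem solver:
x = continuation_simp((x; p, beta, tol) -> x, fill(0.4, 10))
println(length(x))
```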

The mean and mean-std compliance minimization SIMP subproblems were solved using the method of moving asymptotes (MMA) algorithm (Svanberg 1987). MMA parameters of \(s_{\text{init}} = 0.5\), \(s_{\text{incr}} = 1.1\) and \(s_{\text{decr}} = 0.7\) were used as defined in the MMA paper, with a maximum of 1000 iterations for each subproblem. The dual problem of the convex approximation was solved using a log-barrier box-constrained nonlinear optimization solver, where the barrier problem was solved using the nonlinear conjugate gradient (CG) algorithm for unconstrained nonlinear optimization (Nocedal and Wright 2006) as implemented in Optim.jl (Mogensen and Riseth 2018). The nonlinear CG itself used the line search algorithm of Hager and Zhang (2006) as implemented in LineSearches.jl. The stopping criterion used was the one adopted by the interior-point solver IPOPT (Wächter and Biegler 2006). This criterion is less scale sensitive than the KKT residual as it scales down the residual by a value proportional to the mean absolute value of the Lagrangian multipliers.

4 Accuracy and speed comparison

In this section, the accuracy and speed of the proposed approximations are presented and compared to the exact values. A method to boost the accuracy of the approximations is also presented and mathematically analyzed. Tables 3 and 4 show the values computed for the mean compliance \(\mu _{\text {C}}\) and its standard deviation \(\sigma _{\text {C}}\), respectively, together with the time required to compute their values and gradients using the naive exact approach and the approximate methods with trace or diagonal estimation and 100 Rademacher-distributed or Hadamard basis probing vectors. A value of \(R = 10\) was used.

Table 3 The table shows the function values of \(\mu _{\text {C}}\) computed using the exact method and the approximate method of trace estimation with 100 Rademacher-distributed or Hadamard basis probing vectors for a full ground mesh design
Table 4 The table shows the function values of \(\sigma _{\text {C}}\) and its gradients for a full ground mesh computed using the exact method and the approximate method of diagonal estimation with 100 Rademacher-distributed or Hadamard basis probing vectors
Fig. 2 Accuracy profile of the trace and diagonal estimation methods for estimating the mean compliance and its standard deviation using 10, 100, 200, 300, 400, 500, 600, 700, 800, 900 and 1000 probing vectors. A value of \(R = 10\) was used here

As expected, the proposed approximation methods take a fraction of the time required by the exact approaches to compute the mean and mean-std compliances. Estimates of the mean compliance and its standard deviation for a full ground mesh using different numbers of Rademacher-distributed and Hadamard basis probing vectors are shown in Fig. 2. In this case, the estimates obtained using the Hadamard basis were always closer to the exact value than those obtained using Rademacher-distributed vectors. However, this depends on the order in which the Hadamard basis vectors are used.

5 Bias correction

While the Hadamard estimate converges faster to the exact value than the Rademacher one in the above case as the number of probing vectors increases, its estimate of the standard deviation remains far from the exact value unless a large number of probing vectors is used. If there is a constraint on the weighted sum of the mean compliance and its standard deviation, this large discrepancy from the exact quantity renders the approximate method useless. In this section, it will be shown that the estimate can usually be multiplied by a correcting factor to significantly improve its accuracy. This will be demonstrated experimentally and then mathematically analyzed. When performing topology optimization, the function value and its gradient need to be computed repeatedly, so if the correcting factor only needs to be computed a few times, the approximate method can still save a lot of computational time without losing too much accuracy.
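The idea can be summarized in the following Julia sketch, which continues the toy setup of the earlier sketches: the exact quantity is computed once at a reference design (here the full ground mesh), the ratio of the exact to the estimated value becomes the correcting factor, and later designs reuse it with the same fixed probing vectors. The `stiffness` parameterization is a toy stand-in for \(\varvec{K}(\varvec{\rho })\).

```julia
using LinearAlgebra

# Bias-correction sketch: compute the exact mean compliance once at a
# reference design, form the correcting factor, and reuse it at later
# designs with the same fixed probing vectors.
stiffness(rho, Kes) = sum(rho[e] * Kes[e] for e in eachindex(rho))

n_dofs, L, N, nE = 20, 64, 8, 30
Kes = [Symmetric((M -> M'M)(randn(n_dofs, n_dofs))) / n_dofs + 0.1I for _ in 1:nE]
F = randn(n_dofs, L)
vs = [rand([-1.0, 1.0], L) for _ in 1:N]          # fixed probing vectors

mu_hat(rho) = sum(dot(F * v, stiffness(rho, Kes) \ (F * v)) for v in vs) / (L * N)
mu_exact(rho) = tr(F' * (stiffness(rho, Kes) \ F)) / L

rho0 = ones(nE)                                    # reference: full ground mesh
gamma = mu_exact(rho0) / mu_hat(rho0)              # correcting factor, computed once

rho = clamp.(rho0 .- 0.5 .* rand(nE), 0.01, 1.0)   # some other design
println((corrected = gamma * mu_hat(rho), exact = mu_exact(rho)))
```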

5.1 Experiments

Using a full ground mesh to calculate the correcting factor with only 10 probing vectors and \(R = 10\), the ratio between the exact mean compliance and the trace estimate using Hadamard basis probing vectors was 1.238. Similarly, the ratio of the exact compliance standard deviation to the estimated value was 0.165. Figures 3 and 4 show the distributions of the ratios of the exact value to the estimated one for the mean compliance and its standard deviation, respectively. The same 10 Hadamard basis probing vectors were used and each figure was generated using 500 random designs. For each figure, the random designs were sampled from a truncated normal distribution with a different mean and a standard deviation of 0.2, truncated between 0 and 1. Using the same probing vectors, the ratio between the exact and estimated values does not change significantly even when the mean volume is changed by changing the mean of the truncated normal distribution. In other words, the correcting ratio that multiplies the estimated mean compliance or standard deviation to recover the exact value is not very sensitive to the underlying design.

Fig. 3 Histograms of the ratio between the exact mean compliance and the trace estimate using 10 Hadamard basis probing vectors. In each figure, 500 designs were randomly sampled where each element’s pseudo-density is sampled from a truncated normal distribution with the means indicated above and a standard deviation of 0.2, truncated between 0 and 1. A value of \(R = 10\) was used here

Fig. 4 Histograms of the ratio between the exact compliance standard deviation and the estimate using 10 Hadamard basis probing vectors. In each figure, 500 designs were randomly sampled where each element’s pseudo-density is sampled from a truncated normal distribution with the means indicated above and a standard deviation of 0.2, truncated between 0 and 1. A value of \(R = 10\) was used here

5.2 Mathematical analysis

In this section, an attempt is made to mathematically explain the insensitivity of the estimators’ correcting ratios to the design shown above. While this section does not provide a rigorous proof of the observed phenomenon, it provides some mathematical insight into why it happens and when it can be expected to happen in other problems.

Let the diagonal estimator be:

$$\begin{aligned} \frac{1}{N} \sum _{k = 1}^N \varvec{D}_{\varvec{v}_k} \varvec{A} \varvec{v}_k = \frac{1}{N} \Bigl (\sum _{k = 1}^N \varvec{D}_{\varvec{v}_k} \varvec{A} \varvec{D}_{\varvec{v}_k} \Bigr ) \varvec{1} \end{aligned}$$
(13)

where \(\varvec{A} = \varvec{F}^T \varvec{K}^{-1} \varvec{F}\). Let \(a_{ij}\) be the \((i,j)\)th element of \(\varvec{A}\) and \(v_{ki}\) be the \(i\)th element of \(\varvec{v}_k\). The \((i,j)\)th element of \(\sum _{k = 1}^N \varvec{D}_{\varvec{v}_k} \varvec{A} \varvec{D}_{\varvec{v}_k}\) is therefore \(a_{ij} \sum _{k=1}^N v_{ki}v_{kj}\). Let \(N^+_{ij}\) be the number of times \(v_{ki}v_{kj}\) is 1 and \(N^-_{ij}\) be the number of times it is \(-1\). In the case of the Hadamard basis, if N is the smallest power of 2 larger than or equal to the number of loads L, then:

$$\begin{aligned} N^+_{ij} = N^-_{ij} = N/2&\quad \textit{if} \quad i \ne j \end{aligned}$$
(14)
$$\begin{aligned} N^+_{ij} = N, N^-_{ij} = 0&\quad \textit{if} \quad i = j \end{aligned}$$
(15)

This means that the diagonal estimate will be exact in that case. Bekas et al. (2007) showed that Hadamard bases work well for banded matrices and for matrices whose off-diagonal values decay rapidly away from the diagonal. However, in the case of load compliances, neither of those conditions applies. Therefore, as shown in the experiment above, the accuracy of the estimated diagonal can be quite poor, as is evident from the estimate of the standard deviation. Let the \(i\)th diagonal element (or load compliance) be \(C_i = a_{ii}\). The estimator of \(a_{ii}\), \({\hat{a}}_{ii}\), can be written as:

$$\begin{aligned} {\hat{a}}_{ii}&= \frac{1}{N} \sum _{j=1}^L a_{ij} \sum _{k=1}^N v_{ki}v_{kj} \end{aligned}$$
(16)
$$\begin{aligned}&= \frac{1}{N} \sum _{j=1}^L a_{ij} (N^+_{ij} - N^-_{ij}) \end{aligned}$$
(17)
$$\begin{aligned}&= a_{ii} + \sum _{j \ne i} a_{ij} \frac{N^+_{ij} - N^-_{ij}}{N} \end{aligned}$$
(18)

The ratio of the estimated diagonal element to the actual diagonal element is:

$$\begin{aligned} \frac{{\hat{a}}_{ii}}{a_{ii}} = 1 + \sum _{j \ne i} \frac{a_{ij}}{a_{ii}} \frac{N^+_{ij} - N^-_{ij}}{N} \end{aligned}$$
(19)

This ratio depends on:

  1. \(\frac{a_{ij}}{a_{ii}}\), which depends on the design and the load scenarios, and

  2. \(N^+_{ij}\) and \(N^-_{ij}\), which depend on the Hadamard basis used.

If the same basis vectors are used for all the designs during the optimization, then \(\frac{a_{ij}}{a_{ii}}\) is the only quantity that can vary:

$$\begin{aligned} \frac{a_{ij}}{a_{ii}} = \frac{\varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_j}{\varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i} \end{aligned}$$
(20)

The partial derivative of \(a_{ij} / a_{ii}\) with respect to the \(e\)th element’s density \(\rho _e\) is:

$$\begin{aligned} \frac{\partial (a_{ij} / a_{ii})}{\partial \rho _e}&= \frac{\partial (\varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_j / \varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i)}{\partial \rho _e} \end{aligned}$$
(21)
$$\begin{aligned}&= \frac{\frac{\partial \varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_j}{\partial \rho _e} \varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i - \frac{\partial \varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i}{\partial \rho _e} \varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_j}{(\varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i)^2} \end{aligned}$$
(22)
$$\begin{aligned}&= \frac{-(\varvec{u}_i^T \varvec{K}_e \varvec{u}_j) (\varvec{u}_i^T \varvec{K} \varvec{u}_i) + (\varvec{u}_i^T \varvec{K}_e \varvec{u}_i) (\varvec{u}_i^T \varvec{K} \varvec{u}_j)}{(\varvec{u}_i^T \varvec{K} \varvec{u}_i)^2} \end{aligned}$$
(23)

Lemma 1

$$\begin{aligned}&|\varvec{u}_i^T \varvec{K} \varvec{u}_j| \le \nonumber \\&\quad \frac{1}{2} \max (|(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)|, |\varvec{u}_i^T \varvec{K} \varvec{u}_i + \varvec{u}_j^T \varvec{K} \varvec{u}_j|) \end{aligned}$$
(24)

if \(\varvec{K}\) is positive or negative semi-definite and

$$\begin{aligned}&|\varvec{u}_i^T \varvec{K} \varvec{u}_j| \le \nonumber \\&\quad \frac{1}{2} \Big ( |(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)| + |\varvec{u}_i^T \varvec{K} \varvec{u}_i| + |\varvec{u}_j^T \varvec{K} \varvec{u}_j)| \Big ) \end{aligned}$$
(25)

otherwise.

Proof

$$\begin{aligned} 2 \varvec{u}_i^T \varvec{K} \varvec{u}_j = (\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j) - \varvec{u}_i^T \varvec{K} \varvec{u}_i - \varvec{u}_j^T \varvec{K} \varvec{u}_j \end{aligned}$$
(26)

If \(\varvec{K}\) is indefinite:

$$\begin{aligned}&2 |\varvec{u}_i^T \varvec{K} \varvec{u}_j| \le \nonumber \\&\quad |(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)| + |\varvec{u}_i^T \varvec{K} \varvec{u}_i| + |\varvec{u}_j^T \varvec{K} \varvec{u}_j| \end{aligned}$$
(27)

If \(\varvec{K}\) is positive semi-definite, then \((\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)\), \(\varvec{u}_i^T \varvec{K} \varvec{u}_i\) and \(\varvec{u}_j^T \varvec{K} \varvec{u}_j\) are all non-negative. Therefore:

$$\begin{aligned}&-(\varvec{u}_i^T \varvec{K} \varvec{u}_i + \varvec{u}_j^T \varvec{K} \varvec{u}_j) \le \nonumber \\&\quad 2 \varvec{u}_i^T \varvec{K} \varvec{u}_j \le (\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j) \end{aligned}$$
(28)

Similarly, if \(\varvec{K}\) is negative semi-definite, then \((\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)\), \(\varvec{u}_i^T \varvec{K} \varvec{u}_i\) and \(\varvec{u}_j^T \varvec{K} \varvec{u}_j\) are all non-positive. Therefore:

$$\begin{aligned}&(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j) \le 2 \varvec{u}_i^T \varvec{K} \varvec{u}_j \le \nonumber \\&\quad -(\varvec{u}_i^T \varvec{K} \varvec{u}_i + \varvec{u}_j^T \varvec{K} \varvec{u}_j) \end{aligned}$$
(29)

It follows that:

$$\begin{aligned}&2 |\varvec{u}_i^T \varvec{K} \varvec{u}_j| \le \nonumber \\&\quad \max (|(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)|, |\varvec{u}_i^T \varvec{K} \varvec{u}_i + \varvec{u}_j^T \varvec{K} \varvec{u}_j|) \end{aligned}$$
(30)

This completes the proof. \(\square\)

Using the above bound, it follows that if for all combinations of i and j:

$$\begin{aligned}&\frac{\varvec{u}_j^T \varvec{K}_e \varvec{u}_j}{\varvec{u}_i^T \varvec{K} \varvec{u}_i} \le \alpha _1 \end{aligned}$$
(31)
$$\begin{aligned}&\frac{(\varvec{u}_i + \varvec{u}_j)^T \varvec{K}_e (\varvec{u}_i + \varvec{u}_j)}{2 \varvec{u}_i^T \varvec{K} \varvec{u}_i} \le \beta _1 \end{aligned}$$
(32)
$$\begin{aligned}&\frac{\varvec{u}_j^T \varvec{K} \varvec{u}_j}{\varvec{u}_i^T \varvec{K} \varvec{u}_i} \le \alpha _2 \end{aligned}$$
(33)
$$\begin{aligned}&\frac{(\varvec{u}_i + \varvec{u}_j)^T \varvec{K} (\varvec{u}_i + \varvec{u}_j)}{2 \varvec{u}_i^T \varvec{K} \varvec{u}_i} \le \beta _2 \end{aligned}$$
(34)

then

$$\begin{aligned} \Bigl |\frac{\partial (a_{ij} / a_{ii})}{\partial \rho _e}\Bigr |&\le \max (\alpha _1, \beta _1) + \alpha _1 \times \max (\alpha _2, \beta _2) \end{aligned}$$
(35)

It is natural to expect \(\alpha _1\) to be small since the element compliance due to any one load will likely be much smaller than the total compliance due to any other load. If the loading scenarios have widely varying magnitudes, however, \(\alpha _1\) may be large. To remedy this, the loading scenarios can be clustered into groups by their norm and a separate estimator used for each group. Similarly, \(\beta _1\) is likely to be small if all the forces have close enough norms, since the element compliance due to the superposition of 2 loads is likely to be much smaller than twice the total compliance due to any other load. If the norms of the loads are somewhat similar, \(\alpha _2\) and \(\beta _2\) can also be expected to be small constants greater than or equal to 1. This means that the absolute values of the individual partial derivatives can be upper bounded by a small positive number. Interestingly, the sum of the partial derivatives of the correcting ratio with respect to the individual element densities, \(\sum _e \frac{\partial (a_{ij} / a_{ii})}{\partial \rho _e}\), is 0. This does not guarantee that the directional derivative in any direction will be small, but it increases the chances of term cancellation. This is consistent with the observations.

However, the correcting factor for the estimator \({\hat{C}}_i = {\hat{a}}_{ii}\) does not just depend on the individual \(\frac{\partial (a_{ij} / a_{ii})}{\partial \rho _e}\); rather, it depends on the sum \(\sum _{j \ne i} \frac{a_{ij}}{a_{ii}} \frac{N^+_{ij} - N^-_{ij}}{N}\). Three factors can make this sum small:

  1. A good choice of probing vectors that makes the distribution of \(N^+_{ij} - N^-_{ij}\) over the (i, j) pairs symmetric around 0, promoting term cancellation.

  2. Term cancellation due to the alternating signs of \(a_{ij}\). For instance, if the mean load vector is the \(\varvec{0}\) vector, the summation \(\sum _{j \ne i} \frac{a_{ij}}{a_{ii}}\) is equal to -1 regardless of the number of loading scenarios.

  3. A small ratio of the number of loading scenarios to the number of elements, as detailed below.

For fixed loading scenarios, the values of \(\alpha _1\) and \(\beta _1\) decrease as the number of elements E increases. This is because the ratio of an individual element’s contribution to the total strain energy decreases as the element size decreases. Given that \(-1 \le \frac{N^+_{ij} - N^-_{ij}}{N} \le 1\):

$$\begin{aligned} \Biggl |\sum _{j \ne i} \frac{a_{ij}}{a_{ii}} \frac{N^+_{ij} - N^-_{ij}}{N} \Biggr | \le (L - 1)(\beta _1 + \alpha _1 + \alpha _1 (\beta _2 + \alpha _2)) \end{aligned}$$
(36)

Therefore, if \(L \ll E\) and the loads in \(\varvec{F}\) have close magnitudes, one can expect the correcting factor to be design insensitive especially near the end of the optimization when the design is not changing much.

The analysis above identified three strategies, other than using more probing vectors, that can help promote the insensitivity of the correcting factors to the design:

  1. Clustering the loads by their magnitudes with a maximum number of loads per cluster \(\ll E\),

  2. Centering the loads around \(\varvec{0}\), as sketched below this list. Let \(\varvec{\mu }_{\varvec{f}}\) be the sample mean of the loading scenarios and let \(\tilde{\varvec{f}}_i = \varvec{f}_i - \varvec{\mu }_{\varvec{f}}\). The \(i\)th load compliance \(\varvec{f}_i^T \varvec{K}^{-1} \varvec{f}_i\) would then be \(\tilde{\varvec{f}}_i^T \varvec{K}^{-1} \tilde{\varvec{f}}_i + 2 \tilde{\varvec{f}}_i^T \varvec{K}^{-1} \varvec{\mu }_{\varvec{f}} + \varvec{\mu }_{\varvec{f}}^T \varvec{K}^{-1} \varvec{\mu }_{\varvec{f}}\). The terms \(\tilde{\varvec{f}}_i^T \varvec{K}^{-1} \tilde{\varvec{f}}_i\) can be obtained from the diagonal estimator of \(\tilde{\varvec{F}}^T \varvec{K}^{-1} \tilde{\varvec{F}}\), where the columns of \(\tilde{\varvec{F}}\) are the vectors \(\tilde{\varvec{f}}_i\). The remaining terms can be computed using a single additional linear system solve \(\varvec{K}^{-1} \varvec{\mu }_{\varvec{f}}\), and

  3. Using a finer mesh, i.e. increasing E and thus decreasing \(\alpha _1\) and \(\beta _1\).
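The following sketch verifies the centering identity of strategy 2 on stand-in matrices; in practice, the first term would come from the diagonal estimator applied to \(\tilde{\varvec{F}}^T \varvec{K}^{-1} \tilde{\varvec{F}}\) rather than from exact solves.

```julia
using LinearAlgebra

# Verifying the centering identity of strategy 2 on stand-in matrices. In
# practice, diag(Ft' K^{-1} Ft) would come from the diagonal estimator, not
# from exact solves; mu_f needs only one extra solve.
n_dofs, L = 40, 32
K = Symmetric(randn(n_dofs, n_dofs)) + 2n_dofs * I
F = randn(n_dofs, L)

mu_f = vec(sum(F, dims = 2)) ./ L      # sample mean load
Ft = F .- mu_f                         # centered loads, columns f_i - mu_f
u_mu = K \ mu_f                        # the single additional linear solve

C_direct = diag(F' * (K \ F))
C_centered = diag(Ft' * (K \ Ft)) .+ 2 .* (Ft' * u_mu) .+ dot(mu_f, u_mu)
println(norm(C_direct - C_centered))   # ≈ 0 up to round-off
```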

Note that while the analysis above provides some mathematical insight into why the correcting ratios of the individual compliances may not be sensitive to the design, it is not a complete proof of the observed phenomenon because only a single element’s \(\rho _e\) was assumed to change in the analysis. The analysis does show, however, that term cancellation is highly likely in practice. For instance, the sum of \(\frac{a_{ij}}{a_{ii}}\) over all \(j \ne i\) is equal to -1 if the mean load is \(\varvec{0}\), regardless of the number of loads, and the sum of \(\frac{\partial (a_{ij}/a_{ii})}{\partial \rho _e}\) over all e is equal to 0. This term cancellation is the main reason behind the extreme insensitivity of the correcting ratio to the design observed in the experiments above, even when all the elements’ densities change in random directions by large amounts.

Next, it will be shown that, under some conditions, the above insensitivity of the correcting ratio to any individual \(\rho _e\) extends to a class of scalar-valued functions of the load compliances. This class of functions includes the mean, variance and standard deviation, but not the augmented Lagrangian penalty. Let \(\gamma _i\) be the correcting factor for the compliance \(C_i\). The correcting factor of a scalar-valued function f of the load compliances can therefore be written as:

$$\begin{aligned} \eta (\varvec{\rho }) = \frac{f(\gamma _1(\varvec{\rho }) {\hat{C}}_1(\varvec{\rho }), \gamma _2(\varvec{\rho }) {\hat{C}}_2(\varvec{\rho }), \dots , \gamma _L(\varvec{\rho }) {\hat{C}}_L(\varvec{\rho }))}{f({\hat{C}}_1(\varvec{\rho }), {\hat{C}}_2(\varvec{\rho }), \dots , {\hat{C}}_L(\varvec{\rho }))} \end{aligned}$$
(37)

Let \(f_{\varvec{{\hat{C}}}} = f({\hat{C}}_1, \dots , {\hat{C}}_L)\) and \(f_{\varvec{C}} = f(\gamma _1 {\hat{C}}_1, \dots , \gamma _L {\hat{C}}_L)\). Furthermore, let \(f_{\varvec{{\hat{C}}}}^{(i)}\) be the partial derivative of f with respect to its \(i\)th argument evaluated at \(({\hat{C}}_1, {\hat{C}}_2, \dots , {\hat{C}}_L)\) and let \(f_{\varvec{C}}^{(i)}\) be the partial derivative of f with respect to its \(i\)th argument evaluated at \((\gamma _1 {\hat{C}}_1, \gamma _2 {\hat{C}}_2, \dots , \gamma _L {\hat{C}}_L)\).

$$\begin{aligned} \frac{\partial \eta }{\partial \rho _e}&= \sum _i \Biggl (\frac{\partial \eta }{\partial \gamma _i} \cdot \frac{\partial \gamma _i}{\partial \rho _e} + \frac{\partial \eta }{\partial {\hat{C}}_i} \cdot \frac{\partial {\hat{C}}_i}{\partial \rho _e}\Biggr ) \end{aligned}$$
(38)
$$\begin{aligned}&= \sum _i \Biggl ( \frac{f_{\varvec{C}}^{(i)} {\hat{C}}_i}{f_{\hat{\varvec{C}}}} \frac{\partial \gamma _i}{\partial \rho _e} + \Biggl ( \frac{f_{\varvec{C}}^{(i)} \gamma _i}{f_{\varvec{{\hat{C}}}}} - \frac{f_{\varvec{{\hat{C}}}}^{(i)} f_{\varvec{C}}}{f_{\hat{\varvec{C}}}^2} \Biggr ) \frac{\partial {\hat{C}}_i}{\partial \rho _e} \Biggr ) \end{aligned}$$
(39)

One can see that if the magnitudes of \(f_{\hat{\varvec{C}}}^{(i)}\) and \(f_{\varvec{C}}^{(i)}\) scale down as L increases, and if \(f_{\varvec{C}} / f_{\hat{\varvec{C}}}^2\) is small, then the partial derivative \(\frac{\partial \eta }{\partial \rho _e}\) will also likely be small. For all i, let:

$$\begin{aligned}&|f_{\varvec{C}}^{(i)}| \le \frac{c_1}{L} \end{aligned}$$
(40)
$$\begin{aligned}&|f_{\hat{\varvec{C}}}^{(i)}| \le \frac{c_1}{L} \end{aligned}$$
(41)
$$\begin{aligned}&\Bigl | \frac{\partial {\hat{C}}_i}{\partial \rho _e} \Bigr | \le c_2 \end{aligned}$$
(42)
$$\begin{aligned}&\Bigl | \frac{\partial \gamma _i}{\partial \rho _e} \Bigr | \le c_3 \end{aligned}$$
(43)
$$\begin{aligned}&\Bigl | \frac{{\hat{C}}_i}{f_{\hat{\varvec{C}}}} \Bigr | \le c_4 \end{aligned}$$
(44)
$$\begin{aligned}&\Bigl | \frac{\gamma _i}{f_{\hat{\varvec{C}}}} \Bigr | \le c_5 \end{aligned}$$
(45)
$$\begin{aligned}&\Bigl | \frac{f_{\varvec{C}}}{f_{\hat{\varvec{C}}}^2} \Bigr | \le c_6 \end{aligned}$$
(46)

Then one can set the bound:

$$\begin{aligned} \Biggl | \frac{\partial \eta }{\partial \rho _e} \Biggr | \le c_1 c_4 c_3 + c_1 c_5 c_2 + c_1 c_6 c_2 \end{aligned}$$
(47)

From the above bound, one can see that \(c_3\), \(c_5\) and \(c_6\) must be small enough to guarantee a low upper bound on the absolute value of \(\frac{\partial \eta }{\partial \rho _e}\). This means that:

  1. The diagonal’s correcting factors must not be sensitive to \(\rho _e\) (i.e. \(c_3\) is small). This has been established above under some conditions.

  2. The ratio of the diagonal correcting factors to the function estimator \(f_{\hat{\varvec{C}}}\) must be small in magnitude (i.e. \(c_5\) is small). This is true for the experiment above, where the diagonal correcting ratios at the full ground mesh ranged from -19.0 to 21.7 while the estimated compliance mean and standard deviation were 410.0 and 1329.7, respectively.

  3. The ratio \(f_{\varvec{C}} / f_{\hat{\varvec{C}}}^2\) must be small in magnitude (i.e. \(c_6\) is small). This is also true for the experiment above at the full ground mesh, where the ratios were \(2.7 \times 10^{-3}\) and \(3.0 \times 10^{-4}\) for the mean and standard deviation of the compliance, respectively.

To show that the above result applies to the mean, standard deviation and variance functions, it suffices to show that \(|f_{\varvec{C}}^{(i)}| \le \frac{c_1}{L}\) for some constant \(c_1\). If this is true for \(f_{\varvec{C}}^{(i)}\), it is also true for \(f_{\hat{\varvec{C}}}^{(i)}\) since it is the same function evaluated at a different point. The partial derivatives of the mean, standard deviation and variance of \((C_1, C_2, \dots , C_L)\) with respect to each \(C_i\) are:

$$\begin{aligned}&\frac{\partial \mu _{\text {C}}}{\partial C_i} = \frac{1}{L} \end{aligned}$$
(48)
$$\begin{aligned}&\frac{\partial \sigma _{\text {C}}}{\partial C_i} = \Big ( 1 - \frac{1}{L} \Big ) \frac{C_i - \mu _{\text {C}}}{(L - 1) \times \sigma _{\text {C}}} \le \frac{2(C_i - \mu _{\text {C}})}{L \times \sigma _{\text {C}}} \end{aligned}$$
(49)
$$\begin{aligned}&\frac{\partial \sigma _{\text {C}}^2}{\partial C_i} = \Big ( 1 - \frac{1}{L} \Big ) \frac{2(C_i - \mu _{\text {C}})}{(L - 1)} \le \frac{4(C_i - \mu _{\text {C}})}{L} \end{aligned}$$
(50)

because \(L - 1 \ge L / 2\) for \(L > 1\). Let \(l_{\mu _{\text {C}}}\) and \(l_{\sigma _{\text {C}}}\) be lower bounds on \(\mu _{\text {C}}\) and \(\sigma _{\text {C}}\) for all the designs. The constant \(c_1\) is therefore 1 for \(\mu _{\text {C}}\), \(2(C_{\text {max}} - l_{\mu _{\text {C}}})/l_{\sigma _{\text {C}}}\) for \(\sigma _{\text {C}}\) and \(4(C_{\text {max}} - l_{\mu _{\text {C}}})\) for \(\sigma _{\text {C}}^2\).

Finally, for the augmented Lagrangian function, it was not possible to establish the bound above. Even if the compliance constraints were scaled by 1/L, allowing a bound of the form \(c_1 / L\), \(c_1\) would still scale up with the linear and quadratic penalties of the augmented Lagrangian function. The linear penalty is unbounded from above and the quadratic penalty grows exponentially during the optimization process. This means that no tight bound can be established. The experiments run were also consistent with this result: the diagonal estimation method failed when solving a maximum compliance constrained problem using the augmented Lagrangian algorithm, producing a meaningless design.

6 Optimization

6.1 Low rank loads

When minimizing the mean compliance only, the insensitivity of the correcting ratio to the design implies that one can minimize the mean compliance estimate instead of the exact value and still obtain a reasonable design. This is demonstrated in this section using a rank \(R = 10\), comparing the trace estimation method against the naive exact method in which all the loading scenarios are enumerated. When minimizing the weighted sum of the mean and standard deviation of the compliance, a corrected estimator was used by calculating the correcting ratios of the mean and standard deviation estimators separately at the full ground mesh. Let the uncorrected mean compliance estimator be \({\hat{\mu }}_{\text {C}}\) and the uncorrected standard deviation estimator be \({\hat{\sigma }}_{\text {C}}\). The corrected estimator, \({\hat{W}}\), of the weighted sum of the mean and standard deviation used was:

$$\begin{aligned} {\hat{W}} = \frac{\mu _{\text {C}}(\varvec{x}_0)}{{\hat{\mu }}_{\text {C}}(\varvec{x}_0)} {\hat{\mu }}_{\text {C}}(\varvec{x}) + 2 \frac{\sigma _{\text {C}}(\varvec{x}_0)}{{\hat{\sigma }}_{\text {C}}(\varvec{x}_0)} {\hat{\sigma }}_{\text {C}}(\varvec{x}) \end{aligned}$$
(51)

Only Hadamard basis probing vectors were used in this section.
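As a minimal sketch, Eq. (51) amounts to computing two ratios once at \(\varvec{x}_0\) and returning a corrected estimator; the four estimator/exact functions passed in are assumed to be available, e.g. implementations like the earlier sketches.

```julia
# Eq. (51) as a higher-order function: the two correcting ratios are computed
# once at x0 and reused for all later designs.
function corrected_mean_std(mu_exact, mu_hat, sigma_exact, sigma_hat, x0)
    c_mu = mu_exact(x0) / mu_hat(x0)             # mean correcting ratio
    c_sigma = sigma_exact(x0) / sigma_hat(x0)    # std correcting ratio
    return x -> c_mu * mu_hat(x) + 2 * c_sigma * sigma_hat(x)
end

# Toy usage with trivial stand-in functions:
W_hat = corrected_mean_std(x -> 2x, x -> x, x -> 3x, x -> x, 1.0)
println(W_hat(0.5))   # 2*0.5 + 2*3*0.5 = 4.0
```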

6.1.1 Mean compliance minimization

To demonstrate the effectiveness of the proposed approaches, the cantilever beam problem described in Sect. 3 was solved using the exact and approximate methods. Table 5 shows the statistics of the final optimal solutions obtained by minimizing the mean compliance subject to the volume fraction constraint using the exact and trace estimation methods to evaluate the mean compliance. Ten Hadamard basis probing vectors were used in the trace estimator. The optimal topologies are shown in Fig. 5.

While the designs obtained were different, both algorithms converged to reasonable designs in similar amounts of time. The convergence times show that the convergence behavior was not affected by the use of an estimator in place of the original objective. However, the design produced by the trace estimation method was significantly worse than the exact method’s, which is to be expected since an approximate objective was minimized. Finally, note that the correcting ratio of the mean compliance estimator at the final design is 1.276, which is very close to the values shown in Fig. 3.

Fig. 5 Optimal topologies of the mean compliance minimization problem using continuation SIMP

Table 5 Summary statistics of the load compliances of the optimal solution of the mean compliance minimization problem using the exact and trace estimation methods to evaluate the mean compliance

6.1.2 Mean-std compliance minimization

Similarly, Table 6 shows the statistics of the final solutions of the mean-std minimization problem solved using the exact method and the corrected diagonal estimation method with 10 Hadamard basis probing vectors. The optimal topologies are shown in Fig. 6. Both algorithms converged to reasonable, feasible designs. Additionally, as expected, the exact and approximate mean-std minimization algorithms converged to solutions with lower compliance standard deviations but higher means compared to the exact and approximate mean minimization algorithms. It should be noted that the approximation error and the non-convexity of the problem can sometimes cause this expectation to be unmet with the approximate approaches. The results indicate that the approximate method converges in a fraction of the time taken by the exact method, because evaluating the function and its gradient using diagonal estimation requires \(2N = 20\) linear system solves while the naive exact method requires 1000. This problem uses a low rank \(\varvec{F}\); the results of using the proposed approximate methods to solve a problem with a load matrix \(\varvec{F}\) of rank 100 are shown in the next section.

Fig. 6 Optimal topologies of the mean-std compliance minimization problem using continuation SIMP

Table 6 Summary statistics of the load compliances of the optimal solution of the mean-std compliance minimization problem using exact and the corrected diagonal estimation method with 10 Hadamard basis probing vectors to evaluate the mean-std compliance

6.2 High rank loads

In this section, the 2D problems solved above are solved again using a load scenario matrix \(\varvec{F}\) of rank \(R = 100\) instead of 10. Additionally, the SVD-based method proposed by Tarek and Ray (2021) is used instead of the naive approach used above. This highlights the disadvantage of the SVD-based method when using a high rank \(\varvec{F}\).

The results are consistent with the low rank \(\varvec{F}\) case: the corrected estimator’s accuracy is significantly improved by a single correction at the beginning of the optimization. The histograms in Figs. 8 and 9 also suggest that the correcting ratio is insensitive to the design. Figure 7 shows that the Hadamard probing vectors do not always give a better estimator than the Rademacher-distributed ones for the mean, but they are consistently better for the standard deviation. Figures 10 and 11 and Tables 7 and 8 show the optimal topologies and results obtained using the exact and approximate methods. The results are consistent with the expectations.

Fig. 7 Accuracy profile of the trace and diagonal estimation methods for estimating the mean compliance and its standard deviation using 10, 100, 200, 300, 400, 500, 600, 700, 800, 900 and 1000 probing vectors for the high rank \(\varvec{F}\) case

Fig. 8 Histograms of the ratio between the exact mean compliance and the trace estimate using 10 Hadamard basis probing vectors for the high rank \(\varvec{F}\) case. In each figure, 500 designs were randomly sampled where each element’s pseudo-density is sampled from a truncated normal distribution with the means indicated above and a standard deviation of 0.2, truncated between 0 and 1

Fig. 9 Histograms of the ratio between the exact compliance standard deviation and the estimate using 10 Hadamard basis probing vectors for the high rank \(\varvec{F}\) case. In each figure, 500 designs were randomly sampled where each element’s pseudo-density is sampled from a truncated normal distribution with the means indicated above and a standard deviation of 0.2, truncated between 0 and 1

Fig. 10 Optimal topologies of the mean compliance minimization problem with a high rank \(\varvec{F}\) using continuation SIMP

Table 7 Summary statistics of the load compliances of the optimal solution of the mean compliance minimization problem with a high rank \(\varvec{F}\) using the exact and trace estimation methods to evaluate the mean compliance
Fig. 11 Optimal topologies of the mean-std compliance minimization problem with a high rank \(\varvec{F}\) using continuation SIMP

Table 8 Summary statistics of the load compliances of the optimal solution of the mean-std compliance minimization problem with a high rank \(\varvec{F}\) using exact and the corrected diagonal estimation method with 10 Hadamard basis probing vectors to evaluate the mean-std compliance

As shown in Tables 7 and 8, the SVD-based methods are slower than the proposed approximation schemes when the rank of the loads is high. This is because the number of non-zero singular values is 100, which is 10 times the number of probing vectors used. In the mean compliance minimization, a 10x speedup is achieved, which is consistent with this expectation. In the mean-std compliance minimization, the diagonal estimation method requires 20 linear system solves, so only a 5x speedup is achieved with the approximate method compared to the SVD-based method.

6.3 3D cantilever beam problem

A 3D version of the 2D cantilever beam test problem used above was also solved using the methods proposed in this paper. The problem settings are described and the results are shown below.

A 60 mm × 20 mm × 20 mm 3D cantilever beam was used with cubic hexahedral elements of side length 1 mm. The loads \(\varvec{F}_1\), \(\varvec{F}_2\) and \(\varvec{F}_3\) were positioned at (60, 10, 10), (30, 20, 10) and (40, 0, 10), where the coordinates represent the length, height and depth, respectively. A value of \(R = 10\) was used. The remaining loads and multipliers were sampled from the same distributions as in the 2D problem. A density filter radius of 3 mm was used for the 3D problem. The same volume constrained mean compliance minimization and volume constrained mean-std compliance minimization problems were solved.

6.3.1 Mean compliance minimization

Fig. 12 Cut views of the optimal topologies of the 3D mean compliance minimization problem using the exact method

Fig. 13 Cut views of the optimal topologies of the 3D mean compliance minimization problem using the trace estimation method

Table 9 Summary statistics of the load compliances of the optimal solution of the 3D mean compliance minimization problem using the exact and trace estimation methods to evaluate the mean compliance

The 3D cantilever beam problem described above was solved with the objective of minimizing the mean compliance subject to a volume fraction constraint with a limit of 0.4. Table 9 shows the statistics of the final optimal solutions obtained using the naive exact approach and the trace estimation method to evaluate the mean compliance. Ten Hadamard basis probing vectors were used in the trace estimator. The optimal topologies are shown in Figs. 12 and 13. Results similar to the 2D case can be observed: the designs obtained are different but somewhat reasonable. The proposed method converged in a small fraction of the time taken by the naive method. However, the design produced by the trace estimation method was worse than the exact method’s, which is to be expected since an approximate objective was minimized. Finally, note that the corrected estimate is close to the exact value.

6.3.2 Mean-std compliance minimization

Fig. 14 Cut views of the optimal topologies of the 3D mean-std compliance minimization problem using the exact method

Fig. 15 Cut views of the optimal topologies of the 3D mean-std compliance minimization problem using the corrected diagonal estimation method

Table 10 Summary statistics of the load compliances of the optimal solution of the mean-std compliance minimization problem using exact and the corrected diagonal estimation method with 10 Hadamard basis probing vectors to evaluate the mean-std compliance

Similarly, Table 10 shows the statistics of the final solutions of the 3D mean-std minimization problem solved using the naive exact approach and the corrected diagonal estimation method with 10 Hadamard basis probing vectors. The optimal topologies are shown in Figs. 14 and 15. Both algorithms converged to reasonable and feasible designs. Additionally, as expected, the exact mean-std minimization converged to a solution with a lower standard deviation but a higher mean compliance compared to the exact mean minimization. However, due to the approximation error and the non-convexity of the problems, the exact and approximate mean-std algorithms converged to solutions with a lower mean and standard deviation compared to the approximate mean algorithm. Finally, as expected, the exact method took significantly longer to converge than the diagonal estimation method.

7 Conclusion

In this paper, two approximate methods were proposed to handle load uncertainty in compliance topology optimization problems where the uncertainty is described by a finite set of loading scenarios. By re-formulating the mean compliance and a class of scalar-valued functions of the load compliances as trace and diagonal estimation problems, significant performance improvements were achieved over the exact methods. These improvements were demonstrated via complexity analysis and computational experiments. The methods proposed were shown to work well in practice while having a different time complexity profile from the exact methods.