1 Introduction

In structural optimisation, particularly in topology optimisation, the self-adjoint compliance minimisation problem is often studied (Rozvany et al. 1989). Because the problem is self-adjoint, design sensitivities for gradient-based optimisation can be obtained at marginal computational cost. This advantage has likely contributed to the popularity of the compliance minimisation problem. However, as Rozvany et al. (1993) pointed out: “Self-adjoint problems, such as design for a single stress, a single compliance or single natural frequency constraint do not represent a real-world situation, because most practical structures are subject to several load conditions and design constraints.” Almost three decades later, solving large-scale linear problems considering multiple physical loads and a large variety of responses (hereafter denoted by compound problems) is becoming increasingly attainable as available computational power increases. However, regardless of available computational power, efficient numerical implementations remain essential.

Typically, finding the state corresponding to a load, i.e. the solution to the governing equations, dominates the overall computation time during optimisation. As Borrvall and Petersson (2001) report, the share of computational time spent on such procedures approaches 97% for minimum compliance problems considering a single physical load, and computation times increase further when considering compound problems.

Finding a solution to these systems of linear equations generally consists of two steps: preprocessing and solving (Amir and Sigmund 2010). The preprocessing for direct methods requires the (generally expensive) matrix factorisation, and solving requires finding the exact solution via comparatively inexpensive back-substitutions (Davis 2006). In contrast, iterative methods require the construction of a preconditioner, and they subsequently generate a sequence of approximate solutions until convergence (Saad 2003). The relative cost of preconditioner construction and the iterative solution process depends on many factors, such as the type of preconditioner and condition number. The preprocessing information can be repeatedly reused for subsequent solves within the same design iteration when this involves a system matrix with equivalent partitioning. This possibility holds for both solution methods.

Three strategies can be distinguished to lower the computational effort of solving large-scale linear systems of governing equations in structural optimisation, i.e. reduction of

  (i) the number of design iterations,

  (ii) the computational effort per solve, and

  (iii) the number of solves per design iteration.

The first strategy has shown great potential to reduce computational effort, for instance using advanced sequential approximate optimisation schemes (see e.g. Bruyneel et al. 2002; Li and Khandelwal 2015). However, these approaches are independent of the presented methodology and out of scope for this discussion.

A common approach to reduce computation time per linear solve is to employ parallel computing (Borrvall and Petersson 2001; Aage et al. 2017), a technique which distributes, rather than reduces, the computational effort. To reduce the effort itself, approximation techniques can be considered, such as approximated reanalysis (Kirsch 1991; Amir 2015), iterative solution techniques (Borrvall and Petersson 2001; Amir et al. 2010, 2014), and approximated model order reduction (Ma et al. 1993; Choi et al. 2019). Alternatively, static condensation (Guyan 1965; Irons 1965) allows for exact model order reduction, decreasing the system dimensionality without loss of information (see e.g. Yang and Lu 1996). For a comprehensive review of techniques aiming to decrease the computational effort per solve in the context of topology optimisation, the reader is referred to the recent work by Mukherjee et al. (2021).

The third category (approaches to reduce the number of solves per design iteration) includes the adjoint sensitivity analysis method itself, for instance, when applied to most self-adjoint problems (Arora and Haug 1979; Vanderplaats 1980; Belegundu 1986). For problems considering many physical static loads, Zhang et al. (2020) reduce the number of deterministic loads to a single approximated load using sampling schemes. A recent study shows that static condensation allows for a reduction of the number of factorisations/preconditioning steps and the number of solves in multi-partition problems, which are problems that, as a result of changing boundary conditions, require multiple different partitions of the stiffness matrix (Koppen et al. 2022b).

In contrast to that study, in this paper, we focus on compound problems with a single partitioning of the system matrix. We introduce another method of the third category that reduces the number of solves per design iteration for design problems with equivalent partitioning of degrees of freedom. Different boundary condition values can be handled as long as the partition remains the same. We herein assume linear state-based optimisation problems under (quasi-)static loading, which constitute a significant fraction of all problems studied in the topology optimisation community (Bendsøe and Sigmund 2004). By automatically detecting linear dependencies between physical and adjoint loads, unnecessary solves in compound problems involving the same partition of the system matrix can be avoided entirely while maintaining equal accuracy of the solution of the states. To help the reader recognise linear dependencies that may arise in common design optimisation problems, we distinguish three cases of such linear dependency:

  (i) Linearly-Dependent Physical-Physical (LDPP) loads. Such cases are common in design problems involving multiple loading conditions with applied loads of varying magnitudes, as in the case study of Sect. 4. Optimisation problems with LDPP loads are relatively easily detected manually and regularly avoided by the user.

  (ii) Linearly-Dependent Adjoint-Physical (LDAP) load pairs. Typical problems include cases where the adjoint load depends linearly on the corresponding physical load, as is common in conventional self-adjoint problems (Belegundu 1986; Rozvany et al. 1993). The most well-known design problem in the topology optimisation community involving such load pairs is the classical compliance minimisation problem. Such cases are typically detected by academics in this field but may be overlooked otherwise.

  (iii) Mixed Linear Dependencies (MLDs), i.e. cases where physical loads or adjoint loads can be written as a linear combination of previously considered physical or adjoint loads. MLDs also include linear dependencies between adjoint loads and between non-corresponding adjoint and physical loads (as well as any linear combination). These MLDs are the most general situation and the most difficult to foresee and consider by hand. Such cases are expected in problems with multiple response functions depending on multiple states. More specifically, such cases often occur when the locations where the loads are applied and the locations of the performance measures coincide, as is typical in the design of compliant mechanisms. These MLDs are illustrated in detail in all numerical examples.

A user will typically be unaware of the presence and type of most such linear dependencies. A Linear Dependency Aware Solver (LDAS) can be employed to detect and exploit any linear dependency, including any of the three aforementioned types, automatically. In this work, we demonstrate the need for and benefits of an LDAS in the context of gradient-based structural optimisation for compound problems and provide one such solver in the form of a simple algorithm to automatically detect and exploit any linear dependence in a (possibly large) set of loads. The focus is on MLDs since these linear dependencies are typically the hardest to detect. However, due to the generality of the method, it also automatically avoids unnecessary solves for LDPP loads and LDAP pairs. Thus, it is ensured that only the minimum number of linear solves is performed in each iteration. This advantage makes the approach suitable for general-purpose structural and topology optimisation implementations. Note that the presented algorithm does not exclude other additional techniques to reduce the computational effort and time, such as parallel computing, approximate modelling, or reduced order techniques, which can be implemented alongside the presented methodology.

2 Method

Consider a general inequality-constrained nonlinear structural optimisation problem

$$\begin{aligned} \underset{{\textbf {x}}\in \mathbb {X}^N}{\text {min}}&f\left[ {\textbf {x}}\right] \\ \text {s.t.}\quad &{\textbf {g}}\left[ {\textbf {x}}\right] \le {\textbf {0}}\\ \end{aligned}$$
(1)

with objective \(f \in \mathbb {R}\), m inequality constraints \({\textbf {g}} \in \mathbb {R}^{m}\) and N design variables \({\textbf {x}} \in \mathbb {X}^{N} \subseteq \mathbb {R}^N\).

2.1 Response and sensitivity analysis

The responses (objective and constraint functions) commonly depend on physical states \({\textbf {U}} := \left[ {\textbf {u}}_1 ,\ldots , {\textbf {u}}_a\right] \in \mathbb {R}^{n \times a}\), where n is the dimensionality of the discretised governing equations and a the number of states. These states implicitly depend on the design variables, i.e. \({\mathbf {U}} = {\mathbf {U}}\left[ {\mathbf {x}}\right]\). We consider a setting in which these physical states are obtained by solving a linear system of discretised governing equations, i.e.

$$\begin{aligned} {\mathbf {K}}\left[ {\mathbf {x}}\right] {\mathbf {U}} = {\mathbf {F}}\left[ {\mathbf {x}}\right] , \end{aligned}$$
(2)

with \({\textbf {F}}\left[ {\textbf {x}}\right] := \left[ {\textbf {f}}_1\left[ {\textbf {x}}\right] ,\ldots , {\textbf {f}}_a\left[ {\textbf {x}}\right] \right] \in \mathbb {R}^{n \times a}\) the physical loads and \({\textbf {K}}\left[ {\textbf {x}}\right] \in \mathbb {R}^{n \times n}\) a design-dependent, symmetric, and non-singular system matrix. In the following we assume the system in Eq. (2) constitutes a single partition, i.e. all physical loads are applied to the system under the same boundary conditions.

In gradient-based optimisation, the sensitivities of the responses to the design variables are required to update the design variables. For structural optimisation problems with a large ratio of the number of design variables to the number of state-based response functions, commonly, the adjoint method is applied to efficiently obtain this sensitivity information (Arora and Haug 1979; Vanderplaats 1980). To this end, consider the augmented response

$$\begin{aligned} {\mathcal {L}}_j\left[ {\mathbf {x}},{\mathbf {U}}\left[ {\mathbf {x}}\right] \right] = g_j\left[ {\mathbf {x}},{\mathbf {U}}\left[ {\mathbf {x}}\right] \right] - \varvec{\Lambda }_j : \left( {\mathbf {K}}\left[ {\mathbf {x}}\right] {\mathbf {U}} - {\mathbf {F}}\left[ {\mathbf {x}}\right] \right) , \end{aligned}$$
(3)

with \({\varvec{\Lambda }}_j := \left[ {{\lambda }}_{j,1} ,\ldots , {{\lambda }}_{j,a}\right] \in \mathbb {R}^{n \times a}\). Here, a suitable choice of the adjoint states \({\varvec{\Lambda }}_j\) can circumvent calculation of the computationally expensive derivative \(\frac{\partial {\mathbf {U}}}{\partial x_k}\) (Vanderplaats 1980). Doing so, full differentiation of Eq. (3) yields

$$\begin{aligned} \frac{\text {d} {\mathcal {L}}_j}{\text {d} x_k} = \frac{\partial g_j}{\partial x_k} - \varvec{\Lambda }_j : \left( \frac{\partial {\mathbf {K}}}{\partial x_k} {\mathbf {U}}\right) , \end{aligned}$$
(4)

with

$$\begin{aligned} {\mathbf {K}}\left[ {\mathbf {x}}\right] \varvec{\Lambda }_j = \frac{\partial g_j}{\partial {\mathbf {U}}}, \end{aligned}$$
(5)

where \(\frac{\partial g_j}{\partial {\mathbf {U}}}\) is referred to as the adjoint loads of response \(g_j\).
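As an illustration of Eqs. (2), (4) and (5), the following minimal sketch evaluates adjoint sensitivities for a small system and compares them with central finite differences. The three-spring chain, the linear response \(g = {\mathbf {c}} \cdot {\mathbf {u}}\), the design-independent load, and all function names are assumptions made for this example only; they are not part of the presented formulation.

```python
import numpy as np

def stiffness(x):
    """Hypothetical 2-DOF chain of three springs with stiffnesses x[0], x[1], x[2]."""
    return np.array([[x[0] + x[1], -x[1]],
                     [-x[1], x[1] + x[2]]])

def stiffness_derivatives():
    """dK/dx_k for the chain above (constant, because K is linear in x)."""
    return [np.array([[1.0, 0.0], [0.0, 0.0]]),
            np.array([[1.0, -1.0], [-1.0, 1.0]]),
            np.array([[0.0, 0.0], [0.0, 1.0]])]

def response(x, f, c):
    """Assumed linear response g = c . u, with K[x] u = f."""
    u = np.linalg.solve(stiffness(x), f)
    return c @ u

def adjoint_sensitivities(x, f, c):
    K = stiffness(x)
    u = np.linalg.solve(K, f)    # physical state, Eq. (2)
    lam = np.linalg.solve(K, c)  # adjoint state, Eq. (5), with adjoint load dg/du = c
    # Eq. (4) with dg/dx_k = 0, since g has no explicit design dependence here
    return np.array([-lam @ (dK @ u) for dK in stiffness_derivatives()])

x = np.array([2.0, 1.0, 3.0])  # design variables (assumed values)
f = np.array([1.0, 2.0])       # design-independent physical load (assumed)
c = np.array([2.0, 1.0])       # response weights (assumed)

dg_adjoint = adjoint_sensitivities(x, f, c)

# Central finite differences as an independent check
h = 1e-6
dg_fd = np.array([(response(x + h * e, f, c) - response(x - h * e, f, c)) / (2 * h)
                  for e in np.eye(3)])
print(dg_adjoint, dg_fd)
```

One adjoint solve yields the sensitivities with respect to all three design variables, whereas the finite-difference check needs two state solves per design variable.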

Each of the physical and adjoint loads can be linearly dependent on any combination of previously considered loads and thus can be reconstructed as their linear combination. Exploiting possible linear dependence can significantly reduce the costs required to find all states. Consider a set of a loads, of which b are linearly independent; the computational effort then scales roughly with \(\frac{b}{a}\), as only b solves are required to reconstruct all states. To avoid unnecessarily solving Eqs. (2) and (5) for linearly dependent loads we propose

  (i) to compute each load’s dependency on previous loads, and

  (ii) to keep track of the states corresponding to linearly independent loads.

Various methods exist to check for linear dependency and to perform the necessary bookkeeping. We consider one such algorithm that detects linear dependencies and builds orthogonal bases of linearly independent loads and their corresponding states.

2.2 Orthogonalisation and reconstruction

Consider the non-empty orthogonal bases of loads \({\mathcal {F}}\) and states \({\mathcal {U}}\) of length c. One can investigate the linear dependency of a load \({\mathbf {f}}\) (e.g. a physical load \({\mathbf {f}}\) or adjoint load \(\frac{\partial g}{\partial {\mathbf {u}}}\)) with respect to \({\mathcal {F}}\) by applying the last step of the well-known Gram–Schmidt orthogonalisation procedure (Laplace 1820; Gram 1883; Schmidt 1907). The residual \({\mathbf {r}}\) is obtained via

$$\begin{aligned} {\mathbf {r}} := {\mathbf {f}} - \sum _{i=1}^{c} \alpha _i {\mathcal {F}}_i, \quad \text {with} \quad \alpha _i = \frac{{\mathcal {F}}_i\cdot {\mathbf {f}}}{{\mathcal {F}}_i \cdot {\mathcal {F}}_i}, \end{aligned}$$
(6)

with \({\mathcal {F}}_i\) the ith load in \({\mathcal {F}}\). A possible implementation is given by the pseudo-code Algorithm 1.

Algorithm 1 (pseudo-code figure)
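A minimal sketch of the projection in Eq. (6) is given below. It is one possible reading of Algorithm 1, not a reproduction of it; the function names `orthogonalise` and `is_dependent` and the relative tolerance `eps` are assumptions introduced for this illustration.

```python
import numpy as np

def orthogonalise(f, basis):
    """Coefficients alpha_i and residual r of load f w.r.t. an orthogonal load basis, Eq. (6)."""
    f = np.asarray(f, dtype=float)
    alphas = np.array([(Fi @ f) / (Fi @ Fi) for Fi in basis])
    r = f.copy()
    for alpha, Fi in zip(alphas, basis):
        r -= alpha * Fi
    return alphas, r

def is_dependent(f, r, eps=1e-10):
    """Treat f as linearly dependent when the residual is small relative to f (assumed criterion)."""
    return np.linalg.norm(r) <= eps * np.linalg.norm(f)

# Orthogonal basis and load taken from the analytical example of Sect. 3
basis = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
f = np.array([4.0, 4.0])
alphas, r = orthogonalise(f, basis)
print(alphas, r, is_dependent(f, r))
```

In finite-precision arithmetic the residual of a dependent load is rarely exactly zero, which is why the sketch compares it against a small tolerance scaled by the norm of the load.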

If the norm of the residual \({\mathbf {r}}\) is zero, then \({\mathbf {f}}\) is linearly dependent on basis \({\mathcal {F}}\). As a result, the corresponding state \({\mathbf {u}}\) (or adjoint state \(\varvec{\lambda }\)) is linearly dependent on basis \({\mathcal {U}}\). Thus, the state \({\mathbf {u}}\) may be reconstructed via

$$\begin{aligned} {\mathbf {u}} = \sum _{i=1}^{c} \alpha _i {\mathcal {U}}_i. \end{aligned}$$
(7)

As such, one can obtain the exact numerical solution of state \({\mathbf {u}}\), while avoiding solving the governing equations for load \({\mathbf {f}}\). However, if the norm of the residual vector \({\mathbf {r}}\) is non-zero (or, in practice, larger than a small tolerance \(\varepsilon\)), \({\mathbf {f}}\) is linearly independent with respect to basis \({\mathcal {F}}\) and the expensive solve cannot be avoided.

We solve for the state \({\mathbf {v}}\) corresponding to residual load \({\mathbf {r}}\) defined by

$$\begin{aligned} {\mathbf {K}}\left[ {\mathbf {x}}\right] {\mathbf {v}} = {\mathbf {r}}. \end{aligned}$$
(8)

Subsequently, load \({\mathbf {r}}\) and state \({\mathbf {v}}\) are added to bases \({\mathcal {F}}\) and \({\mathcal {U}}\), respectively. Since \({\mathbf {r}}\) is orthogonal with respect to basis \({\mathcal {F}}\), so is \({\mathbf {v}}\) to \({\mathcal {U}}\). As a result, both enriched bases \({\mathcal {F}}\) and \({\mathcal {U}}\) remain orthogonal. The state \({\mathbf {u}}\) is then reconstructed from Eqs. (6) and (7). The above procedure can be repeated using the enriched bases, as defined in Algorithm 2. Due to the general nature of the algorithm, the proposed procedure is independent of the type of dependencies as defined in Sect. 1. The equivalence of solutions has been verified extensively on many test problems.

Algorithm 2 (pseudo-code figure)
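The procedure of Eqs. (6) to (8) can be sketched as a small solver object that keeps both bases and counts the solves it performs. This is an illustrative reading of Algorithm 2 under stated assumptions: the class name, the dense `numpy.linalg.solve` call (a stand-in for reusing a factorisation or preconditioner), and the relative tolerance `eps` are choices made for this example.

```python
import numpy as np

class LinearDependencyAwareSolver:
    """Sketch of an LDAS for a fixed system matrix K (one design iteration)."""

    def __init__(self, K, eps=1e-10):
        self.K = np.asarray(K, dtype=float)
        self.eps = eps      # assumed relative tolerance on the residual norm
        self.loads = []     # orthogonal basis of loads
        self.states = []    # states corresponding to the basis loads
        self.n_solves = 0

    def solve(self, f):
        f = np.asarray(f, dtype=float)
        r = f.copy()
        u = np.zeros_like(f)
        for Fi, Ui in zip(self.loads, self.states):
            alpha = (Fi @ f) / (Fi @ Fi)  # coefficient of Eq. (6)
            r -= alpha * Fi               # residual load, Eq. (6)
            u += alpha * Ui               # reconstruction, Eq. (7)
        norm_f = np.linalg.norm(f)
        if norm_f > 0.0 and np.linalg.norm(r) > self.eps * norm_f:
            v = np.linalg.solve(self.K, r)  # solve for the residual load only, Eq. (8)
            self.n_solves += 1
            self.loads.append(r)
            self.states.append(v)
            u += v                          # the new basis vector enters with coefficient 1
        return u

# Usage with an assumed stiffness matrix and the loads of Eq. (9)
K = np.array([[3.0, -1.0], [-1.0, 4.0]])
solver = LinearDependencyAwareSolver(K)
states = [solver.solve(f) for f in ([1.0, 0.0], [1.0, 2.0], [4.0, 4.0])]
print(solver.n_solves)
```

The same object can be called for physical and adjoint loads alike, so no distinction between LDPP, LDAP and MLD cases is needed in the code.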

Although Algorithm 2 introduces additional computational operations, i.e. computing vector norms and orthogonality coefficients, their computational cost is typically negligible compared to the costs of solving a system of equations, as illustrated in Sect. 5. The computational effort increases with the number of loads to consider, but remains negligible as long as the number of loads (both physical and adjoint) is smaller than the dimensionality of the load vectors. Furthermore, these operations do not change when considering distributed-memory parallelism. Alternatively, for loads that do not depend on the states, it is possible to rearrange Algorithm 2 to determine all the independent loads first and evaluate their solutions in parallel afterwards.
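The rearrangement mentioned last can be sketched as follows, assuming all loads are known up front and do not depend on the states. The function name `solve_independent_first`, the tolerance `eps`, and the use of a single dense multi-right-hand-side solve as a stand-in for parallel solves are assumptions of this illustration; at least one load is assumed to be non-zero.

```python
import numpy as np

def solve_independent_first(K, loads, eps=1e-10):
    """First build the orthogonal load basis, then solve for all basis loads at once."""
    basis, coeffs = [], []
    for f in loads:
        f = np.asarray(f, dtype=float)
        alpha = [(Fi @ f) / (Fi @ Fi) for Fi in basis]
        r = f.copy()
        for a, Fi in zip(alpha, basis):
            r -= a * Fi
        norm_f = np.linalg.norm(f)
        if norm_f > 0.0 and np.linalg.norm(r) > eps * norm_f:
            basis.append(r)    # independent load: enrich the basis
            alpha.append(1.0)  # the residual itself enters with coefficient 1
        coeffs.append(np.array(alpha))
    # One solve with several right-hand sides; these columns could be distributed
    V = np.linalg.solve(np.asarray(K, dtype=float), np.column_stack(basis))
    states = [V[:, :len(a)] @ a for a in coeffs]
    return states, len(basis)

K = np.array([[3.0, -1.0], [-1.0, 4.0]])  # assumed system matrix
loads = [[1.0, 0.0], [1.0, 2.0], [4.0, 4.0]]
states, n_solves = solve_independent_first(K, loads)
print(n_solves)
```

Each load stores only the coefficients of the basis vectors that existed when it was processed, so the reconstruction uses the leading columns of the solved basis.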

3 Analytical example

Compound problems may appear in any real-world problem modelled by (a sequence of) linear governing equations. Typical examples of compound problems are formulations with multiple loading conditions and multiple response functions in which the degrees of freedom of (some of) the loads coincide with (some of) the degrees of freedom that define the response functions. For example, one may think of the design of a structure with multiple critical loading conditions, where the displacements of one loading condition are measured at the same degrees of freedom where the loads of another loading condition are applied. Direct examples of this are multi-input–multi-output compliant mechanisms, see e.g. Frecker et al. (1999) or Liu and Korvink (2009). The problem formulation of such mechanisms includes multiple physical loads and responses, all applied to, or dependent on, the input and output degrees of freedom of the mechanism. As a result, MLDs are commonly present but generally remain unnoticed. To clarify the cases in which one might encounter linear dependency, we here exemplify the three different types of unnecessary solves, as introduced in Sect. 1.

Fig. 1 One-dimensional two-degrees-of-freedom compliant mechanism model

3.1 Problem formulation

Consider the two degrees of freedom spring model as depicted in Fig. 1. Note that this example—after applying static condensation—can exactly represent any single-input–single-output compliant mechanism, see e.g. (Wang 2009; Hasse et al. 2017). Therefore, this two degrees of freedom example is fully representative of large-scale linear problems considering multiple physical loads and responses while better suited to illustrate the proposed method.

3.2 Problem analysis

Next, we analyse the properties of this optimisation problem in light of the proposed method, with a specific emphasis on the required number of systems of equations that are to be solved.

3.2.1 Forward analysis

The physical and adjoint states can be obtained by solving the design-dependent discretised governing equations following Eqs. (2) and (5). A set of the following three physical loads is considered:

$$\begin{aligned} {\mathbf {F}} = \begin{bmatrix} \begin{bmatrix} 1 \\ 0\end{bmatrix} \begin{bmatrix} 1 \\ 2\end{bmatrix} \begin{bmatrix} 4 \\ 4\end{bmatrix}\end{bmatrix}. \end{aligned}$$
(9)

The first residual by definition equals the first load, that is \({\mathbf {r}}_1 = {\mathbf {f}}_1\). As a result, the state \({\mathbf {v}}_1 = {\mathbf {u}}_1\). Since the bases are still empty when this load is considered, the resulting load and state are directly added to the corresponding bases. The second residual is calculated via Eq. (6), that is

$$\begin{aligned} {\mathbf {r}}_2 = {\mathbf {f}}_2 - \alpha _1 {\mathcal {F}}_1 = \begin{bmatrix} 0 \\ 2 \end{bmatrix}. \end{aligned}$$
(10)

Since \({\mathbf {r}}_2\) is non-zero, the first and second physical loads are linearly-independent. The corresponding physical state \({\mathbf {v}}_2\) is obtained by solving for the non-zero load \({\mathbf {r}}_2\) via Eq. (8). As a result the following bases, consisting of orthogonal vectors, are obtained after solving for the first two loads:

$$\begin{aligned} {\mathcal {F}} = \left[ {\mathbf {f}}_1, {\mathbf {r}}_2\right] \quad \text {and} \quad {\mathcal {U}} = \left[ {\mathbf {u}}_1, {\mathbf {v}}_2\right] . \end{aligned}$$
(11)

The second physical state is now reconstructed following Eq. (7) and reads

$$\begin{aligned} {\mathbf {u}}_2 = \alpha _1 {\mathcal {U}}_1 + {\mathbf {v}}_2= {\mathbf {u}}_1 + {\mathbf {v}}_2. \end{aligned}$$
(12)

The third physical load can be written as a linear combination of the current orthogonal basis \({\mathcal {F}}\), resulting in a zero residual load \({\mathbf {r}}_3 = {\mathbf {0}}\). These are thus LDPP loads, and the basis \({\mathcal {U}}\) can be used to reconstruct the third physical state without an additional solve as in Eq. (7), i.e.

$$\begin{aligned} {\mathbf {u}}_3 = \alpha _1{\mathcal {U}}_1 + \alpha _2 {\mathcal {U}}_2 = 4{\mathbf {u}}_1 + 2{\mathbf {v}}_2. \end{aligned}$$
(13)
Table 1 Overview of both physical and adjoint loads and states, as well as the orthogonal bases encountered in the illustrative example presented in Fig. 1
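The forward analysis of Eqs. (9) to (13) can be retraced numerically. The sketch below uses the loads of Eq. (9); the stiffness matrix `K` is an assumed placeholder, since any symmetric non-singular matrix yields the same coefficients and residuals.

```python
import numpy as np

K = np.array([[3.0, -1.0], [-1.0, 4.0]])  # assumed stiffness matrix (placeholder values)
f1 = np.array([1.0, 0.0])
f2 = np.array([1.0, 2.0])
f3 = np.array([4.0, 4.0])

# Load 1: the bases are empty, so r1 = f1 and v1 = u1 (first solve)
u1 = np.linalg.solve(K, f1)

# Load 2: project onto F = [f1], Eq. (10), then solve for the residual (second solve)
alpha1 = (f1 @ f2) / (f1 @ f1)
r2 = f2 - alpha1 * f1
v2 = np.linalg.solve(K, r2)
u2 = alpha1 * u1 + v2                     # Eq. (12)

# Load 3: project onto F = [f1, r2]; the residual vanishes, so no solve is needed
a1 = (f1 @ f3) / (f1 @ f1)
a2 = (r2 @ f3) / (r2 @ r2)
r3 = f3 - a1 * f1 - a2 * r2
u3 = a1 * u1 + a2 * v2                    # Eq. (13)

print(alpha1, r2, a1, a2, r3)
```

The reconstructed states `u2` and `u3` can be compared against direct solves of Eq. (2) to confirm that only two solves are needed for the three physical loads.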

3.2.2 Sensitivity analysis

Now consider a response function \(g_1\left[ {\mathbf {u}}_2\right]\) that is a measure for the strain energy due to load \({\mathbf {f}}_2\), i.e.

$$\begin{aligned} g_1\left[ {\mathbf {u}}_2\right] = \frac{1}{2}{\mathbf {f}}_2 \cdot {\mathbf {u}}_2. \end{aligned}$$
(14)

The second adjoint load for this response is linearly dependent on the corresponding physical load \({\mathbf {f}}_2\) as

$$\begin{aligned} \frac{\partial g_1}{\partial {\mathbf {u}}_2} = \frac{1}{2} {\mathbf {f}}_2, \end{aligned}$$
(15)

thus this is an LDAP pair, and consequently \({\mathbf {r}}_4 = {\mathbf {0}}\). As a result, one can use the basis \({\mathcal {U}}\) to reconstruct the second adjoint state, which yields

$$\begin{aligned} \varvec{\lambda }_{1,2} = \frac{1}{2}{\mathbf {u}}_2 = \alpha _1{\mathcal {U}}_1 + \alpha _2 {\mathcal {U}}_2 = \frac{1}{2}{\mathbf {u}}_1 + \frac{1}{2}{\mathbf {v}}_2, \end{aligned}$$
(16)

with \(\varvec{\lambda }_{j,i}\) the adjoint state of response j with respect to state i. Note that both the first and third adjoint loads of this response, that is \(\frac{\partial g_1}{\partial {\mathbf {u}}_1}\) and \(\frac{\partial g_1}{\partial {\mathbf {u}}_3}\) are zero, and thus so are \(\varvec{\lambda }_{1,1}\) and \(\varvec{\lambda }_{1,3}\).

Finally consider a (fictitious) response function \(g_2\left[ {\mathbf {u}}_1, {\mathbf {u}}_3\right]\) that depends on both degrees of freedom of the first state and third state via

$$\begin{aligned} g_2\left[ {\mathbf {u}}_1,{\mathbf {u}}_3\right] = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \cdot {\mathbf {u}}_1 + \begin{bmatrix} 1 \\ 3 \end{bmatrix} \cdot {\mathbf {u}}_3. \end{aligned}$$
(17)

The adjoint loads for this response function can be written as

$$\begin{aligned} \frac{\partial g_2}{\partial {\mathbf {u}}_1} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}&= 2 {\mathbf {f}}_1 + \frac{1}{2}{\mathbf {r}}_2 \quad \text {and} \\ \frac{\partial g_2}{\partial {\mathbf {u}}_3} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}&= {\mathbf {f}}_1 + \frac{3}{2}{\mathbf {r}}_2. \end{aligned}$$
(18)

Note that both adjoint loads are linearly dependent on a combination of previously considered loads, i.e. an MLD. In this case, the adjoint loads are both linearly dependent on both loads in basis \({\mathcal {F}}\). As a result, one may again use the states in \({\mathcal {U}}\) to reconstruct the adjoint states via

$$\begin{aligned} \varvec{\lambda }_{2,1} = 2 {\mathbf {u}}_1 + \frac{1}{2}{\mathbf {v}}_2 \quad \text {and} \quad \varvec{\lambda }_{2,3} = {\mathbf {u}}_1 + \frac{3}{2}{\mathbf {v}}_2. \end{aligned}$$
(19)

The loads, states, and bases of this example are summarised in Table 1.

3.2.3 Concluding remarks

Six solves are required when all loads (physical and adjoint) are considered. If both LDPPs and LDAP pairs are taken into account, only three solves are needed. Finally, considering MLDs (and thus also LDPPs and LDAP pairs), only two solves are required. Although the presented example is simplified, more complex MLDs do appear in large-scale compound problems, as will be demonstrated in Sects. 4 and 5.
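The two extreme counts can be checked with a rank computation, using the fact that the minimum number of solves equals the number of linearly independent loads. The sketch below collects the six non-zero loads of this example (Eqs. (9), (15) and (18)); the dictionary keys are labels chosen for this illustration.

```python
import numpy as np

f1 = np.array([1.0, 0.0])
f2 = np.array([1.0, 2.0])
f3 = np.array([4.0, 4.0])

loads = {
    "f1": f1,
    "f2": f2,
    "f3": f3,
    "dg1/du2": 0.5 * f2,              # Eq. (15)
    "dg2/du1": np.array([2.0, 1.0]),  # Eq. (18)
    "dg2/du3": np.array([1.0, 3.0]),  # Eq. (18)
}

A = np.column_stack(list(loads.values()))
n_naive = A.shape[1]                  # one solve per non-zero load
n_min = np.linalg.matrix_rank(A)      # number of linearly independent loads
print(n_naive, n_min)
```

For a system with only two degrees of freedom the rank can never exceed two, which is why every load after the second independent one is reconstructed rather than solved.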

4 Numerical example 1: design of a bridge

In this section we demonstrate the use of an LDAS for a practically relevant numerical example. The emphasis is on the potential gain, not on formulation, design or optimisation convergence aspects.

4.1 Problem formulation

Consider the design of a simplified bridge-deck supporting structure. A schematic of the problem setting, together with an optimised design, is shown in Fig. 2. The engineer has selected a set of crucial loading conditions and (derived) constraints based on an extensive set of requirements and loading conditions, as typical in the design of such a bridge.

Fig. 2 Optimised result of topology optimisation problem Eq. (20). The solution (800 \(\times\) 120 finite elements and design variables) satisfies all constraints (all active) and the optimisation process terminated in 59 design iterations. Corresponding displacements at the DOF of interest are listed in Table 3

The aim is to design a stiff bridge with limited material for a given set of four loading conditions considering three points of interest: one at a quarter, one at the middle and one at three-quarters of the bridge deck. The magnitudes of forces applied to the DOFs of interest for the four loading conditions are as shown in Table 2. Furthermore, it is decided that the difference in deformations from loading conditions with concentrated loads and combined loads must be restricted. As such, the design has to satisfy several constraints on the deflection of the points of interest under the given loading conditions.

Table 2 Magnitude of forces applied at DOFs 1, 2 and 3 (numbered as assigned in Fig. 2) for loading conditions (LC) 1, 2, 3 and 4

The topology optimisation problem formulation reads

$$\begin{aligned} \underset{{\textbf {x}}\in \mathbb {X}^N}{\text {min}}&f\left[ {\textbf {x}}\right] : \sum _{j=1}^{4} {\mathcal {E}}_j\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \\ \text {s.t.}\quad &g^\text {v}\left[ {\textbf {x}}\right] : v\left[ {\textbf {x}}\right] \le {\overline{v}}\\&g_{1}^\text {u}\left[ {\textbf {x}}\right] : u_{1,1}\left[ {\textbf {x}}\right] - u_{1,4}\left[ {\textbf {x}}\right] \le {\overline{u}}\\&g_{2}^\text {u}\left[ {\textbf {x}}\right] : u_{2,2}\left[ {\textbf {x}}\right] - u_{2,4}\left[ {\textbf {x}}\right] \le {\overline{u}}\\&g_{3}^{\text {u}}\left[ {\textbf {x}}\right] : u_{3,3}\left[ {\textbf {x}}\right] - u_{3,4}\left[ {\textbf {x}}\right] \le {\overline{u}} \end{aligned}$$
(20)

The objective is to minimise the strain energy \({\mathcal {E}}_j\), or equivalently maximise the stiffness, under the four loading conditions by finding design variables \(x_k\) that are bounded by \(\mathbb {X} = \left\{ x \in \mathbb {R} ~ \vert ~ 0 \le x \le 1\right\}\). Constraint \(g^\text {v}\left[ {\mathbf {x}}\right]\) limits the maximum material usage by fraction \({\overline{v}} = 0.5\). Constraint \(g_{i}^\text {u}\) limits the difference in displacement of DOF i between loading conditions i and 4 to \({\overline{u}} = 20\). Herein \(u_{i,j}\left[ {\mathbf {u}}_j\left[ {\mathbf {x}}\right] \right]\) is defined as the displacement at DOF of interest i for loading condition j.

An optimised solution is shown in Fig. 2. The displacements at the DOFs of interest for the four loading conditions of this constrained optimised design are shown in Table 3. Note that the deformations at the points of interest now satisfy the imposed restrictions.

Table 3 Displacements at DOF 1 and 2 for loading conditions 1, 2 and 3 of the optimised design with deformation constraints

4.2 Problem analysis

We now analyse the potential gain of using an LDAS for solving this bridge design optimisation problem.

4.2.1 Forward analysis

Let us first consider the objective of the problem formulation posed in Eq. (20). The objective is a function of the states of four loading conditions, that is \(f\left[ {\mathbf {u}}_1,{\mathbf {u}}_2,{\mathbf {u}}_3,{\mathbf {u}}_4\right]\). Straightforward analysis would thus require four solves. However, upon closer inspection, it can be observed that the fourth loading condition consists solely of LDPP loads. The fourth state \({\mathbf {u}}_4\) can, therefore, be written as a linear combination of states \({\mathbf {u}}_1\), \({\mathbf {u}}_2\) and \({\mathbf {u}}_3\). No solves are required for the forward analysis of the constraints since all states have previously been determined to calculate the objective. Using an LDAS to solve Eq. (20) thus saves one of the four solves in the forward analysis, leaving three solves per design iteration.

4.2.2 Sensitivity analysis

Straightforward sensitivity analysis of the objective requires four more solves. However, the adjoint loads \(\frac{\text {d} f}{\text {d} {\mathbf {u}}_i}\) for \(i =1,2,3,4\) can all be written as a linear combination of \({\mathbf {f}}_1\), \({\mathbf {f}}_2\) and \({\mathbf {f}}_3\), that is, four LDAP pairs. Considering the additional constraint functions, the number of solves required for sensitivity analysis quickly increases. Each constraint depends on two states, thus requiring two adjoint solves per constraint. The sensitivity analysis of the three displacement constraints therefore requires a total of six additional solves. Closer inspection, similar to the preceding section, brings to light the MLDs in these constraints; all the adjoint loads can be written as a linear combination of physical loads \({\mathbf {f}}_1\), \({\mathbf {f}}_2\) and \({\mathbf {f}}_3\). Using an LDAS thus avoids all ten of these solves, and the sensitivity analysis does not require any solve.

4.2.3 Concluding remarks

A straightforward implementation to solve the bridge design problem would require a total of fourteen solves per design iteration: four for the forward analysis and ten for the sensitivity analysis. Using an LDAS, only three solves are required per design iteration. That is a decrease in the number of solves of almost 80%.
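For reference, the counts stated in this section combine as follows (a restatement of the numbers above, with four adjoint solves for the objective and six for the displacement constraints):

$$\begin{aligned} \underbrace{4}_{\text {forward}} + \underbrace{4 + 6}_{\text {sensitivity}} = 14, \qquad 1 - \frac{3}{14} = \frac{11}{14} \approx 0.786. \end{aligned}$$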

5 Numerical example 2: design of a multi-DOF compliant mechanism

To further demonstrate the benefits of the proposed method, we consider as an illustrative case study the topology optimisation of a planar, multiple degree-of-freedom micro-mechanism for use, for example, as an analogue gate in a mechanical computer (Larsen et al. 1997). Note that the focus here is not on the optimisation (problem formulation) of the micro-mechanism but on demonstrating the numerical benefits of an LDAS.

5.1 Problem formulation

Consider the design problem depicted in Fig. 3a. The domain contains four points of interest, each with two degrees of freedom (DOFs), \(u_x\) and \(u_y\). The target is to design a monolithic compliant mechanism that doubles a unit input motion at DOF 6 to the output motion at DOF 4 and a unit input motion at DOF 8 to an equivalent magnified output motion at DOF 2. Thus we consider two independent kinematic DOFs. Furthermore, we also consider parasitic motion, input coupling and output coupling: all remaining DOFs, apart from the intended input and output, are restricted to displace a maximum of 0.1% of the input motion.

The force paths have to cross, making this a challenging problem that is not necessarily intuitive for engineers to solve. Therefore we solve this problem using topology optimisation (Bendsøe and Sigmund 2004). We consider the following compound topology optimisation problem formulation:

$$\begin{aligned} \underset{{\textbf {x}}\in \mathbb {X}^N}{\text {min}}\\ f\left[ {\textbf {x}}\right] :&\quad \sum _j {\mathcal {E}}_j\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \quad \forall ~ j \in \left\{ 1, 3, 5, 7\right\} \\ \text {s.t.}\\ g^\text {v}\left[ {\textbf {x}}\right] :&\quad v\left[ {\textbf {x}}\right] \le {\overline{v}}\\ g_{j,j}^\text {in}\left[ {\textbf {x}}\right] :&\quad u_{j,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \ge u_\text {in} \quad \forall ~ j \in \left\{ 6,8\right\} \\ g_{i,j}^\text {ct}\left[ {\textbf {x}}\right] :&\quad u_{i,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \le u_\text {ct} \\&\quad -u_{i,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \le u_\text {ct} \\&\quad \quad \forall ~ i,j \in {\left\{ \begin{array}{ll} \left\{ 1,2,3,5,7,8\right\} ,\left\{ 6\right\} \\ \left\{ 1,3,4,5,6,7\right\} ,\left\{ 8\right\} \end{array}\right. }\\ g_{i,j}^\text {t}\left[ {\textbf {x}}\right] :&\quad J_k u_{i,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] - u_{j,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \le u_\text {t} \\&\quad u_{j,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] -J_k u_{i,j}\left[ {\textbf {u}}_j\left[ {\textbf {x}}\right] \right] \le u_\text {t}\\&\quad \quad \forall ~ i,j \in {\left\{ \begin{array}{ll} \left\{ 4\right\} ,\left\{ 6\right\} \\ \left\{ 2\right\} ,\left\{ 8\right\} \end{array}\right. } \end{aligned}$$
(21)

The objective is to minimise the strain energy \({\mathcal {E}}_j\), or equivalently maximise the stiffness, by finding design variables \(x_k\) that are bounded by \(\mathbb {X} = \left\{ x \in \mathbb {R} ~ \vert ~ 0 \le x \le 1\right\}\). Constraint \(g^\text {v}\left[ {\mathbf {x}}\right]\) limits the material usage to a maximum volume fraction \({\overline{v}} = 0.25\). The other constraints enforce a minimum displacement at the input DOFs (\(g_{j,j}^\text {in}\)), limit crosstalk (\(g_{i,j}^\text {ct}\)) to tolerance \(u_\text {ct}\), and enforce the transmission between input and output displacements (\(g_{i,j}^\text {t}\)) within a tolerance \(u_\text {t}\). These constraints are explained further in the next subsection.

This problem formulation consists of standard, well-documented response functions and corresponding sensitivity analyses. An extensive description is therefore omitted. For an in-depth discussion on the design of compliant mechanisms using topology optimisation, the reader is referred to earlier works (Ananthasuresh et al. 1994; Frecker et al. 1997; Sigmund 1997, 2001), the review of Cao et al. (2013), and references therein. For systems with multiple degrees of freedom, the reader may consult Frecker et al. (1999), Zhan and Zhang (2010), Alonso et al. (2014), Zhu et al. (2018) and Koppen et al. (2022a).

The proposed compound topology optimisation problem Eq. (21) was discretised using 200 by 200 finite elements (and design variables). The design variable field is blurred using a linear convolution operator with a filter radius of two elements to eliminate modelling artefacts (Bruns and Tortorelli 2001).
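The blurring step can be illustrated with a minimal sketch. The text specifies only a linear convolution with a radius of two elements; the cone-shaped weights, the normalisation at the boundary, and the function name `density_filter` are assumptions made here for illustration and may differ from the implementation used for the results.

```python
import numpy as np

def density_filter(x, radius=2.0):
    """Blur a 2-D design field with normalised cone weights (assumed kernel)."""
    ny, nx = x.shape
    r = int(np.ceil(radius))
    num = np.zeros((ny, nx))
    den = np.zeros((ny, nx))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            w = radius - np.hypot(dx, dy)  # weight decays linearly with distance
            if w <= 0.0:
                continue
            # Destination and source windows for the shift (dy, dx), cropped at the edges
            ys, xs = slice(max(0, dy), min(ny, ny + dy)), slice(max(0, dx), min(nx, nx + dx))
            ys_src, xs_src = slice(max(0, -dy), min(ny, ny - dy)), slice(max(0, -dx), min(nx, nx - dx))
            num[ys, xs] += w * x[ys_src, xs_src]
            den[ys, xs] += w
    return num / den  # dividing by the summed weights keeps values within [0, 1]

# A checkerboard pattern (a typical modelling artefact) is smoothed out
checkerboard = np.indices((6, 6)).sum(axis=0) % 2.0
filtered = density_filter(checkerboard)
```

Because the weights are normalised per element, a uniform field passes through unchanged and the filtered variables stay within the bounds of \(\mathbb {X}\).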

A post-processed (via design variable thresholding) version of a solution is shown in Fig. 3b. This solution is obtained from a uniform initial guess in 58 design iterations using the method of moving asymptotes (Svanberg 1987). This solution adheres to the constraints imposed and, thus, satisfies the design requirements on displacement transmission and maximum parasitic motion. As expected, the solution to the topology optimisation problem obtained using an LDAS is fully equivalent to that obtained with the reference method.

Note the presence of rigid bodies and hinges and their location and connections. The resulting deformation and displacements of the DOFs of interest for one of the use-cases are displayed by the prototype in Fig. 3c. A movie of the prototype, available as supplementary material and provided on GitHub (Sect. 5.3), demonstrates that the intended functionality has been achieved.

Fig. 3

Design of a planar, decoupled multiple degrees of freedom compliant mechanism as described in Sect. 5.1. From left to right: (a) the initial design with the four points of interest each with two degrees of freedom (\(u_x\), \(u_y\)), (b) the topology as obtained from the optimisation, and (c) a prototype model in deformed configuration

5.2 Problem analysis

Let us analyse the properties of this optimisation problem in light of the proposed method, with a specific emphasis on the required number of systems of equations to be solved.

5.2.1 Forward analysis

The objective function \(f\left[ {\mathbf {x}}\right]\) is a summation of strain energies, obtained by analysing the deformed structure under a unit load at DOFs \(\{1, 3, 5, 7\}\). The internal strain energy corresponding to each displacement field \({\mathbf {u}}_j\) reads as

$$\begin{aligned} {\mathcal {E}}_j = \frac{1}{2} {\mathbf {u}}_j \cdot {\mathbf {K}}\left[ {\mathbf {x}}\right] {\mathbf {u}}_j, \end{aligned}$$
(22)

where \({\mathbf {u}}_j\) is found by solving the system of equations

$$\begin{aligned} {\mathbf {K}}\left[ {\mathbf {x}}\right] {\mathbf {u}}_j = {\mathbf {f}}_j, \end{aligned}$$
(23)

with \({\mathbf {f}}_j\) the unit load vector that contains zeros at all entries except at the DOF of interest j. To evaluate the objective function, the system of equations (Eq. (23)) needs to be solved four times, since the four physical loads are linearly independent. By minimising these strain energy terms, the motion corresponding to these DOFs is restricted in the resulting structure, so that none of the points of interest can move significantly in the x-direction.
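As a minimal sketch of Eqs. (22) and (23), the four load cases can be solved together and the strain energies summed. The matrix `K` below is a hypothetical stand-in (a chain of unit springs fixed at one end), not the stiffness matrix of the 200 by 200 mesh, and the DOF numbering is only meant to mirror the text.

```python
import numpy as np

# Hypothetical stand-in for K[x]: chain of unit springs, fixed at the left end
n = 8
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0

# Unit loads at the x-direction DOFs 1, 3, 5 and 7 (1-based, as in the text)
load_dofs = [1, 3, 5, 7]
F = np.zeros((n, len(load_dofs)))
for col, dof in enumerate(load_dofs):
    F[dof - 1, col] = 1.0

# A single call handles all four right-hand sides, so K is factorised only once
U = np.linalg.solve(K, F)

# Strain energy per load case, Eq. (22): E_j = 1/2 u_j . K u_j
energies = 0.5 * np.einsum("ij,ij->j", U, K @ U)
objective = energies.sum()
```

Since each load is a unit vector, \({\mathcal {E}}_j\) equals half the displacement at the loaded DOF, which is why minimising the objective suppresses motion at exactly these DOFs.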

Constraints \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\) are required to enforce a minimum displacement of \(u_\text {in}\) at \(u_{j,j}\) with j the DOFs of interest 6 and 8, requiring two additional solves. Note, \(u_{i,j}\) denotes the displacement at DOF i due to a unit load at DOF j. One may observe that the remaining displacement-based constraints are only dependent on \({\mathbf {u}}_6\) and \({\mathbf {u}}_8\). Since these were previously evaluated to determine \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\), inspection shows that no additional solves are required for the forward analysis.

Constraints \(g_{i,j}^\text {ct}\left[ {\mathbf {x}}\right]\) are imposed to limit the crosstalk (parasitic motion) \(u_{i,j}\): the motion of DOFs \(\{1, 2, 3, 5, 7, 8\}\) due to a unit load at DOF 6 and the motion of DOFs \(\{1, 3, 4, 5, 6, 7\}\) due to a unit load at DOF 8 are bounded from below by \(-u_\text {ct}\) and from above by \(u_\text {ct}\), with \(u_\text {ct} = 0.001 u_\text {in}\). The number of crosstalk constraints is found by multiplying two kinematic DOFs, six constraints per kinematic DOF, and two bounds per constraint, resulting in 24 constraint functions.

Constraints \(g_{i,j}^\text {t}\left[ {\mathbf {x}}\right]\) enforce a desired input–output transmission \(J_k := \frac{u_{\mathrm {out},k}}{u_{\mathrm {in},k}}\) for kinematic DOF k with a maximum transmission deviation of \(u_\text {t} = 0.1u_\text {in}\). The input–output transmission for the first kinematic mode is defined as the motion transmission from DOF 6 to DOF 4, \(J_1 := \frac{u_{4,6}}{u_{6,6}}\), and the second input–output transmission is defined as the motion transmission from DOF 8 to DOF 2, \(J_2 := \frac{u_{2,8}}{u_{8,8}}\). This introduces four constraints, as each constraint is bounded from below and above.

In total, 32 response functions need to be evaluated for this optimisation problem, which are fully resolved by performing a total of six solves (four for the objective and two for \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\)).

5.2.2 Sensitivity analysis

To obtain the sensitivities of the responses to the design variables, one generally loops over the responses, and consecutively calculates the corresponding sensitivities.

For the considered problem, the adjoint loads of the objective are linearly dependent on corresponding physical loads, i.e. they form four LDAP pairs. In this case \(\frac{\partial {\mathcal {E}}_j}{\partial {\mathbf {u}}_j} = {\mathbf {f}}_j\), and thus \(\varvec{\lambda }_{j,j} = {\mathbf {u}}_j\). Thus, to obtain the sensitivities of the objective no additional solves are required.
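The self-adjoint property can be checked numerically on a toy problem. The sketch below assumes a hypothetical spring chain whose stiffness matrix is linear in the design variables, and it uses the sign convention \(\frac{\text {d} {\mathcal {E}}}{\text {d} x_k} = -\frac{1}{2} {\mathbf {u}} \cdot \frac{\partial {\mathbf {K}}}{\partial x_k} {\mathbf {u}}\), which may differ from the convention of the adjoint equation used earlier in the paper. The function names are illustrative.

```python
import numpy as np

def stiffness(x):
    """Chain of springs fixed at the left end; spring k has stiffness x[k]."""
    n = len(x)
    K = np.zeros((n, n))
    for k, xk in enumerate(x):
        K[k, k] += xk
        if k > 0:  # spring k connects nodes k-1 and k
            K[k - 1, k - 1] += xk
            K[k - 1, k] -= xk
            K[k, k - 1] -= xk
    return K

def strain_energy(x, f):
    K = stiffness(x)
    u = np.linalg.solve(K, f)
    return 0.5 * u @ K @ u, u

x = np.array([1.0, 0.8, 0.6, 0.9])
f = np.zeros(4)
f[2] = 1.0  # unit load at one DOF

energy, u = strain_energy(x, f)

# Adjoint load dE/du = K u = f equals the physical load, so the state u is
# reused and no additional solve is needed for the sensitivities.
grad = np.empty_like(x)
for k in range(len(x)):
    dK = stiffness(np.eye(len(x))[k])  # K is linear in x, so dK/dx_k = K[e_k]
    grad[k] = -0.5 * u @ dK @ u

# Forward finite differences as an independent check (one extra solve per variable)
h = 1e-6
fd = np.array([(strain_energy(x + h * np.eye(4)[k], f)[0] - energy) / h for k in range(4)])
```

The finite-difference check needs one solve per design variable, whereas the adjoint expression reuses the single forward state.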

The adjoint loads corresponding to \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\) read

$$\begin{aligned} \frac{\partial g_{j,j}^\text {in}\left[ {\mathbf {x}}\right] }{\partial {\mathbf {u}}_j} = \frac{1}{u_\text {in}} {\mathbf {l}}_j, \end{aligned}$$
(24)

which can be written as a linear combination of the physical loads \({\mathbf {f}}_6\) and \({\mathbf {f}}_8\) previously considered to evaluate \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\).

The sensitivities of the crosstalk constraints \(g_{i,j}^\text {ct}\left[ {\mathbf {x}}\right]\) exhibit MLDs. Furthermore, for \(i \in \{1,3,5,7\}\) and \(j \in \{6,8\}\) the following holds

$$\begin{aligned} \frac{\partial g_{i,j}^\text {ct}\left[ {\mathbf {x}}\right] }{\partial {\mathbf {u}}_j} = \pm \frac{1}{u_\text {ct}} {\mathbf {l}}_i = \pm \frac{1}{u_\text {ct}} {\mathbf {f}}_i, \end{aligned}$$
(25)

and the adjoint loads are therefore linearly dependent on non-corresponding physical loads. However, for \((i,j) = (2,6)\) and \((i,j) = (4,8)\) the adjoint load cannot be written as a (combination of) previously evaluated physical and/or adjoint loads, and the corresponding systems of equations (Eq. (5)) need to be solved accordingly. Note that only two solves are required, as the adjoint loads for the constraints related to lower and upper bounds are linearly dependent (these differ only in sign).

Lastly, the adjoint loads corresponding to transmission constraint \(g_{i,j}^\text {t}\left[ {\mathbf {x}}\right]\) are given by

$$\begin{aligned} \frac{\partial g_{i,j}^\text {t}\left[ {\mathbf {x}}\right] }{\partial {\mathbf {u}}_j} = \pm \left( \frac{J_k}{u_\text {t}} {\mathbf {l}}_i - \frac{1}{u_\text {t}} {\mathbf {l}}_j\right) , \end{aligned}$$
(26)

which can all be written as a summation of the previous adjoint loads of \(g_{j,j}^\text {in}\left[ {\mathbf {x}}\right]\) (or physical loads \({\mathbf {f}}_6\) and \({\mathbf {f}}_8\)) and \(g_{i,j}^\text {ct}\left[ {\mathbf {x}}\right]\). For such ‘combined’ loads it can be particularly obscure to manually express them as a linear combination of previous physical and/or adjoint loads.
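To illustrate how such combined loads can be handled automatically, the following is a minimal sketch of a dependency-aware solve. It is not the paper's Algorithm 2: the class name `DependencyAwareSolver`, the least-squares test for linear dependence, and the tolerance are assumptions chosen for brevity, and the matrix `K` is a random symmetric positive definite stand-in.

```python
import numpy as np

class DependencyAwareSolver:
    """Reuse stored solutions when a new load lies in the span of earlier loads."""

    def __init__(self, K, tol=1e-10):
        self.K = K
        self.tol = tol
        self.loads = []    # linearly independent loads seen so far
        self.states = []   # corresponding solutions
        self.n_solves = 0

    def solve(self, b):
        if self.loads:
            B = np.column_stack(self.loads)
            coeff, *_ = np.linalg.lstsq(B, b, rcond=None)
            if np.linalg.norm(B @ coeff - b) <= self.tol * np.linalg.norm(b):
                # b = B coeff, hence K^{-1} b = (K^{-1} B) coeff: no new solve
                return np.column_stack(self.states) @ coeff
        u = np.linalg.solve(self.K, b)
        self.n_solves += 1
        self.loads.append(b.copy())
        self.states.append(u)
        return u

def unit(i, n=8):
    e = np.zeros(n)
    e[i - 1] = 1.0
    return e

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
K = A @ A.T + 8.0 * np.eye(8)  # hypothetical stand-in for K[x]

solver = DependencyAwareSolver(K)
for dof in (6, 8, 2, 4):       # loads f_6, f_8, l_2 and l_4
    solver.solve(unit(dof))

J = 2.0
combined = J * unit(4) - unit(6)   # transmission-type load, Eq. (26) up to scaling
u_combined = solver.solve(combined)
```

In this example the combined load is recognised as a combination of \({\mathbf {l}}_4\) and \({\mathbf {f}}_6\), so the solve counter stays at four while the returned state matches a direct solve.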

5.2.3 Concluding remarks

The problem analysis reveals that if no linear dependencies are taken into account, 40 systems of equations need to be solved (of which 34 in the sensitivity analysis), as opposed to the minimum of 8 when considering all linear dependencies (MLDs). That is, one may expect a maximum decrease of computational effort by 80%. If only LDAP pairs are considered (this is generally the case), then 34 systems of equations have to be solved. If, in addition to this, it is recognised that the adjoint loads of the constraints on lower and upper bounds only differ by a sign (and are thus linearly dependent), one still has to solve 20 systems of equations. The results of the foregoing problem analysis are summarised in Table 4, aiding in the detection of linear dependency between loads and calculation of states.
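These counts can be reproduced with a short bookkeeping sketch. It represents every load only by its entries at the eight DOFs of interest and drops the scaling factors \(1/u_\text {in}\), \(1/u_\text {ct}\) and \(1/u_\text {t}\), which do not affect linear dependence. The variable names and the value \(J = 2\) (the mechanism doubles the input motion) are illustrative assumptions.

```python
import numpy as np

N = 8  # DOFs of interest, numbered 1..8 as in the text

def unit(i):
    e = np.zeros(N)
    e[i - 1] = 1.0
    return e

J = 2.0
physical = {j: unit(j) for j in (1, 3, 5, 7, 6, 8)}

# Adjoint entries: (state j, load vector, True if it mirrors an upper-bound load)
adjoint = [(j, unit(j), False) for j in (1, 3, 5, 7)]   # objective terms
adjoint += [(j, unit(j), False) for j in (6, 8)]        # g_in
crosstalk = {6: (1, 2, 3, 5, 7, 8), 8: (1, 3, 4, 5, 6, 7)}
for j, dofs in crosstalk.items():
    for i in dofs:
        for sign in (1.0, -1.0):
            adjoint.append((j, sign * unit(i), sign < 0))
for i, j in ((4, 6), (2, 8)):                           # g_t
    for sign in (1.0, -1.0):
        adjoint.append((j, sign * (J * unit(i) - unit(j)), sign < 0))

def parallel(a, b):
    return np.linalg.matrix_rank(np.stack([a, b])) == 1

# No detection: every physical and adjoint load is solved
n_none = len(physical) + len(adjoint)
# LDAP: skip adjoint loads parallel to the physical load of their own state
n_ldap = n_none - sum(parallel(v, physical[j]) for j, v, _ in adjoint)
# LDAP plus sign detection: additionally skip the mirrored lower-bound loads
n_sign = n_ldap - sum(m for j, v, m in adjoint if not parallel(v, physical[j]))
# MLD: only linearly independent loads are solved
n_mld = int(np.linalg.matrix_rank(np.stack(list(physical.values()) + [v for _, v, _ in adjoint])))

print(n_none, n_ldap, n_sign, n_mld)  # prints: 40 34 20 8
```

The rank computation is what an automated approach effectively exploits: all 40 loads span only an eight-dimensional space.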

Table 4 Result of the problem analysis (Sect. 5.2); relation between loads and DOF of interest. The horizontal axis states the eight DOFs of interest, and the vertical axis the physical and adjoint loads, respectively

Although manually finding all linear dependencies and their corresponding coefficients is achievable and yields significant savings, it is time-consuming, cumbersome, and error-prone. Moreover, it does not readily permit implementation in commercial software. In the following, we demonstrate how an LDAS, such as Algorithm 2, provides the same result in an automated manner with negligible computational overhead.

5.3 Verification by run-time experiment

The following discusses a run-time measurement comparison between the LDAS and manual implementations considering LDAP pair and MLD detection. This comparison is based on the design problem as proposed and analysed in Sects. 5.1 and 5.2. We aim to measure the run-time required by an automatic LDAS to solve the linear systems involved in a single design iteration of the problem proposed in Sect. 5.1, and compare this to the run-time required for manual implementations. In addition, we also focus on the attained performance improvements across a range of discretisations, indicated by the number of DOFs n, for a single design iteration. Assuming the physical and adjoint loads do not alter during the optimisation process, the linear dependencies remain constant throughout the optimisation process. Therefore, the computational effort of a complete optimisation process simply scales with the number of design iterations. All presented run times are normalised to the implementation without exploiting linear dependencies. From the previous problem analysis, we found the number of solves required for each method: 40 for no detection, 34 considering LDAP, and 8 when including MLD, already hinting at potential performance improvements.

In order to consider the influence of different types of solution methods, we define \(\chi\) as the ratio between the computational effort a solution method requires for preprocessing and the effort required for a solve. To capture a wide range of solution methods, we opt to compare two extremes:

  • A high-\(\chi\) solution method with predominant effort in the preprocessing; we opt here for a direct method, such as a Cholesky factorisation (Benoit 1924) with back-substitution, and

  • A low-\(\chi\) solution method with predominant effort in solving the equations. We opt here for an iterative solution process, such as Incomplete Cholesky preconditioning with Conjugate Gradient (Saad 2003).

The presented experiments consider a moderate number of DOFs: small enough to highlight the change in performance as the number of DOFs is increased while large enough to ensure the computational effort and run-time are dominated by preprocessing and solving. These aspects are therefore emphasised in the following analysis, and other computational overhead is assumed negligible (see Footnote 4). In all cases, we reused the preprocessing information (factorisation/preconditioner) when possible. The results of this run-time experiment are shown in Fig. 4. The figures show the normalised run-time \({\hat{t}}\), i.e. normalised to the run-time required without any linear dependency detection, of the solves required for a single design iteration, both for high and low-\(\chi\) methods.

For high-\(\chi\) solution methods, the gains for LDAS and MLD converge towards each other, indicating the relative overhead of the LDAS decreases with problem size. It should be noted that the ideal normalised run-time \({\hat{t}} = 0.2\) is not achieved for high-\(\chi\) methods since the chosen preprocessing is relatively expensive (or vice versa, the solve is relatively cheap), thereby limiting the possible gains in run-time in this situation to \({\hat{t}} = 0.4\). Clearly, the maximum achievable gain is higher for low-\(\chi\) solution methods (the difference is fully defined by the difference in \(\chi\)). Counting the number of linearly-independent solves of the different schemes gives an accurate estimate of relative computational efficiency. For the presented example, an 80% reduction may indeed be expected using an LDAS with a low-\(\chi\) solution method.
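The role of \(\chi\) can be made concrete with a simple idealised cost model. This model is an assumption introduced here for illustration, not a fit to the measured data: per design iteration, one preprocessing step costs \(\chi\) times a single solve and is reused for all solves, and all other overhead is neglected. The function name is hypothetical.

```python
def normalised_runtime(chi, n_solves, n_reference=40):
    """Run-time relative to the reference without dependency detection.

    Assumed model: one preprocessing step of cost chi (in units of one solve)
    plus n_solves solves of unit cost per design iteration.
    """
    return (chi + n_solves) / (chi + n_reference)

# Solve counts from the problem analysis: 34 (LDAP only) and 8 (MLD or LDAS)
for chi in (0.0, 10.0, 100.0):
    print(chi, normalised_runtime(chi, 34), normalised_runtime(chi, 8))
```

Under this model, \(\chi \rightarrow 0\) recovers the ideal ratio of 8/40 = 0.2, while a growing preprocessing share pushes the normalised run-time of every scheme towards one. This is consistent with the observation that the achievable gain is larger for low-\(\chi\) methods.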

Regardless of the solution method, taking into account only LDAP pairs is not computationally efficient compared to using an LDAS for this problem. For both high-\(\chi\) and low-\(\chi\) solution methods, the overhead of the LDAS is negligible for problems of moderate to large size.

Fig. 4

Normalised run-time \({\hat{t}}\) versus number of DOFs n of three implementations: LDAP (\(\bullet\)), MLD (\(\bullet\)) and LDAS (\(\blacktriangle\)). Herein LDAP and MLD are implementations that manually detect linear dependencies. The LDAP implementation detects only adjoint-physical load pairs, whereas the MLD implementation detects all linear dependencies. The LDAS implementation uses automatic detection, with a slight overhead relative to the manual MLD implementation. The figures include both a high-\(\chi\) and low-\(\chi\) solution method to solve the system of equations related to the numerical example presented in Sect. 5. For each of the six data points, the measurements are averaged over respectively 1000, 250, 64, 16, 4 and 1 repeated experiments on a high performance computing cluster to obtain a stable time measurement

6 Conclusions

The computational effort required to solve a gradient-based structural optimisation problem in a nested analysis and design setting is typically dominated by finding solutions to state equations. However, in real-world optimisation problems—that are typically compound, i.e. they consider multiple combinations of physical loading conditions and a wide variety of response functions—many avoidable linear system solves are executed regardless. This paper proposes the use of linear dependency aware solvers, complementary to methods aiming to reduce the total number of design iterations, or the cost per solve, by effectively reducing the number of solves per design iteration without compromising accuracy of the solution. The proposed concept leverages the linearity of the systems of equations—a trait present in many commonly considered topology optimisation problems—to automatically omit expensive solves if the solutions can be expressed as a linear combination of previously evaluated solutions for a given design iteration.

We proposed one such algorithm that is simple, as illustrated by the provided supplementary Python and MATLAB implementations of Algorithm 2, and can be integrated non-intrusively into existing optimisation software. Although the potential benefits of the proposed method hinge on the presence of linear dependencies in the problem at hand, it has been illustrated that the accompanying overhead is negligible, allowing the method to be applied freely and achieving significant performance improvements when linear dependencies are abundant. Additionally, the concept does not restrict other methods to reduce the computational time per solve, such as parallel computing, approximation techniques, or model order reduction, which allows the user to focus on the design problem formulation and avoids laborious manual linear dependency analysis altogether.

7 Supplementary information

This article is supplemented with numerical implementations, i.e. a MATLAB and Python implementation of Algorithm 1 and Algorithm 2, as well as media files related to the prototype model from Fig. 3c.