1 Introduction

Topology optimization is concerned with the optimization of a domain by altering its geometrical features. Whereas in shape optimization only the boundary of a domain is variable, topology optimization considers the addition of material to and the removal of material from the geometry. Originally introduced in the context of solid mechanics, topology optimization has been considered for many practical applications, e.g., compliance minimization in elasticity (Eschenauer et al. (1994); Allaire et al. (2005); Amstutz and Andrä (2006)), design optimization in the context of fluid mechanics (Borrvall and Petersson (2003); Sá et al. (2016)), electrical machines (Gangl and Langer (2012)), as well as the solution of inverse problems (Canelas et al. (2015); Hintermüller and Laurain (2008); Laurain et al. (2013); Beretta et al. (2017)) and the modeling and simulation of fracture evolution (Xavier et al. (2018, 2017)).

The goal of topology optimization is to optimize a cost functional J depending on the set \(\Omega\), which plays the role of the design variable. To achieve this goal, all admissible designs are usually assumed to belong to a fixed domain \({\textsf{D}}\subset \mathbb {R}^d\), \(d\in \mathbb {N}_{>0}\), where \(\mathbb {N}_{>0}\) denotes the set of positive integers, which is referred to as the hold-all domain. Understanding how a functional varies under perturbations of \(\Omega\) is crucial for the development of numerical methods. Of particular importance are perturbations obtained by removing or adding a small inclusion \(\omega _\varepsilon (x_0)\) of size \(\varepsilon\) at \(x_0\in \Omega\) or \(x_0\in {\textsf{D}}{\setminus } \overline{\Omega }\). The first-order variation of the functional under this perturbation is called the “topological derivative”. This leads to the definition of the perturbed domains

$$\begin{aligned} \Omega _\varepsilon := \Omega _\varepsilon (x_0,\omega ) := \begin{cases} \Omega \cup \omega _\varepsilon (x_0) & \text { for } x_0\in {\textsf{D}}\setminus \overline{\Omega }, \\ \Omega \setminus \overline{\omega _\varepsilon (x_0)} & \text { for } x_0 \in \Omega , \end{cases} \end{aligned}$$
(1.1)

and \(\omega _\varepsilon (x_0):= x_0 + \varepsilon \omega\) with \(\omega \subset \mathbb {R}^d\) being a simply connected domain with \(0\in \omega\). Then, the topological derivative can be defined by

$$\begin{aligned} DJ(\Omega )(x_0,\omega ) = \lim _{\varepsilon \searrow 0} \frac{J(\Omega _\varepsilon (x_0,\omega )) - J(\Omega )}{f(\varepsilon )}, \end{aligned}$$
(1.2)

where f satisfies \(\lim _{\varepsilon \searrow 0} f(\varepsilon ) = 0\). In fact, the topological derivative can also depend on the inclusion \(\omega\), which we omit from our notation for simplicity and just write \(DJ(\Omega )\). For many shape functionals encountered in practice, the function f is given by the volume of \(\omega _\varepsilon\); however, there are problems where this is not the case, e.g., when Dirichlet boundary conditions are imposed on the inclusion boundary \(\partial \omega _\varepsilon\) (Amstutz (2022)).

The topological derivative was first formally introduced as the bubble method in Eschenauer et al. (1994) and later mathematically justified in Sokolowski and Zochowski (1999); Garreau et al. (2001). Topological derivatives have been established for a variety of PDE constrained functionals, and we refer to the monographs Novotny and Sokołowski (2013, 2020) for further information. Several methods exist for the computation of the topological derivative, for instance a direct approach (Novotny and Sokołowski (2013)), where the expansion of the state variable is computed first and the topological expansion of the cost function is then derived via Taylor’s formula. Lagrangian approaches provide another way to compute topological derivatives: a Lagrangian method using an averaged adjoint equation is presented in Sturm (2020), a method using a perturbed adjoint equation in conjunction with an unperturbed state equation is found in Amstutz and Andrä (2006), and another Lagrangian method using an unperturbed adjoint equation in Gangl and Sturm (2020). We refer to Baumann and Sturm (2022) for a comparison of Lagrangian methods and their respective advantages and disadvantages.

From a numerical perspective, one can either use the topological derivative directly to find the location of an optimal design (see, e.g., Hintermüller and Laurain (2008); Novotny and Sokołowski (2020)) or one can use a level-set approach (Amstutz and Andrä (2006)) in an iterative fashion to find the optimal topology. However, it should be noted that only the topological derivative is capable of creating new holes, while the level-set approach only allows closing holes that are already present. Additionally, Newton-type algorithms to find circular inclusions have been introduced using higher-order topological expansions (Laurain et al. (2013); Canelas et al. (2015); Novotny et al. (2019, Chapter 10)). This leads to rapidly converging algorithms; however, the approach requires a combinatorial search, which makes it numerically expensive, and the treatment of nonlinear problems remains an open problem.

In the related field of shape optimization, the development, modification, and analysis of efficient solution algorithms has received considerable attention in recent years, e.g., in Blauth (2021b, 2022) and Schulz et al. (2016), where nonlinear conjugate gradient and quasi-Newton methods for shape optimization have been proposed, in Blauth (2022), where a space mapping technique for shape optimization is presented, and in Deckelnick et al. (2022), where a \(W^{1,\infty }\) approach for shape optimization is introduced.

In this paper, we follow these developments and propose novel quasi-Newton methods for solving PDE constrained topology optimization problems based on the topological derivative. Our approach is based on and extends the popular level-set approach of Amstutz and Andrä (2006). In particular, we first provide a new perspective on the evolution equation for the level-set function given in Amstutz and Andrä (2006) to derive a gradient descent-type algorithm. This algorithm is the foundation for the quasi-Newton methods we propose afterwards. In particular, we derive a limited-memory BFGS method for topology optimization. We investigate the novel methods numerically for several problem classes: inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and the optimization of fluids in Navier–Stokes flow. We compare the behavior of our methods to the widely popular method of Amstutz and Andrä (2006) as well as a simpler convex combination method from Gangl and Sturm (2021). The results show that the novel quasi-Newton methods usually have a significantly better convergence behavior than the other solution algorithms. In particular, they require substantially fewer iterations to find an optimizer of the problems while demanding only slightly more numerical resources per iteration. This makes our proposed quasi-Newton methods highly attractive for solving topology optimization problems.

This paper is structured as follows. In Sect. 2, we briefly recall basic results from topology optimization as well as the level-set method for topology optimization and the solution algorithms from Amstutz and Andrä (2006) and Gangl and Sturm (2021), which are widely used in the literature. In Sect. 3, we present a new perspective on the evolution equation for the level-set function. This allows us to derive a gradient descent algorithm for topology optimization and, afterwards, propose novel quasi-Newton methods for topology optimization. Finally, we investigate the methods numerically and compare our quasi-Newton methods to the current state-of-the-art methods in Sect. 4.

2 Preliminaries

2.1 Topological sensitivity analysis

In this section, we recall basic facts about the topological derivative, starting with its definition (Amstutz (2022)).

Definition 2.1

Let \(\omega\) be a simply connected, bounded, and open subset of \(\mathbb {R}^d\), \(d\in \mathbb {N}_{>0}\), with \(0\in \omega\), let \({\textsf{D}}\) be a bounded hold-all domain, and let \(\mathcal {P}({\textsf{D}})\) be the power set of \({\textsf{D}}\), i.e., \(\mathcal {P}({\textsf{D}}) = \{\Omega \subset \mathbb {R}^d:\; \Omega \subset {\textsf{D}}\}\). Let \(J:\mathcal {P}({\textsf{D}}) \rightarrow \mathbb {R}\) be a shape functional. We say that J has a topological derivative at \(\Omega \in \mathcal {P}({\textsf{D}})\) and at the point \(x_0\in {\textsf{D}}\) w.r.t. \(\omega\) if there exists some function \(f:\mathbb {R}_{>0} \rightarrow \mathbb {R}_{>0}\), where \(\mathbb {R}_{>0}\) denotes positive real numbers, with \(\lim _{\varepsilon \searrow 0} f(\varepsilon ) = 0\) so that the following limit exists

$$\begin{aligned} DJ(\Omega )(x_0):= \lim _{\varepsilon \searrow 0} \frac{J(\Omega _\varepsilon ) - J(\Omega )}{f(\varepsilon )}, \end{aligned}$$

where the perturbed domain \(\Omega _\varepsilon\) is defined by

$$\begin{aligned} \Omega _\varepsilon = \Omega _\varepsilon (x_0, \omega ) = \begin{cases} \Omega \setminus \overline{\omega _\varepsilon (x_0)} & \text {for } x_0 \in \Omega ,\\ \Omega \cup \omega _\varepsilon (x_0) & \text {for } x_0 \in {\textsf{D}}\setminus \overline{\Omega }, \end{cases} \end{aligned}$$

and \(\omega _\varepsilon (x_0):= x_0 + \varepsilon \omega\).

We follow the approach of Amstutz and Andrä (2006) and represent a set \(\Omega \subset {\textsf{D}}\) with the help of a continuous level-set function \(\psi :{\textsf{D}}\rightarrow \mathbb {R}\) as follows:

$$\begin{aligned} \psi (x)< 0 \quad&\Leftrightarrow \quad x \in \Omega , \\ \psi (x) > 0 \quad&\Leftrightarrow \quad x \in {\textsf{D}}\setminus \overline{\Omega }, \\ \psi (x) = 0 \quad&\Leftrightarrow \quad x \in \partial \Omega \setminus \partial {\textsf{D}}. \end{aligned}$$

We write \(\Omega _\psi := \Omega\) for a domain \(\Omega \subset {\textsf{D}}\) which is represented by the level-set function \(\psi\).
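For a discretized level-set function, this representation is straightforward to evaluate. The following minimal NumPy sketch (all values hypothetical) illustrates the sign convention and how phase-wise material coefficients can be assembled from it:

```python
import numpy as np

# Hypothetical nodal values of a level-set function psi on some mesh.
psi = np.array([-0.7, -0.1, 0.0, 0.3, 1.2])

in_Omega = psi < 0.0        # nodes belonging to Omega
in_complement = psi > 0.0   # nodes belonging to D \ closure(Omega)
on_interface = psi == 0.0   # nodes on the interior boundary of Omega

# Phase-wise coefficients, e.g. alpha_in inside Omega and alpha_out outside,
# are obtained by switching on the sign of psi.
alpha_in, alpha_out = 10.0, 1.0
alpha = np.where(in_Omega, alpha_in, alpha_out)
```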

Definition 2.2

Let \(J:{\mathcal P}({\textsf{D}})\rightarrow \mathbb {R}\) be a shape functional, let \(\Omega \in \mathcal {P}({\textsf{D}})\) be an open set, and let \(\Gamma = \partial \Omega {\setminus } \partial {\textsf{D}}\). Assume that the topological derivative \(DJ(\Omega )(x)\) exists for all \(x\in {\textsf{D}}\setminus \Gamma\). Then, we define the generalized topological derivative by

$$\begin{aligned} {\mathcal D}J(\Omega )(x):= \begin{cases} -DJ(\Omega )(x) & \text { for } x\in \Omega , \\ DJ(\Omega )(x) & \text { for } x\in {\textsf{D}}\setminus \overline{\Omega }. \end{cases} \end{aligned}$$
(2.1)

The idea behind the generalized topological derivative originates from the following observation. Let \(\psi :{\textsf{D}}\rightarrow \mathbb {R}\) be a level-set function representing \(\Omega\). If there is a constant \(c>0\) such that

$$\begin{aligned} {\mathcal D}J(\Omega _\psi )(x) = c\psi (x) \quad \text { for } x\in {\textsf{D}}, \end{aligned}$$
(2.2)

then

$$\begin{aligned} DJ(\Omega )(x)\ge 0 \quad \text { for all } x\in {\textsf{D}}\setminus \partial \Omega , \end{aligned}$$
(2.3)

which is the necessary condition for \(\Omega\) to be optimal (cf. Amstutz and Andrä (2006)). This observation is the starting point of the popular solution algorithm from Amstutz and Andrä (2006), which we present in the next section.

2.2 Topology optimization algorithms using a level-set method

Based on the discussion in the previous section, a solution algorithm for topology optimization has been derived in Amstutz and Andrä (2006), which we briefly recall in the following. For the derivation, we introduce a fictitious time t and consider a family of domains \(\Omega (t)\) represented by a level-set function \(\psi :[0,T] \times {\textsf{D}}\rightarrow \mathbb {R}\). To derive a solution algorithm, the idea is to establish an evolution equation for the level-set function \(\psi\) which ensures that \(\Omega (t)\) converges to a minimizer for \(t\rightarrow \infty\) and to discretize this evolution equation. A natural idea would be to evolve the level-set function according to the generalized topological derivative \(\mathcal {D}J(\Omega (t))\), so that

$$\begin{aligned} \frac{\partial \psi (t, \cdot )}{\partial t} = \mathcal {D}J(\Omega (t)), \quad t\ge 0. \end{aligned}$$

For the sake of better readability, we drop the dependence on the fictitious time t in the domain \(\Omega (t)\) throughout the rest of this paper.

In Amstutz and Andrä (2006), the authors note that, in general, this formulation does not guarantee convergence since the generalized topological derivative \(\mathcal {D}J(\Omega )\) does not vanish for an optimal geometry (cf. (2.3)). Instead, the main idea of Amstutz and Andrä (2006) is to use a modified version of this equation, where the topological derivative is projected onto the orthogonal complement of \(\psi\) in \(L^2({\textsf{D}})\), which is written as

$$\begin{aligned} \frac{\partial \psi }{\partial t} = P_{\psi ^\perp }(\mathcal {D}J(\Omega )), \end{aligned}$$
(2.4)

where the operator \(P_{\psi ^\perp }\) is defined as

$$\begin{aligned} P_{\psi ^\perp }(a) = a - \frac{(a, \psi )}{\left|\left|\psi \right|\right|_{L^2({\textsf{D}})}^2} \psi . \end{aligned}$$

Here, \((a, b) = (a,b)_{L^2({\textsf{D}})}\) denotes the \(L^2({\textsf{D}})\) scalar product between \(a, b\in L^2({\textsf{D}})\). Now, if the right-hand side of (2.4) vanishes, i.e., if \(P_{\psi ^\perp }(\mathcal {D}J(\Omega )) = 0\), then there exists some \(\alpha \in \mathbb {R}\) so that \(\mathcal {D}J(\Omega ) = \alpha \psi\). If \(\alpha > 0\), then the necessary optimality condition (2.3) for the optimization problem is satisfied by the geometry \(\Omega\) described by the corresponding level-set function \(\psi\). For this reason, we also consider the norm of the projected topological derivative, i.e.,

$$\begin{aligned} \left|\left|P_{\psi ^\perp }(\mathcal {D}J(\Omega )) \right|\right|_{L^2({\textsf{D}})} \end{aligned}$$
(2.5)

as a second convergence criterion for our numerical experiments in Sect. 4.
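To make the projection and the convergence measure (2.5) concrete, the following NumPy sketch computes both for nodal vectors, assuming a (hypothetical) lumped mass vector w so that \((a,b)_{L^2({\textsf{D}})}\) is approximated by \(\sum _i w_i a_i b_i\); in a finite element implementation, the inner products would instead be evaluated with the mass matrix:

```python
import numpy as np

def l2_inner(a, b, w):
    # lumped-mass approximation of the L^2(D) inner product
    return np.sum(w * a * b)

def project_orthogonal(dj, psi, w):
    # P_{psi^perp}(dj) = dj - (dj, psi) / ||psi||^2 * psi
    return dj - l2_inner(dj, psi, w) / l2_inner(psi, psi, w) * psi

def projected_norm(dj, psi, w):
    # convergence measure (2.5)
    p = project_orthogonal(dj, psi, w)
    return np.sqrt(l2_inner(p, p, w))
```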

Algorithm 1

For the numerical solution, it is proposed in Amstutz and Andrä (2006) to discretize (2.4) with Euler’s scheme on the sphere using a step size \(\lambda _k\), leading to

$$\begin{aligned} \psi _{k+1} = \frac{1}{\sin (\theta _k)} \left[ \sin \left( (1- \lambda _k) \theta _k \right) \psi _k + \sin \left( \lambda _k \theta _k \right) \frac{\mathcal {D}J(\Omega _k)}{\left|\left|\mathcal {D}J(\Omega _k) \right|\right|_{L^2({\textsf{D}})}} \right] , \end{aligned}$$
(2.6)

where \(\theta _k\) is the angle between \(\psi _k\) and \(\mathcal {D}J(\Omega _k)\), i.e.,

$$\begin{aligned} \theta _k = \arccos \left( \frac{(\psi _k, \mathcal {D}J(\Omega _k))}{\left|\left|\psi _k \right|\right|_{L^2({\textsf{D}})} \left|\left|\mathcal {D}J(\Omega _k) \right|\right|_{L^2({\textsf{D}})}} \right) . \end{aligned}$$

Note that the iteration is terminated if the angle between generalized topological derivative and level-set function is sufficiently small, so that \(P_{\psi ^\perp }(\mathcal {D}J(\Omega )) \approx 0\) and the necessary optimality conditions are satisfied approximately.
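A sketch of the update (2.6) in the same NumPy setting reads as follows; it assumes that \(\psi _k\) has unit \(L^2\) norm, so that the iterates stay on the unit sphere (the case \(\theta _k = 0\), in which the method has converged, is excluded):

```python
import numpy as np

l2_inner = lambda a, b, w: np.sum(w * a * b)  # lumped L^2 inner product, as above

def sphere_combination_step(psi, dj, lam, w):
    # one update of (2.6), assuming ||psi||_{L^2} = 1
    norm_dj = np.sqrt(l2_inner(dj, dj, w))
    cos_theta = l2_inner(psi, dj, w) / norm_dj
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return (np.sin((1.0 - lam) * theta) * psi
            + np.sin(lam * theta) * dj / norm_dj) / np.sin(theta)
```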

The resulting numerical algorithm, which is analyzed in Amstutz (2011), can be seen in Algorithm 1. By a slight abuse of notation, we write \(J(\psi )\) instead of \(J(\Omega )\), where \(\psi\) is the level-set function representing the domain \(\Omega\), in line 7 of Algorithm 1 for the sake of better readability.

Algorithm 2

The main idea of Algorithm 1 is to iteratively use a convex combination of generalized topological derivative and level-set function to reach a minimum of the optimization problem, where the weights for the convex combination are chosen according to Euler’s method on the sphere (cf. Amstutz and Andrä (2006)). A simpler idea was used in Gangl and Sturm (2021), where the authors consider the following convex combination to evolve the level-set function

$$\begin{aligned} \psi _{k+1} = \lambda \frac{\mathcal {D}J(\Omega _k)}{\left|\left|\mathcal {D}J(\Omega _k) \right|\right|_{L^2({\textsf{D}})}} + (1 - \lambda ) \psi _k, \end{aligned}$$
(2.7)

where the parameter \(\lambda\) plays the role of a step size. For a fixed step size \(\lambda > 0\), it is easy to see that if the method becomes stationary, i.e., if \(\psi _{k+1} = \psi _k\), then \(\psi _k = \nicefrac {\mathcal {D}J(\Omega _k)}{\left|\left|\mathcal {D}J(\Omega _k) \right|\right|_{L^2({\textsf{D}})}}\), so that (2.2) holds with \(c = \left|\left|\mathcal {D}J(\Omega _k) \right|\right|_{L^2({\textsf{D}})}\) and, in particular, the necessary optimality condition (2.3) is satisfied. This idea gives rise to Algorithm 2, whose update step is sketched below.
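In the same NumPy setting as before, one step of (2.7) could be sketched as follows, where we renormalize the new iterate so that the level-set function keeps unit \(L^2\) norm (an assumption of this sketch):

```python
import numpy as np

l2_inner = lambda a, b, w: np.sum(w * a * b)  # lumped L^2 inner product, as above

def convex_combination_step(psi, dj, lam, w):
    # one update of (2.7) with subsequent renormalization
    dj_hat = dj / np.sqrt(l2_inner(dj, dj, w))
    psi_new = lam * dj_hat + (1.0 - lam) * psi
    return psi_new / np.sqrt(l2_inner(psi_new, psi_new, w))
```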

Note that the solution method presented in Algorithm 1, in particular, is widely popular and represents the state of the art for solving topology optimization problems with a level-set function.

3 Quasi-Newton methods for topology optimization

In this section, we present novel quasi-Newton methods for topology optimization. To do so, we first take a different perspective on equation (2.4) and formulate a new gradient descent method for topology optimization. This enables us to define (limited memory) BFGS methods for topology optimization afterwards.

3.1 A novel perspective on the level-set evolution equation

Before we can introduce quasi-Newton methods for topology optimization, we note that the algorithmic frameworks presented in Algorithms 1 and 2 are not suitable for defining such methods. The reason is that the update rules for the level-set function given in (2.6) and (2.7) consider a convex combination of the level-set function and the generalized topological derivative, which is different from the classical form of descent methods, where the update of the design variable (the level-set function) is performed by subtracting the gradient (topological derivative), scaled by an appropriate step size, from the current iterate.

To remedy this problem, we start by considering the continuous equation (2.4) for evolving the level-set function from Amstutz and Andrä (2006). Instead of using the elaborate approach of Amstutz and Andrä (2006), we discretize (2.4) by an explicit Euler method, which yields the discretized equation

$$\begin{aligned} \frac{\psi _{k+1} - \psi _k}{\Delta t} = P_{\psi _k^\perp }(\mathcal {D}J(\Omega _k)) \qquad \Leftrightarrow \qquad \psi _{k+1} = \psi _k + \Delta t P_{\psi _k^\perp }(\mathcal {D}J(\Omega _k)) \end{aligned}$$
(3.1)

where k denotes the current time step. For discretizing equation (2.4), one would usually consider small time steps \(\Delta t\) converging to 0, which would yield a so-called gradient flow method. However, we change our viewpoint and interpret (3.1) as a gradient descent method, where the time step \(\Delta t\) now plays the role of a step size. The benefit of this interpretation is that we can (potentially) make use of the convergence behavior of the gradient descent method and use large step sizes when appropriate, reducing the computational cost of our algorithm. Therefore, we can now interpret \(g_k = -P_{\psi _k^\perp }(\mathcal {D}J(\Omega _k))\) as the “gradient” associated to our topology optimization problem. The resulting optimization algorithm is presented in Algorithm 3. Note that the main benefit of this method is that it follows the “standard” form of a gradient descent method, which makes it amenable for defining quasi-Newton methods, which we do in the next section. Additionally, our numerical experiments in Sect. 4 show that Algorithm 3 can also yield faster convergence compared to Algorithms 1 and 2.
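A sketch of one step of Algorithm 3 with a simple backtracking line search is given below; the callback `evaluate_cost` (hypothetical) re-solves the state equation for a given level-set function and returns the cost functional value, and the concrete line search used in our implementation may differ in its details:

```python
import numpy as np

l2_inner = lambda a, b, w: np.sum(w * a * b)  # lumped L^2 inner product, as above

def gradient_descent_step(psi, dj, w, evaluate_cost, step=1.0, shrink=0.5, max_tries=10):
    # gradient g_k = -P_{psi^perp}(DJ), cf. (3.1)
    g = -(dj - l2_inner(dj, psi, w) / l2_inner(psi, psi, w) * psi)
    j_old = evaluate_cost(psi)
    for _ in range(max_tries):
        psi_new = psi - step * g  # equals psi + step * P(DJ)
        if evaluate_cost(psi_new) < j_old:
            return psi_new, step
        step *= shrink  # backtrack
    return psi, 0.0  # no decrease found; signal failure to the caller
```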

Algorithm 3

3.2 A limited memory BFGS method for topology optimization

With the gradient descent method described in the previous section, we now turn our attention to quasi-Newton methods for topology optimization, which can be derived analogously to the finite-dimensional case (see, e.g., Nocedal and Wright (2006); Kelley (1999)). To do so, we introduce the functions \(s_k = \psi _{k+1} - \psi _k\) and \(y_k = g_{k+1} - g_k\). Quasi-Newton methods rely on the so-called secant equation, which in our setting can be written as

$$\begin{aligned} B_{k+1} s_k = y_k, \end{aligned}$$

where \(B_{k+1}\) is an isomorphism from \(L^2({\textsf{D}})\) to \(L^2({\textsf{D}})\) which can be seen as an approximation of the Hessian, and we denote its inverse by \(H_{k+1}\). In the following, we describe a BFGS method for topology optimization based on \(H_k\), the inverse of the Hessian approximation, which makes it easier to derive the limited-memory version of the method which we have implemented in the software package cashocs (Blauth (2021a, 2023)). In particular, the search direction for the BFGS method is given by

$$\begin{aligned} p_k = - H_k g_k \end{aligned}$$
(3.2)

and the update formula for \(H_k\) is given by

$$\begin{aligned} H_{k+1} = \left( \textrm{Id}_{L^2({\textsf{D}})} - \frac{s_k \otimes y_k}{(y_k, s_k)_{L^2({\textsf{D}})}} \right)\,H_k \left( \textrm{Id}_{L^2({\textsf{D}})} - \frac{y_k \otimes s_k}{(s_k, y_k)_{L^2({\textsf{D}})}} \right) + \frac{s_k \otimes s_k}{(y_k, s_k)_{L^2({\textsf{D}})}}, \end{aligned}$$
(3.3)

where \(\otimes\) denotes the outer product on \(L^2({\textsf{D}})\), i.e., \((a \otimes b) c = (b, c)_{L^2({\textsf{D}})} a\).

To avoid storing large dense matrices as discretizations of the operator \(H_{k}\), we employ a limited memory BFGS method for our numerical implementation which is shown in Algorithm 4. Particularly, the limited memory BFGS method only requires us to additionally store \(s_k\) and \(y_k\) to compute the application of \(H_{k}\) to some right-hand side, which is shown in Algorithm 5 (cf. Nocedal and Wright (2006) for a description of the L-BFGS method in a finite-dimensional setting).
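For illustration, the following sketch shows the two-loop recursion in the same nodal-vector setting as before, computing the search direction \(p_k = -H_k g_k\) with \(L^2({\textsf{D}})\) inner products; `history` is assumed to hold the m most recent pairs \((s_i, y_i)\), oldest first:

```python
import numpy as np

def lbfgs_direction(g, history, w):
    # two-loop recursion applying the inverse Hessian approximation H_k to g
    l2 = lambda a, b: np.sum(w * a * b)  # lumped L^2 inner product, as above
    q = g.copy()
    alphas = []
    for s, y in reversed(history):  # newest to oldest
        a = l2(s, q) / l2(y, s)
        alphas.append(a)
        q = q - a * y
    if history:  # initial scaling H_k^0 = (s, y) / (y, y) * Id
        s, y = history[-1]
        q = l2(s, y) / l2(y, y) * q
    for (s, y), a in zip(history, reversed(alphas)):  # oldest to newest
        b = l2(y, q) / l2(y, s)
        q = q + (a - b) * s
    return -q  # search direction p_k = -H_k g_k
```

After each accepted step, the pair \((s_k, y_k)\) is appended to the history and the oldest pair is dropped once the memory size m is exceeded; pairs with \((y_k, s_k)_{L^2({\textsf{D}})} \le 0\) are typically skipped to preserve positive definiteness of \(H_k\).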

Remark 3.1

Our numerical experiments presented in Sect. 4 showed that the search direction computed with (3.2) may sometimes not yield descent of the cost functional. Therefore, we employ a restarted version of the BFGS method, which replaces the search direction with \(p_k = - g_k\) if the line search procedure in line 9 of Algorithm 4 does not converge.

Remark 3.2

The idea presented above could also be applied analogously to derive nonlinear conjugate gradient (NCG) methods for topology optimization. As a thorough investigation of the several popular NCG methods is beyond the scope of this paper, we do not consider these methods in the following (cf. Blauth (2021b) for a discussion of NCG methods in the context of shape optimization). However, the NCG methods are already implemented and ready for use in our software package cashocs (Blauth (2021a, 2023)). An investigation of such NCG methods for topology optimization is planned for future research.

Algorithm 4
Algorithm 5

4 Numerical investigation of quasi-Newton methods for topology optimization

In this section, we consider the practical performance of the proposed quasi-Newton methods from Sect. 3. We consider four problem classes: inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and topological design optimization in Navier–Stokes flow. We solve each of these problems with the four solution algorithms presented in this paper, where we use the following notation for the sake of simplicity: Algorithm 1, originally introduced in Amstutz and Andrä (2006), is called the sphere combination method, Algorithm 2 the convex combination method, Algorithm 3 the gradient descent method, and, finally, Algorithm 4 the (limited memory) BFGS method. We remark that, for all numerical test cases considered in this paper, we choose a memory size of \(m=5\) for the limited memory BFGS methods.

Note that we have implemented all of the optimization algorithms for topology optimization considered in Sect. 2 and the novel gradient descent and BFGS methods from Sect. 3 in our open-source software package cashocs (Blauth (2021a, 2023)), which is a software for solving arbitrary PDE constrained shape optimization and optimal control problems. Our software is based on the finite element software FEniCS (Alnæs et al. (2015); Logg et al. (2012)) and derives the necessary adjoint systems for the optimization with the help of automatic differentiation. Therefore, for the discretization of the PDE constraints, the finite element method is naturally employed. Moreover, the source code for our numerical experiments is available freely on GitHub (Blauth and Sturm (2023)).

4.1 Linear Poisson problem

In this section, we investigate a topology optimization problem constrained by a linear Poisson problem. Let \({\textsf{D}}\subset \mathbb {R}^d\) with \(d\in \mathbb {N}_{>0}\) be an open and bounded domain with boundary \(\partial {\textsf{D}}\) and let \(\Omega \subset {\textsf{D}}\) be an open subset. We denote by \(\Gamma = \partial \Omega {\setminus } \partial {\textsf{D}}\) the interior boundary of \(\Omega\) in \({\textsf{D}}\) and by \(\Omega ^c = {\textsf{D}}{\setminus } \overline{\Omega }\) the complement of \(\Omega\) in \({\textsf{D}}\). We consider the following problem

$$\begin{aligned} &\min _{\Omega } J(\Omega , u) = \frac{1}{2} \int _{{\textsf{D}}} \left( u - u_\textrm{des} \right) ^2 \ \textrm{d}x \\ &\text {s.t.} \quad \begin{alignedat}{2} -\Delta u + \alpha _\Omega u&= f_\Omega &\quad &\text { in } {\textsf{D}},\\ u&= 0 &\quad &\text { on } \partial {\textsf{D}}, \end{alignedat} \end{aligned}$$
(4.1)

where \(\alpha _\Omega (x) = \chi _\Omega (x) \alpha _\textrm{in} + \chi _{\Omega ^c}(x) \alpha _\textrm{out}\) and \(f_\Omega (x) = \chi _\Omega (x) f_\textrm{in} + \chi _{\Omega ^c}(x) f_\textrm{out}\) with constants \(\alpha _\textrm{in}, \alpha _\textrm{out} > 0\) and \(f_\textrm{in}, f_\textrm{out} \in \mathbb {R}\), so that \(\alpha _\Omega\) and \(f_\Omega\) are constant in \(\Omega\) and \(\Omega ^c\). In our setting, \(u_\textrm{des}\) is given as the solution of the PDE constraint on a desired domain \(\Omega _\textrm{des}\). Hence, the above problem can be interpreted as an inverse problem of identifying the unknown domain \(\Omega _\textrm{des}\) using the measurement \(u_\textrm{des}\). The generalized topological derivative for this problem is derived, e.g., in Amstutz (2022) and is given by

$$\begin{aligned} \mathcal {D}J(\Omega )(x) = (\alpha _\textrm{out} - \alpha _\textrm{in}) u(x) p(x) - (f_\textrm{out} - f_\textrm{in}) p(x) \quad \text { for all } x\in {\textsf{D}}\setminus \Gamma , \end{aligned}$$

where u solves the state equation (4.1) and p solves the following adjoint equation

$$\begin{aligned} \begin{alignedat}{2} -\Delta p + \alpha _\Omega p&= - (u-u_\textrm{des}) &\quad &\text { in } {\textsf{D}},\\ p&= 0 &\quad &\text { on } \partial {\textsf{D}}. \end{alignedat} \end{aligned}$$

Remark 4.1

Usually, the term “inverse problem” is used to denote an identification problem which aims to reconstruct some unknown domain \(\Omega _\textrm{des} \subset {\textsf{D}}\) using measurements obtained on (parts of) the boundary of \({\textsf{D}}\), see, e.g., Canelas et al. (2015); Hintermüller and Laurain (2008); Laurain et al. (2013); Beretta et al. (2017). However, for our model problems (4.1) and (4.2), we use (artificial) measurements in the entire domain \({\textsf{D}}\). Still, we refer to these problems as inverse problems for the sake of simplicity.

In the following, we solve this problem with the four optimization algorithms described in Sects. 2 and 3, where we consider a hold-all domain of \({\textsf{D}}= (-2, 2)^2\). We discretize the geometry using four different mesh sizes to investigate the dependence of the algorithms on the discretization. The considered meshes are generated by creating a uniform quadrangular grid with \(32 \times 32\), \(48\times 48\), \(64\times 64\), and \(96\times 96\) squares, where each square is in turn subdivided into four triangles, so that the meshes consist of 2113 nodes and 4096 triangles (\(32\times 32\)), 4705 nodes and 9216 triangles (\(48\times 48\)), 8321 nodes and 16384 triangles (\(64\times 64\)), and 18625 nodes and 36864 triangles (\(96\times 96\)), respectively. Moreover, we discretize both the state and adjoint variables with linear Lagrange finite elements. For the parameters, we use \(\alpha _\textrm{in} = f_\textrm{in} = 10\) and \(\alpha _\textrm{out} = f_\textrm{out} = 1\). The sought design \(\Omega _\textrm{des}\) is chosen as the one corresponding to a clover shape, which can be seen in the right-most column of Table 1.
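To illustrate the computations involved, the following sketch implements the state and adjoint solves and the generalized topological derivative for (4.1) in plain FEniCS. This is a hedged illustration of the formulas above, not the authors’ cashocs implementation; the initial level-set function and u_des are placeholders:

```python
from fenics import *

mesh = RectangleMesh(Point(-2.0, -2.0), Point(2.0, 2.0), 96, 96, "crossed")
V = FunctionSpace(mesh, "CG", 1)

# hypothetical initial guess: Omega is the unit disk
psi = interpolate(Expression("x[0]*x[0] + x[1]*x[1] - 1.0", degree=2), V)

alpha_in, alpha_out, f_in, f_out = 10.0, 1.0, 10.0, 1.0
inside = lt(psi, 0.0)  # UFL condition marking Omega via the sign of psi
alpha = conditional(inside, alpha_in, alpha_out)
f = conditional(inside, f_in, f_out)

u_des = Function(V)  # assumed given: state solution on the clover-shaped domain
bc = DirichletBC(V, 0.0, "on_boundary")
u, p, v = Function(V), Function(V), TestFunction(V)

# state equation: -Delta u + alpha_Omega u = f_Omega
solve(inner(grad(u), grad(v)) * dx + alpha * u * v * dx - f * v * dx == 0, u, bc)
# adjoint equation: -Delta p + alpha_Omega p = -(u - u_des)
solve(inner(grad(p), grad(v)) * dx + alpha * p * v * dx + (u - u_des) * v * dx == 0, p, bc)

# generalized topological derivative, cf. the formula above
DJ = project((alpha_out - alpha_in) * u * p - (f_out - f_in) * p, V)
```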

The results of the optimization can be seen in Fig. 1, where we show the evolution of the cost functional, the angle criterion, and the norm of the projected gradient (2.5) over the optimization for the finest discretization. Here, we observe that our proposed gradient descent and BFGS methods perform significantly better than the established methods. We observe that the convergence behavior, based on the cost functional and the norm of the projected gradient, is best for the BFGS method, followed by the gradient descent and sphere combination methods, and that the convex combination algorithm performs worst. Particularly, the BFGS method reaches stationarity in the cost functional and norm of the projected gradient after only about 125 iterations, whereas all remaining methods continue to decrease the respective measures until the final iteration.

Fig. 1 Evolution of the optimization for the linear Poisson problem (4.1)

Table 1 Evolution of the geometries for the linear Poisson problem (4.1)

However, none of the methods converged with respect to the angle criterion within 500 iterations. On the contrary, the angle between level-set function and generalized topological derivative remains bounded from below by about \(30^\circ\), so that none of the methods can be considered converged by this criterion. However, if we take a look at the evolution of the relative norm of the projected gradient, we observe a steep decrease for all methods. The convex and sphere combination methods are able to decrease the norm of the projected gradient by about five and six orders of magnitude, respectively. The gradient descent and BFGS methods are even more efficient and decrease the norm of the projected gradient by about seven and eight orders of magnitude, respectively. Therefore, based on the second convergence criterion, the algorithms can be considered converged. Moreover, for this example, the angle criterion seems to be too strict as a convergence criterion, whereas the norm of the projected gradient seems to be better suited.

Remark 4.2

A possible explanation for this behavior is the fact that problem (4.1) is well known to be ill-conditioned. This means that the topological derivative becomes flatter the closer we get to the optimal solution of the problem, which is also what we have observed in our numerical experiments: the topological derivative becomes successively smaller and flatter over the course of the optimization.

In Fig. 2, we show plots of the topological derivative during the middle of the optimization for the sphere combination method. In Fig. 2a, the topological derivative is shown at iteration 203; for this iterate, the angle between topological derivative and level-set function is comparatively small at \(29^\circ\). On the other hand, the topological derivative in the next iteration, which is shown in Fig. 2b, has a much larger angle of \(130^\circ\) with the level-set function. A possible reason for this behavior could be the line search, which at first uses successively larger step sizes up to iteration 204, after which the accepted step size drops. This behavior repeats and causes the oscillations in the angle criterion which can be seen in Fig. 1.

Moreover, we remark that the angle criterion only provides a necessary condition for optimality, as can easily be seen from the discussion in Sect. 2.1 (cf. (2.2) and (2.3)). There may be other minimizers that satisfy (2.3) without satisfying the angle criterion. A thorough investigation of these issues is an interesting direction for future work.

Fig. 2 Plots of the topological derivative at two iterations of the sphere combination algorithm for problem (4.1)

The evolution of the geometry during the optimization is depicted in Table 1, where we show the geometries after 50, 100, and 500 iterations for all four considered optimization algorithms on the finest discretization. In addition, we also show the reference shape for comparison. Note that the reference shape has four larger inclusions around the middle and a smaller one in the center, making the problem particularly hard to solve. Comparing the obtained shapes after 50 iterations, we observe that the sphere combination and convex combination algorithms only start to form the four major inclusions, whereas these are already present for the gradient descent and BFGS methods. Moreover, the geometry obtained by the BFGS method already has an inclusion in the center, making it very similar to the reference solution. After 100 iterations, the established methods have formed the major inclusions, but their position and shape are still quite far off. The gradient descent method has improved the shapes of the major inclusions so that they are rather similar to the desired ones and is starting to locate the inclusion in the center. For the BFGS method, there is no visible difference anymore between the obtained and desired geometries, indicating that the method has already converged after 100 iterations, whereas the other methods are still quite far away from the desired shape. After 500 iterations, we observe that the sphere combination and convex combination methods have corrected the shapes of the four major inclusions so that they are now quite similar to the desired ones. However, neither of these methods was able to reconstruct the inclusion in the center of the geometry. The gradient descent method, on the other hand, was able to reconstruct the center inclusion after 500 iterations, and there are no visual distinctions between the obtained and desired geometries anymore. The same is, of course, also true for the BFGS method, whose corresponding shape does not change visually between iterations 100 and 500.

Let us finally investigate the dependence of the algorithms on the discretization. We do so by comparing the evolution of the cost functional for all methods on the different discretization levels. The results are depicted in Fig. 3. Here, we observe that the relative performance of the algorithms, which we have discussed previously, does not depend on the mesh size. In particular, the BFGS and gradient descent methods always perform best, whereas the sphere and convex combination methods perform significantly worse. Overall, the cost functional decreases further when a finer discretization is chosen. This is due to the fact that a finer discretization allows for a more detailed resolution of the desired shape, which in turn results in a lower cost functional value at the optimum. Therefore, we conclude that all algorithms behave mesh-independently, i.e., they do not require more iterations on finer levels of discretization to reach the same level of approximation of the optimal solution.

Fig. 3 Evolution of the cost functional for the linear Poisson problem (4.1) for several discretization levels

These results show the great potential of the proposed BFGS methods as they performed best of all considered methods, significantly outperforming the remaining algorithms. Moreover, we have shown that the methods also show mesh-independent behavior, which is due to the fact that the presented algorithms are discretizations of optimization algorithms acting in infinite-dimensional spaces.

4.2 Semilinear Poisson problem

To showcase the methods’ performance for nonlinear problems, we now investigate a semilinear variant of the Poisson problem considered in Sect. 4.1. Our setting is as before: \({\textsf{D}}\subset \mathbb {R}^d\) with \(d \in \mathbb {N}_{>0}\) and \(d\le 3\) denotes the hold-all domain, \(\partial {\textsf{D}}\) is its boundary, \(\Omega \subset {\textsf{D}}\) is an open subset of \({\textsf{D}}\), \(\Gamma = \partial \Omega {\setminus } \partial {\textsf{D}}\) is the interior boundary of \(\Omega\) in \({\textsf{D}}\), and \(\Omega ^c = {\textsf{D}}{\setminus } \overline{\Omega }\) is the complement of \(\Omega\) in \({\textsf{D}}\). Our semilinear version of the problem reads

$$\begin{aligned} &\min _{\Omega } J(\Omega , u) = \frac{1}{2} \int _{{\textsf{D}}} \left( u - u_\textrm{des} \right) ^2 \ \textrm{d}x \\ &\text {s.t.} \quad \begin{alignedat}{2} -\Delta u + \alpha _\Omega u^3&= f_\Omega &\quad &\text { in } {\textsf{D}},\\ u&= 0 &\quad &\text { on } \partial {\textsf{D}}, \end{alignedat} \end{aligned}$$
(4.2)

where we again have \(\alpha _\Omega (x) = \chi _\Omega (x) \alpha _\textrm{in} + \chi _{\Omega ^c}(x) \alpha _\textrm{out}\) and \(f_\Omega (x) = \chi _\Omega (x) f_\textrm{in} + \chi _{\Omega ^c}(x) f_\textrm{out}\) with \(\alpha _\textrm{in}, \alpha _\textrm{out} > 0\) as well as \(f_\textrm{in}, f_\textrm{out} \in \mathbb {R}\). The topological derivative for this problem is given by

$$\begin{aligned} \mathcal {D}J(\Omega )(x) = (\alpha _\textrm{out} - \alpha _\textrm{in}) u(x)^3 p(x) - (f_\textrm{out} - f_\textrm{in})p(x) \quad \text { for all } x \in {\textsf{D}}\setminus \Gamma , \end{aligned}$$

where u solves the state equation and p solves the following adjoint equation

$$\begin{aligned} \begin{alignedat}{2} -\Delta p + 3\alpha _\Omega u^2 p&= - (u - u_\textrm{des}) &\quad &\text { in } {\textsf{D}},\\ p&= 0 &\quad &\text { on } \partial {\textsf{D}}. \end{alignedat} \end{aligned}$$

We again solve this problem with the four solution algorithms under consideration and use the hold-all domain \({\textsf{D}}= (-2, 2)^2\). The desired state \(u_\textrm{des}\) is obtained by solving the semilinear Poisson problem on a reference domain, which is given by the same clover shape as considered in Sect. 4.1. The geometry is discretized by dividing it into \(96\times 96\) squares, which are subdivided into four triangles each, so that we use 18625 nodes and 36864 triangles. Again, we employ linear Lagrange elements for the discretization of the state and adjoint variables. Finally, we use the same setting for \(\alpha\) and f as before, so that \(\alpha _\textrm{in} = f_\textrm{in} = 10\) and \(\alpha _\textrm{out} = f_\textrm{out} = 1\).
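Relative to the FEniCS sketch from Sect. 4.1, only the state residual, the adjoint residual, and the derivative formula change; a corresponding fragment (reusing the mesh, the space V, the coefficients alpha, f, alpha_in, alpha_out, f_in, f_out, the boundary condition bc, and u_des from that sketch) could read:

```python
# fragment continuing the sketch from Sect. 4.1 (V, alpha, f, bc, u_des as there)
u, p, v = Function(V), Function(V), TestFunction(V)

# semilinear state equation: -Delta u + alpha_Omega u^3 = f_Omega;
# solve(F == 0, ...) applies Newton's method to the nonlinear residual
F_state = inner(grad(u), grad(v)) * dx + alpha * u**3 * v * dx - f * v * dx
solve(F_state == 0, u, bc)

# linearized adjoint equation: -Delta p + 3 alpha_Omega u^2 p = -(u - u_des)
F_adjoint = (inner(grad(p), grad(v)) * dx + 3.0 * alpha * u**2 * p * v * dx
             + (u - u_des) * v * dx)
solve(F_adjoint == 0, p, bc)

# generalized topological derivative for (4.2)
DJ = project((alpha_out - alpha_in) * u**3 * p - (f_out - f_in) * p, V)
```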

The evolution of the cost functional, angle criterion, and norm of the projected topological derivative over the course of the optimization are shown in Fig. 4. Analogously to our previous findings, we observe that the BFGS method significantly outperforms the remaining methods as it decreases the cost functional and the norm of the projected topological derivative most. For this problem, we observe that the gradient descent method does not perform as well as previously, but is rather comparable to the sphere and convex combination methods in performance. Again, for all considered algorithms, the angle between level-set function and generalized topological derivative remained bounded away from zero.

Fig. 4 Evolution of the optimization for the semilinear Poisson problem (4.2)

Let us investigate the geometries obtained by the methods, which are depicted in Table 2 after 100, 200, and 500 iterations. Here, we observe that the BFGS method again performs substantially better than the other methods. Even after 100 iterations, all inclusions of the geometry are found and the geometry exhibits the correct topology; however, the shape of the inclusions is still a bit off. The geometries obtained with the remaining algorithms, on the other hand, are still far away from the optimal geometry. The same is true after 200 iterations: there, the geometry obtained by the BFGS method is visually identical to the reference solution, whereas the other methods produce geometries that still deviate significantly, both in topology and shape, from the optimal solution. Finally, the results after 500 iterations show that the sphere and convex combination methods are able to locate the four major inclusions of the reference solution, but their shapes are still a bit off; moreover, they were unable to detect the inclusion in the center. The gradient descent method, on the other hand, was able to find all inclusions and, hence, to reconstruct the sought topology, although the shapes of the inclusions could still be improved. For the BFGS method, there are no major changes in the geometry between 200 and 500 iterations, as the geometry already showed no visual differences to the reference solution after 200 iterations.

Again, these results highlight the potential and efficiency of the BFGS methods for solving topology optimization problems, as they significantly outperformed all other methods considered in this paper.

Table 2 Evolution of the geometries for the semilinear Poisson problem (4.2)

4.3 Compliance minimization in linear elasticity

In this section, we consider the problem of compliance minimization in linear elasticity, which has been investigated previously, e.g., in Amstutz and Andrä (2006); Allaire et al. (2004, 2005). Let \({\textsf{D}}\subset \mathbb {R}^d\), \(d\in \mathbb {N}_{>0}\), be an open and bounded domain with boundary \(\partial {\textsf{D}}\), which is the disjoint union of the Dirichlet boundary \(\Gamma _D\) and the Neumann boundary \(\Gamma _N\). Further, let \(\Omega \subset {\textsf{D}}\) and denote its complement by \(\Omega ^c = {\textsf{D}}{\setminus } \overline{\Omega }\). The compliance minimization problem is given by

$$\begin{aligned} &\min _{\Omega } J(\Omega , u) = \int _{{\textsf{D}}} \alpha _\Omega \sigma (u): e(u) \ \textrm{d}x + l \left| \Omega \right| \\ &\text {s.t.} \quad \begin{alignedat}{2} -\textrm{div}(\alpha _\Omega \sigma (u))&= f &\quad &\text { in } {\textsf{D}},\\ u&= 0 &\quad &\text { on } \Gamma _D,\\ \alpha _\Omega \sigma (u) n&= g &\quad &\text { on } \Gamma _N. \end{alignedat} \end{aligned}$$
(4.3)

Here, u is the deformation of a linear elastic material, \(\sigma (u) = 2\mu e(u) + \lambda \textrm{tr}\, e(u) I\) is the stress tensor given by Hooke’s law, \(e(u) = \nicefrac {1}{2}(\nabla u + (\nabla u)^\top )\) is the symmetric gradient of u, and A : B denotes the Frobenius inner product between matrices \(A,B\in \mathbb {R}^{d\times d}\), i.e., \(A:B:=\sum _{i,j=1}^d a_{ij}b_{ij}\). Moreover, \(\mu\) and \(\lambda\) are the so-called Lamé parameters, for which we assume \(\mu > 0\) and \(2\mu + d \lambda > 0\), and \(\alpha _\Omega (x) = \chi _\Omega (x) \alpha _\textrm{in} + \chi _{\Omega ^c}(x) \alpha _\textrm{out}\) is, again, constant in \(\Omega\) and \(\Omega ^c = {\textsf{D}}{\setminus } \overline{\Omega }\) with \(\alpha _\textrm{in}, \alpha _\textrm{out} > 0\). The first term in the cost functional measures the compliance of the structure, and the second term is a regularization term, weighted by the parameter l, which penalizes large domains \(\Omega\), so that the optimization problem is not trivial.

The topological derivative for problem (4.3) can be found, e.g., in Amstutz and Andrä (2006) and is given by

$$\begin{aligned} DJ(\Omega )(x) = \begin{cases} -\alpha _\textrm{in} \frac{r_\textrm{in} - 1}{\kappa r_\textrm{in} + 1} \frac{\kappa + 1}{2} \left( 2 \sigma (u): e(u) + \frac{(r_\textrm{in} - 1)(\kappa - 2)}{\kappa + 2 r_\textrm{in} - 1} \textrm{tr}\, \sigma (u)\, \textrm{tr}\, e(u) \right) - l & \text { for } x\in \Omega , \\ -\alpha _\textrm{out} \frac{r_\textrm{out} - 1}{\kappa r_\textrm{out} + 1} \frac{\kappa + 1}{2} \left( 2\sigma (u): e(u) + \frac{(r_\textrm{out} - 1)(\kappa - 2)}{\kappa + 2 r_\textrm{out} - 1} \textrm{tr}\, \sigma (u)\, \textrm{tr}\, e(u) \right) + l & \text { for } x \in \Omega ^c = {\textsf{D}}\setminus \overline{\Omega }, \end{cases} \end{aligned}$$

where \(r_\textrm{in} = \frac{\alpha _\textrm{out}}{\alpha _\textrm{in}}\), \(r_\textrm{out} = \frac{\alpha _\textrm{in}}{\alpha _\textrm{out}}\), and \(\kappa = \frac{\lambda + 3\mu }{\lambda + \mu }\).

In the following, we consider two test cases for this problem, namely the so-called cantilever and bridge problems, which are taken from Amstutz and Andrä (2006). For these test cases, we follow Amstutz and Andrä (2006) and use \(\alpha _\textrm{in} = 1\) and \(\alpha _\textrm{out} = 1\times 10^{-3}\) as well as \(f = 0\). Additionally, as we consider the case of plane stress, the Lamé parameters are given by \(\mu = \frac{E}{2(1 + \nu )}\) and \(\lambda = \frac{2 \mu \lambda ^*}{\lambda ^* + 2 \mu }\) with \(\lambda ^* = \frac{E \nu }{(1+ \nu ) (1- 2\nu )}\), and we use a Young’s modulus of \(E = 1\) and a Poisson’s ratio of \(\nu = 0.3\) for the following numerical experiments. Finally, we discretize the state variable using linear Lagrange finite elements.
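In UFL notation, the plane-stress Lamé parameters, Hooke’s law, and the two branches of the topological derivative above can be sketched as follows (our reading of the formulas, not the authors’ code):

```python
from fenics import *

E, nu = 1.0, 0.3
mu = E / (2.0 * (1.0 + nu))
lmbda_star = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
lmbda = 2.0 * mu * lmbda_star / (lmbda_star + 2.0 * mu)  # plane stress

alpha_in, alpha_out, l = 1.0, 1.0e-3, 100.0
r_in, r_out = alpha_out / alpha_in, alpha_in / alpha_out
kappa = (lmbda + 3.0 * mu) / (lmbda + mu)

def eps(u):
    return sym(grad(u))  # symmetric gradient e(u)

def sigma(u):
    return 2.0 * mu * eps(u) + lmbda * tr(eps(u)) * Identity(2)  # Hooke's law

def DJ_branch(u, alpha_phase, r, sign_l):
    # one branch of the topological derivative formula above;
    # sign_l = -1 for x in Omega, sign_l = +1 for x in the complement
    c = alpha_phase * (r - 1.0) / (kappa * r + 1.0) * (kappa + 1.0) / 2.0
    return (-c * (2.0 * inner(sigma(u), eps(u))
                  + (r - 1.0) * (kappa - 2.0) / (kappa + 2.0 * r - 1.0)
                  * tr(sigma(u)) * tr(eps(u)))
            + sign_l * l)

# DJ_branch(u, alpha_in, r_in, -1.0) for x in Omega and
# DJ_branch(u, alpha_out, r_out, +1.0) for x in D \ closure(Omega).
```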

4.3.1 Cantilever

For our first example, we consider the so-called cantilever problem (see, e.g., Amstutz and Andrä (2006); Allaire et al. (2004, 2005)). Here, the hold-all domain is given by \({\textsf{D}}= (0,2) \times (0,1)\). The Dirichlet boundary \(\Gamma _D = \{x_1 = 0\}\) is the left side of the rectangle, and for the Neumann load g we consider a unit point load at (2, 0.5). As in Amstutz and Andrä (2006), we discretize this domain with a uniform triangular mesh consisting of 4193 nodes and 8192 triangles and choose \(l = 100\). A schematic of the problem setting can be seen in Fig. 5.

The evolution of the cost functional, angle criterion, and norm of the projected topological derivative is shown in Fig. 6. Here, we observe that all methods converge very fast, requiring a maximum of 25 iterations to satisfy the angle criterion with a tolerance of \(1.5^\circ\). The performance of all methods is very comparable for this problem and no method performs significantly better or worse than the others.

However, when we investigate the optimized geometries obtained with the methods, there are some differences, as the methods converged to different local minimizers of the problem. Whereas the sphere combination and BFGS methods converged to the same solution that was reported in Amstutz and Andrä (2006), the gradient descent and convex combination methods converged to different geometries with finer beam structures (Fig. 7).

Fig. 5 Schematic of the cantilever problem

Fig. 6 Evolution of the optimization for the cantilever problem

Fig. 7 Optimized geometries for the cantilever problem

4.3.2 Bridge

In this section, we consider another problem of compliance minimization in linear elasticity, which corresponds to a bridge with a single load and with multiple loads. As before, these problems are taken from the literature (Amstutz and Andrä (2006); Allaire et al. (2004, 2005)). Here, the hold-all domain is given by \({\textsf{D}}= (0,2) \times (0, 1.2)\). As boundary conditions, we have zero vertical displacement at the bottom left and right corners. For the single load case, we apply a vertical downward unit force at (1, 0), and for the multiple load case we apply three vertical unit loads at (0.5, 0), (1, 0), and (1.5, 0). We discretize this setup with a mesh consisting of 7809 vertices and 15360 triangles. For the volume regularization, we use the parameter \(l = 30\) in the single load case and \(l = 120\) in the multiple load case, in analogy to Amstutz and Andrä (2006). A schematic of the problem can be seen in Fig. 8.

Fig. 8 Schematic setup of the bridge problem

Let us first discuss the single load case. The history of the cost functional, angle criterion, and norm of the projected topological derivative is shown in Fig. 9. Similarly to the cantilever problem, there are no significant differences in the performance of the optimization algorithms. All methods required between 20 and 22 iterations to reach the prescribed tolerance of \(1^\circ\) for the angle criterion. The only notable difference between the methods is that the convex combination algorithm decreases the quality measures slightly earlier than the other methods, but they “catch up” during later iterations. The optimized geometries obtained with the methods are shown in Fig. 10. Here, we also observe that the methods converge to different local minimizers. Whereas the sphere combination and BFGS methods find a similar geometry, which is the same as reported in Amstutz and Andrä (2006), the convex combination and gradient descent methods find a different local minimizer with less height and a different supporting beam structure.

Fig. 9 Evolution of the optimization for the bridge with a single load

Fig. 10 Optimized geometries for the bridge with a single load

The results are very similar for the multiple loads case. In Fig. 11, the evolution of the considered quality measures is depicted over the history of the optimization. Here, we observe some slight differences in the performance of the algorithms. The convex combination method performs best, as it requires only 25 iterations to satisfy the angle criterion, for which we have chosen a tolerance of \(1.5^\circ\) for this example. The gradient descent and BFGS methods showed the second best performance, requiring about 30 iterations each to satisfy the stopping criterion. The sphere combination method performed worst and required 40 iterations to reach the stopping tolerance. Finally, let us briefly investigate the optimized geometries, which are shown in Fig. 12. Here, we see that all methods converge to a similar solution, which has a slightly more complicated beam structure than the one reported in Amstutz and Andrä (2006).

Fig. 11 Evolution of the optimization for the bridge with multiple loads

Fig. 12 Optimized geometries for the bridge with multiple loads

Altogether, for the case of linear elasticity, we observe no major differences between the considered optimization algorithms. The proposed BFGS methods show a very similar performance to the already established methods on all considered test problems. A possible reason for this behavior is that the problems are comparatively easy to solve, at least in comparison to the problems considered in Sects. 4.1 and 4.2, as the already established methods only require 20 to 50 iterations to solve these problems to the desired tolerance, so that there may not be enough iterations for the BFGS methods to show their potential, particularly if they have to be restarted often due to large changes in the topology. This topic is of interest for future research.

4.4 Optimization of fluids in Navier–Stokes flow

Let us now consider another application, the optimization of fluids in Navier–Stokes flow. Our setting is similar to before, i.e., let \({\textsf{D}}\subset \mathbb {R}^d\) be an open and bounded hold-all domain with boundary \(\partial {\textsf{D}}\). Let \(\Omega \subset {\textsf{D}}\), denote by \(\Omega ^c = {\textsf{D}}\setminus \overline{\Omega }\) the complement of \(\Omega\) in \({\textsf{D}}\), and let \(\Gamma = \partial \Omega {\setminus } \partial {\textsf{D}}\). We consider the following optimization problem

$$\begin{aligned} &\min _{\Omega } J(\Omega , u) = \int _{\textsf{D}}\mu \nabla u: \nabla u + \alpha _\Omega u \cdot u \ \textrm{d}x \\ &\text {s.t.} \quad \begin{alignedat}{2} - \mu \Delta u + \rho \left( u \cdot \nabla \right) u + \nabla p + \alpha _\Omega u&= 0 &\quad &\text { in } {\textsf{D}}, \\ \nabla \cdot u&= 0 &\quad &\text { in } {\textsf{D}},\\ u&= u_\textrm{D} &\quad &\text { on } \partial {\textsf{D}},\\ \int _{\textsf{D}}p \ \textrm{d}x&= 0, \\ \left|\Omega \right|&= \text {vol}_\textrm{des}, \end{alignedat} \end{aligned}$$
(4.4)

where \(\alpha _\Omega (x) = \chi _\Omega (x) \alpha _\textrm{in} + \chi _{\Omega ^c}(x) \alpha _\textrm{out}\) with \(\alpha _\textrm{in}, \alpha _\textrm{out} > 0\) and \(\left|\Omega \right|\) denotes the Lebesgue measure of \(\Omega\) in \(\mathbb {R}^d\). Here, u denotes the fluid’s velocity and p its pressure, \(\mu\) is its viscosity, \(\rho\) its density, and \(\alpha _\Omega\) is the inverse permeability. The cost functional of the above problem models the energy dissipation of the fluid, which is to be minimized. Moreover, we have a volume equality constraint, which ensures that the fluid occupies exactly the desired volume of the domain. For the topology optimization problem (4.4), the sought domain \(\Omega\) is the domain of the fluid, whereas its complement \(\Omega ^c\) plays the role of a solid region. In particular, the inverse permeability \(\alpha _\Omega\) is small inside \(\Omega\) and large outside of it. For a more detailed discussion, we refer the reader to Borrvall and Petersson (2003), where this model was first introduced and used for the topology optimization of fluids.

For our numerical experiments, we regularize the equality constraint with a quadratic penalty term, leading to the following optimization problem

$$\begin{aligned} &\min _{\Omega } J(\Omega , u) = \int _{\textsf{D}}\mu \nabla u: \nabla u + \alpha _\Omega u\cdot u \ \textrm{d}x + \frac{l}{2} \left( \left| \Omega \right| - \text {vol}_\textrm{des} \right) ^2 \\ &\text {s.t.} \quad \begin{alignedat}{2} - \mu \Delta u + \rho \left( u \cdot \nabla \right) u + \nabla p + \alpha _\Omega u&= 0 &\quad &\text { in } {\textsf{D}}, \\ \nabla \cdot u&= 0 &\quad &\text { in } {\textsf{D}},\\ u&= u_\textrm{D} &\quad &\text { on } \partial {\textsf{D}},\\ \int _{\textsf{D}}p \ \textrm{d}x&= 0. \end{alignedat} \end{aligned}$$
(4.5)

The topological derivative for problem (4.5) can be found, e.g., in Sá et al. (2016) and is given by

$$\begin{aligned} \mathcal {D}J(\Omega )(x) = \left( \alpha _\textrm{out} - \alpha _\textrm{in} \right) u(x) \cdot \left( u(x) + v(x) \right) + l \left( \left| \Omega \right| - \text {vol}_\textrm{des} \right) \quad \text { for } x \in {\textsf{D}}\setminus \Gamma , \end{aligned}$$

where \((v, q)\) solves the adjoint Navier–Stokes system

$$\begin{aligned} \begin{alignedat}{2} -\mu \Delta v + \rho (Du)^\top v - \rho \left( u \cdot \nabla \right) v + \nabla q + \alpha _\Omega v&= 2\left( \mu \Delta u - \alpha _\Omega u\right) &\quad &\text { in } {\textsf{D}}, \\ \text {div}(v)&= 0 &\quad &\text { in } {\textsf{D}}, \\ v&= 0 &\quad &\text { on } \partial {\textsf{D}},\\ \int _{\textsf{D}}q \ \textrm{d}x&= 0. \end{alignedat} \end{aligned}$$

To model an actual solid region, \(\alpha _\textrm{out}\) should tend to \(+\infty\). For our numerical investigation, however, we follow Sá et al. (2016) and choose finite values for \(\alpha _\textrm{in}\) and \(\alpha _\textrm{out}\), namely

$$\begin{aligned} \alpha _\textrm{in} = \frac{2.5 \mu }{100^2}, \qquad \alpha _\textrm{out} = \frac{2.5 \mu }{0.01^2}. \end{aligned}$$

Moreover, we choose a value of \(\rho = 1\) for the fluid’s density and \(\mu = 1\times 10^{-2}\) for its viscosity. In the following, we consider two common benchmark problems taken from Borrvall and Petersson (2003); Sá et al. (2016): the construction of a pipe bend and the drag minimization of an obstacle. Note that for both problems the hold-all domain is given by \({\textsf{D}}= (0,1) \times (0,1)\), which we discretize with a uniform mesh consisting of 20201 nodes and 40000 triangles. Moreover, we use the LBB-stable Taylor–Hood finite element pair of quadratic Lagrange elements for the velocity and adjoint velocity and linear Lagrange elements for the pressure and adjoint pressure.
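A minimal FEniCS sketch of the discretization just described and of the penalized state equation in (4.5) is given below (boundary data and level-set function are placeholders; the zero-mean pressure condition is omitted here and would be handled separately, e.g., via a nullspace):

```python
from fenics import *

mesh = UnitSquareMesh(100, 100, "crossed")  # 20201 nodes, 40000 triangles
P2 = VectorElement("CG", mesh.ufl_cell(), 2)
P1 = FiniteElement("CG", mesh.ufl_cell(), 1)
W = FunctionSpace(mesh, MixedElement([P2, P1]))  # Taylor-Hood pair

V_psi = FunctionSpace(mesh, "CG", 1)
psi = interpolate(Constant(-1.0), V_psi)  # hypothetical initial guess: all fluid

mu, rho = 1.0e-2, 1.0
alpha_in = 2.5 * mu / 100.0**2
alpha_out = 2.5 * mu / 0.01**2
alpha = conditional(lt(psi, 0.0), alpha_in, alpha_out)

up = Function(W)
u, p = split(up)
v, q = TestFunctions(W)
# weak form of the Navier-Stokes system with Brinkman term alpha_Omega u
F = (mu * inner(grad(u), grad(v)) * dx
     + rho * inner(dot(grad(u), u), v) * dx
     + alpha * inner(u, v) * dx
     - p * div(v) * dx + q * div(u) * dx)
bcs = [DirichletBC(W.sub(0), Constant((0.0, 0.0)), "on_boundary")]  # placeholder u_D
solve(F == 0, up, bcs)  # Newton's method
```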

4.4.1 Pipe bend

Let us first investigate the problem of designing a pipe bend, which is taken from Borrvall and Petersson (2003). For Dirichlet boundary conditions, we prescribe the inlet velocity with the parabolic profile

$$\begin{aligned} u_\textrm{D}(x) = \left[ \begin{matrix} 1 - 100 \left( x_2 - 0.8\right) ^2, \\ 0 \end{matrix}\right] \qquad \text { for } x_1 = 0 \text { and } 0.7 \le x_2 \le 0.9 \end{aligned}$$

on the top part of the left boundary of \({\textsf{D}}\), and for the outlet velocity we use the profile

$$\begin{aligned} u_\textrm{D}(x) = \left[ \begin{matrix} 0, \\ -\left( 1 - 100 \left( x_1 - 0.8\right) ^2\right) \end{matrix}\right] \qquad \text { for } x_2 = 0 \text { and } 0.7 \le x_1 \le 0.9 \end{aligned}$$

on the right side of the bottom boundary of \({\textsf{D}}\). On all other parts of the boundary, we use a no-slip boundary condition, so that \(u_\textrm{D} = [0, 0]^\top\). The corresponding setup is shown schematically in Fig. 13. For the volume constraint, we proceed according to Borrvall and Petersson (2003) and use a value of \(\text {vol}_\textrm{des} = 0.08 \pi\). Additionally, we choose a regularization parameter of \(l = {1\times 10^{4}}\) to enforce the volume constraint.
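Continuing the sketch from the beginning of Sect. 4.4 (with the mixed space W from there), the pipe bend boundary conditions could be set up as follows; listing the no-slip condition first lets the inlet and outlet conditions overwrite it on the overlapping degrees of freedom:

```python
from fenics import *

# inlet and outlet velocity profiles from the formulas above
u_in = Expression(("1.0 - 100.0*pow(x[1] - 0.8, 2)", "0.0"), degree=2)
u_out = Expression(("0.0", "-(1.0 - 100.0*pow(x[0] - 0.8, 2))"), degree=2)

inlet = CompiledSubDomain(
    "near(x[0], 0.0) && x[1] >= 0.7 - tol && x[1] <= 0.9 + tol", tol=1e-10)
outlet = CompiledSubDomain(
    "near(x[1], 0.0) && x[0] >= 0.7 - tol && x[0] <= 0.9 + tol", tol=1e-10)

bcs = [DirichletBC(W.sub(0), Constant((0.0, 0.0)), "on_boundary"),  # no-slip walls
       DirichletBC(W.sub(0), u_in, inlet),
       DirichletBC(W.sub(0), u_out, outlet)]
```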

Fig. 13 Schematic setup of the pipe bend problem

The evolution of the cost functional, angle criterion, and norm of the projected topological derivative can be seen in Fig. 14. We observe that the BFGS method substantially outperforms the remaining methods, as it only required about 30 iterations to reach the stopping criterion and find a local minimizer of the problem. The gradient descent method performed worse, but still considerably better than the already established methods: it required around 100 iterations to satisfy the stopping criterion, whereas the sphere and convex combination methods performed very similarly, requiring about 250 iterations to reach a local minimizer. Considering the optimized geometries obtained by the methods, which are depicted in Fig. 15, we observe that all methods converge to a similar minimizer and that the optimized geometries show only the slightest visual differences.

Fig. 14 Evolution of the optimization for the pipe bend problem

Fig. 15 Optimized geometries for the pipe bend problem

4.4.2 Rugby ball

Let us now consider the rugby ball problem from Borrvall and Petersson (2003), which is shown schematically in Fig. 16. Here, the goal is to design an obstacle which minimizes the energy dissipation of the flow. For the volume constraint of the problem, we choose \(\text {vol}_\textrm{des} = 0.8\) and use a regularization parameter of \(l = 1\times 10^{4}\) to enforce the volume constraint. For the boundary conditions, we prescribe a constant value of \(u_\textrm{D} = [0, 1]^\top\) on the entire boundary \(\partial {\textsf{D}}\).

Fig. 16 Schematic of the rugby ball problem

The history of the cost functional, angle between topological derivative and level-set function, and the norm of the projected topological derivative are depicted in Fig. 17. Here, we can observe that the BFGS method again outperforms all other methods significantly, as it requires slightly less than 50 iterations to find a local minimizer. The gradient descent method shows the second best performance and is able to satisfy the stopping criterion after around 80 iterations. The performance of the sphere and convex combination methods is, again, very similar and both require slightly more than 100 iterations to converge.

The obtained optimized geometries are shown in Fig. 18. We see that all methods produce the same desired shape, which is reminiscent of a rugby ball, so that all of them converge to the same local minimizer of the problem.

Altogether, in the context of fluid design optimization, we observe that the BFGS method performs substantially better than the other methods considered in this paper. It reaches the desired stopping criteria with significantly fewer iterations than the remaining methods. Our findings highlight the efficiency and potential of the proposed BFGS method for solving topology optimization problems.

Fig. 17 Evolution of the optimization for the rugby ball

Fig. 18 Optimized geometries for the rugby ball problem

5 Conclusion and outlook

In this paper, we have presented novel quasi-Newton methods for topology optimization using a level-set method. We recalled the topological derivative, the level-set method for topology optimization, and the widely used optimization algorithm proposed in Amstutz and Andrä (2006). Then, we presented a new perspective on the evolution equation for evolving the level-set function according to the topological derivative, which enables an interpretation as a classical gradient descent method. This method is the basis for our derivation of quasi-Newton methods for topology optimization, and we present a limited-memory BFGS method in this paper. The derivation of the BFGS methods is analogous to the finite-dimensional case and is possible due to the change in perspective described above. We investigated the performance of the proposed gradient descent and BFGS methods on four problem classes: inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and the optimization of fluids in Navier–Stokes flow. We compared the results to current state-of-the-art solution algorithms for topology optimization with level-set methods. Our results show that the novel BFGS methods often significantly outperform the other considered methods, requiring substantially fewer iterations to compute a (local) minimizer. The only exception was the problem of compliance minimization in linear elasticity, where all considered methods performed very similarly, so that the BFGS method performed at least as well as the other, already established methods. All in all, the proposed BFGS methods are efficient and attractive solution algorithms for topology optimization and show great potential for solving such problems.

For future research, there are several interesting directions. One could consider nonlinear conjugate gradient (NCG) methods for topology optimization in analogy to Blauth (2021b), whose derivation is straightforward with the new perspective on the level-set evolution equation presented in this paper. In fact, these methods are already implemented and available in our open-source software cashocs (Blauth (2021a, 2023)). However, a thorough numerical analysis of such NCG methods is still required for understanding their performance and behavior. Moreover, a theoretical analysis of the proposed BFGS methods is of great interest. For this, it would be useful to study the properties of the approximate Hessian operator, which the BFGS methods make use of, and its relation to higher-order topological derivatives (see, e.g., Baumann and Sturm (2022) and the references therein). Further, it remains an open question why the BFGS methods do not perform as well in the context of compliance minimization in linear elasticity as they do for the other problem classes considered in this paper. Finally, it is, of course, of particular interest to employ our proposed methods for solving practically relevant topology optimization problems, e.g., in the fields of fluid-dynamical optimization or the simulation of fracture evolution.