Quasi-Newton methods for topology optimization using a level-set method

The ability to efficiently solve topology optimization problems is of great importance for many practical applications. Hence, there is a demand for efficient solution algorithms. In this paper, we propose novel quasi-Newton methods for solving PDE-constrained topology optimization problems. Our approach is based on and extends the popular solution algorithm of Amstutz and Andrä (J Comput Phys 216: 573–588, 2006). To do so, we introduce a new perspective on the commonly used evolution equation for the level-set method, which allows us to derive our quasi-Newton methods for topology optimization. We investigate the performance of the proposed methods numerically for the following examples: Inverse topology optimization problems constrained by linear and semilinear elliptic Poisson problems, compliance minimization in linear elasticity, and the optimization of fluids in Navier–Stokes flow, where we compare them to current state-of-the-art methods. Our results show that the proposed solution algorithms significantly outperform the other considered methods: They require substantially less iterations to find a optimizer while demanding only slightly more resources per iteration. This shows that our proposed methods are highly attractive solution methods in the field of topology optimization.


Introduction
Topology optimization is concerned with the optimization of a domain by altering its geometrical features.Whereas in shape optimization only the boundary of a domain is variable, topology optimization considers the addition or removal of material to the geometry.Originally introduced in the context of solid mechanics, topology optimization has been considered for many practical applications, e.g., compliance minimization in elasticity [1,6,18], design optimization in the context of fluid mechanics [15,27], electrical machines [19], as well as the solution of inverse problems [8,16,23,25] and the modeling and simulation of fracture evolution [35,36].
The goal of topology optimization is to optimize a cost functional J depending on the set Ω, which plays the role of the design variable.To achieve this goal, usually all possible designs are assumed to belong to a fixed domain D ⊂ R d , d ∈ N >0 , where N denotes the set of positive integers, which is referred to as hold-all domain.The understanding how a functional varies under perturbations of Ω is crucial for the development of numerical methods.Of particular importance are perturbations obtained by removing or adding a small inclusion ω ε (x 0 ) at x 0 ∈ Ω or x 0 ∈ D \ Ω of size ε.The first order variation of the functional under this perturbation is called the "topological derivative".This leads to the definition of the perturbed domains and ω ε (x 0 ) := x 0 + εω with ω ⊂ R d being a simply connected domain with 0 ∈ ω.Then, the topological derivative can be defined by where f satisfies lim ε↘0 f (ε) = 0.In fact, the topological derivative can also depend on the inclusion ω, which we omit in our notation for simplicity and just write DJ(Ω).For many shape functionals in practice the function f is given by the volume of ω ε , however, there are problems where this is not the case, e.g., when Dirichlet boundary conditions are imposed on the inclusion boundary ∂ω ε (cf.[5]).
The topological derivative was first formally introduced as the bubble method in [18] and later mathematically justified in [22,33].Topological derivatives have been established for a variety of PDE constrained functionals and we refer to the monographs [29,31] for further information.For the computation of the topological derivative there exist several methods, for instance a direct approach [31], where first the expansion of the state variable is computed and afterwards the topological expansion of the cost function is derived via Taylor's formula.Lagrangian approaches provide another way to compute topological derivatives, e.g., a Lagrangian method using an averaged adjoint equation is presented in [34], a method using a perturbed adjoint equation in conjunction with an unperturbed state equation is found in [6], and another Lagrangian method using an unperturbed adjoint equation in [20].We refer to [7] for a comparison of Lagrangian methods and their advantages and disadvantages.
From a numerical perspective, one can either use the topological derivative directly to find the location of an optimal design (see, e.g., [23,29]) or one can use a level-set approach [6] in an iterative fashion to find the optimal topology.However, it should be noted that only the topological derivative is capable to create new holes while the level-set approach only allows to close already present holes.Additionally, Newton-type algorithms to find circular inclusions have been introduced using higher order topological expansions (cf.[16,25] or [30,Chapter 10]).This leads to rapidly converging algorithms, however, it requires a combinatorial search, which makes it numerically expensive and also the study of nonlinear problems remains an open problem.
In the related field of shape optimization, the development, modification, and analysis of efficient solution algorithms has received lots of attention in recent years, e.g., in [10,11] and [32], where nonlinear conjugate gradient and quasi-Newton methods for shape optimization have been proposed, in [12], where a space mapping technique for shape optimization is presented, and in [17], where a W 1,∞ approach for shape optimization is introduced.
In this paper, we follow these developments and propose novel quasi-Newton methods for solving PDE constrained topology optimization problems based on the topological derivative.Our approach is based on and extends the popular level-set approach of Amstutz and Andrä [6].In particular, we first provide a new perspective on the evolution equation for the level-set function given in [6] to derive a gradient descent-type algorithm.This algorithm is the foundation for the quasi-Newton methods we propose afterwards.In particular, we derive a limited-memory BFGS method for topology optimization.We investigate the novel methods numerically for several problem classes: Inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and the optimization of fluids in Navier-Stokes flow.We compare our methods behavior to the widely popular method of Amstutz and Andrä [6] as well as a simpler convex combination method from [21].The results show that the novel quasi-Newton methods usually have a significantly better convergence behavior than the other solution algorithms.Particularly, they require substantially less iterations to find an optimizer of the problems while demanding only slightly more numerical resources per iteration.This makes our proposed quasi-Newton methods highly attractive for solving topology optimization problems.
This paper is structured as follows.In Section 2, we briefly recall basic results from topology optimization as well as the level-set method for topology optimization and the solution algorithms from [6] and [21], which are widely-used in the literature.In Section 3, we present a new perspective on the evolution equation for the level-set function.This allows us to derive a gradient descent algorithm for topology optimization and, afterwards, propose novel quasi-Newton methods for topology optimization.Finally, we investigate the methods numerically and compare our quasi-Newton methods to the current state-of-the-art methods in Section 4.

Preliminaries
2.1.Topological Sensitivity Analysis.In this section we recall basics on the topological derivative.We start by recalling the definition of the topological derivative (cf.[5]).Definition 2.1.Let ω be a simply connected, bounded, and open subset of R d , d ∈ N >0 , with 0 ∈ ω, let D be a bounded hold-all domain, and let P(D) be the power set of D, i.e., P(D) = Ω ⊂ R d : Ω ⊂ D .Let J : P(D) → R be a shape functional.We say that J has a topological derivative at Ω ∈ P(D) and at the point x 0 ∈ D w.r.t.ω if there exists some function f : R >0 → R >0 , where R >0 denotes positive real numbers, with lim ε↘0 f (ε) = 0 so that the following limit exists where the perturbed domain Ω ε is defined by and ω ε (x 0 ) := x 0 + εω.
We follow the approach of [6] and represent a set Ω ⊂ D with the help of a continuous level set function We write Ω ψ := Ω for a domain Ω ⊂ D which is represented by the level-set function ψ.The idea behind the generalized topological derivative originates from the following observation.Let ψ : D → R be a level set function representing Ω.If there is a constant c > 0 such that then which is the necessary condition for Ω to be optimal (cf.[6]).This observation is the starting point of the popular solution algorithm from [6], which we present in the next section.

2.2.
Topology Optimization Algorithms Using a Level-Set Method.Based on the discussion in the previous section, a solution algorithm for topology optimization has been derived in [6], which we briefly recall in the following.For the derivation, we introduce a fictitious time t and consider a family of domains Ω(t) represented by a level-set function ψ : [0, T ] × D → R. To derive a solution algorithm, the idea is to establish an evolution equation for the level-set function ψ which ensures that Ω(t) converges to a minimizer for t → ∞ and to discretize this evolution equation.A natural idea would be to evolve the level-set function according to the generalized topological derivative DJ(Ω(t)), so that For the sake of better readability, we drop the dependence on the fictitious time t in the domain Ω(t) throughout the rest of this paper.
In [6], the authors note that, in general, using this formulation does not guarantee convergence since the generalized topological derivative DJ(Ω) does not vanish for an optimal geometry (cf.(2.3)).Instead, Algorithm 1: Euler's method on the sphere algorithm for topology optimization.
Input: Initial level-set function ψ 0 and corresponding geometry Ω 0 , initial step size λ 0 = 1, stopping tolerance τ > 0, maximum number of iterations k max ∈ N 1 for k = 0, 1, 2, . . ., k max do 2 Compute the generalized topological derivative DJ(Ω k ) Update the level-set function: the main idea of [6] is to use a modified version of this equation, where the topological derivative is projected to the orthogonal complement of ψ in L 2 (D), which is written as where the operator P ψ ⊥ is defined as Here, (a, b) = (a, b) L 2 (D) denotes the L 2 (D) scalar product between a, b ∈ L 2 (D).Now, if the right-hand side of (2.4) vanishes, i.e., if P ψ ⊥ (DJ(Ω)) = 0, then it holds that there exists some α ∈ R so that DJ(Ω) = αψ.If α > 0, then the necessary optimality conditions (2.3) for the optimization problem are satisfied and the geometry Ω described by the corresponding level-set function ψ is a local minimizer.
For this reason, we also consider the norm of the projected topological derivative, i.e., as a second convergence criterion for our numerical experiments in Section 4.
For the numerical solution, it is proposed in [6] to discretize (2.4) with Euler's scheme on the sphere using a step size λ, leading to where θ k is the angle between ψ k and DJ(Ω k ), i.e. .
Note that the iteration is terminated if the angle between generalized topological derivative and levelset function is sufficiently small, so that P ψ ⊥ (DJ(Ω)) ≈ 0 and the necessary optimality conditions are satisfied approximately.The resulting numerical algorithm, which is analyzed in [4], can be seen in Algorithm 1.By slight misuse of notation, we write J(ψ) instead of J(Ω), where ψ is the level-set function representing the domain Ω, in line 7 of Algorithm 1 for the sake of better readability.
The main idea of Algorithm 1 is to iteratively use a convex combination of generalized topological derivative and level-set function in order to reach a minimum of the optimization problem, where the weights for the convex combination are chosen according to Euler's method on the sphere (cf.[6]).A simpler idea was used in [21], where the authors consider the following convex combination to evolve the level-set function Algorithm 2: Convex combination algorithm for topology optimization.
Input: Initial level-set function ψ 0 and corresponding geometry Ω 0 , initial step size λ 0 = 1, stopping tolerance τ > 0, maximum number of iterations k max ∈ N 1 for k = 0, 1, 2, . . ., k max do 2 Compute the generalized topological derivative DJ(Ω k ) Update the level-set function: where the parameter λ plays the role of a step size.For a fixed step size λ > 0, it is easy to see that if the method becomes stationary, i.e., if ψ k+1 = ψ k , then we have that ψ k = DJ(Ω k ) /||DJ(Ω k )|| L 2 (D) and, in particular, the necessary optimality conditions (2.2) are satisfied.This idea gives rise to the Algorithm 2.
Note that particularly the solution method presented in Algorithm 1 is widely popular and represents the state-of-the-art algorithm for solving topology optimization problems with a level-set function.

Quasi-Newton Methods for Topology Optimization
In this section, we present novel quasi-Newton methods for topology optimization.To do so, we first take a different perspective on equation (2.4) and formulate a new gradient descent method for topology optimization.This enables us to define (limited memory) BFGS methods for topology optimization afterwards.

3.1.
A Novel Perspective on the Level-Set Evolution Equation.Before we can introduce quasi-Newton methods for topology optimization, we note that the algorithmic frameworks presented in Algorithms 1 and 2 are not suitable for defining such methods.The reason for this is that the update rules for the level-set function given in (2.6) and (2.7) do consider a convex combination of the level-set function and the generalized topological derivative, which is different to the classical form of descent methods, where the update of the design variables (the level-set function) is performed by subtracting the gradient (topological derivative), scaled by an appropriate step size, from the current iterate.
To remedy this problem, we start by considering the continuous equation (2.4) for evolving the levelset function from [6].Instead of using the elaborate approach of [6], we discretize (2.4) by an explicit Euler method, which yields the discretized equation where k denotes the current time step.For discretizing equation (2.4), one usually would consider small time steps ∆t converging to 0, which would yield a so-called gradient flow method.However, we change our viewpoint and interpret (3.1) as gradient descent method, where the time step ∆t now plays the role of a step size.The benefit of this interpretation is that we can (potentially) make use of the convergence behavior of the gradient descent method and use large step sizes when appropriate, reducing the computational cost of our algorithm.Therefore, we can now interpret g k = −P ψ ⊥ k (DJ(Ω k )) as the "gradient" associated to our topology optimization problem.The resulting optimization algorithm is presented in Algorithm 3. Note, that the main benefit of this method is that it follows the "standard" form of a gradient descent method, which makes it amenable to define quasi-Newton methods, which we do in the next section.Additionally, our numerical experiments in Section 4 show, that Algorithm 3 can also yield faster convergence compared to Algorithms 1 and 2.
Algorithm 3: Gradient descent method for topology optimization.
Input: Initial level-set function ψ 0 , initial step size λ 0 = 1, stopping tolerance τ > 0, maximum number of iterations Update the level-set function: A Limited Memory BFGS Method for Topology Optimization.With the gradient descent method described in the previous section, we now focus our attention to quasi-Newton methods for topology optimization, which can now be derived analogously to the finite-dimensional case (see, e.g., [24,28]).To do so, we introduce the functions s k = ψ k+1 − ψ k and y k = g k+1 − g k .The quasi-Newton methods rely on the so-called secant equation, which in our setting can be written as where B k+1 is an isomorphism from L 2 (D) to L 2 (D) which can be seen as approximation of the Hessian, and we denote its inverse by H k+1 .In the following, we will describe a BFGS method for topology optimization based on H k , the inverse of the Hessian approximation, which makes it easier to derive the limited-memory version of the method which we have implemented in the software package cashocs [9,13].In particular, the search direction for the BFGS method is given by and the update formula for H k is given by where ⊗ denotes the outer product of L 2 (D), i.e., (a To avoid storing large dense matrices as discretizations of the operator H k , we employ a limited memory BFGS method for our numerical implementation which is shown in Algorithm 4. Particularly, the limited memory BFGS method only requires us to additionally store s k and y k to compute the application of H k to some right-hand side, which is shown in Algorithm 5 (cf.[28] for a description of the L-BFGS method in a finite-dimensional setting).Remark 3.1.Our numerical experiments presented in Section 4 showed that the search direction computed with (3.2) may sometimes not yield descent of the cost functional.Therefore, we employ a restarted version of the BFGS method, which replaces the search direction with p k = −g k if the line search procedure in line 9 of Algorithm 4 does not converge.
Remark 3.2.The idea presented above could also be applied analogously to derive nonlinear conjugate gradient (NCG) methods for topology optimization.As a thorough investigation of the several popular NCG methods is beyond the scope of this paper, we do not consider these methods in the following (cf.[10] for a discussion of NCG methods in the context of shape optimization).However, the NCG methods are already implemented and ready for use in our software package cashocs [9,13].An investigation of such NCG methods for topology optimization is planned for future research.
Algorithm 4: Limited-memory BFGS method for topology optimization.
Input: Initial level-set function ψ 0 , initial step size λ 0 = 1, initial inverse Hessian approximation Update the level-set function: Algorithm 5: Limited-memory double loop for computing H k b.
Input: Right-hand side b and stored elements

Numerical Investigation of Quasi-Newton Methods for Topology Optimization
In this section, we consider the practical performance of the proposed quasi-Newton methods from Section 3. We consider four problem classes: Inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and the topological design optimization in Navier-Stokes flow.We solve each of these problems with the four solution algorithms presented in this paper, where we use the following notation for the sake of simplicity.Algorithm 1, originally introduced in [6], is called the sphere combination method, Algorithm 2 is called convex combination method, Algorithm 3 is called gradient descent method and, finally, Algorithm 4 is called (limited memory) BFGS method.We remark that, for all numerical test cases considered in this paper, we choose a memory size of m = 5 for the limited memory BFGS methods.
Note that we have implemented all of the optimization algorithms for topology optimization considered in Section 2 and the novel gradient descent and BFGS methods from Section 3 in our open-source software package cashocs [9,13], which is a software for solving arbitrary PDE constrained shape optimization and optimal control problems.Our software is based on the finite element software FEniCS [3,26] and derives the necessary adjoint systems for the optimization with the help of automatic differentiation.Therefore, for the discretization of the PDE constraints, the finite element method is naturally employed.Moreover, the source code for our numerical experiments is available freely on GitHub [14] where α Ω (x) = χ Ω (x)α in + χ Ω c (x)α out and f Ω (x) = χ Ω (x)f in + χ Ω c (x)f out with α in , α out > 0 as well as f in , f out ∈ R are constant in Ω and Ω c .In our setting, u des is given as the solution of the PDE constraint on a desired domain Ω des .Hence, the above problem can be interpreted as an inverse problem of identifying the unknown domain Ω des using the measurement u des .The generalized topological derivative for this problem is derived, e.g., in [5] and is given by where u solves the state equation (4.1) and p solves the following adjoint equation Remark 4.1.Usually, the term "inverse problem" is used to denote a identification problem which aims to reconstruct some unknown domain Ω des ⊂ D using measurements obtained on (parts of) the boundary of D, see, e.g., [8,16,23,25].However, for our model problems (4.1) and (4.2), we use (artificial) measurements in the entire domain D. Still, we refer to these problems as inverse problems for the sake of simplicity.
In the following, we solve this problem with the four optimization algorithms described in Sections 2 and 3, where we consider a hold-all domain of D = (−2, 2) 2 .We discretize the geometry using four different mesh sizes in order to investigate the dependence of the algorithms on the discretization.The considered meshes are generated by creating a uniform quadrangular grid with 32 × 32, 48 × 48, 64 × 64, and 96 × 96 squares, where each square is in turn subdivided into four triangles, so that the meshes consist of 2113 nodes and 4096 triangles (32 × 32), 4705 nodes and 9216 triangles (48 × 48), 8321 nodes and 16 384 triangles (64×64), and 18 625 nodes and 36 864 triangles (96×96), respectively.Moreover, we discretize both the state and adjoint variables with linear Lagrange finite elements.For the parameters, we use α in = f in = 10 and α out = f out = 1.The sought design Ω des is chosen as the one corresponding to a clover shape, which can be seen in the right-most column of Table 1.
The results of the optimization can be seen in Figure 1, where we show the evolution of the cost functional, the angle criterion, and the norm of the projected gradient (cf.(2.5)) over the optimization for the finest discretization.Here, we observe that our proposed gradient descent and BFGS methods   Gradient descent BFGS perform significantly better than the established methods.We observe that the convergence behavior, based on the cost functional and the norm of the projected gradient, is best for the BFGS method, followed by the gradient descent and sphere combination methods, and that the convex combination algorithm performs worst.Particularly, the BFGS method reaches stationarity in the cost functional and norm of the projected gradient after only about 125 iterations, whereas all remaining methods continue to decrease the respective measures until the final iteration.However, none of the methods converged based on the angle criterion after 500 iterations.On the contrary, the angle between level-set function and generalized topological derivate remains bounded from below by about 30 • , so that none of the methods can be considered as converged by this criterion.However, if we take a look at the evolution of the relative norm of the projected gradient, we observe a steep decrease for all methods.The convex and sphere combination methods are able to decrease the norm of the projected gradient by about five and six orders of magnitude, respectively.The gradient descent and BFGS methods are even more efficient and decrease the norm of the projected gradient by about seven and eight orders of magnitude, respectively.Therefore, based on the second convergence criterion, the algorithms can be considered converged.Moreover, for this example, it seems like the angle criterion is too strict as a convergence criterion, whereas the norm of the projected gradient seems to be better suited.
Remark 4.2.A possible explanation for this behavior is the fact, that problem (4.1) is well-known to be ill-conditioned.This means that topological derivatives become flatter the closer we get to the optimal solution of the problem, which is also what we have observed in our numerical experiments.The topological derivative becomes successively smaller and flatter over the course of the optimization.
In Figure 2, we show plots of the topological derivative during the middle of the optimization for the sphere combination method.In Figure 2a, the topological derivative is shown at iteration 203 and for this iterate, the angle between topological derivative and level-set function is comparatively small with 29 • .On the other hand, the topological derivative in the next iteration, which is shown in Figure 2b, has a much larger angle of 130 • with the level-set function.A possible reason for this behavior could be the line search, which, at first, uses successively larger step sizes up to iteration 204, after that the accepted step size drops down.This behavior repeats and causes the oscillations in the angle criterion which can be seen in Figure 1.
Moreover, we remark that the angle criterion only provides a necessary condition for optimality, as can be easily seen by the discussion in Section 2.1 (cf.(2.2) and (2.3)).There may be other minimizers that satisfy (2.3) without satisfying the angle criterion.A thorough investigation of these issues is an interesting direction for future work.
The evolution of the geometry during the optimization algorithms is depicted in Table 1, where we show the geometries after 50, 100, and 500 iterations for all four considered optimization algorithms on the finest discretization.In addition, we also show the reference shape for comparison.Note, that the reference shape has four larger inclusions in the middle and a smaller one in the center, making the problem particularly hard to solve.Comparing the obtained shapes after 50 iterations, we observe that the sphere combination and convex combination algorithms only start to form the four major inclusions, whereas they are already present for the gradient descent and BFGS methods.Moreover, the geometry obtained by BFGS method already has an inclusion in the center of the geometry, making it already  very similar to the reference solution.After 100 iterations, the established methods have formed the major inclusions, but their position and shape is still quite wrong.The gradient descent method has improved the shapes of the major inclusions so that they are rather similar to the desired ones and is starting to locate the inclusion in the center.For the BFGS method, there is no visible difference anymore between the obtained and desired geometry, indicating that the method has already converged after 100 iterations, whereas the other methods are still quite far away from the desired shape.After 500 iterations, we observe that the sphere combination and convex combination methods have corrected the shape of the four major inclusions so that they are now quite similar to the desired ones.However, neither of these methods was able to reconstruct the inclusion in the center of the geometry.The gradient descent method, on the other hand, was able to reconstruct the center inclusion after 500 iterations, and there are no visual distinctions between the obtained and desired geometry anymore.The same is, of course, also true for the BFGS method, whose corresponding shape does not change visually between iterations 100 and 500.Let us finally investigate the dependence of the algorithms on the discretization.We do so by comparing the evolution of the cost functional for all methods on the different discretization levels.The results are depicted in Figure 3. Here, we observe that the performance of the algorithms, which we have discussed previously, is not dependent on the mesh size.In particular, the BFGS and gradient descent methods always perform best, whereas the sphere and convex combination methods perform significantly worse.Overall, the cost functional decreases further when a finer discretization is chosen.This is due to the fact, that a finer discretization also allows for a more detailed resolution of the desired shape, which in turn results in a lower cost functional value for the optimum.Therefore, we conclude that all algorithms behave mesh-independently, i.e., they do not require more iterations for finer levels of discretization to reach the same level of approximation of the optimal solution.
These results show the great potential of the proposed BFGS methods as they performed best of all considered methods, significantly outperforming the remaining algorithms.Moreover, we have shown that the methods also show mesh-independent behavior, which is due to the fact that the presented algorithms are discretizations of optimization algorithms acting in infinite-dimensional spaces.
where we again have α Ω (x) = χ Ω (x)α in +χ Ω c (x)α out and f Ω (x) = χ Ω (x)f in +χ Ω c (x)f out with α in , α out > 0 as well as f in , f out ∈ R. The topological derivative for this problem is given by  where u solves the state equation and p solves the following adjoint equation We again solve this problem with the four solution algorithms under consideration and use the hold-all domain D = (−2, 2) 2 .The desired state u des is obtained by solving the semilinear Poisson problem on a reference domain, which is given by the same clover shape as considered in Section 4.1.The geometry is discretized by dividing it into 96 × 96 squares, which are subdivided into four triangles each, so that we use 18 625 nodes and 36 864 triangles.Again, we employ linear Lagrange elements for the discretization of the state and adjoint variables.Finally, we use the same setting for α and f as before, so that α in = f in = 10 and α out = f out = 1.
The evolution of the cost functional, angle criterion, and norm of the projected topological derivative over the course of the optimization are shown in Figure 4. Analogously to our previous findings, we observe that the BFGS method significantly outperforms the remaining methods as it decreases the cost functional and the norm of the projected topological derivative most.For this problem, we observe that the gradient descent method does not perform as well as previously, but is rather comparable to the sphere and convex combination methods in performance.Again, for all considered algorithms, the angle between level-set function and generalized topological derivative remained bounded away from zero.
Let us investigate the geometries obtained by the methods, which are depicted in Table 2 after 100, 200, and 500 iterations of the methods.Here, we observe that the BFGS method again performs substantially better than the other methods.Even after 100 iterations, all inclusions of the geometry are found and the geometry exhibits the correct topology.However, the shape of the inclusions is still a bit off.The geometry obtained with the remaining algorithms, on the other side, is still far away from the optimal geometry.The same is true after 200 iterations: There, the geometry obtained by the BFGS method is visually identical to the reference solution, whereas the other methods produce geometries that still deviate significantly, both in topology and shape, from the optimal solution.Finally, the results after 500 iterations show, that the sphere and convex combination methods are able to locate the four major inclusions of the reference solution, but their shape is still a bit off.Moreover, they were unable to detect the inclusion in the center.The gradient descent method, on the other hand, was able to find all inclusions and, hence, to reconstruct the sought topology.However, the shape of the inclusions could still be improved.For the BFGS method, there are no major changes in the geometry between 200 and 500 iterations, as the geometry showed no visual differences to the reference solution already after 200 iterations.
Again, these results highlight the potential and efficiency of the BFGS methods for solving topology optimization problems, as they significantly outperformed all other methods considered in this paper.

4.3.
Compliance Minimization in Linear Elasticity.In this section, we consider the problem of compliance minimization in linear elasticity which has been previously investigated, e.g., in [1,2,6] Here, u is the deformation of a linear elastic material, σ(u) = 2µe(u) + λtre(u)I is Hooke's tensor, e(u) = 1 /2(∇u + (∇u) ⊤ ) is the symmetric gradient of u, and A : B denotes the Frobenius inner product between matrices A, B ∈ R d×d , i.e., A : Here, µ and λ are the so-called Lamé parameters for which we assume µ > 0 and 2µ + dλ > 0.Moreover, α Ω (x) = χ Ω (x)α in + χ Ω c (x)α out is, again, constant in Ω and Ω c = D \ Ω with α in , α out > 0. The first term in the cost functional measures the compliance of the structure and the second term is a regularization parameter which penalizes large domains Ω, so that the optimization is not trivial.
The topological derivative for problem (4.3) can be found, e.g., in [6] and is given by where r in = αout αin , r out = αin αout , and κ = λ+3µ λ+µ .In the following, we consider two test cases for this problem, namely the so-called cantilever and bridge problems, which are taken from [6].For these test cases, we follow [6] and use α in = 1 and α out = 1×10 −3 as well as f = 0. Additionally, as we consider the case of plane stress, the Lamé parameters are given by µ = E 2(1+ν) and λ = 2µλ * λ * +2µ with λ * = Eν (1+ν)(1−2ν) and we use a Young's modulus of E = 1 and a Poisson's ratio of ν = 0.3 for the following numerical experiments.Finally, we discretize the state variable using linear Lagrange finite elements.4.3.1.Cantilever.For our first example, we consider the so-called cantilever problem (see, e.g., [1,2,6]).Here, the holdall domain is given by D = (0, 2) × (0, 1).The Dirichlet boundary Γ D = { x = 0 } is the left side of the rectangle and for the Neumann load g we consider a unitary point load at (2, 0.5).As in [6], we discretize this domain with a uniform triangular mesh consisting of 4193 nodes and 8192 triangles and choose l = 100.A schematic of the problem setting can be seen in Figure 5.
The evolution of the cost functional, angle criterion, and norm of the projected topological derivative are shown in Figure 6.Here, we observe that all methods converge very fast, requiring a maximum of 25 iterations to satisfy the angle criterion with a tolerance of 1.5 • .The performance of all methods is very comparable for this problem and no method performs significantly better or worse than the others.However, when we investigate the optimized geometries obtained with the methods, there are some differences, as they converged to different local minimizers of the problems.Whereas the sphere combination and BFGS methods converged to the same solution that was reported in [6], the gradient descent and convex combination method converged to different geometries with finer beam structures.4.3.2.Bridge.In this section, we consider another problem of compliance minimization in linear elasticity which corresponds to a bridge with a single and multiple loads.As before, these problems are taken from the literature [1,2,6].Here, the hold-all domain is given by D = (0, 2) × (0, 1.2).As boundary conditions, we have zero vertical displacement at the bottom left and right corners.For the single load case, we apply a vertical downwards unitary force at (1, 0) and for the multiple load case we apply three vertical unitary loads at (0.5, 0), (1, 0), and (1.5, 0).We discretize this setup with a mesh consisting of 7809 vertices and 15 360 triangles.For the volume regularization, we use the parameter l = 30 in the       single load case and use l = 120 for the multiple load case, in analogy to [6].A schematic of the problem can be seen in Figure 8.Let us first discuss the single load case.The history of the cost functional, angle criterion, and norm of the projected topological derivative are shown in Figure 9. Similarly to the cantilever problem, there are no significant differences in the performance of the optimization algorithms.All methods required between 20 and 22 iterations to reach the prescribed tolerance of 1 • of the angle criterion.The only notable difference between the methods is that the convex combination algorithm decreases the quality measures slightly earlier than the other methods, but they "catch up" during later iterations.The optimized geometries obtained with the methods are shown in Figure 10.Here, we also observe, that the methods converge to different local minimizers.Whereas the sphere combination and BFGS method find a similar geometry, which is the same as reported in [6], the convex combination and gradient descent method find a different local minimizer with less height and a different supporting beam structure.
The results are very similar for the multiple loads case.In Figure 11 the evolution of the considered quality measures is depicted over the history of the optimization.Here, we observe some slight differences in the performance of the algorithms.The convex combination method performs best as it requires only 25 iterations to satisfy the angle criterion, where we have chosen a tolerance of 1.5 • for this example.The gradient descent and BFGS methods showed the second best performance, requiring about 30 iterations each to satisfy the stopping criterion.The sphere combination method performed worst and required 40   iterations to reach the stopping tolerance.Finally, let us briefly investigate the optimized geometries, which are shown in Figure 12.Here, we see that all methods converge to a similar solution, which has a slightly more complicated beam structure than the one reported in [6].
Altogether, for the case of linear elasticity, we observe no major differences between all considered optimization algorithms.The proposed BFGS methods show a very similar performance to the already established methods on all considered test problems.A possible reason for this behavior is that the problems are comparatively easy to solve, at least in comparison to the problems considered in Sections 4.1 and 4.2, as the already established methods only require 20 to 50 iterations to solve these problems to a desired tolerance, so that there may not be enough iterations for the BFGS methods to show their potential, particularly if they have to be restarted often due to large changes in the topology.This topic is of interest for future research.
Here, u denotes the fluid's velocity and p its pressure, µ is its viscosity, ρ its density, and α is the inverse permeability.The cost functional of the above problem models the energy dissipation of the fluid, which should be minimized.Moreover, we have a volume equality constraint, which ensures that only the desired volume of the domain is occupied by the fluid.For the topology optimization problem (4.4), the sought domain Ω is the domain of the fluid, whereas its complement Ω c plays the role of a solid region.
In particular, the inverse permeability α is small inside Ω and large outside of it.For a more detailed discussion, we refer the reader to [15], where this model was first introduced and used for the topology optimization of fluids.
For our numerical experiments, we regularize the equality constraint with a quadratic penalty term, leading to the following optimization problem The topological derivative for problem (4.5) can be found, e.g., in [27] and is given by where (v, q) solves the adjoint Navier-Stokes system To model an actual solid region, α out should tend to +∞.For our numerical investigation, however, we follow [27] and chose a finite value for α in and α out , namely we choose α in = 2.5µ 100 2 α out = 2.5µ 0.01 2 .Moreover, we choose a value of ρ = 1 for the fluid's density and µ = 1 × 10 −2 for its viscosity.In the following, we will consider two common benchmark problems, which are taken from [15,27] and consider the problem of constructing a pipe bend and the drag minimization of an obstacle.Note, that for both problems, the hold-all domain is given by D = (0, 1) × (0, 1), and that we discretize this with a uniform mesh consisting of 20 201 nodes and 40 000 triangles.Moreover, we use the LBB-stable Taylor-Hood finite element pair of quadratic Lagrange elements for the velocity and adjoint velocity and linear Lagrange elements for the pressure and adjoint pressure.4.4.1.Pipe Bend.Let us first investigate the problem of designing a pipe bend, which is taken from [15].For Dirichlet boundary conditions, we prescribe the inlet velocity with the parabolic profile u D (x) = 1−100(x2−0.8) 2 , 0 for x 1 = 0 and 0.7 ≤ x 2 ≤ 0.9 on the top part of the left boundary of D, and for the outlet velocity we use the profile for x 2 = 0 and 0.7 ≤ x 1 ≤ 0.9 on the right side of the bottom boundary of D. On all other parts of the boundary, we use a no-slip boundary condition, so that u D = [0, 0] ⊤ .The corresponding setup is shown schematically in Figure 13.
For the volume constraint, we proceed according to [15] and use a value of vol des = 0.08π.Additionally, we choose a regularization parameter of l = 1 × 10 4 to enforce the volume constraint.The evolution of the cost functional, angle criterion, and norm of the projected topological derivative can be seen in Figure 14.We observe that the BFGS method substantially outperforms the remaining methods as it only required about 30 iterations to reach the stopping criterion and find a local minimizer of the problem.The gradient descent method performed worse, but still way better than the already established methods: The former required around 100 iterations to satisfy the stopping criterion, whereas     the sphere and convex combination methods performed very similarly, requiring about 250 iterations to reach a local minimizer.Considering the optimized geometries obtained by the method, which are depicted in Figure 15, we observe that all methods converge to a similar minimizer and that all optimized geometries show only the slightest visual differences.
4.4.2.Rugby Ball.Let us now consider the rugby-ball problem from [15], which is shown schematically in Figure 16.Here, the goal is to design an obstacle which minimizes the energy dissipation of the flow.
For the volume constraint of the problem, we choose vol des = 0.8 and use a regularization parameter of     The history of the cost functional, angle between topological derivative and level-set function, and the norm of the projected topological derivative are depicted in Figure 17.Here, we can observe that the BFGS method again outperforms all other methods significantly, as it requires slightly less than 50 iterations to find a local minimizer.The gradient descent method shows the second best performance and is able to satisfy the stopping criterion after around 80 iterations.The performance of the sphere and convex combination methods is, again, very similar and both require slightly more than 100 iterations to converge.
The obtained optimized geometries are shown in Figure 18.We see that all methods produce the same desired shape, which is reminiscent of a rugby ball, so that all of them converge to the same local minimizer of the problem.
Altogether, in the context of fluid design optimization, we observe that the BFGS method performs substantially better than the other methods considered in this paper.It reaches the desired stopping criteria with significantly less iterations than the remaining methods.Our findings highlight the efficiency and potential of the proposed BFGS method for solving topology optimization problems.

Conclusion and Outlook
In this paper, we have presented novel quasi-Newton methods for topology optimization using a levelset method.We recalled the topological derivative, the level-set method for topology optimization, and the widely-used optimization algorithm proposed in [6].Then, we presented a new perspective on the evolution equation for evolving the level-set function according to the topological derivative, which enables an interpretation as a classical gradient descent method.This method is the basis for our derivation of quasi-Newton methods for topology optimization and we present a limited-memory BFGS method in this paper.The derivation of the BFGS methods is analogous to the finite-dimensional case and is possible due to the change in perspective described above.We investigated the performance of the proposed gradient descent and BFGS methods on four problem classes: Inverse topology optimization problems constrained by linear and semilinear Poisson problems, compliance minimization in linear elasticity, and the optimization of fluids in Navier-Stokes flow.We compared the results to current state-of-the-art solution algorithms for topology optimization with level-set methods.Our results show that the novel BFGS methods often significantly outperform the other considered methods, requiring substantially less iterations to compute a (local) minimizer.The only exception was the problem of compliance minimization in linear elasticity, where all considered methods performed very similarly, so that the BFGS method performed at least as good as the other, already established methods.All in all, the proposed BFGS methods are efficient and attractive solution algorithms for topology optimization and show great potential for solving such problems.
For future research, there are several interesting directions.One could consider nonlinear conjugate gradient (NCG) methods for topology optimization in analogy to [10], whose derivation is straightforward with the new perspective on the level-set evolution equation presented in this chapter.In fact, these methods are already implemented and available in our open-source software cashocs [9,13].However, a thorough numerical analysis of such NCG methods is still required for understanding their performance and behavior.Moreover, a theoretical analysis of the proposed BFGS methods is of great interest.For this, it would be useful to study the properties of the approximate Hessian operator, which the BFGS methods make use of, and its relation to higher order topological derivatives (see, e.g., [7] and the references therein).Further, it remains an open question why the BFGS methods do not perform as well in the context of compliance minimization in linear elasticity as they do for the other problem classes considered in this paper.Finally, it is, of course, of particular interest to employ our proposed methods for solving practically relevant topology optimization problems, e.g., in the fields of fluid-dynamical optimization or the simulation of fracture evolution.

4. 1 .
Linear Poisson Problem.In this section, we investigate a topology optimization problem constrained by a linear Poisson problem.Let D ⊂ R d with d ∈ N >0 be an open and bounded domain with boundary ∂D and let Ω ⊂ D be an open subset.We denote by Γ = ∂Ω \ ∂D the interior boundary of Ω in D and by Ω c = D \ Ω the complement of Ω in D. We consider the following problem (a) Cost functional.(b) Angle.(c)Norm of the projected topological derivative.

Figure 1 .
Figure 1.Evolution of the optimization for the linear Poisson problem (4.1).

Figure 2 .
Figure 2. Plots of the topological derivative at two iterations of the sphere combination algorithm for problem (4.1).

(a) 32 Figure 3 .
Figure 3. Evolution of the cost functional for the linear Poisson problem (4.1) for several discretization levels.

4. 2 .
Semilinear Poisson Problem.To showcase the methods' performance for nonlinear problems, we now investigate a semilinear variant of the Poisson problem we considered in Section 4.1.Our setting is like before, so D ⊂ R d with d ∈ N >0 and d ≤ 3 denotes the holdall domain, ∂D is its boundary, Ω ⊂ D is an open subset of D, Γ = ∂Ω \ ∂D is the interior boundary of Ω in D, and Ω c = D \ Ω is the complement of Ω in D. Our semilinear version of the problem reads Norm of the projected topological derivative.

Figure 4 .
Figure 4. Evolution of the optimization for the semilinear Poisson problem (4.2).
. Let D ⊂ R d , d ∈ N >0 be an open and bounded domain with boundary ∂D, which is the disjoint union of the Table 2. Evolution of the geometries for the semilinear Poisson problem (4Γ D and Neumann boundary Γ N .Further, let Ω ⊂ D and denote its complement by Ω c = D \ Ω.The compliance minimization problem is given by

Figure 5 .
Figure 5. Schematic of the cantilever problem.
(a) Cost functional.(b) Angle.(c)Norm of the projected topological derivative.

Figure 6 .
Figure 6.Evolution of the optimization for the cantilever problem.

Figure 7 .
Figure 7. Optimized geometries for the cantilever problem.

Figure 8 .
Figure 8. Schematic setup of the bridge problem.
(a) Cost functional.(b) Angle.(c)Norm of the projected topological derivative.

Figure 9 .
Figure 9. Evolution of the optimization for the bridge with a single load.

Figure 10 .
Figure 10.Optimized geometries for the bridge with a single load.
Norm of the projected topological derivative.

Figure 11 .
Figure 11.Evolution of the optimization for the bridge with multiple loads.

Figure 12 .
Figure 12.Optimized geometries for the bridge with multiple load.

4. 4 .
Optimization of Fluids in Navier-Stokes Flow.Let us now consider another application, the optimization of fluids in Navier-Stokes flow.Our setting is similar to before, i.e., let D ⊂ R d be an open and bounded hold-all domain with boundary ∂D.Let Ω ⊂ D, denote by Ω c = D \ Ω the complement of Ω in D, and let Γ = ∂Ω \ ∂D.We consider the following optimization problem min Ω J(Ω, u) = D µ∇u : ∇u
Norm of the projected topological derivative.

Figure 14 .
Figure 14.Evolution of the optimization for the pipe bend problem.

Figure 15 .
Figure 15.Optimized geometries for the pipe bend problem.
Norm of the projected topological derivative.

Figure 17 .
Figure 17.Evolution of the optimization for the rugby ball.

Figure 18 .
Figure 18.Optimized geometries for the rugby ball problem.

Table 1 .
Evolution of the geometries for the linear Poisson problem (4.1).