1 Introduction

In many acoustic applications, acoustic waveguides connect various parts of a device from the transducer to the radiating aperture or receiver (Rutsch et al. 2022; Haugwitz et al. 2022). This includes the transmission of acoustic waves between parts with different geometries, dimensions, and acoustic impedances. In most cases, minimal reflection and alteration of the incoming signal are desired, which can be achieved using an impedance matching section (Wadbro 2014; Robertson et al. 2019).

One of the earliest works on acoustic transition sections is the study by Kirby (2008), who used numerical simulations to model acoustic wave propagation in two waveguides connected through a transition section. Wadbro (2014) used a material distribution topology optimization method to design a transition section between two cylindrical waveguides with different radii to achieve impedance matching. Robertson et al. (2019) considered a similar problem of impedance matching between two cylindrical waveguides, although they used a two-neck Helmholtz resonator as the transition section. They showed that perfect impedance matching can be achieved by tuning the dimensions of the Helmholtz resonator. However, this approach has bandwidth limitations, with impedance matching achieved only over a narrow range of frequencies close to the resonance frequency of the resonator. Cao et al. (2020) took a different approach by considering a two-dimensional acoustic transformation section with an impedance-tunable transformed medium. They showed that desirable broadband impedance matching can be achieved in this way, though in practice, it is very difficult to realize a transformation medium with acoustic properties changing according to a given function (Cao et al. 2020).

Here, we use the material distribution topology optimization method, also known as density-based topology optimization, to design an acoustic transition section for impedance matching. This is a common method in computational design optimization for acoustic problems (Wadbro and Berggren 2006; Dühring et al. 2008; Bokhari et al. 2021; Yoon et al. 2020). To obtain a final topology that is favorable for a broad range of frequencies, there are two fundamentally different approaches:

First, the problem can be viewed as a topology optimization task for infinitely many load cases. Using an appropriate discretization scheme, the problem can be transformed into a deterministic multi-load formulation, where the objective function admits a finite sum structure (Diaz and Bendsøe 1992; Li et al. 2020; Zhang et al. 2017), for which various deterministic optimization schemes have been explored. In this contribution, the resulting optimization problems are solved via the method of moving asymptotes (MMA) (Svanberg 1987). We investigate how the solution depends on the number of frequencies considered in the optimization.

Changing perspective, the original objective can also be considered as a robust optimization problem (Carrasco et al. 2012; Dunning and Kim 2013; Tootkaboni et al. 2012), where the broadband frequency range models an underlying uncertainty of load cases. While many approaches for robust optimization again rely on discretizations or series expansions of the full objective, we specifically choose two methods from stochastic optimization which do not follow this philosophy. Namely, we optimize the transition section using the stochastic gradient descent (Robbins and Monro 1951) and the continuous stochastic gradient descent (Pflug et al. 2020; Grieshammer et al. 2023a, b) method. Both of these approaches represent probabilistic, sample-based optimization schemes.

At first glance, using probabilistic solvers for a fully deterministic optimization problem seems counterintuitive, especially given that gradient descent schemes typically perform worse than specialized techniques like MMA. However, there are several reasons why these methods might still yield favorable results. The aforementioned discretization of integrals in the deterministic approach automatically results in a trade-off between accuracy and computational time. Especially in our broadband setting, the numerical effort associated with the required accuracy can quickly outgrow the computational time of sample-based methods, which have an approximately constant cost per iteration.

The remainder of this paper is structured as follows. In Sect. 2, we introduce the problem setup, including the geometric configuration and the governing equations. We also discuss the discretization of the state equations using the finite element method. Next, we introduce the objective function for the design optimization problem aiming for a broadband transition. For the different optimization approaches considered in this work, we present the corresponding results of our numerical experiments in Sect. 3. Finally, in Sect. 4, we present a concluding discussion that includes a comprehensive comparison of the results obtained by the different approaches. We summarize the main findings and provide insights into the strengths and limitations of each method.

2 Problem statement

Consider the cylindrical setup illustrated in Fig. 1, consisting of two semi-infinite pipes connected by a transition section. Assume a planar acoustic wave propagating from left to right in the left pipe. As this incoming wave propagates in the transition section \(\Omega ^\text {D}\), highlighted in grey, a part of the wave will propagate through the transition section to the right pipe, while another part will be reflected back to the left pipe. By optimizing the distribution of sound-hard material in the transition section \(\Omega ^\text {D}\), this study aims to ensure that the planar incoming wave in the left pipe continues to propagate as a planar wave to the right pipe, despite the change in the diameter of the waveguide. A similar problem was previously considered by Wadbro (2014) for a narrower range of frequencies. In this study, we aim to extend the analysis to a broader range of frequencies, which is of practical importance for a wide range of acoustic applications. Additionally, we compare two distinct approaches to solving the optimization problem: the deterministic approach and the stochastic approach. Here and throughout this article, the waves that propagate away from the transition section are termed outgoing waves, and the waves that propagate towards the transition section are termed incoming waves. We note that the outgoing waves are further divided into reflected waves, traveling to the left in the left pipe, and transmitted waves, traveling to the right in the right pipe.

Fig. 1

a Axi-symmetric setup with cylindrical design domain \(\Omega ^\text {D}\) in the middle and one cylindrical waveguide on both sides. b A 3D visualization of the setup. c The targeted wave propagation characteristics in the waveguides, where the planar incoming wave in the left pipe continues to propagate as a planar wave to the right pipe

2.1 Mathematical model

In this study, we consider linear wave propagation in the cylindrically symmetric (axisymmetric) setup, illustrated in Fig. 1a. Specifically, we let \({P(x,t)=Re\left\{ p(x)e^{i\omega t}\right\} }\) denote the time-harmonic pressure in the air-filled region. Under these assumptions, the complex pressure amplitude p(x) satisfies the Helmholtz equation in cylindrical coordinates. That is,

$$\begin{aligned} -\nabla \cdot (r \nabla p) - k^2 rp = 0, \quad \text {in}\, \hat{\Omega }, \end{aligned}$$
(1)

which describes the behavior of sound waves in the system. Therein, \({\hat{\Omega }}\) is the air-filled region of the setup, and the wave number \(k= \omega / c\) is determined by the speed of sound in air c and the angular frequency \(\omega\). As mentioned earlier, the design domain \(\Omega ^\text {D}\) (transition section) may partly be filled with sound-hard material. At the air and the solid interface, the normal velocity is zero, which by the relation of velocity and pressure yields the sound-hard boundary condition \(\partial p/\partial n=0\), where n is the interface’s outward directed normal. Fig. 2 shows the air-filled domain \({\hat{\Omega }}\) and an arbitrary distribution of sound-hard material \(\Omega ^s\) in the design domain. By varying the distribution of sound-hard material within the design domain, we can explore how this affects the propagation of sound waves through the transition section and identify optimal designs that minimize wave reflection and ensure planar wave transmission.

Fig. 2

The computational domain \(\Omega\) comprising the design domain \(\Omega ^\text {D}\) and two truncated waveguides. An arbitrary sound-hard material distribution inside \(\Omega ^\text {D}\) is given as \(\Omega ^\text {s}\). The remainder of the setup \({\hat{\Omega }}=\Omega \setminus \Omega ^\text {s}\) is filled with air. The boundaries are divided into an axi-symmetric axis \(\Gamma ^\text {sym}\) (dash-dotted line), the sound-hard walls \(\Gamma ^\text {s}\) (solid line) and the boundaries \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\) (dashed lines) of the artificially truncated waveguides

To numerically approximate the infinitely long pipes on both sides of the design domain, we truncate them and use Dirichlet-to-Neumann non-reflecting boundary conditions at the artificial boundaries \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), as illustrated in Fig. 2. Considering these artificial boundary conditions, we obtain the boundary-value problem

$$\begin{aligned} -\nabla \cdot (r\nabla p) - k^2 rp&= 0,&\quad&\text {in} \quad \hat{\Omega }, \end{aligned}$$
(2a)
$$\begin{aligned} \frac{\partial p}{\partial n}&= 0,&\quad&\text {on} \quad \Gamma ^\text {s}, \end{aligned}$$
(2b)
$$\begin{aligned} \frac{\partial p}{\partial n} - \text {DtN}(p)&= 2\textrm{i}k,&\quad&\text {on} \quad \Gamma ^\text {L}, \end{aligned}$$
(2c)
$$\begin{aligned} \frac{\partial p}{\partial n} - \text {DtN}(p)&= 0,&\quad&\text {on} \quad \Gamma ^\text {R}. \end{aligned}$$
(2d)

Conditions (2c) and (2d), in which \(\text {DtN}\) represents the Dirichlet-to-Neumann operator, ensure that all the outgoing waves are perfectly absorbed. Furthermore, condition (2c) also specifies an incoming planar wave with unit amplitude at \(\Gamma ^\text {L}\). For more details about this type of artificial boundary condition, we refer the reader to the book by Ihlenburg (1998) and the appendix of the article by Wadbro (2014). By multiplying Eq. (2a) with a test function q and integrating over the domain \({{\hat{\Omega }}}\), the variational form of boundary-value problem (2) can be written as follows.

$$\begin{aligned}{} & {} \text {Find }p \in H^1 ({{\hat{\Omega }}})\text { such that:}\nonumber \\{} & {} \qquad \int _{{{\hat{\Omega }}}} r \nabla q \cdot \nabla p\,\textrm{d}\Omega - k^2 \int _{{{\hat{\Omega }}}} r q p\,\textrm{d}\Omega \nonumber \\{} & {} \qquad - \int _{\Gamma ^\text {L}} rq \text {DtN}(p)\,\textrm{d}\Gamma - \int _{\Gamma ^\text {R}} rq \text {DtN}(p)\,\textrm{d}\Gamma \nonumber \\{} & {} \qquad \quad = 2\textrm{i}k\int _{\Gamma ^\text {L}} rq\,\textrm{d}\Gamma , \quad \forall q \in H^1 ({\hat{\Omega }}). \end{aligned}$$
(3)

For a given distribution of solid material in the design domain and a given shape of region \(\Omega ^s\), the solution p to Eq. (3) shows the distribution of the complex pressure in \({{\hat{\Omega }}}\). Following a standard approach in topology optimization, we define a material indicator function \(\alpha\) such that \(\alpha \equiv 0\) in \(\Omega ^s\) and \(\alpha \equiv 1\) in \(\hat{\Omega }\). Using this function, we extend the integration domain of the domain integrals in variational formulation (3) from \({\hat{\Omega }}\) to \({\Omega = {\hat{\Omega }} \cup \Omega ^s}\). We note that the computational domain \(\Omega\) consists of both the air and the solid regions. The resulting reformulation of variational formulation (3) is then given by

$$\begin{aligned}{} & {} \text {Find }p \in H^1 (\Omega )\text { such that}\nonumber \\{} & {} \qquad \int _{\Omega } \alpha r \nabla q \cdot \nabla p\,\textrm{d}\Omega - k^2 \int _{\Omega } \alpha r q p\,\textrm{d}\Omega \nonumber \\{} & {} \qquad - \int _{\Gamma ^\text {L}} rq \text {DtN}(p)\,\textrm{d}\Gamma - \int _{\Gamma ^\text {R}} rq \text {DtN}(p)\,\textrm{d}\Gamma \nonumber \\{} & {} \qquad \quad = 2\textrm{i}k\int _{\Gamma ^\text {L}} rq\,\textrm{d}\Gamma , \quad \forall q \in H^1 (\Omega ). \end{aligned}$$
(4)

The solution p to the variational formulation (4) represents the distribution of complex pressure in the waveguide, given a design of solid scatter in \(\Omega ^\text {D}\), described by a material indicator function \(\alpha\).

2.2 Discretization

We use the finite element method to discretize and numerically solve problem (4) on a structured grid of square elements. Let V be a finite element functional space consisting of continuous and bi-quadratic functions on each element, and let \(\varphi _j\), \({j=1, 2, \ldots , N}\) be bi-quadratic shape functions, where N is the number of degrees of freedom. Thus, \(V=\text {span}\left\{ \varphi _1, \varphi _2, \ldots , \varphi _N\right\}\). We approximate the complex pressure p and the test function q by \(p_h \in V\) and \(q_h \in V\), respectively. Additionally, we approximate the material indicator function \(\alpha\) with an element-wise constant function \(\alpha _h\). Using the above definitions and approximations, we obtain the discretized version of problem (4) below.

$$\begin{aligned}{} & {} \text {Find }p_h\in \,V\text { such that}\nonumber \\{} & {} \qquad \int _\Omega \alpha _h r \nabla q_h \cdot \nabla p_h\,\textrm{d}\Omega - k^2 \int _\Omega \alpha _h r q_h p_h\,\textrm{d}\Omega \nonumber \\{} & {} \qquad - \int _{\Gamma ^\text {L}} r q_h \text {DtN}_h(p_h)\,\textrm{d}\Gamma -\int _{\Gamma ^\text {R}} r q_h \text {DtN}_h(p_h)\,\textrm{d}\Gamma \nonumber \\{} & {} \qquad \quad = 2\textrm{i}k\int _{\Gamma ^\text {L}} r q_h\,\textrm{d}\Gamma , \quad \forall q_h\in V, \end{aligned}$$
(5)

where \(\text {DtN}_h\) represents semi-discrete Dirichlet-to-Neumann type boundary operators on \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\) (Wadbro 2014, Appendix A). The algebraic or matrix formulation of problem (5) reads

$$\left( {\textbf{K}}(\varvec{\alpha }) - k^2{\textbf{M}}(\varvec{\alpha }) - {\textbf{B}}^\text {L} - {\textbf{B}}^\text {R}\right) {\textbf{p}} = 2\textrm{i}k {\textbf{M}}^\text {L} \mathbf {\mathbbm {1}_N},$$
(6)

where \({\textbf{p}}=\left[ p_1,p_2,\ldots ,p_{N}\right] ^T\) is the vector of nodal values of the complex acoustic pressure amplitude, \(\varvec{\alpha }=\left[ \alpha _1,\alpha _2,\ldots ,\alpha _{N^\text {D}}\right] ^T\) is the vector that holds the element values of \(\alpha _h\) (with \(N^\text {D}\) denoting the number of elements in \(\Omega ^\text {D}\)) and \(\mathbbm {1}_N = [1,1, \dots , 1]^T\) is a vector of length N. Also, the \(N\times N\) stiffness \({\textbf{K}}\), mass \({\textbf{M}}\), and boundary mass \({\textbf{M}}^\text {L}\) matrices have components

$$\begin{aligned} K_{i j}&= \int _\Omega \alpha _h r \nabla \varphi _i \cdot \nabla \varphi _j\,\textrm{d}\Omega , \end{aligned}$$
(7a)
$$\begin{aligned} M_{i j}&= \int _\Omega \alpha _h r \varphi _i \varphi _j\,\textrm{d}\Omega , \end{aligned}$$
(7b)
$$\begin{aligned} M^\text {L}_{i j}&= \int _{\Gamma ^\text {L}}\, r\varphi _i \varphi _j \,\textrm{d}\Gamma , \end{aligned}$$
(7c)

respectively. The boundary matrices \({\textbf{B}}^\text {L}\) and \({\textbf{B}}^\text {R}\) represent the non-reflecting boundary conditions at \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), respectively. A detailed derivation of \({\textbf{B}}^\text {L}\) and \({\textbf{B}}^\text {R}\) is provided in a previous work by Wadbro (2014, Appendix A).

2.3 Power of outgoing waves

Let Helmholtz Eq. (1) govern the distribution of the complex pressure p in the two semi-infinite pipes on the left and right side of the transition section illustrated in Fig. 1a, with sound-hard boundary condition (2b) on the solid walls. Using separation of variables, the general solution for p in the left and right pipes reads

$$\begin{aligned} p^\text {L}&= \sum _m f_m(r) \bigg (A^\text {L}_m \textrm{e}^{\textrm{i} k_m (z^\text {L}-z)} + B^\text {L}_m \textrm{e}^{\textrm{i} k_m (z-z^\text {L})}\bigg ), \end{aligned}$$
(8a)
$$\begin{aligned} p^\text {R}&= \sum _m f_m(r) \bigg (A^\text {R}_m \textrm{e}^{\textrm{i} k_m (z-z^\text {R})} + B^\text {R}_m \textrm{e}^{\textrm{i} k_m (z^\text {R}-z)}\bigg ), \end{aligned}$$
(8b)

respectively, where \(z^\text {L}\) and \(z^\text {R}\) are the z-coordinates of \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), the functions \(f_m(r)\) are the modes in the respective waveguide, \(\textrm{e}\) is the base of the natural logarithm, \(\textrm{i}\) is the imaginary unit and \(A^\text {L}_m\) and \(B^\text {L}_m\) are complex constants that determine the amplitude of incoming and outgoing waves in the left waveguide, respectively. Similarly, \(A^\text {R}_m\) and \(B^\text {R}_m\) are complex constants that determine the amplitude of incoming and outgoing waves in the right waveguide, respectively. Lastly, the constants \(k_m\) are the so-called reduced wave numbers. In the continuous case, we have an infinite number of modes \(m=0,1,2,\ldots\), but only the modes with real values of \(k_m\) are propagating modes. The modes with imaginary \(k_m\) decay exponentially according to Eq. (8) and are known as evanescent modes.

The mode functions \(f_m(r)\) should satisfy the following one-dimensional eigenvalue problem in the radial direction on the boundaries \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\):

$$\begin{aligned} -\frac{\partial }{\partial r}\bigg (r \frac{\partial f}{\partial r}\bigg ) = \lambda r f, \quad \frac{\partial f}{\partial r} \bigg |_{r=0} =\frac{\partial f}{\partial r} \bigg |_{r=W}=0, \end{aligned}$$
(9)

where W is the radius of the pipe. In the continuous case, it is well-known that the functions \(f_m\) are so-called Bessel functions. For our numerical treatment, we can extrapolate the numerical solution on \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\) to any point in the waveguides using expansion (8) and viewing the problem continuously in the lengthwise direction and discretely in the radial direction. Note that for a given finite element discretization of Eq. (9), the number of modes that are representable in the discretized case M equals the number of basis functions with support on the boundary. Let \(f_m^h\) be an eigenfunction (mode) corresponding to eigenvalue \(\lambda _m\), where \(m=0,1,\ldots , M\). Since the complex pressure p satisfies Helmholtz Eq. (1), for the reduced wave number we have

$$\begin{aligned} k_{m}^2 = k^2 - \lambda _{m}. \end{aligned}$$
(10)

Recall that \(f_m^h\) is a propagating mode if its corresponding reduced wave number is real. The smallest eigenvalue in solving problem (9) is 0, which corresponds to the planar wave. So \(\lambda _{0}=0\) and \(k_0=k\), and thus, the planar wave mode is always a propagating mode. Moreover, for a given frequency f, there is a finite number \(M^p\) of propagating modes satisfying the condition \(\lambda _m \le k^2\). The number of propagating modes depends on the frequency of the wave and the radius of the pipe. Thus, for the different radii of the left and right pipes, we may have a different number of propagating modes at \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), which we denote by \(M_L^p\) and \(M_R^p\), respectively.
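To illustrate the propagation condition \(\lambda _m \le k^2\), the following minimal Python sketch counts the propagating modes of a sound-hard circular pipe in the continuous case. It rests on assumptions that are not taken from this study: the radial eigenvalues are written as \(\lambda _m = (j_{1,m}/W)^2\), where \(j_{1,m}\) are the non-negative zeros of the Bessel function \(J_1\) (with \(j_{1,0}=0\) for the planar mode), the zeros are hard-coded from standard tables, and the speed of sound is set to 343 m/s. The function name is hypothetical, and the returned value is a count that includes the planar mode.

```python
import math

# Non-negative zeros of the Bessel function J_1; the leading 0.0 is the planar mode.
# Hard-coded from standard tables (an assumption of this sketch).
J1_ZEROS = [0.0, 3.8317059702, 7.0155866698, 10.1734681351,
            13.3236919363, 16.4706300509]


def count_propagating_modes(frequency_hz, radius_m, speed_of_sound=343.0):
    """Count modes with lambda_m <= k^2 in a sound-hard circular pipe."""
    k = 2.0 * math.pi * frequency_hz / speed_of_sound  # wave number k = omega / c
    if k * radius_m >= J1_ZEROS[-1]:
        # The short table above cannot resolve this case reliably.
        raise ValueError("k * W exceeds the tabulated zeros of J_1")
    count = 0
    for zero in J1_ZEROS:
        lam = (zero / radius_m) ** 2  # eigenvalue lambda_m of problem (9)
        if lam <= k ** 2:             # reduced wave number k_m is real
            count += 1
    return count


if __name__ == "__main__":
    for f in (4000.0, 16000.0):
        print(f, count_propagating_modes(f, 0.03), count_propagating_modes(f, 0.04))
```

Under these assumptions, only the planar mode propagates at 4 kHz in pipes of radius 30 mm and 40 mm, whereas at 16 kHz the sketch reports three and four propagating modes, respectively.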

In the discretized case, the solution at the boundaries \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), where \(z^\text {L}-z=0\) and \(z-z^\text {R}=0\), respectively, reads

$$\begin{aligned} p\bigg |_{\Gamma ^\text {L}}&= \sum _{m=0}^{M_L^p} f_m^h(r) \bigg (A^\text {L}_m + B^\text {L}_m \bigg ), \end{aligned}$$
(11a)
$$\begin{aligned} p\bigg |_{\Gamma ^\text {R}}&= \sum _{m=0}^{M_R^p} f_m^h(r) \bigg (A^\text {R}_m + B^\text {R}_m \bigg ). \end{aligned}$$
(11b)

From now onward, we occasionally use the superscript X in an expression to represent either \(\text {L}\) for the left waveguide or \(\text {R}\) for the right waveguide; the corresponding statement then holds for both choices. Let \(f_n^h\) be the nth propagating mode. Then, we have

$$\begin{aligned} \begin{aligned} \int _{\Gamma ^X} r f_n^h p&= \sum _{m=0}^{M^p} \int _{\Gamma ^X} r f_n^h \bigg ( \big (A_m^X + B_m^X \big ) f_m^h \bigg ) \\&= A_n^X + B_n^X, {\quad X \in \{\text {L},\text {R} \},} \end{aligned} \end{aligned}$$
(12)

where the first equality follows from substituting \(p\big |_{\Gamma ^\text {X}}\) from Eq. (11) into the first expression and the second equality follows from the orthonormality of modes. Considering \({\textbf{v}}_n^X\) to be the \(N \times 1\) vector representing the nodal values of the discrete mode \(f_n^h\) on the boundary nodes at \(\Gamma ^X\) (note that all other entries of \({\textbf{v}}_n^X\) corresponding to the internal nodes are zero), Eq. (12) in matrix form reads

$$\begin{aligned} \left( {\textbf{v}}_n^X\right) ^T{\textbf{M}}^\text {X} {\textbf{p}}= A_n^X + B_n^X, {\quad X \in \{\text {L},\text {R} \},} \end{aligned}$$
(13)

where \({\textbf{M}}^\text {X}\) is the boundary mass matrix as defined in Eq. (7c) at \(\Gamma ^X\), and \({\textbf{p}}\) is the vector of nodal values of the complex pressure. Thus, for a given solution \({\textbf{p}}\) to problem (6), we can recover the sum of the complex amplitudes of the incoming and outgoing waves for each propagating mode at \(\Gamma ^X\), using Eq. (13).

In this study, we only consider the case where we have a planar incoming wave with unit-amplitude at \(\Gamma ^\text {L}\). Therefore, we can rewrite Eq. (13) as

$$\begin{aligned} \begin{aligned} \left( {\textbf{v}}_0^\text {L}\right) ^T{\textbf{M}}^\text {L} {\textbf{p}}&= 1 + B_0^\text {L},\\ \left( {\textbf{v}}_m^\text {L}\right) ^T{\textbf{M}}^\text {L} {\textbf{p}}&= B_m^\text {L}, \quad m=1,2,\ldots ,M_L^p,\\ \left( {\textbf{v}}_n^\text {R}\right) ^T{\textbf{M}}^\text {R} {\textbf{p}}&= B_n^\text {R}, \quad n=0,1,\ldots ,M_R^p.\\ \end{aligned} \end{aligned}$$
(14)

Recall that \(M_L^p\) and \(M_R^p\) are the highest propagating modes in the left and right pipes, respectively.

The power of a propagating wave is proportional to the square of its amplitude and its corresponding reduced wave number. Defining the normalized power of outgoing waves as the power of outgoing wave divided by the power of the unit-amplitude incoming wave and considering Eq. (14) to compute the amplitude of outgoing waves, we have

$$\begin{aligned} \begin{aligned} P_m^\text {L}= {\left\{ \begin{array}{ll} \bigg |\left( {\textbf{v}}_0^\text {L}\right) ^T{\textbf{M}}^\text {L} {\textbf{p}}-1 \bigg |^2 \, &{}\text {if} \, m=0,\\ \frac{k_m}{k}\bigg |\left( {\textbf{v}}_m^\text {L}\right) ^T{\textbf{M}}^\text {L} {\textbf{p}}\bigg |^2 \, &{}\text {if} \, m=1,\ldots ,M_L^p,\\ \end{array}\right. } \end{aligned} \end{aligned}$$
(15)

and

$$\begin{aligned} P_n^\text {R}= \frac{k_n}{k}\bigg |\left( {\textbf{v}}_n^\text {R}\right) ^T{\textbf{M}}^\text {R} {\textbf{p}}\bigg |^2 \, \text {for} \, n=0,1,\ldots ,M_R^p. \end{aligned}$$
(16)

Here \(P_m^\text {L}\) and \(P_n^\text {R}\) are the normalized powers of the outgoing waves of mode m and n at \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), respectively. A detailed derivation of the power of outgoing waves for a given amplitude in the discretized case is provided by Wadbro (2014, Appendix B). Here and throughout this article, the term power of outgoing wave refers to this normalized quantity, that is, the power of the outgoing wave divided by the power of the unit-amplitude incoming wave imposed at \(\Gamma ^\text {L}\).
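The post-processing in Eqs. (13)–(16) can be sketched in a few lines of Python. This is only an illustrative sketch using plain lists and built-in complex numbers: the function names are hypothetical, the small matrices and amplitudes in the example are made up, and in practice the mode vectors, boundary mass matrices, and pressure vector would come from the finite element discretization.

```python
def modal_projection(v, M, p):
    """Return v^T M p as in Eq. (13); v is real, p may be complex."""
    return sum(v[i] * M[i][j] * p[j]
               for i in range(len(v)) for j in range(len(p)))


def normalized_powers_left(projections, reduced_wave_numbers, k):
    """Eq. (15): projections[m] = v_m^T M^L p for the propagating modes.

    Mode 0 contains the unit-amplitude incoming planar wave, which is
    subtracted before taking the squared magnitude.
    """
    powers = [abs(projections[0] - 1.0) ** 2]
    for a_m, k_m in zip(projections[1:], reduced_wave_numbers[1:]):
        powers.append(k_m / k * abs(a_m) ** 2)
    return powers


def normalized_powers_right(projections, reduced_wave_numbers, k):
    """Eq. (16): projections[n] = v_n^T M^R p for the propagating modes."""
    return [k_n / k * abs(b_n) ** 2
            for b_n, k_n in zip(projections, reduced_wave_numbers)]


if __name__ == "__main__":
    # Made-up data: k = 5 and one higher mode with k_1 = 3 (lambda_1 = 16).
    k, k_m = 5.0, [5.0, 3.0]
    left = normalized_powers_left([1.0 + 0.5j, 0.5], k_m, k)
    right = normalized_powers_right([0.8, 0.1j], k_m, k)
    print(left, right)
```

For the made-up amplitudes in the example, the reflected planar wave carries a normalized power of 0.25 and the transmitted planar wave carries 0.64.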

2.4 Objective function

As mentioned earlier, the aim of this study is to design the transition section in Fig. 1 to (i) maximize the transmission and (ii) ensure that the transmitted wave is planar as illustrated in Fig. 1c. To achieve this, we minimize the sum of the power of all outgoing waves except for the planar wave to the right over the targeted range of frequencies \({\mathscr {F}}\). Thus, the primary objective function can be written as

$$\begin{aligned} J_p(\varvec{\alpha }) = \frac{1}{|{\mathscr {F}} |}\int _{{\mathscr {F}}} \left( \sum ^{M^{p}_L\left( f \right) }_{m=0}P_m^\text {L} + \sum ^{M^{p}_R\left( f \right) }_{n=1}P_n^\text {R}\right) \,\textrm{d}{\mathscr {F}}. \end{aligned}$$
(17)

Note that we have normalized the objective by the length of the targeted frequency range \(|{\mathscr {F}} |\). Moreover, for each frequency, we need to calculate \(M^{p}_L\left( f \right)\) and \(M^{p}_R\left( f \right)\), the number of propagating modes at \(\Gamma ^\text {L}\) and \(\Gamma ^\text {R}\), respectively.

For binary values \(\alpha _h\in \left\{ 0,1\right\}\), the optimization problem with objective function (17) is a large-scale non-linear integer optimization problem. Also, if \(\alpha _h=0\) in some of the elements, then the system matrix in Eq. (6) becomes singular. To circumvent the numerical and mathematical issues that arise when solving this problem, a standard approach in topology optimization is to relax the binary value constraint and let \(\alpha _h\) take values in the range \([\epsilon ,1]\), where \(\epsilon\) is a small positive number (Wadbro and Berggren 2006; Dühring et al. 2008; Wadbro 2014; Kasolis et al. 2015; Bokhari et al. 2021). Changing the lower bound from 0 to \(\epsilon\) slightly modifies the governing equation; the induced error in the complex pressure field is linear in \(\epsilon\) (Kasolis et al. 2015).

Moreover, we aim for a pure solid (\(\alpha _h=\epsilon\)) or air (\(\alpha _h=1\)) final design. Thus, we use a combination of filtering and penalty methods to suppress the intermediate values. The non-linear density filters used in the numerical experiments also ensure a size control on the solid region in the design (Hassan et al. 2018; Sigmund 2007; Hägg and Wadbro 2017, 2018; Bokhari et al. 2021). Let \({{\textbf{d}}=\left[ d_1,d_2,\ldots ,d_{N^\text {D}}\right] ^T}\) be the vector of design variables before filtering. Thus, we define the \(N^\text {D} \times 1\) vector \({\varvec{\alpha }:={\mathcal {F}}\left( {\textbf{d}} \right) }\), where \({\mathcal {F}}\) is a filter operator. To further suppress the intermediate values of the design variables, we add a standard quadratic penalty term (Allaire and Kohn 1993; Borrvall and Petersson 2001; Wadbro 2014; Bokhari et al. 2021) to the primary objective function (17) and thus, we define the objective function for the numerical experiments as follows:

$$\begin{aligned} \begin{aligned} J(\varvec{\alpha })&=J_p(\varvec{\alpha }) + \frac{\gamma }{|\Omega ^\text {D} |}\int _{\Omega ^\text {D}} (\alpha _h-\epsilon )(1-\alpha _h) \\&= \frac{1}{|{\mathscr {F}} |} \int _{{\mathscr {F}}} \left( \sum ^{M^{p}_L\left( f \right) }_{m=0}P_m^\text {L} + \sum ^{M^{p}_R\left( f \right) }_{n=1}P_n^\text {R} \right) \,\textrm{d}{\mathscr {F}} \\& \quad + \frac{\gamma }{N^\text {D}} \sum ^{N^\text {D}}_{k=1}(\alpha _k-\epsilon )(1-\alpha _k), \end{aligned} \end{aligned}$$
(18)

where \(\gamma\) is the penalty parameter and \(|\Omega ^\text {D} |\) denotes the size of the design domain.
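The discrete penalty term in Eq. (18) is straightforward to evaluate. The following Python sketch (with a hypothetical function name and made-up design vectors) shows that the term vanishes for a design with only the values \(\epsilon\) and 1, and is largest for intermediate values.

```python
def quadratic_penalty(alpha, gamma, eps=1e-8):
    """Evaluate gamma / N^D * sum_k (alpha_k - eps) * (1 - alpha_k)."""
    n = len(alpha)
    return gamma / n * sum((a - eps) * (1.0 - a) for a in alpha)


if __name__ == "__main__":
    black_and_white = [1e-8, 1.0, 1.0, 1e-8]  # pure solid or pure air
    grey = [0.5, 0.5, 0.5, 0.5]               # fully intermediate design
    print(quadratic_penalty(black_and_white, gamma=10.0))  # no penalty
    print(quadratic_penalty(grey, gamma=10.0))             # close to gamma / 4
```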

3 Numerical experiments

In our numerical experiments, we consider the setup illustrated in Fig. 2 with the following dimensions: The radius and length of the design region \(\Omega ^\text {D}\) are \({r^\text {D}=50\,\text {mm}}\) and \({l^\text {D}=50\,\text {mm}}\), respectively. The radii of the left and right truncated waveguides are \({r^\text {L}=30\,\text {mm}}\) and \({r^\text {R}=40\,\text {mm}}\), respectively, and their length is \({l^\text {W}=20\,\text {mm}}\). We aim to maximize the transmission of the planar incoming wave in the frequency range of 4–16 kHz, ensuring that the transmitted wave is also planar. We discretize the computational domain into a structured grid of square elements with a uniform mesh size of \(h=0.25\,\text {mm}\), resulting in 250,721 degrees of freedom for the finite element discretization. To solve the optimization problem, we employ three different optimization algorithms: the MMA method (Svanberg 1987), the stochastic gradient (SG) method (Robbins and Monro 1951), and the continuous stochastic gradient (CSG) method (Pflug et al. 2020; Grieshammer et al. 2023a, b).

We define the performance of a given design at frequency f as the normalized power of the outgoing planar wave, computed using expression (16), as follows:

$$\begin{aligned} \text {Performance}=P_0^\text {R}(f)= \bigg |\left( {\textbf{v}}_0^\text {R}\right) ^T{\textbf{M}}^\text {R} {\textbf{p}}\bigg |^2. \end{aligned}$$
(19)

To evaluate the performance of the optimized designs, we use a boundary-fitted mesh for the final designs in the Acoustics Module in COMSOL Multiphysics. The performance of the different designs is compared over a range of frequencies from 4 kHz to 16 kHz with a step size of 20 Hz.

3.1 MMA approach

To solve the optimization problem considering objective function (18) using the MMA method, we approximate the integral over the range of targeted frequencies using the function values of the integrand at just a few frequencies. Thus, we discretize the optimization problem as

$$\begin{aligned}{} & {} \min _{{\textbf{d}}\in {\mathcal {A}}} \quad \sum ^{Q}_{i=1} \left( \sum ^{M^{p}_L\left( f_i \right) }_{m=0}P_m^\text {L} + \sum ^{M^{p}_R\left( f_i \right) }_{n=1}P_n^\text {R} \right) \nonumber \\{} & {} \qquad + \frac{\gamma }{N^\text {D}} ({\mathcal {F}}({\textbf{d}})-\epsilon \mathbf {\mathbbm {1}_{N^\text {D}}})^T(\mathbf {\mathbbm {1}_{N^\text {D}}}-{\mathcal {F}}({\textbf{d}})), \end{aligned}$$
(20)

where Q is the number of frequencies subject to optimization, \(\mathbf {\mathbbm {1}_{N^\text {D}}}\) is the \(N^\text {D} \times 1\) vector with all entries equal to 1, and \({\mathcal {A}} = \{{\textbf{d}}\in {\mathbb {R}}^{N^\text {D}} \mid \epsilon \le d_i \le 1 \,\forall \, i \}\) is the set of admissible designs. The scaling constant \(Q^{-1}\) is neglected in expression (20) because the scaling between the primary objective function and the quadratic penalty term can also be tuned using the penalty parameter \(\gamma\). Note that by increasing the number of frequencies subject to optimization Q, we can improve the approximation used to discretize objective function (18). To solve optimization problem (20), we utilize the least squares formulation of the MMA approach described by Svanberg (1987, 2002). Thus, we need the sensitivity information for each part of the objective function in optimization problem (20). The sensitivities of the quadratic penalty term can be readily computed for a given filter \({\mathcal {F}}\). Computing the gradient of the power of outgoing modes with respect to the design variables is more challenging. Since the power of a propagating mode is proportional to the square of its amplitude, this requires the gradient of the amplitudes of the outgoing modes with respect to the design variables, as per Eqs. (14) and (15). The sensitivity computations are done analytically using the adjoint variable method, which is a powerful technique for computing the gradient of a function that depends on the solution of a partial differential equation with respect to the design variables. A detailed derivation of the design sensitivities is provided in the appendix.

The design problem has 40,000 design variables, that is, the number of elements in the design domain \(\Omega ^\text {D}\). We set the lower bound for the design variable as \(\epsilon =10^{-8}\), the filter radius to \(1\,\text {mm}\), and use a so-called continuation approach for the penalty parameter. That is, we solve problem (20) for a sequence of increasing penalty parameters \(\gamma _i = 10^{i}\), \(i=0,1,\ldots ,5\), using the previously computed solution as the initial design. The aim is to gradually move the optimizer’s focus from the acoustic performance of the device towards obtaining a black-and-white final design. This approach ensures an optimized layout with sharp solid–fluid boundaries, free of any intermediate values of the material indicator functions (Wadbro and Berggren 2006). To solve optimization problem (20), we need to consider a sufficiently large number of frequencies Q to get broadband acoustic performance. However, increasing the number of frequencies subject to optimization will increase the number of times we need to solve the state Eq. (6). Note that the finite element solver is the primary contributor to the computational costs in the optimizer. Consequently, increasing the number of frequencies will result in a significant increase in computational cost. Here, we consider three cases for the number of frequencies subject to optimization:

Case I:

Four equidistant frequencies in the targeted range; that is, 4 kHz, 8 kHz, 12 kHz, and 16 kHz.

Case II:

Seven equidistant frequencies in the targeted range; that is, 4 kHz, 6 kHz, ..., 16 kHz.

Case III:

Thirteen equidistant frequencies in the targeted range; that is, 4 kHz, 5 kHz, ..., 16 kHz.
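As an illustration only, the three frequency sets and the continuation schedule \(\gamma _i = 10^{i}\) described above can be written down in a few lines. The names `case_frequencies`, `continuation`, and `toy_solve` are hypothetical, and `toy_solve` is a placeholder standing in for one MMA solve of problem (20) at a fixed penalty parameter; it does not reproduce the actual optimizer.

```python
import numpy as np

F_MIN_KHZ, F_MAX_KHZ = 4.0, 16.0  # targeted frequency range stated in the text


def case_frequencies(q):
    """Return q equidistant frequencies (in kHz) covering the targeted range."""
    return np.linspace(F_MIN_KHZ, F_MAX_KHZ, q)


# Cases I, II, and III use 4, 7, and 13 equidistant frequencies, respectively.
cases = {"I": case_frequencies(4), "II": case_frequencies(7), "III": case_frequencies(13)}

# Continuation schedule: gamma_i = 10^i for i = 0, 1, ..., 5.
penalties = [10.0**i for i in range(6)]


def continuation(solve, d0, penalties):
    """Solve a sequence of penalized problems, warm-starting each from the last."""
    d = d0
    for gamma in penalties:
        d = solve(d, gamma)  # previous solution is the initial design
    return d


# Placeholder solver (assumption): records the penalty and returns the design unchanged.
visited = []


def toy_solve(d, gamma):
    visited.append(gamma)
    return d


d_final = continuation(toy_solve, np.full(4, 0.5), penalties)
```

The warm start is what distinguishes continuation from six independent solves: each penalty step begins from a design that already performs well acoustically.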

Fig. 3

Axisymmetric (left) and 3D (right) visualizations of the optimized designs in a Case I, b Case II, and c Case III

Note that the convergence criterion is based on the residual norm of the KKT (first-order optimality) conditions, together with a limit on the number of iterations in each penalty step. We use the number of evaluations, defined as the number of times the state Eq. (6) is solved, as a metric to compare the computational cost between different cases. Figure 3 shows the optimized design for all three cases. Figure 4 shows the performance of each of these designs, computed using expression (19). Figure 5 shows the convergence history for all three MMA cases, where two numerical approximations of the objective function (17) are plotted versus the number of evaluations: the naive approximation, which uses only the frequencies subject to optimization in each case, and a reference approximation, which uses 150 equidistant frequencies in the range 4 kHz to 16 kHz.

Fig. 4

Performance of the optimized designs in a Case I (red line), b Case II (dash-dotted teal line), and c Case III (dashed violet line)

Fig. 5

History of convergence in all three cases. The black line shows the approximation of the objective function (17) obtained by numerical integration using only the targeted frequencies in each case. The dash-dotted colored lines show the objective function approximated by numerical integration using 150 equidistant frequencies

Considering the observations from Figs. 4 and 5, the following conclusions can be drawn:

  • In Case I, where only a few frequencies within the targeted range are considered in the optimization, the overall broadband performance of the final layout is poor, with only a few peaks observed at the considered frequencies. Approximately 500 evaluations were required for convergence.

  • In Case II, by considering more frequencies to approximate objective function (17), a better overall performance is achieved for the optimized design. However, there are still dips in the frequency response, indicating weak broadband performance. In this case, Fig. 5 shows that the approximation of the objective function is still poor throughout the optimization. Moreover, approximately 2200 evaluations were required for convergence.

  • In Case III, a desirable broadband performance is achieved for the optimized layout. However, this improvement comes at the expense of an increase in computational cost, as evident from the approximately 2700 evaluations required to obtain the optimized layout. As illustrated in Fig. 5, the approximation of the objective function is significantly improved throughout the optimization compared to Case I and Case II.

These observations highlight two disadvantages of using a deterministic approach for this problem:

  1. To ensure an acceptable broadband performance, an increased number of frequencies needs to be included in the optimization process. This results in a higher number of evaluations and computational costs.

  2. Evaluating objective function (17) only at specific frequencies leads to designs that are tailored for those particular frequencies. As the number of frequencies considered in the optimization increases, the resulting designs exhibit numerous free-hanging parts, as depicted in Fig. 3. This phenomenon is characterized by an increasing number of small inclusions from Case I to Case III in Fig. 3.

In summary, while the MMA approach can achieve desirable broadband performance, it is important to consider the associated computational cost and the design’s specificity to the frequencies included in the optimization process. Furthermore, a key challenge in manufacturing the final designs depicted in Fig. 3 is the placement of free-floating solid inclusions in axially symmetric configurations. Note that these free-hanging parts in the axially symmetric case are in reality free-hanging rings, as demonstrated in the 3D visualization in Fig. 3. A simple solution is to support these inclusions with thin strings or bars attached to the outer tube, as suggested by Mousavi et al. (2023), who showed that these thin connecting parts have a negligible effect on the acoustic performance of the device. An alternative is to modify the objective function of the optimization problem to ensure the connectivity of the design inclusions. However, such an alteration would significantly restrict the design space available to the optimizer in the axially symmetric case.

3.2 SG approach

In contrast to MMA, SG does not require a computation of the integral over frequencies appearing in (18). Instead, in each iteration, we draw a random frequency \(f_n\in {\mathscr {F}}\) and evaluate the objective function gradient only for this specific choice. This gradient sample is then used as a search direction for the current optimization step. As a result, compared to MMA, an SG iteration consumes significantly less time, but generally provides a smaller improvement of the objective function.

While SG is typically used with a diminishing learning rate, we fix a constant learning rate and impose a shrinkage in step length directly through move limits. To be precise, in every iteration, the search direction is multiplied by the constant learning rate and afterward projected onto the set

$$\begin{aligned} \big \{ {\textbf{x}}\in {\mathbb {R}}^{N^\text {D}}\,:\, \Vert {\textbf{x}}\Vert _\infty \le C_n\big \}, \end{aligned}$$

where \(C_n={\mathcal {O}}(1/\sqrt{n})\). To end up with a black-and-white design, the final result is rounded, i.e., we apply an element-wise threshold operation

$$\begin{aligned} \alpha _i^{\text {final}} = {\left\{ \begin{array}{ll} \epsilon &{} \text {if }\alpha _i < \tfrac{1}{2}, \\ 1 &{} \text {else,} \end{array}\right. }\quad \text {for }i=1,\ldots ,N^\text {D}. \end{aligned}$$

This setup yielded the best performance for SG in our numerical experiments.
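The SG update described above (constant learning rate, projection of the step onto an \(\infty\)-norm ball of radius \(C_n\), final thresholding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the sketch assumes a minimization problem (the step is the negative gradient sample), the constant `c0` in \(C_n = c_0/\sqrt{n}\) and the learning rate are hypothetical values, clipping the iterate to \([\epsilon , 1]\) is an added assumption, and the toy gradient stands in for the acoustic sensitivity.

```python
import numpy as np

EPS = 1e-8  # lower bound epsilon for the design variables, as stated in the text


def move_limit(n, c0=0.1):
    """Diminishing move limit C_n = c0 / sqrt(n); c0 is a hypothetical constant."""
    return c0 / np.sqrt(n)


def sg_step(alpha, grad_sample, n, lr=1.0, c0=0.1):
    """One SG iteration with constant learning rate and move limits."""
    step = -lr * grad_sample              # assumed descent direction
    c_n = move_limit(n, c0)
    step = np.clip(step, -c_n, c_n)       # projection onto {x : ||x||_inf <= C_n}
    return np.clip(alpha + step, EPS, 1.0)  # assumption: keep iterate in [eps, 1]


def threshold(alpha):
    """Element-wise rounding to a black-and-white design."""
    return np.where(alpha < 0.5, EPS, 1.0)


# Toy run: the gradient sample below is NOT the acoustic sensitivity.
rng = np.random.default_rng(0)
target = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
alpha = np.full(5, 0.5)
for n in range(1, 501):
    f_n = rng.uniform(4.0, 16.0)             # random frequency sample in kHz
    grad = (f_n / 16.0) * (alpha - target)   # toy frequency-dependent gradient
    alpha = sg_step(alpha, grad, n)
alpha_final = threshold(alpha)
```

Projecting onto an \(\infty\)-norm ball reduces to element-wise clipping, which is why the move limit costs essentially nothing compared to a state solve.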

Due to the stochastic nature of SG, we performed 50 independent optimization runs with 500 iterations each. An overview of the observed performance range is found in Fig. 10. For the best final design obtained, the objective function evolution is shown in Fig. 6, while the final design is illustrated in Fig. 7. The corresponding performance is given in Fig. 8.

Fig. 6

Value of the objective function (17) for all intermediate SG iterates. Each value was approximated by numerical integration using 150 equidistant frequencies

Fig. 7

Axisymmetric (left) and 3D (right) visualizations of the optimized designs in a SG approach and b CSG approach

Fig. 8

Performance of the optimized designs in the SG approach (dash-dotted orange line) and in the CSG approach (blue line)

3.3 CSG approach

Similar to SG, the CSG search direction is based on stochastic samples of the full objective function gradient. However, since evaluating such a gradient sample is still computationally expensive, discarding all information after each iteration is rather wasteful. Therefore, in CSG, gradient samples \((g_i)_{i=1,\ldots ,n}\) from past iterations are stored. By calculating design-dependent integration weights \((\beta _i)_{i=1,\ldots ,n}\), the full objective function gradient is then approximated by the continuous stochastic model

$$\begin{aligned} \nabla J(\varvec{\alpha }_n) \approx \sum _{i=1}^n \beta _i g_i =: {\hat{G}}_n. \end{aligned}$$

For our numerical experiments, we choose so-called exact hybrid weights, since they are easily computable due to \({{\text {dim}}({\mathscr {F}})=1}\). More details on how these weights are computed in practice are given by Grieshammer et al. (2023a). Therein, it was also shown that the approximation error vanishes during the optimization process. That is,

$$\begin{aligned} \lim _{n\rightarrow \infty } \big \Vert \nabla J(\varvec{\alpha }_n) - {\hat{G}}_n\big \Vert = 0. \end{aligned}$$

As a consequence, CSG inherits strong convergence results from full gradient schemes, like convergence for constant learning rates and line search techniques, while retaining a low cost per iteration, since the integration weight computation is negligible compared to the numerical solution of the state equation.
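To make the weighted sum \({\hat{G}}_n = \sum _i \beta _i g_i\) concrete, the following sketch computes integration weights with a simple nearest-neighbor rule on a frequency grid. It is not the exact hybrid weighting of Grieshammer et al. (2023a): the function names, the grid resolution, and the distance (design distance plus frequency distance in kHz) are assumptions made purely for illustration.

```python
import numpy as np


def csg_weights(designs, freqs, current_design, f_min=4.0, f_max=16.0, n_grid=1200):
    """Approximate weights beta_i by nearest-neighbor assignment on a frequency grid.

    Each grid frequency f is assigned to the stored sample (alpha_i, f_i) closest
    to (current_design, f); beta_i is the fraction of grid points assigned to i.
    """
    designs = np.asarray(designs, dtype=float)   # shape (n, N)
    freqs = np.asarray(freqs, dtype=float)       # shape (n,)
    grid = np.linspace(f_min, f_max, n_grid)     # discretized frequency range
    design_dist = np.linalg.norm(designs - current_design, axis=1)
    # Assumed metric: ||alpha_n - alpha_i|| + |f - f_i| for every sample and grid point.
    total = design_dist[:, None] + np.abs(freqs[:, None] - grid[None, :])
    nearest = np.argmin(total, axis=0)
    return np.bincount(nearest, minlength=len(freqs)) / n_grid


def csg_gradient(weights, grad_samples):
    """Weighted sum of stored gradient samples, G_hat = sum_i beta_i * g_i."""
    return weights @ np.asarray(grad_samples, dtype=float)


# Toy data (hypothetical): three past designs, sampled frequencies, and gradients.
designs = [[0.0, 0.0], [0.1, 0.0], [0.1, 0.1]]
freqs = [5.0, 12.0, 9.0]
grads = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
beta = csg_weights(designs, freqs, current_design=np.array([0.1, 0.1]))
g_hat = csg_gradient(beta, grads)
```

In this simplified rule, samples taken at designs far from the current iterate receive little or no weight, which reflects the idea that old gradient information is reused only where it is still relevant.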

For a better comparison, we also choose a combination of constant learning rates and move limits for CSG in our experiments. This time, however, it is not necessary to pick diminishing move limits. Instead, we can adaptively choose these limits based on the progress achieved in the internal CSG model for the objective function.

Again, 50 independent optimization runs with 500 evaluations each were performed. The full overview of results is shown in Fig. 10. The objective function evolution of the run corresponding to the best final result obtained is given in Fig. 9. Therein, we also included the history of objective function approximations by CSG. These approximations indicate the quality of the underlying continuous–stochastic model, which is used internally in CSG. An illustration of the final design and the corresponding performance can be found in Figs. 7 and 8, respectively.

Fig. 9

CSG approximation to the objective function value (17) in each iteration (black). The dash-dotted blue line gives a different approximation to the objective function value, obtained using numerical integration with 150 equidistant frequencies

3.4 Impact of stochasticity

To better capture the probabilistic nature of SG and CSG, the transmission spectra of all 50 final designs were calculated. Afterwards, for each individual frequency \(f\in {\mathscr {F}}\), we determined the respective transmission quantiles \(P_{0.1,0.9}(f)\) and \(P_{0.25,0.75}(f)\). Here, \(P_{0.1,0.9}(f)\) denotes the range of transmission values achieved by all runs, where the highest 10 % and lowest 10 % of values are omitted. Likewise, \(P_{0.25,0.75}(f)\) indicates the range of transmission values obtained by 50 % of all runs, where the best 25 % and worst 25 % of results are neglected. Thus, the resulting quantile plots in Fig. 10 give a good impression of both the typical performance of a design obtained by SG or CSG and the variance in results, depending on the random sequence of sample frequencies. Note, however, that this form of representation results in smoother-looking spectra, since sharp peaks and other resonance effects are averaged out in the process.
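The per-frequency quantile bands can be computed directly from an array of spectra with one row per run. The sketch below uses synthetic random data in place of the 50 computed transmission spectra; the function name `quantile_bands` and the dictionary keys are hypothetical.

```python
import numpy as np


def quantile_bands(spectra):
    """Per-frequency median and quantile ranges over independent runs.

    spectra has shape (n_runs, n_frequencies); quantiles are taken over runs.
    """
    spectra = np.asarray(spectra, dtype=float)
    q = np.quantile(spectra, [0.1, 0.25, 0.5, 0.75, 0.9], axis=0)
    return {
        "p10_p90": (q[0], q[4]),   # P_{0.1,0.9}(f): middle 80 % of runs
        "p25_p75": (q[1], q[3]),   # P_{0.25,0.75}(f): middle 50 % of runs
        "median": q[2],
    }


# Synthetic stand-in for 50 runs evaluated at 150 frequencies (not real results).
rng = np.random.default_rng(1)
spectra = rng.uniform(0.0, 1.0, size=(50, 150))
bands = quantile_bands(spectra)
```

Because the quantiles are taken independently at each frequency, the band edges generally do not correspond to the spectrum of any single run, which is one reason the plotted curves look smoother than individual spectra.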

Fig. 10

Median transmission spectrum (solid lines) of all final designs after 500 iterations of CSG (blue) and SG (orange). The shaded areas indicate the quantiles \(P_{0.1,0.9}\) (light) and \(P_{0.25,0.75}\) (dark)

3.5 Discussion

To better compare the achieved performances for all methods, we introduce the Cumulative Performance Density function (CPD), defined as follows:

$$\begin{aligned} \begin{aligned} \textrm{CPD}(p)&= \frac{1}{\vert {\mathscr {F}}\vert }\int _{{\mathscr {F}}} \chi _{[0,p]}\big (P_0^\text {R}(f)\big )\,\textrm{d}f \\&= {\frac{\left| \left\{ f\in {\mathscr {F}}\,:\, P_0^\text {R}(f)\le p\right\} \right| }{\vert {\mathscr {F}}\vert }} \end{aligned} \end{aligned}$$
(21)
Fig. 11

Cumulative performance density (CPD) curves, see (21), for all final designs in Figs. 3 and 7 as well as the empty design domain. For a given final design, the cumulative performance density curve directly indicates two measures of quality: the objective function value (area under the graph) and the median performance (intersection with horizontal line at \(y=0.5\))

By construction, \(\textrm{CPD}(p)\) provides the fraction of frequencies in our frequency range \({\mathscr {F}}\) for which the performance is at most the given threshold \(p\in [0,1]\). For example, if a design satisfies \(\textrm{CPD}(0.5)=0.25\), its performance is less than or equal to 0.5 for \(25\,\%\) of all considered frequencies. Thus, an ideal design, which has a perfect performance of 1 for all \(f\in {\mathscr {F}}\), satisfies

$$\begin{aligned} \textrm{CPD}(p) = {\left\{ \begin{array}{ll} 0, &{} p<1, \\ 1, &{} p=1.\end{array}\right. } \end{aligned}$$

Furthermore, the objective function value (17) of a design is obtained by integrating \(\textrm{CPD}\) over the interval [0, 1], that is,

$$\begin{aligned} J_p(\varvec{\alpha }) = \int _0^1 \textrm{CPD}(p)\,\textrm{d}p. \end{aligned}$$

As a consequence, CPD allows for a more detailed analysis of the final designs than a plain comparison of objective function values. In particular, even if two designs have identical average performance, they can still differ in CPD, e.g., if one design performs roughly equally well for all considered frequencies, while the other has strong oscillations in the performance spectrum.
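For a spectrum sampled at equidistant frequencies, definition (21) reduces to counting samples, as the following sketch shows. The function names and the toy spectra are hypothetical. For performance values in [0, 1], the area under the CPD curve equals one minus the mean performance, which the sketch checks numerically; how this relates to (17) depends on the normalization of the objective, which is defined outside this section.

```python
import numpy as np


def cpd(performance, thresholds):
    """Fraction of sampled frequencies whose performance is <= each threshold.

    Assumes the performance values are sampled at equidistant frequencies, so
    the sample fraction approximates the measure ratio in definition (21).
    """
    performance = np.asarray(performance, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (performance[None, :] <= thresholds[:, None]).mean(axis=1)


def area_under_cpd(performance, n_points=10001):
    """Trapezoidal approximation of the integral of CPD(p) over [0, 1]."""
    p = np.linspace(0.0, 1.0, n_points)
    y = cpd(performance, p)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(p)))


# Two toy spectra with identical mean but different spread (not real results).
flat = np.full(150, 0.7)
oscillating = np.where(np.arange(150) % 2 == 0, 0.4, 1.0)

median_flat = cpd(flat, [0.5])[0]        # share of frequencies with P <= 0.5
median_osc = cpd(oscillating, [0.5])[0]
area_flat = area_under_cpd(flat)
area_osc = area_under_cpd(oscillating)
```

The two toy spectra have the same mean, and therefore the same area under the CPD curve, yet their CPD values at \(p=0.5\) differ, which is exactly the kind of distinction the CPD is meant to expose.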

For each of the optimization approaches, the \(\textrm{CPD}\) of the corresponding final design is given in Fig. 11. Therein, we also included the \(\textrm{CPD}\) for the empty design domain (\(\alpha _h \equiv 1\)). As we can see, the final design obtained with MMA in Case I yields the worst performance overall, even falling behind the empty design domain. The final designs of SG and MMA Case II have comparable median performances. However, while the final SG design performs rather similarly for all frequencies, indicated by the sharp increase of CPD at \(\sim 0.9\), the MMA Case II design performs poorly for a wide range of frequencies, resulting in a much better final objective function value for SG. The best overall performance is achieved by the final design of MMA Case III, with the final CSG design performing slightly worse. However, recall that the associated numerical effort for MMA Case III is much higher (2704 state equation solutions) than for CSG (500 state equation solutions).

4 Conclusion

In this study, we presented the results of topology optimization applied to a broadband acoustic transition section. We compared the outcomes of a deterministic approach using the method of moving asymptotes (MMA) with two stochastic approaches: stochastic gradient (SG) and continuous stochastic gradient (CSG) methods.

In the case of the MMA approach, we found that achieving optimal broadband performance requires optimizing over an increased number of frequencies (Fig. 4). However, this comes at the cost of a significant increase in computational costs and results in designs with a higher number of free-hanging inclusions, which can negatively impact manufacturability (Fig. 3). On the other hand, the stochastic approaches, SG and CSG, offer a more computationally efficient alternative while still producing optimized designs with improved manufacturability. In particular, we observed that the CSG method outperforms the SG method, as evidenced by the median transmission spectrum of the final designs and the overall frequency response shown in the quantile plots (Fig. 10).

These findings highlight the potential of stochastic approaches in acoustic applications, especially when broadband acoustic performance is desired. By reducing computational costs and improving manufacturability, stochastic methods offer a promising alternative for optimizing acoustic systems in such applications.