1 Introduction

In photonics, topology optimization (TO) [1], typically driven by gradient-based methods, has emerged as a powerful tool for optimizing metasurfaces. TO and the adjoint method (AM) are often used together for this purpose [2,3,4,5,6,7,8]. The AM yields the gradient of the target function with respect to the design variables using a single adjoint solve, which makes it computationally efficient. However, these computations often involve evaluating certain components of the electromagnetic (EM) field at many nodes within the 2D or 3D computational domain, which makes them computationally intensive.

On the one hand, the finite-difference time/frequency-domain methods (FDTD/FDFD) and the finite element method (FEM) have been widely implemented for TO. Both approaches compute the values of the EM field components in direct space, so the computation of the gradient of the figure of merit (FOM) does not require additional effort. In contrast, spectral methods such as the Fourier modal method (FMM or RCWA) [9,10,11] and the polynomial modal method (PMM) [12] use basis functions to approximate the solution of the partial differential equations (PDEs) obtained from Maxwell’s equations. This makes them highly accurate and efficient, and particularly suitable for solving “smooth” problems in electromagnetism, such as metasurface diffraction. Another advantage of these methods is that they do not require the values of the EM field components at each point of the computational domain. Instead, spectral or weighted coefficients describe the EM response of the structure under study. Therefore, within the framework of spectral methods, computing the values of the EM field components for gradient-based TO is an additional effort.

On the other hand, in linear problems with a convex FOM, gradient information leads relatively easily to an optimal solution. However, the design of metasurfaces can involve non-convex, non-linear, and high-dimensional optimization problems. In these cases, gradient-free metaheuristic methods, which do not rely on gradient information to search for solutions in complex and large search spaces, are a natural alternative. Specifically, the slime mould algorithm (SMA) [13, 14] is a bio-inspired technique that has shown promising results in solving various optimization problems, including pathfinding and network design. However, its application to metasurface design, particularly for TO, has not yet been extensively explored.

In this paper, we propose to apply the SMA to the topology optimization of a 1D dielectric metagrating in the framework of a spectral method. The objective is to design a 1D metagrating consisting of a sequence of nanorods and air gaps with varying widths, optimized to efficiently deflect the incident plane wave power into a specified transmission order. The proposed SMA starts by randomly generating a set of widths for the nanorods and air gaps. A set of binary permittivity functions is defined for this sequence of nanorod/air-gap widths. However, since we focus on a TO method, a blurring process is applied to this initial population. This process transforms the discrete set of widths into a sequence of continuous permittivity functions. One of the novelties and key points of the proposed method is the choice of the filter used during this blurring process. Instead of using classical Gaussian or binary functions as filters, we introduce the Schwartz function. The shape of this function can be significantly modified by tuning some of its key parameters, providing more flexibility and control over the continuous permittivity function at each iteration of the optimization process.

This paper is organized as follows. In Sect. 2 we briefly recall the numerical tools required to understand the work presented in this paper, namely the slime mould algorithm (SMA) and its application to topology optimization. Section 3 is devoted to validity tests and applications. We demonstrate that coupling the SMA with TO provides an effective and robust optimization process, leading to an optimized design of the 1D dielectric metagrating that meets our specific transmission order requirements. The numerical results show that the proposed method is less sensitive to initial conditions than traditional gradient-based methods. Section 4 presents conclusions and future work.

2 Methods

2.1 The slime mold algorithm (SMA)

The SMA starts by randomly generating a population X of N individuals, each with dim bounded components. We denote by LB and UB the lower and upper bounds of the design variables. Each individual (component of the vector X) represents a slime mold, and dim is the dimension of the problem to be solved. In real life, setting these initial conditions for the slime mold model corresponds to a distribution of possible food source locations. In the context of the topology optimization (TO) problem at hand, this step involves generating potential desired structures, comprising a bounded and continuously defined set of permittivity functions. In the next step, the mold creates, deploys, or destroys venous structures in order to approach, wrap, and finally grasp food at the best location. The population update is driven by the objective function and by parameters related to the slime mold’s growth dynamics, such as the flux, the rate of tube reinforcement, and the decay rate. These parameters, namely the fitness weight W and the oscillation parameters \(v_a\) and \(v_c\), control the progression of the mold so as to speed up convergence and to avoid local optima as far as possible. In the mathematical model, at each iteration (t), the fitness S(n) is computed for each component \(X_n^{(t)}\), \(n \in \{1, 2,..., N\}\), of the vector \(X^{(t)}\). This fitness is then used to update the vector \(X^{(t)}\) as follows:

$$\begin{aligned} X^{(t+1)} = {\left\{ \begin{array}{ll} rand(UB - LB) + LB &{} rand< z\\ X_b^{(t)} + v_a^{(t)}(W X_A^{(t)} - X_B^{(t)}) &{} r<p \\ v_cX^{(t)} &{} r \ge p \end{array}\right. } \end{aligned}$$
(1)

where rand and r are random variables in [0, 1], and z is a parameter that sets the probability of random re-initialization. \(X_b^{(t)}\) denotes the best location found so far, while \(X_A^{(t)}\) and \(X_B^{(t)}\) are two individual locations randomly selected from the population at iteration (t). The parameter \(p = \tanh |S(n) - DF|\), with DF representing the best fitness obtained over all iterations. W is the weight of the slime mold, defined as \(W(SmellIndex(n)) = 1+r \log \left[ (b_F- S(n))/(b_F- w_F) + 1\right]\) if \(n\le N/2\) and \(W(SmellIndex(n)) = 1-r \log \left[ (b_F- S(n))/(b_F- w_F) + 1\right]\) if \(n > N/2\), with \(SmellIndex = sort(S)\), where \(b_F\) and \(w_F\) are the best and worst fitness values in the current iteration. The oscillation parameters are drawn as \(v_a^{(t)}\in [-a^{(t)}, a^{(t)}]\) and \(v_c^{(t)}\in [-c^{(t)}, c^{(t)}]\), with \(a^{(t)}=\textrm{arctanh}\left( 1-t/T\right)\) and \(c^{(t)}=1-t/T\), T being the total number of iterations.
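As an illustration only, one population update of Eq. (1) can be sketched in NumPy as follows. Several choices here are assumptions that the text does not fix: the function name `sma_step` and its arguments are hypothetical, `x_best` stands for \(X_b^{(t)}\), the default `z=0.03` is a placeholder, the logarithm is taken in base 10, the random number r is drawn per component, lower fitness is treated as better, and the result is clipped to [LB, UB].

```python
import numpy as np

def sma_step(X, S, x_best, best_fitness, t, T, lb, ub, z=0.03, rng=None):
    """One SMA update of an (N, dim) population X with fitness S (lower is better).

    Requires 1 <= t <= T so that arctanh(1 - t/T) stays finite.
    """
    rng = np.random.default_rng() if rng is None else rng
    N, dim = X.shape

    # SmellIndex = sort(S): rank the individuals, best first
    order = np.argsort(S)
    bF, wF = S[order[0]], S[order[-1]]
    span = bF - wF
    ratio = (bF - S[order]) / span if span != 0 else np.zeros(N)

    # Fitness weight W: "+" for the better half (rank <= N/2), "-" for the rest
    term = rng.random((N, dim)) * np.log10(ratio + 1.0)[:, None]
    sign = np.where(np.arange(1, N + 1) <= N / 2, 1.0, -1.0)[:, None]
    W = np.empty((N, dim))
    W[order] = 1.0 + sign * term

    # Oscillation parameters v_a in [-a, a] and v_c in [-c, c]
    a = np.arctanh(1.0 - t / T)
    c = 1.0 - t / T
    v_a = rng.uniform(-a, a, size=(N, dim))
    v_c = rng.uniform(-c, c, size=(N, dim))
    p = np.tanh(np.abs(S - best_fitness))[:, None]

    # Branches 2 and 3 of Eq. (1), selected by r < p
    A = rng.integers(0, N, size=N)
    B = rng.integers(0, N, size=N)
    approach = x_best + v_a * (W * X[A] - X[B])
    contract = v_c * X
    X_new = np.where(rng.random((N, dim)) < p, approach, contract)

    # Branch 1 of Eq. (1): random re-initialization with probability z
    restart = rng.random(N) < z
    X_new[restart] = lb + rng.random((int(restart.sum()), dim)) * (ub - lb)

    return np.clip(X_new, lb, ub)
```

In a TO setting, each row of `X` would hold the Q sampled permittivity values of one candidate, with `lb` and `ub` set to \(\varepsilon _{min}\) and \(\varepsilon _{max}\).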

2.2 A Topology optimization method based on the SMA

In the topology optimization approach proposed here for a 1D metagrating, the algorithm starts by generating a vector X of size \(N \times Q\). At iteration (t), each component of X holds the values of a permittivity function, \(X_n^{(t)}=[\varepsilon ^{(t)}_n(x_q)]_{q\in [1,Q]}\), which varies continuously between \(\varepsilon _{min}\) and \(\varepsilon _{max}\). Q is the number of points \(x_q\) in the computational domain at which each function \(\varepsilon _n\) is evaluated, while N is the number of initial candidates. To generate these initial candidates for a 1D structure (Fig. 1a), we first consider binary structures composed of \(N_p\) nanorods and air gaps in total. Let \([e_k]_k\) denote the sequence of nanorod and air-gap widths. A binary function \(\widetilde{\varepsilon }(x)\) is constructed on this set of subintervals \((e_k)_k\). From \(\widetilde{\varepsilon }(x)\), a density function is defined as \(\widetilde{\rho }(x)= (\widetilde{\varepsilon }(x)-\varepsilon _{min})/(\varepsilon _{max}-\varepsilon _{min})\). Topology optimization requires continuous functions as a starting point, whereas \(\widetilde{\varepsilon }\) and \(\widetilde{\rho }\) are binary functions of the variable x. Therefore, in a second step, a continuous function is created by convolving \(\widetilde{\rho }\) with a distribution: \(\widehat{\rho }(x)= \sum _q \mathcal {F}_q(x) \widetilde{\rho }(x_q)/\sum _q \widetilde{\rho }(x_q)\). Here, the Schwartz profile is used in this blurring process since it provides more degrees of freedom for a dynamic blurring scheme than the commonly used Gaussian or binary functions. The Schwartz function is defined as:

$$\begin{aligned} \mathcal {F}_q(x)= {\left\{ \begin{array}{ll} \exp \left[ b-\dfrac{bl^2}{l^2-4 (x-x_q)^2} \right] , &{} |x-x_q| \le .5l \\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(2)

The parameters l and b jointly control the blur radius, or spatial correlation length, and thereby the length scale that may be enforced by manufacturing constraints. Figure 1b and c show how the parameters b and l modify the shape of the Schwartz profile. These figures show the Schwartz profile function \(\mathcal {F}_0(x)\) for \(x\in [-1,1]\) and \(x_0=0.5\). As shown in these figures, a high value of l and a low value of b both lead to a large spatial filter band. Note that \(b=0\) corresponds to the limiting case of a binary filter function.
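A minimal NumPy sketch of the Schwartz filter of Eq. (2) and of the blurring sum is given below, for illustration. The function names are hypothetical, the strict inequality \(|x-x_q| < l/2\) is used to avoid the vanishing denominator at the edge of the support, and the normalization follows the expression for \(\widehat{\rho }\) as printed above (many TO density filters normalize by the sum of the filter weights instead).

```python
import numpy as np

def schwartz_filter(x, x_q, l, b):
    """Eq. (2): bump of width l centred at x_q; b shapes the profile."""
    u = np.asarray(x, dtype=float) - x_q
    inside = np.abs(u) < 0.5 * l          # strict: denominator is zero at the edge
    out = np.zeros_like(u)
    denom = l**2 - 4.0 * u[inside]**2
    out[inside] = np.exp(b - b * l**2 / denom)
    return out

def blur_density(rho_binary, x, l, b):
    """rho_hat(x_i) = sum_q F_q(x_i) rho(x_q) / sum_q rho(x_q)."""
    # F[i, q] = F_q(x_i)
    F = np.stack([schwartz_filter(x, x_q, l, b) for x_q in x], axis=1)
    return F @ rho_binary / rho_binary.sum()
```

With `b = 0` the exponential equals 1 everywhere inside the support, which reproduces the binary (rectangular) filter mentioned in the text; increasing `b` concentrates the weight around `x_q`.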

Fig. 1
a Sketch of the 1D metagrating to be designed. b and c: Sketch of the Schwartz profile function \(\mathcal {F}_0(x)\) for \(x\in [-1,1]\), \(x_0=0.5\). In Fig. 1b the parameter l is set to \(l=1\) while \(b\in \{0,0.1,0.5,5,20 \}\). In Fig. 1c the parameter b is set to \(b=2\) while \(l\in \{0.5,1,2 \}\). A high value of l and a low value of b both lead to a large spatial filter band. Note that \(b=0\) corresponds to the limiting case of a binary filter function

The binarization step consists of associating a density function \(\widehat{\rho }\) with the blurred density and updating the permittivity function over the iterations to produce a device consisting of a piecewise constant dielectric function. In this step, a projection scheme is used to recreate a new density profile \(\bar{\rho }\) from the blurred density as:

$$\begin{aligned} \bar{\rho }_i= {\left\{ \begin{array}{ll} \eta e^{-\beta (\eta -\widehat{\rho }_i) /\eta }- \left( \eta -\widehat{\rho }_i\right) e^{-\beta } &{} \text {if } 0\le \widehat{\rho }_i \le \eta \\ \ 1-(1-\eta ) e^{-\beta (\widehat{\rho }_i-\eta ) / (1-\eta ) }-\left( \eta -\widehat{\rho }_i \right) e^{-\beta } &{} \text {if } \eta \le \widehat{\rho }_i \le 1 \end{array}\right. } \end{aligned}$$
(3)

Here, \(\beta\) controls the strength of the projection and \(\eta\) is the projection threshold. The final continuous permittivity function is defined from the density \(\bar{\rho }\) as \(\varepsilon (x)= (\varepsilon _{max}-\varepsilon _{min})\bar{\rho }(x)+\varepsilon _{min}\). Both the blurring and binarization effects are illustrated in Fig. 2. This figure displays an initial binary function \(\widetilde{\varepsilon }\) and several blurred functions \(\varepsilon\) obtained for different values of the filtering parameters b and l. First, a set of \(N_p=5\) real values \((e_k)_k\), \(k \in \{1,2,3,4,5\}\), is randomly generated. These variables are constrained by the relation \(\sum _k e_k=d\), d being the width of the computational domain. The binary function \(\widetilde{\varepsilon }\), which takes the values 1 and \(3.6082^2\) on the interval [0, d], is then constructed. Here d is set to \(d=0.9 \mu m\). In Fig. 2, \(\widetilde{\varepsilon }\) is displayed as gray bars. The continuous function \(\varepsilon\) obtained after the blurring process is also displayed for different values of the parameters l, b, and \(\beta\). The filtering parameters l and b act on the correlation length of the spatial filtering process. As can be seen in Fig. 2, the maximum peak-to-valley separation can be strongly modulated by changing the correlation length through the parameters l and b. Hence a very large range of roughness levels of the initial profile can be swept by dynamically controlling the values of b and l over the iterations.
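The projection of Eq. (3) and the mapping from density to permittivity can be sketched as follows. This is an illustrative implementation only: the function names are hypothetical and the default `eta=0.5` is an assumed placeholder, since the threshold value is not stated in the text.

```python
import numpy as np

def project(rho_hat, beta, eta=0.5):
    """Eq. (3): push a blurred density in [0, 1] towards 0 or 1."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    # Clamp each branch to its own interval so the exponents stay <= 0
    lo_arg = np.minimum(rho_hat, eta)
    hi_arg = np.maximum(rho_hat, eta)
    linear = (eta - rho_hat) * np.exp(-beta)
    low = eta * np.exp(-beta * (eta - lo_arg) / eta) - linear
    high = 1.0 - (1.0 - eta) * np.exp(-beta * (hi_arg - eta) / (1.0 - eta)) - linear
    return np.where(rho_hat <= eta, low, high)

def density_to_permittivity(rho_bar, eps_min, eps_max):
    """epsilon(x) = (eps_max - eps_min) * rho_bar(x) + eps_min."""
    return (eps_max - eps_min) * np.asarray(rho_bar, dtype=float) + eps_min
```

For \(\beta =0\) both branches reduce to \(\bar{\rho }=\widehat{\rho }\), so the projection is the identity, while for large \(\beta\) densities below and above \(\eta\) are driven towards 0 and 1 respectively. The end points 0, \(\eta\) and 1 are fixed points for every \(\beta\).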

Fig. 2
Illustration of the blurring and binarization effects. The maximum peak-to-valley separation can be strongly modulated by switching the correlation length through parameters l and b

In Fig. 2a, the parameter l is set to \(l=100 nm\), while in Fig. 2b \(l=200 nm\). \(\beta\) is set to 1 in both cases. Decreasing b from 20 to 0.1 for a given value of l increases the spatial filter bandwidth, resulting in a smoother profile. Likewise, for a given value of b, the profile is much smoother for higher values of l. The binarization parameter \(\beta\) can push the geometry towards a discontinuous one. The profile is highly binarized for a high value of \(\beta\), such as \(\beta =100\) in Fig. 2c. During the proposed iterative optimization process, while the values of the permittivity functions are updated at the different nodes, the filter parameters l and b are dynamically tuned in order to prevent the formation of narrow rays, occlusions, or particles in the device under optimization. Moreover, the dynamic control of the binarization parameter \(\beta\) makes it possible to push the design towards a fully binarized one. To do so, the following laws are used: \(b^{(t+1)}= b_0\left( 1-\dfrac{t}{T}\right) ^{\alpha }\), \(l^{(t+1)}=l_{max}-(l_{max}-l_{min})\dfrac{t-1}{T-1}\), and \(\beta ^{(t+1)}=\beta ^{(t)}+t\), with \(\beta ^{(1)}=0\). Here \(\alpha\) is set to 0.5. A flowchart of the proposed algorithm is shown in Fig. 3. The progression of the permittivity function profile is also depicted.
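The three update laws can be tabulated with a few lines of NumPy, shown here as a sketch. The function name and the numerical values of `b0`, `l_min` and `l_max` used in the usage line are placeholders, not values taken from the paper; only \(\alpha =0.5\) and \(\beta ^{(1)}=0\) come from the text.

```python
import numpy as np

def schedules(T, b0, l_min, l_max, alpha=0.5):
    """Values of b, l and beta at iteration t+1, for t = 1..T (requires T >= 2)."""
    t = np.arange(1, T + 1)
    b_next = b0 * (1.0 - t / T) ** alpha                   # decays from ~b0 to 0
    l_next = l_max - (l_max - l_min) * (t - 1) / (T - 1)   # linear from l_max to l_min
    beta_next = np.cumsum(t)                               # beta^(1) = 0, so beta^(t+1) = 1 + ... + t
    return t, b_next, l_next, beta_next

# Hypothetical settings (lengths in nm)
t, b, l, beta = schedules(T=100, b0=20.0, l_min=50.0, l_max=200.0)
```

Under these laws the recursion for \(\beta\) has the closed form \(\beta ^{(t+1)}=t(t+1)/2\), so the projection sharpens quadratically with the iteration count while the filter length l shrinks linearly and b decays to zero.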

Fig. 3
Flowchart of the proposed TO-SMA algorithm. The progression of the permittivity function profile is also depicted

3 Results and discussion

We now demonstrate the use of the proposed gradient-free TO-SMA method to inverse-design a 1D optical grating that deflects an incident TM-polarized plane wave into a desired diffraction angle \(\theta _d\). The expected optimal device is a one-dimensional metagrating consisting of Si nanorods (refractive index 3.6082) deposited on a SiO\(_2\) substrate (refractive index 1.45); see Fig. 1a. The grating height is set to \(h=325 nm\). The operating wavelength \(\lambda\) is set to \(0.9 \mu m\), and the deflection angle \(\theta _d\) is \(60^{\circ }\). To comply with standard fabrication techniques, a minimum width of 50 nm is imposed on both rods and air gaps within the optimization process. The transmitted efficiency diffracted in the desired direction \(\theta _d\), denoted \(T(\theta _d)\), is the figure of merit to be improved. At a given iteration (t), for a given geometry \(X^{(t)}_n\), the fitness is computed as \(S(n)=|1-T(\theta _d)|^2\). The numerical optimization of the structure is based on the Fourier modal method (FMM) [9,10,11], where the electromagnetic field components are approximated by \(2M+1=81\) Fourier harmonics. In the optimization process, we start with an initial relative permittivity in the design region. To perform the TO-SMA, first, a vector with components \((e_k)_{k\in [1:N_p]}\) satisfying the constraint \(\sum _{k=1}^{N_{p}} e_k=d\) is randomly generated. See an example in Fig. 4a, where \(N_p=9\). Second, a sequence of continuous functions is deduced, leading to a sequence of N blurred profiles. Figure 4b shows an example obtained from Fig. 4a. Third, the SMA is applied. The result of this phase is a nonrealistic profile with tiny features; see Fig. 4c. At the end of this first iteration, the blurring and binarization processes are applied, leading to the result displayed in Fig. 4d.
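The first step, drawing widths \(e_k\) that sum to d while respecting the minimum feature size, and the fitness \(S=|1-T(\theta _d)|^2\) can be sketched as below. The sampling scheme is an assumption (the text does not say how the constrained widths are drawn): here the slack \(d-N_p e_{min}\) is shared among the features with a flat Dirichlet distribution. The function names are hypothetical, and the profile is assumed to start with an air gap, which matches the counts quoted in the text (for example 4 nanorods and 5 air gaps for \(N_p=9\)).

```python
import numpy as np

def random_widths(n_p, d, e_min, rng=None):
    """Draw n_p widths with sum d and each width >= e_min (assumed sampling scheme)."""
    rng = np.random.default_rng() if rng is None else rng
    slack = d - n_p * e_min
    if slack < 0:
        raise ValueError("d is too small for n_p features of width e_min")
    return e_min + slack * rng.dirichlet(np.ones(n_p))

def binary_permittivity(widths, Q, eps_rod, eps_gap=1.0):
    """Sample the binary profile on Q pixel centres; even segments are air gaps."""
    d = widths.sum()
    x = (np.arange(Q) + 0.5) * d / Q
    edges = np.cumsum(widths)
    k = np.minimum(np.searchsorted(edges, x, side="right"), len(widths) - 1)
    return x, np.where(k % 2 == 1, eps_rod, eps_gap)

def fitness(T_theta_d):
    """S = |1 - T(theta_d)|^2, to be minimized."""
    return abs(1.0 - T_theta_d) ** 2
```

In a full loop, `T_theta_d` would come from an FMM solve of each candidate profile; that solver is outside the scope of this sketch.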

Fig. 4
From discontinuous to continuous profile: Blurring and binarization effects

Fig. 5
TO-SMA applied to metagrating optimization. a, b, c convergence of all 50 initial devices starting from 3, 4 and 5 nanorods respectively. d, e and f: Efficiencies histograms of optimized metagratings, for \(N_p=7\), \(N_p=9\), \(N_p=11\) respectively. g, h, i: convergence of best devices obtained from 3, 4, 5 nanorods respectively

Fig. 6
a, b, c: sketch of best devices starting from 3, 4 and 5 nanorods respectively

The SMA is executed for \(T=100\) iterations (\(t\in [0,100]\)) with a population of \(N=50\) candidates. Since the SMA can be sensitive to initial conditions, we perform 50 restarts with different randomly generated initial sets of vectors X to ensure robustness. The optimization performance for gratings starting from different initial conditions (\(N_p=7\), \(N_p=9\), and \(N_p=11\)) is shown in Fig. 5. Figure 5a, b, and c show the transmitted efficiency as a function of the iteration number (t) on the Y-axis for all 50 randomly generated initial profiles on the X-axis. The results show that the optimization process converges as the number of iterations increases. However, it is noteworthy that applying the TO-SMA method to initially randomly distributed nanorods tends to yield locally optimized devices. As shown in Fig. 5a, b and c, basins of local minima appear as red streaks, disrupting the yellow background of high efficiencies.

The chosen numerical example of a 1D periodic structure is significant because numerous works have treated it with different methods, which facilitates accurate and efficient comparisons. Specifically, reference [15] presents a variety of relevant works that employ different optimization methods, including gradient-based topology optimization methods, to design metagratings with geometric and physical parameters similar to those in our example. Figure 3C on page 5370 of that reference can then serve as a basis for comparing the results obtained by the TO-SMA and adjoint-based topology optimization methods. Comparing the TO-SMA results with those from [15] shows that TO-SMA exhibits lower sensitivity to initial conditions than gradient-based optimization methods. This behavior is consistent with what is typically expected from a global optimizer.

The presence of the red streaks (basins of local minima) in Fig. 5a, b and c is significantly influenced by the value of \(N_p\). These figures indicate that \(N_p = 9\) might be the best choice among these three examples for the current objective, at least in terms of avoiding getting stuck in local minima. This is confirmed by the results of Fig. 5d, e and f, which present histograms of the distribution (according to diffraction efficiency) of optimized metagratings for the three values of \(N_p\). The histogram corresponding to \(N_p=9\) is the narrowest, indicating that this parameter value seems to yield better results.

In Fig. 5g, h, and i, we depict the convergence of the algorithm as a function of iterations for the most efficient structures obtained with the three previous values of \(N_p\). To visualize various stages in the evolution of the structure’s topology throughout the optimization process, some key points with coordinates (t,eff) are displayed on these figures; t is the completed iteration, while eff is the diffracted efficiency of the design obtained at that iteration. In the case of \(N_p=7\), the algorithm initially stagnates in a local minimum. Between iterations \(t=1\) and \(t=22\), the topology of the permittivity function profile varies very little, with a transmitted efficiency of only \(64\%\). Then, the algorithm successfully escapes this trap. At iteration \(t=23\), a new topology is discovered that, though not realistic, is more efficient. The transmitted efficiency then abruptly increases from \(64\%\) to \(86\%\). From this point (t,eff)=\((23,86\%)\), a monotonic growth in efficiency accompanied by a regular evolution of the topology leads to convergence to the final result at the point (t,eff)=\((100,97.56\%)\).

Now, let us consider the case of \(N_p=9\). As shown in Fig. 5h, the convergence curve in this case is much smoother and more monotonic than in the previous case of \(N_p=7\) (Fig. 5g). No stagnation point is observed in this case. This is because the topology of the initial candidate is very similar to the topology of the final design, that is, an optimal structure made of 4 nanorods + 5 air gaps. The case of \(N_p=11\) is the least favorable. As shown in Fig. 5i, the convergence curve is much more rugged, with several stagnation points. These points correspond to basins of local minima that tend to trap the algorithm during iterations. However, it is essential to note that the algorithm manages to escape these traps by exhibiting a new topology. A completely new topology appears around the eighth and eighteenth iterations. After escaping this last basin of local minimum, the final design consisting of 4 nanorods finally emerges beyond \(t=35\) iterations. One might think that this optimal structure contains 5 nanorods, but this is not the case. Indeed, the structure is periodic, and the first and last rods are parts of the same pattern. This optimal structure is therefore also made up of 4 nanorods.

To understand why initial candidates with \(N_p=9\) yield the best results, let us examine Fig. 6a, b, and c. These figures display the highest-performance solutions obtained starting from \(N_p=7\), \(N_p=9\), and \(N_p=11\). We observe that, first, high-performance devices are achieved regardless of the initial \(N_p\) value. Second, despite starting the optimization process with three different values of \(N_p\), the final results in these three cases are similar. They all represent a design composed of 4 nanorods + 5 air gaps, with a transmission of \(97.58\%\) in the case \(N_p=7\), \(98\%\) in the case \(N_p=9\) and \(97.93\%\) for \(N_p=11\). These observations suggest that, despite different initial values of \(N_p\), the SMA tends to converge towards similar high-performance designs.

To emphasize the impact of the Schwartz function on the numerical results, we compare the performance of TO-SMA using the Schwartz filter with different values of the parameter \(b_0\) against that obtained with Gaussian and binary functions. The results are illustrated in Fig. 7. These studies are conducted with the same initial profile. Notably, when the parameter \(b_0\) is carefully chosen, the Schwartz filter converges rapidly towards high-performance solutions. This suggests that the filter is effective across diverse settings, provided that the shape parameter \(b_0\) is fine-tuned. These observations underscore the need for further exploration to deepen our understanding of how the parameters of the Schwartz function influence the optimization. Additionally, investigating alternative scenarios involving variations in \(b^{(t)}\) in conjunction with \(l^{(t)}\) and \(\beta ^{(t)}\) may provide valuable insights to enhance the overall performance of the method.

Fig. 7
Comparison of the performances of the TO-SMA using different filter functions. The results obtained using the Schwartz filter for different values of the parameter \(b_0\) are juxtaposed with those computed using Gaussian and binary functions

The computation times for the design of metagratings using:

  • The Fourier modal method (FMM) [10, 11] (in its conical incidence formulation), implemented with \(2\times 40 +1\) Fourier basis functions (the size of the eigenvalues equations matrix is then \(2(2\times 40 +1)\times 2(2\times 40 +1)\) )

  • N highly pixelated-continuous permittivity functions with \(2^8\) pixels and for T=100

are reported in Table 1. The computations were performed on a standard Dell Precision 7720 laptop with an Intel Core i7 processor (3.10 GHz). Although the SMA can be fully parallelized by evaluating the objective for the N geometries simultaneously, no parallel computing was used for the reported computation times.
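For orientation, the problem size implied by these settings can be worked out explicitly. The count of solver calls below is only an estimate: it assumes one FMM solve per candidate and per iteration, and takes \(N=50\) and \(T=100\) from the settings quoted earlier in this section.

$$\begin{aligned} n_{\textrm{eig}}&= 2(2\times 40+1) = 162 \quad \Rightarrow \quad \text {a } 162\times 162 \text { eigenvalue problem}, \\ N\times T&= 50\times 100 = 5000 \ \text {FMM solves per run (estimate)}. \end{aligned}$$

Since the N solves within one iteration are independent of one another, this is the part of the workload that parallel evaluation would distribute.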

Table 1 Computation times for the design of metagratings using the Fourier modal method (FMM) in its conical incidence formulation

4 Conclusions and outlook

In summary, a novel approach to metasurface design is presented, leveraging the benefits of the SMA within the framework of topology optimization (TO-SMA) and spectral modal methods. The proposed global-like optimization shows promising results in overcoming the limitations of gradient-based methods in the design of 1D metagratings. In designing 1D dielectric metagratings, the choice of the parameter \(N_p\) is crucial for the effectiveness of the SMA. We demonstrate that the SMA applied to TO with multiple restarts provides a robust approach to the optimization of 1D dielectric metagratings. However, the presence of local minima highlights the complexity of the optimization landscape, and achieving a globally optimized design may require further exploration of the search space and the introduction of additional techniques to definitively escape local optima. The proposed method still depends on the initial conditions, but much less so than gradient-based methods. Coupling the proposed TO-SMA with the gradient method, as suggested in [16], could improve the optimization process. Future research will extend the method to 1.5D and 2D metagratings.