1 FORMULATION OF THE PROBLEM

To improve the physical characteristics of waves at the outputs of wave devices with a characteristic size of about 1 μm by changing the shape of the devices, one must be able to predict these characteristics by computational simulation (i.e., to calculate electric fields and perform topological optimization of wave devices [1, 2]).

The solution of such problems is especially important in the design of integrated circuits (ICs) and their components, since practical measurement of the physical characteristics of each individual IC component can be very difficult and sometimes even impossible (for example, for photonic ICs) [3]. In addition, improving the physical characteristics without computational methods requires re-manufacturing the IC each time, which incurs additional material costs and environmental damage. Note also that minor changes in the design of an IC component can substantially affect the physical characteristics at the output of that component and of the IC itself [4, 5].

Such computational simulations are performed, for example, using Comsol [6], Synopsys (https://www.synopsys.com), MEEP [7], and alternative tools. However, the commercially available programs require significant computational resources to provide acceptable accuracy in the calculation of electric fields. A commercially available resource-efficient method to achieve the required computational accuracy is still missing [8].

In this work, we employ the Helmholtz equation [9], which is solved for IC components characterized by a distribution of permittivity ε that differs from the permittivity of the substrate (medium) \(\tilde {\varepsilon }\). The Helmholtz equation relates the total and initial fields \(u(r),\,\,w(r) \in \mathbb{C}\):

$$\left\{ \begin{gathered} \Delta u\left( r \right) + k_{0}^{2}u\left( r \right) = 0,\quad r \in {\mathbb{R}}^{2} \backslash \Omega ; \hfill \\ u\left( r \right) = 0,\quad r \in \partial \Omega \quad {\text{(Dirichlet boundary condition)}}; \hfill \\ \mathop {{\text{lim}}}\limits_{\left| r \right| \to \infty } \left( {i{{k}_{0}}\left( {u - w} \right)\left( r \right) - {{\partial }_{{\left| r \right|}}}\left( {u - w} \right)\left( r \right)} \right)\left| r \right| = 0 \quad {\text{(Sommerfeld radiation condition)}}; \hfill \\ \end{gathered} \right.$$

where Δ is the Laplace operator, r is the spatial vector, and \(k_{0}\) is the wave number.

We solve the Helmholtz equation in the form of a 3D integral equation based on Green's functions [9]. Discretization of the resulting equation then yields a system of linear equations (SLE) with a Toeplitz matrix \({{H}^{{\left( 2 \right)}}}\):

$$Ax \equiv x - k_{0}^{2}\left( {\varepsilon - \tilde {\varepsilon }} \right){{H}^{{\left( 2 \right)}}}\,m * x = F,$$

where \(x \in {{\mathbb{C}}^{{{{n}^{2}}}}}\) is the distribution of the electric field on a uniform mesh \(\tilde {r} \in {{\mathbb{R}}^{{n \times n}}}\) of a square 2D space Ω, “*” is the Kronecker product operator, \(m \in {{\{ 0,1\} }^{{{{n}^{2}}}}}\) is the binary mask of the distribution of the material of the photonic component in the design areas, \({{H}^{{\left( 2 \right)}}} \in {{\mathbb{C}}^{{n \times n}}}\) is the two-level Toeplitz matrix [10, 11], and \(F \in {{\mathbb{C}}^{{{{n}^{2}}}}}\) is the distribution of the electric field in the medium in the absence of the IC component.

Because \({{H}^{{\left( 2 \right)}}}\) is a Toeplitz matrix, the complexity of matrix–vector multiplications can be reduced from \(O({{n}^{2}})\) to \(O(n \log n)\) [9] using the fast Fourier transform and elementwise multiplication “⊙”:

$${{H}^{{\left( 2 \right)}}} * x = {\text{IFFT}}\left( {{\text{FFT}}\left( G \right) \odot {\text{FFT}}\left( x \right)} \right),$$

where \(G \in {{\mathbb{C}}^{{\left( {2n - 1} \right) \times \left( {2n - 1} \right)}}}\) is a special substitute of the Toeplitz matrix [12, 13].
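The FFT-based product can be illustrated with a minimal sketch. It is a simplified one-level (1D) analogue of the two-level case considered here, not the implementation used in this work. The function names `fft`, `ifft`, `toeplitz_matvec_fft`, and `toeplitz_matvec_direct` are hypothetical. The embedding vector `g` plays the role of G, but it is zero-padded to the next power of two not smaller than 2n − 1 (an assumption made only so that a short radix-2 FFT suffices).

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def ifft(a):
    """Inverse FFT with the usual 1/len(a) normalization."""
    n = len(a)
    return [v / n for v in fft(a, inverse=True)]


def toeplitz_matvec_fft(col, row, x):
    """Compute T x, where T[i][j] = col[i-j] for i >= j and row[j-i] for j > i."""
    n = len(x)
    size = 1
    while size < 2 * n - 1:          # assumption: pad to a power of two
        size *= 2
    g = [0j] * size                  # first column of the circulant embedding
    for k in range(n):
        g[k] = col[k]
    for k in range(1, n):
        g[size - k] = row[k]
    xp = list(x) + [0j] * (size - n)  # zero-padded input vector
    prod = [a * b for a, b in zip(fft(g), fft(xp))]  # elementwise product
    return ifft(prod)[:n]            # the first n entries are T x


def toeplitz_matvec_direct(col, row, x):
    """Reference O(n^2) product, used only to check the FFT version."""
    n = len(x)
    return [
        sum((col[i - j] if i >= j else row[j - i]) * x[j] for j in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    col = [1 + 2j, 3 + 0j, -1j, 0.5 + 0j, 2 - 1j]
    row = [1 + 2j, 2 + 0j, 4j, -1 + 0j, 0.25j]
    x = [1 + 0j, 1j, -2 + 0j, 0.5 + 0.5j, 3 + 0j]
    fast = toeplitz_matvec_fft(col, row, x)
    slow = toeplitz_matvec_direct(col, row, x)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

In the two-level case the same idea is applied with a 2D FFT over the embedded array. Only the embedding has to be stored, never the full matrix.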

In addition, we solve the SLEs using the generalized minimum residual method (GMRES), which requires fewer iterations than, for example, the stabilized biconjugate gradient method (BiCGSTAB). However, GMRES may require much more memory, especially when used without restarts, since in this case the Krylov basis of size k × n × n must be stored, where k is the number of GMRES iterations and n is the characteristic size of the SLE matrix [14].
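The memory cost mentioned above can be seen in a minimal matrix-free sketch of GMRES without restarts. This is a textbook variant (Arnoldi process with modified Gram–Schmidt and Givens rotations), not the CUDA or MKL code evaluated in this work. The names `gmres`, `matvec`, `tol`, and `max_iter` are hypothetical. The sketch assumes a zero initial guess and a nonsingular operator.

```python
import math


def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted GMRES for A x = b with zero initial guess.

    matvec(v) must return A v as a list. Returns (x, iterations).
    """
    n = len(b)
    beta = math.sqrt(sum(abs(v) ** 2 for v in b))
    if beta == 0.0:
        return [0j] * n, 0
    basis = [[v / beta for v in b]]  # Krylov basis: one new vector per iteration
    cols = []                        # columns of the upper-triangular factor R
    rot = []                         # Givens rotations (c, s), c real
    g = [complex(beta)]              # rotated right-hand side of the LSQ problem
    for j in range(max_iter):
        w = matvec(basis[j])
        h = []
        for v in basis:              # modified Gram-Schmidt orthogonalization
            hij = sum(vi.conjugate() * wi for vi, wi in zip(v, w))
            w = [wi - hij * vi for wi, vi in zip(w, v)]
            h.append(hij)
        hn = math.sqrt(sum(abs(wi) ** 2 for wi in w))
        h.append(complex(hn))
        for i, (c, s) in enumerate(rot):  # apply previous rotations
            h[i], h[i + 1] = (c * h[i] + s * h[i + 1],
                              -s.conjugate() * h[i] + c * h[i + 1])
        a, bb = h[j], h[j + 1]
        d = math.sqrt(abs(a) ** 2 + abs(bb) ** 2)
        if abs(a) == 0.0:
            c, s = 0.0, 1 + 0j
        else:
            c, s = abs(a) / d, (a / abs(a)) * bb.conjugate() / d
        rot.append((c, s))
        h[j] = c * a + s * bb        # new rotation zeroes the subdiagonal entry
        cols.append(h[: j + 1])
        g.append(-s.conjugate() * g[j])
        g[j] = c * g[j]
        if abs(g[j + 1]) <= tol * beta or hn == 0.0:  # |g[j+1]| = residual norm
            break
        basis.append([wi / hn for wi in w])
    k = len(cols)
    y = [0j] * k
    for i in range(k - 1, -1, -1):   # back substitution for R y = g
        acc = g[i] - sum(cols[m][i] * y[m] for m in range(i + 1, k))
        y[i] = acc / cols[i][i]
    x = [sum(y[i] * basis[i][p] for i in range(k)) for p in range(n)]
    return x, k


if __name__ == "__main__":
    A = [[4 + 1j, 1 + 0j, 0.5j], [1 + 0j, 5 + 0j, -1 + 0j], [0.5j, 2 + 0j, 6 - 2j]]
    b = [1 + 0j, 2j, 3 + 0j]
    mv = lambda v: [sum(A[i][q] * v[q] for q in range(3)) for i in range(3)]
    x, iters = gmres(mv, b)
    print(iters, max(abs(bi - yi) for bi, yi in zip(b, mv(x))))
```

The list `basis` grows by one vector of mesh size per iteration, which is the storage that restarts are meant to limit. As a rough estimate, assuming complex double-precision entries of 16 bytes each (the precision is not stated here), k basis vectors on an n × n mesh occupy 16kn² bytes, i.e., 384 MiB for k = 24 and n = 1024.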

Together with the above considerations, the complexity of operations, the number of iterations, and the required PC storage space determine the efficiency of a numerical method for solving the Helmholtz equation, i.e., for finding the electric field distribution at a fixed calculation accuracy.

The purpose of this work is to create a fast and efficient approach for calculating electric fields and to study its speed and efficiency.

2 NUMERICAL EXPERIMENTS AND RESULTS

We consider GMRES implementations using CUDA C++ and MKL C++. The code avoids “memcopy” operations and their analogues in order to increase speed. The performance was evaluated by calculating the electric field for the Mie scattering by a cylinder in a 2D problem. In this example, an electric field with plane wavefronts comes from infinity and is incident on the side surface of the cylinder with the wavefronts being perpendicular to the cylinder axis.

Calculations for the Mie problem were carried out for the following mesh sizes (n × n) (Table 1): 256 × 256, 512 × 512, 1024 × 1024, …, 8192 × 8192. For each mesh, the Mie problem was solved 100 times to calculate the error of the numerical solution relative to the analytical one and the average computation times for the parallelized code (CUDA) and the sequential algorithm (MKL). We divided the time of the sequential version by the time of the parallel version to determine the acceleration of the code due to parallelization. The CUDA calculations were performed using an Nvidia Tesla V100 GPU, and the MKL calculations were performed using a Xeon Gold 6140 CPU.

Table 1. Results of numerical solutions compared with the results of analytical solution for different matrix sizes

The results show that, in all cases, GMRES converges and reaches its maximum accuracy relative to the analytical solution for the Mie scattering by a cylinder at the 24th iteration.

3 CONCLUSIONS

Thus, we have found (Table 1) that increasing the mesh size leads to higher accuracy of the electric field calculations, but the computational time increases exponentially with the characteristic mesh size. Nevertheless, the acceleration of the code owing to parallelization tends to increase with the mesh size, because more GPU cores are engaged as the number of simultaneous calculations of matrix elements grows. The proposed approach will help to design new wave microdevices faster and at lower cost, to measure their physical characteristics, and to perform topological optimization of such microdevices.