On the efficient optimization of optical properties of particulate monolayers by a hybrid finite element approach

This article addresses the numerical simulation and optimization of the optical properties of mono-layered nano-particulate films. The particular optical property of interest is the so-called haze factor, for which a model close to an experimental setup is derived. The numerical solution becomes very involved due to the size of the computational domain in comparison to the wavelength, rendering the direct simulation method infeasible. As a remedy, a hybrid method is suggested, which in essence consistently combines analytical solutions for the far field with finite element-based solutions for the near field. Using a series of algebraic reformulations, a model with an offline and an online component is developed, which reduces the computational complexity by several orders of magnitude. Based on the suggested hybrid numerical scheme, which is not limited to the haze factor as objective function, structural optimization problems covering geometrical and topological parametrizations are formulated. These allow the influence of different particle arrangements to be studied. The article is complemented by several numerical experiments underlining the strength of the method.


Introduction
In this paper, we focus on the numerical simulation of light scattering by particle monolayers that consist of nonspherical and inhomogeneous particles. Furthermore, the haze factor is used to characterize the design of such thin films (Preston et al. 2013). It should be noted that experiments (see Bley et al. 2018) show that design motifs corresponding to a low or high haze factor of a particulate monolayer can be transferred to the design of a transparent and conductive thin film. Such films play an important role in many photonic applications, such as OLEDs (Hecht et al. 2011), touch screens (Layani et al. 2014) and solar panels (Atwater and Polman 2010).

Responsible Editor: Hyunsun Alicia Kim. Michael Stingl, stingl@math.fau.de, Lehrstuhl für Angewandte Mathematik (Kontinuierliche Optimierung), Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Cauerstraße 11, 91058 Erlangen, Germany.
In terms of numerical simulation, the challenge is that the haze factor is calculated by evaluating the scattered electromagnetic field on a hemisphere that is larger than the wavelength of visible light by several orders of magnitude. Hence, a classical numerical scheme based, e.g., on the finite element method (FEM) (Monk 2003) would result in a huge demand on computational resources or may even be out of reach. Alternative numerical approaches, including the T-matrix method (Waterman 1971; Mackowski 1994), are based on fundamental solutions of the time-harmonic Maxwell's equations. Here, the electromagnetic field is expressed as a series of the so-called vector spherical wave functions (VSWF). These so-called spectral methods are very efficient if the particles have a spherical shape and consist of homogeneous material. In contrast, the FEM can handle complex shapes and inhomogeneous material distributions but, as pointed out above, is strongly limited with respect to the size of the computational domain.
In this article we suggest a computational scheme which allows both the efficient and accurate evaluation of far field properties and the treatment of arbitrarily composed or shaped individual particles. Moreover, the scheme we propose is intended to be used in an optimization framework. As a consequence of the latter, the electromagnetic field has to be computed multiple times for an alternating order, orientation or choice of the individual particles in the particulate film. To keep the computational burden in an acceptable range, it is thus highly desirable to calculate as much information as possible in a precomputation step, giving rise to an offline-online approach. Apart from the T-matrix method, which shares these desired properties as long as the shape and composition of the individual particles is sufficiently simple, a first attempt to handle arbitrarily shaped particles has been made in Cui et al. (2011, 2012). The authors of these articles suggest using the finite element method to approximate the solution of the vector wave equation inside each particle. The coupling of these equations is then realized by linear operators based on a convolution with Green's function on the surface of each particle. These operators are used to assemble a boundary integral matrix via specialized boundary elements. In this way a flexible scheme is obtained, which allows the simulation of optical properties of particulate systems with arbitrarily distributed inhomogeneous particles. The method we suggest in this paper shows not only some similarities to this approach but also some differences, which are of particular importance in the optimization context. The main differences lie in the approximation of the exterior electric fields and in the way the exterior fields are coupled with the interior fields.
The hybrid method in this manuscript uses vector spherical wave functions to describe the electric field scattered by each particle (exterior fields), combined with appropriate finite element approximations of the electric field inside (a spherical neighborhood of) the particles. The coupling is then performed by imposing the tangential continuity of the electric and magnetic fields as a boundary equation. After using a standard Galerkin ansatz, a boundary integral matrix is obtained in the approach suggested in this article as well. Now, the first key feature of our method is that the size of the boundary integral matrix is fully determined by the number of particles in the system multiplied by the number of vector spherical wave functions used to approximate the exterior field. To give the reader an impression, typically fewer than 100 exterior degrees of freedom per particle are required for a fairly accurate approximation. Thus, a matrix of less than 10,000 × 10,000 in size is obtained for a system of 100 particles. Having said this, we would like to emphasize that the size of the boundary integral matrix is entirely independent of the level of discretization used for the finite element approximations of the interior fields.
Of course, the finite element discretization still plays a role in the assembly of the boundary element matrix. This is where the second key feature of our method comes into play, which is the decomposition of the computation into an online and an offline phase. In this article we will show that all finite element computations can be done in the offline phase.
Combining these two key features, the assembly of the boundary integral matrix and the solution of the corresponding system of linear equations in the online phase take only a fraction of a minute for systems with approximately 100 particles on a standard desktop computer. As our primary purpose is the optimization of particulate films, we consider this property of utmost importance. Although we are suggesting a gradient-based optimization method, typically on the order of 100 iterations are required for the solution of each optimization problem, which means 100 applications of the online part of the computations for one and the same offline phase. Beyond this, the offline properties for each individual particle type can be stored in an offline "catalogue," enabling their reuse for other parameter studies or optimization problems combining, e.g., different particle types or using a different number of particles in the overall system.
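To make the offline "catalogue" idea concrete, the following minimal Python sketch caches the expensive per-particle-type finite element result and reuses it across many online iterations. All names here (OfflineCatalogue, expensive_fem_block) are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

# Hypothetical offline "catalogue": the expensive finite element
# computation for a particle type runs once and is reused in every
# online iteration afterwards.

class OfflineCatalogue:
    def __init__(self):
        self._store = {}

    def get(self, particle_type, build):
        # build() performs the expensive offline computation;
        # it runs at most once per particle type.
        if particle_type not in self._store:
            self._store[particle_type] = build()
        return self._store[particle_type]

calls = 0
def expensive_fem_block():
    global calls
    calls += 1
    return np.eye(4)          # stand-in for the precomputed block

cat = OfflineCatalogue()
for _ in range(100):          # 100 online iterations...
    N = cat.get("particle_type_1", expensive_fem_block)
print(calls)                  # the offline work runs only once
```

The same pattern extends to storing several particle types under different keys, as suggested for the reuse across parameter studies.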
It is also important to understand that all of this holds not only for the haze factor, which is used as an example in our paper, but also for other properties that depend merely on the far field.
Finally, we would like to stress that the design problems considered in this article are, in general, nonconvex. As the emphasis in this article is on fast solvers and fast calculation of gradients, we do not consider global optimization approaches, which would be needed to guarantee globally optimal solutions. However, in the numerical section of this article, we resort to using random initial conditions to test the dependence on the initial design as well as to probe the design space for reasonably good designs.
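The multistart strategy alluded to above can be sketched in a few lines: run a gradient-based local solver (plain gradient descent here, standing in for the solvers used in the paper) from several random initial designs of a nonconvex toy objective and keep the best local minimizer found. The objective is purely illustrative, not the haze functional.

```python
import numpy as np

# Toy nonconvex objective and its gradient
def f(x):
    return np.sin(3 * x) + (x - 0.5) ** 2

def df(x):
    return 3 * np.cos(3 * x) + 2 * (x - 0.5)

def local_descent(x0, step=0.02, iters=500):
    # simple gradient descent as a stand-in for a local solver
    x = x0
    for _ in range(iters):
        x = x - step * df(x)
    return x

rng = np.random.default_rng(0)
starts = rng.uniform(-3, 3, size=20)      # random initial designs
best_x = min((local_descent(x0) for x0 in starts), key=f)
print(f(best_x) < -0.1)  # True: better than the shallower local minima
```

With enough random starts, at least one run lands in the basin of the best local minimum, which is exactly the probing of the design space described in the text.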
The next section begins with the description of the geometrical setting, governing equations and the mathematical formulation of the haze factor. After discretization of the equations involved, the spectral method and the finite element method are combined to form a hybrid method. Moreover, the efficient assembly and solution of the corresponding hybrid system is discussed. In the third section, the resulting numerical scheme is used to design particulate monolayers. In this context, positions, orientations and material types of the particles are optimized in order to maximize or minimize the haze factor.
Throughout this paper, we denote the imaginary unit by $\iota := \sqrt{-1}$. Furthermore, $A^*$ is the complex conjugate of the complex-valued variable $A \in \mathbb{C}^{n \times m}$ and $A^H = (A^*)^T$. For two vectors $u, v \in \mathbb{C}^n$, we define the scalar product by $u \cdot v^* = \sum_{i=1}^n u_i v_i^*$, in contrast to the standard scalar product for complex-valued vectors. For a function $f: X_1 \times \dots \times X_n \to Z$, we denote by $D_i f(x_1, \dots, x_n)[h]$ the directional derivative with respect to the $i$th argument in a direction $h \in X_i$. Unless otherwise stated, the differentials $dx$ and $dH^2$ indicate integration over a volume and a surface, respectively.
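Note that this convention conjugates the second argument, whereas, e.g., NumPy's np.vdot conjugates the first; a minimal check of the relation between the two:

```python
import numpy as np

# The paper's convention conjugates the *second* argument:
#   u . v* = sum_i u_i * conj(v_i).
# NumPy's np.vdot(a, b) conjugates the *first* argument, so the paper's
# product equals np.vdot(v, u).
u = np.array([1 + 1j, 2 - 1j])
v = np.array([0 + 2j, 3 + 0j])

paper_product = np.sum(u * np.conj(v))
print(np.isclose(paper_product, np.vdot(v, u)))  # True
```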

Simulation of the haze factor for particulate films
We consider particulate films composed of non-spherical particles. To simulate the optical properties of such films, we describe a hybrid approach. This approach combines the advantage of the spectral method, in which the exterior field is expressed by superposition of the vector spherical wave functions, with the advantage of the finite element method, which enables complex geometries and variable relative permittivities to be treated. To do this, each particle is embedded in a spherical domain, the so-called ghost particle, in which the shape of the original particle is described by a varying, piecewise constant relative permittivity (see Fig. 1). The introduction of ghost particles in the treatment of non-spherical particles can already be found in Gimbutas and Greengard (2013). There the interior fields are calculated based on scalar and vector potential theory, resulting in integral equations (e.g., Jackson 1999). In an alternative approach, we describe the interior field with finite element shape functions and derive a formulation that is suitable to be used in a gradient-based design optimization framework (see Section 4). In the remainder of this article, all ghost particles are assumed to be nonoverlapping with a positive (but possibly small) distance between them.
It should be noted that the concept of a hybrid formulation in the sense of coupled analytical or potential functions and finite element shape functions is also used in other fields. A fundamental introduction can be found in Zienkiewicz et al. (1977). Applications of this concept can be seen, e.g., in Gilbert et al. (1996) for linear and nonlinear elastic structures, in Kufner et al. (2018) for linear Timoshenko beam theory, in Yohannes et al. (2012) for fracture analysis and in Rodriguez and Valli (2009) for eddy current problems.

Governing equations and objective functional
We denote a given number of particles in the monolayer by $N_p \in \mathbb{N}$ and define the index set $I_p = \{1, \dots, N_p\}$. Moreover, let $\Omega^{(k)} \subset \mathbb{R}^3$ be a spherical domain with center point $\xi^{(k)} \in \mathbb{R}^3$ and radius $r^{(k)} > 0$ for all $k \in I_p$. On each $\Omega^{(k)}$ a particle is defined through a relative permittivity $\varepsilon^{(k)}: \Omega^{(k)} \to \mathbb{C}$, which is a complex-valued function of space. Furthermore, we assume that the subdomains $\Omega^{(k)}$ are disjoint, i.e., $\Omega^{(k)} \cap \Omega^{(l)} = \emptyset$ for all $1 \le k < l \le N_p$. The assembly of particles $\Omega = \Omega^{(1)} \cup \dots \cup \Omega^{(N_p)}$ is embedded in a vacuum and illuminated by an incident wave $E_I: \mathbb{R}^3 \to \mathbb{C}^3$ with wave number $\omega > 0$, which is related to the vacuum wavelength $\lambda$ by $\omega = 2\pi/\lambda$. Finally, $\nu^{(k)}$ denotes the outward unit normal vector of the boundary $\Gamma^{(k)} = \partial\Omega^{(k)}$ of each subdomain $\Omega^{(k)}$. With this, we state the governing equations based on the time-harmonic Maxwell's equations for the exterior scattered electric field $E_{\mathrm{ext}}: \mathbb{R}^3 \setminus \Omega \to \mathbb{C}^3$ and the interior total electric fields $E^{(k)}_{\mathrm{int}}: \Omega^{(k)} \to \mathbb{C}^3$, $k \in I_p$, as follows:

$$\operatorname{curl}\operatorname{curl} E^{(k)}_{\mathrm{int}} - \omega^2 \varepsilon^{(k)} E^{(k)}_{\mathrm{int}} = 0 \quad \text{in } \Omega^{(k)}, \qquad (2a)$$
$$\big(E_{\mathrm{ext}} + E_I - E^{(k)}_{\mathrm{int}}\big) \times \nu^{(k)} = 0 \quad \text{on } \Gamma^{(k)}, \qquad (2b)$$
$$\big(\operatorname{curl} E_{\mathrm{ext}} + \operatorname{curl} E_I - \operatorname{curl} E^{(k)}_{\mathrm{int}}\big) \times \nu^{(k)} = 0 \quad \text{on } \Gamma^{(k)}, \qquad (2c)$$

together with $\operatorname{curl}\operatorname{curl} E_{\mathrm{ext}} - \omega^2 E_{\mathrm{ext}} = 0$ in $\mathbb{R}^3 \setminus \bar\Omega$. In addition, we assume that the exterior field satisfies the Silver-Müller radiation condition (Monk 2003)

$$\lim_{|x| \to \infty} |x| \left( \operatorname{curl} E_{\mathrm{ext}} \times \frac{x}{|x|} - \iota\omega E_{\mathrm{ext}} \right) = 0. \qquad (3)$$

Outside the particle assembly $\Omega$, we define the energy flow $S_T$ of the total electric field $E_T = E_{\mathrm{ext}} + E_I$ and the energy flow $S_I$ of the incident field $E_I$ as

$$S_T = \frac{1}{2\omega} \operatorname{Re}\big(\iota E_T \times \operatorname{curl} E_T^*\big) \quad \text{and} \quad S_I = \frac{1}{2\omega} \operatorname{Re}\big(\iota E_I \times \operatorname{curl} E_I^*\big),$$

respectively, using the Poynting vector (Mishchenko 2014). The so-called haze factor describes the fraction of light that is scattered when passing through a transparent material. In this article the haze factor will be computed and optimized for various transparent particulate films. With the explanation above, one could say that the haze factor expresses how clear a particulate film is perceived by the human eye. To give some examples, a film with a haze factor close to 1 appears like frosted glass, as the light is transmitted through the film in a mostly diffuse manner.
In contrast, window glass transmits light almost entirely in a specular manner, resulting in a haze factor close to 0. In applications of transparent films, a very low haze factor is often desired. In order to simulate the haze of a transparent film in a computational environment, we use the following setup.
Let $S_\alpha = \{x \in \mathbb{R}^3 : |x| = R_{\mathrm{eval}},\ 0 \le x \cdot e_3 \le R_{\mathrm{eval}} \cos(\alpha)\}$ be the punctured hemisphere with opening angle $\alpha$, centered at the origin with radius $R_{\mathrm{eval}} > 0$ (see Fig. 2). For the evaluation of the haze factor of the sample $\Omega$, we assume that the sample is placed in a circular aperture with radius $w_a$ in the $x_1$-$x_2$ plane and illuminated from the negative $x_3$-direction. Then, the haze factor $H$ is calculated by the ratio (Preston et al. 2013)

$$H = \frac{T_4}{T_2} - \frac{T_3}{T_1}, \qquad (4)$$

with the two transmissions in presence of the sample,

$$T_4 = \int_{S_\alpha} S_T \cdot \nu \, dH^2 \quad \text{and} \quad T_2 = \int_{S_+} S_T \cdot \nu \, dH^2,$$

and the two calibration measurements (without sample)

$$T_3 = \int_{S_\alpha} S_I \cdot \nu \, dH^2 \quad \text{and} \quad T_1 = \int_{S_+} S_I \cdot \nu \, dH^2,$$

where $S_+$ denotes the full hemisphere.
These four quantities are evaluated when the haze factor is experimentally determined using a UV/VIS integrating sphere (International Organization for Standardization 1999). The intention of (4) is to determine the amount of energy of the total electric field that is diffusely scattered by the sample. Since the incident wave is truncated by the aperture, we assume that the incident field vanishes on the punctured hemisphere for a sufficiently large opening angle $\alpha$, which implies that $E_T \approx E_{\mathrm{ext}}$ on $S_\alpha$ and $T_3 \approx 0$. As a consequence of this assumption, we neglect the denominator $T_2$, skip the second term in (4) and thus focus on the energy of the scattered electric field, given by the formula

$$H_N(E_{\mathrm{ext}}) := \int_{S_\alpha} \frac{1}{2\omega} \operatorname{Re}\big(\iota E_{\mathrm{ext}} \times \operatorname{curl} E_{\mathrm{ext}}^*\big) \cdot \nu \, dH^2. \qquad (5)$$

We call (5) the numerical haze.
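Numerically, an integral of this type is a surface quadrature over the punctured hemisphere. The sketch below integrates a radial energy flux over the polar angles $\alpha \le \theta \le \pi/2$ with a midpoint rule on the unit sphere; the flux function is a toy stand-in for the Poynting flux of the scattered field delivered by a solver.

```python
import numpy as np

# Quadrature over the punctured hemisphere S_alpha on the unit sphere:
# integrate radial_flux(theta, phi) * sin(theta) dtheta dphi for
# alpha <= theta <= pi/2, 0 <= phi < 2*pi (midpoint rule in both angles).

def punctured_hemisphere_flux(radial_flux, alpha, n_theta=200, n_phi=200):
    dth = (np.pi / 2 - alpha) / n_theta
    dph = 2 * np.pi / n_phi
    theta = alpha + (np.arange(n_theta) + 0.5) * dth   # midpoints
    phi = (np.arange(n_phi) + 0.5) * dph
    TH, PH = np.meshgrid(theta, phi, indexing="ij")
    integrand = radial_flux(TH, PH) * np.sin(TH)       # surface measure dH^2
    return integrand.sum() * dth * dph

# toy flux: isotropic unit radial flux; the integral is then the area of
# S_alpha on the unit sphere, i.e. 2*pi*cos(alpha)
val = punctured_hemisphere_flux(lambda th, ph: np.ones_like(th), alpha=0.1)
print(abs(val - 2 * np.pi * np.cos(0.1)) < 1e-3)  # True
```

For a sphere of radius $R_{\mathrm{eval}}$ the surface measure picks up an additional factor $R_{\mathrm{eval}}^2$.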

Discretization
Each domain $\Omega^{(k)}$ is discretized by a finite element mesh; for ease of notation, $\Omega^{(k)}$ henceforth denotes the discretized domain. The interior electric field $E^{(k)}_{\mathrm{int}}$ is approximated with $N^{(k)}_f$ $H(\operatorname{curl})$-conforming edge-element shape functions $\varphi^{(k)}_i$, $i \in I^{(k)}_f$ (Nédélec 1986 or Andersen and Volakis 1998). With this, the discretized electric field on $\Omega^{(k)}$ is

$$E^{(k)}_{\mathrm{int}} \approx \sum_{i \in I^{(k)}_f} u^{(k)}_i \varphi^{(k)}_i \qquad (6)$$

with coefficient vector $u^{(k)} \in \mathbb{C}^{N^{(k)}_f}$. In the exterior of $\Omega$, the electric field $E_{\mathrm{ext}}$ is approximated by a superposition of truncated series of fundamental solutions of Maxwell's equations, the so-called vector spherical wave functions (VSWF) (see, e.g., Colton and Kress 2013). In the following, we associate such a truncated series with each particle $k \in I_p$. Furthermore, we denote the order of the truncated series of the $k$th particle by $L^{(k)}$, which then consists of $N^{(k)}_a = 2(2L^{(k)} + 1)$ degrees of freedom with coefficient vector $v^{(k)} \in \mathbb{C}^{N^{(k)}_a}$ and global shape functions $\{\psi^{(k)}_i : i \in I^{(k)}_a\}$. Each analytical shape function $\psi^{(k)}_i$ is constructed as a translation by the center point $\xi^{(k)}$ of a VSWF, which in turn is built from the spherical harmonics $Y_{lm}$ and the spherical Hankel functions of the first kind $h^{(1)}_l$ (Colton and Kress 2013) for $1 \le l \le L^{(k)}$, $-l \le m \le l$. Using these, the approximate exterior field reads as

$$E_{\mathrm{ext}} \approx \sum_{k \in I_p} \sum_{i \in I^{(k)}_a} v^{(k)}_i \psi^{(k)}_i. \qquad (7)$$

Next, we collect for all $k \in I_p$ the finite element shape functions $\varphi^{(k)}_i$ in spaces $U^{(k)}$ and the VSWF-based shape functions $\psi^{(k)}_i$ in spaces $V^{(k)}$. In the following, we derive from (2a) to (2c), as well as (6) and (7), a linear system of equations which has to be solved to obtain the coefficient vectors $u^{(k)}$ and $v^{(k)}$, $k \in I_p$. By testing (2a) with some function $\varphi^{(k)} \in U^{(k)}$ and using integration by parts, we obtain

$$\int_{\Omega^{(k)}} \operatorname{curl} E^{(k)}_{\mathrm{int}} \cdot \operatorname{curl} \varphi^{(k)*} - \omega^2 \varepsilon^{(k)} E^{(k)}_{\mathrm{int}} \cdot \varphi^{(k)*} \, dx + \int_{\Gamma^{(k)}} \big(\nu^{(k)} \times \operatorname{curl} E^{(k)}_{\mathrm{int}}\big) \cdot \varphi^{(k)*} \, dH^2 = 0.$$

We can now incorporate the interface condition (2c) by replacing $\nu^{(k)} \times \operatorname{curl} E^{(k)}_{\mathrm{int}}$ in the boundary term with $\nu^{(k)} \times (\operatorname{curl} E_{\mathrm{ext}} + \operatorname{curl} E_I)$. The remaining Dirichlet-type interface condition (2b) is transformed to an integral equation by testing on the surface $\Gamma^{(k)}$ with tangential vector fields given by $(\nu^{(k)} \times \operatorname{curl}\psi^{(k)}) \times \nu^{(k)}$ with $\psi^{(k)} \in V^{(k)}$ (and thus also $\operatorname{curl}\psi^{(k)} \in V^{(k)}$). Thus, (2b) results in

$$\int_{\Gamma^{(k)}} \big(\big(E^{(k)}_{\mathrm{int}} - E_{\mathrm{ext}} - E_I\big) \times \nu^{(k)}\big) \cdot \operatorname{curl}\psi^{(k)*} \, dH^2 = 0,$$

where we added the normal component of $\operatorname{curl}\psi^{(k)}$, which vanishes again in the product with a tangential field.
Now the weak formulation of (2a) reads: find $E^{(k)}_{\mathrm{int}}$ such that for all $\varphi^{(k)} \in U^{(k)}$

$$a^{(\Omega^{(k)})}\big(E^{(k)}_{\mathrm{int}}, \varphi^{(k)}\big) + b^{(\Gamma^{(k)})}\big(\nu^{(k)} \times (\operatorname{curl} E_{\mathrm{ext}} + \operatorname{curl} E_I), \varphi^{(k)}\big) = 0, \qquad (8)$$

where we have introduced the bilinear forms

$$a^{(D)}(f, g) := \int_D \operatorname{curl} f \cdot \operatorname{curl} g^* - \omega^2 \varepsilon^{(D)} f \cdot g^* \, dx \quad \text{and} \quad b^{(\partial D)}(f, g) := \int_{\partial D} f \cdot g^* \, dH^2$$

for a domain $D \subset \mathbb{R}^3$ with boundary $\partial D$, functions $f, g \in H(\operatorname{curl}; D)$ and the relative permittivity distribution $\varepsilon^{(D)}: D \to \mathbb{C}$.
Next, by inserting (6) and (7) into the bilinear forms of the weak formulation (8), we obtain the following system of linear equations:

$$A^{(k)} u^{(k)} + \sum_{l \in I_p} C^{(k,l)} v^{(l)} = F^{(k)}, \qquad B^{(k)} u^{(k)} + \sum_{l \in I_p} D^{(k,l)} v^{(l)} = G^{(k)}, \qquad k \in I_p. \qquad (9)$$

Here, the block matrices $A^{(k)}$, $B^{(k)}$, $C^{(k,l)}$ and $D^{(k,l)}$ are obtained by evaluating the bilinear forms $a$ and $b$ for the respective pairs of shape functions; similarly, the right-hand side vectors $F^{(k)}$ and $G^{(k)}$ collect the contributions of the incident field $E_I$. Finally, combining the corresponding blocks into the matrices $A$, $B$, $C$ and $D$ and vectors $u$, $v$, $F$ and $G$, respectively, we acquire the linear system of equations

$$K_f \begin{pmatrix} u \\ v \end{pmatrix} := \begin{pmatrix} A & C \\ B & D \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} F \\ G \end{pmatrix}, \qquad (10)$$

where $K_f$ is short hand for the full system matrix.
To compute the numerical haze (5), which depends only on the exterior field $E_{\mathrm{ext}}$, we can use the Schur complement trick to reduce (10) to $Kv = L$, with the reduced system matrix $K$ and right-hand side vector $L$ defined as

$$K = D - BA^{-1}C \quad \text{and} \quad L = G - BA^{-1}F. \qquad (11)$$

Based on the solution of this system, the numerical haze (5) is calculated by the algebraic expression

$$H_N \approx v^H H v,$$

with the matrix $H$ constructed in the same way as $D$ from block matrices $H^{(k,l)}_a$ for $k, l \in I_p$, whose entries are given by surface integrals of pairs of VSWF over the punctured hemisphere $S_\alpha$.
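The Schur complement reduction (11) can be sketched in a few lines of NumPy; random matrices stand in for the assembled blocks, and the reduced solution is checked against the full block system (10):

```python
import numpy as np

# Schur complement reduction of the hybrid block system
#   [A C; B D] [u; v] = [F; G]
# to the small exterior system K v = L with
#   K = D - B A^{-1} C,   L = G - B A^{-1} F.
rng = np.random.default_rng(1)
nf, na = 50, 6     # interior FEM dofs vs. exterior VSWF dofs

A = rng.standard_normal((nf, nf)) + nf * np.eye(nf)  # well conditioned
B = rng.standard_normal((na, nf))
C = rng.standard_normal((nf, na))
D = rng.standard_normal((na, na)) + na * np.eye(na)
F = rng.standard_normal(nf)
G = rng.standard_normal(na)

K = D - B @ np.linalg.solve(A, C)   # A^{-1}C: FEM solves, done offline
L = G - B @ np.linalg.solve(A, F)
v = np.linalg.solve(K, L)           # small online system

# consistency with the full block system
full = np.block([[A, C], [B, D]])
uv = np.linalg.solve(full, np.concatenate([F, G]))
print(np.allclose(v, uv[nf:]))  # True
```

Note that only the small $n_a \times n_a$ system is solved online; all solves with $A$ belong to the offline phase.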

Parametrization via rigid body transformations
Bearing in mind that the size of the reduced linear system for the VSWF coefficients $v$ is given as $\sum_{k \in I_p} N^{(k)}_a$ and the number of degrees of freedom $N^{(k)}_a$ for the approximation of the exterior field is small compared to the finite element degrees of freedom, the solution of the reduced linear system does not pose a major challenge for particle systems of reasonable size, even on a standard computer. However, looking at (11), the assembly of the reduced system matrix and right-hand side still requires, with $A^{-1}C$ and $A^{-1}F$, the solution of the finite element system for multiple right-hand sides. Taking into account a desired number of 100-1000 particles and, in particular, the large number of finite element degrees of freedom per particle required for a sufficiently fine resolution of the interior field, the assembly would still pose a computational challenge, even in a state-of-the-art high performance computing environment. This is our motivation to further reduce the computational complexity by several orders of magnitude. As we will see, this will be possible without a significant loss of accuracy compared to a direct solution of (9). The reduction will be based on a consistent exploitation of the structure of the VSWF. To achieve this, all matrices and vectors in (11) will be expressed by quantities that depend either solely on the relative permittivity, or solely on the position and orientation of the individual particles.
To ensure this is possible, we assume that each particle is the result of a rigid transformation, i.e., rotation and translation, of a reference domainΩ (k) centered at the origin with boundaryΓ (k) . Moreover, since Ω (k) is polyhedral after discretization, the reference domain is also assumed to be polyhedral and triangulated. In anticipation of the optimization tasks formulated later in this paper, it is interesting to note that the rigid transformation assumption is not restrictive as long as the arrangement or choice of given particles, rather than the shape or topology of the individual particles, is subject to optimization.
To put this into practice, we introduce orientations $\Theta = (\Theta^{(1)}, \dots, \Theta^{(N_p)})$ parametrized by rotation matrices $\Theta^{(k)} \in SO(3)$ for all $k \in I_p$, as well as spatial positions $\xi = (\xi^{(1)}, \dots, \xi^{(N_p)}) \in \mathbb{R}^{3 \times N_p}$ of the particles $\Omega^{(k)}$. Then, a rigid body transformation $T^{(k)}$ based on the translation vector $\xi^{(k)}$ and the rotation $\Theta^{(k)} \in SO(3)$ is defined as

$$T^{(k)}(x) := \Theta^{(k)} x + \xi^{(k)}.$$

Based on this, the subdomain $\Omega^{(k)}$ is given as $\Omega^{(k)} = T^{(k)}(\hat\Omega^{(k)})$. Moreover, the relative permittivity $\varepsilon^{(k)}$ on $\Omega^{(k)}$ is related to the reference relative permittivity $\hat\varepsilon^{(k)}: \hat\Omega^{(k)} \to \mathbb{C}$ by $\varepsilon^{(k)} = \hat\varepsilon^{(k)} \circ (T^{(k)})^{-1}$. To illustrate this, Fig. 3 depicts an assembly of particles which are identical in shape and material properties, and thus can be transformed from the same reference domain $\hat\Omega$ (Fig. 3: transformation of a common reference domain $\hat\Omega$ to individual particles $\Omega^{(k)}$ by rigid body transformations $T^{(k)}$, forming an assembly of three particles). At this point, the block matrices and vectors contributing to the reduced system matrix and right-hand side (11) have explicit dependencies on the reference domain $\hat\Omega^{(k)}$ with its boundary $\hat\Gamma^{(k)}$, the reference relative permittivity $\hat\varepsilon^{(k)}$, the position $\xi^{(k)}$ and the orientation $\Theta^{(k)}$; we write, e.g., $A^{(k)}(\hat\Omega^{(k)}, \hat\varepsilon^{(k)})$ and $B^{(k)}(\hat\Gamma^{(k)}, \Theta^{(k)})$. We see that the block matrix $A^{(k)}$ depends on the relative permittivity of the particle and on neither the orientation nor the translation; this will be explained a little later. By construction, all the other expressions do not explicitly depend on the material properties. The fact that $B^{(k)}$ does not depend on translations is explained by the observation that only one particle is involved and by its independence of the incident field. As $C^{(k,l)}$ and $D^{(k,l)}$ both depend on two particles $k$ and $l$, an explicit dependency on their relative position (and thus on the translations $\xi^{(k)}$, $\xi^{(l)}$) occurs. The expressions $F^{(k)}$ and $G^{(k)}$ again involve only one particle, but here the translational dependence is explained through the influence of the incident field.
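The rigid body parametrization $T^{(k)}(x) = \Theta^{(k)} x + \xi^{(k)}$ can be sketched directly; the following minimal example places a reference vertex cloud in the monolayer and illustrates that lengths are preserved.

```python
import numpy as np

# Rigid body placement of a reference particle: T(x) = Theta @ x + xi with
# Theta in SO(3), applied rowwise to a (n, 3) array of reference points.

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rigid_transform(Theta, xi, X):
    return X @ Theta.T + xi

X_ref = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
X = rigid_transform(rot_z(np.pi / 2), np.array([5.0, 0.0, 0.0]), X_ref)
print(X)  # [[5, 1, 0], [3, 0, 0]] up to rounding: rotated, then shifted
```

Since $\Theta \in SO(3)$, the transformed vertex cloud has the same pairwise distances as the reference one, which is exactly why shape and topology of the particle are untouched by this parametrization.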
Finally, all objects but A (k) depend on the orientation Θ (k) . We note that this dependency is solely due to the polyhedral shape of the ghost particles, and thus a consequence of discretization.
In the following, we investigate each of the block matrices and vectors above in detail. For an efficient assembly of the system matrix $K$ and right-hand side vector $L$, we try to express all matrices and vectors above in terms of objects depending either solely on the spatial positions of the particles, solely on their orientations, or solely on the finite element discretization including the relative permittivities. It should be noted that this will not be possible without introducing further approximations leading to additional geometrical and truncation errors. We will discuss the nature of these later in this article, but already note here that the large gain in computational complexity and the fact that these errors can be controlled by an appropriate choice of discretization parameters provide a good justification for this strategy.
We start with the matrices $A^{(k)}(\hat\Omega^{(k)}, \hat\varepsilon^{(k)})$. At first glance, it is not apparent that these matrices do not depend on translations and rotations. However, this can be explained as follows. Using $H(\operatorname{curl})$-conforming transformations (see Monk 2003), we obtain the following relations between the finite element shape functions $\hat\varphi^{(k)}_i$ associated with the reference domain and the shape functions $\varphi^{(k)}_i$:

$$\varphi^{(k)}_i(x) = \Theta^{(k)} \hat\varphi^{(k)}_i\big((T^{(k)})^{-1}(x)\big) \quad \text{and} \quad \operatorname{curl}\varphi^{(k)}_i(x) = \Theta^{(k)} \big(\operatorname{curl}\hat\varphi^{(k)}_i\big)\big((T^{(k)})^{-1}(x)\big).$$
Here, we used the fact that $\det \Theta^{(k)} = 1$ and the orthogonality property $(\Theta^{(k)})^T \Theta^{(k)} = I$, since $\Theta^{(k)} \in SO(3)$. Then, using integration by substitution, we observe

$$a^{(\Omega^{(k)})}\big(\varphi^{(k)}_i, \varphi^{(k)}_j\big) = a^{(\hat\Omega^{(k)})}\big(\hat\varphi^{(k)}_i, \hat\varphi^{(k)}_j\big).$$

Thus, the block matrices $A^{(k)}$ are indeed independent of the rigid body transformations. The major advantage of this is that the block matrices $A^{(k)}$ can be computed for the different particle types (defined through their relative permittivity distributions) before the actual translations and rotations are chosen. This is particularly interesting in an optimization framework in which the rigid body transformations serve as optimization variables (see, e.g., Section 4). In this case all finite element calculations can be performed in an offline phase, which is carried out prior to optimization. Furthermore, if the domains $\Omega^{(k)}$ are transformed from a common reference domain $\hat\Omega$ with different rigid body transformations $T^{(k)}$ as in Fig. 3, then the block matrices $A^{(k)}$ are identical and must be treated only once. Of course, this argument helps most if only a small number of particle types is used to construct a particulate film (see again Section 4.3). Next, we investigate the matrices $B^{(k)}(\hat\Gamma^{(k)}, \Theta^{(k)})$. We approximate these by the product of a rectangular matrix $\hat B^{(k)}$, assembled from the finite element shape functions $\hat\varphi^{(k)}_i$ on the reference domain and the VSWF $\psi_q$, $q \in I^{(k)}_a$, centered at the origin, and a square matrix $R^{(k)}$. The VSWF centered at the origin are independent of the particle in question and are thus denoted without a superscript. Note that by construction $\hat B^{(k)}$ is independent of the orientation of the $k$th particle. Next, let $S_{r^{(k)}}$ be the sphere with radius $r^{(k)}$ centered at the origin; then the $p$th column of the matrix $R^{(k)}$ is obtained by solving the projection problem

$$R^{(k)}_{\cdot,p} := \arg\min_{t \in \mathbb{C}^{N^{(k)}_a}} \Big\| \sum_{q \in I^{(k)}_a} t_q\, \nu \times \operatorname{curl}\psi_q - \nu \times \operatorname{curl}\psi_p\big(\Theta^{(k)}\,\cdot\big) \Big\|_{L^2(S_{r^{(k)}})}^2 \qquad (15)$$

with the outer unit normal vector $\nu$ of $S_{r^{(k)}}$. Obviously, $R^{(k)}(\Theta^{(k)})$ does not depend on the finite element mesh but rather on the orientation $\Theta^{(k)}$.
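The invariance argument can be illustrated numerically: under the transform $\varphi(x) = \Theta\,\hat\varphi\big((T)^{-1}(x)\big)$ with orthogonal $\Theta$, integrands such as $\varphi \cdot \varphi^*$ are unchanged pointwise, so integrals over the transformed domain match those over the reference domain. The sketch below checks this with sample points; the field $\hat\varphi$ is an arbitrary illustrative choice.

```python
import numpy as np

# Pointwise invariance under a rigid transform: |Theta @ f|^2 = |f|^2
# for orthogonal Theta, so sums (i.e. quadratures) over mapped points
# agree with those over the reference points.

rng = np.random.default_rng(3)

def phi_hat(X):                       # some reference vector field, (n, 3)
    return np.stack([X[:, 1], -X[:, 0], X[:, 2] ** 2], axis=1)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

Theta, xi = rot_z(0.9), np.array([2.0, -1.0, 0.0])
X_ref = rng.uniform(-1, 1, size=(1000, 3))   # sample points in ref domain
X_phys = X_ref @ Theta.T + xi                # mapped points T(x), where the
                                             # transformed field lives

phi_phys = phi_hat(X_ref) @ Theta.T          # phi(T(x)) = Theta @ phi_hat(x)
val_phys = np.sum(np.abs(phi_phys) ** 2)
val_ref = np.sum(np.abs(phi_hat(X_ref)) ** 2)
print(np.isclose(val_phys, val_ref))  # True
```

The same argument applied to the curl terms (which transform with the same $\Theta$) yields the claimed invariance of the full bilinear form $a$.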
In general, the matrix $R^{(k)}$ can be written as the solution of a linear system of equations: for each $p \in I^{(k)}_a$,

$$\sum_{q \in I^{(k)}_a} b^{(S_{r^{(k)}})}\big(\operatorname{curl}\psi_q, \operatorname{curl}\psi_s\big)\, R^{(k)}_{q,p} = b^{(S_{r^{(k)}})}\big(\operatorname{curl}\psi_p(\Theta^{(k)}\,\cdot), \operatorname{curl}\psi_s\big) \quad \text{for all } s \in I^{(k)}_a.$$

Here, we have utilized the properties of the VSWF on the surface of a sphere to pass to the bilinear form. Moreover, the expression $\operatorname{curl}\psi_p(\Theta^{(k)}\,\cdot)$ denotes the composition of the function $\operatorname{curl}\psi_p$ with the map $x \mapsto \Theta^{(k)} x$. We will use the same notation later in the paper to denote rigid transformations in a compact form.
To get deeper insight into the structure of $R^{(k)}_{\cdot,p}$, we can write the rotation matrix $\Theta$ in terms of the three Euler angles $\alpha$, $\beta$ and $\gamma$ such that

$$\Theta = \Theta_z(\alpha)\, \Theta_x(\beta)\, \Theta_z(\gamma). \qquad (16)$$

Here, $\Theta_z$ and $\Theta_x$ denote rotations around the $z$-axis and the $x$-axis, respectively. Furthermore, a rotation around the $x$-axis by an angle $\beta$ can be carried out by a rotation by an angle of $-\pi/2$ around the $y$-axis, followed by a rotation around the $z$-axis by $\beta$ and a rotation by $\pi/2$ around the $y$-axis, i.e.,

$$\Theta_x(\beta) = \Theta_y(\pi/2)\, \Theta_z(\beta)\, \Theta_y(-\pi/2). \qquad (17)$$

Thus, we obtain

$$\Theta = \Theta_z(\alpha)\, \Theta_y(\pi/2)\, \Theta_z(\beta)\, \Theta_y(-\pi/2)\, \Theta_z(\gamma).$$

By repeatedly projecting the basis functions onto a rotated coordinate frame using (15), we obtain $R^{(k)}$ in terms of the Euler angles $\alpha^{(k)}$, $\beta^{(k)}$ and $\gamma^{(k)}$ of the $k$th particle as

$$R^{(k)} = R_z(\alpha^{(k)})\, R_y(\pi/2)\, R_z(\beta^{(k)})\, R_y(-\pi/2)\, R_z(\gamma^{(k)}),$$

where $R_z$ maps $\delta \mapsto R^{(k)}(\Theta_z(\delta))$ and $R_y$ maps $\delta \mapsto R^{(k)}(\Theta_y(\delta))$. We shall later see that rotations around the $z$-axis result in a diagonal matrix $R^{(k)}_z$. Due to the structure of the VSWF, the entry $(R_y)_{pq}$ resulting from a rotation around the $y$-axis is only nonzero if $p$ and $q$ correspond to the same type of VSWF with the same degree $l$. In Colton and Kress (2013, proof of Thm. 2.9), this is attributed to the fact that an orthogonal matrix transforms a homogeneous harmonic polynomial of order $l$, here the associated Legendre polynomial, into a linear combination of homogeneous harmonic polynomials of the same order $l$. Consequently, the projection onto a rotated coordinate frame is exact, assuming exact integration. Furthermore, $R^{(k)}$ is independent of the choice of $r^{(k)}$, which can be proven by inserting the definition of the VSWF into (15) and straightforward calculus.
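The factorization of the rotation into $z$-rotations plus two fixed $y$-rotations can be verified numerically with standard rotation matrices; sign conventions for the $y$-rotation may differ from the paper's, but the conjugation structure is the same.

```python
import numpy as np

# Verify Theta_x(beta) = Theta_y(pi/2) @ Theta_z(beta) @ Theta_y(-pi/2)
# and the resulting z-y-z style factorization of the Euler decomposition.

def rot_x(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

alpha, beta, gamma = 0.3, 0.7, -1.1
lhs = rot_x(beta)
rhs = rot_y(np.pi / 2) @ rot_z(beta) @ rot_y(-np.pi / 2)
print(np.allclose(lhs, rhs))  # True

euler = rot_z(alpha) @ rot_x(beta) @ rot_z(gamma)
decomp = (rot_z(alpha) @ rot_y(np.pi / 2) @ rot_z(beta)
          @ rot_y(-np.pi / 2) @ rot_z(gamma))
print(np.allclose(euler, decomp))  # True
```

The practical point is that all angle-dependent factors are $z$-rotations, which act diagonally on the VSWF coefficients, while the two $y$-rotations are fixed and can be precomputed once.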
Returning to the definition of the matrix $B^{(k)}$, we use the transformation of the shape functions $\hat\varphi^{(k)}_i$ to the reference boundary $\hat\Gamma^{(k)}$ and obtain a representation (20) of the entries of $B^{(k)}$ as surface integrals over $\hat\Gamma^{(k)}$ that involve the rotated VSWF $\operatorname{curl}\psi_q(\Theta^{(k)}\,\cdot)$. Now, with the matrix $R^{(k)}$, we express the VSWF in the rotated coordinate frame in terms of the VSWF in the reference coordinate frame and obtain

$$\hat\nu \times \operatorname{curl}\psi_p\big(\Theta^{(k)}\,\cdot\big) \approx \sum_{q \in I^{(k)}_a} R^{(k)}_{q,p}\, \hat\nu \times \operatorname{curl}\psi_q, \qquad (21)$$

where $\hat\nu$ is the outer normal vector of $\hat\Gamma^{(k)}$. Here, we accept a geometrical error depending on the approximation of the polyhedral surface $\hat\Gamma^{(k)}$ by the sphere $S_{r^{(k)}}$. Inserting (21) into (20), the block matrix $B^{(k)}$ is approximated by the product of $R^{(k)}$ (in conjugate-transposed form, due to the sesquilinearity of the surface form) and $\hat B^{(k)}$ such that

$$B^{(k)} \approx R^{(k)H} \hat B^{(k)}. \qquad (22)$$

Note that the accuracy of this approximation depends on the error introduced by the quadrature rule used for the minimization problem (15) and on the geometrical error of the finite element boundary.
Next, the block matrix $C^{(k,l)}$ is approximated in such a way that the dependency on the finite element mesh, the orientation and the position of the $k$th particle, together with the position of the $l$th particle, is separated. First of all, we define for all $q \in I^{(l)}_a$ the function $\operatorname{curl}\psi^{(k,l)}_q: x \mapsto \operatorname{curl}\psi_q\big(x - (\xi^{(l)} - \xi^{(k)})\big)$ and introduce the rectangular matrix $T^{(k,l)} \in \mathbb{C}^{N^{(k)}_a \times N^{(l)}_a}$. In this, the $q$th column of $T^{(k,l)}$ is the solution of the minimization problem

$$T^{(k,l)}_{\cdot,q} := \arg\min_{t \in \mathbb{C}^{N^{(k)}_a}} \Big\| \sum_{p \in I^{(k)}_a} t_p\, \operatorname{curl}\psi_p - \operatorname{curl}\psi^{(k,l)}_q \Big\|_{L^2(S_{r^{(k)}})}^2.$$
An interpretation of this is as follows: we try to express the shifted VSWF $\psi^{(k,l)}_q$ by a linear combination of VSWF centered at the origin with coefficients $t$. This is a common task in VSWF-based methods. For instance, in Gumerov and Duraiswami (2007), the translation of the VSWF is performed by a series expansion, which has to be truncated for numerical application. The formulation as a minimization problem poses the translation as a projection onto a given finite-dimensional space. Similarly to the computation of $R^{(k)}$, a solution of this problem can be computed by solving, for each $q \in I^{(l)}_a$, the linear system of equations

$$\sum_{p \in I^{(k)}_a} b^{(S_{r^{(k)}})}\big(\operatorname{curl}\psi_p, \operatorname{curl}\psi_s\big)\, T^{(k,l)}_{p,q} = b^{(S_{r^{(k)}})}\big(\operatorname{curl}\psi^{(k,l)}_q, \operatorname{curl}\psi_s\big) \quad \text{for all } s \in I^{(k)}_a. \qquad (23)$$

Again, we have utilized the properties of the VSWF on the surface of a sphere to pass to the bilinear form.
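The structure of such projection problems, normal equations with a Gram matrix of basis functions on the evaluation surface, can be illustrated with a one-dimensional analogue: Fourier modes on the unit circle take the place of the VSWF on the sphere, and a shifted mode is projected back onto the origin-centered basis.

```python
import numpy as np

# 1D analogue of the projection problems (15)/(23): solve the normal
# equations G t = r with Gram matrix G_pq = <psi_q, psi_p> to express a
# *shifted* basis function through basis functions centered at the origin.

ks = np.arange(-3, 4)                        # modes e^{i k x}, k = -3..3
x = np.linspace(0, 2 * np.pi, 400, endpoint=False)
Psi = np.exp(1j * np.outer(ks, x))           # (n_modes, n_points)

shift = 0.4
target = np.exp(1j * 2 * (x - shift))        # shifted mode with k = 2

w = 2 * np.pi / x.size                       # quadrature weight
G = w * (Psi.conj() @ Psi.T)                 # Gram matrix (here diagonal)
r = w * (Psi.conj() @ target)                # right-hand side projections
t = np.linalg.solve(G, r)

# for a Fourier basis the translation is exact: a single phase factor
print(np.allclose(t @ Psi, target))  # True
```

In the Fourier case the "translation matrix" degenerates to a phase, $t_{k=2} = e^{-2\iota\,\mathrm{shift}}$; for VSWF the analogous system genuinely couples modes, which is what $T^{(k,l)}$ captures.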
Using the matrix $T^{(k,l)}$, the analytical shape functions shifted by the vector $\xi^{(l)} - \xi^{(k)}$ can now be expressed approximately in terms of the shape functions located at the origin as

$$\operatorname{curl}\psi^{(k,l)}_q \approx \sum_{p \in I^{(k)}_a} T^{(k,l)}_{p,q}\, \operatorname{curl}\psi_p \quad \text{on } S_{r^{(k)}}. \qquad (24)$$

Here, we again accept the geometric error between the boundary $\Gamma^{(k)}$ and the sphere $S_{r^{(k)}}$. In a second step, we use the matrices $\hat B^{(k)}$ and $R^{(k)}$ to apply the transformation to the reference domain, which yields the approximation (25) of $C^{(k,l)}$. The accuracy of this approximation depends on the geometrical approximation error of $S_{r^{(k)}}$ and on the truncation error of the analytical series expansion. It is possible to increase the accuracy by projecting the rotated basis functions onto a larger space, i.e., using a rectangular matrix $T^{(k,l)}$ with more rows than $N^{(k)}_a$. Of course, in this case, the matrices $R^{(k)}$ and $\hat B^{(k)}$ must be extended appropriately.
So far, we have derived an approximated version of $BA^{-1}C$ using the approximations (22) and (25); its $(k,l)$-block takes the form

$$\big(BA^{-1}C\big)^{(k,l)} \approx R^{(k)H}\, N^{(k)}\, R^{(k)}\, T^{(k,l)} \quad \text{with} \quad N^{(k)} := \hat B^{(k)} \big(A^{(k)}\big)^{-1} \hat C^{(k)}.$$
We observe that, by construction, in this expression only the matrix $N^{(k)}$ depends on the finite element discretization and the relative permittivity of the $k$th particle. Thus, all block matrices $N^{(k)}$ can be precomputed for each particle type and then translated and rotated to their final positions by carrying out the comparatively cheap translation and rotation operations. The latter are both given as matrix multiplications in a relatively low-dimensional space, as the size of the matrices involved is determined by the number of degrees of freedom of the analytical shape functions.
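The resulting offline-online split can be sketched schematically: the FEM-dependent matrix is computed once per particle type, and the online phase only performs small dense products with the rotation and translation matrices. Shapes, data and the exact contraction pattern below are illustrative.

```python
import numpy as np

# Schematic offline-online assembly of one block following the structure
# block = D_hat - R^H @ N @ R @ T, where only N involves the (large) FEM
# system and is precomputed per particle type.

rng = np.random.default_rng(2)
na, nf = 8, 500   # small VSWF space vs. large FEM space

# offline (expensive: one FEM solve per column of C_hat)
A = rng.standard_normal((nf, nf)) + nf * np.eye(nf)
B_hat = rng.standard_normal((na, nf))
C_hat = rng.standard_normal((nf, na))
N = B_hat @ np.linalg.solve(A, C_hat)       # na x na, stored in a catalogue

# online (cheap: na x na products only)
R = np.linalg.qr(rng.standard_normal((na, na)))[0]  # stand-in for R(Theta)
T = rng.standard_normal((na, na))                   # stand-in for T(xi_l - xi_k)
D_hat = rng.standard_normal((na, na))
K_block = D_hat - R.conj().T @ N @ R @ T
print(K_block.shape)  # (8, 8)
```

Per online evaluation, the cost is a handful of $n_a \times n_a$ multiplications instead of $n_f \times n_f$ finite element solves.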
To complete the assembly of the system matrix $K$, the block matrix $D^{(k,l)}$ is approximated by a matrix $\hat D^{(k,l)}$, which depends only on the positions of the $k$th and $l$th particles, so that $D^{(k,l)} \approx \hat D^{(k,l)}(\xi^{(l)} - \xi^{(k)})$. Similarly to the matrix $\hat C^{(k,l)}$, we define the complex-valued vector $\hat F^{(k)}(\xi^{(k)}) \in \mathbb{C}^{N^{(k)}_a}$ as the projection of $\operatorname{curl} E_I$ on the sphere $S_{r^{(k)}}$ in terms of the VSWF.
The entries of $\hat F^{(k)}$ can be computed as the solution of the linear system of equations

$$\sum_{p \in I^{(k)}_a} b^{(S_{r^{(k)}})}\big(\operatorname{curl}\psi_p, \operatorname{curl}\psi_r\big)\, \hat F^{(k)}_p = b^{(S_{r^{(k)}})}\big(\operatorname{curl} E_I(\cdot + \xi^{(k)}), \operatorname{curl}\psi_r\big) \quad \forall r \in I^{(k)}_a.$$

With this and the transformation to the reference domain, we finally get the approximation $F^{(k)} \approx \hat C^{(k)} R^{(k)} \hat F^{(k)}(\xi^{(k)})$.
In addition to the geometric error, we introduce here a truncation error due to the finite-dimensional projection.
Similarly, the right-hand side vector $G^{(k)}$ is approximated using a vector $\hat G^{(k)}(\xi^{(k)}) \in \mathbb{C}^{N^{(k)}_a}$ depending on the incident field $E_I$ shifted by the spatial position $\xi^{(k)}$ of the $k$th particle. We obtain the approximation $G^{(k)} \approx \hat G^{(k)}(\xi^{(k)})$, where the approximation error is of the same nature as for $F$. Before we summarize our calculations, we note that the linear systems of equations required for the computation of $R$, $T$ and $\hat F$ possess a unique solution, provided that $\omega r^{(k)}$ is neither a root of $\psi_q$ nor of $\operatorname{curl}\psi_q$.

L(ξ, Θ, ε) ≈ L̂(ξ, Θ, ε) := Ĝ(ξ) − R(Θ)^H N(ε) R(Θ) F̂(ξ).
Up to this point, we have successfully simplified the assembly of the reduced system (11). Next, we discuss the remaining complexity, assuming that the system defined by (29) and (30) is to be solved multiple times due to changes in the arrangement of the particulate film. In this case, the most expensive operations turn out to be the assembly of the right-hand side matrix of system (23), required for the computation of the block matrices T^(k,l), k, l ∈ I_p, as well as the assembly of D̂^(k,l), k, l ∈ I_p (see (26)). Generally, all these matrices are fully populated. Moreover, each (k, l)-block depends explicitly on the relative position of particles k and l through the bilinear form b. Thus, the corresponding forms have to be re-evaluated by numerical integration whenever the relative position ξ^(l) − ξ^(k) of the corresponding pair of particles changes. In the following, we suggest a strategy by which the major part of this task can again be performed in a pre-processing step. We describe our strategy for the assembly of the matrices D̂^(k,l), k, l ∈ I_p, only, as the same idea can be applied to the computation of T^(k,l), k, l ∈ I_p.
To do this, we exploit the assumption that all particles are placed in a monolayer defined by the x_1-x_2-plane. As a consequence, the x_3-component of the particle positions can, without loss of generality, be set to zero, i.e., (ξ^(k))_3 = 0. Accordingly, we define the planar spatial position ϒ^(k) = ((ξ^(k))_1, (ξ^(k))_2) for the kth particle. Now we can, without loss of accuracy, split the matrices D̂^(k,l) (and analogously T^(k,l)) into products of matrices depending either on the polar angle or on the length of the vector ϒ^(l) − ϒ^(k). To demonstrate this, we define the diagonal matrix function M^(k)(a) with a ∈ R, whose index vector is specified entry-wise. Furthermore, we introduce the matrix function D̂_ρ^(k,l)(b) with b ∈ R, defined entry-wise for all p ∈ I_a^(k), q ∈ I_a^(l), where e_1 = (1, 0, 0)^T. Using these definitions, setting s = ϒ^(l) − ϒ^(k) and denoting the polar angle of s by arg(s), the corresponding factorization can be shown by straightforward calculus. We have explicitly noted the dependency of D̂^(k,l) on the particle positions by adding the argument ξ^(l) − ξ^(k) on the left-hand side, and we have introduced an abbreviation for the right-hand side. In the latter, the square symbol is used to emphasize the planar assumption.
Based on this, we can now again split the calculation into a pre-processing step (independent of the particular particle positions) and a position-dependent step. For this, we approximate the matrix function D̂_ρ^(k,l), which depends only on a scalar variable, by an interpolation formula. More precisely, we pre-evaluate D̂_ρ^(k,l) on a one-dimensional grid using N_g appropriately chosen grid points 0 < ρ_1 < ρ_2 < … < ρ_{N_g}. The values of D̂_ρ^(k,l) on the grid are denoted by D̂_ρ^(k,l)(ρ_1), …, D̂_ρ^(k,l)(ρ_{N_g}). Then, by Π D̂_ρ^(k,l) we denote a corresponding piecewise cubic Hermite interpolant (Fritsch and Carlson 1980), and finally obtain an interpolation formula which can be efficiently evaluated for each planar configuration defined by the relative distance s = ϒ^(l) − ϒ^(k). Although we have introduced an interpolation error, this strategy ultimately enables the efficient assembly of all matrices D̂^(k,l) in a two-stage numerical scheme: in the first stage (offline stage), the evaluation at the grid points takes place, and in the second stage (online stage), the evaluation is made according to formula (34). As the evaluation at the grid points is done once and for all, a small interpolation error does not impose a major challenge.
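The offline/online split can be sketched with SciPy's `PchipInterpolator`, which implements exactly the piecewise cubic Hermite interpolation of Fritsch and Carlson (1980). The matrix function below is a smooth stand-in for the distance-dependent block, not the actual VSWF-based expression:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Stand-in for the distance-dependent matrix function; the true function involves
# spherical wave expansions, but any smooth matrix-valued function of a scalar
# distance illustrates the offline/online split.
def D_rho(rho: float) -> np.ndarray:
    return np.array([[np.cos(rho), np.sin(rho) / rho],
                     [np.sin(rho) / rho, np.cos(rho) / rho]])

# Offline stage: evaluate once on a 1D grid 0 < rho_1 < ... < rho_Ng.
grid = np.linspace(0.5, 10.0, 60)
values = np.array([D_rho(r) for r in grid])       # shape (Ng, 2, 2)
interp = PchipInterpolator(grid, values, axis=0)  # piecewise cubic Hermite

# Online stage: cheap evaluation at any relative distance, plus derivatives
# with respect to the distance for gradient-based optimization.
rho = 3.3
D_approx = interp(rho)
dD_approx = interp.derivative()(rho)
```

Note that `interp.derivative()` directly provides the distance derivative needed later for the gradient-based optimization.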
We finally observe that the interpolation basis can be reused when geometric arrangements are studied in multiple instances of the simulation or in the framework of optimization tasks. As indicated at the beginning of this paragraph, the same holds true for the computation of the matrices T^(k,l) if we introduce matrices T_ρ^(k,l) and apply the interpolation scheme to those. Furthermore, it is interesting to note that the piecewise cubic Hermite interpolation formula also enables the computation of derivatives of D̂^(k,l) and T^(k,l) with respect to the spatial positions of the particles. This is essential for the gradient-based optimization approach considered in Section 4.

Complexity analysis
To compare the hybrid FEM (defined by (11)) and the two-stage hybrid FEM ((29) and (30), including the interpolation formula (34)), we estimate the computational complexity of the assembly process for the system matrices K and K̂. For simplicity, we consider a particle system with p ≫ 1 ghost particles of identical radius r. Furthermore, the interior field and the exterior electric field of each ghost particle are expressed by f finite element degrees of freedom and a VSWF degrees of freedom, respectively. We assume that the relative permittivity of each particle is related to one of t particle types by a rigid body transformation and that the spatial positions give rise to d ≤ p(p+1)/2 distinct relative distances. Moreover, we assume that the Hermite interpolation is based on g grid points. Table 1 lists the computational complexities for the individual steps of the assembly process for both numerical schemes investigated. The first column (VSWF) shows the numerical cost of tasks which involve the evaluation of VSWF. In column FEM, the computational complexities of tasks related to the pure finite element analysis are listed. Lastly, the complexities of tasks which mainly involve numerical linear algebra operations are collected in column FLOP.
Starting with the hybrid FEM, the assembly of the matrices D, B and C requires the evaluation of the VSWF for each nonzero entry of the respective matrix, which results in the complexities O(p²a²), O(pfa) and O(p²fa). Note that the matrix B is block diagonal, and thus its assembly is linear with respect to the number of particles. Exploiting the block diagonal structure of the matrix A, the finite element matrices can be assembled and factorized independently for each particle, leading to the complexities O(pf) and O(pf³). The assembly of K is completed by the multiplication of A^{-1} with C, which is executed through the solution of a linear system with multiple right-hand sides (exploiting the block structure of A^{-1}), and the multiplication with B.
In contrast to this, the two-stage hybrid FEM requires more intermediate steps to obtain the system matrix K̂. However, most of the steps are carried out in the offline stage. First, we prepare the Hermite interpolants Π D̂_ρ and Π T_ρ, whereby the assembly of the matrices D̂_ρ and T_ρ for each grid point of the interpolation requires O(ga²) operations involving VSWF. The generation of the interpolation functions itself requires O(ga²) arithmetic operations. In comparison to the hybrid FEM, the assembly of B̂ and A, the factorization of A and the computation of the matrix N must be performed for each particle type only. Furthermore, the complexity of each of these tasks is linear with respect to the number of particle types due to the block diagonal structure of B̂ and A. This explains the corresponding complexity estimates. The offline stage is concluded by the assembly of the matrix R_y(π/2), which realizes the rotation by π/2 around the y-axis and is performed only once for all particle types.
During the online stage, the interpolation functions Π D̂_ρ and Π T_ρ are evaluated at the unique relative distances of the particles, and the matrices D̂ and T are computed by taking the polar angle between pairs of particles into account. Like N, the matrix R has a block diagonal structure, resulting in a linear complexity with respect to the number of particles (O(pa³)). After multiplication with T, we obtain the system matrix K̂ of the two-stage hybrid FEM.
We see that if the number of particle types t is small, the offline stage of the two-stage hybrid FEM is already more efficient than the original hybrid FEM. Moreover, it is worth noting that the number of particles p never occurs in combination with the potentially very large number of finite element degrees of freedom f. Finally, bearing in mind that the number of VSWF a is typically much smaller than f, it is clear that the online stage of the two-stage hybrid FEM is likely to outperform the original hybrid FEM scheme by several orders of magnitude. Consequently, we use the two-stage hybrid FEM for all optimization tasks presented in the next section.

Table 1 Estimation of computational complexity to assemble the system matrices K and K̂ of the hybrid FEM and the two-stage hybrid FEM

Tests and comparisons
In this section, we provide two simple test scenarios to check the implementations of the hybrid FEM and the two-stage hybrid FEM discussed in the manuscript. Assuming a normalized x-polarized incident plane wave E_I(x, y, z) = √2 e_x exp(ιωz) in vacuum with wave number ω, the scattering cross section (Mishchenko 2002) is calculated on a sphere B_R with radius R. Here, E_S denotes the scattered electric field and H_S = −(ι/ω) curl E_S the scattered magnetic field.
Single spherical particle The first test compares the scattering cross sections of a single spherical particle with various radii r, relative permittivity ε_r = (1.2)² and wavelengths λ, based on
-A direct solution of the hybrid system (9),
-The two-stage hybrid formulation elaborated in Section 2.3,
-An analytical solution via Mie theory (Mie 1908).
The mesh for the FEM approximations in the first two approaches is generated by projecting a regular mesh defined on a cube surface onto a sphere. Based on this, a triangulation using TETGEN (Si 2015) is computed. Using various levels of refinement for the surface mesh, a hierarchy of meshes ranging from coarse to fine is generated. As the volume of the approximated sphere generated in this way does not, in general, coincide with the volume of the perfect sphere, the radius of the perfect sphere used in the Mie calculation is adapted to match the volume of the meshed sphere.
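Mie reference values of this kind can be reproduced independently with a short script. The sketch below (not the authors' code) implements the standard Bohren-Huffman Mie coefficients with the usual Wiscombe truncation rule; the scattering cross section then follows as C_sca = Q_sca · π r².

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn

def mie_q_sca(x, m, n_max=None):
    """Scattering efficiency Q_sca of a homogeneous sphere (Bohren-Huffman form).

    x = size parameter (wave number times radius), m = (real) refractive index.
    """
    if n_max is None:
        n_max = int(x + 4.05 * x ** (1 / 3) + 2) + 5  # Wiscombe truncation rule
    n = np.arange(1, n_max + 1)

    def psi(rho):   # Riccati-Bessel psi_n(rho) = rho * j_n(rho)
        return rho * spherical_jn(n, rho)
    def dpsi(rho):
        return spherical_jn(n, rho) + rho * spherical_jn(n, rho, derivative=True)
    def xi(rho):    # xi_n(rho) = rho * h_n^(1)(rho)
        return rho * (spherical_jn(n, rho) + 1j * spherical_yn(n, rho))
    def dxi(rho):
        h = spherical_jn(n, rho) + 1j * spherical_yn(n, rho)
        dh = (spherical_jn(n, rho, derivative=True)
              + 1j * spherical_yn(n, rho, derivative=True))
        return h + rho * dh

    mx = m * x
    a = (m * psi(mx) * dpsi(x) - psi(x) * dpsi(mx)) / \
        (m * psi(mx) * dxi(x) - xi(x) * dpsi(mx))
    b = (psi(mx) * dpsi(x) - m * psi(x) * dpsi(mx)) / \
        (psi(mx) * dxi(x) - m * xi(x) * dpsi(mx))
    return 2.0 / x ** 2 * np.sum((2 * n + 1) * (np.abs(a) ** 2 + np.abs(b) ** 2))
```

For small size parameters the result approaches the Rayleigh limit Q_sca ≈ (8/3) x⁴ |(m² − 1)/(m² + 2)|², which gives a convenient sanity check.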

Two spherical particles
In the second test scenario, the scattering cross section of an assembly of two spherical particles is calculated with the hybrid FEM and the two-stage hybrid FEM. Furthermore, the scattering cross section is calculated with the software ADDA (Yurkin and Hoekstra 2011), based on the discrete dipole approximation (DDA). We use particle radii of 0.05λ and 0.1λ and a distance of 0.5λ. The relative permittivity (for both particles) is the same as in the first scenario, i.e., ε_r = (1.2)². Although DDA is also an approximate method, it is known to generate very accurate results for refractive indices close to 1 and sufficiently small particles, which allow for a discretization by a large number of dipoles. Therefore, the DDA results can serve as reference solutions. In DDA, we use a dipole spacing of 0.02r and the so-called filtered coupled dipole polarizability.
Besides the finite element mesh used to approximate the interior fields, the number of analytical basis functions, controlled by the index L (compare Section 2.2), is also varied for both scenarios. The results are depicted in Fig. 4. In Fig. 4a, the relative errors between the scattering cross sections of a single particle resulting from the corresponding numerical schemes (solid: hybrid FEM, dashed: two-stage hybrid FEM) and an online Mie scattering calculator (Prahl 2018) for r = 0.6λ are shown. We can see the impact of the choice of L on the convergence of the relative error over different mesh sizes ranging from coarse to fine. There is almost no difference between the error curves for L = 6 and L = 8, which indicates that already for L = 6 a highly accurate approximation is obtained, provided the finite element approximation of the interior fields is good enough. In a similar manner, the results for the two-particle scenario with a radius of r = 0.1λ for both spheres are displayed in Fig. 4b. As outlined before, in this case, the reference solutions are calculated using the DDA software package ADDA. Due to the small particle radius (necessary for a high precision of the DDA reference solution), already an approximation order of L = 2 generates good results. Nevertheless, we choose L = 8 for the remaining experiments presented in this section.
In Fig. 5a, further error curves for the single particle scenario are shown. Again, errors are computed for both numerical schemes (solid: hybrid FEM, dashed: two-stage hybrid FEM), and the reference solution is provided by the online Mie scattering calculator (Prahl 2018). This time, instead of the parameter L, the radius of the sphere is varied. The corresponding results for the two-particle scenario are shown in Fig. 5b.
Both these tests demonstrate how the error converges as the finite element mesh used to approximate the interior fields is refined. We can see that the errors are typically lower for smaller particles. We can also see that the error is consistently larger for the two-stage hybrid FEM approach compared to the error obtained using the direct solution of (9). This is a consequence of a number of simplifications, e.g., the decoupling of the interior and exterior fields or the geometrical error outlined in Section 2.3. Nevertheless, also in the case of the two-stage hybrid FEM approach, the error goes down consistently when we refine the finite element resolution and use an approximation order of L ≥ 6 for the exterior field. It is finally noted that the precision obtained for the two-stage hybrid FEM approach (approximately two correct digits) is sufficient for the optimization examples presented in this article. On the other hand, thanks to the two-stage scheme, a further refinement of the finite element approximation would be possible. As the online phase is independent of the finite element resolution, this would not affect the computational efficiency reported for the optimization results presented in Section 4.

Fig. 4 Relative errors for the scattering cross sections obtained from the direct hybrid FEM approach (hybrid) and the two-stage hybrid FEM approach (two-stage). The reference solutions are computed based on Mie theory in the single particle example (a) and by the DDA method in the two particle case (b). The size of the spherical particles is chosen as r = 0.6λ (a) and as r = 0.1λ (b). The errors are computed for different orders of approximation L for the exterior field and different mesh resolutions used to approximate the interior fields

Fig. 5 Relative errors of the scattering cross sections obtained from the hybrid FEM approach (hybrid) and the two-stage hybrid FEM approach (two-stage) for different particle radii and varying resolutions of the finite element mesh used to approximate the interior fields. In a, the results for the single particle scenario (comparison against Mie results) and in b the results for the two particle scenario (comparison against DDA results) are displayed. The order of approximation for the exterior field is chosen as L = 8 in all computations

Spatial position of particles
The first optimization problem addresses the minimization of the numerical haze (12) by changing the spatial positions of spherical particles in the x_1-x_2-plane, i.e., a particle monolayer. We assume that the particle centers ξ^(k) = (ϒ_1^(k), ϒ_2^(k), 0) ∈ R³ for all k ∈ I_p are parametrized by the design variables ϒ = (ϒ^(1), …, ϒ^(N_p)) ∈ R^{2×N_p}. Furthermore, we assume that all particles have the same constant relative permittivity distribution, i.e., ε^(k)(x) = ε ∈ C for all x ∈ Ω^(k), and identical radius r^(k) = r ∈ R_{>0} for all k ∈ I_p.
To express the dependency of all involved matrices and vectors on the design variables, we define K̂(ϒ) := K̂(ξ), Ĥ(ϒ) := Ĥ(ξ) and L̂(ϒ) := L̂(ξ). With these, we state the optimization problem with pairwise constraints on the particle positions to guarantee disjoint particles, and box constraints with C_∞ < 2w_ap to ensure that the particles stay inside the aperture of size w_ap. Since the coefficient vector v = K̂(ϒ)^{-1} L̂(ϒ) = S(ϒ) is uniquely determined by the parameters ϒ, we can define the so-called reduced objective functional J(S(ϒ), ϒ). To use a gradient-based optimization algorithm, we have to provide the derivative of the reduced objective functional with respect to the parameters ϒ in a direction h_1 ∈ R^{2×N_p}.

Now, with v = S(ϒ) and the adjoint variable w = K̂(ϒ)^{-H}(Ĥ(ϒ) + Ĥ(ϒ)^H)v, the first term can be rewritten accordingly; the second term follows in an analogous manner. In these expressions, the directional derivative of the system matrix K̂ with respect to the planar spatial positions ϒ in direction h_1 is obtained via the direction h_2 ∈ R^{3×N_p}, which results from extending h_1 by zero in the third component, i.e., (h_2)_{i,k} = (h_1)_{i,k} for all i ∈ {1, 2} and (h_2)_{3,k} = 0 for all k ∈ I_p. The derivatives of the block matrices D̂^(k,l) and T^(k,l) depend only on the spatial positions ξ^(k) and ξ^(l); thus, they follow by application of the product rule for all k, l ∈ I_p, with s = ϒ^(l) − ϒ^(k). Here, the arguments of M^(k), M^(l) and Π D̂_ρ^(k,l) were omitted for the sake of readability. The remaining expressions are obtained with the chain rule, using the derivative D_1 Π D̂_ρ^(k,l) of the Hermite interpolation function and the derivative D_1 M^(k) of the matrix M^(k), which is defined for a direction h ∈ R and for all p ∈ I_a^(k). We note that the same procedure applies to the derivatives D_1 T and D_1 Ĥ. The derivative D_1 L̂^(k) of the right-hand side vector L̂^(k), which enters D_1 L̂, can, in general, be computed in a straightforward manner using the Jacobian of the incident field. However, in the case that the incident field is a plane wave travelling perpendicularly to the planar thin film, this derivative vanishes.
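The adjoint construction used here follows a standard pattern that can be illustrated on a small stand-in system K(t)v = L(t) with a quadratic output v^H H v. All matrices below are random placeholders; only the structure of the adjoint variable w = K^{-H}(H + H^H)v mirrors the derivation above, and the gradient can be verified against finite differences.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Toy parameter-dependent complex system K(t) v = L(t) with output J = Re(v^H H v).
K0 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)) + 5 * np.eye(n)
K1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
L0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
H = rng.standard_normal((n, n))
H = H + H.T                          # Hermitian, so the objective is real

K = lambda t: K0 + t * K1            # hence dK/dt = K1
L = lambda t: L0 * (1 + t)           # hence dL/dt = L0

def objective_and_gradient(t):
    v = np.linalg.solve(K(t), L(t))
    J = (v.conj() @ H @ v).real
    # adjoint variable w = K^{-H} (H + H^H) v, as in the derivation above
    w = np.linalg.solve(K(t).conj().T, (H + H.conj().T) @ v)
    # dJ/dt = Re( w^H (dL/dt - dK/dt v) )
    dJ = (w.conj() @ (L0 - K1 @ v)).real
    return J, dJ
```

The key benefit is that one additional linear solve with K^H yields the full gradient, independently of the number of design parameters.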
Finally, the derivative of the distance constraint ρ(ϒ^(l) − ϒ^(k)) with respect to the spatial positions is obtained analogously.

In our first numerical example, all particles have a radius of r^(k) = 123 nm and a relative permittivity ε^(k) = 2.5469, which corresponds to the refractive index of polystyrene at a wavelength of 550 nm (Sultanova et al. 2009). The assembly is illuminated by an x_1-polarized plane wave propagating in the positive x_3-direction with a wavelength of λ = 550 nm, i.e., E_I = e_1 exp(ιωx_3). Initially, the particles are placed in a hexagonal grid with grid size d_hex = 370 nm, truncated to a hexagonal shape. Moreover, each particle position is locally perturbed by a realization of a normally distributed random variable with probability distribution N(0, 0.05). The optimization procedure is then executed 20 times with randomized initial values. The bounding box for the optimization is set to C_∞ = 4 μm.
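A hexagonally truncated hexagonal grid with perturbed positions of this kind can be generated as follows. Note that interpreting the perturbation strength 0.05 relative to the grid size is our assumption; the paper does not state the unit of the standard deviation.

```python
import numpy as np

def hex_grid(rings: int, d: float) -> np.ndarray:
    """Particle centres of a hexagonal grid with spacing d, truncated to a hexagon.

    rings = number of rings around the central particle
    (rings = 4 gives the 61-particle layout used in the example).
    """
    pts = []
    for i in range(-rings, rings + 1):
        for j in range(-rings, rings + 1):
            if abs(i + j) <= rings:               # hexagonal truncation
                x = d * (i + 0.5 * j)
                y = d * (np.sqrt(3) / 2) * j
                pts.append((x, y))
    return np.array(pts)

rng = np.random.default_rng(42)
d_hex = 370e-9                                     # 370 nm grid size
pos = hex_grid(4, d_hex)                           # 61 planar positions
pos_perturbed = pos + rng.normal(0.0, 0.05 * d_hex, pos.shape)
```

The axial-coordinate construction yields 3R(R+1)+1 particles for R rings, i.e., 7, 19, 61 and 217 for R = 1, 2, 4, 8, matching the particle counts used in the examples.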
Moreover, the numerical scheme is implemented in MATLAB (The MathWorks Inc. 2014), and we use sequential quadratic programming (SQP) for the solution of the geometry optimization problem (35). Figure 6 illustrates the results of the geometry optimization, where the numerical haze is minimized using 61 particles. From the final objective functional values obtained for each of the randomized initial values, the design with the lowest objective functional value J_min (Fig. 6a) is compared to the design with the highest objective functional value J_max (Fig. 6c) and to the design corresponding to the objective functional value closest to the average objective functional value J_avg (Fig. 6b). We observe that the particles keep a certain distance from each other, such that the distance constraint is not active. Furthermore, they form hexagonal patterns, which seems to be a compromise between a globally compact design of the sample and a minimal distance between each adjacent pair of particles.
In contrast, the maximization of the haze factor (which is here, and in all later examples, realized by multiplying the objective functional by −1) leads to chain-like formations that are aligned with the polarization of the incident wave (see Fig. 7). Here, the distance constraint is active for all randomized initial configurations. Table 2 summarizes the results for the minimization and maximization of the haze factor with various total numbers of particles. To compare the results, the objective functional is also evaluated before the particles are randomly perturbed (J_0, unperturbed), i.e., for a perfect hexagonal pattern. Again, J_min, J_avg and J_max denote the minimal, average and maximal objective functional values for 20 randomized configurations, respectively. The minimization with 61 particles yields an average reduction to 79.7% of the unperturbed configuration. If the haze factor is maximized by the geometry optimization, we observe that the haze factor is increased by 553% on average.

Orientation of particles
With the next optimization problem, we aim to study the effect of particle orientations on the haze factor given by (5). Let the reference domain Ω̂^(k) for each particle be a spherical domain Ω̂^(0) with radius r. We define a nanorod Ω_rod ⊂ Ω̂^(0) with radius r_rod and height h_rod in the interior of Ω̂^(0) (see Fig. 8). Furthermore, we denote the constant relative permittivity of the nanorod by ε_rod ∈ C. Thus, the relative permittivity on the reference domain is defined accordingly for all x̂ ∈ Ω̂^(k). By rotating the reference domain Ω̂^(k) with the rotation matrix Θ^(k) ∈ SO(3), we effectively rotate the nanorods. As in the previous example, the particles are placed in the x_1-x_2-plane with particle centers ξ^(k) ∈ R³ and (ξ^(k))_3 = 0, but this time the positions are fixed throughout the optimization. Furthermore, we restrict ourselves to in-plane rotations, i.e., rotations around the positive x_3-axis, which are parametrized by the angles α = (α^(1), …, α^(N_p)) ∈ [0, 2π)^{N_p}. Using this parametrization, the in-plane rotation of the kth particle by the matrix R^(k) (15) is related to the matrix M^(k) (31) via R^(k)(Θ^(k)(α^(k))) = M^(k)(α^(k)) (36).

Fig. 6 Results of the geometry optimization with 61 particles, where the haze factor is minimized; the designs correspond to the minimal (J_min), an average (J_avg) and the maximal (J_max) objective functional value, where we have started the optimization repeatedly from 20 randomized initial values. a J_min. b J_avg. c J_max

Fig. 7 Results of the geometry optimization with 61 particles, where the haze factor is maximized; the designs correspond to the minimal (J_min), an average (J_avg) and the maximal (J_max) objective functional value, where we have started the optimization repeatedly from 20 randomized initial values. a J_min. b J_avg. c J_max
Fig. 8 Sketch of a nanorod with height h_rod and radius r_rod inside a spherical reference domain Ω̂^(k) with radius r

With this, we define K̂(α) := K̂(Θ, ξ, ε), Ĥ := Ĥ(ξ) and L̂(α) := L̂(Θ, ξ, ε), and obtain the corresponding minimization problem. The derivative of the reduced objective functional J(S(α)) in a direction h_1 ∈ R^{N_p} is obtained with the solution vector v = S(α) = K̂(α)^{-1} L̂(α) and the corresponding adjoint solution. The block K̂^(k,l)(α^(k)) of the system matrix K̂(α) and the kth block component L̂^(k)(α^(k)) of the right-hand side vector L̂(α) depend only on the angle α^(k); thus, the required derivatives follow with (36).

In this numerical example, we consider N_p = 61 nanorods with height h_rod = 62 nm and radius r_rod = 31 nm. The nanorods are arranged in a hexagonal pattern in the x_1-x_2-plane with grid size d_hex = 270 nm and form a hexagonal shape with 5 objects per side. Each nanorod has a relative permittivity ε_rod = (1 + 10ι)² and is embedded in a sphere with radius r^(k) = 123 nm. The assembly is illuminated by a plane wave E_I with fixed wavelength λ = 550 nm, polarization in the x_1-direction and propagation in the positive x_3-direction, i.e., E_I(x) := e_1 exp(ιωx_3) with ω = 2π/λ. We choose a discretization for each subdomain Ω^(k) which results in N_f^(k) = 394432 degrees of freedom for the interior electric field and N_a^(k) = 96 degrees of freedom for the electric field scattered by each particle. Furthermore, we compare results for the minimization and maximization of the haze factor for different initial out-of-plane rotations, i.e., rotations about the x_2-axis, by the angles θ ∈ {0, π/6, π/4, π/3} (see Fig. 9). The effect of polarization on the optimization results is also studied by using unpolarized incident light. This is realized by averaging the solutions of the electric field for x_1-polarized and x_2-polarized light.
The optimization is started from uniform angles given as α^(k) = π/4 for all k ∈ I_p, and an SQP algorithm is again used for the solution.
In the case of minimizing the haze factor for polarized incident light, the nanorods align perpendicularly to the incident light polarization for all initial out-of-plane rotations; examples are depicted in Fig. 10a and b. There, the in-plane orientation angles α^(k) are visualized from the top view by a colored hexagonal background for each nanorod and from the perspective view by colored nanorods. If the haze factor is maximized, most of the nanorods align parallel to the incident light polarization (see Fig. 10c).
If the nanorods are tilted, they form an out-of-plane zigzag pattern, bringing the tips of the nanorods closer together (see Fig. 10d).
Using an unpolarized incident light source, the haze factor is only reduced to 87% of the initial objective functional value for θ = 0, or to 97% of the initial objective functional value for θ = π/4 (see Fig. 11a and b). In contrast, the maximization of the haze factor using unpolarized light leads to a gain of 223% relative to the initial objective functional value. Both optimized designs shown in Fig. 11c and d exhibit a ring formation in the central area of the sample.

Type of particles

In the third optimization problem, the type of each particle is selected between two admissible reference domains Ω̂_1 and Ω̂_2, parametrized by design variables δ^(k) ∈ [0, 1] for k ∈ I_p, leading to the optimization problem (37). The directional derivatives satisfy D_1 K̂^(k,l)[…] = Z_2 T^(k,l)(ξ^(l) − ξ^(k)), and similarly for D_1 L̂^(k). We see that D_1 K̂^(k,l)(δ^(k)) and D_1 L̂^(k)(δ^(k)) are constant with respect to δ^(k), and so they must be computed only once in a pre-processing step.

In the following examples, we solve the optimization problem (37) with N_p ∈ {7, 19, 61, 217} spherical particles arranged in a hexagonal grid in the x_1-x_2-plane with particle distance d_hex = 368 nm. Here, the admissible domain Ω̂_1 is a small sphere with radius r_1 = 110 nm and Ω̂_2 is a larger sphere with radius r_2 = 166 nm (see Fig. 12).

Fig. 15 Rounding of "gray" designs

Both admissible particle types have the relative permittivity ε_1 = ε_2 = 2.5469 (Sultanova et al. 2009). The sample is illuminated by an x_1-polarized plane wave propagating in the positive x_3-direction with the wavelength λ = 550 nm. The lower and upper bounds for the number of particles of the second type (here, the large particles) are chosen as (V − 0.05)N_p and (V + 0.05)N_p, based on a desired volume fraction V. The initial value for the parameter δ is either uniformly set to V, i.e., δ^(k) = V for all k ∈ I_p, or given by realizations of a uniformly distributed random variable with probability distribution U([V − 0.1, V + 0.1]). The grayness penalty parameter is set to γ = 0.01 and is successively increased by a factor of 1.2 within a so-called continuation process (Sigmund and Petersson 1998).
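The continuation treatment of the grayness penalty can be sketched as follows. The haze objective is replaced by a simple quadratic stand-in, and SciPy's SLSQP stands in for the SQP solver, so this only illustrates the penalty/continuation mechanics, not the actual haze model.

```python
import numpy as np
from scipy.optimize import minimize

def J0(delta):
    """Placeholder for the (expensive) haze evaluation."""
    return np.sum((delta - 0.25) ** 2)

def solve_with_continuation(n_p=19, V=0.4, gamma0=0.01, factor=1.2, steps=25):
    delta = np.full(n_p, V)                        # uniform initialization
    lo, hi = (V - 0.05) * n_p, (V + 0.05) * n_p    # bounds on number of 'blue' particles
    cons = [{"type": "ineq", "fun": lambda d: np.sum(d) - lo},
            {"type": "ineq", "fun": lambda d: hi - np.sum(d)}]
    gamma = gamma0
    for _ in range(steps):
        # grayness penalty gamma * sum(delta * (1 - delta)) pushes delta towards {0, 1}
        res = minimize(lambda d: J0(d) + gamma * np.sum(d * (1 - d)),
                       delta, method="SLSQP", bounds=[(0, 1)] * n_p,
                       constraints=cons)
        delta = res.x
        gamma *= factor                            # continuation: increase penalty
    return delta
```

Warm-starting each solve from the previous solution while slowly increasing γ is the essence of the continuation strategy; jumping directly to a large γ would make the problem strongly nonconvex from the start.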
Exemplary results of the optimization for V = 0.4 are depicted in Fig. 13, with a view to showing the influence of random versus uniform initialization on the optimization result. In the case of haze factor maximization, the uniform initialization leads to highly symmetric designs (see Fig. 13a), in contrast to the random initialization (see Fig. 13b). In both cases, we find that the large particles form chains.
When the haze factor is minimized, we observe that the difference between the resulting designs for uniform and randomized initialization is very small; sometimes they are even identical (see Fig. 13c and d). This suggests that compact, sparse and symmetric designs are beneficial for haze factor minimization.
For a larger number of particles (N p = 217), the characteristic features described above are maintained for both optimization types. This is illustrated in Fig. 14. The value of the numerical haze, the value of the grayness term and the number of blue particles are printed below the design. Again we observe the chain formation in Fig. 14a and the sparse symmetric structures in Fig. 14b.
In some cases, we do not obtain fully integer solutions, due to balancing effects caused by the bounds on the number of blue particles and the local reduction of the haze factor. As there is no physical interpretation of these so-called gray designs, we round the design parameters and evaluate the resulting black (1) and white (0) designs (see Fig. 15), although they are not directly obtained as solutions of the optimization problem.

Fig. 16 Comparison of the optimization results for 61 particles
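A plausible rounding rule (our assumption; the exact procedure is not specified in the text) sets the largest design values to one while respecting the bounds on the particle count:

```python
import numpy as np

def round_design(delta: np.ndarray, n_lo: float, n_hi: float) -> np.ndarray:
    """Round a 'gray' design to {0, 1}, keeping the number of particles of the
    second type within [n_lo, n_hi] (hypothetical post-processing rule)."""
    k = int(np.clip(np.round(delta.sum()), np.ceil(n_lo), np.floor(n_hi)))
    out = np.zeros_like(delta)
    out[np.argsort(delta)[::-1][:k]] = 1.0   # set the k largest entries to 1
    return out
```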
Finally, we compare all results for the minimization and maximization of the haze factor by particle-type optimization with 61 particles in Fig. 16. The plot shows the final objective functional values obtained with uniform and randomized initial design parameters against the fraction of blue particles (solid lines, dots). Moreover, the objective functional evaluated for the rounded "gray" designs is marked by a square whenever applicable. We see that the haze increases with an increasing fraction of large particles, since the large particles scatter more light than the smaller ones.

Computation times
All experiments in the previous sections were carried out on an Intel Xeon E5-2687W machine with 20×2 CPU cores and 256 GB RAM in MATLAB (The MathWorks Inc. 2014). During the online phase, each optimization iteration took less than 30 s for a system of 61 particles. It was further observed that the computation time per online iteration indeed depends quadratically on the number of particles. On average, the computation of the offline matrices, e.g., D̂_ρ and T_ρ, took about 30 s per block matrix and grid point. In the worst case, this results in about 23 min for each full matrix, massively parallelized on 40 cores. In the most expensive case, the offline FE analysis took 130 s.

Replication of results
For the optimization, the SQP solver from the MATLAB optimization toolbox has been used without specifying second-order derivatives. Material parameters and system sizes are described with the individual examples.

Conclusion
In this article, a two-stage hybrid FEM scheme for the simulation of optical properties of complex particle systems was derived. Exploiting the properties of VSWF, it was possible to reduce the computational complexity compared to a standard hybrid FEM scheme by several orders of magnitude. This, combined with the separation of the computations into a design independent offline stage and a design dependent online stage allowed us to efficiently solve optimal arrangement problems of a different kind.
Even on a standard computer, it was possible to solve problems including a few hundred particles. Moreover, the derived numerical scheme enables the efficient structural optimization of particle assemblies with respect to position, type and orientation of the particles in a straightforward manner.
Future research directions include an extension of the methodology to fully three-dimensional particle arrangements. Moreover, it would also be interesting to develop optimization solvers that can directly exploit the structure of the hybrid FEM system. From a practical point of view, optimization problems with multiple frequencies and (in the full 3D case) multiple illumination directions would be of interest. Moreover, instead of working with given particle sizes and shapes, it would be more realistic to consider distributions of these quantities. Finally, if one is interested in globally optimal solutions, gradient-based global optimization methods (see, e.g., Noel 2012) could be applied. We think that our approach is attractive for use with such solvers, as the computational cost of an objective function evaluation in the online phase is comparably low.
Funding Open Access funding enabled and organized by Projekt DEAL. This research work received funding from the German Research Foundation (DFG) within Collaborative Research Centre 1411, subproject D5.

Conflict of interest The authors declare that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.