A sequential addition and migration method for generating microstructures of short fibers with prescribed length distribution

We describe an algorithm for generating fiber-filled volume elements for use in computational homogenization schemes. The algorithm permits to prescribe both a length distribution and a fiber-orientation tensor of second order, and composites with industrial filler fraction can be generated. Typically, for short-fiber composites, data on the fiber-length distribution and on the volume-weighted fiber-orientation tensor of second order is available. We consider a model where the fiber orientation and the fiber length distributions are independent, i.e., uncoupled. We discuss the use of closure approximations for this case and report on identifying the describing parameters of the frequently used Weibull distribution for modeling the fiber-length distribution. We discuss how to integrate these procedures in the Sequential Addition and Migration algorithm, developed for fibers of equal length, and work out algorithmic modifications accounting for possibly rather long fibers. We investigate the capabilities of the introduced methodology for industrial short-fiber composites, demonstrating the rather low dispersion of the effective elastic moduli for the generated unit cells.


State of the art
Due to their favorable stiffness-to-weight ratio, fiberreinforced composites are frequently used when designing lightweight components. When a high design freedom is required, injection molding with discontinuously reinforced polymers is typically the first choice. Due to the anisotropy of the fiber reinforcements, the effective mechanical properties of such fiber-reinforced composite materials are anisotropic, as well. Moreover, as a consequence of the complex manufacturing process, the material microstructure varies throughout the component. For instance, the fiber-volume fraction, the fiber orientation and also the fiber-length distribution may are typically subject to non-homogeneity.
To improve upon the predictive capability of mean-field models [1][2][3], computational homogenization techniques [4] B Matti Schneider matti.schneider@kit.edu 1 Institute of Engineering Mechanics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany are typically employed to determine the effective mechanical properties of such composites. Based on the mathematical theory of homogenization [5,6], computational homogenization represents a well-defined and flexible strategy to understand the influence of the constituents and the microstructural composition on the effective mechanical properties which complements the pertinent experimental procedures in an effective way.
Modern image-processing techniques provide a detailed impression about the complexity of fiber-reinforced materials [7,8]. To reduce the the influence of randomness and to keep the number of parameters manageable, microstructuregeneration tools [9] are highly valuable prerequisites for accurate computational multiscale modeling procedures. For short-fiber composites, the fibers do typically show little bending, and an approximation by a straight cylinder is appropriate. Simple algorithms based on random sequential addition (RSA) [10,11] fail to produce microstructures with industrial volume fraction and moderate fiber aspectratio, i.e., the quotient of fiber length and diameter, and this limitation is intrinsic [12,13]. Therefore, a number of improved RSA methods was introduced [14][15][16][17]. Sequential deposition [18,19], flexible fiber based [20,21] and collec-tive re-arrangement algorithms [22][23][24] are more successful when it comes to reaching the desired volume fractions. As a particular example of the latter group, the sequential addition and migration (SAM) method [25] was shown to generate short-fiber microstructures with straight cylinders and high fiber-volume fractions. Moreover, a fourth-order fiber-orientation tensor could be prescribed to the SAM algorithm, and the produced microstructure was shown to match the prescribed orientation tensor to high accuracy. This close match of a number of pre-defined statistical desirables implied that different computational cells produced closely matching effective mechanical properties. Even more striking, the size of the considered unit cells could be chosen to be extremely small, and still be representative [26][27][28]. Such a property is extremely desirable, in particular when it comes to nonlinear material properties of composites, and turned out to be critical when undertaking a number of scientific studies, including fatigue [29,30], fracture [31,32] and thermomechanically coupled problems [33,34].

Contributions
This work is concerned with an extension of the SAM method [25] to fibers of variable length, preserving the positive characteristics of the method shown for constant fiber lengths. Such an augmentation is less trivial than one might think initially. The challenges are actually twofold. The first, more obvious challenge, concerns the presence of very long fibers in unit cells. The classical SAM method [25] required the cell size to exceed the fiber length. If we retained this prerequisite, the cell would have to be larger than the longest considered fiber. In particular, such an approach would be computationally inefficient. Alternatively, we need to account for fibers that are longer than the cell edge-length. As we consider periodic boundary conditions, such a fiber may wrap around the cell several times, as shown in Fig. 1. In particular, it may intersect itself. Appropriate strategies are required to reliably deal with such a situation.
The second challenge is more subtle, and concerns the statistical data to be prescribed, balanced by the actual data that is available. Indeed, although micro-computed tomography (μCT) scans of fiber composites are readily available, extracting the full fiber length-orientation distribution from voxel data is far from trivial. Indeed, algorithms for segmenting individual fibers from voxel data are still subject of the latest research [35], yet only detect around 80% of the fibers. Thus, we need to live with the available data. For extracting the length-weighted fiber-orientation tensors of second and fourth order from voxel data, reliable algorithms are available [36][37][38]. Moreover, the fiber-length distribution may be  [39]).
For the work at hand, we assume that the fiber orientation and the fiber-length distribution are independent. Recent experimental data [40] suggests that there is a coupling between fiber length and fiber orientation, yet only weakly so. In particular, our working assumption may be a good starting point, in particular in view of the data available. We consider the volume-weighted fourth-order fiber-orientation tensor as the target statistical quantity. In the computational experiments, it turns out that prescribing this statistical quantity indeed keeps the statistical fluctuations rather low for the considered scenario.
For concreteness, we consider the Weibull distribution [41] for modeling the statistics of fiber lengths, as it was shown previously to accurately model length data of short fibers [42][43][44]. We introduce a strategy to determine the governing parameters from prescribed (volume-weighted) mean length and its standard deviation, which appears to be innovative. Moreover, we introduce a quasi-random based strategy to sample the fiber lengths, minimizing stochastic artifacts in the process.
This work is organized as follows. Section 2 introduces the players of our game in a mathematically precise context, i.e., fiber-length and orientation distributions, the Weibull distribution and the relevant closure approximations for the fiber-orientation tensor. The algorithmic counterparts concerning the sampling of both length as well as orientation and the required modifications of the SAM algorithm are described in sect. 3. We demonstrate the capabilities of the introduced procedures in sect. 4 for a commercial PA6GF35, studying the necessary resolution, RVE (representative volume element) size and the influence of the mean as well as standard deviation of the length distribution on the effective elastic moduli. Moreover, we validate the approach with published experimental data [40]. The appendices provide details for determining the Weibull parameters and shed light on the similarities and differences of the exact and the maximumentropy closure.

Fiber-orientation and fiber-length distributions
We consider short-fiber reinforced composites, i.e., we assume that each fiber in such a composite may be described by a straight cylinder with length , principal axis p and diameter d. Typically, the variations of the diameter between different fibers of a composite is negligible, whereas both the fiber length and the fiber orientation p vary significantly. The latter two characteristics may be described in terms of a length-orientation distribution function where S 2 denotes the unit sphere in R 3 . The lengthorientation distribution function is a probability distribution which satisfies the symmetry condition which reflects the sign ambiguity when describing a cylinder in terms of its principal axis. In practice, the full length-orientation distribution function ψ is not known, and only partial information and estimates are available. A classical way to estimate the fiberlength distribution ρ : proceeds via incineration of the matrix material (see Tab. 1 in Goris et al. [39]), and counting the fiber length of the individual fibers under the microscope. In particular, the connection between fiber length and orientation is lost. Fiber-orientation data is typically encoded via fiberorientation tensors [45,46], which can be determined from μCT images [36][37][38]47,48]. Fiber-orientation tensors correspond to moments of the length-orientation distribution ψ, and the two most popular fiber-orientation tensors are of second and fourth order, Fiber-orientation tensors have their roots in injectionmolding simulations where the second-order fiberorientation tensor A is often the only information available [49,50]. Although considering higher moments in the length variable is conceivable, the length-averaged form (2.4) is the most natural, as it corresponds to a volume averaging of the cylinders. In particular, this form typically arises when computing fiber-orientation tensors from μCT images [51]. Classically, for short-fiber reinforced composites, the details of the fiber-length distribution are ignored, and only the mean fiber length¯ is considered (with different possibilities to determine this mean). Put differently, the fiber-length distribution ρ is assumed to be concentrated at the specific fiber length¯ . In this case, the fiber-orientation distribution carries all relevant information about the length-orientation characteristics of the composite. Apparently, there is a gap between the data which is available and the data which would be necessary for estimating the effective mechanical properties of short-fiber composites. More precisely, (at least) three shortcomings are evident from the discussion.
1. The length-orientation distribution function (2.1) is usually unavailable. The latest computational algorithms [35,[52][53][54] identify only about 80% of the fibers, leaving incomplete statistical information. Rather, data on the fiber-length and the fiber-orientation distributions is available separately. 2. Experimental data on the fiber-length distribution (2.3) is usually presented by counting and binning [42,44,55]. Such results are typically rather sensitive to the size of the bins with a significant influence on statistical quantities of interest, like mean and standard deviation. 3. Only rather partial information is available on the fiberorientation tensor in terms of the second-order fiberorientation tensor (2.4).
In this manuscript, we deal with these challenges from an engineering perspective. To deal with the first item, we make the working assumption that the fiber-length and the fiber-orientation distribution are uncoupled, i.e., the lengthorientation distribution (2.1) may be written in product form in terms of a fiber-length distribution (2.3) and a fiberorientation distribution (2.5). Mathematically speaking, the random variables and p are assumed to be independent. From a physical point of view, it appears plausible that there is some coupling between the fiber length and the fiber orientation. Indeed, for longer fibers to arrange properly at high volume fractions, it is advantageous to align, at least locally, whereas a high degree of orientational dispersion is possible for the shorter fibers. However, for short fibers, these differences of the orientation distribution for different fiber lengths is not very pronounced [35,Fig. 14]. In particular, our assumption appears reasonable as a working assumption.
To deal with the second item on the list of challenges, we consider specific fiber-length distributions to be able to work with a concise description of the fiber-length distribution. For the work at hand, we consider the Weibull distribution [41], which was found to be a flexible and accurate model for describing fiber-length distributions in short-fiber composites [42,43]. The details comprise sect. 2.2.
Last but not least, the use of fiber-orientation tensors is standard in injection molding, and the independence assumption (2.6) permits us to conclude i.e., the expressions for the fiber-orientation tensors (2.4) simplify to expressions, that are familiar from investigations with only a single fiber length. In particular, the fiber-orientation closureapproximation technology [56][57][58] developed for this scenario becomes available, see sect. 2.3. Indeed, going beyond the independence assumption (2.6) for the length-orientation distribution requires innovative closure techniques to be developed, and represents a promising research field for future contributions.

The Weibull distribution
The Weibull probability distribution [41] has been used frequently to model the fiber-length distribution in short-fiber composites [40,42,43]. Its density function [59, eq. (4-43)] reads where k and λ are positive parameters known as the the shape and scale parameter. The shape parameter k is sometimes referred to as Weibull modulus and is dimension-free. The scale parameter λ has dimensions of length for the model at hand. In practice, it is often more convenient to prescribe elementary statistical quantities instead of the parameters k and λ. For instance, it would be possible to prescribe the (numberweighted) mean μ and standard deviation σ , defined via μ = ∞ 0 ρ λ,k ( ) d and (2.10) and to determine the Weibull parameters λ and k. In engineering practice, it is more common to work with volumeweighted quantities instead of number-weighted ones. For the problem at hand and with the assumption of cylindrical fibers with equal diameters, the volume-weighted mean m and standard deviation s are given by the expression For given (positive) values of m and s, the parameter k solves the equation (2.14) A derivation of Eqs. (2.12) and (2.14) is given in Appendix A. Examples for Weibull probability distributions are shown in Fig. 2 for varying (volume-weighted) mean and standard deviation. We observe that, for fixed mean, once the standard deviation gets too large, the probability distribution develops a pole at zero, leading to a monotonically decreasing function. In contrast, for sufficiently large standard deviation, the probability density function attains a more familiar bell-like shape.
Last but not least, let us point out that the inequality μ ≤ m (2.15) between the number-weighted mean length μ and the volume-weighted mean length m holds for any fiber-length applied to a probability distribution.

Fiber-orientation closure approximations
When manufacturing components made of short-fiber reinforced materials the microstructure is typically varying throughout the component. In particular, the microstructure characteristics like the fiber-orientation distribution (2.5) vary throughout the component. Moreover, the fiberorientation distribution contains rather fine details, which are surplus to requirements when considering process simulations and a subsequent evaluation of the mechanical properties. Therefore, it is more convenient to work with the secondand fourth-order fiber-orientation tensors (2.8) introduced by Advani-Tucker [46]. Indeed, the most popular model for the evolution of the fiber orientation in fiber suspensions, the Folgar-Tucker equation [49], involves these two tensors. Moreover, it is well known that the fourth-order fiber-orientation tensor characterizes the effective elastic behavior of short-fiber composites [25,60,61], at least for fibers with uniform length. Due to reasons of storage economy, the second-order fiber orientation tensor A is usually the primary quantity of interest, and the fourth-order fiber-orientation tensor A is approximated by a so-called closure approximation [49,56,58], a tensor-valued function F which establishes a relationship As the second-order fiber-orientation tensor may be recovered from the fourth-order fiber-orientation tensor via the relationship A : Id = A in terms of the 3 × 3-identity Id, the information content of the fourth-order fiber-orientation tensor exceeds that of its second-order counterpart. The additional information is lost when working with a closure approximation (2.19). Yet, working with closure approximations is the rule rather than the exception [50,62,63], both for process simulations and mechanics. The success of closure approximations lies in their construction which accounts for constraints imposed by the underlying physics and selects suitable "plausible" fourth-order fiber-orientation tensors. We refer to Kugler et al. [58] for a recent review article.
The original point of view (2.19) permitted certain pathologies, which lead to unphysical or mathematically contradictory properties of the estimated higher-order tensors. For instance, the quadratic, the linear and the hybrid closure, introduced by Advani-Tucker [46], do not arise from a fiber-orientation distribution, in general.
Over time, it became apparent that it is more convenient to consider closure approximations which provide an estimate of the entire fiber-orientation distribution function ϕ based on prescribed fiber-orientation tensors, reducing the computation of the higher-order moments to a simple postprocessing of the entire distribution function ϕ. The two most popular closures of this type, taking the second-order fiberorientation tensor A as input, are the exact closure [64,65] and the maximum entropy closure [57,60].
The exact closure is based on the angular central Gaussian distribution [66] parametrized by a symmetric and positive definite 3 × 3-matrix B which is moreover unimodular, i.e., satisfies det B = 1. It can be shown that the ACG distributions (2.20) represent exact solutions of the fiber-orientation dynamics (with vanishing Folgar-Tucker diffusivity [49]), see Verley-Dupret [67].
The maximum entropy closure [57,60] utilizes the Bingham distribution [68] ϕ mE which involves a symmetric 3 × 3-matrix B and a prefactor c B > 0 which ensures that the total mass is one. This closure maximizes the information-theoretic entropy.
In either of the two cases, for prescribed second-order fiber-orientation tensor A, the matrix B may be determined (numerically) as a solution of the equation Some care has to be taken if the second-order fiberorientation tensor is singular or close to singular, i.e., the equation det A 1 holds [64,65,69]. In any case, with the estimated fiber-orientation distribution ϕ at hand, the necessary higher-order information can be post-processed, e.g., the fourth-order fiber-orientation tensor (2.23) Moreover, we may draw orientation vectors from the identified distribution. The exact and the maximum entropy closure give rise to rather similar effective elastic properties for most fiber-orientation states, see Appendix B. For the work at hand, we rely upon the exact closure approximation (2.20) due to its closer connection to the underlying physics and the simpler sampling, see sect. 3. Let us conclude this section by discussing how to solve Eq. (2.22) for the exact closure (2.20) with a robust numerical strategy. For this purpose, integrals of the form where f ( p) is a monomial in the components of the vector p, need to be evaluated. Montgomery-Smith et al. [64,65] proposed a numerical strategy to evaluate such integrals based on the Carlson form of the elliptic integrals [70]. Efficient strategies for solving eq. (2.22) based on this strategy, in particular concerning a clever initial guess, are discussed by Ospald & Herzog [69]. However, the case of degenerate fiberorientation tensor A, i.e., whenever det A = 0, requires some attention. In such a case, the fiber-orientation distribution is concentrated on a plane, and the expression (2.20) is no longer meaningful. Clearly, the special case det A = 0 can be handled explicitly, see Görthofer et al. [71], for elaboration. Yet, it is clear that a scenario with small but nonzero determinant of the tensor A is numerically challenging and a more robust strategy is useful. For this purpose, we note that the change of coordinates transforms the integral (2.24) into the form see Ospald et al. [72]. In particular, integrating against the ACG density (2.20) gets translated into an integration task for the uniform distribution on the unit sphere S 2 ⊆ R 3 . With this insight at hand, suppose a set q 1 , . . . , q N ∈ S 2 of integration points with corresponding positive weights w 1 , . . . , w N is given, which permits to approximate integrals of the form Then, we may approximate the integral (2.26) by the expression (2.28) Returning to our original task (2.22), we are thus left with solving the equation for B provided A is given. Still, the problem with the determinant remains. For this purpose, we record that the transformation (2.25) is invariant w.r.t. the scaling B → λ B for any λ > 0. Moreover, computing the inverse square root of a matrix comes with a certain degree of computational effort. Therefore, we introduce the quantity C = holds. Whenever det C > 0, the matrix B may be recovered. However, the matrix C also makes sense in the degenerate case. For instance, if all orientations are contained in the 1-2-plane, setting the (i, 3)-components of the matrix B (i = 1, 2, 3) to zero projects all vectors q onto the 1-2-plane. Therefore, we are led to solving the equation for the symmetric matrix C ∈ R 3×3 with unit trace. As the matrices A and C share the same eigenbasis [64,65], it is numerically convenient to diagonalize the Eq. (2.31) with non-increasing diagonal elements in A and to eliminate the 33-component of the C-matrix explicitly Then, Eq. (2.31) becomes a nonlinear equation for only two variables, and conventional numerical root finding strategies may be applied.
To conclude this section, let us remark that we found using symmetric (antipodal) spherical t-designs [73] quite useful as general-purpose integration points with equal weights. Moreover, having the matrix B at our disposal is not necessary for the developments in this article. Rather, the matrix C is sufficient. For instance, the fourth-order fiber-orientation tensor A may be computed as follows

Efficient sampling from a Weibull distribution
Suppose a rectangular computational cell Q = [0, is given, together with a fiber-length distribution ρ , a fixed diameter d and a targeted fiber-volume fraction φ. We wish to sample a number N of cylindrical fibers with lengths 1 , . . . , N which follow the distribution ρ and realize the desired volume fraction φ as closely as possible, i.e., the condition should hold. If the fiber lengths are uniform, i.e., i ≡¯ holds for all indices i, we may choose where the brackets denote rounding to the nearest integer. For a general fiber-length distribution, a similar approach could be pursued with the (number-weighted) mean in place of the parameter¯ in Eq. (3.2). However, such an approach suffers from inaccuracies when computing the empirical number-weighted mean compared to the exact value. In particular, errors may be introduced compared to the desired volume fraction φ. An alternative approach proceeds by sampling fiber lengths 1 , 2 , . . . inductively and to stop whenever the running sum exceeds the desired volume fraction φ. Then, the number of sampled fibers N is either chosen asÑ orÑ −1, depending on which number leads to a closer agreement with the targeted fiber-volume fraction.
The simplest strategy proceeds by sampling the lengths in a random fashion from the given probability distribution ρ. However, a rather large number of samples is required to approximate the desired characteristics of the prescribed fiber-orientation distribution closely. Indeed, for any random variable f : [0, ∞) → R, the exact expectation is approximated by the empirical expectation For instance, in case f ( ) = , this difference concerns the (number-weighted) mean length, and similarly for the standard deviation. For a random variable with finite variance, random sampling leads to an error which decreases as 1/ √ N . Put differently, to decrease the error by one order of magnitude, the sample size needs to be increased by a factor 100.
For the Weibull distribution (2.9), the integral (3.5) attains the specific form which we may rewrite in the form with the substitution u = e −( /λ) k . In particular, the transformation permits us to push forward any sampling method on the unit interval to length space, giving rise to a sampling method for the Weibull distribution. Instead of the classical random sampling on the unit interval, which may be interpreted as a Monte Carlo integration rule [74], we rely on quasi-random sampling [75], which is based on low-discrepancy sequences [76]. Assuming sufficient smoothness of the integrand f , quasi-random sampling leads to an error which decreases as 1/N up to a logarithmic correction [76]. Roughly speaking, to reduce the error by an order of magnitude, the number of sampling points needs to be increased by a factor of ten. Further improvements may be reached by the scrambling technique [77], which gives rise to an error which decreases roughly as 1/N 3/2 , sufficient smoothness of the integrand premised. Figure 3 shows a comparison for the different sampling methods. We consider random sampling, together with (scrambled and non-scrambled) Sobol sequences [78,79], implemented in PyTorch [80]. To get reproducible results, we ran the sampling 20 times and computed the average of the relative error. Moreover, we consider the Weibull data set We observe that, for random sampling, the relative error converges only slowly, roughly at the predicted N −1/2 rate. Even for N = 1000 samples, the relative error exceeds one percent. The Sobol sequence leads to a smaller error compared to random sampling for all sample sizes. There is a fluctuation by about an order of magnitude. Yet, even the worst-case is much better than random sampling, typically by roughly half an order of magnitude. For N = 1000 fibers, the relative error in the number-weighted mean is smaller than 0.5%. A relative error of 1% is already achieved at N = 200 samples in the worst-case scenario. Thus, only one fifth of the number of samples is necessary to reach the accuracy of the random sampling. Last but not least, we turn our attention to sampling based on scrambled Sobol sequences [77]. We observe an error scaling as N −1 , which is slightly than expected from theory. Still, the error scales roughly as for non-scrambled Sobol sampling, but at half a magnitude lower. In particular, for N = 1000 samples, the relative error is below 0.1%. In turn, 1% relative error is reached for as little as 80 samples. Please keep in mind that scrambled Sobol sequences involve a degree of randomness as well, enabling a statistical error estimation based on the empirical (unbiased) variance in a similar way as for purely random sampling. Thus, scrambled Sobol sequences represent the method of choice for our purposes, and are-to a significant extent-responsible for the high degree of reproducibility of the results of the computations in Sect. 4.

Robust sampling from an ACG distribution
In this section, we discuss how to sample fibers when using the exact closure approximation [64,65]. This is possible, as the exact closure approximation comes with a full fiberorientation density function the angular central Gaussian distribution [66]. Sampling from this distribution is straightforward. However, complications arise whenever the fiber orientation degrades to a two-dimensional state. Such a state cannot be described by a continuous distribution function, as a continuous function on the sphere, which vanishes everywhere except for a plane, has to vanish everywhere. As a remedy, we re-use ideas from sect. 2.3. Recall that using a symmetric positive semidefinite matrix C ∈ R 3×3 with trace one turned out to be more convenient when describing the exact closure approximation encompassing the degenerate cases. Suppose the matrix C is identified as a solution of Eq. (2.31). Let 1 , . . . , N denote the fiber lengths drawn in the previous section. Then, we draw N vectors q i ∈ R 3 from a standard normal distribution in three dimensions in a random fashion. Finally, the transformation gives rise to appropriate directions for the fibers in an asymptotically consistent and robust way. Please note that we rely upon classical random sampling, as the eventual match of the realized fourth-order fiber-orientation tensor with the exact fourth-order fiber-orientation tensor will be ensured by the SAM method, to be described in the next section.

An augmented SAM algorithm
The sequential addition and migration (SAM) algorithm [25] is a method for generating periodic assemblings of non-overlapping cylindrical fibers with a prescribed volume fraction and fourth-order fiber-orientation tensor. Upon convergence, the algorithm produces a fiber structure which is perfectly periodic, where no pair of distinct fibers overlap and where both the desired volume fraction and the targeted fourth-order fiber-orientation tensor are realized up to the prescribed numerical precision. Originally, the method was introduced for fibers with an equal fiber length. In this work, we report on an extension of the methodology. For a the cell Q = [0, and a given fourth-order fiber orientation tensor A, for instance arising from a closure approximation, we suppose that the fiber lengths 1 , . . . , N and initial fiber orientations p 0 1 , . . . , p 0 N have been drawn in such a way that the target fiber-volume fraction is realized to the desired degree. Moreover, we initialize the fiber centers x 0 1 , . . . , x 0 N by drawing from a uniform distribution on the cube [0, max(Q 1 , Q 2 , Q 3 )] 3 and discarding those centers which lie outside of the cell Q. The SAM algorithm [25] proceeds by the iterative scheme for i = 1, . . . , N and a step size τ , typically chosen as 0.3. Here, the vector ξ k i j realizes the distance between the spheroids described by the triples (x k i , p k i , i ) and (x k j , p k j , j ), and δ k i j = max(0, D − ξ k i j ) in terms of the fiber diameter D. The vector d k i computes as with the parameter the current orientation tensor and s k i j ∈ [−1, 1] defined by equation (4) in Schneider [25]. To illustrate the procedure, let us investigate what happens when two fibers do intersect, as shown in Fig. 4a. In order to remove the overlap, the fibers have the option to move their centers x i and x j , respectively, or to rotate their directions p i and p j appropriately. The distance to move is governed by the overlap parameter δ i j , and encoded by formula (3.12). Of course, it is necessary to balance the shifts of the centers and the rotations appropriately, see Fig. 4b. This is done implicitly via the parameter " i in (3.3) that arises from the inertia tensor of the corresponding spherocylinder.
The iterative scheme (3.12) is terminated whenever the fibers are in a non-overlapping condition and the match between the fourth-order fiber-orientation tensors is sufficiently good. We use a relative difference in Frobenius norm of the Voigt representations of the tensors below 10 −4 for this article throughout.
The bulk of the computational effort of the SAM algorithm (3.12) is hidden in determining the inter-particle collisions. Indeed, for N fibers, a naive implementation requires N (N + To reduce this effort, we utilize nested Verlet lists as proposed in Schneider [25].
When it comes to a fiber-length distribution, an additional difficulty arises. In the original work [25], fibers were assumed to be shorter than half the cell dimensions. In this way, periodic distance computations could be carried out with minimum effort. However, when considering a distribution of the fiber length it is rather typical that there are a few rather long fibers. If the cell dimensions were required to exceed the maximum length by a factor of two, the considered cells would be rather large. Thus, a workaround is required.
Suppose we wish to compute the distance between two fibers (x i , p i , i ) and (x j , p j , j ) on a periodic cell Q = [0, Q 1 ] × [0, Q 2 ] × [0, Q 3 ]. Suppose that the fiber length i exceeds Q/2 for Q = min(Q 1 , Q 2 , Q 3 ). Then, we break the first fiber into K individual segments, each of which is shorter than Q/2, and compute the K distances between the fiber segments and the j-th fiber. This strategy is illustrated in Fig. 5a, where the red fiber is decomposed into K = 7 segments. The shortest distance among these computed distances is then selected as the distance between the entire i-th fiber and the j-th fiber. Clearly, if the j-th fiber exceeds the length requirements, a similar approach is successful.
Last but not least let us mention the possibility that a rather long fiber intersects itself. Such a situation may be handled by a modification of the proposed strategy. The fiber is broken into K segments. Then, overlap is checked between the nonadjacent segments. This is illustrated in Fig. 5b for a fiber that is decomposed into K = 5 segments. Only between the pairs with a √ sign, an overlap check needs to be made. Due to symmetry, six checks need to be made for the example at hand. In case of overlap, an additional term (for i = j) arises in the sums (3.12) and (3.13).

Setup
The presented algorithms were implemented in Python with Cython extensions. Most of the code runs in serial, only the inter-fiber distance computations were parallelized with OpenMP. The timings were recorded on a PC with a six-core Intel i7 CPU and 32GB RAM. For computing the effective elastic properties, we rely upon an FFT-based computational homogenization code [81,82]. We utilize a discretization on a staggered grid [83], which is solved by a conjugate gradient method [84][85][86] up to a relative tolerance of 10 −5 . We refer to the overview article [87] for background.
To compute the effective elastic constants, we compute the effective stresses corresponding to six independent strain loadings. Subsequently, the effective elasticity tensor is approximated by an orthotropic fourth-order tensor [88] to enable interpreting the corresponding engineering constants.
As our point of departure, we consider a commercially available PA6GF35, i.e., a polyamide-6 matrix with 35% (by weight) glass-fiber reinforcement. As our "standard setup" we select fibers with a diameter of 10μm and a length of 250μm. The data is summarized in Table 2. The matrix and fiber are furnished with the isotropic elastic parameters given in Table 1 , following Hessman et al. [40].

On the necessary resolution
To compute the effective mechanical properties of composites, a number of error sources needs to be quantified. The most obvious such error source is the resolution, i.e., the mesh size of the discretization. Under appropriate hypotheses, the effective properties converge upon mesh refinement [89][90][91]. In particular, a sufficiently fine mesh is mandatory for accurate results. On the other hand, fine meshes also come with a higher number of degrees of freedom, increasing the computational effort significantly. Indeed, for regular meshes in three spatial dimensions, the effort of FFT-based solution techniques were shown to roughly increase by a factor of eight when halving the mesh width [83,86], which is close to optimal for these kinds of problems. Therefore, a suitable compromise between accuracy and manageable computational effort needs to be chosen. We consider the setup shown in Table 2, in particular with a uniform fiber length. Such an approach eliminates other sources of error, e.g., coming from the fiber-length distribution. We generated cubic microstructures with an edge length of 300μm, one for the three second-order fiber-orientation tensors corresponding to isotropic, planar isotropic and unidirectional (i.e., aligned) fiber-orientation states.
We set the isolation distance between fibers to 2μm, i.e., 20% of the fiber diameter consistently throughout the manuscript.
We change the resolution by considering different voxel sizes 4 μm, 2μm, 1 μm and 0.5 μm, giving rise to voxel models with 75 3 , 150 3 , 300 3 and 600 3 hexahedron elements, see Fig. 6 for an illustration in the case of an isotropic fiber orientation. For the finest resolution, a single fiber is resolved by 20 voxels across the diameter. This resolution decreases to ten and five voxels per diameter for a voxel size of 1 μm and 2 μm. The coarsest resolution, h = 4 μm, works with only 2.5 elements per fiber diameter, on average.
The effective directional Young's moduli of the orthotropic approximation of the effective elastic properties are shown in Fig. 7 for the three considered fiber-orientation states (4.1). For the isotropic orientation, we observe that the three direc-  For the planar isotropic case, the in-plane Young's moduli E 1 and E 2 exceed the out-of-plane Young's modulus E 3 . Coarsening the resolution has little influence on E 3 . Even the coarsest resolution deviates only by 1.3% from the finest resolution. For the Young's moduli E 1 and E 2 , such a coarsening leads to a slight underestimation of the moduli. For the coarsest resolution, there is an 6.6% deviation in E 1 . For h = 2 μm and h = 1 μm, the relative deviations decrease to 1.6% and 0.3%, respectively. For E 2 , the relative errors are similar. For the out-of-plane modulus E 3 , the relative deviation to the finest resolution is at 1.3%. Refining the resolution leads to even lower errors, at 0.01% and 0.15%.
For the uni-directional case, the transverse Young's moduli E 2 and E 3 coincide to the naked eye. They are much smaller, almost by a factor of three, than the parallel Young's modulus E 1 . At the coarsest resolution , E 2 is already 0.3%-close to the value at the highest resolution. Similarly, the coarsest resolution overestimates the modulus E 1 1.1%.
To sum up, we observe that the largest errors occur for the planar isotropic fiber orientation, and for the in-plane Young's moduli. Selecting h = 2 μm ensures that the Young's moduli are computed with less than 2% relative error, giving us sufficient confidence in the obtained results.

On the size of the representative volume element
Due to the randomness of the geometry of fiber-reinforced composites, the computed apparent elastic properties on unit cells involve a degree of randomness, as well. The effective properties, which are deterministic, emerge only upon considering sufficiently large cells, so-called representative volume elements [26][27][28]. A large body of studies were devoted to this concept (see Doškář et al. [92] for a recent overview), in particular as an increased size of the considered volume element necessarily increases the computational effort, as well. Therefore, it is desirable to work with volume elements that are large enough to closely approximate the effective properties and, at the same time, as small as possible to keep the effort manageable. The errors emerging when working with finite-sized cells may be naturally classified into two categories. Suppose we work with a cubic cell with an edge length Q. In this section, we write Q to emphasize that we work with a cubic cell. The apparent elastic properties computed for cells of this size fluctuate around a mean, and the associated variance quantifies the stochastic fluctuations of the elastic properties. This so-called random error (or dispersion) depends on the cell size and becomes smaller as the cell edge-length Q increases, with a rate that depends on the underlying stochastic model for the microstructure [28,93,94]. In addition to the random error, there might be a difference between the mean properties computed for a fixed cell size and the effective properties [28,95]. This error is called systematic error (or bias), and is more difficult to assess and grasp. Typically, the random error is much larger than the systematic error, and the attention focuses on a suitable estimation of the random error.
Previous studies have shown that both the random and the systematic error increase when using traction or displacement boundary conditions or working with non-periodized geometries [28,96,97]. As our generated microstructures are intrinsically periodic, the periodic boundary conditions that come naturally with FFT-based solvers do help us in this regard.
We would like to understand whether increasing the variance in the fiber length also necessitates working with larger unit cells. For this purpose, we consider a uniform fiberlength distribution as our point of departure.
We investigate the three extreme fiber-orientation states (4.1) with the setup described in Table 2, the material parameters of Table 1 and the previously identified voxel size h = 2 μm. We consider three cubic unit-cell sizes, with edge lengths Q = 300 μm, Q = 600 μm and Q = 900 μm. The smallest edge length, Q = 300 μm, only slightly exceeds the fiber length¯ = 250 μm.
The results of ten runs are recorded in Fig. 3. The orthotropic approximation error lies around 1%, and is thus smaller than the error induced from the resolution. In particular, it is meaningful to only consider the orthotropic engineering constants. We observe that the standard deviations of the directional Young's moduli are on a rather small level-they do not exceed 0.5% of the corresponding mean for Q = 300 μm. Increasing the cell size decreases the standard deviation, as well. For Q = 600 μm, the standard deviations do not exceed 0.2%, and drop to less than 0.09% for Q = 900 μm.
Concerning the systematic error, the highest deviation is reached for E 1 in the unidirectional case with a relative deviation of 1.2% compared to the large-scale modulus.
With these reference results at hand, we consider a Weibull-distributed fiber-length distribution with a (volumeweighted) mean fiber length m = 250 μm and a standard deviation s = 100 μm, see also Fig. 2 for an illustration of the fiber-length distribution function. The statistical measure of ten runs are collected in Table 4. Further complications arose for the unidirectional case and the smallest considered edge length Q = 300 μm. If a fiber of length is axis-aligned in a cubic cell with edge length Q, it will always intersect itself if ≥ Q, i.e., if the fiber length is larger or equal to the cell edge-length. For the considered standard deviation, a non-trivial number of fibers were simply too long to fit into the cell with Q = 300 μm. Therefore, after drawing the fiber length , we restricted it as follows ← min( , 0.99 Q). (4.2) For non-UD fiber orientations, this issue did not arise. Taking a look at the results collected in Tabble 4, we observe that the mean values do in fact differ from the reference results with zero variance, but the standard deviations of the apparent Young's moduli are on a similar level. Indeed, Fig. 8 compares the relative standard deviations of the considered orthotropic Young's moduli for the investigated orientations and varying cell edge-length. The standard deviations do not exceed 0.5% for the smallest cells, and decreases for increasing cell size. For Q = 300 μm, the standard deviations are a bit larger than for s = 0, but not significantly so. Similarly, the systematic error is also on a rather low level. Indeed, even for E 1 in the aligned case, the relative deviation between the means of the smallest and the largest cell is as low as 1.6%, even for the cropped fiber lengths (4.2).  To sum up, we could show that working with a considerable fiber-length variability in the fiber-length distribution does not negatively affect the necessary cell size when computing the effective elastic properties of short-fiber composites.

Results of varying fiber-length parameters
In this section, we investigate the effects of varying the parameters of the Weibull fiber-length distribution. For this purpose, we continue to investigate the commercial PA6GF35 with parameters summarized in Table 1 and Table 2. To keep the number of changing parameters manageable, we restrict to a fixed fiber-orientation state, described by the exact closure of the second-order fiber-orientation tensor as determined by Hessman et al. [40] with a μCT-based algorithm which identifies individual fibers [35]. Fig. 9 shows the influence of a change in the fiber-length distribution on the computed effective orthotropic Young's moduli of the composite. We consider variations of the (volume-weighted) mean and standard deviation individually. These changes correspond to the fiber-length distribution functions shown in Fig. 2. We start with a fixed standard deviation s = 120 μm and a continuously changing standard deviation, see Fig. 9a and 9b. We observe that the transverse Young's moduli E 2 and E 3 are scarcely affected by such a change. In contrast, the longitudinal Young's modulus E 1 increases with increasing mean fiber length. Between m = 250 μm and m = 325 μm, the modulus E 1 increases by roughly half a GPa. This difference increases even more when approaching the highly aligned case, e.g., unidirectional fibers. Taking a look at varying standard deviations with fixed mean length m = 250 μm, see Fig. 9b, we also observe that the transverse Young's moduli are hardly affected. The longitudinal Young's modulus E 1 decreases with increasing standard deviation. This result conforms with mechanical intuition. It is not clear whether a saturation point is reached for a → ∞, as generating the high standard deviations also came with the challenge of dealing with very long fibers (and exceedingly large volume elements). Moreover, for high standard deviation, the "turning point" is reached where the fiber-length distribution does not have a unique maximum any more, but shows a consistently decreasing trend. Yet, we observe a difference of roughly one GPa, i.e., about ten %, between a state with uniform fiber lengths and a strongly dispersed fiber-length distribution.
Last but not least, we investigate the runtime of the fibergeneration process, and how it is influenced by the fiberlength parameters.
Considering the three extreme orientations 4.1, we ran ten simulations on cubic volume elements with edge length Q = 600 μm. For a start, we consider a PA6GF35 with fixed standard deviation s = 120μm and increasing mean length m. The results are shown in Fig. 10a. The runtime for the unidirectional case is extremely small, as expected. Indeed, even random sequential addition (RSA) methods [14][15][16][17] are able to generate unidirectional states at high filler fraction. Non-aligned fiber-orientation states are more difficult to handle, and this is where most traditional algorithms have their limitations. Both for the isotropic and the planar isotropic fiber-orientation state, the overall runtimes never exceeded one minute. Taking into account the runtime of the subsequent computational homogenization, such an effort is clearly acceptable. We observe that, for almost all considered cases, generating the isotropic state required a higher effort than for the planar isotropic state. Moreover, increasing the mean fiber length also led to an increase in the runtimes for the isotropic case. The runtimes for the planar isotropic case were not adversely affected for increasing mean fiber length. Rather surprisingly, the runtime increased for the smallest considered fiber length. This behavior is rooted in the difficulty inherent to generating exactly planar fiber-orientation states. The problem comes from the fact that for a smaller mean fiber length, the number of considered fibers increases. In turn, attaining a purely planar fiber-orientation state becomes more difficult, i.e., the iteration count increases superlinearly with the fiber count.
To demonstrate the capabilities of the algorithm, we fixed the mean length as well as the standard deviation and increased the considered volume fraction. The results, see Fig. 10b, show that the algorithm was able to generate up to 30% filler fraction in less than three minutes in a reliable and robust manner. For the non-aligned orientations, we observe a strong increase in the runtimes necessary to generate the volume elements for increasing volume fraction.

Comparison to experimental data
In this section, we compare the introduced methodology to real data, both in terms of measured fiber-length data and concerning measured directional Young's moduli. Hessman et al. [40] consider a PA6GF35 with short-fiber reinforcement. Based on μCT-scans, individual fibers where identified, permitting a subsequent statistical analysis. The fiber diameter emerged as 10μm, as expected, and the determined fiberlength distribution is shown in Fig. 11a. Moreover, the fiberorientation tensors (2.4) may be determined, as well. The complete volume-averaged second-order fiber-orientation tensor is given in Eq. (4.3). However, an analysis of the fiber orientation across the thickness of the plate was also given [40, Fig. 4]. The orientation follows the classical skin-coreskin layering typical for short-fiber composites [98].  Table 5. Compared to the experimental data, the longitudinal Young's modulus is reproduced with high accuracy. The transverse Young's modulus is slightly overestimated, but still contained in the 99.9%-confidence interval. We ran the simulations ten times, but the standard deviation of the effective moduli turned out to be negligible, again.  As a comparison to the three-layered structure, we generated a volume element with a homogeneous fiber-orientation distribution (4.3) and the identified Weibull distribution, see Fig. 11a. The results, see mean data in Table 5), show that the longitudinal Young's modulus is about half a GPa lower than for the sandwich structure. Also, the transverse Young's modulus is about 0.2GPa lower. These results indicate that the load-carrying capacity of the layered structure is improved in both directions compared to the one without layers. Moreover, the longitudinal Young's modulus lies below the experimental values, actually at the lower end of the confidence interval, whereas the transverse Young's modulus matched quite well with the experimental value. Last but not least, we consider the case with only a single fiber length, fixed at = m ≡ 332.65μm. This case, labeled by fixed length in Table 5 leads to a slightly higher longitudinal Young's modulus, which lies below both the experimental mean and the prediction for the sandwich structure. The transverse Young's modulus is rather close to the mean data case, as was already observed in the previous section.
Last but not least, a glance at the runtime reveals that the runtime for the microstructure generation was significantly less than two minutes, for all cases considered.

Conclusion
This work was devoted to extending the SAM algorithm to varying fiber lengths and to studying the effects on the effective elastic properties of short-fiber reinforced plastics. We draw the following conclusions.
• There are many moments of the fiber length-orientation distribution (2.1) parametrized by non-negative integers m and k, which could be enforced in the SAM algorithm. For the work at hand, we considered m = 1 and k = 4, together with approximations for M m,0 via the quasi-random sam-pling of the fiber lengths. Such a strategy turned out to be successful in reducing the statistical variations of the computed effective elastic properties to a minimum. However, it might be of interest whether enforcing higher-order moments is beneficial for the effective of nonlinear and inelastic mechanical properties. • We considered the Weibull distribution to describe the fiber-length distribution. Working with other fiber distributions is possible with little adaptation. For instance, long-fiber composites may show two distinct maxima in the fiber-length distribution function, as a result of fibers breaking during the manufacturing process. • Our studies revealed that an increase in the standard deviation of the fiber length did not adversely affect the necessary RVE size. Such a result is encouraging, and should not be taken for granted. Indeed, a claim that an increase in the standard deviation would necessitate larger computational cells appears to be plausible. Yet, the influence of the fiber-length distribution appears to be more pronounced for inelastic material behavior, a topic which may deserve further studies. • It was possible to closely match the longitudinal Young's modulus of the experimental structure, whereas the transverse Young's modulus was overestimated a little. There is a number of possible error sources. For a start, as only 80% of the individual fibers get segmented correctly, there is still some uncertainty in both the fiber orientation and the fiber-length distribution. Moreover, we do not have access to the layer-wise fourth-order orientation tensors whose consideration may certainly enhance the predictive quality. Last but not least, we consider the fiber-length distribution to be uniform throughout the thickness. Still, despite all these uncertainties, the match is certainly sufficient for most engineering applications. • In this work, we relied upon the hypothesis that the fiber length and the fiber orientation are independent. Experimental data suggests otherwise, and the introduced error remains to be quantified. As a first step, it appears promising to estimate the entire fiber length-orientation distribution from the available data. • We could show the benefits of the quasi-random sampling of the fiber length. This is a key technology, and we advocate it to become the standard for expressive fiber-generation tools. Moreover, the developed technology may also be used for long and flexible fibers.
Acknowledgements Support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)-projects 255730231 and 440998847-is gratefully acknowledged.
Funding Open Access funding enabled and organized by Projekt DEAL.

Appendix A Weibull parameters from volume -weighted mean and variance
The purpose of this section is to derive the expressions (2.12) and (2.14) for the parameters λ and k of the Weibull distribution (2.9) for prescribed volume-weighted mean m and standard deviation s, given by the formula where we use the shorthand notation for the (number-weighted) mean of any random variable g w.
using the linearity of the expectation (A. 3), to arrive at the final expression where we used the formula (A.2) for the volume-weighted mean m. Invoking the formula (A.4) for the moments, we get the system of equations. Solving for λ in the first equation and inserting this expression into the second equation yields the condition The latter equation corresponds to Eq. (2.12), and condition (A.8) coincides with equation (2.14) in the main body of the text, as was to be shown. Please note the Eq. (A.9) appears to be uniquely solvable for any given and positive right-hand side s 2 /m 2 , see Fig. 12. In this section, we would like to compare the exact closure approximation (2.20) to the maximum entropy closure (2.21). To separate the influence of the fiber-length distribution, we consider a uniform fiber-length distribution, with a length of 250μm and a diameter of 10μm. Moreover, we consider PA6GF35, i.e., a commercial polyamide 6 with 35% (by weight) glass-fiber reinforcement. This corresponds to 19.3% filler content by volume. We use the material parameters given in Table 1, following Hessman et al. [40]. For the computations, we use the mesh and volume-element sizes identified in sect. 4, i.e., h = 2μm and Q = 600μm.
To evaluate the differences between the closures, it is necessary to take a closer look at the set of possible input and output values of corresponding computations. The secondorder fiber-orientation tensor A, see Eq. (2.18), serves as the input for the computations, whereas the effective stiffness C eff arises as the output. We would like to cover all possible input tensors A. For this purpose, we notice that any second-order fiber-orientation tensor may be written in the form with an orthogonal matrix U ∈ R 3×3 and a pair (λ 1 , λ 2 ) of real numbers which satisfy the constraints λ 1 ≥ λ 2 , λ 1 + 2λ 2 ≥ 1 and λ 1 + λ 2 ≤ 1. (B.2) As an orthogonal transformation of the fiber-orientation tensor A leads to a suitable orthogonal transformation of the effective stiffness C eff it is actually sufficient to restrict attention to diagonal second-order tensors A which satisfy Eq. B.2. Geometrically, this phase space corresponds to a triangle in two dimensions, the fiber-orientation triangle [99].
The effective stiffness C eff depends continuously on the fiber-orientation tensor A, provided all other parameters remain fixed and the computational cells are sufficiently large to render the computations representative [26][27][28]. Therefore, it suffices to consider a (sufficiently fine) triangulation of the fiber-orientation triangle (B.2), which reduces the effort to a finite number of nodal computations. Here, we follow Köbler et al. [99] and consider a triangulation with 15 nodes and linear FE ansatz functions.
Regarding the output of the computations, working with the full stiffness tensor C eff , an object with 21 independent components, increases the difficulty in assessing the fine details. From an engineering perspective, it is more instructive to monitor the engineering constants associated to the effective stiffness. Indeed, for both the exact and the   Tables 1 and 2 maximum entropy closure, the effective stiffness tensors are orthotropic. This is a direct consequence of the fact that every second-order tensor is orthotropic, in particular the second-order fiber-orientation tensor A, and the prescribed probability density functions (2.20) and (2.21) for both closures reflect this symmetry condition. Thus, we report on the engineering constants of the effective stiffness, considered as functions on the fiber-orientation triangle. For the three directional Young's moduli, the results are shown in Fig. 13a. To the naked eye, both closures appear to give rise to almost identical results. This is a consequence of the following. For the three corners of the triangle, the unidirectional, the planar isotropic and the spatially isotropic fiber-orientation state, the fiber-orientation distributions for the maximum-entropy and the exact closure coincide. In particular, the effective elastic moduli coincide as well. Thus, the only possible differences arise in the interior of the fiberorientation triangle.
To study these differences more closely, we investigate the relative difference between the engineering constants, where we take the exact closure as the reference. Fig. 13b shows the effective directional Young's moduli and the effective shear moduli for the three coordinate planes. The Young's modulis E 3 leads to the smallest difference between the closures among the directional Young's moduli, with a maximum difference of 3.1%. The difference for E 2 is only slightly larger at 4.6%. For E 1 , this difference increases to 7.1%. For all three moduli, these maximum differences are realized for λ 1 ≈ 0.8. Yet, on average the difference is 2.6% for E 1 and at 1.3% and 1.6% for E 2 and E 3 , respectively.
Inspecting the shear moduli paints a similar picture -the moduli involving the 1-direction lead to larger differences that the shear modulus in the 2-3-plane. On average, the errors are 0.9%, 2.8% and 2.1% for G 23 , G 13 and G 12 , respectively.