Data-driven topology design using a deep generative model

In this paper, we propose a sensitivity-free and multi-objective structural design methodology called data-driven topology design. It is designed to obtain high-performance material distributions, starting from initially given material distributions in a given design domain. Its basic idea is to iterate the following processes: (i) selecting material distributions from a dataset of material distributions according to eliteness, (ii) generating new material distributions using a deep generative model trained with the selected elite material distributions, and (iii) merging the generated material distributions with the dataset. Because of the nature of a deep generative model, the generated material distributions are diverse and inherit features of the training data, that is, the elite material distributions. Therefore, it is expected that some of the generated material distributions are superior to the current elite material distributions, and by merging the generated material distributions with the dataset, the performances of the newly selected elite material distributions are improved. The performances are further improved by iterating the above processes. The usefulness of data-driven topology design is demonstrated through numerical examples.


1 Introduction
Structural design is the process of determining the structural shape and topology of artifacts on the basis of physics, mathematics, designer intuition, and so on. Among previously proposed methodologies for structural design, topology optimization, originated by Bendsøe and Kikuchi (1988), is a promising one because of its potential to yield high-performance structures while considering both the shape and topology.
There are two basic concepts in topology optimization. One is replacing a structural design problem with a material distribution problem in a given design domain. The other is obtaining the optimal, or at least locally optimal, material distribution using mathematical programming under a given objective function and constraints, i.e., a given formulation.
Topology optimization has been applied to various engineering problems and has achieved immense success because of its versatility. Nevertheless, it has an intrinsic difficulty, as pointed out by Yamasaki et al. (2019): it is often a difficult task for designers to determine appropriate formulations from design problems that are only ambiguously described. This is because such design problems often involve tacit constraints, and it is difficult to describe them explicitly without trial and error.
To tackle this intrinsic difficulty, Yamasaki et al. (2019) proposed a support system for formulating topology optimization problems. This formulation support system has a database constructed by collecting material distributions obtained by solving various topology optimization problems, and it provides useful knowledge for determining appropriate formulations on the basis of knowledge discovery in databases (Fayyad et al., 1996; Tsai et al., 2014; Adhikari and Adhikari, 2015).
More specifically, a user inputs multiple functions as candidates of the objective and constraint functions (hereafter, called candidate functions) into the formulation support system, and it then outputs material distributions having Pareto optimality from the database. By checking the outputs, the user decides whether the set of input candidate functions is appropriate or not. This process is repeated until an appropriate set is determined. By doing so, the trial and error for determining an appropriate formulation is supported.
Their study was the first attempt to tackle the intrinsic difficulty described above, and some issues remain. One major issue is the diversity of the material distributions in the database. When the diversity is insufficient, the formulation support system will output material distributions that seem unusual as Pareto optimal solutions, even if the set of input candidate functions is appropriate. This is not preferable, because the user may conclude that such unusual material distributions are output owing to the inappropriateness of the input set. The formulation support system should output reasonable material distributions; at minimum, it should be clear to the user why such material distributions are suitable for the input set.
Regarding the above issue, the utilization of deep generative models (Kingma and Welling, 2013; Goodfellow et al., 2014) is promising. They can generate material distributions using the outputs of the formulation support system as the training data. Because of their generative nature, the generated material distributions are diverse and inherit features of the original material distributions, which are the current Pareto optimal solutions to the set of input candidate functions. Therefore, it is expected that some of the generated material distributions will be superior to the original material distributions, and that the Pareto front will be improved by integrating the generated material distributions into the original material distributions. In addition, the Pareto front will be further improved by iterating the above processes, and as a result, it is expected that unusual material distributions in the Pareto front will be suppressed.
On the basis of the above idea, in this paper, we propose to iteratively apply the following processes to the database of the formulation support system: generating material distributions using a deep generative model from the Pareto optimal material distributions in the database, and integrating the generated material distributions into the database. We call this structural design methodology data-driven topology design. The essence of this methodology is to improve the structural shape and topology by iterating data generation using a deep generative model and data evaluation based on Pareto optimality.
Many deep generative models have recently been proposed, and variational autoencoders (VAEs) (Kingma and Welling, 2013) and generative adversarial networks (GANs) (Goodfellow et al., 2014) are representative. Compared to a GAN, a VAE is suitable for data-driven topology design because its neural network architecture is relatively simple and a VAE is therefore robust (Atienza, 2018). This robustness is particularly important because we train the neural network many times while updating the training data. We therefore adopt a VAE as the deep generative model for implementation and demonstrate that the formulation support system is enhanced by incorporating the implemented method.
The rest of this paper is organized as follows. We briefly introduce related studies in Section 2 and describe the overall procedure in Section 3. Next, we detail its implementation in Section 4 and provide numerical examples in Section 5. Finally, we provide some concluding remarks in Section 6.
2 Related studies

2.1 Topology optimization based on data-driven approaches
Data-driven approaches based on deep learning have recently gained significant attention from researchers in various fields, and some studies incorporating them into topology optimization have been proposed. Ulu et al. (2014) proposed to predict optimized material distributions of the minimum compliance problem using a neural network. In their study, various optimized material distributions were prepared using topology optimization while changing the load boundary condition. The network was then trained with the load boundary condition as the input and the corresponding optimized structure as the output. Using the trained network, the optimized material distribution for a given load boundary condition is predicted. Zhang et al. (2019b) also proposed to predict optimized material distributions of the minimum compliance problem using a neural network. In their study, the displacement and strain fields of the initial material distributions are used as the input, the corresponding optimized material distributions are used as the output, and the neural network is trained using these input and output data. When an initial material distribution and its displacement and strain fields are given, the optimized material distribution is predicted using the trained network. They demonstrated that their proposed method covers a change in the location where the displacement-fixed boundary condition is imposed, in addition to the load boundary condition.
Similar to the above studies, Yu et al. (2019) proposed a prediction method for the minimum compliance problem. In their study, optimized material distributions are predicted in two steps. First, an optimized material distribution under a given boundary condition is predicted on a low-resolution mesh, as in the studies of Ulu et al. (2014) and Zhang et al. (2019b). Next, the predicted material distribution is refined on a high-resolution mesh using a conditional GAN (Mirza and Osindero, 2014). Sasaki and Igarashi (2019) proposed a topology optimization method for a structural design problem of interior permanent magnet motors (IPMs). In their study, quasi-optimal material distributions are obtained using a genetic algorithm (GA). Although topology optimization incorporating a GA is generally time-consuming, they reduced the computational cost by utilizing a neural network that predicts the performances of IPMs.
Whereas these studies utilized deep learning for regression, other studies have focused on deep-learning-based generative models. Oh et al. (2019) proposed a topology optimization method for a wheel design problem in which the diversity of the optimized material distributions is ensured by referring to material distributions generated by a GAN. They also used an autoencoder (Hinton and Salakhutdinov, 2006) to evaluate the novelty of the optimized material distributions. Guo et al. (2018) proposed a structural design method for the thermal compliance minimization problem, which consists of two steps. First, a VAE is trained using various material distributions, which are obtained using topology optimization while changing the boundary conditions. Next, the latent space of the trained VAE is explored using a GA, and as a result, quasi-optimal material distributions are obtained. In addition, a style transfer network (Gatys et al., 2016) is used to reduce the noise included in the material distributions generated by the VAE. Zhang et al. (2019a) proposed a structural design method for the three-dimensional shape of a glider. In their study, a VAE is trained using airplane models registered in a three-dimensional structure database (Wu et al., 2015), and the latent space of the trained VAE is explored using a GA in a similar manner as the study of Guo et al. (2018).
Data-driven topology design may seem similar to the above studies, particularly those of Oh et al. (2019), Guo et al. (2018), and Zhang et al. (2019a). However, its novelty can be clearly explained using the concept of the estimation of distribution algorithm (EDA) (Larrañaga and Lozano, 2001). Therefore, we introduce the EDA in the next section.

2.2 Estimation of distribution algorithm
Because of its generative nature for structures, data-driven topology design may seem to be an image-based GA in which only elite individuals are selected. Indeed, it can be regarded as an EDA, which is a type of GA, on the basis of the following two points: (i) probabilistic models are constructed from elite individuals, and new individuals are generated using these models, and (ii) this generative process is performed iteratively. Recently, Garciarena et al. (2018) and Bhattacharjee and Gras (2019) proposed to adopt a VAE as the probabilistic model of an EDA, although their targets are well-studied test problems in the GA field rather than structural design problems. The EDAs incorporating a VAE work well in their studies, and this fact reinforces the validity of data-driven topology design.
Whereas the initial individuals are randomly generated in numerous studies on EDAs, the initial material distributions are given according to a certain guideline in data-driven topology design. That is, the outputs of the formulation support system are used as the initial material distributions. This is an important distinction between many studies conducted on EDAs and data-driven topology design. Because the latter deals with structural design problems having an extremely large number of design variables (typically, several thousand or more), it is difficult to prepare suitable initial material distributions using a random number generator. Therefore, the guideline for the initial material distributions plays an important role in data-driven topology design.

2.3 Novelty of data-driven topology design
As discussed in Section 2.2, data-driven topology design is novel in terms of its application to structural design and its guideline for the initial individuals, when compared to previously proposed EDAs incorporating a deep generative model (Garciarena et al., 2018; Bhattacharjee and Gras, 2019).
Furthermore, data-driven topology design can be clearly distinguished from the studies of Oh et al. (2019), Guo et al. (2018), and Zhang et al. (2019a) from the viewpoint of an EDA. That is, the former can be regarded as a type of EDA, whereas the latter cannot. This is because a deep generative model is trained only with material distributions having Pareto optimality in the former, whereas various material distributions are used for training in the latter.

3 Overall procedure

In this section, we describe the overall procedure of data-driven topology design. It is designed to obtain material distributions suitable for a given design problem, which is defined by the shape of the design domain, the boundary conditions, and multiple objective functions. The data process flow starts from the preparation of material distributions in the design domain, which are labeled the original data. Because data-driven topology design is used to enhance the formulation support system, the multiple objective functions correspond to the candidate functions, and the original data are prepared as described in the study of Yamasaki et al. (2019).
After preparing the original data, the data are processed as follows, according to the flow in Fig. 1:

Step 1 Evaluate the performances of the material distributions in the original data by computing the objective function values of these material distributions. Here, the data including the performance values are labeled the integrated data because they will be iteratively integrated with the generated data (see Step 6).
Step 2 Select the material distributions having Pareto optimality from the integrated data. The selected data are labeled the Pareto optimal data. Meanwhile, the integrated data are stored for integration with the generated data.
Step 3 Judge whether the Pareto optimal data satisfy the convergence criteria. If so, the current Pareto optimal data are output as the final results. Otherwise, the material distributions of the Pareto optimal data are converted to conform to a normalized reference domain, which is a 1 × 1 square in a two-dimensional problem or a 1 × 1 × 1 cube in a three-dimensional problem. Such a conversion is applied because the normalized domain is suitable for image-based learning. The design domain mapping (DDM) proposed by Yamasaki et al. (2019) is used for the conversion. Figure 2 shows an example of the material distribution conversion using the DDM.
Step 4 Train a deep generative model using the material distributions of the Pareto optimal data, and newly generate material distributions using the trained deep generative model. These material distributions are labeled the generated data.

Step 5 Inversely convert the material distributions of the generated data to conform to the design domain, using the DDM.
Step 6 Evaluate the performances of the material distributions of the generated data in the same manner as Step 1. The generated data, including the performance values, are integrated into the integrated data, and we return to Step 2.
Through the above iterative procedure, we aim to obtain Pareto optimal data consisting of high-performance material distributions.
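As a sketch, the iterative procedure above can be outlined in Python. The forward analysis (`evaluate`) and the deep generative model (`model_cls`) are hypothetical placeholders, not part of the paper; only the Pareto selection of Step 2 is implemented concretely, assuming all objective functions are to be minimized:

```python
import numpy as np

def pareto_select(objectives):
    """Return indices of Pareto optimal rows (all objectives minimized).

    objectives: (n_samples, n_objectives) array of performance values.
    """
    n = objectives.shape[0]
    keep = np.ones(n, dtype=bool)
    for j in range(n):
        # j is dominated if some point is <= in every objective
        # and strictly < in at least one (self-comparison fails the test).
        dominated = np.all(objectives <= objectives[j], axis=1) & \
                    np.any(objectives < objectives[j], axis=1)
        if dominated.any():
            keep[j] = False
    return np.where(keep)[0]

def data_driven_topology_design(original_data, evaluate, model_cls, n_iter):
    """Hypothetical driver for Steps 1-6 (convergence check omitted)."""
    integrated = [(x, evaluate(x)) for x in original_data]        # Step 1
    for _ in range(n_iter):
        objs = np.array([f for _, f in integrated])
        elite = [integrated[i][0] for i in pareto_select(objs)]   # Step 2
        model = model_cls()                                       # Step 4
        model.train(elite)
        generated = model.generate()                              # Step 4
        integrated += [(x, evaluate(x)) for x in generated]       # Step 6
    objs = np.array([f for _, f in integrated])
    return [integrated[i][0] for i in pareto_select(objs)]
```

Here, Steps 3 and 5 (the DDM conversions) are folded into the placeholders for brevity.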

4 Implementation details
As described in Section 1, in this study, data-driven topology design is implemented using a VAE. Regarding the use of the VAE, some important implementation details are described in the following.

4.1 Normalization of material distributions
In data-driven topology design, we use two domains, i.e., the design and reference domains, as described in Section 3.
In the design domain D, the material distributions are represented using the density function ρ(x), where x are the coordinates of an arbitrary point in D. ρ(x) is continuous and takes values from 0 to 1; ρ(x) = 0 and 1 correspond to the void and the material, respectively, whereas 0 < ρ(x) < 1 corresponds to an intermediate state, according to the conventional manner of density-based topology optimization (Bendsøe, 1989). Similarly, the material distributions in the reference domain are represented using the density function ρ(ξ), where ξ are the coordinates of an arbitrary point in the reference domain. When using the above representation model, we must consider which features of the training data are preferable for the VAE. In conventional density-based topology optimization, it is necessary to reduce the intermediate state while maintaining the smoothness of the material distribution. From this perspective, the material distributions in Fig. 2, for example, are preferable. By contrast, the intermediate state is thought to have a positive effect when training the VAE because it provides information regarding the outline of the structure. In fact, MNIST (Deng, 2012), one of the most important datasets in the field of deep learning, includes thousands of grayscale images of handwritten digits.
Therefore, we blur the outline in the reference domain as follows. First, we compute a scalar function φ(ξ) as

φ(ξ) = 2ρ(ξ) − 1.  (1)

Next, we give φ(ξ) the signed distance characteristic with respect to the iso-contour of φ(ξ) = 0, using a geometry-based reinitialization scheme (Yamasaki et al., 2010). Finally, we update ρ(ξ) using the following equation:

ρ(ξ) = H(φ(ξ)),  (2)

where h is the parameter for the bandwidth of the transition zone from the void to the material, and H(φ) is given as follows:

H(φ) = 0                             if φ < −h,
H(φ) = 1/2 + 3φ/(4h) − φ³/(4h³)      if −h ≤ φ ≤ h,  (3)
H(φ) = 1                             if φ > h.

This process is a type of normalization of the material distribution; as an example, the material distribution in Fig. 2b is processed as shown in Fig. 3a by setting h to 0.08. Because of the normalization, the material distributions of the training data include wide transition zones from the void to the material (see Fig. 3a). Therefore, it is expected that the material distributions generated by the VAE also include wide transition zones. If such material distributions are inversely converted into the design domain, the wide transition zones still remain, as shown in Fig. 3b. Because such wide transition zones often cause fatal numerical errors in a forward analysis, we need to binarize the material distributions in the design domain. This is conducted by applying the normalization process described earlier to the design domain, with h set to a small value.
We set h to 0.08 for normalization in the reference domain because the smoothness of the material distributions generated by the VAE was improved by this setting in a preliminary study.
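As a minimal sketch, this normalization can be implemented as follows. The cubic-polynomial Heaviside profile and the linear mapping from density to a level-set-like field are assumptions (common choices in level-set-based methods, not necessarily the paper's exact forms), and the signed-distance reinitialization step is omitted:

```python
import numpy as np

def smoothed_heaviside(phi, h):
    """C1-smooth Heaviside with transition bandwidth h (assumed profile)."""
    phi = np.asarray(phi, dtype=float)
    return np.where(phi < -h, 0.0,
           np.where(phi > h, 1.0,
                    0.5 + 3.0 * phi / (4.0 * h) - phi**3 / (4.0 * h**3)))

def normalize_density(rho, h=0.08):
    """Blur (large h) or binarize (small h) a density field's outline.

    Sketch: map density to a level-set-like field phi (zero at rho = 0.5),
    then remap through the smoothed Heaviside. The geometry-based
    reinitialization of phi is omitted here for brevity.
    """
    phi = 2.0 * np.asarray(rho, dtype=float) - 1.0
    return smoothed_heaviside(phi, h)
```

With h = 0.08, the transition zone is wide, which suits VAE training; with a small h such as 0.025, the field approaches a binarized material distribution.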

4.2 Details of data generation using the VAE
Figure 4 shows the architecture of the VAE used in the numerical examples of Section 5. As shown in the figure, it is a type of multilayer perceptron including two hidden layers. The reference domain is discretized with 50 × 50 square elements, and the material distributions in it are represented using the values of the density function ρ(ξ) at the lattice points. Therefore, the input layer has 2,601, i.e., 51 × 51, neurons. This input layer is fully connected to a hidden layer having 1,700 neurons.
After activating these neurons using the ReLU function, this layer is also fully connected to two layers having 2 neurons each: one corresponding to μ, the mean vector of the latent variables z, and the other corresponding to log(σ ∘ σ), where σ is the standard deviation vector of z and ∘ represents the element-wise product. We then obtain the latent variables z as follows:

z = μ + σ ∘ ε,  (4)

where ε is a random vector drawn from the standard normal distribution.
The layer of the latent variables z is further fully connected to a hidden layer having 1,700 neurons. After activating these neurons using the ReLU function, this layer is fully connected to the output layer having 2,601 neurons, and output data of size 51 × 51 are obtained after the sigmoid activation. The output data are interpreted as material distributions in the reference domain in the same manner as the input data. Note that the architecture described in this section is for two-dimensional material distribution problems because we focus solely on two-dimensional problems in this paper; however, there are no technical limitations in extending the architecture to three-dimensional problems.
The VAE having the above architecture is trained using the material distributions of the Pareto optimal data as both the input and output data, and the latent space composed of the latent variables is constructed through the training. In more detail, the training is conducted by minimizing the following loss function L using the Adam optimizer (Kingma and Ba, 2014):

L = L_recon + L_KL,  (5)

where L_recon is the mean reconstruction loss measured by the mean-squared error, and L_KL is a term corresponding to the Kullback-Leibler divergence. L_KL is computed as follows:

L_KL = −(1 / (2 N_mt)) Σ_{j=1}^{N_mt} Σ_{i=1}^{N_lt} ( 1 + log σ_{i,j}² − μ_{i,j}² − σ_{i,j}² ),  (6)

where μ_{i,j} and σ_{i,j} are the i-th components of μ and σ in the j-th material distribution, respectively, and N_mt and N_lt are the number of material distributions and the size of the latent space, respectively. Because the dimensionality is drastically compressed from the input and output layers into a two-dimensional latent space, it is expected that important features of the training data are extracted into this space. Furthermore, the range of the latent space that we should focus on is restricted, because the latent variables corresponding to the training data do not take extremely large or small values, according to the probability distribution N(0, 1).
On the basis of the above discussion, we generate material distributions by uniformly sampling in the latent space; the sampling range is [−4, +4] for each component of z, and the number of samples is 20 × 20.Thus, we obtain material distributions that are diverse and inherit important features of the training data.
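The architecture and the latent-space sampling can be sketched with plain NumPy as follows. The weights here are randomly initialized stand-ins for trained parameters (biases are omitted), so the sketch shows shapes and data flow only; an actual implementation would use a deep-learning framework and train with the Adam optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID, N_LT = 51 * 51, 1700, 2   # 2,601 inputs / 1,700 hidden / 2 latent

relu = lambda a: np.maximum(a, 0.0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Randomly initialized weights standing in for trained parameters.
W_enc = rng.normal(0.0, 0.01, (N_IN, N_HID))
W_mu = rng.normal(0.0, 0.01, (N_HID, N_LT))
W_logvar = rng.normal(0.0, 0.01, (N_HID, N_LT))
W_dec1 = rng.normal(0.0, 0.01, (N_LT, N_HID))
W_dec2 = rng.normal(0.0, 0.01, (N_HID, N_IN))

def encode(x):
    hid = relu(x @ W_enc)
    mu, logvar = hid @ W_mu, hid @ W_logvar
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    return mu, logvar, z

def decode(z):
    return sigmoid(relu(z @ W_dec1) @ W_dec2)

def kl_loss(mu, logvar):
    # KL divergence from N(mu, sigma^2) to N(0, 1), averaged over samples.
    return -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))

# Uniform 20 x 20 grid over [-4, +4]^2 in the latent space.
g = np.linspace(-4.0, 4.0, 20)
grid = np.array([[z1, z2] for z1 in g for z2 in g])
generated = decode(grid)   # 400 material distributions of size 51 x 51
```

Decoding the grid yields 400 candidate material distributions at once, which is how the diversity of the generated data arises.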
As another notable issue, we did not prepare validation data in this paper, because the number of Pareto optimal data is extremely small in early iterations of the numerical examples (less than 100), and the number of training data would further decrease if we prepared validation data. In addition, an appropriate strategy for dividing the Pareto optimal data into training and validation data remains unclear in such a situation. Therefore, we simply train the VAE for 600 epochs in the numerical examples. As for the other parameters, the mini-batch size and the learning rate are set to 10 and 1 × 10⁻³, respectively. These parameters were determined through a preliminary study.

4.3 Data thinning for effective data generation
It is not preferable to train a VAE using a dataset in which some of the material distributions are extremely similar (or perfectly identical) whereas other material distributions are unique. In such a case, the training result will be biased toward the former. Therefore, it is necessary to thin out the material distributions according to their similarity. Here, we consider material distributions ρ_j and ρ_k, which are the j- and k-th material distributions in the discrete system. We then thin out one of them if the following condition is satisfied:

( Σ_{i=1}^{N_in} ρ_{i,j} ρ_{i,k} ) / ( ‖ρ_j‖ ‖ρ_k‖ ) > t,  (7)

where ρ_{i,j} and ρ_{i,k} are the i-th components of ρ_j and ρ_k, respectively, t is the threshold used to judge the similarity, set to 0.999 in this paper, and N_in is the number of components, i.e., 2,601, as described in Section 4.2. This operation is applied to all pairs of material distributions.

5 Numerical examples

In this section, we provide three numerical examples to demonstrate the usefulness of data-driven topology design. Herein, we consider two design problems of structural mechanics, the design domains and boundary conditions of which are shown in Fig. 5. Design problem 1 is a two-dimensional high-stiffness and light-weight structure design problem, where a vertical load is applied to the bottom-right boundary and the displacement is completely fixed on the left-side boundary of the design domain (see Fig. 5a). In this design problem, two objective functions are set: one is the volume of the structure, and the other is the mean compliance to the applied load.
Design problem 2 is a two-dimensional low-stress and light-weight structure design problem, where a vertical load is applied to the center-right boundary and the displacement is completely fixed on the top-side boundary of the design domain, as shown in Fig. 5b. For this design problem, two objective functions are set: the volume of the structure and the maximum value of the von Mises stress generated in the structure. Furthermore, the mean compliance is imposed as a constraint to ensure the mechanical connection from the displacement-fixed boundary to the load-imposed boundary; this constraint is crucial to obtain meaningful structures, as discussed by Yamasaki et al. (2019).
Design problem 1 has been well studied in numerous topology optimization studies; therefore, we provide example 1, targeted at design problem 1, to investigate the basic potential of data-driven topology design. By contrast, it is difficult to directly solve design problem 2 using topology optimization, because it is difficult to accurately evaluate the von Mises stress and it is necessary to solve a min-max problem. This difficult problem is targeted in example 2.
Whereas examples 1 and 2 are provided to demonstrate that data-driven topology design can enhance the formulation support system, another aspect is investigated in example 3 using design problem 2.

5.1 Example 1
As described above, we solve the simple high-stiffness and light-weight structure design problem in example 1, the design domain and boundary conditions of which are shown in Fig. 5a. The design domain is discretized with 128 × 96 square elements, and the magnitude of the applied load per unit area is set to 1. To binarize the material distributions in the design domain, h in (2) is set to 0.025. Young's modulus is set to 1 for the structural material and to 1 × 10⁻⁶ for the void to avoid a singular stiffness matrix, and Poisson's ratio is set to 0.3. We compute the displacement and stress fields under the plane stress condition.
First, we collect material distributions obtained by solving various topology optimization problems and construct a database of the formulation support system in the same manner as the study of Yamasaki et al. (2019). The material distributions in the database are converted to conform to the design domain using the DDM. By doing so, we obtain 2,271 material distributions as the original data. In Step 1, we evaluate the volume and mean compliance of these material distributions using the finite element method and obtain the integrated data. In Step 2, we select the material distributions having Pareto optimality from the integrated data and obtain the Pareto optimal data. The material distributions of the Pareto optimal data are shown in Fig. 6. In Step 3, we convert these material distributions to conform to the reference domain using the DDM, and in Step 4, we generate material distributions using the VAE as described in Section 4.2. Figure 7 shows the material distributions of the generated data. In Step 5, we inversely convert the generated material distributions to conform to the design domain using the DDM. In Step 6, we evaluate the performances of the generated material distributions, integrate them into the integrated data, and then return to Step 2.
We iterate the above data generation procedure 50 times. Figure 8 shows that the Pareto front gradually improves as the data generation is iterated. Because the Pareto front is clearly improved after iteration 1, iterating the data generation procedure is essential for obtaining high-performance material distributions.
Figure 9 shows representative material distributions at iterations 0 and 50. As this figure indicates, the Pareto front at iteration 0 includes unusual material distributions. For example, in material distribution A, the load-imposed boundary is not mechanically connected to the displacement-fixed boundary despite an adequate amount of material. Material distribution B also seems unusual because a long bar sticks out from the base structure. In material distribution C, the material on the bottom side does not connect to the displacement-fixed boundary although an adequate amount of material exists. In material distribution D, the material at the top-right of the design domain is not needed to support the load. The performance of material distribution E would be further improved by moving its connecting point on the displacement-fixed boundary from the center to the bottom side. Material distribution F is a fluid channel. These unusual material distributions are suppressed at iteration 50, and the material distributions at iteration 50 seem comparable to the well-known topology-optimized structures of the minimum compliance problem.

Figure 8: History of data-driven topology design in example 1: iteration 0 (blue), iteration 1 (green), iteration 5 (orange), and iteration 50 (red)

Figure 9: Pareto front and representative material distributions in example 1 at iteration 0 (blue) and iteration 50 (red)
Next, we discuss the importance of training the VAE using only the Pareto optimal data. For comparison, we generate material distributions using a VAE trained with all material distributions of the integrated data, and finally obtain the Pareto front colored black in Fig. 10. Clearly, this Pareto front is inferior to that obtained using data-driven topology design. More importantly, the obtained material distributions seem to be very poor; in particular, the material distributions encircled with the dotted blue line remain in the Pareto front through iterations 0-50. These results indicate the disadvantage of training a VAE using all material distributions. If a VAE is trained using all material distributions, various features of low-performance material distributions will be reflected in the latent space. Therefore, it is extremely difficult to expect a VAE to efficiently generate high-performance material distributions with a limited number of samples. Thus, the results shown in Fig. 10 confirm the importance of training the VAE using only the Pareto optimal data.

Finally, we compare the results of data-driven topology design with the results of density-based topology optimization. The Pareto front colored black in Fig. 11 is obtained by directly solving the well-known minimum compliance problem while randomly changing the allowable upper limit of the volume 100 times. As shown in Fig. 11, data-driven topology design generates similar material distributions when the volume is greater than 0.5; the representative material distributions encircled with the dotted blue line have the same topology as the topology-optimized structures. Interestingly, these representative material distributions seem to be generated by learning the features of the material distributions encircled with the dotted blue line in Fig. 6, which are optimized structures of quite different topology optimization problems. Thus, data-driven topology design can generate material distributions comparable to topology-optimized structures.

5.2 Example 2
In this section, we solve a low-stress and light-weight structure design problem, the design domain and boundary conditions of which are given in Fig. 5b. The design domain is discretized with 78,282 triangular elements whose representative length is 0.005, and the magnitude of the applied load per unit area is set to 1. The value of h in (2) is set to 0.025 to binarize the material distributions in the design domain. The material properties are set to the same values as in example 1. In this example, we use a mesh conforming to the structural boundary, as proposed by Yamasaki et al. (2017), to accurately compute the von Mises stress while excluding the so-called grayscale elements.
We prepare the original data in the same manner as in example 1 and compute the performances of the material distributions in the original data, that is, the volume, the maximum value of the von Mises stress, and the mean compliance. Next, we select the Pareto optimal material distributions with respect to the volume and the maximum value of the von Mises stress, under the constraint that the mean compliance is less than 10. Figure 12 shows the selected material distributions as the Pareto optimal data. In the same manner as in example 1, we iterate the data generation procedure and obtain the result shown in Fig. 13.
As shown in this figure, some unusual material distributions exist on the Pareto front of iteration 0. For example, the two holes of material distribution A seem to be useless for avoiding the stress concentration. Material distribution B also seems to be unusual because the narrow part on the top side of the design domain is unreasonable for avoiding a stress concentration. In material distributions C and D, the material at the bottom side of the design domain is not needed to support the load. Similar to example 1, these unusual material distributions are suppressed as a result of the iterative data generation. Furthermore, structures such as the letter "J" are found as light-weight structures on the Pareto front of iteration 50. Indeed, this type of structure is superior in avoiding the stress concentration at the inner corner. Thus, it is confirmed that reasonable low-stress and light-weight structures are reliably obtained using data-driven topology design.
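The constrained selection used at the start of this example, a Pareto filter on (volume, maximum von Mises stress) applied only to candidates satisfying the compliance constraint, can be sketched as follows. The function and array names are chosen for illustration; this is not the paper's code:

```python
import numpy as np

def select_elites(volume, max_stress, compliance, c_max=10.0):
    """Indices of candidates that satisfy the mean-compliance constraint
    and are Pareto optimal in (volume, max_stress), both minimized."""
    idx = np.flatnonzero(np.asarray(compliance) < c_max)  # feasible candidates
    objs = np.column_stack([np.asarray(volume)[idx],
                            np.asarray(max_stress)[idx]])
    keep = np.ones(len(idx), dtype=bool)
    for i in range(len(idx)):
        for j in range(len(idx)):
            # discard candidate i if some feasible j dominates it
            if i != j and np.all(objs[j] <= objs[i]) and np.any(objs[j] < objs[i]):
                keep[i] = False
                break
    return idx[keep]

volume     = [0.4, 0.6, 0.5, 0.3]
max_stress = [3.0, 1.0, 2.5, 4.0]
compliance = [8.0, 9.0, 12.0, 7.0]   # the third candidate violates the constraint
print(select_elites(volume, max_stress, compliance))  # → [0 1 3]
```

Applying the constraint before the Pareto filter matters: an infeasible design must not be allowed to dominate, and thereby eliminate, a feasible one.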

Example 3
In example 2, we provided the initial material distributions from the database of the formulation support system. However, we can provide higher-performance initial material distributions on the basis of another reasonable guideline, for example, by utilizing optimized structures of the minimum compliance problem whose design domain and boundary conditions are given in Fig. 5b. Because this minimum compliance problem and design problem 2 are correlated, data-driven topology design may generate quasi-optimal material distributions, although design problem 2 is difficult to solve directly.
The essence of this idea is to indirectly solve a topology optimization problem that is difficult to solve directly, using material distributions obtained by solving another problem that is easy to solve directly and is correlated with the former problem. The methodology based on this idea was originated by Yaji et al. (2020) and is called multifidelity topology design. In the original study, the best material distribution among the prepared material distributions is simply chosen. Therefore, data-driven topology design may have potential as a new version of multifidelity topology design.
To investigate this potential, example 3 is provided using problem settings similar to those in example 2, the only difference being the guideline for the initial material distributions. As described above, we solve the minimum compliance problem while randomly changing the allowable upper limit of the structural volume. By doing so, we obtain 80 material distributions, and further select 48 material distributions according to the Pareto optimality of design problem 2. Figure 14 shows the material distributions of the Pareto optimal data.
Using these material distributions, we obtain the result shown in Fig. 15. As this figure shows, it may be difficult to assert a drastic improvement of the Pareto front. We consider the reason for this to be the strong correlation between the two types of topology optimization problems. However, some interesting facts can be seen in Fig. 15. For example, material distribution A' is more rounded than material distribution A, although they are very similar. Similarly, material distribution B' is more rounded than material distribution B. In general, rounded structures are preferable for avoiding stress concentrations, and therefore these results seem reasonable as low-stress and light-weight structures. From these results, we consider that data-driven topology design has potential as a type of multifidelity topology design.
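The initial-data preparation for this example, repeatedly solving the easy (low-fidelity) problem under random volume limits and then filtering the results by Pareto optimality of the hard problem, can be sketched as below. `solve_low_fidelity` and `evaluate_hard` are toy stand-ins for the paper's density-based topology optimization and stress analysis, included only so the loop runs:

```python
import numpy as np

rng = np.random.default_rng(0)

def pareto_indices(objs):
    """Indices of non-dominated tuples (all objectives minimized)."""
    keep = []
    for i in range(len(objs)):
        dominated = any(
            all(objs[j][k] <= objs[i][k] for k in range(len(objs[i])))
            and any(objs[j][k] < objs[i][k] for k in range(len(objs[i])))
            for j in range(len(objs)) if j != i)
        if not dominated:
            keep.append(i)
    return keep

# Toy stand-ins (NOT the paper's solvers): the "design" is just its volume,
# and the surrogate max stress decreases as the volume grows.
def solve_low_fidelity(v_max):
    return v_max

def evaluate_hard(design):
    return (design, 1.0 / design)  # (volume, surrogate max stress)

v_limits = rng.uniform(0.2, 0.8, size=80)          # 80 random volume limits
designs = [solve_low_fidelity(v) for v in v_limits]
objs = [evaluate_hard(d) for d in designs]
elites = pareto_indices(objs)                       # initial data for the hard problem
print(len(elites))
```

With the toy evaluator every candidate lies on a volume–stress trade-off curve, so all 80 survive; with a real stress analysis, only a subset (48 of 80 in this example) would remain.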

Conclusion
In this paper, we proposed data-driven topology design to enhance the formulation support system and demonstrated its usefulness through numerical examples. However, some issues remain.
One issue is the need to investigate other deep generative models, despite our adoption of a VAE in this paper; more suitable deep generative models may exist for data-driven topology design. In addition, a suitable architecture for the VAE should be further investigated. Although we adopted the architecture shown in Fig. 4 on the basis of the results of a preliminary study, there may be room for improvement. Furthermore, the theoretical backbone of data-driven topology design should be further investigated from the viewpoint of an EDA. We plan to tackle these issues in future studies and aim to develop more sophisticated data-driven topology design.

Figure 1: Data process flow of data-driven topology design

Figure 2: Example of material distribution conversion using DDM: a material distribution in the design domain, and b converted material distribution conforming to the reference domain, where the material and void are shown in black and white, respectively

Figure 3 :Figure 4 :
Figure 3: Example of material distributions including wide transition zones: a material distribution normalized from that in Fig. 2(b), and b material distribution in the design domain, which is inversely converted from that in a

Figure 5: Design domains and boundary conditions of design problems 1 (left) and 2 (right)

Figure 6 :Figure 7 :
Figure 6: Material distributions of the Pareto optimal data at iteration 0 in example 1

Figure 10: Pareto front and representative material distributions at iteration 50 in example 1 (red) and those obtained using a VAE trained using all of the material distributions (black)

Figure 12: Material distributions of the Pareto optimal data at iteration 0 in example 2

Figure 14: Material distributions of the Pareto optimal data at iteration 0 in example 3

Figure 15: Pareto front and representative material distributions at iteration 0 in example 3 (blue) and at iteration 50 (red)