Solidification in a Supercomputer: From Crystal Nuclei to Dendrite Assemblages

Thanks to the recent progress in high-performance computational environments, the range of applications of computational metallurgy is expanding rapidly. In this paper, cutting-edge simulations of solidification from atomic to microstructural levels performed on a graphics processing unit (GPU) architecture are introduced with a brief introduction to advances in computational studies on solidification. In particular, million-atom molecular dynamics simulations captured the spontaneous evolution of anisotropy in a solid nucleus in an undercooled melt and homogeneous nucleation without any inducing factor, which is followed by grain growth. At the microstructural level, the quantitative phase-field model has been gaining importance as a powerful tool for predicting solidification microstructures. In this paper, the convergence behavior of simulation results obtained with this model is discussed, in detail. Such convergence ensures the reliability of results of phase-field simulations. Using the quantitative phase-field model, the competitive growth of dendrite assemblages during the directional solidification of a binary alloy bicrystal at the millimeter scale is examined by performing two- and three-dimensional large-scale simulations by multi-GPU computation on the supercomputer, TSUBAME2.5. This cutting-edge approach using a GPU supercomputer is opening a new phase in computational metallurgy.


INTRODUCTION
Many practical metals and alloys undergo solidification during their production. 1,2Since their microstructure directly affects the properties of products, it is essential to control the microstructure of metals and alloys during the solidification with a high degree of accuracy.Despite considerable effort over many years, it is still challenging to control the solidification microstructure as planned.This is mainly due to the following three reasons: (1) The difficulty of direct (in situ) observation.
(2) The wide range of temporal and spatial scales.
(3) The need to consider multiple physics including fluid flow, thermal and solute diffusion.
][7] These studies provided considerable information on dendrite growth.However, it is not yet straightforward to directly observe the dynamics of solidification during actual production processes in general.Therefore, computational studies have contributed to clarifying the nature of solidification processes.In the early stage of research, Monte Carlo simulations with the Pott model 8 were often performed to study the kinetics of grain growth, 9,10 and this became a popular method for the study of solidification, recrystallization and other phenomena. 11,12The front tracking method 13,14 and cellular automata 15,16 have also been widely employed for simulations of dendritic growth.In 1993, Kobayashi succeeded in reproducing a complicated dendrite structure using the phase-field model. 17Since then, phase-field simulation [18][19][20][21][22][23] has become a major tool for the simulation of solidification.The main advantage of the phase-field model is that it is not necessary to explicitly track the position of a sharp interface in complex microstructural patterns.On the other hand, a long-standing issue regarding a quantitative aspect of the phase-field model remained unresolved until recently.5][26][27][28][29][30][31][32][33] Combined with the recent rapid progress in high-performance computational environments, large-scale phase-field simulations can now capture the competition between bundles of dendrites, including selection and regularity, [34][35][36] which is filling the gaps in knowledge over a wide range of temporal and spatial scales.Moreover, the coupling of phase-field simulation with computational fluid dynamics is now capturing some aspects of multiple physics during solidification, 37,38 which can be applied to fluid-dynamics-based phenomena such as the fragmentation of dendrite tips during solidification. 38The Lattice Boltzmann method 39,40 is a promising numerical method for a large-scale fluid computation in solidification problem.
2][43][44][45][46][47][48][49] In particular, the anisotropy in the interfacial parameters is a key factor determining the morphology of dendrite structures, although there are few reliable experimental values for the degree of anisotropy in the interfacial parameters.Therefore, an important role of molecular dynamics simulations is to provide interfacial parameters estimated from atomic-scale information for use in mesoscale simulations, which is a basic concept of multiscale modeling.Moreover, recent large-scale molecular dynamics simulations have captured the morphological dynamics of crystal growth with curved interfaces [50][51][52] and multiple nucleation from an undercooled melt, which is followed by the formation of polycrystalline microstructures. 53,54uch progress in simulations of solidification is largely attributable to the rapid progress in highperformance computational environments.In particular, considerable benefit has been obtained from the high parallel efficiency of graphics processing units (GPUs).Large-scale simulations performed on GPU supercomputers have ranged from the nucleation and subsequent grain growth in a millionatom molecular dynamics simulation to the competitive growth of millimeter-size dendrite assemblages in a large-scale phase-field simulation.In this paper, cutting-edge simulations of solidification performed on a GPU supercomputer are introduced with a brief introduction to the current state of computational studies on solidification.Firstly, the spontaneous evolution of anisotropy in a solid nucleus investigated by million-atom molecular dynamics simulation is discussed in ''Solidification in Large-Scale Molecular Dynamics Simulation'' section.Investigation of the nucleation of crystal nuclei from an undercooled melt, which is an initial stage of solidification, is also outlined in the same section.In ''Advances in Quantitative Computation of Solidification Microstructures'' section, advances in quantitative computation of phase-field simulations are discussed with an examination of the convergence of the results of quantitative phasefield simulations.In ''Large-Scale Phase-Field Simulation of Competitive Growth of Dendrite Assemblages'' section, we report the competitive growth of dendrite assemblages during the directional solidification of a binary alloy bicrystal investigated by performing two-and threedimensional large-scale simulations using the quantitative phase-field model by multi-GPU computation.We conclude with a discussion of the implications of these results for the future of computational metallurgy.

Spontaneous Evolution of Anisotropy in a Solid Nucleus
As described above, molecular dynamics simulations have contributed to the estimation of interfacial properties.The estimated values of these properties in representative papers [41][42][43][44][45][46][47][48][49] are summarized in Table I.As can be seen in the table, the kinetic coefficient of the bcc h100i orientation is slightly higher than those of the h110i and h100i orientations in general.Such a difference in the kinetic coefficient as well as that in the interfacial energy causes anisotropy in the solid nucleus, which results in dendritic growth in accordance with the interfacial stability.Therefore, it is reasonable for an anisotropic morphology to be generated during solidification in a phase-field simulation when the effect of anisotropy is taken into account in the interfacial parameters.In turn, it should be possible in principle to achieve an anisotropic morphology even in a molecular dynamics simulation if the system size is sufficiently large.Then, what is the critical size for obtaining clear anisotropy in a solid nucleus in an atomic-scale simulation?At least we did not observe such anisotropy in a solid nucleus during solidification in a cubic cell of side 15 nm. 46,55Therefore, a much larger system should be employed to discuss this issue, which is, however, computationally demanding.
We have developed our own code for carrying out large-scale molecular dynamics simulations by single-GPU computing, which enables molecular dynamics simulations of 1 million atoms to be handled over a period of nanoseconds with a computational time of several days. 56Using this code with single-GPU computing, the spontaneous evolution of anisotropy in a solid nucleus during the solidification of iron was investigated by million-atom molecular dynamics simulation. 52As the simulation methodology, the Finnis-Sinclair potential, 57 which is one of the established potentials for bcc metals, was employed for the interaction between iron atoms.A leapfrog method was used to integrate a classical equation of motion with a time step of 5.0 fs.A Berendsen thermostat 58 was applied to control the temperature.The Andersen method 59 was employed to control the pressure in each direction independently.The initial configuration was prepared by embedding an octagonal solid nucleus in an iron melt.The iron melt was obtained in advance by heating a bcc crystal of iron of size 53.4 9 53.4 9 4.3 nm 3 (1,037,880 atoms) at 3500 K under a NVT constant condition.The solid nucleus was prepared as an octagonal cutout from the bcc crystal with four {100} facets and four {110} facets of side 1.78 nm.The solid nucleus was inserted into the center of the iron melt, while omitting all melt atoms located within 2.5 A ˚from solid atoms.After the quenching of the simulation cell at 10 K for 25 ps to fill the gap between the melt and solid atoms, the obtained initial configuration was isothermally undercooled at DT = 300 K.Note that the melting point of bcc Fe according to the Finnis-Sinclair potential is 2400 K, 60 and therefore the undercooling temperature of 300 K corresponds to 2100 K.
Figure 1 shows snapshots of the atomic configuration during the spontaneous evolution of anisotropy in the solid nucleus during solidification. 52lthough the edges of the octagonal nucleus are smoothed and the nucleus takes a spherical shape in the initial stage, the spherical nucleus gradually preferentially grows in the h100i direction from approximately 300 ps.Then, it grows to form a rhombic-like structure with fourfold symmetry at 500 ps.The shape of the solid-liquid interface in the snapshot at 500 ps is traced and extracted with respect to the rotation angle h from the x-axis.In the figure showing the extracted information, the radius normalized by the average radius of 19.87 nm is plotted.It was confirmed that there are preferential growth directions at rotation angles of 0°, 90°, 180°and 270°, which correspond to the h100i direction.This is in agreement with the information  in Table I, in which the kinetic coefficient of the h100i direction is larger than that of the h110i direction.The spontaneous evolution of anisotropy in a solid nucleus during solidification was achieved for the first time with the aid of the high processing ability of the GPU architecture.

Homogeneous Nucleation and Subsequent Grain Growth
One of the remaining issues in the simulation of solidification is how to treat the nucleation. 54In existing phase-field simulations, the nuclei in the melt are given in advance with a random distribution or are formed forcibly on the basis of classical nucleation theory.On the other hand, it is possible in principle to achieve nucleation in molecular dynamics simulations when suitable conditions are given.2][63][64][65][66] However, generally, it is not yet straightforward to achieve multiple nucleation, which is essential for the formation of polycrystalline microstructures, since a broad range of temporal and spatial scales is required.Therefore, multiple nucleation in a large-scale system is usually achieved with the aid of inducing factors such as a high pressure 53 and surface fluctuation. 67e successfully achieved spontaneous nucleation from an undercooled iron melt without any inducing factor in a million-atom molecular dynamics simulation on a GPU supercomputer using the code described in the previous subsection.The simulation methodology basically followed the simulation in the previous subsection.Firstly, a bcc crystal of iron with a size of 53.4 9 53.4 9 4.3 nm 3 (1,037,880 atoms) was heated at 3500 K under a NVT (number of atoms, volume and temperature) constant condition to obtain an iron melt as the initial configuration.The prepared initial configuration was isothermally undercooled at DT = 1000 K for 10,000 ps under zero pressure by a NPT (number of atoms, pressure and temperature) constant condition.The Finnis-Sinclair potential 57 was employed for the interaction between iron atoms as in the above simulation.The periodic boundary condition was employed for all boundaries.Figure 2 shows snapshots of the atomic configuration during a consecutive simulation of nucleation, solidification and grain growth.Many nuclei are simultaneously nucleated before 150 ps and grow to form spherical grains in the melt.Other nuclei are continuously nucleated from the remaining melt.Later-nucleated grains fill the spaces between earlier-nucleated grains, and all the iron melt has solidified by 300 ps.After the solidification, the small grains gradually shrink and disappear whereas the large ones become larger, which is regarded as grain coarsening.Since the existence of grain boundaries yields excess grain boundary energy (approximately 0.5 J/m 2 to 2.0 J/m 260 ), such grain growth occurs in order to decrease the area of grain boundaries.It was also confirmed from the molecular dynamics simulation that the rate of grain coarsening is one order of magnitude slower than that of the solidification.
The incubation time until the first nucleation and the number of nuclei drastically change when the undercooling temperature is varied.It was confirmed that the incubation time as a function of temperature has a peak (i.e., nose shape) at the critical temperature, which is a characteristic shape of the time-temperature-transformation (TTT) curve. 54Therefore, it is considered that the nucleation observed in Fig. 2 is entirely thermally activated without any other inducing factor.The thermally activated nucleation in the million-atom molecular dynamics simulation has been investigated in detail elsewhere. 54

ADVANCES IN QUANTITATIVE COMPUTATION OF SOLIDIFICATION MICROSTRUCTURES Quantitative Phase-Field Model
As seen above, recent atomic-scale simulations can capture the scale of microstructure evolution.On the other hand, the description and prediction of microstructural evolution during solidification have generally been theoretically and numerically tackled within the framework of a free-boundary problem, the underlying physics of which are solutal and thermal diffusion in the bulk, mass and energy conservation laws at the interface and the Gibbs-Thomson effects.One of the central issues in modeling microstructural processes is therefore the precise description of the interface dynamics consistent with the free-boundary problem.9][20][21][22][23] This is a diffuse interface approach, in which the interface is not sharp but diffuse, exhibiting non-zero thickness.The main advantage of this model is that it is not necessary to explicitly track the position of a sharp interface in complex microstructural patterns.The phase-field model has been applied to a variety of solidification processes, [18][19][20][21][22] and its capability of affording a qualitative understanding of phenomena has generally been acknowledged.Despite this success, however, a long-standing issue regarding the quantitative aspect of the phase-field model remained unresolved until recently.
Phase-field models were developed in early works to reproduce the free-boundary problem of interest in the so-called sharp-interface limit, where the thickness of the diffuse interface W approaches zero.However, in practice, a prerequisite for this diffuse interface approach is to assign a finite value to W. A realistic value of W for the solid-liquid interface is typically a few nm; thus, a spatial resolution of A Shibuta, order is required to describe a diffuse interface having a realistic thickness.This high spatial resolution limits the system size to extremely small, making it impossible to deal with problems at the microstructural scale.Therefore, W has to be increased by orders of magnitude from the realistic thickness.However, this increment, in turn, causes the unrealistic magnification of some physical effects associated with the diffuse interface, which precludes the quantitative computation of solidification microstructures.This serious problem was resolved by Karma and Rappel. 24,25They put forward a model based on a new procedure called the thin-interface limit, in which W is taken to be smaller than any physical length appearing at the microstructural scale but much larger than the realistic thickness.This model is called the quantitative phase-field model since it enables quantitatively meaningful simulations. 25his model was later extended to deal with alloy solidification in a dilute binary alloy with zero diffusivity in the solid (one-sided model). 26,27Moreover, the quantitative phase-field model has been developed to describe two-phase solidification in binary alloys with zero diffusivity in the solids (onesided model), 28 alloy solidification with coupled heat and solute diffusion in dilute binary alloys having zero solutal diffusivity in the solid and equal thermal diffusivities in the solid and liquid (one-sided solute transport and symmetric heat transport), 29 and isothermal solidification in multicomponent alloys with zero diffusivities in the solid (one-sided model). 30In addition, one of the present authors has recently developed quantitative phase-field models for the two-sided case.To be more specific, the models were developed for isothermal single-phase solidification, 31 two-phase solidification in dilute binary alloys with an arbitrary value of the solid diffusivity, 32 and single-phase solidification in multicomponent alloys with coupled solutal and thermal diffusion. 33These two-sided models enable us to describe the equilibrium solidification, the microsegregation, the motion of the solid-solid interface, and the solidification processes in practical alloy systems such as carbon steels, where the dif-fusion in the solid is not negligible.][70][71][72][73][74][75][76][77][78][79][80][81] As mentioned above, significant progress has been made in quantitative phase-field modeling for alloy solidification.The accuracy of quantitative phase-field simulations is evaluated by observing the convergence behavior of the simulation results with decreasing W. It has been demonstrated that the convergence of the results in quantitative phase-field simulations is much faster than that of the results in the conventional phase-field model, [31][32][33] which indicates that accurate results can be obtained using a large value for W in quantitative phase-field models.This is quite advantageous in terms of the computational cost because the computational time for a three-dimensional simulation using a finite difference method is proportional to W À5 . 69Hence, the quantitative phase-field model enables highly accurate and large-scale computations of solidification microstructures.However, it should be pointed out that the value of W required to obtain well-converged results is strongly dependent on the solidification condition of interest.No criterion has yet been established regarding a suitable choice of W, and a convergence test is generally required for each solidification condition to ensure accurate and efficient computations.Therefore, it is desirable to obtain information to reduce the effort involved in the convergence test.This point is addressed below.

Convergence of Outcome in Quantitative Phase-Field Simulation
We have carried out quantitative phase-field simulations of the directional solidification of Al-2-mass%Cu alloy in a two-dimensional system to gain some insight into the convergence behavior.The competitive growth of solids was analyzed by considering a single solid growing in the y-direction with the periodic boundary condition applied in the x-direction.The details of time evolution equations can be found in Ref. 36.The input parameters used in this study are listed in Table II.A temperature gradient and an initial were set to G = 1000 K/m and u 0 = (c l À c 0 )/[(1 À k)c 0 ] = À1, respectively, where c l is the concentration in the liquid phase, c 0 is the average concentration, and k is the partition coefficient.We started with initial seeds periodically spaced by the targeted spacing.This spacing corresponds to the primary arm spacing k and it was set to about 150 lm.By performing a moving frame simulation, we obtained steady-state values of tip undercooling, given by X = 1 À y tip / l T , where y tip is the position of the dendrite tip and l T is the thermal diffusion length, and the curvature radius of the dendrite tip q.
The calculated results for V = 500 and 50 lm/s are shown in Fig. 3a and b, respectively.The horizontal axis is the interface thickness normalized by the chemical capillary length d 0 .In each case, W and q fully converge with unique values when W is small, which indicates excellent convergence behavior.However, the value of W required for the convergence strongly depends on the pulling velocity, V.The convergence for V = 500 and 50 lm/s starts to break down when W/d 0 < 60 and 200, respectively.This large difference in the convergence behavior indicates that the effort required to find a suitable value of W strongly depends on the solidification condition of interest.One may suppose that the breakdown of convergence should be related to the onset of the unphysical magnification of the interface effects that always exists in conventional phase-field models constructed in the sharpinterface limit as described in the previous section.
According to Fig. 3a and b, the converged value of q is strongly dependent on the pulling velocity.Hereafter, the values of X and q calculated for the smallest value of W are regarded as the converged values and are, respectively, denoted by X c and q c .In Fig. 3, the values of q c /d 0 for V = 500 and 50 lm/s are about 300 and 1000, respectively.Hence, from the comparison between Fig. 3a and b, one may speculate that the convergence of the simulation for a relatively coarse structure starts to break down at a relatively large W. We investigated the validity of this speculation in a quantitative manner.All the data shown in Fig. 3a and b are plotted in Fig. 3c, where q and X are normalized by q c and X c , re-spectively, on the y-axis and W is normalized by q c on the x-axis.It can be seen that the convergence starts to break down when W/q c $ 0.2 in both cases.In other words, regardless of the solidification condition, the results fully converge as long as W/ q c £ 0.2.This is also supported by the results for the directional solidification of an impure succinonitrile alloy in Ref. 27.
Note that the steady-state profile of the phasefield in our model is given by / = tanh[r/(2 1/2 W)].
Here, / is the phase-field, which takes a value of +1 (À1) in the solid (liquid) and continuously changes from -1 to +1 inside the interface, and r is the spatial coordinate in the direction normal to the interface.This solution is obtained for the boundary condition / = ±1 at r fi ±1.In this model, the interface thickness cannot be well defined.In the above discussion, W was used as a measure of the interface thickness and is actually the length of the region for À0.34 < / < 0.34.When the region for À0.95 < / < 0.95 is considered, the thickness of this region W¢ is about 5 W. Hence, the condition W/q c £ 0.2 corresponds to W¢ £ q c .Within the framework of the diffuse interface approach, an accurate description of the size and morphology of microstructures is not possible when the interface thickness is set to larger than the minimum curvature radius of the interface appearing in the microstructure.The condition W¢ £ q c should originate from this fact.Namely, the breakdown of the convergence shown in Fig. 3c is not triggered by the onset of the unphysical magnification of interface effects and is actually a natural consequence of the limitation unique to the diffuse interface approach.
To provide evidence for this, data for free dendritic growth reported in the literature 31,32 are plotted in Fig. 4 in the same manner as in Fig. 3c.Six sets of data are distinguished by symbols with different shapes.For each dataset, the open and filled symbols represent V/V c and q/q c , respectively.Here, V is the steady-state value of moving velocity of dendrite tip and V c is the converged one, viz., the value calculated for the smallest value of W. Datasets A and B are the results for isothermal solidification in binary alloys (Figs. 4 and 5 in Ref. 31).Datasets C and D are those for nonisothermal solidification in a binary alloy without and with diffusion in the solid (Figs. 2 and 3 in Ref. 32) and datasets E and F are the results for isothermal and non-isothermal solidification in a ternary alloy (Figs. 4 and 5 in Ref. 32), respectively.Importantly, all the data converge as long as W/q c £ 0.2.Hence, the condition W/q c £ 0.2 holds true in both directional and free dendritic growth.This fact will reduce the effort involved in convergence tests.Once the minimum curvature radius, q c , of the growing phase(s) appearing during a microstructural evolution process is obtained from preliminary simulations, accurate and efficient computation can be conducted by assigning a value of about 0.2 q c to W.

LARGE-SCALE PHASE-FIELD SIMULATION OF COMPETITIVE GROWTH OF DENDRITE ASSEMBLAGES
High-Performance Computation for the Phase-Field Method The development of the quantitative phase-field model enables the use of a large interface W or a large computational lattice.However, as shown in the previous section, the value of W is restricted by the curvature of the dendrite tip q.Therefore, dendrite growth simulations using the phase-field method have been limited to two-dimensional problems or three-dimensional simula-tions of a small number of dendrites.Actually, many solidification structures are formed through the interactions during the competitive growth of dendrite assemblages. 1,2Although the cellular automaton method has been widely used for polycrystal solidification simulations, 15,16,82 and has been employed for large-scale solidification simulations, 83,84 multiple-dendrite-growth simulation by the phase-field method is crucial for accurate prediction of the solidification microstructure.An adaptive mesh refinement technique, in which fine meshes are used only around the interface, [85][86][87][88][89] can reduce the computational cost.However, its applicability to polycrystal solidification with a large interface area fraction is not flexible.Moreover, the development of the code for adaptive mesh refinement requires tremendous effort.
Under such circumstances, GPU computation has attracted the attention of many phase-field researchers because GPUs have been successfully used to increase the speed of phase-field computation. 33,90Moreover, parallel computation using multiple GPUs has the potential to capture realistic dendrite assemblages. 22,35,91,92Shimokawabe et al. achieved the first-ever petascale phase-field simulation of dendrite growth using 4000 GPUs on the TSUBAME2.0supercomputer at the Tokyo Institute of Technology. 92Subsequently, the present authors and coworkers successfully performed a very-large-scale simulation of multiple dendritic competitive growth using 4000 3 meshes. 35rom the viewpoint of applications, understanding of the competitive growth of dendrite assemblages is essential to improve and control solidification microstructures.It is widely accepted that dendrites whose h100i preferential growth direction is almost parallel to the heat flow direction can continue to grow by stopping the growth of  dendrites having a h100i crystallographic orientation that deviates from the heat flow direction. 16,93,94On the other hand, unusual dendrite selections, in which inclined dendrites overgrow dendrites growing in the heat flow direction, have recently been observed in the unidirectional solidification of a sample. 95,96To clarify the mechanism of this unusual overgrowth, two-dimensional phase-field simulations have been performed. 34,36,90In the following subsections, we report the competitive growth of dendrite asseminvestigated by two-and three-dimensional simulations using the quantitative phase-field model, which were performed on multiple GPUs.

Two-Dimensional Simulation of Competitive Growth
Figure 5 shows snapshots from a two-dimensional simulation of competitive dendrite growth during the directional solidification of a binary alloy bicrystal.The quantitative phase-field model for the solidification of a dilute binary alloy 31 was used with the moving frame algorithm for directional solidification under a constant temperature gradient, G.The computational conditions were same as in Ref.
36 except for a computational domain of 3.072 9 1.152 mm 2 (4096 9 1536 meshes) and 20 million computational steps (750 s) performed for Al-3mass%Cu with V = 100 lm/s and G = 10 K/mm.The computation was performed within 1 day using eight GPUs.Two seeds were placed at the two bottom corners of the computational domain, where the left seed was the favorably oriented (FO) grain and the right seed was the unfavorably oriented (UO) grain with its h100i crystallographic orientation at angle of 10º from the heat flow direction.As shown in Fig. 5a, the solids cover the bottom surface and the dendrites grow in the heat flow direction.A grain boundary (GB) is formed at the collision point between the two grains, as shown in Fig. 5b, and steady-state competitive growth between the GB dendrites starts from 8 9 10 5 steps (Fig. 5c).Here, FO and UO dendrites are labeled using ''F'' and ''U'', respectively.At the GB, some UO dendrites are blocked by the FO dendrite, and then the UO dendrite overgrows the FO dendrite after the blocking.Finally, all the FO dendrites are overgrown after about 18 million steps.To overgrow the FO dendrites labeled F1-F7, 3, 3, 3, 3, 8, 9 and 1 UO dendrites are required, respectively.This means that, for example, the F1 dendrite blocks the growth of the U1 and U2 dendrites and is overgrown by the U3 dendrite, and the F2 dendrite blocks the U3 and U4 dendrites and is overgrown by the U5 dendrite.This difference is mainly caused by the difference in the dendrite arm spacing between the GB FO den-drite and the FO dendrite at its immediate left.As shown in Fig. 5c, the arm spacing of F5-F6 (197 lm) and F6-F7 (324 lm) is the largest compared to F1-F2 (175 lm), F2-F3 (161 lm), F3-F4 (182 lm), and F4-F5 (179 lm).In addition, the UO dendrite arm spacing also affects the overgrowth of FO dendrites.The F4 dendrite is overgrown by the U9 dendrite.The average UO dendrite arm spacing shown in Fig. 5c, or U1-U7, is 199 lm.On the other hand, the average UO dendrite arm spacing shown in Fig. 5d, or U10-U16, is 296 lm.Thus, the large difference in the number of UO dendrites needed to overgrow the FO dendrite is caused by both spacing of FO dendrites and UO dendrites.As shown in Fig. 5d, when the U9 dendrite approaches the F4 dendrite, both GB dendrites fall down and the F4 dendrite moves to the left.When the spacing between F4 and F5 reaches the minimum value in which the two dendrites can coexist, 36 the F4 dendrite is overgrown by the U9 dendrite.Accordingly, the horizontal migration of the FO dendrite when the UO dendrite approaches to the FO dendrite is a key process in the unusual selection.The unusual overgrowth observed in the present simulation occurs less readily with increasing inclination angle of the UO dendrites 90 and pulling velocity 34 due to the reduced the solute interaction around the tips of the GB dendrites.

Three-Dimensional Simulation of Competitive Growth
As introduced in the previous section, the basic mechanism of the unusual dendrite selection can be investigated by two-dimensional simulation.However, actual dendrite growth occurs in three-dimensional space and the competition between dendrites at GBs is more complicated. 97,98Figure 6 shows the snapshots from a three-dimensional simulation of competitive dendrite growth during the directional solidification of a binary alloy bicrystal.The computational domain was set to 1.536 9 1.536 9 1.024 mm 3 (1536 9 1536 9 1024 meshes) and a computation of 0.5 million steps (23.7 s) was performed.Except for the lattice size of Dx = 1 lm and the inclination angle of UO grain of 20°, the computational conditions were the same as in the previous two-dimensional simulation.It took about half a day for this be performed using 512 GPUs of the TSUBAME2.5supercomputer at Tokyo Institute of Technology, which is a practical computational time.At the beginning of the computation, as shown in Fig. 6a, the two grains spread along the bottom surface to form a fanlike shape, and many secondary arms grow in the heat flow direction.Figure 6b show that the two grains collide and a straight GB is formed because of the high density of arms.Dendrite selection subsequently occurs and the number of dendrites decreases as shown in Fig. 6c and d.Because the inclined dendrites grow in the h100i direction with increasing arm spacing, [99][100][101][102] the UO dendrites move toward the FO dendrites.As a result, the competition between dendrites becomes intense at the GB, and the shape of the GB becomes zigzag as shown in Fig. 6e.This zigzag GB is very similar to that observed experimentally. 97,98ere, we showed the very beginning stage of competitive growth.By continuing this simulation longer, we will be able to observe the detail competition between FO and UO dendrites in three-dimensional space in detail.In three-dimensional space, because the solute diffusion is possible in the three directions, we need a longer computational time than for the two-dimensional problem to see the unusual overgrowth phenomenon.Therefore, this is challenging topic even using a supercomputer.Nevertheless, the results will be available in the near future.

CONCLUSION
Utilizing the high parallel efficiency of GPUs, cutting-edge simulations were performed to capture the nature of solidification from various viewpoints.From an atomic viewpoint, a million-atom molecular dynamics simulation revealed the spontaneous evolution of anisotropy in a solid nucleus embedded in an undercooled iron melt, in which fourfold symmetry was achieved naturally without the use of any empirical parameters.Homogeneous nucleation from an undercooled melt was achieved by another million-atom molecular dynamics simulation, in which multiple nuclei solidified to form a multigrain microstructure and grain coarsening occurred during 10 ns, according to the results of the calculation.Moreover, the convergence behavior in quantitative phase-field simulations has been discussed in detail.Such convergence enables the use of a large interface thickness in quantitative phase-field simulations.Using the quantitative phase-field model for the solidification of a dilute binary alloy, the competitive growth of dendrite assemblages during the directional solidification of a binary alloy bicrystal in a millimeter scale was examined by performing two-and three-dimensional large-scale simulations by multi-GPU computation.From the two-dimensional simulation, the mechanism of the unusual overgrowth phenomenon, in which dendrites inclined to the heat flow direction overgrow those growing in the heat flow direction during unidirectional solidification, was clarified.On the other hand, a zigzag grain boundary was formed during the competition between favorably and unfavorably oriented dendrites in the three-dimensional phasefield simulation.
In summary, many topics remain to be investigated in solidification science and other fields of metallurgy.We believe that large-scale simulations are powerful tools for their investigation and should bring about significant changes in computational metallurgy.Although results from molecular dynamics and phase-field simulations in this paper are not directly linked but so far independent, further large-scale molecular simulation will enable a direct comparison with the phase-field and other mesoscale simulations.Moreover, the statistical sampling of nucleation in the large-scale molecular dynamics simulation can export the proper information of nucleation event to the phase-field simulation in the near future.Finally, we celebrate the beginning of a new phase of computational metallurgy with the impressive snapshot in Fig. 7, which was obtained from a very-large-scale threedimensional phase-field simulation of the directional solidification of a binary alloy polycrystal. 35he calculation was carried out in a system with dimensions of 3.072 9 3.072 9 3.072 mm 3 (4096 9 4096 9 4096 meshes) for a total time period of more than 100 s (4 million computational steps) using 768 GPUs with 768 CPUs on the TSU-BAME2.0,which is the largest simulation of dendrite growth ever to be reported to the best of our knowledge.Fig. 7. Snapshot from a very-large-scale three-dimensional phasefield simulation of the directional solidification of a binary alloy polycrystal. 35The calculation were carried out in a system with dimensions of 3.072 9 3.072 9 3.072 mm 3 (4096 9 4096 9 4096 meshes) for a total time period of more than 100 s (4 million computational steps) using 768 GPUs with 768 CPUs on TSUBAME2.0.

Fig. 1 .
Fig. 1.Million-atom molecular dynamics simulation of spontaneous evolution of anisotropy in solid nucleus during solidification. 52Results from Ref. 52 are reconstructed.(Top) Snapshots at 100, 300 and 500 ps and the trace of the solid-liquid interface obtained from the snapshot at 500 ps.(Bottom) Normalized radius of the solid nucleus as a function of rotation angle h.

Fig. 2 .
Fig. 2. Consecutive molecular dynamics simulation of nucleation, solidification and grain growth.An iron melt consisting of 1,037,880 atoms in a cell of 53.4 9 53.4 9 4.3 nm 3 was isothermally undercooled at DT = 1000 K over 10,000 ps.Purple and white atoms represent iron atoms with and without the bcc configuration, respectively.

Fig. 3 .
Fig. 3. Convergence behavior of degree of undercooling and curvature radius of the dendrite tip during directional dendritic growth in Al-2mass%Cu alloy calculated for (a) V = 500 and (b) 50 lm/s.(c) Convergence behavior plotted on normalized axes.

Fig. 4 .
Fig. 4. Convergence behavior of velocity (open symbols) and curvature radius (filled symbols) of the dendrite tip during free dendritic growth in binary (A, B, C, D) and ternary (E, F) alloys.Details of datasets A-F can be found in the text.

Fig. 5 .
Fig. 5. Two-dimensional simulation of competitive growth of Al-3mass%Cu binary alloy bicrystal with V = 100 lm/s and G = 10 K/mm.The domain size was set to 3.072 9 1.152 mm 2 (4096 9 1536 meshes) and the time step was Dt = 3.75 9 10 À5 s.Zero Neumann boundary conditions for / and u were set for all boundaries.

Fig. 6 .
Fig. 6.Three-dimensional simulation of competitive growth of Al-3mass%Cu binary alloy bicrystal with V = 100 lm/s and G = 10 K/mm.The domain size was set to 1.536 9 1.536 9 1.024 mm 3 (1536 9 1536 9 1024 meshes) and the time step was Dt = 4.74 9 10 À5 s.Zero Neumann boundary conditions for / and u were set for all boundaries.The sky blue and yellow grains are the favorably and unfavorably oriented grains, respectively.

Table II .
36put parameters employed in quantitative phase-field simulations for directional solidification of an Al-2mass%Cu alloy36