Introduction

In the past decade, interest in high-throughput, property-driven discovery and design of new materials has increased significantly [1]. This interest has been fueled by the adoption of Integrated Computational Materials Engineering (ICME) and its techniques [2]. The CALPHAD-based method has proven to be a fast and reliable approach, making it the most efficient technique currently available for such endeavors [3]. ICMD™ from QuesTek Innovations LLC, Evanston, IL, USA, is considered one of the pioneering and first commercialized tools for CALPHAD-based genomic materials design [4]. Several software frameworks now integrate the CALPHAD approach into the materials design workflow to transition seamlessly between the concept, design, and qualification stages [3, 5,6,7,8,9,10,11,12]. These tools employ diverse approaches, including high-throughput CALPHAD-based simulations and machine learning techniques [5, 6]. Despite the varying approaches, the tools share a common foundation: assembling the necessary calculations and executing them with available CALPHAD software. However, most of these tools are not available as open-source solutions, which limits their use. Thus, it is not possible to build on top of the existing solutions and adapt them to domain-specific needs.

Most of the reported CALPHAD-based high-throughput screening solutions integrate equilibrium or non-equilibrium (Scheil) simulations. The latter assumes constrained diffusion in the solid state during cooling [13] and is often used for modeling fast solidification processes such as additive manufacturing [14,15,16,17,18]. However, as the material undergoes continuous reheating (intrinsic or during post-process heat treatment), consideration of solid-state precipitation is essential [19]. As a result, a combination of solidification modeling and solid-state precipitation modeling is necessary. Gao et al. [5] introduced a machine-learning-accelerated distributed task management system (Malac-Distmas), which includes precipitation simulations. The authors used the PanPrecipitation module of Pandat™ for modeling the precipitation kinetics. However, no details were found on whether the prior solidification process was considered. To do so, the results of the solidification modeling should be used as input to the solid-state precipitation simulations. In this manner, both primary and secondary phase precipitates can be considered.

The present work introduces the open-source computational framework CAROUSEL, which addresses the issues discussed above: accessibility and through-process modeling. The framework enables efficient high-throughput microstructure simulations by distributing calculations across parallel computational resources, achieving a notable speedup compared to single-threaded calculation. Using CAROUSEL, it is possible to screen multiple combinations of chemical compositions and process parameters and to perform coupled simulations of solidification and solid-state precipitation. The goal of the present work is to describe the design of CAROUSEL and its possible applications. First, an overview of the CAROUSEL architecture and its features is given in Section “CAROUSEL”. Then, the framework’s application is demonstrated in Section “Applications” by optimizing the heat treatment of additively manufactured metals. Finally, the performance analysis is discussed in Section “Performance Analysis”, followed by an outlook for further development.

CAROUSEL

Architecture

CAROUSEL is an open-source application available on GitHub [20]. The framework is released under the MIT license, promoting transparency and encouraging contributions from the scientific community for continuous improvement [21]. The framework includes the following features:

  • Open-source implementation

  • Scripting tools

  • Graphical user interface

  • Task distribution system

  • Management of simulation data

  • Data visualization

  • Extensibility to different CALPHAD implementations

  • Equilibrium, non-equilibrium (Scheil), and precipitation kinetic simulations

  • Through-process modeling

The architectural design of CAROUSEL, illustrated in Fig. 1, comprises a core library that manages fundamental framework functionalities, including data management and scripting. This core library is integrated with the communication library, which facilitates information exchange with the CALPHAD software. Finally, the graphical user interface serves as the primary interface for end users to interact with the CAROUSEL framework.

Fig. 1 Schematic overview illustrating the CAROUSEL architecture

Integration of CALPHAD Software

There are two approaches to integrating CALPHAD software into the CAROUSEL framework. The first approach uses Lua scripting, enabling advanced users to develop supplementary libraries that augment the existing framework's functionalities. This method provides a straightforward way to create extensions tailored to specific requirements, allowing CAROUSEL's capabilities to be customized and expanded in a controlled and flexible manner. By leveraging Lua scripting, researchers and developers can enhance the framework's functionality, introduce new features, and adapt the system to evolving scientific demands with relative ease [22].
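As a minimal sketch of such a supplementary library (purely illustrative and assuming nothing about CAROUSEL's actual command set), a user-defined module could package a repetitive screening step into a single helper call:

```lua
-- screening.lua: an illustrative user extension module (not part of CAROUSEL).
-- The add_case callback stands in for whichever CAROUSEL command actually
-- registers a calculation configuration with the framework.
local M = {}

-- Queue one Scheil case per composition and return how many cases were created.
-- compositions : list of tables such as { Mg = 0.8, Si = 0.6, Ti = 1.2, Fe = 0.1 } (wt%)
-- scheil_config: table with the shared solidification settings (phases, temperatures, step)
-- add_case     : function supplied by the caller, wrapping the framework command
function M.queue_scheil_cases(compositions, scheil_config, add_case)
  local queued = 0
  for _, composition in ipairs(compositions) do
    add_case({ composition = composition, scheil = scheil_config })
    queued = queued + 1
  end
  return queued
end

return M
```

A script would then load the module with require("screening") and pass in a wrapper around the appropriate CAROUSEL command together with the desired configurations.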

The second approach requires some proficiency in C/C++ programming. The integration of the CALPHAD software is achieved by implementing the communication API interface, as depicted in Fig. 1. By implementing this interface, users can define the communication protocols between the CAROUSEL framework and the CALPHAD software, determine how data are stored, and seamlessly integrate supplementary features into the framework. This avenue empowers proficient users to customize and enrich the framework's capabilities, enabling interoperability with different CALPHAD software and the integration of specialized functionalities for distinct research demands. Currently, CAROUSEL uses the MatCalc® tool and its implemented models [23].

Scripting Workflow

In software development, a scripting language offers a simplified approach to coding by integrating existing components developed using a programming language [24]. The main goal of scripting languages is to empower users to automate sets of instructions without delving into the complexities of each component’s internal workings. This abstraction enables users to extend the capabilities of pre-existing components effortlessly by adding automation scripts [24].

The CAROUSEL framework utilizes the Lua interpreter to read instructions from the GUI or from user-defined scripts. The core then executes these instructions and delegates all calculation commands to the communication API. The API sends the appropriate command set to the CALPHAD software, as shown in Fig. 2. For comprehensive information about the available commands and data models, users can refer to the CAROUSEL GitHub repository [20]. The repository contains examples and additional details to assist users in understanding and using the framework.

Fig. 2 Overview of the CAROUSEL framework's instruction-set system and distribution. Adapted from Ref. [32]

To begin scripting in CAROUSEL, the user must first import the CAROUSEL Lua library, which loads all the necessary command sets and enables the use of simplified wrappers to execute specific commands for a data model or instruction set. An example of a basic script that runs a Scheil-Gulliver solidification simulation is provided in Listing 1; it demonstrates the essential steps to perform the simulation within the framework. Furthermore, it is possible to execute multiple calculations concurrently. This is achieved by pre-defining all calculation configurations and invoking the run command on the Project data model, which enables parallel processing. This capability significantly improves computational efficiency, allowing numerous simulations to be executed concurrently, similar to a batch processing approach.

Listing 1 A basic script that runs a Scheil-Gulliver solidification simulation
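Since Listing 1 is reproduced as an image, the outline below only sketches what such a script might look like. The identifiers used here (carousel, new_project, set_composition, configure_scheil, run) are assumed placeholders for illustration and do not reproduce CAROUSEL's documented Lua API; the actual command set is described in the GitHub repository [20].

```lua
-- Illustrative outline only: the identifiers below are placeholders,
-- not CAROUSEL's documented API.
local carousel = require("carousel")          -- 1. import the CAROUSEL Lua library

local project = carousel.new_project("AlMgSi_demo")

-- 2. define the alloy composition in wt% (balance Al)
project:set_composition({ Mg = 0.8, Si = 0.6, Ti = 1.2, Fe = 0.1 })

-- 3. configure the Scheil-Gulliver solidification calculation
project:configure_scheil({
  phases     = { "LIQUID", "FCC_A1", "MG2SI_B", "AL3TI_L" },
  start_temp = 700,   -- °C
  end_temp   = 400,   -- °C
  step       = 1,     -- K
})

-- 4. execute; results are written to the project's SQL database
project:run()
```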

GUI Workflow

To accommodate end users, the CAROUSEL framework incorporates a user-friendly graphical user interface (GUI). The GUI enables users to interact with the framework without the need to write scripts. The user interface, depicted in Fig. 3, features a main ribbon containing essential editing tools for creating and loading projects and scripts. On the left side, a navigation tree facilitates easy navigation through all the generated data. Additionally, the interface offers an information display, scripting tools, and more, making scripting accessible to both newcomers and advanced users.

Fig. 3 The main window interface: (1) scripting interface; (2) main scripting tools, providing various functions for script creation and editing; (3) information window, which presents relevant details; and (4) text suggestion feature, offering helpful prompts for easier and faster script formulation. Adapted from Ref. [32]

Data Visualization Workflow

The CAROUSEL framework provides various charting options for data visualization, including the project view shown in Fig. 4. The project view allows direct interaction with each data point, enabling users to modify the displayed data. Moreover, it incorporates a search function that assists users in finding specific data points based on certain characteristics, streamlining the data exploration process. These visualization features enhance the framework’s usability and enable users to gain valuable insights from the generated data.

Fig. 4 The project view interface: (1) main plot; (2) interactive legend; (3) data selection; (4) additional information on a data point; and (5) interactive points with a context menu for further options

Data Management

The data generated from simulations are stored in a structured SQL database. The architecture of the database depends on the configuration of each individual simulation, the types of simulations to be executed, and the data to be stored from the calculations. CAROUSEL can currently store solidification, equilibrium, and kinetic simulation information using the data structure presented in Fig. 5. This structure can be extended to any type of simulation provided by the chosen CALPHAD implementation.

Fig. 5 Diagram illustrating the data structure model stored in the SQL database. Adapted from Ref. [32]

The CAROUSEL database has been intentionally designed to be user-accessible, allowing seamless access to the simulation results for further analysis at any time. Users with basic SQL knowledge can readily query the data structure to extract specific information or perform additional analyses on the simulation data. This accessibility empowers researchers to explore, manipulate, and derive valuable insights from the simulation results, promoting a deeper understanding of the underlying phenomena and facilitating data-driven scientific investigations.
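As an illustration of such a query (assuming an SQLite backend and using hypothetical table and column names rather than CAROUSEL's actual schema shown in Fig. 5), results could be retrieved directly from Lua via the LuaSQL library:

```lua
-- Illustrative only: the table/column names ("scheil_results", "phase_fraction", ...)
-- are hypothetical; consult the schema in Fig. 5 for the actual structure.
local luasql = require("luasql.sqlite3")

local env  = luasql.sqlite3()
local conn = env:connect("carousel_project.db")

-- select all cases where the MG2SI_B phase fraction exceeds 1%
local cursor = conn:execute([[
  SELECT case_id, phase_name, phase_fraction
  FROM scheil_results
  WHERE phase_name = 'MG2SI_B' AND phase_fraction > 0.01
]])

local row = cursor:fetch({}, "a")   -- mode "a" returns columns indexed by name
while row do
  print(row.case_id, row.phase_fraction)
  row = cursor:fetch(row, "a")
end

cursor:close(); conn:close(); env:close()
```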

Applications

Solidification Simulations

In materials design for additive manufacturing (AM), researchers commonly use the Scheil-Gulliver method (or Scheil equation) to conduct non-equilibrium solidification simulations [25]. This method can account for the rapid cooling rates typical of additive manufacturing by assuming constrained diffusion in the solid state and unlimited diffusion in the liquid [14, 26]. To account for uncertainties in alloy compositions or to design new alloys, a large number of Scheil calculations is necessary. With the help of CAROUSEL, managing these simulations becomes much more efficient. CAROUSEL handles all the simulations by allowing users to set up configurations for composition ranges, start and end temperatures for the Scheil calculations, temperature step size, and other parameters before executing and distributing the calculations. This significantly reduces the time and effort required to carry out the simulations. All the configurations are communicated to the CALPHAD software through the communication API, and the results are stored in the database for further analysis and evaluation.

To showcase the capabilities of the CAROUSEL framework, we designed a series of Scheil calculations. Our study investigates an Al–Mg–Si–Ti–Fe alloy system explicitly designed for AM [15, 16, 27]. We use the MatCalc 6.03 (rel 1.000) pro version and the free thermodynamic database mc_al_v2.032.tdb to perform the Scheil calculations. It is generally possible to select all the phases from the thermodynamic database; however, for simplicity, we restrict the selection based on the main alloying elements to the LIQUID, FCC_A1, MG2SI_B, and AL3TI_L phases. In total, 705 calculations were performed while varying the chemical composition in the following ranges (in wt%): Mg: 0.5–1.0; Si: 0.5–1.0; Ti: 0.9–2.0; Fe: 0.1. Figure 6 illustrates the results in spider charts that allow easy analysis of changes in phase fractions when varying the chemical compositions. As can be seen, there is a clear correlation between the case (herein, the chemical composition) and the amounts of the selected phases.
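The composition grid for such a screening can be generated with a few lines of plain Lua before the cases are handed to the framework. The step sizes below are arbitrary illustrations and are not the ones used for the 705 calculations reported here:

```lua
-- Build a list of composition cases (wt%) over the screened ranges.
-- Step sizes are arbitrary examples; Fe is kept fixed at 0.1 wt%.
-- A small epsilon keeps the upper bound inside the numeric for loop.
local cases = {}
for mg = 0.5, 1.0 + 1e-9, 0.1 do
  for si = 0.5, 1.0 + 1e-9, 0.1 do
    for ti = 0.9, 2.0 + 1e-9, 0.1 do
      cases[#cases + 1] = { Mg = mg, Si = si, Ti = ti, Fe = 0.1 }
    end
  end
end
print(#cases .. " Scheil cases to configure")
```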

Fig. 6 The spider chart demonstrating the Scheil calculation results

Precipitation Kinetic Simulations

While cooling from the solidification temperature, nucleation of secondary precipitates can be expected. During the subsequent heat treatment, these nuclei can grow or dissolve again depending on the heat treatment parameters. The primary precipitates formed during solidification do not nucleate but can dissolve or grow. The phase fractions obtained from the solidification modeling (Scheil calculations) are used as input for the precipitation kinetic modeling in the solid state. Figure 7 illustrates the complete heat treatment cycle, where the user can design and analyze each segment individually. At the end of the Scheil calculation, segment (1), a precipitate distribution is generated from the obtained phase fractions and used as primary precipitates in the following heat treatment, segments (2) to (5). Secondary precipitates can nucleate in segments (2) to (3) and grow during the further segments. The user can arbitrarily set up the segments and their number, which are then transferred to the CALPHAD software, herein MatCalc®, to perform the microstructure simulations. It is worth noting that all types of microstructure simulations provided by MatCalc® can be used here. It is, for instance, possible to account for strain and thus simulate deformation-related microstructural processes such as recrystallization and grain growth using dislocation density evolution models [23].
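As an illustration of how such a segmented cycle could be described on the scripting side, the snippet below encodes a simple heat treatment as plain Lua data. The field names and the commented-out apply_heat_treatment call are assumptions for illustration, not CAROUSEL's actual interface; the 350 °C and 6 h values correspond to the single heat treatment discussed below, while the heating and cooling rates are arbitrary.

```lua
-- Illustrative segment list for a heat treatment following the Scheil step (segment (1)).
-- Field names are hypothetical; the rates are arbitrary example values.
local heat_treatment = {
  { name = "heating", end_temp_C = 350, rate_K_per_s = 1.0 },  -- segments (2)-(3): secondary precipitates may nucleate
  { name = "holding", end_temp_C = 350, hold_time_h  = 6.0 },  -- isothermal holding
  { name = "cooling", end_temp_C = 25,  rate_K_per_s = 0.5 },  -- cool back to room temperature
}

-- A hypothetical call handing the segments, together with the primary precipitates
-- taken from the preceding Scheil result, over to the CALPHAD software:
-- project:apply_heat_treatment(scheil_result, heat_treatment)
```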

Fig. 7 Illustration of the complete temperature cycle

Figure 8 demonstrates the results of the precipitation simulation for a single heat treatment at 350 °C for 6 h. To perform the simulations, the free mobility database mc_al.ddb and the physical constants database physical_data.pdb were used. As can be seen, there is a noticeable increase in the fraction of the secondary MG2SI_B phase precipitates (MG2SI_B_P0) due to the coarsening of these precipitates (increase in the mean radius).

Fig. 8 Simulation of a single heat treatment involving a 6-h holding time at a temperature of 350 °C of a solidified (as-built) Al–Mg–Si–Ti–Fe alloy

Material and Parameter Screening

Testing different variations in chemical composition and process parameters is often necessary to achieve the target material properties. As the number of variables to be modified increases, the number of required calculations grows exponentially, making the optimization of computational resources a challenging task [28]. Eliseeva et al. [29] emphasized the importance of determining the correct parameters and identifying the most optimal path to achieve a target property. The authors compared this challenge with the “shortest path problem,” which aims to find the most efficient route between two points. However, considering additional parameters expands the possibility tree, leading to exponential growth and increased complexity, as illustrated in Fig. 9. For example, various parameters, such as temperature, holding time, and heating/cooling rates, must be modified when determining the heat treatment [30]. This leads to the creation of multiple paths to achieve the target property.
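The growth of the search space can be made concrete with a short calculation: in a full-factorial screening, the number of candidate combinations is the product of the number of levels per parameter, so every added parameter multiplies the total. The level counts below are arbitrary and only illustrate the scaling.

```lua
-- Illustrative only: count the candidate combinations of a full-factorial screening.
local levels = {
  Mg_content   = 6,  -- e.g. six Mg contents
  Si_content   = 6,  -- six Si contents
  temperature  = 5,  -- five heat treatment temperatures
  holding_time = 4,  -- four holding times
}

local combinations = 1
for _, n in pairs(levels) do
  combinations = combinations * n
end
print(combinations)  -- 6 * 6 * 5 * 4 = 720; each additional parameter multiplies this total
```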

Fig. 9 Finding the optimal path to suitable candidates for trials. Adapted from Refs. [29, 32]

Navigating through this vast solution space requires careful consideration of various factors, such as the desired property, cost constraints, feasibility, and practical limitations [1]. It becomes crucial to employ systematic approaches, such as optimization algorithms, to efficiently explore the parameter space and identify the most promising paths toward achieving the desired alloy property [29]. CAROUSEL provides a system that stores all compositions and calculation configurations, enabling the distribution of calculation sets to streamline the process. By using this approach, researchers can efficiently identify promising alloy compositions, potentially leading to significant cost savings and accelerated material design.

To showcase the capabilities of the CAROUSEL framework, a series of heat treatment simulations were performed. The simulations encompassed variations in chemical composition and heat treatment parameters, such as temperature and duration, as summarized in Table 1. In total, 705 combined solidification and solid-state precipitation simulations were performed.

Table 1 The variation of heat treatment parameters investigated in the present work

Figure 10 compares the calculated mean radii and phase fractions. The results are highlighted only for the MG2SI phase for easier interpretation, while all other selected phases are faded out. The results show that increasing the heat treatment temperature leads to a decrease in the phase fraction, corresponding to the dissolution of the MG2SI precipitates upon reaching a temperature of 550 °C. Increasing the duration (holding time), in contrast, results in an increase of the MG2SI phase fraction due to coarsening of the precipitates. An increase in Si content leads to a rise in the mean radius and a reduction of the overall MG2SI phase fraction. Overall, the suggested project view simplifies navigation between all the results and assists in identifying promising alloy candidates based on the target properties.

Fig. 10 The project view demonstrating the results of 705 precipitation simulations conducted for the Al–Mg–Si–Ti–Fe alloy

Performance Analysis

Metrics of Parallel Optimization

It is commonly assumed that dividing a job into smaller tasks results in a proportional reduction in the time required to complete the job [31]. However, this assumption holds true only for jobs that are fully parallelizable, also called “embarrassingly parallel” tasks. Foster et al. [31] explain that the degree of parallelizability of a task primarily depends on data dependency, specifically on whether a task relies on data generated or modified by another task. Two standard metrics are used to assess the effectiveness of parallel optimization: the speedup factor and the parallel efficiency. The speedup factor is calculated using Amdahl’s law (Eq. 1), which takes into account the fraction (f) of the sequential execution that cannot be parallelized, the execution time (\(t_1\)) when employing a single processor, and the number of parallel units (p). This metric quantifies the improvement in execution time achieved by parallelization, while the parallel efficiency provides insight into the actual utilization of parallel resources. The latter is calculated as the ratio of the obtained speedup to the number of parallel units p (Eq. 2). Higher parallel efficiency indicates that the parallel resources are effectively utilized, leading to a more efficient parallel implementation. Researchers and practitioners can assess the performance gains and the efficiency of parallelized algorithms or systems by evaluating both the speedup factor and the parallel efficiency.

$$\text{Speedup} = \frac{t_1}{f\,t_1 + (1-f)\,t_1/p}$$
(1)
$$\eta_s = \frac{\text{Speedup}}{p}$$
(2)
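As a worked example of Eqs. 1 and 2 (with arbitrary numbers rather than measured values): for a serial fraction f = 0.1 and p = 4 parallel units, the speedup is 1 / (0.1 + 0.9/4) ≈ 3.08 and the parallel efficiency is about 0.77. The snippet below evaluates both metrics.

```lua
-- Worked example of Eqs. 1 and 2 with arbitrary inputs (not measured values).
local function speedup(f, p)       -- f: non-parallelizable fraction, p: parallel units
  return 1 / (f + (1 - f) / p)     -- t1 cancels out of Eq. 1
end

local function efficiency(f, p)
  return speedup(f, p) / p         -- Eq. 2
end

local f, p = 0.1, 4
print(string.format("speedup = %.2f, efficiency = %.2f", speedup(f, p), efficiency(f, p)))
-- prints: speedup = 3.08, efficiency = 0.77
```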

The speedup and parallel efficiency serve as fundamental metrics for analyzing algorithms, providing insights into their performance. Two commonly used analyses, weak and strong scaling, are employed to better understand how algorithms scale with an increasing number of processors and varying data sizes. These analyses systematically vary a single factor at a time, either the number of processors or the data size, while keeping the other factors constant [31]. Strong scaling experiments evaluate how the execution time changes when the number of processors is increased for a fixed total problem size, whereas weak scaling experiments evaluate the performance when the problem size grows in proportion to the number of processors, keeping the workload per processor constant. These analyses provide valuable insights into the scalability and efficiency of algorithms under different computational configurations.

Performance Results

Weak and strong scaling analyses were executed to assess the scalability and efficiency of CAROUSEL. Evaluations were conducted based on the predefined configurations outlined in Table 2 for the testing environment summarized in Table 3. The goal is to determine the speedup and parallel efficiency achieved when using multiple instances of the MatCalc® software compared to utilizing only a single instance. All simulations were conducted with an identical setup to maintain consistency and mitigate the influence of irrelevant variables. The simulations focused on solidification modeling employing the Scheil equation. This standardized approach ensured a fair and controlled comparison of the framework's performance across parallelization scenarios.

Table 2 Experimental simulation runs and assigned thread configurations
Table 3 Computer system specifications used for evaluation

Figure 11a and b presents the parallel speedup and efficiency evaluation results. First, the simulations involving data sizes of 10, 40, and 50 cases are analyzed. It is observed that beyond two threads, these data series do not demonstrate a significant speedup. The lack of substantial improvement can be attributed to the overhead incurred during the instantiation of the CALPHAD software. Shifting the focus to the remaining three data series, which encompass larger data sizes ranging from 1000 to 3000 cases, a discernible speedup is evident with each additional thread. As the data size grows, the impact of the software instantiation overhead diminishes, resulting in a super-linear speedup and a parallel efficiency surpassing 100%, as depicted in Fig. 11b. Although a parallel efficiency above 100% has no practical significance within the context of parallel programming, it can be attributed to optimizations implemented within the CALPHAD software and the underlying operating system. Remarkably, for the system specifications used, the maximum parallelization efficiency is attained by employing two threads for the larger data sets, which effectively mitigates the influence of the software instantiation overhead. Additionally, both the parallel speedup and efficiency show that the application scales properly as data sizes increase. This promising outcome suggests the need for further investigation, which could involve employing more powerful computers or implementing networking to distribute the workload. It is expected that such enhancements would yield further significant improvements.

Fig. 11 The results of weak and strong scaling based on the experimental simulations defined in Table 2: a the parallel speedup and b the parallel efficiency. Adapted from Ref. [32]

Conclusions and Outlook

The present paper described the development and validation of CAROUSEL, an open-source framework for high-throughput microstructure simulations. It uses CALPHAD implementations to perform equilibrium, non-equilibrium (Scheil), and precipitation kinetic simulations and to predict the precipitation evolution during metal processing, particularly additive manufacturing. The distinguishing feature is the ability to perform through-process modeling by combining solidification and solid-state precipitation steps. Furthermore, CAROUSEL can use all functions of the chosen CALPHAD software (currently MatCalc®) and its models to predict microstructural phenomena. The simulation results are stored in an SQL database and can be accessed at any time for further processing. CAROUSEL executes diverse simulations through its effective distribution of computational tasks. The performance evaluations demonstrated a significant speedup achieved by the framework, confirming its ability to handle multiple simulations efficiently. The visualization tools of CAROUSEL assist in the fast evaluation of simulation results and in understanding trends when varying chemical compositions and process parameters. The user-friendly graphical interface makes CAROUSEL a valuable tool for materials engineers, while advanced users can benefit from its scripting capabilities to perform complex simulations. Finally, CAROUSEL offers a platform for community engagement and international collaboration, which will help integrate it into different metal industries and high-performance computing environments.