Software Platforms for Electronic/Atomistic/Mesoscopic Modeling: Status and Perspectives

  • Mikael Christensen
  • Volker Eyert
  • Arthur France-Lanord
  • Clive Freeman
  • Benoît Leblanc
  • Alexander Mavromaras
  • Stephen J Mumby
  • David Reith
  • David Rigby
  • Xavier Rozanska
  • Hannes Schweiger
  • Tzu-Ray Shan
  • Philippe Ungerer
  • René Windiks
  • Walter Wolf
  • Marianna Yiannourakou
  • Erich Wimmer
Open Access
Thematic Section: 2nd International Workshop on Software Solutions for ICME

Abstract

Predicting engineering properties of materials prior to their synthesis enables the integration of their design into the overall engineering process. In this context, the present article discusses the foundation and requirements of software platforms for predicting materials properties through modeling and simulation at the electronic, atomistic, and mesoscopic levels, addressing functionality, verification, validation, robustness, ease of use, interoperability, support, and related criteria. Based on these requirements, an assessment is made of the current state revealing two critical points in the large-scale industrial deployment of atomistic modeling, namely (i) the ability to describe multicomponent systems and to compute their structural and functional properties with sufficient accuracy and (ii) the expertise needed for translating complex engineering problems into viable modeling strategies and deriving results of direct value for the engineering process. Progress with these challenges is undeniable, as illustrated here by examples from structural and functional materials including metal alloys, polymers, battery materials, and fluids. Perspectives on the evolution of modeling software platforms show the need for fundamental research to improve the predictive power of models as well as coordination and support actions to accelerate industrial deployment.

Keywords

Integrated computational materials engineering (ICME) · Materials modeling · Software · Interoperability · Industrial deployment · Metal alloys · Polymers · Batteries · Fluids

Introduction

We are witnessing the dawn of a Golden Age of integrated computational materials engineering (ICME). The confluence of five main factors is creating this unprecedented situation, namely (i) theoretical physics and chemistry have established a solid scientific foundation in the form of classical mechanics, electrodynamics, statistical thermodynamics, and quantum mechanics; (ii) computer hardware with astounding performance has become readily affordable; (iii) advanced software systems are enabling unprecedented productivity while the tools for software development are constantly improving; (iv) today’s communication technologies enable instantaneous and global collaboration as well as access to a daunting wealth of data; and, last but not least, (v) the potential economic impact of this technology has aroused the interest of industry around the globe, thus driving the accelerated transition from academic research to practical applications.

The vision of ICME is illustrated in Fig. 1. The design of materials is treated as an integral part of the overall engineering process. Rather than being restricted to existing materials in the design of components and systems, the most fundamental building blocks of any engineering endeavor, namely the materials themselves, become dynamic variables in the design process. To this end, the ability to compute properties of materials prior to their actual synthesis is a key requirement for ICME. This capability is the foundation of ICME and, rightfully, this has created tremendous excitement and opportunities. Thus, this paper focuses on the ability to compute and use materials property data.
Fig. 1

Scheme of integrated computational materials engineering (Color figure online)

There are also important modeling applications in the field of raw natural materials. These materials can be either minerals (e.g., rocks, ores, clay) or organic (e.g., wood, coal, kerogen, asphalt). They are used or processed by important industrial sectors including oil and gas, building materials, metallurgy, energy, and chemistry. Understanding the properties of these materials is required for the design of safe and economic processes in these industries. Due to their complexity, they are particularly challenging for modeling, but substantial progress is already being made.

At present, simulations based on classical mechanics, fluid dynamics, and electrodynamics are well established R&D tools in the design of components and systems, leading to better products, reduced development time, and lower experimental costs. The benefits are evident such as safer and better cars, highly efficient airplanes and turbines, and a myriad of electronic products. However, the materials property data required for these macroscopic simulations are for the most part taken from experiment. Hence, the introduction of new materials hinges on experimental synthesis and characterization, which are often slow and expensive bottlenecks.

While experiments will always be needed, the capability of computing properties of materials prior to their synthesis radically changes this picture. This is a wonderful situation, but the reality of predicting the properties of a material at a level of accuracy and reliability sufficient to be useful in an engineering process is challenging. Successes are emerging, but it is fair to say that the technological readiness level of electronic and atomistic software, which is a fundamental part of ICME, leaves ample opportunities for improvement.

It is the purpose of the present paper to review the current state of software platforms, which enable materials property predictions based on a combination of electronic structure methods, atomistic simulation tools using interatomic potentials or forcefields, and mesoscopic modeling tools. In the following, this combination will be referred to as “e/a/m.” The focus here is on approaches based on fundamental physical concepts such as Schrödinger’s equation and statistical thermodynamics. One needs to keep in mind that in practice, empirical methods such as quantitative structure-property relationships (QSPR), data mining, and machine learning play an important role. In fact, these approaches can be very effective if used in combination with physical equation-based predictive methods.

Theoretical Foundation

Computational materials science has its theoretical foundation in quantum mechanics, statistical mechanics, classical mechanics, and electrodynamics. Within the approaches based on physical laws, one distinguishes between “discrete” and “continuum” models, the former referring to methods with explicit treatment of electrons, atoms, or groups of atoms, the latter encompassing a vast range of methods using a continuum description of matter. Structural analysis with finite element methods (FEM), computational fluid dynamics (CFD), and the so-called technology computer-aided design (TCAD) methods for the simulation of electronic circuits belong to this very important and well-established class of methods. In materials science, thermodynamic approaches such as the calculation of phase diagrams (CALPHAD), phase field methods, and the solution of diffusion equations for simulating, for example, solidification of metal alloys, play an increasingly important role. At present, the established continuum methods rely overwhelmingly on experimentally determined materials property data, although increasingly, these methods also incorporate data from atomistic and quantum mechanical simulations, as indicated in Fig. 2.
Fig. 2

Computational methods in materials modeling. MD molecular dynamics, MC Monte Carlo, BD Brownian dynamics, DPD dissipative particle dynamics, FEM finite element methods, CFD computational fluid dynamics, CALPHAD calculation of phase diagrams, TCAD technology computer-aided design, QSPR quantitative structure-property relationship (Color figure online)

Historically, the methods shown in Fig. 2 have been developed by different groups, initially mostly in academic research groups, government laboratories, and in some cases by industrial research organizations such as Bell Labs and the research centers of IBM in Yorktown Heights and Rüschlikon. The work by different research groups in physics, chemistry, materials science, and biology, and the different focus, assumptions, and approximations used in describing various materials and properties have resulted in a fragmentation of this field, which persists to the present day.

On the level of ab initio quantum mechanical approaches, theoretical chemists and solid-state physicists pursued different routes for many decades. Many quantum chemists aimed at the most accurate solution of Schrödinger’s equation for small molecules starting from Hartree-Fock theory while theoretical solid-state physicists developed methods based on density functional theory (DFT) [1, 2]. During the past decades, DFT has become a workhorse also for molecular systems while Hartree-Fock-based methods have found their way into solid-state methods. Driven by the quest for higher accuracy and enabled by increasing compute power, we are witnessing today a convergence of these two approaches, for example in the form of hybrid functionals.

Semi-empirical quantum mechanical approaches as implemented in programs such as MOPAC [3] are extremely useful due to their high computational efficiency and good performance especially for organic and inorganic molecular systems. The use of extensive training sets of experimental data in the parameterization while maintaining key aspects of molecular quantum mechanics is the foundation for the success of this approach. Semi-empirical calculations can be 100 times faster than ab initio calculations, thus offering an interesting approach for high-throughput calculations [4]. Furthermore, semi-empirical methods can be implemented such that the scaling of the computing time is better than N² while ab initio approaches typically scale with a power of 3 or higher, with N being the number of electrons, although the so-called order-N ab initio methods exist as well.
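
To put these formal scaling laws in perspective, the following back-of-the-envelope sketch (illustrative only, not a benchmark of any particular code) shows how the relative cost grows with the number of electrons N for O(N²) and O(N³) methods.

```python
# Back-of-the-envelope comparison of formal scaling laws (not a benchmark):
# relative cost increase when the number of electrons N grows by a factor,
# for an O(N^2) semi-empirical method and an O(N^3) ab initio method.
for factor in (2, 4, 10):
    cost_semi_empirical = factor ** 2
    cost_ab_initio = factor ** 3
    print(f"N x{factor:2d}: semi-empirical cost x{cost_semi_empirical:4d}, "
          f"ab initio cost x{cost_ab_initio:5d}")
```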

The modeling and property predictions of liquids [5] and amorphous materials such as polymers and glasses, as well as simulations of dislocations in metal alloys, may require the sampling of millions of configurations or large systems containing hundreds of thousands of atoms or more. The desire to model such systems has led to the development of forcefield (or force field) methods, a terminology preferred by chemists [6, 7, 8], or interatomic potentials, a terminology preferred by physicists [9, 10]. Both terms refer to the same concept, namely, the use of relatively simple mathematical expressions such as Morse-type binding curves to describe the interaction between atoms. A major incentive for the development of forcefields was the desire to simulate the interaction of drug molecules with DNA and proteins. Extension of these forcefields to synthetic polymers and organic liquids has enabled the prediction of structural, thermomechanical, and rheological properties with remarkable accuracy. Interatomic potentials have also been successful in describing metallic systems as well as highly ionic materials. However, these approaches are different in character and thus mixed systems such as an interface between a polymer and a metal represent a conceptual dilemma. In this context, it should be pointed out that the Nobel prize for chemistry was awarded to Martin Karplus, Michael Levitt, and Arieh Warshel in 2013 for the “Development of multi-scale models for complex chemical systems.”
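
As a minimal illustration of such a simple functional form, the sketch below evaluates a Morse-type binding curve, E(r) = D_e[1 − exp(−a(r − r_e))]² − D_e; the parameter values are illustrative placeholders and are not fitted to any particular material.

```python
import numpy as np

def morse(r, d_e, a, r_e):
    """Morse binding curve: well depth d_e (eV), width parameter a (1/Angstrom),
    equilibrium distance r_e (Angstrom); tends to zero at large separation."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2 - d_e

# Illustrative (not fitted) parameters for a generic pair interaction.
r = np.linspace(0.8, 5.0, 200)
energy = morse(r, d_e=0.5, a=1.5, r_e=2.2)
print(f"minimum energy {energy.min():.3f} eV at r = {r[energy.argmin()]:.2f} Angstrom")
```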

The optimization of forcefield parameters using results from ab initio calculations as a training set is one method to expand the scope of ab initio methods. Calibrating the forcefield parameters on sensitive quantities such as the experimental density, heat of formation, mechanical properties, or the melting point of a material leads to powerful computational approaches, which can be more accurate than DFT calculations, albeit with a narrower range of applicability, as will be illustrated by the prediction of boiling points of liquids discussed in a later section.
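
A minimal sketch of this calibration idea is shown below: Morse parameters are refined by least squares against a small set of reference binding-curve points. The numerical “training data” are synthetic placeholders standing in for ab initio or experimental values, and the use of scipy’s curve_fit is simply one convenient choice of fitting tool.

```python
import numpy as np
from scipy.optimize import curve_fit

def morse(r, d_e, a, r_e):
    """Morse binding curve used as the forcefield functional form."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2 - d_e

# Placeholder "training set": distances (Angstrom) and energies (eV) standing in
# for ab initio (e.g., DFT) binding-curve points; these numbers are synthetic.
r_train = np.array([1.8, 2.0, 2.2, 2.5, 3.0, 4.0])
e_train = np.array([0.10, -0.35, -0.50, -0.42, -0.22, -0.05])

params, _ = curve_fit(morse, r_train, e_train, p0=[0.5, 1.5, 2.2])
print("fitted D_e, a, r_e:", params)
```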

The so-called cluster expansion method [11, 12] offers another possibility to expand the scope of ab initio calculations. In this powerful and elegant method, ab initio calculations are used iteratively to build an expression of the total energy of a system as a function of local arrangements (clusters) of atoms. This expansion reproduces the original DFT values with a fidelity of a few milli-electron-volts (or a few tenths of a kJ mol⁻¹). However, the presence of a lattice is required. Hence, this method is particularly well suited for the simulation of metal alloys.
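
The following sketch illustrates the basic structure of a cluster expansion in its simplest form, a pair-only expansion on a one-dimensional binary lattice with periodic boundary conditions; the effective cluster interactions are illustrative numbers, whereas in practice they are fitted to a set of DFT total energies.

```python
import numpy as np

def ce_energy(sigma, j0, j1, j_pair):
    """Pair-only cluster expansion on a 1D binary lattice with periodic
    boundaries: occupation variables sigma_i = +1/-1 (two alloy species),
    E = J0 + J1*<sigma> + sum_d J_pair[d]*<sigma_i sigma_{i+d}> (per site)."""
    sigma = np.asarray(sigma, dtype=float)
    energy = j0 + j1 * sigma.mean()
    for d, j in enumerate(j_pair, start=1):
        corr = np.mean(sigma * np.roll(sigma, -d))  # periodic pair correlation
        energy += j * corr
    return energy

# Illustrative effective cluster interactions (arbitrary units), not fitted values.
config = [1, -1, 1, 1, -1, -1, 1, -1]
print(ce_energy(config, j0=-1.0, j1=0.05, j_pair=[0.10, -0.02]))
```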

Kinetic Monte Carlo simulations can describe phenomena on very long time scales and on large systems containing millions of atoms. For example, the dynamic evolution of phase segregation in a metal alloy can be simulated, if the jump rates of elementary diffusion steps are known. Diffusion rates and, more generally, reaction rates can be obtained from ab initio calculations using transition state theory or specific forms of molecular dynamics.
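
A minimal sketch of one kinetic Monte Carlo step (residence-time, BKL/Gillespie-type algorithm) is given below, assuming the jump rates of the competing events are already known, for example from transition state theory; the rate values are placeholders.

```python
import math
import random

def kmc_step(rates, rng=random):
    """One residence-time (BKL-type) kinetic Monte Carlo step: choose an event
    with probability proportional to its rate and advance the clock by an
    exponentially distributed waiting time."""
    total = sum(rates)
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1
    for i, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            chosen = i
            break
    dt = -math.log(1.0 - rng.random()) / total   # waiting time ~ Exp(total rate)
    return chosen, dt

# Placeholder jump rates (1/s) for three competing diffusion events, e.g., from
# k = nu * exp(-E_a / (k_B * T)) with barriers taken from ab initio calculations.
rates = [1.0e9, 2.5e8, 4.0e7]
event, dt = kmc_step(rates)
print(f"executed event {event}, clock advanced by {dt:.2e} s")
```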

Coarse-graining is yet another possibility within the class of forcefield methods to extend the length and time scales of atomistic simulations. This method can be applied to perform molecular dynamics, Monte Carlo simulations, Brownian dynamics, and dissipative particle dynamics. While conceptually appealing, the construction of accurate coarse-grained models requires substantial insight into the key interaction mechanisms of a material and thus is far from being an automatic process.
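
One elementary ingredient of coarse-graining, the mapping of groups of atoms onto beads located at their centers of mass, can be sketched as follows; the grouping, masses, and coordinates are illustrative only.

```python
import numpy as np

def coarse_grain(positions, masses, groups):
    """Map groups of atoms onto beads placed at the center of mass of each group
    (one elementary step in constructing a coarse-grained model)."""
    beads = []
    for group in groups:
        pos = positions[group]
        m = masses[group]
        beads.append((m[:, None] * pos).sum(axis=0) / m.sum())
    return np.array(beads)

# Illustrative data only: a 6-atom fragment mapped onto two 3-atom beads.
rng = np.random.default_rng(0)
positions = rng.random((6, 3)) * 5.0                  # Angstrom
masses = np.array([12.0, 1.0, 1.0, 12.0, 1.0, 1.0])   # CH2-like groups
print(coarse_grain(positions, masses, groups=[[0, 1, 2], [3, 4, 5]]))
```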

This short overview of e/a/m methods gives a glimpse of the contrasting simplifications of the various approaches and the resulting difficulties in integrating these approaches into a unified modeling platform with smooth interoperability among the different approaches as well as connecting such a platform with simulations operating on the continuum level.

To assess the present modeling platforms, we will now discuss the requirements for such a software system. While great progress has been achieved during the past decades, it will also become obvious that in many instances, these requirements are only partially fulfilled, thus leaving room for major improvements.

Status of e/a/m Modeling Platforms

Materials modeling platforms serve two closely connected purposes, namely (i) enabling a deeper understanding of mechanisms that lead to a certain behavior of materials, for example stress corrosion cracking, and (ii) predicting properties of materials prior to their synthesis and experimental characterization, for example solid-state electrolytes for Li-ion batteries. Here, we assess the status of e/a/m modeling platforms guided by their key attributes. To this end, we draw on experience in the development of the modeling and simulation platforms Insight II and Discover (Biosym Technologies), Cerius2 (Molecular Simulations, Inc.), Materials Studio (Accelrys), UniChem (Cray Research), and MedeA (Materials Design, Inc.). It should be noted that this selection is based on the direct experience of the authors and is not intended to be exhaustive. A comprehensive compilation of such platforms and simulation tools is given by Schmitz and Prahl [13].

A modern e/a/m software platform should meet the following criteria:
  • Comprehensive—predictions of all relevant physical and chemical properties for all types of materials such as metals and alloys, semiconductors, and insulators; inorganic and organic materials; crystalline and amorphous phases such as glasses and polymers as well as fluids in liquid and gaseous form.

  • State-of-the-art—computational materials science is an active research field with new methods emerging at a relentless pace. Users of leading platforms expect the best and most recent methods to be available.

  • Verified—tests need to show that algorithms are correctly implemented and that programming bugs have been found and corrected. Verification of computer codes needs to be repeated after changes are made to the software or when hardware or operating systems change.

  • Validated—computed properties need to be validated against experimental data with estimation of error bars.

  • Robust—in the coming age of high-throughput calculations, all simulation programs need to work correctly under a large range of input conditions.

  • Error recovery and fault tolerance—errors in the computations, whatever their origin might be, must be automatically detected and, if possible, corrected.

  • Ease of use—time to solution is critical and needs to be minimized; input parameters should be kept to a minimum without undue complexity or redundancy.

  • Standardized—computer programs should use standardized procedures to facilitate the exchange of data from simulations performed by different people at different times with different programs.

  • Ability to create, store, and re-use workflow protocols—traceability and reusability are essential requirements for industrial use and quality control.

  • Well documented—context-dependent documentation of the underlying theory and algorithms, together with tutorials, is an important part of any good software system.

  • Computationally efficient—the performance of hardware continues to progress and should be fully utilized by any software platform.

  • Supported consistently on evolving hardware and operating systems—acquiring the skills to use any software platform for materials simulations to its fullest capabilities represents an investment of years. This investment needs to be protected by support over many years.

  • Extensible—theory, algorithms, and software implementations evolve and any sustainable software platform needs to be able to incorporate new capabilities without having to rewrite entire codes.

  • Portable—scientific software has a longer life cycle than computer hardware and operating systems; software should be written in a form which facilitates portability to new hardware and operating systems.

  • Interoperable with other software components and platforms—modeling and simulations of materials involve tools from different sources and suppliers; standardization and interoperability are mandatory.

  • Able to communicate efficiently with public and proprietary databases—modeling and simulations of materials are built on previous knowledge and data; a close integration of existing experimental data with e/a/m simulation platforms sets the stage for powerful materials property mining and optimization.

  • Tracking of metadata for traceability and re-use—like experimental data, computed results require metadata to make them useful such as input data, computational parameters, date and time, links to input and output files, program version, operating system, and operator.

Based on these requirements, the current status is discussed in detail in the following sections.

Comprehensive Functionality

Industrial materials and processes often involve multiple phases, for example a liquid and solid phase in an additive manufacturing process, a carbide precipitate in steel, a polymer/ceramic interface in an electronic package, or a water/oxide interface in a corrosion problem. Thus, an e/a/m platform needs to be able to handle a broad range of materials including metals and alloys, ceramics, and semiconductors, as well as organic materials in crystalline, amorphous, liquid, and gaseous forms. For each of these materials as well as for the interfaces between any combination of such materials, the platform needs to permit the computation of a range of materials properties as illustrated in Fig. 3.
Fig. 3

Properties for materials and their interfaces, which an e/a/m platform needs to predict. The two axes at the base indicate possible interfaces between different materials (Color figure online)

Current e/a/m platforms address a wide range of materials and properties, as shown in Fig. 3, but the technological readiness level of such property calculations is not uniform. For example, elastic coefficients of crystalline solids and synthetic polymers like cross-linked epoxy resins can be computed with remarkable accuracy and predictive power while the accurate prediction of other properties such as the melting temperature of even some common crystalline materials remains a major challenge.

Historically, the modeling of different classes of materials has evolved from research efforts of quite different scientific communities. This has led to a range of different theoretical and computational approaches, terminologies, and different units. Thus, there are now modeling methods which address specific materials and properties quite well but are applicable only to individual systems such as a bulk metal or an organic polymer, not to an interface between a polymer and a metal.

Fortunately, as computational approaches mature, they tend to become more general, thus allowing broader modeling of complex multi-phase systems. For example, electronic structure methods for solids on the one hand and quantum chemistry methods for molecules on the other hand have been developed for well over half a century by different research groups. The solid-state community pursued density functional theory (using Rydberg atomic units or eV) while quantum chemists built on Hartree-Fock methods (using Hartree atomic units or kcal mol⁻¹). To further complicate matters, the definition of “mol” could mean mol of atoms, mol of compounds, or mol of simulation boxes. Today, these methods have merged and one can readily compute the interaction of a molecule with a metal surface. Nevertheless, domain-specific computational methods are still predominant, presenting a challenge for the practitioner who wants to solve multi-phase and multi-materials problems in a single modeling platform.
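
Because these communities grew up with different unit systems, a few standard conversion factors are worth keeping at hand; the snippet below collects commonly used values (1 Hartree = 2 Rydberg ≈ 27.211 eV ≈ 627.5 kcal per mol of particles).

```python
# Standard energy unit conversions encountered across the solid-state and
# quantum chemistry communities (per particle, i.e., per mol of atoms/molecules).
HARTREE_TO_EV = 27.211386
HARTREE_TO_RYDBERG = 2.0
HARTREE_TO_KCAL_PER_MOL = 627.509   # "mol" of particles, not simulation boxes
EV_TO_KJ_PER_MOL = 96.485

print(f"1 Ry = {HARTREE_TO_EV / HARTREE_TO_RYDBERG:.4f} eV")
print(f"1 eV = {EV_TO_KJ_PER_MOL:.3f} kJ/mol = "
      f"{HARTREE_TO_KCAL_PER_MOL / HARTREE_TO_EV:.3f} kcal/mol")
```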

Today’s leading e/a/m modeling platforms provide in a single environment access to a comprehensive range of methods including electronic structure methods for solids, surfaces, and molecules as well as forcefield-based methods. These programs are accessible from a single user interface, thus allowing the choice of the most appropriate method for the question or materials property at hand. In addition, there are specialized platforms with a focus on specific properties, for example thermo-physical properties of molecular liquids.

State of the Art

Innovative methods in computational materials science are being developed by an increasing number of research groups around the world. Users of advanced simulation platforms expect to benefit from the best of these developments without undue delay. These could involve, for example, new functionals in DFT methods, new algorithms for highly accurate post-DFT approaches such as the random phase approximation in electronic structure methods [14], or a new forcefield or a new statistical ensemble. Of course, rapid access to new capabilities competes with other requirements such as validation, robustness, and standardization.

Development and marketing strategies for materials modeling software vary. If the software is intended for a broader user base, then packaging may take precedence over rapid access to the latest and greatest computational methods. Academic sources of software tend to put a high priority on state-of-the-art functionality whereas some commercial software platforms seek a broader user base by providing ease-of-use and automation. Each strategy has its merits and drawbacks and the end-users select the appropriate environment given their needs.

Verification

Both scientific and commercial applications of software rely on the correct implementation of algorithms and the absence of programming errors. Verification is a very difficult and time-consuming part of software engineering. Static verification is usually the first step, where the source code is being inspected for aspects such as compliance with the standards of a chosen computer language and the absence of simple typographical errors. Dynamic verification involves the execution of tests where the answer is known from other sources.

When new features are implemented in a software package, regression tests are used to verify that existing functionality is preserved. Such tests are also critical when the software is ported, for example, to new operating systems, but also when new compilers or libraries are used. Non-regression tests are necessary to ensure that improved or new features indeed have the desired effect. Both types of tests are necessary. It is good practice that testing is performed by someone other than the original code developers.
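
A hedged sketch of such a regression test is given below in pytest style; run_lattice_constant is a hypothetical wrapper around whatever compute engine is under test, and the stored reference value is a placeholder for a previously verified result.

```python
import pytest  # hypothetical regression test, executed with `pytest`

# Placeholder reference from a previously verified run of the same code,
# stored alongside the test suite.
REFERENCE_LATTICE_CONSTANT = 4.046  # Angstrom

def run_lattice_constant(element: str) -> float:
    """Hypothetical wrapper around the compute engine under test."""
    raise NotImplementedError("call the actual simulation code here")

@pytest.mark.skip(reason="sketch only; wire up the real compute engine first")
def test_fcc_aluminum_lattice_constant():
    value = run_lattice_constant("Al")
    # Regression criterion: new builds must reproduce the stored reference
    # within a tolerance chosen to exceed numerical noise.
    assert value == pytest.approx(REFERENCE_LATTICE_CONSTANT, abs=1e-3)
```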

If a new computational method together with results for specific cases is already published, then a new implementation of such a method should be tested against these earlier results. A recent example is a comparison of DFT results obtained with three different computer codes, namely VASP, GPAW, and Wien2k [15]. This study demonstrated that the variation between the results from these three implementations is an order of magnitude smaller than the typical deviation of the computed results from experiment. This example also indicates that verification and validation often go hand in hand.

Validation and Estimation of Error Bars

Practical solutions of many industrial problems do not require the most accurate approaches. Rather, one frequently needs fast and cost-effective property values with sufficient accuracy and reliability to be used for making engineering decisions. This requirement leads to the need for reliable error estimates, especially for approximate methods. A priori such estimates are very difficult to make. Hence, the assessment by experts is necessary in this case, possibly aided by statistical methods.

Validation of computational approaches and the analysis of uncertainty thus remain central requirements of e/a/m platforms. Systematic comparison with accurate and reliable experimental data provides the most compelling analysis. However, one should keep in mind that occasionally experimental data are incorrect, for example simply due to typographical transcription errors or due to uncontrolled effects during the measurements. In fact, high-level computations can play a critical role in identifying such errors and deficiencies. Illustrative examples are provided by the realization that the reported elastic constants of sapphire had for many years included an indexing error [16] and that an incorrect feature of the band structure had been reported for InAs, a standard III-V semiconductor [17]. In both these instances, the source of the experimental discrepancy was identified based on accurate computation. Industrial research and development experience, which frequently does not immediately emerge in the research literature, indicates that these are far from isolated examples.

Robustness

In the late twentieth century, when compute power was much more limited and expensive than today, typical modeling efforts entailed the study of a single system or a handful of systems. In this situation, it was quite feasible to monitor the progress of calculations and to adjust computational parameters such as basis sets of quantum mechanical calculations or convergence criteria in geometry optimizations by hand. This is no longer practical in high-throughput calculations on thousands of systems, which are enabled by today’s compute power.

For this reason, robustness of “compute engines” is becoming a critical aspect of modern simulation platforms. This means that the right choices of computational parameters can no longer be made by hand for each system, but they need to be universal over a large set of systems. Before launching any simulations, the modeling platform needs to assess the formal correctness and consistency of input structures and computational parameters. This includes simple aspects such as ensuring that input structures do not contain atoms which are unphysically close, but it also can mean that a model of a cross-linked polymer with 100,000 atoms does not contain unphysical topologies such as highly strained ring catenation. Leading modeling platforms provide such consistency checks.
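
A minimal sketch of one such pre-flight consistency check, flagging atom pairs that are unphysically close, is shown below; the distance cutoff and structure are illustrative, and periodic images are ignored for brevity.

```python
import itertools
import numpy as np

def too_close_pairs(positions, min_dist=0.7):
    """Return index pairs of atoms closer than min_dist (Angstrom).
    Simple O(N^2) check; periodic images are ignored in this sketch."""
    bad = []
    for i, j in itertools.combinations(range(len(positions)), 2):
        if np.linalg.norm(positions[i] - positions[j]) < min_dist:
            bad.append((i, j))
    return bad

# Illustrative structure: atom 1 sits unphysically close to atom 0.
positions = np.array([[0.0, 0.0, 0.0],
                      [0.3, 0.0, 0.0],
                      [3.0, 0.0, 0.0]])
print(too_close_pairs(positions))   # -> [(0, 1)]
```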

Today’s leading quantum mechanical solid-state programs such as VASP are remarkably general in terms of the choice of atoms (from hydrogen to curium) and robust in terms of initial structures in geometry optimizations or ab initio molecular dynamics simulations. The situation is quite different in forcefield-based molecular dynamics and Monte Carlo simulations. While excellent forcefields (in terms of generality and accuracy) for many organic molecules are available, which today’s modeling platforms can assign automatically in a large variety of models such as molecular liquids and polymers, occasionally functional groups or multifunctional molecules are encountered, where some forcefield parameters are missing. This issue is more pronounced when forcefields (or interatomic potentials) are needed to describe inorganic systems and/or systems comprising inorganic and organic matter. For many cases, classical forcefields simply do not exist or transferability of parameters cannot be achieved. Using such forcefields in a blind fashion can then lead to unexpected results or perhaps unmitigated disasters. The fact that a given forcefield is not applicable to a specific system or to a certain property may not be obvious even to experienced users. No current modeling platform can guarantee robustness in every situation. Rather, the platform should warn the user before launching a simulation, in cases where the initial model of the system(s) or the description of the interatomic interactions may be unphysical.

Another aspect of robustness is the behavior of the modeling platform when the size of the system or the required simulation time exceeds the available computing resources. While it would be desirable that the user is informed about the necessary computing resources prior to launching a simulation, estimating such resource requirements is demanding and today’s modeling platforms do not currently offer such a feature.

Robustness of a software system also implies stable behavior under any operating conditions. State-of-the-art modeling platforms involve very complex human/machine interactions as well as asynchronous, non-deterministic inter-process communications, for example between a user interface and tasks running on different computers in a network. Sophisticated software engineering, employing established communication patterns and protocols, makes such systems reliable.

Error Detection and Correction, Fault Tolerance

Errors can occur in any modeling step including the building of structural models, the setting of input parameters, lack of convergence in iterative procedures, and numerical instabilities, as well as in the analysis, storage, and retrieval of computed results. Furthermore, today’s highly networked and distributed computing environments are subject to faults due to network problems as well as failures in hardware and operating systems. In the past, when simulations were executed individually, error detection and correction were possible through human control and intervention. With workflows and high-throughput calculations, this is no longer possible and automated control mechanisms are required. Determining whether a specific simulation was correct or erroneous requires context-dependent information and judgment. Most advanced materials modeling platforms presently leave room for improvement in error handling and fault tolerance.
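
As a hedged sketch of one simple fault-tolerance mechanism, the snippet below retries a workflow stage after transient infrastructure failures (for example network errors) while letting genuine errors surface immediately; the function and stage names are illustrative.

```python
import time

def run_with_retries(stage, max_retries=3, delay_s=10.0,
                     transient=(ConnectionError, TimeoutError)):
    """Re-run a workflow stage after transient infrastructure failures;
    other exceptions are treated as real errors and re-raised immediately."""
    for attempt in range(1, max_retries + 1):
        try:
            return stage()
        except transient as exc:
            if attempt == max_retries:
                raise
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay_s} s")
            time.sleep(delay_s)

# Usage (illustrative, with a hypothetical submission function):
# result = run_with_retries(lambda: submit_job_to_cluster(job))
```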

Ease of Use

The ease of use of a software system can be gauged by the number, rapidity, and visual comfort of steps a user must take to accomplish a task. If the task is simple and well-defined, then the number of steps should be minimal and convenient. If the task is complex, the user should have flexibility in defining the appropriate modeling approach and parameters. In any event, the user interface should be homogeneous in its automation, e.g., in the treatment of single-phase and multi-phase systems or in the treatment of a single system or a large list of systems.

The options for each step should be self-evident to the user and they should be presented in a form and language which is commonly used in the specific discipline. The high degree of specialization in materials science and engineering makes the design of easy-to-use software platforms challenging. However, specialization also implies a narrow focus, which could make it easier to construct protocols involving a small number of steps.

Typically, academic researchers require leading functionality and access to a large range of computational parameters enabling them to push the frontiers of research while industrial materials engineers need well-tested computational protocols and parameter settings to obtain a specific answer rapidly and reliably in the context of an overall engineering project. A small number of parameters is often a good indicator of mature algorithms and programs.

Software engineers of user interfaces for smartphones are masters in the creation of ergonomic systems, thus setting similarly high expectations for ease of use in other software systems. However, user interface design for a smartphone is subject to very different constraints than those of a highly specialized materials modeling platform. The market for smartphones is measured in billions while that for materials modeling software is about five orders of magnitude smaller. The possible investments in user interface design scale accordingly. The functionality of smartphones such as making a phone call, taking a picture, or getting directions to a restaurant can be readily defined while ease-of-use for functionality such as predicting the stress-corrosion behavior of a new alloy is rather different. Furthermore, “ease of use” for materials scientists and engineers who are not modeling experts means being able to perform specific but complex simulations in an easy and straightforward way, e.g., use of property modules and libraries of flowcharts. A full-time expert modeler has other requirements, namely ergonomic design, intuitive menus, and access to maximal functionality in the quickest ways, avoiding multiple repeats of a certain action, e.g., by defining loops in flowcharts and custom stages. Thus, ease of use remains a significant challenge for developers of e/a/m modeling platforms and there is room for design diversity and products depending on the specific needs of target users.

Scientists and engineers spend many hours on their computers and this time may increase in the future. Hence, ergonomics should not be forgotten and input from medical science should be taken into account in the design of user interfaces.

Standardization

Standards are the foundation of efficient communication and interoperability. It is probably fair to say that today’s components for e/a/m materials modeling, for example different DFT programs or forcefield codes, each have their own sets of conventions and data representations, which makes their integration in a common platform a tedious task. In software platforms developed since the 1980s, integration and unification of different programs was achieved by creating data representations and file formats which provide sufficient information to generate program-specific input files. Post-processing then converts the various output formats into a common representation. However, each software platform has had and still has its own set of formats and limited practical standardization has been achieved. Nevertheless, several common file formats have emerged to exchange structural data of atomistic models between different software platforms. In this respect, significant efforts have been made by crystallographers, which have stimulated the developers of integrated e/a/m software platforms.

An important milestone in the standardization of molecular ab initio calculations was the introduction of the so-called Pople Gaussian basis sets. Thus, Hartree-Fock calculations on a specific molecule with such a standardized basis set yield the same total energies to nearly machine precision, independent of the specific program. This is significant, as complex thermodynamic cycles can be constructed from values taken from calculations performed by different authors with different programs.

In the case of solid-state calculations using DFT, such a standardization has not yet occurred at the same level as in quantum chemistry. Even if two programs use the same approximation for electron exchange and correlation, the resulting total energies still depend on many other computational choices such as specific pseudopotentials or numerical grids used for the radial functions in all-electron methods, number of plane waves, and Fourier grids. This lack of standardization hampers the exploitation of collections of DFT results and represents a major challenge for efforts such as the NoMaD DFT data storage and indexing project [18]. However, the issue of reproducibility in DFT calculations is being addressed [19].

The situation in forcefield calculations is better than in the realm of DFT calculations. With a given forcefield, different programs should yield identical results for computed properties if the same statistical ensembles and simulation parameters are employed. In practice, results from different simulations carry a statistical uncertainty, but they can be used in a common analysis.

Forcefield methods are used to simulate the behavior of systems such as molecular fluids and polymers. Computing properties reliably for this class of materials hinges on correct statistical sampling and, of course, on the quality of the underlying forcefield. For example, the result for elastic coefficients of a single computation on a single amorphous polymer model has limited meaning, whereas performing the same calculation on 100 different models yields well-defined upper and lower bounds for such properties. Leading modeling platforms provide this type of statistical analysis.
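
A minimal sketch of this kind of statistical analysis is shown below, assuming a list of property values (here synthetic numbers standing in for, e.g., Young's moduli of roughly 100 independently built amorphous models) from which a mean and a simple 95% confidence interval are reported.

```python
import numpy as np

# Synthetic placeholder results standing in for Young's moduli (GPa) computed
# on ~100 independently built amorphous polymer models.
rng = np.random.default_rng(0)
moduli = rng.normal(loc=3.2, scale=0.4, size=100)

mean = moduli.mean()
sem = moduli.std(ddof=1) / np.sqrt(len(moduli))   # standard error of the mean
print(f"Young's modulus: {mean:.2f} +/- {1.96 * sem:.2f} GPa (95% CI), "
      f"range {moduli.min():.2f}-{moduli.max():.2f} GPa")
```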

The requirement for performing a set of connected simulations points to another requirement, namely that of reproducible workflows. This is discussed in the next section.

Reproducible Workflows

Scripting of computational protocols has been in use for many decades. This can be achieved using shell scripts or, more conveniently, by graphical construction of flowcharts where each stage defines a specific operation. To ensure reproducibility and re-usability of such flowcharts, it is important that these flowcharts are stored together with input and output data in a form which is readily accessible by other users. Furthermore, additional information such as the version number of the software platform, the operating system, and computer hardware may be needed to ensure reproducibility, although the latter should not have a noticeable influence on computed materials properties, if the algorithmic implementations are suitably numerically stable (Fig. 4).
Fig. 4

Flowchart of calculations of the mechanical properties of a thermoset polymer. After building an amorphous model, the uncured resin and cross-linking compound are equilibrated in a molecular dynamics stage. The subsequent Thermoset Builder constructs a cross-linked model which is again equilibrated. The mechanical properties are computed in a final stage. This procedure implies loops over about 100 different amorphous models. This flowchart can be stored, thus enabling reproducibility and re-usability (Color figure online)
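
A minimal, platform-agnostic sketch of how such a flowchart might be captured in a machine-readable and re-usable form is given below; the stage names loosely mirror Fig. 4, but the schema and parameter values are illustrative and do not represent the storage format of any particular platform.

```python
import json

# Illustrative, platform-agnostic record of a flowchart like the one in Fig. 4:
# stages, their parameters, and the versioning information needed to re-run it.
protocol = {
    "name": "thermoset_mechanical_properties",
    "platform_version": "x.y.z",                       # placeholder
    "loop": {"over": "amorphous_models", "count": 100},
    "stages": [
        {"name": "build_amorphous_cell", "params": {"density_g_cm3": 1.1}},
        {"name": "equilibrate_md", "params": {"ensemble": "NPT", "T_K": 450}},
        {"name": "thermoset_builder", "params": {"target_conversion": 0.9}},
        {"name": "equilibrate_md", "params": {"ensemble": "NPT", "T_K": 300}},
        {"name": "mechanical_properties", "params": {"strain": 0.003}},
    ],
}
with open("thermoset_protocol.json", "w") as fh:
    json.dump(protocol, fh, indent=2)
```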

Documentation

The complexity of scientific software requires documentation detailing the physical equations being solved, explaining the algorithms and parameters that influence the quality of the solutions, and describing the practical use of the software. Scientific publications related to these programs are valuable as an additional source, but they are different from software documentation in purpose and scope. In fact, a complete users’ guide of e/a/m platforms is comparable in volume to a textbook. Documentation is part of the software package and should be updated with each new version of a platform.

Well-designed software systems offer context-sensitive help whenever the user needs specific instructions to accomplish a certain task or needs to understand the available options and implications of their choices. This form of documentation is extremely helpful in intuitive, direct-manipulation interfaces as it relieves the user from switching to different sources during a work session.

Computational Efficiency

The computation of materials properties can be numerically very demanding and thus computational efficiency is very important. Throughout the evolution of computational materials science platforms, developers have had to face the dilemma of finding the right balance between creating code which can be easily ported between different hardware platforms, and implementations, which seek the highest performance on specific hardware architectures. In the 1980s and 1990s, vector supercomputers offered unprecedented performance gains if special hardware features could be fully utilized. Cray Research produced the most successful of this class of computers. One of the reasons for this success was the excellent balance between vector and scalar performance and the availability of powerful compilers, which did not require deep changes in the computer programs to receive the benefit of the vector architecture. At present, similar arguments hold for parallel architectures. Given the steady progress in computational approaches and in view of the fact that the optimization of a large computer code for a specific hardware architecture can be a daunting and time-consuming task, developers are often reluctant to devote their time to optimization for a specific hardware architecture such as GPUs. By the time such an optimization is completed, the field may have moved on and the users may have a highly optimized code with obsolete functionality. Efficient compilers and software tools are thus essential for taking advantage of high-performance hardware. One has to keep in mind that various computational approaches such as ab initio quantum mechanics, forcefield-based molecular dynamics, Monte Carlo simulations, and machine learning have quite different requirements for high-performance computing. Care must be taken that each method benefits from advances in high-performance hardware.

Support

Support of a sophisticated scientific software system such as an e/a/m modeling platform is needed on several levels, namely during installation and configuration of the software, in the practical handling of the software, and in the correct and efficient scientific use. The first level is the domain of information technology. Software running on a single machine typically can be installed fully automatically with minimum user intervention. Complex software, which is installed on different machines on a network with different levels of security and access privileges, may require support and customization. The second level, the correct manipulation of the software, can be covered by demonstrations, tutorials, user forums, online support, and training sessions.

In an industrial environment, the time of scientists and engineers is highly valuable. Thus, learning the effective use of a new software tool such as an e/a/m modeling platform and its integration in the engineering process represents a significant investment. To protect this investment, it is critical that the software environment is supported for many years in the form of updates, addition of new functionality, and migration to new hardware systems.

Extensibility

Computational materials science continues to evolve, existing methods are extended, new theoretical methods and algorithms are being developed, and new software systems are being created. Thus, the ability to extend and grow is an essential requirement for the long-term success of an e/a/m platform. This implies a clear modular structure of the software based on a thoughtfully designed data model. Extensions can consist, for example, of the addition of a new approach to solving Schrödinger’s equation; they can mean a new forcefield for molecular dynamics or Monte Carlo simulations, or the addition of a new coarse-graining scheme for large-scale simulations. However, extensions can also mean the addition of new paradigms such as high-throughput calculations.

Portability

Scientific software has a life span of many decades, as seen, for example, by computational chemistry programs such as Gaussian [20], which has its origin in the 1960s when computers had core memories measured in kBytes and were programmed with punch cards. Today’s computing environments for scientific and engineering applications are characterized by a mix of laptops, desktops, local compute clusters, and massively parallel supercomputers using Windows, Mac OS X, and Linux operating systems. Cloud computing and hand-held devices are likely to play an increasing role in the future. Porting a complex software system to a new operating system can be greatly facilitated if software is written in a machine-independent language and if any operating system-dependent parts are well isolated from the bulk of the code. On the other hand, if a software system is tied to a single operating system with a graphical user interface relying on a legacy programming language, then the port to other environments can become a daunting task. A costly re-write of the entire system may be the only way forward.

Interoperability

The properties of materials depend on phenomena reaching over 10 to 20 orders of magnitude in length and time scales and span essentially all branches of physics and chemistry. A great number of different approaches and software systems have been developed to deal with this diversity on both the discrete (e/a/m) and the continuum levels. Integration of hitherto separate programs into unified platforms has been pursued since the 1980s, for example Insight II of Biosym Technologies provided access to forcefield-based molecular dynamics programs (Discover) and a quantum chemical program (DMol) in a single platform. Another example is UniChem of Cray Research which provided interoperability between a semi-empirical quantum chemistry program (MNDO91), an ab initio Hartree-Fock program (CADPAC), and a molecular density functional code (DGauss). From a single interface, a user could create an atomistic model, perform a geometry optimization with a semi-empirical method, and use the output of this calculation seamlessly as input for Hartree-Fock or DFT calculations.

More recently, the AiiDA open source project [21] aims at the development of a platform for automation of calculations, input and output data storage, interoperability within work flows, and for sharing these contents within research communities. Plug-ins for the solid-state ab initio plane-wave code Quantum Espresso [22], the molecular quantum chemistry code NWChem [23], Wannier90 [24, 25] for obtaining maximally localized Wannier functions, and access to structural databases such as the COD [26, 27] are being implemented. Furthermore, related activities can be observed in larger research groups and institutions, aiming at such platforms mainly for internal use.

This level of interoperability relies on a common platform that uses a data model allowing the creation of program-specific input and the interpretation of the output of one program such that it can be used as input to other programs. The unifying platform thus becomes a hub into which various programs can be plugged. This model is relatively straightforward to implement for quantum chemical calculations where a system is defined by atomic positions, element type, and possibly magnetic moments.
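
The hub idea can be sketched as follows: a minimal common structure representation from which small adapter functions generate program-specific input. The output written here is schematic (a bare Cartesian coordinate listing assuming an orthorhombic cell) and is not a complete input deck for any real code.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Structure:
    """Minimal common data model: lattice vectors (Angstrom), element symbols,
    and fractional coordinates. Real platforms track much more (spins, charges)."""
    lattice: List[Tuple[float, float, float]]
    symbols: List[str]
    frac_coords: List[Tuple[float, float, float]]

def to_xyz(s: Structure) -> str:
    """Schematic adapter: Cartesian xyz listing, assuming an orthorhombic cell
    (only the diagonal lattice components are used)."""
    lines = [str(len(s.symbols)), "generated from common data model"]
    for sym, (fa, fb, fc) in zip(s.symbols, s.frac_coords):
        x = fa * s.lattice[0][0]
        y = fb * s.lattice[1][1]
        z = fc * s.lattice[2][2]
        lines.append(f"{sym} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines)

# Body-centered cubic Fe (a = 2.87 Angstrom) as a simple example structure.
bcc_fe = Structure(lattice=[(2.87, 0, 0), (0, 2.87, 0), (0, 0, 2.87)],
                   symbols=["Fe", "Fe"],
                   frac_coords=[(0, 0, 0), (0.5, 0.5, 0.5)])
print(to_xyz(bcc_fe))
```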

On the other hand, the interoperability between programs using classical forcefields is more complicated. For example, a Monte Carlo method for the calculation of adsorption isotherms of a liquid in a nano-porous structure may use a united atom (UA) description of interatomic interactions, which cannot be uniquely mapped onto an all-atom (AA) description using a valence forcefield. Nevertheless, algorithms can be found to bridge this gap between different representations as long as the UA and the AA forcefields can describe the entire system. This is no longer the case if a model includes, for example, a metal surface covered with an organic polymer. The popular embedded atom method (EAM) would work well for the metal, but is difficult to reconcile with a valence forcefield, which is available only for the organic part of the system. The interoperability of programs and methods applicable only to a subset of a model is thus problematic.

Such incompatibilities become more complicated if one tries to build bridges between a quantum mechanical description of part of the system coupled to a forcefield description for the surrounding of a chemically reactive area. Such so-called quantum mechanical/molecular mechanical (QM/MM) methods have been pursued since the 1980s especially in biomolecular systems. While conceptually appealing, the practical treatment of the transition between the QM and MM domains remains rather difficult. As mentioned earlier, the merits of multi-scale approaches for complex chemical systems were recognized by the 2013 Nobel prize in chemistry.

Interoperability and coarse-graining open a host of issues, which have yet to be fully resolved. For example, while lumping and clustering techniques seem logical in working from the atomic scale to macroscopic phenomena, de-lumping techniques are required to allow interoperability in both directions. Coarsening of an atomistic model is generally straightforward, for example progressing from an all-atom description to united atoms, anisotropic united atoms, and techniques of dissipative particle dynamics. However, creating chemically correct atomistic models from a coarse grain model, for example in describing a grain boundary, is anything but trivial. Significant work still lies ahead to achieve interoperability on these levels.

In the foreseeable future, no supplier is likely to provide a software system covering the entire range of all possible applications on all length and time scales in a single software system. Thus, enhanced interoperability between different software systems is essential for the overall success and impact of materials modeling.

Integration with Experimental Databases

“Numerical simulations without connection to experimental data is like clapping with one hand”, as Arthur J. Freeman used to say. Thus, the integration of experimental databases adds great value to materials modeling platforms. This has been implemented, for example, in the MedeA® software platform, which includes leading crystallographic databases (ICSD, Pearson, Pauling, COD). These databases can be searched with a common interface, although the integrity and uniqueness of each database is preserved. The modeler can search, for example, for all compounds which contain Li and a transition metal, retrieve all such structures with a few mouse clicks, and initiate calculations such as the computation of the chemical potential of Li ions in all these compounds in a few minutes.
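
The kind of search described above can be illustrated with the following purely schematic snippet operating on an in-memory list of entries; it does not reproduce the actual query interface of MedeA or of the cited databases.

```python
# Purely illustrative search over an in-memory list of entries; the real
# crystallographic databases (ICSD, Pearson, Pauling, COD) have their own
# interfaces, which this sketch does not attempt to reproduce.
TRANSITION_METALS = {"Sc", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn"}

entries = [
    {"formula": "LiCoO2", "elements": {"Li", "Co", "O"}},
    {"formula": "LiFePO4", "elements": {"Li", "Fe", "P", "O"}},
    {"formula": "NaCl", "elements": {"Na", "Cl"}},
]

hits = [e["formula"] for e in entries
        if "Li" in e["elements"] and e["elements"] & TRANSITION_METALS]
print(hits)   # -> ['LiCoO2', 'LiFePO4']
```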

Tracking of Metadata

As in experiments, the results of computations are meaningful if additional information is associated with the computed physical properties. This information includes the type of equations underlying the models, the algorithms and computational parameters being used, the program version, the time, and the scientist/engineer who performed the computations. Present computational platforms provide mechanisms to associate and store these metadata automatically with the results of a computation. This aspect is particularly useful for work done in teams over a long period, where traceability and re-usability are of high value. Reproducibility might indeed be the most valuable aspect, because often one wants to build on previous experience starting with an existing case for verification.
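
A minimal sketch of such a metadata record, attached to a single computed property value, is given below; the field names, the property value, and the computational parameters are illustrative placeholders.

```python
import getpass
import json
import platform
from datetime import datetime, timezone

def with_metadata(value, units, code, code_version, parameters):
    """Wrap a computed property with the metadata needed for traceability."""
    return {
        "value": value,
        "units": units,
        "code": code,
        "code_version": code_version,
        "parameters": parameters,
        "operator": getpass.getuser(),
        "hostname": platform.node(),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }

# Placeholder value and parameters, purely for illustration.
record = with_metadata(76.3, "GPa", code="dft_engine", code_version="x.y",
                       parameters={"xc": "PBE", "encut_eV": 500, "kpoints": [8, 8, 8]})
print(json.dumps(record, indent=2))
```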

As computed material property data become integrated in an overall engineering process, it is important for legal reasons and for the protection of intellectual property to pay attention to the careful curation of metadata associated with computations. On the other hand, an engineer may just be interested in a value and an error bar and might consider too much metadata a distraction. This brings us back to the design of user interfaces, where the perception of “best” differs between different users.

Knowledge Management and Best Practices

As staff moves to different positions or retires, it is highly valuable for any organization that the knowledge and accomplishments of these employees are retained and transmitted to their successors. To this end, a good modeling platform should have mechanisms to capture and retain the results of computational investigations in a form that allows new employees to retrieve and to capitalize on previous work. Concepts such as the JobServer of the MedeA® [28] software platform automatically keep records of all simulations including input and output data together with flowcharts of simulation protocols as well as time stamps and user information, thus helping in the implementation of best practices for research and development.

From Academic to Commercial Software

A major part of e/a/m modeling software has been developed in academic research groups. The major driving force for these developments is scientific innovation and advanced functionality. While typically very strong with respect to advanced functionality, present academic software often falls short in many of the requirements discussed in the previous section, for example with respect to ongoing support and interoperability.

Open source software is common in a variety of application areas including electronic structure calculations. The success of Linux is often cited as an argument for this approach. It is remarkable that the most successful solid-state electronic structure program, namely VASP [29, 30], does not fall into this category. While the source code of this program is accessible via a license, maintenance and new developments are handled by a single university research group with clearly defined responsibilities and ownership. Another program, namely CASTEP, is integrated in a commercial modeling platform while other electronic structure codes such as Quantum Espresso, Abinit, CP2K, and Wien2k are distributed either under the GNU General Public License or under special licenses. In fact, there is a range of licensing models, e.g., the Apache License, BSD license, GNU General Public License, GNU Lesser General Public License, MIT License, Eclipse Public License, and Mozilla Public License. Each of these schemes has different implications for the transfer from academic research to industrial applications.

There is debate in the community as to which of these licensing schemes is best. It is probably fair to say that there is no simple answer to this question, because it depends on many factors such as scientific and technological maturity, the organizational structure supporting the development, the industrial relevance, and the objectives of the key authors.

Publicly funded academic software is not always “open source,” nor free. Moreover, there are additional aspects/differences between academic and commercial software, such as generality, coverage, and robustness. Academic software is usually application oriented, e.g., proteins, ionic liquids, sorption in specific solids, or the motion of dislocations in a metal. Additionally, academic software typically relies on developments made by non-computer scientists, and optimization is usually not the leading objective. Commercial codes tend to be more robust, more general in terms of methods and operating systems, and better optimized.

Long-term support is often a key issue for academic software. In fact, experience in the domain of finite element methods, computational fluid dynamics, and TCAD for electronic devices has shown that in the long run, the successful industrial deployment of modeling software is best served by commercial providers. In the field of e/a/m modeling, we are witnessing the transition from academic research to industrial application. In this context, efforts are being made to accelerate this transition. For example, the European Commission is funding projects such as the EMMC-CSA [31] to enhance the uptake of materials modeling by industry in the spirit of ICME.

Illustrative Examples

Structural Materials

Zirconium Alloys

Zr alloys are primary structural materials in the core of nuclear power reactors. Zr is a key element in the material due to its low neutron absorption cross-section. Alloying elements including Fe, Ni, Cr, Sn, and Nb are added to improve mechanical properties such as yield strength and corrosion resistance. Fe, Ni, and Cr exhibit extremely low solubility and tend to precipitate into secondary phase particles (SPPs). During irradiation, the alloying elements are released from the SPPs and are free to diffuse and interact with defects in the material.

Zr alloys for nuclear applications have been subject to numerous, predominantly experimental, investigations. However, the number of computational studies is steadily growing. Ab initio DFT methods and EAM-based forcefields are valuable approaches that have been employed to study irradiation-induced structural changes in Zr alloys [32, 33]. With the development of experimental techniques such as atom probe tomography (APT), the material can be characterized on the Ångström scale, allowing for direct comparisons between experimental observations and data obtained from an e/a/m modeling platform. For example, APT studies have shown that Fe segregates to metal grain boundaries, and ring-shaped features of Fe have been observed in addition to small clusters of not fully developed Fe precipitates [34]. The ring-shaped features can be interpreted as being due to segregation of Fe to dislocation loops.

All these experimental observations can be reproduced with simulations performed using an e/a/m modeling platform. Typical models are shown in Fig. 5. Simulations using an EAM forcefield developed by fitting to DFT data show that Fe as an interstitial defect (like Ni and Cr) is a fast diffuser. The diffusion is anisotropic, and axial diffusion is faster than diffusion in the basal plane. The simulations additionally show that Fe can form small intermetallic Zr-Fe clusters (not fully developed precipitates) and interact with point defects and extended defects such as dislocation loops. An Fe atom occupying a vacancy site may swap places with an interstitial Zr atom, thereby healing point defects in the lattice. In the simulations, Fe atoms were found to decorate the rim of irradiation-induced clusters of self-interstitial atoms, which can explain the experimental observation of ring-shaped features. DFT calculations show that it is energetically favorable for Fe atoms to segregate to Zr grain boundaries, in agreement with the observations. In addition, the simulations show that the presence of Fe in the grain boundary is not detrimental to the mechanical properties; rather, Fe acts as a glue strengthening the grain boundary.
Fig. 5

Models of alpha-zirconium showing the ABAB stacking in this hexagonal structure on the left-hand side. The middle panel illustrates octahedral and tetrahedral interstitial sites with the corresponding pathways and transition states (TS). The right-hand side shows a typical model which is used for molecular dynamics simulations of diffusion (Color figure online)
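As an illustration of how such diffusion data are typically extracted from molecular dynamics trajectories, the following Python sketch estimates anisotropic diffusion coefficients from the mean-square displacement via the Einstein relation. It is a generic post-processing example, not the specific workflow used in the cited studies; the trajectory at the end is synthetic.

    import numpy as np

    def diffusion_coefficients(positions, dt_fs):
        """Estimate anisotropic diffusion coefficients from an MD trajectory.

        positions: array of shape (n_frames, n_atoms, 3), unwrapped Cartesian
                   coordinates in Angstrom of the diffusing species (e.g., Fe).
        dt_fs:     time between stored frames in femtoseconds.
        Returns (D_basal, D_axial) in cm^2/s, assuming the c axis is along z.
        """
        disp = positions - positions[0]                                  # displacement from t = 0
        msd_xy = np.mean(np.sum(disp[:, :, :2] ** 2, axis=2), axis=1)    # in-plane (basal) MSD
        msd_z = np.mean(disp[:, :, 2] ** 2, axis=1)                      # axial MSD
        t = np.arange(len(positions)) * dt_fs

        # Einstein relation: MSD = 2 d D t (d = dimensionality); fit the slope
        # over the second half of the trajectory to avoid the ballistic regime.
        half = len(t) // 2
        slope_xy = np.polyfit(t[half:], msd_xy[half:], 1)[0]   # A^2/fs, two dimensions
        slope_z = np.polyfit(t[half:], msd_z[half:], 1)[0]     # A^2/fs, one dimension

        to_cm2_per_s = 1e-16 / 1e-15                            # A^2/fs -> cm^2/s
        return slope_xy / 4.0 * to_cm2_per_s, slope_z / 2.0 * to_cm2_per_s

    # Tiny synthetic example: a random walk standing in for an MD trajectory.
    rng = np.random.default_rng(0)
    traj = np.cumsum(rng.normal(scale=0.05, size=(2000, 50, 3)), axis=0)
    print(diffusion_coefficients(traj, dt_fs=10.0))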

This example shows that the contact area between experiments and simulations now extends down to the behavior of individual atoms. Experimental and theoretical methods complement each other, and a state-of-the-art, comprehensive e/a/m modeling platform is a vital part of this synergistic approach.

Epoxy Thermosets

Among the many widely used polymeric materials, epoxies are of particular importance for light-weight, high-strength applications, especially in aerospace. The example presented here focuses on the dependence of the mechanical properties on the choice of the resin. To this end, three typical resins are considered, namely diglycidyl ether of bisphenol A (DGEBA), triglycidyl p-aminophenol (TGAP), and tetraglycidyl diaminodiphenylmethane (TGDDM). These resins were cured with 4,4′-diaminodiphenylsulfone (DDS). The structures of the components are shown in Fig. 6 together with a 3D model of DDS and a model of the cross-linked epoxy.
Fig. 6

Components of epoxy thermosets depicted as 2D chemical structures, a 3D model obtained from a quantum mechanical simulation, and a model of a cross-linked epoxy used in forcefield-based molecular dynamics simulations of the mechanical properties (Color figure online)

Between 50 and 100 different models for each composition of cross-linked epoxies were constructed and the mechanical properties were computed using the highly accurate pcff+ forcefield [35]. The underlying molecular dynamics simulations were carried out with the LAMMPS program [36] as integrated in the MedeA® platform. Statistical averaging using the Hill–Walpole method, as introduced for simulations of polymers [37], leads to computed elastic properties showing a distinct dependence on the composition of the thermoset material (a sketch of the underlying bound averaging is given after Table 1). The agreement with available experimental values is good considering their uncertainties, as can be seen from Table 1.
Table 1 Computed and experimental elastic coefficients of epoxy thermosets with different resins

Resin    Calculated bounds (GPa)    Experiment (GPa)
DGEBA    3.49–3.53                  2.4–3.2 (a)
TGAP     4.42–4.45                  4.396 ± 0.027 (b)
TGDDM    5.18–5.19                  5.103 ± 0.033 (b)

(a) White et al. [38]; reported extents of reaction cover a relatively broad range, 0.5–1.0; ∼300 K; dynamical mechanical analysis

(b) Behzadi and Jones [39]; extent of reaction 0.93; 295 K; strain rate 1.67 × 10−2 s−1
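For readers interested in the averaging step behind the calculated bounds of Table 1, the following Python sketch computes the Voigt and Reuss bounds and the Hill average from a single 6 × 6 stiffness matrix; in the study summarized above, such quantities are additionally averaged over the 50–100 independently built models per composition. The sketch is illustrative only and is not the implementation used in the cited work.

    import numpy as np

    def voigt_reuss_hill(C):
        """Polycrystalline moduli from a 6x6 stiffness matrix C (GPa).

        Returns (K_V, K_R, G_V, G_R, E_hill): Voigt and Reuss bounds on the
        bulk and shear moduli and the Hill-average Young's modulus.
        """
        S = np.linalg.inv(C)   # compliance matrix
        K_V = (C[0, 0] + C[1, 1] + C[2, 2] + 2 * (C[0, 1] + C[0, 2] + C[1, 2])) / 9.0
        G_V = (C[0, 0] + C[1, 1] + C[2, 2] - (C[0, 1] + C[0, 2] + C[1, 2])
               + 3 * (C[3, 3] + C[4, 4] + C[5, 5])) / 15.0
        K_R = 1.0 / (S[0, 0] + S[1, 1] + S[2, 2] + 2 * (S[0, 1] + S[0, 2] + S[1, 2]))
        G_R = 15.0 / (4 * (S[0, 0] + S[1, 1] + S[2, 2])
                      - 4 * (S[0, 1] + S[0, 2] + S[1, 2])
                      + 3 * (S[3, 3] + S[4, 4] + S[5, 5]))
        K_H, G_H = 0.5 * (K_V + K_R), 0.5 * (G_V + G_R)
        E_H = 9 * K_H * G_H / (3 * K_H + G_H)   # isotropic Young's modulus
        return K_V, K_R, G_V, G_R, E_H

    # Example: an isotropic stiffness matrix (C11 = 7.0, C12 = 3.5, C44 = 1.75 GPa).
    C = np.array([[7.0, 3.5, 3.5, 0, 0, 0],
                  [3.5, 7.0, 3.5, 0, 0, 0],
                  [3.5, 3.5, 7.0, 0, 0, 0],
                  [0, 0, 0, 1.75, 0, 0],
                  [0, 0, 0, 0, 1.75, 0],
                  [0, 0, 0, 0, 0, 1.75]])
    print(voigt_reuss_hill(C))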

Functional Materials

Design of Zero-Strain Cathode Materials for Li-Ion Batteries

In Li-ion and solid-state batteries, the volume change of the materials in the electrodes during charge and discharge is a major source of degradation, which limits the lifetime of the battery. While on the anode side, materials with close to zero strain such as Li4Ti5O12 exist, there is still an ongoing search for light-weight zero- or low-strain cathode materials. This work focused on manganese-based oxides crystallizing in the spinel structure. Specifically, starting from LiMn2O4, alloying elements were sought such that the quasi-ternary compound LiM1xM2yM3zO4 would minimize the strain induced by alternating lithiation and delithiation. From a systematic exploration of the composition space exploiting Vegard’s law, the best candidates were found within the class LiMnxCryMgzO4. Their volume changes as a function of lithium content are displayed in Fig. 7 together with that of the benchmark compound LixNi0.5Mn1.5O4. All calculations were carried out using VASP [29, 30] in MedeA® with the PBEsol exchange-correlation functional [40, 41].
Fig. 7

Computed cell volume as a function of Li concentration in transition metal oxides with a spinel structure (Color figure online)
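The composition screening described above can be illustrated by a simple sketch: the cell volume of a candidate composition is estimated by Vegard-type linear interpolation between end-member volumes, and the difference between the lithiated and delithiated states serves as the figure of merit. The end-member volumes and the composition grid below are placeholders, not the DFT data underlying Fig. 7.

    from itertools import product

    # Hypothetical per-formula-unit volumes (A^3) of lithiated / delithiated
    # end-member spinels (LiM2O4 / M2O4); in practice these come from DFT.
    V_lithiated = {"Mn": 69.0, "Cr": 68.0, "Mg": 70.5}
    V_delithiated = {"Mn": 67.0, "Cr": 68.5, "Mg": 70.0}

    def vegard_volume(fractions, end_members):
        """Vegard-type linear interpolation; metal fractions sum to 2 per formula unit."""
        return sum(x / 2.0 * end_members[m] for m, x in fractions.items())

    best = None
    for x, y in product([i / 8 for i in range(17)], repeat=2):   # grid over compositions
        z = 2.0 - x - y                                          # LiMnxCryMgzO4 with x + y + z = 2
        if not (0.0 <= z <= 2.0):
            continue
        frac = {"Mn": x, "Cr": y, "Mg": z}
        dV = abs(vegard_volume(frac, V_lithiated) - vegard_volume(frac, V_delithiated))
        if best is None or dV < best[0]:
            best = (dV, frac)

    print("smallest estimated volume change:", best)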

These most promising materials were synthesized and characterized by X-ray diffraction as well as electrochemical techniques. The results were consistent with the ab initio predictions [42].

The DFT calculations also provided detailed insight into the mechanisms resulting in near zero-strain behavior, which arises from the very different chemical bonding characteristics of Mg and the transition metal atoms, as illustrated in Fig. 8. While the Mg–O bond lengths tend to decrease with increasing Li concentration, the Mn–O bond lengths remain nearly constant, whereas the Cr–O bond lengths tend to increase. As a result, the overall volume of the crystal structure changes little upon charging and discharging with Li ions.
Fig. 8

Computed interatomic distances of the compound LixMn1.125Cr0.5Mg0.375O4 for increasing Li concentration (Color figure online)

This behavior is closely analogous to the zero-strain mechanism observed for Li4Ti5O12, where local distortions of the crystal structure likewise keep the volume nearly unchanged upon lithium insertion.

Liquids

Compared with solid materials, fluids offer a tremendous advantage in connecting the macroscopic scale with atomistic simulations. In many cases, macroscopic properties such as density, viscosity, and vapor pressure can be obtained directly from atomistic simulations involving only on the order of a thousand molecules, provided that proper statistical averaging over configuration space is performed to obtain reliable results.

During the past decades, sophisticated molecular simulation methods have been developed, which enable the prediction of properties of fluids with remarkable accuracy and predictive power [43]. This is illustrated by the computation of the normal boiling temperature of a sample of 100 different organic compounds (Fig. 9).
Fig. 9

Left: Families of compounds for which the normal boiling point is calculated; right panel: Monte Carlo simulation results (y axis) compared against experimental data from DIPPR (x axis) for the normal boiling point temperature (Color figure online)

The computations are based on statistical thermodynamics, using isochoric Gibbs Ensemble Monte Carlo simulations to determine the equilibrium between the vapor and liquid phases as a function of temperature [44, 45]. The interactions between atoms are described by a classical forcefield, which expresses the total energy of a system as a sum of terms described by mathematical functions containing adjustable parameters. These forcefield parameters describe interatomic interactions, such as the strength of the bond between a carbon and an oxygen atom as a function of the local chemical environment. They are obtained by fitting to data from first-principles quantum mechanical calculations on representative molecules as well as by calibration against a training set of experimental data. Once these forcefield parameters are defined, they are used in a systematic and transferable way to perform calculations for a large number of molecules. The development of forcefield parameters for organic systems has a history of many decades, and today excellent forcefields for such computations are available [46, 47].
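In generic form, the total energy summation referred to above can be written as follows; the AUA and TraPPE forcefields used here differ in their detailed functional forms (e.g., through anisotropic united-atom sites), so this expression is schematic rather than the exact form used in the cited work:

    E_{\mathrm{total}} = \sum_{\mathrm{bonds}} k_b\,(r - r_0)^2
      + \sum_{\mathrm{angles}} k_\theta\,(\theta - \theta_0)^2
      + \sum_{\mathrm{torsions}} \sum_n \tfrac{V_n}{2}\,\bigl[1 + \cos(n\phi - \delta_n)\bigr]
      + \sum_{i<j} \left\{ 4\varepsilon_{ij}\!\left[ \left(\tfrac{\sigma_{ij}}{r_{ij}}\right)^{12}
        - \left(\tfrac{\sigma_{ij}}{r_{ij}}\right)^{6} \right]
        + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} \right\}

Here the force constants k_b and k_θ, the reference geometries r_0 and θ_0, the torsional amplitudes V_n and phases δ_n, the Lennard-Jones parameters ε_ij and σ_ij, and the partial charges q_i are the adjustable parameters discussed above.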

The boiling points computed for 100 different organic compounds agree very well with the experimental data, as illustrated in Fig. 9, with an absolute average deviation of 1.4%. Note the wide range of organic compounds covered and the fact that all data points fall within a relatively tight tolerance. This means that the approach has good predictive power and thus can be used in design cycles.
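The quoted deviation is a simple average over the data set; as a minimal sketch (with placeholder values rather than the actual 100-compound DIPPR data), it corresponds to:

    import numpy as np

    # Placeholder boiling points in K; the real data set comprises 100 compounds.
    T_exp = np.array([341.9, 398.8, 447.3])   # experimental normal boiling points
    T_sim = np.array([339.0, 403.5, 451.0])   # simulated normal boiling points

    aad_percent = 100.0 * np.mean(np.abs(T_sim - T_exp) / T_exp)
    print(f"absolute average deviation: {aad_percent:.1f}%")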

For all non-aromatic compounds, the anisotropic united atom (AUA) forcefield [48, 49, 50, 51, 52, 53, 54, 55, 56, 46] has been used; for the aromatic compounds, the TraPPE forcefield has been used [47]. These two forcefields were chosen for their high quality, allowing fast and highly accurate vapor-liquid equilibrium calculations for the families of compounds included in this work.

Such computations for pure compounds can be extended to a range of temperatures, leading to a complete description of the vapor-liquid equilibrium, or to a range of temperatures and pressures, leading to a complete description of the gas or liquid phase, even under supercritical conditions.

Importantly, this approach can also be used for any mixtures of the above compounds, thus providing vapor-liquid phase diagrams for a rather large design space, which can now be explored computationally in an optimization process.

Many design and engineering problems involve not just single phases (solid, liquid, gas) but most often different phases in contact. Forcefield simulations can be applied to such systems, for example to study the sorption and diffusion of a molecular fluid in a microporous solid [43].

Properties of Molten Metals

Many types of liquids and their properties can be studied using molecular dynamics, given suitable forcefields. For example, Fig. 10 illustrates a simulation cell that can be employed to compute the surface tension of a liquid metal, in this case copper, for which several suitable embedded atom method (EAM) forcefields exist [57]. A range of properties may be computed for such systems. Figure 10 shows the variation of the computed surface tension as a function of temperature and its comparison with published experimental values. The underlying surface tension calculation is applicable to any liquid, with the only proviso that adequate configurational sampling is feasible for the system under consideration [58, 59, 60, 61]. For simple liquids, such sampling can be achieved with nanosecond-length simulations. For copper, the results in Fig. 10 show the precision of the calculated property in comparison with experimental observation [62]: the deviation of the simulated results from the expected straight line is minimal, in contrast with the variability observed in experimental studies. The general agreement between computation and experiment apparent in Fig. 10 demonstrates that simulation provides a route to determining, understanding, and optimizing the properties of such systems. This is particularly valuable where experiments are challenging, for example for systems with high melting points. We emphasize that simulated properties depend on the quality of the forcefield employed and that, even given a perfect forcefield, such calculations require adequate phase-space sampling. This requirement arises because the surface tension is a surface free energy, whose evaluation implies sampling the system's partition function, a rather high-dimensional integral. However, as Fig. 10 emphasizes, where the forcefield quality can be relied upon, such methods provide substantial insight into materials properties.
Fig. 10

Comparison of simulated and experimental liquid copper surface tensions as a function of temperature. Experimental data from the work of Taihei et al. [62]. The left panel shows a section of the model used in computing the surface tension (Color figure online)
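The cited studies [58, 61] employ, among others, the test-area route to the surface tension. As a minimal sketch of an alternative and widely used post-processing route, the mechanical (pressure-tensor) definition for a planar slab with two liquid-vapor interfaces can be evaluated as follows; the function and the synthetic numbers are illustrative only and do not reproduce the copper results of Fig. 10.

    import numpy as np

    def surface_tension_mN_per_m(pxx, pyy, pzz, box_z_ang):
        """Surface tension of a planar liquid slab from time series of the
        diagonal pressure-tensor components (in bar) and the box length
        normal to the two interfaces (in Angstrom).

        Mechanical (pressure-tensor) route:
            gamma = (Lz / 2) * [ <Pzz> - (<Pxx> + <Pyy>) / 2 ]
        The factor 1/2 accounts for the two liquid-vapor interfaces.
        """
        p_n = np.mean(pzz)                          # normal component
        p_t = 0.5 * (np.mean(pxx) + np.mean(pyy))   # tangential component
        gamma_bar_ang = 0.5 * box_z_ang * (p_n - p_t)   # bar * Angstrom
        # 1 bar * Angstrom = 1e5 Pa * 1e-10 m = 1e-5 N/m = 1e-2 mN/m
        return gamma_bar_ang * 1e-2

    # Synthetic example with constant pressure-tensor values:
    print(surface_tension_mN_per_m(np.full(1000, -50.0), np.full(1000, -50.0),
                                   np.full(1000, 80.0), box_z_ang=100.0))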

Perspectives and Conclusions

Present software platforms for materials modeling on the level of electrons and atoms have emerged from two major sources, namely molecular modeling systems used earlier in chemical research and computer programs developed by researchers in the fields of solid-state physics and statistical mechanics. The resulting software platforms are used in academic research, and their industrial deployment is growing. However, the field is presently characterized by clusters of ab initio programs with similar functionality, while major gaps remain to be filled.

Present computer codes originate predominantly from academic research groups and are thus driven by the quest for leading research. Although some of these codes are integrated in e/a/m software platforms and are commercially supported, it is fair to say that, fundamentally, these codes retain a mostly academic character. This reflects the actual state of the theory underlying ab initio solid-state computations. The present theoretical underpinnings, such as density functional theory, are very useful, but this level of theory is not yet fully satisfactory; new and better approximations to describe interacting many-electron systems are still evolving, which makes this field naturally somewhat dynamic. Consequently, in academia, functionality is the major driving force rather than stability, robustness, error handling, standardization, and interoperability. Thus, programs that combine state-of-the-art scientific functionality with a good measure of these other attributes have emerged as clear leaders in their league.

On the level of forcefield-based codes for molecular dynamics and Monte Carlo simulations, the picture is somewhat different. Algorithmically, this level is quite mature, but the Achilles heel of this type of simulation is the quality and coverage of the underlying forcefields or interatomic potentials. Forcefield development has been pursued with great vigor since the 1980s, especially for biological systems such as DNA and proteins, and was subsequently extended to the simulation of synthetic polymers and, more generally, of organic molecular systems including liquids. Hence, simulations of such systems have reached remarkable accuracy, as demonstrated in this review for the calculation of boiling points of organic liquids. In contrast, the same cannot be said for the simulation of inorganic systems such as metals, alloys, semiconductors, and insulators. For these types of systems, the neglect of a quantum mechanical description of the electronic degrees of freedom often imposes severe limitations, creating a serious challenge for the simulation of complex systems that require models consisting of millions of atoms and many millions of configurations.

Linking and coupling quantum mechanical domains with those treated by forcefield simulations remains a major challenge for the creation of e/a/m modeling platforms. This goes beyond the integration of various programs in a single software system, which is already accomplished by leading modeling platforms. The partitioning of a system into domains handled by a forcefield or by a quantum mechanical description is not automated and relies on the user's decisions. Moreover, the coupling between these domains poses serious challenges, especially between metals and non-metallic solids or fluids.

The creation and optimization of forcefield parameters "on-the-fly" is an intriguing perspective and, in fact, novel tools have emerged that facilitate the generation of forcefield parameters from ab initio training sets. Yet, the choice of the appropriate forcefield type and the optimization of its parameters are far from being a fully automated and robust process.
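As a minimal sketch of the idea, assuming a hypothetical ab initio training set of dimer energies and a simple Morse pair potential as the forcefield form, a least-squares fit of the parameters can be set up as follows; real forcefield parameterization involves far more data, functional forms, and constraints than this illustration suggests.

    import numpy as np
    from scipy.optimize import curve_fit

    def morse(r, D_e, a, r_e):
        """Morse pair potential (eV), used here as a simple illustrative forcefield form."""
        return D_e * (1.0 - np.exp(-a * (r - r_e))) ** 2 - D_e

    # Hypothetical ab initio training set: dimer separations (Angstrom) and
    # energies (eV) relative to the dissociated limit.
    r_train = np.array([1.8, 2.0, 2.2, 2.5, 3.0, 3.5, 4.0])
    e_train = np.array([-1.20, -1.55, -1.60, -1.40, -0.90, -0.45, -0.20])

    params, cov = curve_fit(morse, r_train, e_train, p0=[1.6, 1.5, 2.2])
    D_e, a, r_e = params
    print(f"fitted Morse parameters: D_e={D_e:.3f} eV, a={a:.3f} 1/A, r_e={r_e:.3f} A")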

The next level of linking in a comprehensive e/a/m modeling platform can be summarized as coarse-graining/fine-graining. There is clearly a lack of tools which would facilitate this task. As pointed out earlier, one needs to think not only about coarse-graining, but one should also consider the inverse, namely fine-graining, starting, for example, from a model of a microstructure represented at the continuum level and resolving the structure of dislocations and grain boundaries on the atomistic level.

Linking the microstructure to the processing and the chemistry of the material under consideration is yet another step, which should be part of an ICME implementation. While major efforts are being made at this level, the communication with the e/a/m levels leaves ample room for improvement. Creating overarching software environments remains a daunting task. Extrapolating from current activities, the connection between scales in multi-scale approaches will first be implemented for specific applications such as additive manufacturing and thus will remain rather narrowly focused.

Integration into comprehensive software platforms and interoperability between platforms are certainly important and highly useful, but the main bottleneck remains on the level of the theoretical and computational approaches.

This brings us to the issue of the economics of the development and support of materials modeling platforms in a still rapidly evolving field. In contrast to molecular modeling in the pharmaceutical industry, materials modeling is subject to quite different boundary conditions. The goal of pharmaceutical research and development is very focused and well defined, namely the discovery of novel molecules with specific therapeutic effects. To support this effort, molecular modeling platforms serving this market have been developed and are commercially supported. In contrast, the materials-related industry is extremely diversified in its goals with each sector being relatively small, but very specialized. For example, the development of a high-performance steel is quite different from the optimization of a Schottky barrier in a semiconductor; the optimization of polymers for use in printed circuit boards requires quite different knowledge and tools than the search for better solid-state electrolytes for Li-ion batteries, novel materials for data storage, or rare-earth free magnetic materials. Each of these materials science problems requires quite different and very sophisticated simulation tools to make a valuable contribution to the engineering process. However, many of these materials, for example high-performance thermoelectric materials, represent only a small fraction of the value of the final product, and yet a new material, such as cobalt-oxide cathodes, can enable extraordinary technologies, such as lithium-ion batteries. The rewards for investment in advanced materials discovery and process optimization can be substantial.

There is also a tendency in industry to outsource such research to academic groups, which either develop their own software or rely on open source software adapted to the problem at hand by graduate students and post-docs. In some circumstances, this can inhibit investment in building the comprehensive e/a/m software platforms that fulfill the criteria discussed earlier in this review. Hence, in addition to the intrinsic scientific and technological challenges, there are also economic hurdles to be overcome.

One perspective is the open-source paradigm combined with a licensing scheme that keeps the software and all its future enhancements in the public domain. The initial investment is made by public funding until the community of developers and users is large enough to sustain future developments. While such a paradigm has worked well for software such as the Linux operating system, it is questionable whether highly specialized and sophisticated scientific software can be developed and sustained in the long run by this approach. The future of the excellent LAMMPS molecular dynamics program [36] will show how this paradigm plays out in the long run.

Another perspective is the rise of commercial software companies building integrated software platforms, which are professionally engineered and supported. As pointed out earlier, the investment for creating and supporting such integrated software environments is very substantial because of the high degree of sophistication and specialization in materials science.

The evolution of this field will likely represent a combination of these two perspectives. There is a place for sharing software in the early stages of conceptual and algorithmic development in the spirit of academic research so that complex software systems can be created without unnecessary and time-consuming duplication of effort. However, as software technology matures and its innovative appeal for academic research diminishes, the industrial value of such software tends to rise as it moves from the stage of leading research to industrial production. Concurrently, the need for robustness and long-term support will increase. In fact, this is analogous to the development and transition of other research and development tools such as X-ray diffractometers, NMR machines, and scanning tunneling microscopes. Leading edge research at one stage becomes a production tool in a later stage and it is quite reasonable to assume that the same trend will take place in the field of software platforms for materials modeling.

In view of the tremendous importance of advanced materials and processes for the future of our civilization, the perspectives for ICME are very bright. However, vision, commitment, investment, skill, and persistence will be necessary to move this field forward and to reap its benefits for society. The challenges are many and the right expectations must be projected. The embedding of e/a/m platforms in ICME frameworks will open unprecedented possibilities for the design of materials and processes, but all participants in this process must continue to support each other, including universities, government laboratories, funding agencies, software companies, and industry. A Golden Age of ICME is ahead of us. Let us proceed with confidence!

Notes

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 723867.

References

  1. Hohenberg P, Kohn W (1964) Inhomogeneous electron gas. Phys Rev 136:B864
  2. Kohn W, Sham LJ (1965) Self-consistent equations including exchange and correlation effects. Phys Rev 140:A1133
  3. Stewart JJP (2013) Optimization of parameters for semiempirical methods VI: more modifications to the NDDO approximations and re-optimization of parameters. J Mol Model 19:1–32
  4. Rozanska X, Stewart JJP, Ungerer P, Leblanc B, Freeman C, Saxe P, Wimmer E (2014) High-throughput calculations of molecular properties in the MedeA® environment: accuracy of PM7 in predicting vibrational frequencies, ideal gas entropies, heat capacities, and Gibbs free energies of organic molecules. J Chem Eng Data 59:3136–3143
  5. Allen MP, Tildesley DJ (1987) Computer simulations of liquids. Clarendon Press, Oxford
  6. Jorgensen WL (2002) OPLS force fields. In: Encyclopedia of computational chemistry. John Wiley & Sons
  7. Maple JR, Hwang M-J, Stockfisch TP, Dinur U, Waldman M, Ewig CS, Hagler AT (1994) Derivation of class II force fields. I. Methodology and quantum force field for the alkyl functional group and alkane molecules. J Comp Chem 15:162–182
  8. Sun H (1998) COMPASS: an ab initio force-field optimized for condensed-phase applications—overview with details on alkane and benzene compounds. J Phys Chem B 102:7338–7364
  9. Daw MS, Baskes M (1984) Embedded-atom method: derivation and application to impurities, surfaces, and other defects in metals. Phys Rev B 29:6443–6453
  10. Lewis GV, Catlow CRA (1985) Potential models for ionic oxides. J Phys C Solid State Phys 18:1149–1161
  11. Lerch D, Wieckhorst O, Hart GLW, Forcade RW, Müller S (2009) UNCLE: a code for constructing cluster expansions for arbitrary lattices with minimal user-input. Modell Simul Mater Sci Eng 17:055003, and references therein
  12. Sanchez JM, Ducastelle F, Gratias D (1984) Generalized cluster description of multicomponent systems. Physica A 128:334–350
  13. Schmitz GJ, Prahl U (2016) Handbook of software solutions for ICME. John Wiley & Sons
  14. Macher M, Klimeš J, Franchini C, Kresse G (2014) The random phase approximation applied to ice. J Chem Phys 140:084502
  15. Lejaeghere K, Van Speybroeck V, Van Oost G, Cottenier S (2014) Error estimates for solid-state density-functional theory predictions: an overview by means of the ground-state elemental crystals. Crit Rev Solid State Mater Sci 39:1–24
  16. Gladden JR, So JH, Maynard JD, Saxe PW, Le Page Y (2004) Reconciliation of ab initio theory and experimental elastic properties of Al2O3. Appl Phys Lett 85:392–394
  17. Geller CB, Wolf W, Picozzi S, Continenza A, Asahi R, Mannstadt W, Freeman AJ, Wimmer E (2001) Computational band-structure engineering of III-V semiconductor alloys. Appl Phys Lett 79:368–370
  18. NoMaD (Novel Materials Discovery) http://nomad-repository.eu/cms/. Accessed 14 November 2016
  19. Lejaeghere K, Bihlmayer G, Björkman T et al (2016) Reproducibility in density functional theory calculations of solids. Science 351(6280):aad3000
  20. Gaussian 09 (2016) http://www.gaussian.com/. Accessed 14 November 2016
  21. Pizzi G, Cepellotti A, Sabatini R, Marzari N, Kozinsky B (2016) AiiDA: automated interactive infrastructure and database for computational science. Comp Mater Sci 111:218–230
  22. Giannozzi P, Baroni S, Bonini N, Calandra M, Car R, Cavazzoni C, Ceresoli D, Chiarotti GL, Cococcioni M, Dabo I, Dal Corso A, Fabris S, Fratesi G, de Gironcoli S, Gebauer R, Gerstmann U, Gougoussis C, Kokalj A, Lazzeri M, Martin-Samos L, Marzari N, Mauri F, Mazzarello R, Paolini S, Pasquarello A, Paulatto L, Sbraccia C, Scandolo S, Sclauzero G, Seitsonen AP, Smogunov A, Umari P, Wentzcovitch RM (2009) QUANTUM ESPRESSO: a modular and open-source software project for quantum simulations of materials. J Phys: Condens Matter 21:395502. doi:10.1088/0953-8984/21/39/395502
  23. Valiev M, Bylaska EJ, Govind N, Kowalski K, Straatsma TP, van Dam HJJ, Wang D, Nieplocha J, Apra E, Windus TL, de Jong WA (2010) NWChem: a comprehensive and scalable open-source solution for large scale molecular simulations. Comput Phys Commun 181:1477
  24. Mostofi AA, Yates JR, Lee YS, Souza I, Vanderbilt D, Marzari N (2008) wannier90: a tool for obtaining maximally-localised Wannier functions. Comput Phys Commun 178:685
  25. Mostofi AA, Yates JR, Pizzi G, Lee YS, Souza I, Vanderbilt D, Marzari N (2014) An updated version of wannier90: a tool for obtaining maximally-localised Wannier functions. Comput Phys Commun 185:2309
  26. Gražulis S, Chateigner D, Downs RT, Yokochi AFT, Quirós M, Lutterotti L, Manakova E, Butkus J, Moeck P, Le Bail A (2009) Crystallography Open Database—an open-access collection of crystal structures. J Appl Crystallogr 42:726–729
  27. Gražulis S, Daskevic A, Merkys A, Chateigner D, Lutterotti L, Quirós M, Serebryanaya NR, Moeck P, Downs RT, Le Bail A (2012) Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration. Nucleic Acids Res 40. doi:10.1093/nar/gkr900
  28. MedeA® – Materials Exploration and Design Analysis (2016) Materials Design, Inc. www.materialsdesign.com
  29. Kresse G, Furthmüller J (1996) Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys Rev B 54:11169–11186
  30. Kresse G, Joubert D (1999) From ultrasoft pseudopotentials to the projector augmented-wave method. Phys Rev B 59:1758–1775
  31. EMMC-CSA (2016) European Materials Modelling Council – Coordination and Support Action. European Commission, Directorate-General for Research & Innovation, Industrial Technologies, Advanced Materials and Nanotechnologies, Grant number 723867
  32. Christensen M, Angeliu TM, Ballard JD, Vollmer J, Najafabadi R, Wimmer E (2010) Effect of impurity and alloying elements on Zr grain boundary strength from first-principles computations. J Nucl Mater 404:121–127
  33. Christensen M, Wolf W, Freeman C, Wimmer E, Adamson RB, Hallstadius L, Cantonwine PE, Mader EV (2014) Effect of alloying elements on the properties of Zr and the Zr-H system. J Nucl Mater 445:241–250
  34. Sundell G, Thuvander M, Tejland P, Dahlbäck M, Hallstadius L, Andrén H-O (2014) Redistribution of alloying elements in Zircaloy-2 after in-reactor exposure. J Nucl Mater 454:178–185
  35. Rigby D, Saxe P, Freeman C, Leblanc B (2014) Computational prediction of mechanical properties of glassy polymer blends and thermosets. In: Sano T, Srivatsan TS, Paretti MW (eds) Advanced composites for aerospace, marine and land applications. John Wiley & Sons, ISBN 978-1118888919
  36. Plimpton S (1995) Fast parallel algorithms for short-range molecular dynamics. J Comp Phys 117:1–19
  37. Suter UW, Eichinger BE (2002) Estimating elastic constants by averaging over simulated structures. Polymer 43:575–582
  38. White SR, Mather PT, Smith MJ (2002) Characterization of the cure-state of DGEBA-DDS epoxy using ultrasonic, dynamical, and thermal probes. Polym Eng Sci 42:51–67
  39. Behzadi S, Jones FR (2005) Yielding behavior of model epoxy matrices for fiber reinforced composites: effect of strain rate and temperature. J Macromol Sci B 44:993–1005
  40. Perdew JP, Burke K, Ernzerhof M (1996) Generalized gradient approximation made simple. Phys Rev Lett 77:3865; Erratum (1997) Phys Rev Lett 78:1396
  41. Perdew JP, Ruzsinszky A, Csonka GI, Vydrov OA, Scuseria GE, Constantin LA, Zhou X, Burke K (2008) Restoring the density-gradient expansion for exchange in solids and surfaces. Phys Rev Lett 100:136406; Erratum (2009) Phys Rev Lett 102:039902
  42. Rosciano F, Christensen M, Eyert V, Mavromaras A, Wimmer E (2014) Reduced strain cathode materials for solid state lithium ion batteries. International Patent Publication Number WO2014191018 A1
  43. Yiannourakou M, Ungerer P, Leblanc B, Ferrando N, Teuler J-M (2013) Overview of MedeA®-GIBBS capabilities for thermodynamic property calculation and VLE behavior description of pure compounds and mixtures: application to polar compounds generated from ligno-cellulosic biomass. Mol Simul 39:1165–1211
  44. Mackie AD, Tavitian B, Boutin A, Fuchs AH (1997) Vapour-liquid phase equilibria predictions of methane–alkane mixtures by Monte Carlo simulation. Mol Simul 19:1–15
  45. Panagiotopoulos AZ (1987) Direct determination of phase coexistence properties of fluids by Monte Carlo simulation in a new ensemble. Mol Phys 61:813–826
  46. Ungerer P, Beauvais C, Delhommelle J, Boutin A, Rousseau B, Fuchs A (2000) Optimization of the anisotropic united atoms intermolecular potential for n-alkanes. J Chem Phys 112:5499–5510
  47. Wick CD, Martin MG, Siepmann JI (2000) Transferable potentials for phase equilibria. 4. United-atom description of linear and branched alkenes and alkylbenzenes. J Phys Chem B 104:8008–8016
  48. Bourasseau E, Ungerer P, Boutin A, Fuchs AH (2002) Monte Carlo simulation of branched alkanes and long chain n-alkanes with anisotropic united atoms intermolecular potential. Mol Simul 28:317–336
  49. Bourasseau E, Ungerer P, Boutin A (2002) Prediction of equilibrium properties of cyclic alkanes by Monte Carlo simulation—new anisotropic united atoms potential—new transfer bias method. J Phys Chem B 106:5483–5491
  50. Bourasseau E, Haboudou M, Boutin A, Fuchs AH, Ungerer P (2003) New optimization method for intermolecular potentials: optimization of a new anisotropic united atoms potential for olefins: prediction of equilibrium properties. J Chem Phys 118:3020–3034
  51. Ferrando N, Lachet V, Teuler J-M, Boutin A (2009) Transferable force field for alcohols and polyalcohols. J Phys Chem 113:5985–5995
  52. Ferrando N, Lachet V, Pérez-Pellitero J, Mackie AD, Malfreyt P, Boutin A (2011) A transferable force field to predict phase equilibria and surface tension of ethers and glycol ethers. J Phys Chem B 115:10654–10664
  53. Ferrando N, Lachet V, Boutin A (2012) Transferable force field for carboxylate esters: application to fatty acid methylic ester phase equilibria prediction. J Phys Chem B 116:3239–3248
  54. Ferrando N, Gedik I, Lachet V, Pigeon L, Lugo R (2013) Prediction of phase equilibrium and hydration free energy of carboxylic acids by Monte Carlo simulations. J Phys Chem 117:7123–7132
  55. Pérez-Pellitero J, Bourasseau E, Demachy I, Ridard J, Ungerer P, Mackie AD (2008) Anisotropic united-atoms (AUA) potential for alcohols. J Phys Chem B 112:9853–9863
  56. Toxvaerd SR (1990) Molecular dynamics calculation of the equation of state of alkanes. J Chem Phys 93:4290–4295
  57. Zhou XW, Johnson RA, Wadley HNG (2004) Misfit-energy-increasing dislocations in vapor-deposited CoFe/NiFe multilayers. Phys Rev B 69:144113
  58. Gloor GJ, Jackson G, Blas FJ, de Miguel E (2005) Test-area method for the direct determination of the interfacial tension of systems with discontinuous potentials. J Chem Phys 123:134703
  59. Goujon F, Malfreyt P, Boutin A, Fuchs AH (2001) Vapour-liquid phase equilibria of n-alkanes by direct Monte Carlo simulations. Mol Simul 27:99–114
  60. Ibergay C, Ghoufi A, Goujon F, Ungerer P, Boutin A, Rousseau B, Malfreyt P (2007) Molecular simulations of the n-alkane liquid-vapor interface: interfacial properties and their long range corrections. Phys Rev E 75:051602
  61. Vega C, de Miguel E (2007) Surface tension of the most popular models of water using the test-area simulation method. J Chem Phys 126:154707
  62. Taihei M, Hidetoshi F, Takaharu U, Masayoshi K, Kiyoshi K (2005) Surface tension measurement of molten metal using a falling droplet in a short tube. Trans JWRI 34:29

Copyright information

© The Minerals, Metals & Materials Society 2017

Authors and Affiliations

Mikael Christensen (1), Volker Eyert (1), Arthur France-Lanord (1), Clive Freeman (2), Benoît Leblanc (1), Alexander Mavromaras (1), Stephen J Mumby (3), David Reith (1), David Rigby (3), Xavier Rozanska (1), Hannes Schweiger (3), Tzu-Ray Shan (3), Philippe Ungerer (1), René Windiks (1), Walter Wolf (1), Marianna Yiannourakou (1), Erich Wimmer (1, 3)

1. Materials Design s.a.r.l., Montrouge, France
2. Materials Design, Inc., Angel Fire, USA
3. Materials Design, Inc., San Diego, USA
