Introduction

The term extreme refers to something farthest or of the highest degree, which for the mechanical properties of materials implies unusual properties such as ultrahigh hardness1 and strongly negative Poisson’s ratio2. In the past decade, materials scientists have increasingly used high-throughput screening to predict structure properties with high accuracy when searching for promising materials3. However, high-throughput screening at the quantum level (first principles), although highly accurate, is time consuming and computationally expensive3,4. In contrast, prediction at the classical level (such as classical molecular dynamics) is highly efficient, since such methods usually scale linearly with the number of atoms, but less accurate5,6. Given the computational cost of density functional theory (DFT) and the lower accuracy of classical potentials, an intuitive idea is to bridge the gap between DFT-level accuracy and classical-level efficiency. Machine learning (ML) methods offer the possibility of bridging this gap7, and the application of ML has already helped to speed up materials discovery8.

ML methods have been extensively used for materials property prediction over the past decade, because ML models can be trained to be highly efficient with accuracy close to that of DFT9,10. Generally speaking, the accuracy of an ML model depends on an effective input representation of the crystal structures, since raw atomic positions are not suitable as direct input because they are not rotationally and translationally invariant11. Such input representations are known as descriptors or features. The idea behind using ML methods for structure property prediction is to analyze and map the relationship between the properties of materials and their characteristics by extracting information from existing data, without explicit prior knowledge of how to draw conclusions from those data12. Given data, ML algorithms learn the rules and relationships that underlie a dataset by assessing the data, and build a model to make predictions13. For example, ML models have been used to predict the mechanical properties of metal alloys14,15, the band gaps of crystals16,17, the formation energies of crystals18,19,20, and the melting temperatures of binary inorganic compounds21.

Though ML is highly efficient, it has some limitations that reduce its accuracy in predicting properties. These include, but are not limited to, measurement error22, lack of generality and precision, reliance on high-quality data23, inability to capture high-level concepts24, proneness to artifacts25, and good interpolation but poor extrapolation21,26. Another critical drawback of ML methods is that they yield little in the way of laws, understanding, or knowledge, because they are treated as black boxes6. More importantly, the material properties predicted by almost all existing ML models usually cannot exceed the range of the original training data. That is, trained ML models are usually good at predicting material properties within the original training data pool (interpolation), while they can seldom predict material properties outside the training dataset, i.e., their extrapolation ability is poor. However, many previous studies have shown that most extraordinary structures reside in sparse regions of the huge materials space. To build extrapolative ML property prediction models for these sparse regions, it is critical to develop advanced ML models that can identify promising candidates whose properties may exceed the range of the training data.

In this study, we implemented the boundless objective-free exploration (BLOX) algorithm27 to search for extreme mechanical properties. We use three different pairs of mechanical properties as the property space for the search, namely, bulk modulus vs. shear modulus, shear modulus vs. hardness, and Pugh’s ratio vs. Poisson’s ratio. The mechanical properties of a material characterize its reaction to external or applied loading and indicate the changes taking place in the material. They can be used to determine how a material would behave in a given application, they are helpful in the material selection process, and they can also be used to estimate the lifetime of a material. In the BLOX implementation, an ML model, namely the Random Forest (RF) algorithm, is built to predict the properties of materials from the data for which calculated properties are currently available. In searching the property space, the BLOX algorithm looks beyond the boundary of the observed data to capture materials whose properties lie at or outside its edge. This is made possible by using the Stein novelty (SN) score to recommend materials that tend to lie outside the boundary, i.e., that differ from the original training data. The SN score measures the deviation between the observed properties and the predicted properties by using the Stein discrepancy28. After thoroughly screening the 85,707 crystal structures from the Materials Project29 database, we found 30 structures with ultrahigh bulk and shear moduli, 21 structures with ultrahigh shear modulus and hardness, and 11 structures with negative Poisson’s ratio. We compare our results with traditional ML methods such as the crystal graph convolutional neural network (CGCNN)9, RF, Lasso regression, and Ridge regression30.

Results and discussion

The results of this study are described in four major subsections based on the material property space searched.

High bulk modulus and shear modulus

Superhard materials are defined as materials with hardness exceeding 40 GPa31, and they are of great importance because of their industrial applications such as abrasives, polishing, disc brakes, protective coatings, and cutting tools32. Diamond and related carbon nanostructures are known to be among the hardest materials to date, with Vickers hardness in the range of 70–150 GPa33. However, diamond has several limitations for large-scale industrial applications, such as high cost and oxidation at temperatures above 800 °C34. A superhard material usually possesses a high bulk modulus (\(K\)) and shear modulus (\(G\)) and does not deform plastically. The shear modulus relates to the strain response of a body to shear or torsional stress, which involves a change of shape without a change of volume, while the bulk modulus relates to the strain response of a body to hydrostatic stress, which involves a change of volume without a change of shape35. Inspired by the mechanical properties of diamond36, such as its ultrahigh bulk and shear moduli, our first goal is to search for structures with high bulk and shear moduli, which tend to be superhard materials.

The scatter plot of observed mechanical properties and BLOX predictions selected by top SN scores for the first round is shown in Fig. 1a, while Fig. 1b shows the first round of DFT calculations for the structures recommended by the BLOX algorithm in comparison with traditional ML methods (CGCNN, RF, Lasso regression, and Ridge regression). The initial set of elastic moduli data for the property prediction model was obtained from the JARVIS-DFT database37,38, which consists of more than 65,000 materials and more than 15,000 elastic modulus data points along with several other material properties. From Fig. 1a, we can clearly see that the BLOX algorithm recommended many materials that are out-of-trend relative to the initially observed materials. Materials with a high SN score have a higher chance of being out-of-trend, as shown by the color coding in Fig. 1a. The initially observed materials were randomly chosen from an original pool of 10,192 structures downloaded from the Materials Project database39, and their mechanical properties were calculated by DFT. Following the BLOX recommendations, we verified the properties of the recommended materials with DFT calculations and found some materials with extremely high and extremely low bulk and shear moduli. Since our target is to find materials with extremely high mechanical strength, we then cleaned our data by removing any materials with a bulk modulus below 130 GPa. This step is necessary because the BLOX algorithm searches for out-of-trend materials in all directions of the bulk modulus vs. shear modulus space, which leads to some materials with low bulk and shear moduli being recommended as well. In other words, we guide the BLOX algorithm to search in the direction we are interested in. The threshold value of 130 GPa was chosen empirically and is about half of the maximum bulk modulus in the original training data.
From Fig. 1b, we compare the mechanical properties of recommended materials between ML models and DFT calculations. We built CGCNN, RF, Lasso regression, and Ridge regression models using the initial observed data and then used these ML models to predict the mechanical properties of recommended structures. By comparing prediction by ML models to the DFT results of BLOX recommended structures, Fig. 1b provides direct evidence that the traditional ML models could not push the material properties to the limit, even if the exact same recommended materials were tested, which is one of the main drawbacks for many existing ML models as we pointed out earlier. This is understandable considering that traditional ML models are trained to be capable of predicting materials properties within the original range of training data, while they can hardly predict properties outside.
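
The range limitation described above can be reproduced on toy data. The following is a minimal sketch, assuming scikit-learn's `RandomForestRegressor` and a synthetic one-dimensional descriptor (neither the data nor the settings are taken from this study): because every tree predicts an average of training targets, the forest cannot return a value above the largest target it was trained on.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic 1D "descriptor" and a property that grows linearly with it.
X_train = rng.uniform(0.0, 1.0, size=(200, 1))
y_train = 300.0 * X_train[:, 0]  # training targets lie roughly in [0, 300]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Candidates far outside the training range; the true values would be 600 to 900.
X_outside = np.array([[2.0], [2.5], [3.0]])
y_pred = model.predict(X_outside)

# Each leaf stores a mean of training targets, so the averaged prediction
# stays inside the range of y_train no matter how extreme the input is.
print("largest training target:", y_train.max())
print("predictions outside the range:", y_pred)
```

This is why a separate recommendation step followed by DFT validation is needed to move beyond the observed boundary.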

Fig. 1: Performance of BLOX algorithm in searching ultrahigh shear and bulk moduli materials.

a Observed data and BLOX prediction for selected structure with highest Stein novelty (SN) score. 2000 structures were used as input to ML in BLOX to train a model which predicts the unchecked data. The SN score is a measure of discrepancy between the predicted properties of unchecked data and properties of observed data. Candidates with high SN scores are recommended for DFT calculations. b Comparison of CGCNN, RF, Ridge regression, Lasso regression, and DFT calculation for structures recommended in the 1st round. The ML models could not push the properties to the outside of the original dataset (observed data).

Once we obtained the DFT results for the recommended structures, we added these data to the original observed data and used the expanded observed dataset to train an ML model, after which BLOX recommended the next round of promising structures based on SN scores (see details in the “Methods” section). We continued this loop in searching for ultrahigh bulk and shear moduli by running BLOX for four rounds, and in each round we added the materials with high mechanical properties verified by DFT calculations in the previous round. In Fig. 2a, we observe that more and more materials were identified in each round that push the material properties to the limit with DFT validation. From Fig. 2b, we can see that there are no significant changes in bulk modulus during our search, but in Fig. 2c we observe that after the first iteration there is an increase of about 60% in the maximum shear modulus compared to the initial training data. Adding the verified DFT data to the initial training data improves the BLOX recommendations, hence the need for further iterations. We observed no significant changes between the third and fourth iterations, so we stopped iterating. The stopping criteria depend largely on the specific material properties being investigated. After four rounds we found 30 structures in total with ultrahigh bulk and shear moduli. It is interesting to note that some identified structures have almost double the shear modulus of the original observed data. To quantify the difference in material properties between the BLOX recommended, DFT validated values and the ML model predictions, we calculated the distance between the outlier of CGCNN and the real values from DFT calculations that are higher than the CGCNN predictions for each round, as shown in Fig. 3a, by using the formula for the distance between two points given below:

$$\sqrt {\left( {x_2 - x_1} \right)^2 + \left( {y_2 - y_1} \right)^2}$$
(1)

where \(x_2\), \(y_2\) are the bulk modulus and shear modulus, respectively, of materials higher than the CGCNN prediction, and \(x_1\), \(y_1\) are the bulk modulus and shear modulus of the highest outlier of the CGCNN prediction. Figure 3a shows the maximum and average distance between the CGCNN prediction and the DFT calculations for the recommended materials. We observed that as the number of BLOX rounds increases, the distance between the CGCNN prediction and the DFT calculation decreases. This is understandable considering that more and more material properties outside the original training data were added to the next training process, i.e., the property range that the CGCNN model can predict also expands. To better illustrate our idea, we show the CGCNN predictions and DFT calculations for the 1st, 2nd, 3rd, and 4th rounds in Fig. 3b–e for bulk modulus vs. shear modulus. The blue symbol denotes the outlier of the CGCNN prediction that was used for calculating the distances to the real DFT values. The traditional ML models do not continuously improve as we add a few hundred recommended observed data to the initial 2000 observed data. This can be seen from Fig. 4, where we plot the mean absolute error (MAE) of CGCNN, Lasso regression, and Ridge regression for bulk and shear moduli prediction in different BLOX rounds. For most ML models the MAE does not change noticeably. For the Ridge regression model, we even found that the MAE increases as more data are added to the training.
Several reasons are responsible for this observation: (1) the total amount of added data is still small compared to the size of the original observed data (2000), roughly 5–10%; (2) a considerably large portion of the added data is still within the range of the original observed data, which was already well learned in previous rounds, so those data contribute little new information to the next training process; (3) the ML models predict very well on the subset of training data, but with a larger dataset the variability increases and the model may encounter data that were not well represented in training.
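
The distance metric of Eq. (1) can be sketched as follows. This is a minimal illustration, assuming NumPy and assuming that "higher than the CGCNN prediction" means higher in both properties at once; the function name and all numerical values are hypothetical and are not data from this study.

```python
import numpy as np

def outlier_distances(outlier, dft_points):
    """Return (max, mean) Eq. (1) distance from the outlier to the DFT points above it."""
    outlier = np.asarray(outlier, dtype=float)
    dft_points = np.asarray(dft_points, dtype=float)
    # Assumption: keep only DFT points that exceed the outlier in both properties.
    mask = np.all(dft_points > outlier, axis=1)
    selected = dft_points[mask]
    if selected.size == 0:
        return 0.0, 0.0
    # Euclidean distance of Eq. (1) for every selected point.
    d = np.sqrt(((selected - outlier) ** 2).sum(axis=1))
    return float(d.max()), float(d.mean())

# Hypothetical (bulk modulus, shear modulus) pairs in GPa.
cgcnn_outlier = (300.0, 200.0)
dft_values = [(303.0, 204.0), (306.0, 208.0), (250.0, 400.0)]

d_max, d_mean = outlier_distances(cgcnn_outlier, dft_values)
print(d_max, d_mean)
```

The third point is excluded because its bulk modulus is below that of the outlier, so only two distances enter the maximum and the average.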

Fig. 2: Training procedure of BLOX algorithm.

a Comparison of observed data and all rounds of DFT calculations recommended by BLOX. b, c Bar chart showing average (left y-axis) and maximum (right y-axis) value of bulk modulus and shear modulus respectively for each round of DFT calculations. “0” DFT round means the original training data.

Fig. 3: Evaluation of prediction of CGCNN model for shear vs. bulk modulus.

a Maximum and average distance between outlier of CGCNN prediction and DFT calculation for bulk and shear moduli. For each round we measure the distance between the outlier of CGCNN prediction (blue symbols in bottom panels) and all DFT values with bulk and shear moduli higher than CGCNN prediction. b–e Comparison between CGCNN prediction and DFT values for the 1st, 2nd, 3rd, and 4th round, respectively. The blue symbol denotes the outlier structure from CGCNN prediction that is used for calculating the distances to real DFT values.

Fig. 4: Performance of traditional ML models.

Mean absolute error (MAE) for Ridge regression, Lasso regression, Random Forest, and CGCNN models for bulk and shear moduli prediction at each DFT round.

Ultrahigh hardness

Our recent high-throughput study on ultrahard carbon allotropes shows that hardness has a strong positive correlation with shear modulus31,40. Using the shear modulus and hardness property space, we were able to find some materials with ultrahigh hardness. Here we compare the correlation of hardness with shear modulus to that of hardness with bulk modulus. A previous study has shown that materials with high shear modulus normally have high hardness, but not all materials with high bulk modulus have high hardness1. Therefore, shear modulus correlates better with hardness than bulk modulus does41. That is the reason we chose to search for materials in the hardness vs. shear modulus space.
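
The different roles of the two moduli can be illustrated with an empirical hardness model. This section does not state which hardness model was used, so the sketch below assumes Chen's empirical model, \(H_v = 2(k^2 G)^{0.585} - 3\) with \(k = G/K\), purely for illustration; the function name and input values are hypothetical.

```python
def chen_vickers_hardness(bulk_gpa, shear_gpa):
    """Vickers hardness (GPa) from Chen's empirical model (an assumed model, see lead-in)."""
    k = shear_gpa / bulk_gpa  # Pugh's ratio G/K
    return 2.0 * (k ** 2 * shear_gpa) ** 0.585 - 3.0

# Hypothetical moduli in GPa.
reference = chen_vickers_hardness(bulk_gpa=400.0, shear_gpa=300.0)
more_bulk = chen_vickers_hardness(bulk_gpa=500.0, shear_gpa=300.0)
more_shear = chen_vickers_hardness(bulk_gpa=400.0, shear_gpa=400.0)

# In this model, raising only the bulk modulus lowers the estimate,
# while raising the shear modulus raises it.
print(more_bulk < reference < more_shear)
```

Under this assumed model, a higher bulk modulus alone does not produce a harder material, which is consistent with searching in the hardness vs. shear modulus space.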

In Fig. 5, we observe that more materials were pushed to the limit in each round with DFT validation. After four rounds we found 21 structures in total with ultrahigh hardness and shear modulus. Some identified structures have Vickers hardness greater than 70 GPa, which is close to that of diamond or carbon allotropes. We also quantify the difference in material properties between the DFT validated values and the CGCNN prediction by calculating the distance between the outlier of CGCNN and the DFT calculated values higher than the CGCNN prediction for each round using Eq. (1). The results are shown in Fig. 6a. Here, we calculate the distance for all DFT values that are greater than the highest value predicted by the CGCNN model. As shown in Fig. 6b–e, for each round an outlier predicted by the CGCNN model was chosen (the blue symbol in the figure), and we then calculated the distance of all DFT values higher than this outlier using Eq. (1). We observed that as the number of BLOX rounds increases, the distance between the CGCNN prediction and the DFT calculations decreases, which is the same behavior as found before for the bulk and shear moduli space (see Fig. 3b–e). However, once again, we found that the distance for the CGCNN prediction, which represents the recommended material properties relative to the original range, does not keep decreasing as the number of BLOX rounds increases. This means that the CGCNN model cannot easily be trained to predict material properties in the rare or boundary region.

Fig. 5: Performance of BLOX algorithm in searching ultrahigh hardness materials.

The hardness vs. shear modulus plot with comparison of observed data and all rounds of DFT calculations. After looping through all rounds of DFT, we were able to get 21 structures with ultrahigh shear modulus and ultrahigh hardness. The range of hardness and shear modulus is increased (expanded) by 40% and 70%, respectively.

Fig. 6: Evaluation of prediction of CGCNN model for hardness vs. shear modulus.

a Maximum and average distance between outlier of CGCNN prediction and DFT calculation for shear modulus and hardness. For each round we measure the distance between the outlier of CGCNN prediction (blue symbols in bottom panels) and all DFT values with shear modulus and hardness higher than CGCNN prediction. b–e Comparison between CGCNN prediction and DFT values for the 1st, 2nd, 3rd, and 4th round, respectively. The blue symbol denotes the outlier structure from CGCNN prediction that is used for calculating the distances to real DFT values.

Negative Poisson’s ratio

Poisson’s ratio is defined as the ratio of the lateral strain in a solid to the longitudinal strain measured in a simple tension experiment42. Most solid materials have a positive Poisson’s ratio, but a small portion of solid materials have a negative Poisson’s ratio and are known as auxetic materials43,44. Materials with negative Poisson’s ratio have exceptional properties such as high energy absorption, high fracture resistance, resistance to deformation under shear loading, enhanced toughness, and resistance to indentation. We use Pugh’s ratio (defined as the ratio between the shear modulus and the bulk modulus, used to distinguish the ductile/brittle behavior of materials45,46) vs. Poisson’s ratio for this search. We applied the BLOX algorithm to explore in the negative direction to find structures with negative Poisson’s ratio. In Fig. 7, we present Poisson’s ratio vs. Pugh’s ratio for different BLOX rounds. In total we found 11 structures with negative Poisson’s ratio among the 85,707 crystal structures taken from the Materials Project database. In contrast, only two materials in the original ~2000 observed dataset have negative Poisson’s ratio. Our results indicate that the original Materials Project database does not include many materials with negative Poisson’s ratio.
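
The link between the two axes of this property space can be made explicit. The sketch below assumes the standard isotropic elasticity relation \(\nu = (3K - 2G)/[2(3K + G)]\); this section does not state how Poisson’s ratio was evaluated, so the relation and the function names are illustrative assumptions. With Pugh’s ratio \(k = G/K\) as defined above, it becomes \(\nu = (3 - 2k)/[2(3 + k)]\), which changes sign at \(k = 1.5\).

```python
def poisson_ratio(bulk, shear):
    """Isotropic Poisson's ratio from bulk modulus K and shear modulus G (same units)."""
    return (3.0 * bulk - 2.0 * shear) / (2.0 * (3.0 * bulk + shear))

def poisson_from_pugh(k):
    """Same relation written in terms of Pugh's ratio k = G/K."""
    return (3.0 - 2.0 * k) / (2.0 * (3.0 + k))

# Hypothetical moduli in GPa: a typical case and one with G > 1.5 K.
print(poisson_ratio(bulk=200.0, shear=100.0))  # positive Poisson's ratio
print(poisson_ratio(bulk=100.0, shear=200.0))  # negative Poisson's ratio
print(poisson_from_pugh(1.5))                  # sign change at k = 1.5
```

Under this assumption, searching toward negative Poisson’s ratio is equivalent to searching toward a shear modulus larger than 1.5 times the bulk modulus.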

Fig. 7: Performance of BLOX algorithm in searching negative Poisson’s ratio materials.

Observed data and all rounds of DFT calculations recommended by BLOX for Poisson’s ratio vs. Pugh’s ratio. After a few loops, we were able to find 11 structures with negative Poisson’s ratio. The dashed line represents zero Poisson’s ratio and is a guide to the eye.

Data driven insight into mechanical properties

Having obtained many high-accuracy DFT data recommended by BLOX, we are now in a position to study the mechanisms behind these outliers in more depth. The Pearson correlation matrix, shown in Fig. 8, gives insight into how strongly the properties correlate with each other47. In principle, the mechanical behavior of a material depends on its interatomic bonding, which can be further traced back to the electron cloud, e.g., the charge density and its spatial distribution. That is the reason we show the correlation between the elastic properties and the local potential (LOCPOT) and electron localization function (ELF) values. Each entry of a Pearson correlation matrix relates two parameters to each other and takes a value between −1 and 1. A value of −1 indicates a perfectly inverse correlation between the two parameters, a value of 1 indicates a perfectly positive correlation, and a value of 0 indicates no correlation. A value close to 0 indicates a weak correlation, while a value close to −1 or 1 indicates a strong inverse or a strong direct correlation, respectively. Figure 8 shows that there is a strong inverse correlation between Poisson’s ratio and Pugh’s ratio, and a strong positive correlation between bulk and shear modulus, between bulk and Young’s modulus, and between shear and Young’s modulus. There are fairly strong negative and positive correlations between the mechanical properties and the mean values of LOCPOT and ELF, respectively.
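
A Pearson correlation matrix of this kind can be computed as sketched below. This is a minimal example assuming NumPy; the five data points are hypothetical, and Poisson’s ratio is generated from the isotropic relation \(\nu = (3 - 2k)/[2(3 + k)]\), which is an assumption made only to produce illustrative values.

```python
import numpy as np

# Hypothetical properties for five materials (moduli in GPa).
bulk = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
shear = 0.6 * bulk + 5.0                        # exactly linear in the bulk modulus
pugh = shear / bulk                             # Pugh's ratio G/K
poisson = (3.0 - 2.0 * pugh) / (2.0 * (3.0 + pugh))

# np.corrcoef treats each row as one variable and returns the Pearson matrix.
names = ["bulk", "shear", "pugh", "poisson"]
corr = np.corrcoef(np.vstack([bulk, shear, pugh, poisson]))

for name, row in zip(names, corr):
    print(name, np.round(row, 3))
```

The matrix is symmetric with ones on the diagonal. Here the bulk and shear entry is 1 because the two are linearly related by construction, and the Pugh and Poisson entry is strongly negative.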

Fig. 8: Material descriptor analysis.

Pearson correlation matrix between maximum local potential, minimum local potential, mean local potential, maximum electron localization function, minimum electron localization function, mean electron localization function, bulk modulus, shear modulus, elastic modulus, Pugh’s ratio, and Poisson’s ratio.

To analyze the mechanism behind ultrahigh hardness, we established the correlation between electron work function (EWF), interatomic bonding, and hardness. Generally, the mechanical behavior of a material depends on its interatomic bonding strength, which is governed by the behavior of the electrons48; strong covalent bonding is a basic feature of most known superhard materials such as diamond. EWF is the minimum energy required to move an electron at the Fermi level inside a material to its surface with zero kinetic energy49. It is determined by the composition of the material and by the charge redistribution on its surface caused by the dipole layer50, and it reflects the electronic behavior of metals and the atomic interaction51. Previous studies have demonstrated that, for hard materials, hardness is mainly governed by the interatomic bonding strength through its correlation with the EWF48, i.e., the higher the EWF, the higher the hardness of the material. We also study the ELF, which is a measure of electron localization in atomic and molecular systems. Figure 9 shows the ELF, which reflects the probability of finding an electron in the system, and the LOCPOT for the top two superhard structures recommended by BLOX and for two materials, namely Be4C8N4 (mp-1189451) and C12N8 (mp-1188347), that to the best of our knowledge have never been reported in the literature. From Fig. 9a, c, we can see that the ELF plots clearly illustrate the presence of electrons and the strong covalent bonding between the elements of the materials. Figure 9b, d shows the presence of more electrons and hence an increase in EWF. Figure 9e–h also shows the presence of strong covalent bonds between the atoms of the structures. Thus, all plots agree with the ultrahigh hardness exhibited by the four materials as confirmed by DFT calculations.

Fig. 9: Electronic level insight into ultrahigh hardness.

The ELF and LOCPOT plot of the structure B4C8N4 (mp-1079201) (a, b), BC7 (mp-1078935) (c, d) recommended by BLOX showing the presence of more electrons and covalent bonding. e–h The ELF and LOCPOT plot of two identified structures Be4C8N4 (mp-1189451) and C12N8 (mp-1188347) showing the presence of strong covalent bonding.

Our results demonstrate that the BLOX algorithm can be effectively used to accelerate material discovery. Some materials recommended by BLOX have been reported previously, which supports the accuracy of our screening results. One such material is BC7 (refs. 52,53) with symmetry \(P\bar 4m2\) (space group number: 115) and a reported hardness of 75.2 GPa. In our DFT calculations we found that BC7 has a hardness of 71.7 GPa, which is very close to the previous study. Finally, we found six superhard structures which, to the best of our knowledge, have never been reported in the literature. All six structures have a negative average local potential, indicating a strong average attractive atomic interaction in the unit cell. These structures, with relevant structural information and hardness, are reported in Table 1. To further confirm the stability of these structures, Fig. 10 shows the phonon dispersions of selected structures along high symmetry paths in the Brillouin zone. No negative (imaginary) frequencies were found for the structures, indicating that these structures are dynamically stable.

Table 1 Structures identified by BLOX recommendation with corresponding ultrahigh hardness.
Fig. 10: Dynamical stability analysis of ultrahard materials.

Phonon dispersions of a B4C8N4, b BC7, c Be4C8N4, and d C12N8 along high symmetry paths in the Brillouin zone. There are no negative (imaginary) frequencies in the phonon dispersions, indicating that these structures are dynamically stable.

Before closing, we would like to discuss some important points about the ML + BLOX algorithms:

  1. The stopping criterion: it is hard to express a stopping criterion as a mathematical formula. In practice, we stop iterating when the validated DFT calculations from the previous round no longer add a significant number of new structures. This is an empirical and intuitive method. When and where the BLOX loops should stop depends on the specific material properties being investigated, and also on the region that the existing materials have already reached. Although it is currently almost impossible to prove the upper or lower limit of material properties theoretically or mathematically, our study of coupling the BLOX algorithm with DFT calculations paves the way to accelerating material discovery by identifying out-of-trend materials. Our results demonstrate that the BLOX algorithm is very promising for future materials discovery that can push material properties to the limit with an acceptable and achievable number of DFT calculations.

  2. The BLOX algorithm could be coupled with any traditional ML regressor. We used RF + BLOX in our study because a previous study by Terayama et al.27 compared different ML + BLOX combinations and found that RF + BLOX gave the best result. We simply follow that recipe in our current work. Systematic cross-checking and comparison of the performance of different ML + BLOX combinations will be the focus of our future work.

  3. It is hard to give a mathematical formula for the maximum theoretical limit of material properties, including the mechanical properties studied here. In many cases, the goal is to find materials with enhanced properties. As we have shown, the ML + BLOX algorithm has great potential to accelerate such discovery, i.e., it allows us to identify materials whose mechanical properties are close to or even beyond the current boundary. Using this approach, we have identified 30 structures with ultrahigh bulk and shear moduli, 21 superhard structures with ultrahigh hardness, and 11 structures with negative Poisson’s ratio among the 85,707 crystal structures taken from the well-known Materials Project database. It is also worth noting that the final findings depend on the material pool to be screened. We believe that if the same method is applied to an even larger database, such as the OQMD database that currently has around 1 million structures, more structures with extreme mechanical properties will be identified quickly.

  4. The transferability of the method to other material properties: we believe that the same procedure can be straightforwardly applied to other material properties such as lattice thermal conductivity, Grüneisen parameter, heat capacity, superconductivity, etc. In particular, the BLOX method is believed to be very suitable for finding extreme material properties that are hard to calculate by direct DFT. We would like to emphasize again that the overall performance of the BLOX algorithm depends on two major factors: (1) a well-defined to-be-pushed material property vs. dependent variable(s): in our experience, the stronger the correlation or relationship between them, the more easily the BLOX algorithm can identify the trend and then recommend the outliers. (2) More accurate descriptor(s) for the ML regressor model: more accurate descriptor(s) result in more accurate prediction of the target properties of the to-be-screened materials (unchecked data), which helps the BLOX algorithm pinpoint the outliers more efficiently and accurately, so that in each iteration structures outside the previous boundary of the material property are identified. In this way, the material property is gradually pushed to the limit as the search iterations go on.

  5. Last but not least, the ML + BLOX algorithms used herein have helped us identify structures in an existing database whose properties have not been explored previously. Such algorithms can also be coupled with state-of-the-art crystal structure prediction methods or packages, such as Universal Structure Predictor (USPEX)54,55,56 and Crystal structure Analysis by Particle Swarm Optimization (CALYPSO)57,58,59, to discover completely new structures with desired or extreme material properties that are not included in existing materials databases.
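
The empirical stopping rule described in point (1) above could be expressed in code roughly as follows. This is only a sketch of one possible rule, assuming a positive property for which higher is better; the function name, the thresholds `min_new` and `rel_gain`, and all values are hypothetical and do not come from this study.

```python
def should_stop(current_best, new_values, min_new=1, rel_gain=0.02):
    """Stop when fewer than min_new validated values beat the current best by rel_gain."""
    threshold = current_best * (1.0 + rel_gain)
    n_new = sum(1 for value in new_values if value > threshold)
    return n_new < min_new

# Hypothetical DFT-validated shear moduli (GPa) for three successive rounds.
best = 250.0
rounds = [[400.0, 310.0, 120.0], [450.0, 390.0], [452.0, 380.0]]

stopped_at = None
for round_index, values in enumerate(rounds, start=1):
    if should_stop(best, values):
        stopped_at = round_index
        break
    best = max(best, max(values))  # extend the observed boundary

print(stopped_at, best)
```

In this toy run the third round no longer extends the boundary by more than the chosen tolerance, so the loop ends there. A rule of this kind would have to be tuned for each property space.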

Methods

Training data and DFT calculations

We used the classical force-field inspired descriptors (CFID)38,60 to transform each crystal structure into ML input. We used BLOX27 coupled with the RF ML algorithm to screen 85,707 crystal structures downloaded from the Materials Project39 database. We split the 85,707 crystal structures into ten different jobs and ran them in parallel. For each job BLOX recommended 25 promising structures ranked by SN score, giving a total of 250 recommended candidates in each round. We then performed DFT calculations using the plane-wave basis projector augmented wave method61 with the Perdew-Burke-Ernzerhof exchange-correlation functional62, as implemented in the VASP package63,64,65. The cutoff energy was set to 500 eV for calculating the mechanical properties of the recommended crystal structures. The energy and force convergence criteria for the DFT calculation of elastic constants were \(10^{-6}\) eV and \(10^{-4}\) eV/Å, respectively. DFT calculations were used to validate the structures recommended by BLOX because the ML model predictions were not accurate enough, owing to transferability issues and the limited amount of training data. We also applied crystal graph convolutional neural network (CGCNN), Lasso regression, and Ridge regression models to the recommended structures and compared their predictions with DFT. The phonon dispersions of selected structures were calculated by the finite displacement method using the PHONOPY package66, with harmonic second-order force constants calculated by VASP.
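
The per-round selection described above (ten jobs, 25 candidates each) can be sketched as follows. This is an illustration only, assuming NumPy, contiguous chunks, and random placeholder scores; the actual partitioning of the pool and the SN score computation are not specified here, and the function name is hypothetical.

```python
import numpy as np

def recommend(scores, n_jobs=10, top_k=25):
    """Split candidate indices into n_jobs chunks and keep the top_k scores in each."""
    indices = np.arange(len(scores))
    picked = []
    for chunk in np.array_split(indices, n_jobs):
        # Sort this chunk by score, highest first, and keep the best top_k.
        order = np.argsort(scores[chunk])[::-1][:top_k]
        picked.append(chunk[order])
    return np.concatenate(picked)

rng = np.random.default_rng(0)
sn_scores = rng.random(85_707)  # placeholder values standing in for SN scores

recommended = recommend(sn_scores)
print(len(recommended))
```

Each job contributes its own 25 highest-scoring candidates, so a round yields 250 structures to pass on to DFT validation.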

ML workflow

Traditional ML models were trained and used for structure-property prediction to test whether we can rely on ML to find extreme mechanical properties, which would in turn reduce the cost of DFT computation. The traditional ML models (RF, Ridge, and Lasso regression) were used as implemented in scikit-learn67. RF is an ML technique, proposed by Breiman in 200168 for classification and regression problems, that builds an ensemble of decision trees. The RF regressor produces an estimate by averaging the predictions of many individual trees fitted on randomly resampled sets of training data. Three-fold cross validation was used for model fitting and hyperparameter optimization. We set the maximum number of trees to 100 and used 80% of the observed data for training and 20% for testing. Ridge regression was originally proposed by Hoerl and Kennard in 197030,69 for analyzing data affected by multicollinearity, whereas Lasso regression was put forward by Tibshirani in 199670 for simultaneous parameter estimation and variable selection in regression analysis. Lasso and Ridge regression are both regularized methods that reduce model complexity by constraining the number or the overall magnitude of the coefficients in the model71. Lasso regression penalizes the sum of the absolute values of the coefficients (L1 regularization), and Ridge regression penalizes the sum of the squared coefficients (L2 regularization). Both regularize complex models by adding a penalty term, which helps reduce overfitting. For these two models, three-fold cross validation was also used for model fitting and hyperparameter optimization, with 80% of the observed data used for training and 20% for testing, and the maximum value of the regularization strength alpha (\(\alpha\)) was set to 10.
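The training protocol above (80/20 split, three-fold cross validation, at most 100 trees, \(\alpha \le 10\)) can be sketched with scikit-learn as follows. The hyperparameter grids, the random seed, and the function name `compare_models` are assumptions made for illustration; the exact grids used in this study are not specified.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, train_test_split


def compare_models(X, y, seed=0):
    """Fit RF, Ridge, and Lasso with 3-fold CV on an 80/20 train/test split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Assumed grids: only the upper bounds (100 trees, alpha = 10) come from the text.
    alphas = [0.01, 0.1, 1.0, 10.0]
    searches = {
        "rf": GridSearchCV(
            RandomForestRegressor(random_state=seed),
            {"n_estimators": [25, 50, 100]},
            cv=3,
        ),
        "ridge": GridSearchCV(Ridge(), {"alpha": alphas}, cv=3),
        "lasso": GridSearchCV(Lasso(max_iter=10_000), {"alpha": alphas}, cv=3),
    }
    results = {}
    for name, search in searches.items():
        search.fit(X_tr, y_tr)  # cross-validated hyperparameter search
        # Store the selected hyperparameters and the held-out R^2 score.
        results[name] = (search.best_params_, search.score(X_te, y_te))
    return results
```

Here `X` would be the descriptor matrix of the observed structures and `y` the DFT-calculated property; the returned dictionary maps each model name to its selected hyperparameters and its \(R^2\) on the 20% test set.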

The CGCNN model combines the descriptors and the learning model into one inseparable step, i.e., the model learns material properties directly from the connection of atoms in the crystal9. The CGCNN framework has been shown to represent periodic crystals in a way that provides material property predictions with DFT accuracy4,9. Here, the crystal structures are represented by a crystal graph that encodes both atomic information and bonding interactions between atoms, and a convolutional neural network is then built on top of the graph to automatically extract representations that are optimal for predicting the target properties. The atomic properties are represented by nodes and encoded in the feature vector \((v_i)\). For each atom, neighbors are first searched within a 6 Å radius, and are considered connected when they share a Voronoi face72 with the center atom and have an interatomic distance lower than the sum of the Cordero covalent bond lengths73 plus a tolerance of 0.25 Å. Crystal graphs do not form an optimal representation by themselves; however, they are improved by using convolutional layers. After each convolutional layer, the feature vectors gradually contain more information on the surrounding environment owing to the concatenation of the atom and bond feature vectors6. The convolution function by Xie et al.9 is:

$$v_i^{(t + 1)} = v_i^{(t)} + \mathop {\sum }\limits_{j,k} \sigma \left( {z_{\left( {i,j} \right)_k}^{(t)}W_f^{(t)} + b_f^{(t)}} \right) \odot g\left( {z_{\left( {i,j} \right)_k}^{(t)}W_s^{(t)} + b_s^{(t)}} \right)$$
(2)

where \(W_f^{\left( t \right)}\), \(W_s^{\left( t \right)}\), and \(b_f^{\left( t \right)}\), \(b_s^{\left( t \right)}\) are the convolution weight matrix, self-weight matrix, and biases of the \(t{{\rm{th}}}\) layer, respectively, \(g\) is the activation function for introducing nonlinear coupling between layers, \(\sigma\) denotes the sigmoid function, \(\odot\) denotes element-wise multiplication, and \(z_{\left( {i,j} \right)_k}^{(t)}\) is the concatenation of the neighbor vectors (the feature vectors of atoms \(i\) and \(j\) and of the \(k{{\rm{th}}}\) bond connecting them). After \(R\) convolutions, a pooling layer that operates on all feature vectors reduces the spatial dimensions of the convolutional neural network. For simplicity, a normalized summation is used as the pooling function. For optimization, backpropagation and stochastic gradient descent were used to update the weights using DFT-calculated data. Here, we train the CGCNN model on the observed data, with 60% for training, 20% for testing, and 20% for validation. We then use the model to predict the properties of the 85,707 unchecked structures. We add the DFT-validated values to the observed data for the next round. We also examined how far the outliers of the CGCNN predictions lie from the DFT calculations (see Figs. 3 and 6).
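A single convolution step of Eq. (2), followed by the normalized-summation pooling, can be sketched in NumPy as below. This is an illustrative sketch, not the CGCNN implementation: it assumes \(g\) is the softplus function, that every atom has the same number of neighbor slots \(M\), and that neighbors are given as an index array; the function names and array shapes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softplus(x):
    # Assumption: g is taken to be softplus; the text only says "activation function".
    return np.logaddexp(0.0, x)


def cgcnn_conv(v, nbr_idx, u, W_f, b_f, W_s, b_s):
    """One gated convolution step, Eq. (2).

    v:       (N, F) atom feature vectors v_i
    nbr_idx: (N, M) integer indices of the M neighbors of each atom
    u:       (N, M, B) bond feature vectors
    W_f/W_s: (2F + B, F) weight matrices; b_f/b_s: (F,) biases
    """
    n_nbrs = nbr_idx.shape[1]
    v_i = np.repeat(v[:, None, :], n_nbrs, axis=1)   # (N, M, F) center atoms
    v_j = v[nbr_idx]                                 # (N, M, F) neighbor atoms
    z = np.concatenate([v_i, v_j, u], axis=2)        # (N, M, 2F + B)
    gate = sigmoid(z @ W_f + b_f)                    # sigma(z W_f + b_f)
    core = softplus(z @ W_s + b_s)                   # g(z W_s + b_s)
    return v + (gate * core).sum(axis=1)             # residual update of v_i


def normalized_sum_pool(v):
    """Normalized summation over atoms: one crystal-level feature vector."""
    return v.sum(axis=0) / v.shape[0]
```

Stacking `cgcnn_conv` \(R\) times and applying `normalized_sum_pool` to the result gives the crystal representation that the output layers map to the target property.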

Structural descriptors

ML techniques have shown great promise for the screening and discovery of crystal structures because of their high efficiency in predicting material properties compared with computationally demanding DFT calculations. However, in an ML-based approach, the accuracy depends on how the compounds or crystal structures are represented in the dataset74. Transforming the input data into a suitable representation of atoms for ML is a necessary step, as it reduces the amount of training data required and helps increase the accuracy11. This transformation of the input data is called descriptor (or feature) extraction or engineering. Selecting a good descriptor is a very important step in ML model training, because a good descriptor explains the target property well, which leads to a robust prediction model for that property74. Combining descriptors with ML methods leads to models capable of accurately predicting structure properties. Chemical descriptors based on elemental properties have been successfully applied in various computational discovery studies75; nonetheless, they are not suitable for modeling crystal structures with the same composition, since they ignore structural information76. In this study, we use CFID38,76 as descriptors, because they cover a wide range of crystal structures and combine elemental and structural representations74,77. Such combined descriptors have been applied not only to crystalline systems but also to molecular systems74. The elemental representations used in this study include atomic number, atomic mass, period and group in the periodic table, first ionization energy, second ionization energy, electron affinity, Pauling electronegativity, Allen electronegativity, van der Waals radius, covalent radius, atomic radius, melting and boiling point, density, molar volume, heat of fusion, heat of vaporization, thermal conductivity, and specific heat. These elemental descriptors help capture essential information about compounds. The structural representations include simple coordination number, Voronoi polyhedron of the central atom, angular distribution function, radial distribution function, bond-orientational order parameter78, and angular Fourier series79. The CFID consists of 1557 descriptors in total for each crystal structure: 438 average chemical, 4 simulation box size, 378 radial charge distribution, 100 radial distribution, 179 angle distribution up to the first neighbor, 179 angle distribution up to the second neighbor, 179 dihedral angle up to the first neighbor, and 100 nearest neighbor descriptors.
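As a quick consistency check, the block sizes listed above can be summed and mapped to index ranges of the descriptor vector. The sketch below assumes the blocks are concatenated in the order in which they are listed here, which may differ from the ordering in the actual CFID implementation; `CFID_BLOCKS` and `block_slices` are hypothetical names.

```python
# Block sizes as listed in the text; the ordering is an assumption.
CFID_BLOCKS = {
    "average chemical": 438,
    "simulation box size": 4,
    "radial charge distribution": 378,
    "radial distribution": 100,
    "angle distribution (first neighbor)": 179,
    "angle distribution (second neighbor)": 179,
    "dihedral angle (first neighbor)": 179,
    "nearest neighbor": 100,
}


def block_slices(blocks):
    """Map each block name to its slice of the concatenated descriptor vector."""
    slices, start = {}, 0
    for name, size in blocks.items():
        slices[name] = slice(start, start + size)
        start += size
    return slices, start


slices, total = block_slices(CFID_BLOCKS)
print(total)  # 1557
```

Such a mapping is convenient when inspecting which descriptor block a trained model relies on, for example by summing feature importances per block.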

Computational workflow and BLOX algorithm

The computational workflow of this study is illustrated schematically in Fig. 11. As initial preparation, we randomly selected 2000 materials from the database and calculated their mechanical properties by DFT. These 2000 DFT data served as the observed data that initiate the whole process. We transformed these structures into ML input using CFID as descriptors. We also trained an RF model and used it to predict the properties of the structures from the Materials Project database (unchecked data). After that, the search was performed by repeating the following steps. Step 1: construct a property prediction model; Step 2: recommend promising candidates ranked by the SN score, which is based on the kernel-based Stein discrepancy; Step 3: evaluate the recommended candidates by DFT27 and add the DFT data to the training dataset for the next round of ML training. In Step 1, an RF model is built as the property prediction model on the already evaluated materials and their property data. Structures with high SN scores were recommended for evaluation as potential candidates. The SN score of each unchecked material intuitively measures how much its predicted property deviates from the observed properties28, as given in the equation below:

$${\rm{SN}}\left( {V \cup \left\{ p \right\}} \right) = {\rm{SD}}\left( V \right) - {\rm{SD}}\left( {V \cup \left\{ p \right\}} \right)$$
(3)

where SD(V) is the Stein discrepancy of the evaluated (observed) data \(V\), \(p\) is the point predicted by ML, and \(\cup\) is the union operator of set theory. We select the candidates with the top SN scores. The SN score is based on the Stein discrepancy, which can measure the distance between any two distributions in a space of any dimension.
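Eq. (3) can be illustrated with a small NumPy sketch. The specific estimator is an assumption made for illustration and may differ from the one used in BLOX: SD is taken here as a V-statistic estimate of the squared kernel Stein discrepancy with an RBF base kernel of bandwidth `h` and a uniform target distribution (zero score function), for which the Stein kernel reduces to \(k(x,x')\left( d/h^2 - \lVert x - x'\rVert^2/h^4 \right)\). All function names are hypothetical.

```python
import numpy as np


def stein_kernel_uniform(x, y, h=1.0):
    """Stein kernel for an RBF base kernel and a uniform target (assumption)."""
    d = x.shape[-1]
    r2 = np.sum((x - y) ** 2, axis=-1)
    return np.exp(-r2 / (2.0 * h**2)) * (d / h**2 - r2 / h**4)


def stein_discrepancy(points, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy SD(V)."""
    pts = np.asarray(points, dtype=float)
    u = stein_kernel_uniform(pts[:, None, :], pts[None, :, :], h)
    return u.mean()


def stein_novelty(observed, candidate, h=1.0):
    """SN(V U {p}) = SD(V) - SD(V U {p}), as in Eq. (3)."""
    with_p = np.vstack([observed, candidate])
    return stein_discrepancy(observed, h) - stein_discrepancy(with_p, h)


# Toy example in a one-dimensional property space.
observed = np.array([[0.0], [0.1]])      # properties already evaluated
predicted = np.array([[0.05], [2.0]])    # ML predictions for unchecked data
scores = np.array([stein_novelty(observed, p) for p in predicted])
best = int(np.argmax(scores))            # candidate recommended first
```

In this toy case the prediction far from the observed properties receives the higher score, which matches the intent of recommending structures whose predicted properties deviate from what has already been observed.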

Fig. 11: Schematic of the workflow of the BLOX algorithm.

The DFT/BLOX/recommendation loop was performed for at least four rounds, until the BLOX algorithm no longer recommended a significant number of structures with the material properties of interest.
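The iterative search, including this stopping rule, can be sketched as a generic loop. This is a schematic under stated assumptions rather than the BLOX code: `fit`, `score`, `evaluate`, and `is_extreme` are hypothetical callables standing in for RF training, the SN score, the DFT calculation, and the criterion for an interesting property, and `min_hits` and `max_rounds` are illustrative parameters.

```python
def run_search(observed, unchecked, fit, score, evaluate, is_extreme,
               batch_size=250, min_rounds=4, min_hits=1, max_rounds=20):
    """Repeat: fit model, rank unchecked items, evaluate the top batch.

    Stops after at least min_rounds once a round yields fewer than
    min_hits interesting results (or when nothing is left to check).
    """
    observed = dict(observed)        # item -> evaluated property
    unchecked = list(unchecked)
    hits_per_round = []
    for round_no in range(1, max_rounds + 1):
        if not unchecked:
            break
        model = fit(observed)                                    # Step 1
        ranked = sorted(unchecked,
                        key=lambda s: score(model, observed, s),
                        reverse=True)                            # Step 2
        batch = ranked[:batch_size]
        hits = 0
        for s in batch:                                          # Step 3
            value = evaluate(s)
            observed[s] = value      # feeds the next round of training
            hits += bool(is_extreme(value))
        chosen = set(batch)
        unchecked = [s for s in unchecked if s not in chosen]
        hits_per_round.append(hits)
        if round_no >= min_rounds and hits < min_hits:
            break
    return observed, hits_per_round
```

With `batch_size=250` this corresponds to one round of 250 DFT evaluations; the returned list records how many interesting structures each round produced, which is the quantity the stopping rule monitors.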