Introduction

NiTi (nitinol) is a premier shape memory alloy (SMA) with recoverable strain up to 10%, and martensite start temperature (Ms) below 100 °C [1, 2]. The promising adaptive and intelligent functional properties [3] of NiTi have opened up the window for applications in various sectors such as medical, automobile, oil and aerospace industries. However, the low Ms has limited applications of NiTi alloys in high temperature environments. With the addition of the third or even the fourth elements such as Au, Pd, Pt, Hf and Zr, Ms of NiTi-based alloys can be increased well above 100 °C [4,5,6,7,8,9]. In practice, SMAs with Ms above 100 °C are classified as high temperature shape memory alloys (HTSMAs) [7] and have been extensively explored over the past decade [10,11,12,13,14,15,16,17]. In particular, NiTi-based HTSMAs have drawn the attention of various industries such as space and oil sectors [7, 18], thanks to their elevated transformation temperatures up to 1100 °C, high strength over 2 GPa, and decent functional properties. An exemplar is the addition of Hf which improves the overall functionalities of NiTi-based HTSMAs at a significantly reduced cost. Such NiTiHf alloys satisfy the requirements for actuator applications in spacecraft as they are able to undergo repeatable thermo-mechanical cycles with negligible deterioration in functional and mechanical properties.

The maximum Ms reported in NiTiHf alloys (Hf content between 0 and 30 at.%) reaches 525 °C with recoverable strain exceeding 3% [19,20,21,22]. Of the NiTiHf alloys, Ni50.3Ti29.7Hf20 is the most investigated alloy, exhibiting an austenite finish temperature of 178 °C and work output of 16 J/cm3 [23]. Nevertheless, the actuation strain (Fig. 1b) achieved in Ni50.3Ti29.7Hf20 remains low and needs to be improved for compact and efficient actuation applications. To date, more than 200 studies have been devoted to the development of NiTiHf alloys [7], revealing that their functional and mechanical performance mostly depends on alloy composition. For instance, the increase of Ni content from 49.8 to 51.3 at.% dramatically decreases the Ms temperature of the NiTiHf alloy system from 500 to −200 °C [9]. Pinpointing the right alloy composition is thus desired to develop NiTiHf alloys with balanced functional and mechanical properties. Traditionally, alloy development is a time-consuming process and low in accuracy, with 68% unsuccessful attempts in NiTiHf alloy development [24]. Furthermore, due to the complexity of phase transformations and their compositional sensitivity, traditional alloy design has become more challenging. More recently, instead of finding appropriate compositions through trial-and-error experiments, the availability of a large amount of published data on NiTiHf motivates a more data-driven pathway for alloy development.

Figure 1

Essential materials properties of shape memory alloys for actuators. a Phase transformation temperatures and thermal hysteresis (commonly defined as the difference between Af and Ms) shown in the typical DSC curves [25]. Ms: martensite start temperature, Mf: martensite finish temperature, As: austenite start temperature, Af: austenite finish temperature. b Work output (WO) is defined in the two-way shape memory effect curve [25] as the actuation strain or the maximum recoverable strain

Machine learning (ML), a data-driven methodology, can learn from sufficiently large data and make accurate predictions [26]. The high accuracy offered by data-driven ML has attracted growing attention in novel alloy development [27,28,29,30,31,32,33]. With the aid of ML, Wen et al. [32,33,34] developed novel Al-Co-Cr-Cu-Fe-Ni high-entropy alloys with improved hardness. In their work, eight different ML models were established from 155 experimental data points, with the lowest reported root mean square error (RMSE) of 31 obtained from support vector regression (with a radial basis function kernel). In another study, new titanium alloys with low Young’s moduli for biomedical applications were identified through deep learning artificial neural networks (ANN) [35]. More specifically, Young’s moduli (from 164 data points) and Ms (from 112 data points) of titanium alloys were reliably predicted by ANN with low RMSE of 12.7 and 33.4, respectively.

With respect to NiTi-based alloys, many researchers have taken advantage of ML in alloy design and development despite the lack of “big data”. The recent development for the prediction of transformation temperature (TT) in Ni-Ti-Cu-Fe-Pd alloys by Gaussian process regression (GPR) was trained and tested by a total of 54 data points. The performance evaluation of the model was conducted through the mean absolute error (MAE), RMSE, and correlation coefficient (CC), which are reported as 0.4449, 0.8081 and 0.9999, respectively [36]. In another work [12], the process parameter optimization for additively manufactured Ni50.4Ti29.6Hf20 by selective laser melting (SLM) was achieved through an ANN model with an R2 (refers to Eq. 2 in Sect. 3) of 0.9958 and a residual sum of squares of 0.00683. Furthermore, a recent work [33] was implemented to identify the narrowest thermal hysteresis of NiTiCu alloy using ANN.

For SMAs in actuation applications, functional performance is often characterised with respect to three key materials properties: martensite start temperature (Ms), thermal hysteresis (TH) and work output (WO). These properties determine the capacity of SMAs to deliver maximised energy density (closely related to WO) in high-temperature actuation applications. As shown in Fig. 1a, Ms represents the start temperature of the martensitic transformation and TH is the difference between Af and Ms in most studies. WO in Fig. 1b represents the actuation strain or the maximum recoverable strain, which is an indicator of energy density.

To date, limited effort has been devoted to the ML development of NiTiHf alloys for high-temperature actuators, except for a few recent studies [24, 37, 38]. This has led to the current initiative to identify new NiTiHf alloy compositions with balanced performance in terms of Ms, TH and WO. On this basis, the current study has adopted a data-driven approach to identify novel high-temperature NiTiHf SMAs for actuators. This has been implemented by evaluating seven well-known ML models (Table 1). Three important attributes for actuator design, Ms [9], TH [33, 39] and WO [40], are considered in this study. A comparison of the literature on NiTiHf alloys shows that the transformation characteristics may vary for the same alloy composition, owing to the extreme sensitivity of the functional properties to the alloying elements, especially Ni. This high sensitivity of the properties to the alloy composition poses difficulties in screening reliable data for ML and has been addressed recently in a few studies [13, 41]. In this regard, it is imperative to develop a holistic ML approach with acceptable accuracy and reliability to predict NiTiHf alloys with desired properties.

Table 1 Adopted machine learning models [28, 29, 34]

Compared to the recently developed ML models on NiTiHf alloy development [24, 37, 38], the combination of all three data sets in this work further complicates the search for the most suitable alloys. In this work, the extracted raw data via ML is used to provide data distribution patterns among the properties. Another advantage of the ML approach developed here is the capability for fast automated identification of various alloy compositions with target properties and property customisation. Two questions have been addressed in the present study, including the correlation among Ms, TH and WO as well as their compositional dependence. To achieve this, a cascade data modelling approach [42] based on ML is used. Thereby, easy customisation and prompt addition of data to consider additional properties will guide the development of ML algorithms. A hierarchical ML structure enables users to change/re-arrange the property prioritisation with minimal impact on the prediction accuracy.

Seven ML models are used to understand the relationship between alloy compositions and their properties in NiTiHf HTSMAs, followed by a statistical evaluation to select the best fit ML model. Subsequently, each material property has an individual dataset for training, testing, and validation. In three different layers of the ML algorithm, a logical filtering method is used to identify NiTiHf alloys with an appropriate combination of Ms, TH and WO.

Methodology

Data collection

Data have been collected for the NixTiyHfz system from published experimental data [5, 8, 9, 20, 25, 39, 40, 43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60], with alloy compositions as input and Ms, TH and WO as output variables in the ML models. Only the thermal and functional properties of the NiTiHf alloys were considered and no other properties were taken into account.

It should be mentioned that data for Ms, TH, and WO for the alloy system was collected individually and only alloys undergoing similar post-processing conditions with homogenisation at 1050 °C/3 h and without any further ageing treatment were considered to minimise the influence of the processing method. Arc melted and vacuum induction melted alloys were selected to minimise the impact of alloy manufacturing methods from different literature sources. The property difference of the alloys manufactured by these two methods is minimal [13, 61]. Data post processing was carefully performed by removing inconsistent data from the ML data sheet to improve the consistency.

The collected data have been visualised in Fig. 2 showing 3D plots (a, b, and c), density contour plots (d, e, and f), and statistical summary of the selected data (g, h, and i). These plots present the distribution of data used in the ML model implementation and also reveal the most accurate region of the ML model. According to the density contour plots shown in Fig. 2, some composition regions of alloys have more data than others. The highly dense composition regions have higher accuracy than other regions. For instance, many alloys have Ni content between 50 and 51 at.%. Beyond this Ni range, very limited data are available in the literature.

Figure 2

A snapshot of data used in ML modelling; a, d, g Data for Ms [8, 9, 20, 25, 40, 43,44,45,46,47,48,49,50,51,52,53,54,55,56,57]; b, e, h Data for TH [5, 8, 9, 25, 39, 40, 43, 45, 46, 48,49,50,51,52,53,54, 58, 59]; c, f, i Data for WO [5, 8, 40, 51,52,53, 60]. The 3D graphs a, b, c show the distribution of each property against the Ni and Hf percentage. The density contours d, e, f show where the data points are concentrated in the composition space of the NiTiHf alloys. Yellow and orange contours contain a larger number of data points. g, h, i provide basic statistical information for each data set

Machine learning techniques

Seven well-known supervised ML models (Table 1) were employed to enable the learning. The selected ML models are commonly used in a number of recent alloy development studies [12, 24, 34, 37, 38]. The adopted ML models have the capacity to reveal various possible input–output relationships including linear, nonlinear, polynomial, and nonparametric, which cover simple to complex relationships. Each dataset was used for training, testing, and validation of the selected ML models. Prior to training and testing, Python’s GridSearchCV was used to optimise the hyperparameters of each ML model and an optimised K-value was identified separately for the KNN model. The attributes of the ANN model have been optimised by changing the number of layers and the weights. Statistical analysis was conducted to find the best fit model.

Each data set for Ms, TH, and WO was assigned the best performing ML model after critical evaluation of the statistical measurements. These ML models have been denoted as MLMS, MLTH, and MLWO. As illustrated in Fig. 3, the present ML algorithm has multiple steps as described below.

  • MLMS is given the priority to identify X number of alloy compositions with Ms between 200 and 400 °C.

  • Among the alloy compositions selected through MLMS, MLTH is performed to identify Y number of alloy compositions with minimum or maximum TH.

  • With MLWO, Z number of alloy compositions offering maximum WO has been filtered from those developed through MLTH.
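The three steps above can be sketched as a short filter chain. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: `cascade_select`, its parameters and the toy predictors are hypothetical names, and the predictors stand in for the trained MLMS, MLTH and MLWO models.

```python
from typing import Callable, Sequence

Composition = tuple[float, float, float]  # (Ni, Ti, Hf) in at.%


def cascade_select(
    candidates: Sequence[Composition],
    predict_ms: Callable[[Composition], float],
    predict_th: Callable[[Composition], float],
    predict_wo: Callable[[Composition], float],
    ms_window: tuple[float, float] = (200.0, 400.0),
    y: int = 100,
    z: int = 10,
    largest_th: bool = True,
) -> list[Composition]:
    """Three-layer filter: Ms window -> Y compositions by TH -> Z by WO."""
    lo, hi = ms_window
    # Layer 1 (MLMS): keep the X compositions whose predicted Ms is in the window.
    layer1 = [c for c in candidates if lo <= predict_ms(c) <= hi]
    # Layer 2 (MLTH): rank by predicted TH (largest or smallest first), keep Y.
    layer2 = sorted(layer1, key=predict_th, reverse=largest_th)[:y]
    # Layer 3 (MLWO): rank by predicted WO, keep the Z with the highest WO.
    return sorted(layer2, key=predict_wo, reverse=True)[:z]


# Toy stand-ins for the trained models (hypothetical, for illustration only).
toy_ms = lambda c: 300.0 if c[0] < 51.0 else 100.0
toy_th = lambda c: c[2] + c[0] / 100.0
toy_wo = lambda c: c[1]

candidates = [
    (50.0, 30.0, 20.0),
    (50.5, 29.5, 20.0),
    (51.0, 29.0, 20.0),
    (50.0, 25.0, 25.0),
]
selected = cascade_select(candidates, toy_ms, toy_th, toy_wo, y=2, z=1)
```

Because each layer only consumes the output of the previous one, changing the property priority amounts to reordering the three steps, and X ≥ Y ≥ Z holds by construction.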

Figure 3

ML algorithm. The algorithm starts with data collection and pre-processing. MLMS was trained, tested, and validated as the first step of the ML training process, identifying new NiTiHf alloy compositions at a compositional resolution of 0.25 at.%. The number of compositions depends on the user’s requirements and can be customised. Then, MLTH was trained, tested, and validated. The compositions identified from MLMS were used to predict the TH. The compositions with larger TH were brought forward. Finally, MLWO was trained, tested, and validated to find the WO for the compositions filtered from MLTH. The compositions with larger WO were taken forward as the final composition/s. (Note: X ≥ Y ≥ Z)
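A candidate set at 0.25 at.% resolution can be enumerated as in the sketch below. The function name and the default bounds are assumptions made for illustration: the Ni range of 49 to 51.75 at.% and the Hf range of 0 to 30 at.% are borrowed from values quoted elsewhere in this article, Ti is taken as the balance, and the exact grid used in the study may differ.

```python
def composition_grid(ni_range=(49.0, 51.75), hf_range=(0.0, 30.0), step=0.25):
    """Enumerate (Ni, Ti, Hf) in at.% on a regular grid, with Ti as the balance."""
    # Work in integer multiples of `step` so floating-point drift cannot
    # create duplicate or missing grid points.
    n_lo, n_hi = round(ni_range[0] / step), round(ni_range[1] / step)
    h_lo, h_hi = round(hf_range[0] / step), round(hf_range[1] / step)
    total = round(100.0 / step)
    grid = []
    for n in range(n_lo, n_hi + 1):
        for h in range(h_lo, h_hi + 1):
            t = total - n - h  # Ti balances the composition to 100 at.%
            if t >= 0:
                grid.append((n * step, t * step, h * step))
    return grid


grid = composition_grid()
```

Each tuple in `grid` can then be passed to the trained property models as a candidate composition.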

Complex correlation exists between alloy composition and attainable properties of the NiTiHf alloys, making manual selection very challenging. For instance, multiple NiTiHf alloy compositions could provide similar Ms, but their TH and WO might be significantly different. Another advantage of this proposed ML algorithm is its ability in customisation with minimal effort. For instance, it has the flexibility to alter the priority of properties according to users’ application requirements.

Testing of the results

Testing of the ML model is important to evaluate the accuracy of the predictions. The best method is to use the experimental data that has not been employed in the learning/training cycle. In this study, a set of data (70%) was used for the learning/training cycle, while the remaining data points (30%) were adopted for testing and validation.

Machine learning models

As the initial step, the data for the ML training and testing process were collected and pre-processed. The individual data sets collected for each property, Ms (1423 data points), TH (467 data points), and WO (176 data points), are illustrated in Fig. 2a-c. The compositional range of NiTiHf in each dataset is illustrated in Fig. 2d-f, and the statistical data summary is presented in Fig. 2g-i.

The ML model was evaluated in terms of several typical parameters. The RMSE is calculated using Eq. 1 [34] for each ML model.

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {y_{i} - \hat{y}_{i} } \right)^{2} }$$
(1)

where \({y}_{i}\) is the true value, \({\widehat{y}}_{i}\) is the predicted value, and n is the sample size. To minimise the error of prediction, it is desirable to achieve low RMSE values. The R2 value [12] quantifies how the model is fitted to the given data, and can be calculated by

$$R^{2} = 1 - \frac{{{\text{RSS}}}}{{{\text{TSS}}}}$$
(2)

where RSS is the sum of the square of residuals and TSS is the total sum of squares. The RSS and TSS can be defined as follows in Eqs. 3 and 4.

$${\text{RSS}} = { }\mathop \sum \limits_{i = 1}^{n} \left( {y_{i} - y_{{{\text{regression}}}} { }} \right)^{2}$$
(3)
$${\text{TSS}} = { }\mathop \sum \limits_{i = 1}^{n} \left( {y_{i} - y^{\prime}{ }} \right)^{2}$$
(4)

where \({y}_{i}\), \({y}^{^{\prime}}\), and \({y}_{\mathrm{regression}}\) are individual data point, mean value, and linear regression predicted value, respectively. When more than one variable is presented, the Adj_R2 value in Eq. 5 [12] provides a better comparison for the reliability of the model.

$${\text{Adj}}\_R^{2} = 1 - \frac{{\left( {1 - R^{2} } \right)\left( {N - 1} \right)}}{N - p - 1}$$
(5)

where N and p represent the total sample size and number of independent variables, respectively. The PCC value presented in Eq. 6 [36] quantifies the strength and direction of the relationship between variables and is expressed as

$${\text{PCC}} = { }\frac{{\sum \left( {x_{i} - \overline{x}} \right)\left( {y_{i} - \overline{y}} \right)}}{{\sqrt {\sum \left( {x_{i} - \overline{x}} \right)^{2} \sum \left( {y_{i} - \overline{y}} \right)^{2} } }}$$
(6)

where \({x}_{i}\) and \({y}_{i}\) are variable samples; \(\overline{x }\) and \(\overline{y }\) are mean values of variables. The aforementioned score techniques are used in the ML model selection process to identify the best model for each dataset.
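The four scores can be computed directly from their definitions. The sketch below uses only the Python standard library; the function names are illustrative, and the adjusted R2 is written in its standard form, 1 − (1 − R2)(N − 1)/(N − p − 1).

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """R2 = 1 - RSS / TSS."""
    mean = sum(y_true) / len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - rss / tss


def adjusted_r_squared(y_true, y_pred, p):
    """Adjusted R2 for N samples and p independent variables."""
    n = len(y_true)
    return 1.0 - (1.0 - r_squared(y_true, y_pred)) * (n - 1) / (n - p - 1)


def pcc(x, y):
    """Pearson correlation coefficient between two samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

For model comparison, `pcc` would typically be evaluated between the experimental and predicted values of the same property.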

ML models

The ML models outlined in Sect. 2.3 were trained and tested individually using the collected data. Initially, the datasets were split into two for training and testing purposes using the train-test split function in Python [34]. The train-test split ratios were maintained at 70 to 30%. The split data were then used in the ML models for training and testing and statistical information was collected for further evaluation.
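A 70/30 split of this kind can be sketched with the standard library alone. The function below is a hypothetical stand-in, not the code used in the study; in practice a library routine such as scikit-learn's `train_test_split` would normally be used, and the shuffling seed here is an arbitrary assumption.

```python
import random


def split_70_30(rows, test_fraction=0.30, seed=0):
    """Shuffle the rows reproducibly and return (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder rows; real rows would hold (composition, property) pairs.
rows = list(range(100))
train, test = split_70_30(rows)
```

Shuffling before splitting matters here because the collected data are grouped by literature source, so an unshuffled split could leave whole composition regions out of training.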

ANN model

The ANN model is an interconnected network of computational units known as artificial neurons. Neural network systems are constructed to understand the correlation among the data [62]. Even though the ANN model has been used in both supervised and unsupervised ML modelling, this work focuses on supervised modelling (regression). Figure 4 shows the trend of data training for each ANN model used with three individual property data.

Figure 4

Train/validation loss for ANN models. a, b, and c represent the Ms, TH, and WO ANN models, respectively. During the training process, a min–max scaler was used to minimise the error due to a large data range. As an effect of the min–max scaler, the RMSE shown on the Y-axis is computed on scaled values and is therefore lower than the real RMSE
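The relation between the scaled and real RMSE follows from the scaling itself: if every value is mapped to (y − min)/(max − min), each error shrinks by the factor 1/(max − min), so the real RMSE equals the scaled RMSE multiplied by (max − min). The sketch below illustrates this under the assumption that the scaler is applied to the target values; the numbers are made up and are not taken from the study's data.

```python
import math


def rmse(y_true, y_pred):
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def min_max_scale(values):
    """Scale values to [0, 1] and return the bounds used."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi


# Hypothetical Ms values in degC, for illustration only.
ms_true = [100.0, 200.0, 300.0, 500.0]
ms_pred = [110.0, 190.0, 320.0, 480.0]

scaled_true, lo, hi = min_max_scale(ms_true)
# Predictions must be scaled with the same bounds as the targets.
scaled_pred = [(v - lo) / (hi - lo) for v in ms_pred]

rmse_scaled = rmse(scaled_true, scaled_pred)
rmse_real = rmse_scaled * (hi - lo)  # undo the scaling of the errors
```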

Regression models

The regression models were trained and tested in parallel with the ANN model. Figs. S1, S2 and S3 illustrate the trend of training samples. For instance, the network diagrams in Fig. 5a-c provide details on the trend of training data of the KNN model for MLMS, MLTH, and MLWO, respectively. These network diagrams show how closely the experimental data fit the prediction based on the ML models.

Figure 5

ML model (regression model) assessment. a (Ms), b (TH), and c (WO) illustrate the trend of the training data vs experimental data of the KNN model, in which a red line (testing data) lying on the blue line (training data) indicates higher accuracy. Less deviation between the lines indicates lower RMSE. d (Ms), e (TH), and f (WO) are the scatter plots which show the experimental data and predicted data (based on the ML) for each composition. Scatters closer to the diagonal indicate higher accuracy

The predicted values as a function of the experimental values of the KNN model for MLMS, MLTH, and MLWO are presented in Fig. 5d-f, respectively. The scatters close to the diagonal lines have smaller RMSE than those away from the line. For comparison, more network diagrams and prediction versus experimental diagrams for the other five regression models are given in Figs. S2, S3 and S4.

K-nearest neighbouring method

The K-value optimisation of the KNN model during model selection is presented in Fig. S4. K-values from 1 to 50 were evaluated against the RMSE. For Ms and TH, the RMSE versus K-value curves followed the generic behaviour seen in similar situations, while the RMSE for WO did not change significantly after reaching its minimum. Nevertheless, the graphs were adequate to select the K-values that provide the lowest RMSE.
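A K-value sweep of this kind can be sketched as follows. This is a from-scratch illustration rather than the library-based implementation used in the study: the function names are hypothetical, and uniform averaging over Euclidean nearest neighbours is an assumption about the KNN variant.

```python
import math


def rmse(y_true, y_pred):
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def sweep_k(train_x, train_y, test_x, test_y, k_values=range(1, 51)):
    """Return the K with the lowest test RMSE and the full K -> RMSE map."""
    scores = {}
    for k in k_values:
        if k > len(train_x):
            break  # K cannot exceed the number of training points
        preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
        scores[k] = rmse(test_y, preds)
    best = min(scores, key=scores.get)
    return best, scores


# Toy one-dimensional data, for illustration only.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 2.0, 3.0]
test_x = [(1.0,), (2.0,)]
test_y = [1.0, 2.0]
best_k, scores = sweep_k(train_x, train_y, test_x, test_y)
```

Plotting `scores` against K gives the kind of K-value versus RMSE curve described above.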

Furthermore, the analysis of KNN graphs (KNN clusters, Fig. 6) is helpful for the interpretation of the data. The multiple similar colour clusters in each graph indicate that multiple compositions of NiTiHf alloys provide similar material properties. Especially, in Fig. 6a, upon closer inspection, several branches lead to a central point. Those multiple possible combinations of the alloy composition illustrate the difficulty of selecting a single composition through traditional means of alloy development.

Figure 6

KNN network graph. KNN graphs a, b, and c represent the Ms, TH, and WO, respectively. The K values selected in each case are based on the optimised K values. a When observed closely, several clusters appear and different clusters have similar colour, which means that different compositions could provide similar Ms; b Multiple blue and purple clusters can be observed with a single yellow cluster, which suggests that there is only one strategy to reach higher TH; c Cluster transition is smooth from low WO to high WO, but it seems that WO diminishes after reaching the peak WO values. More KNN graphs are shown in Fig. S5

Model selection and validation

The ML models were evaluated to identify the most suitable method (refer to supplementary Figs. S1, S2 and S3). Using the capability of the GridSearchCV library in Python, the optimised hyperparameters were identified for each model to improve the accuracy and to enable a fair comparison. In each ML model, using the Train-Test split method, the total dataset was divided for training and testing purposes. 70% of data have been used in training and the remaining 30% are used for testing. As illustrated in Figs. S1, S2 and S3, the testing data are plotted against the training data to identify any overfitting of the ML models. Furthermore, the K values of the KNN models are plotted against the RMSE in Fig. S4 which can be used to identify the most suitable K values. The performance evaluation starts once a high confidence is reached upon optimum fitting of the ML models. The performance measurements are illustrated in Fig. 7. Figure 7a-c provide the details on the RMSE error of each ML model for Ms, TH, and WO, respectively.

Figure 7

Statistical analysis and comparison. a, b, and c compare RMSE values of Ms, TH, and WO, respectively. RMSE was the main parameter considered for the selection of the ML model. d, e and f are R2, Adj_R2, and PCC of Ms, TH, and WO, respectively

RMSE in Eq. 1 provides a better understanding of the standard deviation of the predictions. The MLMS models with Lnr_r, Ply_r, SVR_rbf, SVR_lnr, SVR_ply, and ANN have RMSEs greater than 22.32, whereas KNN has the lowest RMSE of 5.11. Similarly, the RMSEs of Lnr_r, Ply_r, SVR_rbf, SVR_lnr, SVR_ply, and ANN with TH data were higher than 11.79, whereas the KNN model resulted in 1.17. RMSEs of data for WO with Lnr_r and SVR_rbf were greater than 4.58, in contrast to Ply_r and KNN with low RMSEs of 1.20. Furthermore, other statistical parameters, R2, Adj_R2, and PCC, are shown in Fig. 7d-f, where a value close to 100% means high accuracy of the ML model. According to RMSE, R2, Adj_R2, and PCC, it can be concluded that the KNN model yields the lowest RMSE for all three property datasets.

The RMSE values of the KNN models for MLMS, MLTH and MLWO were 5.11, 1.17, and 1.21, respectively. R2 and Adj_R2 of KNN models were 99% for MLMS and MLTH and 98% for MLWO, proving the reliability of the KNN model. The three datasets are positively correlated with a PCC of 99% for the KNN model. The highest RMSE (131.0) and lowest R2 (44%), Adj_R2 (43%), and PCC (44%) for MLMS were reported with the SVR_lnr model. SVR_ply showed the least accuracy with the highest RMSE (29.0) and lowest R2 (17%), Adj_R2 (15%), and PCC (57%). Furthermore, due to the high testing RMSE on the TH dataset (a test loss higher than the training loss indicates that the ANN model has overfitted the training data), the ANN model is disqualified from further consideration in this work. Therefore, KNN was selected as the best performing model for identifying the new alloy compositions.

ML model validation

To prove the model accuracy, it is important to test the selected ML models with data that has not been used previously in training and testing. The validation process after testing completes the selection process to identify an ML model that can suggest suitable NiTiHf alloy compositions based on the given properties. The scatter plots in Fig. 8 show the validation results of the ML models. Grey colour scatter represents the training data used previously throughout the ML model development process. Other data points (coloured) serve as validation data that has not been used in the training and testing process.

Figure 8

KNN model testing. A set of data has been used to validate the ML model after training and testing. The data used in the validation process has not been employed either in the training or the testing process. a, b, and c graphs represent Ms [8, 11, 63,64,65], TH [11, 39, 46, 63, 64], WO [8, 23, 66], respectively. The grey colour scatters represent the training data and coloured scatters represent the validation data. The scatters close to the diagonal axis prove the accuracy of the ML models

The composition of each testing data point was used to predict the respective property through the KNN model. The majority of the validation data is located close to the diagonal line in all three ML models. Some deviation identified in the testing process (testing data away from the diagonal line) is assumed to be due to:

  • A slight change in the processing condition of experimental work,

  • Impact of the temperature profile of furnaces (testing data are from different research groups, and furnaces used in the casting and homogenisation process might have various temperature profiles), and

  • Purity of the elements used in casting.

Results and discussion

ML model performance

Results in the previous section show that the accuracy of ML models can differ even with the same data set. Each ML model has its unique structure with underlying mathematical concepts and related libraries. Therefore, it is important to identify the best fit model through critical analysis of performance parameters before the application to real-world scenarios. Such an ML approach is a promising methodology with minimal effort to discover novel alloy compositions compared to conventional alloy development procedures.

The selected ML KNN model has provided lower RMSE with low computational cost (completed within a minute). However, the ANN model presented here requires a higher computational cost which took more than 180 min to run the whole process with the same processing capacity.

Among similar work reported in the literature, ML has been employed to improve hardness [34] of high entropy alloys, Young’s modulus prediction of beta-Ti alloys [35] and TT prediction of NiTi-based alloys [36]. Those studies focused on one material property and reported comparably low accuracy with respect to the present work which has considered multiple properties with improved accuracy.

Properties prioritisation and customisability

Many ML-assisted alloy developments [24, 30,31,32, 34,35,36] applied equal weights for each property in their developments. A distinguishing feature of the present work is property prioritisation according to specific applications. In this work, given the interest in actuator applications in space, the priority is given to Ms, followed by TH, and lastly WO. The algorithm developed has the potential to consider any sequence of Ms, TH, and WO with minimal effort. Furthermore, the addition of new properties at any stage is also possible in the developed algorithm. Therefore, the cascade model has multiple advantages over other similar work reported in the literature.

Figure 9 illustrates the effect of property prioritisation on the final suggested alloy compositions. Depending on the property priority, Fig. 9a-c show Ms, TH, and WO of 10 alloys identified with three different property sequences, respectively. For instance, the first sequence presented in Fig. 9a begins with Ms followed by TH and WO. The other two sequences start with TH or WO. The corresponding alloy compositions are illustrated via ternary phase diagrams in Fig. 9d-e.

Figure 9

Customised alloys. Three different property sequences have been compared in (a-c, properties) and (d-e, composition). 200 (X) compositions with highest values of the first property in each property sequence were first selected. Then, the number of compositions has been reduced to 100 (Y) by selecting the highest values for the second property in the sequence. Finally, 10 (Z) compositions with the highest value for the third property have been selected and presented as the final compositions

As revealed in Fig. 9a-c, for a given property sequence WO is much less sensitive to alloy composition and only varies marginally (\(\Delta WO<1 \mathrm{J}/{\mathrm{cm}}^{3}\)) compared to Ms and TH, which change in the range of 348.3–421.1 °C and 60–90 °C, respectively, e.g. for the Ms-TH-WO sequence. In each step of the property sequence, the highest values of all three properties are considered. As a result, all final alloy compositions are Ni-rich, consisting of 50.3 at.% Ni and Hf higher than 20 at.%, which offers comparatively higher thermal and functional properties [9].

Correlation among properties

This study addresses the identification of new NiTiHf alloy compositions based on Ms, TH, and WO for actuators used in space by considering three ML models, MLMS, MLTH, and MLWO that have been sequentially developed, trained, tested, and validated. The contour plot in Fig. 10 was constructed from 1391 new alloy compositions identified through the proposed ML models. The alloy compositions in Fig. 10 have been developed in a Ni range between 49 and 51.75 at.%. This contour plot illustrates the intercorrelation among Ms, TH, and WO of NiTiHf alloys with targeted properties. NiTiHf alloys with higher Ms (300 °C–600 °C) have a wide attainable TH window from 10 °C to 150 °C and moderate WO in an estimated range of 18–26 J/cm3. When Ms decreases, the TH window narrows down. For instance, the achievable range of TH for alloys with Ms below 100 °C is reduced to 10–100 °C while their WO is either below 20 J/cm3 or above 24 J/cm3.

Figure 10

Correlation among Ms, TH, and WO. A contour plot was developed, based on the ML model, to understand the correlation among the properties. Contours are complex and irregular with multiple similar colour contours. Multiple similar colour contours reveal the complexity of the alloy compositions. Data points for validation: point A (Ms,; TH,; WO,) [66], point B (Ms,; TH,; WO,) [8, 49], point C (Ms,; TH,; WO,) [53], point D (Ms,; TH,; WO,) [8, 24]

Compared to the relationship between TH and Ms, the variation of WO with Ms is more complex and less distinct. As illustrated in Fig. 10, lower Ms corresponds to either lower WO or higher WO. Such variation is associated with the functional properties (e.g. recoverable strain) of the NiTiHf alloy system. NiTiHf alloys tend to have a low Ms under two different compositional ranges: Ni-rich alloys with Ni > 51 at.% [9] and those with Hf below 12 at.% [8]. The former group of Ni-rich NiTiHf alloys has higher recoverable strain than the latter group. However, alloys with Ms above 300 °C give rise to WO between 18 and 24 J/cm3. When considering all three attributes (Ms, TH, WO), the search for alloys with desired properties is often not straightforward. For instance, alloys with Ms above 500 °C have the smallest range for both TH (35–60 °C) and WO (20–22 J/cm3) while alloys with Ms below 500 °C have a higher range of TH but limited WO. Therefore, the selection of the right alloy composition is critical and needs more attention.

Composition sensitivity

As mentioned earlier, the functional properties of NiTiHf alloys are highly sensitive to the alloy composition. The contour plots in Fig. 11 provide more information on compositional sensitivity. For NiTiHf alloys with a low Hf content (5 at.%), with increasing Ni content from 50 to 51 at.% Ms reduces at a rate of 2.59 °C per 0.01 at.%Ni. When the Hf content increases to 20 at.%, the reduction rate of Ms increases to 3.18 °C per 0.01 at.%Ni (between 50 and 51 at.% of Ni). A maximum Ms reduction rate of 6.12 °C per 0.01 at.%Ni (between 50 and 51 at.% of Ni) is attained in an alloy with a higher Hf value of 30 at.%. This implies that the transformation temperature of NiTiHf alloys is more sensitive to Ni variation with increasing Hf content. However, outside the range of 50–51 at.%Ni, these properties vary to a lesser extent with respect to Ni. For a given Ni content, the properties of the alloys change linearly with Hf or Ti content. The variation of TH and WO with Ni is relatively lower than that with Hf. Although TH and WO are also sensitive to the Hf content, they are not as critical as Ms on the Ni content (Fig. 10, points B & C). Therefore, when designing an alloy with a Ni content between 50 and 51 at.%, more attention is required for the variation of Ms.
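The quoted rates are finite-difference slopes of Ms with respect to Ni content. The sketch below shows the arithmetic, assuming each rate is an average over the full 50 to 51 at.% Ni interval; the function name and the two Ms values are hypothetical and chosen only to reproduce the 2.59 °C per 0.01 at.% Ni figure, which under that assumption corresponds to a total Ms drop of about 259 °C across the interval.

```python
def ms_rate_per_unit_ni(ms_low_ni, ms_high_ni, ni_low, ni_high, unit=0.01):
    """Average change in Ms (degC) per `unit` at.% Ni between two compositions."""
    return (ms_high_ni - ms_low_ni) / (ni_high - ni_low) * unit


# Hypothetical Ms values (degC) at 50 and 51 at.% Ni, for illustration only.
rate = ms_rate_per_unit_ni(400.0, 141.0, 50.0, 51.0)
```

A negative `rate` indicates that Ms falls as the Ni content rises.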

Figure 11

Property sensitivity to alloy composition. a–c Contour plots of Ms, TH, and WO, respectively, as functions of Ni and Hf content (at.%). d–f Contour plots of Ms, TH, and WO, respectively, as functions of Ni and Ti content (at.%). Ms of the NiTiHf alloy system is more sensitive to Ni content than to Hf and Ti content. Both Ms and TH vary significantly between 50 and 51 at.%Ni. WO is more sensitive to the Ti and Hf content than to the Ni content

Proposed alloy compositions for actuators in space

The scatter plot in Fig. 12 was developed following the logical flow illustrated in Fig. 3. The A, B, and C scatters in the graph represent successive stages of the selection procedure described in the methodology. The C scatters are the possible new alloy compositions for actuator applications in space, where the environment temperature can readily exceed 100 °C. The selected compositions fulfil three key requirements for high-temperature aerospace actuators: Ms within 200–400 °C, the largest TH among the compositions filtered through MLTH, and the largest WO among the compositions filtered through MLWO. The ML workflow is customisable: both the properties to be prioritised and the number of filtering steps can be adjusted.

Figure 12

Selection of composition based on the ML models. The scatter plot, overlaid on the contour plot, represents the selection algorithm of the ML model. Scatters A, B, and C (A + B + C = 592 scatters) have an Ms between 200 and 400 °C; B and C (B + C = 296 scatters) additionally lie in the highest 50% of TH among the 592 scatters; C (73 scatters) additionally lies in the highest 25% of WO among the 296 scatters. The area covered by scatters C contains the most suitable compositions for actuator applications in space
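The three-stage selection described in this caption can be sketched as a sequence of nested filters. This is a minimal illustration and not the authors' implementation: the function name `staged_filter`, the dictionary keys, the inclusive Ms bounds and the truncating percentile cut are all assumptions, and the candidate values below are synthetic random numbers rather than model predictions. Because the exact cut rule is not given in the text, the final count from such a sketch may differ slightly from the reported 73.

```python
import random


def staged_filter(candidates, ms_range=(200.0, 400.0), th_keep=0.50, wo_keep=0.25):
    """Return the nested groups (A+B+C, B+C, C) of the staged selection.

    Each candidate is a dict with hypothetical keys 'Ms', 'TH' and 'WO'
    holding predicted property values.
    """
    # Stage 1: keep compositions whose Ms lies inside the target window
    # (bounds treated as inclusive, an assumption).
    in_window = [c for c in candidates if ms_range[0] <= c["Ms"] <= ms_range[1]]
    # Stage 2: keep the highest-TH fraction of the stage-1 set.
    by_th = sorted(in_window, key=lambda c: c["TH"], reverse=True)
    top_th = by_th[: int(len(by_th) * th_keep)]
    # Stage 3: keep the highest-WO fraction of the stage-2 set.
    by_wo = sorted(top_th, key=lambda c: c["WO"], reverse=True)
    top_wo = by_wo[: int(len(by_wo) * wo_keep)]
    return in_window, top_th, top_wo


# Synthetic stand-in for predicted properties (NOT data from the study).
random.seed(0)
synthetic = [
    {"Ms": random.uniform(-200, 525), "TH": random.uniform(20, 120), "WO": random.uniform(5, 25)}
    for _ in range(1391)
]
abc, bc, c_only = staged_filter(synthetic)
print(len(abc), len(bc), len(c_only))
```

Changing `ms_range`, `th_keep` or `wo_keep`, or sorting stage 2 in ascending order to favour a small TH, corresponds to the different prioritisation choices mentioned in the text.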

Table 2 lists the alloy compositions suggested by ML when a large TH and a small TH, respectively, are prioritised. The alloy compositions in Table 2, rows 1–5, offer Ms over 200 °C, large TH and high WO, a combination superior to that of many existing NiTiHf alloys. In comparison, the alloy compositions in Table 2, rows 6–10, have smaller TH with relatively lower Ms and equivalent WO. In actuator applications, a larger TH offers better controllability, while a smaller TH enables a rapid actuation response [62]. The required TH thus depends on the application and the actuator mechanism. The experimental validation of the ML models and of the alloys suggested in Table 2 will be performed via comprehensive microstructural and functional characterisation in our future work.

Table 2 Ms & TH prioritised NiTiHf composition. Selected final composition for further experimental evaluation. The listed compositions (1–5) and (6–10) have been proposed for actuator applications with larger TH and smaller TH, respectively

Conclusion

As promising high-temperature shape memory materials, NiTiHf alloys demonstrate superior functional properties such as high transformation temperature, large recoverable strain and high work output. Nevertheless, these properties are very sensitive to chemical composition, which complicates the optimisation of alloys by means of trial-and-error experiments. In this study, machine learning has been shown to be a promising tool for facilitating this alloy optimisation process. Major findings include:

  • Seven machine learning models have been investigated based on the properties of NiTiHf alloys to identify new alloy compositions with superior properties for space applications. The K-nearest neighbours regression model was found to be accurate in calculating the properties of interest, with low root mean square errors (RMSE) for martensitic start temperature (5.11), thermal hysteresis (1.17) and work output (1.21).

  • Martensitic start temperature and thermal hysteresis of the NiTiHf alloys are more sensitive to Ni content, while work output is more sensitive to the Ti and Hf contents.

  • From 1391 alloy compositions, 73 NiTiHf alloys with high transformation temperature, appropriate thermal hysteresis and large work output were filtered through the machine learning algorithm as potential candidates for actuator applications in space.