Prediction of the critical temperature of a superconductor by using the WOA/MARS, Ridge, Lasso and Elastic-net machine learning techniques

This study builds a predictive model capable of estimating the critical temperature of a superconductor from experimentally determined physico-chemical properties of the material (input variables): features extracted from the thermal conductivity, atomic radius, valence, electron affinity and atomic mass. This original model is built using a novel hybrid algorithm based on the multivariate adaptive regression splines (MARS) technique in combination with a nature-inspired meta-heuristic optimization algorithm termed the whale optimization algorithm (WOA), which mimics the social behavior of humpback whales. Additionally, the Ridge, Lasso and Elastic-net regression models were fitted to the same experimental data for comparison purposes. The results of the current investigation indicate that the critical temperature of a superconductor can be successfully predicted using the proposed hybrid WOA/MARS-based model. Furthermore, the results obtained with the Ridge, Lasso and Elastic-net regression models are clearly worse than those obtained with the WOA/MARS-based model.


Introduction
Superconducting materials (materials that conduct current with zero resistance) have significant practical applications [1][2][3][4]. Perhaps the best-known application is in the Magnetic Resonance Imaging (MRI) systems widely employed by healthcare professionals for detailed internal body imaging. Other prominent applications include the superconducting coils used to maintain high magnetic fields in the Large Hadron Collider at CERN and the extremely sensitive magnetic field measuring devices called SQUIDs (Superconducting Quantum Interference Devices). Furthermore, superconductors could revolutionize the energy industry, as frictionless (zero resistance) superconducting wires and electrical systems could transport and deliver electricity with no energy loss.
A superconductor conducts current with zero resistance only at or below its superconducting critical temperature (T_c) [5][6][7][8][9]. Moreover, a scientific model or theory that predicts T_c remains an open problem, one that has baffled the scientific community since the discovery of superconductivity in 1911 by Heike Kamerlingh Onnes [1][2][3][4][5][6][7][8][9]. In the absence of any theory-based prediction models, we take here an entirely data-driven approach to create a statistical model that predicts T_c based on the material's chemical formula. Indeed, an alternative approach to the superconducting critical temperature prediction problem is the machine learning (ML) approach, which builds data-driven predictive models by exploring the relationship between material composition similarity and critical temperature. Machine learning methods need a sufficient amount of training data to be available [10][11][12][13][14], but the availability of an increasing number of materials databases with experimental properties allows the application of these methods to materials property prediction.
In this investigation, a new hybrid regressive model based on the multivariate adaptive regression splines (MARS) technique has been used to successfully predict the superconducting critical temperature T_c for different types of superconductors. This novel procedure, which combines the MARS approximation [15][16][17][18][19] with the whale optimization algorithm (WOA) [20][21][22], could be an attractive methodology that has not been tackled as of yet. For comparative purposes, the Ridge, Lasso, and Elastic-net regression models were also fitted to the same experimental dataset to estimate T_c and compare the results obtained [23][24][25][26][27][28][29]. The MARS technique itself is a statistical learning methodology, built upon statistics and mathematical analysis, that has the ability to deal with nonlinearities, including interactions among variables [30,31]. It is a nonparametric regression technique and can be seen as an extension of linear models that automatically model nonlinearities and complex interactions between variables. The MARS approximation presents some benefits in comparison with classical and metaheuristic regression techniques, including [32][33][34][35]: (1) avoiding physical models of the superconductor; (2) providing models that are more flexible than linear regression models; (3) creating models that are simple to understand and interpret; (4) allowing for the modeling of nonlinear relationships among the physico-chemical input variables of a superconductor; (5) offering a good bias-variance trade-off; and (6) providing an explicit mathematical formula of the dependent variable as a function of the independent variables through an expansion of basis functions (hinge functions and products of two or more hinge functions). This last feature is a fundamental and noteworthy difference compared to other alternative methods, as most of them behave like a black box.
Moreover, the WOA optimizer has been used to satisfactorily calculate the optimal MARS hyperparameters. In addition, previous research has indicated that MARS is a very effective tool in a large number of real applications, including soil erosion susceptibility prediction [36], rapid chloride permeability prediction of self-compacting concrete [37], evaluation of the earthquake-induced uplift displacement of tunnels [38], estimation of hourly global solar radiation [39], atypical algal proliferation modeling in a reservoir [40], estimation of the pressure drop produced by different filtering media in microirrigation sand filters [41], assessing the frost heave susceptibility of gravelly soils [42], and so on. However, it has never been used for evaluating the superconducting critical temperature T_c from the input physico-chemical parameters of most types of superconductors. This paper is structured as follows: Sect. 2 contains the experimental arrangement, all the variables included in this research, and the MARS, Ridge, Lasso, and Elastic-net methodologies; Sect. 3 presents the findings acquired with this novel technique, collating the WOA/MARS results with the observed values as well as the significance ranking of the input variables; and Sect. 4 concludes this study with a summary of the principal results of the research.

Dataset
The SuperCon database [43] is currently the biggest and most comprehensive database of superconductors in the world. It is free and open to the public, and it has been used in almost all ML studies of superconductors [44][45][46]. The SuperCon dataset was pre-processed for further research by Hamidieh [7], and this database is deposited in the University of California Irvine data repository [47]. As a result of the pre-treatment, materials with missing features were removed. The preliminary processing also included the formation of new features based on existing ones. Atomic mass, first ionization energy, atomic radius, density, electron affinity, fusion heat, thermal conductivity, and valence were taken as the initial 8 elemental properties (see Table 1). That is, for the chemical formula of each material, 10 statistical parameters of each property were calculated: mean, weighted mean, geometric mean, weighted geometric mean, entropy, weighted entropy, range, weighted range, standard deviation, and weighted standard deviation (see Table 2). This gives 8 × 10 = 80 features. One additional feature, a numeric variable counting the number of elements in the superconductor, is also extracted, for a total of 81 features. Thus, the data have 83 columns: 1 column with the name of the material (identification), 81 columns with the extracted features, and 1 column with the observed critical temperature (T_c) values. The dataset contains information for 21,263 superconductors, so there are 21,263 rows of data, and all 82 attributes for each material are numeric. The 81 extracted features are used as independent predictors (input variables) of the critical temperature (T_c), which is the dependent variable of the model. This approach to feature formation is quite general and well suited to the study of superconducting materials, given the general uncertainty about what the critical temperature depends on.
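As an illustrative sketch of this feature-extraction step (in Python; a subset of the ten statistics, with definitions we assume from the construction above, not copied from Table 2), consider the thermal conductivities of Re_7Zr_1 quoted later in the paper, t_1 = 48 W/(m K) for rhenium and t_2 = 23 W/(m K) for zirconium:

```python
import math

def extract_features(values, counts):
    """Compute a subset of the per-property statistics used as features.

    values: elemental property values (e.g., thermal conductivities)
    counts: number of atoms of each element in the chemical formula
    """
    n = len(values)
    total = sum(counts)
    fracs = [c / total for c in counts]                 # atomic fractions
    mean = sum(values) / n                              # plain mean
    wmean = sum(f * v for f, v in zip(fracs, values))   # weighted mean
    gmean = math.prod(values) ** (1.0 / n)              # geometric mean
    rng = max(values) - min(values)                     # range
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return {"mean": mean, "wmean": wmean, "gmean": gmean,
            "range": rng, "std": std}

# Re7Zr1: thermal conductivities 48 (Re, 7 atoms) and 23 (Zr, 1 atom)
feats = extract_features([48.0, 23.0], [7, 1])
# mean = 35.5, weighted mean = (7/8)*48 + (1/8)*23 = 44.875
```

The weighted variants reuse the atomic fractions in the same way; the remaining statistics of Table 2 (entropy, weighted range, etc.) follow the same pattern.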

Multivariate adaptive regression splines (MARS) approach
In statistical machine learning, multivariate adaptive regression splines (MARS) is a regression method first conceived by Friedman in 1991 which is appropriate for problems containing a large number of input variables [15][16][17][18][19]. The technique uses a nonparametric approach that can be understood as an extension of linear models which allows for considering interactions among input variables and nonlinearities. The MARS technique constructs models according to the following expansion [15][16][17][18][19]:

y = f(x) = c_0 + Σ_{i=1}^{M} c_i B_i(x)    (1)

Therefore, this technique approximates the dependent output variable y by means of a weighted sum of basis functions B_i(x), where the coefficients c_i are constant. Each B_i(x) can be [15][16][17][18][19]:
• a constant equal to 1. This term is called the intercept and corresponds to the term c_0;
• a hinge or hockey-stick function: a function of the form max(0, constant − x) or max(0, x − constant). The constant value is termed the knot. The MARS technique chooses the variables and their knot values according to the procedure indicated later;
• a product of hinge functions: in this case, these functions model nonlinear relationships between variables.
For instance, Fig. 1 shows a pair of splines for q = 1 at the knot t = 3.5.
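The mirrored hinge pair at the knot t = 3.5 of Fig. 1 can be written in a few lines (an illustrative sketch, not code from the paper):

```python
def hinge_right(x, knot):
    # max(0, x - knot): zero to the left of the knot, linear to the right
    return max(0.0, x - knot)

def hinge_left(x, knot):
    # max(0, knot - x): linear to the left of the knot, zero to the right
    return max(0.0, knot - x)

# Mirrored pair at the knot t = 3.5, as in Fig. 1
t = 3.5
values = [(x, hinge_left(x, t), hinge_right(x, t)) for x in (2.0, 3.5, 5.0)]
```

Each hinge is zero on one side of the knot, which is what lets a weighted sum of such terms fit a different slope on each side.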
Two stages provide the basis of the MARS method: first it constructs a very complex model in the forward phase, and then it simplifies it in the backward stage [19,30,34,48]:
• Forward phase: starting from the intercept term, the algorithm repeatedly adds the pair of hinge functions (variable and knot) that yields the largest reduction in the residual error. The coefficients c_i are then determined using linear regression. Terms are added until a certain threshold for the residual error or a maximum number of terms is reached.
• Backward stage: the previous stage usually constructs an overfitted model. In order to obtain a better model with greater generalization ability, this stage simplifies the model by removing terms, using the generalized cross-validation (GCV) criterion described below and removing first the terms that add more GCV to the model.
Generalized cross-validation (GCV) is the goodness-of-fit index utilized to assess the suitability of the terms of the model in order to prune them from the model. GCV takes into account not only the residual error but also the complexity of the model: high values of GCV mean high residual error and complexity. The formula of this index is [15-19, 30, 34, 48]:

GCV(M) = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)² / [1 − C(M)/n]²

where the parameter C(M) increases with the number of terms in the regression function and thus raises the value of the GCV index. It is given by [15][16][17][18][19]:

C(M) = M + d(M − 1)/2

where d is a coefficient that determines the importance of this penalty and M is the number of terms in Eq. (1). The relative importance of the independent variables that appear in the regression function (as only some of these variables remain in the final function) can be assessed using different criteria [15-19, 30, 34, 48]: (a) the GCV attached to a variable, measured by how much this index increases if the variable is erased from the final function; (b) the same criterion applied using the RSS index; (c) the number of subsets (Nsubset) of which the variable is a part: if it appears in more terms, its importance is greater.
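A minimal sketch of the GCV computation under these definitions (using the effective-parameter count C(M) = M + d(M − 1)/2, the convention of the earth package employed later in the paper; the numeric values are illustrative):

```python
def gcv(residuals, n_terms, penalty_d=2.0):
    """Generalized cross-validation score for a MARS model.

    residuals: y_i - yhat_i over the n training cases
    n_terms:   M, the number of terms in the model (Eq. (1))
    penalty_d: d, the cost per knot
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    c_m = n_terms + penalty_d * (n_terms - 1) / 2.0   # effective parameters C(M)
    return (rss / n) / (1.0 - c_m / n) ** 2

# With the same residual error, a model with more terms gets a worse (larger) GCV,
# which is what drives pruning in the backward stage
res = [0.5] * 100
small_model = gcv(res, n_terms=3)
big_model = gcv(res, n_terms=10)
```

This is the trade-off the backward stage exploits: removing a term is accepted when the increase in RSS is outweighed by the decrease in C(M).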

Whale optimization algorithm (WOA)
The whale optimization algorithm (WOA) is a recent technique for solving optimization problems that was first proposed by Mirjalili and Lewis in order to optimize numerical problems [20]. The algorithm simulates the highly intelligent hunting behavior of humpback whales. This foraging behavior is called the bubble-net feeding method and is only observed in humpback whales, which create bubbles to encircle their prey while hunting. The whales dive approximately 12 m deep, create a bubble spiral around their prey, and then swim upward to the surface following the bubbles. The mathematical model for the spiral bubble-net feeding behavior is given as follows [20][21][22]:
• Encircling prey
Humpback whales can recognize the location of prey and encircle them. Since the position of the optimum design in the search space is not known a priori, the WOA algorithm assumes that the current best candidate solution is the target prey or is close to the optimum. After the best search agent is defined, the other search agents try to update their positions toward the best search agent. This behavior is represented by the following equations:

D = |C · X_p(t) − X(t)|
X(t + 1) = X_p(t) − A · D

where t indicates the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, and X is the position vector of a whale. The vectors A and C are calculated as follows:

A = 2a · r_1 − a
C = 2 · r_2

where the components of a are linearly decreased from 2 to 0 over the course of the iterations and r_1, r_2 are random vectors in [0, 1].
[Table 2: Description of the procedure for feature extraction from a material's chemical formula. The last column gives an example: features based on the thermal conductivities of Re_7Zr_1, reported to two decimal places; the thermal conductivity coefficients of rhenium and zirconium are t_1 = 48 and t_2 = 23 W/(m K), respectively.]
• Exploitation phase: bubble-net attack method
The bubble-net strategy is a hybrid technique that combines two approaches, which can be mathematically modeled as follows [20][21][22]:
1. Shrinking encircling mechanism: this behavior is achieved by decreasing the value of a. Note that the fluctuation range of A is also decreased by a; in other words, A is a random value in the interval [−a, a], where a is decreased from 2 to 0 over the course of the iterations. Setting random values for A in [−1, 1], the new position of a search agent can be defined anywhere between the original position of the agent and the position of the current best agent.
2. Spiral updating position: this approach first calculates the distance between the whale located at (X, Y) and the prey located at (X*, Y*). A spiral equation is then created between the positions of whale and prey to mimic the helix-shaped movement of humpback whales:

X(t + 1) = D′ · e^{bl} · cos(2πl) + X*(t)

where D′ = |X*(t) − X(t)| is the distance between the i-th whale and the prey (the best solution obtained so far), b is a constant that defines the shape of the logarithmic spiral, and l is a random number in [−1, 1]. Note that humpback whales swim around the prey within an increasingly shrinking spiral-shaped path. In order to model this simultaneous behavior, it is assumed that there is a 50% probability of choosing between either the shrinking encircling mechanism or the spiral model to update the position of the whales during optimization. The mathematical model is as follows [20][21][22]:

X(t + 1) = X*(t) − A · D                   if p < 0.5
X(t + 1) = D′ · e^{bl} · cos(2πl) + X*(t)   if p ≥ 0.5

where p is a random number in [0, 1]. In addition to the bubble-net method, the humpback whales also search for prey randomly.
• Exploration phase: search for prey
The same approach based on the variation of the A vector can be utilized to search for prey (exploration). In fact, humpback whales search randomly according to their positions relative to each other.
Therefore, A is used with random values greater than 1 or less than −1 to force a search agent to move far away from a reference whale. In contrast to the exploitation phase, the position of a search agent in the exploration phase is updated according to a randomly chosen search agent instead of the best search agent. This mechanism, together with |A| > 1, emphasizes exploration and allows the WOA algorithm to perform a global search. The mathematical model is as follows [20][21][22]:

D = |C · X_rand − X|
X(t + 1) = X_rand − A · D

where X_rand is a random position vector (a random whale). The WOA algorithm starts with a set of random solutions. At each iteration, search agents update their positions with respect to either a randomly chosen search agent or the best solution obtained so far. The parameter a is decreased from 2 to 0 in order to provide exploration and exploitation, respectively. A random search agent is chosen when |A| > 1, while the best solution is selected when |A| < 1 for updating the position of the search agents.
Finally, the WOA algorithm is concluded upon the satisfaction of a termination criterion.
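The update rules above fit in a few lines of Python. The following is an illustrative minimal implementation (our sketch, not the MetaheuristicOpt code used in the paper), with the spiral constant b = 1 and a sphere function as a toy fitness, neither of which is specified by the source:

```python
import math
import random

def woa(fitness, dim, bounds, n_whales=40, n_iter=50, b=1.0, seed=42):
    """Minimal whale optimization algorithm (minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = min(X, key=fitness)[:]
    best_f = fitness(best)
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)            # a decreases linearly from 2 to 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a      # A is a random value in [-a, a]
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # encircle the best whale (exploitation) when |A| < 1,
                # a random whale (exploration) when |A| >= 1
                ref = best if abs(A) < 1.0 else X[rng.randrange(n_whales)]
                X[i] = [ref[j] - A * abs(C * ref[j] - X[i][j])
                        for j in range(dim)]
            else:
                # spiral update around the best whale
                l = rng.uniform(-1.0, 1.0)
                X[i] = [abs(best[j] - X[i][j]) * math.exp(b * l)
                        * math.cos(2.0 * math.pi * l) + best[j]
                        for j in range(dim)]
            X[i] = [min(hi, max(lo, v)) for v in X[i]]  # keep inside bounds
            f = fitness(X[i])
            if f < best_f:
                best, best_f = X[i][:], f
    return best, best_f

# Toy fitness: sphere function, minimum 0 at the origin
pos, val = woa(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))
```

With the population of 40 whales and 50 iterations used in the paper, the swarm collapses onto the best solution as a shrinks toward 0.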

Ridge regression (RR)
Typically, we consider a sample consisting of n cases (or number of observations); that is, we have a set of training data (x_1, y_1), ..., (x_n, y_n), each of which consists of p covariates (number of variables) and a single outcome. Let y_i be the outcome and x_i = (x_i1, x_i2, ..., x_ip)^T be the covariate vector for the i-th case. The most popular estimation method is the least squares fitting procedure, in which the coefficients β = (β_0, β_1, ..., β_p)^T are selected to minimize the residual sum of squares (RSS) [23][24][25]:

RSS = Σ_{i=1}^{n} (y_i − β_0 − Σ_{j=1}^{p} β_j x_ij)²

Ridge regression is very similar to least squares, except that the coefficients are estimated by minimizing a slightly different quantity. Specifically, the ridge regression coefficient estimates β̂^RR are the values that minimize [18, 23-25]:

RSS + λ Σ_{j=1}^{p} β_j²    (10)

where λ ≥ 0 is the regularization parameter or complexity parameter to be determined separately (tuning parameter), which controls the amount of shrinkage: the larger the value of λ, the greater the amount of shrinkage. Indeed, Eq. (10) trades off two different criteria. As with least squares, Ridge regression seeks coefficient estimates that fit the data well by making the RSS small. However, the second term, λ Σ_{j=1}^{p} β_j², called a shrinkage penalty, is small when β_1, ..., β_p are close to zero, and so it has the effect of shrinking the estimates of β_j toward zero. The tuning parameter λ serves to control the relative impact of these two terms on the regression coefficient estimates. When λ = 0, the penalty term has no effect, and Ridge regression reproduces the least squares estimates (as λ → 0, β̂^RR → β̂^LS). However, as λ → ∞, the impact of the shrinkage penalty grows, and the Ridge regression coefficient estimates approach zero (as λ → ∞, β̂^RR → 0). Unlike least squares, which generates only one set of coefficient estimates, ridge regression produces a different set of coefficient estimates, β̂^RR_λ, for each value of λ. Since selecting a good value for λ is critical, cross-validation has been used.
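A tiny worked instance of this shrinkage (our illustrative reduction, not a setting from the paper): for a single centered predictor with no intercept, the ridge estimate has the closed form β̂ = Σx_i y_i / (Σx_i² + λ), so λ = 0 recovers least squares and larger λ pulls the coefficient toward zero:

```python
def ridge_1d(x, y, lam):
    # Closed-form ridge coefficient for one centered predictor, no intercept:
    # beta = sum(x*y) / (sum(x^2) + lambda)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]          # exactly y = 2x, so least squares gives beta = 2
betas = {lam: ridge_1d(x, y, lam) for lam in (0.0, 14.0, 1e6)}
```

Here sum(x*y) = 28 and sum(x^2) = 14, so λ = 0 gives β = 2 (the least squares fit), λ = 14 halves it to 1, and a huge λ drives it arbitrarily close to (but never exactly) zero.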
The advantage of Ridge regression over least squares is rooted in the bias-variance trade-off. As λ increases, the flexibility of the ridge regression fit decreases, leading to decreased variance but increased bias. At the least squares coefficient estimates, which correspond to ridge regression with λ = 0, the variance is high, but there is no bias. As λ increases, the shrinkage of the ridge coefficient estimates leads to a substantial reduction in the variance of the predictions, at the expense of a slight increase in bias. Ridge regression improves prediction error by shrinking large regression coefficients in order to reduce overfitting, but it does not perform covariate selection and therefore does not help to make the model more interpretable.

Least absolute shrinkage and selection operator (Lasso) regression (LR)
Ridge regression does have one obvious disadvantage: it will include all p predictors in the final model. The penalty λ Σ_{j=1}^{p} β_j² in Eq. (10) will shrink all of the coefficients toward zero, but it will not set any of them exactly to zero (unless λ = ∞). This may not be a problem for prediction accuracy, but it can create a challenge in model interpretation in situations in which the number of variables p is quite large.
The Lasso regression is a relatively recent alternative to Ridge regression that helps to overcome this disadvantage. The Lasso coefficients, β̂^Lasso_λ, minimize the quantity [18, 25-28]:

RSS + λ Σ_{j=1}^{p} |β_j|    (11)

Comparing Eqs. (11) and (10) shows that the Lasso and Ridge regressions have similar formulations. The only difference is that the β_j² term in the Ridge penalty in Eq. (10) has been replaced by |β_j| in the Lasso penalty in Eq. (11). In statistical terms, the Lasso uses an L_1 penalty instead of an L_2 penalty (the L_p norm of a coefficient vector β is given by ||β||_p = (Σ_j |β_j|^p)^{1/p}). As with Ridge regression, the Lasso shrinks the coefficient estimates toward zero. However, in the case of the Lasso, the L_1 penalty has the effect of forcing some of the coefficient estimates to be exactly equal to zero when the tuning parameter λ is sufficiently large; hence, it performs variable selection. As a result, the models generated are generally much easier to interpret than those produced by Ridge regression. The Lasso can be said to yield sparse models, that is, models that involve only a subset of the variables. As in Ridge regression, selecting a good value of λ for the Lasso is critical, so cross-validation has been employed.
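The mechanism by which the L_1 penalty produces exact zeros is visible in the one-predictor case (a standard textbook reduction, not taken from the paper): with a single standardized predictor, the Lasso solution is the soft-thresholded least squares estimate.

```python
def soft_threshold(z, lam):
    # Lasso solution for one standardized predictor: shrink the least
    # squares estimate z toward zero by lam, snapping to exactly 0
    # whenever z lies inside the interval [-lam, lam]
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Unlike ridge, a large enough penalty drives the coefficient exactly to zero
shrunk = soft_threshold(2.0, 0.5)   # 1.5
killed = soft_threshold(0.3, 0.5)   # exactly 0.0
```

This "snap to zero" is what removes variables from the model, whereas the ridge penalty only ever divides coefficients down.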

Elastic-net regression (ENR)
Elastic-net regression (ENR) first emerged in response to critiques of the Lasso regression model, whose variable selection can be too data-dependent and thus unstable. The solution was to combine the penalties of Ridge and Lasso regressions to get the best of both worlds. Therefore, ENR uses a convex combination of the Ridge and Lasso penalties. Indeed, it aims at minimizing the following loss function [18, 23-29]:

RSS + λ [(1 − α) Σ_{j=1}^{p} β_j² / 2 + α Σ_{j=1}^{p} |β_j|]

where α is the mixing parameter between Ridge (α = 0) and Lasso (α = 1). Now there are two parameters to tune: λ and α. In short, ENR is a regularized regression method that linearly combines the L_1 and L_2 penalties of the Lasso and Ridge regression methods, and it proves particularly useful when there are multiple correlated features. The essential difference between the Lasso and Elastic-net regressions lies in the fact that the Lasso model is likely to pick only one of such features at random, while the Elastic-net model is likely to pick both at once.
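For a single standardized predictor this combined penalty also has a closed-form solution (an illustrative reduction under the glmnet parameterization, not a formula from the paper): soft-threshold the least squares estimate by λα, then divide by 1 + λ(1 − α).

```python
def soft_threshold(z, lam):
    return max(0.0, abs(z) - lam) * (1.0 if z >= 0 else -1.0)

def elastic_net_1d(z, lam, alpha):
    # One-predictor elastic-net estimate (glmnet parameterization):
    # the L1 part soft-thresholds by lam*alpha,
    # the L2 part divides by 1 + lam*(1 - alpha)
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))

z = 2.0          # least squares estimate for the standardized predictor
lasso_like = elastic_net_1d(z, 0.5, alpha=1.0)   # pure Lasso behavior
ridge_like = elastic_net_1d(z, 0.5, alpha=0.0)   # pure Ridge behavior
mixed = elastic_net_1d(z, 0.5, alpha=0.5)        # a blend of both
```

Setting α = 1 recovers pure soft-thresholding (selection), α = 0 recovers pure division (shrinkage), and intermediate α does both, which is why correlated features can be kept together while small coefficients are still zeroed out.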

Approach accuracy
Eighty of the above-mentioned input variables from Sect. 2.1 have been employed in this study to build this novel WOA/MARS-based method. As is well known, the superconducting critical temperature T_c is the dependent variable to be predicted. In order to predict T_c from these variables with sufficient confidence, it is essential to select the best model fitted to the observed dataset. Although several statistics can be used to ascertain the goodness-of-fit, the criterion employed in this study was the coefficient of determination R² [48][49][50], as it is the natural statistic for a model whose principal objective is to predict upcoming results or to check an assumption. In what follows, the observed values are referred to as t_i and the values predicted by the model as y_i, making it possible to define the following sums of squares [48][49][50]:
• SS_tot = Σ_{i=1}^{n} (t_i − t̄)²: the total sum of squares, proportional to the sample variance.
• SS_reg = Σ_{i=1}^{n} (y_i − t̄)²: the regression sum of squares, also termed the explained sum of squares.
• SS_err = Σ_{i=1}^{n} (t_i − y_i)²: the residual sum of squares.
where t̄ is the mean of the n observed data:

t̄ = (1/n) Σ_{i=1}^{n} t_i

Based on the former sums, the coefficient of determination is specified by the following equation [48][49][50]:

R² = 1 − SS_err / SS_tot

Further criteria considered in this study were the root-mean-square error (RMSE) and the mean absolute error (MAE) [48][49][50][51]. The RMSE is a statistic that is frequently used to evaluate the predictive capability of a mathematical model. It is given by [48][49][50][51]:

RMSE = sqrt( (1/n) Σ_{i=1}^{n} (t_i − y_i)² )

If the RMSE is zero, there is no difference between the predicted and the observed data. The MAE, on the other hand, measures the average magnitude of the errors in a set of forecasts without considering their direction: it is the average, over the verification sample, of the absolute values of the differences between a forecast and the corresponding observation. Its mathematical expression is [48][49][50][51]:

MAE = (1/n) Σ_{i=1}^{n} |t_i − y_i|

Moreover, the MARS methodology relies heavily on three hyperparameters [15][16][17][18][19]: the maximum number of basis functions (Maxfuncs), the maximum degree of interaction between variables (Interactions), and the GCV penalty per knot (Penalty). It is important to consider that the MARS technique relies largely on the determination of the optimal values of these three hyperparameters. Some of the methods often used to determine suitable hyperparameters are [15-19, 30, 34, 48, 52]: grid search, random search, Nelder-Mead search, artificial bee colony, genetic algorithms, pattern search, etc. In this study, the numerical optimizer known as the whale optimization algorithm (WOA) [20][21][22] has been employed to determine these parameters, based on its ability to solve nonlinear optimization problems.
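The three criteria above are straightforward to compute; a minimal sketch with the same notation (t_i observed, y_i predicted):

```python
import math

def scores(t, y):
    """Coefficient of determination R^2, RMSE and MAE for observed t, predicted y."""
    n = len(t)
    t_bar = sum(t) / n
    ss_tot = sum((ti - t_bar) ** 2 for ti in t)            # total sum of squares
    ss_err = sum((ti - yi) ** 2 for ti, yi in zip(t, y))   # residual sum of squares
    r2 = 1.0 - ss_err / ss_tot
    rmse = math.sqrt(ss_err / n)
    mae = sum(abs(ti - yi) for ti, yi in zip(t, y)) / n
    return r2, rmse, mae

# A perfect prediction gives R^2 = 1 and RMSE = MAE = 0
r2, rmse, mae = scores([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Note that RMSE penalizes large individual errors more heavily than MAE, which is why both are reported in Table 6.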
Hence, a novel hybrid WOA/MARS-based method has been successfully applied to predict the superconducting critical temperature T_c (output variable) from the eighty input variables, studying their influence and using the coefficient of determination R² to guide the optimization. Figure 2 shows the flowchart of this new hybrid WOA/MARS-based model developed in this study.
Cross-validation was the standard technique used to estimate the true coefficient of determination (R²) [48][49][50]. Indeed, in order to guarantee the predictive ability of the WOA/MARS-based model, an exhaustive tenfold cross-validation algorithm was used [53], which involved splitting the sample into 10 parts, using nine of them for training and the remaining one for testing. This process was performed 10 times, using each of the 10 parts in turn for testing and calculating the average error. Therefore, all the possible variability within the WOA/MARS-based model parameters has been evaluated in order to determine the optimum point, by first searching for the parameters that minimize the average error.
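The tenfold scheme can be sketched as follows (plain Python; the error function is a placeholder standing in for model training and testing, since the fold logic, not the model, is the point here):

```python
def kfold_indices(n, k=10):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_error(n, k, error_fn):
    """Average test error over k folds; error_fn(train_idx, test_idx) -> float."""
    folds = kfold_indices(n, k)
    errs = []
    for i, test_idx in enumerate(folds):
        # train on the other k-1 folds, test on the held-out fold
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        errs.append(error_fn(train_idx, test_idx))
    return sum(errs) / k

# The 21,263 samples of the dataset split into 10 folds
folds = kfold_indices(21263, 10)
```

In practice the indices would be shuffled before splitting; contiguous folds are used here only to keep the sketch deterministic.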
The implementation of the new hybrid WOA/MARS-based model was performed using the multivariate adaptive regression splines (MARS) method from the earth library [54], together with the WOA technique from the MetaheuristicOpt package [20,52], both from the R Project. Additionally, the Ridge, Lasso, and Elastic-net regression models were implemented using the glmnet package [55].
The bounds (initial ranges) of the space of solutions used in the WOA technique are shown in Table 3. A population of 40 whales has been used in the WOA optimization. The stopping criteria were the number of iterations along with at least 5 iterations with the same results. A total of fifty iterations were performed.
To optimize the MARS parameters, the WOA module searches for the best Maxfuncs, Interactions, and Penalty parameters by comparing the cross-validation error at every iteration. The search space is organized into three dimensions, one for each parameter. The fitness factor, or objective function, is the coefficient of determination R².

Analysis of results and discussion
All of the eighty independent input variables (physico-chemical variables) are indicated above in Tables 1 and 2. The total number of samples used in the present study was 21,263; that is, the model was built and evaluated with data from 21,263 experimental samplings. This entire dataset was split into two approximate halves, one used as the training set and the other as the testing set. As the training set still contained a very large number of samples, 1000 samples were randomly extracted from it and the hyperparameter tuning was performed on them using tenfold cross-validation. Once the optimal parameters were determined, a model was constructed with the whole training dataset and validated using the testing dataset.
Based on this methodology, Table 4 identifies the optimal parameters of the best-fitting MARS-based approach found using the WOA optimizer. Table 5 lists the 32 main basis functions of the fitted WOA/MARS-based model and their respective coefficients. Note that h denotes the hinge function h(x) = max(0, x). Therefore, the MARS model can be seen as an extension of linear models that automatically models nonlinearities and interactions as a weighted sum of the basis functions called hinge functions [15][16][17][18][19].
A pictorial graph of the first-order and second-order terms that create the MARS-based approach for the superconducting critical temperature T c is shown in Figs. 3  and 4, respectively.
Based on the resulting calculations, the WOA/MARS-based technique allowed for the construction of a model able to assess the critical temperature T_c on the test dataset with high accuracy. (Fig. 3 shows the first-order terms of the three most important input variables for the dependent variable T_c: (a) T_c vs. range of atomic radius; (b) T_c vs. mean density; and (c) T_c vs. weighted standard deviation of thermal conductivity.) Additionally, the Ridge, Lasso, and Elastic-net regression models were also built for the T_c output variable in order to predict the superconducting critical temperature for different types of materials. Table 6 shows the determination and correlation coefficients (R² and r), root-mean-square error (RMSE), and mean absolute error (MAE) over the testing set for the WOA/MARS, Ridge, Lasso, and Elastic-net models for the dependent variable T_c.

Significance of variables
Another important result of the current study is the relevance of the independent input variables for predicting the superconducting critical temperature T_c in this nonlinear complex problem (see Table 7 and Fig. 5). We found that the most influential attributes were related to thermal conductivity. This is to be expected, as both superconductivity and thermal conductivity are driven by lattice phonons and electron transitions [8]. Also, the influence of ionic properties (related to the first ionization energy and electron affinity) likely reflects the capability of superconductors to form ions, which is related to movement through the crystalline lattice. This interpretation aligns well with the BCS theory of superconductivity [2]. Knowledge of the physico-chemical features that are most directly related to the critical temperature can facilitate the study of superconducting materials.
Overall, the MARS-based technique has proven to be an extremely accurate and highly satisfactory tool to indirectly assess the superconducting critical temperature T_c (dependent variable) as a function of the main measured physico-chemical parameters, conforming to the real observed data in this study. Specifically, Fig. 6 shows the comparison between the experimental and predicted T_c values employing the WOA/MARS, Ridge, Lasso, and Elastic-net regression models for the test dataset. Thus, it is essential to combine the MARS methodology with the WOA optimizer to overcome this nonlinear regression problem through a novel hybrid approach that is significantly more robust and more effective than the three remaining regression models. In particular, the modeled and measured T_c values were found to be highly correlated. Table 8 shows the observed and predicted T_c for the first materials in Fig. 6.

Conclusion
Based on the abovementioned results, several core findings of this study can be drawn:
• Existing analytical models to predict the superconducting critical temperature T_c from observed values are not accurate enough, as they make too many simplifications of a highly nonlinear and complex problem. Consequently, machine learning methods such as the novel hybrid WOA/MARS-based approach employed in this study offer the best option for making accurate estimations of T_c from experimental samplings.
• The hypothesis that T_c can be determined with precision by employing a hybrid WOA/MARS-based model was confirmed.
In conclusion, this procedure can be applied to successfully predict the superconducting critical temperature T_c of a variety of superconductors; however, it remains essential to consider the different physico-chemical features of each superconductor and/or experiment. Hence, the WOA/MARS-based method proves to be an extremely robust and useful answer to the nonlinear problem of estimating T_c from experimental samplings in different superconductors. Researchers interested in finding high-temperature superconductors may use the model to narrow their search. As a future extension of this work, we intend to apply the presented methodology to a more extensive database [43]. For instance, researchers could use this dataset together with new data (such as pressure or crystal structure) to build better models.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

Declarations
Conflict of interest The authors declare no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.