A computational approach to nonparametric regression: bootstrapping CMARS method
Abstract
Bootstrapping is a computer-intensive statistical method which treats the data set as a population and draws samples from it with replacement. This resampling method has wide application areas especially in mathematically intractable problems. In this study, it is used to obtain the empirical distributions of the parameters to determine whether they are statistically significant or not in a special case of nonparametric regression, conic multivariate adaptive regression splines (CMARS), a statistical machine learning algorithm. CMARS is the modified version of the well-known nonparametric regression model, multivariate adaptive regression splines (MARS), which uses conic quadratic optimization. CMARS is at least as complex as MARS even though it performs better with respect to several criteria. To achieve a better performance of CMARS with a less complex model, three different bootstrapping regression methods, namely, random-X, fixed-X and wild bootstrap are applied on four data sets with different size and scale. Then, the performances of the models are compared using various criteria including accuracy, precision, complexity, stability, robustness and computational efficiency. The results imply that bootstrap methods give more precise parameter estimates although they are computationally inefficient and that among all, random-X resampling produces better models, particularly for medium size and scale data sets.
Keywords
Bootstrapping regression; Conic multivariate adaptive regression splines; Fixed-X resampling; Random-X resampling; Wild bootstrap; Machine learning
1 Introduction
Models are simple forms of research phenomena that relate ideas and conclusions (Hjorth 1994). In statistics, formulating a model to answer a scientific question is usually the first step taken in an empirical study. Parametric and nonparametric models are two major approaches to statistical modeling in machine learning. Parametric models depend on certain distributional assumptions; if those assumptions hold, they give reliable inferences. Otherwise, nonparametric modeling is recommended.
Multivariate adaptive regression splines (MARS) is a nonparametric regression method (Friedman 1991; Hastie et al. 2001) that is widely used in biology, finance and engineering. The method has proved useful for handling complex data with nonlinear relationships among numerous variables. MARS builds models by running forward selection and backward elimination algorithms in succession. In the forward algorithm, a model that is deliberately as large as possible is fitted. Later, in the backward elimination, terms which do not contribute to the model are omitted.
In recent years, a lot of studies have been conducted involving MARS modeling. To exemplify, Denison et al. (1998) provide a Bayesian algorithm for MARS. Moreover, Holmes and Denison (2003) used Bayesian MARS for classification. York et al. (2006) compare the power of the least squares (LS) fitting to that of the MARS with polynomials. Kriner (2007) uses this model for survival analysis. Deconinck et al. (2008) show that MARS is better for fitting nonlinearities, more robust to small changes in data and easy to interpret compared to boosted regression trees. Zakeri et al. (2010) predict the energy expenditure for the first time in this research area by using MARS. Lin et al. (2011) apply MARS to time series data. Lee and Wu (2012) study the MARS applications, where it is used as a metamodel in the global sensitivity analysis of ordinary differential equation models. Ghasemi and Zolfonoun (2013) propose a new approach for MARS using principal component analysis for selection of inputs and apply it to determine the chemical amounts.
Building on the power of the MARS method in modeling high-dimensional and voluminous data, several studies have been conducted to improve its capability. One of them is Conic MARS (CMARS), developed as an alternative to the backward elimination algorithm by using conic quadratic programming (CQP) (Yerlikaya 2008) and later improved to model nonlinearities better (Batmaz et al. 2010). Taylan et al. (2010) compare the performances of MARS and CMARS in classification. Later, its performance is rigorously evaluated and compared with that of MARS using various real-life and simulated data sets with different features (Weber et al. 2012). The results show that CMARS is superior to MARS in terms of accuracy, robustness and stability under different data features. Moreover, it performs better than MARS on noisy data. Nevertheless, CMARS produces models which are at least as complex as those of MARS.
CMARS has also been compared with several other methods such as classification and regression trees (CART) (Sezgin-Alp et al. 2011), infinite kernel learning (IKL) (Çelik 2010), and generalized additive models (GAMs) with CQP (Sezgin-Alp et al. 2011) for classification, and multiple linear regression (MLR) (Yerlikaya-Özkurt et al. 2014) and dynamic regression model (Özmen et al. 2011) for prediction. These studies reveal that the CMARS method performs as well as or even better than the others considered. For detailed findings, one can refer to a comprehensive review of CMARS (Yerlikaya-Özkurt et al. 2014).
A quick look into the literature demonstrates that almost a decade has been devoted to the development and improvement of the CMARS method. All these studies lead to a powerful alternative to MARS with respect to several criteria including accuracy. Nevertheless, as stated above, CMARS models are not competitive with MARS models in terms of complexity. Therefore, in this study, we aim at reducing the complexity of CMARS models. In the usual parametric modeling, the statistical significance of the model parameters can be investigated by hypothesis testing or by constructing confidence intervals (CIs). Because there are no parametric assumptions regarding CMARS models, the methods of computational statistics (CS) may be a plausible approach to take here.
CS is a relatively new branch of statistics which develops methodologies that intensively use computers (Wegman 1988). Some examples include bootstrap, CART, GAMs, nonparametric regression methods (Efron and Tibshirani 1991) and visualization techniques like parallel coordinates, projection pursuit, and so on (Martinez and Martinez 2002). Advances in computer science have made all these methods feasible and popular, especially since the 1980s. In this study, the mathematical intractability is the lack of a known distribution for the CMARS parameters. An empirical cumulative distribution function (CDF) is therefore obtained for each parameter by a CS method called bootstrap resampling. In this approach, samples are drawn from the original sample with replacement (Hjorth 1994).
There are some applications of this technique to estimate the significance of parameters in a model. Efron (1988) applies the bootstrap to least absolute deviation (LAD) regression. Efron and Tibshirani (1993) apply residual resampling to a model based on least median of squares (LMS). Montgomery et al. (2006) apply bootstrapping residuals to a nonlinear regression model, the Michaelis-Menten model. Fox (2002) uses random-X and fixed-X resampling for a robust regression technique which uses an M-estimator with the Huber weight function. Also, Salibian-Barrera and Zamar (2002) apply bootstrapping to robust regression. Flachaire (2005) compares the pairs bootstrap with the wild bootstrap for heteroscedastic models. Austin (2008) uses the bootstrap together with backward elimination, which results in improved estimation. Chernick (2008) uses vector resampling for a kind of nonlinear model used in aerospace engineering. Yetere-Kurşun and Batmaz (2010) compare regression methods employing different bootstrapping techniques.
In this study, to reduce the complexity of CMARS models without degrading its performance with respect to other measures, a new algorithm, called Bootstrapping CMARS (BCMARS), is developed by using three different bootstrapping regression methods, namely fixed-X, random-X and wild bootstrap. Next, these algorithms are run on four data sets chosen with respect to different sample sizes and scales. Then, the performances of the models developed are compared according to the complexity, stability, accuracy, precision, robustness and computational efficiency.
This paper is organized as follows. In Sect. 2, MARS, CMARS, bootstrap regression and validation methods are described. The proposed approach, BCMARS, is explained in Sect. 3. In Sect. 4, applications and findings are presented. Results are discussed in Sect. 5. In the last section, conclusions and further studies are stated.
2 Methods
2.1 MARS
MARS, developed by Friedman (1991), is a nonparametric regression model where there is no specific assumption regarding the relationship between the dependent and independent variables; it constructs a flexible model that approximates the nonlinearity and handles the high dimensionality in data. MARS models are built in two steps: forward and backward. In the forward step, the largest possible model is obtained. However, this large model leads to overfitting. Thus, a backward step is required to remove terms that do not contribute significantly to the model.
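As a brief illustration of the terms that the forward step adds, MARS builds its model from pairs of piecewise-linear (hinge) basis functions reflected around a knot (Friedman 1991). The sketch below is illustrative only: the function name `hinge_pair` and the knot value 2.0 are assumptions made for this example and do not come from the paper.

```python
def hinge_pair(x, knot):
    """Return the reflected pair (max(0, x - knot), max(0, knot - x))."""
    return max(0.0, x - knot), max(0.0, knot - x)


# Evaluate the pair on a small grid with a hypothetical knot at 2.0.
grid = [0.0, 1.0, 2.0, 3.0, 4.0]
basis = [hinge_pair(x, knot=2.0) for x in grid]
```

Each pair is zero on one side of the knot and linear on the other, which is what lets a sum of such terms follow a nonlinear relationship piece by piece.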
2.2 CMARS
Here, the optimization approach adopted takes both accuracy and complexity into account. While accuracy is ensured by the residual sum of squares (RSS), complexity is measured by the second component of the penalized RSS (PRSS) in (6). The trade-off between these two criteria is represented by the penalty parameters \(\lambda _{m} \left( {m=1,2,\ldots ,M_{\max } } \right) \).
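For orientation, the trade-off described here is commonly written in the CMARS literature as a Tikhonov-type problem and its conic quadratic counterpart. The following is only a sketch in the notation used in Step 2 of the BCMARS algorithm (\({\varvec{B}}(\tilde{\varvec{d}})\), \({\varvec{L}}\), \(\sqrt{\tilde{M}}\)); it is not a reproduction of the paper's numbered equations, and the use of a single penalty parameter \(\lambda \) is a simplifying assumption.

\[
\min _{{\varvec{\theta }}}\; \left\| {\varvec{y}}-{\varvec{B}}\left( {\tilde{\varvec{d}}} \right) {\varvec{\theta }} \right\| _2^2 +\lambda \left\| {\varvec{L\theta }} \right\| _2^2
\]

\[
\min _{t,{\varvec{\theta }}}\; t \quad \text{subject to}\quad \left\| {\varvec{y}}-{\varvec{B}}\left( {\tilde{\varvec{d}}} \right) {\varvec{\theta }} \right\| _2 \le t,\quad \left\| {\varvec{L\theta }} \right\| _2 \le \sqrt{\tilde{M}}
\]

In the first form the accuracy term and the complexity term are balanced by \(\lambda \); in the second, the bound \(\sqrt{\tilde{M}}\) plays the corresponding role.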
2.3 Bootstrap regression
2.3.1 Bootstrap resampling
The bootstrap is a resampling technique that takes samples from the original data set with replacement (Chernick 2008). It is a data-based simulation method useful for making inferences such as estimating standard errors and biases, constructing CIs, testing hypotheses, and so on. Implementation of this method is not difficult, but depends heavily on computers. The bootstrap procedure can be defined as in Table 1.
Efron and Tibshirani (1993) indicate that the bootstrap is applicable to any model, such as nonlinear ones and models which use estimation techniques other than LS. According to them, bootstrapping regression is applicable to nonparametric models as well as parametric ones with no analytical solutions.
Table 1  The bootstrap procedure
Step 1  Generate the \(a\)th bootstrap sample (\(x^{*a}\)) of size N from the original sample with replacement
Step 2  Compute the statistic of interest for this sample
Step 3  Repeat steps 1–2 for \(a=1,{\ldots }, A\) and obtain the empirical CDF of the statistic of interest
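The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names, the toy data and the choice of the mean as the statistic of interest are assumptions made for the example, not part of the paper.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_boot=1000, seed=0):
    """Return the sorted bootstrap replicates of `statistic`."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        # Step 1: draw N observations from the original sample with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        # Step 2: compute the statistic of interest on this bootstrap sample.
        replicates.append(statistic(resample))
    # Step 3: the sorted replicates define the empirical CDF.
    return sorted(replicates)


def empirical_cdf(sorted_replicates, value):
    """Fraction of bootstrap replicates less than or equal to `value`."""
    return sum(r <= value for r in sorted_replicates) / len(sorted_replicates)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # toy sample
reps = bootstrap_distribution(data, statistics.mean, n_boot=1000, seed=42)
```

Any statistic that maps a sample to a number can be passed in place of `statistics.mean`; for regression, the statistic is a parameter estimate, as in the procedures that follow.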
2.3.2 Fixed-X resampling (residual resampling)
Table 2  The fixed-X resampling procedure
Step 1  Fit the model \({\varvec{y}}={\varvec{X\theta }} +{\varvec{\varepsilon }} \) to the data and obtain the fitted values, \(\hat{{\varvec{y}}}\), and the residuals, \({\varvec{e}}\)
Step 2  Select a bootstrap sample \({\varvec{e}}^{*a}\;(a=1,2,\ldots ,A)\) of residuals from \({\varvec{e}}\) using the procedure in Table 1, and add them to the fitted values to obtain new response variables, \({\varvec{y}}_{\mathrm{new}} =\hat{{\varvec{y}}}+{\varvec{e}}^{*a}\)
Step 3  Fit the model \({\varvec{y}}_{\mathrm{new}} ={\varvec{X\theta }} +{\varvec{\varepsilon }} \) to the original independent variables, \({\varvec{X}}\), and the new response variables, \({\varvec{y}}_{\mathrm{new}} \), and collect new parameter estimates, \(\hat{\varvec{\theta }}\)
Step 4  Repeat steps 2–3 \(A\) times
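A minimal sketch of the residual-resampling steps above is given below. To keep it self-contained it uses a straight-line least-squares fit as a stand-in for the model; the helper `fit_line`, the toy data and all parameter values are assumptions for illustration only.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def fixed_x_bootstrap(x, y, n_boot=1000, seed=0):
    """Return n_boot (intercept, slope) estimates from residual resampling."""
    rng = random.Random(seed)
    # Step 1: fit once and keep the fitted values and residuals.
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    estimates = []
    for _ in range(n_boot):
        # Step 2: resample residuals with replacement, add to fitted values.
        e_star = rng.choices(resid, k=len(resid))
        y_new = [fi + ei for fi, ei in zip(fitted, e_star)]
        # Step 3: refit on the ORIGINAL x and the new responses.
        estimates.append(fit_line(x, y_new))
    return estimates  # Step 4 is the loop over n_boot replicates


x = [float(i) for i in range(10)]
y = [1.2, 2.9, 5.1, 7.0, 8.8, 11.3, 12.9, 15.2, 16.8, 19.1]  # roughly 1 + 2x
draws = fixed_x_bootstrap(x, y, n_boot=1000, seed=1)
```

Because the design points never change, the spread of `draws` reflects only the variability of the residuals, which is why this scheme presumes errors with a common distribution.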
2.3.3 Random-X resampling (pairs bootstrap)
Table 3  The random-X resampling procedure
Step 1  Select a bootstrap sample of size N, using the procedure in Table 1, among the rows of the augmented matrix \(\varvec{Z}=\left( \varvec{y}\;\varvec{X} \right) \) to obtain \({\varvec{Z}}^{*a}\)
Step 2  Fit the model \(\varvec{y}=\varvec{X\theta } +\varvec{\varepsilon } \) to \({\varvec{Z}}^{*a}\), and collect new parameter estimates, \(\hat{\varvec{\theta }}\)
Step 3  Repeat steps 1–2 \(A\) times
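The pairs bootstrap above differs from residual resampling only in what is resampled: whole rows of the augmented matrix. The sketch below again uses a straight-line fit as a stand-in model; `fit_line`, the toy data and the guard against degenerate resamples are assumptions added for this illustration.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def random_x_bootstrap(x, y, n_boot=1000, seed=0):
    """Return n_boot (intercept, slope) estimates from resampled (y, x) pairs."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        # Step 1: resample whole rows (y_i, x_i) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope undefined, draw again
        # Step 2: refit the model on the resampled pairs.
        estimates.append(fit_line(xs, ys))
    return estimates  # Step 3 is the loop over n_boot replicates


x = [float(i) for i in range(10)]
y = [1.2, 2.9, 5.1, 7.0, 8.8, 11.3, 12.9, 15.2, 16.8, 19.1]  # roughly 1 + 2x
draws = random_x_bootstrap(x, y, n_boot=1000, seed=1)
```

Since each response travels with its own predictors, no model needs to be fitted before resampling, at the cost of a design matrix that changes from replicate to replicate.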
2.3.4 Wild bootstrap
The wild bootstrap is a relatively new approach, when compared to random-X resampling, proposed for handling heteroscedastic models (Flachaire 2005). Its algorithm is the same as that of fixed-X resampling given in Table 2, with the only change in Step 2: the bootstrap residuals \({\varvec{e}}^{*a}\; (a=1,\ldots ,A)\) are attached to the fitted values after each is multiplied by a random sign, 1 or \(-1\), with equal probability.
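The modified Step 2 can be sketched as follows. The sketch assumes the common form of the wild bootstrap in which each residual stays attached to its own observation and only its sign is randomized; the description above could also be read as resampling the residuals first, which this example does not do. The straight-line model, `fit_line` and the toy data are likewise illustrative assumptions.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def wild_bootstrap(x, y, n_boot=1000, seed=0):
    """Return n_boot (intercept, slope) estimates from sign-flipped residuals."""
    rng = random.Random(seed)
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    estimates = []
    for _ in range(n_boot):
        # Each residual keeps its position; its sign is +1 or -1 with probability 1/2.
        signs = [rng.choice((1.0, -1.0)) for _ in resid]
        y_new = [fi + si * ei for fi, si, ei in zip(fitted, signs, resid)]
        estimates.append(fit_line(x, y_new))
    return estimates


x = [float(i) for i in range(10)]
# Toy responses whose scatter grows with x (heteroscedastic by construction).
y = [1.0, 3.1, 4.8, 7.3, 8.6, 11.5, 12.4, 15.9, 16.1, 20.2]
draws = wild_bootstrap(x, y, n_boot=1000, seed=1)
```

Keeping every residual at its own design point preserves the pattern of unequal error variances, which is the reason this scheme is suggested for heteroscedastic models.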
2.4 Validation technique and performance criteria
In the comparison of models, the 3-fold cross-validation (CV) technique is used (Martinez and Martinez 2002; Gentle 2009). In this technique, data sets are randomly divided into three parts (folds). At each of the three attempts, two different folds (66.6 % of observations) are combined to develop models while the other fold (33.3 % of observations) is kept to test them. The combined part and the other fold are referred to as training and test data sets, respectively.
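The fold construction described above can be sketched as follows. The function name `k_fold_indices` and the seed are assumptions for illustration; the sample size 103 matches the smallest data set used in the study.

```python
import random


def k_fold_indices(n, k=3, seed=0):
    """Return k (train, test) index pairs from a random partition of range(n)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # The remaining k - 1 folds are combined into the training set.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(103, k=3, seed=0)
```

Each observation appears in exactly one test fold, so every model is tested on data it was not trained on.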
The performances of the models developed are evaluated with respect to different criteria including accuracy, precision, complexity, stability, robustness and efficiency. The accuracy criterion is used to measure the predictive ability of the models, while the precision criterion is used to determine the amount of variation in the parameter estimates; less variable estimates indicate more precision. Accuracy is measured by the mean absolute error (MAE), the coefficient of determination \((\hbox {R}^{2})\) and the percentage of residuals within three standard deviations (PWI). The precision of the parameter estimates, on the other hand, is determined by their empirical CIs. Another criterion used in comparisons is complexity; it is measured by the mean squared error (MSE). It is expected that, in general, the performance measures for the test data may not be as good as those for the training data. Besides, the stabilities of the accuracy and complexity measures obtained from the training and test data sets are also evaluated. The definitions of these measures, as well as bounds on them where applicable, are presented in the Appendix. Furthermore, the robustness of the measures with respect to different data sets is evaluated by considering the standard deviations of the measures. Moreover, to assess the computational efficiency of the models built, computational run times are utilized.
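The measures named above can be sketched with their textbook definitions. The paper's exact definitions are given in its Appendix, which is not reproduced here, so the following are assumptions: in particular, PWI is taken to be the proportion of residuals lying within three (population) standard deviations of their mean, and the stability measure is omitted because its formula is not stated in the text.

```python
import statistics


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def mse(y, y_hat):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst


def pwi(y, y_hat):
    """Proportion of residuals within three standard deviations (assumed form)."""
    resid = [a - b for a, b in zip(y, y_hat)]
    centre = statistics.mean(resid)
    spread = statistics.pstdev(resid)
    inside = sum(abs(r - centre) <= 3 * spread for r in resid)
    return inside / len(resid)


y_obs = [1.0, 2.0, 3.0, 4.0]    # toy observed responses
y_fit = [1.1, 1.9, 3.2, 3.8]    # toy fitted values
scores = (mae(y_obs, y_fit), mse(y_obs, y_fit),
          r_squared(y_obs, y_fit), pwi(y_obs, y_fit))
```

Smaller MAE and MSE and larger \(\hbox {R}^{2}\) and PWI indicate a better fit under these definitions.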
3 BCMARS: bootstrapping CMARS
Table 4  The BCMARS algorithm
Step 1  The forward part of the MARS algorithm is run and the set of basis functions (BFs) is constructed using the original data, \({\varvec{y}}\) and \({\varvec{X}}\). Note that these BFs are considered as fixed
Step 2  A CMARS model is constructed and the optimal value of \(\sqrt{\tilde{M}}\) is decided as the corner point of the plot of \(\left\| {\varvec{y}}-{\varvec{B}}\left( {\tilde{\varvec{d}}} \right) {\varvec{\theta }} \right\| _2 \) versus \(\left\| {\varvec{L\theta }} \right\| _2 \) in the log–log scale (see Fig. 1). The selected value gives the best solution for both accuracy and complexity in terms of PRSS in (12)
Step 3  Select one of the following BCMARS methods
\(\bullet \) BCMARSF: follow the procedure given in Table 2 by using the model in (13) in place of the MLR model
\(\bullet \) BCMARSR: follow the procedure given in Table 3 by using the model in (13) in place of the MLR model
\(\bullet \) BCMARSW: follow the procedure given in Sect. 2.3.4
Step 4  Decide on the level of significance, \(\upalpha \), and construct the bootstrap percentile interval using Eq. (17), given in the Appendix, to determine the significance of the parameters. If the percentile interval includes zero, then the parameter is found to be insignificant
Step 5  Repeat Steps 2–4 until no insignificant parameters remain in the model
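The iterative structure of Steps 2 to 5 can be sketched as below. This is a simplified illustration, not the authors' implementation: an ordinary least-squares fit on a fixed set of basis columns stands in for the CMARS (CQP) fit, random-X resampling is used for Step 3, and the percentile interval takes the usual \(\upalpha /2\) and \(1-\upalpha /2\) empirical quantiles, which is assumed to match Eq. (17). All function names and the synthetic data are hypothetical.

```python
import random


def solve_ls(B, y):
    """Least-squares coefficients via the normal equations (small problems only)."""
    p = len(B[0])
    # Augmented normal equations [B'B | B'y].
    A = [[sum(r[i] * r[j] for r in B) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(B, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-12:
            raise ZeroDivisionError("singular design in this resample")
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def percentile_interval(values, alpha):
    """Empirical (alpha/2, 1 - alpha/2) percentile interval."""
    s = sorted(values)
    return s[int((alpha / 2) * (len(s) - 1))], s[int((1 - alpha / 2) * (len(s) - 1))]


def bootstrap_eliminate(B, y, alpha=0.05, n_boot=500, seed=0):
    """Drop basis columns whose percentile interval contains zero, then repeat."""
    rng = random.Random(seed)
    n = len(y)
    active = list(range(len(B[0])))  # basis functions still in the model
    while active:
        cols = [[row[j] for j in active] for row in B]
        draws = []
        while len(draws) < n_boot:  # random-X (pairs) resampling
            idx = [rng.randrange(n) for _ in range(n)]
            try:
                draws.append(solve_ls([cols[i] for i in idx], [y[i] for i in idx]))
            except ZeroDivisionError:
                continue  # skip degenerate resamples
        keep = []
        for k, j in enumerate(active):
            lo, hi = percentile_interval([d[k] for d in draws], alpha)
            if not (lo <= 0.0 <= hi):  # interval excludes zero: significant
                keep.append(j)
        if keep == active:
            break  # no insignificant parameters remain
        active = keep
    return active


# Synthetic data: intercept, one informative hinge term, one pure-noise column.
rng = random.Random(7)
x = [rng.random() for _ in range(60)]
noise_col = [rng.gauss(0.0, 1.0) for _ in range(60)]
B = [[1.0, max(0.0, xi - 0.5), zi] for xi, zi in zip(x, noise_col)]
y = [2.0 + 3.0 * row[1] + rng.gauss(0.0, 0.1) for row in B]
kept = bootstrap_eliminate(B, y, alpha=0.05, n_boot=500, seed=7)
```

The returned list holds the indices of the basis columns that survive; the informative terms are retained, while a noise column is removed unless its interval happens to exclude zero at the chosen \(\upalpha \).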
Table 5  Data sets used in comparisons (each cell gives the data set, its source and (N, p))
Sample size (N)  Small scale (\({p}<10\))  Medium scale (\(10<{p}<20\))
Small (\(\hbox {N}\sim 150\))  Concrete slump (CS) (Yeh 2007), (103, 7)  Uniform sampling (US) (Kartal 2007), (160, 10)
Medium (\(\hbox {N}\sim 500\))  PM10 (Aldrin 2006), (500, 7)  Forest fires (FF) (Cortez and Morais 2007), (517, 11)
4 Application and findings
In order to evaluate and compare the performances of models developed by using MARS, CMARS and BCMARS methods, they are run on four different data sets to observe the effects of certain data characteristics such as size (i.e. the number of observations, N) and scale (i.e. the number of independent variables, \(p\)) on the methods’ performances. Note that the data sets are classified as small and medium subjectively. The data sets used in comparisons are presented in Table 5.
While validating the models, 3-fold CV is used as described in Sect. 2.4. As a result, three models are developed and tested for each method applied to a data set. In the applications, the R package “Earth” (Milborrow 2009), MATLAB (2009) and the MOSEK optimization software (2011), run in MATLAB, are utilized.
To construct BCMARS models, the algorithms given in Sect. 3 are applied step-by-step by taking \(A\), in Tables 1, 2 and 3, as 1000. Then, the performance measures for each model are calculated. Moreover, the computational run times of the methods are recorded to be compared.
5 Results and discussion
This section compares the performances of the methods studied, namely MARS, CMARS, BCMARSF, BCMARSR and BCMARSW, both in general and according to different features of the data sets such as size and scale. In these comparisons, various criteria including accuracy, precision, stability, efficiency and robustness are considered.
5.1 Comparison with respect to overall performances

BCMARSF and BCMARSR are the most accurate, robust and least complex for training and testing data sets, respectively.

BCMARSR and BCMARSW methods are the most stable, and BCMARSR has the most robust stability.
5.2 Comparison with respect to sample size
Overall performances (Mean\(\pm \)SD) of the methods
Performance measures  MARS  CMARS  BCMARSF  BCMARSR  BCMARSW 

Training  
MAE  0.3453 \(\pm \)0.2336  0.4040 \(\pm \)0.3980  0.3204*\(\pm \)0.2260**  0.3356 \(\pm \)0.2263  0.4251 \(\pm \)0.2797 
MSE  0.4015 \(\pm \)0.3064  0.6070 \(\pm \)0.9080  0.3117*\(\pm \)0.2700**  0.4230 \(\pm \)0.3990  0.5770 \(\pm \)0.4950 
\({R}^{2}\)  0.6005 \(\pm \)0.2797  0.5911 \(\pm \)0.3407  0.6827*\(\pm \)0.2492**  0.6120 \(\pm \)0.3350  0.5127 \(\pm \)0.3398 
PWI  0.9944*\(\pm \)0.0082**  0.9942 \(\pm \)0.0082**  0.9909 \(\pm \)0.0153  0.9932 \(\pm \)0.0140  0.9855 \(\pm \)0.0158 
Testing  
MAE  0.4576*\(\pm \)0.2956**  0.5800 \(\pm \)0.4580  0.4838\(\pm \)0.3076  0.6460\(\pm \)0.6110  0.4977\(\pm \)0.2998 
MSE  3.0700\(\pm \)7.0900  1.5780 \(\pm \)2.1350  1.2670\(\pm \)1.9970  0.5480*\(\pm \)0.3660**  1.0720\(\pm \)1.2710 
R\(^{2}\)  0.4480\(\pm \)0.3820  0.3630\(\pm \)0.4030  0.4500\(\pm \)0.3800  0.4530*\(\pm \)0.3770**  0.3840\(\pm \)0.4010 
PWI  0.9930*\(\pm \)0.0108  0.9930*\(\pm \)0.0106**  0.9884\(\pm \)0.0177  0.9890\(\pm \)0.0169  0.9878\(\pm \)0.0120 
Stability  
MAE  0.7657\(\pm \)0.1848  0.7440\(\pm \)0.2383  0.7252\(\pm \)0.1939  0.7375\(\pm \)0.2870  0.8690*\(\pm \)0.1783** 
MSE  0.5500\(\pm \)0.3710  0.5690\(\pm \)0.3400  0.5550\(\pm \)0.3450  0.6374\(\pm \)0.2174**  0.7616*\(\pm \)0.2852 
R\(^{2}\)  0.6070\(\pm \)0.3680  0.4690\(\pm \)0.3940  0.5750\(\pm \)0.3640  0.6577*\(\pm \)0.3063**  0.6300\(\pm \)0.3650 
PWI  0.9950*\(\pm \)0.0070  0.9940\(\pm \)0.0070  0.9940\(\pm \)0.0070  0.9950*\(\pm \)0.0060**  0.9940\(\pm \)0.0080 
Averages of performance measures with respect to different sample sizes
Sample size  Performance measures  MARS  CMARS  BCMARSF  BCMARSR  BCMARSW
Training
Small  MAE  0.2340  0.3570  0.1899*  0.2092  0.3410
Small  MSE  0.1773  0.6020  0.1158*  0.1387  0.3000
Small  \({R}^{2}\)  0.8208  0.7840  0.8824*  0.8596  0.7350
Small  PWI  1.0000*  0.9970  0.9910  0.9910  0.9870
Medium  MAE  0.4563  0.4498*  0.4769  0.4874  0.5090
Medium  MSE  0.6257  0.6125  0.5469*  0.7630  0.8540
Medium  \({R}^{2}\)  0.3802  0.3978  0.4431*  0.3140  0.2908
Medium  PWI  0.9888  0.9900  0.9890  0.9940*  0.9830
Testing
Small  MAE  0.3300*  0.5560  0.3440  0.7280  0.3790
Small  MSE  0.3520*  1.0010  0.3980  0.3670  0.3570
Small  \({R}^{2}\)  0.7110*  0.5760  0.6770  0.6800  0.6500
Small  PWI  1.0000*  1.0000*  0.9910  0.9910  0.9920
Medium  MAE  0.5849  0.6052  0.6518  0.5468*  0.6160
Medium  MSE  5.7800  2.1500  2.3100  0.7658*  1.7880
Medium  \({R}^{2}\)  0.1853*  0.1497  0.1765  0.1817  0.1178
Medium  PWI  0.9860*  0.9860*  0.9850  0.9860*  0.9830
Stability
Small  MAE  0.2250  0.7300  0.7265  0.6110  0.9359*
Small  MSE  0.4980  0.5770  0.5750  0.5530  0.8835*
Small  \({R}^{2}\)  0.7700  0.5960  0.7350  0.7510  0.7710*
Small  PWI  1.0000*  0.9970  0.9990  0.9990  0.9940
Medium  MAE  0.4431  0.7578  0.7236  0.8888*  0.8022
Medium  MSE  0.5760  0.5620  0.4410  0.7049*  0.6150
Medium  \({R}^{2}\)  0.4450  0.3410  0.3830  0.5460*  0.4890
Medium  PWI  0.9915  0.9900  0.9898  0.9920  0.9948*

All methods perform better on small data sets than on medium-size ones, for both training and testing data.

BCMARSF and MARS perform the best for small training and testing data sets, respectively. Moreover, BCMARSW competes with MARS in small testing data sets.

Among all, BCMARSW method is the most stable one in small data sets.

BCMARSF and BCMARSW are more stable on small data sets than on medium-size ones.
5.3 Comparisons with respect to scale

For training data sets, medium scale produces better models for all of the methods. Moreover, BCMARSF is the best performing one regardless of the scale.

For testing data sets, MARS, BCMARSF and BCMARSW perform equally well on both scales; while medium scale gives the best results for the other methods studied.

MARS and BCMARSW are the most stable methods for small scale data compared to medium scale; CMARS and BCMARSR are the most stable methods for medium scale compared to small scale. BCMARSF performs equally well on both scales.

MARS and BCMARSW are more stable for small scale among all methods. BCMARSR is more stable for medium scale data sets.
Averages of performance measures with respect to different scale
Scale  Performance measures  MARS  CMARS  BCMARSF  BCMARSR  BCMARSW
Training
Small  MAE  0.5229  0.4140*  0.4720  0.4910  0.6561
Small  MSE  0.4572  0.8992  0.3830*  0.4040  0.7728
Small  \({R}^{2}\)  0.5483  0.4985  0.6139*  0.5928  0.4078
Small  PWI  0.9970  0.9924  0.9980*  0.9980*  0.9934
Medium  MAE  0.1677  0.1773  0.1384*  0.1492  0.1940
Medium  MSE  0.3417  0.3500  0.2260*  0.4450  0.3810
Medium  \({R}^{2}\)  0.6591  0.6630  0.7650*  0.6340  0.6170
Medium  PWI  0.9913  0.9920*  0.9820  0.9870  0.9770
Testing
Small  MAE  0.6696  0.5445*  0.6747  0.6776  0.7130
Small  MSE  0.7327*  2.3959  0.7717  0.7443  0.8469
Small  \({R}^{2}\)  0.3297  0.3377*  0.3240  0.3293  0.2721
Small  PWI  0.9964*  0.9901  0.9960  0.9960  0.9932
Medium  MAE  0.2703  0.2790  0.2550*  0.6070  0.2820
Medium  MSE  5.4630  1.7800  1.8600  0.3130*  1.2980
Medium  \({R}^{2}\)  0.5107  0.5040  0.6000  0.6020*  0.4960
Medium  PWI  0.9892  0.9900*  0.9790  0.9810  0.9820
Stability
Small  MAE  0.5008  0.7801  0.7041  0.7300  0.9200*
Small  MSE  0.6515  0.5771  0.5183  0.5539  0.8378*
Small  \({R}^{2}\)  0.6521*  0.3714  0.5540  0.5539  0.6277
Small  PWI  0.9984*  0.9930  0.9977  0.9979  0.9980
Medium  MAE  0.7666  0.7900  0.7505  0.7470  0.8100*
Medium  MSE  0.3474  0.5920  0.3480  0.8040*  0.6850
Medium  \({R}^{2}\)  0.5628  0.5310  0.6000  0.7600*  0.6330
Medium  PWI  0.9931*  0.9930*  0.9910  0.9930  0.9920
5.4 Evaluation of the computational efficiencies

Run times increase as sample size and scale increase, except for MARS.

Bootstrap methods run considerably longer than MARS and CMARS. The three bootstrap regression methods have almost the same computational efficiency on the small size, small scale data set. Their run times increase almost tenfold as the scale increases from small to medium.

BCMARSR and BCMARSW have similar, and better, efficiencies on the medium size, small scale data set. Their run times increase almost fivefold as the sample size increases in small scale data sets.

BCMARSF and BCMARSW have similar, and better, efficiencies on the medium size, medium scale data set.
Run times (in seconds) of methods with respect to size and scale of data sets
Sample size  Method  Small scale  Medium scale
Small  MARS  \(<\)0.08 s*  \(<\)0.08 s*
Small  CMARS  \(<\)4.47 s  \(<\)19.52 s
Small  BCMARSF  \(<\)1.6\(\times 10^{3}\) s  \(<\)1.3\(\times 10^{4}\) s
Small  BCMARSR  \(<\)1.6\(\times 10^{3}\) s  \(<\)1.8\(\times 10^{4}\) s
Small  BCMARSW  \(<\)1.6\(\times 10^{3}\) s  \(<\)1.5\(\times 10^{4}\) s
Medium  MARS  \(<\)0.08 s*  \(<\)0.09 s*
Medium  CMARS  \(<\)18.20 s  \(<\)21.67 s
Medium  BCMARSF  \(<\)1.5\(\times 10^{4}\) s  \(<\)1.8\(\times 10^{4}\) s
Medium  BCMARSR  \(<\)0.7\(\times 10^{4}\) s  \(<\)3.1\(\times 10^{4}\) s
Medium  BCMARSW  \(<\)0.8\(\times 10^{4}\) s  \(<\)1.6\(\times 10^{4}\) s
5.5 Evaluation of the precision of model parameters

In US (small size medium scale) data, CMARS, BCMARSF and BCMARSR build the same models. Hence, the precision of their parameters is the same.

For all data sets except US, the CIs become narrower and the standard deviations of the parameters become smaller after bootstrapping CMARS, thus resulting in more precise parameter estimates.

In general, two different types of standard deviations obtained for all BCMARS methods are smaller than the ones obtained from CMARS.
6 Conclusion and further research
In this study, three different bootstrap methods are applied to a machine learning method, called CMARS, which is an improved version of the backward step of the well-known method MARS. Although CMARS outperforms MARS with respect to several criteria, it constructs models which are at least as complex as those of MARS (Weber et al. 2012). This study aims to reduce the complexity of CMARS models without degrading their performance. To achieve this aim, bootstrapping regression methods, namely fixed-X and random-X resampling, and wild bootstrap, are utilized by adopting an iterative approach to determine whether the parameters statistically contribute to the developed CMARS model or not. The reason for using a computational method here is the lack of prior knowledge regarding the distributions of the model parameters.
The performances of the methods are empirically evaluated and compared with respect to several criteria (e.g. accuracy, complexity, stability, robustness, precision, computational efficiency) by using four data sets which are selected subjectively to represent the small and medium sample size and scale categories. All performance criteria are explained in the Appendix. In addition, to validate all models developed, threefold CV approach is used.

Overall, BCMARSR is the best performing method.

Small size (training and testing) data sets produce the best results for all methods; for small and medium size data, BCMARSW and BCMARSR outperform the others, respectively.

Medium scale produces the best results for CMARS and BCMARSR when compared to the others; of the two, BCMARSR performs better.

Bootstrapping methods give the most precise parameter estimates; however, they are computationally the least efficient.
In the future, BCMARS methods are going to be applied on more data sets with small to large size and scale to be able to examine the interactions that may exist between data size and scale more clearly.
Acknowledgments
The authors would like to thank the editor and the anonymous referees for their valuable comments and criticisms. Their contributions led to the improved version of this paper.
References
Aldrin, M. (2006). Improved predictions penalizing both slope and curvature in additive models. Computational Statistics and Data Analysis, 50(2), 267–284.
Aster, R. C., Borchers, B., & Thurber, C. (2012). Parameter estimation and inverse problems. Burlington: Academic Press.
Austin, P. (2008). Using the bootstrap to improve estimation and confidence intervals for regression coefficients selected using backwards variable elimination. Statistics in Medicine, 27(17), 3286–3300.
Batmaz, İ., Yerlikaya-Özkurt, F., Kartal-Koç, E., Köksal, G., & Weber, G. W. (2010). Evaluating the CMARS performance for modeling nonlinearities. In Proceedings of the 3rd global conference on power control and optimization, Gold Coast (Australia), vol. 1239, pp. 351–357.
Çelik, G. (2010). Parameter estimation in generalized partial linear models with conic quadratic programming. Master Thesis, Graduate School of Applied Mathematics, Department of Scientific Computing, METU, Ankara, Turkey.
Chernick, M. (2008). Bootstrap methods: A guide for practitioners and researchers. New York: Wiley.
Cortez, P., & Morais, A. (2007). Data mining approach to predict forest fires using meteorological data. In J. Neves, M. F. Santos, & J. Machado (Eds.), New trends in artificial intelligence, proceedings of the 13th EPIA 2007 - Portuguese conference on artificial intelligence, December, Guimarães (Portugal), pp. 512–523.
Deconinck, E., Zhang, M. H., Petitet, F., Dubus, E., Ijjaali, I., Coomans, D., et al. (2008). Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood-brain barrier passage: A case study. Analytica Chimica Acta, 609(1), 13–23.
Denison, D. G. T., Mallick, B. K., & Smith, F. M. (1998). Bayesian MARS. Statistics and Computing, 8(4), 337–346.
Efron, B. (1988). Computer-intensive methods in statistical regression. Society for Industrial and Applied Mathematics, 30(3), 421–449.
Efron, B., & Tibshirani, R. J. (1991). Statistical data analysis in the computer age. Science, 253, 390–395.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Flachaire, E. (2005). Bootstrapping heteroskedastic regression models: Wild bootstrap vs. pairs bootstrap. Computational Statistics and Data Analysis, 49(2), 361–376.
Fox, J. (2002). Bootstrapping regression models. An R and S-PLUS companion to applied regression: Web appendix to the book. Thousand Oaks, CA: Sage.
Freedman, D. A. (1981). Bootstrapping regression models. The Annals of Statistics, 9(6), 1218–1228.
Friedman, J. (1991). Multivariate adaptive regression splines. Annals of Statistics, 19(1), 1–67.
Gentle, J. E. (2009). Computational statistics. New York: Springer.
 Ghasemi, J. B., & Zolfonoun, E. (2013). Application of principal component analysis–multivariate adaptive regression splines for the simultaneous spectrofluorimetric determination of dialkyltins in micellar media. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 115, 357–363.CrossRefGoogle Scholar
 Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning, data mining, inference and prediction. New York: Springer.MATHGoogle Scholar
 Hjorth, J. S. U. (1994). Computer intensive statistical methods: Validation model selection and bootstrap. New York: Chapman & Hall.MATHGoogle Scholar
 Holmes, C. C., & Denison, D. G. T. (2003). Classification with bayesian MARS. Machine Learning, 50, 159–173.MATHCrossRefGoogle Scholar
 Kartal, E. (2007). Metamodeling complex systems using linear and nonlinear regression methods. Master Thesis, Graduate School of Natural and Applied Sciences, Department of Statistics, METU, Ankara, Turkey.Google Scholar
 Kriner, M. (2007). Survival analysis with multivariate adaptive regression splines. Dissertation, LMU Munchen: Faculty of Mathematics, Computer Science and Statistics, Munchen.Google Scholar
 Lee, Y., & Wu, H. (2012). MARS approach for global sensitivity analysis of differential equation models with applications to dynamics of influenza infection. Bulletin of Mathematical Biology, 74, 73–90.
 Lin, C. J., Chen, H. F., & Lee, T. S. (2011). Forecasting tourism demand using time series, artificial neural networks and multivariate adaptive regression splines: Evidence from Taiwan. International Journal of Business Administration, 2(2), 14–24.
 Martinez, W. L., & Martinez, A. R. (2002). Computational statistics handbook with MATLAB. New York: Chapman & Hall.
 MATLAB Version 7.8.0 (2009). The MathWorks, USA.
 Milborrow, S. (2009). Earth: Multivariate adaptive regression spline models.
 Montgomery, D. C., Peck, E. A., & Vining, G. G. (2006). Introduction to linear regression analysis. New York: Wiley.
 MOSEK, Version 6. A very powerful commercial software for CQP. MOSEK ApS, Denmark. http://www.mosek.com. Accessed Jan 7, 2011.
 Osei-Bryson, K. M. (2004). Evaluation of decision trees: A multicriteria approach. Computers & Operations Research, 31, 1933–1945.
 Özmen, A., Weber, G. W., Batmaz, İ., & Kropat, E. (2011). RCMARS: Robustification of CMARS with different scenarios under polyhedral uncertainty set. Communications in Nonlinear Science and Numerical Simulation (CNSNS), 16(12), 4780–4787.
 Salibian-Barrera, M., & Zamar, R. Z. (2002). Bootstrapping robust estimates of regression. The Annals of Statistics, 30(2), 556–582.
 Sezgin-Alp, O. S., Büyükbebeci, E., Iscanoglu Cekic, A., Yerlikaya-Özkurt, F., Taylan, P., & Weber, G.-W. (2011). CMARS and GAM & CQP: Modern optimization methods applied to international credit default prediction. Journal of Computational and Applied Mathematics (JCAM), 235, 4639–4651.
 Taylan, P., Weber, G.-W., & Yerlikaya-Özkurt, F. (2010). A new approach to multivariate adaptive regression spline by using Tikhonov regularization and continuous optimization. TOP (the Operational Research Journal of SEIO, the Spanish Statistics and Operations Research Society), 18(2), 377–395.
 Weber, G. W., Batmaz, İ., Köksal, G., Taylan, P., & Yerlikaya-Özkurt, F. (2012). CMARS: A new contribution to nonparametric regression with multivariate adaptive regression splines supported by continuous optimization. Inverse Problems in Science and Engineering, 20(3), 371–400.
 Wegman, E. (1988). Computational statistics: A new agenda for statistical theory and practice. Journal of the Washington Academy of Sciences, 78, 310–322.
 Yazıcı, C. (2011). A computational approach to nonparametric regression: Bootstrapping CMARS method. Master Thesis, Graduate School of Natural and Applied Sciences, Department of Statistics, METU, Ankara, Turkey.
 Yazıcı, C., Yerlikaya-Özkurt, F., & Batmaz, İ. (2011). A computational approach to nonparametric regression: Bootstrapping CMARS method. In ERCIM’11: 4th international conference of the ERCIM W&G on computing and statistics. London, UK. December 17–19. Book of Abstracts, 129.
 Yeh, I.-C. (2007). Modeling slump flow of concrete using second-order regressions and artificial neural networks. Cement and Concrete Composites, 29(6), 474–480.
 Yerlikaya, F. (2008). A new contribution to nonlinear robust regression and classification with MARS and its applications to data mining for quality control in manufacturing. Master Thesis, Graduate School of Applied Mathematics, Department of Scientific Computing, METU, Ankara, Turkey.
 Yerlikaya-Özkurt, F., Batmaz, İ., & Weber, G.-W. (2014). A review of conic multivariate adaptive regression splines (CMARS): A powerful tool for predictive data mining. To appear as a chapter in D. Zilberman & A. Pinto (Eds.), Modeling, optimization, dynamics and bioeconomy. Springer Proceedings in Mathematics.
 Yetere-Kurşun, A., & Batmaz, İ. (2010). Comparison of regression methods by employing bootstrapping methods. COMPSTAT2010: 19th international conference on computational statistics. Paris, France. August 22–27. Book of Abstracts, 92.
 York, T. P., Eaves, L. J., & van den Oord, E. J. C. G. (2006). Multivariate adaptive regression splines: A powerful method for detecting disease-risk relationship differences among subgroups. Statistics in Medicine, 25(8), 1355–1367.
 Zakeri, I. F., Adolph, A. L., Puyau, M. R., Vohra, F. A., & Butte, N. F. (2010). Multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents. Journal of Applied Physiology, 108, 128–136.