Statistical learning and optimization of the helical milling of the biocompatible titanium Ti-6Al-7Nb alloy

ORIGINAL ARTICLE

The International Journal of Advanced Manufacturing Technology

Abstract

Helical milling has been applied for hole-making in titanium alloys, especially the Ti-6Al-4V alloy, driven by the demands of the aeronautic, automotive, and other sectors. Few studies, however, have addressed hole-making in Ti alloys for biomedical applications. Moreover, intelligent approaches for modeling and optimization of this process in these special alloys are needed to achieve the best results in terms of hole surface quality and productivity. This work presents an approach for modeling and optimization of helical milling for hole-making in the Ti-6Al-7Nb biocompatible titanium alloy. The surface roughness of the holes was measured to quantify hole quality. Principal component analysis was performed for dimensionality reduction of the roughness outputs. For modeling, a learning procedure was proposed considering polynomial response surface regression, tree-based methods, and support vector regression. Cross-validation was used for learning and model selection. The results showed that the support vector regression model was the best one. Multi-objective evolutionary optimization was then performed considering the support vector regression model and the deterministic model of the material removal rate. The Pareto set and the Pareto frontier were plotted and discussed concerning practical aspects of the helical milling process. The proposed learning and optimization approach achieved the best results for helical milling of the biocompatible Ti-6Al-7Nb alloy and can be applied to other intelligent manufacturing applications.



Code availability

The R code to perform the analysis is available at https://github.com/robsonpro/Stat-learning-optimization-helical-milling-Ti-6Al-7Nb.

References

  1. Xu J, Zeng W, Zhang X, Zhou D (2019) Analysis of globularization modeling and mechanisms of alpha/beta titanium alloy. J Alloys Compd 788:110–117

  2. Koizumi H, Takeuchi Y, Imai H, Kawai T, Yoneyama T (2019) Application of titanium and titanium alloys to fixed dental prostheses. J Prosthodont Res 63(3):266–270

  3. Aniołek K, Kupka M, Barylski A (2016) Sliding wear resistance of oxide layers formed on a titanium surface during thermal oxidation. Wear 356:23–29

  4. Munirathinam B, Narayanan R, Neelakantan L (2016) Electrochemical and semiconducting properties of thin passive film formed on titanium in chloride medium at various pH conditions. Thin Solid Films 598:260–270

  5. Aniołek K, Kupka M, Dercz G (2019) Cyclic oxidation of Ti–6Al–7Nb alloy. Vacuum 168:108859

  6. Asri R, Harun W, Samykano M, Lah N, Ghani S, Tarlochan F, et al (2017) Corrosion and surface modification on biocompatible metals: a review. Mater Sci Eng C 77:1261–1274

  7. Geetha M, Singh AK, Asokamani R, Gogia AK (2009) Ti based biomaterials, the ultimate choice for orthopaedic implants–a review. Prog Mater Sci 54(3):397–425

  8. Lin J, Ozan S, Munir K, Wang K, Tong X, Li Y, et al (2017) Effects of solution treatment and aging on the microstructure, mechanical properties, and corrosion resistance of a β type Ti–Ta–Hf–Zr alloy. RSC Adv 7(20):12309–12317

  9. Niinomi M, Nakai M, Hieda J (2012) Development of new metallic alloys for biomedical applications. Acta Biomater 8(11):3888–3903

  10. Tardelli JDC, Bolfarini C, Dos Reis AC (2020) Comparative analysis of corrosion resistance between beta titanium and Ti-6Al-4V alloys: a systematic review. J Trace Elem Med Biol 62:126618

  11. Xu J, Zhu J, Fan J, Zhou Q, Peng Y, Guo S (2019) Microstructure and mechanical properties of Ti–6Al–4V alloy fabricated using electron beam freeform fabrication. Vacuum 167:364–373

  12. Bartha K, Zháňal P, Stráský J, Čížek J, Dopita M, Lukáč F, et al (2019) Lattice defects in severely deformed biomedical Ti-6Al-7Nb alloy and thermal stability of its ultra-fine grained microstructure. J Alloys Compd 788:881–890

  13. Lauro CH, Ribeiro Filho SL, Brandão LC, Davim JP (2016) Analysis of behaviour biocompatible titanium alloy (Ti-6Al-7Nb) in the micro-cutting. Measurement 93:529–540

  14. Niinomi M (2003) Recent research and development in titanium alloys for biomedical applications and healthcare goods. Sci Technol Adv Mater 4(5):445

  15. Niinomi M (2008) Mechanical biocompatibilities of titanium alloys for biomedical applications. J Mech Behav Biomed Mater 1(1):30–42

  16. Singh G, Sharma N, Kumar D, Hegab H (2020) Design, development and tribological characterization of Ti–6Al–4V/hydroxyapatite composite for bio-implant applications. Mater Chem Phys 243:122662

  17. Cai Z, Shafer T, Watanabe I, Nunn ME, Okabe T (2003) Electrochemical characterization of cast titanium alloys. Biomaterials 24(2):213–218

  18. El-Hadad S, Safwat EM, Sharaf NF (2018) In-vitro and in-vivo cytotoxicity evaluation of cast functionally graded biomaterials for dental implantology. Mater Sci Eng C 93:987–995

  19. Metikoš-Huković M, Tkalčec E, Kwokal A, Piljac J (2003) An in vitro study of Ti and Ti-alloys coated with sol–gel derived hydroxyapatite coatings. Surf Coat Technol 165(1):40–50

  20. Challa V, Mali S, Misra R (2013) Reduced toxicity and superior cellular response of preosteoblasts to Ti-6Al-7Nb alloy and comparison with Ti-6Al-4V. J Biomed Mater Res Part A 101(7):2083–2089

  21. dos Santos Monteiro E, de Souza Soares FM, Nunes LF, Santana AIC, de Biasi RS, Elias CN (2020) Comparison of the wettability and corrosion resistance of two biomedical Ti alloys free of toxic elements with those of the commercial ASTM F136 (Ti–6Al–4V) alloy. J Mater Res Technol 9(6):16329–16338

  22. Ribeiro Filho SLM, Lauro CH, Bueno AHS, Brandão LC (2016) Effects of the dynamic tapping process on the biocompatibility of Ti-6Al-4V alloy in simulated human body environment. Arab J Sci Eng 41(11):4313–4326

  23. Chen G, Ren C, Zou Y, Qin X, Lu L, Li S (2019) Mechanism for material removal in ultrasonic vibration helical milling of Ti6Al4V alloy. Int J Mach Tools Manuf 138:1–13

  24. Li S, Qin X, Jin Y, Sun D, Li Y (2018) A comparative study of hole-making performance by coated and uncoated WC/Co cutters in helical milling of Ti/CFRP stacks. Int J Adv Manuf Technol 94(5):2645–2658

  25. Akula S, Nayak SN, Bolar G, Managuli V (2021) Comparison of conventional drilling and helical milling for hole making in Ti6Al4V titanium alloy under sustainable dry condition. Manuf Rev 8:12

  26. Sharif S, Rahim E (2007) Performance of coated- and uncoated-carbide tools when drilling titanium alloy—Ti–6Al4V. J Mater Process Technol 185(1-3):72–76

  27. Barman A, Adhikari R, Bolar G (2020) Evaluation of conventional drilling and helical milling for processing of holes in titanium alloy Ti6Al4V. Mater Today Proc 28:2295–2300

  28. Wang C, Zhao J, Zhou Y (2020) Mechanics and dynamics study of helical milling process for nickel-based superalloy. Int J Adv Manuf Technol 106(5):2305–2316

  29. Pereira RBD, Brandão LC, de Paiva AP, Ferreira JR, Davim JP (2017) A review of helical milling process. Int J Mach Tools Manuf 120:27–48

  30. Denkena B, Boehnke D, Dege J (2008) Helical milling of CFRP–titanium layer compounds. CIRP J Manuf Sci Technol 1(2):64–69

  31. Pereira RBD, da Silva LA, Lauro CH, Brandão LC, Ferreira JR, Davim JP (2019) Multi-objective robust design of helical milling hole quality on AISI H13 hardened steel by normalized normal constraint coupled with robust parameter design. Appl Soft Comput 75:652–685

  32. Balázs BZ, Geier N, Takács M, Davim JP (2021) A review on micro-milling: recent advances and future trends. Int J Adv Manuf Technol 112(3):655–684

  33. Sun D, Lemoine P, Keys D, Doyle P, Malinov S, Zhao Q, et al (2018) Hole-making processes and their impacts on the microstructure and fatigue response of aircraft alloys. Int J Adv Manuf Technol 94(5):1719–1726

  34. Aghili SA, Hassani K, Nikkhoo M (2021) A finite element study of fatigue load effects on total hip joint prosthesis. Comput Methods Biomech Biomed Engin 24(14):1545–1551

  35. Ren Z, Huang J, Bai H, Jin R, Xu F, Xu J (2021) Potential application of entangled porous titanium alloy metal rubber in artificial lumbar disc prostheses. J Bionic Eng 18(3):584–599

  36. Sun L, Gao H, Wang B, Bao Y, Wang M, Ma S (2020) Mechanism of reduction of damage during helical milling of titanium/CFRP/aluminium stacks. Int J Adv Manuf Technol 107(11):4741–4753

  37. Wang B, Zhao H, Zhang F, Wang M, Zheng Y (2021) Comparison of the geometric accuracy of holes made in CFRP/Ti laminate by drilling and helical milling. Int J Adv Manuf Technol 112(11):3343–3350

  38. Wang H, Tao K, Jin T (2021) Modeling and estimation of cutting forces in ball helical milling process. Int J Adv Manuf Technol 117(9):2807–2818

  39. Skowronek P, Olszewski P, Święszkowski W, Synder M, Sibiński M, Mazek J (2018) Unrecoverable bi-products of drilling titanium alloy and tantalum metal implants: a pilot study. HIP Int 28(5):531–534

  40. Festas A, Pereira R, Ramos A, Davim J (2021) A study of the effect of conventional drilling and helical milling in surface quality in titanium Ti-6Al-4V and Ti-6Al-7Nb alloys for medical applications. Arab J Sci Eng 46(3):2361–2369

  41. Hastie T, Tibshirani R, Friedman JH (2009) The elements of statistical learning: data mining, inference, and prediction, 2nd edn. Springer, New York

  42. Efron B (1982) The jackknife, the bootstrap and other resampling plans. SIAM, Philadelphia

  43. Steyerberg EW (2019) Clinical prediction models: a practical approach to development, validation and updating. Statistics for Biology and Health. Springer International Publishing, Berlin

  44. Chen M, Wang Z, Jiang S, Sun J, Wang L, Sahoo N, et al (2022) Predictive performance of different NTCP techniques for radiation-induced esophagitis in NSCLC patients receiving proton radiotherapy. Sci Rep 12(1):1–8

  45. Raut JR, Schöttker B, Holleczek B, Guo F, Bhardwaj M, Miah K, et al (2021) A microRNA panel compared to environmental and polygenic scores for colorectal cancer risk prediction. Nat Commun 12(1):1–9

  46. Anan K, Ichikado K, Ishihara T, Shintani A, Kawamura K, Suga M, et al (2019) A scoring system with high-resolution computed tomography to predict drug-associated acute respiratory distress syndrome: development and internal validation. Sci Rep 9(1):1–9

  47. Tian W, Song J, Li Z, de Wilde P (2014) Bootstrap techniques for sensitivity analysis and model selection in building thermal performance analysis. Appl Energy 135:320–328

  48. James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning: with applications in R. Springer, New York

  49. Alvim AC, Ferreira JR, Pereira RBD (2022) The enhanced normalized normal constraint approach to multi-objective robust optimization in helical milling process of AISI H13 hardened with crossed array. Int J Adv Manuf Technol 119(3):2763–2784

  50. Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197

  51. de Melo SA, Pereira RBD, da Silva Reis AF, Lauro CH, Brandão LC (2022) Multi-objective evolutionary optimization of unsupervised latent variables of turning process. Appl Soft Comput 108713

  52. ASTM F1295-16 (2016) Standard specification for wrought titanium-6aluminum-7niobium alloy for surgical implant applications (UNS R56700). American Society for Testing and Materials, West Conshohocken, PA

  53. ISO 5832-11 (2014) Implants for surgery — metallic materials — Part 11: wrought titanium 6-aluminium 7-niobium alloy. International Organization for Standardization, Geneva

  54. Cherkassky V, Mulier FM (2007) Learning from data: concepts, theory, and methods. Wiley, New York

  55. Smola AJ, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222

  56. Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Netw 10(5):988–999

  57. Drucker H, Burges CJ, Kaufman L, Smola A, Vapnik V (1996) Support vector regression machines. Adv Neural Inf Process Syst 9

  58. Vapnik V (1999) The nature of statistical learning theory. Springer Science & Business Media, New York

  59. Fernandes de Mello R, Antonelli Ponti M (2018) Machine learning: a practical approach on the statistical learning theory. Springer International Publishing, Cham

  60. Abu-Mostafa YS (2012) Learning from data: a short course

  61. Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12(1):55–67

  62. Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58(302):415–434

  63. Friedman JH (1977) A recursive partitioning decision rule for nonparametric classification. IEEE Trans Comput 26(4):404–408

  64. James G, Witten D, Hastie T, Tibshirani R (2021) Statistical learning. In: An introduction to statistical learning. Springer, New York, pp 15–57

  65. Breiman L (2001) Random forests. Mach Learn 45(1):5–32


Acknowledgements

The authors gratefully acknowledge the Brazilian National Council for Scientific and Technological Development (CNPq) and the Coordination of Superior Level Staff Improvement (CAPES) for supporting this research. The authors gratefully acknowledge the Minas Gerais Research Funding Foundation (FAPEMIG) for supporting the project “Otimização do fresamento helicoidal para obtenção de furos em ligas de titânio biocompatíveis”, process APQ-01291-18/FAPEMIG. The authors gratefully acknowledge the Foundation for Science and Technology of Portugal (FCT), Portugal for supporting the project “Sustainable and intelligent manufacturing by machining (FAMASI)”, process POCI-01-0145-FEDER-031556. The authors gratefully acknowledge Ti-fast (Italy) for providing the material.

Funding

This work was supported by the following funding:

• APQ-01291-18/FAPEMIG: “Otimização do fresamento helicoidal para obtenção de furos em ligas de titânio biocompatíveis”;

• POCI-01-0145-FEDER-031556/FCT(Portugal): “Sustainable and intelligent manufacturing by machining (FAMASI)”.

Author information

Contributions

Tomás Barbosa da Costa: conceptualization, investigation, data curation, formal analysis, writing. Robson Bruno Dutra Pereira: conceptualization, supervision, formal analysis, visualization, writing. Carlos Henrique Lauro: conceptualization, co-supervision, investigation, formal analysis. Lincoln Cardoso Brandão: project administration, investigation, formal analysis, resources. J. Paulo Davim: project administration, investigation, formal analysis, resources.

Corresponding author

Correspondence to Robson Bruno Dutra Pereira.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Statistical learning for regression modeling

Response surface methodology (RSM) has been widely applied for modeling designed processes to enable optimization. Since RSM models are second-order multiple regression models, RSM relies on classical parametric estimation under statistical assumptions. Learning methods, in contrast, focus on non-parametric model estimation with good generalization capability. The theory behind such approaches is called statistical learning theory, or Vapnik–Chervonenkis (VC) theory [54]. Statistical learning theory aims to describe the properties of learning machines in order to understand how they can generalize well to unseen data [55]. In this section, some statistical learning principles and modeling approaches are summarized as alternatives for modeling (learning) and explainability.

A.1 Some learning principles

Statistical learning theory was developed from 1960 to 1990, but it was in the mid-1990s that the theory was “confirmed” with the proposition of support vector machines, built on it [56, 57]. Consider a function estimation problem composed of the following: (i) a set of random vectors x drawn independently from an unknown distribution P(x); (ii) a supervisor that returns an output vector y for each input vector x, according to a fixed and unknown conditional distribution P(y|x); and (iii) a learning machine capable of implementing a set of functions f(x,α), α ∈ Λ, where Λ is a set of parameters. The learning problem consists of selecting the function f(x,α), α ∈ Λ, that best approximates the supervisor’s response. This selection is based on a training set of N i.i.d. observations of the studied phenomenon, (x1,y1), ..., (xN,yN), drawn according to P(x,y) = P(x)P(y|x) [56, 58]. Let the supervisor’s answer, y, be a real value. The regression problem may be solved by minimizing the quadratic loss function of Eq. 21 [56, 58].

$$ L(y, f(\mathbf{x},\mathbf{\alpha})) = (y - f(\mathbf{x},\mathbf{\alpha}))^{2} $$
(21)

The empirical risk minimization (ERM) principle of statistical learning is related to the minimization of such a function considering empirical data. The least squares method is a classical ERM approach for regression problems. Considering N observations of the pair of the regressors’ vector and the output, (x1,y1), ..., (xN,yN), the least squares method seeks to minimize Eq. 22 [56, 58].

$$ R_{emp}(\mathbf{\alpha}) = \frac{1}{N}\sum\limits_{i=1}^{N}(y_{i} - f(\mathbf{x}_{i},\mathbf{\alpha}))^{2} $$
(22)

The empirical risk function of Eq. 22 considers only the training set. It is, however, necessary to minimize the expected risk, \(R(\mathbf{\alpha}) = \int L(y, f(\mathbf{x},\mathbf{\alpha}))\,dP(\mathbf{x},y)\), which is impossible in practical problems, since it would require knowledge of the joint distribution P(x,y). Therefore, it is important to know whether the empirical risk is a good estimator of the expected risk. This defines how well the approximated function generalizes, i.e., the quality of the predictions on unseen or future data [56, 59].
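As a concrete illustration of ERM with the quadratic loss, the following minimal R sketch (simulated data, illustrative only, not the paper’s experiment) shows that least squares, here via lm(), minimizes exactly the empirical risk of Eq. 22:

```r
# ERM with quadratic loss: least squares via lm() minimizes Eq. 22
set.seed(1)
N <- 30
x <- runif(N)
y <- 2 + 3 * x + rnorm(N, sd = 0.2)  # "supervisor's" response with noise
fit <- lm(y ~ x)                     # f(x, alpha) = b0 + b1*x
R_emp <- mean(residuals(fit)^2)      # empirical risk R_emp(alpha), Eq. 22
R_emp
```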

These considerations lead to two important questions about the learning process [60]:

  • Is it possible to know whether Remp(α) is close enough to R(α)?

  • Is it possible to make R(α) small enough?

The simplest way to answer these questions in practice is through cross-validation.
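For instance, a minimal K-fold cross-validation sketch in R (continuing the simulated data above; the fold count and the linear model are illustrative choices, not the paper’s setup) estimates the expected risk from held-out data:

```r
# 5-fold cross-validation: estimate R(alpha) from data the model did not see
K <- 5
folds <- sample(rep(1:K, length.out = N))  # random fold assignment
cv_mse <- sapply(1:K, function(k) {
  fit_k <- lm(y ~ x, subset = folds != k)  # train on K-1 folds
  pred  <- predict(fit_k, newdata = data.frame(x = x[folds == k]))
  mean((y[folds == k] - pred)^2)           # quadratic loss on held-out fold
})
mean(cv_mse)  # cross-validated estimate of the expected risk
```

The next sections present some important statistical learning approaches for regression modeling.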

A.2 Support vector regression

Support vector regression was proposed by Drucker et al. [57] as an approach for regression based on the concept of support vectors. In support vector regression, learning is achieved considering a small subset of the training data, called support vectors, and optimal modeling of the support vectors generally guarantees optimal modeling of the available data. Let the training data be (x1,y1), (x2,y2), ..., (xN,yN), i = 1,...,N, with xi = [xi1, xi2, ..., xik]. Support vector regression seeks to approximate a function with maximum deviation ε from the training data; deviations smaller than ε are therefore tolerated.

Consider the estimation of a linear function, according to Eq. 23.

$$ \hat{y}_{i} = \mathbf{x}_{i}^{T}\mathbf{w} + b, i = 1, ..., N $$
(23)

To obtain such an approximation, a loss function, L, plus a regularization (shrinkage) term must be minimized, according to Eq. 24, where C is a regularization constant [58]. This formulation is similar to the ridge regression approach [61] if the loss function is a quadratic function of the error, \({\sum }_{i=1}^{N} L(y_{i}, \hat {y}_{i}) = {\sum }_{i=1}^{N} (y_{i} - \hat {y}_{i})^{2}\), generally referred to as the residual sum of squares (RSS). In the original ridge regression formulation, however, the regularization constant multiplies the term wTw.

$$ C\sum\limits_{i=1}^{N} L(y_{i}, \hat{y}_{i}) + \mathbf{w}^{T}\mathbf{w} $$
(24)

For support vector regression, an ε-insensitive loss function was proposed [57], presented in Eq. 25, with \(\xi _{i}= \lvert y_{i} - \hat {y}_{i} \rvert - \varepsilon \). Figure 18 shows the ε-insensitive loss function.

$$ L = \begin{cases} 0 \text{, if } \lvert y_{i} - \hat{y}_{i} \rvert < \varepsilon\\ \lvert y_{i} - \hat{y}_{i} \rvert - \varepsilon \text{, otherwise} \end{cases} $$
(25)
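The ε-insensitive loss of Eq. 25 is a one-liner in R (an illustrative helper, not from the paper’s code):

```r
# Eq. 25: epsilon-insensitive loss, vectorized over observations
eps_loss <- function(y, y_hat, eps) pmax(abs(y - y_hat) - eps, 0)
eps_loss(y = c(1, 2), y_hat = c(1.05, 3), eps = 0.1)  # -> 0.0 0.9
```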
Fig. 18 ε-insensitive loss (a); SVM regression model (b)

Therefore, the primal support vector regression estimation problem can be written as in formulation 26 [58].

$$ \text{Min}\ \left\{ \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + C\sum\limits_{i=1}^{N}(\xi_{i} + \xi_{i}^{*}) \right\} $$
$$ \textrm{s.t.: } \begin{cases} y_{i} - \left[\mathbf{x}_{i}^{T}\mathbf{w} + b\right] \leq \varepsilon + \xi_{i}\\ \left[\mathbf{x}_{i}^{T}\mathbf{w} + b\right] - y_{i} \leq \varepsilon + \xi_{i}^{*} \\ \xi_{i}, \xi_{i}^{*} \geq 0 \end{cases} $$
(26)

To solve this estimation problem more easily, the dual formulation can be considered. Moreover, the dual formulation enables the extension of support vector regression to non-linear problems. First, however, it is convenient to present the Lagrangian of the primal support vector regression problem, as in Eq. 27, where \(\alpha _{i}, \alpha _{i}^{*}, \eta _{i}\), and \(\eta _{i}^{*}\) are the Lagrange multipliers, which must be non-negative [57, 58].

$$ \begin{array}{@{}rcl@{}} L &=& \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + C\sum\limits_{i=1}^{N}(\xi_{i} + \xi_{i}^{*})\\ && - \sum\limits_{i=1}^{N} \alpha_{i}(\varepsilon + \xi_{i} - y_{i} + \mathbf{x}_{i}^{T}\mathbf{w} + b) \\ &&- \sum\limits_{i=1}^{N} \alpha_{i}^{*}(\varepsilon + \xi_{i}^{*} - \mathbf{x}_{i}^{T}\mathbf{w} - b + y_{i}) \\ &&-\sum\limits_{i=1}^{N}(\eta_{i}\xi_{i} + \eta_{i}^{*}\xi_{i}^{*}) \end{array} $$
(27)

Differentiating the Lagrangian with respect to the primal variables, w, b, ξi, and \(\xi _{i}^{*}\), and setting the derivatives equal to zero yields the first-order optimality conditions, as follows [55].

$$ \begin{array}{@{}rcl@{}} \frac{\partial L}{\partial b} &=& \sum\limits_{i=1}^{N}(\alpha_{i}^{*} - \alpha_{i}) = 0\\ \frac{\partial L}{\partial \mathbf{w}} &=& \mathbf{w} - \sum\limits_{i=1}^{N}(\alpha_{i} - \alpha_{i}^{*})\mathbf{x}_{i} = 0\\ \frac{\partial L}{\partial \xi_{i}} &=& C - \alpha_{i} - \eta_{i} = 0\\ \frac{\partial L}{\partial \xi_{i}^{*}} &=& C - \alpha_{i}^{*} - \eta_{i}^{*} = 0 \end{array} $$
(28)

Substituting the results of Eq. 28 into Eq. 27, the dual formulation of the support vector regression problem is obtained, as follows [57, 58].

$$ \text{Max}\ \left\{ -\frac{1}{2}\sum\limits_{i,j=1}^{N}(\alpha_{i} - \alpha_{i}^{*})(\alpha_{j} - \alpha_{j}^{*})\mathbf{x}_{i}^{T}\mathbf{x}_{j} - \varepsilon\sum\limits_{i=1}^{N}(\alpha_{i} + \alpha_{i}^{*}) + \sum\limits_{i=1}^{N} y_{i}(\alpha_{i} - \alpha_{i}^{*}) \right\} $$
$$ \textrm{s.t.: } \begin{cases} {\sum}_{i=1}^{N}(\alpha_{i} - \alpha_{i}^{*}) = 0\\ \alpha_{i}, \alpha_{i}^{*} \in [0,C] \end{cases} $$
(29)

In this dual formulation, w is rewritten as a linear combination of the training patterns, xi. The multipliers ηi and \(\eta _{i}^{*}\) were expressed in terms of C, αi, and \(\alpha _{i}^{*}\) and were therefore eliminated. Besides, \(\mathbf {w} = {\sum }_{i=1}^{N}(\alpha _{i} - \alpha _{i}^{*})\mathbf {x}_{i}\). In this sense, the initial support vector regression model of Eq. 23 can be rewritten as follows.

$$ \hat{y} = \sum\limits_{i=1}^{N}(\alpha_{i} - \alpha_{i}^{*})\mathbf{x}_{i}^{T}\mathbf{x} + b $$
(30)

This support-vector expansion means that the model coefficients w are rewritten as linear combinations of the training data, but the model includes only the support vectors, i.e., the xi such that αi > 0 or \(\alpha _{i}^{*} >0\), i = 1,...,N. Traditional statistical methods seek to approximate a function directly in the feature space, whereas support vector regression increases the dimensionality by working in the space of the support vectors [58].

Since the support vector regression method depends on the training data xi only through dot products, to approximate more complex functions the dot product xiTxj may be replaced by a kernel, k(xi,xj). Some efficient alternatives to the linear kernel, k(xi,xj) = xiTxj, include the polynomial kernel, k(xi,xj) = (xiTxj + c)d, and the radial basis kernel, \(k(\mathbf {x}_{i},\mathbf {x}_{j}) = \exp (-\gamma ||\mathbf {x}_{i} - \mathbf {x}_{j}||^{2})\). In the dual formulation, the term xiTxj is simply replaced by the chosen kernel, k(xi,xj), and the support vector regression model can be expressed according to Eq. 12. Depending on the selected kernel, the hyperparameters, such as γ for the radial kernel, and the regularization constant, C, must be chosen through cross-validation. In terms of the statistical learning principles presented above, these are the model parameters to be selected, α ∈ Λ.
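In R, radial-kernel SVR with cross-validated hyperparameter selection can be sketched with the e1071 package (simulated data and illustrative grids; this is not the paper’s experimental setup):

```r
# epsilon-SVR with a radial kernel; hyperparameters chosen by 5-fold CV
library(e1071)
set.seed(1)
d <- data.frame(x1 = runif(50), x2 = runif(50))
d$y <- sin(2 * pi * d$x1) + d$x2 + rnorm(50, sd = 0.1)

tuned <- tune(svm, y ~ x1 + x2, data = d,
              ranges = list(cost = 2^(0:4),          # regularization constant C
                            gamma = 2^(-3:1),        # radial kernel parameter
                            epsilon = c(0.05, 0.1)), # tube width
              tunecontrol = tune.control(cross = 5))
best <- tuned$best.model  # eps-regression with radial kernel (e1071 defaults)
predict(best, d[1:3, ])
```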

A.3 Tree-based models

The so-called classification and regression trees (CART) approach dates back at least to Morgan and Sonquist [62], as a sequential approach focused on reducing the prediction error, which is independent of the extent of linearity in the classifications or of the order in which the explanatory factors are introduced. In the case of regression, at each step of the procedure the algorithm must decide which variable to split on, from x1 to xk, and at which splitting point, aiming to reduce the RSS [41].

Considering the case where the partitions result in M regions, R1, ..., RM, and where the predicted value is a constant, cm, in each region, m = 1, ..., M, the regression tree model can be written as in Eq. 31, where cm is the average of the response values of the training data in region Rm, and I(x ∈ Rm) is an indicator function equal to 1 if the condition holds and 0 otherwise.

$$ \hat{y} = \sum\limits_{m=1}^{M} c_{m} I(\mathbf{x} \in R_{m}) $$
(31)

To fit this kind of model, a recursive binary splitting procedure is followed [63]. The procedure seeks the predictor, xj, j = 1, ..., k, and the cut point, s, which result in the lowest RSS. This binary partitioning therefore minimizes the objective function of Eq. 32 [41, 63, 64]. The approach is then repeated in the generated partition in which the greater decrease in RSS is achieved. The recursive binary splitting continues until a stopping criterion, such as a minimum number of training observations per partition, is reached.

$$ \underset{j,s}{\text{Min}} \bigg\{ \sum\limits_{i:\, \mathbf{x}_{i} \in R_{1}(j,s)} (y_{i} - c_{1})^{2} + \sum\limits_{i:\, \mathbf{x}_{i} \in R_{2}(j,s)} (y_{i} - c_{2})^{2} \bigg\} $$
(32)

Figure 19 illustrates a tree model as a function of two regressors. In this case, three partitions were performed, represented by the three internal nodes of the tree representation in Fig. 19a, where the root node is at the top and the branches extend toward the bottom. Four regions were obtained in the regressors’ space through recursive binary splitting, entailing four predicted values. These rectangular regions can be observed in Fig. 19b and c, in the distinct heights of the surface plot and the distinct colors of the contour plot color scale, respectively.

Fig. 19 Regression tree model
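A minimal R sketch of recursive binary splitting with the rpart package (simulated piecewise-constant data; minsplit plays the role of the stopping criterion mentioned above):

```r
# Regression tree via recursive binary splitting (Eq. 32)
library(rpart)
set.seed(1)
d <- data.frame(x1 = runif(100), x2 = runif(100))
d$y <- ifelse(d$x1 < 0.5, 1, 3) + ifelse(d$x2 < 0.4, 0, 2) + rnorm(100, sd = 0.1)
tree <- rpart(y ~ x1 + x2, data = d, method = "anova",    # splits minimize RSS
              control = rpart.control(minsplit = 10))     # stopping rule
tree  # prints the splits (x_j, s) and the terminal-node constants c_m
predict(tree, data.frame(x1 = 0.2, x2 = 0.8))
```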

A promising approach to improve the prediction of the regression tree method is bagging, or bootstrap aggregation. Through bagged trees, several regression trees are obtained from resamples of the training data, and the final bagging model is the average of these distinct regression trees. Figure 20 illustrates the bagging procedure. Considering B bootstrap samples generated from the training sample, \(\mathbf {Z}_{1}^{*}\), ..., \(\mathbf {Z}_{B}^{*}\), B models are obtained, \(\hat {f}_{b}^{*}(x)\), b = 1,...,B. The final bagging model can be written according to Eq. 33. Each bootstrap tree will differ from the one estimated with the complete training data set, with distinct features and terminal nodes [41].

$$ \hat{f}_{bag}(x) = \frac{1}{B} \sum\limits_{b=1}^{B} \hat{f}_{b}^{*}(x) $$
(33)
Fig. 20 Bagging regression trees
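Bagging (Eq. 33) can be hand-rolled in a few lines of R, reusing the simulated data frame d and the rpart call from the previous sketch (illustrative only):

```r
# Bagging regression trees: average B trees fit to bootstrap resamples
B <- 100
boot_trees <- lapply(1:B, function(b) {
  idx <- sample(nrow(d), replace = TRUE)  # bootstrap sample Z*_b
  rpart(y ~ x1 + x2, data = d[idx, ], method = "anova")
})
# Eq. 33: f_bag(x) = (1/B) * sum of the B tree predictions
f_bag <- function(newdata)
  Reduce(`+`, lapply(boot_trees, predict, newdata = newdata)) / B
f_bag(data.frame(x1 = 0.2, x2 = 0.8))
```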

A popular modification of the bagging approach for CART is the random forest, proposed by Breiman [65]. In the random forest approach, only a limited number of features, or input variables, is considered at each step of the recursive binary splitting procedure. This is done to overcome possible correlation between the features and also to “decorrelate” the B trees generated by bootstrapping the training data. For regression, it is recommended to consider m = k/3 features at each split [41, 65].
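With the randomForest package, the same idea is a single call; mtry implements the m = k/3 rule (here k = 2 simulated features, so m = 1; again reusing d from the sketches above):

```r
# Random forest: bagged trees with m = k/3 candidate features per split
library(randomForest)
set.seed(1)
rf <- randomForest(y ~ x1 + x2, data = d, ntree = 500,
                   mtry = max(1, floor(2 / 3)))  # m = k/3 with k = 2 features
predict(rf, data.frame(x1 = 0.2, x2 = 0.8))
```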


About this article


Cite this article

Costa, T.B.d., Pereira, R.B.D., Lauro, C.H. et al. Statistical learning and optimization of the helical milling of the biocompatible titanium Ti-6Al-7Nb alloy. Int J Adv Manuf Technol 125, 1789–1813 (2023). https://doi.org/10.1007/s00170-022-10686-2
