A novel outlier-insensitive local support vector machine for robust data-driven forecasting in engineering

  • Original Article
  • Engineering with Computers

Abstract

Machine learning (ML)-based data-driven methods have advanced modeling in many engineering domains. These methods can achieve high prediction and generalization performance on large, high-quality datasets. However, ML methods can yield biased predictions if the observed data (i.e., the response variable y) are corrupted by outliers. This paper addresses this problem with a novel, robust ML approach, formulated as an optimization problem that couples locally weighted least-squares support vector machines for regression (LWLS-SVMR) with one weight function. The weight is a function of the residuals, which allows the approach to iterate and significantly reduces the negative influence of outliers. A new, efficient hybrid algorithm is developed to solve the optimization problem. The proposed approach is assessed and validated by comparison with relevant ML approaches on both one-dimensional simulated datasets corrupted by various outliers and multi-dimensional real-world engineering datasets, including datasets used for predicting the lateral strength of reinforced concrete (RC) columns, the fuel consumption of automobiles, the rise time of a servomechanism, and dielectric breakdown strength. Finally, the proposed method is applied to produce a data-driven solver for computational mechanics with a nonlinear material dataset corrupted by outliers. All results show that the proposed method is robust against both non-extreme and extreme outliers and improves the predictive performance needed to solve various engineering problems.
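The abstract does not state the weight function or the hybrid solver, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a weighted LS-SVM regression with an RBF kernel, Gaussian locality weights around the query point, Huber-type residual weights scaled by the median absolute deviation, and a simple product of the two weights. All function names and parameter values (`gamma`, `sigma`, `bandwidth`, `n_iter`, `v_min`) are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise Gaussian (RBF) kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def solve_weighted_lssvm(K, y, v, gamma):
    # Weighted LS-SVM linear system:
    # [[0, 1^T], [1, K + diag(1 / (gamma * v))]] [b; alpha] = [0; y]
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def huber_weights(residuals, c=1.345):
    # Assumed weight function: Huber weights with a MAD-based robust scale.
    mad = np.median(np.abs(residuals - np.median(residuals)))
    s = max(1.4826 * mad, 1e-12)
    r = np.abs(residuals) / s
    return np.where(r <= c, 1.0, c / np.maximum(r, 1e-12))

def robust_local_predict(X, y, x_query, gamma=10.0, sigma=1.0,
                         bandwidth=2.0, n_iter=5, v_min=1e-6):
    K = rbf_kernel(X, X, sigma)
    k_q = rbf_kernel(x_query[None, :], X, sigma)[0]
    # Locality weights: training points near the query count more.
    d2 = ((X - x_query) ** 2).sum(axis=1)
    w_local = np.exp(-d2 / (2.0 * bandwidth ** 2))
    w_robust = np.ones(len(y))
    for _ in range(n_iter):
        # Refit, then recompute the residual-based weights.
        v = np.maximum(w_local * w_robust, v_min)
        b, alpha = solve_weighted_lssvm(K, y, v, gamma)
        residuals = y - (K @ alpha + b)
        w_robust = huber_weights(residuals)
    # Final fit with the last set of combined weights.
    v = np.maximum(w_local * w_robust, v_min)
    b, alpha = solve_weighted_lssvm(K, y, v, gamma)
    return float(k_q @ alpha + b)

# Usage: a noise-free sine curve with three large outliers near the query.
X = np.linspace(0.0, 6.0, 61)[:, None]
y = np.sin(X[:, 0])
y[[28, 30, 32]] += 10.0              # corrupt responses at x = 2.8, 3.0, 3.2
x_query = np.array([3.0])
truth = float(np.sin(3.0))

robust = robust_local_predict(X, y, x_query)
# Baseline: no locality weighting (infinite bandwidth) and no reweighting.
plain = robust_local_predict(X, y, x_query, bandwidth=np.inf, n_iter=0)
print("truth:", truth, "robust:", robust, "plain:", plain)
```

Setting `n_iter=0` with an infinite `bandwidth` reduces the sketch to an ordinary LS-SVM fit, which gives a convenient baseline for judging how much the residual-based reweighting limits the pull of the corrupted responses.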

Data availability

The data that support the findings of this study are openly available in the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/index.php.

Acknowledgements

This material is based in part on work supported by the Natural Science Foundation of Hubei Province under Grant #2022CFB294, the National Natural Science Foundation of China under Grant #52208485, and the National Science Foundation under Grant CMMI #1944301. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Natural Science Foundation of Hubei Province, the National Natural Science Foundation of China, or the National Science Foundation.

Author information

Corresponding author

Correspondence to Huan Luo.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, H., Paal, S.G. A novel outlier-insensitive local support vector machine for robust data-driven forecasting in engineering. Engineering with Computers 39, 3671–3689 (2023). https://doi.org/10.1007/s00366-022-01781-9

  • DOI: https://doi.org/10.1007/s00366-022-01781-9
