
Regression Method in Data Mining: A Systematic Literature Review

  • Review article
  • Archives of Computational Methods in Engineering

Abstract

Regression is one of the most important supervised learning methods in data mining, used both to predict outcomes and to discover knowledge from data. A review of studies in this field shows that researchers' use of regression is increasing steadily. This study reviews 500 articles from about 230 reputable journals published during the twenty-first century under a single framework and discusses the status and use of regression in data mining research. The systematic framework comprises four steps: (1) examining the position of regression within data mining research and identifying how the volume of regression studies in different journals has changed over the years; (2) examining the application areas of regression research and tracing how interest in these areas has evolved over time; (3) examining the algorithms used in regression studies and identifying the most widely used algorithms and the trends in their adoption; and (4) examining the keywords used in regression research and, using the Apriori algorithm, extracting the strongest and most interesting association rules among these keywords.
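
Step 4 of the framework mines association rules among article keywords with the Apriori algorithm. As a purely illustrative sketch (not the authors' actual pipeline), the following example shows how such rules could be derived with the mlxtend library; the keyword lists, support, and confidence thresholds are hypothetical.

```python
# Illustrative sketch: Apriori-based association rules over article keywords.
# Hypothetical data and thresholds; not the reviewed corpus or the authors' settings.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each inner list plays the role of one article's keyword set.
articles = [
    ["regression tree", "random forest", "prediction"],
    ["logistic regression", "classification", "prediction"],
    ["regression tree", "boosting", "prediction"],
    ["support vector regression", "time series", "prediction"],
    ["logistic regression", "random forest", "classification"],
]

# One-hot encode the keyword "transactions" as required by mlxtend's apriori.
encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit_transform(articles), columns=encoder.columns_)

# Frequent keyword itemsets appearing in at least 40% of the articles.
frequent = apriori(onehot, min_support=0.4, use_colnames=True)

# Association rules ranked by confidence; lift is also reported.
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```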

Abbreviations

RT: Regression tree

CART: Classification and regression tree

MLR: Multiple logistic regression

RF: Random forest

SVR: Support vector regression

HABES: Harmful algal bloom expert system

ANN: Artificial neural network

BRT: Boosted regression tree

ZOIB: Zero-or-one inflated beta

LLM: Logit leaf model

TPOT: Tree-based pipeline optimization tool

GBT: Gradient boosting tree

MLPNN: Multilayer perceptron neural network

TGD: Three Gorges Dam

LSSVM: Least squares support vector machine

DT: Decision tree

DNN: Dynamic neural network

PCA: Principal component analysis

CSS: Customer satisfaction survey

HPSO: Hybrid particle swarm optimization

HHCART: CART using Householder matrices

FT-SVR: Fourier transform and SVR

ROC: Receiver operating characteristic

CHAID: Chi-square automatic interaction detection

NLRA: Non-linear regression analysis

KNN: K-nearest neighbor

HTBR: Hierarchical tree-based regression

ISM-RT: Interpretative structural modeling with regression tree

GBR: Gradient boosted regression

LSTM: Long short-term memory network

ABC-LR: Artificial bee colony with logistic regression

GDP: Gross domestic product

X12-ARIMA: X12 autoregressive integrated moving average

MLP: Multilayer perceptron

MARS-GBM: MARS and gradient boosting machine

LOR: Logistic regression

LR: Linear regression

MRT: Multivariate regression tree

SVM: Support vector machine

GMDH: Group method of data handling

STR: Smooth transition regression

SPFs: Safety performance functions

LDA: Linear discriminant analysis

CT: Classification tree

MARS: Multivariate adaptive regression spline

ANFIS: Adaptive neuro-fuzzy inference system

IVTS: Interval-valued time series

XGBoost: Extreme gradient boosting

GBDT: Gradient boosted decision tree

LSSVR: Least squares support vector regression

QSAR: Quantitative structure-activity relationship

STR-tree: Smooth transition regression and CART

RBFN: Radial basis function network

HPSORTRBFN: HPSO, RT and RBFN

SDM: Species distribution model

MS-HCA: Multidimensional scaling and hierarchical cluster analysis

AUC: Area under the curve

IOT: Internet of things

DTR: Decision tree regression

GBRT: Gradient boosting regression tree

ELM: Extreme learning machine

fLogSLFN: Filtered logistic single-hidden layer feedforward neural network

RFR: Random forest regression

EMD-LSTM: Empirical mode decomposition-based LSTM

ICD-9-CM: International Classification of Diseases, 9th Revision, Clinical Modification

SO2: Sulfur dioxide

GP: Genetic programming

GBDT-LR: Gradient boosting decision tree with logistic regression
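
Many of the abbreviations above (RT, RFR, SVR, GBRT, and others) denote standard regression learners. As a minimal, purely illustrative sketch, assuming scikit-learn, synthetic data, and arbitrary hyperparameters rather than any configuration from the reviewed studies, a few of them can be fit and compared as follows.

```python
# Illustrative comparison of several regression learners named in the abbreviations:
# regression tree (RT), random forest regression (RFR), support vector regression (SVR),
# and gradient boosting regression tree (GBRT). Synthetic data; a sketch only.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem standing in for a real data mining task.
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "RT (regression tree)": DecisionTreeRegressor(max_depth=5, random_state=0),
    "RFR (random forest regression)": RandomForestRegressor(n_estimators=200, random_state=0),
    "SVR (support vector regression)": SVR(kernel="rbf", C=10.0),
    "GBRT (gradient boosting regression tree)": GradientBoostingRegressor(random_state=0),
}

# Fit each model and report held-out mean squared error.
for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {mse:.1f}")
```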


Author information

Corresponding author

Correspondence to Mohammad Vahid Sebt.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sebt, M.V., Sadati-Keneti, Y., Rahbari, M. et al. Regression Method in Data Mining: A Systematic Literature Review. Arch Computat Methods Eng (2024). https://doi.org/10.1007/s11831-024-10088-5

