
Wrapper-based feature selection using regression trees to predict intrinsic viscosity of polymer


This paper introduces different types of regression trees for forecasting the viscosity of polymer solutions. Although regression trees have been used extensively in other fields, they have not been explored for viscosity prediction. A key issue in materials science is determining a priori which characteristics the prediction model must include, because a large number of molecular descriptors is obtained. To address this, we propose a wrapper method that selects features based on regression trees: the trees are used to evaluate different subsets of attributes, and the model is built from the subset of features that achieves the minimum error. In particular, the performance of eight regression tree algorithms, including both linear and non-linear models, is evaluated on a dataset composed of 64 polymers and 2962 molecular descriptors. The results show that regression trees with nearest-neighbor-based local models in the leaves predict with high accuracy. Moreover, compared with other forecasting approaches such as multivariate linear regression, neural networks and support vector machines, they show remarkable improvements in accuracy.
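The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy forward search, the small depth-limited tree, the 4-fold cross-validated mean absolute error, and the synthetic data (64 rows to echo the 64 polymers, but only 6 made-up descriptors) are all assumptions chosen to keep the example short. The names `fit_tree`, `cv_error` and `wrapper_select` are hypothetical.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(X, y, features, depth=2, min_leaf=2):
    """Fit a small regression tree restricted to the given feature indices.
    A leaf is a float (mean target); a node is (feature, threshold, left, right)."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return mean(y)
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:
        return mean(y)
    _, f, t, left, right = best

    def grow(idx):
        return fit_tree([X[i] for i in idx], [y[i] for i in idx],
                        features, depth - 1, min_leaf)

    return (f, t, grow(left), grow(right))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def cv_error(X, y, features, k=4):
    """k-fold cross-validated mean absolute error for one feature subset."""
    n = len(y)
    errors = []
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        tree = fit_tree([X[i] for i in train], [y[i] for i in train], features)
        errors += [abs(predict(tree, X[i]) - y[i]) for i in test]
    return mean(errors)

def wrapper_select(X, y, max_features=3):
    """Greedy forward wrapper: add the feature that most lowers the CV error,
    and stop as soon as no candidate improves it."""
    selected, best_err = [], float("inf")
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < max_features:
        err, f = min((cv_error(X, y, selected + [f]), f) for f in sorted(remaining))
        if err >= best_err:
            break
        selected.append(f)
        remaining.remove(f)
        best_err = err
    return selected, best_err

# Synthetic stand-in data (assumption): only descriptor 2 drives the target.
rng = random.Random(0)
X = [[rng.random() for _ in range(6)] for _ in range(64)]
y = [1.0 if row[2] > 0.5 else 0.0 for row in X]

selected, err = wrapper_select(X, y)
print("selected descriptors:", selected, "CV MAE:", round(err, 3))
```

The point of the wrapper design is that each candidate subset is scored by the same kind of model that will be used for prediction, so the selected descriptors are tailored to the tree learner rather than to a model-independent statistic. With thousands of descriptors, as in the paper's dataset, an exhaustive search over subsets is infeasible, which is why this sketch falls back on a greedy search.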





The authors would like to thank the Spanish Ministry of Science, Innovation and Universities for the support under project TIN2017-8888209C2-1-R.

Author information



Corresponding author

Correspondence to A. Troncoso.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Mortazavi, R., Mortazavi, S. & Troncoso, A. Wrapper-based feature selection using regression trees to predict intrinsic viscosity of polymer. Engineering with Computers (2021).



Keywords

  • Regression trees
  • Polymer viscosity prediction
  • Feature selection