On the Relevance of Feature Selection Algorithms While Developing Non-linear QSARs

  • Riccardo Concu
  • M. Natália Dias Soeiro Cordeiro
Part of the Methods in Pharmacology and Toxicology book series (MIPT)


Quantitative structure-activity relationships (QSARs) are mathematical models that seek a quantitative relationship between a set of chemical compounds and a specific activity or endpoint, such as toxicity, a chemical or physical property, or a biological activity. To capture the correlation between the chemicals and the selected endpoints, QSAR models rely on molecular descriptors (MDs), which encode specific chemical information or features of the molecules. Early QSAR models were built from a small set of MDs and a single endpoint, and the correlation was usually linear. Nowadays, however, QSAR models are usually non-linear and are built from thousands of chemicals and hundreds of MDs. In addition, newer QSAR models also aim to predict different endpoints with the same model, the so-called multi-target QSAR (MT-QSAR). For this reason, many QSARs are now developed using machine learning approaches, which can model a dataset with different endpoints. Although these approaches have proven able to handle MT-QSAR modeling, feature selection (FS) in these cases remains a challenging task and a central issue in the QSAR field. With these aspects in mind, the main aim of this chapter is to analyze feature selection methods for developing non-linear QSAR models.
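As a rough illustration of what feature selection around a non-linear model can look like, the following minimal sketch runs a greedy forward wrapper that adds one descriptor at a time and scores each candidate subset by the leave-one-out error of a k-nearest-neighbor regressor. Everything in it is an assumption made for illustration: the descriptor matrix and endpoint are synthetic, the function names `knn_loo_error` and `forward_selection` are hypothetical, and k-NN merely stands in for whichever non-linear learner would be used in practice. It is not the procedure of any particular study.

```python
import random


def knn_loo_error(X, y, cols, k=3):
    """Leave-one-out mean squared error of a k-NN regressor restricted to `cols`."""
    n = len(X)
    total = 0.0
    for i in range(n):
        # Squared Euclidean distance from compound i to every other compound,
        # computed only on the candidate descriptor columns.
        dists = sorted(
            (sum((X[i][c] - X[j][c]) ** 2 for c in cols), j)
            for j in range(n) if j != i
        )
        # Predict the endpoint as the mean over the k nearest neighbors.
        pred = sum(y[j] for _, j in dists[:k]) / k
        total += (y[i] - pred) ** 2
    return total / n


def forward_selection(X, y, max_features=3, k=3):
    """Greedy wrapper: repeatedly add the descriptor that lowers the LOO error most."""
    remaining = list(range(len(X[0])))
    selected = []
    best_err = float("inf")
    while remaining and len(selected) < max_features:
        err, col = min(
            (knn_loo_error(X, y, selected + [c], k), c) for c in remaining
        )
        if err >= best_err:
            break  # no remaining descriptor improves the model
        selected.append(col)
        remaining.remove(col)
        best_err = err
    return selected, best_err


# Synthetic example (assumption): 60 "compounds", 8 "descriptors" drawn uniformly
# from [-1, 1]; the endpoint depends non-linearly on descriptors 0 and 3 only.
random.seed(0)
n_compounds, n_descriptors = 60, 8
X = [[random.uniform(-1, 1) for _ in range(n_descriptors)] for _ in range(n_compounds)]
y = [row[0] ** 2 + abs(row[3]) for row in X]

selected, err = forward_selection(X, y)
print("selected descriptor indices:", selected)
print("leave-one-out MSE:", round(err, 4))
```

Because the subset is scored by the learner itself, this is a wrapper-style approach; a filter-style approach would instead rank descriptors by a model-independent criterion before any model is trained. With real descriptors, which live on very different scales, the columns would normally be standardized before computing distances.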

Key words

QSAR; Molecular descriptors; Feature selection; Neural networks; Filter methods; Wrapper methods; Machine learning; Linear models; Non-linear models



This work received financial support from Fundação para a Ciência e a Tecnologia (FCT/MEC) through national funds and was co-financed by the European Union (FEDER funds) under the Partnership Agreement PT2020, through projects UID/QUI/50006/2013, POCI/01/0145/FEDER/007265, NORTE-01-0145-FEDER-000011 (LAQV@REQUIMTE), and the Interreg SUDOE NanoDesk (SOE1/P1/E0215; UP). RC also acknowledges FCT and the European Social Fund for financial support (Grant SFRH/BPD/80605/2011). To all financing sources, the authors are greatly indebted.

The authors declare no competing financial interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. Department of Chemistry and Biochemistry, Faculty of Science, University of Porto, Porto, Portugal
