Better prediction of aqueous solubility of chlorinated hydrocarbons using support vector machine modeling

  • Original Paper
  • Environmental Chemistry Letters

Abstract

Remediation of water contaminated by organic pollutants is a major challenge, and it could be improved by better knowledge of the aqueous solubility of organic compounds, because aqueous solubility controls the fate and toxicity of pollutants. Here we performed a structure–property study based on a genetic algorithm for the prediction of the aqueous solubility of chlorinated hydrocarbons. A total of 1497 descriptors were calculated with the Dragon software. Genetic-algorithm variable selection was used to choose, from this large pool, an optimal subset of descriptors that contribute significantly to aqueous solubility. A support vector machine was then employed to model the quantitative relationships between the selected descriptors and aqueous solubility. Our results show that total size, polarizability and electronegativity influence the aqueous solubility of the compounds. We also found that the support vector machine gave better results than other methods such as principal component regression and partial least squares.
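The two-stage workflow described in the abstract (genetic-algorithm descriptor selection followed by a kernel-based regression model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic rather than Dragon descriptors, a leave-one-out Gaussian-kernel smoother stands in for the support vector machine so that the example needs only the Python standard library, and the function names and genetic-algorithm settings (population size, number of generations, mutation rate) are hypothetical.

```python
import math
import random


def make_data(rng, n=30, p=8):
    """Synthetic stand-in for a descriptor matrix: only columns 0 and 3 drive y."""
    X = [[rng.uniform(-1.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.05) for row in X]
    return X, y


def loo_error(mask, X, y, width=0.5):
    """Leave-one-out mean squared error of a Gaussian-kernel smoother
    restricted to the descriptors switched on in mask (assumed fitness)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty subset cannot predict anything
    total = 0.0
    for i in range(len(X)):
        num = den = 0.0
        for k in range(len(X)):
            if k == i:
                continue  # leave sample i out of its own prediction
            d2 = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            w = math.exp(-d2 / (2.0 * width ** 2))
            num += w * y[k]
            den += w
        pred = num / den if den > 0.0 else 0.0
        total += (y[i] - pred) ** 2
    return total / len(X)


def select_descriptors(X, y, rng, pop_size=16, generations=10, p_mut=0.1):
    """Simple genetic algorithm over 0/1 masks of descriptors."""
    p = len(X[0])
    # Seed with the full descriptor set so the result is never worse than it.
    pop = [[1] * p] + [[rng.randint(0, 1) for _ in range(p)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: loo_error(m, X, y))
        elite = ranked[: pop_size // 2]  # elitism: best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, p)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per descriptor.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda m: loo_error(m, X, y))


rng = random.Random(0)
X, y = make_data(rng)
best = select_descriptors(X, y, rng)
full_error = loo_error([1] * len(X[0]), X, y)
best_error = loo_error(best, X, y)
print("selected descriptor indices:", [j for j, bit in enumerate(best) if bit])
print("LOO error, all descriptors:", round(full_error, 4))
print("LOO error, selected subset:", round(best_error, 4))
```

Because the initial population contains the full descriptor set and the best half of each generation is carried over unchanged, the selected subset can never score worse than using all descriptors. In a real study the kernel smoother would be replaced by the chosen regression model and the fitness would be evaluated on a proper validation split.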

Author information

Corresponding author

Correspondence to Morteza Atabati.

About this article

Cite this article

Bahadori, B., Atabati, M. & Zarei, K. Better prediction of aqueous solubility of chlorinated hydrocarbons using support vector machine modeling. Environ Chem Lett 14, 541–548 (2016). https://doi.org/10.1007/s10311-016-0561-7

  • DOI: https://doi.org/10.1007/s10311-016-0561-7
