Machine learning-based quantitative structure–retention relationship models for predicting the retention indices of volatile organic pollutants

  • Original Paper
International Journal of Environmental Science and Technology

Abstract

In this study, a dataset of 206 volatile organic compounds was used to develop quantitative structure–retention relationship (QSRR) models for predicting retention indices on the DB-5 stationary phase. A total of 141 molecules were assigned to the training set to build the models, and 65 molecules were assigned to the test set for external validation. Stepwise multiple linear regression selected two descriptors, X1sol (solvation connectivity index chi-1) and AAC (mean information index on atomic composition), which were then used to build linear and nonlinear QSRR models. Multiple linear regression, epsilon-support vector regression and a deep learning-based artificial neural network were used as modeling techniques. All models were validated by calculating several statistical parameters for both the training and test sets, and these parameters indicate that the models have high predictive power. The R² values for the test set were 0.90, 0.94 and 0.94 for the multiple linear regression, epsilon-support vector regression and deep learning-based artificial neural network models, respectively. The results indicate that two interactions are responsible for the separation of volatile organic compounds on the DB-5 stationary phase: van der Waals interactions of the molecules with the methyl groups of the stationary phase, and electrostatic interactions between atoms carrying a partial negative charge in the molecules and the hydrogen atoms of the phenyl groups of the stationary phase. Finally, the models were used to predict the retention indices of 694 volatile organic compounds for which no retention index data on the DB-5 stationary phase were available.
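
The linear part of the workflow described in the abstract (a two-descriptor model fitted on 141 compounds and scored by R² on 65 held-out compounds) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the descriptor values, the coefficients and the noise level are synthetic placeholders, not the paper's data or fitted model, and the function names are hypothetical. Only the 206/141/65 sizes and the use of two descriptors come from the abstract. The epsilon-support vector regression and neural network models are not reproduced.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ... via normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(ata, aty)


def predict(coef, X):
    """Apply fitted coefficients [b0, b1, ...] to descriptor rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# SYNTHETIC placeholder data: two made-up descriptor columns standing in for
# X1sol and AAC, and a made-up linear relationship plus Gaussian noise.
rng = random.Random(0)
compounds = []
for _ in range(206):
    x1 = rng.uniform(1.0, 6.0)  # hypothetical stand-in for X1sol
    x2 = rng.uniform(1.0, 3.0)  # hypothetical stand-in for AAC
    ri = 200.0 + 150.0 * x1 + 80.0 * x2 + rng.gauss(0.0, 20.0)
    compounds.append(((x1, x2), ri))

# Random 141/65 split, matching the set sizes reported in the abstract.
rng.shuffle(compounds)
train, test = compounds[:141], compounds[141:]

coef = fit_mlr([x for x, _ in train], [y for _, y in train])
r2_train = r_squared([y for _, y in train], predict(coef, [x for x, _ in train]))
r2_test = r_squared([y for _, y in test], predict(coef, [x for x, _ in test]))
print("coefficients:", coef)
print("R2 train:", r2_train, "R2 test:", r2_test)
```

Scoring R² on compounds that took no part in the fit is what makes the validation external; the training-set R² alone would not show whether the model generalizes.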

Acknowledgements

The authors wish to thank all who assisted in conducting this work.

Funding

This research was supported by the University of Kurdistan.

Author information

Corresponding author

Correspondence to B. Sepehri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Editorial responsibility: S. Hussain.

About this article

Cite this article

Sepehri, B., Ghavami, R., Farahbakhsh, S. et al. Machine learning-based quantitative structure–retention relationship models for predicting the retention indices of volatile organic pollutants. Int. J. Environ. Sci. Technol. 19, 1457–1466 (2022). https://doi.org/10.1007/s13762-021-03271-9

  • DOI: https://doi.org/10.1007/s13762-021-03271-9
