Statistical Methods in QSAR/QSPR

  • Kunal Roy
  • Supratik Kar
  • Rudra Narayan Das
Part of the SpringerBriefs in Molecular Science book series (BRIEFSMOLECULAR)


QSAR/QSPR studies aim to develop correlation models that relate a response of chemicals (an activity or property) to chemical information data using statistical methods. Regression-based strategies are used to model quantitative response data, and classification-based strategies to model graded response data. In addition to these conventional methods, various machine learning tools are useful for QSAR/QSPR modeling, especially in studies where high-dimensional, complex chemical information data bear a nonlinear relationship to the response under consideration.


Keywords: Applicability domain; Chemometric tools; Classification; MLR; Model development; OECD; Validation



Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
