Which Performance Parameters Are Best Suited to Assess the Predictive Ability of Models?
We revisit the lively debate in the QSAR literature over external validation versus cross-validation, and present a thorough statistical comparison of model performance parameters using the recently published SRD (sum of (absolute) ranking differences) method combined with analysis of variance (ANOVA). Two case studies were investigated, one of which used exclusively external performance merits. For both case studies, SRD coupled with ANOVA shows unambiguously that the performance merits differ significantly, independently of data preprocessing. Although external merits are generally less consistent (farther from the reference) than merits based on training and cross-validation, a clear ordering and grouping pattern could be obtained for them. The results corroborate our earlier findings (SAR QSAR Environ. Res., 2015, 26, 683–700) that external validation is not necessarily a wise choice and is frequently comparable to a random evaluation of the models.
Keywords: Performance parameters (merits); Ranking; Cross-validation; External validation; QSAR modeling
The work was supported by the Hungarian Scientific Research Fund (OTKA, grant number K 119269).