Predictive Toxicogenomics in Preclinical Discovery
The failure of drug candidates during clinical trials due to toxicity, especially hepatotoxicity, is an important and continuing problem in the pharmaceutical industry.
This chapter explores new predictive toxicogenomics approaches to better understand the hepatotoxic potential of human drug candidates and to assess their toxicity earlier in the drug development process. The underlying data consisted of two commercial knowledgebases that employed a hybrid experimental design in which human drug-toxicity information was extracted from the literature, dichotomized, and merged with rat-based gene expression measures (primary isolated hepatocytes and whole liver). Toxicity classification rules were built using a stochastic gradient boosting machine learner, with classification error estimated using a modified bootstrap estimate of true error. Several types of clustering methods were also applied, based on sets of compounds and genes. Robust classification rules were constructed for both in vitro (hepatocytes) and in vivo (liver) data, based on a high-dose, 24-h design. There appeared to be little overlap between the two classifiers, at least in terms of their gene lists. Robust classifiers could not be fitted when earlier time points and/or low-dose data were included, indicating that experimental design is important for these systems. Our results suggest that developing a compound screening assay based on these toxicity classifiers is feasible, with classifier operating characteristics used to tune the screen for a specific implementation within preclinical testing paradigms.
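The abstract does not say which "modified bootstrap estimate of true error" was used. As a rough illustration only, the sketch below assumes it resembles the .632+ bootstrap estimator of Efron and Tibshirani, which blends the optimistic resubstitution error with the pessimistic out-of-bootstrap error. The function names, the binary 0/1 label coding, and the example numbers are hypothetical and are not taken from the chapter.

```python
def no_information_rate(y_true, y_pred):
    """Error expected if predictions were independent of the labels (binary 0/1)."""
    n = len(y_true)
    p1 = sum(1 for y in y_true if y == 1) / n   # observed fraction of toxic compounds
    q1 = sum(1 for y in y_pred if y == 1) / n   # predicted fraction of toxic compounds
    return p1 * (1 - q1) + (1 - p1) * q1


def err_632_plus(err_resub, err_oob, gamma):
    """Combine resubstitution and out-of-bootstrap error, assuming the .632+ rule.

    err_resub: error of the rule on its own training data
    err_oob:   average error on samples left out of each bootstrap replicate
    gamma:     no-information error rate
    """
    err_oob = min(err_oob, gamma)               # cap at the no-information rate
    if err_oob > err_resub and gamma > err_resub:
        r = (err_oob - err_resub) / (gamma - err_resub)  # relative overfitting rate
    else:
        r = 0.0
    w = 0.632 / (1.0 - 0.368 * r)               # weight grows from 0.632 to 1 with overfitting
    return (1.0 - w) * err_resub + w * err_oob


if __name__ == "__main__":
    # Hypothetical error rates, for illustration only.
    gamma = no_information_rate([1, 1, 0, 0], [1, 0, 1, 0])
    print(err_632_plus(err_resub=0.1, err_oob=0.3, gamma=gamma))
```

When the classifier shows no overfitting, the weight stays at 0.632; when the out-of-bootstrap error reaches the no-information rate, the estimate relies entirely on the out-of-bootstrap error.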
Keywords: classification rule; in vitro; in vivo; machine learning; stochastic gradient boosting; toxicity; toxicogenomics
The authors thank our colleagues at Millennium Pharmaceuticals, Inc.—Arek Raczynski, Carl Alden, and Scott Coleman—for their scientific input and thoughtful review of this work; and Victor Farutin for his expert help implementing stochastic gradient boosting. We also thank Gene Logic Inc. for supplying the in vivo and in vitro gene expression data. Permission to reproduce material was obtained from Future Medicine for the inclusion of parts of the following papers from Pharmacogenomics:
• Barros, S. (2005) The importance of applying toxicogenomics to increase the efficiency of drug discovery. Pharmacogenomics 6(6), 547–550.
• Martin, R., Rose, D., Yu, K., and Barros, S. (2006) Toxicogenomics strategies for predicting drug toxicity. Pharmacogenomics 7(7), 1003–1016.
• Martin, R. and Yu, K. (2006) Assessing performance of prediction rules in machine learning. Pharmacogenomics 7(4), 543–550.