Abstract
This chapter gives a broad overview of the drug discovery process and of the areas where statistical input can have a key impact. It focuses primarily on a few key areas: target discovery, compound screening and optimization, and the characterization of important properties. Special attention is paid to working with assay data and phenotypic screens. The chapter concludes with a discussion of important skills for a nonclinical statistician supporting drug discovery.
Notes
- 2. The term “Mendelian Randomization” refers to the notion that we are randomized at birth to the genetic “treatment” of the SNP.
- 3. Thankfully, the academic community has been highly cooperative, forming large consortia that produce meta-analyses of many smaller GWAS totaling hundreds of thousands of subjects.
Acknowledgements
We would like to thank David Potter and Bill Pikounis for providing feedback on a draft of this chapter.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kuhn, M., Yates, P., Hyde, C. (2016). Statistical Methods for Drug Discovery. In: Zhang, L. (ed.) Nonclinical Statistics for Pharmaceutical and Biotechnology Industries. Statistics for Biology and Health. Springer, Cham. https://doi.org/10.1007/978-3-319-23558-5_4
DOI: https://doi.org/10.1007/978-3-319-23558-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23557-8
Online ISBN: 978-3-319-23558-5
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)