Abstract
Assessing the viability of protein targets is one of the preliminary steps of drug discovery. Determining druggability, the ability of a protein to bind drugs that modulate its function, requires a non-trivial amount of time and resources, and the inability to measure it properly has accounted for a significant portion of failures in drug discovery. The problem is further exacerbated by the large number of proteins involved in human diseases. Because of these barriers, the druggability space within the human proteome remains unexplored, which has made it difficult to develop drugs for numerous diseases. Here, we present a new feature of eFindSite that employs supervised machine learning to predict the druggability of a given protein. Benchmarking calculations against the Non-Redundant data set of Druggable and Less Druggable binding sites demonstrate that the AUC for druggability prediction with eFindSite is as high as 0.88. With eFindSite, we predicted the human druggability space to comprise 10,191 proteins. Restricting this set to the disease space from the Open Targets Platform and excluding already known targets reveals 2,731 potentially novel therapeutic targets. eFindSite is freely available as stand-alone software at https://github.com/michal-brylinski/efindsite.
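To make the two quantitative steps in the abstract concrete, the following is a minimal sketch, not the eFindSite implementation. It shows how an AUC can be computed for any binary druggability classifier using the rank-sum identity, and how a set of candidate targets can be narrowed by set operations. The function names, the label encoding (1 for druggable, 0 for less druggable), the toy scores, and the placeholder protein identifiers are all assumptions introduced for illustration.

```python
from typing import Sequence, Set


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 = druggable, 0 = less druggable (hypothetical encoding).
    scores: classifier confidence that a binding site is druggable.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def novel_targets(predicted: Set[str], disease: Set[str], known: Set[str]) -> Set[str]:
    """Predicted-druggable proteins that are disease-associated and not yet known targets."""
    return (predicted & disease) - known


# Toy data invented for illustration only (not taken from any benchmark).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

# Placeholder identifiers, not real accession numbers.
toy_novel = novel_targets({"P1", "P2", "P3", "P4"}, {"P2", "P3", "P4", "P5"}, {"P3"})
```

In this toy case, eight of the nine positive/negative pairs are ranked correctly, so `toy_auc` is 8/9. The pairwise loop is quadratic in the number of examples. For large benchmarks a sort-based rank computation or an established library routine would be the more practical choice.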
Acknowledgements
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R35GM119524. Portions of this research were conducted with computing resources provided by Louisiana State University.
Cite this article
Kana, O., Brylinski, M. Elucidating the druggability of the human proteome with eFindSite. J Comput Aided Mol Des 33, 509–519 (2019). https://doi.org/10.1007/s10822-019-00197-w
DOI: https://doi.org/10.1007/s10822-019-00197-w
Keywords
- Druggability prediction
- Human proteome
- Drug targets
- Pocket prediction
- Structural bioinformatics
- Molecular modeling
- eFindSite