Elucidating the druggability of the human proteome with eFindSite

Abstract

Assessing the viability of protein targets is one of the first steps of drug discovery. Determining the ability of a protein to bind drugs that modulate its function, termed druggability, requires a non-trivial amount of time and resources. The inability to properly assess druggability has accounted for a significant portion of failures in drug discovery. This problem is further exacerbated by the large number of proteins involved in human diseases. Because of these barriers, the druggability space within the human proteome remains unexplored, which has made it difficult to develop drugs for numerous diseases. Hence, we present a new feature developed in eFindSite that employs supervised machine learning to predict the druggability of a given protein. Benchmarking calculations against the Non-Redundant data set of Druggable and Less Druggable binding sites demonstrate that the area under the curve (AUC) for druggability prediction with eFindSite is as high as 0.88. With eFindSite, we find that the human druggability space comprises 10,191 proteins. Considering the disease space from the Open Targets Platform and excluding already known targets from the predicted data set reveals 2,731 potentially novel therapeutic targets. eFindSite is freely available as stand-alone software at https://github.com/michal-brylinski/efindsite.
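To make the benchmark metric concrete, the following is a minimal sketch of how an AUC can be computed for a binary druggability classifier, assuming AUC here denotes the area under the ROC curve (the usual reading; the abstract does not spell it out). The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not eFindSite code or data from the paper.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    The value equals the probability that a randomly chosen positive
    (druggable) site scores higher than a randomly chosen negative
    (less druggable) site; tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = druggable, 0 = less druggable.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes, which gives a scale for reading the reported value of 0.88. The pairwise form is quadratic in the number of sites; for large benchmarks a rank-based implementation from an established library would be the more practical choice.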

Acknowledgements

Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R35GM119524. Portions of this research were conducted with computing resources provided by Louisiana State University.

Author information

Corresponding author

Correspondence to Michal Brylinski.

Cite this article

Kana, O., Brylinski, M. Elucidating the druggability of the human proteome with eFindSite. J Comput Aided Mol Des 33, 509–519 (2019). https://doi.org/10.1007/s10822-019-00197-w

  • DOI: https://doi.org/10.1007/s10822-019-00197-w

Keywords

  • Druggability prediction
  • Human proteome
  • Drug targets
  • Pocket prediction
  • Structural bioinformatics
  • Molecular modeling
  • eFindSite