A Generalized Partial Canonical Correlation Model to Measure Contribution of Individual Drug Features Toward Side Effects Prediction

  • Rakesh Kanji
  • Ganesh Bagler
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 37)


Identification of potential drug side effects is an open problem of importance for drug development. Side effects are related to a variety of interlinked aspects such as the chemical properties of drugs, drug–target interactions, the pathways involved, and many more. Existing statistical methods and machine learning models aim to incorporate such features to predict adverse drug reactions. One of the challenges in these efforts is to disentangle the interdependence of features in order to identify the contribution of individual features toward specifying side effects. We present a partial canonical correlation analysis (PCCA) model that facilitates enumeration of the contribution of individual drug features toward the prediction of a class of side effects, irrespective of their interdependence on other features. The model is a combination of analytical and numerical strategies, and can be used to arrive at the most effective set of drug features starting from a range of available descriptors. For eye- and nose-related side effects, we demonstrate the implementation of our model for identifying the best 2D chemical features that are closely linked with organ-specific adverse reactions. Despite the large number of drugs that are simultaneously associated with both organs, the model could discern distinct drug features specifically linked to each class. With the availability of large amounts of data with an array of interdependent drug descriptors, such a model is of value in the drug discovery process, as it helps in dealing with a multidimensional drug feature space.
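The idea of measuring association between drug features and side effects after removing the influence of other features can be illustrated with the textbook form of partial canonical correlation. The sketch below is not the generalized model of the paper; it is a minimal NumPy illustration under stated assumptions: `X` stands for the feature block of interest, `Y` for a side-effect block, `Z` for the remaining (conditioning) features, all three are hypothetical synthetic matrices, and the residualized blocks are assumed to have full column rank.

```python
import numpy as np

def residualize(A, Z):
    """Remove from each column of A the part linearly explained by Z (with intercept)."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(Z1, A, rcond=None)
    return A - Z1 @ coef

def canonical_correlations(X, Y):
    """Canonical correlations of centred X and Y via QR followed by SVD.

    Assumes both centred matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def partial_canonical_correlations(X, Y, Z):
    """Canonical correlations between X and Y after regressing Z out of both."""
    return canonical_correlations(residualize(X, Z), residualize(Y, Z))

# Synthetic (hypothetical) data: X and Y are related only through Z.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 2))
X = Z @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(n, 3))
Y = Z @ rng.normal(size=(2, 2)) + 0.1 * rng.normal(size=(n, 2))

plain = canonical_correlations(X, Y)
partial = partial_canonical_correlations(X, Y, Z)
print("plain CCA:  ", plain)
print("partial CCA:", partial)
```

In this toy setting the plain canonical correlations are high because both blocks depend on `Z`, while the partial ones drop toward zero once `Z` is regressed out, which is the kind of disentangling the abstract refers to.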


Drug discovery, Partial canonical correlation, Chemoinformatics, Side effect modeling, Feature selection



GB acknowledges the support from Indraprastha Institute of Information Technology Delhi (IIIT-Delhi) and seed grant support from the Indian Institute of Technology Jodhpur (IITJ/SEED/2014/0003). RK thanks the Ministry of Human Resource Development, Government of India, as well as the Indian Institute of Technology Jodhpur, for the Senior Research Fellowship.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Computer Science and Engineering, Narula Institute of Technology, JIS Group, Kolkata, India
  2. Centre of Computational Biology, Indraprastha Institute of Information Technology, Delhi, India
