Using semantic analysis of texts for the identification of drugs with similar therapeutic effects

Russian Chemical Bulletin

Abstract

Semantic analysis of text collections was used to identify drugs with similar therapeutic activity. Natural language processing methods were applied to analyse more than 2.5 million texts from English-language drug reviews found on patient forums and discussion boards. To obtain distributed word representations from the input data, a continuous bag-of-words model was used; it is one of the word2vec models designed to capture natural-language semantics. This allowed a numeric vector to be assigned to each drug name. A list of pairs of drugs with similar vectors was then formed. An analysis of this list confirmed that similar word vectors correspond either to drugs with the same active compound or to drugs with close therapeutic effects that belong to the same therapeutic group. The chemical similarity within such drug pairs was found to be low. The suggested procedure was used to visualize the chemical drug space and to search for compounds with potentially similar biological effects among drugs of different therapeutic groups.
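
The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the drug names and three-dimensional vectors are hypothetical toy values standing in for vectors from a trained continuous bag-of-words model, and both cosine similarity (for word vectors) and the Tanimoto coefficient over feature sets (for chemical similarity) are assumed measures that the abstract does not name.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy vectors; in practice each would come from a trained
# continuous bag-of-words (word2vec) model, one vector per drug name.
TOY_VECTORS = {
    "drug_a": [0.9, 0.1, 0.0],
    "drug_b": [0.8, 0.2, 0.1],
    "drug_c": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """Return (name_a, name_b, score) for pairs scoring at or above threshold.

    The threshold is an assumed cut-off chosen for illustration only.
    Results are sorted from most to least similar.
    """
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(vectors.items()), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def tanimoto(features_a, features_b):
    """Tanimoto coefficient between two sets of structural features.

    Assumed here as the chemical similarity measure; the feature sets
    would normally be derived from molecular fingerprints.
    """
    union = features_a | features_b
    return len(features_a & features_b) / len(union) if union else 0.0


PAIRS = similar_pairs(TOY_VECTORS, threshold=0.8)
for name_a, name_b, score in PAIRS:
    print(f"{name_a} ~ {name_b}: cosine = {score:.3f}")
```

With real data, each semantically similar pair could then be passed to a chemical similarity function such as `tanimoto` to check whether the two drugs are also structurally close, which is the comparison the abstract reports as yielding low chemical similarity.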

Author information

Corresponding author

Correspondence to E. V. Tutubalina.

Additional information

Based on the Materials of the XX Mendeleev Congress on General and Applied Chemistry (September 26–30, 2016, Ekaterinburg, Russia).

Published in Russian in Izvestiya Akademii Nauk. Seriya Khimicheskaya, No. 11, pp. 2180–2189, November, 2017.

About this article

Cite this article

Tutubalina, E.V., Miftahutdinov, Z.S., Nugmanov, R.I. et al. Using semantic analysis of texts for the identification of drugs with similar therapeutic effects. Russ Chem Bull 66, 2180–2189 (2017). https://doi.org/10.1007/s11172-017-2000-8

  • DOI: https://doi.org/10.1007/s11172-017-2000-8
