Cognitive Computation, Volume 10, Issue 4, pp 670–685

Relation Extraction of Medical Concepts Using Categorization and Sentiment Analysis

  • Anupam Mondal
  • Erik Cambria
  • Dipankar Das
  • Amir Hussain
  • Sivaji Bandyopadhyay


In healthcare services, information extraction is key to understanding any corpus-based knowledge. The process becomes laborious when annotation is done manually, given the large number of text corpora available. Hence, automated extraction systems will be essential for experts such as doctors and medical practitioners, as well as non-experts such as patients, to support better clinical decision-making and improve healthcare systems. Such extraction systems can be developed using medical concepts and concept-related features as part of a structured corpus. The latter can assist in assigning a category and a sentiment to each medical concept and its lexical context. These category and sentiment assignments constitute the semantic relations of medical concepts with their contexts, which are represented by the sentences of the corpus. This paper presents a new domain-based knowledge lexicon coupled with a machine learning approach to extract semantic relations by assigning a category and a sentiment to medical concepts and contexts. The categories considered in this research are diseases, symptoms, drugs, human_anatomy, and miscellaneous medical terms, whereas sentiments are either positive or negative. The proposed assignment systems are built on top of the WordNet of Medical Event (WME) lexicon. The developed lexicon provides medical concepts and their features, namely Parts-Of-Speech (POS), gloss (descriptive explanation), Similar Sentiment Words (SSW), affinity score, gravity score, polarity score, and sentiment. Several well-known supervised classifiers, including Naïve Bayes, Logistic Regression, and support vector-based Sequential Minimal Optimization (SMO), have been applied to evaluate the developed systems. The proposed approaches have resulted in a concept clustering application that identifies the semantic relations of concepts. The application can potentially be exploited in several domains, such as medical ontologies and recommendation systems.
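
To make the idea of lexicon-driven category and sentiment assignment more concrete, the following is a minimal Python sketch, not the authors' implementation. The lexicon entries, the polarity values, the names `LexiconEntry`, `TOY_LEXICON`, `annotate`, and `relations`, and the rule that a non-negative polarity counts as positive are all assumptions made for illustration; none of them are taken from WME. The sketch looks up each token of a sentence, attaches a category and a sentiment, and then pairs the co-occurring concepts into simple category-to-category relations.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class LexiconEntry:
    category: str    # e.g. disease, symptom, drug, human_anatomy, miscellaneous
    polarity: float  # hypothetical polarity score in [-1, 1]


# Toy entries invented for illustration; these are NOT values from the WME lexicon.
TOY_LEXICON = {
    "fever": LexiconEntry("symptom", -0.5),
    "malaria": LexiconEntry("disease", -0.8),
    "paracetamol": LexiconEntry("drug", 0.4),
}


def sentiment_of(polarity):
    # Assumption: a non-negative score maps to "positive", otherwise "negative".
    return "positive" if polarity >= 0 else "negative"


def annotate(sentence):
    """Return (concept, category, sentiment) for every known token, in order."""
    tokens = [t.strip(".,;:!?").lower() for t in sentence.split()]
    found = []
    for tok in tokens:
        entry = TOY_LEXICON.get(tok)
        if entry is not None:
            found.append((tok, entry.category, sentiment_of(entry.polarity)))
    return found


def relations(sentence):
    """Pair the concepts of one sentence and label each pair by its categories."""
    concepts = annotate(sentence)
    return [(a[0], b[0], f"{a[1]}-{b[1]}") for a, b in combinations(concepts, 2)]


if __name__ == "__main__":
    text = "Paracetamol reduces fever in malaria."
    print(annotate(text))
    print(relations(text))
```

In a fuller system, the hand-written lookup would be replaced by lexicon features fed to a supervised classifier; the sketch only shows how category and sentiment labels on concepts within one sentence can be combined into relation candidates.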


Keywords: Bio-NLP, Category, Medical concept, Medical context, Semantic, Sentiment


Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent was not required as no humans or animals were involved.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Anupam Mondal (1)
  • Erik Cambria (2)
  • Dipankar Das (1)
  • Amir Hussain (3)
  • Sivaji Bandyopadhyay (1)

  1. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
  2. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  3. Division of Computing Science and Maths, Faculty of Natural Sciences, University of Stirling, Stirling, UK
