
Relation Extraction of Medical Concepts Using Categorization and Sentiment Analysis

Cognitive Computation

Abstract

In healthcare services, information extraction is key to understanding corpus-based knowledge. Manual annotation becomes laborious given the large number of available text corpora. Automated extraction systems will therefore be essential both for experts, such as doctors and medical practitioners, and for non-experts, such as patients, to support better clinical decision-making and improve healthcare systems. Such extraction systems can be developed using medical concepts and concept-related features as part of a structured corpus. These features help assign a category and a sentiment to each medical concept and its lexical context. The category and sentiment assignments constitute semantic relations between medical concepts and their contexts, where the contexts are represented by sentences of the corpus. This paper presents a new domain-based knowledge lexicon coupled with a machine learning approach to extract semantic relations by assigning a category and a sentiment to medical concepts and contexts. The categories considered in this research are diseases, symptoms, drugs, human_anatomy, and miscellaneous medical terms, whereas the sentiments are positive and negative. The proposed assignment systems are built on top of the WordNet of Medical Event (WME) lexicon. The lexicon provides medical concepts and their features, namely Parts-Of-Speech (POS), gloss (descriptive explanation), Similar Sentiment Words (SSW), affinity score, gravity score, polarity score, and sentiment. Several well-known supervised classifiers, including Naïve Bayes, Logistic Regression, and support vector-based Sequential Minimal Optimization (SMO), have been applied to evaluate the developed systems. The proposed approaches have resulted in a concept clustering application that identifies the semantic relations of concepts. The application has potential uses in several domains, such as medical ontologies and recommendation systems.
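
The category and sentiment assignment described in the abstract can be illustrated with a minimal sketch. It assumes a toy lexicon whose entries, glosses, and labels were invented for illustration (they are not taken from WME), omits the numeric affinity, gravity, and polarity scores, and uses a small hand-written multinomial Naïve Bayes over gloss tokens rather than the authors' actual feature set or tooling. The names `LexiconEntry`, `NaiveBayes`, and `assign` are hypothetical.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """A toy lexicon entry; numeric scores (affinity, gravity, polarity) are omitted."""
    concept: str
    gloss: str       # descriptive explanation of the concept
    category: str    # e.g. "disease", "symptom", "drug"
    sentiment: str   # "positive" or "negative"


# Hypothetical entries written for this sketch, not taken from the WME lexicon.
ENTRIES = [
    LexiconEntry("fever", "abnormally high body temperature", "symptom", "negative"),
    LexiconEntry("cough", "sudden noisy expulsion of air from the lungs", "symptom", "negative"),
    LexiconEntry("influenza", "viral disease of the respiratory tract", "disease", "negative"),
    LexiconEntry("diabetes", "chronic disease marked by high blood sugar", "disease", "negative"),
    LexiconEntry("aspirin", "drug that relieves pain and reduces inflammation", "drug", "positive"),
    LexiconEntry("antibiotic", "drug that destroys bacteria", "drug", "positive"),
]


def tokenize(text):
    """Lowercase whitespace tokenization, enough for the toy glosses."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing over tokens."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the label
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


glosses = [entry.gloss for entry in ENTRIES]
category_model = NaiveBayes().fit(glosses, [entry.category for entry in ENTRIES])
sentiment_model = NaiveBayes().fit(glosses, [entry.sentiment for entry in ENTRIES])


def assign(context):
    """Return a (category, sentiment) pair for a concept gloss or context sentence."""
    return category_model.predict(context), sentiment_model.predict(context)


if __name__ == "__main__":
    print(assign("chronic viral disease"))
    print(assign("drug that reduces pain"))
```

In this sketch, concepts and contexts that receive the same (category, sentiment) pair can be grouped together, which is one simple way to read the concept clustering mentioned in the abstract. The Logistic Regression and SMO classifiers named there could be substituted for the toy classifier without changing the overall flow.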


Notes

  1. http://www.nactem.ac.uk/GENIA/tagger/

  2. https://catalog.ldc.upenn.edu/LDC2008T21

  3. http://alt.qcri.org/semeval2015/task6/

  4. http://alexabe.pbworks.com/f/Dictionary+of+Medical+Terms+4th+Ed.-+(Malestrom).pdf

  5. http://sentiwordnet.isti.cnr.it/

  6. http://sentic.net/downloads/

  7. https://www.cs.uic.edu/

  8. http://neuro.imm.dtu.dk/wiki/

  9. https://en.wikipedia.org/wiki/Cohen%27s_kappa

  10. http://www.nltk.org/

  11. http://alt.qcri.org/semeval2015/task6/

  12. http://www.medicinenet.com/script/main/hp.asp

  13. http://alt.qcri.org/semeval2015/task6/

  14. http://www.medicinenet.com/script/main/hp.asp

Author information

Corresponding author

Correspondence to Anupam Mondal.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent was not required as no humans or animals were involved.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

About this article

Cite this article

Mondal, A., Cambria, E., Das, D. et al. Relation Extraction of Medical Concepts Using Categorization and Sentiment Analysis. Cogn Comput 10, 670–685 (2018). https://doi.org/10.1007/s12559-018-9567-8

  • DOI: https://doi.org/10.1007/s12559-018-9567-8
