Abstract
Medical sentiment analysis refers to the extraction of sentiments or emotions from documents retrieved from healthcare sources, such as public forums and drug review websites. Previous studies show that sentiment analysis of clinical documents has the potential to help patients self-assess treatments, to give health professionals more insight into patients’ health conditions, and even to help manage relations between patients and doctors. Nevertheless, the limited data used in the empirical experiments of previous research points to a strong need for a systematic framework for identifying sentiments specific to the medical field. We propose a new feature extraction approach utilising position embeddings to generate a medical domain enhanced sentiment lexicon with a position encoding representation for drug review sentiment analysis. Experiments comparing different feature extraction methods, using two types of sentiment lexicons and various machine learning classifiers, show that the medical sentiment lexicon incorporating position encoding gives superior sentiment classification performance on drug review datasets.
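To make the idea of combining a sentiment lexicon with position encoding concrete, the following is a minimal sketch, not the method evaluated in this article. It assumes the sinusoidal position encoding commonly used with attention models, a hypothetical three-word toy lexicon (`TOY_LEXICON`), and a hypothetical helper `review_features` that sums polarity-weighted position encodings; the abstract does not specify any of these details.

```python
import math


def sinusoidal_position_encoding(position, dim):
    """Return the sinusoidal encoding vector of length `dim` for one token position."""
    encoding = []
    for i in range(dim):
        # Each sine/cosine pair shares one frequency.
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


# Hypothetical toy lexicon (assumption): word -> polarity score in [-1, 1].
TOY_LEXICON = {"relief": 0.8, "nausea": -0.7, "headache": -0.6}


def review_features(tokens, dim=4):
    """Sum polarity-weighted position encodings over the lexicon words in a review."""
    features = [0.0] * dim
    for position, token in enumerate(tokens):
        polarity = TOY_LEXICON.get(token.lower())
        if polarity is None:
            continue  # ignore words that are not in the lexicon
        encoding = sinusoidal_position_encoding(position, dim)
        for i in range(dim):
            features[i] += polarity * encoding[i]
    return features


features = review_features("Nausea at first but great relief later".split())
```

Because each lexicon word is weighted by where it occurs, two reviews containing the same sentiment words in a different order produce different feature vectors, which a plain bag-of-words lexicon score would not distinguish.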
Cite this article
Liu, S., Lee, I. Extracting features with medical sentiment lexicon and position encoding for drug reviews. Health Inf Sci Syst 7, 11 (2019). https://doi.org/10.1007/s13755-019-0072-6
DOI: https://doi.org/10.1007/s13755-019-0072-6