Medical Discourse and Subjectivity

  • Natalia Grabar
  • Pierre Chauveau-Thoumelin
  • Loïc Dumonet
Part of the Studies in Computational Intelligence book series (SCI, volume 615)


Actors and users of the medical field (doctors, nurses, patients, medical students, pharmacists, etc.) neither belong to the same social and professional categories nor have the same level of expertise in the field. Their writings reflect this, for instance through the terminology they use. Their writings also differ in the use of subjectivity markers. This paper addresses the automatic study of subjectivity in medical discourse, in texts written in French. We compare documents written by medical doctors and biomedical researchers (scientific literature, clinical reports) with patient discourse (discussions from health fora) through a contrastive analysis of the differences observed in the use of descriptors such as uncertainty and polarity markers, non-lexical (smileys, repeated punctuation, etc.) and lexical emotional markers, and medical terms related to disorders, medications and procedures. We perform automatic annotation and categorization of the documents in order to better observe the specificities of the medical discourses studied.
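As a rough illustration of the kind of descriptors listed above, the sketch below computes per-token rates of uncertainty markers, smileys and repeated punctuation for a single text. The marker list, the regular expressions and the function name `marker_features` are assumptions made for this example; they are not the lexicons or tools used in the study.

```python
import re

# Hypothetical, tiny list of French uncertainty markers (illustration only;
# not the resource used by the authors).
UNCERTAINTY_MARKERS = {"peut-être", "probablement", "possible", "semble"}

# Simple patterns for non-lexical markers: smileys and repeated punctuation.
SMILEY_RE = re.compile(r"[:;=][-^]?[)(DPp]")
REPEATED_PUNCT_RE = re.compile(r"[!?]{2,}")

# A token is a run of word characters, optionally joined by hyphens or
# apostrophes (so "peut-être" stays a single token).
TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*")


def marker_features(text: str) -> dict[str, float]:
    """Return marker counts normalised by the number of tokens in the text."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    uncertainty = sum(1 for t in tokens if t in UNCERTAINTY_MARKERS)
    return {
        "uncertainty": uncertainty / n,
        "smileys": len(SMILEY_RE.findall(text)) / n,
        "repeated_punct": len(REPEATED_PUNCT_RE.findall(text)) / n,
    }


forum_post = "Je suis peut-être malade !!! :)"
clinical_note = "Le patient présente une hypertension."
print(marker_features(forum_post))
print(marker_features(clinical_note))
```

Feature vectors of this kind could then be passed to a supervised classifier to separate expert from patient writing; normalising by text length keeps long and short documents comparable.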


NLP, Uncertainty, Emotions, Supervised categorization



This work is partially funded by the French Agence Nationale de la Recherche (ANR) and the DGA, under the Tecsan grant ANR-11-TECS-012 (RAVEL project), and by the research programme Patients’ mind funded by the Maison des Sciences de l’Homme network (interMSH framework). We are thankful to the reviewers for their comments.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Natalia Grabar (1)
  • Pierre Chauveau-Thoumelin (1)
  • Loïc Dumonet (1)
  1. STL UMR 8163 CNRS, Université Lille 3 et Lille 1, Villeneuve d’Ascq, France
