Anonymisation of Swedish Clinical Data

  • Dimitrios Kokkinakis
  • Anders Thurin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4594)


There is a constantly growing demand for exchanging clinical and health-related information electronically. In the era of the Electronic Health Record, the release of individual data for research, health care statistics, monitoring of new diagnostic tests, and tracking of disease outbreak alerts are some of the areas in which the protection of (patient) privacy has become an important concern. In this paper we present a system for the automatic anonymisation of Swedish clinical free text, in the form of discharge letters, by applying generic named entity recognition technology.


Keywords: anonymisation, hospital discharge letters, entity recognition





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Dimitrios Kokkinakis (1)
  • Anders Thurin (2)
  1. Göteborg University, Department of Swedish Language, Språkdata, Sweden
  2. Clinical Physiology, Sahlgrenska Univ. Hospital/Östra, Sweden
