Priority-Based k-Anonymity Accomplished by Weighted Generalisation Structures

  • Konrad Stark
  • Johann Eder
  • Kurt Zatloukal
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4081)


Biobanks are gaining in importance by storing large collections of patients' clinical data (e.g. disease history, laboratory parameters, diagnosis, lifestyle) together with biological materials such as tissue samples, blood or other body fluids. When these patient-specific data are released for medical studies, privacy protection has to be guaranteed for ethical and legal reasons. k-anonymity may be used to ensure privacy by generalising and suppressing attributes so that the released data contain enough data twins to mask patients' identities. However, data transformation techniques such as generalisation may produce anonymised data that are unusable for medical studies because some attributes become too coarse-grained. We propose a priority-driven anonymisation technique that allows the degree of acceptable information loss to be specified for each attribute separately. We use generalisation and suppression of attributes together with a weighting scheme for quantifying generalisation steps. Our approach handles both numerical and categorical attributes and anonymises data on the basis of priorities and weights. The anonymisation algorithm described in this paper has been implemented and tested on a carcinoma data set. We discuss some general privacy-protecting methods for medical data and show some medically relevant use cases that benefit from our anonymisation technique.
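
The interplay of generalisation levels, data twins and per-attribute weights can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the two attributes, their three-level hierarchies, the weights, the sample records and the exhaustive search are all assumptions chosen for illustration. A higher weight marks an attribute whose precision matters more, so generalising it costs more.

```python
from collections import Counter
from itertools import product

# Hypothetical generalisation hierarchies (assumed, not taken from the paper).
def generalise_age(age, level):
    """Level 0: exact age; level 1: 10-year band; level 2: suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"

def generalise_zip(zip_code, level):
    """Level 0: full code; level 1: first two digits kept; level 2: suppressed."""
    if level == 0:
        return zip_code
    if level == 1:
        return zip_code[:2] + "**"
    return "*"

HIERARCHIES = {"age": generalise_age, "zip": generalise_zip}
MAX_LEVEL = 2

def apply_levels(records, levels):
    """Generalise every record to the given level per attribute."""
    attrs = sorted(levels)
    return [tuple(HIERARCHIES[a](r[a], levels[a]) for a in attrs) for r in records]

def is_k_anonymous(rows, k):
    """True if every combination of values occurs at least k times (k data twins)."""
    return all(count >= k for count in Counter(rows).values())

def weighted_loss(levels, weights):
    """Assumed loss measure: sum of weight times number of generalisation steps."""
    return sum(weights[a] * levels[a] for a in levels)

def cheapest_k_anonymous(records, k, weights):
    """Try every level combination and keep the k-anonymous one with least loss."""
    attrs = sorted(HIERARCHIES)
    best = None
    for combo in product(range(MAX_LEVEL + 1), repeat=len(attrs)):
        levels = dict(zip(attrs, combo))
        if is_k_anonymous(apply_levels(records, levels), k):
            loss = weighted_loss(levels, weights)
            if best is None or loss < best[0]:
                best = (loss, levels)
    return best

# Made-up sample records.
records = [
    {"age": 34, "zip": "8010"},
    {"age": 34, "zip": "8020"},
    {"age": 57, "zip": "8010"},
    {"age": 57, "zip": "8020"},
]

# Age is the priority attribute: the search generalises the postcode instead.
print(cheapest_k_anonymous(records, 2, {"age": 3, "zip": 1}))
# The postcode is the priority attribute: the search suppresses age instead.
print(cheapest_k_anonymous(records, 2, {"age": 1, "zip": 5}))
```

With these records, 2-anonymity can be reached either by one generalisation step on the postcode or by suppressing age, so the weights alone decide which attribute keeps its precision. The exhaustive search is only workable for a handful of attributes and levels.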


Information Loss, Generalisation Step, Privacy Protection, Generalisation Level, Data Twin
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Konrad Stark (1)
  • Johann Eder (2)
  • Kurt Zatloukal (1)
  1. Institute of Pathology, Medical University Graz, Graz
  2. Department of Knowledge and Business Engineering, University of Vienna, Wien
