Suppressing Microdata to Prevent Probabilistic Classification Based Inference

* Final gross prices may vary according to local VAT.

Get Access

Abstract

Enterprises have been collecting data for many reasons including better customer relationship management, and high-level decision making. Public safety was another motivation for large-scale data collection efforts initiated by government agencies. However, such widespread data collection efforts coupled with powerful data analysis tools raised concerns about privacy. This is due to the fact that collected data may contain confidential information, or it can be used to infer confidential information. One method to ensure privacy is to selectively hide confidential data values from the data set to be disclosed. However, with data mining technology it is now possible for an adversary to predict the hidden data values, which is another threat to privacy. In this paper we concentrate on probabilistic classification, which is a specific data mining technique widely used for prediction purposes, and propose methods for downgrading probabilistic classification models in order to block the inference of hidden microdata values.

This work is funded by the PIA-BOSPHORUS programme of EGIDE (France) and TÜBİTAK (Turkey).