Chapter

Secure Data Management

Volume 5159 of the series Lecture Notes in Computer Science, pp. 32-49

ARUBA: A Risk-Utility-Based Algorithm for Data Disclosure

  • Mohamed R. Fouad (Department of Computer Science, Purdue University)
  • Guy Lebanon (Department of Statistics and School of Electrical and Computer Engineering, Purdue University)
  • Elisa Bertino (Department of Computer Science, Purdue University)

Abstract

Dealing with sensitive data has been the focus of much recent research. On the one hand, data disclosure may incur some risk due to security breaches; on the other hand, data sharing has many advantages. For example, revealing customer transactions at a grocery store may be beneficial when studying purchasing patterns and market demand. However, potential misuse of the revealed information may be harmful due to privacy violations. In this paper we study the tradeoff between data disclosure and data retention. Specifically, we address the problem of minimizing the risk of data disclosure while maintaining its utility above a certain acceptable threshold. We formulate the problem as a discrete optimization problem and leverage the special monotonicity characteristics of both risk and utility to construct an efficient algorithm to solve it. This algorithm determines the optimal transformations that need to be performed on the microdata before it is released. These optimal transformations take into account both the risk associated with data disclosure and its benefit (referred to as utility). Through extensive experimental studies we compare the performance of our proposed algorithm with other data disclosure algorithms in the literature in terms of risk, utility, and time. We show that our proposed framework outperforms other techniques for sensitive data disclosure.
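
The problem stated in the abstract, minimizing risk subject to a utility threshold, can be illustrated with a small sketch. This is not the ARUBA algorithm itself, whose details the abstract does not give. It is a plain breadth-first search over a hypothetical lattice of generalization levels, and it assumes that generalizing any attribute further never increases risk or utility. All names (`min_risk_disclosure`, `toy_risk`, `toy_utility`) and the toy scoring functions are invented for illustration.

```python
from collections import deque
from typing import Callable, Optional, Sequence, Tuple

# A state assigns one generalization level to each attribute:
# 0 = value released as-is, max level = value fully suppressed.
State = Tuple[int, ...]


def min_risk_disclosure(
    max_levels: Sequence[int],
    risk: Callable[[State], float],
    utility: Callable[[State], float],
    threshold: float,
) -> Optional[Tuple[State, float]]:
    """Return the lowest-risk state whose utility is >= threshold, or None.

    Assumption (hypothetical model): risk and utility are both
    non-increasing as any attribute is generalized further.
    """
    start: State = tuple(0 for _ in max_levels)
    if utility(start) < threshold:
        # Even the untransformed data is not useful enough.
        return None

    best, best_risk = start, risk(start)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for i, top in enumerate(max_levels):
            if state[i] == top:
                continue  # attribute i is already fully generalized
            nxt = state[:i] + (state[i] + 1,) + state[i + 1:]
            if nxt in seen:
                continue
            seen.add(nxt)
            if utility(nxt) < threshold:
                # Monotonicity: every further generalization of nxt is
                # also below the threshold, so this branch is pruned.
                continue
            r = risk(nxt)
            if r < best_risk:
                best, best_risk = nxt, r
            queue.append(nxt)
    return best, best_risk


# Toy scoring functions for two attributes (made-up numbers).
def toy_utility(s: State) -> float:
    return 10 - 3 * s[0] - 2 * s[1]


def toy_risk(s: State) -> float:
    return 8 - 2 * s[0] - 3 * s[1]


result = min_risk_disclosure((2, 2), toy_risk, toy_utility, threshold=5)
```

With these toy functions and a threshold of 5, the search selects the state `(0, 2)` with risk 2: the second attribute is fully generalized and the first is released unchanged. The pruning step is where monotonicity helps, since an infeasible state rules out every state above it in the lattice without evaluating them.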

Keywords

Privacy, Security, Risk Management, Data Sharing, Data Utility, Anonymity