Abstract
Publishing medical datasets about individuals in a privacy-preserving way has given rise to a significant body of research. Algorithms that anonymize such datasets, whether they contain relational or set-valued (a.k.a. transaction) attributes, while preserving data truthfulness are crucial to medical research. Selecting the most appropriate algorithm, however, is far from trivial, and tools that assist data publishers in this task are needed. To highlight this need, we first briefly describe popular anonymization algorithms and the metrics most commonly used to quantify data utility. We then present a system we have designed that can apply a set of anonymization algorithms, enabling data holders, including medical researchers and healthcare organizations, to test the effectiveness and efficiency of different methods. Our system, called SECRETA, supports evaluating a specific anonymization algorithm, comparing multiple anonymization algorithms, and combining algorithms to anonymize datasets with both relational and transaction attributes. The analysis of the algorithms is performed interactively and progressively, and the results, including attribute statistics and various data utility indicators, are summarized and presented graphically.
Notes
1. We note that when an RT-dataset contains more than one transaction attribute, these attributes can be integrated into a single transaction attribute by changing the specification of their domains and values. Assuming two such attributes X and Y, with X having values x1, x2 and Y having values y1, y2, we can replace them with a single transaction attribute, say Z, with domain {X.x1, X.x2, Y.y1, Y.y2}.
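The integration described in this note can be illustrated with a minimal Python sketch. It is not taken from SECRETA; the function names `merge_transaction_attributes` and `merged_domain` and the dictionary-based record layout are assumptions made purely for illustration. Each item is prefixed with the name of the attribute it came from, so that equal values from different attributes stay distinct in the merged attribute Z.

```python
from typing import Dict, Iterable, Set


def merged_domain(domains: Dict[str, Iterable[str]]) -> Set[str]:
    """Build the domain of the single attribute Z from the per-attribute domains."""
    return {f"{attr}.{value}" for attr, values in domains.items() for value in values}


def merge_transaction_attributes(record: Dict[str, Iterable[str]]) -> Set[str]:
    """Replace the set-valued attributes of one record with a single itemset for Z.

    `record` maps an attribute name (e.g. "X") to the items the record holds
    for that attribute. This layout is hypothetical.
    """
    return {f"{attr}.{item}" for attr, items in record.items() for item in items}


# Domains of X and Y as in the note.
domains = {"X": ["x1", "x2"], "Y": ["y1", "y2"]}
print(sorted(merged_domain(domains)))
# ['X.x1', 'X.x2', 'Y.y1', 'Y.y2']

# A record holding both values of X and one value of Y.
record = {"X": ["x1", "x2"], "Y": ["y1"]}
print(sorted(merge_transaction_attributes(record)))
# ['X.x1', 'X.x2', 'Y.y1']
```

Prefixing with the attribute name keeps the mapping reversible: splitting each merged item at the first dot recovers the original attribute and value, provided attribute names themselves contain no dot.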
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Poulis, G., Gkoulalas-Divanis, A., Loukides, G., Skiadopoulos, S., Tryfonopoulos, C. (2015). SECRETA: A Tool for Anonymizing Relational, Transaction and RT-Datasets. In: Gkoulalas-Divanis, A., Loukides, G. (eds) Medical Data Privacy Handbook. Springer, Cham. https://doi.org/10.1007/978-3-319-23633-9_5
DOI: https://doi.org/10.1007/978-3-319-23633-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23632-2
Online ISBN: 978-3-319-23633-9
eBook Packages: Computer Science, Computer Science (R0)