SECRETA: A Tool for Anonymizing Relational, Transaction and RT-Datasets

  • Chapter
Medical Data Privacy Handbook

Abstract

Publishing medical datasets about individuals in a privacy-preserving way has been the subject of a significant body of research. Algorithms that anonymize such datasets, with relational or set-valued (a.k.a. transaction) attributes, in a way that preserves data truthfulness are crucial to medical research. Selecting the most appropriate algorithm, however, is still far from trivial, and tools that assist data publishers in this task are needed. To highlight this need, we first provide a brief description of popular anonymization algorithms and of the metrics most commonly used to quantify data utility. Next, we present a system that we have designed, which can apply a set of anonymization algorithms, enabling data holders, including medical researchers and healthcare organizations, to test the effectiveness and efficiency of different methods. Our system, called SECRETA, supports evaluating a specific anonymization algorithm, comparing multiple anonymization algorithms, and combining algorithms to anonymize datasets with both relational and transaction attributes. The analysis of the algorithms is performed in an interactive and progressive way, and the results, including attribute statistics and various data utility indicators, are summarized and presented graphically.


Notes

  1. We note that for RT-datasets containing more than one transaction attribute, these attributes can be integrated into a single transaction attribute by changing the specification of their domains and values. For two such attributes X and Y, with X having values x₁, x₂ and Y having values y₁, y₂, we can replace them with a single transaction attribute, say Z, with domain {X.x₁, X.x₂, Y.y₁, Y.y₂}.
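
    The integration described in this note can be sketched in a few lines of Python. This is only an illustrative sketch, not code from SECRETA: the function names `merged_domain` and `merge_transaction_attributes`, the dictionary-based record layout, and the "attribute.value" string format are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set


def merged_domain(domains: Dict[str, List[str]]) -> List[str]:
    """Build the domain of the single attribute Z by prefixing every value
    with the name of the transaction attribute it came from (hypothetical helper)."""
    return [f"{attr}.{value}" for attr, values in domains.items() for value in values]


def merge_transaction_attributes(
    record: Dict[str, Iterable[str]], attributes: List[str]
) -> Set[str]:
    """Replace several set-valued attributes of one record with a single
    item set whose items are qualified by their original attribute name."""
    return {f"{attr}.{item}" for attr in attributes for item in record.get(attr, ())}


# Domains of the two transaction attributes X and Y from the note.
domains = {"X": ["x1", "x2"], "Y": ["y1", "y2"]}
z_domain = merged_domain(domains)

# One record (assumed layout) holding a set of items per transaction attribute.
record = {"X": {"x1"}, "Y": {"y1", "y2"}}
z_value = merge_transaction_attributes(record, ["X", "Y"])

print(z_domain)
print(sorted(z_value))
```

    Qualifying each item with its attribute name keeps values from different attributes distinct even when they share a label, which is what allows the merged attribute Z to stand in for X and Y.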

Author information

Corresponding author

Correspondence to Giorgos Poulis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Poulis, G., Gkoulalas-Divanis, A., Loukides, G., Skiadopoulos, S., Tryfonopoulos, C. (2015). SECRETA: A Tool for Anonymizing Relational, Transaction and RT-Datasets. In: Gkoulalas-Divanis, A., Loukides, G. (eds) Medical Data Privacy Handbook. Springer, Cham. https://doi.org/10.1007/978-3-319-23633-9_5

  • DOI: https://doi.org/10.1007/978-3-319-23633-9_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23632-2

  • Online ISBN: 978-3-319-23633-9

  • eBook Packages: Computer Science (R0)
