Privacy and Policy in Polystores: A Data Management Research Agenda

  • Joshua A. Kroll
  • Nitin Kohli
  • Paul Laskowski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11721)


Modern data-driven technologies provide new capabilities for working with data across diverse storage architectures and analyzing it in unified frameworks to yield powerful insights. These new analysis capabilities, which rely on correlating data across sources and types and on exploiting statistical structure, have challenged classical approaches to privacy and prompted a radical rethinking of what the term means. In the area of federated database technologies, there is growing recognition that new systems such as polystores must incorporate the mitigation of privacy risks into the design process.
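To make the design challenge concrete, one standard mitigation from the privacy literature is differential privacy: an aggregate query over data drawn from several backends is answered with calibrated noise rather than exactly. The sketch below is illustrative only and is not taken from the paper; the store names, predicate, and parameters are hypothetical, and it shows the classic Laplace mechanism applied to a count query, whose sensitivity is 1.

```python
import math
import random

def dp_count(values, predicate, epsilon, rng):
    """Differentially private count: the true count plus Laplace(1/epsilon) noise.

    A count query has sensitivity 1 (adding or removing one record changes
    the result by at most 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for v in values if predicate(v))
    # Sample Laplace(0, 1/epsilon) via the inverse CDF, u in (-0.5, 0.5).
    u = rng.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

# Hypothetical federated setting: records from two backends are pooled,
# and only the noisy aggregate is released.
rng = random.Random(0)
store_a = [25, 31, 47, 52]   # e.g. ages held in one backend
store_b = [19, 44, 60]       # ages held in another backend
noisy = dp_count(store_a + store_b, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(round(noisy, 2))
```

In a real polystore the noise would have to be added by a trusted mediator (or split across stores, as in distributed noise generation), and a privacy budget would need to be tracked across queries; this sketch elides both issues.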



The work of authors Kroll and Kohli was supported in part by the National Security Agency (NSA). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSA. Kroll was also supported by the Berkeley Center for Law and Technology at the University of California, Berkeley Law School. Author Laskowski was supported by the Center for Long Term Cybersecurity at the University of California, Berkeley.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. U.C. Berkeley School of Information, Berkeley, USA
