Securing Data Analytics on SGX with Randomization

  • Swarup Chandra
  • Vishal Karande
  • Zhiqiang Lin
  • Latifur Khan
  • Murat Kantarcioglu
  • Bhavani Thuraisingham
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10492)

Abstract

Protection of data privacy and prevention of unwarranted information disclosure is an enduring challenge in cloud computing when data analytics is performed on an untrusted third-party resource. Recent advances in trusted processor technology, such as Intel SGX, have rejuvenated efforts to perform data analytics on a shared platform where data security and the trustworthiness of computations are ensured by the hardware. However, a powerful adversary may still be able to infer private information in this setting from side channels such as cache access, CPU usage, and other timing channels, thereby threatening data and user privacy. Though studies have proposed techniques to hide such information leaks through carefully designed data-independent access paths, these techniques can be prohibitively slow on models with a large number of parameters, especially when employed in a real-time analytics application. In this paper, we introduce a defense strategy that achieves higher computational efficiency with a small trade-off in privacy protection. In particular, we study a strategy that adds noise to the traces of memory access observed by an adversary, through the use of dummy data instances. We quantitatively measure the privacy guarantee, and empirically demonstrate the effectiveness and limitations of this randomization strategy using classification and clustering algorithms. Our results show a significant reduction in execution time overhead on real-world data sets, compared to a defense strategy using only data-oblivious mechanisms.
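The core idea of the randomization strategy described above can be illustrated with a minimal sketch: dummy instances are interleaved into the stream of real data before processing, so that an adversary observing the memory-access trace of the analytics code sees accesses for both real and dummy instances and cannot tell which is which. The function names, the `noise_ratio` parameter, and the dummy-generation callback below are illustrative assumptions, not identifiers from the paper.

```python
import random

def randomized_stream(real_instances, make_dummy, noise_ratio=0.5, seed=None):
    """Interleave dummy instances into a stream of real ones.

    Only the trusted caller (inside the enclave) retains the is_real
    flags; the adversary observes processing of the shuffled mixture.
    """
    rng = random.Random(seed)
    stream = [(x, True) for x in real_instances]
    n_dummies = int(len(real_instances) * noise_ratio)
    stream += [(make_dummy(), False) for _ in range(n_dummies)]
    rng.shuffle(stream)
    return stream

def classify_with_noise(model, real_instances, make_dummy, noise_ratio=0.5):
    """Run a model over the noisy stream, discarding dummy results
    inside the enclave before anything is returned to the host."""
    results = []
    for x, is_real in randomized_stream(real_instances, make_dummy,
                                        noise_ratio, seed=42):
        y = model(x)       # adversary observes this access pattern
        if is_real:        # the flag never leaves the enclave
            results.append(y)
    return results
```

The trade-off the paper quantifies follows directly from `noise_ratio`: more dummies means more obfuscation of the access trace but proportionally more computation, in contrast to fully data-oblivious execution, which pays a fixed (often much larger) cost per instance.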

Keywords

Data privacy · Analytics · Intel SGX · Randomization

Acknowledgments

This research was supported in part by NSF awards CNS-1564112 and CNS-1629951, AFOSR award FA9550-14-1-0173, and NSA award H98230-15-1-0271. Any opinions, findings, conclusions, or recommendations expressed are those of the authors and not necessarily of the funding agencies.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Swarup Chandra¹
  • Vishal Karande¹
  • Zhiqiang Lin¹
  • Latifur Khan¹
  • Murat Kantarcioglu¹
  • Bhavani Thuraisingham¹

  1. University of Texas at Dallas, Richardson, USA
