Privacy-preserving big data analytics for cyber-physical systems

  • Marwa Keshk
  • Nour Moustafa
  • Elena Sitnikova
  • Benjamin Turnbull


Cyber-physical systems (CPS) generate big data by combining physical and digital entities, but CPS privacy preservation, that is, protecting sensitive CPS information from unauthorized access, still demands further research. Data mining, perturbation, transformation and encryption are widely used to protect private information from disclosure whilst still providing insight, but they limit how much high-level analysis the protected data can support. This paper studies the role of big data component analysis in protecting sensitive information from unauthorized access. The independent component analysis (ICA) technique is applied to transform raw CPS information into a new representation whilst preserving its data utility. The mechanism is evaluated using the power CPS dataset, and the results reveal that the technique is more effective than four other privacy-preservation techniques, obtaining a higher level of privacy protection. In addition, data utility is tested using three machine learning algorithms, estimating their ability to identify normal and attack patterns before and after transformation.
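To make the transformation idea concrete, the following is a minimal sketch of ICA on two features, written in plain Python. It is not the authors' implementation: it assumes only two features, uses synthetic uniform signals in place of the power CPS dataset, and finds the components by whitening the data and then searching for the rotation that maximizes absolute excess kurtosis (a simple stand-in for a full ICA solver). All function names, the mixing weights and the sample size are hypothetical choices made for illustration.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every record."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [tuple(v - m for v, m in zip(row, means)) for row in rows]


def whiten_2d(rows):
    """Rotate and rescale centered 2-D data so its covariance is the identity."""
    n = len(rows)
    a = sum(x * x for x, _ in rows) / n
    b = sum(x * y for x, y in rows) / n
    d = sum(y * y for _, y in rows) / n
    # Closed-form principal axis angle of the 2x2 covariance [[a, b], [b, d]].
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    var1 = a * c * c + 2 * b * c * s + d * s * s
    var2 = a * s * s - 2 * b * c * s + d * c * c
    return [((c * x + s * y) / math.sqrt(var1),
             (-s * x + c * y) / math.sqrt(var2)) for x, y in rows]


def excess_kurtosis(values):
    """Excess kurtosis of a zero-mean, unit-variance sample."""
    return sum(v ** 4 for v in values) / len(values) - 3.0


def ica_2d(rows, steps=180):
    """Return two estimated independent components for 2-feature records."""
    z = whiten_2d(center(rows))
    best_score, best = -1.0, None
    # A rotation in [0, pi/2) is enough: larger angles only swap or flip components.
    for k in range(steps):
        phi = (math.pi / 2) * k / steps
        c, s = math.cos(phi), math.sin(phi)
        y1 = [c * u + s * v for u, v in z]
        y2 = [-s * u + c * v for u, v in z]
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if score > best_score:
            best_score, best = score, list(zip(y1, y2))
    return best


def corr(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic stand-in for raw readings: two hidden independent signals,
# observed only as linear blends (hypothetical mixing weights).
random.seed(0)
sources = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2000)]
raw = [(0.8 * s1 + 0.6 * s2, 0.3 * s1 + 1.0 * s2) for s1, s2 in sources]

# The transformed records would be released in place of the raw readings.
transformed = ica_2d(raw)
```

In this sketch the released records no longer carry the original value ranges, since each component is centered and scaled to unit variance, yet the structure of the data is kept, which is the property a downstream classifier would rely on. Whether that is sufficient protection for a real CPS dataset depends on the threat model and is not something this toy example establishes.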


Keywords: Privacy preservation; Big data; Independent component analysis; SCADA; CPS; Power system



We would like to thank the School of Engineering and Information Technology (SEIT) at UNSW@ADFA for sponsoring this work under the Cyber Physical Security project-PS47084.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Marwa Keshk (1)
  • Nour Moustafa (1)
  • Elena Sitnikova (1)
  • Benjamin Turnbull (1)
  1. University of New South Wales-Canberra, Canberra, Australia
