
Natural Language Processing and Deep Learning Based Techniques for Evaluation of Companies’ Privacy Policies

  • Conference paper
Computational Science and Its Applications – ICCSA 2022 Workshops (ICCSA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13377)


Abstract

Companies’ websites are vulnerable to privacy attacks that can compromise the confidentiality of data, which can be detrimental and unethical, particularly in sensitive use cases involving personal data, financial transaction details, or medical diagnoses. Companies’ noncompliance with the privacy policy requirements stipulated by the various data protection regulations has raised a lot of concern among users and other practitioners. To address this issue, previous research developed a model using conventional algorithms such as Neural Network (NN), Logistic Regression (LR) and Support Vector Machine (SVM) to evaluate companies’ levels of compliance with general data protection regulations. However, that model’s performance was unsatisfactory: across the selected core requirements of the legislation, it attained F1-scores of between 0.52 and 0.71. This paper improves on that performance by using Natural Language Processing (NLP) and Deep Learning (DL) techniques, training the proposed models on the same dataset used in the previous research. The overall results show that LSTM outperforms both the GRU and CNN models in terms of F1-score and accuracy. This research is intended to assist the Supervisory Authority and other practitioners in better determining how well companies’ privacy policies comply with the relevant data protection regulations.
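
For readers unfamiliar with the two metrics the abstract uses to compare models, the following is a minimal sketch of how F1-score and accuracy can be computed for a single binary requirement (policy satisfies it or does not). The function name `f1_and_accuracy` and the example labels are illustrative assumptions, not the authors' code or data.

```python
def f1_and_accuracy(y_true, y_pred):
    """Return (F1-score, accuracy) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return f1, accuracy


# Hypothetical labels for five policy segments (1 = requirement addressed).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
f1, acc = f1_and_accuracy(y_true, y_pred)
```

F1 is the harmonic mean of precision and recall, so it penalises a model that does well on only one of the two, which is why it is often reported alongside accuracy.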



Author information

Corresponding author

Correspondence to Saka John.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

John, S., Ajayi, B.A., Marafa, S.M. (2022). Natural Language Processing and Deep Learning Based Techniques for Evaluation of Companies’ Privacy Policies. In: Gervasi, O., Murgante, B., Misra, S., Rocha, A.M.A.C., Garau, C. (eds) Computational Science and Its Applications – ICCSA 2022 Workshops. ICCSA 2022. Lecture Notes in Computer Science, vol 13377. Springer, Cham. https://doi.org/10.1007/978-3-031-10536-4_2


  • DOI: https://doi.org/10.1007/978-3-031-10536-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10535-7

  • Online ISBN: 978-3-031-10536-4

  • eBook Packages: Computer Science, Computer Science (R0)
