Science and Engineering Ethics, Volume 21, Issue 4, pp 941–966

Data Mining and Privacy of Social Network Sites’ Users: Implications of the Data Mining Problem

  • Yeslam Al-Saggaf
  • Md Zahidul Islam
Original Paper

Abstract

This paper explores how malicious data miners could use data mining to threaten the privacy of users of social network sites (SNS). It applies a data mining algorithm to a real dataset to provide empirical evidence of how easily characteristics of SNS users can be discovered and used in ways that could invade their privacy. One major contribution of this article is the application of the decision forest data mining algorithm (SysFor) to the context of SNS. SysFor builds not a single decision tree but a forest of trees, which allows more logic rules to be explored from a dataset. One logic rule that SysFor built in this study, for example, revealed that anyone whose profile picture shows just the face or shows a family is less likely to be lonely. Another contribution of this article is the discussion of the implications of the data mining problem for governments, businesses, developers and the SNS users themselves.
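The idea that a forest exposes more logic rules than a single tree can be illustrated with a minimal sketch. It is not the SysFor algorithm and does not use the study's data: the two trees below are hand-written, and the attribute `relationship_status_shown`, the value names, and the leaf labels are hypothetical, chosen only to echo the profile-picture rule mentioned above. Each root-to-leaf path in a tree is read off as one IF-THEN rule, and the rules of all trees are pooled.

```python
from typing import Union

# A tree is either a leaf label (str) or a dict of the form
# {"attribute": name, "branches": {value: subtree}}.
Tree = Union[str, dict]


def extract_rules(tree: Tree, conditions: tuple = ()) -> list:
    """Return every root-to-leaf path as a (conditions, leaf_label) pair."""
    if isinstance(tree, str):
        return [(list(conditions), tree)]
    rules = []
    for value, subtree in tree["branches"].items():
        step = (tree["attribute"], value)
        rules.extend(extract_rules(subtree, conditions + (step,)))
    return rules


def format_rule(rule) -> str:
    """Render one rule as a readable IF ... THEN ... string."""
    conditions, label = rule
    clause = " AND ".join(f"{attr} = {val}" for attr, val in conditions)
    return f"IF {clause} THEN {label}"


# Hypothetical hand-written trees: NOT SysFor output, NOT the study's data.
tree_a = {
    "attribute": "profile_picture",
    "branches": {
        "face_only": "less_lonely",
        "family": "less_lonely",
        "other": "unknown",
    },
}
tree_b = {
    "attribute": "relationship_status_shown",  # hypothetical attribute
    "branches": {
        "yes": "unknown",
        "no": {
            "attribute": "profile_picture",
            "branches": {"family": "less_lonely", "other": "unknown"},
        },
    },
}

# Pooling the rules of every tree gives more rules than any single tree.
forest = [tree_a, tree_b]
forest_rules = [rule for tree in forest for rule in extract_rules(tree)]

for rule in forest_rules:
    print(format_rule(rule))
```

In this toy setting each tree contributes three rules, so the forest yields six. How SysFor actually selects attributes and constructs its trees is described in the authors' cited work and is not reproduced here.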

Keywords

Data mining · Social network sites (SNS) · Privacy · Content analysis · Logic rules

Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. School of Computing and Mathematics, Charles Sturt University, Wagga Wagga, Australia
  2. Centre for Research in Complex Systems, School of Computing and Mathematics, Charles Sturt University, Bathurst, Australia
