Data Mining and Privacy of Social Network Sites’ Users: Implications of the Data Mining Problem
This paper explores how data mining could be used by malicious data miners to threaten the privacy of social network site (SNS) users. It applies a data mining algorithm to a real dataset to provide empirical evidence of the ease with which characteristics of SNS users can be discovered and used in ways that could invade their privacy. One major contribution of this article is the application of a decision forest data mining algorithm (SysFor) to the SNS context. Rather than building a single decision tree, SysFor builds a forest of trees, which allows more logic rules to be extracted from a dataset. One logic rule that SysFor built in this study, for example, revealed that users whose profile picture shows just their face, or shows a family, are less likely to be lonely. Another contribution of this article is its discussion of the implications of the data mining problem for governments, businesses, developers and SNS users themselves.
Keywords: Data mining; Social network sites (SNS); Privacy; Content analysis; Logic rules