Abstract
Social media users often experience censorship through a variety of means, such as the restriction of search terms or the active and retroactive deletion of messages. In this paper we examine the feasibility of automatically detecting censorship of microblogs. We use a network growing model to simulate discussion over a microblog follow network and compare two censorship strategies that simulate varying levels of message deletion. We extract topological features from the resulting graphs and train a classifier to detect whether a given communication graph has been censored. The results show that censorship detection is feasible under empirically measured levels of message deletion. The proposed framework can enable automated censorship measurement and tracking which, when combined with aggregated citizen reports of censorship, can allow users to make informed decisions about their online communication habits.
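The pipeline described in the abstract (simulate, censor, extract features, classify) can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the chapter's method: preferential attachment stands in for the unspecified network growing model, the "random" and "targeted" deletion strategies are hypothetical, graph edges stand in for messages, and a nearest-centroid rule replaces whatever classifier the chapter trains. All function names and parameter values are invented for this sketch.

```python
import random
from statistics import mean

def grow_graph(n, m, rng):
    """ASSUMPTION: preferential attachment as a stand-in growing model."""
    adj = {i: set() for i in range(n)}
    for i in range(m + 1):              # seed: clique on m + 1 nodes
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
    pool = [v for v in range(m + 1) for _ in range(m)]  # degree-weighted pool
    for v in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:          # m distinct, degree-biased targets
            chosen.add(rng.choice(pool))
        for u in chosen:
            adj[v].add(u)
            adj[u].add(v)
            pool.extend([u, v])
    return adj

def edges_of(adj):
    return [(u, v) for u in adj for v in adj[u] if u < v]

def censor(adj, fraction, strategy, rng):
    """Delete a fraction of edges (hypothetical stand-in for message deletion)."""
    edges = edges_of(adj)
    k = int(fraction * len(edges))
    if strategy == "random":
        doomed = rng.sample(edges, k)
    else:  # "targeted": edges touching the highest-degree users go first
        doomed = sorted(edges, key=lambda e: -(len(adj[e[0]]) + len(adj[e[1]])))[:k]
    out = {u: set(vs) for u, vs in adj.items()}
    for u, v in doomed:
        out[u].discard(v)
        out[v].discard(u)
    return out

def largest_component_fraction(adj):
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adj)

def features(adj):
    """Three simple topological features: mean degree, mean local clustering,
    and the share of nodes in the largest connected component."""
    clustering = []
    for u, vs in adj.items():
        k = len(vs)
        links = sum(1 for a in vs for b in vs if a < b and b in adj[a])
        clustering.append(2 * links / (k * (k - 1)) if k > 1 else 0.0)
    return (mean(len(vs) for vs in adj.values()),
            mean(clustering),
            largest_component_fraction(adj))

def make_dataset(n_graphs, rng, fraction=0.2):
    data = []
    for i in range(n_graphs):
        g = grow_graph(200, 3, rng)
        data.append((features(g), 0))                   # label 0: uncensored
        strategy = "random" if i % 2 == 0 else "targeted"
        data.append((features(censor(g, fraction, strategy, rng)), 1))
    return data

def predict(x, centroids):
    """Nearest-centroid rule (a simple stand-in classifier)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

rng = random.Random(0)
train, test = make_dataset(10, rng), make_dataset(5, rng)
centroids = {c: tuple(mean(col) for col in zip(*[x for x, y in train if y == c]))
             for c in (0, 1)}
accuracy = sum(predict(x, centroids) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setting every graph has the same number of nodes and edges before deletion, so mean degree alone separates the two classes. The sketch shows the shape of the pipeline only and says nothing about how hard detection is on real data.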
Notes
- 1. If actors A and B are connected and B and C are connected, there is a high probability that actors A and C are also connected.
- 2. Diameter and radius are calculated on the largest connected component.
- 3.
- 4. We omit plots for the Laplacian eigenvalues due to space considerations.
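Notes 1 and 2 can be made concrete with a small, self-contained sketch. It assumes an undirected graph stored as a dictionary of neighbour sets, measures the tendency in note 1 as global transitivity (the share of connected triples A-B-C that are closed by an A-C edge, one common choice that the notes do not name), and computes diameter and radius over the largest connected component as note 2 states. The function names and the example graph are illustrative only.

```python
from collections import deque

def transitivity(adj):
    """Share of connected triples (A-B and B-C) closed by an A-C edge."""
    closed = triples = 0
    for b, nbrs in adj.items():
        ns = sorted(nbrs)
        for i, a in enumerate(ns):
            for c in ns[i + 1:]:
                triples += 1
                closed += c in adj[a]
    return closed / triples if triples else 0.0

def largest_component(adj):
    """Node set of the largest connected component, found by BFS."""
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def eccentricity(adj, source):
    """Largest shortest-path distance from source within its component."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter_and_radius(adj):
    """Diameter and radius restricted to the largest connected component."""
    ecc = [eccentricity(adj, s) for s in largest_component(adj)]
    return max(ecc), min(ecc)

# Example: a path 0-1-2-3, a triangle 1-2-4, and a separate pair 5-6.
edges = [(0, 1), (1, 2), (2, 3), (1, 4), (2, 4), (5, 6)]
adj = {i: set() for i in range(7)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(diameter_and_radius(adj))  # (3, 2): the pair 5-6 is ignored
print(transitivity(adj))         # 3 closed triples out of 7
```

Restricting to the largest component keeps both measures finite when deletions disconnect the graph, since distances between separate components are undefined.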
Acknowledgments
This work was funded by the European Union (EU) and Science Foundation Ireland (SFI) in the course of the projects ROBUST (EU grant no. 257859), CLIQUE Strategic Research Cluster (SFI grant no. 08/SRC/I1407), and LION-2 (SFI grant no. SFI/08/CE/I1380).
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Morrison, D. (2014). Toward Automatic Censorship Detection in Microblogs. In: Peng, WC., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8643. Springer, Cham. https://doi.org/10.1007/978-3-319-13186-3_51
DOI: https://doi.org/10.1007/978-3-319-13186-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13185-6
Online ISBN: 978-3-319-13186-3
eBook Packages: Computer Science, Computer Science (R0)