Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost?

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7489)


Social media content has grown exponentially in recent years, and the role of social media has evolved from narrating life events to actively shaping them. In this paper we explore how many resources shared on social media are still available on the live web or in public web archives. By analyzing six event-centric datasets of resources shared on social media between June 2009 and March 2012, we found that about 11% were lost and 20% archived after just one year, and that on average 27% were lost and 41% archived after two and a half years. Furthermore, we found a nearly linear relationship between the time of sharing and the percentage lost, and a slightly less linear relationship between the time of sharing and archiving coverage. From this model we conclude that nearly 11% of shared resources will be lost within the first year after publishing, and that we will continue to lose 0.02% per day after that.
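
The loss figures quoted in the abstract can be turned into a rough back-of-the-envelope estimator. The sketch below is not the authors' fitted model. It only encodes the two numbers stated above (about 11% lost after the first year, then 0.02% per day). The function name `estimated_percent_lost`, the 365-day year, and the linear ramp within the first year are assumptions made for illustration.

```python
FIRST_YEAR_DAYS = 365          # assumption: one year taken as 365 days
FIRST_YEAR_LOSS_PCT = 11.0     # from the abstract: ~11% lost after one year
DAILY_LOSS_PCT_AFTER = 0.02    # from the abstract: 0.02% lost per day afterwards


def estimated_percent_lost(days_since_shared: float) -> float:
    """Rough estimate of the percentage of shared resources lost.

    Hypothetical helper: the first-year linear ramp is an assumption,
    since the abstract gives only the end-of-year figure.
    """
    if days_since_shared < 0:
        raise ValueError("days_since_shared must be non-negative")
    if days_since_shared <= FIRST_YEAR_DAYS:
        # Assumed linear growth from 0% to 11% over the first year.
        return FIRST_YEAR_LOSS_PCT * days_since_shared / FIRST_YEAR_DAYS
    extra_days = days_since_shared - FIRST_YEAR_DAYS
    # After the first year, add 0.02 percentage points per day, capped at 100%.
    return min(100.0, FIRST_YEAR_LOSS_PCT + DAILY_LOSS_PCT_AFTER * extra_days)


if __name__ == "__main__":
    for days in (365, 730):
        print(days, round(estimated_percent_lost(days), 1))
```

Under these assumptions, a resource shared two years ago comes out at roughly 18% lost. The estimate is only as good as the two quoted rates and should not be read as a prediction for any individual dataset.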


Keywords: Web Archiving, Social Media, Digital Preservation





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Department of Computer Science, Old Dominion University, Norfolk, USA
