International Journal on Digital Libraries, Volume 16, Issue 3–4, pp 183–205

When should I make preservation copies of myself?

And after I do, how will I send messages to my copies?


We investigate how different replication policies, ranging from least aggressive to most aggressive, affect the level of preservation achieved by autonomic processes used by web objects (WOs). Based on simulations of small-world graphs of WOs created by the Unsupervised Small-World algorithm, we report quantitative and qualitative results for graphs ranging in order from 10 to 5000 WOs. Our results show that a moderately aggressive replication policy makes the best use of distributed host resources: it meets preservation goals without causing spikes in either CPU usage or network activity. We examine different approaches by which WOs can communicate with each other and determine how long it would take for a message from one WO to reach a specific WO, or all WOs.
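The message-delivery question in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: it assumes a Watts-Strogatz-style rewired ring as a stand-in for a graph built by the Unsupervised Small-World algorithm, and it assumes messages spread by flooding, one hop per time step. The function names `small_world_graph` and `hops_from` are hypothetical.

```python
import random
from collections import deque


def small_world_graph(n, k, p, seed=0):
    """Build a Watts-Strogatz-style graph as an adjacency dict.

    Assumption: this is a stand-in for illustration only, NOT the
    Unsupervised Small-World algorithm used in the paper.
    n: number of WOs, k: even number of ring neighbours, p: rewiring probability.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Start from a ring lattice: each node links to its k nearest neighbours.
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    # Rewire each lattice edge with probability p to a random non-neighbour.
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


def hops_from(adj, source):
    """Breadth-first search: hop count from source to every reachable WO."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


if __name__ == "__main__":
    graph = small_world_graph(n=100, k=4, p=0.1)
    dist = hops_from(graph, 0)
    # Rewiring can, in rare cases, disconnect a node, hence .get().
    print("hops to reach WO 50:", dist.get(50))
    print("hops to reach every reachable WO:", max(dist.values()))
```

Under the flooding assumption, the hop count to one target is the time for a message to reach a specific WO, and the largest hop count is the time to reach all WOs. Other communication approaches, such as random walks, would give different figures.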


Keywords: Web object, Small-world, Preservation, Crowd sourcing



This work was supported in part by the National Science Foundation (NSF), Project 370161.



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Computer Science Department, Old Dominion University, Norfolk, USA
