Extending OAI-PMH over structured P2P networks for digital preservation

  • Everton F. R. Seára (email author)
  • Marcos S. Sunye
  • Luis C. E. Bona
  • Tiago Vignatti
  • Andre L. Vignatti
  • Anne Doucet
Article

Abstract

The Open Archives Initiative (OAI) allows libraries and museums to create and share their own low-cost digital libraries (DLs). OAI DLs are based on the OAI-PMH protocol which, although consolidated as a standard for disseminating metadata, addresses neither digital preservation nor availability of content, both essential requirements in this type of system. Building new mechanisms that guarantee such improvements at little or no additional cost is a considerable challenge. This article proposes a distributed archiving system based on a P2P network that allows OAI-based libraries to replicate digital objects to ensure their reliability and availability. The proposed system keeps and extends the characteristics of the current OAI-PMH protocol and is designed as a set of OAI repositories, where each repository has an independent failure probability assigned to it. Items are inserted with a required reliability, which is satisfied by replicating them in subsets of repositories. Communication between the nodes (repositories) of the network is organized in a distributed hash table, and multiple hash functions are used to select the repositories that keep the replicas of each stored item. The OAI characteristics combined with a structured P2P digital preservation system allow the construction of a reliable and fully distributed digital library. The archiving system has been evaluated through experiments in a real environment, and the OAI-PMH extension has been validated by the implementation of a proof-of-principle prototype.
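
The replica selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the salted SHA-1 hash family, the ring layout, the function names and the stopping rule are all assumptions made for illustration. The only ideas taken from the abstract are that each repository fails independently, that an item must reach a required reliability, and that several hash functions pick the repositories holding its replicas. Under independence, an item is lost only if every replica holder fails, so its reliability is one minus the product of the holders' failure probabilities.

```python
import hashlib
from bisect import bisect_left


def ring_position(key: str, salt: int) -> int:
    """Hash `key` with the salt-th hash function (salted SHA-1, an assumption)."""
    digest = hashlib.sha1(f"{salt}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def select_repositories(item_id, fail_prob, target, max_hashes=64):
    """Pick replica holders until the item's reliability reaches `target`.

    `fail_prob` maps a (hypothetical) repository name to its independent
    failure probability. Returns (chosen repositories, achieved reliability).
    """
    if not fail_prob:
        raise ValueError("at least one repository is required")

    # Place each repository on the ring at the hash of its own name.
    ring = sorted((ring_position(name, 0), name) for name in fail_prob)
    positions = [pos for pos, _ in ring]

    chosen = []
    loss = 1.0  # probability that every replica chosen so far is lost
    for salt in range(1, max_hashes + 1):
        if 1.0 - loss >= target:
            break
        # The repository responsible for a key is its successor on the ring.
        idx = bisect_left(positions, ring_position(item_id, salt)) % len(ring)
        name = ring[idx][1]
        if name not in chosen:  # two hash functions may map to the same node
            chosen.append(name)
            loss *= fail_prob[name]
    return chosen, 1.0 - loss


if __name__ == "__main__":
    # Hypothetical repositories and failure probabilities.
    repos = {"repo-a": 0.10, "repo-b": 0.20, "repo-c": 0.05, "repo-d": 0.30}
    holders, reliability = select_repositories("oai:example:1234", repos, 0.99)
    print(holders, round(reliability, 4))
```

Because the holders are derived from the item identifier alone, any node can recompute where the replicas of an item live without a central index. The sketch may return a reliability below the target when the repositories are too few or too unreliable, so a caller would need to check the second return value.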

Keywords

Digital library · Long-term preservation · Digital archiving · Peer-to-peer

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Everton F. R. Seára (1, email author)
  • Marcos S. Sunye (1)
  • Luis C. E. Bona (1)
  • Tiago Vignatti (1)
  • Andre L. Vignatti (2)
  • Anne Doucet (3)

  1. Department of Informatics, Federal University of Paraná, Curitiba, Brazil
  2. Institute of Computing, University of Campinas, Campinas, Brazil
  3. Département DAPA, Université PARIS VI, LIP6, Paris, France
