Abstract
The Open Archives Initiative (OAI) allows libraries and museums to create and share their own low-cost digital libraries (DLs). OAI DLs are based on the OAI-PMH protocol which, although consolidated as a standard for disseminating metadata, does not address digital preservation or the availability of content, both essential requirements in this type of system. Building new mechanisms that provide these guarantees at little or no additional cost is a major challenge. This article proposes a distributed archiving system based on a P2P network that allows OAI-based libraries to replicate digital objects to ensure their reliability and availability. The proposed system keeps and extends the current OAI-PMH protocol characteristics and is designed as a set of OAI repositories, each with an independent failure probability assigned to it. Items are inserted with a required reliability, which is satisfied by replicating them in subsets of repositories. Communication between the nodes (repositories) of the network is organized in a distributed hash table, and multiple hash functions are used to select the repositories that keep the replicas of each stored item. The OAI characteristics combined with a structured P2P digital preservation system allow the construction of a reliable and fully distributed digital library. The archiving system has been evaluated through experiments in a real environment, and the OAI-PMH extension has been validated by the implementation of a proof-of-principle prototype.
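The replication idea in the abstract can be illustrated with a minimal sketch. It is not the article's algorithm, which the abstract does not specify. It assumes an item is lost only if every repository holding a replica fails, so that with independent failure probabilities p_i the reliability of a replica set is 1 minus the product of the p_i. The use of SHA-1, the salting scheme that stands in for "multiple hash functions", the clockwise walk on collisions, and all function and repository names are assumptions made for illustration.

```python
import hashlib
from bisect import bisect_left


def ring_position(key: str, salt: int, bits: int = 32) -> int:
    """Map (salt, key) to a point on a 2**bits identifier ring.

    Varying the salt plays the role of a family of hash functions (assumption).
    """
    digest = hashlib.sha1(f"{salt}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def select_replicas(item_id: str, repos: dict, target: float):
    """Pick repositories for item_id until the target reliability is met.

    repos maps a repository name to its independent failure probability.
    Returns (chosen repository names, achieved reliability).
    """
    # Place every repository on the ring; salt 0 is reserved for node ids.
    ring = sorted((ring_position(name, 0), name) for name in repos)
    positions = [pos for pos, _ in ring]

    chosen = []
    loss = 1.0  # probability that all chosen replicas fail together
    k = 0
    while 1.0 - loss < target and len(chosen) < len(ring):
        k += 1
        # The k-th hash of the item picks the first repository at or after it.
        idx = bisect_left(positions, ring_position(item_id, k)) % len(ring)
        # On a collision, walk clockwise to the next unused repository.
        while ring[idx][1] in chosen:
            idx = (idx + 1) % len(ring)
        name = ring[idx][1]
        chosen.append(name)
        loss *= repos[name]
    return chosen, 1.0 - loss


if __name__ == "__main__":
    # Hypothetical repositories, each failing independently with probability 0.1.
    repositories = {f"repo-{i}": 0.1 for i in range(5)}
    names, reliability = select_replicas("oai:example:item-1", repositories, 0.995)
    print(names, reliability)
```

With five repositories that each fail with probability 0.1, a target of 0.995 needs three replicas: two give a reliability of about 0.99 and three give about 0.999. If the target cannot be reached, the sketch simply returns every repository together with the best reliability it could achieve.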
Cite this article
Seára, E.F.R., Sunye, M.S., Bona, L.C.E. et al. Extending OAI-PMH over structured P2P networks for digital preservation. Int J Digit Libr 12, 13–26 (2012). https://doi.org/10.1007/s00799-012-0080-5
DOI: https://doi.org/10.1007/s00799-012-0080-5