Thinking digital libraries for preservation as digital cultural heritage: by R to R4 facet of FAIR principles

Art. 2 of the EU Council conclusions of 21 May 2014 on cultural heritage as a strategic resource for a sustainable Europe (2014/C 183/08) states: "Cultural heritage consists of the resources inherited from the past in all forms and aspects - tangible, intangible and digital (born digital and digitized), including monuments, sites, landscapes, skills, practices, knowledge and expressions of human creativity, as well as collections conserved and managed by public and private bodies such as museums, libraries and archives". Starting from this assumption, we have to rethink the digital and digitization as social and cultural expressions of the contemporary age. We need to rethink digital libraries produced by digitization as cultural entities, and no longer as mere datasets for enhancing the fruition of cultural heritage, by defining clear and homogeneous criteria to validate and certify them as memory and sources of knowledge for future generations. By expanding the R (Re-usable) of the FAIR Guiding Principles for scientific data management and stewardship into R4 (Re-usable, Relevant, Reliable and Resilient), this paper proposes a more reflective approach to the creation of descriptive metadata for managing digital resources of cultural heritage, one that can guarantee their long-term preservation.


Introduction
The digital revolution has transformed the way knowledge is produced, transmitted and shared. The widespread diffusion of digital methods and techniques brings an unprecedented democratization of knowledge and culture, making citizens leading actors in the sustainable development of the new smart societies based on digitization, digital creation and digital design.
Art. 2 of the "EU Council Conclusions of 21 May 2014 on cultural heritage as a strategic resource for a sustainable Europe (2014/C 183/08)" states: "Cultural heritage consists of the resources inherited from the past in all forms and aspects - tangible, intangible and digital (born digital and digitized), including monuments, sites, landscapes, skills, practices, knowledge and expressions of human creativity, as well as collections conserved and managed by public and private bodies such as museums, libraries and archives. It originates from the interaction between people and places through time and it is constantly evolving. These resources are of great value to society from a cultural, environmental, social and economic point of view and thus their sustainable management constitutes a strategic choice for the 21st century" [12].
Starting from this conclusion, digitization and digital creation become social and cultural expressions of the contemporary age. Data are no longer mere instruments to simplify administrative management or to enhance the fruition of cultural heritage; they become digital artifacts representative of the new digital cultural heritage (DCH) [15,26].
For several years, authoritative scientific voices have highlighted that long-term digital preservation (LTDP) is an emergency to be faced worldwide [1, 6, 8-10, 13, 16-25]. In 2012 UNESCO dedicated to this issue the Vancouver conference entitled The Memory of the World in the Digital Age: Digitization and Preservation [5]. In 2015 Vinton Cerf raised the alarm about the risk that the twenty-first century will become the first black hole in human evolution since the establishment of intelligent communication [2,3,7].
The current processes used for indexing and archiving digital artifacts do not solve this problem, because digitization and digital creation are still activities strongly conditioned by the instrumental use of data.
The FAIR Guiding Principles for scientific data management and stewardship, published in 2016, intend "to provide guidelines to improve the findability, accessibility, interoperability, and reuse of digital assets" [11]. These guidelines refer generically to any digital object, to metadata and to infrastructures. Nevertheless, we can consider the principles a first step towards facing the problem of data management and of LTDP.
This paper discusses a possible expansion of the R (Re-usable) of the FAIR Principles into R4 (Re-usable, Relevant, Reliable and Resilient), with the goal of proposing a different approach to the creation of descriptive metadata, aimed at fostering the preservation of digital artifacts related to cultural heritage.

Some primary level issues
Art. 2 of the EU Conclusions raises several questions concerning the definition of the new DCH. Among them:
• Which digital entities do we identify as cultural resources?
• Which requirements make digital entities DCH?
• How many digital entities exist today among those produced since the start of the digital revolution?
• How many of the digital artifacts produced since the start of the digital revolution do we consider DCH?
• How, and by what features, do we recognize them? The processes? The outputs? Both?
• How soon do we consider digital entities as cultural heritage?
• What skills should a digital curator, or digital librarian, have in order to identify, manage and preserve DCH?
A truly critical issue is the identification of DCH within the formless digital magma in which we float today. We neither know it nor can we manage the myriad of digital entities that populate it. A clear classification that allows us to identify, validate and certify DCH is missing; as far as we can tell, none exists today.

A proposal for identification of digital cultural heritage
We need to identify digital cultural entities within the contemporary digital magma in order to answer some of the above questions.
The reliability of digital resources, with a focus on metadata and above all on descriptive metadata, is a first-level issue to be addressed in identifying DCH. In their current form, the FAIR Principles do not seem sufficient to guarantee their validation.
Our attention focuses on the different processes of digitization, use and re-use of digital data. We surveyed several metadata schemes used in international digitization projects to index and manage digital entities related to different cultural resources, among them Europeana, the World Digital Library and the Library of Congress. Almost all the data in these digital collections are scarcely reusable, and neither reliable, interoperable nor resilient. Their descriptive metadata have poor and generic contents.
We think that descriptive metadata are the most important source for distinguishing digital cultural entities from digital "consuming" data. If these metadata are well described during the digitization of cultural heritage, they can preserve the information about the life cycle of the entities and about their design, creation, fruition, reuse and transformation over time. This goal needs some rules.
A first proposal of rules is to expand the Re-usable of the FAIR Principles into R4 as follows:
• Re-usable: reusability guarantees the sustainability of digital entities, as different reuses of descriptive metadata over time foster their transformation into cultural sources and memory (one example above all: the Flavian Amphitheater, that is, the Colosseum);
• Relevant: the relevance of digital entities is connected to the transformations in the functions of descriptive metadata linked to their reuse over time; it is an indispensable requirement for these entities to evolve into memory and transform into cultural resources;
• Reliable: the reliability of digital entities is strictly linked to the capability of descriptive metadata to testify to their evolution by representing the validated and certified processes that characterized their life cycle;
• Resilient: resilience, that is, "the capacity of a system to adapt itself to the conditions of use and to resist wear in order to guarantee the availability of the services provided" (https://it.wikipedia.org/wiki/Resilienza), is the requirement to recover and reuse descriptive metadata over time, preserving the memory of their original function even as their functions change from practical to cultural.
Such an expansion, applied to the creation of descriptive metadata, could give digital entities the value of cultural heritage, as it makes them sustainable and permanent, mirroring what we consider tangible and intangible cultural heritage.
The analysis and design phases of the digitization process should already foresee the creation of descriptive metadata according to the R4 requirements, focusing on the LTDP needed to define both each digital entity and digital libraries as digital cultural resources.
We also think that the R4 requirements should determine the methodological and technological approaches, systems, information, metadata schemas, digital image content structures, data descriptions and complex data sets, as well as any further development and their sustainability. This approach is possible only if we recognize the whole process of digitization as DCH and clearly classify the digital artifacts produced by this process.
Concerning this last assumption, we propose the following possible classification of digital cultural entities:
• Born digital heritage: born digital entities whose contents, and in particular descriptive metadata, record the processes, methods and techniques used by contemporary communities for their creation, to be safeguarded, reused and preserved over time as a source of knowledge and historical memory;
• Digital FOR cultural heritage: processes, methods and techniques for digitizing tangible and intangible cultural heritage, which aim to create digital artifacts composed of images and descriptive metadata (digital libraries, virtual museums, demo-ethno-anthropological databases, etc.);
• Digital AS cultural heritage: digital artifacts produced by the digitization and dematerialization of tangible and intangible cultural heritage, whose contents, and in particular descriptive metadata, record the approaches, processes, methods and techniques representative of their life cycle, to be safeguarded, reused and preserved, enhancing them as a source of knowledge and historical memory.
On the basis of the above classification, we propose the following definition of DCH: Digital cultural heritage is the ecosystem of processes, entities and virtual phenomena, born digital and digitized, whose descriptive metadata are certified and validated as created according to the R4 requirements and represent their life cycle over time. They are thus testimonies, manifestations and expressions of the life cycle that identifies and connotes each community, socio-cultural context, and simple or complex ecosystem of the Digital Age, assuming the function of historical memory and source of knowledge.
However, starting from this definition we could consider a great many digital artifacts as DCH, and we know that any rating would be arbitrary. The matter therefore requires an urgent and scientifically reliable solution.
Facing this, we think that what differentiates the descriptive metadata of digital cultural entities from those of digital "consuming" artifacts is the correct proportion between:
• Quantity: the correct ratio among the exhaustiveness of information, the knowledge to be provided, and the number of metadata elements and attributes necessary to make them R4, with the goal of retrieving, reusing and preserving the digital resources;
• Quality: the correct ratio between the informative/cognitive level to be given both to each descriptor and to the set of descriptors representing the data and its life cycle, and the variable information and cognitive needs of users, whether contemporary or future.
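The two proportions can be made more tangible with a small computation. The following Python sketch is only an illustration under stated assumptions, not the evaluation method used in the project: the list EXPECTED_ELEMENTS, the function names, the word-count proxy for the informative level and the sample record are all hypothetical and introduced here for the example.

```python
from __future__ import annotations

# Hypothetical list of descriptive elements a record is expected to carry.
# These names are illustrative; they are not taken from METS-SAN or from R4.
EXPECTED_ELEMENTS = ["subject", "abstract", "creator", "date", "language", "rights"]


def quantity_ratio(record: dict[str, str]) -> float:
    """Share of expected descriptive elements that have non-empty content."""
    filled = sum(1 for name in EXPECTED_ELEMENTS if record.get(name, "").strip())
    return filled / len(EXPECTED_ELEMENTS)


def mean_words(record: dict[str, str]) -> float:
    """Crude proxy for the informative level: mean word count of filled elements."""
    texts = [value for value in record.values() if value.strip()]
    if not texts:
        return 0.0
    return sum(len(text.split()) for text in texts) / len(texts)


# Invented sample record, for illustration only.
record = {
    "subject": "Letter to the publisher",
    "abstract": "The author discusses the revision of a manuscript.",
    "creator": "",
    "language": "ita",
}

print(quantity_ratio(record))
print(mean_words(record))
```

A measure of this kind only counts what is present; whether the content meets the cognitive needs of contemporary or future users still requires the judgment of the curator.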
The digitization project described below served as a test of this assumption.

Descriptive metadata as sources of digital cultural heritage: the digitization project "Casa Editrice G. Laterza and Figli"
The case study for testing the above hypothesis was the metadata scheme we created for indexing and managing the digital artifacts produced by the digitization project "Archivio Storico della Casa G. Laterza and Figli", undertaken at the end of 2015 together with Regione Puglia and still ongoing. Part of the data is now published in the Puglia Digital Library, the multimedia digital library of Regione Puglia [4,14]. The scheme was created in accordance with the Italian METS-SAN standard, defined by the Italian National Archival System.
The goal of the project was the preservation of both the digitization process and the digital artifacts produced. We therefore focused on the storytelling content of the descriptive metadata concerning the whole project history, the original Archive (series, sub-series, etc.) and each digital artifact, also paying attention to the cognitive and informative needs of future users. We preferred "granular" indexing, describing each digital artifact with its own metadata scheme.
In designing the scheme we considered the tag sequence as an organic storytelling structure composed of formal entities (elements and attributes) and descriptive contents. These were created by hybridizing methods and techniques of archival description with cataloguing solutions and storytelling methodology, providing information on the whole project and on the detail of each section and, inside the sections, of each partition.
In the scheme, the <header> section, after the namespaces (<xmlns: ->), embeds the descriptive elements and attributes related to:
• the project: body responsible for the project, owner of the original Archive, editor of the digital resources;
• the history of the original Archive;
• the structure of the original Archive;
• the historical/biographical profile of the owner of the original Archive;
• the rights that regulate the use of the original documents.
The <desc> section is divided into two sub-sections:
1. context: it embeds the data of the entities involved in the ownership and management of the original documents;
2. description: it shows the extent of the sub-fonds to which the resource described in the <File> sub-section belongs.
The scheme of the above sections with their descriptive contents follows:
[Listing: scheme of the <header> and <desc> sections]
We focused particularly on the contents: information shall be exhaustive, clear and easily intelligible to users with different cultural backgrounds and levels of interest. Elements and their contents shall compose a well-balanced storytelling structure. The narrative is thus easy to use and reuse, as it allows information to be gathered both by users who want to know the contents of the original archive and by users who want to explore the digital project, the metadata structure and the choices that guided it.
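As a rough illustration of how the <header> and <desc> sections fit together, the following Python sketch builds an empty skeleton with the standard library. It is not the project's scheme: only the names header, desc, context and description come from the text, while the root element and the child names inside <header> are hypothetical placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical root element; the real scheme follows METS-SAN.
root = ET.Element("record")

# <header>: one child per topic listed in the text.
# The child element names below are placeholders, not actual tags of the scheme.
header = ET.SubElement(root, "header")
for name in ("project", "archiveHistory", "archiveStructure", "ownerProfile", "rights"):
    ET.SubElement(header, name).text = "..."  # narrative content goes here

# <desc>: the two sub-sections named in the text.
desc = ET.SubElement(root, "desc")
ET.SubElement(desc, "context").text = "..."      # entities owning and managing the originals
ET.SubElement(desc, "description").text = "..."  # extent of the sub-fonds

print(ET.tostring(root, encoding="unicode"))
```

Keeping each topic in its own element is what lets the narrative content be retrieved and reused piece by piece.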
The <file> sub-section, dedicated to the single document, describes:
• the original document represented in the image: subject, text abstract, creator, contributors, chronological date, topical date, support, language;
• the physical position of the original in the archive;
• the editor who creates the descriptions and the data of the creation.
Here follows the structure:
[Listing: structure of the <file> sub-section]
The <unittitle> element gives detailed information about the subject of the document, while the <abstract> element provides a short summary of the content, mixing archival description with narrative technique.
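The pairing of <unittitle> and <abstract> inside <file> can be sketched in Python as follows. This is a minimal illustration, assuming a flat structure: the helper build_file, the extra element names passed as keyword arguments and all sample values are hypothetical and do not reproduce the actual records of the project.

```python
import xml.etree.ElementTree as ET


def build_file(unittitle: str, abstract: str, **fields: str) -> ET.Element:
    """Build a <file> element with a subject, a short summary and optional extra fields."""
    file_el = ET.Element("file")
    ET.SubElement(file_el, "unittitle").text = unittitle  # subject of the document
    ET.SubElement(file_el, "abstract").text = abstract    # short narrative summary
    for name, value in fields.items():
        # Extra descriptors (element names here are illustrative placeholders).
        ET.SubElement(file_el, name).text = value
    return file_el


# Invented sample values, for illustration only.
sample = build_file(
    unittitle="Sample subject of the document",
    abstract="Sample short summary of the content.",
    creator="Sample creator",
    language="ita",
)

print(ET.tostring(sample, encoding="unicode"))
```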
The <right> sub-section follows, which describes:
• ownership of the digital artifact;
• accessibility and reuse of the digital artifact;
• ownership and accessibility of the original document.
Here follows the structure:
[Listing: structure of the <right> sub-section]
The scheme closes with the technical metadata describing the different image formats created for each digital artifact and its structural elements.

Conclusions
The EU identification of the digital and of digitization as cultural heritage, together with the tangible and the intangible, opens a new phase in the Digital Age. It definitively attributes the value of cultural heritage to digital artifacts. A systematization that helps us identify which digital entities can be considered DCH is necessary.
Our reflection starts by expanding the R of the FAIR Principles into R4, with the goal of creating digital entities whose descriptive metadata are Re-usable, Relevant, Reliable and Resilient. We think that these four requirements could both guarantee the sustainability and foster the reuse and preservation of digital resources over time, by addressing the correct proportion between the quantity and the quality of the contents of descriptive metadata.
These requirements are matched by a first proposal for the classification of DCH, which comprises born digital heritage, Digital FOR cultural heritage and Digital AS cultural heritage, with the aim of giving the digital entities we create by digitization the function of historical memory and of source of knowledge.
We tested our hypothesis by creating the metadata scheme for indexing the digital artifacts produced by the digitization project of "Archivio Storico Casa Editrice G. Laterza & Figli". The results are of some interest for fostering the discussion about the critical issue of the certification and validation of digital entities as the new Digital Cultural Heritage produced by the contemporary age, basing the evaluation on assured and shared rules or guidelines.
Funding Open access funding provided by Università degli Studi di Bari Aldo Moro within the CRUI-CARE Agreement.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.