Data Management

Data management includes the development and use of architectures, guidelines, practices and procedures for the accurate handling of data during the entire data lifecycle of an institutional unit or a research project. Data are defined as different information units such as numbers, alphabetic characters, and symbols that are specifically formatted and can be processed by computer. The data in the project are provided by various actors, which can be GeomInt partners, their legal representatives, employees, and external partners. GeomInt data are provided via the GeomInt data management portal (DMP). In the GeomInt project the partners work in very close cooperation, and project-owned and connected infrastructures are used synergetically (as illustrated in Fig. 5.1). In addition to the rock mechanics laboratories of the partners CAU, IfG and TUBAF, with partly unique equipment, data from ongoing experiments in the underground laboratories are available through synergies with other projects.

With regard to the use of the simulation platform OpenGeoSys, the development of which is coordinated by the UFZ and in whose further development the GeomInt partners BGR, CAU, IfG and TUBAF are involved, the simulation and development infrastructures located at the UFZ, including version management, are available to these partners.
The collaborative work requires data management structures and guidelines. Therefore, the first step was to set up a document that includes a user agreement and a data management plan which is the basis for data management in the project.

User Agreement and Data Management Plan
The GeomInt project partners agreed to set up a user agreement which includes specifications for data structures including metadata, data formats, access authorization for data, the possible publication of data, as well as the handling of the data after the end of the project and outside the project. A first version of this user agreement was created six months after the start of the project.
The user agreement includes guidelines and definitions for the following aspects:
- Which data will be generated in the project and have to be managed?
- How will data be provided and exchanged?
- What are the rights of use for the partners and for third parties?
- How are data to be cited?
- How is compliance with the user agreement supervised?
As part of the user agreement, a data management plan was developed: a formal document that describes how project data are managed during the research period and after completion of the project. The goal of a data management plan is to consider the aspects of data management (metadata creation, data preservation and analysis) before the start of the project. The following points are discussed in the GeomInt data management plan:
1. Generation and management methods (data infrastructure, external data, data integration, data formats, quality control, user groups, data processing stages, versioning, documentation and metadata, geocoding)
2. Legal data management
3. Data exchange and provision, citation rules
4. Short-term storage and data management (storage systems, data transfer, backup, security)
5. Long-term storage (characteristics, metadata and documentation, responsibility)
6. Resources (organizational roles and responsibilities for data management)
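The aspects listed above can also be kept as structured data alongside the written plan. The following sketch shows one possible machine-readable form; the field names and values are illustrative assumptions, not the project's actual schema:

```python
import json

# Illustrative skeleton mirroring the six aspects of the GeomInt data
# management plan listed above. All field names and values are hypothetical.
dmp = {
    "generation_and_management": {
        "data_formats": ["txt", "xlsx", "vtu", "xml", "p"],
        "versioning": True,
        "metadata_standard": "Dublin Core",
    },
    "legal_management": {"basis": "project user agreement"},
    "exchange_and_provision": {"citation_rules": "defined in user agreement"},
    "short_term_storage": {"backup": True, "transfer": "web portal"},
    "long_term_storage": {"responsibility": "UFZ", "medium": "tape system"},
    "resources": {"data_management_role": "UFZ"},
}

print(json.dumps(sorted(dmp), indent=None))
```

A structured form like this makes it easy to check a plan for completeness automatically, while the prose document remains the authoritative version.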

GeomInt Data
The project results include specific data from laboratory and in-situ experiments, software components and data sets from numerical simulations (i.e. model and result files). An estimation of the extent of the data generated in GeomInt could not be made before the project. Therefore, the data management concept had to be flexible. This uncertainty was mainly due to the fact that the evaluation of test and calculation results may lead to a change in test and calculation planning and may even lead to additional experiments or simulation calculations.
The availability of experimental and numerical data generated in the project, including existing metadata, is realized on an internal area of the GeomInt homepage. The UFZ is responsible for the project data and has many years of experience in data management regarding the cooperative development of open source software (OpenGeoSys) as well as the acquisition, storage and processing of data from experiments on different scales, exploration and monitoring campaigns, numerical simulations and scientific 3D visualizations.
The UFZ has sufficient capacities and modern data management systems for data storage, which are available as a central data infrastructure for the research network. Specifically, data sets are managed by means of an ORACLE database. Access is via a web portal, where each data record must be provided with metadata before uploading. The metadata standard used is compatible with the INSPIRE Directive 2007/2/EC and also regulates the rights for access, use and transfer of the data. A tape system is also available for the long-term storage of very large amounts of data. For the provision of exploration and monitoring data, geo-services mentioned in the GDI-DE are used as far as possible. Since such services for complex modelling and simulation data do not exist so far, the provision is done via a data research portal, where data can be found by means of stored metadata.
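Since each data record must be provided with metadata before uploading, a simple completeness check against the metadata element set can be useful. The sketch below uses the fifteen elements of the Dublin Core standard (referenced later in this chapter); the subset of required fields and the example record values are assumptions, as the portal's actual validation rules are not reproduced here:

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}
# Assumed required subset; the real portal may require different fields.
REQUIRED = {"title", "creator", "date", "identifier", "rights"}

def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record may be uploaded."""
    problems = [f"unknown element: {k}" for k in record if k not in DUBLIN_CORE_ELEMENTS]
    problems += [f"missing required element: {k}" for k in REQUIRED if k not in record]
    return problems

# Example record with illustrative values.
record = {
    "title": "Three-point bending test, Rockville Granite",
    "creator": "CAU Kiel",
    "date": "2019",
    "identifier": "dmp-record-id",
    "rights": "project members only",
}
print(validate_record(record))  # → []
```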
As software components are part of non-commercial, scientific program platforms and are open source products (e.g. OpenGeoSys), they are hosted by the responsible partner via established source code hosting services (e.g. GitHub) and are publicly available. A possible public access to project data, which goes beyond the status quo as described in technical publications, as well as the handling of the data after the end of the project, is regulated in the cooperation agreement or in the cooperation contract between the project partners.
The handling of data obtained from the in-situ experiments in the underground laboratories through synergies with other projects is also regulated separately (access authorisation for these data, storage location, publication, handling of the data after the end of the project). Such an approach is necessary because specific parts of these data can be used for the scientific purposes of GeomInt, but they are generated in other projects with partly other partners.

GeomInt DMP
In this section, exemplary data sets of every project partner are described. A table of these data sets, including description and link, is available only for project partners at the website (Fig. 5.2). Some data sets can be found on the UFZ data investigation portal https://www.ufz.de/drp/. These data sets are uploaded to the data management portal at UFZ (DMP@UFZ).
The GeomInt data management system (DMS) is organised in three sections. Table 5.1 summarizes the MEX-related data concerning experiments and simulations; a selection is described in the following sections. The following codes (and related input files) are used (see Chap. 7 for detailed code introductions).

CAU (LEM)
The required LEM code and the input variables of the three-point fracture toughness test on the Rockville Granite samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/pRmBPJ9gK5Se6ci. The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in a single file for the simulation of the fracture toughness in Rockville Granite. Figure 5.4 shows the relation between load and CMOD as described in Sect. 4.1.

UFZ (FEM-VPF)
The source code can be found in the OpenGeoSys project on GitHub, and the input files for the three-point bending test performed on the Rockville Granite samples have been uploaded. The files include the unstructured finite element mesh files in vtu format and an OGS input file in xml format. As homogeneous properties such as Young's modulus are assigned in the computational domain, the spatially constant material properties are specified in the OGS input file rather than in the mesh file. The load and crack mouth opening displacement computed from the simulations are shown in Fig. 5.5 as described in Sect. 4.1.
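The separation described above (geometry in the vtu mesh, spatially constant properties in the xml project file) can be illustrated with a stripped-down fragment. The element names below are simplified for illustration and do not reproduce the exact OGS-6 project-file schema; the modulus value is likewise only a placeholder:

```python
import xml.etree.ElementTree as ET

# Hedged sketch in the spirit of an OGS input file: the mesh is referenced
# by filename, while a homogeneous material property (here a placeholder
# Young's modulus) is declared as a constant parameter in the XML itself.
prj_fragment = """
<OpenGeoSysProject>
  <mesh>bending_test.vtu</mesh>
  <parameters>
    <parameter>
      <name>YoungsModulus</name>
      <type>Constant</type>
      <value>24.0e9</value>
    </parameter>
  </parameters>
</OpenGeoSysProject>
"""

root = ET.fromstring(prj_fragment)
param = root.find("./parameters/parameter")
print(root.findtext("mesh"), param.findtext("name"), param.findtext("value"))
```

Keeping constant properties in the project file avoids duplicating the same value on every mesh element; heterogeneous properties, by contrast, are stored per element in the mesh files (as in the Opalinus Clay cases below).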

CAU Kiel
The experimental results of the three-point fracture toughness test on the Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/pJxp2eNEJb6PfiS. The data set, which includes the time, applied force (N) and the displacement of the sample at the loading point (mm), is provided in a *.txt file. The crack mouth opening displacement (CMOD), which is determined from the image processing technique (Sect. 2.2.2), is given in a *.xlsx file. The data includes the time and the calculated CMOD (mm).
The required LEM code and the input variables of the three-point fracture toughness test on the Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/ZBFN2rSZ99kPY9M.
The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two different files for parallel and perpendicular embedded layer orientations. Figure 5.6 shows the comparison between the experimental and numerical data as described in Sect. 4.2.
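The *.txt layout described above (time, applied force in N, loading-point displacement in mm) can be processed with a few lines of code. The whitespace delimiter and the sample values below are assumptions for illustration, not the actual measured data:

```python
import io

# Sketch of reading the described three-column *.txt layout:
# time, applied force (N), displacement at the loading point (mm).
sample = io.StringIO(
    "0.0 0.0 0.000\n"
    "1.0 55.2 0.012\n"
    "2.0 110.7 0.025\n"
    "3.0 98.3 0.031\n"
)
rows = [tuple(map(float, line.split())) for line in sample if line.strip()]

# The peak force is a quantity of interest in fracture toughness evaluation.
peak_time, peak_force, _ = max(rows, key=lambda r: r[1])
print(f"peak force {peak_force} N at t = {peak_time}")
```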

UFZ
The input files for OGS, which were used to simulate the three-point bending test performed on the orthogonal and parallel laminations of Opalinus Clay samples, have been uploaded. The files include the unstructured finite element mesh files in vtu format and OGS input files in xml format. In the mesh files, the material properties are defined per element; in particular, the orthogonal and parallel laminations in the samples are represented through a contrast in the fracture toughness. MEX 0-1b (UFZ) will also be provided as an OGS benchmark case at: https://www.opengeosys.org/docs/benchmarks/phase-field/pf_tpb_ani/.

Meta Data Overview (According to Dublin Core)
See (Tables 5.7 and 5.8).

CAU Kiel
The required LEM code and the input variables for simulating the swelling process of the salt clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/JmZseQqrsbgWNqC. The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. Fig. 5.8 shows the change of hydraulic conductivity with applied linear strains as described in Sect. 4.4.

Meta Data Overview (According to Dublin Core)
See (Table 5.10).

CAU Kiel
The experimental results of the drying and wetting paths of the sandy Opalinus Clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/q6g25nWyWJKqzNB.
The experimental data (*.xlsx) of drying and wetting paths are uploaded to the server. The data includes the reading number, time (day), strain values in perpendicular and parallel orientations, weight of the sample and measured water content values. Fig. 5.9 shows the change of the strains under the applied suction values.
The required LEM code and the input variables for simulating the drying and wetting paths of the sandy Opalinus Clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/fDNoPoXpXMqeAsK.
The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. Fig. 5.10 shows the change of hydraulic conductivity with applied linear strains as described in Sect. 4.5.

Meta Data Overview (According to Dublin Core)
See (Table 5.12).

UFZ
Link to the data set at UFZ data investigation portal (Download only for project members): https://www.ufz.de/record/dmp/archive/7706/.
The link contains two input decks for OGS-6 in which pressure-driven percolation as described in MEX2 is simulated under different configurations of boundary loading. The first case applies boundary loads of 12 MPa, 21 MPa, and 8 MPa in the x-, y-, and z-directions, respectively; it is called "case 1" and the corresponding OGS-6 input file is "me2_insitu_case1.prj". The second case is loaded with 4 MPa, 15 MPa, and 19 MPa in the x-, y-, and z-directions, respectively, and the input file is named "me2_insitu_case2.prj". The remaining files are vtu files that describe the computational domain and the boundaries as shown in Fig. 4.58.
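The two boundary-loading configurations can be collected in a small lookup table, which is convenient when scripting parameter studies around the two decks. The load values and file names below are taken from the description above; nothing is read from the actual prj files:

```python
# The two boundary-loading configurations of the MEX2 pressure-driven
# percolation simulations (values in MPa, as stated in the text).
cases = {
    "case 1": {"prj": "me2_insitu_case1.prj", "load_MPa": {"x": 12, "y": 21, "z": 8}},
    "case 2": {"prj": "me2_insitu_case2.prj", "load_MPa": {"x": 4, "y": 15, "z": 19}},
}

for name, cfg in cases.items():
    load = cfg["load_MPa"]
    print(f"{name}: {cfg['prj']} "
          f"(sigma_x={load['x']}, sigma_y={load['y']}, sigma_z={load['z']} MPa)")
```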

CAU Kiel
The required LEM code and the input variables of the percolation test on saltstone samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/9JZZcpS4S3JJT9S. The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two files for two different stress configurations. Fig. 5.11 shows the fracture surfaces under the percolation test as described in Sect. 4.8.

Meta Data Overview (According to Dublin Core)
See (Tables 5.16 and 5.17).

CAU (LEM)
The experimental results of the pressure driven percolation test on the cubic Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/EMdNkdF4PRKWCqa. The required LEM code and the input variables of the percolation test on Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/tFKKjxnSpgNG25b.
The uploaded protected MATLAB file (*.p format) requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two files for two different stress configurations. Fig. 5.13 illustrates an example of the evolved fracture surfaces for the second stress configuration discussed in Sect. 4.9.

Meta Data Overview (According to Dublin Core)
See (Tables 5.19 and 5.20).
The measured gas flow is converted into permeabilities, which are stored as time series in an Excel file. For each of the three experiments there are two columns: the first contains the time in hours since the start of the experiment, the second contains the permeability (Fig. 5.14).
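The two-columns-per-experiment layout described above can be split into separate time series once exported from Excel to CSV. The column names and the permeability values below are illustrative stand-ins, not the measured data:

```python
import csv
import io

# Sketch of the described layout: for each of the three experiments, one
# column with the time in hours since the start and one with the permeability.
text = io.StringIO(
    "t1,k1,t2,k2,t3,k3\n"
    "0,1.0e-18,0,2.0e-18,0,1.5e-18\n"
    "1,1.2e-18,1,2.1e-18,1,1.6e-18\n"
)
reader = csv.DictReader(text)
series = {name: [] for name in reader.fieldnames}
for row in reader:
    for name, value in row.items():
        series[name].append(float(value))

# Pair each time column with its permeability column, one pair per experiment.
experiments = [(series[f"t{i}"], series[f"k{i}"]) for i in (1, 2, 3)]
print(len(experiments), "experiments,", len(experiments[0][0]), "samples each")
```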

Meta Data Overview (According to Dublin Core)
See (Table 5.22).

CAU Kiel
The required LEM code and the input variables for simulating the effect of compressibility are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/6Mfg3P4 PyKNN6By.
The uploaded protected MATLAB file in a *.p format requires a MATLAB version with a built-in Voronoi Tessellation and Delaunay Triangulation functions. The input variables are prepared in two different files for the simulation of the pressure drop in gas and brine reservoirs. Fig. 5.15 shows the differences between the pressure drop in gas and brine reservoirs.

Meta Data Overview (According to Dublin Core)
See (Table 5.24).

Meta Data Overview (According to Dublin Core)
See (Table 5.26). The data set contains a measured pressure curve over time (50 days) during an injection test in a large wellbore (Springen URL). The injection regime was conducted in increasing pressure steps of 5 bar, followed by shut-in periods for observing the related pressure diffusion. The corresponding pumping rates were between 1 and 5.5 liters per second. Peak pressures of more than 60 bar were required to create in-situ discontinuities by overcoming the tensile/adhesive strength of the salt rock. Additionally, acoustic emission (AE) data were recorded (Fig. 5.16).
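The stepwise injection regime described above (5 bar increments, each followed by a shut-in period, up to peak pressures above 60 bar) can be sketched as a simple schedule generator. The starting pressure and the number of steps are illustrative assumptions; the actual regime is recorded in the data set itself:

```python
# Sketch of the stepwise injection regime: pressure raised in 5 bar
# increments, each step followed by a shut-in period for observing
# pressure diffusion. Start and peak values here are assumptions.
def injection_schedule(start_bar=5, step_bar=5, peak_bar=60):
    """Yield (phase, target pressure in bar) tuples up to the peak pressure."""
    pressure = start_bar
    while pressure <= peak_bar:
        yield ("inject", pressure)
        yield ("shut-in", pressure)
        pressure += step_bar

schedule = list(injection_schedule())
print(len(schedule), "phases, last:", schedule[-1])
```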
The CNL data set contains four text files. One text file contains the rock properties of the granite used (see Table 2.1). Two files contain the scan data of the two surfaces; one point cloud can be seen in Fig. 5.17. The last file contains the laboratory data; the results for the four shear stress levels can be seen in Fig. 5.18.
Meta Data Overview (According to Dublin Core)
See (Table 5.28).
The data set of the CNS test contains a file with the rock properties of the basalt used (see Table 2.1) and two files with the scan data of the two surfaces. One point cloud can be seen in Fig. 5.19. The results of the laboratory tests are available as ASCII files, and the shear curves and the dilatation are visualized in Fig. 5.20. Additionally, three photos of the basalt surface before, after the first and after the fourth shear test are included.
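One typical use of the surface scan data in the shear-test data sets is to derive a simple roughness measure, for example the mean absolute deviation of the heights from their mean. The tiny point cloud below is a hypothetical stand-in; the real scans are in the data set's ASCII files:

```python
import statistics

# Hypothetical (x, y, z) point cloud standing in for a surface scan.
point_cloud = [
    (0.0, 0.0, 0.12), (0.0, 1.0, -0.05),
    (1.0, 0.0, 0.03), (1.0, 1.0, -0.10),
]

# A simple roughness measure: mean absolute deviation of the heights.
heights = [z for _, _, z in point_cloud]
mean_z = statistics.fmean(heights)
roughness = statistics.fmean(abs(z - mean_z) for z in heights)
print(round(roughness, 4))
```

More elaborate roughness parameters are used in joint mechanics, but the point-cloud handling follows the same pattern: read the ASCII scan, extract the height column, and reduce it to a scalar descriptor.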
Meta Data Overview (According to Dublin Core)
See (Table 5.30).
The uploaded data set contains two compiled executables for simulations to fit experimental data for the Reiche Zeche fracture characterization tests and to reproduce the non-linear flow response throughout harmonic testing of a single fracture on the laboratory scale, as described in Sect. 4.15. The folder includes the executables, input files and the discretization used in terms of meshing files, required to perform the simulations. README.txt files provide further information on how to start the simulations.

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.