Applications of the Metadata Standards

Horsch, Martin Thomas; Chiacchiera, Silvia; Cavalcanti, Welchy Leite; Schembera, Björn

doi:10.1007/978-3-030-68597-3_5

Part of the book series: SpringerBriefs in Applied Sciences and Technology (BRIEFSAPPLSCIENCES)


Abstract

This chapter addresses issues related to the practical use of the metadata standards, including syntactic interoperability and concrete scenarios from molecular modelling and simulation. It discusses challenges that arise from semantic heterogeneity, wherever multiple interoperability standards are concurrently employed for identical or overlapping domains of knowledge, or where domain ontologies need to be matched to top-level ontologies such as the European Materials and Modelling Ontology (EMMO).


5.1 Representing Scenarios

The division of a knowledge base $\mathcal {K}= (\mathcal {T}, \mathcal {A})$ into an ontology $\mathcal {T}$ and a scenario $\mathcal {A}$, as introduced in Sect. 3.1, is not only formal, but also motivated by practice. Fulfilling the role of a schema, an ontology needs to be ingested into a data infrastructure, or implemented by it, only once; frequent updates are undesirable, since they require a reannotation of data. For the scenarios handled by digital platforms, obversely, data retrieval and ingest are routine operations, and so are updates, since they need to occur whenever the represented reality changes, e.g. a new service is offered or a new user is registered. Challenges related to I/O (or ingest and retrieval) mainly concern the scenarios, not the ontologies, and their standardized representation by files, streams or protocols is the main vehicle for syntactic interoperability.

Since the IRIs of resources on the semantic web can point to each other as freely as the URLs of sites on the World Wide Web, i.e. in a graph-like way, it is natural to visualize scenarios by graphs. These representations are referred to as knowledge graphs. In Sect. 3.1, a scenario was defined as a tuple $\mathcal {A}= (\mathbf {I}, A _\mathrm {c}, A _\mathrm {r}, H )$ with individual names $\mathbf {I}$, conceptual assertions $ A _\mathrm {c}$, relational assertions $ A _\mathrm {r}$ and elementary datatype property assertions $ H $. The corresponding knowledge graph is a labelled graph $ G = (\mathbf {I}, E , \varLambda _\mathrm {v}, \varLambda _\mathrm {e})$ where the vertices are given by $\mathbf {I}$ and the edges by

$$\begin{aligned} E ~ = ~ \{( I , J ) \,\mid \, \exists R \in \mathbf {R}: ~ ( I , R , J ) \in A _\mathrm {r}\} ~ \subseteq ~ \mathbf {I}^2. \end{aligned}$$

(5.1)

Vertices are labelled according to the function $ \varLambda _\mathrm {v}: \mathbf {I}\rightarrow 2^{\mathbf {C}\cup \mathbb {R}\cup \Sigma ^ \star }$ that maps^{Footnote 1} each individual name $ I \in \mathbf {I}$ to a set of labels

$$\begin{aligned} \varLambda _\mathrm {v}( I ) ~ = ~ \left( A _\mathrm {c}( I ) \,\cap \, \mathbf {C}\right) \,\cup \, \{ v \in \Sigma ^ \star \,\mid \, \exists k \in \Sigma ^ \star : ~ ( k , v ) \in H ( I )\}, \end{aligned}$$

(5.2)

while the edge labelling function $ \varLambda _\mathrm {e}: E \rightarrow 2^\mathbf {R}$ assigns the corresponding relation names

$$\begin{aligned} \varLambda _\mathrm {e}\left( ( I , J )\right) ~ = ~ \{ R \in \mathbf {R}\,\mid \, ( I , R , J ) \in A _\mathrm {r}\} \end{aligned}$$

(5.3)

to an edge $( I , J ) \in E $.
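The construction in Eqs. (5.1) to (5.3) can be sketched in a few lines of Python. This is a minimal illustration, not part of the VIMMP tooling: the function name `knowledge_graph`, the `ex:` individual, concept and relation names, and the name keys other than `ccso:csName` are hypothetical placeholders loosely modelled on the course example discussed in this section.

```python
from collections import defaultdict


def knowledge_graph(individuals, concept_assertions, relation_assertions,
                    datatype_assertions, concept_names):
    """Return (I, E, Lambda_v, Lambda_e) for a scenario (I, A_c, A_r, H)."""
    # Eq. (5.3): collect the relation names asserted for each ordered pair.
    edge_labels = defaultdict(set)
    for subj, rel, obj in relation_assertions:
        edge_labels[(subj, obj)].add(rel)
    # Eq. (5.1): an edge exists wherever at least one relation is asserted.
    edges = set(edge_labels)
    # Eq. (5.2): asserted concept names plus the values of datatype properties.
    vertex_labels = {}
    for ind in individuals:
        concepts = concept_assertions.get(ind, set()) & concept_names
        values = {value for _key, value in datatype_assertions.get(ind, set())}
        vertex_labels[ind] = concepts | values
    return individuals, edges, vertex_labels, dict(edge_labels)


# Hypothetical scenario; all "ex:" names are assumptions for illustration.
I = {"ex:course", "ex:syllabus", "ex:instructor"}
C = {"ex:Course", "ex:Syllabus", "ex:Instructor"}
A_c = {"ex:course": {"ex:Course"}, "ex:syllabus": {"ex:Syllabus"},
       "ex:instructor": {"ex:Instructor"}}
A_r = {("ex:course", "ex:has_syllabus", "ex:syllabus"),
       ("ex:syllabus", "ex:has_instructor", "ex:instructor")}
H = {"ex:course": {("ccso:csName", "CECAM SWiMM 2021")},
     "ex:instructor": {("ex:first_name", "Jean-Pierre"),
                       ("ex:last_name", "Minier")}}

individuals, edges, vertex_labels, edge_labels = knowledge_graph(I, A_c, A_r, H, C)
```

As in Eq. (5.2), the keys of the datatype property assertions are dropped and only the values end up as vertex labels.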

An example is given in Fig. 5.1; this knowledge graph might be read as follows: “There is a course labelled ‘CECAM SWiMM 2021’. This course has a syllabus, in which information is given on an instructor who is labelled ‘Jean-Pierre’ and ‘Minier’. The course has a training unit labelled ‘Salome/YACS’ for which event information is given,” etc. While this particular representation does not contain IRIs of datatype properties (to match the definition of the knowledge graph given above), it could easily be modified to incorporate this information as well, e.g. by using property graphs following Abad Navarro et al. [1]. The individual name IRIs are not shown in the figure to simplify the visualization; however, they are included in the definition of the knowledge graph.

The technical implementation of semantic interoperability requires a syntactic representation by which information can be extracted from (or ingested into) a digital platform including a knowledge base; cf. Fig. 5.2 for a typical multi-tier design approach. For this purpose, subject-predicate-object triples can be employed, e.g. in TTL format (cf. Sect. 3.1), by which the scenario from Fig. 5.1 is rendered as follows:

Above, @prefix statements introduce the abbreviations employed for IRI prefixes, e.g. the datatype property https://w3id.org/ccso/ccso#csName is abbreviated by ccso:csName. The elementary datatypes follow the conventions for XML schemas, cf. Chapter 2.

TTL notation has the advantage that it can be employed consistently for the whole knowledge base, including both the ontology and the scenario. For many applications, however, this is more problematic than beneficial, because the expressive power of OWL and its various serializations (including TTL) goes far beyond what is needed to represent objects and their properties; consequently, it is harder to parse and to process. Moreover, it cannot be ensured at the syntax level that only information on the scenario is included. Instead, JavaScript Object Notation (JSON) is often preferred, particularly in its JSON Linked Data (JSON-LD) variety which was specifically designed for the purpose of exchanging semantically characterized information on objects and their relations. In JSON-LD format, the example scenario becomes

There, every pair of curly braces encloses the description of an object (except the value of @context, which includes the IRI prefix definitions), given as a sequence of key-value pairs. The individual names are provided as values corresponding to the key @id, while the instantiated concept names are indicated by the key @type. The other keys are relation names, and the associated values are the third elements of the respective triples, as can be seen from the direct correspondence between the TTL and JSON-LD examples given above.
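The correspondence between nested JSON-LD objects and triples can be illustrated with a short Python sketch based only on the standard library. It is not a JSON-LD processor: it assumes that every object carries an @id, it does not expand prefixes, and the document below is a hypothetical fragment (the `ex:` names are placeholders; only `ccso:csName` is taken from the text).

```python
import json

# Hypothetical JSON-LD fragment; the "ex:" names are assumptions.
doc = json.loads("""
{
  "@context": {"ex": "https://example.org/",
               "ccso": "https://w3id.org/ccso/ccso#"},
  "@id": "ex:course_1",
  "@type": "ex:Course",
  "ccso:csName": "CECAM SWiMM 2021",
  "ex:has_syllabus": {"@id": "ex:syllabus_1", "@type": "ex:Syllabus"}
}
""")


def flatten(node, triples):
    """Append (subject, predicate, object) triples for a nested object."""
    subject = node["@id"]
    for key, value in node.items():
        if key in ("@id", "@context"):
            continue
        predicate = "rdf:type" if key == "@type" else key
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # A nested object: link to its @id, then descend into it.
                triples.append((subject, predicate, item["@id"]))
                flatten(item, triples)
            else:
                triples.append((subject, predicate, item))
    return triples


triples = flatten(doc, [])
```

For the fragment above, this yields four triples: two type assertions, one datatype property assertion and one relation between the two individuals.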

Additionally, domain-specific solutions on the basis of the hierarchical data format HDF5 facilitate combining a greater volume of data, including binary data, with the corresponding semantic annotation [3], e.g. the H5MD format [4] for semantically enriched data in molecular modelling and simulation. The VIMMP marketplace platform API and its Zontal Space back end permit handling annotated digital objects through the HDF5-based Allotrope Data Format (ADF) [5,6,7].

5.2 Top-Level Ontology

For a fundamental philosophical underpinning, the European Materials and Modelling Ontology [8, 9] relies on a combination of physicalist mereotopology following Varzi [10] and a nominalist reinterpretation of Peirce’s semiotics [11]. Therein, physicalist mereotopology primarily addresses the description of materials, which is extended by nominalist semiotics to describe modelling, simulation and experiments. For a discussion of nominalism, cf. Lewis [12]; more specific implications of the approach of the EMMO on representing modelling and simulation of physical systems have been discussed elsewhere [13].

To facilitate the top-level ontology alignment of the VIMMP ontologies, a module with a scaled-down EMMO in TTL format is included, EMMO version 1 simplified (EMMO1s), which at the present stage (version 1.0.4) is based on EMMO version 1.0.0 alpha 2 (April 2020). EMMO1s provides user-friendly IRIs for EMMO concepts,^{Footnote 2} retaining the labels. For example, the IRI of the EMMO concept with rdfs:label “Semiosis” is given in the original EMMO as emmo-semiotics:EMMO_008fd3b2_4013_451f_8827_52bceab11841. For these entities, EMMO1s specifies aliases that can be accessed directly through the label, such as emmo1s:Semiosis. In the interest of notational clarity, to indicate the origin of the concept definitions and the respective EMMO modules, these entities will here be denoted by the EMMO prefix followed by the EMMO1s suffix, e.g. by emmo-semiotics:Semiosis, even though internally, for VIMMP, it is actually emmo1s:Semiosis.

The VIMMP Primitives (VIPRS) module amplifies the ways in which the EMMO-based top-level semantic interoperability architecture can be applied to the relations characterizing metadata from the VIMMP marketplace-level domain ontologies.^{Footnote 3} With this aim, VIPRS extends the EMMO system of top-level relations by three features:

1. modal logic (e.g. Kripke semantics) and modal squares of opposition;
2. concatenation of mereotopological and semiotic relations, yielding mereosemiotic relations;
3. top-level datatype properties.

While the EMMO can be used to describe materials and models as such, statements on necessity and possibility anchored in modal logic are metaontological, i.e. beyond the ontology, from the point of view of the EMMO [9]. For example, within the framework of the EMMO, an event can be described as a physical process, but the statement that “this process can possibly occur, but it will not necessarily occur” cannot be expressed. The present domain ontologies, however, make ample use of relations that are ultimately modal to specify capabilities (it is possible that :X will be used to do :Y) or requirements (it is necessary that if :X occurs, :Y also occurs).

To provide a top-level structure for modal relations, VIPRS includes modal squares of opposition,^{Footnote 4} cf. Fig. 5.3, by which the presence of individuals in a knowledge base can be associated with statements on whether their occurrence is possible, necessary, factual or fictional [16]. The modal operators can be given a variety of interpretations, depending on the precise use that is made of the ideas of necessity ($ \Box $) and possibility ($ \Diamond $), respectively [17]; similarly, the definition of “occurrence” depends on the use that is made of the ontology and may depend on context. VIPRS accepts this ambiguity in order to be applicable to diverse types of knowledge bases and infrastructures. The term “to occur” in “:X may occur” and similar statements is employed to refer to the (possible or necessary) appearance of an individual :X in a certain type of environment, e.g. as an element of a valid simulation workflow. On this basis, relations concerning the possible or necessary co-occurrence of multiple individuals are defined, e.g. viprs:n_loc_or_rnoc (and others following the same pattern, cf. Fig. 5.3), where the IRI is to be read as “necessarily, the left occurs or the right does not occur”

(5.4)

cf. Fig. 5.3. Thereby, “occurrence” (by appearing in a certain type of environment) is not the same as “existence,” i.e. presence in a knowledge base. It is in this sense that VIPRS can be employed as an implementation of possible-world semantics, Kripke semantics and/or ontological Meinongianism [16], even though it does not necessarily presuppose the use of any of these paradigms. The conceptualization relation

(5.5)

with $ K _ I \,\mathsf {C}\, I $ to be read as “$ K _ I $ conceptualizes $ I $,” relates a more (or equally^{Footnote 5}) generic individual to a more (or equally) specific one; it is used to introduce a step of abstraction into the modal co-occurrence relations, e.g. “necessarily, the left occurs conceptual-or the right does not occur”

(5.6)

Relations from the EMMO are mereological (or, more properly, mereotopological [10, 18, 19]), represented here at the highest level by proper parthood

$$\begin{aligned} \mathsf {P}~ \equiv ~ \textsf {{viprs:is\_proper\_part\_of}} ~ \equiv ~ {\textsf {{emmo-mereotopology:hasProperPart}}}^ - , \end{aligned}$$

(5.7)

and semiotic, represented at the highest level by the sign-to-object reference relation^{Footnote 6}

(5.8)

cf. Expressions (3.6) and (3.7). To facilitate ontology alignment, which is discussed in Sects. 5.3 and 5.4, VIPRS also contains mereosemiotic chain products of these fundamental relations, i.e. elements of the free semigroup $\mathbf {R}_\mathrm {ms}^+$ over $\mathbf {R}_\mathrm {ms}= \{\mathsf {P}, \mathsf {S}, {\mathsf {P}}^ - , {\mathsf {S}}^ - \}$, with the product defined by concatenation. The mereosemiotic relations for which there is an explicit definition in VIPRS are limited to those elements of $\mathbf {R}_\mathrm {ms}\, \cup \, \mathbf {R}_\mathrm {ms}^2 \, \cup \, \mathbf {R}_\mathrm {ms}^3$, i.e. relations generated by a sequence of up to three fundamental relations, that are neither redundant ($\mathsf {P}\circ \,\mathsf {P}$ and its inverse),^{Footnote 7} nor complete (or almost complete), i.e. relating everything to everything, except possibly for a single “universe” entity,^{Footnote 8} as is the case for $\mathsf {P}\circ {\mathsf {P}}^ - $, nor composed of three elements from the same category; e.g. ${\mathsf {S}}^ - \circ \,\mathsf {S}\,\circ \,{\mathsf {S}}^ - $ is excluded, because all three constituent elements are semiotic. In the nomenclature employed by VIPRS, the IRI elements ip, hp, is and hs stand for “is proper part,” “has proper part,” “is sign” and “has sign,” respectively. Accordingly, the binary chain relations include

(5.9)

while the ternary chain relations include

(5.10)
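The exclusion criteria stated above can be made concrete by enumerating the candidate chains. The following Python sketch is an illustration under stated assumptions, not a listing of the relations actually defined in VIPRS: chains are written as tuples of the IRI elements ip, hp, is and hs, read from left to right, and a chain is discarded if it contains $\mathsf {P}\circ \mathsf {P}$, ${\mathsf {P}}^ - \circ {\mathsf {P}}^ - $ or $\mathsf {P}\circ {\mathsf {P}}^ - $ as adjacent elements, or if it is a ternary chain whose elements all belong to the same category. The names `FUNDAMENTAL`, `EXCLUDED_PAIRS` and `is_admissible` are hypothetical.

```python
from itertools import product

# IRI elements from the text: P = "ip", P^- = "hp", S = "is", S^- = "hs".
FUNDAMENTAL = ("ip", "hp", "is", "hs")
MEREOLOGICAL = {"ip", "hp"}

# Adjacent pairs excluded as redundant (P.P, P^-.P^-) or almost complete (P.P^-).
EXCLUDED_PAIRS = {("ip", "ip"), ("hp", "hp"), ("ip", "hp")}


def is_admissible(chain):
    """Apply the exclusion criteria to a chain of fundamental relations."""
    if any(pair in EXCLUDED_PAIRS for pair in zip(chain, chain[1:])):
        return False
    if len(chain) == 3:
        # Exclude ternary chains whose elements are all mereological
        # or all semiotic.
        categories = {element in MEREOLOGICAL for element in chain}
        if len(categories) == 1:
            return False
    return True


chains = {n: [chain for chain in product(FUNDAMENTAL, repeat=n)
              if is_admissible(chain)]
          for n in (1, 2, 3)}
```

Under these assumptions, the sketch retains 4 unary, 13 binary and 36 ternary candidates; whether each of them has an explicit definition in VIPRS is not implied.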

With minor exceptions, datatype properties (owl:DatatypeProperty) are absent from the EMMO [9]; in the domain ontologies, however, datatype properties are amply employed to associate objects with textual (xs:string) and numerical (xs:decimal) attributes as well as xs:boolean flags. Figure 5.4 visualizes the hierarchy of top-level datatype properties introduced in VIPRS. At the highest level, VIPRS categorizes datatype properties according to their role:

Identification of an object is positioned below viprs:has_identifier; examples include otras:has_topic_code, which maps a materials modelling topic (otras:mm_topic) from OTRAS to a four-digit code. Each topic code uniquely corresponds to one topic, and its purpose is identification.
Where an elementary-datatype entry is the content (or part of the content) of an object, datatype properties below viprs:has_content are used, e.g. this applies to textual or numerical content of MODA form entries (in OSMO, aspects), corresponding to osmo:has_aspect_text_content and osmo:has_aspect_text_content [20, 21], cf. Section 3.3.
For elementary descriptors, specifiers and similar metadata that provide additional, contingent information on objects, viprs:has_specifier is used, e.g. otras:has_cited_video_duration_seconds points to a metadata item on the length of a video. This contributes to our knowledge about the video by specification, while it does not permit its identification; moreover, the video duration is information about the video content, but it is not itself the content. Therefore, otras:has_cited_video_duration_seconds $ \sqsubseteq $ viprs:has_specifier.

At the second level, the datatypes are distinguished (string, decimal or Boolean). Further below, at the third level, the textual datatype properties are further split into subproperties according to their function (cf. Fig. 5.4).

5.3 Ontology Matching

A major design goal for a top-level ontology consists in achieving the desired level of expressivity with a minimal repertoire of basic terms and relations. Conversely, to ensure interoperability for services and tools interoperating at the level of a specific digital platform, the employed ontologies need to capture detailed characteristics of data pertaining to a particular domain of knowledge. Accordingly, the structure of the corresponding semantic space at the lower level is comparatively complex, e.g. the ontologies from VIMMP contain about 1000 concepts, 550 relations (object properties) and 180 elementary datatype properties. Therefore, by design, the EMMO needs to have a structure that is substantially different from that of the marketplace-level ontologies [7]. To ensure that the EMMO is consistently employed at all levels, so that it can contribute to platform and service interoperability as far as possible, the marketplace-level ontologies need to be aligned with the EMMO. Before returning to this specific problem, the present section summarizes some of the related theoretical concepts.

In principle, semantic assets are designed to allow data integration and overcome the data heterogeneity problem; in reality, semantic heterogeneity does arise, and it grows over time as resources are added to the semantic web. This is known as the Tower of Babel problem [22, 23]. While some authors regard any presence of semantic heterogeneity as a failure of semantic interoperability and hope for universal agreements, others think that it is unavoidable and look for strategies to deal with it. This may involve a standardized way of documenting semantic assets; basic agreements on the approach to ontology design; and the formalizations of roles, procedures and good practices (or best practices), aiming at pragmatic interoperability [24,25,26,27]. For this approach, the challenge consists in agreeing and specifying how the semantic space is structured, documented and employed in practice; by raising the domain for which universal agreements are pursued from the ontological level to the metaontological level, “the Tower of Babel becomes a Meta-Tower of Babel” [28].

As a consequence, semantic heterogeneity is seen as a necessary property of the semantic web, and ontology matching and integration become basic features of its successful mode of operation, rather than an expression of incompleteness. Options for implementing such a mode of operation have been extensively discussed in the literature, first for schemas and then for ontologies, cf. Noy [29] as well as Euzenat and Shvaiko [30]. The common challenge is how to make use of the knowledge represented in two ontologies, which can differ at various levels (language used, expressivity, modelling paradigm, etc.). Typically, such challenges arise if there is an overlap in the domains of knowledge addressed by multiple ontologies, such that data annotated in diverse ways need to be combined and processed together, or if a platform employs multiple domain ontologies that are based on different top-level ontologies. Typical applications include, e.g. simultaneous querying of multiple knowledge bases [31,32,33,34] or, as addressed here, the mapping of semantic content from a source ontology $\mathcal {S}$ to a target ontology $\mathcal {T}$.

Such a mapping $ \alpha $, by which a scenario $\mathcal {A}_\mathcal {S}$ expressed in the source ontology is mapped to a scenario $\mathcal {A}_\mathcal {T}$ expressed in the target ontology, is an ontology alignment. Equivalently, this can be applied to the corresponding knowledge graphs, $ \alpha : G _\mathcal {S}\mapsto G _\mathcal {T}$. The process by which an alignment is constructed is known as ontology matching [35]. Alignments can be probabilistic or deterministic, e.g. in a probabilistic formalism, it might be stated that “an osmo:condition that osmo:contains_variable an evmpo:material_property has a 40% probability of being an emmo-models:PhysicsBasedModel”, cf. Suchanek et al. [36]. For the present purpose, we restrict ourselves to deterministic alignments, based on rules that are asserted to be valid in general. If such an alignment is formulated coherently and correctly, the source and target scenarios need to be semantically consistent, i.e. the assertions from the target scenario may not contradict the assertions from the source scenario, which can be checked in multiple ways:

1. Immanently (ontologically), on the basis of a series of alignments $ \alpha \circ \alpha ' \circ \dots $, at the end of which another version of the scenario expressed in the source ontology is obtained. Then the consistency of the original and final scenarios can be determined on the basis of the rules from the source ontology $\mathcal {S}$.
2. Transcendentally (metaontologically), either by creating a new ontology that encompasses both $\mathcal {S}$ and $\mathcal {T}$, containing rules in which concepts or relations from both ontologies occur jointly, or alternatively by a different system of (possibly human) arbitration that can detect contradictions between $\mathcal {A}_\mathcal {S}$ and $\mathcal {A}_\mathcal {T}$.

Under the constraint of consistency, it is the main challenge to preserve as much of the originally given information as possible. Test scenarios, for which the desired target representation is known, can be used to validate the alignment [34]. Moreover, alignment rules, whether probabilistic or deterministic, can be obtained by evaluating corpora of data that are annotated in both the source and target ontologies [35, 37]; in the probabilistic case, however, the outcome can be assumed to apply only as long as the population or corpus underlying the statistical analysis from which the probabilities were determined is representative of a class of scenarios to which $\mathcal {A}_\mathcal {S}$ belongs. Simple alignment correspondences [38] can be specified by categorically subsuming concepts and relations from $\mathcal {S}$ under those from $\mathcal {T}$, yielding relabelling rules [39] that do not affect the graph structure (only the labels) and that are context free, i.e. independent of adjacent vertices and edges, such as

$$\begin{aligned} \textsf {{vivo:evaluates}} ~ \sqsubseteq ~ \mathsf {S}, \end{aligned}$$

(5.11)

stating that whatever evaluates an object, by implication, always also is a sign for that object. Besides, qualified subsumptions can be formulated, such as

$$\begin{aligned} \exists ({\textsf {{vivo:evaluates}}}^ - ).\textsf {{evmpo:assertion}} ~ \sqsubseteq ~ \textsf {{emmo-semiotics:Object}}, \end{aligned}$$

(5.12)

i.e. that which is evaluated by an assertion is an “object” in the sense of Peircean semiotics; this is a context-sensitive rule, since the relabelling of the vertex (individual) is contingent on one of the edges, namely, an incoming edge with the label vivo:evaluates. Beyond this, more complex graph transformation rules [40] can be applied in the case that the transformation goes beyond relabelling, i.e. if vertices or edges in the knowledge graph need to be eliminated or created by applying m : n property chain correspondences [38].
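The difference between the context-free rule (5.11) and the context-sensitive rule (5.12) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the knowledge graph is given as a set of triples plus a dictionary of vertex labels, the entailed labels are added rather than substituted, the string "S" stands in for the IRI of the sign-to-object relation, and the `ex:` individuals and the concept `ex:Review` are hypothetical.

```python
def apply_alignment(vertex_labels, edges):
    """Apply rules (5.11) and (5.12) to a small knowledge graph.

    vertex_labels: dict mapping each individual to a set of concept names
    edges: set of (subject, relation, object) triples
    """
    new_labels = {ind: set(labels) for ind, labels in vertex_labels.items()}
    new_edges = set(edges)
    for subj, rel, obj in edges:
        if rel != "vivo:evaluates":
            continue
        # Rule (5.11), context free: every vivo:evaluates edge is also
        # an S edge, whatever the adjacent vertices are.
        new_edges.add((subj, "S", obj))
        # Rule (5.12), context sensitive: the target vertex is relabelled
        # only if the incoming edge originates from an evmpo:assertion.
        if "evmpo:assertion" in vertex_labels.get(subj, set()):
            new_labels.setdefault(obj, set()).add("emmo-semiotics:Object")
    return new_labels, new_edges


# Hypothetical source scenario.
labels = {"ex:claim_1": {"evmpo:assertion"}, "ex:review_1": {"ex:Review"},
          "ex:model_1": set(), "ex:model_2": set()}
edges = {("ex:claim_1", "vivo:evaluates", "ex:model_1"),
         ("ex:review_1", "vivo:evaluates", "ex:model_2")}

target_labels, target_edges = apply_alignment(labels, edges)
```

Both evaluated individuals receive an incoming S edge, but only ex:model_1, which is evaluated by an assertion, is labelled emmo-semiotics:Object. The graph structure is unchanged apart from the labels.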

5.4 VIMMP-EMMO Alignment

To permit the transformation of a knowledge graph from the way in which it appears to the VIMMP marketplace platform to the more abstract representation required for interoperability within a heterogeneous ecosystem of platforms mediated through the EMMO, both concepts and relations need to be aligned between the (VIMMP marketplace) domain level and the (EMMO) top level. This is realized by an ontology module for EMMO-VIMMP Integration (EVI). For the present purpose, accordingly, $\mathcal {S}$ is the VIMMP system of ontologies, including the EVMPO (but excluding VIPRS), and $\mathcal {T}$ is the EMMO, in the case of concepts, and the EMMO in combination with VIPRS, in the case of relations. In the absence of co-annotated corpora that can be analysed automatically, the correspondences were all specified explicitly, by evaluating the concept and relation definitions from the EMMO alpha version in comparison with the respective definitions from the VIMMP ontologies.

Concerning the conceptual alignment, Fig. 5.5 shows how the categories from the EVMPO, cf. Sect. 3.2, are mapped to EMMO concepts. The red arrows and double lines in Fig. 5.5 represent this alignment, which is itself expressed as an ontology and implemented in the EVI module. This part of the alignment guarantees that all VIMMP domain-ontology concepts are subsumed under EMMO concepts (where they are all situated below emmo-physical:Physical taxonomically), since all of these concepts are either subclasses of one of the fundamental paradigmatic categories from the EVMPO or of the fundamental non-paradigmatic category evmpo:annotation.

Beyond this, the concepts from the domain ontologies are aligned with the EMMO down to a comparatively fine-grained level; this is also implemented in EVI.^{Footnote 9} Table 5.1 contains the EVI statements corresponding to the concepts that were listed as examples in Chap. 3.

Table 5.1 Alignment between selected concepts from the VIMMP marketplace-level ontologies (source ontology $\mathcal {S}$) introduced in Sects. (top), (middle) and (bottom) and the EMMO top-level ontology (target ontology $\mathcal {T}$)


Table 5.2 Alignment between selected relations from the VIMMP marketplace-level ontologies (source ontology $\mathcal {S}$) introduced in Sects. 3.3 (top), 3.4 (middle) and 3.5 (bottom) and VIPRS in combination with the EMMO (target ontology $\mathcal {T}$)


The relational alignment, which is shown for the MMTO in Fig. 5.6 and for the examples from Chap. 3 in Table 5.2, is implemented directly in the domain ontology TTL files, which contain statements by which the domain ontology relations are subsumed under VIPRS relations. Property chain correspondences are applied when the mereosemiotic chain relations from VIPRS are unfolded, cf. Fig. 5.7, yielding series of elementary parthood and reference relations from the EMMO, so that the graph grows both in terms of vertices and edges; in TTL notation, this corresponds to the introduction of blank nodes (individuals without an IRI [41]) by which, e.g.

which encodes becomes
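Independently of the TTL notation, the unfolding of a mereosemiotic chain relation into elementary EMMO relations can be sketched in Python. This is an illustration under stated assumptions rather than the EVI implementation: chains are given as tuples of the IRI elements ip, hp, is and hs, read from left to right; blank nodes are written as `_:b1`, `_:b2`, etc.; the relation name emmo-semiotics:hasSign is assumed by analogy with emmo-mereotopology:hasProperPart; and the function `unfold` and the `ex:` individuals are hypothetical.

```python
from itertools import count

# Elementary EMMO relation for each IRI element, and whether the element
# is the inverse of that relation ("is proper part" inverts hasProperPart).
# The IRI emmo-semiotics:hasSign is an assumption.
ELEMENTARY = {
    "ip": ("emmo-mereotopology:hasProperPart", True),
    "hp": ("emmo-mereotopology:hasProperPart", False),
    "is": ("emmo-semiotics:hasSign", True),
    "hs": ("emmo-semiotics:hasSign", False),
}


def unfold(subject, chain, obj, fresh=None):
    """Replace one chain edge by elementary edges through blank nodes."""
    fresh = fresh if fresh is not None else count(1)
    # One blank node is needed between each pair of consecutive steps.
    nodes = [subject] + [f"_:b{next(fresh)}" for _ in chain[1:]] + [obj]
    triples = []
    for element, left, right in zip(chain, nodes, nodes[1:]):
        relation, inverted = ELEMENTARY[element]
        triples.append((right, relation, left) if inverted
                       else (left, relation, right))
    return triples


# "ex:a is a proper part of something that has the sign ex:b"
unfolded = unfold("ex:a", ("ip", "hs"), "ex:b")
```

A chain of n elementary relations introduces n - 1 blank nodes and n edges, which is why the graph grows in terms of both vertices and edges.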

5.5 Documentation of Molecular Models

For documenting molecular models and exchanging them between platforms, a semantic interoperability standard on the basis of the VIMMP system of ontologies as well as MODA [20] was agreed between VIMMP and the Molecular Model Database (MolMod DB) of the Boltzmann-Zuse Society [43]; the associated environment of interoperable platforms will prospectively also include Bottled SAFT [44].

The structure of the knowledge graph representing a molecular model is illustrated by Fig. 5.8, which corresponds to a two-centre Lennard-Jones plus point-quadrupole model (2CLJQ) where a molecule (in this case, acetylene) is represented by a rigid unit (viso-am:rigid_object), consisting of two Lennard-Jones interaction sites (viso-am:lj_site), a point-quadrupole site (viso-am:charge_quadrupole_site) as well as a viso-am:structureless_object, representing the molecular centre of mass, which is used as the initial point of vov:relative_position vectors that indicate the coordinates of the interaction sites. The relation vov:has_attached_variable and subproperties of it are used to connect the interaction sites with the non-geometrical model parameters, i.e. the mass associated with each of the two LJ sites (half the molecular mass), the size and energy parameters $\sigma $ and $\epsilon $ of the LJ potential, and a second-order tensor characterizing the quadrupole moment. Other rigid molecular models are described analogously.

The platform interoperability implementation developed on this basis employs JSON-LD to exchange information on molecular models. Therefore, the knowledge graph needs to be connected (i.e. it may not consist of multiple connected components), and its topology needs to be simplified to a tree structure such that each object is subordinate to exactly one object, except for a single root node at the top. For the present example, an osmo:workflow_graph with two sections, a use case (MODA Sect. 5.1) and a model (MODA Sect. 5.2), is selected as the root of the tree. In this way, e.g. one of the site coordinates is included in the rigid unit description as follows:

The hierarchy by which objects are embedded in other objects in JSON is obtained from a subset of the relations from the knowledge graph, shown in Fig. 5.8 as solid arrows in blue colour, while references to IRIs are used to represent the other relations (dashed black arrows) in JSON-LD. The relations vov:involves_object and vov:involves_variable are part of the JSON-LD tree structure (solid blue arrows), so that COM, SITE_LJ_A, and LJ_A_POS are all hierarchically subordinate to RIGID_UNIT. The relations vov:has_initial_point and vov:has_final_point, however, are sideways connections between nodes from multiple branches of the tree (dashed black arrows); therefore, their JSON-LD representation only points to the IRI of the referenced object, using the "@id" keyword.
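The distinction between embedding relations and IRI references can be sketched in Python. This is a minimal illustration under stated assumptions: the `ex:` IRIs are placeholders, the direct attachment of COM, SITE_LJ_A and LJ_A_POS to RIGID_UNIT is assumed for simplicity, @type entries are omitted, and the function `to_tree` is hypothetical rather than part of the platform implementation.

```python
import json

# Relations that define the JSON-LD hierarchy (solid blue arrows).
TREE_RELATIONS = {"vov:involves_object", "vov:involves_variable"}

# Hypothetical fragment of the knowledge graph for the rigid unit.
triples = [
    ("ex:RIGID_UNIT", "vov:involves_object", "ex:COM"),
    ("ex:RIGID_UNIT", "vov:involves_object", "ex:SITE_LJ_A"),
    ("ex:RIGID_UNIT", "vov:involves_variable", "ex:LJ_A_POS"),
    ("ex:LJ_A_POS", "vov:has_initial_point", "ex:COM"),
    ("ex:LJ_A_POS", "vov:has_final_point", "ex:SITE_LJ_A"),
]


def to_tree(root, triples):
    """Embed objects along tree relations; reference all others by @id."""
    node = {"@id": root}
    for subj, rel, obj in triples:
        if subj != root:
            continue
        if rel in TREE_RELATIONS:
            child = to_tree(obj, triples)  # embedded object
        else:
            child = {"@id": obj}           # sideways reference only
        node.setdefault(rel, []).append(child)
    return node


document = to_tree("ex:RIGID_UNIT", triples)
print(json.dumps(document, indent=2))
```

The recursion terminates because the tree relations are required to form a tree; a cycle among them would have to be detected separately.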

Notes

1. Notation: $2^{\mathbf {C}\cup \mathbb {R}\cup \Sigma ^ \star }$ is the power set (i.e. set of sets) over concept names for labelling individuals by class, reals for numerical datatype properties (including Booleans with 1 for true, 0 for false, as explicitly permitted by XSD) and words for textual datatype properties.
2. EMMO1s: https://purl.vimmp.eu/semantics/alignment/emmo1s.ttl (non-resolvable IRI), mirrored at http://www.molmod.info/semantics/emmo1s.ttl (resolvable URL).
3. VIPRS: https://purl.vimmp.eu/semantics/alignment/viprs.ttl (non-resolvable IRI), mirrored at http://www.molmod.info/semantics/viprs.ttl (resolvable URL).
4. A square of opposition, going back to Aristotle, is a diagram containing four related statements, concepts or predicates labelled A for universal affirmation (“all”), E for universal negation (“no”), I for existential affirmation (“some”) and O for existential negation (“not all”); cf. Westerståhl [14, 15] for a variety of applications.
5. $\mathsf {C}$ is reflexive, $\forall K \in \mathbf {I}: \, K \,\mathsf {C}\, K $.
6. It will require an explanation why the shorthand symbols $\mathsf {P}$ (“is proper part of”) and $\mathsf {S}$ (“is sign for”) are here assigned a meaning corresponding to the inverse of relations defined by the EMMO, which only include “has proper part” and “has sign,” respectively. While making the correspondence with the EMMO slightly more indirect, we find this to be more in line with common conventions.
   First, concerning mereology, parthood is a partial ordering relation; such relations are conventionally defined in terms of the operator meaning “is smaller than,” not “is greater than,” e.g. in description logic, where subsumption ($ \sqsubseteq $) rather than inclusion ($\sqsupseteq $) is employed as a primitive. The seminal papers on mereotopology by Smith [18], Varzi [10], as well as Smith and Varzi [19] unsurprisingly all define “is part of” as fundamental, which they denote by $\mathsf {P}$. We rely on proper rather than improper parthood here based on our empirical assessment that proper parthood is more frequently the most useful choice for ontology alignment, cf. Sect. 5.3, when the EMMO is considered as a target ontology, cf. Sect. 5.4.
   Second, concerning semiotics, when drawing a knowledge graph using the relation “is sign for”, the arrow points from the sign to the object, which is intuitive; with “has sign”, the object would have to point to the sign. In view of this, to avoid a counterintuitive notation that would encourage the misinterpretation of diagrams, the symbol $\mathsf {S}$ is here employed for “is sign for”.
7. In terms of the 4D spatiotemporal entities considered within the EMMO, $\mathsf {P}$ and ${\mathsf {P}}^ - $ are idempotent, since for any $ I \,\mathsf {P}\, J $ there is an $ I '$ such that $( I \,\mathsf {P}\, I ') \wedge ( I ' \,\mathsf {P}\, J )$ due to the continuum nature of spacetime. The EMMO explicitly permits items to be “void,” i.e. not to contain any physical matter, so that continuum nature can be assumed for EMMO spacetime even concerning properties that are subject to quantization. Hence, chains that contain $\mathsf {P}\circ \mathsf {P}$ or ${\mathsf {P}}^ - \circ \,{\mathsf {P}}^ - $ can be excluded.
8. There is a 4D spatiotemporal entity $ \Omega $ (“trajectory of the universe”) that encloses everything that exists within any given knowledge base; therefore, “$ I $ is a proper part of something (namely, $ \Omega $) that has $ J $ as a proper part” holds for all EMMO individuals $ I , J \ne \Omega $ from the knowledge base. Hence, $\mathsf {P}\circ {\mathsf {P}}^ - $ is (almost) complete, and any chains that contain it can be excluded.
9. EVI: https://purl.vimmp.eu/semantics/alignment/evi.ttl (non-resolvable IRI), mirrored at http://www.molmod.info/semantics/evi.ttl (resolvable URL).

References

F. Abad Navarro, J.A. Bernabé Diaz, A. García Castro, J.T. Fernández Breis, Semantic publication of agricultural scientific literature using property graphs. Appl. Sci. 10(3), 861 (2020)
E. Katis, H. Kondylakis, G. Agathangelos, V. Kostas, Developing an ontology for curriculum & syllabus, in Proc. ESWC, Satellite Events, ed. by A. Gangemi, A.L. Gentile, A.G. Nuzzolese, M. Rudolph, S. Maleshkova, H. Paulheim, J.Z. Pan, M. Alam (Springer, Cham, Switzerland, 2018), pp. 55–59
G.J. Schmitz, Microstructure modeling in integrated computational materials engineering (ICME) settings: can HDF5 provide the basis for an emerging standard for describing microstructures? JOM 68(1), 77–83 (2016)
P. de Buyl, P.H. Colberg, F. Höfling, H5MD: a structured, efficient, and portable file format for molecular data. Comput. Phys. Commun. 185, 1546–1553 (2014)
W. Colsman, R. Uphill, A portable data format for laboratory data (Sci. Comput. World, feature, 2015)
H. Oberkampf, H. Krieg, C. Senger, T. Weber, W. Colsman, Allotrope data format: semantic data management in life sciences, in Proceedings of SWAT4HCLS 2018, ed. by A. Splendani (2018)
M.T. Horsch, S. Chiacchiera, M.A. Seaton, I.T. Todorov, K. Šindelka, M. Lísal, B. Andreon, E.B. Kaiser, G. Mogni, G. Goldbeck, R. Kunze, G. Summer, A. Fiseni, H. Brüning, P. Schiffels, W.L. Cavalcanti, Ontologies for the Virtual Materials Marketplace. Künstl. Intell. 34(3), 423–428 (2020). https://doi.org/10.1007/s13218-020-00648-9
G. Goldbeck, E. Ghedini, A. Hashibon, G.J. Schmitz, J. Friis, A reference language and ontology for materials modelling and interoperability, in Proceedings of NWC 2019 (NAFEMS, Knutsford, UK, 2019), p. NWC_19_86
EMMC Coordination and Support Action, European Materials and Modelling Ontology (2020), https://github.com/emmo-repo/, https://emmc.info/emmo-info/. Accessed 8 Apr 2020
A.C. Varzi, Parts, wholes, and part-whole relations: the prospects of mereotopology. Data Knowl. Eng. 20, 259–286 (1996)
C.S. Peirce, Peirce on Signs: Writings on Semiotic (University of North Carolina Press, Chapel Hill, North Carolina, USA, 1991)
D. Lewis, New work for a theory of universals. Aust. J. Philos. 61(4), 343–377 (1983)
M.T. Horsch, S. Chiacchiera, B. Schembera, M.A. Seaton, I.T. Todorov, Semantic interoperability based on the European materials and modelling ontology and its ontological paradigm: mereosemiotics, in Proceedings of WCCM-ECCOMAS 2020, to appear (2021). https://doi.org/10.5281/zenodo.3902900
D. Westerståhl, The traditional square of opposition and generalized quantifiers. Stud. Logic (Beijing) 2, 1–18 (2008)
D. Westerståhl, Classical vs. modern squares of opposition, and beyond, in The Square of Opposition: A General Framework for Cognition, ed. by J.Y. Béziau, G. Payette (Peter Lang, Bern, Switzerland, 2012), pp. 195–229
F. Berto, M. Plebani, Ontology and Metaontology (Bloomsbury, London, UK, 2015)
M. Huth, M. Ryan, Logic in Computer Science: Modelling and Reasoning about Systems, 2nd edn. (Cambridge University Press, Cambridge, 2004)
B. Smith, Mereotopology: a theory of parts and boundaries. Data Knowl. Eng. 20(3), 287–303 (1996)
B. Smith, A.C. Varzi, Fiat and bona fide boundaries. Philos. Phenomenol. Res. 60(2), 103–119 (2000)
CEN-CENELEC Management Centre, Materials modelling: terminology, classification and metadata, in CEN Workshop Agreement 17284 (Brussels, Belgium, 2018)
M.T. Horsch, C. Niethammer, G. Boccardo, P. Carbone, S. Chiacchiera, M. Chiricotto, J.D. Elliott, V. Lobaskin, P. Neumann, P. Schiffels, M.A. Seaton, I.T. Todorov, J. Vrabec, W.L. Cavalcanti, Semantic interoperability and characterization of data provenance in computational molecular engineering. J. Chem. Eng. Data 65(3), 1313–1329 (2020)
B. Hu, B. Hu, Tower of Babel: Interoperability of ontologies for pervasive computing, in First International Symposium on Pervasive Computing and Applications, ed. by V. Callaghan, B. Hu, Z. Lin, H. Zhang (IEEE, Piscataway, New Jersey, USA, 2006), pp. 690–695
A. Iliadis, The tower of Babel problem: making data make sense with basic formal ontology. Online Inf. Rev. 43(6), 1021–1045 (2019)
C.H. Asuncion, M.J. van Sunderen, Pragmatic interoperability: a systematic review of published definitions, in Proceedings of EAI2N, WCC 2010, ed. by P. Bernus, G. Doumeingts, M. Fox (Springer, Heidelberg, Germany, 2010), pp. 164–175
M.T. Horsch, S. Chiacchiera, M.A. Seaton, I.T. Todorov, B. Schembera, P. Klein, N.A. Konchakova, Pragmatic interoperability and translation of industrial engineering problems into modelling and simulation solutions, in Proceedings of DAMDID 2020, to appear (2021). https://doi.org/10.5281/zenodo.3902873
B. Schembera, J.M. Durán, Dark data as the new challenge for big data science and the introduction of the scientific data officer. Philos. Technol. 33, 93–115 (2020)
M. Schoop, A. de Moor, J. Dietz, The pragmatic web: a manifesto. Commun. ACM 49(5), 75–76 (2006)
A.C. Varzi, Ontology: from philosophy to innovation in materials and manufacturing. Keynote address, in 2nd EU Workshop on Materials and Manufacturing Ontology (Brussels, 6 June 2019)
N.F. Noy, Semantic integration: a survey of ontology-based approaches. SIGMOD Rec. 33(4), 65–70 (2004)
J. Euzenat, P. Shvaiko, Ontology Matching, 2nd edn. (Springer, Heidelberg, 2013)
B. Bouchou, C. Niang, Semantic mediator querying, in Proceedings of IDEAS ’14, ed. by A.M. Almeida, J. Bernardino, E. Ferreira Gomes (ACM, New York, USA, 2014), pp. 29–38
D. Lembo, R. Rosati, V. Santarelli, D.F. Savo, E. Thorstensen, Mapping repair in ontology-based data access evolving systems, in Proceedings of IJCAI, ed. by C. Sierra (IJCAI, San José, California, USA, 2017), pp. 1160–1166
G. Xiao, D. Calvanese, R. Kontchakov, D. Lembo, A. Poggi, R. Rosati, M. Zakharyaschev, Ontology-based data access: a survey, in Proceedings of IJCAI, ed. by J. Lang (IJCAI, San José, California, USA, 2018), pp. 5511–5519
G. Fusco, L. Aversano, An approach for semantic integration of heterogeneous data sources. PeerJ Comput. Sci. 6, e254 (2020). https://doi.org/10.7717/peerj-cs.254
P. Ochieng, S. Kyanda, A statistically-based ontology matching tool. Distrib. Parall. Databases 36, 195–217 (2018)
F.M. Suchanek, S. Abiteboul, P. Senellart, PARIS: probabilistic alignment of relations, instances, and schema. Proc. VLDB Endow. 5(3), 157–168 (2011)
M. Koutraki, N. Preda, D. Vodislav, Online relation alignment for linked datasets, in Proceedings of ESWC 2017, ed. by E. Blomqvist, D. Maynard, A. Gangemi, R. Hoekstra, P. Hitzler, O. Hartig (Springer, Cham, Switzerland, 2017), pp. 152–168
L. Zhou, M. Cheatham, P. Hitzler, Towards association rule-based complex ontology alignment, in Proceedings of JIST 2019, ed. by X. Wang, F.A. Lisi, G. Xiao, E. Botoeva, LNCS, vol. 10249 (Springer, Cham, Switzerland, 2020), pp. 287–303
Y. Métivier, E. Sopena, Graph relabelling systems: a general overview. Comput. AI 16(2), 167–185 (1997)
B. König, D. Nolte, J. Padberg, A. Rensink, A tutorial on graph transformation, in Graph Transformation, Specifications, and Nets, ed. by R. Heckel, G. Taentzer, LNCS, vol. 12032 (Springer, Cham, 2018), pp. 83–104
D. Allemang, J. Hendler, Semantic Web for the Working Ontologist, 2nd edn. (Morgan Kaufmann, Waltham, Massachusetts, USA, 2011)
M.T. Horsch, S. Chiacchiera, M.A. Seaton, I.T. Todorov, R. Kunze, G. Summer, A. Fiseni, B. Andreon, A. Scotto Di Minico, E. Bayro Kaiser, G. Kanagalingam, S. Stephan, K. Šindelka, M. Lísal, J. Díaz Brañas, I. Pagonabarraga, M. Chiricotto, J.D. Elliott, P. Carbone, D. Toti, G. Mogni, G. Goldbeck, H. Brüning, P. Schiffels, W.L. Cavalcanti, Ontology-based semantic interoperability on the virtual materials marketplace, in Proceedings of the ISWC 2020 Demos and Industry Tracks, ed. by K. Taylor, R. Gonçalves, F. Lecue, J. Yan (CEUR-WS, Aachen, 2021), pp. 134–137. https://doi.org/10.5281/zenodo.3986825
S. Stephan, M.T. Horsch, J. Vrabec, H. Hasse, MolMod - an open access database of force fields for molecular simulations of fluids. Mol. Sim. 45(10), 806–814 (2019)
Å. Ervik, A. Mejía, E.A. Müller, Bottled SAFT: a web app providing SAFT-$\gamma $ Mie force field parameters for thousands of molecular fluids. J. Chem. Inf. Model. 56(9), 1609–1614 (2016)
K. Stöbener, P. Klein, M. Horsch, K. Küfer, H. Hasse, Parametrization of two-center Lennard-Jones plus point-quadrupole force field models by multicriteria optimization. Fluid Phase Equilib. 411, 33–42 (2016)
Article Google Scholar

Author information

Authors and Affiliations

High-Performance Computing Center Stuttgart, Stuttgart, Germany
Martin Thomas Horsch & Björn Schembera
STFC Daresbury Laboratory, Daresbury, Cheshire, UK
Silvia Chiacchiera
Fraunhofer IFAM, Bremen, Germany
Welchy Leite Cavalcanti

Corresponding author

Correspondence to Martin Thomas Horsch.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this chapter

Cite this chapter

Horsch, M.T., Chiacchiera, S., Cavalcanti, W.L., Schembera, B. (2021). Applications of the Metadata Standards. In: Data Technology in Materials Modelling. SpringerBriefs in Applied Sciences and Technology. Springer, Cham. https://doi.org/10.1007/978-3-030-68597-3_5

DOI: https://doi.org/10.1007/978-3-030-68597-3_5
Published: 20 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68596-6
Online ISBN: 978-3-030-68597-3
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)