1 Introduction

Widespread use of interactive 3D technologies has been recently enabled by the significant progress in hardware performance, the rapid growth in the available network bandwidth as well as the availability of versatile input-output devices. The 3D technologies have become increasingly popular in various application domains on the web, such as education, training, tourism, entertainment, social media and cultural heritage, significantly enhancing possibilities of presentation and interaction with complex data and objects. The primary element of any VR/AR system, apart from interface technologies, is interactive 3D content. Dependencies between components of interactive 3D content may include, in addition to its basic meaning and presentation form, also spatial, temporal, structural, logical and behavioral aspects. Hence, creating and composing interactive 3D content on the web are more complex and challenging tasks than in the case of typical web resources.

The potential of VR/AR applications accessible on the web can be fully exploited only if the interactive 3D content is created with efficient and flexible methods, which conform to the recent trends in the development of the web. In 2001, the W3C (World Wide Web Consortium) initiated the research on the semantic web, which aims at the evolutionary development of the current web towards a distributed semantic database linking structured content and documents. Semantic description of web content makes it understandable for both humans and computers, achieving a new quality in building web applications that can “understand” the meaning of particular components of content as well as their relationships, leading to much better methods of creating, searching, reasoning, combining and presenting web content.

The semantic web consists of content described with common schemes, ontologies and knowledge bases, which specify the meaning of particular content components at different levels of abstraction. In particular, in the domain of computer graphics, content may be described using concepts that are specific to 2D/3D modeling as well as concepts that are specific to an application or an application domain. Furthermore, the semantic description of content includes not only the knowledge (content properties, dependencies and constraints) that has been explicitly specified by the content designer, but also knowledge that has not been explicitly specified. Such hidden knowledge may be inferred from the available data in the knowledge discovery process and influence the final form of the content being modeled. The knowledge-based approach to modeling content liberates content developers from the specification of all content elements and the implementation of complex algorithms that specify content elements, e.g., using chains of properties of content elements or setting properties on the basis of multiple constraints on elements. However, although a few approaches have been proposed for semantic modeling of 3D content, they do not provide comprehensive solutions for conceptual knowledge-driven content creation.

The main contribution of this paper is a new approach to semantic creation of 3D content. The proposed solution leverages semantic web techniques to enable content creation by referring to the meaning of particular content components at different levels of abstraction, with regard to content properties, dependencies and constraints, which may be either explicitly specified or dynamically extracted on the basis of the available content representation.

The remainder of this paper is structured as follows: Section 2 provides an overview of the current state of the art in the domain of semantic modeling of 3D content. Sections 3, 4 and 5 present the proposed SEMIC approach, including the semantic 3D content representation and the method of semantic creation of 3D content with a comprehensive, illustrative example of conceptual knowledge-based content creation. In Sect. 6, the implementation of SEMIC is described. In Sect. 7, qualitative and quantitative evaluations of the approach are presented. Section 8 contains a discussion of the approach. Finally, Sect. 9 concludes the paper and indicates the possible directions of future research.

2 State of the art

Numerous works have been devoted to semantic description and semantic modeling of 3D content. The works can be categorized into three groups.

The works in the first group are mainly devoted to describing 3D content with semantic annotations to facilitate access to content properties. In [32], an approach to designing interoperable RDF-based semantic virtual environments with system-independent and machine-readable abstract descriptions has been presented. In [5, 6], a rule-based framework using MPEG-7 has been proposed for the adaptation of 3D content, e.g., geometry and texture degradation as well as filtering of objects. Content can be described with different encoding formats (in particular X3D), and it is annotated with an indexing model. In [36], integration of X3D and OWL using scene-independent ontologies and semantic zones has been proposed to enable querying 3D scenes at different levels of semantic detail. In [29], an approach to semantic description of architectural elements based on the analysis of architectural treatises has been proposed. In [25], searching for semantic correspondences between man-made 3D models and recognizing functional parts of the models have been addressed. In [10, 12, 14, 15], an approach to building semantic descriptions embedded in 3D web content and a method of harvesting semantic metadata from 3D web content have been proposed.

The second group encompasses works devoted to modeling of different aspects of 3D content, including geometry, appearance and behavior. In [23], an ontology providing elements and properties that are equivalent to elements and properties specified in X3D has been proposed. Moreover, a set of semantic properties has been proposed to enable description of 3D scenes with domain knowledge. However, the semantic conformance to X3D limits the possibilities of efficient modification of entire content layers, including multiple components related to a common aspect of the designed content, e.g., appearance or behavior.

In [43–45], a method of creating VR content on the basis of reusable elements with specific roles and behavior has been proposed. The method has been developed to enable 3D content design by non-IT specialists. This solution does not rely, however, on the semantic representation of content. The use of semantic techniques could further facilitate content creation by users who use arbitrarily selected application-specific ontologies and knowledge bases.

In [7, 40], an approach to generating virtual environments upon mappings of domain ontologies to particular 3D content representation languages (e.g., X3D) has been considered. The following three content generation stages are distinguished: specification of a domain ontology, mapping the domain ontology to a 3D content representation language, and generation of a final presentation. The solution stresses spatial relations (position and orientation) between objects in the scene. It enables mapping between application-specific objects and 3D content components, but it does not address complex logical relationships between application-specific concepts and 3D content components and properties. In particular, it is not possible to reflect compositions of low-level content properties and relations between content components by high-level (e.g., application-specific) elements (properties, individuals and classes) and combinations of such high-level elements. In addition, this approach does not enable separation of concerns between users involved in the process of modeling content.

In [20], a semantic model of virtual environments based on the MPEG-7 and MPEG-21 standards has been proposed to enable dynamic scaling and adapting the geometry and functions of virtual objects. In [37], an approach to semantic modeling of indoor scenes with an RGBD camera has been presented.

Several approaches have been proposed to enable modeling of 3D content by example. The approach proposed in [19] includes segmentation of 3D models, searching for models with parts that match queries by semantic properties and composition of parts of the models into new models. The approach proposed in [3] enables parameterized exploration and synthesis of 3D models based on semantic constraints such as size, length, contact and symmetry. In [48], an approach to generating 3D models on the basis of symmetric arrangements of other models has been proposed. The approach and the system presented in [9] leverage semantic attributes, which are selected by the user in the content creation process and describe created models with different strengths determining their final form.

Several works have been conducted on modeling behavior of VR objects. The approach proposed in [34, 35] facilitates modeling of complex content behavior by providing temporal operators, which may be used for combining primitive behaviors. A rule-based ontology framework for feature modeling and consistency checking has been presented in [47]. In [21], an ontology-based approach to creating virtual humans as active semantic entities with features, functions and interaction skills has been proposed.

Finally, the third group encompasses works that have been devoted to the use of semantic descriptions of 3D content in artificial intelligence systems. The idea of semantic description of 3D worlds has been summarized in [27]. In [39], a review of the main aspects related to the use of 3D content in connection with the semantic web techniques has been provided. In [4], diverse issues arising from combining AI and virtual environments have been reviewed. In [8, 30], abstract semantic representations of events and actions in AI simulators have been presented. In [26, 28, 46], a technique of integration of knowledge into VR applications, a framework for decoupling components in real-time intelligent, interactive systems with ontologies and a concept of semantic entities in VR applications have been discussed. In [38], a camera controlling approach to exploration of virtual worlds in real time by using topological and semantic knowledge has been proposed.

3 The SEMIC approach

Although several approaches have been proposed for semantic modeling of 3D content, they lack general and comprehensive solutions for modeling of interactive 3D content on the web. Recent trends in the development of the web provide new requirements for efficient and flexible content creation, which go beyond the current state of the art in modeling of 3D content.

  1. The approach to content creation should enable declarative modeling of 3D content stressing the specification of the results to be presented, but not the way in which the results are to be achieved.

  2. Content creation should be supported by discovery of hidden knowledge covering content properties, dependencies and constraints, which are not explicitly specified, but which may be extracted from the explicit data, and which have impact on the modeled content.

  3. The approach should enable conceptual modeling of content components and properties at arbitrarily chosen levels of abstraction, including both the aspects that are directly related to 3D content and the aspects that are specific to a particular application or domain.

  4. The approach should enable decoupling of modeling activities related to different parts of content representation, enabling separation of concerns between different modeling users with different expertise, who are equipped with different modeling tools, e.g., to facilitate content creation by domain experts who are not IT specialists.

  5. Content created with the approach should be independent of particular hardware and software platforms to enable creation of multi-platform 3D content presentations.

In this paper, an approach to semantic modeling of interactive 3D content (SEMIC) is proposed. SEMIC covers various aspects of 3D content such as geometry, structure and space, appearance, scene, animation and behavior. In the approach, semantic web techniques are applied to satisfy the aforementioned requirements. SEMIC combines two elements, which have been partially described in previous works. The first element is the semantic content model (SCM) proposed in [11, 13, 17], which provides concepts that enable 3D content representation at different (arbitrarily chosen) levels of abstraction. SCM allows for incorporation of application-specific knowledge in content representations in order to simplify the content creation process. Such application-specific knowledge (at different levels of abstraction/semantics) is modeled using ontologies. The second element of SEMIC is the semantic content creation method (SCCM) proposed in [16], which consists of a sequence of steps, in which the concepts of SCM are used to create desirable content. Some of the steps in SCCM need to be performed manually by a human, whereas the other steps are accomplished automatically—by algorithms that transform different content representations, thus enabling transitions between the subsequent steps of SCCM. The particular elements of SEMIC are described in the following sections.

4 The semantic content model

In [13, 17], the SCM has been proposed. It is a collection of ontologies that enable semantic representation of 3D content at different levels of abstraction—low-level concrete content representation (CrR), which reflects elements that are directly related to 3D content, and arbitrarily high-level conceptual content representation (CpR), which reflects elements that are abstract in the sense of their final representation—they are not directly related to 3D content (Fig. 1). Both representations are knowledge bases based on semantic concepts (classes and properties), which are used for specifying semantic individuals, their properties and relations between them.

Fig. 1 The semantic content model (SCM)

4.1 Concrete representation of 3D content

A concrete representation of 3D content (CrR) is a knowledge base built according to the multi-layered semantic content model (ML-SCM), proposed in [17]. The model enables separation of concerns between several layers corresponding to distinct aspects that are specific to 3D content—geometry layer, structure and space layer, appearance layer, scene layer, animation layer and behavior layer.

The model encompasses concepts (classes and properties) widely used in well-established 3D content representation languages and programming libraries, such as X3D, VRML, Java3D and Away3D. The concepts are assigned to different layers depending on their role. The geometry layer introduces basic uniform individual geometrical components and their properties, e.g., planes, meshes and coordinates. The structure and space layer introduces complex structural components, which assemble geometrical components, allowing for definition of spatial dependencies between them, e.g., position, orientation and size. The appearance layer adds appearance to geometrical and structural components, e.g., color, transparency and texture. The scene layer extends structural components to navigable scenes with viewpoints. The animation and behavior layers enrich components, which have been defined in the previous layers, with animation and behavior. The layers are partly dependent—every layer uses its concepts and concepts specified in the lower layers.
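The layer dependency described above can be illustrated with a minimal Python sketch. It is not part of ML-SCM itself: the concept names, their assignment to layers and the strict bottom-to-top ordering of the six layers (in particular, placing animation below behavior) are assumptions made only for illustration.

```python
from enum import IntEnum


class Layer(IntEnum):
    """ML-SCM layers in the order listed in the text (ordering is an assumption)."""
    GEOMETRY = 1
    STRUCTURE_AND_SPACE = 2
    APPEARANCE = 3
    SCENE = 4
    ANIMATION = 5
    BEHAVIOR = 6


# Hypothetical assignment of a few concepts to layers, for illustration only.
CONCEPT_LAYER = {
    "Mesh": Layer.GEOMETRY,
    "Position": Layer.STRUCTURE_AND_SPACE,
    "Texture": Layer.APPEARANCE,
    "Viewpoint": Layer.SCENE,
    "Interpolator": Layer.ANIMATION,
    "TouchSensor": Layer.BEHAVIOR,
}


def may_use(user: str, used: str) -> bool:
    """Return True if `user` may refer to `used`.

    A concept may use concepts of its own layer and of any lower layer,
    but not concepts of a higher layer.
    """
    return CONCEPT_LAYER[used] <= CONCEPT_LAYER[user]


if __name__ == "__main__":
    print(may_use("Texture", "Mesh"))   # appearance may refer to geometry
    print(may_use("Mesh", "Texture"))   # geometry may not refer to appearance
```

Under these assumptions, a check of this kind could be used to validate that a concrete representation does not introduce upward references between layers.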

4.2 Conceptual representation of 3D content

A conceptual representation of 3D content (CpR) is a knowledge base compliant with an application-specific ontology. A CpR consists of application-specific individuals, which are described by application-specific properties. Application-specific concepts may represent the created 3D content at an arbitrarily high (arbitrarily chosen) level of semantic abstraction. The concepts are abstract in the sense of their final presentation, as, in general, they can be presented in various manners (e.g., 2D graphics, 3D models and text). The concepts do not need to cover any aspects that are specific to 3D content, or such aspects do not need to be indicated directly. For instance, an abstract (conceptual) car does not have to be specified as a particular 3D shape, though it may be implicitly considered as such in terms of its final presentation, e.g., by belonging to a particular sub-class of cars (delivery van, limousine, etc.). Various dependencies may be specified for individual concepts using semantic web standards (RDF, RDFS and OWL [42]), in particular multiple inheritance, restrictions on members of classes as well as domains and ranges of properties. Since various application-specific ontologies may be used in the proposed approach, neither creation nor selection of them is addressed in this paper. CrRs and CpRs are linked by semantic representation mappings (RMs).
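The way class-level dependencies such as multiple inheritance make implicit facts available can be sketched in a few lines of Python. The sketch mimics only the transitive subclass entailment of RDFS. The class names (including `CargoCarrier` and `Vehicle`) and the individual `van1` are hypothetical, and a real CpR would be processed by a semantic reasoner rather than by this code.

```python
def superclasses(cls, subclass_of):
    """Return all direct and indirect superclasses of `cls`."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def types_of(individual, asserted_type, subclass_of):
    """Return the asserted class of an individual plus every inferred class."""
    direct = asserted_type[individual]
    return {direct} | superclasses(direct, subclass_of)


# Hypothetical application-specific ontology with multiple inheritance.
SUBCLASS_OF = {
    "DeliveryVan": {"Car", "CargoCarrier"},
    "Limousine": {"Car"},
    "Car": {"Vehicle"},
}

if __name__ == "__main__":
    # Only the most specific class is asserted; the others are inferred.
    print(sorted(types_of("van1", {"van1": "DeliveryVan"}, SUBCLASS_OF)))
```

In this sketch, the conceptual car is never given a 3D shape explicitly; any presentation attached to `Car` or `DeliveryVan` elsewhere would apply to it through the inferred class membership.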

4.3 Representation mapping

The goal of mapping is to make application-specific concepts (included in an application-specific ontology), which are abstract in terms of presentation, presentable by the use of the concrete concepts (included in ML-SCM), which are specific to 3D content.

SEMIC does not restrict the acceptable kinds or domains of application-specific ontologies; thus it may be used for creating contents of different types, for different domains and applications. In general, covering a particular application-specific ontology may be difficult, as it may require the specification of a large number of rules corresponding to various cases and contexts of the use of ontological concepts. Therefore, in the SEMIC approach, the general classes of cases and contexts of the use of particular concepts (e.g., domains and ranges of properties) are expected to be precisely specified by the domain experts who will use the ontology for 3D modeling. The purpose is to cover only well-specified cases and contexts of the use of the selected application-specific concepts, but not to cover all (potentially) possible use cases and contexts of all concepts available in the ontology. As the use cases and contexts are already well defined, the application-specific concepts may be mapped to 3D-specific concepts by representation mappings (RMs)—similarly to encapsulating low-level functions behind high-level objects’ interfaces in object-oriented programming.

An RM is a knowledge base that links CrRs to CpRs. An RM complies with the semantic mapping model (SMM), which has been proposed in [13]. Each mapping assigns concrete representation concepts to application-specific concepts. Linking application-specific classes and properties used in a CpR to particular concepts of a CrR improves the efficiency of modeling and the reusability of the application-specific concepts, in contrast to defining individual concrete representations for particular application-specific objects and scenes. Mapping is performed using mapping concepts.

The following mapping concepts are distinguished in SMM: presentable objects (POs), data properties (DPs) with literals, object properties (OPs) with descriptive individuals (DIs), descriptive classes (DCs) and relations (RLs).

Every class from an application-specific ontology whose individuals are primary entities to be presented in the created content, is specified as a presentable object (PO) class, e.g., artifacts in a virtual museum exhibition, avatars in an RPG game or UI controls. For each PO class, various concrete representation properties related to geometry, structure and space, appearance, scene, animation and behavior can be specified.

POs may be described by application-specific properties represented by data properties (DPs), which indicate application-specific features of the POs (shape, material, behavior, etc.) that may be expressed by literal values (e.g., ‘big cube’, ‘wood’, ‘flying object’).

Descriptors are a functional extension of DPs, as they gather multiple properties of POs. The properties assigned to a descriptor do not describe the descriptor itself; they describe the PO to which the descriptor is linked, so descriptors only carry properties. Unlike POs, descriptors do not have individual 3D representations. There are two types of descriptors. Descriptive classes (DCs) are application-specific classes that may be assigned to POs to specify some concrete properties of them, e.g., a class of interactive rotating objects includes POs that rotate after being touched; such POs have common concrete properties related to interaction and animation. Descriptive individuals (DIs) are instances of classes that are linked to the described POs by object properties (OPs). For example, a piece of furniture (a PO) can be made of (an OP) different types of wood (DIs), each of which is described by a few DPs such as color, shininess, texture, etc.

A relation (RL) is an application-specific property or an application-specific individual that links different POs occurring in the created content. Every RL has at least two parts (participants), which are connected one to another by mutual dependencies related to some concrete properties of 3D content, e.g., a relation that specifies the relative position of some POs links these POs and determines their relative orientations and distances between them.
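The roles of POs, DPs, DCs, DIs and OPs described above can be sketched as plain Python data structures. This is only an illustration of how descriptors carry properties for the PO they are attached to. The class names, field names and property values are hypothetical and do not reproduce SMM.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """A DC or a DI: carries properties but has no 3D representation of its own."""
    name: str
    properties: dict


@dataclass
class PresentableObject:
    """A PO described by DPs, DCs and OPs that point to DIs."""
    name: str
    data_properties: dict = field(default_factory=dict)      # DPs with literals
    descriptive_classes: list = field(default_factory=list)  # DCs
    object_properties: dict = field(default_factory=dict)    # OP name -> DI

    def effective_properties(self) -> dict:
        """Collect the properties that describe this PO.

        Properties of descriptors are attributed to the PO, not to the
        descriptors themselves.
        """
        merged = dict(self.data_properties)
        for dc in self.descriptive_classes:
            merged.update(dc.properties)
        for di in self.object_properties.values():
            merged.update(di.properties)
        return merged


if __name__ == "__main__":
    oak = Descriptor("oak", {"color": "brown", "shininess": 0.2})        # a DI
    rotating = Descriptor("InteractiveRotating", {"onTouch": "rotate"})  # a DC
    chair = PresentableObject(
        "chair",
        data_properties={"shape": "big cube"},
        descriptive_classes=[rotating],
        object_properties={"madeOf": oak},
    )
    print(chair.effective_properties())
```

Relations (RLs) are left out of the sketch, since they link several POs rather than describe a single one.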

5 The semantic content creation method

In [16], the semantic content creation method (SCCM) has been proposed. The method enables flexible creation of 3D content at an arbitrarily chosen level of abstraction by leveraging the particular parts of SCM, which has been explained in the previous section. Creation of 3D content with SCCM consists of a sequence of steps, which correspond to different levels of semantic abstraction of the created content—design of a CrR, mapping the CrR to application-specific concepts, design of a CpR, expanding the semantic representation and building the final content representation (Fig. 2). In SCCM, succeeding steps depend on the results of their preceding steps.

The first three steps are performed manually by a developer or a domain expert. These steps produce knowledge bases that conform to different parts of SCM. These steps may be performed using a typical semantic editor (e.g., Protégé); however, the development of a specific visual semantic modeling tool is also possible. The other two steps are accomplished automatically. They precede the final 3D content presentation, which may be performed using different 3D content browsers and presentation tools. The following sections describe the subsequent steps of the modeling process, along with an example, in which different 3D content components are created and assembled into a 3D scene of a virtual museum of agriculture.

Fig. 2 Creation of 3D content based on the SEMIC approach

5.1 Step 1: design of a concrete content representation

The design of a CrR provides basic components of 3D content that are a foundation for presentation of application-specific concepts, which will be further used in Step 3. A CrR is a knowledge base compliant with the ML-SCM (cf. Sect. 4.1). The elements of a CrR are concrete components—concrete classes and concrete properties that are directly related to 3D content and whose formation is, in general, complex—it may require the use of additional specific hardware or software tools. For instance, the creation of a 3D mesh requires the use of a 3D scanner or a 3D modeling tool, while drawing a texture requires a 2D graphical editor. The design of a CrR may cover different layers of the ML-SCM, e.g., the design of a mesh is related to the geometry layer, while the design of a motion trajectory is related to the animation layer. CrRs represent neither particular coherent scenes nor particular compositions of objects, but they include (possibly independent) templates of reusable content components that may be flexibly composed into complex 3D objects and 3D scenes.

In most cases, concrete components need to be designed with regard to the application-specific concepts that are to be presented, e.g., a particular 3D mesh represents specific car models, a particular texture represents wood surface, etc. Hence, they need to be created in collaboration with domain experts, who will further use the components in Step 3.

Every component created in this step is represented by a class or a structure linking classes, which is described by properties. Every class is a subclass of an appropriate ML-SCM class, which is specific to 3D modeling, and it will be used to represent low-level objects in the created content (meshes, materials, viewpoints, events, complex objects, etc.) that have common values of properties. Data and object properties may be specified for classes linking the classes with literal values and other classes. Literal values may be directly used (interpreted) in the content representation (e.g., when reflecting coordinates or color maps) or they may indicate external data sets (e.g., paths to documents including meshes or images). The specification of complex values of data properties is done using external tools, and it is beyond the scope of this paper.

This step of modeling is typically performed by a developer with technical skills in 3D modeling, who is equipped with specific additional software (e.g., a 3D modeling tool) or hardware (e.g., a 3D scanner) for creating visual (2D or 3D), haptic or aural elements.

Listings 1–5 and Figs. 3–5 show a simplified example of using SEMIC for conceptual knowledge-based 3D content creation. In Step 1 in the example, a developer creates several virtual objects, which represent real museum artifacts. First, the developer uses specific modeling tools to create graphical elements required for low-level content representation: a 3D scanner to capture the geometry of a granary, a woman statuette, a sower, a smoker, a ring, a seal and a badge, and a 2D graphical tool to prepare textures for selected models (Fig. 3). Second, the developer uses a semantic modeling tool (which may be a plug-in to a modeling package, e.g., Blender or 3ds Max) to create a CrR, which semantically reflects the created graphical elements. Listing 1 presents an example CrR encoded in the RDF Turtle format. Some content components and properties that are not crucial for the presented example are skipped. The prefixes used in the listing correspond to different semantic models and content representations. For every model and every texture, the tool generates OWL restrictions, which are classes with specific values of properties. The classes will be mapped in Step 2 to 3D objects and materials used by domain experts in Step 3. In the example, the woman statuette is to be used by domain experts in three different forms: as wooden and glassy virtual artifacts (lines 4–26) and as a painted virtual artifact with the inherent texture mapping (28–35). The other models are to be used in single forms (lines 37–45). In addition, the developer creates the crr:TouchSensor (47–51), which activates the crr:RotatingInterpolator (53–63) controlled by the crr:TimeSensor (65–69), which will enable the artifacts in virtual museum scenes to be rotated after being touched.

Fig. 3 An example of concrete content components

5.2 Step 2: mapping content representations

Mapping a CrR (created in Step 1) to application-specific concepts enables 3D presentation of application-specific knowledge bases (created in Step 3) by linking them to concrete components of 3D content included in the CrR. The result of this step is an RM, which is comprised of mapping concepts that inherit from concepts defined in SMM (cf. Sect. 4.3). Mapping is performed once for a particular application-specific ontology and a CrR, and it enables the reuse of concrete components for forming 3D representations of various application-specific knowledge bases, which conform to the application-specific ontology selected. An RM needs to cover all concepts (classes and properties) of the application-specific ontology that need to have representations in conceptually modeled 3D content.

This step of modeling may be performed using a typical semantic editor and it does not require the use of additional (complex) specific 3D modeling hardware or software, as it consists of operations such as linking a previously created animation to an object or including sub-objects within a complex structural object. However, in terms of the semantic structures that need to be created, mapping is more complex in comparison to the design of a CrR, and it requires more semantic expressiveness. A specific visual mapping tool could be developed to simplify this step.

This step of modeling is typically performed by a developer or a technician, who is equipped with a semantic editor and has basic skills in semantic modeling.

In Step 2 in the example, a developer or a technician creates an RM (Listing 2, in the RDF Turtle and Prolog-like syntax) including semantic statements linking application-specific concepts (used in Step 3) to components of the CrR (created in Step 1). The aso:Woman (5–6) artifact is a PO class and a subclass of the crr:WomanMesh, so it inherits its properties related to the geometry, which were specified in the previous step. Every instance of this class may be made of wood or glass (as indicated by the aso:madeOf DP), thus having an appropriate material assigned using proper DCs (7–16). In contrast to wooden and glassy artifacts, every aso:PaintedWoman PO has a texture assigned, as indicated by its super-class (17–18). Mapping the other classes of the application-specific ontology to PO classes has been performed in a similar fashion (19–21). Moreover, two basic shapes (the rm:Box and the rm:Cylinder) are designed (23–32) and assembled into the aso:Stool PO class (33–40). Every rm:Box and rm:Cylinder included in an aso:Stool have dimensions and relative positions specified (41–52). To enable rotation of virtual museum artifacts after being touched, every artifact is linked to a crr:TouchSensor using a DC (54–58). Furthermore, two RLs have been specified. The aso:incorporates RL is equivalent to the mlscm:includes (60–62), while the aso:standsOn RL determines the x, y and z coordinates of the object by semantic rules (64–81). The created mapping is depicted in Fig. 4.
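How a relation such as aso:standsOn might determine coordinates can be sketched as follows. The actual rules of Listing 2 are not reproduced here. The sketch assumes that an object standing on a support takes the x and z coordinates of the support and is raised along the y axis by the height of the support; the function name, the data layout and the numeric values are hypothetical.

```python
def apply_stands_on(positions, heights, stands_on):
    """Derive positions of objects from the positions of their supports.

    positions: dict mapping a name to its (x, y, z) coordinates
    heights:   dict mapping a support name to its height
    stands_on: list of (object, support) pairs

    Assumption for this sketch: the object shares x and z with its support
    and is placed at the top of the support along the y axis.
    """
    result = dict(positions)
    for obj, support in stands_on:
        x, y, z = result[support]
        result[obj] = (x, y + heights[support], z)
    return result


if __name__ == "__main__":
    positions = {"stool1": (1.0, 0.0, 2.0)}
    heights = {"stool1": 0.5}
    print(apply_stands_on(positions, heights, [("ring", "stool1")]))
```

The point of the sketch is that the domain expert states only the relation between two objects, while the coordinates follow from a rule defined once in the mapping.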

Fig. 4 Example mapping of concrete content components to application-specific concepts

5.3 Step 3: design of a conceptual content representation

The design of a CpR enables declarative creation of 3D content at an arbitrary level of abstraction that is permitted by the application-specific ontology selected. This step can be performed multiple times for a particular application-specific ontology, a CrR and an RM when new 3D content is required for a particular, specific 3D/VR/AR application. This step of modeling focuses on application-specific semantic concepts and does not cover concrete components of 3D content, which are hidden behind the RM.

A CpR, which is a knowledge base compliant with the application-specific ontology, consists of semantic statements (facts) and semantic rules (implications), which declaratively represent content at a conceptual level of abstraction. Both the statements and the rules are built upon application-specific concepts and objects. A CpR explicitly specifies properties and relations between content objects as well as constraints on object properties and object relations that will be further used in knowledge discovery in the next step of SCCM. In contrast to CrRs, which include possibly independent components, CpRs reflect coherent 3D scenes or complex 3D objects.

This step is typically performed by a domain expert, who is not required to have advanced technical skills. A domain expert uses an application-specific ontology to focus only on application-specific semantic concepts and does not need to work with concrete components of 3D content. A domain expert may be equipped with a semantic editor. However, a visual semantic modeling tool could also be developed.

In general, this step of modeling is independent of the steps described previously, and a CpR may be created before a CrR and an RM are created, e.g., when a domain expert designs an accurate digital equivalent to a known real object. However, when designing non-existing objects or scenes (e.g., a planned virtual museum exhibition) the availability of the CrR and the RM may be desirable to enable the preview of the results during the modeling.

figure e

In Step 3 in the example, a domain expert creates a CpR (listing 3) including instances of application-specific concepts (classes and properties) that have been mapped to concrete content components and properties included in the CrR. The domain expert creates several artifacts (4–5) and three woman statuettes (6–9). The first two statuettes are made of different materials (wood and glass), while the third statuette is inherently covered by a texture (as specified in the RM). Furthermore, eight stools are created (11), and x, y and z coordinates are declaratively specified for them by assertions (13–21). In line 20, a cut-off and a negation-as-failure are used to determine a stool, for which no x coordinate has been specified. Next, in one declaration all artifacts are assigned to stools (23–33). Finally, all artifacts and stools are incorporated in the cpr:granary (35–38).

5.4 Step 4: expanding the semantic representation

The first three steps of the modeling process provide a comprehensive semantic representation of 3D content at different levels of semantic abstraction. This representation covers all modeling elements that must be resolved by a human—the specification of how application-specific concepts should be reflected by concrete components and the specification of what application-specific individuals should be included in the created content. The remaining steps of the content creation process may be completed automatically.

So far, the concrete semantic components of 3D content (designed in Step 1) are assigned to application-specific concepts (classes and properties) by mapping concepts (designed in Step 2), but they are not directly linked to application-specific individuals (instances of classes and instances of properties—designed in Step 3). To enable presentation of the application-specific individuals, the overall semantic content representation (encompassing the CpR and the CrR) is expanded according to the RM, and the concrete components are linked to the application-specific individuals in the following three transformation stages. In the first stage, reasoning is performed to discover the hidden OPs, which determine the structure of the created content. In the second stage, a structure linking semantic individuals is created for every PO on the basis of the OPs discovered. Finally, in the third stage of the expanding process, DPs are discovered for the created semantic individuals to determine the presentational effects of the POs. The exact description of the expanding algorithm is out of the scope of this paper.
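The order of the three transformation stages can be pictured with a toy sketch. The Python below is not the SEMIC expanding algorithm, which is not described here; it only mirrors the stage order under simplifying assumptions. Facts are plain triples, the RM is reduced to a hypothetical dictionary, and the names `expand`, `aso:Artifact` and every `ex:` property are illustrative and do not come from the ontologies used in the example.

```python
# Toy illustration of the three expansion stages; NOT the SEMIC expander.
# The rule format, aso:Artifact and all ex: properties are hypothetical.

cpr = {
    ("cpr:stool1", "rdf:type", "aso:Stool"),
    ("cpr:woman1", "rdf:type", "aso:Artifact"),
    ("cpr:woman1", "aso:standsOn", "cpr:stool1"),
}

# Hypothetical stand-in for an RM:
# application class -> (concrete class, component classes, property values)
rm = {
    "aso:Stool": ("mlscm:StructuralComponent",
                  ["mlscm:Cylinder", "mlscm:Box"], {"ex:height": 0.5}),
    "aso:Artifact": ("mlscm:Mesh3D", ["mlscm:TouchSensor"], {"ex:scale": 1.0}),
}


def discover_ops(facts):
    """Stage 1: derive links that were not stated explicitly.

    Here the only 'hidden' link is an assumed inverse of aso:standsOn.
    """
    derived = {(o, "ex:supports", s) for s, p, o in facts if p == "aso:standsOn"}
    return facts | derived


def build_structure(facts, mapping):
    """Stage 2: create linked concrete individuals for every mapped individual."""
    out = set(facts)
    for s, p, o in sorted(facts):
        if p == "rdf:type" and o in mapping:
            concrete, components, _ = mapping[o]
            out.add((s, "rdf:type", concrete))
            for i, component in enumerate(components):
                node = f"{s}_part{i}"
                out.add((node, "rdf:type", component))
                out.add((s, "mlscm:includes", node))
    return out


def set_dps(facts, mapping):
    """Stage 3: attach property values that determine the presentation."""
    out = set(facts)
    for s, p, o in sorted(facts):
        if p == "rdf:type" and o in mapping:
            for prop, value in mapping[o][2].items():
                out.add((s, prop, value))
    return out


def expand(facts, mapping):
    """Run the three stages in the order described in the text."""
    return set_dps(build_structure(discover_ops(facts), mapping), mapping)


er = expand(cpr, rm)
```

In this sketch the result is a superset of the input facts, which keeps the link between application-specific individuals and the generated concrete individuals visible.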

As a result of the above transformation stages, the CpR is transformed to an expanded content representation (ER). The created ER is equivalent to the original CpR in terms of the represented content, but the two representations use different levels of abstraction. While a CpR reflects the created content using only application-specific concepts (abstract in terms of presentation), an ER is a counterpart to a CpR that reflects the content using only the ML-SCM concepts (directly specific to the 3D domain). An ER is a structure of linked concrete individuals, which are described by concrete properties.

figure f

In Step 4 in the example, the overall semantic content representation (including the CrR and the CpR) is automatically expanded according to the RM. Consequently, new semantic individuals are generated and linked to the individuals of the CpR by OPs, and their DPs are set properly (listing 4). All artifacts are specified as mlscm:Mesh3D individuals (4). For the wooden and glass woman statuettes (6–9), appropriate individuals reflecting materials are generated (11–16). The cpr:paintedWoman and its material are created in a similar way (18–22). Furthermore, for every artifact (24), an mlscm:TouchSensor is generated (27–30). Every mlscm:TouchSensor activates an mlscm:RotatingInterpolator (32–37), which is controlled by an mlscm:TimeSensor (39–41). Next, every aso:Stool is expanded to an mlscm:StructuralComponent that includes an mlscm:Cylinder and an mlscm:Box with appropriate dimensions and relative positions (43–61). For every aso:Stool a position is determined (63–68), and artifacts are assigned to stools by getting appropriate positions (70–75) according to the constraints (declarative rules) specified in the previous step. Finally, all objects (artifacts and stools) are included in the cpr:granary (77).

5.5 Step 5: building the final content representation

The last step of the content creation process is a transformation of an ER (including the concrete semantic components) to a final content representation (including final 3D counterparts of the concrete components), which is encoded in a particular 3D content representation language. This part of the content creation process can be performed automatically with a transformation knowledge base that links concrete components to their corresponding final counterparts. The transformation can cover a wide range of target presentation platforms based on either declarative (e.g., VRML, X3D and XML3D) or imperative (e.g., Java, ActionScript and JavaScript) content representation languages. Building final 3D content presentations on the basis of semantic knowledge bases has been explained in detail in [18].

In Step 5 in the example, a final 3D scene is generated on the basis of the ER created in the previous step (listing 5, in the X3D/XML syntax). For every artifact, a Transform node with a position indicated by the translation attribute is generated (e.g., 8–24). It includes a Shape node with a material and an optional texture. Moreover, every artifact is equipped with a TouchSensor, an OrientationInterpolator and a TimeSensor, which are connected by ROUTE nodes and enable the rotation of the artifact after it is touched. Stools are generated as Transform nodes including two shapes, a Cylinder and a Box, with positions and scales (e.g., 49–62). All objects are enclosed in a common Transform node (6–65). The final 3D scene generated is presented in Fig. 5.

figure g
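The node structure described for an artifact can be sketched with the Python standard library. This is not the SEMIC compiler and does not reproduce listing 5; it is a minimal illustration, assuming a Box as placeholder geometry, invented DEF names, and an arbitrary material colour and rotation cycle. The element and field names (Transform, Shape, Appearance, Material, TouchSensor, TimeSensor, OrientationInterpolator, ROUTE, touchTime, fraction_changed, value_changed) are those of X3D.

```python
import xml.etree.ElementTree as ET


def add_artifact(parent, name, position):
    """Append one touch-rotatable artifact (a Transform plus its ROUTEs) to parent."""
    transform = ET.SubElement(
        parent, "Transform", DEF=name,
        translation=" ".join(str(c) for c in position))
    shape = ET.SubElement(transform, "Shape")
    # A Material is attached to a Shape through an intermediate Appearance node.
    appearance = ET.SubElement(shape, "Appearance")
    ET.SubElement(appearance, "Material", diffuseColor="0.6 0.4 0.2")
    ET.SubElement(shape, "Box", size="0.2 0.5 0.2")  # placeholder geometry
    ET.SubElement(transform, "TouchSensor", DEF=name + "_touch")
    ET.SubElement(transform, "TimeSensor", DEF=name + "_timer", cycleInterval="4")
    ET.SubElement(transform, "OrientationInterpolator", DEF=name + "_rot",
                  key="0 0.5 1",
                  keyValue="0 1 0 0 0 1 0 3.1416 0 1 0 6.2832")
    # Touch starts the timer, the timer drives the interpolator,
    # and the interpolator rotates the enclosing Transform.
    routes = [
        (name + "_touch", "touchTime", name + "_timer", "set_startTime"),
        (name + "_timer", "fraction_changed", name + "_rot", "set_fraction"),
        (name + "_rot", "value_changed", name, "set_rotation"),
    ]
    for from_node, from_field, to_node, to_field in routes:
        ET.SubElement(parent, "ROUTE", fromNode=from_node, fromField=from_field,
                      toNode=to_node, toField=to_field)
    return transform


def build_scene(artifacts):
    """Enclose all artifacts in one common Transform node."""
    x3d = ET.Element("X3D", profile="Immersive", version="3.3")
    scene = ET.SubElement(x3d, "Scene")
    root = ET.SubElement(scene, "Transform", DEF="granary")
    for name, position in artifacts:
        add_artifact(root, name, position)
    return x3d


scene = build_scene([("woman1", (0, 0.5, 0)), ("woman2", (1, 0.5, 0))])
x3d_text = ET.tostring(scene, encoding="unicode")
```

The three ROUTE elements per artifact make explicit the chain that the text summarizes as rotation after touch.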

Fig. 5 The example final 3D content representation

6 Implementation

SCM, SMM as well as semantic content representations have been implemented using the semantic web standards (RDF, RDFS and OWL), which express facts as semantic statements, as well as the Prova declarative language [24], which expresses semantic rules as Horn clauses. The restrictive use of the formally specified semantic web standards (the Resource Description Framework, RDF; the Resource Description Framework Schema, RDFS; and the Web Ontology Language, OWL) in SEMIC is preferred over the use of other concepts (in particular rules, which have high semantic expressiveness) for two reasons. First, the semantic web standards provide concepts that are widely accepted on the web and can be processed using well-established tools, such as editors and reasoners. Second, complexity measures have been investigated and specified for these standards, including a number of typical reasoning problems (such as ontology consistency, instance checking and query answering) [41], which allows for building applications with more predictable computational time.

The steps of SCCM that may be performed automatically—expanding semantic content representations and building final content representations—have been implemented as Java-based applications—an expander and a compiler. The Pellet reasoner [33] and the Apache Jena SPARQL engine [2] are used in both applications to process semantic statements, while the Prova rule engine [24] is used in the expander to process semantic rules. The selected target languages are: VRML, X3D and ActionScript with the Away3D library. The 3D content representations encoded in ActionScript are presented using the Adobe Flash Player, while the representations encoded in VRML and X3D are presented using VRML and X3D browsers, e.g., Cortona3D and Bitmanagement BS Contact.

Several 3D scenes generated with the SEMIC implementation are presented in Figs. 6–8. The scenes have been conceptually modeled using an ontology with concepts reflecting selected elements of cities and they have been expanded by knowledge discovery.

Fig. 6 A generated 3D scene: mark in red all buildings, to which a road from the building (indicated by the arrow) leads

Fig. 7 A generated 3D scene: mark in red all cars that are close to priority vehicles (indicated by the arrows) going on the same road

Fig. 8 A generated 3D scene: the trees that are closer to buildings are lower than the other trees

7 Evaluation

Qualitative and quantitative evaluations have been performed to compare the SEMIC approach with selected approaches to 3D content creation.

7.1 Qualitative evaluation

The qualitative evaluation performed includes a comparison of SEMIC with selected approaches to 3D content creation in terms of functionality. The selected approaches are leading in terms of functionality, available documentation and the community of users. The evaluation covers approaches to semantic content creation (proposed by Latoschik et al., Troyer et al. and Kalogerakis et al.), imperative programming languages and programming libraries (ActionScript with Away3D and Java with Java3D) as well as environments for visual content creation (advanced environments—Blender and 3ds Max, and user-friendly environments—SketchUp and 3DVIA). The qualitative evaluation (presented in Table 1) aims to indicate the major gaps in the available approaches, which are to be covered by the SEMIC approach (cf. Sect. 3).

Table 1 Comparison of the selected approaches to 3D content creation

Declarative content creation is an aspect that significantly distinguishes semantic approaches from imperative languages and visual environments. Semantic approaches allow for content reflection by facts and rules, for which the order of appearance in the content representation is not important, in contrast to the instructions of imperative languages. Therefore, declarative content representation can be more intuitive for content authors, who may specify desirable content properties and constraints, instead of specifying sequences of instructions and analyzing the order in which they are processed. Moreover, declarative content description significantly facilitates automatic content management (indexing, searching and analyzing) in repositories by referring to content attributes, which may conform to common schemes and ontologies. Since the knowledge bases used in SEMIC consist of semantic triples (expressing facts) and Horn clauses (expressing rules), which declaratively represent the content, the approach satisfies requirement 1 (Sect. 3).
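The claim that the order of facts and rules does not matter can be illustrated with a small fixed-point computation. The sketch below is plain Python, not Prova and not part of SEMIC; the predicates `standsOn` and `position` and the single rule are hypothetical. Whatever order the facts are supplied in, the derived set is the same.

```python
def rule_position(facts):
    """standsOn(A, S) and position(S, P) imply position(A, P)."""
    return {
        (a, "position", p)
        for a, rel, s in facts if rel == "standsOn"
        for s2, rel2, p in facts if rel2 == "position" and s2 == s
    }


def closure(facts, rules):
    """Apply all rules repeatedly until no new fact can be derived."""
    known = set(facts)
    while True:
        new = set().union(*(rule(known) for rule in rules)) - known
        if not new:
            return known
        known |= new


facts = [
    ("statuette", "standsOn", "stool"),
    ("vase", "standsOn", "statuette"),
    ("stool", "position", (1.0, 0.0, 2.0)),
]

forward = closure(facts, [rule_position])
backward = closure(list(reversed(facts)), [rule_position])
```

Here `forward` and `backward` are equal, and both contain a position for the vase even though it was never stated, which is the kind of implication an imperative script would have to compute in a prescribed sequence.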

Knowledge-based 3D content creation has been considered in terms of building content representations with regard to discovered properties and dependencies of content objects, which may be hidden (not explicitly specified), but are the logical implications of facts and rules that have been explicitly specified in the knowledge base. On the one hand, this aspect of content creation is not available in imperative languages, including the languages used in the visual environments. On the other hand, although the available semantic approaches could be extended to enable knowledge-based modeling, currently, they do not support content creation based on extracted data. Since ERs in SEMIC include not only facts and rules explicitly specified during the first three steps of the modeling process, but also inferred logical implications of the facts and rules, the approach satisfies requirement 2.

Conceptual content creation has been considered in terms of representation of 3D content at different levels of abstraction (detail) and the use of the well-established semantic web concepts (classes, individuals, properties, facts and rules) in the 3D content creation process. Overall, the available semantic approaches enable the use of basic semantic expressions (combinations of semantic concepts), such as classes and properties, at different levels of abstraction in modeling of content. However, they do not permit a number of more sophisticated combinations of concepts, which are essential to visualization of complex knowledge bases and which are covered by SEMIC. The imperative languages and visual environments permit complex conceptual content representations at different levels of abstraction, but these are expressed imperatively, which is not convenient for knowledge extraction, reasoning and content management in web repositories. Since the knowledge bases used in SEMIC represent the content at the concrete (specific to 3D modeling) and conceptual (specific to an application or domain) levels of abstraction, the approach satisfies requirement 3.

The previous approaches do not support the separation of concerns that are related to substantially different aspects of content creation between different modeling users, who have different modeling skills and experience, and are equipped with different modeling tools. Although the available approaches allow for assigning different tasks to different users (e.g., creating different content components or writing different parts of code), the tasks are difficult to accomplish using different tools and require similar expertise from all involved users (e.g., in the use of a common visual environment or a common language). Separation of concerns has been achieved in SEMIC by decoupling particular aspects of the modeled content into separate steps of the modeling process, which can be performed either manually by different modeling users or automatically by specific algorithms (requirement 4).

Multi-platform content representation has been considered in terms of flexible and generic content transformation to different formats. Although the analyzed visual environments support different content formats and the advanced environments (Blender and 3ds Max) enable the introduction of new formats (e.g., by implementing appropriate plug-ins), they do not enable generic description of content transformation that is independent of particular content representation formats. Such description could facilitate the introduction of new content formats and permit content presentation on new platforms. Overall, the semantic approaches do not permit generic, flexible and extensible content transformation for different platforms. Since ERs in SEMIC are platform- and standard-independent, and since they can be transformed to different content representation languages and presented using different presentation platforms, the approach satisfies requirement 5.

7.2 Quantitative evaluation

The quantitative evaluation performed includes the complexity of 3D content representations and profit from conceptual modeling of 3D content. The evaluation covers CpRs, ERs and final content representations. The CpRs and ERs have been encoded with SCM using the RDF-Turtle format. The ontology used for building the CpRs provides different types of application-specific complex objects and complex properties assembling multiple concrete components of 3D content. Final content representations have been encoded with the VRML, X3D and ActionScript (with Away3D) languages.

The evaluation has been carried out starting from CpRs, which are scenes assembled from various numbers of objects. The number of objects has varied from 5 to 50 in steps of 5. For every number of objects, 20 scenes have been randomly generated and the average results have been calculated. The test environment used is equipped with the Intel Core i7-2600K 3.4 GHz CPU, 8 GB RAM and the Windows 7 OS.

7.2.1 Complexity of 3D content representations

The complexity of 3D content representations has been evaluated with the following metrics: the structured document complexity metric [31], the number of bytes, the number of logical lines of code (LLOC) [1] and the Halstead metrics [22]. The first metric has been used to measure the complexity of representation schemes of conceptual, concrete and final content representations, whereas the other metrics have been used to measure the complexity of particular content representations.

Structured document complexity metric. The structured document complexity metric has been used to measure the complexity of representation schemes regarding unique elements and attributes, required elements and attributes as well as attributes that need to be specified at the first position within their parent elements. The metric may be calculated for XML- and grammar-based documents. The values of the structured document complexity metric calculated for the schemes of conceptual, concrete and final content representations (encoded in VRML, X3D and ActionScript) are presented in Table 2. The results obtained for VRML and X3D representations are equal, because both languages use schemes with equivalent basic elements and attributes. While in VRML and X3D representations, different hierarchical document elements have been classified as unique elements, in ActionScript representations, different classes and data types (potentially corresponding to different elements of VRML/X3D) have been classified as unique elements. In ERs, unique elements cover different RDF, RDFS and OWL elements as well as semantic properties of 3D content, which are encoded by document elements (according to the RDF syntax). In comparison to ERs, CpRs include a lower number of unique elements, which are semantic combinations of multiple elements of ERs. Different properties occurring in hierarchical VRML/X3D elements or properties of objects in ActionScript representations have been classified as unique attributes in the metric calculation. Since the content properties in ERs are encoded using document elements, only a few attributes, which are primary RDF, RDFS and OWL attributes, may be classified as unique attributes in ERs. In comparison to ERs, CpRs have a higher number of unique attributes, which is determined by the application-specific ontology used.

RMs incorporate the highest number of unique elements of all analyzed documents, because they indicate unique elements of both concrete and conceptual representations. Unique attributes of RMs are semantic properties specified in SMM. In VRML/X3D and ActionScript representations, the scene and the view are the only required elements, and they have a few required attributes. The elements and the attributes that are required in CpRs, ERs and RMs are basic RDF, RDFS and OWL elements and properties. The calculated values of the structured document complexity metric show that the overall complexity of CpRs is much lower than the overall complexity of ERs, which is a little lower than the overall complexity of VRML/X3D and much lower than the overall complexity of ActionScript representations. The overall complexity of RMs is the highest of all the complexities calculated.

Table 2 Structured document complexity metric of the content representation schemes and the mapping scheme

Size metrics. The number of bytes (Fig. 9) and the number of logical lines of code (LLOC, Fig. 10), excluding comments, have been used to measure the size of content representations. The graphs present the metrics in relation to the number of application-specific components included in CpRs.

Fig. 9 The size of representation (bytes) depending on the number of components

Fig. 10 The size of representation (LLOC) depending on the number of components

The differences in size are relatively high between different types of representations. In both comparisons, CpRs are more than twice as concise as the corresponding final representations, as the CpRs assemble multiple concrete content elements. ERs are the most verbose, which confirms that RDF-Turtle is the most verbose encoding format of all the encoding formats used, taking into account that the elements of ERs are semantic equivalents to the elements of the corresponding final representations.

Halstead metrics. The Halstead metrics have been used to measure the complexity of content representations in several respects: the size of vocabulary and length of the content representations, volume corresponding to the size of the representations, difficulty corresponding to the error proneness of the representations, effort in implementation and analysis of the representations as well as estimated time required for the development of the representations. The particular Halstead metrics in relation to the number of components of CpRs are presented in the graphs in Figs. 11–16. The VRML and X3D representations that have been considered in the presented evaluation are based on equivalent basic elements and attributes; therefore, they are presented together.

Vocabulary (Fig. 11), which is the sum of unique operators (\(n1\)) and unique operands (\(n2\)) of the representation:

$$\begin{aligned} \mathrm{Voc} = n1 + n2 \end{aligned}$$
(1)

is lowest for CpRs and it is highest for VRML/X3D, because of a high number of unique operands, which are individual scene graph nodes. In contrast to the other languages, in VRML/X3D, a relation between two components in the generated representation is reflected by nesting one component in another component with specifying all intermediate nodes, which are also classified as unique operands, e.g., applying a Material to a Shape requires an intermediate Appearance node to be nested in the Shape node. In the other languages, such associations are usually described directly—without any intermediate nodes.

Fig. 11 The vocabulary of representation depending on the number of components

Length (Fig. 12), which is the sum of the total number of operators (N1) and the total number of operands (N2) of the representation:

$$\begin{aligned} \mathrm{Len} = N1 + N2 \end{aligned}$$
(2)

is lowest for CpRs. VRML/X3D representations are shorter than concrete and ActionScript representations, as their operands typically occur once and all references to them are specified by nesting attributes and other nodes in them. Therefore, the operands do not need to be additionally indicated explicitly (e.g., by proper instructions or statements), and thus the length of VRML/X3D representations is lower than the length of the other representations, in which all references to operands must be explicitly declared by referring to their identifiers.

Fig. 12 The length of representation depending on the number of components

The graph of volume (Fig. 13), which depends on the length and vocabulary of content representations:

$$\begin{aligned} \mathrm{Vol} = \mathrm{Len} \times \mathrm{log}_2 (\mathrm{Voc}) \end{aligned}$$
(3)

is similar to the graph of length for all types of representations.

In contrast to the other Halstead metrics discussed, difficulty (Fig. 14), which is given by the formula:

$$\begin{aligned} \mathrm{Diff} = \frac{n1}{2} \times \frac{N2}{n2} \end{aligned}$$
(4)

has similar values for different numbers of components in the scene. It is lowest for conceptual and VRML/X3D representations (low error proneness) because of the relatively low values of the number of distinct operators and the total number of operands, and relatively high values of the number of distinct operands. Relatively high difficulty of ActionScript representations (high error proneness) is caused by relatively high values of the first two factors and a relatively low value of the third factor.

Fig. 13 The volume of representation depending on the number of components

Fig. 14 The difficulty of representation depending on the number of components

Effort (Fig. 15) and time (Fig. 16) required for the implementation or analysis of the representation, where effort is the product of difficulty and volume and time is proportional to effort:

$$\begin{aligned} \mathrm{Eff} = \mathrm{Diff} \times \mathrm{Vol} \end{aligned}$$
(5)
$$\begin{aligned} \mathrm{Time}\ (\mathrm{h}) = \frac{\mathrm{Eff}}{18 \times 3600} \end{aligned}$$
(6)

are lowest for CpRs because of the relatively low values of their difficulties and volumes. Higher values of effort and time occur for the other representations.
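Equations (1)–(6) can be evaluated together from the four basic counts. The following Python is a direct transcription of the formulas above; the function name `halstead` and the sample counts are illustrative assumptions and are not taken from the evaluated representations.

```python
import math


def halstead(n1, n2, N1, N2):
    """Compute the Halstead metrics of Eqs. (1)-(6).

    n1, n2: numbers of unique operators and unique operands
    N1, N2: total numbers of operators and operands
    """
    voc = n1 + n2                   # Eq. (1)
    length = N1 + N2                # Eq. (2)
    vol = length * math.log2(voc)   # Eq. (3)
    diff = (n1 / 2) * (N2 / n2)     # Eq. (4)
    eff = diff * vol                # Eq. (5)
    time_h = eff / (18 * 3600)      # Eq. (6), in hours
    return {"Voc": voc, "Len": length, "Vol": vol,
            "Diff": diff, "Eff": eff, "Time": time_h}


# Made-up counts, chosen only so that the arithmetic is easy to follow.
sample = halstead(n1=4, n2=4, N1=10, N2=6)
```

For these sample counts the vocabulary is 8, the length 16, the volume 16 × log2(8) = 48, the difficulty 2 × 1.5 = 3 and the effort 144. Because difficulty depends on ratios of counts rather than on their absolute size, it can stay roughly constant as scenes grow, while volume and effort increase with length.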

Fig. 15 The effort in analyzing the representation depending on the number of components

Fig. 16 The time of implementing the representation depending on the number of components

7.2.2 Profit from conceptual content creation

The profit from conceptual content modeling is directly proportional to the number of scenes that have to be created and their size, and it is inversely proportional to the sum of the size of the RM (that must be implemented) and the size of the primary CpRs that have to be transformed to final representations. The metric is given by the formula:

$$\begin{aligned} \mathrm{Profit} = \frac{N \times \mathrm{AvgSize(FR)}}{\mathrm{Size(RM)} + N \times \mathrm{AvgSize(CpR)}} \end{aligned}$$
(7)

where N is the number of scenes (final content representations) that have to be created, Size(RM) is the size of the RM, and AvgSize(FR) and AvgSize(CpR) are the average sizes of a final representation and of a CpR of a scene, respectively.
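The behaviour of Eq. (7) is easy to explore numerically. The sketch below transcribes the formula and, by rearranging Profit ≥ 1 into N × (AvgSize(FR) − AvgSize(CpR)) ≥ Size(RM), also returns the smallest number of scenes at which the RM pays off. The function names and the sizes used in the example are assumptions for illustration and are not measured values.

```python
import math


def profit(n_scenes, avg_size_fr, size_rm, avg_size_cpr):
    """Eq. (7): size of the final representations over the size of what is authored."""
    return n_scenes * avg_size_fr / (size_rm + n_scenes * avg_size_cpr)


def break_even_scenes(avg_size_fr, size_rm, avg_size_cpr):
    """Smallest N with Profit >= 1, or None if CpRs are not smaller than final scenes."""
    if avg_size_fr <= avg_size_cpr:
        return None
    return math.ceil(size_rm / (avg_size_fr - avg_size_cpr))


# Hypothetical sizes in bytes: a 20000-byte RM, 1000-byte CpRs, 3000-byte final scenes.
n_min = break_even_scenes(avg_size_fr=3000, size_rm=20000, avg_size_cpr=1000)
few = profit(5, 3000, 20000, 1000)
at_break_even = profit(n_min, 3000, 20000, 1000)
many = profit(20, 3000, 20000, 1000)
```

With these sizes, 5 scenes give a profit of 0.6, the break-even point is 10 scenes, and 20 scenes give 1.5. As N grows, the profit approaches AvgSize(FR) / AvgSize(CpR), which is 3 here, so more verbose target languages yield a higher attainable profit.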

The values of profit in relation to the number of scenes are presented in the graphs in Figs. 17–19. The cost of the creation of the RM is not acceptable for low numbers of simple scenes (the values of profit lower than 1) and it is acceptable for high numbers of complex scenes (the values greater than 1). The profit is higher for the more verbose languages (ActionScript) and it is lower for the more concise languages (VRML).

Fig. 17 The profit from conceptual content creation for VRML (for representation size in bytes)

Fig. 18 The profit from conceptual content creation for X3D (for representation size in bytes)

Fig. 19 The profit from conceptual content creation for ActionScript (for representation size in bytes)

8 Discussion

The tests carried out show that the SEMIC approach outperforms the previous approaches to 3D content creation in terms of functionality as well as the parameters of content representation schemes and generated content representations.

The advantage in terms of the size and complexity of content representations results from modeling 3D content at a higher level of abstraction and leads to creating objects that are assemblies of multiple low-level content components. Although the overall size and complexity of an RM and a CpR may be higher than the size and the complexity of the resulting VRML, X3D or ActionScript scene, RMs are created only once and are common for all CpRs compliant with the selected application-specific ontology, which are the only documents created by domain experts. The size of CpRs is typically much smaller than the size of the corresponding final content representations, which are encoded with the widely used 3D content representation languages and programming libraries. The smaller size can enhance storage and transmission of 3D content to target users, e.g., from distributed web repositories to content presentation platforms.

Furthermore, the proposed approach provides much better possibilities of content implementation (requires less effort and shorter time) by enabling the use of representation schemes whose complexity can be lower than the complexity of commonly used content representation languages. As a result, the vocabulary, length, volume and difficulty of content representations are lower and the creation as well as the understanding of the representations requires less effort and time. Hence, the proposed approach can be used to accelerate and improve effectiveness of 3D content creation using application-specific ontologies and tools that are less complicated than the tools currently used for 3D content creation with the available representation languages.

Moreover, the approach enables content creation by users without considerable expertise in 3D modeling, e.g., domain experts. The calculated values of the profit from conceptual creation of 3D content show that the use of the proposed approach is effective even for systems with relatively low numbers of scenes that have to be designed, in comparison to modeling all the scenes using the available languages.

In the quantitative evaluation, ERs and CpRs have been considered separately, because a CpR and its corresponding ER independently and comprehensively represent the content at different (conceptual and concrete) levels of abstraction—each of them includes all elements that have to be presented. Summing up the complexities of ERs and CpRs may be important if the created content needs to be managed (indexed, searched or analyzed) simultaneously at different levels of abstraction, e.g., storing and accessing content in repositories. The values of size, length, volume, difficulty, effort and time calculated for ERs are much higher than the values calculated for VRML/X3D representations. However, ERs are generated automatically on the basis of CpRs, so it is not a burden on content authors to create them.

9 Conclusions and future works

In this paper, a new approach to semantic creation of 3D content has been proposed. The SEMIC approach leverages the semantic web techniques to satisfy the requirements for modeling of 3D content, which have been specified in Sect. 3. It goes beyond the current state of the art in the field of conceptual knowledge-based modeling of 3D content, and it can improve the creation of complex 3D content for VR/AR web applications in various domains.

The presented solution has several important advantages in comparison to the available approaches to 3D content creation. First, the use of the semantic web techniques enables declarative content creation, which stresses the description of the results to be achieved, but not the manner in which the results are to be obtained. Second, it permits content creation at different levels of abstraction regarding hidden knowledge, which may be inferred and used in the modeling process. Third, the approach permits the reuse and assembly of common content elements with respect to their dependencies, which may be conceptually described with application-specific ontologies and knowledge bases; this enables content creation by different users equipped with different modeling tools. Furthermore, content creation with the SEMIC approach requires less time and effort in implementation, and it provides less complicated results, which can be stored and transmitted more effectively. Next, due to the conformance to the semantic web approach, the content created with SEMIC is suitable to be processed (indexed, searched and analyzed) in web repositories. Finally, the created content is platform- and standard-independent and it can be presented using diverse 3D content presentation tools, programming languages and libraries, liberating users from the installation of additional software, which can improve the dissemination of 3D content.

Future research may proceed in several directions. First, a visual modeling tool supporting semantic 3D content creation according to SEMIC can be developed. Such a tool can allow for the extension of the current SEMIC evaluation with modeling of 3D content by domain experts. Such an evaluation can cover time required for creating CrRs, RMs and CpRs by users with low and high skills in 3D modeling, time required for creating content components of different types (geometrical, structural, behavioral, etc.) and scenes with different complexities, as well as profit from creating particular numbers of scenes using different tools. Second, the approach can be enriched with content creation based on semantic queries (expressed, e.g., in SPARQL) to content repositories. Third, the proposed approach could be extended with semantic transformation of declarative rule-based descriptions of complex content behavior. Finally, a persistent link between semantic and final 3D content representations can be proposed to enable real-time synchronization of content representations.