
1.1 Introduction

Software systems are complex. This holds in particular for the user interfaces of those software systems, which contribute about 50 % to the overall complexity of a software system (Myers and Rosson 1992). To deal with that complexity, models as abstractions of user interfaces are helpful.

To create such models, a variety of user interface description languages has been proposed (Guerrero-Garcia et al. 2009; Paternò et al. 2008; Souchon and Vanderdonckt 2003). These languages, most of which are XML-based, allow for describing user interfaces on an abstract level. The goal is, most often, to generate user interface code from them in a model-driven approach.

While most of those UI description languages are useful for their purpose, some use cases require a stronger formalization than that given by a UML diagram or an XML schema. A more formal approach is to use ontologies. An ontology is “a formal, shared conceptualization of a domain” (Gruber 1995), i.e., it captures the categories of things that exist in a domain, such as user interfaces, and their possible relations, in a formal manner.

Although ontologies have been widely adopted in other software engineering fields, e.g., in the domain of web services (Studer et al. 2007), their employment for user interface development is still rare. While some first work has been done, e.g., in the course of W3C’s WAI ARIA initiative (W3C 2011a), a universal ontology of user interfaces is still missing.

This chapter discusses the development of UI2Ont, a formal ontology of user interfaces, split into a top level and a detail level. The former describes the general concepts that exist in the user interface domain (such as components and activities), the latter contains detailed taxonomies of those concepts, i.e., a categorization of component types etc. We have designed the UI2Ont ontology by examining existing user interface description languages and formalizing the concepts contained therein in a rigid ontology, based on the formal top level ontology DOLCE (Masolo et al. 2003).

The rest of this chapter is structured as follows. Section 1.2 motivates the development of a formal ontology of user interfaces and interactions, and Sect. 1.3 discusses a number of potential use cases for such an ontology. Section 1.4 gives an overview of existing ontologies of the domain. Section 1.5 discusses design decisions and the building process of the UI2Ont ontology, while Sect. 1.6 depicts the resulting ontology itself. A sample application using the ontology is discussed in Sect. 1.7. We conclude with a summary and an outlook on future work in Sect. 1.8.

1.2 Ontologies vs. UI Models

Although ontologies and software models are related, they are different by nature. An ontology claims to be a generic, commonly agreed upon specification of a conceptualization of a domain (Gruber 1993), with a focus on precisely capturing and formalizing the semantics of terms used in a domain. A software model in turn is task-specific, with the focus on an efficient implementation of an application for solving tasks in the modeled domain (Atkinson et al. 2006; Ruiz and Hilera 2006; Spyns et al. 2002). Thus, a software engineer would rather trade off precision for a simple, efficient model, with the possibility of code generation, while an ontology engineer would trade off simplicity for a precise representation (Paulheim et al. 2011). Another difference is that in software engineering, models are most often prescriptive models, which are used to specify how a system is supposed to behave, while ontologies are rather descriptive models, which describe how the world is (Aßmann et al. 2006). Figure 1.1 illustrates those differences.

Fig. 1.1
figure 1

Ontologies and modeling languages serve different purposes (reprinted from Paulheim and Probst 2011)

Taking this thought to the domain of user interfaces and interactions, models are used to define particular user interfaces (e.g. with the goal of generating code implementing those interfaces), while a formal ontology would capture the nature of things that exist in the domain, e.g., which types of user interfaces exist, and how they are related.

Due to those differences, we argue that developing a formal ontology on user interfaces will not lead to yet another user interface description language, but to a highly formal model with different intentions and usages.

1.3 Use Cases

The literature discusses several use cases for employing ontologies in the field of engineering user interfaces, e.g. the position paper by Rauschmayer (2005) and our more recent survey (Paulheim and Probst 2010b). In the latter, we have identified a number of use cases where an ontological representation of the domain of user interfaces and interactions is required or at least beneficial. Those use cases address improving both the development process and the user interface itself.

Automatic Generation of UI Code

The classic use case for user interface models is generating user interface code from an abstract model, typically in an MDA based setting. An example for using ontologies as input models to a code generator is shown by Liu et al. (2005). The authors argue that employing background knowledge from a richly axiomatized ontology can improve the quality of the generated user interfaces, e.g., by identifying useless interaction paths or illegal combinations of interaction components (e.g., foreseeing a mouse-triggered interaction on a device without a mouse). Furthermore, domain ontologies may already be used for other purposes in a software engineering project; they can be reused for creating the description of UI components.

Supporting Repositories of User Interface Components

Reusing existing UI components is desirable to reduce development efforts. With a growing number of components that can potentially be reused, it is not an easy task to find suitable components. Happel et al. (2006) discuss an ontology-based repository for software components (in general, not specifically UI components). Reasoning on those ontologies assists users in finding components which fit their needs, e.g., in terms of license agreements, hardware platforms, or required libraries. For systems to be built from a large number of components, conflicts which are hard to find manually can be detected automatically by a reasoner.
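Such a repository check can be illustrated with a small sketch. The component names, requirement fields, and conflict rules below are illustrative assumptions, not part of the system described by Happel et al.; a real implementation would use an OWL reasoner rather than hand-written checks:

```python
# Illustrative sketch of ontology-style conflict detection in a
# component repository: each component declares required platform
# features and a license, and a simple "reasoner" stand-in checks a
# planned composition against a target platform and a license policy.
# All names and rules here are assumptions for illustration.

components = {
    "SliderWidget": {"requires": {"display", "mouse"}, "license": "GPL"},
    "TextField":    {"requires": {"display", "keyboard"}, "license": "MIT"},
    "SpeechOutput": {"requires": {"audio_out"}, "license": "MIT"},
}

def find_conflicts(selection, platform_features, allowed_licenses):
    """Return human-readable conflicts for a planned composition."""
    conflicts = []
    for name in selection:
        meta = components[name]
        missing = meta["requires"] - platform_features
        if missing:
            conflicts.append(f"{name}: platform lacks {sorted(missing)}")
        if meta["license"] not in allowed_licenses:
            conflicts.append(f"{name}: license {meta['license']} not allowed")
    return conflicts

print(find_conflicts(["SliderWidget", "SpeechOutput"],
                     {"display", "keyboard"}, {"MIT", "BSD"}))
```

The point of the ontology-based variant is that such checks fall out of generic reasoning over the component descriptions instead of being hard-coded per conflict type.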

Supporting Repositories of Usability Patterns

Not only code artifacts such as software components may be stored and reused, but also conceptual artifacts such as design and usability patterns. Henninger et al. (2003) introduce an approach using ontologies for classifying and annotating usability patterns. The authors propose the use of ontologies for managing a repository of patterns. By representing those properties using formal ontologies, more sophisticated approaches could also validate those patterns, find inconsistencies and incompatibilities between different patterns, and identify commonalities between them.

Integration of UI Components

Ontologies may not only be used for identifying, but also for integrating user interface components. We have introduced an approach which uses ontologies for annotating user interface components and messages exchanged by those components (Paulheim and Probst 2010a). A reasoner acts as a central message processor which coordinates the interactions between the different components, based on formalized rules, thus facilitating run-time integration of user interface components. This example is discussed in more detail in Sect. 1.7.

UI Adaptation

Different users have different expectations and needs towards an application’s user interface. Therefore, making user interfaces adaptive is a significant improvement regarding usability. Different approaches to employing ontologies for building adaptive user interfaces have been discussed. The W3C’s WAI ARIA initiative (W3C 2011a), for example, suggests the use of ontologies for annotating web based interfaces. Based on a user’s profile and semantic annotations of the interface, a reasoner can decide on optimal realizations for users with impairments, such as color-blindness.

Self-explanatory User Interfaces

User interfaces for complex systems are often difficult to understand. Therefore, users may need assistance in finding out how to achieve their goals, how particular user interface components work, and how they are related to each other. Kohlhase and Kohlhase (2009) discuss different approaches which use ontologies for automatically generating explanations in user interfaces, both textually and graphically: ontology-based formalizations of user interfaces are used to create help texts and visual hints at run-time.

1.4 Related Work

In the previous section, we have discussed a number of use cases for a formal ontology of the domain of user interfaces and interactions. Although there are some prototypes for those use cases, most of them are built on top of only small, pragmatic ontologies that fit the requirements of those use cases, but to the best of our knowledge, there has not been an attempt to build a concise and comprehensive ontology of the domain.

The WAI ARIA ontology (W3C 2011a), whose contents have been used as an input for our ontology, provides a taxonomy of roles that elements in a web based application can play, and a set of attributes of those elements. Annotations based on that ontology can be used to make web pages accessible for people with different impairments. The WAI ARIA ontology is not general, but has a strong bias towards web based user interfaces, which makes it difficult to transfer it to other types of user interfaces. The hardware parts of user interfaces are not contained in WAI ARIA, neither are user interactions. Furthermore, it does not follow a rigid ontology engineering approach, but contains some questionable subclass relations. The top level consists of the three categories Window, Widget, and Structure, but there are many categories which are sub-categories of more than one of those. For example, Alert Dialog is at the same time a sub-category of Window and of Region (a sub-category of Structure), where the latter is explained to be “a large perceivable section of a web page or document”, while the former follows the notion of a browser displaying such web pages and documents. Such contradictory axioms may cause difficulties when reasoning on the ontology.
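The modeling issue just described, a category subsumed under more than one top-level category, is easy to detect mechanically. The following sketch works on an abbreviated and partly assumed fragment of the WAI ARIA role hierarchy (category names condensed to identifiers):

```python
# Sketch: find categories reachable from more than one top-level
# category in a subclass hierarchy. The hierarchy below is an
# abbreviated, partly assumed fragment of the WAI ARIA role taxonomy.

subclass_of = {
    "Widget": [],
    "Window": [],
    "Structure": [],
    "Region": ["Structure"],
    "AlertDialog": ["Window", "Region"],   # the problematic case
    "Dialog": ["Window"],
}

TOP = {"Widget", "Window", "Structure"}

def top_ancestors(cat):
    """All top-level categories a category is (transitively) subsumed by."""
    if cat in TOP:
        return {cat}
    result = set()
    for parent in subclass_of[cat]:
        result |= top_ancestors(parent)
    return result

multi = {c for c in subclass_of if len(top_ancestors(c)) > 1}
print(multi)  # AlertDialog sits under both Window and Structure
```

A description logic reasoner performs the same kind of subsumption computation, which is why such modeling choices surface as problems once reasoning is applied.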

The GLOSS ontology (Coutaz et al. 2003) defines categories for multi surface interactions, i.e., interactions with tangible user interface components, user interfaces consisting of interactive surfaces (e.g., touch screens), and combinations thereof. The ontology has a strong focus on the hardware aspects of interactive devices and their relations. It contains a formalization of the top level, but omits a detail level laying out different types of devices and their attributes, and it does not contain further categorizations of interactions or user interface components.

The FIPA device ontology (Foundation for Intelligent Physical Agents 2002) is also an ontology of user interface devices, which focuses on mobile devices, such as smart phones. It provides means to describe the technical capabilities of such devices, e.g., audio input and output, memory, and screen resolution. While being suitable for certain use cases, such as comparing devices and their capabilities and defining requirements of software for such devices, the ontology is not expressive and detailed enough to be used for other scenarios in the user interface area.

A similar approach is taken by the W3C’s CC/PP (Composite Capability/Preference Profiles) recommendation (W3C 2004a), where capabilities and preferences of mobile devices are represented in RDF. While the recommendation contains an informative example vocabulary for displays and printers, CC/PP, like the GLOSS ontology, only defines the top level and leaves the specification of the detail level open. Another (meanwhile discontinued) attempt was made by the W3C in the course of the Delivery Context Ontology (W3C 2010), which focuses on mobile, Java based application front ends to web services and provides means to formalize both mobile hardware and Java software. Although its scope is rather limited, it provides a reasonable degree of formalization.

Another aspect of user interfaces and interactions is addressed by the Computer Work Ontology (CWO) (Schmidt et al. 2011). The ontology formalizes the way people work with computers, the goals they pursue, and the actions they perform to achieve these goals. While this ontology provides a concise formal description of the activities, reusing top level ontologies such as DOLCE, it is not capable of capturing interactions with the components and those components as such.

All of the ontologies discussed are rather narrow in scope and do not cover the whole area of user interfaces. Furthermore, many of them are only weakly formalized and do not leverage extensive formalizations of top level ontologies. In contrast, UI2Ont is the first ontology covering the entire spectrum of user interfaces and interactions, allowing for concise descriptions of interactions with classical interfaces as well as the formalization of multi-modal interactions.

1.5 Building the Ontology

Most ontology engineering approaches start from collecting concepts from the domain (Fernández et al. 1997; Uschold and King 1995). To that end, we have first reviewed a number of user interface description languages and extracted a list of concepts. Furthermore, we have re-used a number of top-level ontologies and aligned the concepts identified in the first step to those ontologies in order to facilitate a rich axiomatization.

1.5.1 Reuse of UI Description Languages

As discussed above, a number of user interface description languages already exists. From the large variety presented in the surveys (Paternò et al. 2008; Souchon and Vanderdonckt 2003), we picked a subset based on criteria such as popularity in the literature, relevance with respect to the modeling goals of the ontology, availability of detailed documentation (as the exact set of tags or keywords is needed for identifying key concepts), and expressiveness.

Figure 1.2 depicts the chosen subset, organized along the three levels of the Cameleon reference framework, i.e., the concepts and tasks level, the abstract user interface level, and the concrete user interface level (Calvary et al. 2003).

Fig. 1.2
figure 2

User interface description languages that have been used as input for our ontology (reprinted from Paulheim 2011)

The set of languages taken into account for the development of the UI2Ont ontology consists of UIML (OASIS 2009), XIML (RedWhale Software 2000), the abstract roles defined in WAI ARIA (W3C 2011a), and the abstract user interface parts of UsiXML (UsiXML Consortium 2007), TeresaXML (Paternò et al. 2008) and its successor MARIA XML (Paternò et al. 2009). For the detail level, we have used LZX (Laszlo Systems 2006), XUL (Mozilla 2011), XForms (W3C 2009) and HTML5 (W3C 2011b), the MONA UIML vocabulary (Simon et al. 2004), the concrete user interface part of UsiXML, and the concrete roles defined in WAI ARIA.

Table 1.1 lists the key concepts from the different UI description languages that we have examined on the abstract UI level. In addition to those 52 concepts, we have identified 26 relations between those concepts. These collections have served as input for building the ontology.

Table 1.1 Key concepts identified from examined UI description languages. An X denotes that the concept is present in the respective language, a ∗ denotes that it is present, but expressed as a relation. The table lists all concepts that exist in at least two of the languages. The last line shows all the concepts that exist in only one language

The table shows that there are some differences between the different UI definition languages. Besides different modeling scopes, one reason is that the border between abstract and concrete user interface (Calvary et al. 2003) is not sharply defined: e.g., Condition belongs to the abstract user interface part of MARIA, but to the concrete user interface part of UsiXML. Since the table only depicts the languages’ respective abstract user interface parts, such deviations occurred.

For the detail level, we have collected definitions of user interface components and user and system activities (which are often expressed as system events notifying about those activities). Figure 1.3 shows, as an example, the distribution of user interface component definitions across the concrete user interface languages examined. The figure shows that there is a “long tail” of components that are only defined in one or two languages. Therefore, it makes sense to unify the input of several languages when collecting concepts.

Fig. 1.3
figure 3

Distribution of UI component definitions across different UI description languages (reprinted from Paulheim 2011)

1.5.2 Reuse of Top Level Ontologies

We have reused a number of foundational and upper level ontologies for several reasons. First of all, those ontologies already contain definitions for a larger number of concepts, so they reduce the initial effort of developing the ontology. Second, using foundational level ontologies eases the interoperability with applications based on ontologies using the same foundational level ontologies. Third, foundational level ontologies provide a certain guidance which simplifies the definition of useful categories and prevents typical modeling mistakes (Guarino and Welty 2009). Figure 1.4 shows the stack of ontologies that we have reused.

Fig. 1.4
figure 4

Stack of the ontologies that have been reused. The top and detail level ontologies of the user interfaces and interactions domain are located on the right hand side (reprinted from Paulheim 2011)

The scope of the UI description languages examined was in most cases limited to or at least focused on the software part of user interfaces. Therefore, we have reused the ontologies of software and of software components described by Oberle et al. (2009). These two ontologies define categories such as Software, Software Component, Data, Computational Task, etc., and their relations, which form a useful basis for ontological modeling of software related things. The core software ontology defines software and software objects in general, while the core ontology of software components can be used to describe properties of actual software components, such as states and parameters.

These ontologies in turn build upon a set of top level ontologies: DOLCE (Masolo et al. 2003) divides the top level of Particulars into Endurants (i.e., entities that are in time, such as physical objects), Perdurants (i.e., entities that happen in time, such as events), Qualities inherent to other particulars (such as color or spatial position), and Abstracts (i.e., entities that have neither spatial nor temporal qualities). These basic categories are then further subdivided to form an abstract, very general level of categories.

There are various extensions to DOLCE. Two high-level extensions define spatial and temporal relations between entities. The descriptions and situations extension, often referred to as DnS, is used to express descriptions about other entities. It is a useful basis to define, e.g., communications and interpretations of utterances (Gangemi and Mika 2003). Information Objects, such as books, are a special type of Descriptions, which carry information about other entities (Gangemi et al. 2005). Digital data objects in an information system are also considered information objects, therefore, information objects are the basic entities for defining software.

When software is executed, tasks are performed by an information system. Such tasks are defined by a plan, which is expressed by the software. Therefore, the ontology of plans (Bottazzi et al. 2006) is also reused by the ontology of software. It in turn builds upon the ontology of functional participation (Masolo et al. 2003), which defines relations between the execution of tasks and the entities involved in those executions, such as an object serving as an Instrument or a Resource in a task execution process.

The basic categories and relations defined by the reused ontologies divide the ontology of user interfaces and interactions along two axes, as depicted in Fig. 1.5:

  • At design time, there are only instances of the Descriptions of user interfaces, such as the software specifications and the task descriptions. At run time, User Interface Descriptions are realized by Computational Objects and Tangible Objects, and task descriptions are carried out as Activities.

  • Tasks and Activities describe the interactions possible with user interfaces, while User Interface Components and their realizations describe the Components that are involved.

Fig. 1.5
figure 5

The top level of the ontology of the user interfaces and interactions domain. In the upper part, the design time concepts are shown, the lower part contains the run time concepts. The left part deals with interactions, the right part with components. The white ellipses denote concepts from the reused ontologies (with the following namespace conventions: DOLCE (dolce), Information Objects (io), Temporal Relations (tr), Functional Participation (fp), Plans (plan), Descriptions and Situations (dns), Core Software Ontology (cso), Core Ontology of Software Components (cosc)), the grey ellipses denote concepts from the top level ontology of the user interfaces and interactions domain. The gray triangles denote definitions carried out in the detail ontology (reprinted from Paulheim 2011)

Some of the concepts and relations identified in the first step are already contained in the stack of reused ontologies. For example, tasks and events are already defined in the DOLCE ontologies, and data types (e.g., of data entered in input fields) are already defined in the ontology of software components.

Furthermore, there are constructs in some UI description languages that are inherent to most ontology languages, such as OWL. For example, XIML provides generic slots for creating user defined relations, and UIML has means for defining rules (which, in an ontology case, would be expressed with an ontology-based rule language, such as SWRL (W3C 2004b)). Those constructs are omitted in the ontology, as they can be provided by the ontology language used for coding the ontology.

1.6 The UI2Ont Ontology

The UI2Ont ontology consists of two levels: the top level ontology defines the elementary categories such as user interface components and interactions, and the basic relations that can hold between objects of those categories. The detail level ontology defines the sub-categories and actual types of components and interactions, based on the concepts found in the user interface description languages used as input in the design process.

1.6.1 The UI2Ont Top Level Ontology

We use the basic notion of software, as defined in the ontology of software, to categorize user interfaces. To this end, some fundamental extensions to the reused ontologies were necessary.

The first fundamental extension is that for describing user interfaces, Computational Tasks are not enough. Instead, the plan expressed by a user interface consists of both User Tasks and Computational Tasks.

For describing more complex interaction patterns, we use the top level concept Plan. Generally, a plan can be seen as a description of some interaction between a user and an IT system. We derive the category Interaction Plan, which defines both Computational Tasks and User Tasks. Those tasks are carried out as Computational Activities and User Activities. A User Interface Component expresses one or more Interaction Plans. As a plan describes interactions based on conceptual Tasks, not on actually carried out Activities, it can also be seen as a pattern for interactions.

In the descriptions and situations ontology, a Task is a concept which defines how Activities are sequenced, while those Activities are the perdurants which are actually happening. In other words, tasks exist at design time, while activities happen at run-time. As a task may sequence different activities, activities may be used for more fine-grained definitions than tasks. For user interfaces, a typical task is select an object. Corresponding activities for that task can be click a radio button, click on a drop down list and click an entry in that list, type a shortcut key, etc.

We found this distinction quite useful, as a task can be sequenced by different activities for different user interface modalities (e.g. speech input and typing can be activities for data input). Thus, the task level is a modality independent description defining the purpose of a UI component, while the activity level is a modality dependent description defining the usage of a UI component.
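This two-level structure can be sketched as a simple mapping from modality-independent tasks to modality-dependent activities. The task and activity names follow the examples above; the data structure itself is an illustrative assumption, not the ontology's encoding:

```python
# Sketch of the task/activity distinction: a modality-independent
# task (design time) can be sequenced by different modality-dependent
# activities (run time). Names follow the examples in the text.

activities_for_task = {
    "select an object": {
        "mouse":    ["click a radio button",
                     "click on a drop down list and click an entry"],
        "keyboard": ["type a shortcut key"],
    },
    "data input": {
        "keyboard": ["type text into a text field"],
        "speech":   ["dictate text"],
    },
}

def usable_activities(task, available_modalities):
    """Activities that can realize a task given the modalities at hand."""
    by_modality = activities_for_task.get(task, {})
    return [a for m in available_modalities for a in by_modality.get(m, [])]

# On a device without a mouse, only the keyboard activity remains:
print(usable_activities("select an object", ["keyboard"]))
```

This mirrors how a reasoner could select a suitable realization of a task for the modalities actually available on a device.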

Following Fowler’s classic three tier architecture (Fowler 2003), we divide Software Components into Storage Components, Processing Components, and User Interface Components. The latter are realized at run time by User Interface Component Realizations.

The last extension affects Computational Objects. Although we focus on WIMP user interfaces, our intention was to design the top level of our ontology general enough to cover other forms of user interfaces, such as tangible components, as well. Therefore, we defined the category Peripheral Hardware, where Tangible Hardware Objects, as well as non-physical Visual Computational Objects, can realize User Interface Components. This construction allows our top level ontology to cover both WIMP based as well as tangible user interfaces.

Some of the UI description languages contain classes that have been modeled as relations in the ontology, e.g., Abstract Adjacency in UsiXML, which has been turned into the adjacent to relation. Also, by aligning our ontology with the respective top levels, the domain and range of relations has sometimes been changed. The adjacent to relation, for example, has been changed from a relation between user interface components to a relation between the Screen Regions they occupy.

The most specific categories in our top level ontology are at the level of User Interface Components and User Tasks. The definition of the subtypes of components and tasks is done in the detail level ontology.

Figure 1.5 shows the top level ontology. The size of the OWL implementation of the top level ontology is depicted in Table 1.2. Although we defined a number of additional classes, we have mostly reused existing relations. Therefore, the number of relations is comparatively low.

Table 1.2 Size of the OWL version of the top and the detail level ontology, as well as the reused ontologies

While the top level contains definitions of generic categories and relations used to describe user interfaces and interactions, the detail level aims at providing a categorization of user interface components and tasks which is as complete as possible.

As discussed above, we have followed the distinctions imposed by the reused upper level ontologies, which encourage the separation of information objects and their realizations, as well as of task descriptions and the activities actually carried out. Transferred to our domain, this results in separating the design time level from the run time level.

Due to this distinction, there are various points where the detail level ontology enhances the top level ontology: on the design time level, taxonomies of User Interface Components, User Tasks, and Computational Tasks, are defined. On the run time level, hierarchies of User Activities are defined, as well as Hardware Items with which those activities are performed. We have intentionally not defined any axioms restricting the allocation of activities to tasks, in order not to exclude any forms of interaction. Furthermore, user interface components are realized at run time by computational objects (i.e., software) and tangible objects (i.e., hardware), as shown in Fig. 1.6, or mixtures of both.

Fig. 1.6
figure 6

Different realizations of a slider user interface component

Image sources: http://www.flickr.com/photos/anachrocomputer/2574918867/, http://zetcode.com/tutorials/javaswttutorial/widgets/, accessed April 20th, 2011.

1.6.2 The UI2Ont Detail Level Ontology

Computational Tasks and Computational Objects, on the other hand, are not further specified. Such a specification is not necessary from a user interface perspective: for describing a user interface, it may be beneficial to describe how a user performs a selection task in a certain modality, but it is not relevant how the computer performs a certain computational task.

Due to this distinction, there are two possibilities for locating the detail level layer: defining the details of Tasks and User Interface Components on the design time level, or defining the details of Activities and User Interface Component Realizations on the run time level. We decided on the former, since in some of the use cases discussed above, the run time level does not exist. In a repository of UI components, for example, those components are not instantiated and executed when the ontology is used, e.g., for querying the repository. Thus, only instances of categories of the description level exist. Therefore, it is useful to put as much detail as possible on that level.

In our analysis of user interface definition languages, we have identified 80 different types of user interface components, as shown in Fig. 1.7. Using a bottom-up clustering approach, we have grouped them into seven central categories:

  • Data input components are used by the user to manipulate data. Examples are text input fields and radio buttons.

  • Presentation manipulation components change the appearance of a user interface component, which is usually another one than the presentation manipulation component itself. Examples are scroll bars and window resizers.

  • Operation triggering components are used to invoke system functionalities. Examples are buttons in a tool bar, and menu items.

  • Decorative elements improve the appearance of user interfaces and are neither interactive nor informative. Examples are separation bars and empty spaces.

  • Outputs provide a human consumable representation of data. Examples are text and speech output.

  • Labels assign meaning to other (most often interactive) UI components. Examples are text labels next to radio buttons.

  • Containers group other UI components. Examples are windows and tool bars.

Fig. 1.7
figure 7

Top categories for UI components (reprinted from Paulheim 2011)

The first three are considered interactive components, as they allow the user to interact with them, while the latter four are presentation components, which are non-interactive. We also introduced composite UI elements, which can contain both interactive and presentation components.
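The resulting categorization can be sketched as a small taxonomy. This follows the seven top categories and the interactive/presentation split described above (cf. Fig. 1.7); the encoding as Python dictionaries is an illustrative assumption:

```python
# Sketch of the top categories for UI components (cf. Fig. 1.7):
# seven leaf categories grouped into interactive vs. presentation
# components. The nested-dict encoding is an illustrative assumption.

taxonomy = {
    "Interactive Component": [
        "Data Input Component",
        "Presentation Manipulation Component",
        "Operation Triggering Component",
    ],
    "Presentation Component": [
        "Decorative Element",
        "Output",
        "Label",
        "Container",
    ],
}

examples = {
    "Data Input Component": ["text input field", "radio button"],
    "Operation Triggering Component": ["tool bar button", "menu item"],
    "Container": ["window", "tool bar"],
}

def is_interactive(component_type):
    """A component type is interactive iff it falls under the first group."""
    return component_type in taxonomy["Interactive Component"]

print(is_interactive("Operation Triggering Component"))  # True
print(is_interactive("Label"))                           # False
```

In the actual ontology, this grouping is expressed as OWL subclass axioms rather than as a data structure.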

By collecting tasks and activities and, as for components, clustering them, we have identified four basic categories of user tasks:

  • Information consumption tasks are tasks where information provided by the system, typically through an Output component, is consumed by the user.

  • Information input tasks are tasks where the user provides information to the system, typically through a Data Input Component. Input tasks can be performed as unbound input, e.g., entering text into a text field, or as bound input, e.g., by selecting from a list of values.

  • Command issuing tasks are all tasks where the user issues a system command, typically using an Operation triggering component.

  • Information organization tasks are performed by the user to organize the consumption of information, e.g., by scrolling through a document, following a hyperlink, or fast-forwarding a video, typically using a Presentation Manipulation Component.
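The correspondence between task categories and the component categories that typically realize them, as stated in the bullets above, can be captured as a simple lookup. The identifiers are illustrative, not the ontology's official names:

```python
# Illustrative mapping of the four user task categories to the
# component category that, per the text, typically realizes each.

TYPICAL_COMPONENT = {
    "InformationConsumptionTask": "Output",
    "InformationInputTask": "DataInputComponent",
    "CommandIssuingTask": "OperationTriggeringComponent",
    "InformationOrganizationTask": "PresentationManipulationComponent",
}

def typical_component(task):
    """Component category typically used to perform the given task."""
    return TYPICAL_COMPONENT[task]

print(typical_component("CommandIssuingTask"))
# -> OperationTriggeringComponent
```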

Unlike user tasks, user activities depend on user interface modalities, i.e., the actual interactive devices they are performed with. Consequently, we have clustered them according to the devices that are required to perform those activities, leading to categories such as keyboard activities, mouse activities, speech activities, touch activities, pen-based activities, activities with special tangible objects (such as a reacTable (Jordà et al. 2007)), as well as general perception activities. We have furthermore defined 11 categories for the corresponding hardware items and relation axioms between those, as depicted in Fig. 1.8. The ontology can be extended to more activities and hardware items for describing further modalities.

  • Display, Play, and Print, which create different manifestations of an information object.

  • Highlight and Dehighlight, which modify an existing presentation of objects without changing their order, and Rearrange, which changes the order of an existing presentation.

  • Suspend Playback and Resume Playback, which modify a streaming presentation of information.

Fig. 1.8 User activities and their mapping to hardware devices, as defined in UI2Ont (reprinted from Paulheim 2011)
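The device requirements depicted in Fig. 1.8 can be approximated by a simple requires-device relation. This is a partial sketch with assumed identifiers; the ontology defines 11 hardware categories and richer axioms:

```python
# Partial sketch of the activity-to-hardware mapping in the spirit of
# Fig. 1.8; all identifiers are illustrative assumptions, not the
# ontology's exact names.

REQUIRES_DEVICE = {
    "KeyboardActivity": "Keyboard",
    "MouseActivity": "Mouse",
    "SpeechActivity": "Microphone",
    "TouchActivity": "TouchScreen",
    "PenBasedActivity": "Pen",
    "TangibleObjectActivity": "TangibleObject",
    "PerceptionActivity": "Display",  # general perception, e.g., viewing a screen
}

def possible_activities(available_devices):
    """Activity categories whose required device is available."""
    return sorted(a for a, d in REQUIRES_DEVICE.items() if d in available_devices)

print(possible_activities({"Keyboard", "Mouse", "Display"}))
# -> ['KeyboardActivity', 'MouseActivity', 'PerceptionActivity']
```

Such a relation is what allows a reasoner to derive which interactions are possible with a given hardware setup, as exploited in the case study below.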

Computational tasks are roughly categorized into information administration, information manipulation, and information presentation tasks, where the latter are the most relevant for the domain of user interfaces and interactions and are thus defined in more detail in the ontology. There are three basic categories for computational tasks defined in UI2Ont:

  • Information Administration Tasks are all tasks concerned with managing data stored in different media. Typical information administration tasks are loading and saving data.

  • Information Manipulation Tasks are concerned with altering information objects, e.g., creating, modifying, and deleting information objects.

  • Information Presentation Tasks are particularly relevant for user interfaces, as they control the presentation of information. We distinguish Information Presentation Initialization Tasks, which start the presentation of an information object, such as displaying or printing an information object, and Information Presentation Modification Tasks, which influence an already started presentation, e.g., by highlighting or moving a Visual Computational Object.

As depicted in Table 1.2, the detail level ontology is richly axiomatized. Those axioms also make explicit knowledge contained in the UI languages’ informal documentations. For example, descriptions such as “a menu bar is a bar that contains menus” can be directly translated into formal axioms in OWL. Such formalizations ensure that the ontology can provide meaningful information in use cases which demand a large amount of formal reasoning on user interfaces. In total, the detail level ontology defines 80 categories of user interface components, 15 categories for user tasks, 20 categories for computational tasks, and 38 categories for user activities.
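The menu bar example can be written as a description logic axiom (the identifier names are illustrative; the ontology's OWL encoding may differ in detail):

```latex
\mathit{MenuBar} \;\sqsubseteq\; \mathit{Bar} \;\sqcap\; \exists\,\mathit{contains}.\mathit{Menu}
```

In OWL Manchester syntax, the same axiom would read `MenuBar SubClassOf Bar and (contains some Menu)`.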

1.7 Case Study: Application Integration on the User Interface Level

To test the applicability of our ontology, we employed it in a framework for integrating user interface components. Applications built with this framework instantiate user interface components, and the coordination of cross-component interactions, such as drag and drop from one component to another, is performed centrally by an ontology reasoner.

The ontology is used in that framework for describing the user interface components that are to be integrated, for annotating events exchanged between those components, and for providing the vocabulary for defining integration rules:

  • Each component is described by using categories from the UI2Ont ontology. This description defines which sub-components the component is built from, and which tasks can be performed with the component. Sub-categories of the existing ontology categories may be defined to describe more specific UI components.

  • At run-time, when a component is instantiated, instance data of the respective realizations, e.g., computational objects, is provided to a reasoner.

  • Each event issued by a component is annotated using the categories from the ontology. The annotation contains information about the type of activity (and corresponding task type) that caused the event, as well as about the involved information objects and UI components.

  • Integration rules are defined that determine which (computational) activities are triggered by events. Those rules are defined using the vocabulary from the ontology.

The UI2Ont ontology alone is not enough to fulfill those functions. An additional ontology of the real world domain that the application is built for is required. For example, when annotating an event, the annotation may state that the user has selected an object in a table which identifies a bank account. While concepts such as Select Action, Mouse, and Table are defined in the UI ontology, concepts like Bank Account or Customer are concepts from the real world domain ontology.
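The annotation-and-rules mechanism can be sketched in a few lines of Python. This is a toy stand-in for the reasoner-based processor: annotations are simplified to strings, and names such as DetailsComponent are hypothetical, while SelectAction, Table, and BankAccount follow the example above:

```python
# Toy stand-in for the reasoner-based event processor; the real
# framework matches events against integration rules via an ontology
# reasoner, not plain Python predicates.

def make_event(activity, information_object, source):
    """An event annotated with categories from the UI and domain ontologies."""
    return {"activity": activity, "object": information_object, "source": source}

# An integration rule pairs a condition over the annotation with the
# computational activity to trigger in a target component (both assumed).
RULES = [
    {"when": lambda e: e["activity"] == "SelectAction"
                       and e["object"] == "BankAccount",
     "then": ("DetailsComponent", "Display")},
]

def process(event):
    """Central coordination: collect all (component, activity) reactions."""
    return [rule["then"] for rule in RULES if rule["when"](event)]

evt = make_event("SelectAction", "BankAccount", "Table")
print(process(evt))
# -> [('DetailsComponent', 'Display')]
```

The pay-off of the ontology-based variant is that rule conditions can refer to superclasses (e.g., any Data Input Component, any Financial Object) and still match specific instances via reasoning.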

Since different applications may use incompatible programming models for representing real world objects, but an exchange of that data is necessary in order to facilitate seamless interaction (such as dragging and dropping an object from one application to another), a rule-based mechanism is employed which supports a transformation between the different programming models, using the domain ontology as an interlingua (Paulheim et al. 2011).

A central event processor, based on an ontology reasoner and rule engine, processes the events, based on the integration rules and the axioms encoded in the ontology, determines how to react to an event, and notifies the respective components about the activities they are supposed to perform as a reaction. Those notifications are again annotated using the ontology. The event processor thus acts as a central coordinator facilitating the integration at run-time. Details about the framework can be found in Paulheim and Probst (2010a).

The integration framework completely decouples the interactions between the applications, which only communicate using the ontologies as an interlingua, forming a comprehensive layer of abstraction over the actual implementation. This allows for integrating even user interfaces developed with different technologies, such as Java and Flex, while still supporting deep integration such as drag and drop (Paulheim and Erdogan 2010).

We have applied the framework in the SoKNOS project for building an integrated emergency management system (Babitski et al. 2011), comprising 24 applications (see Fig. 1.9). The respective integrated applications use 99 different annotated component types (only those components had to be annotated that are used in some cross-application interaction), and 189 different event types. For the prototype, we have used an additional domain ontology of emergency management (Babitski et al. 2009), which consists of 214 classes, 330 relations, and 1514 axioms.

Fig. 1.9 Screenshot of the SoKNOS project depicting the user interfaces of integrated applications (reprinted from Paulheim et al. 2009). The arrows indicate examples for possible interactions

In the SoKNOS project, different types of interactions with multiple modalities and user interface components have been combined, including desktop computers and laptops, large touch screens, and speech interaction devices. With the narrower-scoped ontologies discussed in Sect. 1.4, that spectrum of interactions could not have been covered. Furthermore, the use of a reasoner for the automatic computation of possible interactions at run-time was only possible through the rich axiomatization of the UI2Ont ontology. Although a reasoner is run every time an event is processed, the event processing times are still below one second (Paulheim 2010).

1.8 Conclusion

In this chapter, we have laid out a number of use cases in which an ontology of the user interfaces and interactions domain can improve either the development process of user interfaces, or the user interfaces themselves. Motivated by those use cases, we have discussed the construction of a rigid, richly axiomatized ontology of the domain, divided into a top level and a detail level ontology. The former contains modality-independent descriptions of tasks and components at design time, while the latter contains modality-dependent descriptions of activities and components at run time. The ontologies are based on a set of foundational ontologies, especially the generic top level ontology DOLCE.

For identifying the relevant concepts, we have used a number of existing user interface description languages. We have clustered the concepts identified and categorized them using the top level categories given by the reused higher level ontologies. Our analysis has shown that there is a “long tail” of concepts that are only covered by a few user interface description languages. This shows that it is beneficial to use the input of several of those languages.

From the use cases discussed as a motivation, we picked the integration of user interface components to show how our ontology performs in a real world scenario, based on the case of a large-scale emergency management system.

Besides the use cases discussed in this chapter, having a comprehensive and formal ontology of the domain of user interfaces and interactions has a number of additional advantages. As discussed above, several user interface description languages exist. During the process of building the ontology, we have observed a number of semantic ambiguities between those languages, e.g., the use of elements called Dialog with different meanings in different languages (a set of interactions following each other, a window on a screen blocking an application, etc.). Another example is the List element, which is sometimes used for static lists in texts (such as in HTML), sometimes for interactive selections (such as combo boxes). Such ambiguities make it difficult to work with different languages in parallel, especially without extensively consulting the respective documentations. Annotating user interface descriptions in different languages with a formal ontology can help identify and resolve those ambiguities and foster an easier understanding of user interface models.

In their classic paper from 1996, Uschold and Gruninger discussed the vision of an ontology being used as an inter-lingua bridging different languages (Uschold and Grüninger 1996). Transferred to the domain of user interfaces and interactions, this vision could be embodied by a system able to translate between arbitrary user interface description languages and automatically convert models from one language to another, resulting in the ultimate portability of user interfaces across systems, platforms, and modalities. Although this vision is still distant, we believe that we have taken an important step in that direction by developing a unifying formal and comprehensive ontology of the domain.