Keywords

1 Introduction

In the effort to improve patient safety, much attention has centered on the importance, benefits, and challenges of developing well-functioning patient safety reporting systems. Among the many research topics in patient safety, knowledge management in human-computer interaction (HCI) has been identified as a crucial research objective for addressing challenges that remain weak links in reporting systems, including which data types to report [1], how to minimize reporting costs [2], and how to conduct evaluation [3]. Knowledge plays a pivotal role in patient safety by bridging patient safety data and healthcare quality. The development and application of a knowledgebase is central to sharing and learning from safety events. High-quality data can provide invaluable insights into potential safety concerns and support root cause analysis for further interventions. Therefore, its importance has been increasingly recognized [1, 4, 5].

Many reporting systems suffer from low data quality, including inefficient and ineffective data entry, inconsistent data formats, and technical challenges in processing text data [1, 6, 7]. An increase in the quantity of data generated by reporting systems does not guarantee better system performance; on the contrary, it gradually becomes a burden for data processing. This is primarily because the majority of patient safety data is recorded in free text. Although free text may be an efficient and natural means for users to describe informative cases, it is costly to turn the raw information into a cognitively organized and manageable format for professionals to use. The use of pre-defined reporting categories was proposed as a key component of patient safety reporting [1]. However, such structured data entry can be limited in both timeliness and accuracy [8]. Natural language provides the richest information about the details of patient safety events [9, 10], yet it prevents traditional computerized systems from processing the data effectively. It is generally agreed that manually categorizing large-scale datasets is not practical. Because of these barriers, data quality has hindered the development of patient safety reporting.

Taxonomies can help address these problems because they can organize patient safety events as a knowledgebase. The use of taxonomies for documenting and classifying patient safety reports can be traced back to 1987, when the Australian Patient Safety Foundation (APSF) first employed a taxonomy in the Australian Incident Monitoring System [11]. Many other taxonomies were later developed to perform a similar function. Most of them serve different purposes from one hospital to another and vary in structure and terminology. Nevertheless, the taxonomies developed for organizing patient safety events have yielded limited value for healthcare providers and patients. A notable barrier to learning from safety events is the lack of a comprehensive architecture and a sharable format for the taxonomy [12]. Clear and consistent definitions and terms are central to describing the full spectrum of patient safety. The World Health Organization (WHO) World Alliance for Patient Safety launched a project that described a conceptual framework for the International Classification for Patient Safety (ICPS) [13]. A WHO drafting group intended to construct a standardized collection of concepts in a hierarchy. The ICPS serves as a taxonomy of patient safety and an underlying knowledgebase that supports any type of patient safety reporting. However, clinical practice in patient safety reporting still requires a transformation from ICPS to a localized patient safety classification. For example, the Common Definitions and Reporting Formats (Common Formats, CFs) promulgated by the Agency for Healthcare Research and Quality (AHRQ) were made compatible with ICPS and are used for patient safety reporting in US hospitals [1]. In combination, ICPS and CFs benefit the reporting, management, and improvement of patient safety reports from the perspectives of both academic research and clinical practice. Unfortunately, they show limited advantages in handling the ever-growing concepts, terms, and real-world data incorporated into reporting systems.

2 Background

2.1 Semantic Web Ontology

Ontologies are explicit specifications of conceptualizations, and these specifications define a taxonomy of the knowledge [14]. In medical error reporting, a taxonomy serves as a controlled vocabulary represented in a hierarchical structure. In general, an ontology provides a broader scope for describing domain information than a taxonomy does. Specifically, an ontology models real-world knowledge by encoding entities and the relationships between them.

The semantic web describes information with explicit meanings and supports machine-interpretable web content [15]. Semantic web ontologies have been used in the biomedical domain for various purposes, including facilitating biomedical data integration, managing biomedical concepts, and supporting ontology-driven biomedical natural language processing (NLP), disambiguation, and named-entity recognition (NER) [16, 17]. We employed a semantic web ontology in our project because it captures both the hierarchical structure of patient safety concepts and the relations among related concepts [18]. In addition, semantic web technologies are compatible with description logic reasoning and other data mining processes. In this project, we build a semantic web ontology using the W3C open standard Web Ontology Language (OWL), through which we are able to (1) identify unique concepts/terms that appear in different sources or under different names; (2) encode all the relations between concepts/terms in a given domain; and (3) perform semantic reasoning.

2.2 Source Taxonomy

WHO International Classification for Patient Safety. The ICPS conceptual framework contains approximately 600 patient safety concepts drawn from existing classifications. The concepts are organized under ten top-level categories: incident type, patient characteristics, incident characteristics, detection, mitigating factors, patient outcomes, organizational outcomes, ameliorating actions, actions taken to reduce risk, and contributing factors/hazards. Each category may contain subcategories for organizing its concepts. The ICPS provides a supportive structure that eases the transformation into an ontology. For this reason, we used the ICPS conceptual framework as the cornerstone of our proposed ontology. Specifically, we kept the hierarchical structure, concepts, and terms of ICPS as the foundation for the subsequent design and revision.

AHRQ Common Formats. The Common Formats (v1.2) developed by AHRQ are primarily designed for clinical purposes. Recognized as a unified standard for reporting patient safety events, the CFs are designed to specify and collect event information ranging from general concerns to frequently occurring and serious types of events. The complete forms comprise generic formats and event-specific formats. The generic formats are designed for collecting the incidents, near misses, and unsafe conditions that apply to all patient safety events. The information collected is organized into the following subcategories: types of event, circumstances of event, patient information, and reporting/reporter/report information. The event-specific formats are used for collecting patient safety concerns that occur with high frequency and/or in severe events. These formats collect information such as definitions of the event, scope of reporting, risk assessments and preventive actions, and circumstances of the event. We employed the CFs as an additional data source from which we extracted semantic knowledge and encoded it into our ontology.

3 Design and Implementation

3.1 The Framework

The main component of an ontology is a set of triples, which state what is assumed to be true in a given domain [15]. A simple triple declares two entities in the domain and the relation between them. For example,

Cellular products has Red blood cells

“Cellular products” and “red blood cells” are entities that we assume to exist in the domain the ontology models, whereas “has” is a type of relation that links the two entities. This triple is assumed to be true in the ontology and must pass semantic reasoning for it to carry meaning.
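To make the structure of a triple concrete, the following minimal Python sketch stores statements as (subject, relation, object) tuples and retrieves them by pattern. It is an illustration only: the `match` helper and the second statement are hypothetical and are not taken from the ontology described in this paper.

```python
# (subject, relation, object)
Triple = tuple[str, str, str]

# The first statement is the example from the text; the second is a
# hypothetical addition used only to make the query below non-trivial.
TRIPLES: set[Triple] = {
    ("Cellular products", "has", "Red blood cells"),
    ("Cellular products", "has", "Platelets"),
}


def match(triples, subject=None, relation=None, obj=None):
    """Return, in sorted order, the triples that agree with every field
    that is not None."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (relation is None or t[1] == relation)
        and (obj is None or t[2] == obj)
    )


found = match(TRIPLES, subject="Cellular products", relation="has")
objects = [o for _, _, o in found]
print(objects)
```

Leaving a field as `None` acts as a wildcard, so the same helper can answer "what does X have?" as well as "which entities have Y?".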

Similarly, the concepts and terms used in ICPS are taken as true assumptions and are fundamental to our ontology. We therefore imported the concepts and terms from ICPS together with the CF concepts that originated from ICPS. Where ICPS and CFs differ, we adopted the CF concepts, giving priority to the practical needs of patient safety reporting in US hospitals. For example, patient fall is one of the most frequently occurring events reported to AHRQ Patient Safety Organizations, yet ICPS provides no definition or individual category for it. ICPS is still under development and evaluation by WHO experts from various disciplines. Therefore, we believe the design and implementation of the ontology benefit from remaining flexible about using combined concepts and/or knowledge structures. To help understand the framework, the top-level concepts of the ontology are shown in Table 1.

Table 1. List of concepts and terms for top-level ontology

3.2 Data Transformation

Although the entities implemented in our ontology were initially exported from ICPS and CFs, the data cannot be used directly for ontology construction without translation. This is because an ontology breaks a statement down into entities and relations, whereas the concepts and terms used in ICPS and CFs were not designed in this fashion. The translation process comprised two steps.

First, we followed a set of principles for building a comprehensible ontology to obtain high-quality data from ICPS and CFs [19]. Most concepts and terms from ICPS are semantically clear and ready to be imported into our ontology. Because the CFs are used as a guideline for real-world patient safety data entry, their concepts and terms do not fit the ontology without transformation. Three coders, two Doctors of Medicine (YG and XW) and one ontology developer (CL), participated in the translation process.

Second, we manually extracted the relations that link concepts and terms in ICPS and CFs and implemented them in the ontology. The relations in ICPS have a clear schema because ICPS was developed as a uniform classification in which the hierarchical relations between concepts/terms are well defined. Following the ICPS structure, we linked the selected concepts and terms in CFs with those in ICPS by using the existing relations extracted from ICPS and/or defining appropriate new relations. Four reviewers (YG, XW, CL, and KA) discussed the effectiveness of the relation implementation and performed revisions until a consensus was reached.
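The linking step can be pictured with a small Python sketch. It assumes a simplified model in which each concept has a single parent; the root concept "Patient safety concept", the function names, and the placement of "Fall" directly under "Incident type" are illustrative assumptions rather than the actual content of the ontology.

```python
# Child -> parent "is_a" links. The two ICPS entries use top-level category
# names from the text; the root "Patient safety concept" is hypothetical.
ICPS_IS_A = {
    "Incident type": "Patient safety concept",
    "Contributing factors/hazards": "Patient safety concept",
}

# A concept taken from the Common Formats that has no ICPS category.
# Its placement under "Incident type" is an assumption for illustration.
CF_IS_A = {
    "Fall": "Incident type",
}


def merge_hierarchies(icps, cfs):
    """Combine both sources; a CF link overrides an ICPS link on conflict."""
    merged = dict(icps)
    merged.update(cfs)
    return merged


def ancestors(concept, is_a):
    """Walk the is_a links upward and return every ancestor, nearest first."""
    path = []
    seen = {concept}
    while concept in is_a:
        concept = is_a[concept]
        if concept in seen:  # guard against accidental cycles
            raise ValueError(f"cycle detected at {concept!r}")
        seen.add(concept)
        path.append(concept)
    return path


ONTOLOGY = merge_hierarchies(ICPS_IS_A, CF_IS_A)
fall_ancestors = ancestors("Fall", ONTOLOGY)
print(fall_ancestors)
```

Letting the CF entry win on conflict mirrors the priority given to CF concepts in Section 3.1; a real OWL ontology would also allow multiple parents and non-hierarchical relations.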

3.3 Ontology Implementation

We imported the corresponding data into the ontology, which was implemented in Protégé 4.3.0. Once the implementation was completed, we performed consistency checking, classification, and semantic reasoning on the ontology [20, 21]. The resulting ontology provides a uniform knowledge representation that links the entities and relations of patient safety reports.
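As a rough intuition for what consistency checking looks for, the Python sketch below flags a concept that sits under two classes declared disjoint. This is a toy check under stated assumptions, not the procedure of the description logic reasoners used with Protégé; the concept names, the disjointness declaration, and the deliberately inconsistent entry are all hypothetical.

```python
from itertools import combinations

# Hypothetical multi-parent hierarchy: concept -> set of direct parents.
PARENTS = {
    "Incident": {"Patient safety event"},
    "Near miss": {"Patient safety event"},
    "Fall": {"Incident"},
    "Misfiled fall": {"Incident", "Near miss"},  # deliberately inconsistent
}

# Pairs of classes declared to share no members (an assumption for this toy).
DISJOINT = {frozenset({"Incident", "Near miss"})}


def all_ancestors(concept, parents):
    """Collect every direct and indirect parent of a concept."""
    found = set()
    stack = list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


def unsatisfiable(parents, disjoint):
    """Return the concepts that fall under two classes declared disjoint."""
    bad = []
    for concept in sorted(parents):
        # A class counts as its own superclass for this check.
        supers = all_ancestors(concept, parents) | {concept}
        pairs = combinations(sorted(supers), 2)
        if any(frozenset(pair) in disjoint for pair in pairs):
            bad.append(concept)
    return bad


problems = unsatisfiable(PARENTS, DISJOINT)
print(problems)
```

A full OWL reasoner handles far richer axioms (property restrictions, cardinalities, equivalences); the sketch covers only the subclass-plus-disjointness case.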

3.4 Evaluation

We designed an evaluation to demonstrate the effectiveness and validity of the ontology. The evaluation assesses two aspects: (1) the user experience of domain experts and (2) the effectiveness and validity of the ontology. A questionnaire using a 5-point Likert scale was designed for the evaluation. An example of the draft questionnaire is shown in Table 2.

Table 2. A set of questions listed in the questionnaire for ontology evaluation

The questionnaire is subject to a pre-assessment before the domain experts evaluate the ontology. The pre-assessment is intended to measure the content validity and inter-rater reliability of the questionnaire and to revise the questionnaire as needed. Content validity measures the extent to which the designed questions, in the raters' judgment, reflect the tasks they purport to measure. Inter-rater reliability measures the degree of agreement among the raters. Table 3 enumerates the set of scales used for content validity in the pre-assessment. Inter-rater reliability is measured using the questions shown in Table 2.
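The paper does not name the statistic used for inter-rater reliability. As one common option when more than two raters score the same items, the sketch below computes Fleiss' kappa in plain Python. The function name and the ratings are hypothetical and serve only to show the calculation.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to rating category j. Every item must be rated by
    the same number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of raters")

    # Observed agreement, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Agreement expected by chance, from the overall category proportions.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        return float("nan")  # every rating is in one category; kappa undefined
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical data: 4 questions, 3 raters, 5-point Likert scale (columns 1..5).
ratings = [
    [0, 0, 0, 0, 3],
    [0, 0, 0, 3, 0],
    [0, 0, 0, 2, 1],
    [0, 0, 0, 1, 2],
]
kappa = fleiss_kappa(ratings)
print(round(kappa, 3))
```

Fleiss' kappa treats the rating categories as nominal, so it ignores the ordering of Likert points; a weighted or ordinal agreement measure may be preferable when near-misses between adjacent points should count as partial agreement.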

Table 3. Questions for measuring content validity

4 Discussion and Future Work

Patient safety research has grown rapidly in the past decade. However, the depth of research and the broader application of patient safety systems have been constrained by roadblocks such as redundant data volume, inconsistent data formats, and technical challenges in processing semantic data [1, 6]. We proposed an ontology that represents concepts/terms and their relations as a step toward a knowledgebase for patient safety data and information. The goal of this ontological knowledgebase is to facilitate data entry, information retrieval, and data management in patient safety reporting systems. By employing an ontological method, the present work holds promise for addressing these challenges in patient safety research.

The present work is part of the effort to build a unified knowledgebase, which has been identified as a significant contribution to patient safety reporting. The redundant volume and low quality of patient safety data reveal a pressing need for a comprehensive knowledgebase that could aid data entry and storage for clinical use and research. Ironically, there has been no lack of well-defined taxonomies over the decades. Table 4 lists patient safety taxonomies/ontologies that were previously used or are presently in use.

Table 4. A review of taxonomies/ontologies used for specific domains

While these taxonomies/ontologies have served primarily as domain-specific standards, the rapid increase in medical information calls for a unified knowledgebase that can be used across patient safety reporting systems. ICPS and CFs are two remarkable milestones in representing patient safety events. They contribute significantly to the disambiguation of concepts and terms by defining uniform concepts and preferred terms. ICPS lays particular emphasis on such definitions, which holds promise for increasing data quality in patient safety reporting systems [13]. The CFs have received nationwide acceptance in US hospitals because they provide a standard for recording events ranging from general concerns to frequently occurring and serious types of patient safety events. These features of the CFs are critical in practical use. For example, patient fall as an incident type accounts for a large portion of hospital statistics in the US. The CFs treat patient fall as a frequently occurring event type in hospitals, whereas ICPS has not defined this concept and/or term or indicated its significance. From this viewpoint, the CFs play an important role in the practical use of a patient safety knowledgebase. By integrating the advantages of ICPS and CFs, our ontology may provide better standardization and flexibility for nationwide use in US hospitals.

The present work endeavors to meet the challenges of data consistency and semantic data processing, and therefore provides a different angle on improving data quality in patient safety reporting. Among the many factors fundamental to high-quality data in patient safety reporting, incompleteness and inaccuracy of data were identified as two major concerns [1, 30]. Our strategy for addressing these concerns focuses on data entry, a critical step in the reporting system [8, 31, 32]. Meanwhile, the capability of a system to handle semantic data has drawn our attention because the majority of patient safety data are recorded in free text. Our ontological approach helps improve system performance in terms of data entry and semantic data processing. Data entry that combines structured and unstructured formats is a reasonable yet technically challenging approach to data management and information retrieval. Used as a conceptual map, the ontology would help annotate unstructured data by mapping it to the concepts, terms, and relations of the domain. For semantic data processing, the ontology serves as a thesaurus that supports NLP in information extraction tasks. The ontology also helps with disambiguation when terms used in patient safety (e.g., acronyms) may represent different things. Therefore, there is a pressing need to develop a domain ontology.
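The thesaurus role can be sketched as a simple dictionary lookup over free text. The Python example below is a minimal illustration, not the annotation pipeline planned in this work: the lexicon entries, the function names, and the sample report are hypothetical, and real acronym disambiguation would need context beyond a one-to-one mapping.

```python
import re

# Hypothetical lexicon: lowercase surface term -> preferred ontology concept.
LEXICON = {
    "fall": "Fall",
    "fell": "Fall",
    "red blood cells": "Red blood cells",
    "rbc": "Red blood cells",
}


def build_pattern(lexicon):
    """Compile one regex that tries longer terms first and matches whole words."""
    terms = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(t) for t in terms)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def annotate(text, lexicon):
    """Return (matched text, concept) pairs in order of appearance."""
    pattern = build_pattern(lexicon)
    return [
        (m.group(0), lexicon[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


report = "Patient fell near the bed while an RBC transfusion was pending."
annotations = annotate(report, LEXICON)
print(annotations)
```

Mapping several surface forms to one preferred concept is what lets reports written in different words be retrieved and counted together.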

Throughout the development of the ontology, we need to continuously evaluate its effectiveness, confidence, and acceptance along multiple dimensions. We are currently evaluating whether our ontology is valid on a broad scale and how compatible it would be with various reporting systems across healthcare settings. In the future, we will design and implement a pipeline, supported by the ontology, to collect, manage, analyze, and retrieve patient safety data. The pipeline will function as a set of modules for sharing domain knowledge and supporting data entry, semantic data annotation, document similarity, and so forth.