Background

This review appraises studies examining the different approaches to integrating patient data from heterogeneous information systems (IS). Special attention is given to the type of integration engine and the type of integrated data. Articles published in English between 1995 and 2005 with abstracts available were reviewed. We aimed specifically to review the integration of patient data, and how systems are evolving in practice to meet patient, professional and organisational needs.

A patient record is a set of documents containing clinical and administrative information regarding one particular patient, supporting communication and decision making in daily practice, and having different users and purposes [1]. Clinical care increasingly requires healthcare professionals to access patient record information that may be distributed across multiple sites, held in a variety of paper and electronic formats, and represented as mixtures of narrative, structured, coded and multimedia entries [2]. In hospitals, information technologies tend to combine different modules or subsystems, resulting in a best-of-breed approach [3]. Integration of healthcare Information Systems (IS) is essential to support shared care in hospitals, to provide proper care to mobile individuals and to make regional healthcare systems more efficient. However, to integrate clinical IS in a way that will improve communication and data use for healthcare delivery, research and management, many different issues must be addressed [4–6]. Consistently combining data from heterogeneous sources takes a great deal of effort because the individual feeder systems usually differ in several aspects, such as functionality, presentation, terminology, data representation and semantics [3]. It is still a challenge to make electronic health records interoperable because good solutions to the preservation of clinical meaning across heterogeneous systems remain to be explored [2]. Over the years, different solutions to these problems have been proposed and some have been applied. Many of these solutions coexist in today's healthcare settings and are influenced by technology innovation and changes in healthcare delivery. Some of these solutions use differing standards and data architectures that may prove to be the greatest obstacle to semantic interoperability [7].

Methods

Eligible studies

Only studies describing or evaluating IS implementation for integrating patient data from heterogeneous IS were selected.

Review team

The review team was composed of three computer scientists (Ana Margarida Ferreira, Pedro Vieira Marques and Ricardo Cruz Correia) and one medical doctor (Filipa Canário Almeida), advised by two health informaticians experienced in systematic reviewing (Jeremy Crispin Wyatt and Altamiro Costa Pereira).

Search methods

The bibliographic databases were searched between September and October 2005. Since there is no specific standardised MeSH term, we developed a search string that includes the concepts of patient record, computers and data integration or sharing. Only articles with an abstract in English were considered. Given the significant evolution in ICT in the last decade, only studies published after 1994 (the last ten years) were included.

Three distinct bibliographic databases were searched: Medline (via Pubmed), ISI (ISI Web of Knowledge) and IEEE (IEEE Xplore). The query search string used in each database was ((medical or clinical or patient) and record*) and (comput* or digital or electronic*) and (integrat* or link* or sharing or share or shared).
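The Boolean structure of this search string can be illustrated in code. The following Python sketch is not how the databases evaluate queries; it assumes a simple word-level match in which a trailing * acts as a prefix wildcard, and the names (matches_query and the term lists) are hypothetical.

```python
import re

# Each list is one OR-group of the search string; a trailing "*" means
# "any word starting with this prefix" (an assumption of this sketch).
RECORD_SUBJECT = ["medical", "clinical", "patient"]
RECORD_TERM = ["record*"]
COMPUTER_TERMS = ["comput*", "digital", "electronic*"]
INTEGRATION_TERMS = ["integrat*", "link*", "sharing", "share", "shared"]


def _any_term(words, terms):
    """Return True if at least one word matches at least one term."""
    for term in terms:
        if term.endswith("*"):
            prefix = term[:-1]
            if any(word.startswith(prefix) for word in words):
                return True
        elif term in words:
            return True
    return False


def matches_query(text):
    """Apply the four AND-ed groups of the search string to a title or abstract."""
    words = re.findall(r"[a-z]+", text.lower())
    return (
        _any_term(words, RECORD_SUBJECT)
        and _any_term(words, RECORD_TERM)
        and _any_term(words, COMPUTER_TERMS)
        and _any_term(words, INTEGRATION_TERMS)
    )
```

Under these assumptions, a title must contain at least one term from each of the four groups to be retrieved, which is why the string is broad in each concept but narrow in their combination.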

This search method found 2443 articles in Pubmed, 961 in ISI and 414 in IEEE Xplore, a total of 3818 articles. After duplicates were eliminated, 3124 articles remained.

Selection of studies for the review

All four reviewers from the review team were involved in study selection. Six combinations of reviewer pairs were defined, due to the large number of articles found. The first selection was based on the study title. Each pair of reviewers read 512 titles. The study was considered eligible when at least one of the reviewers considered that the title mentioned one of three key concepts:

– Patient Records (e.g.: patient record, EPR, EHR, EMR, clinical documents – CDA, administrative database)

– Integration (e.g.: IS integration, record linkage, information sharing)

– Distributed environment (e.g.: e-Health, distributed healthcare, shared healthcare)

A total of 923 of 3124 articles were selected in this first selection on title alone.

The second phase of the study selection was based on abstracts. Again, six combinations of reviewer pairs were defined. Each pair of reviewers read 154 abstracts. The inclusion criterion in this phase was that articles should fulfil all three of the following conditions:

– Describe or assess IS implementations

– Integrate patient data from various IS

– Describe the technology used to integrate

To maximize specificity, only selection by both reviewers was considered adequate. In cases of disagreement a third reviewer was called to decide. A total of 84 out of 923 articles were selected to be read entirely. These 84 articles were grouped into 69 distinct integration projects to avoid the distortion created by multiple papers describing the same project. All statistical analysis is based on projects and not on articles. Some of the articles (n = 13) were descriptions of project plans or architecture models that had not yet been implemented in a real setting or even as a prototype. These projects were also excluded, leaving only 56 projects. Figure 1 is a flowchart illustrating the different stages of paper selection.
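The two decision rules used in the selection phases can be summarised in a short sketch. The Python code below is an illustration only: the reviewer labels and function names are placeholders, and it assumes each reviewer's decision is recorded as a simple yes/no vote.

```python
from itertools import combinations

# Placeholder labels for the four reviewers (not the authors' names).
REVIEWERS = ["R1", "R2", "R3", "R4"]

# All unordered pairs of four reviewers: C(4, 2) = 6 combinations.
REVIEWER_PAIRS = list(combinations(REVIEWERS, 2))


def title_phase(vote_a, vote_b):
    """Phase 1 (titles): keep the article if at least one reviewer selects it."""
    return vote_a or vote_b


def abstract_phase(vote_a, vote_b, third_vote=None):
    """Phase 2 (abstracts): keep only if both reviewers select the article.

    When the two reviewers disagree, a third reviewer's vote decides.
    """
    if vote_a == vote_b:
        return vote_a
    if third_vote is None:
        raise ValueError("disagreement needs a third reviewer's decision")
    return third_vote
```

The first rule favours sensitivity (an article survives on a single vote), whereas the second favours specificity (agreement, or a tie-breaking third vote, is required).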

Figure 1

Diagram showing the methods used for study selection.

Underlying model and definition of variables

Figure 2 illustrates the stages of a generic integration of heterogeneous IS. The variables examined in this review relate to these stages and are intended to describe the context in which the integration takes place (country, date, area covered, institutions involved, type of final users), the type of data integrated and the technology used (standards, communication methods, integration model, repositories of data, client applications).

Figure 2

Framework for generic integration of heterogeneous Information Systems showing the stages and variables considered in the review.

The variables are:

– Country where the system is implemented;

– Date of article publication;

– Area covered by each project (country, region, hospital, department);

– Institutions involved as sources for patient data integration, i.e., institutions that own feeder systems to integration (departments, hospitals, primary care, private clinics, private labs, patient health portals) – multiple values are accepted;

– What type of medical data is integrated (lab orders, lab results, prescription orders, diagnosis or problems, procedures, admission letters, discharge letter, transfers letters, referral letters, medical images, biosignals) – multiple values are accepted;

– Medical informatics standards used (e.g.: HL7 – Health Level 7, CDA – Clinical Document Architecture, GEHR – Good European Health Record, SCIPHOX – Standardized Communication of Information Systems in Physician Offices and Hospitals using XML, DICOM – Digital Imaging and Communications in Medicine, MML – Medical Markup Language) – multiple values are accepted;

– Communication method (DICOM, DDE – Dynamic Data Exchange, e-mail, computer agents, Web services, Direct database access, CGI – Common Gateway Interface, CORBA – Common Object Request Broker Architecture, DHE – Distributed Healthcare Environment) – multiple values are accepted;

– Type of integration model for semantic interoperability (direct communication, i.e. when the systems create different interfaces to connect to each other; middleware, i.e. when an application programming interface is made available to talk with the central repository; semantic, i.e. when all possible data have a predefined message template, so that both semantics and syntax are known; generic, i.e. when the document structure accepts a certain degree of evolution without re-defining the whole template) – adapted from Bernstein et al. [8] – only one type of model is accepted;

– Type of data repository (File System, Database, PACS – Picture Archiving and Communication System, LDAP – Lightweight Directory Access Protocol, Virtual repository system) – multiple values are accepted;

– How data are made available to users (client application or web browser) – multiple values are accepted;

– How data are made available to other IS (Web services, CORBA or others) – multiple values are accepted;

– User groups (health professionals – medical users, nurses and other clinicians, clerical staff and patients) – multiple values are accepted.
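The variables listed above can be pictured as one extraction record per project. The following Python sketch is an assumption about how such a record might be represented, not the form actually used in the review; all class and field names are hypothetical, and the example values are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class IntegrationModel(Enum):
    """The four integration models; only one is accepted per project."""
    DIRECT = "direct communication"
    MIDDLEWARE = "middleware"
    SEMANTIC = "semantic"
    GENERIC = "generic"


class Area(Enum):
    """Area covered by a project (assumed here to be single-valued)."""
    COUNTRY = "country"
    REGION = "region"
    HOSPITAL = "hospital"
    DEPARTMENT = "department"


@dataclass
class ProjectRecord:
    country: str
    publication_year: int
    area: Area
    integration_model: IntegrationModel
    # Variables for which multiple values are accepted are stored as sets.
    institutions: set = field(default_factory=set)
    data_types: set = field(default_factory=set)
    standards: set = field(default_factory=set)
    communication_methods: set = field(default_factory=set)
    repositories: set = field(default_factory=set)
    user_access: set = field(default_factory=set)
    system_access: set = field(default_factory=set)
    user_groups: set = field(default_factory=set)


# Invented example values, for illustration only.
example = ProjectRecord(
    country="Exampleland",
    publication_year=2004,
    area=Area.REGION,
    integration_model=IntegrationModel.SEMANTIC,
    standards={"HL7", "CDA"},
)
```

Modelling the integration model as a single enumerated value and the remaining technology variables as sets mirrors the distinction between "only one type of model is accepted" and "multiple values are accepted".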

Time intervals considered

To analyse time trends, we divided the total period up into three shorter periods because of the small overall number of projects identified. The first period includes projects with their last publication in 1994–1999, the second period with their last publication in 2000–2002 and the third period with their last publication in 2003–2005.
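As a minimal sketch, the assignment of a project to a period from the year of its last publication could be written as follows; the function name and the period labels are hypothetical.

```python
def period_of(last_publication_year):
    """Map the year of a project's last publication to one of the three periods."""
    if 1994 <= last_publication_year <= 1999:
        return "1994-1999"
    if 2000 <= last_publication_year <= 2002:
        return "2000-2002"
    if 2003 <= last_publication_year <= 2005:
        return "2003-2005"
    raise ValueError(f"year {last_publication_year} is outside the review window")
```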

Statistical analysis

The statistical analysis was performed with SPSS® version 14. P values in Table 1 were calculated using Pearson and linear-by-linear association chi-square tests with significance level of 0.05.
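For readers unfamiliar with the linear-by-linear association test, the statistic can be sketched outside SPSS. The Python code below is an illustrative re-implementation under stated assumptions, not the procedure used to produce Table 1: it computes M² = (N − 1)r², where r is the Pearson correlation between row and column scores weighted by cell counts, and compares it with a chi-square distribution with one degree of freedom. The function name, scores and any counts passed to it are hypothetical.

```python
import math


def linear_by_linear(table, row_scores, col_scores):
    """Return (M2, p) for the linear-by-linear association test.

    table[i][j] is the count in row i and column j; row_scores and
    col_scores assign an ordinal score to each row and column.
    """
    n = sum(sum(row) for row in table)
    mean_r = sum(row_scores[i] * sum(row) for i, row in enumerate(table)) / n
    mean_c = sum(
        col_scores[j] * count for row in table for j, count in enumerate(row)
    ) / n
    cov = var_r = var_c = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            dr = row_scores[i] - mean_r
            dc = col_scores[j] - mean_c
            cov += count * dr * dc
            var_r += count * dr * dr
            var_c += count * dc * dc
    r = cov / math.sqrt(var_r * var_c)
    m2 = (n - 1) * r * r
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(m2 / 2))
    return m2, p
```

In a trend analysis of the kind reported here, the rows might be "technology not used" and "technology used" (scores 0 and 1) and the columns the three time periods (scores 1, 2 and 3); a p value below 0.05 would then be read as evidence of a linear trend over the periods.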

Table 1 Frequencies (and percentages) for each variable analysed among the 56 data integration projects reviewed

Results

Study selection

The agreement rate between reviewers was 83% for the first phase and 77% for the second phase. The number of different IS implemented was 56. Table 2 lists all integrated IS considered in this review, their country, number of publications and period of publication. Countries with the most published projects were the USA (15), Germany (8), Greece (6), Denmark (4) and China (4). Most IS (73%) have just one publication. 52% of the IS had their last publication in the period 2003–05, and 36% during 2000–02.

Table 2 Integrated IS included in the review, country in which installed, number and date of publications.

Trends

Area covered by integration

59% of the IS covered only a region, while 29% covered a hospital, 9% a department and 4% a whole country. Publications on projects covering a hospital showed a downward trend, from 57% until 1999 to 35% in 2000–02 and 17% in 2003–05. The proportion of projects covering a region or country has increased over the years, and currently represents 76% (p = 0.037).

Institutions involved in the integration

Most of the integrated information comes from hospital IS (69%), with departmental (40%) and primary care (33%) IS representing the next two most frequent institution types. Four projects (8%) integrated information from health portals; all were published in the most recent period considered (2003–05).

User groups

As expected, all information systems provided access to health professionals. Two recent projects claim to give patients access to data [9, 10]. Medical doctors are more often referenced as users (48%) than nurses (10%).

Integrated data

77% of the projects integrated diagnosis and problems, 67% medical images, 65% lab results, 63% discharge notes and 60% procedures. There has been an increase in projects integrating referral letters (from 0% until 1999, to 18% in 2000–02 and to 25% in 2003–05).

Type of models

Regarding the type of integration model, although the numbers of projects using predefined message templates (semantic – all data structured) and middleware are very similar (44% and 40% respectively), there seems to be a trend towards more predefined message templates (46% in 2003–05) and fewer middleware solutions (31% in 2003–05). This tendency is clearer if the values for the projects using messaging (both "Semantic – all data structured" and "Generic – structure and data dynamic") are added, representing 58% in 2003–05. Direct communication to databases is rare (10%) and more flexible messaging is now appearing (12% in 2003–05).

Messaging standards

HL7 is the most frequently used messaging standard (68%). CDA seems to be becoming the reference format within HL7 (25% in 2003–05). DICOM is becoming less used relative to other standards, which is understandable as it is mainly for images. Nevertheless, DICOM is no longer the only successful example of standards use in medical communication protocols. Other standards currently have very low usage (19% in 2003–05).

Repository

Regarding the type of data storage, 77% of the projects stored data in databases, 25% used virtual repositories and 16% stored in files. There is no real change over the periods considered.

Communication method

Since 2000, a wider range of technologies has been used to establish communication (3 until 1999, 8 in 2000–02 and again 8 in 2003–05). Web services have become increasingly important (p = 0.042), whilst direct database access and Common Gateway Interface have become less important.

How data are made available for users

92% of the Information Systems use a Web browser to deploy their applications, whilst only 19% give user access through client-server applications.

How data are made available for other IS

88% of the IS use Web services to communicate with other systems, whilst only 13% use CORBA. The absolute number of systems using Web services has grown from zero until 1999 to two in 2000–02 and five in 2003–05.

Current status (results regarding 2003–05)

Currently there are more projects carrying out regional integration, especially between hospitals and primary care. Referral letters are mentioned in 7 of the 29 projects described in articles published in 2003–05. It is also clear that patients are becoming active participants, because they appear for the first time as a user group in the more recent projects.

Regarding integration models, messaging between systems, both Semantic and Generic, has recently been used more frequently (58%) than middleware (31%). Databases are still the most common method for data storage (86%). Communication between integrated systems uses many different technologies, with Web services being used in 41% of the projects. The most common user interface by far is the Web browser (90%).

Discussion

Our results show an increasing number of publications describing projects which integrate data from multiple Information Systems. This is in agreement with our initial assumption about the interest in improving the communication of health related data to support person-centred healthcare. As the number of heterogeneous health IS grows, their integration becomes a priority. Moreover, we may be witnessing an increasing interest in regional integration between heterogeneous healthcare information systems across different institutions, to help communication between the different stakeholders (primary and secondary care doctors, nurses and patients). This is also supported by the increasing communication of referral letters.

Particularly noteworthy are the efforts being put into integration in countries like Germany, Greece and Denmark, which are trying to implement nationwide integrated healthcare networks fed by heterogeneous information systems.

Messaging technologies (in particular HL7) are more widely used than middleware solutions (like DCOM or CORBA). Web based technologies (Web services and Web browsers) support most of the projects, indicating that these new technologies are quickly adopted in healthcare institutions. Nevertheless, it is clear that many distinct technological solutions coexist to integrate patient data.

The concept of message passing appears to be radically different from the conventional concept of procedure calls or operation invocation, but the difference is more one of pedagogical emphasis than of semantics. Message passing emphasizes the remoteness of the object and the caller's lack of knowledge of the code body which will be executed. However, any procedure call can be viewed as an exchange of messages [11]. The main difference between the two approaches is the reliance of Web services (messaging) on open Internet standards like HTTP, XML, SOAP, WSDL, UDDI and WSFL, as opposed to DCOM and CORBA solutions (middleware), which often resulted in single-vendor implementation requirements.
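The observation that a procedure call can be viewed as an exchange of messages can be made concrete with a small sketch. The Python code below is purely illustrative: it uses JSON text as a stand-in for a SOAP/XML envelope for brevity, and the procedure get_lab_result, its parameters and its placeholder return value are hypothetical.

```python
import json


def get_lab_result(patient_id, test_code):
    """Hypothetical local procedure; returns placeholder data only."""
    return {"patient_id": patient_id, "test_code": test_code, "value": None}


# Operations the receiving system is willing to execute.
PROCEDURES = {"get_lab_result": get_lab_result}


def handle_message(request_text):
    """Receiver side: decode the request message, run the procedure, encode the reply."""
    request = json.loads(request_text)
    procedure = PROCEDURES[request["operation"]]
    result = procedure(**request["arguments"])
    return json.dumps({"result": result})


def call_by_message(operation, **arguments):
    """Caller side: the same call expressed as a request/response message pair."""
    request_text = json.dumps({"operation": operation, "arguments": arguments})
    reply_text = handle_message(request_text)  # would cross the network in practice
    return json.loads(reply_text)["result"]
```

In this sketch, call_by_message("get_lab_result", patient_id="P1", test_code="GLU") returns the same value as the direct call get_lab_result("P1", "GLU"); the caller only needs to know the message format, not the code body executed by the receiver.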

One key omission from the literature reviewed is that most of the project publications failed to mention any type of error detection. We feel that it is mandatory to verify the quality of integrated data, so that instead of propagating data errors, alerts regarding data quality can be triggered and correction processes can take place [12].

Limitations

One of the main limitations of this review is the lack of detail reported in most of the articles, and especially the absence of any impact evaluation of the technologies they describe, despite the enormous cost of such systems and the evident change in working practices that they entail. The percentage of missing values for each time interval varied between 0 and nearly 50% depending on the type of variable analysed and the interval of time considered.

Another limitation is that considering only papers published in the last ten years may exclude early work on integration in hospitals, although we feel this is justifiable given the significant evolution in ICT in the last decade.

Although we feel that grouping the papers into projects is essential to decrease the bias of multiple publications of the same project, for some of the papers it was difficult to determine whether they were describing the same project or not.

Conclusion

Currently, people are more mobile and live longer, and health care is more shared than ever before. It is clear that Information Systems are evolving to meet people's needs by implementing regional networks, allowing patient access and integrating ever more items of patient data. We conclude that patient information is becoming more accessible, as there are more integrated IS, which are more likely to involve primary care and a wider range of patient data.

Web based technologies and messaging technologies are supporting most of the current integration projects, indicating that these new technologies are quickly adopted in healthcare institutions. Many distinct technological solutions coexist to integrate patient data, using differing standards and data architectures, which may hinder further interoperability.