Introduction

The ongoing digital transformation is having a major impact on healthcare, and new technologies offer great opportunities to improve the quality of care. Electronic patient records (EPR) are key components of the digital transformation in hospitals and shape several clinical elements such as communication and collaboration, information availability, and workflows [1]. There is evidence that EPRs improve the coordination of care and thereby the quality of care, whereas certain aspects of the EPR may increase staff burden [2, 3]. Despite the demonstrated impact of EPR implementation, clinical documentation itself is often not included in investigations, although analyzable aspects of documentation such as completeness, accuracy, or legibility have been proposed since the emergence of EPRs [4]. Ignoring possible changes in documentation due to the introduction of an EPR seems questionable, since inadequate documentation of clinically relevant aspects could result in patients not receiving the treatments they need [5, 6]. This review addresses the research question of what effect the introduction of the EPR has on actual clinical documentation in hospitals and summarizes evidence from comparisons of paper-based and electronic patient records.

Methods

To address the research question, a systematic review was conducted. It is reported according to the most recent version of the “Preferred Reporting Items for Systematic Reviews and Meta-Analyses” (PRISMA) guidelines described by Page et al. wherever applicable [7]. See Online Resource 1 for a detailed list indicating where each item is reported.

Search Strategy & Selection Criteria

Following a sensitive search strategy to identify all suitable studies, several electronic bibliographic databases were searched: PubMed (incl. PubMed Central and MEDLINE), Web of Science Core Collection, CINAHL, and PDQ Evidence. The components of the database search were “implementation”, “electronic patient record”, “paper-based”, “documentation”, and “hospital”. For all databases, filters were used to limit the results to publications in English or German from the period 2010–2020. The publication period was limited because prior unsystematic research had shown that some studies from before 2010 examined technologies that are no longer in use today due to the rapid progress of digital systems. See Online Resource 2 for the detailed search strategies, which were constructed without the support of a librarian. Synonyms, Boolean operators, number of results, search date, and filters or special features such as truncations are listed there for all databases.

Screening of results was conducted in three steps by three researchers (FW, GF, UK), with studies included or excluded according to the criteria in Table 1. It should be emphasized that this systematic review focuses on the documentation itself and not on the results of interviews or surveys about it. Accordingly, as stated in point 5 of Table 1, only studies that analyzed actual patient records were included. In the first step, all titles were screened independently by FW and GF. In the second step, abstracts were screened independently by FW and GF, and in the third step the remaining full texts were screened by FW and GF. In the first two steps, a study on which the reviewers disagreed was carried forward to the next step. Discrepancies that persisted in the last step were discussed together with UK until consensus was reached. In all steps, screening followed a questionnaire (see Online Resource 3) that covered all inclusion criteria.
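The decision rule of the screening process can be illustrated with a minimal sketch. It assumes that each reviewer casts a simple include or exclude vote per study; the names `Vote`, `advances`, and `needs_consensus` are hypothetical and are not taken from the screening questionnaire in Online Resource 3.

```python
from enum import Enum


class Vote(Enum):
    """Hypothetical vote of a single reviewer on a single study."""
    INCLUDE = "include"
    EXCLUDE = "exclude"


def advances(vote_a: Vote, vote_b: Vote) -> bool:
    """Title and abstract steps: a study is carried forward unless both
    reviewers exclude it, so a disagreement counts as inclusion."""
    return vote_a is Vote.INCLUDE or vote_b is Vote.INCLUDE


def needs_consensus(vote_a: Vote, vote_b: Vote) -> bool:
    """Full-text step: a disagreement is referred to a discussion
    with the third reviewer."""
    return vote_a is not vote_b
```

Under this rule, the first two steps can only err on the side of carrying a study forward, which is in line with the sensitive search strategy described above.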

Table 1 Inclusion and exclusion criteria

Data Items & Collection Process

The extracted data included authors, year, country, setting, study design, number of analyzed records, outcomes, results, and, if applicable, a use case. The outcomes were classified according to their dimension of quality using the framework by Nonnemacher, Nasseh, and Stausberg; for example, the outcome “usage of standardized nursing language” could be assigned to the dimension of structural quality [8]. For quantitative studies, statistical measures such as confidence intervals, p-values, or other relevant effect measures were also extracted. See Table 2 for study characteristics, Table 3 for outcomes and results, and Table 4 for study designs. Together, these tables provide a clear overview of the results of the individual studies, potentially missing data, and the heterogeneity of the included studies. Included publications were stored in a Citavi library, and extracted data were summarized in Microsoft Excel.

Table 2 Study characteristics
Table 3 Key results
Table 4 MMAT ratings

Study Risk of Bias Assessment

The Mixed Methods Appraisal Tool (MMAT) (Version 2018) proposed by Hong et al. was used to assess the quality of the included studies [9]. MMAT is a tool specially designed for assessing the quality of different study types in the same review, including qualitative, quantitative, and mixed methods studies. The assessment was conducted independently by FW and GF, with discrepancies discussed within the research team (FW, GF, UK). Following the recommendations for reporting the results of the MMAT (Version 2018), the studies were rated on a scale of zero to five stars. Each of the five conditions that was met scored one, whereas an unclear or unmet condition scored zero. Studies of low quality were not excluded from this review; instead, the quality of the included studies is presented and a possible risk of bias is discussed on the basis of the MMAT rating.
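The star rating described above can be written as a short function. This is a minimal sketch and not part of the MMAT itself; the rating labels "yes", "no", and "unclear" and the function name `mmat_stars` are assumptions made for illustration.

```python
from typing import Literal, Sequence

# Hypothetical labels for the assessment of one MMAT condition.
Rating = Literal["yes", "no", "unclear"]


def mmat_stars(ratings: Sequence[Rating]) -> int:
    """Return the star rating (0 to 5) of one study.

    A met condition ("yes") scores one; an unmet ("no") or
    unclear condition scores zero.
    """
    if len(ratings) != 5:
        raise ValueError("exactly five conditions are rated per study")
    return sum(1 for rating in ratings if rating == "yes")


# Example: three conditions met, one unmet, one unclear.
example_score = mmat_stars(["yes", "yes", "unclear", "no", "yes"])
```

Because unclear and unmet conditions both score zero, a low star rating may reflect incomplete reporting rather than a demonstrated methodological flaw.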

Results

The study selection process and the reasons for excluding studies are depicted in Fig. 1. The database search yielded 261 studies after duplicates were removed, and three further studies were identified through a backward search of the reference lists of the included studies [10,11,12]. Twelve studies were excluded after title screening, 196 studies were excluded after abstracts were assessed for eligibility, and 39 studies were excluded after full texts were assessed for eligibility. The remaining 17 studies were included in this systematic review.
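The counts of the selection process can be checked with a few lines of arithmetic. The following sketch uses only the numbers reported above; the variable names are chosen for illustration.

```python
# Counts as reported in the study selection process.
database_records = 261        # after removal of duplicates
backward_search_records = 3   # identified through backward search

excluded = {
    "title": 12,
    "abstract": 196,
    "full_text": 39,
}

identified = database_records + backward_search_records
included = identified - sum(excluded.values())

print(f"{identified} identified, {included} included")
# prints "264 identified, 17 included"
```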

Fig. 1 PRISMA 2009 flow diagram

All included studies examine documentation by performing a document analysis that compares paper-based patient records with EPRs. Due to the hospital setting and the explicit exclusion of the outpatient setting, this concerns only the hospital's internal documentation in the patient records. Although the hospital setting was an inclusion criterion, the settings still vary. The hospitals differ in specialty (e.g., burn unit or orthopedic surgical ward) [13, 14], size (e.g., 700 beds or 1,200 beds) [15, 16], and academic teaching activity, and one hospital was not further specified [17]. Accordingly, all included studies investigate documentation through the lens of a certain use case, such as operation reports or discharge instructions [14, 18]. The number of analyzed records ranges from a minimum of 40 records (20 paper records vs. 20 electronic records) [19] to a maximum of 20,848 records (9,236 paper records vs. 11,612 electronic records) [20]. Except for Jamieson et al., who followed a prospective study design, all studies evaluated the patient records retrospectively [11]. Only Montagna et al. followed a mixed methods approach, also investigating qualitative aspects such as the structure of the patient record in general and the format of the documentation in particular [19]. See Table 2 for detailed characteristics of all included studies.

The most commonly analyzed outcomes were completeness [15, 17, 20, 21, 23, 25, 26], guideline adherence [13, 14, 18, 22], and volume of documentation [11, 16, 17, 19]. Of the 17 included studies, 11 showed a positive effect of the introduction of the EPR on documentation. Six showed a mixed effect, with positive and negative changes or no changes, while no study showed an exclusively negative effect. Table 3 gives an overview of the analyzed outcomes, the key results of all included studies, and whether a positive (+), negative (-), or mixed (~) effect was measured. Where the authors specified a p-value, it is indicated in the table. See Online Resource 4 for detailed summaries of all included studies.

MMAT was used to assess the individual risk of bias in the included studies and to rate their quality based on questions such as “Are there complete outcome data?”. The two screening questions, which establish whether a study is empirical, were fulfilled by all studies except one [14]. That study fulfilled only one of the two screening questions, while the second remained unclear. Nevertheless, all studies were evaluated in terms of their quality. Table 4 shows the final MMAT score of all included studies, with a maximum of five stars. The detailed rating of all individual conditions is available in the appendix (Online Resource 5), which might be important since many conditions are not necessarily unmet but remain unclear. Jamieson et al. and Liu and Edye used the QNOTE instrument to measure their outcome [11, 12, 27], while Bruylands et al. used the Q-DIO instrument [10, 28]. None of the other studies used a validated instrument to measure their outcomes. Moreover, several studies did not define their outcomes [16, 19, 24] or did so only superficially [21]. None of the studies followed a theoretical framework.

Discussion

The search identified 264 studies (261 from the database search and three from the backward search), of which 17 met the inclusion criteria. The majority of those showed improved documentation after the introduction of the EPR. Although none of the studies followed a theoretical framework, several more general frameworks could have been suitable after adaptation to the topic. One example is a framework for data quality in medical research [8], which originally targeted registry data and cohort studies. This framework classifies a total of 51 items into the quality model according to Donabedian [29], whose underlying dimensions of structure, process, and outcome quality also fit the present research question. This means, for example, that the outcome “standardized nursing language” could be assigned to the framework’s item “values from standards” (proportion of values that correspond to terms from controlled vocabularies) and thus to the dimension of structural quality. The classification of all outcomes shows that five of the 17 studies examined structure quality and 13 of the 17 studies examined outcome quality. The instruments used were not classified, as they attempted to cover multiple dimensions [27, 28].

EPRs make it possible to fill fields automatically with information that is collected from other digital sources. This was seen in the study by Jang et al., where electronic documentation significantly improved only the automatically documented items but not the manually documented items [15]. EPRs also provide mandatory fields that need to be filled before the record can be closed. Zargaran et al. assumed that the higher rates of completeness they found were mainly achieved through such mandatory entries in the EPR [20]. Depending on the mechanism that determines the change in documentation, the literature shows different implications for practitioners. On the one hand, increased documentation effort is conceivable through the use of features such as pop-ups and mandatory fields. On the other hand, documentation might improve with the same or even reduced documentation effort due to automatically filled fields and an optimized layout [30]. Montagna et al. also described a general change in documentation format from continuous text towards a clear list of events, showing that the introduction of the EPR is also an opportunity to shape the structure of documentation [19]. This offers the opportunity to involve practitioners, who have important insight into how to reduce documentation burdens, as a recent study showed [31]. Overall, the EPR appears to improve documentation, while it remains unclear whether this change comes at the cost of an additional burden on practitioners.

When talking about improved documentation, the interpretation of the presented results and outcomes is often ambiguous. For example, it is not clear whether the outcome volume of documentation evaluates the length of documentation only or also takes information density into account. Therefore, longer documentation is not necessarily to be evaluated negatively if completeness increases at the same time, and vice versa. Moreover, regarding the frequently analyzed outcome guideline adherence, it remains unclear whether the improvements are due to a mere change in documentation or whether the actual treatment has changed due to the introduction of the EPR and is more guideline-compliant thereafter. The latter could be the case if the EPR conveys guideline information or offers clinical decision support based on guidelines. Alternatively, care might have been delivered but not documented before the introduction, and its documentation is now enforced through mandatory fields.

A challenge of this review was the heterogeneity of settings and outcomes and the lack of outcome definitions in some studies. However, the varied settings and outcomes give a wide overview of the different applications of the EPR and of how documentation changes from different perspectives. Moreover, except for the study by Zargaran et al. from South Africa [20], which is an upper middle income country [32], no studies from low- or middle-income countries were found, making it challenging to compare or transfer the results to all healthcare systems worldwide. A further common difficulty is possible publication bias. Publication bias was not assessed in this review, but since it is conceivable that the analysis of records is carried out internally and published afterwards, the risk of negative effects not being published is probably high. The fact that none of the included studies showed an exclusively negative effect supports the suspicion of publication bias. The results must be interpreted with caution, since the MMAT rating yielded low scores for several studies, meaning that the methodological standards in those studies imply a high risk of bias. It is important to highlight that of the five studies with the maximum MMAT score, implying only a small risk of bias, Jamieson et al., Yadav et al., and Coffey et al. show only mixed effects [11, 17, 23] and Jang et al. only a partially positive effect [15]. In contrast, among the studies with lower MMAT ratings, only one showed a mixed effect [19], while the remaining seven studies all showed a solely positive effect. Thus, all but one of the effects that were not solely positive were found in studies with high methodological standards. A bias in the studies with low MMAT scores should therefore be considered. In terms of the level of evidence, there is only one randomized controlled trial [11].

The present review has some limitations that must be stated. Although the searched databases were carefully selected based on their topic and range, important results in other databases may still have been missed. Moreover, only studies from the last ten years were included. Studies published before 2010 might have addressed the topic of this review and could still be valid today. This could be an important aspect, as some healthcare systems are already highly digitized, so a lot of research might have been conducted before 2010. On the other hand, the results of this review provide evidence on the analysis of changes in documentation through EPRs that reflect the current state of the art.

Due to the ongoing digital transformation of healthcare systems worldwide, it is expected that many hospitals will continue to implement new EPRs or adapt existing EPRs in the future. Each of these episodes of organizational change offers the opportunity to customize the structure of the record in terms of what is documented where and how. On the one hand, this makes it possible to optimize documentation with regard to treatment quality or billing purposes. On the other hand, documenting itself could be made as stress-free as possible for healthcare professionals. To make this process efficient, a systematic analysis of the change in documentation is essential. Healthcare professionals should use the existing validated instruments to produce comparable results. In addition, future research should aim at developing further, more specific instruments to make it as easy as possible for practitioners to systematically collect data and publish results. This would allow evidence to grow on how to design documentation in the best way for all parties involved.