As a widely used standard, DICOM provides the ability to communicate any kind of medical information together with its corresponding images from any type of (DICOM compatible) acquisition device. It standardizes the handling, storing, printing, and transmitting of information in medical imaging. It introduces explicit information objects for various formats of data [1] and utilizes other standards to facilitate imaging integration in the health care enterprise [4].
A DICOM file consists of two parts: the image itself (pixel data) and a header with meta-elements containing information about the patient, institution, study, and pixel data. The header also contains public data such as patient name and number that lead to the identity of a particular patient, and it thus introduces security and privacy issues, since medical and administrative staff have direct access to these data [11]. However, the risks are not limited to header information directly connected to the identity of the patient. More general study information, also available in the header of a DICOM file, includes for example the study performed, the participating institutions, and the staff involved. Aggregated information from the DICOM header could, by itself or combined with other sources of information, also be used to track patient information and indirectly reveal the identity of a specific patient. This already introduces security and privacy risks within the walls of one institution. When images are shared among health enterprises without proper protection, the risks concerning data protection and patient privacy increase further.
De-identification of the DICOM tag elements should therefore be performed adequately by removing or changing all possibly sensitive information from the DICOM header. There are two known methods to perform such tasks, anonymization and pseudonymization.
Anonymization is claimed to be the most secure approach to ensure the privacy of DICOM data since it fully uncouples the data from the original patient [12]. It completely removes confidential entries in the standard DICOM data dictionary that could be used to derive the patient’s real identity, either by themselves or in combination with other entries. The method aims for an irreversible result in order to reduce the probability of revealing the patient’s identity.
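As a minimal sketch of this removal step, the header can be modelled as a plain dictionary keyed by DICOM attribute keywords. The dictionary representation, the function name `anonymize_header`, and the short removal list are assumptions made for illustration; a real de-identification profile covers far more attributes.

```python
# Illustrative only: a DICOM header modelled as a plain dict keyed by
# standard attribute keywords. The removal list below is an assumption
# and is NOT a complete de-identification profile.
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "InstitutionName",
    "ReferringPhysicianName",
}


def anonymize_header(header):
    """Return a copy of the header with identifying attributes removed."""
    return {k: v for k, v in header.items() if k not in IDENTIFYING_KEYWORDS}


header = {
    "PatientName": "DOE^JANE",
    "PatientID": "123456",
    "PatientBirthDate": "19700101",
    "InstitutionName": "Example Hospital",
    "Modality": "MR",
    "StudyDescription": "Brain MRI",
}

anonymous = anonymize_header(header)
print(sorted(anonymous))  # ['Modality', 'StudyDescription']
```

Because the identifying values are dropped rather than transformed, nothing in the output allows the original patient to be recovered, which is the irreversibility that anonymization aims for.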
Pseudonymization uses artificial identifiers to replace the most identifying fields within a data record. The purpose of adding these artificial identifiers, or pseudonyms, is to make the data record less identifying. One of the reasons to choose this method instead of anonymization is that it preserves the ability to trace back the real identity of the subject involved. This is useful when an appropriate follow-up is necessary for the study or when the aggregation of longitudinal data is important. Therefore, instead of removing data completely, the data are modified in such a way that the associated parties (in most cases the principal investigator and/or data manager of a research project) are still able to obtain the real identity of the subject, while attempts to identify the patient directly can be avoided. Thus, only the necessary data are pseudonymized, while the remaining fields should be made anonymous.
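The following sketch illustrates one possible way to pseudonymize a header; it is not the method used in this study. It assumes that pseudonyms are derived with a keyed hash (HMAC-SHA256) and that a lookup table, held only by the trusted party, maps each pseudonym back to the original value. The field lists, the `PSN-` prefix, and the function names are hypothetical.

```python
import hashlib
import hmac

# Assumption: these fields receive a pseudonym, while the other
# identifying fields are removed, as in anonymization.
PSEUDONYMIZED = {"PatientName", "PatientID"}
REMOVED = {"PatientBirthDate", "InstitutionName"}


def make_pseudonym(value, secret_key):
    """Derive a stable artificial identifier from a value with a keyed hash."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return "PSN-" + digest.hexdigest()[:16]


def pseudonymize_header(header, secret_key, lookup):
    """Return a pseudonymized copy; record pseudonym -> original in lookup."""
    result = {}
    for keyword, value in header.items():
        if keyword in REMOVED:
            continue  # made anonymous: dropped entirely
        if keyword in PSEUDONYMIZED:
            pseudonym = make_pseudonym(value, secret_key)
            lookup[pseudonym] = value  # kept only by the trusted party
            result[keyword] = pseudonym
        else:
            result[keyword] = value
    return result


key = b"held-by-the-data-manager"  # hypothetical secret
lookup = {}
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "123456",
    "PatientBirthDate": "19700101",
    "Modality": "MR",
}

pseudo = pseudonymize_header(header, key, lookup)
original_id = lookup[pseudo["PatientID"]]  # re-identification by the trusted party
```

The same input always yields the same pseudonym under a given key, so records from repeated studies of one subject can still be linked for longitudinal analysis without exposing the real identifier.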
Integrating the Healthcare Enterprise (IHE) is an organization that develops and introduces profiles aimed at improving interoperability in healthcare. IHE profiles are not technical standards themselves; rather, they describe an open infrastructure that is simple, vendor independent, and free to implement by all enterprises involved [13]. The XDS-I profile was initiated by the IHE to provide a structure for an image sharing environment in which diagnostic reports and related information are exchanged between healthcare enterprises through a trusted network. It uses existing standards in medical imaging, document management, and communication. It is a framework that describes the registration, query, retrieval, and publication of clinical documents.
XDS-I employs actors that can be grouped into Document Source, Document Repository, Document Registry, and Document Consumer. A Document Source is responsible for document publishing and provides the clinical documents to a Document Consumer or the Document Repository. A Document Consumer is the actor that requests and retrieves documents from the XDS network, either from a Document Repository or a Document Source. The Document Repository handles the document storage in a transparent, secure, reliable, and persistent manner [14]. It is also responsible for delivering the requested documents to the Document Consumer; therefore, the Document Repository should always be available. In XDS-I, images are not stored in the Repository. Instead, a small DICOM object called a Key Object Selection (KOS) document, containing a list of UID references, is stored so that documents of interest can be easily found and retrieved from their original source location. The Document Registry is the actor that indexes all published documents and repositories involved in data sharing. Actors and transactions involved in the XDS-I environment are shown in Fig. 1.
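The division of responsibilities among these actors can be sketched with a small in-memory model. This is a simplification for illustration only: the class and method names are hypothetical and do not correspond to actual IHE transactions, the UIDs are made up, and it is assumed here that the Repository forwards document metadata to the Registry when a document is stored.

```python
class DocumentRegistry:
    """Indexes published documents and the repository that holds each one."""

    def __init__(self):
        self.index = []  # list of (patient_id, doc_id, repository)

    def register(self, patient_id, doc_id, repository):
        self.index.append((patient_id, doc_id, repository))

    def query(self, patient_id):
        return [(d, r) for p, d, r in self.index if p == patient_id]


class DocumentRepository:
    """Stores KOS manifests (UID references only), never the images."""

    def __init__(self, registry):
        self.registry = registry
        self.documents = {}

    def store(self, kos, patient_id):
        self.documents[kos["doc_id"]] = kos
        self.registry.register(patient_id, kos["doc_id"], self)

    def retrieve(self, doc_id):
        return self.documents[doc_id]


class DocumentSource:
    """Keeps the images and publishes a KOS manifest that references them."""

    def __init__(self):
        self.images = {}  # image UID -> pixel data, kept at the source

    def publish(self, doc_id, patient_id, images, repository):
        self.images.update(images)
        # The manifest lists UIDs and where to fetch them, not pixel data.
        kos = {"doc_id": doc_id, "image_uids": sorted(images), "source": self}
        repository.store(kos, patient_id)

    def retrieve_image(self, uid):
        return self.images[uid]


class DocumentConsumer:
    """Queries the registry, reads the KOS, then pulls images from the source."""

    def fetch_images(self, patient_id, registry):
        images = {}
        for doc_id, repository in registry.query(patient_id):
            kos = repository.retrieve(doc_id)
            for uid in kos["image_uids"]:
                images[uid] = kos["source"].retrieve_image(uid)
        return images


registry = DocumentRegistry()
repository = DocumentRepository(registry)
source = DocumentSource()
source.publish(
    "kos-1", "PSN-001", {"1.2.3.1": b"img-a", "1.2.3.2": b"img-b"}, repository
)
images = DocumentConsumer().fetch_images("PSN-001", registry)
```

In this model the repository holds only the manifest, so the pixel data stay at the source until a consumer follows the UID references.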
The registry may contain a set of document attributes that include sensitive data such as the patient’s name, the author’s institution, and the author’s name. In the clinical research environment, shared data are not allowed to have sensitive or private information embedded in them, since that information can be used by other parties to trace the identity of the patient or study participant. Even though XDS-I offers secure transfer over a trusted network, further efforts are needed to ensure that the sensitive information is eliminated or encoded. Furthermore, the original data must remain stored at the source domain, and no duplication should be made in local or central repositories.
De-identification of DICOM data, as previously described, is the most common method to perform information removal. However, since de-identification is not a standard part of the XDS profiles, different methodologies could be used to enable the de-identification within XDS. Therefore, in this study different methodologies were defined and evaluated to decide how and where the de-identification should take place within an XDS environment used for scientific research.