MRIdb: Medical Image Management for Biobank Research
- Cite this article as:
- Woodbridge, M., Fagiolo, G. & O’Regan, D.P. J Digit Imaging (2013) 26: 886. doi:10.1007/s10278-013-9604-9
Clinical picture archiving and communications systems provide convenient, efficient access to digital medical images from multiple modalities but can prove challenging to deploy, configure and use. MRIdb is a self-contained image database, particularly suited to the storage and management of magnetic resonance imaging data sets for population phenotyping. It integrates a mature image archival system with an intuitive web-based user interface that provides visualisation and export functionality. In addition, utilities for auditing, data migration and system monitoring are included in a virtual machine image that is easily deployed with minimal configuration. The result is a freely available turnkey solution, designed to support epidemiological and imaging genetics research. It allows the management of patient data sets in a secure, scalable manner without requiring the installation of any bespoke software on end users’ workstations. MRIdb is open-source software, available for download at http://www3.imperial.ac.uk/bioinfsupport/resources/software/mridb.
Keywords: PACS, Digital imaging and communications in medicine (DICOM), MR imaging
Population-based epidemiological studies using whole-body magnetic resonance (MR) imaging, such as the UK Biobank, present an opportunity to develop new approaches for quantitative phenotyping. MR data sets may be acquired and archived at multiple sites and then shared in various formats between research groups for analysis. This has created a need for an image database which supports simple but powerful search and retrieval facilities and which can be readily deployed in research facilities by non-specialists. Image management systems implementing the digital imaging and communications in medicine (DICOM) standard are commonly used for their flexibility and robustness; however, they are typically complex applications that present a steep learning curve to new users and can prove difficult to install [2, 3]. These issues can be addressed by presenting a consistent and intuitive interface to users of the database, providing bespoke tools to systems administrators, and including comprehensive yet concise documentation.
The foundation of MRIdb is DCM4CHEE [4, 5], a mature, highly configurable open-source PACS. It handles the vital, low-level functions of image archival from scanners using the DICOM protocol, metadata extraction from images into a relational database schema and raw image and thumbnail retrieval facilities. It natively provides web and DICOM interfaces, but the former is complex and provides extensive administrative and data manipulation facilities, whilst the latter enables access to unanonymised data. MRIdb’s primary function is, therefore, to provide an alternative, intuitive, read-only interface to image metadata, whilst offering study management and multi-format image export with enforced preservation of data integrity and anonymity.
The MRIdb web application is intended as a simple, secure, centralised portal for users to retrieve images. It requires users to be authenticated using either an institutional lightweight directory access protocol (LDAP) server or a local password database and additionally enforces role-based access control for “visitors,” “researchers,” and “administrators.” Visitors are able to browse images but are not shown (and cannot search by) patient names, ages or dates of birth. Researchers are able to view patient information but cannot perform functions such as viewing usage logs or information about other users of the system. Administrators have full access and can create other administrators, as well as configure help material, perform import audits and check authentication and download logs. Importantly, no user, regardless of role, is able to export unanonymised data in any format, and bulk downloads are password protected by default. The DCMTK toolkit is used for DICOM anonymisation.
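The three roles described above can be illustrated with a minimal Python sketch. It is not taken from the MRIdb source: the field names, action names and helper functions are all hypothetical and only mirror the rules stated in the text (visitors do not see identifying fields, only administrators see logs and user information, and nobody can export unanonymised data).

```python
from enum import Enum


class Role(Enum):
    VISITOR = "visitor"
    RESEARCHER = "researcher"
    ADMINISTRATOR = "administrator"


# Hypothetical field and action names, chosen for illustration only.
IDENTIFYING_FIELDS = {"patient_name", "patient_age", "date_of_birth"}
ADMIN_ONLY_ACTIONS = {"view_usage_logs", "view_users", "create_administrator"}


def visible_fields(role, record):
    """Return the subset of a metadata record that a role may see."""
    if role is Role.VISITOR:
        # Visitors can browse images but never see identifying fields.
        return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    return dict(record)


def may_perform(role, action):
    """Decide whether a role may perform an action."""
    if action == "export_unanonymised":
        # Denied for every role, including administrators.
        return False
    if action in ADMIN_ONLY_ACTIONS:
        return role is Role.ADMINISTRATOR
    return True
```

Placing the export check before any role check makes the "no user, regardless of role" rule hold even if new roles are added later.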
MRIdb supports several means of retrieving images. Individual studies or series can be downloaded in DICOM, NIfTI or Analyze format, with anonymisation and format conversion performed automatically. A custom-built tool is used to convert multi-frame series to NIfTI format, with the XMedCon and MRIcron toolkits used for other conversions. Regardless of format, the downloads are packaged, compressed and named in a consistent manner, reflecting details of the date of acquisition, protocol, scanner and project identifier (if available). Series can be interactively added to a clipboard, enabling bulk download of scans of interest from various subjects. These downloads are password protected. Administrators are able to perform a non-interactive batch download by uploading a spreadsheet specifying the relevant series. These are automatically downloaded to the researcher’s computer by a bespoke tool with progress reported in the web interface.
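The article does not give the exact naming scheme used for downloads, so the following Python sketch only illustrates the idea of deriving a consistent archive name from the acquisition date, protocol, scanner and optional project identifier. The function names, field order, separators and the `zip` extension are assumptions.

```python
import re
from datetime import date


def _slug(text):
    """Collapse anything other than letters and digits into single hyphens."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def archive_name(acquired, protocol, scanner, project_id=None, ext="zip"):
    """Build a predictable file name for a packaged download.

    The order of the parts and the separators are illustrative assumptions,
    not the scheme used by MRIdb itself.
    """
    parts = [acquired.strftime("%Y%m%d"), _slug(protocol), _slug(scanner)]
    if project_id:
        # The project identifier is optional, as in the text above.
        parts.append(_slug(project_id))
    return "_".join(parts) + "." + ext
```

For example, `archive_name(date(2013, 5, 1), "Cardiac Cine", "Philips Achieva")` yields `20130501_Cardiac-Cine_Philips-Achieva.zip`. Leading with the date keeps downloads sorted chronologically in a file browser.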
MRIdb allows subjects to be assigned to research projects with an optional participation identifier. In order to preserve anonymity, these subject identifiers are not permitted to contain any part of a patient’s name or hospital identifier. The identifiers are visible to all users and are used throughout the user interface. They can be searched for in order to retrieve and export subsets of subjects, and images downloaded for offline analysis are named using both project and subject identifiers (when present). Project assignments are shared between users of the system and the action of assigning or modifying identifiers is logged in the MRIdb audit log. Projects can only be deleted or renamed by administrators.
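The rule that a subject identifier must not contain any part of a patient’s name or hospital identifier could be checked along the following lines. This is a sketch under stated assumptions, not MRIdb’s actual validation: the function names are hypothetical, names are split on non-alphanumeric characters (which also handles the DICOM `Family^Given` form), and components shorter than three characters are ignored to avoid rejecting almost every identifier.

```python
import re


def _tokens(text, min_len=3):
    """Split text into lower-case alphanumeric components of useful length."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if len(t) >= min_len}


def identifier_is_anonymous(identifier, patient_name, hospital_id):
    """Return True if no name or hospital-ID component occurs in the identifier."""
    candidate = identifier.lower()
    forbidden = _tokens(patient_name) | _tokens(hospital_id)
    return not any(token in candidate for token in forbidden)
```

With the patient name `Smith^John` and hospital identifier `H1234567`, the identifier `HEART-0042` is accepted, whereas `jsmith01` is rejected because it contains the family name. The minimum component length is a trade-off: a stricter reading of "any part" would lower it at the cost of more false rejections.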
MRIdb provides a series of administrative features and tools. For systems administration, these include interruptible data migration and audit scripts to import DICOM files from legacy systems to DCM4CHEE, customised initialisation (init) scripts to automatically start and stop DCM4CHEE and MRIdb as required, and scheduled (cron) jobs to simplify systems monitoring and remove temporary files. The web application logs all user actions, allows bulk upload of system initialisation data (such as user and project records) and optionally reports system errors to a specified user via e-mail. MRIdb is accompanied by a reference guide that provides a full description of how to download, install and configure the system, as well as describing the security, backup and monitoring procedures that should be adhered to. It is also possible for administrators to rebrand the system, providing a customised name and logo for a local installation.
MRIdb is an integrated solution for MR clinical research image management. It combines mature open-source software tools for image archival, format conversion and visualisation with a consistent, intuitive, cross-platform user interface and a suite of administrative tools. The user interface is implemented using HyperText Markup Language and is, therefore, platform-independent and accessible using any modern web browser. The server component is written in Java and Python and designed to run on Linux. It is open source under the GNU General Public License v3.0 and is freely available from the MRIdb website in source and binary form.
A turnkey distribution of MRIdb that can be deployed by researchers without deep technical knowledge is available in the form of a virtual appliance. Using this machine image avoids the lengthy installation process common to most PACS, as it requires only minimal configuration. This includes specification of the location of the storage space allocated for image archival, the address of the LDAP server used for user authentication and the e-mail address of the system manager (to whom errors are automatically reported). The image is based on a minimal installation of CentOS Linux and can be imported into any virtualisation container supporting the standard open virtualization format (OVF).
The virtual appliance contains a full installation of DCM4CHEE and the PostgreSQL database and does not disable or restrict access to any of its functionality. It therefore provides a simple means to deploy a proven PACS, in addition to the user-friendly visualisation, retrieval and study management options provided by the MRIdb web application. The instance of DCM4CHEE in the MRIdb VM is configured to pre-cache image thumbnails for retrieval via WADO and to index additional series attributes to optimise performance of the bespoke web interface.
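For readers unfamiliar with WADO (web access to DICOM persistent objects), a thumbnail request is an ordinary HTTP GET whose query string identifies one DICOM object. The Python sketch below builds such a URL using the standard WADO query parameters (`requestType`, `studyUID`, `seriesUID`, `objectUID`, `contentType`, `rows`). The base address and the UIDs in the example are placeholders, not values from an MRIdb installation, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def wado_url(base, study_uid, series_uid, object_uid,
             content_type="image/jpeg", rows=None):
    """Build a WADO request URL for a single DICOM object.

    `base` is the address of the archive's WADO service; the value used
    in the example below is an assumption about a local deployment.
    """
    params = {
        "requestType": "WADO",
        "studyUID": study_uid,
        "seriesUID": series_uid,
        "objectUID": object_uid,
        "contentType": content_type,
    }
    if rows is not None:
        # Ask the server to scale the rendered image, e.g. for a thumbnail.
        params["rows"] = str(rows)
    return base + "?" + urlencode(params)


# Example with placeholder UIDs and a hypothetical local address.
thumbnail = wado_url("http://localhost:8080/wado", "1.2.3", "1.2.3.4",
                     "1.2.3.4.5", rows=64)
```

Because each thumbnail is addressed by a stable URL, the results of such requests lend themselves to the pre-caching described above.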
An evaluation of available free/open-source medical imaging data handling software was undertaken prior to the development of MRIdb, which itself replaces an existing in-house solution. The available solutions can be divided into three different categories: (1) DICOM data and protocol handling software suites such as the DCMTK toolkit or DCM4CHEE, (2) self-contained or virtualised PACS such as the DCMTB DICOM Toolbox or CDMEDIC and (3) comprehensive image-based clinical research data handling solutions such as the XNAT imaging informatics platform. Given the underlying support that all these systems have for the DICOM protocol, it will be possible to develop integration connectors as required in the future.
XNAT, the most similar solution to MRIdb, is a powerful, flexible system that offers a project-centric storage system. XNAT can sort incoming data according to project-specific tags that have to be specified using the scanner terminal during image acquisition. This, in turn, means that a change of data entry procedure has to be enforced. If project tagging is not done at scan time, XNAT still allows the data to be sorted after acquisition by using a temporary data storage facility. Once data are assigned to a project in XNAT, those data are only accessible by the project members. MRIdb is more flexible and open in the sense that the acquisition procedure is not affected, i.e. the data do not need to be tagged at scan time. Once the data are on MRIdb, they can be accessed by all the research centre members and can also be tagged to a specific project. This open and simple approach, adopted in both the data acquisition process and the user interface, allows users to transfer and retrieve data without having to go through a steep learning curve or change in operating modalities.
MRIdb has been in use at a medium-sized clinical research department of a large university for 6 months. This installation currently contains 7 TB of data and 15,000 studies, primarily from Philips MR scanners, including a large corpus of legacy images as well as scans acquired since its development. It has been reliable, has required minimal system administration and has been well received by its users.
This work was funded by the Medical Research Council, UK, the Imperial NIHR Biomedical Research Centre, UK, and a British Heart Foundation, UK, project grant (PG/12/27/29489).
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.