Ethics and registration
The UNRAVEL RDP follows the Code of Conduct and the Use of Data in Health Research and has been approved by the Biobank Board of the Medical Ethics Committee of the University Medical Centre Utrecht (no. 12-387 UNRAVEL Biobank). As a part of UNRAVEL, the use of already existing text files (e.g. clinical notes) is exempt from the Medical Research Involving Human Subjects Act (WMO) as per judgement of the Medical Ethics Committee (Text mining in cardiovascular notes, 18/446, Utrecht, the Netherlands). Eligible patients (see below) are asked to provide written informed consent for use of their clinical data and previously stored material. Consent is required prior to using the clinical (meta) data. In addition, consent is requested to draw blood via venepuncture during routine investigations, to minimise the impact on the patient, and to request information from other medical centres and municipality registries. For additional stem-cell-related research, an informed consent form has been developed and approved by the Medical Ethics Committee. After inclusion, patients are registered as UNRAVEL enrolees in the EHR, and all their clinical data are automatically collected in the RDP (Fig. 1). Data governance is secured by a data management plan. More information on protocols, data governance and informed consent is provided on www.unravelrdp.nl.
Eligible participants are individuals with proven or suspected genetic cardiac disease, and their relatives. UNRAVEL also includes family members who are not mutation carriers or show no signs of disease; these serve as healthy controls. Participants must be able to provide written informed consent and be at least 18 years of age.
In order to minimise selection bias, patients and relatives from both in- and outpatient clinics are prospectively screened and asked to participate. If a participant is deemed eligible after discharge, the patient is contacted by the managing physician by mail and/or phone to retrospectively request consent. Additionally, previously eligible individuals were retrospectively identified and asked to participate using registered diagnoses in the EHR and a database of all CMP patients who visited the outpatient clinic of a clinical geneticist or had DNA analysis performed at the University Medical Centre (UMC) Utrecht.
Research data platform
Consent is required prior to the extraction of data. Based on in-house clinical protocols, phenotyping of participants includes medical history, family history, physical examination, routine laboratory testing, 12-lead electrocardiography, chest radiography, cardiac ultrasonography, computed tomography (CT) and magnetic resonance imaging (MRI). These tests are performed at the discretion of the managing physician and have multiple time points in the EHR (Fig. 2). In contrast to manually maintained registries, all available data are captured. For example, during a visit to the in-patient clinics several electrocardiograms (ECGs) can be produced per day. Not all data might be entered into manually maintained registries, since this is a meticulous and laborious task.
Raw data are gathered, processed and standardised for all cardiological, electrophysiological, imaging and genetic modalities (Fig. 1). On a weekly basis, these (numeric) data are automatically extracted to the RDP. Metadata are specific information describing the data (such as date of visit, type of ECG, or managing physician), gathered for logistical and administrative purposes. These metadata harbour valuable information and are also stored in the RDP. Data are viewed, combined, linked to external databases and analysed through query-based searches in SAS Enterprise Guide (Fig. 3).
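As an illustration of how a query-based search can combine measurements with their metadata, the following minimal sketch uses Python with an in-memory SQLite database rather than SAS Enterprise Guide. The table names, column names and values are hypothetical and do not reflect the actual RDP schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical ECG schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ecg_metadata (
            ecg_id INTEGER PRIMARY KEY,
            patient_id TEXT,
            visit_date TEXT,   -- ISO 8601 date, so text order equals date order
            ecg_type TEXT
        );
        CREATE TABLE ecg_measurements (
            ecg_id INTEGER,
            qrs_duration_ms REAL
        );
        """
    )
    # Invented example rows; not real patient data.
    conn.executemany(
        "INSERT INTO ecg_metadata VALUES (?, ?, ?, ?)",
        [
            (1, "P001", "2020-01-10", "12-lead"),
            (2, "P001", "2020-03-02", "12-lead"),
            (3, "P002", "2020-02-15", "12-lead"),
        ],
    )
    conn.executemany(
        "INSERT INTO ecg_measurements VALUES (?, ?)",
        [(1, 98.0), (2, 104.0), (3, 120.0)],
    )
    return conn


def latest_ecg_per_patient(conn):
    """Join measurements to metadata and keep each patient's most recent visit date."""
    query = """
        SELECT m.patient_id, m.visit_date, m.ecg_type, v.qrs_duration_ms
        FROM ecg_metadata AS m
        JOIN ecg_measurements AS v ON v.ecg_id = m.ecg_id
        WHERE m.visit_date = (
            SELECT MAX(visit_date)
            FROM ecg_metadata
            WHERE patient_id = m.patient_id
        )
        ORDER BY m.patient_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in latest_ecg_per_patient(build_demo_db()):
        print(row)
```

Because several ECGs can be recorded on the same day, a query of this kind returns every recording from the latest visit date; selecting a single recording would require a timestamp or another tie-breaking rule.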
The UNRAVEL RDP contains multiple outcome measures that can be used for primary or secondary outcome analyses. All-cause death and date of death are extracted from the EHR and retrieved from the municipality registry. Other outcome measures, such as diagnoses, date of diagnosis, occurrence of clinical events such as acute heart failure, arrhythmia or hospitalisation, ventricular assist device implantation and clinical interventions, including heart transplantation, can be extracted from the UNRAVEL RDP.
The UNRAVEL RDP includes all structured data from the EHR. However, some data remain unstructured, such as free text. These texts might harbour valuable variables to extract, such as New York Heart Association (NYHA) class or other clinical symptoms. To enrich the UNRAVEL RDP with these unstructured data from clinical notes, a text-mining prototype tool was developed. In short, we defined pre-set variables for the tool to extract from clinical notes, e.g. NYHA classification and cardiovascular risk factors such as diabetes, hypercholesterolaemia and hypertension. The pre-set variables currently correspond to the variables in the TORCH registry but can be defined at the discretion of the researcher. The algorithm and further explanation are provided open source on www.unravelrdp.nl. Since the tool is under development, it should only be used with caution and under the supervision of medical and text-mining experts until further evaluation. A sample output of this automated tool is presented in Fig. 4. Future perspectives include the use of natural language processing for automated standardised diagnosis registration from clinical notes based on the International Classification of Diseases, 10th revision (ICD-10), mapped to the diagnosis thesaurus and reimbursement codes set by the project group “DHD diagnosis thesaurus-DBC-ICD 10” of the Dutch Society of Cardiology. Data standardisation will be harmonised with the OMOP Common Data Model to allow for systematic analysis of disparate observational databases.
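To make the idea of pre-set variables concrete, the following minimal Python sketch extracts NYHA class and three risk factors from a free-text note using regular expressions. It is not the UNRAVEL algorithm; the patterns, the function name and the example note are assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern for NYHA class, written as a Roman or Arabic numeral.
# Longer Roman numerals come first so that "III" is not read as "I".
NYHA_PATTERN = re.compile(
    r"\bNYHA\s*(?:class)?\s*[:\-]?\s*(IV|III|II|I|[1-4])\b",
    re.IGNORECASE,
)
ROMAN_TO_INT = {"I": 1, "II": 2, "III": 3, "IV": 4}

# Hypothetical keyword patterns for the risk factors named in the text.
RISK_FACTOR_PATTERNS = {
    "diabetes": re.compile(r"\bdiabet\w*", re.IGNORECASE),
    "hypercholesterolaemia": re.compile(r"\bhypercholesterol\w*", re.IGNORECASE),
    "hypertension": re.compile(r"\bhypertensi\w*", re.IGNORECASE),
}


def extract_variables(note):
    """Return the pre-set variables found in a single clinical note."""
    result = {"nyha_class": None}
    match = NYHA_PATTERN.search(note)
    if match:
        token = match.group(1).upper()
        # Roman numerals are looked up; Arabic digits are converted directly.
        result["nyha_class"] = ROMAN_TO_INT.get(token) or int(token)
    # Plain keyword matching: negations such as "no diabetes" are NOT handled.
    for name, pattern in RISK_FACTOR_PATTERNS.items():
        result[name] = bool(pattern.search(note))
    return result


if __name__ == "__main__":
    example = (
        "Patient reports dyspnoea on exertion, NYHA class III. "
        "History of hypertension and diabetes mellitus."
    )
    print(extract_variables(example))
```

Keyword matching of this kind cannot distinguish "diabetes" from "no diabetes" or from a family history of diabetes, which is one reason expert supervision of the output remains necessary.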
All patients are asked to consent to the collection of biomaterials for the UNRAVEL Blood Biobank. The exact laboratory protocol is available on www.unravelrdp.nl. In short, the standardised biobank protocol consists of one 10 ml serum, one 4.5 ml citrate, one 2 ml ethylenediaminetetraacetic acid (EDTA), one 10 ml EDTA and one 10 ml Na-heparin blood collection tube. These are processed and aliquoted into two vials of 0.5 ml whole blood from EDTA tubes, four vials of 0.5 ml plasma from citrate tubes, six vials of 0.5 ml plasma from EDTA and heparin tubes and six vials of 0.5 ml serum. All samples are stored at −80 °C. Availability, type and storage of material are linked to the RDP for easy accessibility.
Cardiac tissue database
Cardiac tissue of patients who have received a left ventricular assist device or undergone heart transplantation, as well as donor spleen tissue received during heart transplantation, is routinely stored by the Department of Pathology. Samples are paraffin embedded and frozen at −80 °C. All samples are stored according to the protocol available on www.unravelrdp.nl, and explanted hearts are divided into slices and cubes accordingly. These samples are registered using an electronic case registration form in REDCap in the cardiac tissue database, which is linked to the UNRAVEL RDP. Further information can be found on www.unravelrdp.nl.