A systematic review and meta-analysis to determine the contribution of mr imaging to the diagnosis of foetal brain abnormalities In Utero

Objectives This systematic review was undertaken to define the diagnostic performance of in utero MR (iuMR) imaging when attempting to confirm, exclude or provide additional information compared with the information provided by prenatal ultrasound scans (USS) when there is a suspicion of foetal brain abnormality. Methods Electronic databases were searched as well as relevant journals and conference proceedings. Reference lists of applicable studies were also explored. Data extraction was conducted by two reviewers independently to identify relevant studies for inclusion in the review. Inclusion criteria were original research that reported the findings of prenatal USS and iuMR imaging and findings in terms of accuracy as judged by an outcome reference diagnosis for foetal brain abnormalities. Results 34 studies met the inclusion criteria which allowed diagnostic accuracy to be calculated in 959 cases, all of which had an outcome reference diagnosis determined by postnatal imaging, surgery or autopsy. iuMR imaging gave the correct diagnosis in 91 % which was an increase of 16 % above that achieved by USS alone. Conclusion iuMR imaging makes a significant contribution to the diagnosis of foetal brain abnormalities, increasing the diagnostic accuracy achievable by USS alone. Key points • Ultrasound is the primary modality for monitoring foetal brain development during pregnancy • iuMRI used together with ultrasound is more accurate for detecting foetal brain abnormalities • iuMR imaging is most helpful for detecting midline brain abnormalities • The moderate heterogeneity of reviewed studies may compromise findings


Introduction
Abnormalities of the foetal brain occur in approximately 25 per 10,000 births in the UK [1] and can result from environmental, chromosomal, genetic or acquired causes. Accurate diagnosis of foetal brain abnormalities is necessary to guide management of the pregnancy and facilitate parental counselling.
Ultrasound scanning (USS) is the primary diagnostic imaging method for screening of the pregnancy and considered the reference standard for imaging the foetus brain. There are occasions when technical limitations hinder clear visualisation of the foetal anatomy [2,3] which led to the exploration of other diagnostic tests to supplement USS.
Advances in MR technology have meant initial technical restrictions in imaging the foetus with in utero magnetic resonance (iuMR) imaging have been overcome, experience within radiology has increased and a growing body of literature confirms increasing use of iuMR in diagnosing foetal brain abnormalities [4][5][6][7]. Despite this, the true clinical value of iuMR has not been established. Previous limited statistical evidence was unable to demonstrate, in terms of diagnostic accuracy, any benefit [8].
To our knowledge, there have been only two other recently published systematic reviews in which Rossi and van Doorn aimed to clarify the additional benefit of MRI in the diagnostic pathway when used in addition to USS [9,10]. Rossi reviewed 13 studies and van Doorrn selected 27 studies for review. Despite similar aims and inclusion criteria only seven studies were included in both reviews. This could, along with date differences for searches, be due to the differences in exclusion criteria. The criteria used by Rossi excluded studies without an outcome reference diagnosis (ORD), non-English publications and those where data were reported in graphs or percentages. Van Doorns review excluded studies with a sample size of less than 20 and studies where diagnoses were inadequately described. We felt a new systematic review was justified in order to update the existing, to attempt to limit the number of studies excluded and to identify any other studies which may have been erroneously excluded.
The aim of this study is to answer the following question: Is the diagnostic accuracy of iuMR superior, equivalent or inferior to USS? We aimed to assess diagnostic accuracy of iuMR following antenatal USS through: (a) Measurement of diagnostic accuracy of antenatal USS alone (i.e. prior to iuMR) in relation to an ORD determined by postnatal imaging, surgery or post-mortem examination (b) Measurement of diagnostic accuracy of iuMR (following antenatal USS) relative to an ORD Secondary aims were to determine if counselling and/or management of the pregnancy changes as a result of iuMR imaging and to identify the foetal brain anomalies for which iuMR is most useful.

Methods Protocol
The protocol was written in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [11] and registered with the International Prospective Register of Systematic Reviews (PROSPERO, CRD42015010265).

Eligibility criteria
All study designs were considered eligible apart from case reports, reviews or commentaries.

Participants
Pregnant women who had undergone, due to suspicion of a brain abnormality, prenatal ultrasound and subsequent prenatal iuMR of their foetus' brain and any findings confirmed by an ORD.

Reference standard
Reference standards accepted to confirm the outcome diagnosis were postnatal imaging (transcranial US, MRI or CT) and surgery or, in cases of foetal demise or neonatal death, autopsy and post-mortem MR imaging.

Exclusions
Studies not reported in English and translation was unavailable. If an English abstract was available these were scrutinised for relevant information, but limited data meant adherence to the inclusion criteria could not be certain.

Search methods
We identified all studies in which iuMR imaging was used to supplement USS for imaging foetal brain abnormalities in utero using a sensitive search strategy of the following electronic databases using MesH and free-text terms as detailed in Appendix 1, adapting the strategy for each database.
Databases searched were Medline (via OVID) (1966 to present), EMBASE (via OVID) (1980 to present), Cochrane Register of Diagnostic Test Accuracy Studies (accessed 18/ 03/2015 and 02/10/2015) and Web of Science (1900 to present). In addition, we searched relevant journals, conference proceedings and examined reference lists of relevant and included studies.
Electronic searches were conducted in March 2015 without date restriction and later updated to identify all relevant papers up to September 2015.

Selection of studies
Screening of citations was completed independently by two reviewers (DJ, CM). Any disagreements were resolved by consensus. Where only abstracts were available, attempts were made to contact authors for full reports. If the same data had been published in more than one publication, the most up to date or complete study was selected.
A PRISMA flowchart was used to document and report any decisions made during the study selection process [9] ( Fig. 1).

Assessment of methodological quality of included studies
Included studies were assessed independently for methodical quality (DJ and CM) using a modified Quality Assessment of Diagnostic Accuracy Studies (QUADAS 2) tool [12]. Studies were rated in terms of bias risk and applicability using signalling questions to score the four key domains-Patient selection, Index tests, Reference standard and Flow and Timing. Studies were scored as BYes^, BNo^or BUnclear^for each checklist item. Additional signalling questions were introduced for both study design and index tests. These were to determine prospective versus retrospective design and details regarding USS and iuMR technique and reporting as these were elements considered likely to introduce bias.

Data items and analysis
Study characteristics and outcomes were extracted independently (DJ and CM) and recorded using a data collection form (Appendix 2) which was piloted on three papers to ensure suitability. Characteristics noted for each study are listed in Appendix 2. The number of correct and incorrect diagnoses made by both USS and iuMR were also recorded as judged by the ORD confirmed by postnatal imaging, autopsy or surgery. Clinical examination was discounted as a reference standard as the majority of structural brain abnormalities are not apparent externally. Where studies reported the results of imaging from multiple anatomical areas, only results of the foetal brain were included.
It was anticipated that all studies would recruit only (or predominantly) foetuses with a brain abnormality diagnosed by USS, meaning the sensitivity and specificity of the imaging modalities could not be estimated because of the lack of foetuses without brain abnormality. Therefore, the analysis defined diagnostic accuracy for each modality as the percentage of cases where the diagnosis was confirmed by ORD. In foetuses with multiple abnormalities a primary diagnosis was identified as the abnormality with the most detrimental clinical outcome. In cases where both modalities identified the primary diagnosis but one provided a more specific diagnosis and/or additional information without fundamentally changing the primary diagnosis, our analysis assumed both modalities were correct but the nature of disagreements was subsequently investigated. A meta-analysis of the diagnostic accuracy of iuMR in relation to USS was conducted using the Stata statistical analysis software [13]. For each study the odds ratio for the paired iuMR and USS accuracies and its standard error were computed using the method of Becker and Balagtas, using a 0.5 correction for zero cells [14,15]. Odds ratios were combined using a random effects model and the I 2 statistic was used as an indicator of heterogeneity within the included studies [16,17].

Results
Our initial searches generated a total of 1252 potential studies with 807 remaining for additional scrutiny after duplicates were removed. Further screening resulted in 34 published studies for final inclusion [3,. Categories for exclusion of full papers reviewed but rejected are listed in the PRISMA flowchart ( Fig. 1).
Experience of the clinician reporting the iuMR study was only available in 10/34 studies [20,21,25,27,28,30,32,35,37,42], half of these quantified this in terms of years (between 1 and 15) the remaining gave a description of 'experienced'. In two studies, the reporting radiologist was unaware of USS findings [21,37]. Information regarding MR technique was reported in all papers including at least two of the following: manufacturer, sequences, types of receiver coils and patient positioning. Fast T2weighted sequences were performed in all studies with some using additional sequences (e.g. T1, DWI, 3D and FLAIR). Early studies reported the use of fasting and sedation to achieve optimal imaging [22,34].

Diagnostic accuracy of US and MRI
The 34 included studies reported a combined total of 2530 foetuses (median 32.5, range 10-834) but of these 62 % (n = 1571) were excluded as they did not have an iuMR (n = 796), 542 did not have an ORD, were nonbrain pathology (n = 159) or other exclusions (n = 74). Consequently this systematic review reports on the outcomes of 959 foetuses. In 6/34 studies [19, 28-30, 44, 49], all foetuses had an ORD, and combined contributed 186/959 to the analysis in this review (median 24.5, range 12-72). The remaining 773/959 (median 38, range  Fig. 3). Although individual studies were heterogeneous (I 2 = 45 %; p = 0.002), nearly all reported an improvement in diagnostic accuracy following iuMR. The data are also represented in the form of a L'Abbe plot (Fig. 4) in which the diagnostic accuracies of iuMR and USS are presented as percentages.

Agreement between USS and iuMR
The reports from USS and iuMR were in agreement and agreed with the ORD in 527/959 (55 %). USS and iuMR were in agreement but discordant with the ORD in 52/959 (5.5 %) foetuses (Table 1a and b, and 2).
In 160/959 (16.5 %) foetuses iuMR and USS were in agreement regarding the primary diagnosis but additional information was added-either secondary diagnoses or a more concise/confident primary diagnosis given. In this category iuMR provided additional information in 146/ 959 (15 %) and USS provided additional information in 14/959 (1.5 %) cases as confirmed by ORD.

Disagreement between USS and iuMR
The diagnoses on iuMR and USS disagreed in 222 (23 %) cases. Of these, the iuMR was in agreement with the ORD in 186 (19 %), the majority of which were abnormalities undetected by USS (139/186, 75 %). The remaining 47/186 (25 %) were abnormalities reported by USS but correctly excluded by iuMR. In 34 cases the USS diagnosis was incorrectly overturned by iuMR, 10 of which were abnormalities wrongly excluded or missed by iuMR and 24/34 were abnormalities diagnosed by iuMR but not found by USS or on the ORD (Table 2b and 3b). Table 3 presents the discordant diagnoses between USS and iuMR according to category of anomaly. The most frequent areas of disagreement were midline (24 %) and posterior fossa abnormalities (21 %). In particular agenesis of the corpus callosum and the Dandy Walker spectrum of abnormalities were frequently missed or, less frequently, wrongly identified on USS. The most frequently misdiagnosed anomalies on both USS and iuMR were cortical formation abnormalities (17 %) such as hemimegalencephaly, lissencephaly and heterotopia.

Changes in counselling and management
Eleven studies [3, 18, 28-31, 40, 41, 44, 47, 48] reporting on 186 foetuses specified the benefit of iuMR in terms of changes to counselling of parents or management of the pregnancy. These changes as a result of findings on iuMR affected 78/ 186 (41.9 %) foetuses.

Discussion
This systematic review and meta-analysis demonstrates that using iuMR to support USS in the diagnosis of foetal  [51], suggesting methodological and clinical variability and inconsistency in the measurement of outcomes within each study. Although investigation of heterogeneity is recommended [51], the ability to do so is compromised by the lack of reporting (and indeed quantification) of all the ways in which studies differ. The performance of both diagnostic tests is influenced by many factors, and a limitation of this review was incomplete reporting of characteristics that would potentially influence diagnostic performance such as operator experience (specified in just a third of included studies) and technical difficulties (three studies) [3,28,48].
iuMR is not without its limitations and our review demonstrated that iuMR overestimates the presence of abnormalities more frequently than failing to identify them. This could be explained by the nature of foetal iuMR in which the need for fast imaging compromises image quality. To the untrained eye artefacts from maternal breathing, foetal movement and image aliasing may potentially mimic or obscure pathology [52]. It is for this reason 'experience' should perhaps be defined by the number of foetal brain examinations reported.
The timing of USS in relation to iuMR imaging is also relevant in the assessment of both tests. The foetal brain develops rapidly and significant delay between the two examinations may influence the ability to diagnose accurately either because of natural brain development, increase in size of critical anatomical structures or because  Odds ratio The extent to which iuMR ultimately contributes to changes in management or in counselling regarding the pregnancy is also unclear as this was only reported in a small proportion of studies. Equally the impact of a wrong diagnosis made by iuMR was not defined in any study despite it occurring in 14/34 studies [18, 19, 21, 23, 26, 29, 33, 35, 36, 39-42, 44, 47].
Our review builds on the systematic reviews undertaken by Rossi  An important difference between the two is that Rossi restricted studies to those where outcomes were confirmed by a reference diagnosis, although chose to accept clinical examination as an ORD whereas van Doorns' selection criteria did not require an ORD. A strength of our review was the requirement of an ORD for any outcomes included in the meta-analysis. As previously stated we excluded clinical examination as an ORD. Although this significantly reduced the number of outcomes available, we felt this was justified as most structural brain abnormalities, and consequently Our analysis included 34 studies, of which 15 were additional to those included in the previous reviews owing to more recent searches and differences in selection criteria such as unlimited year of publication or sample size within studies. Although Van Doorns' searches were unrestricted by non-English publications or the requirement of an ORD, our review included more studies. This may be due to the limitation of sample size of less than 20 by van Doorn, resulting in six additional studies in this review, and the requirement of 'adequate description of diagnoses' which was not clearly defined by van Doorn.
Even with subtle differences in methods between all the reviews, findings were similar. Rossi reported that iuMR Abnormalities diagnosed by USS but wrongly excluded by iuMR 10 Abnormalities overdiagnosed by iuMR that were absent on USS and ORD 24 Total 959 was accurately able to identify brain abnormalities in 94.3 % of included foetuses, van Doorn reported 80 % and our study 91 %, an increase of 15-20 % when compared to USS alone. Both Rossi and van Doorn reported that the highest proportion of disagreement between USS and iuMR was related to midline abnormalities, particularly the posterior fossa. iuMR was better able to diagnose abnormalities in this anatomical region, also consistent with the findings of this systematic review which incorporates a further four studies published since 2012.
Although heterogeneity was not quantified by Rossi and van Doorn, both reviews highlighted the inadequate reporting of study characteristics which may compromise the findings of all systematic reviews. In order to adequately assess the accuracy of a diagnostic test and determine its true benefit in clinical practice, optimal study design is necessary [51].
We believe replication of the previous reviews is both justified and necessary-it reassures that the minor differences in inclusion and exclusion criteria both at study selection and data extraction do not change the outcomes significantly, thus adding weight to the current evidence base. In spite of the different nature of all the studies, the diagnostic accuracy of iuMR was clearly superior across the studies but the heterogeneity identified may compromise these findings. The moderate level of heterogeneity identified by our review warranted further investigation but was prevented by insufficient reporting of study characteristics. Despite its increasing use in clinical practice, poor study design has previously brought into question the diagnostic capabilities of iuMR above that which is achieved by USS and its benefit in terms of guiding the management of pregnancy and further studies are needed [53]. For this reason we instigated the MERIDIAN [54] project, a large prospective study to investigate iuMR imaging in the diagnosis of foetal brain abnormalities to provide definitive evidence to guide future practice.

Conclusion
When foetal brain abnormalities are suspected on USS, iuMR imaging is able to contribute significantly to the diagnostic pathway by both clarifying findings and increasing significantly the detection rate of abnormalities, particularly in midline and posterior fossa anomalies. Limitations of previous studies suggests that further investigation is still required to clarify the full impact of iuMR.