Introduction

Pelvic organ prolapse is a major health care problem, with 11% of women undergoing surgery for pelvic organ prolapse and/or urinary incontinence during their lifetime, and a repeat surgery rate of 30% [1]. The symptoms reported by patients are often non-specific, except for the sensation or visualization of a vaginal lump or bulge [2]. The most common diagnoses related to pelvic organ prolapse are cystoceles, uterine or vaginal vault prolapse, enteroceles, rectoceles, intussusception/rectal prolapse, and descending perineum [3–5]. Abnormalities in one compartment are often combined with disorders in other compartments [6–9].

Proper staging of pelvic organ prolapse is important in clinical practice and outcome studies. Before the introduction of the Pelvic Organ Prolapse Quantification (POP-Q) [10] by the International Continence Society in 1996, several other clinical staging systems were in use, and some of them still are.

It remains difficult to make a correct and complete diagnosis on clinical examination only, especially in cases of posterior vaginal wall prolapse and/or a multi-compartment problem [11, 12]. Underestimation of pelvic organ prolapse may lead to incomplete or incorrect surgery [5, 13], which may be one of the reasons for the high rate of recurrences after prolapse surgery [14–16]. Imaging of the pelvic floor has become an important complementary tool in the assessment of pelvic floor disorders.

The publication by Yang et al. in 1991 gave an impetus to the implementation of dynamic magnetic resonance (MR) imaging [17]. Dynamic MR imaging allows assessment of the three compartments at the same time and observation of their mutual relationship at rest and during straining. Other benefits are the absence of ionizing radiation and the excellent anatomical detail of soft tissues such as muscles and pelvic viscera [6, 9, 17–21]. On the other hand, the costs of dynamic MR imaging are high compared with those of clinical examination.

A major problem in both clinical examination and dynamic MR imaging is the enormous diversity of reference lines and measurement points which can be used in staging pelvic organ prolapse. The aim of our study was to provide a systematic literature review of clinical studies which have compared pelvic organ prolapse stages as assessed on dynamic MR imaging (using a reference line) with a standardized method of clinical prolapse staging (not only according to POP-Q) [22]. The available reference lines and anatomical landmarks are discussed in the light of their correspondence with clinical findings.

Materials and methods

We included studies that compared results of pelvic organ prolapse staging in females with both dynamic MR imaging and clinical examination. Dynamic MR imaging was defined as a cine loop obtained at rest, during squeezing, straining, and/or defecation. For inclusion, studies had to report on a reference line used to stage the prolapse on dynamic MR imaging, a standardized method of gynecological prolapse staging (pre- or intraoperative), and a comparison of the dynamic MR imaging and gynecological prolapse staging. We excluded review articles, studies describing various clinical findings but not cystocele, rectocele, enterocele, or uterine or vaginal vault prolapse, and studies describing only postoperative gynecological examination. There were no language restrictions.

The databases EMBASE and PubMed were searched by one of the authors (S.B.) in association with a senior librarian until January 8, 2008. The search terms were adapted for each database, and generally referred to the different terms for “pelvic organ prolapse” and “dynamic magnetic resonance imaging”. The entire string of search terms, including Medical Subject Headings and Thesaurus terms, is given in Appendices A and B. All studies were evaluated independently by two of the authors (S.B. and K.K.), and disagreement was resolved in consensus meetings. References of relevant retrieved studies were cross-checked for additional studies. Reviewers were not blinded to details of authorship.

In each study, data on study design, aim of the study, study population, control group, sample size, MR imaging protocol, reporting of blinding of image assessment, number of observers, reference line(s) on MR imaging, anatomical landmarks on MR imaging, standardized gynecological staging system (pre- or intraoperative), and methods of prolapse symptom assessment were collected. Furthermore, data on the comparison of MR imaging and clinical or operative prolapse staging, as well as the authors’ conclusions in this respect, were collected.

A Cohen’s kappa (as presented in the papers included in this review) of more than 0.8 denotes excellent agreement, between 0.6 and 0.8 good agreement, between 0.4 and 0.6 fair agreement, and below 0.4 poor agreement [23]. Pearson’s correlation coefficient ranges from −1 to +1, where a higher value implies better agreement.
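To make the interpretation scale above concrete, the following is a minimal Python sketch, not taken from any of the reviewed studies, that computes Cohen’s kappa for two sets of paired ratings and maps the value to the four agreement categories. The function names and the example ratings are hypothetical. The text does not say how the boundary values 0.8, 0.6, and 0.4 are assigned; the sketch assumes that exactly 0.8 counts as good, and exactly 0.6 and 0.4 count as good and fair, respectively.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of subjects given the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the marginal proportions per category.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map kappa to the categories used in this review (boundaries assumed)."""
    if kappa > 0.8:
        return "excellent"
    if kappa >= 0.6:  # assumption: 0.6 itself counts as good
        return "good"
    if kappa >= 0.4:  # assumption: 0.4 itself counts as fair
        return "fair"
    return "poor"


# Hypothetical example: prolapse absent (0) or present (1) in four women.
clinical = [0, 0, 1, 1]
mri = [0, 1, 1, 1]
kappa = cohen_kappa(clinical, mri)
print(kappa, interpret_kappa(kappa))  # prints: 0.5 fair
```

In the made-up example, three of four ratings agree (observed agreement 0.75), but half of that agreement is expected by chance, which is why kappa is only 0.5.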

Results

The EMBASE search yielded 369 studies and the PubMed search another 140. Thus, a total of 509 studies were checked for eligibility, of which 432 could be excluded on the basis of title and abstract. The remaining 77 studies were read in full, but only ten studies [24–33], published between 1993 and 2007, fulfilled the inclusion criteria. Sixty-one studies were excluded either because they did not meet our inclusion criteria, namely no report on a reference line used to stage the prolapse on dynamic MR imaging (n studies = 1), no standardized method of gynecological prolapse staging (pre- or intraoperative; n studies = 23), or no comparison of the dynamic MR imaging and gynecological prolapse staging (n studies = 7), or because they met our exclusion criteria, namely review articles (n studies = 23) and studies describing various clinical findings but not cystocele, rectocele, enterocele, uterine, or vaginal vault prolapse (n studies = 7). The other six of the 67 excluded studies were rejected in the consensus meeting: two studies had been misinterpreted by one of the authors, three studies described various clinical findings but not cystocele, rectocele, enterocele, uterine, or vaginal vault prolapse, and one study did not compare findings on clinical examination with the findings on dynamic MR imaging. No additional studies were identified by cross-checking references. One study was excluded from this review because of inconsistency in its description of the reference line used to assess pelvic floor prolapse on MR imaging, which could not be clarified by an email to the authors [34]. None of the studies compared dynamic MR imaging with standardized intraoperative prolapse staging.
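The selection counts reported above can be checked for internal consistency with a few lines of arithmetic. The Python sketch below is only a bookkeeping aid with hypothetical variable names; all numbers are copied from the paragraph above. The text does not state under which count the study excluded for its inconsistent reference line [34] falls, so it is not modelled separately.

```python
# Counts as reported in the Results text (variable names are illustrative).
embase_hits = 369
pubmed_hits = 140
excluded_on_title_abstract = 432
included = 10

# Full-text exclusions by stated reason.
excluded_by_criteria = {
    "no reference line reported": 1,
    "no standardized clinical staging": 23,
    "no MRI versus clinical comparison": 7,
    "review article": 23,
    "other clinical findings only": 7,
}
# Exclusions decided in the consensus meeting.
excluded_in_consensus = {
    "misinterpreted by one author": 2,
    "other clinical findings only": 3,
    "no MRI versus clinical comparison": 1,
}

identified = embase_hits + pubmed_hits
read_in_full = identified - excluded_on_title_abstract
excluded_full_text = (
    sum(excluded_by_criteria.values()) + sum(excluded_in_consensus.values())
)

print(identified, read_in_full, excluded_full_text)
# The included studies should equal those read in full minus those excluded.
print(read_in_full - excluded_full_text == included)
```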

All studies included were cross-sectional observational studies. The data on study characteristics, such as study population and sample size, are shown in Table 1. The study populations consisted of healthy asymptomatic women (n studies = 2) or women with pelvic organ prolapse (n studies = 4), whereas the remaining studies compared symptomatic women to healthy controls (n studies = 4). The median sample size of the studies was 38 (range, 13–100).

Table 1 Study characteristics

Pelvic organ prolapse was staged using the POP-Q or the Baden–Walker system, except by Agildere et al. [24], who classified rectal protrusion into the vagina as absent, small, moderate, or large, according to a system described by Delemarre et al. [35]. Six of the ten studies included in the present review included symptom-free or healthy women [26–29, 31, 32]. Clinical staging in this subgroup was performed in only four of these six studies, of which Gousse et al. assessed the healthy parous subgroup but not the healthy nulliparous subgroup [27–29, 31]. The other two studies were not excluded from the present review, because they did comply with our inclusion criteria with respect to the subgroup of prolapse patients. Overall, only 20 women did not have a clinical examination, whereas 133 women had a standardized clinical examination.

In all studies, MR imaging of the pelvic floor was performed with the subjects in a supine position. The bladder, vagina, and/or rectum were opacified in three studies [25, 29, 33], whereas the other seven did not use opacification [24, 26–28, 30–32], or even emptied the bladder [24, 26, 31]. Agildere et al. [24] used an oral contrast agent (gadopentetate dimeglumine or gadopentetate dimeglumine plus polyethylene glycol).

Patients were asked to defecate during imaging in two studies [24, 31]. In three studies, patients were instructed to perform an increasing straining maneuver [25, 27, 30]. Lienemann et al. [31] and Fauconnier et al. [26] endeavored to standardize the effort of straining, using the reversal of flow in the femoral veins on the axial images to indicate an adequate straining maneuver, or assessed the prolapse when the degree of protrusion remained the same for at least three sequences, respectively.

Images were assessed by more than one observer in six studies [24–27, 31, 33]. The intra- and interobserver reliability of MR imaging measurements was, however, reported only by Fauconnier et al. and was overall excellent (intraclass correlation coefficient for the mid-pubic line 0.87–0.92, and for the perineal line 0.76–0.90) [26]. The intra- and interobserver reliability of clinical assessment was not addressed in the studies included in the review.

In Tables 2 and 3, the data on the reference line(s) and anatomical landmarks used on dynamic MR imaging are presented. Seven different reference lines have been used to assess the presence or absence of pelvic organ prolapse (Fig. 1). Two studies used more than one reference line and compared them for their correlation with clinical examination. Reference lines with the same name had variable definitions, whereas reference lines with different names had the same definition. Furthermore, the anatomical landmarks used on MR imaging in the studies were diverse.

Table 2 Definition of reference lines used on dynamic MRI
Table 3 MRI prolapse assessment and measurement points
Fig. 1

MR imaging at rest in a 62-year-old woman with pelvic organ prolapse. Dynamic midsagittal half-Fourier acquisition single-shot turbo spin-echo (2000/90; 150°) through the pelvis. The image shows the reference lines used in the papers included in this review. The marked section outlines the definition area for the pubococcygeal line (PCL), sacrococcygeal inferior pubic point line (SCIPP), and pubosacral line (PSL). AL axial line, PL perineal line, MPL mid-pubic line, HL hymenal line

Table 4 provides an overview of the results on the comparison of clinical prolapse staging and dynamic MR imaging staging, as well as the authors’ conclusions in this respect. For all studies, this comparison concerned the primary outcome measure of the study, except for Agildere et al. [24]. Agreement between MR imaging findings using the mid-pubic line and clinical findings, as measured by Cohen’s kappa, was poor to fair. Pearson’s correlation coefficient was good to excellent for the mid-pubic line, perineal line, and pubosacral line, with the exception of the posterior compartment. However, Fauconnier et al. concluded that agreement with clinical examination was poor for the mid-pubic line and perineal line [26].

Table 4 Study comparison

In two studies, 100% agreement was reached between clinical examination and dynamic MR imaging in women without pelvic organ prolapse (n = 35 and 10, respectively) [28, 32], whereas in another study, an overestimation of pelvic organ prolapse was seen on dynamic MR imaging as compared to clinical examination in two out of five women [27]. In the remaining three studies (including the largest study), the agreement has not been presented (n = 5, 52 and 41, respectively) [26, 29, 31].

Discussion

In this study, we have performed a systematic literature review on the comparison of dynamic MR imaging staging and standardized gynecological prolapse staging. Given the heterogeneity of the available studies in terms of participants, clinical examination, reference lines, anatomical landmarks, and statistical methods used, it was not possible to pool the data. The main finding was this large heterogeneity of the available studies. The main conclusion in this respect is that proper validation of MR imaging is lacking and further research is needed.

Some of the reference lines have similar definitions but different names, for example, the mid-pubic line and the hymenal line, and the PCL and the sacrococcygeal inferior pubic point line. We suggest that only one name of each pair should be maintained, especially in view of the large number of reference lines. Our preference would be the use of “mid-pubic line” and “pubococcygeal line”, respectively, because these are the original and most commonly used names for these lines.

The most commonly used reference line on dynamic MR imaging is the pubococcygeal line (PCL), which is thought to approximate the plane of the levator plate. In two studies, it could be demonstrated that there was an agreement between MR imaging, using the PCL as reference, and clinical stages of pelvic organ prolapse [28, 31]. Gousse et al. [28] have shown that pelvic organ prolapse was accurately staged on dynamic MR imaging compared to physical examination in the anterior and central compartment, but not for rectoceles (posterior compartment). Lienemann et al. [31] described the PCL as a useful reference line for descent in the anterior compartment only.

The mid-pubic line was introduced by Singh et al. [32] in order to overcome the lack of a generally accepted standard, and in an attempt to find a common reference line for both clinical staging and MR staging of prolapse. The axis of the mid-pubic line was expected to correspond with the level of the hymenal remnants as used in clinical staging. In their study, however, the agreement between clinical and MR staging was only moderate. Lienemann et al. [31], who also applied the mid-pubic line in their study, suggested that it should only be used for staging in the posterior compartment. In 2007, Fauconnier et al. [26] introduced another reference line, the perineal line, with the theoretical advantage of better correspondence with clinical stages. Again, poor agreement between clinical and MR imaging measurements was found. Consequently, neither the mid-pubic line nor the perineal line provides better validity.

The anatomical landmarks in the anterior and central compartment of the pelvic floor on MR imaging were quite similar throughout the studies. If the bladder base or bladder neck, or the vaginal vault or distal edge of the cervix, descended below the reference line, the diagnosis of cystocele, or of vaginal vault or uterine prolapse, was made accordingly. In contrast, the definitions for the diagnosis of a rectocele or an enterocele were diverse. The two main methods used for rectocele assessment were the measurement of descent of an anterior rectal outpouching below a certain reference line [24, 31], and the measurement of the size of the anterior rectal outpouching [28, 30, 33]. It is not yet known which method is more valid.

In clinical practice, the differentiation between an enterocele and rectocele can be difficult, and this probably represents the most important additive role of pelvic floor imaging, such as dynamic MR imaging, colpocystodefecography (CCD) and pelvic floor ultrasonography. To correctly stage an enterocele clinically, the intraoperative findings need to be assessed, since standard gynecological staging is less reliable for this purpose. Kelvin et al. reported that only half of the enteroceles (51%) were identified with physical examination [36]. Unfortunately, there are no studies available on the comparison of standardized enterocele staging with the use of dynamic MR imaging and intraoperative findings, which surpass the comparison of the mere absence or presence of enteroceles.

Another tool to differentiate between an enterocele and a rectocele is open-magnet-unit MR imaging, in which the patient is sitting during assessment. It is, however, not widely available. In a study on closed-magnet-unit versus open-magnet-unit dynamic MR imaging, Bertschinger et al. concluded that overall, MR imaging performed in the sitting position depicted a greater degree of pelvic floor laxity (i.e., organ descent) and more anterior rectoceles and enteroceles. With regard to the detection of clinically relevant findings, however, the position of the patient did not seem relevant [37]. The study by Vanbeckevoort et al. focused on the comparison between dynamic MR imaging and CCD in the diagnosis of descent in each compartment [38]. Their data suggested that dynamic MR imaging in the supine position was less accurate in the evaluation of pelvic floor descent. The authors concluded that the high number of false-negative MR imaging studies in the anterior and middle compartment (and to a lesser extent the posterior compartment) was likely to be due to the position of the patient. Probably the easiest and most cost-effective tool to preoperatively diagnose a rectocele, enterocele, or rectal intussusception, however, is perineal two-dimensional ultrasonography. This method can be performed by gynecologists in the outpatient setting and is likely to become more widespread in the future investigation of prolapse [39–41]. It is of utmost importance to assess the effect of a Valsalva maneuver and rule out levator co-activation, irrespective of the staging method used [42]. Possibly, this is most easily done with ultrasonography because of the close patient–physician contact, with opportunities for immediate feedback and instruction.

Fauconnier et al. [26] were the only authors who compared clinical measurement points with MR imaging anatomical landmarks by positioning the POP-Q points on the dynamic MR images. They found good correlations with the clinical staging for the anterior and central compartment, but not for the posterior compartment. This might be due to the fact that the validity of the POP-Q is likewise lowest for the posterior compartment [11].

A consensus on a standardized protocol for the dynamic MR imaging examination and interpretation is lacking, and there is no evidence on how to overcome this problem. In some studies, the bladder, vagina, and rectum were not opacified, or the bladder was even emptied prior to examination [24, 26–28, 30–32], whereas in the other studies at least one of these structures was opacified with sonographic gel or a mixture with gadolinium [25, 29, 33]. Dynamic MR imaging consists of a cine loop of images of relaxation and maximal straining of the pelvic floor. In some studies, patients were also asked to contract their pelvic floor muscles and/or were instructed to actually defecate during imaging [24, 29, 31]. A shared problem was the lack of a method to objectively assess the effort of straining. For future research on the validation of dynamic MR imaging in women with pelvic organ prolapse, it seems of utmost importance that radiologists and gynecologists cooperate in studies on the standardized assessment of the various signs and symptoms of prolapse. More evidence from well-conducted clinical studies is needed to enable the future definition of guidelines for dynamic MR imaging. For example, the standardized assessment of the patients’ symptoms, i.e., with the use of validated questionnaires, was performed in only one study included in this review [31].

In conclusion, in spite of the abundant number of studies on dynamic MR imaging of the pelvic floor, only a few studies have reported in a standardized manner on pelvic organ prolapse as assessed by dynamic MR imaging and clinical staging. Although dynamic MR imaging is a promising complementary diagnostic tool, proper validation of the method is lacking. The studies available have only small sample sizes and are difficult to compare due to differences in protocols for the examination and evaluation of dynamic MR imaging. None of the reference lines used showed clear superiority. The pubococcygeal line, however, has the advantage of being the most widely used reference line. The high agreement in the anterior and central compartment shows that clinical assessment and dynamic MR imaging are interchangeable. The agreement between methods in the posterior compartment is lower. It seems reasonable to assume that dynamic MR imaging may have advantages over clinical staging in the assessment of posterior compartment prolapse, since it is difficult to identify enterocele and rectal intussusception on clinical examination.