Introduction

Dynamic magnetic resonance (MR) imaging and perineal ultrasonography are relatively new diagnostic tools used to stage or evaluate pelvic organ prolapse (POP). Previously, radiographic imaging was used for this purpose, but dynamic MR imaging has been gaining interest since the 1990s, and in recent years perineal ultrasonography has been gaining ground as well. The high acquisition speed of both modalities makes these dynamic assessments possible. Their shared advantages are the absence of ionizing radiation, non-invasiveness, and superior soft tissue contrast. Additionally, perineal ultrasonography has the advantages of low cost and of being suitable for use by the gynecologist in the outpatient clinic, whereas dynamic MR has the advantage of imaging a larger volume of the pelvic floor.

To standardize clinical examination, the International Continence Society recommends the use of a site-specific system, with the hymenal remnants as the point of reference: the Pelvic Organ Prolapse Quantification (POP-Q) [1]. However, for the identification of all compartments involved in the prolapse, and to differentiate between an enterocele and an anterior rectocele, additional imaging of the pelvic floor is frequently needed [2–5].

Four previous studies have assessed the agreement of POP-Q measurements with measurements on dynamic MR imaging using at least one reference line, but they reported conflicting findings [6–9]. With respect to ultrasonography, only one study has compared perineal ultrasonography with POP-Q; it found good correlation for the anterior and middle compartments and moderate correlation for the posterior compartment [10]. To our knowledge, no studies have assessed the agreement between dynamic MR imaging and perineal ultrasonography measurements.

The aim of this study was to assess the agreement in prolapse staging between clinical examination (according to POP-Q), dynamic MR imaging, and perineal ultrasonography.

Materials and methods

This prospective observational study was carried out at the Radboud University Nijmegen Medical Centre from September 2005 until March 2007. The center is a national tertiary referral center for women with pelvic organ dysfunctions. All women who underwent dynamic MR imaging for the investigation of symptoms of complex pelvic floor dysfunction during this period were included in our study. MR imaging was performed as part of routine clinical practice in patients with recurrent prolapse, especially in the posterior compartment, or when the patient’s complaints did not correspond with clinical findings.

Pelvic organ prolapse was staged in a standardized manner on clinical examination (POP-Q) in all women. A subset of women additionally underwent POP staging with use of perineal ultrasonography. This technique was introduced in our center in January 2007. Since then, all women who underwent dynamic MR imaging were evaluated by perineal ultrasonography as well. The study was submitted to and deemed exempt by the local Institutional Review Board.

Clinical examination

Clinical assessment of the pelvic floor was performed by one of three gynecologists experienced in the assessment of pelvic organ prolapse, with patients in the supine lithotomy position. In the POP-Q, nine measurement points are assessed during maximal Valsalva maneuver, except for the total vaginal length (TVL), which is measured at rest. Only the measurements of POP-Q points Ba, C, Bp, and the TVL were used in the comparison with measurements on dynamic MR imaging and ultrasonography. Ba is the most descended edge of the anterior vaginal wall, C represents either the most distal edge of the cervix or the leading edge of the vaginal vault after total hysterectomy, and Bp is the most descended edge of the posterior vaginal wall. TVL is the depth of the vagina up to the posterior fornix when point C is reduced to its normal position. Measurements in centimeters relative to the hymenal remnants were used in the analysis.

Dynamic MR imaging protocol

The dynamic MR imaging examination was performed with the patient in supine position with parallel and slightly flexed legs. Patients were requested not to void for 1–2 h prior to the examination. The rectum was opacified using 100–150 ml ultrasound gel. The urethra, bladder, and vagina were not opacified. No premedication was given. MR images were acquired using a 3 T MR scanner (TIM TRIO, Siemens Medical, Germany) and an eight-channel body phased-array coil. MR images were obtained in the sagittal plane using a Half-Fourier acquisition single-shot turbo spin-echo sequence (2,000 ms/90 ms repetition time/echo time; 150° flip angle). During the MR examination, the patient was asked to relax her pelvic floor muscles, to contract the muscles slowly, relax again, and then to increase the intra-abdominal pressure and strain in order to defecate. To assure that the patient followed the instruction given, all images were viewed online on the MR console. A whirl of urine in the bladder and/or a dent into the cranial portion of the bladder, seen on the sagittal images, indicated adequate straining. The MR examination time was approximately 35 min.

The images were analyzed at a later stage on a console with zoom facilities and electronic calipers. The observer was blinded to the clinical findings. The midsagittal images on maximal strain were used to assess the prolapse. The three reference lines used are shown in Fig. 1. The pubococcygeal line was defined as a straight line between the inferior rim of the pubic bone and the last visible coccygeal joint [8, 11], the H-line as a straight line between the inferior rim of the pubic bone and the posterior wall of the anal canal at the level of the impression of the puborectal sling [12], and the mid-pubic line as a line drawn through the longitudinal axis of the pubic bone, passing through its midequatorial point [9].

Fig. 1 MR image obtained at rest. Dynamic midsagittal Half-Fourier acquisition single-shot turbo spin-echo (2,000/90; 150°) through the pelvis of a 62-year-old woman with symptoms of pelvic organ prolapse. The image shows the reference lines used. PCL pubococcygeal line; H-line; MPL mid-pubic line

On maximum strain, the positions of the leading edge of the bladder (bladder), the cervix or vaginal vault (C/VV), and the most anterocaudal point of the anterior rectal wall (rectum) were measured in centimeters, perpendicular to each of the three reference lines.
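The perpendicular measurement described above can be illustrated with a minimal geometric sketch. It assumes that landmarks and reference-line endpoints are available as two-dimensional coordinates in centimeters in the midsagittal plane; the function name, the coordinates, and the sign convention are hypothetical and are not taken from the study protocol, in which distances were measured with electronic calipers.

```python
from math import hypot


def signed_perpendicular_distance(point, line_start, line_end):
    """Signed perpendicular distance (same unit as the inputs) from a
    landmark to the straight line through line_start and line_end.

    Assumed sign convention (not specified in the paper): positive when
    the landmark lies to the left of the direction line_start -> line_end,
    negative when it lies to the right.
    """
    (px, py), (ax, ay), (bx, by) = point, line_start, line_end
    dx, dy = bx - ax, by - ay
    length = hypot(dx, dy)
    if length == 0:
        raise ValueError("reference line endpoints must differ")
    # 2D cross product of the line direction and the vector to the landmark
    cross = dx * (py - ay) - dy * (px - ax)
    return cross / length


# Hypothetical coordinates in cm: a reference line from the inferior pubic
# rim to a posterior bony landmark, and the leading edge of the bladder.
pubic_rim = (0.0, 0.0)
posterior_landmark = (10.0, 0.0)
bladder_edge = (3.0, -2.0)
distance_cm = signed_perpendicular_distance(bladder_edge, pubic_rim, posterior_landmark)
```

A signed distance keeps the information on which side of the reference line the landmark lies, which is needed to tell descent below the line from a position above it.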

In addition to the anatomical landmarks mentioned above, clinical measurement points were assessed to approximate points Ba, C, and Bp of the POP-Q system on MR imaging. These measurement points were assessed using the mid-pubic line, because this line was introduced by Singh et al. as a reflection of the hymen, which is the reference structure in the POP-Q system [9]. In the anterior compartment, the most posterocaudal point of the anterior vaginal wall was used; in the central compartment, the most distal point of the cervix or the vaginal vault; and in the posterior compartment, the most anterocaudal point of the posterior vaginal wall. In addition, the TVL was assessed at rest, measured from the posterior fornix or vaginal vault, following the contour of the vagina, to the crossing with the mid-pubic line.

Perineal ultrasonography protocol

Translabial ultrasonographic prolapse assessment was carried out in the midsagittal plane using an 8–4 MHz transabdominal transducer (Voluson 730 expert, GE Kretz Ultrasound, Zipf, Austria) [13]. Women were in the supine lithotomy position with an (almost) empty bladder. Anatomical landmarks used were the leading edge of the bladder (bladder), the cervix or vaginal vault (C/VV), and the most anterocaudal point of the anterior rectal wall (rectum). On maximum strain, the distance perpendicular to a horizontal reference line through the inferoposterior margin of the symphysis pubis was determined in centimeters [14]. In order to obtain maximum descent, care was taken to minimize pressure of the probe on the perineum. The ultrasonography examination time was approximately 20 min.

The three-dimensional images were stored and assessed offline at a later stage with the use of GE Kretz 4D View software (Kretztechnik GmbH, Zipf, Austria). The physician who performed the imaging was not blinded to the findings on physical examination, but the observer of the anonymized offline images was blinded to these findings.

Statistical evaluation

Spearman’s rank correlation coefficient was used to compare POP-Q measurements with the measurements on dynamic MR imaging and ultrasonography, and to compare measurements on dynamic MR imaging with those on ultrasonography. A Spearman’s correlation coefficient above 0.80 denotes excellent correlation, 0.60 to 0.80 good correlation, 0.40 to 0.60 moderate correlation, and below 0.40 poor correlation. The mean difference between two measurements, with 95% confidence interval, and the corresponding Bland and Altman plots are presented [15]. SPSS version 16.0 (SPSS, Inc., Chicago, IL, USA) was used to perform the statistical analysis. P values <0.05 were considered statistically significant.
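As an illustration of the correlation analysis described above, the following is a minimal sketch of Spearman’s rank correlation and of the interpretation categories. It is not the SPSS procedure used in the study. The function names and the example values are hypothetical, and the handling of the exact boundaries 0.60 and 0.40 (assigned here to the higher category) is an assumption, since the text only gives the ranges.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def classify_correlation(rho):
    """Categories from the text; boundary handling at 0.60 and 0.40 is assumed."""
    if rho > 0.80:
        return "excellent"
    if rho >= 0.60:
        return "good"
    if rho >= 0.40:
        return "moderate"
    return "poor"


# Hypothetical measurements in cm (not study data)
pop_q_ba = [-3, -2, -1, 0, 1, 2]
mr_bladder = [-1.0, 0.5, 0.2, 1.5, 2.5, 3.0]
rho = spearman_rho(pop_q_ba, mr_bladder)
category = classify_correlation(rho)
```

Because POP-Q points take discrete values, ties are common, which is why the sketch uses mean ranks rather than the shortcut formula based on squared rank differences.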

Results

Of the 100 women who underwent dynamic MR imaging of the pelvic floor during the study period, 97 were included. Two patients were excluded because of imaging artifacts caused by patient movement, and imaging in one patient was cancelled because of the patient’s anxiety in the MR scanner. All of the included women had a POP-Q examination and 61 women (63%) also had a perineal ultrasonography examination. Women’s baseline characteristics are shown in Table 1.

Table 1 Baseline characteristics of the study population (n = 97 women)

POP-Q vs. dynamic MR imaging and perineal ultrasonography

Table 2 shows the Spearman’s correlation coefficients between POP-Q points Ba, C, and Bp and the anatomical landmarks and clinical measurement points on dynamic MR imaging, and the anatomical landmarks on perineal ultrasonography. All correlations were statistically significant, except for that of POP-Q TVL with the clinical measurement on MR imaging, and those of POP-Q Bp with the rectum (posterior compartment) using the pubococcygeal line on dynamic MR imaging and on perineal ultrasonography. Correlation of POP-Q Ba with the bladder (anterior compartment) was good for all reference lines on MR images (r_s = 0.61–0.66), but only moderate for the clinical measurement point on MR imaging and for perineal ultrasonography (r_s = 0.49 and 0.58, respectively). Correlation of POP-Q C with the C/VV (central compartment) on MR images was poor (r_s range 0.29–0.33), except for the pubococcygeal line (r_s = 0.40). Correlation of POP-Q Bp with the rectum (posterior compartment) was higher for the mid-pubic line and the clinical measurement point on MR imaging (r_s = 0.49 for both) than for the pubococcygeal line (r_s = 0.01), the H-line (r_s = 0.23), and perineal ultrasonography (r_s = −0.03).

Table 2 Correlation and mean difference, with 95% confidence interval between POP-Q points and measurements on dynamic MR images using three different reference lines and dynamic ultrasonography using a horizontal reference line

The cervix was seldom seen on perineal ultrasonography, including the women with uterine descent. Even in retrospect, we could detect the cervix in the 3D cineloops in only four out of 11 women with POP-Q point C > −5. Consequently, comparison of perineal ultrasonography of the middle compartment to other measurements was not feasible.

Figure 2 presents the Bland and Altman plots of the POP-Q measurements vs. the corresponding dynamic MR measurements and perineal ultrasonography measurements. The range of two standard deviations, the mean, and the reference at zero are indicated. Each dot represents one woman. The plots show oblique lines, because of the discrete values for POP-Q Ba, C, and Bp. In each plot, the majority of the individual dots are situated above the reference at zero, except when the mid-pubic line was used. This is due to the fact that the pubococcygeal line is situated most cranially. The difference in reference lines leads to a systematic difference with higher POP stages on dynamic MR imaging and perineal ultrasonography as compared with clinical examination, except for the mid-pubic line.

Fig. 2 Bland and Altman plots of the POP-Q measurements vs. the corresponding dynamic MR imaging and perineal ultrasonography measurements. The difference between two methods (Y-axis) is plotted against the average of the two methods (X-axis). The dotted lines represent two standard deviations, the continuous line represents the mean, and the reference at zero is plotted. POP-Q Pelvic Organ Prolapse Quantification; Ba most descended edge on the anterior vaginal wall on strain; C most descended edge of the cervix or the leading edge of the vaginal vault on strain; Bp most descended edge on the posterior vaginal wall on strain; bladder most descended edge of the bladder; C/VV most descended edge of the cervix/vaginal vault; rectum most anterocaudal point of the anterior rectal wall; PCL pubococcygeal line; MPL mid-pubic line; MPL MRI clinical measurement points to approximate POP-Q Ba, C, and Bp on MR imaging using the mid-pubic line; US horizontal reference line on ultrasonography. Note: left top panel shows the difference of PCL bladder with POP-Q Ba vs. the average of these two in centimeters
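The quantities shown in such a plot can be computed with a short sketch. It assumes paired measurements in centimeters from two methods in the same women; the function name and the example values are hypothetical, and the limits are taken as the mean difference plus or minus two standard deviations, as described for the plots.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return the plotted points and summary lines of a Bland and Altman plot.

    Each point is (average of the two methods, difference a - b).
    Limits are mean difference +/- 2 sample standard deviations.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    avgs = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # continuous line in the plot
    sd = stdev(diffs)    # sample standard deviation of the differences
    return {
        "points": list(zip(avgs, diffs)),
        "mean_difference": bias,
        "lower_limit": bias - 2 * sd,  # lower dotted line
        "upper_limit": bias + 2 * sd,  # upper dotted line
    }


# Hypothetical paired measurements in cm (not study data)
imaging_cm = [1.0, 2.0, 3.0, 4.0]
pop_q_cm = [0.0, 1.0, 1.0, 2.0]
result = bland_altman(imaging_cm, pop_q_cm)
```

A mean difference well away from zero, as in this made-up example, corresponds to the systematic offset between methods that appears when most dots lie on one side of the zero reference.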

Perineal ultrasonography vs. dynamic MR imaging

Table 3 shows the Spearman’s correlation coefficients between the measurements on perineal ultrasonography and on dynamic MR images. Correlation of the bladder (anterior compartment) between perineal ultrasonography and dynamic MR imaging was statistically significant and good (r_s range 0.61–0.70), whereas correlation of the rectum (posterior compartment) was not statistically significant and poor (r_s range 0.11–0.19).

Table 3 Correlation and mean difference, with 95% confidence interval between measurements on dynamic MR images using three different reference lines and ultrasonography using a horizontal reference line

Figure 3 presents the Bland and Altman plots of the perineal ultrasonography measurements and the corresponding dynamic MR measurements. The range of two standard deviations, the mean, and the reference at zero are indicated. Each dot represents one woman. Most dots are plotted above the reference at zero due to the systematic difference. The horizontal reference line on perineal ultrasonography seems to correspond best with the hymen, as compared to the other reference lines.

Fig. 3 Bland and Altman plots of the perineal ultrasonography measurements vs. the corresponding dynamic MR imaging measurements. The difference between two methods (Y-axis) is plotted against the average of the two methods (X-axis). The dotted lines represent two standard deviations, the continuous line represents the mean, and the reference at zero is plotted. bladder most descended edge of the bladder; rectum most anterocaudal point of the anterior rectal wall; PCL pubococcygeal line; MPL mid-pubic line; US horizontal reference line on ultrasonography. Note: left top panel shows the difference of PCL bladder with US bladder vs. the average of these two in centimeters

Discussion

In the present study, we determined the agreement between clinical prolapse staging and staging on dynamic MR imaging and perineal ultrasonography. The agreement proved to be moderate to poor, except for the anterior compartment. We also determined the agreement between the measurements on perineal ultrasonography and on dynamic MR images, which was good in the anterior compartment only. The results were independent of the reference line used on dynamic MR imaging.

Correlation of POP-Q stages with dynamic MR imaging has been studied before [6–9]. Until now, however, only one paper has compared the POP-Q measurements in centimeters, rather than the derived POP-Q stages, with measurements on dynamic MR imaging. This was by Fauconnier et al., who described good correlations for the anterior and central compartment when using the mid-pubic and perineal line (r = 0.71, 0.79, and 0.74, 0.80, respectively) [7].

Dietz et al. were the first to assess the correlation of measurements on perineal ultrasonography with POP-Q. They described good correlation between the two modalities, except for the posterior compartment (r_s = 0.72, r_s = 0.77, and r_s = 0.53 for the anterior, central, and posterior compartment, respectively) [10]. In agreement with their findings, we found the poorest correlation in the posterior compartment. However, our results showed no correlation at all between the two modalities in this compartment. In our search for an explanation, we analyzed whether the posterior compartment correlated poorly with all modalities because of a more ventral than caudal development of the anterior rectal wall (i.e., bulging into the vagina and not through the introitus of the vagina). This hypothesis could not be confirmed, since there was no difference in the direction of development of the bulge in women with and without a rectocele on dynamic MR imaging.

It has previously been described that the cervix was only clearly visible on perineal ultrasonography when values of >−5 were documented for POP-Q point C [10]. In our series, however, even in retrospect, the cervix was only infrequently seen in women with uterine descent. Unfortunately, we had no access to 2D ultrasonography loops, which would probably have made the assessment easier because of better image quality as compared with the 3D volumes used in the present study.

Correlations can be influenced by the intra- or interobserver reliability (or reproducibility) of the different modalities. When reliability is moderate or poor, correlations can be negatively influenced. In the literature, the reliability of measurements with the POP-Q system is described by Hall et al. as largely substantial and highly significant, although the interobserver reliability for POP-Q point C and TVL is only moderate (r_s = 0.52 and 0.49, respectively), as is the intra-observer reliability for TVL (r_s = 0.43) [16]. The reliability of MR imaging measurements in women with POP in relation to the reference lines used in this study was described as excellent to good by Fauconnier et al. and Broekhuis et al. [7, 17]. Braekken et al. have described excellent intra-observer reliability when the bladder neck was used as measurement point on perineal ultrasonography [18]. The reliability of the other measurement points on perineal ultrasonography has not been described until now.

In line with our findings, other studies have reported that dynamic MR imaging bears the risk of overestimating the severity of POP as compared to clinical examination [7, 9, 10]. Consequently, the question is whether clinical examination or the imaging techniques represent the real severity of POP. According to our clinical experience in the operating theater, the imaging techniques seem to exaggerate the extent of POP and bear the risk of overtreatment.

To the best of our knowledge, this is the first report on the agreement of dynamic MR imaging with dynamic perineal ultrasonography in women with POP. The findings were comparable to the correlations of each modality with POP-Q. In view of the high correlations, imaging is not likely to have additional value for the anterior compartment, and POP-Q can be regarded as the gold standard of POP staging in the anterior compartment. Agreement between the measurements of the two imaging modalities in the posterior compartment was poor. At this stage, the available evidence does not prove the superiority of either imaging technique in the central and posterior compartment. It is conceivable that the dynamic imaging modalities outperformed clinical examination and unveiled findings that were missed on clinical examination, although this cannot be proven at this time.

Correlation of the two imaging modalities with surgical findings remains difficult to accomplish, because no standardized staging system is available for assessment during surgery. Future evaluation of patients’ symptoms in relation to POP-Q and the various imaging techniques may further guide the choice of a gold standard for POP staging in the central and posterior compartment.

Our results may have been influenced by our patient population, which consisted of a large number of women who had at least three previous operations for POP or urinary incontinence. In general, this specific patient population may result in more difficult assessment of POP in all three modalities of assessment.

In conclusion, our study showed that POP staging with the use of POP-Q, dynamic MR imaging, and dynamic perineal ultrasonography correlates well in the anterior compartment and only moderately to poorly in the central and posterior compartment, regardless of the reference line used.