Introduction

This report is part of a series of articles that are the product of the International Urogynecology Consultation (IUC), which is sponsored by the International Urogynecological Association (IUGA). This is a 4-year, four-chapter project, with 16 reports dedicated to reviewing and summarizing the world’s literature on pelvic organ prolapse (POP).

This report is from the 2nd year and chapter of the project, which is dedicated to the evaluation of POP. This year/chapter is divided into three reviews, the other two involve the radiographic evaluation of POP and the use of patient-reported outcomes (POP condition-specific quality-of-life questionnaires) in the evaluation of POP. This report focuses on the clinical evaluation of women with POP and describe how to use the physical examination to describe pelvic organ support or prolapse. In addition, the associated testing to evaluate comorbid conditions of the urinary and gastrointestinal tracts (GITs) is described and evaluated. Radiographic testing to evaluate comorbid lower urinary tract and gastrointestinal conditions is part of this report.

It is recommended that every patient with POP has a thorough clinical examination. Describing and evaluating the patient for POP, although it at first seems straightforward, is in fact very complex. First, there are several classification systems currently in use to describe and quantify POP. The clinician is then left to determine the relative benefits of using one system over another. In addition, it is recognized that many patients with POP often have pelvic floor comorbidities involving other pelvic/abdominal organ systems [1]. Choosing how best to use clinical resources to properly investigate these conditions in patients with POP can be confusing. In addition, the interpretation of test results in a patient with POP may be different than interpretation of the same studies in a patient with normal pelvic organ support. Finally, this paper addresses the question as to which additional testing is necessary and should be routine versus which testing should only be performed if there are associated symptoms present. This review is not meant to be an exhaustive paper regarding the evaluation of lower urinary tract or gastrointestinal symptoms in women, except as they are uniquely influenced by POP.

In this review, the components of a clinical examination and the conditions under which they should be performed are assessed and the best practices described. Any additional testing of co-morbid conditions that should be routinely undertaken, and the conditions under which they are best performed, are evaluated and the best practices described. Knowledge gaps and areas that require further study are also noted.

Materials and methods

This manuscript is a narrative review that includes a systematic search of the literature using terms from the PubMed and Embase databases (January 2000 to August 2020). Only human studies involving adult women and limited to the English language were included. The terms for searching the literature were developed by the authors of this report and were presented to the IUGA membership at the annual scientific meeting in 2020; progress was reported at subsequent meetings. These are shown in Table 1 the titles and abstracts were reviewed using the Covidence database to determine whether they met the inclusion criteria. In the event of uncertainty, this was discussed at regularly scheduled meetings. The manuscripts were next reviewed for suitability using the Specialist Unit for Review Evidence checklists for cohort, cross-sectional, and case–control epidemiological studies. This was done to assess data presentation, population description, and bias. Only studies that included populations with clear definitions of patients with symptomatic POP, which described examination findings, were included. The full-text manuscripts were extracted and then reviewed. Those manuscripts that qualified were reviewed in depth and the process is summarized in the Results section (Fig. 1).

Table 1 Keywords used for searching the literature
Fig. 1
figure 1

Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagram for prolapse and examination findings

Results

The search strategy found 11,242 abstracts, which were reviewed and led to the extraction of 940 full-text articles, of which 220 articles were used to inform this narrative review. The results and the PRISMA figure for each are reported in three areas:

  1. 1.

    Clinical physical examination 

  2. 2.

    The urinary tract (LUTS), and

  3. 3.

    The gastrointestinal tract (GIT).

Other comorbid conditions such as pain and sexual dysfunction are better evaluated and recorded using patient-reported outcomes, which are covered in a separate manuscript of the IUC [2].

Clinical physical examination of a woman with POP

A review of the existing literature on the examination of a patient with POP and the impact of various parameters on the examination findings was performed. The initial search identified more than 7,155 abstracts of which around 96 studies were included in the final review (Fig. 2) This review of the clinical examination is divided into four sections:

  1. 1.

    General aspects of examination of a woman with POP

  2. 2.

    Examination of the anterior compartment

  3. 3.

    Examination of the posterior compartment

  4. 4.

    Examination of the apical compartment

Fig. 2
figure 2

Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagram for lower urinary tract symptoms

General aspects of examination of a woman with POP

Methods to describe/quantify examination of POP

A variety of systems have been devised to classify and quantify POP. Eight studies focused on assessing the reliability and reproducibility of various staging systems (Table 2). It was found that the Baden–Walker Halfway Grading System had moderate reproducibility, making it unsuitable for clinical care or research [3]. The Pelvic Organ Prolapse Quantification (POP-Q) system, on the other hand, was found to have good interobserver agreement and was found to be particularly useful in the research setting [4].

Table 2 Results of studies assessing the different staging systems for pelvic organ prolapse

Owing to the complexity of the POP-Q, a simplified POP-Q (S-POP) system was devised. This system retains the ordinal stages of the POP-Q but simplifies the terminology and reduces the number of points measured. Three studies evaluated the validity, interobserver agreement, and inter-system agreement between the simplified POP-Q and POP-Q [5,6,7]. The authors concluded that a substantial intersystem association exists between S-POP and POP-Q, and S-POP, being simpler, may be more applicable to clinical practice worldwide. It was also found that the simplified POP-Q system retains its inter-examiner agreement across centers of varying degrees of expertise and is a valid, user-friendly alternative to POP-Q. For a complete description of the POP-Q please refer to the article by Bump et al. [8]. For a complete description of the S-POP please refer to the article by Swift et al. [9].

One study described and evaluated the validity of the novel “eye-ball” POP-Q technique (POP-Q by estimation) [10]. In this technique, the points along the anterior and posterior vaginal walls (Aa, Ba, Ap, and Bp) and on the perineum genital hiatus (GH) and perineal body (Pb) were visually estimated. Determination of vaginal depth (total vaginal length, or TVL) and apical descent (points C and D) were assessed by both visual estimation and palpation with the examiner’s dominant hand. The authors suggested that estimating POP-Q values provided comparable results to measuring them when performed by physicians well versed with the standard POP-Q.

Impact of various parameters on POP examination

When examining a patient with suspected POP, it is critical that the examiner sees and describes the maximal extent of the POP as experienced by the woman. This may be impacted by many variables including the patient’s age, parity, body mass index (BMI), position, bladder volume, rectal fullness, the timing of the day of the examination, examination performed at rest or Valsalva/straining, and effect of anesthesia in the case of examination in operating rooms. The correlation of examination findings with these variables was examined separately in nine studies. The conclusions of these are summarized below.

  1. 1.

    Age, parity, and BMI: there is no literature on how any of these impacts the ability of a woman to aid in her examination to identify the bothersome extent of her POP.

  1. 2.

    Bladder volume and rectal fullness: the effect of bladder volume on examination of POP was evaluated by two studies [11, 12]. Both concluded that the maximal extent of POP should always be assessed with an empty bladder. This could be because a full bladder does not allow maximal straining and also distorts the anatomy of the vaginal wall, especially of the anterior and central compartments. Similarly, a full rectum may cause confounding of findings by competing for space. One study commented that all patients with POP should be examined with an empty rectum if possible [13]. However, there is a lack of evidence to support this.

  2. 3.

    Patient position: there is a lack of standardized recommendations regarding patient position during a POP examination. Three studies examined the effect of patient position on the staging of POP [14,15,16]. It was found that the severity of POP demonstrated is greater when the examination is done in the upright position on a birthing chair or in the standing position rather than in the supine or lithotomy position. The inter-observer repeatability and correlation with the quality of life scores were also greater for examination findings in the upright position. In cases where the examination is not possible in an upright position, validation of POP-Q in a left lateral position was also assessed and the authors found a high degree of inter-observer reliability of POP-Q findings in this position [12].

  3. 4.

    Time of examination: the effect of the time of the day (morning versus afternoon) on POP-Q measurements, was assessed in a prospective observational study on 32 subjects [17]. No correlation was found between time of the day and extent of POP on examination. The authors concluded that for patients complaining of POP extending beyond the hymen there is no need to repeat an examination late in the day to confirm the full extent of prolapse.

  4. 5.

    Rest or straining: one study examined the predictive value of GH and Pb measurements obtained at rest and with straining for signs and symptoms of POP[18]. GH and Pb measured on straining were consistently stronger predictors of prolapse symptoms and objective prolapse (by clinician examination and by ultrasound) than at Gh and Pb measured at rest.

  5. 6.

    Anesthesia/neuromuscular blockade: the effect of neuromuscular blockade on POP staging was examined by one study [19]. It was found that neuromuscular blockade during anesthesia led to a significant increase in POP-Q measurements, especially in the apical compartment. The authors highlighted that in asymptomatic women with up to stage II POP, the surgical procedure should be limited to that planned preoperatively rather than allowing intraoperative findings to affect surgical management.

  6. 7.

    Role of cervical traction in prolapse examination: one study compared the degree of uterine prolapse between POP-Q with cervical traction and POP-Q in the standing position. They also assessed patient-reported pain and acceptability scores between the two examinations [20]. The median point C in the standing position was −4 (−7 to +2) and with cervical traction −0.5 (−3 to +4). Forty percent reported visual analog scale (VAS) pain scores of ≥5 under examination with cervical traction. Surprisingly, there was no significant difference in acceptability scores between the groups.

Relation of POP stage to GH length, Pb, and TVL

Two studies were aimed at describing the relationship between GH and Pb measurements with increasing POP stage [21, 22]. It was found that as the extent of POP increases, GH measurements also increased until stage 4 POP, where mean GH decreased. Also, the POP-Q measurement GH ≥ 3.75 cm is highly associated with and predictive of apical vaginal support loss. One study found that measurement of the TVL improved the correlation between the C-point measurement and POP symptoms [23].

Evaluation of pelvic floor muscle function in women with POP

Different methods have been used to study the pelvic floor muscle function (PFMF) and its correlation with severity of POP and pelvic symptoms. One study assessed whether POP severity, pelvic symptoms, quality of life, and sexual function differ based on PFMF (assessed by the Brink scale score; Table 3) by re-analyzing preoperative assessments of 317 of the 322 women enrolled in the Colpopexy and Urinary Reduction Efforts (CARE) trial [24]. They found that women with the highest Brink scores (n=75), suggesting enhanced pelvic floor muscle tone, had less advanced POP and smaller GH measurements, than those with the lowest Brink scores (n=56), suggesting weak pelvic floor muscle tone.

Table 3 Brink scoring system

Two other studies tested the correlation between the PFMF, using the Oxford grading scale (Table 4), and the severity of POP. In one study 1,037 women were evaluated by assessing the POP-Q and the Oxford assessment of the PFMF. The muscle contraction was graded according to the modified Oxford grading system (Table 4): 0 = no contraction, 1 = flicker, 2 = weak, 3 = moderate, 4 = good, 5 = strong [25]. The levator hiatus (LH) size and GH were measured by digital examination [26]. Severity of POP correlated moderately with GH (r = 0.5, p<0.0001) and with LH (transverse r = 0.4, p<0.0001; longitudinal r = 0.5, p<0.0001), but weakly with the modified Oxford grading scale (r = 0.16, p<0.0001). In the second study, it was seen that POP stage had a significant influence on effective involuntary pelvic floor muscle contraction to counteract a sudden increase in intra-abdominal pressure during coughing. Women with POP stages II or more were significantly less able to achieve effective involuntary muscle contraction during coughing (which resulted in stabilization of the perineum; 37.7%) than women without POP (75.2%) [27].

Table 4 Modified Oxford Grading scale for pelvic floor muscle (PFM) strength

Neurological examination in women with POP

There are very few data on neurological assessment in patients with POP. In a case–control study, the vaginal and clitoral sensory thresholds were assessed in 66 women with (n=22) and without POP (n=44) using a thermal and vibration Genito-Sensory Analyzer [28]. They found that women with POP exhibited significantly lower sensitivity in the genital area to vibratory and thermal stimuli than women without POP.

Association of spine curvature with POP and bony dimensions of the pelvis

Three studies evaluated the relationship of spinal curvature with POP. One study assessed the relationship of spinal curvature and POP, specifically, the loss of lumbar lordosis or pronounced thoracic kyphosis in 363 patients with symptomatic POP [29]. They found that patients with abnormal spinal curvature were 3.2 times more likely to develop POP than patients with a normal curvature (odds ratio, 3.18; 95% confidence interval, 1.46 to 6.93; p=0.002) and identified an abnormal change in spinal curvature as a significant risk factor in the development of POP. In the other two studies no differences in the mean T or L spine angles were found between women with and those without POP symptoms (p≥0.05) and bony dimensions on MRI at the level of the pelvic floor in matched cohorts were similar [30, 31].

Examination of different pelvic compartments in POP

Examination of anterior vaginal wall compartment for paravaginal defects

With respect to the clinical examination of the anterior vaginal wall defects, using the standardized POP-Q examination and a clinically defined technique for describing the presence of paravaginal defects, right and/or left lateral, central or superior defects have been described. To differentiate a midline or central defect from a paravaginal defect, an index finger or ring forceps must be placed vaginally toward each ischial spine separately. If the prolapse becomes reduced, the woman is clinically diagnosed with a paravaginal defect on that side. In a prospective study, the sensitivity to detect left, right, and bilateral paravaginal defects was reported to be 48%, 40%, and 23.5% respectively, whereas the specificities for each side were 71%, 67%, and 80% respectively compared with intraoperative findings. The overall prevalence of paravaginal defects in patients with at least POP-Q stage II POP of the anterior vaginal wall was 38% [32].

Another study assessed the inter-examiner and intra-examiner reliability of the evaluation of the anterior vaginal wall, including the evaluation of paravaginal defects, using the POP-Q examination and a standardized evaluation of paravaginal defects [33]. The clinical examination of anterior vaginal wall support defects displayed poor inter-examiner and intra-examiner agreement. Overall inter-examiner agreement was 42%, with a kappa of 0.16.

Correlation of anterior and apical compartment prolapse

The relationship or coexistence of anterior vaginal wall prolapse with apical prolapse was investigated in one study [34]. Women with a POP-Q Point Ba value ≥ −1 were retrospectively analyzed for the presence of apical POP defined as POP-Q point C value ≥ −3. The finding of POP-Q stage II or greater anterior vaginal wall prolapse was highly suggestive of clinically significant apical vaginal descent to −3 cm or greater. Furthermore, as the anterior vaginal wall POP-Q stage increased, the predictive value of apical POP increased. In women with POP-Q stage II anterior vaginal wall prolapse there was associated apical descent (defined as POP-Q point C ≥ −3) in 42%; in stage III anterior vaginal wall POP, apical descent was found in 85%; and in POP-Q stage IV anterior vaginal wall POP it was 100%.

Examination of the posterior compartment and the need for a rectovaginal examination

Three studies were identified that specifically evaluated the posterior vaginal wall and its relationship to GI dysfunction. A prospective cohort study used a variety of validated questionnaires and standardized examination measures, including Bp, AP, GH, and Pb, transverse GH, Pb at rest, with strain in addition to a “pocket” noted on rectal examination [35]. Inter- and intra-rater reliability for these were assessed by two independent examiners. This study demonstrated the reliability of these measurements of the posterior vaginal compartment and a weak association between obstructed defecation and pelvic organ prolapse.

Another study evaluated the association between defecatory symptoms such as constipation, painful defecation, fecal incontinence, and flatus incontinence and posterior vaginal wall examination using the POP-Q and by defecography [36]. The authors found no association between defecation disorders and posterior wall prolapse (evaluated by POP-Q) or rectocele (assessed by defecography) and that clinical examination missed most enteroceles. They concluded that most anatomical measures of posterior compartment prolapse are reliable and reproducible; however, they do not correlate well with defecatory symptoms.

One study assessed the evaluation of the rectovaginal septum (RVS) using a digital rectal examination [37]. The authors concluded that extending the clinical examination of prolapse to include rectal examination to palpate defects in the RVS may reduce the need for a defecatory proctogram or ultrasound for the assessment of obstructive defecation and may help to triage patients in the management of posterior compartment prolapse. Larger rectoceles were easier to identify and true rectoceles may be best diagnosed by rectal examination.

Examination of the apical compartment

Normal values for the apical component of the POP-Q

One study assessed normal values for the apical component of the POP-Q (points C, D, and TVL) in asymptomatic women by re-analyzing data from the original 2005 Pelvic Organ Support Study using a data set of 1,011 women [38]. In patients without POP defined as all POP-Q points above the hymenal remnants, they found mean POP-Q values to be: point C (vaginal cuff) −7.3 ± 1.5 cm, point C (cervix) −5.9 ± 1.5, point D −8.7 cm ± 1.5 cm, TVL (no hysterectomy) 9.8 cm ± 1.3 cm, and TVL (hysterectomy) 8.9 cm ± 1.5 cm.

Clinical evaluation of cervical elongation

A study evaluating 39 consecutive patients who had a preoperative POP-Q and a pathology report that documented the cervical length was performed. The comparison was between estimated cervical length (eCL) on the preoperative POP-Q (by subtracting point D from point C) to the actual cervical length (aCL) reported in the pathology report. The authors found a statistically significant difference between the eCL (mean 5.6 ± 2.91 cm) and the mean aCL (3.2 cm ± 0.99; p<0.0001). However, there was not a statistically significant difference between the eCL and aCL in patients whose prolapse was proximal to the hymen (3.5 ± 2.21 cm vs 3.1 ± 1.06 cm; p = 0.475). The authors concluded that the cervical length measured using POP-Q may not be accurate at more advanced stages of prolapse [39].

Apical descent in the office compared with evaluation in the operating room

One study compared the assessment of apical prolapse in the office and assessment in the operating room [40]. The office assessment was conducted using a standard POP-Q examination with measurement at straining. The intraoperative assessment was performed by placing traction on the cervix with a tenaculum. The mean difference in the C point between the two clinical settings was 3.5 cm with a difference of ≥5 cm in 33% of subjects. Of note, the mean difference was larger for women with lesser stages of prolapse: 5.8 cm at stage 1, 3.0 cm at stage 2, and 1.4cm at stages 3/4 (p<0.001). A difference of ≥5 cm in point C with cervical traction was more commonly noted with lower stages of prolapse; it was noted in 70.3% of women with stage 1 versus only 9.3% of women with stage 2, and 8.5% in women with stage 3 (p<0.001).

Association of posterior and anterior prolapse with apical support

Two studies evaluated the association of anterior and posterior compartment prolapse with apical support. In the first study the authors found that the mean point C location was −6.9 ± 1.5 (mean ± standard deviation) in control patients without POP. In patients with posterior prolapse point C was −4.7 ± 2.7 cm. In patients with anterior prolapse point C was −1.2 ± 4.1 cm, p values were <0.001 for all comparisons [41]. The authors concluded that posterior-predominant prolapse involved threefold less apical descent than in patients with anterior-predominant prolapse. In the second study the authors analyzed 196 subjects and performed a standard POP-Q examination, and then assessed anterior and posterior prolapse in each subject before and following support of the apex using the posterior half of a Graves speculum [42]. Their POP-Q stages before apical support were stage 2 prolapse in 36% of patients, stage 3 in 55%, and stage 4 in 10%. With simulated apical support from the Graves speculum, point Ba changed to stage 0 or 1 in 55% and Bp changed to stage 0 or 1 in 30% (p<0.001 for both). The mean change in Ba with apical support was 3.5 ± 2.6 cm and for point Bp the mean change was 1.9 ± 2.9 cm (p<0.001). These findings suggest a greater relationship between the anterior vaginal wall and apical prolapse.

Summary of clinical examination of a woman with POP

The clinical evaluation of a patient suspected of having POP by presenting symptoms should start with a thorough pelvic examination. The POP-Q system is the most studied POP classification system for describing and quantifying POP. It should be used clinically in settings where clinicians have extensive experience and comfort in its use. In clinicians with extensive experience, POP-Q values can often be reliably and adequately obtained by “eyeballing.” The POP-Q should be used in all research settings. In settings that do not have extensive experience with the POP-Q, or in settings that find it cumbersome to use, substituting the S-POP is acceptable as a means of describing and quantifying POP. The use of other systems currently in the literature should be discouraged unless more literature is published demonstrating their utility.

To optimally perform a physical examination on a patient with suspected POP several parameters should be met:

  1. 1.

    The subject should have an empty bladder (and empty rectum, if possible.

  2. 2.

    If the subject cannot confirm the extent of their POP by examination in the supine or left lateral position, the examination should be repeated in a more upright or standing position.

  3. 3.

    The time of day of the examination is not important in most cases.

  4. 4.

    The examination should be done during straining or coughing.

  5. 5.

    Cervical traction or examination under the effects of a neuromuscular blockade may overstate the degree of apical POP and should not be relied upon.

Other parameters of a thorough pelvic examination and imaging for pelvic anatomy are less well investigated but may provide some clinical assistance in planning therapy.

  1. 1.

    Noting the dimensions of the GH or vaginal introitus plays a role in the evaluation of a patient with POP. A large GH as documented by a POP-Q examination ≥ 3.75 cm is associated with greater degrees of POP. Understanding what information this provides to the clinician in staging and quantifying POP is less clear and requires more study. Of note, recording the size of the GH is part of the POP-Q but not the S-POP.

  2. 2.

    A greater pelvic floor muscle contraction strength has been associated with less severe POP by both POP-Q examination and various POP symptom scores. In addition, patients with POP appear to have some degree of neurological deficit in other pelvic structures. Therefore, evaluating and recording pelvic floor muscle contraction strength and the presence or absence of neurological deficits, although encouraged, does not currently play a recognized role in the evaluation or quantification of POP.

  3. 3.

    Evaluation of the spine in patients with POP may lead to better understanding of the epidemiology and pathophysiology of POP but does not play any specific role in the evaluation of patients with POP.

  4. 4.

    Clinical examination to identify and characterize site-specific defect of the anterior vaginal wall prolapse has not been studied enough to draw robust conclusions. However, although reporting these clinical findings may aid the individual surgeon in preoperative planning, is too nonspecific for widespread adoption into current clinical grading schemes.

  5. 5.

    Evaluation of posterior vaginal wall prolapse can be complemented by a rectovaginal examination as there is evidence that it can help to distinguish between true rectoceles and enteroceles. There is poor correlation between posterior vaginal prolapse by clinical examination and GI dysfunction.

  6. 6.

    Evaluation and grading of apical (cervical/vaginal vault) POP is complex and currently there is very little information from which to draw clinically relevant information. It appears that in normal subjects the cervix (POP-Q point C) is 4.5 to 7.5 cm above the hymenal remnants, the posterior vaginal fornix (POP-Q point D) is 7 to 10 cm above the hymenal remnants, and in hysterectomized patients the vaginal cuff (POP-Q point C in hysterectomized patients) is 6 to 8.5 cm above the hymen. The TVL in patients with a uterus is 8.5 to 11 cm and in hysterectomized patients it is 7.5 to 10.5 cm. The determination of a cut-off point beyond which apical values represent true POP or clinical symptomatic disease is unknown although any compartment prolapse at or beyond the hymen is more likely to be symptomatic.

  7. 7.

    Repeating a POP-Q examination under anesthesia often overestimates apical prolapse and although useful for surgical planning, currently should not be recommended. It is not known whether there is a long-term prognostic value for this apical assessment.

  8. 8.

    Using a tenaculum to provide traction on the cervix in the clinical setting can overestimate uterine prolapse, is deemed uncomfortable by patients, and therefore should be discouraged.

Further research

  1. 1.

    Future research needs to determine the predictive value of a large GH as a sign of impending POP that may require prophylactic therapeutic measures. Further, is a large GH a risk factor for POP or a side effect of having the vaginal bulge protruding through and physically dilating the vaginal opening?

  2. 2.

    Future research on what represents true uterine or vaginal vault prolapse is critical. There are some data on the normal range of values for POP-Q points C and D. However, what is not known is if patients with POP-Q point C and D values below these ranges but still above the hymenal remnants have a type of POP that requires therapeutic measures, particularly if that patient is undergoing surgery to correct anterior or posterior vaginal wall prolapse.

  3. 3.

    If a paravaginal defect is detected what is the role of anterior vaginal repair? To what degree does a paravaginal defect contribute to anterior vaginal wall recurrence?

  4. 4.

    Further study on how physical examination under the effects of neuromuscular blockade (anesthesia) affects future outcomes. For example, if a subject has significant cervical or apical POP identified in the operating room that was not noted during clinical physical examination, are they at a greater risk of future apical POP, particularly if nothing is done to address this apparent apical defect at the time of surgery for another form of POP?

  5. 5.

    Future research should better define the role of weak pelvic floor muscle tone or contraction strength as a predictor of the subsequent development of POP. A complete discussion of the role of pelvic floor muscle strength training and its role in treating POP will be included under another report in the IUC that has been published as part of this series entitled “International Urogynecology Consultation chapter 3 committee 2; conservative treatment of patients with pelvic organ prolapse: pelvic floor muscle training” [43].

Assessment of lower urinary tract function in women with POP

A review of the existing literature on the assessment of lower urinary tract function in women with POP was performed. The initial search identified 2,711 titles and abstracts, of which 63 studies were included in the final review of this section (Fig. 2).

This section is presented in three sub-sections: the assessment of voiding dysfunction, assessment of detrusor overactivity (DO), and assessment of stress urinary incontinence (SUI).

Assessment of voiding dysfunction in women with POP

The prevalence of voiding dysfunction in women with prolapse varies depending on the definition but ranges from 6 to 60%. Assessment of voiding difficulty in women with prolapse was addressed in 11 papers. Six papers had voiding difficulty as the focus [44,45,46,47,48,49] , 4 papers addressed voiding difficulty as part of LUTS assessment [50,51,52,53], and 1 paper addressed the accuracy of ultrasound in measuring bladder volume [54]. Six themes were identified in these studies.

Post-void residual urine volume

Post-void residual urine volume (PVR) was the most utilized measure to define voiding dysfunction in the studies reviewed; however, there was no agreement on the cut-off value at which retention was diagnosed ranging from 50 to 200 ml, as shown in Table 5.

Table 5 Measures for the assessment of voiding difficulty

To find a cut-off value for PVR that could predict postoperative voiding trial results more accurately than a predetermined value of 100 ml, one study used a receiver operating curve, but no PVR value was better than 100 ml (the predetermined value used in the study) [49].

The accuracy of translabial ultrasound scan formulae used for PVR measurement in patients with prolapse was examined in one paper [54]. It found that the results obtained by the three published formulae correlated with the catheter-measured PVR.

Urine flow studies

These included free-flow studies (non-instrumented flow studies) and pressure-flow studies (instrumented urodynamic flow studies). Different measurements were used to define voiding dysfunction, as shown in Table 5.

One study [46] examined the correlation between free-flow and pressure-flow studies. It concluded that the peak and average flow rates in women with POP are dependent on voided volume and the correlation between free-flow and pressure-flow studies decreases as the prolapse stage increases.

Detrusor contractility measures

The concept of detrusor underactivity was addressed in two papers [45, 50] to predict the potential course of postoperative voiding difficulty. The Bladder Contractility Index (BCI), as defined by Abrams [55], was used in one paper [45]. BCI <100 was associated with higher PVR and a more severe stage of prolapse, but it failed to predict postoperative resolution of voiding difficulty. The second study [50] used a six-class grading of detrusor contractility based on Schafer’s nomograms [56]. They reported women with weak detrusor contractility having increased PVR in the immediate postoperative period, with resolution after 1 month.

Bladder trabeculation on cystoscopy

One study addressed the cystoscopic finding of trabeculation in women with POP. Trabeculations were scored from 0 to 4, representing increasing severity from none, slight, moderate, severe, and severe with diverticula. They reported significantly higher prevalence of symptoms of voiding difficulty and increased PVR (>100 ml) in women with any degree of trabeculations compared with women with no trabeculations [53].

Prolapse reduction in assessing voiding dysfunction

Prolapse reduction using a pessary or gauze pack was used to assess the impact of prolapse on voiding difficulty in three papers [47, 48, 52]. One study used pessary reduction of prolapse to predict postoperative resolution of voiding difficulties [47]. Authors reported that the resolution of voiding difficulty with pessary reduction of prolapse has 89% sensitivity and 80% specificity in predicting post-repair resolution [47]. In another study, pessary reduction of prolapse was used routinely in all patients while performing urodynamics [48] to assess voiding dysfunction and occult SUI. This resulted in the diagnosis of voiding dysfunction defined as post-void residual of >50 ml or 20% of voided volume in 27%, which reduced to 10% postoperatively. The authors did not test the value of pessary in predicting postoperative voiding dysfunction. The third study used vaginal packing for prolapse reduction and found that PVR decreased significantly after vaginal packing [52].

Risk factors for postoperative voiding dysfunction

Five studies looked at the assessment of potential risk factors to predict postoperative persistence of voiding dysfunction [45, 47,48,49,50]. In two studies, no potential risk factors were found [45, 50]. Three papers reported various potential risk factors to include history of diabetes, PVR >200 ml and detrusor pressure at maximum flow (Pdet Max) <10 cm H2O, all of which were found to have some impact on postoperative voiding dysfunction [48]. Persistence of voiding difficulty after pessary reduction of prolapse was associated with a 67% chance of persistent postoperative voiding difficulty [47]. Patient age was reported as the only risk factor for postoperative elevated PVR [49].

Assessment for detrusor overactivity in patients with POP

The effect of POP on detrusor overactivity (DO) was addressed in ten papers [50,51,52,53, 57,58,59,60,61,62]. Table 6 demonstrates the measures used to assess DO, the aim of assessment, and the use of prolapse reduction.

Table 6 Studies addressing detrusor overactivity (DO) in patients with pelvic organ prolapse (POP)

Assessment methods for DO

Urodynamic assessment [50,51,52,53,54,55] trabeculation on cystoscopy [53] and artificial neural network analysis of clinical assessment [62] were used to assess for DO. However, even when other methods of assessment of DO were used, urodynamic assessment was carried out as the gold standard for comparison, despite no evidence that urodynamics are the gold standard.

The importance of urodynamic studies in the assessment of DO in patients with POP

Five studies were designed to evaluate the role of preoperative urodynamic assessment of DO (uninhibited detrusor contractions on a cystometrogram) in women with POP. Two studies examined the impact of urodynamic assessment on changing patient management [58, 59]. Two other studies examined the role of urodynamic assessment in predicting postoperative DO [50, 61] whereas the last study focused on the role of urodynamic assessment in diagnosing asymptomatic DO [51]. Not surprisingly, they came to different conclusions regarding the importance of preoperative urodynamic assessment in women with POP and two of the three found no benefit of urodynamic assessment in the preoperative evaluation of patients with POP.

Predicting post-repair overactive bladder

Three papers considered the preoperative risk factors for persistent or de novo overactive bladder (OAB; symptom of urinary frequency and urgency with or without the complaint of urgency incontinence) following surgical repair. Two studies used symptoms to assess for postoperative OAB [50, 57], one used urodynamic assessment post-operatively to assess for DO [60]. Pre-operative DO was not predictive of post-repair OAB or DO; however, one study found that preoperative OAB symptoms are more likely to resolve in the absence of preoperative DO [50].

Summary: assessment of voiding dysfunction in women with POP

Voiding dysfunction in patients with POP is common but evaluation techniques provide limited information.

  1. 1.

    The post-void residual volume estimation is commonly used for assessment of voiding dysfunction. The most commonly used value for diagnosing an elevated post-void residual is 100 ml by catheter or ultrasound.

  2. 2.

    Severity of POP is associated with reduced maximum and average flow rate, and voiding dysfunction is associated with the cystoscopic finding of trabeculation; however, there is no demonstrated benefit for using any of these methods in the routine assessment of the patient with POP.

  3. 3.

    Reduction of POP by pessary or packing often resolves voiding dysfunction and if this is noted on evaluation, it has a high predictive value for resolution of voiding difficulty after surgical POP repair.

  4. 4.

    Postoperative persistence of voiding dysfunction was found to be associated with diabetes, age, PVR >200 ml, P det max <10 and failure of a pessary to resolve voiding difficulty.

  5. 5.

    Preoperative urodynamic assessment was the most commonly utilized diagnostic tool for DO. Preoperative urodynamic diagnosis of DO did not change management, but the absence of preoperative urodynamic DO suggests that symptoms of OAB are more likely to resolve after surgery.

Further research

  1. 1.

    Further research is needed in the development of predictive models for persistence of voiding difficulty or DO postoperatively to aid in patient counseling.

  2. 2.

    Understanding how varying degrees of POP and how prolapse of different compartments affects voiding is poorly understood and needs further research.

  3. 3.

    Further study to assess the effect of voiding dysfunction on the patient both from a symptomatic and a morbidity perspective (recurrent UTIs, upper urinary tract disease) is not currently well understood

Assessment for SUI in women with POP

A substantial proportion of women presenting with POP report SUI symptoms. Preoperative SUI can either resolve or persist after POP surgery. Furthermore, a significant proportion of preoperatively continent women develop de novo SUI after POP surgery. SUI was addressed, either exclusively or as part of LUTS assessment, in 47 papers. Three main themes were identified: assessment of preoperative SUI, prediction of postoperative SUI, and prediction of de novo SUI.

Assessment of preoperative SUI

  1. 1.

    Stress test: the significance of patient position and prolapse reduction during the cough stress test was demonstrated in a study performed on 297 women waitlisted for POP surgery, with a third of them reporting SUI symptoms. Five different cough stress tests were performed with a subjectively full bladder (standing, semi-lithotomy, with and without reduction, reduction with speculum, and pessary). The test with the fewest positive results (34%) was the one performed without POP reduction in a semi-lithotomy position; the test with the most positive results (80%) was the one performed with pessary reduction in a semi-lithotomy position. With the full battery of tests, 93% of women with SUI symptoms demonstrated leakage; only 50% demonstrated leakage without reduction. Eighty-nine percent of the women with a positive stress test were diagnosed when performing at least two of the three tests with prolapse reduction, and 98% were diagnosed when performing all three tests with prolapse reduction (speculum and pessary reduction in the semi-lithotomy position, pessary reduction in the standing position). The authors also emphasized the importance of adequate bladder volume (200 ml) [63]. The findings were not compared with postoperative outcomes.

  2. 2.

    Q-tip angle: one study concluded that the Q-tip test is affected by POP. The angles were smaller with the prolapse reduced and with a full bladder [64]. A substantial correlation (r=0.68) between POP-Q point Aa and Q-tip angle was noted in a study on women presenting predominantly with SUI and anterior wall prolapse [65].

  3. 3.

    Importance of urodynamic studies in the assessment of preoperative SUI: one study concluded that a computer-based model including preoperative symptoms and patient’s baseline characteristics cannot predict preoperative urodynamic diagnosis and, as a consequence, cannot replace a preoperative urodynamic study [62]. In another retrospective study, preoperative urodynamic testing in patients with POP changed the management or counseling in only 3% (11 out of 316) in a cohort of women, with the indication for the study being OAB symptoms, mixed, or insensible urinary incontinence, or voiding difficulty (i.e., not occult SUI evaluation only). Major management alterations occurred mostly in women with SUI and concurrent voiding difficulty. The authors inferred that it might be in these patients that preoperative urodynamic study has its greatest value [58]. These two studies did not correlate the preoperative examination with postoperative outcomes.

Prediction of postoperative SUI

Postoperative SUI can represent persistent or de novo SUI. In this section, some studies approached postoperative SUI as persistent SUI [66] specifically, whereas some studies included women with any preoperative continence status and their results on postoperative SUI include both persistent and de novo SUI. De novo SUI specifically is addressed separately in the following section.

  1. 1.

    Predictive value of preoperative stress test: five studies provided data to calculate the predictive value of a negative stress test during preoperative urodynamic study for postoperative SUI in an unselected POP population (i.e., any preoperative continence status) [67,68,69,70]. All studies included stress tests with prolapse reduction. The negative predictive value ranged between 45 and 90% (median 78%; Table 7).

    Table 7 Predictive value of a negative preoperative stress test for postoperative stress urinary incontinence after pelvic organ prolapse surgery
  1. 2.

    Other predictors for postoperative SUI:

    Three studies looked at other predictors of postoperative SUI. One study included only women with preoperative urodynamic SUI and the predictive urodynamic parameters for persisting urodynamic stress incontinence were overt (versus occult) urodynamic SUI, below normal maximum urethral closure pressure (MUCP, defined by the authors as <60 mmHg), and functional urethral length (FUL) < 2 cm [71].

    Two further studies included all women, regardless of preoperative incontinence status. The only two urodynamic parameters predictive of postoperative SUI in the one study were preoperative urodynamic stress incontinence and low P det Max [72]. In the other study, none of the investigated urodynamic parameters was associated with postoperative SUI [61].

Prediction model for postoperative SUI

A model developed to predict postoperative SUI for women regardless of preoperative continence status considers subjective urinary incontinence symptoms, stress test with and without prolapse reduction, age, point Ba, vaginal parity, and insertion of a mid-urethral sling during surgery [73]. The strongest predictor for postoperative SUI was preoperative SUI. The model’s ability to discriminate women at low or high risk for bothersome postoperative SUI or treatment for SUI during the first postoperative year was at a “useful level” (defined as area under the curve 0.76; interpretation: 0.5 not better than chance—1 perfect discrimination). However, the study does not report the extent to which the model correctly estimates the absolute risk (i.e., calibration), making it difficult to use it in counseling patients regarding operative options. Furthermore, our search did not identify any external validation studies for the model.

Prediction of postoperative SUI (occult SUI)

Occult SUI is defined as urine loss observed during a cough stress test with the POP reduced in a patient with POP who reports no urinary incontinence [74]. It is used as a preoperative test with the intention to identify women at risk of developing de novo SUI after prolapse surgery. Table 8 summarizes the studies that address the predictive value of occult SUI for de novo SUI.

Table 8 Diagnostic accuracy of occult stress urinary incontinence for de novo stress urinary incontinence after pelvic organ prolapse surgery without concomitant anti-incontinence surgery

Twenty-five studies provided either the diagnostic accuracy measures or data enabling the calculation for positive and/or negative test [50, 67, 75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97]. Baseline continence status, diagnostic criteria for occult SUI, methods to reduce the prolapse, surgical procedures, and the definition of de novo SUI varied widely among the studies, making the comparison challenging. Most studies defined occult SUI clearly as SUI demonstrated only during prolapse reduction, whereas some also included demonstrable urodynamic SUI without prolapse reduction in symptomatically continent women. The diagnostic accuracy of occult SUI differed greatly, likely because of the heterogeneity in the studies. The medians (and ranges) for sensitivity were 39% (5–100), for specificity they were 86% (57–97), for positive predictive value they were 40% (0–79), and for negative predictive value they were 91% (51–100).

Importance of urodynamic studies for diagnosis of occult SUI

One study reported similar occult SUI rates with stress testing during physical examination and urodynamic studies. In 76%, occult SUI was identified with both tests, in 11% with urodynamic studies only, and in 13% during physical examination only (kappa 0.648). They did not correlate the findings with postoperative de novo SUI rates [98].

Another study compared the predictive value of demonstrable SUI during basic office evaluation versus urodynamic study for de novo SUI. Stress tests were performed in the lithotomy position with (swab on forceps) and without reduction of the prolapse. During basic office evaluation women were examined with a subjectively full bladder and during urodynamic studies with 300-ml bladder filling and at maximal bladder capacity. More women demonstrated SUI during urodynamic study, but the diagnostic accuracy for bothersome de novo SUI or treatment for de novo SUI was not improved by the addition of the urodynamic study [94].

Other predictors of de novo SUI

Two studies were aimed at identifying other risk factors for de novo SUI. Urodynamic markers that were associated with de novo SUI were low MUCP [99], low FUL [99], and bladder outlet obstruction [100].

Two studies demonstrated that occult SUI is also seen in posterior wall prolapse [101, 102] up to the same extent as with anterior wall prolapse [101].

Prediction model for de novo SUI

A model and risk calculator developed to predict de novo SUI among women without preoperative SUI symptoms contains seven predictors: age, number of previous vaginal births, urine leakage associated with urgency, history of diabetes, BMI, preoperative reduction stress test result, and placement of a midurethral sling during surgery. The model predicted absolute risk accurately, with slight tendencies to overestimate the risk when the probability reached 50% or greater. The concordance index (interpretation: 0.5, not better than chance to 1, perfect discrimination) was 0.73 in the original study [103], and it outperformed both expert opinion and preoperative stress testing in discriminating between women who developed de novo SUI during 12 months follow-up and not. However, when the model was applied to other samples (external validation), the results for the concordance index or area under the curve decreased to 0.62, 0.63, and 0.69 [103,104,105]. One study assessed the model’s performance as a diagnostic test using a probability of de novo SUI of >50% as a cut-off for a positive test. Using this cut-off, a positive test had a predictive value of 27% (i.e., 27% of women with an estimated risk of 0.5 or higher actually developed SUI). This illustrates how the model overestimates the risk when the baseline risk is lower than in the original sample [105].

Summary: assessment of SUI in patients with POP

The evaluation of SUI in patients with POP is very complex and recommendations vary widely.

  1. 1.

    In women with POP and SUI, the cough stress test should be performed with at least 200-ml bladder volume and with the prolapse reduced either with a speculum or pessary in order to have the highest chance of identifying a positive result.

  2. 2.

    Assessment of UDS in women prior to POP surgery has been shown to change management in a small percentage of cases, for example, when SUI (clinical or occult) coexists with voiding dysfunction. The management may change by the avoidance of a concomitant continence procedure or the choice of one with a perceived lower risk of associated voiding dysfunction.

  3. 3.

    There are no comparative data on different diagnostic alternatives correlating with postoperative outcomes as studies such as VALUE [106] and VUSIS [107] excluded women with prolapse beyond the hymen.

  4. 4.

    In an unselected POP population, a negative reduction stress test during preoperative urodynamic assessment has a median negative predictive value of 78% (range 45–90%) for postoperative SUI. There is conflicting evidence regarding the predictive value of further urodynamic parameters such as MUCP and FUL.

  5. 5.

    More preoperatively continent women will demonstrate occult SUI during a urodynamic assessment compared with office evaluation stress test but this does not have greater accuracy for bothersome de novo SUI or treatment for de novo SUI. The demonstration of preoperative occult SUI during urodynamic assessment has a positive predictive value for de novo SUI of 40% (0–79%) and its absence has a negative predictive value of 91% (51–100%) respectively.

  6. 6.

    A de novo SUI prediction model that incorporates seven variables and outperforms pure chance, expert opinion, and reduction cough stress test alone. However, in follow-up studies the model performed poorly, overestimating the risk when compared with the original study.

To sum up, the most useful information from the evaluation of a patient with POP with regard to postoperative stress incontinence is the high negative predictive value of a negative stress reduction test.

Further research

  1. 1.

    Future research should look to improve the performance of current prediction testing, and develop new predictive parameters. These could probably be identified by deepening our understanding of the biological and biomechanical explanations behind de novo and persistent SUI.

  2. 2.

    The prognostic value of MUCP and FUL should be re-assessed in further studies.

  3. 3.

    Persistent and de novo SUI probably have different prognostic factors, thus developing separate models may be feasible and increase accuracy.

  4. 4.

    Researchers should follow The Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis statement when presenting a new or validating an existing prediction model [108].

Evaluation of hydronephrosis and hematuria in patients with POP

There were two studies that discussed the prevalence of hydronephrosis and hematuria in women with POP. The study on hydronephrosis evaluated 180 patients and found some degree of hydronephrosis in 30%. A multivariate statistical analysis revealed only the two following factors associated with hydronephrosis. First, anterior compartment prolapse, as defined by POP-Q point Ba; noting that for every 1-cm increase, the relative risk of hydronephrosis increases by 1.68. Second, cystometric capacity; it was found that every 100-ml increase in maximum cystometric capacity increases the relative risk of hydronephrosis by 1.5. However, the model only predicted about 30% of the hydronephrosis [109].

The study evaluating microscopic hematuria (defined as ≥ red blood cells per high power field) noted its presence in 20.1% in a population of 1,040 women. This population is at a very low risk of urinary tract malignancy and the authors suggested that the cut-off for significant microscopic hematuria in this population should be re-evaluated [110].

To summarize: the severity of anterior vaginal wall prolapse and cystometric capacity are associated with hydronephrosis in a limited number of studies; prediction models are not well developed.

Assessment of gastrointestinal tract symptoms in women with POP

A review of the existing literature on the assessment of GIT symptoms in women with POP identified 2,251 titles and abstracts, of which 17 studies were included in the final review of this section (Fig. 3). Studies were included whose primary population or a significant portion of the study population were women with POP, who then underwent evaluation of the GIT other than or in addition to symptom assessment and clinical examination (Table 9).

Fig. 3
figure 3

Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagram for gastrointestinal radiographic/physiological testing

Table 9 Evidence table for the evaluation of prolapse in women with symptoms of obstructed defecation and anal incontinence

Defecography

Several studies compared various defecography imaging modalities with each other [112, 118, 124, 125]. Difficulties in evaluation of the existing literature included the use of various methods for the assessment of prolapse on physical examination, including the Baden–Walker halfway system, the POP-Q system, and several manuscript-specific nonstandardized examination techniques. In addition, various methods of performing the imaging and interpretation of results were described. In studies of fluoroscopic defecography, there was variability in which compartments were opacified with contrast; although the rectum was universally opacified, other possible compartments included the bladder, vagina, perineum, peritoneum, and small bowel.

Three studies of fluoroscopic defecography found that this imaging modality detected more enteroceles than physical examination [111, 113, 117]. Two studies found that MRI defecography was able to diagnose enteroceles more readily than physical examination, and one of these found that MRI defecography was also able to diagnose more enteroceles than fluoroscopic defecography [122, 125]. Two studies found that sigmoidoceles were not diagnosed on examination but were identified by fluoroscopic defecography [112, 117]. One study found that the size of the posterior vaginal wall prolapse, as assessed by physical examination, was associated with the finding of enterocele and/or rectal intussusception on fluoroscopic defecography [114].

Patient symptoms were assessed in two studies that found that defecatory symptoms were not significantly associated with findings on radiographic imaging or examination [115, 116]. One study found no relationship between defecatory symptoms in women with posterior vaginal wall prolapse and abnormal defecography. The other found no relationship between defecatory symptoms and posterior vaginal wall prolapse on examination or rectocele or enterocele on defecography [115, 116]. One study found that two thirds of women with a rectocele and symptoms of obstructed defecation or anal incontinence had intussusception (13.5% Oxford Grade I, 41% Grade II, and 13.5% Grade III) on MR defecography and were more likely to have an enterocele [119].

Anal physiological testing and anal ultrasound versus physical examination

Anal physiology and anorectal endosonography testing added limited information to the routine physical examination evaluation of POP patients for identifying intussusception [126, 127].

Patients with fecal incontinence may benefit from this testing. In terms of the clinical consequences of the imaging investigation, two studies found that the imaging results led to a change in surgical plan for 22–41% of patients [112, 117].

Definitions/interpretation of radiographic imaging studies

Consensus on definitions and interpretations of fluoroscopic defecography and MRI defecography have been developed by multiple stakeholder societies including the IUGA [128, 129]. Although these documents represent consensus on the use of these imaging modalities in patients with defecatory disorders, they “do not” contain information pertinent to patients with pelvic organ prolapse regarding specific methods and measurements. There is no consensus on whether or not patients with prolapse and no GI symptoms should undergo any testing beyond a thorough physical examination. It has been agreed upon that imaging should include measurements performed during the defecation phase rather than only with strain to improve sensitivity [123, 128, 129]. Studies in which there was no defecography phase have limited applicability.

Summary: assessment of GIT symptoms in women with POP

Summary of supplemental evaluation for GI dysfunction in women with POP is an area requiring a significant amount of research before any concrete recommendation can be made.

  1. 1.

    There were no studies that reported on patient outcomes in those evaluated by fluoroscopic defecography, MRI defecography, or anal physiology testing, and those who did not undergo this evaluation. Therefore, the clinical significance of this testing, particularly in asymptomatic patients, remains uncertain. It does seem that some anatomical defects, including enterocele, sigmoidocele, and intussusception, are better visualized with either fluoroscopic defecography or MRI defecography, but how this relates to clinical decision-making or more specifically outcomes, remains unclear.

  2. 2.

    In patients where these diagnoses are in question or in patients who present with GI symptoms, it is reasonable to obtain further imaging and testing beyond a routine clinical examination. However, these additional studies can be expensive and uncomfortable to patients, and currently there is no apparent benefit to identifying an underlying condition that would influence treatment decisions and outcomes. Until a benefit is established, their routine use in asymptomatic women with POP should be discouraged outside of research protocols.

Further research

Future studies comparing imaging and physiological testing with clinical examination need to compare their results with standardized clinical evaluation in the form of the POP-Q. Standardized minimum criteria for imaging and physiological testing need to be established, as well as a standardized reporting system to allow for comparison between studies. Until these are drawn up it will remain almost impossible to evaluate the literature.

Studies in patients with POP and no GI complaints comparing radiographic/physiological testing with no testing need to be evaluated with meaningful outcome measures.