Introduction

A correct, voluntary pelvic floor muscle (PFM) contraction has been described as an elevation and a squeeze around the pelvic openings [1]. The levator ani muscle is primarily responsible for this function. Furthermore, at rest, the levator ani muscle maintains constant tone to keep the urogenital hiatus closed [2]. Vaginal resting pressure, PFM strength, and muscular endurance may be assessed in several ways: visual observation and palpation, electromyography, vaginal pressure measurements (manometry), dynamometry, and imaging, such as magnetic resonance imaging techniques and ultrasound [3, 4].

In physiotherapy practice, the manometer is the most common method of assessing PFM function [4]. In general, the manometer has been established as a reliable assessment method for PFM strength [5,6,7]. However, comparing results across studies using different devices is not possible [8]. To date, randomized controlled trials (RCTs) have demonstrated significant improvements in PFM strength and endurance after pelvic floor muscle training (PFMT), measured with a manometer (Camtech AS, Sandvika, Norway) [9,10,11]. Observational studies have also shown statistically significant differences in vaginal resting pressure between symptomatic and asymptomatic women [12, 13]. Although good reliability and validity for PFM strength have been established [14, 15], the reliability and agreement of vaginal resting pressure and muscular endurance have not been assessed using the Camtech AS manometer, nor have interrater reliability and agreement for PFM strength.

The International Continence Society Clinical Assessment Group recommended studies on intra- and interrater variability for PFM function, voluntary contraction, and relaxation [3]. Thus, the aims of the present study were to assess intra- and interrater reliability and agreement of vaginal resting pressure, PFM strength, and muscular endurance.

Materials and methods

Subjects and design

This study on intra- and interrater reliability and agreement was performed at a physiotherapy center in Sandvika, Norway, from March 2015 to April 2015. A convenience sample of 23 women was recruited to evaluate the intra- and interrater reliability and agreement of PFM function measured by manometer (Camtech AS). The sample size of 23 women was based on previous reliability studies in the field [5,6,7, 14]. The women received information through leaflets available in the reception area at the center or they were encouraged to participate in the study during general group fitness classes. The inclusion criterion was the ability to contract the PFM correctly, defined as an inward movement and squeeze around the pelvic openings assessed by observation and palpation [1, 15]. No grading of PFM strength was done for inclusion purposes. The exclusion criterion was the inability to understand instructions given in any of the Scandinavian languages. To avoid a possible effect of training/detraining, the participants were asked not to change PFMT habits between testing days. To maintain anonymity, the participants received a unique ID number, which was the only link between the examination and the woman. The Regional Medical Ethics Committee (2014/1768) approved the study and the Data Protection Officer at Akershus University Hospital (15–018) was informed about the study. All participants gave written informed consent to participate. The study procedures were in accordance with the World Medical Association Declaration of Helsinki (2013) [16]. The applied terminology follows recommendations from the clinical assessment group of the International Continence Society, except where specifically noted [3]. The Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were followed [17].

Procedure and apparatus

Two women’s health physiotherapists were involved in the study. Participants were tested twice on the same day by two independent physiotherapists. The order in which they were examined was random. One week later one physiotherapist, MKT, re-tested the same group of women at the same time-point as test 1. The physiotherapists were blinded to each other’s results and the results from test 1 were unavailable during test 2. Both physiotherapists had thorough training before conducting the study. The training included: positioning of the participant and assessor, verbal instructions, catheter placement, recording of measurements, and analysis. All participants answered a short questionnaire before the examination including: age, level of education, if they undertook strenuous physical work, weight, height, parity, and any symptoms from the pelvic floor (urinary and anal incontinence, pelvic organ prolapse [vaginal bulging, pelvic pressure], pelvic floor pain, other).

The procedures were performed on a flat bench with a small pillow underneath the head. The physiotherapist sat at the examination table next to the woman, supporting one leg, while the woman's other leg rested against a pillow placed against the wall. The participating women were given a short anatomy lecture and were taught how to perform a correct PFM contraction using observation and palpation [1, 15] before measurement was taken using the instructions: “breathe slowly in and out”; “you are ready”; “lift and squeeze your pelvic floor; lift as hard as you can”; “let go and breathe out slowly”. The sequence for muscle testing was as follows: three repetitions of maximum voluntary contractions lasting approximately 3 s each with an approximately 6-s rest in between. Less than 1-min rest was allowed before muscular endurance was tested. PFM contraction without any movement of the pelvis or visible contraction of the glutei, hip or abdominal muscles was emphasized [14, 15].

Vaginal resting pressure, PFM strength, and muscular endurance were measured using a high-precision pressure transducer connected to a vaginal balloon (Camtech AS; Fig. 1). After compressing the balloon 10–20% to allow for air expansion at body temperature, the balloon catheter was connected to the fiber tip and calibrated in air. A lubricating gel was applied to the balloon catheter. The device was positioned with the middle of the balloon 3.5 cm internal to the introitus in the vaginal high pressure zone [18], a method found to be reliable and valid for the assessment of PFM strength, with simultaneous observation of an inward movement of the catheter and no use of extra-pelvic muscle contraction [14, 15]. To control placement and movement of the balloon, the physiotherapist held the catheter with the thumb and index finger before and during every measurement. The physiotherapist followed the movement of the catheter during contraction. The atmospheric pressure on the balloon was calibrated to 0 cmH2O for each woman before it was placed in the vagina. Vaginal resting pressure was measured as the difference between the atmospheric pressure and the vaginal high pressure zone at rest, with no voluntary PFM activity, and was registered as cmH2O. The measurement was taken before the first contraction and registered as a flat curve after the woman was instructed to relax and given time to slowly breathe in and out. PFM strength was measured from the resting pressure line until the peak, not including the resting pressure, reported as the mean of three maximum voluntary contractions, and registered as cmH2O. Local muscular endurance is the ability of a muscle to sustain near maximal or maximal force, assessed by the time a person is able to maintain a maximal static or isometric contraction [19], and was quantified as the area under the curve for 10 s, measured during one attempt and registered as cmH2O·s (Fig. 2). The area under the curve captures the force applied over a specific time (10 s). To ensure maximal tension in the muscle, it is commonly recommended that the contraction is held for more than 6 s [20]. Local muscular endurance may also be defined as the ability to repeatedly develop near maximal or maximal force determined by the number of repetitions [19], but using time in seconds alone will not give details on the exact force.
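The three outcome measures described above can be illustrated with a minimal sketch. It is not the analysis software of the device; the function names, the sampling interval `dt`, and the sample traces are hypothetical, and it assumes the endurance area is taken above the resting pressure line, which the text does not state explicitly (pass `rest=0.0` for the total area).

```python
from statistics import mean


def resting_pressure(relaxed_trace):
    """Mean pressure (cmH2O) over a flat, relaxed segment of the curve."""
    return mean(relaxed_trace)


def pfm_strength(contractions, rest):
    """Mean peak pressure above the resting line across contractions (cmH2O)."""
    return mean(max(trace) - rest for trace in contractions)


def endurance_auc(hold_trace, dt, rest=0.0):
    """Trapezoidal area under the pressure curve above `rest` (cmH2O x s).

    `dt` is the time between samples in seconds (an assumed, regular interval).
    """
    heights = [p - rest for p in hold_trace]
    return sum((a + b) / 2 * dt for a, b in zip(heights, heights[1:]))


# Hypothetical traces, for illustration only (not study data).
rest = resting_pressure([30.0, 30.0, 30.0])
strength = pfm_strength(
    [[30, 50, 70, 50, 30], [30, 48, 66, 48, 30], [30, 52, 74, 52, 30]], rest
)
# Eleven samples one second apart span a 10-s hold.
endurance = endurance_auc([60.0] * 11, dt=1.0, rest=rest)
print(rest, strength, endurance)
```

Because the area integrates pressure over time, its unit is pressure multiplied by time, which is why a single number can reflect both how hard and how long the contraction was held.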

Fig. 1 High-precision pressure transducer connected to a vaginal balloon (Camtech AS, Sandvika, Norway)

Fig. 2 Pressure curves from one participant showing vaginal resting pressure, pelvic floor muscle strength, and muscular endurance

Manometer analysis

Data showing pressure values and pressure curves were stored on the hard disk of the apparatus using the unique ID number for each woman. Each physiotherapist analyzed their own measurements.

Statistical analysis

Statistical analysis was performed using SPSS version 15. All parameters were measured once, except for PFM strength, where the mean of three contractions was used for analyses. Demographic data and results were given as mean values with standard deviations (SD), or, in the case of categorical data, as counts (%). Normality tests were performed. Intra- and interrater reliability were analyzed using the intraclass correlation coefficient (ICC, average measures) using a two-way mixed model for absolute agreement with the 95% confidence interval (CI). ICC values under 0.20 were considered poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, and 0.81–1.00 very good. A one-sample t test was used to calculate the mean difference (bias) between measurements and the corresponding SD and 95% CI. To assess agreement, the Bland–Altman approach was used [21]. This method assesses systematic bias and random error using the mean difference and 95% limits of agreement (mean difference ± 1.96 SD). Minimal detectable change was calculated to identify the smallest amount of change above the threshold of error, as the SD of the mean difference (bias) multiplied by 1.96 [22].
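As a rough illustration of the quantities named above, the sketch below computes an absolute-agreement, average-measures ICC from two-way ANOVA mean squares, the Bland–Altman bias and 95% limits of agreement, and the minimal detectable change as 1.96 times the SD of the differences. The study itself used SPSS; the function names and the example scores here are hypothetical and only show the arithmetic.

```python
from statistics import mean, stdev


def icc_a_k(scores):
    """ICC, absolute agreement, average measures.

    `scores` holds one row per woman and one column per test or assessor.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)


def classify_icc(icc):
    """Map an ICC value to the descriptive bands used in the text."""
    for upper, label in [(0.20, "poor"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "good")]:
        if icc <= upper:
            return label
    return "very good"


def bland_altman(first, second):
    """Return bias, 95% limits of agreement, and minimal detectable change."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # also the minimal detectable change
    return bias, (bias - spread, bias + spread), spread


# Hypothetical scores in cmH2O, for illustration only (not study data).
test1 = [10, 20, 30, 40]
test2 = [12, 19, 33, 40]
icc = icc_a_k([list(pair) for pair in zip(test1, test2)])
bias, limits, mdc = bland_altman(test1, test2)
print(round(icc, 3), classify_icc(icc), bias, limits, round(mdc, 2))
```

The sketch also shows why a high ICC and wide limits of agreement can coexist: the ICC is scaled by the variation between women, whereas the limits of agreement and the minimal detectable change depend only on the spread of the within-woman differences.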

Results

One woman was excluded because of her inability to insert the probe (owing to restricted vaginal opening) and one woman did not attend her scheduled appointment. Furthermore, one had to be excluded owing to poor image quality, leaving 20 women for analysis: mean age 55.8 years (range 27–71), mean parity 1.7 (range 0–3), and mean body mass index 23.6 (range 18.4–27.2, SD 2.4). The majority, 17 (85%), had a college/university degree and 10 (50%) reported undergoing strenuous physical work. All participants knew of PFMT and they were all able to perform a correct PFM contraction after instructions. Fourteen (70%) reported that they sometimes experienced minor symptoms from the pelvic floor (urinary incontinence, vaginal dryness, and urinary tract infection). Two were pregnant with their second child; one was in the second and one in the third trimester. None of the above conditions interfered with the placement of the catheter or the procedure.

Intra- and interrater analyses are shown in Tables 1 and 2 respectively. There was considerable variation in scores between women, as seen from the large SD in the first two rows. ICC values were very good for all measurements for both intra- and interrater assessments (ICC >0.90).

Table 1 Intrarater reliability analysis for vaginal resting pressure (VRP), pelvic floor muscle (PFM) strength, and muscular endurance for assessor 1. N = 20
Table 2 Interrater reliability analysis for VRP, PFM strength, and muscular endurance. N = 20

Results from the Bland–Altman plots are illustrated in Fig. 3, showing vaginal resting pressure (a, b), PFM strength (c, d), and muscular endurance (e, f). Visual inspection of the plots showed that the distribution of scores was right-skewed and that the limits of agreement were relatively wide. There was also a slight bias, as the centre line deviated from zero (Fig. 3), for vaginal resting pressure in the intrarater assessment (mean difference −2.44; 95% CI −4.51, −0.36) and for PFM strength in the interrater assessment (mean difference 2.24; 95% CI 0.03, 4.45) (Tables 1, 2). Outliers were observed in all measurements, most of which represented the strongest women (Fig. 3).

Fig. 3 Bland–Altman plot showing a, b vaginal resting pressure, c, d pelvic floor muscle strength, and e, f muscular endurance. The differences between the tests/assessors are plotted against each individual mean for the two tests. The bias line and random error lines forming the 95% limits of agreement are presented

Discussion

This intra- and interrater reliability and agreement study showed very good ICC values (>0.90) for vaginal resting pressure, PFM strength, and muscular endurance using a manometer. Agreement as seen in the Bland–Altman plot was poorer. The heterogeneity of the sample may explain these findings [21]. Systematic bias was statistically significant for two measurements. Visual inspection of the data showed a right-skewed distribution of scores and outliers representing the strongest women. Hence, the manometer used in this study seems less accurate for the strongest women and could potentially underestimate the highest scores and overestimate the lowest scores.

In a similar study by Bø et al. [14], test–retest on PFM strength using the same manometer was performed. The authors concluded that the test results were reproducible, but the wide confidence intervals imply some degree of inaccuracy around their estimates. This is in line with findings from the present study and corresponds with previous studies using different types of manometers [5,6,7, 14]. However, results from the above cited studies are not directly comparable owing to the use of different measurement devices [8]. Measurements recorded for the strongest women were more problematic than those recorded for the weakest women. The vaginal probe may be pulled further inside the vagina during contraction, and may not have been in the high-pressure zone [23], a possible explanation for the outliers seen in our sample. For the probe to stay in the high-pressure zone [18], the assessors had to control the movement of the probe, which is a potential source of error. The use of dynamometry may be less sensitive to the movement of the apparatus during contraction, as the device is better fixed inside the vagina [24]. Dumoulin et al. [24] concluded that there was good reliability for PFM strength, but that with a coefficient of variation (CV) of 21%, some degree of random error was present. We have not been able to find studies on muscular endurance using the same manometer as the present study. Therefore, direct comparisons of studies are challenging owing to different apparatus and methods of measurement [8]. Frawley et al. [6] measured multiple repeated muscle contractions (20 fast contractions) with the Peritron and reported poor reliability (ICC 0.05–0.42) in lying positions. Poor reliability for muscular endurance, measured as a 1-min maximum contraction, was also found using a dynamometer (dependability indices of 0.10) [24].

Vaginal resting pressure was measured before PFM contraction. This method was chosen as it has been used in previous studies using the same manometer [9, 12, 13]. Although no voluntary muscle contraction was seen (stable pressure values and a flat curve), outliers were also seen for this measurement. Using the Peritron, Frawley et al. [6] found good reliability (ICC 0.74–0.77) for vaginal resting pressure in lying positions. The use of a dynamometer has also shown “enough” reliability for the passive properties of the PFM in postmenopausal women [25]. However, “enough” is hard to quantify as there is a lack of data on normal values for the resting condition of the PFM [3]. Although surface electromyography (EMG) is not recommended for assessing PFM function beyond the presence/absence of muscle activation, surface EMG might be a more reliable tool for assessing the resting condition of the PFM, as no voluntary muscle activation is present and the measurement would be less biased by cross-talk from nearby muscles at rest than during contraction [3, 26].

Previous randomized controlled trials on women with stress urinary incontinence and pelvic organ prolapse, using Camtech AS [9,10,11], have shown statistically significant improvement in PFM strength and muscular endurance after PFMT. These improvements have been close to or above what we found to be the minimal detectable change for PFM strength: 8.2 cmH2O (p < 0.03) and 15.5 cmH2O (p < 0.01) respectively [10, 11], and 13.1 cmH2O, and for muscular endurance 107 cmH2O (p < 0.001) [9]. Two recent observational studies also found statistically significant differences in vaginal resting pressure: 3.6 cmH2O (95% CI 0.7, 6.6) between women with and without vaginal laxity [12], and 3.3 cmH2O (p = 0.02) between women with provoked vestibulodynia and asymptomatic controls [13]. According to the results from the present study, differences of 3.6 and 3.3 cmH2O may be the result of measurement error. However, although gain in PFM function may be low and clinically nonsignificant, women may still report significant improvement in symptoms after PFMT [27].

The heterogeneity of participants representing everyday clinical practice, the standardization of test procedures, and the use of recommended statistical methods are strengths of this study [15, 18, 22, 28]. The physiotherapists had been thoroughly trained by the supervisor of the project, KB, and had extensive experience in the use of the method. All women in the sample knew of PFMT and were able to perform a correct contraction after receiving instructions. Although less likely, we cannot rule out a possible learning effect. The number of participants included may be another limitation. Regarding sample size, the number of women in this study was in line with previous studies in this field [5,6,7, 14]. However, including more women in the study may have given a better estimate of the limits of agreement [28], but would probably not have changed the overall outcome. Although we included a heterogeneous group of women, we could still question the generalizability of the results. Our results indicate that the apparatus seems less accurate for the strongest women in this sample. This means that the limits of agreement could be wider apart than they should be for the lowest scores and narrower than they should be for the highest scores [21]. Based on results from the present study, it is important for clinicians to be aware that, to evaluate improvement beyond the error of measurement after conservative treatments, the gain should exceed the minimal detectable change.

The clinical relevance of using a manometer could be the visual biofeedback obtained during a PFM contraction, giving extra motivation for maximum contraction. In addition, it may be motivating to follow the development of quantifiable data on PFM function throughout an exercise period. However, in countries where such measurement tools are not available because of cost and feasibility, palpation should still be the core measurement. Nevertheless, assessors need to be aware of the limitations of palpation and its reported low interrater reliability [4, 29].

Conclusions

Compared with previous studies in this field, using the same device, this study presents new data on the reliability and agreement of vaginal resting pressure, PFM strength, and muscular endurance. Very good ICC values were found; however, agreement using the Bland–Altman approach was poorer. Outliers were women with the strongest PFM. Thus, Camtech AS seems less accurate for the strongest women and could potentially underestimate the highest scores and overestimate the lowest scores. For use in clinical practice, examiners must be aware that a significant improvement in PFM variables needs to exceed the minimal detectable change to be above the error of measurement.