Introduction

Computed tomographic colonography (CTC) [1, 2] has been shown to be sufficiently accurate in detecting colorectal neoplasia. It is less invasive and better tolerated than colonoscopy [3, 4]. Since CTC was first introduced in 1994 by Vining et al. [5], rapid advances in technology have improved visualisation of the colon. Multidetector computed tomography (MDCT) now permits image acquisition of thin 1- to 2-mm slices of the entire large intestine, well within breath-hold imaging times. Computer graphics continue to refine three-dimensional (3D) visualisation, providing an endoscopic fly-through of the colon with simultaneous interactive depiction of multi-planar two-dimensional (2D) images. This integrated use of the 3D and 2D techniques improves polyp detection [6]. CTC involves helical CT scanning of the cleansed, distended colorectum, followed by 3D image rendering to simulate the endoscopic view, hence the alternative title “virtual colonoscopy”. Within-subject comparisons between CTC and conventional colonoscopy have reported similar detection rates for polyps 10 mm or larger [1, 2, 7, 8], and meta-analysis data support good diagnostic performance [9, 10]. Moreover, it is now established that CTC is more accurate and acceptable to patients than its radiological alternative, the barium enema [11]. Meanwhile, the debate as to who should interpret CTC (radiologists, gastroenterologists, radiographers or even computer algorithms) continues to intensify. Previous studies have shown that radiographers may perform well in reading CTC images [12–15], and a recent study by Haan et al. [16] showed that the diagnostic accuracy of radiographers and radiologists for intracolonic lesions was comparable.

The aim of this study was to investigate the reviewer performance of four trained radiographers in comparison with that of two experienced radiologists in the evaluation of CTC examinations of 87 patients by comparing the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of CTC in polyp detection with the reference standard, optical colonoscopy (OC).

Materials and methods

Study design

The prospective study started in September 2008 and ended in November 2010, and the study protocol was approved by the Institutional Review Board (Videnskabsetisk Komité) in accordance with the Declaration of Helsinki.

All patients provided written informed consent before participation in the study and before the examination. The study was supported by grants from Metropolitan University College (DK), University College Nordjylland (DK), Odense University Hospital (DK), Copenhagen University Hospital Herlev (DK) and the Danish Association of Radiographers.

Four radiographers trained in CTC and two radiologists interpreted the 87 CTC examinations. The radiographers had previously been trained and tested for competence in CTC interpretation [12]. Before that training they had no experience with CTC and only very basic knowledge of colonic anatomy and pathology. They had practical experience with numerous abdominal CT and barium enema examinations, and they undertook the training voluntarily. The training included a 3-day CTC workshop and subsequently a tele-training programme based on the interpretation of 75 cases performed at the local department. To evaluate the educational performance, each radiographer was tested on 20 test cases, and the target was a per-polyp sensitivity of 80 % per radiographer for polyps ≥6 mm. They yielded a per-polyp sensitivity of 80.7 % (95 % CI 69.5–92.0) and 94.7 % (95 % CI 85.6–100) for polyps ≥6 mm and ≥10 mm, respectively. The test cases included five normal cases and 15 cases with colonic polyps, and encompassed a total of 27 polyps ≥6 mm, with 12 and 15 polyps 6–9 mm and ≥10 mm, respectively.

One of the radiologists was trained at a 2-day ESGAR (European Society of Gastrointestinal Radiology) workshop. The other radiologist had the same training as the radiographers except for the test. Both radiologists had clinical experience of more than 200 clinical CTCs.

Study population

A total of 87 consecutive symptomatic outpatients examined in two university hospitals in Denmark (67 from hospital A and 20 from hospital B) (57 men and 30 women, 35–90 years of age, mean [SD] 64 [11.4] years) were included in the study. All underwent same-day CTC and OC, with CTC performed immediately prior to OC.

Inclusion and exclusion criteria

Inclusion criteria were referral for OC, age ≥18 years, and the ability to give written and oral informed consent. Patients were excluded in case of inflammatory bowel disease, pregnancy, colostomy after colorectal surgery, colorectal biopsy performed within 72 h and/or polypectomy within 2 weeks prior to CTC, and/or known allergy to Buscopan.

Diagnostic procedures

Examination technique

All patients underwent a colonic preparation using a low-fibre diet, 2 l polyethylene glycol electrolyte solution (Moviprep; Norgine, Mid Glamorgan, UK) and faecal tagging. In 67 patients (hospital A), faecal tagging was obtained with 100 ml ionic iodinated contrast (Gastrografin 370 mgI/ml; Bracco Diagnostics, Princeton, USA) diluted in 400 ml water and administered the day before their CTC. In 20 patients (hospital B), faecal tagging was obtained with 20 ml non-ionic iodinated contrast (Iomeron 300 mgI/ml; Bracco Diagnostics, Princeton, USA) diluted in 200 ml water and administered in the late afternoon the day before the examination.

In 67 patients (hospital A), 20 mg i.v. hyoscine butylbromide (Buscopan; Boehringer, Ingelheim, Germany) was used for bowel relaxation [17]. All patients underwent colonic insufflation with carbon dioxide using a CO2 injector (PROTOCO2L; Bracco, Princeton, USA). At hospital B, there was no use of medicine for bowel relaxation.

All the examinations were performed using a 64-channel multislice CT scanner (hospital A, Brilliance Philips Medical Systems, The Netherlands; hospital B, Lightspeed, General Electric Medical Systems, France).

Scans were obtained at 50 mAs (hospital A) and 40 mAs (hospital B) with 120 kV. Patients were examined in supine and prone positions with identical scanning parameters for both positions: collimation 64 × 0.625, slice thickness 1 mm, increment 1 mm, rotation time 0.5 s.

Interpretation

All readers read the 87 examinations independently and were blinded to all clinical findings, the results from OC and each other’s findings.

Image processing and interpretation were performed with the use of a CT workstation (Extended Brilliance workspace 3.5, Philips, The Netherlands) provided with dedicated CTC software. This system was used by the radiographers and by one radiologist. Due to local technical limitations of the workstation, simultaneous projection of the supine and prone acquisition, allowing fast comparison between both acquisitions, was impossible. The other radiologist interpreted the examinations on a Vitrea workstation (Vital Images, Minnetonka, MN, USA).

CT data were transferred to the workstations for subsequent reading: either primary 2D reading with 3D problem-solving or primary 3D reading with 2D problem-solving.

The choice of reading method was decided by personal preferences.

As recommended in a recently published “consensus” paper by ESGAR [6], polyps were measured with electronic calipers on 2D view using a soft tissue window setting and recorded according to the segment (caecum, ascending colon, transverse colon, descending colon, sigmoid colon or rectum). The polyp size was determined as the measurement of its largest diameter (the stalk of the polyp, when visible, was not considered for measurement).

For each polyp detected, the readers recorded the segmental location, the size, the attenuation, the slice numbers per acquisition, and the distance to the anal margin of the polyp in a case record form including an image of the polyp. Colorectal polyps ≥6 mm were reported and classified in two size categories (≥6 mm and ≥10 mm). Tumours were included in the calculations and analysed as polyps but were described separately as well. The C-RADS classification was used [18]. To be included in the study, all six segments (caecum, ascending colon, transverse colon, descending colon, sigmoid colon and rectum) needed to be distended and free of obscuring fluid and faecal residue in either the supine or the prone position. Segmental unblinding was not used in the study.

Colonoscopy protocol

OCs were performed by an experienced staff member (gastroenterologist or gastrointestinal surgeon) or by a gastroenterology fellow under direct supervision of experienced staff using 165-cm colonoscopes (Olympus CF-Q1, 160DL; Olympus Europe, Hamburg, Germany). While performing the OCs, the endoscopist was unaware of the CTC findings. Patients received 2.5–7.5 mg midazolam (Dormicum; Roche, Basel, Switzerland) and 0.05–0.1 mg fentanyl (Janssen Pharmaceuticals, Titusville, NJ, USA) on request. The size, morphological features, segmental location and the distance from the anal margin of the polyps were documented on a case record form by the endoscopist who performed the examination and by the attending research fellow. Polyp size was measured at endoscopy using open biopsy forceps. The research fellow, who was not involved in interpreting the findings, matched CTC and colonoscopic (reference standard) findings. A face-to-face comparison was made of the CTC and colonoscopic images.

According to the adopted segmental checking procedure, a lesion found at CTC was matched to a corresponding one found at OC if it was located in the same or adjacent colon segment and when its size differed by no more than 50 % [2]. Discrepancies in the results of the lesion-matching were adjudicated by a third expert reader not involved in the interpretations of the CTCs.
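
The matching rule above can be sketched as a small function. This is a minimal illustration, not the study's actual procedure: the function name and arguments are hypothetical, and the 50 % size tolerance is assumed to be taken relative to the OC (reference) measurement, since the text does not state the denominator.

```python
# Segment order from caecum to rectum, as listed in the reading protocol.
SEGMENTS = ["caecum", "ascending", "transverse", "descending", "sigmoid", "rectum"]


def is_match(ctc_segment, ctc_size_mm, oc_segment, oc_size_mm, max_rel_diff=0.5):
    """Return True if a CTC lesion matches an OC lesion under the stated rule.

    Assumption: the size tolerance is relative to the OC measurement.
    """
    # Same or adjacent segment: positions differ by at most one.
    seg_gap = abs(SEGMENTS.index(ctc_segment) - SEGMENTS.index(oc_segment))
    same_or_adjacent = seg_gap <= 1
    # Size differs by no more than 50 % of the reference size.
    size_ok = abs(ctc_size_mm - oc_size_mm) <= max_rel_diff * oc_size_mm
    return same_or_adjacent and size_ok
```

For example, under these assumptions an 8-mm lesion reported in the sigmoid colon at CTC would match a 10-mm lesion found in the rectum at OC, whereas a lesion two segments away would not.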

Statistical analyses

Sensitivity, specificity and positive and negative predictive values (PPV and NPV respectively) were assessed by means of point estimates and respective 95 % confidence intervals (95 % CI) on a per-patient basis. Moreover, sensitivity was analysed on a per-polyp basis and stratified according to the respective size categories (polyps ≥6 mm as well as polyps ≥10 mm). Patient-based analyses per reader were carried out by means of 95 % CI based on the Wilson-score method [19]. For average reader analyses as well as for polyp-based analyses, linear regression models were used with the constant term as the only explanatory variable and clustered sandwich estimators of variance to allow intra-group correlation. Bootstrapping [20] was applied in order to account for the correlated nature of the data when computing point estimates and 95 % CI. Group comparisons were performed by comparing the respective 95 % CIs using a significance level of 5 %. Inter-reader agreement was assessed for both radiologists and radiographers using Cohen’s kappa [21] and Fleiss’s kappa [22], respectively. Supplementary 95 % CI were calculated using bootstrapping.
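
To illustrate why bootstrapping helps with correlated readings, the sketch below resamples whole patients so that all readers' calls on one patient stay together. It is an assumption-laden illustration in Python rather than the Stata analysis used in the study: the percentile interval, the function name and the toy data are all hypothetical.

```python
import random


def bootstrap_sensitivity(detections, n_boot=2000, seed=0):
    """Percentile bootstrap CI for pooled sensitivity, resampling patients.

    detections: one list per diseased patient, holding one 0/1 value per
    reader (1 = the reader called the patient positive).
    """
    rng = random.Random(seed)
    n = len(detections)

    def pooled(sample):
        # Pooled sensitivity: positive calls divided by all reads.
        hits = sum(sum(patient) for patient in sample)
        reads = sum(len(patient) for patient in sample)
        return hits / reads

    point = pooled(detections)
    # Draw n patients with replacement, n_boot times, and sort the estimates.
    stats = sorted(
        pooled([detections[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return point, lower, upper


# Hypothetical data: 6 diseased patients read by 4 readers (not study data).
example = [[1, 1, 1, 1], [1, 1, 0, 1], [0, 0, 0, 0],
           [1, 1, 1, 1], [1, 0, 1, 1], [0, 1, 0, 0]]
point, lower, upper = bootstrap_sensitivity(example)
```

Resampling individual reads instead of patients would treat the four readings of one patient as independent and would tend to give intervals that are too narrow.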

Assuming a prevalence of patients with colorectal neoplasia of 33 % and a true (but unknown) sensitivity on a per-patient basis of 0.85, including 87 patients in the study was sufficient for an expected width of a 95 % Wilson-score CI of 0.25. This precision was deemed appropriate for this exploratory study. All results were kept in a worksheet (Microsoft Excel version 2007; Microsoft, Redmond, WA, USA), and analysed by using Stata/MP 11.1 (StataCorp, College Station, TX, USA).
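
The precision argument above can be checked with a plug-in calculation. The sketch below is an approximation under stated assumptions: it uses the standard Wilson score formula and inserts the expected number of diseased patients and the assumed sensitivity directly, rather than averaging the width over all possible outcomes; the variable names are illustrative.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (95 % by default)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


n_patients = 87
prevalence = 0.33
sensitivity = 0.85

# Expected number of diseased patients; kept fractional because it is an expectation.
n_diseased = n_patients * prevalence
lower, upper = wilson_interval(sensitivity * n_diseased, n_diseased)
width = upper - lower
```

With these inputs the width comes out at roughly 0.26, in line with the quoted value of 0.25.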

Results

There were a total of 40 polyps ≥6 mm with 24 and 16 polyps measuring 6–9 mm and ≥10 mm, respectively. The polyps were detected in 22 of 87 patients (25 %). Four masses, 25 hyperplastic polyps and 11 adenomas were detected with 28, 6 and 2 having a sessile, pedunculated and flat morphology, respectively. There were six incomplete OCs (6.7 %). In these cases the CTCs were compared to OCs for the colon segments examined by both techniques. Incomplete OCs were normal in three cases, and showed a stenosing mass in two cases and a polyp in the ascending colon in one case. Among the six incomplete OCs, the two stenosing masses were detected by all the readers. The third mass (17 mm), located in the rectum 7.5 cm from the anal margin, was initially missed at OC (Fig. 1). This mass was detected by five out of six CTC readers. Review OC confirmed the lesion, with histology revealing an adenocarcinoma. The fourth mass (25 mm) was a metastasis from prostate cancer located in the rectum. This lesion was seen by all the readers.

Fig. 1 Tumour in rectum (17 mm) initially not seen by OC. a Supine position 2D axial CTC image. b Prone position 2D axial CTC image. c Supine 3D endoluminal CTC image shows the tumour within the rectum

Sensitivity

The radiographers obtained an overall per-patient sensitivity (using bootstrapping) of 76.2 % (95 % CI 61.4–91.0) for patients with polyps ≥6 mm. Individual per-patient sensitivity with 95 % CI is shown in Table 1 and ranged between 71.4 % and 85.7 % for polyps ≥6 mm. The radiologists achieved an overall per-patient sensitivity (using bootstrapping) of 76.2 % (95 % CI 61.7–90.6) for patients with polyps ≥6 mm. Individual per-patient sensitivity with 95 % CI is shown in Table 1 and ranged between 66.7 % and 85.7 % for polyps ≥6 mm. The bootstrapping analysis of the overall per-patient sensitivity demonstrated no difference between the radiographers and the radiologists. Per-patient inter-reader agreement was moderate among the radiologists, with a kappa value of 0.42 (95 % CI 0.23–0.60), and good among the radiographers, with a kappa value of 0.69 (95 % CI 0.58–0.80), according to the classification of Altman et al. [23].
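
As a reminder of what a kappa value measures, the sketch below computes Cohen's kappa for two readers' binary per-patient calls, i.e. observed agreement corrected for the agreement expected by chance. It covers only the two-reader case (the four radiographers would need Fleiss's kappa), and the reader calls are hypothetical, not study data.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two readers' binary per-patient calls (0/1).

    Undefined (division by zero) when chance agreement equals 1, e.g. when
    both readers give the same call for every patient.
    """
    n = len(calls_a)
    # Proportion of patients on which the two readers agree.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement from each reader's marginal positive rate.
    pos_a = sum(calls_a) / n
    pos_b = sum(calls_b) / n
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    return (observed - expected) / (1 - expected)


# Hypothetical calls for 10 patients (1 = polyp >=6 mm reported).
reader_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
reader_b = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]
kappa = cohens_kappa(reader_a, reader_b)
```

In this toy example the readers agree on 8 of 10 patients, but because 52 % agreement would be expected by chance, kappa is about 0.58.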

Table 1 Performance characteristics per patient and per polyp

The radiographers achieved an overall per-polyp sensitivity (using bootstrapping) of 60.3 % (95 % CI 50.3–70.3) and 60.7 % (95 % CI 42.2–79.2) for polyps ≥6 mm and ≥10 mm, respectively. Individual per-polyp sensitivity (using bootstrapping) with 95 % CI is shown in Table 1 and ranged between 53.8 % and 71.8 % and between 50.0 % and 71.4 % for polyps ≥6 mm and ≥10 mm, respectively.

The radiologists obtained an overall per-polyp sensitivity (using bootstrapping) of 59.2 % (95 % CI 46.4–72.0) and 69.0 % (95 % CI 48.1–89.6) for polyps ≥6 mm and ≥10 mm, respectively. Individual per-polyp sensitivity (using bootstrapping) with 95 % CI is shown in Table 1 and ranged between 51.3 % and 67.6 % and between 66.7 % and 71.4 % for polyps ≥6 mm and ≥10 mm, respectively.

There was no statistically significant difference in per-polyp sensitivity between the radiographers as a group and the radiologists as a group. The difference between the two groups was larger for polyps ≥10 mm than for polyps ≥6 mm (Table 1).

Specificity

Overall per-patient specificity (using bootstrapping) for the radiographers was 81.4 % (95 % CI 73.7–89.2) for patients with polyps ≥6 mm. Individual specificity with 95 % CI is shown in Table 1 and ranged between 78.8 % and 83.3 %. The radiologists obtained an overall per-patient specificity using bootstrapping of 81.1 % (95 % CI 73.8–88.3) for patients with polyps ≥6 mm. Individual specificity with 95 % CI is shown in Table 1 and ranged between 74.2 % and 87.9 %.

Positive predictive value

The radiographers achieved an overall per-patient PPV of 56.6 % (95 % CI 40.1–73.2) for patients with polyps ≥6 mm. Individual PPV with 95 % CI is shown in Table 2 and ranged between 51.7 % and 60.0 %.

Table 2 Analysis of the positive and negative predictive values per patient

Overall per-patient PPV for the radiologists was 56.1 % (95 % CI 40.0–72.3) for polyps ≥6 mm. Individual PPV with 95 % CI is shown in Table 2 and ranged between 51.4 % and 63.6 %.

Negative predictive value

Overall per-patient NPV for the radiographers was 91.5 % (95 % CI 85.2–97.8) for polyps ≥6 mm. Individual NPV with 95 % CI is shown in Table 2 and ranged between 89.7 % and 94.7 %. The radiologists obtained an overall per-patient NPV of 91.5 % (95 % CI 85.4–97.5) for polyps ≥6 mm. Individual NPV with 95 % CI is shown in Table 2 and ranged between 89.2 % and 94.2 %.
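
The four per-patient measures reported above all derive from one 2 × 2 table per reader. The sketch below shows the arithmetic with hypothetical counts for a single reader; the counts are chosen for illustration only and are not taken from Table 1 or Table 2.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Per-patient accuracy measures from a 2 x 2 table against the reference."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients called positive
        "specificity": tn / (tn + fp),  # non-diseased patients called negative
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
    }


# Hypothetical single-reader counts for 87 patients (not study data).
metrics = diagnostic_metrics(tp=16, fp=12, fn=5, tn=54)
```

With these illustrative counts, sensitivity is about 76 % and specificity about 82 %, yet PPV is only about 57 % while NPV is about 92 %: when roughly a quarter of patients have a polyp ≥6 mm, even a modest false-positive rate yields many false positives relative to true positives.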

Discussion

In some countries, radiographers are likely to play a useful role in the dissemination of CTC, including the conversion of current barium enema services, by decreasing the radiologists’ workload. If sufficient experience of these radiographers is obtained and validated, their interpretation of CTC under supervision of a radiologist could be considered. In the present study, the diagnostic performance of trained radiographers was comparable to that of experienced radiologists interpreting CTC examinations. Like other recent prospective CTC studies [2, 24–26], our study focused on polyps measuring 6 mm or more, since the prevalence of advanced histological features in small polyps (i.e. <6 mm) is reportedly low [27].

We investigated the performance characteristics of CTC by trained radiographers and experienced radiologists in 87 consecutively enrolled symptomatic outpatients. No statistically significant differences in detection rates were found between radiologists and radiographers. The overall sensitivity (Table 1) per patient with polyps ≥6 mm was 76.2 % for both radiographers and radiologists. The overall specificity per patient with polyps ≥6 mm for radiographers and radiologists was 81.4 % and 81.1 %, respectively (Table 1). The overall sensitivity per patient with polyps ≥6 mm in this study is lower than results reported in two studies on average-risk individuals including more than 300 patients [28, 29].

One probable reason for the lower sensitivity could be the training of the radiographers, who were tested on only 20 cases with a total of 27 polyps ≥6 mm. However, the same number of cases was also used for testing the participants in the ACRIN trial [1].

A second reason could be the lack of bowel relaxation at hospital B. According to the second ESGAR consensus statement on CTC [6], use of spasmolytics is preferable prior to colonic distension. A third reason may be the use of 2 l polyethylene glycol at both hospitals, which could result in residual fluid in the colon [30]. The use of sodium phosphate in a single dose would probably have resulted in a smaller amount of residual fluid. Another reason could be the use of different tagging material at the two hospitals (100 ml Gastrografin at hospital A and 20 ml Iomeron at hospital B). In two large studies [1, 7] showing good results, Gastrografin was used as tagging material.

Bodily et al. [31] found that in a selected data set of 56 cases, two trained radiographers and 15 radiologists achieved a sensitivity and specificity per patient with polyps ≥5 mm of 70 % versus 84 % and 80 % versus 74 %, respectively. Per-polyp sensitivity for radiographers and radiologists was 79.5 % versus 71 % for polyps ≥5 mm and polyps ≥10 mm respectively. In the study by Jensch et al. [32], two trained radiographers and two radiologists (one experienced and one in training) evaluated 145 cases with a sensitivity and specificity per patient with polyps ≥6 mm of 87 % versus 81 % and 67 % versus 71 %, respectively. Per-polyp sensitivity for radiographers and radiologists was 65 % versus 71 % and 66 % versus 69 % for polyps ≥6 mm and ≥10 mm, respectively. In our study, the overall per-polyp sensitivity for radiographers and radiologists was 60.3 % versus 59.2 % and 60.7 % versus 69.0 % for polyps ≥6 mm and ≥10 mm, respectively (Table 1). In the two studies by Jensch et al. and Bodily et al., there was no statistically significant difference in the diagnostic performance between radiographers and radiologists. In both, however, the difference in per-patient sensitivity between the two groups of readers was larger than in our study, which showed exactly the same per-patient sensitivity of 76.2 % for both groups. For calculation of inter-reader variability, the kappa value is the accepted statistical method, and in our study per-patient inter-reader agreement was lower between the two experienced radiologists (0.42) than among the four radiographers (0.69).

The reason for the difference between the two groups of readers could be that the radiologists had different training in CTC reading, compared with the radiographers who all went through the aforementioned training.

This result concurs with another study by Burling et al. [14], which showed an agreement between the reference standard (consensus between expert radiological review, colonoscopy data, and clinical follow-up) and computer-aided detection (CAD)-assisted radiographers with a kappa value of 0.72 (95 % CI 0.65–0.78).

There were limitations to our study. The first limitation is that, because this was a two-centre study, CTC and preparation protocols were not uniform across participating centres. The performance characteristics found in our study are probably affected by these variable conditions. A second factor that probably had a negative influence on our results is insufficient stool and fluid tagging and poor patient compliance, which could make interpretation difficult in some cases. A third limitation that could have affected the results is the lack of simultaneous evaluation of supine and prone images for some of the readers and the use of different workstations. For a further evaluation of the results, an analysis of the pitfalls encountered by the radiographers and the radiologists in this study could be of great interest.

However, the results imply that deployment of radiographers as reviewers in CTC is acceptable, but radiologists would still be necessary for the evaluation of extracolonic findings. One could certainly ask whether the participating radiologists were experienced enough, and whether other radiologists could have achieved a better level of diagnostic performance.

Although this study included only 87 patients, the results suggest that dedicated radiographers trained in interpretation of CTC examinations can achieve a diagnostic accuracy comparable to that of experienced radiologists. The results also indicate that the diagnostic performance can still be improved with further experience and better techniques. The finding that radiographers can reach sensitivities similar to those of radiologists raises the question of whether double-reading by a radiologist and a radiographer would result in higher sensitivities than reading by one radiologist alone. Further studies are needed to evaluate this.