Volume 57, Issue 12, pp 3299-3302
Date: 09 Aug 2012

Interpretation of Probe-Based Confocal Laser Endomicroscopy of Indeterminate Biliary Strictures: Is There Any Interobserver Agreement?




Probe-based confocal laser endomicroscopy (pCLE) has enabled in vivo histopathology by obtaining high-resolution images of the mucosal layers of the gastrointestinal tract. For indeterminate bile duct strictures, biopsy, cytologic brushing, and needle aspiration have low diagnostic accuracy.


The objective of this multi-center pilot study was to assess the interobserver agreement in interpretation of pCLE imaging.


Twenty-five de-identified pCLE video clips of indeterminate biliary strictures were sent to 6 observers at 5 institutions. The Miami Classification was used to standardize image interpretation. Seven variables were assessed for interobserver agreement using the Fleiss kappa statistic: presence of thick (>20 μm) or thin (<20 μm) dark or white bands, dark clumps, epithelium including glandular structures, interstitial fluorescein leakage, ease of interpretation, and final diagnosis. Observers were grouped into 3 categories by operator experience (Category 1: 0–10; Category 2: 11–20; Category 3: >21 cases).
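For readers unfamiliar with the statistic, the following is a minimal sketch of how a Fleiss kappa can be computed for one variable. It is not the authors' analysis code. The function name `fleiss_kappa` and the toy rating counts are hypothetical, and the sketch assumes every clip is rated by the same number of observers.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of observers who assigned subject i
    (here, a video clip) to category j. Every row must sum to the same
    number of observers.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_subjects == 0 or n_raters < 2:
        raise ValueError("need at least one subject and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must be rated by the same number of raters")

    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per subject: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical data: 4 clips, 6 observers, 2 categories (e.g. present / absent).
    toy_counts = [[5, 1], [2, 4], [3, 3], [6, 0]]
    print(round(fleiss_kappa(toy_counts), 3))
```

A kappa of 1 indicates complete agreement, 0 indicates agreement no better than chance, and negative values indicate agreement below chance, which is how several of the reported values should be read.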


Upon stratification by experience, Category 1 interobserver agreement ranged from “Poor” to “Fair” (κ = 0.277, κ = −0.079, κ = −0.025, κ = −0.066, κ = 0.128, κ = 0.088), and for the final diagnosis variable, the agreement was slight (κ = 0.033). Interobserver agreement for Categories 2 and 3 also ranged from “Poor” to “Fair” (κ = 0.211, κ = 0.181, κ = 0.347, κ = 0.238, κ = −0.050, κ = 0.092), and for the final diagnosis variable, the agreement was slight (κ = 0.195).
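The verbal labels used for the kappa values can be illustrated with a short sketch. The abstract does not name the scale it uses. The code below assumes the commonly used Landis and Koch bands, and the function name `agreement_label` is hypothetical.

```python
def agreement_label(kappa: float) -> str:
    """Map a kappa value to a verbal label.

    Assumption: the Landis and Koch bands. The abstract does not state
    which scale was applied.
    """
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Final-diagnosis kappas reported for Category 1 and for Categories 2 and 3.
    for value in (0.033, 0.195):
        print(value, agreement_label(value))
```

Under this assumed scale, both final-diagnosis values fall in the "slight" band, and the lowest and highest per-variable values (−0.079 and 0.347) fall in "poor" and "fair" respectively, consistent with the ranges reported.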


The overall interobserver agreement for pCLE image interpretation in indeterminate biliary strictures ranges from poor to fair. Interpretation criteria as well as training require further standardization toward improving inter-rater reliability.