Virchows Archiv, Volume 461, Issue 5, pp 495–504

Serrated polyps of the colon: how reproducible is their classification?

Authors

  • A. Ensari
    • Department of Pathology, Ankara University Medical School
  • Banu Bilezikçi
    • Department of Pathology, Ankara Guven Hospital
  • Fatima Carneiro
    • Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP)
    • Medical Faculty and Centro Hospitalar São João
  • Gülen Bülbül Doğusoy
    • Department of Pathology, Group Florence Nightingale Hospitals
  • Ann Driessen
    • Department of Pathology, Maastricht University Medical Center
  • Ayşe Dursun
    • Department of Pathology, Gazi Medical Faculty
  • Jean-François Flejou
    • Department of Pathology, Saint-Antoine Hospital, AP-HP, Pierre et Marie Curie Medical School
  • Karel Geboes
    • Universiteit Gent
  • Gert de Hertogh
    • Department of Pathology, KU Leuven
  • Anne Jouret-Mourin
    • Department of Pathology, Cliniques Universitaires St Luc, UCL
  • Cord Langner
    • Institute of Pathology, Medical University of Graz
  • Iris D. Nagtegaal
    • Department of Pathology, Radboud University Nijmegen Medical Center
  • Johan Offerhaus
    • Department of Pathology, University Medical Center Utrecht
  • Janina Orlowska
    • Department of Pathology, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology
  • Ari Ristimäki
    • Department of Pathology, HUSLAB and Haartman Institute, Helsinki University Central Hospital
    • Genome-Scale Biology, Research Program Unit, University of Helsinki
  • Julian Sanz-Ortega
    • Departamento de Anatomía Patológica, Facultad de Medicina, Universidad Complutense de Madrid, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC)
  • Berna Savaş
    • Department of Pathology, Ankara University Medical School
  • Maria Sotiropoulou
    • Department of Pathology, Alexandra Hospital
  • Vincenzo Villanacci
    • Department of Pathology, Spedali Civili Brescia
  • Nazmiye Kurşun
    • Department of Biostatistics, Ankara University Medical School
  • Fred Bosman
    • University Institute of Pathology, University of Lausanne Medical Center
Original Article

DOI: 10.1007/s00428-012-1319-7

Cite this article as:
Ensari, A., Bilezikçi, B., Carneiro, F. et al. Virchows Arch (2012) 461: 495. doi:10.1007/s00428-012-1319-7

Abstract

For several years, the lack of consensus on definition, nomenclature, natural history, and biology of serrated polyps (SPs) of the colon has created considerable confusion among pathologists. According to the latest WHO classification, the family of SPs comprises hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs). The term SSA/P with dysplasia has replaced the category of mixed hyperplastic/adenomatous polyps (MPs). The present study aimed to evaluate the reproducibility of the diagnosis of SPs based on currently available diagnostic criteria and interactive consensus development. In an initial round, H&E slides of 70 cases of SPs were circulated among participating pathologists across Europe. This round was followed by a consensus discussion on diagnostic criteria. A second round was performed on the same 70 cases using the revised criteria and definitions according to the recent WHO classification. Data were evaluated for inter-observer agreement using Kappa statistics. In the initial round, for the total of 70 cases, a fair overall kappa value of 0.318 was reached, while in the second round overall kappa value improved to moderate (kappa = 0.557; p < 0.001). Overall kappa values for each diagnostic category also significantly improved in the final round, reaching 0.977 for HP, 0.912 for SSA/P, and 0.845 for TSA (p < 0.001). The diagnostic reproducibility of SPs improves when strictly defined, standardized diagnostic criteria adopted by consensus are applied.

Keywords

Serrated polyp, Diagnosis, Criteria, Kappa statistics

Introduction

The term “serrated polyp” is used as a generic name for polyps demonstrating sawtooth-like infolding of the surface and crypt epithelium. The family of serrated polyps (SPs) comprises hyperplastic polyps (HPs), sessile serrated adenomas/polyps (SSA/Ps), and traditional serrated adenomas (TSAs) according to the latest WHO classification [1], in which the term “SSA/P with dysplasia” is preferred to the category of mixed hyperplastic/adenomatous polyps (MPs) [2]. The most common members of the SP family, HPs, comprise 80–90 % of all serrated polyps and are found throughout the colon and rectum, with distal predominance. Histologically, HPs are characterized by simple elongated crypt architecture and narrow crypt bases resembling normal mucosa, with proliferative activity confined to the basal third of the crypts [1, 3–6]. SSA/Ps, on the other hand, account for 8–20 % of serrated polyps, with a predilection for the right colon. Their diagnosis is based mainly on crypt architectural features including serration, dilatation, horizontal orientation, and an L-shape or inverted T-form at the base of the crypts [1, 3–9], which furthermore show an asymmetrical proliferation zone and goblet cell or gastric foveolar cell differentiation [10]. The rarest type of SP, TSA, has a protuberant growth pattern with a complex villiform configuration and premature crypt formation, defined as “ectopic crypt” [1, 5, 9, 11].

Lack of consensus in pathological diagnosis and classification over the past two decades resulted in publications in which polyps with almost identical morphology were reported under different diagnostic terms [12–15]. Hence, the molecular data generated on the “serrated neoplasia” pathway have been rather inconsistent [16–21]. An accurate diagnosis of SPs is, therefore, crucial for a better understanding of the biology of the different entities, their specific molecular characteristics, and a better definition of the risk of malignant neoplasia that they carry. Only when this has been clarified can accurate risk assessment and adequate surveillance approaches be adopted. In daily practice, the reproducibility of the histopathological diagnosis of lesions in the serrated polyp family has been perceived as less than optimal [22–30]. We therefore designed a European study aiming to find answers to the following questions:
  1. How reproducible are the diagnoses of serrated polyps?
  2. What is the impact of a standardized classification (i.e., WHO classification) on diagnostic reproducibility?
  3. Which criteria are used for diagnosis, how reproducible are they, and what is their discriminative value?

Materials and methods

Twenty European pathologists, all members of the Digestive Diseases Working Group of the European Society of Pathology and all with special interest and experience in gastrointestinal pathology, were invited to participate in the study. Cases diagnosed as SPs (28 HPs, 25 SSA/Ps, 11 TSAs, and six MPs) in the original sign-outs were retrieved from the pathology archives of Ankara, Graz, and Warsaw. Features such as size, location, or biopsy orientation were not considered in the selection of cases, in order to simulate a real-life experience for participating pathologists. Thus, the cases comprised a mixture of polyps with characteristic features and those with features hampering an unambiguous diagnosis. The aim was to assess the histological criteria and classification without any supportive information to avoid bias in morphological assessment. Consequently, the original sign-out diagnoses, size, and location of the polyps were not provided on the worksheet.

We used the definitions and diagnostic criteria available in the literature by 2008 [3–9], when the study was initiated, prior to the last edition of the WHO blue book [1]. Thus, diagnostic categories comprised HP, SSA/P, TSA, and MP when a final diagnosis could be made, while a category of unclassified polyp (UCP) was used for a case for which a participant could not make a final diagnosis, e.g., due to inadequate sampling or poor orientation [31, 32]. Diagnostic categories are presented in Fig. 1. The participants were requested to provide a diagnosis and to indicate which of the nine categories of diagnostic criteria, comprising epithelial serration (surface/upper crypt/lower crypt), crypt dilatation (upper/lower crypt), crypt architectural changes (horizontal crypts, crypt branching, inverted crypts), mitotic activity (upper/lower crypt), mature goblet cell distribution (upper/lower crypt), gastric epithelium (lower crypt), nuclear features (hyperchromasia, elongation, pseudostratification, vesicular nucleus, prominent nucleolus), cytoplasmic eosinophilia, and dysplasia (low/high grade), were present in the lesion. The final diagnosis (HP, SSA/P, TSA, MP, or UCP) was determined according to the diagnoses made by the majority of participants for each case.
https://static-content.springer.com/image/art%3A10.1007%2Fs00428-012-1319-7/MediaObjects/428_2012_1319_Fig1_HTML.gif
Fig. 1

SP categories including a HP (H&E, ×200), b SSA/P (H&E, ×200), c TSA (H&E, ×100), and d MP (H&E, ×100)
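The majority rule used to assign the final diagnosis can be sketched as follows. This is only an illustration: the function name, the example votes, and the handling of ties are assumptions, since the text does not state how a tie between categories would have been resolved, and "majority" is read here as the most frequent diagnosis.

```python
from collections import Counter


def majority_diagnosis(votes):
    """Return the most frequent diagnosis among the observers' votes.

    A tie for first place returns None, because the source does not
    describe a tie-breaking rule (assumption of this sketch).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical votes from 16 observers for a single case (not study data).
votes = ["SSA/P"] * 9 + ["HP"] * 5 + ["UCP"] * 2
print(majority_diagnosis(votes))
```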

The study was designed to allow an assessment of diagnostic reproducibility before and after introduction of the latest WHO classification [1]. The initial round of observations included 20 pathologists and a total of 70 cases. To initiate this process, the group first evaluated 15 cases of SPs retrieved from the files of the Department of Pathology, Ankara University Medical School, including round table discussions based on the initial observations and focusing on diagnostic criteria and terminology in order to establish a common language and a standardized diagnostic approach. A further 55 cases were then provided by three centers (Ankara, Graz, and Warsaw) to allow in-depth analysis of reproducibility and diagnostic criteria on a total of 70 cases, permitting reliable Kappa analysis. Subsequent to the publication of the 4th edition of WHO Classification of Tumors of the Digestive System [1], the same group (four members could not participate, leaving a total of 16 assessors) re-assessed the 70 case series using the WHO diagnostic categories and proposed criteria, which constituted the second round. The worksheets for the second round contained the initial diagnostic categories of HP, SSA/P, TSA, and MP in order to allow comparisons between the two rounds, but the participants were informed that the category of MP in their previous assessment corresponded to the new category of SSA/P with dysplasia, in line with the WHO classification. The participants were requested to diagnose such cases as SSA/P with dysplasia, including a qualification of the degree of dysplasia (low or high grade). Dysplasia was not further classified as adenomatous or serrated, although nuclear features comprising hyperchromasia, elongation, and pseudostratification are considered as diagnostic for “adenomatous” dysplasia, while vesicular nucleus, prominent nucleolus, and cytoplasmic eosinophilia are indicative of “serrated” dysplasia according to the WHO [1].

To define the most reproducible and discriminative criteria for each diagnostic category of SPs, all participants recorded on the Excel sheets which of the predefined diagnostic criteria they regarded as the most relevant.

Statistical analysis

Kappa statistics were computed in the Department of Biostatistics, Ankara University Medical School, by the same biostatistician (NK) for each round using SPSS for Windows 15.0, covering paired (between two observers) and overall (group) inter-observer agreement as well as intra-observer agreement. Comparisons between the initial and the second round were made using the Wilcoxon signed-rank test. Analysis of the criteria was performed using chi-square tests and by means of the proportions of positive judgements method for agreement levels [33]. Kappa values were grouped as poor (<0.2), fair (0.21–0.40), moderate (0.41–0.60), good (0.61–0.80), and perfect (>0.80). A p value of less than 0.05 was considered significant.
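For readers unfamiliar with the paired statistic, the following is a minimal sketch of Cohen's kappa for two observers, written in plain Python rather than SPSS. The function name and the example diagnoses are hypothetical; the study's own computations were done in SPSS and are not reproduced here.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e).

    p_o is the observed proportion of agreement and p_e the agreement
    expected by chance from each rater's marginal frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if p_e == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical diagnoses of ten cases by two observers (not study data).
obs_1 = ["HP", "HP", "SSA/P", "SSA/P", "TSA", "HP", "SSA/P", "HP", "TSA", "HP"]
obs_2 = ["HP", "SSA/P", "SSA/P", "SSA/P", "TSA", "HP", "HP", "HP", "TSA", "HP"]
print(round(cohen_kappa(obs_1, obs_2), 3))
```

In this made-up example the observers agree on 8 of 10 cases (p_o = 0.80) while chance agreement is 0.38, so kappa is 0.42/0.62, which falls in the "good" band of the scale above.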

Results

Initial round: how reproducible are the diagnoses of SPs?

In the initial round, 44 % (n = 31) of the 70 cases were diagnosed as HP, 40 % (n = 28) as SSA/P, 10 % (n = 7) as TSA, and 6 % (n = 4) as MP by the majority of the observers.

The overall kappa value for the first 15 cases was poor with a kappa value of 0.202 (CI lower 0.147–CI upper 0.256; p < 0.001), while overall kappa values for each diagnostic category were 0.315 for HP, 0.223 for SSA/P, 0.181 for TSA (p < 0.001), and 0.107 for MP (p > 0.05).

Following consensus discussions, on the additional 55 cases a fair inter-observer agreement was achieved with an overall kappa value of 0.349 (CI lower 0.320–CI upper 0.377; p < 0.001). Kappa values for each diagnostic category were 0.443 for HP, 0.323 for SSA/P, 0.512 for TSA (p < 0.001), and 0.235 for MP (p = 0.01), respectively.

Finally, on the full 70-case set, a fair overall kappa of 0.318 (CI lower 0.293–CI upper 0.343; p < 0.001) was reached; for the individual diagnostic categories, kappa values were fair to moderate: 0.415 for HP, 0.301 for SSA/P, 0.433 for TSA (p < 0.001), and 0.221 for MP (p = 0.01).

Second round: the impact of WHO classification on reproducibility

In the second round, a re-evaluation of all 70 cases was performed by 16 of the initial 20 participants using WHO criteria. A diagnosis of HP was made in 43 % (n = 30), SSA/P in 46 % (n = 32), TSA in 10 % (n = 7), and MP in 1 % (n = 1) of the cases. One HP was diagnosed as SSA/P and three MPs as SSA/P with dysplasia, while the diagnoses of 30 HPs, 28 SSA/Ps, seven TSAs, and one MP remained unchanged (Fig. 1d). No case was classified as UCP by the majority in either round, but 49 % (n = 33) of the cases received this diagnosis from one to three observers.

Overall kappa value was moderate (kappa = 0.557; CI lower 0.532–CI upper 0.612, p < 0.001), while kappa values were significantly higher than the previous observations for each diagnostic category: 0.977 for HP, 0.912 for SSA/P, 0.845 for TSA, and 0.158 for MP (p < 0.001). Paired and overall agreements for all rounds and diagnostic categories are summarized in Table 1.
Table 1

Paired and overall agreement for all cases and for each diagnostic category

                              All cases   HP          SSA/P       TSA         MP
Initial round (total, n = 70)
  Overall kappa               0.318       0.415       0.301       0.433       0.221
  Paired kappa (min–max)      0.03–0.93   0.08–0.75   0.05–0.82   0.09–0.93   0.03–0.88
  p value                     <0.001      <0.001      <0.001      <0.001      0.014
Second round (total, n = 70)
  Overall kappa               0.557       0.977       0.912       0.845       0.158
  Paired kappa (min–max)      0.09–1.00   0.11–1.00   0.09–1.00   0.21–1.00   0.09–1.00
  p value                     <0.001      <0.001      <0.001      <0.001      0.014
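The text does not say which multi-rater statistic SPSS produced for the overall (group) kappa. One common choice for many raters and nominal categories is Fleiss' kappa, sketched below purely as an assumption to show how a group-level value can be obtained; the function name and the toy ratings are hypothetical and are not the study data.

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of cases, each rated by the same number of raters.

    ratings: list of per-case lists holding one category label per rater.
    categories: all labels that may occur.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    # counts[i][j]: number of raters who assigned case i to category j
    counts = [[case.count(c) for c in categories] for case in ratings]
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("a rating uses a label missing from `categories`")
    # Mean per-case agreement (P-bar)
    p_bar = sum(
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_cases
    # Chance agreement (P_e) from the overall category proportions
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals)
    if p_e == 1.0:
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


# Toy example: three raters, two cases (hypothetical, not study data).
toy = [["HP", "HP", "SSA/P"], ["SSA/P", "SSA/P", "SSA/P"]]
print(fleiss_kappa(toy, ["HP", "SSA/P", "TSA", "MP"]))
```

For the toy input, mean per-case agreement is 2/3 and chance agreement is 5/9, giving a kappa of 0.25.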

Intra-observer agreement for each diagnostic category was evaluated with respect to the initial and final rounds of assessments made by each of the 16 observers for the total of 70 cases. Kappa values ranged between 0.013 and 1.00 (Table 2). The overall agreement between the initial and second round of diagnoses for all cases was 0.79 (p < 0.0001), while the kappa value was 0.855 for HP, 0.767 for SSA/P, 0.841 for TSA, and 0.386 for MP (p < 0.001). Kappa analysis was not performed for UCP as this cannot be considered as a diagnostic category. Examples of cases which were renamed after the final round are presented in Fig. 2.
Table 2

Intra-observer agreement between the initial and final rounds of assessments on 70 cases

Observer

HP

SSA/P

TSA

MP

1

0.550a

0.573a

0.297b

1.00c

2

0.826c

0.750d

0.628d

0.665d

3

0.492a

0.295b

0.716d

4

0.424a

0.378b

0.237b

5

0.672d

0.746d

0.530a

0.203e

6

1.00c

1.00c

1.00c

1.00c

7

0.453a

0.557a

0.480a

0.800d

8

0.659d

0.621d

0.925c

0.549a

9

0.632d

0.465a

0.497a

0.379b

10

0.909c

0.857c

0.873c

1.00c

11

0.013e

0.287b

0.156e

0.071e

12

0.733d

0.576a

0.647d

0.488a

13

0.726d

0.549a

0.573a

14

0.603a

0.627d

0.056e

15

0.672d

0.746d

0.531a

0.245b

16

0.822c

0.793d

1.00c

1.00c

a Moderate; b Fair; c Perfect; d Good; e Poor

https://static-content.springer.com/image/art%3A10.1007%2Fs00428-012-1319-7/MediaObjects/428_2012_1319_Fig2_HTML.gif
Fig. 2

Examples of cases which were renamed after the final round. a HP renamed as SSA/P (H&E, ×200), b HP renamed as SSA/P (H&E, ×200), c SSA/P renamed as TSA (H&E, ×100), d MP renamed as SSA/P with dysplasia (H&E, ×100)

How reproducible and discriminative are the histologic criteria?

Firstly, the analysis of histological criteria revealed perfect and near-perfect agreement on epithelial serration in the upper crypts (kappa = 0.97), serration of the surface epithelium (kappa = 0.83), mitosis in the lower crypts (kappa = 0.79), goblet cells in the upper crypts (kappa = 0.77), and dilatation of upper crypts (kappa = 0.72) in contrast to crypt architectural features, which yielded fair to moderate agreement and to nuclear features that resulted in only fair agreement between the observers (Table 3).
Table 3

Agreement of observers for diagnostic criteria

Diagnostic criteria                     Kappa value
Epithelial serration in upper crypts    0.9661
Surface epithelial serration            0.8328
Mitosis in lower crypts                 0.7928
Goblet cells in upper crypts            0.7666
Dilatation in upper crypts              0.7189
Dilatation in lower crypts              0.5658
Goblet cells in lower crypts            0.4841
Epithelial serration in lower crypts    0.4307
Crypt branching                         0.4165
Nuclear elongation                      0.3780
Nuclear pseudostratification            0.3719
Horizontal crypts                       0.3671
Nuclear hyperchromasia                  0.3553
Cytoplasmic eosinophilia                0.3370
Prominent nucleolus                     0.3258
Vesicular nucleus                       0.3217
Dysplasia, low grade                    0.2631
Mitosis in upper crypts                 0.2533
Ectopic crypts                          0.2104
Gastric epithelium in lower crypts      0.2044
Inverted crypts                         0.1745
Dysplasia, high grade                   0.0405
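To relate the values in Table 3 to the agreement bands defined under Statistical analysis, a small helper can be sketched as below. Two points are assumptions of this sketch rather than statements from the text: kappa is rounded to two decimals before banding, and a rounded value of 0.20 counts as poor, which matches how 0.203 is labelled in Table 2. The kappa values themselves are copied from Table 3.

```python
def agreement_band(kappa):
    """Map a kappa value to the bands used in the study.

    Assumption: values are rounded to two decimals first, and 0.20 is
    treated as poor (the stated scale leaves 0.20-0.21 undefined).
    """
    k = round(kappa, 2)
    if k <= 0.20:
        return "poor"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "good"
    return "perfect"


# A selection of criteria and kappa values taken from Table 3.
table_3 = {
    "Epithelial serration in upper crypts": 0.9661,
    "Mitosis in lower crypts": 0.7928,
    "Crypt branching": 0.4165,
    "Horizontal crypts": 0.3671,
    "Ectopic crypts": 0.2104,
    "Gastric epithelium in lower crypts": 0.2044,
    "Dysplasia, high grade": 0.0405,
}

for criterion, kappa in table_3.items():
    print(f"{criterion}: {kappa:.4f} ({agreement_band(kappa)})")
```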

Secondly, the association of the criteria to each diagnostic category was analyzed by recording the frequencies of the criteria selected by the observers for each diagnostic category in every case using proportions of positive judgements method [33]. The presence of dilatation, mature goblet cells in upper crypts (p = 0.01), and epithelial serration (p < 0.05) were significantly associated with a diagnosis of HP, while epithelial serration, crypt dilatation, presence of mature goblet cells in lower crypts (p < 0.001), crypt architectural changes (p < 0.001), vesicular nucleus (p < 0.001), and prominent nucleolus (p < 0.01) were significantly associated with a diagnosis of SSA/P. TSA, on the other hand, was determined by the presence of epithelial serration in lower crypts (p < 0.001), mitotic activity in upper crypts (p < 0.001), nuclear features of atypia (p < 0.001), cytoplasmic eosinophilia (p < 0.001), presence of low-grade dysplasia (p < 0.001), crypt branching (p < 0.01), ectopic crypts (p < 0.05), and gastric epithelium in lower crypts (p < 0.05) (Table 4). Most discriminatory criteria for each diagnostic category are presented in Fig. 3. Low-grade dysplasia was diagnosed in 3.2 % of HP, 24.4 % of SSA/P, 81.3 % of TSA, 22.4 % of MP, and 27.3 % of UCP, while high-grade dysplasia was found in 2.2 % of SSA/P, 3.1 % of TSA, 22.5 % of MP, and 5.5 % of UCP. Among SSA/Ps with dysplasia, 69.8 % showed “serrated” dysplasia, while 30.2 % had “adenomatous” dysplasia as determined by the associated cytological features. Adenomatous dysplasia was more frequent in TSAs (59.7 %) in comparison to “serrated” dysplasia (40.3 %), similar to MPs of which 26.1 % had “adenomatous” and 17.96 % had “serrated” dysplasia.
Table 4

Diagnostic criteria discriminative for each SP category

Criteria                      HP (%)      SSA/P (%)   TSA (%)     MP (%)
Epithelial serration
  Upper crypt                 98.8*       95.75       94.05       99.5
  Lower crypt                 25.0        70.1***     64.2***     43.4
Dilatation
  Upper crypt                 64.8*       72.2        69.6        81.9
  Lower crypt                 23.5        88.75***    62.55       77.5
Architectural features
  Horizontal crypts           4.0         67.5***     31.4        40.5
  Crypt branching             8.5         62.95***    52.3**      66.4
  Inverted crypts             7.0         24.8        11.6        14.5
  Ectopic crypts              4.0         20.05       88.6*       12.8
Mitosis
  Upper crypt                 5.0         18.15       63.8***     43.45
  Lower crypt                 68.15       78.4        77.3        79.5
Mature goblet cell
  Upper crypt                 77.4**      76.85       71.55       82.4
  Lower crypt                 40.1        57.7***     38.2        63.45
Gastric epithelium            14.4        22.15       22.0*       25.8
Nuclear features
  Hyperchromasia              18.0        39.0        83.6***     76.2***
  Elongation                  11.1        42.35       86.85***    83.25***
  Pseudostratification        17.5        40.6        90.85***    88.5***
  Vesicular nucleus           18.1        43.7***     45.4        53.4
  Prominent nucleolus         21.5        45.45***    67.2        64.7
Cytoplasmic eosinophilia      10.3        23.25       78.35***    46.8
Dysplasia
  Low grade                   3.3         24.45       81.85***    72.4***
  High grade                  0           1.56        3.1         22.55**

*p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001

https://static-content.springer.com/image/art%3A10.1007%2Fs00428-012-1319-7/MediaObjects/428_2012_1319_Fig3_HTML.gif
Fig. 3

Most discriminatory criteria for each diagnostic category. a, b Serration, dilatation, and mature goblet cells in upper crypts in HP (H&E, ×100 and × 200, respectively). c, d Serration, dilatation, horizontal shape in lower crypts and vesicular nuclei with nucleoli in SSA/P (H&E, ×200 and × 400, respectively). e, f Ectopic crypts, cytoplasmic eosinophilia, pseudostratification, nuclear elongation and hyperchromasia in TSA (H&E, ×100 and × 400, respectively)
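The association between a criterion and a diagnostic category was tested with chi-square tests. As a rough illustration of that kind of test, the sketch below computes a Pearson chi-square (without continuity correction) for a 2×2 table of criterion present/absent against two diagnoses. The counts are invented for the example, the function name is hypothetical, and the exact table layout used by the authors is not given in the text.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p). With one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts of observer judgements (not study data):
# rows = diagnosis (SSA/P, HP), columns = horizontal crypts present / absent.
chi2, p = chi_square_2x2(20, 5, 5, 20)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

With these made-up counts the statistic is 18.0, well beyond the threshold for p < 0.001 at one degree of freedom.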

Discussion

This study presents a European initiative aiming to improve the reproducibility of histological diagnoses of SPs. The poor overall inter-observer agreement for 15 cases studied initially led us to repeat the assessment after a consensus discussion on 70 cases, which improved the agreement from poor to fair, highlighting the significance of a standardized approach supported by a consensus discussion.

Introduction of the WHO classification through a second consensus meeting led to a further improvement in overall agreement for the whole group, and perfect agreement was achieved for the categories HP, SSA/P, and TSA.

Few studies [22–30] have addressed the question of reproducibility in the diagnosis of SPs (Table 5). Those that evaluated inter-observer agreement had a major drawback, as they used the term “serrated adenoma” for all serrated polyps except HP, possibly because the criteria available at the time of publication to distinguish SSA from TSA were poorly defined [22, 24]. However, Farris et al. [23], in an approach similar to ours, evaluated the diagnostic concordance of five GI pathologists on 185 SPs comprising 92 HPs, 74 SSAs, and 19 TSAs. The observers were provided with a list of criteria for SSA and TSA in the first round, which revealed moderate overall agreement, while clinical information including size and localization of the polyps caused no change in the level of agreement in the second round. Following a consensus conference, the agreement was evaluated on a different but smaller series and revealed moderate agreement for HP and SSA and near-perfect agreement for TSA. Their study design is similar to ours with respect to the consecutive rounds of evaluation followed by consensus discussions to achieve the best possible agreement between the observers. In contrast to our study, they did not evaluate the reproducibility of the diagnostic criteria.
Table 5

Summary of publications on reproducibility of the diagnoses of SPs

Publication (reference number)      Cases                                                                                             Results
Bariol et al. 2003 [22]             255 polyps (72 HPs, 9 SAs, 4 MPs, 170 conventional adenomas), 2 observers                         Kappa for diagnostic criteria ranging from −0.029 to 0.852
Sandmeier et al. 2007 [27]          102 SPs (58 HPs, 7 SSAs, 5 TSAs, 3 MPs, 29 UCPs)                                                  Criteria for HP vs. SSA, no kappa
Glatz et al. 2007 [26]              20 SPs (8 HPs, 4 TSAs, 4 SSAs, 4 tubulovillous adenomas), 168 participants (Internet-based quiz)  High variation in SSA, no kappa
Farris et al. 2008 [23]             185 SPs (92 HPs, 74 SSAs, 19 TSAs), 5 observers                                                   Kappa = 0.55
Bustamente-Balen et al. 2009 [28]   195 SPs (187 HPs, 8 SAs), 2 observers                                                             Kappa = 0.14
Wong et al. 2009 [24]               60 polyps (26 SAs, 11 HPs, 6 MPs, 12 conventional adenomas, 5 other polyps), 4 observers          Kappa = 0.38
Khalid et al. 2009 [25]             40 SPs (comprised of HPs and SSAs and all originally diagnosed HPs), 3 observers                  Kappa = 0.16
Gunia et al. 2011 [29]              19 SPs (8 SSAs, 3 TSAs, 8 inflammatory polyps), 3 observers/trainees                              Kappa = 0.29–0.65
Present study                       70 SPs (28 HPs, 25 SSA/Ps, 11 TSAs, 6 MPs)                                                        Kappa = 0.557, 0.977 for HP, 0.912 for SSA/P, and 0.845 for TSA

Our study was conducted in a blind fashion regarding the size and localization of the polyps in order to analyze the morphological criteria in a more stringent way, devoid of any bias. It is, on the other hand, generally accepted that diagnosis of polyps with intermediate features that lie on a continuum between HP and SSA/P may require clinical information, and for such polyps, pathologists may be more inclined to make a diagnosis of SSA/P when the polyp is large and localized in the right colon. However, additional clinical information failed to have any effect on the diagnostic accuracy of the observers in two previous studies [23, 27]. We believe that the histopathological diagnosis of a polyp should not depend upon the size or the localization of the lesion. However, for unclassified non-dysplastic serrated polyps with intermediate features or in case of sampling error, pathologists need to know the size and localization of the lesion before making a diagnosis. Although a diagnosis of serrated polyp “unclassified” is recommended by WHO [1], improvement in our diagnostic skills together with the use of standardized criteria will help to better classify such borderline cases and avoid unnecessary utilization of the unclassified category.

The third issue that was raised by the initiative was to define the most reproducible and discriminative criteria for each type of serrated polyp. Although many publications utilize similar criteria in the diagnosis of SPs, there are considerable variations in histologic features as well as definitions and terminology in many others, particularly for SSA/P [4, 10, 18]. The first approach of standardization came from Bariol et al. [22], who evaluated the diagnostic utility of histologic criteria attributed to serrated adenomas in a series of cases including hyperplastic polyps, serrated adenomas, admixed polyps, and conventional adenomas. In their study, surface epithelial dysplasia, surface epithelial tufting, increased surface mitosis, and epithelial serration in more than 20–50 % of crypts were the criteria with the highest kappa values for the diagnosis of serrated adenoma. They did not, however, assess how consistently different observers could distinguish serrated adenoma from other polyps, nor did they further classify their cases as sessile and traditional serrated adenomas.

In a more recent attempt to distinguish HP from SSA, Sandmeier and colleagues [27] assessed 102 SPs and tested the reproducibility of Snover’s criteria [5]. They, too, blinded the observers to the clinical information in order to avoid any bias in the histopathological interpretation. As in our study, they found architectural changes together with the various cell types including goblet cells, undifferentiated cells, and gastric foveolar cells in basal crypts to be the most useful criteria to distinguish SSA from HP. Also, Farris et al. [23] concluded that architectural rather than cytological features are most helpful in distinguishing SSA from HP, whereas cytological features like nuclear elongation and stratification are more useful in distinguishing TSA from other serrated lesions. Of the WHO criteria, the most discriminatory for HP were serration, dilatation, and presence of mature goblet cells in the upper crypts, whereas the presence of these features in the lower crypt zone was significantly consistent with a diagnosis of SSA/P together with the architectural features, such as horizontal crypts and crypt branching, and nuclear features, including vesicular nucleus and prominent nucleolus. TSA, on the other hand, was associated with ectopic crypts and crypt branching as well as nuclear features comprising hyperchromasia, elongation, pseudostratification, and cytoplasmic eosinophilia. These findings confirm that architectural rather than cytologic features are diagnostically useful, and the agreement data (Table 3) indicate that crypt architectural criteria, with fair to moderate agreement, were at least as reproducible as nuclear features, which reached only fair agreement.

On similar grounds, the Working Group on Gastrointestinal Pathology of the German Society of Pathology [34] proposed that the SSA/P architectural features should be present in at least two different crypts, not necessarily adjacent, although the new WHO definition of SSA/P requires two or three contiguous crypts with these features [1]. There is no evidence base for either view, but it is safe to postulate that several crypts close together should be a minimal criterion, which was, however, not assessed in our study.

Two types of dysplasia have been observed in SSA/Ps: “adenomatous” dysplasia and “serrated” dysplasia, the latter characterized by round cells with eosinophilic cytoplasm, vesicular nuclei, and prominent nucleoli [1, 20, 35]. SSA/Ps with dysplasia were probably classified as “mixed polyps” in the past [2, 5, 6, 32, 36], a term used for lesions with distinct foci of adenomatous epithelium and hyperplastic/serrated architecture. WHO [1] recommends the term “SSA/P with dysplasia” in order to emphasize that the dysplastic part of the lesion never shows APC mutations as found in adenomatous epithelium but rather presents with MSI resulting from methylation of MLH-1. In our study, a small number of cases, almost half of which showed dysplastic foci, were diagnosed as MP in the first round, whereas a diagnosis of SSA/P with high-grade dysplasia was made for these cases in the second round. Surprisingly, however, there was one particular case which was persistently diagnosed as MP in both rounds by the majority of the participants who apparently felt that dysplastic SSA/P category does not always coincide with polyps showing truly and distinctly mixed features. For the latter ones, a category of MP would seem legitimate.

In the present study, although dysplasia was graded as low and high without further classification into adenomatous or serrated types, the histological criteria used in the study already comprised features of serrated dysplasia such as vesicular nuclei, prominent nucleoli, and eosinophilic cytoplasm, as well as features of adenomatous dysplasia characterized by pseudostratification, nuclear hyperchromasia, and elongation [1]. An evaluation of these criteria demonstrated that the majority of SSA/Ps possessed features of “serrated” dysplasia. Although their original description involves dysplastic epithelium, TSAs lack mitotic activity in the tall columnar epithelial cells with penicillate nuclei and eosinophilic cytoplasm and are therefore not truly dysplastic in the sense of tubular or villous adenomas [1]. The high rate of low-grade dysplasia in the TSA group suggests that the epithelial lining of TSAs was misinterpreted as low-grade dysplastic by the majority of the observers due to its resemblance to adenomatous epithelium. The low-grade dysplasia observed in a small percentage of HPs seems to correspond to a mucin-poor variant of HP presenting with regenerative epithelium that can be misinterpreted as dysplasia.

In conclusion, the results of the study show that consensus discussions on a sufficiently large SP collection improve inter-observer agreement, which was further improved when the new WHO classification was introduced. Furthermore, architectural criteria appear to be the most reliable for an accurate diagnosis of an SP. However, even when a consensus classification such as that provided by the WHO is applied, the reproducibility of the histopathological diagnosis of an SP remains imperfect.

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012