Volume 100, Issue 1, pp 65-70
Date: 04 Jul 2006

Cosmetic outcomes following breast conservation therapy: in search of a reliable scale

Multiple scales have been developed to evaluate breast cosmesis following breast conserving treatment (BCT); however, reliability remains a problem. Panel scores, in which scores from two or more individuals are combined, were assessed to examine their effect on reliability for two different cosmetic scales.


Women who were two or more years post-BCT were recruited from a single breast centre. Photographs of each participant were evaluated independently by six health care professionals on two separate occasions. A simple four-point scale and a more involved multi-item scale were used to assess cosmetic outcome. Reliability was assessed with the weighted kappa statistic for increasing panel sizes.
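The reliability calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code. The abstract does not state how individual scores were combined into a panel score or which weighting scheme was used for kappa, so the rounded mean in `panel_score` and the linear weights in `weighted_kappa` are assumptions, and both function names are hypothetical.

```python
from statistics import mean


def panel_score(ratings_by_rater):
    """Combine per-rater rating lists into one panel score per photograph.

    Assumption: the panel score is the mean of the raters' scores,
    rounded half up to the nearest category. The source does not
    specify the combination rule.
    """
    return [int(mean(scores) + 0.5) for scores in zip(*ratings_by_rater)]


def weighted_kappa(a, b, categories, weights="linear"):
    """Cohen's weighted kappa for two equal-length lists of ordinal ratings.

    `categories` lists the ordered scale values, e.g. [1, 2, 3, 4].
    `weights` is "linear" or "quadratic" (assumed; not stated in the source).
    """
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)

    # Observed joint proportions of (rating a, rating b).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[index[x]][index[y]] += 1.0 / n

    # Marginal proportions for each set of ratings.
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weight grows with the distance between categories.
    power = 1 if weights == "linear" else 2

    def w(i, j):
        return (abs(i - j) / (k - 1)) ** power

    d_obs = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if d_exp == 0:
        return 1.0  # all ratings fall in a single category on both sides
    return 1.0 - d_obs / d_exp
```

Under these assumptions, inter-rater reliability for a panel of size n would compare the panel scores of two disjoint groups of n raters, and intra-rater reliability would compare the same panel's scores across the two rating occasions.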


Ninety-nine women were evaluated. With increasing panel size, intra-rater reliability increased from 0.73 to 0.83 for the four-point scale; however, the 95% confidence intervals generally overlapped. A smaller and less predictable effect was seen for the multi-item subscale (range 0.69 to 0.73). Inter-rater reliability increased from 0.68 to 0.93 for the four-point scale and from 0.75 to 0.96 for the multi-item scale with increasing panel size; the 95% confidence intervals did not overlap. A panel of three provided almost perfect kappa values for either scale, with only small improvements from larger panels.
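The phrase "almost perfect" matches the benchmark labels proposed by Landis and Koch for interpreting kappa. Assuming that is the convention intended here (the abstract does not say so), the reported values can be mapped to descriptive bands with a small helper. The function name `kappa_band` is hypothetical.

```python
def kappa_band(kappa):
    """Return the Landis and Koch descriptive label for a kappa value.

    Assumption: the abstract's wording follows this convention.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"
```

On this reading, the single-evaluator inter-rater values of 0.68 and 0.75 fall in the "substantial" band, while the largest-panel values of 0.93 and 0.96 fall in the "almost perfect" band.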


Care should be taken in interpreting results where cosmetic outcomes have been obtained from a single evaluator. For the scales studied, panel scores can significantly improve inter-rater, but not intra-rater, reliability. Comparable reliability, combined with simplicity of use and interpretation, would favour the four-point scale over the multi-item scale for breast cosmetic evaluation.