PD-L1 expression in urothelial bladder cancer varies more among specimen types than between companion assays

Urothelial bladder cancer (UBC) patients ineligible for platinum-based chemotherapy can be treated with immune-checkpoint inhibitors (ICI) if their tumour is Programmed Death Ligand 1 (PD-L1) positive. Although different PD-L1 assays show concordance, little is known about PD-L1 expression variability in matched UBC samples. We compared PD-L1 expression in whole slides of matched transurethral resections (TURBT), radical cystectomies (RC), and lymph node metastases (LN). Immunohistochemistry using the VENTANA PD-L1 (SP263) assay was performed on specimens from 115 patients and scored positive if expression occurred in ≥25% immune cells (IC), ≥25% tumour cells (TC), or both. PD-L1 was positive in 42.7% TURBT, 39.8% RC, and 27.3% LN specimens. Concordance was moderate (κ=0.52; P<0.001) between TURBT and RC, and fair between LN and TURBT (κ=0.31; P=0.048) or RC (κ=0.25; P=0.075). Comparison with the VENTANA PD-L1 (SP142) assay, which had been performed previously on the same cohort, showed moderate to substantial inter-assay agreement (κ=0.42–0.66). Although TC staining is not part of the SP142 scoring algorithm, discordant PD-L1 assay outcome could be attributed to SP263 TC≥25% staining in only 41% of cases. These results show that PD-L1 expression variability between matched specimens is higher than that between individual assays. Optimal specimen determination for PD-L1 testing needs to be addressed in future studies.
Supplementary Information: The online version contains supplementary material available at 10.1007/s00428-021-03094-6.


Introduction
With a global annual incidence of 430,000 patients, bladder cancer is the fourth and tenth most common cancer in men and women, respectively [1]. Of these patients, approximately 25% present with muscle-invasive bladder cancer (MIBC). According to American Urological Association (AUA) guidelines, neo-adjuvant cisplatin-based chemotherapy (NAC) followed by radical cystectomy (RC) is the recommended treatment for MIBC [2]. Despite this aggressive treatment regimen, the 5-year overall survival of MIBC patients is only 55%. Importantly, the overall incidence and mortality rates have changed little in the past decades.
Many immunotherapeutic agents targeting the Programmed cell Death 1 (PD-1) receptor and its ligand PD-L1 are currently being tested in clinical trials, offering new opportunities for the treatment of advanced urothelial cancer patients [3]. In recent studies, higher response rates were observed in patients with high expression of PD-L1 in tumour tissue. Consequently, first-line use of atezolizumab and pembrolizumab for patients ineligible for cisplatin-based chemotherapy has been restricted to PD-L1 positive tumours [4].
Although immunohistochemical PD-L1 expression may serve as a measure of the effectiveness of immune-checkpoint inhibitors (ICI), some studies did not show significant predictive value among PD-L1 subgroups [5,6]. Furthermore, the use of different PD-L1 companion diagnostic tests, scoring algorithms, and cut-off points has raised the question of how to implement immunohistochemical assays in clinical practice. Multiple studies have shown overall good concordance between different PD-L1 assays [7-9]. However, little is known about the variability of PD-L1 expression among different tumour tissues from individual patients [10,11]. Previously, we found poor concordance of PD-L1 expression in urothelial cancer as determined in matched transurethral resection of the bladder tumour (TURBT), RC, and lymph node metastasis (LN) using the VENTANA PD-L1 (SP142) assay [12]. The PD-L1 (SP142) assay has been used as companion diagnostic for atezolizumab and is based on PD-L1 expression on tumour-associated immune cells (IC) only. Since IC reflect an inflammatory reaction to genomically aberrant tumour cells (TC), but are not considered genetically changed themselves, we hypothesised that a PD-L1 assay taking TC expression into account would be more stable among matched urothelial cancer specimens. The VENTANA PD-L1 (SP263) assay, which is used as companion diagnostic for durvalumab, takes into account PD-L1 expression on both TC and IC, and has, like the SP142 assay, been developed for a VENTANA BenchMark ULTRA platform. The objective of this study was to determine the concordance of PD-L1 expression using the SP263 assay in matched TURBT, RC, and LN samples, and to compare its outcome with that of the SP142 assay, which had been performed previously on the same population [12].

Patient selection and pathological review
In total, we included 115 patients who underwent RC with bilateral pelvic lymph node dissection (PLND) for cT2-T4aN0N1M0 viable urothelial carcinoma of the bladder, at the Erasmus MC University Medical Centre, Rotterdam, The Netherlands, between 1998 and 2017. Thirty-five patients had received neo-adjuvant therapy before surgery. We selected cases for the availability of matched TURBT, RC, and/or LN specimens, as we aimed to investigate PD-L1 assay performance in matched primary bladder cancer and metastases. The use of patient material for scientific purposes was approved by the local Medical Research Ethics Committee (Rotterdam, Netherlands, MEC-2014-553). All haematoxylin and eosin (HE) slides were reviewed by a genitourinary pathologist (GvL), who recorded the following: grade (WHO 1973 and 2016), pT stage (TNM 8th edition), surgical margin status, presence of carcinoma in situ (CIS), vascular invasion (VI), and presence of variant histology (squamous, glandular, neuroendocrine, sarcomatoid) [12]. In the present study, whole tissue slides were newly stained with the PD-L1 SP263 assay and compared to PD-L1 SP142 stainings, which had been performed on the same slides and published previously [12].

Immunohistochemistry and scoring
Four-micron consecutive sections were cut from representative formalin-fixed, paraffin-embedded (FFPE) diagnostic tissue blocks, mounted on adhesive glass slides and stained for PD-L1 using the SP263 and SP142 assays on the VENTANA BenchMark ULTRA platform, according to the manufacturer's protocols (Ventana Medical Systems, Tucson, AZ, USA) [6,13-16]. For both assays, PD-L1 staining on IC, including lymphocytes, macrophages, histiocytes, reticular dendritic cells, plasma cells, and neutrophils, was scored within the tumour reactive stroma, between the tumour islands and invading the tumour border. Samples stained with SP142 are considered positive if PD-L1 expression in IC covers ≥5% of the tumour area (IC≥5%). Samples stained with SP263 are positive if PD-L1 expression occurs in ≥25% of either IC or TC (IC≥25% and/or TC≥25%) (Fig. 1). Full details of the VENTANA SP263 and SP142 PD-L1 assay evaluation and scoring algorithms are provided in the manufacturer's manuals [15,16]. PD-L1 expression was scored by one pathologist (GvL) with experience in PD-L1 assay assessment [17]. PD-L1 staining with the SP142 assay had been performed on consecutive slides of the same cohort and was used in the current study for comparison with the SP263 assay [12].
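The binary calls described above can be expressed compactly. The following Python sketch is illustrative only and is not part of the study workflow: the function and constant names are hypothetical, and the cut-offs are taken as stated in this section, whereas the complete scoring algorithms are those in the manufacturer's manuals [15,16].

```python
# Illustrative sketch of the positivity rules as stated in this section.
# All names are hypothetical; the full algorithms are in the manufacturer's manuals.

SP142_IC_CUTOFF = 5.0   # percent of tumour area covered by PD-L1-positive IC
SP263_CUTOFF = 25.0     # percent, applied to IC and to TC separately


def sp142_positive(ic_percent: float) -> bool:
    """SP142 call: positive if IC staining covers >=5% of the tumour area."""
    return ic_percent >= SP142_IC_CUTOFF


def sp263_positive(ic_percent: float, tc_percent: float) -> bool:
    """SP263 call: positive if IC >=25%, TC >=25%, or both."""
    return ic_percent >= SP263_CUTOFF or tc_percent >= SP263_CUTOFF


if __name__ == "__main__":
    # Invented example values, not study data.
    print(sp142_positive(ic_percent=3.0))
    print(sp263_positive(ic_percent=10.0, tc_percent=30.0))
```

In this sketch, a specimen scored as IC 10% and TC 30% with SP263 would be positive on the basis of TC alone, whereas the SP142 rule has no TC component.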

Patient characteristics
The clinicopathological patient characteristics at time of RC of the 115 patients are summarised in Table 1. Median patient age at time of RC was 65.7 years (interquartile range (IQR) 57.9-72.3 years). All patients had undergone TURBT for MIBC (≥pT2). In total 109/115 (94.8%) patients had undergone cystectomy; in 6 patients cystectomy was omitted because of intra-operative identification of lymph node metastasis. Thirty-five (30.4%) patients had received pre-operative neo-adjuvant therapy, including chemotherapy (n=27), radiation (n=6), or chemoradiation (n=2). PLND was performed in 109 (94.8%) patients of whom 57 (52.3%) had lymph node metastasis. Perivesical lymph nodes (PVLN) were identified in 32 patients and were positive in 11 (34.4%) cases. In 3 patients, metastases were present in perivesical but not pelvic lymph nodes, resulting in a total of 60 (55%) patients with metastatic disease at time of operation.

Discussion
Since PD-L1 testing is required for starting first-line treatment with atezolizumab and pembrolizumab in urothelial carcinoma, it is important to elucidate which assay and specimen type are most representative for prediction of therapeutic response. While several studies have indicated overall inter-assay concordance rates of 59-93% [7], little is known about PD-L1 variability among matched tumour specimens [10,11]. In this study, we found that PD-L1 status using the SP263 assay on whole tissue sections showed moderate agreement (κ=0.52) between TURBT and RC specimens, and fair agreement (κ=0.25-0.31) between both specimens and LN metastasis. In RC specimens, PD-L1 expression in TC was more stable than in IC, as 75% and 50% of RC specimens with TC≥25% also had positive TC PD-L1 status in TURBT and LN, respectively, as compared to 47% and 10% of cases with IC≥25%. Furthermore, use of the SP263 assay resulted in more frequent positive PD-L1 status than the SP142 assay, with moderate to substantial inter-assay agreement (κ=0.42-0.66). Although TC staining is not part of the SP142 scoring algorithm, discordant PD-L1 assay outcome could be attributed to SP263 TC≥25% staining in only 41% of cases. Concordance of PD-L1 (SP263) expression between specimens of the same patient (κ=0.25-0.52) was lower than that between the SP263 and SP142 assays (κ=0.42-0.66). Therefore, PD-L1 expression varies more among matched specimen types than between individual assays.
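Agreement in this study is expressed as Cohen's κ, which corrects the observed proportion of concordant calls for the agreement expected by chance. The Python sketch below shows how such a value can be derived from paired binary PD-L1 calls. It is a minimal illustration under stated assumptions: the function names are hypothetical, the example counts are invented and are not study data, and the verbal bands follow the widely used Landis and Koch convention, which is consistent with the terms used here (fair, moderate, substantial) but is not stated in this section.

```python
# Minimal sketch of Cohen's kappa for paired binary calls (e.g. TURBT vs RC).
# kappa = (p_o - p_e) / (1 - p_e), with p_o the observed agreement and
# p_e the agreement expected by chance from the marginal frequencies.


def cohen_kappa(calls_a, calls_b):
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("need two non-empty call lists of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    labels = set(calls_a) | set(calls_b)
    expected = sum(
        (calls_a.count(label) / n) * (calls_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one label occurs
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Verbal label per the Landis and Koch convention (an assumption here)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Invented counts: 30 both positive, 10 + 10 discordant, 50 both negative.
    specimen_a = [True] * 30 + [True] * 10 + [False] * 10 + [False] * 50
    specimen_b = [True] * 30 + [False] * 10 + [True] * 10 + [False] * 50
    kappa = cohen_kappa(specimen_a, specimen_b)
    print(round(kappa, 2), agreement_band(kappa))
```

With these invented counts, 80% of calls are concordant, yet κ is markedly lower because about half of that agreement would be expected by chance alone. This is why concordance rates and κ values should not be compared directly.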
While various studies have investigated the performance of different PD-L1 assays in urothelial cancer, it is not yet known which tissue specimen, sampling technique, and time of sampling are most representative for determination of PD-L1 status. Clinical trials have used a broad range of archival specimens for PD-L1 immunohistochemistry, including biopsies and excisions of primary and metastatic sites, before and after (neo-)adjuvant chemotherapy. The impact of this variability has only rarely been investigated. Within the second-line atezolizumab trial using SP142 as companion diagnostic, Rosenberg et al. reported that PD-L1 expression was higher in resection specimens (39%) and TURBT (34%) than in primary lesion biopsies (17%) or metastasis (8%) [13]. In the current study, we showed fair to moderate agreement (κ=0.25-0.52) of PD-L1 outcome using the SP263 assay in matched urothelial cancer specimens, which is higher than the agreement we reported previously for the SP142 assay (κ=0.05-0.35) [12]. A low agreement rate between primary and metastatic lesions was also found for the SP142 assay by Burgess et al. (κ=0.086) [11]. The higher PD-L1 concordance rate between matched TURBT, RC, and LN specimens found for the SP263 than for the SP142 assay suggests that SP263 is a more robust assay for PD-L1 assessment. At this moment, however, it is unclear which specimen type is most representative for PD-L1 assessment. Since SP263 shows less variability between specimen types of the same patient, its use is preferred over SP142 for urothelial carcinoma. Only if multiple specimens are tested for PD-L1 expression status in patients who are treated with ICI will it become clear which specimen type is most representative for predicting response.
Fig. 2 Intra- and inter-assay agreement of PD-L1 expression for matched TURBT, cystectomy, and LN+ specimens
Fig. 3 Modified Venn diagrams for the inter-assay agreement of both PD-L1 assay scores in matched TURBT, cystectomy, and LN+ specimens
In this study, we found moderate to substantial agreement (κ=0.42-0.66) between the SP263 and SP142 assays in whole tissue sections of urothelial bladder cancer. This is in line with the agreement (κ=0.582) observed in our previous study, in which we performed a tissue microarray (TMA)-based inter-assay concordance study of another urothelial cancer cohort [17]. Most concordance studies have been performed on TMAs, which are not used for PD-L1 expression analysis in clinical practice, and it might be questioned to what extent these are representative of whole tissue sections. Wang et al. found moderate concordance between matched TMA and whole tissue sections, with slightly higher agreement for the SP263 (κ=0.573) than for the SP142 (κ=0.493) assay [18]. In fact, most inter-assay agreement studies found the best concordance rates between the 22C3 and SP263 assays, while discrepant outcome was more frequent for SP142 [7,8,10,17]. Since PD-L1 expression status is increasingly required for urothelial cancer management, it is important to develop standards for its testing. Apart from specimen type, local availability of the immunohistochemical staining platform limits the choice of PD-L1 companion assay. Since the 22C3 and 28-8 assays have been developed and optimised for a DAKO staining platform, and the SP142 and SP263 assays for the VENTANA BenchMark ULTRA platform, companion diagnostic selection is highly dependent on the technical equipment present at the pathology department. Due to the high concordance rate of 22C3 and SP263, which both take TC and IC staining into account, these assays might serve as first choice depending on the platform present.
If a pathology laboratory has a DAKO staining platform available, 22C3 is applied together with its assay-specific scoring algorithm (combined positive score (CPS), IC and TC ≥10%). If a VENTANA BenchMark ULTRA platform is present, the SP263 assay can be applied as surrogate for 22C3 companion diagnostics. In the latter case, a relevant and as yet unanswered question is whether the SP263 assay staining should be evaluated according to its manufacturer's algorithm (IC or TC ≥25%) or the 22C3 companion algorithm (CPS>10) for pembrolizumab treatment selection. We also scored the SP263 stainings following the CPS algorithm and found substantial agreement (κ=0.68-0.73; concordance rate 84.2-88.6%; P<0.001) between the manufacturer's (IC or TC ≥25%) and 22C3 (CPS>10) algorithms (data not shown). Future studies should indicate which scoring algorithm is most predictive for pembrolizumab treatment selection using the SP263 assay.
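The CPS is not defined in this section. As commonly defined for the 22C3 assay, it is the number of PD-L1 staining cells (tumour cells, lymphocytes, and macrophages) divided by the total number of viable tumour cells, multiplied by 100 and capped at 100. The Python sketch below illustrates this calculation. The function names and cell counts are hypothetical, and the inclusive cut-off (CPS ≥10) is an assumption, since the text above gives the threshold both as ≥10% and as >10.

```python
# Illustrative sketch of the combined positive score (CPS).
# Names and example counts are hypothetical; the cut-off direction is an
# assumption because the text quotes both ">=10%" and ">10".


def combined_positive_score(positive_tc, positive_ic, viable_tc):
    """CPS = 100 * (PD-L1+ TC + PD-L1+ lymphocytes/macrophages) / viable TC, max 100."""
    if viable_tc <= 0:
        raise ValueError("at least one viable tumour cell is required")
    if positive_tc < 0 or positive_ic < 0:
        raise ValueError("cell counts cannot be negative")
    return min(100.0, 100.0 * (positive_tc + positive_ic) / viable_tc)


def cps_positive(cps, cutoff=10.0):
    """Binary call at the assumed inclusive cut-off."""
    return cps >= cutoff


if __name__ == "__main__":
    # Invented counts: 5 positive TC and 10 positive IC per 100 viable TC.
    score = combined_positive_score(positive_tc=5, positive_ic=10, viable_tc=100)
    print(score, cps_positive(score))
```

Because the CPS pools TC and IC staining into a single ratio, a specimen can reach the CPS cut-off while neither IC nor TC alone reaches 25%, which is one way in which the two algorithms applied to the same SP263 staining may disagree.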
The strength of this study was its use of whole tissue sections instead of TMA punches, as whole sections are also used in clinical practice. One disadvantage is the relatively low number of samples, specifically of metastatic sites. The study included specimens obtained over a long time period. Specimen age was, however, not significantly associated with the assay discordance rate, which argues against differences in epitope degradation as an explanation. Finally, as in other inter-assay concordance studies, none of the patients had actually been treated with ICI, so the sampling technique and assay most predictive of therapeutic response cannot be definitively determined.
In conclusion, we observed fair to moderate agreement of PD-L1 expression outcome using the SP263 assay in matched TURBT, RC, and LN urothelial cancer samples. In matched TURBT and RC specimens, IC more often had discordant PD-L1 expression status than TC. The SP263 assay resulted in more frequent positive PD-L1 outcome than the SP142 assay, with moderate to substantial inter-assay agreement. While the SP142 assay does not include TC staining, discordant PD-L1 outcome between assays was attributed to SP263 TC staining in only 41% of cases. Based on its higher level of concordance between matched specimen types, the SP263 assay seems to represent a more robust assay for PD-L1 assessment than the SP142 test. Overall, however, PD-L1 expression varied more between matched urothelial cancer specimens than between the two companion assays.