PD-L1 immunohistochemistry in non-small-cell lung cancer: unraveling differences in staining concordance and interpretation

Programmed death ligand 1 (PD-L1) immunohistochemistry (IHC) is accepted as a predictive biomarker for the selection of immune checkpoint inhibitors. We evaluated the staining quality and estimation of the tumor proportion score (TPS) in non-small-cell lung cancer during two external quality assessment (EQA) schemes by the European Society of Pathology. Participants received two tissue micro-arrays with three (2017) and four (2018) cases for PD-L1 IHC and a positive tonsil control, for staining by their routine protocol. After the participants returned stained slides to the EQA coordination center, three pathologists assessed each slide and awarded an expert staining score from 1 to 5 points based on the staining concordance. Expert scores significantly (p < 0.01) improved between EQA schemes from 3.8 (n = 67) to 4.3 (n = 74) on 5 points. Participants used 32 different protocols: the majority applied the 22C3 (56.7%) (Dako), SP263 (19.1%) (Ventana), and E1L3N (Cell Signaling) (7.1%) clones. Staining artifacts consisted mainly of very weak or weak antigen demonstration (63.0%) or excessive background staining (19.8%). Participants using CE-IVD kits reached a higher score compared with those using laboratory-developed tests (LDTs) (p < 0.05), mainly attributed to a better concordance of SP263. The TPS was under- and over-estimated in 20/423 (4.7%) and 24/423 (5.7%) cases, respectively, correlating to a lower expert score. Additional research is needed on the concordance of less common protocols, and on reasons for lower LDT concordance. Laboratories should carefully validate all test methods and regularly verify their performance. EQA participation should focus on both staining concordance and interpretation of PD-L1 IHC.

Supplementary Information: The online version contains supplementary material available at 10.1007/s00428-020-02976-5.


Introduction
Several immune-checkpoint inhibitors (ICIs) have emerged which target the programmed cell death protein 1 (PD-1)/programmed death ligand 1 (PD-L1) interaction in non-small-cell lung cancer (NSCLC), such as the anti-PD-1 drugs nivolumab and pembrolizumab [1][2][3][4], and the PD-L1 inhibitors atezolizumab and durvalumab [5,6]. The efficacy of ICIs in NSCLC has been shown in various clinical trials, and PD-L1 immunohistochemistry (IHC) has been widely accepted as a predictive biomarker because of its association with increased efficacy of ICIs [7,8]. Both nivolumab and atezolizumab have been approved by the Food and Drug Administration (FDA) and the European Medicines Agency (EMA) [9,10] as second-line therapy irrespective of PD-L1 expression. Treatment with pembrolizumab requires at least 50% of PD-L1 positive tumor cells in a first-line setting for stage IV NSCLC patients or those with stage III disease who cannot be treated by chemotherapy or radiation therapy [3,4]. Recently, the FDA approved durvalumab as maintenance therapy in patients with unresectable stage III NSCLC without progression after concurrent chemoradiotherapy [11], irrespective of the PD-L1 status. The EMA, however, has restricted this indication to patients with PD-L1 on ≥ 1% of tumor cells [12].
Four commercial assays are currently available, each for a specific drug and applying a specific tumor proportion score (TPS) threshold for positivity. The Ventana PD-L1 (SP142) Assay, Ventana PD-L1 (SP263) Assay, and the PD-L1 IHC 22C3 pharmDx (Agilent Technologies/Dako) kit are CE-marked in vitro diagnostic (CE-IVD) labeled [13] and have been validated in the clinical trials for atezolizumab, pembrolizumab or durvalumab, and pembrolizumab only, respectively. The 22C3 kit and SP263 assay have also been approved as companion diagnostics (CDx) for pembrolizumab by the FDA [14] and received a CE-IVD certification in Europe [13], respectively, while the other kits are considered as complementary diagnostics [15].
Due to the wide variety in commercially available platforms, their concomitant implementation in one laboratory would result in increased costs and a limited number of NSCLC being tested on more than one platform. Laboratories may also opt to use a laboratory-developed test (LDT), such as the E1L3N or QR1 primary antibodies, or to use the antibodies described above with a different protocol than the CE-IVD certified ones. Comparison of LDTs to reference CE-IVD assays yielded varying results, ranging from 52% to 54% concordance up to 85% or even 100% [17,23]. To date, however, there remains confusion about the range of assays which are fit-for-purpose for PD-L1 testing for individual drugs and the interchangeability between them.
Irrespective of the protocol used, laboratories are required to appropriately verify or validate their PD-L1 IHC test, to take part in continuous quality monitoring, and to participate in external quality assessment (EQA) [24][25][26]. Lower staining concordance for LDTs compared with CE-IVD approved assays was reported by two other EQA providers [27][28][29], but participants' interpretation of the TPS was not always assessed [27]. The aim of this study is to evaluate the results of assessment of the staining concordance of PD-L1 IHC and its influence on TPS estimations, for the different (LDT or CDx approved) methods in two subsequent EQA schemes of the European Society of Pathology (ESP).
Laboratory characteristics have been shown to affect the EQA performance for other markers in NSCLC [30], but not yet for the technical assessment of PD-L1 concordance with optimal reference stains. Therefore, we also aimed to evaluate how different laboratory characteristics influence concordance rates. Finally, we provide an overview of the most common staining artifacts observed for our EQA participants.

Material and methods
Two EQA schemes were organized in 2017 (pilot) and 2018, both accredited for International Organization for Standardization (ISO) 17043:2010 [31] and open to all laboratories worldwide. Participants received two unstained formalin-fixed paraffin embedded (FFPE) slides of 3-μm thickness from a tissue micro-array (TMA) containing three (2017) and four (2018) cases from archival FFPE NSCLC resection specimens (collected 7.4-76.4 months prior to distribution) and a positive tonsil control. In 2017, one large core per case was provided. In 2018, three cores with a diameter of 2 mm were punched for every case. Any one or a combination of the three cores per case could be used for interpretation of the TPS. Hematoxylin and eosin stained slides of parallel sections were made digitally available to enable assessment of tissue morphology, preservation, and the minimum number of tumor cells. To select a sample set with varying TPS and determine the ground truth, samples were pretested by a central accredited reference laboratory [26] with 22C3 (Dako) or SP263 (Ventana) according to manufacturer's instructions (Supplemental Figure 1).
Participants were requested to stain the slides according to their routine protocol within 14 calendar days after sample receipt and to send the stained slides back to the EQA provider. The maximum time between cutting of the slides and staining by the participants was 1 month. An electronic datasheet was completed including information on the laboratory characteristics, applied methodology, and estimation of the TPS (in categories of < 1%, 1-50%, or > 50%).
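As an illustration of how a reported TPS category can be compared with the pre-validated category, a minimal Python sketch is given below. The names `TPSCategory`, `tps_category`, and `compare_to_reference` are hypothetical and not part of the EQA datasheet. The sketch assumes that a TPS of exactly 50% falls in the 1-50% category, following the category labels as written above.

```python
from enum import IntEnum


class TPSCategory(IntEnum):
    """Ordered TPS reporting categories (labels are illustrative)."""
    BELOW_1 = 0      # TPS < 1%
    FROM_1_TO_50 = 1  # TPS 1-50%
    ABOVE_50 = 2     # TPS > 50%


def tps_category(tps_percent: float) -> TPSCategory:
    """Map a TPS percentage (0-100) to its reporting category.

    Assumption: exactly 50% is placed in the 1-50% category.
    """
    if not 0 <= tps_percent <= 100:
        raise ValueError("TPS must be between 0 and 100")
    if tps_percent < 1:
        return TPSCategory.BELOW_1
    if tps_percent <= 50:
        return TPSCategory.FROM_1_TO_50
    return TPSCategory.ABOVE_50


def compare_to_reference(reported: TPSCategory, reference: TPSCategory) -> str:
    """Classify a reported category relative to the validated one."""
    if reported < reference:
        return "under-estimation"
    if reported > reference:
        return "over-estimation"
    return "correct"
```

For example, a participant reporting < 1% for a case validated as 1-50% would be counted as an under-estimation in this sketch.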
A team of three pathologists assessed the stains simultaneously under a multi-head microscope for the staining concordance, based on pre-defined scoring criteria. Prior harmonization was performed for equal assessment on slides with an excellent concordance with the reference stain for a specific antibody. Each participant stain was compared with the optimal reference stain and relative to stains from international peers. An expert staining score (ESS) ranging from 1 to 5 points was awarded based on the staining concordance of all cases with the reference slides, corresponding to 5: Excellent concordance for the specific protocol, 4: Concordant staining with minor remark, 3: Non-concordant staining without affecting clinical output, 2: Non-concordant staining affecting clinical output, 1: Failed, uninterpretable staining.
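The five-point ESS scale described above can be written as a simple lookup, sketched here in Python. The names `ESS_DESCRIPTIONS`, `describe_ess`, and `needs_individual_feedback` are hypothetical; the feedback rule reflects that, in these schemes, participants scoring below the maximum of 5 received individual feedback.

```python
# ESS levels with the descriptions used by the assessors.
ESS_DESCRIPTIONS = {
    5: "Excellent concordance for the specific protocol",
    4: "Concordant staining with minor remark",
    3: "Non-concordant staining without affecting clinical output",
    2: "Non-concordant staining affecting clinical output",
    1: "Failed, uninterpretable staining",
}


def describe_ess(score: int) -> str:
    """Return the description belonging to an ESS of 1 to 5."""
    if score not in ESS_DESCRIPTIONS:
        raise ValueError("ESS must be an integer from 1 to 5")
    return ESS_DESCRIPTIONS[score]


def needs_individual_feedback(score: int) -> bool:
    """True when the score is valid and below the maximum of 5."""
    describe_ess(score)  # validates the score
    return score < 5
```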
At the end of the scheme, participants received online examples of optimally concordant stains and corresponding protocols, a general scheme summary on sample outcomes (TPS) and ESS, and individual comments on their individual staining concordances (supplemental Table 1).
In 2018, one of the four cases was excluded, as varying TPS values were reported and no consensus outcome was reached. Thus, six cases were included, two for every TPS category. The reported laboratory settings and accreditation statuses were validated on the websites of the laboratories and their relevant national accreditation bodies [30]. Statistics were performed using SAS software (version 9.4 of the SAS System for Windows, SAS Institute Inc., Cary, NC, USA). The relationship of the ESS on 5 points with laboratory characteristics or used protocols was determined by proportional odds models, presented as odds ratios (OR) with 95% confidence intervals (CIs). The incidence of analysis failures and incorrect TPS estimations related to the ESS and laboratory characteristics was assessed by Poisson models with incidence rate ratios (IRR) with 95% CIs, with the log of the number of EQA samples as an offset variable. Generalized estimating equations (GEE) accounted for clustering in the data (i.e., tests performed by the same laboratory). 'Approved methods' were defined as CE-IVD-labeled FDA-approved CDx or complementary diagnostics without a change of protocol. The number of EQA participations, samples tested annually, or involved staff members were considered as ordinal variables (instead of categorical) to evaluate the influence of a +1 level increase.
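For readers less familiar with these models, one common way to write them is sketched below. The notation ($x_i$, $\alpha_k$, $\beta$, $Y_i$, $n_i$, $\gamma$) is introduced here for illustration only and is not taken from the study; the parametrization used by the statistical software may differ, for example in sign convention. In this form, an OR above 1 corresponds to higher odds of a higher ESS, and an IRR above 1 to a higher incidence per EQA sample.

```latex
% Proportional odds model for the ESS of participation i (k = 2, ..., 5)
\[
\operatorname{logit} P(\mathrm{ESS}_i \ge k \mid x_i) = \alpha_k + \beta^{\top} x_i ,
\qquad \mathrm{OR}_j = e^{\beta_j}
\]

% Poisson model for the number of events Y_i (e.g., incorrect TPS estimations)
% among n_i EQA samples, with log(n_i) as the offset
\[
\log E[Y_i \mid x_i] = \log n_i + \gamma_0 + \gamma^{\top} x_i ,
\qquad \mathrm{IRR}_j = e^{\gamma_j}
\]
```

In both models the coefficients are shared across participations, and GEE is used only to adjust the standard errors for repeated participations of the same laboratory.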

Results
In 2017 and 2018, 67 and 74 laboratories participated respectively, resulting in 141 EQA participations from 104 unique laboratories in 30 different countries. The average ESS significantly (p < 0.01) improved between 2017 and 2018 from 3.8 to 4.3 points (Table 1); however, there was no significant difference (p = 0.2859) between laboratories who participated for the first (4.0) or second (4.2) time.
Almost half of the 141 participants (49.6%) were university and research (such as specialized cancer centers) laboratories, compared with 25.5% of laboratories affiliated to a general hospital, 22.0% private laboratories, and 2.8% industry laboratories. More than half (54.6%) were accredited for PD-L1 IHC specifically or on a laboratory level according to ISO 15189 or relevant national standards (e.g., College of American Pathologists 15189). The largest group of laboratories (63/141, 44.7%) tested on average between 10 and 100 routine clinical samples annually for PD-L1, whereas seven participants (5.0%) did not perform clinical testing. Between 1 and 5 (37.6%) or 6 and 10 (34.8%) staff members were most frequently involved in performing and interpreting the PD-L1 IHC test. The abovementioned laboratory characteristics did not correlate with the ESS in either EQA scheme (Table 1).
The participants stained and interpreted 423 cases in total, of which 371 (87.7%) were correct (i.e., reported TPS was in line with the pre-validated consensus value). In 8 (1.9%) cases, an analysis failure occurred, meaning that the staining could not be performed or interpreted. The TPS was under- and over-estimated in 20 (4.7%) and 24 (5.7%) cases, respectively. The majority of under-estimations occurred for a TPS between 1% and 50%, close to the cutoff value of 1%, while over-estimations were more evenly distributed across TPS categories (Supplemental Table 1).
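The reported case counts can be checked with a few lines of arithmetic. The sketch below (variable names are illustrative) confirms that the four outcome categories add up to the 423 assessed cases and reproduces the percentages given above.

```python
# Outcome counts as reported for the 423 assessed cases.
outcomes = {
    "correct": 371,
    "analysis failure": 8,
    "under-estimation": 20,
    "over-estimation": 24,
}

total = sum(outcomes.values())

# Percentage of all assessed cases, rounded to one decimal.
percentages = {name: round(100 * n / total, 1) for name, n in outcomes.items()}

for name, pct in percentages.items():
    print(f"{name}: {outcomes[name]}/{total} ({pct}%)")
```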
A lower ESS correlated with TPS under-estimations (p < 0.0001) in all cases and over-estimations (p < 0.0043) for cases with a TPS between 1% and 50% (Fig. 1). Accredited laboratories less frequently over-estimated cases (p < 0.05) (Table 1), but there was no effect on under-estimations. There was no relationship between other laboratory characteristics and incorrect estimations. Of the 8 observed analysis failures, 3 were caused by 1 laboratory that was unable to interpret the cases as it was still validating its protocol. Another 4 laboratories incorrectly indicated that there were no tumor cells in the provided samples, and 1 participant reported that their control stained negatively (Supplemental Table 1). The relationship between failures and ESS could not be calculated for cases with a TPS of 1%-50% and > 50% as only 1 failure was observed.
In total, 81/141 participants did not obtain the maximum ESS of 5 on 5 points and received individual feedback. The majority of issues observed included a very weak (28.4%) or weak (34.6%) demonstration of the antigen in the tumor population, as well as slight (12.4%) or excessive (7.4%) background staining, not related to the used protocol (Supplemental Table 2). Examples of most frequently observed staining artifacts are given in Fig. 2.
The applied test methods for PD-L1 IHC significantly influenced the ESS. CE-IVD labeled or CDx kits (e.g., Ventana PD-L1 (SP142) Assay, Ventana PD-L1 (SP263) Assay, and the PD-L1 IHC 22C3 pharmDx) reached a higher ESS (4.2/5, n = 67) compared with LDTs (3.9/5, n = 74) (OR 1.916 [1.012; 3.626], p < 0.05) (Table 1). To assess if a recent change in protocol negatively affected the ESS, we evaluated the difference in ESS for participants who did or did not change their method between both schemes. Exactly 104 participants (73.8%) were excluded as they were first-time participants, and no method information from previous years was available. For the remaining 37 laboratories, 12 changed their test method (either the primary antibody, antigen retrieval, or detection method), but no difference in ESS was observed (OR 0.899 [0.247; 3.280], p = 0.8723).
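As a plausibility check on the reported ORs, the sketch below recovers an approximate standard error, z statistic, and two-sided p value from an OR and its 95% CI. This is an approximation under stated assumptions, not the analysis performed in SAS: it assumes the interval is a Wald interval that is symmetric on the log scale and based on the normal distribution. The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(or_point: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> tuple[float, float, float]:
    """Approximate SE, z, and two-sided p from an OR and its 95% CI.

    Assumes a normal-based Wald interval on the log-odds scale.
    """
    log_or = math.log(or_point)
    # CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# CE-IVD/CDx kits versus LDTs: OR 1.916 [1.012; 3.626]
_, _, p_kits = wald_from_ci(1.916, 1.012, 3.626)

# Changed versus unchanged test method: OR 0.899 [0.247; 3.280]
_, _, p_switch = wald_from_ci(0.899, 0.247, 3.280)

print(f"kits vs LDTs: p ~ {p_kits:.3f}")
print(f"method switch: p ~ {p_switch:.3f}")
```

Under these assumptions, the first comparison gives a p value just below 0.05 and the second a p value close to the reported 0.8723, which is consistent with the values in the text.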
The use of a ready-to-use (RTU) antibody dispenser yielded significantly higher ESS compared with using a specific dilution factor between 1/50 and 1/100 (OR 5.025 [2.058; 12.346], p = 0.0004) or > 1/100 (OR 9.009 [2.169; 37.037]; p = 0.0024), but not compared with a dilution factor of < 1/50 (OR 2.681 [0.924; 7.752]; p = 0.0696) (data not shown). In contrast, incubation at room temperature (RT) reduced the ESS compared with higher temperatures (Table 1). The incubation time or the use of amplification during detection did not alter the ESS. Because of the low frequency of technical failures and misclassifications, their percentages are given on a descriptive level only and no ORs are provided.

Table 1 notes. Abbreviations: # number, CDx companion diagnostic, CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IHC immunohistochemistry, IRR incidence rate ratio, LDT laboratory-developed test, NA not applicable, ND not determined, OR odds ratio, PD-L1 programmed death ligand 1, RT room temperature, RTU ready-to-use, TPS tumor proportion score.
+ Proportional odds models were used to analyze the difference in ESS. ++ Poisson models were used to evaluate the association with analysis failures or under-/over-estimations. Both models applied GEE for clustering of the data. Results are presented as ORs/IRRs (± 95% CI), respectively. OR/IRR > 1 represent a higher ESS/higher incidence for a higher category level. OR/IRR < 1 represent a lower ESS/lower incidence for a higher category level. *p < .05, **p < .01, ***p < .001, ****p < .0001. ND: statistics not computed due to low power (absence or very few events in one level). For variables with more than two categories (laboratory setting, incubation time, and temperature), overall significance levels are given. ORs for every pairwise comparison between categories are described in the main text.
° Analysis failures are defined as the failure to stain or interpret the PD-L1 IHC results on all assessed cases. Under-estimations are calculated on samples validated as a TPS of 1-50% or > 50%. Over-estimations are calculated on the total number of samples with TPS < 1% or 1-50%.
† Industry are laboratories involved in the development of diagnostic commercial kits. (Private) Laboratories are not within a hospital's infrastructure. Hospital laboratories included private and public hospitals. University and research included education and research hospitals, university hospitals, university laboratories, and anti-cancer centers [30].
‡ Laboratory setting and accreditation were validated on the websites of the laboratories and national accreditation bodies. Accreditation is defined as compliant to ISO 15189 or relevant national standards.
§ Approved kits are defined as using the Dako 22C3, Ventana SP142, or Ventana SP263 kits with platform for their intended use. LDTs are defined as these three clones in combination with another platform, or any other antibody clone.
¶ A switch included either the change in primary antibody, antigen retrieval, or detection method. 'Not applicable' included entries from first participations for which no method information from previous years was available.
We compared the most frequently used protocols with other protocols (Table 2). The SP263 (Ventana) CDx kit (with the CC1 antigen retrieval and OptiView DAB IHC Detection Kit) displayed significantly higher ESS compared with all other protocols (LDTs and approved kits) (Table 2, code d). The most frequently used antibody, 22C3, yielded varying ESS depending on the detection platform used. For instance, 22C3 in combination with less commonly used antigen retrieval and detection methods (not included in the CDx kit) (Table 2, code c) resulted in significantly lower ESS compared with 22C3 with reagents from the CDx kit (EnVisionFLEX Target Retrieval Solution and Envision Flex detection method), or with the OptiView platform. We observed no other statistical differences in ESS for the other methods.

Discussion
Detection of PD-L1 expression is a valuable biomarker in NSCLC to select patients for ICI treatment [8]. Many studies have emphasized the variation in techniques, positivity thresholds, and staining concordance [15,23, 25, ].
This study for the first time correlated the ESS with the different protocols, laboratories' characteristics, and the incidence of reporting an incorrect TPS.
First, our results confirm a wide variety of testing protocols used across Europe, not only for the primary antibodies but also for the different detection methods, with an overall better staining concordance for FDA CDx-approved kits compared with LDTs.

Fig. 1 notes. IRRs > 1 represent a higher number of incidents for higher ESS. *p < .05, **p < .01, ***p < .001, ****p < .0001. The IRR for analysis failures in cases with a TPS of 1-50% and > 50% was not computed as only one incident occurred. Abbreviations: CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IRR incidence rate ratio, N/A not applicable, ND not determined, PD-L1 programmed death ligand 1, TPS tumor proportion score.
It must be noted that the number of users in this study could have contributed to the difference between CE-IVD kits and LDTs, as the SP263 and 22C3 assays made up 17.0% and 22.7% (Table 2) of performed tests. Some antibody clones, such as SP142 or other LDTs, had only a limited number of users, and results should be interpreted with caution. The same is true for other non-commercial antibodies reported in literature with a lower sensitivity than E1L3N, such as the 5H1, 7G11, 015, and 9A11 [32,33], which were not assessed due to an absence of users in these EQA schemes.
An explanation for better concordance of CE-IVD marked kits might also include (i) the reduced inter-laboratory variation by restricting the protocol in automatic software deployers, (ii) the associated chemistry used to build these assays [32], or (iii) difficulties in adhering to existing validation guidelines [25,26,34] for LDTs, as a gold standard for PD-L1 assays and cut-offs is not available [33]. Additional research into the different validation practices of the participants might provide a better insight as to why LDTs are currently underperforming. Some previous studies confirm our results, in which fewer LDTs passed the quality control compared with the clinically validated assays for PD-L1 [17,27,28,29], and for ALK receptor tyrosine kinase IHC [35,36]. However, other studies reported a high concordance of LDTs with reference assays [21,37]. With the change of the CE-IVD directive into a European IVD regulation, more stringent validations need to be performed for the kits to retain their label, possibly resulting in more laboratories switching to approved kits [13,38]. Continued data are thus needed to confirm the lower concordance of LDTs in these EQA schemes.
Table 2 notes. Proportional odds models with GEE for clustering of the data were used to analyze the difference in ESS. Differences in ESS are represented as ORs (95% CI) for every method (row level) relative to other methods used (column level). OR > 1 represent a higher ESS for a given method (column level) relative to the other method (row level). OR < 1 represent a lower ESS for a method relative to other methods. Statistics are computed for main method categories. ND: statistics not computed due to low power (low number of users). Significant results are highlighted in italics. *p < .05, **p < .01, ***p < .001, ****p < .0001. Analysis failures are defined as the failure to stain or interpret the PD-L1 IHC results on all assessed cases. Under-estimations are calculated on samples validated as a TPS of 1-50% or > 50%. Over-estimations are calculated on the total number of samples with TPS < 1% or 1-50%. Abbreviations: # number, CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IHC immunohistochemistry, ND not determined, OR odds ratio, PD-L1 programmed death ligand 1, TPS tumor proportion score.

Even though we compared the broad categories of LDTs versus CE-IVD kits, we also observed variability within each category, demonstrated by the higher concordance of the SP263 CE-IVD kit compared with the 22C3 CDx kit, but also the high concordance of the E1L3N primary antibody (Cell Signaling) compared with other LDTs. This is in line with previously reported results, both for E1L3N [17] and for SP263 [39]. Secondly, we observed an effect of antibody dilution and incubation temperature, with higher concordance for RTU antibody dilutions (compared with a dilution factor between 1/50 and 1/100) and for incubation between 30°C and 37°C (compared with RT). However, that might be explained as the majority of the data were derived from RTU antibodies as part of CDx commercial kits.
Although amplification has been reported to alter the test outcome for expression levels near a cut-off [40], we did not observe a difference.
Third, under- or over-estimations should be avoided, as they could significantly alter the treatment options for patients. In this study, their absolute frequency was low, and laboratories overall interpreted the PD-L1 IHC outcomes well, especially given that PD-L1 is a relatively novel marker and increased error rates were reported during the introduction of novel markers into practice [18,30]. The TPS estimation was however significantly affected by the ESS, resulting in underestimations for lower ESS. This emphasizes that rather than interpretation of the obtained staining pattern, key to a correct result is selecting an appropriate staining protocol with careful validation and quality monitoring. Moreover, it is important that both laboratories and EQA assessors calibrate the outcome for each staining protocol with respect to the optimal staining for that specific antibody.
The majority of misclassifications occurred at the threshold cut-off, which is a well-known problem [39], mainly due to weak demonstration of the antigen in the tumor population or excessive background staining (Supplemental Table 2), resulting in a reduced signal-to-background ratio. Even TPS values differing by 20% or more compared with the validated outcome were observed (Supplemental Table 1). Therefore, sample switches (e.g., confusion about which core belongs to which case on the TMA) cannot be excluded.
In contrast to under-estimations, there was no significant relationship between the ESS and analysis failures, and overall incidence of these failures was low. While 4 laboratories indicated a lack of neoplastic cells in the sample, this could not be confirmed by microscopic review of the slides by the assessors, and peers who received slide sets from a similar position in the tissue block did not report any problems.
It must be noted that the schemes were performed on TMAs with 1 or 3 cores per case, and might not completely reflect the entire tumor microenvironment or PD-L1 expression on the invasive tumor front seen in routine practice [41]. Nevertheless, EQA results from the participants were correlated to the received tissue section, and cases displaying heterogeneity were excluded from the concordance assessment (Supplemental Figure 1).
Fourth, this is the first study to evaluate the relationship between different laboratory characteristics and the ESS. We observed a significant improvement over time from 3.8 to 4.3 on 5 points (p < 0.01). Even though second-time participants had a higher ESS and fewer incorrect outcomes/analysis failures, results were not significant. It must be noted that only 37 laboratories participated in both schemes and the TMAs sent in 2017 and 2018 were different (Supplemental Figure 1). More longitudinal data are needed to confirm a positive effect of repeated EQA participation and the feedback provided, as previously suggested [42].
Interestingly, while accreditation was significantly associated with fewer misclassifications, this was not the case for the ESS. Even when using an optimal IHC protocol, interpretation of the PD-L1 status might still be subjective based on the correct separation of membrane staining of the neoplastic cells versus nonneoplastic epithelial cells, immune cells, and necrosis.
The fact that laboratory accreditation affected the interpretation, but not the staining concordance, suggests that laboratories should participate in both aspects. From the EQA providers' side, schemes should be fit-for-purpose to assess both staining concordance and interpretation [43,44]. Previously, interpretation of PD-L1 IHC results has been described to improve upon pathologist training [17]. In our schemes, PD-L1 IHC was more frequently performed in research institutes, but laboratory setting and experience (number of samples tested, number of staff members involved, and change in methodology) did not correlate with overall ESS or TPS interpretation, in contrast to previously reported data [30]. As we included data from only two subsequent EQA schemes, additional schemes might bring more clarity on the effect of laboratory characteristics.
To conclude, the increasingly complex testing paradigm for PD-L1 poses many challenges for pathologists and oncologists. EQA participation could guide laboratories in obtaining better concordance. The use of a CE-IVD kit according to the manufacturer's instructions positively influences EQA concordance, even though additional research is needed on less common protocols and non-automated techniques. In addition, EQA participation should include a technical evaluation, given that lower ESS was shown to be at the basis of TPS misclassifications, rather than interpretation issues, and both aspects were differently affected by the laboratory characteristics.
One of the advantages of the EQA schemes is the large participants group for which a TPS estimation is available, allowing to determine an optimal consensus TPS for every case, and objectively comparing protocols by eliminating influences of the pre-analytical phase (i.e., all participants receiving similar and pre-validated slides). It remains to be elucidated how these findings are reflected in routine settings, where different preanalytical variables and sampling of heterogeneous biopsies can occur. Additionally, supplementing research on the errors made (e.g., personnel errors when following the protocol, clerical errors when reporting the outcomes) might reveal shortcomings in individual laboratories leading to lower concordance in the EQA scheme.

Authors' contributions
CK and EMCD were responsible for data collection according to ISO 17043 and statistical analysis. CK, EMCD, and JvdT interpreted the data and wrote the manuscript. PP, AR, NtH, and JvdT conceived and designed the set-up of the technical assessment, and took part as an assessing pathologist. NtH and JvdT provided medical expertise during the PD-L1 EQA schemes. AR was involved in sample preparation. PP was responsible for the multi-head microscope. JvdT selected and scanned the representative example images. All authors critically revised the manuscript for important intellectual content.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics
The samples originated from tissue blocks of leftover patient material obtained during routine care. Each scheme organizer signed a subcontractor agreement stating that the way in which the samples were obtained conformed to the national legal requirements for the use of patient samples. The samples were excluded from research regulations requiring informed consent.

Compliance with ethical standards
Ethical responsibilities of authors' section Virchows Archiv conforms to the ICMJE recommendation for qualification of authorship. The ICMJE recommends that authorship be based on the following 4 criteria: All individuals listed as co-authors of the manuscript must qualify for every one of the four criteria listed above. Should an individual's contributions to the manuscript meet three of the criteria or fewer, then they should not be listed as a co-author on the manuscript; instead, their contributions should be acknowledged in the Acknowledgements section of the manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.