Evaluation of a worldwide EQA scheme for complex clonality analysis of clinical lymphoproliferative cases demonstrates a learning effect

Clonality analysis of immunoglobulin (IG) or T-cell receptor (TR) gene rearrangements is routine practice to assist diagnosis of lymphoid malignancies. Participation in external quality assessment (EQA) aids laboratories in identifying systematic shortcomings. The aim of this study was to evaluate laboratories’ improvement in IG/TR analysis and interpretation during five EQA rounds between 2014 and 2018. Each year, participants received a total of five cases for IG and five cases for TR testing. Paper-based cases were included for analysis of the final molecular conclusion that should be interpreted based on the integration of the individual PCR results. Wet cases were distributed for analysis of their routine protocol as well as evaluation of the final molecular conclusion. In total, 94.9% (506/533) of wet tests and 97.9% (829/847) of paper tests were correctly analyzed for IG, and 96.8% (507/524) of wet tests and 93.2% (765/821) of paper tests were correctly analyzed for TR. Analysis scores significantly improved when laboratories participated in more EQA rounds (p=0.001). Overall performance was significantly lower (p=0.008) for non-EuroClonality laboratories (95% for IG and 93% for TR) compared to EuroClonality laboratories (99% for IG and 97% for TR). The difference was not related to the EQA scheme year, anatomic origin of the sample, or final clinical diagnosis. This evaluation showed that repeated EQA participation helps to reduce performance differences between laboratories (EuroClonality versus non-EuroClonality) and between sample types (paper versus wet). The difficulties in interpreting oligoclonal cases highlighted the need for continued education by meetings and EQA schemes.
Supplementary Information The online version contains supplementary material available at 10.1007/s00428-021-03046-0.


Introduction
Clonality testing is widely accepted as a valuable tool in routine diagnosis of lymphoid malignancies [1,2]. The vast majority of lymphoid malignancies arise from the unconstrained expansion of a single transformed B- or T-cell, accompanied by the presence of clonal rearrangements of immunoglobulin (IG) or T-cell receptor (TR) genes, rendering them the most widely applied gene targets for clonality testing.
Due to the technical standardization and commercialization (PCRs, protocols, and readouts), clonality assays can be performed in routine diagnostics [13]. However, reporting of clonality assays is still considered a complex task, because molecular clonality testing reflects immunobiology and comprises the integrated interpretation of multiple multiplex PCR results. These multiplex PCRs use primers of potentially different efficiencies, annealing to highly homologous genes. Although there are basic rules for interpretation of the molecular patterns [13], extensive knowledge of IG and TR gene rearrangement patterns and the PCR design is needed. Also, interpretation should consider the pathology and the clinical question, as the presence of a small clone in a reactive lesion has a different implication than its presence in a full-blown malignancy.
Laboratories performing molecular pathology tests are advised to participate in external quality assessment (EQA) schemes [14], preferably an accredited scheme providing samples mimicking routine cases as closely as possible [15]. It is essential that the EQA participants read the final reports with feedback on errors made by all participants, act on recommendations made, and ensure that their own errors are corrected rapidly [14].
The EuroClonality consortium organized five EQA rounds between 2008 and 2011 [16], using capillary electrophoresis (GeneScan, GS) or polyacrylamide heteroduplex (HD) gel analysis [3]. The schemes aimed to (i) assess the laboratory performance and (ii) develop a uniform scoring system for interpretation of IG/TR clonality testing. To render interpretation less subjective, algorithms have been introduced, especially in the USA [17][18][19]. However, this may potentially lead to false negative or false positive interpretation [13], and the need for guidelines on interpretation and reporting of clonality data is apparent for IG/TR routine diagnostics and EQA schemes. This prompted the development of the EuroClonality (BIOMED-2) guidelines. The guidelines describe the technical scoring of the individual IG and TR PCR target results, and scoring of the final molecular conclusion, based on the integration of the different PCR results. During validation of the EuroClonality uniform description and reporting system, the majority of the cases were scored appropriately, with only 3.1% of 1150 cases being identified as difficult to score. In these cases, a final scoring of either a minor clone with polyclonal background or polyclonal with a minor background actually describes the same phenomenon, and the choice of scoring may reflect the personal preference of the clinical scientist [13].
Several other providers offer IG/TR EQA schemes [20][21][22][23][24]. Both small sample sets with frequent distribution and larger sample sets are currently used in different EQA programs [25,26] (supplemental Table 1). Previously, performance improvement upon EQA participation or other quality improvement projects has been reported in schemes for testing of oncological biomarkers [27,28], but not yet for clonality testing.
The aim of this paper was to investigate the effect of repeated EQA participation on the laboratories' performance for complex clonality analysis. Important parameters such as the participant group, the different final molecular interpretations (clonal, polyclonal, oligoclonal; without evaluation of the more detailed molecular interpretation), sample information, and the analysis method were integrated in these analyses. The data are based on the results of five EQA rounds for IG and TR rearrangement analysis in suspected lymphoproliferations between 2014 and 2018.

EQA scheme set-up
The schemes were organized by the EuroClonality consortium [29] in collaboration with the Biomedical Quality Assurance Research Unit of KU Leuven as the coordination center, accredited conforming to ISO/IEC 17043:2010 [30]. Each EQA round comprised analysis of extracted DNA samples and interpretation of clonality patterns from paper-based cases on a total of 10 clinical cases. In addition to the EuroClonality laboratories that were involved in the development of EuroClonality/BIOMED-2 primer sets and protocols [3][4][5] and are members of the EuroClonality consortium, non-EuroClonality laboratories could also register. Enrolled laboratories could opt to participate in IG or TR testing, or both. The EQA scheme process is depicted in Fig. 1.

Sample selection
Participants received five cases for IG clonality testing and/or five cases for TR clonality testing. With the exception of 2014 (only paper-based cases), these cases alternated yearly to include three DNA samples and two paper-based cases in a given year, versus two DNA and three paper cases in the next year (Table 1). As both hemato-oncology and pathology labs perform clonality analysis, cases of different sample types (e.g., peripheral blood, fresh tissue, FFPE tissue) were included, reflecting the clinical diagnostics. The selection for the wet cases was based on the availability of samples with sufficient DNA-yield for testing in an EQA, the representation of the tube patterns, and results from previous EQA rounds. The selection for the paper-based cases was based on the representation of the tube patterns, the evaluation of rearrangement patterns of separate tubes into an integrated conclusion, and the results from previous EQA rounds.

Fig. 1 Overview of the EuroClonality EQA scheme process. EQA, external quality assessment; IG, immunoglobulin gene; TR, T-cell receptor gene

Table 1 footnotes: EQA, external quality assessment; IG, immunoglobulin gene; N, number; TR, T-cell receptor gene
* Successful participation is defined as a score of ≥4/5. Successful participation after two participations is defined as a score of ≥9/10 for both schemes combined. The number of participants does not necessarily equal the number of participants who submitted results, given that not all laboratories participated during the previous EQA scheme and the performance over 2 EQA rounds was not calculated
** One paper case from 2015 was considered an educational sample to calculate the TR performance score; all participants received 1 point as no consensus outcome could be reached
*** In 2016, 0.5 points were deducted in case there was a discrepancy between individual tubes and final conclusion or wrong identification of clonal peak for one specific case
Only wet and paper cases with a consensus overall molecular interpretation during pretesting were included. Wet samples consisted of color-coded tubes containing 40 μL DNA at a concentration of 25-50 ng/μL. Paper-based cases focused on the interpretation of IG/TR GS patterns, created by duplicate fragment analysis (GS) of PCR products on various Genetic Analyzer Systems (Life Technologies, Beckman Coulter). Quality of the DNA samples was assessed with the EuroClonality/BIOMED-2 quality control-gene PCR (100, 200, 300, 400 bp amplicons), and the largest sized amplicon product detectable was reported. Participants received information on the PCR targets (e.g., FR1-JH for IGH tube A), fluorochromes (FAM or HEX), and size standards (e.g., LIZ500, ILS600). Patterns were provided per BIOMED-2 tube, including a full view of the tube patterns and a zoomed view per sample to aid visualization of the case's overall GS profile. All samples (paper and wet) were presented as clinical cases and relevant clinical details (sample type, age/sex of patient, suspected diagnosis, and request), and flow cytometry, histomorphology, and/or immunostaining data were provided.

Results of clonality analysis: the individual tests as well as the final molecular interpretation
Participants were asked to analyze all cases using their routine protocols and to interpret the results according to the published guidelines [13]. Results were entered in an electronic datasheet (Formdesk) and included (i) the overall molecular interpretation, (ii) an optional more detailed interpretation, and (iii) a technical description per PCR tube (with or without peak size(s)). Additionally, information about the detection technique (HD or GS) and test assay used (only in 2018) were requested.

Evaluation and feedback
In 2014 and 2015, a consensus (overall) molecular interpretation and result per PCR tube was reached based on the concerted discussion of the participants' data. From 2016 up to 2018, consensus scoring was established by the EQA committee experts for the wet and paper cases prior to distribution.
In all scheme years, a maximum score of 1 point could be obtained per sample for a correct final molecular interpretation (Table 1). As the results of the different multiplex PCRs were used to form the basis for the final molecular interpretation, the individual PCR tube results were not scored. For particular cases, a more detailed interpretation of the final molecular interpretation was required (out of scope for this paper). Only in 2016 and 2017, half a point was additionally deducted for an incorrect or suboptimal detailed interpretation. In 2016, half a point was also deducted for discrepancies between individual tubes and the final conclusion or an incorrect identification of clonal peaks.
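The per-case scoring rules described above can be sketched in a few lines of Python. This is an illustration only, not the scheme's own scoring software: the function name `case_score` and its parameters are hypothetical, and two points are assumptions rather than statements from the text, namely that an incorrect final interpretation earns zero points and that the two half-point deductions may be combined but never push a score below zero.

```python
def case_score(year, final_correct, detailed_ok=True, tubes_consistent=True):
    """Score one EQA case (illustrative sketch; names and edge cases are assumptions)."""
    if not final_correct:
        # Assumption: a wrong final molecular interpretation earns no points.
        return 0.0
    score = 1.0  # maximum of 1 point per sample in all scheme years
    # 2016 and 2017 only: half a point off for an incorrect or suboptimal
    # detailed interpretation.
    if year in (2016, 2017) and not detailed_ok:
        score -= 0.5
    # 2016 only: half a point off for a discrepancy between individual tubes
    # and the final conclusion, or a wrongly identified clonal peak.
    if year == 2016 and not tubes_consistent:
        score -= 0.5
    # Assumption: deductions can stack, but a score never drops below zero.
    return max(score, 0.0)


# Example: the same answer pattern scored in three different scheme years.
for y in (2015, 2016, 2017):
    print(y, case_score(y, final_correct=True, detailed_ok=False))
```

With these assumptions, a correct final interpretation accompanied by a suboptimal detailed interpretation keeps the full point in 2014, 2015, and 2018, and loses half a point in 2016 and 2017.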
At the yearly EuroClonality meeting, the results of the EQA scheme were discussed, starting with a plenary presentation of the results, followed by detailed small group discussions involving the expert EQA committee members and the EuroClonality consortium participants. Finally, there was a summarizing plenary presentation. Analysis of the EQA data was integrated and described in detail in an educational EQA report and provided to all scheme participants. The general scheme summary included detailed information about the molecular conclusion and per tube PCR results, an assessment table with scores per case, and a participation certificate [25]. The criterion for successful participation was a performance rate of ≥80% in that respective scheme year, corresponding to at least 4 out of 5 correct final molecular interpretations. Laboratories with a score of ≤4.5 out of 5 received a warning due to possible risk of unsatisfactory performance after two EQA rounds. Laboratories with a score of at least 90% (9/10) [25] in two subsequent EQA rounds were listed on the EuroClonality website [29,31].
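The participation criteria above amount to three threshold checks, sketched below in Python. The function names `round_outcome` and `listed_after_two_rounds` are hypothetical, and the sketch assumes rounds of five cases scored in half-point steps, as described in the text.

```python
def round_outcome(score, n_cases=5):
    """Classify a single EQA round (illustrative sketch, hypothetical names).

    Returns (successful, warning):
      successful -> performance rate of at least 80% (4 of 5 cases)
      warning    -> score of 4.5 or lower out of 5, the threshold quoted in the text
    """
    successful = score / n_cases >= 0.80
    warning = score <= 4.5
    return successful, warning


def listed_after_two_rounds(score_a, score_b, n_cases=5):
    """True when two subsequent rounds together reach at least 90% (9/10)."""
    return (score_a + score_b) / (2 * n_cases) >= 0.90


# Example: a laboratory scoring 4.5 in each of two subsequent rounds.
print(round_outcome(4.5))                 # successful, but with a warning
print(listed_after_two_rounds(4.5, 4.5))  # 9/10 meets the 90% criterion
```

The example shows why the warning threshold sits above the pass mark: a laboratory can pass a single round with 4/5 and still fall short of the 9/10 needed over two rounds.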

Statistics
Statistics were performed with IBM SPSS Statistics v25 (IBM, Armonk, NY, USA) with significance levels set at α=0.05. Mann-Whitney U (MWU) tests were performed to evaluate differences in average analysis scores between groups, and Kruskal-Wallis (KW) tests to assess improvement upon repeated participation for a given group.
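For readers unfamiliar with the rank-based tests named above, the following Python sketch shows how a Mann-Whitney U statistic is computed from two groups of scores. It is not the SPSS procedure used in the study: it computes only the U statistic (no p-value), the function names are hypothetical, and the example scores are invented purely for illustration and are not study data.

```python
def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b), the Mann-Whitney U statistics for the two groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical round scores (out of 5) for two groups of laboratories.
scores_group_1 = [5, 5, 4.5, 5]
scores_group_2 = [4, 4.5, 3.5, 5]
print(mann_whitney_u(scores_group_1, scores_group_2))
```

The two U values always sum to the product of the group sizes; a statistics package would convert the smaller one into a p-value, which is then compared with the α=0.05 threshold stated above.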

General overview
Over all schemes between 2014 and 2018, 84 unique laboratories from 17 (Table 2). This excludes a paper-based case in 2016 (peripheral blood with relapsed T-cell prolymphocytic leukemia) for which 29.4% (15/51) of participants incorrectly interpreted this difficult case with oligoclonality/multiple clones detected (Supplemental Table 2).

Improvement related to repeated EQA participation
We evaluated the performance on individual laboratory level based on the number of EQA participations, not related to the general average score for that scheme year. The average analysis scores were significantly higher for individual Overall, laboratories performed better for the paper-based cases as compared to the wet cases, although not significant (MWU, p=0.466) (Fig. 2, panel B). During a first EQA participation, a score of 85.9% was reached for wet cases, compared to 92.5% for paper-based cases (IG and TR combined). The difference between both sample types decreased upon frequent EQA participation, ultimately reaching scores of 98.0% (wet cases) and 99.4% (paper cases).
Both EuroClonality and non-EuroClonality laboratories benefited from repeated EQA participation. The EuroClonality laboratories performed significantly better compared to non-EuroClonality participants (MWU, p=0.008) (Fig. 2, panel C). Better performance by EuroClonality laboratories was also observed for IG and TR testing for paper and wet cases separately, although only significant for the paper cases (MWU, p=0.007 for paper cases, p=0.149 for wet cases, p=0.057 for IG, p=0.068 for TR) (Supplemental Figure 1).

Evaluation of the different final molecular interpretations
In total, 1380 tests (both on wet and paper-based cases) were performed for IG rearrangements (Table 3). A total of 97.5% (1021/1047) and 96.1% (265/276) tests correctly assigned the final interpretation of clonal or polyclonal, respectively. Note that the more detailed molecular interpretations were evaluated but no points were deducted when the more detailed interpretation was not correct. Only 12.3% (7/57) of tests with a consensus outcome of oligoclonality/multiple clones were correct, as 78.9% (45/57) were reported as clonal. For TR analysis, 1345 tests were performed (Table 3), of which one paper-based case in 2015 was considered to be educational since no consensus outcome was reached. Similar to IG, the majority of clonal (893/913, 97.8%) and polyclonal (251/269, 93.3%) TR tests were correct, while the majority of oligoclonal tests were incorrectly assigned as clonality detected (76/107, 71.1%). Incorrect interpretations were more often observed for wet samples compared to paper cases, especially for IG analysis (except for oligoclonality, which only included paper-based cases).
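The counts quoted in this paragraph can be cross-checked with simple arithmetic, as in the Python sketch below. All numbers are taken from the text; the variable names and the helper `pct` are hypothetical. The reading that the TR tests not covered by the three categories belong to the educational 2015 case is an inference from the numbers, not a statement in the text.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Number of tests per consensus outcome, as quoted in the text.
ig_tests = {"clonal": 1047, "polyclonal": 276, "oligoclonal": 57}
tr_tests = {"clonal": 913, "polyclonal": 269, "oligoclonal": 107}

ig_total = sum(ig_tests.values())   # should match the 1380 IG tests
tr_scored = sum(tr_tests.values())  # TR tests with a consensus outcome
# Inference (assumption): the remainder of the 1345 TR tests belongs to the
# educational 2015 paper case, for which no consensus outcome was reached.
tr_unscored = 1345 - tr_scored

print(ig_total, tr_scored, tr_unscored)
print(pct(7, 57), pct(45, 57))  # IG oligoclonal: correct vs reported as clonal
```

The IG category counts add up exactly to the 1380 tests reported, and the oligoclonal percentages reproduce the 12.3% and 78.9% given above.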

Evaluation of the analysis methods used by the EQA participants
For IG and TR analysis of the wet samples, 90.8% (n=533) and 92.2% (n=524) of tests were analyzed by GS. The remaining tests were analyzed by HD (8.3% for IG and 7.3% for TR), which is also described as a preferred analytical technique for some multiplex PCR-tubes [3]. For IG analysis, the majority of the participants tested the IGH-A (94.6%), IGH-B (98.7%), and IGH-C (98.9%) tubes (Supplemental Table 3). These three tubes were mainly tested by EuroClonality For wet TR testing, the most included tubes were TRG-A and TRG-B tubes (97.5%). For these two tubes, EuroClonality Invivoscribe reagents were mainly used (29/55 participants), followed by EuroClonality LDT (22/55), and non-EuroClonality LDT (2/55). One participant used Invivoscribe NGS reagents, and one other laboratory did not test these targets. Reagents for the other tubes are shown in Supplemental Table 3.

Discussion
The BIOMED-2/EuroClonality assays are widely used for clonality testing of suspected lymphoproliferations. Clonality testing is not a stand-alone test but is an important integral part in the diagnosis of lymphoid malignancies. Correct analysis, evaluation, and result reporting are indispensable and contribute to a correct diagnosis. Particularly, the appropriate interpretation of clonality assays requires study, learning, and training on the job. This can be facilitated by participation in EuroClonality educational workshops, or by submitting difficult cases via the EuroClonality website to get online support. In this paper, we show that participation in the EuroClonality EQA schemes significantly contributes to improving the diagnostic interpretation.
The overall performance scores for both IG and TR analysis were high, with more than 90% of successful participants each year. The individual participants had a significantly higher score when participating in more EQA rounds, although there was no obvious overall improvement between 2014 and 2018. There was no observed difference in performance based on the sample type or final clinical diagnosis. The methodology used could not be linked to the improvement for a specific laboratory or sample type.
The final molecular interpretations such as "clonal" or "polyclonal" were in general scored well. Also, truly challenging cases were included such as clonal cases with polyclonal background, cases with bi-allelic rearrangements, bi-clonal cases, and cases with multiple clonal (IGK or TRB) rearrangements that still belong to one clone. In our schemes, thus far, we have evaluated the more detailed molecular interpretations such as bi-allelic or bi-clonal, but no points were deducted when the more detailed interpretation was incorrect. In the next EQA schemes, we intend to include the more detailed interpretations in the evaluation. Based on the previous EQA schemes, we then expect a lower performance for clonal samples. The scoring of oligoclonality clearly was difficult. Oligoclonality is defined as the reproducible detection of three or more clones. As for the interpretation of clonality (including bi-allelic or bi-clonal cases), this requires the appropriate interpretation of the individual tube results as well as understanding the IG and TR loci and PCR design. Due to the non-quantitative PCR nature and potentially preferential amplification of some rearrangements, the identification of true clonal rearrangements compared to minor peaks in an irregular polyclonal background is especially difficult for oligoclonal cases. Because true oligoclonal cases are scarce, the experience with these cases is limited. Only three oligoclonal cases could be included in the paper-based EQA; one for IG and TR in 2015 and one TR case in 2016. Of the total 164 molecular interpretations (Table 3) A better performance (although not significant) was observed for paper-based tests in which only the result interpretation was evaluated, versus the wet cases in which the technological approach, performance of the test, and the interpretation of the results were evaluated.
While paper cases were evaluated adequately during a laboratory's first or second EQA participation, wet cases were more error-prone (Fig. 2, panel B). This is not surprising, given that wet sample analysis includes extra (pre-)analytical processes potentially impacting the results, compared to solely interpreting GS results according to guidelines. However, paper-based cases may also be perceived as difficult by the laboratories, as cases with complex rearrangement patterns were included.
Both EuroClonality and non-EuroClonality laboratories improved their performance upon repeated participation, which is in line with results for biomarker analysis in colorectal cancer [28]. The overall scores were significantly better for EuroClonality laboratories (Fig. 2, panel C). The EuroClonality-affiliated participants may have benefited from the annual meeting and the provided feedback in the group discussions. Feedback has been shown to be an important parameter in learning [32]. However, the better performance should be interpreted with caution, as the majority of the EuroClonality-affiliated laboratories also participated more frequently in the EQA schemes, and repeated participation significantly improved performances. In addition, the difference between EuroClonality and non-EuroClonality laboratories is smaller for individual scheme years. We expected a larger score difference in a single scheme, given the longstanding experience of the EuroClonality participants, who were involved in the design and testing phase of the BIOMED-2/EuroClonality assays and in preparation of the EuroClonality guidelines. The question remains how non-EuroClonality laboratories educated themselves. Most likely, training on the job within an expert environment and/or attending dedicated trainings resulted in translation of theoretical knowledge into diagnostic practice and competence building. In addition, the feedback given in the extended EQA report may also have contributed to learning and good performance [33,34]. In the end, it remains the responsibility of the participants to implement the necessary corrective actions to improve performance.
This EQA scheme with five wet cases and five paper-based cases in each round allowed us to evaluate the successful performance over two EQA rounds, as it was previously estimated that at least 10 samples are needed to allow a reliable performance estimate [25,26]. Accredited laboratories have to demonstrate their performance, but not all laboratories participated in all EQA rounds, as the frequency of participation is not specified by ISO 15189 [15] or equivalent national accreditation standards. Recent recommendations from the Belgian Molecular Diagnostics working group now state that laboratories should perform a risk analysis to determine their ideal participation frequency [35].
The several international clonality EQA providers each (i) evaluate a different number of IG/TR targets, (ii) distribute various numbers of samples per annum, and (iii) apply different criteria for successful participation (Supplemental Table 1). Although not all providers include paper-based cases, the majority assesses the laboratory's interpretation of the rearrangement patterns according to the guidelines. Namely, EQA providers should assess the complete analysis process, from pre-analytical to post-analytical phase [25]. As the participants received pre-extracted DNA samples, the DNA extraction and preparation steps are not evaluated in the EQA scheme, and could impose additional difficulties in a routine setting. The cut-off of 80% for successful performance and 90% after two participations in these EQAs was based on the requirements for EQA programs, which recommend a cut-off of 90% assessed on a total of 10 samples [25]. Similar to the harmonization efforts in molecular oncology, increased harmonization between providers is advisable for clonality analysis to define a uniform scope, scoring system, criteria for successful participation, and actions following unsatisfactory performance in Europe [36].
In summary, we observed a high performance for IG and TR analysis, which increased when participating in more EQA rounds. There was a higher performance for paper-based cases compared to wet cases and for EuroClonality compared to non-EuroClonality laboratories. There was no difference related to the EQA scheme year, sample origin, or clinical diagnosis. The observed difficulties in interpreting oligoclonal cases highlight the need for continued education via meetings and EQA schemes.
Funding The EuroClonality Educational Quality Assurance Scheme is funded by the EuroClonality Foundation.
Data availability The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Declarations
Ethics approval and consent to participate Virchows Archiv conforms to the ICMJE recommendation for qualification of authorship. The ICMJE recommends that authorship be based on the following 4 criteria: ▪ Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; and ▪ Drafting the work or revising it critically for important intellectual content; and ▪ Final approval of the version to be published; and ▪ Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
All individuals listed as co-authors of the manuscript must qualify for every one of the four criteria listed above. Should an individual's contributions to the manuscript meet three of the criteria or fewer, then they should not be listed as a co-author on the manuscript; instead, their contributions should be acknowledged in the "Acknowledgements" section of the manuscript.
The samples originated from leftover patient materials obtained during routine care. Each scheme organizer signed a subcontractor agreement stating that the way in which the samples were obtained conformed to the national legal requirements for the use of patient samples. The samples were excluded from research regulations requiring informed consent.
Conflict of interest EuroClonality is a scientific foundation that, together with the foundations EuroMRD and EuroFlow, is connected to the ESLHO foundation/EHA scientific working group. The objectives of the EuroClonality Foundation are aimed at innovation, standardization, quality control, and education in the field of diagnostic clonality analysis. The revenues of the previously obtained patent (PCT/NL2003/000690), which is collectively owned by the EuroClonality Foundation and licensed to Invivoscribe, are exclusively used for EuroClonality activities, such as for covering costs of the Working Group meetings, collective WorkPackages (including the WorkPackage: External Quality Control, EQA), and the EuroClonality Educational Workshops.
-Cleo Keppens has nothing to declare.
-Elke Boone is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2014-now).
-Paula Gameiro is a member of the Board of the EuroClonality Scientific Foundation, and chair of the WorkPackage EQA (assessor 2014-now).
-Véronique Tack has nothing to declare.
-Elisabeth Moreau is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2015-2018).
-Elizabeth Hodges is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2014-now).
-Paul Evans is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2017-now), and has received honoraria and lecture fees from Novartis, Diceutics, and Astellas within the past 36 months, outside the scope of this research.
-Monika Brüggemann is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2014-2016), performed contract research for Affimed, Amgen, Regeneron, advisory board of Amgen, Incyte, and is a speaker for the bureau of Janssen, Pfizer, Roche.
-Ian Carter is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2017-2018).
-Dido Lenze is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2015-2017).
-Maria Eugenia Sarasquete is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (2019-now).
-Markus Möbs is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (2019-now).
-Hongxiang Liu is a member of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2017-now).
-Elisabeth MC Dequeker has nothing to declare.
-Patricia JTA Groenen is chair of the EuroClonality Scientific Foundation and member of the WorkPackage EQA (assessor 2014-now).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.