The impact of formalin fixation

The fixation of tissue specimens by formalin has been in use for almost 120 years [1]. Its application in histology was rapidly accepted, and a worldwide success story began. Already at that time, a formalin concentration of 10% was suggested, which is equivalent to a final formaldehyde concentration of approx. 4% (formalin is a 37% w/v aqueous solution of formaldehyde). This formalin concentration is still in use today. The only modification since its introduction is the use of neutralizing phosphate buffer for the dilution of the formaldehyde stock solution, which reduces the chemical aggressiveness of formalin.

Formalin reacts with protein side groups as well as with nucleic acids, leading to cross-linking of proteins and fragmentation of nucleic acids. The fragmentation of nucleic acids is caused by depurination, whereas protein cross-linking is mediated by reactive amino (NH2) side groups. The impact of formalin is further reinforced if the fixation is carried out (1) at higher temperature, (2) in unbuffered solution or (3) for a prolonged time. Whereas fixation in neutral-buffered formalin for a period not exceeding 24 h results in DNA fragments of up to 400–500 base pairs, very aggressive conditions lead to very short DNA fragments of around 100 base pairs or less [2].

The use of formalin-fixed (and paraffin-embedded) tissues for protein detection (e.g. immunohistochemistry) or nucleic acid-based methods requires several considerations in order to avoid misinterpretation of the results. To break the protein cross-links, enzymatic degradation with proteinases such as proteinase K was initially the method of choice. Since the enzymatic treatment is often associated with impaired morphology, heat pretreatment of formalin-fixed tissue sections by boiling for a few minutes in buffers, with or without high pressure, is nowadays the predominantly applied antigen retrieval approach. This heat pretreatment changes the protein conformation and enables the detection of relevant immunogenic epitopes despite the cross-linking of the proteins.

The quality of degraded nucleic acids derived from formalin-fixed tissue specimens cannot be improved by any kind of (pre-)treatment. In order to isolate the nucleic acids (DNA and RNA) from tissue specimens, an extraction and thus a destruction of the cellular integrity has to be performed. This requires first of all a breakage of the cross-linked protein network to release the nucleic acids from the cells. A further purification of the DNA and RNA extracts can be obtained by various procedures (see next paragraph). However, since depurination leads to irreversible strand breaks, highly fragmented DNA and RNA molecules, respectively, will be the result of any extraction protocol irrespective of the method applied.

DNA extraction from FFPE samples

DNA extraction from formalin-fixed and paraffin-embedded (FFPE) samples differs in two main respects from the procedure used for fresh or frozen tissue samples or unfixed liquid samples. One is the presence of the paraffin used to embed the tissue, and the other is the very strong cross-linking of the DNA with proteins and other macromolecules resulting from the formalin fixation procedure (see section above).

The removal of the paraffin takes place before the actual DNA isolation and is accomplished by extraction with xylene and a subsequent wash with ethanol to remove any residual xylene. However, it has been found that this step is not mandatory. Paraffin melts at low temperatures (approx. 56°C) and is thus solubilized during the proteinase K treatment of the tissue (see below). After this treatment, the paraffin solidifies again at the tube wall or on top of the solution. By simply transferring the aqueous solution to another tube, the paraffin is removed without xylene extraction. Thus, it may be worth trying to omit the xylene extraction protocol, since this modification results in a much simpler DNA isolation procedure.

In order to make the cross-linked DNA derived from FFPE specimens accessible for extraction, the tissue sections have to be treated with proteinase K. Many protocols exist for this step, varying in the final concentration of the proteinase K solution (0.2–4 μg/μl), the incubation time (16–48 h) and the incubation temperature (37–70°C). In our hands, overnight incubation at 56–70°C at final concentrations of 1–2 μg/μl proteinase K works well. Optionally, the enzyme may be inactivated after the incubation by heating for 5–10 min at 95°C.
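The final proteinase K concentration quoted above follows from a simple dilution calculation (C_stock × V_stock = C_final × V_total). The following is a minimal sketch of that arithmetic only; the function name, the digest volume of 200 μl and the stock concentration of 20 μg/μl are illustrative assumptions and are not taken from any specific protocol.

```python
def stock_volume_ul(final_conc_ug_per_ul, total_volume_ul, stock_conc_ug_per_ul=20.0):
    """Volume of proteinase K stock (in ul) needed for the desired final concentration.

    Based on C_stock * V_stock = C_final * V_total.
    The default stock concentration (20 ug/ul) is an assumption for illustration.
    """
    if final_conc_ug_per_ul <= 0 or total_volume_ul <= 0:
        raise ValueError("concentration and volume must be positive")
    if final_conc_ug_per_ul > stock_conc_ug_per_ul:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc_ug_per_ul * total_volume_ul / stock_conc_ug_per_ul


# Hypothetical 200 ul digest at the two ends of the 1-2 ug/ul range
for final_conc in (1.0, 2.0):
    volume = stock_volume_ul(final_conc, total_volume_ul=200.0)
    print(f"{final_conc} ug/ul final: add {volume} ul of stock")
```

With a more dilute stock the required volume rises accordingly, which may matter when the digest volume is small.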

The treatment of the FFPE tissue with proteinase K to release the DNA is, in most instances, only the first step of the DNA purification procedure [3]. However, the use of this crude extract for subsequent PCR analyses without further purification of the nucleic acids is also possible. Crude nucleic acid extracts are regarded as less stable and are not suitable for long-term storage and usage.

Three main DNA extraction protocols can be differentiated which differ in the grade of the purity of the extracted nucleic acids:

  1. DNA extraction by proteinase K digestion of the tissue without any further purification (high contamination with peptidic cleavage products and other cellular components)

  2. DNA extraction by proteinase K digestion of the tissue followed by alcohol precipitation of the DNA, with or without previous organic extraction (partial removal of peptidic cleavage products)

  3. DNA extraction by proteinase K digestion of the tissue followed by silica-based purification of the DNA (commercial kits) (highly purified nucleic acid extracts without significant contamination)

In order to decide on the appropriate method, the advantages and disadvantages of each protocol should be weighed carefully. To provide an overview, we have listed the pros and cons in Table 1. For a more detailed description of the various extraction protocols, please see also the paper of Bonin et al. which provides an excellent and comprehensive overview [4].

Table 1 Comparison of DNA extraction protocols

However, with regard to the complexity of the PCRs used in clonality analysis, the better compatibility with accurate concentration measurement and the higher standardization potential appropriate for the diagnostic setting, we strongly recommend the use of silica-based extraction methods.

Quality control of DNA

The quality assessment of DNA extracted from FFPE samples is essential for the interpretation of the results of the immunoglobulin (Ig) and T cell receptor (TCR) gene rearrangement PCRs. In this respect, the term quality relates to the degree of fragmentation and the amount of DNA as well as to the presence of PCR inhibitors. All three parameters, either separately or in combination, can lead to artificial PCR results that are easily misinterpreted if not evaluated appropriately.

Amount of DNA used for PCR

The input of too much DNA and the use of crude DNA extracts (without silica-based purification) can lead to inhibition of amplification or significantly reduced amplification efficiencies in general and of larger products in particular. The effect is a generally lower detection rate for clonal Ig/TCR gene rearrangements or an accumulation of artificial products which potentially impair the reliable detection of clonality. Similar effects can be induced when too little DNA is used for PCR. Therefore, in cases with low loads of suspect (clonal) malignant cells, it can be beneficial to increase the amount of DNA used for each PCR run to 250 or even 400 ng instead of the 50–200 ng usually employed per PCR tube. This presumes, however, that the DNA used is purified and does not thereby introduce PCR inhibitors into the PCR assay.
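The conversion from a desired DNA input to a pipetting volume can be sketched as follows. This is only an illustration of the arithmetic; the function name and the upper limit of 5 μl of template per tube are hypothetical and would have to be adapted to the reaction volume actually used.

```python
def template_volume_ul(target_ng, conc_ng_per_ul, max_template_ul=5.0):
    """Volume of DNA extract (in ul) that delivers target_ng of DNA.

    max_template_ul is an assumed upper limit for the template volume per
    PCR tube; it is not a value taken from the BIOMED-2 protocol.
    """
    if target_ng <= 0 or conc_ng_per_ul <= 0:
        raise ValueError("target amount and concentration must be positive")
    volume = target_ng / conc_ng_per_ul
    if volume > max_template_ul:
        raise ValueError(
            f"{volume:.1f} ul needed, more than the assumed limit of "
            f"{max_template_ul} ul; the extract is too dilute for this input"
        )
    return volume


# Standard input of 100 ng from an extract measured at 50 ng/ul
print(template_volume_ul(100, 50))  # 2.0
```

In this sketch, an increased input of 400 ng from the same extract would require 8 μl and is rejected, which is a reminder that higher inputs are only practical with sufficiently concentrated, purified DNA.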

The DNA extracted from FFPE specimens can be quantified by spectrophotometric or fluorimetric measurements. Spectrophotometry has the advantage that the presence of potential PCR inhibitors can also be assessed by calculating the OD ratios 260/280 (protein, phenol, e.g. from TRIzol) or 260/230 (EDTA, carbohydrate, phenol, e.g. from TRIzol). However, it is important to keep in mind that photometric quantification is not accurate for crude DNA extracts, since this method requires purified DNA. The advantage of fluorimetric analysis is the selective measurement of DNA over RNA, resulting in a more accurate measurement.
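A spectrophotometric readout can be evaluated along the following lines. This is a minimal sketch, not a validated procedure: the conversion factor of 50 ng/μl per absorbance unit at 260 nm applies to double-stranded DNA at a path length of 1 cm, whereas the ratio cut-offs (1.7 and 1.8) are rule-of-thumb assumptions chosen for illustration and are not specified in this article.

```python
from dataclasses import dataclass


@dataclass
class PurityReport:
    conc_ng_per_ul: float
    ratio_260_280: float
    ratio_260_230: float
    flags: list


def assess_purity(a230, a260, a280, dilution_factor=1.0,
                  min_260_280=1.7, min_260_230=1.8):
    """Estimate dsDNA concentration and flag possible contaminants.

    The cut-offs min_260_280 and min_260_230 are illustrative assumptions.
    Only meaningful for purified DNA, not for crude extracts.
    """
    if min(a230, a260, a280) <= 0:
        raise ValueError("absorbance values must be positive")
    conc = a260 * 50.0 * dilution_factor  # 1 A260 unit ~ 50 ng/ul dsDNA (1 cm path)
    ratio_280 = a260 / a280
    ratio_230 = a260 / a230
    flags = []
    if ratio_280 < min_260_280:
        flags.append("low 260/280: possible protein or phenol carry-over")
    if ratio_230 < min_260_230:
        flags.append("low 260/230: possible EDTA, carbohydrate or phenol carry-over")
    return PurityReport(conc, ratio_280, ratio_230, flags)


report = assess_purity(a230=0.5, a260=1.0, a280=0.5)
print(report.conc_ng_per_ul, report.ratio_260_280, report.ratio_260_230, report.flags)
```

A flagged sample is not necessarily unusable; the flags merely indicate that amplifiability should be checked, for example with the size control PCR.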

Degree of fragmentation

In general, separation on a low percentage agarose gel may be used to estimate the degradation of genomic DNA. However, in DNA extracted from FFPE tissue samples, a certain degree of degradation is always present, and thus, rather, the precise extent of fragmentation is the matter of interest. This is especially important when the DNA is used to determine the clonality of lymphoid cells by Ig or TCR PCR protocols. Heavily degraded DNA (average fragment size below 200 base pairs, bp) might prevent the amplification of larger PCR products and thus the detection of B and T cell clonality.

A very useful approach to determine the degree of fragmentation is the simultaneous generation of PCR products of various sizes based on the amplification of single-copy genes. In order to cover the size range of DNA fragments usually present in FFPE-derived DNA extracts, the BIOMED-2 consortium developed a size control PCR which produces five amplificates of different lengths (100, 200, 300, 400 and 600 bp) [5]. Figure 1 exemplifies possible outcomes of the size control PCR for amplificates from 100 to 400 bp (the 600 bp product was not included). Whereas (non-degraded) DNA from a fresh-frozen, unfixed tissue sample (lane 1) led to the detection of all expected PCR products, over-fixed and stored samples (lanes 3, 5, 7 and 8) harbour highly degraded DNA which resulted in amplificate sizes of 200 bp or less. Properly fixed tissue samples (less than 48 h of fixation with neutral-buffered formalin) contain DNA fragments of 400 bp or more as clearly demonstrated (lanes 2, 4 and 6).

Fig. 1 Results of the size control PCR developed by the BIOMED-2/EuroClonality consortium with different preparations of FFPE DNA

Since clonality PCR assays produce PCR products over a broad range of sizes (approx. 100 to 400 bp), knowledge of the degree of DNA fragmentation in a given sample is a decisive parameter for the correct interpretation of the results on a technical level (i.e. per tube) and for the final conclusion (i.e. per target).
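How the outcome of the size control PCR might be related to the expected product range of a given clonality tube can be sketched as follows. The ladder sizes are those of the size control PCR; the three-level classification and the product ranges used in the example are hypothetical and serve only to illustrate the reasoning.

```python
LADDER_BP = (100, 200, 300, 400, 600)  # amplificate sizes of the size control PCR


def max_amplifiable_bp(detected_bands):
    """Largest ladder size reached without a gap, counted from the smallest band."""
    detected = set(detected_bands)
    largest = 0
    for size in LADDER_BP:
        if size not in detected:
            break
        largest = size
    return largest


def coverage(product_range_bp, max_bp):
    """Classify how much of a tube's expected product range is amplifiable.

    Returns 'full', 'partial' or 'none'. This classification is an
    illustrative assumption, not part of the BIOMED-2 protocol.
    """
    low, high = product_range_bp
    if max_bp >= high:
        return "full"
    if max_bp >= low:
        return "partial"
    return "none"


# Degraded sample: only the 100 and 200 bp control products are visible
limit = max_amplifiable_bp([100, 200])
print(limit)                        # 200
print(coverage((100, 170), limit))  # full    (hypothetical small-product tube)
print(coverage((145, 255), limit))  # partial (hypothetical mid-range tube)
print(coverage((250, 295), limit))  # none    (hypothetical large-product tube)
```

A tube classified as 'partial' or 'none' in this sketch cannot be used to exclude clonality, since a clonal product in the non-amplifiable size range would simply be missed.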

PCR inhibitors

PCR inhibitors may originate from the FFPE tissues themselves or from the DNA extraction protocol employed. Their presence will impair the activity of the polymerase and thus lower the amplification efficiency. In addition to the generation of (very) weak PCR products in general, an unequal amplification of different sequences may occur. For clonality assays, this can result in artificial patterns mimicking a restricted/clonal lymphocyte repertoire and, potentially, in a false interpretation of clonality.

As mentioned above, PCR inhibitors can be detected in the extracted DNA by spectrophotometric measurements at 230, 260 and 280 nm and calculation of the OD ratios 260/230 and 260/280. Furthermore, from the size control PCR, not only the integrity of the DNA can be evaluated but also its amplifiability. Therefore, the use of the same DNA concentrations in the size control PCR and in the clonality assays might also be of help to detect the presence of PCR inhibitors.

Performance of Ig/TCR gene rearrangement PCRs with DNA from FFPE specimens

For the detection of Ig and TCR rearrangements in FFPE samples, the design of primers was already adapted in initial PCR studies to the size range of DNA fragments available in formalin-fixed tissue specimens. In an attempt to cover even highly degraded tissue DNA samples, some primer sets were designed to produce very small PCR products which, however, show significant limitations in detecting clonal B and T cell populations in a sufficient proportion of cases [6, 7]. The BIOMED-2 approach, however, considered more aspects in the design of the respective primer sets, including (1) coverage of all types of possible Ig and TCR rearrangements, (2) reduction of the negative impact of somatic hypermutations (IgH) on the amplification efficiency and (3) generation of amplification products of less than 400 bp [5]. As a result of these design standards, the BIOMED-2 primer sets display exceptionally high rates of clonality detection (sensitivity and specificity) and are in principle applicable to DNA extracted from FFPE specimens [5, 8–10]. However, in order to improve the results obtained from FFPE samples, the original BIOMED-2 protocol (35 amplification cycles) might benefit from the addition of five to ten PCR cycles and from a slight increase in the amount of IgH (30 pmol) and IgK J and Kde (20 pmol) primers. In addition, performing duplicate PCR assays for each primer set has proven to be essential in order to avoid misinterpretation of artificial results originating from the quality issues associated with FFPE DNA.

Limitations and pitfalls in the use of FFPE DNA

Despite major improvements in the overall quality of FFPE-extracted DNA in the last 10 years, attributed to the widespread use of neutral-buffered formalin, a significant proportion of tissue samples still contain DNA of poor quality. Reasons for this lack of quality are prolonged fixation in formalin, thermal alteration during removal of the tissues (laser-assisted surgical procedures) or post-fixation treatment (i.e. EDTA decalcification under non-neutral pH conditions). In these cases, a reliable detection of clonality is only possible if the amplification product which defines clonality is small (Fig. 2). Furthermore, there is a substantial number of cases in which, e.g. the IgH clonality is only detectable with the FR1 and FR2 primer sets (BIOMED-2 IgH tubes A and B) due to the presence of somatic mutations. If these PCR products, with their size of more than 200 bp, are not amplifiable because of poor DNA quality and the IgH FR3 primer combination (BIOMED-2 IgH tube C) is unable to demonstrate the clonality, the sole usage of IgH primers might produce false-negative results. In order to overcome this severe problem, the additional usage of primers suitable to demonstrate B cell clonality based on Ig light chain gene rearrangements is highly recommended. Since most of the amplificates generated by the Ig light chain gene PCRs are relatively small, there is a high probability of detecting B cell clonality even in cases in which the IgH PCR was unable to demonstrate the presence of the clonal B cell population. However, the additional use of Ig light chain gene primer sets is recommended not only for FFPE specimens but also for DNA preparations from unfixed specimens, since extensive somatic IgH hypermutations can prevent detection of B cell clonality based on IgH primers alone, irrespective of DNA quality. Thus, the simultaneous application of both IgH and Ig light chain gene primer sets should be regarded as a general rule (see Groenen et al. in this issue and [11]). The use of the Ig lambda primer set is optional since it provides no additional information in cases where clonality is detectable with the Ig kappa primers. However, due to its small product size range it might be useful in cases with limited DNA quality.

Fig. 2 Impact of DNA degradation on the results of IgH, Ig kappa and Ig lambda PCR assays

The low quality of DNA might not only produce artificial results from Ig rearrangement PCRs but can also impair the results of the TCR PCR. This holds especially true for the TCR-gamma PCR, which is most widely used for the detection of clonal T cell populations. Due to the design of the BIOMED-2 TCRG primer set tube A, a wide range of different amplificate sizes is generated with this primer combination. In cases with poor quality of the extracted DNA, there is strong overrepresentation of preferentially amplified small PCR products. Since rare types of TCRG rearrangements are covered in the small product size range of TCRG tube A, an artificial dominant PCR product pattern, mimicking a clonal peak, may arise. Very similar to the situation described for the Ig PCR, the additional application of the TCRB PCR might help to clarify the diagnostic question in most cases. The PCR products of the TCRB PCR cover a very narrow size range, which prevents the preferential artificial amplification of small amplificates. Since the TCRG PCR products are difficult to interpret not only in cases with poor DNA quality but also in a proportion of cases with sufficient DNA quality, the simultaneous amplification of both TCRG and TCRB rearrangements is highly recommended for the reliable detection of T cell clonality [12].

The limited quality of FFPE-derived DNA especially in combination with a low percentage of (clonal) lymphoid cells tends to produce artificial clonal rearrangement patterns which are best described by the term pseudoclonality. In these cases, single dominant PCR products arise demonstrating clonality if only single PCR assays are examined (Fig. 3, “1. PCR” diagrams). However, when the PCR is carried out as a duplicate, it becomes evident that these dominant amplification products are not reproducible in their sizes and thus do not represent a true clonal B or T cell population (Fig. 3, “1. PCR” and “2. PCR” diagrams of each primer set). This phenomenon is predominantly caused by a low content of lymphoid cells in a given sample and the resulting presence of a restricted repertoire of rearrangements. This very complex situation is further reinforced in FFPE cases of limited DNA quality. Examples for such a setting are the investigation of gastric biopsies for the presence of a B cell MALT lymphoma [10] or of skin biopsies for the differential diagnostic question of a mycosis fungoides versus a reactive T cell lesion [6]. Very similarly, the analysis of intestinal biopsies to discriminate between (refractory) sprue and early enteropathy-associated T cell lymphomas [13] might illustrate the limitation of clonality studies.
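The comparison of duplicate PCRs described above can be expressed as a simple check on the sizes of the dominant products. The sketch below assumes that dominant peaks have already been identified in each run (peak calling itself is not covered) and that a size tolerance of 1 bp is acceptable between runs; both the tolerance and the function names are hypothetical.

```python
def reproducible_peaks(peaks_run1, peaks_run2, tolerance_bp=1.0):
    """Dominant peak sizes of run 1 that reappear in run 2 within tolerance_bp."""
    return [
        p for p in peaks_run1
        if any(abs(p - q) <= tolerance_bp for q in peaks_run2)
    ]


def classify_duplicate(peaks_run1, peaks_run2, tolerance_bp=1.0):
    """Coarse classification of a duplicate analysis of one primer set.

    The tolerance of 1 bp is an assumption for illustration only.
    """
    if not peaks_run1 and not peaks_run2:
        return "no dominant peak"
    if reproducible_peaks(peaks_run1, peaks_run2, tolerance_bp):
        return "reproducible dominant peak"
    return "not reproducible (suspect pseudoclonality)"


# Hypothetical peak sizes in bp from two PCRs of the same DNA extraction
print(classify_duplicate([212.4], [212.9]))
print(classify_duplicate([212.4], [187.0]))
```

A reproducible dominant peak in this sketch is still only a technical finding; whether it represents a true clonal population has to be judged together with the DNA quality and the cellular composition of the specimen.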

Fig. 3 Pseudoclonal T cell rearrangement pattern arising from low-quality FFPE DNA

Conclusions and recommendations

Formalin-fixed and paraffin-embedded tissue samples represent the diagnostic substrate of molecular pathology with only a few exceptions. In general, they represent a feasible material for diagnostic testing, including clonality assessment. However, in the diagnostic routine, only histologically and immunophenotypically unclear and difficult cases are subjected to clonality analyses, thereby challenging the limits of this technique. Based on our (Hematopathology, Charité, Berlin, Germany) broad experience of more than 15 years in diagnostic PCR-based clonality analysis (more than 10,000 cases representing approx. 15,000 DNA samples), we would like to formulate the following recommendations:

  • Purified DNA rather than crude extracts should be used for clonality assays. Crude extracts (1) might contain PCR inhibitors, (2) are unstable excluding later re-use and (3) cannot be accurately quantified.

  • Quality control of the extracted DNA is essential for proper interpretation of PCR clonality results. The most appropriate approach is a PCR for single-copy target genes generating amplificates of well-defined different sizes. This ladder of PCR products allows a very good estimation of the DNA quality. In addition, the equal amplification of the whole expected range of polyclonal product sizes is also a good indicator of DNA quality in cases without clonality.

  • The input of a sufficient amount of DNA derived from lymphoid cells is relevant to confirm clonality and exclude pseudoclonality, especially in extranodal tissue specimens (see also: Groenen et al., in this issue).

  • Clonality assays should be performed in duplicate. Since cases subjected to diagnostic clonality assays are usually difficult and most often do not produce unequivocal polyclonal or monoclonal patterns, oligoclonal or pseudoclonal rearrangement patterns might be interpreted as false-positive results. This misinterpretation can be avoided by independent repetition of the clonality PCR.

  • The interpretation of the clonality PCR results obtained in a diagnostic molecular pathology environment should be performed with adequate knowledge of the respective histology of the specimens (e.g. cellular composition of the specimens) but independently of the histological diagnostic process. This prevents a biased interpretation of complex rearrangement patterns in the light of the preliminary histological diagnosis.

Legend to Fig. 1: Amplification was carried out according to the BIOMED-2 protocol with primers for TBXAS1, RAG1, PLZF and AF4 resulting in PCR products of 100, 200, 300 and 400 bp, respectively. PCR products were separated on a 6% polyacrylamide gel.

  • Lane 1: DNA from a frozen tonsillar biopsy representing high-quality DNA. PCR products up to 400 bp of equal intensity are visible.

  • Lanes 2, 4, 6: DNA extracted from well-fixed FFPE samples. All four PCR products are amplified, demonstrating that DNA from FFPE samples is in principle suitable for PCR analysis. However, the 400-bp product is weaker than in the frozen biopsy in lane 1.

  • Lanes 3, 5, 7: Same samples as in lanes 2, 4 and 6, respectively, but treated with a higher concentration (40%) of unbuffered formalin to artificially produce extensive DNA degradation. Only PCR products up to 200 bp are amplifiable.

  • Lane 8: DNA extracted from an optimally fixed FFPE sample stored over several years. In this case, the DNA was affected by the storage, and at most a weak product of 300 bp is seen.

Legend to Fig. 2: DNA was extracted from brain tissue (FFPE material) to determine the presence of a B cell lymphoma. The size ladder PCR demonstrated almost complete degradation of the extracted DNA. Amplification was performed according to the BIOMED-2 protocol (modified as mentioned above) with (a) primer sets FR1, FR2 and FR3 for IgH and (b) primer sets A and B for Ig kappa and the Ig lambda primer set. Single examples from duplicate analyses are shown. Amplification products were separated on a Genetic Analyzer 3130 (Applied Biosystems). The X-axis shows the lengths of amplification products in base pairs, and the Y-axis the fluorescence intensity. In highly degraded DNA, PCR products and thus potentially clonal lymphocyte populations are only detectable with primer sets resulting in low molecular weight products (FR3, IGK A, IG Lambda).

Legend to Fig. 3: DNA was extracted from a gastric biopsy (FFPE material; the size control PCR demonstrated a 200 bp product; the proportion of T cells in the analysed tissue was 30%) to determine whether an enteropathy-associated T cell lymphoma had developed in a patient with refractory sprue. Amplification was performed according to the BIOMED-2 protocol with primer sets TCRG A and B for detection of T cell rearrangements. Amplification products were separated on a Genetic Analyzer 3130 (Applied Biosystems). The X-axis shows the lengths of amplification products in base pairs, and the Y-axis the fluorescence intensity.

Looking at a single PCR assay (1. PCR), a clonal rearrangement pattern is seen with primer set TCRG A. Duplicate analysis (2. PCR from same DNA extraction) demonstrates that the result is not reproducible and thus not representative of a real clonal T cell population. Similarly, if only the 1. PCR result with the TCRG primer set B is examined, a clonal T cell population might be diagnosed.