Background

Array comparative genomic hybridization [1–3] and FISH-based methods [4] have been widely used for the detection of DNA copy number changes. However, the resolution of commercially available DNA arrays can be too low to detect microdeletions or microduplications [5, 6], whilst FISH is generally only useful when the regions of interest have been previously defined. Currently, DNA arrays providing full coverage of the human genome are not widely available and are too expensive for diagnostic screening of large numbers of patients. Moreover, the findings emerging from recent array comparative genomic hybridization studies indicate that significant validation in both control and patient populations will be required to make phenotype-genotype interpretations in a clinical context. Similarly, standard FISH methods are time consuming and costly, and suffer the significant limitation that some patients with uniquely localized microdeletions or microduplications may yield normal clinical FISH findings because the probe set used does not map precisely to the entire region of deletion or duplication.

In this study we have chosen to validate the use of qPCR technology for detection of microdeletions or microduplications using Velocardiofacial Syndrome (VCFS) or chromosome 22q11 deletion syndrome (22q11DS) as a test model. The frequency of the causative deletion for 22q11DS in the general population is 1 in 3000 live births, making it one of the most common constitutional genomic alterations found in humans [7]. 22q11DS is suspected in individuals with characteristic clinical findings and is confirmed in most cases by detection of a sub-microscopic deletion using FISH.

The currently accepted clinical laboratory assay for 22q11DS uses the TUPLE1 FISH probe, which is located within a typically deleted region of approximately 3 Mb. Although this assay can detect the majority of affected patients (85–90%), many patients with phenotypic features of 22q11DS have no deletion detectable by FISH testing. As a consequence these patients will go undiagnosed due to the presence of atypical deletions that map outside the area covered by the TUPLE1 probe [8–12]. In addition, there have been reports of individuals with some features of 22q11DS with microduplications of 22q11.2 [13]. Unfortunately, clinical FISH assays are not usually capable of detecting such duplications, so alternative methods, such as FISH analysis of interphase nuclei, are required [14]. However, such techniques require advanced optical instrumentation, presently only used in specialized research laboratories. Alternative molecular technologies that could potentially be used to screen for deletions in the diagnosis of 22q11DS are multiplex ligation-dependent probe amplification (MLPA) [15] and microsatellite marker analysis [10].

The novel approach validated in this study utilized qPCR rather than FISH to detect copy number alterations (microdeletions and microduplications) in patient DNA. This approach had several advantages. Primers were selected within regions of unique sequence utilizing publicly available sequence databases [16–24]. This method can allow for the production of a high-resolution map of any region of interest; the 22q11DS region is used in this example. The qPCR technique provides a quantitative measurement of DNA copy number and allows accurate characterization of chromosomal breakpoints. This method will therefore permit the identification of individuals who would otherwise go undetected by the currently available clinical FISH methods. In addition, qPCR provides greater flexibility and adaptability, whilst being less technically challenging than FISH, thus making it more appropriate for use in a large number of laboratories. Furthermore, the qPCR technique takes only a fraction of the time usually required for FISH, which allows multiple samples and multiple primer sets to be studied in parallel, using convenient and cost-effective high-throughput analysis. The method described in this paper has been evaluated using patient and control samples with known copy number changes on 22q11. The approach can be readily adapted for molecular diagnostics of any region of the genome suffering recurrent constitutional genomic deletion or duplication.

Results

FISH results

The twelve patients utilized in this study had previously been analysed in a clinical laboratory by FISH using the TUPLE1 probe. Using this test, six of the patients (group 1) were identified as 22q11 deletion positive whilst the other six (group 2) and the four controls showed normal cytogenetic results. The DNA sample known to have three copies of 22q11 (trisomy for chromosome 22) had been previously analysed by FISH [25].

Primer design

The most critical factor for successful detection of micro-alterations using qPCR was primer design. To guarantee optimal primer design, a high-resolution 22q11.2 physical map was constructed using information available from published reports [8–12, 26–34] and online databases and repositories [16–24]. This allowed for the identification of unique sequences within the 22q11DS affected region whilst also avoiding the complex repetitive regions. Figure 1 shows a schematic representation of the 22q11.2 region studied in this work. The locations of previously reported deletions and of deletions identified in our study are shown, as are the locations of primers, repeat sequences, known genes and pseudogenes.

Figure 1

Schematics of the 22q11.2 region. Previously reported deletions and deletions identified from our study are shown. A) The 10 qPCR primers used to screen for hemizygous deletions, orientation is centromere to telomere. B) Known low copy number repeats or segmental duplications in 22q11.2: LCR-A, LCR-B and LCR-D (Shaikh et al., 2000). C) Known genes [24]. D) Location of previously reported deletions in 22q11DS patients. E) Locations of hemizygous deletions and duplications identified in this study. For D and E, hashed ends represent regions of uncertainty regarding precise location of deletion breakpoints.

Sequences were selected to lie, mostly, within exonic and/or intervening regions from known or putative genes: UniSTS marker D22S181 (UniSTS is a comprehensive database of sequence tagged sites) [35], Proline Dehydrogenase/Proline Oxidase (PRODH), TUP-like enhancer of SPLIT 1 (TUPLE1), Catechol-O-Methyltransferase (COMT), Zinc Finger Protein 74 (ZNF74), Phosphatidylinositol 4-kinase (PIK4CA), Leucine-zipper-like Transcriptional Regulator 1 (LZTR1), Cationic Amino acid Transporter-4 (CAT4), D22S936 and Similar to Pre-B lymphocyte gene 1 (VPREB1). The selected sequences were aligned against the human genome using the BLAT program [36] to ensure that only contiguous sequences with 100% homology to one unique location were selected. A single primer set was then designed from each of these unique sequences using the Primer Express v2.0 program (Applied Biosystems). We designed reference primers for each of the "housekeeping genes" Glucose-6-Phosphate Dehydrogenase (G6PDH) and Hydroxymethylbilane Synthase (HEM3), which were selected using published guidelines [37]. Moreover, we ensured that they were single copy genomic sequences according to the BLAT alignment [38]; in addition, agarose gel visualization confirmed a single band of the expected size. All primer sequences are shown in Table 1. The reference primers were used to control for varying input amounts of DNA from each patient. Thus any differences in the qPCR values obtained for test primers/markers would correspond to differences in the amount of the primers' target sequence.

Table 1 Sequence and parameters of the reference and 22q11.2 test primer sets. Ten sets of primers were designed from within regions of unique sequence on 22q11.2 using Primer Express v2.0. In addition, two sets of reference primers for G6PDH and HEM3 were also designed to allow for data correction.

Optimization process

Primer concentrations were optimised over a range of final concentrations, 100 nM to 900 nM at 100 nM intervals. The optimal concentration was that which obtained the lowest threshold cycle (Ct) and maximum ΔRn while minimizing non-specific amplification. The results indicated that for all of the 22q11.2 primer sets the optimal final primer concentration was 800 nM, whilst for the G6PDH and HEM3 primer sets this value was 400 nM. The specificity of amplification for each qPCR product was confirmed by determining that the melting curve showed a single dissociation peak corresponding to the melting temperature of the analysed amplicons (See Additional File 3, Figure 2 b. Dissociation curve for PRODH, 14 DNA samples).

To allow for comparisons between primer sets, all had to amplify with comparable efficiency. This was assessed by analysis of the slopes from the standard curves, which were generated using a log10 dilution series of input genomic DNA (range 10² nM to 10⁻² nM). If all conditions are optimal and reactions are 100% efficient, it will take approximately 3.32 cycles for ten-fold amplification (log2 of 10 = 3.321928) of product, a value that is equal to the slope of the standard curve. This translates to 1 cycle to copy 1 molecule into 2; a second cycle to copy 2 molecules into 4; a third cycle to copy 4 into 8 and 0.32 cycles to copy 8 into 10. There was a linear relationship between the amount of input DNA and the threshold cycle (Ct) values for the various reactions. Regression analyses of the Ct values generated by the log10 dilution series gave R² values for all reactions in excess of 0.99 (Table 1).

At the optimal primer concentrations all of the primer sets gave slope values of 3.32 ± 0.25, indicating that the reactions were occurring with similar efficiencies. Following primer optimisation a baseline Ct was identified for each primer set, which was used when analysing subsequent data (R², slope and threshold cycle values are shown in Table 1).
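The link between the standard-curve slope and amplification efficiency can be illustrated with a short Python sketch. It assumes the commonly used relation E = 10^(1/|slope|) − 1, which is not stated explicitly in the text, and the function names are illustrative only.

```python
import math


def amplification_efficiency(slope: float) -> float:
    """Fractional efficiency per cycle from the magnitude of the slope.

    Assumes E = 10**(1/|slope|) - 1, so a slope of about 3.32
    corresponds to E close to 1.0 (product doubles every cycle).
    """
    return 10 ** (1.0 / abs(slope)) - 1.0


def cycles_per_tenfold(efficiency: float = 1.0) -> float:
    """Number of cycles needed for a ten-fold increase in product."""
    return 1.0 / math.log10(1.0 + efficiency)


if __name__ == "__main__":
    # Centre and edges of the accepted slope window, 3.32 +/- 0.25.
    for slope in (3.07, 3.32, 3.57):
        print(slope, round(amplification_efficiency(slope), 3))
    print(round(cycles_per_tenfold(1.0), 6))
```

Under that assumption, the accepted slope window of 3.32 ± 0.25 corresponds to efficiencies of roughly 91% to 112%.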

Data correction/normalization

To control for differences in sampling, DNA preparation, reaction efficiency (the varying PCR efficiencies between patient samples and the calibrator) [39] and other variables [40], the Ct values for each primer pair from 22q11.2 were corrected/normalized using the Ct values of the G6PDH and HEM3 products for the same sample. Although the input of template was standardised at 10 ng of DNA, the Ct values for the housekeeping genes differed slightly from patient to patient and from group to group (Table 3), thus demonstrating the need for correction of the raw data prior to further analysis.

Table 3 Example of uncorrected Ct values for the reference (housekeeping) primer set G6PDH. Uncorrected Ct values (average of three replicates) obtained with the G6PDH reference (housekeeping) primer set for the DNA samples under study. Although the starting amount of DNA for all samples was 10 ng, the Ct values for the housekeeping genes differed slightly from patient to patient and from group to group.

Correction was performed using a method described by Moody et al. (2000) [41]. Once the corrected Ct values (KCt) for each of the test markers had been determined it was then possible to identify fold copy number change (ΔKCt) for each of the markers from the 22q11.2 region, using a formula described by Sijben et al. (2003) [42]. In the context of 22q11DS this approach calculates a ratio; by comparing the Ct value obtained for each primer pair between the normal (control) DNA samples and the patient (affected) DNA samples. This value is then translated into fold changes (copy number gain or loss) per sample. For all calculations please refer to the Methods section.

qPCR data

A summary of the qPCR results is presented in Table 2; light grey shading denotes loss, whilst dark grey shading denotes gain. For the affected samples we obtained ΔKCt values of either 0 ± 0.35, indicating an equal ratio of the target and reference, which corresponds to no loss and therefore no genetic abnormality, or -1 ± 0.35, indicating loss of one copy (microdeletion). For the trisomy 22 patient we obtained a ratio of 1 ± 0.35, indicating gain of one copy. Each experiment was performed in triplicate, with replicates performed on different days. The inter-assay Ct variation (same assay repeated on different days) and the intra-assay Ct variation (the triplicates) were less than or equal to ± 0.35 cycles.
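The interpretation windows described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name and the "indeterminate" label for values falling outside all three windows are assumptions, not part of the published method.

```python
def classify_delta_kct(delta_kct: float, tolerance: float = 0.35) -> str:
    """Map a fold copy number change value onto a copy number call.

    Windows follow the text: -1 +/- 0.35 (loss of one copy),
    0 +/- 0.35 (no change) and 1 +/- 0.35 (gain of one copy).
    Values outside all windows are flagged rather than forced into
    a class (an assumption made for this sketch).
    """
    for expected, label in ((-1.0, "loss"), (0.0, "normal"), (1.0, "gain")):
        if abs(delta_kct - expected) <= tolerance:
            return label
    return "indeterminate"


if __name__ == "__main__":
    # Made-up values chosen only to exercise each branch.
    for value in (-0.9, 0.1, 1.2, 0.5):
        print(value, classify_delta_kct(value))
```

Because the three windows are separated by gaps of 0.3 cycles, a value such as 0.5 matches none of them and would need to be repeated or inspected manually.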

Table 2 Fold copy number change (ΔKCt) for the 12 22q11DS patients, 22q11 Duplication and 4 controls. ΔKCt values of 0 ± 0.35 indicate an equal ratio of the target and reference, which corresponds to no loss and therefore no genetic abnormality, values of -1 ± 0.35 indicate loss of one copy (microdeletion), whilst values of 1 ± 0.35 (seen for the trisomy 22 patient) indicate gain of one copy (microduplication).

Discussion

In our analysis we have used a series of 12 qPCR primer sets to analyze twelve patients with clinical symptoms of 22q11DS and four controls. Ten of the primer sets (test primers) amplify markers localized within and around the chromosome 22q11.2-deleted region (D22S181, PRODH, TUPLE1, COMT, ZNF74, PIK4CA, LZTR1, CAT4, D22S936 and VPREB1) and two amplify "housekeeping" genes/markers (reference primers) (G6PDH and HEM3). The 22q11DS locus contains approximately 50 genes or pseudogenes and is characterized by an unusual genomic architecture comprising large polymorphic chromosome-specific low copy repeats (Figure 1) [34]. The repetitive nature of this region of the genome is thought to increase the frequency of deletion and duplication events associated with clinical disease, and it required the use of bioinformatics to identify unique regions for primer design.

In the context of 22q11DS we were trying to discriminate between 2 copies of a product (normal) versus 1 copy of the same product (deleted). In order to do this we had to implement data correction [41] to control for variations such as the input DNA concentration or reaction efficiencies [39]. To perform the correction we used the reference genes. The copy number of the reference genes will be the same in all of the samples under investigation. Any variation in copy number for the reference between samples will be the result of differences in initial template concentration, as long as the same DNA is sampled for both the control and target primers. Once the Ct values for the reference were determined for each sample, it was then possible to use these to correct the values of the test markers for variation in initial template concentration. This correction permitted determination of copy number differences between the samples under investigation.

Individuals who had previously shown a deletion of TUPLE1 by FISH also showed deletion by qPCR (group 1). Our data demonstrate that the ΔKCt values of -1 ± 0.35 for the primers PRODH, TUPLE1, COMT, ZNF74, PIK4CA, LZTR1, CAT4 and D22S936 are indicative of deletion (Table 2), a finding that is in 100% concordance with the FISH results for the TUPLE1 probe. The region of deletion, spanning PRODH to D22S936, represents an interval of 2,502,410 base pairs (bp). Furthermore, the implementation of qPCR has allowed for the identification of breakpoints within the 22q11.2 region. The markers D22S181 and VPREB1 show values indicative of no deletion (0 ± 0.35), suggesting that the proximal deletion breakpoint occurs between the markers D22S181 and PRODH and the distal breakpoint between D22S936 and VPREB1 (see Table 2 and Figure 1).
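The reasoning used to bracket the breakpoints can be sketched in Python: given per-marker calls ordered from centromere to telomere, each contiguous run of deleted markers is reported together with its nearest non-deleted flanking markers. The function names are hypothetical, and the marker order is assumed to follow the order in which the markers are listed in the text.

```python
def _describe(calls, first, last):
    """Summarise one run of deleted markers and its flanking markers."""
    return {
        "first_deleted": calls[first][0],
        "last_deleted": calls[last][0],
        "proximal_flank": calls[first - 1][0] if first > 0 else None,
        "distal_flank": calls[last + 1][0] if last + 1 < len(calls) else None,
    }


def deleted_intervals(calls):
    """Find contiguous runs of 'loss' calls in map order.

    calls: list of (marker, label) tuples, centromere to telomere.
    Each breakpoint lies between a flanking marker and the adjacent
    deleted marker.
    """
    intervals, start = [], None
    for i, (_, label) in enumerate(calls):
        if label == "loss" and start is None:
            start = i
        elif label != "loss" and start is not None:
            intervals.append(_describe(calls, start, i - 1))
            start = None
    if start is not None:
        intervals.append(_describe(calls, start, len(calls) - 1))
    return intervals


# Pattern reported for the group 1 patients (assumed map order).
GROUP1_CALLS = [
    ("D22S181", "normal"), ("PRODH", "loss"), ("TUPLE1", "loss"),
    ("COMT", "loss"), ("ZNF74", "loss"), ("PIK4CA", "loss"),
    ("LZTR1", "loss"), ("CAT4", "loss"), ("D22S936", "loss"),
    ("VPREB1", "normal"),
]

if __name__ == "__main__":
    print(deleted_intervals(GROUP1_CALLS))
```

For the group 1 pattern this returns a single interval from PRODH to D22S936, flanked by D22S181 and VPREB1, which matches the breakpoint placement described above. Adding primer sets between a flanking marker and the adjacent deleted marker would narrow the interval of uncertainty.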

The individuals that did not demonstrate loss using the TUPLE1 FISH probe were also deletion negative by qPCR, showing ΔKCt values similar to the normal controls (0 ± 0.35) thus indicative of no loss. For the 22q11 trisomy sample the qPCR results showed ΔKCt values indicative of duplication (1 ± 0.35) for PRODH, TUPLE1, COMT, ZNF74, PIK4CA, LZTR1, CAT4 and D22S936. The markers D22S181 and VPREB1 again showed values indicative of no copy number change (0 ± 0.35).

To accurately calculate fold changes, to the high precision required for genomic DNA quantitation, the real-time data analysis used here makes the most of the existing methods. The slope values determined from the standard curves allow correction for variations in primer efficiency and reaction kinetics, whilst the relative abundance ratio, calculated after the samples are normalized using the reference genes, allows determination of gene/marker copy number. This rationale reduces, as much as possible, the unavoidable approximations introduced with any method of data interpretation.

The use of qPCR to detect and refine copy number differences in patients suffering from 22q11DS provides a further novel application for qPCR methodology. When used in a research setting, this type of analysis has proven to be very useful when comparing levels of a transcript or genomic markers [43, 44] between different groups, and has been widely used in the study of human malignancies. However, such an approach has rarely been applied in a clinical setting to study constitutional deletions. The advantage of this technique over standard FISH assays is that qPCR provides a quantitative measurement of DNA copy number.

Conclusion

Here we demonstrate the application of a robust, fast and accurate real-time quantitative PCR-based assay using SYBR® Green I dye that is capable of screening for copy-number alterations in genomic DNA.

Although qPCR detection methods have previously been used in 22q11.2 deletion analysis [45–47], these reports have only used a small number of primers/markers and have not been able to refine the region of deletion as has been done here. The utility of the approach outlined in this paper is the ease with which one can increase resolution by increasing the number of primers in the 22q11 deleted region, thus facilitating accurate mapping of deletion breakpoints. Fine-structure qPCR mapping of deletions will reveal important clues to the mechanism by which the deletion occurs and thus will offer insights into the "at risk" factors predictive of deletions or other rearrangements.

The implementation of qPCR for genomic copy number profiling will provide a valuable tool for detection of atypical microdeletions and/or microduplications in individuals who go undiagnosed by the currently available FISH methods. Such data will be useful for phenotype correlation studies. In addition, this methodology has the advantage of providing greater flexibility and adaptability than the currently available cytogenetic methods and will be beneficial in molecular classification and diagnosis.

Methods

Primer design

Primers (Table 1) were designed using Primer Express v2.0 (Applied Biosystems). The parameters for primer design were as follows: amplicons of 100–250 bp with a penalty score no higher than 10–12; primer melting temperature (Tm) range 58°C to 60°C; primer length range 18 to 25 bp, optimal 20 bp; primer G/C content range 20% to 60%; amplicon maximum Tm of 85°C. Runs of more than three identical nucleotides were avoided, as polyG or polyC stretches can promote non-specific annealing, whilst runs of polyA and polyT can potentially open up stretches of the primer-template complex. Primers were required to be free of self-complementary sequence in order to avoid hairpin loop formation. In addition, no more than two G and/or C bases were permitted in the last five nucleotides at the 3' end to avoid GC clamp formation. Minimal deviation from the parameters was allowed only when there were no other options due to the complex nature of the 22q11.2 sequence. Primer and amplicon sequences were compared to the human genome using the BLAT program to guarantee that they showed 100% homology to only the sequence from which they were designed and also to guarantee that the forward and reverse primers were free of single nucleotide polymorphisms.
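Some of the sequence-level rules listed above can be checked programmatically. The Python sketch below is a hypothetical helper, not part of Primer Express: it covers only primer length, G/C content, nucleotide runs and the 3' end rule, and it does not attempt Tm, penalty score, self-complementarity or BLAT uniqueness checks. The example sequences are invented and are not taken from Table 1.

```python
import re


def primer_issues(seq: str) -> list:
    """Return a list of rule violations for one primer sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty primer sequence")
    issues = []
    # Primer length range 18 to 25 nucleotides.
    if not 18 <= len(seq) <= 25:
        issues.append("length outside 18-25 nt")
    # G/C content range 20% to 60%.
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    if not 20.0 <= gc_percent <= 60.0:
        issues.append("GC content outside 20-60%")
    # Runs of more than three identical nucleotides (four or more).
    if re.search(r"(.)\1{3,}", seq):
        issues.append("run of more than three identical nucleotides")
    # No more than two G and/or C bases in the last five 3' bases.
    if seq[-5:].count("G") + seq[-5:].count("C") > 2:
        issues.append("more than two G/C in the last five 3' bases")
    return issues


if __name__ == "__main__":
    # Invented sequences, for illustration only.
    print(primer_issues("ATGCTAGCTAGGATCCTAGA"))
    print(primer_issues("ATGGGGCTAGCTAGGATCGCGC"))
```

A check of this kind is useful mainly as a quick filter before the uniqueness and polymorphism checks against the genome, which remain the decisive step in a repeat-rich region such as 22q11.2.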

Isolation of DNA

The DNA used in this study was obtained from the lymphocytes of four normal controls and 12 patients displaying clinical features of 22q11DS using phenol-chloroform extraction [48]. DNA was quality tested. Suitable samples for the qPCR reactions were of high molecular weight (un-sheared band of undigested DNA visible on a 0.5% agarose gel) and as clean as possible (an OD 260/280 ranging from 1.8 to 2.0).

Reaction conditions

Reactions were performed using SYBR Green I PCR Master Mix (Applied Biosystems), which includes the internal reference (ROX). Each qPCR reaction comprised 12.5 μl 2× SYBR Green PCR Master Mix, forward and reverse primer at optimized concentrations of 800 nM (final concentration) for the 22q11.2 test primers and 400 nM (final concentration) for the reference primers, 10 ng/μl genomic DNA template and sterile water up to a final volume of 25 μl. The qPCR reactions were performed using the ABI Prism 7900 high-throughput sequence detection system. The reaction profile was: initial step, 50°C for 2 min, denaturation, 95°C for 10 min, then 40 cycles of denaturing at 95°C for 15 sec and combined annealing and extension at 60°C for 60 sec.

Generating the standard curve

To generate standard curves for the selected primers and the reference primers a log10 dilution series of genomic DNA was prepared at concentrations ranging from 10² nM to 10⁻² nM. Each dilution was tested in triplicate. When analyzed by qPCR, the dilution series produced a set of standard curves, which were used to calculate the slope value with the aid of the SDS software version 2.1 (Applied Biosystems) (values are shown in Table 1). See Additional File 3, Figure 2 e. and f. for an example of SDS output report of standard and amplification plots.
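The slope and R² values were obtained from the SDS software, but the underlying calculation is an ordinary least-squares fit of Ct against log10 of the input amount. The Python sketch below illustrates that fit under stated assumptions: the function name is hypothetical and the Ct values are synthetic, generated from an ideal curve rather than measured.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Least-squares fit of Ct against log10(input amount).

    Returns (slope, intercept, r_squared). The fitted slope is
    negative because Ct falls as the input amount rises; this sketch
    assumes the slope quoted in the text is its magnitude.
    """
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic five-point log10 dilution series (not measured data).
LOG10_AMOUNTS = [2, 1, 0, -1, -2]
IDEAL_CT = [25.0 - 3.32 * x for x in LOG10_AMOUNTS]

if __name__ == "__main__":
    slope, intercept, r2 = fit_standard_curve(LOG10_AMOUNTS, IDEAL_CT)
    print(round(abs(slope), 2), round(intercept, 2), round(r2, 4))
```

With real triplicate data, each replicate Ct would be entered as a separate point, and an R² below the 0.99 level reported in Table 1 would suggest a pipetting or dilution problem.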

DNA quantification data analysis

Each qPCR experiment contained triplicates of the no-template-controls and patient samples for all of the primers tested. On the same reaction plate all DNA samples were tested with the test and reference primers. When any particular sample was being tested, the qPCR using reference primers and that sample were always included on the reaction plate. Each experiment was performed in triplicate, with replicates being performed on different days. Quantification was based on the increased fluorescence, which was measured and recorded using the ABI Prism 7900 sequence detection system and associated SDS software version 2.1 (Applied Biosystems). Results were expressed in terms of the threshold cycle value (Ct; the cycle at which the change in fluorescence for the SYBR dye passes a significance threshold). The threshold values are shown in Table 1. The output of the results was exported in tab-delimited text file format. Further calculations were performed using Microsoft Excel. PCR products were resolved by agarose gel electrophoresis to confirm the presence of a single band of the expected size (See Additional File 3, Figure 2 a. showing SDS output of amplification plot for 14 DNA samples for the PRODH and G6PDH primer sets along with images of corresponding gel bands c. and d.).

Data normalization

The qPCR data were normalized by adapting a method devised by Moody et al. (2000) [41] and also described by Sijben et al. (2003) [42] (see Additional File 1: Derivation of the formula).

$$KCt_{i} = \left( \frac{ACt_{R} - Ct_{Ri}}{S_{R}} \right) \times S_{T} + Ct_{Ti}$$

Where:

KCti = "Corrected Ct" (KCt) of the test primer (T) against the reference (R)

ACtR = the Average Ct value for Reference primer set for all the samples included in one qPCR run (control and patient).

CtRi = Ct value for Reference primer set for the sample to be corrected.

SR = slope value (from the standard curve) for reference primer set.

ST = slope value (from the standard curve) for test primer set.

CtTi = Ct value for test primer set.
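The correction formula can be written as a one-line function. The Python sketch below uses illustrative parameter names that map onto the symbols defined above, and the numbers in the example are invented to show the direction of the correction.

```python
def corrected_ct(ct_test: float, ct_ref: float, mean_ct_ref: float,
                 slope_test: float, slope_ref: float) -> float:
    """Corrected Ct (KCt) of a test primer set for one sample.

    ct_test     -> CtTi: Ct of the test primer set for the sample
    ct_ref      -> CtRi: Ct of the reference primer set for the sample
    mean_ct_ref -> ACtR: average reference Ct over all samples in the run
    slope_test  -> ST:   standard-curve slope of the test primer set
    slope_ref   -> SR:   standard-curve slope of the reference primer set
    """
    return ((mean_ct_ref - ct_ref) / slope_ref) * slope_test + ct_test


if __name__ == "__main__":
    # Invented values: the reference for this sample came up 0.4 cycles
    # later than the run average, suggesting slightly less input DNA.
    print(round(corrected_ct(ct_test=25.4, ct_ref=22.4, mean_ct_ref=22.0,
                             slope_test=3.32, slope_ref=3.32), 2))
```

In this example the test Ct is pulled back by the same 0.4 cycles because both slopes are equal; when the slopes differ, the shift is rescaled by the ratio ST/SR.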

Copy number calculation

Fold copy number (ΔKCt) change for each of the markers from the 22q11.2 region, was obtained using the formula:

ΔKCt = KCt/control − KCt/affected

ΔKCt = fold change (copy number gain or loss)

KCt/control = "Corrected Ct" of the test primer for the control samples.

KCt/affected = "Corrected Ct" of the test primer for the affected sample.
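The subtraction defined above can be sketched in Python, including the case of several controls in one run. The function name and the numbers are illustrative assumptions, not data from this study.

```python
from statistics import mean


def delta_kct(control_kcts, affected_kct: float) -> float:
    """Fold copy number change for one marker in one affected sample.

    control_kcts: corrected Ct values of the control samples in the
    same run; their average is used as KCt/control.
    """
    return mean(control_kcts) - affected_kct


if __name__ == "__main__":
    # Invented corrected Ct values for four controls and one patient.
    controls = [25.0, 25.1, 24.9, 25.0]
    print(round(delta_kct(controls, 26.0), 2))
```

A sample carrying one copy instead of two starts with half the template and, at close to 100% efficiency, crosses the threshold about one cycle later, so its corrected Ct is about one cycle higher and ΔKCt is close to -1.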

(Of note, when multiple controls were used in the same reaction run, KCt/control was obtained by averaging the "Corrected Ct" values of all the controls.) (See Additional File 2: Working example)