Background

Gene-expression analysis is ubiquitously used both in life sciences and in medical research [13]. Microarray and reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) analyses are commonly used to measure transcript abundance. Microarrays allow the massive parallel analysis of thousands of genes but still require significant time and financial expenses. RT-qPCR is used as a conventional routine method for expression profiling of a moderate number of genes and is highly suitable when only a small amount of sample is available [1, 4]. High throughput thermal cyclers and PCR platforms help to overcome the limitation of RT-qPCR and to perform simultaneous expression analysis of tens and hundreds of genes for one or more samples depending on the platform [1].

In research practice these two technologies are able to solve a broad spectrum of questions supplementing each other. In clinical diagnostics RT-qPCR is still more popular with the advantage of speed and a relatively low cost [5].

Based on a totally different readout strategy, RT-qPCR has become the standard method for validation of microarray data. However, the application of RT-qPCR requires an appropriate normalization to obtain accurate and reliable quantification of gene expression levels. The purpose of normalization is to remove experimentally induced errors that can be introduced at a number of stages throughout the procedure (variability in sample acquisition and RNA extraction protocols, different reverse transcription reactions and PCR efficiencies) and to reveal the true biological changes in expression [4, 68]. Lack of appropriate normalization can lead to misinterpretation of a target gene expression profile. This is especially pronounced when the samples come from different individuals, different tissues, different time courses, different environmental conditions, etc.

To date, the reference gene concept is a gold standard used for normalization [1]. According to this concept reference genes are endogenous controls whose expression remains stable across all sources of variation during the experimental workflow. Results of RT-qPCR studies that use improper reference genes (e.g. genes that are not constitutively expressed) can be significantly different from results obtained with proper reference genes [7, 9, 10]. So far, as ideal reference genes do not exist, it is essential to select appropriate reference genes for accurate normalization of the RT-qPCR results for each experimental condition. It is worth noting that the expression levels of many commonly used reference genes such as GAPDH, ACTB, RPS18, UBC, B2M, and HPRT1 have been reported to vary considerably in multiple tissues and cells [4, 7, 9, 1115]. However, in the absence of a better choice, the majority of researchers continue to use these genes, sometimes even without preliminary validation [9, 13, 16]. Thus for many studies there is a necessity to identify other candidate reference genes. One of possible ways is RT-qPCR screening of all 535 transcripts involved in maintaining cellular function by RT-qPCR [11, 15]. Another way in our genomic era is to use the wealth of accumulated available data obtained by high-throughput microarray and sequencing technologies. This idea is getting more and more popular and was used in some recent studies of plants [1719], diatoms [20], mammals [21, 22] and human [2327]. Certain attempts were also made to assess the stability characteristics of transcripts in different tissue types and cells of several species in different physiological states [14, 28, 29]. However, no genes were identified as generally stably expressed. Thus it is still a problem to reveal genes that are stably expressed in a given biological context, and the analysis of well-annotated microarray data may be very helpful to solve this problem. This approach promises to be particularly fruitful for heterogenic samples like biopsies which may contain different cell types in different ratios that can vary widely between samples.

In this study, we identified a set of genes that can be used as a reference simultaneously for analysis of breast cancer samples with an unknown hormone receptor status and different cancer status. To achieve this goal we analyzed four available well-annotated Affymetrix Human Genome U133A array datasets containing different breast cancers and normal breast epithelium from breast cancer and cancer-free patients. Then we validated identified candidate reference genes by RT-qPCR on 39 fresh breast cancer biopsies and compared expression stability of these genes and commonly used control genes including ACTB, GAPDH, RPLP0, GUSB and TFRC that form a set of reference genes in a commercial Oncotype DX Breast Cancer diagnostic test [2].

In order to select an optimal set of reference genes we developed a new strategy that utilizes the advantages of the most widely used algorithms for gene stability evaluation: comparative ∆Ct method [8], geNorm [12], NormFinder [1, 30] and Haller equivalence test [31]. Each of these popular methods for gene stability evaluation has its own advantages and pitfalls, and the developed strategy allows scientists to avoid weighing all the pros and cons of individual methods each time, but instead to combine all these methods. We showed that novel candidate reference genes identified by our microarray search have expression stability comparable to or even higher than the stability of commonly used reference genes. We also demonstrated that the selected set of 5 reference genes is more stable than the set of 5 reference genes used in Oncotype DX Breast Cancer diagnostic test. The RT-qPCR data used for the validation study was obtained in compliance with the global standardization accords presented in the MIQE guidelines [32], see Additional file 1.

Methods

Large-scale selection of an extended list of candidate reference genes using microarray data

Four GEO Series [33, 34] of Affymetrix Human Genome U133A array data were taken: GSE17705 (title: “Endocrine Sensitivity Index Validation Dataset”, 298 samples, [35]), GSE10780 (title: “Proliferative genes dominate malignancy-risk gene signature in histologically-normal breast tissue”, 185 samples, [36]), GSE20711 (title: “Epigenetic portraits of human breast cancers (expression data)”, 90 samples, [37]), GSE20437 (title: “Histologically normal epithelium from breast cancer patients and cancer-free prophylactic mastectomy patients”, 42 samples, [38]).

For each data series raw microarray data were processed using Bioconductor [39] xps package [40] implementation of RMA [41]. For each probeset a variability characteristic was computed as a ratio of 95- and 5-percentiles of log-transformed expression levels, and an ascending ordering of probesets was performed according to this variability characteristic. After these steps four lists of probesets were obtained (each corresponding to one data series). Genes corresponding to probesets that belong to top-500 for all four lists were included into the extended list of candidate reference genes.

The variability characteristic used for stability ranking is methodologically close to the interquartile range, but controls expression variation for 90 percent of samples in a dataset instead of 50 percent. Unlike mean deviation and standard deviation that are commonly used for expression stability comparison [14, 15, 24] the ratio of quantiles is absolutely stable with respect to data outliers.

For genes with stable expression the stability ranking based on the ratio of 95- and 5-percentiles of log-transformed expression levels is similar to ranking based on a relative width of expression range in a logarithmic scale: if 5-percintile p 5 of log-transformed expression level equals m(1-w), and 95-percintile p 95 equals m(1 + w), and 0 ≤ w<< 1 then

p 95 p 5 = m 1 + w m 1 w = 1 + w 1 + w + w 2 + = 1 + 2 w + 2 w 2 1 w 1 + 2 w .

The top-500 lists of most stable probesets for four datasets used for the generation of an extended list of candidate genes contained 347–410 unique gene symbols. However, pairwise intersection of top-500 lists corresponding to two datasets (GSE17705 – ER-positive breast cancer patients [35]; GSE20711 – breast cancer patients [37]) contained only 48 unique gene symbols, and interception of top-500 lists corresponding to three datasets (a small dataset GSE20437 – histologically normal epithelium from breast cancer patients and cancer-free prophylactic mastectomy patients [38] – added) contained 27 unique gene symbols. Hence, our extended list of candidate reference genes – it contained 25 unique gene symbols – was formed based mainly on three datasets out of four, and the forth dataset (GSE10780 – a larger set of histologically-normal breast tissues from breast carcinoma patients [36]) only validated our selection.

Patients

For validation of reference genes, samples from 39 patients (mean age 60.5 years; range 35–87 years) were analyzed. Patients were treated for breast cancer at the Gertsen Moscow Research Institute of Oncology or the Moscow State Oncology Medical Center, Russia, in 2012. Patient selection was based only on the availability of tumor tissue. All patients gave written informed consent to access their tissues and review their medical records in accordance with the principles of the Declaration of Helsinki. All patients were treated by mastectomy. Patients did not receive neoadjuvant chemotherapy.

Histologically, 32 tumors were diagnosed as ductal carcinomas, 1 was a mucus-producing adenocarcinoma, and 6 were of other histological types. TMN classification of all tumors is presented in Additional file 2. Twenty five tumors (64%) were estrogen-receptor (ER) positive, and 14 cases (36%) were ER-negative. Progesterone receptors (PR) were detectable in 24 cases (61.5%), whereas 15 tumors (38.5%) were PR-negative. Detailed information about tumor receptor status is provided in Additional file 2. All tumor samples were placed in appropriate volume of RNAlater RNA stabilization reagent (Qiagen, Germany) within 30 min after mastectomy and incubated at 4°C at least overnight but no more than 3 days. If total RNA was not isolated from a tissue sample after 3 day incubation, the sample was frozen at −80°C.

RNA extraction and quality control

Tumor samples stabilized in RNAlater reagent were crushed in liquid nitrogen. QIAzol lysis reagent (Qiagen, Germany) was added and the homogenate was centrifuged through a QIAshredder column (Qiagen, Germany). Total RNA was extracted from the eluate by the miRNeasy mini kit (Qiagen, Germany) according to the manufacturer’s instructions. Genomic DNA contamination was removed by performing DNaseI digestion on the RNA binding column for 15 min. Total RNA was eluted in 50 μl of RNase-free H2O and stored at −80°C. RNA yield and purity was assessed spectrophotometrically by measuring OD260 and OD260/280 ratio respectively in RNase-free H2O using NanoDrop ND-1000 (NanoDrop Technologies, USA). For all samples OD260/280 ratio was between 2.0 and 2.2.

RNA integrity was determined using the Experion Automated Electrophoresis Station with the standard-sensitivity RNA analysis kit according to the manufacturer’s instructions (Bio-Rad, USA). Values of RNA quality indicator (RQI) generated by Experion system’s software ranged from 6.4 to 9.4.

Primer and probe design and validation

A gene list used for validation is presented in Table 1. Primer pairs and probes were designed using Primer3 software [42] according to the recommendations described in [43], and their specificity was determined with Primer-blast program [44]. All sequences were tested for potential secondary structure and dimerization formation using OligoAnalyzer 3.1 program [45]. Primer specificity was confirmed by agarose gel electrophoresis after RT-qPCR reaction. Validated sequences are shown in Table 2 (see details in Additional file 3). Oligonucleotides were synthesized by Syntol (Russia).

Table 1 Description of selected candidate reference genes
Table 2 Description of primers and probes for validated candidate reference genes

The amplification efficiency of each set of primer pair and probe was determined by plotting the Ct values obtained for 7 serial dilutions of cDNA (1:25, 1:100, 1:250, 1:500, 1:1000, 1:2500, and 1:5000). For this aim cDNA was synthesized by reverse transcription of the RNA pool which was prepared by mixing of an equal amount of total RNA isolated from five random tumor samples from the biopsy collection. The corresponding RT-qPCR primer efficiencies (E) were calculated according to the equation E = 10(−1/slope). Each reaction was performed in triplicates. All PCR-mixes were pipetted by the automated pipetting system epMotion 5075 (Eppendorf, Germany).

PCR efficiencies for all candidate reference genes were higher than 1.83 and lower than 2.1 (Table 2, see Additional files 4 and 5), except for TPT1 (1.76). The amplicon sizes for all primer pairs are presented in Table 2.

cDNA synthesis and quantitative RT-qPCR

RNA samples were reverse transcribed to cDNA with the Reverse Transcription reaction mix using random hexamer primer according to the manufacturer’s instruction (Syntol, Russia). Initially 2 μg of total RNA was incubated at 65°C for 5 min followed by an incubation with 10 μl 2.5× reaction mix, 1 μl of 15 OD/μl random hexamer, 1 μl of 50U/ μl MMLV reverse transcriptase, 1 μl of 5U/ μl RNase inhibitor and nuclease-free water in final volume 25 μl for 1 hour at 37°C and 5 min at 95°C. First strand cDNA samples were stored at −20°C.

To measure the transcript levels of selected candidate genes by RT-qPCR, a 2.5× PCR reaction mix for qPCR (Syntol, Russia) was applied and analysis was performed on a 96-well DT-prime detection system (DNA-Technology, Russia). Each reaction was performed in triplicates in a reaction volume of 25 μl in high-profile 96-well unskirted PCR plates (BioRad, USA). All reactions contained 1 μl of cDNA diluted in 10 times, 10 μl of 2.5× Master Mix containing 6.25 mM MgCl2, 2.5 μl of 25 mM MgCl2 (till final concentration in a reaction mix 5 mM), 0.6 μl of 10 μM of each primers, 0.6 μl of 10 μM of a probe and 9.7 μl of DNase/RNase-free water. The PCR program consists of an initial 10 min template denaturation step at 94°C for enzyme activation, followed by 50 cycles of 94°C for 20 sec, 64°C for 10 sec, and 72°C for 15 sec. Nontemplate controls were also performed in triplicate for each gene set of primer pair and probe. Sets for all 18 candidate genes were analyzed in the same run. Each run was repeated three times. All PCR-mixes were pipetted by the automated pipetting system epMotion 5075 (Eppendorf, Germany). To verify the presence of genomic DNA contamination in the isolated total RNA, RT-qPCR were performed on equivalent amounts of total RNA without reverse transcription (as template) for each tumor sample.

Baseline and threshold values were automatically determined for all reactions in the plate using RealTime PCR v.7.3 software (DNA-Technology, Russia).

All RT-qPCR experiment data comply with the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines [32]. The MIQE checklist is presented in Supplementary Information (see Additional file 1).

Analysis of gene expression stability by RT-qPCR

To analyze the stability of the candidate reference genes, four different methods were used [9]: comparative ΔCt method, geNorm, NormFinder and Haller equivalence test. Briefly, the geNorm tool is based on the principle that the expression ratio of two ideal reference genes should be constant in samples from different experimental conditions or cell types [12]. The expression stability value (MgeNorm) of a gene is defined as the standard deviation of the log-transformed expression ratio for the particular gene relative to all other genes under investigation in a panel of cDNA samples. Lower values of MgeNorm correspond to higher gene expression stability. Because a freely available geNorm program does not allow the data to be processed efficiently if the RT-qPCR for each cDNA is repeated several times, we performed the geNorm analysis of our RT-qPCR data manually as described in [9].

The comparative ΔCt method compares a Ct ratio of “pairs of genes” within each sample and assigns a relative variation value MΔCt to each gene. Higher values of relative variation MΔCt correspond to lower gene expression stability.

NormFinder is a freely available program [46] which calculates the stability index using analysis of variance (ANOVA) on log-transformed expression values [1, 30]. The NormFinder algorithm estimates overall gene expression variation MNormFinder as well as the variation between subgroups, such as ER-positive and ER-negative breast cancer tissue samples. Higher values of MNormFinder correspond to higher gene expression variability.

Haller equivalence test gives for each pair of samples k, l an α-confidence interval δ kl , L δ kl U for gene expression difference in samples k, l[31]. An integral measure of gene variability in Haller equivalence test approach (MHaller) was set to the 95-th percentile of a set δ kl L , δ kl U k l with a confidence level α set to 0.05. Similarly to the other approaches considered in our study, the higher MHaller value corresponds to higher gene expression variability. The analysis of gene expression stability using Haller equivalence test is described in details in [9].

Gene expression levels were estimated using a standard exponential model A = I/ECt, where A is an expression level, I is a threshold value, Ct is a number of cycle corresponding to the threshold value, and E is a qPCR efficiency for a gene.

Data generation for “quasi-genes”

In order to compare the stability of a set of reference genes with the stability of individual genes using standard methods that were used for the stability evaluation of genes (comparative ∆Ct method, geNorm, NormFinder, Haller equivalence test) we implemented a simple procedure that generates data for a “quasi-gene” that can be associated with a set of reference genes.

An expression level, assigned to a “quasi-gene” by this procedure, was equal to a normalization factor defined by the set of genes, or, in other words, to the geometric mean of expression levels for genes that belong to the set: A ˜ = i = 1 n A i n ,

In order to find an appropriate Ct-value C ˜ , required by some methods for stability evaluation, we used a standard exponential model I = AECt, where A is an expression level, E is a gene efficiency, I is a threshold, Ct is threshold cycle. We also accepted as a fact that all Ct-values for a fixed sample correspond to the same threshold I ˜ , and set an efficiency for a “quasi-gene” to the geometric mean of efficiency values of genes in the set E ˜ = i = 1 n E i n . Due to the exponential model assumptions, log E ˜ I ˜ = log E ˜ A ˜ + C ˜ log E ˜ E ˜ = log E ˜ A ˜ + C ˜ , so Ct-value, assigned to by the procedure to a “quasi-gene” for a fixed sample, was equal to

log E ˜ I ˜ log E ˜ A ˜ = log E ˜ I ˜ log E ˜ i = 1 n A i n = log E ˜ I ˜ 1 n i = 1 n log E ˜ I ˜ E i C t i = 1 n i = 1 n C t i log E ˜ E i .

The RT-qPCR data contained 3 replicates for every gene and 3 replicates for every sample, and hence, there were nine Ct-values for each gene-sample pair. Taking the respective Ct-values of this 3x3 matrix for each gene of a system we got the corresponding matrix of Ct-values for a “quasi-gene”.

Results

The strategy for the selection of reference genes that was used in this study consists of the following stages: the large-scale screening of an extended list of candidate reference genes based on microarray data series, the selection of a candidate gene set for validation by RT-qPCR, and the determination of the final set of reference genes.

Large-scale selection of an extended list of candidate reference genes based on microarray data

The large-scale selection of the extended list of candidate reference genes was performed by a selection of genes that are present in sub-lists of most stable genes in all four analyzed microarray data sets (see the details in the Methods). Note that this approach allows a combining of microarray studies that utilize different microarray technologies. The total number of such genes was 25 (Table 3). Sixty percent of these genes (15 genes) were present in the list of maintenance genes expressed across eleven human adult and fetal tissues [11], and 9 percent (3 genes) were also present in the list of 47 transcripts expressed at the same level across eleven human adult and fetal tissues [11]. The extended list of candidates included eight RPL and eight RPS genes. These results are concordant with the results of [14, 15, 24].

Table 3 An extended list of candidate reference genes

Such genes as ACTB and GAPDH, which are widely used for normalization of RT-qPCR data, were not included into the resulting extended list of candidate genes. The GAPDH gene did not belong to any list of top-500 stable probesets, while gene ACTB (as well as UBC) belonged to three lists of top-500 stable probesets, but was not present in top-500 list corresponding to a dataset of histologically normal epithelium from breast cancer patients and cancer-free prophylactic mastectomy patients (GSE20437). It is also worth noting that ACTB and UBC (in contrast with, e.g., GUSB) had high intergroup difference in expression level for samples from cancer and cancer-free patients. However, the intragroup variation was low for each of the groups. Hence, the assessment of the expression level of these genes can be potentially used for the estimation of a tumor/normal epithelium ratio in a biopsy sample. An ability of housekeeping genes to define different biological states was also reported in [15].

Selection of a gene set for the experimental validation and RT-qPCR validation

The selection of a gene set for validation from the preliminary extended list was performed by an identification of genes with the highest stability values in the majority of the analyzed microarray data series, but the following additional limitation was imposed: a simultaneous selection of genes with known interactions, as well as a selection of genes with similar biological functions should be minimized. Informally, this limitation increases the independence of genes in the selected set.

Nine genes from the extended list of candidate reference genes (TPT1, HCFC1, PTMA, RPL23A, RPL37A, RPL39, RPS23, HUWE1, EEF1A1) were selected for the experimental validation of the expression stability and comparison with other genes traditionally used as reference genes in RT-qPCR studies of breast tissue (ACTB, GAPDH, RPLP0, GUSB, TFRC, SF3A1, MRPL19, PSMC4, TBP[2, 23, 27, 47, 48]) (Table 1). As no unique criterion exists for the comparison of gene expression stability, we combined four different approaches: comparative ΔCt method [8], geNorm [12], NormFinder [1, 30], and Haller equivalence test [31] (see Methods section for the details).

The simplest method, the comparative ΔCt method, uses only raw values of threshold cycles Ct. An application of the ΔCt method to the results of RT-qPCR analysis of 39 breast cancer tissue samples for 18 genes from the set for validation gave the following results (Figure 1a, Additional file 6: Figure S1). Ten genes, namely, ACTB, HUWE1, RPS23, EEF1A1, SF3A1, HCFC1, RPL37A, MRPL19, TBP and TFRC showed nearly the same expression stability for the whole set of samples (Figure 1a). The RPL23A gene displayed the lowest stability in the whole set of samples and ER-negative tumors (Additional file 6: Figure S1). The highest variability for ER- positive tumors was exhibited by PSMC4.

Figure 1
figure 1

Evaluation of candidate reference gene expression stability by RT-qPCR. Gene expression variations calculated for aggregated set of samples by a: comparative ΔCt (MΔCt); b: geNorm (MgeNorm) and d: NormFinder (MNormFinder). c: Maximum fold changes of gene expression levels MHaller calculated by Haller equivalence test.

In contrast to the comparative ΔCt method, geNorm, NormFinder and Haller equivalence test use expression levels as input data. The values of expression stability MgeNorm for 18 candidate genes generated by the most widely used algorithm geNorm are presented in Figure 1b and Additional file 6: Figure S1. The 10 top-ranked genes were similar to the top-10 genes in the rating generated by the comparative ΔCt method in the whole set of samples: stability values of their expression were nearly the same in each sample group and only the exact order of genes was slightly different.

Unlike comparative ΔCt method and geNorm, NormFinder and Haller equivalence test examine the expression stability of each candidate independently from other candidates. The results of NormFinder application to the studied set of genes is presented in Figure 1d and Additional file 6: Figure S1. In intergroup variation analysis groups were formed by ER-positive and ER-negative samples, respectively (Additional file 7: Figure S2). Higher stability was observed for ACTB, RPS23, HUWE1, EEF1A1, MRPL19, SF3A1, HCFC1, RPL37A and TBP.

The Haller equivalence test allows to evaluate a fold change value (MHaller) of candidate reference genes expression level between different samples for a given significance value. The results of the expression variability estimation for the 18 candidate genes studied are presented in Figure 1c and Additional file 6: Figure S1. In contrast to the comparative ΔCt method, geNorm and NormFinder PSMC4 demonstrated the lowest variability for the whole set of samples as well as for ER-negative and ER-positive tumors. The next gene in the stability ranking was RPL39, the third gene was ACTB that demonstrated lowest variation according to other methods of stability evaluation that A large group of genes from the validation set (GUSB, RPS23, MRPL19, RPL37A, SF3A1, HUWE1, TBP) showed very similar MHaller values and were the next in stability ranking. In concordance with the results of the comparative ΔCt method, geNorm and NormFinder, RPL23A had the greatest expression variability. The relatively high MHaller values of all candidate genes are consistent with highly heterogeneous nature of biopsies and with other studies [15, 47]. Also it is worth noting that MHaller stability values are “pessimistic” in comparison with other methods as they represent the “poor case”, but not the “averaged” one (see details in Methods).

An important inference is that such genes as GAPDH and RPLP0, commonly employed as reference genes (e.g., in the commercial Oncotype DX diagnostic test) were ranked among the most variable candidate genes by almost all methods.

Final selection of reference genes

Thus four rankings of gene expression stability were obtained by the following methods: comparative ∆Ct method [8], geNorm [12], NormFinder [1, 30] and Haller equivalence test [31]. In spite of the fact that generally the results of all stability evaluation methods are consistent each of these ranking has its own order of the genes. Based on both four stability rankings and intergroup variations calculated by NormFinder we formed a subset of the best 5 reference genes: ACTB, RPS23, HUWE1, EEF1A1, SF3A1 (other good choices can be obtained by a replacement of EEF1A1 or SF3A1 by MRPL19). Note that three out of five genes in this set (RPS23, HUWE1, EEF1A1) were selected by microarray analysis and were top-ranked in the extended list of candidate genes. ACTB was not included in the extended list of candidate genes only because it had high intergroup variation between normal epithelium samples from cancer and cancer-free patients. However it showed low variability value for other microarray data series used for the selection of the extended list.

In order to validate the selected subset of reference genes we added a “quasi-gene” corresponding to the set of 5selected genes (see Methods section for details) and calculated the stability measures by all four methods (Table 4). The results show that the selected subset clearly outperforms not only every individual gene from the tested set, but also a system of reference genes used by OncotypeDX test. The only method that did not rank the selected set as the most stable was the Haller equivalence test: it ranked the individual gene PSMC4 higher than “quasi-genes”.

Table 4 Stability characteristics evaluated for the system of 18 individual genes and 2 “quasi-genes” – one corresponding to the selected system of reference genes, and the other corresponding to the system of reference genes used in OcotypeDX test

Discussion

RT-qPCR has become the most popular method for detection and quantification of mRNA transcripts. An efficient normalization enables the gathering of reproducible and biologically relevant RT-qPCR data correcting non-biological sample-to-sample variations that could be introduced by protocol-dependent inconsistencies [1, 30]. To achieve an efficient normalization the use of internal control genes (reference genes) has been established as the gold standard normalization method. Formerly, the genes that have housekeeping roles in basal cellular activities were selected as internal control for gene expression studies. However, it is obvious now that a housekeeping function does not guarantee that a gene has a stable expression level. It was well documented that expression levels of many commonly used housekeeping genes, including GAPDH, ACTB, RPS18, UBC, B2M, HPRT1 and others, may vary significantly in different tissues and cells [4, 7, 9, 1115]. Therefore it is necessary to identify novel candidate reference genes for given individual studies.

Our goal was to select a set of reference genes that can be efficiently used for both research and clinical analysis of breast cancer samples with an unknown hormone receptor status and different cancer status. In order to achieve this goal we developed the strategy for the selection of reference genes that consists of the following stages: 1) selection of an extended list of candidate reference genes based on microarray data, 2) selection of a set of candidate genes for validation by RT-qPCR and 3) selection of the final set.

Similarly to the studies presented in [14, 24] we started from the analysis of available microarray datasets, however, we used variation characteristics that are more stable with respect to outliers. This analysis allowed us to identify 25 genes with stable expression level for the studied sample types (Table 3). Forming the list of candidate genes for validation, we took into account the possible interactions of genes and their involvement in the same biological pathways. The final set of reference genes was selected based on four stability rankings generated by the most widely used methods for evaluation of gene expression variability. This allowed us to take advantage of all considered methods. We noted that both microarray and RT-qPCR data allows to evaluate only the stability of a transcript portion of an individual gene in a fixed amount of total RNA, or, in other words, to evaluate the relative stability of gene expression. Hence it is a natural idea to use tools that estimate the expression stability of a gene relative to the expression of other genes, e.g. the comparative ∆Ct method [8] and the geNorm [12]. However, low values of such relative variation characteristics can be a result of gene co-regulation. That is why the usage of tools based on stability measures of individual genes independently on other genes such as NormFinder and Haller equivalence test is very reasonable.

All methods used for the gene stability evaluation showed that the genes selected by microarray analysis are comparable with or even better than traditionally used reference genes: in the final subset 3 out of 5 genes belonged to the list of genes selected by means of microarray analysis. These three genes are RPS23, HUWE1 and EEF1A1.

Conclusions

We presented experimentally validated evidence that a selection of reference genes for RT-qPCR can be efficiently performed by combining a preliminary search based on the high-throughput analysis of microarray datasets with subsequent selection and validation using RT-qPCR and simultaneous examination of different expression stability measures. We showed that the identified set of reference genes, including ACTB, RPS23, HUWE1, EEF1A1 and SF3A1 proved to be less variable and thus potentially more efficient for research and clinical analysis of breast samples in comparison with individual genes and a set of reference genes used in OncotypeDX assay.