Learning objectives

The purpose of this article is to review:

  1. Microarrays, the basics

  2. Evolution of microarray technology over time

  3. The importance of microarrays in human biology

  4. Microarray-based insights for the transplant physician

  5. Current unmet biological questions in transplantation and how we can use microarrays to address them

  6. Methods of applying microarrays to clinical practice

  7. Limitations of using microarrays in clinical practice

  8. The black-box of microarray data analysis

Introduction

The completion of the human genome project has revolutionized translational medicine. High-throughput technologies, mapped to known target sequences, many with known functions, now permit investigators to interrogate the genome, transcriptome, proteome and metabolome systematically, and to assess genomic mutations, polymorphisms, epigenetic alterations, and micro-RNAs. These technologies herald the potential for us to translate their results into novel sensitive and specific diagnostic tests and less toxic therapeutics, with the anticipation of moving away from protocol-based approaches to personalized medicine. In this new era, investigators could potentially assess the molecular and pathophysiological characteristics of individual patients and their transplanted organs, tailor therapeutic regimens, and administer them based on these profiles. One of the key steps in this process will be the identification and validation of biomarkers. To date, microarray technologies are, perhaps, the most successful and mature methodology for high-throughput and large-scale genomic analyses.

Microarrays, the basics

Microarray technology is based on the principle that complementary, single-stranded, nucleic acid sequences form double-stranded hybrids; thus, in essence, it is a high-throughput Southern blot, in which thousands of single-stranded sequences complementary to target sequences are synthesized or spotted onto a small glass or membrane support. The gene probes on the array are either short (20–60 bp) single-stranded oligonucleotides synthesized in situ (provided by Affymetrix, Agilent), or cloned complementary DNA (cDNA) obtained by reverse transcription of messenger RNAs (mRNAs) and amplified by polymerase chain reaction (PCR). Illumina uses designed oligonucleotide probes attached to beads that are deposited randomly on a substrate. The selection of a microarray platform has important effects on the later stages of a study, determining the complexity and flexibility of the data to be analyzed. Care must be taken to choose a platform that will allow data to be analyzed and disseminated in the manner desired. Experimental design and analysis are generally more straightforward with one-color microarrays.

In organ transplantation, a prospective study of the gene-expression profile of graft injury usually involves sample collection from tissue biopsy, blood and biofluids, such as urine, bile, or broncho-alveolar lavage, taken before, during, and after injury. A schematic presentation of a microarray study is shown in Fig. 1. All collected samples are then subject to standardized protocols for RNA extraction [1]. Routinely, RNA quality is assessed by checking for a 260/280 ratio > 2. The integrity of the RNA can also be assessed with the Agilent 2100 Bioanalyzer, using RNA Nano Chips (Agilent Technologies), where the degradation of RNA is quantified by the RNA integrity number (RIN) [2, 3]. RNA amplification techniques are often required for microarray and downstream genetic analyses when only a small amount of sample is available as starting material. Linear RNA amplification is a strategy that has been used successfully to generate adequate input RNA for molecular profiling studies. One method of linear amplification, termed amplified antisense RNA (aRNA) amplification [4], utilizes a T7 RNA polymerase-based amplification procedure that allows quantitation of relative gene-expression levels from small tissue samples. With a modification of the classic Eberwine method, Wang et al. [5] exploited a template-switching effect at the 5′ end of the mRNA transcript to ensure the synthesis of full-length double-stranded cDNA. The most common concerns arising from the use of sample amplification, irrespective of whether the method confers linear or exponential amplification, include amplification efficiency, 3′ bias and length of the aRNA/cDNA products, reproducibility, fidelity in maintaining relative transcript abundance, and the benefits and disadvantages of using amplified versus non-amplified material [6].

Fig. 1

Schematic representation of DNA microarray technology. Total RNA is first isolated from the samples of interest; this test RNA and a reference RNA are then differentially labeled with fluorescent dyes and competitively hybridized onto a printed DNA microarray. The images generated are then scanned, and the resulting fluorescence intensities are used for further data analysis. IH immunohistochemistry, SNP single nucleotide polymorphism, SAM significance analysis of microarrays, PAM prediction analysis of microarrays, BRB Biometric Research Branch, GO Gene Ontology, IPA Ingenuity Pathway Analysis, KEGG Kyoto Encyclopedia of Genes and Genomes

RNA is next labeled with a detectable marker (fluorescent dye) and hybridized to an array containing individual gene-specific probes, in either a dual-color (sample and control pool with different colors, e.g. cDNA arrays) [7] or single-color (sample label only, e.g. oligonucleotide arrays) hybridization system [8]. The array is hybridized with the labeled sample(s) by incubation (usually overnight) and is then washed to remove non-specific hybrids. A laser excites the attached fluorescent dyes to produce light detected by a scanner, which generates a digital image from the excited microarray. The digital image is then processed by specialized software to transform the image of each spot into a numerical reading. This process finds the specific spot location and shape, summarizes spot intensities, and subtracts the surrounding background noise. To facilitate comparison between experiments and to compensate for differences in labeling, hybridization and detection methods, a data normalization step is usually performed. The final numerical reading is proportional to the concentration of the target sequence in the sample to which the probe in the spot is directed. In competitive two-dye assays, the reading is transformed to a ratio equal to the relative abundance of the target sequence (labeled with one type of fluorochrome) in a sample relative to a reference sample (labeled with another type of fluorochrome). In one-dye technologies, the fluorescence is commonly yellow, whereas, in two-dye technologies, the colors used are green for reference and red for sample (although a replicate with dye-swap is often done for quality control). The appropriate choice of technology depends on experimental design, availability and cost.
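As an illustration of the numerical steps described above, the following minimal sketch (not taken from any published pipeline; all array names and values are hypothetical) performs local background subtraction, computes the log2 sample/reference ratio for each spot of a two-color array, and applies a simple global median normalization.

```python
import numpy as np

def spot_log_ratios(red_fg, red_bg, green_fg, green_bg, floor=1.0):
    """Compute normalized log2 ratios for a two-color array.

    All inputs are per-spot intensity arrays produced by the image-analysis step:
    red_* = sample channel, green_* = reference channel, fg = foreground, bg = local background.
    """
    red = np.maximum(red_fg - red_bg, floor)        # background-corrected sample signal
    green = np.maximum(green_fg - green_bg, floor)  # background-corrected reference signal
    ratios = np.log2(red / green)                   # relative abundance of sample vs reference
    return ratios - np.median(ratios)               # median-centering as a simple normalization
```

In practice, more sophisticated intensity-dependent normalizations (e.g. loess) are commonly used; median-centering is shown here only to make the principle concrete.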

Evolution of microarray technology over time

Early microarray technology, allowing the hybridization of samples with high-density cDNA arrays on nylon membranes, was a technological step in the right direction, but it was limited in scope by the nature of the matrix supporting the clones. Two later innovations made the new microarray technologies possible. One was the use of solid supports, such as glass, which are much more amenable to miniaturization and fluorescence-based detection [7]. The second was the synthesis of high-density oligonucleotides on glass wafers using photolithographic masking techniques. The latter approach is more versatile in its applications and has the key advantage that the oligonucleotides can be synthesized at will, allowing chips to be manufactured directly from sequence databases. Using this technology, Affymetrix's short-oligonucleotide array (HG_U133plus2.0) has become one of the most popular platforms for research [9, 10]. Agilent has applied its ink-jet printing technology to microarrays, which has enabled high-speed, high-quality production of oligonucleotide microarrays [11]. The Illumina BeadChip is a relatively new method with increasing usage. The essential element of BeadChip technology is the attachment of oligonucleotides to silica beads. The beads are then randomly deposited into wells on a substrate, such as a glass slide. The resultant array is decoded to determine which oligonucleotide-bead combination is in which well [12]. More recently, Affymetrix developed the GeneChip Exon Array, which offers a more complete and accurate picture of overall gene expression by enabling researchers to investigate the entire length of the gene, not just the 3′ end. This exon-level analysis on a whole-genome scale opens the door to the detection of specific alternative splicing events that may play a central role in disease mechanism and etiology. These arrays have greater than 99% coverage of sequences present in the RefSeq database, which contains only well-annotated content [13]. The evolution of microarray technology is summarized in Table 1.

Table 1 The evolution of microarray technology (n/a not applicable)

The overall differences between cDNA arrays (e.g. Lymphochip) and oligonucleotide-based arrays (e.g. Affymetrix, Agilent) [14] lie in probe length: 0.5–3 kb for cDNA arrays vs 15–70 bp for oligonucleotide arrays. Oligonucleotide arrays can also perform genotyping studies and detect splice variants, in addition to mRNA profiling, but, unlike cDNA arrays, they require multiple probes per target; they offer greater spot consistency and less batch-to-batch variability.

The importance of microarrays in human biology

Microarray technologies were initially designed to measure the transcriptional levels of RNA transcripts derived from thousands of genes within a genome in a single experiment. This technology has made it possible to relate physiological cell states to gene-expression patterns for studying tumors, disease progression, cellular responses to stimuli, drug target identification and transplant injury mechanisms. For example, subsets of genes with increased and decreased activities (referred to as transcriptional profiles or gene-expression "signatures") have been identified for acute lymphoblastic leukemia [15], breast cancer [16], prostate cancer [17], lung cancer [18], colon cancer [19], multiple tumor types [20], organ transplantation [1], and drug response [21]. Moreover, because the pool of published data grows every day, the integrated analysis of several studies, or "meta-analysis", has been proposed in the literature [22]. These approaches detect generalities and particularities of gene expression in diseases.

More recent uses of DNA microarrays in biomedical research are not limited to gene-expression. DNA microarrays are being used to detect single nucleotide polymorphisms (SNPs) of the human genome (Hap Map project) [23], aberrations in methylation patterns [24], alterations in gene copy number [25], alternative RNA splicing [26], pathogen detection [27, 28] and micro-RNA [29].

Gene-expression profiles for prognostic classifiers are usually built by the correlation of gene-expression patterns, generated from specimens, with clinical outcome (e.g. acute rejection vs stable without rejection). Gene-expression predictive classifiers of response to treatment are generated by the correlation of gene-expression data, derived from samples taken before treatment, with clinical and pathological response to treatment. Although the identification of the most relevant information from microarray experiments is still under active research, well-established methods are available for a broad spectrum of experimental set-ups. The analysis of gene-expression data at the pathway and functional level, along with a systems biology approach, will provide deeper insights into the biological effects of complex disease states, such as in the organ transplant milieu, and will improve risk assessment of the same.

Microarray-based insights for the transplant physician

It is challenging to dissect any allograft injury mechanism with single-gene studies, because of the complexity of the mechanisms of renal allograft rejection under different immunosuppressive protocols and the spectrum of the response to immunological injury. Previously, researchers have reported that expression of the cytotoxic molecules granzyme B and perforin is associated with rejection and has been detected in blood [30], urine [31], and biopsy tissue samples [32, 33] in human and experimental studies. However, renal allografts transplanted into perforin or granzyme A/B "double knockout" (gene deletion) mice still showed T cell-mediated rejection, which was therefore not mediated by perforin or granzymes [34], indicating the redundancy of the immune response during rejection.

The advent of microarray technology has enabled researchers to detect the expression of thousands of genes simultaneously, rather than measuring the expression of one gene at a time, and has unlocked information about disease heterogeneity that could not have been predicted by standard clinical or pathologic criteria. Pioneering studies of gene-expression profiles in breast cancer established the molecular classification of breast cancer into clinically relevant sub-types. This has provided new tools with which one can predict cancer recurrence and response to different treatments, and new insights into various oncogenic pathways and the process of tumor progression [35]. Subsequent microarray studies have changed the paradigm of approach in lymphoma [36], where the diffuse B-cell lymphoma sub-type with the worst prognostic outcome was identified, and in kidney transplantation, where rejection sub-types with differential survival benefits and a prognostic role for focal B-cell infiltrates in recalcitrant rejections were identified [1]. Using cDNA microarrays, Hauser and co-workers [37] determined the gene-expression patterns specific to living-donor vs deceased-donor kidneys and suggested that suppression of specific targets of inflammation in the deceased donor might be a promising intervention for abrogating post-ischemic acute renal failure.

These findings have brought a global paradigm shift from traditional hypothesis-driven experiments toward large-scale hypothesis generation and testing through clinical trials. In the past few years, there has been an increasing number of publications on solid organ transplantation, with particular emphasis on the heart and kidney. Supporting the results of earlier studies, microarrays have corroborated known pathways of rejection injury, such as dysregulation of the complement system [38, 39], interleukins [40], anti-human leukocyte antigen (HLA) allo-antibodies [41, 42] and solute transport genes [43], in allograft rejection.

The interrogation of minimally invasive or non-invasive biomarkers of graft injury has been more challenging than the direct interrogation of the transcriptional changes in the injured graft. Until recently, most gene-expression profile studies were performed on biopsy specimens. Transcriptional changes in peripheral blood during graft rejection are significantly disparate from those of the inflamed graft, suggesting that the response to inflammation and injury in the rejecting organ is highly localized. Additionally, the intensity (fold-change) and quantity (number of significant genes in rejection) of the rejection response in peripheral blood is much smaller than the corresponding response in the organ, even when biopsy and blood samples from the same patient are examined simultaneously [44] (Fig. 2). A recently published study [45, 46] using both microarray and reverse transcriptase-polymerase chain reaction (RT-PCR) to discriminate rejection from non-rejection in peripheral blood samples from heart transplant patients gave a reasonable correlation only with severe and high-grade tissue rejection. Further, this was only from samples taken later than 6 months after transplantation, even though the risk of rejection is highest during the first 6 months after transplantation [47]. When biopsy predictor sets were used on blood samples, these microarray data from blood did not give significant predictions [47]. Despite these limitations of peripheral blood sampling, efforts to examine this sample source for clinical monitoring continue to hold promise. The answer for increasing the sensitivity and specificity of biomarker detection in peripheral blood may lie in more careful attention to improved methods of sample collection, storage, and processing [44].

Fig. 2

Correlation of acute rejection gene expression in biopsy vs blood. Significant genes for graft rejection were identified in blood and biopsy tissue (Sarwal et al., unpublished data) with low false discovery rates (q scores < 1% by significance analysis of microarrays (SAM; http://www-stat.stanford.edu/~tibs/SAM/)). The logarithmic fold expression values are shown on the X and Y axes. Only 26% of the significant genes overlap in the two tissue sources. These overlapping genes show much higher fold expression in tissue than in blood

Conventional wisdom holds that long-term allograft survival requires life-long immunosuppression. A tremendous advance in peripheral blood monitoring for immunosuppression customization may lie in the data emerging from studies of highly selected groups of organ transplant patients with spontaneous graft acceptance, or prope tolerance, in liver [48] and kidney [49] transplantation. Blood gene-expression profiles from transplant patient cohorts with tolerance, stable graft function, and acute and chronic graft injury, as well as peripheral blood samples from healthy individuals, were analyzed on microarrays [49]. A tolerance-specific signature of 49 genes was identified in the kidney patients, which was strongly regulated by transforming growth factor-beta (TGF-β) signaling and cell cycle signaling. The tolerance signature in immunosuppression-free liver patients included genes encoding gamma-delta (γδ) T-cell and natural killer (NK) receptors, and proteins involved in cell proliferation arrest. Importantly, 50% of kidney recipients on steroid monotherapy, and 8% of kidney recipients on triple-drug immunosuppression, also had the tolerance signature, suggesting that those patients may benefit from immunosuppression minimization [49]. We anticipate that, after further validation studies, these biomarkers might be useful as minimally invasive monitoring tools for guiding immunosuppression titration and might provide novel mechanistic insights into the acceptance mechanisms for renal and liver allografts. As only a single gene overlaps between the renal and liver tolerance signatures, we can hypothesize that either there is sufficient redundancy in the system or there is some tissue (liver vs kidney) specificity. Key array-based published studies in transplantation are summarized in Table 2.

Table 2 Key array-based published studies in transplantation (AR acute rejection, CAN chronic allograft nephropathy, TOL operational tolerance, MIS minimum immunosuppression, HTN hypertension, RVA reno-vascular abnormalities, EPO erythropoietin, LDN laparoscopic donor nephrectomy, DT drug toxicity, STA patient with stable graft function)

Current unmet biological questions in transplantation: how can we address them?

Whilst great advances have been made in the discovery of putative biomarkers in transplantation, disappointingly few have been translated into clinically applicable assays; much of this is due to a lack of well-designed clinical validation studies. The most important challenges are achieving well-designed validation and dealing with varying endpoint definitions. Adapting molecular endpoints from single-gene studies as representative of a particular mechanism of toxicity/injury assumes that the postulated mechanism is known beforehand; this may result in "over-fitting" of the data and render the inference inaccurate.

Chronic allograft nephropathy: what are the early injury pathways?

Chronic graft injury is potentially an indolent immune response resulting in slow deterioration of organ function and is characterized pathologically in the kidney by tubular atrophy, interstitial fibrosis, and fibrous intimal thickening of the arteries; when investigated at the time of established injury, it shows a relatively homogeneous transcriptional response of tissue fibrosis [1]. Although there is a general consensus on the patient criteria for chronic allograft nephropathy (CAN), it is not universal. Despite the many presumed triggers for this injury (alloimmune responses; donor age and tissue quality; brain death; preservation/reperfusion injury; post-transplantation and systemic stresses in the recipient environment) [50], early injury triggers that could provide drug targets for manipulation of injury progression have not been identified in cross-sectional human studies. Animal models, where injury mechanisms can be segregated better, have been useful to study [51], but careful design of clinical trials to ascertain longitudinal and evolutionary changes in graft injury may be required. Given the difficulty of obtaining recipient consent for multiple post-transplantation biopsies, the discovery of biomarkers specific to CAN has been challenging. Most of the published reports (Table 2) are from a single sample time point [52–55]. However, with careful study design, controlled studies have been performed using sequential (paired) patient samples [56–58]. These studies are limited by relatively small sample sizes, non-standardized protocol biopsies, and few sample collection time points; perhaps the field will be led by organs where the performance of protocol biopsies is almost standard of care, e.g. heart and lung transplantation. Additional issues that are important in the design of clinical validation studies for biomarkers include: prospective sample identification, which also allows samples to be used for prediction of clinical events; maintaining homogeneity of patient sub-groups and disease pathology; similar immunosuppression regimens; and controlling for patient demographics and the use of other concomitant treatments.

The limitations of microarray studies and the importance of well-designed validation strategies have been demonstrated by microarray applications in cancer. In an attempt to predict the prognoses of cancer patients on the basis of previously published DNA microarray studies, re-analysis of data from the seven largest published studies showed that the list of genes identified as predictors of prognosis was highly unstable; the selection of training sets strongly affected molecular signatures [59], and biomarkers from the training sets did not perform as well in independent validation studies. The most important challenge in translational transplantation research is the lack of a true gold standard for the classification of disease in organ transplantation. The current histologic classification of deteriorating organ transplants has many limitations, including arbitrary cut-off points and poor inter- and intra-observer reproducibility. This makes the identification of informative sample sets very difficult and hampers the generation of microarray-based hypotheses in transplantation.

Acute rejection prediction and immunosuppression customization

Acute rejection (AR) depends on an orchestrated immune response to histocompatibility antigens expressed by the grafted tissue. The redundancy of AR mechanisms and the problems with peripheral blood transcriptional analysis (outlined above) have made it difficult to develop reliable biomarkers for prediction of allograft rejection and its outcome, irrespective of immunosuppression usage, concomitant infection, recipient age or organ type. The delineation of AR from antibody-mediated rejection (AHR, also termed humoral rejection) is still debated. Though the effector mechanisms primarily responsible for the rejection process classically involve the activation of effector T cells and memory T cells, alternative mechanisms of acute rejection also recruit (to varying degrees) B cells, natural killer cells, eosinophils and neutrophils; antibody-mediated rejection is currently thought to play a role in approximately 33% of AR episodes [42]. Thus, although numerous markers of different biologic pathways have been evaluated as diagnostic and prognostic tools to serve this purpose in human [47, 56] and animal organ transplantation [60, 61], no biomarkers have become firmly established for the prediction of acute rejection. It is unlikely that a single biomarker will meet all clinical needs, such as non-invasive diagnosis and prediction of transplant rejection and survival, given the clinical confounders in the recipients' post-transplantation environment. Microarrays may, over time, offer multiple markers as gene-based tests, and combining these with the genes found for graft tolerance [49] may enhance future patient monitoring and enable individualized, risk-adapted patient care.

Limited donor source—how can we expand this?

Finding ways to use kidneys more effectively for transplantation has the potential to extend the donor pool, using deceased donors who meet the standard criteria for donation (SCD) as well as expanded criteria donors (ECDs), a category created by the United Network for Organ Sharing in 2002. Higher-risk donor organs, once considered unsuitable, could also be transplanted safely. With regard to higher-risk donor organs, the question remains: are all kidneys from older donors equivalent with regard to biological and cellular health? An array-based analysis of kidneys from a wide spectrum of ages (8 months to 80 years) [62] suggested that, though older kidneys appeared to have increased extracellular matrix turnover and a non-specific inflammatory response, combined with a reduction in processes dependent on energy metabolism and mitochondrial function, these results did not always correlate with chronological age. Extending these expression data into hypothesis testing, by correlating good vs poor "transcriptional health" of older kidneys with post-transplantation function, may offer a means to expand the donor pool. To date, these studies have not been performed.

Methods of applying microarrays to clinical practice

Identification of differentially expressed genes between sample groups

Identifying whether genes have increased or decreased in expression between two or more groups of samples is the most common and basic type of analysis and provides a simple characterization of the specific molecular differences that are associated with a specific biological phenotype. Using power analysis, it is possible to estimate the number of samples required to identify a high percentage of truly differentially regulated genes between sample groups. Unsupervised analysis [63] is a useful means to assess, a priori, the inter- and intra-group differences or similarities. Unsupervised analysis is based on the assumption that co-expressed genes have the potential to be regulated by the same transcription factors or to have similar biological functions. Hierarchical clustering [64] is a typical example; tools that implement unsupervised analysis include R (http://cran.r-project.org), the Gene Expression Profile Analysis Suite (GEPAS) [65], The Institute for Genomic Research (TIGR) TM4 [66, 67], GeneSpring [68] and Genesis [69]. Typically, genes are represented on the y-axis, whereas samples are drawn on the x-axis, and a dendrogram on each axis shows the degree of relatedness of samples and genes to each other. A color-coded matrix (heat map), where samples and genes are sorted according to the results of the clustering, is used to represent the expression values for each gene in each sample and is the basis of many of the published microarray figures. Though fold-change in fluorescence intensity (expressed as the logarithm (base 2, or log2) of the sample divided by the reference) is often used descriptively, it fails to ascertain the true significance of small but meaningful changes in gene-expression levels. Univariate measures of significance are preferred: for normally distributed data, the t-test and analysis of variance (ANOVA) can be used, and for data that are not normally distributed, the Wilcoxon or Mann–Whitney tests are often used. Another commonly used approach to significance testing is significance analysis of microarrays (SAM; http://www-stat.stanford.edu/~tibs/SAM/). It should be pointed out that the methods given in this review are a selection from many others possible.
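To make the workflow above concrete, the following minimal sketch (illustrative only; not the pipeline used in the cited studies, and all variable names are hypothetical) performs a per-gene Welch t-test between two sample groups on a normalized log2 expression matrix and applies a Benjamini-Hochberg false discovery rate correction, in the same spirit as the q-value filtering used with SAM.

```python
import numpy as np
from scipy import stats

def differential_expression(expr, group_a, group_b, fdr=0.05):
    """expr: genes x samples array of normalized log2 intensities;
    group_a, group_b: lists of column indices for the two sample groups."""
    a, b = expr[:, group_a], expr[:, group_b]
    # Welch's t-test for each gene (row)
    _, p = stats.ttest_ind(a, b, axis=1, equal_var=False)
    log2_fc = a.mean(axis=1) - b.mean(axis=1)                 # mean log2 fold-change per gene
    # Benjamini-Hochberg adjustment of the p-values
    m = len(p)
    order = np.argsort(p)
    adjusted = p[order] * m / (np.arange(m) + 1)
    adjusted = np.minimum.accumulate(adjusted[::-1])[::-1]    # enforce monotonicity of q-values
    q = np.empty(m)
    q[order] = np.minimum(adjusted, 1.0)
    return log2_fc, q, q < fdr
```

In practice, moderated statistics (such as those used by SAM or limma) are preferred for small sample sizes, but the basic test-then-correct structure is the same.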

  (a) Determining biomarkers for clinical phenotypes of disease

    The identification of gene-expression "signatures" associated with disease categories is called biomarker detection or supervised classification. As the biomarker panel needs to be predictive of disease class or clinical outcome, learning and validation sets of samples are required, making the sample size relatively large for this type of analysis. Prediction analysis of microarrays (PAM; http://www-stat.stanford.edu/~tibs/PAM/) is a powerful tool that can be adapted for this use (a minimal sketch of this approach is given after this list). The selection of a unique list of genes by this approach does not, in and of itself, offer sufficient knowledge for one to understand the biology of a given system, suggesting the necessity of incorporating biological knowledge into array analysis. Recent approaches to microarray analysis address the limitations of conventional bioinformatics approaches by enriching the analysis with knowledge of biological processes [70]. This approach has the advantage over classical bioinformatics approaches that the feature selection step can be performed on data that are completely independent of the clinical samples used for the analysis. This strategy is very promising, especially in disease states that are not easily classified into clear, distinct categories, as is the case in clinical transplantation. Some ways to do this are either to use commercially available software (Ingenuity Pathway Analysis: http://www.ingenuity.com/; Pathway Studio: http://www.ariadnegenomics.com/) or to use hypergeometric enrichment analysis from published data sets of biologically relevant experiments [1, 49].

  (b) Microarray data for survival analysis

    Identifying biomarkers that correlate with survival times is a very important objective in the analysis of microarray data. Selected genes can be combined with clinical variables and incorporated into regression models, and differences in survival times can be assessed with the Kaplan–Meier method and associated statistical tests. An example is shown in Fig. 3, where the molecular rejection groups defined by gene-expression microarray data (AR-I, AR-II and AR-III) [1] segregate by recovery of graft function 4–6 weeks after treatment intensification for the rejection episode. Different linear regression models can be tested with independent variables (time, drug levels, and graft function) and dependent variables (genes) to ascertain any association between gene expression and clinical variables.

  (c) Other applications of microarrays in clinical practice

    Commercially available microarrays can detect single nucleotide polymorphisms (SNPs), which are an important tool for identifying genetic loci linked to complex disorders [71]. Unfortunately, the number of SNPs covered by array-based methods is fewer than 1% of the known SNPs deposited in the public databases. Altered methylation patterns in genomic DNA can also be identified by microarrays, through the use of methylation-sensitive restriction enzymes to generate fragments enriched with either unmethylated or methylated CpG sites. Epigenetic phenomena, such as cytosine methylation, histone acetylation and phosphorylation, control the activation and deactivation of genes, such that genes methylated in their promoters can become inactive and can predispose individuals to cancers [67]. Chromatin immunoprecipitation (ChIP-on-chip) assays [72, 73] allow the estimation of alterations in the expression of transcription factors in several diseases (e.g. c-Myc is known to be differentially expressed in a variety of cancers [74]). Pathogen-specific microarrays have been generated [27, 28] and allow the direct interrogation of specific pathogens on an array-based platform.

    Fig. 3

    Correlation between AR sub-type and graft outcome. Analysis of the recovery of graft function over time revealed that grafts with AR that clustered in the AR-I transcriptional sub-group had significantly poorer functional recovery than those classified as either AR-II or AR-III [1] (P = 0.02). P values were calculated from Kaplan–Meier survival analysis. Data are for grafts with incomplete functional recovery in the analyses according to sub-type of AR, where 80% of AR-I and ∼40% of AR-II episodes had incomplete recovery of serum creatinine to baseline values 6 weeks after treatment of the rejection episode. All AR-III episodes recovered graft function by the same definition
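The sketch below illustrates the supervised classification strategy described in item (a): it trains a nearest shrunken centroid classifier (the method underlying PAM) on a learning set and evaluates it on an independent validation set. It uses scikit-learn's NearestCentroid with a shrink_threshold as a stand-in for the PAM software itself; the data and variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestCentroid

def train_signature_classifier(expr, labels, shrink=0.5, seed=0):
    """expr: samples x genes matrix of normalized log2 values;
    labels: clinical class per sample (e.g. 'AR' vs 'STA')."""
    # Split into learning and validation sets, as required for biomarker panels
    x_train, x_test, y_train, y_test = train_test_split(
        expr, labels, test_size=0.3, stratify=labels, random_state=seed)
    # Nearest shrunken centroids: shrinkage drives uninformative genes out of the class centroids
    clf = NearestCentroid(shrink_threshold=shrink)
    clf.fit(x_train, y_train)
    accuracy = clf.score(x_test, y_test)   # performance on the held-out validation set
    return clf, accuracy
```

Genes whose centroid components shrink to zero in all classes effectively drop out of the signature, which is how PAM derives a compact biomarker panel; in real use the shrinkage threshold would be chosen by cross-validation on the learning set.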

What limits the use of microarrays in clinical practice?

Gene-expression profiling studies of renal transplantation are highly complex, and the successful execution of such studies requires close collaboration between physicians, molecular biologists and bioinformaticians. The most important challenge is the lack of a true gold standard for the classification of disease in organ transplantation. The current histologic classification of deteriorating organ transplants has many limitations, including arbitrary cut-off points and poor inter- and intra-observer reproducibility. The classical approach to microarray analysis, which starts with the identification of genes that differentiate between two sample groups, depends on the assumption that distinct disease entities exist and that we know, with certainty, what these classes are. While the cancer literature may be able to rely on disease classification based on outcome data, the situation in organ transplantation is much more complicated, with many overlapping disease processes occurring simultaneously. If microarray analysis in clinical transplantation starts with a classification based on a flawed clinical gold standard, the results of the microarray study will not be any better than histologic examination and may even be misleading. Currently, there is no simple solution to this problem.

  • Quality control

    The high-throughput nature of this technology, combined with the large volumes of data generated, results in a high risk of error. With the increasing use of genomic studies in transplantation, there is a need to control for the various confounding effects that obscure biomarker discovery in graft rejection. In view of the many concerns raised, the US Food and Drug Administration (FDA) launched the Microarray Quality Control (MAQC) project. This study showed an excellent correlation of gene-expression measurements for human reference RNA (Stratagene) and human brain reference RNA (Ambion) across seven different array platforms, five different laboratories, and three different amplification protocols [75]. Thus, while the need for quality control is a limitation for array studies, recognition of the means to address it could turn this limitation into a benefit, resulting in the generation of robust datasets that can be queried with confidence by multiple users.

  • High cost

    At present, because of the sophistication of microarrays, this is a costly technology available only in selected laboratories. Microarray technologies, however, are rapidly improving, and the costs of the technique continue to fall, thus paving the way for wider access and more generalized usage.

  • Sampling variability

    Particularly for renal transplant biopsies, differing amounts of cortex vs medulla represented in a sample can affect its pattern of gene expression. To minimize false clustering of samples, one can cross-reference a publicly available gene list specific for the different compartments of the kidney [76]. There is also the problem of variable sample pathology. If only one biopsy core is used for microarray analysis, it will be necessary to identify transcriptional changes that are more global and robust than patchy cellular interstitial infiltration, such as the effects of cytokines on the renal tissue or global interstitial changes. mRNA is a very fragile molecule that can be degraded within minutes of the surgical procedure [77], drastically affecting the interpretation of microarray data [78]. Moreover, subtle variations in biopsy handling and in the method of RNA extraction can result in different levels of measured gene expression [78].

  • Difficulty in detecting some disease processes in transplantation by microarrays

    Existing collagen, readily visible to the pathologist, is not necessarily associated with mRNA changes if the process of active fibrogenesis is complete. Small cell populations that make a major contribution to disease might give only a weak signal in transcriptome studies of whole biopsies or unseparated blood. Antibodies produced in lymphoid tissues could damage the kidney without any mRNA being detectable in the kidney. Microarray analysis cannot offer insights into these critical cellular and molecular processes in the tissues.

  • Discrepancy in array studies

    Weak overlap exists between gene lists from individual studies of similar phenotypes in transplantation. The disparity among microarray data can be attributed to several factors: differences in microarray platforms with differing gene sets; weak statistical power and small sample sizes; biological variance because of variability in patient characteristics; experimental variance, including the lack of uniform protocols for study design, sample collection, RNA processing, and sample labeling and hybridization; different tools for data processing and statistical analysis (Table 3); variable thresholds for data filtering; varying stringencies for false discovery rates and statistical significance; and different data analysis methods. Nevertheless, a recent study compared microarray data for rejection across platforms, samples, and laboratories with some success: a gene set for acute rejection prediction generated from heart biopsies [47] was used to predict previously published data for kidney biopsies [1, 56] and lung broncho-alveolar lavage cells [79].

  • Confounders exist in microarray experiments

    Previous studies have demonstrated that the abundance of globin transcripts in whole blood may mask the underlying biological differences between whole-blood samples. In a comparison of gene-expression profiles of peripheral blood, using different protocols of sample preparation, amplification and hybridization on the Affymetrix platform, we demonstrated that the globin reduction method is not sufficient to unmask clinically relevant, rejection-specific, transcriptome profiles in whole blood. Additional mathematical depletion of globin genes improves the efficacy of globin reduction but cannot remove the confounding influence of globin gene hybridization [44]. Other problems in the analysis of blood may be more serious than the globin issue: the massive changes in cell populations caused by illness, surgery or infection make it difficult to define small changes in specific mRNA levels. It will be challenging to distinguish the blood signal for the alloimmune response from such common non-specific changes.

    Table 3 List of pitfalls in microarray analyses and solutions (SVD singular value decomposition, Cy cyanine, qPCR quantitative polymerase chain reaction)

The black-box of microarray data analysis

All published microarray studies should be made publicly available through the internet, on proprietary websites and in public microarray database repositories, and should generally follow the minimum information about a microarray experiment (MIAME) compliance format [80] or microarray gene-expression markup language (MAGE-ML) [81].

Many freely available software tools now exist for microarray data analysis. Though not completely intuitive, they have extensive manuals that can take a relatively inexperienced user toward a rapid understanding of their application for data analysis, ranging from image analysis, visualization, differential expression and principal component analysis to clustering, classification, regression and survival analysis. A comprehensive discussion of the different analytical strategies for microarray analysis is beyond the scope of this review.

The annotation of probes on microarrays is problematic for data analysis, as illustrated by a particular commercial microarray design in which the number of probes associated with a given gene has changed over time. These changes concern approximately 5% of the probe sets across the history of annotation releases over a 2-year span [82]. For the Affymetrix Mouse 430 A/B array, 13,699 out of 45,000 probes changed gene names from 2003 to 2004, and 2,277 (5%) probes changed annotation by their Entrez Gene identifiers [82]. Similarly, in human array platforms, e.g. the HG-U133plus Affymetrix chip, unreliable representative public identifiers were seen for 18.2% of the probes [83].

Probe redundancy is an additional problem (each transcript is probed by multiple oligonucleotide probes). This can be compounded by annotation problems, with at least 5% misannotation in each generation of the platform [82]; e.g. multiple probe sets may be assigned to the same gene, or a single probe may map to multiple genes or Entrez IDs. The attention of manufacturers should be drawn to the maintenance of annotation accuracy and to the reduction of the number of probes required for each gene, attempting to choose the most representative probe(s). In addition, a proportion of probe sets contain unreliable representative public gene IDs, with multiple genome hits. Harbig et al. [84] recently reassigned the probe sets on the Affymetrix platform, on the basis of each 25-mer probe sequence, and found that a large percentage of probe sets did not actually bind fully to a gene. They concluded that the assignment of probes to official probe sets is a problem with the Affymetrix platform. This problem may also affect other platforms, because the recorded sequence of a gene changes as additional information becomes available.

There are also different levels of detection (signal over background noise) for probes or probe sets, with specific criteria for each platform, which can have an impact on downstream analyses. Examples of detection criteria for different array platforms include: Agilent, an absolute value of log2(red channel/green channel) > 0.5 for at least one array; cDNA arrays, a mean channel 1 intensity/median background intensity > 1.5 and/or a normalized mean channel 2 intensity/median background intensity > 1.5; Affymetrix, values for each probe or probe set extracted after background subtraction using a perfect-match-only model.
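As a concrete illustration of such a detection filter (a sketch only; the threshold follows the Agilent-style criterion quoted above, and the matrix name is hypothetical), the following function keeps a probe if its absolute log2 ratio exceeds the cut-off on at least one array.

```python
import numpy as np

def filter_detected(log_ratios, threshold=0.5):
    """log_ratios: probes x arrays matrix of log2(red/green) values."""
    detected = np.abs(log_ratios) > threshold   # per-probe, per-array detection calls
    keep = detected.any(axis=1)                 # retain probes detected on at least one array
    return log_ratios[keep], keep
```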

The existing tools for converting probe IDs between microarray platforms are very limited. The difficulty in reusing data lies in the mapping of probes to established gene identifiers. Therefore, microarray results need to be re-evaluated periodically with the latest probe annotations. Most recently, a tool (Array Information Library Universal Navigator, AILUN, http://ailun.stanford.edu/) was developed by a Stanford University group that re-annotates all gene-expression/proteomics data from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/), which is a public repository for gene-expression and other high-throughput experimental data covering numerous platforms and species. The AILUN server builds a universal identifier table by relating all probe IDs to Entrez Gene IDs on a monthly basis, and it is the first tool available that allows researchers to compare microarray data across different platforms and map genes across species [85]. It also provides an opportunity for further discovery in complicated disease processes using the larger numbers of samples that have been deposited in GEO. The choice of processing method has a major impact on differential expression analysis of microarray data [86]. Some statistical issues should be given consideration in data analysis, such as class comparison, class prediction and class discovery [87]. When microarray data are being compared, various factors influence the agreement between studies, such as different technologies and platforms, statistical analysis criteria, protocols, and laboratory variability [88].
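To show what such re-annotation can look like in practice, the sketch below (hypothetical data layout and function names; not the AILUN implementation) maps probe IDs to current Entrez Gene IDs using an AILUN-style lookup table and collapses redundant probes by keeping, for each gene, the probe with the highest mean signal.

```python
import pandas as pd

def reannotate_and_collapse(expr, probe_to_gene):
    """expr: probes x samples DataFrame indexed by probe ID;
    probe_to_gene: mapping (dict or Series) from probe ID to Entrez Gene ID."""
    annotated = expr.copy()
    annotated["entrez_id"] = annotated.index.map(probe_to_gene)
    annotated = annotated.dropna(subset=["entrez_id"])        # drop probes with no current gene ID
    sample_cols = [c for c in annotated.columns if c != "entrez_id"]
    annotated["mean_signal"] = annotated[sample_cols].mean(axis=1)
    # keep the most intense probe per gene as its representative
    best = (annotated.sort_values("mean_signal", ascending=False)
                     .drop_duplicates("entrez_id"))
    return best.set_index("entrez_id")[sample_cols]
```

Other collapsing rules (e.g. averaging all probes per gene) are equally valid; the key point is that the probe-to-gene table must be refreshed against current annotation releases before cross-platform comparison.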

Conclusions

High-throughput DNA microarray technology has been increasingly applied in kidney transplantation to classify molecular sub-types, to predict outcome and the response to treatment, and to identify novel therapeutic targets. Although the results hold promise, this technology will not have a full impact on routine clinical practice until there is further standardization of techniques and optimal clinical trial designs to set up higher-volume validation studies for the generated biomarkers. Owing to substantial disease heterogeneity and the number of genes being analyzed, collaborative, multi-institutional studies are required to accrue enough patients for sufficient statistical power. Customized arrays or multiplex PCR for informative biomarkers can then be applied in the clinic for event prediction, treatment stratification, immunosuppression customization, and improved graft and patient survival.

Our scientific environment is ripe for research-based implementations of integrative tools that support knowledge-based data mining. This integration can provide the cornerstone of research in the coming years. Developments in this area will require close interdisciplinary collaboration and will lead not only to the integration of data and knowledge, but also to computer-supported experiments and to knowledge generation platforms, thereby closing the loop of data gathering, hypothesis generation and hypothesis testing.