In praise of arrays
Microarray technologies have both fascinated and frustrated the transplant community since their introduction roughly a decade ago. Fascination arose from the possibility offered by the technology to gain a profound insight into the cellular response to immunogenic injury and the potential that this genomic signature would be indicative of the biological mechanism by which that stress was induced. Frustrations have arisen primarily from technical factors such as data variance, the requirement for the application of advanced statistical and mathematical analyses, and difficulties associated with actually recognizing signature gene-expression patterns and discerning mechanisms. To aid the understanding of this powerful tool, its versatility, and how it is dramatically changing the molecular approach to biomedical and clinical research, this teaching review describes the technology and its applications, as well as the limitations and evolution of microarrays, in the field of organ transplantation. Finally, it calls upon the transplant community to form multidisciplinary teams that can take advantage of this technology and its expanding applications in unraveling the complex injury circuits that currently limit transplant survival.
Keywords: Microarray, Gene expression, Transplantation, Rejection, Immunosuppression, Monitoring
Microarrays, the basics
Evolution of microarray technology over time
The importance of microarrays in human biology
Microarray-based insights for the transplant physician
Current unmet biological questions in transplantation and how can we use microarrays to address them
Methods of applying microarrays to clinical practice
Limitations of using microarrays in clinical practice
The black-box of microarray data analysis
The completion of the human genome project has revolutionized translational medicine. High-throughput technologies, mapped to known target sequences, many with known functions, now permit investigators to interrogate the genome, transcriptome, proteome and metabolome systematically, and to assess genomic mutations, polymorphisms, epigenetic alterations, and micro-RNAs. These technologies herald the potential for us to translate their results into novel sensitive and specific diagnostic tests and less toxic therapeutics, with the anticipation of moving away from protocol-based approaches to personalized medicine. In this new era, investigators could potentially assess the molecular and pathophysiological characteristics of individual patients and their transplanted organs, tailor therapeutic regimens, and administer them based on these profiles. One of the key steps in this process will be the identification and validation of biomarkers. To date, microarray technologies are, perhaps, the most successful and mature methodology for high-throughput and large-scale genomic analyses.
Microarrays, the basics
Microarray technology is based on the principle of complementary, single-stranded, nucleic acid sequences forming double-stranded hybrids; thus, in essence, it is a high-throughput Southern blot, where thousands of single-stranded sequences that are complementary to target sequences are synthesized, or spotted on to a small glass or membrane support. The gene probes on the array are either small (20–60 bp) single-stranded oligonucleotides, synthesized in situ (provided by Affymetrix, Agilent), or cloned complementary DNA (cDNA) amplified by polymerase chain reaction (PCR) and obtained by reverse transcription of messenger RNAs (mRNAs). Illumina uses designed oligonucleotide probes attached to beads that are deposited randomly on a substrate. The selection of microarray platforms will have important effects in the later stages, determining the complexity and flexibility of the data to be analyzed. Care must be taken to choose a platform that will allow data to be analyzed and disseminated in the manners desired. Experimental design and analysis are generally more straightforward with one-color microarrays.
RNA is next labeled with a detectable marker (fluorescent dye) and hybridized to an array containing individual gene-specific probes, in either a dual-color (sample and control pool with different colors, e.g. cDNA array) or single-color (sample label only, e.g. oligonucleotide arrays) hybridization system. The array is hybridized with the labeled sample(s) by incubation (usually overnight) and is then washed to remove non-specific hybrids. A laser excites the attached fluorescent dyes to produce light detected by a scanner, which generates a digital image from the excited microarray. The digital image is then processed by specialized software to transform the image of each spot into a numerical reading. This process finds the specific spot location and shape, summarizes spot intensities, and subtracts the surrounding background noise. To facilitate the comparison between the experiments and to compensate for differences in labeling, hybridization and detection methods, a data normalization step is usually performed. This final numerical reading is proportional to the concentration of the target sequence in the sample to which the probe in the spot is directed. In competitive two-dye assays, the reading is transformed to a ratio equal to the relative abundance of the target sequence (labeled with one type of fluorochrome) from a sample relative to a reference sample (labeled with another type of fluorochrome). In the one-dye technologies, the fluorescence is commonly yellow, whereas, in two-dye technologies, the colors used are green for reference and red for sample (although a replicate using dye-swap is often done for quality control). The appropriate choice of technology depends on experimental design, availability and cost.
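The two-dye readout described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular scanner software: the function names, the intensity floor of 1.0, and the choice of median centering as the normalization step are all assumptions made for the example, and the intensities are invented.

```python
import math
from statistics import median


def log_ratios(sample_fg, sample_bg, ref_fg, ref_bg, floor=1.0):
    """Background-subtract both channels and return log2(sample / reference) per spot.

    The floor (an assumed value) keeps background-subtracted signals positive
    so that the logarithm is always defined.
    """
    ratios = []
    for sf, sb, rf, rb in zip(sample_fg, sample_bg, ref_fg, ref_bg):
        sample_signal = max(sf - sb, floor)
        ref_signal = max(rf - rb, floor)
        ratios.append(math.log2(sample_signal / ref_signal))
    return ratios


def median_center(values):
    """One simple normalization (an assumption here): subtract the array median."""
    m = median(values)
    return [v - m for v in values]


def dye_swap_average(forward, swapped):
    """Combine a forward array with its dye-swap replicate.

    In the swapped array the sample carries the other dye, so its log ratio
    has the opposite sign; subtracting and halving cancels dye-specific bias.
    """
    return [(f - s) / 2 for f, s in zip(forward, swapped)]


# Hypothetical spot intensities: red = sample, green = reference.
red_fg, red_bg = [1100, 500, 300], [100, 100, 100]
green_fg, green_bg = [600, 500, 900], [100, 100, 100]

forward = median_center(log_ratios(red_fg, red_bg, green_fg, green_bg))
# Dye-swap replicate: the sample is now read in the green channel.
swapped = median_center(log_ratios(green_fg, green_bg, red_fg, red_bg))
combined = dye_swap_average(forward, swapped)
```

Working on the log2 scale makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric, which is one reason ratios are usually log-transformed before further analysis.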
Evolution of microarray technology over time
The evolution of microarray technology (n/a not applicable)
Table columns: Type of array | Number of probes or probe sets (surviving entry: 1 million exons)
The main difference between cDNA arrays (e.g. Lymphochip) and oligonucleotide-based arrays (e.g. Affymetrix, Agilent) is probe length: 0.5–3 kb for cDNA arrays and 15–70 bp for oligonucleotide arrays. In addition to mRNA profiling, the oligonucleotide arrays can perform genotyping studies and detect splice variants. Unlike cDNA arrays, they require multiple probes per target, but they offer greater spot consistency and less batch-to-batch variability.
The importance of microarrays in human biology
Microarray technologies were initially designed to measure the transcriptional levels of RNA transcripts derived from thousands of genes within a genome in a single experiment. This technology has made it possible to relate physiological cell states to gene-expression patterns for studying tumors, disease progression, cellular response to stimuli, drug target identification and transplant injury mechanisms. For example, subsets of genes with increased and decreased activities (referred to as transcriptional profiles or gene-expression “signatures”) have been identified for acute lymphoblastic leukemia, breast cancer, prostate cancer, lung cancer, colon cancer, multiple tumor types, organ transplantation, and drug response. Moreover, because the pool of published data grows every day, integrated analyses of several studies, or “meta-analyses”, have been proposed in the literature. These approaches detect generalities and particularities of gene expression in diseases.
More recent uses of DNA microarrays in biomedical research are not limited to gene expression. DNA microarrays are being used to detect single nucleotide polymorphisms (SNPs) of the human genome (HapMap project), aberrations in methylation patterns, alterations in gene copy number, alternative RNA splicing, pathogens [27, 28] and micro-RNAs.
Gene-expression profiles for prognostic classifiers are usually built by the correlation of gene-expression patterns, generated from specimens, with clinical outcome (e.g. acute rejection vs stable without rejection). Gene-expression predictive classifiers of response to treatment are generated by the correlation of gene-expression data, derived from samples taken before treatment, with clinical and pathological response to treatment. Although the identification of the most relevant information from microarray experiments is still under active research, well-established methods are available for a broad spectrum of experimental set-ups. The analysis of gene-expression data at the pathway and functional level, along with a systems biology approach, will provide deeper insights into the biological effects of complex disease states, such as in the organ transplant milieu, and will improve risk assessment of the same.
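As a rough illustration of how gene-expression patterns can be related to a clinical outcome, the sketch below ranks genes by a Welch t-statistic between two outcome groups. This is only one common univariate starting point, assumed here for illustration; it is not the method of any study cited in this review. The gene names, expression values and the group labels "AR" and "STA" are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch t-statistic for two groups of expression values (unequal variances)."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    # A gene that is constant in both groups carries no information.
    return (mean(group_a) - mean(group_b)) / se if se > 0 else 0.0


def rank_genes(expression, labels, label_a, label_b):
    """Return (gene, t) pairs sorted by decreasing absolute t-statistic.

    expression maps each gene name to one value per sample, in the same
    order as the outcome labels.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == label_a]
        b = [v for v, lab in zip(values, labels) if lab == label_b]
        scores[gene] = welch_t(a, b)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical log-scale expression values for six samples.
labels = ["AR", "AR", "AR", "STA", "STA", "STA"]
expression = {
    "geneA": [8, 9, 10, 2, 3, 4],   # higher in acute rejection
    "geneB": [5, 6, 5, 5, 6, 5],    # unchanged
    "geneC": [1, 2, 3, 4, 4, 5],    # lower in acute rejection
}
ranking = rank_genes(expression, labels, "AR", "STA")
```

In a real study the top-ranked genes would still need correction for multiple testing and confirmation in an independent validation set before being treated as a classifier.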
Microarray-based insights for the transplant physician
It is challenging to dissect any allograft injury mechanism with single-gene studies because of the complexity of the mechanisms for renal allograft rejection with different immunosuppressive protocols and the spectrum of the response to immunological injury. Expression of the cytotoxic molecules granzyme B and perforin has been associated with rejection and has been detected in blood, urine, and biopsy tissue samples [32, 33] in human and experimental studies. However, renal allografts transplanted into perforin or granzyme A or B “double knockout” (gene deletion) mice showed T cell-mediated rejection that was not mediated by perforin or granzymes, indicating the redundancy of the immune response during rejection.
The advent of microarray technology has enabled researchers to detect the expression of thousands of genes simultaneously, rather than measuring the expression of one gene at a time, and has unlocked information about disease heterogeneity that could not have been predicted by standard clinical or pathologic criteria. Pioneering studies of gene-expression profiles in breast cancer have led to the molecular classification of breast cancer into clinically relevant sub-types. This has provided new tools with which one can predict cancer recurrence and response to different treatments, and new insights into various oncogenic pathways and the process of tumor progression. Subsequent microarray studies have changed the paradigm of approach in lymphoma, where the diffuse B-cell lymphoma was identified as having the worst prognostic outcome, and in kidney transplantation, where rejection sub-types with differential survival benefits and a prognostic role for focal B-cell infiltrates were identified for recalcitrant rejections. Using cDNA microarrays, Hauser and co-workers have determined the gene-expression patterns specific to living-donor vs deceased-donor kidneys and suggest that suppression of specific targets of inflammation in the deceased donor might be a promising intervention for abrogating post-ischemic acute renal failure.
These findings have brought a global paradigm shift from traditional hypothesis-driven experiments toward large-scale hypothesis generation and testing through clinical trials. In the past few years, there has been an increasing number of publications on solid organ transplantation, with particular emphasis on the heart and kidney. Microarrays have also corroborated the involvement of known pathways in rejection injury reported in earlier studies, such as dysregulation of the complement system [38, 39], interleukins, anti-human leukocyte antigen (HLA) allo-antibodies [41, 42] and solute transport genes, in allograft rejection.
Key array-based published studies in transplantation (AR acute rejection, CAN chronic allograft nephropathy, TOL operational tolerance, MIS minimum immunosuppression, HTN hypertension, RVA reno-vascular abnormalities, EPO erythropoietin, LDN laparoscopic donor nephrectomy, DT drug toxicity, STA patient with stable graft function)
Brouard et al. 
Proc Natl Acad Sci U S A 2007
TOL, AR, CAN, stable, MIS
AKR1C1, AREG, BRRN1, C1S, CCL20, CDC2, CDH2, CHEK1, DHRS2, DEPDC1, ELF3, HBB, IGFBP3, LTB4DH, MS4A1, MTHFD2, PARVG, PLXNB1, PODXL, PPAP2C, RAB30, RASGRP1, RBM9, RHOH, SLC29A, SMILE, SOX3, SPON1, TK1 and TLE4
Li et al. 
Physiol Genomics 2007
Globin genes as confounders in biomarker discovery from PAXgene samples for AR
Nagarajan et al. 
Clin Transplant 2007
HTN, RVA, EPO
Hemoglobin zeta, G2, E1, CTGF, PLA2 G2A, PDGF-A, VEGF, CDH5, GDF1, TIE, TBRG1, EPS8, FIBP, EPOR, TFRC, STAT5, Jak2 and CLK1
Park et al. 
U133A 2.0 GeneChip
STAT1, STAT2, proteasome subunit [beta]-type-8, Col1A1, FN1, phosphoinositide-3-kinase regulatory subunit-3, VCAM1, GRZMA, GBP1, IER3, HLA-DRbeta, IL-10, TGFB, IFNG, IL-6 and FoxP3
Mas et al. 
U133A 2.0 GeneChip
Kidney biopsy, peripheral blood, urine
TGF-beta, laminin, gamma 2, metalloproteinases-9, collagen type IX alpha 3, immunoglobulins, cytokine, chemokines receptors, EGFR, FGFR2, AGT, EGFR and TGFB
Morgun et al. 
Circ Res 2006
Heart biopsy, kidney and lung
CCL18, TRB, LTB, ITGB2, HA-1, CORO1A, IGKC, RARRES3, CCL5, HLADRB3, STAT1, C1QA, GMFG, CD74, CD14, PSCD4, BTN3A3, HLA-F and UBE2L6
Hotchkiss et al. 
TGF-B, thrombospondin 1, PDGF, integrins, MMP7, C4B, properdin, VCAM1, Annexins, VEGF, EGF and FGF
Kurian et al. 
HIF1a, HIF1B, TNF, TNFR, TGF-B, FGF, integrins, MMP, elastin, GHRH and VEGF
Eikmans et al. 
J Am Soc Nephrol 2005
HG U95Av2 GeneChip
Surfactant protein-C (SP-C), S100 calcium-binding protein A8 (S100A8), S100A9 and immuno-globulin genes
Melk et al. 
Kidney Int 2005
NADH dehydrogenase, APO, kynureninase PAH, dynein, CLDN8, MMP7, fibulin, tenascin, CSPG2, SERPINA3, immunoglobulins, somatostatin receptor, THY1, natriuretic peptide receptor and SLC solute transporter family
Zhang et al. 
Clin Transplant 2004
HG U95Av2 GeneChip
Membrane-type matrix metalloproteinase 1, SH3 binding protein, MEA6, TOB family 4, RBP2, IL-1A, Argininosuccinate synthetase, Brain and nasopharyngeal carcinoma, NSG-x, hVH-5 and Eosinophil Charcot-Leyden crystal protein
Mansfield et al. 
Am J Transplant 2004
MIP-1, CCR5, CX3CR1, DARC, SCYB10, SCYA5, SCYA3, SCYA13, SCYA2, IL2RB, IL6R, IL16, IL15R, DEFA1, DEFB1, SCYA2, SCYA5, MST1, STAT1, STAT6, CD69, MAL, NFATC3, Annexins, CASP10, PECAM1 and VCAM1
Hauser et al. 
Lab Invest 2004
Complements, LTF, NK4, VCAM1, interleukins, HLA, BCL6, GPX2, FBP1, PCK2, SORD, APOA4, CYP3A7, FABP1, APOM, CYP3A4, HIF1A, STAT1, TIMP1, ADAMTS1, TNFSF10 and CDC25B
Kainz et al. 
Am J Transplant 2004
Osteopontin, SOD2, RARRES1, chemokine ligand 1, antileukoproteinase, STAT1, CDH6, SPP1, SERPINA3 and GPX2
Flechner et al. 
Am J Transplant 2004 (a)
CAN, drug effect
TGFB, TNFA, PDGF, ICAM, VCAM1, integrin B, MCP-1, CCR2, MPI-3B, MHC, MMP, TIMP1, RANTES, VEGF, collagen III, Angiotensin II receptor, TSP and FN1
Flechner et al. 
Am J Transplant 2004 (b)
HG U95Av2 GeneChip
Kidney biopsy, peripheral blood
AIF, CD14, CD163, CD2, CD3D, CD48, CD53, chemokines, interleukins, C1q, immunoglobulins, IFNG, TCR, TNF and HLA
Donauer et al. 
AQP2, AQP3, lipoprotein lipase, PML-2, Napsin 1, precursor, Flotillin-1, Type IV collagenase, Hepatocyte growth factor activator inhibitor, RIG-like 7–1, MECI-1, PGER, TEM8, MHC class I, C1s and immunoglobulins
Higgins et al. 
Mol Biol Cell 2004
Cortex, medulla, papillary tips
Identify patterns of gene expression in discrete portions of the normal kidney
Sarwal et al. 
N Engl J Med 2003
Kidney biopsy, pediatrics
AR, CAN, DT and infection
TCR, HLA class II, HLA class I, immunoglobulins, lactotransferrin, chemokines, CD20, CD34, IGF1R, TNFR, MST1, NK4, duffy antigen/chemokine, receptor, STAT1, TGFR1, granzyme A, perforin, IL2R, CD53, lymphotoxin, lymphotoxin R, NFKB1, CD59, IFNGR1 and annexins
Scherer et al. 
HG U95Av2 GeneChip
Keratin tumor suppressor candidate 7, OS9(APRIL), G-protein gamma7, protein/cell adhesion molecule-like, GRB2-associated binding protein 1, and PRLR
Chua et al. 
Am J Transplant 2003
Hb-zeta, Hb-beta, Hb-alpha2, FOLR2, FOLR3, CAH1, immunoglobulins, GPX1, and lactotransferrin
Zhang et al. 
Transplant Proc 2002
CD80, interleukins, CD44, CD40L, CD40, VLA-5, LFA-1, TCR alpha, Lck, calcineurin, PKC, IFNG, TGFB, TNF-alpha, TNFR1, G-CSFR and PDGF receptor
Akalin et al. 
HuMig, TCR RING4, ISGF-3, CD18
Kusaka et al. 
Agilent rat oligonucleotide array G4130A
Kidney allografts, T lymphocytes
Brain death donor
Gro1, IP-10, p53, NF kappa B, Myc, Jun, c-fos, LCN2 and SPP1
Berthier et al. 
Kidney Int 2006
230 A GeneChip
MMP-11,-12,-14, ADAM-17, TIMP-1,-2 TGF-B, MMP-9, meprin and MMP-24
Djamali et al. 
mouse stress toxicity GEArray
ANXA5, CASP1, CASP8, TNFRII, TRAIL, FASL, BAX, inducible nitric oxide synthase, cytochrome p450 4A, [alpha]-crystalline B, heme-oxygenase II, SOD, HSP60, HSP27, BCL-X and metallothionein
Schuurs et al. 
Am J Transplant 2004
HTN brain death
Water channel AQP-2, selectins, IL-6, oc-B-fibrinogen, KIM-1, HO-1, Hsp70, MnSOD2, ATF-3, EGR-1 and PIK3R1
Einecke et al. 
Am J Transplant 2007
Leonard et al. 
FASEB J 2006
HG U95Av2 GeneChip, murine U77A
Mouse kidney, human proximal tubular epithelial cells
Ischemia reperfusion injury
In mouse model: ALDH1A1, ALDH1A7, GSTM5, GSTA2, GSTP1, NQO1 and Nrf2. In human: Nrf2 is up-regulated on reoxygenation
Famulski et al. 
Am J Transplant 2006
Define IFNG-dependent, rejection-induced transcripts (GRITs) in mouse kidney allografts. IFNG inducible: CXCl9, UBD and MHC
Einecke et al. 
Am J Transplant 2007
Kidney allografts, T lymphocytes
Cytotoxic T lymphocyte-associated transcripts (CATs): CD2, CD3g, GZMB, TCRB, MES
Current unmet biological questions in transplantation: how can we address them?
Whilst great advances have been made in the discovery of putative biomarkers in transplantation, disappointingly few have been translated into clinically applicable assays; much of this is due to a lack of well-designed clinical validation studies. The most important challenges are the need for well-designed validation and the variation in endpoint definitions between studies. Adapting molecular endpoints from single-gene studies as representative of a particular mechanism of toxicity/injury often assumes that a postulated mechanism is known beforehand, and this may result in “over-fitting” of the data, making the inference not entirely accurate.
Chronic allograft nephropathy: what are the early injury pathways?
Chronic graft injury, potentially an indolent immune response resulting in slow deterioration of organ function, characterized pathologically in the kidney by tubular atrophy, interstitial fibrosis, and fibrous intimal thickening of the arteries, has a relatively transcriptionally homogeneous response of tissue fibrosis, when investigated at the time of the established injury. Although there is a general consensus on the patient criteria for chronic allograft nephropathy (CAN), it is not universal. Despite the many presumed triggers for this injury (alloimmune responses; donor age and tissue quality; brain death; preservation/reperfusion injury; post-transplantation and systemic stresses in the recipient environment), early injury triggers that could provide drug targets for manipulation of injury progression have not been identified in cross-sectional human studies. Animal models, where injury mechanisms can be segregated better, have been useful, but carefully designed longitudinal clinical studies of how graft injury evolves may be required. Given the difficulty of obtaining recipient consent for multiple post-transplantation biopsies, the discovery of biomarkers specific to CAN has been challenging. Most of the published reports (Table 2) are from a single sample time point [52, 53, 54, 55]. However, with careful study design, controlled studies have been performed using sequential (paired) patient samples [56, 57, 58]. These studies are limited by relatively small sample size, non-standardized protocol biopsies, and few sample collection time points; perhaps the field will be led by organs where performance of these protocol biopsies is almost standard of care, e.g. heart and lung transplantation.
Additional issues that would be important in the design of clinical validation studies for biomarkers would be: a prospective nature of sample identification, also allowing for samples that could be then used for prediction of the clinical events, maintaining homogeneity of patient sub-groups and disease pathology, similar immunosuppression regimes and controlling for patient demographics and the use of other concomitant treatments.
The limitations of microarray studies and the importance of well-designed validation strategies have been demonstrated by microarray applications in cancer. In an attempt to predict prognoses of cancer patients on the basis of previously published DNA microarray studies, re-analysis of data from the seven largest published studies showed that the list of genes identified as predictors of prognosis was highly unstable; the selection of training sets strongly affected molecular signatures, and biomarkers from the training sets did not perform as well in independent validation studies. The most important challenge in translational transplantation research is the lack of a true gold standard for the classification of disease in organ transplantation. The current histologic classification of deteriorating organ transplants has many limitations, including arbitrary cut-off points and poor inter- and intra-observer reproducibility. This makes identification of informative sample sets very difficult and, in turn, hampers the generation of microarray-based hypotheses in transplantation.
Acute rejection prediction and immunosuppression customization
Acute rejection (AR) depends on an orchestrated immune response to histocompatibility antigens expressed by the grafted tissue. The redundancy of AR mechanisms and the problems with peripheral blood transcriptional analysis (outlined above) have made it difficult for reliable biomarkers to be developed for prediction of allograft rejection and its outcome, irrespective of immunosuppression usage, concomitant infection, recipient age or organ type. The delineation of AR from antibody-mediated rejection (AHR, also termed humoral rejection) is still debated. Effector mechanisms primarily responsible for the rejection process classically involve activation of effector T cells and memory T cells, but alternative mechanisms of acute rejection also recruit (to varying degrees) B-cells, natural killer cells, eosinophils and neutrophils, and antibody-mediated rejection is currently thought to play a role in approximately 33% of AR episodes. Thus, though numerous markers of different biologic pathways have been evaluated as diagnostic and prognostic tools to serve this purpose in human [47, 56] and animal organ transplantation [60, 61], no biomarkers have become firmly established for prediction of acute rejection. It is unlikely that a single biomarker will meet all clinical needs, such as non-invasive diagnosis and prediction of transplant rejection and survival, given the clinical confounders in the recipients’ post-transplantation environment. Microarrays may, over time, offer multiple markers as gene-based tests, and combining these with the genes found for graft tolerance may enhance future patient monitoring and enable individualized risk-adapted patient care.
Limited donor source—how can we expand this?
Finding ways to use kidneys more effectively for transplantation has the potential to extend the donor pool, using deceased donors who meet the Standard Criteria for Donation (SCD) as well as Expanded Criteria Donors (ECDs), a category created by the United Network for Organ Sharing in 2002. Higher-risk donor organs, once considered unsuitable, could also be transplanted safely. With regard to higher-risk donor organs, the question remains: Are all kidneys from older donors equivalent with regard to biological and cellular health? An array-based analysis of kidneys from a wide spectrum of ages (8 months to 80 years) suggested that, though older kidneys appeared to have increased extracellular matrix turnover and a non-specific inflammatory response, combined with a reduction in processes dependent on energy metabolism and mitochondrial function, these results did not always correlate with chronological age. Extending expression data into hypothesis testing, by correlating good vs poor “transcriptional health” of older kidneys with post-transplantation function, may offer a means to expand the donor pool. To date, these studies have not been performed.
Methods of applying microarrays to clinical practice
Identification of differentially expressed genes between sample groups
Determining biomarkers for clinical phenotypes of disease
The identification of gene-expression “signatures” associated with disease categories is called biomarker detection or supervised classification. As the biomarker panel needs to be predictive of disease class or clinical outcome, learning and validation sets of samples are required, making the sample size relatively large for this type of analysis. Prediction analysis of microarrays (PAM, http://www-stat.stanford.edu/~tibs/PAM/) is a powerful tool that can be adapted for this use. The selection of a unique list of genes by this approach does not, in and of itself, offer sufficient knowledge for one to understand the biology of a given system, suggesting the necessity to incorporate biological knowledge into array analysis. Recent approaches to microarray analysis address the limitations of conventional bioinformatics approaches by enriching the analysis with knowledge of biological processes. This approach has the advantage over classical bioinformatics approaches that the feature selection step can be performed based on data that are completely independent of the clinical samples used for the analysis. This strategy is very promising, especially in disease states that are not easily classified into clear distinct categories, as is the case in clinical transplantation. Some ways to do this are either to use commercially available software (Ingenuity Pathway Analysis: http://www.ingenuity.com/; Pathway Studio: http://www.ariadnegenomics.com/) or to use hypergeometric enrichment analysis from published data sets of biologically relevant experiments [1, 49].
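The hypergeometric enrichment analysis mentioned above asks whether a gene list overlaps a reference gene set (for example, a pathway or a published signature) more than expected by chance. The sketch below computes the standard one-sided hypergeometric tail probability; the function name and the numbers in the example are hypothetical, and a real analysis would also correct for the number of gene sets tested.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, list_size, overlap):
    """One-sided enrichment p-value: P(X >= overlap).

    X is the number of reference-set genes found when list_size genes are
    drawn without replacement from universe_size genes, of which set_size
    belong to the reference set.
    """
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 10 genes on the array, 4 in the reference set,
# 3 genes called differentially expressed, 2 of them in the reference set.
p_small = hypergeom_enrichment_p(10, 4, 3, 2)

# A more array-like (still invented) scale: 20,000 genes, a 200-gene set,
# a 100-gene signature, and 10 shared genes.
p_large = hypergeom_enrichment_p(20000, 200, 100, 10)
```

Because the reference gene set comes from outside the clinical samples being analyzed, this test uses information that is independent of the data from which the gene list was derived, which is the advantage noted in the text.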
Microarray data for survival analysis
Identifying biomarkers that correlate with survival times is a very important objective in the analysis of microarray data. Selected genes can be combined with clinical classes and incorporated into regression models to detect variations in survival times using both the Kaplan–Meier method and statistical tests. For example, gene-expression microarray data for different molecular rejection groups (AR-1, AR-2, and AR-3) segregate by recovery of graft function 4–6 weeks after treatment intensification for the rejection episode (Fig. 3). Different linear regression models can be tested with independent variables (time, drug levels, and graft function) and dependent variables (genes) to ascertain any association between gene expression and clinical variables.
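For readers unfamiliar with the Kaplan–Meier method, the following sketch computes the product-limit survival estimate for one group of patients. The function name and the follow-up data are hypothetical; in practice the estimate would be computed separately for each expression-defined group and the curves compared with a statistical test such as the log-rank test.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up time for each patient (any consistent unit).
    events: 1 if the event (e.g. graft loss) was observed at that time,
            0 if the patient was censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months) for five grafts in one expression group.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk until they leave follow-up but never trigger a drop in the curve, which is what allows incomplete follow-up to be used without bias toward early events.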
Other applications of microarrays in clinical practice
Commercially available microarrays can detect single nucleotide polymorphisms (SNPs), which are an important tool for identifying genetic loci linked to complex disorders. Unfortunately, the number of SNPs covered by the array-based methods is fewer than 1% of the known SNPs deposited in the public databases. Altered methylation patterns in genomic DNA can also be identified by microarrays, by the use of methylation-sensitive restriction enzymes to generate fragments enriched with either unmethylated or methylated CpG sites. Epigenetic phenomena, such as cytosine methylation, histone acetylation and phosphorylation, control the activation and deactivation of genes, such that genes methylated in their promoters can become inactive and can predispose individuals to cancers. Chromatin immunoprecipitation (ChIP-on-chip) assays [72, 73] can allow the estimation of alterations in the expression of transcription factors in several diseases (e.g. c-Myc is known to be differentially expressed in a variety of cancers). Pathogen-specific microarrays have been generated [27, 28] and can allow the direct interrogation of specific pathogens on an array-based platform.
What limits the use of microarrays in clinical practice?
The high-throughput nature of this technology, combined with the expected large volume of data, results in a high risk for error. With the increasing use of genomic studies in transplantation, there is a need to control for various confounder effects that obscure biomarker discovery in graft rejection. In view of the many concerns raised, the US Food and Drug Administration (FDA) launched the Microarray Quality Control (MAQC) project. This study showed an excellent correlation of gene expression of human reference RNA (Stratagene) and human brain reference RNA (Ambion), across seven different array platforms, across five different laboratories, using three different amplification protocols. Thus, while the need for quality control is a limitation for array studies, recognizing ways to address it could turn this limitation into a benefit, resulting in the generation of robust datasets that could be queried with confidence by multiple users.
At present, because of the sophistication of microarrays, this is a costly technology available only in selected laboratories. Microarray technologies, however, are rapidly improving, and the costs of the technique continue to fall, thus paving the way for wider access and more generalized usage.
Particularly for renal transplant biopsies, differing amounts of cortex vs medulla represented in a sample can affect the pattern of gene expression of a sample. Therefore, one can cross-reference a publicly available gene list specific for different compartments of kidney to minimize false clustering of samples. There is also the problem of variable sample pathology. If only one biopsy core is being used for microarray analysis, it will be necessary to identify transcriptional changes that are more global and robust than patchy cellular interstitial infiltration, such as effects of cytokines on the renal tissue or global interstitial changes. mRNA is a very fragile molecule that can be degraded within minutes of a surgical procedure, drastically affecting the interpretation of microarray data. Moreover, subtle variations in biopsy handling and method of RNA extraction from samples can result in different levels of gene expression.
Difficulty in detecting some disease processes in transplantation by microarrays
Existing collagen, readily visible to the pathologist, is not necessarily associated with mRNA changes if the process of active fibrogenesis is complete. Small cell populations that make a major contribution to disease might give only a weak signal in transcriptome studies of whole biopsies or unseparated blood. Antibodies produced in lymphoid tissues could damage the kidney without any mRNA being detectable in the kidney. Microarray analysis cannot offer insights into these critical cellular and molecular processes in the tissues.
Discrepancy in array studies
Weak overlap exists between gene lists from individual studies of similar phenotypes in transplantation. The disparity among microarray data can be attributed to several factors: differences in microarray platforms with differing gene sets; weak statistical power and small sample sizes; biological variance because of variability in patient characteristics; experimental variance including lack of uniform protocols for study design, sample collection, RNA processing, and sample labeling and hybridization; different tools for data processing and statistical analysis (Table 3); variable thresholds for data filtering; varying stringencies for false discovery rates and statistical significance; and different data analysis methods. Nevertheless, a recent study compared microarray data for rejection across platforms, samples, and laboratories with some success. A gene set for acute rejection prediction generated from a heart biopsy was used to predict previously published data for kidney biopsy [1, 56] and lung broncho-alveolar lavage cells.
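To make concrete how the stringency chosen for the false discovery rate changes a gene list, the sketch below applies the Benjamini–Hochberg adjustment, one widely used false discovery rate procedure (chosen here as an assumption; individual studies may use others), to a handful of invented p-values and then applies two different cut-offs.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def genes_passing(genes, adjusted, threshold):
    """Names of genes whose adjusted p-value is below the chosen threshold."""
    return [g for g, q in zip(genes, adjusted) if q < threshold]


# Hypothetical raw p-values for four genes.
genes = ["gene1", "gene2", "gene3", "gene4"]
raw_p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(raw_p)

strict = genes_passing(genes, q, 0.03)   # stricter false discovery rate
lenient = genes_passing(genes, q, 0.05)  # more permissive false discovery rate
```

With these invented values the stricter cut-off keeps two genes and the more permissive one keeps all four, so two groups analyzing identical data with different thresholds would already report different lists.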
Confounders exist in microarray experiments
Previous studies have demonstrated that the abundance of globin genes in whole blood may mask the underlying biological differences in whole-blood samples. In a comparison of gene-expression profiles of peripheral blood, using different protocols of sample preparation, amplification and hybridization on the Affymetrix platform, we demonstrated that the globin reduction method is not sufficient to unmask clinically relevant, rejection-specific, transcriptome profiles in whole blood. Applying an additional mathematical method for globin gene depletion improves the efficacy of globin reduction but cannot remove the confounding influence of globin gene hybridization. Other problems in the analysis of blood may be more serious than the globin issue: the massive changes in cell populations caused by illness, surgery or infection make it difficult to define small changes in specific mRNA levels. It will be challenging to distinguish the blood signal for the alloimmune response from such common non-specific changes.
Table 3
List of pitfalls in microarray analyses and solutions (SVD singular value decomposition, Cy cyanine, qPCR quantitative polymerase chain reaction)
| Pitfalls in microarray analysis | Solutions |
| --- | --- |
| Data variability, particularly for genes with low expression levels | Use replicate arrays to reduce false positives |
| Small sample amounts, which limit replication | Use amplified RNA (aRNA) |
| Expression bias due to amplification | Use improved protocols with single-round amplification |
| Difficult to control input RNA amounts accurately | Use a normalization standard and a two-color labeling strategy to minimize this problem |
| Spot quality may vary | Use stringent data-filtering criteria to assess signal/noise ratio and spot signal consistency |
| Lot-to-lot variation in PCR yield on cDNA arrays | Use data-filtering methods such as SVD to reduce batch biases (see text) |
| Hybridization efficiency varies with different probes | Use long-oligonucleotide arrays to minimize selected hybridization artifacts |
| Unequal labeling efficiency of Cy3 and Cy5 dyes | Use reciprocal labeling to confirm observations or use a single-dye labeling system |
| Small numbers of samples and very large numbers of genes analyzed may contribute to false discovery | Confirm mRNA measurements using independent test methods such as qPCR and independent samples |
| Heterogeneity within study groups may contribute to false discovery | Use statistical modeling such as logistic regression to combine multiple genes |
| Protein expression levels and function not measured | Confirm with protein expression methods (e.g. immunohistochemistry, protein arrays) |
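The reciprocal (dye-swap) labeling remedy listed in Table 3 for unequal Cy3 and Cy5 labeling efficiency can be illustrated with a little arithmetic. The sketch below assumes that the dye bias is additive on the log2 scale and identical on both arrays of the pair, which is a simplification. The function name and the example intensities are hypothetical.

```python
import math


def dye_swap_estimate(forward, reverse):
    """Combine a dye-swap pair of two-color measurements for one gene.

    forward: (cy5, cy3) intensities with the sample labeled with Cy5.
    reverse: (cy5, cy3) intensities with the sample labeled with Cy3.
    Returns (log2 sample/reference ratio, log2 dye bias).
    """
    m_forward = math.log2(forward[0] / forward[1])  # true log ratio + dye bias
    m_reverse = math.log2(reverse[0] / reverse[1])  # -(true log ratio) + dye bias
    log_ratio = (m_forward - m_reverse) / 2         # the bias cancels
    dye_bias = (m_forward + m_reverse) / 2          # the true ratio cancels
    return log_ratio, dye_bias


# Hypothetical gene: 4-fold up in the sample, with Cy5 reading 2-fold too bright.
log_ratio, dye_bias = dye_swap_estimate(forward=(800.0, 100.0), reverse=(200.0, 400.0))
print(log_ratio, dye_bias)
```

A single forward array would have reported an 8-fold change for this gene. Averaging the swapped pair recovers the 4-fold change (log2 ratio of 2) and isolates the dye effect.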
The black-box of microarray data analysis
All published microarray studies should be made publicly available through the internet, on proprietary websites and in public microarray database repositories, and should generally follow the minimum information about a microarray experiment (MIAME) compliance format [80] or microarray gene-expression markup language (MAGE-ML) [81].
Publicly available resources for microarray data analysis include:
- The Gene Expression Profile Analysis Suite (GEPAS; http://www.gepas.org)
- The Institute for Genomic Research (TIGR; http://www.tigr.org/software/microarray.shtml)
- Significance analysis of microarrays (SAM; http://www-stat.stanford.edu/tibs/SAM/)
- Prediction analysis of microarrays (PAM; http://www-stat.stanford.edu/tibs/PAM/)
- Expression Profiler: next generation (EP:NG; http://www.ebi.ac.uk/expressionprofiler)
- Cancer gene expression data analyzer (caGEDA; http://bioinformatics.upmc.edu/GE2/GEDA.html)
- Analysis of microarray data (AMIADA; http://dambe.bio.uottawa.ca/amiada.asp)
A comprehensive discussion of different analytical strategies for microarray analyses is beyond the scope of this review.
Probe annotation on microarrays is a problem for data analysis, as illustrated by one commercial microarray design in which the number of probes associated with a given gene changes over time. These changes affect approximately 5% of the probe sets across the annotation releases of a 2-year span. For the Affymetrix Mouse 430 A/B arrays, 13,699 of 45,000 probes changed gene names from 2003 to 2004, and 2,277 (5%) probes changed annotation by their Entrez Gene identifiers. Similarly, in human array platforms, e.g. the Affymetrix HG-U133plus chip, unreliable representative public identifiers were found for 18.2% of the probes.
Probe redundancy is an additional problem (each transcript is probed by multiple oligonucleotide probes). This could potentially be caused by an annotation problem, with at least 5% misannotation in each generation of the platform, e.g. multiple probes may be assigned to multiple genes, or a single probe may map to multiple genes or Entrez IDs. The attention of manufacturers should be drawn to the maintenance of annotation accuracy and to the reduction of the number of probes required for each gene, attempting to choose the most representative probe or probes. In addition, different portions of the probe sets contain unreliable representative public gene IDs, with multiple genome hits. Harbig et al. recently reassigned the probe sets on the Affymetrix platform, on the basis of each 25-mer probe sequence, and found that a large percentage of probe sets did not actually bind fully to a gene. They concluded that the assignment of probes to official probe sets is a problem with the Affymetrix platform. It may also be a significant issue with other platforms, because the known sequence of a gene changes as additional information becomes available.
There are also different levels of detection (background over noise) for probes or probe sets, with specific criteria for each platform, which can affect downstream analyses. Examples of the probe detection criteria used on different array platforms are: Agilent, absolute value of log2 (red channel/green channel) > 0.5 for at least one array; cDNA, mean channel 1 intensity/median background intensity > 1.5, and/or normalized mean channel 2 intensity/median background intensity > 1.5; Affymetrix, using a perfect-match-only model, the value for each probe or probe set is extracted after background subtraction.
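The Agilent and cDNA detection rules quoted above can be written as simple filters. The following is a minimal sketch, not a vendor implementation. It assumes that the background term is the median local background intensity of each channel, that "and/or" can be exposed as an optional flag, and that all intensities are positive. The function and parameter names are hypothetical.

```python
import math


def agilent_detected(red_green_pairs, cutoff=0.5):
    """Two-color rule: |log2(red/green)| > cutoff on at least one array.

    red_green_pairs: one (red, green) intensity pair per array.
    """
    return any(abs(math.log2(red / green)) > cutoff
               for red, green in red_green_pairs)


def cdna_detected(ch1_mean, ch1_background_median,
                  ch2_normalized_mean, ch2_background_median,
                  ratio=1.5, require_both=False):
    """cDNA rule: mean spot intensity over median background > ratio.

    With require_both=False either channel may pass ("or");
    with require_both=True both channels must pass ("and").
    """
    ch1_ok = ch1_mean / ch1_background_median > ratio
    ch2_ok = ch2_normalized_mean / ch2_background_median > ratio
    return (ch1_ok and ch2_ok) if require_both else (ch1_ok or ch2_ok)


# Hypothetical probe measured on two arrays, and one hypothetical cDNA spot.
print(agilent_detected([(100.0, 100.0), (300.0, 200.0)]))
print(cdna_detected(300.0, 100.0, 120.0, 100.0))
```

Because the thresholds differ by platform, the same transcript can pass detection on one platform and fail on another, which is one reason the filtering step should be reported alongside the results.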
The existing tools for converting probe IDs between microarray platforms are very limited. The difficulty in reusing data lies with the mapping of probes to established gene identifiers. Therefore, microarray results need to be re-evaluated periodically with the latest probe annotations. Most recently, a tool (Array Information Library Universal Navigator, AILUN, http://ailun.stanford.edu/) was developed by a Stanford University group that re-annotates all gene-expression/proteomics data from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/), which is a public repository for gene-expression and other high-throughput experimental data covering numerous platforms and species. The AILUN server builds a universal identifier table by relating all probe IDs to Entrez Gene IDs on a monthly basis, and it is the first tool available that allows researchers to compare microarray data across different platforms and map genes across species. It also provides an opportunity for further discovery of complicated disease processes using more samples that have been deposited in GEO. The choice of processing method has a major impact on differential expression analysis of microarray data. Some statistical issues should be given consideration in data analysis, such as class comparison, class prediction and class discovery. When microarray data are being compared, various factors influence the agreement between studies, such as different technologies and platforms, statistical analysis criteria, protocols, and laboratory variability.
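The idea of relating probe IDs from different platforms through a shared gene identifier can be sketched as follows. This is an illustration of the principle only and does not reproduce the AILUN server or its interface. The probe IDs, gene IDs, and function names are hypothetical.

```python
from collections import defaultdict


def build_gene_index(annotation):
    """Invert a probe -> gene ID annotation into gene ID -> sorted probe list.

    Probes without a gene assignment (None) are skipped.
    """
    index = defaultdict(list)
    for probe_id, gene_id in annotation.items():
        if gene_id is not None:
            index[gene_id].append(probe_id)
    return {gene_id: sorted(probes) for gene_id, probes in index.items()}


def shared_genes(annotation_a, annotation_b):
    """Return, per gene present on both platforms, the probes from each platform."""
    index_a = build_gene_index(annotation_a)
    index_b = build_gene_index(annotation_b)
    common = sorted(index_a.keys() & index_b.keys())
    return {gene_id: (index_a[gene_id], index_b[gene_id]) for gene_id in common}


# Hypothetical annotations for two platforms (probe ID -> gene ID).
platform_a = {"a_001": 101, "a_002": 101, "a_003": 202, "a_004": None}
platform_b = {"b_17": 202, "b_18": 303, "b_19": 101}

mapping = shared_genes(platform_a, platform_b)
for gene_id, (probes_a, probes_b) in mapping.items():
    print(gene_id, probes_a, probes_b)
```

Since annotations change between releases, a table of this kind has to be rebuilt from the latest annotation files before each cross-platform comparison.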
High-throughput DNA microarray technology has been increasingly applied in kidney transplantation to classify molecular sub-types, to predict outcome and the response to treatment, and to identify novel therapeutic targets. Although the results hold promise, this technology will not have a full impact on routine clinical practice until there is further standardization of techniques and until optimal clinical trial designs are in place for higher-volume validation studies of the generated biomarkers. Owing to substantial disease heterogeneity and the number of genes being analyzed, collaborative, multi-institutional studies are required to accrue enough patients for sufficient statistical power. Customized arrays or multiplex PCR for informative biomarkers can then be applied in the clinic for event prediction, treatment stratification, immunosuppression customization and improved graft and patient survival.
Our scientific environment is ripe for research-based implementations of integrative tools that support knowledge-based data mining. This integration can provide the cornerstone of research in the coming years. Developments in this area will require close interdisciplinary collaboration and will lead not only to the integration of data and knowledge, but also to computer-supported experiments and to knowledge generation platforms, thereby closing the loop of data gathering, hypothesis generation and hypothesis testing.
- 16. van’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530–536
- 22. Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette T, Pandey A, Chinnaiyan AM (2004) Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci USA 101:9309–9314
- 35. van de Vijver MJ, He YD, van’t Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R (2002) A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347:1999–2009
- 36. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson J Jr, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403:503–511
- 40. Leslie JA, Meldrum KK (2008) The role of interleukin-18 in renal injury. J Surg Res 145:170–175
- 46. Deng MC, Eisen HJ, Mehra MR, Billingham M, Marboe CC, Berry G, Kobashigawa J, Johnson FL, Starling RC, Murali S, Pauly DF, Baron H, Wohlgemuth JG, Woodward RN, Klingler TM, Walther D, Lal PG, Rosenberg S, Hunt S, CARGO Investigators (2006) Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am J Transplant 6:150–160
- 48. Martinez-Llordella M, Puig-Pey I, Orlando G, Ramoni M, Tisone G, Rimola A, Lerut J, Latinne D, Margarit C, Bilbao I, Brouard S, Hernández-Fuentes M, Soulillou JP, Sánchez-Fueyo A (2007) Multiparameter immune profiling of operational tolerance in liver transplantation. Am J Transplant 7:309–319
- 49. Brouard S, Mansfield E, Braud C, Li L, Giral M, Hsieh SC, Baeten D, Zhang M, Ashton-Chess J, Braudeau C, Hsieh F, Dupont A, Pallier A, Moreau A, Louis S, Ruiz C, Salvatierra O, Soulillou JP, Sarwal M (2007) Identification of a peripheral blood transcriptional biomarker panel associated with operational renal allograft tolerance. Proc Natl Acad Sci USA 104:15448–15453
- 53. Eikmans M, Roos-van Groningen MC, Sijpkens YW, Ehrchen J, Roth J, Baelde HJ, Bajema IM, de Fijter JW, de Heer E, Bruijn JA (2005) Expression of surfactant protein-C, S100A8, S100A9, and B cell markers in renal allografts: investigation of the prognostic value. J Am Soc Nephrol 16:3771–3786
- 56. Flechner SM, Kurian SM, Head SR, Sharp SM, Whisenant TC, Zhang J, Chismar JD, Horvath S, Mondala T, Gilmartin T, Cook DJ, Kay SA, Walker JR, Salomon DR (2004) Kidney transplant rejection and tissue injury by gene profiling of biopsies and peripheral blood lymphocytes. Am J Transplant 4:1475–1489
- 70. Chang HY, Nuyten DS, Sneddon JB, Hastie T, Tibshirani R, Sorlie T, Dai H, He YD, van’t Veer LJ, Bartelink H, van de Rijn M, Brown PO, van de Vijver MJ (2005) Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc Natl Acad Sci USA 102:3738–3743
- 74. Gardner L (2002) The c-Myc oncogenic transcription factor, 2nd edn. Academic Press, San Diego, Calif
- 75. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Scherf U, Thierry-Mieg J, Wang C, Wilson M, Wolber PK, Zhang L, Amur S, Bao W, Barbacioru CC, Lucas AB, Bertholet V, Boysen C, Bromley B, Brown D, Brunner A, Canales R, Cao XM, Cebula TA, Chen JJ, Cheng J, Chu TM, Chudin E, Corson J, Corton JC, Croner LJ, Davies C, Davison TS, Delenstarr G, Deng X, Dorris D, Eklund AC, Fan XH, Fang H, Fulmer-Smentek S, Fuscoe JC, Gallagher K, Ge W, Guo L, Guo X, Hager J, Haje PK, Han J, Han T, Harbottle HC, Harris SC, Hatchwell E, Hauser CA, Hester S, Hong H, Hurban P, Jackson SA, Ji H, Knight CR, Kuo WP, LeClerc JE, Levy S, Li QZ, Liu C, Liu Y, Lombardi MJ, Ma Y, Magnuson SR, Maqsodi B, McDaniel T, Mei N, Myklebost O, Ning B, Novoradovskaya N, Orr MS, Osborn TW, Papallo A, Patterson TA, Perkins RG, Peters EH, Peterson R, Philips KL, Pine PS, Pusztai L, Qian F, Ren H, Rosen M, Rosenzweig BA, Samaha RR, Schena M, Schroth GP, Shchegrova S, Smith DD, Staedtler F, Su Z, Sun H, Szallasi Z, Tezak Z, Thierry-Mieg D, Thompson KL, Tikhonova I, Turpaz Y, Vallanat B, Van C, Walker SJ, Wang SJ, Wang Y, Wolfinger R, Wong A, Wu J, Xiao C, Xie Q, Xu J, Yang W, Zhang L, Zhong S, Zong Y, Slikker W Jr (2006) The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24:1151–1161
- 80. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M (2001) Minimum information about a microarray experiment (MIAME)—toward standards for microarray data. Nat Genet 29:365–371
- 81. Spellman PT, Miller M, Stewart J, Troup C, Sarkans U, Chervitz S, Bernhart D, Sherlock G, Ball C, Lepage M, Swiatek M, Marks WL, Goncalves J, Markel S, Iordan D, Shojatalab M, Pizarro A, White J, Hubley R, Deutsch E, Senger M, Aronow BJ, Robinson A, Bassett D, Stoeckert CJ Jr, Brazma A (2002) Design and implementation of microarray gene expression markup language (MAGE-ML). Genome Biol 3:RESEARCH0046
- 86. Shedden K, Chen W, Kuick R, Ghosh D, Macdonald J, Cho KR, Giordano TJ, Gruber SB, Fearon ER, Taylor JM, Hanash S (2005) Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data. BMC Bioinformatics 6:26
- 93. Kainz A, Mitterbauer C, Hauser P, Schwarz C, Regele HM, Berlakovich G, Mayer G, Perco P, Mayer B, Meyer TW, Oberbauer R (2004) Alterations in gene expression in cadaveric vs. live donor kidneys suggest impaired tubular counterbalance of oxidative stress at implantation. Am J Transplant 4:1595–1604
- 94. Flechner SM, Kurian SM, Solez K, Cook DJ, Burke JT, Rollin H, Hammond JA, Whisenant T, Lanigan CM, Head SR, Salomon DR (2004) De novo kidney transplantation without use of calcineurin inhibitors preserves renal structure and function at two years. Am J Transplant 4:1776–1785
- 95. Chua MS, Barry C, Chen X, Salvatierra O, Sarwal MM (2003) Molecular profiling of anemia in acute renal allograft rejection using DNA microarrays. Am J Transplant 3:17–22
- 96. Zhang HQ, Lu H, Enosawa S, Takahara S, Sakamoto K, Nakajima T, Saito H, Suzuki S (2002) Microarray analysis of gene expression in peripheral blood mononuclear cells derived from long-surviving renal recipients. Transplant Proc 34:1757–1759
- 97. Akalin E, Hendrix RC, Polavarapu RG, Pearson TC, Neylan JF, Larsen CP, Lakkis FG (2001) Gene expression analysis in human renal allograft biopsy samples using high-density oligoarray technology. Transplantation 72:948–953
- 98. Kusaka M, Kuroyanagi Y, Kowa H, Nagaoka K, Mori T, Yamada K, Shiroki R, Kurahashi H, Hoshinaga K (2007) Genomewide expression profiles of rat model renal isografts from brain dead donors. Transplantation 83:62–70
- 99. Berthier CC, Lods N, Joosten SA, van Kooten C, Leppert D, Lindberg RL, Kappeler A, Raulf F, Sterchi EE, Lottaz D, Marti HP (2006) Differential regulation of metzincins in experimental chronic renal allograft rejection: potential markers and novel therapeutic targets. Kidney Int 69:358–368
- 100. Djamali A, Reese S, Oberley T, Hullett D, Becker B (2005) Heat shock protein 27 in chronic allograft nephropathy: a local stress response. Transplantation 79:1645–1657
- 101. Schuurs TA, Gerbens F, van der Hoeven JA, Ottens PJ, Kooi KA, Leuvenink HG, Hofstra RM, Ploeg RJ (2004) Distinct transcriptional changes in donor kidneys upon brain death induction in rats: insights in the processes of brain death. Am J Transplant 4:1972–1981
- 102. Leonard MO, Kieran NE, Howell K, Burne MJ, Varadarajan R, Dhakshinamoorthy S, Porter AG, O’Farrelly C, Rabb H, Taylor CT (2006) Reoxygenation-specific activation of the antioxidant transcription factor Nrf2 mediates cytoprotective gene expression in ischemia-reperfusion injury. FASEB J 20:2624–2626