RefSeq Refinements of UniGene-Based Gene Matching Improve the Correlation of Expression Measurements Between Two Microarray Platforms


Matching genes across microarray platforms is a critical step in meta-analysis, and standard practice uses UniGene to do so. However, numerous studies have found poor correlations between platforms when genes are matched by UniGene.

We profiled samples from 33 breast cancer patients on two different microarray platforms (Affymetrix and cDNA) and investigated gene matching. Our results confirmed that UniGene-based matching led to poor correlations of gene expression between platforms. Using RefSeq, a database maintained by the National Center for Biotechnology Information (NCBI), we developed and implemented a new method to refine gene matching. We found that the correlations between gene expression measurements were substantially higher after the RefSeq matching. Our approach differs from previously reported sequence-matching approaches and retains useful expression measurements. It is a sensible approach for matching probes across platforms.
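The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the refinement rule used here (keep a UniGene-matched pair only when the two probes share at least one RefSeq accession) is an assumption made for illustration, and the probe names, accessions, expression values and the helper names `unigene_pairs`, `refseq_refined` and `pearson` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def unigene_pairs(affy, cdna):
    """Pair every Affymetrix probe with every cDNA clone in the same UniGene cluster."""
    return [
        (a, c)
        for a, a_info in affy.items()
        for c, c_info in cdna.items()
        if a_info["unigene"] == c_info["unigene"]
    ]


def refseq_refined(pairs, affy, cdna):
    """Assumed refinement rule: keep pairs that share at least one RefSeq accession."""
    return [(a, c) for a, c in pairs if affy[a]["refseq"] & cdna[c]["refseq"]]


# Hypothetical toy annotation and expression data for four samples.
affy = {
    "A1": {"unigene": "Hs.1", "refseq": {"NM_000001"}, "expr": [1.0, 2.0, 3.0, 4.0]},
    "A2": {"unigene": "Hs.1", "refseq": {"NM_000002"}, "expr": [4.0, 1.0, 3.0, 2.0]},
}
cdna = {
    "C1": {"unigene": "Hs.1", "refseq": {"NM_000001"}, "expr": [1.1, 2.1, 2.9, 4.2]},
}

pairs = unigene_pairs(affy, cdna)
refined = refseq_refined(pairs, affy, cdna)

# Per-pair correlation of expression across samples, before and after refinement.
corr_before = {p: pearson(affy[p[0]]["expr"], cdna[p[1]]["expr"]) for p in pairs}
corr_after = {p: corr_before[p] for p in refined}
```

In this toy data both Affymetrix probes fall in the same UniGene cluster as the cDNA clone, but only one shares a RefSeq accession with it, so the refinement discards the poorly correlated pair.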

We conclude that UniGene alone is insufficient to match genes across platforms. Refined matching based on RefSeq significantly improves the quality of matches.



We would like to acknowledge Stephen Tirrell, James Stec, Mark Ayers and Jeffrey S Ross from Millennium Pharmaceuticals (Cambridge, MA, USA) for performing the microarray hybridisation. Millennium Pharmaceuticals also provided research funding to Dr Pusztai to conduct the clinical trial.

This research was supported in part by the University of Texas SPORE in Lung Cancer grant CA070907 and Prostate Cancer grant CA90270.

The authors have no conflicts of interest that are directly relevant to the content of this article.

Corresponding author

Correspondence to Dr Jing Wang.

Cite this article

Ji, Y., Coombes, K., Zhang, J. et al. RefSeq Refinements of UniGene-Based Gene Matching Improve the Correlation of Expression Measurements Between Two Microarray Platforms. Appl Bioinformatics 5, 89–98 (2006).


  • cDNA Clone
  • Expression Measurement
  • cDNA Array
  • Gene Expression Measurement
  • Affymetrix Array