Introduction

Cancer researchers have made significant progress in identifying a new 'molecular taxonomy' of cancer through the use of genomics technologies. Specifically, the use of DNA microarrays has produced robust molecular phenotypes for many tumours, including brain [1, 2], breast [3-10], colon [11, 12], gastric [13], kidney [14], leukaemia [15-17], lymphoma [18-20], lung [21-23], melanoma [24], ovary [25-28], prostate [29-32] and small, round blue-cell tumours of childhood [33]. In a subset of these studies, the gene expression profiles strongly suggest that this information would improve diagnosis and predict clinical outcome better than the standardised prognostic criteria, such as tumour grade, tumour size, patient age and patient performance status [6, 7, 19, 20]. These recent breakthroughs in the laboratory have been qualified successes. The lack of a standardised method for data collection, data analysis and validation, however, has made it difficult to compare studies from different laboratories rigorously, and has thus hampered the introduction of this type of data into clinical medicine. Fortunately, the microarray field has proposed universal standardisation guidelines to help scientists and clinicians accurately compare the results from different laboratories, and has thereby potentially paved the way for the use of gene expression analysis in clinical medicine [34, 35]. Based on the published work of van't Veer et al. [6] and van de Vijver et al. [7], the Netherlands Cancer Institute in Amsterdam announced in January 2003 that it would become the first institution in the world to use DNA microarray analysis to make treatment decisions regarding women with breast cancer [36]. As of today, four additional institutions are incorporating gene expression patterns into clinical trials of breast cancer (Table 1) [37], and the field awaits the results with great interest.

Table 1 Clinical trials using gene expression analysis

The encouraging early results from DNA microarray technology have inspired the development of a variety of array-based platforms [38-42] and contributed to the development of a system-wide study of proteins -- the field of modern proteomics. Traditionally, two-dimensional gel electrophoresis (2-D PAGE) with mass spectrometry has been the basis of proteomic technology [43]. Although this technique has provided rich information about individual proteins, its clinical usefulness for the study of the proteome is limited, primarily by low sample throughput compared with other techniques. Additionally, there are technical challenges relating to the reproducibility of the 2-D gels, the relatively low numbers of proteins that can be resolved on each gel and the limited sensitivity (ie low-abundance proteins are not easily identified). Modern proteomics technology, as discussed in several recent reviews, is developing quickly and is focused on mass spectrometry-based serum pattern profiling and protein microarrays in continuing efforts to discover new molecular markers and therapeutic targets in human cancer [44-46]. The use of serum proteomic pattern analysis has created diagnostic signatures for ovarian [47, 48], breast [49], prostate [50] and liver cancers [51]. In the Laboratory of Pathology at the National Cancer Institute (NCI), in collaboration with the Food and Drug Administration (FDA), this proteome-based strategy is being used for early detection of ovarian cancer in high-risk women. Preliminary data are so encouraging that the technology is on track to enter clinical trials in the very near future [47]. For women living with the spectre of an aggressive disease, such as ovarian cancer, a reliable screening test would represent a major step forward in the diagnostic capabilities of the physicians who treat them.

Clinical applications of genomics

The integration of 'global gene profiling' into clinical medicine was exemplified in the early weeks of the severe acute respiratory syndrome (SARS) pandemic in 2003, when the Centers for Disease Control sent tissue samples and viral cultures from SARS patients to Dr Joseph DeRisi's research team at the University of California, San Francisco. The team explored the origin of the novel coronavirus using their custom oligonucleotide DNA microarray, which represents 1,000 viruses [52, 53]. Of the 12,000 oligonucleotides spotted on the array, the patient samples hybridised only to a group of eight oligonucleotides representing two virus families: Coronaviridae and Astroviridae [54]. Although the identification of the novel coronavirus associated with SARS was made using a broad range of laboratory testing, this example demonstrates the power of microarray technology as a molecular diagnostic tool to test the spectrum of viruses in a single assay and to narrow, to a finite number, the potential pathogens in an unknown disease with a nonspecific clinical presentation. Admittedly, the current review is focused on genomic applications to cancer, but this example deserves special mention, given the worldwide attention on SARS. Moreover, as illustrated in this example, the integration of genomic analysis with traditional clinical and histological assessments is finding its way into clinical management.

Breast cancer -- a genomic approach

The van't Veer et al. [6] and van de Vijver et al. [7] expression data on breast cancer have been discussed extensively since the announcement that, collectively, the two groups were beginning a large clinical trial for women with stage I-II disease at the Netherlands Cancer Institute [6, 7, 55, 56]. The upcoming trial has the potential to significantly improve the current breast cancer classification system, and influence treatment, if the original results can be reproduced in a much larger population of women. The first study, by van't Veer et al. [6], identified a molecular profile of 70 genes that could be used to predict which patients, in a group of young women diagnosed with stage I-II breast cancer who did not have axillary lymph node metastases, will develop distant metastases within five years. The study generated great interest because of its implications for the use of adjuvant treatments in patients with early stage breast cancer. Current statistics show that the majority of patients with stage I-II breast cancer who have histologically tumour-free axillary lymph nodes will be disease-free after five years when treated with breast-conserving surgery plus radiation therapy or mastectomy -- the current recommended therapy for localised breast cancer [57]. Unfortunately, nearly 20 per cent of women treated appropriately for their primary breast cancer will develop metastatic disease and, thus, are the best candidates for adjuvant therapy. Identifying these candidates using the current clinical criteria, however, is incredibly difficult. The recommendations of the National Institutes of Health (NIH) Consensus Panel and the International Consensus Panel from the St Gallen Conference, using their respective prognostic and predictive criteria (hereafter collectively referred to as the 'conventional consensus criteria') to determine eligibility for adjuvant therapy, advise treatment for up to 90 per cent of patients with lymph node-negative breast cancer [58, 59]. When van't Veer and colleagues [6] applied the 70-gene expression profile to the patient cohort in their study, it allowed them to accurately differentiate between low-risk and high-risk women, thereby reducing the proportion of patients who would ordinarily be advised to receive adjuvant treatment from 70-90 per cent -- based on the conventional consensus criteria -- to 40 per cent.

To validate the prognostic value of the 70-gene expression profile as an accurate predictor of the risk of distant metastases, van de Vijver et al. studied 295 new breast cancer tumours, which included specimens from women with histologically positive axillary lymph nodes [7]. Currently, nearly all women with positive lymph nodes are treated with some type of adjuvant therapy. The expression analysis of the validation study again proved to be a more accurate predictor of distant metastases within five years than the conventional consensus criteria and, thus, improved the likelihood of identifying the patients who will benefit most from adjuvant therapy. Specifically, the probability of remaining metastasis-free after ten years was 85.2 per cent in the patients classified as having a 'good-prognosis signature' and 50.6 per cent in the patients classified as having a 'poor-prognosis signature'. In addition, the gene expression profile was predictive of survival. The overall ten-year survival rate was 94.5 per cent in patients with a 'good-prognosis signature' and 54.6 per cent in patients with a 'poor-prognosis signature'. Three of the important clinical characteristics of the conventional consensus criteria were associated with the gene expression profile -- age of the patient, histological grade of the tumour and oestrogen receptor (OR) status. Of note, one of the most widely accepted prognostic factors -- nodal status -- was not associated with the gene expression profile, as those with node-positive disease and node-negative disease were nearly equally distributed in the two groups. The Dutch team is now ready to use this information in a trial of 5,000 women with stage I-II breast cancer, to decide who will receive adjuvant therapy.

The preliminary data provide an exciting start for the integration of genomics approaches into the clinic, and they demonstrate the potential of the technology to improve diagnostics and treatment. It will be interesting to learn whether or not the gene expression profile will be validated in a larger group of women, given that the expression profile was generated from young women (<53 years of age), while the median age for the diagnosis of breast cancer is 60-65 years. In addition, as anticipated when making treatment decisions for stage I-II patients with breast cancer, a small percentage of women will be misclassified as having a 'good-prognosis signature' -- ie a low risk for developing distant metastases -- will not receive adjuvant systemic therapy and will develop metastases within five years. Although this misclassification error rate is not significantly higher than the misclassification rate that occurs today, using the best clinical criteria available to the physician, further work should be done so that this technology can be introduced into the clinic without denying women available treatments. An upcoming study to be directed by Daniel Haber at the Massachusetts General Hospital underscores this point, as he will not use the gene expression profiles from his study to decide on treatment options, but will rather validate the Dutch prognostic profile. Lastly, it is interesting that 97 per cent of the tumours in the 'good-prognosis signature' were OR positive. Other gene expression profiling studies have observed that the OR status of a breast tumour has a strong influence on the subsequent classifications [60]. As 63 per cent of the tumours in the 'poor-prognosis signature' identified by van de Vijver et al. [7] were OR positive, however, further studies are required to detect the additional pathways -- along with the oestrogen signalling pathway -- that determine the classification. The addition of future proteomic data to the breast cancer gene expression analysis may help to identify those additional signalling pathways that are important for a good prognosis.

Ovarian cancer -- a proteomics approach

Ovarian cancer is diagnosed at an advanced clinical stage -- when the ovarian cancer cells have metastasised from the ovary to the pelvis, peritoneal cavity or other distant sites -- in more than two-thirds of women [61]. The five-year survival rate for these late-stage patients is 35-40 per cent, despite the best possible surgical and chemotherapeutic treatment. If, however, ovarian cancer is detected while it is still confined to the ovary (stage I) and treated appropriately, the five-year survival rate is far more favourable (95 per cent). Unfortunately, early-stage ovarian cancer is difficult to detect because patients are frequently asymptomatic and few reliable tumour markers exist. Thus, the development of dependable serum markers for the early detection of ovarian cancer would improve the survival rate of women facing this disease. At the NCI-FDA Clinical Proteomics Program, a proteomics approach is being used for the early detection of a variety of cancers, as well as in non-neoplastic settings such as infectious, autoimmune and vascular diseases, prenatal diagnosis and transplantation rejection [44, 62]. Several clinical trials are being planned, including the first of its kind for ovarian cancer. The identification of specific and sensitive molecular markers for epithelial ovarian cancer is a priority of the NCI-FDA Clinical Proteomics Program (Table 2), and this proposed trial has the potential to develop the first reliable screening test for ovarian cancer, if the protein signature that was identified in a preliminary study of women with a high risk of developing the disease can be applied to screening women in the general population.

Table 2 Clinical trials using proteomic analysis

Petricoin et al., using mass spectrometry coupled with an artificial intelligence computer algorithm, developed a system of low molecular weight serum protein profiling and identified a specific protein signature that was associated with ovarian cancer in asymptomatic women with a high risk of developing the disease [47]. In the study, serum samples from 50 healthy women and 50 women with ovarian cancer were used as a 'training set' to identify a serum protein signature that was specific for ovarian cancer. Validation with an additional 116 unknown serum samples resulted in a highly accurate classification of the patients without ovarian cancer (95 per cent), as well as those with benign disease (100 per cent) and those with ovarian cancer (100 per cent) -- including all stage I cancers. Overall, this result yielded 100 per cent sensitivity and 95 per cent specificity for the identification of ovarian cancer. The positive predictive value for the sample set was 94 per cent, compared with 35 per cent for the CA-125 serum marker for the same samples.
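The relationship between the three reported measures can be made concrete with a short worked sketch. The counts used below (50 cancers among the 116 validation samples, all detected, and 3 false positives among the 66 remaining samples) are assumptions chosen to be consistent with the 116 samples and the percentages quoted above; they are not stated in this text, and the class and function names are purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test result with the true diagnosis."""
    true_pos: int   # cancer present, test positive
    false_neg: int  # cancer present, test negative
    true_neg: int   # cancer absent, test negative
    false_pos: int  # cancer absent, test positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cancers that the test detects."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of cancer-free samples that the test clears."""
    return c.true_neg / (c.true_neg + c.false_pos)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of positive test results that are true cancers."""
    return c.true_pos / (c.true_pos + c.false_pos)


# ASSUMED counts (not given in the text): 50 cancers, 66 non-cancers, 116 in total.
validation = ConfusionCounts(true_pos=50, false_neg=0, true_neg=63, false_pos=3)

print(f"sensitivity: {100 * sensitivity(validation):.0f} per cent")
print(f"specificity: {100 * specificity(validation):.0f} per cent")
print(f"PPV:         {100 * positive_predictive_value(validation):.0f} per cent")
```

Under these assumed counts the three values round to 100, 95 and 94 per cent. Unlike sensitivity and specificity, the positive predictive value depends on the proportion of cancers in the sample set, so a figure calculated from an enriched validation set would be expected to fall when the same test is applied to a general population in which the disease is far less common.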

The significant implication of the NCI-FDA study is that the serum protein signature accurately identified all of the cases of early-stage ovarian cancer, making it a potential diagnostic tool that has so far been lacking in the clinical laboratory. In addition, this study promises to create some very realistic opportunities for the clinical management of ovarian cancer, as the generation of the diagnostic mass spectra requires only a small serum sample, and the results can be obtained within 30 minutes in a cost-effective manner. The most pressing goal for now, however, is to validate the original results in a large, multi-institutional clinical trial so that this new diagnostic tool will soon be available for patients and physicians. The results of the trial will be important for a better understanding of epithelial ovarian tumours, as the specific molecular and protein pathways involved in the development of this disease remain unknown. It is hoped that adding a serum proteomics approach to the current screening methods for ovarian cancer will translate into improved diagnostics and treatments for what is currently an unpredictable and aggressive cancer.

Another goal of the NCI-FDA Clinical Proteomics Program is to use protein microarrays to profile the functional state of the signalling pathways that comprise the protein networks within human tissue cells (Figure 1). It is anticipated that the data generated from such studies will identify specific signalling pathways that are deregulated in a variety of human diseases. Moreover, it is the authors' hope that understanding these signalling pathways will facilitate the development of new drugs that target specific 'nodes' in the protein circuitry, and thereby the design of a treatment regimen that will have greater efficacy with less toxicity. For example, one ongoing clinical application at the NCI involves a phase II clinical trial studying the type III receptor tyrosine kinase inhibitor, imatinib mesylate, for the treatment of women with epithelial ovarian cancer. In this trial, biopsies are taken from the primary tumour prior to the initiation of therapy; four weeks into the treatment regimen, a protein microarray is used to determine whether the receptor tyrosine kinase pathways are active and to correlate the clinical efficacy and toxicity of the therapy. This concept is the premise for molecular-targeted therapy, and similar studies involving other types of cancer are currently underway at the NCI.

Figure 1 Profiling the disease signature for application to individualisation of therapy. From: Liotta, LA. et al. (2001), JAMA Vol. 286, p. 18. Used with permission.

Conclusion

The application of serum proteomics and gene expression analysis to early diagnosis and treatment decisions is a daunting challenge. It will require parties practising in previously disparate disciplines to work together and understand the data, which may be in the form of clinical information, mathematical algorithms and mass spectrometry. The challenge of 'clinical proteomics' and DNA microarrays is no longer just the development of new technologies, but rather the best use and integration of these technologies for the diagnosis and treatment of disease. The process of this integration represents a new, evolving field of 'translational medicine'.