Successful phase III trials improve the standard of care for cancer patients by providing new treatment regimens that have superior survival rates, less toxicity, or both when compared with conventional regimens. However, the external validity or generalizability of trial results depends directly on the extent to which the results are applicable to the patient population of a specific practice. Selection bias can seriously influence the external validity of a clinical trial. Participants in clinical trials tend to have better survival outcomes than do nonparticipants. In addition, differences among the patient characteristics that are reported in trials complicate direct comparisons between reports. Therefore, the presentation of baseline patient characteristics is a critical component of any study report on treatment for cancer. Patient characteristics should be presented clearly, so that all readers—physicians, oncologists, and other healthcare decision-makers—can confirm, to the extent possible, whether the study population is representative of that seen in their clinical practice.

Researchers are actively pursuing more effective treatments by conducting many global trials of new drugs, including molecular-targeted agents. However, various problems are emerging in interpreting and comparing the data from these clinical trials. The ToGA study demonstrated a survival benefit from adding trastuzumab to capecitabine (or 5-fluorouracil [5-FU]) and cisplatin in the treatment of patients with human epidermal growth factor receptor 2 (HER2)-positive advanced gastric cancer [1]. The AVAGAST study failed to show significant prolongation of overall survival when bevacizumab was added to chemotherapy for advanced gastric cancer [2]. Review of these studies reveals remarkable diversity among regions or countries (e.g., Asia, Europe, and the United States) in patient characteristics such as age, site of the primary gastric tumor, and the rate of receipt of second-line chemotherapy, all of which may influence trial outcomes. Asian countries contributed greatly to patient recruitment in these two global trials, but patients enrolled from Asia accounted for less than half of the trial populations. The realistic magnitude of benefit of the new drugs in Asian countries can therefore only be estimated from the subgroup analyses. Investigators often want to know whether patients will respond differently to a treatment depending on prerandomization characteristics. Investigating this clinical question can lead to important findings indicating that a treatment has a greater effect in some patients than in others. However, the results of subgroup analyses must be interpreted with caution, because small sample sizes limit statistical power. Deciding on a course of clinical treatment based on subgroup analysis results is highly controversial.
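
The loss of power in a subgroup can be made concrete with a rough calculation. The sketch below uses the standard Schoenfeld approximation for a two-sided log-rank test with 1:1 allocation, in which power depends on the number of events rather than the number of patients. The event counts and the hazard ratio are illustrative assumptions only; they are not taken from ToGA, AVAGAST, or any other trial, and the function name `logrank_power` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def logrank_power(events: int, hazard_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided log-rank test with 1:1 allocation.

    Schoenfeld approximation: power = Phi(sqrt(d) * |ln HR| / 2 - z_{1 - alpha/2}),
    where d is the total number of events observed in both arms.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_effect = sqrt(events) * abs(log(hazard_ratio)) / 2
    return std_normal.cdf(z_effect - z_alpha)


# Illustrative assumption: a true hazard ratio of 0.75 in every patient.
full_trial = logrank_power(events=400, hazard_ratio=0.75)  # roughly 0.82
subgroup = logrank_power(events=100, hazard_ratio=0.75)    # roughly 0.30

print(f"Power with 400 events: {full_trial:.2f}")
print(f"Power with 100 events: {subgroup:.2f}")
```

Under these assumptions, a subgroup contributing one quarter of the events would detect the same true effect less than one time in three, which is one reason a nonsignificant subgroup result cannot be read as evidence of no benefit.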

The simultaneous expansion of global clinical research and a deepening understanding of the need for individualized medicine have revealed the importance of reporting patient characteristics appropriately, so that data from multiple, often international, trials can be integrated more effectively into subgroup analyses that have more statistical power. Results from these types of analyses allow researchers to test the relationships between various characteristics and endpoints, to construct hypotheses useful for answering clinical questions, and to plan new trials that will yield more effective treatments. This old, yet new, topic has been addressed in this issue of Gastric Cancer by Shitara and colleagues [3]. They performed a comprehensive review of published randomized clinical trials for advanced gastric cancer and found substantial inconsistencies in the reported patient characteristics and in the adoption of stratification factors. Histology, number of disease sites, adjuvant treatment, and receipt of second-line chemotherapy were each reported in less than 50% of the trials. Important prognostic characteristics, such as performance status, disease status, and number and location of metastases, differed among reports according to time period (before vs. after 2004) and region. For example, compared with non-Asian trials, Asian trials included a statistically significantly higher proportion of patients with diffuse-type disease relative to intestinal-type disease. As Shitara and colleagues point out, now is the time to generate international consensus on the patient characteristics that should be used for trial stratification and on the additional characteristics that should be provided in reports of results, in order to better inform future research efforts as well as clinical decision-making.

It can be challenging to enroll a homogeneous patient population, because patients have so many different backgrounds and disease characteristics, even in a randomized controlled trial of a single disease entity such as advanced gastric cancer. Researchers acknowledge that allocation bias is inevitable to some extent, but this drawback should be minimized, especially when the sample size is small. With a small sample size, stratified randomization is adopted to minimize confounding bias, using a few important baseline characteristics (i.e., confounding factors) that influence the study outcome. Although stratification is very commonly used in clinical trials, investigators and readers are often uncertain about its importance. What stratification factors should be selected for trials in advanced gastric cancer? What type of trial design should be used? Guidelines for reporting clinical trials often do not mention these important elements. A clinical guide has been suggested to help investigators know when to stratify randomization and to help readers know when to look for such randomization. According to this guide, stratified randomization is important in three settings: small trials (those enrolling fewer than 400 patients) in which the treatment outcome may be affected by clinical factors known to have a large effect on prognosis; large trials in which interim analyses of data from a small number of patients are planned; and trials designed to show equivalence between therapies. Stratified randomization is frequently useful to reduce both type I and type II errors, improve trial efficiency, and facilitate subgroup and interim analyses. Evaluations of prognostic factors are important but are often complex, with uncertain implications.
Clinicians and researchers know that performance status, disease status (advanced or recurrent), number of metastatic organs, location of metastasis, and disease extension (locally advanced or metastatic) influence the survival of patients with advanced gastric cancer. Greater attention is needed in the planning and analysis of the studies that use these factors.
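
As a minimal sketch of how stratified randomization operates in practice, the example below implements permuted-block randomization within strata: each stratum (here defined by performance status and disease status, two of the factors listed above) keeps its own shuffled block of assignments, so the arms stay balanced within every stratum. The class name, the two arm labels, the block size of 4, and the stratum labels are all assumptions made for illustration; they do not describe the procedure of any particular trial.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomizer:
    """Permuted-block randomization carried out separately within each stratum."""

    def __init__(self, arms=("A", "B"), block_size=4, seed=None):
        if block_size % len(arms) != 0:
            raise ValueError("block_size must be a multiple of the number of arms")
        self.arms = tuple(arms)
        self.block_size = block_size
        self._rng = random.Random(seed)
        # One queue of pending assignments per stratum.
        self._pending = defaultdict(list)

    def assign(self, stratum):
        """Return the treatment arm for the next patient in the given stratum."""
        queue = self._pending[stratum]
        if not queue:
            # Start a new block containing each arm equally often, in random order.
            block = list(self.arms) * (self.block_size // len(self.arms))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


# Hypothetical strata: (performance status, disease status).
randomizer = StratifiedBlockRandomizer(block_size=4, seed=2012)
incoming_patients = [
    ("PS 0-1", "advanced"),
    ("PS 0-1", "recurrent"),
    ("PS 2", "advanced"),
    ("PS 0-1", "advanced"),
]
for stratum in incoming_patients:
    print(stratum, "->", randomizer.assign(stratum))
```

Because every completed block contains each arm equally often, the imbalance within any stratum can never exceed half a block. The trade-off is that each added stratification factor multiplies the number of strata, which is why only a few strongly prognostic factors are normally used.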

Current major themes in studies of the treatment of cancer focus on three categories: patient demographics, tumor histology, and molecular biomarkers. Conventional risk-stratification methods for cancer incompletely predict prognosis, treatment efficacy, or both. For example, treatment with trastuzumab is considered most effective in patients with the intestinal type of gastric cancer, which is the histologic type most often HER2-positive. However, Shitara and colleagues [3] show that, compared with non-Asian trials, Asian trials included a statistically significantly higher proportion of patients with diffuse-type disease relative to intestinal-type disease. They point to subset analyses of the First-line Advanced Gastric Cancer Study (FLAGS) trial and the Japan Clinical Oncology Group (JCOG) 9912 trial, both of which showed the superiority of S-1 to 5-FU in the treatment of diffuse-type gastric cancer [4, 5]. These results indicate that, as new therapeutic options emerge, it is desirable to use our increasing knowledge of tumor biology to optimize and individualize therapy. Biomarkers can aid in patient stratification (risk assessment), treatment response identification (surrogate markers), or differential diagnosis (to identify the patients most likely to respond to specific drugs). These factors are interrelated rather than independent. We explicitly seek to explore and evaluate the methodology for the clinical validation of biomarker-guided therapy. Although most oncologists regard patients with advanced gastric cancer as having widely differing characteristics and prognoses, most phase III clinical trials testing new drugs still treat these patients as a homogeneous population, with the possible exception of trials of HER2-directed drugs.
The clinical applicability of many biomarkers will become clear over the next decade. The point here is that therapy will be increasingly individualized, even for advanced gastric cancer, and that the stratification factors used in trials and the patient characteristics reported will change drastically.