Introduction

Gastric cancer remains a common malignancy in many countries [1]. Photofluorography and esophagogastroduodenoscopy have been used to detect gastric cancer in its early stages, and have contributed to a practical approach for reducing gastric cancer mortality [2,3,4]. However, these methods, especially esophagogastroduodenoscopy, can be invasive and expensive. Therefore, there is a need for a simple, inexpensive, and effective system for detecting individuals at high risk for gastric cancer who could then receive a more detailed examination.

Gastric cancer is considered to be a multifactorial disease, the development of which involves various risk factors. Recently, it has been increasingly recognized that statistical models for disease risk prediction and subsequent risk assessment tools are clinically useful for estimating the probability that a person with a given set of risk factors will develop a disease of interest [5]. These risk assessment tools can help to detect high-risk populations for a disease and in the subsequent clinical decision-making process for both healthcare providers and patients. However, studies aimed at the establishment of risk assessment tools for the development of gastric cancer are scarce [6].

The Hisayama Study is a population-based cohort study exploring the risk factors for cardiovascular disease and lifestyle-related diseases, including gastric cancer, in a Japanese community. Previously, we identified several risk factors for the development of gastric cancer: H. pylori infection, atrophic gastritis, smoking habits, higher hemoglobin (Hb) A1c levels, and dietary factors [7,8,9,10,11]. The aim of this study was to develop and evaluate a risk assessment tool to stratify an individual’s risk of gastric cancer in the future.

Methods

Setting and participants

(a) Derivation cohort

The town of Hisayama is located in a suburban area adjacent to Fukuoka City, a large urban center on Kyushu Island in the southern part of Japan. The present population of the town is approximately 8500. Full community surveys of the residents have been repeated since 1961 [12]. The population screened in 1988 was set as a derivation cohort in this study. A detailed description of the survey has been published previously [13]. Briefly, a total of 2742 Hisayama residents aged 40 years or older (80.9% of the total population in that age group) underwent a health check-up in this survey. After the exclusion of 130 persons with a history of gastrectomy or gastric cancer, 163 for whom laboratory data were not available, and 5 who died during the examination period, a total of 2444 subjects (1016 men and 1428 women) were enrolled in the baseline examination.

(b) Validation cohort

A screening survey for a validation cohort was conducted in a similar fashion to the derivation cohort in 2002 [14]. A total of 3298 residents aged 40 years or older (77.6% of the total population in that age group) underwent a health check-up. After the exclusion of 87 persons with a history of gastrectomy or gastric cancer and 7 for whom laboratory data were not available, a total of 3204 subjects (1349 men and 1855 women) were enrolled at the baseline survey.

Follow-up survey of the derivation and validation cohorts

Each cohort was followed up prospectively by yearly health examinations from December 1988 to November 2002 for the derivation cohort and from December 2002 to November 2007 for the validation cohort. The health status of any subject who had not undergone a regular examination or who had moved out of town was checked every year by mail or telephone. In addition, a daily monitoring system was established by the study team and local physicians or members of the Division of Health and Welfare of the town. All the participants were followed up completely over 14 years in the derivation cohort and over 5 years in the validation cohort. The cases who developed gastric cancer were surveyed in local clinics in the town and hospitals around the town by referring to medical records of barium radiographic examinations, upper endoscopic examinations, and biopsy diagnoses. We also checked all records from annual mass screenings for gastric cancer that applied an upper gastrointestinal series. In addition, to find any concealed gastric cancer, an autopsy was performed on 552 (71.3%) of the 774 subjects in the derivation cohort and 152 (66.4%) of the 229 subjects in the validation cohort who died during the follow-up period. The diagnosis of gastric cancer was confirmed by histological examination of tissue obtained by gastrectomy, endoscopic resection, or autopsy. Pathologic diagnosis of an identified gastric cancer was made according to the guidelines proposed by the Japanese Gastric Cancer Association [15].

Laboratory testing and risk factor measurement

In both the 1988 and 2002 surveys, clinical evaluation and laboratory testing were performed in a similar manner except for the measurement of IgG antibodies to H. pylori and HbA1c. At the baseline survey, serum IgG antibodies to H. pylori were assayed by means of a quantitative enzyme immunoassay using commercial kits (HM-CAP: Enteric Products Inc., Westbury, New York in 1988; and HELpTEST: AMRAD, Kew, Victoria, Australia in 2002). The measurement of serum pepsinogen concentrations was carried out by immunoradiometric assay (PG I/II RIA BEAD; Dainabot, Tokyo, Japan), and atrophic gastritis was defined as present when pepsinogen I was ≤ 70 ng/ml and the pepsinogen I/II ratio was ≤ 3.0 [16]. Since H. pylori infection causes atrophic gastritis, which is strongly related to the development of gastric cancer, the subjects were classified into three groups according to the statuses of baseline H. pylori infection and atrophic gastritis: H. pylori antibody (−) and atrophic gastritis (−); H. pylori antibody (+) and atrophic gastritis (−); and H. pylori infection (+)/(−) and atrophic gastritis (+) [17, 18]. HbA1c levels were measured by high-performance liquid chromatography (HLC-723Hb; TOSOH Inc., Tokyo, Japan) in 1988 and by latex aggregation immunoassay using a Determiner HbA1c kit (Kyowa Medix, Tokyo, Japan) in 2002. Subjects were divided into two groups according to whether their HbA1c level was ≥ 6.5%, based on the level diagnostic for diabetes according to the guidelines of the American Diabetes Association [19]. Information about smoking habits and alcohol intake was obtained by means of a questionnaire administered to each subject. Height and weight were measured with the subject in light clothes without shoes, and obesity was defined as a body mass index ≥ 25.0 kg/m².
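
The categorical definitions above can be illustrated with a minimal Python sketch. It is not the study's own code (the analyses were run in SAS); the `Baseline` record, its field names, and the group labels are hypothetical and introduced only for illustration. The thresholds are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    """Hypothetical baseline record; field names are illustrative assumptions."""
    hp_antibody_positive: bool  # serum IgG antibody to H. pylori
    pepsinogen_i: float         # ng/ml
    pepsinogen_ii: float        # ng/ml
    hba1c: float                # %
    height_m: float
    weight_kg: float


def has_atrophic_gastritis(pg1: float, pg2: float) -> bool:
    """Pepsinogen I <= 70 ng/ml and pepsinogen I/II ratio <= 3.0."""
    if pg2 <= 0:
        # Ratio is undefined; treated here as not meeting the criterion (assumption).
        return False
    return pg1 <= 70.0 and (pg1 / pg2) <= 3.0


def hp_ag_group(b: Baseline) -> str:
    """Three-level grouping by H. pylori antibody and atrophic gastritis (AG)."""
    if has_atrophic_gastritis(b.pepsinogen_i, b.pepsinogen_ii):
        return "AG(+), Hp(+)/(-)"
    return "Hp(+), AG(-)" if b.hp_antibody_positive else "Hp(-), AG(-)"


def high_hba1c(b: Baseline) -> bool:
    """HbA1c at or above the 6.5% diagnostic level."""
    return b.hba1c >= 6.5


def is_obese(b: Baseline) -> bool:
    """Body mass index >= 25.0 kg/m^2."""
    return b.weight_kg / (b.height_m ** 2) >= 25.0
```

Atrophic gastritis is checked first because the third group pools antibody-positive and antibody-negative subjects.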

Statistical analysis

In the derivation cohort, we performed univariate analysis to estimate the hazard ratio (HR) with a 95% confidence interval (CI) for each risk factor for the development of gastric cancer using a Cox proportional hazards model. In this analysis, subjects were censored if they underwent gastrectomy for reasons other than gastric cancer, they died during the study, or at the end of follow-up for those still alive. Clinically or biologically plausible risk factors for gastric cancer were included in the relevant Cox model: age, sex, the combination of H. pylori infection and atrophic gastritis, HbA1c levels, current smoking, current drinking, and obesity [7,8,9,10, 18, 20, 21]. To build the risk prediction model, we selected independent risk factors for the development of gastric cancer using a multivariable Cox proportional hazards model with backward selection at P < 0.10 for the remaining variables. The goodness of fit between the observed and the predicted number of gastric cancer occurrences was tested using the Hosmer–Lemeshow (H–L) test [22]. The ability of the risk prediction model to discriminate persons who would experience gastric cancer from those who would not was evaluated using a c-statistic [23]. We formulated a risk score system based on the risk prediction model [24]. In this system, we set one point to be equivalent to the increase in gastric cancer risk associated with smoking status, which has the lowest regression coefficient among the risk factors in the multivariable model. Thus, the regression coefficient of each risk factor was divided by that of smoking status, and the resulting value was rounded up to the nearest integer. Total risk score was calculated as the sum of the risk scores for the various risk factors. The incidence of first-ever gastric cancer according to the quartiles of total risk scores was estimated with the person-year method. 
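
The conversion from regression coefficients to integer points described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the study's SAS code. The coefficient values are hypothetical placeholders and are not those of Table 2. The phrase "rounded up to the nearest integer" is read here as ordinary rounding to the nearest integer; if strict upward rounding was intended, `math.ceil` would replace `round_half_up`.

```python
import math


def round_half_up(x: float) -> int:
    """Round to the nearest integer, with halves going up (assumed reading)."""
    return math.floor(x + 0.5)


def points_from_coefficients(coefs: dict[str, float], reference: str) -> dict[str, int]:
    """Divide each coefficient by the reference coefficient and round to an integer."""
    base = coefs[reference]
    if base <= 0:
        raise ValueError("reference coefficient must be positive")
    return {name: round_half_up(beta / base) for name, beta in coefs.items()}


def total_score(points: dict[str, int], present: set[str]) -> int:
    """Sum the points of the risk factors that are present in one subject."""
    return sum(points[name] for name in present)


# Hypothetical coefficients for illustration only (NOT the values in Table 2).
HYPOTHETICAL_COEFS = {
    "current_smoking": 0.40,  # reference: smallest coefficient, worth one point
    "male_sex": 0.95,
    "hba1c_ge_6_5": 0.62,
}

points = points_from_coefficients(HYPOTHETICAL_COEFS, reference="current_smoking")
```

With these placeholder values, a subject who is male and a current smoker would receive `total_score(points, {"male_sex", "current_smoking"})` points.
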
The differences and trends in the incidence among the total risk score levels were tested by means of the Cox model. Receiver operating characteristic (ROC) curve analysis was performed to determine the cutoff point of the risk score for predicting gastric cancer incidence. The cutoff value that optimizes the ability to discriminate the risk of incident gastric cancer was determined as the point closest to (0, 1) on the ROC curve (= min{[1 − sensitivity]² + [1 − specificity]²}) [25]. Statistical analyses were conducted using Statistical Analysis Software (SAS) version 9.3 (SAS Institute, Cary, NC, USA). A two-tailed value of P < 0.05 was considered statistically significant.
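
The closest-to-(0, 1) rule for choosing the cutoff can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the study's SAS procedure: the outcome is treated as binary (event or no event during follow-up), censoring is ignored, a subject is counted as test-positive when the score is at or above the candidate cutoff, and the function name and toy data are hypothetical.

```python
def closest_to_corner_cutoff(scores, events):
    """Return (cutoff, sensitivity, specificity) minimising
    (1 - sensitivity)^2 + (1 - specificity)^2 over all observed score values.

    scores: total risk score per subject
    events: 1 if the subject developed the outcome, else 0
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    best = None
    for cutoff in sorted(set(scores)):
        # True positives: events with score >= cutoff.
        tp = sum(1 for s, e in zip(scores, events) if e and s >= cutoff)
        # True negatives: non-events with score < cutoff.
        tn = sum(1 for s, e in zip(scores, events) if not e and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        dist = (1 - sens) ** 2 + (1 - spec) ** 2
        # Strict '<' keeps the lowest cutoff when distances tie.
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Toy data for illustration only (not study data).
toy_scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
toy_events = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = closest_to_corner_cutoff(toy_scores, toy_events)
```

This criterion weights sensitivity and specificity equally, so a different cutoff may be preferable when the costs of false positives and false negatives differ.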

Ethical considerations

The study protocol was approved by the Kyushu University Institutional Review Board for Clinical Research, and written informed consent for medical research was obtained from the study subjects.

Results

Baseline characteristics of the derivation and validation cohorts

The baseline characteristics of the subjects in the derivation and validation cohorts are shown in Table 1. The subjects in the derivation cohort were younger than those in the validation cohort. Male sex, H. pylori infection, atrophic gastritis, current drinking, and obesity were less frequent, whereas the mean value of HbA1c and the proportion of smokers were higher, in the derivation cohort than in the validation cohort.

Table 1 Baseline characteristics of the derivation and validation cohorts

Development of the risk prediction model for gastric cancer in the derivation cohort

During the 14-year follow-up period, gastric cancer developed in 90 subjects (66 men and 24 women) in the derivation cohort. In the univariate analysis, age, sex, the combination of H. pylori antibody and serum pepsinogen test, HbA1c level, smoking habit, and alcohol intake were significantly associated with gastric cancer incidence (Table 2). In the multivariable analysis with backward selection, the abovementioned risk factors other than drinking status remained significant. We constructed the risk prediction model using the regression coefficient of each risk factor from the multivariable model shown in Table 2. Figure 1 shows calibration plots comparing the actual and predicted gastric cancer events based on the deciles of risk in the derivation cohort. In the derivation cohort, the risk prediction model was well calibrated in the H–L test (χ² statistic = 9.37 [df = 8], P = 0.31), and showed good discrimination ability for the development of future gastric cancer, with a c-statistic of 0.79 (95% CI 0.74–0.83), which is significantly higher than the c-statistic of the model including only the combination of H. pylori antibody and serum pepsinogen (c-statistic = 0.69 [95% CI 0.64–0.73]) (P < 0.001 between models). On the basis of this risk prediction model, we produced a risk assessment tool. The risk score sheet is shown in Table 3. The distribution of total risk scores had a median of 6 (interquartile range 4–8) for this population. The incidence of gastric cancer increased significantly with increasing quartile of total risk scores (P for trend < 0.001) (Fig. 2). The area under the ROC curve was 0.79 (95% CI 0.74–0.83) (Fig. 3). The cutoff point of the risk score for predicting gastric cancer incidence was 8. Subjects with scores of 8 or over had a 5.3-fold (95% CI 3.4–8.2) higher risk of developing gastric cancer than those with scores of 7 or less.

Table 2 Regression coefficients and hazard ratios of risk factors in the gastric cancer risk prediction model, 1988–2002
Fig. 1 Actual and predicted incidence of gastric cancer versus decile of risk in the derivation cohort. Hosmer–Lemeshow χ² statistic = 9.37, df = 8, P = 0.31

Table 3 Risk score sheet for the prediction of gastric cancer
Fig. 2 Incidence of gastric cancer versus quartile of total risk scores in the derivation cohort, 1988–2002. *P < 0.01 vs. lowest quartile. ¶P for trend < 0.001. The differences and trends in the incidence among the total risk score levels were tested by means of a Cox proportional hazards model

Fig. 3 Receiver operating characteristic (ROC) curve of the risk score for gastric cancer incidence. The cutoff point for the risk score was 8

Internal validation of the risk prediction model in the validation cohort

An internal validation of the risk assessment tool was performed in the validation cohort. During the 5-year follow-up, gastric cancer developed in 35 subjects (23 men and 12 women). The distribution of total risk scores showed a median of 8 (interquartile range 6–10). The subjects of this cohort were divided into four groups according to the range of total risk scores obtained from the analysis of the derivation cohort. The incidence of gastric cancer increased with total risk score (Fig. 4). The calibration χ² statistic for the risk prediction model was 1.68 (df = 2), indicating a reasonable fit by the H–L test (P = 0.43). Moreover, the ability of the model to discriminate gastric cancer was good (c-statistic = 0.76 [95% CI 0.69–0.83]).

Fig. 4 Incidence of gastric cancer versus total risk score in the validation cohort, 2002–2007. The cutoff values for the total risk scores in this analysis are the same as those for the derivation cohort

Discussion

In this study, we developed and validated a risk prediction model and a subsequent risk assessment tool for gastric cancer using the data obtained from a prospective cohort study of a general Japanese population. The variables included in the model were age, sex, the combination of H. pylori antibody and serum pepsinogen status, HbA1c level, and smoking status. The established risk prediction model demonstrated good performance in regard to both discrimination and calibration for the incidence of gastric cancer in both the derivation and validation cohorts. Notably, our risk assessment tool is simple and inexpensive enough to be used in normal clinical practice and mass screening, because blood collection is the only invasive maneuver needed. In addition, individuals estimated by our tool to have a higher probability of future gastric cancer could be strongly recommended to undergo esophagogastroduodenoscopy. Therefore, our risk assessment tool for gastric cancer should prove useful for identifying populations at high risk for the future occurrence of gastric cancer.

In recent years, several studies have reported the development and validation of risk assessment tools for various cancers [26,27,28]. Some prospective cohort studies and meta-analyses have shown that the combination of H. pylori antibody and serum pepsinogen is a good predictive marker for the development of gastric cancer [17, 29]. However, gastric cancer is known to be a multifactorial disease. Certainly, the present risk assessment tool, which includes multiple risk factors, exhibited greater discrimination ability than a model based only on the combination of H. pylori antibody and serum pepsinogen. Therefore, we believe that our model can provide more detailed information that could be used to build an effective screening strategy for gastric cancer.

To our knowledge, only one other risk assessment tool for gastric cancer exists. The Japan Public Health Center-based prospective (JPHC) Study Group developed a risk assessment tool permitting the estimation of the 10-year cumulative probability of gastric cancer occurrence, in which age, sex, the combination of H. pylori antibody and serum pepsinogen, smoking status, family history of gastric cancer, and consumption of high-salt food were included as risk factors [6]. The risk factors selected in that study were similar to those incorporated in our tool, but we did not include a family history of gastric cancer or daily salt consumption in our risk prediction model. In our derivation cohort, data relating to a possible family history of gastric cancer were not available. As an alternative, we considered a family history of malignancies as a risk factor, given that relevant data on this parameter were available for our cohort, but there was no evidence of a significant association between this parameter and the risk of gastric cancer (data not shown). Daily salt intake, which is a well-known risk factor for gastric cancer, was also a significant risk factor for gastric cancer in our study [11]. However, an objective assessment of daily salt intake based on a questionnaire or interview is impractical and difficult to perform in ordinary health checkups because it requires much time and effort. A risk assessment tool should contain items that can be measured easily and objectively and can be used safely in mass screening. Thus, we did not include daily salt intake in our risk assessment tool.

The strengths of our study include its longitudinal population-based study design, long duration of follow-up, accuracy of gastric cancer diagnosis, and complete follow-up. In addition, our study had the advantage that the internal validation was performed in the same regional population where the derivation cohort was recruited. However, the present study has several limitations. First, we could not perform an external validation with a different regional population. Therefore, the generalizability of the present risk assessment tool is limited. Second, we were unable to rule out subclinical gastric cancer at baseline, since we did not perform a screening survey of the stomach in each subject at the time of recruitment. However, it was reported that the prevalence of gastric cancer in healthy subjects was low (0.12%) in a nationwide mass screening in Japan [30]. Therefore, we believe that subclinical gastric cancer at baseline was rare, and that the influence of this bias would be small. Third, we did not obtain information on the history of H. pylori eradication therapy in this study. It is reasonable to believe that eradicating H. pylori could change the risk of gastric cancer and affect the accuracy of our risk assessment tool. However, since this eradication therapy was first covered by national insurance in 2000, when the Japanese guidelines for the treatment of H. pylori were established [31], it is unlikely that a significant number of the subjects in the derivation cohort would have undergone H. pylori eradication therapy. Moreover, our risk assessment tool performed with acceptable accuracy in the validation cohort, where eradication therapy had already become more widespread. Finally, we found that the cutoff value of the risk score for predicting gastric cancer incidence was 8; using this cutoff optimizes the ability to discriminate the risk of developing gastric cancer under the assumptions that sensitivity and specificity are weighted equally and that ethical and cost constraints are ignored. Further investigations will be needed to determine the most appropriate cutoff, with consideration given to ethical issues and cost-effectiveness compared to conventional mass screening and esophagogastroduodenoscopy.

In conclusion, we developed and validated a risk assessment tool for gastric cancer in a general Japanese population. The model included multiple risk factors—age, sex, the combination of H. pylori antibody and the pepsinogen test, HbA1c level, and smoking status—that can be easily assessed in clinical practice. Therefore, the present risk assessment tool helps to identify people at high risk for gastric cancer, for whom esophagogastroduodenoscopy is strongly recommended. Further validation studies in other populations will be needed to evaluate the usefulness of this risk assessment tool for gastric cancer.