Introduction

Previous trend analyses of cervical cancer mortality in Belgium, including a tentative solution for the certification problem of 'not otherwise specified' (NOS) uterine cancers, have shown a 50% decline over the past four decades [1]. Age-period-cohort (APC) models, based on log-linear Poisson regression, have been used to describe trends and to disentangle the separate effects that have driven these trends [2-4]. APC modeling has certain intrinsic shortcomings, such as the identifiability problem caused by the linear interdependency of the three components (age, period and cohort, where cohort = period - age) [5]. However, non-linear changes are identifiable [5], and several solutions, including the imposition of constraints such as fixing age effects, allow a rather straightforward estimation of period and cohort effects [6]. Cohort effects can be interpreted as the consequence of changing exposure to risk factors, whereas period effects can be attributed to improvements in oncological treatment or to increased screening and treatment of screen-detected lesions [7]. An APC analysis of Belgian cervical cancer mortality data from the period 1954-1994, not adjusted for NOS, revealed strong cohort effects, indicating an increased risk for women born after 1935-1960, which reflects changes in sexual behavior and increased exposure to HPV infection in these generations [7]. This cohort effect is observed in most industrialized countries and corresponds with the sexual revolution and the availability of oral contraception since the 1960s. In Belgium, the cohort effect increased less steeply than in the UK. The difference between the two countries can most plausibly be explained by screening, which counterbalanced the cohort effect to a certain degree in Belgium, whereas in England the cohort effect was largely unaffected because of poor-quality screening up to the mid-1980s [8].
Besides revealing some evidence of changed exposure to carcinogenic agents and of protective effects from Pap smear screening, APC models also allow prediction of future trends.

Bayesian methods are becoming more frequent in epidemiological research, including in APC modeling. In Bayesian methods, both the data and the model parameters are considered to be random quantities [9, 10]. The likelihood function defines how likely the data are, given a particular value of the model parameters. The parameters are regarded as unknown quantities with a probability distribution referred to as the prior distribution. This prior distribution can be based on evidence from previous studies or on subjective prior beliefs. The joint posterior density function is obtained by combining the prior probability density function for all the model parameters with the likelihood function. Because implementing Bayesian methods can be computationally complex, much work has been carried out on simulation-based methods called Markov chain Monte Carlo (MCMC) methods, such as Gibbs sampling. This simulation method has been incorporated into a package called WinBUGS [11]. More information about MCMC, Gibbs sampling and WinBUGS can be found in the specialized literature [10-12]. Detailed descriptions of Bayesian APC models can be found elsewhere [13-16]. In classical APC models, zero counts or sparse data in the young and old age groups may lead to unstable parameter estimates; in the Bayesian framework, zero counts or sparse data do not pose any implementation problems when fitting APC models.

In this paper, our objective is to apply Bayesian APC models, following procedures proposed by Bashir and Estève [15], and to describe the influence of age, period, and cohort effects on corrected cervical cancer (corCVX) mortality data in Belgium (1954-1997).

Materials and methods

Source of data

In order to study trends in cervical cancer mortality, we downloaded the World Health Organisation (WHO) mortality database (http://www.who.int/whosis/mort/download/en/) and extracted data on deaths from uterine cancers together with the population of women living in European countries. For Belgium, data were available for the period 1954-1997. Two major uterine cancers can be distinguished: cervix uteri cancer (CVX) and corpus uteri cancer (CRP), besides some other very rare cancers such as placenta cancer (OTH). However, the death certificate often contains only the information "cancer of the uterus, not otherwise specified" (NOS). Causes of death were coded using successive editions of the International Statistical Classification of Diseases, Injuries, and Causes of Death (ICD): ICD-7 for the period 1954-1967, ICD-8 for the period 1968-1978, and ICD-9 for the period 1979-1997.

In all ICD editions, separate codes were provided to identify cervical cancer (171 in the 7th, 180 in the 8th and 9th, and C53 in the 10th edition). Corpus uteri cancer and uterus NOS cancer were coded separately in most editions (172 [ICD-7], 182 [ICD-9] and C54 [ICD-10] for corpus cancer; 174 [ICD-7], 179 [ICD-9] and C55 [ICD-10] for uterus NOS cancer). However, in the 8th edition, 182 was used for both corpus and uterus NOS cancer. They could only be distinguished by the 4th digit (182.0 for corpus cancer and 182.9 for uterus NOS cancer), but in many countries this distinction was not possible because the 4th digit was lacking. The rare other cancers of the uterus were coded with 173 in the 7th edition, with 181 in the 8th and 9th editions, and with C57/C58 in the 10th edition.

Below, we explain how the number of deaths from cervix uteri cancer (corCVX) can be estimated from the number of deaths certified as originating from the uterine cervix (CVX), from the uterine corpus (CRP), from the uterus not otherwise specified (NOS), or from the combination of CRP or NOS (CRPNOS or CRPNOSOTH).

Reallocation rules

According to Loos et al, when the proportion of NOS among all uterine cancer deaths is less than 25%, adjustments can be made using allocation Rule 1, which assumes that NOS death certifications are allocated at random [17]:

$$\mathrm{corCVX}_{ay} = \mathrm{CVX}_{ay} + \mathrm{NOS}_{ay} \times \frac{\mathrm{CVX}_{ay}}{\mathrm{CVX}_{ay} + \mathrm{CRP}_{ay}} \tag{Rule 1}$$
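
As an illustration, Rule 1 can be sketched in a few lines of Python for a single age group and year. The function names and the death counts below are hypothetical and serve only to show the arithmetic; they are not taken from the WHO data.

```python
def nos_proportion(cvx, crp, nos):
    """Share of all uterine cancer deaths certified as NOS."""
    return nos / (cvx + crp + nos)


def reallocate_rule1(cvx, crp, nos):
    """Corrected cervical deaths: CVX plus the share of NOS allocated to the cervix."""
    specified = cvx + crp
    if specified == 0:
        # Without any site-specified deaths the allocation share is undefined.
        raise ValueError("Rule 1 needs at least one CVX or CRP death")
    return cvx + nos * cvx / specified


# Hypothetical counts for one age group and year (not real data).
share_nos = nos_proportion(cvx=60, crp=40, nos=20)
cor_cvx = reallocate_rule1(cvx=60, crp=40, nos=20)
print(round(share_nos, 3), cor_cvx)  # 0.167 72.0
```

In this hypothetical cell the NOS share is below 25%, so Rule 1 would be considered applicable.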

where the indices a and y stand for age group and year at death, respectively.

If this assumption of random allocation does not hold, the error remains limited, since the rule is only applied when the proportion of NOS is small. In Belgium, however, this proportion was always greater than 25%, and therefore a representative external template country with a low NOS proportion, such as the Netherlands, is preferred as a more reliable basis for reallocation [1, 18]. Data from the Netherlands were downloaded from the same WHO mortality database. The Netherlands (NL) showed a low proportion of NOS for the periods 1955-1962 and 1972-2004, so the number of $\mathrm{corCVX}_{ay}$ could be computed by applying allocation Rule 1 [18, 19]. For these periods, the proportion of all uterine cancer deaths that is probably of cervical origin is computed as $\mathrm{pcorCVX}_{ay} = \mathrm{corCVX}_{ay} / \mathrm{UT}_{ay}$, where $\mathrm{UT}_{ay} = \mathrm{CVX}_{ay} + \mathrm{CRP}_{ay} + \mathrm{NOS}_{ay}$.

For the period 1963-1969, where the proportion of NOS was greater than 25%, and for the periods 1950-1954 and 1970-1971, where combined codes were used, $\mathrm{pcorCVX}_{ay}$ was obtained through imputation from the data available for the periods where allocation Rule 1 was applied. For the imputation, the periods with a proportion of $\mathrm{NOS}_{ay}$ > 25% or with combined codes ($\mathrm{CRPNOS}_{ay}$ and $\mathrm{CRPNOSOTH}_{ay}$) were regarded as missing observations [20-22]. We then regressed the logit of $\mathrm{pcorCVX}_{ay}$ (dependent variable) on the interaction between age and year. The logistic transformation was applied to prevent imputed proportions from being negative or greater than unity.

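$$\operatorname{logit}\left(\widehat{\mathrm{pcorCVX}}_{ay}\right) = \log\left(\frac{\widehat{\mathrm{pcorCVX}}_{ay}}{1 - \widehat{\mathrm{pcorCVX}}_{ay}}\right) = \hat{b}_0 + \hat{b}_1 \, \mathrm{age}_a \times \mathrm{year}_y \tag{Rule 2}$$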
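
The imputation step of Rule 2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes ordinary least squares on the logit scale with the single predictor age × year, it codes age as a hypothetical age-group midpoint, and its source-period proportions are simulated from the model itself rather than taken from the Dutch data.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_rule2(source):
    """Least-squares fit of logit(pcorCVX) on the single predictor age * year."""
    xs = [age * year for age, year, _ in source]
    ys = [logit(p) for _, _, p in source]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def impute_pcorcvx(b0, b1, age, year):
    """Back-transform the linear predictor so the result lies between 0 and 1."""
    return inv_logit(b0 + b1 * age * year)


# Hypothetical source period 1955-1960, simulated from known coefficients.
TRUE_B0, TRUE_B1 = 1.0, -1.0e-5
source = [(age, year, inv_logit(TRUE_B0 + TRUE_B1 * age * year))
          for age in (42, 52, 62) for year in range(1955, 1961)]

b0, b1 = fit_rule2(source)
imputed = impute_pcorcvx(b0, b1, age=52, year=1952)  # a target year with missing data
```

Because the back-transformation is the inverse logit, every imputed proportion stays strictly between 0 and 1, which is the reason given above for working on the logit scale.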

We used a source period of six years where the proportion of NOS was <25% to compute the proportion pcorCVX in the preceding or in-between periods with missing data. The source period 1955-60 was used to estimate pcorCVX for the preceding target period 1950-54 and the source periods 1960-62 and 1972-74 were used for the in-between period 1963-71 (Figure 1).

The proportions $\mathrm{pcorCVX}_{ay}$ obtained after application of Rules 1 and 2 for the Netherlands (NL) were applied to the total number of uterine cancer deaths in Belgium to compute the corrected number of cervical cancer deaths in Belgium (BE):

$$\mathrm{corCVX}_{ay,\mathrm{BE}} = \mathrm{UT}_{ay,\mathrm{BE}} \times \mathrm{pcorCVX}_{ay,\mathrm{NL}} \tag{Rule 3}$$

Figure 1. Pictorial description of the imputation method applied to data from the Netherlands.
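
Rule 3 amounts to a cell-by-cell multiplication of Belgian totals by Dutch proportions. The sketch below assumes that both quantities are stored in dictionaries keyed by (age group, year); the keys and values are hypothetical and do not come from the WHO database.

```python
def apply_rule3(ut_be, pcorcvx_nl):
    """Corrected Belgian cervical deaths per (age group, year) cell."""
    return {cell: ut_be[cell] * pcorcvx_nl[cell] for cell in ut_be}


# Hypothetical total uterine cancer deaths in Belgium (UT) per cell.
ut_be = {("40-44", 1975): 50, ("45-49", 1975): 80}
# Hypothetical Dutch proportions of cervical origin for the same cells.
pcorcvx_nl = {("40-44", 1975): 0.75, ("45-49", 1975): 0.5}

cor_cvx_be = apply_rule3(ut_be, pcorcvx_nl)
print(cor_cvx_be[("40-44", 1975)], cor_cvx_be[("45-49", 1975)])  # 37.5 40.0
```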

Identification of age groups, calendar periods and birth cohorts

The cervical cancer mortality and population data were stratified into A = 13 five-year age groups (20-24, 25-29, [...] 80-84), indexed as a (a = 1, 2, [...] A). Calendar time was grouped into P = 9 five-year period bands (1954-1958, 1959-1963, [...] 1994-1998), indexed as p (p = 1, 2, [...] P); the last period (1994-1998) contains only 4 years of data. The 13 age groups and 9 periods define C = A + P - 1 = 21 cohorts, indexed as c (c = A + p - a). Because the intervals for age and period are both 5 years wide, a birth cohort spans 10 years. Successive cohorts overlap partially and are identified by their central year: 1874, 1879, 1884, [...] 1974. The indexing of the counts of cases and women-years, i = 1, 2, [...] AP = 13*9 = 117, according to age, period, and cohort of the observed data is presented in Table 1.
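
The cohort indexing c = A + p - a can be checked with a short sketch. The helper names are hypothetical, and the order in which the 117 cells are enumerated here is an assumption that need not match the order used in Table 1.

```python
A, P = 13, 9          # number of age groups and periods
WIDTH = 5             # width of age and period bands in years
FIRST_CENTRAL = 1874  # central birth year of the oldest cohort (c = 1)


def cohort_index(a, p):
    """Cohort index for age group a (1..A) and period p (1..P)."""
    return A + p - a


def cohort_central_year(c):
    """Central birth year of cohort c; successive cohorts are 5 years apart."""
    return FIRST_CENTRAL + WIDTH * (c - 1)


# One (age, period, cohort) triple for each of the A * P observed cells.
cells = [(a, p, cohort_index(a, p)) for p in range(1, P + 1) for a in range(1, A + 1)]

# Women aged 40-44 (a = 5) who died in 1974-1978 (p = 5) belong to cohort C1934.
c = cohort_index(a=5, p=5)
print(len(cells), c, cohort_central_year(c))  # 117 13 1934
```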

Table 1 Indexing of cases and person-years according to age and period for observed data with A = 13, P = 9, and C = 21

Bayesian age-period-cohort models

The Bayesian framework is based on conditional distributions that link the likelihood of the data with prior distributions for the model parameters. Markov chain Monte Carlo (MCMC) methods construct and simulate from the full conditional distributions and thereby derive the posterior distribution, from which inferences are drawn about the model parameters and functions of these parameters [23].

The model, which belongs to the family of generalized linear models [24], is described as follows. The estimated mortality rates ($\hat{M}_{ap}$) are derived from the corrected number of deaths ($\mathrm{corCVX}_{ap}$, denoted by $D_{ap}$) occurring in age group a during period p in $N_{ap}$ person-years:

$$\hat{M}_{ap} = \frac{D_{ap}}{N_{ap}} \tag{1}$$

Rates are non-negative and naturally modeled on the log-scale as:

$$\log(\hat{M}_{ap}) = \log(D_{ap}) - \log(N_{ap}) \tag{2}$$

The temporal variation of mortality can be explained by variables such as age at death a, period of death p and epoch of birth c. The logarithmic transformation of the mortality rate allows the formulation of a generalized linear model [25]. The corrected number of deaths $D_{ap}$ for age group a in time period p can be assumed to follow a Poisson distribution with mean or expected value $\mu_{ap}$, i.e., $D_{ap} \sim \mathrm{Poisson}(\mu_{ap})$. We then model the mean $\mu_{ap}$ as:

$$\log(\mu_{ap}) = \log(N_{ap}) + \alpha_a + \beta_p + \gamma_c \tag{3}$$
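
To make equation 3 concrete, the sketch below computes the expected number of deaths, the fitted rate per 100,000 women-years and the Poisson log-likelihood contribution of a single cell. All numerical values are hypothetical and are not estimates from this study.

```python
import math


def expected_deaths(person_years, alpha, beta, gamma):
    """Mean of the Poisson distribution implied by equation 3."""
    return person_years * math.exp(alpha + beta + gamma)


def fitted_rate_per_100000(alpha, beta, gamma):
    """Fitted mortality rate; log(N) acts as an offset and drops out of the rate."""
    return 100000.0 * math.exp(alpha + beta + gamma)


def poisson_log_likelihood(deaths, mu):
    """Log of the Poisson probability of observing `deaths` when the mean is mu."""
    return deaths * math.log(mu) - mu - math.lgamma(deaths + 1)


# Hypothetical effects for one age-period cell (log scale).
alpha = math.log(5.0 / 100000.0)  # age effect: baseline of 5 deaths per 100,000
beta = -0.2                       # period effect relative to the reference period
gamma = 0.1                       # cohort effect relative to the reference cohort

mu = expected_deaths(person_years=300000, alpha=alpha, beta=beta, gamma=gamma)
rate = fitted_rate_per_100000(alpha, beta, gamma)
loglik = poisson_log_likelihood(deaths=12, mu=mu)
```

With these hypothetical values the fitted rate equals 5 × exp(-0.1) per 100,000 women-years, and the expected count is that rate multiplied by the person-years at risk.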

In this model, the parameters α, β and γ, corresponding to the effects of age (alpha), period (beta) and cohort (gamma), are assigned non-informative priors that are normally distributed with mean zero and precisions $\tau_a$, $\tau_p$ and $\tau_c$, respectively. The precision parameters (the reciprocals of the respective variances, 1/variance) are in turn assigned a vague prior following a Gamma distribution with shape and rate parameters both equal to 0.001.

$$\alpha_a \sim \mathrm{Normal}(0, \tau_a), \quad \beta_p \sim \mathrm{Normal}(0, \tau_p), \quad \gamma_c \sim \mathrm{Normal}(0, \tau_c)$$

$$\tau_a \sim \Gamma(0.001, 0.001), \quad \tau_p \sim \Gamma(0.001, 0.001), \quad \tau_c \sim \Gamma(0.001, 0.001)$$

We subsequently fitted five models: 1) the age model, where $\log(\mu_a)$ was constant over the periods, with one parameter for each age class; 2) the age-drift model, where $\log(\mu_{ap})$ varied by age with one common linear slope (drift). This is a sub-model of both the age-period and the age-cohort models. However, the constant annual change in rates obtained from this model cannot be attributed to either a period or a cohort effect: whatever the cause of the regular temporal variation of rates, the fitted rates would be the same [25]. In this paper, only the age-drift from an age-period model is considered; 3) the age-period model, where the log-rates can vary irregularly by age and period, with one parameter for each age class and one for each period; 4) the age-cohort model, where $\log(\mu_{ap})$ changes by age and epoch of birth, with one parameter for each age class and one for each cohort; and finally, 5) the age-period-cohort model (equation 3), where the log-rates change by age, period and cohort. In this model, because of the identifiability problem (the linear dependency between age, period, and cohort), one period parameter and one cohort parameter were fixed at zero. These parameters correspond to the reference period ($p_0$) and the reference cohort ($c_0$).
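
The identifiability problem can be illustrated numerically. In the sketch below, which uses hypothetical effect values, a linear trend of arbitrary size delta is added to the age effects, subtracted from the period effects and added to the cohort effects. Because c = A + p - a, the three shifts cancel in every cell, so two different sets of parameters yield exactly the same fitted log-rates.

```python
import math

A, P = 13, 9
C = A + P - 1


def linear_predictor(alpha, beta, gamma, a, p):
    """alpha_a + beta_p + gamma_c with c = A + p - a (1-based indices)."""
    c = A + p - a
    return alpha[a - 1] + beta[p - 1] + gamma[c - 1]


def add_linear_trend(alpha, beta, gamma, delta):
    """Shift the three sets of effects by linear trends that cancel out."""
    shifted_alpha = [x + delta * a for a, x in enumerate(alpha, start=1)]
    shifted_beta = [x - delta * p for p, x in enumerate(beta, start=1)]
    shifted_gamma = [x + delta * (c - A) for c, x in enumerate(gamma, start=1)]
    return shifted_alpha, shifted_beta, shifted_gamma


# Hypothetical effects, chosen only for illustration.
alpha = [-12.0 + 0.3 * a for a in range(1, A + 1)]
beta = [-0.05 * p for p in range(1, P + 1)]
gamma = [0.2 * math.sin(c) for c in range(1, C + 1)]

alpha2, beta2, gamma2 = add_linear_trend(alpha, beta, gamma, delta=0.4)

# Largest difference in fitted log-rate over all 117 cells.
max_gap = max(
    abs(linear_predictor(alpha, beta, gamma, a, p)
        - linear_predictor(alpha2, beta2, gamma2, a, p))
    for a in range(1, A + 1) for p in range(1, P + 1)
)
```

Here `max_gap` is zero up to floating-point rounding although the parameter values differ, which is why additional constraints, such as reference categories fixed at zero, are needed.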

Model selection and goodness-of-fit of the model

In classical statistical modeling, nested sequences of models are compared using likelihood ratio tests, F-tests, or tests of deviance for generalized linear models. More generally, Akaike's information criterion (AIC) can also be used to judge the fit of non-nested models, accounting for model complexity (number of degrees of freedom) [26].

In the Bayesian framework, Spiegelhalter et al [27] have proposed a method for judging goodness-of-fit, based on the Bayesian equivalent of the deviance [28]:

$$D(\theta) = -2 \log\{p(y \mid \theta)\} + 2 \log\{f(y)\} \tag{4}$$

where f(y) depends only on the data y and not on the parameter θ. The posterior mean of D(θ), denoted by $\overline{D(\theta)}$, is a measure of model adequacy, and $D(\bar{\theta})$ is the deviance evaluated at the posterior mean of the parameters. The difference $p_D = \overline{D(\theta)} - D(\bar{\theta})$ can be thought of as a measure of model complexity. By analogy with AIC, the Deviance Information Criterion (DIC) is defined as:

$$\mathrm{DIC} = D(\bar{\theta}) + 2 p_D \tag{5}$$

which is the same as:

$$\mathrm{DIC} = \overline{D(\theta)} + p_D \tag{6}$$

When fitting linear models with vague priors, DIC roughly corresponds to AIC unless the prior strongly conflicts with the data [28]. DIC can be used to compare models whether they are nested or not. Smaller DIC values indicate a better model, even if they are negative. For non-hierarchical models with non-informative priors, $p_D$ is approximately the true number of parameters. Details on the statistical characteristics of DIC can be found in Spiegelhalter et al [27]. The fit of the models can also be assessed by plotting the fitted and the observed rates against period.
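
The relation between the posterior mean deviance, $p_D$ and DIC can be sketched as follows. The function name and the deviance values are hypothetical; in practice the deviance samples would come from the MCMC output.

```python
def dic_from_samples(deviance_samples, deviance_at_posterior_mean):
    """Return (posterior mean deviance, pD, DIC) from sampled deviances."""
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    dic = d_bar + p_d  # equation 6
    return d_bar, p_d, dic


# Hypothetical deviances D(theta) at five MCMC draws, and at the posterior mean.
samples = [102.0, 98.0, 100.0, 104.0, 96.0]
d_at_mean = 95.0

d_bar, p_d, dic = dic_from_samples(samples, d_at_mean)
print(d_bar, p_d, dic)  # 100.0 5.0 105.0
```

The same DIC follows from equation 5, since 95 + 2 × 5 = 105.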

Implementation of the Bayesian models

The models were implemented in WinBUGS, version 1.4.3 [11]. Three chains were run and, after 30,000 iterations, the Brooks and Gelman (bgr) diagnostic plots were examined to assess the convergence of the three chains. The bgr plot is one of the diagnostic methods in the Bayesian framework for assessing the convergence of the chains chosen for the fitted model. The chains are considered to have converged only when the plot shows that the iterations have reached equilibrium, that is, when the plotted statistics are almost constant. For this paper, the bgr plot showed that the three chains had converged after 10,000 iterations. In order to obtain the DIC values, the models were run for a further 10,000 iterations. The posterior means and the 95% credible intervals were obtained from the last 30,000 iterations. They were then exported to STATA version 9.2 to plot the graphs. The WinBUGS syntax used to obtain the posterior means is available in the appendix.
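
The idea behind multi-chain convergence checks can be sketched with the classical variance-based potential scale reduction factor of Gelman and Rubin. This is a related but simpler statistic than the interval-based bgr plot produced by WinBUGS, and it is shown here only as an illustration; the chains are simulated and do not come from the fitted models.

```python
import math
import random


def potential_scale_reduction(chains):
    """Variance-based Gelman-Rubin statistic for equally long chains."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(chain) / n for chain in chains]
    variances = [sum((x - mu) ** 2 for x in chain) / (n - 1)
                 for chain, mu in zip(chains, means)]
    within = sum(variances) / m
    grand_mean = sum(means) / m
    between = n * sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(1)
# Three simulated chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# The same chains shifted apart, mimicking chains stuck in different regions.
stuck = [[x + 2.0 * k for x in chain] for k, chain in enumerate(mixed)]

r_mixed = potential_scale_reduction(mixed)
r_stuck = potential_scale_reduction(stuck)
```

Values close to 1, as for `r_mixed`, are compatible with convergence, whereas clearly larger values, as for `r_stuck`, indicate that the chains have not yet reached a common distribution.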

Results

Observed mortality rates

Figure 2. (a) Age-specific mortality rates and age-standardized mortality rates (ASMR, solid black line) for corrected cervical cancer by period. (b) Age-specific rates by birth cohort; Belgium (1954-1997). Age groups 25-29, 35-39, (...), 75-79 are omitted for reasons of visual clarity. Periods are indicated by the first year of the period; birth cohorts are indicated by the central year of the respective cohort.

The observed world age-standardized mortality rates (ASMR) and age-specific mortality rates of corrected cervical cancer are plotted as a function of period and birth cohort in Figure 2. The ASMR declines almost linearly from 9.2 per 100,000 women-years in the period 1954-1958 to 2.5 per 100,000 women-years in the period 1994-1998. The mortality rates increase with age. There is a decline over time in all age-specific rates. In the age group 80-84 years, the mortality rate decreases substantially until period 1969-1973, increases slightly in period 1974-1978 and then decreases further. In the age groups 70-74, 60-64, and 50-54 years, the mortality rate increases from period 1954-1958 to 1959-1963 and then gradually decreases until period 1994-1998. For age groups younger than 30 years, the mortality rate remained rather stable. There was a large decrease in mortality rate by cohort up to 1929-1938 (C1934), with discrete interruptions in the declining trend for cohorts 1884-1893 (C1889) and 1909-1918 (C1914). However, for younger cohorts, born after 1929-1938 (C1934), a horizontal and sometimes even an increasing trend could be discerned.

Model selection and goodness-of-fit of the model

The DIC values of the different fitted models, with their corresponding $p_D$ values, are presented in Table 2.

Table 2 Goodness-of-fit parameters for age, age-drift, age-period and age-period-cohort models

The most complex model, the APC model ($p_D$ = 36.81), showed the lowest DIC value (909.84) and was therefore chosen as the best-fitting model. To assess the goodness-of-fit of the models graphically, the fitted and observed age-specific mortality rates for each of the models are plotted against period or cohort (Figure 3). Visual inspection shows that the fitted rates approximate the observed rates increasingly well when going from Figure 3a (horizontal straight lines, corresponding to the age model) to Figure 3e (irregular curves, corresponding to the full age-period-cohort model).

Figure 3. Fitted (curves) and observed (points) age-specific mortality rates from the Bayesian age, age-drift, age-period, age-cohort and age-period-cohort models.

The effects of age, period and cohort, together with their corresponding 95% credible intervals, estimated from the full APC model are presented in Table 3. The age effects correspond to the fitted mortality rate per 100,000 women-years over the whole period 1954-1997. The period and cohort effects can be interpreted as the log rate-ratio relative to period 1954-1958 and the log rate-ratio relative to cohort 1919-1928, respectively. For example, the fitted mortality rate for women aged 40-44 years who died during the period 1974-1978, and who were therefore born during the years 1929-1938 (C1934), can be obtained by taking the antilogarithm of the sum of the age, period, and cohort effect estimates, that is, $e^{\alpha_a + \beta_p + \gamma_c} = 4.96$ per 100,000 women-years with credible interval (4.44, 5.53). This can be observed in Figure 3. The age effects increase with age. The period effects declined quite regularly over the studied period, whereas the cohort effects varied irregularly over the different generations (Figures 3 and 4).

Figure 4. Age effects (rate per 100,000 women-years) and period and cohort effects (rate-ratio) estimated from the full Bayesian APC model. The blue line connects the estimates and the red dashed lines represent the 95% credible intervals. Note how the credible intervals on the rate-ratio reveal the reference period and cohort.

Table 3 The effects of age, period and cohort on cervical cancer mortality (adjusted for non-specified uterine cancers) estimated from a full Bayesian APC model

Discussion

Different authors [1, 17] have addressed methods for resolving the NOS problem in cervical cancer mortality data. We have extended these methods by using imputation to correct for the periods where the proportion of NOS was > 25% and where the CRPNOS or CRPNOSOTH ICD coding was used in the template country (the Netherlands), in order to obtain our new corrected cervical cancer (corCVX) mortality data for Belgium. With the corCVX data, we have applied a simple Bayesian age-period-cohort model to describe the trend in the corrected rate of cervical cancer mortality in Belgium between 1954 and 1997. Because of the many zero counts in the younger age groups (0-4, 5-9, 10-14 and 15-19 years) and the lack of reliable death cause certification in the oldest age group (85+ years), we restricted our analysis to women in the age groups 20-24, 25-29, 30-34,..., 80-84 years.

Observed data show that mortality increases with age and declines over time. The ASMR decreased regularly from 9.2 per 100,000 women-years in the period 1954-1958 to 2.5 per 100,000 women-years in the period 1994-1998. Plotting the trends by epoch of birth revealed irregular changes in successive generations.

Our Bayesian APC model provides a better fit to the corrected mortality rates than the other models. At the same time, the separate effects associated with age, period, and cohort were estimated. The fitted rates from the age effects show that mortality increases with age, with wider credible intervals in the older age groups. The width of these intervals is due to the small population of women in the older age groups. In addition, it reflects the heterogeneity in the data, with sparse or zero counts, and the uncertainty associated with the fitted model. For the period effects, there is a gradual decrease in the rate-ratio over the periods. The precision of the cohort effects was lowest (widest credible intervals) near the ends. In particular, the trends for the youngest cohorts are unstable because of the low number of deaths. Cervical cancer mortality in Belgium decreased gradually, by almost 50%, over four decades. The decreasing period effect is probably best explained by down-staging resulting from improved access to general gynecological care. The impact of more efficacious oncological treatment schemes was probably limited, given the small changes in stage-specific survival observed elsewhere [29]. In addition, the APC model reveals strong cohort effects, indicating complex phenomena, such as changed exposure to risk factors over time, which are partially influenced by screening that is not uniform across age groups [30].

Further research is ongoing in which the Bayesian APC models will be extended with models such as generalized additive models (GAM), which smooth the age, period and cohort effects and which are particularly useful for predicting future trends.

The purpose of this paper was didactic. We wanted to familiarize the reader with Bayesian methods, which are still uncommon for the average epidemiologist or public health specialist. The results from the Bayesian analysis confirm conclusions from our previous work, which had shown that cervical cancer mortality trends cannot be described adequately by a simple age-period model. Cohort effects are very strong, and screening has probably influenced these cohort effects by averting the impact of increased exposure to risk factors.

A strong argument for this statement can be derived from trend analyses in East-European countries where similar cohort effects, not or slightly affected by screening, have resulted in a continued or even increased burden of cervical cancer incidence and mortality [31].

Appendix

Sample syntax for the Bayesian age-period-cohort model fitted

model
{
   ###Loop over all observations###
   for (i in 1:I) {
      ###Define age-period-cohort model###
      corcvx[i] ~ dpois(mu[i])
      log(mu[i]) <- log(pyr[i]) + alpha[age[i]] + beta[period[i]] + gamma[cohort[i]]

      ###Calculate fitted rate per 100000###
      rate[i] <- 100000*exp(alpha[age[i]] + beta[period[i]] + gamma[cohort[i]])
   }

   ###Non-informative prior model for age effects###
   for (j in 1:A) {
      alpha[j] ~ dnorm(0, taua)
   }

   ###Non-informative prior model for period effects###
   beta[1] <- 0     #constrained to zero as reference period
   #loop starts at 2 so that beta[1] is defined only once
   for (k in 2:P) {
      beta[k] ~ dnorm(0, taup)
   }

   ###Non-informative prior model for cohort effects###
   gamma[11] <- 0     #constrained to zero as reference cohort
   #loops skip index 11 so that gamma[11] is defined only once
   for (c in 1:10) {
      gamma[c] ~ dnorm(0, tauc)
   }
   for (c in 12:C) {
      gamma[c] ~ dnorm(0, tauc)
   }

   ###Precisions for age, period and cohort effects###
   taua ~ dgamma(1.0E-3,1.0E-3)
   taup ~ dgamma(1.0E-3,1.0E-3)
   tauc ~ dgamma(1.0E-3,1.0E-3)
}

#Data

list(pyr = c(...), corcvx = c(...), period = c(...), cohort = c(...), age = c(...), A = 13, I = 117, P = 9, C = 21)

##Initial values for three chains

list(alpha = c(0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5, 0.5,0.5,0.5), beta = c(NA,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05), gamma = c(0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01, NA,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01), taua = 1.2, taup = 0.7, tauc = 0.9)

list(alpha = c(0,0,0,0,0,0,0,0,0,0, 0,0,0), beta = c(NA,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01), gamma = c(0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1, NA,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1), taua = 2.0, taup = 1.0, tauc = 0.5)

list(alpha = c(0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05, 0.05,0.05,0.05), beta = c(NA,0,0,0,0,0,0,0,0), taua = 1.0, taup = 0.5, tauc = 1.5, gamma = c(0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1, NA,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1))