Background

In the absence of organized screening, self-report is often the only means to monitor mammography utilization and to investigate trends in uptake at the population level [1]. In Canada, mammographic screening occurs both within organized programs and opportunistically through routine medical practice [2]. The validity of mammography self-report has previously been studied primarily using convenience samples. Despite differences in methodology, design and population characteristics, studies from a variety of settings have found that self-reports of mammography use are valid provided women are not required to precisely recall the timing of a previous mammogram [3–14]. Women generally tend to underestimate the time elapsed since their most recent mammogram by an average of three months or more, though overestimation can also occur [4–6, 8–13, 15–17]. Greater discrepancies in recall may occur when the mammogram took place longer ago [15, 16], though contrary evidence also exists [11]. With some exceptions [4, 6, 8, 9, 13, 16, 17], studies have not been designed to assess false negative reporting due to the challenge of verifying historical data from multiple service providers. Those that have attempted to validate non-use have indicated that women are unlikely to deny having had a mammogram when indeed they have had one [4, 6, 8, 9, 13, 17], though false-negative self-reports are not always negligible [16]. However, the opposite is often true; women tend to report having had mammograms which are not verified against medical records [3, 4, 6, 7, 9–11, 13–16, 18–21], a particular problem in groups with low screening prevalence [20]. Valid reporting of mammography use has been found to be unrelated to various health behaviors and perceptions, socioeconomic, demographic, and questionnaire administration factors in some studies [4, 10, 12, 13]. However, others provide some evidence that age, ethnicity, education, employment status, family history of breast cancer, recency of the mammogram, and the regularity with which women receive mammograms do affect self-report accuracy [7, 10–13, 17, 22].

Few evaluations of the reliability of mammography self-report are reported in the literature. Excellent test-retest reliability for having ever had a mammogram was reported in interviews conducted 1 week, 6–30 days, or 6–8 months after an initial interview, while reliability within the past year varied from excellent to good [14, 23, 24]. In a socioeconomically advantaged group of women aged 50–75 followed annually for 3 years, 98 percent provided logically consistent responses to a question on ever/never use of mammography [25]. Self-reports of ever use have been shown to be more reliable among Caucasian women and those with higher income and education [24]. However, in this study, date of last mammogram was not as reliably reported 6–8 months after initial testing [24].

Based on data from the longitudinal panel of the National Population Health Survey (NPHS), the present study examines the prevalence and determinants of inconsistent self-reports of mammography utilization among Canadian women aged 40 years and older and quantifies the extent to which inconsistent self-reports of mammography use contribute to biased estimates of mammography utilization and uptake. To our knowledge, this is one of the first studies of mammography utilization to provide specific longitudinal data on the determinants of inconsistent responses over time and the impact of such responses on population screening estimates.

Methods

The National Population Health Survey (NPHS) is a survey of the Canadian household population. Initiated in 1994–95 and repeated biennially, it is a split panel survey, combining repeated cross-sectional components with the longitudinal follow-up of a panel of respondents. A representative sample of Canadian household residents aged 12 and older from all ten provinces was sampled using a multistage probability design with stratification and clustering at various stages. The overall response rate for the baseline 1994–95 survey was 89 percent with a 96 percent response rate for the selected panel respondent [26]. On follow-up to the baseline survey two years later, 94 percent of the panel members responded [26]. Further details of the sampling procedures, design, data collection and response rates are published elsewhere [26, 27].

This study evaluated data from longitudinal panel respondents of the 1994–95 (baseline) and 1996–97 (follow-up) waves of the NPHS to examine inconsistencies in reported mammography utilization among women aged 40+ years at first contact. Questions about mammography use were administered to female respondents through a personal interview conducted in 1994–95 and repeated by telephone approximately two years later. In both survey years, women were asked the identical question: "Have you ever had a mammogram, that is, a breast x-ray?". Those with positive responses were further probed for the time and reason of their most recent mammogram. All women provided their own health-related information; no proxy responses were allowed.

Analyses were restricted to women aged 40 and older (at baseline) who participated in the first two waves of the NPHS and consented to share their information with federal and provincial governments. Two types of inconsistent responses were assessed: (i) baseline reports of ever use which were contradicted by follow-up reports of never use; and (ii) baseline reports of never use which were contradicted by follow-up reports of use prior to 1994–95. Multivariate logistic regression techniques were used to evaluate the associations between women's baseline sociodemographic and health characteristics and type (i) inconsistent responses. Variables significant at p ≤ .05 in age-adjusted analyses were eligible for entry in the multivariate logistic models. Sample size constraints permitted only simple bivariate, rather than multivariate exploration of factors associated with reports reflecting inconsistent timing of most recent use at follow-up (type (ii) response). Estimates were weighted to reflect baseline population characteristics. To account for stratification and clustering in the NPHS sampling design, 95% confidence intervals for parameter estimates were calculated using exact standard errors generated through bootstrap re-sampling methods [28]. All statistical analyses were conducted using SAS.
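The two types of inconsistent response defined above can be written as a simple classification rule. The following is a minimal sketch rather than the SAS code actually used: the function name, its arguments, and the use of a numeric "years since most recent mammogram" value are assumptions made for illustration (the survey may have recorded timing in categories).

```python
from typing import Optional


def classify_pair(
    baseline_ever: bool,
    followup_ever: bool,
    years_since_last: Optional[float],
    interval_years: float,
) -> str:
    """Label one woman's pair of ever/never responses (hypothetical helper).

    years_since_last: time since the most recent mammogram as reported at
        follow-up (None if no timing was given); an assumed numeric value.
    interval_years: time elapsed between the baseline and follow-up interviews.
    """
    if baseline_ever and not followup_ever:
        # Type (i): ever use at baseline contradicted by never use at follow-up.
        return "type_i"
    if not baseline_ever and followup_ever:
        # Type (ii): the mammogram reported at follow-up would have taken
        # place before the baseline interview, contradicting never use then.
        if years_since_last is not None and years_since_last > interval_years:
            return "type_ii"
        return "new_user"
    # Same ever/never answer at both waves.
    return "consistent"
```

For example, a woman who reported never use at baseline and, two years later, a mammogram five years ago would be labelled `type_ii` under this rule.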

Results

(i) Inconsistent ever/never utilization

Of the 3,535 women aged 40+ years who responded to the ever/never mammography question in both survey waves (Figure 1: 2 women with missing data regarding timing of mammogram were excluded), four percent (95% CI: 3.1–4.9) reported having had a mammogram at baseline and subsequently, at follow-up, reported never having had one (Table 1). Among women who reported having had a mammogram at baseline, 5.9% (95% CI: 4.6–7.3) reported never use at follow-up (estimate not shown). The majority of women with inconsistent responses (64.4%, 95% CI: 54.4–74.4) reported receiving a recent (i.e., <2 years ago) mammogram at baseline and most (85.6%, 95% CI: 78.2–93.1) reported that the mammogram was done as part of a regular check-up (Table 1). It should be noted that the percentage estimates in Table 1 have been weighted according to 1994/95 population characteristics whereas the frequency data represent actual numbers of women surveyed.
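To illustrate how a weighted percentage and a bootstrap-based confidence interval of this kind can be obtained, the sketch below assumes that each respondent carries a survey weight plus a set of bootstrap replicate weights, and that the variance is taken as the mean squared deviation of the replicate estimates from the full-sample estimate. The Methods state only that bootstrap re-sampling was used [28], so these details, the function names and the toy numbers are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def weighted_proportion(flags: Sequence[bool], weights: Sequence[float]) -> float:
    """Share of the total weight carried by respondents whose flag is True."""
    total = sum(weights)
    return sum(w for f, w in zip(flags, weights) if f) / total


def replicate_weight_ci(
    flags: Sequence[bool],
    weights: Sequence[float],
    replicate_weights: List[Sequence[float]],
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) using assumed bootstrap replicate weights."""
    estimate = weighted_proportion(flags, weights)
    # Re-estimate the proportion once per set of replicate weights.
    replicates = [weighted_proportion(flags, rw) for rw in replicate_weights]
    # Variance: mean squared deviation of replicates from the full-sample estimate.
    variance = sum((r - estimate) ** 2 for r in replicates) / len(replicates)
    se = math.sqrt(variance)
    return estimate, estimate - z * se, estimate + z * se


# Made-up toy data: four respondents, one with an inconsistent response.
toy_flags = [True, False, False, False]
toy_weights = [1.0, 1.0, 1.0, 1.0]
toy_replicates = [[2.0, 1.0, 1.0, 0.0], [0.0, 1.0, 1.0, 2.0]]
toy_estimate, toy_lower, toy_upper = replicate_weight_ci(
    toy_flags, toy_weights, toy_replicates
)
```

In practice many more replicate weight sets would be used than the two shown here; the toy values only exercise the arithmetic.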

Figure 1 Response characteristics of the NPHS longitudinal panel of women aged 40+ years in 1994–95.

Table 1 Mammography utilization characteristics among Canadian women aged 40+ years, 1994–95 and 1996–97 NPHS longitudinal cohort (n = 3535)*

Table 2 presents the estimated adjusted odds ratios (95% CIs) of inconsistent ever/never responses associated with women's baseline sociodemographic and health characteristics. Among women reporting ever use at baseline (1994–95), those reporting never use in 1996–97 were significantly more likely to be outside the target age group for screening (50–69), to have lower income, to have not used hormone replacement therapy in the past month and to have never had a Pap test, after adjusting for relevant covariates. Women with lower education levels were also more likely to report such inconsistent responses between baseline and follow-up although education failed to remain a significant predictor in the multivariate model. Other variables considered but not found to be significantly associated with this outcome were rural/urban residence, place of birth, languages spoken, marital status and other social support indices and having a regular physician.

Table 2 Estimated Odds Ratios (95% CIs) of inconsistent responses for ever having had a mammogram* according to baseline sociodemographic and health characteristics among Canadian women (aged 40+) assessed in the 1994–95 and 1996–97 NPHS (n = 2255).

(ii) Inconsistent timing

Follow-up interviews were completed, on average, 1.98 years from the baseline survey (range 1.19–3.01 years). Of the 293 women who reported never use at baseline and ever use at follow-up, 17.4 percent (95% CI: 11.7–23.1) reported a time for their most recent mammogram at follow-up that was inconsistent with never use at baseline. Despite baseline reports that they had never had a mammogram, approximately half of these women reported having had a mammogram at least 5 years ago. Although limited by small numbers, determinants of such inconsistent responses were assessed with simple bivariate analyses. Inconsistencies in timing occurred more often in older women. Compared to women aged 50–69, those 70 and older were more likely to report (at follow-up) that their most recent mammogram had occurred prior to 1994–95, despite a report of never use at baseline (OR = 6.96, 95% CI: 2.42–20.0). Women reporting fair or poor self-rated health also appeared more likely to report a time for their most recent mammogram at follow-up that was inconsistent with never use at baseline (OR = 2.44, 95% CI: 0.99–6.05), although the confidence interval narrowly includes 1.
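As a rough consistency check on the odds ratios reported above, one can assume that each interval is symmetric on the log-odds scale (a Wald-type interval, which the source does not state explicitly). Under that assumption the geometric mean of the two limits should be close to the reported odds ratio, and the implied standard error of the log odds ratio can be backed out. The function name below is hypothetical.

```python
import math
from typing import Tuple


def log_scale_summary(lower: float, upper: float, z: float = 1.96) -> Tuple[float, float]:
    """Return (geometric midpoint, implied SE of log OR) for a 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z * SE).
    """
    midpoint = math.sqrt(lower * upper)
    log_se = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, log_se


# Limits as reported in the text for the two bivariate comparisons.
age_mid, age_se = log_scale_summary(2.42, 20.0)       # women aged 70+ vs 50-69
health_mid, health_se = log_scale_summary(0.99, 6.05)  # fair/poor self-rated health
```

Under this assumption both midpoints fall within 0.01 of the reported odds ratios (6.96 and 2.44), and the wide implied standard errors reflect the small numbers noted in the text.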

Impact of inconsistent reporting on uptake estimates

Depending on how inconsistent responses are handled, different measures of use and uptake of mammography may be obtained. The lack of a gold standard such as a medical chart for validation makes the choice of a corrective measure unclear. If inconsistent ever/never responses are included in the analysis unchanged, 67.3 percent (95% CI: 65.1–69.5%) of women would be classified as ever having had a mammogram in 1994–95 while 71.7 percent (95% CI: 69.6–73.7%) would be classified as ever users in 1996–97. Conversely, if it is assumed that inconsistent ever/never responses represent false-positive responses at baseline (an assumption supported by our study findings), the 1994–95 prevalence estimate becomes 63.3 percent (95% CI: 61.0–65.6%), demonstrating an absolute increase in mammography use of 8.4 percent (95% CI: 7.1–9.6%) by this cohort of women by 1996–97.
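The arithmetic behind these alternative estimates can be laid out explicitly. The sketch below uses only the point estimates quoted in the text; the variable names are illustrative, the uncorrected change is derived here rather than reported in the source, and the confidence intervals cannot be reproduced from point estimates alone.

```python
# Point estimates (percent of all women aged 40+) taken from the text.
baseline_ever_pct = 67.3   # ever use reported in 1994-95, responses unchanged
followup_ever_pct = 71.7   # ever use reported in 1996-97
type_i_pct = 4.0           # ever use at baseline, never use at follow-up

# Option 1: leave inconsistent responses unchanged.
uncorrected_change = round(followup_ever_pct - baseline_ever_pct, 1)

# Option 2: treat type (i) baseline reports as false positives, i.e. recode
# those women as never users at baseline; follow-up prevalence is unaffected.
corrected_baseline_pct = round(baseline_ever_pct - type_i_pct, 1)
corrected_change = round(followup_ever_pct - corrected_baseline_pct, 1)
```

Under these figures the apparent increase in ever use is 4.4 percentage points if inconsistent responses are left unchanged, compared with the 8.4 points obtained after correction.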

Discussion

Although a limited number of studies have assessed the reliability of mammography self-report [23–25], detailed evaluations have not been conducted for population-based longitudinal surveys. In this study, reliability could not be assessed, per se, as women's status of never having had a mammogram could normally be expected to change over a two year span. However, by examining inconsistencies in responses expected to remain constant and in responses regarding logical timing of mammography use, it is possible to examine potential concerns regarding response reliability and recall bias, respectively. In longitudinal studies, inconsistent data removed during data cleaning can yield significant losses, and may lead to bias, depending on the amount of attrition at each time point and the magnitude of the differences between those retained in the panel and those lost by such attrition [29]. Longitudinal studies of health must be acutely aware of causes of attrition because losses accumulate over survey waves [30].

Although direct comparison with our sample was not possible due to the expectation that behavior might have changed in a 2-year time span, earlier findings from longitudinal studies of fairly affluent [25] and low-income [24] populations and a population-based study [23], indicated that women reliably report having ever had a mammogram with estimated reliability measured by Cohen's kappa ranging from 0.82–0.87 [23, 24]. Our finding that 4 percent (95% CI: 3.1–4.9 percent) of the women participating in the second wave of this longitudinal study inconsistently reported ever having had a previous mammogram was surprisingly high. Previous studies have found that initial use refuted on subsequent interviews occurred in 2–2.9 percent of respondents [24, 25].
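For readers unfamiliar with the statistic, Cohen's kappa for a two-wave ever/never question can be computed from the 2 × 2 table of paired responses. The sketch below is a generic illustration with made-up counts; it does not reproduce data from the cited studies or from the NPHS, where kappa was not calculated.

```python
def cohens_kappa(both_yes: int, yes_no: int, no_yes: int, both_no: int) -> float:
    """Cohen's kappa for paired yes/no answers at two interviews.

    both_yes: 'ever' at both interviews; yes_no: 'ever' then 'never';
    no_yes: 'never' then 'ever'; both_no: 'never' at both interviews.
    """
    n = both_yes + yes_no + no_yes + both_no
    observed = (both_yes + both_no) / n
    # Marginal probability of answering 'ever' at each interview.
    p_yes_first = (both_yes + yes_no) / n
    p_yes_second = (both_yes + no_yes) / n
    # Agreement expected by chance given the two sets of marginals.
    expected = p_yes_first * p_yes_second + (1 - p_yes_first) * (1 - p_yes_second)
    return (observed - expected) / (1 - expected)


# Made-up counts: 90 of 100 pairs agree, with balanced marginals.
example_kappa = cohens_kappa(45, 5, 5, 45)
```

With these illustrative counts, observed agreement is 0.90, chance agreement is 0.50, and kappa is 0.80.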

Our analyses of factors associated with inconsistent ever/never responses indicate that women reporting ever having had a mammogram at baseline but never use at follow-up exhibited many of the sociodemographic and health behaviour characteristics (e.g., lower income, outside age groups targeted for screening, non-users of Pap screening and hormone replacement therapy) commonly observed among non-participants in mammography screening in previous studies [31–38]. Such findings provide support for the assumption that the 1994–95 response of ever use is more likely erroneous. Additional factors (e.g. being born in an Asian country) previously associated with non-use of mammography [31, 33] also showed a positive association with providing an inconsistent response; however, small numbers resulted in high variability once clustering and stratification were taken into account and precluded further analysis of this variable.

Validation studies also provide support for our assumption that inconsistent ever/never responses (as assessed in the present longitudinal panel) are most likely to be explained by false-positive responses at baseline. Women are more likely to falsely claim having had a mammogram than to falsely deny having had one [4, 6, 8, 9, 13, 17, 39, 40]. The majority of women in our study with inconsistent ever/never responses also indicated (at baseline) that their most recent mammogram had occurred within the last two years, a finding consistent with past research [10]. Imputation, suggested as a remedy for item non-response [30], may equally be used to deal with inconsistencies, with the evidence in this situation favoring treatment of women's earlier responses as false-positive.

Among women who reported never use at baseline and ever use at follow-up, 17.4 percent reported a time for their most recent mammogram (at follow-up) that was inconsistent with never use at baseline. If respondents truly initiated mammography use subsequent to the baseline interview, they overestimated the time elapsed since their mammogram. Such a finding is relatively inconsistent with previous studies that have generally found that women tend to underestimate the time since their last mammogram [4–6, 8–13, 15, 17]. Although McGovern et al. and Caplan et al. found reverse-telescoping in approximately 9 percent of their samples [13, 16], only 8 percent of women in this group miscalculated by more than 1 year [16]. In the present study, women reporting inconsistent timing were older and in poorer health, suggesting that competing health events may have interfered with accurate recall.

Unfortunately, no gold standard was available to assess the validity of the responses women generated. Thus, the proportion of consistent responses that may actually represent invalid responses is unclear. Nor was it possible to distinguish errors in recall of timing from false reports of mammography utilization. Further, the reasons for providing inconsistent responses can only be inferred. The possibility of data entry errors is remote. Although data entry checks for consistency between survey cycles were not included among the comprehensive quality control strategies implemented by Statistics Canada, computer assisted interviewing was used by highly trained interviewers. Also, data entry errors would be expected to occur randomly and not disproportionately among women outside of the target age range or with relatively lower socioeconomic status, as observed in the present study. However, several plausible explanations for the inconsistencies exist, including survey methodology changes and deliberate or inadvertent provision of inaccurate responses by the respondent.

We cannot exclude the possibility that interview changes (from a baseline personal interview to a telephone follow-up) prompted women who reported ever use in 1994/95 to alter their response in 1996–97. The NPHS used an initial personal interview to foster a good long-term relationship with the panel representative, but the cost and logistics of traveling to different regions was prohibitive. Therefore, unless the respondent objected or had no phone, future interviews (including the 1996–97 cycle) were conducted by telephone [26]. The need to maintain study procedures over time in longitudinal studies has been stressed [41], but the impact of altering the interview method on the NPHS results has not been investigated [26]. Sensitive questions may be answered more truthfully by phone. Editing of survey responses by the respondent may occur. Social desirability and a tendency to give positive responses are possible sources of over-reporting [3, 11]. Such biases remain largely unexplored with respect to mammography use.

Cognitive research has implicated comprehension as a barrier to providing valid, reliable survey responses. Several researchers employ the lead-in question 'Have you ever heard of a mammogram, that is a breast x-ray?' to identify comprehension difficulties [3, 16, 21, 42] but this was not done in the NPHS where women were directly asked if they had ever had a mammogram. One study using focus group testing and in-depth interviews showed that despite some confusion between mammography and breast exams, women generally understood what mammography was [11], a finding further supported by one population-based survey [21]. However other investigations suggest this is not uniformly so [16, 42, 43]. One possibility cited is that confusion with other tests such as chest x-rays may lead to over-reporting of mammography [3].

There are several possible explanations for the higher rate of inconsistent responses observed in the present study. The focus of the NPHS questionnaire was not limited to preventive practices nor was it designed for reliability testing. The length and comprehensiveness of the NPHS may have contributed to greater respondent fatigue. Also, the longer time interval between surveys, relative to that observed in other studies, may have contributed to instability in responses. A more favourable level of concordance among responses may be obtained by studies that apply eligibility criteria that ensure more accurate responses (e.g. by having the respondent recall where she had her mammogram to validate her ever/never use) [6, 10, 15]. Finally, the overall response rate of the NPHS was higher than in comparable studies, so it potentially included a more difficult-to-reach population, less able to provide accurate responses.

The inconsistent data reported here, if removed from longitudinal analyses, could yield losses to follow-up equivalent to or greater than other sources of attrition (e.g., deaths, institutionalization, non-response) over the planned 20 year course of the NPHS. The 1998–99 NPHS included probing questions to reduce inconsistencies and it should alleviate many of the problems evident here [44]. However, probes were not designed to address the larger problem of reverse-telescoping observed among some respondents in the NPHS longitudinal cohort. Incorporation of women's previous responses in subsequent interviews to avoid telescoping and stimulate recall can be used to minimize such inconsistencies [30]. A recent critical review of the accuracy of self-reported health behaviors, including mammography, provides further suggestions for enhancing the accuracy of such data [45].

Conclusions

In summary, inconsistent responses represent a challenge to longitudinal, population-based evaluations of breast screening practices. Losses from inconsistent data regarding mammography participation are not negligible and may contribute to inaccurate estimates of mammography uptake. Women reporting inconsistent ever/never use in the present study displayed characteristics typical of never users, favoring treatment of women's baseline responses as false-positive. Inconsistent responses regarding the timing of recent mammography practices, however, may be primarily related to the impact of age and competing morbidity on recall.