Introduction

Internet-based interventions show substantial promise for improving health-related behaviors (Haug et al., 2011; Patrick et al., 2011), chronic disease self-management (Glasgow et al., 2010; Lorig et al., 2008; van der Meer et al., 2009), and psychological functioning (Kessler et al., 2009; Lintvedt et al., 2012). Results of Internet-based interventions specifically for cancer patients and survivors have also been promising, but outcomes have varied markedly across studies. For example, in a randomized trial of 72 women with early-stage breast cancer, many of whom had clinically-significant levels of depression, Winzelberg et al. (2003) demonstrated that a 12-week online support group significantly reduced symptoms of depression and post-traumatic stress symptoms. Similarly, in a randomized pilot of a 12-week, self-guided Internet-based intervention for 62 women with early-stage breast cancer, Owen et al. (2005) reported improvements in perceived health status, but only for those women with poor health status prior to the intervention. In a randomized trial of a 24-week Internet-based intervention for 295 women with breast cancer, Gustafson et al. (2001) reported no effects on quality of life or emotional well-being but found that the intervention (comprehensive health enhancement support system, CHESS) significantly improved perceived information competence, comfort participating in health care decision-making, and confidence in healthcare providers. David et al. (2011) tested an 8-week program of Internet-based individual counseling and found no effect of the intervention on mood or quality of life for 133 women with breast cancer, despite high levels of satisfaction with the intervention.

Making sense of these conflicting results is complicated not only by differences in intervention methodologies but also by recruitment procedures that yield substantially different samples across studies. To date, most trials of Internet-based interventions have relied on convenience sampling procedures (Gorlick et al., 2011). This has been true for online interventions for cancer patients and survivors (David et al., 2011; Gustafson et al., 2001; Winzelberg et al., 2003) and interventions for a wide range of other health conditions (Berman et al., 2009; Devineni & Blanchard, 2005; Strom et al., 2000). With few exceptions (Salzer et al., 2010), trials using convenience samples have identified benefits associated with Internet-based intervention (Hill & Weinert, 2004; Lieberman et al., 2003; McTavish et al., 1995; Winzelberg et al., 2003), whereas results from trials that incorporate a systematic sample have yielded inconsistent findings, including worse outcomes associated with Internet-based intervention (Hoybe et al., 2010) or benefits for only select subgroups (Gustafson et al., 2001; Owen et al., 2005). This uneven pattern of results suggests the need for a closer look at sampling strategies used in Internet-based trials.

Convenience samples allow investigators to take advantage of potentially large sample sizes, and the Internet is ideally suited for large-scale convenience sampling, including the capacity to access difficult-to-reach populations and to recruit across wide geographic areas or even globally. In addition, online sampling allows people who feel they could benefit to self-identify as ready to take part in an intervention, rather than being identified by the research team as potential candidates, which might promote higher engagement. It may also be that those who self-select into an online trial are representative of those in the population at large who would be most likely to use Internet-based services. However, convenience samples can also introduce bias. For example, they may not be representative of the larger population of those living with cancer with respect to important and intervention-relevant patient characteristics, such as gender, age, ethnicity, cancer type, or motivation to engage in treatment (Chou et al., 2011). Moreover, convenience samples make it difficult to determine what proportion of the intended population (e.g., all survivors with distress or other unmet psychosocial needs) would make use of or benefit from the intervention, thereby obscuring the degree to which study results are generalizable to particular patient populations (Gross et al., 2002; Wright et al., 2006). The Internet could advance the understanding of sampling patterns, but as with any intervention that relies on self-identification for recruitment, it can be challenging to gain information about the larger pool of potential participants.

Even if an intervention is shown to be efficacious, results from trials that primarily include highly-motivated or self-selected participants may not generalize well enough to be translated into lasting, effective, real-world treatments for cancer-related distress. Cancer survivors who use the Internet are younger, better educated, healthier, and less likely to belong to a minority group than those who do not use the Internet (Chou et al., 2011). Additionally, use of Internet-based support groups has been linked with distinct demographic and medical profiles. In a large study of online support group use among those with a chronic health condition, use was higher among women, younger adults, those with higher income and education, and those with worse health and emotional well-being (Owen et al., 2010). Similarly, in nationally representative data specific to those with cancer, use of online support groups is more prevalent among those with worse health status (Chou et al., 2011). These findings are meaningful because they both mirror and differ from the profile of those who make use of face-to-face services, such as support groups. In a population sample of 1,844 cancer survivors, participation in face-to-face support groups was associated with being younger, female, and more highly educated, but not with health status or emotional well-being (Owen et al., 2007). Hence, patients and survivors who are recipients of convenience sampling recruitment messages may represent distinct demographic, medical, and psychosocial subsets of the larger population of cancer survivors, and such differences could affect interpretations about who is most likely to benefit from these types of interventions.

Alternatively, systematic recruitment strategies, which involve identifying a population of potential participants and then attempting to recruit all members or a random subset of that population, offer potential methodological advantages, as well as limitations. In contrast with convenience sampling designs, systematic recruitment approaches give investigators the ability to evaluate sampling patterns (e.g., enrollment and eligibility fractions), potential biases associated with participation in the intervention, and other threats to external validity. Cancer registries are a potentially more representative mechanism for systematic recruitment, in addition to providing more transparency in terms of who is included in the potential recruitment source. Registries are population-based and held to stringent reporting standards, and their comprehensiveness allows investigators to identify a more discrete sample of cancer patients and survivors (Beskow et al., 2006). A number of studies in cancer survivors have used registry-based recruitment strategies (Boehmer et al., 2010; Cadmus Bertram et al., 2011), but these strategies have not yet been tested in Internet-based interventions, and institution- or organization-specific registries may not be representative of the population at large. Additionally, no previous studies have evaluated the association between recruitment method and sample characteristics in Internet-based studies for cancer survivors.

The current study had three primary aims. First, we sought to evaluate the representativeness of a registry-based sample and an Internet-based convenience sample of those wanting to participate in an Internet-based intervention for cancer-related distress by comparing both samples (and the combined sample) with a nationally-representative comparison group of cancer survivors on cancer type and basic demographic variables. Because recruitment took place at a cancer center with a large population of prostate cancer patients, we did not expect the registry sample to be representative of the larger population with respect to cancer type, but we did expect it to be representative with respect to gender, age, and ethnicity. Previous studies have suggested that cancer survivors using the Internet are younger and include a higher proportion of non-Hispanic Whites than non-Internet users (Chou et al., 2011). Accordingly, we hypothesized that the Internet sample would be representative of the larger population of cancer survivors with respect to gender but not ethnicity or age. The second aim of the study was to compare more detailed medical and sociodemographic profiles of the two samples using variables not commonly available in nationally-representative datasets. We hypothesized that the two samples would differ with respect to factors commonly associated with the “digital divide” including age, educational attainment, and income levels, but also that the Internet sample would exhibit more advanced cancers and a greater degree of functional impairment associated with cancer (Owen et al., 2010). The third aim of the study was to compare psychological profiles of the two samples. On the basis of previous findings of higher levels of depressive symptoms in those who seek out online support groups (Owen et al., 2010), we hypothesized that the Internet sample would exhibit higher levels of psychological distress and lower quality of life.
Some research has suggested a link between social support and use of the Internet for health-related information among cancer survivors, with some studies showing greater social support in those using the Internet for health-related purposes (Fogel et al., 2002) and others suggesting that having unmet social support needs predicts greater use of Internet discussion groups (Lee & Hawkins, 2010). Given these previous findings, we also sought to evaluate potential differences between the samples with respect to levels of social support and unmet social support needs.

Methods

Participants

With IRB approval, two separate recruitment procedures were used: systematic sampling using a large cancer registry and convenience sampling via the Internet.

Registry sample

Cancer patients and survivors who became part of a registry maintained by a large medical center in Southern California between July 2007 and June 2009 were identified for potential recruitment. The registry was part of a statewide population-based cancer surveillance system and collected basic demographic and medical characteristics of all new cancer diagnoses. Those listed in the registry were mailed a letter describing the study, how to find out more and enroll using the study website, as well as how to opt out of contact by study personnel. Those who did not opt out or visit the study website were contacted by telephone, provided with information about the study, and screened for eligibility to enroll. Among 2,025 cancer patients and survivors who received the letter, 937 (46.3 %) were successfully contacted, and of these, 425 (45.4 %) agreed to be screened for eligibility to join the study. Once screened, 212 individuals (50 %) were eligible to participate, and 80 (38 %) of these completed the consent and baseline questionnaire.

Internet sample

Information about the study was sent to administrators and moderators of 657 unique cancer-related websites, online forums, and Facebook groups, representing a wide array of general (n = 303, 46.1 %) and cancer-specific online groups (n = 354, 53.9 %). Cancer types associated with these online groups were modestly representative of the population of cancer survivors in the US (see Table 1), with sites targeting breast cancer (22.1 %), prostate cancer (10.5 %), colorectal cancers (5.9 %), lung cancer (3.1 %), female reproductive cancers (14.1 %), hematologic cancers (5.6 %), melanoma (2.5 %), urinary cancers (5.4 %), and other cancers (26.6 %). Potential participants were encouraged to visit the study website, where they were able to learn more and screen themselves for eligibility to participate in the study. No identifying information was obtained from those who visited the study website prior to enrollment in the study, and it was not possible to track the number of individuals who viewed the study website after receiving some kind of information about the study. However, using IP addresses obtained from the study server, we were able to estimate the number of unique individuals who completed the online screening questionnaire (n = 516) and the number of individuals who were identified as eligible to participate (n = 280; 54 %). Of the 280 eligible, 160 participants (57 %) enrolled through the study website and completed the consent and baseline questionnaire.

Table 1 Demographic characteristics of the registry, Internet, and combined samples relative to a nationally-representative comparison sample

Eligibility criteria

In order to be eligible to enroll in the study, potential participants were required to be at least 18 years of age, have consistent access to the Internet, have proficiency reading and writing in English, and have a score on the distress thermometer (DT) of 4 or higher (Jacobsen et al., 2005; Roth et al., 1998), as this cut-point has been demonstrated to be indicative of clinically significant distress (Gessler et al., 2008; Jacobsen et al., 2005).

Procedures

Participants were recruited to participate in a 12-week randomized, wait-list controlled clinical trial to evaluate an Internet-based social-networking intervention (health-space.net) to reduce cancer-related distress. The health-space.net intervention provided a professionally-facilitated discussion board, weekly guidance modules to encourage development of approach-oriented coping strategies to alleviate distress, a weekly, professionally-facilitated chat with other group members to encourage discussion of each week’s guidance module, personal pages for sharing photos and personal stories with other group members, and blogs for completing weekly exercises specific to each guidance module. After enrolling in the study, participants completed an online consent process and a baseline questionnaire using the study website. Upon completion of the baseline questionnaire, participants were randomized to one of two conditions: immediate access to the health-space.net intervention or a 12-week waiting list. Twelve weeks after completing the baseline questionnaire, participants completed a follow-up questionnaire, and wait list participants were given access to the online intervention. Baseline questionnaire data were used in the current study.

Measures

Recruitment fractions

Interpreting the generalizability of trial results requires the identification of a target population of survivors in the general population who might be expected to benefit from Internet-based intervention. In both samples, the target population was operationalized as those individuals who demonstrated at least some level of interest in receiving Internet-based support for cancer-related distress as evidenced by their willingness to undergo the very brief screening procedure. For the registry sample, the target population included 425 individuals who agreed to be screened, and for the Internet sample the target population included 516 individuals who completed the online screening tool. Eligibility, enrollment, and recruitment fractions were calculated as described by Gross et al. (2002). The eligibility fraction was defined as the proportion of the target population determined to be eligible to participate, and the enrollment fraction was defined as the proportion of those who were eligible who completed the baseline study questionnaire described in more detail below. The recruitment fraction was calculated as the product of the eligibility and enrollment fractions and represents the proportion of the target population who fully enrolled in the study.
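As a minimal sketch of the calculation described above, the three fractions can be computed from screening counts as shown below. The class name, function name, and the counts in the example are hypothetical and serve only as an illustration; they are not the study's data or analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecruitmentFractions:
    eligibility: float  # eligible / target population (those screened)
    enrollment: float   # enrolled / eligible
    recruitment: float  # eligibility * enrollment = enrolled / target population


def recruitment_fractions(n_screened: int, n_eligible: int, n_enrolled: int) -> RecruitmentFractions:
    """Compute eligibility, enrollment, and recruitment fractions from counts."""
    if not (0 < n_eligible <= n_screened and 0 <= n_enrolled <= n_eligible):
        raise ValueError("counts must satisfy 0 <= enrolled <= eligible <= screened, with eligible > 0")
    eligibility = n_eligible / n_screened
    enrollment = n_enrolled / n_eligible
    return RecruitmentFractions(eligibility, enrollment, eligibility * enrollment)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = recruitment_fractions(n_screened=400, n_eligible=200, n_enrolled=80)
print(f"eligibility = {example.eligibility:.1%}")
print(f"enrollment  = {example.enrollment:.1%}")
print(f"recruitment = {example.recruitment:.1%}")
```

Because the recruitment fraction is the product of the other two, it is equivalent to the number enrolled divided by the size of the target population.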

Demographic and medical characteristics

Age, gender, cancer type, and ethnicity were obtained directly from the cancer registry for registry-recruited participants and by self-report for web-recruited participants. All participants also self-reported their level of education, annual household income, marital status, current employment, frequency of Internet use, time since cancer diagnosis, cancer stage, and days/month of restricted activities due to their cancer.

Physical, social, and psychological well-being

Measures of quality of life, social support, and psychological well-being were included in the baseline questionnaire.

Quality of life was measured using the Functional Assessment of Cancer Therapy (FACT-G) scale and the Quality of Well-Being Scale. The FACT-G is a 27-item questionnaire which utilizes 5-point Likert scales to evaluate social well-being, physical well-being, emotional well-being, functional well-being, and overall quality of life (Cella, 1997). This instrument has adequate internal consistency (overall α = .90, subscale α’s = .63–.86) and good concurrent validity with ECOG performance status (Brady et al., 1997). The EuroQol-5D Quality of Well-Being Scale, or “feelings thermometer,” is a single-item visual analogue scale designed to measure self-rated overall health (Brooks, 1996). Participants were asked to rate their overall health on a 0–100 scale anchored at 0 by the “least desirable state of health you can imagine” and at 100 by “perfect health.” The measure has good test–retest reliability, concurrent validity, and sensitivity to change (Llach et al., 1999).

Psychological well-being was assessed using measures of distress, mood disturbance, depressive symptoms, and trauma symptoms. Distress was measured using the single-item DT (Roth et al., 1998), which asks respondents to rate the level of distress they experienced over the previous week using a 0–10 scale. The DT exhibits good sensitivity and specificity for identifying clinically-significant distress in cancer survivors (Jacobsen et al., 2005; Ransom et al., 2006). Depressive symptoms were evaluated using the Center for Epidemiologic Studies Depression Scale (CES-D), a 20-item measure that employs 4-point Likert scales for each item (Radloff, 1977). The CES-D has been shown to be reliable and has been validated for use in cancer populations (Baker et al., 2002; Hann et al., 1999). Total mood disturbance was measured using the short form of the Profile of Mood States (POMS–SF; Baker et al., 2002). The POMS–SF requires respondents to rate the extent to which they have experienced each of 37 distinct mood states in the previous week using a 5-point Likert scale anchored with “not at all” and “extremely.” The current study used the total mood disturbance score (α = .91) and the 5-item fatigue subscale (α = .90; Baker et al., 2002). Trauma symptoms were measured using the Impact of Event Scale-Revised (IES-R), a 22-item, Likert-type scale designed to measure the intrusiveness and avoidance of cancer-related thoughts and stimuli (Weiss & Marmar, 1997). The instrument has good internal consistency (Cronbach's α = .79–.92) and is sensitive to the effects of psychosocial intervention (Edgar et al., 1992).

Emotional social support was measured using 6 items drawn from the Yale Social Support Index (Seeman & Berkman, 1988) to capture both positive emotional support and aversive emotional support in cancer survivors (Butler et al., 1999). Respondents were asked to use a 4-point Likert scale to rate the amount and quality of emotional social support received from friends and family (α = .73). Unmet social support needs, or social constraints, were measured using the 15-item Social Constraints Scale (Lepore, 2001). Respondents used a 4-point Likert scale to rate the extent to which a spouse, significant other, or close friend had been receptive to the participant’s expressions of feelings and concerns about their cancer during the past month. Reliability of the total score is excellent (α = .88) in cancer survivors, and the instrument possesses good test–retest reliability and validity (Lepore & Ituarte, 1999).

Data analysis

To evaluate the representativeness of both a registry-based sample and an Internet-based sample with respect to cancer type, we compared both samples with recent prevalence data derived from the Surveillance, Epidemiology, and End Results (SEER) program (Rowland et al., 2012). SEER provides estimates based on reporting data contributed by over 11 million cancer survivors. To evaluate the representativeness of our samples with respect to age, gender, and ethnicity, both samples were compared with a nationally representative sample of 5,150 cancer survivors derived from the 2003, 2004, and 2005 National Health Interview Surveys (Kaiser et al., 2010). Potential differences in age were tested with independent-samples t tests, using 3 planned comparisons between (1) the nationally-representative sample and the registry-recruited sample, (2) the nationally-representative sample and the web-recruited sample, and (3) the registry-recruited and web-recruited samples. Comparisons of gender, ethnicity, and cancer type across the 3 groups (registry, web, nationally-representative samples) were evaluated using χ2 analyses. Differences between registry and web-recruited samples on additional demographic, medical, and psychosocial characteristics were evaluated using t tests and χ2 analyses. When annual household income levels were compared between registry-recruited participants and web-recruited participants, five outliers reporting income >$500,000/year were removed. Removal of outliers did not affect the non-significant between-group difference. To test group differences (registry vs web-recruited) in psychosocial characteristics, covariates were identified separately for each psychosocial variable.
We examined possible univariate associations between each psychosocial variable and those demographic and medical characteristics that differed between the registry and web-recruited samples (i.e., age, gender, race, educational attainment, frequency of Internet use, cancer type, cancer severity, restrictions due to cancer, and time since diagnosis). Demographic and medical characteristics that demonstrated significant associations with a psychosocial variable were included as covariates when testing group differences on that psychosocial variable.
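The χ2 comparisons of proportions described above can be illustrated with a Pearson χ2 test on a 2 × 2 table. The sketch below assumes no continuity correction (the text does not state whether one was applied), and the function name and all counts are hypothetical rather than taken from the study.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (df = 1, no continuity correction) for [[a, b], [c, d]].

    Returns the chi-square statistic and its two-sided p value.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be greater than zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical example: 100 of 200 people eligible in sample A,
# 130 of 200 people eligible in sample B.
chi2, p_value = chi_square_2x2(100, 100, 130, 70)
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.4f}")
```

The closed-form p value holds only for one degree of freedom; comparisons across more categories (e.g., cancer type) would need a general χ2 distribution function from a statistics library.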

Results

Characteristics of cancer survivors in the cancer registry

There were 2,025 individuals listed in the cancer registry database obtained for this study. The average age was 62.2 (SD = 13.6), and the majority of those listed in the registry were male (60.9 %) and non-Hispanic White (75.5 %). The most frequently represented cancer types were prostate (37.1 %), female reproductive (11.2 %), breast (9.5 %), urinary (4.5 %), lung (4.4 %), colorectal (4.2 %), melanoma (2.9 %), hematologic (4.5 %), and other cancers (24.3 %).

Representativeness of registry-enrolled and web-enrolled study participants

We first sought to establish whether study participants who were enrolled through the cancer registry were representative of the registry as a whole. Relative to this cancer registry at large, registry participants who enrolled in the study and completed the baseline assessment were similar with respect to gender, χ2(2) = .44, p = .80, ethnicity, χ2(4) = 4.9, p = .29, and cancer type, χ2(9) = 6.3, p = .71. However, those who enrolled in the study were significantly younger (\( \bar{x} \) = 57.1 years) than registry patients who did not enroll or complete a baseline assessment (\( \bar{x} \) = 62.4 years), t (2,022) = 3.47, p = .001.

Next, we sought to determine whether our registry-enrolled study participants and web-enrolled study participants were representative of all cancer survivors in the United States. Both registry participants and web-enrolled participants exhibited significant differences compared to nationally-representative comparison groups. Registry participants differed with respect to cancer type, χ2(8) = 58.0, p < .001, with a lower proportion of breast and urinary cancers and a higher proportion of prostate and other cancers, as did web-enrolled participants, χ2(8) = 112.7, p < .001, with a higher proportion of breast and other cancers and a lower proportion of prostate and urinary cancers. Additional differences by cancer type are shown in Table 1. Both registry, t (2,024) = 4.3, p < .001, and web-enrolled participants, t (159) = 15.0, p < .001, were significantly younger than the nationally-representative sample. The proportion of males among registry participants was significantly higher than in the nationally-representative sample, χ2(1) = 11.5, p < .001, whereas the proportion of males among web-enrolled participants was significantly lower than in the nationally-representative sample, χ2(1) = 69.4, p < .001. Neither registry nor web-enrolled participants differed significantly from the nationally-representative sample with respect to ethnicity. When the web-enrolled and registry participants were combined (n = 240), the total sample remained significantly younger, t (5,388) = 65.6, p < .001, had a greater proportion of female participants, χ2(1) = 22.6, p < .001, and consisted of more women with breast cancer, χ2(9) = 110.4, p < .001, than did the nationally-representative comparison group.

Recruitment fractions

With respect to eligibility and enrollment, 53.7 % of potential Registry participants (i.e., those who were successfully contacted and agreed to be screened for eligibility) were determined to be eligible to participate in the study, and 45.0 % of these completed the baseline survey. The recruitment fraction, a product of the eligibility and enrollment fractions, was 24.2 % for the Registry sample. For the Internet sample, 61.6 % of potential participants (i.e., those who navigated to the study website and completed the screening questionnaire) were determined to be eligible to join the trial. Of these, 51.2 % completed a baseline survey, yielding a recruitment fraction of 31.5 %. Eligibility fractions were significantly higher in the Internet sample, χ2(1) = 7.00, p = .008, but enrollment fractions did not differ between groups. Recruitment fractions were significantly higher in the Internet sample, χ2(1) = 7.24, p = .007.

Demographic and medical differences between registry-enrolled and web-enrolled study participants

Registry and web-enrolled participants were compared on sociodemographic and medical characteristics (see Table 2). The two groups did not differ with respect to annual household income, marital status, or employment. However, the web-enrolled sample reported significantly higher educational attainment (16.0 vs 14.9 years), t (238) = 3.00, p = .003, and more frequent use of the Internet, t (121.7) = 3.2, p = .002. With respect to medical characteristics, the web-enrolled sample reported higher levels of cancer spread to lymph nodes or beyond (56.9 %) relative to the registry sample (20 %), χ2(3) = 30.5, p < .001. The web-enrolled sample also reported more time since initial diagnosis, t (156.9) = 3.41, and more days of restricted activities due to cancer, t (228.8) = 3.5, p < .001.

Table 2 Demographic and medical characteristics of registry-recruited and web-recruited participants

Psychosocial profiles of registry-enrolled and web-enrolled study participants

Web-enrolled and registry participants differed with respect to each of the psychosocial characteristics, with all differences suggesting worse psychosocial functioning in the web-enrolled sample (see Table 3). Web-enrolled participants exhibited significantly worse distress, t (111.8) = 5.0, p < .001, total mood disturbance, t (238) = 5.4, p < .001, depressive symptoms, t (238) = 5.2, p < .001, quality of life, t (238) = 6.1, p < .001, quality of well-being, t (237) = 2.9, p = .004, intrusive/avoidant symptoms, t (238) = 4.2, p < .001, fatigue, t (238) = 4.6, p < .001, and social support, t (238) = 5.0, p < .001. Web-enrolled participants also reported significantly higher levels of social constraints on expression of emotional needs and concerns, t (238) = 5.9, p < .001. Unadjusted effect sizes were moderate to large (d’s = .39–.82). With the exception of quality of well-being, all of these group differences persisted after adjusting for group differences in demographic and medical characteristics. Covariate-adjusted effect sizes were small to moderate (d’s = .35–.63).
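To make the reported statistics easier to interpret, the sketch below shows one common way to obtain a t statistic and a standardized effect size from group summary statistics. It rests on two assumptions that the text does not confirm: that the fractional degrees of freedom (e.g., 111.8) reflect an unequal-variance (Welch) t test, and that d was computed as Cohen's d with a pooled standard deviation. The group sizes match the web-enrolled (n = 160) and registry (n = 80) samples, but the means and standard deviations are hypothetical.

```python
import math


def welch_t(mean1: float, sd1: float, n1: int,
            mean2: float, sd2: float, n2: int) -> tuple[float, float]:
    """Welch's unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Hypothetical symptom scores: web-enrolled group first, registry group second.
t_stat, df = welch_t(20.0, 10.0, 160, 15.0, 10.0, 80)
d = cohens_d(20.0, 10.0, 160, 15.0, 10.0, 80)
print(f"t({df:.1f}) = {t_stat:.2f}, d = {d:.2f}")
```

When the two standard deviations are equal, as in this example, the Welch degrees of freedom fall well below n1 + n2 − 2 = 238 only because the group sizes differ.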

Table 3 Psychosocial characteristics of registry-recruited and web-recruited participants

Discussion

Broad-based recruitment of cancer survivors into an Internet intervention for cancer-related distress yields samples that are not entirely representative of those living with cancer, and this is true for both Internet-based and registry-based sampling strategies. The Internet and registry samples in this study did not differ from a representative sample with respect to ethnicity, and there were no between-group differences in income, marital status, or employment status. Both groups showed prominent differences from the representative sample with respect to age, gender, and cancer type. Moreover, these differences were in some cases in opposite directions (e.g., higher proportions of men in the registry sample compared with higher proportions of women in the Internet sample). Although the registry sample appeared to better approximate the representative sample with respect to age, gender, and ethnicity, combining the registry and Internet samples yielded a much better approximation for gender and common cancer types than either sample alone. It is also interesting to note that despite broad sampling of cancer-related online groups, the Internet sample consisted predominantly of women. Although we do not have data to address why this might be, it may be that online groups for female cancer survivors have more members than do men's groups or have a greater degree of support for clinical trials. We noted a “boom or bust” cycle to Internet recruitment. Many online groups were non-responsive to our recruitment requests, but a single group that agreed to disseminate information about the study could yield large numbers of participants.

Registry-based sampling alone is insufficient for generating a nationally-representative sample. The registry used in the current study contained a disproportionately high number of men with prostate cancer, likely due to the availability of proton radiation therapy at the treatment facility. There are additional reasons why systematic samples might be biased. Systematic sampling that occurs in large cancer centers is likely to be biased toward those who more frequently have private health insurance or Medicare (Harlan et al., 2005). Many systematic recruitment strategies, including the use of registries or consecutive clinic patients, will serve to identify patients who are more likely to be in active or recently completed treatment by virtue of the fact that patients are identified at instances of care. Such strategies are unlikely to identify patients in long-term follow-up who may visit their oncology providers at a much lower frequency or not at all.

Similarly, Internet-based sampling procedures are likely to be insufficient for obtaining a sample representative of the larger population of cancer survivors. Although we attempted to identify the broadest possible range of cancer-related groups and organizations on the Internet, the Internet sample was composed largely of women with breast cancer and exhibited significantly greater educational attainment, higher levels of functional impairment, and more advanced disease than did the registry sample. Perhaps survivors who use the groups and forums we targeted on the Internet are a distinct and non-representative subgroup of cancer survivors, consistent with findings from previous studies of those who use online support groups to cope with chronic disease (Owen et al., 2010). Alternatively, our Internet sample could be an accurate reflection of those survivors with cancer-related distress who are interested in using Internet-based psychosocial interventions. More research is needed to better understand which cancer survivors are most likely to benefit from Internet-based interventions. It is important to note that with Internet-based sampling strategies, it is very difficult to determine whether cancer type, gender, or other characteristics are associated with interest in using an intervention, because we are not able to say how many survivors with cancer-related distress received information about the intervention, i.e., the potential reach of the intervention (Bennett & Glasgow, 2009). There is also likely to be substantial variability in demographic characteristics across channels used for Internet recruitment. Recruiting from more widely used types of social media (e.g., Facebook) would likely yield participants with different characteristics than recruiting from sites focused on actively providing emotional support. This remains an open and relevant question.
In post hoc tests conducted in the registry sample, cancer type, gender, and other characteristics did not seem to be associated with enrollment, although age was related to enrollment, such that younger people participated at higher rates. Overall, though, this pattern suggests that knowledge, expectation levels, motivation, perceived stigma of psychosocial care, or other factors may play a role in creating a self-selection bias in Internet-based samples.

The Internet recruitment strategy resulted in a more efficient screening process, in that those who came to the study website in response to an Internet recruitment strategy were more likely to be distressed and therefore eligible to participate in the intervention. Because the health-space.net intervention was designed specifically for those with cancer-related distress, these participants may be more motivated to participate and perhaps also more likely to benefit from the intervention. This is a substantial advantage of Internet recruitment, in that such strategies may yield samples that are poised to make use of the Internet-based intervention. Additionally, for many, the experience of distress can serve as its own barrier to accessing services. Internet-based interventions reduce the level of participant burden required to join and participate in a clinical trial, making it easier for those in distress, those who would otherwise have to travel long distances, and/or those with significant physical limitations to join a behavioral trial.

In addition to having higher levels of distress, the Internet sample exhibited significantly worse quality of life and lower psychological well-being than did the Registry sample. The Internet sample also had significantly lower social support and greater social constraints within their support networks, with effect sizes in the moderate to large range. Given that all participants in the present study had to exhibit significant levels of distress to be eligible to participate, it is noteworthy that the Internet sample was even more distressed than the already-distressed Registry sample. These results suggest that decisions about which types of recruitment strategies to use may depend on the goals of the research. The Internet is an extremely effective recruitment channel for distressed participants. However, Internet-based recruitment methodologies should be supplemented with systematic recruitment approaches if a more representative sample is required. The merits and limitations of various sampling strategies may depend heavily on the focus of each type of intervention being tested. While our results do not identify any single strategy as ideal, researchers should carefully consider how recruitment might influence the generalizability of intervention results.

Differences in distress between the Internet and Registry samples have important implications for testing interventions to improve psychological well-being in cancer survivors. Because people recruited via the Internet have more extreme levels of distress and lower overall psychological functioning and quality of life, Internet interventions may work differently in this group than for those with less severe disruptions in psychological functioning. More highly distressed participants may have higher levels of motivation to engage in treatment, may respond to interventions differently than those with less severe distress, and may be particularly well-suited for identifying intervention-related effects. We have previously shown that baseline functioning is a predictor of response to Internet-based treatment, with those with worse health status showing greater benefits of treatment than those with better health status (Owen et al., 2005). Interventions tested only in those recruited via the Internet may not generalize well to other, more representative samples and have the potential to overestimate effect sizes if the intended population is all survivors with distress. It is also important to recognize that distress is a multi-factorial experience (Kendall et al., 2011) that may be associated with a number of distinct contributing factors. It may be necessary to tailor interventions based on either distress severity or specific distress-inducing factors (e.g., fatigue, worsening physical health, financial concerns, anxiety or fears of recurrence, changes in social support networks). Alternatively, interventions may need to be designed in a manner that is flexible enough to accommodate wide ranges in distress levels and to meet the needs of a very heterogeneous group of survivors living with cancer-related distress.

There are several limitations of the current study. First, our representative sample likely included survivors with a range of distress levels (including no distress) and may not have been representative of those living with cancer-related distress. Distress in cancer survivors is known to be associated with younger age, being female, having lower educational attainment, and being unmarried (Hoffman et al., 2009; Kaiser et al., 2010), which would suggest that our combined (i.e., Internet plus Registry) sample may be more representative than it appears in Table 1. Unfortunately, large representative datasets are still difficult to subset with respect to distress, because tools used to measure non-somatic distress in large population studies (e.g., NHIS) appear to be much less sensitive to distress than other widely used instruments (Kaiser et al., 2010). Second, our systematic sample was from a single cancer center, and results might be expected to differ for multi-site sampling strategies. Multi-site recruitment likely would provide a more representative sample than that obtained in the current study. Stratified sampling procedures might also have better ensured representativeness of the samples by using population estimates of demographic (e.g., sex), medical (e.g., primary cancer site, stage of cancer), and other pertinent characteristics of the population under investigation (e.g., distress). Third, Internet recruitment strategies are not likely to remain stable over time. Given the rapidly changing nature of the Internet (Zickuhr & Madden, 2012) and increasing use of the Internet by cancer survivors for information-seeking and social-networking (Chou et al., 2011), ways to provide outreach to survivors via the Internet are likely to evolve. As Internet use becomes increasingly widespread in the US population in general, and among older adults specifically, Internet convenience samples are likely to become increasingly representative of the population of cancer survivors.

In summary, recruitment strategies can provide very different types of samples, as the results of this study make plain. Understanding how recruitment shapes a sample for this type of intervention has important implications for interpretation and dissemination. Interventions that are not designed and evaluated for their target audience are considerably less likely to be either effective or implemented in the real world (Glasgow et al., 2012). We recommend that future studies of Internet interventions for those with cancer, particularly studies focused on treating cancer-related distress, carefully consider adjustment for potential sources of bias and/or employ a stratified sampling design in order to obtain a more fully representative sample. Such a strategy would make it possible to benefit from large-scale and cost-effective convenience sampling via the Internet while ensuring that certain groups that might otherwise be missed (e.g., males, minorities, certain cancer types, those with less severe levels of distress) are adequately sampled. Internet recruitment does appear to successfully reach those who may be most in need of services: individuals who have active health problems, poor psychological well-being, and limited social support. However, including other recruitment strategies, such as the Registry-based approach described in the present study, has the potential to reach patients and survivors who might not otherwise seek out care despite having significant levels of distress.