Introduction

Computer tailoring and motivational interviewing are promising intervention methods to motivate people to improve their health behaviors. Tailored behavior change interventions provide participants with personalized feedback on their behaviors and related determinants and personalized advice on behavior change based on individual assessment [1]. Computers facilitate low-cost personalization of health messages for large groups of people [2]. Recent reviews suggest that computer tailoring can positively affect diet and physical activity (PA) and can outperform generic information [3–5]. The rationale for these effects is that computer-tailored education is more likely to be read, remembered, and experienced as personally relevant than standard material [6].

Motivational interviewing is defined as a “collaborative, person-centered form of guiding to elicit and strengthen motivation for change” (p. 137) [7]. It is based on the premise that people become more committed to what they say themselves than what they hear from others [8]. Indeed, letting people express arguments for change (“change talk”) predicts behavior change [9, 10]. Although a limited number of studies have focused on the effects of motivational interviewing on diet and PA, reviews report positive effects on these behaviors [11–19], and motivational interviewing outperforms traditional advice regarding a broad range of behavioral problems and diseases [13, 15, 16].

While both computer tailoring and motivational interviewing have shown efficacy, few studies have simultaneously compared their effects [20, 21]. Consequently, the Vitalum study [22] compared the effects of computer-generated tailored print communication (TPC) to telephone-delivered motivational interviewing (TMI) on PA and fruit and vegetable consumption in people aged 45 to 70. Vitalum’s efficacy was tested on participants with different levels of education. Because TMI is more interactive and demands lower literacy abilities than TPC, it may have relatively more impact on people with a lower education level [2]. We also examined the effects on participants with or without hypertension, since hypertension is a risk factor for cardiovascular morbidity and mortality [23] and such higher risk may modify the intervention effects. In line with the protection motivation theory [24, 25], people without hypertension may be less motivated to change behavior and may benefit more from motivation-enhancing interventions like motivational interviewing [26–28] than people with hypertension. The largest effect is anticipated for participants receiving a combination of TPC and TMI, as we assume that the positive aspects of both will be mutually reinforcing.

Pedometers may be considered an additional strategy to stimulate PA [29]. To examine the effect of this device on PA, half of the Vitalum participants in the intervention groups (TPC, TMI, and combined) randomly received a pedometer during the intervention period; the remaining participants received the pedometer after the study.

The following hypotheses were tested:

  1. Participants receiving TPC, TMI, or a combination of both will show greater behavioral change than those of the control group.

  2. Participants receiving a combination of TPC and TMI will show greater behavioral change than those receiving either individually.

  3. Participants with a low education level will benefit more from TMI than from TPC.

  4. Participants without hypertension will benefit more from TMI than from TPC.

  5. Participants who received a pedometer during the study will increase their level of PA more than participants who received this device after the study.

Methods

A detailed design description of Vitalum can be found elsewhere [22]. The Vitalum study was approved by the Medical Ethics Committee of Maastricht University and the University Hospital Maastricht and is registered with the Dutch Trial Register (NTR1068). All participants provided written informed consent prior to their inclusion in the study.

Participants

Vitalum’s potential participants (n = 6,420) were randomly selected from 23 Dutch general practices [30, 31] using five recruitment aims: (1) aged 45–70; (2) ~50% diagnosed by their general practitioner as hypertensive according to the International Classification of Primary Care (ICPC code K86 or K87 for hypertension without or with organ damage, respectively) [32–34]; (3) ~50% male; (4) not participating in other studies according to the general practice database; and (5) only one person per address. The general practices ensured participants’ suitability, according to their physical and psychological condition, before they were invited. The remaining participants (n = 5,545) received a written invitation from their general practice, which explained the content of the study and group assignment. Those who consented (n = 2,881) received a written baseline questionnaire. Participants were recruited and treated in batches; 18 months passed between the first and last batches (March 2005–August 2006). People who returned the baseline questionnaire were included if they failed to meet at least two of three Dutch public health guidelines: PA (being physically active at moderate intensity for ≥30 min a day on at least 5 days per week [35]) and consumption of either fruit (eating at least two servings of fruit daily [36]) or vegetables (at least 200 g of vegetables a day [36]). Whether participants failed to meet a guideline was based on a multiple-item and a single-item self-report measurement [22, 37]. Other studies have shown that multiple-item measurements may result in people overestimating their behavior [38–47] and that adjusting the results of a multiple-item measurement with those of a single-item measurement reduced the overestimation [37, 38, 40, 41, 45, 46]. Participants were therefore considered as failing a guideline if they did so according to at least one of the two measurement types. 
In total, 1,629 (63%) of the 2,568 participants who filled out the baseline questionnaire met the inclusion criteria and were randomized after stratification based on a general practice’s diagnosis of hypertension derived from the patient’s record.
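The inclusion rule described above can be sketched in a few lines. This is a minimal illustration, not the study's actual screening code: the function names and the boolean representation of "meets the guideline" per measurement type are assumptions made for the example.

```python
def fails_guideline(meets_multi_item: bool, meets_single_item: bool) -> bool:
    """A guideline counts as failed if at least one of the two
    measurement types (multiple-item, single-item) says it is not met."""
    return (not meets_multi_item) or (not meets_single_item)


def is_eligible(measurements: dict) -> bool:
    """Eligible when at least two of the three guidelines are failed.

    `measurements` maps a guideline name to a pair
    (meets_multi_item, meets_single_item); this layout is hypothetical.
    """
    failed = sum(fails_guideline(multi, single)
                 for multi, single in measurements.values())
    return failed >= 2


# Hypothetical respondent: PA fails on the single item only,
# fruit is met on both measures, vegetables fail on both.
example = {
    "physical_activity": (True, False),
    "fruit": (True, True),
    "vegetables": (False, False),
}
print(is_eligible(example))
```

Because a guideline is treated as failed when either measure says so, the rule errs toward inclusion, which is consistent with the stated aim of correcting for overestimation in the multiple-item measure.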

Power analysis, assuming a small effect size d = 0.30, a power of 90%, a two-tailed alpha of 1% to correct for multiple testing, an intraclass correlation of 0.02, and an average sample size of 70 participants per general practice, indicated a total sample size of 1,600 participants at baseline (for further details, see [22]). Assuming that half would refuse participation and that 40% of the remaining half would meet behavioral recommendations (implying their exclusion from the trial), we aimed at approaching 5,400 people.
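The recruitment target in the last sentence follows from simple arithmetic, sketched below. The function name and parameters are illustrative assumptions; only the figures (1,600 participants, 50% refusal, 40% already meeting recommendations) come from the text, and the power calculation itself is not reproduced here.

```python
import math


def people_to_approach(target_n, refusal_rate, already_meeting_rate):
    """Number of invitations needed so that, after refusals and
    exclusion of people already meeting the recommendations,
    about `target_n` participants remain."""
    usable_fraction = (1 - refusal_rate) * (1 - already_meeting_rate)
    return math.ceil(target_n / usable_fraction)


needed = people_to_approach(1600, 0.50, 0.40)
print(needed)
```

Under these assumptions roughly 5,334 invitations would suffice, so the stated aim of approaching 5,400 people leaves a small margin (5,400 × 0.5 × 0.6 = 1,620).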

Design

Participants were randomized by a computer program into one of four groups (see Fig. 1 for study design and timeline); researchers were aware of the group assignment. The groups were: (1) TPC: participants received four tailored letters, of which #1 and #3 focused on PA and #2 and #4 on fruit and vegetable consumption; (2) TMI: participants received four telephone calls based on motivational interviewing and chose the order of the conversation topics in #1 and #3 (if PA was preferred in #1, fruit and vegetable consumption was discussed in #2, and vice versa); (3) combined: participants received two tailored print letters and two telephone motivational interviews, with #1 (letter) and #2 (interview) addressing PA and #3 (letter) and #4 (interview) addressing fruit and vegetable consumption; and (4) control: after the intervention period, participants received one tailored letter addressing PA and fruit and vegetable consumption, based on the last follow-up questionnaire.

Fig. 1 Vitalum design and timeline. ICPC International Classification of Primary Care, K86 hypertension without organ damage, K87 hypertension with organ damage, TPC tailored print communication, TMI telephone motivational interviewing, Combined combination of TPC and TMI

Intervention group participants received their four intervention components at 5, 13, 30, and 43 weeks after returning the baseline questionnaire. An intermediate telephone survey after two intervention components (week 25) was used to assess all participants’ behaviors and determinants to gather the most recent information for the computer-tailored intervention and assess the intermediate effects of the interventions. Participants in the TPC group were called (week 39) an additional time to collect the most recent data on behavior and its determinants for the fourth tailored letter. Intervention effects were assessed by two follow-up printed questionnaires (weeks 47 and 73). All questionnaires were mailed to the participants’ home addresses. Research staff called participants who partially completed the printed questionnaires to finish data collection. Group assignment was concealed for calling staff for the baseline and second follow-up questionnaire (week 73), but not for the first follow-up questionnaire (week 47) because this questionnaire also contained intervention process questions.

To examine the effect of receiving a pedometer, participants were given a pedometer with brief instructions encouraging them to gradually increase their number of steps to at least 10,000 a day [48]. Half of the participants in the TPC, TMI, and combined groups received their pedometer before the third intervention component (week 29); the remainder received the pedometer after the last follow-up. Participants in the control group also received their pedometer after the last follow-up to facilitate comparison across groups. Prior to randomization into two groups (receiving a pedometer during or after the study), participants were stratified on hypertension status.

Interventions

Tailored Print Communication

To increase the chance that participants read the tailored information [3], it is important to limit the number of tailoring variables to those that predict the greatest amount of change in the outcome behavior [49]. The TPCs and selection of tailoring variables were based on focus group interviews, prior effective theory-based computer-tailored interventions [50–53], the I-Change Model [54–56] and Control Theory [57]. The tailoring variables were current behavior, awareness, age, gender, stage of change, attitude, self-efficacy expectations, and action plans. Focus group interviews with the target population revealed that attitudes and self-efficacy expectations were important variables related to the outcome behaviors. The I-Change Model was chosen because it includes attitudes and self-efficacy expectations. In addition, it acknowledges further cognitive determinants (e.g., awareness, social influence, and action plans) as well as motivational phases (stage of change) [55]. Perceived social influences, as part of the I-Change Model [54–56], are sometimes also used in tailored interventions [3]. However, to reduce the length of the tailored feedback letters, they (with the exception of normative feedback) were not included on the grounds of lack of evidence that such social influences are associated with stage transitions. To refine the concept of awareness, Control Theory [57] was used because it states that behavioral regulation is stimulated by feedback control processes, which can be triggered by, for example, comparing participants’ behavior with their self-rated intake (i.e., awareness), with the behavior of others of the same age and gender, and with their own behavior over time. Consequently, current behavior, awareness, age, and gender were used to tailor the information in the letters. The I-Change Model acknowledges that individuals can move forward and backward through different motivational phases towards behavior change. 
Because each stage transition is associated with different determinants, it is recommended that interventions be stage specific [2]. Consequently, the Vitalum participants all received stage-matched advice [58–60]. Earlier studies have shown pros, cons, self-efficacy expectations, and action plans to be the most important variables associated with stage transitions, with pros and cons being more important in the transitions through earlier stages, and self-efficacy expectations and action plans (i.e., planning specific actions needed to execute a health behavior goal) being more important in the transitions through later stages [61–64]. Thus, stage-matched advice focused on attitude (pros and cons), self-efficacy expectations, and action plans.

The letters, TPC1 and TPC2 (each three to six pages), were personalized with the participant’s name and included the following elements: introduction, specific behavioral feedback on targeted behavior, stage-matched advice to change behavior, and conclusions. Subsequent letters, TPC3 (two to four pages) and TPC4 (four to six pages) were also personalized and reinforced tailored feedback on behavioral progress and stages of change based on the intermediate survey data. They used similar strategies as in TPC1 and 2. TPC for participants in the combined group took the same format as those described above. The letters were mailed to the participants’ home addresses. More details on the TPC intervention are described elsewhere [22].

Telephone Motivational Interviewing

Vitalum’s motivational interviewing counselors received six 3-h training sessions by two certified trainers after which they had to perform one TMI conversation with adequate integrity according to the 1-Pass Coding System for motivational interviewing [65]. Eligible interviewers were bachelor’s and master’s students of Health Sciences or Psychology at Maastricht University. Of 39 finishing the training, 34 had adequate integrity and 16 were contracted to work on Vitalum. Counselors were aware of the group assignment of participants.

Interview protocols were established to enhance treatment integrity. The protocols were based on those used by Resnicow et al. in the Healthy Body Healthy Spirit trial [66, 67] and reflect the principles of motivational interviewing [68]: introduction, assess current behaviors and progress, discuss the public health guideline, assess and enhance motivation and self-efficacy for behavior change, assess readiness to change, summarize, and close session. Additional topics could be discussed, e.g., current situation and progress on action plans in subsequent interviews, the tailored letters (combined group), and the values clarification strategy [67].

Measurement

The primary outcome measures were PA and fruit and vegetable consumption, measured at baseline, the intermediate telephone survey and both follow-ups (weeks 25, 47, and 73, respectively).

PA was measured with 28 items from the modified Community Healthy Activities Model Program for Seniors PA questionnaire [69]. This assessed the frequency of an activity (times per week) and its duration (hours per week) and was validated with cardiorespiratory fitness (VO2 maximum) estimated by a submaximal treadmill test (Pearson’s r = 0.17, p < 0.05) in another study by Resnicow et al. [69]. Measured activities included walking leisurely or briskly and doing light or heavy housekeeping. Metabolic equivalents were determined for each activity on the basis of the PA compendium by Ainsworth et al. [70]. Metabolic equivalent levels were used as cut-offs to calculate the total number of weekly PA hours with at least a moderate intensity. Only activities with at least three metabolic equivalents were considered moderate for all participants [71]. The number of weekly PA hours with at least moderate intensity was used as the outcome measure.
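The outcome computation described above, summing weekly hours over activities that reach the moderate-intensity cut-off, can be sketched as follows. The metabolic equivalent (MET) values and hours in the example are illustrative placeholders and are not taken from the compendium or the questionnaire; the function name is likewise an assumption.

```python
MODERATE_MET_THRESHOLD = 3.0  # cut-off for "at least moderate" intensity


def weekly_moderate_hours(activities):
    """Sum hours per week over activities whose MET value reaches
    the moderate-intensity threshold.

    `activities` is a list of (met_value, hours_per_week) pairs.
    """
    return sum(hours for met, hours in activities
               if met >= MODERATE_MET_THRESHOLD)


# Placeholder values for one hypothetical respondent.
reported = [
    (2.5, 3.0),  # below the threshold, not counted
    (3.5, 2.0),  # counted
    (4.0, 1.5),  # counted
]
total = weekly_moderate_hours(reported)
print(total)
```

Note that the same threshold is applied to every participant, as stated in the text; age-specific cut-offs are not modeled in this sketch.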

Fruit and vegetable consumption was measured with a 16-item short questionnaire [72] assessing frequency (days per week) and quantity (servings/serving spoons per day) of vegetables (cooked and raw) and fruit (juice, tangerines, other citrus fruits, apples or pears, bananas, and other fruits). The questionnaire was validated by blood levels of carotenoids (Spearman’s r = 0.39, p < 0.001 for fruit; Spearman’s r = 0.24, p < 0.001 for vegetables) and vitamin C (Spearman’s r = 0.37, p < 0.001 for fruit; Spearman’s r = 0.26, p < 0.001 for vegetables) in another study by Bogers et al. [72]. Two tangerines or a small bowl of small fruit (e.g., grapes) were considered one piece of fruit. Frequency and quantity were used to determine daily consumption. The quantity of daily consumption was used as the outcome measure (fruit consumption in servings/day and intake of vegetables in grams/day; one piece of fruit is considered one serving).
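One plausible way to combine frequency and quantity into daily consumption is sketched below. The text does not give the exact formula, so the rule used here (days per week divided by 7, times the quantity on a consumption day) is an assumption, as is the gram weight of a serving spoon, which is passed in as a parameter rather than fixed.

```python
def daily_amount(days_per_week, quantity_per_day):
    """Average daily amount: share of days on which the item is
    consumed times the quantity consumed on such a day (assumed rule)."""
    return days_per_week / 7 * quantity_per_day


def daily_fruit_servings(items):
    """Total daily fruit servings over (days_per_week, servings_per_day) pairs."""
    return sum(daily_amount(days, qty) for days, qty in items)


def daily_vegetable_grams(items, grams_per_spoon):
    """Total daily vegetable intake in grams over
    (days_per_week, serving_spoons_per_day) pairs.
    `grams_per_spoon` is a hypothetical conversion factor."""
    return sum(daily_amount(days, qty) for days, qty in items) * grams_per_spoon


fruit = daily_fruit_servings([(7, 1.0), (0, 2.0)])
vegetables = daily_vegetable_grams([(5, 3.0)], grams_per_spoon=50)
print(fruit, round(vegetables, 1))
```

The conversion of tangerines and small fruit into pieces mentioned in the text would be applied to the quantities before they enter these functions.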

Cognitive behavioral determinants were used to describe the baseline sample and as covariates in the effect analyses. The determinants were measured for each behavior (with fruit and vegetables treated separately), using variables from the I-Change Model [54–56] and the Transtheoretical Model [60], with meeting the guideline as the target behavior: awareness, pros and cons, social support, modeling, self-efficacy expectations, action plans, habit strength, and stages of change.

Socio-demographic and lifestyle variables (described below) were used to describe the baseline sample and as covariates in the analyses of intervention effects. Sex, hypertension status, and age were used to select participants from the general practice database and were provided by the general practice only when participants agreed to participate. Because of hypotheses 3 and 4, education level (1 = low; less than secondary or vocational education; 2 = intermediate; secondary through pre-university education; and 3 = high; professional or university education) [73–75] and hypertension (0 = no hypertension; 1 = hypertension) were included as moderators. Socio-demographic and lifestyle variables measured were highest completed level of education, marital status, work situation, native country, presence of diabetes, smoking behavior, alcohol consumption, family history of cardiovascular disease, and body weight and height to calculate body mass index (kg/m²). Saturated fat intake was a secondary outcome and was used to describe the baseline sample and as a covariate in the analyses of intervention effects for the primary outcomes for PA and fruit and vegetable consumption.

Process Evaluation

In the first follow-up questionnaire, 47 weeks post-baseline, all participants were asked if they owned and used a pedometer (which half of the participants in the three intervention groups should have received in week 29).

The intermediate (week 25) and first follow-up (week 47) surveys contained process questions that addressed the quality of each TPC and TMI delivery. The results of successive interviews or letters were averaged into one outcome per participant, so that the participant was the unit of analysis. In addition, competence of interviewers in their use of motivational interviewing was evaluated with the motivational interviewing treatment integrity (3.0) code [76]. Integrity results will be described elsewhere.

Statistical Analyses

Full details on the statistical analyses can be found in the electronic supplementary materials, Appendix A. Residence of the participants was dichotomized into “region” and used as a covariate in the effect analyses. Because seasonal variation in eating and PA patterns may occur [77], the season in which participants filled out the baseline questionnaire was also used as a covariate (using dummy coding).

Baseline Differences

Baseline differences between intervention groups were assessed with the Statistical Package for the Social Sciences (SPSS) 15.0, using chi-square tests for categorical variables and ANOVA for interval variables, with a Bonferroni-corrected significance level (α/number of variables = 0.05/46 ≈ 0.001).
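The Bonferroni threshold quoted above can be reproduced with a one-line calculation; the function name is an illustrative assumption.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_alpha(0.05, 46)
print(round(threshold, 4))
```

The unrounded value is slightly above 0.001 (about 0.0011), so rounding to 0.001 makes the baseline comparison marginally stricter.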

Selective Dropout

The MLwiN software [78] was used to examine selective dropout with mixed logistic regression using PQL estimation. The mixed regression model had three levels, with measurements nested in participants nested in general practices. Dependence of dropout on general practice or participant was accommodated by allowing the intercept to vary randomly between general practices and between participants. As predictors of dropout, we used baseline variables without missing values (group, time of measurement, group by time of measurement interactions, age, gender, hypertension and region, and level of education).

Intervention Effects

Vitalum’s intermediate (week 25), short-term (week 47), and long-term (week 73) efficacy was analyzed with mixed linear regression using maximum likelihood estimation by SPSS. SPSS was used because it facilitates specifying the covariance structure for the repeated measures. These were intention-to-treat analyses since all available measurements of all randomized participants were analyzed [79, 80] without imputation for missing measurements. Estimates obtained with maximum likelihood and with multiple imputation converge to the same results in large samples like ours [80]. The mixed model again had three levels: general practices, participants, and measurements (baseline, weeks 25, 47, and 73). General practice effects were included as a random intercept, and participant effects were included by choosing an unstructured 4 × 4 covariance matrix for the repeated measures. Additionally, treatment by general practice interaction was checked by random slopes for the treatment effects but was never found and was therefore dropped from the model. The dependent variables were the primary outcomes (PA in hours per week, fruit in servings a day, and vegetables in grams a day). Intervention group was included as a between-subject factor using TPC, TMI (indicating treatment type), and TPC × TMI as predictors, and time of measurement as a within-subject factor using dummy coding with baseline as reference category and a dummy indicator for each time point. Region, season, baseline socio-demographic and lifestyle variables, baseline cognitive behavioral determinants, and baseline behaviors measured with multiple items were included as between-subject covariates (except for baseline behavior of the outcome itself, which was included as a repeated measure) because these variables were related to the outcome behavior or cardiovascular disease. 
By including these covariates, power and precision of treatment-effect testing and estimation are improved due to reduced residual outcome variance. Education level and hypertension were included as moderators because of hypotheses 3 and 4. Having received a pedometer during the intervention period was included as a within-subject factor. Interaction between intervention group and pedometer was tested in a separate analysis excluding the control group which did not receive a pedometer before the last follow-up. In view of multiple testing, an alpha of 0.01 was used to test the hypotheses. First, non-significant interaction effects (α = 0.01) were hierarchically excluded from the model except for effects of primary interest (TPC × time and TMI × time). Second, non-significant covariates (α = 0.10 to prevent type II errors) were excluded, except for education level, hypertension status and having received a pedometer because of the hypotheses, and age and gender because these are broadly accepted covariates. Group and time were also never excluded as predictors of interest. Depending on the absence or presence of TMI by TPC interaction, further model simplifications were then tested with equality constraints [81]. For details, please see the “Results” section.
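The fixed-effects coding of treatment and time described above can be illustrated with a small sketch. It only builds the predictor values for one observation; the random intercept for general practice, the unstructured covariance matrix, and all covariates are omitted, and the variable names are assumptions made for the example rather than the names used in the original analysis.

```python
TIME_POINTS = ("baseline", "week25", "week47", "week73")

# 2 x 2 coding of the four groups: (TPC yes/no, TMI yes/no).
GROUPS = {
    "control": (0, 0),
    "TPC": (1, 0),
    "TMI": (0, 1),
    "combined": (1, 1),
}


def design_row(group, time_point):
    """Return the treatment, time, and treatment-by-time predictors
    for one measurement, with baseline as the reference category."""
    tpc, tmi = GROUPS[group]
    row = {"TPC": tpc, "TMI": tmi, "TPCxTMI": tpc * tmi}
    # One dummy per follow-up time point (baseline is the reference).
    for t in TIME_POINTS[1:]:
        row[t] = int(time_point == t)
    # Treatment-by-time interactions carry the intervention effects.
    for t in TIME_POINTS[1:]:
        for name in ("TPC", "TMI", "TPCxTMI"):
            row[f"{name}x{t}"] = row[name] * row[t]
    return row


print(design_row("combined", "week47"))
```

With this coding, the TPC × time and TMI × time terms describe how each intervention changes the outcome relative to baseline, and the TPC × TMI × time terms capture any departure from additivity in the combined group.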

Missing Values and Data Checking

Missing values on any covariate were replaced if the following conditions were met. For scales, a missing item score was replaced with the person’s average on all other items of that scale, provided at most 20% of the items were missing. For variables measured with one item, missing values were imputed using multiple regression [82] provided less than 20% of the sample was missing on that variable. The primary outcomes were checked for normality. Fruit and PA were square-root transformed because of positive skewness. Heteroscedasticity of residuals was checked and not found. Predictors and covariates were checked for multicollinearity, which was not found either.
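The person-mean replacement rule for scale items can be sketched as follows. This is an illustration under assumptions: missing item scores are represented as `None`, and returning `None` for the whole scale when more than 20% of items are missing is a choice made for the example, since the text does not say how such cases were handled.

```python
def impute_scale_items(items, max_missing_fraction=0.20):
    """Replace missing item scores (None) with the person's mean on the
    other items of the scale, provided at most `max_missing_fraction`
    of the items are missing. Returns None otherwise (assumed handling)."""
    n_missing = sum(value is None for value in items)
    if n_missing == 0:
        return list(items)
    if n_missing / len(items) > max_missing_fraction:
        return None
    observed = [value for value in items if value is not None]
    person_mean = sum(observed) / len(observed)
    return [person_mean if value is None else value for value in items]


# One of five items missing (20%): the gap is filled with the mean of the rest.
print(impute_scale_items([4, None, 2, 3, 3]))
# Two of five items missing (40%): the scale is left unimputed.
print(impute_scale_items([4, None, None, 3, 3]))
```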

Results

Baseline Characteristics

Baseline values of the Vitalum sample (n = 1,629) in each group are shown in Table A 1 in the electronic supplementary materials, Appendix B. No significant differences between groups at baseline were found (all p > 0.05). Mean age of the sample was 57.15 years (SD = 7.13) and half of the participants (52%) were classified as hypertensive, which is due to the inclusion criteria and does not reflect prevalence of hypertension in the Dutch population. More than half of the sample was men (55%), of whom 54% had a low and 23% had an intermediate education level. Participants were physically active with moderate intensity for 4.65 h a week (SD = 3.83). Participants in the sample consumed 165.06 g (SD = 82.45) of vegetables and 2.08 (SD = 1.64) servings of fruit a day. Note that the baseline values for PA and fruit consumption were above the guidelines (PA, ≥2.5 h/week; fruit, ≥2 servings/day). The explanation for this seeming protocol deviation is that the inclusion of participants in Vitalum was based on a single- and multiple-item measure (for details, see the “Participants” section of the “Methods” section and [22]).

Dropout

The Vitalum study took place between March 2005 and March 2008. Of the 1,629 participants completing the baseline questionnaire, 1,509 (93%) finished the intermediate survey, and 1,201 (74%) and 1,186 (73%) filled out follow-ups 1 and 2. The additional survey (week 39) for the TPC group was completed by 356 participants (88% of baseline). Reasons for dropout were non-response (38%), intervention related (15%), health related (8%), or other reasons (38%; e.g., not having time to participate or no interest). Dropout was significantly higher at 47 (26%) and 73 weeks (27%) than at 25 weeks (7%) after baseline. Also, dropout was significantly higher among participants that received TPC than in participants that received no TPC from week 47 on (week 25, 8% vs. 6%; week 47 and 73, 32% vs. 22%). In addition, there were more dropouts in participants with a low education level than in participants with higher educational levels (25% vs. 17%). Possible bias due to group and education effects on dropout was adjusted for in the effect analyses by including all dropouts and all predictors of dropout into the analyses of each outcome. Dropout was unrelated to age, gender, hypertension, and region.
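The retention and dropout percentages reported above follow directly from the completion counts, as the short check below shows. The dictionary layout and function name are assumptions; the counts are those given in the text.

```python
BASELINE_N = 1629
completed = {"week 25": 1509, "week 47": 1201, "week 73": 1186}


def retention_and_dropout(n_completed, n_baseline=BASELINE_N):
    """Return (retention %, dropout %) rounded to whole percentages."""
    retention = round(100 * n_completed / n_baseline)
    return retention, 100 - retention


for label, n in completed.items():
    retention, dropout = retention_and_dropout(n)
    print(f"{label}: retention {retention}%, dropout {dropout}%")
```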

Efficacy of TPC, TMI, and Their Combination

Full details on the efficacy of TPC, TMI, and their combination can be found in the electronic supplementary materials, Appendix C. Primary outcome averages per group and time point are shown in Table 1 (see also Figures A 1 to A 3 in the electronic supplementary materials, Appendix D). Likelihood ratio tests showed the interactions of group by time with hypertension and of group by time with educational level to be non-significant. After removal of these from the model, all lower order interactions of group or time with hypertension or educational level were likewise tested and removed because of non-significance. This also applied to covariates (except for education level, hypertension status, having received a pedometer, age, and gender) and baseline group effects [83, 84]. The final model thus contained time and group by time effects plus significant covariates. This final model showed (nearly) significant time by TPC by TMI interaction for vegetable consumption (p < 0.01) and fruit intake (p = 0.02), but not for PA (p = 0.57). This implied that the TPC effect depended on the absence or presence of TMI and vice versa. Therefore, pairwise comparisons between all four conditions were made by switching from a 2 × 2 two-way model for treatment effects (TMI yes/no, TPC yes/no) to a one-way model with four groups. The results are shown in Table 2 and suggested testing the following model simplifications against the final model on which Table 2 was based:

Table 1 Observed means and standard deviations (SD) of primary outcome measures per time point and group

Table 2 Unstandardized regression coefficients (B), standard errors (SE), effect sizes (ES; Cohen’s d), and 95% confidence intervals (95% CI) of group comparisons per outcome per time point and overall, i.e., averaged across time points

  • Equality of an intervention effect across time points. This was done by replacing in all group × time terms the three time dummies (weeks 25, 47, and 73) by a single indicator for time (post: 0 = baseline, 1 = else). This model simplification was tested with a likelihood ratio test.

  • Equality of the intervention effects (TPC = TMI = combined) at any given time point. This was done by replacing the three group indicators with a single indicator (treat: 0 = control group, 1 = TPC/TMI/combined). With a likelihood ratio test this model was compared to the preceding models.
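The two simplifications above rest on recoded indicators and on likelihood ratio tests between nested models fitted with maximum likelihood. The sketch below shows both pieces using only the standard library. The deviance values and the degrees of freedom in the usage example are hypothetical and are not results from the study; the function names are assumptions as well.

```python
import math


def post_indicator(time_point):
    """Single time indicator replacing the three time dummies."""
    return 0 if time_point == "baseline" else 1


def treat_indicator(group):
    """Single treatment indicator replacing the three group indicators."""
    return 0 if group == "control" else 1


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df,
    built up via Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                 # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))      # Q(1/2, y)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(deviance_constrained, deviance_full, df_difference):
    """Compare nested models via the difference in deviance
    (-2 log-likelihood); returns (test statistic, p value)."""
    statistic = deviance_constrained - deviance_full
    return statistic, chi2_sf(statistic, df_difference)


# Hypothetical deviances for a constrained and a full model.
statistic, p_value = likelihood_ratio_test(1010.0, 1000.0, df_difference=6)
print(round(statistic, 2), round(p_value, 3))
```

A constrained model is retained when the p value exceeds the chosen alpha (0.01 in this study), meaning that the simpler model does not fit significantly worse.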

The results of the model simplifications can be summarized as follows. For all outcomes (PA, intake of fruit and vegetables), the simplest model with equality of the intervention effects (TPC = TMI = combined) and equality of that effect at all time points (weeks 25, 47, and 73) did not fit significantly worse than the two more complicated models. Therefore, the constrained model is preferred, and the resulting effect sizes and significances are reported in Table 2. This means that all three interventions, TMI, TPC, and combined, can be regarded as equally effective, and participants in these groups increased their level of PA (hours/week) and intake of fruit (servings/day) and vegetables (grams/day) significantly more than those in the control group. It may be observed that Table 2 and Figures A 1 to A 3 (electronic supplementary materials, Appendix D) suggest some differences in favor of TPC. Because there was also slightly more dropout in the TPC and combined groups than in the TMI and control groups, the best conclusion appears to be rough equivalence of the three interventions. Finally, we did not find significant outcome variation between general practices for any primary outcome (all p > 0.01 one-tailed).

Hypotheses

Hypotheses were tested two-tailed to allow detection of unexpected effects and to be consistent with the two-tailed confidence intervals. Although hypotheses 1 and 2 were formulated in terms of change, evaluations were made in terms of group differences at the first follow-up measurement because no significant differences between groups were found at baseline (as expected given the randomized treatment assignment).

Hypothesis 1

“Participants receiving TPC, TMI, or a combination of both will show greater behavioral change than those of the control group,” could be tested by looking at Table 2. The hypothesis was supported for TPC on all outcomes, TMI on intake of fruit, and the combined group on PA. That is, participants in the intervention groups increased these behaviors significantly more than the control group at the first follow-up measurement. The difference between the combined group and the control group was borderline significant for the other outcomes (p = 0.03 and p = 0.02 for intake of fruit and vegetables, respectively).

Hypothesis 2

“Participants receiving a combination of TPC and TMI will show greater behavioral change than those receiving either individually,” could likewise be tested by the relevant pairwise comparisons in Table 2. This hypothesis was not supported for any outcome.

Hypothesis 3

“Participants with a low education level will benefit more from TMI than from TPC,” was not supported because we found no treatment type by time by education interactions for any outcome (for details, see Table A 2 in the electronic supplementary materials, Appendix E) and TPC and TMI appeared equally effective.

Hypothesis 4

“Participants without hypertension will benefit more from TMI than from TPC,” was not supported because we found no treatment type by time by hypertension interactions for any outcome (for details, see Table A 3 in the electronic supplementary materials, Appendix E).

Hypothesis 5

“Participants who received a pedometer during the study will increase their level of PA more than participants who received this device after the study,” was not supported (booster main pedometer effect: β = −0.02, SE = 0.04; p = 0.72). Given the absence of a significant pedometer main effect and the absence of substantive or consistent differences between TPC, TMI, and their combination, testing intervention by pedometer interaction was superfluous.

Process Evaluation

Of the 1,201 participants who filled out the first follow-up questionnaire (week 47), 514 (43%) indicated pedometer possession, of whom 322 (63%) stated that they had used it.

The TMIs lasted on average 24.45 min per participant (SD = 6.10). Of the 398 participants in the TMI group, 201 (51%) chose to discuss fruit and vegetable consumption instead of PA in the first interview. In the third interview in the TMI group, 183 (53%) of the 346 participants selected PA as the topic instead of fruit and vegetable consumption.

Results of participant satisfaction with and evaluation of TPC and TMI are shown in Table A 4 in the electronic supplementary materials, Appendix F. More participants remembered receiving TMI than TPC. Most of the participants who remembered receiving a letter read at least one letter (99% in TPC group; 98% in combined group) and kept at least one letter (80% in TPC group; 75% in combined group). More participants discussed the interviews with others than allowed others to read the letters. Participants who received TMI rated their overall satisfaction as higher than participants who received TPC. Also, participants who received TMI appeared to remember the content of the intervention better than participants who received TPC. Within the combined group, the content of the interviews was perceived as more interesting than the content of the letters.

Discussion

This paper has described the efficacy of tailored (TPC) and motivational interviewing (TMI) health communications. The results of this study provide evidence that TPC, TMI, and their combination are equally effective in promoting self-reported PA and fruit and vegetable consumption in middle-aged adults, including those with lower educational levels and with elevated health risk. Vitalum is one of few studies that have shown positive effects of computer tailoring and motivational interviewing among middle-aged adults [20, 85–89]. Vitalum also showed simultaneous positive effects for multiple behaviors, as has been shown by others [21, 52, 90–92].

Our first hypothesis, i.e., participants receiving TPC, TMI, or a combination of both will have larger behavioral changes than those of the control group at the first follow-up measurement, was not supported in the TMI group for PA and intake of vegetables. This was possibly caused by a power problem (all p values < 0.12) because averaged across weeks 25, 47, and 73, the differences between the TMI and control group (Table 2) for these behaviors were borderline significant. Contrary to hypothesis 2, no extra effect of the combined intervention compared to the separate interventions was found. It could be that the intervention dose in Vitalum (two tailored letters and two motivational interviews) was too low to reinforce the positive effects of both interventions. Contrary to the predictions of hypotheses 3 and 4, no differences in effect were found between participants with and without hypertension or between participants with different education levels. It may be that both interventions were tailored enough to appeal to all subgroups [3], as other studies have also found similar intervention effects across socio-demographic groups [26, 52, 66]. In addition, more than 50% of the sample was “in preparation” or in “higher stages” of change for the outcome behaviors, which may explain why TMI did not outperform TPC in participants without hypertension. Perhaps a research setting in general reduces the effects of motivational interviewing because participants may be more motivated than the general population. Contrary to hypothesis 5, no effects of a pedometer were found on PA in Vitalum, although using such an instrument is associated with increased PA [29]. The absence of a pedometer effect might be because this device was distributed as a gift and not as an intervention instrument; the recipients were not asked to monitor their steps [29].

Effect sizes (Cohen’s d) found in Vitalum were similar in magnitude to those found in earlier tailoring studies [3, 5], but were somewhat smaller than those reported in earlier motivational interviewing studies [12, 18, 88, 93, 94]. The latter may be due to lower treatment integrity results [95] or the manual-based motivational interviews [13]. Effect sizes (Cohen’s d ranged from 0.15 to 0.18) were also somewhat smaller than anticipated during power calculations (Cohen’s d = 0.30). However, because we found neither treatment by general practice interaction nor a general practice effect (so that the intraclass correlation was 0 rather than 0.02 as anticipated, see [22]), the present study still has enough power to detect significant differences with small effect sizes (90% power for an effect size d = 0.20, 80% for d = 0.17, and 66% for d = 0.15).
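The relation between effect size, intraclass correlation, and power described above can be illustrated with a minimal sketch. It is not the power calculation used in Vitalum: it assumes a simple two-tailed, two-group z-test rather than the mixed regression model, and the group size (400) and the number of participants per general practice (50) are hypothetical values chosen only for illustration. It is therefore not expected to reproduce the percentages reported above; it only shows why an intraclass correlation of 0 instead of 0.02 leaves more power for small effect sizes.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(cluster_size, icc):
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def two_group_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed two-sample z-test for Cohen's d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative
    ncp = d * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


def clustered_power(d, n_per_group, cluster_size, icc, alpha=0.05):
    """Power after shrinking the sample to its effective size under clustering."""
    n_effective = n_per_group / design_effect(cluster_size, icc)
    return two_group_power(d, n_effective, alpha)


# Hypothetical design values (assumptions, not taken from the study)
N_PER_GROUP = 400
CLUSTER_SIZE = 50

for d in (0.15, 0.17, 0.20, 0.30):
    p_icc0 = clustered_power(d, N_PER_GROUP, CLUSTER_SIZE, icc=0.0)
    p_icc2 = clustered_power(d, N_PER_GROUP, CLUSTER_SIZE, icc=0.02)
    print(f"d = {d:.2f}: power {p_icc0:.2f} (ICC = 0) vs {p_icc2:.2f} (ICC = 0.02)")
```

With an intraclass correlation of 0 the design effect equals 1, so no sample size is lost to clustering; any positive intraclass correlation reduces the effective sample size and hence the power at a given effect size.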

Vitalum’s design was inspired by the North Carolina Strategies for Improving Diet, Exercise and Screening (NC STRIDES) study [20, 96], a randomized trial of TPC and TMI among cancer survivors and non-cancer affected participants. With regard to increasing fruit and vegetable consumption, both Vitalum and NC STRIDES found positive effects for the combined group compared to the control group. However, NC STRIDES did not find significant differences between TPC and control, or between TMI and control [20], whereas Vitalum did. This was probably caused by the smaller sample size in NC STRIDES. In addition, the effects of the combined group were double the effects of the TPC and TMI groups in NC STRIDES [20, 96], while the combined group in Vitalum was as effective as the TPC and TMI groups. This could be explained by the intervention dose, which was twice as high in the combined group in NC STRIDES as in Vitalum. Third, NC STRIDES did not find intervention effects for PA, whereas Vitalum did. This was probably caused by participants in NC STRIDES being older than in Vitalum (66 and 57 years on average, respectively) [96] or by environmental aspects.

Participants in the control group significantly increased their level of PA and intake of fruit and vegetables from baseline to the intermediate telephone survey. This may have been caused by them completing health behavior questionnaires, which may in itself have motivated them to change their behaviors [97, 98]. Another cause could be the use of the telephone to collect the intermediate data because using telephone interviews may be more subject to social desirability bias than a written questionnaire [99].

According to the motivational interviewing treatment integrity (3.0) code [76], an instrument to evaluate the fidelity of motivational interviewing delivery, there was room for improvement in the fidelity of motivational interviewing delivery (i.e., the percentage of open questions, the questions to reflections ratio and the percentage of motivational interviewing adherent responses) [95]. However, participant evaluations of Vitalum interventions were generally positive and showed that TMI had better evaluation scores than TPC. More participants remembered receiving TMI and its content than TPC. Participants who received TMI were more satisfied with the intervention and perceived it as more interesting than did the participants who received TPC. Also, more participants discussed the interview with others than allowed others to read the letter. These differences may be explained by the fact that we can exert more control in the delivery of TMI than of TPC or because TMI is delivered personally.

Because the primary goal of the present study was to examine change in PA and fruit and vegetable consumption, we have not examined intervention effects on potential determinants of behavior change or whether changes in these presumed determinants mediated the intervention effects on health behaviors. Inclusion of such further analyses was not possible within the word limits of the journal, but they will be published elsewhere. Mediation could be expected through the measured determinants (awareness, attitude, social influences, self-efficacy expectations, action plans, habit strength, and stages of change) as was found in earlier studies (e.g., [100–103]).

Limitations

The study has several limitations. First, long questionnaires were used to measure behaviors, psychosocial determinants, and socio-demographic and lifestyle variables [2]. The length may have annoyed participants, resulting in dropout or invalid results [104]. Although responses to the intermediate (93%), first (74%), and second (73%) follow-up measurements were adequate, questionnaires were sometimes only partly completed, in which case we called participants to complete data collection. This could have caused participants to give socially desirable responses. However, since the completion of data collection was executed with less than 10% of the participants, we expect social desirability bias due to assisted completion of data collection to be limited. Second, we could not use more objective measures such as biomarkers for fruit and vegetable consumption [72, 105] or an activity monitor for PA [106] to validate the self-report questionnaires because they were considered too time-consuming and expensive [47, 107]. We thus relied on self-report measurements, which require participants to have good memories and estimation skills, and may therefore result in measurement error, for example, because participants give socially desirable answers or lack valid recall [108, 109]. These measures also lack detail and specificity because they tend to focus on common foods and activities. Behavioral performance may thus be over- or underestimated [109]. We chose to use self-report questionnaires as these are recommended for and most often used in large intervention studies with multiple measurements per participant over time [110]. Third, when two interventions are compared in an efficacy study, elements of both interventions are controlled as far as possible to maximize internal validity. Yet, despite the use of a randomized controlled trial as the design for the Vitalum study, full control of experimental factors is almost impossible.
Comparing TPC with TMI is complex because many factors have to be taken into account, the most important of them being intervention delivery (nonperson vs. person-delivered), intervention medium (letter vs. telephone), audiovisual aspects of the medium (visual vs. audio), and theoretical basis (theory-based vs. clinical-based). For example, TPC may be easier to evaluate than TMI because it allows for more control over the intervention delivery with regard to layout and content. In our study, the delivery of TMI was less controlled because there were multiple counselors and because the success of the personal delivery may also depend on the match between the counselor and the client (e.g., conversation tone, ambiance, or language). Keeping in mind the external validity of our interventions, which may be limited when maximizing internal validity, we tried to enhance internal validity by making certain choices for some factors. For instance, because a computer-tailored intervention is not usually person-delivered, we chose a more distant medium (i.e., the telephone) for motivational interviewing delivery. Another example was the use of a TMI counseling protocol to control for any idiosyncratic factors involved in personal delivery, in order to make it more similar to the nonperson-delivered TPC intervention. Nevertheless, because TPC and TMI differed on more than one factor, the comparison of intervention efficacy may be restricted to the choices we made with respect to the delivery and medium of the interventions. Fourth, using the telephone both for TMI and the intermediate survey seemed to confuse participants because some participants thought the intermediate survey was the TMI. To avoid confusion, we recommend that future studies avoid using the same delivery method for both measurement and intervention. 
Fifth, dropout was higher in participants receiving a tailored letter (TPC and combined group) than in participants receiving no tailored letter (TMI and control group). As stated earlier, the fact that we can exert more control in the delivery of TMI than of TPC may explain this selective dropout. Dropout was also higher in participants with a low education level than in those with higher levels of education. In case of non-ignorable dropout (i.e., dropout depending on unmeasured outcome variables, known as missingness not at random), the mixed linear regression analyses could be biased. However, the analyses were intention-to-treat [80], including all available data from dropouts. Treatment group and education level were always included as predictors in the outcome analyses, and dropout did not depend on other covariates or measured outcome variables. Thus, under the assumption of so-called missingness at random, the analyses of the present study were unbiased. Finally, selective samples are a common problem in intervention studies (e.g., [26, 50]) and limit external validity. Although dropout was not related to age, gender, hypertension status, and region, our scores on the baseline multiple-item health behaviors revealed that our participants were motivated to participate. This was confirmed by the fact that for PA as well as fruit and vegetable consumption, more than 50% of the sample was in preparation or higher stages of change.

Recommendations

TPC, TMI, or a combined version can be used to promote PA and fruit and vegetable consumption among middle-aged adults with different educational backgrounds, and with or without hypertension. Future research needs to examine the optimal dose of TPC and TMI. Additionally, long-term efficacy (>6 months after intervention delivery) of TPC and TMI needs further attention.