Data and variables
Baseline data were obtained from the Adolescent Health and Lifestyle Surveys of 1981, 1985, 1987, 1989 and 1991. Nationally representative samples of 12-, 14-, 16-, and 18-year-old Finns born on certain days in June, July and August were drawn each study year from the Population Register Center. The overall response rate was 82% (N=15,167). By age-sex group, it was 83% in 12- and 14-year-old boys (N=2502) and 91% in girls (N=2704), and 72% in 16- and 18-year-old boys (N=4680) and 86% in girls (N=5281). A self-administered questionnaire was sent in February, followed by two re-inquiries to non-respondents. The variables in this study were based on similar questions in each survey.
Follow-up data, the highest attained educational level, were obtained from the Register of Completed Education and Degrees containing information on every resident in Finland. Statistics Finland performed the data linkage according to a contract specifying the rights and duties of both parties. The study protocol was approved by the Data Protection Ombudsman.
Follow-up ended on 31 December 2001, when participants had reached the ages of 28 to 38 and most had completed their education. The variable educational level in adulthood was based on the person’s highest educational attainment [23]: higher degree-level tertiary or doctorate (16+ years in education), lower degree-level tertiary (14–16 years), lowest tertiary (13–14 years), upper secondary (11–12 years), basic (includes lower secondary) education (9–10 years), or no completed education. Each participant had a value in the Register. We excluded 125 (0.8%) baseline respondents who had died during the follow-up.
The baseline variables were categorized from most “favourable” to most “unfavourable” in terms of socioeconomic position, school career and health behaviours. The repeatability of the variables had been previously tested and shown to be good [24]. The following six constructs (five exogenous and one endogenous) formed the latent variables (measurement part) in our statistical models.
Family socioeconomic position (SEP) was described by the father’s or guardian’s occupation (Statistics Finland, 1989: upper white-collar employee, lower white-collar employee or farmer, blue-collar employee) and father’s or guardian’s educational level: high (over 12 years), middle (9–12 years), and low (at most 9 years). The correlations between father’s/guardian’s occupation and education (family socioeconomic position) varied between the age-sex groups from 0.74 to 0.79.
Family structure, measured by family type, was categorized as nuclear (living with both parents) and other.
School career in adolescence was measured by school attainment at ages 12 and 14, based on the end-of-term school report. Adolescents were asked whether their report was much better than the class average, slightly better, average, slightly below average, or much below average. Educational track was used for 16- and 18-year-olds, some of whom had finished school. According to the type of school and school attainment, respondents were classified into seven categories presumed to predict their education in adulthood, the first category having the highest probability of reaching a high level of education in adulthood and the seventh the lowest: upper secondary school (1) with above-average school achievement, (2) with average achievement, (3) with below-average achievement; vocational or other schools (4) with above-average school achievement, (5) with average school achievement, (6) with below-average school achievement; (7) not attending school.
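The seven-category educational track described above can be illustrated with a minimal Python sketch. The string labels (`upper_secondary`, `vocational_or_other`, `above_average`, and so on) and the function name `educational_track` are hypothetical, chosen only for illustration; the original coding of the survey responses is not given in the text.

```python
from typing import Optional

# Hypothetical labels for school type; the survey's actual codes are not given.
UPPER_SECONDARY = "upper_secondary"
VOCATIONAL_OR_OTHER = "vocational_or_other"

# Offset within a school type: above average first, below average last.
_ATTAINMENT_OFFSET = {"above_average": 0, "average": 1, "below_average": 2}


def educational_track(school_type: Optional[str], attainment: Optional[str]) -> int:
    """Return the track category, 1 (most favourable) to 7 (least favourable).

    school_type is None for respondents not attending school (category 7).
    """
    if school_type is None:
        return 7  # not attending school
    if attainment not in _ATTAINMENT_OFFSET:
        raise ValueError(f"unknown attainment: {attainment!r}")
    if school_type == UPPER_SECONDARY:
        base = 1  # categories 1-3
    elif school_type == VOCATIONAL_OR_OTHER:
        base = 4  # categories 4-6
    else:
        raise ValueError(f"unknown school type: {school_type!r}")
    return base + _ATTAINMENT_OFFSET[attainment]
```

For example, under these assumed labels a respondent in a vocational school with average achievement would fall into category 5.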
Health-compromising behaviours were smoking: never tried, smoked once, smoked 2 to 50 times, smoked over 50 times, smokes less than 10 times daily, smokes at least 10 times daily; and alcohol drinking style: abstinence, occasional drinking, recurring drinking (drinks alcohol at least once a month), recurring drunkenness (drinks until really drunk at least once a month). The correlations between smoking and alcohol drinking style varied from 0.56 to 0.67 in the age/sex subgroups.
Health-enhancing behaviours. Intensity of weekly physical activity summarized information from five questions: four on the frequency of participation in sports and physical activity 1) organized by sports clubs, 2) organized by school or workplace (physical training lessons excluded), 3) organized by other associations/clubs, or 4) practised alone or with friends/family members; and one on 5) the extent of getting out of breath or sweating during physical activity. The derived categories were: very active vigorous activity, vigorous activity, occasional vigorous activity, light activity, and no activity. Frequency of brushing teeth was categorized as: several times a day, once a day, about 4 to 5 times a week, about 2 to 3 times a week, at most once a week, never. The correlations between weekly physical activity and brushing teeth varied from 0.10 to 0.19 in the age-sex subgroups.
Educational level in adulthood was measured by one indicator variable only. This endogenous factor is our main variable of interest.
Statistical analysis
Structural equation modeling with the LISREL 8.71 program [25] was used to study how five latent variables characterising the baseline situation in adolescence were related to our main variable of interest, educational level in adulthood. These latent (exogenous) variables were family structure, family socioeconomic position, school career, health-compromising behaviours and health-enhancing behaviours in adolescence. Polychoric correlation coefficients, computed from pairwise-present cases with the PRELIS program [25, 26], were used to quantify the associations between the measured variables in the four age-sex subgroups. The models were fitted to these polychoric correlations by the method of weighted least squares.
Three models were fitted separately in each age-sex group. First, basic models including associations between family background, school career and educational level were fitted. Next, the models relating to Hypothesis 1 and Hypothesis 2 were fitted (Figure 1). Finally, the same model was refitted retaining only the associations that were statistically significant at the 5% risk level (t-statistic smaller than −1.96 or greater than +1.96). Standardized regression coefficients (range 0–1) for the associations are presented. The fit of these models was evaluated by means of the root mean square error of approximation (RMSEA), with RMSEA<0.08 indicating an adequate fit [27].
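The two decision rules in this paragraph can be sketched in Python as follows. This is an illustration only, not the LISREL computation itself: it assumes the commonly used point estimate RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))), and some software uses N rather than N − 1 in the denominator. The function names and the example chi-square value are hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n.

    Assumes the common formula with (n - 1) in the denominator.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def is_significant(t_statistic: float, critical: float = 1.96) -> bool:
    """Two-sided 5% rule from the text: t below -1.96 or above +1.96."""
    return abs(t_statistic) > critical


def adequate_fit(rmsea_value: float, cutoff: float = 0.08) -> bool:
    """Adequate fit as defined in the text: RMSEA below 0.08."""
    return rmsea_value < cutoff


# Hypothetical example: chi-square of 150 on 50 df with 1001 respondents.
example = rmsea(150.0, 50, 1001)
print(round(example, 4), adequate_fit(example))
```

An association with a t-statistic of −2.3 would be retained under this rule, whereas one with a t-statistic of 1.2 would be dropped from the final model.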