1 Introduction

This article focuses on family background effects on high cultural activities, and investigates to what extent estimates of these effects are biased by measurement error due to the retrospective longitudinal design. As in many fields of sociology, family background effects on high cultural activities are routinely measured by asking respondents retrospective questions about their parents and the cultural climate in the family of origin. The preference for a retrospective design seems logical and efficient, since panel studies would take far too long to assess the effect of the family of origin on the offspring’s life course. Retrieving information by questions about family background and about cultural practices in the family of origin is not a problem, as long as the information respondents provide about their parents is correct. However, this information is unlikely to be completely correct, and inaccurate information must result in biased estimates of family background effects. The size and direction of the bias are the concern of this article.

Cultural consumption is a core dimension of life styles (Bourdieu 1984; DiMaggio and Mukhtar 2004). Cultural consumption has many aspects, and in this paper we focus on high cultural activities, since formal high culture is a way for the elite to distinguish themselves from others (Bourdieu 1984). The indicators of high cultural activities are the frequency of visits to theatres, concerts, and museums, and the frequency of reading literature and poetry. These indicators are strongly correlated, which suggests that there is a latent variable, affecting all the indicators, which can be labeled as ‘affinity with high cultural activities’. Bourdieu and Passeron (1977) have argued that intergenerational transmission of cultural capital has become the main reproduction channel in modern society, and has replaced economic transmission of status. Thus, it is important to assess whether conventional, retrospective designs arrive at the true intergenerational effect of high cultural activities, both for research in cultural consumption per se and for research in social stratification and mobility.

What do studies of the intergenerational transmission of high cultural activities tell us? First, they show that inclusion of family background in regression models of high cultural activities yields strong effects, both in studies in which family background is measured by standard indicators of the socio-economic status of the family of origin (educational and occupational status), and in studies in which family background is measured by parental high cultural activities (Ganzeboom 1984; Ganzeboom and De Graaf 1991; Kraaykamp 2003; Mohr and DiMaggio 1995; Van Eijck 1996; DiMaggio and Mukhtar 2004). The latter approach is more fruitful, not only because it yields larger effects, but also because it points directly to the two presumed mechanisms behind the intergenerational transmission of high cultural activities. Children who are socialized in an environment in which culture is a standard facet of the leisure time repertoire (a) learn how to appreciate culture and (b) imitate their parents’ life style. Indeed, as soon as a direct measurement of cultural socialization is included in multivariate models of high cultural activities, the effect of parental level of education becomes small. It is clear that parental socializing practices in particular, and not so much unspecified effects of social status, affect children’s high cultural activities in their adult lives.

Second, the studies of family background on high cultural activities show that the often found relationship between individual educational attainment and high cultural activities is severely biased if family background is not controlled for in the regression models. Apparently, a large part of the association between educational attainment and high cultural activities is spurious, due to the effects of family background on both educational attainment and high cultural activities (DiMaggio and Useem 1978; Ganzeboom 1984; DiMaggio and Ostrower 1990; Ganzeboom and De Graaf 1991; Kraaykamp and De Graaf 1995; Van Eijck 1996). The effect of parental high cultural activities is even larger than the effect of individual educational attainment (Van Eijck 1997).

In this paper, we set out to investigate whether these two conclusions about the effects of family background and educational attainment on high cultural activities are biased because of measurement error in the retrospective account of parental high cultural activities. Such measurement error is likely to occur, since respondents report about a situation in the past. Respondents answer questions like “Did your parents go to the theatre when you were 12 to 15 years old?” In this article we want to find out what the cost of the rational decision to collect information in a retrospective design has been: does conventional research lead to reliable estimates of the intergenerational transmission of high cultural activities?

Measurement theory argues that random error in an independent variable leads to an underestimation of its effect. Thus, the true association between parental and son’s/daughter’s high cultural activities is higher than the correlation coefficients reported in studies that do not include measurement error in the model. However, there are two reasons why measurement error can lead to the overestimation of an effect. First, in an analysis with more than one independent variable it is possible that the effect of one variable is overestimated, whereas the effect of another variable is underestimated. Second, measurement error in retrospective questions about parents’ high cultural activities may not be random. For example, it is possible that the measurement error in the respondent’s report of parental high cultural activities is correlated with the measurement error in the report of the respondent’s own high cultural activities. Such positively correlated measurement error would lead to an overestimation of the intergenerational transmission of high cultural activities in models without correction for measurement error.
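The two opposing mechanisms can be illustrated with a small simulation. The sketch below is not part of the analysis in this article: the true effect of 0.6, the error standard deviation of 0.6, and the function name `observed_correlation` are all hypothetical values chosen for illustration. It generates error-free parental and child scores, adds reporting error to both, and compares the observed correlation when the two errors are independent with the observed correlation when they share a common component.

```python
import numpy as np

def observed_correlation(error_corr, n=200_000, true_beta=0.6, error_sd=0.6, seed=0):
    """Correlation between error-laden reports of parental and own activities.

    All parameter values are hypothetical; error_corr is the correlation
    between the two reporting errors.
    """
    rng = np.random.default_rng(seed)
    # Error-free scores with unit variance and correlation true_beta.
    parent_true = rng.normal(size=n)
    child_true = true_beta * parent_true + rng.normal(
        scale=np.sqrt(1 - true_beta**2), size=n
    )
    # Reporting errors with equal variance and correlation error_corr.
    z1 = rng.normal(size=n)
    z2 = rng.normal(size=n)
    parent_err = error_sd * z1
    child_err = error_sd * (error_corr * z1 + np.sqrt(1 - error_corr**2) * z2)
    return np.corrcoef(parent_true + parent_err, child_true + child_err)[0, 1]

random_only = observed_correlation(error_corr=0.0)  # attenuated below 0.6
correlated = observed_correlation(error_corr=0.5)   # pushed back upwards
print(round(random_only, 2), round(correlated, 2))
```

Under these assumptions the purely random error pulls the observed correlation well below the true value, while the shared error component raises it again. Whether the net bias is downward or upward therefore depends on the relative size of the two components, which is the empirical question addressed here.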

The different and contradictory hypotheses about the consequences of measurement error for the size of the intergenerational transmission of high cultural activities mean that empirical evidence must solve the issue. Our analytical strategy will be to measure parents’ high cultural activities in a more reliable way by using multi-informant models. Questions about parental high cultural activities are answered by three informants: the primary respondent, a random sibling of the respondent, and a random parent of the respondent. The three informants have answered the same questions about parental high cultural activities when the primary respondent was between 12 and 15 years old. Note that we do not assume that one of the informants gives more reliable information than the others, but that a measurement model based on the three pieces of information yields a more reliable measure of parental high cultural activities. The (dis)similarities in the responses to questions about the cultural practices in the family of origin are modeled in a LISREL design (Jöreskog and Sörbom 1996), which makes it possible to handle both random and correlated measurement error in regression models. The comparison of the outcomes of multi-informant models with those of conventional research will tell us (a) whether estimates of the intergenerational transmission of high cultural activities are biased, (b) to what extent the effect of individual educational attainment on high cultural activities is affected by measurement bias in family background variables, and (c) in which direction these biases run.

2 Data and descriptives

2.1 Data

We employ the repeated cross-sectional retrospective life-course survey Family Survey Dutch Population as collected in 1992 and 2000 (Ultee and Ganzeboom 1992; De Graaf et al. 2000). In these two surveys, primary respondents and their (married or unmarried) partners were interviewed in face-to-face interviews plus self-completion questionnaires. Samples were drawn from the population registers of a representative selection of Dutch municipalities. The response rate (= contact rate × cooperation rate) was 42.5 percent in 1992 and 40.6 percent in 2000. The contact rates were about 90 percent, and the cooperation rates about 50 percent. The sample sizes of the 1992 and 2000 surveys are 1,000 and 1,561 respondents respectively, which adds up to 2,561 respondents. Since many of the older respondents do not have living parents, we decided to include in the analysis only respondents of 54 years or younger. Of these respondents, 86 percent reported having at least one living parent, and 90 percent reported having at least one living sibling. As a result, we have included 1,950 primary respondents in our analysis.

The primary respondents were asked to provide their parents’ address and the address of one randomly selected sibling. These siblings and parents were then sent a questionnaire by mail, with a stamped return envelope. After two reminders, the second one with a fresh questionnaire and return envelope, completed parent questionnaires were received for 43 percent of the respondents with living parents. The response rate of siblings among respondents with at least one living sibling was 39 percent. The non-response had two causes: some respondents did not give the address of their parents or siblings, and some parents and siblings did not return the questionnaire they received. Not all questionnaires contain the information we need for our analysis, especially because we did not include questions about the deceased spouses of the surviving parent in the questionnaire. As a result, although we have data from 1,950 primary respondents between 18 and 54 years old who answered the questions about parental high cultural activities, we have parent reports on parental high cultural activities for 590 respondents and sibling reports on parental high cultural activities for 711 respondents. The missing data problem is handled by distinguishing groups of individuals with different patterns of missing values. This procedure is described below.

For the analytical models we need a limited set of variables: respondent’s education and high cultural activities, parental high cultural activities, and the control variables age and gender. The information about parental high cultural activities comes from three informants, which makes it possible to control for measurement error in family background.

Highest completed education of fathers and their sons/daughters (the primary respondents) is the number of years necessary to complete their highest level of education: primary school is 6 years of schooling, lower vocational training (LBO) is 9 years, lower general education (MAVO) and short intermediate vocational training (KMBO) are 10 years, normal intermediate vocational training (MBO; Footnote 1) and intermediate general education (HAVO) are 11 years, pre-university education (VWO) is 12 years, higher vocational training (HBO) is 15 years, university (WO) is 17 years, and post-university is 20 years.
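The recoding of qualifications into years of schooling can be written as a simple lookup table. The sketch below copies the year values from the paragraph above. The dictionary keys and the function name `education_in_years` are illustrative labels, not variable names from the survey files.

```python
# Years of schooling assigned to each qualification, as listed in the text.
# The keys are illustrative labels, not the codes used in the survey data.
YEARS_OF_SCHOOLING = {
    "primary": 6,
    "LBO": 9,
    "MAVO": 10,
    "KMBO": 10,
    "MBO": 11,
    "HAVO": 11,
    "VWO": 12,
    "HBO": 15,
    "WO": 17,
    "post-university": 20,
}

def education_in_years(level):
    """Return the years of schooling for a qualification label, or None if unknown."""
    return YEARS_OF_SCHOOLING.get(level)

print(education_in_years("HBO"))
```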

Parental and son’s/daughter’s high cultural activities are measured by a set of questions about reading behavior and visits to cultural events. The questions referring to the high cultural activities of the primary respondents are similar to those referring to the parents’ high cultural activities, but the items differ between the two surveys. Since the items vary with regard to their frequency in the general population (more persons read books than visit modern art museums), we computed percentile scores for each item. In this way, each item has an average score of 50, and thus all items are standardized with respect to their average occurrence in the population. The items have been selected both on theoretical grounds (we selected items referring to high culture) and on empirical grounds (we performed a reliability analysis). The factor loadings of the items and the reliabilities of the scales are presented in Table 1. The reliability is good in both surveys and for all informants, but in the 1992 survey it is somewhat higher than in the 2000 survey. This may be because all items were put in one list in the 1992 questionnaire, which may have caused halo effects. The final score for cultural participation is the average of the different items (measured in percentile scores).
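The article does not spell out the exact percentile formula. A minimal sketch, assuming mid-percentile ranks (the share of cases scoring lower plus half the share of tied cases), is given below. This variant has the property described above: every item averages exactly 50 in the sample on which it is computed. The function name and the two example items are hypothetical.

```python
import numpy as np

def percentile_scores(values):
    """Mid-percentile rank of each answer within its item (assumed formula)."""
    values = np.asarray(values, dtype=float)
    ordered = np.sort(values)
    below = np.searchsorted(ordered, values, side="left")            # cases scoring lower
    tied = np.searchsorted(ordered, values, side="right") - below    # cases with same score
    return 100.0 * (below + 0.5 * tied) / len(values)

# Two hypothetical items answered by five respondents (0 = never, 2 = often).
theatre = percentile_scores([0, 0, 0, 1, 2])
reading = percentile_scores([0, 1, 1, 2, 2])

# The scale score is the average of the item percentile scores.
scale = np.mean([theatre, reading], axis=0)
print(theatre, theatre.mean(), scale.mean())
```

Because each item is centered at 50 regardless of how common the activity is, averaging the items gives equal weight to rare activities (visiting museums) and common ones (reading books).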

Table 1 Factor loadings of the different culture items, reliability of the total scale

Female is a dummy variable for gender (0 = male, 1 = female). Birth year is coded as the year of birth minus 1938, the birth year of the oldest respondents in the sample.

2.2 Descriptives

Table 2 presents descriptive information about all variables in the analysis. The average for parental high cultural activities is 50 (49.92 to be precise) as a result of the standardization procedure we followed. In the subgroup with parental information this mean is somewhat higher. The mean according to parents does not differ from the mean according to respondents, but the mean according to siblings is significantly lower than the means according to respondents and parents. The correlations between the three answers are between r = .674 and r = .710. Cronbach’s reliability coefficient of the three answers together is α = .868.
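As a plausibility check, the reported reliability can be related to the reported correlations through the standardized form of Cronbach’s alpha, α = k·r̄ / (1 + (k − 1)·r̄), where k is the number of informants and r̄ the mean correlation between their answers. The sketch below assumes this standardized formula. The article may have used the covariance-based version, and only the lowest and highest of the three correlations are reported, so the calculation gives bounds rather than the exact value.

```python
def standardized_alpha(mean_r, k):
    """Standardized Cronbach's alpha for k indicators with mean correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Bounds implied by the lowest and highest reported inter-informant correlations.
lower = standardized_alpha(0.674, 3)
upper = standardized_alpha(0.710, 3)
print(round(lower, 3), round(upper, 3))
```

The reported α = .868 falls between the two bounds, so the reliability coefficient and the correlations are mutually consistent.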

Table 2 Descriptive information about all variables in the analysis

The average of son’s/daughter’s educational attainment is 11.36 years. Due to the standardization of the cultural items, the average score for son’s/daughter’s high cultural activities is equal to that for parental high cultural activities. Half of the respondents are women. The average birth year is 20.15, which refers to 1958.

3 Design

3.1 Approach to measurement error

We analyze the consequences of measurement error with Structural Equation Models (Hayduk 1987; Byrne 1998) using the LISREL software version 8.54 (Jöreskog and Sörbom 1996). The total effect of parental high cultural activities and the direct effects of parental high cultural activities and son’s/daughter’s educational attainment are estimated in two separate models. In Model A the total effect of parental high cultural activities on son’s/daughter’s high cultural activities is estimated, and in Model B the effect of son’s/daughter’s educational attainment on high cultural activities is controlled for. Sex and birth year are included as covariates in both models. Although sex differences have decreased significantly, the educational attainment of women is lower than that of men. Moreover, it has been found that women participate in high culture more often than men. Because women thus combine lower education with higher participation, excluding sex from the model would lead to an underestimation of the effect of education on cultural participation.

Models A and B both are estimated in four variants. Model 1 only includes information provided by the primary respondents, which is considered to be measured without error. Fig. 1 represents this model graphically.

Fig. 1 Model without measurement error

In Model 2, which is presented graphically in Fig. 2, we incorporated random measurement error. For parental high cultural activities measurement error is included by modeling it as a latent variable, measured by the information of the three informants. Son’s/daughter’s own educational attainment and high cultural activities are measured by one indicator (in the case of high cultural activities this indicator is a scale made of several items), and measurement error is included by setting the error variance of these indicators to 15 percent and 20 percent of their total variance, respectively. Because these variables are only available for multiple informants in the 2000 survey, we cannot incorporate the three measurements in the analysis. However, on the basis of the correlations between the respondent’s answer and a parent’s answer in the 2000 survey, we can assess the reliability and hence calculate the error variance (Hayduk 1987). Another way to assess the reliability of high cultural activities would be to include the reliability of the scale in the measurement model, which is .857 and .822 in the two surveys. However, these reliability coefficients may be overestimated due to halo effects. For that reason, we use the correlation with the parental answers about the respondents, as we have done for educational attainment. This yields a somewhat lower reliability than the alpha of the scale, namely .80. Sex and birth year are considered to be measured without error, since previous research showed that these variables are measured reliably (Schreiber 1975/1976; Porst and Zeifang 1987; Poulain et al. 1992).
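For a single indicator, the fixed error variance follows directly from the assumed reliability: error variance = (1 − reliability) × total variance. A minimal sketch of this step is given below. The function name is illustrative and the total variance of 100 is a hypothetical value, since the observed variances are reported in Table 2 rather than in the text.

```python
def fixed_error_variance(total_variance, reliability):
    """Error variance to fix for a single indicator with a known reliability."""
    return (1.0 - reliability) * total_variance

# Reliabilities implied by the text: .85 for education (15 percent error)
# and .80 for high cultural activities (20 percent error).
# The total variance of 100.0 is a hypothetical placeholder.
error_var_education = fixed_error_variance(100.0, 0.85)
error_var_culture = fixed_error_variance(100.0, 0.80)
print(error_var_education, error_var_culture)
```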

Fig. 2 Model with random measurement error

Model 3 (Fig. 3) takes correlated measurement error into account. We test the presence of a bias in the answers of respondents about parental high cultural activities towards their own high cultural activities.

Fig. 3 Model with correlated measurement error

Model 4 (see Fig. 4) only includes information provided by primary respondents, and thus resembles Model 1, but now we incorporate measurement error, by constraining the error (co)variances to be equal to the values found in the second model and, if correlated error is present, to the third model. We do this to make clear that with the results of our analyses, future research can find the correct effects using surveys with primary respondent information only.

Fig. 4 Model with imputed measurement error

Three fit statistics are used to assess the model fit. The χ² evaluates whether the model fit is significantly worse than that of the saturated model. A disadvantage of the χ² is that it is frequently significant in large samples, although the model represents the data rather well. Since we have a sample size of 1,950 respondents, we also use two additional fit statistics that account for the number of cases, namely the BIC and the RMSEA. A negative value of the BIC (Raftery 1993; 1995) is considered to imply a good fit. For the RMSEA (Root Mean Square Error of Approximation), which is the average error per degree of freedom, a value below .05 is usually considered to imply a good fit (Browne and Cudeck 1993).
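Both additional statistics can be computed from the χ², the degrees of freedom, and the sample size. The sketch below uses the textbook single-group formulas, BIC = χ² − df·ln(N) and RMSEA = √(max(χ² − df, 0) / (df·(N − 1))). These are assumptions: LISREL applies its own scaling in multiple-group analyses, and truncating the RMSEA at zero is one common convention. The χ² of 40 on 20 degrees of freedom is a hypothetical example, not a result from Table 4.

```python
import math

def bic(chi2, df, n):
    """Raftery's BIC for a structural equation model; negative values favor the model."""
    return chi2 - df * math.log(n)

def rmsea(chi2, df, n):
    """Single-group RMSEA, truncated at zero when chi2 is below df (a convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Hypothetical fit: chi2 = 40 on 20 degrees of freedom with N = 1,950.
example_bic = bic(40.0, 20, 1950)
example_rmsea = rmsea(40.0, 20, 1950)
print(round(example_bic, 1), round(example_rmsea, 3))
```

In this hypothetical case the χ² would be significant at conventional levels, yet the BIC is clearly negative and the RMSEA is below .05, which illustrates how the three statistics can point in different directions in a large sample.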

3.2 Approach to missing values

Since we do not have complete information for all respondents, we estimated our models using the multiple-group option in the LISREL software (Jöreskog and Sörbom 1996). On the basis of the missing value pattern, four groups can be distinguished (Table 3). Group A (n = 342) consists of respondents for whom we have information on parental high cultural activities from all three informants. In the other three groups, data from at least one informant are missing. Group B (n = 248) does not have sibling information, and group C (n = 369) lacks parent information. Group D is the largest category. This group contains 991 respondents for whom we have no other informants than the primary respondents (Footnote 2).
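The assignment of respondents to the four groups, and the way the group sizes add up to the totals reported in the data section, can be sketched as follows. The function name and its boolean arguments are illustrative. The group sizes are taken from the text.

```python
def pattern_group(has_parent_report, has_sibling_report):
    """Assign a respondent to a missing-data group from the available informants."""
    if has_parent_report and has_sibling_report:
        return "A"  # all three informants
    if has_parent_report:
        return "B"  # sibling report missing
    if has_sibling_report:
        return "C"  # parent report missing
    return "D"      # primary respondent only

# Group sizes as reported in the text.
GROUP_SIZES = {"A": 342, "B": 248, "C": 369, "D": 991}

total_respondents = sum(GROUP_SIZES.values())
parent_reports = GROUP_SIZES["A"] + GROUP_SIZES["B"]
sibling_reports = GROUP_SIZES["A"] + GROUP_SIZES["C"]
print(total_respondents, parent_reports, sibling_reports)
```

The totals reproduce the 1,950 primary respondents, the 590 parent reports, and the 711 sibling reports mentioned in the data section.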

Table 3 Missing value structure: sample size of four subgroups

All four groups can be included in one analysis in the LISREL software, because this software allows one latent variable to be measured by different indicators across groups of respondents. When an indicator is missing, its mean and its covariances with all other variables in the analysis are constrained to be equal to zero, its variance is constrained to be equal to one, and the effect of the latent variable on this indicator is constrained to be equal to zero. Further, the regression effects are restricted to be equal over the four groups (Footnote 3). Under the assumption that the data are missing completely at random (MCAR) rather than merely missing at random (MAR), the means of the indicators (if they are not missing) are restricted to be equal across the groups. This method gives reliable results if data are either MAR or MCAR (Allison 1987). However, differences in the missing data structure do deteriorate the fit statistics. A poor fit can be the result of a misspecified model, but can also occur when the missing values are MAR instead of MCAR. Therefore, we also provide fit statistics for a model in which the means are not constrained to be equal. It is important to note that the estimated effects are also corrected for measurement error in Group D, which contains the respondents for whom we do not have additional family background information from a parent or a sibling, since the errors are restricted to be equal to those in the groups of respondents for whom we do have information from parents or siblings.

4 Models

4.1 Model 1: No measurement error

Models 1A and 1B in Table 4 explain high cultural activities without accounting for measurement error in the variables. Both models fit the data well. The χ² statistics are not statistically significant, and the BIC values of both models are negative. The RMSEA cannot be computed for these models because the χ² is lower than the number of degrees of freedom.

Table 4 Effects of parental high cultural activities, educational attainment, female, and cohort on high cultural activities

The effects are in line with results from previous research. Model 1A, which models the effect of parental high cultural activities when the respondent was 15 years old on son’s/daughter’s high cultural activities, shows that the intergenerational transmission of high cultural activities is strong. The standardized effect of parental high cultural activities is β = .486, and the explained variance is R² = .256. Model 1B adds educational attainment to Model 1A. Indeed, the effect of parental high cultural activities now becomes smaller, and the standardized effect of educational attainment is stronger than the standardized effect of parental high cultural activities (β = .387 vs. β = .343). Nevertheless, the effect of parental high cultural activities is still rather strong; only about 30 percent of this effect is mediated by educational attainment. The R² is relatively high, namely .339, which also shows that the effect of educational attainment does not merely mediate the effect of parental cultural resources, but is largely additive.
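The mediated share follows from the two standardized effects: the indirect effect is the total effect (Model 1A) minus the direct effect (Model 1B), expressed as a proportion of the total effect. A short worked calculation with the coefficients quoted above (variable names are illustrative):

```python
# Standardized effects of parental high cultural activities from the text.
total_effect = 0.486   # Model 1A, without educational attainment
direct_effect = 0.343  # Model 1B, controlling for educational attainment

indirect_effect = total_effect - direct_effect
mediated_share = indirect_effect / total_effect
print(round(indirect_effect, 3), round(mediated_share, 2))
```

The share comes to a little under 30 percent, in line with the rounded figure given in the text.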

4.2 Model 2: Random measurement error

Models 2A and 2B in Table 4 present analyses in which random measurement error is incorporated. The model fit statistics provide somewhat ambiguous information. The χ² values are significant, indicating a bad fit, but the BIC value is negative and the RMSEA is below .05, indicating a good fit. In both models, the effect of parental high cultural activities when the respondent was 15 years old on son’s/daughter’s high cultural activities is much stronger than in the models that do not correct for random measurement error. In Models 2A and 2B, the standardized effects of parental high cultural activities are 31 and 28 percent larger than in Models 1A and 1B. These differences are statistically significant (p < .01) (Footnote 4).

The effect of educational attainment in Model 2B is somewhat stronger than in Model 1B, but the difference of seven percent is not significant (p = .43).

In addition, we did a sensitivity analysis of the error-variance in son’s/daughter’s educational attainment and high cultural activities. When we constrained these error variances to be larger or smaller than 15 and 20 percent of the total variance in these variables, the conclusions did not change. We conclude that, as expected, random error in parental high cultural activities leads to an underestimation of the intergenerational transmission of high cultural activities.

4.3 Model 3: Correlated errors

In order to investigate correlated error and how it biases effects, we allowed for correlated error between the measurement of parental and respondent’s high cultural activities. The structural models in which the error covariance between parental and respondent’s high cultural activities is freely estimated are presented as Models 3A and 3B in Table 4. Just as for Model 2, the fit statistics provide ambivalent information: significant χ² values, but good RMSEA and BIC values. Comparing Models 2A and 2B with Models 3A and 3B, it turns out that the χ² values are significantly lower in Models 3A and 3B, while the BIC values are more negative. Thus, we conclude that Models 3A and 3B should be preferred over Models 2A and 2B.

The biases of respondent answers about parental high cultural activities towards respondent’s high cultural activities are presented in Table 5. In both Model 3A and Model 3B, measurement error in the respondent’s answer about parental high cultural activities is correlated with measurement error in the respondent’s own high cultural activities. The correlation coefficient is r = .096 in Model 3A and r = .070 in Model 3B.

Table 5 Correlation between errors in answers of respondents about their parents and about themselves

What consequences do these correlated errors have for the effect sizes? Accounting for correlated error makes the total effect of parental high cultural activities (Model 3A) smaller compared to the random measurement error Model 2A, but in Model 3A the effect is still stronger than in Model 1A (p < .01). The difference in the effects of parental high cultural activities between Models 3B and 1B, which control for the effect of educational attainment, is not significant (p = .16). The consequence of measurement error for the intergenerational transmission of high cultural activities clearly depends upon the model specified: the total effect of parental high cultural activities is deflated by measurement error, while the direct effect is not.

The effect of educational attainment on high cultural activities is now significantly stronger than in Model 1B. According to the estimates of Model 3B, the relative difference between the effects of educational attainment and parental high cultural activities is larger in favor of educational attainment than it was according to Model 1B. Additional analyses, not presented here, show that different values for the error variance in son’s/daughter’s educational attainment and high cultural activities (five percentage points higher and lower) do not lead to different conclusions.

4.4 Model 4: Imputed error

This article shows that the effects of parental high cultural activities and educational attainment on high cultural activities are biased due to measurement error. Therefore, we recommend including our estimates of the error variances in future research on family background effects on high cultural activities. Based on the 2000 Family Survey Dutch Population we found that the error variances in respondent’s educational attainment and respondent’s high cultural activities are about 15 and 20 percent of the total variance, respectively. Table 6 shows the effects of the latent characteristic parental high cultural activities on its indicators (Model 3B). The square of the standardized effect equals the reliability of the indicator. The error variance in the information provided by the three informants is presented as a proportion of the total variance. Error variance accounts for as much as one third of the total variance.

Table 6 The effects of latent family background characteristics on their indicators

In Models 4A and 4B of Table 4 we imputed these proportions of error variance and error covariance as reported in Tables 5 and 6 in the model with information by the respondent only, i.e. in Model 1. As could be expected, Model 4 provides the same structural effects as Model 3.
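The logic of imputing known error (co)variances can be illustrated for the simplest case of one predictor and one outcome. The sketch below is a method-of-moments analogue, not the LISREL specification used in this article: it subtracts the assumed error variances and error covariance from the observed covariance matrix and standardizes the remainder. All numerical values (true correlation 0.6, error variance 0.36, error covariance 0.18) and the function name are hypothetical.

```python
import numpy as np

def corrected_correlation(x_obs, y_obs, err_var_x, err_var_y, err_cov_xy=0.0):
    """Correlation between true scores after removing known error (co)variances."""
    cov = np.cov(x_obs, y_obs)
    true_cov = cov[0, 1] - err_cov_xy
    true_var_x = cov[0, 0] - err_var_x
    true_var_y = cov[1, 1] - err_var_y
    return true_cov / np.sqrt(true_var_x * true_var_y)

# Simulated respondent-only data with a true correlation of 0.6 (hypothetical).
rng = np.random.default_rng(1)
n = 200_000
parent_true = rng.normal(size=n)
child_true = 0.6 * parent_true + rng.normal(scale=0.8, size=n)

# Reporting errors: variance 0.36 each, covariance 0.5 * 0.36 = 0.18.
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)
parent_report = parent_true + 0.6 * z1
child_report = child_true + 0.6 * (0.5 * z1 + np.sqrt(0.75) * z2)

naive = np.corrcoef(parent_report, child_report)[0, 1]
corrected = corrected_correlation(parent_report, child_report, 0.36, 0.36, 0.18)
print(round(naive, 2), round(corrected, 2))
```

In this simulation the naive correlation is biased, whereas the corrected value recovers the true correlation up to sampling error. The same idea underlies Model 4: once the error (co)variances are known from a multi-informant study, they can be fixed in a model that uses respondent reports only.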

5 Conclusion

In this article we have found that son’s/daughter’s information on parental high cultural activities contains both random measurement error that deflates its effect on son’s/daughter’s high cultural activities and correlated error that inflates its effect on son’s/daughter’s high cultural activities. Whether the effect is underestimated as a result of measurement error depends on the specific model to be estimated: the total effect is underestimated, while the direct effect is not biased. In addition, we have shown that the effect of educational attainment on high cultural activities is underestimated due to measurement error. After inclusion of random and correlated measurement error in the model, the effect of educational attainment remains stronger than the effect of parental high cultural activities.

We recommend that researchers remain cautious about the biases which result from retrospective measurement of family background. Although we have shown that, by accident, the direct effect of parental high cultural activities is not affected by the sum of random and correlated measurement error, it would be simplistic to assume that there is no need to be concerned about the biases caused by measurement error. The correct model can be specified by including appropriate controls for measurement error; this model gives the true effect of parental cultural activities on the respondent’s cultural activities. Students of family background effects on high cultural activities need to reconsider their models, and should address these biases seriously.