Genes, Culture and Conservatism: A Psychometric-Genetic Approach
DOI: 10.1007/s10519-015-9768-9
 Cite this article as:
 Schwabe, I., Jonker, W. & van den Berg, S.M. Behav Genet (2016) 46: 516. doi:10.1007/s10519-015-9768-9
Abstract
The Wilson−Patterson conservatism scale was psychometrically evaluated using homogeneity analysis and item response theory models. Results showed that this scale actually measures two different aspects in people: on the one hand people vary in their agreement with either conservative or liberal catchphrases and on the other hand people vary in their use of the “?” response category of the scale. A 9-item subscale was constructed, consisting of items that seemed to measure liberalism, and this subscale was subsequently used in a biometric analysis including genotype–environment interaction, correcting for non-homogeneous measurement error. Biometric results showed significant genetic and shared environmental influences, and significant genotype–environment interaction effects, suggesting that individuals with a genetic predisposition for conservatism show more non-shared variance but less shared variance than individuals with a genetic predisposition for liberalism.
Keywords
Conservatism; Liberalism; Genotype–environment interaction; Measurement error; Psychometrics; IRT
Introduction
The term conservatism is used in many ways (Pedhazur and Schmelkin 1991), but most often refers to politico-economic conservatism. More generally, conservatism can be seen as a generalized resistance to change and ambiguity, which is expressed as a preference for safe, traditional and conventional forms of institutions and behaviour. Wilson and Patterson (1968) developed a conservatism scale to measure social attitudes related to the conservative personality. They regarded the then existing scales to be of poor psychometric quality because of susceptibility to agreement-response bias and complex, double-barrelled and/or confusing questions. Instead, their conservatism scale consists of very short catchphrases. Examples of catchphrases include “Liberals” and “Living together”. The test taker is given the instruction: “Please indicate whether or not you agree with each topic by circling ‘Yes’ or ‘No’ as appropriate. If uncertain please circle ‘?’.” These catchphrases were expected to activate the respondent’s affective system, the system Wilson and Patterson (1968) hypothesised to be the most influential component for conservative attitudes and behaviours. The affective system seems indeed to be important, since conservatives tend to have stronger disgust reactions than liberals (Inbar et al. 2009) and brain data suggest that conservatives and liberals process risk and fear differently (Schreiber et al. 2013). Furthermore, Hibbing et al. (2014) found that conservatives tend to have stronger physiological responses to features of the environment that are negative and also devote more psychological resources to these stimuli.
The development of the scale was based on seven characteristics that Wilson and Patterson (1968) expected to be present in highly conservative individuals: (1) religious fundamentalism, (2) right-wing political orientation, (3) insistence on strict rules and punishments, (4) intolerance of minority groups, (5) preference for conventional art, clothing and institutions, (6) anti-hedonistic outlook, and (7) superstition and resistance to science. A large pool of more than 130 catchphrases was created that Wilson and Patterson (1968) regarded to be effective discriminators for these seven characteristics. Based on three successive item analyses (Wilson and Patterson 1968), 50 items were selected from this pool. To control for response bias, half of the items were phrased in the affirmative direction of conservatism and half of the items were liberally phrased. Although initially conceived of as a unidimensional scale (Wilson 1973), subsequent research on the structure of the scale showed that four factors were required to explain most of the observed variance. These factors were named (1) militarism-punitiveness (12 items), (2) anti-hedonism (12 items), (3) ethnocentrism and out-group hostility (12 items) and (4) religion-puritanism (12 items; Wilson 1973). Eaves et al. (1999) devised a shortened and somewhat altered conservatism scale consisting of 28 items. Most of the items were taken from the original conservatism scale, with a few items added that were regarded as relevant at the time of data collection. The eigenvalues of the inter-item correlations for the 28 items suggested, according to Eaves et al. (1999), that a general “conservative–liberalism” factor was substantial but not sufficient to explain all of the observed variance. An exploratory factor analysis with oblique rotation suggested that five factors explained most of the observed variance in 24 of the 28 items. These factors were named: (1) sexual permissiveness (8 items), (2) economic liberalism (5 items), (3) militarism (5 items), (4) political preference for democrats or republicans (2 items) and (5) religious fundamentalism (5 items). Note that in all these psychometric analyses, linear relationships were assumed between the items, treating the “?” response as exactly midway between a “yes” and “no” answer.
Prior genetic research
Using various versions of the Wilson and Patterson conservatism scale, research has shown that both genetic and cultural influences are responsible for the observed variance in conservatism. Based on a conservatism measure derived from the original scale (Wilson and Patterson 1968), Martin et al. (1986) reported monozygotic (MZ) twin correlations of 0.60 for males and 0.64 for females, assessed in a large sample from the Australian Twin Registry. Eaves et al. (1997) reported on MZ and dizygotic (DZ) twin correlations across age (9.5–75 years). They found that prior to age 20 all variance due to individual differences is age-related, implicating environmental influences. However, after age 20, age effects vanished and there were significant differences between MZ and DZ twin correlations, suggesting genetic influences. In a later study, Eaves et al. (1999) reported heritability estimates of 0.65 for males and 0.45 for females based on the Virginia 30,000 study of twins and their relatives. Bouchard et al. (2003) assessed the 28-item conservatism scale in the Minnesota Study of Twins Reared Apart (MISTRA) and reported a heritability of 0.56. Hatemi et al. (2014) published results of a genome-wide association study (GWAS) meta-analysis, where several cohorts and, among other measures, various versions of the Wilson−Patterson conservatism scale were used. They also reported on variance components. They found a combined weighted mean of relative influences across measures and cohorts of 0.40 for genetic influences, 0.18 for common-environmental influences and 0.42 for unique-environmental influences. The GWAS meta-analysis showed no genome-wide significant hits, which may be partly related to the heterogeneity of the measures used across cohorts, but could also be due to multidimensionality of the conservatism measure (van der Sluis et al. 2010).
Need for psychometric evaluation
Establishing a measure with good psychometric properties is important for finding genomic signals for personality traits such as conservatism (van der Sluis et al. 2010; van den Berg and Service 2012). Prior research on the psychometric dimensionality of the conservatism scale was based on linear factor analysis with “yes”, “?” and “no” coded as 3, 2 and 1 respectively, thus assuming that a “?” response is exactly midway between a “yes” and “no” response. This assumption, however, is not necessarily true: a “?” response might mean something other than being psychologically (exactly) between a “yes” and “no” answer. Reactions like “I don’t know what I think” or “I don’t know what is meant by busing” are psychologically different from, for example, an “I don’t care” reaction or a reluctance to convey the true affective response. Converse (1964) and other political scientists (e.g. Campbell et al. 1960) have demonstrated that the general American public is largely uninformed about current political affairs and has gaps in knowledge of political systems. Arguably, it is likely that respondents do not understand or care about (some of) the catchphrases of the Wilson−Patterson scale.
In addition, it is important to know to what extent scores on this scale reflect true trait variability and to what extent they reflect measurement error. Measured as split-half internal consistency, the conservatism scale has been reported to have a high reliability of 0.94 (Wilson and Patterson 1968). This finding is supported by several studies. For example, Henningham (1996) reports alpha reliability of 0.81 on a 27-item version based on the original scale and an alpha reliability of 0.74 on a simplified and modernized 12-item version. In this paper, the psychometric properties of the conservatism scale were assessed more rigorously by using homogeneity analysis (de Leeuw and Mair 2009) and item response theory (IRT), thereby greatly relaxing the assumption of prior research of linear relationships among items. With the establishment of a good scale, a biometric analysis including genotype–environment interaction was done.
Genotype–environment interaction
Genotype–environment interaction refers to the situation that some genotypes are more sensitive to changes in the environment than others or, conversely, that genotypes respond differently to the same environment (see e.g. Cameron 1993; Martin 2000; Sorensen 2010). Although various studies suggest that genotype–environment interaction is an important phenomenon in complex behavioural traits (e.g., antisocial behaviour, Caspi et al. 2002; cognitive ability, Turkheimer et al. 2003; or depression, Hicks et al. 2009), research on genotype–environment interaction has not been a focus of genetic studies on conservatism. The present study was concerned with an omnibus test to assess whether there is any statistically significant genotype–environment interaction. Therefore, the method that we use here to model genotype–environment interaction is parametrized such that both genetic and environmental influences are modelled as latent (i.e., unmeasured) variables. If genotype–environment interaction is indeed found, future research on the etiology of conservatism can focus on the exact nature of this effect by collecting specific environmental measures at the family or individual level, depending on the results of this research.
Schwabe and van den Berg (2014; see also Molenaar and Dolan 2014) recently developed a method that models genotype–environment interaction in such a way that statistical findings are independent of scale properties. This means that as long as a set of items measures a particular trait, such as conservatism, biometric results (i.e., conclusions regarding heritability and genotype–environment interactions) are the same regardless of which particular (sub)set of items is used. This is important since it is generally recognized that statistical findings regarding nonlinear effects such as genotype–environment interaction are dependent on the scale at which the analysis takes place; a simple transformation such as taking the logarithm or computing the root of a particular measure (e.g., a sum score) either obscures or reveals interaction effects (see e.g. Eaves et al. 1977; Martin 2000; van der Sluis et al. 2006; Eaves 2006; Molenaar et al. 2012; Schwabe and van den Berg 2014; Molenaar and Dolan 2014). Schwabe and van den Berg (2014; see also Molenaar and Dolan 2014) showed that the skewness in the phenotype distribution in large part determines whether a genotype–environment effect is found, even when that skewness in sum scores is only due to response frequencies in the items’ response categories. For instance, a relatively large proportion of “yes” responses on dichotomous yes–no questions leads to a skewed distribution of the number of “yes” answers (total test score). Slightly rephrasing the questions might cause no real change in item content (e.g., changing “Do you like peanut butter?” into “Do you like peanut butter very much?”), but can cause a change in the proportion of yes-answers and thus change the skewness of the test score. Therefore, a rewording can lead to obscuring or revealing genotype–environment interaction effects even when the measured construct is the same.
By applying the method of Schwabe and van den Berg (2014), which involves item response theory (IRT) modelling while modelling genotype–environment interaction at the level of the latent construct, our results regarding genotype–environment interaction are free of any statistical artefacts due to response category frequencies. Still, the scale at which we model the interaction effect is arbitrary, but at least it is identified by using an IRT model. This makes our results comparable to other studies with perhaps slightly different items or a subset of the items, but where the scale was identified in the same way (i.e. the same IRT model).
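The dependence of sum-score skewness on response frequencies can be illustrated with a small calculation. The sketch below is not the authors' method: it assumes, purely for illustration, that a test consists of independent dichotomous items that all share the same probability of a “yes” answer, so that the sum score is binomially distributed. The function name and the chosen proportions are hypothetical.

```python
import math


def sum_score_skewness(n_items: int, p_yes: float) -> float:
    """Skewness of a Binomial(n_items, p_yes) sum score.

    Assumes independent items with a common "yes" probability,
    which is a simplification made only for this illustration.
    """
    return (1.0 - 2.0 * p_yes) / math.sqrt(n_items * p_yes * (1.0 - p_yes))


# The same 9-item test under three hypothetical endorsement rates:
# the further the "yes" proportion moves from 0.5, the more skewed the sum score.
for p in (0.5, 0.7, 0.9):
    print(f"p(yes) = {p:.1f}  skewness = {sum_score_skewness(9, p):+.3f}")
```

Under these assumptions a rewording that shifts the proportion of “yes” answers changes the skewness of the total score even though the number of items and the measured construct stay the same.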
This research
The first part of this study consists of a psychometric evaluation of the 28-item conservatism scale as used in Eaves et al. (1997, 1999), Bouchard et al. (2003) and the adult cohort in Hatemi et al. (2009). Item response models were used that take into account the categorical nature of the responses by modelling nonlinear relationships between item responses and the trait being measured. In an exploratory analysis, multidimensional homogeneity models (Gifi 1990) that assume nominal response categories were fitted in order to re-evaluate the psychometric dimensionality of the Wilson−Patterson scale. The Gifi method relaxes the assumption that a “?” answer falls exactly halfway between a “yes” and a “no” answer. Based on the results, a new scale was devised. IRT models were then used to confirm the results of the homogeneity analysis and to evaluate the psychometric quality of the new scale. In the second part of this study, the new scale was used to investigate genotype–environment interaction. For the genotype–environment analysis, a Bayesian approach was used in which the biometric model and an IRT model were fitted simultaneously.
Method
Data
The data come from the Health and LifeStyle Survey for Twins assessed in the Virginia 30K sample (Eaves et al. 1999; Hatemi et al. 2009), selecting data on twins and their parents. Part of this survey was the 28-item scale described above. Zygosity status was based on self-reported resemblance with a reported percentage correct of 95% (Eaves et al. 1999). Total sample size was 14454. Mean age was 52.13 (\(SD = 17.8\), range 16–94). For the psychometric analyses in the first part of this study, we used all available data from twins and their parents who had complete data for the 28 items (\(N = 12315\), of which 10405 were twins). For the biometric modelling in the second part of this research we only used twin data (2795 MZ twin pairs, 3280 DZ twin pairs), and missing item data were assumed to be missing at random.
Part I: psychometric analyses
For the psychometric analyses, only data from twins and their parents with complete data on all 28 items were used, with “no” coded as 1, “?” coded as 2 and “yes” coded as 3. Items associated with conservatism [as reported by Eaves et al. (1999), i.e. items 1 (Death penalty), 9 (Military Drill), 10 (Draft), 16 (Capitalism), 17 (Segregation), 18 (Moral Majority), 20 (Censorship), 21 (Nuclear Power), 23 (Republicans), 25 (School Prayer) and 28 (Busing)] were reverse coded, so that a high sum score is associated with low conservatism (high liberalism). The analyses were done using SPSS (IBM 2013) and R (R Development Core Team 2007). R is an open source language and environment for statistical computing, which is freely available at http://cran.r-project.org.
First, using SPSS (IBM 2013), a classical assessment of psychometric quality was performed on the scale as proposed by Eaves et al. (1999): computing item-total correlations and estimating reliability. Next, the responses were assumed nominal and a homogeneity analysis was done. Homogeneity analysis can be seen as a principal components analysis for nominal data. The analysis positions both individuals and item answer categories into one geometric space. It uses alternating least squares to minimize the distances between the position of an item’s particular answer category (“no”, “?”, “yes”) and individuals that chose that particular category (see e.g., Heiser and Meulman 1994; van der Kloot 1997). Using SPSS (IBM 2013), the dimensionality of the geometric space was determined. A two-dimensional homogeneity model was then further analysed with the R package homals (de Leeuw and Mair 2009). Based on the homogeneity analysis results, a unidimensional conservatism scale was constructed.
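The classical part of this assessment (reverse coding, item-total correlations and reliability) can be sketched in a few lines. The authors used SPSS; the Python code below is only an illustrative re-implementation of the standard formulas, the toy response matrix is made up, and the use of corrected (rest-score) item-total correlations is an assumption, since the text does not say which variant was computed.

```python
import numpy as np


def reverse_code(x, low=1, high=3):
    """Reverse-code responses on a low..high scale (1 <-> 3, 2 stays 2)."""
    return (low + high) - np.asarray(x)


def cronbach_alpha(items):
    """Cronbach's alpha for a persons x items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)


def guttman_lambda2(items):
    """Guttman's lambda-2 for a persons x items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    cov = np.cov(items, rowvar=False)
    total_var = cov.sum()                      # variance of the sum score
    off_diag = cov - np.diag(np.diag(cov))     # inter-item covariances only
    lambda1 = 1.0 - np.trace(cov) / total_var
    return lambda1 + np.sqrt(k / (k - 1) * (off_diag ** 2).sum()) / total_var


def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])


# Hypothetical toy data: 6 respondents, 3 items, coded 1 = "no", 2 = "?", 3 = "yes".
responses = np.array([
    [1, 1, 1],
    [2, 2, 1],
    [3, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [2, 3, 3],
])
print(corrected_item_total(responses))
print(cronbach_alpha(responses), guttman_lambda2(responses))
```

A negative item-total correlation in such output, as reported later for some items of the 28-item scale, signals an item that does not line up with the rest of the scale in the coded direction.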
The reliability of the new scale was calculated in SPSS and IRT models were used to confirm the results of the homogeneity analysis and further evaluate the new scale. For the IRT modelling, the R package mirt (Chalmers 2012) was used. The IRT analysis was done using a generalized partial credit model (GPCM; Muraki 1992), which is an IRT model that is suitable for polytomous, ordinal data. The GPCM has difficulty parameters (i.e., thresholds) as well as discrimination parameters, which are the IRT analog of factor loadings. For our current data with three ordered response categories (“no”, “?”, “yes”) the GPCM specifies two threshold parameters for each item: one for the location on the scale where the probability of a “?” equals the probability of a negative response, and one for the location on the scale where the probability of a positive response equals the probability of a “?” response. A Partial Credit Model (PCM) was also applied, which is a restricted version of the GPCM where the discrimination parameters (factor loadings) are all assumed equal to 1. For model comparison purposes, the AIC and the BIC were computed. At item level, goodness of fit was evaluated using chi-square statistics, comparing observed and expected response frequencies for different bins of test scores.
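To make the role of the two thresholds concrete, the category probabilities of a three-category GPCM item can be computed as follows. This is a minimal sketch in one common threshold parametrization; it is not the mirt implementation (which may parametrize the model differently internally), and the parameter values in the example are hypothetical.

```python
import numpy as np


def gpcm_probs(theta, alpha, thresholds):
    """Category probabilities under a generalized partial credit model.

    theta      : latent trait value of one person
    alpha      : item discrimination (alpha = 1 gives the PCM)
    thresholds : step locations; for "no"/"?"/"yes" these are the two
                 points where adjacent categories are equally likely
    Returns probabilities for the categories in order ("no", "?", "yes").
    """
    thresholds = np.asarray(thresholds, dtype=float)
    steps = alpha * (theta - thresholds)
    # Cumulative step sums; the lowest category has an empty sum of 0.
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    logits -= logits.max()          # numerical stability
    expo = np.exp(logits)
    return expo / expo.sum()


# Hypothetical item: discrimination 1.2, thresholds at -0.5 and 0.5.
for theta in (-2.0, -0.5, 0.5, 2.0):
    print(theta, np.round(gpcm_probs(theta, 1.2, (-0.5, 0.5)), 3))
```

At a trait value equal to the first threshold the “no” and “?” probabilities coincide, and at the second threshold the “?” and “yes” probabilities coincide, which is the interpretation of the thresholds given above.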
Part II: biometric analysis
In the second part of this study, the newly constructed scale was used in a biometric analysis including genotype–environment interaction. Here we follow the new method that integrates an IRT model into biometric modelling of genotype–environment interaction at the level of the latent construct (see Schwabe and van den Berg 2014; Molenaar and Dolan 2014). As biometric model, the so-called ACE model was used, which decomposes total phenotypic variance, \(\sigma ^2_P\), into variance due to additive genetic influences (\(\sigma ^2_A\)), variance explained by common-environmental influences (\(\sigma ^2_C\)) and variance due to unique-environmental influences (\(\sigma ^2_E\)). Whereas common-environmental influences were parametrized to be perfectly correlated in a twin pair, we parametrized unique-environmental influences to be uncorrelated within a family.
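The ACE decomposition implies specific twin correlations, which the following sketch makes explicit. It relies on the standard assumptions of the classical twin design (MZ twins share all additive genetic variance, DZ twins share half on average, no interaction effects); the function name is hypothetical, and the variance proportions 0.40, 0.18 and 0.42 reported by Hatemi et al. (2014) are used purely as illustrative inputs.

```python
def ace_twin_correlations(var_a, var_c, var_e):
    """Expected MZ and DZ twin correlations under a simple ACE model.

    Assumes MZ twins share all additive genetic variance, DZ twins share
    half of it on average, and both share the common environment fully.
    """
    total = var_a + var_c + var_e              # phenotypic variance
    r_mz = (var_a + var_c) / total
    r_dz = (0.5 * var_a + var_c) / total
    return r_mz, r_dz


r_mz, r_dz = ace_twin_correlations(0.40, 0.18, 0.42)
print(f"r_MZ = {r_mz:.2f}, r_DZ = {r_dz:.2f}")
# Doubling the MZ-DZ difference recovers the additive genetic proportion.
print(f"2 * (r_MZ - r_DZ) = {2 * (r_mz - r_dz):.2f}")
```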
Bayesian approach
van den Berg et al. (2007) showed that, in order to take full advantage of the IRT approach, both the IRT measurement model and the biometric model have to be estimated simultaneously, using a so-called one-step approach. However, as this procedure is computationally burdensome, widespread methods of estimating variance components through structural equation modelling (SEM) reach their computational limit. van den Berg et al. (2007) showed that Bayesian statistical modelling can be an alternative to enable the simultaneous modelling of an IRT measurement and variance decomposition model. In the Bayesian approach, statistical inference is based on the posterior density of the model parameters, which is proportional to the product of a prior probability and the likelihood function of the data (for further reading see e.g. Box and Tiao 1992). Here we use Gibbs sampling (Geman and Geman 1984; Gelfand and Smith 1990; Gelman et al. 2004), a Markov chain Monte Carlo (MCMC) algorithm, to study the posterior densities of model parameters. This method was applied using the freely obtainable MCMC software package JAGS (Plummer 2003). The JAGS script can be found in the online supplementary material. As similar syntax is used, the script can also be used in the free software package WinBUGS (Lunn et al. 2000) with minor adaptations. As an interface from R to JAGS, the R package rjags was used (Plummer 2013).
As in Eaves and Erkanli (2003) and van den Berg et al. (2006, 2007), a Bayesian version of the ACE model was used that only specifies univariate distributions. This model is an extension of the Schwabe and van den Berg (2014) model to a (Generalized) Partial Credit model ((G)PCM, Muraki 1992) version at the measurement level. Furthermore, the model was extended to include, besides an interaction with unique-environmental influences, also an interaction with common-environmental influences, following Molenaar and Dolan (2014).
Biometric and IRT model
In the following, the full model, consisting of both variance decomposition (ACE model) and measurement model (IRT model), will be described for MZ and DZ twins.
As for MZ twins, the latent phenotype, \(\theta _{ij}\), entered the GPCM for three response categories simultaneously with the variance decomposition (see Equations 6–8), and the observed item data were assumed to follow a multinomial distribution (see Equation 9).
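As a rough orientation, the structure of the combined model for an MZ pair can be sketched as follows. This is an assumed form, pieced together from the parameters named in the prior specification (\(\sigma ^2_A\), \(\exp (\gamma _0)\), \(\exp (\beta _0)\), \(\gamma _1\), \(\beta _1\), item thresholds and discriminations) and from the log-linear variance parametrization of Schwabe and van den Berg (2014) and Molenaar and Dolan (2014); it is not a reproduction of the numbered equations of this article, and the DZ case, which additionally splits the genetic effect into a shared and an individual part, is omitted. The index \(i\) denotes the family, \(j\) the twin, \(k\) the item and \(c\) the response category.

```latex
% Assumed sketch for MZ twin j = 1, 2 in family i (not the article's own equations)
\begin{align*}
A_i &\sim N\bigl(0,\ \sigma^2_A\bigr) \\
C_i &\sim N\bigl(0,\ \exp(\gamma_0 + \gamma_1 A_i)\bigr)
      && \text{common environment, A $\times$ C via } \gamma_1 \\
\theta_{ij} &\sim N\bigl(A_i + C_i,\ \exp(\beta_0 + \beta_1 A_i)\bigr)
      && \text{unique environment, A $\times$ E via } \beta_1 \\
P(Y_{ijk} = c \mid \theta_{ij}) &=
  \frac{\exp \sum_{v=2}^{c} \alpha_k (\theta_{ij} - \beta_{kv})}
       {\sum_{h=1}^{3} \exp \sum_{v=2}^{h} \alpha_k (\theta_{ij} - \beta_{kv})},
      && c = 1, 2, 3, \quad \textstyle\sum_{v=2}^{1} (\cdot) \equiv 0
\end{align*}
```

Under this form, \(\beta _1 = 0\) and \(\gamma _1 = 0\) reduce the model to a simple ACE model with constant environmental variances \(\exp (\beta _0)\) and \(\exp (\gamma _0)\).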
Prior distributions
As prior distribution for the additive genetic variance, we chose an inverse gamma distribution (\(\sigma ^2_A \sim InvG(1, 1)\)). We chose independent normal distributions for both intercepts (\({\text {exp}}(\beta _0)\), \({\text {exp}}(\gamma _0) \sim N(1,2)\)) as well as for both slope parameters (\(\gamma _1\), \(\beta _1 \sim N(0,10)\)). For the item thresholds, we used a normal distribution (\(\beta _k \sim N(0,10)\)) and for the item discrimination parameters a lognormal distribution (\({\text {log}}(\alpha _k) \sim N(0, 10)\)).
In order to find the biometric model that fits the data well and, at the same time, is parsimonious, we estimated different biometric models. These included a biometric model without any interaction effects (simple ACE model), an ACE model with one (either A × E or A × C) interaction effect and a model with both interaction effects. The deviance information criterion (DIC; Spiegelhalter et al. 2002), a measure that estimates the amount of information that is lost when a given model is used to represent the data-generating process, was calculated to assess model fit of each model. The DIC takes account of both the complexity of a model and the goodness of fit. It can be seen as a Bayesian analog of Akaike’s Information Criterion (AIC). In the models without interaction effect(s), the same independent prior distributions were chosen for common-environmental and unique-environmental variance (\(\sigma ^2_C\), \(\sigma ^2_E \sim InvG(1, 1)\)).
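How the DIC combines fit and complexity can be shown with a short calculation. The sketch below follows the definition of Spiegelhalter et al. (2002), in which the penalty is the posterior mean deviance minus the deviance at the posterior mean of the parameters; software packages may estimate the penalty term differently, so this should not be read as the computation performed by JAGS. The deviance values are hypothetical.

```python
import numpy as np


def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance information criterion from MCMC output.

    deviance_samples           : deviance evaluated at each posterior draw
    deviance_at_posterior_mean : deviance evaluated at the posterior mean
                                 of the parameters
    Returns (DIC, p_D), where p_D is the effective number of parameters.
    """
    d_bar = float(np.mean(deviance_samples))   # goodness of fit
    p_d = d_bar - deviance_at_posterior_mean   # complexity penalty
    return d_bar + p_d, p_d


# Hypothetical deviance draws for one model.
value, penalty = dic([102.0, 104.0, 106.0], 100.0)
print(value, penalty)
```

Models are then compared on the resulting DIC values, with lower values indicating a better trade-off between fit and complexity.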
After a burn-in phase of 20,000 iterations for each separate chain, the characterisation of the posterior distribution for the model parameters was based on a total of 120,000 iterations from six different Markov chains. This was chosen on the basis of previous test runs with multiple chains and computing Gelman and Rubin’s convergence diagnostic (Gelman and Rubin 1992). The mean and standard deviation of the posterior distribution were calculated for each parameter, as was the 95% highest posterior density (HPD, see e.g. Box and Tiao 1992) interval. The HPD interval can be interpreted as the Bayesian analog of a confidence interval (CI). When the HPD interval does not contain zero, the influence of a parameter can be regarded as significant.
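An HPD interval can be approximated from MCMC draws as the shortest interval that contains the requested share of the samples. The sketch below is one simple way to do this and assumes a unimodal posterior; it is not necessarily the routine used by the authors, and the function name and the example draws are hypothetical.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the posterior draws.

    Assumes a unimodal posterior, so that the highest-density region
    is a single interval.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))                 # draws inside the interval
    if k < 1 or k > n:
        raise ValueError("mass must be in (0, 1] and samples non-empty")
    # Width of every window of k consecutive sorted draws.
    widths = x[k - 1:] - x[: n - k + 1]
    start = int(np.argmin(widths))
    return x[start], x[start + k - 1]


# Hypothetical draws for a slope parameter.
rng = np.random.default_rng(1)
draws = rng.normal(loc=-2.8, scale=0.2, size=10_000)
lower, upper = hpd_interval(draws)
print(lower, upper, "excludes zero:", not (lower <= 0.0 <= upper))
```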
Sum score analysis
In order to compare biometric results gained by this methodology with results gained by the sum score approach, the biometric model that was chosen as the best model for our data was also estimated using sum scores instead of item scores. In this analysis, sum scores were calculated from the twin data with answer categories coded as 0, 1 and 2 respectively and rescaled so that they had a mean of zero and variance of one in order to make results of both approaches comparable with respect to the prior distributions. Sum scores were then analyzed with the same JAGS script (see online supplementary material) but without the IRT part. After a burn-in period of 10,000 iterations, the characterisation of the posterior distribution for the sum score analysis was based on 15,000 iterations from one Markov chain, based on previous test runs with multiple chains and computing Gelman and Rubin’s convergence diagnostic (Gelman and Rubin 1992).
Results
Based on the original 28-item scale with reverse coding (following Eaves et al. 1999), the reliability estimate was 0.73 (Guttman’s lambda 2; Cronbach’s alpha = 0.71). Item 28 (Busing) had a negative correlation (r = −0.18) with the total score, as did item 16 (Capitalism, r = −0.02).
Homogeneity analysis results
Investigating the category points on the second dimension of the catchphrase “Liberals”, an item with a higher loading on the second dimension, we can see that the “?” answer category falls roughly between the “yes” and “no” answer categories. This implies that the second dimension distinguishes between liberal and conservative persons, assuming that someone with a high latent trait value for liberalism would be more inclined to approve of liberals (i.e., by answering “yes”) while someone with a very low latent trait for liberalism would be more inclined to disapprove of liberals (i.e. by answering “no”). When we look at the first dimension, we can observe that the distance between the “yes” and “no” answer categories is very small, while their distance to the “?” category point is relatively large. This suggests that the first dimension mainly distinguishes between people who answered with “?” or with either “yes” or “no”. When we investigate the category points plot for the catchphrase “Capitalism”, an item with a higher loading on the first dimension, we can see the same pattern, but also that, although the “?” falls roughly in between the “yes” and “no” answer categories on the second dimension, the distance between the “yes” and “no” answer categories is much smaller on this dimension (compared to the “Liberals” item) while the distance between the “?” category point and the “yes” and “no” category points is still large on the first dimension. This suggests that this item better distinguishes between people who answered with “?” and people who answered with “yes” or “no” than between conservative and liberal persons.
Interpretation of the dimensions
The relationship between item loadings and the proportion of “?” answers on each item implies that the tendency of items to distinguish between individuals with a “?” answer and respondents who answered either “yes” or “no” is rooted in a high level of “?” responses, which might indicate that these items measure concepts on which many people simply do not have well-formed attitudes. The interpretation of this dimension, however, remains unclear, as we do not know participants’ true reason for giving a “?” answer. Not having a well-formed attitude might indeed be the reason for a “?” answer, but there are other possible reasons; for example, participants might not have understood what a particular catchphrase means (e.g., item “Busing”).
How can we handle this multidimensionality?
A way to handle the multidimensionality of the Wilson−Patterson conservatism scale could be to use a weighted sum score, weighting items differently based on their item loading on the second (liberalism-conservatism) dimension. However, although it is not possible to retrieve participants’ true motivation to give a “?” answer, category points plots as well as the relationship between factor loadings and the proportion of “?” answers suggest that items with a high loading on the first dimension do not give much relevant information to distinguish between conservative and liberal persons. Therefore, we decided to use only items with a higher item loading on dimension 2 (conservatism-liberalism) than on dimension 1 (ambiguous interpretation). For further psychometric analysis, we consequently selected the remaining 9 liberalism-conservatism items: Gay rights, Women’s liberation, Living together, Modern art, Divorce, X-rated movies, School prayer (reverse-coded), Liberals and Abortion. In all items, the “?” category point was in between the “yes” and “no” category points on the second dimension, a requirement for the application of the ordinal (G)PCM IRT model.
Evaluation of the new scale
Item-total correlations showed positive signs and were in the range 0.50–0.68. The reliability was estimated at 0.78 (lambda 2).
GPCM parameter estimates and item fit statistics
Item  \(\alpha\)  \(\beta _2\)  \(\beta _3\)  \(X^2\)  df  p  

Xrated movies  0.58  −1.52  −1.47  253.53  27  <0.01 
Modern art  0.60  0.11  0.09  232.92  27  <0.01 
Women’s liberation  1.00  0.22  0.80  204.93  27  <0.01 
Abortion  1.01  −0.65  −0.16  112.60  27  <0.01 
Gay rights  1.65  −0.90  −1.95  235.90  27  <0.01 
Liberals  1.44  0.57  −0.67  357.63  27  <0.01 
Living together  1.08  −0.91  −0.53  205.03  27  <0.01 
Divorce  0.74  −0.44  0.23  147.16  27  <0.01 
School prayer (rev. coded)  0.65  −1.86  −2.04  109.54  27  <0.01 
Item fit statistics showed the largest \(X^2\) value for the Liberals item. Observed number of responses for each response category as well as expected number of responses for each response category under the GPCM were plotted for ordered bins of total scores (see online supplementary material). Supplementary Fig. 1 shows that there is no systematic misfit for the Liberals item: the red lines (observed number of responses) largely overlap with the corresponding black lines (number of responses predicted by the fitted GPCM), as they do for all items. Supplementary Fig. 2 shows model fit based on twin data only.
Biometric modelling
Model fit (DIC) for all fitted biometric models
Biometric model  DIC 

No interaction effects (simple ACE model)  181853 
ACE model with A × E  181782 
ACE model with A × C  181849 
ACE model with A × E and A × C  181776 
Posterior mean (standard deviation) and HPD of variance components for the ACE model with A × E and A × C interaction effects
Statistic  \(\sigma ^2_A\)  \({\text {exp}}(\gamma _0)\)  \({\text {exp}}(\beta _0)\)  \(\beta _1\)  \(\gamma _1\)  

Mean (SD)  0.43 (0.04)  0.29 (0.03)  0.07 (0.01)  −2.81 (0.21)  0.54 (0.14) 
HPD  [0.33; 0.51]  [0.22; 0.35]  [0.05; 0.10]  [−3.22; −2.39]  [0.31; 0.84] 
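To give a feel for the size of these interaction effects, the posterior means above can be plugged into a log-linear variance function. The sketch assumes that the environmental variances depend on the additive genetic value \(A\) as \(\exp (\beta _0 + \beta _1 A)\) for the unique environment and \(\exp (\gamma _0 + \gamma _1 A)\) for the common environment, following the parametrization of Schwabe and van den Berg (2014) and Molenaar and Dolan (2014). Plugging in posterior means ignores posterior uncertainty, so the output is illustrative only; the function name is hypothetical.

```python
import math

# Posterior means from the table above (IRT-based analysis).
SIGMA2_A = 0.43
EXP_GAMMA0 = 0.29   # common-environmental variance at A = 0
EXP_BETA0 = 0.07    # unique-environmental variance at A = 0
BETA1 = -2.81       # A x E slope
GAMMA1 = 0.54       # A x C slope


def env_variances(a):
    """Common and unique environmental variance at genetic value `a`.

    Assumes the log-linear form exp(intercept + slope * a); higher `a`
    corresponds to a stronger genetic propensity towards liberalism.
    """
    var_c = EXP_GAMMA0 * math.exp(GAMMA1 * a)
    var_e = EXP_BETA0 * math.exp(BETA1 * a)
    return var_c, var_e


sd_a = math.sqrt(SIGMA2_A)
for label, a in (("-1 SD (conservative)", -sd_a), ("mean", 0.0), ("+1 SD (liberal)", sd_a)):
    var_c, var_e = env_variances(a)
    print(f"{label:22s} var_C = {var_c:.3f}  var_E = {var_e:.3f}")
```

Under these assumptions the unique-environmental variance shrinks and the common-environmental variance grows as the genetic value moves towards liberalism, which is the pattern described in the Discussion.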
Analysis of sum scores
Sum score analysis: posterior mean (standard deviation) and HPD of variance components for the ACE model with A × E and A × C interaction effects
Statistic  \(\sigma ^2_A\)  \({\text {exp}}(\gamma _0)\)  \({\text {exp}}(\beta _0)\)  \(\beta _1\)  \(\gamma _1\)  

Mean (SD)  0.64 (0.03)  0.35 (0.03)  0.53 (0.02)  0.10 (0.05)  −2.21 (0.09) 
HPD  [0.58; 0.70]  [0.29; 0.41]  [0.49; 0.56]  [0.00; 0.21]  [−2.38; −2.05] 
Discussion
In this paper, we evaluated the Wilson−Patterson conservatism scale psychometrically before using a shorter version of the scale for a biometric analysis including genotype–environment interaction.
A psychometric evaluation of the 28-item conservatism scale (Eaves et al. 1999) showed that this scale actually measures two different aspects in people: while one set of items distinguished between people’s agreement with either conservative or liberal catchphrases, another set of items distinguished mainly between people who answered “?” and people who answered either “yes” or “no”. Earlier political research (e.g. Campbell et al. 1960; Converse 1964) has shown that most Americans are uninformed about politics and are not consistent in their agreement or disagreement with political statements. Arguably, it is likely that the dimension that differentiated mostly between respondents who answered “?” and those who answered either “yes” or “no” mainly distinguished between individuals with and without an opinion. Considering the item content of the two sets of items, this seems plausible. While this dimension mainly consists of politically loaded items (e.g., “Property Tax”, “Pacifism”) and seems to measure economic liberalism, the conservatism-liberalism dimension is composed of more approachable items (e.g., “Gay Rights”, “Abortion”), likely measuring social liberalism. Exceptions are the item “Liberals” on the conservatism-liberalism dimension and the items “Astrology” and “Death Penalty” on the dimension that distinguished between individuals who answered “?” and those who answered either “yes” or “no”.
To handle this multidimensionality, we decided to use a shorter version of the scale, consisting of only 9 items with a high loading on the liberalism-conservatism dimension. Ignoring the multidimensionality of the Wilson−Patterson conservatism scale can threaten the validity of future or existing studies that use this scale. Therefore, we advise researchers to use the 9-item subscale as presented here rather than the full 28-item scale. Furthermore, results from the homogeneity analysis suggest that the school prayer item should be reverse-coded. An IRT analysis of this 9-item subscale showed good fit with a Generalized Partial Credit model and the reliability of the new scale was sufficient. The 9-item subscale is, however, constrained in the sense that the content of the remaining items reflects a measure of social liberalism rather than economic liberalism or security attitudes.
Comparing model fit and parsimony of different biometric models, an ACE model with both A × E and A × C was chosen as the best model for the data of this study. A biometric analysis that included an IRT model to correct for bias due to category response frequencies suggested a negative A × E and a positive A × C: a higher genetic propensity towards liberalism was associated with less unique-environmental and more common-environmental variance. The finding of a negative A × E effect means that the nonshared environment plays a more important role in explaining differences in individuals with a genetic tendency towards favouring conservative ideas than in explaining differences in individuals genetically predisposed towards favouring liberal ideas. Conversely, the finding of a positive A × C effect means that the shared environment seems to be more important in explaining differences in individuals predisposed towards liberalism than in explaining differences in individuals predisposed towards conservatism. Arguably, genetic effects important for the expression of conservatism do not work in isolation, but instead influence the extent to which individuals are sensitive to environmental influences, favouring an interactionist framework for the study of conservatism as a personality trait.
These findings suggest that there are unique environmental factors that affect attitudes in the conservative genotype but affect attitudes in the liberal genotype much less. Likewise, the familial environment seems to be more important in forming political attitudes in families with a genetic tendency for liberalism than in families with a genetic tendency for conservatism. These results are surprising. Conservative people are generally seen as people who do not like change; they generally favour the safety of the known over the unknown (Wilson 1973). Research by Carney et al. (2008) showed that two Big Five personality traits differentiate between liberal and conservative individuals: Openness to New Experiences and Conscientiousness. In general, conservative participants score higher on conscientiousness (e.g. being more conventional, orderly and better organized) whereas liberals score higher on openness to new experiences (e.g. being more curious, novelty-seeking and creative). The differences in personalities were even reflected in personal possessions and the characteristics of living and working spaces: liberal participants collected more CDs, books, movie tickets, and travel paraphernalia, whereas conservative participants showed more sports decor, U.S. flags, cleaning supplies, calendars, and uncomfortable furniture. Based on these trait differences, one could expect that family environmental influences would be more important for individuals with a genetic tendency for conservatism than for individuals with a genetic tendency for liberalism. Likewise, it could be expected that unique-environmental influences would be more important for individuals genetically predisposed towards liberalism, who have a tendency for novelty-seeking behaviour.
Liberalism has been shown to be associated with higher IQ scores (Kanazawa 2010), which predicts that conservative people generally end up in different environmental circumstances than liberal-minded people and perhaps experience different amounts of variation in those environmental factors that act on political views and personality. The finding of a negative A × E suggests that conservatives might come into contact with people and ideas outside of their shared environment that might be more reflective of their genetic preference. Eaves et al. (1997) indeed showed that genetic expression of conservatism-liberalism only occurs after individuals have left their parental home. Individuals with a genetic tendency for conservatism then seem likely to be influenced by unique-environmental influences that might affect their thinking about political issues, while, surprisingly, individuals with a genetic tendency towards liberalism are still influenced by their family environment. Future research on genotype–environment interaction in conservatism should focus on the exact nature of both common and unique influences by including specific environmental moderators, measured at the family and individual level. This can be done, for example, by using the genotype-environment parametrization introduced by Purcell (2002), regressing moderators directly on the genotypic value.
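As a rough illustration of what including a measured moderator could look like, the sketch below follows the general idea of a moderation model in the spirit of Purcell (2002), in which each path coefficient is a linear function of a measured moderator M and the variance components are the squared moderated paths. This is a simplified reading and not the specification used in this paper; the function name and all parameter values are hypothetical and chosen only for illustration.

```python
def moderated_components(m, a, c, e, beta_a, beta_c, beta_e):
    """Variance components at moderator value m.

    Each path coefficient is ASSUMED to be linear in the moderator:
    path = baseline + slope * m, and each variance is the squared path.
    """
    var_a = (a + beta_a * m) ** 2
    var_c = (c + beta_c * m) ** 2
    var_e = (e + beta_e * m) ** 2
    return var_a, var_c, var_e


# Hypothetical baseline paths and moderation slopes (not estimates from this study).
PARAMS = dict(a=0.8, c=0.5, e=0.6, beta_a=0.1, beta_c=-0.2, beta_e=0.05)

for m in (-1.0, 0.0, 1.0):
    var_a, var_c, var_e = moderated_components(m, **PARAMS)
    total = var_a + var_c + var_e
    print(f"M = {m:+.1f}: A = {var_a:.3f}, C = {var_c:.3f}, "
          f"E = {var_e:.3f}, heritability = {var_a / total:.3f}")
```

With a measured moderator in the model, a change in the variance components across M can be attributed to that specific environmental variable, which is not possible when the environment is unmeasured.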
To compare the results obtained with the new methodology with those of the sum score approach, the same biometric model was estimated using sum scores instead of item scores. As the sum score approach does not take into account measurement unreliability, the estimated average environmental variance was much higher. Furthermore, the sum score approach suggested a positive A × E and a negative A × C interaction effect, meaning that people with a genetic tendency towards liberalism show more residual variance and less common-environmental variance than people with a genetic tendency towards conservatism. However, since the distribution of sum scores was skewed, this may be an artefact of item characteristics (see e.g. Schwabe and van den Berg 2014; Molenaar and Dolan 2014).
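Why item characteristics can mimic an interaction effect in sum scores can be seen in a small sketch. It uses a dichotomous Rasch model as a deliberate simplification of the GPCM used in this paper, with nine hypothetical item difficulties that are all easy relative to the average trait level. Under such a model, the measurement error variance of the sum score at a given trait level is the sum of p(1 − p) over items, so it depends on the trait level itself.

```python
import math


def rasch_prob(theta, b):
    """Probability of endorsing an item with difficulty b at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def sum_score_error_variance(theta, difficulties):
    """Conditional variance of the sum score given theta (independent items)."""
    probs = [rasch_prob(theta, b) for b in difficulties]
    return sum(p * (1.0 - p) for p in probs)


# Nine HYPOTHETICAL difficulties, all at or below the trait mean of 0 (easy items).
DIFFICULTIES = [-2.0, -1.75, -1.5, -1.25, -1.0, -0.75, -0.5, -0.25, 0.0]

for theta in (-2.0, 0.0, 2.0):
    err = sum_score_error_variance(theta, DIFFICULTIES)
    print(f"theta = {theta:+.1f}: error variance of sum score = {err:.3f}")
```

In this hypothetical setting the error variance is larger at low trait levels than at high ones, so residual variance in sum scores varies with the trait even though no genotype–environment interaction was built in. The direction of such an artefact depends on the item parameters.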
To our knowledge, this is the first study that used the Wilson−Patterson scale to investigate genotype–environment interaction in the case of unmeasured environmental variables. Regarding the testing of genotype–environment interaction in future research, we advise researchers to use the same IRT model (i.e., the GPCM) to make results concerning any interaction effects comparable. Results regarding genotype–environment interaction replicate only when the same underlying scale is used, as every transformation leads to a different result (see also Schwabe and van den Berg 2014; Molenaar and Dolan 2014).
In this research, the psychometric evaluation of the scale was done on all available data, of parents and offspring, enhancing statistical power. For the biometric modelling, however, only twin data were used. Unfortunately, it was not possible to use a parent-offspring model for this paper, since methodology for the inclusion of a genotype–environment interaction effect in a parent-offspring design with significant spouse correlation is still lacking. In future research, the method that was used in this paper will be extended to the parent-offspring design.
Acknowledgments
We would like to thank Lindon Eaves for providing the data for this study and commenting on an earlier version of this manuscript. Furthermore, we would like to thank two anonymous reviewers for their helpful comments on an earlier version of this manuscript. This study was funded by PROO Grant 41112623 (PI SM van den Berg) from the Netherlands Organisation for Scientific Research (NWO). Data collection was funded by the two NIH Grants AA06781 (PI AC Heath) and MH40828 (PI KS Kendler). Statistical analyses were carried out on the Genetic Cluster Computer (http://www.geneticcluster.org) hosted by SURFsara and financially supported by NWO Grant 48005003 (PI D Posthuma) along with a supplement from the Dutch Brain Foundation and the VU University Amsterdam.
Compliance with Ethical Standards
Human and Animal Rights and Informed Consent
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2000. Informed consent was obtained from all patients for being included in the study.
Funding information
Funder Name  Grant Number  Funding Note 

Netherlands Organisation for Scientific Research (NWO) 

Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.