Introduction

The relation between reading performance and IQ has been well established in groups with and without reading disability (Tiu et al. 2003). Several studies in children examined the heritability of reading performance in relation to intelligence (Byrne et al. 2008; Cardon et al. 1990; Thompson et al. 1991; Tiu et al. 2004; Zumberge et al. 2007). Many of these studies are based on the Colorado Twin Study of Reading Disability (Alarcón and DeFries 1997; Brooks et al. 1990; Gayán and Olson 2003; Knopik et al. 2002; Pennington et al. 1992; Tiu et al. 2004; Wadsworth et al. 2000). They are based on a continuously growing sample which in 2004 consisted of 340 monozygotic twins and 281 same sex dizygotic twins in which at least one member of the pair had a positive school history of reading disability with a mean age of 11 years (Tiu et al. 2004). In studies of reading performance different measures of reading performance are used (see for example Brooks et al. 1990; Gayán and Olson 2003; Olson et al. 1989; Tiu et al. 2004; Wadsworth et al. 2000). Comprehension of text or reading comprehension is the ultimate goal of reading and strongly depends on accurate and fluent recognition of words, reading recognition. Reading recognition depends on the component skills phonological decoding and orthographical coding. Phonological decoding refers to a reader’s ability to decode (non)words encountered for the first time using knowledge of common graphemes/phonemes. Orthographical coding is the ability to distinguish the meaning of homophones based on their specific spelling patterns instead of how they sound. Phoneme awareness is a very specific reading-related skill which is strongly related with word-reading skills, and is defined as the ability to isolate and manipulate the segments of speech at the level of the single phoneme (Gayán and Olson 2003). All measures and an index of spelling are included in different combinations in studies on reading performance.

For reading comprehension, Brooks et al. (1990) reported a heritability estimate of 27%. For reading recognition, broad-sense heritability estimates between 45 and 89% have been reported (Brooks et al. 1990; Cardon et al. 1990; Gayán and Olson 2003). Heritability of phonological decoding is estimated at around 90% (Gayán and Olson 2003; Olson et al. 1989). For orthographical coding, broad-sense heritability varies between 0 and 90%, and for phoneme awareness between 54 and 91% (Gayán and Olson 2003; Tiu et al. 2004). Byrne et al. (2008) and Brooks et al. (1990) reported heritability estimates for spelling of 74 and 21%, respectively. When reading recognition, reading comprehension and spelling are combined, heritability estimates vary between 34 and 72% (also depending on average IQ; Tiu et al. 2004; Wadsworth et al. 2000). In general, studies on reading performance and IQ (Byrne et al. 2008; Cardon et al. 1990; Thompson et al. 1991; Tiu et al. 2004; Zumberge et al. 2007) report a moderate correlation between reading performance and intelligence that is genetic in origin.

Alarcón and DeFries (1997) investigated whether the heritability of reading performance and intelligence and their genetic correlation was the same for twin pairs which were selected for reading disabilities as for control twin pairs. Reading performance was based on a combination of scales measuring reading recognition, reading comprehension and spelling. There were no differences between the groups in heritability of general cognitive ability. However, the genetic and phenotypic variances and covariances amongst the reading measures were larger for the affected than for the control group, with higher heritabilities for reading performance in the affected group. According to the authors this finding is in concordance with the hypothesis that DNA polymorphisms that influence reading disability are more prevalent in this group. The phenotypic correlations between reading performance and cognitive ability were somewhat larger in the control group than in the affected group (respectively, r = .76 and r = .41), as were the genetic correlations (respectively, r = .81 and r = .52).

The relationship between IQ and reading performance might be explained by specific memory processes involved in reading. In the literature on individual differences in reading performance, the relationship between reading performance, working memory (WM) and short-term memory (STM) is a subject of debate (Cohen-Mimran and Sapir 2007; Gathercole et al. 2006; Kercher and Sandoval 1991; Swanson and Jerman 2007). STM is the capacity to store material over short periods of time in situations that do not impose other competing cognitive demands (Gathercole et al. 2006). WM is the system responsible for the concurrent storage and manipulation of information (Baddeley 1992). WM consists of the central executive and three storage systems: the phonological loop, the visuospatial sketchpad and the episodic buffer. The central executive is the system responsible for a range of regulatory functions, including attention, the control of action, and problem solving (Baddeley 1996). The phonological loop comprises a phonological store that can hold memory traces for a few seconds before they fade, and an articulatory rehearsal process. The visuospatial sketchpad is its visuospatial counterpart (Baddeley 2003). In multiple studies the phonological loop and the visuospatial sketchpad are considered equivalent to STM (e.g., Gathercole et al. 2004). The episodic buffer provides temporary storage of information held in a multimodal code, which is capable of integrating information from a variety of sources, including long-term memory, into a unitary episodic representation (Baddeley 2000).

A possible explanation for a relation between STM and reading disability is that in children with reading disability the ability to code information phonemically or verbally is affected, which is required for storage in STM (Kercher and Sandoval 1991). This theory is in concordance with the difficulties in phonemic coding strategies observed in dyslexic children (Snowling 1980). Kercher and Sandoval (1991) and Swanson and Ashbaker (2000) indeed found that children with reading disability performed poorly on STM tasks but Gathercole et al. (2006) and Swanson and Jerman (2007) did not find a relation between reading disability and STM. Because of these inconsistencies further research is required.

A possible explanation for the relation between WM and reading disabilities is that impairments of WM result in reading disability because the system serves as a bottleneck for learning: children with low WM skills will have difficulties in meeting the routine WM demands of structured learning activities that are necessary for the acquisition of knowledge. Observations in children show that they have problems with following complex instructions, performing tasks that impose significant storage and processing loads, and performing tasks with a complex hierarchical structure (Gathercole et al. 2006). Most studies agree that children with reading disabilities score poorly on WM tasks (Gathercole et al. 2006; Swanson 2003; Swanson and Ashbaker 2000; Swanson and Berninger 1995), but see Van der Sluis et al. (2005). Further, Swanson and Berninger (1995) and Swanson and Ashbaker (2000) found that WM contributed independently from STM to the reading deficits found in less skilled readers.

The genetic relation between reading performance and memory is not well established. In particular, the relationship between reading performance and WM requires further investigation. In children, only one study investigated the relation between verbal STM and reading performance in a genetically informative design (Wadsworth et al. 1995). A moderate correlation between reading performance and verbal STM was observed, 80% of which was accounted for by genetic factors.

To better understand which processes contribute to variation in reading performance, we explore in this paper whether there is an association between WM, STM and reading performance independent of intelligence. A genetically informative design is used that permits a test of whether associations among WM, STM, intelligence and reading performance are explained by a common genetic factor and whether memory and intelligence also contribute independently to the variance in reading performance. Recurring findings show that a disorder is more severe when underlying deficits co-occur (Bishop 2006). So, if we find that a genetic association between memory and reading performance is independent from the association between IQ and reading performance, this suggests that a combination of deficits in these three areas is a sign of the severity of the reading disability rather than a symptom of reading disability by itself (Bishop 2006). The associations among reading performance, IQ, WM, and STM were assessed in a sample of 9-year-old twins and their older siblings.

Materials and methods

Participants

Participants were 112 nine-year-old twin pairs (M = 9.1, SD = .1) and one of their siblings aged 9–14 years (N = 100, M = 11.8, SD = 1.2). Children were recruited from the Netherlands Twin Registry (NTR; Boomsma et al. 2002b; Boomsma et al. 2006). This group takes part in a study on the development of cognition and brain structure (Peper et al. 2008; Van Leeuwen et al. 2008), and included 23 monozygotic male (MZM), 23 dizygotic male (DZM), 25 monozygotic female (MZF), 21 dizygotic female (DZF), and 20 dizygotic opposite-sex (DOS) twin pairs, and 56 female and 44 male siblings. The study was approved by the Central Committee on Research involving Human Subjects (CCMO). Parents signed an informed consent form for their children.

Data of one MZM family were excluded because the mother did not speak Dutch and the children’s Dutch language skills seemed to be delayed. Reading performance was assessed in 21 MZM, 22 DZM (including 1 incomplete pair), 23 MZF, 21 DZF (1 incomplete pair), and 19 DOS (3 incomplete pairs) pairs and 82 siblings (44 female). Three families did not complete the Corsi block tapping task, one sibling did not perform the 2-back, and in two siblings IQ was not assessed. Ten children were not able to complete the n-back task and eight children could not complete the Corsi block tapping task.

Protocol

All children were tested at VU University in separate rooms by trained test administrators. Testing lasted for about 5 h (including three breaks). Children completed the Corsi block tapping task, the n-back task and the WISC-III as part of a larger test battery. Most of the families went, after they had been to the VU University, to the University Medical Center of Utrecht (UMCU) for a magnetic resonance (MR) scan. Children were tested for reading performance prior to the MR scan. Average time between testing at the VU and the UMCU was 43 days (VU before UMCU) ranging from 63 days before testing at the VU until 124 days after testing at the VU (SD = 35).

Measures

Reading performance

One subtest of the ‘one minute reading test’ (OMRT; Cito 1995) was administered as a measure of technical reading performance, or oral reading fluency. It measures the accuracy and speed of decoding (Moelands et al. 2008). Since Dutch spelling is more regular than English, one is less dependent on reading recognition for correct pronunciation than in English (Moelands et al. 2008). Children were instructed to read out loud as many words as possible in 1 min without making errors from a card containing 120 unrelated words. The OMRT is a standardized test frequently used in Dutch education as a measure of early reading performance (Van der Sluis et al. 2005) and corresponds well with other instruments (Moelands et al. 2008). The norms of the OMRT date from 2003 and are age dependent. Nine-year-old children are suspected to be dyslexic when they score below 28 words a minute. Test–retest reliability in the nine year olds is .92 (Moelands et al. 2008). For siblings age appropriate norms were used. Analyses were based on raw scores.

Short-term spatial memory

The Corsi block tapping task (Corsi 1974) was included to assess short-term spatial memory. Children sat in front of a touch screen monitor on which nine white blocks were displayed unevenly across a gray screen. In succession a number of blocks turned red for 1 s, after which the screen was blank for 3 s. After reappearance of the blocks, the child had to tap the blocks on the screen in the same sequence in which they had changed color before. When a block was tapped, the block would turn red and stay that way until the end of the run. The computer registered each tap. Each child was given two practice runs. In these runs each person had to memorize two blocks. Immediately after the practice runs the actual test was administered, starting with a series of two blocks. After every five runs the item length was increased by one block. The test ended when the child responded incorrectly to three out of five runs of the same length. The maximum number of blocks that could turn red in succession was nine. Performance was measured by total number of correct runs.
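The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stopping and scoring logic, not the software used in the study: the function name `corsi_score` and its parameters are hypothetical, and the sketch assumes that the test stops immediately at the third incorrect run of a given length (the text does not say whether the remaining runs of that length were still presented).

```python
def corsi_score(run_outcomes, start_length=2, max_length=9,
                runs_per_length=5, max_errors=3):
    """Return (total_correct_runs, last_length_administered).

    run_outcomes: booleans in administration order
                  (True = sequence reproduced correctly).
    """
    outcomes = iter(run_outcomes)
    total_correct = 0
    length = start_length
    for length in range(start_length, max_length + 1):
        errors = 0
        for _ in range(runs_per_length):
            correct = next(outcomes, None)
            if correct is None:
                # No more recorded runs: the session ended here.
                return total_correct, length
            if correct:
                total_correct += 1
            else:
                errors += 1
                if errors >= max_errors:
                    # Assumption: stop at the third error of this length.
                    return total_correct, length
    return total_correct, length


# Hypothetical child: all runs of length 2 correct, four of five runs of
# length 3 correct, then three errors within four runs of length 4.
example = [True] * 5 + [True, True, False, True, True] + [False, True, False, False]
score, last_length = corsi_score(example)
```

The performance measure used in the analyses corresponds to the first returned value, the total number of correct runs.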

Visuospatial working memory

Children had to perform a spatial variant of the n-back task to assess visuospatial working memory. The n-back used in this protocol was designed after Gevins and Cutillo (1993) and Jansma et al. (2000) with increasing levels of difficulty. The children were asked to look at an apple presented on a screen. The apple had four holes in which a caterpillar could appear. Children were told to catch the caterpillar to prevent it from eating the apple, and were instructed to respond to the caterpillar by pushing one of four buttons with the thumb and index finger of both hands. The layout of the four buttons corresponded spatially to the four holes in which the caterpillar could appear. Children had to indicate where the caterpillar was one move back (1-back, i.e., the hole in which the caterpillar was before its last appearance), two moves back (2-back, i.e., the previous hole in which the caterpillar was before its last appearance), or three moves back (3-back, i.e., the hole in which the caterpillar was three moves before its last appearance). The caterpillar appeared in a hole for 1 s; after its disappearance there was a warning sound. Children were instructed to respond after this warning sound and could respond until the next caterpillar appeared. Between two caterpillar moves, the apple was empty for 1 s. Sessions contained 20 trials. Each condition consisted of a practice session and three sessions in which performance was recorded. Practicing continued until the participants understood the task. The 1-back condition was administered for practice purposes only; performance was recorded on the 2-back and 3-back conditions. Children were motivated during the task by counting the moves of the caterpillar. In the 2-back version the test administrator counted continuously to three and in the 3-back version the administrator counted to four.
After each session children received feedback on the number of apples they had saved from the caterpillar (correct responses) and how many had been eaten (incorrect responses). Following the feedback there was a break of 15 s. The task requires a continuous response to all stimuli and simultaneous monitoring and updating of all movements of the caterpillar. Performance on the task was scored as the total number of correct responses. The maximum score per condition was 60. For this study we used performance on the 2-back condition. For children the test–retest reliability of the 2-back is .65 (Van Leeuwen et al. 2007).

Verbal working and short-term memory

Digit span forwards (DSF) of the WISC-III (Wechsler et al. 2002) was used to measure verbal short-term memory. In this task participants had to recall a list of numbers. The test started with a list of two numbers. If participants recalled at least one out of two trials correctly, the list length increased by one digit. Increments proceeded until participants failed both trials of a given length. Performance was scored as the total number of correct trials. To measure verbal working memory the digit span backwards task (DSB) was used. This time the participants had to recall the list of numbers in reverse order. The test–retest coefficient over 3 years of digit span (forward and backward together) of the WISC-Revised is .53 (Livingston et al. 2003); the split-half coefficient for the internal consistency of digit span of the WISC-III is .67 (Wechsler et al. 2002).
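As a rough illustration of the digit span scoring rule described above (two trials per list length, continue while at least one trial is correct, score the total number of correct trials), consider the following Python sketch. The function name `digit_span_score` and its input format are hypothetical and are not part of the WISC-III materials; the same rule is assumed to apply to DSF and DSB.

```python
def digit_span_score(trial_outcomes, start_length=2):
    """Return (total_correct_trials, longest_length_passed).

    trial_outcomes: list of (first_correct, second_correct) pairs,
                    one pair per list length, in administration order.
    """
    total_correct = 0
    longest_passed = 0
    length = start_length
    for first, second in trial_outcomes:
        n_correct = int(first) + int(second)
        total_correct += n_correct
        if n_correct == 0:
            # Both trials of this length failed: the test stops.
            break
        longest_passed = length
        length += 1
    return total_correct, longest_passed


# Hypothetical child: passes lengths 2, 3 and 4, fails both trials of length 5.
outcomes = [(True, True), (True, False), (False, True), (False, False)]
total, longest = digit_span_score(outcomes)
```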

Intelligence

Psychometric IQ was measured with the Dutch adaptation of the WISC-III (Wechsler et al. 2002). IQ was based on 10 subtests (information, similarities, arithmetic, vocabulary, comprehension, block design, picture completion, picture arrangement, object assembly and digit-symbol substitution). The two digit span subtests were not included in the total IQ score. Cronbach’s α for total IQ is .93 (Wechsler et al. 2002).

Statistical analyses

All analyses were performed with the Mx software package (Neale et al. 2006). First, covariance matrices, means and sex regressions on the means were estimated in a saturated model. By fitting nested models in which the means and variances between twins and siblings were equated, several assumptions were tested. After testing equality of means between twins and siblings, the significance of sex and age effects on the means was tested. We continued equating parameters until the most parsimonious model with still acceptable fit was established. The choice of the best fitting model was based on likelihood-ratio tests. The difference between minus twice the log likelihoods (−2 LL) of two nested models asymptotically follows a χ² distribution. The degrees of freedom are given by the difference in the number of parameters estimated in the two nested models. A large increase in χ² relative to the degrees of freedom gained indicates that the submodel fits worse than the full model. All data were analyzed, including data from incomplete twin pairs, using the raw data option in Mx.
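The likelihood-ratio test described above can be illustrated with a short, self-contained Python sketch. It is not the Mx implementation. The helper names `chi2_sf` and `likelihood_ratio_test` are hypothetical, the −2 LL values and parameter counts in the example are invented for illustration, and the χ² tail probability is computed with the standard closed-form series for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + 2 * phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} x^(r - 1/2) / (1*3*...*(2r-1))
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)  # normal density at sqrt(x)
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


def likelihood_ratio_test(minus2ll_full, minus2ll_sub, n_par_full, n_par_sub):
    """Return (delta_chi2, delta_df, p_value) for a submodel nested in a full model."""
    delta_chi2 = minus2ll_sub - minus2ll_full
    delta_df = n_par_full - n_par_sub
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Invented -2 LL values and parameter counts, chosen only so that the
# difference is 19.54 on 21 degrees of freedom.
delta_chi2, delta_df, p_value = likelihood_ratio_test(1000.00, 1019.54, 60, 39)
print(f"delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}, p = {p_value:.2f}")
```

A submodel is retained when the p-value is not significant, that is, when the constraints do not significantly worsen the fit.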

MZ, DZ and sibling correlations contain information on the relative influence of genetic and environmental factors on the variability in traits. Large samples are required to have sufficient power to detect dominance genetic (D) or common environmental (C) effects (Boomsma et al. 2002a; Plomin et al. 2001). The “Appendix” contains power analyses based on 46 MZ and 64 DZ twin pairs and 100 siblings, and shows the power to detect additive genetic (A) and C effects simultaneously at a significance level of .05. Based on the limited sample size and on inspection of the MZ and DZ/twin-sibling correlations as estimated in the saturated models, we proceeded to fit a genetic model in which the contributions of A and unique environmental (E) factors were estimated. The phenotype for an individual can be represented as: \( P_{\text{ij}} = a \times A_{\text{ij}} + e \times E_{\text{ij}} , \) where i = 1, 2, …, 112 (families) and j = 1, 2, 3 (twin 1, twin 2 and sibling). A and E are latent variables (factor scores) standardized to have unit variance and a and e are factor loadings. The variance in P due to A and E is given by the square of a and e, respectively, so that \( {\text{Var}}\left( P \right) = a^{2} + e^{2} . \) Note that \( e^{2} \) also contains variance due to measurement error. MZ twins have (nearly) the same DNA sequence and therefore genetic factors are perfectly correlated in MZ twins. DZ twins and siblings share on average half of their segregating genes, so that the expected correlation between their additive genetic factor scores (A) is ½. The covariance within MZ twin pairs thus is \( {\text{Cov}}\left( {\text{MZ}} \right) = a^{2} , \) and within DZ twin pairs and siblings \( {\text{Cov}}\left( {\text{DZ}} \right) = \tfrac{1}{2} a^{2} . \)
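To make the expectations of the univariate AE model concrete, the following minimal Python sketch computes the implied variance, covariances and correlations from a pair of factor loadings. The function name `ae_expectations` is hypothetical, and the loadings in the example are illustrative values chosen to give a standardized trait with a heritability of .83; they are not estimates from the fitted model.

```python
import math


def ae_expectations(a, e):
    """Expected (co)variances and correlations under the univariate AE model."""
    var_p = a ** 2 + e ** 2          # Var(P) = a^2 + e^2
    cov_mz = a ** 2                  # MZ pairs share all additive genetic effects
    cov_dz = 0.5 * a ** 2            # DZ pairs and siblings share half on average
    return {
        "var_p": var_p,
        "cov_mz": cov_mz,
        "cov_dz": cov_dz,
        "r_mz": cov_mz / var_p,
        "r_dz": cov_dz / var_p,
        "h2": a ** 2 / var_p,        # proportion of variance due to A
    }


# Illustrative loadings: a^2 = .83 and e^2 = .17, so Var(P) = 1.
expected = ae_expectations(math.sqrt(0.83), math.sqrt(0.17))
print({name: round(value, 3) for name, value in expected.items()})
```

Under this model the expected MZ correlation equals the heritability and the expected DZ/twin-sibling correlation equals half of it, which is why MZ correlations of roughly twice the DZ correlations are compatible with an AE model.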

To determine to what extent the covariation among reading performance, memory and IQ was due to genetic and environmental effects, multivariate genetic modeling was applied. In a six-variate saturated AE model the factor loadings of the A and E factors were modeled in lower triangular matrices of dimensions 6 × 6 (IQ, four memory measures, and reading performance), where matrix A contains the genetic factor loadings, and matrix E the environmental factor loadings. The model is represented as: \( p_{\text{ij}} = A \times a_{\text{ij}} + E \times e_{\text{ij}} , \) where i = 1, 2, …, 112 (families) and j = 1, 2, 3 (twin 1, twin 2, and sibling); vector p denotes the 6 phenotypes and has the dimension 6 × 1. Vectors \( a_{\text{ij}} \) and \( e_{\text{ij}} \) have the dimensions 6 × 1 and contain the genetic and environmental factor scores. The factor scores are standardized to have unit variance. The variance in p due to the genetic and environmental factors is given by:

$$ V_{\text{P}} = A \times A' + E \times E' $$

where \( V_{\text{P}} \) is a symmetric 6 × 6 matrix, A and E are lower triangular 6 × 6 matrices of factor loadings, and the prime (') indicates transposition. To test whether variation in genes contributed significantly to the variability in IQ, memory and reading performance, deterioration of model fit of the saturated six-factor model was assessed after the A factor was dropped from the model. Next, non-significant parameters were dropped from the model until the most parsimonious model with still acceptable fit was established.
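The matrix expression for the phenotypic covariance can be illustrated with a small numerical sketch in plain Python. To keep the example short it uses three hypothetical variables instead of six, and all loadings are invented for illustration. The environmental loading matrix is taken to be diagonal here, which is an assumption of the sketch corresponding to a model without environmental covariation.

```python
import math


def times_transpose(m):
    """Return m multiplied by its transpose (m x m')."""
    n = len(m)
    return [[sum(m[i][k] * m[j][k] for k in range(len(m[i])))
             for j in range(n)] for i in range(n)]


def add(x, y):
    """Element-wise sum of two matrices of equal size."""
    return [[xv + yv for xv, yv in zip(xr, yr)] for xr, yr in zip(x, y)]


def to_correlation(cov):
    """Standardize a covariance matrix to a correlation matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]


# Invented lower triangular genetic loadings and diagonal environmental loadings.
A = [[0.8, 0.0, 0.0],
     [0.4, 0.5, 0.0],
     [0.3, 0.2, 0.6]]
E = [[0.6, 0.0, 0.0],
     [0.0, 0.7, 0.0],
     [0.0, 0.0, 0.5]]

G = times_transpose(A)               # genetic covariance matrix, A x A'
V_P = add(G, times_transpose(E))     # phenotypic covariance, A x A' + E x E'
heritability = [G[i][i] / V_P[i][i] for i in range(3)]
genetic_correlations = to_correlation(G)
```

With a diagonal E matrix the off-diagonal elements of the phenotypic covariance matrix equal those of the genetic covariance matrix, so all covariation between traits in this sketch is genetic.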

Results

Means, standard deviations, and age effects are reported in Table 1. Means were not equal for twins and siblings (Δχ² = 29.88, Δdf = 12, P < .01). Means were higher for siblings on IQ, reading performance and DSB; therefore, means were constrained to be equal only for the Corsi block tapping task, DSF, and the 2-back task. There were no significant effects of sex on the means of any of the six measures. A significant effect of age was observed for all variables except IQ. An age regression was included in all genetic models. For all tasks except the Corsi block tapping task, the variances in siblings were larger than in twins (Δχ² = 58.67, Δdf = 21, P < .01). We accounted for this inequality by modeling an additional variance parameter in siblings.

Table 1 Maximum likelihood estimates of means, SD in twins, SD in sibs and age regression of the variables

Correlations among variables were moderate (see Table 2). In the lower parts of Table 2 the MZ and DZ/twin-sibling correlations are given on the diagonal. The DZ covariance could be equated to twin-sibling covariation (Δχ² = 19.54, Δdf = 21, P = .55) for all measures. In all further analyses DZ and twin-sibling correlations were constrained to be equal.

Table 2 Phenotypic and MZ and DZ/twin sibling correlations

Twins could read on average 58 words in 1 min, ranging from 7 to 100 words. According to Cito (1995) children of this age are suspected to be dyslexic when they score below 28 words a minute. Six percent of the twins had a score lower than 28. The siblings scored between 36 and 120 words a minute, with an average of 79 words. The number of children with possible dyslexia in the group siblings was 7 (9%).

MZ correlations were higher than DZ/twin-sibling correlations, suggesting genetic influences on all six variables. Below the diagonal MZ cross correlations and above the diagonal DZ/twin-sibling cross correlations are presented. Cross correlations were higher in MZ than in DZ/twin-sibling pairs, suggesting that genes play a role in the covariation amongst the six measures. Additive genetic effects could not be dropped from the AE model without a significant deterioration of fit (compared to full AE model: Δχ² = 219.67, Δdf = 21, P < .01), but all environmental covariation could be dropped from the model (compared to full AE model: Δχ² = 7.58, Δdf = 15, P = .94). The final most parsimonious model is represented in Fig. 1 with three common genetic factors: (1) a genetic factor common to all variables; (2) a genetic factor common to visuospatial STM and verbal and visuospatial WM; and (3) a genetic factor common to verbal memory and reading performance. For reading performance and visuospatial WM there were specific genetic factors (compared to the model without environmental covariation: Δχ² = 5.99, Δdf = 7, P = .54).

Fig. 1

Best fitting AE model with standardized parameter estimates in italics and unstandardized parameter estimates in normal script (95% confidence intervals). The model is shown for a single individual (V verbal; VS visuospatial; STM short-term memory; WM working memory; DSF digit span forward; DSB digit span backward; OMRT one minute reading test)

Heritability estimates based on this model are presented in Table 3. Heritability ranged between 27 and 83%. The highest heritabilities were seen for IQ (75%) and reading performance (83%). Genetic variation in the reading measure is largely independent of IQ and memory related traits: of the total genetic variance, roughly half (47%: .62²/(.45² + .49² + .62²)) is not shared with the other phenotypes in the model. Visuospatial WM and STM do not contribute independently from intelligence to the genetic variability in reading performance. The verbal memory tasks contribute independently from intelligence to 29% of the genetic variability in reading performance: .49²/(.45² + .49² + .62²). Genetic covariance shared between reading performance and verbal WM is for 45% independent of IQ: (.49 × .28)/(.49 × .28 + .45 × .38). In the case of verbal STM, 71% of this shared genetic covariance is independent of IQ: (.49 × .63)/(.49 × .63 + .45 × .27). Verbal STM and WM do not contribute independently from each other to the genetic covariance in reading performance.
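The proportions reported above follow directly from the standardized path loadings and can be recomputed with a few lines of Python. The variable and function names are hypothetical. Because the calculation starts from loadings rounded to two decimals, the results can differ by about one percentage point from the percentages reported in the text, which were presumably derived from unrounded estimates.

```python
# Standardized genetic loadings on reading performance, as quoted in the text.
common = 0.45     # genetic factor common to all variables
verbal = 0.49     # genetic factor shared with verbal memory
specific = 0.62   # genetic factor specific to reading performance

genetic_variance = common ** 2 + verbal ** 2 + specific ** 2
share_specific = specific ** 2 / genetic_variance
share_verbal = verbal ** 2 / genetic_variance


def share_independent_of_iq(load_on_verbal, load_on_common):
    """Proportion of genetic covariance with reading that runs via the verbal factor.

    load_on_verbal / load_on_common: loadings of a memory measure on the
    verbal-memory factor and on the common genetic factor.
    """
    via_verbal = verbal * load_on_verbal
    via_common = common * load_on_common
    return via_verbal / (via_verbal + via_common)


verbal_wm_share = share_independent_of_iq(0.28, 0.38)    # digit span backward
verbal_stm_share = share_independent_of_iq(0.63, 0.27)   # digit span forward

for label, value in [("genetic variance of reading", genetic_variance),
                     ("specific share", share_specific),
                     ("verbal-memory share", share_verbal),
                     ("verbal WM covariance independent of IQ", verbal_wm_share),
                     ("verbal STM covariance independent of IQ", verbal_stm_share)]:
    print(f"{label}: {value:.3f}")
```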

Table 3 Heritability estimates (95% confidence intervals)

Discussion

We investigated in a genetically informative design whether the association among IQ, WM, STM and reading performance is explained by common genes and whether IQ, WM and STM independently contribute to the genetic and environmental variance in reading ability. For reading performance we found a heritability of 83%. This is somewhat higher than the estimates in some previous studies (Tiu et al. 2004; Wadsworth et al. 2000). A possible explanation for this difference is that, in contrast to previous studies, we captured the full range of IQ.

Results showed further that part of the correlation between IQ, WM, STM and reading performance is explained by one common set of genes (genetic pleiotropy). This suggests that there is not a complete etiological separation between low IQ, deficits in WM, STM and reading. Environmental effects, however, do not create associations between these traits. This finding is in concordance with the view of Price et al. (2000) and Petrill (1997): genetic studies tend to show substantial genetic overlap, which explains the association found between different cognitive abilities, while environmental factors primarily drive the different dimensions of cognitive functioning.

However, in addition to this genetic pleiotropy, variation in reading performance is also explained by genes specific to reading performance and by a set of genes common to verbal memory and reading performance. The specific factor for reading performance explains about half of the genetic variation in reading performance. The verbal memory factor is as important for explaining variation in reading performance as the common genetic factor for reading, IQ and memory. Independently of intelligence, verbal STM and WM account for 29% of the genetic variability in reading performance. This factor could represent the problems children with reading disability have in the ability to code information phonemically or verbally, which is an important aspect of verbal STM (Kercher and Sandoval 1991), but also of verbal WM. So, although there is no etiological separation between low IQ, deficits in WM, STM and reading, there is an etiological separation between low intelligence, low memory performance and lower reading performance on the one hand and deficits in verbal memory and reading performance on the other hand.

Three genetic factors influence variability in reading performance, a genetic factor which might represent general intelligence or “g”, a genetic factor representing verbal coding and a genetic factor specific to reading performance. Children who have a genetic predisposition for low g, might still have a genetic predisposition for average verbal coding and vice versa, but a combination of a genetic predisposition for low g and verbal coding deficits is also possible. This suggests that a possible way to classify children with reading disabilities is the following: children who are low in general intelligence and therefore are less skilled in reading; children who have normal IQ and a deficit in coding information phonemically or verbally and therefore experience problems with reading; and children with low IQ and deficits in phonemic and verbal coding, this group may experience more severe reading problems than the other two. This hypothesis is supported by the findings of Alarcón and DeFries (1997) which revealed that phenotypic and genetic correlations between intelligence and reading performance were larger for a control group than for the group affected with reading disabilities. This study also showed that genetic and phenotypic variances and covariances amongst the reading measures were larger for the affected group (twin pairs of which at least one twin had reading disabilities), suggesting that reading disability is caused by one or more genes with major effects.

One possible limitation of this study is that only 6% of the twins and 9% of the siblings are suspected of having reading problems. Although there is some evidence that the etiology of reading performance within the normal range does not differ from that of reading deficits, this still could affect the generalizability of our findings to the population of reading disabled individuals. Therefore, caution is warranted in drawing inferences about categorization of children with reading disabilities. Another limitation is that the sample size of the study did not permit simultaneous estimation of the contributions of A and C (see “Appendix”).

The categorization that resulted from our study could be important for the diagnosis of specific reading disability. “Specific reading disability, or dyslexia, is defined as an unexpected, specific, and persistent failure to acquire efficient reading skills despite conventional instruction, adequate intelligence, and sociocultural opportunity” (Démonet et al. 2004, p. 1451). Several authors have argued against including adequate intelligence in the definition of dyslexia and therefore against the use of discrepancy scores, based on differences between IQ and reading achievement scores, to identify children with reading disability (Stanovich 1993; Sternberg and Grigorenko 2002). The risk of using IQ-achievement discrepancy in the definition of reading disability is that children with lower IQ scores (or those with no significant IQ-achievement discrepancy) may be denied intervention services (Wadsworth et al. 2000). Our finding also argues against this strategy, because using it would miss the children with low IQ and deficits in phonemic and verbal coding: the children who may have the most severe reading disability.

Our findings also might be of importance in the search for genes for reading performance. Our results suggest that variation in reading performance may be genetically heterogeneous: for some children with reading disability their deficits may be caused by low g, while for others an STM deficit may contribute to their reading problems. At an individual level it may be very difficult to make this distinction. Moreover, excluding children who are, for example, low in g might lower the statistical power in genetic association studies, which require large sample sizes to identify genetic polymorphisms (e.g., McCarthy et al. 2008; Evans 2008). However, including children with reading problems of different etiologies may also hamper gene detection. Possibly, a solution may come from including multiple correlated measures of reading ability, IQ and memory in multivariate analyses. This strategy has proven successful in QTL linkage studies of reading- and language-related measures (Marlow et al. 2003) and of, e.g., lipid levels (Hottenga and Boomsma 2008). Multivariate approaches have also been suggested within the context of genome wide association (GWA) studies (Ferreira and Purcell 2009; Ding et al. 2009) and may offer a first-step solution to analyzing phenotypic measures of reading (dis)ability simultaneously with data on IQ and memory performance.