Does Household Income Affect Children’s Outcomes? A Systematic Review of the Evidence

There is abundant evidence that children in low income households do less well than their peers on a range of developmental outcomes. However, there is continuing uncertainty about how far money itself matters, and how far associations simply reflect other, unobserved, differences between richer and poorer families. The authors conducted a systematic review of studies using methods that lend themselves to causal interpretation. To be included, studies had to use Randomised Controlled Trials, quasi-experiments or fixed effect-style techniques on longitudinal data. The results lend strong support to the hypothesis that household income has a positive causal effect on children’s outcomes, including their cognitive and social-behavioural development and their health, particularly in households with low income to begin with. There is also clear evidence of a positive causal effect of income on ‘intermediate outcomes’ that are important for children’s development, including maternal mental health, parenting and the home environment. The review also makes a methodological contribution, identifying that effects tend to be larger in experimental and quasi-experimental studies than in fixed effect approaches. This finding has implications for our ability to generalise from observational studies.

On average, children growing up in low income households have poorer health than children from richer backgrounds and score worse on tests of cognitive, social and behavioural development (e.g. Duncan and Brooks-Gunn 1997; Mayer 1997; Washbrook et al. 2014; Bradbury et al. 2015). They go on to do less well in education, have lower self-esteem as adolescents, and are more likely to become involved in crime or delinquent behaviour (e.g. Haveman et al. 1997; Hobcraft 1998; Ermisch et al. 2001).
However, the extent to which these associations reflect causal relations is still not well understood. On the one hand, there are good reasons to think that income itself has an influence on child development. The literature has highlighted two relevant theories (Duncan et al. 2014). 'Investment theory' captures the way financial resources enable parents to buy goods that children need in order to thrive: good quality housing and a healthy diet; books and other learning materials; outings and holidays (Duncan et al. 2017). The 'family stress model' refers to the impact of economic disadvantage on the emotional home environment, holding that hardship causes stress for parents, and this in turn affects their parenting abilities (Conger et al. 2000). Economic stress can make parents frustrated, less patient and lacking in the emotional resources needed for supportive and nurturing parenting behaviours (McLoyd 1990; Magnuson and Duncan 2002). The two models are not mutually exclusive, and might interact with each other: for example, more money might give parents the mental space to plan healthier meals and stimulating activities, as well as the resources to afford them.
On the other hand, a number of confounding factors may explain the apparent link between financial resources and children's outcomes. Parents who command higher incomes are likely to have higher levels of human capital, leaving them better placed to help with school work and to negotiate public services in their child's interest. They may, on average, put more emphasis on educational success. Higher income and wealth may also be associated with types of social and cultural capital which can offer children multiple direct and indirect advantages affecting health as well as educational attainment and progression (Lareau 1987; Waterston et al. 2004; Ferguson 2006; Tramonte and Willms 2010).
Whether and how far income in itself makes a difference to children's outcomes, rather than simply being a correlate of these other drivers of child development, is crucial to our understanding of what children need in order to thrive. It is also a question of central importance to policy, given that the level of household income can be influenced relatively easily by government through the tax-benefit system. This article seeks to answer this question with a review of the evidence examining the causal relationship between household financial resources and children's health, cognitive and social-behavioural outcomes. It attempts to bring together the published evidence on this topic in OECD countries over a 29-year period, 1988-2017. To ensure that the review was robust and unbiased, two decisions were taken at the outset. The first was to limit the review to studies using credible causal methods: Randomised Controlled Trials, quasi-experiments or fixed effect-style techniques on longitudinal data. The second was to conduct the study using systematic review principles (see Oakley et al. 2005; Wallace et al. 2004). This meant establishing clear criteria for inclusion and exclusion of studies at the outset as well as specifying and publishing search terms.
We begin by expanding on the methodological choices made in the review, and the limitations that remain. We go on to present the results in two sections, first exploring the evidence on whether money matters for different outcomes, and then how much it matters.

Method
There were two stages to the review process. In the first stage, a fully comprehensive systematic review was conducted, examining every abstract returned by systematic searches of studies published between 1988 and 2012. In the second stage, this review was updated with evidence published from 2012 to early 2017; here only the abstracts of the top 2000 search results, ordered by relevance, were reviewed. Resource constraints ruled out repeating the comprehensive approach taken in the first stage, but the sampling approach should protect against bias in terms of the evidence found.
The aim was to identify all relevant studies using reliable causal methods to investigate the impact of household financial resources on children's outcomes, or on intermediate outcomes. We interpreted 'reliable causal methods' to include randomised controlled trials (RCTs), quasi-experiments, and fixed effect or other similar techniques using longitudinal data to measure within-household changes in resources and outcomes. We do not wish to imply that all included studies should be treated equally, however. There is a clear hierarchy of methods in establishing causation, from the 'gold standard' of RCTs at the top (Sefton et al. 2002), down through quasi-experimental approaches to the least robust fixed effect studies. We reflect further below on the particular weaknesses of each approach, and throughout the review we consider how the evidence falls across the different method types as well as overall.
All included studies used a measure of household income as their key explanatory variable. Searches for evidence on children's outcomes were conducted under three broad headings: cognitive development and school achievement; social, behavioural and emotional development; and physical health. In addition, we also searched for evidence on intermediate outcomes: parenting and the home learning environment; maternal mental health; parental health behaviours; and expenditure patterns.
There were several additional inclusion criteria:
• Financial resources had to be measured during childhood, but outcomes could be measured with a lag (for example, the effect of childhood income on high school graduation).
• We chose to focus the review on the richer country context, so studies had to cover countries in the EU and/or OECD (see Bastagli et al. 2019 for a review of evidence from low- and middle-income countries). In part this decision was made to keep the scope of the review manageable, and in part to focus on the importance of household resources in countries with well-established welfare states. Where high quality public services are universally available, there might be less reason to expect private financial resources to affect child outcomes, making the question especially interesting. In practice, there are extensive differences in public service provision in the countries covered in the review; for example, the absence of universal health services in the USA. We bear these differences in mind in discussion.
• Studies had to examine the impact of money on children's outcomes in the last fifty years, to maximise relevance to present day policy.
• Studies had to have an abstract in English. A handful of potentially relevant studies in other languages were translated but eventually excluded.
• To keep searches manageable, studies had to be published between 1988 and 2017, as searches were completed in March 2017.
We focus primarily on peer-reviewed journal articles, providing a quality threshold for included studies, though we also included working papers from 2009 onwards, to avoid excluding relevant recent work that had not yet made it through the peer-review process. In practice only one of the working papers we identified (Tominey 2010) remained unpublished at the time of writing. In clinical literature, studies that identify significant results have been found more likely to be published than those that do not (e.g. Dubben and Beck-Bornholdt 2005), and we acknowledge that there remains a risk of publication bias here, although we expect that a finding that income has no effect on wider outcomes might be considered an interesting result in itself.
Having confirmed the inclusion criteria we conducted the search strategy in three steps. First, search terms were developed for each outcome, and initial searches run on selected datasets. Each search template included three common sets of terms (one for financial resources, one for method, and one to limit the search to children) and one unique set of terms for the particular outcome. The databases searched were: EconLit, SocIndex, IBSS (International Bibliography of the Social Sciences), British Education Index, PsychInfo and Medline.
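As a minimal sketch of how a search template of this shape could be assembled, the Python below joins three common term sets and one outcome-specific set into a single Boolean query. The term lists and the names `or_group` and `build_query` are illustrative assumptions; they are not the published search terms or the syntax of any particular database.

```python
# Illustrative sketch only: these term lists are hypothetical placeholders,
# not the search terms actually used in the review.
COMMON_TERMS = {
    "financial_resources": ["income", "poverty", "cash transfer"],
    "method": ["randomised", "quasi-experiment", "fixed effects"],
    "children": ["child*", "adolescen*"],
}

# One unique term set per outcome heading (again, hypothetical terms).
OUTCOME_TERMS = {
    "cognitive": ["cognitive", "test score", "school achievement"],
    "health": ["birthweight", "child health"],
}


def or_group(terms):
    """Join one set of terms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(outcome):
    """Combine the three common term sets with one outcome set using AND."""
    groups = list(COMMON_TERMS.values()) + [OUTCOME_TERMS[outcome]]
    return " AND ".join(or_group(g) for g in groups)


print(build_query("health"))
```

Keeping the common sets separate from the outcome set means the same three filters are applied identically to every outcome search, which is the property the template design relies on.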
In a second step, studies were manually screened on title and abstract, and excluded if they did not fulfil the criteria set out above. Following Greenhalgh and Peacock (2005), we then added potentially relevant studies referred to us or cited in identified studies ('snowballed'). For transparency we note which studies were referred or snowballed.
In the third step, methods sections were reviewed to ensure that studies used one of the listed methods, and could reasonably be said to examine the potential causal effect of income.

A Hierarchy of Methodologies
An important question about the study approach concerns the potential conflation of results from studies using a range of methods, some significantly more robust than others in regard to causal inference. Our aim is to be clear about the methods used, and their weaknesses, so that readers are not misled about the strength of evidence in particular areas. The clearest decisions regarding inclusion involved studies using randomised controlled trials to identify an income effect. RCTs are uniquely strong, because in these studies income can truly be treated as an exogenous 'shock', affecting households independently of other hidden characteristics. This allows observed changes in outcomes to be confidently attributed to the change in income.
Quasi-experimental studies are more common and usually larger in scale. They exploit situations in which some people receive more income than others because of natural policy variation due to circumstances of time, place or observed characteristics. Different approaches (difference-in-difference, regression discontinuity, instrumental variables) can then be used to estimate the effect of the change in income by comparing outcomes with a quasi-control group.
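To make the comparison with a quasi-control group concrete, the sketch below computes the simplest two-group, two-period difference-in-difference estimate from four group means. The function name and the numbers are hypothetical, and the estimate is only interpretable as an income effect under the usual assumption that both groups would have followed parallel trends without the policy change.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Return the difference-in-difference estimate from four group means."""
    change_treated = treated_after - treated_before
    change_control = control_after - control_before
    # The control group's change stands in for what would have happened
    # to the treated group without the income change (parallel trends).
    return change_treated - change_control


# Hypothetical mean test scores before and after a benefit expansion.
estimate = diff_in_diff(
    treated_before=50.0, treated_after=56.0,
    control_before=52.0, control_after=55.0,
)
print(estimate)
```

The included studies typically estimate this in a regression framework with covariates; the arithmetic above is only the core comparison.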
The weakest studies, which we refer to as observational studies, make use of fixed effects methods or similar techniques on longitudinal data to measure within-household changes in income and outcomes. By focusing on within-household changes, these studies reduce the risk of omitted variable bias compared to cross-sectional analysis, but they do not eliminate it: much variation in income is likely to be driven by changes in labour supply or wages, and cannot be seen as exogenous to the family's other characteristics. For example, a younger sibling experiencing higher family income than an older one may also have experienced less time with parents and more time in childcare. Additionally, there is greater risk of measurement error with these studies, as in most cases they capture household income changes through survey data rather than using specific income shocks from RCTs and quasi-experiments. Further, standard errors are often considerably higher in models focusing on within-household change, where most of the available variation has been discarded, than in models which make use of both between- and within-household variation; this means results are less likely to be found to be significant (Allison 2005). Nevertheless our ability to generalise from their results may be greater than for localised RCTs. Comparing results from these studies to those from more robust approaches should be informative, if care is taken not to treat all studies as equal.
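The within-household logic can be illustrated with a small worked example. The sketch below builds a hypothetical two-household panel in which an unobserved, time-invariant household advantage is correlated with income, then compares a pooled regression slope with the slope obtained after subtracting each household's own means (the 'within' transformation). All data, names and the true effect of 2.0 are assumptions chosen for illustration.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = sum((a - mean_x) ** 2 for a in x)
    return num / den


def within_transform(groups, values):
    """Subtract each household's own mean from its observations."""
    totals, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


# Hypothetical panel: two households, three survey waves each.
household = ["A", "A", "A", "B", "B", "B"]
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# Unobserved household advantage, fixed over time, higher for the richer household.
fixed_effect = {"A": 0.0, "B": 10.0}
TRUE_EFFECT = 2.0
outcome = [TRUE_EFFECT * x + fixed_effect[h] for h, x in zip(household, income)]

pooled = ols_slope(income, outcome)  # mixes income effect with household advantage
within = ols_slope(
    within_transform(household, income),
    within_transform(household, outcome),
)  # uses only within-household changes

print(pooled, within)
```

In this constructed case the within estimate recovers the assumed effect because the confounder never changes over time. A confounder that moves with income inside a household, such as the change in parental time mentioned above, would not be removed by the transformation.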

Search Results
Figure 1 tracks the number of references from the initial search results for both stages of the review. The first and second stages found 46,657 and 6,198 studies respectively. Of these, 54 studies met all inclusion criteria. 42 of these came from the online searches, two were referred to us (Wickham et al. 2017 and Fitzsimons et al. 2017, both published shortly after the searches were conducted) and ten came from snowballing. Seven of the snowballed studies were identified in the second stage of the review, where only the top 2000 references in the search results were examined, indicating that (as expected) this approach allowed some studies to slip through the net. Of the three studies snowballed from the first stage searches, two had been returned in searches but initially excluded on the basis of the abstract, which had been interpreted as examining the impact of welfare-to-work programmes overall without adequately separating income effects from employment effects (Gennetian and Miller 2002; Clark-Kauffman et al. 2003). The third study, by Løken et al. (2012), did not include any of the outcome terms, referring to 'children's outcomes' in general.

Results: Does Money Matter for Children's Outcomes?
Table 1 presents an initial summary of the evidence grouped by the three main methodological approaches and then by country. (Online Appendix Table A1 lists each study; further details are available in the additional online Appendix Table A2.) Overall, 34 of the 54 studies show a significant and positive relation between income and each of the child outcomes they looked at (although not necessarily all measures of each outcome; that is, under cognitive development a study would be coded as having positive effects if it finds significant results for maths but no effects for reading). We use the 5% threshold as the criterion for assessing significance; if results are significant at the 5% but not the 1% level this is noted in Appendix Table A2. Eight studies find no significant causal relation, while eight find a positive effect for some but not all outcomes; for example, a significant effect on cognitive but not health outcomes. There are four studies that find some evidence of negative income effects (further discussion below). On a simple headcount, then, the clear majority of studies indicate a positive causal relation between income and child outcomes. This is true across all three methodological approaches.
Table 1 shows that over half the studies come from the US, a reflection of the availability of high quality longitudinal data, the long tradition of research in this field, and the potential of state-level policy variation. This raises questions about generalizability, which we return to below, and highlights the need for more research on this topic in Europe and beyond.
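The headcount reported above can be checked with a few lines of arithmetic. The counts are those given in the text; the dictionary keys are shorthand labels introduced here for illustration.

```python
# Study-level headcount as reported in the text (labels are shorthand).
headcount = {
    "positive for all outcomes": 34,
    "positive for some outcomes": 8,
    "no significant effect": 8,
    "some negative effects": 4,
}

total = sum(headcount.values())
shares = {label: round(100 * n / total, 1) for label, n in headcount.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The four categories sum to the 54 included studies, with just under two thirds finding positive effects on every outcome examined.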
A simple headcount of individual studies has limitations. Several of the studies make use of the same quasi-experiment, or same dataset, albeit often measuring different child outcomes. Further, there are groups of studies authored by the same or overlapping teams, slicing the evidence from one experiment into several papers. This raises a risk of double-counting and giving too much weight to what is really a single piece of evidence. There are two extreme examples of this. Nine studies analyse aspects of the Earned Income Tax Credit (EITC) in the US, though often using different approaches to estimate the effect, while four of the five RCTs make use of evaluations of the same welfare programmes including the US Family Investment Programme. While there is considerable value in having repeated studies take a fresh approach to a dataset or experiment, nine EITC studies cannot be treated as equivalent to nine separate pieces of evidence if we are interested in generalizable findings about the importance of income effects. Therefore Table 2 presents the evidence grouped by 'case', with each case identifying a distinct RCT, quasi-experiment or dataset. Organised like this, the total number of pieces of evidence reduces to 29.
Note: By 'outcome' is meant a type of outcome (cognitive/educational, social-behavioural, health, or intermediate outcome, e.g. parenting/home environment or maternal mental health). Where more than one measure of a given outcome was examined in a study (e.g. for cognitive/educational outcomes, both school attainment tests and total years in education), the case is coded as positive if at least one measure showed significant positive results. 'Positive results for all outcomes' means positive effects were found for at least one measure in each outcome.
Importantly, the conclusions remain the same: a clear majority of cases (16 of the 29) find significant positive effects on all outcomes measured, with a further six finding positive results for some outcomes.
There are only four cases in all where no positive effects are identified (including one identifying negative effects), plus three cases that find some negative alongside some positive results. Is the evidence stronger in relation to some outcomes in particular? Tables 3 and 4 present evidence by type of outcome, first for individual studies and then by cases. Grouped in both ways, there is most evidence on children's educational attainment and cognitive development, followed by social-behavioural-emotional and physical health outcomes. In all three categories, a clear majority of studies find positive and significant income effects.
Fewer studies examined the effect of income on intermediate outcomes. Though smaller in number, the evidence for an income effect on parenting, the home environment and maternal mental health is strong. For both parental health behaviours and expenditure results are more mixed, especially when grouped by case. We can see now that the negative effects show up in relation to expenditure patterns, parental health behaviours and child health.
There is insufficient space here to do justice to the methods or findings of each included study, but we briefly review some of the evidence for each outcome category. We concentrate on experimental and quasi-experimental evidence, while also considering all cases that find no effects or negative effects. A list of all included studies, with detail on each, can be found in the two appendix tables.
Note: By 'outcome' is meant a type of outcome (cognitive/educational, social-behavioural, health, or intermediate outcome, e.g. parenting/home environment or maternal mental health). Where more than one measure of a given outcome was tested within a case, the case is coded as positive if at least one measure showed significant positive results.
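The coding rules described in the table notes can be sketched as a small classifier. This is an illustrative reading, not the authors' procedure: it assumes every estimate is oriented so that a positive value means a better outcome for the child, and the treatment of negative results within `code_case` is our own simplification. The 5% threshold is the one stated in the text.

```python
ALPHA = 0.05  # significance threshold stated in the review


def measure_sign(estimate, p_value):
    """Return +1 / -1 for a significant positive / negative estimate, else 0."""
    if p_value >= ALPHA:
        return 0
    return 1 if estimate > 0 else -1


def code_outcome(measures):
    """Code one outcome type from a list of (estimate, p_value) measures."""
    signs = [measure_sign(e, p) for e, p in measures]
    has_pos, has_neg = 1 in signs, -1 in signs
    if has_pos and has_neg:
        return "mixed"
    if has_pos:  # at least one significant positive measure
        return "positive"
    if has_neg:
        return "negative"
    return "no effect"


def code_case(outcomes):
    """Code a case from a dict mapping outcome type to its measures."""
    codes = [code_outcome(m) for m in outcomes.values()]
    if all(c == "positive" for c in codes):
        return "positive for all outcomes"
    if any(c in ("mixed", "negative") for c in codes):
        # Assumption: any negative finding alongside a positive one is 'mixed'.
        return "mixed" if any(c in ("positive", "mixed") for c in codes) else "negative"
    if any(c == "positive" for c in codes):
        return "positive for some outcomes"
    return "no effect"


# Hypothetical case: one significant maths result, nothing significant for health.
example = {
    "cognitive": [(0.12, 0.01), (0.03, 0.40)],
    "health": [(0.02, 0.30)],
}
print(code_case(example))
```

Because one significant measure is enough to code an outcome as positive, this rule is generous to studies that test many measures, which is worth bearing in mind when reading the headcounts.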

Cognitive Development and School Attainment
Our evidence base identifies positive income effects on a range of measures of cognitive development and educational attainment, including short- and medium-term changes in school engagement and test scores or grade point average, and longer-term outcomes including high school graduation and college entry. The evidence on cognitive outcomes from RCTs all comes from four studies of up to 14 US and Canadian welfare to work programmes, which compare the impact of programmes that increase employment but not income to those that boost income too. Because of the overlaps, we count all four studies looking at these programmes as a single 'case'. Two of the studies bring together data from multiple programmes, giving the largest sample size; these find positive effects on maternal and teacher reports of performance in school and on test scores. Following up two years after assignment, Clark-Kauffman et al. (2003) find effects were only significant for younger children, aged under five when first affected, but a later follow-up, assessing outcomes 2-5 years on, finds more widespread effects (Duncan et al. 2011).
Note: In this table multiple studies are treated as one, and the same rules applied to evaluating the findings, i.e. 'positive' if positive effects were found for outcomes by at least one measure/in at least one of the studies, 'no effect' if none of the studies/measures within that case found a significant effect. 'Mixed' refers to when a mixture of positive and negative effects is found.
Quasi-experimental evidence includes Akee et al.'s (2010) examination of a casino profit share in the US state of North Carolina, in which the casino began distributing profits to households containing Cherokee tribal members but not other adults. Following households with children aged 9-13 at the time of the first distribution, the study finds the extra income had positive effects on school attendance, high school graduation and years of completed schooling. Also for the US, Dahl and Lochner (2012) identify improvements in maths and reading scores for children in the low- and middle-income households that benefited from substantial increases to income through the EITC in the late 1980s and early 1990s; while in Canada, higher receipt of child benefit, resulting from variation in policy over time and across provinces, is found to positively affect maths scores while also (for boys) decreasing the likelihood of a learning difficulty diagnosis, but only for children of mothers with lower levels of education (Milligan and Stabile 2011).
Three studies examine Mexico's Oportunidades Conditional Cash Transfer (CCT) programme (Fernald et al. 2008, 2009; Manley et al. 2015). The difficulty with assessing the impact of CCTs is isolating the effect of the income independently of any effect (including selection effects) of the conditions. Manley et al. (2015) address this most robustly, with an instrumental variable approach which simulates the amount a family should receive because of household size and children's ages, rather than the amount they receive in practice. Higher cumulative amounts are linked to significantly higher verbal assessment scores; cognitive scores are also higher but below our significance threshold.
There are several cases examining educational outcomes in Norway, reflecting the availability of rich administrative data for that country, including military records which capture IQ for men at age 18. Children in families that qualified for childcare subsidies achieved better results in grade point average and oral (but not written) exams at age 13-16 than children in families just above the income cut-off for subsidies, but who also attended childcare (Black et al. 2014). Years of schooling, high school completion and IQ at age 18 are all found to be positively linked to local area income shocks between ages 1 and 16 (Tominey 2010). And these same outcomes were boosted for children living in Norwegian regions most affected by the 1970s oil shock. While no overall effects of the oil shock were identified by Løken (2010), Løken et al.'s (2012) follow-up study allows for non-linearities, finding significant effects for lower-income households.
There are only three 'cases' where there are no significant income effects at conventional levels: two experiments and one observational study. Cesarini et al. (2016) look at the effects of large wins in three different Swedish lotteries and find no effects on grade point average, English, maths or Swedish scores. It is possible that lottery wins may affect families differently to more day-to-day income changes from wages or social security benefits (Doherty et al. 2006). Shea (2002) uses variation in fathers' earnings due to union status, industry and involuntary job loss to explore the effects of long-term childhood income on schooling duration; no effects are found. And finally, Blanden and Gregg (2004) use sibling fixed effect models to look at income and education in the UK British Household Panel Study. They find income at age 16 is a predictor of staying on at school at that age, but is only marginally significant because of large standard errors, reflecting a relatively small sample size; we classify this as a null result, the only null result of our seven observational cases. The six other cases, all identifying positive effects, include studies of the Longitudinal Study of Australian Children (Khanam and Nghiem 2016); public register data for Norway (Elstad and Bakken 2015); the UK Millennium Cohort Study (Violato et al. 2011); and three different US datasets: the Panel Study of Income Dynamics (Duncan et al. 1998), the Children of the National Longitudinal Survey of Youth (Blau 1999; Burnett and Farkas 2009), and the Miami School Readiness Project (Morrissey et al. 2014).

Social, Emotional and Behavioural Development
There is a smaller body of evidence on social, emotional and behavioural outcomes, including fewer experimental and quasi-experimental studies. In the only RCT case, the Minnesota Family Investment Program (MFIP), Gennetian and Miller (2002) compare outcomes for children of mothers in three groups: those receiving financial incentives to enter work; those who received incentives plus mandatory employment services; and a control group. They find significant effects of the financial incentives on positive behaviour, but not on behavioural problems, for children aged 5-13 at the time of the three year follow-up, with larger effects on girls and those who were 6+ at study entry. They also find some evidence of negative effects of the employment mandate: children's social competency and autonomy are lower in households where the extra income came with a mandatory work requirement. A later study by the same authors also finds a link between higher income and positive behaviour, but in this case not reaching the significance threshold (Morris and Gennetian 2003).
Positive effects were identified on behavioural symptoms from the Cherokee casino profits share discussed above (Costello et al. 2003), though only for children whose households moved out of poverty as a result of the transfers. Parental supervision was found to be the main mediator. The extra income also reduced youth involvement in crime, including whether young people had ever committed a crime or ever dealt in drugs by age 21, with the greatest impact on the poorest households (Akee et al. 2010).
The US EITC is found to have reduced behavioural problems measured by the BPI at two-year follow up, but effects are no longer significant four years on (Hamad and Rehkopf 2016). For Canada, Milligan and Stabile's study of child benefit finds positive effects on a range of scores for children aged 4-10, including those for hyperactivity, anxiety and conduct disorder (Milligan and Stabile 2011).
We identify two cases which examine social and behavioural outcomes and find no significant effects. One is Cesarini et al.'s (2016) study of Swedish lotteries: no effect of wins is identified on young men's armed forces psychological assessment at age 18. And among the six cases using observational approaches there is one null result: Khanam and Nghiem (2016) find no effect of income from age 4-5 years onwards on any of the subscales of the Strength and Difficulties Questionnaire in the Longitudinal Study of Australian Children. However, this study controls for factors that are potentially mediators, including parents' mental health and parenting, meaning income effects could be being controlled away. Indeed, the authors conclude that parental stress plays a significant role in mediating the effect of income on non-cognitive development. Five other observational studies find positive income effects on social-behavioural outcomes using data for the UK, US, Canada and Norway (Dearing and Taylor 2007; Dooley and Stewart 2007; Zachrisson and Dearing 2015; Fitzsimons et al. 2017; Wickham et al. 2017).

Health
The results for health are somewhat more mixed than for other child outcomes. There is a clear and positive story in relation to outcomes around birth, but more ambiguous results on later health measures, and only null results from the few studies looking at asthma or respiratory diseases. We should remember that the scale of income changes captured in these studies is unlikely to be sufficient to facilitate improvements in housing conditions. Further, some studies are found to control for other potential mediators such as parental smoking.
A number of quasi-experimental US cases examine early birth outcomes, all identifying positive effects. Chung et al. (2016) explore the Alaska Permanent Fund, which began to distribute (unanticipated) substantial dividends to Alaska's population in the early 1980s, resulting in substantial reductions in low birthweight in recipient households alongside small positive effects on other birth outcomes including Apgar scores. Komro et al. (2016) exploit variation in minimum wages across US states and over time, finding higher wages associated with a fall in low birthweight and post-neonatal mortality. Mocan et al. (2015) make use of the way skill-biased technology shocks affected earnings in different industries, identifying positive effects on birthweight and gestational age for low-skilled women specifically. And two studies looking at the EITC find improvements in birthweight and Apgar score and reductions in the incidence of pre-term birth (Strully et al. 2010; Hoynes et al. 2015).
The story about later health is less clear. Manley et al. (2015) find positive effects of income from Mexico's Oportunidades CCT on children's height-for-age, and in the UK, Kuehnle (2014) exploits variation in local labour markets and finds positive effects on children's general health between ages 4 and 8 as reported by parents, though no effects on specific health conditions such as respiratory diseases.
But several studies find no effects, or even negative effects, on measures of child health. We classify four studies as having null results, with any positive effects falling below our significance threshold: Chia (2013) on the impact of the EITC on the risk of children aged two plus being overweight or obese; Milligan and Stabile (2011) on child benefit and a range of child health measures in Canada; Tominey (2010) on local area income shocks and links to men's physical health at 18 in Norway; and an observational study using the UK Millennium Cohort Study to look at links between family income and asthma and wheezing (Violato et al. 2009). Note however that this last study controls for several potential mechanisms: tobacco exposure, parental health and maternal depression.
Finally, we group two studies as 'mixed', because they show some indications of negative as well as positive effects. One is Cesarini et al.'s (2016) lottery study, which finds a reduction in young men's obesity risk at age 18 but no effect on birthweight or drug consumption for allergies or asthma, and an increase in hospitalisations in the year following the lottery. And an observational study using the US Panel Study of Income Dynamics finds positive effects on general health of additional income in low income families, but with unexpected, and unexplained, negative effects for those on the very lowest incomes (Johnson and Schoeni 2011).

Intermediate Mechanisms
All the evidence on parenting and the home environment comes from the US, and five out of six cases find positive effects. The null study is the MFIP, with no significant effects identified on the home environment or on any of the parenting measures included, though the changes are in the expected direction (Gennetian and Miller 2002). But results from another RCT do show positive effects. Cancian et al. (2013) examine the Wisconsin Works programme, which allowed child support payments made by non-resident fathers to be kept in full by families in the treatment group, rather than being partly withheld for families in receipt of benefits. In households where mothers kept the full payment there was a significant reduction in the risk of child abuse and neglect.
Quasi-experimental evidence on parenting comes from the North Carolina casino study, where extra resources led to increased parental supervision and an improvement in mother-child relations (Akee et al. 2010), and from the EITC expansion: Hamad and Rehkopf (2016) find no significant effects on the home environment at the two year follow up, but a significantly improved score four years on. Observational studies using two US datasets also find improvements in measures of both the physical and psychosocial home environment (Blau 1999;Votruba-Drzal 2003;Dearing and Taylor 2007).
All six cases looking at maternal mental health find positive effects of income, including four experimental or quasi-experimental studies. For the US, Gennetian and Miller (2002) using the MFIP find significant reductions in maternal depression, including the risk of clinical depression. The EITC is also found to have positive effects, with reductions in symptoms of depression, reductions in reported bad mental health days, and increases in measures of happiness, self-worth and self-efficacy (Boyd-Swan et al. 2016;Evans and Garthwaite 2014). In Canada, Milligan and Stabile (2011) find significant reductions in maternal depression linked to increases in income from child benefit. And in Sweden, Cesarini et al.'s (2016) lottery study finds a small but significant reduction in the consumption of mental health drugs after lottery wins, though the authors urge caution in interpreting this result.
In contrast to the unequivocal picture on maternal mental health, links between income and parental health behaviours are more uncertain. Two US studies look at take-up of pre-natal care and both find positive income effects (Chung et al. 2016;Mocan et al. 2015), but the absence of universal free health care access in the US might be relevant context here, making us wary about generalizability. The remaining four studies focus on maternal smoking, and results are mixed: two out of three EITC studies find significant effects on reducing maternal smoking (Cowan and Tefft 2012;Strully et al. 2010), while in the third effects also point in this direction but are not significant (Averett and Wang 2013). But using US technology shocks, Mocan et al. (2015) find no effect on smoking during pregnancy, while Raschke (2016) finds some weak evidence that higher child benefit is linked to an increase in the probability of smoking in Germany.
Our review also includes four studies looking at expenditure patterns, and here too the story is complex. In the UK, Gregg et al. (2006) examine how spending patterns changed in low income families with children as the child tax credit system was introduced; they exploit the fact that families with younger children benefited most from the reforms. They find spending patterns converged with those of higher income families; expenditure increased on children's clothing and footwear, toys and books, and fell on alcohol and cigarettes. Raschke's (2016) study of German child benefit variation finds increases in child benefit lead to increased food expenditure, especially in low income families. Higher child benefit is also associated with more living space and a lower probability of renting.
But in two other studies results are null or negative. Kaushal et al. (2007) find increases in income due to US welfare reforms have no significant effect on spending on children's clothing or on learning and enrichment. Instead, more was spent on work-related items: adult clothing, food away from home and transport. The authors note that in contrast to the reforms examined by Gregg et al. (2006) for the UK, the US welfare changes created a mandatory work requirement as well as increasing income support, which may explain the different findings. Another possible explanation is that the income was not labelled as being for children in the US reform as it was in the UK. However, Blow et al. (2012) examine variation in the value of UK child benefit (labelled for children) due to policy reform and inflation, and find unanticipated increases result in higher spending on alcohol by middle and higher income couples (but not lower income couples or lone parents) in their sample. Note that the sample is already restricted, excluding households in receipt of means-tested benefits. This would appear to make Gregg et al.'s analysis a more convincing guide to the expenditure response to additional benefit receipt in low-income families.

How Much Does Money Matter?
The majority of the 54 studies we include in this review identify positive and significant effects of income on a range of children's outcomes. But how large are these effects? Tables 5, 6, 7 and 8 present standardised effect sizes for continuous outcome variables, calculated for a $1000 change in annual income in 2000 prices. Results were only included if they were statistically significant and could be interpreted as the size of response to a given income change; studies that estimate the impact of a move across a particular threshold (e.g. 'poor' to 'not poor') were excluded. Currencies were converted into US dollars using OECD Purchasing Power Parities (PPP), and then to 2000 prices using the US Consumer Price Index.
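The conversion described above can be illustrated with a minimal sketch. It assumes that a reported effect scales linearly with the size of the income change, which is a simplification; the function name, argument names and all numbers below are hypothetical and are not taken from any study in the review.

```python
def standardise_effect(effect_sd, income_change_local, ppp_rate,
                       cpi_study_year, cpi_2000):
    """Express an effect as SD units per US$1000 of annual income in 2000 prices.

    All argument names are hypothetical:
      effect_sd            reported effect (in standard deviations) of the income change
      income_change_local  income change in local currency, study-year prices
      ppp_rate             local currency units per US dollar at PPP
      cpi_study_year       US CPI index value in the study year
      cpi_2000             US CPI index value in 2000 (same base as cpi_study_year)
    """
    # Step 1: convert local currency to US dollars using the PPP rate.
    usd_study_year = income_change_local / ppp_rate
    # Step 2: re-express study-year dollars in 2000 prices using the US CPI.
    usd_2000 = usd_study_year * (cpi_2000 / cpi_study_year)
    # Step 3: rescale to a US$1000 change (assumes a linear effect of income).
    return effect_sd / (usd_2000 / 1000.0)


# Made-up example: a 0.05 SD effect from a change of 4000 local currency units,
# with 2 local units per dollar at PPP and prices that have doubled since 2000.
example = standardise_effect(0.05, 4000.0, 2.0,
                             cpi_study_year=200.0, cpi_2000=100.0)
print(example)
```

In this invented case the income change is worth US$2000 in study-year prices and US$1000 in 2000 prices, so the standardised effect equals the reported 0.05 standard deviations.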
This standardisation leaves a number of issues unresolved. First, PPP and price conversion adjust for the cost of living in terms of a basket of goods, but not for differences in average incomes across countries: US$1000 might be expected to make a bigger difference in Mexico than in the US, given lower average income in the former. In addition, within a given country, US$1000 is likely to mean more at the bottom of the distribution than the top. The tables therefore indicate whether the sample includes the full distribution or just low income households.
A third problem arises from the different approach to equivalisation of income taken in different studies. Some but not all studies adjust household income in accordance with household size: thus Dearing et al. (2006) standardise income using the US income-to-needs ratio, while Blau (1999) looks at changes in total household income and Akee et al. (2010) at the impact of a transfer of US$4000, regardless of household size. This means that a standardised US$1000 is actually capturing something rather different across studies. Roughly 40% of the observational studies use equivalized measures of income while most of the experimental studies use a non-equivalized dollar amount for the household. As it would be near impossible to adjust effect sizes accurately to compensate for these different approaches, the standardized measures should be treated as giving us a broad idea of the range of effect sizes rather than a clean comparison across individual studies. The effect size tables highlight where studies have used equivalized income measures, and more detail is given in the online appendix.
One thing that emerges from the tables is that (with the exception of measures of the home environment) observational studies using fixed effect (or similar) approaches appear to identify smaller impacts of income than experimental designs. For example, the fixed effect studies find that US$1000 consistently delivers just 1 to 2% of a standard deviation improvement in cognitive outcomes (Table 5), while studies using RCTs or quasi-experiments find effect sizes ranging between 5% and 37%.
Notes: For Dahl and Lochner the effect size given for reading is an average of the effect sizes for reading recognition (0.04) and reading comprehension (0.06). For Black et al. (2014) the effect size given is the mid-point of the range given (0.09-0.26). For Elstad and Bakken (2016) the result is significant for children in low-income families only. All coefficients presented are significant at at least the 5% level. Shaded and blank columns distinguish separate cases which are sometimes made up of more than one study.
There are three main reasons why fixed effect studies might find consistently smaller effect sizes than studies using other approaches. Firstly, on the whole, the experimental studies and those exploiting an exogenous change in income are focused on the low income part of the population, while the longitudinal studies make use of survey data which includes higher-income families. If income matters more in low-income households, studies that focus on lower-income families will find larger effects. Indeed, where the fixed effect studies isolate the income effect for the lower income population, effect sizes increase, though they remain lower than for the other types of study, making this at best a partial explanation. Second, the longitudinal studies almost certainly capture income change less accurately than the other types of study. Angrist and Pischke (2009) argue that the higher level of noise in the measurement of differenced regressors means fixed-effect estimates may be smaller than those in cross-sectional models for reasons that are not driven only by selection. They suggest that researchers avoid "overly strong claims" in interpreting fixed effects estimates (p.227). Experimental approaches, in contrast, identify income changes with greater precision, because the experiment itself is driving the change.
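The measurement error argument can be made concrete with a small sketch. It is one textbook way of formalising the point, not a calculation from the review, and it rests on assumptions not stated in the text: classical measurement error that is uncorrelated with true income and independent over time, constant variances, and a two-period first difference standing in for the fixed effect estimator. The variance and correlation values are invented.

```python
def attenuation_levels(var_true, var_noise):
    """Share of the true coefficient recovered when income is used in levels."""
    return var_true / (var_true + var_noise)


def attenuation_first_difference(var_true, var_noise, rho):
    """Share recovered after first-differencing two periods of data.

    rho is the correlation of true income between the two periods. Differencing
    removes the persistent part of true income but not the noise, so the
    signal variance shrinks by a factor (1 - rho) relative to the noise.
    """
    signal = var_true * (1.0 - rho)
    return signal / (signal + var_noise)


# Hypothetical values: noise variance a quarter of the true income variance,
# and true income strongly persistent across periods (rho = 0.8).
levels = attenuation_levels(1.0, 0.25)                   # roughly 0.80
differenced = attenuation_first_difference(1.0, 0.25, 0.8)  # roughly 0.44
print(round(levels, 3), round(differenced, 3))
```

With these made-up inputs, the same amount of reporting noise shrinks the estimated coefficient by about a fifth in levels but by more than half after differencing, which is the direction of bias the paragraph describes.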
A third factor, already noted, is that fixed effect estimations are likely to be insufficiently purged of omitted variable bias, making them less purely 'causal' than other estimates. Omitted variables could bias coefficients upwards, for example if a move to a better job has a direct effect on maternal mental health but we observe only the income change. But downward bias is equally plausible. Perhaps income rises with longer parental working hours, which mean parents spend less time at home. Positive income effects may then go unobserved because they are offset by the effects of lower parental involvement.
Table 6 Effect sizes for social and behavioural outcomes (standard deviation change linked to US$1000 in 2000 prices). Columns: RCTs; Quasi-experiments; Observational. Note: All coefficients presented are significant at at least the 5% level. Shaded and blank columns distinguish separate cases which are sometimes made up of more than one study.
Because of the relative sparsity of experiments and the difficulties of identifying good instruments, fixed effect studies are popular with researchers, but the variation in effect sizes between the study types underlines the need for caution in relying on these studies for information on the extent to which money matters. Additionally, as discussed above, a number of these studies are weakened by including controls for variables that are likely to be mediators, such as mothers' mental health, parenting and the home environment. In sum, the very small effect sizes in comparison to other studies suggest the observational studies may be considerably underestimating true income effects. If we take the range of results from quasi-experimental and RCT studies, effect sizes are far from negligible: the effect of a US$1000 (2000) annual change in income ranges from 5% to 37% of a standard deviation for cognitive outcomes, from 3% to 22% for social and behavioural outcomes, from 1% to 24% for children's health and from 4% to 15% for maternal depression. By way of comparison, one meta-analysis of early education programmes points to average effect sizes of between 23% and 52% on measures of children's achievement or school readiness (Higgins et al. 2012), though few studies record the level of spending on each programme. Duncan et al. (2011) point out that the Abecedarian early education project in the US found treatment effects on IQ of one full standard deviation at age three and 75% of a standard deviation at age five, but the project cost more than US$40,000 per child in 2003 prices (see also Karoly et al. 2005). Similarly, the Perry Pre-school project delivered 60% of a standard deviation improvement in IQ but cost US$15,000 per child in 2003 prices (Duncan et al. 2011;Karoly et al. 2005).
It should be noted that reported costs for these interventions include administration, and the $1000 derived from benefit changes does not. Nevertheless, the comparisons suggest that income support policies for low income households hold their own alongside early intervention policies, and should be considered key elements of any strategy to improve child development.
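A rough back-of-envelope version of this comparison can be sketched by dividing each programme's reported effect by its reported cost. This is an illustration only and not the authors' calculation: it ignores the difference between 2003 and 2000 prices, the administration costs just mentioned, and the fact that the income estimates refer to annual income rather than a one-off programme cost. The function name is hypothetical.

```python
def effect_per_1000(effect_sd, cost_usd):
    """Standard deviation gain per US$1000 of programme cost (crude ratio)."""
    return effect_sd / (cost_usd / 1000.0)


# Figures quoted in the text (per child, 2003 prices). The Abecedarian cost is
# described as "more than" US$40,000, so its ratio is an upper bound.
abecedarian = effect_per_1000(1.00, 40_000)  # IQ at age three
perry = effect_per_1000(0.60, 15_000)        # IQ
print(round(abecedarian, 3), round(perry, 3))
```

On these crude figures the two programmes deliver about 2.5% and 4% of a standard deviation per US$1000, which can be set against the 5% to 37% range reported for cognitive outcomes in the experimental and quasi-experimental income studies, subject to the caveats above.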

Does Additional Money Matter More in Low Income Households?
Finally, we briefly consider what light our evidence base sheds on the question of nonlinearity of income effects: does a given change in resources make more difference to children in more disadvantaged households? Both the family stress model and investment theory might predict a larger impact of the marginal pound in lower income households. Table 9 summarises findings from 15 studies (11 cases) that explicitly examine whether income effects are non-linear. Many studies include a natural log form for income, implying diminishing returns to income, but studies are only included here if they test this assumption by trying out other models. We might further expect the effect of a given increase to depend on the income levels in the household's neighbourhood, community or wider reference group; if it is relative low income that matters, we would ideally like information on the relevant comparison group, which might not be the country as a whole. However, there are no studies in our review which use a design or sample that allows us to explore this question.
All but two of the 15 find the effect of income to be greater in lower income households, at least for some outcomes. Three different methodological approaches are used. Ten of the studies simply divide the sample into groups based on income level and analyse income effects for each group separately. Nine of these find clear evidence that effects are higher in poor than non-poor households, with five finding no effect at higher income. Others find that there are effects for higher income households, but these are smaller than for lower income families, in some cases up to 15 times smaller (Dearing et al. 2006). The one exception in this group is Blau (1999), who finds nonlinear effects for achievement and vocabulary tests, but with largest effects for middle and lower middle income groups, not for the very lowest.
The second approach, employed by two studies, is to use a spline function, allowing the effect of income to vary either above and below a single 'knot' such as the official poverty line, or at two or more points in the distribution. Using a three-segment spline function, Johnson and Schoeni (2011) find income to have a greater impact on children's physical health in poorer households, with no effect on the highest income households, but they also find adverse effects on health for the poorest households. In contrast, Duncan et al. (1998) find strong evidence of a non-linear income effect for high school completion and additional years of schooling: using a two-segment spline [...] Løken's (2010) study of the Norwegian oil boom as a natural experiment, this time using a quadratic form of income. The quadratic estimates show the impact of income to rise more steeply at the lower end of the income distribution and then flatten higher up the distribution. Votruba-Drzal (2003) and Zachrisson and Dearing (2015) estimate their results using both linear and semi-log functions and find the non-linear function best describes the impact of income on the home environment (Votruba-Drzal 2003) and on children's internalising and externalising problems (Zachrisson and Dearing 2015), with income making a bigger difference in poorer households.
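The functional forms mentioned above (spline, quadratic and semi-log) can be illustrated with a minimal sketch of how each lets the effect of an extra unit of income vary with the income level. The function names, the knot and every coefficient are hypothetical and chosen only for illustration; none is an estimate from the studies cited.

```python
def spline_terms(income, knot):
    """Split income into the part at or below the knot and the part above it."""
    below = min(income, knot)
    above = max(income - knot, 0.0)
    return below, above


def predicted_outcome_spline(income, knot, intercept, slope_below, slope_above):
    """Two-segment linear spline: one slope up to the knot, another beyond it."""
    below, above = spline_terms(income, knot)
    return intercept + slope_below * below + slope_above * above


def marginal_effect_quadratic(income, b1, b2):
    """Slope of outcome = b1*income + b2*income**2 (b2 < 0 gives flattening)."""
    return b1 + 2.0 * b2 * income


def marginal_effect_semilog(income, b):
    """Slope of outcome = b*ln(income): inversely proportional to income."""
    return b / income


# Hypothetical example with income in thousands and a knot at 20.
print(spline_terms(10.0, 20.0), spline_terms(30.0, 20.0))
print(marginal_effect_semilog(10.0, 0.2), marginal_effect_semilog(40.0, 0.2))
```

In the spline, a finding that income matters more in poorer households corresponds to a larger slope below the knot than above it; in the quadratic and semi-log forms the same pattern appears as a marginal effect that declines as income rises.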

Conclusion
This review examined 54 studies, representing 29 'cases', which investigate the causal effects of household financial resources on wider outcomes for children. These studies were selected after a systematic review of the literature on EU and OECD countries. This robust evidence base indicates strongly that money itself makes a difference to children's outcomes. Children from low income households do worse in life in part because of low income, and not just because poverty is correlated with other household and parental characteristics. In only four of the 29 'cases' examined were no positive income effects identified, and 16 of the 29 found positive effects on indicators of each type of outcome they looked at.
There is most evidence in relation to cognitive and schooling outcomes, with positive income effects also apparent for social and behavioural outcomes and for measures of child health at birth, though the picture was more mixed in relation to later child health outcomes. Consistently significant positive effects were also identified for maternal mental health, parenting and the home environment, pointing towards likely mediating mechanisms. Links between income and parental health behaviours are more uncertain. The results indicate support for both Investment and Family Stress models, and suggest that it is unhelpful to draw a dichotomy between policies that increase household income and those that seek to change parents' behavior: money seems to matter partly because of its effects on what parents do.
Effects were found to be non-linear: the impact of a given increase in income tends to be greater, as might be expected, in households that had lower incomes to start with. Adjusted for programme cost, the size of income effects for low-income families from RCTs and quasi-experimental studies appears comparable to that calculated for early interventions including early education.
Does income at some stages of childhood matter more than at others? And how long do the effects last? For child health, our evidence base suggests the period before birth is particularly important, but effects on educational and social-behavioural outcomes are also found for older children and teenagers (see Cooper and Stewart (2013) for further discussion). In terms of the duration of effects, some studies identify early impacts that have later disappeared, notably Hamad and Rehkopf's (2016) analysis of the behavioural impacts of the US EITC. But there is also evidence that increased welfare payments in the US had more impact on school achievement 2-5 years on than in the initial assessment (Clark-Kauffman et al. 2003;Duncan et al. 2011), while several studies using Norwegian administrative data link outcomes at the end of compulsory schooling to income changes that took place in much earlier childhood (Tominey 2010;Løken et al. 2012;Black et al. 2014). These results underline the potential for income support early in a child's life to have lasting effects; early impact may place children on a different long-term pathway.
While the findings from the studies in this review tell a powerful story, the evidence base has limitations. Despite extensive searches, only 54 studies met the inclusion criteria, representing a total of 19 experimental or quasi-experimental situations plus observational analysis of 10 longitudinal datasets. Around 45% of 'cases' come from the US, raising questions about our ability to generalise to other contexts. There is also less evidence on some outcomes than others, with only a handful of studies on each of the intermediate mechanisms, and less evidence on health and social, behavioural and emotional outcomes than on cognitive attainment and schooling. Children's own perspective is notably absent: indicators of emotional well-being, for example, are almost exclusively based on parent or teacher reports.
In seeking to address these gaps, researchers should pay attention to the crucial finding that the size of identified effects appears highly sensitive to the methods used. Studies exploiting experiments or other sources of exogenous changes in income identify effect sizes that are greater than those using fixed effect techniques on longitudinal data. This is likely to be related, at least in part, to the greater likelihood of measurement error in studies relying on reported income in household survey data, and to the fact that fixed effect studies are not able to deal as adequately with the problem of omitted variable bias. The scale of the apparent downward bias in results raises questions about how useful these methods are for establishing the existence and size of income effects, and underlines the importance of creative researchers continuing to identify quasi-experimental situations to study.
Despite these limitations, the policy implications from the existing evidence are clear and important: policies to support household income have a key role to play in any strategy to improve life chances for children from disadvantaged backgrounds. A boost to income affects parenting and the physical home environment, maternal depression, children's cognitive ability, achievement and engagement in school, and behavior. Few other policies are likely to influence as many different outcomes at once, giving income support policies a claim to be the 'ultimate "multipurpose" policy instrument' (Mayer 1997, p.145). Even very small income effects operating across this range of domains are likely to add up to a larger cumulative impact.