The electronic database search yielded over 20,000 articles, which reflects the widespread discussion of nature and health. Many articles were rejected on the basis of title and/or abstract because they were clearly irrelevant, concerned with a more general discussion, or promotional material on health and activity in nature. Based on title and/or abstract, 70 articles were deemed potentially relevant, and the full text of all but 7 of these was successfully retrieved from Bangor University Library, the British Library or the web/authors. After full-text viewing, 24 articles were included in the review (one article contained two relevant studies). All articles identified as relevant were published in peer-reviewed journals except one charity report.
Description of studies
Activities and environmental settings
Additional file 2 presents the main characteristics of the studies included in the review. Most studies investigated the effects of walking [28, 30–41] or running [42–45] in the natural environment. Other activities under investigation were wilderness backpacking, gardening, a passive/sedentary activity only [46, 47] or a mixture of activities [27, 29, 48, 49]. The most common types of natural environment in the studies were parks [28, 32–34, 41, 43, 46] and university campuses [37–39, 44, 45]; the latter, based on the authors' descriptions, appeared to be relatively 'green'. Other environments were a nature reserve/wildlife preserve [35, 42], 'wilderness', 'forest' [29–31, 36, 40] or a garden [26, 47]. Other studies were reportedly in an outdoor 'green' environment but the exact type of environment was not defined [27, 48, 49]. The 'synthetic' comparator environment also varied among studies but could be grouped into two main categories, with some studies falling into both. Fourteen studies compared the natural environment with an outdoor built, non-green environment (such as an urban/city street or urban residential area), with thirteen of these attempting to make at least one comparison of the same activity in each environment [30–32, 34–36, 40–43, 47–49]. Fourteen studies made a comparison with an indoor environment (usually a gym or a laboratory, but also a shopping centre and an indoor room) but only nine of these compared the same activity [33, 37–39, 44–46, 48, 49]. Most of these activities were short-term, with around one hour or less in each environment. Exceptions were studies that investigated the effects of repeated exposure to a natural environment over more than one day [26–28, 32, 36]; in some studies the duration was not clear [29, 48, 49].
The most common study participants were college/university students [30–32, 35, 37, 38, 40, 41, 44, 46, 47] and physically active individuals such as backpackers, regular runners or athletes [32, 39, 42–45]. Several studies focused on individuals of one sex (six used only males and three used only females). A few studies focused on individuals with specific health conditions such as inactive adults at risk from cardiovascular disease; children with impaired vision; children with Attention Deficit Disorder/Attention Deficit Hyperactivity Disorder [34, 48, 49]; adults with 'profound mental retardation'; or menopausal women. Other participants were children attending kindergartens and members of MIND (mental health charity) groups. The median number of participants within a study was 38 (range = 3–943).
The most common health/well-being outcome was some measure of an individual's emotions (Figure 1). Seventeen of the 25 studies collected data on at least one measure of a particular emotion [28, 30–33, 35, 37–39, 41–47]. Many measured more than one emotion (e.g. revitalisation, anger, anxiety), which varied with the particular psychological score used (e.g. Zuckerman's Inventory of Personal Reactions, Profile of Mood States). Eight studies investigated effects on attention/concentration (including two studies that focused specifically on ratings of ADD/ADHD symptoms of children; see "Methodology") [32, 34, 35, 41, 42, 48, 49]. Impacts on physiological variables were usually investigated through cardiovascular outcomes (e.g. blood pressure or pulse) [26, 28, 31, 32, 35, 39, 45] or hormone levels [30, 31, 36, 39, 40, 45], which included salivary or urinary cortisol, amylase and adrenaline. Less common outcomes were immune function [31, 36] (e.g. immunoglobulin A concentration; natural killer cell activity); levels of physical activity; motor performance; cerebral brain activity (measured as absolute haemoglobin concentration); engagement; memory recall; and sleeping hours (see Figure 1).
Six criteria were used to summarise the methodology and reporting quality of studies. Many, but not all, studies described the characteristics of individuals participating in their study (16 studies) in terms of their age, sex, and health condition and/or amount of previous physical activity; the remaining studies only provided part of this information [26, 31–33, 37, 38, 41, 46, 47]. Most studies recruited participants as volunteers (21 studies) rather than through third-party referral or independent selection [except [26–28, 46]]. Thirteen studies were crossover trials. In ten of these, individuals were randomised and/or counter-balanced to determine the order of environments [27, 30, 31, 34, 40–45], while in three other studies, participants were exposed to the environments in the same order [33, 36, 39]. Seven other studies were randomised controlled trials [26, 28, 32, 35, 37, 38, 47]. Across all studies reporting randomisation, apart from one case, the method of randomisation was not described. Five other studies used an observational study design that did not involve experimental control of exposure to different environments [29, 32, 46, 48, 49]. Most studies (20 studies) took pretest measurements before exposure to the environment, which allowed investigation of the baseline comparability of participants [except [29, 34, 47–49]]. Thirteen studies were potentially affected by confounding variables in their comparison of different environments, which arose from various factors such as the presence of additional stimuli in the synthetic environment (e.g. a video of the outdoor walk; internal/external stimuli received through headphones). In other cases, there were differences in the activity [26–29, 32], potential environment order effects in a crossover trial [33, 36, 39], or other potential differences arising from an observational study design [46, 48, 49].
However, in several of these cases, the potential confounding arose because the study's hypothesis did not concern the effects of nature, and additional factors were therefore manipulated or present according to the particular question of the study.
Different measurement tools and techniques were used to collect data on the different outcomes and there was variation in the methodological information provided. Assessment of concentration or attention was usually based on standard tests such as the Digit Span Test, Symbol Digits Modalities Test or Necker Cube Pattern Control, or on another test (e.g. a proofreading task). However, in two cases, effects on attention were based only on parental perceptions (of ADD/ADHD) [48, 49]. Information on emotions was based on self-reported data, obtained through various psychological questionnaires/scores (using a Likert scale), which asked participants to rate how closely their mood matched statements of mood.
Differences between natural and synthetic environments after the activity
Effect sizes were calculated for the most commonly measured outcomes, with between four and eight studies measuring the same outcome. Additional file 3 presents the effect sizes that could be calculated from each study and, where appropriate, effect sizes for different subgroups within a study, derived from data measured after activity in each environment. Self-reported emotions (energy/revitalization, tranquillity/calmness, anxiety/tension, anger/aggression, fatigue/tiredness and sadness/depression), tests of attention, blood pressure and cortisol concentrations were synthesized (see Figure 2). We analysed different self-reported emotions separately for the purposes of interpretation. Combining these effect sizes, using average data per study, provided evidence for beneficial effects of activity in a natural environment compared to the synthetic environment in terms of reduced negative emotions such as anger (Hedges g = 0.46, 95% CI = 0.23, 0.69), fatigue (Hedges g = 0.42, 95% CI = 0.07, 0.76) and sadness (Hedges g = 0.36, 95% CI = 0.08, 0.63) (Figure 2). There was a marginally positive effect on energy scores (Hedges g = 0.28, 95% CI = -0.01, 0.57). Data on anxiety (Hedges g = 0.12, 95% CI = -0.34, 0.58) and tranquillity (Hedges g = 0.39, 95% CI = -0.08, 0.86) were less consistent, with greater variation in the observed effect. A positive effect was also found on tests of attention, based on the average effect across studies (Hedges g = 0.32, 95% CI = 0.06, 0.58). We also tested the effect of adjusting these effect sizes for any pretest differences (see Additional file 4). In most cases, the results were similar, which supports the comparability of participants at baseline. However, this adjustment moved the confidence interval for fatigue so that it overlapped zero (95% CI = -0.1, 1.47).
The confidence interval for the meta-analysis of attention also overlapped zero after accounting for pretest differences (95% CI = -0.12, 0.60), although only three of the five studies presented pretest data.
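The pooled estimates above rest on two calculations: a standardised mean difference per study (Hedges g) and an inverse-variance weighted average across studies. The following is a minimal sketch of both, not the procedure actually used in this review. It assumes two independent groups per study and a DerSimonian-Laird random-effects model, neither of which is stated in the text; crossover designs would need a variance that accounts for within-person correlation. The function names and the input values in the usage note are hypothetical.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (g, variance) for the standardised difference mean_a - mean_b.

    Assumes two independent groups (an assumption of this sketch).
    """
    df = n_a + n_b - 2
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = (mean_a - mean_b) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b))
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances):
    """Return (pooled, ci_low, ci_high) using DerSimonian-Laird weights."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q around the fixed-effect mean, used to estimate tau^2.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se
```

For example, `hedges_g(10, 2, 20, 9, 2, 20)` gives the effect size and variance for one hypothetical study, and the resulting pairs from several studies can be passed to `pool_random_effects`. For negative emotions such as anger, the sign would have to be oriented so that a positive g favours the natural environment, as in the reported results.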
Synthesis of the results for blood pressure (systolic: Hedges g = 0.07, 95% CI = -0.22, 0.36; diastolic: Hedges g = 0.07, 95% CI = -0.24, 0.38) and cortisol concentrations (Hedges g = 0.03, 95% CI = -0.53, 0.58) found little difference in the effect of different environmental settings, with the confidence intervals of the pooled effect sizes overlapping zero (Figure 2), although the trends are similar across outcomes.
For outcomes with significant heterogeneity, which included tranquillity (Q = 17.52, df = 6, p = 0.01) and anxiety (Q = 14.14, df = 5, p = 0.02), we aimed to investigate the effect of comparator environment type (indoor or outdoor built) on effect size. Feelings of tranquillity after exposure to a natural environment were more positive than after exposure to an outdoor built environment (Q = 5.55, df = 1, p = 0.02; 95% CI = 0.18, 1.68, number of studies = 4), but not in comparison to an indoor environment (95% CI = -0.68, 0.54, number of studies = 3). All studies recording anxiety compared a natural environment with an indoor environment and so the impact of comparator type could not be investigated. Heterogeneity was non-significant for all other health or well-being outcomes (all p > 0.1). There was no evidence of publication bias as assessed with Egger's tests (all p > 0.1), but statistical power is limited by the low number of studies.
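The heterogeneity statistics reported here are Cochran's Q values referred to a chi-square distribution. A minimal sketch of that calculation follows, written under the assumption that Q is the inverse-variance weighted sum of squared deviations from the fixed-effect mean; the helper names are hypothetical and the tail probability is computed in pure Python for integer degrees of freedom only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * term
        term *= y / (k + 0.5)
    return total


def cochran_q(effects, variances):
    """Return (Q, df, p) for a set of study effect sizes and their variances."""
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    return q, df, chi2_sf(q, df)
```

As a check on the tail function, `chi2_sf(17.52, 6)` evaluates to roughly 0.008, which is consistent with the p = 0.01 reported for tranquillity after rounding.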
Changes before and after exposure to a natural environment
The previous analysis compared the differences in outcomes after exposure to each environment. It is possible that, in this analysis, a positive effect size for nature could arise even if the outcome declined in both environments, as long as this decline was smaller in nature. In order to investigate this possibility, we compared outcomes before and after exposure to a natural environment to investigate changes over time using the subset of studies that presented pretest data. This analysis found beneficial changes on feelings of energy, anxiety, anger, fatigue and sadness (Table 1). For other variables, which included attention, tranquillity, blood pressure and cortisol concentrations, there were no consistent changes between measurements before and after the activity in the natural environment as assessed by whether the confidence interval of the pooled effect overlapped zero. This analysis supports the interpretation that the positive effect sizes observed in self-reported emotions when comparing a natural to a synthetic environment are based on greater improvements over time in the natural environment rather than a smaller decline.
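The distinction between the two analyses can be illustrated with a small numerical sketch. The scores below are invented purely for illustration (higher values mean more energy) and are not taken from any included study; the function names are hypothetical.

```python
def post_difference(post_natural, post_synthetic):
    """Difference between environments measured only after the activity."""
    return post_natural - post_synthetic


def change(pre, post):
    """Change over time within one environment."""
    return post - pre


# Hypothetical energy scores: both groups start at 20 and both decline.
pre_score = 20.0
post_natural = 18.0
post_synthetic = 15.0

# The post-only comparison favours nature (+3.0) ...
print(post_difference(post_natural, post_synthetic))
# ... even though the outcome fell in the natural environment (-2.0),
# which is why the pre/post comparison is needed as a separate check.
print(change(pre_score, post_natural))
```

In this invented case the post-exposure comparison alone would suggest a benefit of nature, while the before-and-after comparison shows a smaller decline rather than an improvement.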
Other health or well-being outcomes
A limitation to quantitative synthesis of the studies included in this review is the variety of different health or well-being outcomes measured. Because few studies measured the other outcomes, insufficient data points were available to attempt more powerful meta-analyses. Two studies conducted in Japan investigated the effects of walking in a forest on measures of immune function [31, 36], which included variables such as secretory immunoglobulin A, NK activity, number of T-cells and white blood cells. Other hormones, or measures of hormone activation, apart from cortisol, have also been investigated, such as adrenaline and noradrenaline [36, 45] and salivary amylase [39, 40]. Across these different studies and outcomes, the findings are mixed, with no clear, consistent difference emerging in the effect of different environments.
Hartig et al. investigated the effects of a natural (garden) and urban environment on memory recall, and found that, despite an improvement in mood in the natural environment, there was no evidence of a difference in the recall of positive, negative or neutral memories between environments. Two cross-sectional studies used questionnaires to ask parents of children with ADHD/ADD to rate their child's symptoms after different activities and within different environmental and social settings [48, 49]. Based on the parental assessment, the results support a positive impact of a natural environment compared to both an indoor and a built environment. The reliability of parental assessment as a measure of ADD/ADHD symptoms is, however, unclear. Cuvo et al. compared the effects of an indoor living room and multisensory room with outdoor activities in the grounds of an institution in a rural area on adults described as having 'profound mental retardation'. Three adult participants were observed, specifically for behaviour such as mouthing and body rocking, as well as engagement, and there was some indication of an improvement in behaviour during the outdoor activity compared to the indoor environments. In another study, Scholz and Krombholz compared the motor performance of children from 10 forest kindergartens and from four 'regular' kindergartens, and concluded that the motor performance of the children from forest kindergartens was superior. In a longer-term trial, Isaacs et al. compared 10-week programmes of leisure-centre based activities with instructor-led walking programmes through parks and open spaces, and also with an advice-only group. This study included follow-up assessments at 10 weeks, 6 months and 1 year and measured a range of physical and mental health, and physical fitness outcomes. The results show that there was generally little difference in health/well-being benefits between the two activity groups, even in comparison with the advice-only group.