Background

Approximately two-thirds of the world’s freshwater is used to dilute wastewater discharges. The demand for freshwater is expected to rise by 70% by 2050 [1], driving an urgent need to understand the impacts of treated waste effluent discharges on aquatic ecosystems. Wastewater treatment works (WWTW) effluents contain tens of thousands of chemicals, including pharmaceuticals and natural steroid estrogens that are biologically active at low (ng/L) exposure concentrations [2]. However, the long-term consequences of exposure to most of these chemicals for fish health and population sustainability are not known.

There is substantial evidence showing that experimental exposure of fish to WWTW effluents and the estrogens they contain can result in adverse health effects, including effects on reproductive development and breeding output. This has led to concerns that freshwater fish populations might also be affected with cascading consequences for freshwater ecosystems. Feminization of male fish is widespread in stretches of rivers downstream of WWTW outfalls as demonstrated in studies using wild [3, 4] and caged [5, 6] fish. Feminized phenotypes include the presence of vitellogenin, a female-specific protein in the blood of male fish [7] and the intersex condition: the presence of oocytes and/or female reproductive ducts in otherwise male gonads [3]. Feminization has been attributed to the presence of estrogens in effluents: estradiol (E2) and estrone (E1) from human excretion; 17 alpha-ethinylestradiol (EE2), a component of the female contraceptive pill [8]; and a large number of other estrogenic chemicals from industrial and domestic effluents. WWTW effluents can also induce genotoxic effects [9], alterations in immune function [10], decreased reproductive output [11], altered stress response [12] and changes in reproductive behavior [13].

Concern about estrogens in rivers in the United Kingdom drove a £40 M programme to evaluate the efficacy of various tertiary treatment processes in the removal of estrogens [14, 15]. Implementation of such processes will, however, incur considerable costs and a greater carbon footprint for WWTW [14, 16], emphasising the need to understand better the population-level consequences of exposure to estrogenic and other so-called endocrine disrupting chemicals (EDCs).

A critical question is whether chronic exposure to estrogenic effluents negatively impacts the viability of wild fish populations, but this has been difficult to address experimentally as it requires controlled experiments extending over periods of several years. Limited studies suggest that high concentrations of EE2 (between 3 and 6 ng/L) in the aquatic environment could be a threat to the sustainability of fish populations. For example, a controlled exposure of an entire lake to EE2 in Canada resulted in the collapse of the fathead minnow (Pimephales promelas) population within three years [17]. Likewise, long-term (>204 days) laboratory exposures of a range of fish species have resulted in the absence of breeding males [18–20] and a three-year exposure of roach (Rutilus rutilus L.) to an undiluted WWTW effluent in large tanks resulted in an all-female population [21]. It is not known, however, if this occurs in rivers contaminated by effluents. Female fecundity can also be reduced through estrogen exposure, which can potentially reduce population growth rates [22]. Although the exposure concentrations in these studies were high compared to those typically experienced by wild fish populations [23], exposures to EE2 at concentrations below 1 ng/L during the period of sexual development have been shown to result in feminized gonads in roach [19] and decreased egg fertilization and female-skewed sex ratios in fathead minnows [24]. Evidence from wild roach living in UK rivers has similarly shown that feminized fish (generally less than 10% of males) with large numbers of eggs in their gonads have impaired semen quality [25] and severely (up to 76%) reduced reproductive success [26].

While these studies suggest that exposure to high concentrations of effluent could threaten the viability of fish populations, aggregates of cyprinid fish, including roach, are often found in effluent contaminated rivers. However, numbers alone may provide a misleading assessment of population sustainability as these could be sink populations maintained by substantial immigration from less contaminated locations where successful reproduction still occurs. Likewise, effective population sizes (Ne) – related to the breeding population of fish – may be decreased without necessarily impacting on population sizes [27], as effluent exposure can affect the number of reproducing individuals and can skew reproductive success [21, 26]. Density-dependent growth and survival can also play an important role [28], so a few reproducing individuals can potentially maintain large adult population sizes. Indeed, studies in several species of marine fish with high fecundity have shown that Ne can be several orders of magnitude smaller than census population sizes. One study found two populations of the exploited New Zealand snapper (Pagrus auratus) to have values of Ne less than 1,000 despite adult census population sizes in the millions [29]. Similarly, a study of striped bass (Morone saxatilis), a freshwater fish species, found cohorts to consist of a few, full sib families, despite an adult census size of over 300,000 [30]. Critically, Ne influences long-term sustainability as it determines the rate at which genetic diversity is lost from a population through genetic drift [31]. High genetic diversity increases the long-term potential for populations to adapt to changes in the environment and also acts to reduce the risk of inbreeding [32]. Small Ne, however, may act to increase the chances of losing some lethal or sub-lethal mutations through genetic purging.
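The link between Ne and the loss of genetic diversity through drift can be made concrete with the standard idealized (Wright-Fisher) expectation that expected heterozygosity declines by a factor of (1 - 1/(2Ne)) per generation. The sketch below is illustrative only: the function name, the Ne values and the 100-generation horizon are hypothetical choices, and real populations depart from this idealized model (for example, through migration and mutation).

```python
def heterozygosity_retained(ne: float, generations: int) -> float:
    """Fraction of initial expected heterozygosity left after genetic drift.

    Idealized diploid Wright-Fisher expectation:
        H_t / H_0 = (1 - 1 / (2 * Ne)) ** t
    Ignores migration, mutation and selection.
    """
    if ne <= 0:
        raise ValueError("ne must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


if __name__ == "__main__":
    # Hypothetical effective sizes, chosen only to show the trend.
    for ne in (50, 300, 1000):
        kept = heterozygosity_retained(ne, generations=100)
        print(f"Ne = {ne:>4}: {kept:.1%} of heterozygosity retained")
```

Under these assumptions, smaller Ne values lose heterozygosity faster, which is the sense in which Ne, rather than census size, governs long-term genetic sustainability.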

Understanding the impact of estrogenic effluents on the sustainability of fish populations is, therefore, paramount, but has been limited to date by the logistical challenges involved in undertaking long-term exposures to realistic effluent concentrations, and understanding the demographic history of wild fish populations at highly contaminated sites. In this study, we examine evidence for population impacts on wild roach (R. rutilus), a fish species in which feminization is widespread, in southern England. Southern England has some of the highest proportions of WWTW effluent in rivers known globally, and numerous weirs and locks which potentially confine fishes to heavily polluted stretches of river. We have used this system to evaluate whether stretches of river highly contaminated with estrogenic effluents have impaired breeding populations of roach. To do this we undertook analysis of population genetic structures of roach in the region using DNA microsatellite analysis. Microsatellite data were also used to calculate Ne and estimate levels of gene flow to determine the extent to which these populations are maintained through immigration of fish from less contaminated stretches of river.

Results

Genetic diversity and genetic bottlenecks

A total of 1,769 roach, constituting 39 samples (roach sampled from one location within a river – typically a stretch of approximately 100 m – at one time point) from 32 different geographic locations in England were genotyped (Figure 1). Data for 14 microsatellite loci [see Additional file 1 for details] revealed high genetic diversity in all 39 samples (Table 1). Allelic richness (AR) ranged from 6.8 to 8.9, and expected heterozygosity (He) ranged from 0.69 to 0.75 (Table 1, see Additional file 2 for diversity statistics for each locus). Nevertheless, significant differences in AR (analysis of variance (ANOVA), F(38,532) = 2.1398, P = 0.00014) and observed heterozygosity (Ho) (ANOVA, F(38,532) = 1.8677, P = 0.0017) among samples were detected. Roach sampled at two relatively unpolluted sites (LamSha and LeeHUS’95) exhibited comparatively low AR and LeeHyd’95 exhibited comparatively low Ho [see Additional file 3]. No significant differences in He were found. Additionally, there was evidence for genetic bottlenecks at two relatively unpolluted sites sampled within the rivers Arun (AruHUS) and Lee (LeeHUS), and at a polluted site in the Lee (LeeWhe) (Table 1).

Figure 1

Locations of sample sites in England. (A) Modified from Williams et al. [33]. For (B) and (C), numbers in circles represent the number of obstructions to fish movement (either weirs or locks). Locks in the Kennet and Thames below ThaWhi have fish passes for salmon movements, but are likely to represent a barrier to movement of roach. In the upper river Lee, only weirs over 1 m are shown. PE = population equivalents, which relates to the size of the population served by the waste water treatment works. The different colours used to depict the rivers represent predicted mean estrogenicity (E2 equivalents in ng/L) [34].

Table 1 Sampling locations, genetic diversity statistics (allelic richness (AR) and expected heterozygosity (He)) for each population sample

Population genetic structure of roach in English rivers

We undertook an analysis of population genetic structure in order to examine the genetic similarity of roach between and within catchments. Analysis of molecular variance (AMOVA) indicated the majority of variation was partitioned among individuals within river locations, with river location accounting for a small (2.27%) but highly significant proportion of the genetic variation (Table 2). Average pairwise FST between roach samples from different catchments was 0.028 and comparisons were consistently highly significant (Table 3, Additional file 4). The population tree (Figure 2) shows distinct clusters of samples in different catchments: the Arun, the Nene, the Anglian Blackwater and the Trent, supported with moderate–high bootstrap values (>64%). Samples from the Arun and the Nene also group in the principal component (PCA) and the STRUCTURE analyses [see Additional file 5 and Additional file 6], demonstrating a distinct genetic identity of fish at these sites. Utilising the method of Evanno et al. [35], the inferred most likely number of genetically distinct clusters in the STRUCTURE analysis was three, comprising: the Arun, the two most upstream sample sites in the river Lee, and all remaining sites genotyped. However, from visual examination of STRUCTURE plots run with higher levels of K [see Additional file 6] other possible groups are apparent. We found no evidence that roach in the Thames catchment constitute a distinct genetic group, as samples failed to group together in any analysis (Figure 2, Additional file 5 and Additional file 6). This may reflect a true lack of genetic distinctiveness of roach in this catchment, but may also result from the limited ability of the microsatellite markers used to resolve population genetic structure at this level.
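To illustrate what a differentiation statistic of this kind measures, the sketch below computes Nei's GST, a simple FST analogue, for a single locus from allele frequencies in two or more samples. This is a deliberately simplified stand-in: it is not the estimator implemented in Arlequin, it weights samples equally, and it applies no sample-size correction. The function names and the toy allele frequencies are hypothetical.

```python
from typing import Dict, List

Freqs = Dict[str, float]  # allele label -> frequency within one sample


def expected_het(freqs: Freqs) -> float:
    """Expected heterozygosity at one locus: 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst_one_locus(samples: List[Freqs]) -> float:
    """Nei's G_ST = (H_T - H_S) / H_T for one locus, equal sample weights."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    alleles = set().union(*samples)  # all allele labels seen in any sample
    # H_S: mean within-sample expected heterozygosity.
    h_s = sum(expected_het(s) for s in samples) / len(samples)
    # H_T: expected heterozygosity of the averaged allele frequencies.
    mean_freqs = {
        a: sum(s.get(a, 0.0) for s in samples) / len(samples) for a in alleles
    }
    h_t = expected_het(mean_freqs)
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


if __name__ == "__main__":
    # Toy frequencies for two hypothetical samples at one locus.
    upstream = {"A": 0.6, "B": 0.4}
    downstream = {"A": 0.4, "B": 0.6}
    print(round(gst_one_locus([upstream, downstream]), 3))
```

Values near zero indicate that almost all diversity is found within samples, which is the pattern the AMOVA above describes for roach.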

Table 2 Analysis of molecular variance (AMOVA) testing for partitioning of genetic variation among roach samples, grouped according to geography
Table 3 Summary of pairwise FST and Dest among roach samples (see Additional file 4 for full table of values)
Figure 2

Neighbour-joining phylogenetic tree for the 39 roach population samples. The tree is based on the data from 14 microsatellite loci using chord distance from Cavalli-Sforza and Edwards [36]. Only bootstrap values above 50% are shown. Numbers at the end of sample codes indicate years that populations were sampled.

Population genetic structure of roach within rivers and catchments

Despite Thames catchment roach appearing not to constitute a single genetic unit, distinct from roach in other regions, the study did find evidence for significant genetic structuring in roach populations within the Thames catchment. This suggests the existence of local subpopulations exchanging a limited number of effective migrants (breeding individuals) rather than panmixia (where all individuals are potential partners). For example, average FST between samples in the Thames catchment was 0.022, only slightly lower than the average for between-catchment comparisons (0.028) for the study as a whole, while 262 of the 325 pairwise FST comparisons in the catchment were highly significant. There was a weak, but significant, relationship between genetic and geographic distance (r2 = 0.1089, P = 0.010) within the catchment, indicating a tendency for individuals to produce offspring with fish from nearby populations rather than distant populations [see Additional file 7]. Additionally, the population tree (Figure 2) and PCA analyses [see Additional file 5] showed groups comprising samples from neighbouring Thames sites: three from the main Thames; four from the Kennet and its tributary (Lambourn); samples from the Stort and the Lee; and samples from the Wandle and Mole. Samples from the Thames Blackwater collected in the years 2000 and 2010 (approximately two to three generations) clustered with very high bootstrap support (98%) in the population tree (Figure 2). This indicates that this roach population is largely restricted to this stretch of river, which includes both moderately and highly polluted sites, and has no substantial uncontaminated upstream stretch (Figure 1).
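The isolation-by-distance result above rests on a Mantel test, which asks whether two distance matrices are more correlated than expected when site labels are shuffled. The following is a minimal sketch of that permutation logic, assuming a Pearson correlation on the off-diagonal entries and a one-tailed test; the function names and default settings are hypothetical, and the GenAlEx implementation may differ in detail.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # square, symmetric distance matrix


def _upper(m: Matrix, order: Sequence[int]) -> List[float]:
    """Upper-triangle entries after relabelling rows/columns by `order`."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic: Matrix, geographic: Matrix,
           permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-tailed permutation P value)."""
    n = len(genetic)
    identity = list(range(n))
    geo = _upper(geographic, identity)
    r_obs = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # shuffle site labels of one matrix only
        if _pearson(_upper(genetic, order), geo) >= r_obs:
            hits += 1
    # Count the observed arrangement itself as one permutation.
    return r_obs, (hits + 1) / (permutations + 1)
```

Shuffling whole rows and columns together, rather than individual matrix cells, preserves the dependence among pairwise distances that share a site, which is why a Mantel test is used instead of an ordinary correlation test.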

Despite the proximity of some populations in the PCA and tree, we also found significant genetic differentiation between samples from some neighbouring stretches of the same river, sometimes occurring over small distances of separation (<10 km), for example, within the upper Lee (see below), between the Lee and the Stort, between the Blackwater and main Thames, between the Lambourn and the Kennet and within the Anglian Blackwater. In other cases, despite the separation of sampling locations by in-river impoundments such as weirs, we found no significant genetic differentiation between sites, for example, within the Stort, the main Thames, the Kennet, the Arun, the Nene and the Trent. Thus, patterns of within-river genetic structure differed between river stretches. For some other fish species analysis of genome-wide SNP data has provided greater resolution in population structure than that achievable using microsatellite data [37], and it is possible that some fine-scale genetic structure in the roach populations has not been detected with the microsatellites used in the current study.

Relationship between exposure to estrogenic effluents and effective population size

Estimates of Ne calculated from the microsatellite data using the approximate Bayesian computation method (Ne(ABC)) ranged from 54 to 301 for each sample, with higher precision for small Ne estimates (Figure 3A). We found no evidence for a correlation between Ne(ABC) and predicted E2 equivalents (E2Eq), a measure of total estrogenicity of the river water due to contamination by sewage effluent (generalized linear model (GLM), F(1,20) = 0.7468, P = 0.40) or for an interaction between sample site and estrogen exposure (GLM, F(6,19) = 1.9954, P = 0.14) across the 28 sample sites at which no recent restocking had occurred and sample sizes were sufficient for robust Ne calculation. However, the 95% confidence intervals (CI) for the model coefficient indicated Ne(ABC) could decrease by a maximum of 5.6% for each incremental increase in exposure of 1 ng/L E2Eq, or 65% at 11.6 ng/L E2Eq, equivalent to the most polluted river stretch included in this study. The inclusion of roach density as an additional covariate within the model also produced a non-significant result (GLM, F(1,16) = 1.3966, P = 0.26), albeit for a reduced number of sites (19). Similarly, there was no significant correlation between the other variables included in the statistical analyses (average flow rate, geographic/phylogenetic group and roach density) and Ne(ABC). This analysis makes the assumption that immigration of fish from remote sites is limited and this is discussed below.
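The two bounds quoted above (5.6% per 1 ng/L E2Eq and 65% at 11.6 ng/L E2Eq) are numerically consistent with a simple linear scaling of the upper confidence limit. The short sketch below checks that arithmetic and contrasts it with a compounding interpretation, which would give a smaller figure (roughly 49%). The linear reading is an assumption made here for illustration, since the model specification is not given in this section, and the function names are hypothetical.

```python
def linear_reduction(per_unit: float, e2eq: float) -> float:
    """Proportional Ne reduction if the per-unit decrease adds up linearly."""
    return min(1.0, per_unit * e2eq)


def compounded_reduction(per_unit: float, e2eq: float) -> float:
    """Proportional Ne reduction if the per-unit decrease compounds instead."""
    return 1.0 - (1.0 - per_unit) ** e2eq


UPPER_DECREASE_PER_NG = 0.056  # 5.6% per 1 ng/L E2Eq, as reported in the text
MOST_POLLUTED_E2EQ = 11.6      # ng/L E2Eq, as reported in the text

linear = linear_reduction(UPPER_DECREASE_PER_NG, MOST_POLLUTED_E2EQ)
compounded = compounded_reduction(UPPER_DECREASE_PER_NG, MOST_POLLUTED_E2EQ)
print(f"linear: {linear:.0%}")  # prints "linear: 65%"
print(f"compounded: {compounded:.0%}")
```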

Figure 3

Effective population size (Ne) plotted against predicted estrogen exposure for 37 population samples of Rutilus rutilus. (A) Ne calculated using the Approximate Bayesian Computation (ABC) method in the program ONeSAMP [38]. Tests for homogeneity of variances: Bartlett’s, P = 0.0036, Levene’s, P = 0.17. (B) Results of binning analysis for data shown in A; each bin (shown as mean and standard error) encompasses all data points from its lower bound up to, but not including, the next bin. (C) Ne calculated using the sibling assignment (SA) method in Colony [39]. In A and C, error bars are 95% confidence intervals. In cases in which more than one population had similar values, data points overlie each other; thus, individual data points are not always visible. These plots include estimates from sample sites sampled in different years, for example, in the River Nene (N) (which were averaged for statistical analysis) and sites where recent restocking had occurred (open circles: River Aire (A), River Wandle (W)), which were excluded from the statistical analyses.

There was limited evidence for reduced variation in Ne in roach sampled from more contaminated stretches of river compared to those sampled from less contaminated sites; all estimates above 6 ng/L E2Eq were below 100, whereas there was greater variation (54 to 301) where E2Eq was below 6 ng/L (Figure 3B). Estimates of Ne using other methods were of the same order of magnitude but had wider confidence intervals for each estimate; Ne(SA) ranged from 36 to 145 and also showed no relationship with E2Eq (Figure 3C). Temporal estimates of Ne, calculated from allele frequency changes over several generations using the Jorde and Ryman method [40], varied from 14 to 265, but were available for too few locations to make meaningful comparisons (Table 4). Overall, the relatively small variations in Ne observed in this study could not be explained by any of the environmental or other variables measured in this study.

Table 4 Temporal estimates of effective population size (Ne) among roach samples

Population genetic structure within the River Lee, a high effluent river

The average proportion of effluent in the upper Lee downstream from Harpenden and East Hyde WWTWs ranges from 28% to 70% in different stretches. Significant genetic differentiation was detected between fish sampled from four of the five locations in this stretch of river (FST values ≥0.009), and between these samples and two from its tributary, the Stort (FST ≥0.015) shown in Table 3. The presence of numerous large weirs (Figure 1) likely confines fish to particular areas of this river; nonetheless, samples from the Lee and the Stort did cluster together in some analyses (Figure 2, Additional file 5). A sample from the upstream, unpolluted sample site (LeeHUS) grouped with two samples (collected in 1995 and 2010) from the most polluted river stretch immediately downstream (LeeHyd). The next sample site downstream, LeeWhe, was distinct from these (Figure 2, Additional file 5 and Additional file 6), indicating restricted movement of fish between LeeHyd and LeeWhe over at least three to five generations (Figure 2). Analysis using the program IMA2 [42] suggested that there was less than one effective (breeding) migrant per generation between LeeHyd and LeeWhe in either direction, and about one migrant per generation from LeeWhe downstream to LeeSta (Figure 4). Collectively, these data suggest that roach populations at LeeWhe and those downstream do not rely on migration from the uncontaminated stretch of this river. Despite this, Ne(ABC) estimates for these polluted sites in the upper Lee ranged from 70 to 84 (95% CI: 50 to 127) compared to only 54 (95% CI: 42 to 82) for the upstream uncontaminated location, suggesting no substantial impact of the effluent on the effective population size of these roach.

Figure 4

Posterior probabilities of migration rates, effective population sizes (Ne) and time since divergence estimated using IMA2 [42] for two pairs of roach populations in the River Lee. (A) Estimates for parameters calculated for LeeHyd’10 and LeeWhe and (B) for LeeSta and LeeWhe. These Ne values are influenced by average Ne since the initial split of the populations.

Discussion

In this study, analyses of population genetic structure via the analysis of DNA microsatellite loci identified distinct subpopulations of roach in two tributaries of the Thames, the rivers Lee and Blackwater, that were largely restricted to high-effluent stretches of the rivers over multiple generations. This is despite evidence for widespread feminization of male fish in the studied rivers and previous evidence that feminization alters breeding capabilities [21, 26]. Both of these tributaries contain feminized fish [25], with predicted average exposure of between 4 and 9 ng/L E2Eq. We also found no statistically robust evidence for a substantial impact of estrogenic sewage effluents on Ne of roach. The possibility of a reduction in Ne of up to 65% for roach living in the most polluted river stretches (E2Eq of 11.6 ng/L) could not be ruled out, due to the wide 95% confidence intervals associated with the statistical model. Moreover, our analysis included relatively few samples from rivers in the highest risk category, largely because these sites are rare.

Caveats

As with any modelling exercise, this analysis makes assumptions that may affect the interpretation of the results. One of these assumptions is that migration between sites with different pollution profiles is limited over two to three generations, the time frame likely to have the greatest influence on Ne(ABC) [43]. This was ensured by selecting sites with physical obstructions between them. However, quantifying migration rates over this timescale was not always possible because all potential source populations could not be sampled and, in some cases, we found no significant genetic differentiation between roach at sites distant from one another. Genetic differentiation can take many generations to manifest with low levels of migration [44]. Histological data from the Arun and the Lee show that feminized gonads in roach were approximately 6-fold (Lee) and approximately 2.5-fold (Arun) more prevalent in populations living in the stretches downstream of major WWTW inputs compared with those living upstream [3]. This demonstrates that migration in these rivers was indeed restricted to stretches delimited by physical barriers, despite no significant genetic differentiation observed between river stretches (FST <0.002, Table 3).

A second assumption is that no restocking of the rivers sampled had occurred or that the effect of restocking activities on Ne(ABC) was relatively minor. Approximately 500,000 hatchery-reared roach just over one-year-old (so-called ‘1+’ fish) have been introduced into the Thames catchment since 2000; broodstock for hatchery fish originate from the river Trent. The influence of restocking activities on Ne(ABC), however, is likely to be relatively minor, as the sites sampled in this study were separated by major physical barriers from sites where these introductions had occurred. Moreover, introductions prior to 2000 are unlikely to have had a large influence on Ne(ABC) as this is primarily affected by the size and variance in reproductive success of the parental generation, which would have spawned between 2004 and 2007 for most of the samples in this study. However, we cannot exclude some influence of introductions prior to 2000 as some of the summary statistics used to calculate Ne(ABC) are known to be affected by demographic processes over a longer time period [38]. The effects of introductions on genetic diversity, the detection of bottlenecks and population structure are likely to be greater, as these factors are affected by demography over many generations. However, neither the success of the reintroduced fish nor the size of the roach population in the Thames is currently known. In salmonids, restocking success is highly variable and has been attributed to local adaptation [45]. Using our microsatellite dataset, 73% of 48 individual roach from a stretch of the Wandle (restocked in 2007, 2009, 2010) assigned to the Thames reporting regions. Only 5% (two fish) assigned to the Trent (the source of the parents of introduced fish), which may be mis-assignments, as 5% also assigned to the Arun, 10% to the Anglian Blackwater and 10% to the Nene, from where no restocking had taken place. Thus, the success of the re-introduced fish may be low, but this requires further investigation.

Evidence for self-sustaining populations in effluent contaminated rivers

While this study does not exclude the possibility that estrogenic effluents reduce Ne of fish populations, it suggests that roach populations can be self-sustaining despite exposure to estrogens over several generations. These findings are consistent with the fact that the prevalence of male fish with moderate to severely feminized gonads (that have been shown to have substantially reduced reproductive competitiveness in controlled breeding studies) is generally less than 10% in English rivers [3, 25, 26, 46]. The reproductive competitiveness of fish with the more common mild-intersex condition is similar to those of fish without gonadal feminization [26]. In roach, the gonads of male fish exposed to estrogens become progressively feminized with age [46], so gonadal feminization could theoretically increase Ne by reducing the reproductive dominance of large older males in estrogen-contaminated rivers. While the effects on females are less well studied, female roach exposed to an undiluted effluent for three years in large tanks were able to breed, despite the fact that this exposure caused complete gonadal feminization of males [21]; similarly, the majority of females collected from two effluent-polluted rivers examined in this study were also able to breed [26].

Population risks of long-term exposures to estrogenic effluents

The results of this study on wild roach populations seem to contrast with studies that have assessed population risk through long-term exposures to estrogens, where exposure to concentrations between 3 and 6 ng EE2/L [17–20] or to a full-strength effluent [21] resulted in all-female populations and/or reproductive failure. The apparent difference between the wild populations and those experiments designed to simulate ‘real world’ exposure, however, may be because the fish living in the effluent-contaminated rivers examined in this study have been exposed to a lower level of estrogen or because not all of the estrogen is bioavailable; organic pollutants can bind to particulates and dissolved organic matter [47]. The most contaminated river in this study has a mean proportion of effluent of approximately 70%, although the majority of contaminated English rivers average approximately 10% to 30% [34]. While EE2 has been measured up to approximately 4 to 8 ng/L in English WWTW effluents [33, 48], for the most part, they are lower [23, 49], and estrogen concentrations vary greatly over short periods. For instance, EE2 was detected in only 21 of 135 water samples from the Lee, although occasionally reaching 4 ng/L [33]. Considering the totality of estrogen content, the predicted average estrogenicity of the most contaminated site in this study is 12 ng/L E2Eq and would be below 21 ng/L for 90% of the time. Only 1% to 3% of 10,313 individual river reaches in the UK receiving WWTW effluent were predicted to have average E2Eq >10 ng/L, and, of these, many are ditches composed almost entirely of sewage effluent [34]. As E2 is approximately 10 times less potent than EE2 in inducing gonadal feminization in fish [34, 50], it is probable that average life-time exposure to estrogens in the wild does not currently reach the concentrations shown to cause sex-reversal and population collapse in controlled experimental exposures. Green et al. [51] recently predicted a doubling of estrogen exposure concentrations in some rivers with population growth and climate change by 2050, suggesting an increased likelihood of population level effects of estrogenic effluents in the future, unless mitigated by substantial improvements in sewage treatment processes.

Influences on the population genetic structure of roach

The population genetic structure of roach in southern England observed in this study may have been influenced by historical biogeography, migratory behaviour, human translocations and in-river barriers. Roach can be highly mobile and can migrate over 10 km, particularly in the spawning period April to June, if migration is not obstructed [52]; additionally, there is some evidence that roach show fidelity in migration and return to spawning sites they have used previously [53]. Within the Thames catchment, the observed population genetic structure likely results, at least in part, from the large number of physical barriers, such as weirs and locks (Figure 1); these have been recognized as major factors restricting movement (including downstream) of roach [54]. Similarly, the importance of barriers in driving intra-catchment genetic variation is well documented in other fish species, for example, brown trout [55]. Only obstructions in the main River Thames and the Kennet are equipped with fish passes and, although some passes can be used by roach [56], the effectiveness of these passes in allowing fish movement has not been studied. As we identified significant genetic differentiation between roach from the Kennet and the Thames, despite being connected by fish passes, these passes may represent major physical separation barriers to this fish species.

Conclusions

Despite the widespread feminization of male roach in effluent-contaminated rivers of southern England, using nuclear DNA microsatellites we were able to identify some populations that have been confined to stretches of river with moderate to high exposure to estrogenic effluents over multiple generations. We also found no evidence of a correlation between the Ne of roach populations and predicted exposure to estrogens, although because of the wide confidence intervals, a reduction in Ne of up to 65% is still possible at the most contaminated sites.

Methods

Study location

Southern England, particularly the region within the Thames catchment, was chosen for this study for four reasons. Firstly, it is a densely populated region with relatively low rainfall and, therefore, includes some river stretches with some of the highest concentrations of WWTW effluents in the United Kingdom [34]. Secondly, feminization of roach has been widely reported in the region [3, 46]. Thirdly, many rivers in the region have locks, dams or weirs which are likely to limit movement of fish species between stretches of river with different pollution profiles. Fourthly, the effluent concentrations and risk of estrogenic endocrine disruption have been modelled [34]. Sample sites are shown in Figure 1 and Table 1 and were selected to span the full range of predicted estrogen concentrations in English rivers and where obstructions are likely to restrict fish movements [3, 46].

Roach study species

Roach was selected as the study species because it is native and widely distributed in the United Kingdom, including in rivers polluted with WWTW effluents. Additionally, widespread feminization has been reported in wild populations, with a proven association with exposure to estrogenic effluents [14, 19, 21, 57]. Roach generally reach sexual maturity between two and three years and spawn annually in the spring. Adult roach can migrate considerable distances, but where weirs obstruct upstream and downstream movement they are able to complete their lifecycles in a single stretch of river [54].

Population-genetic analyses

To understand the extent to which roach populations are restricted to various stretches of river, several approaches were used to investigate population genetic structure. We analysed microsatellite loci variation in 1,769 fish sampled between 1995 and 2011. Each fish was genotyped at between 14 to 19 microsatellite loci. Microsatellite genotypes are provided in Additional file 8. Protocols for DNA extraction and details of amplification of the microsatellite loci are illustrated in Additional file 1. Data for 14 microsatellite loci were used to calculate three measures of genetic diversity: observed heterozygosity (HO) and expected heterozygosity (He) using GenAlEx 6 [58]; allelic richness (AR) was calculated using Fstat v2.9.3 [59] – see Additional file 1 for full details. The programme BOTTLENECK [60, 61] was used to test for recent genetic bottlenecks. This programme tests for a relative excess in heterozygosity that is apparent for a few generations after a bottleneck and develops because allelic diversity declines faster than heterozygosity, due to loss of rare alleles. Pairwise genetic differentiation between the sampled sites was estimated using FST, calculated using Arlequin 3.5 [62] and Jost’s D, Dest[63], calculated using SMOGD [64]. The significance of the FST estimates was assessed based on 10,000 permutations. AMOVA was performed using Arlequin. In order to test whether fish are more likely to produce offspring with local mates, compared to mates in geographically distant locations within the Thames catchment, isolation by distance analysis was performed using the Mantel test [65] in GenAlEx 6 [58]. Genetic similarity between populations was investigated using population based trees, calculated in POPULATIONS, v1.2.30beta [66], PCA in GenAlEx 6 [58] and a Bayesian clustering approach in STRUCTURE [67]. 
Finally, the program IMA2 [42, 68] was used to estimate migration rates between adjacent populations within high effluent stretches of the Lee that showed relatively high pairwise FST values (LeeHyd, LeeWhe and LeeSta). See Additional file 1 for further details. To investigate the influence of restocking, fish from the Wandle were genetically assigned on the basis of their microsatellite genotypes using the ‘leave one out test’ in the program ONCOR [69]. The reporting regions comprised: (1) the Wandle, (2) Lee/Stort, (3) rest of the Thames, (4) Trent, (5) Nene, (6) Arun, (7) Chelmer and (8) Anglian Blackwater. All animals used in this research were treated humanely and with regard for the alleviation of suffering; all procedures were subject to approval by the local ethical review process as required under the U.K. Animals (Scientific Procedures) Act (1986).
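The isolation by distance analysis in this section relies on the Mantel test, which correlates a matrix of pairwise genetic distances with a matrix of pairwise geographic distances and assesses significance by permuting site labels. The sketch below shows the general idea in plain Python. It is not the GenAlEx routine: the function names, the one-sided p-value convention, the site positions and the genetic distances are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mantel(gen, geo, permutations=999, seed=0):
    """One-sided Mantel test on two symmetric distance matrices.

    Rows and columns of `gen` are permuted together, which keeps the
    matrix a valid distance matrix while breaking its link to `geo`.
    """
    n = len(gen)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]
    r_obs = pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(permuted, geo_vec) >= r_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value


# Hypothetical river positions (km) for five sites
positions = [0, 12, 30, 55, 90]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical genetic distances: roughly proportional to distance plus noise
gen = [[0.0005 * geo[i][j] + 0.002 * ((i + j) % 2) for j in range(5)]
       for i in range(5)]
r_obs, p_value = mantel(gen, geo)
```

With only five sites there are just 120 distinct label orderings, so the p-value cannot be very small; real analyses with more sites give the permutation test more resolution.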

Effective population size

To test whether WWTW effluents substantially reduce the size of breeding populations, effective population sizes (Ne), which relate to the number of breeding fish and skews in breeding success, were estimated using the microsatellite genotypes. We compared Ne across sites ranging from those with little or no upstream WWTW effluent input to those where the majority of the flow can comprise WWTW effluent. Two single-sample (single-generation) methods, which use different aspects of the microsatellite data, were used to estimate Ne for each population: the Approximate Bayesian Computation (ABC) method using ONeSAMP 1.2 [38], hereafter referred to as Ne(ABC), and the sibling assignment (SA) method, Ne(SA) [39]. Temporal estimates of Ne, which are calculated from the change in allele frequencies between generations, were also obtained for sites where fish had been sampled more than once, using TempoFs [40] and NeEstimator [70]. For further details see Additional file 1.
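The logic of a temporal estimate can be sketched as follows: allele frequencies drift more between samples when the breeding population is small, so a standardized measure of frequency change, corrected for sampling noise, can be inverted to give Ne. The Python below is a deliberately simplified illustration and is not the estimator implemented in TempoFs or NeEstimator. It substitutes Nei and Tajima's Fc with a Waples-style sampling correction, Ne = t / (2 [Fc - 1/(2 S0) - 1/(2 St)]), and the function names, frequencies and sample sizes are hypothetical.

```python
def nei_tajima_fc(freqs_0, freqs_t):
    """Mean standardized variance in allele frequency change (Fc).

    freqs_0, freqs_t: frequencies of the same alleles at the first and
    second sampling times. Alleles absent from both samples are skipped.
    """
    terms = []
    for x, y in zip(freqs_0, freqs_t):
        denom = (x + y) / 2 - x * y
        if denom > 0:
            terms.append((x - y) ** 2 / denom)
    return sum(terms) / len(terms)


def temporal_ne(fc, generations, n0, nt):
    """Invert Fc to Ne after subtracting expected sampling variance.

    n0, nt: number of individuals sampled at each time point.
    Returns infinity when sampling noise alone explains the change.
    """
    drift = fc - 1 / (2 * n0) - 1 / (2 * nt)
    if drift <= 0:
        return float("inf")
    return generations / (2 * drift)


# Hypothetical frequencies of three alleles at one locus, 3 generations apart
fc = nei_tajima_fc([0.5, 0.3, 0.2], [0.4, 0.35, 0.25])
ne = temporal_ne(fc, generations=3, n0=50, nt=50)
```

An infinite estimate is returned when the observed change is no larger than expected from sampling alone, which is a common outcome when samples are small relative to the true Ne.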

Statistical analysis

Generalised linear models (GLMs) were used to examine the relationship between predicted exposure to estrogenic effluents and Ne(ABC). For further details see Additional file 1. Differences in genetic diversity among sampled populations were tested using ANOVA. All statistical analyses were performed in R 2.13.0 [71].
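For readers unfamiliar with the ANOVA comparison mentioned above, the following minimal Python sketch computes a one-way F statistic from grouped values. The analysis in the study was run in R, so this is only an illustration of the calculation; the function name and the diversity values are hypothetical, and converting F to a p-value would require an F distribution function that is not included here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, each holding the values for one population.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of values around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-locus allelic richness values for three populations
richness = [[4.0, 5.0, 6.0], [5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
f_stat, df_b, df_w = one_way_anova_f(richness)
```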

Authors’ information

PH, the corresponding author, is a molecular ecologist and evolutionary biologist, and is a Research Fellow at the Biosciences Department at the University of Exeter.