Cryptosporidiosis is a common gastrointestinal disease and is widespread in many developed and developing countries [1]. Our understanding of the causes of cryptosporidiosis is based upon a large number of outbreak studies and a smaller number of case-control studies. These have highlighted risks from drinking water from poorly treated public and private supplies, swimming in swimming pools, contact with farm animals and spread within institutions such as day care centres. Contact with young children is also known to be a risk factor [2]. There are 16 species of Cryptosporidium but C. hominis and C. parvum are the most important pathogens for humans [3]. In England and Wales there is a large environmental reservoir of C. parvum in livestock and transmission is related to direct or indirect contact (e.g. through faecally contaminated drinking water) with these animals. The only major environmental reservoir for C. hominis is humans, and so infection is acquired through direct or indirect contact with other infected or colonised humans, particularly when associated with foreign travel [2, 4]. For both species, once a primary infection has occurred, secondary infections may result from person-to-person transmission.

Epidemiological studies into cryptosporidiosis have used questionnaires to identify proximal risk factors (patient activity) for cryptosporidiosis such as drinking water consumption or farm visits [5]. The objective of this study was to investigate the role of wider environmental and socioeconomic factors (e.g. water supply, socioeconomic status, land use, livestock densities and healthcare accessibility) upon human cryptosporidiosis.

This was achieved by linking small area data with postcoded cryptosporidiosis cases. All cases were genotyped to the species level enabling us to explore the aetiologies of the two main species C. hominis and C. parvum separately. We used a Geographical Information System (GIS) to derive detailed and aetiologically relevant indicators of environmental sources and points of contamination. We generated detailed measures of drinking water risks by tracing each individual’s mains water supply back to the water treatment works and into the source catchment.


The incidence of cryptosporidiosis varies geographically and studies have indicated that C. hominis is more common in urban areas and C. parvum in rural areas [6, 7]. However, few studies have examined the reasons behind this. The C. parvum excess in rural areas may be related to higher probabilities of animal contact [6], a risk factor identified in a case-control study in England and Wales [2]. Higher population densities in urban areas may lead to greater person-to-person transmission for C. hominis. Socioeconomic status varies between geographical areas and this may lead to differences in reported incidence of disease. Higher social classes are more likely to undertake foreign travel [8] and use swimming pools [9], both known risk factors for cryptosporidiosis, especially C. hominis. Conversely, individuals of higher social class are more likely to consume fresh fruit and vegetables [10], known to be negatively associated with cryptosporidiosis [2]. Agricultural workers are more likely to be exposed to Cryptosporidium [11] and their distribution varies spatially. Individuals with poor access to healthcare are less likely to seek treatment [12] and this may vary geographically or socially. Water supplies are a well-known risk factor for cryptosporidiosis [13] and the raw water quality and the efficiency and efficacy of treatment systems vary. Biofilms, organic coatings in water distribution systems, may play a role in prolonging the presence of Cryptosporidium in water distribution systems [14]. The probability of biofilm build-up is likely to be higher in long water distribution systems.


Species specific cryptosporidiosis incidence data were obtained from the UK Cryptosporidium reference unit for the 4 year period from January 2000, collected as part of the national collection of Cryptosporidium oocysts [15]. These data represent just under half of all the cases reported to national surveillance in England and Wales over that period (Chalmers et al. in preparation). From the original dataset of 8075 cases, 3343 were analysed once a number of records were removed from the analysis. All cases where the patient had reported recent foreign travel (22%) were removed from the sample as the infection may have been acquired overseas. These data were based upon information collected by laboratories from the faecal sample request form that was filled in by the patient or physician. It is likely, based on information from previous work, that travel information is under-reported. Further cases were removed due to gaps in the geographical coverage of the independent variables (5%) and the inability to identify the species using PCR-based methods (2%).

However, the main reason for the removal of cases was incomplete postcodes (59%), which made it impossible to identify an accurate residential location. It is important to consider whether this omission may be correlated with any of our independent variables and hence introduce bias into the results. One source of missing postcodes is transcription omissions at individual laboratories, but because these are likely to affect samples in a random manner they are unlikely to be correlated with any independent variables. The second major source is omissions by general practitioners, who may hold inaccurate patient records or fail to include postcodes when submitting samples to laboratories for analysis. There is no evidence as to whether this varies systematically in a manner that would bias the results of this research.

For each case, one comparator control postcode was randomly selected from within the service area of the laboratory to which the case stool sample was sent. The process was weighted to account for differences in population between postcodes. Control postcodes were selected from within the same laboratory service area as the case postcode because different laboratories have different selection criteria and screening protocols for cryptosporidiosis [16] and so differences in incidence between laboratory service areas may be artefactual. Service areas were generated by assuming that each sample was sent to its nearest laboratory (based on shortest road distance). This resulted in 135 service areas across England and Wales. The method was validated by comparing the postcodes of the cases to the laboratory to which their sample was sent, which indicated that 87% of cases were assigned to the correct laboratory. These service areas varied in size from 106 km² to 3,770 km² with an average size of 1,142 km². They were consequently large enough to contain a variety of different drinking water sources and encompass a range of agricultural landscapes and socioeconomic characteristics.
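The population-weighted selection of a control postcode can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `select_control`, the postcodes and the population counts are hypothetical, and the sketch assumes that each candidate postcode is drawn with probability proportional to its population and that the case's own postcode is excluded.

```python
import random

def select_control(postcode_populations, case_postcode, rng):
    """Pick one control postcode from the case's laboratory service area.

    postcode_populations: dict mapping postcode -> resident population
    (hypothetical values). The case postcode is excluded (an assumption)
    and the remaining postcodes are sampled with probability proportional
    to their population.
    """
    candidates = [p for p in postcode_populations if p != case_postcode]
    weights = [postcode_populations[p] for p in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical service area containing three postcodes.
service_area = {"AA1 1AA": 40, "AA1 1AB": 10, "AA1 1AC": 50}
rng = random.Random(0)  # fixed seed so the draw is reproducible
control = select_control(service_area, "AA1 1AA", rng)
```

Under these assumptions a postcode with five times the population of another is five times as likely to be chosen as the control.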

Our control postcodes represent areas where no cryptosporidiosis was reported. Because of under-reporting [17] and asymptomatic infection [18, 19], some of the control postcodes may have contained affected individuals. However, the reported rate of cryptosporidiosis in England and Wales is only 9 per 100,000 per year and so, even taking into account under-reporting and asymptomatic infection, the probability of a control postcode containing cases remains extremely low. Furthermore, such a bias would only move our results towards the null hypothesis.

For each case postcode and control postcode, explanatory variables were derived using GIS (ArcGIS 9.1). A measure of the degree of rurality was obtained using the Rural and Urban Area Classification [20]. This groups census areas into one of eight categories ranging from “urban areas” to “hamlets and isolated dwellings in sparse surroundings”. A buffer zone of 2.5 km was created around the centre of each postcode and within these areas the GIS was used to extract estimates of the total amount of Cryptosporidium applied to land through animal manures. These were based upon a 1 km² map of manure applications developed by the Agricultural Land Advisory Service [21]. This was created by combining information from the agricultural census with land use data, information from animal excreta and manure management surveys and estimates of oocyst concentrations in manure [22]. The estimates will consist of C. parvum and other species/genotypes common in animals, which are unlikely to include C. hominis.
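The buffer calculation can be illustrated with a short sketch. It is not the ArcGIS workflow used in the study: the function name `oocysts_within_buffer`, the coordinates and the cell values are hypothetical, and the sketch assumes the simplest possible rule, namely that a 1 km grid cell counts in full if its centre lies within 2.5 km of the postcode centre. A GIS would typically weight cells that are only partly inside the buffer by the overlapping area.

```python
import math

def oocysts_within_buffer(centre, grid_cells, radius_km=2.5):
    """Sum the values of grid cells whose centres fall inside the buffer.

    centre: (easting_km, northing_km) of the postcode centre.
    grid_cells: iterable of (easting_km, northing_km, value), one entry
    per 1 km cell centre. Values are illustrative, not study data.
    """
    cx, cy = centre
    total = 0.0
    for x, y, value in grid_cells:
        # Straight-line distance from postcode centre to cell centre.
        if math.hypot(x - cx, y - cy) <= radius_km:
            total += value
    return total

# Three hypothetical cells; the third lies about 3.5 km away and is excluded.
cells = [(0.5, 0.5, 10.0), (1.5, 0.5, 20.0), (3.5, 0.5, 40.0)]
total = oocysts_within_buffer((0.0, 0.0), cells)  # 10.0 + 20.0 = 30.0
```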

A range of socioeconomic variables was obtained by identifying the 2001 Output Area within which each postcode was located. Output Areas are the smallest areas for which census data are released and contain approximately 125 individuals. Within each Output Area the percentage of people in each of the eight socioeconomic status bands was identified. These ranged from higher managerial and professional occupations to routine occupations. Additionally, the proportion of the population aged 0–4 years and the percentage employed in agriculture were obtained. Finally, a measure of healthcare accessibility was derived by calculating the travel time from each postcode to the nearest GP.

Individuals receive their water from one or more water treatment works, which abstract their water from one or more sources and are subject to different forms of water treatment. Consequently, a number of volume-weighted measures were derived to describe the water supply of each case and control postcode. Information on the proportion of water supplied from different sources (surface vs. groundwater) and subject to different treatments (e.g. membrane filtration, simple disinfection) was obtained from the England and Wales Drinking Water Inspectorate. These data included any changes in treatment that occurred during the 2000–2003 period. Information could only be obtained for public water supplies. For each surface or groundwater abstraction, catchments were calculated using the GIS and the density of Cryptosporidium applications to land, sewage discharges and sewage overflows in each were calculated. The probability of biofilm build-up was simulated by calculating the volume weighted, average straight line distance between each postcode and the water treatment works supplying their water.
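A volume-weighted measure of this kind can be sketched as below. The function name `volume_weighted_mean`, the dictionary keys and the numbers are hypothetical and serve only to show the arithmetic: each treatment works contributes to a postcode's value in proportion to the volume of water it supplies. The same calculation applies to a proportion (for example the share of groundwater) and to the straight-line distance used as the biofilm proxy.

```python
def volume_weighted_mean(supplies, key):
    """Volume-weighted average of one attribute across treatment works.

    supplies: list of dicts, one per treatment works serving a postcode,
    each holding a 'volume' entry and the attribute named by `key`.
    """
    total_volume = sum(s["volume"] for s in supplies)
    if total_volume <= 0:
        raise ValueError("total supplied volume must be positive")
    return sum(s["volume"] * s[key] for s in supplies) / total_volume

# Hypothetical postcode supplied by two works: 75% groundwater from a
# works 4 km away and 25% surface water from a works 12 km away.
supplies = [
    {"volume": 75.0, "groundwater": 1.0, "distance_km": 4.0},
    {"volume": 25.0, "groundwater": 0.0, "distance_km": 12.0},
]
groundwater_share = volume_weighted_mean(supplies, "groundwater")  # 0.75
mean_distance_km = volume_weighted_mean(supplies, "distance_km")   # 6.0
```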

In total over 50 explanatory variables were produced. Multivariable models were then constructed using conditional logistic regression for all Cryptosporidium infections followed by separate models for C. hominis and C. parvum. A forward regression technique was performed by adding the most significant variable in turn. Collinearity was avoided by ensuring that the addition of each variable did not lead to significant changes in the coefficients or significance of any other variables in the model. Standardised forms of the independent variables were entered into the models. Results are presented in terms of Odds Ratio (OR) estimates and the 95% Confidence Interval (CI). The statistical analysis was undertaken using STATA/SE 8.2.
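The link between a standardised variable, a conditional logistic regression coefficient and the reported odds ratio can be illustrated with a minimal sketch. It is not the STATA routine used in the study. It assumes 1:1 matched case-control pairs and a single covariate, in which case the conditional log-likelihood reduces to the sum of -log(1 + exp(-beta * d)) over pairs, where d is the case value minus the control value. All function names and data are hypothetical.

```python
import math

def standardise(values):
    """Rescale values to mean 0 and standard deviation 1 (sample SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def score_and_information(beta, diffs):
    """Gradient and observed information of the conditional log-likelihood."""
    probs = [1.0 / (1.0 + math.exp(-beta * d)) for d in diffs]
    gradient = sum(d * (1.0 - p) for d, p in zip(diffs, probs))
    information = sum(d * d * p * (1.0 - p) for d, p in zip(diffs, probs))
    return gradient, information

def fit_matched_pairs(diffs, iterations=50, tol=1e-10):
    """Newton-Raphson fit for 1:1 matched pairs; returns (beta, se).

    Assumes the differences are not all zero and not all of one sign,
    otherwise no finite estimate exists.
    """
    beta = 0.0
    for _ in range(iterations):
        gradient, information = score_and_information(beta, diffs)
        step = gradient / information
        beta += step
        if abs(step) < tol:
            break
    _, information = score_and_information(beta, diffs)
    return beta, 1.0 / math.sqrt(information)

def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and 95% confidence interval from a coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical pairs: the case value exceeds the control value in two of three.
diffs = [1.0, 1.0, -1.0]
beta, se = fit_matched_pairs(diffs)
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
```

With these three illustrative pairs the estimate is beta = ln 2, giving an odds ratio of 2 with a very wide confidence interval, as expected from so few pairs. Because the variables are standardised before fitting, an odds ratio of this kind is read as the change in odds per one standard deviation increase in the explanatory variable.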

Much of our understanding of the aetiology of cryptosporidiosis is obtained from an analysis of reported outbreaks. However, these only represent about 8% of all cases reported to national surveillance. In order to examine whether outbreak cases were significantly influencing the results presented in this paper all our models were fitted with and without outbreak cases. Cases associated with a particular outbreak are identified within the national collection of Cryptosporidium oocysts used for this study from records held by local Health Protection Teams or Units. The omission of outbreak cases had a minimal impact upon the results implying that outbreak cases are not significantly influencing the results presented. Consequently, the models presented in this paper are inclusive of all cases.


The results for all Cryptosporidium infections are shown in Table 1. These indicate that living in an area with higher amounts of Cryptosporidium applied to land in a 2.5 km buffer around each postcode (OR 1.084 P = 0.022), larger proportions of individuals in the 0–4 years age group (OR 1.145 P < 0.001) and more individuals in the highest socioeconomic status groups (OR 1.203 P < 0.001) were all positively associated with risk. Drinking water subject to superior water treatment (OR 0.770 P < 0.001) and groundwater sourced drinking water (OR 0.821 P = 0.001) were negatively associated with risk. Once these two drinking water variables were included in the model the proportion of the water supply which was from non superiorly treated surface water sources was negatively associated with risk (OR 0.869 P = 0.019).

Table 1 Multivariate model for all Cryptosporidium infection

The dataset was then split into C. parvum and C. hominis. The results for C. parvum are shown in Table 2 and indicate that living in an urban area was negatively associated with risk (OR 0.852 P < 0.001). Living in an area with higher amounts of Cryptosporidium applied to land in a 2.5 km buffer around each postcode (OR 1.167 P < 0.005), larger proportions of individuals in the 0–4 age group (OR 1.094 P = 0.018) and more individuals in the highest socioeconomic status groups (OR 1.109 P = 0.010) were all positively associated with risk. Drinking water subject to superior water treatment (OR 0.738 P < 0.001) and groundwater sourced drinking water (OR 0.679 P < 0.001) were negatively associated with risk. Once these two drinking water variables were controlled for, two further variables became significant. An interaction between the proportion of groundwater supply and the amount of Cryptosporidium applied to land in the catchment (OR 1.289 P < 0.001) was positively associated with risk. The proportion of the water supply that was not superiorly treated and came from a surface water source with a high amount of Cryptosporidium applied to land in the catchment was negatively associated with risk (OR 0.846 P = 0.009).

Table 2 Multivariate model for Cryptosporidium parvum infection

The results for C. hominis are presented in Table 3 and indicate positive associations with living in an urban area (OR 1.261 P < 0.001), an area with a high proportion of individuals from social classes 1–4 (OR 1.297 P < 0.001) and many individuals in the 0–4 age group (OR 1.189 P < 0.001).

Table 3 Multivariate model for Cryptosporidium hominis infection


In the full model all Cryptosporidium illness was positively associated with the quantity of Cryptosporidium applied to land in a 2.5 km buffer around each postcode. This may be due to the increased probability of direct contact with the environmental reservoir of Cryptosporidium in animal manures. When the analysis was split into C. hominis and C. parvum, surrounding Cryptosporidium applications to land was only a significant risk factor for C. parvum. This is unsurprising as animal manures are unlikely to include C. hominis.

There were positive associations between cryptosporidiosis and higher social classes for all Cryptosporidium illness, but when the individual species were examined the social class gradient was significantly stronger for C. hominis than for C. parvum (OR 1.297 vs. 1.109). There are a number of explanations for this. For C. parvum there is a large environmental reservoir, and individuals of higher social classes are more likely to undertake recreational activities such as walking in the countryside [9], which may increase the chance of animal-to-human transmission. Additionally, individuals of higher social classes are more likely to undertake foreign travel [8], a known risk factor for cryptosporidiosis but especially for C. hominis [4]. Foreign travel cases were removed from our analysis, though it is likely that some travel-related cases remained within the dataset. Furthermore, the association with social class may indicate community transmission from travel cases to other individuals in the same social group with whom they are most likely to interact. This may be direct person-to-person transmission [2] or indirect transmission such as through swimming pools. It is known that recreational swimming is positively associated with higher social groups [9]. It is also worth highlighting that these positive associations with higher social classes are overriding other factors. Individuals of a higher social class are more likely to consume fresh fruit and vegetables [10], known to be negatively associated with cryptosporidiosis [2].

Living in an area with a higher proportion of individuals aged 0–4 years is a risk factor for all illness with Cryptosporidium and for C. parvum and C. hominis individually. It is known that cryptosporidiosis incidence is highest in this age group. However, the observation that the magnitude of effect is greater for C. hominis than for C. parvum (OR 1.190 vs. 1.094) is surprising, as previous studies have shown that a greater proportion of C. parvum cases (33%) than of C. hominis cases (20%) occurs in children in the 0–4 age group [2]. Toileting contact with a child under 5 years, even in the absence of symptoms, has been shown to be a risk factor for C. hominis but not C. parvum [2]. The authors of that study suggested that asymptomatic carriage in young children may be one of the main reservoirs of C. hominis.

The association with children in the 0–4 year age group was examined further by subdividing the C. parvum and C. hominis models into those cases where the affected individual was 4 years or younger and those where the individual was 5 years or older. To ensure comparability, the explanatory variables fitted to each sub-model were identical to those in the original models. The resulting models were then examined for major shifts in the significance of the explanatory variables. The only major difference for C. parvum was that the percentage of individuals aged 0–4 in the area was not significant (P = 0.9) in the model for those 5 years and older. This suggests that young children are not a risk factor for C. parvum illness in older individuals. In contrast, the percentage aged 0–4 was a significant risk factor in both young and old for C. hominis, highlighting the importance of person-to-person transmission between young and older individuals. In the 4 years and younger model, living in an urban area was not a significant risk factor (P = 0.7) for C. hominis.

All illness with Cryptosporidium showed negative associations with areas having superior water treatment, and negative associations with areas supplied by large amounts of groundwater. When the models were split by species, this effect was observed for only C. parvum. Although there have been outbreaks of C. hominis associated with sewage contamination of drinking water, C. parvum dominates in the environment and so the lack of association between C. hominis and drinking water is unsurprising.

Once water source and treatment were included in the model, further variables became significant in the overall model and in the C. parvum model. These need to be interpreted with caution as they will be greatly impacted by any measurement error in our source and treatment drinking water variables. In the overall model the proportion of poorly treated water supplied from surface water sources was negatively associated with risk. In the C. parvum model a similar variable became significant namely the proportion of poorly treated water supplied from surface water sources from catchments with high amounts of Cryptosporidium applications to land. Both these associations indicate reductions in risk from the drinking water supplies most at risk of Cryptosporidium contamination. This may highlight immunity in a population periodically exposed to Cryptosporidium [19]. Alternatively new Cryptosporidium regulations to improve water supplies have led to improved water treatment at sites most at risk of Cryptosporidium contamination [23] and these improvements have been incorporated into our water supply database. Consequently the negative association with the most at risk sites may be evidence of the new regulations leading to the improved monitoring and attention being paid to these most at risk plants [23].

In the C. parvum model a positive association was observed in the interaction variable between the proportion of groundwater sourced drinking water and the quantity of Cryptosporidium applied to land in the groundwater catchment. This highlights that, although generally of lower risk, if there is a high density of Cryptosporidium in the groundwater catchment then these can be a risk factor for illness with C. parvum.

The links between cryptosporidiosis and drinking water are noteworthy as this period covers the new drinking water regulations to control Cryptosporidium implemented in 2000 [23]. Since this time a reduction in cryptosporidiosis has been reported, especially the reduction in the size of the spring peak nationally [24]. Consequently, although the regulations appear to have had success in reducing illness, drinking water remains a significant risk factor for cryptosporidiosis in England and Wales.

The urban–rural gradient was not a significant variable in the full model. However, in the C. parvum model negative associations were found with urban areas. It is important to note that these associations were independent of any impacts relating to water supply and Cryptosporidium applications to land, which we would expect to be correlated with urban areas. It may be that other factors associated with living in urban areas are important. It may indicate that these individuals are less likely to interact with their surroundings reducing the probability of C. parvum infection. An alternative explanation is that our water treatment variables reflect the supply pattern in 2003. In earlier years it may be that the association with land use is accounting for the greater uncertainty in the water treatment variables. This is plausible as rural water supplies are likely to be at greater risk of C. parvum contamination from agricultural and possibly wildlife sources. This argument is strengthened by the observation that when the data were stratified into 2000–2001 and 2002–2003 the land use variable was only significant in the earlier period.

For C. hominis a positive association with urban areas was observed. This may be due to increased opportunities for person-to-person transmission in these areas due to factors such as more nursing homes, day care centres and enhanced availability of swimming pools. The association between increased risk of a disease transmitted from person to person and increased population density is in line with that predicted by epidemiological theory [25]. In the model of all illness with Cryptosporidium the urban–rural gradient is not a significant variable, implying that the different relationships observed for C. parvum and C. hominis are cancelling each other out in the full model. The observation of different associations with urban areas for C. parvum and C. hominis is also noteworthy because of our earlier concern about omissions of postcodes by general practitioners and whether this could bias the study results. It could be suggested that postcode completeness varies between general practitioners in urban and non-urban areas. However, the models for C. parvum and C. hominis show different signs for the urban area variable. This implies that any bias due to differential general practitioner completeness of postcode information between urban and non-urban areas is likely to be minimal.

In this study no associations were found between cryptosporidiosis and the proxy variable for biofilm build-up, accessibility to healthcare or the proportion of the population employed in agriculture. This suggests that these variables are not of major importance to cryptosporidiosis aetiology or that our methodology was unable to create effective measures of them.

In conclusion, this is the first study to examine the influence of the wider environment upon illness with Cryptosporidium and of the two species C. hominis and C. parvum separately. It has shown separate aetiologies for each species highlighting the importance of separating C. hominis and C. parvum in epidemiological studies. For the first time it has also demonstrated all illness with Cryptosporidium to be higher in areas populated by higher social classes and this effect is greater for illness with C. hominis. It also supports studies indicating the importance of young children to illness with both Cryptosporidium species. This is the first study to highlight the importance of the agriculture surrounding place of residence as a risk factor for C. parvum illness. Elevated incidence of C. parvum illness was found in rural areas and elevated levels of C. hominis illness in urban areas, effects that were independent of water supply and agricultural factors. The analysis highlighted that, in spite of new regulations to reduce Cryptosporidium concentrations in drinking water, water supply remained a risk factor for illness with C. parvum. Risk was lowered in areas with superior water treatment and in areas supplied by groundwater. There was also evidence that, in certain circumstances, groundwater could be a risk factor for C. parvum illness. Finally there were indications of a lowering of risk of C. parvum illness in the most at risk water supplies, an impact that was independent of superior treatment. This may be evidence of population immunity or effective regulation governing water treatment and monitoring.