Introduction

Legionella species are an important cause of severe respiratory illness (Stout and Yu, 1997). Although traditionally associated with aerosol-related respiratory outbreaks (due to contaminated cooling towers, fountains, and misters), Legionella sp. are increasingly recognized as a cause of sporadic community-acquired pneumonia (Fisman et al., 2005; Benin et al., 2002), particularly in hosts with underlying medical illnesses. The case-fatality rate for patients with legionellosis is 10–40%, and may approach 50% in nosocomial outbreaks (Benin et al., 2002).

Several lines of evidence point to the importance of environmental influences in the pathogenesis of legionellosis in human hosts, including the ubiquity of Legionella species in soil and fresh water, seasonality of occurrence, and apparent geographic variation in incidence (Fisman et al., 2005; Colbourne and Trew, 1986; Koide et al., 2001; Riffard et al., 2001; Steele et al., 1990; Stout et al., 1985; Sala et al., 2007; Bhopal and Fallon, 1991a). The Greater Toronto Area (GTA) in Canada has experienced a gradual increase in legionellosis incidence during the past two decades, although this may be attributable in part to improved recognition and diagnosis of disease (Benin et al., 2002; Joseph, 2004). However, it also is possible that the changing patterns of disease reflect changes in local environmental conditions (Fisman et al., 2005; Straus et al., 1996; Bhopal and Fallon, 1991b; Hicks et al., 2007). Improved understanding of environmental factors associated with legionellosis would aid formulation of prevention strategies and could enhance recognition of cases by clinicians.

Traditional methodologies used to establish causal links between environmental factors and occurrence of communicable disease can be confounded when the disease is seasonal in nature (Fisman et al., 2005). It may be difficult to separate seasonal changes in population behavior (e.g., use of air conditioning) from direct environmental effects on hosts and pathogens, resulting in confounded associations and potential “ecological fallacy” (Portnov et al., 2007). We sought to assess the link between environmental effects and legionellosis occurrence using two methods that control for underlying seasonal oscillation in disease occurrence: (1) the use of negative binomial regression, with inclusion of seasonal smoothers that account for periodic oscillation in incidence (Jensen et al., 2003; Afifi et al., 2007); and (2) case-crossover design, which permits identification of acute environmental effects associated with increased risk of rare health outcomes (Maclure, 1991; Bateman and Schwartz, 2001).

Methods

The GTA is the most populous metropolitan area in the Canadian province of Ontario, with a population of 5,555,912 in 2006 (StatsCanada. Canadian Census, 2006). The GTA is a 7,125 km² area located in the southern part of Ontario and is comprised of five administrative regions (Toronto, York, Halton, Peel, and Durham), all of which have separate public health units charged with disease prevention and control activities (Figure 1). Demographic data for each health region, and for Ontario, were obtained from the 1996 and 2001 Canadian census (2006); data for intercensus years were estimated through linear interpolation and extrapolation. As current health regions came into being in 1995, we restricted estimates of disease rates to the period from 1990 to 2006, due to concerns that population extrapolation to earlier dates would be inaccurate.
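The interpolation and extrapolation of intercensus populations can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name is hypothetical, and the census counts are placeholders, not study data.

```python
def interpolate_population(year, year_a, pop_a, year_b, pop_b):
    """Linearly interpolate (or extrapolate) a population estimate.

    The straight line through (year_a, pop_a) and (year_b, pop_b) is
    evaluated at `year`; years outside [year_a, year_b] are extrapolated.
    """
    slope = (pop_b - pop_a) / (year_b - year_a)  # persons per year
    return pop_a + slope * (year - year_a)


# Hypothetical census counts, for illustration only (not study data).
CENSUS_1996 = 1_000_000
CENSUS_2001 = 1_100_000

# Interpolated estimate for a year between the two censuses.
estimate_1998 = interpolate_population(1998, 1996, CENSUS_1996, 2001, CENSUS_2001)
# Extrapolated estimate for a year after the second census.
estimate_2006 = interpolate_population(2006, 1996, CENSUS_1996, 2001, CENSUS_2001)
```

Because the same line is extended in both directions, the error of an extrapolated estimate grows with distance from the census years, which is consistent with the decision to avoid extrapolating to much earlier dates.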

Figure 1

Map of public health units comprising the Greater Toronto Area (GTA). The map illustrates major water bodies within the GTA, particularly the rivers and creeks that flow into Lake Ontario. The dots indicate the locations of river and creek stations; the airplane symbol represents the airport station where meteorological data were recorded; and the triangle corresponds to the Lake Ontario station where lake temperature was recorded.

Legionellosis is a notifiable disease in the province of Ontario, and all testing for Legionella species related to public health disease control activities is performed at the Ontario Central Public Health Laboratory (CPHL) in Toronto. One of the authors (PT) has maintained a database of all test results related to Legionella species dating back to 1978. The database includes information on date of onset, age and sex of patient, hospital of admission, public health unit and/or city from which specimens were submitted, and Legionella testing method and results. Because this is a laboratory-based database, information on case outcomes, clinical characteristics of cases, and background data, such as smoking status and the presence of immune compromise, was not available.

A diagnosis of legionellosis was confirmed when at least one of the following criteria was met in conjunction with a compatible clinical illness: (1) isolation of a Legionella species or detection of the antigen from respiratory secretions, lung tissue, pleural fluid, or other normally sterile fluids; or (2) a significant increase (fourfold or greater) in Legionella sp. IgG/IgM titre between acute and convalescent sera; or (3) single specimen or standing IgG/IgM titre >1:256.

Daily weather data from 1978 to 2006 were obtained from the Canadian National Climate Archive (Environment Canada. National Climate Archive, 2006). Weather recordings were collected at Toronto Pearson International Airport, located approximately 40 km from downtown Toronto (Figure 1). Exposures included temperature, precipitation, atmospheric pressure, and relative humidity. Hydrological data included Lake Ontario surface temperature and levels, obtained from the NOAA Great Lakes Coast watch website (NOAA Great Lakes Coast watch Node, 2007). Data on flows and levels for the major components of the GTA watershed (the Humber, Don, and Rouge Rivers, and Black Creek) were obtained from the Water Survey Branch of Environment Canada (Water Survey of Canada—Environment Canada, 2007). Locations of Lake Ontario and river monitoring stations are presented in Figure 1.

Statistical Methods

Rates of disease occurrence were calculated using estimated public health unit populations for 1998 (midpoint of the 1990–2006 time series). Date of disease onset was available for 817 of 837 cases (97.6%). When the date of onset of symptoms was not available, the case was assigned an onset date based on the average lag between test date and onset date (1-day lag).

Seasonal and temporal trends in weekly case occurrence were evaluated using an approach described by Jensen et al. (2003), in which negative binomial regression models incorporate weekly sine and cosine terms (adjusting for seasonal effects) and yearly terms, such that:

$$ E(Y) = \exp \left( \alpha + \beta_{1} (\text{year}) + \beta_{2} \sin (2 \pi \cdot \text{week}/52) + \beta_{3} \cos (2 \pi \cdot \text{week}/52) \right) $$

E(Y) is the expected case count in a given week, α is a constant, and each \( \beta_{\text{i}} \) represents a regression coefficient for year or week. We used negative binomial models instead of Poisson models in our study due to overdispersion of weekly case counts as a result of high frequency of weeks with zero cases (Hougaard et al., 1997). Multivariable models were created using a backwards stepwise algorithm, with covariates retained if P values were ≤0.05. Standard errors were adjusted for clustering by public health unit.
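The log-linear seasonal model can be made concrete with a short sketch. It is an illustration only, not the authors' SAS or STATA code: the function names and coefficient values are hypothetical, and `year` is assumed to be counted from the start of the series. The second function uses the identity that a sine and cosine pair with a common period is a single cosine wave, which gives the seasonal amplitude and the week at which expected counts peak.

```python
import math


def expected_weekly_cases(year, week, alpha, beta_year, beta_sin, beta_cos):
    """Expected case count E(Y) under the log-linear seasonal model."""
    angle = 2 * math.pi * week / 52
    linear_predictor = (alpha + beta_year * year
                        + beta_sin * math.sin(angle)
                        + beta_cos * math.cos(angle))
    return math.exp(linear_predictor)


def seasonal_amplitude_and_peak(beta_sin, beta_cos):
    """Amplitude (log scale) and peak week of the combined seasonal term.

    Uses b_s*sin(t) + b_c*cos(t) = A*cos(t - phi), where
    A = sqrt(b_s^2 + b_c^2) and phi = atan2(b_s, b_c).
    """
    amplitude = math.hypot(beta_sin, beta_cos)
    phase = math.atan2(beta_sin, beta_cos) % (2 * math.pi)
    peak_week = phase * 52 / (2 * math.pi)
    return amplitude, peak_week


# Hypothetical coefficients chosen for illustration (not fitted values).
ALPHA = -1.0
BETA_YEAR = math.log(1.03)  # corresponds to a 3% rise in counts per year
BETA_SIN, BETA_COS = -0.5, -0.5

amplitude, peak_week = seasonal_amplitude_and_peak(BETA_SIN, BETA_COS)
yearly_ratio = (
    expected_weekly_cases(1, 10, ALPHA, BETA_YEAR, BETA_SIN, BETA_COS)
    / expected_weekly_cases(0, 10, ALPHA, BETA_YEAR, BETA_SIN, BETA_COS)
)
```

With these placeholder values the seasonal term peaks near week 32.5, and the ratio of expected counts between successive years equals the exponentiated year coefficient, which is how a coefficient on the log scale is read as an incidence rate ratio.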

Acute associations between environmental exposures and occurrence of cases were evaluated using a case-crossover approach (Janes et al., 2005; Levy et al., 2001), a study design analogous to the case-control design, but characterized by self-matching, such that exposures occurring before the “hazard period” (i.e., time period during which the case occurred) are compared with exposures occurring before control periods. We used a time-stratified, 2:1 matched case-crossover design, with hazard periods identified as the date of onset of symptoms of legionellosis. Random directionality of control period selection was used to limit bias resulting from time trends in the exposure series (e.g., seasonal trends) (Janes et al., 2005; Levy et al., 2001). Person-time at risk was divided into 3-week strata beginning on January 1, 1978. Control periods were the 2 days within each stratum that could be matched to the hazard period by day of the week and could precede or follow the hazard period, or both (Levy et al., 2001).
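The selection of control days under this time-stratified design can be sketched as below. This is an illustrative reading of the description above, not the authors' implementation; the function and constant names are hypothetical. Each 21-day stratum contains every weekday exactly three times, so a hazard day always has exactly two weekday-matched control days, which may fall before it, after it, or on both sides.

```python
from datetime import date, timedelta

ORIGIN = date(1978, 1, 1)  # first day of the first stratum (from the text)
STRATUM_DAYS = 21          # 3-week strata


def control_days(onset):
    """Return the two control days matched to `onset` by weekday and stratum."""
    offset = (onset - ORIGIN).days
    if offset < 0:
        raise ValueError("onset precedes the first stratum")
    # Start of the 3-week stratum that contains the onset date.
    stratum_start = ORIGIN + timedelta(days=(offset // STRATUM_DAYS) * STRATUM_DAYS)
    # Strata start on multiples of 21 days, so the weekday position
    # within the stratum is simply the offset modulo 7.
    position = offset % 7
    same_weekday = [stratum_start + timedelta(days=position + 7 * k) for k in range(3)]
    return [d for d in same_weekday if d != onset]


# A hazard day in the middle week of the first stratum has one control
# day before it and one after it.
example_controls = control_days(date(1978, 1, 10))
```

Fixing the strata in advance, rather than centring them on each case, is what lets the position of the hazard day within the stratum vary from case to case.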

Within each stratum, exposures were ranked by tertile (lowest, intermediate, or highest); indicator variables were created for each tertile. We calculated the odds ratios for the occurrence of cases, based on daily and aggregated lags of environmental and hydrological exposures in the highest tertile versus the bottom two tertiles, through the construction of conditional logistic regression models (Woodward, 2005; Hosmer and Lemeshow, 1989).
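A minimal sketch of the within-stratum tertile ranking follows. It is not the authors' code: the function names are hypothetical, and the handling of tied values is an assumption, since the text does not say how ties were treated. The resulting 0/1 indicator for the highest tertile is the kind of exposure variable that would then enter a conditional logistic regression model.

```python
def tertile_ranks(values):
    """Assign each value a within-stratum tertile: 1 (lowest) to 3 (highest).

    Tied values share the tertile implied by the number of strictly
    smaller values (an assumption; tie handling is not stated in the text).
    """
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for other in values if other < v)
        ranks.append(1 + (3 * below) // n)
    return ranks


def highest_tertile_indicator(values):
    """1 if the day falls in the highest tertile, else 0 (bottom two tertiles)."""
    return [1 if t == 3 else 0 for t in tertile_ranks(values)]


# Hypothetical 21 daily exposure values for one 3-week stratum.
stratum_values = list(range(21))
stratum_tertiles = tertile_ranks(stratum_values)
stratum_indicator = highest_tertile_indicator(stratum_values)
```

Ranking within the stratum, rather than across the whole series, means that a day counts as "high" only relative to the surrounding 3 weeks, so long-term and seasonal differences in exposure level do not enter the comparison.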

Dose–response relationships between exposures and outcomes were assessed by assigning within-stratum (within a 3-week period) quintile rank to environmental and hydrological exposure variables and incorporating the resulting exposure estimates into regression models as 5-level ordinal variables. Evidence for a linear dose–response relationship was assessed using the Wald test for linear trend and considered present based on P < 0.05 (Woodward, 2005). Effect modification was assessed through the introduction of multiplicative interaction terms into conditional logistic regression models, and was considered statistically significant if P < 0.05 for the interaction term coefficient. Analyses were performed using SAS® version 9.1 (SAS Institute, Cary, NC) and STATA version 9.0 (Stata Corporation, College Station, TX).

Results

Epidemiological Profile of Legionellosis in the GTA

A total of 837 cases of legionellosis were reported between 1978 and 2006. The majority of these cases were linked to the City of Toronto (77%), followed by Peel and Durham (both 8%), Halton (5%), and York (2%). A male predominance was observed (61% vs. 39%), and the incidence of legionellosis increased with increasing age, with most cases seen in the 75–84 age group (19% of all cases). Legionella species was identified in 478 cases; the most common pathogens were Legionella pneumophila serogroup 1 (55%), Legionella sainthelensi (8%), and Legionella pneumophila serogroup 6 (8%).

A documented date for onset of symptoms was available for 817 patients (98%). There were no significant differences in age (mean age, 59.1 vs. 64.7 years; P = 0.28) and sex (P = 0.65) between individuals with a documented date of onset of symptoms and those without. Individuals with missing dates of onset of symptoms were more commonly from the Toronto health unit than from non-Toronto health units (3 vs. 1%; P < 0.005). We restricted our estimates of legionellosis rates to the period between 1990 and 2006, due to changing health unit boundaries in Ontario as described earlier; annualized rates of case occurrence during that period are presented in Table 1.

Table 1 Case counts and annualized rates of legionellosis in the Greater Toronto Area, Ontario: 1990–2006

Seasonality, Temporal Trends, and Environmental Effects on Weekly Case Counts

The occurrence of legionellosis in the GTA shows a late summer or early autumn predominance. We found strong statistical evidence for seasonal oscillation in incidence (P < 0.001 for log-linear combination of model sine and cosine terms) and for increasing case counts over time (3% per year, P < 0.001; Figure 2). In univariable models, after controlling for seasonal oscillation, we found associations between increasing risk of legionellosis and humidity, pressure, and lake temperature. Rainfall and creek and river levels were inversely related to risk of case occurrence. However, in multivariable models, after controlling for seasonal oscillation, the association between disease risk and rainfall did not persist. We found a strong inverse association between river and creek levels and disease risk and an equally strong inverse relationship between lake temperature and disease risk. Weaker but statistically significant associations were observed between disease risk and humidity and barometric pressure (Table 2).

Figure 2

Trends in legionellosis in the Greater Toronto Area from 1978 to 2006. Bars represent actual numbers and the curve represents predicted number of cases by a multivariate negative binomial model, including sine and cosine seasonal smoother terms. Occurrence of cases is seasonal with a gradual increase in total cases over time (incidence rate ratio for each successive year, 1.03 [95% confidence interval, 1.02–1.04]). The y-axis scale is broken to permit inclusion of the large number of cases that occurred as part of an institutional legionellosis outbreak in autumn, 2005.

Table 2 Relationship between weekly incidence of legionellosis and environmental exposures in univariable and multivariable negative binomial models

Case-Crossover Analysis of Acute Environmental Effects

Case-crossover analyses, like negative binomial regression models, identified an increased risk of infection with lowest river and creek levels. Acute effects were seen 25–31 days before case occurrence (summary odds ratio (OR) for lowest tertile of water levels 3.55 (95% CI, 2.38–5.29); Figure 3). Risk was enhanced with acute decreases in lake temperature, using similar lags (OR for lowest tertile of temperature, 1.33 (95% CI, 1.08–1.64) with 25–28-day lag). We also identified a modest increase in risk with humidity (OR for highest tertile of humidity, 1.34 (95% CI, 1.14–1.57) with a 30–34-day lag). We detected no acute association between temperature, atmospheric pressure, or rainfall, and risk of case occurrence. Dose–response relationships were assessed by assigning within-stratum quintile ranks to environmental exposures. A strong negative dose–response relationship was found for river and creek level and lake temperature. A strong positive dose–response relationship was seen for humidity (Table 3).
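As a reading aid, the reported intervals can be checked for internal consistency. The sketch below assumes that each 95% confidence interval is a Wald interval, symmetric on the log-odds scale; that assumption and the function name are not from the source. Under it, the point estimate is the geometric mean of the interval limits, and the standard error of the log odds ratio follows from the interval width.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Point estimate and standard error implied by a 95% CI.

    Assumes a Wald interval that is symmetric on the log scale,
    i.e. exp(log(OR) +/- z * SE).
    """
    log_or = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.exp(log_or), se


# Interval limits as reported in the text above.
water_or, water_se = log_scale_summary(2.38, 5.29)        # low water levels
lake_or, lake_se = log_scale_summary(1.08, 1.64)          # low lake temperature
humidity_or, humidity_se = log_scale_summary(1.14, 1.57)  # high humidity
```

The implied point estimates agree with the reported odds ratios to within rounding, so the three intervals are consistent with symmetric log-scale intervals around the stated estimates.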

Figure 3

Low river and creek levels and risk of legionellosis in the Greater Toronto Area. Odds ratios (solid line) and 95% confidence intervals (dashed lines) for occurrence of cases are plotted on the y-axis, whereas lags before occurrence of cases are plotted on the x-axis. An increase in the risk of occurrence of cases is seen with the lowest tertile of river and creek levels at a 25–31-day lag.

Table 3 Case-crossover evaluation of dose–response relationship between environmental exposures and legionellosis risk

We assessed the possibility that environmental exposures had varying effects on the risk of occurrence of legionellosis among case patients with differing demographic characteristics. We observed no significant modification of the effects of environmental exposures by sex or age group.

Sensitivity Analysis

We were concerned that a large outbreak that occurred in autumn 2005 (39 cases in September and October) might have undue influence on observed associations (Gilmour et al., 2007). We performed restriction analyses, with these 39 outbreak-associated cases excluded from models, and found no important changes in effects estimated with negative binomial or case-crossover approaches.

We were concerned that the biological mechanism underlying the lag between changes in environmental conditions and increases in legionellosis risk was not immediately apparent. As such we were concerned that observed watershed effects might be statistical artifacts. To evaluate this possibility, we performed supplementary analyses by using data derived from Hamilton, Ontario, and Philadelphia, Pennsylvania, using legionellosis case dates and river-flow data from Stoney Creek and the Schuylkill River, respectively. We identified qualitatively similar associations between low river flows and legionellosis risk at 25–30-day lags as had been seen in Toronto (Figure 4).

Figure 4

Low river and creek levels and risk of legionellosis in Toronto, Hamilton, and Philadelphia. Odds ratios for occurrence of cases are plotted on the y-axis, whereas lags before occurrence of cases are plotted on the x-axis. An increase in the risk of occurrence of cases is seen with the lowest tertile of river and creek levels at a 25–31-day lag in Toronto (thick solid line); similar (and statistically significant) effects are seen with Stoney Creek in Hamilton, Ontario (dashed line) and with the Schuylkill River in Philadelphia (thin solid line).

Discussion

We evaluated seasonal and environmental patterns in legionellosis case occurrence in Toronto, Canada, during a period of several decades. We found the disease to exhibit late summer or early autumn seasonality in occurrence, as noted elsewhere in North America (Fisman et al., 2005; Straus et al., 1996; Marston et al., 1994). However, we identified hydrological changes in the local watershed, rather than weather, as the strongest contributors to legionellosis risk in this large metropolitan area. Our findings provide additional insights into the importance of the physical environment in determining patterns of legionellosis occurrence.

We analyzed data collected by the Ontario Provincial Laboratory in the GTA during a 28-year period, using both traditional regression methods, and a novel case-crossover methodology. These alternate methods permit identification of environmental effects on legionellosis occurrence on different time scales (weekly vs. day-to-day). We found that, although legionellosis occurred predominantly during warmer months, both average weekly case occurrence and acute changes in legionellosis risk were most strongly associated with changes in the hydrology of the Toronto watershed. In particular, both methodological approaches identified marked increases in legionellosis risk with decreases in local river and creek flows. The lag in effects identified using a case-crossover approach (3–4 weeks) likely represents the time taken for water from the rivers and creeks to flow into Lake Ontario (the source of drinking water for most GTA residents), and from Lake Ontario back into the GTA water distribution system.

The idea that environmental factors influence legionellosis risk through drinking water quality is consistent with existing models of Legionella ecology (Best et al., 1983; Stout et al., 1992). Yu has long argued that most sporadic cases of legionellosis, which constitute the overwhelming majority of cases included in our database, occur when susceptible hosts drink contaminated water, with Legionella sp. introduced into the lung via microaspiration (Yu, 2000). The apparent protection against legionellosis afforded to hospitalized patients through chloramination (rather than chlorination) of water supplies supports such a model (Kool et al., 1999).

Water distribution systems and contaminated water supplies are recognized as important reservoirs of Legionella species (Colbourne and Trew, 1986; Riffard et al., 2001; Stout et al., 1985). Colonization of water systems by Legionella is dependent on a combination of sediment accumulation, commensal microflora, and optimal water temperatures (Yu, 2000; Borella et al., 2005). It is plausible that when river and creek levels are low, higher concentrations of sediment and commensal microflora are present in the water and may stimulate growth, survival, and replication of L. pneumophila in treated or untreated water supplies. Legionella has been shown to enter water distribution systems in small numbers and survive for long periods within biofilms, even under conditions of high-shear turbulent flow and chlorination (Zanetti et al., 2000; States et al., 1987). Furthermore, a recent report suggests that local water distribution systems in Ontario are in poor repair, allowing direct communication between treated water and the surrounding soil and groundwater (Barkwell, 2007). Legionella species are ubiquitous in groundwater, and direct inoculation into treated water through such a mechanism is possible (Riffard et al., 2001). However, we acknowledge that such mechanisms are currently speculative, and additional fieldwork is necessary to evaluate whether changes in river levels can be linked physically to changes in Legionella ecology or sediment burden in relevant water sources. Nonetheless, the fact that we were able to identify similar effects in two additional cities suggests that these findings are unlikely to represent a statistical artifact.

We found that other environmental factors, including relative humidity and decreased lake temperature, enhanced the risk of legionellosis. The association with elevated humidity is consistent with previous reports, although the magnitude of this effect seems smaller, and in case-crossover analyses, occurred with a longer lag, than in previous evaluation of legionellosis and environmental factors in Philadelphia and Spain (Fisman et al., 2005; Sala et al., 2007).

Given the thermophilic nature of L. pneumophila, the augmentation of risk in the face of acute lake cooling may at first appear counterintuitive. However, Legionella species have evolved to survive aqueous environments at lower temperature (Yamamoto et al., 2003; Kusnetsov et al., 2003). Lake cooling may influence legionellosis risk via increased lake circulation. Lake Ontario has little vertical circulation of waters during summer months, when the lake surface forms a warm “cap” above cooler waters below. When this cap cools, surface waters sink, creating vertical mixing currents (the so-called “autumn overturn”) (Thomas et al., 1996). If these currents mobilize lake sediments, legionellosis risk could be enhanced through the mechanisms described earlier.

The linkage between environmental factors and legionellosis risk identified earlier provides further evidence that environmental factors can be important determinants of infectious disease risk, in the developed and the developing world. We identified qualitatively similar effects using two very distinct methodological approaches; our inclusion of seasonal and annual smoothers in multivariable negative binomial models makes it unlikely that the effects identified are confounded by other seasonal or longitudinally varying factors, whether environmental (e.g., temperature), behavioral (e.g., increased propensity to swim or shower in summer months), or diagnostic (more sensitive testing in recent years). The case-crossover design is even more resistant to confounding by seasonal or multiyear trends, due to the employment of self-matching, with control periods drawn from the same 3-week period as cases. Case-crossover design also eliminates the need to aggregate cases, which eliminates the risk of “ecological fallacy” in identifying causal associations.

Like any observational study, ours is subject to several limitations. We used public health laboratory data, which can be incomplete due to failure to isolate Legionella species from clinical specimens or due to failure of healthcare providers to consider this diagnosis and submit specimens for testing. However, undercounting of legionellosis cases would bias our analysis only if weather patterns influenced the likelihood of diagnosis or test performance, which seems unlikely. A second limitation to the study is the use of aggregate or average environmental exposure data, which may not accurately represent the exposures experienced by individual cases. However, the direction of such misclassification is likely to be random, which would actually weaken observed associations, suggesting that our estimates of effect may represent lower bounds. Finally, it can be argued that clustering of cases in association with outbreaks might have resulted in misidentification of coincidental weather patterns as risk factors for disease. However, only one known multicase institutional outbreak was included in our database, and exclusion of outbreak cases did not change our results.

Conclusions

We found a trend toward increasing legionellosis in the GTA, with a late summer–early fall seasonality. Whereas other studies have linked legionellosis risk to meteorological conditions (Fisman et al., 2005; Sala et al., 2007; Hicks et al., 2007; Marston et al., 1994), this is the first study to identify local hydrological influences as equally or more influential in case occurrence, at least in this geographic locale. Although the methodological approach used makes it unlikely that these associations are confounded by other temporally varying factors, our proposed mechanisms for these effects (e.g., direct contamination of water sources or introduction of micronutrients or commensal organisms into water distribution systems) are speculative, and further work is needed to evaluate the environmental correlates of the effects we observe. We believe that this study justifies the generation and testing of hypotheses regarding the influence of hydrological and meteorological factors on fluctuation of Legionella species in both source waters and treated water sources. Nonetheless, our findings are consistent with the model of drinking water as central to the occurrence of sporadic legionellosis. More broadly, these findings highlight the importance of the physical environment in the genesis of infectious diseases in the developed world, and underline the relevance of environmental protection and global environmental change to communicable disease control efforts.