Introduction

Many studies have examined spatial relationships between water quality patterns and landscape or anthropogenic factors, concluding that the ability of landscape metrics to explain water quality depends largely on which spatial scale is used (Mainali et al., 2019). Mainali and Chang (2018) found that a 100-m scale and 1-km upstream scale best explained variations in water quality in a large river basin, while Shi et al. (2017) found varying abilities of catchment, riparian, and reach scales to explain degraded water quality. However, relatively few studies have examined relationships between water quality and landscape variables at multiple microscales (smaller than a 500-m radius buffer) within an urbanized region. Given that the urban landscape is spatially heterogeneous (Cadenasso et al., 2007), water quality can exhibit large spatial and temporal variation within a city (and even within a neighborhood). Thus, it is important to understand what microscale landscape factors are associated with the variations (Sliva & Dudley Williams, 2001).

Water quality is in part determined by the presence of physical pollutants, both aqueous and particulate (Lintern et al., 2018). Escherichia coli (E. coli) is a fecal coliform that inhabits the intestinal tract of animals and humans and commonly contaminates water sources in areas of high population density, thus posing significant public health risks in urbanized environments (Jang et al., 2017). A 2018 evaluation by the City of Portland Bureau of Environmental Services concluded that E. coli is the main pollutant that exceeds water quality standards in Portland streams and rivers, with the highest recorded concentrations occurring in the summer and during storms. This report contrasts with McKee et al. (2020), whose study of recreational areas and the surrounding watershed in Atlanta, Georgia, found that E. coli concentrations were highest during the winter. Spatial differences were also observed for E. coli concentrations in Portland; concentrations were found to be “significantly lower in the Willamette Streams and Columbia Slough” than in most other watersheds sampled in the Portland area (Fish & Jordan, 2018). Accounting for “land use and stormwater management policies” helps to explain variations in fecal coliform levels at a multi-watershed scale in North Carolina (Vitro et al., 2017).

Phosphorus and nitrogen are organic nutrients that occur naturally in vegetation and soil, but excess amounts in water bodies can lead to eutrophication and subsequent water body impairment, among other ecosystem problems (Smith et al., 1999). Although phosphorus and nitrogen excesses commonly result from agricultural runoff, they are also important pollutants in urban environments (Billen & Garnier, 1997; Sonoda et al., 2001; Withers et al., 2014; Yu et al., 2012). For instance, urbanized watersheds in St. Paul, Minnesota, were found to experience major pollution from household nitrogen and phosphorus runoff (Hobbie et al., 2017). Furthermore, multiple studies have found that a lack of street sweeping along tree-lined streets in urbanized areas greatly increases nitrogen and phosphorus loads in stormwater runoff (Taguchi et al., 2021). However, no previous studies have examined the spatial variations in nutrient concentrations in relation to various microscale landscape factors with spatially intensive monitoring data.

This study examines relationships between water quality, anthropogenic and landscape factors, and seasonality at a microscale in Portland, Oregon, using a unique set of monitoring data. We address the following research questions:

  1. How do selected water quality parameter concentrations vary spatially between the wet and dry seasons? We expect that E. coli concentrations are likely to be highest in developed (including open and recreational) areas during the dry season, heavy metal concentrations are higher in the wet season for areas in close proximity to roads, and total suspended solids (TSS) concentrations are likely to be greatest in steep areas with high foot traffic.

  2. Which landscape variables explain spatial variations in water quality in the wet season? We expect that areas with larger percentages of sandy clay loam soil are likely to be negatively associated with pollutant concentrations and that total storm pipe length would be positively associated with pollutant concentrations. We anticipate that land cover variables such as imperviousness and road density are likely to be important explanatory variables of water quality in accordance with previous literature.

Materials and methods

Study area

This study was conducted in the metropolitan area of Portland, Oregon, a city that has recently undergone accelerated population growth and urbanization (Goodling et al., 2015; Jun, 2004). The region’s climate consists of relatively dry and warm summers and wet, cool winters. Average annual precipitation and temperature are approximately 965 mm and 12 °C, respectively (Chang, 2007; Cooley & Chang, 2017). Historical climate and future climate projections show increasing winter precipitation intensities with rising air temperatures (Cooley & Chang, 2021), likely to result in increased surface runoff, which can potentially decrease water quality.

Local soil types vary widely in texture between clay, silt, silt/loam, and gravel, creating a range of particle sizes that affect water infiltration rates (Baker et al., 2019). Most of Portland is in low-lying foothills situated between the Columbia and Willamette Rivers (O’Donnell et al., 2020). Forest Park, a largely undeveloped, slightly higher-elevation conservation area popular with hikers and bicyclists, comprises much of the western side of the study area. The Columbia Slough, a flat, low-elevation, slow-moving water body, comprises the northern side of the study area (Fig. 1). Many small urban streams have been heavily modified by human activities, resulting in rerouted, straightened, or buried streams (Post et al., 2022). Previous studies have found significant seasonal and spatial variability in water quality for Portland’s water bodies, including the Columbia and Buffalo Sloughs (Fish & Jordan, 2018; McCarthy, 2006). Zinc concentrations generally increase downstream of Johnson Creek, while lead concentrations do not show clear spatial patterns (Chang et al., 2019). While streams in Forest Park, Tryon Creek, and Johnson Creek are flashy and fast-moving, flow in Columbia Slough is stagnant or slow-moving due to flat topography and wetlands.

Fig. 1

Distribution of PAWMAP water quality station locations around the City of Portland used in the study. The lack of streams in the center of the study area reflects the outcome of removing, piping, and burying streams in the mid-twentieth-century urban development plans (Post et al., 2022)

Data

Water quality data were obtained from the City of Portland Bureau of Environmental Services’ Portland Area Watershed Monitoring and Assessment Program (PAWMAP) (City of Portland, 2019). The data originated from 128 water quality monitoring stations located on the outskirts of the City of Portland (Fig. 1), situated within the Willamette River, Columbia Slough, Johnson Creek, and Balch Creek watersheds. Pollutant concentration data were collected according to the protocol developed by the United States Environmental Protection Agency (USEPA) through the Environmental Monitoring and Assessment Program (City of Portland, 2019; USEPA, 2019). Samples were analyzed in the City of Portland’s water chemistry laboratory following the standard USEPA methods for lead (EPA 200.8), zinc (EPA 200.8), nitrate (EPA 300.0), and orthophosphate (EPA 365.1). The Standard Total Coliform Membrane Filter Procedure (SM9222G) was used for E. coli, and standard method SM2540D (total suspended solids dried at 103–105 °C) was used for total suspended solids (Supplementary Table 2A).

Generally, water quality measurements were taken for at least one monitoring station at least once a month by the City of Portland from July 2015 through May 2021. The PAWMAP program routinely rotates active stations, which include 20 perennial and 12 intermittent stations; as such, the completeness of data varied, with some station records containing data for multiple years, and others for less than 1 year (City of Portland, 2019). Thus, we had to take into account the possibility of interannual variation when analyzing means for each station. Furthermore, no station data were documented from March through most of May of 2020, likely due to the onset of the COVID-19 pandemic in the USA in March 2020, which temporarily impeded field work (Oregon Department of Transportation, 2021).

Six water quality parameters representing physical, chemical, and biological importance were selected for this study: E. coli (MPN/100 mL), lead (µg/L), nitrate (mg/L), orthophosphate (mg/L), total suspended solids (TSS) (mg/L), and zinc (µg/L). Nitrate and orthophosphate were chosen because they were reported more frequently in the dataset compared to other measures of nitrogen and phosphorus. Most but not all data entries reported consistent detection limits for each pollutant; thus, majority detection limits are reported in Figs. 3 and 4 and Supplementary Table 2A, B. The data available to us measured E. coli directly as opposed to fecal coliform levels as a proxy, providing an uncommon opportunity to measure a water pollutant of direct relevance to human health (Vitro et al., 2017).

Explanatory spatial variables

Explanatory variables were chosen based on hypothesized relationships with water quality (Table 1). Using ESRI ArcGIS Desktop 10.8, we initially defined a circular buffer area of 100 m in diameter around each water quality station to derive explanatory variables (Table 1) (ESRI, 2021). We chose the 100-m distance to avoid spatial overlap in buffer area between stations that are in close proximity to one another, and the circular buffer area was deemed adequate because of the relatively flat, urban land cover of the areas surrounding the water quality stations. However, some explanatory variables, road length and pipe length, became insignificant at the 100-m scale. Therefore, we introduced a 250-m-diameter circular buffer scale, with the added benefit of allowing for a multiscalar analysis at the microscale by comparing the 100-m scale to the 250-m scale (Fig. 2).

Table 1 Landscape characteristics selected as potential explanatory variables and summarized literature review of variable relationships with water quality
Fig. 2

Microscale delineation at the 100-m and 250-m scale around each water quality station through which explanatory variable metrics were calculated. Background demonstrates streams and 30-m resolution NLCD land cover raster data

We calculated all candidate explanatory variable measurements for each water quality station at the 250-m and 100-m buffer scales (pipe length was ultimately calculated only at the 250-m scale because of the lack of pipe presence at the 100-m scale). While it is more appropriate to use catchment scale in more natural settings, since our study region’s flow paths are heavily modified by anthropogenic activities such as storm pipes and disappeared streams (Post et al., 2022), we used circular buffers for our analysis. Also, PAWMAP sampling sites are found within areas with heterogeneous land cover, allowing for substantial variation in explanatory variable measurements. Strahler stream order was calculated using ArcGIS Hydrology tools in the Spatial Analyst toolkit (Horton, 1945; Strahler, 1952).

For land cover variables, we defined “developed” to be the total percentage of pixels classified as “Developed” by the NLCD land cover classification system, which included four categories of varying development intensities (i.e., amounts of impervious surface) (Dewitz & U.S. Geological Survey, 2021). Despite correlation between imperviousness and developed land cover types, we included both variables as candidate predictors because developed land encompassed a wide range of urban land use types. As shown in Supplementary Table 1, developed land areas include much of open and low-density developed areas, which can potentially function as a sink of pollutants as well as sources.
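
The "developed" definition above can be illustrated with a minimal sketch. It assumes the NLCD pixels falling inside a station's buffer have already been extracted as a list of class codes; the function name and the example buffer are hypothetical, and this is not the ArcGIS workflow used in the study.

```python
from collections import Counter

# NLCD class codes for the four "Developed" categories
# (21 = open space, 22 = low, 23 = medium, 24 = high intensity).
DEVELOPED_CODES = {21, 22, 23, 24}


def percent_developed(pixel_codes):
    """Return the percentage of buffer pixels in any NLCD Developed class."""
    counts = Counter(pixel_codes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("buffer contains no pixels")
    developed = sum(n for code, n in counts.items() if code in DEVELOPED_CODES)
    return 100.0 * developed / total


# Hypothetical buffer: 4 open-space, 2 low-intensity, 3 deciduous forest (41)
# and 1 open water (11) pixels.
example_buffer = [21] * 4 + [22] * 2 + [41] * 3 + [11]
print(percent_developed(example_buffer))  # 60.0
```

Because open space (code 21) counts toward the total, a buffer dominated by parkland can score high on "developed" while scoring low on imperviousness, which is why both variables were retained as candidates.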

We defined wet season measurements as any data recorded in October through April, and dry season measurements as any data recorded in May through September, considering rainfall distribution in the study region (Chang et al., 2021). Mean pollutant concentrations for each station were calculated after we evaluated relative amounts of interannual variation in pollutant concentration for each station, which creates some inherent noise in our analysis (Supplementary Table 2). We only considered stations with at least three measurements taken during the study period, which consisted of 36 stations in the dry season and 128 stations in the wet season. Most of the eligible stations exhibited pollutant concentrations that were consistently high or low (i.e., standard deviation less than the mean for each pollutant for each station). Furthermore, most of the mean concentrations for each of these stations were above the detection limit for that pollutant. Only dry and wet season measurements for the 36 stations with at least three measurements for both seasons were analyzed for spatial variation in pollutants, excluding the other 92 eligible wet season stations for the purpose of direct seasonal comparison (Fig. 3a, b).
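
The seasonal split and the three-measurement eligibility rule can be sketched as follows. This is an illustrative reconstruction only: the record layout (station, date, concentration) and all function names are assumptions, and the minimum count is applied per station and season.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WET_MONTHS = {10, 11, 12, 1, 2, 3, 4}  # October through April


def season_of(sample_date):
    """Classify a sampling date as 'wet' (Oct-Apr) or 'dry' (May-Sep)."""
    return "wet" if sample_date.month in WET_MONTHS else "dry"


def seasonal_station_means(records, min_n=3):
    """Mean concentration per (station, season), keeping groups with >= min_n samples.

    records: iterable of (station_id, sample_date, concentration) tuples.
    """
    groups = defaultdict(list)
    for station, sample_date, conc in records:
        groups[(station, season_of(sample_date))].append(conc)
    return {key: mean(vals) for key, vals in groups.items() if len(vals) >= min_n}


def stations_in_both_seasons(means):
    """Stations with an eligible mean in both seasons, for direct comparison."""
    wet = {station for station, season in means if season == "wet"}
    dry = {station for station, season in means if season == "dry"}
    return wet & dry


# Hypothetical records for two stations.
records = [
    ("A", date(2016, 1, 5), 10.0), ("A", date(2016, 2, 5), 20.0),
    ("A", date(2016, 11, 5), 30.0), ("A", date(2016, 7, 5), 5.0),
    ("B", date(2017, 6, 1), 1.0), ("B", date(2017, 7, 1), 2.0),
    ("B", date(2017, 8, 1), 3.0),
]
means = seasonal_station_means(records)
print(means)
print(stations_in_both_seasons(means))
```

In this toy input, station A qualifies only in the wet season and station B only in the dry season, so neither would enter a paired seasonal comparison.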

Fig. 3

a Relative proportions of mean E. coli, lead, and nitrate concentrations for each water quality station with background NLCD Land Cover Classification (National Land Cover Database 2019 | NLCD, 2019 Legend, n.d.). Larger circles correspond to higher mean concentrations. DL, detection limit (majority value). b Relative proportions of mean orthophosphate, TSS, and zinc concentrations for each water quality station with background NLCD land cover classification (Dewitz & U.S. Geological Survey, 2021). Larger circles correspond to higher mean concentrations. DL, detection limit (majority value)

Statistical analysis

We used R version 4.1 (RStudio Team, 2021) to observe the distribution shape of pollutant concentrations across stations and produce pairwise correlation coefficients for explanatory and dependent variables. We tested the correlation between explanatory and dependent variables at the 95% confidence level. We used the Spearman rank correlation analysis for all correlation tests, to account for possible non-linear trends in water quality measurements (Shrestha & Kazama, 2007). We then generated heatmaps for each season at the 100-m and 250-m scales for visual comparison.
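
For readers unfamiliar with the statistic, the Spearman coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration rather than the R code used in the study; the function names and example values are hypothetical, and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank variance is zero; correlation undefined")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: percent imperviousness vs. mean E. coli at five stations.
imperviousness = [5, 20, 35, 60, 80]
ecoli = [12, 30, 25, 110, 400]
print(round(spearman_rho(imperviousness, ecoli), 3))  # 0.9
```

Because only the ordering of values matters, the very large E. coli value at the last station does not dominate the coefficient as it would in a Pearson correlation on raw concentrations.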

Because of the lack of data in the dry season, we only performed regression analysis for measurements taken in the wet season. Upon observing that the concentrations for E. coli, lead, nitrate, orthophosphate, TSS, and zinc were positively skewed, we applied the transformation log10(concentration + 1) to the original data when performing regression analysis. We introduced multiple linear regression to evaluate the influence of multiple landscape factors on each pollutant. To rule out autocorrelated explanatory variables when determining the model that best explains variations in pollutant concentrations, we employed the Exploratory Regression Tool in ArcMap. This geoprocessing tool takes a shapefile input and applies the Global Moran’s I spatial autocorrelation test to models that fit user-specified criteria (e.g., minimum R² value and minimum Jarque–Bera p-value) to produce candidate ordinary least squares (OLS) models for analysis. For this preliminary step, we used the k-nearest neighbors approach with k = 8, the default value, to calculate spatial weights for Global Moran’s I. We recorded the “best model” for each pollutant in the wet season based on highest R², lowest Akaike information criterion (AIC), and variance inflation factor (VIF) value less than 10 (Kutner et al., 2004).
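
The transformation and the autocorrelation statistic named above can be sketched in plain Python. This is an illustration under stated assumptions, not the ArcMap tool: weights are row-standardized k-nearest-neighbor weights, Moran's I is computed directly on a list of values (the tool applies it to model residuals), no significance test is included, and all function names and coordinates are hypothetical. The toy example uses k = 1 because it has only four points; the study used k = 8.

```python
from math import dist, log10


def log_transform(concentrations):
    """Apply log10(c + 1), the transformation used for positively skewed data."""
    return [log10(c + 1) for c in concentrations]


def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbor weights as {i: {j: w_ij}}."""
    if not 0 < k < len(coords):
        raise ValueError("k must be between 1 and n - 1")
    weights = {}
    for i, p in enumerate(coords):
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: dist(p, coords[j]))
        weights[i] = {j: 1.0 / k for j in others[:k]}
    return weights


def morans_i(values, weights):
    """Global Moran's I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i ** 2)."""
    n = len(values)
    m = sum(values) / n
    z = [v - m for v in values]
    s0 = sum(w for row in weights.values() for w in row.values())
    num = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    den = sum(zi * zi for zi in z)
    if den == 0:
        raise ValueError("values are constant; Moran's I is undefined")
    return (n / s0) * num / den


# Hypothetical stations: two close pairs with similar values inside each pair.
coords = [(0, 0), (1, 0), (5, 0), (6, 0)]
values = [1, 2, 10, 11]
w = knn_weights(coords, k=1)
print(round(morans_i(values, w), 3))  # 0.976
print(log_transform([0, 9, 99]))
```

Values near 1 indicate that neighboring stations have similar values, which is the situation in which OLS assumptions are strained and spatial lag or spatial error models become relevant.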

We created a weights matrix for the wet season measurements (n = 128) in GeoDa using the distance band method and the software’s default bandwidth value. We input the best OLS model detected by exploratory regression into GeoDa 1.18.10’s Regression tool, running the tool twice more to incorporate the weights matrix for the spatial lag and spatial error models (Matthews, 2006). From the results output, we formatted the variable coefficients into multiple linear regression equations (Table 2).
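
A distance-band weights matrix of the kind described above can be sketched as follows. The default threshold here is an assumption: it is taken to be the largest nearest-neighbor distance, so that every station has at least one neighbor, which may differ from the value GeoDa actually applied. Function names and coordinates are hypothetical.

```python
from math import dist


def max_min_threshold(coords):
    """Smallest band width that gives every point at least one neighbor."""
    return max(
        min(dist(p, q) for j, q in enumerate(coords) if j != i)
        for i, p in enumerate(coords)
    )


def distance_band_weights(coords, threshold=None):
    """Neighbors within `threshold`, row-standardized so each row sums to 1."""
    if len(coords) < 2:
        raise ValueError("need at least two points")
    if threshold is None:
        threshold = max_min_threshold(coords)  # assumed default, see lead-in
    weights = {}
    for i, p in enumerate(coords):
        nbrs = [j for j, q in enumerate(coords) if j != i and dist(p, q) <= threshold]
        weights[i] = {j: 1.0 / len(nbrs) for j in nbrs}
    return weights


def spatial_lag(values, weights):
    """Weighted average of neighboring values (the Wy term of a spatial lag model)."""
    return [sum(w * values[j] for j, w in weights[i].items())
            for i in range(len(values))]


# Hypothetical station coordinates along a line.
coords = [(0, 0), (1, 0), (3, 0)]
w = distance_band_weights(coords)
print(max_min_threshold(coords))      # 2.0
print(spatial_lag([10, 20, 40], w))   # [20.0, 25.0, 20.0]
```

Unlike the k-nearest-neighbor weights used in the exploratory step, a distance band gives stations in dense clusters more neighbors than isolated stations, so the two schemes can yield different autocorrelation estimates for the same data.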

Table 2 Ordinary least squares and spatial regression results for each pollutant in the wet season

Results

Spatial variations of pollutants

There were clear seasonal differences in mean pollutant concentrations across different regions in the study area when averaged across all measurements for each season. In both the wet and dry seasons, mean E. coli concentrations tended to be higher in Portland’s southern metropolitan area. However, the area around the Columbia Slough demonstrated higher E. coli concentrations in the dry season than in the wet season (Fig. 3a).

Lead concentrations tended to be high in the middle of the study area close to Interstate Highways 5 and 405, but more stations overall exhibited higher concentrations in the wet season than in the dry season (Fig. 3a). Similar to lead, overall mean zinc concentrations were higher in the wet season, with the southern study area exhibiting the highest concentrations in both seasons (Fig. 3b).

Mean nitrate concentrations were consistently higher by the Columbia Slough and a small developed area directly east of the Willamette River in both seasons, although overall concentrations were higher in the wet season (Fig. 3a). Mean orthophosphate concentrations were highest in the dry season and were consistently high in the same area around the south part of the Willamette River in both seasons (Fig. 3b). There did not appear to be clear seasonal variation in TSS concentrations at the scale of the study area, although certain areas were consistently high in both seasons (Fig. 3b).

Correlation analysis

More variables were significantly correlated in the wet season than in the dry season. In the wet season at both scales, E. coli, followed by zinc, was associated with the highest number of explanatory variables at the 95% confidence level (Fig. 4a, c). The strongest correlations in the wet season occurred between E. coli and percent developed (+), percent forested (−), and percent imperviousness (+) at both scales. Zinc was correlated most strongly with pipe length (+), road length (+), and percent developed (+). Orthophosphate was most strongly correlated with pipe length (+) and mean elevation (−) at the 250-m scale in the wet season. Pipe length was positively associated with all dependent variables in the wet season except nitrate, which showed negative correlation, and TSS, which showed no significant correlation. Lead was only significantly correlated with road length and pipe length in the wet season, the latter only at the 250-m scale. Nitrate and TSS were barely correlated with any candidate predictors in the wet season.

Somewhat different explanatory variables were correlated with water quality parameters in the dry season than in the wet season. E. coli continued to demonstrate the highest number of significant associations with explanatory variables, while lead was negatively associated with mean slope and mean elevation, but the latter only at the 250-m scale (Fig. 4b, d). Orthophosphate was positively associated with percent forested and negatively associated with percent imperviousness at the 250-m scale. Pipe length and road length were less significant overall in the dry season than in the wet season, although zinc was still positively associated with pipe length. There were no significant correlations between explanatory variables and nitrate or TSS in the dry season.

Correlations between pollutants and explanatory variables did not necessarily increase at the 250-m scale compared to the 100-m scale. Slope and elevation measures became more significant at the 250-m scale, especially in the dry season. In the dry season, nitrate and TSS were significantly correlated with more variables at the 250-m scale than at the 100-m scale. In the wet season, orthophosphate was significantly correlated with more variables at the 250-m scale than at the 100-m scale.

Fig. 4

a-d Spearman rank correlation coefficient heatmaps for the 100-meter and 250-meter scales in the wet (n = 128) and dry seasons (n = 36). Correlation values significant at the 95% confidence level are shown in black while insignificant values are in gray. OP = orthophosphate; TSS = total suspended solids.

Exploratory regression analysis

The model with the highest R² value was produced for E. coli (Table 2). As demonstrated through the importance of spatial weights terms and improvements in R² values and reductions in AIC values for the spatial lag/spatial error models, E. coli concentrations exhibited a relatively high amount of spatial autocorrelation in the wet season (Table 2). Significant explanatory variables in the wet season were percent developed (250 m) (+), standard deviation in slope (250 m) (+), mean slope (250 m) (−), stream order (+), and percent soil group C (100 m) (−) (Table 2). The E. coli model was the only model to include stream order as a significant explanatory variable.

Models for lead exhibited relatively high spatial autocorrelation in the wet season (Table 2). Pipe length (+) was the most significant predictor of lead, followed by standard deviation in slope (250 m) (+), mean slope (250 m) (−), and mean elevation (100 m) (+) (Table 2). R² values were relatively low, indicating that most of the variation in lead concentration between water quality stations could not be explained using the chosen predictors.

Nitrate exhibited slightly lower spatial autocorrelation than lead, although like lead, models had relatively low R² values (Table 2). All selected explanatory variables were significant at the 0.05 level for all models in both the wet and dry seasons. Significant explanatory variables were percent developed (250 m) (−) and percent imperviousness (100 m) (+) (Table 2).

Orthophosphate demonstrated strong spatial autocorrelation, indicated by leading spatial terms, significant decreases in AIC values, and large increases in R² values for the spatial lag/spatial error models (Table 2). Significant explanatory variables for the spatial lag and spatial error models at the 0.05 level were mean elevation (250 m) (−) and standard deviation in elevation (100 m) (+) (Table 2). Percent soil group C (100 m), standard deviation in elevation (250 m), and percent developed (250 m) became insignificant when the spatial models were applied, suggesting that these variables are highly spatially autocorrelated.

TSS models had low R² values compared to models for other pollutants, but all selected explanatory variables were significant at the 95% confidence level and spatial autocorrelation was low (Table 2). Significant explanatory variables for spatial lag/spatial error models at the 95% confidence level were standard deviation in slope (100 m) (−), pipe length (+), and percent imperviousness (250 m) (−) (Table 2). Standard deviation in slope at the 250-m scale (+) was borderline significant for the spatial lag/spatial error models.

Models for zinc exhibited relatively strong spatial autocorrelation, with leading spatial terms, decreases in AIC values, and large increases in R² values for the spatial lag and spatial error models (Table 2). Six explanatory variables best modeled zinc, more than for all other pollutants. Significant explanatory variables were percent developed (250 m) (+), percent imperviousness (100 m) (−), standard deviation in slope (250 m) (+), mean slope (100 m) (−), and pipe length (+), the last of which bordered on insignificant for the spatial regression models. Percent soil group C became insignificant when spatial models were applied, suggesting that there is significant spatial autocorrelation attributable to this variable.

Discussion

Spatial and seasonal variation in water quality

E. coli

As averaged for each water quality station over the study period, E. coli concentrations were highest in the southern portion of the study area, which encompasses the Tryon Creek State Natural Area in addition to a number of Portland suburban neighborhoods, yet were comparatively low in many parts of the northwestern portion, which includes Forest Park, a natural area frequented by hikers and their pets (Fig. 3a, b). As such, E. coli contamination appears to be heterogeneous even across recreational areas within the same geographic locale, complicating our hypothesis that recreational areas in general will experience higher E. coli contamination than non-recreational areas. This unexpected pattern may result in part from the significance of slope variables and stream order in wet season E. coli models (Table 2). The negative association with mean slope and positive association with stream order in the wet season indicate that E. coli organisms tend to proliferate most in high-order, low-elevation streams during seasonal periods of increased streamflow. Positive associations with standard deviation of slope might relate to the formation of puddles that form in the wet season for areas with more irregular inclines and facilitate E. coli survival. E. coli exhibited higher concentrations in the dry season, which previous research has suggested is related to warmer summer temperatures that enable growth (Chen & Chang, 2014). However, a recent study of southern Oregon wetlands found that E. coli concentrations were much more associated with livestock grazing than with seasonality, which calls for an examination of whether increased outdoor recreation and animal activity in the summer months as opposed to inherent seasonal climatic variation predominantly influence seasonal variation in E. coli within Portland urban and suburban areas (Smalling et al., 2021).

Heavy metals

As hypothesized, both lead and zinc were positively associated with road length and pipe length, negatively associated with mean slope, and positively associated with standard deviation in slope, complementing a recent City of Portland report that heavy metal concentrations were correlated with each other in Portland area watersheds (Fish & Jordan, 2018). However, R² values were relatively low, particularly for lead, raising further questions about the differences in landscape and anthropogenic factors that contribute to lead as opposed to zinc contamination in the study area (Ramirez et al., 2022). The same report also noted that heavy metals were correlated with total suspended solid concentrations. Pipe length and percent imperviousness are significant predictors for both lead and TSS in the wet season, similar to the findings of previous work that focused on Johnson Creek (Chang et al., 2019).

Despite previous literature tracing zinc to the deterioration of asphalt, car tires, and brake pads, road length was not included in models for zinc (Sörme & Lagerkvist, 2002). Other sources of zinc, such as industrial operations and galvanized building materials, have been found to be significant contributions to storm runoff and thus should be investigated as potential influencing factors in future studies (Brown & Peake, 2006; Sörme & Lagerkvist, 2002). These sources of zinc would occur in highly developed areas with anthropogenic activities, but not necessarily around roads and other areas of high impervious surface.

Nutrients

Developed area and impervious surface best explained variations in nitrate for both seasons, even as developed area was negatively associated and impervious surface positively associated with nitrate. This unexpected relationship with percent imperviousness and percent developed may result from the disproportionate placement of water quality monitoring stations in or near parks and other urban green spaces, which the NLCD land cover classification system nevertheless designates as “Developed-Open Space” for areas with less than 20% impervious surface (Dewitz & U.S. Geological Survey, 2021) (Supplementary Table 1). Low-intensity developed areas include open spaces, which may serve as nitrogen sinks with a buffering effect. At the same time, impervious surfaces such as urban and suburban roads and sidewalks facilitate increased nitrogen runoff despite lower densities of vegetation. Important nitrogen sources in urban areas include household fertilizer and dead leaves from urban street trees, as documented by previous studies (Hobbie et al., 2017; Taguchi et al., 2021).

The positive association of orthophosphate with developed area for the OLS model may relate to low-intensity developed areas serving as a source for phosphate from lawn fertilizer applications, while positive relationships with soil type C may have to do with low infiltration rates creating higher overland flow. However, soils in developed areas are spatially autocorrelated, causing them to become insignificant in spatial regression models. The lack of significant explanatory variables for spatial lag and spatial error models in either season indicates that there are factors unexplored in this analysis that affect phosphorus concentrations. Such factors might include high flow events and drainage density (Pratt & Chang, 2012). For instance, decreased drainage density reduces nutrient runoff in urbanized watersheds, and locally, total phosphorus concentrations in Fanno Creek, on the outskirts of our study area, are known to increase during storms (Anderson & Rounds, 2002; Meierdiercks et al., 2017).

Total suspended solids (TSS)

TSS concentrations did not follow any clear spatial patterns between regions of the study area. However, there was a negative association between TSS and standard deviation in slope, indicating that unpaved areas with consistent inclines tend to have higher concentrations of TSS. This is consistent with our prediction that areas with high foot traffic have the greatest TSS concentrations, as less paved areas are more likely to deposit sediment. This result is further supported by Pratt and Chang’s findings that standard deviation of slope is negatively associated with total solids across seasons and scales for watersheds in the greater Portland, Oregon region (Pratt & Chang, 2012; You et al., 2019). Furthermore, Lintern et al.’s literature review also suggests a negative correlation of slope with TSS for developed areas (Lintern et al., 2018). The failure of spatial regression tools to find a model with adequate explanatory power for TSS suggests that analysis of TSS concentrations at this microscale may require including additional explanatory variables accounting for human activity, for instance, population density (Xu et al., 2021).

Predictive power of landscape variables

Percent developed, percent imperviousness, and percent forested had the highest Spearman correlation coefficients overall, emphasizing the importance of land cover on water quality variability even at the microscale. However, the spatial lag and spatial error models confirmed that percent imperviousness and percent developed were spatially autocorrelated, which decreased their explanatory power for orthophosphate in both seasons and zinc in the dry season. In the dry season at the 250-m scale, pipe length and road length exhibited high positive correlation with E. coli. Thus, it is interesting that road length was not a significant predictor of any pollutant in the spatial regression, although this could be due to correlations between pipe and road density ruling out road length as a predictor.

Hydrologic soil group C was negatively correlated with E. coli and TSS in the Spearman tests in the wet season. However, for all other spatial lag and spatial error models, soil group C was ruled out as a significant predictor when it was initially included in OLS models; in other words, soil group C is highly spatially autocorrelated within the study area, thus reducing our ability to assess the predictive power of hydrologic soil group using the assumptions of linear regression. Because hydrologic soil group C has “relatively high runoff potential” when wet, negative correlation with E. coli suggests that even relatively impermeable soil still serves a purpose in influencing E. coli concentrations (Phillips et al., 2019; USDA, 2007). Furthermore, soil is a growth medium for E. coli under certain conditions, and E. coli transport through soil is a function of soil water content (Byappanahalli & Fujioka, 1998; Dwivedi et al., 2016). Future studies evaluating the effects of hydrologic soil group on E. coli colony formation would strengthen analysis in this regard.
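To make the notion of spatial autocorrelation concrete, the sketch below builds a binary distance band weights matrix and computes global Moran's I for a single variable. Moran's I is used here only as a common illustrative diagnostic and is an assumption on our part; the coordinates, the 150-m threshold, and the variable values are hypothetical, and the sketch does not reproduce the spatial lag or spatial error models themselves.

```python
from math import hypot


def distance_band_weights(coords, threshold):
    """Binary weights: 1.0 if two distinct points lie within `threshold`, else 0.0."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            if dist <= threshold:
                weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * (cross / sum(d * d for d in dev))


# Hypothetical stations spaced 100 m apart along a line (coordinates in meters).
coords = [(0.0, 0.0), (100.0, 0.0), (200.0, 0.0), (300.0, 0.0)]
w = distance_band_weights(coords, threshold=150.0)

# A smoothly varying variable yields positive autocorrelation ...
i_gradient = morans_i([1.0, 2.0, 3.0, 4.0], w)
# ... while an alternating pattern yields negative autocorrelation.
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)
```

A predictor with strongly positive Moran's I carries much of the same information as the spatial structure captured by a lag or error term, which is one reason such a predictor can lose significance once spatial dependence is modeled explicitly.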

Scale effects

The 250-m scale produced a higher number of significant correlations and higher correlation coefficients between water quality parameters and explanatory variables in both seasons, suggesting that a larger microscale is more indicative of water quality than a more immediate microscale, at least when using a circular buffer. Future studies could employ multiple riparian buffers to further compare spatial determinants of water quality across microscales (Pratt & Chang, 2012). Such studies should also calculate landscape fragmentation metrics using software such as FRAGSTATS (McGarigal & Marks, 1995) for more robust explanatory power (Chang et al., 2021; Fernandes et al., 2019).

Because we conducted analysis at the microscale, we were unable to incorporate sociodemographic factors as explanatory variables in our analysis of water quality. Another important next step of this research is to perform a multi-level analysis at the census block group scale to evaluate how income, race, education, and other socioeconomic variables are associated with water quality parameters at multiple spatial scales (Baker et al., 2019; Chan & Hopkins, 2017; Garcia-Cuerva et al., 2018).

Conclusions

Correlation and spatial regression analyses were conducted for samples of six pollutants originating from 128 water quality stations around the Portland, Oregon area, from 2015 to 2021. We examined the ability of various land cover, infrastructure, and soil and geomorphological factors to act as explanatory variables at the microscale between the wet and dry seasons. We found that there were seasonal and spatial differences in water quality parameters that can be attributed to differences in land use and land cover at the chosen scale, which were often associated in opposite directions from initial Spearman correlation coefficients. Using a distance band weights matrix, spatial lag and spatial error models best explained variations in water quality and uncovered strong spatial autocorrelation for hydrologic soil group C, imperviousness, and percent developed variables. E. coli was associated with land cover, soil group C, and topographic variables, while pipe length primarily explained variations in lead concentrations. Nitrate was primarily affected by percent developed area as well as impervious surface in both seasons. Spatial regression models for orthophosphate ruled out several strongly spatially autocorrelated predictors, though mean elevation maintained a negative association. Total suspended solids were also affected by topographic variables and pipe length. Models for zinc included topographic variables in the wet season and land cover variables in both seasons.

Unexpected relationships of imperviousness and developed area with pollutants might result from the large amount of urban green spaces in Portland, which we considered “developed” but with low amounts of impervious surface. To address this result, our methodology could be modified to better evaluate the effects of the amount of development on pollutant concentrations. Observations of the effects of hydrologic soil group C on water quality were limited by spatial autocorrelation that ruled out significance, although multiple OLS models included soil as a significant variable. By incorporating precipitation data and comparing other hydrologic soil groups in the future, we could better examine the effects of hydrologic soil groups on concentrations of E. coli and other pollutants.

Our research adds to the body of knowledge regarding local hydrology, urban infrastructure, and ecosystem services in Portland, Oregon. Facing unprecedented environmental and social challenges as a result of climate change, city planners hoping to improve water quality in metropolitan areas can use the findings of this study to better evaluate water pollution in a metropolitan city. Researchers in the field can use findings from this study to understand how anthropogenic and natural variables interact to affect water quality across space and time.