1 Introduction

Cover crops are crops planted between growing seasons to improve soil health, decrease soil erosion, manage nutrients, and suppress weeds, among other benefits (Blanco-Canqui et al. 2015; O’Connell et al. 2014). Weed-suppressive cover crops can be an important tool to manage weeds sustainably because they can limit weed seed rain and prevent the buildup of the weed seed bank (Baraibar et al. 2018; Brainard et al. 2011; Brennan and Smith 2005). However, the ability of a cover crop to effectively suppress weeds can vary substantially. In 2015, Hamilton (2016) measured cover crop and weed biomass in 110 fields across Pennsylvania (USA) and reported a surprising amount of variation in weed biomass among fields planted to the same cover crop, across cover crop species, and across fields. Similarly, weed-suppressive effects can vary even from the same cover crop species depending on the year, seeding time, and weed community composition (Baraibar et al. 2018; Björkman et al. 2015; Hayden et al. 2012). Cover crop biomass is considered one of the main factors driving weed biomass in a standing cover crop because of competition for light, water, space, and nutrients (Brennan and Smith 2005; Finney et al. 2016; Wittwer et al. 2017). However, other factors such as cover crop functional traits, timing and management of sowing and termination, and management context (organic, conventional) may also influence weed biomass in cover crops (Björkman et al. 2015; Brainard et al. 2011; Dorn et al. 2015). The interacting effects of these various factors have not been adequately explored.

Knowledge of cover crop functional groups can be helpful in predicting weed suppression. Grasses and brassicas tend to suppress weeds effectively through rapid soil cover and large biomass production (Brainard et al. 2011; Brennan and Smith 2005; Dorn et al. 2015; Finney et al. 2016; Hayden et al. 2012) whereas legumes are slower-growing and less competitive against weeds (Lawson et al. 2015; but see Hayden et al. 2012; Fig. 1). Also, functional groups can differ in which parts of the weeds they affect; for example, in perennial weeds, some cover crops can be more suppressive of aboveground than of belowground biomass (Ringselle et al. 2017). Cover crop mixtures may provide different levels of weed suppression depending on cover crop species and functional group composition, and can ensure service provisioning in various conditions, even when one of the species fails to establish. Grass–legume mixtures can be as weed-suppressive as grass monocultures (Hayden et al. 2012; Lawson et al. 2015). Higher-diversity mixtures may need as little as 20% of the monoculture seeding rate of an aggressive grass such as cereal rye (Secale cereale L.) to effectively control weeds (Baraibar et al. 2018). However, weed suppression by mixtures that include less-aggressive species, such as some legumes, may be lower and more dependent on climatic conditions, which influence cover crop establishment and growth (Brainard et al. 2011).

Fig. 1 A weed-suppressive cereal rye monoculture (left) and a red and crimson clover mix with common lambsquarters seed heads sticking out over the canopy (right)

Cover crop sowing and termination dates may influence weed biomass because they define the length of the growing season and the associated climatic conditions. A longer growing season will provide both cover crops and weeds more time to grow, which may result in greater weed biomass (Baraibar et al. 2018; Murrell et al. 2017). Planting or seeding date can also influence weed germination periodicity—that is, the time of the year when conditions (mainly temperature and moisture) for a particular weed species are optimal for germination. For example, if cover crops are planted in early summer, cover crop germination may coincide with a peak in germination of summer annual weeds and, therefore, the cover crop can become weedier than if planting date is delayed past the peak germination period for those weeds (Baraibar et al. 2018; Myers et al. 2004). Finally, cover crop termination date can influence cover crop biomass in spring and, thus, weed suppression potential via competition at the end of the cover cropping period.

Tillage before cover crop seeding can stimulate weed germination and indirectly influence weed biomass in the cover crop (Mirsky et al. 2010; White et al. 2017). If cover crops are seeded after a tillage operation, the likelihood that a flush of weeds germinates with the cover crop is higher than if the cover crop is no-till drilled. This tillage effect can be harnessed to reduce weed biomass with the use of a stale seedbed. In this practice, the field is tilled to stimulate weed seed germination and then tilled again to kill the germinated seedlings prior to cover crop planting.

Finally, management systems may also influence weed suppression by cover crops. In systems that allow high weed seed production due to low weed control efficacy, such as some organic systems (Dorn et al. 2015) and conventional systems with herbicide-resistant weeds, high seedbank density may reduce the relative competitiveness of cover crops against weeds. This is because the number of individuals surviving a weed management treatment can be directly related to the number of individuals initially present (Dieleman et al. 1999). Additionally, weed species composition and the presence of perennial weeds, which are usually more difficult to control in organic systems (Orloff et al. 2018), may also influence weed biomass in cover crops.

As cover crop adoption increases, more research is needed to help farmers and landowners choose the best cover crop species and management practices to achieve desired goals. In some contexts, weed suppression by cover crops may be critical to meet those goals, whereas in other situations, weeds may be providing the very same services desired from the cover crop. In this paper, we draw on a large dataset of cover crop and weed biomass measurements collected across seven experiments at the Penn State Russell E. Larson Agricultural Research Center, Rock Springs, PA, USA, and farms across Pennsylvania (Table 1) to predict weed biomass based on cover crop type (grasses, brassicas, legumes, and mixtures), length of the growing season, seed bed preparation, management system, and fall and spring cover crop biomass.

Table 1 Summary of the main characteristics of the experiments from which data were extracted

2 Materials and methods

Cover crop and weed biomass data from 1764 measurements (810 in the fall and 954 in the spring) were used in a random forest model to identify the main factors related to weed biomass in winter cover crops in the fall and spring. All observations were limited to winter cover crops in arable cropping systems (mainly in grain crops) in the Mid-Atlantic Region (USA), primarily in Pennsylvania. The Mid-Atlantic Region includes areas in plant hardiness zones 5 to 7 (USDA 2012), which means that the average extreme minimum temperatures range from −26 to −12 °C, and therefore some cover crop species are susceptible to winter kill.

Cover crops included in this analysis were seeded after a winter grain, after corn and soybean, interseeded into corn, or frost seeded into a winter grain. Data used in this analysis included, for the fall and spring, respectively, 179 and 275 measurements in grass monocultures, 83 in brassica monocultures in each season, 166 and 206 in legume monocultures, and 382 and 390 in mixtures. The discrepancy between the number of observations in the fall and spring for some cover crop types mainly arises from dataset number 5 (Malcolm et al. 2015), where cover crop and weed biomass were only measured in the spring. As for system type, 490 observations were made in organic systems in each season, and 320 and 464 were made in conventional systems in the fall and spring, respectively. Finally, across both seasons, cover crops were no-till seeded in 411 instances and tilled before seeding in 1353 instances.
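As a quick consistency check, the counts above can be tallied against the reported totals of 810 fall and 954 spring measurements. The sketch below is illustrative only; it assumes that the single brassica and organic counts apply to each season (an interpretation the totals support), and the dictionary names are hypothetical.

```python
# Observation counts as reported in the text, stored as (fall, spring).
# Assumption: the single brassica (83) and organic (490) counts apply to each season.
by_type = {
    "grass": (179, 275),
    "brassica": (83, 83),
    "legume": (166, 206),
    "mixture": (382, 390),
}
by_system = {"organic": (490, 490), "conventional": (320, 464)}
by_seedbed = {"no-till": 411, "tilled": 1353}  # pooled over both seasons


def season_totals(table):
    """Sum the fall and spring columns of a {level: (fall, spring)} table."""
    fall = sum(f for f, _ in table.values())
    spring = sum(s for _, s in table.values())
    return fall, spring


type_totals = season_totals(by_type)
system_totals = season_totals(by_system)
seedbed_total = sum(by_seedbed.values())
print(type_totals, system_totals, seedbed_total)  # (810, 954) (810, 954) 1764
```

Under this reading, each breakdown sums to the 810 fall and 954 spring measurements, and the seedbed counts sum to the overall 1764.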

Aboveground cover crop and weed biomass were assessed by clipping all plants in a defined area within the cover crop (usually one or more 0.25-m² quadrats) in the fall, before the first killing frost, and in the spring, prior to cover crop termination. Cover crop species and weeds were sorted, dried at 65 °C for 1 week, and weighed. Cover crop species, seeding and termination dates, seed bed preparation (till, no-till), and system type (organic, conventional) were also recorded.
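Because biomass is reported in kg ha−1 throughout, quadrat dry masses need to be scaled up by area. The sketch below shows the unit conversion only; the function name and the example mass are hypothetical, and the text does not state how multiple quadrats per plot were combined, so pooling them is an assumption.

```python
def quadrat_to_kg_per_ha(dry_mass_g, quadrat_area_m2=0.25, n_quadrats=1):
    """Convert pooled dry mass (g) clipped from quadrats to kg per hectare.

    Assumption: dry_mass_g is the total over n_quadrats equal-sized quadrats.
    """
    if quadrat_area_m2 <= 0 or n_quadrats < 1:
        raise ValueError("quadrat area must be positive and n_quadrats >= 1")
    g_per_m2 = dry_mass_g / (quadrat_area_m2 * n_quadrats)
    # 1 g per m2 equals 10 kg per ha (10 000 m2 per ha, 1000 g per kg).
    return g_per_m2 * 10.0


# Hypothetical example: 50 g of dry weeds clipped from one 0.25-m2 quadrat.
print(quadrat_to_kg_per_ha(50))  # 2000.0
```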

2.1 Data analysis

We used random forests (RF) to predict fall and spring weed biomass and identify the most important variables for predicting weed biomass in a cover crop. RF are an ensemble of classification and regression trees, where each tree is constructed from bootstrapped samples of observations using a limited number of randomly selected predictor variables (Strobl et al. 2009; Breiman et al. 1984). This approach also largely reduces the impact of block effects within an individual experiment. Though individual trees remain a valuable tool to identify and visualize relationships among predictors, RF is considered a more robust strategy for assessing variable importance, as forests are less prone to instability and more fully leverage information held in large datasets.
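The two sources of randomness described above, bootstrapped observations and a random subset of candidate predictors, can be sketched in a few lines. This is a generic illustration in Python, not the implementation used for the analysis; the predictor labels, the seed, and the choice of two candidates are assumptions made for the example. In a standard random forest, the candidate predictors are redrawn at every split.

```python
import random

# Illustrative labels for the five predictors listed in the text.
PREDICTORS = ["cover_crop_biomass", "cover_crop_type", "gdd",
              "seedbed_preparation", "system_type"]


def bootstrap_rows(n_obs, rng):
    """Draw n_obs row indices with replacement; also return the out-of-bag rows."""
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]
    out_of_bag = sorted(set(range(n_obs)) - set(in_bag))
    return in_bag, out_of_bag


def split_candidates(predictors, n_try, rng):
    """Pick n_try distinct predictors to consider at one split."""
    return rng.sample(predictors, n_try)


rng = random.Random(42)                            # fixed seed, for reproducibility only
in_bag, out_of_bag = bootstrap_rows(810, rng)      # 810 = number of fall measurements
candidates = split_candidates(PREDICTORS, 2, rng)  # n_try = 2 is an arbitrary choice
print(len(set(in_bag)), len(out_of_bag), candidates)
```

On average, roughly 37% of observations are left out of any one bootstrap sample; these out-of-bag rows are what a forest typically uses to estimate prediction error without a separate test set.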

We developed two models to predict weed biomass: one for the fall and one for the spring. Predictor variables included in the RF were cover crop biomass (in the fall and in the spring), cover crop type (grass, legume, brassica, mixtures), growing degree days (GDD) accumulated from cover crop planting to the end of the year (fall model) or from January 1st to cover crop termination (spring model; base temperature 0 °C for both), seed bed preparation (no-till, tillage), and management system type (organic, conventional). Because data came from different trials, the independent variables were unbalanced. Therefore, we constructed training sets for the RF models in two ways: (1) accepting an unbalanced design, where training samples were selected from the original dataset with equal probability, and (2) selecting training samples with weighted probabilities such that each training set had a balanced sample size across the levels of each predictor variable. We achieved this balancing by oversampling from the levels of predictor variables with relatively few observations rather than undersampling the levels of predictor variables with many observations. Because the balanced sampling method did not substantially improve the predictive accuracy of the model nor change the interpretation of the most important variables, we used the RF models developed from sampling the original dataset with equal probability to develop our final interpretation of the results. To do this, we constructed partial dependence plots from the RF models to visualize the effects of the variables with the highest importance scores on predicting weed biomass in the fall and spring.

RF models were built in R statistical software (R Core Team 2018) using the “randomForestSRC” package (Ishwaran and Kogalur 2018). Variable importance scores are calculated as the change in prediction error that occurs when the values of each variable in the model are permuted: the larger the increase in prediction error, the more important that variable is for predicting the response. We report relative importance scores, which are the importance scores for each variable divided by the greatest variable importance score in the model. Partial dependence plots were constructed by cover crop type for fall and spring weed biomass by systematically varying GDD and cover crop biomass values across the input dataset and calculating an average predicted weed biomass at each interval of GDD and cover crop biomass. Combinations of GDD and cover crop biomass that were not present in the experimental data were not included in the partial dependence plots.
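The permutation logic behind the importance scores, and their rescaling to relative scores, can be illustrated with a small sketch. It is written in Python with a hypothetical stand-in prediction function rather than a fitted forest, and it is not the randomForestSRC implementation; the toy data, coefficients, and variable names are all invented for the example.

```python
import random


def mse(y_true, y_pred):
    """Mean squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, names, rng):
    """Increase in MSE when the values of one predictor at a time are shuffled."""
    base = mse(y, [predict(r) for r in rows])
    scores = {}
    for name in names:
        shuffled = [r[name] for r in rows]
        rng.shuffle(shuffled)
        permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
        scores[name] = mse(y, [predict(r) for r in permuted]) - base
    return scores


def relative_importance(scores):
    """Divide every score by the largest score in the model."""
    top = max(scores.values())
    if top <= 0:
        raise ValueError("no predictor increased the prediction error")
    return {name: score / top for name, score in scores.items()}


def toy_predict(row):
    # Hypothetical stand-in for a fitted forest; it ignores "noise" on purpose.
    return 0.5 * row["gdd"] - 0.1 * row["cc_biomass"]


rng = random.Random(1)
rows = [{"gdd": rng.uniform(0, 2500),
         "cc_biomass": rng.uniform(0, 8000),
         "noise": rng.random()} for _ in range(200)]
y = [toy_predict(r) for r in rows]  # toy response, not measured weed biomass
scores = permutation_importance(toy_predict, rows, y,
                                ["gdd", "cc_biomass", "noise"], rng)
relative = relative_importance(scores)
print(relative)
```

In this toy setting, the variable that the prediction function ignores receives a relative score of zero, and the most influential variable receives a score of one by construction.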

3 Results and discussion

The relative variable importance scores of the RF models were very similar between models trained on the original unbalanced datasets and models trained on balanced sampling in the fall (Table 2). In spring, the two variables with the highest importance scores exchanged rank order when balanced sampling frequencies for the cover crop type and seedbed preparation factors were implemented. However, these two variables, spring GDD and cover crop type, had similarly high importance scores and were well separated from variables of lesser importance under all methods of developing RF models for the spring. Predictive accuracy (model r²) was similar between the sampling methodologies (Table 2), with the exception of balanced sampling by seedbed preparation, which reduced the accuracy of the model. From these results, we concluded that balanced sampling of the training dataset neither changed the interpretation nor improved the accuracy of the model, and we therefore used results from equal probability sampling of the whole original dataset to construct the partial dependence plots and interpret the data.
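One simple way to build a training set that is balanced for a single factor is to resample every level, with replacement, up to the size of the largest level. The sketch below illustrates this for cover crop type using the fall counts reported in the methods. The exact weighting scheme used in the analysis is not described, so this Python function, its name, and the `obs_id` field are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def balanced_oversample(rows, factor, rng):
    """Resample rows with replacement so every level of `factor` is equally frequent.

    Every level is drawn up to the size of the largest level, so sparse levels
    are oversampled rather than abundant levels being undersampled.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row)
    target = max(len(members) for members in groups.values())
    sample = []
    for members in groups.values():
        sample.extend(rng.choice(members) for _ in range(target))
    return sample


# Fall observation counts per cover crop type, as reported in the methods.
fall_counts = {"grass": 179, "brassica": 83, "legume": 166, "mixture": 382}
rows = [{"cover_crop_type": level, "obs_id": i}
        for level, n in fall_counts.items() for i in range(n)]
balanced = balanced_oversample(rows, "cover_crop_type", random.Random(0))
level_sizes = Counter(row["cover_crop_type"] for row in balanced)
print(level_sizes)
```

Each cover crop type then contributes 382 rows to the training set, so the sparse brassica level is represented mostly by repeated observations.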

Table 2 Variable importance scores for each of the explanatory variables using an unbalanced design and a balanced design for each factor, and the predictive accuracy of the random forest models (r²). Partial dependence plots (Figs. 2 and 3) were constructed using the results from the unbalanced design

3.1 Fall weed biomass

In the fall, the RF model explained 65% of the variance in weed biomass. Fall GDD was the variable with the highest importance score, followed by cover crop type, and cover crop biomass (Table 2). Weed biomass was low in all cover crop types below 1500 fall GDDs and increased thereafter (Fig. 2). Fall growing degree days are related to soil degree days (DD), which have been positively correlated to weed emergence (Myers et al. 2004). A long growing period in the fall (high GDD), which was the result of an early cover crop seeding date, may have provided sufficient time for weeds to accumulate large amounts of biomass. When cover crops and weeds emerge together, competition between weeds and cover crops is low and some weeds may become established before the cover crop can effectively compete with them (Brennan and Smith 2005). Unfortunately, we do not have sufficient weed species composition data to be able to assess which species drove high levels of weed biomass in high GDD situations. Differences in competitive ability among weed species or growth forms (perennial vs. annual species) could shed more light on the mechanisms that affected weed biomass in the different cover crops. However, Baraibar et al. (2018) reported an increase in summer annual weed biomass when cover crops were seeded in early August compared to mid-August or early September. Summer annual weeds in this location, such as common lambsquarters (Chenopodium album L.), generally have higher biomass production potential than their winter annual counterparts. This indicates that the germination periodicity of annual weeds may interact with GDD accumulation to influence weed biomass in cover crops.
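To make the 1500 GDD threshold concrete, the sketch below accumulates growing degree days with a base temperature of 0 °C. The text specifies only the base temperature; the daily-mean formula (average of daily minimum and maximum, floored at zero) is a common convention assumed here, and the temperatures in the example are invented.

```python
def accumulate_gdd(daily_min_max, base_temp=0.0):
    """Sum daily growing degree days from (t_min, t_max) pairs in degrees Celsius.

    Assumption: daily GDD = max(0, (t_min + t_max) / 2 - base_temp).
    """
    total = 0.0
    for t_min, t_max in daily_min_max:
        total += max(0.0, (t_min + t_max) / 2.0 - base_temp)
    return total


# Hypothetical example: 100 days with a minimum of 10 °C and a maximum of 20 °C.
print(accumulate_gdd([(10, 20)] * 100))  # 1500.0
```

Under this convention, a cover crop seeded 100 days before the end of the year would reach 1500 GDD only if daily mean temperatures averaged 15 °C over that period, which illustrates why high fall GDD values correspond to early seeding dates.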

Fig. 2 Partial dependence plots for fall weed biomass (kg ha−1) in grass (a), brassica (b), legume (c) cover crops, and cover crop mixtures (d) as related to fall growing degree days (base temp 0 °C) and cover crop biomass (kg ha−1). Colors in each figure are included to facilitate the interpretation of the results and represent a gradient from less (blue) to more (red) weed biomass, but do not represent a given value or interval
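The averaging that underlies a partial dependence plot can be sketched as follows. This is a one-dimensional simplification in Python: the plots in this paper vary GDD and cover crop biomass jointly, by cover crop type, and exclude combinations absent from the data, none of which is reproduced here. The prediction function and all of its numbers are hypothetical stand-ins for a fitted forest.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        predictions = [predict(dict(row, **{feature: value})) for row in rows]
        curve.append((value, sum(predictions) / len(predictions)))
    return curve


def toy_predict(row):
    # Hypothetical stand-in for a fitted forest; all numbers are invented.
    base = 100.0 if row["gdd"] < 1500 else 400.0
    return base * (0.5 if row["cc_biomass"] >= 2000 else 1.0)


rows = [{"gdd": 900, "cc_biomass": 1000}, {"gdd": 1800, "cc_biomass": 3000}]
curve = partial_dependence(toy_predict, rows, "gdd", [1000, 2000])
print(curve)  # [(1000, 75.0), (2000, 300.0)]
```

Each point on the curve is the mean prediction after overwriting one predictor for every observation, so the remaining predictors keep their observed values.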

These results suggest that delaying cover crop seeding until later in the fall can decrease weed biomass in cover crops. Interestingly, the magnitude of the effect of fall GDD on weed biomass varied depending on cover crop type. Legume and brassica cover crops (Fig. 2b, c) harbored around three and 1.5 times more weed biomass than grass cover crops and mixtures, respectively (Fig. 2a, d), especially for GDD above 1500. These differences are likely caused by a more rapid establishment and growth of grass cover crops (in monoculture and in mixtures) compared to the slower establishment of brassica and legume cover crops.

Cover crop biomass was a moderately important predictor of weed biomass in the fall (Table 2), and it had a similar effect across cover crop types (Fig. 2). Weed biomass was slightly higher when cover crop biomass was below 2000 kg ha−1 than with higher levels of cover crop biomass. However, additional increases in cover crop biomass above 2000 kg ha−1 did not further decrease weed biomass. These results suggest that once weeds are established, they can effectively grow within the cover crop and accumulate large amounts of biomass regardless of how large the cover crop grows. These results are in agreement with a growing body of literature that shows that cover crop biomass alone may not be the main factor explaining weed biomass and that other factors, such as the speed of cover crop establishment, ground cover, or allelopathy, are better predictors of weed biomass (Björkman et al. 2015; Dorn et al. 2015; Gfeller et al. 2018; Lawley et al. 2012; Lawson et al. 2015).

System type had a relatively low variable importance score in predicting fall weed biomass compared to GDD and cover crop biomass (Table 2), even though organic systems had greater weed biomass on average compared to conventional systems (214 and 69 kg ha−1 respectively). Contrary to our expectations, tillage before cover crop seeding did not substantially influence weed biomass in the fall. Tillage may have triggered weed germination and influenced weed density but higher weed germination is not necessarily correlated to higher weed biomass (Fisk et al. 2001).

3.2 Spring weed biomass

In the spring, the RF model explained 47% of the variance in weed biomass. Cover crop type, spring GDD, and cover crop biomass were the variables with the highest importance scores (Table 2). In contrast to the fall, where weed biomass was primarily related to GDD in all cover crop types, the response to spring GDD and cover crop biomass differed substantially across cover crop types (Fig. 3). Weed biomass was greatest in legume cover crops, intermediate in brassica cover crops, and lowest in grasses and mixtures. GDD strongly influenced weed biomass, especially in the less competitive legume and brassica cover crops. Weed biomass in legume cover crops reached an average of 727 kg ha−1 when GDD exceeded 1500 and cover crop biomass was below 2000 kg ha−1 (Fig. 3c). Increases in legume cover crop biomass led to decreases in weed biomass, down to a minimum of 292 kg ha−1. This minimum was achieved in the highest yielding legume cover crops even with high GDD. In brassica cover crops, GDD above 1200 increased weed biomass to a maximum of 861 kg ha−1, which occurred when cover crop biomass was below 1200 kg ha−1 (Fig. 3b). Increasing cover crop biomass decreased weed biomass until reaching a minimum of 177 kg ha−1. Finally, in grass cover crops and mixtures, the highest weed biomass was 442 kg ha−1 and occurred only when cover crop biomass was less than 1200 kg ha−1 and with moderate to high GDD (Fig. 3a, d). Increasing cover crop biomass above 6000 kg ha−1 lowered weed biomass in grass monocultures and mixtures to an average of 105 kg ha−1 and 50 kg ha−1, respectively. Similar to the fall, system type had a relatively low variable importance score, and average weed biomass between the systems was quite similar, with 154 and 206 kg ha−1 in conventional and organic systems, respectively.

Fig. 3 Partial dependence plots for spring weed biomass (kg ha−1) in grass (a), brassica (b), legume (c) cover crops, and cover crop mixtures (d) as related to spring growing degree days (base temp 0 °C) and cover crop biomass (kg ha−1). Colors in each figure are included to facilitate the interpretation of the results and represent a gradient from less (blue) to more (red) weed biomass, but do not represent a given value or interval

In contrast to the fall, high spring cover crop biomass can effectively reduce weed success and may help prevent weed seed rain (Baraibar et al. 2018; Brennan and Smith 2005). This is likely due to direct competition from cover crops, but may also reflect a change in weed species composition as summer annuals are lost to winter kill. We do not have species-specific data available for all experiments, but the main winter annual species at the research station (where five of the seven experiments used in this paper were located) are common chickweed, henbit (Lamium amplexicaule L.), and shepherd’s purse (Capsella bursa-pastoris (L.) Medik.). Common chickweed is the most competitive species of these three and can cause yield losses in wheat and other winter crops (Marshall et al. 2003; Olsen et al. 2006). Another winter annual species common in the region that can cause problems in subsequent crops is horseweed (Conyza canadensis L.). High spring cover crop biomass can help reduce the size and seed production of some of these species, and help mitigate problems later in the rotation (Baraibar et al. 2018). Spring-germinating weed cohorts have little opportunity to produce biomass, since they emerge into an extremely competitive environment. Despite the general importance of cover crop biomass, in some cases, brassica and grass monocultures and mixtures provided good weed suppression even with low levels of cover crop biomass. Low cover crop biomass may have resulted from winter-killed cover crops, such as oats or forage radish. High-residue cover and/or the release of allelopathic compounds may explain weed suppression from these winter-killed cover crops (Baraibar et al. 2018; Bhowmik and Inderjit 2003).

Given the variability among fields, variation in weed biomass explained by the models was high (65 and 47% in fall and spring, respectively). However, there was still a percentage that could not be explained. Differences in precipitation across years and sites, weed seed bank pressure, weed species composition (and the importance of perennials vs. annual weeds), or background soil fertility are factors that we did not consider in our analysis and could have also modulated weed biomass in different cover crops. The large dataset used in this analysis encompasses a wide range of winter cover crop species, seeding times, and growing conditions in grain crop rotations representative of the Mid-Atlantic Region. We chose to use cover crop types as a grouping factor to distinguish between cover crop life forms that can differently affect weed biomass. However, as more information becomes available, using cover crop functional traits related to weed suppression such as specific leaf area, leaf to stem ratio or cover crop height (Storkey et al. 2015) will likely provide more generalizable information to understand the specific attributes that mediate cover crop weed suppression ability.

4 Conclusions

Taken together, our results suggest that farmers can achieve low weed biomass in their cover crops by carefully selecting seeding time and cover crop species. In the fall, the likelihood of a cover crop accumulating substantial weed biomass increases with the length of the growing season, which is primarily related to the seeding date. Seeding early may trigger stronger weed seed germination and lead to increased weed biomass, even with high levels of cover crop biomass. Planting monoculture grasses and mixtures containing grasses can help limit weed biomass in early-seeded cover crops, while monocultures of legumes and brassicas are less effective at limiting weed biomass. In the spring, robust cover crop growth can help ensure low weed biomass, especially when cover crops are terminated late. These results also suggest that there may be trade-offs associated with seeding dates for winter hardy species because early planted cover crops will also ensure large cover crop biomass in the spring.

Weed suppression is only one of the many goals of cover cropping, and in some cases, weedy plants may enhance ecosystem services from cover crops. However, managing for low weed biomass is likely to be important in many cases. Our results may help farmers achieve multifunctional cover crops that support their weed management strategy while also benefiting production and conservation.