Introduction

Loads of phosphorus (P), the main limiting nutrient in freshwater ecosystems, cause intense algal blooms and impair water quality (Schindler 1974). Recent estimates of nutrient loads from Sweden to the Baltic Sea show that diffuse losses from agriculture are now the largest anthropogenic source of P (Hansson et al. 2019). As a rule of thumb, Sharpley et al. (2009) proposed the 80:20 rule, where the majority (~ 80%) of diffuse P losses originate from a small proportion (~ 20%) of catchment area. These critical source areas (CSAs) coincide with hydrologically active, interconnected areas where overland and/or shallow subsurface flows mobilize and transfer P from terrestrial to aquatic ecosystems. Identification, quantification, and targeting of CSAs is the best way to increase the cost efficiency of P loss mitigation strategies, but remains a challenge for the research community and for policymakers (Pionke et al. 2000; Sharpley et al. 2015). Djodjic and Markensten (2019) used high-resolution distributed modeling to calculate erosion and overland flow risk, with particulate phosphorus (PP) as the dominant P fraction, for a large area (202,279 km2) representing > 90% of Swedish arable land.

Knowledge and available environmental data required for proper and reliable identification of CSAs for dissolved phosphorus (DP) losses (Sims et al. 1998; Djodjic et al. 1999) are currently lacking and quantification of subsurface water and P pathways and fluxes still remains a challenge (Sharpley et al. 2015), although DP may be a much more pervasive contributor to eutrophication than previously acknowledged (Kleinman et al. 2011). Significant downward P transport has been reported in sandy soils and particularly those having high P content due to long-term fertilization and manuring (Pierzynski et al. 2005), which also occurred in Sweden between the 1950s and 1980s (Djodjic and Mattsson 2013). Recently, Djodjic et al. (2021) identified in a large national screening study of small headwater catchments (< 50 km2) approximately 25% of Swedish arable land as potentially sensitive to DP losses. Sensitive catchments, found in regions with sedimentary bedrock and predominantly sandy soils, were shown to be characterized by high pH values, high calcium (Ca) content, and low P sorption capacity (PSC), often measured as the content of iron (Fe) and Al in soil (Schoumans and Groenedijk 2000; Börling et al. 2004).

Such findings, based on results of water quality monitoring programs at catchment scale, need however to be scaled-down to field and even sub-field level. Haygarth et al. (2012) concluded that “it was not possible to disaggregate an influence of soil (Olsen) P concentration on P export at the larger scale”. The large variability in important soil properties at field and sub-field/parcel scale demands much higher spatial resolution of identified CSA:s, or at least identification of easily measured soil properties governing P losses. In this regard, long-term monitoring of water quality at field scale is of great importance for understanding and mapping losses and nutrient cycling in agricultural systems (Norberg et al. 2022a, b), but at the same time very expensive and only applicable to few fields. For instance, the Swedish national program for monitoring of water discharge, nutrient losses and cultivation practices on arable fields includes only 13 fields across the whole country with more than 2.5 million hectares of arable land (Norberg et al. 2022a, b). On the other hand, soil sampling and testing for P is a well-established agricultural practice (Sims et al. 2000) and comprehensive databases with results from soil P tests have been developed covering large areas (Sims et al. 2000; Paulsson et al. 2015). Hence, a combination of soil testing data and monitoring data at field and small catchment scale in areas indicated to be vulnerable in screening study (Djodjic et al. 2021) might help to integrate different spatial and temporal scales. Soil P tests aim at quantification of plant available P (often referred to as P quantity), and delivering P to the soil solution (referred to as P intensity) (Pierzynski et al. 2005), as a snapshot at a given sampling occasion. However, there is a consistent pattern of increases in P losses if soil test P values exceed the agronomically optimal range (Heckrath et al. 1995; Sharpley 1995; Vadas et al. 2005) or if the degree of P sorption saturation increases (Pautler and Sims 2000; Schoumans and Groenedijk 2000; Maguire and Sims 2002). Additionally, weaker soil P extraction procedures, aiming at quantification of P intensity, are also used to assess the environmental risks. For example, McDowell & Sharpley (2003) proposed use of 0.01 M CaCl2 for estimating P in soil solution and subsurface flow whereas the use of water extraction was suggested to estimate surface runoff P (Yli-Halla 1995). Further, as no single technique can comprehensively identify the variety of P species in complex soil matrices, a combination of molecular techniques, such as 31P nuclear magnetic resonance (31P NMR, (Newman and Tate 1980)) to determine organic P compounds and P K-edge X-ray absorption near-edge structure (XANES, (Hesterberg et al. 1999)) spectroscopy to determine inorganic P species, together with a routine fractionation method, was used to characterize soil legacy P (McLaughlin et al. 2011; Liu et al. 2015; Schmieder et al. 2018).

However, there is less agreement that soil P tests or other techniques to analyze individual soil samples are alone the most appropriate means to characterize the risk of agricultural P to water quality (Sims et al. 2000), as they clearly do not characterize site hydrology (Djodjic et al. 2004; Shirmohammadi et al. 2005). Therefore, intact or disturbed soil columns are also used as a compromise between tedious and expensive long-term monitoring programs at field scale and the limitations in the interpretation of the chemical extractions of individual soil samples (Djodjic et al. 2004; Liu et al. 2012; Svanbäck et al. 2013; Parvage et al. 2015; Riddle et al. 2018).

Hence, a combination of soil sample analyses, lysimeter studies and results from long-term monitoring programs might give us valuable insights in possibilities and limitations of each individual technique/scale. While soil surveys may help us to cover large geographic areas and the variation range(s) in important soil properties, lysimeter studies can add short-term evaluation of the effect of both legacy P effects and recent agricultural practices, such as P fertilization. Finally, long-term monitoring of water quality at field scale can help us to evaluate long-term trends and applicability of the results from soil analyses and lysimeter studies to field conditions.

Here, we present the results of a three-step study. In the first step we investigate the relationships between the results of agronomic and environmental soil P tests and molecular techniques for 80 soil samples collected from seven sites previously indicated to be at high risk due to the underlying geology and P management (Djodjic et al. 2021). In the second step, we perform a rainfall simulation experiment on topsoil lysimeters from four of these sites to quantify and compare P leaching from soils with different initial soil P status, both before and after fertilization event. Finally, in the third step, we compare results and calculate trends in P losses for two of these sites with long-term monitoring data at field scale, as they are included in the Swedish national monitoring program.

The aims were: (i) to identify the speciation of organic and inorganic P in soils previously identified as potentially vulnerable to P losses, (ii) to identify soil properties governing P release and leaching in these soils, and (iii) to quantify and relate nutrient losses before and after fertilization from soils with varying soil P content and sorption characteristics, in the short- and the long-terms.

Materials and methods

Sites and soil sampling

The seven study sites from which in total 80 soil samples were collected are small catchments and observation fields included in different environmental monitoring programs (Supplemental Figure S1). All seven sites were located in areas vulnerable to nutrient losses in Sweden according to a previous screening study (Djodjic et al. 2021). In order to characterize the topsoil and shallow subsoils at the study sites, at least one soil sample was taken from each soil textural and geological class present at each site. Each sample consisted of at least 10 soil cores collected from an area of approximately 2 m2. Topsoil samples were collected from 0–20 cm depth and subsoil samples from 20–40 cm depth. All samples were air-dried and gradually broken down by hand and sieved by 2-mm sieve before analysis. One site (3M, an arable field) is included in the monitoring program “Observation fields on arable land” (Djodjic and Bergström 2005; Linefur et al. 2017). Three of the sites (E21, I28, and M42) are included in the Swedish “Agricultural Monitoring Programme” (Kyllmar et al. 2014), which covers 21 small catchments dominated by agriculture. In catchment E21, field 21E is also included in the above-mentioned monitoring program “Observation fields on arable land”. Three remaining catchments (Aln, Per, Ram) are included in regional or municipal water quality monitoring programs. Some important characteristics of the study sites are given in Supplemental Table S1.

Lysimeter sampling and rainfall simulation experiment

A total of 20 topsoil columns (0–0.2 m depth) were collected from 4 sites including 7 different fields in September 2021. Based on previous soil characterization, two sets of soil columns were collected from three of the sites (E21, Per and Ram sites), where one set was sampled from the field with lower initial soil P status, and the second one from the field with high soil P status. High soil P status means here soil P content above 5.2 mmol P kg−1 (Table 4) which corresponds to the highest P-AL class (class V) according to Swedish classification system (Swedish Board of Agriculture 2020). The initial soil P status in columns with lower P content corresponds to P-AL class III, i.e. the optimal soil P content (Swedish Board of Agriculture 2020). Three replicates were taken from each site, amounting to 18 columns in total. At the fourth site, 3M, only columns with high soil P content were available. In this case, due to the limitation in the number of columns in the rain simulator (maximum of 20 columns), only two replicates were collected. A polyamide cloth filter (50 μm) was fitted to the bottom of each 25-cm-high polyvinyl chloride (PVC) lysimeter casing with an 18.8 cm inner diameter. The columns from these annually plowed, predominantly sandy soils were excavated, brought to the laboratory and homogenized to a bulk density of 1.3 kg dm−3 and a porosity of approximately 41%.

Two rain simulators described earlier by Liu et al. (2012) and Riddle et al. (2018) were used to measure potential leaching from the soil columns. Each simulator is a stainless steel cabinet fitted with 10 sprayer nozzles, each for one soil column and 40-cm apart. The desired rainfall intensity was achieved by individual calibration of each nozzle with data logger. Each prepared soil columns was installed in a shallow, stainless steel container to allow leachate to pass freely to a water collecting container.

All soil columns were initially irrigated to assure start of the leaching and thereafter left to drain freely for 3 d prior to the start of the leaching experiment. In total, four irrigation events were performed. At each event, a total volume of 33 mm of water was applied with an intensity of 11 mm h−1 for a period of three hours. The value of totally 32 mm ± 1.9 mm is characteristic for a heavy rainfall with a duration of three hours and a return time of 10 years in the south-east region of Sweden, where the soil columns were collected (SMHI 2022). The first irrigation was conducted on unfertilized soil columns to retrieve the base scenario. The soil columns were allowed to drain freely for 3 d, the leachate was collected and sent for analyses. After leachate collection, all soil columns were fertilized with disodium hydrogen phosphate (Na2HPO4) at a rate corresponding to 22 kg P ha−1. In Sweden, farmers are not allowed to apply more than 22 kg P ha−1 yr−1 on average for a period of five years to ensure good water quality (Bergström et al. 2015). The fertilizer was carefully applied and mixed with the upper soil, and left for additionally four days until the second irrigation event, one week after the first irrigation event. Two additional irrigation events with the same methodology were performed, three and eight weeks after the first irrigation.

Soil analyses and water quality data

Soil P content was determined in all soil samples by extraction with ammonium lactate⁄acetic acid (P-AL) at pH 3.75 (Egnér et al. 1960), which is the standard agronomic soil P test used in Sweden, and by Olsen P (0.5 M NaHCO3, pH 8.5) (Olsen et al. 1954), which has recently been recommended in Sweden as an alternative soil P test for plant-available P in soils with high pH (Swedish Board of Agriculture 2018). The P-AL extraction test used in Sweden is assumed to show the levels of plant-available soil nutrients (Egnér et al. 1960). Besides P, potassium (K), Ca, and magnesium (Mg) concentrations were also determined by extraction with ammonium lactate/acetic acid (P-AL) at pH 3.75. The same extraction procedure was used to analyze Fe and Al, as indicators of soil P sorption capacity and degree of phosphorus saturation, DPSAL, calculated as the ratio between P-AL and sum of Al-AL and Fe-AL (Ulén 2006; Blombäck et al. 2021). The single-point P sorption index (PSI) developed by Bache and Williams (1971) was calculated for each soil at 1:10 w/v. Each soil sample was equilibrated with 19.4 mmol P kg−1 soil (KH2PO4) in 0.01 M CaCl2 on a shaker for 20 h, before centrifugation at a relative centrifugal force of 2095 g for 10 min. The supernatant was filtered (0.2 μm), and then the remaining P concentration in solution was measured colorimetrically as described by Murphy and Riley (1962). For analysis of easily soluble P, the soil samples were equilibrated with 0.01 m CaCl2 (1:10 w⁄v), centrifuged at 503.1 g for 10 min. and then filtered. Analyses for P-AL and Olsen-P were carried out by inductively coupled plasma atomic emission spectroscopy (ICP-AES) (OPTIMA 5300DV; Perkin Elmer), while P-CaCl2 was analyzed colorimetrically by flow injection analysis (FIA) (FIAStar 5000 Analyzer; FOSS). In addition, acid-digestible analysis of pseudo-total P content (TPps) was performed after oxidative boiling in 7 M nitric acid (HNO3) (SIS 1997) (Element2, Finnigan MAT, a subsidiary of Thermo Instruments inc.).

A sequential P fractionation (indicated hereafter with subscript PF) method based on Psenner (1988), Psenner and Pucsko (1988), Hupfer et al. (1995), and Hupfer et al. (2009) was also used to analyze the soil samples for P concentration and different operationally defined P fractions (Jan et al. 2015), which is described in detail in Supplemental Material.

Extraction of soil samples for liquid-state 31P-NMR analysis (indicated hereafter with subscript NMR) was based on a method developed by Cade-Menun and Liu (2014) (see Methods in Supplemental Material for details).

K-edge XANES measurements (indicated hereafter with subscript XANES) were conducted using the Soft X-ray Microcharacterization Beamline (SXRMB), equipped with Si(111) and InSb(111) double-crystal monochromators, at Canadian Light Source (CLS), Saskatoon, Canada. Linear combination fitting (LCF) analysis was performed to determine individual P-compounds using Athena software package (see Methods in Supplemental Material for details).

Water samples from the rainfall simulation study as well as within the monitoring programs covering the study sites, were analyzed at a water laboratory certified by the Swedish Board for Accreditation and Conformity Assessment (SWEDAC), following Swedish Standard Methods. Total P was analyzed on unfiltered samples after digestion in acid persulfate solution, while DP was measured after filtration with a 0.2-μm pore diameter filter (Schleicher and Schüll GmbH, Dassel, Germany).

Data from monitoring program

In order to compare results from soil survey and rainfall simulation with field observations, data for two fields (3M and 21E) included in the long-term monitoring program “Nutrient losses from arable fields” (Norberg et al. 2022a, b) were downloaded and evaluated. Field 21E is located within the catchment E21, and represented by site E21 7 in the soil survey and rainfall simulation study. In this monitoring program, water discharge is measured continuously where flow from drainpipes is slowed down in a concrete dam, in which the water level is recorded over a Thomson V-notch. Manual water sampling (grab sampling) was performed at the V-notches every second week during drainage periods since July 1973 (3 M) and December 1988 (21E). Since July 2012, flow-proportional water subsamples (∼20 ml per occasion) have been taken at field 21E using peristaltic pumps after 0.1 mm of runoff and collected in glass bottles. For this study, daily water discharge (mm), calculated to monthly runoff (mm/month) and calculated monthly loads of DP and TP (kg ha−1 month−1) were used to calculate monthly flow-weighted concentration of DP and TP (mg l−1). As fields may consist of more than one parcel with various agricultural crops and P amendment rates, amounts of manure and mineral fertilizer applied on each field were calculated from the quantity spread on a known parcels within the field and re-calculated for the total area of the fields to get an approximate area-weighted annual value for the whole field. Thereafter, trend analyses were performed on annual P additions (sum of manure and fertilizer), as well as on monthly discharge and monthly flow-weighted DP and TP concentrations. The trend analyses followed the same procedure as described in von Brömssen et al. (2021), using general additive models (GAM) to detect increasing or decreasing trends at any time during the analyzed period. The R script from von Brömssen et al. (2021) was used for trend analyses and production of trend plots.

Statistics

Principal component analysis (PCA) (JMP 13.0.0, SAS Institute, Cary, NC) was performed to identify possible clustering and Pearson correlation matrix estimated by a pairwise method in PCA analysis was calculated. To reduce the uncertainty in determination of individual mineral species, the XANES results were combined as Al-PXANES, Fe-PXANES, and Ca-PXANES-compounds. Thereafter, bootstrap forest modelling and calculated column contribution report were used to identify significant variables to the fit of selected variables (CaCl2-P, H2O-PPF, Fe-PPF, Fe-PXANES, PSI). The multiple regression equations including only statistically significant independent variables (p < 0.05) were calculated for the above-mentioned dependent variables. R2 values of the linear fit and corresponding p-values were calculated, to determine the percentage of dependent variables explained by independent variables. Analysis of variance (ANOVA) was performed to test for statistically significant differences between soil layers (topsoil and subsoil samples) and sites, and Tukey–Kramer Honestly Significant Difference (HSD) test to perform multiple comparisons of group means. ANOVA was also performed to test for statistically significant differences between rainfall simulation occasions (before fertilization, and one, three and eight weeks after fertilization) and sites, and Tukey–Kramer Honestly Significant Difference (HSD) test to perform multiple comparisons of group means.

Results

Agronomic and environmental soil P extractions

Soil P concentration measured by the different extraction methods varied considerably (Table 1). The amount of P extracted by the extraction method followed the order TPps > P-AL > Olsen P > P-CaCl2. Although the concentration in the different P extractions was generally lower for the subsoil samples, according to ANOVA there was no significant differences between topsoil and subsoil samples. Comparisons between sites showed rather large but consistent differences. For instance, site 3M, in most cases together with site Ram, had significantly higher (p < 0.0001) concentrations of TPps, P-AL, Olsen-P, and P-CaCl2 (Table 1). In contrast, site 3M, together with site Per, had significantly lower P sorption capacity (Table 1). Site I28 generally had the lowest P concentrations compared with the other sites, but the highest PSI (Table 1). All soils had rather high pH values and there were no statistically significant differences in pH between sites (Table 1).

Table 1 Number of samples (N), mean values and standard deviations (SD) for pH, pseudo-total phosphorus (TPps), phosphorus, calcium, aluminum and iron extracted by ammonium-lactate (P-AL, Ca-AL, Al-AL, Fe-AL), Olsen-P, single-point phosphorus sorption index (PSI), degree of P saturation (DPS-AL), CaCl2-extracted phosphorus (P-CaCl2), operationally defined fractions of phosphorus fractionation (PF): extraction with Milli-Q water (H2O-PPF), buffered dithionate solution (Fe-PPF), and NaOH (Al-PPF), organically bound P (org-PPF), defined as the difference between Al-P and extraction with NaOH following digestion and extraction with HCl (Ca-PPF), P fractions bound to Al, Fe and Ca analyzed with K-edge XANES and organic P fractions analyzed with liquid-state 31P-NMR analysis. Values within columns with different letters (A-D) are significantly different (p < 0.05)

Based on the ANOVA results for XANES analysis, site 3M had a significantly higher proportion of Fe-bound P than all other sites. Site Ram had the highest content of P associated with Al, significantly higher than all other sites. Site Per had a significantly lower content of Al-P than sites 3M and Ram. Sites E21 and Per had significantly higher content of P associated with Ca than sites I28 and RAM.

According to the LCF analysis, Fe–P compounds were dominated by FePO4-P and ferrihydrite-P, together with some P adsorbed to goethite at the Aln and M42 sites. Gibbsite-P was the dominant Al-P compound at all sites except Ram, where variscite-P dominated. Apatite-P was the dominant Ca-P compound. However, since it is difficult to distinguish individual P compounds in routine P-XANES, they were all grouped together into Al-P, Fe–P and Ca-P compounds after the LCF analysis (see Figure S2 in the Supplemental material). There is also a clear distinction between Fe–P and Al-P, as indicated by the pre-edge feature for the Fe–P, and the shift in edge positions, as shown in Figure S2. However, the post-edge region of Al-P and organic P (lecithin) is rather featureless, thus P K-edge is not sensitive. Therefore, 31P-NMR analysis was performed to determine organic P compounds.

31P-NMR results confirmed the XANES LCF results regarding a low proportion of P-org in the soils (Supplemental Figure S3). Mean P recovery in the NaOH extracts used for 31P-NMR and calculated as percentage of TP-NMRPS over the TPPS, was highest for sites Aln (58% of TPps) and RAM (60%), closely followed by sites 3M (53%) and M42 (56%). Recovery was lowest for sites Per (30% of TPps), E21 (38%), and I28 (41%). The NMR spectra for all sites were dominated by a signal commonly assigned to orthophosphate (Smernik and Dougherty 2007), whereas additional significant signals were present in a chemical shift interval associated with P monoester signals. Much smaller signals associated with phosphonates, ortho P diesters, and polyphosphates were also detected (Supplemental Figure S3). Assuming that recovery of organic P in the NaOH-EDTA extracts was complete, soils from site 3M had the lowest content of P associated with organic matter (9%), calculated as the sum of all organic P fractions in 31P-NMR analysis, followed by sites Per (14%) and I28 (18%). Site Aln (24%) had the highest content of organic P, followed by sites RAM and M42 (both 21%) and E21 (20%).

The sequential P fractionation recovered on average from 42% (I28) to 67% (M42) of TPps. For four of the sites (E21, 3M, Aln, and Per) the recovery was quite similar (54, 55, 55, and 56%, respectively) compared with 48% average recovery for the samples from site Ram. As seen in the case of TPps (Table 1), sites 3M and Ram also had the highest soil P content, calculated as sum of all fractions, while site I28 had the lowest P content (Fig. 3). Among the individual fractions, the Ca-P fraction was dominant in soils from all sites (Fig. 3).

Correlations

The PCA data revealed reasonable correlation between results of different analyses (Fig. 1). The two first components of PCA explained 58.6% of the total variance. PCA plot (Fig. 1) and correlation matrix (Table 2) show some important clustering of the variables. First, TPps had very strong and positive correlation to the sum of both all PF and 31P-NMR fractions (r = 0.84 and 0.85, respectively) as well as to the orthophosphate-PNMR (r = 0.84). The strong positive correlation was observed also to both agronomic soil P tests (P-AL, r = 0.79 and Olsen-P, r = 0.72). Additionally, TPps was also strongly positively correlated to Fe-PPF (r = 0.75), Al-PXANES (r = 0.76) whereas correlation to P-CaCl2, H2O-PPF and Fe-PXANES was moderate (r = 0.44, 0,57 and 0.56, respectively, Table 2). There was a moderate positive correlation between the operationally defined organic P fraction determined by sequential P fractionation and both ortho P monoesters (r = 0.53), and the sum of all 31P-NMR fractions (r = 0.53) (Table 2) excluding orthophosphate-PNMR. Ortho P monoesters were the largest organic P fraction (median value 36%) identified by 31P-NMR, although the largest P fraction in total was inorganic orthophosphate (median 61%) (Table 1).

Fig. 1
figure 1

Principal component analysis (PCA) plot of correlations between phosphorus (P) fractions determined by sequential P fractionation (PF), K-edge X-ray absorption near-edge structure (XANES) spectroscopy, and 31P nuclear magnetic resonance (31P NMR); soil P, calcium (Ca), iron (Fe), and aluminum (Al) content according to the ammonium lactate (AL), Olsen-P, and P-CaCl2 extraction methods; single point phosphorus sorption index (PSI); and average clay content at each site

Table 2 Correlation matrix for soil concentrations of different phosphorus (P) fractions determined by sequential P fractionation (PF), K-edge X-ray absorption near-edge structure (XANES) spectroscopy, and 31P nuclear magnetic resonance (31P NMR), and soil P content according to the ammonium lactate (AL), Olsen-P and P-CaCl2 extraction methods

There was also a strong positive correlation between the operationally defined Al-P fraction determined with sequential P fractionation and the sum of Al-PXANES compounds (r = 0.65), and between the operationally defined Fe-PPF fraction and the sum of Fe-PXANES compounds (r = 0.64, Table 2). The corresponding correlation between the operationally defined Ca-PPF fraction and the sum of Ca-PXANES compounds was moderate (r = 0.46, Table 2).

Weak extraction of soil P with 0.01 m CaCl2, as a proxy for dissolved reactive P in subsurface drainage water, was closely correlated with several P fractions (Fig. 1). First, both H2O-PPF (r = 0.76; Table 2) and Fe-PPF fraction (r = 0.67; Table 2) determined with sequential P fractionation and their sum (r = 0.69; Table 2), assumed here to represent mobile P, were strongly positively correlated with P-CaCl2. Strong positive correlation with P-CaCl2 was also found for both agronomic soil P tests (P-AL: r = 0.74; Olsen-P: r = 0.67, Table 2), and for the inorganic orthophosphate fraction (r = 0.61; Table 2) determined in EDTA-NaOH extraction performed by 31P-NMR. However, no correlation was found between P-CaCl2-P and organic P fractions identified by either 31P-NMR or sequential P fractionation (Fig. 1 and Table 2). Additionally, P-CaCl2 was strongly and positively correlated with Fe-PXANES (r = 0.67) compounds. There was also a moderate negative correlation between PSI and P-CaCl2 (r = 0.50). In turn, PSI was strongly negatively correlated to DPS-AL (r = -0.62), but was moderately positively correlated with Al-AL (r = 0.44).

Based on the result of bootstrap forest and column contribution analysis, 66% of variation in P-CaCl2 (p < 0.0001) could be explained by a multiple regression equation including three variables, either P-AL or Olsen-P, and PSI and Fe-PXANES (Table 3). The estimates for P-AL/Olsen-P and Fe-PXANES were positive whereas the estimate for PSI was negative.

Table 3 Multiple regression equations and coefficient of determination (R2) for easily dissolved P (P-CaCl2), Phosphorus Sorption Index (PSI) and soil P tests P-AL and Olsen-P

Similar procedure for PSI revealed that content of P-AL and Al-AL together with pH explained 67% of the variation in PSI (p < 0.0001, Table 3). The estimates were positive for both pH and Al-AL but negative for P-AL. Including DPSAL, which had strong negative correlation with PSI (r = − 0.62, Table 2), in multiple regression, together with pH in multiple regression resulted in lower explanation of variance (R2 = 0.54, p < 0.0001).

All three P fractions determined by XANES included in the multiple regression equation were significantly correlated with P-AL (Al-PXANES p < 0.0001; Fe-PXANES p < 0.0001; Ca-PXANES p < 0.05), explaining 73% of its variation (Table 3). In the case of Olsen P, only Al-PXANES (p < 0.0001) and Fe-PXANES (p < 0.0001) were significantly correlated, explaining 65% of its variation.

Rainfall simulation experiment

Table 4 shows some selected properties of the lysimeter soils included in the rainfall simulation experiment. As intended, the soils cover a wide range of soil P content, but the difference between the soils seems to be governed by the strength of the extraction.

Table 4 Selected soil properties for lysimeter soils: pH, clay content, pseudo-total phosphorus (TPps), phosphorus, aluminum, iron and calcium extracted by ammonium-lactate (P-AL, Al-AL, Fe-AL, Ca-AL), Olsen-P, single-point phosphorus sorption index (PSI), degree of P saturation (DPS-AL), CaCl2-extracted phosphorus (P-CaCl2), operationally defined fractions of phosphorus fractionation (PF): extraction with Milli-Q water (H2O-PPF), buffered dithionate solution (Fe-PPF), and NaOH (Al-PPF), organically bound P (org-PPF), defined as the difference between Al-P and extraction with NaOH following digestion and extraction with HCl (Ca-P), P fractions bound to Al, Fe and Ca analyzed with K-edge XANES and organic P fractions analyzed with liquid-state 31P-NMR analysis

Consequently, the ratio between maximum and minimum values in TPps, as the strongest extraction is only 1.6 (31.8/19.9 mmol kg−1, Table 4) and increases to 7.0 and 9.0 for the agronomic soil P tests, P-AL and Olsen P, respectively, and reaches as high as 25 and 68 for the H2O-PPF and P-CaCl2, respectively. Notably, sites 3M (corresponding to monitoring field 3M) and E21 7 (corresponding to monitoring field 21E) had most often the highest and the lowest soil P content, respectively. Total water discharge and concentrations of DP and TP per irrigation occasion and site are shown in Fig. 2. ANOVA showed few statistically significant differences in water discharge between irrigation events (Fig. 2a). According to Tukey–Kramer HSD, the discharge was significantly lower (p < 0.0001) from Per 2 lysimeters after the first irrigation event, before fertilization (BF), compared to the following three irrigation events. Also, for Ram 3, the discharge was significantly lower (p < 0.03) after the first irrigation event, before fertilization (BF), compared to the two last irrigations.

Fig. 2
figure 2

Total water discharge and concentrations of DP and TP per irrigation event and site. Please note different y-axis for site 3M for DP and TP

Considerable differences in DP and TP concentrations between the sites and irrigation events were recorded. Site 3M had by far the highest concentrations of DP and TP before the fertilization event (on average 7.3 and 7.7 mg P l−1, respectively, Fig. 2) followed by two other sites with high initial soil P content (Table 4), i.e. sites E21 5 (0.7 and 0.83 mg l−1, respectively) and Per 4 (0.9 and 0.94 mg P l−1, respectively). In these three sites, DP stands for more than 90% of TP. Site Ram 3, also with high initial soil P content (Table 4), had considerably lower DP and TP concentrations (on average 0.06 and 0.21 mg P l−1, respectively, Fig. 2) and DP was on average only 40% of TP. The two out of three sites with low initial P content had very low DP and TP concentrations (Fig. 2) (E21 7: 0.04 and 0.10 mg P l−1, respectively; and Ram 10: 0.005 and 0.007 mg l−1) whereas site Per 2 showed somewhat higher losses (0.12 and 0.23 mg l−1, respectively). In Per 2, DP accounted on average for 70% of TP, whereas the percentage was lower for E21 7 and Ram 10 with 47 and 49%, respectively.

An increasing trend in the concentrations of DP and TP after the fertilisation is generally observed (Fig. 2b and c). However, in spite of the increasing tendency, the differences were not statistically significant for site Ram 3 (both DP and TP) and sites 3M and Per 2 (TP). In all other cases, the concentrations of DP and TP were significantly lower before the fertilisation. Figure 3 shows the increase in DP in both relative terms, percentage, x-axis, and in absolute term, DP concentrations (mg l−1), y-axis. With an exception of site Ram 3, the sites with high initial soil P status (3M, E21 5 and Per 4, Table 4) had higher increase in absolute DP concentrations after ferilization. Two of the sites with low initial P content (Per 2 and E21 7) showed high increase in DP in relative terms (97% and 240%, respectively), but from low initial concentrations.

Fig. 3
figure 3

Increase in dissolved phosphorus (DP) concentrations in leachate after fertilisation. Dots represent the mean value for each site, and the ends of the whiskers represent the standard deviation. The x-axis represents increase in percentage and the y-axis shows increase in concentration relative to concentrations before fertilisation

Long-term monitoring at field scale

The trend analyzes of the annual fertilization rate, monthly discharge and flow-weighted P concentrations are shown in Fig. 4. For field 3M, fertilization rate was high amounting to on average 45 kg P ha−1 yr−1 with no significant trends in the annual fertilization rate nor in discharge data (Fig. 4). However, two separate increasing trends in both DP (data not shown) and TP concentrations were recorded, the first one between 1978 and 1985, and the second one between 2008 and 2020.

Fig. 4
figure 4

Annual fertilization, monthly discharge and monthly flow-weighted concentration for field 3M (left panel) and field 21E (right panel). The black line is observed values and the solid curve line describes the fitted general additive model (GAM). Yellow color indicates no significant trend, red indicates significant increasing trend and blue significant decreasing trend

The TP concentrations at field 3M are very high with the mean value of 0.54 mg l−1 and reaching over 1 mg l−1 in the later years. As in the lysimeter study for the same site, DP was more than 90% of TP. For field 21E, there are no significant trends in DP and TP concentrations. Initially, there is a decreasing trend in annual P fertilization rate (from 1989 to 2002) but the fertilization rate is lower (on average 21 kg ha−1 yr−1), and a short increasing trend in water discharge between 1993 and 1998. However, in this case, the concentrations of TP are very low (mean value 0.009 mg l−1). Once again, the proportion of DP was similar in data from the monitoring program for this field (48%) and in leachate from the lysimeter study (47%).

Discussion

Initial soil P status governs P losses

A correlation found between the results of two commonly used soil P tests (P-AL and Olsen P) (r = 0.81, Table 2) was in line with previous studies. For instance, Blombäck et al. (2021) reported a correlation coefficient of 0.88 for 62 topsoil samples and Otabbong et al. (2009) found a Pearson’s r-value of 0.86 based on 82 samples from Swedish long-term fertility trials. Eriksson et al. (2013) also found a strong correlation between P-AL and Olsen P (r = 0.85) for a dataset of soils from countries around the Baltic Sea. These studies encompassed a wider range of important soil properties (e.g., pH, clay content, background material), while we focused on sandy soils with high pH, which may be the main explanation for the somewhat weaker correlation. High agronomic soil P test (P-AL or Olsen-P), often referred to as a measure of P quantity (Pierzynski et al. 2005) was correlated with high P release (measured by P-CaCl2 and H2O-PPF) and mobile P (sum of H2O-PPF and Fe-PPF), often referred to as a measure of P intensity (Pierzynski et al. 2005), but also with high P leaching in the simulation rainfall experiment, for three out of the four sites. These three sites (3M, E21 5 and Per 4) showed also the largest increase in concentrations of DP and TP in leachate following the fertilization. The quantity and the intensity factors are linked by the soil’s sorption capacity where the capability of quantity factor to supply the intensity factor is inversely related to the sorption capacity (Pierzynski et al. 2005). This is illustrated with site Ram 3, characterized by highest PSI and Al-AL as well as the highest content of Al-P compounds, measured by both PF and XANES. In this case, high quantity factor does not result in particularly high intensity factor due to high sorption capacity of the Ram 3 soil. The high P release (soil analyses) and P leaching (lysimeter study) for site 3M was also propagated to extremely high P losses at field scale, as measured in the long-term monitoring program. The trend analysis of long-term monitoring data also showed increasing concentrations of both DP and TP from this field, although there are no increasing trends in the fertilization rates for this field (Fig. 4). Pote et al. (2003) suggested that added P causes a greater increase in soil P (measured as water-extractable P) when applied to soil that is initially higher in soil P. Djodjic and Mattsson (2013) concluded that “increasing the P rate together with a high initial soil P status resulted in much higher levels of easily soluble P concentrations” but also that “the applied in situ sampling strategy made it difficult to distinguish between the effects of differences in soil P status as opposed to differences caused by varying P amendment rates”. Our lysimeter study is in line with such results, suggesting also higher increase in P leaching losses after fertilization of soils with high initial P status, even when the same P amendment rate was used. Additionally, the long-term monitoring results and the increasing trend in P losses in field 3M suggest also that field P losses increase with time even at the constant fertilization rate on fields with soil P content above the agronomic optimum.

High soil P sorption capacity reduces P leaching

Soil sorption capacity, measured as PSI, was another important factor influencing P release and possibly P losses from the soils. Overall, the PSI values (0.4–2.1 mmol kg−1, Table 4) were generally lower than reported in other studies. For instance, Börling et al. (2001) reported PSI values ranging from 2.7 to 4.6 mmol kg−1 for soils included in the Swedish long-term fertilization experiments. Recently, Blombäck et al. (2021) reported PSI values ranging from 1.5 (min) to 7.6 (max) mmol kg−1, with an average value of 6.1 mmol kg−1. Indeed, three of our sites (3M, Per 4 and E21 5) had very low PSI values, 0.6, 0.6 and 0.5 mmol kg−1, respectively, indicating very high leaching risk. The rainfall simulation experiment confirmed high P leaching from these three soils. Combination of high soil P content and low sorption capacity in these soils resulted in high P release (high P-CaCl2 and H2O-P), high P leaching (lysimeter study) and high and over time increasing P losses (field 3M in the monitoring program). Accordingly, low soil P content (E21 7 and Ram 10) resulted in low P release and low leaching, as well as low and constant measured P losses at field level (Field 21E in the monitoring program). These results strengthen the findings of the screening study (Djodjic et al. 2021), identifying these parts of Sweden as vulnerable to high P losses. Sites Ram 3 and Per 2 are also interesting to compare: site Ram 3 has the highest PSI, which can counteract high soil P content and prevent high losses, whereas site Per 2 has higher P leaching losses (Fig. 2), especially after fertilization, in spite of lower P content, but also lower sorption capacity, Al-AL, Al-PPF and Al-PXANES content (Table 4). The increase in P concentrations after fertilization in Per 2 is higher both in relative and in absolute terms (97% and 0.12 mg l−1, Fig. 3) than for Ram 3 (73% and 0.04 mg l−1). This indicates that new P additions to soils with low sorption capacity may lead to considerable P leaching even though the initial soil P content is low.

The Al and Fe concentrations in soil are often suggested to be important factors in P sorption (Schoumans and Groenedijk 2000; Börling et al. 2004; Ulén 2006). In the absence of explicit measurements of soil P sorption capacity for Swedish soils, Ulén (2006) proposed use of AL-extracted Al and Fe as a proxy. All sites and soils included here had low Al and Fe content. In fact, site 3M (3.8 mmol kg−1 Al-AL) and especially Per (2.0 mmol kg−1 Al-AL) were both well below the 10th percentile (7.4 mmol kg−1 Al-AL) recorded for 12,598 soil samples collected in the Swedish soil survey (Djodjic 2015). Additionally, this was the case for all other sites except for Ram, which was above that value (8.7 mmol kg−1 Al-AL). Moreover, with an exception for site M42, the Fe-AL content in our soils was lower than the 10th percentile (4.84 mmol kg−1 Fe-AL) for the same national dataset of 12,598 soil samples (Djodjic 2015). However, Al-AL had stronger influence (r = 0.44) on PSI than Fe-AL (r = 0.26), but the highest correlation was recorded with the DPSAL (r = − 0.62). The higher DPSAL means more P sorption places being already occupied leading to an increase in soil P desorption as sorbed P accumulates in soil after P addition (Sharpley 1995). The bootstrap forest modelling of PSI and calculated column contribution indicated also Al-AL and P-AL along with pH as important factors, but not Fe-AL. The low content of Al-AL and Fe-AL in the soils included in our study might explain the weak relationship between DPSAL and P-CaCl2-P, as even low P-AL values could result in high DPS when the Al-AL and/or Fe-AL value is low. High DPS values did not result in high P-CaCl2 values, since the P content in the soils was actually low.

The low content of Al in the Per soil (Table 1) might also explain its significantly lower content of Al-PPF compared with all other soils (Table 1). Site Per also had the lowest Al-PXANES content (Table 1). In the large national screening study, Djodjic et al. (2021) found that Al-AL content, but not Fe-AL content, was significantly negatively correlated with the DP concentration in outflow from small catchments (< 50 km2) in Sweden. Eriksson et al. (2015) found that phosphate sorbed to iron (Fe) (hydr)oxides was the dominant P species in clay fractions in unfertilized soil, but that after long-term fertilization P accumulated mainly as P adsorbed to Al (hydr)oxides. Considering the redox sensitivity of Fe–P compounds (Sallade and Sims 1997; Young and Ross 2001), where P is released into the soil solution under anoxic conditions, our results suggest that, among Swedish soils vulnerable to P leaching, aluminum might be the main factor governing P sorption capacity, and soil resilience to P losses. Additionally, the strong correlation of both P-CaCl2 and H2O-PPF with both Fe-PXANES and Fe-PPF indicates the high mobility of P compounds bound to Fe in these soils. The fact that site 3M had highest P release, P leaching and field P losses in spite of the highest Fe-PXANES and Fe-PPF content (Tables 1 and 3) further confirms that P is loosely bound to Fe compounds and therefore do not substantially contribute to the soil sorption capacity as a moderator of P release and losses.

Soil Test P value is a useful base for fertilizer recommendations

Soil P test values are critical for formulation of fertilization recommendations in a replacement strategy for soils within optimum soil P status. A possible discrepancy between P-AL, which is an acid extraction (pH = 3.75), and Olsen P (pH = 8.5) for soils with high pH has been observed previously (Lovang 2015). Therefore, falsely high soil P-AL values may actually overestimate plant-available P in soils with high pH, and lead to erroneous fertilizer recommendations. The Swedish recommendation (Swedish Board of Agriculture 2020) for optimum soil P status is P-AL-class III (i.e., 40–80 mg P kg−1 soil or 1.29–2.58 mmolkg−1) (see Table 1). The corresponding optimum range for Olsen P in neighboring Denmark (Knudsen 2021) is 21–40 mg P kg−1 soil (0.68–1.29 mmolkg−1) (Table 1). Even in our small dataset, there were discrepancies in classification of soil samples between P-AL and Olsen P, especially for samples classified as high soil P content according to P-AL (Supplemental Table S2). Around 45% (21/47) of samples classified as having high P according to P-AL were classified as having optimum (17 samples) or low (4) soil P content according to Olsen P. Interestingly, the Fe-AL content was significantly lower in soil samples where P-AL overestimated soil P status (Supplemental Table S2) than in the 26 samples classified as having high soil P content according to both methods (P-AL and Olsen P). From this point of view, the Swedish Board of Agriculture (2020) recommendation to lower soil P status by one class for P-AL analyses of soil samples with pH > 7, or to perform Olsen-P analysis instead, seems justified in some cases. However, this may apply only for soils with a low content of Fe-AL, as the other soils analyzed in this study, with high pH but higher concentrations of Fe-AL, were classified to the same soil P-availability class irrespective of soil P test method (P-AL or Olsen P). This needs to be confirmed for a larger number of samples, as determination of Fe in the same AL-extract does not mean high additional costs.

The sequential P fractionation seemed to overestimate the proportion of Ca-bound P (Ca-PPF) in the soils studied here, compared with the proportion identified by XANES spectroscopy (Ca-PXANES) (Fig. 1). This was not unexpected, as Hupfer et al. (2009), Gu et al. (2020) and Barrow et al. (2021) showed that fractionation can produce misleading results for Ca-rich samples, causing a shift from alkaline-soluble P (NaOH-extracted Al-PPF) to acid-soluble P (HCl-extracted Ca-PPF). This is also the advantage of combining the analyses using the sequential P fractionation method, P K-edge XANES and NMR, as previously demonstrated (Liu et al. 2015; Gu et al. 2020) Indeed, in all soils studied here Ca-PPF was the largest P fraction according to sequential P fractionation, whereas Al-PXANES was the largest P fraction according to XANES spectroscopy (Table 1), with site Per as an exception. At site Per, the Al-content was very low (see the discussion above). It is difficult to distinguish Al-P compounds from organic P compounds in LCF analysis (Ajiboye et al. 2008; Zhang et al. 2018).

In studies by Ahlgren et al. (2013) and Liu et al. (2015), the 31P-NMR spectra indicated that all samples were dominated by orthophosphate and orthophosphate monoesters, which was also the case for the samples analyzed here. Phosphorus monoesters are generally known to be more resistant to microbial degradation than e.g., orthophosphate diesters (Condron et al. 2005), although some orthophosphate diesters can be degraded to orthophosphate monoesters, producing artifacts in the subsequent 31P NMR spectra (Cade-Menun and Liu 2014).

Conclusions and future implications

High spatial variability in P losses requires reliable identification of vulnerable soils as a precondition for cost-effective abatement. Low P sorption capacity of agricultural soils vulnerable to DP losses demands extreme caution when managing P in these soils, especially regarding P additions to soils having already high soil P content. The results of this study give concrete evidence of the high vulnerability of the soils previously identified as prone to losses of dissolved nutrients in a national screening study (Djodjic et al. 2021). Agronomic soil P tests (P-AL and Olsen P) are good indicators of the risk of P losses from soil. However, even moderate fertilization rates caused increase in P leaching in these soils, which needs to be reflected in future improved fertilization recommendations.

The forms of P in soil had a major impact on P leaching. Organic P compounds in the investigated soils seem to be rather stable and do not contribute substantially to the P leaching risk. On the other hand, presence of Al and Al-P compounds in soil seems to limit the release of DP, whereas high soil P content and a high content of Fe-P compounds may indicate more labile P and higher release of DP, leading to high P losses at field scale. The results regarding soils’ vulnerability to P losses are consistent comparing soil samples and P extractions with leaching experiments in soil columns and field-scale measurements. However, the two studied fields with long-term monitoring data represent the end-points of a wide spectra and therefore future research needs to include other fields situated between these end-points. As P losses from these fields are high even without new P applications and the alternatives to on-field mitigation measures are limited, further studies are needed to test the ability of edge-of-field measures (e.g. reactive permeable barrier-type P sequesters, constructed wetlands) to capture and reduce high field P losses.