On the Challenges of Soil Phosphorus Prospections in Heterogeneous Environments—a Case Study on the Iron Age Altenburg Hillfort (Niedenstein, Hesse, Germany)

Heterogeneous environments might be considered inadequate for geoarchaeological soil phosphorus (P) prospections due to the manifold potential alterations of archaeologically relevant soil P contents. To elucidate if this assumption is correct, we conducted a case study in the heterogeneous environment of the Iron Age Altenburg hillfort near Niedenstein (Hesse, Germany). We introduce the least influenced area (LIA) approach to deduce the native soil P contents of the local soils inductively from our own soil P data. The native P contents are crucial for further data evaluation in any P prospection. Moreover, we develop two parameters to quantify and compare soil P enrichment in pedogenically heterogeneous areas: the mean enrichment ratio (MER) and the maximum enrichment ratio (XER). Finally, we modify and discuss the approach of Lorch (The Science of Nature 28(40/41):633–640, 1940) to interpret spatial patterns of soil P enrichment with regard to prehistoric land use. Our study shows that soil P prospections can be conducted in heterogeneous environments. Still, they require a systematic field work strategy and soil scientific support. Moreover, instead of absolute soil P contents, alternative parameters must be derived from the data to evaluate soil P enrichment adequately for a heterogeneous area. Evaluated with such alternative parameters, our data indicate that the Iron Age Altenburg hillfort was significantly larger than formerly assumed. Only the Altenburg plateau and the directly adjacent slopes seem to have been densely covered with buildings. The neighboring areas might have been used for different purposes including agriculture, livestock grazing, and economic activities.


When the Going Gets Tough: Soil Phosphorus Prospections in Heterogeneous Environments
Archaeologists sometimes face the problem that they want, or need, to get more information out of the soils they study than is visible to the naked eye (Wells and Moreno Cortés 2010; Ethelberg 2008). In particular, the chemical features of a soil might be extremely informative but inaccessible to observation (Wells 2010; Barba and Ortiz 1992). One such chemical parameter is the soil phosphorus (P) content.
P is one of the major nutrients that all living beings on our planet require (Kruse et al. 2015; Vance 2001; Abelson 1999). As it is a building block of DNA (Goll 2017; Stewart et al. 2017; Vaccari 2011), P is contained in most organic materials (Sharpley and Rekolainen 1997; Amberger 1996). After such organic materials are introduced into a soil, they are mineralized biochemically (Blume et al. 2016; Nash et al. 2014). During this process, the contained P is set free gradually and can then readily be bound to soil particles (Weihrauch and Opp 2018). The energy of these bonds is often considered to increase with time (Weihrauch 2019; Margenot et al. 2017; Smeck 1985). This means that, unlike that of other elements, the solubility of P in soils declines over time (Gerke 2015; Smil 2000; Walker and Syers 1976). Consequently, P tends to accumulate in soils that regularly receive inputs of P-containing matter, and such enrichment may last for a very long time (Holliday and Gartner 2007; Zölitz 1983).
In the 1920s, the Swedish biochemist Olof Arrhenius discovered that this relationship may significantly support archaeological investigations because soils in sites of archaeological interest are often enriched with P (Heinrich 1987; Zölitz 1983; Provan 1971; Arrhenius 1931). During land use (e.g., agriculture, settling), people intentionally or unintentionally introduce P into the soils they use, for instance, in the form of feces, food scraps, and ashes (Weihrauch et al. 2019; Wells and Moreno Cortés 2010; Holliday and Gartner 2007; Barba and Ortiz 1992). Such inputs are considered larger under more intensive land use types (e.g., settlements as a spatial mosaic of locations used for food preparation, livestock housing, slaughtering, crafts, ritual activities, etc.) than for land use without or with just small P inputs (e.g., agriculture without fertilizer application; Ethelberg 2008; Jakob 1955; Lorch 1940).
In a geoarchaeological P prospection, we determine the P contents of soil and sediment units relevant for archaeological questions. Thus, we need to sample these soils systematically with regard to the respective study area and research question (e.g., Zimmermann 2001; Bakkevig 1980). The soil samples are then mostly analyzed chemically in a laboratory (Holliday and Gartner 2007). As a result, we obtain a spatial overview of the study area's soil P distribution. For instance, this can be a map showing that a certain section of a study area has significantly larger P contents than its surroundings (e.g., Zimmermann 2001; Zölitz and Heinrich 1990; Sjöberg 1976). In such a case, one would infer that the P-enriched section might have been used by people and the surroundings not.
Neglecting soil features (besides the P content) has a tradition in the history of geoarchaeological P prospection. Walter Lorch, one of the forefathers of the method in Germany, wrote: "The P content of the soil is only slightly dependent on the geological conditions, soil texture and soil type, terrain morphology, climate, vegetation and other geographical factors. There is, however, a very strong and direct relationship between the soil P content and human activity, which is manifested in economy, transport and settlement." (Lorch 1939, p. 84; translated). Hence, allegedly, soil P contents need only be interpreted with regard to anthropogenic activities in a P prospection (Zimmermann 2011; Jakob 1955; Lorch 1955).
This optimistic and anthropocentric attitude was increasingly rejected during the 1970s and 1980s (e.g., Gebhardt 1982; Bakkevig 1980; Woods 1977; Provan 1971). For instance, Reinhard Zölitz, one of the major developers of the method in Germany during the 1980s, deliberately and repeatedly pointed out the importance of geological and pedological soil features for interpreting soil P contents (e.g., Zölitz and Heinrich 1990; Zölitz 1980, 1982, 1983). However, the appeal to consider these features made the method much more demanding and required soil scientific background knowledge that archaeologists often do not have (Weihrauch et al. 2019; Holliday and Gartner 2007). Perhaps for this reason, investigators may have reduced pedological complexity in geoarchaeological P prospections by limiting soil and bedrock variation to a minimum. Thus, homogeneous areas with minor or without geological and/or pedological changes were often studied (Bakkevig 1980).
In a homogeneous research area, anthropogenic P enrichment is often easier to note. To evaluate if an area has been anthropogenically enriched with P, we require knowledge of how much P the investigated soils contain by nature (Wells 2010). If this background value, "natural" or "native" P content (Holliday and Gartner 2007; Lienemann and Tolksdorf-Lienemann 1992; Gundlach 1971) is exceeded, we can argue that an area has received additional P, probably by anthropogenic inputs (Arrhenius 1955; Lorch 1941a).
The native P content results mainly from bedrock. Rocks contain P in primary form (mainly apatite minerals), which is set free during weathering and then converted into different pedogenic P forms (Weihrauch and Opp 2018). Besides the P originally contained in a soil, soils can bind and retain P introduced later. Such secondary P would successively raise the soil's P content and make it increasingly exceed the native P content (Zimmermann 2001; Gebhardt 1982; Proudfoot 1976). Hence, the larger the difference between the detected P contents and the native P content of a soil, the more intense the input of secondary P has probably been (Jungmann 2002; Bakkevig 1980; Lorch 1940).
Because a homogeneous environment has one native soil P content, P enrichment can be convenient to capture in such an area. However, when we study certain archaeological site classes (e.g., Iron Age hillforts) in particular geographic areas (e.g., the German middle mountain ranges), we often do not deal with homogeneous environments. Instead, we find archaeologically relevant sites distributed over pedogenically heterogeneous areas. We define those as characterized by the variation of one or several of the following features: (1) bedrock and soil-forming substrata in the underground, (2) surface morphology and slope inclination, and (3) recent land use and vegetation on the surface.
In heterogeneous environments, soil P enrichment is difficult to detect for several reasons (Kondratiuk and Banaszuk 1993). First, native P contents differ across such an area according to different bedrocks/substrata. Second, soils can be naturally P-enriched as a result of the deposition of P-containing colluvial slope sediments or due to underground P translocation related to surface morphology (Weihrauch 2018; Fisher 2015). By contrast, soils can be P-depleted in some areas (e.g., on topslopes) because of erosion or in some sections (e.g., topsoils) because of underground P translocation (Stewart et al. 2017; Bol et al. 2016). Third, soils can be P-enriched as a result of archaeologically irrelevant, possibly rather recent anthropogenic processes like modern agriculture or forestry (Amberger 1996; Zölitz and Heinrich 1990). Fourth, soil P contents may be altered due to recent dynamics of soil water (Weihrauch 2018; Reed and Wood 2017; Reddy et al. 2005).
Considering all these potential alterations of archaeologically relevant P enrichment, the methodological question arises if heterogeneous areas are adequate to investigate with a soil P prospection or if they should rather be avoided. We conducted a case study in the heterogeneous environment of the Iron Age hillfort on the Altenburg mountain near Niedenstein (Hesse, Germany) to address this question. We hypothesize that heterogeneous environments can be studied successfully with a soil P prospection but that they require a more sophisticated approach than homogeneous areas. In the present paper, we propose such an approach.

The Altenburg Hillfort
The "Altenburg" (N 51°14′51″; E 9°17′26″) is a prominent solitary mountain (450.7 m above sea level) in the vicinity of Niedenstein approximately 12 km southwest of Kassel (Hesse, Germany; Fig. 1). During the twentieth century, archaeological investigations revealed that the Altenburg had been settled intensively during prehistory (Söder 2004; Hofmeister 1930). Two settling periods were discerned based on the material finds. The first period was dated to the Late Neolithic Michelsberg Culture (4400-3500 BC), with its focus in stage IV according to Lüning (i.e., 3800-3700 BC; Meyer 1994). The second settling period was in the Late Iron Age (Latène Period, 450 BC until 15 BC).
From the Neolithic, stone axes, knife blades, and ceramic fragments were found on the Altenburg plateau. Hence, it was deduced that only the plateau had been used during the first settling phase (Meyer 1994). However, the most intense anthropogenic activity occurred during the last two centuries BC when a hillfort was built on the Altenburg mountain. During this time, land use was extended further into the area surrounding the mountain.
The Altenburg hillfort is among the most remarkable archaeological sites of the Late Latène Period in Hesse, judging from the material finds and the archaeological structures. Many relatively well-preserved prehistoric structures have been discovered, e.g., ca. 3000 m of ramparts (often with wooden supports) and eleven cisterns with wooden floors. The cisterns were 5 × 7 m large and ca. 1.8 m deep. They were excavated during the first decade of the twentieth century. One of them was excavated again in 1991 and its wooden floor was dendrochronologically dated to 193 BC (Söder 2004). Additionally, two wooden doors were discovered during the first excavation campaigns, one with a dendrochronological age of 101 BC (Raetzel-Fabian 2001; Hofmeister 1930). With regard to the conservation of these finds, there are few comparable material remains from prehistoric times in Europe, two of them discovered in Switzerland (i.e., Robenhausen and Pfäffikon-Burg) and one in Great Britain (Glastonbury Lake Village, Somerset). Numerous other interesting finds have been recovered from the cisterns, for example horse harnesses, forging tools, belt hooks, and fibulae.
Moreover, the Altenburg hillfort is archaeologically remarkable because of a multitude of deposits (e.g., tools, horse harnesses), which were discovered on the mountain plateau and on its slopes (Söder 2010). The large number and heterogeneity of archaeological artifacts and structures indicate that the Iron Age Altenburg hillfort was densely populated and had relevance for a larger region. For the Iron Age period, some damage by fire was detected, but the settlement persisted. It is unclear why the Altenburg hillfort was finally abandoned. Deliberate destruction has not been proven so far.
Due to the Altenburg hillfort's outstanding relevance for the prehistoric archaeology of Hesse, the investigations on the site were revived in 2014 by a cooperative research project of the Hessian State Office for the Preservation of Monuments (LfDH), the Institute of Prehistoric Archaeology and the Department of Geography (both Marburg University, Hesse, Germany).
During extensive field surveys and from the evaluation of digital terrain models and LiDAR scans, we discovered a further, almost complete rampart ring (henceforth termed the "outer rampart") surrounding the Altenburg hillfort in its formerly assumed extension plus an additional area of ca. 80 ha (Fig. 2). Latène Period artifacts were found in the interspace between the former and the new "outer rampart," e.g., ceramics, fibulae ("Nauheim" type), tools, and household items like a cauldron chain (unpublished). These finds indicate that the interspace belonged to the Iron Age Altenburg hillfort. Hence, the traditional conception of the settlement might have been incomplete: Instead of ca. 70 ha, the settlement might have been ca. 150 ha large. Thus, it would have been one of the largest Iron Age settlements surrounded by a rampart ring in Hesse (Georg et al. 2017).
To corroborate these important findings, we carried out several field surveys and prospections (e.g., examinations of root sections of fallen trees, geophysical and geomagnetic prospections). As "in depth" evidence from the local soils is still missing, we carried out an extensive soil P prospection to find further proof for our assumption regarding the Altenburg hillfort's size. Such new results might motivate systematic geoarchaeological excavations in the future.
The archaeological research questions of our study are as follows: (1) Does the interspace between the old and the newly identified outer rampart belong to the Iron Age Altenburg hillfort? (2) What was the settlement area used for? Were there different spatial sections regarding land use?
The Altenburg mountain forms a heterogeneous environment with a diverse surface morphology including slope inclinations < 2° on the plateau, from 22 to 27.5° on the steep flanks, and from 7 to 11° on the flatter slopes of soil regions 1.1-1.3, 1.6-1.8, 2.1-2.3, and 2.7 (Fig. 3). The study area is characterized by a marked difference between P-rich volcanic bedrocks/substrata (basalt, basaltic tuffs) and rather P-poor materials of Mesozoic and Quaternary origin (e.g., sandstones, loess) (HLfB 1971). Due to the prehistoric anthropogenic occupation of the mountain, all the soils include relatively thick colluvial layers. We found mainly Cambisols, but also Luvisols (in the loess areas) and locally Stagnosols (formed from basaltic weathering clays in region 1.2 and on the plateau), all with colluvic features (FAO 2015). The local soils predominantly have a silty-clayey texture in the soil regions 1.2-1.8 and 2.1-2.7, and a sandy-loamy texture in region 1.1. Today, the Altenburg mountain is mainly used for forestry with deciduous forest dominated by beech (Fagus sylvatica). In addition, soil region 1.8 is agricultural land, and region 2.7 is grassland, which is temporarily grazed by cattle.

Material and Methods
The archaeologically relevant area surrounding the Altenburg plateau is too large to investigate completely in a cost-efficient manner. Hence, we conducted a two-dimensional soil P prospection (Weihrauch et al. 2016). We consider this the most efficient way to get a detailed overview of the soil P contents of the large area. The linear dimension (i.e., transects) makes it possible to detect differences between the land use segments formed by the several ramparts and the area outside the outer rampart. The vertical dimension (i.e., soil profiles) is required as the archaeologically relevant soil depth is not known. This depth probably differs across the investigated area because surface morphology is heterogeneous and different intensities of erosion and colluvial deposition are to be expected. Moreover, P can be translocated underground (Weihrauch 2019; Zölitz 1983), so archaeologically relevant P need not remain in its original position. Thus, choosing a uniform sampling depth for the whole study area would most likely give misleading results (Bakkevig 1980).

Field Work
We created two transects of soil profiles that intersect approximately in the middle of the Altenburg plateau (Fig. 2). Each transect begins in the area in front of the newly discovered outer rampart, cuts through the entire archaeologically relevant area, and ends behind the outer rampart at the other side of the mountain. The transects were positioned in a way that they cut the plateau diagonally, i.e., at its maximum diameter. Transect 1 is 1721 m long and consists of 27 soil profiles. Transect 2 has a length of 1531 m and is comprised of 19 soil profiles.
As there are several rampart "rings" around the Altenburg plateau, each transect crossed different segments of the archaeologically relevant area (Fig. 3). Theoretically, a different prehistoric land use can be assumed for each of these segments. To depict such land use differences, a sufficient spatial representation of the segments is required (Wells 2010). We placed at least two soil profiles within each segment (i.e., for small segments). The larger segments are represented by three to four soil profiles, and the plateau by five profiles. No profiles were created in segments that are very small, located in steep terrain or that are densely covered with basalt rocks at the surface. The distances between the profiles were chosen according to segment size, site accessibility (e.g., trees, shrubs), and the features of the recent surface (e.g., avoiding holes from fallen trees or animal activity). Hence, within the transects, the soil profiles were distributed with irregular distances from each other.
As the Altenburg hillfort is an archaeologically sensitive area, we were not legally allowed to open trenches. To investigate the large area in a cost- and time-efficient way, we used an auger (Pürckhauer type; 1 m length, 2 cm diameter) to create the soil profiles. They were cored as deep as the underground features (e.g., stoniness) allowed, with an average length of 98 cm. The positions were localized by handheld GPS (Garmin Etrex 30). Each soil core was documented soil scientifically according to Ad-Hoc-AG Boden (2005), considering soil stratification and each soil horizon's P-relevant features. The soil cores were sampled in 10 cm sections (i.e., maximally ten samples per core). Samples were taken with a field knife and were put into plastic bags where they were stored airtight until further processing.
Prior to laboratory analysis, the samples were air-dried for 2 weeks at ambient temperature, then ground in a mortar, and sieved (2 mm mesh). The fine fraction (i.e., < 2 mm) was used for further analysis. The coarse fraction (i.e., rock fragments > 2 mm) was analyzed optically to identify common rock types (according to the geological map; HLfB 1971). For very small rock fragments, a microscope was used. This attribution of rock types was necessary to identify "soil regions." As we observed some deviation between the spatial extension of bedrocks in the geological map and our soil profiles during field work, we grouped our soil profiles according to "soil regions" during data evaluation. These soil regions do not match the spatial extension of bedrocks in the geological map but correspond to which bedrocks we found in the deepest sections of our soil cores. As the soils within one soil region are influenced by the same type of bedrock, they are likely to have a comparable native P content.
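The grouping rule described above (profiles are assigned according to the bedrock found in the deepest section of each core) can be illustrated with a minimal sketch. The profile IDs, depths, and rock types below are invented for illustration, and the sketch groups by rock type only; the soil regions of the study are additionally spatial units along the transects, which this simplification does not capture.

```python
from collections import defaultdict

# Hypothetical core records (invented values): profile ID -> list of
# (upper depth of the section in cm, rock type identified in its coarse fraction).
cores = {
    "P01": [(0, "loess"), (40, "loess"), (80, "basalt")],
    "P02": [(0, "loess"), (50, "sandstone")],
    "P03": [(0, "basalt"), (90, "basalt")],
}


def deepest_rock_type(sections):
    """Return the rock type recorded for the deepest section of one core."""
    return max(sections, key=lambda section: section[0])[1]


def group_by_deepest_bedrock(cores):
    """Group profile IDs by the bedrock found in the deepest core section."""
    groups = defaultdict(list)
    for profile_id, sections in cores.items():
        groups[deepest_rock_type(sections)].append(profile_id)
    return dict(groups)


regions = group_by_deepest_bedrock(cores)
```

Note that a profile with loess in its upper sections (P01) is still attributed to basalt here, because only the deepest section decides the attribution.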

Laboratory Analyses
In contrast to earlier soil P prospection research (e.g., Lienemann and Tolksdorf-Lienemann 1992; Kiefmann 1978; Jakob 1955; Arrhenius 1931), we know today that soil P is not necessarily stable (e.g., Weihrauch and Opp 2018, and cited literature). Due to different processes, it can become easily soluble again in a soil and is then not archaeologically relevant. Hence, we proposed a fractioned soil P prospection for geoarchaeological purposes (Weihrauch et al. 2016, 2019). In this approach, not only one but several P fractions are determined in the laboratory. This allows us to draw conclusions about the recent soil P dynamics, which may overlay and distort archaeologically relevant soil P contents and patterns. Our P fractionation determines three P fractions differing in solubility. The laboratory methods and the rationale behind the fractions are explained in Weihrauch et al. (2019). Briefly, easily soluble P-dHCl is extracted with diluted (i.e., 0.1 M) hydrochloric acid (HCl), slightly soluble P-cHCl with concentrated (i.e., 12.1 M) HCl, and recalcitrant P-AR with aqua regia (i.e., 12.1 M HCl and 14.4 M nitric acid/HNO3 in a ratio of 3:1; Table 1). The first P fraction depicts rather recent site conditions (e.g., hydromorphosis) and is archaeologically not relevant. The second fraction depicts slightly soluble P, which is considered secondary. This means that it could have been introduced into the soil secondarily (e.g., by prehistoric people) and was then bound and retained for a long time. The third fraction comprises both the other fractions, but also depicts primary P (i.e., apatite) from bedrock. Hence, this fraction can help to identify when large P contents are geogenic and not archaeologically relevant.
The first and second fractions were measured as phosphate on a spectrophotometer at 700 nm (Genesys 10S; Thermo Fisher Scientific, Bremen, Germany). The third fraction was measured as elemental P on an ICP-MS (X Series 2; Thermo Fisher Scientific). Unfortunately, some of our data proved erroneous during evaluation as the ICP-MS had a defect at the later stages of our measurements. Erroneous data were excluded from our dataset and do not contribute to the results. As the ICP-MS defect occurred during the measurements of samples from transect 2, the respective dataset is smaller and includes some data gaps. Altogether, we analyzed 254 samples from transect 1 (i.e., 762 analyses; eight results excluded) and 185 samples from transect 2 (i.e., 555 analyses; 37 results excluded).
Soils contain different kinds of phosphates rather than elemental P. However, phosphate contents can plausibly be converted mathematically into P contents. Hence, for comparability, we converted all our results into the unit mg P/kg of soil. In the present paper, we report soil P contents and refer to a "soil P prospection." However, the principle of the method is the same as in a "soil phosphate prospection."
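The reported numbers of analyses follow from three P fractions being determined per sample. The short check below reproduces them; the counts of retained results are derived here by simple subtraction and are not stated in the text.

```python
FRACTIONS_PER_SAMPLE = 3  # P-dHCl, P-cHCl, and P-AR are determined for each sample

samples = {"transect 1": 254, "transect 2": 185}   # as reported above
excluded = {"transect 1": 8, "transect 2": 37}     # results excluded as reported above

# Total analyses per transect: one result per fraction and sample.
analyses = {t: n * FRACTIONS_PER_SAMPLE for t, n in samples.items()}

# Derived (not stated in the text): results remaining after exclusion.
retained = {t: analyses[t] - excluded[t] for t in analyses}

for transect in samples:
    print(transect, analyses[transect], retained[transect])
```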
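As a minimal sketch of such a conversion, the block below scales a phosphate content by the mass fraction of P in the phosphate species. It assumes that the photometric results are expressed as orthophosphate (PO4); the text does not state the reporting form, and a different form (e.g., P2O5) would require a different factor. The function name is hypothetical.

```python
# Standard atomic masses in g/mol (rounded).
M_P = 30.974
M_O = 15.999

# Assumption: results are expressed as orthophosphate (PO4).
M_PO4 = M_P + 4 * M_O

# Mass fraction of P in PO4 (about 0.326).
P_PER_PO4 = M_P / M_PO4


def phosphate_to_p(mg_po4_per_kg):
    """Convert a phosphate content (mg PO4/kg soil) into a P content (mg P/kg soil)."""
    return mg_po4_per_kg * P_PER_PO4


# Hypothetical example value, not taken from the study.
print(round(phosphate_to_p(1000.0), 1))
```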

Statistical Analysis
As indicated above, the three P fractions are cumulative: The second fraction contains the first, and the third fraction contains the first and second fractions. Because the first fraction is archaeologically not relevant, we subtracted it from the other fractions and only report the resulting differential P fractions hereafter. Slightly soluble P (P-sl) is the difference between the first and second fraction. Recalcitrant P (P-rc) is the difference between the first and third fractions (Table 1; Weihrauch et al. 2019).
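The subtraction described above can be sketched as follows. The function name and the example values are hypothetical; the only logic taken from the text is that the first (easily soluble) fraction is subtracted from the second and from the third fraction.

```python
def differential_fractions(p_dhcl, p_chcl, p_ar):
    """Return (P-sl, P-rc) in mg P/kg from the three cumulative fractions.

    p_dhcl: first fraction (diluted HCl), archaeologically not relevant
    p_chcl: second fraction (concentrated HCl), contains the first
    p_ar:   third fraction (aqua regia), contains the first and second
    """
    p_sl = p_chcl - p_dhcl  # slightly soluble P
    p_rc = p_ar - p_dhcl    # recalcitrant P
    return p_sl, p_rc


# Hypothetical values for one 10 cm sample (mg P/kg), not taken from the study.
p_sl, p_rc = differential_fractions(p_dhcl=120.0, p_chcl=650.0, p_ar=900.0)
print(p_sl, p_rc)
```

Because only the first fraction is subtracted from the third, P-rc as defined here still includes P-sl.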
We stated above that we positioned our soil profiles according to the land use segments formed by the (possibly prehistoric) ramparts. During soil coring, it became clear that these segments do not match the spatial distribution of bedrocks (Fig. 3). During data evaluation, we first grouped the soil profiles in "soil regions" according to the prevalence of bedrocks (Table 2). Then, we classified the soil profiles in "land use segments" according to the area's segmentation by ramparts (Table 2). From this, we evaluate the anthropogenic enrichment with phosphate in the spaces between the ramparts. In addition to these groupings, data were grouped in different ways for different purposes. The respective procedures and the rationales behind them are explained and discussed in the upcoming sections. Basic statistical operations were conducted in Microsoft Excel (Microsoft, Redmond). Means of grouped data, respective standard errors, and significances of differences between means (Tukey's HSD test) were calculated in SPSS 25 (IBM Corporation, Armonk). Significances were determined for the p ≤ 0.05 level.

Table 1 (continued; the beginning of the table is not recovered)
(preceding row) Calculation: Mean of all the P content data within the respective soil section
MIA: Mean P content of the most influenced area; all the non-LIA soil sections, exclusive of the recent topsoils and the next 10 cm below them; interpreted as depicting average absolute anthropogenic P enrichment of a soil. Calculation: Mean of all the P content data within the respective soil sections
MER: Mean enrichment ratio; average relative anthropogenic P enrichment of a soil profile in relation to the native P content (LIA); relates to soil sections (e.g., useful for soils with colluvial layers). Calculation: ((100% × soil profile MIA)/soil region LIA) − 100
XER: Maximum enrichment ratio; maximum relative anthropogenic P enrichment of a soil profile in relation to the native P content (LIA); relates to a distinct soil depth (e.g., useful for past surfaces). Calculation: ((100% × soil profile maximum P content)/soil region LIA) − 100
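The two enrichment ratios defined in Table 1 can be sketched in a few lines. The function names and all numerical values below are hypothetical; the formulas follow the table, with MIA taken as the mean P content of the most influenced soil sections of a profile and LIA as the native P content of the respective soil region.

```python
from statistics import mean


def enrichment_ratio(value, lia):
    """Relative enrichment in percent: ((100 * value) / LIA) - 100."""
    if lia <= 0:
        raise ValueError("LIA (native P content) must be positive")
    return 100.0 * value / lia - 100.0


def mer(mia_section_values, lia):
    """Mean enrichment ratio: enrichment of the profile MIA relative to the LIA."""
    return enrichment_ratio(mean(mia_section_values), lia)


def xer(profile_values, lia):
    """Maximum enrichment ratio: enrichment of the profile maximum relative to the LIA."""
    return enrichment_ratio(max(profile_values), lia)


# Hypothetical example (mg P/kg), not taken from the study.
lia = 400.0                                   # native P content of the soil region
mia_sections = [600.0, 800.0, 700.0]          # most influenced sections of one profile
profile = [450.0, 600.0, 800.0, 700.0, 410.0]  # all sections of the same profile

print(mer(mia_sections, lia), xer(profile, lia))
```

Both ratios are expressed relative to the regional native P content, which is what makes profiles from soil regions with different bedrocks comparable.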

Results
The mean P contents of the studied profiles differ significantly in the two transects (Fig. 4). From this coarse depiction, a focus of high P contents in the center of the transects is visible. The respective areas correspond to the center of the Altenburg hillfort on the mountain's plateau. The P contents are significantly lower in the nonplateau regions. For transect 1 (Fig. 4a), this is especially true for the western profiles outside the outer rampart (i.e., possibly outside the settlement). In the eastern part of transect 1, the P contents are higher and indicate more intense land use than in the west. P contents clearly tend to decline successively with increasing distance from the settlement center on the plateau. This trend might indicate a decrease of land use intensity (e.g., from settling to agriculture) toward the settlement's periphery.
For transect 2 (Fig. 4b), the lateral distribution of the mean P contents is more homogeneous than in transect 1. The nonplateau areas on both sides of the mountain have comparably low mean P contents. This rather indicates a clear difference between intensively P-introducing land use (i.e., settling) on the Altenburg plateau and less intensively P-introducing land use (e.g., agriculture, certain crafts) in the nonplateau areas. Moreover, the soils outside the outer rampart are not clearly lower in P than those in the nonplateau area within the outer rampart. Hence, for transect 2, we cannot deduce that the area around the plateau has been settled like the plateau.
These conclusions from the lateral differences of the P contents would be meaningful for a soil P prospection in a homogeneous environment. They are not so for our study because the different bedrocks/substrata from which the soils in our transects have developed lead us to expect different native P contents. Thus, we cannot evaluate the P contents in an overarching way for the whole of our heterogeneous study area. Instead, we must put the measured P contents in the context of the soils' native P contents. Finding the native P contents is generally a challenge in geoarchaeological soil P prospections.

Deducing the Native P Contents
In former soil P prospections, reference samples were commonly taken to find out the native soil P content of a study area (e.g., Duncan 2000; Bakkevig 1980; White 1978; Sjöberg 1976; Rottländer 1970). Such reference samples were taken near the archaeologically relevant study areas but in locations where anthropogenic influence was expected to be small or (at best) zero (Ethelberg 2008; Zimmermann 2008). In settled areas, it is mostly uncertain whether there are any subareas without anthropogenic P inputs at all. Problems also arise when investigating areas that are spatially limited, e.g., hillforts limited by the extension of the mountain plateau. In such cases, one could on the one hand sample the mountain's slopes. However, due to the often steep relief and related erosion, the P contents of the slope soils might not represent the native P contents of the plateaus well. On the other hand, one could seek a comparable location further off, e.g., a neighboring mountain of the same bedrock. Finding such a place which was assuredly not used by people before is a huge challenge in the densely settled German cultural landscape. Moreover, even mountains of the same bedrock might differ in soil P contents (e.g., due to different vegetation, management history). Hence, finding proper areas to take reference samples is difficult for soil P prospections on mountain sites.
For the Altenburg hillfort, the situation is aggravated by the fact that the plateau is surrounded by areas that were also used by the prehistoric people. Hence, these areas are not useful as reference locations. There are some smaller mountains of comparable bedrock in the Altenburg's vicinity (HLfB 1971), but we consider them too uncertain to serve as reference sites due to their unclear land use history. As a further challenge, we need to know not just one native P content but several, as our transects cross several bedrock/substratum type regions (Fig. 3). Many of these materials are relatively common around the Altenburg mountain, but the land use history of these areas is also unclear.

Table 2 Attribution of soil profiles to "soil regions" and "land use segments"

Seeking Guidance in the Literature
An overview of native P contents can sometimes be drawn from the literature (Table 3). Still, information was not retrievable for all our bedrocks/substrata. Besides, bedrock P contents are mostly determined on the basis of total P methods (e.g., X-ray fluorescence-based), which are in quantity not equal to soil P contents despite possible correlations (Pecoroni et al. 2014). Moreover, bedrock total P contents mainly depict primary P forms, but we require secondary P forms for geoarchaeological P prospections. It is also necessary that data were produced with the exact same methodology to be fully comparable, which is a problem for our specific P fractions (i.e., P-dHCl, P-cHCl). Furthermore, a comparable sampling time is required to compare results because there might be different shares of easily soluble P at different times. Thus, we do not consider our data to be adequately comparable with those by Pecoroni et al. (2014).
Another aspect is the spatial provenience and representation of data considered for a comparison. For instance, the P contents reported by Pecoroni et al. (2014) are for topsoils in Hesse. As our case study investigates entire soil profiles, topsoil data are not sufficient to deduce a native P content because the subsoils would be neglected. Besides, P dynamics are mostly intense in the topsoils (e.g., large proportions of easily soluble P due to mineralization), and there are multiple distorting recent influences (e.g., agriculture). Thus, topsoils are often not well suited for geoarchaeological P prospections (e.g., Zölitz 1983;Gundlach 1971;Lorch 1939).
Given all these preconditions, it becomes clear that, for geoarchaeological research, suitable data for comparison are mostly not available due to different field and/or laboratory methodology and different research focuses or study areas. Hence, we should strive to systematically build up a database of comparable soil P data with future geoarchaeological soil prospections.
As we could not find adequate data in the literature to relate our soil P contents to, we determined the native P contents on the basis of our data. For this, the classification of our study area into soil regions is crucial. Table 4 and the boxplots in Fig. 5 depict the distribution of measured P contents within our soil regions. In transect 1, the median P contents clearly differ between the regions. In transect 2, the differences are often small. This indicates on the one hand that our transects are relatively different in terms of the spatial soil P distribution despite their proximity. On the other hand, it shows that the natural P contents between regions of the same bedrock/substratum (as identified from the geological map) may diverge.
Table 3 (fragment): Loess, 484.4, Hesse (Germany), Rosenberg and Sabel (1996), cited by Pecoroni et al. (2014). Note a: median; mg total P per kg dry matter; determined via X-ray fluorescence analysis.
As a steep basalt plateau, the Altenburg mountain is surrounded by soils developed from basaltic debris and respective weathering loams. Each of our transects cuts through this type of substratum twice (i.e., regions 1.2, 1.7, 2.1, 2.3). However, these regions' P contents (e.g., their medians) are so different from each other that combining them would cause unwanted inaccuracies in deducing the native P contents (Fig. 6a). This is plausible as the geological map groups materials resulting from the same primary and/or secondary process type (e.g., volcanic intrusion, solifluction), but not according to process events or geobiochemical features (e.g., distinct minerals and their proportions).
For instance, as the abovementioned basalt debris-containing loams resulted from solifluction during the Pleistocene (HLfB 1971), they strongly depended on insolation and relief. There was probably more intense thawing and resulting solifluction on the southern compared to the northern slopes. Hence, the proportion of contained basalt debris may differ and the resulting substrata may diverge in their native P contents even though geologically formed in a comparable manner. Thus, the information given in a geological map does not let us deduce the native P contents of the resulting soils, which also result from geographic site features (e.g., surface morphology, exposition). These site features are potentially more comparable the closer the respective regions are to each other. For instance, the Altenburg plateau consists of basalt (region 1.4a) and of weathering clays developed from basaltic debris and basaltic tuff (regions 1.4b and 2.4). The data of these regions are comparable (Fig. 6b). Hence, we grouped the plateau regions to deduce their native P content.
These results indicate that even geological comparability must be considered critically. Hence, it seems favorable to deduce the native P contents from our own data instead of from not fully adequate external literature sources. However, data quality of the statistical sample would then strongly influence the resulting native P contents. For instance, if our entire study area was anthropogenically altered, we would not find a real "native" P content. To avoid such distortions, we took soil stratigraphy into account.

An Easy Solution from Soil Stratigraphy?
It is often observed that, in natural soils, P contents are highest in the topsoils and decline with increasing depth as secondary P inputs happen on the soil surface (Weihrauch and Opp 2018; Blume et al. 2016; Bol et al. 2016). Hence, the deepest of our sampled soil sections might be hypothesized to display the native soil P content. However, it cannot be generally assumed that soil P contents decline with increasing soil depth. Instead, soils developed from P-bearing bedrocks/substrata like basalt and basaltic tuff may increase in P with increasing depth where the primary P source is closer (Weihrauch et al. 2016). Moreover, hydromorphic conditions might foster P enrichment with depth. Impeded drainage enables chemical P mobilization and its subsequent translocation with the soil water, mostly vertically downwards in the profile (Fig. 7; Weihrauch and Opp 2018; Godlinski et al. 2008; Reddy et al. 2005). Finally, the depth of the prehistoric surface was not clear in our study. Locally, it might be located in significant depth as we documented colluvial layers on the plateau with an average thickness of ca. 80 cm. Thus, we conclude that the deepest soil sections should not generally be taken as having the native P content.
Table 4 Statistical overview of P contents in different soil sections according to soil regions. Due to data blanks for P-rc, the cumulative calculation of statistical parameters resulted in a lower value for P-rc than for P-sl. With respect to the standard errors, data are reported here but it should be considered that, by logic, values for P-rc must be larger than those for P-sl. 1 P-sl = slightly soluble P (P-cHCl minus P-dHCl); 2 SE = standard error of mean; 3 Sig. = significance; different letters indicate significant difference between means (p ≤ 0.05); letters should only be compared within the same line; 4 LIA (least influenced area) = soil sections with the smallest P contents in each region, as identified from the standard profiles; 5 MIA (most influenced area) = soil sections not belonging to LIA and excluding the recent topsoils (and the adjacent 10 cm); 6 P-Max = maximum P content in the most influenced soil sections excluding the recent topsoils and the next 10 cm below; 7 P-rc = recalcitrant P (P-AR minus P-dHCl)

Deducing a Solution from Own Data
Instead of deductively terming the deepest soil sections the least anthropogenically influenced, we analyzed our P data inductively to identify the native P contents. According to our sampling strategy, we calculated the mean P contents for all samples of a depth section (e.g., 0-10 cm) within a soil region. Combining the means for all soil sections, we can depict the average vertical P distribution in each soil region in "standard profiles" (Figs. 8, 9, and 10). These show that the mean P contents of the lowermost soil sections (dotted lines) do not necessarily represent the sections with the smallest P contents. Thus, we instead determined the mean P concentration of all samples within the soil horizon with the lowest P contents. To identify the respective soil horizon, we used the standard profiles and our field documentation of soil stratigraphy. For example, if the standard profiles showed the smallest P contents at 40-90 cm of depth, we only used the data from 40 to 70 cm if this range was defined as a distinct horizon according to soil stratigraphy. The resulting soil sections represent the least influenced area (LIA) of the soils as inductively determined from our data (Table 1). We then grouped all the LIA data from the profiles within one soil region and calculated the average. These LIA mean P contents are considered to be the native P contents of the soil regions (Table 4). As depicted in Figs. 8, 9, 10, and 11, the LIA means (dashed lines) often differ significantly from the mean P contents of the lowermost soil sections.
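The inductive procedure can be illustrated with a minimal Python sketch. It is not the code used in the study: the data layout, the function names, and all numbers are hypothetical, and the LIA depth range (here 40-70 cm) is assumed to have been fixed beforehand from the standard profile and the field stratigraphy. The regional LIA mean is computed here as the average of the per-profile LIA means, which is one possible reading of the averaging step.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples of one soil region:
# (profile id, section top in cm, section bottom in cm, P content).
# All numbers are invented for illustration.
SAMPLES = [
    ("A", 0, 10, 900.0), ("A", 40, 50, 400.0), ("A", 60, 70, 420.0), ("A", 80, 90, 700.0),
    ("B", 0, 10, 800.0), ("B", 40, 50, 380.0), ("B", 60, 70, 400.0), ("B", 80, 90, 650.0),
]


def standard_profile(samples):
    """Mean P content per depth section over all profiles of a region."""
    by_section = defaultdict(list)
    for _profile, top, bottom, p in samples:
        by_section[(top, bottom)].append(p)
    return {section: mean(values) for section, values in sorted(by_section.items())}


def region_lia_mean(samples, lia_top, lia_bottom):
    """Average of the per-profile means of all sections inside the LIA depth range."""
    by_profile = defaultdict(list)
    for profile, top, bottom, p in samples:
        if top >= lia_top and bottom <= lia_bottom:
            by_profile[profile].append(p)
    return mean(mean(values) for values in by_profile.values())


profile_means = standard_profile(SAMPLES)
lowermost = profile_means[max(profile_means)]  # mean of the deepest section
native_p = region_lia_mean(SAMPLES, 40, 70)    # LIA horizon assumed at 40-70 cm
print(lowermost, native_p)
```

In this invented example the lowermost section has a clearly higher mean than the LIA, which mirrors the observation that the deepest section need not carry the smallest P content.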
In terms of stratigraphy, the LIA forms the upper part of the soil profiles in some of our soil regions (Figs. 8, 9, and 10). This might be due to intensive erosion and colluvial deposition in our study area. Our investigated soils are on average comprised of ca. 71 cm thick colluvial slope sediment layers. Hence, the area's surface must have been altered significantly in the past. After the settlement has been given up, repeated intensive erosion would at first translocate soil material enriched with P during the settling period. Later, erosion would increasingly translocate soil material from deeper sections that were possibly slightly or not P-enriched at all (i.e., having the native P content). Hence, a kind of "soil P profile inversion" could have taken place that deposited the least enriched soil sections of another site at our profiles' tops, i.e., above the P-enriched layers. Such intensive erosion most likely happened in areas with steep relief, like the Altenburg plateau's margin and top slopes. As we did not core in these areas due to their steepness, we did not find soil profiles shortened by erosion. Instead, we investigated the Altenburg's middle and toeslope sites where the colluvia were deposited. Therefore, we might have found the smallest P contents (LIA) in the upper profile sections in those soil regions which neighbor the plateau (i.e., regions 1.3, 1.5, 1.6, 2.3, 2.5; Fig. 3).
We also identified the LIA in the upper profile sections of the plateau itself (soil region 1 + 2.4). Judging from its relatively flat recent relief, one would rather exclude intensive erosion. Yet, our soils' stratigraphy and the large thickness of colluvial layers indicate that the summit area of the Altenburg mountain might not have been as even and plateau-like in prehistory as today. As the plateau formed the central area of the Altenburg hillfort, it was likely most intensively influenced by different anthropogenic activity. It is difficult to judge from our auger cores if layers are really colluvial, and if the "colluvial layers" resulted directly and intentionally from anthropogenic soil construction (i.e., artificially applied soil material, e.g., to create buildings on). Otherwise, the colluvial slope sediments could have been indirectly (and unintentionally) anthropogenic, i.e., caused by natural processes (e.g., rainfall) but enabled by anthropogenic alteration of the surface (e.g., clear-cutting of vegetation).
Fig. 8 "Standard profiles" depicting the mean P contents (columns; based on n soil samples) according to soil depth in transect 1, soil regions 1.1-1 + 2.4 (a-d). Bars represent standard errors of means. Different letters indicate significant differences between means (p ≤ 0.05). Dotted lines indicate the mean P content of the lowermost soil sections; dashed lines mark the respective LIA mean P content
Fig. 9 "Standard profiles" depicting the mean P contents (columns; based on n soil samples) according to soil depth in transect 1, soil regions 1.5-1.8 (a-d). Bars represent standard errors of means. Different letters indicate significant differences between means (p ≤ 0.05). Dotted lines indicate the mean P content of the lowermost soil sections; dashed lines mark the respective LIA mean P content
However, there must have been sites on the Altenburg mountain's summit area that were formerly significantly higher than today, by nature or artificially. Such locations would since have been leveled by erosion. Hence, the respective soil profiles would probably be very short, or the basalt rock would be bare of soil material. As we strove for long soil cores during our field work, we did not sample such a shallow site during our study. Yet, we experienced extreme difficulties in coring at the plateau's northwestern margin due to the stony subsurface, as the positions of our soil profiles 10 and 35 indicate (Fig. 3).
Fig. 10 "Standard profiles" depicting the mean P contents (columns; based on n soil samples) according to soil depth in transect 2, soil regions 2.1-2.5 (a-d) (region 1 + 2.4 is shown in Fig. 8). Bars represent standard errors of means. Different letters indicate significant differences between means (p ≤ 0.05). Dotted lines indicate the mean P content of the lowermost soil sections; dashed lines mark the respective mean P content of the least anthropogenically influenced soil sections (LIA)
Surprisingly, the LIA is also in the upper soil sections of our region 2.1, which is farthest from the plateau's steep margin and outside the expected settlement. Still, the studied soil profiles have a ca. 90 cm thick colluvial cover. Hence, results of anthropogenic activity are present even in this relatively remote region. Possibly, the colluvial material now lying in this region formerly was part of a structure (i.e., the outer rampart ring) to delimit or fortify the settlement against its surrounding area.
One might wonder about the differences in the top or bottom position of the LIA between our soil regions (e.g., bottom position in region 2.2, top position in regions 2.1 and 2.3). Here, it must be taken into account that the Altenburg area was relatively densely segmented by ramparts (Fig. 3). These still preserved structures might have blocked erosion from one soil region into the next region downslope. Hence, soil material was probably only translocated within each land use segment (i.e., between two ramparts). Due to this important special feature of the Altenburg hillfort site, we consider it plausible to deduce the native P contents of each region from our data. If soil material had instead been translocated between the regions, our strategy for identifying the LIA could have been misleading. In that case, the colluvial layers should rather be excluded from data evaluation for deducing the native P contents.
Regarding our conceptual definition of the LIA, a general limitation must be underlined. We identified the LIA sections inductively from our data. Our data quality thus determines the correctness of the LIA sections. We cannot prove if we really sampled unaffected material at all or if it was all somewhat anthropogenically altered in this intensively used settlement. Hence, we cannot validate that the LIA sections are anthropogenically unaffected. Especially on the plateau, even the LIA might be anthropogenically enriched with P. Therefore, we interpret the LIA as the "least influenced areas." This can mean they are slightly influenced or not influenced at all. Because the LIA means are mostly within the second quartile of our regions' P data, we think they are plausible, remembering that there are standard errors for the LIA means (Table 4), and bearing in mind that the native P content is generally a range instead of a sharp threshold value. Given the conceptual and practical difficulties of finding proper reference sites for soil P prospections in heterogeneous environments, we consider the LIA approach useful and practical despite its limitations.
Fig. 11 "Standard profiles" depicting the mean P contents (columns; based on n soil samples) according to soil depth in transect 2, soil regions 2.6 (a) and 2.7 (b). Bars represent standard errors of means. Different letters indicate significant differences between means (p ≤ 0.05). Dotted lines indicate the mean P content of the lowermost soil sections; dashed lines mark the respective mean P content of the least anthropogenically influenced soil sections (LIA)

Evaluating P Enrichment
After we identified the native P contents of our soil regions, it is possible to analyze how intensively the studied soils have been enriched with P, i.e., how much the native P contents are exceeded. For this purpose, we need to group our soil profiles according to the land use segments (Table 2) instead of the soil regions because we interpret P enrichment to be due to land use.

Absolute Versus Relative P Enrichment by Mean or Maximum P Content
Conclusions based on absolute P enrichment (i.e., as the difference between the mean or maximum P content of a soil and its native P content) might appear straightforward for soil P prospections in a homogeneous environment. They are not so for a study in a heterogeneous environment because the different bedrocks/substrata and the resulting soils have different capacity to bind and retain P (Weihrauch and Opp 2018). For instance, soils formed from basalt or basaltic tuff are rich in pedogenic oxides and clay minerals (Blume et al. 2016;Galván-Tejada et al. 2014). They can bind more secondary P than soils formed from Mesozoic sedimentary rocks or loess (Pecoroni et al. 2014). This means that, theoretically, under the same intensity of anthropogenic activity, different degrees of P enrichment can result according to the binding capacity of the respective soils. Evaluating only the absolute differences of P enrichment between the segments would not take this into account. Alternatively, relative P enrichment should be assessed.
According to the LIA approach, we calculated a parameter for the more influenced soil sections (MIA; Table 1). We grouped all those data in a soil profile which do not belong to the LIA. As the recent topsoils are mostly not suitable for geochemical prospections (see above), we excluded topsoil data from the MIA. Because there are often large proportions of easily soluble P in the topsoils and these can be readily translocated vertically downwards (Weihrauch 2018;Blume et al. 2016), we also excluded data from the next 10 cm below the topsoils' border. We interpret the MIA means as the average P enrichment of the studied soils which is probably anthropogenic. The MIA represents an equal soil P distribution in vertical (i.e., over several soil depth sections) and lateral terms (i.e., for an entire soil region).
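The selection rule for the MIA can be written down as a short Python sketch. It rests on assumptions that are not in the text: the tuple layout, the function name `mia_sections`, the `buffer_cm` parameter, and all numeric values are hypothetical, and sections are assumed to be aligned so that each lies either fully inside or fully outside the topsoil, the 10 cm buffer, and the LIA.

```python
from statistics import mean

# Hypothetical sections of one soil profile: (top in cm, bottom in cm, P content).
# All numbers are invented for illustration.
PROFILE = [
    (0, 10, 900.0), (10, 20, 850.0), (20, 30, 700.0), (30, 40, 650.0),
    (40, 50, 400.0), (60, 70, 420.0), (80, 90, 700.0),
]


def mia_sections(profile, topsoil_bottom, lia_top, lia_bottom, buffer_cm=10):
    """Keep sections that are neither LIA, nor recent topsoil, nor the buffer below it."""
    selected = []
    for top, bottom, p in profile:
        in_topsoil_or_buffer = top < topsoil_bottom + buffer_cm
        in_lia = top >= lia_top and bottom <= lia_bottom
        if not in_topsoil_or_buffer and not in_lia:
            selected.append((top, bottom, p))
    return selected


# Assumed here: topsoil reaches down to 20 cm, LIA horizon at 40-70 cm.
mia = mia_sections(PROFILE, topsoil_bottom=20, lia_top=40, lia_bottom=70)
mia_mean = mean(p for _top, _bottom, p in mia)  # profile MIA mean
p_max = max(p for _top, _bottom, p in mia)      # maximum outside topsoil and buffer
print(mia, mia_mean, p_max)
```

The same filtered sections yield both the profile's MIA mean and its maximum P content outside the recent topsoil.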
We propose two options for the evaluation of relative P enrichment. (1) First, we calculated the mean enrichment ratio (MER) via MER = ((100% × profile MIA)/region LIA) − 100% (Table 1). It results in a percentage of how intensely a soil profile has on average been enriched with P in relation to the native P content. Hence, MER = 100% means that the native P content has been doubled by secondary P inputs. Conceptually, the MER hypothesizes an equal spatial distribution of P vertically within the soil profiles and laterally within a segment. This parameter seems useful when larger sections of soils may be P-enriched, e.g., colluvial layers in a larger area (Jungmann 2002; Zölitz 1983; Peinemann and Brunotte 1982). We mentioned above that our study area was intensively affected by erosion. Due to the segmentation of the area by several ramparts, colluvial material was probably just translocated intra- but not intersegmentally. As it is not possible to judge which of our colluvial layers are anthropogenically P-enriched at all, it seems plausible to include the entire soil profile samples in calculating the MER (i.e., including the topsoils). Thus, enriched soil material eroded at one site might be taken into account as a colluvial layer in another soil profile of the same segment.
Fig. 12 Mean enrichment ratios (MER) and maximum enrichment ratios (XER) of the soil P contents in transect 1 (a) and transect 2 (b); depicted are the means for each land use segment (Sgm), as well as the distinct values of the soil profiles (Prof). c Interpretation of P enrichment in the spatial context (according to Fig. 2; © N. Georg 2019; data source: Hessische Verwaltung für Bodenmanagement und Geoinformation 2014)
Fig. 13 a, b Ratio of slightly soluble P-sl to recalcitrant P-rc (P-sl:P-rc); means for each land use segment (Sgm), and distinct values of the soil profiles (Prof)
However, it must be borne in mind that the topsoils might be enriched with P from recent sources (Peinemann and Brunotte 1982; Bakkevig 1980), e.g., fertilizer from agriculture in segments 1.8 and 2.7. Hence, our MER displays probable P inputs of all time periods cumulatively. In a case where the archaeologically relevant depths are clear (e.g., 40-80 cm), the MER could be calculated on the basis of a different set of a soil profile's samples (e.g., only from 40 to 80 cm). In our case study, this was not possible as even the topsoils developed from colluvial material and might hence have been P-enriched in prehistory.
(2) Second, we calculated the maximum enrichment ratio (XER) via XER = ((100% × profile maximum P content)/region LIA) − 100% (Table 1). This parameter is a segment's mean of the respective profiles' differences between their maximum P content and the regional LIA value. To neglect possible recent P inputs, we excluded the topsoils and the next 10 cm below them from identifying each profile's maximum P content. The XER can be useful in areas where anthropogenic P inputs only or largely affected one soil horizon or even a certain depth. This may be the case with prehistoric surfaces, e.g., in house layouts with trampled floors (Zimmermann 2001, 2008; Ethelberg 2008). Thus, the XER is rather time-specific as it represents a distinct soil depth. In contrast, the MER is quantified on the basis of P data from larger soil sections. Hence, it incorporates P information from different times.
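Both ratios share the same arithmetic, which a small Python sketch makes explicit. The function name `enrichment_ratio` and the three input values are hypothetical and chosen only so that the numbers are easy to follow; they are not data from the study.

```python
def enrichment_ratio(p_content, region_lia):
    """Relative enrichment in percent: ((100 * P content) / region LIA) - 100."""
    if region_lia <= 0:
        raise ValueError("region LIA must be positive")
    return 100.0 * p_content / region_lia - 100.0


# Invented example values (same unit for all three).
REGION_LIA = 400.0        # native P content of the soil region
PROFILE_MIA_MEAN = 675.0  # mean P content of the profile's MIA sections
PROFILE_P_MAX = 700.0     # maximum P content outside topsoil and buffer

mer = enrichment_ratio(PROFILE_MIA_MEAN, REGION_LIA)  # mean enrichment ratio
xer = enrichment_ratio(PROFILE_P_MAX, REGION_LIA)     # maximum enrichment ratio
print(mer, xer, xer - mer)
```

With these values the profile is enriched by 68.75% on average (MER) and by 75% at its maximum (XER). A content of twice the LIA value gives exactly 100%, in line with the reading that a ratio of 100% means a doubled native P content.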
As the MER and the XER are relative parameters, they can be used for comparing different soil regions and land use segments. We thus calculated the means of the MER or XER for each segment. However, it must be borne in mind that our statistical samples are in part relatively small (n ≤ 4). Larger samples (i.e., more data) would result in more informative results, for instance due to a smaller influence of extreme values on means.
The question arises from which MER or XER value upwards the data depict meaningful P enrichment instead of natural variation. As we calculated the native P contents on the basis of each region's mean of the respective profiles' LIA P contents, these means have standard errors (Table 4). Thus, we calculated the relative proportions of the standard errors in relation to each LIA P content. The resulting percentages are interpreted as the threshold above which P enrichment is meaningful. Hence, we subtracted these percentages from the profiles' MER and XER values.
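The threshold correction amounts to expressing the standard error of a region's LIA mean as a percentage of that mean and subtracting it from each ratio. The following Python sketch shows this under stated assumptions: the function names and all values are hypothetical, and treating a corrected value at or below zero as "not distinguishable from natural variation" is an interpretation of the text, not a rule stated in it.

```python
def relative_standard_error(lia_mean, lia_se):
    """Standard error of the LIA mean as a percentage of the LIA mean."""
    return 100.0 * lia_se / lia_mean


def corrected_ratio(ratio_percent, lia_mean, lia_se):
    """MER or XER reduced by the relative standard error of the regional LIA."""
    return ratio_percent - relative_standard_error(lia_mean, lia_se)


# Invented example: LIA mean 400 with a standard error of 20 (i.e., 5%).
LIA_MEAN, LIA_SE = 400.0, 20.0

clear_case = corrected_ratio(68.75, LIA_MEAN, LIA_SE)    # remains clearly positive
borderline_case = corrected_ratio(3.0, LIA_MEAN, LIA_SE)  # falls below zero
print(clear_case, borderline_case)
```

In this example a ratio of 68.75% stays at 63.75% after correction, whereas a ratio of 3% drops to −2% and would thus lie within the uncertainty of the native P content.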
As mentioned above, the MER indicates the mean distribution of P over larger soil sections (e.g., colluvia). The XER rather shows local vertical P enrichment attributed to a certain depth section (e.g., past surfaces). Hence, the difference between the MER and the XER might be relevant. A large difference might point to a more intensely P-enriched layer within profiles mainly comprised of material with a smaller P content. A small difference might instead indicate that the respective soils are entirely or largely comprised of material which is comparably P-enriched. As the XER is more sensitive to local P enrichment, we consider it more useful for geoarchaeological studies (when the recent topsoils are excluded).
For transect 1, the P-sl MER and XER have comparable trends (Fig. 12). Generally, they have a larger difference in the northwestern part of the transect than in the southeast. Possibly, the first area is rather characterized by vertically localized P enrichment, while the second area was more affected by colluvial deposition.
The mean XER values for P-sl increase toward the Altenburg plateau where P enrichment is largest (mean 104%). The smallest values were found in the segments outside of the outer rampart. Still, even these segments are P-enriched by ca. 29% (average on the basis of profile data). However, the first segments within the outer rampart show a larger XER of 57% (average on the basis of profile data). We conclude that the outer rampart delimited two anthropogenically used areas from each other. The area within the outer rampart seems to have been under more intensive use and was potentially part of the prehistoric settlement.
We found completely different patterns for the MER and XER of P-sl and P-rc (Fig. 12). This underlines that the choice of adequate P fractions is highly relevant for geoarchaeological investigations, especially the fewer fractions are determined. Still, a P fractionation poses the challenge of finally integrating all the (possibly contradictory) results.
In our case study, P-sl rather depicts secondary P enrichment. Probably, the P-sl XER increases toward the Iron Age hillfort's center according to increasing land use and secondary P input intensity. P-rc additionally mirrors the influence of primary P forms (e.g., apatite). These might, e.g., be more dominant in young, less weathered natural soils (Izquierdo et al. 2013;Crews et al. 1995;Walker and Syers 1976). They are also likely to be more abundant in anthropogenically constructed soil profiles, when soil material of higher apatite content was applied (e.g., on artificial terraces; Weihrauch et al. 2016). Finally, higher contents of primary P forms could also result within soil material enriched with apatite-containing rocks (e.g., debris of buildings or archaeological structures; Weihrauch et al. 2016).
For P-rc, the largest mean XER values were found for segments 1.2 and 1.6 (means on the basis of profile data: 123% and 92%, respectively). This corroborates that segment 1.2 might have been used more intensively than the segment outside the outer rampart. Still, the local soils only contain relatively small proportions of P-rc (Fig. 13). Instead, a relatively large proportion of P-rc was found in segment 1.6. Thus, P enrichment might be due to rock-bound P introduction into the local soils in this segment. One option could be the adjacent rampart structures, which might have been constructed using primary P-containing soil material and local basalt rocks (Weihrauch et al. 2019; Söder 2004).
The lowest P-rc XER occur in segments 1.7 and 1.8. Hence, both areas might have received P inputs, but need not have been part of the prehistoric settlement. On the other side of the mountain, there was large P-rc enrichment in segment 1.1, indicating again that this area might have been used anthropogenically despite its location outside the outer rampart. However, the significant differences between the P-rc MER and XER point to a different land use in segment 1.1 than in segment 1.2. Generally, the differences between the P-rc MER and XER are larger from segments 1.2 to 1.6 than in the other segments. This could indicate that the latter have been more intensely affected by the deposition of colluvial slope sediments.
For transect 2, P-sl enrichment also increases toward the Altenburg plateau (Fig. 12), though the highest P-sl XER occur in segments 2.5 and 2.3 (means on the basis of profile data: 523% and 268%, respectively). The soil profiles in both segments are located on toeslopes. In segment 2.5, the entire soil profiles are comprised of colluvial slope sediments. In segment 2.3, the colluvial layers are ca. 70 cm thick. As larger soil sections are P-enriched, both the XER and the MER are large in these segments.
Generally, the differences between the P-sl MER and XER are comparable throughout the transect, indicating a relevance of vertically localized P enrichment.
Besides, the next largest XER were recorded for the central settlement (segment 2.4) and segment 2.6. The latter indicates that there might have been a change in land use outside and within the outer rampart at this side of the mountain. Thus, it appears that segment 2.6 was part of the prehistoric settlement. On the other side of the mountain, no such conclusion can be drawn. Both segments 2.1 and 2.2 are comparably enriched with P-sl. Hence, it is unclear whether both, none, or just segment 2.2 was part of the settlement.
Again, a different pattern shows for the P-rc XER (Fig. 12). The maximum was also found in segment 2.5 (mean on the basis of profile data: 616%), but no peak occurs for segment 2.3. In both segments, the P-sl:P-rc ratios are high (Fig. 13). Hence, secondary P forms are the most relevant. The P-rc XER indicate land use changes between segments 2.1 and 2.2, as well as segments 2.6 and 2.7. While segment 2.6 might have been part of the prehistoric settlement, the results for segment 2.2 are unspecific. Possibly, segment 2.2 was used in a way that did not introduce much P into the soils (e.g., prehistoric agriculture).
Generally, we hoped to find clearer results, e.g., zero enrichment in the segments outside the outer rampart. However, we deduced the native P contents from parts of our soil profiles (i.e., the LIA). With such an approach, one would probably in most cases detect P enrichment, for instance, when topsoils are naturally higher in P than subsoils (e.g., due to organic matter, litter, edaphic activity), or the other way around (due to P-containing bedrock; Weihrauch and Opp 2018). The same applies when soil sections are formed of P-enriched colluvial material. The LIA approach excludes the topsoils but does not take other forms of pedogenic P enrichment into account. Thus, the question arises if a threshold value should be applied below which P enrichment is termed natural (e.g., on the basis of bedrock or topsoil P enrichment). Such a threshold would likely have to be bedrock- or substratum-specific, as it is, e.g., much harder for a sandy soil to retain 100% of additional P than for a clayey soil (Blume et al. 2016; Syers et al. 2008; Amberger 1996). A good option to deduce such a threshold would probably be to study nonenriched reference profiles, for which one could determine the percentage difference between the LIA and MIA or maximum P content. As mentioned above, adequate reference profiles are difficult to find in our study area. We cannot guarantee for our studied soil profiles that any of their sections was really anthropogenically unaffected. Hence, we leave the question of an XER and MER threshold open to be answered in future systematic research.
Fig. 14 Types of graphs depicting the lateral soil P distribution according to prehistoric land use as interpreted by Lorch (1940) (adapted/translated by permission from Springer Nature: Lorch 1940, p. 635, © 1940)

Relating P Enrichment to Land Use Types
In the preceding sections, we used the difference between the maximum (or mean) and the native P content (i.e., soil P enrichment) to infer the intensity of past inputs of P-containing matter into the soils of the Altenburg hillfort. P enrichment is a function of three interrelated aspects: (1) the type of P-containing matter introduced into the soils, (2) the amount which was introduced, and (3) the frequency or duration of the inputs (Jakob 1955; Lorch 1939). A geoarchaeological P prospection can hardly elucidate these aspects because it detects the result of all P enrichment which has occurred at a site (Holliday and Gartner 2007; Lorch 1940). As the method gives a cumulative overview of soil P enrichment, it is hard to infer the distinct land use phases which contributed to the total enrichment (Ethelberg 2008; Duncan 2000; Arrhenius 1931). Still, this information is necessary and often wanted by archaeologists.
Often, researchers try to close the knowledge gap between past land use and the resulting P enrichment on the basis of hypotheses deduced from comparable research (Wells and Moreno Cortés 2010). For instance, geoarchaeological P prospections were often used to differentiate activity zones in prehistoric house layouts (e.g., Barba and Ortiz 1992; Zimmermann 1986; Hassan 1981). In these cases, the researchers mostly had an idea which rooms (e.g., kitchen, sleeping area, stables) a house of a certain time period would have consisted of because they had found evidence for this during prior excavations (e.g., Ethelberg 2008; Zimmermann 2001, 2008). The same applies for graves and burial sites, which were often proven in advance by excavations and material finds (e.g., Núñez 1975; Provan 1971; Arrhenius 1955).
Such a combination and "cross-validation" between the archaeological findings and the results of a P prospection is relatively convenient to achieve for smaller research areas which can often be studied entirely (Barba and Ortiz 1992). Still, it is logistically difficult with larger sites. On the one hand, large areas can regularly not be excavated completely, e.g., due to limited financial capacity or restrictions of land ownership. Hence, there might be limited archaeological information on the respective site. On the other hand, settlements form a spatial mosaic of different land uses regarding types, amounts, and frequency of P inputs. This (often small-scale) spatial heterogeneity could be too complex to reconstruct from a P prospection with limited spatial resolution. In consequence, P prospections might rather be used to identify settled areas instead of specifying their functional composition (e.g., Zimmermann, 2011;Jungmann 2002;Zölitz 1986).
However, one approach to infer the functional differentiation of settlement areas was proposed by Lorch (1940). He distinguished seven types of prehistoric settlements on the basis of the lateral P distribution within soil transects (Fig. 14). As we also investigated soil transects in our case study, Lorch's approach seems applicable to our data. Hence, we compared our MER and XER graphs with the P graphs proposed by Lorch (1940).
We already mentioned that different graphs resulted for P-sl and P-rc. For the averaged MER and XER of P-rc (Fig. 12), we could not find a connection with Lorch's P graphs. The reader should note that Lorch mainly determined stable secondary P forms in his soil P prospections (e.g., Lorch 1955, 1941b, 1939). Hence, his P graphs are most likely not intended to depict spatial patterns of recalcitrant P in a settled area. A general interpretation scheme for recalcitrant P could probably not be provided because it would largely depend on local bedrock conditions (e.g., shallow or deep bedrock). Therefore, we do not interpret our MER and XER of P-rc in the light of Lorch's approach.
We found different patterns of P-sl enrichment in our two transects. For transect 1, we infer Lorch's settlement type 4 ("agricultural settlements") on the basis of the averaged XER of P-sl (Fig. 12). Hence, our soil profiles 10-14 would mark the location of the settlement, which might have been flanked by areas used for agriculture and/or for "economic activities" (e.g., crafting, processing of raw materials, production of food and goods). According to Lorch (1940), the less P-enriched areas might have been used for agriculture (i.e., profiles 15-24) and the more intensely P-enriched areas might have been used for economic activities (i.e., profiles 4-9). However, for both land use classes, P enrichment decreases with increasing distance from the settlement. Lorch interpreted the least intensely P-enriched areas as not used for agriculture (i.e., profiles 1-3, 25-27). However, this does not necessarily mean these areas were not used at all.
When we consider the averaged MER of P-sl, a different pattern emerges (Fig. 12). The right-hand side of the graph somewhat resembles Lorch's settlement type 5 ("local livestock farming") with vast pastures (profiles 15-17, 21-27) bordering the settlement (profiles 10-14). Within the pastures, there is localized P enrichment at sites where the animals might preferably have grazed (profiles 18-20). The left-hand side of the MER graph of P-sl rather reflects a variation of Lorch's settlement type 6 ("agricultural settlements without application of fertilizer"). The central settlement might have been bordered by a small area used for economic activities (profiles 8-9), itself bordering areas used for agriculture with minor P inputs (profiles 4-7). The most distant section might not have been used for agriculture according to Lorch (1940). Still, this section is P-enriched in his approach. Hence, the question arises how this distant land might have been used.
For transect 2, the XER of P-sl are difficult to align with Lorch's P graphs. Lorch (1940) generally interpreted the most intensely P-enriched areas as settlements (Fig. 14). Hence, in our case, we would have to interpret the areas bordering the plateau as settled (profiles 33-34, 40-41). Still, it is known from archaeological research that the Altenburg plateau was settled as well (Söder 2004; Hofmeister 1930). If we neglect the strongly elevated P contents next to the plateau, the remaining P graph somewhat resembles Lorch's settlement type 4, with a central settlement (profiles 35-39) possibly flanked by further settled areas (profiles 33-34, 40-41) next to agricultural areas (profiles 30-32) and sections presumably used for economic activities (profiles 42-43). It remains unclear whether the most distant areas (profiles 28-29 and 44-46) were used in the same manner as the respective adjacent sections. However, they are also P-enriched.
The averaged MER of P-sl show a trend comparable to that of the XER in transect 2 (Fig. 12). However, the most distant southeastern section (profiles 44-46) might not have been used in a way that introduced any or much P into the soils. According to Lorch (1940), it could have been an area "not used for agriculture." As this is the category with the least P inputs in Lorch's approach, it remains open whether these areas were not used at all.
In summary, we found different spatial patterns of P enrichment with regard to the approach of Lorch (1940). We attributed some of our P graphs to agricultural settlements (Lorch types 4 and 6) and some to settlements based on livestock farming (Lorch type 5). We also came to divergent results regarding the application of fertilizer on agricultural lands (Lorch type 4 versus 6).
We do not see this heterogeneity in land use as contradictory. Instead, we think one must expect a heterogeneous spatial distribution of different land uses around a large diversified settlement. Lorch's approach somewhat implies a concentric, ring-like distribution of land uses around a central settlement, so that they can be captured representatively by one transect. Our study rather shows that several transects may depict different patterns of P enrichment for the same settlement. Hence, the more transects, the more detailed and representative information can probably be retrieved in a P prospection.
Another conclusion from the different results of transects 1 and 2 is that, in large areas, land use might rather be distributed according to patches or spatial sectors. Hence, we consider it more likely that, for instance, the area between the outer and the first inner rampart was not one connected agricultural "belt" but that it consisted of sectors under different land use. The distribution of land use types might strongly have depended on site conditions, like insolation and precipitation for agriculture, access to raw materials for the crafts, and certain wind constellations for firing the ovens in metal and ceramic processing. Furthermore, land use distribution might have reflected sociocultural aspects, like the social status of persons and the degree of feeling harassed by certain activities (e.g., due to odor or noise).
Consequently, a transect may depict a very local expression of the land use distribution within a settlement, but it may not be sufficient to judge the general nature of the respective settlement type. Thus, in our case study, we do not conclude that the Iron Age Altenburg hillfort was a settlement of farmers and shepherds only. Its spatially far-reaching relevance has already been indicated by the artifacts found during the archaeological investigations in the twentieth century (Söder 2004; Hofmeister 1930). However, our data imply that some sections of the area within the outer rampart might have been used for agriculture. Hence, we do not infer that the entire area within the outer rampart was densely built up. This could suggest that the population of the Altenburg hillfort was at least partly self-supplied with food and agricultural products. Possibly, the people even produced goods for trade (e.g., from the "economic activities"). Without such a functional differentiation and heterogeneity within the Altenburg hillfort, we would most likely have found a strong correlation with Lorch's settlement type 7 in both our transects (Fig. 14).
Despite these findings, we consider further conceptual remarks on Lorch's (1940) approach necessary. One might generally wonder what the different P graphs and their attribution to settlement types are based on. To our knowledge, Lorch did not conduct conclusive empirical research that could have led to the seven P graphs. Thus, his approach might be based on personal hypotheses. Such hypotheses mirror the state of research and can be false or at least vague. For instance, agriculture is often considered a P-enriching land use type (e.g., Zimmermann 2008; Gebhardt 1976; Jakob 1955; Lorch 1940). This is true for modern agriculture (since ca. the 1840-1880s), which applies mineral P fertilizers (Finck 1979). However, we know from other fields of research that soils used for agriculture without mineral fertilizers were often rather P-deficient instead of P-enriched (Holliday and Gartner 2007; Finck 1979, 2007). Under extensive agriculture, it can even be assumed that soil P is increasingly depleted (Negassa and Leinweber 2009; Bakkevig 1980). Hence, the traditional archaeological hypotheses about the effects of certain land use types on soil P enrichment are sometimes not plausible. In Lorch's approach, this is also illustrated by the areas close to the settlement which were used for "economic activities" (letter "N"; Fig. 14): They are strongly P-enriched in the P graph of settlement type 4 but almost P-free in the type 6 graph.
Moreover, Lorch's approach assumes a rather simple settlement history. Prehistoric hillforts in Hesse, for instance, were often settled in more than one period, as is also the case for the Altenburg hillfort (Söder 2004; Hofmeister 1930). The accumulation of P introduced at the same site but at different times might theoretically lead to a P graph indicating an intensively used settlement (e.g., Lorch's type 7), while there were actually just two settlements with minor to medium P inputs in the same location. Lorch did not address this issue in his approach.
Our study furthermore shows that soil P contents (as well as MER and XER values) regularly differ greatly from one soil profile to the next, even over small distances (Fig. 12). We think it is questionable whether a harmonious and conclusive P graph (as depicted in Lorch's approach) can result from a P prospection, at least without averaging the data by section. We consider it likely that small-scale spatial heterogeneity of P enrichment occurs in a diversified settlement. For instance, we expect such variation between areas with or without buildings (e.g., paths/roads between houses), and in areas with small localized structures (e.g., pits; Lauer et al. 2013). Hence, a P graph based on averaged data can only give a coarse overview of prehistoric land use. Moreover, our imagination and hypotheses might generally be limited compared to prehistoric reality.
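The effect of averaging by section can be illustrated with a minimal sketch. The profile numbers, segment labels, and ratio values below are hypothetical and are not taken from our data; the function name `average_by_segment` is likewise only illustrative. The sketch merely shows how strongly differing neighboring profiles collapse into one smooth value per segment.

```python
from collections import defaultdict
from statistics import mean


def average_by_segment(value_of_profile, segment_of_profile):
    """Average one enrichment value per profile within each segment.

    Both arguments are dicts keyed by profile number. The grouping into
    segments is an assumption supplied by the user, not derived here.
    """
    grouped = defaultdict(list)
    for profile, value in value_of_profile.items():
        grouped[segment_of_profile[profile]].append(value)
    return {segment: mean(values) for segment, values in grouped.items()}


# Hypothetical enrichment ratios for four neighboring profiles.
xer_of_profile = {1: 1.0, 2: 3.0, 3: 2.0, 4: 6.0}
# Hypothetical assignment of profiles to two segments.
segment_of_profile = {1: "A", 2: "A", 3: "B", 4: "B"}

averaged = average_by_segment(xer_of_profile, segment_of_profile)
print(averaged)
```

In this example, the threefold difference between neighboring profiles within each segment is no longer visible after averaging, which is why an averaged P graph can only be read as a coarse overview.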
Finally, and this is the core issue of the present paper, Lorch did not consider the relevance and effect of varying bedrocks/substrata in the subsurface on the P graphs. If we had considered only the means of our measured P contents (i.e., without relating them to the native P content; Fig. 4), the spatial P distribution would have seemed strongly in line with Lorch's graphs (transect 1: type 4; transect 2: type 5). Still, as mentioned above, varying bedrocks/substrata cannot be neglected in a soil P prospection because they determine natural (i.e., nonanthropogenic) variations in soil P over an area.
We conclude that Lorch's (1940) approach could be a desirable and useful tool to evaluate the results of geoarchaeological soil P prospections. Still, it is rather idealized in its current form. To come to a more conclusive and straightforward scheme for the interpretation of land use from a soil P prospection, much systematic and comparative empirical research is still needed.

Our Archaeological Results in a Nutshell
Our soil P prospection enabled several new interpretations regarding the Iron Age Altenburg hillfort. We found potentially anthropogenic soil P enrichment throughout the study area. Due to its depth in the soil profiles, we deduce subsoil P enrichment largely mirrors prehistoric land use.
Interestingly, we found different spatial patterns of P enrichment in our two transects. This indicates that the Altenburg hillfort was not a concentric or radial settlement despite its rampart rings. Instead, our results point to a rather sectoral spatial distribution of different land uses within the rampart rings (probably depending on environmental and/or sociocultural factors). Our transects draw a selective picture of the spatial and functional heterogeneity of the Altenburg hillfort. Still, in both transects, we found evidence that the first segments within the outer rampart were used differently from the outer areas. This might corroborate that the newly discovered outer rampart marks the extent of the Iron Age hillfort. However, our data indicate that the areas between the different internal ramparts might have been used for different purposes, including agriculture and livestock grazing. Hence, not the entire area surrounded by the outer rampart seems to have been densely covered with buildings.
In transect 1, the most intense P inputs seem to have occurred in the settlement's center on the Altenburg plateau (Fig. 12c). In the adjacent areas, P enrichment decreased successively with increasing distance from the center. The areas directly neighboring the central settlement were probably used for economic activities (e.g., production, crafts). Segment 1.5 proved archaeologically relevant by relatively strong P enrichment. It marks a potential gate and a path leading down from the plateau. Another possible gate situation was identified in segment 1.6. Further geoarchaeological investigations are required in these locations. Generally, the remaining northwestern part of the transect might have been used for agriculture (without fertilizer application), and the southeastern part for livestock husbandry and grazing. We found the smallest P enrichment outside of the outer rampart. Hence, these areas might have been used in a way that only brought minor P inputs or they might not have been used at all.
In transect 2, we also found the largest P enrichment in the central settlement, followed by the neighboring areas which possibly contained buildings as well. We infer that the next adjoining area in the northwest might have been used for agriculture and the southeastern lands for economic activities. According to our data, the first land section within the outer rampart in the northwest belonged to the Iron Age settlement. Still, the use of the first section in the southeast remains unclear despite its P enrichment. The areas outside of the outer rampart are P-enriched as well, but only slightly, which indicates that these patches might have been used by the prehistoric population but not intensively (e.g., for agriculture).

Methodological Conclusions for Soil P Prospections in Heterogeneous Environments
From our study on the Altenburg hillfort, we conclude that soil P prospections can be carried out in heterogeneous environments. However, a more targeted field work strategy and a more sophisticated approach for data evaluation are needed.
Prior to conducting a soil P prospection, one should generally study the local geological and soil maps to decide if a P prospection is plausible at all, and to develop a suitable sampling strategy (Wells 2010). During field work, soil documentation by a person with soil scientific competence is required. The studied soils should be characterized regarding their stratigraphy and P-relevant soil features. From soil documentation, one should derive the spatial distribution of different bedrocks, called "soil regions" in our study. These "soil regions" may diverge from the information shown in the geological maps due to the coarser resolution of the latter. Moreover, the bedrock-based soil regions may diverge from (pre)historic land use. Hence, one should also derive the study area's division into land use segments.
Soil regions and land use segments differed slightly in some locations of our research area. However, both schemes of spatial differentiation may coincide in another study area. In any case, one should strive for a largely equal number of soil profiles and soil samples within each soil region and land use segment during field work. Furthermore, the soil profiles should be distributed over the entire regions/segments at relatively uniform distances from each other (Wells 2010). Doing so would result in an adequate spatial representation of the regions/segments by soil P data. As we had not been aware of this prior to field work, our sampling strategy was only based on the land use segments and our soil profiles were sometimes concentrated in certain sections of the segments. Hence, the spatial representation of our study area could have been improved locally.
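A simple check of these two sampling criteria (similar profile counts per unit, similar distances between profiles) could be sketched as follows. The unit labels, transect positions, and function names are hypothetical and serve only as an illustration. The sketch does not prescribe any threshold for what counts as "largely equal".

```python
from collections import Counter


def profiles_per_unit(unit_of_profile):
    """Count soil profiles per soil region or land use segment."""
    return Counter(unit_of_profile.values())


def spacing_gaps(positions_m):
    """Return distances between neighboring profiles along a transect.

    positions_m: profile positions in meters from the transect start
    (assumed layout: one straight transect).
    """
    ordered = sorted(positions_m)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Hypothetical assignment of four profiles to two soil regions.
unit_of_profile = {1: "region A", 2: "region A", 3: "region A", 4: "region B"}
# Hypothetical positions of the same profiles along the transect.
positions_m = [0.0, 10.0, 20.0, 80.0]

counts = profiles_per_unit(unit_of_profile)
gaps = spacing_gaps(positions_m)
print(counts)
print(gaps)
```

Here, region B is represented by a single profile and separated from its neighbor by a much larger gap, which is the kind of local under-representation described above.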
A major challenge of a soil P prospection in a heterogeneous environment lies in deducing the native P content of each soil region. If possible, one should try to establish at least one reference soil profile which is assuredly without anthropogenic influence and which mirrors the typical soil P content and vertical P distribution of the soil region (Wells 2010). However, there might be areas where such reference profile locations are not available (as in our case study). For such areas, we proposed to identify the LIA of each soil region inductively to determine the native soil P content from one's own data. Still, the resulting native P contents are strongly influenced by the quality of the underlying soil P data and their spatial representation.
Another challenge arises in quantifying and evaluating soil P enrichment. Native soil P contents vary across a heterogeneous environment. Hence, evaluating soil P enrichment on the basis of absolute P contents would give misleading results. We discussed two alternative parameters to absolute P contents, the MER and the XER. These parameters are corrected on the basis of the respective native P contents. They depict relative P enrichment over a larger area and they allow for comparison between different spatial sections. However, it remains a challenge to relate lateral variation in soil P enrichment to (pre)historic land use. We discussed the approach of Lorch (1940) to facilitate the interpretation of spatial soil P enrichment patterns.
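Why a correction by the native P content matters can be shown with a small numerical sketch. It assumes, for illustration only, that the MER is the mean measured P content of a profile divided by the native P content of its soil region, and that the XER is the corresponding ratio for the maximum; the exact definitions are given in the preceding sections and are not reproduced here. All values and the function name `enrichment_ratios` are hypothetical.

```python
from statistics import mean


def enrichment_ratios(p_contents, native_p):
    """Return (MER, XER) for one soil profile.

    Assumption for this sketch: MER = mean P / native P and
    XER = maximum P / native P, in identical units.
    """
    if not p_contents:
        raise ValueError("need at least one P measurement")
    if native_p <= 0:
        raise ValueError("native P content must be positive")
    return mean(p_contents) / native_p, max(p_contents) / native_p


# Hypothetical profile on P-rich bedrock (native P content: 400 units).
profile_a = [400.0, 600.0, 800.0]
# Hypothetical profile on P-poor bedrock (native P content: 100 units).
profile_b = [150.0, 200.0, 250.0]

mer_a, xer_a = enrichment_ratios(profile_a, 400.0)
mer_b, xer_b = enrichment_ratios(profile_b, 100.0)
print(mer_a, xer_a)
print(mer_b, xer_b)
```

Profile A has the higher absolute P contents, yet profile B is the more strongly enriched one relative to its native P content. Absolute values alone would therefore rank the two profiles in the wrong order.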
Our study shows that, in a heterogeneous environment, we cannot simply use the absolute soil P contents but need to develop alternative parameters for evaluating P enrichment. We encourage further studies in heterogeneous environments to test the applicability of our approaches, to modify them, and to find additional or alternative solutions. Establishing a methodological framework for soil P prospections in heterogeneous environments could make areas which are too large and too difficult to excavate accessible for systematic geoarchaeological investigations.