Introduction

Perhaps one of the earliest and most fundamental types of “archaeological observation” is that of similarity; of one thing (e.g. an artefact or an architectural detail) being somehow and intriguingly connected to another by common material aspects (such as shape, technology or decoration). From this originate the two traditional tools of ordering the archaeological record in time and space: Typological sequences and distribution maps. It is this latter aspect, the “geographical order” of the archaeological record, that is the subject of this treatise. More specifically, the following will be concerned with mathematical models for the reconstruction of the spatial network structures behind the processes of interaction and exchange, so fundamental to the evolution and transformation of human societies (Anderson-Whymark et al., 2015; Bauer and Agbe-Davies, 2010; LaBianca and Scham, 2010).

Much (if not all) of the archaeological record’s spatial patterning has been shaped (directly or indirectly, obviously or subtly) by underlying network structures (e.g. Cowley, 2016; Verhagen et al., 2013; Makohonienko, 2009); be they navigable rivers, pathways cut through forests or paved roads. Indeed, well-built and dense road networks are commonly considered a hallmark of advanced civilisations, such as the Incan Empire (D’Altroy, 2018) or the Roman Empire (Carreras and de Soto, 2013; de Soto, 2019).

From a more abstract point of view, networks are thus complex manifestations of social processes, bounded by physical constraints. If the physical network can be reconstructed, the social processes that it supports become constrained by its shape; one degree of freedom (the spatial arrangement) is thus removed from the analysis and the social interpretation of the archaeological evidence becomes more explicit and less convoluted. Consequently, network-centric spatial analysis has a long history in archaeological research (for an overview: Brughmans, 2014; Collar et al., 2015), with mathematical and formalised methods predating the dawn of geographical information systems (GIS: Conolly and Lake, 2006), agent-based models (ABM: Bentley and Maschner, 2003; Crema, 2013; Graham, 2006) and other computational tools (see e.g. the analysis of Roman Britain’s network by Hodder and Orton, 1976).

Intriguingly, network analysis, with its concepts that are at the same time generic and mathematically explicit, occupies an “intellectual tension zone” in archaeology that represents the contradiction of wanting to achieve two seemingly opposite goals. On the one hand, there is an essential need for radical abstraction: Research aimed at revealing general trends and correlations (laws) must break down its objects of interest into simpler, purposefully reduced representations, generally called models. This reflects one of archaeology’s roots, that in science, and is compatible with GIS-based analysis and mathematical thinking (Clarke, 1968; Binford, 1989; Binford, 2001; Barcelo and Bogdanovic, 2015). On the other hand, being concerned with the study of past human beings and their material remains, archaeological research also aims at individual phenomena and processes. It attempts to reconstruct biographies and write narratives that are interesting, precisely because they represent the “anomalies” of human history, rather than the general trends and mainstreams of cultural evolution. This, of course, reflects another root of archaeology, that in the humanities (e.g. Trigger, 2006).

In the context of spatial data analysis, connecting these two perspectives, of geographical determinism and social dynamics, translates into the ability to use models with few assumptions while retaining the flexibility to observe accurately modelled effects in their local manifestations (e.g. Knappett et al., 2011). The network paradigm and its associated analytical tools stand out as being uniquely well suited for this task, as challenging as they may prove in their detailed application to archaeological material (Brughmans, 2010; Brughmans, 2018; Knappett, 2013).

Motivation and Rationale

Accordingly, our primary motivation is to contribute pragmatic tools for the reconstruction of the links of past geographical networks based on common (i.e. sparse and incomplete) archaeological site data. Although physical remains of ancient network links have been detected and unearthed (e.g. Lay, 1992; Ur, 2003 uses remote sensing to detect Mesopotamian hollow way networks; see also Leidwanger and Knappett, 2018 on maritime networks), such direct observations are, on the whole, significantly sparser than the indirect archaeological evidence for connected interactions, as represented by a multitude of shared technologies, styles and decorations found on sites around the globe. Thus, node-based network reconstruction is a “natural” starting point, and an active research topic that has produced much progress in methods over a significant amount of time (compare Terrell, 1976 with Amati et al., 2020). The inverse problem of node reconstruction (given a predefined set of links) will not be discussed here, as it is a research topic in itself that is much less explored (White and Barber, 2012 touch on the potential).

Our purposeful focus on node-based reconstruction is not to deny that, considered in all aspects, realistic network reconstruction remains a dauntingly complex undertaking. Partially because it presents itself as a “chicken and egg” problem: Did a pair of nodes exist first, and then a link evolved to connect them? Or did the node evolve in a location of favourable connectivity, on or near the existing links of an established network? And partially because additional complexity arises when more than binary relations (“connected” or “unconnected”) between nodes are considered, and when aspects of hierarchy and flow capacity are introduced.

Attempts to more fully capture the complex interaction and evolution dynamics of networks have been undertaken with considerable success in archaeology; by advanced means such as agent-based models (ABM), systems of nonlinear equations and stochastic simulations (e.g. Graham, 2006; Wilkinson et al., 2007; Bevan and Wilson, 2013; Lawall and Graham, 2018; see Evans, 2018 for an overview), and iterative optimisation techniques (the so-called “efficiency networks”: Fulminante et al., 2017). These approaches, however, come with heavy demands on computational resources, and supplying all necessary parameters (based on either data or assumptions) can require considerable effort, especially given the partial and nonhomogeneous nature of the archaeological record. Thus, they have so far been limited to singular case studies, just like the various attempts at detailed and realistic reconstruction of physical transportation networks, based on energetic expenditure and travel time optimisation (e.g. Groenhuijzen and Verhagen, 2017; see Fonte et al., 2017 for a particularly advanced, GIS-based example; also Ludwig, 2020).

By contrast, the mathematical methods discussed here have been selected because they are simple enough to implement in common GIS, generic enough to be transferable to a large range of scenarios and input data, yet flexible enough to provide results that allow specific, interesting insights into connected archaeological phenomena. In particular, we focus on an adaptation of the classic XTENT model (Renfrew and Level, 1979) to network space. XTENT is a well-tested tool, with a long research tradition and many desirable properties (among them simple parametrisation and a good balance between mathematical “legibility” and flexibility). More in-depth explanations of these choices will be provided in the next section (“Methods and observations”), together with an overview of generic approaches to node-based reconstruction.

It is, however, not our aim to provide a comprehensive overview of spatial network reconstruction. The latter is a field of research that is progressing rapidly. Recent publications on the topic include excellent summaries on the state of the art (Evans, 2018; Amati et al., 2018) and evaluations of methods (Rivers et al., 2013; Groenhuijzen and Verhagen, 2017).

Open Source Implementation

The software used to produce all illustrations and analytical results discussed in this text (v.net.models) is available, under an open source license, from its repository on GitHub (Footnote 1). It has been designed to work with open source GRASS GIS (Footnote 2) or QGIS (Footnote 3). We do not provide this merely as a convenience, but rather as a scholarly contribution (Hafer and Kirkpatrick, 2009), and to fulfil what we perceive as a requirement for fully reproducible research (Barns, 2010; Ince et al., 2012). Creating research software means transforming ideas and concepts into operational tools. It is a process that exposes more relevant issues, subtle problems and pitfalls than any theoretical study, no matter how thoroughly conducted, could ever stumble upon. The HTML manual page that accompanies v.net.models contains technical details and practical hints that are not covered by this article. We invite the reader to engage with the further development of the software by using the “Discussions” and “Issues” systems provided by GitHub, or by making direct contributions to the source code.

Terminology

The formal framework of network analysis features an exhaustive set of concepts and terms that define both the abstract representations of networks and their properties (see Collar et al., 2015). However, some specific meanings and semantics exist within the various social, mathematical, geographical, logistical and computing domains of network analysis. The following is a purposefully reduced and simplified list of terms that apply throughout the remainder of this text. They are in accordance with the glossary provided by Collar et al. (2015), except for the additional differentiation between “connection” and “link”, i.e. hypothetical and physical manifestations:

Connection: the state of two entities of interest being joined with each other by a physical (e.g. a road) or immaterial (e.g. treaty) relation. Shared features (such as similar artefacts or architectural styles) are considered indicators of connections between archaeological sites (synonym: “tie”).

Node: an entity of interest within a connected system (in social networks, nodes are also called “actors”). In the context of this study, all nodes are archaeological sites.

Link: the physical manifestation of a connection between a pair of nodes. Two nodes may be connected without a direct physical link (via intermediate nodes in a so-called “transitive” relationship).

Network: an analytical abstraction of a connected system that consists solely of nodes and links. We narrow the definition by adding that a network does not include unconnected nodes (here, “spatial network” and “geographical network” are taken to be synonyms).

Least cost path (LCP): an accurate representation of a network link, with a shape that reflects economic or energetic optimisation processes.

Flow: the number of interactions (e.g. movement of goods between towns) that occur over the links of a network within a specific time frame.

Model: a purposeful abstraction of reality, simplified and reduced to those concepts that are most significant within a framework of analysis or discourse. Here, we are concerned exclusively with models of connectivity.

Parameters: quantitative assertions (no matter whether provided a priori or inferred from data) that are required to control the behaviour of a specific model. In our context, model parameters generally represent connectivity criteria (e.g. the maximum distance across which two sites can establish a link between each other).

Parametrisation: the process of assigning values to a model’s parameters. The following will focus on models with few parameters and “cheap” (i.e. easy to achieve using typical archaeological data) parametrisation.

In the remainder of this text, the terms “node”, “site” and “place” will be used interchangeably, with the first being used more frequently when discussing abstract concepts of network analysis and reconstruction.

Archaeological Data (sites)

Node-based reconstruction assumes that another, independent method or body of evidence exists to supply the network nodes. In our case, this is simply a published catalogue of archaeological sites (Suchowska-Ducke, 2016). From this, we selected a set of 14 sites that are distributed across Europe and the eastern Mediterranean (see Fig. 1 and Table 1). They are embedded in geographical settings that range from the Central European basins across the mountain ranges of the Balkans and into the islands of the Mediterranean basin. This data set is a typical example of sparse and “legacy” data; originally compiled, filtered and structured to fit a specific research purpose (which was not network reconstruction).

Fig. 1 Map of sites included in this study (see Table 1 for more information)

Table 1 Archaeological data used in this study. Each row represents a site that revealed Late Bronze Age weaponry in a settlement context. The column “Size” contains a relative (rather experimental) rank estimate. Following this, similarity data (presence and type similarities) has been added. IDs refer to catalogue entries in Suchowska-Ducke, 2016

More specifically, the rows in Table 1 (and their corresponding points in Fig. 1) represent archaeological sites identified as settlements, where different types of weaponry, mostly of Italian or Central European origin, have been found in contexts dated to the Late Bronze Age (generally to the chronological phase described in the Aegean as Late Helladic IIIB and IIIC, i.e. approximately 1300 BC to 1000 BC). The archaeological evidence from this period suggests that far-reaching networks had been established across temperate Europe and the Mediterranean, and that the societies of the time were well-connected and mobile (Bouzek, 1985; Gale, 1991; Cline, 2009; Kristiansen and Suchowska-Ducke, 2015; Kiriatzi and Knappett, 2016). The aforementioned data was originally compiled to record observations on the presence of common offensive weapons, including a widely distributed type, called the “Naue II” (Naue, 1903) or flange-hilted sword, and associated artefacts. Sprockhoff (1931) originally used the name “gemeines Griffzungenschwert” (common flange-hilted sword) for Naue II, and this describes both its main aspect of shape and the foundation of its long-lasting success, very well: It is an efficient and robust design, intended to be cost-effective and simple to produce, and made to last. Therefore, these swords were not the rare weapons of a small elite. A conservative estimate that one out of 1000 swords survived (at least in part) and was recovered in modern times, would mean hundreds of thousands of these weapons had been manufactured since the fifteenth century BC (albeit not in circulation at the same time). We take the fact that this type of artefact was so widely distributed as an indication of connected societies and transfer of technology as a form of networked interaction (similarly: Golubiewski-Davis, 2018; Suchowska-Ducke 2015 and 2018).

Note that there are many further, noteworthy properties of the Naue II swords and their archaeological contexts. Most of the catalogued swords were found in grave mounds dated to the Scandinavian Period III, and in other types of burials dated to Bronze Age D/Hallstatt A1 in the Central European chronology. Other find contexts (especially hoards and ritual deposits) are much harder to narrow down chronologically. Regarding Naue II as an indicator of far-reaching connectivity, there is indirect evidence for strong interdependencies across Europe (Kristiansen and Suchowska-Ducke, 2015). Another contemporary source of evidence in support of this “connected view” is the remarkable volume of amber, traded from the Baltic to the Mediterranean (Czebreszuk, 2013; Hughes-Brock, 2005). We will not discuss these aspects further here, but simply consider the recorded occurrences of Naue II (and its associated artefact types) in confirmed settlement contexts as the primary indicator of connectivity.

Geographic coordinates (latitudes and longitudes) are available for all 14 sites, but for the majority of them, there are no reliable observations or estimates regarding their size (in terms of area or population). There are, however, sufficient sources of information (e.g. Evans et al., 2008) on a number of these sites to suggest a size ranking among them (column “Size” in Table 1), and this aspect of hierarchy will be exploited later.

In addition, two assumptions are made regarding the input data:

  • They represent sites that co-existed at the same time and were part of the same network.

  • No sites are missing from the data set that would, if included, significantly alter the network’s properties.

Both assumptions are, of course, difficult and (especially regarding the assumption of completeness) very likely flawed. However, this is of secondary concern here, as the objective is to produce, in a transparent and reproducible manner, a plausible network reconstruction, with plausibility being bounded by the properties of the available data and the assumptions of the mathematical model—no more and no less.

Geodata and Cost Surface

Physical geography is the main determinant of the cost of building and maintaining a network (as shown by, e.g., White and Barber, 2012). Despite the fact that the landscape surrounding the sites of interest has changed significantly in its local aspects since the Late Bronze Age (Zangger, 1994; Philip, 2003), changes of principal physical geography have been limited in scope (mostly to that of soil erosion and deposition: French, 2010) and can be neglected at the scale of this study (especially given the error margins already present in the data and assumptions). This makes modern elevation and hydrography data acceptable proxies for past topography and (by derivation) the resource expenditure associated with surmounting it.

As has been demonstrated before (e.g. Bevan and Wilson, 2013; Groenhuijzen and Verhagen, 2017), physical network reconstructions benefit greatly, both in realism and analytical potential, if the cost of establishing connections is considered. A significant volume of literature exists on the topic of cost-based spatial modelling in archaeology, including network link reconstruction (see Fonte et al., 2017, as a recent example). Therefore, we keep the discussion of this aspect minimal (for a comprehensive overview, see Verhagen et al., 2019). In essence, we use a cost surface (in technical GIS terms: a cost raster) that encodes the expenditures for traversing any location (raster cell) of the study area in any direction (Fig. 2).

Fig. 2 Cost surface used in this study, overlaid on shaded relief for more intuitive visualisation. Lighter shades represent lower costs. Lowest cost value is “1”, highest is “88”

Here, the term “cost” is used in its widest possible sense. It might represent the investment made for establishing a link (e.g. the material and labour costs of building a road), as well as that for using a link (e.g. the time spent travelling from origin to destination on a road) or for maintaining it. Owing to the generic nature of the connectivity models that we will employ, we will simply consider “cost” as a compound link utilisation metric. Consequently, link costs will not be expressed in meaningful physical units (such as calories spent per kilometre or travel time: see e.g. Groenhuijzen and Verhagen, 2017); they are to be understood as relative differences in expenditure; and this will be sufficient for our purposes.

The cost raster used in this study (Fig. 2) has a resolution (cell size) of 1 km. Costs were assigned to all raster cells, according to the following rules:

  1. The minimal cost of traversing a cell is “1” (“unit cost”). Since the cell size of the cost raster is 1 km, this establishes a convenient relation between cost and geographic distance. “Unit cost” of “1” applies to all cells that represent planar terrain, coastal waters of up to 20 km from the mainland and the bodies of major rivers.

  2. Major rivers are represented as linear features in the data. A buffer cell of cost “10” is added to each side of each stream’s centre line. This makes moving up or down a major river inexpensive, but crossing it costs the equivalent of 20 km of travel over planar terrain.

  3. The maximum cost of crossing a cell on dry land corresponds to the maximum slope (steepness in degrees), which is “88” for the study region.

  4. All cells that represent high sea areas are assigned a crossing cost of “1000”.

This simple scheme (favouring travel across plains, along major river networks and along coasts with visual contact to land) has been used previously in the study by Ducke and Rassmann (2010).
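The four rules above can be summarised in a small lookup function. The following Python sketch is an illustration only, not the raster algebra actually used for this study: the cell classes ("land", "river", "river_buffer", "coastal_water", "high_sea") are hypothetical labels, and the rule that dry-land cost equals slope in degrees (with a lower bound of 1) is an assumption inferred from rules 1 and 3.

```python
def cell_cost(kind: str, slope_deg: float = 0.0) -> float:
    """Relative cost of crossing one 1 km raster cell.

    `kind` is a hypothetical cell class: "land", "river", "river_buffer",
    "coastal_water" (within 20 km of the mainland) or "high_sea".
    """
    if kind in ("river", "coastal_water"):
        return 1.0  # rule 1: unit cost on major rivers and near-shore water
    if kind == "river_buffer":
        return 10.0  # rule 2: one buffer cell on each side of a river
    if kind == "high_sea":
        return 1000.0  # rule 4: open sea
    if kind == "land":
        # Rules 1 and 3. Assumption: cost equals slope in degrees,
        # with planar terrain floored at the unit cost of 1.
        return max(1.0, slope_deg)
    raise ValueError(f"unknown cell class: {kind!r}")


# Crossing a river means passing through the buffer cell on each bank.
crossing_penalty = 2 * cell_cost("river_buffer")
```

Under these assumptions, `crossing_penalty` evaluates to 20, which matches the "20 km of travel over planar terrain" stated in rule 2.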

All base geodata, required for deriving the cost surface, was acquired from open data sources on the Internet. The elevation data used to compute slope was extracted from the CGIAR SRTM data set, version 4.1 (Footnote 4), which provides complete coverage at three arc seconds resolution (at the equator). It was resampled to a resolution of 30 arc seconds, which translates to a measurement spacing of less than one kilometre within the study area. This was deemed sufficient for the scale of this study, being roughly the equivalent of 1:1,000,000 scale cartographic maps.
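The relation between angular resolution and ground spacing can be checked with a short calculation. The sketch below assumes a spherical Earth with the WGS 84 equatorial circumference; the function name and the example latitude are illustrative choices, not part of the original workflow.

```python
import math

# Assumption: spherical Earth with the WGS 84 equatorial circumference.
EQUATORIAL_CIRCUMFERENCE_KM = 40075.017


def arcsec_to_km(arcsec: float, latitude_deg: float = 0.0) -> float:
    """Approximate east-west ground spacing of an angular step.

    The spacing shrinks with the cosine of the latitude; the
    north-south spacing stays close to the equatorial value.
    """
    km_per_degree = EQUATORIAL_CIRCUMFERENCE_KM / 360.0
    return arcsec / 3600.0 * km_per_degree * math.cos(math.radians(latitude_deg))


spacing_equator = arcsec_to_km(30)        # about 0.93 km
spacing_45_north = arcsec_to_km(30, 45)   # smaller, since cos(45°) < 1
```

At 30 arc seconds, the spacing is roughly 0.93 km at the equator and decreases towards higher latitudes, so it stays below one kilometre everywhere in a European and Mediterranean study area.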

Additional topographical data, provided by the Natural Earth project (Footnote 5) at a scale of 1:10 million (“1:10m” in Natural Earth’s terminology), was used for masking the terrain, i.e. separating land from sea (this is not easily possible with just the SRTM data, which contains no well-defined boundaries between the two categories).

The river networks were provided by the Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG; Footnote 6). We extracted those GSHHG subsets that best matched the scale of our analysis: river lines of categories “L1”, “L2” and “L3” (“permanent major rivers”, “additional major rivers” and “additional rivers”) at intermediate “resolution” (level of detail). Embedding hydrography into the cost surface ensures that the conductive effects of natural network features (river systems) will be represented in the reconstructions.

The cost surface for this study was composed using the raster algebra module “r.mapcalc”, provided by open source GRASS GIS, version 7.8 (Footnote 7). All geodata was reprojected to a Cartesian coordinate reference system with ETRS 89 datum and EPSG code 3035. It uses metres as map units and a Lambert Azimuthal Equal Area projection to provide true area representation for the entire study region.

Methods and Observations

We broadly define (node-based) network reconstruction as the process of identifying useful mathematical models that will tell us, for all possible pairs of nodes in a network, whether they fulfil one or more plausible connectivity criteria, and can thus be assumed to have been linked. In principle, we are free to choose our connectivity criteria, but whereas immaterial social networks (Fig. 3) represent full (ideal) connection potential, their geographical (real) counterparts are bounded by much harder resource constraints. We are therefore interested in models that will plausibly reduce the number of links in the network from densely connected social space to more sparsely connected geographic space.

Fig. 3 Cartographic representation of a social network, based on the similarity observations recorded in Table 1. Link thickness represents connection strength (number of similarities shared by connected nodes), node size represents summed strength (degree centrality) at each node. Maximum link strength is 3 (summed: 16 at node representing Mycenae)

To support a pragmatic approach, we define some key properties that any acceptable reconstruction model must have to lend it general usefulness:

  1. It should work with the typical research data available in archaeological publications.

  2. Parametrisation should be easy, i.e. there should be as few parameters as possible, and those that are strictly required should be easy to provide (from data or estimates).

  3. Implementation in existing open source GIS should be straight-forward, eliminating the need for costly licensing of software, and avoiding the black-box effects of proprietary and closed source systems (Ducke, 2013).

  4. It should be based on previously explored methods and assumptions, so that the relevant potentials and limitations are reasonably well known.

In this section, we discuss a number of formal methods for node-based network reconstruction that exhibit the above properties (in principle or practice), apply them to our archaeological data set and observe the effects of models and parameters. We add to the available tool set by providing a version of Renfrew and Level’s (1979) classic XTENT model, with slight modifications to achieve robust results for network data.

Social Versus Geographical Networks

To illustrate why models that do not take geography into account explicitly will not suffice for the reconstruction of plausible networks, we will first take a look at the results of a simple, non-spatial network analysis. After all, archaeology is the study of connected social phenomena, and social network analysis (SNA), which traditionally revolves around the non-physical concepts of “social space” (Bourdieu, 1985) and “social topology” (Burt, 1977), has certainly made its impact on the discipline (Brughmans, 2010), including studies on site networks (e.g. Iacono, 2018).

It is not within the scope of this text to provide in-depth exposure of the enormous apparatus of SNA methods (see: Scott, 2000; Hanneman and Riddle, 2005; Brughmans, 2014). Essentially, SNA targets the observable or hypothetical manifestations of social interaction processes in immaterial, re-configurable networks (in its all-encompassing form, this is referred to as “actor-network theory”: Latour 1996). A common descriptive method of SNA is to connect all pairs of nodes that have one or more features in common (such as the presence of a certain artefact type or an architectural style) and to compute the “degree centrality” of each of the network’s nodes as a function of the number of incoming connections (Hage and Harary, 1995). For visualisation purposes, the nodes may be arranged freely to improve the aesthetics or “clarity” of the resulting diagram, and layout algorithms exist that optimise various aspects of a social network’s visual representation (e.g. Bourgeois and Kroon, 2017). In the context of originally geographical data, one (deceptively) obvious choice is to place the network nodes at their associated coordinates.

As a simple example, Fig. 3 shows the result of such a computation for the sites in Table 1 and their “similarity attributes”. Here, similarity is expressed as the common presence of an artefact class (see Mills et al., 2013 for more refined similarity metrics that also work with fragmented material; Amati et al., 2020 for a more comprehensive list of similarity-based network studies). In the cartographic output, the thickness of a link indicates its strength: the more traits two connected nodes have in common, the thicker the link between them. The size of a node indicates its “centrality”, as given by the summed strength values of all links connected to it. Viewed against the archaeological background, the visual result confirms expectations, with the Argolid and Crete dominating the network. It is, however, very important to keep in mind that this picture reflects only the similarities subjectively chosen and encoded into the data. It is not a representation of “absolute” importance or connectivity.
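The computation behind such a similarity network can be sketched in a few lines of Python. The site names and artefact classes below are hypothetical toy data, not the contents of Table 1, and the function name is an illustrative choice. Link strength is taken as the number of shared artefact classes, and node "centrality" as the summed strength of all links at a node, as described above.

```python
from itertools import combinations


def similarity_network(traits):
    """Build a similarity network from {site: set of artefact classes}.

    Returns (links, strength): links maps each connected site pair to
    the number of shared classes; strength maps each site to the sum
    of the strengths of its links.
    """
    links = {}
    strength = {site: 0 for site in traits}
    for a, b in combinations(sorted(traits), 2):
        shared = len(traits[a] & traits[b])
        if shared > 0:
            links[(a, b)] = shared
            strength[a] += shared
            strength[b] += shared
    return links, strength


# Hypothetical toy data (not taken from Table 1).
toy_traits = {
    "A": {"sword", "spearhead", "fibula"},
    "B": {"sword", "fibula"},
    "C": {"spearhead"},
    "D": {"amber"},
}
links, strength = similarity_network(toy_traits)
# links    -> {("A", "B"): 2, ("A", "C"): 1}
# strength -> {"A": 3, "B": 2, "C": 1, "D": 0}
```

Site "D" shares nothing with the others and therefore remains unconnected; under the definition of "network" given in the terminology section, it would not be part of the network.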

Coward (2013, 256) notes that “the significance of geography in the establishment and evolution of networks has rarely been recognised in social network approaches”. And indeed: Such “social maps” (or rather: “cartographic sociograms”: Moreno, 1934) are as intuitive to read as they are misleading, because they do not differentiate between process and manifestation: Are these links the results of direct or indirect interactions? Are they caused by spurious or regular exchanges? Do they all represent the same cost/effort of interaction? These and other questions are important for understanding the processes behind the observed similarities (Brughmans, 2018). From an analytical perspective, the fundamental problem is that “similarity” may be a proxy for physical connections, but is not a spatial connectivity criterion in itself.

Broadly speaking, social links are “cheap” and dynamic, physical links are “expensive” and static, but the former cannot be made without the latter. Reconstructions that focus on the manifestations and ignore the physical processes behind network evolution will thus produce physically implausible, over-connected results. Instead, (observation-based) SNA and (process-based) geographical network analysis should be viewed as two aspects that complement each other. In the following, we will concentrate exclusively on geographical network reconstruction as a fundamental task.

Node-Based Network Reconstruction

The following discussion of methods will focus on their usefulness for reconstructing physical networks relevant to archaeological time frames, such as road networks and coastal shipping routes. Note that this is not a complete coverage of current possibilities. The reader is advised to consult at least Groenhuijzen and Verhagen (2017), Amati, Shafie and Brandes (2018 and 2020) and Evans (2018) for a more complete picture.

In accordance with the scientific principle that the simplest solution is the preferable one (Ellis, 2010), connectivity models that require zero or only a single parameter will be considered first. The most basic such model is the one that connects every node with every other one via a direct link. For the sites in Table 1, such a model (Fig. 4) produces 91 links (connecting each of n nodes directly with every other node requires exactly n(n-1)/2 links; here, 14 × 13 / 2 = 91).
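As a quick check of the link count, the fully connected model can be enumerated directly. This is a minimal sketch in which the site identifiers are placeholders (the integers 1 to 14), not the catalogue IDs of Table 1.

```python
from itertools import combinations


def complete_links(nodes):
    """All unordered node pairs, i.e. the links of a fully connected model."""
    return list(combinations(nodes, 2))


site_ids = list(range(1, 15))  # 14 placeholder site identifiers
all_links = complete_links(site_ids)
n = len(site_ids)
print(len(all_links), n * (n - 1) // 2)  # prints "91 91"
```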

Fig. 4 A fully connected model for the sites in Table 1 (upper left), successively reduced by distance thresholds of 750, 500 and 250 km

Clearly, a fully connected model contains an excessively high number of links and is not plausible under realistic resource constraints. Adding one parameter, a distance threshold, allows the inclusion of such constraints (Fig. 4; this is also referred to as a “maximum distance network”: Amati et al., 2018). While a plausible choice for the threshold distance can be hard to substantiate, it does allow for some initial exploration of connectivity potential (Fig. 4 shows the results of applying a range of thresholds). It also provides the base for more advanced tools widely available in GIS, most notably the “minimum spanning tree” (see below).
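A maximum distance network is straightforward to sketch: keep only those node pairs whose distance does not exceed the threshold. The example below uses straight-line distances between hypothetical projected coordinates in metres; the node names, coordinates and function name are illustrative assumptions, and cost distances could be substituted for the Euclidean ones.

```python
import math
from itertools import combinations


def max_distance_network(coords, threshold):
    """Links between all node pairs that are at most `threshold` apart.

    `coords` maps node names to (x, y) positions in projected map units.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(coords), 2)
        if math.dist(coords[a], coords[b]) <= threshold
    ]


# Hypothetical toy coordinates in metres (not the sites of Table 1).
toy_coords = {
    "A": (0, 0),
    "B": (200_000, 0),
    "C": (600_000, 0),
    "D": (600_000, 280_000),
}
for threshold_km in (750, 500, 250):
    links = max_distance_network(toy_coords, threshold_km * 1000)
    print(threshold_km, len(links))
# prints "750 6", "500 4" and "250 1"
```

As in Fig. 4, lowering the threshold thins out the network; at 250 km only the closest pair remains linked and the other nodes drop out of the network altogether.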

Several zero or one-parameter connectivity models have been discussed previously in the archaeological literature and that of related fields. They differ significantly in the average number of connected neighbouring nodes (see Table 2):

Table 2 Average number of neighbouring nodes directly connected to a node, as determined by different connectivity models (source: Fortin and Dale, 2005)

Delaunay triangulation produces a sparse initial model (Fig. 5) by reducing the links to only those that correspond geometrically with the edges of a minimal triangulation (Fortin and Dale, 2005; Jiménez-Badillo, 2014; Groenhuijzen and Verhagen, 2017).
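For small node sets, the links of a Delaunay triangulation can be found by brute force, using the empty circumcircle property: a triangle belongs to the triangulation if no other node lies inside its circumcircle. The sketch below is an illustration under that definition, with hypothetical toy coordinates and function names; it is not the implementation used in v.net.models, and production work would normally rely on a GIS or library routine. Nodes that lie exactly on a common circle may yield both diagonals.

```python
import math
from itertools import combinations


def circumcircle(p, q, r):
    """Centre and radius of the circle through p, q and r (None if collinear)."""
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.dist((ux, uy), p)


def delaunay_links(coords, tolerance=1e-9):
    """Links of the Delaunay triangulation, found by testing every triangle."""
    links = set()
    for triangle in combinations(sorted(coords), 3):
        circle = circumcircle(*(coords[name] for name in triangle))
        if circle is None:
            continue
        centre, radius = circle
        others = (name for name in coords if name not in triangle)
        # Keep the triangle if no other node lies inside its circumcircle.
        if all(math.dist(centre, coords[o]) >= radius - tolerance for o in others):
            links.update(combinations(triangle, 2))
    return sorted(links)


# Hypothetical toy coordinates (arbitrary units).
toy_coords = {"A": (0, 0), "B": (10, 0), "C": (12, 8), "D": (1, 5)}
links = delaunay_links(toy_coords)
# links -> [("A", "B"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
```

Of the six possible links between the four toy nodes, the triangulation drops only the long diagonal between "A" and "C".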

Fig. 5 A sparse connectivity model that reduces the set of links in the network to only those which are part of a minimum triangulation (Delaunay) of the nodes (left), and an even sparser one, that retains only the links which are on a minimum spanning tree of the network (right)

Nearest neighbours (NN) connects each node with only those nodes that are closest to it (Fig. 6). The single required parameter is the number (n) of the nearest nodes with which to connect each node. This connectivity model has been used in ethnographical and archaeological case studies (originally called “proximal point analysis” by Terrell and later popularised by Broodbank, 2000). Provided that small numbers are chosen for n, the network will be sparse and appear more realistic than a fully connected or triangulated network (Groenhuijzen and Verhagen, 2017).

Fig. 6

Nearest neighbour connectivity models with several choices for n. Lower right: n is set individually for each node, according to its rank (see column “Size” in Table 1)

The Gabriel Graph is a model that minimises the spatial density of connected nodes. It does not require any parameters and produces even sparser results than a Delaunay triangulation. A simple geometric criterion is used to decide which nodes to connect (Gabriel and Sokal, 1969; Fortin and Dale, 2005; Jiménez-Badillo, 2014): The Euclidean (straight line) distances between all pairs of nodes are calculated, and two nodes A and B will be connected, if the circle, whose diameter is the same as the distance between the two nodes, and whose circumference passes through both nodes, does not contain any other nodes.

The relative neighbourhood graph is a variation of the Gabriel Graph (introduced by Toussaint, 1980) which has seen some archaeological use (Jiménez-Badillo, 2014). In this case, two nodes A and B will be connected if there are no other nodes within the overlap area of two circles, centred at each of the nodes, and with their radii set to the distance between A and B.
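Both geometric criteria translate directly into distance comparisons. The following is a brute-force sketch for straight-line distances, not an optimised implementation; the function names are hypothetical. A node C lies inside the circle with diameter AB exactly when |AC|² + |BC|² < |AB|², and inside the overlap area of the two circles exactly when it is closer than |AB| to both A and B.

```python
from itertools import combinations
from math import dist

def gabriel_graph(coords):
    """Link A and B if no other node lies inside the circle with diameter AB."""
    links = []
    for a, b in combinations(range(len(coords)), 2):
        d_ab2 = dist(coords[a], coords[b]) ** 2
        blocked = any(
            dist(coords[a], coords[c]) ** 2 + dist(coords[b], coords[c]) ** 2 < d_ab2
            for c in range(len(coords)) if c not in (a, b))
        if not blocked:
            links.append((a, b))
    return links

def relative_neighbourhood_graph(coords):
    """Link A and B if no other node is closer than |AB| to both of them."""
    links = []
    for a, b in combinations(range(len(coords)), 2):
        d_ab = dist(coords[a], coords[b])
        blocked = any(
            max(dist(coords[a], coords[c]), dist(coords[b], coords[c])) < d_ab
            for c in range(len(coords)) if c not in (a, b))
        if not blocked:
            links.append((a, b))
    return links

# Hypothetical coordinates: an isosceles triangle with a slightly longer base.
triangle = [(0, 0), (4, 0), (2, 3)]
print(gabriel_graph(triangle))
print(relative_neighbourhood_graph(triangle))
```

In this small example the relative neighbourhood graph drops the longest side of the triangle, which the Gabriel Graph retains.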

A minimum spanning tree (MST) is a reduced network representation that contains only the minimum number of links required to connect all of a network’s nodes (Kruskal, 1956; Prim, 1957). In practical application, the MST is an economic optimisation method that can be used to e.g. compute the smallest possible road network that is still sufficient for all places to be reachable from any other place in the network. As such, it is a useful model for the shape of a network’s backbone (see Hage et al., 1996 for an archaeological interpretation). Computation of the MST can include link weights (e.g. representing different road building costs), in which case the result is the set of links that minimises the summed link weight of all connected nodes. An MST for our complete network (Fig. 5) contains just 13 links.
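As an illustration, the MST can be derived from any weighted set of links with Kruskal's (1956) algorithm. The sketch below assumes links given as (i, j, weight) tuples, where the weight may be a straight-line length or any other link cost; the function name and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist

def minimum_spanning_tree(n_nodes, links):
    """Kruskal's algorithm; links are (i, j, weight) tuples."""
    parent = list(range(n_nodes))

    def find(i):
        # Follow parent pointers to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for i, j, weight in sorted(links, key=lambda link: link[2]):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the link joins two separate components
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree

# Hypothetical coordinates; the weights are straight-line distances.
coords = [(0, 0), (1, 0), (3, 0), (6, 0)]
links = [(i, j, dist(coords[i], coords[j]))
         for i, j in combinations(range(len(coords)), 2)]
tree = minimum_spanning_tree(len(coords), links)
print(len(tree), sum(weight for _, _, weight in tree))
```

For a connected input with n nodes, the result always contains n-1 links, which matches the 13 links reported for the 14 sites of the sample network.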

The Gabriel Graph and the relative neighbourhood graph will not be considered further here, since current implementations suffer from a number of shortcomings: There is no simple extension into the non-Euclidean domain of cost-based distances (since the connectivity criterion is trigonometrically defined) and no way to accommodate node hierarchy (both of these aspects are key to obtaining more realistic reconstructions, as will be discussed shortly). Groenhuijzen and Verhagen (2017) do provide a version of the Gabriel Graph with an adjusted trigonometric calculus that solves for triangles whose edge lengths represent travel times rather than Euclidean distances. However, this approach suffers from the numerical instabilities (caused by rounding errors) that “malformed” triangulations will inevitably produce in more extreme cases.

The NN (proximal point) model, on the other hand, seems problematic, because the choice of n can be hard to justify and seems even more arbitrary than that of a distance threshold. However, replacing the constant n with one that is chosen individually for each node is straightforward: Inserting the values in column “Size” of Table 1 into the NN model produces the network shown in Fig. 6 (lower right).
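A minimal sketch of the NN model with either a constant or a node-specific n is given below. It assumes straight-line distances; the function name and the sample values are hypothetical, with the per-node list standing in for a column such as “Size”.

```python
from math import dist

def nearest_neighbour_links(coords, n):
    """n is either one integer for all nodes or a list with one value per node."""
    per_node = [n] * len(coords) if isinstance(n, int) else list(n)
    links = set()
    for i, point in enumerate(coords):
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: dist(point, coords[j]))
        for j in others[:per_node[i]]:
            links.add((min(i, j), max(i, j)))  # store links as undirected
    return sorted(links)

# Hypothetical coordinates and ranks.
coords = [(0, 0), (1, 0), (3, 0), (7, 0)]
print(nearest_neighbour_links(coords, 1))             # constant n
print(nearest_neighbour_links(coords, [3, 1, 1, 1]))  # n chosen per node
```

Note that a link is created as soon as one of its two end nodes selects the other, so a node can end up with more than n links.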

Before we explore XTENT as an alternative to the methods discussed so far, a short digression on extending the realism of network reconstructions via cost-based link modelling is needed.

Least-Cost Paths and Network Topology

If we accept the general premise that the construction of physical networks respects economic constraints, then we can compute any link between two nodes as the least-cost path (LCP) across a given cost surface (Verhagen et al., 2013; Lewis 2021 provides an up-to-date entry point into published research on the topic).

Figure 7 shows the results of replacing straight-line links with LCPs in a maximum distance model. Here, we take the accurate lengths of LCPs, instead of straight-line measures, to determine cut-off values. The strong convergence of links into least-cost corridors (see also Parcero-Oubiña et al., 2019) produces a more realistically shaped network (under the assumptions of the underlying cost surface: see section “Geodata and cost surface”).
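To illustrate the principle, the following sketch traces an isotropic LCP across a small raster with Dijkstra's algorithm. It is a simplified stand-in and not the algorithm used by any particular GIS: the 8-connected neighbourhood, the step cost (mean of the two cell costs times the step length) and the toy raster are assumptions made for this example.

```python
import heapq
from math import sqrt

def least_cost_path(cost, start, goal):
    """Dijkstra on an 8-connected raster; returns (total cost, list of cells)."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        acc, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if acc > best[cell]:
            continue  # outdated queue entry
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Isotropic step cost: the same value in both directions.
                step = sqrt(dr * dr + dc * dc) * (cost[r][c] + cost[nr][nc]) / 2
                new = acc + step
                if new < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new, (nr, nc)))
    # Walk back from the goal to the start to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]

# Toy raster: an expensive ridge (9) separates two cheap columns (1).
raster = [[1, 9, 1],
          [1, 9, 1],
          [1, 1, 1]]
total, path = least_cost_path(raster, (0, 0), (0, 2))
print(round(total, 3), path)
```

The accumulated cost returned here, rather than the straight-line distance between the two end points, is the quantity that would be compared against a cut-off value in a cost-based maximum distance model.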

Fig. 7

A completely connected network (upper left) and its maximum distance derivations, with links modelled as least-cost paths (compare with Fig. 4)

Note that the cost model used in this study is isotropic, i.e. costs are uniform for crossing a location (raster cell) in any direction. This is fundamental, as isotropy is also an implicit assumption of all connectivity models discussed here: establishing a link from node A to B and vice versa are equivalent actions. We used “r.cost” to derive isotropic least-cost and directional rasters from the cost surface, and “r.path” to trace the LCPs on them. Both programmes are available as part of open source GRASS GIS.

A frequent issue with LCPs, computed in GIS over a rasterised surface, is that their shapes deviate from the more “natural” character of real network links, which are a “smoothed-out” compromise between physical constraints and geometric perfection. Instead, LCP segments tend to appear jagged or unrealistically straight in alternating patterns. This poor geometric quality is caused by “quantisation effects”, a loss of information that occurs when algorithms “squash” a naturally continuous phenomenon (such as elevation or terrain steepness) into a limited number of categories (such as only eight directions of movement between cells in a raster). In our case, these effects are made worse by the fact that SRTM elevation data is only metre-precise. A practical solution (Herzog and Posluschny, 2011) is to add some random noise to the cost surface. This increases the variance of the data, thereby preventing algorithms from producing identical solutions repetitively. Ideally, the amount of noise should reflect the scale and measurement error of the data, in which case no real detail is lost. At the scale of our analysis, we found a stochastic error of 20% (±10% randomly added to the cost surface) to be sufficient and appropriate, since it roughly matches the vertical measurement error contained in the SRTM elevation data that we used (see Lewis 2021 for a much more elaborate treatise on the stochastic elements of LCP generation).
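One possible reading of this noise step is sketched below: every raster cell is multiplied by a random factor drawn uniformly from the range 0.9 to 1.1. Both the multiplicative form and the uniform distribution are assumptions made for this example; the function name and the sample raster are hypothetical.

```python
import random

def add_noise(cost, fraction=0.10, seed=None):
    """Scale every cell by a random factor in [1 - fraction, 1 + fraction]."""
    rng = random.Random(seed)  # a fixed seed makes the result reproducible
    return [[value * (1 + rng.uniform(-fraction, fraction)) for value in row]
            for row in cost]

# Hypothetical cost raster with large areas of identical values.
cost = [[10.0, 10.0, 10.0],
        [10.0, 20.0, 20.0]]
noisy = add_noise(cost, fraction=0.10, seed=42)
print(noisy)
```

The input raster is left unchanged, so the noise can be redrawn (with a different seed) to check how sensitive the resulting LCPs are to it.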

The least-cost representation of network links also produces topological relations in geographical space, which can be exploited using available GIS tools. The LCPs computed for our sample data set converge where physical geography features corridors that facilitate movement. The number of overlapping LCPs along a particular segment of the network can be counted and taken as an indicator of flow capacity. In this view, the network’s backbone is composed of the segments with the highest link coincidences (Fig. 8; compare with MST representation of backbone in Fig. 5, right).
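The counting of overlapping LCPs can be sketched as follows, assuming that each LCP is available as a sequence of raster cells. The function names are hypothetical, and the backbone criterion used here (scores in the upper half of the observed score range) is only one possible interpretation of “highest link coincidences”.

```python
from collections import Counter

def link_density(paths):
    """Count how many LCPs use each segment between two adjacent cells."""
    density = Counter()
    for path in paths:
        # The set counts a segment at most once per path; sorting the two
        # cells makes the segment undirected.
        segments = {tuple(sorted(pair)) for pair in zip(path, path[1:])}
        density.update(segments)
    return density

def backbone(density, share=0.5):
    """Segments whose score lies in the upper `share` of the score range."""
    low, high = min(density.values()), max(density.values())
    cutoff = high - share * (high - low)
    return {segment for segment, score in density.items() if score >= cutoff}

# Three hypothetical LCPs that share the segment between (0, 1) and (0, 2).
paths = [[(0, 0), (0, 1), (0, 2)],
         [(1, 0), (0, 1), (0, 2)],
         [(0, 2), (0, 1), (1, 1)]]
density = link_density(paths)
print(density.most_common(1))
print(backbone(density))
```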

Fig. 8

Visualisation of least-cost path (LCP) density in a fully connected network: Thicker lines represent higher link density, i.e. higher coincidence of LCPs converging along a segment of the network. Thick red lines mark the network’s “backbone”, which includes the upper 50% of density scores. Link density ranges from “1” to “45”

In addition, the points where LCPs cross or converge can be identified and taken as candidates for additional network nodes. Previous studies have tentatively explored this potential. White and Barber (2012) show that it is possible to locate previously unknown sites at crossing points, given solely a realistic set of LCPs; Fonte et al. (2017) identify additional historic sites at points of LCP convergence (they refer to them as points of “divergence”, but in an isotropic model, this is synonymous). The study by Verhagen et al. (2014) considers junction points as candidates for new network nodes and explores their role as means of further link optimisation (so-called “Steiner points”). However, they conclude that current GIS technology offers no simple solution for this (note that the task of computing a minimally linked network with Steiner points is an NP-hard problem, i.e. not solvable in reasonable time, even for relatively small data sets: Gilbert and Pollak, 1968).

Hierarchical Reconstructions with XTENT

The previous example of using an NN model with node-dependent parametrisation demonstrated that it is relatively easy to introduce the aspect of node hierarchy (differences in site “size” or “importance”: Bevan, 2011; Bevan and Wilson, 2013) into network reconstructions. This shifts the focus away from global connectivity criteria and toward the individual network nodes and their capacities for forming or “attracting” connections locally and to varying degrees.

Such “generator-based” models are highly developed in disciplines other than archaeology (Barthelemy, 2018), but the specifics of archaeological data demand that concepts for models should be sought in the body of published archaeological research first. In doing so, we make use of the fact that there is considerable overlap between the topics of networks, hierarchy (Ortman et al., 2015), and another essential concept of spatial analysis, that of territoriality (a territory can only be controlled if a network exists that connects all of its important places).

XTENT, developed for use in archaeology by Renfrew and Level (1979), is a territorial model that has been demonstrated to yield plausible results in several case studies (Grant, 1986; Scarry and Payne, 1986; Ducke and Kröfges, 2008; Bevan, 2011), and that has also been extended to include cost-based distances (Ducke and Kröfges, 2008). Here, we adopt XTENT as a simple but flexible tool for hierarchical, node-based network reconstruction (note also that Rivers et al., 2013 already discuss its importance for the issue of centrality in networks).

XTENT belongs to the traditional category of gravity models (Crumley, 1979). These take inspiration from how the natural (by extension: social, economic, political, etc.) force of gravity shapes processes and their manifestations. As tools for spatial analysis, such models balance distance against mass (weight, size, wealth, etc.) when measuring the impact (influence, importance, etc.) of a variable at a given location. In the case of XTENT, the distance factor is given a linear impact (something that is twice as far away has half as much influence), while the size factors in exponentially, giving larger sites a much larger weight. This idea of a “snowball effect”, of large entities growing disproportionally faster than smaller ones, is at the core of all gravity models (Haggett, 2001; Robinson, 1998; Rivers et al., 2013).

The original XTENT formula consists of the weighted terms C (site size) and d (distance), with global exponential site weight a and global linear distance weight k:

$$ I={C}^{a}-k\times d $$

Any geographic location can then be assigned to the territory of the site (called “centre” in the original treatise), that produces the largest value for I (“influence”) at that location, given its size and its distance. In our adoption of XTENT for network reconstruction, we interpret C as a pair of interacting nodes in the network and I as connectivity criterion. We (re-)construct a link between the two nodes if I>0.

In practice, the largest drawback of the formula above is that it uses unbounded value ranges for both terms. It can be hard to provide a parametrisation that results in useful output (Scarry and Payne, 1986). To compensate for this, we modify the original formula by multiplying the weighted site size term C^a with the average inter-node distance avg(d) in the (hypothetical) fully connected network:

$$ I={C}^{a}\times \mathrm{avg}(d)-k\times d. $$

Scaling C by the mean distance brings its value into the same range as d (as long as k is not set unreasonably high).

As mentioned before, our framework of methods assumes isotropic interactions. However, with the formula above, results would vary, depending on whether we evaluate the link from A to B or from B to A (since the two nodes may differ in size). To solve this problem, we compute I as the combination of weighted sizes of both nodes:

$$ I={C}_{t}\times \mathrm{avg}(d)-k\times d\ \mathrm{with}\ {C}_{t}={C}_{A}^{a}+{C}_{B}^{a}. $$

To accommodate the impact of variable site size, we further modify the first term by dividing the average distance by the maximum size of any site present in the input data set, max(C), to arrive at the final model formula:

$$ I={C}_{t}\times \left(\mathrm{avg}(d)/\max (C)\right)-k\times d\ \mathrm{with}\ {C}_{t}={C}_{A}^{a}+{C}_{B}^{a}. $$

If it can be assumed that the sites under consideration were once connected (directly or indirectly), then it is also reasonable to assume that the largest places were able to connect at least across the average distance between nodes. The larger they were, the (exponentially) more reach they had, and two sites would be able to use their added weights (“resources”) to connect.
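A short worked example may help to make this reach explicit. Assuming a=1 and two sites that both have the maximum size, i.e. C_A = C_B = max(C), the final formula reduces to

$$ I=2\max (C)\times \left(\mathrm{avg}(d)/\max (C)\right)-k\times d=2\,\mathrm{avg}(d)-k\times d, $$

so that a link is constructed (I>0) as long as d < 2 avg(d)/k. For two sites of size “1”, the same substitution gives I>0 only if d < 2 avg(d)/(k max(C)). With max(C)=3 and k=1, which are values chosen here purely for illustration, two of the smallest sites would therefore connect only across less than two thirds of the average distance, whereas two of the largest sites would connect across up to twice the average distance.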

Regarding the choice of averaging function, it should be noted that the simple arithmetic mean will not deliver good results if the network is characterised by strongly varying distances between nodes (the mean is sensitive to outliers). When using cost distances, this will be the case frequently for large networks that connect sites across very different conditions of physical geography (such as coastal and hillfort settlements). In such cases, the median is a more robust choice.

While this modified version looks more complex than the original XTENT formula, parametrisation is still limited to choices for (a) each site’s size, (b) the global size weight and (c) the global distance weight. The latter two are now significantly easier to choose experimentally, since normalisation ensures that both terms fall into a similar range of values. Straight-line or cost-based distances can be used without further modifications. Note, however, that this comes at the cost of discarding physically-grounded, proportional relationships between the variables, which were a basic design feature of the original formulation of XTENT.
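The modified formula can be sketched compactly in Python. The function below is an illustration under stated assumptions, not the implementation used for this study: the function name, the dictionary-based distance input and the toy values are hypothetical, and the averaging function is passed in so that the mean can be replaced by the median.

```python
from itertools import combinations
from statistics import mean, median

def xtent_links(sizes, distance, a=1.0, k=1.0, average=mean):
    """Return the links (i, j) with I > 0 under the modified XTENT formula.

    sizes:    list of site sizes C (each >= 1)
    distance: dict mapping (i, j) with i < j to a straight-line or cost distance
    """
    pairs = list(combinations(range(len(sizes)), 2))
    # avg(d) / max(C): brings the size term into the range of the distances.
    scale = average([distance[pair] for pair in pairs]) / max(sizes)
    links = []
    for i, j in pairs:
        c_t = sizes[i] ** a + sizes[j] ** a  # combined weight of both nodes
        influence = c_t * scale - k * distance[(i, j)]
        if influence > 0:
            links.append((i, j))
    return links

# Hypothetical data: one large site (size 3) and two small ones (size 1),
# all 100 distance units apart.
sizes = [3, 1, 1]
distance = {(0, 1): 100, (0, 2): 100, (1, 2): 100}
print(xtent_links(sizes, distance, a=1, k=1))
print(xtent_links(sizes, distance, a=1, k=0.5))
print(xtent_links(sizes, distance, a=1, k=1, average=median))
```

In this toy example the two small sites connect to the large one but not to each other at k=1, while halving k is sufficient to connect all three.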

Once several parameters are introduced, interaction between them occurs, and it becomes much more difficult to understand which parametrisation choices lead to a particular model outcome. The following properties determine the basic behaviour of our XTENT implementation:

  • Setting a to “0” will cause all sites to have uniform weight “1”.

  • Negative values for a will produce counter-intuitive results.

  • A node of size “1” will always have weight “1”, regardless of the choice of a.

  • Site size C should not be set smaller than “1” to prevent nonsensical results.

Simple starting values for experiments lie in the range of 1-3 for C (compare Table 1), a=1 and k=1; a and k should then be increased in steps smaller than “1”, and only one parameter at a time, to observe their impacts (see also similar recommendations made by Renfrew and Level, 1979; Scarry and Payne, 1986).

Higher values of a will then result in more widely connected networks. A higher proportion of relatively larger sites will produce more densely connected networks. Larger sites will form connections across larger (cost) distances. These trends can be countered globally by increasing k, but only in a linear manner. Network evolution under developing technological means of construction can be simulated by decreasing k gradually. Increasing political or ecological importance of central sites can be simulated by increasing a gradually. The effects of resource growth can be simulated by increasing C.

XTENT-Based Model Generations

The results of running a series of XTENT-based reconstructions for various parametrisations with cost-based links are shown in Figs. 9, 10 and 11. Note that we have only included outputs of those model iterations that showed significant differences when compared to a previous run. Site weights (XTENT parameter C) were read from column “Size” in Table 1.

Fig. 9

XTENT-generated models with k (distance weight) kept at “1” and increasing a (site weight)

Fig. 10

XTENT-generated models with k (distance weight) kept at “2” and increasing a (site weight)

Fig. 11

XTENT-generated models with a (site weight) kept at “1” and decreasing k (distance weight)

The properties of XTENT-based reconstructions are quite robust under reasonable assumptions for a and k. One of these is the fact that the generated network is a tripartite structure: The sites within the Aegean are tightly connected, because they can communicate via very cheap coastal links. The continental sites (in Europe and the Levant) are far more costly to reach from most sites of this core network.

In our experiments, this becomes visible in model runs with k set to “1” (Fig. 9). In this case, the inner network connects even for very small values of a. But the network stays only locally connected until a threshold for a of “1.8” is crossed. Apparently, connectivity does not increase easily in those regions of the network where larger sites (and their resources) are absent. This effect gets substantially more pronounced if we introduce higher link costs by increasing k to “2” (Fig. 10). We now find that even the inner Aegean network no longer connects for very small values of a. Connectivity also increases in much subtler steps with increasing values of a.

To test the hypothesis that k (the link cost) is the most important driver of connectivity, we run another model iteration with strongly varying values of k and constant “a=1” (Fig. 11). This shows that network connectivity reacts strongly and globally to decreasing link costs.

Provided that the assumptions behind XTENT are both plausible and significant (gravity models are an accepted tool in spatial analysis: Evans et al., 2008; Rivers et al., 2013; Ortman et al., 2015), then these observations have general repercussions for our understanding of past networks and connectivity, that are independent of shortcomings in concepts or flaws in the data of individual case studies.

Generally speaking, it should then be assumed that connectivity throughout history was not driven primarily by the emergence of local centres and their growing importance, but rather by progress in the organisation of labour and the technologies required to build and maintain network links. Additional factors, that might have reduced the cost of links, could be sought in improved global planning and increased safety of roads and shipping lanes, as well as many other social aspects (Earle and Kristiansen, 2010). Consequently, the increased connectivity of the Late Bronze Age should also be attributed to the effect of innovations that were widely adopted and had significant impacts on communication cost, not just to the influence of a few, highly developed centres (in our experiments, we assigned three times the regular weight to central sites such as Mycenae: see Table 1).

Discussion

Before we discuss specific properties and results of our analysis, some general points are worth reiterating:

  • Our approach is exploratory, not inferential: The acceptability of all observations and interpretations is subject to the assumptions behind the models themselves and their parameters.

  • We made no attempt at reconstructing the entire network of the epoch. There are many more sites in the study from which the data set was extracted (Suchowska-Ducke, 2016). Here, we only looked at a subnetwork of sites that share a feature of interest.

  • For the time period and geographic region studied here, direct evidence of roads or other physical network links does not exist in sufficient quantity to validate any reconstructions. Even if this were otherwise, there would be no tractable way of determining the optimal reconstruction, since different models, parameters, input and validation data might always produce a better fit.

The models discussed here represent a perspective that is centred on a core network of Aegean sites. Our archaeological interpretations apply strictly to this limited network and its specific properties.

Computational Tractability

Excessively long computing times can hamper the applicability of computational tools severely in practice. In our case, the primary concern is how the number of nodes affects computational complexity. To construct a completely connected network, every node has to be linked to every other node. Even if a much reduced model of connectivity is used, it is still necessary to first examine every possible pair to check whether they meet the connectivity criteria. As mentioned previously, the number of links in a fully connected (undirected) network is n(n-1)/2, with n being the number of nodes. This is the upper bound of computational complexity in our case, as long as we can assume that the reconstruction of each link takes (approximately) a fixed amount of computing time (the actual time is not relevant in theory: it is a function of available computing power). Table 3 shows that ten times the number of nodes requires roughly 100 times the computing time for a full reconstruction.
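The growth in the number of link reconstructions is easy to verify numerically. In the sketch below, the node counts other than 14 (the size of the sample network, which yields the 91 links mentioned earlier) are arbitrary values chosen for illustration.

```python
def link_count(n):
    """Number of links in a fully connected, undirected network of n nodes."""
    return n * (n - 1) // 2

base = link_count(14)  # 91 links for the 14 sites of the sample network
for n in (14, 140, 1400):
    print(n, link_count(n), round(link_count(n) / base, 1))
```

Since n(n-1)/2 is dominated by the term n²/2, multiplying the number of nodes by ten multiplies the number of links by a factor that approaches 100 as n grows.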

Table 3 Number of links and increase in link reconstruction operations for fully connected networks with different numbers of nodes

Theoretically, such quadratic growth still represents a tractable (polynomial) class of computational complexity (see Cormen et al., 1990). In practice, however, this still poses a problem, even for relatively small inputs, when high-resolution cost surfaces are included in the models. As a work-around, we implemented an “initialise and reduce” strategy: A complete network, containing all possible links, is computed once and kept (“cached”) as an initial reconstruction state for all successive computations. Any one of the models discussed here (NN, XTENT, etc.) can be derived from a fully connected network by simply reducing the number of links to those that meet the connectivity criteria. It is not necessary to recompute LCPs, unless the underlying cost surface is changed.
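The “initialise and reduce” strategy can be outlined as follows. This is a schematic sketch with hypothetical names: compute_link stands in for an expensive LCP computation, and criterion for any of the connectivity models.

```python
from itertools import combinations

def initialise(nodes, compute_link):
    """Compute every possible link once and keep the results as a cache."""
    return {(i, j): compute_link(nodes[i], nodes[j])
            for i, j in combinations(range(len(nodes)), 2)}

def reduce_network(cache, criterion):
    """Derive a model by filtering cached links; no link is recomputed."""
    return {pair: link for pair, link in cache.items() if criterion(pair, link)}

# Toy stand-in for an expensive LCP run: nodes are positions on a line.
calls = []

def toy_cost(p, q):
    calls.append((p, q))
    return abs(p - q)

cache = initialise([0, 2, 5, 9], toy_cost)
near = reduce_network(cache, lambda pair, cost: cost <= 3)
wide = reduce_network(cache, lambda pair, cost: cost <= 7)
print(len(calls), sorted(near), sorted(wide))
```

However many models are derived, the number of expensive link computations stays at n(n-1)/2.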

Strengths

Computational complexity aside, the mathematical models illustrated here all share the same strength of being easy to understand and parametrise, while still allowing for some advanced aspects, such as cost-based distances and node hierarchy.

The XTENT model in particular offers a good balance between transparency and flexibility. We perceive this to be the greatest strength of our approach, and the primary reason why it is useful for the analysis of archaeological data.

Keeping the mathematical models simple and outsourcing the complexities of detailed realism to cost surfaces and existing GIS tools was a strategy that worked well, since the effects of model parameters and link costs were easy to separate and interpret. In combination with some pragmatic performance optimisations, we were able to compute a large series of model iterations in reasonable time, enabling us to compare different parameter sets and explore their effects.

At the same time, GIS-based implementation and exploration suggested additional avenues for further exploration. In particular, topological features such as crossings or points of convergence can be exploited in a variety of ways. This includes the simulation of network evolution by adding topologically selected nodes to each new model generation.

Weaknesses

The reconstruction models discussed here are somewhat one-sided, since they are strictly node-based. This is very obvious in the case of the XTENT model, where we can control. More dynamic models should take into account the interplay between links and nodes in evolving networks (this aspect is explored by de Soto, 2019).

We have also assumed undirected relationships between all linked nodes. This not only simplifies mathematical models and reduces computational complexity, it is also an implicit requirement of the vast majority of published reconstruction models. However, it is also an assumption that has some significant consequences for model realism. There are some examples of the importance of anisotropic networked processes in the real world. For example, travel on rivers is more cost-effective in the direction of water flow, and sea travel along coasts is energetically cheaper in the prevailing wind direction (see e.g. Murray, 1987). A related issue is the fact that we have only considered direct, pair-wise (dyadic) links between nodes, and have ignored the possibility of links via intermediate nodes (“transitive” relations), which are not uncommon in real-world networks. Once more, this is a common problem of established methods.

The most severe weakness of our approach, however, is the lack of a formal testing framework, including useful quality metrics, that would allow us to compare different reconstructions and select the one with the best fit to the input data. Clearly, this is an issue that needs a dedicated research effort. Formal approaches to this topic exist (Sanil et al., 1995), but it is currently unclear how these might apply to the fragmentary nature of archaeological evidence. Perhaps the concept of movement corridors, with its more realistic error margins (Parcero-Oubiña et al., 2019) will prove helpful in this endeavour.

Outlook

Principal weaknesses aside, the combination of GIS with generic and transferable link reconstruction methods is a flexible approach that can be refined by step-wise improvements, many of which are “low-hanging fruit”.

For example, the cost surface we employed is somewhat simplistic. More locally accurate cost modelling would first require a narrower definition of “cost” (i.e. a specific mode of link construction or use, such as walking or sailing) and then use a physically accurate cost assignment (e.g. an exponential weighting of the slope factor, as exemplified by Herzog and Posluschny, 2011 for pedestrian movement). We avoided these complexities to concentrate on the effects of the different connectivity models, but more refined cost surfaces, and differentiation between modes of movement, would provide a straightforward increase in explanatory power.

Another issue that can be addressed on the data level is the fact that real networks consist of subnetworks that may each be subject to different optimisation strategies and connectivity criteria (networks of networks). The challenge here lies in identifying which types of subnetworks are likely to be manifest in the data, and which ones are the most significant for an observed spatial patterning. If this can be addressed, however, then more clarity can be gained about the suitability of specific connectivity criteria for modelling specific social processes.

Most recently, exponential random graph models (ERGM) have been proposed as another, very promising, tool for node-based network reconstruction (Amati et al., 2018). ERGM are a generalised (as evolved from Markov graphs: Wasserman and Pattison 1996) form of graph-theoretic models that support statistical reasoning on network data, and could (at least in principle) overcome the weakness of relying on undirected and dyadic relationships. Formally, ERGM represent probability distributions of all possible sets of links for a given set of nodes (see Lusher et al., 2012). Every particular manifestation of a network can thus be considered a sample (of size n=1) of such a distribution, and statistical inference can be carried out on the data without violating the assumption of independent observations. Their generic, highly abstract character allows application of ERGM to a wide variety of domains, including archaeology (Amati et al., 2020). In fact, from a mathematical point of view, ERGM are flexible enough to “subsume maximum distance networks, proximal point analysis and gravity models as special cases” (Amati et al., 2020). As general statistical models, ERGM can be used for (probabilistic) network reconstruction via a model fitting approach. However, this requires a non-trivial amount of knowledge on the (known or presumed) properties of the network-generating processes, because the ERGM distributions must be constrained as much as possible to obtain any meaningful results. Since alternative (“brute force”, simulation, etc.) approaches will inevitably produce computational resource bottlenecks for all but the smallest networks (Schmid and Desmarais, 2017), it remains to be seen whether the theoretical promises of ERGM can be fulfilled in practice.

Conclusions

We have explored a framework of relatively simple methods that touch what we perceive to be a core aspect of the archaeological research agenda: the explicit reconstruction of geographical networks to better understand the spatial patterning of past social processes.

Since archaeology explores the manifestations of social interactions, the concept of “social space” is an attractive one. It is made even more attractive by the fact that SNA offers a plethora of tools for translating similarities observable in the archaeological record to network metrics that are easy to visualise and explore.

However, “social space” is a hypothetical construct that is more or less loosely anchored to physical reality. One can easily be misled to interpret similarity-based connections as direct and meaningful, when they are in fact routed through several nodes in the underlying physical network. In other cases, they might be the results of “similarity by spatial autocorrelation” (i.e. simply produced by the geographical proximity of entities of interest) or pseudo-similarity induced by the effects of fragmentation on statistics (Iacono, 2018 also acknowledges this characteristic problem of archaeological evidence in his analysis, but offers no solution).

Plausible reconstruction of network links in physical space is thus a necessary prerequisite for spatial analysis. To this end, we outlined an approach to network reconstruction that, in its most basic form, requires only a set of archaeological sites (or other objects of interest) with their coordinates as input. More advanced reconstructions can be computed if it is possible to assign individual weights to the network’s nodes, and much more realism can be gained by introducing cost-distances into the models.

Renfrew and Level’s (1979) XTENT is an example of a simple mathematical tool that retains its essential usefulness because it is “legible” and makes very few assumptions about the quality of input data. When amended with some detailed geodata and a cost-based distance metric, it provides a flexible analytical model. XTENT has certainly served us well for gaining some insights into the connectivity potential of our small set of sample data. Regarding our interpretation of the results, we feel comfortable in suggesting the hypothesis that the increase in Late Bronze Age connectivity in our study region (as witnessed by increasing distribution of shared technologies such as swords of type Naue II) was most likely enabled by widely available innovations and progress that drove down the cost of establishing links, not just by the influence of a few outstanding “centres of civilisation”.

In this context, one should not forget that the archaeological record, including those aspects that provide clues on networks and connections, represents the cumulative result of social processes and interactions. The models presented here do not attempt to explain the latter in depth or in detail. They merely provide a tool that can help explore the structural effects of (hidden) networks and help formulate hypotheses through spatial data exploration.

As Box (1976) famously stated, “All models are wrong”. It would be a vain attempt to produce “correct” models by extreme refinement of concepts and details. Determining any model’s acceptability or usefulness is rather a question of deciding whether it may be acceptably wrong in its specific assumptions and outputs. This is just as true for much more computationally complex and data-intensive methods.

Therefore, our approach to generic network reconstruction was to keep the mathematics simple and easy to understand, and to make sure it works with data of limited scope and quality. In this study, we have used archaeological evidence from traditional print publications and freely available GIS data at a relatively coarse resolution. Providing more, and more fine-grained, data, such as reliable estimates of node sizes and a higher-resolution cost surface with a better physical movement model, would inevitably result in more detailed network reconstructions.

But it would also inevitably result in more specific, less transferable assumptions and interpretations. An excessive amount of detail in the computation of LCPs might even result in less realism, since this would assume a level of planning perfection that might not have been achievable by prehistoric (or even current) societies (a point raised by Verhagen et al., 2014). There is also a multitude of social factors, not easily representable in a cost surface, that might have prevented optimal network links from being established in the first place. Modern transportation networks are certainly a compromise between what should be done and what can be done, taking into consideration political, judicial and economic constraints. This adds to our conviction that over-optimisation of models can actually have a negative impact on their realism.

On the scale between plain “intuition” and the complexity of simulation models and “artificial intelligence”, comprehensible mathematical models like XTENT represent an elegant middle ground with much potential. In the absence of clear evidence (which is so often a basic feature of archaeological research) a formal expression of plausible assumptions is as good a starting point as any. We hope that our contribution will motivate archaeologists to rediscover simple mathematical methods that work well with equally simple, but typical archaeological data, and take both to a new level of usefulness with the help of GIS and other advances in modern computing technology.