Main

Oceanic commerce has been a major driver of European expansion and economic growth for centuries, but perhaps equally important is less-documented coastal or short-sea shipping. Among other things, coastal trade played a fundamental role in the precocious development of the European economy and was a key enabling factor in the British Industrial Revolution. However, coastal shipping remains relatively unknown and poorly quantified. The absence of physical infrastructure common to most land-based means of transportation (such as roads, rails, or canals) and the limited cartographic traces of past sea routes explain our relative ignorance of ship courses. Yet ships were never randomly distributed across the immensity of Earth’s oceans. Climatic and oceanic features, navigation technology, and economic geography determine shipping routes. Here, we show that a dynamic and multi-criteria simulation of historical routing decisions can reveal the likely outlines of European shipping corridors in the age of sail. Based on this modelling exercise, we produced 12 monthly routing predictions for historical shipping corridors and tested them against historical evidence. We found that oceanic currents increased the historical route length by almost 20 per cent on average, but that they offset some of the seasonality in route length caused by wind patterns. Furthermore, we show that route length and journey time were very seasonal (with a deviation of over 30 percentage points from peak to trough). Finally, indeterminacy in local weather patterns and uncertainty in routing choices, leading to more intricate and longer routes, can be modelled by taking into account variability in the wind data, and especially the likelihood of dangerous weather events. We now have a historical routable coastal network for pre-steam navigation suitable for all calculations involving shipping distance, cost, and journey time.
These data can be included in state-of-the-art multimodal models of historical routing to calculate end-to-end journey costs, which makes it possible to quantify the impact of coastal accessibility and changes in freight costs on market potential, economic growth, and regional specialisation.

The use of GIS and network analysis to model mobility phenomena is an established methodology among computational archaeologists, geospatial analysts, and digital humanists. These methods have been used to determine migratory patterns, infer ancient road networks, and understand settlement locations (Verhagen et al., 2019). A path representing the least-cost trajectory between any pair of points can be calculated by quantifying the cost of movement across a three-dimensional grid, adjusted for any additional transitional costs (weights). Least-cost path (LCP) analysis was first used in the early 1990s to model ancient-world routes (van Leusen, 1993) and remains the key methodology for modelling land movement, with uses ranging from understanding the rationale behind Roman roads (Fonte et al., 2017; Lewis, 2021; Verhagen and Jeneson, 2012) to predicting routes of hominin dispersal (Beyin et al., 2019; Kealy et al., 2018).

In comparison, modelling historical sea routes is a relatively recent endeavour. Contributions from archaeologists, such as Indruszewski and Barton (2007) on Viking seafaring, Leidwanger (2013), Warnking (2016), Alberti (2017) and Gal et al. (2021) on ancient Mediterranean sailing routes, Gustas and Supernant (2017) on Northwest Pacific sailing in the late Pleistocene, Slayton (2018) on canoeing routes in the fifteenth-century Caribbean, and more recently, Perttola (2021) on the early seventeenth-century Southeast Asian trade routes, and Blankshein (2021) on prehistoric seafaring are notable exceptions.

However, historians have been slow to adopt such techniques. This is partly because while archaeologists have developed methodologies adapted to the paucity of data that characterises more remote periods, early modern and modern historians tend to drown in a wealth of data. More pragmatically, this is also the result of the computational skill gap among historians. We deplore the fact that these critical skills are still largely ignored by history curricula, and we hope that this article will encourage a change in this direction.

Seafaring is particularly suitable for the application of this new methodology. Unlike modern terrestrial mobility, which generally followed historical infrastructure that can be identified, albeit laboriously, such as paths, roads, canals, or even railway lines, maritime mobility was bound only by the extent of the sea. Martín Cortés de Albacar, the famous sixteenth-century Spanish cosmographer, explained:

‘Sayling,[...] differ[s] from viages by land [...] for the land is fyrme and stedfast, but this is fluxible, wavering, and mooveable. That of the lande, is knowen and termined by markes, signes, and limittes: but this of the Sea, is uncertayne and unknowen.’Footnote 1

Principles guiding the modelling

It goes without saying that sailors did not chart their courses randomly. Even before the advent of precise navigational charting techniques and technology, mariners endeavoured to minimise the number of days at sea while avoiding dangerous areas or weathering a looming storm. To simulate rational routing behaviour for an eighteenth-century pilot, we based our model on i. the basic thermodynamic principles of sailing adjusted to fit historical parameters, and ii. long-term historical environmental time series for the key determinants of navigation (wind, waves, currents, visibility, and extreme weather events).

Eighteenth-century sea-goers may not have had the precise forecasting models and GPS navigation systems currently used by sailors around the globe, but our hypothesis is that the written and oral body of knowledge, know-how, and experience accumulated over countless sea journeys, transmitted from masters to apprentices and disseminated in port taverns, together with early navigational charts and sailing treatises, would have provided an equivalent basis for informed routing decisions.

Previous LCP modelling applied to navigation used only simple wind models and bathymetry data (i.e., sea depth). Recently, Alberti (2017) introduced anisotropic modelling (based on the transition cost in the direction of travel to account for directional sailing conditions) and Perttola (2021) added a seasonal variable to account for wind conditions based on four six-hourly averages over 20 monthly averages. Our model draws on these examples to offer a more complex, dynamic, realistic, and operational approach to sea routing using a broader range of criteria for modelling sea conditions: bathymetry, coastal visibility, wind speed and direction, wind variability, frequency of extreme weather conditions, wave height, direction and period, and current speed and direction.

Climatic data

Environmental variables were all produced using data obtained from key global meteorological datasets and normalised to obtain hourly averages for every month and year from 1950 to 1978 (except for surface current data which encompassed 1992 to 2020). Our key assumption is that the average weather patterns observed over the third quarter of the twentieth century can be applied to determine sea conditions in earlier periods. We cannot reliably confirm this hypothesis, but the rare changes documented in the literature fall well within the variance induced by aggregation, suggesting that our model does not suffer from a significant bias in this respect.

Though the scope of the study pertains to the period covered by the ‘Little Ice Age’ (c. 1550 to 1850), a period traditionally associated with an increase in storms, precipitation, glaciation and variability in trade winds (Bridgman, 1983; Lamb, 1977, 1979), the use of historical records from outside Western Europe and the North Atlantic has demonstrated that the period is not one of monolithic climatic change but is best described as periods of cooling associated with the expansion of glaciers in certain regions (Matthews and Briffa, 2005). Additionally, it can be argued that despite a correlation between the North Atlantic Oscillation (the circulatory phenomenon of air pressure and wind in the North Atlantic) and glaciation during the period (Nesje and Dahl, 2003), the variability of this phenomenon in the twentieth century (Hurrell, 1995; Kushnir, 1994) can justify the use of modern averages to model past climatic conditions.

Furthermore, although the data used in this exercise are relatively granular (hourly), it is important to note that the routes obtained do not represent any individual voyage but the average optimal path for an average ship in an average month of an average year. While some may object that we should have used even more granular (temporal and spatial) data, we argue that choosing long-run averages rather than punctual data is both a pragmatic choice guided by the extensive computation cost involved in running these models and a more suitable methodological approach given the data being modelled. Historical weather data are not direct observations; rather, they are the product of powerful weather models, in our case created by the Copernicus Climate Change Service (C3S) and NASA/JPL. Thus, these data are aggregates (no model could ever account for wind patterns changing every minute and potentially every metre on rugged terrain), and we believe that the inherent imprecision caused by the aggregation better encapsulates the local unpredictability of weather patterns. Sailors would have known general trends in weather patterns, which is precisely what our averaging of weather data aims to represent.

The model

We first trialled a stochastic approach iterating route calculation 10,000 times for each port pair (i.e., documented maritime connection) using a random selection of hourly data for each month, thus producing a series of the most probable corridors for each of these pairs. This probabilistic model accounted for the variability in weather events and sea conditions that were not captured in the averaged data. However, we decided to abandon this approach because it suffers from a theoretical flaw. By shuffling the values randomly, we ignored the co-dependency of environmental variables across space; that is, the average wind speed in one cell was highly correlated with the average wind speed in its neighbouring cells. Therefore, we adopted a deterministic approach to model and calculate the optimal path in each direction for each month of the year for each historical port pair, producing 24 different routes for each port pair. Finally, we compared our results with the available historical evidence for robustness checks.
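The deterministic procedure can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: the toy grid, the wind vector, and the `step_cost` function are assumptions made for illustration and do not reproduce the cost surfaces of our models. The sketch only shows why a direction-dependent (anisotropic) cost yields a different optimal path in each direction, so that 12 months × 2 directions gives 24 routes per port pair.

```python
import heapq
import math

# 8-connected moves on a regular grid: (row step, column step).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (-1, 1), (1, -1), (1, 1)]


def step_cost(move, wind_to):
    """Hypothetical anisotropic cost: cheaper when moving with the wind.

    `wind_to` is the direction the wind blows towards, as a (row, col) vector.
    """
    dr, dc = move
    length = math.hypot(dr, dc)
    wind_length = math.hypot(*wind_to)
    # Cosine of the angle between heading and wind: 1 = running, -1 = headwind.
    cos_angle = (dr * wind_to[0] + dc * wind_to[1]) / (length * wind_length)
    return length * (2.0 - cos_angle)  # between 1x and 3x the step length


def least_cost_path(navigable, wind_to, start, goal):
    """Dijkstra search over navigable cells; returns (total cost, path)."""
    rows, cols = len(navigable), len(navigable[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        for move in MOVES:
            nxt = (cell[0] + move[0], cell[1] + move[1])
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if not navigable[nxt[0]][nxt[1]]:
                continue
            new_cost = cost + step_cost(move, wind_to)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(queue, (new_cost, nxt))
    # Walk back from the goal (assumes the goal is reachable, as it is here).
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return best[goal], path[::-1]


# Toy 4 x 6 sea with a headland (0 = land); wind blows towards higher columns.
sea = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
wind = (0, 1)
out_cost, out_path = least_cost_path(sea, wind, (1, 0), (1, 5))
back_cost, back_path = least_cost_path(sea, wind, (1, 5), (1, 0))
```

In this toy setting the downwind passage is markedly cheaper than the return leg against the wind, although both must round the same headland.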

We describe each of the variables used in our model, starting with historical data in the Methods Section.

Model outputs

Combining all the data described in the previous sections, we created four routing models with varying levels of complexity. All of these are based on our baseline wind model (Model 1), to which more variables are added at each stage. For each new model, we assessed the marginal changes in route length per port pair and the geographic and temporal distribution of these changes. Table 1 summarises the variables included in each model.

Table 1 Summary of the models created.

Model 1 exhibited a linear routing pattern. This was mostly caused by the temporal and spatial resolutions of the wind data, which was the only climatic variable used in this model. Model 1 does not properly illustrate what navigational patterns based strictly on wind data would look like in the real world. The averaging of the speed and direction in the ERA-5 series creates unrealistically homogeneous sailing conditions for each month, which translates into straight LCP lines. Micro-level weather conditions requiring constant tacking or even re-routing when faced with changing winds cannot be captured. As shown in Fig. 1, Model 1 also displays an interesting seasonal routing pattern, with longer-than-average routes between February and May, shorter ones from June to December, and a dramatic peak in March.

Fig. 1 Impact of each model on routing seasonality and mean route length.

The data added in Model 2 (waves) had a negligible impact on routing choices, route lengths, and seasonality, but it significantly increased the average journey time along these routes by degrading the maximum speed.

Model 3, however, adds considerable spatial variation in the routing pattern by interacting winds and currents. The Model 3 routes are more varied and less linear, as shown in Supplementary Fig. 6 in the Extended Data section. This model also reduced the size of the seasonal variation in route length observed in both Models 1 and 2, while increasing the overall route lengths by 20 per cent on average when compared to Model 1 (Fig. 1).

Finally, Model 4 reinstates the amplitude of the seasonal variation (although it has a shifted annual pattern and a more normal distribution, as shown in Fig. 1), while adding even more spatial complexity to routing and increasing again the average route length.

Another important observation from these iterations is that the changes described above are not distributed homogeneously across the study area. Mediterranean ports are, overall, the losers when routing parameters become more complex. In the British Isles, as shown in Supplementary Fig. 7 in the Extended Data section, it is the East-Coast ports that benefit the most from more complex routing models, while Irish and Welsh seaports lose out, which is consistent with historical observations.

Route validation

We tested the output of each model against a series of historical and modern data to determine the best-fitting model. The data and methods used for the validation are described in the Methods Section. Here we report only the comparisons with historical logbook observations (CLIWOC) and port-book data.

We first compared the monthly route density per cell from the CLIWOC dataset to our models. We focused on seasonal variations observed in the historical data and checked whether the model predicted similar deviations. This exercise revealed four key similarities in seasonal routing patterns which we take as confirmation of our routing model (visible on Supplementary Figs. 8 and 9 in the Extended Data section):

  1. The autumn shift to the South, towards the coast of Morocco, of routes going into the Mediterranean from the Atlantic.
  2. The summer concentration of routes towards the Southern Atlantic in the area North-West of La Coruña.
  3. The shift of routes towards the Portuguese Atlantic Coast in summer.
  4. The availability of routes further away from the English East coast towards Scotland during the late spring and summer and their disappearance during the winter months.

We then used independent data extracted from port books (described in the section Extant historical data for route validation) to calculate the average real-world journey time (in days) for each route and month for the period corresponding to our study. We only kept port-pairs (31 out of 206) for which we had at least seven monthly observations, and for each of them we also calculated the monthly-to-annual ratio to compare the variability of monthly values under and over the annual mean. We calculated bootstrap confidence intervals for each month in the dataset, and then used these intervals to calculate the margin of error for each data point. Finally, we removed any outliers (ratio over ten times the annual mean). We then computed the journey time estimate for each of these 31 routes based on models M1 to M4, added constant values for the port connectors (based on a speed of 3 km/h), calculated the same monthly-to-annual ratio, and plotted all the data together, as shown in Supplementary Figs. 11 and 12.
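The two statistics used in this comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the journey times are invented, the annual mean is taken as the mean of the monthly means, and a simple percentile bootstrap is used, which may differ from the exact resampling scheme applied to the port-book data.

```python
import random
import statistics


def monthly_to_annual_ratio(monthly_means):
    """Divide each monthly mean by the mean of all monthly means."""
    annual = statistics.fmean(monthly_means.values())
    return {month: value / annual for month, value in monthly_means.items()}


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Invented journey times (days) for one hypothetical route and two months.
observed_days = {"Jan": [12, 15, 11, 14], "Jul": [8, 7, 9, 8]}
monthly_means = {m: statistics.fmean(v) for m, v in observed_days.items()}
ratios = monthly_to_annual_ratio(monthly_means)
jan_low, jan_high = bootstrap_ci(observed_days["Jan"])
```

A ratio above 1 marks a month in which journeys took longer than the annual mean; the width of the interval gives the margin of error attached to that month.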

It may be worth noting that, as the port data collected by Dunn (2020) went up to the 1790s only, we did not have to include the effect of tug steamers, which increased the speed of ships along estuaries from the early 1810s. This may create a slight overestimation in the port data compared with our model which ignores in- and out-of-port navigation.

We also compared our routes with outputs generated by a modern navigation routing engine. Several programmes are available in the market, but most are too expensive for academic purposes. We therefore decided to use the excellent (and free) software, QtVlm developed by Meltemus. QtVlm allows the import of meteorological data and customised polars to set the boat’s sailing profile, as well as the export of routing output as GPX (GPS data) files.

QtVlm uses meteorological data from the NOAA at a 0.5/3h resolution. We downloaded data covering February from the VLM server and asked the engine to chart the optimal course for the following four routes: Newcastle to London, Le Havre to Marseille, Bristol to London, and Liverpool to Dunkirk. We then performed the same using our own models and plotted all routes together in Supplementary Fig. 13. A sanity check comparison between the two results shows that our routes are relatively similar to what QtVlm suggested, with added complexity for longer-distance routing.

Methods

In this section we review all the parameters included in our models and the data we collected to estimate their effect on historical navigation.

Ship characteristics

Eighteenth-century ships were generally described by the form of their hulls. Chapman (2006), who provided the first systematic classification, distinguished six main types of vessels: frigates, barks, flutes (and smaller-scale hoys, galliots, and hookers), pinks, cats, and hagboats (MacGregor, 1980, 20). By the end of the century and throughout the nineteenth century, it became more common to refer to ships by their rigs, such as brigs, brigantines, schooners, clippers, and barques (not to be confused with the earlier term bark of the same origin), a convention to which we adhere in this article.

We used the estimates available in the literature as the parameters for our model. The information summarised in Table 2 comes from a variety of sources, and it is a simplistic attempt (which would certainly horrify maritime historians) to create average ship characteristics for our models.Footnote 2

Table 2 Some ship types and characteristics.

Ship speed

To determine the maximum sailing speeds of historical ships under different wind speeds and angles, we initially hoped to find polars for historical ships. For readers unfamiliar with this term, a polar represents the empirical relationship between wind speed and sailing speed at different angles to the wind as a polar diagram. Surprisingly little data exist for the types of ships we are interested in. The only square-rigged ocean-going ship for which we could find a polar was STS Young Endeavour, a 200-tonne brigantine (Mudie, 2007). The data are shown in Fig. 2 below.Footnote 3 We adjusted the data in the second upper quadrant to reflect the fact that Young Endeavour has a brigantine rig that allows her to sail much closer to windward than the square-rigged brigs, such as colliers, that we are trying to model. In the absence of empirical data, we decided to reduce all values below 60 degrees by 25 per cent. Finally, we interpolated values for all wind speeds between 1 and 25 knots using the hrosailing Python package (Dannenberg et al., 2022).
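The adjustment and interpolation steps can be sketched in a few lines. The polar values below are invented and are not the Young Endeavour data, and plain linear interpolation between tabulated wind speeds stands in for the hrosailing routines, whose interface is not reproduced here.

```python
def adjust_close_hauled(polar, angle_limit=60.0, reduction=0.25):
    """Scale down boat speeds for true wind angles below `angle_limit`.

    `polar` maps wind speed (knots) -> {true wind angle (deg): boat speed}.
    """
    return {
        wind: {
            angle: speed * (1.0 - reduction) if angle < angle_limit else speed
            for angle, speed in row.items()
        }
        for wind, row in polar.items()
    }


def boat_speed(polar, wind, angle):
    """Linear interpolation across wind speeds at a tabulated wind angle."""
    winds = sorted(polar)
    if wind <= winds[0]:
        return polar[winds[0]][angle]
    if wind >= winds[-1]:
        return polar[winds[-1]][angle]
    for low, high in zip(winds, winds[1:]):
        if low <= wind <= high:
            weight = (wind - low) / (high - low)
            return (1 - weight) * polar[low][angle] + weight * polar[high][angle]


# Invented figures for illustration only (boat speed in knots).
toy_polar = {
    10: {45: 4.0, 90: 6.0, 135: 6.5},
    20: {45: 6.0, 90: 9.0, 135: 9.5},
}
adjusted = adjust_close_hauled(toy_polar)
speed_15kn_45deg = boat_speed(adjusted, 15, 45)
```

Only the close-hauled entries (angles under 60 degrees) are reduced; beam and broad reach values are left untouched.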

Fig. 2 Ship speed according to true wind angle and velocity parameters represented as a polar diagram.

A second independent check we performed was to use speed and wind observations from both the Royal Navy and East India Company between 1750 and 1830 from Kelly and Ó Gráda (2018) to estimate the historical wind speed at an approximated optimal sail setting. These values were set as the upper boundaries of our model.

It is important to note that the speeds indicated in the table above and on the diagram correspond to the maximum speed realised under optimal wind and sea conditions and are therefore very far from what would have been the average journey speeds. To compare our calculations to real-life journeys, one can approximate the journey time using speed estimates for the coastal trade of 3.1 km/h from Bogart et al. (2021) as a lower bound and the median daily speed for oceanic shipping of 7.2 km/h based on observations for East-India-Company ships between 1750 and 1780 from Kelly and Ó Gráda (2018, p. 469) as an upper bound.
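As a worked illustration of these bounds, the snippet below converts a route length into a range of plausible journey times. The 500 km route length is a hypothetical value chosen for the example; only the two speeds come from the sources cited above.

```python
COASTAL_SPEED_KMH = 3.1  # lower bound, Bogart et al. (2021)
OCEANIC_SPEED_KMH = 7.2  # upper bound, Kelly and Ó Gráda (2018)


def journey_days(route_km, speed_kmh):
    """Journey time in days for a route sailed at a constant average speed."""
    return route_km / speed_kmh / 24.0


route_km = 500.0  # hypothetical route length
slow_days = journey_days(route_km, COASTAL_SPEED_KMH)  # about 6.7 days
fast_days = journey_days(route_km, OCEANIC_SPEED_KMH)  # about 2.9 days
```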

Historical ports

Our historical port points (Table 3) result from the combination of existing datasets by Alvarez-Palau and Dunn (2019) for England and Wales, unpublished data for Scotland and Ireland courtesy of Eduard Alvarez, and the ANR Portic (2023) project generously provided to us by Sylvia Marzagalli. The quality of port geolocation varies among these datasets, but this has no impact on regional routing.

Table 3 Historical port-points.

To avoid cases in which ports located relatively far inland could be cut off from the navigable area, we created port connectors (at a cost of zero) for all ports, joining them to their closest point on the navigability outline.
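A connector of this kind can be sketched as a nearest-point search. The coordinates are invented, the outline is reduced to a short list of candidate points, and great-circle (haversine) distance on a sphere of mean radius 6371 km is an assumption made for the example rather than a description of the GIS procedure we used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def port_connector(port, outline_points):
    """Join a port to its closest outline point at zero routing cost."""
    nearest = min(outline_points, key=lambda p: haversine_km(port, p))
    return {
        "from": port,
        "to": nearest,
        "length_km": haversine_km(port, nearest),
        "routing_cost": 0.0,  # connectors are free in the routing step
    }


# Hypothetical inland port and two candidate points on the outline.
connector = port_connector((51.5, 0.0), [(51.5, 1.0), (52.5, 0.0)])
```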

Maximum potential land-sight visibility area (MPLVA)

The MPLVA corresponds to the entire area from which the coast is potentially visible in good weather. This section describes the method used to determine this area. Despite the prevalent use of three-dimensional viewsheds and sightline analysis by architects, urban planners, and modellers, these tools, built into most GIS suites, have not yet been applied to the modelling of visibility in a historical maritime context. Given that mariners have always relied on empirical measurements of sight to guide their passage, it is surprising that these factors have not been more widely used in the routing literature. One notable exception is Alvarez-Palau and Dunn (2019), who used a simplified geometric visibility equation to restrict the allowed navigable areas for coastal routing in England and Wales. Using a simple application of Pythagorean trigonometry, they determined how far a coastal landmark could be sighted by an observer aboard a ship, leading to the following relation:

$$Range={d}_{1}+{d}_{2}=\sqrt{2{R}_{E}{h}_{1}}+\sqrt{2{R}_{E}({z}_{2}+{h}_{2})}$$

where $R_E$ is the radius of the Earth, $h_1$ and $h_2$ are the heights of the observer and coastal landmark, respectively, and $z_2$ represents the elevation for a given point on the shoreline. Their model uses an Earth radius of 6378 km.Footnote 4 It assumes the height of the observer at sea ($h_1$) to be 1.0 m (located on the ship’s mast), sets the shoreline elevation $z_2$ to the value calculated with a Digital Topographic Model (DTM), and assumes the presence of 20-m high landmarks ($h_2$) for all of the 13,000 coastal points included in the model. The limitations inherent to this model (the curvature of the earth is ignored, and it presupposes the presence of tall landmarks regularly spaced along the coast) led us to suggest a more realistic approach for determining potential landsight areas.
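For orientation, the relation can be evaluated directly with the parameter values quoted above. The function name and the choice of a shoreline at sea level ($z_2 = 0$) are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_378_000.0  # 6378 km, as in the model described above


def sighting_range_m(h1, h2, z2, radius=EARTH_RADIUS_M):
    """Range = sqrt(2 * R_E * h1) + sqrt(2 * R_E * (z2 + h2)), in metres."""
    d1 = math.sqrt(2 * radius * h1)         # observer's horizon distance
    d2 = math.sqrt(2 * radius * (z2 + h2))  # landmark's horizon distance
    return d1 + d2


# Observer at 1.0 m, 20 m landmark, shoreline assumed at sea level.
range_m = sighting_range_m(h1=1.0, h2=20.0, z2=0.0)  # roughly 19.5 km
```

With these values the landmark term dominates: about 16 km of the total comes from the 20 m landmark and under 4 km from the observer's own height.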

We first created a Python script utilising ArcPy’s spatial analysis library to determine precise coastal visibility. We then constructed a digital elevation model (DEM, i.e., a 3-D terrain model) for Western Europe, merging existing data provided by the EU Copernicus land cover (European Environment Agency, 2016). The resulting DEM covers the entire study area at a resolution of 0.01° × 0.01° (115 m × 115 m) and a vertical resolution of ±7 m.

We used the DEM to create an outline based on the geometric formula described above. We then generated points along this outline, equally spaced at intervals of 1 km, and for each of these points, we computed the visibility radius using ArcPy’s visibility function. We then intersected the radius with our digital elevation model (DEM) to determine the overlapping area. Using the cell size of the DEM, we calculated one unit as 257,087.75 m². When the overlapping area exceeded 13.5 units, we shifted the point back by half of the distance separating the point from the coast along the orthogonal line connecting the geometric outline to the coast. Inversely, when the overlapping area was less than 1.5 units, we moved the point towards the coast in the same manner. We then iterated this process until the overlap for each point was within the thresholds of 1 and 15 units. The resulting area is the Maximum Potential Land-sight Visibility Area (MPLVA). It is worth noting that the MPLVA corresponds to the theoretical maximum area which allows ships to sail while maintaining constant landsight under perfect atmospheric conditions. For the purpose of our model, this has two key limitations: i. it is too restrictive, as even coastal shipping vessels lost sight of land for brief periods, and ii. it does not reflect real-world visibility, which can be significantly degraded by events such as low cloud cover, rain, or fog. In cases of low visibility, sailors were reluctant to sail close to the coast, as this significantly increased the risk of collision and wreckage. These factors are discussed in the following two sections.
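The iterative adjustment of a single outline point can be sketched as below. This is a simplified illustration, not the ArcPy script itself: `overlap_units` is a hypothetical stand-in for the intersection of the visibility radius with the DEM, the point is reduced to its distance from the coast along the orthogonal line, and the default thresholds follow the 1.5 and 13.5 unit figures quoted in the text.

```python
def adjust_offshore_distance(distance_km, overlap_units,
                             lower=1.5, upper=13.5, max_iter=50):
    """Move a point along its orthogonal line until the overlap is in range.

    `overlap_units(distance_km)` returns the visible land area, in units,
    for a point lying `distance_km` from the coast.
    """
    for _ in range(max_iter):
        overlap = overlap_units(distance_km)
        if overlap > upper:
            # Too much land in view: shift seaward by half the distance.
            distance_km += distance_km / 2
        elif overlap < lower:
            # Too little land in view: shift towards the coast by half.
            distance_km -= distance_km / 2
        else:
            break
    return distance_km


def toy_overlap(distance_km):
    """Hypothetical overlap that falls off linearly with distance offshore."""
    return max(0.0, 40.0 - distance_km)


too_far = adjust_offshore_distance(60.0, toy_overlap)    # one step inshore
too_close = adjust_offshore_distance(10.0, toy_overlap)  # three steps seaward
```

The `max_iter` guard is an addition for the sketch: with multiplicative steps a point could in principle jump back and forth across a narrow acceptable band.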

Land sight requirement for historical sailing

Coastal visibility was a critical requirement for historical navigation and in this section we analyse the importance of this constraint for historical routing. We argue first that the visibility constraint should be loosened and then we determine empirically the extent to which it should be extended. A common misunderstanding about the point i. described above comes from older definitions of coastal shipping, which tended to denigrate the quality of this type of navigation compared with more ‘noble’, scientific, or adventurous transoceanic shipping. This distinction emerged in the early modern period (see, for example, Richard Eden’s 1561 preface to his translation of Cortés’ Art of Navigation, quoted in Ash (2007, 509)) and was echoed throughout the eighteenth and nineteenth centuries by compilers of navigation treatises, who rejected coasters as second-rate mariners limited to local experiential knowledge rather than the then-developing science of navigation. It was often dismissively argued that coastal shipping was simply a training ground, a nursery, for future sailors (Armstrong, 1996, XIII), who would then graduate and join a long-distance shipping fleet or, better, the Navy. However, this image is far from the reality of coastal and short-sea sailing in the eighteenth century.

First, sailors undoubtedly lost sight of the land (albeit temporarily) when sailing, be it because of adverse weather conditions, night-time sailing, or longer crossings, for which coastal hugging would have required a major added time cost or risk.

Second, hugging the coast was by far the most dangerous routing option, especially under poor visibility conditions and/or at night. Maintaining a safe distance from the coast was a vital requirement, and coastal shipping was no exception to this fundamental sailing rule.

Third, with the exponential increase in trade in the eighteenth century, coastal shipping covered many international voyages (most notably in Europe, owing to the multinational continental shoreline).

Fourth, ships did not always sail between the same ports (port books demonstrate that, overall, this was the exception rather than the norm), which goes against the perception of a strict restriction of sailors’ activity on well-travelled local connections.

Finally, the ships used for coastal shipping were not inferior to ocean-going vessels. Colliers were highly sought-after all-rounders, and when in 1768 the Admiralty was looking to buy a ship in preparation for James Cook’s first expedition to the South Pacific, they chose a flat-bottomed collier built at Whitby in 1764 (the Earl of Pembroke, then renamed Endeavour) as a safer, more economical and more flexible option than the faster vessels used by the Navy. Endeavour’s career circumnavigating the globe between 1768 and 1779 is a testament to their judgement.

In this article, we draw a line between intercontinental shipping, which we do not cover, and other forms of shipping: local (“petit cabotage” in French), regional, national and even international as long as it remained within European seas (“grand cabotage” in French). We do not directly model the European leg of intercontinental voyages, unless they include stopovers in two European ports.

To model routing based on less stringent visibility conditions, we created several concentric buffers (Fig. 3) around the MPLVA line, corresponding to the areas in which a ship could get sight of land within 6 h (i.e., at least every 12 h), 12 h (i.e., at least every 24 h), and 24 h (i.e., at least every 48 h). This resulted in five different possible parameters to restrict routing based on visibility: strict MPLVA (i.e., constant land visibility), MPLVA +6H, MPLVA +12H, MPLVA +24H, and no restrictions. By estimating the routes for each model for all ports in our dataset, we found that extending the navigability area beyond 6 h to land sight had no effect on the average route length (see Table 4).

Fig. 3 Different visibility restriction areas based on time to reach visibility outline.

Table 4 Total route length in kilometres based on different navigability restrictions from constant land sight to no land sight requirement at all.

However, based on observations of ship positions extracted from the CLIWOC data (see Fig. 9 and our description of these data in Section 3), we adopted the MPLVA +12H line as our main navigability area. This approximates the maximum distance a ship could sail out while not losing sight of land for longer than 24 h, based on a maximum sailing speed of 20 km/h, which is twice the upper bound for fast oceanic sailing under strong wind conditions (Kelly and Ó Gráda, 2018, Fig. 5). The conclusion is that ships mostly sailed in an area where sight of land would have been possible at least once every 24 h. The obtained area is the navigable area used in our routing calculations, and all cells beyond this area are considered non-navigable.
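Read literally, the 20 km/h ceiling implies a simple buffer width for each time threshold. The calculation below is an illustrative reading that assumes each buffer is obtained as time multiplied by maximum speed; the function and label names are hypothetical.

```python
MAX_SPEED_KMH = 20.0  # maximum sailing speed quoted above


def buffer_width_km(hours_to_landsight, speed_kmh=MAX_SPEED_KMH):
    """Distance beyond the MPLVA line reachable in the given time."""
    return hours_to_landsight * speed_kmh


# Buffer widths for the three time thresholds used in the text.
buffers = {f"MPLVA +{h}H": buffer_width_km(h) for h in (6, 12, 24)}
```

Under this reading the MPLVA +12H area extends 240 km beyond the constant-landsight line, so that an outward and return leg together keep a ship out of sight of land for at most 24 h.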

Horizontal visibility and fog

We can now return to the second point raised at the end of the section on the MPLVA, which is the issue of limited visibility caused by changing atmospheric conditions. In this section we explain how we created a measure of horizontal visibility and fog likelihood, and why we decided not to include it in our parameters for the models.

We collected almost 5.5 million historical observations (Table 5) of hourly/three-hourly/daily horizontal visibility from automated airport weather observations (AWOS sensors) available on the ASOS website maintained by the Iowa Environmental Mesonet network (2021) and meteorological station data from NCEI’s (2018) Global Surface Summary of the Day. From these two datasets, we collected all available data for Belgium, Britain, France, Ireland, Italy, the Netherlands, Portugal, and Spain for the period 1929−1970, and averaged them to monthly values. As the resulting dataset had very few coastal points in France, we completed it using publicly available SYNOP data from Météo-France (2014) for 1996−2020.

Table 5 Visibility data.

Once compiled and standardised, we interpolated the values using Empirical Bayesian Kriging in ArcGIS to create 12 continuous rasters, showing the average horizontal visibility for each month. Figure 4 below reveals three strikingly different sub-spaces that are relatively consistent throughout the year: areas with excellent visibility (the Mediterranean), areas with lower than average visibility (the Atlantic coast), and a particularly dangerous dark and hazy corner of the North Sea with very low average visibility, centred around East Anglia.

Fig. 4: Interpolated mean monthly horizontal visibility, in kilometres.

We decided not to modify the extent of the navigability area based on the likelihood of degraded visibility conditions, as this would have required an arbitrary decision on the threshold of visibility that sailors deemed acceptable. Instead, we used these data to calculate a visibility impairment coefficient for each route and for each month, which we report in the Extended Data section. One immediately noticeable outcome of this analysis is the extraordinary danger of sailing around the East Anglian coast. In both Norfolk and Suffolk, the MPLVA shrinks because of the flat lowlands (as shown in Fig. 3), so sailors were forced to sail closer to the coast. At the same time, reduced visibility made proximity to the shore even more perilous: on a particularly foggy day, a captain could lose his bearings and hit rocks along the coast or one of the many sandbanks. So terrifying was England’s east coast that some North Sea mariners ‘had rather run the hazard of an East India voyage than be obliged to sail all the winter between London and Newcastle.’ (Footnote 5)

Use of optical magnification for navigation

In this section we argue that, thanks to technological change in the seventeenth and eighteenth centuries, optical devices became generally available and affordable. Thus, our navigability area does not need to be degraded to account for naked-eye visibility.

Optical magnification played a crucial role in aiding sight navigation, particularly during the eighteenth century, when new technology made telescopes both affordable and much more efficient (Footnote 6). The development of the telescope and its subsequent improvements significantly influenced navigational practices by allowing for more accurate observations of celestial bodies and distant landmarks. London, in this period, emerged as a leading production centre for optical devices, with a high degree of specialisation and division of labour within the industry (Morrison-Low, 2011). By the early eighteenth century, telescopes had become a standard part of a ship captain’s equipment, as is evident in caricatures and representations of the period.

As telescopes became more common, their range also increased significantly. The telescope often attributed to Galileo, though not the first of its kind, offered a magnification of approximately three times in the early seventeenth century. By the second half of the eighteenth century, a high-quality telescope could achieve 60−80 times magnification, whereas the more affordable options on the market offered ten times magnification. One notable figure in the development of telescopes is John Dollond. In 1758, he patented an invention that addressed the issue of chromatic aberration in refracting telescopes, which not only improved magnification capabilities, but also contributed to enhanced image quality (Dunn, 2011).

Because the improvement of telescopes enabled navigators to make more accurate observations, ultimately contributing to better sight navigation during the seventeenth and eighteenth centuries, we decided not to include atmospheric degradation based on naked-eye land sighting in our model.

Bathymetry

In this section, we move away from visibility to focus on sea depth as a key determinant of navigability. Ships hitting sandbanks or underwater rock formations in shallow water were certainly the most common cause of wrecking in the eighteenth and nineteenth centuries (Litvine et al., forthcoming). Using modern LIDAR data extracted from EMODnet’s (2020) bathymetric digital terrain model (DTM), we were able to use a detailed map of the European seabed (at a resolution of 115 m by 115 m) and control the navigability of each cell according to its average depth. Based on the draft (distance between the waterline and keel at full load) of ships frequently encountered in the European seas in the eighteenth and nineteenth centuries (Harland, 2015), we estimated that any cell with an average depth of less than 10 m was not navigable. Furthermore, as it was very difficult to locate shallow waters with precision (owing to shifting sandbanks, the limited precision of contemporary navigation charts, and the lack of a precise positioning system), we considered that the fear of wrecking would have led even experienced sailors to err on the side of caution and stay away from any potentially dangerous area altogether. However, the fine resolution of our bathymetry data would have enabled a zigzagging route within 50 m of any serious hazard (see Fig. 5), a course which no sound pilot would ever have attempted. We therefore re-sampled the data by a factor of ten to obtain 1.15 km by 1.15 km grid cells, each being given the maximum value in the underlying area. By doing so, we ensured that any suggested routing would remain at least 500 m away from any potential hazard.
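The re-sampling step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the DTM stores elevation in metres with negative values below sea level, so that the maximum value in a block corresponds to its shallowest point. The function names `coarsen_max` and `navigable` and the toy grid are hypothetical.

```python
import numpy as np

def coarsen_max(elevation, factor=10):
    """Aggregate a 2-D grid into factor x factor blocks, keeping each block's maximum."""
    rows, cols = elevation.shape
    rows -= rows % factor  # drop incomplete blocks at the grid edges
    cols -= cols % factor
    blocks = elevation[:rows, :cols].reshape(
        rows // factor, factor, cols // factor, factor
    )
    return blocks.max(axis=(1, 3))

def navigable(elevation, min_depth_m=10.0):
    """True where the water is at least min_depth_m deep (elevation <= -min_depth_m)."""
    return elevation <= -min_depth_m

# Toy 20 x 20 grid of 115 m cells: open water 40 m deep with a single 3 m-deep shoal.
fine = np.full((20, 20), -40.0)
fine[4, 13] = -3.0

coarse = coarsen_max(fine)   # 2 x 2 grid of 1.15 km cells
mask = navigable(coarse)     # the whole coarse cell containing the shoal is closed
```

On the fine grid only the shoal cell itself is closed, so a route could pass right next to it; on the coarse grid the entire 1.15 km cell is closed, which is what keeps routes away from the hazard.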

Fig. 5: Logic for routing through shallow water, assuming a 10 m-depth threshold for navigability, raw data on the left, re-sampled on the right.

Wind speed and direction

No model of the ‘Age of Sail’ would be complete without variables representing wind. The most suitable long-run wind series come from the Climate Data Service’s ERA-5 dataset (2020), with a resolution of 0.25° by 0.25° (27.25 km by 27.5 km). These series offer not only the most appropriate spatial and temporal extent but also the most granular resolution of the variable itself. The data comprise a re-analysis of forecast data from 1950 to 1978. For each variable, data were acquired at hourly intervals for each month, across 29 years. These subsets were then averaged, which smooths over extraneous meteorological events. Because the level of aggregation can be chosen, the data can also be used at different resolutions.

First, data entailing the wind’s near-surface (10 m above the Earth’s surface) U and V components, or the eastward and northward vectors, were used to determine the real direction and strength. These two vectors can be combined using Equation 1 to determine both wind velocity (VT) and wind direction (γ).

$${V}_{T}=\sqrt{{u}^{2}+{v}^{2}}$$
(1a)
$$\gamma =\operatorname{arctan2}(v,u)$$
(1b)

The resulting weight matrix is shown in Supplementary Fig. 1 in the Extended Data section.
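As an illustration of Equation 1, the following sketch combines the two components with NumPy. It assumes the two-argument arctangent with the argument order (v, u), which yields a mathematical angle measured counter-clockwise from east and pointing where the wind blows to; converting to a compass bearing or to the meteorological "blowing from" convention is a separate step that is not shown. The function name is hypothetical.

```python
import numpy as np

def wind_speed_and_direction(u, v):
    """Return wind speed (same unit as u, v) and direction in degrees, in [0, 360)."""
    speed = np.hypot(u, v)                             # Eq. 1a
    gamma = np.degrees(np.arctan2(v, u)) % 360.0       # Eq. 1b, assumed atan2(v, u)
    return speed, gamma

# Four example cells: eastward (u) and northward (v) components in m/s.
u = np.array([3.0, 0.0, -1.0, 0.0])
v = np.array([4.0, 1.0, 0.0, -1.0])
speed, gamma = wind_speed_and_direction(u, v)
```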

We first used the rWind R package developed by Fernández-López and Schliep (2019) to compute the wind connectivity values between all cells and their eight adjacent cells (Moore neighbourhood). Wind connectivity values were derived using the algorithm described in Felicísimo et al. (2008).

We then created a second set of wind connectivity values to estimate the speeds of the historical ships (VB). We first calculated for each cell the angle of the ship’s course η and determined the angle of the wind with respect to the direction of the ship (β), using the wind direction (γ) calculated above; thus, 0° indicates a ship sailing directly upwind (obviously an impossible situation) and 180° a ship sailing downwind. We then used the polars described in Section 3 to interpolate the transition cost for each cell along the predetermined route, using both wind velocity (VT) and β as parameters.

$$\beta =| \eta -\gamma |$$
(2a)
$${V}_{B}=f(\beta ,{V}_{T})$$
(2b)
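Equation 2 can be sketched as a lookup in a polar table. The table below is a hypothetical placeholder, not the polars of Section 3, and the sketch makes two assumptions that the text does not spell out: γ is taken as the direction the wind blows from (so that β = 0° means sailing upwind), and β is folded into the range 0° to 180°. The function f is approximated by bilinear interpolation.

```python
import numpy as np

# Hypothetical polar table (placeholder values): boat speed in km/h.
BETA_DEG = np.array([0.0, 45.0, 90.0, 135.0, 180.0])   # angle to the wind
WIND_MS = np.array([0.0, 5.0, 10.0, 15.0])             # wind speed V_T
POLAR_KMH = np.array([
    # beta: 0    45    90    135   180
    [0.0, 0.0,  0.0,  0.0, 0.0],   # V_T = 0 m/s
    [0.0, 2.0,  6.0,  7.0, 5.0],   # V_T = 5 m/s
    [0.0, 3.0,  9.0, 11.0, 8.0],   # V_T = 10 m/s
    [0.0, 3.0, 10.0, 12.0, 9.0],   # V_T = 15 m/s
])

def relative_wind_angle(eta_deg, gamma_deg):
    """Eq. 2a, folded into [0, 180] so that port and starboard are equivalent."""
    diff = abs(eta_deg - gamma_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def boat_speed(beta_deg, wind_ms):
    """Eq. 2b: bilinear interpolation of the polar table (clamped at its edges)."""
    per_wind = np.array([np.interp(beta_deg, BETA_DEG, row) for row in POLAR_KMH])
    return float(np.interp(wind_ms, WIND_MS, per_wind))

beta = relative_wind_angle(350.0, 10.0)   # course 350 deg, wind from 10 deg
speed = boat_speed(beta, 7.5)
```

The cell transition cost would then follow from the cell size divided by this speed; how speeds of zero (sailing directly upwind) are handled is left open here.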

Wind variability and extreme weather events

The average wind speed does not distinguish between areas that have stronger but intermittent winds and areas that have lower but steadier wind conditions. Knowledge of local wind patterns would necessarily have informed sailors’ choices in the past, and it seems reasonable that, all else being equal, sailors would have favoured areas characterised by less variability (in both speed and direction) and higher average wind speeds. This variability was particularly significant during the winter months, as shown in Fig. 6. Therefore, we calculated the coefficient of variation (CV) of wind speed and of wind direction for each grid cell by dividing the standard deviation by the corresponding mean. We then multiplied the transition cost of each cell by these values.

Fig. 6: Distribution of wind direction (left) and wind speed (right) per month based on 100,000 random samples from each of the raster months.

Finally, the average wind speed does not account for possible sub-hourly local variations in the probability of extreme winds. Assuming that the spatial distribution of these events would have been known by sailors, we included the likelihood of extreme winds in our routing model. Our hypothesis was that if an area is, on average, more prone to stormy weather, sailors, guided by accumulated local knowledge, would have tended to avoid this area. Using the definition of the World Meteorological Organisation (1987) and following Minola et al. (2020), we decided to capture abrupt variations in wind patterns by including near-surface peaks or wind gusts in the routing model, defined as the maximum of the wind (in m s−1) averaged over 3 s intervals at a height of ten metres above the surface of the Earth. To consider the likelihood of stormy winds, we calculated the ratio of the mean speed to the gust maximum for each cell (MoM) and multiplied the transition cost for each cell by this value, as shown in Equation 3. The spatial patterns of these coefficients are shown in Supplementary Figs. 2 and 3.

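$$\sigma_{\mathrm{speed}}=\sqrt{\frac{\sum_{i}{(\mathrm{speed}_{i}-\mathrm{mean}_{\mathrm{speed}})}^{2}}{696}}$$
(3a)
$$\mathrm{CV}_{\mathrm{speed}}=\frac{\sigma_{\mathrm{speed}}}{\mathrm{mean}_{\mathrm{speed}}}\quad \mathrm{CV}_{\mathrm{dir}}=\frac{\sigma_{\mathrm{dir}}}{\mathrm{mean}_{\mathrm{dir}}}$$
(3b)
$$\mathrm{MoM}=\frac{\mathrm{mean}_{\mathrm{speed}}}{\mathrm{max}_{\mathrm{speed}}}$$
(3c)
$$\mathrm{weighted\ transition\ cost}=\mathrm{CV}_{\mathrm{speed}}\times \mathrm{CV}_{\mathrm{dir}}\times \mathrm{MoM}\times \mathrm{transition\ cost}$$
(3d)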
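A minimal sketch of Equation 3 for a stack of observations is given below. The array layout (observations along the first axis, cells along the second) and the function name are assumptions, and wind direction is treated as an ordinary number in degrees, as the equations are written, without any circular statistics.

```python
import numpy as np

def weighted_transition_cost(speed, direction, gust_max, transition_cost):
    """Apply Eq. 3a-3d cell by cell.

    speed, direction: arrays of shape (n_observations, n_cells)
    gust_max, transition_cost: arrays of shape (n_cells,)
    """
    mean_speed = speed.mean(axis=0)
    # np.std divides by n by default, matching the fixed denominator in Eq. 3a.
    cv_speed = speed.std(axis=0) / mean_speed                    # Eq. 3b
    cv_dir = direction.std(axis=0) / direction.mean(axis=0)      # Eq. 3b
    mom = mean_speed / gust_max                                  # Eq. 3c
    return cv_speed * cv_dir * mom * transition_cost             # Eq. 3d

# Two observations for two cells: a variable cell and a perfectly steady one.
speed = np.array([[4.0, 10.0], [6.0, 10.0]])
direction = np.array([[80.0, 90.0], [120.0, 90.0]])
gust_max = np.array([10.0, 20.0])
cost = np.array([100.0, 100.0])
weighted = weighted_transition_cost(speed, direction, gust_max, cost)
```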

Waves

Data corresponding to the direction, height, and period of the waves were obtained using the Climate Data Service’s ERA5 dataset (Bell et al., 2020). Similar to the wind data, a time series of 29 years was utilised to determine the monthly average at each time interval between 00:00 and 23:00. The direction of waves is given in true degrees and comprises a two-dimensional wave spectrum, or all frequencies and directions of the wind-sea waves and swell. Thus, the mean wave direction (MWD) used is the product of the directions of the local winds and the swell, or the waves generated by the wind at different locations and times.

In addition, both the significant height of the combined wave and swell (SWH) in metres and mean wave period (MWP) in seconds were used.

We used data on the effect of the mean wave height and direction on the speed loss calculated by Prpić-Oršić and Faltinsen (2012) for large tankers to provide a lower-bound estimate of the effect of waves on the sailing speed. Large tankers (over 175 m in length, i.e., at least six times longer than the best brigs sailing European seas in the eighteenth and nineteenth centuries) offer more resistance to waves and, unlike wind-powered vessels, are also not affected by intermittent loss of propulsion. However, they are also far more stable and seaworthy in adverse weather than ships navigating the European seas in the eighteenth and nineteenth centuries. Thus, our model is likely to underestimate the effect of waves on speed for small boats, but this has limited importance because we are not trying to calculate voyage speed per se in the routing stage of the simulation, but simply determining weights to model mariners’ routing choices within given environmental conditions. We extracted the data presented by Prpić-Oršić and Faltinsen (2012, 7, Supplementary Fig. 2) to create a cost matrix that is applicable to our ERA-5 wave data, shown in Table 6. Finally, in order to apply the matrix we first calculated the angle of the ship transiting through the cell relative to the direction of the waves, and applied the coefficient corresponding to the monthly SWH value for this cell.

Table 6 Speed loss (in per cent) according to wave height and direction.
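The way such a cost matrix is applied can be sketched as a two-way lookup. All numbers below are placeholders chosen for illustration and are not the values of Table 6; the bin edges, the three angular sectors, and the function names are likewise hypothetical. The sketch assumes that the wave direction is the direction the waves come from, so that a relative angle of 0° corresponds to head seas.

```python
import numpy as np

# Placeholder bins and coefficients (NOT the values of Table 6).
SWH_EDGES_M = np.array([1.0, 2.0, 4.0])      # significant wave height bin edges
ANGLE_EDGES_DEG = np.array([60.0, 120.0])    # head < 60, beam 60-120, following >= 120
LOSS_PCT = np.array([
    # head  beam  following
    [ 2.0,  1.0,  0.0],    # SWH < 1 m
    [ 8.0,  4.0,  1.0],    # 1 m <= SWH < 2 m
    [20.0, 10.0,  3.0],    # 2 m <= SWH < 4 m
    [40.0, 25.0,  8.0],    # SWH >= 4 m
])

def wave_angle(course_deg, wave_dir_deg):
    """Angle between the ship's course and the wave direction, folded into [0, 180]."""
    diff = abs(course_deg - wave_dir_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff

def speed_after_waves(speed, course_deg, wave_dir_deg, swh_m):
    """Reduce a speed by the percentage found in the placeholder matrix."""
    row = np.searchsorted(SWH_EDGES_M, swh_m, side="right")
    col = np.searchsorted(
        ANGLE_EDGES_DEG, wave_angle(course_deg, wave_dir_deg), side="right"
    )
    return speed * (1.0 - LOSS_PCT[row, col] / 100.0)

head_sea = speed_after_waves(10.0, 0.0, 0.0, 3.0)         # 2-4 m waves on the bow
following = speed_after_waves(10.0, 90.0, 270.0, 0.5)     # small waves from astern
```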

High waves can also lead to dangerous conditions, in which sailing becomes extremely hazardous or even impossible. We adopted two rules of thumb commonly used by skippers to determine the cells that would be too dangerous to travel through. Thus, our no-sail areas are defined as i. cells for which the average wavelength is equal to or less than seven times the mean wave height of the combined wind waves and swell, and ii. cells for which the wave height was over 30 per cent of the average boat length. The first ratio indicates that when the distance between waves decreases (relative to their size), waves are more likely to break and overwhelm ships. For the second criterion, we took the average hull length for brigs in the eighteenth and nineteenth centuries documented in Harland (2015). As the ERA-5 dataset did not include the wavelength, we estimated it as follows:

  1. For shallow waters, where waves interact with the bottom of the ocean, the wavelength is mostly determined by the depth of the sea. Given that ocean waves rarely have wavelengths greater than 200 m (Earle, 2015), we estimated that for any cell where the sea is less than 50 m deep, the wavelength should be estimated using the following formula derived from Earle (2015), where L is the mean wavelength and T is the mean period:

    $$L=(-0.0052{T}^{3}+0.2521{T}^{2}-0.6117T+0.1392)* T$$
    (4a)
  2. For deep water (over 50 m) and fully developed waves, we used a best-fit estimate of the relation between the mean period and wavelength from Earle (2015) for each grid cell:

    $$\begin{array}{l}L=0.002{T}^{6}-0.1094{T}^{5}+2.376{T}^{4}-26.081{T}^{3}\\\qquad+\,152.33{T}^{2}-436.99T+489.31\end{array}$$
    (4b)

The resulting weight matrix is shown in Supplementary Fig. 4 in the Extended Data section.
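The two wavelength estimates and the two no-sail rules can be combined as in the sketch below. The polynomial coefficients are those of Equations 4a and 4b; the boat length of 30 m is a hypothetical placeholder for the average brig length taken from Harland (2015), which is not stated here, and the function names are illustrative.

```python
import numpy as np

def wavelength_m(period_s, depth_m):
    """Mean wavelength L from mean period T: Eq. 4a below 50 m depth, Eq. 4b otherwise."""
    T = np.asarray(period_s, dtype=float)
    shallow = (-0.0052 * T**3 + 0.2521 * T**2 - 0.6117 * T + 0.1392) * T      # Eq. 4a
    deep = (0.002 * T**6 - 0.1094 * T**5 + 2.376 * T**4 - 26.081 * T**3
            + 152.33 * T**2 - 436.99 * T + 489.31)                             # Eq. 4b
    return np.where(np.asarray(depth_m, dtype=float) < 50.0, shallow, deep)

def no_sail(wavelength, swh_m, boat_length_m=30.0):
    """True where either rule of thumb closes the cell (boat length is a placeholder)."""
    too_steep = wavelength <= 7.0 * swh_m          # rule i: short, steep waves
    too_high = swh_m > 0.30 * boat_length_m        # rule ii: waves over 30% of hull length
    return too_steep | too_high

# Three example cells with known wavelengths (m) and significant wave heights (m).
closed = no_sail(np.array([70.0, 20.0, 200.0]), np.array([2.0, 3.0, 10.0]))

# Wavelength estimates for an 8 s period in 20 m and in 500 m of water.
L_shallow = float(wavelength_m(8.0, 20.0))
L_deep = float(wavelength_m(8.0, 500.0))
```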

Currents

It is certain that sailors in the early modern period had only an imperfect understanding of oceanic currents, but as shown by Peterson et al. (1996), patchy, local knowledge was accumulated and recorded systematically from the sixteenth century and the emergence of the Sevillian School of Cartography. In the late 1760s more accurate large-scale maps of oceanic currents began to appear, including Benjamin Franklin and Timothy Folger’s chart of the Gulf Stream, but with very limited usefulness, other than for transoceanic sailing. It was only in the era of Matthew Fontaine Maury and his ‘Wind and Current Charts’ that the availability of global current charts became a reality for sailors. However, in our modelling, we assumed that familiarity with prevailing local currents was part of the sailors’ practical knowledge of the sea and formed part of their routing decisions.

For the purpose of our routing model, we only considered superficial currents (up to 15 m depth) or “direct wind-driven currents” according to the terminology defined by Röhrs et al. (2021). Deeper oceanic currents flow at a depth that greatly exceeds the average vessel draft, so we considered that they had no impact on the sailing speed. In our model, the current acts as an independent component of the total cell transition cost (ship speed), akin to a ‘maritime conveyor belt’ on which the ship travels.
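One simple way to picture the conveyor-belt idea is to add the component of the current that lies along the ship's course to the ship's speed through the water. This is a sketch under that assumption rather than the connectivity calculation actually used; it ignores sideways drift, measures the course counter-clockwise from east (the same frame as the u and v components), and uses hypothetical function names.

```python
import math

def speed_over_ground(boat_speed, course_deg, current_u, current_v):
    """Boat speed plus the along-course component of the surface current (same units)."""
    course = math.radians(course_deg)
    along_track = current_u * math.cos(course) + current_v * math.sin(course)
    return boat_speed + along_track

def crossing_time(distance, boat_speed, course_deg, current_u, current_v):
    """Time to cross a cell; infinite if the current cancels all forward progress."""
    sog = speed_over_ground(boat_speed, course_deg, current_u, current_v)
    return distance / sog if sog > 0 else math.inf

with_current = speed_over_ground(10.0, 0.0, 2.0, 0.0)       # current pushes the ship along
against_current = speed_over_ground(10.0, 180.0, 2.0, 0.0)  # same current, opposite course
```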

For this exercise, we used global reanalysis data from the OSCAR third-degree resolution ocean surface currents obtained from the NASA/JPL PO.DAAC repository (ESR, 2009). The data come at a resolution of 1/3° × 1/3° (33 km × 33 km) at 5-daily intervals from 1992 to 2020. We averaged the data to obtain twelve monthly variables for both vectors (u, v) and clipped them to the extent of this study (as shown in Supplementary Fig. 5 in the Extended Data section). As with the wind, the current speed and direction and the subsequent current connectivity values were calculated using the rWind R package by Fernández-López and Schliep (2019).

Finally, it is worth highlighting a shortcoming of our oceanic model. As we could not obtain any tidal current data, our model did not account for the local effects of tidal systems. This has generally very limited consequences for sailing on the high seas, but it can have a much larger impact on specific areas where tidal systems dominate oceanic currents, which is particularly true for the English Channel in our study area.

Summary of variables

All variables containing a temporal dimension were mean-averaged to create 12 monthly values. Twelve simulations were created, one for each calendar month. In contrast, the stochastic approach, which we eventually discarded, propagated the uncertainty in the aggregation process by randomly assigning values to cells based on all values available within a given month. This means that the wind vector value in cell number 1 for the January simulation represented a randomly selected value from all the hourly January values for cell number 1 included in our sample of 24 hourly values for each calendar month for 29 years (i.e., one in 8352). Table 7 summarises the number of values for each cell in our grid.

Table 7 Number of values for each cell and each variable.

Extant historical data for route validation

Historical records provide some evidence (albeit patchy and indirect) of sea routing.

  1. Port records indicate the arrival and subsequent departure dates of vessels. In the best cases, meticulous and time-consuming record linking (of port entries and departures) allows the reconstruction of a limited number of voyages, and therefore, the total journey time. The interpretation of these data remains controversial, as it is relatively difficult to ascertain whether a journey between two points was direct, or if it included stops at different ports along the coast. Dunn and Bogart recently collected a large amount of port data and meticulously linked them to obtain 44,541 single-journey observations. We are very grateful to both for allowing us to use their data to check the journey length calculated from our routes against their documented voyages (Footnote 7).

  2. When they survive, logbooks offer probably the best direct evidence for ship routing. These documents recorded (among other things) the daily position of vessels, speed and weather conditions. By identifying, mapping, and linking these points, it is possible to observe historical routing patterns. The best example of these data is undoubtedly the CLIWOC dataset (García-Herrera et al., 2005), which contains 287,114 positions extracted from 1891 logbooks covering British, Dutch, French, and Spanish ships (Footnote 8). The routes corresponding to the study area are illustrated in Fig. 8. We used these data to validate our speed estimates, the extent of our maximum navigability area, and our routing decisions. Sadly, in the eighteenth century, such logbooks were mostly used by the long-distance commercial fleet and the Navy. Only a handful of equivalent documents have been identified in the archives for coasters, and their data have not yet been digitised.

  3. The position of wrecks on the seabed gives another indication of the past presence of ships in the neighbouring area. As part of this research project, we created a dataset containing approximately 200,000 wrecks around the British Isles and the coasts of (metropolitan) France. Supplementary Fig. 10 shows the spatial distribution of wrecks. What can be inferred from this is not, however, self-evident: is a high concentration of wrecks an indication of ships sinking because they veered off from their expected safe routes, or does it simply indicate a more treacherous area located on a main trading route? Probably a bit of both, which makes wreck locations less suitable for establishing historical sailing routes.

  4. Published navigational charts could shed some light on recommended historical routes. These charts, conceived as navigational aids for seafarers, emerged during the period covered by this study: in a rather rudimentary form in the sixteenth century, and progressively with more detail (including sea floor depth, landmarks, sand banks, and coast outline) in the following two centuries. Some eighteenth-century nautical charts with navigation routes have survived, but their coverage is lamentably patchy. An example is Murdoch Mackenzie’s 1750 chart of Orkney (Footnote 9). Mackenzie’s maps of Orkney are unique; although he was an amateur surveyor until then, their great precision got the one-time schoolmaster in Kirkwall an appointment as Surveyor to the Admiralty. The map depicts the route taken and the soundings recorded, but only for the short journey around Orkney, as shown by the dotted line in Fig. 7. Regarding Britain, Captain Greenville Collins was probably the first to produce a methodical survey of the British coast (Collins, 1758). His navigational charts were realised over seven years of observations (1681−1688), and as the dedication indicates, they provide: ‘Directions to Mariners to Sail alongst the Coast of Great Britain, and how to carry a Ship into any Harbour, River, Port, Road, Bay, or Creek with safety, and how to avoid all Dangers known.’ Sadly, however detailed they may be, these charts do not report the routes taken by ships once they were out of the main ports, and they are therefore of little use for our enterprise.

  5. Other documents compiling sailors’ experiences at sea (autobiographical accounts, navigation treatises, nautical charts, letters, and travel literature) may provide some partial and isolated evidence of sea routing, but attempting to gather all the information and geo-locate each point (assuming this is even possible) is well beyond the scope of this paper.

Despite all these varied sources of information on historical navigation, even in the best case, very little is known about coastal and short-sea routes. To validate our estimates, we therefore adopted a three-pronged approach: first, a visual comparison with the only large record of historical routing for this area (routes reconstructed from the ship log data included in the CLIWOC dataset); second, a comparison of port-to-port speed estimates extracted from the port book data linked by Dunn and Bogart with the total travel time along a similar route in our models; and third, a sanity check using modern open-source routing software (qtVlm) with the characteristics of historical square-riggers as described above.

Fig. 7: Rare example of an eighteenth-century nautical chart containing sea routing.

Courtesy of the National Library of Scotland.

Validating predicted routes using CLIWOC

Although, as explained above, they do not represent coastal routes per se, the CLIWOC data are by far the most complete record of ship positions along the European coasts. Each observation represents the point at which the mariners took and recorded meteorological and bathymetric readings. We used these points to create three sub-datasets: i. a chronological dataset to determine the effect of geopolitics on seafaring, ii. a reconstitution of individual voyages by joining points for individual ships over time, which we used to validate the extent of our navigability area, and iii. a point dataset to identify route density per month, which can be compared to our routes. Each is described in the following three subsections.

Sailing in time of war and peace

We used a subset of the CLIWOC data to determine how oceanic routes changed depending on whether a large-scale conflict occurred in Europe. To create this subset, data were filtered based on an approximate area corresponding to the European waters. This provided a dataset of 47,821 points. Records were then selected by their ID, and lines were generated depending on the relative date of each record. Points of the same ID with readings within four days were designated as parts of the same journey. Groups of points were then aggregated into lines to illustrate the historical routes.
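The grouping rule described above (same ID, readings no more than four days apart) can be sketched as follows. The record layout `(ship_id, date, lon, lat)` and the function name are hypothetical and do not reflect the actual CLIWOC field names.

```python
from datetime import date, timedelta
from itertools import groupby

def split_into_journeys(records, max_gap_days=4):
    """Group (ship_id, date, lon, lat) records into journeys.

    Consecutive readings of the same ID that are at most max_gap_days apart
    belong to the same journey; a longer gap starts a new one.
    """
    journeys = []
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        current = []
        for rec in group:
            if current and rec[1] - current[-1][1] > timedelta(days=max_gap_days):
                journeys.append(current)
                current = []
            current.append(rec)
        journeys.append(current)
    return journeys

records = [
    ("A", date(1768, 1, 1), -5.0, 48.0),
    ("A", date(1768, 1, 3), -4.0, 49.0),
    ("A", date(1768, 1, 10), -1.0, 50.0),   # seven-day gap: new journey
    ("B", date(1768, 1, 2), 2.0, 52.0),
]
journeys = split_into_journeys(records)
# Only journeys with at least two points can be drawn as lines.
lines = [j for j in journeys if len(j) >= 2]
```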

Lines were then classified into ‘Peaceful’ or ‘Warring’ years, depending on the date. This process produced a total of 3716 lines, 2321 during peace and 1395 under war conditions. For the classification of belligerency, we took the entire year under consideration (1 January to 31 December) and considered whether any state represented in the dataset was at war, using the assumption that conflict would impact everyone regardless of the flag flown. The results are shown in Fig. 8, which clearly shows that two areas suffered a significant decrease in traffic by British and Dutch vessels during bellicose episodes: the French Atlantic and the Channel. This is exactly what would have been expected. Our routing patterns should therefore be considered peacetime sailing routes only, as we did not account for specific exclusion areas that correspond to the aforementioned maritime spaces.

Fig. 8: All peace and wartime logbook data from the CLIWOC dataset.

Empirical validation of the extent of the navigable area

The second subset of the CLIWOC data was used to provide an empirical validation for the MPLVA (described in Section 3). The CLIWOC routes were queried such that only routes with a start and end point within the MPLVA + 6 h area were used (intermediary points were not filtered). This resulted in 8332 points comprising 519 individual journeys. This subset of the data corresponds to journeys which both began and ended within the study area. By looking at the routes between these points, we were able to visualise the general course of the ships in our study area. Figure 9 shows the different MPLVA outlines together with the CLIWOC routes. From this, it is clear that the area included in the MPLVA + 12 h outline is the best fit. Although the CLIWOC routes are a rough approximation, the degree to which they concentrate within the MPLVA + 12 h outline gives a good justification for the settings used for modelling.

Fig. 9: CLIWOC routes drawn with hourly visibility outlines.

Checking monthly route density from CLIWOC

The third subset from the CLIWOC dataset was created to provide an indication of routing density and monthly deviations. As with the first subset, points were first filtered based on geographic position, although instead of a square extent, the MPLVA + 24 h grid was used. This narrowed the subset down to 16,013 points, which were then sub-divided by the month in which they were recorded and joined to a 10 km² hex grid. The number of points in each month is shown in Table 8 and the point density in Supplementary Fig. 9. As a point of comparison, we counted the number of our routes crossing each of the cells in the same hex grid for each month. The result is shown in Supplementary Fig. 8.

Table 8 CLIWOC observations (points) by month.

Implications and limitations of the simulation

This exercise in historical modelling is the first step in a broader and ambitious attempt to analyse historical navigation between the seventeenth and early nineteenth centuries. First, it demonstrates that simulation is a useful approach for an object that is as elusive as maritime routing. Although historians are often unfamiliar with this probabilistic approach, we hope our work will convince some of our colleagues to look at it less unfavourably. By providing the tools to create any route desired, in Europe, and potentially elsewhere, we hope that this work will offer scholars from many different fields (from maritime and economic historians to the “blue humanities”) a new way to engage with historical seafaring.

Second, it demonstrates the hitherto overlooked significance of seasonal and climatic factors in shaping historical European shipping routes. By introducing a more comprehensive modelling methodology that captures not only wind and bathymetric data but also variables such as wave conditions, coastal visibility, and the frequency of extreme weather events, it provides a more accurate and dynamic simulation of the maritime corridors in the age of sail, as is clearly visible in Supplementary Fig. 8. The model outputs revealed different routing patterns based on the included variables, adding considerable spatial variation when interacting winds and currents were considered. We also showed that seasonal variability and regional characteristics significantly influenced route selection.

Third, we show that the more complex environmental modelling particularly affected Mediterranean ports and ports on the west coasts of England and Ireland and in Wales. Given that these included some of the most transited ports in the two countries, which are of great interest to all historians of maritime trade, our work shows that including the effects of currents, waves, and wind variability is essential to capture routing and journey time to and from these ports.

The next step will be to refine existing data on coastal accessibility and freight cost variations for all of Europe. These data will then be factored into existing multimodal models of historical routing such as those developed at the Cambridge Group for the History of Population and Social Structure (CAMPOP) and the COMMUNE project in France. This will make it possible to calculate accurately the journey time and cost of moving any good from any point in France and Britain to any other point in these countries, unlocking in particular new analyses regarding trade routes for key commodities in the early industrial age, and the effect of port connectivity on urban development.

Our work is not, however, without its limitations. First, the climatic data used are obviously not contemporaneous, and our work relies on the assumption that we can use twentieth-century meteorological data to model the determinants of navigation two hundred years earlier. Second, it should be made clear that our model does not accurately replicate individual journeys, as it cannot encapsulate the myriad other factors and contingencies of historical navigation: changing course because of extreme weather, breakdowns, delays at port, piracy, disease, and quarantine, among many others. Therefore, our output should be considered as the best-case scenario routing. Overall, despite the inherent challenges of modelling historical weather patterns and seafaring behaviours, we believe the study’s models offer a credible overview of historical maritime routes.

Finally, we would like to conclude with the wish that our findings may emphasise the importance of bridging the computational skill gap among historians and demonstrate the value of broad interdisciplinarity in refining our understanding of the past.