# Transport link scanner: simulating geographic transport network expansion through individual investments

## Abstract

This paper introduces a GIS-based model that simulates the geographic expansion of transport networks by several decision-makers with varying objectives. The model progressively adds extensions to a growing network by choosing the most attractive investments from a limited choice set. Attractiveness is defined as a function of variables in which revenue and broader societal benefits may play a role and can be based on empirically underpinned parameters that may differ according to private or public interests. The choice set is selected from an exhaustive set of links and presumably contains those investment options that best meet private operator’s objectives by balancing the revenues of additional fare against construction costs. The investment options consist of geographically plausible routes with potential detours. These routes are generated using a fine-meshed regularly latticed network and shortest path finding methods. Additionally, two indicators of the geographic accuracy of the simulated networks are introduced. A historical case study is presented to demonstrate the model’s first results. These results show that the modelled networks reproduce relevant results of the historically built network with reasonable accuracy.

### Keywords

Transportation, Network growth, Agent-based modelling

### JEL Classification

H54, L92, O33, R42

## 1 Introduction

The expansion of transport networks is considered an important factor for the spatial distribution of activities and receives considerable political and academic attention. It is commonly perceived as a technology diffusion process in which the innovation spreads geographically (Grübler 1990; Nakicenovic 1995). The geographic paths that the developed networks assume have important societal and economic ramifications. Ideally, these paths constitute a social optimum considering construction costs and generalized travel costs. However, because decision-makers are often non-cooperative (Knick Harley 1982; Dobbin and Dowd 1997; Xie and Levinson 2011), potential transport network expansion outcomes may be limited to Nash equilibria (Bala and Goyal 2000; Anshelevich et al. 2003) that can entail considerable extra costs to reach a target state of connectivity.

Although it is known that transport network expansion may follow a clear rationale, largely based on, e.g. expected transport flows versus costs (Rietveld and Bruinsma 1998; Xie and Levinson 2011), relatively little is known about how economic and institutional conditions affect transport network expansion. This is partially because, in contrast to other instruments available to transport planners such as land-use and transport demand models, ex ante models of transport network expansion are few and they are hardly ever empirically validated. For a comprehensive overview of transport network modelling, we refer to Xie and Levinson (2009). In the 1960s, conceptual and empirical modelling efforts were undertaken by quantitative geographers (Taaffe et al. 1963; Warntz 1966; Kolars and Malin 1970). More recently, network optimality and bi-level optimization methods (Patriksson 2008; Youn et al. 2008; Li et al. 2010), the role of self-organization (Xie and Levinson 2011) and the role of ownership (Xie and Levinson 2007) have been investigated in controlled conditions. This has been followed by empirically based exercises to test heuristic network design optimization methods (Vitins and Axhausen 2009), to understand the driving forces of network growth (Rietveld and Bruinsma 1998) and the role of first mover advantages (Levinson and Xie 2011) and to forecast future network investments in a fairly mature transport system (Levinson et al. 2012).

An instrument to evaluate geographically explicit network expansion outcomes in settings with multiple decision-makers is not yet available in the literature. This is presumably because of limited data availability, computational limitations and difficulties in reproducing topologically realistic links or ‘shortcuts’ (Li et al. 2010). The aim of this paper is to introduce transport link scanner (TLS), an agent-based model that simulates the overall geographic diffusion of a transport network through the individual investment decisions that drive network expansion, and to demonstrate that it is able to reproduce a historical network expansion process with reasonable accuracy. The model allows the inclusion of multiple decision-makers with varying objectives; institutional conditions and the level of cooperation between decision-makers can be explicitly modelled. A novel heuristic method is integrated to generate the plausible geographic paths of potential investments that aim to maximize fares. It does so in a manner that is consistent with the model’s transport demand module and is responsive to previously selected links. The principal model output is a network of transport links, which enables the measurement of model performance based on graph-theoretic indicators such as diameter and node degree (Rodrigue et al. 2006), and indicators relevant to transportation networks such as accessibility and network efficiency (Jacobs-Crisioni et al. 2016). The model is illustrated with a case study in which the start and expansion of the Dutch railway network in the nineteenth and early twentieth century is simulated, but the model itself is developed in such a way that other applications may be configured reasonably easily.

The theoretical basis, overall structure and key assumptions are outlined in Sect. 2. Subsequently, particular aspects of TLS are described in more detail in Sect. 3. The case study is described in Sect. 4, and simulation results for that case study are given in Sect. 5. This is followed by general conclusions on the development of TLS and ideas for further research in Sect. 6. Lastly, the estimation of cost and demand functions, a breakdown of model results per investor type and a table of nomenclature are given in appendices. Before the model and case study are introduced, it is worth mentioning that this model is programmed in the Geo Data and Model Server (GeoDMS) software (ObjectVision 2014), which is presumably best known as the platform that supports land-use models such as Land-Use Scanner and the Land-Use-based Integrated Sustainability Assessment modelling platform (LUISA) (Hilferink and Rietveld 1999; Baranzelli et al. 2014). GeoDMS is rather different from commonly used GIS packages, and we emphasize here that its availability has been a key prerequisite for the development of TLS. It is an open-source platform that interprets scripts into a sequence of operations and executes these operations on dynamically defined C++ arrays. Just like geospatial semantic array programming tools such as the Mastrave library (de Rigo et al. 2013), GeoDMS adheres to large-scale modelling and assessment tasks. The major advantages of using GeoDMS for the work presented in this paper are considerable gains in computation speed, reproducibility of modelling steps, flexibility and control over data operations and straightforward links between various data types such as raster and vector type spatial data. The TLS program and the data that have been used for this paper are freely available through http://www.jacobs-crisioni.nl/publications/download_tls.

## 2 Model structure and key assumptions

Transport network expansion is commonly initiated by a technical innovation that can substantially lower generalized travel costs, such as the introduction of steam power or the invention of motorways (Nakicenovic 1995). The expansion process itself is the result of sequential decisions to construct transport links for that new technology. Transport link investments generally come with considerable set-up costs and sunk costs and are physically bound, thus making it hard for investors to move their enterprise (Xie and Levinson 2011). The involved decision-makers may be private or public and may have very different objectives, including economic and societal factors, but are generally concerned with providing transport service for which the built infrastructure is instrumental. Because of the high costs of market access, in many cases the transport market is an oligopoly subject to fierce competition (Knick Harley 1982; Veenendaal 1995). Thus, potential final network outcomes consist of Nash equilibria rather than a social optimum (Anshelevich et al. 2003; Xie and Levinson 2007; Youn et al. 2008).

Given high costs of link construction, it stands to reason that investment decisions are taken with deliberation and that an investor will decide to construct the link that best fits investor objectives. The high costs involved in link construction create local monopolies when largely exhausted revenues block competitor investments in the same space (Xie and Levinson 2011). The position of the first investor is further boosted by the existence of network externalities that imply that newly added links may increase revenues for the existing network. This leads to advantages for the established playing field, as can be seen in the first mover advantages and lock-in described by network economics. For an overview of network economics, we refer to Economides (1996). All in all, sequential link construction is a dynamic process in which previous decisions organize the potential for future decisions. The characteristics of network expansion processes are the basis for the ‘strongest link’ assumption of transport network expansion (Xie and Levinson 2011), which is adopted in this paper. In such an approach, any agent selects a most attractive investment for construction, if a sufficiently attractive option is available. After that decision, investments are reconsidered and construction decisions are taken iteratively until the pool of sufficiently attractive investments is exhausted.

### 2.1 Model structure

Especially when network expansion is driven by economic motives, the spatial distribution of suitable terrain and potential transport revenues may be presumed to be important aspects of the choice process. This may be one reason why railways prefer paths with high potential interaction values (Warntz 1966; Kolars and Malin 1970). The geographic nature of these factors supports GIS-based modelling such as in TLS. In TLS, network investments are drawn from a pool of potential network extensions with plausible geographic paths. That selection of extensions is based on a set of adaptable rules. The modelled network investments are discrete choices. The model is turn-based and dynamic: one investment decision from one investor is allocated in any iteration, causing one distinct link to be added to the modelled transport network. The transport link allocated in that iteration affects the market conditions that are relevant for the generated choice set and for the estimated revenues of investments in subsequent turns. The model allows multiple investors to construct network links, such as private investors or governments. The participating investors may have differing investment objectives.

### 2.2 Key assumptions

*H* of alternative–investor combinations. *H* contains a finite number of alternative–investor combinations *O* with index *l* = 1,…,*L*, and known attributes. This choice set is composed of a number of likely additions. Then, the probability that alternative *o* is realized equals:

\( S_{l} \) for a line *l* given investor type *p*, which takes this form:

The attractiveness of alternatives may differ per investor and may contain a variety of different social or financial objectives. In the presented case study, investor-specific attractiveness functions have been estimated using railway investment choices taken while constructing the Dutch railway network and mainly aim at increasing the revenues (reflected by passenger kilometres) on the investor’s network; in other cases, these attractiveness functions need to be modified to reflect case-specific investor goals or transport revenue types.
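Because Eq. (1) is referred to elsewhere in the paper as a conditional logit model (McFadden 1974), the way attractiveness values translate into choice probabilities can be illustrated with a minimal Python sketch. It assumes the standard logit form, in which the probability of an alternative is its exponentiated attractiveness divided by the sum over the choice set. The function name `choice_probabilities` and the example values are hypothetical and are not taken from the TLS implementation.

```python
import math

def choice_probabilities(attractiveness):
    """Standard conditional logit probabilities for attractiveness values S_l.

    Assumption: P(o) = exp(S_o) / sum_l exp(S_l). The maximum is subtracted
    before exponentiating purely for numerical stability.
    """
    s_max = max(attractiveness)
    weights = [math.exp(s - s_max) for s in attractiveness]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical attractiveness values for three alternative-investor combinations.
probs = choice_probabilities([1.2, 0.4, -0.3])
best = max(range(len(probs)), key=probs.__getitem__)
```

The probabilities sum to one and preserve the ranking of the attractiveness values, so the most attractive combination is also the most probable one.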

The selection of investment choices and the computation of investment attractiveness is constrained by the following assumptions: (1) the territory is divided into a given number of zones with estimable numbers of potential passengers and/or movable goods, observed as origins (*i*) and destinations (*j*); furthermore, (2) all zones are already connected by a preceding base communications network (*base*), so that spatial interactions already exist before the transport mode is introduced. This network is expected to have maximum plausible connectivity, so that the *i* to *j* travel distances \( L_{ij}^{base} \) obtained from this network are the minimum realistic link lengths between two zones. A last constraining assumption (3) is that the introduced transport mode is expected to lower generalized travel costs per kilometre with a fixed relative cost improvement factor \( \varphi \).

Links *l* in the modelled network have travel costs *c* based on \( c_{l}^{\text{base}} = L_{l} /V^{\text{base}} \) or \( c_{l}^{\text{intr}} = L_{l} /V^{\text{intr}} , \) where \( L_{l} \) denotes link length. In the case of public transport, it seems fair to adapt travel cost estimates with travel cost penalties *cp* to simulate the effort involved in entering and exiting the introduced transport network. This leads to fixed maximum obtainable travel cost improvements between two zones, which can be computed as a ratio between minimum new-mode travel costs \( c_{ij}^{ \hbox{min} } \) and existing travel costs \( c_{ij}^{\text{base}} \) over the base network. Maximum obtainable travel cost improvements are expressed as:
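As a hedged numerical illustration of this ratio, the Python sketch below computes travel costs as travel time in hours and divides the minimum new-mode cost by the base cost for one zone pair. It assumes that one penalty *cp* is incurred when entering and one when exiting the new network; that count, and the function names, are assumptions made for illustration. The default speeds and the 10-minute penalty are the values reported for the case study.

```python
def travel_cost_hours(length_km, speed_kmh, penalty_min=0.0, n_penalties=0):
    """Travel cost proxied by travel time in hours: length / speed plus penalties."""
    return length_km / speed_kmh + n_penalties * penalty_min / 60.0

def min_cost_ratio(length_km, v_base=6.0, v_intr=30.0, cp_min=10.0):
    """Ratio of minimum new-mode travel cost to base travel cost for one zone pair.

    Assumption: the penalty cp applies twice (entering and exiting the network).
    """
    c_base = travel_cost_hours(length_km, v_base)
    c_min = travel_cost_hours(length_km, v_intr, cp_min, n_penalties=2)
    return c_min / c_base

ratio_long = min_cost_ratio(30.0)   # 30 km between two zones
ratio_short = min_cost_ratio(2.0)   # 2 km between two zones
```

Under these assumptions the ratio falls well below one for the 30 km pair but exceeds one for the 2 km pair, because the fixed penalties outweigh the speed gain over very short distances.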

## 3 Choice set generation, investment selection and model accuracy measures

Although in continuous space infinite potential links exist, computational limitations force us to work with a limited choice set. This is justified by the property of the conditional logit model demonstrated by McFadden (1974) that drawing a limited number of alternatives leads to consistent estimates, provided that the true choice process is described by the estimation procedure.^{1} TLS establishes a set of discrete choice set alternatives by drawing samples with a reasonable probability of selection using heuristic generation methods. In the attractiveness estimation procedure, the built links are added to the choice set. Because of the dynamic nature of TLS, the choice sets used in prediction are bound to differ from those used in the estimation process, and we must therefore assume that the validity of estimated attractiveness functions holds as long as investment options are selected with the same criteria as the choice set used in the estimations.

We furthermore assume that link construction is incremental, which implies that the most profitable link construction investments are selected first, and later, other links are built as extensions to the investor’s network. To generate investment options given these assumptions, a two-stage method is applied, which first deals with the selection of terminating zones and later selects a plausible path between the terminating points using corridor location searching methods. For a recent overview of corridor location search methods, we refer to Scaparra et al. (2014). For this section, it is necessary to explicitly discern links (*l*), which we consider as complete investments between two terminating zones, and segments (*s*), which are the separately observed lines in the model of which a link is composed.

### 3.1 Selecting terminating zones

*i* and *j* in kilometres over the base network; *T* is the potential number of trips in both directions with (est1) and without (curr) the new link; and \( C_{ij}^{\text{est1}} \) is a first estimate of construction costs.

Lengths and costs are assumed to be symmetric for both directions. We must emphasize here that the link lengths and flows for potential investments are rough estimates, because at this step in the selection procedure the optimal path of a potential link between *i* and *j* is not yet known and as a consequence, neither are the definitive travel times. The construction costs are obtained by finding the least-cost path between zones given estimated construction costs for each potential network segment. These construction costs are imposed on a fine-meshed network of regularly distributed segments, which is elaborated upon later.

Trips *T* between zones are computed using a spatial interaction model derived from Alonso’s General Theory of Movements (GTM) (Alonso 1978). It must be emphasized that in the model these formulations can be easily substituted by any other spatial interaction formulation, for example to take into account spatial dependencies (Patuelli and Arbia 2013), heterogeneity or endogeneity (Donaghy 2010). For the selection of terminating zones, we compute trips *T* in three cases:

*P* represents zonal populations; \( c_{ij}^{\text{base}} \) describes travel costs over the base network; \( c_{ij}^{\text{curr}} \) describes current travel costs obtained from the network at the start of the model’s iteration, thus including already allocated investments; \( c_{ij}^{\text{est1}} \) approximates travel costs if the potential investment is in place and is computed as \( c_{ij}^{\text{est1}} = L_{ij}^{\text{base}} /V^{\text{intr}} \); *f(.)* is a distance-decay function; \( \gamma \) and \( \theta \) are parameters that govern transport consumption elasticity for reduced travel costs; and \( B_{j} \) is a destination-specific constant that may be used to model congestion at destinations.

*O* are selected. To exclude lines that offer relatively small total travel cost improvements between two terminating zones, the line proposed in \( c_{ij}^{\text{est1}} \) must offer minimally half the maximum travel cost improvements that may be obtained by substituting a base network link with the link considered. Furthermore, intrazonal links and symmetrical elements in the matrix are excluded. These criteria yield the following selection dummy \( Z_{ij} \):

The criterion is admittedly chosen ad hoc, but seems to be a reasonable assumption. This selection criterion is necessary to obtain a small choice set with reasonably plausible alternatives. Note that in the case that \( cp > 0, \) proposed links between *i* and *j* also have an absolute minimum distance, because with lower distances the rail link’s travel cost including waiting times does not offer sufficient travel cost advantages. Finally, a fixed number of links between *i* and *j* with the highest values of \( {\text{RCR}}_{ij}^{\text{est1}} Z_{ij} \) are selected as investment options.
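The selection of terminating zones described above can be sketched in Python as a filter followed by a ranking. The sketch is an interpretation rather than the TLS code: it assumes that \( Z_{ij} = 1 \) when the pair is neither intrazonal nor a mirrored duplicate and when the estimated improvement reaches at least half of the maximum obtainable improvement. The function name, the dictionary layout and all numbers are hypothetical.

```python
def select_terminating_pairs(rcr, improvement, max_improvement, n_options):
    """Return the n zone pairs with the highest RCR among those passing Z_ij.

    rcr, improvement, max_improvement: dicts keyed by (i, j) zone pairs.
    Assumption: Z_ij = 1 when i < j (no intrazonal or mirrored pairs) and the
    estimated improvement is at least half the maximum obtainable improvement.
    """
    eligible = [
        pair for pair in rcr
        if pair[0] < pair[1]
        and improvement[pair] >= 0.5 * max_improvement[pair]
    ]
    eligible.sort(key=lambda pair: rcr[pair], reverse=True)
    return eligible[:n_options]

# Hypothetical inputs for three zones.
rcr = {(1, 2): 5.0, (2, 1): 5.0, (1, 3): 9.0, (2, 3): 3.0, (3, 3): 1.0}
improvement = {(1, 2): 0.6, (2, 1): 0.6, (1, 3): 0.3, (2, 3): 0.5, (3, 3): 0.0}
max_improvement = {(1, 2): 0.8, (2, 1): 0.8, (1, 3): 0.8, (2, 3): 0.7, (3, 3): 0.0}
selected = select_terminating_pairs(rcr, improvement, max_improvement, n_options=2)
```

In this example the pair (1, 3) has the highest revenue-to-cost ratio but is dropped because its improvement falls below the threshold, which is the kind of exclusion the selection dummy is meant to produce.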

### 3.2 Finding plausible paths

*r* = 4. The method used differs somewhat from known solutions to corridor location problems. The key difference is that it depends on the outcomes of previous model iterations and may yield different results in subsequent model iterations. To allow for this, the regularly latticed network is combined with the network already built at the start of the model’s iteration and with segments that connect the simulated rail network to zone centroids. The combined network and a shortest path finding algorithm are used to obtain a path with an optimal combination of revenues, construction costs and length.

*s* are computed as:

*k*, while the least lengthy path is favoured in case *k* = 1. The method to estimate segment revenues will be explained later. Construction costs are obtained from terrain characteristics. To model additive network construction, already built railway segments are given a very low cost of one. Note that more sophisticated cost structures for existing links can be configured to simulate specific cooperation conditions. Finally, segment lengths are primarily taken into account to ensure that the found path respects \( L_{ij}^{\text{intr}} < L_{ij}^{{{\text{intr}}\,{ \hbox{max} }}} . \)

*k* parameter. Thus, the shortest path finding algorithm with \( {\text{RC}}_{S}^{ - 1} \) is repeated in 40 iterations, in which *k* is gradually increased from zero to one. The total inverse revenue-cost indicator of a path is:

For *k* = 0, this amounts to a distance-weighted sum of inverse revenue-to-cost ratios, while for *k* = 1 it is simply total distance.
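The repeated path search can be sketched in Python with a plain Dijkstra search over a small graph. The segment impedance used here, \( L_{s}\left[ (1 - k)\,C_{s}/R_{s} + k \right] \), is an assumed linear blend that merely reproduces the two end points stated above (a distance-weighted inverse revenue-to-cost ratio for *k* = 0 and plain distance for *k* = 1); it is not necessarily the form used in TLS. All names and the toy network are hypothetical.

```python
import heapq

def shortest_path(graph, source, target, weight):
    """Dijkstra over graph[u] = [(v, segment), ...]; weight(segment) must be >= 0."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, seg in graph.get(u, []):
            nd = d + weight(seg)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(queue, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def candidate_paths(graph, source, target, steps=40):
    """Collect the distinct paths found while k rises from zero to one."""
    paths = []
    for n in range(steps):
        k = n / (steps - 1)

        def weight(seg, k=k):
            # Assumed blend: inverse revenue-to-cost ratio at k = 0, length at k = 1.
            return seg["length"] * ((1 - k) * seg["cost"] / seg["revenue"] + k)

        path = shortest_path(graph, source, target, weight)
        if path is not None and path not in paths:
            paths.append(path)
    return paths

# Toy network: the route via "b" is short but unprofitable, the route via "c"
# is a longer detour through zones with high revenues and low costs.
poor = {"length": 1.0, "cost": 10.0, "revenue": 1.0}
rich = {"length": 2.0, "cost": 1.0, "revenue": 10.0}
graph = {"a": [("b", poor), ("c", rich)], "b": [("d", poor)], "c": [("d", rich)]}
paths = candidate_paths(graph, "a", "d")
```

For low values of *k* the search returns the profitable detour, and only for values close to one does it switch to the shortest route, so the sweep yields a small set of alternative paths to compare.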

#### 3.2.1 Estimating segment revenues

Revenues *R* of segments *s* are computed by means of the population *P* of zone *i* in which the segment’s first point (*s*1) and last point (*s*2) are located and the zone’s saturation factor \( {\text{MS}}_{i} \). One person is added to each zone to ensure that values of \( R_{s} \) are above zero and thus allow the computation of Eq. (9).

#### 3.2.2 Optimal path selection

*i* to *j* matrix, for which the travel costs and travel distances between connected zones are repeatedly re-estimated for every value of *k*. To do so, a dummy variable \( Q_{i} \) indicates whether zone *i* is connected to the alternative path at hand. Subsequently, the estimated travel costs \( c_{ij}^{\text{curr}} \) and travel distances \( L_{ij}^{\text{base}} \) between all connected zones are updated so that \( c_{ij}^{{{\text{est}}\,k}} \) and \( L_{ij}^{{{\text{est}}\,k}} \) are defined for each alternative path *k* as:

In Eq. (15), the length of links is purposely squared to enforce that the shortest path is only selected if no path is found that meets the \( L_{ij}^{{{\text{intr}}\,{ \hbox{max} }}} \) criterion. Subsequently, the path with the highest value of RCR is selected. In this way, the path with the highest estimated revenue-to-cost ratio is selected if a path that meets the length criterion is found; otherwise, the method picks the path with the shortest overall length.
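The outcome of this selection rule can be sketched in Python with explicit branching. This is a simplification: the paper obtains the same behaviour through the squared length term in Eq. (15), whereas the sketch states the two cases directly. The function name and candidate values are hypothetical.

```python
def pick_path(candidates, max_length):
    """Pick one path from a list of dicts with 'name', 'rcr' and 'length' keys.

    Prefer the highest revenue-to-cost ratio among paths shorter than max_length;
    if no path meets the length criterion, fall back to the shortest path.
    """
    feasible = [c for c in candidates if c["length"] < max_length]
    if feasible:
        return max(feasible, key=lambda c: c["rcr"])
    return min(candidates, key=lambda c: c["length"])

# Hypothetical candidate paths for one pair of terminating zones.
candidates = [
    {"name": "detour", "rcr": 4.0, "length": 48.0},
    {"name": "direct", "rcr": 2.5, "length": 35.0},
    {"name": "long detour", "rcr": 6.0, "length": 70.0},
]
chosen = pick_path(candidates, max_length=50.0)
fallback = pick_path(candidates, max_length=30.0)
```

With a 50 km limit the moderately long detour wins because the most profitable path is too long; with a 30 km limit no path qualifies and the shortest one is returned.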

It is important to note that two additional restrictions are imposed on the path decision method: first, we assume that railway network construction is incremental, so that (a) in all cases, if a link starts or terminates in a zone already connected by a built line, the generated line must connect to the line already built there and (b) the links of an investor’s already existing network have negligible costs for the considered expansion; second, to simulate that built railway links terminated outside contemporary urban areas, the link may not start on a node less than 500 metres away from the zone’s centroid. This approximates the distance between stations and urban area centres that is observable in the historically built network.

### 3.3 Investment selection

Subsequently, the attractiveness of the investment options is computed. A wide range of variables that deal with investor objectives can be computed here. Increasing mileage, total transport flows or reduction in congestion due to insufficient transport network capacity are, presumably, generally important reasons for transport network investments. TLS therefore includes a module to model expected transport flows on potential network extensions, on the investor’s remaining network or on the whole transport network.

For all investment options generated in the choice set, the attractiveness is estimated with the methods shown in the previous section, yielding values of \( S_{l} \) specific for each investor in a vector that is as long as the number of active investors times the number of options. A very small random component is added to the computed attractiveness values to ensure that no two options have identical attractiveness. Based on the estimated values, Eq. (1) is solved to obtain probabilities for the considered investments. Ultimately, the investment with the highest probability is selected. The new link and its relevant attributes are added to the already existing network in a new file; this file may form the basis for the evaluation of a subsequent investment if need be.
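A minimal Python sketch of this selection step is given below. It relies on the fact that logit probabilities increase monotonically with attractiveness, so the combination with the highest probability is also the one with the highest (jittered) attractiveness. The noise scale, the function name and the example values are assumptions made for illustration.

```python
import random

def select_investment(attractiveness, rng, noise_scale=1e-9):
    """Return the (investor, option) key with the highest jittered attractiveness.

    attractiveness: dict keyed by (investor, option) with values S_l.
    A tiny random component breaks exact ties; its scale is an assumption.
    """
    jittered = {
        key: s + rng.uniform(0.0, noise_scale)
        for key, s in attractiveness.items()
    }
    return max(jittered, key=jittered.get)

# Hypothetical values: two investors evaluating overlapping options.
attractiveness = {
    ("private_1", "option_a"): 0.8,
    ("private_1", "option_b"): 0.8,
    ("state", "option_a"): 1.1,
}
winner = select_investment(attractiveness, random.Random(0))
```

The jitter only matters for the two tied options of the first investor; the clearly more attractive combination is selected regardless of the random draw.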

### 3.4 Measuring model accuracy

*X* is a zone-specific dummy that takes the value one when a municipality is connected by the modelled and observed railway networks, and zero otherwise.

To ensure a meaningful comparison, modelled networks are compared with the state of the historically built network that is closest to the modelled network in terms of length. Thus, if in the fourth investment turn, a modelled network has a length of 1000 km, subsequent individual historical investments are tested for cumulative length until the historical investment is selected that brought the historically built network the closest to a 1000-km cumulative length. The network comprising that and previous investments is selected for comparison. In addition, the population levels of the year in which the selected historical investment is built are selected to serve as weights for the presented indicators.
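The matching of a modelled network to a historical network state can be sketched in Python as a search over cumulative lengths. The function name and the example lengths are hypothetical.

```python
from itertools import accumulate

def closest_historical_state(historical_lengths_km, modelled_length_km):
    """Index of the historical investment whose cumulative length is closest.

    historical_lengths_km: lengths of individual investments in order of construction.
    """
    cumulative = list(accumulate(historical_lengths_km))
    return min(
        range(len(cumulative)),
        key=lambda n: abs(cumulative[n] - modelled_length_km),
    )

# Hypothetical historical investments of 300, 450, 200 and 400 km.
index = closest_historical_state([300.0, 450.0, 200.0, 400.0], 1000.0)
```

Here the cumulative lengths are 300, 750, 950 and 1350 km, so a modelled network of 1000 km is compared with the historical network up to and including the third investment.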

## 4 Case study

In this section, we present an effort to simulate the development of the Dutch railway network in the nineteenth and early twentieth century using TLS. Investment attractiveness functions were fitted on observed transport network investments. First the history of the development of that railway network is summarized, after which the model set-up, main assumptions and estimation of transport link attractiveness are outlined.

### 4.1 The development of the Dutch railway network

The first railway in the Netherlands opened in 1839 (Veenendaal 2008). It was operated by the ‘Holland Iron Railway Company’ (HSM) and linked Amsterdam to Haarlem. It was soon extended towards Rotterdam. Subsequently, competing companies built their own lines in the Netherlands. More than ten operators have separately provided railway services on railway links in the Netherlands. The Dutch government began participating actively by building state lines defined in the Railway Acts of 1860 and 1875. Most of those state lines were run by the ‘State Railways’ (SR), a private company which leased lines owned by the state. In 1878, a third Act followed that allowed for the cheaper construction of railways, if operated with slow light trains. Supported by attractive loans from the Dutch State and subsidies from local governments (Doedens and Mulder 1989), this Railway Act incited the construction of ‘local tracks’ that typically connected rural areas to the main railway network (Veenendaal 2008) and were often subsidized by local governments. In this paper, we treat state involvement as the introduction of other types of investors with distinct preferences in the railway development playing field.

### 4.2 Population distribution, network speeds and network ownership

Based on Veenendaal (2008) and Stationsweb (2009), the historical railway network development in the Netherlands has been reconstructed in a GIS database that also contains population counts from 1830 to 1930 in 1076 municipalities. The data, furthermore, build on the same assumptions as in Koopmans et al. (2012), of which we now list the most important ones. The study area is assumed to already have an underlying network of paths that connects all municipalities with each other. In the nineteenth century, horse-drawn boats through the country’s tow canals were the main long-distance travel mode and often the only alternative to walking for most people. They operated at a speed that was only slightly faster than walking. We must acknowledge that the historical networks of paved roads and tow canals are not taken into account explicitly; instead, just as in Koopmans et al. (2012), we consider both networks to be regional substitutes for each other that are approximated using one simplified network. In the case study, that network connects each municipality with its five nearest neighbours. A speed \( V^{\text{base}} = 6\;{\text{km}}/{\text{h}} \) is maintained as the average speed to traverse this network to proxy movement over roads and waterways. We consider this a reasonably accurate approximation for the Netherlands. One model variant is run with \( V^{\text{base}} = 4\;{\text{km}}/{\text{h}} \) to test model sensitivity to this setting.
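A simplified base network of this kind can be sketched in Python by linking every centroid to its nearest neighbours. The sketch assumes straight-line (Euclidean) distances between centroids in a projected coordinate system and treats links as undirected; the function name and the coordinates are hypothetical.

```python
import math

def base_network(centroids, k=5):
    """Connect each centroid to its k nearest neighbours.

    centroids: list of (x, y) tuples. Returns undirected edges as a set of
    (i, j) index pairs with i < j. Assumption: Euclidean distances.
    """
    edges = set()
    for i, (xi, yi) in enumerate(centroids):
        others = sorted(
            (j for j in range(len(centroids)) if j != i),
            key=lambda j: math.hypot(centroids[j][0] - xi, centroids[j][1] - yi),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

# Four hypothetical centroids on a line (coordinates in km), one neighbour each.
edges = base_network([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)], k=1)
```

Because a link is added whenever either end point lists the other among its nearest neighbours, some municipalities end up with more than *k* connections, which keeps the network connected in sparsely populated areas more often than a strictly mutual rule would.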

Municipalities are represented by means of their geographic centres. The base network has direct connections between those centroids. The rail network is connected to those centroids through connector road links. Train schedules or the accelerating and decelerating of trains are not explicitly modelled, but are approximated by imposing relatively low average speeds for the introduced transport links. To proxy that passengers lose some time with entering and exiting the rail network as well as with transferring between physically separate rail networks, a relatively small travel cost penalty \( {\text{cp}} = 10\;\hbox{min} \) is given to all connectors between rail networks and municipalities.

When assessing the attractiveness of investments, links of the previously modelled network extensions are included as well as the underlying network. As given in Sect. 4.3, passenger transport demand is an important reason for investment. The level of demand depends on generalized transport cost, which is proxied by travel time, and on price elasticity. This makes the modelled speeds on the railway network and assumptions on price elasticity a key factor for network outcomes. To take these factors into account, we present scenarios with varying travel-time improvements and with varying assumptions on price elasticity of passenger transport demand. Construction costs, passenger demand and price elasticity have been estimated using observed data. Details of the method used, data and results can be found in Appendices 1 and 2.

To model railway network expansion in a case with multiple investors with varying objectives, five independent investors are simulated. This set of investors consists of two regular private investors, two private local line investors and the state, and roughly resembles the playing field during Dutch railway construction. The regular private investors partake in investments from the model start. The state partakes from 1860; local line investors from 1879. At any point, the investment–investor combination with the highest probability is selected. All investors are eligible for the same investments with attributes that may differ per investor; ten investment options are available in every round. The built lines are assumed to be operated by the building investor, so that all revenues from an investor’s line fall to that investor. In the presented case study, the modelled investment sequence starts in 1839, with one investment allowed every year. After an investment, an operator is excluded for one round to simulate financial recuperation and evaluation of the investment. Municipal population counts are updated every decade. If the model does not find any suitable investments, it skips years to the following decade; if it no longer finds suitable investments in 1930, the network expansion sequence ends.
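The entry years and the one-round exclusion rule can be sketched in Python as a simple eligibility filter. The investor labels and the function name are hypothetical; the entry years are those stated above.

```python
# Entry years as described in the case study; investor labels are hypothetical.
ENTRY_YEAR = {
    "private_1": 1839,
    "private_2": 1839,
    "state": 1860,
    "local_1": 1879,
    "local_2": 1879,
}

def eligible_investors(year, last_investor=None):
    """Investors that may invest in a given year.

    An investor must have entered the playing field and must not have been
    the investor that built a link in the previous round.
    """
    return [
        name for name, start in ENTRY_YEAR.items()
        if year >= start and name != last_investor
    ]

first_round = eligible_investors(1839)
after_state = eligible_investors(1861, last_investor="state")
late_round = eligible_investors(1880, last_investor="private_1")
```

Scenario B2 in the table of scenarios corresponds to calling such a filter without a `last_investor`, so that no investor is excluded directly after an investment.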

The scenarios used

Scenario | Description | \( \varphi \) (\( V^{\text{intr}} / V^{\text{base}} \) in km/h) | \( \gamma \) |
---|---|---|---|
A | Slow trains, elastic consumption | 3 (18/6) | 0.3 |
B1 | Fast trains, inelastic consumption | 5 (30/6) | 0 |
B2 | As B1, but investors are not excluded directly after an investment | 5 (30/6) | 0 |
B3 | As B1, but only private investors | 5 (30/6) | 0 |
B4 | As B3, but change in passenger mileage on competitor network is a factor for investment attractiveness | 5 (30/6) | 0 |
C | Slower walking speeds, B5 parameters | 7.5 (30/4) | 0 |
Rietveld and Bruinsma | Reproduction of Rietveld and Bruinsma (1998) | 5 (30/6) | 0 |

Measuring performance is meaningless without a baseline comparison of accuracy. To compare relative model performance, the model described by Rietveld and Bruinsma (1998) has been approximated using the TLS framework. The Rietveld and Bruinsma method repeatedly adds a straight line between the two cities that yield the highest expected return on investment. Only the 35 most populous cities in the country are taken into account. Costs are equal to length, with the exception of links that cross large waterbodies; those links cost a factor 20 more. No fixed costs or minimum travel times are applied, and varying investor differences are not accounted for. This model is implemented in TLS by selecting the highest value of Eq. (4), taking into account only the original subset of 35 cities. One link is added in every model iteration. All links are assumed to be private lines. The plausible paths method in Sect. 3.2 is adapted to exclude variation in estimated link revenues. The allocation procedure is finished when the pool of available cities is exhausted. We must note that a comparison with a socially optimal network (Li et al. 2010) is also useful here; further work is needed to establish norms for optimality and generate a meaningful optimum.

### 4.3 Investment choices

Because inland water transport provided the Dutch freight sector with a cheap substitute for rail, passenger transport was a particularly important service for Dutch railway investors (Filarski and Mom 2008). Furthermore, railways have been considered to possess unifying qualities (Veenendaal 2008), which were presumably sought after by the Dutch administration in the nineteenth century. Although the ‘United Provinces’ created in the seventeenth century had become a centrally led monarchy by 1806, the country was only starting to form a political union when the railways began to develop (Kossmann 1986).

To investigate the motives behind investment decisions in the development of the Dutch railway network, the conditional logit choice model in Eq. (1) has been fitted on sets of built and unbuilt railway links. Investments were separated into regular private lines, private lines that comply with local track legislation, and state lines. As noted before, return on investment is assumed to be the key driving force. Revenues are assumed to be linear in travelled distance; this cannot be validated because data on historical ticket pricing structures are currently unavailable. We thus implicitly assume that pricing levels were equal throughout the country regardless of regulation or level of competition. This is presumably not true, and the consequences are worth exploring in follow-up research.

*l*, are taken into account. Thus, \( A_{i}^{\text{curr}} \) is a measure of accessibility with initial travel times; \( A_{i}^{\text{opt}\,l} \) describes accessibility levels when including the travel cost improvements from the potential investment.

Furthermore, two dichotomous variables indicate whether a link connects to other links in the entire railway network and, in particular, to links on the operator’s own network. Connecting to the existing rail network is presumed to offer three advantages: option value from the revenues of later connections to further cities; operational cost reductions, because inventory can be kept at one centralized point; and the prestige that operators may attach to an extensive connected network. Another dichotomous variable indicates whether a link provides a first connection to a provincial capital or to the national capital, Amsterdam. Connecting to these cities might be attractive if investors expected larger growth of the passenger market in those cities, and might have prestige value as well. Yet another dichotomous variable indicates whether a link connects municipalities on the country border. This variable represents attempts to profit from international passenger and mail transport. A last dichotomous variable indicates whether a link connects to a sea harbour. This variable represents endeavours to connect Dutch sea harbours with their hinterlands by rail for the sake of goods transport.

The built links in the choice set were derived from the database of constructed railway links. We have used the following definition of a link: a link connects at least two existing nodes (railway junctions, stations or municipalities) and has been realized by an investor as one integrated project within a limited number of years. We assume that the results of the applied models are more accurate in the case of longer links, and therefore weight the results of Eq. (1) by the length of built link *o*, normalized by the average length of all built links in period *t* so that the total number of observations in the choice model is not affected. To generate a choice set of unbuilt links, we applied the following procedure: (1) a set of 50 alternatives was generated for all links that were built in one decade; (2) to simulate that investors presumably had limited capital in particular in the early stages of network development, the costs of railway construction of an alternative could not exceed the costs of a built railway in a longer period (either 1839–1859, 1859–1889 or 1889–1929); (3) selection of terminating municipalities and the routing of the intermediate path were not affected by the transport market saturation of municipalities *MS*.
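
The length weighting described above can be sketched as follows. Each built link receives a weight equal to its length divided by the mean length of the links built in the same period, so that the weights within a period sum to the number of links in that period. The function name, the tuple layout and the period labels are illustrative assumptions.

```python
from collections import defaultdict


def length_weights(links):
    """links: list of (link_id, period, length); returns {link_id: weight}."""
    links = list(links)
    total = defaultdict(float)
    count = defaultdict(int)
    for _, period, length in links:
        total[period] += length
        count[period] += 1
    # Mean length of the built links in each period.
    mean = {p: total[p] / count[p] for p in total}
    return {link_id: length / mean[period] for link_id, period, length in links}
```

For two links of 10 and 30 length units built in the same period, the weights are 0.5 and 1.5, which leaves the effective number of observations for that period at two.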

Results of fitting a conditional logit model on the attributes of the built and automatically generated unbuilt lines in the Dutch railway network

Variable | Scenario A: coefficient | | Scenario B1: coefficient | |
---|---|---|---|---|
Return on investment | | | | |
Private lines | 0.64** | (3.84) | 1.28** | (3.81) |
Private local lines | 0.11 | (0.58) | 0.38 | (0.79) |
State lines | 0.46 | (1.64) | 0.40 | (0.81) |
Change in accessibility inequality | | | | |
Private lines | 6.59** | (3.03) | 8.16** | (3.10) |
Private local lines | −21.00** | (−3.83) | −20.53** | (−3.70) |
State lines | −20.89** | (−4.48) | −23.20** | (−5.10) |
Connects operator network | | | | |
Private lines | 1.69 | (1.76) | 1.69 | (1.78) |
Private local lines | 2.42** | (3.11) | 2.22** | (2.87) |
State lines | 3.83* | (2.57) | 4.07** | (2.76) |
Connects railway network | | | | |
Private lines | −0.65 | (−1.05) | −0.71 | (−1.17) |
Private local lines | 0.05 | (0.12) | 0.14 | (0.31) |
State lines | −2.68** | (−3.41) | −2.77** | (−3.36) |
First connection to a provincial capital | | | | |
Private lines | 3.86** | (4.62) | 3.77** | (4.55) |
Private local lines | N/A | | N/A | |
State lines | −1.67 | (−1.34) | −1.90 | (−1.47) |
Connects border zone | | | | |
Private lines | 3.45** | (4.96) | 3.71** | (5.26) |
Private local lines | −0.59 | (−0.52) | −0.52 | (−0.45) |
State lines | 0.06 | (0.06) | −0.10 | (−0.10) |
Connects sea harbour | | | | |
Private lines | −0.35 | (−0.53) | −0.43 | (−0.64) |
Private local lines | −0.11 | (−0.17) | −0.01 | (−0.01) |
State lines | 1.46* | (2.15) | 1.41* | (2.11) |
McFadden’s pseudo-R² | 0.57 | | 0.57 | |
AIC | 262.95 | | 265.37 | |
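
To illustrate how such coefficients translate into choice probabilities, the sketch below applies the standard conditional logit form (McFadden 1974) to the scenario A coefficients for private lines. The attribute keys and the attribute values of the example options are invented for illustration, and Eq. (1) in this paper may contain terms that this minimal version omits.

```python
import math

# Scenario A coefficients for private lines, taken from the table above.
BETA_PRIVATE_A = {
    "roi": 0.64,
    "d_inequality": 6.59,
    "connects_operator": 1.69,
    "connects_network": -0.65,
    "first_capital": 3.86,
    "border": 3.45,
    "harbour": -0.35,
}


def utility(attrs, beta):
    # Linear-in-parameters utility; attributes that are not given count as 0.
    return sum(beta[k] * attrs.get(k, 0.0) for k in beta)


def choice_probabilities(options, beta):
    """Conditional logit: P_k = exp(V_k) / sum_j exp(V_j)."""
    v = [utility(o, beta) for o in options]
    m = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(x - m) for x in v]
    total = sum(expv)
    return [e / total for e in expv]
```

Given two otherwise identical options, one of which provides a first connection to a provincial capital, the option with the connection receives by far the larger share of the probability mass, reflecting the large positive coefficient for private lines.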

## 5 Simulation results

The differences in network shapes and network ownership are striking. In all cases, private lines mostly function as trunk lines, with the state providing peripheral extensions to the trunk network and local lines providing connections between trunk lines. With the exception of scenario C, local lines do not seem to have a dominant feeder function. The density of the trunk line network depends on overarching conditions: for example, with a lower value of \( \varphi \) the trunk network appears to be more extensive (cf. scenario A vs scenario B1). Interestingly, in the B2 variant, one operator obtains a complete monopoly over the private lines and expands that network much more than happens in a more competitive setting (cf. scenario B1). Possibly, greater network externalities allow for a greater density in the final network of the monopolist.

## 6 Closing remarks

This paper presents Transport Link Scanner (TLS), a model that simulates the expansion of transport networks. Based on a conditional logit method, the model repeatedly selects the most attractive link from a choice set and adds it to the expanding network. That choice set is generated using heuristics that aim to obtain a limited set of relevant, geographically plausible links. The model outlined in this paper explicitly allows the empirical estimation of preferences in a context with multiple actors with possibly different characteristics. It makes it possible to test, among other things, the impact of investor preferences, transport revenue structures and network effects on the final outcomes of a transport network.

A practical application of the model is presented as well. This exercise focuses on the expansion of the Dutch railway network in the nineteenth and early twentieth centuries and compares the model’s accuracy with a previous attempt by Rietveld and Bruinsma (1998). The results presented show that the early expansion of the Dutch railway network is simulated by TLS with similar accuracy as by Rietveld and Bruinsma, without the necessity of an a priori selection of connectable cities. The results corroborate findings that transport network expansion follows a clear rationale (Rietveld and Bruinsma 1998; Xie and Levinson 2011; Levinson et al. 2012), show that the modelling rationale can simulate network expansion processes with some success, and illustrate that institutional and economic settings may have a profound effect on network expansion outcomes. Future research is needed to further improve the accuracy of the model and to measure its performance in terms of characteristic spatial network metrics (Rodrigue et al. 2006). Another useful addition would be the inclusion of socially optimal networks (Li et al. 2010), which would enable exploration of how competitive investment decisions can be directed towards social optima (Anshelevich et al. 2003). Nevertheless, we conclude that the model shows promise as a tool for academic studies and policy evaluations.

## Footnotes

1. An assumption of multinomial logit models is independence of irrelevant alternatives. There have been some recent attempts to develop sampling strategies that may overcome this assumption; see, for example, Guevara and Ben-Akiva (2013). However, it is beyond the scope of the present paper to try and apply such methods, in particular since the generation of meaningful links is not trivial, as can be seen in the rest of the paper.

2. The results are available on request.

## Notes

### Acknowledgements

This work has profited immeasurably from the many inputs given by Piet Rietveld, whose untimely death has prevented him from seeing these final results. We must also thank Aart Huijg, Peter Groote, Maarten Hilferink, Martin van der Beek, the Dutch Railway museum and three anonymous reviewers, who have all had an important role in the preparation of this paper.

### References

- Alonso W (1978) A theory of movements. In: Hansen NM (ed) Human settlement systems: international perspectives on structure, change and public policy. Ballinger, Cambridge, pp 197–211
- Anshelevich E, Dasgupta A, Tardos E, Wexler T (2003) Near-optimal network design with selfish agents. In: Proceedings of the thirty-fifth annual ACM symposium on theory of computing, ACM, San Diego, 9–11 June 2003, pp 511–520
- Bala V, Goyal S (2000) A noncooperative model of network formation. Econometrica 68(5):1181–1229
- Baranzelli C, Jacobs-Crisioni C, Batista F, Castillo CP, Barbosa A, Torres JA, Lavalle C (2014) The reference scenario in the LUISA platform—updated configuration 2014 towards a common baseline scenario for EC impact assessment procedures. Report EUR 27019 EN. Publications Office of the European Union, Luxembourg
- Bekhor S, Ben-Akiva ME, Ramming MS (2006) Evaluation of choice set generation algorithms for route choice models. Ann Oper Res 144(1):235–247
- De Rigo D, Corti P, Caudullo G, McInerney D, Di Leo M, San Miguel-Ayanz J (2013) Toward open science at the European scale: geospatial semantic array programming for integrated environmental modelling. Geophys Res Abstr 15:13245
- De Vries JJ, Nijkamp P, Rietveld P (2001) Alonso’s theory of movements: developments in spatial interaction modeling. J Geogr Syst 3(3):233–256
- De Vries JJ, Nijkamp P, Rietveld P (2002) Estimation of Alonso’s theory of movements by means of instrumental variables. Netw Spat Econ 2(2):107–126
- Dobbin F, Dowd TJ (1997) How policy shapes competition: early railroad foundings in Massachusetts. Adm Sci Q 42(3):501–529
- Doedens A, Mulder L (1989) Een spoor van verandering: Nederland en 150 jaar spoorwegen. Bosch & Keuning, Baarn
- Donaghy KP (2010) Models of travel demand with endogenous preference change and heterogeneous agents. J Geogr Syst 13(1):17–30
- Economides E (1996) The economics of networks. Int J Ind Organ 14(2):673–699
- Filarski R, Mom G (2008) Van transport naar mobiliteit: de transportrevolutie 1800–1900. Walburg, Zutphen
- Fotheringham AS, O’Kelly ME (1989) Spatial interaction models: formulations and applications. Kluwer, Dordrecht
- Goodchild MF (1977) An evaluation of lattice solutions to the corridor location problem. Environ Plan A 9(7):727–738
- Grübler A (1990) The rise and fall of infrastructures. Dynamics of evolution and technological change in transport. Physica Verlag, Heidelberg
- Guevara CA, Ben-Akiva ME (2013) Sampling of alternatives in multivariate extreme value (MEV) models. Transp Res Part B 48(1):31–52
- Hilferink M, Rietveld P (1999) Land use scanner: an integrated GIS based model for long term projections of land use in urban and rural areas. J Geogr Syst 1(2):155–177
- HSM (1889) Financieel verslag over dienstjaar 1888. Metzler & Basting, Amsterdam
- Jacobs-Crisioni C, Batista e Silva F, Lavalle C, Baranzelli C, Barbosa A, Castillo CP, Perpiña Castillo C (2016) Accessibility and territorial cohesion in a case of transport infrastructure improvements with endogenous population distributions. Eur Transp Res Rev 8(9):1–16
- Knick Harley C (1982) Oligopoly agreement and the timing of American railroad construction. J Econ Hist 42(4):797–823
- Kolars J, Malin HJ (1970) Population and accessibility: an analysis of Turkish railroads. Geogr Rev 60(2):229–246
- Koopmans C, Rietveld P, Huijg A (2012) An accessibility approach to railways and municipal population growth, 1840–1930. J Transp Geogr 25(1):98–104
- Kossmann EH (1986) De lage landen 1780–1980: Twee eeuwen Nederland en België. Elsevier, Amsterdam
- Levinson D, Karamalaputi R (2003) Induced supply: a model of highway network expansion at the microscopic level. J Transp Econ Policy 37(3):297–318
- Levinson D, Xie F (2011) Does first last? The existence and extent of first mover advantages on spatial networks. J Transp Land Use 4(2):47–69
- Levinson D, Xie F, Oca NM (2012) Forecasting and evaluating network growth. Netw Spat Econ 12(2):239–262
- Li G, Reis SDS, Moreira AA, Havlin S, Stanley HE, Andrade JS (2010) Towards design principles for optimal transport networks. Phys Rev Lett 104(018701):1–4
- McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142
- Morrill RL (1970) The spatial organization of society. Duxbury Press, Belmont
- Nakicenovic N (1995) Overland transportation networks: history of development and future prospects. In: Batten D, Casti J, Thord R (eds) Networks in action. Springer, Heidelberg, pp 195–228
- ObjectVision (2014) Geo data and model server (GeoDMS). http://objectvision.nl/geodms. Accessed 30 Mar 2016
- Patriksson M (2008) Robust bi-level optimization models in transportation science. Philos Trans R Soc A 366(1872):1989–2004
- Patuelli R, Arbia G (2013) Editorial: advances in the statistical modelling of spatial interaction data. J Geogr Syst 15(3):229–231
- Rietveld P, Bruinsma F (1998) Is transport infrastructure effective? Springer, Berlin
- Rodrigue J-P, Comtois C, Slack B (2006) The geography of transport systems, 2nd edn. Routledge, London
- Scaparra M, Church R, Medrano FA (2014) Corridor location: the multi-gateway shortest path model. J Geogr Syst 16(3):287–309
- Sen A, Sööt S (1981) Selected procedures for calibrating the generalized gravity model. Pap Reg Sci Assoc 48(1):165–176
- Stationsweb (2009) Information on stations in the Netherlands. http://www.stationsweb.nl. Accessed 5 May 2016
- Stern E, Bovy PHL (1989) Theory and models of route choice behaviour. Research Institute of Urban Planning and Architecture, Delft
- Taaffe EJ, Morrill RL, Gould PR (1963) Transport expansion in underdeveloped countries: a comparative analysis. Geogr Rev 53(4):503–529
- Veenendaal AJ (1995) State versus private enterprise in railway building in the Netherlands, 1838–1938. Bus Econ Hist 24(1):186–193
- Veenendaal AJ (2008) Spoorwegen in Nederland, 2nd edn. Boom, Amsterdam
- Vitins BJ, Axhausen KW (2009) Optimization of large transport networks using the ant colony heuristic. Comput Civ Infrastruct Eng 24(1):1–14
- Vrtic M, Axhausen KW (2002) The impact of tilting trains in Switzerland: A route choice model of regional- and long distance public transport trips. Institut für Verkehrsplanung und Transportsysteme, ETH Zürich, Zürich
- Warntz W (1966) The topology of a socio-economic terrain and spatial flows. Pap Reg Sci Assoc 17(1):47–61
- Xie F, Levinson DM (2007) Jurisdictional control and network growth. Netw Spat Econ 9(3):459–483
- Xie F, Levinson D (2009) Modeling the growth of transportation networks: a comprehensive review. Netw Spat Econ 9(3):291–307
- Xie F, Levinson D (2011) Evolving transportation networks. Springer, New York
- Youn H, Gastner MT, Jeong H (2008) The price of anarchy in transportation networks: efficiency and optimality control. Phys Rev Lett 101(128701):1–4

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.