Introduction

The current technological advances in the field of autonomous buses (AB) allow for tests and operations of AB on public roads. Such pilot studies target investigations of user acceptance and vehicle operation of AB in transport systems. Most pilot studies either offer last-mile connections from a transportation hub or close a transportation gap between two stations. When operating AB as a transport mode on fixed routes, the operating costs are expected to fall by ca. 50% compared to conventional bus operations (Lidestam et al. 2018). Walters (1982) showed that an operating cost reduction facilitates a shift in resource allocation towards higher service frequencies and lower vehicle capacities. With higher frequencies and smaller vehicles, changes in the route network layout become possible. Eluru and Choudhury (2019) present eight different studies discussing potential impacts of autonomous vehicles on trip generation, vehicle-miles traveled and other transport-related metrics. All these studies assume demand-responsive autonomous vehicles rather than line-based operations. In contrast to these studies, we elaborate more holistically on the impacts of autonomous vehicles on the combined operator, infrastructure and user costs of fixed-line services, which deepens the understanding of system-level impacts of AB deployment. Additionally, we gain insights into how the reduction of operator costs under AB deployment affects the level of service for public transport users.

The transit network design and frequency setting problem (TNDFSP) is characterized as creating an optimal set of public transport bus routes including the service frequency for each bus route operating in the network. The problem has been subject to extensive research in the last decades. However, even though vehicle automation has the potential to greatly impact service design properties, it has not yet been systematically investigated. In this work, we formulate a variant of the TNDFSP that caters for the characteristics of AB and hence allows examining the potential impacts of its deployment on service design. With the deployment of AB, additional constraints regarding the connectivity of the bus lines are required. More specifically, every bus stop should be connected in the network with every other bus stop, so that all bus stops can be reached by any bus. This property has not been considered in past TNDFSP formulations. The line-based network characteristics and the connectivity constraint are motivated twofold. First, the use of a line-based public transport system is known to most people, has been accepted for a long time and has important merits in serving the bulk of the travel demand. Second, the legal operation of autonomous buses in most urban environments is currently bound to certain areas and/or specific roads. The legislative framework does not allow for the deployment of large fleet sizes in flexible operational environments; hence the line-based mode of operation is seen as the first practical use case. By adding line-based and connectivity characteristics to the problem formulation, the study investigates near-future and transition scenarios from traditional to autonomous public transport operations. Additionally, the deployment of AB affects the operational cost structure by reducing the crew cost, and the infrastructure cost by adding costs for AB specific road or bus stop enhancements.
The impact of both cost terms on the TNDFSP has not been studied previously. Further, the majority of past research on TNDFSP requires PT networks to serve every bus stop. In this AB specific TNDFSP the bus stop position is an output value. Thus, not every potential bus stop in the area of interest has to be served. This allows for a network design which is simultaneously benefiting operator focused designs and serving the demand with a user perspective in mind.

In this study, we investigate how the network design (number of routes and route alignment), the infrastructure (number of bus stops and network length), and the operating and traveler costs are affected by changed supply characteristics (service frequency and vehicle capacity) induced by AB. Additionally, the study investigates the impact of changes in supply provision on user costs and consequently its potential impacts on ridership. Furthermore, structural differences in user-focused network design and operator-focused network design are investigated. We solve the TNDFSP using a simulation-based multi-objective optimization framework. The created solutions are constrained by the maximum number of routes, the maximum service frequency and the number of bus stops per route. The use of an agent-based simulation module allows the simulation of walking as a transportation mode. Passengers thereby have the option to board a bus or walk at every bus stop and at every point in time. This is achieved by additional walking links added to the network design problem.

The remainder of the paper is structured as follows. First, the relevant literature is reviewed in the "Literature review" section. Then the methodology of this study is described in "Methodology". In the "Case study" section the case study is detailed. The following "Results" section discusses the impact of AB on the network design, differences between user-focused and operator-focused PT networks and benchmark results. The final section, "Discussion and conclusion", critically assesses the results and concludes about the applicability of the framework. This section closes with an outlook, sketching directions for potential future studies.

Literature review

This section reviews the relevant literature. First, works on transit network design and frequency setting problems are presented; second, studies presenting passenger assignment models in PT are introduced.

Network design

The strategic public transport planning process can be subdivided into five consecutive steps (Ceder and Wilson 1986). Step one is network design, where demand and supply data are used to create routes and operation strategies. In the second step, the service frequencies are determined based on the available vehicle fleet, policy constraints, and the created routes. In the next step, the timetable is developed, which results in trip arrival and departure times for each stop in the network. The fourth step is solving the vehicle scheduling problem based on schedule constraints, deadheading and work hour times. In the last step, the crew schedule is created. In the past, studies have proposed solutions for each of the five steps. Recent developments include the combined solution of multiple steps [see e.g. Michaelis and Schöbel (2009)]. Guihaire and Hao (2008) provide an extensive literature study about research on transit network design and scheduling. They categorize the original five-step approach by grouping (a) the first and second steps, (b) the second and third steps and (c) all first three steps together. The resulting problem categories are (a) transit network design and frequency setting problem (TNDFSP), (b) transit network frequency setting and scheduling problem (TNSP) and (c) transit network design and scheduling problem (TNDSP), respectively. In this study, we focus on the first combination, the TNDFSP. The solution for this problem implies finding a set of routes to serve a given area and their respective frequencies.

Transit design and transit planning problems are challenging to solve (Guihaire and Hao 2008). Magnanti and Wong (1984) state multiple transit network design problem formulations and elaborate on their complexity. The authors conclude that the general network design problem is NP-hard and therefore requires heuristics to be solved for large problem instances. Similarly, sub-problems of the general network design problem have been proven to be NP-hard, e.g. the uncapacitated budget design problem (Johnson et al. 1978) and the Steiner tree problem on a graph (Garey and Johnson 1979).

Table 1 provides an overview of selected studies that developed and applied heuristics or meta-heuristics to solve the TNDFSP. For each study, we mention the elements included in the objective function, decision variables and constraints; the solution method and the demand-related assumptions.

Table 1 Literature Overview TNDFSP

In most studies heuristic algorithms are employed to solve large-scale network design problems. The heuristic approaches include simulated annealing (SA), tabu search, genetic algorithm (GA) and artificial bee colony optimization (ABC) as well as other more niche algorithms for specific problems. Most of the variation in the studies lies in the details of the problem formulation and the size of the investigated network. Lampkin and Saalmans (1967) simultaneously optimize vehicle capacity and travel time of a single-route network. The demand level is fixed and many-to-many origin-destination demand scenarios are assumed. The approach optimizes a social welfare objective function characterizing the user cost and operator cost as a monetary value. More sophisticated models (Zhao and Zeng 2008; Chien et al. 2003; Chakroborty 2003) apply several constraints regarding the travel experience in the network. Hence the number of transfers, the directness of the routes and the number of stops on each route are controlled by appropriate constraint formulations. Fan and Machemehl (2006) present a simulated annealing algorithm to solve the TNDFSP. The algorithm objective is the combined cost of user, operator and unsatisfied demand. The problem is constrained by the maximum route capacity, maximum fleet size, maximum trip length and maximum number of routes. The authors apply the framework to a test network consisting of 160 nodes and 418 edges. They conclude that the results based on SA converge to lower total cost values than those obtained by applying GA to this problem. Similarly, Bourbonnais et al. (2019) utilize a GA to solve the TNDFSP using road network data. They optimize the sum of satisfied transit demand, unsatisfied transit demand and vehicle-hours. The authors show the applicability of their framework on three cities in Quebec, Canada.

A multi-objective network design formulation is proposed by Possel et al. (2018). The objectives are based on total emissions, the number of traffic accident fatalities and the total travel time. The main contribution is the formulation of a bi-level program which has the minimization of the objectives on the upper level and the passenger assignment on the lower level. The authors compare the optimization algorithm NSGA-II with simulated annealing algorithms and show the minor dominance of the NSGA-II. Szeto and Jiang (2012) discuss the application of an ABC algorithm for the TNDFSP. The objective value is computed as the weighted sum of passenger transfers and travel time. The problem is constrained by the vehicle fleet size. The service frequency is determined by a heuristic considering the maximum available vehicles per route. In a later work, Szeto and Jiang (2014) design a bi-level optimization approach for the network design problem utilizing a linear program for the frequency assignment as the bottom-level problem and an ABC approach for the top-level route creation problem. The authors apply their framework to a small synthetic network as well as the entire public transport networks of Winnipeg, Canada and Tin Shui Wai, China. It is concluded that the ABC algorithm outperforms the genetic algorithm for large networks.

To evaluate the user cost for a given network design, passengers must be assigned to service routes. Existing passenger assignment models can be classified into static and dynamic models, with deterministic or stochastic and frequency-based assignment (Nguyen and Pallottino 1988; Spiess and Florian 1989; Cepeda et al. 2006) or schedule-based assignment (Hickman and Bernstein 1997; Tong and Wong 1998; Poon et al. 2004). Liu et al. (2010) give a comprehensive overview of past and ongoing research in transit assignment models. With static assignment models the demand and supply are assumed to be constant over the period of analysis, hence within-day changes, e.g. variations in vehicle crowding and waiting times, cannot be studied (Liu et al. 2010). In addition to the analytical approaches for the network design problem, studies integrating simulation models into the optimization framework have recently been published. Neumann (2014) proposes a framework consisting of the agent-based simulation software MATSim and a genetic algorithm to design public transport networks for paratransit vehicle operations. The network is defined as a set of routes, each serving a sequence of stops and each having a dedicated start time. The genetic algorithm adjusts the route layout and operation starting times. The simulation model then computes passenger trip plans and movements throughout the network. The study replicates reference network operations and improves the operations in non-profitable public transport corridors. The main limiting factor of the framework, as stated by the author, is the limited search space.

Scientific contribution

In a related study, Kim et al. (2019) use a simulation-based approach to evaluate what impact autonomous buses have on the departure times for commuting trips. In their study the authors did not assume a public transport network but rather taxi-like operations using AB instead of private vehicles. The authors show that, in the autonomous case, the desired and actual departure times differ for many commuters. The authors conclude that additional policy regulations (e.g. dynamic pricing schemes or travel reservations) must be implemented to make autonomous taxi operations more popular.

The main contribution of this work is an AB specific network design and frequency setting problem, which aims to realistically represent passenger decisions and transport supply. For the analysis of AB systems, it is important to have a realistic and detailed passenger assignment with which it is possible to identify bottlenecks in passenger flow. Therefore, a dynamic passenger assignment model with within-day characteristics is chosen in this study. Through the addition of walking links across the entire designed public transport network, a realistic competing mode for the bus lines is implemented. The characteristics of autonomous buses compared to conventional buses are captured by presenting an adjusted operator cost model. Autonomous bus specific infrastructure costs for dedicated lanes and autonomous bus specific bus stops are assumed. The network design is also constrained to create connected PT networks in which every AB can reach every bus stop in the network without human interaction. In contrast to recent work in network design, we formulate the problem as a multi-objective optimization problem to optimize simultaneously for user cost, operator cost and infrastructure cost.

In comparison to Neumann (2014), the framework proposed in this work differs in two ways. First, the user cost is incorporated as an additional objective function component rather than as an additional objective, which means that user and operator interests are directly competing with each other. Second, the implemented heuristic optimization algorithm allows for a more efficient optimization of large solution spaces compared to a genetic algorithm, which allows for larger scenarios and more robust solutions.

Methodology

This section describes the proposed formulation and solution approach for the multi-objective transit network design and frequency setting problem (TNDFSP) with autonomous buses. First, the problem formulation is introduced. Then an overview of the different objectives is given. The section closes with the description of the solution approach employed.

Problem formulation

With the proposed problem formulation the impact of AB on the network design for the users, operators and authorities of PT systems can be computed. The problem is formulated as a multi-objective problem where each objective represents one stakeholder. The detailed formulation is given in the following subsections. To rate different solutions and analyze the impact of AB the network design is assessed using key performance indicators (KPI), such as walking times, in-vehicle times, number of transfers, passenger load, number of bus stops and cycle times of vehicles.

The decision variables of the problem are: (1) the number of bus routes, (2) the frequency of each route and (3) the bus stops served by each route. The design of the network is constrained by the maximum number of routes, and the minimum and maximum number of bus stops per route. To cater for AB specific requirements, the network of bus routes has to be connected so that each vehicle can reach any bus stop (connected graph), and no bus stop may be served twice on the same route (non-cyclic route). This connectivity constraint is motivated by the assumed AB operations. If a bus is driven autonomously, there should always be the possibility to drive back to the depot autonomously without any human interaction. Additionally, it should be possible for every bus to reach every position in the network so that in case of technical failures other buses can compensate. The infrastructure cost incorporates the fact that AB operations require e.g. dedicated lane operations and technology-enhanced bus stops. The number of bus stops is an output of the framework. The input values are the road network graph, the set of potential bus stops and an OD matrix.
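The two AB specific structural constraints can be illustrated with a short sketch. It assumes, beyond what the text states, that routes are given as ordered lists of stop identifiers and that consecutive stops are linked in both directions; the function names are hypothetical and not part of the framework.

```python
from collections import defaultdict


def has_repeated_stop(route):
    """Return True if any stop is served twice on the route (cyclic route)."""
    return len(set(route)) != len(route)


def is_connected(routes):
    """Return True if the union of all routes forms a single connected graph.

    Assumption: consecutive stops on a route are linked in both directions.
    """
    adjacency = defaultdict(set)
    for route in routes:
        for stop in route:
            adjacency[stop]  # register the stop even if it has no neighbor
        for a, b in zip(route, route[1:]):
            adjacency[a].add(b)
            adjacency[b].add(a)
    if not adjacency:
        return True
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:  # depth-first search from an arbitrary stop
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(adjacency)


def satisfies_ab_constraints(routes):
    """Connected network and no stop repeated within a route."""
    return is_connected(routes) and not any(has_repeated_stop(r) for r in routes)
```

In this sketch two routes sharing a hub, e.g. `[["A", "B", "C"], ["C", "D"]]`, pass the check, whereas two disjoint routes or a route that revisits a stop do not.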

Network definition

We consider a two-layer network representation to capture the infrastructure constraints and costs associated with the deployment of AB in addition to the service configuration. The bottom-level graph represents the physical road network (infrastructure graph). The nodes in this graph represent the road start- and endpoints and intersections. An edge in the graph is a road section on which an AB can operate. The top-level graph represents the bus stop connectivity (service graph). A node in this graph is a bus stop or bus hub. A bus stop is served by one service route, whereas a bus hub is served by a minimum of two service routes. An edge in the top-level graph is the shortest path connecting two bus stops, computed on the bottom-level graph. The number and position of all potential bus stops in the network are provided as input. A service route is defined as a sorted list of bus stops in the top-level graph. Figure 1 shows the relation between the two graphs.
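The mapping from the infrastructure graph to the service graph can be sketched as follows. The dictionary-based graph representation, the use of Dijkstra's algorithm and the function names are assumptions made for illustration; the text only states that a service-graph edge is the shortest path between two bus stops on the road network.

```python
import heapq


def shortest_path_length(road_graph, source, target):
    """Dijkstra on a dict {node: {neighbor: length}}; inf if unreachable."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # outdated queue entry
        for nxt, length in road_graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return float("inf")


def build_service_graph(road_graph, stop_nodes):
    """Map every ordered pair of bus stops to its shortest road distance."""
    return {
        (a, b): shortest_path_length(road_graph, a, b)
        for a in stop_nodes
        for b in stop_nodes
        if a != b
    }


# Illustrative road network: n2 is an intersection without a bus stop.
road_graph = {
    "n1": {"n2": 100.0, "n3": 500.0},
    "n2": {"n1": 100.0, "n3": 150.0},
    "n3": {"n1": 500.0, "n2": 150.0},
}
service_graph = build_service_graph(road_graph, ["n1", "n3"])
```

Here the service-graph edge between the two stops takes the detour via the intersection `n2`, because that path is shorter than the direct road section.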

Besides the bus routes, walking links are also added to the network. Passengers have the option to walk between every pair of nodes if these nodes are not more than 300 m apart. For rapid transit operations a minimum bus stop spacing of 500 m is required by the latest report of the City of Stockholm [see Firth (2012)]. Based on the geographical size of the studied network and the operation speed of the buses, 300 m is assumed to be a reasonable threshold distance since only a few bus stops are further apart; hence passengers have the option to reach nearly every point in the network on foot.
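A minimal sketch of how such walking links could be generated is given below. It assumes planar node coordinates in metres and straight-line distances, neither of which is specified in the text; `walking_links` is a hypothetical helper name.

```python
import math
from itertools import combinations

WALK_THRESHOLD_M = 300.0  # threshold distance stated in the text


def walking_links(node_xy, threshold=WALK_THRESHOLD_M):
    """Return (node_a, node_b, distance) for all pairs within the threshold.

    Assumption: node_xy maps node ids to planar (x, y) coordinates in metres
    and the straight-line distance is used.
    """
    links = []
    for (a, pa), (b, pb) in combinations(sorted(node_xy.items()), 2):
        distance = math.dist(pa, pb)
        if distance <= threshold:
            links.append((a, b, distance))
    return links


nodes = {"A": (0.0, 0.0), "B": (0.0, 300.0), "C": (0.0, 700.0)}
links = walking_links(nodes)
```

In this example only the pair A and B receives a walking link, since the other pairs are more than 300 m apart.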

Fig. 1 Connection between infrastructure graph, service graph and service routes

Demand formulation

The demand is represented as an origin-destination matrix with a distinct time-dependent passenger arrival rate (pax/h) for each origin-destination pair. The origin and destination are defined as bus stops in the network. The demand matrix is a required input to the model. A path is the route or combination of routes a passenger chooses to travel from the origin bus stop to the destination bus stop. A passenger path can consist of multiple routes or a combination of subsections of routes. The connection to a different route is possible at hubs, at the same stop if both routes stop there, or via walking to nearby bus stops. Besides walking between nearby nodes, passengers have the option to walk their entire trip.

Objective function

The aim is to minimize three conflicting objectives, namely (1) total user cost, (2) total operator cost and (3) infrastructure preparation cost. The multi-objective formulation allows for a diversified analysis of user-focused and operator-focused network design solutions. The input parameters for the model are listed and described in Table 2. The objective functions are formulated as follows:

Table 2 Input parameters for network design framework

User cost The total user cost (\(c_u\)) is the summation of the total waiting time (\(t_{w,p}\)), access and egress times (\(t_{a,p}\) and \(t_{e,p}\)), transfer penalty (\(tr_p\)), walking time (\(t_{wa,p}\)) and the perceived in-vehicle time (\(t_{piv,p}\)). To account for crowding in buses, the travel time per passenger inside a bus is multiplied with a crowding factor. The resulting time is the perceived in-vehicle time (\(t_{piv,p}\)). Hence the number of passengers is directly related to the perceived in-vehicle time. The transfer penalty (\(tr_p\)) adds additional time to the total travel time of a passenger since a transfer to a different route/vehicle is perceived as negative by passengers. The waiting time (\(t_{w,p}\)) is the time a passenger spends at a bus stop before the initial boarding and between alighting and boarding the next bus. The access time (\(t_{a,p}\)) is the walking time of a passenger from the origin to the origin bus stop; and the egress time (\(t_{e,p}\)) is defined as the time it takes to walk from the destination bus stop to the final destination. Each component is multiplied with the corresponding cost parameter and then summed over all travelers (\(\forall p \in P\)), where P is the set of all travelers, to obtain the total user cost (see Eq. 1).

$$\begin{aligned} c_{u} = \Big( \gamma _{wa} \cdot \sum _{p \in P} t_{wa,p} + \gamma _{wt} \cdot \sum _{p \in P} t_{w,p} + \gamma _{ivt} \cdot \sum _{p \in P} t_{piv,p} + \gamma _{atet} \cdot \sum _{p \in P} (t_{a,p}+t_{e,p}) + \gamma _{tr} \cdot \sum _{p \in P} tr_p \Big) \cdot \upsilon \end{aligned}$$
(1)
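Eq. 1 can be evaluated directly once the per-passenger time components are available from the simulation. The following sketch assumes a simple record per traveler and treats \(\upsilon\) as a scalar multiplier whose value would come from Table 2; the class, field names and parameter values are illustrative assumptions, not the framework's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class TripTimes:
    """Time components of one traveler p (hypothetical record layout)."""
    walking: float               # t_wa,p
    waiting: float               # t_w,p
    perceived_in_vehicle: float  # t_piv,p (already includes crowding factor)
    access: float                # t_a,p
    egress: float                # t_e,p
    transfer_penalty: float      # tr_p


def total_user_cost(trips, gamma, upsilon):
    """Weighted sum of all time components over all travelers (Eq. 1)."""
    weighted_time = 0.0
    for t in trips:
        weighted_time += (
            gamma["wa"] * t.walking
            + gamma["wt"] * t.waiting
            + gamma["ivt"] * t.perceived_in_vehicle
            + gamma["atet"] * (t.access + t.egress)
            + gamma["tr"] * t.transfer_penalty
        )
    return weighted_time * upsilon


# Illustrative weights and a single traveler; not values from the study.
gamma = {"wa": 1.0, "wt": 2.0, "ivt": 1.0, "atet": 1.0, "tr": 1.0}
trips = [TripTimes(1.0, 2.0, 10.0, 3.0, 4.0, 5.0)]
cost = total_user_cost(trips, gamma, upsilon=2.0)
```

With these illustrative numbers the weighted time is 27 and the resulting user cost is 54.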

If a bus stop is not served by any route in the solution network or if walking gives a higher utility than taking PT, passengers can walk to the closest served bus stop or walk to their final destination. Unsatisfied demand is accounted for through the walking time incurred when passengers decide to walk their entire trip.

Operator cost The operator cost (\(c_o\)) is the summation of capital costs (\(c_{cptl}\)) and operating costs (\(c_{oper}\)) for the vehicle fleet. The minimum required fleet size (\(n_r\)) can be estimated using the frequency (\(f_r\)) and cycle time (\(t_r\)) of a bus route (r) (\(n_r = t_r \cdot f_r / 60\)). The AB specific operating cost and capital cost formulations are in accordance with the formulations in Zhang et al. (2019).

The operating cost (\(c_{oper}\)) is the cost per hour considering the driver and the maintenance costs. In Eq. 2 the formula for conventional buses and AB is shown. The fleet size per route is multiplied by the sum of the unit fixed and unit size-dependent operating cost parameters. The vehicle capacity is denoted with \(\kappa\). For AB the unit fixed operating cost parameter is reduced by the proportion \(\eta\); for conventional bus operations \(\eta\) is set to zero. The operating cost for the network is obtained by summing over all routes.

$$\begin{aligned} c_{oper} = \sum _{r \in R} \frac{t_r \cdot f_r}{60} \left( (1-\eta ) \gamma ^{oper}_{fixed} + \gamma ^{oper}_{unit} \cdot \kappa \right) \end{aligned}$$
(2)

The capital cost (\(c_{cptl}\)) is defined as the fixed price for a vehicle depending on capacity and vehicle type. In Eq. 3 the formulation for the total capital cost of conventional buses and AB is shown. The capital cost for one route is the product of the fleet size and the unit fixed/size-dependent capital costs (\(\gamma\)), which is summed over all routes to arrive at the total capital cost. For autonomously operated routes the unit fixed capital cost is increased by the proportion \(\beta\); for conventional bus operations \(\beta\) is set to zero.

$$\begin{aligned} c_{cptl} = \sum _{r \in R} \frac{t_r \cdot f_r}{60} \left( (1+\beta ) \gamma ^{cptl}_{fixed} + \gamma ^{cptl}_{unit} \cdot \kappa \right) \end{aligned}$$
(3)
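To make the effect of \(\eta\) and \(\beta\) concrete, the operating and capital cost terms of Eqs. 2 and 3 can be combined in a few lines. The sketch assumes cycle times in minutes and frequencies in vehicles per hour, consistent with the factor 60; all numerical values are illustrative and not taken from Table 2, and the function names are hypothetical.

```python
def fleet_size(cycle_time_min, frequency_per_h):
    """Minimum fleet size n_r = t_r * f_r / 60."""
    return cycle_time_min * frequency_per_h / 60.0


def operator_cost(routes, capacity, g_oper_fixed, g_oper_unit,
                  g_cptl_fixed, g_cptl_unit, eta=0.0, beta=0.0):
    """Sum of capital (Eq. 3) and operating (Eq. 2) costs over all routes.

    routes: iterable of (cycle_time_min, frequency_per_h) tuples.
    eta, beta: AB specific proportions; both are zero for conventional buses.
    """
    cost_per_vehicle = (
        (1.0 + beta) * g_cptl_fixed + g_cptl_unit * capacity
        + (1.0 - eta) * g_oper_fixed + g_oper_unit * capacity
    )
    return sum(fleet_size(t, f) for t, f in routes) * cost_per_vehicle


# Illustrative network with two routes requiring two vehicles each.
routes = [(30.0, 4.0), (60.0, 2.0)]
params = dict(capacity=10.0, g_oper_fixed=20.0, g_oper_unit=0.5,
              g_cptl_fixed=10.0, g_cptl_unit=0.25)
conventional = operator_cost(routes, **params)
autonomous = operator_cost(routes, eta=0.5, beta=0.5, **params)
```

In this example the operator cost falls from 150 to 130 cost units, because the saving on the fixed operating cost outweighs the higher fixed capital cost.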

Infrastructure cost

The infrastructure cost (\(c_{infra}\)) considers the building cost of bus stops and the total length of the network. The unit building cost of one bus stop (\(\gamma _{stop}\)) is applied to every stop served by at least one bus line. The parameter \(\delta\) represents additional bus stop costs for AB operations, e.g. through larger bus stops and additional fences. In conventional bus scenarios \(\delta = 0\) holds. The cost associated with the length of the network is computed as the total network length \(\left( \sum _{r \in R}l_r\right)\), where \(l_r\) is the length of bus route \(r \in R\), multiplied by the road infrastructure building cost (\(\gamma _{link}\)) and the cost proportion \(\theta\), which accounts for AB specific infrastructure preparations, e.g. road markings and dedicated lanes. In conventional bus operations \(\theta = 0\). In Eq. 4, \(s_{r}\) is the number of bus stops served by route r in the network solution.

$$\begin{aligned} c_{infra} = \sum _{r \in R} \left( \left( 1 + \delta \right) \gamma _{stop} \cdot s_{r} + \gamma _{link} \cdot l_r \cdot \theta \right) \end{aligned}$$
(4)
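A direct transcription of Eq. 4 illustrates how the two AB specific proportions enter the infrastructure cost. The values below are illustrative assumptions and the function name is hypothetical; note that, as the equation is written, the link term vanishes for \(\theta = 0\).

```python
def infrastructure_cost(routes, gamma_stop, gamma_link, delta=0.0, theta=0.0):
    """Infrastructure cost following Eq. 4.

    routes: iterable of (number_of_stops s_r, route_length l_r) tuples.
    delta, theta: AB specific proportions; both are zero for conventional buses.
    """
    return sum(
        (1.0 + delta) * gamma_stop * stops + gamma_link * length * theta
        for stops, length in routes
    )


# Illustrative network: two routes with 5 and 3 stops.
routes = [(5, 2000.0), (3, 1000.0)]
conventional = infrastructure_cost(routes, gamma_stop=100.0, gamma_link=2.0)
autonomous = infrastructure_cost(routes, gamma_stop=100.0, gamma_link=2.0,
                                 delta=0.5, theta=0.25)
```

With these numbers the conventional scenario only carries the bus stop cost of 800, while the AB scenario adds the stop surcharge and the link preparation cost for a total of 2700.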

Mathematical problem formulation

The multi-objective optimization problem is formulated as follows:

$$\mathop {{\text{min}}}\limits_{{F,R}} \quad \quad Q_{1} = {\text{c}}_{{\text{u}}} (F,R),$$
(5)
$$\mathop {{\text{min}}}\limits_{{F,R}} \quad Q_{2} = \sum\limits_{{r \in R}} {\frac{{t_{r} \cdot f_{r} }}{{60}}} \left[ {\left( {(1 + \beta )\gamma _{{fixed}}^{{cptl}} + \gamma _{{unit}}^{{cptl}} \cdot \kappa } \right) + \left( {(1 - \eta )\gamma _{{fixed}}^{{oper}} + \gamma _{{unit}}^{{oper}} \cdot \kappa } \right)} \right],$$
(6)
$$\mathop {{\text{min}}}\limits_{R} \quad Q_{3} = \sum\limits_{{r \in R}} {\left( {\left( {1 + \delta } \right)\gamma _{{stop}} \cdot s_{r} + \gamma _{{link}} \cdot l_{r} \cdot \theta } \right)} ,$$
(7)
$${\text{subject to}}\quad f^{{min}} \le f_{r} \le f^{{max}} \quad \forall r \in R\qquad \qquad ({\text{min./max. service frequency}}),$$
(8)
$$s_{{min}} \le s_{r} \le s_{{max}} \;\quad \forall r \in R\qquad \qquad {\text{(min./max. stops per route}}),$$
(9)
$$r^{{min}} \le |R| \le r^{{max}} \;\;\;\qquad \qquad \qquad \quad {\text{(min./max. number of routes}})$$
(10)

where \(F = \{f_1, \ldots , f_{|R|}\}\) is the set of all line frequencies, R is the set of routes, \(s_r\) is the number of bus stops on route r and \(f_r\) is the service frequency of route r. The input set of potential bus stops is chosen to be large and evenly distributed over the area of interest. The first objective represents the user cost, which is calculated based on output extracted from the simulation model. In the simulation module further operational constraints (e.g. vehicle capacity, pick-up and drop-off timing of passengers, bus stop order along a route) are considered.
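The bound constraints in Eqs. 8 to 10 amount to a simple feasibility check on a candidate solution. The sketch below assumes that a solution is given as a list of routes (ordered stop lists) with a parallel list of frequencies; this representation and the function name are assumptions for illustration.

```python
def is_feasible(routes, frequencies, f_min, f_max, s_min, s_max, r_min, r_max):
    """Check the bound constraints of Eqs. 8-10 for one candidate solution."""
    if len(routes) != len(frequencies):
        return False  # every route needs exactly one frequency
    if not (r_min <= len(routes) <= r_max):
        return False  # Eq. 10: number of routes
    if any(not (f_min <= f <= f_max) for f in frequencies):
        return False  # Eq. 8: service frequency per route
    return all(s_min <= len(route) <= s_max for route in routes)  # Eq. 9


routes = [["A", "B", "C"], ["C", "D"]]
bounds = dict(f_min=2, f_max=12, s_min=2, s_max=5, r_min=1, r_max=3)
feasible = is_feasible(routes, [6, 4], **bounds)
```

The operational constraints handled inside the simulation module (vehicle capacity, boarding and alighting timing) are not covered by this check.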

Solution approach

The TNDFSP is a combinatorial optimization problem. The complexity of the network design problem grows exponentially with the input dimensions. Magnanti and Wong (1984) and Farahani et al. (2013) show that the network design problem is NP-hard. Similar problems such as the widely studied vehicle routing problem (Lenstra and Kan 1981; Toth and Vigo 2002), transport line design (Bussieck 1998) and the frequency setting problem (Michaelis and Schöbel 2009) have the same characteristics and have been proven to be NP-hard. Therefore, with the exception of very small instances, model applications need to be solved using heuristic optimization algorithms. The optimization algorithm for the TNDFSP used in this study is a variant of the multi-objective artificial bee colony (MOABC) heuristic optimization algorithm, combining elements described in Szeto and Jiang (2014) and Zou et al. (2011) with the NSGA-II algorithm first proposed by Deb et al. (2002). This variant allows for the solution of multi-objective problems and includes AB specific characteristics. In Fig. 2 the overall methodology of this study is presented. The optimization problem is framed with the light grey box. The input to the optimization layer is the initial solution set and the two-level graph, which contains the mapping information of all bus stops onto the underlying road network. The output of the MOABC is a network design solution including the sequence of bus stops on each route, frequency settings for each route and the corresponding objective values. The frequency setting is computed in three steps (compare Fig. 2). First, the solution is simulated using high service frequencies and high vehicle capacities for each route. Second, after the passenger assignment on this solution, the passenger load of each route is determined. Third, the service frequency for each route is computed from the determined passenger load per route and the vehicle capacity of that route.
With the assigned service frequency the PT network is simulated and the cost terms are computed based on the simulation outputs. Trials represent the number of network evaluations in the neighborhood of a given solution. If the maximum number of trials is not met, the same PT network is used as the basis for the next iteration. Once the termination criteria are met, the best PT network solutions are stored. If the termination criteria (maximum number of iterations or hypervolume convergence) are not met and the maximum number of trials is met, the route with the lowest passenger load is removed from the network (Fig. 2). The hypervolume for a solution is the volume of a cuboid which has its two diagonal points as the reference point and the solution vector, respectively. The reference point was chosen to be the origin and the solution vector is a 3-dimensional vector where each objective value is one dimension.
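The third step of the frequency setting can be sketched as follows. The text only states that the frequency is derived from the passenger load and the vehicle capacity; the specific rule used here (peak hourly load divided by capacity, rounded up and clipped to the frequency bounds) is an assumption, as is the function name.

```python
import math


def set_frequencies(peak_loads, capacity, f_min, f_max):
    """Assign a frequency (veh/h) per route from its peak load (pax/h).

    Assumed rule: enough departures per hour to carry the peak load,
    clipped to the admissible frequency range [f_min, f_max].
    """
    frequencies = {}
    for route_id, load in peak_loads.items():
        required = math.ceil(load / capacity)
        frequencies[route_id] = max(f_min, min(f_max, required))
    return frequencies


# Illustrative loads from a hypothetical high-frequency simulation run.
loads = {"r1": 130, "r2": 10, "r3": 1000}
frequencies = set_frequencies(loads, capacity=20, f_min=2, f_max=12)
```

In this example route r1 receives 7 departures per hour, while r2 and r3 are set to the lower and upper frequency bounds, respectively.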

Fig. 2 Methodology overview

Input

The framework requires three different input data sets: the demand as an OD matrix, the potential bus stop positions, and the road network. The road network is needed to compute the distances between bus stops and represents the feasible operation network. The OD matrix allows for a many-to-many travel pattern and specifies the demand in terms of passenger rate per hour.

Artificial bee colony optimization

The ABC algorithm is chosen for its capability to explore and exploit the solution space. The number of evaluations is not prohibitive in the context of strategic network design since the computation time is subordinate to the quality of the solution. The name of the algorithm is inspired by the behavior of bees trying to find nectar in the proximity of their hive. The concept in the proposed algorithm is based on the following analogy. Bees spread out in the proximity of the hive to explore the neighborhood for good quality nectar. If a bee is successful, it flies back to the hive and reports to other bees about the position and the quality of that food source. In this study, a food source represents a solution to the network design problem and the quality of that food source is given by the objective values of the solution. The algorithm defines different types of bees. First, the employed bees explore the available food sources and report their information. Second, the onlooker bees receive and process the information from the employed bees. Each onlooker bee decides about the neighborhood near a reported food source to be explored by the employed bees. As soon as a source is empty (no better solution can be found), the employed bee changes its type to a scout bee. This bee type searches the entire harvesting area for new food sources. The algorithm is initialized with multiple random feasible solutions, while the convergence is assessed by computing the hypervolume (Auger et al. 2012) at each iteration.
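The hypervolume-based convergence check can be illustrated with a small sketch. The cuboid volume follows the per-solution definition given in the solution approach (reference point at the origin). How the per-solution volumes are aggregated into one value per iteration and the relative-change stopping rule are not specified in the text and are assumptions here; the function names are hypothetical.

```python
import math


def cuboid_volume(objectives, reference=(0.0, 0.0, 0.0)):
    """Volume of the cuboid spanned by the reference point and a solution."""
    return math.prod(abs(q - r) for q, r in zip(objectives, reference))


def has_converged(history, tol=1e-3, window=3):
    """Assumed rule: the last `window` relative changes are all below tol.

    history: one hypervolume value per iteration (aggregation left to caller).
    """
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    for prev, curr in zip(recent, recent[1:]):
        if abs(curr - prev) > tol * max(abs(prev), 1e-12):
            return False
    return True


volume = cuboid_volume((2.0, 3.0, 4.0))
```

For a solution with objective values (2, 3, 4) the cuboid volume is 24; a sequence of such values that no longer changes over the chosen window would be treated as converged.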

Transferred to the network design problem, the solution space is defined by all feasible network design configurations (e.g. combination of nodes, links and routes). In this study one solution is defined as a set of routes and a set of associated frequencies. The routes are generated using the underlying fully connected bus stop graph (see Fig. 3). A neighborhood solution is defined as a change in a single route in the current solution.

Fig. 3 Example bus stop graph and a corresponding bee (solution)

The general ABC algorithm for single-objective problems is adjusted in this work to cope with the special requirements of multi-objective problems. The different steps (mutation, fitness computation and ranking of solutions) and types of bees (employed bee, onlooker bee, and scout bee) are preserved to maintain the explore, exploit and elitism characteristics of the original ABC algorithm. Figure 4 gives a schematic overview of the step-wise logic of the multi-objective artificial bee colony optimization algorithm (MOABC). The proposed framework is a combination of the methodologies presented in Zou et al. (2011), Szeto and Jiang (2014) and Deb et al. (2002), whereas the visualization scheme is based on the algorithm of Deb et al. (2002). In contrast to Deb et al. (2002), the concept of an onlooker phase has been added to the algorithm. The algorithm in Zou et al. (2011) combines the employed phase and onlooker phase into one step, whereas in this work the two phases are processed in consecutive steps in alignment with the single-objective ABC algorithm by Karaboga (2005). The ranking of solutions is achieved through a dominance scheme considering multiple objectives. Specific solutions which have been ranked among the best solutions for the past iterations are modified randomly using one mutation operation. Hence the optimization algorithm is capable of simultaneously exploring and exploiting the solution space while keeping track of past computations to avoid repeated solution evaluations and sub-optimal areas in the solution space. In the following paragraphs, the key concepts of the implemented ABC algorithm are explained.

Fig. 4 Overview of the multi-objective artificial bee colony algorithm (MOABC)

Mutation To exploit one solution, the neighborhood of that solution has to be created. For the neighborhood creation, we introduce three different mutation algorithms (see Fig. 5). The first type of mutation is addition, where a connected node is added to a route from a given solution. The route and the node insertion position are selected randomly. The second type of mutation is removal, where a random node in a route from a given solution is removed. The third type of mutation is the exchange of a random node in a random route from a given solution with another feasible node such that connectivity is guaranteed. These three operations guarantee that the created neighborhood solutions are similar to the initial solution and therefore exploit the solution space locally. The choice of mutation is random. This procedure restricts the attention to simple neighborhood moves in the space of feasible solutions, whereas the removal of the route with the lowest passenger load allows for larger moves in the solution space. The removal of a route is done after a certain number of trial computations have been performed (see Fig. 2).
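The three mutation types can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this study: a route is assumed to be a list of stop identifiers, `neighbors` is a hypothetical adjacency mapping of the bus stop graph, a route needs at least two stops, and an infeasible move leaves the route unchanged.

```python
import random


def is_feasible(route, neighbors):
    """Assumed feasibility: at least two stops, no repeats, consecutive stops adjacent."""
    no_repeats = len(set(route)) == len(route)
    connected = all(b in neighbors[a] for a, b in zip(route, route[1:]))
    return len(route) >= 2 and no_repeats and connected


def add_node(route, neighbors, rng):
    """Addition: insert an unused stop at a random position if the result is feasible."""
    pos = rng.randint(0, len(route))
    candidates = [n for n in neighbors if n not in route]
    rng.shuffle(candidates)
    for node in candidates:
        trial = route[:pos] + [node] + route[pos:]
        if is_feasible(trial, neighbors):
            return trial
    return list(route)


def remove_node(route, neighbors, rng):
    """Removal: drop a random stop if the remaining route stays feasible."""
    pos = rng.randrange(len(route))
    trial = route[:pos] + route[pos + 1:]
    return trial if is_feasible(trial, neighbors) else list(route)


def exchange_node(route, neighbors, rng):
    """Exchange: replace a random stop by an unused stop that keeps the route feasible."""
    pos = rng.randrange(len(route))
    candidates = [n for n in neighbors if n not in route]
    rng.shuffle(candidates)
    for node in candidates:
        trial = route[:pos] + [node] + route[pos + 1:]
        if is_feasible(trial, neighbors):
            return trial
    return list(route)


def mutate(route, neighbors, rng=None):
    """Apply one randomly chosen mutation type to a copy of the route."""
    rng = rng or random.Random()
    move = rng.choice([add_node, remove_node, exchange_node])
    return move(route, neighbors, rng)


# Hypothetical four-stop line graph 1-2-3-4.
graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(mutate([1, 2, 3], graph, random.Random(1)))
```

Because every move changes at most one stop, the mutated route stays in the immediate neighborhood of the original one.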

Fig. 5 Neighborhood definition and mutation creation algorithms for MOABC

Crowding distance and dominance Since the multi-objective problem formulation does not allow for a single compensatory fitness computation, the ranking of solutions is done in a two-step approach. First, a fast non-dominated sorting algorithm is used to find non-dominated solutions and create different levels of non-dominated fronts (Fig. 6). A solution (\(bee_1\)) is said to dominate another solution (\(bee_2\)) if all of its objective values are smaller than or equal to those of \(bee_2\) and at least one objective value is strictly smaller. For simplicity, only two out of the three objectives are visualized in Fig. 6. In this example two levels of non-dominated fronts have been computed. The second step is the computation of the crowding distance. The crowding distance for a solution is the sum over all objectives of the mean distance between the neighboring solutions. The crowding distance thus provides a measure of the diversity around each point. The final ranking of the solutions combines both metrics (see Algorithm 1). The solutions of the same non-dominated front are sorted based on their crowding distance value. The larger the distance, the higher the solution is ranked. This is done to increase the heterogeneity of the investigated solutions. If the diversity were low, the solutions would not cover the entire range of the non-dominated front.
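The two-step ranking can be sketched as follows. The sketch assumes minimization of all objectives, uses a simple front-peeling loop instead of the fast bookkeeping of Deb et al. (2002), and normalizes the crowding distance per objective in the NSGA-II manner; the exact normalization used in this study may differ. All function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(points):
    """Peel off successive non-dominated fronts; returns lists of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """NSGA-II style crowding distance for the solutions of one front."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        order = sorted(front, key=lambda i: points[i][m])
        low, high = points[order[0]][m], points[order[-1]][m]
        # Boundary solutions are always kept by giving them infinite distance.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (high - low)
    return dist


def rank(points):
    """Order indices by front level first, then by decreasing crowding distance."""
    ranked = []
    for front in non_dominated_fronts(points):
        dist = crowding_distance(points, front)
        ranked.extend(sorted(front, key=lambda i: -dist[i]))
    return ranked


# Hypothetical two-objective example.
pts = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(non_dominated_fronts(pts))  # [[0, 1, 2], [3], [4]]
print(rank(pts))                  # [0, 2, 1, 3, 4]
```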

Fig. 6 Non-dominated sorting and diversity computation

figure a

Employed bee phase Based on the non-dominated and crowding distance sorting, we select the best 90% of the solutions as employed bees. If the total population is denoted as P, the number of bees in the system at this step is \(\epsilon P = 0.9 \cdot P\). Each employed bee solution is mutated, resulting in a total of \(2\epsilon P\) solutions. Each mutated solution is compared with the original solution using Algorithm 2. If the mutated solution dominates the original solution, the mutation replaces the original solution. If the original solution dominates the mutation, the original solution stays in the population. If neither of the two solutions dominates the other, we pick one of the two at random and keep it in the population. After the employed bee phase the total number of bees in the population is again \(\epsilon P\).
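The pairwise comparison between an original solution and its mutation can be sketched as follows. The function `select_survivor` and the callable `objectives`, which maps a solution to its objective vector, are hypothetical names; the sketch assumes minimization and is not a transcription of Algorithm 2.

```python
import random


def dominates(a, b):
    """True if objective vector a dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def select_survivor(original, mutated, objectives, rng=None):
    """Keep the dominating solution; break ties between incomparable solutions at random."""
    rng = rng or random.Random()
    f_orig, f_mut = objectives(original), objectives(mutated)
    if dominates(f_mut, f_orig):
        return mutated
    if dominates(f_orig, f_mut):
        return original
    return rng.choice([original, mutated])


# Hypothetical example in which a solution is its own objective vector.
print(select_survivor((2, 2), (1, 1), lambda s: s))  # (1, 1)
```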

figure b

Onlooker bee phase The purpose of the onlooker bee phase is to focus on the neighborhoods of highly ranked solutions. In this study this is done by computing the summed normalized objective value for each solution in the population. The selection probability for each solution is proportional to this normalized value. A total of \(\epsilon P\) bees is generated in this way, resulting in a population size of \(2\epsilon P\). This population is then sorted based on non-dominated fronts and crowding distance, and only the solutions of the first front are kept in the population.
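A possible form of this selection step is sketched below. The direction of the normalization is an assumption made for the illustration: each objective is scaled to [0, 1] over the population so that the lowest cost receives 1 and the highest cost receives 0, and the scaled values are summed. The names `selection_weights` and `pick_onlookers` are hypothetical.

```python
import random


def selection_weights(objs):
    """Summed normalized objective score per solution (assumed: lower cost scores higher)."""
    n_obj = len(objs[0])
    lows = [min(o[m] for o in objs) for m in range(n_obj)]
    highs = [max(o[m] for o in objs) for m in range(n_obj)]
    weights = []
    for o in objs:
        score = 0.0
        for m in range(n_obj):
            span = highs[m] - lows[m]
            # If all solutions share the same value, every solution scores 1 on this objective.
            score += 1.0 if span == 0 else (highs[m] - o[m]) / span
        weights.append(score)
    return weights


def pick_onlookers(objs, count, rng=None):
    """Draw `count` solution indices with probability proportional to the weights."""
    rng = rng or random.Random()
    return rng.choices(range(len(objs)), weights=selection_weights(objs), k=count)


# Hypothetical two-objective population.
population = [(1, 1), (2, 2), (3, 3)]
print(selection_weights(population))  # [2.0, 1.0, 0.0]
```

Under this assumption the worst solution in every objective is never selected, which concentrates the search on the neighborhoods of well-performing solutions.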

Scout bee phase In the last phase of the MOABC algorithm, the number of bees is adjusted to the initial population number. If the current number of bees in the population is lower than the initial population number, a corresponding number of bees is added to the population. If the current number of bees in the population is higher than the initial population number, the best-ranked solutions from the onlooker bee phase are chosen to remain in the population so that the total population is P.
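The population adjustment can be sketched as follows, assuming that the population list is already ordered from best to worst and that missing bees are filled with newly generated solutions. The callable `new_random_solution` is a hypothetical placeholder for the route generation procedure.

```python
def scout_phase(ranked_population, target_size, new_random_solution):
    """Trim or refill the population so that it contains exactly `target_size` bees."""
    if len(ranked_population) > target_size:
        # Keep only the best-ranked solutions.
        return ranked_population[:target_size]
    missing = target_size - len(ranked_population)
    return ranked_population + [new_random_solution() for _ in range(missing)]


# Hypothetical example with string placeholders for solutions.
print(scout_phase(["a", "b", "c"], 2, lambda: "new"))  # ['a', 'b']
print(scout_phase(["a", "b", "c"], 5, lambda: "new"))  # ['a', 'b', 'c', 'new', 'new']
```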

Route generation

The creation of a new route in this work follows three criteria. First, the new route has to connect to the existing network by sharing at least one bus stop. This is in line with the theory of preferential attachment (Barabasi and Albert 1999). Second, the new route should have a feasible number of bus stops. Third, the new route should not include any loop, i.e. no bus stop is served twice. If no feasible route fulfills all three criteria, no new route is added to the existing network.
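The three criteria can be expressed as a simple feasibility check. The bounds `min_stops` and `max_stops` are hypothetical placeholders for the feasible number of bus stops, which is not specified in this section.

```python
def is_feasible_new_route(route, existing_routes, min_stops=2, max_stops=10):
    """Check the three route generation criteria for a candidate route."""
    served = {stop for r in existing_routes for stop in r}
    shares_stop = any(stop in served for stop in route)  # connects to the network
    size_ok = min_stops <= len(route) <= max_stops       # feasible number of stops
    loop_free = len(set(route)) == len(route)            # no stop served twice
    return shares_stop and size_ok and loop_free


# Hypothetical existing network with two routes.
network = [[1, 2, 3], [3, 4, 5]]
print(is_feasible_new_route([5, 6, 7], network))  # True
print(is_feasible_new_route([8, 9], network))     # False, no shared stop
print(is_feasible_new_route([5, 6, 5], network))  # False, stop 5 served twice
```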

With the proposed route generation method the requirements for an AB specific network are met. We guarantee connected routes for an autonomous vehicle flow between the routes, emphasize long operation duration of AB and at the same time allow for a controlled extension of the network. Additionally, the maintenance of the vehicles, recharging and legal requirements can be guaranteed if the AB vehicles can reach any point in the network without leaving dedicated routes. Similarly to traditional bus networks, the AB networks designed and analyzed in this paper are line-based.

Simulation

For the simulation part of the model, an agent-based simulation software (Cats et al. 2010) is utilized. The inputs for this module are the OD matrix, the network definition and the simulation specific parameters (including duration and number of replications). After the simulation has terminated, the KPIs and passenger loads are extracted for the simulated network and are used to evaluate the network solution under consideration.

The simulation software incorporates a dynamic public transport assignment model. The simulation of operations considers several transit-related parameters, e.g. bus routes, level of demand, PT schedule, locations of bus stops and more. A schedule-based assignment model was chosen to realistically capture the different ranges of potential service frequencies. Additionally, the presented results are not dependent on the detailed schedule the vehicles operate on. This affects waiting times at transfers in the network. Since the resulting networks are simple and have only few lines this negative effect can be neglected. In this model, passenger route choices and vehicle travel times are stochastic. The passenger assignment model includes within-day and day-to-day dynamics (Cats and West 2020). A random utility model determines the passenger’s path choice decisions. The utility of each potential path, from the current position to the trip’s destination, for every passenger is evaluated based on the expected waiting time, in-vehicle time, walking time and number of transfers. These metrics are computed based on real-time transit information and the current bus schedule. The utility of an action is calculated as the logsum of the utilities of all paths available given this action. The final decision is made based on the probability distribution as given in Eq. 11, where \(p_{t,k}\) is the probability for action \(t \in T\) of passenger k based on the utility \(u_{t,k}\) for action t and passenger k.

$$p_{t,k} = \frac{e^{u_{t,k}}}{\sum\limits_{i \in T} e^{u_{i,k}}}$$
(11)
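The logsum of path utilities and the choice probabilities of Eq. 11 can be illustrated with a short numerical sketch. The function names and the utility values are hypothetical; subtracting the maximum utility before exponentiation is a standard numerical safeguard and does not change the resulting probabilities.

```python
import math


def logsum(path_utilities):
    """Utility of an action as the logsum over the utilities of its available paths."""
    m = max(path_utilities)
    return m + math.log(sum(math.exp(u - m) for u in path_utilities))


def choice_probabilities(action_utilities):
    """Multinomial logit probabilities of Eq. 11 for one passenger k."""
    m = max(action_utilities.values())
    exp_u = {t: math.exp(u - m) for t, u in action_utilities.items()}
    total = sum(exp_u.values())
    return {t: e / total for t, e in exp_u.items()}


# Hypothetical utilities: boarding gives access to two paths, walking to one.
utilities = {"board": logsum([-1.0, -1.5]), "walk": logsum([-2.0])}
print(choice_probabilities(utilities))
```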

The day-to-day dynamics fuse the passengers' experienced travel times from a previous day with the expected travel times using a Markov decision process. With this simulation, the resulting load profile of a given PT network is the outcome of service reliability, on-board crowding and real-time information, which makes it possible to analyze the dynamic passenger load distribution resulting from various service configurations. The final outputs of the simulation model are averaged values of ensemble runs of the same scenario. This accounts for the stochastic decisions in the simulation model.

Post-processing

The post-processing steps aim at creating more practically applicable and realistic solutions. To achieve this, two sequential steps have been implemented. The first is a basic sub-cycle elimination routine, and the second merges nearby served bus stops. Each of these steps creates a child solution which is evaluated by re-simulation and computation of the objective values. The child solution is accepted as a new solution if it dominates its parent. The outputs of the post-processing step are the final proposed solutions to the network design problem.

Sub-cycle elimination In the first post-processing step each route of the solution is analyzed. If the route serves the same point twice on the road network a cycle is detected. A cycle is then removed by rearranging the order of the served bus stops. The rearrangement is done by creating k-shortest paths serving all bus stops of that route and choosing the shortest path without a cycle. If only cyclic bus routes can be found the original solution advances unchanged to the next post-processing step. An example of this step is shown in Fig. 7.

Fig. 7 Illustrative example of the sub-cycle elimination post-processing step

Merging close by stops The service graph generation algorithm allows for the creation of spatially close bus stops. To prevent two spatially close bus stops from both being served, the algorithm finds all neighboring served bus stop pairs which are less than 300 m apart. If such a pair is found, the bus stop with the lower passenger load on the incoming link is removed from the corresponding route. Passengers who previously used this bus stop are assigned new paths to their destination.
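The merging step can be sketched for a single route as follows. The sketch rests on several assumptions that are not stated in the text: neighboring stops are taken to be consecutive stops on the route, distances are Euclidean distances between planar coordinates in metres, and the load on the incoming link is available per stop. The reassignment of passengers is left to the subsequent re-simulation.

```python
import math


def merge_close_stops(route, coords, incoming_load, threshold_m=300.0):
    """Remove the lower-load stop of every consecutive stop pair closer than the threshold."""
    merged = list(route)
    i = 0
    while i < len(merged) - 1:
        a, b = merged[i], merged[i + 1]
        if math.dist(coords[a], coords[b]) < threshold_m:
            # Drop the stop with the lower load, then re-check the newly formed pair.
            drop = a if incoming_load[a] < incoming_load[b] else b
            merged.remove(drop)
        else:
            i += 1
    return merged


# Hypothetical route with stops A and B only 200 m apart.
coords = {"A": (0.0, 0.0), "B": (200.0, 0.0), "C": (1000.0, 0.0)}
loads = {"A": 5, "B": 2, "C": 7}
print(merge_close_stops(["A", "B", "C"], coords, loads))  # ['A', 'C']
```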

Case study

The modeling framework is used to design a new PT network for Barkarby, located in the northern part of Stockholm and connected with the city center by train. To match current travel demand and connect residents with the existing PT system, a line-based AB service has been operated since the beginning of 2019. However, due to the building of new residential areas and an industrial park, the daily number of trips is expected to increase to approximately 10,000 in 2025. Therefore, an extension and redesign of the existing PT system is required, including a larger line-based AB network in combination with a new metro line. The proposed framework allows incorporating changed circumstances, e.g. new road network, increased demand, and integration of new metro stations in the existing PT system. Furthermore, it allows evaluating the effect of different vehicle technologies on the provided level of service in the area.

Input data The framework requires three main sets of input data: the road network, the demand rate per hour in origin-destination matrix form and the potential bus stop locations. The road network is extracted using OpenStreetMap data (OpenStreetMap 2017). An additional pre-processing step is applied to remove non-drivable roads, e.g. narrow roads, complex intersections and private sections. The pre-processing step is specifically relevant for the operation of AB. Due to regulatory, legislative and technical constraints AB are currently not able to operate on all existing roads. The filtering of drivable roads is done based on input from the local authorities. The demand is estimated based on existing PT data and forecast models of the PT operator. The bus stop map, which includes all potential bus stop locations, is created manually with the input from local authorities. Following this process, it can be assured that the generated solutions represent realistic and drivable network designs. The case study area is shown in Fig. 8. The green dots represent potential bus stops, the arrows indicate the volume and direction of the demand and the blue lines show the underlying road network.

Fig. 8 Representation of the case study including an abstraction of the input values

Parameter settings The case study uses the parameters presented in Table 3. The parameters for the optimization algorithm are chosen following the literature and based on the benchmark tests performed. The value of time of passengers in Stockholm is specified as 69 SEK/hour (Börjesson and Eliasson 2014). The fixed costs of a bus stop are estimated as 12,000 SEK based on average publicly available information (Anderson et al. 2015), while the network length-dependent cost is taken from the latest EU report on transport infrastructure expenditures and costs (Schroten et al. 2019). The average infrastructure costs per kilometer of road network length in Sweden are stated to be 120,000 SEK/km. The additional AB specific infrastructure costs are estimated to be 10%, hence the cost parameters are \(\delta = 0.1\) and \(\theta = 0.1\). These values are chosen based on reported infrastructure enhancement costs in BRT projects worldwide (Menckhoff 2005; Hensher and Golob 2008; Hidalgo et al. 2013; Nikitas and Karlsson 2015). The accuracy of these additional cost estimations has to be seen as low due to the lack of available data. In this study, the absolute cost values are not of critical importance, as the multi-objective problem formulation allows for an impact study based on infrastructure cost changes.

The internal simulation parameters are based on the values reported in Cats (2013). The vehicle capacity is set to match the AB currently operating in the residential area as part of a pilot study. To account for statistical variations the final variable value is the mean value of five individual simulations. The required number of replications is computed according to the work by Burghout (2004) and Dowling et al. (2004). To account for the stochastic nature of the MOABC the final results presented are the averaged values over 10 full framework iterations. The visualized final network designs are the best from all simulation results.

Table 3 Optimization and simulation parameters used in the case study

Results

This section is divided into two parts. First, the framework is applied to Mandl’s benchmark network and the achieved results are compared with results published in the literature. Second, an application of the framework to a case study in Stockholm, Sweden is performed and analyzed.

Mandl network benchmark

Many of the past studies on optimal network design (Mandl 1980; Baaj and Mahmassani 1991; Kidwai 1998; Chakroborty 2003; Fan and Mumford 2010; Nikolić and Teodorović 2014) have been applied to a benchmark network first proposed by Mandl (1980). The network consists of 15 nodes and 20 edges (see Fig. 9). Since the majority of these studies propose a single-objective problem formulation, we adjust the framework of this work to match their objective formulations. The objective function chosen for the benchmark testing is shown in Eq. 12. Additionally, the walking links are deactivated to benchmark our results, meaning that passengers are assumed to always board the bus if the bus stop is served by a vehicle.

$$\text{min:} \quad Z = \alpha \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} d_{ij}\,p_{ij} + \beta \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} d_{ij}\,t_{ij}$$
(12)

The problem is constrained by (1) a maximum route length, (2) a connected network, (3) a minimum number of nodes per route, and (4) routes that are free of cycles and backtracks. In Eq. 12, \(\alpha\) and \(\beta\) are weights for the two terms. The first term describes the path length \(p_{ij}\) between the two nodes \(i,j\in N\) multiplied by the demand \(d_{ij}\) between these two nodes. The second term describes the number of transfers \(t_{ij}\) per path between the nodes multiplied by the corresponding demand. This reduces the problem complexity since no passenger travel cost components such as in-vehicle times, walking times and waiting times, and no infrastructure costs, are considered. Additionally, the objective function is based on a single objective rather than on multiple objectives as proposed in Eq. 5. This limits the applicability to more complex problems as presented in Sect. 4. However, the quality of the benchmark results can still be transferred to the complex case since the same heuristics and algorithms are used in both cases.
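A small worked example illustrates how Eq. 12 is evaluated. The three-node matrices and the weights below are hypothetical and chosen only to keep the arithmetic easy to follow; only the upper triangle of each matrix is used, as in the double sum of Eq. 12.

```python
def benchmark_objective(demand, path_length, transfers, alpha=1.0, beta=1.0):
    """Evaluate Z of Eq. 12 over all node pairs i < j."""
    n = len(demand)
    z = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            z += alpha * demand[i][j] * path_length[i][j]  # demand-weighted path length
            z += beta * demand[i][j] * transfers[i][j]     # demand-weighted transfers
    return z


# Hypothetical data for three nodes.
d = [[0, 10, 5], [0, 0, 20], [0, 0, 0]]  # demand d_ij
p = [[0, 4, 9], [0, 0, 5], [0, 0, 0]]    # path length p_ij
t = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]    # transfers t_ij

# 1 * (10*4 + 5*9 + 20*5) + 5 * (5*1) = 185 + 25
print(benchmark_objective(d, p, t, alpha=1.0, beta=5.0))  # 210.0
```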

The solution method parameters are chosen to be 20 initial solutions, 100 total iterations and 10 trials per solution. Table 4 shows the results achieved with the proposed framework compared to a selection of results previously reported in the literature. The solutions are compared using the following metrics as proposed in Fan and Mumford (2010):

  1. \(d_0\): The percentage of demand satisfied with zero transfers.
  2. \(d_1\): The percentage of demand satisfied with one transfer.
  3. \(d_2\): The percentage of demand satisfied with two transfers.
  4. \(d_{un}\): The percentage of unsatisfied demand.
  5. ATT: Average travel time per passenger [min].
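The four demand shares in this list can be computed from the demand and the number of transfers per origin-destination pair, as in the following sketch. The data layout is hypothetical, and counting trips that are unserved or need more than two transfers as unsatisfied is an assumption of this sketch.

```python
def transfer_shares(demand_by_pair, transfers_by_pair):
    """Percentage of demand served with 0, 1 or 2 transfers, and the unsatisfied rest."""
    total = sum(demand_by_pair.values())
    shares = {"d0": 0.0, "d1": 0.0, "d2": 0.0, "dun": 0.0}
    for pair, demand in demand_by_pair.items():
        transfers = transfers_by_pair.get(pair)  # None marks an unserved pair
        key = f"d{transfers}" if transfers in (0, 1, 2) else "dun"
        shares[key] += 100.0 * demand / total
    return shares


# Hypothetical demand and transfer counts for three OD pairs.
demand = {(1, 2): 60, (1, 3): 30, (2, 3): 10}
transfers = {(1, 2): 0, (1, 3): 1, (2, 3): None}
print(transfer_shares(demand, transfers))
# {'d0': 60.0, 'd1': 30.0, 'd2': 0.0, 'dun': 10.0}
```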

In Table 4 the best configurations are chosen out of the reported results in the literature. The number of lines may differ between studies. The results of this study are shown in the last column. The numerical values reported are averaged values from three complete optimizations. The previous studies diverge mostly in the percentage of unsatisfied demand. This is due to the fact that none of the studies have accounted for the possibility of passengers walking from an unserved stop to a served stop at the beginning or end of their trip. In our proposed framework the passengers can choose to walk to the closest served stop if a bus stop is not directly served by a route. People walking to the closest stop are consequently not considered unserved demand. Only if there is no served bus stop within a threshold distance of the origin stop are these passengers considered unserved. As can be seen in Fig. 9, bus stop 5 is not served directly by any of the bus routes. However, bus stop 4 is considered close enough so that people from bus stop 5 walk to bus stop 4 and board buses from there. Hence 100% of the demand is registered as served. The increase in average travel time can be explained by the walking model. Since stop 5 is not served, people walk to bus stop 4, which results in a long travel time for these passengers. As a direct consequence, the average travel time increases to roughly 14 min per passenger. The post-processing step removes the virtual loop on route 2 in Fig. 9a, which results in a more practical solution (see Fig. 9b). Due to the underlying differences in the passenger assignment and walking model in comparison to other studies, the benchmark results are assumed to be acceptable for the analysis performed in this work.

Table 4 Comparison of results for Mandl’s Network

Fig. 9 Best network solution of the proposed framework on Mandl’s network

Case study analysis

In Fig. 10 the hypervolume convergence is shown for the AB optimization and the conventional bus optimization scenario. After 150 iterations the optimization is terminated. The number of iterations was determined for the following reasons. First, 150 iterations were reported to be sufficient in other studies that solved similar-sized problems (see Zou et al. 2011; Szeto and Jiang 2014). Second, the analysis of the benchmark problem indicated that this is sufficient to achieve a converging solution. Furthermore, the number of function evaluations is two orders of magnitude larger than the number of iterations. Using the parameters from Table 3, the average number of function evaluations for one optimization run can be estimated as \(150 \cdot (0.5 \cdot 30 \cdot 2) \cdot (0.1 \cdot 30) = 13500\), where 150 is the number of iterations and 30 the number of bees; \(0.5 \cdot 30\) is the number of evaluations for each of the employed and onlooker phases assuming a mutation and focus rate of 50%, and \(0.1 \cdot 30\) equals the evaluations in each scout phase assuming \(\epsilon = 0.9\). From Fig. 10 it can be seen that the problem converges step-wise for both the AB and the conventional bus case. The steps can be explained by the removal of bus lines with low passenger load in the candidate solutions after a certain number of iterations. After approximately 90 iterations the hypervolume starts to diverge slightly again, and the noise between computation repetitions increases. Both effects can be explained by the random search characteristics of the MOABC and the line removal procedure. New random candidate solutions are generated after each iteration, and in each iteration approximately 50% of the solutions are mutated or slightly varied, which can lead to an improved or deteriorated solution. Towards the end of the optimization, the probability that a mutated solution is worse than the mean of all bees increases since 50% of these solutions are the best solutions achieved through the previous iterations. Hence the mean is more likely to fluctuate and more likely to increase in later iterations. The overall best solutions are not affected by this phenomenon since the solutions used as the basis for the numerical analysis and conclusions are the overall best from all ensemble runs and iterations.

Fig. 10 Hypervolume convergence for the AB deployment scenario and for conventional bus deployment scenario. The reference point is set to be the origin. The colors are the hypervolume curves for a single framework iteration. The blue line is the mean of all 10 ensemble runs

Fig. 11 Maps (a) and (b) show the computed user-focused design for conventional and autonomous buses respectively, while (c) and (d) show the operator-focused design for conventional and autonomous buses

User-focused and operator-focused design

In contrast to the benchmark analysis, the case study framework is implemented as a MOABC. In the following analysis the differences between user-focused and operator-focused network designs are presented. In Fig. 11 the numerical values for the specific solutions are given. The values belong to one solution each. The solutions are obtained by computing the Pareto front using the concept of non-domination and then sorting these solutions for either minimum user cost or minimum operator cost. In general, the inclusion of walking as an alternative mode of transport has a big impact on the network design and the achieved results. Comparing the total number of people walking with the number of people boarding a bus clearly shows the significance of walking in this case study. Given the spatial size of the case scenario, walking is a valuable option for many passengers and is therefore selected by the majority of travelers. Nevertheless, two distinct network designs are identifiable. The user-focused design (see Fig. 11a and b) is characterized by higher operating costs and higher infrastructure costs compared to the operator-focused design. This is the direct consequence of the higher number of bus stops and the larger network length in the public transport network. The larger network length and the higher number of bus stops lead to better accessibility of the public transport network for passengers, therefore resulting in more boarding passengers and shorter access times per passenger. Additionally, it can be seen that the level of crowding is low, since the difference between recorded in-vehicle time and perceived in-vehicle time is marginal. Interestingly, the in-vehicle time per passenger, waiting time and service frequency per line are similar for both the user-focused and the operator-focused design. The service frequency of 4 veh/h indicates the competitiveness of walking in this case study and additionally the low supply utilization. Furthermore, no transfers could be registered, indicating that passengers take direct trips from their origin to their destination. The solution in Fig. 11b shows a large overlap of routes 1 and 2. Route 1 seemingly goes in a loop from north to south and back north. This network, however, was not dominated by a post-processed solution and is therefore presented as is. The solution in Fig. 11a could be improved through the post-processing step by merging one bus stop and removing several virtual loops.

Table 5 Results for user-focused and operator-focused network design solutions, the values represent the non-dominated solution

The operator-focused network (see Fig. 11c and d) is characterized by fewer, shorter lines and few bus stops. This results in lower infrastructure costs compared to the user-focused design. Additionally, the operator-focused design leads to increased access times and long waiting times due to denied boarding.

Effect of autonomous technology

Figure 12 shows the point clouds of all Pareto optimal solutions. The dominated solutions are not plotted since the projection from a 3D space into multiple 2D spaces would not result in an insightful visualization. Each point represents the objective values of one solution to the network design and frequency setting problem. Figure 12a shows the relation between operator cost and user cost, Fig. 12b shows the relation between infrastructure cost and operator cost, and Fig. 12c shows the relation between infrastructure cost and user cost.

Fig. 12 Solution distribution of the MOABC on the case study. Each point represents a network design solution. The red points show the solutions with conventional buses and the green points are solutions with AB

It can be seen that the operator cost for autonomous buses is much lower than for conventional buses (compare Table 5). This impact is in line with the expectations since the problem formulation implies a reduction of operator cost. Similarly, the infrastructure costs are substantially increased by the introduction of AB. This is mostly due to the additional AB specific infrastructure costs which represent infrastructure enhancements. In general, no big shift towards lower user costs can be seen in the scatter plots. This is unexpected since the reduction of operator costs is assumed to lead to an improved level of service. However, the user cost is not affected much by the deployment of autonomous buses. Additionally, the service frequency at which the vehicles operate on each line is not affected by the vehicle technology, which contradicts a recent analytical study (compare Tirachini and Antoniou 2020). A potential reason is the low passenger load per vehicle in combination with the attractiveness of the alternative walking mode.

To assess the impact of AB on user-focused and operator-focused design, the Pareto optimal solutions with the lowest user cost and the lowest operator cost are studied in detail. Table 5 summarizes the numerical values for each technology and network design focus. The first two columns show the change from the AB user-focused design to the conventional user-focused design. As expected, the infrastructure cost is increased through the deployment of AB and the operator cost is reduced. The higher infrastructure cost is due to the larger network length. This in turn leads to a reduction in average access time per passenger. Additionally, the number of passengers in the public transport system is largely increased: approximately double the number of people take the bus compared to the conventional bus operations. The number of people walking to bus stops is also larger when operating autonomous buses: approximately 42% of the passengers boarding the bus walk to a bus stop first, whereas only approximately 29% do so in the case of conventional bus operations. The overall level of service cannot be significantly improved through the deployment of autonomous buses. The benefits gained by reduced walking times are mainly offset by an increase in denied passenger boardings and additional waiting time per passenger at the bus stop.

Columns three and four in Table 5 show the impact AB has on operator-focused design based on the utilized KPI. Similarly to the user-focused design, the infrastructure costs increase and the operational costs decrease in the case of AB deployment. This is again mainly due to an increase in the total network length and a reduction in operating costs for AB. In alignment with the user-focused design, more passengers board the AB in the operator-focused design and the total walking time per passenger can be reduced. However, the level of crowding and the number of denied passenger boardings are significantly higher in the case of autonomous buses, which in turn leads to a deterioration in the level of service.

It is concluded that while the introduction of AB on line-based PT networks does not improve the level of service across the board, a slight improvement can be identified in the user-focused designs. However, due to a larger network length and shorter walking times, more passengers can be attracted to the public transport system when operating AB. In the presented case study approximately 2-3 more passengers can be expected to board when the buses operate autonomously as opposed to human-driven buses.

Fig. 13 Change of different objectives with increased network complexity, PT networks with autonomous solutions in blue and conventional buses in orange

Impact of additional lines

To study the impact additional lines have on the PT design, the generated solutions have been grouped by the number of lines. Each group represents one line configuration. To obtain representative numerical values for each group, the mean over all non-dominated solutions in the group has been computed. In Fig. 13 the three objectives are shown, while in Fig. 14 different performance indicators are illustrated. Generally, the differences between the optimized conventional PT networks and the optimized PT networks utilizing AB are small. This is especially true for user cost and infrastructure cost in networks consisting of a few bus lines. However, the operator cost shows a clear reduction induced by the deployment of AB. The performance indicators in Fig. 14 also show no large difference between optimized conventional PT networks and optimized AB networks. The main differences can be found in more complex networks, where the autonomous bus networks show (1) an increase in walking times (see Fig. 14a), (2) a reduction in in-vehicle time (see Fig. 14b), and (3) a reduction in waiting time due to denied boarding (see Fig. 14c).
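The grouping step described above, in which non-dominated solutions are grouped by their number of lines and averaged per group, could be sketched as follows. The record layout and all values are assumptions made for illustration; they are not outputs of the framework.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical non-dominated solutions:
# (number_of_lines, user_cost, operator_cost, infrastructure_cost)
non_dominated = [
    (1, 100.0, 20.0, 10.0),
    (1, 110.0, 18.0, 12.0),
    (2, 95.0, 30.0, 20.0),
    (2, 105.0, 34.0, 24.0),
    (3, 90.0, 45.0, 33.0),
]


def mean_objectives_by_line_count(records):
    """Group records by number of lines and average each objective per group."""
    groups = defaultdict(list)
    for n_lines, *objectives in records:
        groups[n_lines].append(objectives)
    # zip(*rows) transposes the rows so that each column is one objective.
    return {
        n_lines: tuple(mean(column) for column in zip(*rows))
        for n_lines, rows in sorted(groups.items())
    }


group_means = mean_objectives_by_line_count(non_dominated)
for n_lines, (user, operator, infra) in group_means.items():
    print(f"{n_lines} line(s): user={user}, operator={operator}, infrastructure={infra}")
```

Computing one mean per line configuration yields a single representative point per group, which is the kind of value plotted against network complexity.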

In Fig. 13 several trends can be observed. First, the infrastructure costs and operator costs increase steadily with increasing network complexity. A higher number of lines results in a longer total network length, more bus stops and a larger vehicle fleet. Second, the user cost decreases slightly with an increasing number of lines up to three lines and increases beyond that. The initial decrease is caused by a reduction in average walking times as more people travel by bus. For more complex networks, this decrease in average walking times is more than offset by longer waiting times, more transfers, slightly longer in-vehicle times and a higher rate of denied passenger boardings. Our findings suggest that public transport networks, operated with AB or conventional buses, that consist of a larger number of lines serving the same geographical unit lead to a lower level of service for the user.

Fig. 14 Change of the different performance indicators with increased network complexity; PT networks with autonomous buses in blue and conventional buses in orange

Increasing the number of lines when operating autonomous buses leads to a similar increase in infrastructure cost as for conventionally operated PT networks. The operator cost is, as expected, lower due to the reduced operating costs when deploying AB. Furthermore, it can be seen that the user cost for more complex AB networks is reduced. This is caused by the reduced number of denied boardings, indicating fewer total boardings and a higher number of people walking. This is supported by a slight increase in average walking time per passenger, indicating that more people decide to walk longer distances. Additionally, fewer transfers are registered for AB-operated networks, indicating more direct trips. Hence, the increase in user cost on complex PT networks is lower for AB operations than for conventional bus operations.

Comparison with real-world design plan

To show the practical applicability of the framework, the network design solutions generated with the proposed framework are compared with the network solution proposed by the local public transport operator (see Fig. 15). This network consists of a single direct line connecting the northernmost point with the southernmost point of the area and is operated with autonomous buses at a service frequency of 4 veh/h.

Fig. 15 Proposed network by the local public transport operator

The numerical values for this network design are given in column five of Table 5. When comparing the operator-focused solutions with the proposed solution, several similarities can be seen. All solutions consist of a single bus line which is operated at 4 veh/h. The main difference lies in the number of bus stops. As a consequence of the higher number of bus stops in Fig. 15, the access time to the bus line is reduced, which leads to a large increase in passengers per line. This in turn results in crowding effects, more denied passenger boardings and a longer average waiting time per passenger. The result is an increased user cost compared to the optimized solutions generated with the proposed framework. One explanation for the higher user cost is the attractiveness of walking in that area. Since the average in-vehicle time is relatively low at approximately 75 sec/pas, the time spent walking the same trip is fairly short and does in fact lead to lower user costs, since the waiting time and denied boarding penalties can be avoided.

Computational aspects

To investigate the impact the chosen parameters have on the results, several parameter studies have been carried out. The sensitivity analysis is performed for parameters in three categories: (i) parameters of the optimization algorithm (number of iterations, number of bees, and number of trials); (ii) simulation parameters (walking distance, number of simulation repetitions); and (iii) parameters of the autonomous bus cost structure. From the results it can be concluded that an increase in the number of iterations or the number of bees does not generate different results. A similar trend is observed for the simulation parameters: the number of simulation repetitions does not significantly impact the results. However, an increase in the walking threshold distance does lead to a reduction in ridership and therefore reduces the overall network length, while a decrease of the threshold has the opposite effect on the network length. Because of the multi-objective problem formulation, the cost parameters for AB do not directly influence the other objective values. A change in them therefore does not lead to different conclusions but rather to different absolute values for the operational costs in the results.

The total computation time of the framework for one vehicle technology is approximately 72 h. This time includes the 5 simulation repetitions per solution and an ensemble of 10 individual optimization runs. The evaluation of one network solution, including within-day learning and five simulation runs, takes approximately 30 sec. The code does not run the ensemble runs in parallel and can therefore be sped up. The hardware used to run the code was an Intel i7-7820HQ CPU @ 2.90 GHz with 16 GB RAM.
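The reported figures allow a rough back-of-the-envelope estimate of the evaluation budget. The sketch below assumes that the full 72 h is spent on solution evaluations, that the work is split evenly across the 10 ensemble runs, and that independent runs could be distributed over workers without overhead; these are simplifying assumptions, not statements from the study.

```python
import math

TOTAL_HOURS = 72             # reported total computation time
SECONDS_PER_EVALUATION = 30  # reported time to evaluate one network solution
ENSEMBLE_RUNS = 10           # reported number of individual optimization runs

total_seconds = TOTAL_HOURS * 3600
evaluations_total = total_seconds // SECONDS_PER_EVALUATION
evaluations_per_run = evaluations_total // ENSEMBLE_RUNS


def ideal_wall_clock_hours(workers: int) -> float:
    """Idealized runtime if whole ensemble runs are distributed over workers."""
    hours_per_run = TOTAL_HOURS / ENSEMBLE_RUNS
    batches = math.ceil(ENSEMBLE_RUNS / workers)  # runs are processed in batches
    return batches * hours_per_run


print(evaluations_total)           # upper bound on evaluations in 72 h
print(evaluations_per_run)         # evaluations per ensemble run
print(ideal_wall_clock_hours(10))  # one worker per ensemble run
```

Under these assumptions, roughly 8,640 evaluations fit into 72 h (about 864 per ensemble run), and running all ten ensemble runs concurrently would bring the idealized wall-clock time down to about 7.2 h.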

Discussion and conclusion

In this work, an AB-specific transit network design and frequency setting (TNDFS) framework is presented. The framework consists of a multi-objective optimization problem which allows computing the impacts AB has on multiple stakeholders. A simulation model with a stochastic, dynamic schedule-based passenger assignment is used to evaluate a given network design and assigns passenger demand to the available bus routes. The optimization problem is solved with a multi-objective artificial bee colony algorithm (MOABC). The presented framework incorporates AB characteristics by constraining network solutions to have connected bus routes, using smaller vehicle capacities, considering changed operator costs for AB and determining approximate infrastructure enhancement costs. Additionally, the number of bus stops per route is a decision variable in the problem formulation, which allows for a more diverse bus route creation. With this framework, potential differences between autonomous bus transport systems and conventional bus transport systems can be investigated. The expected differences originate from the reduced operation costs for autonomous vehicles. Additionally, the study provides insight into the characteristics of user-oriented and operator-oriented network design for fixed-line operations. The proposed framework is applied to a benchmark network, to show its applicability and overall functionality, and to a real-sized case study in Barkarby, Sweden.
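One possible reading of the connectivity constraint, namely that every served bus stop can be reached from every other served stop by riding and transferring between lines, can be checked with a simple graph search. The sketch below is illustrative only: the route representation (ordered lists of stop identifiers), the function name and the stop names are assumptions, and the study's actual constraint handling may differ.

```python
from collections import defaultdict, deque


def is_connected(routes: list[list[str]]) -> bool:
    """Check whether all stops served by the routes form one connected network.

    Each route is an ordered list of stop identifiers; consecutive stops on a
    route are linked, and routes are linked wherever they share a stop.
    """
    adjacency = defaultdict(set)
    for route in routes:
        for stop in route:
            adjacency[stop]  # register the stop even if the route has one stop
        for a, b in zip(route, route[1:]):
            adjacency[a].add(b)
            adjacency[b].add(a)

    stops = set(adjacency)
    if not stops:
        return True  # an empty network is trivially connected

    # Breadth-first search from an arbitrary stop.
    start = next(iter(stops))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == stops


# Two lines sharing stop "s3" form a connected network.
print(is_connected([["s1", "s2", "s3"], ["s3", "s4"]]))
# Two lines without a common stop do not.
print(is_connected([["s1", "s2"], ["s3", "s4"]]))
```

A candidate network failing such a check could be rejected or repaired before evaluation; which of the two the framework does is not specified here.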

In small-scale geographical units, public transport and active modes are the two main travel alternatives. The majority of the demand in our case study is served by direct walking connections from origin to destination. Furthermore, the comparison of the user-oriented and operator-oriented AB network design is in line with expectations. The user-focused network design is characterized by many bus stops and many lines, resulting in a reduction of the average walking time to a bus stop and reduced crowding. In contrast, the operator-oriented network design consists of only a few lines. On the one hand, this results in a smaller required fleet and shorter cycle times; on the other hand, the few lines increase the average waiting times considerably, which reduces the level of service for the user.

In the presented case study, the deployment of autonomous buses on fixed-line public transport networks leads to a reduction of operator costs and an increase in infrastructure cost. This is mainly due to the reduction in operating costs and an increase in total network length. As a direct consequence of the longer PT network, the number of passengers boarding a bus is 2-3 times higher in the case of AB deployment compared to conventional PT systems. Consequently, crowding increases and in-vehicle times become longer. The user cost cannot be reduced in all cases. Especially for smaller, simple networks with a few lines, the differences between optimized PT networks using conventional buses and AB diminish. For simple networks (e.g. one or two bus lines), the reduction in walking times is counteracted mainly by longer in-vehicle times and a higher number of denied passenger boardings. However, for more complex networks and user-focused design, autonomous technology leads to an improved level of service due to a reduction in the number of denied boardings. Additionally, operating AB on more complex networks yields shorter waiting times due to denied boarding and shorter in-vehicle times. In none of the analyzed network design solutions is an increase in service frequency due to the operation of AB observed.

The limitations of this work lie mainly in the applied optimization method. There is a compromise between the level of detail of the generated solutions and the number of solution evaluations. Due to the complexity of the problem formulation and the nature of the network design problem, it is not possible to find the globally optimal solutions for all three objective values in a feasible amount of time. Therefore, the quality of the computed solutions depends on the computation time reserved and on the optimization parameters. The optimization parameters are chosen to match the parameter settings proposed for similar problems in the literature. It is possible to further improve the proposed theoretical solutions by manually or automatically adjusting the generated solutions in a post-processing step. The big advantage of the proposed framework is its applicability to large and complex problems. The generated network designs can then be used as initial solutions for manual post-processing by transportation experts. Applying meta-heuristic algorithms creates the challenge of finding a suitable problem formulation. The network solutions have to be encoded in a way that allows the algorithm to find solutions efficiently. Therefore, the solution quality of meta-heuristic optimization algorithms depends on the search space as defined in the problem formulation.

A second limitation can be seen in the isolated investigation of the case study. It is possible that the existing PT lines in the area affect the network design and that the presented final solutions would therefore not be the best to apply in the real world. For future work, it would therefore be advisable to integrate the existing PT infrastructure into the framework to create a network extension rather than an isolated network. However, the isolated investigation of the case study area does not affect the general results and trends presented in this work.

Additionally, due to the sequential use of optimization and simulation and the high number of evaluations required for the ABC algorithm to yield trustworthy results, the approach does not scale to city-wide problems. In future work, the computation time can be considerably reduced by utilizing more powerful computers and parallel computing for the different ensemble runs.

For the implementation of AB in fixed-line PT systems, it can be said that they are most promising in underdeveloped areas where the current PT accessibility is low. In the Barkarby case study, an improved level of service can be attained by deploying AB; however, a larger improvement in the level of service can be achieved when implementing an operator-focused PT network. A possible extension of this study is the integration of sustainability considerations (e.g. veh-km) for AB operations into the optimization formulation. Due to the reduction of operational costs, vehicles might be operated for longer times and over longer distances, which in turn has an impact on the total exhaust emissions. Another future research direction is the consideration of electrification in the model and the removal of the connectivity constraint to study feeder operations or to improve service niches using a series of local feeder networks.