1 Introduction

The modern transportation system is composed of complex, large-scale interactions that are generated as travellers engage with their dynamic environment to go from one location to another. The dynamic environment of the modern transport system comprises transport infrastructure, modes, services, and technologies. The decisions travellers make regarding their activities and trips in this dynamic environment are also governed by their spatial, societal and economic characteristics. The multi-dimensionality of travel decisions and the underlying heterogeneity make travel behaviour difficult to predict when there are changes in the transport system.

Researchers, engineers, and planners in transport rely on transportation forecasting models to predict the performance of the transport system in alternate future scenarios and evaluate the potential effectiveness of new plans and policies. Over the last few decades, two distinct approaches for travel demand modelling have emerged: trip-based and activity-based approaches.

The trip-based approach, frequently referred to as the traditional four-step travel demand model, considers aggregate travel choices in four steps: trip generation, trip distribution, modal split, and route assignment [1]. Even though this model considers interactions between the steps at the simulation stage, in the majority of cases the models used in each step are estimated as stand-alone models (e.g. separate models for trip generation, attraction, mode choice and traffic assignment). Hence, they often struggle to predict certain situations such as derived travel or demand [2]. In addition, the sequential modelling used in the four-step approach may not adequately capture individual decision-making processes, as it concentrates on aggregate travel behaviour. To address these shortcomings, a coherent framework that can simulate the four stages simultaneously at the disaggregate level is required. This motivated the development and increased use of activity-based models [3].

Activity-based models predict activities and associated travel choices by taking into account time and space constraints as well as individual characteristics. Using a sequence of activities and corresponding trips to connect those activities, individuals are assumed to maximise their activity utility by choosing the maximum utility among trips [4]. Even though activity-based models can serve as an alternative to four-step models, they also have several limitations [5]. Current activity-based models simulate typical activity-travel patterns in a day. Certain activities may not be completed within a simulation run, for example where working hours extend beyond the end of the run. As a result, those activities may be removed from the schedule. Further, the integration of demand generation with traffic assignment on the network still needs a robust solution. The current approach is incapable of making changes in departure time or more significant activity rescheduling decisions. A more fundamental approach that simulates frequent exchanges between the traffic assignment algorithm and the activity-based model of travel demand may be a technical solution to this incompatibility.

In a parallel stream, over the last twenty years, agent-based modelling has emerged as a method to replicate the complexity of social systems. Agent-based models (ABM) represent individuals as autonomous agents with independent characteristics and behavioural rules guiding their decisions and actions. Agents generally ‘act’ within a dynamic environment, allowing for the analysis of their interactions both with other individuals (in relation to proximity or connectivity) and with the environment in which they are placed [6]. Agents can learn, adapt, and hold different perceptions of an environment. The flexibility of the ABM framework means applications are broad, as agents can represent any sort of entity (e.g. person, car, road, city). Within transportation domains, ABM enables the creation of complex, dynamic, and stochastic transport systems, typically consisting of individual traveller or vehicular agents that have heterogeneous characteristics and behaviours (e.g. perceptions, needs, capabilities) and are adaptive to changes in circumstances and/or the environment [7, 8]. The ABM approach is often conflated with microscopic traffic simulation and population microsimulation, and while it shares some characteristics with these approaches, its notions of agent learning, adaptation and behavioural heterogeneity set it apart. Furthermore, ABM places few constraints on how an agent or environment is represented, and as such there is little to no dependence on specific processes or software packages for its implementation, allowing its application in a very broad set of contexts.

Agent-based models originated in the area of computing, where agents represent software entities that run independently and interact with other agents in an environment [9]. Alongside ABM, Cellular Automata (CA) models emerged, a simplified grid-based simulation approach, similar to ABM. CA models were introduced to transportation research as a novel method for modelling traffic flow in the 1990s [10], representing the first exploration of autonomous agents in the transportation domain. Expansions on this approach followed, and the first mentions of agent-based modelling in mobility research occurred in the 2000s [11, 12]. The expansions include the development of intelligent traffic control systems [13] and the construction of decision support systems (DSS) which allow for the provision of recommendations of efficient route allocation across time and space for the travellers or other agents in the domain of road traffic management [14, 15].

ABMs have since been applied to a diverse range of problems in transportation systems at the micro, meso, and macro levels, which represent the interactions between agents, groupings of agents with similar attributes, and large-scale structures of agents in transport systems, respectively. At the microscopic scale, they have been used to simulate the behavioural aspects of pedestrian movement [16] and pedestrians' crossing behaviour in front of an automated vehicle (AV) [17]. Agents who share common properties (such as location or destination) can be aggregated to form a higher, mesoscopic level. At this level, ABM has been developed to simulate the behaviour of drivers in a spatially explicit environment and is capable of capturing the characteristics of a large group of parking agents [18]. Meanwhile, ABM has been used to simulate an entire city or region at the macroscale, such as Paris [19] and Singapore [20]. According to these studies, ABM was employed because it can deal with the uncertainty of a dynamic environment in which the components of a modern transportation system, including innovative transport technologies, interact in complex ways.

As the modelling of a transportation system is well suited to the ABM approach [21], many ABM tools have been developed within the past decade to address the complexities of modern transportation systems. Most ABM frameworks consist of several modules that can be integrated or used stand-alone, for instance SimMobility [22], TRANSIMS [23], and AnyLogic [24]. There are also free and open-source frameworks, such as MATSim [25], GAMA [26], and NetLogo [27], which enable users to develop or replace any module with custom implementations to test particular aspects of their project. Furthermore, technological advancements in ICT (information and communication technology) have resulted in the development of agent-based transport modelling and analysis using open and publicly available data [19].

Based on the wide-ranging and growing literature above, there have been changes in the way agent-based models are applied in the field of transportation in recent years. The introduction of new techniques due to the development of computational capabilities, as well as the emergence of new transport modes as a result of technological innovations, pose numerous challenges that must be addressed. Though some studies indicated that ABMs have also been utilised at the national level, this has only occurred in countries with prominent Activity-travel Diary Survey data that can be scaled up to the national level, such as Singapore and Switzerland. Because of the lack of comprehensive research undertaken at the national level, the scope of this study is limited to a review of ABM studies conducted at the urban scale, where this type of model is frequently employed. Furthermore, transportation in urban areas is extremely complicated due to the variety of modes of transportation used, the number of origins and destinations, and the variety and volume of traffic. Yet, to date, very few studies have reviewed the contribution of agent-based models within urban transportation fields using a wide-ranging literature approach. Previous research examining the application of agent-based modelling in transportation was limited to overviews of ABMs for autonomous vehicles in urban mobility and logistics [28], transport simulation and analysis [29], and the simulation of e-scooter sharing services [30].

This study, however, contributes to the body of literature by examining how researchers have employed agent-based models in the field of urban transport research using a bibliometric technique. By examining all of the publications related to a given topic or field, bibliometric analysis offers a promising approach for identifying the most important research or authors, as well as their relationships with one another [31, 32]. This enables researchers to investigate the current development of agent-based models while also shedding light on the emerging areas in the urban transport domain. Further, a detailed content analysis using keyword clustering has been performed to identify the key trends and potential research gaps in the existing literature. This is expected to be useful for transport researchers and to serve as a first step in developing ideas for new research, especially for those who are tackling important issues in urban transport research.

The remainder of this paper is structured as follows. Section 2 describes the methodology used in this study, namely bibliometric and content analysis techniques. Section 3 presents the results and discussion of the application of agent-based models in the urban transport domain. Following this, the challenges faced in agent-based modelling are presented in Sect. 4. This section also provides some perspectives for future research on agent-based models in the urban transport field. Finally, Sect. 5 summarises the findings and recommendations from this study.

2 Methods

The three key methodological steps used in this study are data collection, data analysis and visualisation. Each procedure contains several steps. Data were collected from the Scopus database and then refined by removing irrelevant sources using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [33]. Bibliometric approaches were then applied to the refined data to determine the distribution of the publications across years and types of documents, and to perform co-word occurrence analysis. Co-word occurrence analysis allows for the exploration of past and present research trends and helps the analyst to uncover research gaps. Results from the bibliometric analysis were analysed further in a content analysis, where each selected paper’s content was studied in depth to uncover the challenges and opportunities of agent-based models in urban transport research.

2.1 Selection criteria

The literature data used in this research were downloaded from the Scopus indexed database. Scopus is used because it is an academic database that provides a broader coverage of scientific resource collections and has better metrics than other academic databases [34]. In the initial keyword search in the academic database, the time span was outlined as “all years” and the type of documents was expressed as “all types”.

It should be noted that whilst the terms agent-based and multi-agent are often used interchangeably, Railsback and Grimm [35] claimed that the second term is a branch of the first that originated from computer science. This implies that the term agent-based should capture an article’s content that is also covered by the term multi-agent. Therefore, the search for relevant literature on agent-based models in urban transport involved sorting through a large number of articles using the main topic keywords agent-based and “agent based”, followed by the 52 predetermined keywords shown in Table 1. The keywords were selected to capture the extensive previous literature on agent-based models and matched three criteria: (1) emerging mobility services in urban areas, (2) simulation framework and policy related to urban transport, and (3) agent-based model software toolkits for both general and transport analysis. These criteria were determined based on the main topics used in previous extensive studies on agent-based models in urban transport research. Moreover, criteria related to agent-based modelling platforms for both general and transport purposes were considered to capture the implementation of various agent-based toolkits in this particular field. Afterwards, the PRISMA guidelines [33] were used to identify, select, assess, and refine the final dataset in this study.

Table 1 Criteria for literature search

There were 1144 documents identified from the Scopus database through keyword searches. The refinement consisted of eliminating unrelated keywords and keeping only English-language papers (769 exclusions), and reading the abstract of each article (66 exclusions). After refinement, 309 documents were included in the final dataset. Bibliometric analysis and content analysis were then employed to analyse the challenges and opportunities of agent-based models in urban transportation research.
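The screening counts reported above can be checked with a short calculation. The following is only an arithmetic sketch: the variable names are illustrative, and the intermediate figure (records remaining before abstract screening) is derived here rather than reported in the text.

```python
# Counts taken from the paragraph above; variable names are illustrative.
identified = 1144                 # records returned by the Scopus keyword search
excluded_keyword_language = 769   # unrelated keywords or non-English papers
excluded_after_abstract = 66      # removed after reading the abstract

# Derived value (not stated in the text): records left before abstract screening.
remaining_before_abstract = identified - excluded_keyword_language
included = remaining_before_abstract - excluded_after_abstract

print(f"Before abstract screening: {remaining_before_abstract}")
print(f"Included in final dataset: {included}")
```

The result agrees with the 309 documents reported for the final dataset.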

2.2 Bibliometric analysis

Bibliometric analysis is a comprehensive method for examining and analysing large amounts of scientific data [36]. This method is widely used to understand the key developments of a particular field while also shedding light on its emerging areas. The technique can be used to perform quantitative and qualitative analyses, depending on the scope and volume of the dataset.

There are two categories of bibliometric analysis: performance analysis and science mapping [37]. Performance analysis is the process of evaluating the contributions of research constituents to a particular field, while science mapping is the process of examining the relationships that exist between research components. Science mapping is used to visually observe the distribution of research and the direction of research trends and development. The techniques used in science mapping include citation analysis, co-citation analysis, co-word analysis, co-authorship analysis, and bibliographic coupling. In this study, co-word analysis, or keyword co-occurrence networks, was applied to build the science map of agent-based modelling in urban transport studies. The keywords in the network and their frequency of occurrence can express the themes of the literature. A clustering analysis of these frequently occurring keywords can shed light on the knowledge structure of the research field as well as highlight significant areas of interest.

The bibliometric tools VOSviewer and bibliometrix are used in this study. VOSviewer is a bibliometric tool developed by van Eck and Waltman [38] that is used to visualise science maps and display cluster analysis results. In the network generated by VOSviewer, items are expressed as nodes and links. The size of a node, such as an author, keyword, country, or affiliation, is proportional to its weight of appearance. Links indicate the association between nodes, and nodes that are close to one another tend to appear together, whereas nodes that are distant from one another rarely or never occur together. The other tool, bibliometrix, is an open-source tool for executing a comprehensive bibliometric analysis [39]. This tool has been applied for the extraction, analysis, and visualisation of bibliographical information in various displays, such as three-field plots, word clouds, and treemaps.

2.3 Content analysis

Content analysis is the detailed examination of the content of the selected literature. Publications were grouped based on research clusters within the field of agent-based models in urban transport. Visualisation outputs from VOSviewer were used to identify the research clusters within this field. The publications in each research cluster were then explored in detail through content analysis.

3 Results and discussion

In this section, the results of this study are presented and discussed. First, the historical trend of publications and the modelling tools used for agent-based models are analysed. Next, the geographical distribution of the case studies used within agent-based modelling studies in urban transport is illustrated. Then, the results of the keyword co-occurrence analysis, which explores the trends and current state of the art of agent-based models in urban transport, are presented. Finally, the content analysis based on research clusters is presented, which reveals the knowledge domain within each research cluster.

3.1 Overview of the article outputs and modelling tools used

The final dataset, consisting of 309 documents, represents the largest dataset for this type of analysis on agent-based models. Four document types are found among the 309 publications, which span the period 2006–2022. Although there may be agent-based model-related publications prior to 2006, these documents do not appear in the search results because their metadata (i.e., title, abstract, or keywords) do not contain the keyword combinations used.

The most frequent document type is conference paper, accounting for 151 publications (49%), followed by journal articles, accounting for 147 outputs (48%). Book chapters account for 8 publications (3%) and review articles account for 3 documents (1%). The yearly output of articles is presented in Fig. 1.
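As a quick consistency check on these figures, the shares can be recomputed from the counts. This is a minimal sketch using only the numbers stated above; the dictionary keys are illustrative labels.

```python
# Document counts by type, as reported in the paragraph above.
counts = {
    "conference paper": 151,
    "journal article": 147,
    "book chapter": 8,
    "review article": 3,
}

total = sum(counts.values())
# Round each share to the nearest whole percent, as in the text.
shares = {doc_type: round(100 * n / total) for doc_type, n in counts.items()}

for doc_type, share in shares.items():
    print(f"{doc_type}: {counts[doc_type]} ({share}%)")
print(f"total documents: {total}, sum of rounded shares: {sum(shares.values())}%")
```

The counts add up to the 309 documents in the dataset, while the rounded percentages add up to 101% because each share is rounded independently.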

Fig. 1 Dynamics of the top 10 agent-based modelling tools used and yearly output of articles within the period of study

Based on Fig. 1, there was no significant increase in the number of publications in the first few years from 2006. In this period, few agent-based simulation frameworks had the performance required to run real-time simulations covering the complete time horizon of the decision makers or agents. After this period, a steady rise can be observed from 2010 to 2014. The growth in publications then becomes exponential from 2015 to date. This is partially explained by the fact that numerous agent-based modelling tools were developed alongside significant increases in computing performance during the past decade, meaning that much more realistic simulations could be run in real time. Additionally, most of these simulation toolkits are fully open-source, allowing researchers to more easily create extensions that improve key features and solve particular problems. The use of the ten most frequently used agent-based modelling tools in the 309 documents is also shown in Fig. 1.

Each line in Fig. 1 represents the cumulative occurrences of one of the top ten agent-based modelling tools used in the context of urban transport simulation. ABM frameworks that have specific names (e.g., SimMobility, MATSim, TAPAS) have been identified separately from the programming languages used. The results reveal that MATSim [25] is the most popular agent-based simulation framework used in this context. MATSim (Multi-Agent Transport Simulation) can simulate large-scale agent-based models using a stochastic, co-evolutionary algorithm in which each agent continuously searches for better travel plans until reaching its maximum utility, which enables the user to develop various urban mobility scenarios. Examples include measuring the presence of emerging modes of transport: autonomous mobility-on-demand in Zurich [40]; shared mobility in San Francisco [41]; electric vehicles in Berlin [42]; and demand responsive transport in Michigan [43]. Furthermore, MATSim can be integrated with other platforms or frameworks, such as SILO [44], FEATHERS [45], EQASIM [46], BEAM [47], and mobiTopp [48], in order to enhance the existing framework or overcome some of its limitations (e.g. microscopic land-use model, multiple mode choice specifications, the modelling of sequences of activities and choices of location).
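The idea of agents repeatedly revising their plans in response to the conditions created by all other agents can be illustrated with a toy example. The sketch below is not MATSim and does not use its API, scoring function, or plan-selection rules; it assumes a hypothetical two-route network, a BPR-style congestion curve, and a fixed share of agents who switch to the currently faster route in each iteration. All names and parameter values are illustrative assumptions.

```python
import random


def travel_time(free_flow, capacity, load):
    """BPR-style congestion curve: time rises steeply as load nears capacity."""
    return free_flow * (1.0 + 0.15 * (load / capacity) ** 4)


def run_replanning(n_agents=1000, iterations=200, replan_share=0.1, seed=0):
    """Toy two-route replanning loop; returns final loads and travel times."""
    rng = random.Random(seed)
    # Hypothetical routes: (free-flow minutes, capacity in vehicles).
    routes = {"A": (10.0, 400), "B": (15.0, 600)}
    plans = ["A"] * n_agents  # every agent starts on the nominally faster route

    def evaluate():
        loads = {r: plans.count(r) for r in routes}
        times = {r: travel_time(ff, cap, loads[r]) for r, (ff, cap) in routes.items()}
        return loads, times

    for _ in range(iterations):
        _, times = evaluate()  # "execute" all current plans and score the routes
        best = min(times, key=times.get)
        for i in range(n_agents):
            # Only a share of agents may revise their plan in each iteration.
            if rng.random() < replan_share:
                plans[i] = best
    return evaluate()


loads, times = run_replanning()
print(loads, {r: round(t, 1) for r, t in times.items()})
```

Because only a fraction of agents replan per iteration, the route loads drift towards a state in which neither route is much faster than the other, which is the intuition behind iterating until agents can no longer improve their plans. Real frameworks score complete daily plans rather than a single route choice.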

The second most frequently used agent-based tool is NetLogo, a free, open-source agent-based programming environment implemented in Scala and Java [27]. NetLogo has been widely used for general and transport-specific studies due to its flexibility, which allows users to build and modify the model, perform multilevel modelling by connecting models together, import both raster and vector data, and explore the interface elements. Previous research employed NetLogo to explore the impact of behavioural parameters on modal shift in public transport in Paris [49] and to measure the effect of demand responsive shared transport on taxi services in Ragusa, Italy [50]. Other frequently applied agent-based frameworks are SimMobility [22] and AnyLogic [24]. SimMobility is a fully modular activity-based simulation platform that allows the user to utilise distinct modules by timeframe, i.e. short-term (traffic simulation), mid-term (travel demand), and long-term (land use). It was used in a Singapore study measuring the effects of Automated Mobility-on-Demand on accessibility and residential relocation [51]. AnyLogic is a multimethod simulation modelling tool that supports three major simulation modelling methodologies: system dynamics, discrete-event, and agent-based modelling. This platform features various visual modelling languages and industry-specific toolkits. Applications of this modelling tool include investigating the use of Automated Last-Mile Transport (ALMT) for train trips in Delft, Netherlands [52].

Previous studies have also proposed their own frameworks to analyse agent-based models in urban transport contexts. These include applications of non-specialised or non-commercial agent-based tools. Aschwanden et al. [53] used Esri CityEngine to generate a 3D city model, and agents were then created using Massive Prime. This approach allows the user to analyse, predict, and quantify traffic fluctuations over time, as well as to define the amount of individual traffic, public transport, and pedestrians in each area and on each link or street of a city. Another example is a study by Hyland and Mahmassani [54], which utilised an agent-based simulation framework in Python to model a dynamic system of autonomous vehicles (AV) and compared assignment strategies for a shared-use AV mobility service (SAMS). Additionally, MATLAB and C++ are widely used for agent-based modelling in the context of urban transportation due to their adaptability, which allows users to construct computational codes utilising a large collection of built-in algorithms.

These findings show that many studies have implemented agent-based models on various platforms and frameworks to achieve numerous research purposes and solve complex problems. These approaches differ in programming language, primary application domain, scalability, strengths and shortcomings. Some platforms, such as SUMO [55] and AIMSUN [56], are specifically designed for modelling microscopic traffic flow dynamics, while others, such as TRANSIMS [23], SimMobility [22], and POLARIS [57], are designed for mesoscopic or large-scale simulation, which leads to longer running times. The list of agent-based modelling tools and their additional frameworks, including detailed specifications such as aim, language used, and key features, can be seen in the Appendix.

3.2 Geographical distributions

This research took into consideration the country where the case study has been undertaken, which may be different from the country of authors’ affiliations (Fig. 2).

Fig. 2 Case studies of agent-based models in urban transport by country

The geographical distribution of papers, based on case studies of agent-based models in urban transport, is concentrated in developed countries such as Germany, the US, Switzerland, and Singapore. In contrast, models are rarely applied in the context of the Global South, where countries have different travel behaviour characteristics and may require different approaches.

Germany is the country in which the largest number of case studies have been conducted. This is in line with the finding from the analysis of cumulative occurrences of agent-based modelling tools, which revealed that MATSim, a framework developed continuously at TU Berlin, is the most frequently used tool for agent-based transportation modelling. This relationship can be seen in Fig. 3.

Fig. 3 The relationship among agent-based framework, case studies, and authors

The left-most column represents the agent-based modelling tools, the middle column displays the countries where the case studies were completed, and the right-most column shows the top authors in the field. The height of the box indicates the number of publications, and thicker line connections imply that a greater volume of work or information is produced.

3.3 Co-occurrence analysis and key research clusters

Further, the keyword co-occurrence network map was analysed to reflect the research hotspots. The words in the co-occurrence network map were derived from textual fields (e.g., title, abstract, and author’s keywords) in a bibliographic collection [38]. Co-occurrence analysis assumes that words that frequently occur together reflect a thematic relationship; such words are shown in the same colour and form a cluster [37]. The co-occurrence network map was created with the VOSviewer software and is shown in Fig. 4.

Fig. 4 Keyword co-occurrence network visualisation

Co-occurrence analysis, which identifies the major categories and their interrelationships, was used to detect the disciplinary distribution of agent-based modelling in urban transport research. The size of the nodes and words in Fig. 4 corresponds to the weights of the nodes in the graph: the larger the weight, the larger the node and its label. The distance between two nodes indicates how strongly they are related, and a closer distance usually indicates a stronger relationship. A line drawn between two keywords indicates that they have appeared together in the same document. The thicker the line, the more frequently the two keywords co-occur.

By analysing the co-occurrence of frequent terms, the research hotspots of agent-based modelling in urban transport research were determined. The minimum number of occurrences for a keyword was set to 3. Of the 685 keywords associated with agent-based models that were extracted, 47 met this threshold. The keyword “agent-based modelling” appears the most frequently. Other keywords that appear frequently include “transport modelling”, “travel demand”, “demand responsive transport”, “public transport”, and “electric vehicles”. Since this study applied keyword co-occurrence analysis, keywords that are strongly related to one another appear in the same cluster. This process is thus not subject to analyst bias, though it may result in some counterintuitive clustering. For example, the results indicate that “modal shift” is a part of cluster 8 “public transport”, which means that the majority of the papers in that cluster include both “modal shift” and “public transport” as keywords and may focus on how to move people to public transportation, such as the papers by Rahman et al. [58] and Barber et al. [49].
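The thresholding and link-counting steps described above can be sketched in a few lines. This is a simplified illustration of keyword co-occurrence counting, not VOSviewer's implementation; the function name, the toy keyword lists, and the choice to count each keyword at most once per document are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(docs, min_occurrences=3):
    """Count keyword occurrences, keep frequent keywords, and count pair links."""
    # Each keyword is counted at most once per document.
    occurrences = Counter(kw for kws in docs for kw in set(kws))
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}
    links = Counter()
    for kws in docs:
        present = sorted(set(kws) & kept)
        # Every pair of retained keywords in the same document adds one link.
        for a, b in combinations(present, 2):
            links[(a, b)] += 1
    return occurrences, kept, links


# Hypothetical keyword lists for four documents (illustrative only).
docs = [
    ["agent-based modelling", "public transport", "modal shift"],
    ["agent-based modelling", "public transport", "modal shift"],
    ["agent-based modelling", "public transport", "electric vehicles"],
    ["agent-based modelling", "modal shift"],
]

occurrences, kept, links = keyword_cooccurrence(docs, min_occurrences=3)
print(sorted(kept))
for pair, strength in links.most_common():
    print(pair, strength)
```

In this toy input, "electric vehicles" occurs only once and is dropped by the threshold, while the remaining pair counts play the role of link strengths that a clustering step would then operate on.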

Furthermore, the trend of the research in this field over time can be identified by exploring Fig. 5. As shown in Fig. 5, dynamic traffic assignment and transport measures related to congestion pricing in agent-based models are the dominant categories from 2016 to 2017, but they became less influential in the 2020s. Emerging transport technology trends such as ride sharing, demand responsive transport, and electrification appeared in the period 2017–2019 and received more attention from researchers in the 2020s to date.

Fig. 5 Keyword co-occurrence network based on trend of publication date

On the basis of the network in Fig. 4, related terms shown in the same colour were grouped into clusters. The clusters were named based on the keywords sharing each colour: Cluster 1 (red): General Transport Modelling, Cluster 2 (green): Travel Behaviour, Cluster 3 (blue): Emerging Transport Modes, Cluster 4 (yellow): Transport Policy, Cluster 5 (purple): Urban Logistic, Cluster 6 (cyan): Travel Demand, Cluster 7 (orange): Parking, Cluster 8 (brown): Public Transport, and Cluster 9 (pink): Shared Autonomous Taxi. The clusters are analysed next in turn by considering prominent examples in the literature.

3.4 Content analysis based on research cluster

The research clusters represent studies that use a diverse set of research approaches and problem interpretations. In this subsection, the nine clusters identified are outlined. Cluster 1 (General Transport Modelling) mainly consists of publications developing theory and conceptual work at different spatial scales of agent-based modelling in urban transport, including microscopic and macroscopic simulation. Applications at the microscopic level include the movement of pedestrians or the movement of cars on the road network. For example, Fujii et al. [59] developed a new framework for simulating mixed traffic composed of pedestrians, cars, and trams, which may be used to support considerations concerning road management, signal control, and public transportation. The pedestrian agents can walk freely, avoid collisions, stop momentarily, pass other pedestrians, and move in the same way as car and tram agents. The car agents can plan their routes and determine acceleration. The tram agents are based on the car agents, but without the functions for changing lanes or considering the best route. Moreover, large-scale simulation, which involves the modelling of corridor-level and sub-area transportation operations and planning applications, fits in this cluster [60].

In cluster 2 (Travel Behaviour) the links between data-driven simulation, dynamic traffic assignment, transport planning, and travel behaviour were first beginning to be recognised. For example, a high-performance data-driven agent-based modelling framework has been employed to simulate the uptake of active mode choice during commuting such as walking and bicycling in New York City [61]. Through a GIS-enabled database for the City of New York, the ABM model explicitly incorporated walking and cycling network data with pedestrian and bike accident data. This study pointed out that data-driven simulation is the most proper way to leverage the ABM approach for a close analysis of mode choice as it allows for a more realistic simulation of the environment. Moreover, smart card data can be utilised as an input for analysing travel behaviour concerning transit users in a large-scale activity based public transport simulation [62]. This work demonstrated that smart card data can generate microsimulation travel demand models effectively by improving the statistical analysis and utilising advanced data mining approaches. An effort to leverage the mode decisions has been done by bridging discrete mode choice models and agent-based simulation [63], where it was found that the convergence speed of the simulation may significantly increase by implementing a discrete mode choice model in the ABM compared to the baseline model. The baseline model is in this case the existing model used by ABM platforms when selecting mode for each trip; for instance, ABM tool such as MATSim use a co-evolutionary algorithm to reach an equilibrium state of the system and allow each agent to select different modes for a trip based on utility value from the previous agent’s plan [25]. 
Additionally, incorporating dynamic traffic assignment into agent-based travel behaviour models can provide better results when evaluating the impact of land development on transportation infrastructure, compared to traditional approaches such as static modelling [64]. It provides a complex yet practical method for analysing the effect of a single land development project or a series of projects on driver behaviour, as well as on travel demand patterns and time-dependent traffic conditions. Furthermore, agent-based simulation has been developed to examine the impacts of transportation development plans on modal shifts and residential location choice [65].
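
The idea of a discrete mode choice step inside an agent-based simulation, as discussed above for [63], can be illustrated with a minimal multinomial logit sketch. This is not the implementation used in [63] or in MATSim; the function names and the utility values are hypothetical and serve only to show how choice probabilities are derived from per-mode utilities.

```python
import math
import random


def mode_probabilities(utilities):
    """Multinomial logit: P(m) = exp(V_m) / sum_k exp(V_k)."""
    v_max = max(utilities.values())  # subtract the maximum for numerical stability
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


def choose_mode(utilities, rng):
    """Draw one mode for a trip according to the logit probabilities."""
    probs = mode_probabilities(utilities)
    modes = list(probs)
    weights = [probs[m] for m in modes]
    return rng.choices(modes, weights=weights, k=1)[0]


# Hypothetical systematic utilities for one agent's trip (illustrative only).
trip_utilities = {"car": -1.2, "pt": -1.5, "bike": -2.0, "walk": -3.1}

rng = random.Random(42)  # fixed seed so the draw is repeatable
print(mode_probabilities(trip_utilities))
print(choose_mode(trip_utilities, rng))
```

In such a sketch each agent samples a mode directly from the choice model rather than discovering good modes through repeated trial and scoring, which is one intuitive reason a choice model might reduce the number of iterations needed.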

Research then focused more on how agent-based modelling has been adapted for emerging transport modes, which also relates to the application of technologies that make the traffic system more complex. Clusters 3 and 9 are related, as both are characterised by the performance of emerging transport modes. Cluster 3 (Emerging Transport Modes) contains studies on the impact of emerging transport modes in relation to travel behaviour and travel demand. Given the nature of public transport data and the readiness of transportation infrastructure technology, the majority of publications in cluster 3 are aggregated at the city level. The key problems studied include expected capacity gains and increases in vehicle kilometres travelled for shared autonomous vehicles [66], autonomous vehicle fleet sizes [67], the travel and environmental implications [68], travellers’ behaviour or acceptance of emerging transport modes as well as their interactions with a complex transport environment [69], and competition between existing and emerging modes [50]. Cluster 9 (Shared Autonomous Taxi), on the other hand, is attracting growing attention within emerging transport modes. Although it falls within the broad umbrella of ‘emerging modes’, it has been assigned to a separate cluster due to the high number of papers on the topic. Sharing autonomous vehicles will allow people to travel without the costs and responsibility of vehicle ownership. Consequently, taxi passengers will likely be the first users of shared autonomous vehicles [70]. Problems analysed include comparing the potential benefits and drawbacks of ride sharing in both traditional taxis and shared autonomous taxis [71], the impact of introducing autonomous taxis on travel demand [72], and commuters’ departure times [73].

The focus in cluster 4 (Transport Policy) shifts to the analysis of transport management policy using agent-based modelling approaches. For example, road pricing is widely regarded as an effective management strategy for reducing traffic congestion on transportation networks, and various road pricing schemes have been developed using agent-based simulators. A combination of a macroscopic fundamental diagram and an agent-based traffic model can replicate the heterogeneity and complexity of traveller preferences when analysing the impact of a dynamic cordon pricing scheme [74]. Further, a time-dependent area-based pricing scheme for congested multimodal urban networks has been developed that also adds incentive programmes to improve public transport services and encourage modal shift [75]. Pricing schemes can also be applied to the specific service area of demand responsive transit (DRT) to encourage modal shift from car to DRT [76]. Cluster 7 (Parking) is closely related to transport policy. A parking choice model can be implemented in an existing agent-based traffic simulation [77]. This model provides input to the traffic simulation, allowing the simulation to respond to spatial variations in parking demand and supply. Another application of agent-based simulation in parking examines the role of ridesharing in reducing the burden of high parking demand in urban centres [78]. It has also been demonstrated that parking pricing policies significantly affect the probability that a traveller would send their autonomous vehicle back home instead of parking it at a parking lot [79].

Cluster 5 (Urban Logistics) moves on from private and public transport modes to look at the role of urban freight transport, especially in improving the quality of the urban environment and profit margins in the supply chain [80]. The papers in this cluster mostly use agent-based frameworks for analysing urban logistics, as this method can be used to assess the interactions between agents. Traditional approaches are inadequate for evaluating such relations and fail to consider the heterogeneous objectives of urban logistics agents. The development of agent-based frameworks in city logistics also allows for the implementation of fully disaggregated simulations of commodity contracts, logistics and vehicle operation planning, parking decisions, and the electrification of urban freight transport [81].

Cluster 6 (Travel Demand) focuses on how to generate synthetic populations of travellers and their detailed travel demand as a basis for agent-based transport simulations. Unlike conventional transport models, agent-based transport modelling requires more detailed synthetic populations because activity chains are needed. However, such output is rarely reproducible because it relies on proprietary data and tools. Thus, to stimulate reproducible agent-based transport simulations, a number of studies have developed methods for creating synthetic travel demand based on open data and open software, which can be replicated by any researcher. mobiTopp is a modular agent-based travel demand modelling framework whose modules can be integrated with agent-based platforms [82]. Furthermore, a framework providing a continuous pipeline from raw data to a final generic synthetic travel demand was introduced [46]. This framework has been applied in various regions, namely Île-de-France [19], Switzerland [46], and Sao Paulo [83]. Travel demand data for agent-based simulation can also be derived from human mobility patterns observed in mobile phone data [84]. Synthetic populations can be produced by various methods, such as Iterative Proportional Fitting to create a synthetic baseline population of individuals and households for activity-based models at the microscopic level [85], Iterative Proportional Updating to match both individual-level and household-level attributes in the population [86], and a Markov chain Monte Carlo (MCMC) simulation-based approach for synthesising populations [87].
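
To make the Iterative Proportional Fitting step mentioned above more concrete, the following is a minimal two-dimensional sketch. It is a generic textbook version of IPF, not the procedure of any cited study; the seed table, the marginal totals, and the attribute names (age group by car ownership) are hypothetical.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                factor = target / row_sum
                table[i] = [x * factor for x in table[i]]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                factor = target / col_sum
                for row in table:
                    row[j] *= factor
        # Columns are exact after step 2, so only the rows need checking.
        error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if error < tol:
            break
    return table


# Hypothetical seed from a small survey: rows = age groups, columns = car ownership.
seed = [[10.0, 20.0],
        [30.0, 40.0]]
row_targets = [400.0, 600.0]  # assumed census totals per age group
col_targets = [300.0, 700.0]  # assumed census totals per ownership class

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(x, 2) for x in row])
```

The fitted cell values can then be used as weights for drawing synthetic individuals or households. The row and column targets must sum to the same total for the procedure to converge.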

Finally, Cluster 8 (Public Transport) covers a set of studies on agent-based modelling in association with public transport. This includes studies employing agent-based simulation to examine the impact of public transport on modal shift [49], analyse potential demand for emerging transport modes that compete with or complement public transport [20, 88], design public transport networks [60], assess the impact of public transport infrastructure extensions on future traffic [89], and improve public transport routes [90] as well as day-to-day operations [91].

4 Challenges and future research directions

Publications in each research cluster were investigated in detail to explore the key challenges of the existing agent-based models, particularly in urban transport studies. These are discussed below along with future research directions.

4.1 Improving computing efficiency

The review confirmed that when a large number of agents are simulated (Clusters 1 and 2), the model environment becomes more complex. This is particularly the case when different components such as mode choice, route choice, scheduling, land use, ride-sharing schemes, and destination choice are combined. Capturing complex interactions among agents (e.g., individuals and the transport system) at a highly granular spatial scale therefore requires more computational resources.

In large-scale scenarios, MATSim typically manages numerous agents at the micro level, which can require a significant amount of time to run. For instance, in 2015, Waraich et al. [92] reported that a MATSim simulation of a Switzerland scenario with 7.3 million agents on a network of one million links took 3 h and 16 min to complete a single iteration. Based on their experience, 60 iterations were required, meaning a total runtime of up to 11 days. The hardware used in the experiment was a Sun Fire X4600 M2 with 16 cores in 8 dual core CPUs and 128 GB of memory. Adding more complex transport systems such as demand responsive transport will only increase computing time. A MATSim transport model of a demand-responsive transit system in Wayne County, Michigan, with a travel demand of 9 million trips, required roughly 43 h to simulate 30 iterations on a high-performance computer cluster with 12 cores and 144 GB of memory [93]. To alleviate the requirement of multiple days of computing time, cloud-based computation services can be used to increase simulation realism and parallel processing while also shortening the runtime. Meanwhile, some researchers have attempted to build frameworks to accelerate the computation of large-scale agent-based mobility scenarios. For example, Manley et al. developed a hybrid agent-based modelling approach that combines a descriptive representation of detailed driver behaviour with a simplified, collective model of traffic flow in an effort to strike a balance between the demands of behavioural realism and computational capacity [15]. Taking central London as a study zone, the hybrid model ran in 4 h and 51 min, while the purely agent-based approach completed in 11 h and 24 min. GEMSim, a GPU-accelerated (graphics processing unit) simulation platform, has also been developed [94]. 
GEMSim has been tested on a large-scale scenario for Switzerland, running a full day of the 5.2 million agents’ daily plans with detailed road infrastructure and public transport schedules in less than 5 min of computing time. However, such innovative approaches remain relatively scarce in the commonly used agent-based platforms, and improving the computing efficiency of large-scale agent-based models will continue to be a motivation as transport systems become more complex in the future.
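
The runtimes quoted above can be put on a common footing with a short calculation. The sketch below uses only the figures reported in the text; the helper name is hypothetical, and the comparison is indicative only, since the studies differ in hardware, study area, and model detail.

```python
def to_minutes(hours, minutes=0):
    """Convert an 'X h Y min' runtime into minutes."""
    return hours * 60 + minutes


# Central London case [15]: hybrid model versus purely agent-based model.
hybrid_min = to_minutes(4, 51)
pure_abm_min = to_minutes(11, 24)
speedup = pure_abm_min / hybrid_min
print(f"Hybrid model speed-up: {speedup:.2f}x")  # about 2.35x

# Wayne County DRT case [93]: roughly 43 h for 30 iterations.
hours_per_iteration = 43 / 30
print(f"Wayne County: {hours_per_iteration:.2f} h per iteration")  # about 1.43 h
```

On these figures the hybrid approach roughly halves the runtime, whereas the GPU-based result reported for GEMSim is a reduction of a different order of magnitude.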

4.2 Unified calibration and validation procedures

The cluster analysis shows that methods for developing agent-based models for a variety of transport purposes are frequently addressed. Different calibration and validation methods are implemented, such as comparing the distribution of population socio-demographics from simulations against real household census surveys [19], comparing mode shares from simulation against real-life traffic counts [95], and matching daily activity patterns, including time of day, intermediate stops, number of tours, and mode choice for all tours, between household travel survey data and simulation results [96]. However, there is no unified conceptual framework that can be applied reliably to the calibration and validation process, as the application of agent-based modelling in transport is diverse across different problem levels. Thus, calibration or validation methods such as comparing the findings with analytical models can be explored further to increase the credibility of agent-based model results. It should be noted that transport is not alone in facing this challenge, with calibration and validation noted as challenges in ABM practice across different disciplines [97].
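
As an illustration of the kind of validation check described above, the sketch below compares simulated mode shares with observed ones and reports a simple error measure. It is not a procedure taken from the cited studies; the trip counts, the function names, and the choice of mean absolute error as the metric are assumptions made for the example.

```python
def mode_shares(trip_counts):
    """Convert trip counts per mode into shares that sum to one."""
    total = sum(trip_counts.values())
    return {mode: count / total for mode, count in trip_counts.items()}


def share_errors(simulated_counts, observed_counts):
    """Return per-mode share differences and their mean absolute error."""
    sim = mode_shares(simulated_counts)
    obs = mode_shares(observed_counts)
    modes = sorted(set(sim) | set(obs))
    diffs = {m: sim.get(m, 0.0) - obs.get(m, 0.0) for m in modes}
    mae = sum(abs(d) for d in diffs.values()) / len(modes)
    return diffs, mae


# Hypothetical trip counts (illustrative only).
simulated = {"car": 520, "pt": 300, "bike": 100, "walk": 80}
observed = {"car": 500, "pt": 310, "bike": 90, "walk": 100}

diffs, mae = share_errors(simulated, observed)
for mode, diff in diffs.items():
    print(f"{mode}: {diff:+.3f}")
print(f"Mean absolute error in mode share: {mae:.3f}")
```

A calibration loop would typically adjust model parameters until such an error falls below a chosen threshold; what threshold is acceptable is exactly the kind of question a unified framework would need to settle.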

4.3 Reproducibility of work

As one of the most recently developed methods, for which applications are still growing exponentially, agent-based modelling of transport systems requires extensive exploration by many researchers. However, researchers in this area have difficulty even reproducing simulations, since previous models are rarely replicable due to confidential data and tools. Cluster 6 (Travel Demand) is the area that most requires open and publicly available data on daily travel activity and population as a basis for executing agent-based transport simulations.

An effort to tackle this issue has been made by Hörl and Balac [19]. They introduced a streamlined process for producing synthetic travel demand with specific households, persons, and their daily activity chains for Paris and the surrounding region of Île-de-France, which is based entirely on open data and open software and can be replicated by any researcher. The generated travel demand is made available for others to utilise as a comprehensive data source for agent-based transport simulations and as a testing ground for population and demand synthesis techniques.

More broadly, there is a movement towards open-source software and publication of code. This is particularly evident within the MATSim community (Footnote 1) and has some traction in ABM more broadly (Footnote 2). However, an improved standardisation of model design and parameterisation is needed and there are opportunities to learn from elsewhere in establishing these [98].

4.4 Embedding various modules or frameworks in models

The use of agent-based models in many applications of transportation research is growing. In general, the majority of open-source agent-based simulation platforms can be coupled with other frameworks to solve specific problems. Some independent modular frameworks are being integrated with existing agent-based simulation platforms to achieve efficient and accurate individual behavioural models, for example the Eqasim framework by Hörl and Balac [46], which can be integrated with MATSim and SUMO. Embedding modular frameworks in existing ABM platforms can also overcome some challenges or limitations of existing agent-based model studies. Hörl et al. [63] integrated a discrete choice model with an agent-based model to improve the convergence speed of the ABM simulation. FEATHERS [45] and mobiTopp [48] can be employed with ABM platforms to increase the capability of existing ABM components related to activity-based models and travel demand models, respectively. An ABM can also support multiple mode choice specifications by embedding BEAM [47]. Furthermore, the SILO framework can be integrated with open-source ABM to explore complex interactions between land use and transport models [44]. Nonetheless, the existing integrated frameworks have shortcomings that need to be addressed, for instance adding environmental analysis and considering full-day activity-travel patterns in integrated land use/transport models. Therefore, exploring as well as improving a variety of modular frameworks to be embedded in agent-based toolkits is essential for representing real-world individual behaviour and traffic systems closely.

4.5 Complex transport systems affecting travel behaviour and travel demand

Based on the previous studies grouped in Clusters 3 and 9, transport systems are becoming more complex as they face various emerging transport modes and mobility schemes such as ridesharing, ride-pooling, demand responsive transport, electric vehicles, autonomous vehicles, and urban air mobility. Transport modellers need to adjust models to these developments, as it is difficult to capture the interactions in a complex transport system with conventional transport models.

Investigating the impact of these complex systems on existing transport environments is crucial. In that sense, interactions between emerging transport modes and existing modes may be either competing [50] or complementary [99], and the outcome might differ between countries or regions. Emerging modes of transport also have the potential to be integrated with existing public transport or various travel demand management measures such as mobility hubs or park and ride. Moreover, measuring the impact of large transportation infrastructure developments or extensions in both spatial and temporal terms is crucial. Nevertheless, these questions have not yet been well assessed using agent-based approaches. Further, it is also essential to investigate the use of ICT (information and communication technology) for mobility substitution, including new forms of teleworking, telecommuting, and e-shopping.

Moreover, new data sources are increasingly being used as inputs to ABM, including smart card data and mobile phone data. These data sources allow for better coverage of public transportation trips. For example, smart card data records the date and time of each entry and exit as well as the boarding and alighting stops/stations [20]. This data can then be used, for example, to investigate the demand characteristics for integrating autonomous vehicles into the public transport system. Meanwhile, mobile phone data is utilised to determine work, education, or other types of locations for each of the agents in the study area’s population [100] and to generate origin–destination matrices of trips during different time slots as well as the commuting behaviours of people in the population [84]. Better inputs and results on travel behaviour and travel demand in relation to emerging transport modes and mobility schemes are essential for policy makers in making more adaptive plans and infrastructure that can endure despite the uncertainty caused by urban mobility transition and technological changes.
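
The step from trip records to time-slotted origin–destination counts, as described above, can be sketched as follows. The record layout, stop names, and three-hour slot width are hypothetical and do not reflect the data formats used in [20] or [84].

```python
from collections import Counter
from datetime import datetime

# Hypothetical smart card records:
# (card_id, entry_time, boarding_stop, exit_time, alighting_stop)
records = [
    ("c1", "2023-05-02 07:45", "A", "2023-05-02 08:10", "B"),
    ("c2", "2023-05-02 08:20", "A", "2023-05-02 08:50", "B"),
    ("c1", "2023-05-02 17:30", "B", "2023-05-02 17:55", "A"),
]


def time_slot(timestamp, slot_hours=3):
    """Map a timestamp to a label such as '06-09' based on its hour."""
    hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").hour
    start = (hour // slot_hours) * slot_hours
    return f"{start:02d}-{start + slot_hours:02d}"


def od_matrix(trip_records, slot_hours=3):
    """Count trips per (time slot, origin stop, destination stop)."""
    counts = Counter()
    for _card, entry, board, _exit, alight in trip_records:
        counts[(time_slot(entry, slot_hours), board, alight)] += 1
    return counts


for key, count in sorted(od_matrix(records).items()):
    print(key, count)
```

Counts of this form could serve as demand input for a simulation or as a reference for validating simulated public transport flows.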

4.6 Equity concerns of study location

Studies on agent-based transport simulation in urban areas are unevenly distributed geographically. Case studies are mostly conducted in developed economies, for instance Germany, Switzerland, Singapore, and the US, rather than in developing economies. Some studies in developing countries can be found in China [101], Indonesia [95], and Thailand [89]. In developed economies, data sources tend to be more complete and require less preparatory work for the implementation of an agent-based simulation framework than data from developing countries. Hence, the availability of data compatible with the ABM framework is a significant issue that must be addressed in order to achieve a greater share of ABM studies in the context of the Global South. However, the different contexts of developing and developed economies may also raise additional challenges and insights that need to be addressed in future research. These additional challenges include socio-demographic structures, travel patterns, and available transport modes, which can be substantially different. There may also be substantial challenges in achieving an accurate understanding of urban travel patterns and individual behaviour based on travel characteristics, transport modes, cultural and social influences, road network supply capacity, etc.

5 Conclusions

The current study provides a comprehensive review of agent-based models by employing bibliometric and content analysis. The main contributions of the study are (1) highlighting the diversity of the applications of agent-based models in urban transport research; (2) identifying the research gaps and (3) summarising the key challenges and opportunities for future research in this domain. The paper is expected to serve as a valuable resource for researchers and practitioners considering the application of agent-based models in the context of urban transport planning.